Series on Advances in Mathematics for Applied Sciences - Vol. 72

ADVANCED MATHEMATICAL & COMPUTATIONAL TOOLS IN METROLOGY VII

Editors: P Ciarlini, E Filipe, A B Forbes, F Pavese, C Perruchet, B R L Siebert

World Scientific
Book Editors: Patrizia Ciarlini (Istituto per le Applicazioni del Calcolo, CNR, Roma, Italy), Eduarda Filipe (Instituto Portugues da Qualidade, Caparica, Portugal), Alistair B Forbes (National Physical Laboratory, Teddington, UK), Franco Pavese (Istituto di Metrologia "G.Colonnetti", CNR, Torino, Italy), Christophe Perruchet (UTAC, Montlhery, France), Bernd Siebert (Physikalisch-Technische Bundesanstalt, Berlin, Germany). For the first six volumes see this Series: vol. 16 (1994), vol. 40 (1996), vol. 45 (1997), vol. 53 (2000), vol. 57 (2001) and vol. 66 (2004)
THEMATIC NETWORK "ADVANCED MATHEMATICAL AND COMPUTATIONAL TOOLS IN METROLOGY" (SOFTOOLSMETRONET). Coordinator: F Pavese, Istituto di Metrologia "G.Colonnetti" (IMGC), Torino, IT (EU Grant G6RT-CT-2001-05061 to IMGC). Caparica Chairperson: E. Filipe, Instituto Portugues da Qualidade, Caparica, Portugal

INTERNATIONAL SCIENTIFIC COMMITTEE
Eric Benoit, LISTIC-ESIA, Universite de Savoie, Annecy, France; Wolfram Bremser, Federal Institute for Materials Research and Testing, Berlin, Germany; Patrizia Ciarlini, Istituto per le Applicazioni del Calcolo "M. Picone", Roma, Italy; Eduarda Corte-Real Filipe, Instituto Portugues da Qualidade (IPQ), Caparica, Portugal; Alistair B Forbes, National Physical Laboratory (NPL-DTI), Teddington, UK; Pedro Girao, Telecommunications Institute, DEEC, IST, Lisboa, Portugal; Ivette Gomes, CEAUL and DEIO, Universidade de Lisboa, Lisboa, Portugal; Franco Pavese, Istituto di Metrologia "G.Colonnetti" (IMGC), Torino, Italy; Leslie Pendrill, Swedish National Testing & Research Institute (SP), Borås, Sweden; Christophe Perruchet, UTAC, France; Bernd Siebert, Physikalisch-Technische Bundesanstalt (PTB), Berlin, Germany
ORGANISED BY
Instituto Portugues da Qualidade (IPQ), Caparica, Portugal CNR, Istituto di Metrologia "G.Colonnetti", (IMGC) Torino, Italy IMEKO Technical Committee TC21 "Mathematical Tools for Measurement"
Sponsored by: EU Thematic Network SofTools_MetroNet; EUROMET; IPQ, Portugal; IMGC-CNR, Italy; Società Italiana di Matematica Applicata ed Industriale (SIMAI), Italy; LNE, France; NPL-DTI, United Kingdom; PTB, Germany; SPMet, Portugal
Series on Advances in Mathematics for Applied Sciences - Vol. 72
ADVANCED
MATHEMATICAL & COMPUTATIONAL TOOLS IN METROLOGY VII
P Ciarlini CNR - Istituto per le Applicazioni del Calcolo, Roma, Italy
E Filipe Instituto Portugues da Qualidade, Caparica, Portugal
A B Forbes
National Physical Laboratory, Middlesex, UK
F Pavese CNR - Istituto di Metrologia, Torino, Italy; National Institute for Research in Metrology (INRiM), Torino, Italy
C Perruchet UTAC, Montlhery, France
B R L Siebert Physikalisch-Technische Bundesanstalt, Berlin, Germany
World Scientific
NEW JERSEY • LONDON • SINGAPORE • BEIJING • SHANGHAI • HONG KONG • TAIPEI • CHENNAI
Published by World Scientific Publishing Co. Pte. Ltd. 5 Toh Tuck Link, Singapore 596224 USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601 UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE
British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library.
ADVANCED MATHEMATICAL AND COMPUTATIONAL TOOLS IN METROLOGY VII Series on Advances in Mathematics for Applied Sciences — Vol. 72 Copyright © 2006 by World Scientific Publishing Co. Pte. Ltd. All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.
For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.
ISBN 981-256-674-0
Printed in Singapore by World Scientific Printers (S) Pte Ltd
Foreword

This volume collects the refereed contributions based on the presentations made at the seventh workshop on the theme of advanced mathematical and computational tools in metrology, held at the IPQ, Caparica, Portugal, in June 2005. The aims of the European projects that have supported the activities in this field were:

• To present and promote reliable and effective mathematical and computational tools in metrology.
• To understand better the modelling, statistical and computational requirements in metrology.
• To provide a forum for metrologists, mathematicians and software engineers that will encourage a more effective synthesis of skills, capabilities and resources.
• To promote collaboration in the context of EU Programmes, EUROMET and EA Projects, and MRA requirements.
• To support young researchers in metrology and related fields.
• To address industrial requirements.

The themes in this volume reflect the importance of mathematical, statistical and numerical tools and techniques in metrology, and also take up the challenge promoted by the Metre Convention to achieve mutual recognition of measurement standards.

Caparica, November 2005
The Editors
Contents
Foreword
v
Full Papers Modeling Measurement Processes in Complex Systems with Partial Differential Equations: From Heat Conduction to the Heart M Bär, S Bauer, R Model and R W dos Santos
1
Mereotopological Approach for Measurement Software E Benoit and R Dapoigny
13
Data Evaluation of Key Comparisons Involving Several Artefacts M G Cox, P M Harris and E Woolliams
23
Box-Cox Transformations and Robust Control Charts in SPC M I Gomes and F O Figueiredo
35
Multisensor Data Fusion and Its Application to Decision Making P S Girao, J D Pereira and O Postolache
47
Generic System Design for Measurement Databases - Applied to Calibrations in Vacuum Metrology, Bio-Signals and a Template System H Gross, V Hartmann, K Jousten and G Lindner
60
Evaluation of Repeated Measurements from the Viewpoints of Conventional and Bayesian Statistics I Lira and W Wöger
73
Detection of Outliers in Interlaboratory Testing C Perruchet
85
On Appropriate Methods for the Validation of Metrological Software D Richter, N Greif and H Schrepf
98
Data Analysis - A Dialogue with the Data D S Sivia
108
Contributed Papers A Virtual Instrument to Evaluate the Uncertainty of Measurement in the Calibration of Sound Calibrators G de Arcas, M Ruiz, J M Lopez, M Recuero and R Fraile
119
Intercomparison Reference Functions and Data Correlation Structure W Bremser
130
Validation of Soft Sensors in Monitoring Ambient Parameters P Ciarlini, U Maniscalco and G Regoliosi
142
Evaluation of Standard Uncertainties in Nested Structures E Filipe
151
Measurement System Analysis and Statistical Process Control A B Forbes
161
A Bayesian Analysis for the Uncertainty Evaluation of a Multivariate Non Linear Measurement Model G Iuculano, G Pellegrini and A Zanobini
171
Method Comparison Studies between Different Standardization Networks A Konnert
179
Convolution and Uncertainty Evaluation M J Korczynski, M G Cox and P Harris
188
Dimensional Metrology of Flexible Parts: Identification of Geometrical Deviations from Optical Measurements C Lartigue, F Thiebaut, P Bourdet and N Anwer Distance Splines Approach to Irregularly Distributed Physical Data from the Brazilian Northeastern Coast S de Barros Melo, E A de Oliveira Lima, M C de Araujo Filho and C Costa Dantas Decision-Making with Uncertainty in Attribute Sampling L R Pendrill and H Källgren
196
204
212
Combining Direct Calculation and the Monte Carlo Method for the Probabilistic Expression of Measurement Results G B Rossi, F Crenna, M G Cox and P M Harris
221
IMet - A Secure and Flexible Approach to Internet-Enabled Calibration at Justervesenet A Sand and H Slinde
229
Monte Carlo Study on Logical and Statistical Correlation B Siebert, P Ciarlini and D Sibold
237
The Middle Ground in Key Comparison Analysis: Revisiting the Median A G Steele, B M Wood and R J Douglas
245
System of Databases for Supporting Co-Ordination of Processes under Responsibility of Metrology Institute of Republic of Slovenia T Tasic, M Urleb and G Grgic
253
Short Communications Contribution to Surface Best Fit Enhancement by the Integration of the Real Point Distribution S Aranda, J Mailhe, J M Linares and J M Sprauel Computational Modeling of Seebeck Coefficients of Pt/Pd Thermocouple H S Aytekin, R Ince, A T Ince and S Oguz
258
262
Data Evaluation and Uncertainty Analysis in an Interlaboratory Comparison of a Pycnometer Volume E Batista and E Filipe
267
Propagation of Uncertainty in Discretely Sampled Surface Roughness Profiles J K Brennan, A Crampton, X Jiang, R Leach and P M Harris
271
Computer Time (CPU) Comparison of Several Input File Formats Considering Different Versions of MCNPX in Case of Personalised Voxel-Based Dosimetry S Chiavassa, M Bardies, D Franck, J R Jourdain, J F Chatal and I Aubineau-Laniece
276
A New Approach to Datums Association for the Verification of Geometrical Specifications J Y Choley, A Riviere, P Bourdet and A Clement Measurements of Catalyst Concentration in the Riser of a FCC Cold Model by Gamma Ray Transmission C Costa Dantas, V A dos Santos, E A de Oliveira Lima and S de Barros Melo
280
284
Software for Data Acquisition and Analysis in Angle Standards Calibration M Dobre and H Piree
289
Calculation of Uncertainties in Analogue Digital Converters - A Case Study M J Korczynski and A Domanska
293
Asymptotic Least Squares and Student-t Sampling Distributions A B Forbes
297
A Statistical Procedure to Quantify the Conformity of New Thermocouples with Respect to a Reference Function D Ichim and M Astrua
301
Non-Parametric Methods to Evaluate Derivative Uncertainty from Small Data Sets D Ichim, P Ciarlini, E Badea and G Della Gatta
306
Algorithms for Scanning Probe Microscopy Data Analysis P Klapetek
310
Error Correction of a Triangulation Vision Machine by Optimization A Meda and A Balsamo
316
Atomic Clock Prediction for the Generation of a Time Scale G Panfilo and P Tavella
320
Some Problems Concerning the Estimate of the Degree of Equivalence in MRA Key Comparisons and of Its Uncertainty F Pavese
325
Validation of Web Application for Testing of Temperature Software A Premus, T Tasic, U Palmin and J Bojkovski
330
Measurement Uncertainty Evaluation Using Monte Carlo Simulation: Applications with Logarithmic Functions J A Sousa, A S Ribeiro, C O Costa and M P Castro
335
Realisation of a Process of Real-Time Quality Control of Calibrations by Means of the Statistical Virtual Standard V I Strunov
340
An Approach to Uncertainty Analysis Emphasizing a Natural Expectation of a Client R Willink
344
Special Issue Preparing for a European Research Area Network in Metrology: Where are We Now? M Kühne, W Schmid and A Henson
350
Author Index and e-mail addresses
361
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 1-12)
MODELLING MEASUREMENT PROCESSES IN COMPLEX SYSTEMS WITH PARTIAL DIFFERENTIAL EQUATIONS: FROM HEAT CONDUCTION TO THE HEART

MARKUS BÄR, STEFFEN BAUER, REGINE MODEL, RODRIGO WEBER DOS SANTOS
Department of Mathematical Modelling and Data Analysis, Physikalisch-Technische Bundesanstalt (PTB), Abbestr. 2-12, 10587 Berlin, Germany.

The modelling of a measurement process necessarily involves a mathematical formulation of the physical laws that link the desired quantity with the results of the measurement and control parameters. In simple cases, the measurement model is a functional relationship and can be applied straightforwardly. In many applications, however, the measurement model is given by partial differential equations that usually cannot be solved analytically. Then, numerical simulations and related tools such as inverse methods and linear stability analysis have to be employed. Here, we illustrate the use of partial differential equations for two different examples of increasing complexity. First, we consider the forward and inverse solution of a heat conduction problem in a layered geometry. In the second part, we show results on the integrative modelling of the heart beat which links the physiology of cardiac muscle cells with structural information on their orientation and geometrical arrangement.
1. Introduction

Analysing measurement processes in metrology often requires not only sophisticated tools of data analysis, but also numerical simulations of mathematical models and other computational tools. Here, we present a range of applications where the appropriate model is given by partial differential equations that can be solved only by numerical schemes like the methods of finite differences or finite elements. Mathematical models can serve different purposes; in the ideal case, the model is known exactly and simulations play the role of a "virtual experiment". On the opposite side of the spectrum, models can also be used for developing hypotheses about observed behaviour in complicated systems and may serve as a guideline to design new experiments. Once plausible hypotheses have been established, models tend to evolve towards a more and more complete characterization of the system until the stage of the virtual experiment is reached. In the latter state, one may also employ inverse methods to obtain information on unknown physical parameters, functional dependencies or the detailed geometry of a measurement process from measurement data.
In this article, we illustrate the spectrum of possibilities of mathematical modelling by two examples. Section 2 describes finite element simulations of the heat equation and results of the inverse problem in a typical setup used for the measurement of heat conductivities, as an example of a tractable virtual experiment. In Section 3, we will discuss basic aspects of the PTB heart modelling project. This project aims at the development of a "virtual organ" and may later serve as a standard in medical physics. Modelling results are crucial in the interpretation of electro- and magneto-cardiographic measurements. The philosophy of heart modelling is to build a complete model that integrates the physiology of heart muscle cells with the dynamic geometry of the heart. Then, one can predict the impact of molecular changes or pathologies on the propagation of potential waves in the heart muscle and ultimately on the form of the ECG or MCG. Such computations may be used for an improvement of the diagnosis of heart diseases. Along this line, we present results on modelling the ventricles of the mouse heart in a realistic three-dimensional geometry and with information on the fiber orientation.

2. Determination of Heat Conduction Parameters in Layered Systems

Within the development of new measurement procedures, mathematical modelling and simulation have gained importance in metrology. The so-called "virtual experiment" stands for the numerical simulation of an experiment based on a realistic mathematical model - virtual, because it proceeds in a computer instead of in reality. Consequently, a virtual experiment design (VED) is a powerful numerical tool for: 1. the simulation, prediction, and validation of experiments; 2. optimization of measuring instruments, e.g., geometric design of sensors; 3. cause and effect analysis; 4. case studies, e.g., material dependencies; and 5. estimation of measurement errors. For indirect measurement problems, where the physical properties of interest have to be determined by measurement of related quantities, a subsequent data analysis solves an inverse problem. Here, the virtual experiment works as a solver for the forward problem within the optimization procedure for the inverse problem. In this field, a proper example is given by the determination of thermal transport properties of materials under test - an important task in the fields of engineering which try to reduce the energy involved, e.g., in process engineering and in the building industry. As it is infeasible to directly measure the thermal conductivity λ and the thermal diffusivity a, the problem leads to an inverse heat transfer problem.
Non-steady-state methods such as the transient hot strip (THS) method [1-4] offer several advantages over conventional stationary methods, e.g., shorter measurement times and wider working temperature ranges. Here, a current-carrying thin metal strip is clamped between two sample halves, where it simultaneously acts as a resistive heater and a temperature sensor. The transient temperature rise is calculated from the measured voltage drop with the aid of the thermometer equation and yields the input information for the subsequent data analysis.
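As a minimal illustration of this conversion (a sketch, not part of the paper), the snippet below applies a linear thermometer equation R(T) = R₀(1 + αΔT) to a voltage signal recorded at constant current; R₀, α, the current value and the synthetic signal are assumptions of the sketch.

```python
import numpy as np

def ths_temperature_rise(u, current, r0, alpha):
    """Convert a THS voltage signal into a temperature rise.

    Assumes the linear thermometer equation R(T) = R0 * (1 + alpha * dT),
    so that at constant current I the voltage is U(t) = I * R0 * (1 + alpha * dT(t)).
    """
    u0 = current * r0                               # voltage at the initial temperature
    return (np.asarray(u) - u0) / (alpha * u0)

# Illustrative numbers only (not taken from the experiment described above).
t = np.linspace(1e-3, 300.0, 500)                   # time / s
u = 0.5 * (1.0 + 3.0e-3 * 2.0 * np.log(t + 1.0))    # synthetic voltage signal / V
dT = ths_temperature_rise(u, current=1.0, r0=0.5, alpha=3.0e-3)
```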
Figure 1. Transient hot strip: thermal part of the set-up. a, sample halves; b, hot strip of width D and length L; I, electric current; U(t), THS voltage signal.
2.1 Mathematical Problem

In the case of homogeneous media, a well-posed inverse problem of parameter identification has to be solved using an analytic approximation solution for the heat conduction equation. More complicated is the situation for layered composites, where no adequate analytic solution is at hand and, furthermore, the standard measurement situation violates unique solvability [5]. It is known from experiments and theory that the set-up may be treated mathematically as a two-dimensional system if the strip is sufficiently long, i.e., L > 100 mm. In that case, heat losses at both ends are generally negligible. Therefore, the underlying three-dimensional model discussed so far can be limited to two spatial dimensions. Hence the problem may be defined in a cross-sectional area perpendicular to the strip, as shown for a two-layered sample in Figure 2.
2.2 Forward Problem For symmetry reasons, the numerical integration domain can be reduced to a quarter of the cross sectional area. On the cut boundaries of the quadrant considered, homogeneous boundary conditions of the second kind are assumed, i.e., any heat flux vanishes. Then, for two concentric homogeneous layers the heat conduction equation is specified as
ρc_p ∂T(x, y, t)/∂t = λ (T_xx(x, y, t) + T_yy(x, y, t)) + q(x, y)    (1)

Figure 2. Schematic cross-section through the sample perpendicular to the strip, showing the inner layer, the outer layer and the integration domain. The thickness of the metal strip (0.01 mm) is exaggerated.
with the initial condition

T(x, y, 0) = T₀,   (x, y) ∈ Ω = [0, l] × [0, d],

the boundary conditions of the third kind, and the symmetry conditions

T_x(x, y, t) = 0 at x = 0, 0 ≤ y ≤ d,   T_y(x, y, t) = 0 at y = 0, 0 ≤ x ≤ l.

On the assumption of a negligible thermal contact resistance between the layers, the two additional conditions

T(x, d₁ − 0, t) = T(x, d₁ + 0, t),   λ₁ T_y(x, d₁ − 0, t) = λ₂ T_y(x, d₁ + 0, t)
have to be satisfied. The half thickness of the whole specimen and of its inner layer are denoted by d and d₁, respectively; l is the half width of the specimen (see Figure 2). The thermal conductivity and the related volumetric heat of the two layers are given by

λ = λ₁ and ρc_p = ρ₁c_p1 for 0 < y < d₁,   λ = λ₂ and ρc_p = ρ₂c_p2 for d₁ < y < d,

where index 1 relates to the inner layer and index 2 to the outer layer. The thermal diffusivity a results from the known relation a = λ/(ρc_p). The heat source, q, is limited to the volume of the strip. As an example, Figure 3 shows the finite-element simulation of the THS signals vs. ln t for a homogeneous and a two-layered sample, applying a constant power of 1 W to a strip of size 0.01 × 3 × 100 mm³. The composite has an inner layer with thickness 20 mm and an outer layer with 220 mm. Each signal is calculated for adiabatic and for isothermal boundary conditions. The curves representing the temperature rise for the two-layered sample may be divided into three characteristic regions. In the first region, the thermal properties of the inner core govern the temperature rise. Therefore, the curves of the two-layered and the homogeneous sample are in agreement. In the second region, the temperature rise of the metal strip is determined by the thermal conductivity and the thermal diffusivity of both the inner core and the outer layer. At the upper end of this region, the signals differ for adiabatic and isothermal conditions, respectively. Finally, in the third region, the additional influence of the surroundings becomes apparent.

2.3 Inverse Problem

The numerical simulation of the vector T_sim(λ₁, λ₂, a₁, a₂, t) of the discrete measurement signal is the solution of the forward problem (1) for given thermal properties and a time vector t. In practice, the signal is a voltage drop. However, the relation to the temperature rise is defined by the linear thermometer equation. For comparison with simulated data, the measurement data set is converted into a temperature vector T_meas. The related inverse problem of parameter identification
is formulated as an output least squares problem

‖T_sim(λ₁, λ₂, a₁, a₂, t) − T_meas(t)‖² = Σᵢ₌₁ⁿ [T_i,sim(λ₁, λ₂, a₁, a₂, t_i) − T_i,meas(t_i)]² = min.    (2)
Here, n is the number of data points taken into account in the algorithm.
Figure 3. Numerical THS signal (temperature rise) vs. ln(t/s) for a layered sample (1 - adiabatic, 2 - isothermal) and a corresponding homogeneous sample (3 - adiabatic, 4 - isothermal); the characteristic regions are marked.
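The forward model T_sim entering (2) is, in the paper, a finite-element solution of eq. (1). Purely as a rough, runnable illustration of the forward problem, the sketch below advances eq. (1) on the quarter domain with a simple explicit finite-difference scheme; the material values, the crude representation of the strip source and the purely adiabatic boundaries are assumptions of the sketch, not the set-up described above.

```python
import numpy as np

# Illustrative material and geometry values only (not those of the samples above).
lam1, rho_cp1 = 0.04, 4.0e4        # inner layer: lambda / (W/(K m)), rho*c_p / (J/(K m^3))
lam2, rho_cp2 = 0.20, 1.5e6        # outer layer
d1, d, l = 0.02, 0.24, 0.10        # half thicknesses and half width / m
nx, ny = 50, 120
dx, dy = l / nx, d / ny

x = (np.arange(nx) + 0.5) * dx
y = (np.arange(ny) + 0.5) * dy
lam = np.where(y < d1, lam1, lam2)[None, :] * np.ones((nx, ny))
rho_cp = np.where(y < d1, rho_cp1, rho_cp2)[None, :] * np.ones((nx, ny))

# Heat source q limited to the strip cross-section at the corner of the quarter domain.
q = np.zeros((nx, ny))
q[x < 1.5e-3, :1] = 1.0 / (1.5e-3 * dy)    # W/m^3, crude representation of the strip

T = np.zeros((nx, ny))                     # temperature rise above T_0
a_max = (lam / rho_cp).max()
dt = 0.2 * min(dx, dy) ** 2 / a_max        # below the explicit stability limit

def step(T):
    """One explicit Euler step of eq. (1) with adiabatic (zero-flux) boundaries.

    Evaluating lambda at the node is only a rough approximation at the layer
    interface; the finite-element treatment of the paper handles this properly.
    """
    Tp = np.pad(T, 1, mode="edge")         # zero normal gradient on all four sides
    lap = ((Tp[2:, 1:-1] - 2.0 * T + Tp[:-2, 1:-1]) / dx ** 2 +
           (Tp[1:-1, 2:] - 2.0 * T + Tp[1:-1, :-2]) / dy ** 2)
    return T + dt * (lam * lap + q) / rho_cp

for _ in range(2000):
    T = step(T)
strip_temperature_rise = T[0, 0]           # temperature rise at the strip location
```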
With the identified λ₁, a₁, λ₂, a₂, the simulated signal T_sim is the best approximation to the experimental signal T_meas. Problem (2) is solved by a modified Levenberg-Marquardt method [6]. The algorithm combines the Gauss-Newton method with the gradient method. For layered composites, the subsequent determination of the parameters, beginning with the inner layer, strongly improves the condition of the problem. In the first step, a particular initial interval of the experimental signal is used for parameter identification of the inner layer and, in the following step, an adequately longer interval for the second layer. The time interval for the first step is reasonably chosen as [0, t₁], see Figure 3. In the standard THS working regime the measurement time is less than t₂, because here the boundary conditions have no bearing on the signal. However, in the second step the results for λ₂ and a₂ depend on the initial guess for the iteration procedure. Let us examine the error function E_t,
E_t(λ, a) = ‖T_sim(λ₁, λ, a₁, a, t) − T_sim(λ₁, λ₂, a₁, a₂, t)‖,    (3)

where the calculated vector T_sim(λ₁, λ₂, a₁, a₂, t) acts as a measurement signal T_meas(t), quasi as clean data without measurement uncertainty. The simulated vector function T_sim(λ₁, λ, a₁, a, t) has two independent variables, the thermal conductivity and the thermal diffusivity of the second layer. Consequently, the error E_t is a two-dimensional function. Obviously, for a = a₂ and λ = λ₂ we obtain E_t(λ₂, a₂) = 0 (see Figure 4).

Figure 4. Error function E_t(λ₂, a₂) for a measurement time of 300 s (λ axis in W/(K m)).
Furthermore, t₂ itself depends on the thermal transport properties, where the influence of the thermal diffusivity a is much greater than that of the thermal conductivity λ and in the opposite direction. If t_meas is chosen to be sufficiently longer than t₂, and if the boundary condition is known, a unique solution can be found in the second step. In the experimental situation, isothermal conditions may be better realized.
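The following is a sketch of the two-step identification described above, with scipy's Levenberg-Marquardt implementation standing in for the modified method of [6]; the forward model ths_forward is a placeholder (in practice a finite-element solution of eq. (1)), and all numerical values are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import least_squares

def ths_forward(lam1, a1, lam2, a2, t):
    """Placeholder forward model T_sim(lam1, lam2, a1, a2, t).

    In practice this would be a finite-element solution of eq. (1); here a
    smooth dummy function stands in so that the sketch is runnable.
    """
    return (1.0 / lam1) * np.log1p(a1 * 1e6 * t) + (1.0 / lam2) * np.log1p(a2 * 1e5 * t)

def identify_two_step(t, T_meas, t1_index, x0_inner, x0_outer):
    """Two-step identification: inner layer first, then outer layer."""
    # Step 1: fit lam1, a1 on the initial interval [0, t1], with the outer
    # layer parameters fixed at their starting guess.
    def res_inner(p):
        lam1, a1 = p
        return ths_forward(lam1, a1, *x0_outer, t[:t1_index]) - T_meas[:t1_index]

    lam1, a1 = least_squares(res_inner, x0_inner, method="lm").x

    # Step 2: fit lam2, a2 on the full (longer) interval with lam1, a1 fixed.
    def res_outer(p):
        lam2, a2 = p
        return ths_forward(lam1, a1, lam2, a2, t) - T_meas

    lam2, a2 = least_squares(res_outer, x0_outer, method="lm").x
    return lam1, a1, lam2, a2

# Synthetic demonstration data (illustrative values only).
t = np.linspace(0.1, 300.0, 400)
true = (0.04, 1.0e-6, 0.2, 1.3e-7)
T_meas = ths_forward(true[0], true[1], true[2], true[3], t)
est = identify_two_step(t, T_meas, t1_index=80, x0_inner=[0.05, 8e-7], x0_outer=[0.25, 1e-7])
```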
2.4 Summary

The algorithm for the determination of thermal conductivity and diffusivity of two-layered composites, based on a finite-element model and an optimization strategy, works in two steps: first the determination of the thermal properties of the inner layer, and second the same for the outer layer. The procedure has been successfully applied to experimental THS signals derived from air bricks. Samples like these can be considered as composites of two parallel layers without any thermal contact resistance between them. In another test, a two-layered material made of polypropylene foam (inner layer) and polystyrene bulk material (outer layer) has been used as a model sample. Provided the measurement time is sufficiently long so that the THS signal contains information about the sample surface, the data analysis allows the thermal transport properties to be determined uniquely from THS signals.

3. Modeling Cardiac Electrophysiology

The heart, as the center of the cardiovascular system, is responsible for supplying the body continuously with blood. The so-called myocardium is a specialized muscular tissue which has self-excitable properties characterized by an action potential. Modeling of cardiac electrophysiology aims to calculate time and space distributions of potentials on the tissue and to derive further data like simulated electro- or magnetocardiograms.

3.1 The Cardiac Action Potential

The origin of action potentials lies in the characteristics of the cell membrane. Within the membrane are several ion channels which are usually selective for one type of ion. In most cases the behaviour of an ion channel is voltage-dependent, i.e., it depends on the membrane potential. Membrane currents during the excitation phase can be described by systems of differential equations. An unexcited myocyte cell builds up an electrical potential difference of around V_m = −89 mV over the membrane. The total current through the membrane is the sum of the ionic currents and a capacitive current and leads directly to the Hodgkin-Huxley differential equation [7,8]:
C_m dV_m/dt = I_m − I_ion(V_m, g),

where C_m denotes the membrane capacity, I_ion is the sum of all currents through the different ionic channels, and I_m the membrane current.

3.2 Bidomain Models

The cardiac excitation is characterized by waves of depolarization which spread through the myocardium with propagation velocities ranging from 0.03 to 0.6 m/s. These waves are generated by the membrane currents and spread over the tissue by local currents through the intra- and extracellular domains. One thus needs to combine the above model of single cell excitation with a model for excitation spreading [9]. Cardiac cells are roughly cylindrical, with a diameter of around 20 µm and a length of about 100 µm, and are arranged in sheets with spatially differing orientation of these fibers. The electrical properties of the tissue are therefore quite anisotropic, both for the intra- and the extracellular space. Ohmic conductivities in three dimensions are then described by a conductivity tensor σ. The currents in both spaces are given by:

I_ext = −σ_ext ∇φ_ext,   I_int = −σ_int ∇φ_int.

The physiological source of currents in cardiac tissue are the transmembrane currents. The direction of this current is usually defined from the inside to the outside of the cell. Conservation of total current flow then gives:

∇·I_ext = I_m,   ∇·I_int = −I_m.
As the membrane currents consist of ionic and capacitive currents, we can now write down the complete bidomain equations:

C_m ∂V_m/∂t = −∇·(σ_ext ∇φ_ext) − I_ion(V_m, g),
C_m ∂V_m/∂t = ∇·(σ_int ∇φ_int) − I_ion(V_m, g),

where V_m = φ_int − φ_ext is the transmembrane potential. For better numerical feasibility, the equations are usually rearranged into a parabolic and an elliptic PDE which can then be solved separately:

C_m ∂V_m/∂t = ∇·(σ_int ∇φ_int) − I_ion(V_m, g),
∇·(σ_int ∇φ_int + σ_ext ∇φ_ext) = 0.
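As a rough, dimensionless illustration of this parabolic-elliptic splitting (not the semi-implicit finite-element scheme with algebraic multigrid used in the project, see section 3.3), the sketch below advances a one-dimensional bidomain cable written in terms of V_m and φ_ext, which is equivalent to the φ_int form above; a FitzHugh-Nagumo-type term stands in for the physiological ionic model I_ion(V_m, g), and all parameter values are made up.

```python
import numpy as np

# 1D cable with dimensionless, illustrative parameters (not taken from the paper).
n, dx, dt, steps = 200, 0.25, 0.01, 2000
sigma_i, sigma_e, C_m = 1.0, 2.0, 1.0
eps, beta, gamma = 0.08, 0.7, 0.8            # FitzHugh-Nagumo recovery parameters

# Discrete 1D Laplacian with zero-flux (Neumann) boundaries.
L = np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
L[0, 0] = L[-1, -1] = -1.0
L /= dx ** 2

A_e = (sigma_i + sigma_e) * L                # elliptic operator (singular: Neumann problem)
A_fix = A_e.copy()
A_fix[0, :] = 0.0                            # pin phi_ext[0] = 0 to remove the
A_fix[0, 0] = 1.0                            # additive indeterminacy
A_inv = np.linalg.inv(A_fix)                 # small 1D problem: a direct inverse is fine

def ionic_current(v, w):
    """FitzHugh-Nagumo style stand-in for the physiological ionic model I_ion(V_m, g)."""
    return -(v - v ** 3 / 3.0 - w)

v = -1.2 * np.ones(n)                        # transmembrane potential V_m (dimensionless)
w = np.full(n, -0.6)                         # recovery variable (the 'g' of the text)
v[:10] = 1.0                                 # initial stimulus at one end

for _ in range(steps):
    # 1) Elliptic solve: div((sigma_i + sigma_e) grad phi_ext) = -div(sigma_i grad V_m).
    rhs = -sigma_i * (L @ v)
    rhs[0] = 0.0
    phi_e = A_inv @ rhs

    # 2) Explicit parabolic step: C_m dV_m/dt = div(sigma_i grad(V_m + phi_ext)) - I_ion.
    v = v + dt / C_m * (sigma_i * (L @ (v + phi_e)) - ionic_current(v, w))

    # 3) Recovery variable of the stand-in ionic model.
    w = w + dt * eps * (v + beta - gamma * w)
```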
3.3 Example: Magnetocardiogram of the Mouse

During the design of a dedicated magnetocardiographic system used for measurements on mice, it was necessary to get an estimation of the maximum magnetic field strengths and the magnetic field distribution in a realistic measurement geometry. A computer model based on the bidomain equations was used to fulfill these requirements of the design process. The potentials were calculated on a finite element mesh consisting of 5·10⁵ active nodes at a grid resolution of 250 µm, describing the left and right ventricle (see Fig. 5). Solving the bidomain equations is numerically quite complex. A semi-implicit Crank-Nicolson scheme combined with an algebraic multigrid preconditioner was used for the system of PDEs [10]. 100 ms of simulated activity needed around 20 hours on a 16-processor 2.0 GHz Opteron cluster. The magnetic field was calculated by the Biot-Savart law from the superposition of intra- and extracellular cardiac tissue currents and volume currents through the surrounding medium:

I_tot = I_int + I_ext,
B(r) = (μ₀/4π) ∫_V I_tot(r′) × (r − r′) / |r − r′|³ dV′.
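A minimal numerical version of this Biot-Savart evaluation, assuming the total current density I_tot is available at sample points of a regular grid; the function name, the grid and the current values below are illustrative assumptions, not the project's mesh or data.

```python
import numpy as np

MU0 = 4e-7 * np.pi  # vacuum permeability / (T m / A)

def biot_savart(points, j_tot, cell_volume, r_obs):
    """Magnetic field at r_obs from a current density sampled on grid points.

    points:      (N, 3) source positions / m
    j_tot:       (N, 3) total current density I_int + I_ext / (A/m^2)
    cell_volume: volume associated with each sample / m^3
    r_obs:       (3,) observation point / m
    """
    r = r_obs[None, :] - points                     # vectors r - r'
    dist3 = np.linalg.norm(r, axis=1) ** 3          # |r - r'|^3
    integrand = np.cross(j_tot, r) / dist3[:, None]
    return MU0 / (4.0 * np.pi) * cell_volume * integrand.sum(axis=0)

# Tiny synthetic example: a few current samples along a line.
pts = np.column_stack([np.zeros(5), np.zeros(5), np.linspace(-2e-3, 2e-3, 5)])
j = np.tile([0.0, 0.0, 1.0e3], (5, 1))             # A/m^2, pointing along z
B = biot_savart(pts, j, cell_volume=(1e-3) ** 3, r_obs=np.array([5e-3, 0.0, 0.0]))
```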
The resulting magnetic field shows a maximum of 13 pT and a bipolar field distribution (see Fig. 6). It was therefore estimated that 6 SQUID magnetometers on a circumference around the mouse body should be sufficient to observe the bipolar and possible smaller multipolar parts of the overall magnetic field. Recent measurements on real mice could confirm these modeling results [11].

3.4 Summary

We described the basic physics of electrical propagation in the heart and showed some results of simulations of the mouse heart. Future work will continue using animal models like mice, rabbit and guinea pig in combination with experiments and drug testing issues. A mid-term goal is the implementation of a human heart model, which will require better hardware and further improvement of the numerical and mathematical methods used.
Figure 5. (a) Geometry and fiber orientations used for the mouse MCG simulation; apex-base width of 7 mm. (b) Calculated transmembrane potential at t = 15 ms on the surface of the mouse heart mesh.

Figure 6. (a) Radial component of the calculated magnetic field at θ = −40°. (b) Calculated distribution of the radial magnetic field during the maximum of the QRS complex (field levels are given in units of pT).
4. Conclusion

We have exemplified the use of partial differential equations (PDEs) with the examples of heat conduction and cardiac propagation. Further applications of PDE modelling in metrology are envisaged in the fields of non-imaging optics and bio-electromagnetism, as well as in the rapidly developing field of modelling in biology and molecular medicine. Once the models are reasonably well characterized, solution of inverse problems can help in developing new measurement technology. A further challenge is the determination of uncertainties in situations where PDEs are employed.

References
1. S.E. Gustafsson et al., Transient hot-strip method for simultaneously measuring thermal conductivity and thermal diffusivity of solids and fluids, J. Phys. D 12, 1411-1421 (1979).
2. K.D. Maglic, A. Cezairliyan, and V.E. Peletsky (eds), Compendium of thermophysical property measurement methods. Vol. 1: Survey of measurement techniques. Plenum Press, New York and London (1984).
3. U. Hammerschmidt, A linear procedure for analyzing transient hot strip signals, Thermal Conductivity 24, 123-134 (1999).
4. R. Model and U. Hammerschmidt, in Advanced Computational Methods in Heat Transfer VI, B. Sunden, C.A. Brebbia (eds), WIT Press, Southampton, Boston, 407-416 (2000).
5. R. Model and U. Hammerschmidt, High Temperatures - High Pressures 34, 649-655 (2003).
6. J.E. Dennis and R.B. Schnabel, Numerical Methods for Unconstrained Optimization and Nonlinear Equations, Prentice-Hall, Englewood Cliffs, N.J. (1983).
7. A. Panfilov, A. Holden (eds), Computational Biology of the Heart, John Wiley & Sons, Chichester (1997).
8. G.T. Lines et al., Computing and Visualization in Science 5(4), 215-239 (2002).
9. E. Vigmond et al., IEEE Transactions on Biomedical Engineering 49, No. 11 (2002).
10. R. Weber dos Santos et al., IEEE Transactions on Biomedical Engineering 51, No. 11 (2004).
11. U. Steinhoff et al., Contactless magnetocardiographic characterization of knock-out mice, accepted in Folia Cardiologica (2005).
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 13-22)
MEREOTOPOLOGICAL APPROACH FOR MEASUREMENT SOFTWARE

ERIC BENOIT
LISTIC-ESIA, Universite de Savoie, B.P. 806, 74016 Annecy Cedex, France

RICHARD DAPOIGNY
LISTIC-ESIA, Universite de Savoie, B.P. 806, 74016 Annecy Cedex, France

The history of computing science shows that the languages used for the design of software are part of a continuous evolution from machine-based representations of behaviours towards human ones. Recent languages propose to model software in a representation close to the human one, but they are limited by their genericity. The purpose of this paper is to present a software modelling closer to human representation but restricted to the field of measurement and control.
1. Introduction
Prior to software availability, measurement science was studied by specialized researchers mainly originating from the physical sciences and was applied by engineers specialized in domains close to physics, due to a sufficient intersection between their knowledge to work together. The introduction of software computing in the measurement domain, and more generally in physics, has created a new community having software computing as its knowledge foundation. These two communities need to cooperate in order to create intelligent sensors, but physicists talk about physical structures, physical rules and states, whereas researchers in software computing talk about software behaviors, constraints and variables. The major objective of our investigations is to include enough software engineering knowledge into a design tool to give physicists the possibility to create efficient distributed measurement systems including efficient software. Such measurement systems are made of intelligent instruments, e.g. intelligent sensors and intelligent actuators, connected with a fieldbus. The approach is to clearly separate the design of intelligent instruments into an instrument designer part and a software designer part. Software designers create a library of
functions and a design tool that is able to generate embedded software that respects a model given by the instrument designer. The main difficulty is to define a modelling able to acquire the knowledge of the instrument designer (i.e., knowledge elicitation). A first approach is to use a causal graph where vertices are pieces of software and edges are causal relations. During this design process, it is supposed that the instrument designer is able to define the behavior of the instrument with this kind of graph. This approach is close to the solution used in graphical languages but remains far from the physical domain. Indeed, before defining the behavior of a measurement system, it is first necessary to have initial knowledge about the measured system and the measurement process. Several guidelines are given, for example by Ludwik Finkelstein [1] or Luca Mari [2], to understand what a measurement is. Basically, a measurement performs a link between the empirical world and the representational world made of concepts and relations. New design tools therefore have to model both the physical world and the software behavior with a unified modelling like the one presented by Karl H. Ruhm [3]. One solution to provide a formal support for the design tool is to include concepts and relations between concepts through the use of ontologies. First, it offers a conceptual framework for data handling at several conceptual levels, and secondly it facilitates the communication between the design tool and the instrument designer. In the second approach, the design starts with the modelling of the physical environment to measure or to control [4]. Then the design includes a hierarchical model of the goals elicited by the instrument designer. Finally, the resulting causal graph and the final software are created with respect to the information included in both models.

2. The behavioural modeling
Languages for software programming were introduced in the middle of the 20th century with the first compiler. Assembly language and fourth generation languages like C and Pascal are based on a modeling of the behavior of the software. More recently, object oriented languages also include a model of the structure of data. Other approaches based on graphical modeling have been created to improve the design process. Modeling tools for software design are now based on a graph approach that expresses the causality between actions. In the automation field, Grafcet, Ladder or State graphs are commonly used, and Petri nets, devoted to the modeling of discrete actions, are sometimes applied to software design. Design tools like LabView or Matlab-Simulink now propose convenient design GUIs.
Considering a service based implementation of measurement software in intelligent instruments, the design process we first proposed can be split into several layers associated with different competencies, as presented in [5] (see Figure 1):

• The instrument user level is based on an external instrument model. This level concerns the final user who performs service requests and selects the instrument modes.
• The instrument designer level is based on the internal instrument model. This level concerns the systems integrator who creates services adapted to the application and the customer requirements.
• The software designer level, where atomic parts of software are created. This level concerns the company that designs the instrument.

Figure 1. General representation of the design process (user level, instrument designer level with internal services and basic functions, and software designer level).
In this approach, the software designer creates basic functions, also called actions, used in the instrument. Then he defines the causal graph that links these functions. The causal graph includes all possible sequences between functions. Then, the work of the instrument designer is limited to the definition of services as subgraphs of the general causal graph. Finally, the user activates the required services.

Figure 2. General causal graph and the subgraph associated to a service (nodes such as acquirePressure1, acquirePressure2, computeSpeedandLevel, FlowControl, MoveGate and LevelControl, linked by input/output event connectors and rendezvous event nodes).
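A toy rendering of this idea is sketched below; the node names follow Figure 2, while the dictionary representation and the reachability-based selection of a service subgraph are assumptions of the sketch, not the actual design tool.

```python
# General causal graph: each action lists the actions it can trigger (names from Figure 2).
causal_graph = {
    "acquirePressure1": ["computeSpeedandLevel"],
    "acquirePressure2": ["computeSpeedandLevel"],
    "computeSpeedandLevel": ["FlowControl", "LevelControl"],
    "FlowControl": ["MoveGate"],
    "LevelControl": ["MoveGate"],
    "MoveGate": [],
}

def service_subgraph(graph, entry_actions):
    """Restrict the general causal graph to the actions reachable from the
    entry points selected for a service (a subgraph, as in Figure 2)."""
    reachable, stack = set(), list(entry_actions)
    while stack:
        node = stack.pop()
        if node not in reachable:
            reachable.add(node)
            stack.extend(graph[node])
    return {n: [m for m in graph[n] if m in reachable] for n in reachable}

level_service = service_subgraph(causal_graph, ["acquirePressure1", "acquirePressure2"])
```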
This approach allows the instrument designer to participate in the design of the instrument software, but it remains on the behavioral level. Indeed, the instrument designer has to plan possible behaviors for each service in the instrument, and also has to understand the general causal graph defined by the software designer.

3. The mereo-topology
Even if the preceding approach simplifies the design process, the instrument designer needs to adapt his way of thinking in order to generate the behavior of the instrument. Considering the measurement software in its environment, it is also necessary to take into account the physical system and the goal of the software. In the approach presented in this paper, both the physical world and the behavior are modelled within a single ontology-based theory. Ontologies were introduced to conceptualize domain knowledge. Some definitions have been proposed. The general definition is: the branch of metaphysics that studies the nature of existence. In the measurement field this definition can be restricted to the nature of what can be represented. T. Gruber gives an alternative definition: an explicit specification of a conceptualization [6], and a definition that can be considered as an application field of ontologies: a formalism to communicate knowledge. A more practicable definition is the following: a computational entity, a resource containing knowledge about what "concepts" exist in the world and how they relate to one another. An ontology can then be simply defined by a set of concepts, i.e. representations of things, and a set of relations on these concepts. The different types of ontologies are distinguished by the properties of the relations on the concepts. Mereology is a particular ontology defined as a formal theory of parts and wholes in which the basic element is called an individual. The set of basic relations is described with a simple set of definitions.
P ⊆ I × I,    (1)

where I is the set of individuals and P is the part_of relation;

antisymmetry: P(x, y) ∧ P(y, x) → x = y,    (2)
transitivity: P(x, y) ∧ P(y, z) → P(x, z),    (3)
proper_part_of definition: PP(x, y) =def P(x, y) ∧ ¬P(y, x),    (4)
asymmetry: PP(x, y) → ¬PP(y, x),    (5)
transitivity: PP(x, y) ∧ PP(y, z) → PP(x, z),    (6)
overlap: O(x, y) =def ∃z (P(z, x) ∧ P(z, y)).    (7)
From a mathematical point of view, the part_of relation is a partial order relation and the proper_part_of relation is the associated strict order relation. With the above set of relations, it is possible to write down some axioms specifying desirable properties for any system decomposition. We then follow the PhysSys approach proposed by W. N. Borst et al. [7], where mereology is extended with topological relations (this reflects the fact that, in configuration design, components are thought to be decomposed first and connected later on). The topology is based upon the is-connected-to relation C, where individuals are said to be connected if they are able to exchange energy:

C ⊆ I × I,    (8)
C(x, y) → C(y, x).    (9)
Adopting an extension of Clarke's mereotopology [8], we assume that connections only connect a pair of individuals. Additional relations define how individuals are connected. With the previous set of relations, the physical system description is brought back to the instrument as additional computational knowledge.

4. The ontology-based model
The physical system is represented in a mereo-topological model where individuals are physical entities, with a proper_part_of relation issued from the mereology and an is-connected-to relation from the topology. The latter relation represents energy links and data links between physical entities. This modelling gives rise to the structural knowledge of the physical system. In order to illustrate this modeling, a simple example of a Pitot tube for the measurement of the velocity and the level in an open water channel is presented below. A Pitot tube is a compound sensor that measures both static and dynamic pressure in two areas of the water with two pressure sensors. For this purpose, the water area is divided into a dynamic area and a static one just behind the Pitot tube (see Fig. 3). The front pressure sensor in contact with the static water area performs the measurement of the dynamic pressure and the other one, i.e. the lateral pressure sensor, performs the measurement of the static pressure. The static pressure gives the level of the water, and the difference between both pressures gives the velocity.
area performs the measurement of the dynamic pressure and the other one, i.e. the lateral pressure sensor, performs the measurement of the static pressure. The static pressure gives the level of the water, and the difference between both pressures gives the velocity.
velocity level
dynamic water area lateral pressure sensor
s ^z\ f r
static water area
Water area
Pitot tube^) ont pressure sensor
Figure 3. Example of a physical system to model. A Pitot tube into an open water channel.
The mereological representation of such a system is made by the definition of the proper_part_of relation; for example, the lateral pressure sensor is a proper part of the Pitot tube, and the dynamic water area is a part of the water area:

PP(lateralSensor, PitotTube)    (10)
PP(frontSensor, PitotTube)    (11)
P(staticWaterArea, waterArea)    (12)
PP(dynamicWaterArea, waterArea)    (13)

Note that when the water does not move, the static water area is the water area. This explains why the part_of relation is used instead of the proper_part_of relation. In the mereo-topological representation, the is-connected-to relation is also used. In the example, the pressure sensors are connected to a water area:

C(lateralSensor, dynamicWaterArea)    (14)
C(frontSensor, staticWaterArea)    (15)

Figures 4 and 5 give two different graphic representations of this model.
Figure 4. Graph oriented representation of the mereo-topological model of the open water channel.
Figure 5. Box oriented representation of the mereo-topological model of the open water channel (the System contains the waterArea, with its dynamicWaterArea and staticWaterArea, and the PitotTube, with its lateralSensor and frontSensor).
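The relations (10)-(15) can be encoded directly, together with the derived mereological predicates. The following sketch is an illustration only: reflexivity of the part_of relation is assumed, as in standard mereology, and the Python encoding is not the formalism used by the authors.

```python
# Minimal encoding of relations (10)-(15); derived predicates follow definitions (1)-(9).
proper_part_of = {
    ("lateralSensor", "PitotTube"),
    ("frontSensor", "PitotTube"),
    ("dynamicWaterArea", "waterArea"),
}
plain_part_of = {("staticWaterArea", "waterArea")}
individuals = {x for pair in proper_part_of | plain_part_of for x in pair}

connected = {("lateralSensor", "dynamicWaterArea"), ("frontSensor", "staticWaterArea")}
connected |= {(b, a) for (a, b) in connected}          # symmetry, relation (9)

def transitive_closure(rel):
    rel = set(rel)
    while True:
        new = {(a, d) for (a, b) in rel for (c, d) in rel if b == c} - rel
        if not new:
            return rel
        rel |= new

# part_of (P): contains PP and the plain part_of facts, made reflexive and transitive.
P = transitive_closure(proper_part_of | plain_part_of | {(x, x) for x in individuals})

def overlaps(x, y):
    """O(x, y): some individual is a part of both x and y (definition (7))."""
    return any((z, x) in P and (z, y) in P for z in individuals)

assert ("lateralSensor", "PitotTube") in P
assert overlaps("staticWaterArea", "waterArea")
```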
The main advantage of the mereo-topology is to offer a simple modeling for complex systems. The level of description of an individual by its parts can be chosen in a consistent way with the use of the model. For example, the front sensor is made of several parts, but at our level we decide not to represent them. In order to improve the precision of the model, the description of the front sensor by its parts can be added. As this sensor is a common one, we can imagine its mereo-topological description to be included in a library of models.

Figure 6. Reduction of the model and expansion of the model of the front sensor (e.g., into a membrane and a piezo-crystal). The model of the front sensor can be an item in a library of mereotopological models.
The functionality of the measurement system is also represented within the mereology theory. Individuals are now goals. A goal is a part_of another goal if and only if the achievement of the first one is required to achieve the second one. Goals are defined by the triplet (action verb, physical role, physical entity), where the physical role is a physical quantity or a vector of physical quantities. For example, a general goal can be (to_measure, {velocity, level}, waterArea).

g1 = (to_acquire, {pressure}, {staticWaterArea})    (16)
g2 = (to_acquire, {pressure}, {dynamicWaterArea})    (17)
g3 = (to_compute, {velocity, level}, {waterArea})    (18)
g4 = (to_send, {velocity, level}, {waterArea})    (19)
G = (to_measure, {velocity, level}, {waterArea})    (20)
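As a small illustration, the goal triplets (16)-(20) and their part_of decomposition can be written down directly; the data structures below are assumptions of this sketch, not the paper's formalism.

```python
from typing import NamedTuple, FrozenSet

class Goal(NamedTuple):
    action: str
    roles: FrozenSet[str]       # physical quantities
    entities: FrozenSet[str]    # physical entities where the roles are taken

g1 = Goal("to_acquire", frozenset({"pressure"}), frozenset({"staticWaterArea"}))
g2 = Goal("to_acquire", frozenset({"pressure"}), frozenset({"dynamicWaterArea"}))
g3 = Goal("to_compute", frozenset({"velocity", "level"}), frozenset({"waterArea"}))
g4 = Goal("to_send", frozenset({"velocity", "level"}), frozenset({"waterArea"}))
G = Goal("to_measure", frozenset({"velocity", "level"}), frozenset({"waterArea"}))

# goal_part_of(g, G): achieving g is required to achieve G (mereology on goals).
goal_part_of = {(g1, G), (g2, G), (g3, G), (g4, G)}

def subgoals(goal, relation):
    return {g for (g, whole) in relation if whole == goal}

assert subgoals(G, goal_part_of) == {g1, g2, g3, g4}
```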
The behavioral model is based on an event-based approach (Event Calculus) in order to include the temporal precedence. It is represented by the causal graph and can be interpreted as an ontology where the concepts are the nodes of the graph and the order relation is the causality relation. The concepts are actions, and are linked to the action verbs of the goals. The link between the physical modelling, the behavioural modelling and the functional one is performed by the application of the definition: goals are reached by performing actions on physical quantities associated to physical entities. Practically, the concepts associated to each ontology are tuples of common entities.

• Concepts in the physical ontology are defined as follows: given R, a finite set of physical roles, and Φ, the finite set of physical entities, a physical context c is a tuple c = (r; μ(r), φ₁, φ₂, ..., φ_μ(r)), where r ∈ R denotes its physical role (e.g., a physical quantity), μ: R → Nat is a function assigning to each role its arity (i.e., the number of physical entities related to a given role), and {φ₁, φ₂, ..., φ_μ(r)} is a set of entities describing the spatial locations where the role has to be taken.
• Concepts in the functional ontology are goals defined as the pair (A, C), where A is an action symbol and C is a non-empty set of physical contexts defined as above.
• Concepts in the behavioral ontology are software actions (i.e., functions) guarded by input conditions specifying the way of achievement of a given goal.

In [9], Richard Dapoigny shows how the physical and the functional ontologies give enough constraints to automate the design of the behavioral causal graph.

5. Implementation
These results are right now partially implemented in a specific software design tool named Softweaver. This tool is adapted to different kinds of designer competencies, including the application designer one. It includes two GUIs adapted to the definition of the physical system, a mereology checking facility, and a code generator for the intelligent sensors that will run the application software.
Figure 7. General behaviour of the Softweaver design tool.

Figure 8. Implementation of the example on both Softweaver GUIs.
6. Conclusion
After the graphical approaches, software design tools have now reached a new level in the evolution toward an improved interaction with the knowledge of instrument designers. They now include a high level knowledge representation and are able to handle concepts and relations organized into ontologies. In the specific field of measurement, this knowledge needs to be functional, behavioral and structural. The presented approach proposes to start a software development with the description of the physical system together with the goals of the software. Several studies have shown that this approach can be used to automate the design of measurement software. The measurement process maps the real world onto a model that is an approximation of reality. A mereo-topological based model is less precise than any analytical model, but it allows complex systems to be represented and distributed reasoning to be performed easily. It can be a solution to understand complex systems, to check for inconsistent behaviors, and to prevent wrong designs.

References
1. L. Finkelstein, Analysis of the concepts of measurement, information and knowledge, XVII IMEKO World Congress, 1043-1047 (2003).
2. L. Mari, The meaning of "quantity" in measurement, Measurement, 17/2, 127-138 (1996).
3. K. H. Ruhm, Science and technology of measurement - a unifying graphic-based approach, 10th IMEKO TC7 International Symposium, 77-83 (2004).
4. R. Dapoigny, E. Benoit and L. Foulloy, Functional ontologies for intelligent instruments, Foundations of Intelligent Systems, Lecture Notes in Computer Science (Springer), 2872 (2003).
5. E. Benoit, L. Foulloy, J. Tailland, InOMs model: a service based approach to intelligent instruments design, SCI 2001, XVI, 160-164 (2001).
6. T. Gruber, Toward principles for the design of ontologies used for knowledge sharing, Int. J. of Human and Computer Studies, 43(5/6), 907-928 (1995).
7. W.N. Borst, J.M. Akkermans, A. Pos and J.L. Top, The PhysSys ontology for physical systems, Ninth International Workshop on Qualitative Reasoning QR'95, 11-21 (1995).
8. B. Clarke, A calculus of individuals based on 'connection', Notre Dame Journal of Formal Logic, 22/3, 204-218 (1981).
9. R. Dapoigny, N. Mellal, E. Benoit, L. Foulloy, Service integration in distributed control systems: an approach based on fusion of mereologies, IEEE Conf. on Cybernetics and Intelligent Systems (CIS'04), 1282-1287 (2004).
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 23-34)
DATA EVALUATION OF KEY COMPARISONS INVOLVING SEVERAL ARTEFACTS

M. G. COX, P. M. HARRIS AND EMMA WOOLLIAMS
National Physical Laboratory, Hampton Road, Teddington, Middlesex TW11 0LW, UK
E-mail: [email protected]
Key comparison data evaluation can be influenced by many factors including artefact instability, covariance effects, analysis model, and consistency of model and data. For a comparison involving several artefacts, further issues concern the manner of determination of a key comparison reference value and degrees of equivalence. This paper discusses such issues for key comparisons comprising a series of linked bilateral measurements, proposing a solution approach. An application of the approach to the spectral irradiance key comparison CCPR K1-a is given.
1. Introduction

Analysis of the data provided by the national measurement institutes (NMIs) participating in a key comparison is influenced by factors such as (a) the requirements of the CIPM Mutual Recognition Arrangement (MRA) [1], (b) the guidelines for the key comparison prepared by the relevant Working Group of the appropriate CIPM Consultative Committee, (c) the number of artefacts involved, (d) artefact stability, (e) correlations associated with the data provided by the same NMI, (f) as (e) but for different NMIs, (g) the definition of the key comparison reference value (KCRV), (h) the handling of unexplained effects, and (i) the consistency of the data. This paper concentrates on key comparisons that comprise a series of linked bilateral comparisons involving multiple artefacts having nominally identical property values. In each bilateral comparison a distinct subset of the artefacts is measured by just one of the NMIs participating in the comparison and the pilot NMI. Table 1 indicates the measurement design for six NMIs, A-F, each allocated two artefacts. P denotes the pilot and V_ℓ the property of the ℓth artefact. Thus, e.g., the value of V₆ is measured by NMI C and the pilot. The main objective is to provide degrees of equivalence (DoEs) for the comparison.
Table 1. The measurement design in the case of six NMIs, each allocated two artefacts.

     V1  V2  V3  V4  V5  V6  V7  V8  V9  V10  V11  V12
     A   A   B   B   C   C   D   D   E   E    F    F
     P   P   P   P   P   P   P   P   P   P    P    P
A KCRV is not an intrinsic part of the approach considered. Gross changes due to handling or transportation might cause some of the data to be unsuitable for the evaluation (section 2). Also, artefacts can change randomly on a small scale (section 5.2). A basic model of the measurements can be formed, as can a model in which systematic effects are included as parameters (section 3). Model parameter determination is posed and solved as a least squares problem (section 4). A statistical test is used to assess the consistency of the model and the data, and strategies to address inconsistency are considered (section 5). The degrees of equivalence required by the MRA can be formed from the solution values of the model parameters (section 6). A key comparison of spectral irradiance [17] required consideration of all factors (a)-(i) above (section 7).
2. Artefact stability

To obtain information about artefact instability, artefacts are measured in a sequence such as PNP or PNPN, where P denotes measurement by the pilot NMI and N that by the NMI to which an artefact is assigned. The data provided by an NMI for the artefacts assigned to it are taken within a measurement round, or simply round. For the sequence PNP, e.g., the pilot NMI measures within two rounds and other NMIs within one round. The data provided for each artefact can be analyzed to judge whether the artefact property value has changed grossly during transportation and, if so, consideration given to excluding the data relating to that artefact from the main data evaluation [17]. The use of such an initial screening process would not account for subtle ageing effects that might have occurred. Such considerations can be addressed in the main data evaluation (section 5).
3. Model formation

Each data item provided by each participating NMI relating to each artefact estimates that artefact's property value and is generally influenced by:

(1) A random effect, unrelated to other effects;
(2) A systematic effect, applying to all data provided by that NMI, but unrelated to other NMIs' data;
(3) A systematic effect, applying to all data provided by a group of NMIs containing that NMI, where a non-participating NMI provides traceability to the NMIs in the group;
(4) As (3) but where one member of the group provides traceability to the others in the group.

A common underlying systematic effect is taken into account by augmenting the uncertainty matrix (covariance matrix) associated with the set of NMIs' measurement data by diagonal (variance) and off-diagonal (covariance) terms [11] (section 3.1). Each such term is either associated with a pair of data items (a) provided by that NMI or (b) provided by that NMI and another participating NMI. Alternatively, quantities corresponding to systematic effects can be included explicitly in the model [5,13,14] (section 3.2).
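As an illustration only, an uncertainty matrix of the kind described in section 3.1 and figure 1 can be assembled by adding systematic-effect variances to the diagonal blocks and covariances where one NMI takes traceability from another; the NMI list, the uncertainty values and the traceability link below are assumptions of the sketch, not data from any comparison.

```python
import numpy as np

nmis = ["A", "B", "C", "D", "E", "F", "P"]      # pilot last
n_per_nmi = 2                                    # two artefact measurements per NMI here
u_random = {m: 0.10 for m in nmis}               # standard uncertainties, random effects
u_system = {m: 0.05 for m in nmis}               # standard uncertainties, systematic effects
traceability = {"E": "C"}                        # NMI E takes traceability from NMI C

idx = {m: slice(i * n_per_nmi, (i + 1) * n_per_nmi) for i, m in enumerate(nmis)}
n = n_per_nmi * len(nmis)
U_x = np.zeros((n, n))

for m in nmis:
    block = np.full((n_per_nmi, n_per_nmi), u_system[m] ** 2)   # common systematic effect
    block += np.eye(n_per_nmi) * u_random[m] ** 2               # independent random effects
    U_x[idx[m], idx[m]] = block

for m, source in traceability.items():
    # Covariance between the data of m and of its traceability source equals the
    # square of the source's systematic standard uncertainty (item (4) above).
    cov = u_system[source] ** 2
    U_x[idx[m], idx[source]] = cov
    U_x[idx[source], idx[m]] = cov
```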
3.1. Basic model

A lower case letter, say q, denotes an estimate of the value of a quantity denoted by the corresponding upper case letter (Q), and u(q) the standard uncertainty associated with q. U_q denotes the uncertainty matrix associated with the estimate q of the value of a vector quantity Q. Denote by x_ℓ,i,r the data item relating to artefact ℓ's property value provided by NMI i in its rth measurement round. It is an estimate of the value of X_ℓ,i,r, the quantity represented by NMI i's measurement scale of artefact ℓ's property in round r. Let V_ℓ denote artefact ℓ's property, the value of which is to be estimated. A model of the measurement is

X_ℓ,i,r = V_ℓ.    (1)

Let X denote the vector of the X_ℓ,i,r
(item (1) above) associated with the elements x_ℓ,i,r of x. The other effects indicated above contribute to U_x as follows. Each NMI systematic effect indicated as item (2) contributes the variance associated with the (zero) estimate of the value of that effect to each element in a diagonal block of U_x relating to that NMI's measurement data. The variance equals the square of the relevant standard uncertainty from that NMI's uncertainty budget. For item (3), there should be within the uncertainty budgets for each NMI in the group an identical standard uncertainty relating to the systematic component 'transferred' from the non-participating NMI. The covariance contribution associated with each pair of measurement data provided by NMIs within the group is the square of that uncertainty. For item (4), let A denote the NMI providing traceability and B any other NMI in the group. The uncertainty in B's budget associated with systematic effects constitutes two components: that 'transferred' from A as part of obtaining traceability, and that relating to systematic effects for B alone. Thus, the covariance associated with the data provided by A and B is the square of the standard uncertainty associated with systematic effects in A's uncertainty budget. By ordering the elements of x by NMI, with the pilot NMI last (X would be ordered accordingly, as would the rows of A), U_x takes the form exemplified in figure 1, in which the non-zero elements are contained within the labelled blocks. U_x is largely a block-diagonal matrix, those blocks being indicated in figure 1 by 'A, B, ..., F', corresponding to NMIs A-F, and 'P', corresponding to the pilot NMI. When one NMI takes traceability from another, there will be non-zero off-diagonal blocks. Figure 1 shows off-diagonal blocks, 'CE' and 'EC', resulting from NMI E taking traceability from NMI C. This figure is consistent with table 1 in terms of the number of participating NMIs. If two artefacts are each measured once by each NMI, as in table 1, each small labelled block (sub-matrix) would be of dimensions 2 × 2 and the large labelled block 12 × 12.

3.2. Systematic effects modelled functionally
Consider an alternative to the model of section 3.1, containing parameters corresponding to the NMI systematic effects 13>14, with Ux used in place of Ux • Xt,itr is now regarded as differing from the artefact property V( by an additive effect Si, common to all NMI i's measurement data, i.e., Xe,i,r — Vt + Si.
(2)
27 A B C
EC
D CE
E F
P
Figure 1. An uncertainty matrix associated with measurement data relating to artefact property values.
These Si would correspond to a sum of systematic effects when, e.g., one NMI takes traceability from another. The matrix form of the model is as before, but now Y denotes the set of quantities Vg and Si, and the row of A corresponding to Xtti>r contains two elements equal to unity, in the column positions corresponding to Vg and Si. The matrix Ux is used instead of Ux since the systematic effects are modelled functionally, leaving only random effects to be treated statistically 7 . Features (discussed below) of the systematic-effects model are: (1) The inclusion of systematic-effects parameters gives flexibility, implying the possibility of obtaining improved model-data consistency; (2) The estimates obtained of the values of the systematic-effects parameters and the associated uncertainties (at a 95 % level of confidence) provide directly the unilateral DoEs; (3) Differences between these estimates and the associated uncertainties (at a 95 % level of confidence) provide directly the bilateral DoEs; (4) The value components of the unilateral DoEs constitute measurement deviations from the artefact property values and those of the bilateral DoEs differences of such deviations. The solution is not unique because, in terms of the model (2), an arbitrary constant K can be added to the Vt and subtracted from the S$. A particular solution can be selected by including a resolving condition in the model 3 ' 1 3 ' 1 4 , in the form of an equation involving one or more of the Vt,
28
or one or more of the Si, or both:
$>/V« + £ > < $ = C l
(3)
i
for some choice of constant multipliers he and Wi and constant C. 4. Model solution 4.1. Obtaining
the best
estimates
Viewing key comparison data evaluation as an appropriately defined least squares problem can be fully justified 16 . The best estimate y of the value of Y is given by solving the generalized least squares problem 9 minF(r]) = eT(Ux)~1e,
e = x - An,
(4)
where Ux is taken as Ux or Ux , and Y is regarded as the vector of artefact properties, or the vector of artefact properties and NMI systematic effects, according, respectively, to whether model (1) or (2) is used. Formally, if A has full rank, i.e., A A is invertible, the solution to formulation (4) and the associated uncertainty matrix are y = UyA^U^x,
Uy = {ATU~lA)-\
(5)
Numerically, a recognized solution algorithm 9 employing matrix factorization methods to avoid explicit matrix inversion would be used. The resolving constraint (3) needed to overcome the rank deficiency of A for model (2) can be accommodated by appending an additional row to the design matrix A and an additional element to the vector X 7 . 4.2. Uncertainties
associated
with the
solution
Because the uncertainty matrix Ux is used in obtaining estimates of the artefact property values and the NMIs' systematic effects when considering model (2), uncertainties associated with these estimates would be obtained as relevant elements of the uncertainty matrix Uy = (AT(UX^)~1A)~1 (cf. formulae (5)). Uy will reflect only the uncertainties associated with random effects declared by the participating NMIs. (Recall (section 3.2) that Ux is used rather than Ux since the systematic effects are modelled functionally.) However, Uy obtained this way is inappropriate if used in evaluating the uncertainty components of the DoEs. An NMI, the ith, say, would generally regard the satisfaction of the inequality \di\ < U(di), where (dt, U(di))
29
represents the unilateral DoE for NMI i, as confirmation of its measurement capability for the artefact property concerned. It is less likely that this inequality would be satisfied as a consequence of working with 1$: the several measurements made by each NMI and the considerably larger number made by the pilot NMI would tend to reduce the uncertainties associated with the elements of y to values that were inconsistent with the capabilities of the NMIs to make individual measurements, the concern of the key comparison. This aspect is addressed by regarding the least squares solution process as a GUM input-output model 2 with input quantity X and output quantity Y. Formally, Y = (AT (Ug)-1 A)'1 AT (Ug^X (cf. formulae (5)). The least squares vector estimate y of the value of Y is given by using the vector estimate x of the value of X in this formula. The full uncertainty matrix Ux is propagated through this formula, using (a generalization 6 of) the law of propagation of uncertainty, to yield Uy, thus accounting for all uncertainties provided by the participating NMIs. 5. Consistency of model and measurement data To draw valid conclusions from modelling data generally, the model must be consistent with the data. Details of appropriate consistency tests in the context of key comparison data evaluation, based on the use of the chi-squared statistic, are available 7>10>12>15. Two ways of handling inconsistency are considered: data exclusions and artefact instability modelling. 5.1. Measurement
exclusions
Model-data inconsistency can be treated by excluding measurement data that in some sense are discrepant so as to achieve a consistent reduced set of data that is as large as reasonably possible 7>12>15. If a small amount of data were so excluded, and all participating NMIs continued to be represented in that data corresponding to at least one artefact per NMI remained in the evaluation, that number of measurement exclusions might be considered acceptable. Otherwise, it would be necessary to develop a better model, viz., one that explained further effects so far not considered (section 5.2). 5.2. Artefact
instability
modelled
functionally
Since a large number of data exclusions would indicate inadequacy of the model, consideration can be given to the inclusion in the model of a further
30
random effect relating to changes in the artefacts' properties during the course of the comparison. This effect should be associated with all artefact property measurements even if they had not previously been excluded 15 . Instead of model (2), consider the augmented model Xe,i,r = Vg + Si+D, where D is the artefact instability effect. The value of D is not estimated, but D is regarded as a random variable having expected value d = 0. The value u2(d) enters the uncertainty matrix as an addition to each diagonal element. It is determined such that the model matches the data, i.e., the 'observed' value of chi-squared (the minimizing value of F(n) in expression (4)) equals the statistical expectation of chi-squared (for the relevant degrees of freedom). It is unrelated to the uncertainties associated with the participating NMIs' data. No data exclusion is then necessary, although consideration can be given to striking a balance between the number of data exclusions and the effect of the additional term in the model. Such instability effects are considered elsewhere u > 13 ' 14 > 15 . Additional terms can be included in the model to account for deterministic drift 8 - u ' 1 8 . 6. Determination of the degrees of equivalence For a comparison involving one stable artefact measured by participating NMIs, the KCRV can be taken as the least squares estimate of the artefact property value 4 . A unilateral DoE would be the deviation of an NMFs measurement of the property value from the KCRV, with the uncertainty associated with that deviation at the 95 % level of confidence. A bilateral DoE would be the difference of two such deviations, with the uncertainty associated with that difference at the 95 % level of confidence. For the comparison considered here, again least squares provides best estimates of the artefact property values and the NMI systematic effects. There is not a natural choice of overall KCRV for such a comparison although, for each £, the best estimate of the value of artefact £'s property Vg can be taken as a reference value for that property. Indeed, an overall KCRV is not needed in forming unilateral and bilateral DoEs (sections 6.1 and 6.2). 6.1. Unilateral
degrees of
equivalence
From model (2), X(,i,r — Ve = Si, the left-hand side of which is the deviation of the quantity represented by NMI i's measurement scale of artefact Ts property in round r from that property. The right-hand side is NMI i's systematic effect. Taking expectations, E(X^j, r ) — E(V^) = E(Si), i.e., E(Xe,i,r) -vt
= Si,
(6)
31 where ve is the best estimate of the value of Ve and Sj that of Si provided by least squares. The left-hand side of expression (6) constitutes the unilateral DoE for NMI i, being the deviation of (an expected) measurement data item relating to artefact £'s property value from (the best estimate of) that property value. Thus, the value component of the unilateral DoE for NMI % is given by di = Sj and the associated uncertainty by u(di) = u(si). Under a normality assumption, the unilateral DoE for NMI i would be taken as (s*, 2u(si)). The u2(si) are relevant diagonal elements of Uy. However, U(SJ) so obtained would be too small as a result of the effect of processing a quantity of data on the uncertainties associated with random effects. Hence, the value obtained can be augmented in quadrature by a typical uncertainty associated with random effects as reported by NMI i. The vector s and hence the unilateral DoEs will be influenced by the choice of resolving condition (section 3.2). In particular, the vector estimate of the value of S corresponding to two different choices of resolving condition will differ by a vector of identical constants. This situation corresponds to that in a key comparison consisting of the circulation of a single, stable artefact among participating NMIs 4 , in which different choices of KCRV yield unilateral DoEs whose values differ collectively by a constant.
6.2. Bilateral
degrees of
equivalence
The bilateral DoEs can be obtained similarly. Again from the model (2), (Xt^r — Ve) — (X^^y — Ve) = Si — S^, the left-hand side of which constitutes the difference between (a) the deviation of the quantity represented by NMI i's measurement scale of artefact £'s property in round r from that artefact property and (b) the counterpart of (a) for NMI i', artefact £' and round r'. The right-hand side is the difference between NMI i's and NMI i"s systematic effects. Taking expectations as before, the value component of the bilateral DoE for NMI i and NMI i' is given by dit»/ = s$ — s^ and the associated uncertainty by u(di) = u(s»— s,/). Under a normality assumption, the bilateral DoE for NMI i would be taken as (sj — Sj/, 2U(SJ —Sj/)). The variance U 2 (SJ —s^) =u2(si)+u2(si') — 2COV(SJ,SJ/) can be formed from the diagonal elements of the uncertainty matrix Uy corresponding to s, and Sj' and the off-diagonal element lying in the corresponding row and column position. The estimate Sj — s^ and hence the corresponding bilateral DoE will not be influenced by the choice of resolving condition (section 3.2). The reason is that the only freedom in the model solution is that s can be adjusted by
32
a vector of identical constants, and v accordingly (section 3.2), and hence differences between the elements of s are invariant.
7. Application The approach described in this paper was applied to the spectral irradiance key comparison CCPR Kl-a 1 7 . This comparison was carried out separately, as stipulated in its protocol, for each of a number of wavelengths. Each participating NMI was assigned several lamps. The comparison design involved the measurement sequence PNPN or NPNP. Nominally, there were 12 participating NMIs each measuring three lamps in two measurement rounds. The problem thus involved, nominally, 144 measurements (four measurements made by an NMI and the pilot NMI of 12 x 3 lamps) and 49 model parameters (corresponding to 36 artefact properties and 13 systematic effects). The measurement data was in fact incomplete because some NMIs measured fewer than three lamps and some measurement data items were excluded as part of the initial data screening (section 2). Let zt,i,r denote the measurement data item for artefact ts property value, a lamp spectral irradiance value, made by NMI i in its round r, as an estimate of the value of ^,i, r > the quantity provided by NMI i's measurement scale of artefact £'s property in round r. Round uncertainties were incorporated to account for changes made to some NMIs' measurement scales between rounds. Each ze,i,r has an associated fractional standard uncertainty declared by NMI i and appropriate pairs of these data items, as a consequence of measurement data being provided by the same NMI and of traceability, have associated fractional covariances. Overall, the set of ztj
33
To avoid excluding too much data, which otherwise would have occurred, an artefact instability effect D was included in the model (section 5.2) and parameter values estimated accordingly. The numerical value of u(d), determined to achieve model-data consistency, aligned well with physical knowledge of the behaviour of the types of lamps used. This statement applied to all wavelengths considered. 8. Conclusion and discussion The method described for the data evaluation of key comparisons involving linked bilateral measurements and multiple artefacts is based on the use of a model and its least squares solution. Specific parameters within the model represent the artefact properties and NMI systematic effects. A variant of the model includes an artefact instability effect. Estimates of the parameter values provided by least squares can be interpreted as reference values for the artefact properties and value components for the unilateral DoEs. Value components of the bilateral DoEs are formed, naturally in this context, as differences of the unilateral DoEs. The uncertainty components of these DoEs are formed from the uncertainty matrix associated with the parameter value estimates and other considerations. The approach was applied to the spectral irradiance key comparison CCPR Kl-a. There is much scope for further work in this area. One consideration relates to augmenting the model (2) to become Xi>i
34
2. 3.
4. 5. 6.
7.
8. 9. 10. 11.
12.
13. 14. 15. 16. 17.
18.
Technical report, Bureau International des Poids et Mesures, Sevres, France, 1999. BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, and OIML. Guide to the Expression of Uncertainty in Measurement, 1995. ISBN 92-67-10188-9, Second Edition. M. G. Cox. A discussion of approaches for determining a reference value in the analysis of key-comparison data. Technical Report CISE 42/99, National Physical Laboratory, Teddington, UK, 1999. M. G. Cox. The evaluation of key comparison data. Metrologia, 39:589-595, 2002. M. G. Cox and P. M. Harris. Measurement uncertainty and the propagation of distributions. In Proceedings of Metrologie 2001, Saint-Louis, France, 2001. M. G. Cox and P. M. Harris. SS/M Best Practice Guide No. 6, Uncertainty evaluation. Technical report, National Physical Laboratory, Teddington, UK, 2004. M. G. Cox, P. M. Harris, and Emma R. Woolliams. Data evaluation of key comparisons involving linked bilateral measurements and multiple artefacts. In NCSL International Workshop and Symposium, Washington, USA, 2005. C. Elster, W. Woger, and M. G. Cox. Analysis of key comparison data: Non-stable travelling standard. Measurement Techniques,, 2005. To appear. G. H. Golub and C. F. Van Loan. Matrix Computations. John Hopkins University Press, Baltimore, MD, USA, 1996. Third edition. H. K. Iyer, C. M. Wang, and D. F. Vecchia. Consistency tests for key comparison data. Metrologia, 41:223-230, 2004. L. Nielsen. Evaluation of measurement intercomparisons by the method of least squares. Technical Report DFM-99-R39, Danish Institute of Fundamental Metrology, Denmark, 2000. L. Nielsen. Identification and handling of discrepant measurements in key comparisons. Technical Report DFM-02-R28, Danish Institute of Fundamental Metrology, Denmark, 2002. C. M. Sutton. Analysis and linking of international measurement comparisons. Metrologia, 41:272-277, 2004. D. R. White. On the analysis of measurement comparisons. Metrologia, 41:122-131, 2004. R. Willink. Statistical determination of a comparison reference value using hidden errors. Metrologia, 39:343-354, 2002. W. Woger. Remarks on the key comparison reference value. In NCSL International Workshop and Symposium, Washington, USA, 2005. Emma Woolliams, Maurice Cox, Nigel Fox, and Peter Harris. Final report of the CCPR Kl-a Key Comparison of Spectral Irradiance 250 to 2500 nm. Technical report, National Physical Laboratory, Teddington, UK, 2005. Currently at Draft B stage. N. F. Zhang, H.-K. Liu, N. Sedransk, and W. E. Strawderman. Statistical analysis of key comparisons with linear trends. Metrologia, 41:231-237, 2004.
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 35-46)
BOX-COX T R A N S F O R M A T I O N S A N D R O B U S T CONTROL CHARTS I N SPC
FERNANDA OTILIA FIGUEIREDO Faculdade de Economia da Universidade do Porto and CEAUL Rua Dr Roberto Frias, 4200-464 Porto PORTUGAL MARIA IVETTE GOMES CEA UL and DEIO, Universidade de Lisboa Bloco C2, Campo Grande, 1700 Lisboa PORTUGAL
In this paper we consider some asymetric models to describe the d a t a process. We assess that, in general, it is possible to find an adequate Box-Cox transformation which enables us to deal with the transformed data as approximately normal. To monitor the original and the transformed d a t a we implement the traditional control charts for normal data as well as robust control charts, based on the total median and on the total range statistics. All these charts are compared in terms of efficiency and robustness.
1. Introduction and Motivation Control charts are one of the basic tools in Statistical Process Control (SPC), widely used to monitor industrial processes. Most of these charts assume an underlying normal process, and such an hypothesis is often inappropriate in practice. Despite the usefulness of the normal model in SPC, the sets of continuous data related to the most diversified industrial processes (such as data from telecommunication traffic, insurance, finance and reliability) exhibit often asymmetry and tails heavier than the normal tail (see Hawkins and Olwell [10]). Then, the traditional charts, designed for normal processes, are inappropriate because they usually present a rate of false alarm much higher or lower than expected, i.e., the Mean (X) chart to monitor the process mean, and the Range (R) and the Standard Deviation (S) charts to monitor the process standard deviation, must be carefully used. It seems thus important to advance with "robust" control charts to monitor the process parameters, so that we do not have either a very high or a very low false alarm rate whenever the parameters to be controlled are
36 close to the target values, /io and <7o, although the data is no longer normal. Several authors have already addressed the problem of monitoring non-normal data. Information about the robustness of the traditional Shewhart and Cusum control charts, in the presence of non-normal distributions, has been obtained in Balakrishnan and Kocherlakota [1], Borror et al. [2], Chan and Heng [5], Chang and Bai [6], Figueiredo and Gomes [9], Lucas and Croisier [13], Ryan and Faddy [16], Stoumbos and Reynolds [18], among others. In Sec. 2 we introduce some background about the control statistics to be herewith considered, and present a simulation study on their robustness and efficiency. In Sec. 3 we provide information about Box-Cox transformations, and advance with a procedure for monitoring non-normal data, based on a Box-Cox transformation of the original data, followed by the use of the previous "robust" statistics. Finally, in Sec. 4 we present some overall comments about monitoring non-normal processes and conclusions. 2. The Control Statistics: Efficiency and Robustness Given an observed sample of size n, {x\,X2,--- >xn), from a distribution function (d.f.) F, let us denote {xi-n, 1 < i < n) the observed sample of associated ascending order statistics (o.s.), and let (x*, 1 < i < n) be any bootstrap sample. As usual, the notation Xi-n or X*, 1 < i < n, is used for the corresponding random variables (r.v.'s). Note that given an observed sample (x\, • • • ,xn), the random associated bootstrap sample (Xi, ••• , X*) is a random sample of independent and identically distributed replicates from a r.v. X*, with d.f. F*(x) = X^=i I{xi<x}/n, the empirical d.f. of the observed sample, being IA the indicator function of A. 2.1. The Location
and Scale Estimators
Under
Study
We here summarize some results obtained in Figueiredo [8] and in Figueiredo and Gomes [9]: Proposition 2.1. Let us consider the bootstrap median, i.e., the median of the bootstrap sample, given by BMd->~™ ifn = I {Xm.n + Xm+1.. ) / 2
2m-l, ifn = 2m,
m = l,2, •
and the bootstrap range, BR — X^.n — X*.n|x;.n-x1*.Tl>o- Let us denote, ctij := P {BMd = (xi:n + xj:n) /2), Aj := P (X*:n - X{m = xj:n - xi:n),
37
1 < i < j < ft, w*^ -P(^) ^ e probability of the event A. Then, with [x] denoting the integer part of x, we have ctij = a n _ . , + i i n _ i + i , [(n—1)/2] nn
Z^ fc=0
—1
.
n—k
fc!(n-fe)!
,
,...
^ r=[n/2]-fc+l
n n((n/2)!)
..„_i._ r
H(n-fc-r)!
2
'
0,
n
CTen
'
a n
1
"
— f ~~ J — "
X
-
l
n
< •? -
n odd and 1 < i < j < n
(*)",
l
Definition 2.1. The total median, denoted TMd, is the statistic, T M d : ^ ^ ^ * ^ .
(1)
Definition 2.2. The total range, denoted TR, is given by n— 1
n
^ : = E E Ai (X^« - X*=«) •
(2)
Proposition 2.2. The total median and the total range may be written as linear combinations of the sample o.s., i.e., n
TMd:=^2aiXi..n
n
and T.R := ^ & ; X i : „ ,
t=l
where the coefficients ai and bi are given by ai = \ [Y^j=i aij + S j = i and bi = J?jZ\ 0j* ~ E"=i+i / % ! < * <
(3)
i=l a
ji)
n
-
Remark 2.1. The coefficients a* and 6j, 1 < i < n, in (3), are distribution free, i.e., they are independent of the underlying model F. They enable an interesting resistance of the corresponding statistics to changes in the model, particularly when we compare these statistics with the classical ones. In opposition to what happens with the M and R robust estimators (see Lax [12], Tatum [19]), which usually provide numerical implicit estimates, these estimators have explicit expressions, and consequently, we may easily use them in statistical quality control and metrology.
38
In Table 1, we present, for each entry i, the values a» and bi, in the first and second row, respectively, for the most usual values of n in SQC. The missing coefficients in the table are either zero or obtained through: n
n
2 J Qi = 1,
2 j 6j = 0,
»=i
i=i Table 1.
n
2
1
0.500 -1.000
A
2
en = an-i+i
and bi = -bn-i+i,
1 < i < n.
Values of the coefficients ai and bi for sample sizes n < 10.
3 0.259 -0.750 0.482
4 0.156 -0.690 0.344 -0.198
3
5 0.058 -0.672 0.259 -0.240 0.366
6
7
0.035 -0.666 0.174 -0.246 0.291 -0.058
0.010 -0.661 0.098 -0.245 0.239 -0.073
4
0.306
8 0.007 -0.657 0.064 -0.244 0.172 -0.077
9 0.001 -0.653 0.029 -0.242 0.115 -0.078 0.221 -0.020 0.268
0.257 -0.016
5
10 0.001 -0.652 0.019 -0.241 0.078 -0.079 0.168 -0.022 0.234 -0.004
As an alternative to the TMd statistic in Eq. (1), we shall consider the classical sample mean X. Also, apart from the TR statistic in Eq. (2), we shall consider the usual standard deviation estimators, the sample standard deviation S and the sample range R. Any of these scale estimators is standardized through the division by a scale c„, so that it has a unit mean value whenever the underlying model is normal. The scales cn are provided in Table 2 for the same values of n. Table 2. n S R TR
2 0.798 1.128 1.128
2.2. Measures
3 0.886 1.693 1.269
Scale constants cn for sample sizes n < 10. 4 0.921 2.058 1.538
of Efficiency
5 0.940 2.326 1.801
and
6 0.952 2.534 2.027
7
8
9
10
0.959 2.704 2.210
0.965 2.848 2.365
0.969 2.970 2.491
0.973 3.078 2.611
Robustness
To compare the efficiency of different location estimators, we usually evaluate their mean squared error (MSE). The MSE of Tn, a consistent estimator of a parameter 6, is the quantity MSE(Tn) = E(Tn — 6)2, with E denoting the expected value operator. This measure is affected by the scaling of the estimator, and therefore it is common to compare scale estimators through Var(lnTn), with Var denoting the variance operator. Details about performance measures of scale estimators can be found in
39 Lax [12]. Let us generally consider a model F dependent on an unknown parameter 9, and T = a set of consistent estimators of 8. Definition 2.3. The efficiency of T„ relative to T„
is the quotient
V-yTT—I, for location estimators
* ^ H 3 S 2 i , , , , -"6i-
(4)
) r^—f, for scale estimators [ Var (in T« | Fj ' Definition 2.4. Among the set T of consistent estimators of 8, the most efficient estimator of 9 is the the estimator T„ ° | F , such that arg minM5.E (T„ | F ] , for location estimators «o : = <
ieI
(5)
a) \
r
argminFar (lnTA | F ) , for scale estimators. i e i Definition 2.5. Given a class T of non-degenerate models F dependent on 9, and the class T of consistent estimators of 9, consider (1) for any value n and for any model F € T, the most efficient estimator Tiio)\F, defined in Eq. (5); (2) next, compute the efficiency of all the estimators relatively to the best one, selected in (1), through the measure REF T £ given in Eq. (4). The "degree of robustness" of the estimator T„ € T is given by its minimum relative efficiency among the models F € !F, i.e., dr(T«):=Fmm^(wT£).
(6)
The "robust" estimator is the one with the highest minimum efficiency, i.e. T^ i r ) |F such that ir := arg max v(dr(T^)) i e i
2.3. The Simulation
Study:
Data Set and
. '
Results
In this study we consider Gamma and Weibull processes, Ga(9, S) and W(8,5), respectively, i.e., we shall assume that our characteristic quality X has a gamma probability density function (p.d.f.), fGa(x;8,5)
= jf—
xe-le-x's
, x > 0,
with \x = 58 > 0 and a = 6y/8 > 0, or has a Weibull p.d.f.,
40
fw(x;6,6)
e
-(^)e~1e-^e,
=
x>0,
with fi = 8 r ( l / 0 + 1) and a = 5yjT{2/9 + 1) - T2{\/6 +1), and T denoting, as usual, the complete Gamma function. If S = 1, we shall use the notations Ga(6) and W{6), respectively. In order to have distributions with different skewness and tail weight coefficients, we consider 9 = 0.5, (0.25), 2. Denoting F*~ and $*~ the inverse functions of F and of the standard normal d.f. <£, we shall here consider the tail-weight coefficient :=
T
2
V
/ ( $^(0.99)-<E>*~ (0.5) s $^(0.75)-*^(0.5) )
-(0.99)-*"" (0.5) j _ F~(0.5)-F~(0.01)\ -(0.75)-F~(0.5) F - (0.5)- F - (0.25)
•
and the skewness coefficient, /i3//x2 ,
7
where iir denotes the r-th central moment of F.
In Fig. 1 we present the most efficient estimator for the mean and for the standard deviation, among the estimators under study. In this figure the overall set of considered distributions is ordered by the coefficient T. Estinu Mono the mean value 1 2,260 1,305 1.218 1,105 1,062 1,062 1,042 1,030 1,023 1,018 0,972 0,933 0,916 0,911
y 9,302 7,556 2,828 2,309 2,000 8,000 1,789 1,633 1,512 1,414 8,883 9.909 10,995 12,109
F W(0,5) W(0,75) Ga(0,5) Ga(0,75) Ga(1) W(1) Ga(1,25) Ga(1,5) Ga(1,75) Ga(2) W(1,25) W(1,5) W(1,75) W(2)
Estimation of the standard deviation
3
4
5
6
7
8
TMd TMd TMd TMd TMd TMd TMd TMd TMd TMd TMd
TMd TMd TMd TMd TMd TMd TMd TMd TMd TMd TMd
TMd TMd TMd TMd TMd TMd TMd TMd TMd
TMd TMd TMd
TMd
TMd
TMd TMd
TMd TMd
TMd TMd TMd TMd
Mean Mean Mean Mean
Mean Mean Mean Mean
Mean Mean Mean Mean Mean Mean Mean Mean Mean Mean Mean Mean Mean Mean Mean Mean Mean Mean Mean Mean Mean Mean Mean Mean Mean Mean Mean Mean Mean Mean Mean Mean
Figure 1.
9 TMd
10 TMd
Mean Mean Mean Mean Mean Mean Mean Mean Mean Mean Mean Mean Mean Mean Mean Mean Mean Mean Mean Mean Mean Mean Mean Mean Mean Mean
3 TR TR TR TR TR TR TR TR TR TR TR TR TR TR
4 TR TR TR TR TR TR TR TR TR TR TR TR TR TR
5 TR TR TR TR TR TR TR TR TR TR TR TR TR TR
6
7
TR TR TR TR TR TR TR TR TR TR TR TR TR TR
TR TR TR TR TR TR TR TR TR TR TR TR TR TR
8 TR TR TR TR TR TR TR TR TR TR TR TR TR TR
9 TR TR TR TR TR TR TR TR TR TR TR TR TR TR
10 TR TR TR TR TR TR TR TR TR TR TR TR TR TR
Most efficient estimator.
From this figure we can observe that the bootstrap median and range, as well as the classical S and R, are not at all competitive. The TMd statistic in Eq. (1) is the most efficient estimator for the mean value, when we consider small-to-moderate sample sizes or heavy-tailed distributions, and we advise the use of the sample mean only for large samples of not-heavy tailed distributions. The TR statistic in Eq. (2) is without doubt the most efficient estimator for the process standard deviation. In Fig. 2 we picture the "degree of robustness" of these estimators, i.e., the indicator in Eq. (6), over the same set of distributions. Now we observe that besides the efficiency of the TMd and the TR statistics, they are much
41 Estimation of the mean value
Figure 2.
Estimation of the standard deviation
Degree of robustness of the different estimators.
more robust than the usual X, R and S estimators, and this fact justifies the use, in practice, of the associated control charts. Similar simulation studies for other symmetric and asymmetric distributions have been done to compare several location and scale estimators in terms of efficiency and robustness. For details see, for instance, Chan et al. [4], Cox and Iguzquiza [7], Figueiredo [8], Figueiredo and Gomes [9], Lax [12], Tatum [19]. 3. The Use of Box-Cox Transformations in SPC The hypothesis of independent and normally distributed observations is rarely true in practice. In the literature, many data-transformations have been suggested in order to reach normality. Given the original observations (zi, £2, • • • , xn) from the process to be controlled, assumed to be positive, as usually happens with a great diversity of industrial processes' measures, we here suggest the consideration of the Box-Cox transformed sample,
>?-l)/A
-{i
In Xi
ifA^O if A = 0
Ki
(7)
Provided we may find an adequate A, the observations z may be regarded as approximately normal, with mean value /J,\ and variance <J\. For some distributions we have explicit expressions for fi\ and <j\, but we note that in a real set-up we do not know the real underlying model, and all the three parameters must be adequately estimated. For a large set of distributions we have compared different methods of estimation of the parameter A in Eq. (7), including the maximum likelihood (ML) method, and we concluded that in general it is possible to find an adequate Box-Cox transformation, which enables us to deal with the transformed data as approximately normal. Further details about the estimation and performance of a Box-Cox
42
transformation can be found in Box and Cox [3], Figueiredo [8], Shore [17], Velilla [20]. 3.1. Monitoring
Original and Transformed
Data
Whenever we want to control the mean value and the standard deviation of an industrial process or more precisely, the quality characteristic X at the targets [IQ and
= 1-P(W
&C\
OUT)
= l-P{W
e C\ 6),
(8)
with P denoting the /3-risk, also called the operating characteristic curve. The ability of a control chart to detect process changes is usually measured by the expected number of samples taken before the chart signals, i.e., by its ARL (Average Run Length), or equivalently, by its power function in Eq. (8). In order to monitor Weibull and Gamma processes (X), we shall here implement control charts for the standardized data, i.e., for Y = (X — /x0) /co, where ^o a n d 0o denote the in-control mean value and standard deviation targets, respectively. It is important to point out that monitoring simultaneously the parameters 5 and 9 is not an easy job. Here, without loss of generality, we assume Weibull and Gamma models with 5 = 1, whenever the process is in-control. The shape parameter 9 is again assumed to be known and fixed, 9 = 0.5, (0.25), 2.
43
We also consider an a priori application of a Box-Cox transformation in order to normalize the original data X, and then we implement charts for the standardized data T = (Z - fi\) /ax, Z = (Xx - l) /A in Eq. (7), where Hx and ax denote the mean value and the standard deviation of Z, when the process is in-control. For the Ga{9) models, fix = {T(\ + 6)/T(6) — 1) /A and ax = y/T{6)T{2X + 9) - T2(A + 9) /(Ar(6>)). For the W{9) models we have nx = {T(X/9 + 1) - 1) /A and ax = y/T{2\/0 + 1) - r 2 (A/0 + 1) /A. In Table 3 we present the ML-estimates of A associated to the Box-Cox transformations herewith considered, when dealing with the implementation of the different charts for the transformed data. Table 3. ~
6 W{0) Ga{6)
3.2. Robust
0.5 0.125 0.205
Control
0.75 0.187 0.261
ML-estimates of A. 1.0 0.265 0.265
1.25 0.312 0.266
1.5 0.374 0.277
1.75 0.436 0.278
2.0 0.498 0.283
Charts
Definition 3.3. Whenever we are controlling the process at a target 9$, a control chart based on a statistic W is said "robust" if the alarm rate is as close as possible to the pre-assigned a-risk, whenever the model changes but the process parameter is kept at the respective target 9$. In order to get information on the resistance of the control statistics under study to departures from the normal model, we have computed the alarm rates of the before mentioned control charts for rational subgroups of size n = 3 up to 10. Here, and for illustration, we present, in Table 4, these alarm rates, for rational subgroups of size n = 5, both for the original and for the transformed data (first and second figure in each entry, respectively). To monitor the mean value we have implemented two-sided X and TMd charts, with control limits placed at the Xo.ooi and xo.999 quantiles of these statistics, under a normal model. To monitor the standard deviation we have implemented upper TR, R and S control charts, with the control limit placed at the quantile Xo.998 of these statistics, under a normal model. Thus, when we have normal data the alarm rate of all these charts must be a = 0.002. It is clear from Table 4 that there is a reasonably high variability of the alarm rates for all the charts used to monitor the mean value or the standard deviation of the original data (much more evident for the traditional X, S and R charts), when the model is no longer normal. Nevertheless, for
44 Table 4. Alarm rates of the charts (original data / transformed data). model W(0.5) W(0.75) W(l) W(1.25) W(\.b) Vf(1.75) W{2) Ga(0.5) Ga(0.75) Ga(l) Ga(1.25) Ga(1.5) Ga(1.75) Ga(2)
~X 0.0179/0.0017 0.0126/0.0016 0.0083/0.0017 0.0060/0.0018 0.0044/0.0018 0.0035/0.0018 0.0028/0.0017 0.0118/0.0014 0.0096/0.0015 0.0083/0.0017 0.0077/0.0018 0.0069/0.0018 0.0062/0.0018 0.0060/0.0018
TMd 0.0024/0.0019 0.0034/0.0018 0.0037/0.0018 0.0035/0.0019 0.0032/0.0019 0.0028/0.0020 0.0023/0.0018 0.0038/0.0019 0.0037/0.0019 0.0037/0.0019 0.0037/0.0020 0.0035/0.0020 0.0034/0.0020 0.0033/0.0019
S 0.0484/0.0008 0.0426/0.0008 0.0286/0.0008 0.0175/0.0009 0.0102/0.0009 0.0061/0.0009 0.0035/0.0008 0.0421/0.0002 0.0339/0.0005 0.0289/0.0008 0.0255/0.0011 0.0222/0.0010 0.0201/0.0013 0.0185/0.0016
R 0.0383/0.0005 0.0321/0.0005 0.0207/0.0005 0.0122/0.0006 0.0070/0.0006 0.0041/0.0006 0.0023/0.0006 0.0303/0.0001 0.0245/0.0003 0.0207/0.0005 0.0185/0.0009 0.0161/0.0009 0.0147/0.0011 0.0136/0.0014
TR 0.0337/0.0008 0.0288/0.0008 0.0188/0.0008 0.0112/0.0009 0.0066/0.0009 0.0039/0.0009 0.0023/0.0008 0.0273/0.0002 0.0221/0.0005 0.0190/0.0008 0.0169/0.0011 0.0148/0.0011 0.0135/0.0013 0.0124/0.0016
the most usual rational subgroup sizes, the differences to the normal-case are much smaller whenever we consider the TMd-chart, even for models with a high tail-weight. It is important to emphasize the resistance of the TMd statistic to changes in the model, comparatively to the classical X statistic. Contrarily to what happens to the total median, the total range does not exhibit a "robust" behaviour to departures from the normal model, although it provides higher robustness than the S'-chart or the i?-chart. If we apply an a priori Box-Cox transformation to the data, the TMd chart is "totally robust" to the normality assumption, while the TR chart is only "quasi-robust". However these charts present higher robustness than the classical ones. Other procedures to implement robust control charts can be found in Langenberg and Iglewicz [11], Rocke [14], Rocke [15], Ryan and Faddy [16], Wu [21], among others.
4. Some Overall Conclusions Before implementing the traditional control charts for normal data, normality tests should always be applied, in order to detect possible departures from this common normality assumption. If the normality hypothesis is rejected, the traditional charts are inappropriate, with a low performance whenever we have heavy-tailed underlying distributions. The process monitoring must be carried out using other kind of control charts or a different approach to the process control. The results in Sec. 3 lead us to suggest either a preliminary careful modelling of the process
45
and the construction of control charts for that specific model, or the use of an adequate Box-Cox transformation of the original data prior to the implementation of "robust" control charts, like the ones herewith described. Although it is in general possible to find an adequate Box-Cox transformation, the estimation of the parameter A in Eq. (7) needs to be done with care. In practice, if we have access to an a priori sample of k observations, we advise the use of the bootstrap methodology based on the resampling of our sample through subsamples of size m < k, in order to get more precise estimates of A, fix and a\. The results so far described in this paper, lead us to present a design for the implementation of a procedure for monitoring non-normal processes on the basis of Box-Cox transformations, which includes the Phase-I of estimation and the Phase-II of monitoring: (1) Collect a reasonable large data set associated to any relevant quality characteristic X, say (X\, •• • ,Xk),k = 500, collected whenever the industrial process is in control. (2) On the basis of such a sample and using a bootstrap procedure, generate B = 5000 subsamples of size m = 100. Using either the ML or the E? method, compute the B partial estimates, Aj, fl\i and cf\i, 1 < i < B, and consider their averages, A = 5Z»=i ^i/Bi Ji\ and a\, as the overall estimates to be used. (3) To control the on-line production, collect rational subgroups of size n = 5 along time, apply first the Box-Cox transformation in (7), with A replaced by A, and consider the standardized data (zi — Jl\) /cf\. The data obtained may then be regarded as approximately standard normal, being then safe to use control limits adequate for such a model. (4) To control the mean value of the industrial process, use the TMdchart instead of the usual X-chart. (5) To control the standard deviation, the T.R-chart should be used. Remark 4.1. As time goes on, it is sensible to use the larger quantity of data available in-control, and re-evaluate the estimates needed. As conclusion, we state that the use of robust control charts is not sufficient to obtain robustness to normality departures, specially for high skewed and/or heavy-tailed d.f.'s. However, in a situation of compromise between efficiency and robustness, we can advise the use of "robust" control charts for monitoring the mean and the standard deviation of an industrial process. Since modelling the available data adequately is a delicate problem in Statistics, and the performance of charts based on the Box-Cox transformed
46 d a t a or based on the original d a t a (together with the knowledge of the specific underlying model, assumed t o be known) do not differ significantly, we suggest as a possible and interesting alternative to d a t a modelling, t h e use of the Box-Cox transformation.
Acknowledgments Work partially supported by F C T / P O C T I / F E D E R . References 1. N. Balakrishnan and S. Kocherlakota, Sankhya: The Indian Journal of Statistics 48 (series B), 439 (1986). 2. C. N. Borror, D. C. Montgomery and G. C. Runger, J. Quality Technology 31 3, 309 (1999). 3. G. E. P. Box and D. R. Cox, J. Royal Statist. Soc. B 26, 211 (1964). 4. L. K. Chan, K. P. Hapuarachchi and B. D. Macpherson, IEEE Transactions on Reliability 37 1, 117 (1988). 5. L. K. Chan and J. K. Heng, Naval Research Logistics50, 555 (2003). 6. Y. S. Chang and D. S. Bai, Qual. Reliab. Engng. Int. 17, 397 (2001). 7. M. G. Cox and E. P. Iguzquiza, Advanced Mathematical and Computational Tools in Metrology, 5, 106 (2001). 8. F.O. Figueiredo, Proc. 13th European Young Statisticians Meeting, 53 (2003b). 9. F.O. Figueiredo and M.I. Gomes, Applied Stochastic Models in Business and Industry 20, 339 (2004). 10. M. Hawkins and D.H. Olwell, Springer-Verlag. (1998). 11. P. Langenberg and B. Iglewicz, J. Quality Technology 18 3, 152 (1986). 12. D. A. Lax, J. Amer. Statist. Assoc. 80, 736 (1985). 13. J. M. Lucas and R. B. Crosier, Commun. Statist.: Theory and Methods 11 23, 2669 (1982). 14. D. M. Rocke, Technometrics 31 2, 173 (1989). 15. D. M. Rocke, The Statistician 4 1 , 97 (1992). 16. T. P. Ryan and B. J. Faddy, Frontiers in Statistical Quality Control 6 Physica-Verlag, 176 (2001). 17. H. Shore, Frontiers in Statistical Quality Control 6 Physica-Verlag, 194 (2001) 18. Z. G. Stoumbos and M. R. Jr. Reynolds, J. Statist. Comp. and Simulation 66, 145 (2000). 19. L. G. Tatum, Technometrics 39, 127 (1997). 20. S. Velilla, Statistics and Probability Letters 16, 137 (1993). 21. Z. Wu, International J. Quality 113 9, 49 (1996).
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 47-59)
MULTISENSOR DATA FUSION AND ITS APPLICATION TO DECISION MAKING PEDRO SILVA GIRAO, JOSE DIAS PEREIRA, OCTAVIAN POSTOLACHE Telecommunications Institute, DEEC, 1ST Avenida Rovisco Pais, 1, 1049-001, Lisboa, Portugal In this paper the authors present their experience on the use of data fusion in multisensor systems signal processing, reviewing the different levels of data fusion, some of the techniques mostly used and domains of application, with emphasis on decision making. Examples taken from the authors past work are presented and discussed.
1. Introduction: Multisensor Systems Systems that integrate multiple sensors are increasingly common. The sensors may be spatially concentrated or distributed, may measure one or several quantities and, in the first case, may be of the same or different types. Generally speaking, one can classify the sensors in a multisensor system according to the way they contribute to the final objective in one of the following types: • Complementary, when the sensors are not directly dependent but their output can be combined to yield better data, under a pre-defined criterion; • Competitive, when the sensors provide independent measurements of the same quantity. They can be identical or can use different measuring methods and they compete in the sense that when discrepancies occur, a decision on which to believe must be made. Sensors are put in competition either to increase the reliability of a system or to provide it with fault tolerance; • Cooperative, when the sensors output is combined to produce information that is unavailable from each of the sensors alone; • Independent, in all the remaining cases, basically when their outputs is not combined. In trivial situations, sensors output convey the required information, for instance when they measure the aimed quantities with the desired accuracy. More often, though, multisensor systems are justified either because the accuracy provided by a single sensor is not sufficient or because the information required can not
48 be obtained by measuring a single quantity. The use of n identical independent sensors not only allows reducing uncertainty by a factor of nI/2 but also permits the detection of faulty sensors, increasing systems robustness. Sensors yielding complementary information may lead not only to better observability of the quantities they measure, but also to obtain information of other quantities (e.g. two position sensors allow the measurement of the velocity of a moving body). It is in this context that sensors data fusion is decisive. 2. Data Fusion 2.1. Definition and General Considerations Defined as the process of combining data or information to estimate or predict entity states [1], data fusion is pertinent in more than multisensor systems applications. Nevertheless, these systems do use most of the techniques presently associated to data fusion. According to the above definition, data fusion is information integration and thus we prefer to say that data fusion is the process of combining data and knowledge from different sources to maximize the useful information content. Well established as an engineering discipline, data fusion has its support on more or less traditional disciplines such as signal processing, statistics and statistical estimation, control theory, numerical methods and artificial intelligence. Initially associated with military applications [2-5], data fusion has been successfully used and developed for applications such as agricultural resources monitoring, natural resources location, weather and natural disasters forecast, robotics implementation, automated control of industrial manufacturing systems, smarts buildings, medical apparatus, environmental monitoring and quality assessment and preventive maintenance. Some of data fusion vocabulary and terminology is not however fully established, importing words and meanings from several of the application domains where it is mostly used, namely defense and image processing. In this paper, the authors present and discuss subjects using a language they think to be fairly universal, explaining terms or making references to the literature whenever they feel adequate. It is their personal way of framing data fusion issues and thus differences from what other authors and what can be found in the specialized literature are naturally to be expected.
49 2.2. Data Fusion Models and Architectures The first data fusion model was developed in 1985 by the U.S. Joint Directors of Laboratories (JDL) Data Fusion Group and is known as the JDL data fusion model. Widely accepted as the base model, it was revised more than once, the later revision dating from 2004 [6]. Dasarathy [7], Bedworth and O'Brien [8], and Salerno [9], among others, also proposed models and architectures for data fusion with merits. All the contributions are important and welcome to provide a comprehensive framework for information fusion. In different ways, all data fusion models, either of the functional type, like the JDL and Dasarathy models or of the process type, like the Omnibus model of Bedworth and O'Brien, differentiate types of data fusion functions using fusion levels. Those levels are more or less connected to the stages at which the fusion occurs. The identification and characterization of the stages is in our opinion particular relevant because each stage is basically served by specific fusion techniques or processing architectures, which allows for some systematization. Three are the fundamental stages to consider: • Direct fusion of sensors data or raw data fusion. If sensors measure the same quantity their raw data can be directly fused; otherwise, either the output of each sensor is converted into relative units (normalized), and special algorithms (e.g. neural networks) used, or data must be fused at feature/state vector stage or decision stage. • Feature or state vector fusion. A characteristic or a feature vector representing sensor data is extracted from the sensor output before fusion is implemented. • Decision fusion. The sensors data are combined with other data or a priori knowledge or the data of each sensor is processed to yield inferences or decisions that are then combined. 2.3. Data Fusion Techniques 2.3.1. Pre-processing techniques The organization of fusion in levels or stages separates data fusion processes involving different levels of complexity. Processes are served by techniques and thus it is possible to associate the techniques more commonly used at each stage or level of data fusion. Before, though, it is worth while mention some aspects and techniques that can not be truly included but that are often involved in sensors data fusion because they respect each sensor individually and take place
50
before the implementation of any type of fusion in what we may call a preprocessing stage: 1. Sensor inverse modelling. Some algorithms used in data fusion can operate on data irrespectively of what is represents. Nevertheless, it is always necessary, sooner or later, to obtain the values of a measured quantity from a sensor output. This involves what is commonly called sensor inverse modelling. Polynomial approximations and artificial neural networks are two of the techniques that can be used. 2. Sensor characteristic approximation. Polynomials and neural networks are also useful tools to approximate the relationship between the sensor output and input (sensor characteristic). Support vector machines (SVMs) [10] is another class of algorithms that can be used, sometimes with advantage, for this purpose. Sensor characteristic approximation is important in many practical situations. An example is presented in [11]. 3. Data size reduction. In multisensor systems the quantity of data available can be extremely huge. The size of memory required to store data is directly related with data size. The computational load of processing algorithms also depends on data size, often in an exponential way. The reduction either of the size of a data set or of the number of data sets becomes thus mandatory in many applications either at the sensor output level or later during fusion. It is important to remind here that the performance of the software depends not only on the size of the data it handles but also on how data is represented and structured. This issue must then be often addressed in data fusion [12]. The two types of data size reduction identified before are however in potentially essentially different. While the elimination of values output by a sensor reduces the coverage of the sensor range, the elimination of a data set may correspond to the reduction of the dimension of the input space because the information about one of the input quantities is deleted (dimensional reduction of data). In this case, we are indeed in the presence of a data fusion process. In both cases information is lost and the reduction should be made only if the loss of information is not critical or if more information is gained than lost (e.g. improvement in visualization). The reduction of the size of a data set of validated values output by a sensor by elimination of some of the values is very much case dependent and, to our knowledge, there is no general purpose technique or algorithm to do it. A possible alternative to sensor data reduction is to replace sensor data with
51 data that conveys "equivalent" information for instance using the FFT, the Cepstrum or Wavelets based algorithms. The pre-processing tasks and techniques above mentioned are general but not exhaustive. In applications using images, namely, other pre-processing specific tasks are required [13]. 2.3.2. Processing techniques Dimensional reduction of data, as a data fusion process, uses several techniques and algorithms namely, principal components analysis (PCA) [14,15], Sammon's mapping [16] and artificial neural networks (ANNs) and data clustering [17]. In what direct fusion of sensors data is concerned, and as mentioned before, two different situations may occur: (a) the sensors measure the same quantity and their raw data can then be fused using from simple averaging techniques to estimation techniques (e.g. Kalman filtering); (b) the sensors measure different quantities and fusion can be made using neural networks or at a higher stage. As already mentioned, in feature fusion the association is not directly of sensors data but of features extracted from their data. Thus, feature fusion involves both feature extraction and features association. Features may be obtained in many different ways (e.g. parameters of a model that fits data, data spectral content, a parameter mathematically extracted from data, etc.), but the processing of the feature vector that results from the concatenation of sensors features generally recurs to pattern recognition techniques such as neural networks, clustering, statistical classifiers, support vector machines, and mapping (position of the feature vector in a feature space map). Decision fusion is, in our approach, the ultimate stage of data fusion. In short, decision fusion operates on all available information coming either from sensors or from other sources (a priori knowledge). As in all other data fusion stages, all available techniques can be used in decision fusion but the most used and better performing include the following; classical methods of inference and evidence combination (e.g. maximum a-posteriori, minimax, etc.), inferences based on probabilities (Bayes), on belief (Dempster-Schafer [18,19]) and on possibility [20], voting (weighted, plurality, consensus, etc.), and artificial intelligence methods such as expert systems and neural networks (self organizing maps, in particular).
52
3. Data Fusion Examples The examples that follow come from authors' activity during the last 10 years. Before going into details, however, a short reference on the typical situations we recurred to fusion related techniques and algorithms in the context of multisensor systems: 1. Increase of the accuracy and reliability of a measuring system using n complementary sensors. 2. Reduction of the errors of a sensor due to influence quantities. 3. Correction of sensors drifts. 4. Measurement of quantities using output dependent sensors. 5. Situation assessment and decision making based on multiple sensors, isolated or networked. 3.1. Measurement Accuracy Increase Using Competitive Sensors Turbidity is a quantity particularly important in water quality assessment. The value of turbidity is obtained by measuring the attenuation light experiences when travelling between two points of a medium. The measuring sensor includes, thus, a light source and a light sensor, but two different configurations are usually used. In one, the sensor faces the emitter and detects the direct light while in the other the sensitive surface of the sensor makes an angle with the emitted beam different from 90° and detects then the scattered light. We studied the problem and in [21] we proposed the solution depicted in Figure 1 able to yield more accurate turbidity values than the basic configurations. MSI
scattered light
transmitted light Figure 1. The three infra red emitter, one infra red receiver turbidity measurement cell.
53
One single sensor is used. However, by fusing the data obtained with the different emitter-receiver possible configurations (Figure 2) it is possible to measure turbidity with an estimated uncertainty of ±1% of a full scale of 1000 NTU. 3.2. Increase of the Accuracy of a Measuring System Using Correlated Sensors The situation addressed in [22] can be summarised as follows: n quantities are to be identified and measured. For that purpose, a set of n sensors whose outputs are of the same type (e.g. a voltage level) is used. Each sensor informs on a quantity, but due to limited selectivity, their outputs are correlated. How to extract the value of each quantity with increased accuracy? DIOO DIOl DI02
UiRD(n-l)
•
UniD(n)
di
• Jft.
tra
1=0010
'UOISSIUISU
IR photodiode voltage values
\V7
NN weights and biases
''
scattering O0=land/orDIO2=l
UIRD(0)
Wl bl W2 b2
,, s XX Neural processing algorithm
TUNN
Figure 2. Neural network processing of the turbidity cell optical sensor.
The quantities to measure were then the concentration of lead and cadmium in water. They heavily depend on temperature and thus a temperature sensor was included in the measuring system. Sensors data fusion was performed using two techniques, a weighted voting technique according to the scheme depicted in Figure 3, and the neural network of Figure 4.
54
uPb+2
sensorPb2+
sensorT
UT
sensor Cd
f(UPb2+,T)
UT/T
Uc
f(Uw2+,T)
Figure 3. Weighted data fusion.
n
UPI
(U Pb +2 ) N
(Lpb
)
+2 N
(u C d ) NI
(C c d 2 + ) N DN
UT
procT Figure 4. Fusion architecture based on MLP-NN (Multilayer Percepton Neural Network): Nnormalization block, DN - de-normalization block.
From the results obtained we concluded then that, in this case, the voting method was a direct method that required few data but a good knowledge of the sensors to be implemented. On the other hand, the neural network required larger data sets, might not converge, but no a priori knowledge of sensors characteristic was needed. In what concerns accuracy, the voting method allowed a 10 times error reduction while the neural network proved more flexible in error reduction (up to 1000). 3.3. Situation Assessment and Decision Making Based on Multiple Sensors, Isolated or Networked: Water Quality Monitoring Our last example [23] concerns a distributed measuring system designed to monitor and assess the quality of the water in the estuary of the Tagus river. The architecture of the system is repeated here for convenience (Figure 5). PAP1, PAP2 and PAP3 are three of the acquisition and processing units. Each PAP, the associated set of sensors, and the means to wireless transmit and receive data constitute a stand-alone station installed in a buoy or raft placed in a fixed location of the estuary. In the local stations, parameters of the water are
55
measured and data fused to obtain measurements less dependent on influence quantities and with increased accuracy (direct fusion). The major goal of the system is to monitor the values the water parameters in order to assess water quality and to detect the occurrence of pollution and, in this case, its possible source and evolution. For this purpose, the data of all the stations is transmitted to a land based computer and fused using an artificial intelligence technique: Kohonen Self Organizing Maps (K-SOMs). For decision making, the fusion uses both sensors data and a priori knowledge (accepted limits for the values of water parameters). Kohonen maps are neural networks widely used in data analysis and vector quantization both because they compress the information while preserving the most important topological and metric relationship of the primary data and also for their abstraction capabilities. In the case depicted in Figure 5, each K-SOM defines a mapping from the input space R4 (TU, C, pH, T WQ parameters) on to a regular two-dimensional array of nodes. Upon training, the plane of representation becomes divided into clusters of cells (Figure 6) corresponding, in our case to the following conditions: "Normal Functioning (NF)" (the WQ parameters values are inside pre-defined limits; "Fault Event (FE)" (the WQ values corresponds to faulty functioning of one or several PAPs measuring channels), "Pollution Event (PE)" (the WQ values are outside the pre-defined values. For decision making, the position of each set of data (values of the measured quantities) on the corresponding PAP mapping is determined. The degree of confidence depends on the comparison between the vector representing the data and a reference vector that describes the cell whose distance to the current input vector is a minimum. This comparison is quantified using a Gaussian kernel [23]. The closer to 1 its value is, the higher is the probability that the input vector is assigned to the correct cluster.
56
CCP unit
•ft
GPRS
s
-
'
WLAN Ethernet j bridge Ethernet PAP3 unit
^y _
. •
l
—iWLAN
Ethernet
Ethernet
* " * «
sensors
WI^^WWBBR
piiPiSSj modem W
J L J L J L J L WQ
sr P
HHin
RS232J
Access, Pout
PAP1 unit
water under test
GPRS * Jp
C pH T TV
wafer ' under test
" *
1
WLAN Ethernet
Ethernet
ftricge
PAP 2 unit WO sensors
•C pH T TV
Figure 5. Architecture of one of the distributed measuring systems proposed in [23].
57 i
i
"
1 I FE |
FE ' FC
FE
, FE , FE f£
FE
F£
E-t
FF
*
I
1
Nh
| ^
S~£
FE
FE
NFNf.
FE
F£ , Nf
NF NF NF
NF
NF
NF
P£ PF
PC
PE
PF.
PE PE
PE
NF
NF
FE FE
FE
FE FE
FE FE
NF
FE
NF
Ft
NF
FF FF
NF
NF
I FE |+
FE
NF
NF
NF
PE
PC
Pfc PE
PAP1 K-SOM
Nf Nr
NF
PR PE
NF Nf
NF
PF PE
PC
NF
FE
tjF
Nf
NF NF
Pfc PE
N<-
NF NF
NF
FE
TCft TETE
Nr
FE
NF
l
1
p£
pfc
pc
p£
PE PE
pg PE
p£ PF
pf-
PAP2 K-SOM
p£ PE
p(-
NF
PF PF
PL PF
PF
PC
PZ
PC
PE
PAP3 K-SOM
Figure 6. PAPs K-SOM graphical representation.
4. Discussion and Conclusions We addressed in this paper several issues related with sensors data fusion and provide examples to illustrate some more or less trivial situations. Paper length limitations kept us from discussing in particular the option made in the last example. Other very important issues when dealing with data were not addressed and a brief reference to some of them is made now. • How to gauge the structure of data sets or how to have information on individual values within a data set? Association metrics, i.e. assigning a number to data according to its degree of likeness is a possibility. In the case of numerical values, one of the metrics that can be used is the sample product-moment correlation coefficient or Pearson product-moment correlation coefficient. • In the examples we presented, measurement data did not include uncertainty. How to deal with the uncertainty of each sensor? How to represent it or specify it? There are several ways [24] here briefly summarized: 1. Explicit accuracy bounds specification, if the sensor is operating correctly. 2. Probabilistic and possibilistic bounds specification. Each sensor output value is assigned a probability value. The probabilities are combined using Bayes' rule or Dempster-Schafer methodology. 3. Statistical specification in terms of type, mean and standard deviation.
58
4.
Fuzzy logic specification. Each sensor output value is a membership function of a fuzzy set. • With the exception of the last example, we considered sensor based systems whose information is the only input to problem solving. Data fusion also encompasses cases where measured values are only partially or not at all available. Then, the framework into which the problem is to be solved is less well defined and problems tend to be solved using heuristics (rules of thumb), approximate methods or probabilistic methods which do not guarantee a correct solution. This is also the domain of computer programs that represent and reason with knowledge of some specialist with the objective of solving problems or giving advices, the so-called expert systems, which like for instance neural networks, is a sub-field of artificial intelligence. One final comment. In the examples given above the information integration is mostly based on neural networks. It is clearly a polarized and limited vision that results from years of using a tool that proved quite efficient for many problems the authors had to face. The techniques and algorithms for data fusion are however only limited by their efficiency in problem solving and the authors face now (e.g. prediction of water pollution based on the system of example 3.3, heart rate variability and blood pressure variability feature fusion [25]) and will surely face in the future situations to apply other techniques better suited for the problem to overcome. References 1. David L. Hall and James Llinas (Editors), Handbook of Multisensor Data Fusion, The Electrical Engineering and Applied Signal Processing Series, CRC Press LLC, Boca Raton (FL),USA, 2001. 2. R. P. Bonasso, Army Conference on Applications of AI to Battlefield Information Management, 1983. 3. P. L. Bogler, IEEE Trans, on Systems, Man and Cybernetics, Vol. 17, No 6, pp. 968-977, 1987. 4. Dennis M. Buede, IEEE Trans, on Systems, Man and Cybernetics, Vol. 18, No 6, pp. 1009-1011, 1988. 5. T. J. Hughes, Measurement and Control, v. 22, n. 7, p. 203, Sept. 1989. 6. James Llinas, Christopher Bowman, Galina Rogova, Alan Steinberg, Ed Waltz and Frank White, Proc. 7th Intl. Conf. on Information Fusion, pp. 1218-1230, Stockholm, Sweden, 2004. 7. B. Dasarathy, IEEE Proceedings, Vol. 85, N°. 1, 1997. 8. M. Bedworth and J. O'Brien, Proc. 2nd Intl Conf. Information Fusion, 1999.
59 9. J. Salerno, Information Fusion: A High-Level Architecture Overview, 5th Intl Conf. Information Fusion, 2002. 10. N. Cristianini and J. Shawe-Taylor, An Introduction to Support Vector Machines (and other kernel-based learning methods), Cambridge University Press, Cambridge, NY, USA, 2000. l l . O . Postolache, P. Girao, M. Pereira, Helena Ramo, Proc. Third Intl. Conf. on Systems, Signals, Devices (SSD'2005), Vol. IV, Sousse, Tunisia, 2005, ISBN 9973-959-01-9/ © 2005 / 9885 IEEE, CD published. 12. R. R. Brooks, S.S. Iyengar, Multi-Sensor Fusion, Fundamentals and Applications with Software, Chapter 5, Prentice Hall, New Jersey, USA, 1998. 13. Yuhua Ding, Multi-Sensory Image Fusion Techniques at Image- and Feature-Levels, Ph.D thesis, available at: http://icsl.marc.gatech.edu/papersAfuhua_qualify_2001_03.pdf 14. Lindsay I. Smith, A Tutorial on Principal Components Analysis, 2002, available at: http://www.cs.otago.ac.nz/cosc453/student tutorials/principal components, pdf 15. http://www.fon.hum.uva.nl/praat/manual/Principal component analysis.html 16. J. W. Sammon, IEEE Trans, on Computers, C-18(5), 1969. 17. P. Andritsos, Data Clustering Techniques, 2002, available at: http://www.cs.toronto.edu/~periklis/pubs/depth.pdf 18. A. P. Dempster, J. Royal Statistical So., Series B, Vol.30, pp 205-247, 1968. 19. G. Chafer, A Mathematical Theory of Evidence, Princeton University Press, Princeton, NJ, USA, 1976. 20. D. Dubois and H. Prade, Possibility Theory: An Approach to Computerized Processing of Uncertainty, Plenum Press, NY, USA, 1988. 2 1 . 0 . Postolache, P. Girao, M. Pereira, H. Ramos, Proc. of IEEE Inst, and Meas. Tech. Conf. (IMTC 2002), pp. 535-539, Anchorage, AK, USA, May 2002. 22. P. Girao, O. Postolache, M. Pereira, H. Ramos, Proc. 12th IMEKO TC4 Intl. Symp. Part 2, pp. 442-447, Zagreb, Croatia, September 2002. 23. O. A. Postolache, P.M.B.S. Girao, J.M.D. Pereira, H.M.G. Ramos, IEEE Trans. On Instr. and Meas., Vol. 54, n. 1, pp. 322-329, February 2005. 24. R. R. Brooks, S.S. Iyengar, Multi-Sensor Fusion, Fundamentals and Applications with Software, Chapter 8, Prentice Hall, New Jersey, USA, 1998. 25. G. Postolache, I. Rocha, L. Silva Carvalho, O. Postolache, P. Girao, Proc. of IMEKO TC-4 13th Intl. Symp. on Measurements for Research and Industry Applications and the 9th European Workshop on ADC Modelling and Testing, Vol. 2, pp. 451-456, Athens, Greece, 2004.
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 60-72)
GENERIC SYSTEM DESIGN FOR MEASUREMENT DATABASES - APPLIED TO CALIBRATIONS IN VACUUM METROLOGY, BIO-SIGNALS AND A TEMPLATE SYSTEM HERMANN GROSS, VOLKER HARTMANN, KARL JOUSTEN, GERT LINDNER Physikalisch-Technische Bundesanstalt, Abbestrafie. 10587 Berlin, Germany
2-12
A general relational data model for setting up large or medium sized databases of measurement data in the field of metrology is discussed. Structuring and organization of the measurement data in vacuum metrology is considered to be an illustrating and typical example from the metrological domain. The second example describes the data model of a bio-signal database containing electro-cardiograms and clinical data of patients with cardiac diseases. The key ideas of the presented generic data models can be expressed by a separation of the relevant data objects into two groups: The first group describes the metrological area semantically by methods of measurement and by parameters of physical or technical quantities needed to define the characteristic features and the data sampled by these methods. The usage of these methods is described in the second group of data objects: Each measurement or calibration is seen as an event. It registers the deployment of a method and provides all the sampled current values of the parameters belonging to the method. The fixed current values of the parameters are organized in data-type-specific objects. They represent the data collections resulting from the measurements or calibrations. The major parts of the data model for calibrations in vacuum metrology and for bio-signals in medical physics are used to derive the reusable kernel of a template database suitable for different metrological domains. It represents the predefined and simple structure for an openended data model which is flexible enough to support a wide range of metrological applications.
1. Introduction Database technology is an increasingly significant subject in metrology. The ability of modern measuring systems to acquire data, to process them, and thus to generate additional data has increased enormously. This development has stimulated many activities aimed at using existing systems and developing extensions. Many laboratories use easily manageable spreadsheets, but often without making any efforts with regard to the underlying data model - or even a reusable data model. Large-scale databases are mainly used when infra structural issues are to be dealt with or when several partners are involved and special efforts are needed for designing and implementing suitable data structures. While the use of database systems in metrology can lead to a
61 range of benefits, it also introduces new elements of risk to which serious attention should be paid (e.g. the risk of data loss, data manipulation, etc.). It is important to ensure that the systems used are suitable with regard to their functionality, flexibility, availability, security, etc. and that the resources deployed in development and maintenance are used effectively. From the database technology point of view, the requirements for databases in metrology are rather heterogeneous. To some extent, this paper is an attempt to systematically summarize and generalize the experience the authors have gained in building up a reference database for bio-signals [1,2,3] and from discussions with colleagues responsible for the calibration of vacuum gauges. In particular, a workshop on database applications in metrology held at PTB in November 1999 was very helpful to gain insight into alternative concepts and for comparative evaluations [4]. Section 2 deals with some of the basic methods needed for a successful implementation of database applications. Section 3 considers the requirements posed by metrology or in metrology-related environments. Two examples are used to illustrate and to consider the essential requirements. Section 4 briefly outlines the solutions for the data model of a database in vacuum metrology and for a bio-signal database, both presented as normalized table schemas. Then in section 5, the common aspects of these cases are discussed and used to derive the kernel of the data model for a template database suitable for different metrological domains. 2. Basic Methods Today, relational or object-relational database management systems have many advantages over their predecessors based on the hierarchical or network data models. Purely object-oriented database management systems have been developed to a fairly great extent [5]. However, in most of the practical applications, the relational database management systems are still in use. The relational data model [6,7] - which is not based on any particular data structuring paradigm but rather on a mathematical foundation - has several important advantages: (1) Data independence, meaning that data representation or data structures are independent of the processes that use data, i.e. independent of the application. The physical representation can be modified without affecting the application programs. (2) Declarative manipulation, i.e. a standardized structured query language (SQL) is available for creating, manipulating and enquiring the
62 data. (3) No redundancy in the final implementation of the data model if the process of normalization is applied to the relational model. Primary emphasis is placed in database technology on enhancing the functionality and performance of relational systems. The realization of distributed relational databases in client/server architecture or even multi-tier architectures on different nodes in a computer network is an example of greater functionality and offers the possibility of optimizing the usage of available resources, including World Wide Web applications, and so integrating different data sources. A further example is the capability to handle binary large objects (BLOBs) as a special data type within relations. BLOBs can be images, texts or the digitized data records of any technical instrumentation, e.g. electro- or magnetocardiographic recordings or multiple time series of calibration measurements. Entity relationship modelling (ERM) is the basic method for database design and is one of the essential parts in the system design phase during the build-up of databases controlled by relational or object-relational database management systems (RDBMS). The objective is the entity relationship model of the data as a logical description of all different objects (entities) and their references (relationships). That means that the appropriate ERM has to be developed. Once this important step has been accomplished, this conceptual data model is translated into the 'physical' data model of the target database system, i.e. the database table scheme. This transformation is supported quite well by tools available in database technology and is referred to as the normalization process which guarantees that no redundancy occurs in the resulting table schema. In this paper we refer to the resulting table schema as the data model. A well-known and very intuitive form to describe a table schema is its graphical representation, where the tables are boxes and their references to each other are lines connecting them. With the above-mentioned standardized structured query language, SQL, such a table schema can be written as a script. The features of SQL concerning the definition of data types and constraints are used to create a command script whose execution on the target database management system will generate all tables and their relationships. 3. Requirements A database for the measurements and calibrations in the field of vacuum metrology should support the documentation and archiving duties of the calibrations performed. It serves as a reference archive and repository of
63 measurement and evaluation results obtained from different calibration systems at different times and attained with different units under calibration. The typical requirements can be summarized as follows: (1) The data model should support the semantic diversity of data which is due to the variety of calibration methods in vacuum metrology and due to the different pressure devices to be calibrated. (2) However, a great number of relations exist between the different calibrations as the measurement equipment, the calibration and evaluation methods, the operators and even the calibration objects etc. can be the same. (3) The knowledge of relations existing between the different methods and of their restrictions, e.g. sequences of measurements, assessments to be carried out after certain derivations etc., has to be represented in the data model. (4) The data measured or generated by data processing must be related to the methods producing the data. (5) The concept must be open for the integration of new procedures, i.e. for the definition of new methods, data and relations between the newly defined elements as well as for existing ones. (6) Furthermore, the design has to allow the representation of different kinds of data consisting of heterogeneous data types along with complex data, e.g. dynamic vectors or large blocks of binary data ('multimedia capability'). Obviously these requirements are also useful for other application domains where complex measuring and evaluating processes occur on carefully prepared objects or patients, as illustrated in the case of a bio-signal database containing electro- and magneto-cardiograms (ECGs, MCGs) [2]. Here the functional requirements must take the demands of the bio-electrical and bio-magnetic research laboratories into account, as well as the demands resulting from their co-operation with the clinical departments in which the participating patients were treated or are being treated. The ECG and MCG records are measured with special instrumentation that guarantees signals of highest quality. These records stem from patients suffering from cardiac diseases. Their diagnoses are confirmed by standardized clinical examinations and the expertise of cardiologists. To be effective, the database design has to take into consideration the following: (1) The complete process of data processing from acquisition to evaluation should be supported, i.e. the current processing state of each bio-signal record and its history should be documented. (2) The description of these processes should be storable. Not only the measured values themselves but also their meanings within the context of a more detailed process description are part of the database. (3) The clinical terms should be uniquely and uniformly coded so that they can be reliably retrieved. (4) The design has to be extensible so that new kinds of data, resulting either from new clinical examinations or further
64 evaluations, can be added without necessitating changes in the definitions of existing database tables. (5) A flexible design has to allow the representation of data which consist of heterogeneous data types along with complex data, such as large blocks of binary data (e.g. ECGs, MCGs) or dynamic parameter vectors representing the results of special evaluation procedures. 4. Suitable Generic Data Models In order to fulfil these requirements and to comply with the great semantic diversity of data due to the variety of different methods and evaluations, generic data models based on the relational concept have been developed. Figure 1 presents the data model for a database developed for all data belonging to calibration processes in the field of vacuum metrology. CLIENT » CL_SHORTNAME o CAMPANY NAME
EDrroR # EDI_SHORTNAME * NAME
CALIBRATION » CALIBJD * CUJHORTNAME o EDI SHORTNAME * OBJJD o ... o NOTES
CALIEVENTS * NO o CE_CALIB_ID o CE_EDITOR * CE_SCHEMANAME * DATETME o NOTE
P_LOBS * CALI_EVE_NO » SCH_PAR_ID * PLOBJNDEX * CONTENT
CALI SCHEMAS * SCHEMANAME o DESCRIPTION
PJ1UMBERS # CALI_EVEJTO # SCH PARJD »PNJNDEX # PN_MEAS_POINT o CONTENT
SCHEMA PARAMETERS * SP_ID * SP_PARNAME * SP SCHEMANAME * MAX INDEX
CALD3RATIONOBJECT « OBJJD • OBI SHORTNAME
ATTACHMENTS # OBJ_ID_PARENT § OBJ_ID_CHILD o NOTE
PTEXT « CAU_EVE_NO # SCHJARJD SPTJNDEX o CONTENT
PARAMETERS » PARNAME o DESCRIPTION
Figure 1. Data model of the calibration database (as relational table diagram).
Each table in figure 1 is presented in a frame with a title bar giving the name of the table and different rows defining the attributes of the corresponding table. Attributes marked by '*' are mandatory; if they are marked by 'o', they are optional. Attributes marked by '#' are primary key components. The primary key is the unique identifier of an entry of a table. It can be composed of several attributes. The connecting arrows between the tables represent (l:n) relationships, e.g. the foreign key constraints for attributes in the table where the arrow tips adjoin. A foreign key constraint for an attribute means that the values
65 of this attribute are identifiers from another table. If a connecting line is not solid, the foreign key constraint refers to an optional attribute. The main design decisions for the data model shown in figure 1 can be summarized as follows: (1) The calibration scene is structured into different calibration processes in the table CALIBRATION. We here use the term 'calibration' as a synonym for the calibration process, i.e. a series of events. Apart from several administrative attributes, three of them refer to other tables. These attributes are parts of the relationships in the data model: The attribute ' O B J I D ' refers to the table CALIBRATIONOBJECT giving the details of the devices to be calibrated. The attribute 'CLISHORTNAME' refers to the table CLIENT providing the information on the client who ordered the calibration. 'EDISHORTNAME' refers to the table EDITOR pointing to the supervisor of the calibration. (2) The measurements belonging to a calibration are organized as events of calibration experiments and registered in the table CALI_EVENTS: Each entry expresses the application of a special calibration experiment. Again, three attributes of the entries in this table refer to entries in other tables: The attribute 'CE_CALIB ID' refers to the table CALIBRATION and indicates to what calibration process the calibration event belongs to. 'CEEDITOR' refers to the table EDITOR and indicates the responsible operator for the calibration event. 'CESCHEMANAME' refers to the table CALI_SCHEMAS and identifies the applied calibration method. In CALISCHEMAS only short explanatory descriptions of the applied calibration methods are given. (3) Details of each calibration method indicated in the table CALI_SCHEMAS are given in the table SCHEMA_PARAMETERS. Here the usage of the different physical or technical parameters belonging to the calibration methods is indicated. Consequently, two of the attributes in this table refer to entries in two other tables: The attribute 'SP_SCHEMANAME' refers to the table CALI_SCHEMAS and indicates the calibration method which utilizes this parameter. Attribute 'SP_PARNAME' points to the table PARAMETERS and gives information on a particular parameter (quantity). In the table PARAMETERS, the physical or technical quantities which can be measured or evaluated by the various calibration experiments or their evaluation methods are indicated. The independent indication of a parameter by a table of its own allows the same parameter to be used in different calibration
66 methods. So a context-independent definition of parameters is combined with their method-dependent usage. (4) The results of the execution of a calibration experiment are the measured current values of the parameters belonging to the applied calibration method. They are stored under the attribute 'CONTENT' in the data-type-specific tables P_LOBS for binary large objects and P_NUMBERS and P_TEXT for simple numbers and terms. Two attributes in each of the tables refer to the tables CALI_EVENTS and SCHEMA__PARAMETERS: Attribute 'CALI_EVE_NO' indicates the event to which the current fixed value - stored in the attribute 'CONTENT' - belongs. Attribute ' S C H P A R I D ' gives information on the parameter whose current fixed value is stored into the attribute CONTENT. The primary keys of the tables P L O B S , PNUMBERS and P T E X T are composed of more than the two components just described. In order to offer the possibility to store multiple values for the same quantity the attributes P<X>INDEX [<X>: {LOB,N,T}] are established and used as a further primary key component. The default value of these attributes is 1. In this way it is possible to register a series of pressure values or a series of values for another parameter. For the table PNUMBERS a further extension of the primary key components by the attribute PN_MEAS_POINT was realized. Its default value is also 1 and it offers a further dimension, i.e. a further degree of freedom for the registration of the values of the same quantity. Figure 2 presents the kernel of the generic data model for the second example, the already-mentioned bio-signal database of PTB (PhysikalischTechnische Bundesanstalt) shown as a normalized table diagram. Again, the concept of events is applied as follows: • Clinical findings, such as ECG recordings or evaluations of these, are handled as EVENT_ISSUES. Each entry in this table has as mandatory columns the date/time and a reference to the table METHODS, where the acquisition or evaluation method of the event is defined. So the table EVENT_ISSUES is the main registry of a data acquisition or data evaluation having happened. • Different methods of data acquisition or data evaluation are handled as entries in the table METHODS, described by their names and their general descriptions. • To get a more detailed description of all methods defined, each entry in the table METHODS is referred to by a set of entries in the table
67 METHOD_PARAMETERS. Here the usage of the parameters belonging to each method is described. The table METHOD_PARAMETERS refers also to the table PARAMETERS, so that the same parameter can be used in different methods. In this way an independent parameter description is combined with a context-dependent usage of the parameter. The entries in the table PARAMETERS express the physical, biophysical or clinical quantities which can be measured or evaluated by the various data acquisition or data evaluation methods. EVEJ3TRUCTURES *EVE_NO #EVE WO CHILD
P # # *
BLOBS EVE NO MP ID Value
METHODS # ID * Name o Description
EVENTJSSUES * NO * METJD * DATETIME o PAT_NO o PHYJD o STA ID
PJMUMBERS # EVE_NO # MP_ID # PNJNDEX o Value
P_STRINGS # EVE_NO # MPJD # PSJNDEX o Value
METHODPARAMETERS * ID * METJD * PARJD * MAX INDEX
PATIENTS # NO » BIRTHDAY * SEX
PJJATETIMES # EVE_NO # MPJD # P_DT JNDEX o Value
PARAMETERS # ID « NAME
MET STRUCTURES » MET ID * MET ID_CHILD
Figure 2. Kernel of the data model for a bio-signal database as table diagram.
The fixed current values of the parameters belonging to a measurement - or evaluation event are stored in the data-type-specific tables P_NUMBERS, P_STRINGS, P_DATETIMES and P_BLOBS into the attribute 'VALUE'. So the event identifier and the identifiers of the method parameters are the main components of the primary keys of the rows in these tables. Dependencies between different entries in EVENTISSUES or in METHODS, such as parent-child relationships, are supported by the tables EVE STRUCTURES and MET STRUCTURES.
68 The optional column PAT_NO in the table EVENTJSSUES refers to the table PATIENTS. For instance, special data acquisition methods, such as clinical examinations or MCG recordings, refer directly to a patient. An MCG measurement is executed on a patient. However, an evaluation or data processing method, such as filtering or averaging of MCG curves, is executed on already existing records of patients. Such methods do not refer to a patient directly. Therefore, this column is optional. The same is true of the other optional attributes PHYJD and STAID in the table EVENTJSSUES. They refer to the tables PHYSICIANS and STAFFS, which are not shown in figure 2. In these tables, physicians and staff controlling the measurement events can be indicated. The attributes EVE_NO and MP_ID in the table P_NUMBERS are primary key components and they have foreign key constraints to the attribute NO in the table EVENTJSSUES and to the attribute ID in the table METHODJ 3 ARAMETERS. The fixed current values of the parameters belonging to a measurement - or evaluation event are stored into the attribute VALUE. In order to allow the storage of result vectors for the same parameter (e.g. the offset values for consecutive QRS signal groups of an ECG lead) the additional attribute PNJNDEX is available as the third primary key component of the table PNUMBERS. The default value for this attribute is 1. All other tables with the prefix 'P_' are designed in the same way to realize a data-type-specific storage for the results of each event. Complete ECG or MCG recordings or calculated ECG leads are storable in the VALUE attribute of the table PJBLOBS. Further details are explained in [3]. 5. Discussion The major parts of the presented data models for the two different applications are very similar. They express the same conceptual ideas: The process of measurement or the process of evaluation is seen as the deployment of a method, registered as an event. A method is characterized by an explanatory description and a set of parameters representing the use of different quantities whose current values are sampled by the applied method. The quantities of the application domain are indicated independently as parameters so that a repeated usage with different methods is possible. The sampled current values of the quantities - determined by the deployed method - are stored in data-type-specific tables. Here the different entries are identified by referring to the registry of the events and to the registry of the method parameters where the quantities of the method are specified.
69 MEASURE_EVENTS # MEAS_NO •» * METNAME * DATETAKEN o TIMETAKEN 0 NOTE
MEAS LOBS # MEAS_NO # METPAR_ID # INDEX # CONTENT
METHODS # METNAME o DESCRIPTION
f ]
MEAS_PARAMETER_VALUES # MEAS_NO # METPARJD # INDEX o CONTENT
METHOD PARAMETERS * METPAR ID * METNAME * PARNAME « DIM o UNIT 0 NOTE
PARAMETERS # PARNAME o DESCRIPTION
Figure 3. Table scheme of a template database.
In this sense, the data models presented are not specific for the application area of bio-signals or vacuum metrology. Their kernels are reusable and suitable for different metrological domains, particularly if integration and accessibility under the condition of a large semantic diversity are important. To support the application of these generic concepts, a further reduced conceptual scheme of tables can be derived from the presented examples. This scheme is shown in figure 3. In contrast to the previous examples only those tables corresponding to the tables in the middle and lower parts of figure 1 or 2 are shown. That means that only the kernel of a real metrological database scheme is considered. We call it the template system or template component, and it is open for the integration of further tables and their relationships. A measurement or an evaluation is seen as the deployment of a corresponding method (table: METHODS). This deployment of a measurement method is registered as an event (table: MEASURE_EVENTS). A measurement method is characterized by a short explanatory description and a set of parameters (table: METHOD_PARAMETERS ) representing the use of different quantities whose values are determined by the application of the associated method. The quantities of the application domain are indicated independently as parameters (table: PARAMETERS) so that a repeated usage along with different measurement- or evaluation-methods is possible. The acquired current values of the method parameters are stored in data-type-specific
70
tables: Table MEAS_LOBS serves for binary data and table MEAS_PARAMETER_VALUES serves for numbers or strings. The different entries are identified by referring to the registry of the measurement events and by referring to the registry of the use of the parameters associated with the applied method. In the proposal for the template system, some of the data-type-specific tables of the previous examples in figure 1 and 2 - namely P_NUMBERS, P_TEXT or P_NUMBERS, P_STRINGS, P_DATETIMES - are replaced by only one table, namely the table MEAS_PARAMETER_VALUES. In this table, the current fixed values of the parameters are stored into the attribute 'content' in a variable character format. Furthermore, the tables EVE_STRUCTURES and MET_STRUCTURES from the data model of the bio-signal database [see figure 2] were not included in the template in order to simplify the presentation. Such tables can be used for documenting the dependencies of different events or different methods, for example in the sense of parent-child relationships. In reference [3], a more detailed discussion of these features in the bio-signal database is included. Due to the separation into two groups of data objects, the presented template model is very simple: The metrological domain is represented by three tables describing measurement methods, parameters of physical or technical quantities and the usage of these parameters in the context of the defined methods. The executions of measurements or evaluations are represented by a second group of tables: Each measurement (or calibration or evaluation) is seen as the deployment of a method. An event table registers that the deployment of a method has taken place and the current values of the parameters associated with that method and sampled by its deployment are forwarded to the data-typespecific tables. One of the main advantages of such a general data model is its flexibility: New kinds of data resulting either from new measurement methods or new evaluation procedures can be integrated without necessitating changes in the underlying structure of tables. They only need new entries, which means that new rows in the tables PARAMETERS, METHODS and METHODPARAMETERS have to be fed. As soon as a new data acquisition method is indicated by its rows in these tables, the method is available for all existing application programs, e.g. automatic data imports, analytical reports or interactive data viewing and management tools. This reduces the costs for development and operation and increases the overall efficiency.
71 «entity object*
MeasureEvents MeasNo : Number Datetaken : Date Timetaken : Date Metname : String Note : String
AssocEventLobs
AssociationMeth
AssocEventParValues
«entity object*
«entity objects
MeasLobs
MeasParameterValues
MeasNo : Number Metparid : Number Dlmlndex : Number Content : BlobDomain
MeasNo : Number Metparid : Number Dlmlndex : Number Content : String
AssocMetParLobs
Assoc MetParValues
«entity object*
«entity objects
Methods Metname : String Description : String
Method Parameters
AssocMetPar ; ! : i : i
Metparid : Number Parname : String Metname : String Unit : String Dim : Number Note : String
«entity objects
Parameters Parname : String Description : String
Figure 4. Object model diagram of the template database.
A similar philosophy stands behind many content management systems (CMS [8]) available on the market. They are powerful database applications for the management of any kind of data. Their wide range of features for data management, including content structuring, navigation functions and data presentation levels, is only possible by generic approaches to their underlying data models. A discussion of the meta-data models used in such systems would, however, go beyond the scope of this paper. The same goes for the group of laboratory information management systems (LIMS [9]) which are designed for supporting the operations, e.g. receiving, preparing and analysing probes mainly in chemical, pharmaceutical or medical laboratories. Here the support of the chain of processes applied to the probes is an important aspect. It is possible to map the relational data model of the template database shown in figure 3 to the object-oriented modelling standard expressed by UML (universal modelling language [10]): The tables transform to entity objects and the relationships between tables transform to associations between the objects. Figure 4 shows the object diagram in UML notation as a result of a reverse generation against the relational table schema.
72
Within the scope of the European thematic network 'Advanced Mathematical and Computational Tools for Metrology' [11], a database workspace is accessible via http://ib,ptb.de:7778/pls/portal/url/page/COP5/. Under the topic 'Online Template Database' the registered user will find a web interface to the relational template database whose object model is presented in figure 4. The target system is an Oracle 9i database server. Furthermore, a 'pocket' database named 'dblnew.mdb' and implemented in Microsoft Access, is available for downloading. This implementation of the template database is garnished by several forms for adapting the template to user needs. In this implementation, it is not the objective to give the user an intuitive interface to the template database, but to provide several means for 'playing' with the key ideas which are behind it and discussed in this paper. References 1. H. GroB, V. Hartmann (1997). A generalized data model integrating bio-signal records and clinical data for research and evaluation. Comp. in Cardiology, 24, IEEE Computer Soc. Press, 573-576. 2. H. GroB (1999). Integrierende Biosignaldatenbank (Integrating bio-signal database). In: H. GroB, D. Richter (Eds.) Einsatz von Datenbanken in der Metrologie (Application of databases in metrology), PTB-Report PTB-IT-8 (ISBN: 3-89701-489-0), 97-110. 3. V. Hartmann (1999). Datenmodellierung fur die biomedizinische Forschung und Verfahrensentwicklung (Data modelling for bio-medical research & development). In: H. GroB, D. Richter (Eds.) Einsatz von Datenbanken in der Metrologie (Application of databases in metrology), PTB-Report PTB-IT-8 (ISBN: 3-89701-489-0), 111-121. 4. H. GroB, D. Richter (1999). Einsatz von Datenbanken in der Metrologie (Application of databases in metrology), PTB-Report PTB-IT-8, ISBN: 389701-489-0. 5. J. Martin (1993). Principles of object-oriented analysis and design. PTR PrenticeHall, Inc., chap. 19, 301-320. 6. E.F. Codd (1970). A relational model of data for large shared data banks. Comm. Of the ACM, June 1970, 13:6, 377-387. 7. C.J.J. Date, C.J. Date (1999). An introduction to database systems. Addison Wesley Longman, Inc. 8. http://www.cmswatch.com/TheCMSReport/About/ 9. http://www.lims.de/ 10. http://www.omg.org/technology/documents/formal/uml.htm 11. http://www.amctm.org/
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 73-84)
EVALUATION OF REPEATED MEASUREMENTS FROM THE VIEWPOINTS OF CONVENTIONAL AND BAYESIAN STATISTICS IGNACIO LIRA Department of Mechanical and Metallurgical Engineering, Pontiflcia Universidad Catolica de Chile, Vicuna Mackenna 4860, Santiago, Chile. WOLFGANG WOGER Physikalisch-TechnischeBundesanstalt, Bundesalle 100, D-38116, Braunschweig, Germany. Repeated measurement data can be evaluated within the frameworks of either conventional or Bayesian statistical theory. In most cases both approaches yield similar uncertainty intervals. However, the conventional and Bayesian interpretation of these intervals is completely different. In the opinion of the authors, the Bayesian approach is more flexible and much sounder theoretically.
1. Introduction There is currently some debate among metrologists on the merits of conventional and Bayesian statistics to evaluate measurement data. In the Guide to the Expression of Uncertainty in Measurement (GUM) [1], the former viewpoint is recommended to analyze a series of independent repeated observations of a quantity under nominally identical conditions of measurement, while the latter applies to quantities for which "other" type of information exists, e.g., systematic effects. Since conventional and Bayesian statistics are based on different paradigms, combining quantities whose estimates and uncertainties are obtained by either method is problematic. In this paper we focus on a series of independent repeated measurements of a quantity. We start by considering the conventional approach to the evaluation of the data. In section 2 we show that the procedure depends on the circumstances under which the data were gathered; touch briefly on the problems arising from the possible existence of constraints on the value of the measurand; discuss how conventional statistics handles data produced by different observers; and present the conventional way of analyzing quantities expressed by measurement models. In section 3 we present the basic theory of
74
Bayesian statistics and analyze under this viewpoint the same problems considered in section 2. We conclude that the Bayesian analysis of measured quantities, including those for which repeated measurement data exist, avoids the problems caused by the mixture of conventional and Bayesian statistics. 2. Conventional Confidence Intervals In the conventional approach to the evaluation of repeated measurement data, the measurandXis regarded as a random variable having some underlying sampling distribution whose expectation E(X) = // and variance are regarded as fixed and unknown. The data (xi,...xn), n > 2, are taken as the values assumed by the random sample (Xu.. .X„), where the X,'s are independent random variables whose sampling distributions are identical to that of X. The sample mean and the sample variance, defined respectively as: X =- Y x
2
i
andS
=-±-Y(Xi-X)2,
(1)
are also random variables having the following properties: E( X) = //, E^ 2 ) = o2 and V(X)= cP'In. The unbiased point estimates of fd and a are, respectively, the value x assumed by the sample mean and the value s2 assumed by the sample variance. Further, the unbiased point estimate of the standard deviation of X is the value si , which is taken as the standard uncertainty of x. These results are valid whatever the distribution of X, it is only required that its second moment exists. If this distribution is Gaussian, it may be shown that the distribution of X is also Gaussian and that the distribution of the random variable T = (X - //)V" / S is Student's / with v=n—\ degrees of freedom. From these facts we may derive a symmetric confidence interval for fi with confidence level 1 - or. it is x±n~ll2stvA_a/2, where tvA_a/2 is the 1 - a/2 quantile of Student's t distribution. This means that the proportion of intervals of this type computed from a large hypothetical ensemble of similar experiments that would contain the fixed but unknown parameter // would be close to 100(1 - a) %. Note that as soon as one obtains the limits of the interval, randomness disappears. For this reason the word "confidence" is used instead of "probability".
75
2.1. Other frequency distributions The sampling distribution cannot be always modelled as Gaussian. For example, suppose that the measurements correspond to the diameter of a shaft, taken at different orientations. If resolution is high enough, random manufacturing imperfections might make the values appear to arise from a uniform (rectangular) distribution centred at p and width co. The width can be estimated by the value r of the sample range R = max(Xu...X„)-mm(Xu...X„). A confidence interval for co is derived by knowing that the sampling distribution of R for given co is [2] f{r) = n{n-\){co-r)rn~2co~",
0
(2)
Thus, following Neyman's method [3], a one sided confidence interval with confidence level 1 - a is obtained as (r, cou), where cou is the smallest root greater than r of ^L-n
+
l\ = a.
(3)
V a>u)
Suppose now that the n individual results are obtained from a measurand that follows a Poisson process, for example, the number of decay events from a radioactive source observed during a fixed period of time. In this case, the obvious choice for the (discrete) sampling distribution is P(X = /) = ?-£-,
7 = 0,1,2,....
(4)
The expectation and variance of this distribution are both equal to the fixed and unknown parameter X. Hence, either the sample mean or the sample variance yield point estimates of this parameter. If the former one is used, its variance is estimated as xln. To find a confidence interval for X we note that if the Poisson distribution is a valid assumption, the sample mean X follows also a Poisson distribution with parameter nX [4], that is, -nl N e (nX) P(X = x)=-—!^2_
TV = 0,1,2,-••
(5)
where N =rix= Ex,. Therefore, employing again Neyman's method, the lower and upper limits of a central interval with confidence level 1 - a are the values XL and Xv that satisfy
76 N-l
K)' v ^ _ =_e „ - ^- ^ y vv^_ e -„x L X '("4)' 2.2.
_« -
(6)
Incorporating constraints and previous knowledge
In the conventional approach, known constraints about X can sometimes be incorporated into the model for the sampling distribution. For example, suppose that X stands for a small mass, and that the measurement result is x = as with a<3. Then, obviously a Gaussian distribution would not be a reasonably model, since we would expect the "drawing" of negative mass indications with non negligible frequency. A better assumption (though hardly justifiable) might be an asymmetrical Gaussian distribution truncated to the left at zero. In this case, the expected value E(X), which would be again estimated as 3c, is not equal to the parameter fi in the exponential, and the construction of confidence intervals for ju and/or E(X) in terms of x, s and n becomes more involved. Suppose now that n observers measure independently the quantity X, respectively sampling from heteroscedastic Gaussian distributions with a common expectation // and variances erf. Each observer performs w, independent measurements and obtains an estimate x, together with a standard uncertainty u- = s. I•^nj . The weighted mean V1 n
r n
I>
x„, —
V >=i
!>,.*,.
J
(7)
is an unbiased estimator of//, whatever the weights w,. However, the variance of X^ is minimal, and equal to 2
"
i ^
u =
-1
(8)
when the weights are chosen as w, =(u/ui)2.
(9)
Thus, the inverse-variance weighted mean pools all available information about the measurand, in whatever order the data were obtained, but its sole justification in the conventional theory is the fact that it has minimum variance. Confidence intervals for // can be computed from Student's t distribution as explained above, with an "effective" number of degrees of freedom calculated using the Welch-Satterthwaite formula (see below).
77
2.3. Combination of measured quantities Consider the linear model X = cyY + czZ, where cy and c2 are given (known) coefficients, and assume that Y and Z are uncorrected. Assume further that both input quantities are measured repeatedly and independently. The estimate of /ux (the expectation of X) is then x = cyy + czz with an estimate of variance, Wj =(cyuy)2 + (czuz)2where uy=sylny and u\=s\lnz. If both sampling distributions are Gaussian, it may be shown that a symmetric confidence interval for fa with confidence level 1 - a is obtained as x±uztvi_a/2, where tv,_a/2 is the 1 - a/2 quantile of Student's t distribution with v "effective" degrees of freedom. These are given by the Welch-Satterthwaite formula 4 V = Ur
(cyuy? ny-\
+
(czuz)4 nz-\
(10)
This "recipe" can be applied also in case the measurement model is non-linear, for example X= YZ, if its linearization about the point estimates y and z does not deviate much from the actual model (in this example, if y or z are not close to zero). The generalization to more than two input quantities is trivial. Unfortunately, this approach does not work: if the sampling distributions for the input quantities are not Gaussian; if these quantities are correlated; or if the estimates of one or both of the input quantities are not obtained from repeated measurements. The GUM circumvents these difficulties by providing ad-hoc rules; these are pragmatically employed in routine uncertainty calculations, though their justification is doubtful. 3. Bayesian Credible Intervals Bayesian statistics [5] should pose no difficulties to metrologists, who know by heart the definition of uncertainty in the GUM: "a parameter, associated with a result of a measurement, that characterizes the dispersion of the values that could reasonably be attributed to the measurand". In other words, the uncertainty is a range of values one believes the measurand can assume. "Degree of belief is central to the Bayesian interpretation of probability. In Bayesian statistics, the measurand A" is considered simply as a quantity to which a probability density function (pdf) f{x) can be associated on the basis of the given information relevant to X. This function encodes the state of knowledge about the quantity, such that f(x)dxis the information-based probability that the value of Xis in the interval (x,x + dx).
78
This way of interpreting the measurand is radically different from the one adopted by the conventional theory. In Bayesian statistics there is no need to assume the existence of an immutable "true value", nor to recur to randomness, to underlying frequency distributions, or to sampling from an infinite and imagined population of values. Instead, it is the data that are regarded as fixed and given; they are not (even mentally) allowed to vary. These data, together with any other information we may have, allows us to infer the pdf that characterizes the uncertain measurand. Its expectation and standard deviation can be considered as the "result of the measurement" and the "the dispersion of the values that could reasonably be attributed to the measurand", respectively. But it is also possible to state our degree of belief (or the probability) that the value of the measurand is within any two given limits. Conversely, it is a rather simple task to find intervals, central or otherwise, that comply with some desired coverage probability. In other words, there is no need to apply elaborate procedures to construct "confidence" intervals, nor to provide a rather imaginative interpretation on their meaning. 3.1. Bayes' theorem Bayes' theorem can be easily derived from the following axiom of probability theory: given two propositions A and B, the probability of both of them being true is equal to the product of the probability of A being true times the probability of B being true given that ,4 is true, and vice versa. In symbols: P(AB) = P(B\A)P(A) = P(A\B)P(B),
(11)
where symbol | stands for "conditional upon". Bayes' theorem then follows: nA\B) = P(B^)(A)
.
(12)
Let us now consider the following two propositions. A: the value of X lies within a given infinitesimal interval (x, x+dx); B: data d about quantity X are observed in the course of a measurement experiment. Therefore, P(A) = f(x)dx and P(A\B) = f(x\d)6x, where f(x) and /(x|d) are the pdfs that describe the states of knowledge about X without data and with data, respectively. The factor P(B\A) is the conditional probability of observing d if the value of X was known to be equal to x. This is called the likelihood, and is a mathematical function of x; for this reason it is written as L(x;d). Bayes' theorem then becomes
79 f(x\d) = KL(x;d)f(x),
(13)
where K is a normalization constant. Note that the likelihood can be multiplied by an arbitrary constant or by a function of d alone, because the multiplier would be later absorbed into the constant K. If no data are available, the likelihood is a constant and the posterior becomes equal to the prior. To construct the likelihood, a probability model is needed. The probability model takes the place of the sampling distribution in conventional statistics, and it can (but need not) use the conventional definition of probability to state quantitatively our "degree of belief. However, in the Bayesian context it is not required, in fact, it does not even make sense, to think of an infinite population from which the measured values are "drawn". For example, if the probability model is a Gaussian with known variance v, and a single datum d is obtained, the likelihood is L(x;d) = . V2TIV
exp
1 {d-xf 2
v
(14)
Note that we have avoided the use of the symbols fi and o2, to emphasize that x is a variable (below we will also consider v as a variable). This likelihood integrates to unity in x (and d) but this is only accidental: the likelihood is not a pdf. It rather reflects how beliefs about having obtained the given datum vary with the possible values x of the measurand. The pdfs f(x) and /(xld) are respectively called the "prior" and "posterior" densities. The matter of assigning the prior causes much controversy. In general this should be written as / ( * | / ) , where / stands for the available information one has about X before data are obtained. But suppose that absolutely no information is available. This is equivalent to saying that all values in the real line are equally probable. So we write f(x) = 1 to represent a uniform density from -a> to <x>. This prior is non informative, because it has no influence on the posterior, and improper, because it cannot be normalized. The first characteristic is desirable: if there is zero information, we should "let the data tell the whole story". The second characteristic might appear to be troublesome, but in fact it is of no consequence as long as the posterior is normalizable. 3.2. Repeated measurements with a Gaussian probability model Let us then consider independent data (x\,...x„) obtained in the course of the measurement of X, with n > 2. Assume that the probability model is Gaussian
80
and that absolutely nothing is known previously about the quantity and about the measurement variance. The latter then takes the place of another quantity, say V. The likelihood for the first data point is then 1 (JCJ - x)2
Ifov^oc-^exp
~2
v
(15)
A non-informative joint prior f(x, v) is now needed. If it is reasonable to assume that X and V are independent, the joint prior factors into the respective marginal pdfs. The one for X can be taken as equal to 1, but choosing f(v) = 1 from −∞ to ∞ would be unreasonable, because we know that the variance cannot be negative. It turns out that invariance arguments suggest that for a scale parameter the appropriate non-informative (and improper) prior is f(v) ∝ 1/v (Jeffreys' prior). With this choice of priors, the first joint posterior is

f(x, v | x_1) ∝ v^{−3/2} exp[ −(x_1 − x)² / (2v) ].   (16)
This posterior becomes the prior for the rest of the data. By repeating this process, we arrive at
f(x, v | x_1, ..., x_n) ∝ v^{−(n+2)/2} exp[ −Σ_i (x_i − x)² / (2v) ].   (17)
The variable v can now be "integrated out". After some manipulations and change of variables, we arrive at the marginal posterior
f(t | x_1, ..., x_n) ∝ [ 1 + t²/(n−1) ]^{−n/2},   (18)

which is Student's t with ν = n − 1 degrees of freedom in the variable

t = √n (x − x̄)/s.   (19)
In this derivation, x̄ and s are mathematically equal to the sample mean and sample standard deviation of conventional statistics, but they are not interpreted as the values assumed by random variables X̄ and S, which in this formulation are meaningless.
Since the expectation and variance of Student's t pdf are zero and (n − 1)/(n − 3), respectively, the information contained in the final posterior can be summarized as the result x = x̄ with standard uncertainty

u = √[(n − 1)/(n − 3)] (s/√n).   (20)
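As a numerical illustration of equations (18)-(20), and anticipating the credible interval discussed next, the following sketch (with made-up readings; the use of NumPy/SciPy is an assumption of this illustration, not part of the paper) computes the Bayesian summary of repeated Gaussian data.

```python
# Illustrative sketch (hypothetical data, not from the paper).
import numpy as np
from scipy import stats

data = np.array([10.02, 10.05, 9.98, 10.01, 10.04])   # hypothetical readings
n = len(data)
xbar = data.mean()
s = data.std(ddof=1)                                   # sample standard deviation

# Standard uncertainty from equation (20): u = sqrt((n-1)/(n-3)) * s / sqrt(n)
u = np.sqrt((n - 1) / (n - 3)) * s / np.sqrt(n)

# Central 95 % credible interval from the Student's t posterior (nu = n - 1)
p = 0.95
t = stats.t.ppf((1 + p) / 2, df=n - 1)
interval = (xbar - t * s / np.sqrt(n), xbar + t * s / np.sqrt(n))
print(xbar, u, interval)
```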
Finally, coverage intervals for the value of the measurand follow immediately from the meaning ascribed to the posterior. Thus, the central interval which we believe (based on the information provided solely by the data) contains the value of X with coverage probability p is x̄ ± n^{−1/2} s t_{ν,(1+p)/2}. This interval coincides exactly with the confidence interval derived in conventional statistics, but in the Bayesian case the interpretation is clearer. Bayesian intervals determined in this way might be called "credible" intervals to distinguish them from "confidence" intervals for the values of unknown fixed parameters of frequency distributions.

3.3. Other probability models

Reconsider the problems in sections 2.1 and 2.2. Suppose first that the probability model corresponds to a uniform distribution centred at X and of width W, about both of which we know nothing. The likelihood for each data point is then

L(x, w; x_i) ∝ 1/w   if x − w/2 ≤ x_i ≤ x + w/2,   and L(x, w; x_i) = 0 otherwise.   (21)

Multiplying all likelihoods yields a global likelihood proportional to 1/wⁿ for x − w/2 ≤ x_1, ..., x_n ≤ x + w/2, being zero otherwise. This condition is equivalent to x − w/2 ≤ x_m and x_M ≤ x + w/2, where x_m = min(x_1, ..., x_n) and x_M = max(x_1, ..., x_n). With non-informative priors f(x) = 1 and f(w) = 1/w (where the latter is due to W being a scale parameter), the joint posterior becomes

f(x, w | x_1, ..., x_n) ∝ 1/w^{n+1};   x_M − w/2 ≤ x ≤ x_m + w/2,   w > 0.   (22)
In order to quantify our state of knowledge about the width of the uniform pdf, we "integrate out" x from the joint posterior. Thus, we find

f(w | x_1, ..., x_n) ∝ (w − r)/w^{n+1};   w > r,   (23)
where r = x_M − x_m is the observed (known) range. The normalization constant can be found from the condition that the marginal posterior integrates to one; the result is K = n(n − 1) r^{n−1}. The expectation and variance become

E(W | x_1, ..., x_n) = n r/(n − 2);   V(W | x_1, ..., x_n) = 2n r² / [(n − 2)²(n − 3)].   (24)
Thus, the Bayesian analysis yields a measurement result E(W | x_1, ..., x_n) for the width of the distribution that is larger than the estimate r of the "true" expectation of the width obtained with conventional statistics. Note also that in the Bayesian analysis the square root of V(W | x_1, ..., x_n) can be taken as the standard uncertainty associated with the result n r/(n − 2). Finally, it is easy to check that the one-sided credible interval for the width, with coverage probability p, is (r, w_p), where w_p is the smallest root greater than r of

n (r/w_p)^{n−1} − (n − 1) (r/w_p)^n = 1 − p.   (25)
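A hedged numerical sketch of equations (24)-(25) follows; the data are invented, and the use of a SciPy root finder is an implementation choice of this illustration, not of the paper.

```python
# Sketch of equations (24)-(25) for hypothetical data.
import numpy as np
from scipy.optimize import brentq

x = np.array([2.3, 2.9, 2.6, 3.1, 2.4, 2.8])     # made-up observations
n, r = len(x), x.max() - x.min()                  # observed range r

E_W = n * r / (n - 2)                             # expectation of the width, eq. (24)
V_W = 2 * n * r**2 / ((n - 2)**2 * (n - 3))       # variance of the width, eq. (24)

# One-sided credible interval (r, w_p): smallest root greater than r of eq. (25)
p = 0.95
f = lambda w: n * (r / w)**(n - 1) - (n - 1) * (r / w)**n - (1 - p)
w_p = brentq(f, r * 1.000001, 100 * r)            # f changes sign on this bracket
print(E_W, np.sqrt(V_W), (r, w_p))
```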
This interval is identical to the conventional confidence interval (see equation 3), but the Bayesian derivation is completely different. In this context, p is a probability, not a confidence level. Suppose now that (x_1, ..., x_n) are the numbers of decay events from a radioactive source observed during a fixed period of time. The quantity of interest is the emission rate X. The individual likelihoods are then of the form L(x; x_i) ∝ exp(−x) x^{x_i}. Since we know that X is positive, and that it represents a scale parameter, we use Jeffreys' prior f(x) = 1/x. The posterior is then

f(x | x_1, ..., x_n) ∝ x^{N−1} exp(−nx);   x > 0,   (26)
where N = Σ x_i. We recognize this as the gamma distribution, so the normalization constant is K = n^N/(N − 1)!. The state of knowledge about X is summarized by the expectation and variance

E(X | x_1, ..., x_n) = x̄ = N/n;   V(X | x_1, ..., x_n) = x̄/n,   (27)
where, again, the average of the counts x̄ is not the value assumed by a random variable X̄. The central credible interval for X with coverage probability p can now be found. Its limits, x_L and x_U, are the solutions of

∫₀^{x_L} x^{N−1} exp(−nx) dx = ∫_{x_U}^∞ x^{N−1} exp(−nx) dx = (1 − p)(N − 1)! / (2 n^N).   (28)
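A short sketch of equations (26)-(28) with invented count data; expressing the posterior through SciPy's gamma distribution (shape N, rate n) is my implementation choice, not the paper's.

```python
# Sketch of the Poisson/gamma example for hypothetical counts.
import numpy as np
from scipy import stats

counts = np.array([7, 4, 6, 5, 8])               # hypothetical decay counts
n, N = len(counts), counts.sum()

posterior = stats.gamma(a=N, scale=1.0 / n)      # pdf proportional to x**(N-1) * exp(-n*x)
mean, var = posterior.mean(), posterior.var()    # equal to N/n and N/n**2, i.e. xbar and xbar/n

p = 0.95                                         # central credible interval, eq. (28)
x_L, x_U = posterior.ppf((1 - p) / 2), posterior.ppf((1 + p) / 2)
print(mean, var, (x_L, x_U))
```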
It can be easily shown that these equations are identical to equations 6, but again with a different interpretation. Consider, finally, n independent pairs (x_i, u_i), where x_i is the measurement result for the quantity X as reported by the i-th observer, and u_i is the associated standard uncertainty. Suppose that this is the only information available to us. It is then reasonable to use a Gaussian probability model, with individual likelihoods of the form

L(x; x_i, u_i) ∝ exp[ −(x_i − x)² / (2 u_i²) ].   (29)
With a non-informative constant prior, the posterior becomes

f(x | x_1, u_1, ..., x_n, u_n) ∝ exp[ −(1/2) Σ_i (x_i − x)² / u_i² ].   (30)
Note that the variances u_i² were not placed in front of the exponentials of the likelihoods. The reason is that these variances are known constants, so they are absorbed into the normalization constant of the posterior. We then get

E(X | x_1, u_1, ..., x_n, u_n) = u² Σ_i x_i/u_i²,   (31)

where

u² = V(X | x_1, u_1, ..., x_n, u_n) = ( Σ_i 1/u_i² )^{−1}.   (32)
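A minimal sketch of equations (31)-(32); the numbers are invented for illustration only.

```python
# Weighted-mean posterior summary, eqs. (31)-(32).
import numpy as np

x = np.array([10.1, 10.3, 9.9])      # reported results
u = np.array([0.2, 0.3, 0.25])       # associated standard uncertainties

u2 = 1.0 / np.sum(1.0 / u**2)        # posterior variance, eq. (32)
mean = u2 * np.sum(x / u**2)         # posterior expectation, eq. (31)
print(mean, np.sqrt(u2))
```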
These results agree with the weighted mean as the minimum variance estimator of the fixed and unknown parameter μ of the sampling distribution. But in the Bayesian framework their justification is much clearer.

3.4. Combination of measured quantities

Let X be modelled as X = g(Y, Z). Assume that data d about both input quantities are obtained and that their prior state of knowledge is described by the joint pdf h(y, z). If nothing is known about X before evaluating the model, the joint posterior pdf is then
f(x, y, z) = K L(y, z; d) h(y, z) δ[x − g(y, z)],   (33)
where δ is Dirac's delta function. In this way we introduce explicitly the correlation between the output and input quantities. The marginal pdf for X can then be calculated by integrating over y and z, in any order. This allows credible intervals to be obtained with no recourse to ad hoc procedures, such as the Welch-Satterthwaite formula in the GUM, and without the need to linearize the model. Of course, obtaining the posterior f(x) analytically can be quite difficult unless the model is very simple. The alternative is to use Monte Carlo simulation, which is just a numerical tool to handle the computations.

4. Conclusions

Conventional statistics works rather well in the realms of commerce, demography, public health, and other situations where actual populations exist and large amounts of data are available. By contrast, in metrology this is seldom the case: many times only one or two data points will be available. The GUM says that in these cases one should resort to "type B" methods of evaluation, meaning those that are not based on conventional statistics. It so happens that many times these methods will be Bayesian in nature. Although we have attempted in this article to present a balanced view of both frameworks for evaluating repeated measurement data, it was impossible for the authors to avoid their bias towards the Bayesian interpretation: not only is it much more natural, but it also eliminates the need to consider two "types" of uncertainty evaluation.

Acknowledgments

I. Lira would like to acknowledge the support of Fondecyt grant 1030399 and the organizers of this Conference for making his attendance possible.

References

1. ISO 1993 and 1995 Guide to the Expression of Uncertainty in Measurement.
2. Meyer P L 1970 Introductory Probability and Statistical Applications, 2nd Ed. (Reading, Mass.: Addison-Wesley) p. 275.
3. Neyman J 1937 Philos. Tr. R. Soc. S-A 236 333-380.
4. Mood A M and Graybill F A 1963 Introduction to the Theory of Statistics, 2nd Ed. (New York: McGraw-Hill) p. 235.
5. Weise K and Wöger W 1993 Meas. Sci. Technol. 4 1-11.
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 85-97)
DETECTION OF OUTLIERS IN INTERLABORATORY TESTING

CHRISTOPHE PERRUCHET
UTAC, Autodrome, BP 20212, F-91311 Montlhery cedex, France
An approach for outlier identification, taking into account the multidimensional nature of the data, is proposed for interlaboratory tests, in which several laboratories test one or several products for different characteristics.
1. Introduction

The question of the detection and processing of data known as outliers has been posed by experimenters for many years. In 1755, Boscovich, who tried to evaluate the ellipticity of the earth from ten measured values of the angle between the equator and one of the poles, rejected two extreme values as aberrant. On the other hand, Bernoulli, who also treated astronomical observations, concluded in 1777 that the only valid reason for rejecting a data point was the existence of an incident in the process that produced the information. Since that date, many methods for the detection of outliers have been proposed. This abundance of literature and the importance of the debate are certainly due to the fact that the concept of an outlier is not precisely defined, and that the authors working on the subject do not agree on any rigorous definition. Following Bernoulli, outlying data can be defined as data produced by a phenomenon foreign to the one being studied. However, this definition does not provide the means of solving the problem of their identification. It is clear that it is advisable to detect the data meeting this definition, so as not to distort the conclusions of the experiment, and thus that it is advisable to implement an adapted methodology. However, it should also be stressed that an inadequate or abusive use of outlier detection methods also results in distorting the conclusions. The object of this paper is to propose a solution for the multidimensional case using concepts that are simple to implement.
In interlaboratory testing, the ideal situation corresponds to a low intralaboratory variance and a low interlaboratory variance, which implies good repeatability and good reproducibility. Suspect situations arise, for example, when:
- a laboratory is far away from the others, even if its variance is low;
- a laboratory has a high variance, even if it is close to the other laboratories.
These typical structures may apply to more than one laboratory, the extreme situation being that all the laboratories have high, unequal variances and are all distant from one another. We thus propose to define simple numerical indexes that allow these various situations to be identified. Sections 4 and 5 are respectively devoted to the intralaboratory analysis and to the interlaboratory analysis. They provide numerical indexes as well as graphical tools making it possible to quickly judge the dispersion and the offset of a laboratory, respectively. Section 6 is a numerical application based on the data used in international standard ISO 5725.

2. General principle

The data treated in this article are data collected from interlaboratory tests or comparisons, where K laboratories test one or more products by measuring their performance for various characteristics, generally by repeating the tests. The measured characteristics can be of different natures (mass, time, amount of substance, ...) or of comparable nature (levels of concentration of a chemical element, ...); in the latter case the term "test by levels" is generally employed. The multidimensional nature of the data is thus a consequence of the measurement of various variables, or of the same variables at various levels. The objectives of the method presented are:
- to treat the data in their multidimensional nature, in order not to overlook phenomena that are aberrant in several dimensions but not in any one-dimensional view;
- to allow the suspect zones of the data table to be identified, without having to analyse the table column by column and block of lines by block of lines as with the one-dimensional methods.
The advantages of this method are:
- the simplicity of the concepts and of the calculations implemented;
- the use of concepts that have proven reliable and are widely diffused thanks to the methods of exploratory data analysis.
3. Data

The data table is denoted by

X = {x_i^j; 1 ≤ i ≤ I, 1 ≤ j ≤ J}.

The index j represents the measured characteristics or the levels of the test, i.e. the multidimensional nature of the problem. The statistical units are the I vectors x_i with J components. The descriptors (variables, levels) are the J vectors x^j with I components. On the set of the I lines of the data table, a partition is induced by the K laboratories L_1, ..., L_K participating in the interlaboratory test. The test repetitions made in each of the K laboratories are represented by l_1 to l_K lines of the data table, respectively:

I = Σ {l_k; 1 ≤ k ≤ K}.

It is supposed that a Euclidean distance d is given on the space of the statistical units and that their masses are equal to 1. However, the developments that follow generalize without difficulty to a system of arbitrary masses. The centre of gravity (or average) of laboratory k is the vector denoted by

g_k = {g_k^j; 1 ≤ j ≤ J},   where   g_k^j = Σ {x_i^j; i ∈ L_k} / l_k.   (1)

The centre of gravity (or average) of the set of the laboratories is the vector denoted by g:

g = {g^j; 1 ≤ j ≤ J},   where   g^j = Σ {x_i^j; 1 ≤ i ≤ I} / I   (2)
                                    = Σ {l_k g_k^j; 1 ≤ k ≤ K} / I.   (3)
In a Euclidean space with masses equal to unity, one recalls that the total inertia of the cloud with respect to its centre of gravity g is defined as

M_T² = Σ {d²(x_i, g); 1 ≤ i ≤ I},   (4)

and that it decomposes as M_T² = M_W² + M_B². M_W² is the intralaboratory inertia, defined as

M_W² = Σ {M_k²; 1 ≤ k ≤ K},   (5)

where M_k² = Σ {d²(x_i, g_k); i ∈ L_k}. M_B² is the interlaboratory inertia, defined as

M_B² = Σ {l_k d²(g_k, g); 1 ≤ k ≤ K}.   (6)
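The decomposition of equations (4)-(6) is straightforward to compute; the following sketch (an assumed NumPy implementation, not code from the paper) illustrates it for the usual Euclidean distance and unit masses.

```python
# Illustrative sketch of the inertia decomposition, eqs. (4)-(6).
import numpy as np

def inertia_decomposition(X, labs):
    """X: (I, J) array of results; labs: length-I sequence of laboratory labels."""
    X, labs = np.asarray(X, dtype=float), np.asarray(labs)
    g = X.mean(axis=0)                                  # overall centre of gravity
    M_T = np.sum((X - g) ** 2)                          # total inertia, eq. (4)
    M_W = M_B = 0.0
    for k in np.unique(labs):
        Xk = X[labs == k]
        gk = Xk.mean(axis=0)                            # centre of gravity of laboratory k
        M_W += np.sum((Xk - gk) ** 2)                   # intralaboratory inertia, eq. (5)
        M_B += len(Xk) * np.sum((gk - g) ** 2)          # interlaboratory inertia, eq. (6)
    return M_T, M_W, M_B                                # M_T equals M_W + M_B
```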
The choice of the distance d is of primary importance. Usual choices are:
- the usual Euclidean distance, when the columns of the data table are quantitative variables with approximately the same variances;
- a Mahalanobis distance, when the variables have very different variances;
- a Chi-square distance, for a table of binary data or a table of relative frequencies.
In all that follows the usual Euclidean distance is used, without loss of generality.

4. Intralaboratory analysis

The intralaboratory analysis is based on the decomposition of the intralaboratory inertia M_W² according to the laboratories and the variables:

M_W² = Σ {M_k²; 1 ≤ k ≤ K} = Σ {M_W²(j); 1 ≤ j ≤ J},   where   M_W²(j) = Σ {M_k²(j); 1 ≤ k ≤ K}.
For each of the K laboratories, the inertia relative to its centre of gravity, M_k², is calculated:

M_k² = Σ {d²(x_i, g_k); i ∈ L_k} = Σ {(x_i^j − g_k^j)²; i ∈ L_k, 1 ≤ j ≤ J}   (7)
     = Σ {M_k²(j); 1 ≤ j ≤ J},

where M_k²(j) = Σ {(x_i^j − g_k^j)²; i ∈ L_k} is the absolute contribution of variable j to the inertia of laboratory k. The relative contribution of laboratory k to the intralaboratory inertia is

CTW_k = M_k² / M_W²,   (8)

with the index satisfying Σ {CTW_k; 1 ≤ k ≤ K} = 1.
The dispersed laboratories are those whose intralaboratory inertias M_k², or contributions CTW_k, are highest. One will thus build a bar chart of the CTW_k values, presenting the laboratories in order of decreasing contribution, to assess the relative dispersion of the laboratories at a glance. If the dispersions of the laboratories are of the same order of magnitude, their common intralaboratory inertia is equal to the average inertia M_W²/K, and their common relative contribution is then 1/K. The ordinate of the chart will thus show the values 1/K and 2/K, beyond which the strongly dispersed laboratories will appear. For the laboratories of high inertia, the inertia will be broken down by calculating the relative contributions

CTW_k(j) = M_k²(j) / M_k²,   (9)

with the index satisfying, for all k, Σ {CTW_k(j); 1 ≤ j ≤ J} = 1.
For each of these laboratories, a bar chart of the CTW_k(j) values will make it possible to quickly identify the variables generating a strong inertia of the laboratory. It should be noted that if, for a given laboratory, the variables have the same influence, the M_k²(j) have a common value equal to the average value M_k²/J. Their common relative contribution (i.e. the average relative contribution) is then 1/J. The ordinate of the chart associated with each laboratory will thus show the values 1/J and 2/J, beyond which the highly influential variables will appear.

5. Interlaboratory analysis

The interlaboratory analysis is based on the decomposition of the interlaboratory inertia M_B² according to the laboratories and the variables.
M_B² = Σ {l_k d²(g_k, g); 1 ≤ k ≤ K} = Σ {M_B²(j); 1 ≤ j ≤ J},   (10)

where l_k d²(g_k, g) is the absolute contribution of laboratory k to the interlaboratory inertia, and where M_B²(j) = Σ {l_k (g_k^j − g^j)²; 1 ≤ k ≤ K} is the absolute contribution of variable j to the interlaboratory inertia. The contribution of laboratory k breaks up as the sum over j of the values M_Bk²(j) = l_k (g_k^j − g^j)². The relative contribution of laboratory k to the interlaboratory inertia is

CTB_k = l_k d²(g_k, g) / M_B²,   (11)

and this index satisfies Σ {CTB_k; 1 ≤ k ≤ K} = 1. Similarly, the relative contribution of variable j to the outlying of laboratory k is

CTB_k(j) = M_Bk²(j) / [l_k d²(g_k, g)],   (12)

with the index satisfying, for all k, Σ {CTB_k(j); 1 ≤ j ≤ J} = 1.
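The relative contributions of equations (8), (9), (11) and (12) can be obtained in the same spirit; the sketch below (function and variable names are mine, and it is only an assumed implementation) returns them per laboratory and per variable.

```python
# Hedged sketch of the relative contributions CTW_k, CTW_k(j), CTB_k, CTB_k(j).
import numpy as np

def contributions(X, labs):
    X, labs = np.asarray(X, dtype=float), np.asarray(labs)
    g = X.mean(axis=0)
    Mk2_j, Bk_j = {}, {}
    for k in np.unique(labs):
        Xk = X[labs == k]
        gk = Xk.mean(axis=0)
        Mk2_j[k] = np.sum((Xk - gk) ** 2, axis=0)   # absolute intra contributions M_k^2(j)
        Bk_j[k] = len(Xk) * (gk - g) ** 2           # absolute inter contributions M_Bk^2(j)
    M_W = sum(v.sum() for v in Mk2_j.values())
    M_B = sum(v.sum() for v in Bk_j.values())
    CTW = {k: v.sum() / M_W for k, v in Mk2_j.items()}    # eq. (8)
    CTW_j = {k: v / v.sum() for k, v in Mk2_j.items()}    # eq. (9)
    CTB = {k: v.sum() / M_B for k, v in Bk_j.items()}     # eq. (11)
    CTB_j = {k: v / v.sum() for k, v in Bk_j.items()}     # eq. (12)
    return CTW, CTW_j, CTB, CTB_j
```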
For each laboratory, a bar chart of the CTB_k(j) values will make it possible to check quickly whether the outlying of a laboratory is due to the influence of one particular variable or of several variables. By this means one can identify aberrant laboratories which cannot be identified by one-dimensional methods.

6. Application

In this section, we illustrate the method described above using an example from international standard ISO 5725-2. Annex B.3 of that standard concerns the thermometric titration of creosote oil; the test result is the mass percentage of tar in creosote oil. We made this choice so that the reader can compare the multidimensional method with the one-dimensional tests of Cochran (intralaboratory analysis) and Grubbs (interlaboratory analysis) of the ISO standard.

6.1. Data

The data are presented on lines L1 to L9 of table 1, in the form of a table with 5 columns (5 levels of tar content) and 18 lines representing the repeated analyses (2 replications in each of the 9 laboratories).
Table 1. Example (J = 5, I = 18, K = 9, l_k = 2, 1 ≤ k ≤ 9).

Lab    Level 1   Level 2   Level 3   Level 4   Level 5
L1     4.44      9.34      17.4      19.23     24.28
L1     4.39      9.34      16.9      19.23     24
L2     4.03      8.42      14.42     16.06     20.4
L2     4.23      8.33      14.5      16.22     19.91
L3     3.7       7.6       13.6      14.5      19.3
L3     3.7       7.4       13.6      15.1      19.7
L4     4.1       8.93      14.6      15.6      20.3
L4     4.1       8.8       14.2      15.5      20.3
L5     3.97      7.89      13.73     15.54     20.53
L5     4.04      8.12      13.92     15.78     20.88
L6     3.75      8.76      13.9      16.42     18.56
L6     4.03      9.24      14.06     16.58     16.58
L7     3.7       8         14.1      14.9      19.7
L7     3.8       8.3       14.2      16        20.5
L8     3.91      8.04      14.84     15.41     21.1
L8     3.9       8.07      14.84     15.22     20.78
L9     4.02      8.44      14.24     15.14     20.71
L9     4.07      8.17      14.1      15.44     21.66
6.2. Preliminary analysis

Its purpose is to establish the decomposition of the total inertia into intralaboratory inertia and interlaboratory inertia.

6.2.1. Calculation of the centre of gravity of each laboratory

This is a multidimensional centre of gravity, i.e. a vector with J components. One applies formula (1) to obtain the lines g1 to g9 of table 2.

6.2.2. Calculation of the centre of gravity of the data set

This is also a vector with J components, whose components can be calculated in two ways using formulas (2) or (3). g is given in table 2; its components are the averages by level.
Table 2. Preliminary analysis.

       Level 1   Level 2   Level 3   Level 4   Level 5
g1     4.415     9.34      17.15     19.23     24.14
g2     4.13      8.375     14.46     16.14     20.155
g3     3.7       7.5       13.6      14.8      19.5
g4     4.1       8.865     14.4      15.55     20.3
g5     4.005     8.005     13.825    15.66     20.705
g6     3.89      9         13.98     16.5      17.57
g7     3.75      8.15      14.15     15.45     20.1
g8     3.905     8.055     14.84     15.315    20.94
g9     4.045     8.305     14.17     15.29     21.185
g      3.9933    8.3994    14.5083   15.9928   20.5106
6.2.3. Calculation of the total inertia

One applies formula (4). M_T² is the sum of the squared distances from the vectors x_i to the vector g; the vectors x_i and g have J components. M_T²(j) = Σ {(x_i^j − g^j)²; 1 ≤ i ≤ I} is the inertia due to level j, and M_T² = Σ {M_T²(j); 1 ≤ j ≤ J}:

M_T² = 0.818 + 5.490 + 18.087 + 28.379 + 50.798 = 103.572

Table 3. Total inertia.

M_T²(j)   0.818     5.4901    18.0868   28.3794   50.7979     M_T² = 103.5722

M_k²(j)   Level 1   Level 2   Level 3   Level 4   Level 5     M_k²
k = 1     0.00125   0         0.125     0         0.0392      0.16545
k = 2     0.02      0.00405   0.0032    0.0128    0.12005     0.1601
k = 3     0         0.02      0         0.18      0.08        0.28
k = 4     0         0.00845   0.08      0.005     0           0.09345
k = 5     0.00245   0.02645   0.01805   0.0288    0.06125     0.137
k = 6     0.0392    0.1152    0.0128    0.0128    1.9602      2.1402
k = 7     0.005     0.045     0.005     0.605     0.32        0.98
k = 8     0.00005   0.00045   0         0.01805   0.0512      0.06975
k = 9     0.00125   0.03645   0.0098    0.045     0.45125     0.54375

M_W²(j)   0.0692    0.2561    0.2539    0.9074    3.0831      M_W² = 4.5697
M_B²(j)   0.7488    5.234     17.833    27.4719   47.7147     M_B² = 99.0025
6.2.4. Calculation of the intralaboratory inertia

One applies formula (7). In table 3, the sub-table of M_k²(j) gives, for each level j (5 columns) and each laboratory k (9 lines), the squared distance from the results of a laboratory to its average. The intralaboratory inertia can be obtained in two ways: it is the sum over the J levels and K laboratories of M_k²(j); one can also first sum over j, to obtain the K values of M_k², and then apply formula (5). The result is M_W² = 4.5697.

6.2.5. Calculation of the interlaboratory inertia

One applies formulas (6) and (10), where the set of {g_k} and g are vectors with J components. One calculates M_B²(j) and then M_B², which gives M_B² = 99.0025.

6.2.6. Total inertia decomposition

M_T² = M_W² + M_B²,   i.e.   103.5722 = 4.5697 + 99.0025.
6.3. Intralaboratory analysis

One calculates the inertia of each laboratory with respect to its centre of gravity by applying formula (7). These calculations yield M_W² and the vector of the M_k² with K components, which is presented in table 3. The relative contribution of laboratory k to the intralaboratory inertia, CTW_k, is given by formula (8). If all the laboratories had the same relative contribution, it would be equal to 1/K, which is 0.11 in our example. The relative contributions CTW_k are presented in figure 1.
Figure 1. Relative contributions CTW_k (bar chart over the laboratories).
Obviously laboratory 6 and, to a lesser extent, laboratory 7 have a very strong contribution to the intralaboratory inertia: they have a very poor repeatability. One can then ask which levels are the causes of this phenomenon. Figure 2 represents the contributions CTW_k(j) (cf. formula (9)) to the intralaboratory inertia for laboratories 6 and 7, for the 5 levels. It is clearly seen that:
- Laboratory 6 "failed" the measurement at level j = 5. This could be regarded as an isolated anomaly: its results stand out for this single level.
- Laboratory 7 has a strong contribution for levels 4 and 5. Consequently, this laboratory should re-examine its procedure for applying the test method.
Figure 2. The contributions CTW_k(j) for laboratories 6 and 7 (bar chart over the 5 levels, with the reference values 1/J and 2/J marked on the ordinate).
6.4. Interlaboratory analysis

The decomposition of the interlaboratory inertia M_B² according to laboratories and levels is done by applying formula (10). One then finds the contributions CTB_k to the interlaboratory inertia for each laboratory by applying formula (11). Figure 3 represents the contribution CTB_k of each laboratory to the interlaboratory inertia.
Figure 3. The contributions CTB_k (bar chart over the laboratories, with the reference values 1/K and 2/K marked on the ordinate).
If the laboratories had equal contributions, these would be equal to 1/K = 0.11. It is thus seen that laboratory 1 has a very large contribution to the interlaboratory inertia, as well as, to a lesser extent, laboratory 6. These laboratories are largely responsible for the poor reproducibility. Figure 4 represents the contributions CTB_k(j) (cf. formula (12)) to the interlaboratory inertia for laboratories 1 and 6, for the 5 levels. Figure 4 shows that:
- Laboratory 1 has a very large contribution to the inertia for levels 3, 4 and 5. It must thus be isolated, because it provides results that differ too much from the other laboratories: its results are too high.
- Laboratory 6 provided results that are too different for level 5. It was already seen in section 6.3 that for this level its results were too dispersed; the low value of 16.58 may be responsible.
Figure 4. The contributions CTB_k(j) for laboratories 1 and 6 (bar chart over the 5 levels, with the reference values 1/J and 2/J marked on the ordinate).
7. Conclusion

This article proposes a simple method for the detection of aberrant or outlying data in multidimensional situations. This method, which sheds a multidimensional light on the data table, complements the standardized methods, which apply level by level or variable by variable.
This method therefore usefully supports the analyst, in particular when he or she is engaged in the evaluation of test or measurement methods, proficiency testing of laboratories or the evaluation of reference materials. At present, these tools are implemented at UTAC for the analysis of interlaboratory tests and the analysis of car technical inspection data. They make it possible to synthesize complex information.

References

1. V. Barnett, T. Lewis: Outliers in Statistical Data, 3rd edition, Wiley (1994).
2. R. J. Beckman, R. D. Cook: Outlier...s, Technometrics, 25, 2 (1983).
3. ISO: Accuracy (trueness and precision) of measurement methods and results, ISO 5725-2 (1994).
4. Perruchet C., Sado G.: Detection de donnees aberrantes dans le cas d'essais interlaboratoires et fidelite multidimensionnelle, Revue de Statistique Appliquee, Vol. XLII, n° 1 (1994).
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 98-107)
ON APPROPRIATE METHODS FOR THE VALIDATION OF METROLOGICAL SOFTWARE

DIETER RICHTER, NORBERT GREIF, HEIKE SCHREPF
Physikalisch-Technische Bundesanstalt Braunschweig and Berlin, Abbestr. 2-12, 10587 Berlin, Germany

An adequate concept of software validation must consider all steps, beginning with the definition of requirements and ending with the provision of evidence that the requirements are fulfilled. This includes the refinement of requirements to testable attributes, the selection of appropriate methods, the proper execution of methods and the documentation of results. As regards the validation methods, functional tests are preferred in metrology, although this is not always justified. Code inspections may be very beneficial and more straightforward than often expected. Software validation in metrology - in particular the definition of testable requirements - is a highly interdisciplinary process. The approach is demonstrated by an example of software that claims to conform to the Guide to the Expression of Uncertainty in Measurement (GUM).
1. Introduction

The demand for software validation is growing steadily, not only in general, but in particular in the field of metrology. This development reflects the increasing desire for the provision of software that has been proven to be suitable for a certain purpose and is reusable without any doubt. This desire has its roots not only in requirements for correct and reliable functions in increasingly complex systems, but also in economics: software of insufficient quality may lead to malfunctions which can have unlimited consequences, and the costs of system failures due to the application of untested software may largely exceed the costs of validation. On the other hand, experience in metrology and elsewhere has shown that - probably due to time and cost pressure during development - software is not always tested and documented completely before it is delivered. A validation may provide the requested trust. However, opinions on what an appropriate validation is diverge. There is not yet a common understanding of software validation procedures. This is true in particular for metrology. The view on software validation is too often restricted to test methods and, within a test method, to the functional testing of devices with embedded software. This view is too narrow to master the software validation problem in metrology. But what would be an adequate concept for software validation in metrology?
In general, validation means to confirm - by providing objective evidence - that the requirements for a specific intended use or application have been fulfilled [1]. Validation clearly relates to the application context and includes the requirements as a basic component for understanding what evidence shall be provided. Once the requirements are defined, it is a matter of appropriateness which method is selected. From case to case, different methods may be appropriate to provide the requested evidence. An adequate concept of software validation must take all aspects of validation into account: from the definition of requirements to the provision of evidence that the requirements are fulfilled. A concept must not stick to a general philosophy but offer practical approaches. Although several approaches addressing particular aspects of software validation in metrology are known [2, 3, 4], an overall concept is not yet available. This paper mainly describes, by means of examples, selected aspects of validation procedures and addresses the question of appropriate methods within the context of an integrated concept.

2. Requirements and Validation Methods

According to the basic procedure of software validation (see figure 1), software requirements and validation methods are the main objects of interest within the validation procedure. The definition of requirements and their refinement to testable criteria are the fundamental working steps of the validation procedure [5]. After the selection of appropriate methods and their proper execution, the documentation of requirements, refinements and test results is as important as the execution of the steps itself. Feedback between different steps may have an influence. For a successful validation it is essential that testable criteria have been derived from the defined requirements. These refined requirements must cover both metrological and software-related aspects. Therefore, the refinement process is highly interdisciplinary. Furthermore, refined requirements serve as an interface between software developers and software testers. The software developer must consider them as target criteria in the development process, and the tester uses them as quality criteria. Practice has shown, however, that although there is a desire for validation, requirements are usually not completely defined and refined to testable criteria. The requirements and the validation objective are often vague, as, for example, "the software shall be of high quality", or "the software shall function correctly". For a validation, these "requirements" are too imprecise. In the next section, it will be shown by means of an example what a refinement can look like. To illustrate the diversity of software requirements, different types are listed in table 1 together with examples of their metrological relevance.
Figure 1. Basic procedure of software validation: definition of basic requirements, refinement of requirements, selection of methods, performing tests / checks.

Table 1. Main types of software requirements and examples of relevance.

Type of requirement              Examples of metrological relevance
Model conformity                 Control of measurement, data analysis
Standard conformity              Domain-specific standards, GUM
Numerical stability              Simulation, data analysis
Correctness of implementation    Control of measurement, data analysis
Performance                      Control of measurement, remote access
Security                         Data handling, software upgrades
Usability                        User interfaces, (remote) access
Maintainability                  Software upgrades under preservation of metrological characteristics
Metrological software is not only subject to a diversity of requirements; there is also a large variety of methods which are potentially capable of providing the evidence that the requirements are fulfilled (see table 2). In a particular case, the appropriate validation method must be selected depending on the type of requirement. Certain validation methods are related to specific requirements, e.g. performance measurements to performance requirements and comparisons of model and experiment to requirements of model conformity. Other validation methods are of a more universal nature, such as, for example, dynamic tests, code inspections, or expert evaluations. Among the validation methods there are some whose importance is usually underestimated. A typical example is long-term field experience with software. If the quality criteria representing the refined requirements are known, and if there is documentation of their fulfilment during a sufficiently long period and sufficiently broad application, this can be accepted as a validation. This type of validation is no less acceptable than any other method. It may even be closer to practical needs.

Table 2. List of validation methods
- System test with integrated (embedded) software
- Dynamic black-box test of the isolated executable software
- Dynamic white-box test of the isolated executable software
- Static analysis of the source code
- Manual inspection of the source code
- Manual inspection of the software documentation
- Review of specific documents (requirements, design, tests)
- Comparison of experiment and simulation
- Domain expert evaluation
- Analysis of long-term field experience
- Performance measurements
- Acceptance of quality management systems and/or software development models
- Audits of quality management systems and/or software development models

The manual source code inspection is another type of validation method that is often underestimated in its effect. In addition, the effort it requires is often overestimated. In section 4 we show a few examples of source code inspections that are easy to perform and may be very beneficial.

3. Refinement of Requirements

The derivation of testable criteria is a major task within the validation procedure. This is usually a stepwise process and includes at the end the assignment of validation methods. In the following, the definition and refinement process will be described by means of an example.
Refinement Example

We assume that a software package is given which supports the calculation of uncertainty.

Overall validation objective: The objective is to provide sufficient trust that the software conforms to the "Guide to the Expression of Uncertainty in Measurement (GUM)" [6].

Refinement step 1: The general validation objective is refined into the main metrological requirements:
- correct implementation of the algorithms prescribed by the GUM,
- correctness of the results calculated for the examples in the GUM appendices,
- etc.
In the following refinement steps, we restrict ourselves to the first requirement.

Refinement step 2: To validate the correct implementation of the GUM algorithms, the following requirements have to be met:
- correct evaluation of the standard measurement uncertainty for input data according to type A,
- correct evaluation of the standard measurement uncertainty for input data according to type B,
- correct differentiation of model formulas,
- etc.
In the following refinement steps, we restrict ourselves to the first requirement.

Refinement step 3: The correct evaluation of the standard measurement uncertainty according to type A is broken down into
- correct calculation of the arithmetic mean,
- correct calculation of the standard deviation,
- correct calculation of the degrees of freedom,
- etc.
In the last refinement step, we restrict ourselves again to the first requirement.

Refinement step 4 (final refinement): The correct calculation of the arithmetic mean requires
a) compliance of calculated and expected results,
b) protection against abnormal ends of the program,
c) independence of the calculated results from interface variables.
By step 4 of the refinement process, a level of requirements has been reached at which concrete validation methods can be assigned to each of the derived criteria.

Assignment of test methods

Refined requirement a): Compliance of calculated and expected results
Validation method: Dynamic program test using test data

Refined requirement b): Protection against abnormal ends of the program
Validation method: Checking for appropriate clauses in the source code by static program analysis

Refined requirement c): Independence of the calculated results from interface variables
Validation method: Inspection of the relevant piece of source code.
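As a hedged illustration of refined requirement a) and the assigned dynamic program test, a unit test might look as follows; the module name "uncertainty_tool", the function "calculate_mean" and the tolerance are hypothetical, not taken from the guide or the GUM.

```python
# Sketch of a dynamic black-box test for requirement a):
# compliance of calculated and expected results.
# The imported module and function are hypothetical.
import unittest
from uncertainty_tool import calculate_mean

class TestArithmeticMean(unittest.TestCase):
    def test_reference_values(self):
        # Reference data with a known arithmetic mean
        data = [10.1, 10.3, 9.9, 10.2, 10.0]
        self.assertAlmostEqual(calculate_mean(data), 10.1, places=12)

    def test_single_value(self):
        # Degenerate but legal input should not abort the program
        self.assertAlmostEqual(calculate_mean([3.5]), 3.5, places=12)

if __name__ == "__main__":
    unittest.main()
```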
The refinement process is illustrated in figure 2.

4. Benefits of Code Inspections

Provided the requirements have been defined and refined, it is often a difficult task for a metrologist to select the appropriate validation methods from the variety of methods. The difficulty is twofold: there is a lack of knowledge of the capabilities of (some of) the methods, and it is difficult to differentiate between methods with respect to their evidential power for a given particular requirement. In very particular cases, support is available in guiding documents, for example for measuring instruments subject to the European Measuring Instruments Directive for legal metrology in the newly issued WELMEC Software Guide 7.2 [7]. If no guiding document is available, metrologists are recommended to discuss questions of method selection with software engineers. The described situation might be the reason why metrologists prefer to use functional system tests for validations. Functional tests are a widely acceptable means, although one must be careful that the requirements - provided they are defined and refined - have actually been met. Code inspections may well serve as an additional means to bridge a gap left by functional tests.
Figure 2. Example of requirements refinement and assignment of validation methods (validation objective: GUM conformity; refinement steps: correct implementation of the algorithms, correct type A evaluation of input, correct calculation of the arithmetic mean, compliance of calculated and expected results, protection against abnormal ends, independence of calculations from interfaces; assigned validation methods: black-box dynamic program test, static program analysis, manual source code inspection).
We describe two examples to show what benefits code inspections of only small pieces of software may have. As a first example, we consider the following problem: the software to be validated has both a graphical user interface and a programming interface. In this way, it can not only be used interactively, but also as a subprogram in other systems. If only functional tests are applied to validate the software, the tests have to be performed for both cases; otherwise the software would not have been completely tested and would thus not be fully validated. Instead of repeating the tests, one may inspect the usually small piece of code which contains the different connections of the program kernel to the interfaces. From this inspection one can carry the validation of one case over to the other case. This is illustrated in figure 3.
Figure 3. Example of a software with two interfaces. The inspection questions are: Is there any hidden influence from the programming interface? Are the results independent of the way the program is used? Do the programming interface and the user interface call the same functions?
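A minimal sketch of the pattern such an inspection should confirm follows; all names are hypothetical and this is not the code of any particular tool. Both interfaces delegate to one and the same kernel function with no hidden state, so a validation of one entry point carries over to the other.

```python
# Hypothetical kernel with two thin interfaces.
def calculate(values):
    """Program kernel: the function that actually performs the computation."""
    return sum(values) / len(values)

def api_calculate(values):
    """Programming interface: thin wrapper, no extra logic."""
    return calculate(values)

def on_button_pressed(text_field_content):
    """GUI callback: parses the user input, then calls the same kernel."""
    values = [float(v) for v in text_field_content.split()]
    return calculate(values)
```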
In another example we consider the case of a software that sequentially performs several computational steps. In the case of software for the evaluation of measurement uncertainty, these steps are, e.g.:
a) processing of input data according to type A,
b) calculation of the correlation among input data,
c) processing of the measurement model,
d) calculation of the sensitivity coefficients,
e) computation of the estimated values of the output quantities,
f) computation of the standard measurement uncertainties of the output quantities,
g) determination of the degrees of freedom,
h) calculation of the expanded uncertainty.
Each of these steps has input and output data and may be influenced by program parameters. The output data of one step may serve as input data for another step. Normally, the testers would design the test cases in groups such that each group of test cases focuses on one of the computational steps while the conditions and parameters for all other steps remain constant. The questions arising here are: Is it feasible to neglect test cases in which the parameters for two or more steps are varied at the same time? Is the software well structured according to the computational steps, i.e. does the software contain separate modules for each step? These questions can only be answered by analysing the relevant pieces of code. Figure 4 illustrates this issue, and a sketch of such a modular structure is given after the figure.
Figure 4. Example of a test-motivated separation of software into modules (module for type A evaluation of input, module for type B evaluation of input, module to compute the correlation influence, module for processing the model). The inspection questions are: Is each algorithm implemented by a separate module? Is the data flow between the modules plausible? Are there other data shared by the modules?
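The structure the inspection should reveal can be sketched as follows; this is a hypothetical decomposition (simplified to a linear model with uncorrelated inputs), not the layout of any real product: one function per computational step, with the data flow between them made explicit.

```python
# Hypothetical modular structure mirroring the computational steps a) to h).
def evaluate_type_a(readings):                           # step a)
    n = len(readings)
    mean = sum(readings) / n
    var = sum((x - mean) ** 2 for x in readings) / (n - 1)
    return mean, (var / n) ** 0.5                        # estimate and standard uncertainty

def combine(estimates, uncertainties, sensitivities):    # steps d) to f), linear model,
    y = sum(c * x for c, x in zip(sensitivities, estimates))          # uncorrelated inputs
    u = sum((c * u_i) ** 2 for c, u_i in zip(sensitivities, uncertainties)) ** 0.5
    return y, u

def expand(u, coverage_factor=2.0):                      # step h)
    return coverage_factor * u
```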
Beyond the examples presented here, code inspection may be helpful in many cases. The complexity of a code inspection, however, is not always restricted to a straightforward analysis as in the examples given above.

5. Summary and Outlook

Software validation is far more than testing a program. In metrology, the entire validation process has to be mastered. This process begins with the definition and refinement of requirements and ends with the issuing of objective validation reports. The entire process is an interdisciplinary work. Metrologists and software engineers can well share the work; at decisive points in the process, they must cooperate. This concerns, e.g., the refinement of requirements. Validation is to a great deal the specification of requirements. The rest is more or less engineering skill in the software area at different levels. To support the metrology experts in the entire validation process, a guidance document is necessary. A guide that is - as regards its importance -
equivalent to the GUM can provide a common understanding and well-defined recommendations for all tasks in the validation process. A future guide for software validation could be a supporting companion for metrologists in the process of meeting the current challenges, namely the elaboration and application of a unified terminology and the exploitation of beneficial software engineering methods and best practices. However, a validation guide would become much more complex than, e.g., the GUM. It is high time now to start a corresponding initiative for software validation.

References

1. ISO 9000:2000, Quality Management Systems - Fundamentals and Vocabulary, 2000.
2. Wichmann, B. et al., Validation of Software in Measurement Systems, version 2.1, NPL Best Practice Guide No. 1, ISSN 1471-4124, 2004.
3. Greif, N., Schrepf, H., Software Requirements for Measuring Systems - Examples for Requirements Catalogues, PTB Laboratory Report PTB-8.31-2000-2, Braunschweig und Berlin, 2000.
4. Richter, D. (ed.), Special Issue "Validation of Software in Metrology", Computer Standards & Interfaces, Elsevier, to appear in 2005.
5. Sommerville, I., Sawyer, P., Requirements Engineering, Wiley, 1997.
6. Guide to the Expression of Uncertainty in Measurement (GUM), ISO/BIPM Guide, ISBN 92-67-10188-9, 1995.
7. WELMEC 7.2, Software Guide (Measuring Instruments Directive 2004/22/EC), Issue 1, WELMEC Secretariat BEV, Vienna, 2005.
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 108-118)
DATA ANALYSIS - A DIALOGUE WITH THE DATA
D. S. SIVIA
Rutherford Appleton Laboratory, Chilton, OX11 0QX, England
E-mail: [email protected]

A modern Bayesian physicist, Steve Gull from Cambridge, described data analysis as simply being 'a dialogue with the data'. This paper aims to illustrate this viewpoint with the aid of a simple example: Peelle's pertinent puzzle.
1. Introduction

The training in data analysis that most of us are given as undergraduates consists of being taught a collection of disjointed statistical recipes. This is generally unsatisfactory because the prescriptions appear ad hoc, lacking a unifying rationale. While the various tests might individually seem sensible at an intuitive level, the underlying assumptions and approximations are not obvious. It is far from clear, therefore, exactly what question is being addressed by their use. Although attempts to give guidelines on 'best practice' are laudable, the shortcomings above will not be remedied without a programme of education on the fundamental principles of data analysis. To this end, scientists and engineers are increasingly finding that the Bayesian approach to probability theory advocated by mathematical physicists such as Laplace^1, Jeffreys^2 and Jaynes^3 provides the most suitable framework. This viewpoint is outlined in Section 2, and its use illustrated with an analysis and resolution of Peelle's pertinent puzzle^a in Sections 3 and 4 respectively; we conclude with Section 5.

^a Posed by Robert Peelle, from the Oak Ridge National Laboratory, Tennessee, in 1987.

2. Bayesian Probability Theory

The origins of the Bayesian approach to probability theory date back over three hundred years, to people such as the Bernoullis, Bayes and Laplace,
and was developed as a tool for reasoning in situations where it is not possible to argue with certainty. This subject is relevant to all of us because it pertains to what we have to do every day of our lives, both professionally and generally: namely, make inferences based on incomplete and/or unreliable data. In this context, a probability is seen as representing a degree of belief, or a state of knowledge, about something given the information available. For example, the probability of rain in the afternoon, given that there are dark clouds in the morning, is denoted by a number between zero and one, where the two extremes correspond to certainty about the outcome. Since the assessment of rain could easily be very different with additional access to the current weather maps, it means that probabilities are always conditional and that the associated information, assumptions and approximations must be stated clearly.

2.1. Manipulating Probabilities

In addition to the convention that probabilities should lie between 0 and 1, there are just two basic rules that they must satisfy:

Pr(X|I) + Pr(X̄|I) = 1,   (1)

Pr(X, Y|I) = Pr(X|Y, I) × Pr(Y|I).   (2)
Here X and Y are two specific propositions, X̄ denotes that X is false, the vertical bar '|' means 'given' (so that all items to the right of this conditioning symbol are taken as being true) and the comma is read as the conjunction 'and'; I subsumes all the pertinent background information, assumptions and approximations. Equations (1) and (2), known as the sum and product rule respectively, are the same as those found in orthodox or conventional statistics; this latter school of thought differs from the Bayesian one in its interpretation of probability, restricting it to apply only to frequencies, which limits its sphere of direct application. Many other relationships can be derived from Eqs. (1) and (2). Among the most useful are:

Pr(X|Y, I) = Pr(Y|X, I) × Pr(X|I) / Pr(Y|I),   (3)

Pr(X|I) = Pr(X, Y|I) + Pr(X, Ȳ|I).   (4)
Equation (3) is called Bayes' theorem. Its power lies in the fact that it turns things around with respect to the conditioning symbol: it relates Pr(X|Y, I) to Pr(Y|X, I). Equation (4) is the simplest form of marginalisation. Its generalisations provide procedures for dealing with nuisance parameters and hypothesis uncertainties.
2.2. Assigning Probabilities

While Eqs. (1) and (2), and their corollaries, specify how probabilities are to be manipulated, the rules for their assignment are less well defined. This is inevitable to some extent as 'states of knowledge' can take a myriad different forms, often rather vague. Nevertheless, there are some simple but powerful ideas on the issue based on arguments of self-consistency: if two people have the same information then they should assign the same probability. We refer the reader to some recent textbooks for a good discussion of this topic, and for examples of Bayesian analyses in general: Jaynes^3, Sivia^4, MacKay^5 and Gregory^6.
3. Peelle's Pertinent Puzzle

In 1987, Robert Peelle, from the Oak Ridge National Laboratory, posed the following simple problem as a way of highlighting an anomalous result from a standard least-squares analysis that is sometimes encountered by the nuclear data community^b:

"Suppose we are required to obtain the weighted average of two experimental results for the same physical quantity. The first result is 1.5 and the second result is 1.0. The full covariance matrix of these data is believed to be the sum of three components. The first component is fully correlated with standard error of 20% of each respective value. The second and third components are independent of the first and of each other, and correspond to 10% random uncertainties in each experimental result. The weighted average obtained from the least-squares method is 0.88 ± 0.22, a value outside the range of the input values! Under what conditions is this the reasonable result that we sought to achieve by use of an advanced data reduction technique?"
^b Oh and Seo^7 quote this from a secondary source, Chiba and Smith^8.
3.1. The Least-Squares Approximation

Let us begin by reviewing the Bayesian justification for least-squares. Recasting the problem in symbolic terms, we wish to infer the value of a quantity μ given two measurements x = {x_1, x_2} and related covariance information I_1:

⟨(x_1 − μ)²⟩ = σ_1²,   ⟨(x_2 − μ)²⟩ = σ_2²   and   ⟨(x_1 − μ)(x_2 − μ)⟩ = ε σ_1 σ_2,   (5)
where the angled brackets denote expectation values and the coefficient of correlation, ε, is in the range −1 < ε < 1. This means that we need to ascertain the conditional probability Pr(μ|x, I_1), since it encapsulates our state of knowledge about μ given the relevant data. Bayes' theorem allows us to relate this probability distribution function (pdf) to others that are easier to assign:

Pr(μ|x, I_1) ∝ Pr(x|μ, I_1) × Pr(μ|I_1),   (6)
where the equality has been replaced by a proportionality due to the omission of Pr(x|I_1) in the denominator, which simply acts as a normalisation constant here. Armed only with the covariance information in Eq. (5), the principle of maximum entropy^3 (MaxEnt) leads us to assign a Gaussian likelihood function:

Pr(x|μ, I_1) = exp(−Q_1/2) / [2π σ_1 σ_2 √(1 − ε²)],   (7)

where

Q_1 = [ (x_1 − μ)²/σ_1² − 2ε (x_1 − μ)(x_2 − μ)/(σ_1 σ_2) + (x_2 − μ)²/σ_2² ] / (1 − ε²).   (8)
If we also assign a uniform prior for μ over a suitably large range, to naively represent gross initial ignorance,

Pr(μ|I_1) = 1/(μ_max − μ_min)   for μ_min ≤ μ ≤ μ_max,   and 0 otherwise,   (9)

then the logarithm of the posterior pdf being sought, L_1, becomes

L_1 = ln[Pr(μ|x, I_1)] = constant − Q_1/2   (10)

for μ_min ≤ μ ≤ μ_max, and −∞ otherwise. Since Eq. (10) tells us that L_1 is largest when Q_1 is smallest, our 'best' estimate μ_0 is given by that value of μ which minimises the quadratic scalar mismatch of Eq. (8): this is the least-squares solution.
A Taylor series expansion of L_1 shows that Pr(μ|x, I_1) is Gaussian, with the mean and variance parameters, μ_0 and σ², being defined by

dQ_1/dμ |_{μ_0} = 0   and   1/σ² = (1/2) d²Q_1/dμ².   (11)

Hence, our inference about μ can be summarised by a best estimate and an associated error-bar, μ_0 ± σ, in the standard way:

μ = (α x_1 + β x_2)/(α + β) ± σ_1 σ_2 √[(1 − ε²)/(α + β)],   (12)

where

α = σ_2 (σ_2 − ε σ_1)   and   β = σ_1 (σ_1 − ε σ_2).   (13)

For Peelle's pertinent puzzle, x_1 = 1.5, x_2 = 1.0, σ_1 = 0.3354, σ_2 = 0.2236 and ε = 0.8; this leads to the conclusion that μ = 0.88 ± 0.22.
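The numbers quoted here can be reproduced with a few lines of code; the sketch below is mine rather than the author's, and the use of NumPy is an assumption of the illustration.

```python
# Sketch reproducing the least-squares result of eqs. (12)-(13) for Peelle's data.
import numpy as np

x1, x2 = 1.5, 1.0
# 20 % fully correlated plus 10 % independent components of uncertainty
sig1 = np.sqrt(0.2**2 + 0.1**2) * x1              # approximately 0.3354
sig2 = np.sqrt(0.2**2 + 0.1**2) * x2              # approximately 0.2236
eps = (0.2 * x1) * (0.2 * x2) / (sig1 * sig2)     # approximately 0.8

alpha = sig2 * (sig2 - eps * sig1)
beta = sig1 * (sig1 - eps * sig2)
mu0 = (alpha * x1 + beta * x2) / (alpha + beta)
sigma = sig1 * sig2 * np.sqrt((1 - eps**2) / (alpha + beta))
print(mu0, sigma)                                 # approximately 0.88 and 0.22
```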
3.2. Understanding the Puzzle
The result of the above analysis is anomalous, because it's at odds with our expectation that the best estimate should be bounded by the two measurements. Although seemingly weird, is it unacceptable? From Eq. (12), it's not too difficult to see that μ_0 will lie between x_1 and x_2 as long as both α and β are positive. Equation (13) translates this into the requirement that

ε < min{σ_1/σ_2, σ_2/σ_1}.   (14)

This will always be satisfied for negative correlations, but will fail as ε → 1 when σ_1 ≠ σ_2. Upon reflection, this is not surprising. Independence, or ε = 0, indicates that x_2 is no more likely to be higher or lower than the true value of μ no matter what the corresponding deviation of x_1; and vice versa. It seems reasonable that μ_0 should then lie between x_1 and x_2, since this minimises the (sum of squared residuals) mismatch with the measurements. There is an additional reason for this outcome when ε < 0, as there is also the expectation that x_1 − μ and x_2 − μ have opposite signs. By the same token, there is increasing pressure for μ_0 to be outside the range spanned by the data as ε → 1, so as to satisfy the growing expectation from the positive correlation that both measurements deviate from the true value in the same sense. With ε = 0.8 in Peelle's case, a best estimate of 0.88 for μ doesn't now seem quite so ludicrous.
4. Peelle's Pertinent Ambiguity

Although we can understand the reason for the anomalous result in Peelle's pertinent puzzle, there is a curious feature in the statement of the problem: the covariance elements are given in relative, rather than absolute, terms. Is this significant?

4.1. Least-Squares for Magnitude Data
The least-squares analysis of the preceding section relied, for the most part, on the Gaussian likelihood of Eqs. (7) and (8). This assignment was based on the constraints of Eq. (5), and motivated by the principle of MaxEnt. A more literal interpretation of the covariance information in Peelle's statement, however, would be

⟨(δx_1/x_1)²⟩ = s_1²,   ⟨(δx_2/x_2)²⟩ = s_2²   and   ⟨(δx_1/x_1)(δx_2/x_2)⟩ = ε s_1 s_2,   (15)

where δx_1 and δx_2 are the deviations of the measurements from the true value of μ; the fractional error-bars are equal, with s_1 = s_2 = 0.2236, but the correlation coefficient remains unchanged (ε = 0.8). These constraints can be turned into the simpler form of Eq. (5) through the substitution of y_1 = ln x_1 and y_2 = ln x_2, so that

⟨δy_1²⟩ = s_1²,   ⟨δy_2²⟩ = s_2²   and   ⟨δy_1 δy_2⟩ = ε s_1 s_2.   (16)
Pr(lnx|lnM,/2) =
where Q2 = ( ^ M
H^/fi])
= = , 27rsiS2vl —e
/ 4 \esis2
"i^A si J
(17)
fa[xi/rf Un[z 2 /d
(18)
where I2 denotes the covariance information in Eq. (15). The above discussion suggests that ^ is a scale parameter, or something that is positive and pertains to a magnitude. As such, the prior that expresses gross initial ignorance 3 is n n
Ir 1
/ ( l n [ / W x / M m i n ] )~
Pr(ln/x|J 2 ) = < I
n
0
'
for
M™" ^ ^ ^ ^ > »
^ . otherwise
,,M
(19)
114 1
1
1
(b)"
t .••••''" 1 /
-0.5
0
0.5
Quantity of interest
1
0.5
\n/i
i
1
H
1.5
2
Quantity of interest /J.
Figure 1. The posterior pdfs of Eqs. (10) and (21), with Peelle's measurements marked by arrows, which have been scaled vertically to have a maximum value of unity to aid comparison, (a) The posterior pdf of Eq. (21), P r ( l n ^ | l n x , / 2 ) , which is a Gaussian, (b) The same pdf transformed to linear fi, Pr(^i|lnx, I2), where it is non-Gaussian; the posterior pdf of Eq. (10), Pr(/x|x, Ii), is plotted as a dotted line.
Using Bayes' theorem, Pr(hiju|lnx,h) oc Pr(lnx|ln/u,/2) x Pr(ln/i|7 2 ) ,
(20)
where we have again omitted the denominator Pr(lnx|/2), we find that the logarithm of the posterior pdf for In /x is J02 = I n [ P r ( l n ^ | l n x , J 2 ) ] = constant
Qi
(21)
for fimm < n < /i m a x , and —00 otherwise. Thus Pr(ln fi | In x, I2) is also a Gaussian pdf which, for the case of equal relative error-bars s\ = S2, can be succinctly summarised by Infi = Iny/x\X2
± Si
(22)
The substitution of Peelle's data yields ln/x = 0.20 ± 0.21 or, through a standard (linearised) propagation of errors 4 , /x w 1.22 ±0.26. The posterior pdfs of Eqs. (10) and (21) are shown graphically in Fig. 1. 4.2. Looking at the
Evidence
The above two analyses of Peelle's data give noticeably different optimal estimates of /x, although there is a substantial degree of overlap between them. This should not be too surprising as each is predicated on a different set of assumptions, I\ and I2, corresponding to alternative interpretations of the information provided. Hanson et al.9 correctly point out that the real solution to this problem rests with the experimentalists giving more
115
details on the nature of the uncertainties in the measurements. Whatever the response, probability theory provides a formal mechanism for dealing with such ambiguities; it is based on marginalisation. If I represents the statement of Peelle's puzzle, and any other information pertinent to it, then our inference about the value of \i is encapsulated by Pr(^z|x, I). This can be related to analyses based on alternative interpretations of the data, Ii, I2, ..., IM, by M
Pr(/i|x,/) = ] T P r G u , / J - | x , / ) ,
(23)
i=i
which is a generalisation of Eq. (4). Using the product rule of probability and Bayes' theorem, each term in the summation becomes P r f o / j l x , / ) = PrQulx,/,) x ^ ' f f ^ ^
1 0
,
(24)
where the conditioning on I has been dropped, as being unnecessary, when Ij is given. Since Pr(x|J) does not depend on /J, or j , it can be treated as a normalisation constant. Without a prior indication of the 'correct' interpretation of the data, when all the Pr(7j |7) can be set equal, Eq. (23) simplifies to M
Pr(/i|x,7) ex J2 Pr(Hx,-0) * ?*(*&)
•
(25)
This is an average of the alternative analyses weighted by the evidence of the data, Pr(x|Jj). The latter, which is also known as the global or marginal likelihood, or the prior predictive, is simply the denominator term that is usually omitted in applications of Bayes' theorem as an uninteresting normalisation constant:
Using the assignments of Eqs. (7) and (9), the evidence for 7i is given by P r ( x , H / i ) dM =
—
/
/=% •
(27)
The dependence of the analysis on ^ m j n and /umax might seem surprising, but that is because their exact values tend to be irrelevant for the more familiar problem of parameter estimation: the posterior pdf Pr(/i|x,Ij) is independent of the bounds as long as they cover a sufficiently large /u-range
116
to encompass all the significant region of the likelihood function Pr(x| //, Ij). For the assignments of Eqs. (17) and (19), the corresponding evidence is best evaluated in log-space: Pr(x|7 2 ) =
Pr(lnx|7 2 ) a:iX2 InMmax
I-Q2/2
r e~Q2/2
XiX2\n[lJ,ma.x/nmin]
\
J
d l n
2irsis2 y/T-
^
(28)
where the x\ x2 in the denominator is the Jacohian for the transformation from Pr(lnx|i2) to Pr(x|/2). It should be noted that ^ m ; n and /x max do not have to have the same values in Eqs. (27) and (28): these bounds must be positive in Eq. (28), in keeping with the scale parameter view of /J, implied by I2, whereas they are free from this restriction in Eq. (27). Carrying out the evidence-weighted averaging of Eq. (25) for M = 2, with /i m ; n and /imax set somewhat arbitrarily to 0.1 and 3.0 in both Eqs. (27) and (28), we obtain the marginal posterior pdf for Peelle's problem shown in Fig. 2; it has a mean of 0.96, a standard deviation of 0.27, a maximum at 0.91 and is asymmetric with a tail towards higher /j,. Although the precise result necessarily depends on the //-bounds chosen, it does so fairly weakly. The essential conclusion is that a value of // between 1.5 and 2.0, which is on the upper-side of the larger measurement, cannot be excluded with such high certainty if the possibility of I2 is admitted (in addition to 7i).
0.5 1 1.5 2 Quantity of interest \j. Figure 2. The marginal posterior pdf of Eq. (25), Pr(/u|x, I), for M = 2. The evidenceweighted contributions from the two alternative interpretations of the data considered, P r ( / i | x , 7 i ) and Pr(fi|x,I2), are shown with a dotted and dashed lines; / i m i n and /i m a x were taken to be 0.1 and 3.0 in both cases. Peelle's measurements are marked by arrows and, to aid comparison with Fig. 1, all the pdfs have been scaled vertically so that Pr(/i|x, I) has a maximum value of unity.
117 5. Conclusions We have used Peelle's pertinent puzzle as a simple example to illustrate how the analysis of data is a dynamic process akin to holding a conversation. When the initial least-squares analysis of Section 3.1 led to results that seemed 'wrong', we reacted by looking more carefully at the validity of the assumptions that underlie that procedure. This prompted us to formulate a different question, addressed in Section 4.1, denned by an alternative interpretation of the information provided. In the absence of experimental details regarding the nature of the uncertainties associated with the given measurements, we again turned to probability theory to ask, in Section 4.2, what we could conclude in face of the ambiguity. To avoid any confusion, let us clarify further a few points regarding what we have done in this analysis of Peelle's pertinent puzzle and about our Bayesian viewpoint in general. We have not said that the least-squares analysis was wrong. Indeed, in Section 3.2, we have explained why the counter-intuitive result could actually be quite reasonable. We simply asked a series of questions, defined by alternative assumptions, and addressed them through probability theory — it was just a dialogue with the data. The Bayesian viewpoint expounded here follows the approach of mathematical physicists such as Laplace 1 , Jeffreys2 and Jaynes 3 , and is still not widely taught to science and engineering undergraduates today. It differs markedly in its accessibility for scientists from the works of many statisticians engaged in Bayesian field; the latter carry over much of the vocabulary and mind-set of their classical frequentist training, which we believe to be neither necessary nor helpful. We refer the reader to some recent textbooks, such as Jaynes 3 , Sivia4, MacKay 5 and Gregory 6 , for a good introduction to our viewpoint. To conclude, a black-box approach to the subject of data analysis, even with useful guidelines, is best avoided because it can be both limiting and misleading. All analyses are conditional on assumptions and approximations, and it's important to understand and state them clearly. While the evaluation of an arithmetic mean might seem objective and incontrovertible, for example, its status as a crucial number requires some qualified justification. We believe that an understanding of the principles underlying data analysis, along the lines outlined here, is at least as important as formal prescriptions of best practice.
118 Acknowledgments I a m grateful to Soo-youl Oh for bringing this fun little problem to my attention, and to Stephen Gull and David Waymont for giving me useful feedback on my analysis of it.
References 1. P. S. de Laplace, Theorie analytique des probabilites, Courcier Imprimeur, Paris (1812). 2. H. Jeffreys, Theory of probability, Clarendon Press, Oxford (1939). 3. E. T. Jaynes, Probability theory: the logic of science, edited by G. L. Bretthorst, Cambridge University Press, Cambridge (2003). 4. D. S. Sivia, Data analysis - a Bayesian tutorial, Oxford University Press, Oxford (1996). 5. D. J. C. MacKay Information Theory, Inference, and Learning Algorithms, Cambridge University Press, Cambridge (2003). 6. P. Gregory, Bayesian Logical Data Analysis for the Physical Sciences, Cambridge University Press, Cambridge (2005). 7. S.-Y. Oh and C.-G. Seo, PHYSOR 2004, American Nuclear Society, Lagrange Park, Illinois (2004). 8. S. Chiba and D. L. Smith, ANL/NDM-121, Argonne National Laboratory, Chicago (1991). 9. K. Hanson, T. Kawano and P. Talou, AIP Conf. Proc. 769, 304-307.
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 119-129)
A VIRTUAL INSTRUMENT TO EVALUATE THE UNCERTAINTY OF MEASUREMENT IN THE CALIBRATION OF SOUND CALIBRATORS GUILLERMO DE ARCAS, MARIANO RUIZ, JUAN MANUEL LOPEZ, MANUEL RECUERO, RODOLFO FRAILE Grupo de Investigation en Instrumentation y Acustica Aplicada (I2A2), I.N.S.I.A., Universidad Politecnica de Madrid, Ctra. Valencia Km. 7, Madrid 28010, Spain A software tool to automate the calibration of sound calibrators is presented which includes the automatic estimation of measurement uncertainty. The tool is a virtual instrument, developed in LabVIEW, which uses several databases to store all the information needed to compute the uncertainty conforming to the Guide to the Expression of Uncertainty in Measurement. This information includes instruments specifications and calibration data.
1. Calibration of sound calibrators According to IEC 60942, sound calibrators* are designed to produce one or more known sound pressure levels at one or more specified frequencies, when coupled to specified models of microphone in specified configurations. They are calibrated in accordance with IEC 60942 and during their calibration the following magnitudes are determined: generated sound pressure level (SPL), short-term level fluctuation, frequency and total distortion of sound generated by the sound calibrator. These instruments are mainly used to check or adjust the overall sensitivity of acoustical measuring devices or systems. So, as they are only used as a transfer standard for sound pressure level, this is the most important magnitude to be determined during calibration. Therefore, although the application here presented includes the rest of the tests, the discussion will be focused in the determination of the SPL. The SPL generated by a sound calibrator can be determined using two methods: the comparison method or the insert voltage method. According to [1] the uncertainties obtained with the former are bigger, so the latter is preferred.
* In this paper the term sound calibrator is used for all devices covered by IEC 60942:2003 including pistonphones and multi-frequency calibrators.
120 In the insert voltage method, a measurement microphone, whose opencircuit sensitivity is known (.?dB), i s u s e ^ to determine the SPL generated by a sound calibrator. This can be done by measuring the open-circuit voltage (v0c) generated by the microphone, when coupled to the sound calibrator. Applying the definition of the SPL, or sound level (Lp), the following equation is obtained: Lp = 20 • log
20/iPa
*dB
(1)
As microphones are high output impedance transducers it is not easy to measure their open-circuit voltage directly, so the insert voltage method proposes an indirect way of determining this magnitude. A signal generator is inserted in series with the microphone as shown in Fig.l. First, the sound calibrator is switched on, the signal generator is switched off, and the voltage at the output of the preamplifier is measured. Then, the sound calibrator is switched off, the signal generator is switched on and its amplitude is adjusted to produce the same output voltage at the preamplifier's output, as the one produced when the sound calibrator was switched on. Last, the open circuit voltage at the microphone's output, voc» can be determined by measuring the open-circuit voltage at the output of the signal generator. Preamplifier
Microphone
Signal Generator
4= 0
Figure 1. Simplified schematic of the insert voltage method for determining the open-circuit voltage generated by a microphone on known sensitivity.
2. Software architecture A virtual instrument (VI) has been developed in Lab VIEW to automate the calibration of sound calibrators using the insert voltage method. 2.1. System architecture Figure 2 shows the instruments needed to calibrate a sound calibrator. The virtual instrument has been developed according to this system architecture, and assuming that all instruments except the microphone itself, its power supply and
121
the preamplifier, can be controlled from the computer trough a GPIB or similar interface. r-1
S
,=, Environmental Conditions Unit
Signal Generator
Microphone Power Supply
i
IF
Insert Voltage Unit
Microphone and preamplifier
->
4 Digital Multimeter Figure 2. System architecture for calibrating sound calibrators.
2.2. Software architecture The VI has a modular architecture that can be divided in two levels. High level modules, shown with rectangular shapes in Fig. 3, deal with the behavior of the application, while low level modules, represented with oval shapes, provide all the services needed to perform the basic operations: • Report Manager. Provides the functions needed to generate the calibration report annexes directly in a Microsoft Word file. It uses the National Instruments Report Generation for Microsoft Office Toolkit. • Error Manager. It manages all the possible errors that can occur in the application. • DDBB Manager. All the information needed in the application is stored in databases. This module provides all the functions needed to retrieve the information from the databases. It uses the National Instruments Database Connectivity Toolkit functions. • Uncertainty Manager. Provides the uncertainty calculation functions for each type of measurement. • Hardware managers: Generator Manager, DMM Manager and Environm. Manager. They provide the functions needed to control the hardware: configure the instruments, take measurements, etc. • Message Manager. Provides communication between modules (refer to section 2.3).
122 Main
Sequencer
/ Report \ V JManager,/*v / \
Error v \ Manager^/
y^'\ Config
+
f DDBB \ i f *• Manager J
DDBB Access
Lp Test
/'Me \ Manager /
X
f Generator \ Manager y
Test
DMM \ Manager J
hv E
Environm. \ Manager J
Uncertainty ^ Manager _/'
Figure 3. Software architecture of the virtual instrument.
These services are used by the high level modules to perform the following operations: • Main. Initializes the resources (queues, globals, instruments, etc) and launches the rest of the modules. • Sequencer. It a simple state machine that sequences the operation of the applications trough the following phases: configuration, database access, sound pressure level test, frequency test, and distortion test. • Configuration. Enables the user to choose the sound calibrator model to be calibrated, the microphone that will be used, and the calibration information (standard to be used, serial number of the calibrator, tests to be performed). • DDBB Access. Provides the user with access to databases content. It is only executed if the user needs to access the databases to enter a new type of sound calibrator. • Lp Test, Frequency test, and Distortion Test. These modules implement the algorithms of the three tests performed during calibration, and include the user interface management for each of them. 2.3. Implementation details The software architecture has been implemented following a message based structure, [2]. Low level modules are always running during execution. This means that high level modules run in parallel with them, and they send them messages to execute their services. Messages are multi-line strings sent through Lab VIEW queues, and have the following structure: the first line contains the destination manager; the second one the message, or command; and the third one, when applicable, the
123
command option or parameter value. This permits to embed the interface to low level modules in a component that can send all the possible commands to that module, as shown in Fig. 4 for the DMM Manager.
1^^
Commands Init Config DC ConFig AC Meas Close Exit
Command -
• Voltage
error in K =* error out DMM Command.vi
L_—p^
Format the message
Send the message
Wait For answer
Figure 4. Example of a VI used to send a command to a low level module.
Low level modules have all a similar structure. They are always waiting for commands to appear at their input queue. Once a command is received, it is executed and a response is sent informing of the result as shown in Fig. 5 for the DMM Manager. H No Error
|[
TKz^ifB
^K
ICommandl
send message and stop!
Commands Init Config DC - Config AC Meas Close E^it
||#DMM Addres
cm
m Figure 5. Low level module example code corresponding to the DMM Manager.
High level modules have been implemented using standard LabVIEW architectures, mostly state machines and/or standard dataflow programming structures. They send their outgoing messages to the Message Manager, who redistributes them to its destinations. This permits to trace the execution
124
program by saving all messages that pass trough this module in a text file, as shown in Fig. 6.
—'
hcreate or replace T l
Figure 6. Implementation of the Message manager VI.
2.4. Databases structure All the information needed in the application, both for measurement configuration and uncertainty calculation, is stored in five databases: • DBMic. For each microphone that can be used in the system to calibrate sound calibrators this database stores the following information: o Serial number, model, manufacturer, type, last calibration date. o Calibration history data: sensitivity, associated uncertainty and calibration conditions. o Temperature, static pressure and humidity coefficients for each frequency and associated uncertainties. • DBCal. It stores the following information of the sound calibrators: model, manufacturer, type, levels, frequencies, temperature, static pressure and humidity coefficients and associated tolerances, when applicable.
125 •
• •
DBInstr. For each instrument used during the measurement process (digital multimeter, signal generator and environmental conditions unit) it stores its specifications and the calibration data used to evaluate the uncertainty of measurement. DBTol. It stores the tolerances and maximum uncertainties defined for each test in the IEC standards supported by the application^ DBUnc. It stores the accredited best measurement capacity (BMC) of the calibration laboratory. The UCalculator module checks that the measurement uncertainty is never smaller than the BMC. If this is the case, the BMC is used as the measurement uncertainty and a warning is used.
3. Uncertainty evaluation Measurement uncertainty is estimated following the method described in the guide to the expression of uncertainty in measurement (GUM). First, a measurement model has been developed, and then the GUM method has been applied to this model. 3.1. The measurement model A couple of considerations must be made in order to complete the measurement model presented in Eq. 1. First, the effective load volume of the microphone can affect the measured sound pressure level depending on the susceptibility of the sound calibrator to this effect. Thus, a term (5SELV) must be included for this effect, if necessary*. Another effect that can be considered is the possible drift in the microphone's sensitivity between calibrations. This can be estimated if an historic report of the calibrations is kept (85,). And last, it has to be noted that microphone's sensitivity varies with frequency, polarization voltage and environmental conditions [3]. The value of the microphone sensitivity for each frequency (sdsj) is obtained from the microphone calibration certificate. This certificate gives the value of the microphone's sensitivity referenced to certain polarization voltage and environmental conditions. Usually, these conditions differ from those under which the calibration of the sound calibrator is performed. * The actual Spanish legislation for legal metrology control of sound calibrators refers to the IEC 942:1988, but calibration is usually performed against IEC 60942. ' The difference in the level of a pistonphone coupled to microphones of different effective load volumes is well known. Sound calibrators are not supposed to be influenced by this effect because of their design, although some influences have been reported, [ 1 ]
126 So Eq. 1 must be rewritten to give the expression of the sound pressure level produced at the output of a sound calibrator working at a frequency/ and determined under calibration conditions (vPC, tc, Pa rhc)> using a microphone whose open-circuit sensitivity is known for certain values of polarization voltage, temperature, static pressure and relative humidity (vPM, tM, PM, rhM). Lpccj = 2 0 ' l
o
-&t-hcsf
g ^ V - sdB.f ~ 2 0 l o g - ^ - &ELV 20juPa vPM tcsf -(tc-tM)pcsf • (pc - pM)
(2)
-(rhc-rhM)
In order to use Eq. 2 the temperature, static pressure and relative humidity coefficients of the microphone's sensitivity must be known at frequency f{tcsf, pcsf, hcsf). Finally, the IEC 60942 standard states that the sound pressure level generated by the sound calibrator must be referenced to specific reference conditions (tRi pRi rhR). This can be done if the temperature, static pressure and relative humidity coefficients of the sound calibrator are known (tec, pec, hec), giving Eq.3. 3.2. Sources of uncertainty According to the measurement model presented in Eq. 3, the following sources of uncertainty can be considered: v
VQC LpRCf = 20 • log—2£- sdBJ - —• 2 0 l o g" - ^ - &ELV
20/jPa
-&, -tcsf -{tc-tM)-pcsj-
-(PC-PM)
0)
- hcsf • (rhc -rhM) + tec • (tR - tc) + Pec -(PR-PC)
1.
2.
3. 4.
+ hec • {rhR -
rhc)
Measurement method. A single contribution must be determined associated with the insert voltage method measurements as this will depend on the actual implementation of the method and the equipment used. Calibration of reference microphone. This is typically one of the dominant contributions and it is obtained directly from the microphone's calibration certificate and from the historic record of the calibrations. Polarization voltage measurement. Influence of the effective load volume of microphone in the sound calibrator.
127
5.
6. 7.
8.
9.
Uncertainty of the temperature, static pressure and relative humidity coefficients of the microphone's sensitivity. These coefficients are not always known, so they can be estimated as in [4] and [5]. Measurement of the environmental conditions during calibration. Uncertainty of the temperature, static pressure and relative humidity coefficients of the sound calibrator. Some manufacturers provide nominal values for these coefficients and associated tolerances. Repeatability. According to the IEC 60942 standard measurement of SPL must be repeated three times, so a Type A contribution must be determined from these replications. Rounding of the final result. As the final result is rounded to a finite number of digits a contribution must be included to account for the error introduced by the rounding.
4. Conclusions A software application has been developed to automate the complete calibration process of sound calibrators. The application has been used for the last year and a half in the acoustical instruments calibration laboratory of the Universidad Politecnica de Madrid, calibrating around 600 instruments during this period. One of the most important parts of the application is the uncertainty calculation module. Although this module is part of the application, it can also be used as a component that can be included in any software project. The numerical correctness of the Ucalculator module has been validated by comparing its results with those produced using an Excel datasheet implementing the same model. The measurement model and the virtual instrument itself have been validated through an inter-laboratory comparison that has been organized by the laboratory at a national level with the support of the Spanish national accreditation body, EN AC. Table 1 shows the uncertainty budget obtained when calibrating a Briiel & Kjaer Type 4231 sound calibrator using a Briiel & Kjaer Type 4180 laboratory standard microphone. For each contribution its value, probability density function (pdf) and sensitivity coefficient (c,) is shown.
128 Table 1. Uncertainty budget for the calibration of a Bruel & Kjaer Type 4231 sound calibrator using a Bruel & Kjaer Type 4180 laboratory standard microphone. Contribution u(voc) U(SdB)
u(vPC) W(5JELV)
«(&,) u(tcsj) «('c) u(pcsf) u(pc) u(hcsj) u(rhc) u(tcc) u{pcc) u(hcc) Repeatability Rounding uc U
Value 0,085 0,03 0,031 0 0 0,0023 0,141 0,0005 0,142 0 4,5 0,141 0,142 4,5 0,005 0,005
pdf Normal Normal Rectangular Rectangular Rectangular Rectangular Rectangular Rectangular Rectangular Rectangular Rectangular Normal Normal Normal Normal Rectangular Normal Normal
Cf
1 -1 -0,0434 -1 -1 3 -0,0063 9 -0,0013 30 0,00008 0,0015 0,00008 0,001 1 1
Ui
(dB) 0,0085 -0,0150 -0,0009 0,0000 0,0000 0,0040 -0,0005 0,0026 -0,0001 0,0000 0,0002 0,0001 0,0000 0,0023 0,0045 0,0029 0,02 0,04
The use of a graphical programming language that has a good support for instrument control, and its associated toolkits for database management and report generation have reduced the development time considerably. On the other hand, it must be noted that special care must be paid when using graphical programming languages to develop medium to large software applications. These languages tend to make the programmer forget the basic software engineering concepts that must be followed to develop any software application [7], [8]. Lab VIEW offers several methods to make the software modular and scalable. Queues are a good example, as they can be used to develop message based software architectures. Concerning the implementation of the GUM method, it must be noted that it is difficult to maintain the information about the pdf assigned to each contribution throughout the calculation process. This affects the validity of the method for determining the expanded uncertainty and can lead to errors. Alternative methods such as the Monte Carlo Simulation (MCS) [9] can be a solution to this problem. This method has the additional advantage of its numerical nature, which is very interesting for applications where the calibration software computes the measurement uncertainty "on the fly". During testing of the application, an implementation of MCS in MatLab has been used to validate the results given by the GUM method. The results obtained for the uncertainty
129 budget shown in Table 1 are shown in Fig. 4, giving a confidence interval of [0.377, 0.378] decibels for a 95% coverage probability. 1800,1600 1400 1200 1000 800 600 400 200 -
93 9
93.95
94
94.05
94.1
Figure 7. Uncertainty of measurement calculated using the Monte Carlo Simulation method for the uncertainty budget shown in Table 1.
References 1. Rasmussen, K. EA Interlaboratory comparison Acl. Final Report. Danish Primary Laboratory of Acoustics (1999). 2. E. Barrera et al. National Instruments NIDAYS, Madrid (2000). 3. Microphone Handbook. Vol. 1: Theory, pp. 3-18:3-21 (1996). 4. Rasmussen, K. Metrologia, 36, pp.265 - 273 (1999). 5. Rasmussen, K. Bruel&Kjaer Technical Review, No.1-2001, pp. 5-17 (2001). 6. Guide to the Expression of Uncertainty in Measurement. BIPM, IEC, IFCC, ISO, IUPAP, IUPAC, OIML (1995). 7. J. Conway, S. Watts. Software Engineering Approach to LabVIEW. Prentice Hall PTR (2003). 8. T. Clement, L. Emmet, P. Froome, S. Guerra. Best Practice Guide No. 12. National Physical Laboratory (2002). 9. M G Cox and P M Harris. Software Support for Metrology Best Practice Guide No. 6. National Physical Laboratory (2004).
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 130-141)
INTERCOMPARISON REFERENCE FUNCTIONS AND DATA CORRELATION STRUCTURE WOLFRAM BREMSER Federal Institute for Materials Research and Testing (BAM) Unter den Eichen 87, 12205 Berlin, Germany
Interlaboratory comparisons (IC) are key instruments for ensuring the performance and reliability of, and the comparability between, the participating facilities. Interlaboratory comparisons are also a mighty tool for assessing reference values to measurement standards or reference materials. In modern measurement and testing, determination of functionally related properties of the reference object, or of the performance characteristics of an instrument within a certain measuring range is common practice. Intercomparisons in these areas require the calculation of IC reference functions (RF) instead of series of isolated IC reference values. A number of decisions have to be made concerning the appropriate type of the ICRF, a regression technique which enables accounting for the full uncertainty/correlation structure, and appropriate result assessment. The paper discusses a particular approach which is exemplified for an IC in equipment qualification using a (travelling) transfer instrument for the comparison of an on-site measuring instrument with a reference standard.
1.
Introduction
Bilateral, regional or global interlaboratory comparisons (IC) are key instruments for ensuring the performance and reliability of, and the comparability between, the participating facilities. IC are also a mighty tool for assessing reference values to measurement standards or reference materials. Traditionally, measurement standards are materialisations of a single property value, and proficiency testing (PT) schemes compare single property values obtained by different laboratories on the same reference object (or at least on a part of a homogeneous batch). When basic properties such as length or mass are concerned, there is obviously no problem in characterising an object by one single value. Statistics and procedures for treating experimental data obtained from repeated investigation in one single or a limited number of laboratories and distilling a reference value are well established. ISO Guide 35 [1] represents such a system of procedures for defining reference values, and ISO 5725 [2] can be taken as a
131 basis for assessing method validation (and partially PT) scheme results. ISO 13528 [3] provides further guidance on the assessment of PT results. The above picture changes when more than one single value are to be determined for the reference object, and it becomes completely different when functionally related properties are to be determined. Irradiance curves or fluorescence efficiency/yield of e.g. a dye, the uptake of an adsorbate gas by a porous material, implantation profiles, or the performance characterisation of an instrument within a certain measuring range may serve as examples. Obviously, such situations can only be handled by calculating IC reference functions (ICRF) instead of series of isolated IC reference values [see also reference 4]. 2.
General principles
Regression of an appropriately selected model function to the experimentally obtained data will obviously be the method of choice for determining the ICRF. Different regression techniques may be applied which aim at minimising different objective functions. Most often, the objective function will be the unweighted or somehow weighted sum of squared deviations (SSD), and the regression technique thus a least-squares regression. Besides certain numerical difficulties which may occur in (very) complex cases, ICRF determination faces three more principal problems, namely i)
the selection of the type of RF to be regressed to the data This is trivial as long as sufficient evidence can be provided that underlying principles predict a specific type of a model function, like say in the case of a radioactive decay. It becomes more complicated in cases where individual properties may significantly alter a behaviour predicted by basic principles. In those cases normally sets of appropriately chosen basis functions (like e.g. polynomials) will be used to approximate the "true" data structure. The problem becomes complex when poor or no good evidence can be provided concerning underlying basic principles (expressed as a suitable model function) which is often the case for material-related properties. Spline fitting or kernel estimation may be appropriate here.
ii)
the full account for the uncertainty/correlation structure of the data Within the philosophy of the GUM [5], any measurement result has, and any statement of a result should be accompanied by, a measurement uncertainty. Series of results (within or between laboratories par-
132 ticipating in the IC) may be correlated. The uncertainty/correlation structure of the measured data should undoubtedly be taken on board for the determination of the ICRF. The chosen regression technique should thus allow taking all data uncertainties and possible data correlations on board, and all data uncertainties and possible data correlations should properly be propagated to the ICRF. iii)
the assessment of compatibility. This mainly refers to the establishment of compatibility between the data provided by the participating laboratories and/or reference functions determined separately for each of the participating laboratories with the common ICRF. Although regression substantially reduces the (in some cases enormous) amount of data to a couple of reference function parameters, most of the reference function types will have more than one parameter. These parameters have to be compared, and suitable tools be applied for establishing compatibility. Again, a proper assessment will require parameter variance/ covariance matrices which take up the full uncertainty/correlation structure of the initial, experimentally obtained data set.
The problem of selecting an appropriate type of the RF will not be elaborated further, emphasis is made on the account for the full uncertainty/correlation structure of the data and the assessment of compatibility. A suitable technique able to account for the uncertainty/correlation structure is generalised least-squares regression using the inverse variance/covariance matrix as the weight matrix (GLS, although other brand names and abbreviations for the same technique can also be found in the literature). The principles of GLS were laid down by Deming [6], and an overview on later developments and implementations of this technique is given in [7], although restricted only to straight-line model functions. Meanwhile, GLS is standardised for a series of five different model function types [8] and already well established in the field of gas analysis. For the case of a relationship between one dependent variable y and an independent variable x, the principle of GLS can be formulated as follows: Let {Yit Xj} be a set of n measured pairs of values, and y = f(x) a suitable model function with a set of parameters/?. A vector z of differences Z = {X,-xh
.... X„-x„, Y, -y,, .... Y„-yn}T
(1)
133 is formed where the yt are the values of the model function taken at so-called adjusted points x,. Together with the set of parameters p, these x, are subject to determination by solving the minimisation problem Z-VC'-zT
= min
(2)
where VC1 is the inverse variance/covariance matrix of the {Yh X,} data set. After determination of the set of parameters p, the full variance/covariance matrix of the parameters is determined by uncertainty propagation of the {Yh X,} data set variance/covariance matrix VC. For feasible propagation algorithms see e.g. [8]. As can be seen from this brief introduction, GLS is capable to correctly account for the full data uncertainty/correlation structure. Existing algorithms and implementations however refer to either i) straight-line model functions and the diagonal elements of the variance/covariance matrix VC only [see reference 7] or ii) a selection of model functions and the diagonal elements plus the covariance terms in the independent, but not to covariances in the dependent variable (e.g. the software available for the implementation of ISO 6143). For the purpose of taking advantage of the full GLS power, the algorithm mentioned under ii) had to be extended to allow for taking covariances in the dependent variable on board. Assessment of data compatibility can generally be based upon three criteria, namely the residual SSD of the regression, the goodness-of-fit parameter (GoF) calculated according to [8], and the joint confidence region (JCR) of the set of function parameters/;. The joint confidence region (see e.g. [9]) is helpful for comparisons of two or more RF: Two RF may be considered being compatible if the centre point (CP in table 1) of the parameter JCR of any of the two functions is contained in the parameter JCR of the respective other function. There are cases when either nominal values for the parameters are known (see also chapter 3) or the latter should fall within certain specification limits. Compatibility with the specification or the expected value can by tested by checking whether or not the nominal point P„omi„ai is contained in the JCR of the corresponding RF parameters. Given the conditions above (nominal/expected values known or given by specification), a (simpler) test of the En criterion separately for each of the parameters against the corresponding critical value may help assessing compatibility. One should bear in mind that RF parameters are normally correlated, and that the simple test grossly neglects these correlations. This may lead in certain
134
situations to different outcomes of the test, especially when the nominal point is close to, or slightly beyond, the limits of the JCR. Finally, if the data sets {YJ and {Xj} should also coincide one-by-one (i.e. each Xj to the corresponding Yh see the case of equipment qualification), an average En can be helpful in assessing this mutual compatibility. Table 1 gives an overview on the various assessment criteria, the critical values, and the particular property (of the reference functions) which can be assessed by using this criterion. Expressions for the different criteria are given in chapter 3 for the case considered. Table 1: Criteria for the assessment of data compatibility
Criterion
critical value
criterion tests for
Residual SSD
~ 2{n - k)
global compatibility between all data and the fitted ICRF local compatibility between each data point and the fittted ICRF, indication of possibly outlying points compatibility between two reference functions, including correlations
GoF
Joint confidence region (JCR)
'
2
CPJCR1 e JCR2 A
CPJCR2 6 JCR1 * nominal ^ JV--.K-
En for each of the parameters
2
Average E„
2
bias detection (if nominal or expected values are known, or specifications to be fulfilled), including correlations bias detection (if nominal or expected values are known, or specifications to be fulfilled), neglecting correlations global compatibility between data sets (if data must coincide one-byone)
where n is the number of data points, and k the number of ICRF parameters.
3.
Example: Equipment qualification
Modern measuring equipment is capable to determine values for the measurand under consideration within certain (and in many cases quite considerable) measuring ranges. Equipment qualification and/or validation includes the provision of evidence for the trueness of the measurement results or, in other words, the
135 absence of (systematic) bias. In particular, this can be done by comparing the equipment under test with a reference equipment. Both instruments take measurements on a series of objects, and the results are then compared. Intercomparisons of this kind, either direct or by using a transfer or travelling standard, are quite common in many fields of measurement and testing.
X3 corrected, jr, VC c o r r e c t e d
X,, r, a t time (2) VC at time (2)
Pot Pi a t time (1) VC(p) a t time (1)
Figure 1. Schematic layout of equipment intercomparison: A transfer standard is tested against a reference equipment at a time ti, then transferred to the location of the equipment to be tested, and the latter is tested against the transfer standard at time t2.
In a direct comparison, one would expect a similar behaviour of both instruments. Even if the response functions of the instruments are different, the property value determined for an object should be the same regardless of whether it was determined using the reference instrument or the instrument under test. If these property values are plotted against each other, one can expect a strictly linear dependence, i.e. a straight line. Furthermore, the nominal values of the line parameters are also known, namely unity for the slope and zero for the intercept. The comparability is then assessed by the closeness of intercept and slope to their corresponding expectations, within the corresponding uncertainties. The evaluation procedure should thus provide best estimates for the line parameters and account for the full uncertainty/correlation structure of the data. Only GLS can comply with these requirements When a transfer standard is involved (as schematically shown in figure 1), measurements taken by the transfer standard at the time tj and the location of the equipment under test must additionally be corrected based upon the transfer standard performance demonstrated at time /; against the reference equipment.
136 The calculation of the corrected variance/covariance matrix of the {Yh Xj} data again requires full account of the uncertainty/correlation structure of the involved input variables. While the application of GLS to equipment qualification has already comprehensively been described in [10] for the case of zero or negligibly small correlations in the {Yu Xt} data set, little has been done for further extending the model to take correlations into account. Looking at the origins of such correlations (see e.g. [11]) one may conclude that any measurement equation may be divided into two parts, namely i) a functional dependence/,,^ on parameters pi... pk which are individual and independent for any measurement taken, and ii) a functional dependence fco(pj) on parameterspk+i-.- p„ which are common for all measurements. Thus, any measurement result is a product of the form X
=fin(Pj) fco(Pj)
(3)
the uncertainty of the result also consists of a part w,„ originating from the individual parameters, and a part uco originating from the common parameters giving rise to a non-zero covariance between two measurements xi and x2. Let the functions in (3) simply be two factors each coming with the corresponding uncertainty, then C0V(X,, X2) =fm,l' fin, 2 ' uj
(4)
Depending on the structure of the measurement equation and the relative contribution of the parameters to the combined uncertainty, the corresponding correlation coefficients may reach considerable values (between 0.5 and 0.95). The full variance/covariance matrix of the {Yt, Xj) data set takes the form indicated below, where the upper right and lower left block (which describe correlations between the dependent and the independent variable) contain only zeros, but the upper left and lower right blocks are non-zero both for the diagonal and the offdiagonal elements. It has been shown in [12] that taking into account the full variance/covariance matrix in the determination of the function parameters is still not well understood and may lead to instabilities and unexpected effects. It was therefore decided to exclude correlations from the parameter estimation but include the full variance/covariance matrix in the determination of parameter uncertainties. Specific software enabling the sumptuous calculations was developed.
137
VC(Yh XJ = u2(Y,) u2(Yz)
covfoYj) 0 u2 (¥„.,)
covftjj)
u2(Y„) u2(X,) u2(X2)
covfX.Xj)
0 cov^Xj)
u2(X„.,)
The approach was applied to data obtained in an intercomparison of gas analysers, both direct and via a transfer standard. The uncertainty/correlation structure referred to a simple model with constant relative uncertainty, i.e. the covariance between two measurement results x, and x2 was equal to the product of xi and x2 times an individual factor for each analyser. Uncertainty of intercept and slope, r - 0.69
number of sampling points
Figure 2. Uncertainty of intercept (triangles, left scale) and slope (squares, right scale) in dependence on the number of sampling points for a moderate correlation. The corresponding uncertainties for the case without correlations are also shown (dotted lines). The grey line (circles, left scale) represents the uncertainty of a determination in the vicinity of the centre of the measuring range.
138 Measurements on a large number of objects were taken by two of the analysers involved. Data evaluation started with the complete data set, subsequently the data set was systematically reduced by leaving out sampling points (e.g. every fourth, third, or every other data point). The results for the uncertainties of intercept and slope are shown in figure 2. From the comparison between the uncertainties of intercept and slope calculated with and without accounting for data correlation, it can clearly be seen that i) the impact of correlations on the intercept uncertainty is much less than, or even negligible compared with, the impact on the slope uncertainty, and ii) correlations substantially reduce the gain in performance (i.e. the reduction of uncertainty) from an increase in the number of sampling points. No further significant reduction in the size of the slope uncertainty is already observed for 10-15 sampling points upwards. When looking at the uncertainty of a determination based upon the regressed correction function of the comparison, a remarkable plateau is observed starting from 10 sampling points upwards. Uncertainty of intercept and slope, r = 0.91 ®~$
%
®
®
®
%
®
.# 0.012
B-H
B
B
B
B
B
H
H
0.008 £
ru
«
'
t c |
£a "'••0
-0.004
" "
,—,—,—.
0
|
10
.
Q
- - - B - .
H B
---B---B-..
H
.—_*—,—.———.—i—•—•—i——i—•—J-0
20
30
40
number of sampling points
Figure 3. Uncertainty of intercept (triangles, left scale) and slope (squares, right scale) in dependence on the number of sampling points for a large correlation. The corresponding uncertainties for the case without correlations are also shown (dotted lines). The grey line (circles, left scale) represents the uncertainty of a determination in the vicinity of the centre of the measuring range.
For larger correlations, this effect is amplified as can be seen from figure 3 which displays the same values but for a very large correlation in the data structure (correlation coefficient r - 0.91). The conclusion can be drawn that a clear
139 optimum for the number of data points exists (in the sense of attainable precision compared to efforts spent) if only non-negligible correlations are present in the data structure. A number of analysers were tested against the reference equipment (both direct and via a transfer standard), and the data assessed as described in chapter 2 and 3. Since nominal values for intercept and slope determined from each comparison are known, the assessment criteria listed in chapter 2 could be applied in the form as given in table 2. Table 2: Expressions for the assessment criteria
Criterion
Equation
Residual SSD GoF Joint confidence region (JCR) En for each of the parameters Average En
SSD^±'/X:-.X,/:inx.j
• i'/r:-i(x.. pjl: in),)
GoF = max{\Xt-x,\/ufXJ,
|Y,-
8pT VC1 8p = 2 F(0.05.
2,n-2)
ffc)\/u(Yd)
En (Po)= fpo - OJMPo); E„ (p,)= fpi Enavemge = ZfXi - Ytf/rfQQ +u2(Yi)]/n
IJMPJ)
where VC is the variance/covariance matrix of the parameters, and dp a vector in the parameter space Sp = {pt- po, py - Pi}T with p0 and pt being intercept and slope as determined from regression (the centre of the JCR).
Each comparison had to comply with the criteria for the residual SSD and the goodness-of-fit (GoF), otherwise the analyser under test was rejected for non-linear behaviour or lack of precision. Compatibility of the analyser under test with the reference equipment was assessed from the criteria for both the average and the separate En numbers. The performance of the JCR criterion can be seen from figure 4 which displays i) the points [po,Pi]j for some of the analysers tested against the JCR of the mean, and ii) the individual JCR of one of the analysers involved against the rest and the nominal/expected point [0, 1]. While the selected analyser fails to comply with the criterion of containing the nominal values in its individual JCR and thus must be characterised as revealing some bias, the same analyser is still within the JCR of the mean of the data ensemble and, consequently, compatible with the consensus value formed by the group considered here. This instance can quite often be observed for consensus-value evaluations of intercomparisons, namely that a parallel evalua-
140
tion against reference values (if available) would unveil more bias and lead to partially different conclusions for the participating facilities. In the example given, the uncertainty stated for the selected analyser was obviously to small to cover the observed bias.
-*-1——
slope
— •
- I -
1
slope
Figure 4. Results (intercept and slope) for selected analysers: Left: Data ensemble including an outlier against the JCR of the mean (outlier in black, JCR centre indicated by a cross). Right: Selected analyser and its JCR against the rest (without the outlier).
4.
Conclusions
The approach presented here is a possible way to tackle the problem of calculating intercomparison reference functions, and assessing participant's compatibility with the reference. GLS has proven to provide best estimates for the function parameters. The technique is capable, in principle, to account for the full uncertainty/correlation structure of the data obtained in the intercomparsion. Existing GLS implementations neglected correlations in the dependent and/or independent variable, and the algorithm had to be extended significantly in order to take the uncertainty/correlation structure on board. Software implementing the algorithm is available from the author. References 1. 2. 3.
ISO Guide 35. ISO Geneva, 1989. ISO 5725: Accuracy (trueness and precision) of measurement methods and results, Part 1 and 2. ISO Geneva, 1994. ISO/DIS 13528: Statistical methods for use in proficiency testing by interlaboratory comparisons. ISO Geneva.
141 4. 5. 6. 7. 8. 9. 10. 11. 12.
M. G. Cox, A discussion of approaches for determining a reference value in the analysis of key-comparison data. NPL Report CISE 42/99. Guide to the Expression of Uncertainty in Measurement, ISO Geneva, 1995. W.E. Deming, Statistical Adjustment of Data. John Wiley & Sans, Inc., 1946 J. Riu and F.X. Rius, J. Chemometrics, vol. 9, no. 5 (1995), 343. ISO 6143:2001, Gas analysis - Comparison methods for determining and checking the composition of calibration gas mixtures. ISO Geneva. A.G. Gonzalez and A.G. Asuero, J Anal Chem (1993) 346:885-887. B.D. Ripley and M. Thompson, The Analyst, 112(1987), 377. W. Haesselbarth, and W. Bremser, Accred Qual Assur (2004) 9:597-600 W. Bremser and W. Haesselbarth, Accred Qual Assur (1998) 3:106-110.
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 142-150)
VALIDATION OF SOFT SENSORS IN M O N I T O R I N G AMBIENT PARAMETERS
PATRIZIA CIARLINI, G I U S E P P E REGOLIOSI IAC-CNR,
Viale del Policlinico
137 00161 Rome,
Italy
UMBERTO MANISCALCO ICAR-CNR,
Viale delle Scienze
90128 Palermo,
Italy
The soft sensor technology is an innovative tool for obtaining measurements in complex experimental conditions. Recently it has been introduced in the field of cultural heritage to solve the problem of a non-invasive ambient monitoring. This work shows from a statistical point of view how well are estimated the physical atmosphere parameters in specific points of the monument by an Elman network. The accuracy of the virtual measures is analysed by means of a two-phase procedure, based on suitable substitution errors.
1. Introduction In modern system theory soft sensors are one of the most useful and innovative tools, being applicable in an extraordinary variety of applications. The term soft sensor (also named virtual sensors) refers to a technological mechanism able to reconstruct missing data or obtain data in difficult experimental conditions. Different kinds of soft sensors are already used in the industrial production process control [1] and [2], or measure in complex industrial plants [3]. In the field of the conservation of cultural heritage is very important to have not invasive tools for monitoring physical or chemical conditions of materials composing a monument, since damages are mainly due to interactions among the materials and the atmosphere. In order to monitor ambient parameters, such as humidity, on a monument for a long time, several sensors should be maintained and surveyed many times. Such actions may be too expensive or particularly invasive and reduce the enjoyment of the monument itself. To solve the problem of a non-invasive monitoring the computational tool of neural network can be adopted (see [4]), in particular a recursive network was firstly introduced in [5]. This research, developed
143
in the framework of a project financed by RAVA (Val d'Aosta Region), has designed and trained a connectionist system as a set of soft sensors, whose prototype is now working for the monitoring of roman theater in Aosta. The aim of this work is to show from a metrological and statistical point of view how well the physical atmosphere parameters are estimated by the soft sensors in some specific locations on the monument. Section 2 briefly introduces the Elman neural network and section 3 the soft sensors, designed for the application in cultural heritage. Section 4 describes the two-phase procedure identified to validation the soft sensor performances. The accuracy of the virtual measures for a specific ambient parameters and a rich test set, is finally discussed in Section 5. 2. The Elman neural network An Artificial Neural Network (ANN) is a powerful information processing paradigm that is composed of a large number of highly interconnected processing elements (neurons) working in unison to solve specific problems and learning by example. An ANN is configured for a specific application, through a learning process and it is able to derive meaning from complicated or imprecise data. A single neuron has two modes of operation: the training mode and the using mode. In the first mode, the neuron is trained to give an output activation value, depending on specific input vectors. In the using mode the neuron produces an output depending on both the input and the learnt structure, which is coded as sets of weights and biases. Connecting several neurons by a suitable architecture and topology we can obtain an intricate computational structure able to manage, for example, complex relationship between sets of data. A trained ANN can be thought of as an "expert" in the category of information it has been given to analyse, for example in extracting patterns and detecting complex trends, but also able to answer in new cases. The Elman [6] neural network is a particular neural network topology (see Fig. 1) in which the output depends also on the state of the net realized by a context layer of neurons and on a loop of feedback (represented by D in the figure) between the hidden layer and the context layer as described by the Eq. 1. uk = tansig(Wi * ik + W'{ * y£_i + b i )
(l)
yk = purelin(W2 * uk + 62) where, uk is the output of the hidden layer, W[ is the matrix of the weights that connect the input layer to the hidden layer, yk_i is the output of the
144
Figure 1.
Elman neural network schema.
hidden layer, W" is the matrix of the weights that connect the context layer to the hidden layer, i is the input of the net, W2 is the matrix of the weights that connect the hidden layer to the output layer, b\ and 62 are the biases of the of the hidden layer and the output layer respectively and yk is the output of the net. Usually, recursive algorithms are used to estimates the network weights and biases in the non linear dynamic system of Eq. 1. For a chosen topology and i/o links, a training phase for each neural network must be performed to estimate the weight matrices and biases in Eq. 1. In the training phase a sufficiently large set of chronological data (training set) must be used to obtain good performances. Since our soft sensors have to deal with physical phenomena that have a temporal inertia, recursive neural network appears to be the most appropriate choice. 3. Soft sensors to estimate ambient parameters values In our specif application, the soft sensors estimate ambient parameter values in given positions on the monument, using only the measurements acquired by an Air Ambient Monitor Station (AAMS) located nearby the monument. The designing idea of using a specific neural network to realize a specific soft sensor, has produced the connectionist system in Fig. 2, whose computational units have been independently trained and tuned. The training data set and the test set contain measurements of the atmosphere parameters acquired both by an AAMS and by hard sensors in four positions on the theater. The sets are formed by ordered pairs of measurements at the same time t (m, r ) . From the training set, each soft sensor learns the associations among ambient parameters, measured by the
145
Figure 2. The AAMS and the connectionist system topology with its i/o variable dependences. Ta, Tc and H are the estimated air temperature, monument surface temperature and air humidity in 4 different locations on the monument surface.
AAMS and by a hard sensor on the monument, to predict values at a specific location on the monument at later times. These predictions are then validated using the test set. For example, in the using mode the trained soft sensor that estimates the contact temperature at the third location on the monument has the following compact notation:

T̂c3(t) = f(Ta(t), h(t); Θ)    (2)

where f represents the specific trained Elman neural network of Eq. 1, whose output is computed from the hidden-layer activations. Here m is a vector containing the ambient temperature Ta, measured by the AAMS, and the hour of the day h at which Ta is acquired, while Θ represents all the inner parameters, such as the number of neurons, the weight matrices and the biases, which were estimated in the training phase. We want to underline that the soft sensor, say s(t) = f(m; Θ), can be viewed as an instrument that provides indirect measurements for two reasons: the location of the estimated output value is always different from the location of the input measurements (AAMS), and the output quantities are given by a transformation of the input ones, which can be of different types, such as Tc3 and Ta in Eq. 2. However, a classical methodology adopted to evaluate the performances of real instruments cannot be directly transposed to the context of soft sensors.
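To make the recurrence of Eqs. 1-2 concrete, the following minimal sketch implements one forward step of an Elman-type soft sensor. It is not the authors' trained network: the layer sizes, the random weights, the helper name elman_soft_sensor_step and the example inputs (Ta, hour of day) are illustrative assumptions; tansig and purelin follow the usual MATLAB-style definitions (hyperbolic tangent and identity).

```python
import numpy as np

def tansig(x):
    # Hyperbolic-tangent sigmoid activation, as in Eq. (1)
    return np.tanh(x)

def elman_soft_sensor_step(i_k, y_prev_hidden, W1_in, W1_ctx, b1, W2, b2):
    """One time step of the Elman soft sensor of Eqs. (1)-(2).

    i_k           : input vector m(t), e.g. (Ta(t), h(t)) from the AAMS
    y_prev_hidden : previous hidden-layer output, held in the context layer
    W1_in, W1_ctx : input-to-hidden and context-to-hidden weight matrices
    b1, W2, b2    : hidden bias, hidden-to-output weights, output bias
    """
    u_k = tansig(W1_in @ i_k + W1_ctx @ y_prev_hidden + b1)  # hidden layer
    y_k = W2 @ u_k + b2                                      # purelin output
    return y_k, u_k

# Illustrative use: 2 inputs (Ta, hour), 8 hidden neurons, 1 output (e.g. Tc3)
rng = np.random.default_rng(0)
W1_in, W1_ctx = rng.normal(size=(8, 2)), rng.normal(size=(8, 8))
b1, W2, b2 = np.zeros(8), rng.normal(size=(1, 8)), np.zeros(1)
context = np.zeros(8)
for Ta, hour in [(12.3, 10), (13.1, 11), (13.8, 12)]:
    y, context = elman_soft_sensor_step(np.array([Ta, hour]), context,
                                        W1_in, W1_ctx, b1, W2, b2)
```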
4. The procedure to evaluate the soft sensor performances
The usual procedure for evaluating the performance of a neural network can be applied to our soft sensors. It consists of a statistical validation of the results obtained for a new test set, i.e. statistical evaluators able to assess the success of the forecasting for a new test set, which must be "comparable and compatible" with the training set used. These evaluators mostly compute overall values that may hide some particular behaviours, such as the response in case of a rare event, i.e. an event occurring a small number of times in the set. We introduce a validation procedure for the soft sensor performances which is driven by the idea that each soft sensor will substitute a hard sensor at a specific location for long-term monitoring. Therefore, we propose a validation procedure in two phases: a statistical phase based on specific estimators, and a validation by comparison. Let us define the substitution error es(t) = s(t) - r(t) between the value estimated by the soft sensor, s(t), and the measurement acquired by a hard sensor, say r(t), placed at the specific location on the monument. We want to characterize the behaviour of es in the observed range of each ambient variable in m, i.e. the observed range of Ta(t) or of h(t) in Eq. 2. Moreover, we characterise the soft sensor error using several estimators analogous to the ones used in the specifications (data sheet) of a real instrument. Let us subdivide the range of an observed variable into C subintervals Ic, c = 1, ..., C, of equal length; consequently the test set is subdivided into C subsets and for each Ic the substitution errors e^c_s are computed. In the first phase, the statistical analysis of the soft sensor is performed using the vector es, all the e^c_s and the following evaluators for a test set of N pairs of data (a sketch of their computation is given after this list):
• ratio RR between the ranges of s and r;
• minimum, maximum, mean and standard deviation values of es;
• CC, the correlation coefficient [s, r];
• ratio Re between the number of estimates lower than a fixed value e and N;
• ratio RE between the number of estimates greater than a fixed value E and N;
• mean and standard deviation values for each e^c_s.
The second phase of the validation is linked to the consideration that the AAMS is a particular soft sensor, where no physical modelling feature is involved.
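The sketch below computes the evaluators listed above for a test set of N pairs (s(t), r(t)). It is an illustrative implementation, not the authors' code: the function name, the default thresholds e = 1 °C and E = 3 °C (the values used later in Table 1) and the choice of 45 equal sub-intervals are assumptions taken from the example of Section 5.

```python
import numpy as np

def substitution_error_evaluators(s, r, x, e=1.0, E=3.0, n_bins=45):
    """Evaluators of the first validation phase for N pairs (s, r).

    s : soft-sensor estimates, r : hard-sensor reference values,
    x : observed ambient variable used for the sub-intervals Ic
        (e.g. Ta or the hour h), e, E : thresholds for Re and RE.
    """
    s, r, x = (np.asarray(v, float) for v in (s, r, x))
    es = s - r                                  # substitution error e_s(t)
    RR = np.ptp(s) / np.ptp(r)                  # ratio of the ranges of s and r
    CC = np.corrcoef(s, r)[0, 1]                # correlation coefficient [s, r]
    Re = np.mean(np.abs(es) < e)                # fraction of errors below e
    RE = np.mean(np.abs(es) > E)                # fraction of errors above E
    # mean and standard deviation of e_s in each sub-interval Ic of x
    edges = np.linspace(x.min(), x.max(), n_bins + 1)
    idx = np.clip(np.digitize(x, edges) - 1, 0, n_bins - 1)
    per_interval = {c: (es[idx == c].mean(), es[idx == c].std())
                    for c in range(n_bins) if np.any(idx == c)}
    return {"RR": RR, "abs_min": np.abs(es).min(), "abs_max": np.abs(es).max(),
            "mean": es.mean(), "std": es.std(), "CC": CC,
            "Re": Re, "RE": RE, "per_interval": per_interval}
```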
The monitoring operation could be achieved without the placement of any soft or hard sensors on the monument, using only the AAMS surveys as the unique source of information. In this respect the corresponding substitution error eA(t) can be computed:

eA(t) = g(m(t)) - r(t)    (3)
where g(m) can simply be a selection function that returns the value of the investigated variable in the vector m, or a physical formula used to obtain the investigated value. To assess whether our soft sensor performs better than the AAMS, a comparison of the error evaluators can be performed.
5. Experimental results
The twelve soft sensors for the ambient parameters (in Fig. 2) have been analysed according to the two-phase procedure. The chosen test set contains N = 3960 surveys belonging to a period of one year after the training year; the measures are taken hourly, but missing data for some hours are possible. The statistical results for the ambient parameter Tc3 are reported in Table 1 and in the figures: Fig. 3 shows the histograms of eA and es, Fig. 4 the plots of the mean values and standard deviations of e^c_A and e^c_s in the temperature range, c = 1, ..., 45, with Ic of length 1 °C, and Fig. 5 the plots of the same values during the day, i.e. in the hour range c = 1, ..., 24. The right column in Table 1 and the histogram show that the soft sensor reproduces the phenomenon with errors having a symmetric shape with short and thin tails. In fact, about 84% of the estimation errors are < 1 °C. In particular, the plot of the mean values of e^c_s (Fig. 4) shows how the soft sensor maintains its mean error around zero in the range of the observed temperature and throughout the day. The standard deviation values of e^c_s show a nearly constant behaviour, lying in the interval [0.5 °C, 1 °C] for most of the surveys. A worse behaviour only occurs at the lowest temperature values: it may be due to the occurrence of rare events at these temperatures in the training set. An important result of this first phase is the validation of the data modelling achieved by the soft sensor, since the errors show a Gaussian-like distribution with zero mean, low standard deviation and short tails. The second phase is based on the comparison of the two columns in Table 1 and of the plots in Figs. 3, 4 and 5. The comparison analysis assesses a better performance of the soft sensor than the AAMS for every
Table 1. Statistical validation for the soft sensor Tc3: comparison between the AAMS and the Elman soft sensor.

Statistical evaluator    AAMS substitution error    Soft sensor substitution error
RR                       1.3473                     1.0963
|min(e)|                 0 °C                       0.00013 °C
|max(e)|                 9.1600 °C                  4.0318 °C
mean(e)                  1.7341 °C                  0.0204 °C
std(e)                   2.7668 °C                  0.7520 °C
CC                       0.9574                     0.9947
Re (< 1 °C)              0.1346                     0.8377
RE (> 3 °C)              0.4325                     0.0023
Figure 3. Tc3 error distributions for the AAMS sensor (left) and for the soft sensor (right).
Figure 4. Mean values of e^c_s (solid line) and e^c_A (dashed line) for Tc3 over the 45 intervals of the temperature range (left); standard deviations of e^c_s (solid line) and e^c_A (dashed line) (right).
Figure 5. Mean values of e^c_s (solid line) and e^c_A (dashed line) for the 24 hour intervals of h (left); standard deviations of e^c_s (solid line) and e^c_A (dashed line) (right).
estimator (note in particular the large reduction of the standard deviation of the error and the absence of any periodic trend in the plots of the mean values). The comparison phase makes evident the gain obtained by using the soft sensor instead of the AAMS. The validation by comparison has been applied to the twelve soft sensors and each of them showed better results than the AAMS for the test set used.
6. Conclusions
The soft sensor based on the Elman recursive mechanism constructed an implicit physical model that is correct, since the virtual measurements are substantially without bias and in the same range (temperature, time) as the real data. The proposed validation procedure analysed the performances of the soft sensors from a statistical and metrological point of view. It highlighted the ability of the soft sensors to monitor complex ambient phenomena for a long period, using the trained neural networks and the AAMS measures acquired near the monument. The differences between the virtual measures and the real ones have been considered negligible by the cultural heritage experts. Therefore our soft sensors can be used to replace the physical sensors on the monument in the near future.
Acknowledgment
We would like to express our warmest thanks to Dr. Lorenzo Appolonia, "Soprintendenza della Regione Valle d'Aosta", and to Prof. Antonio Chella, DINFO, University of Palermo, for their fundamental contributions.
References
1. H. Ressom et al., Proc. of IEEE International Conference on Systems, Man, and Cybernetics, 1745 (2000).
2. M. J. Willis et al., Automatica 28 (6), 1181 (1992).
3. J. Sevilla, C. Pulido, in Proc. Instrumentation and Measurement Technology Conference, IEEE 1, 293 (1998).
4. U. Maniscalco, in Neural Nets, Series on Lecture Notes in Computer Science, Proc. 15th Italian Workshop on Neural Nets, Perugia, 2004 (to appear).
5. A. Chella, U. Maniscalco, IEEE Trans. on Instrum. and Measurement (submitted).
6. J. Elman, Machine Learning, 7, 2/3, 195 (1991).
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 151-160)
EVALUATION OF STANDARD UNCERTAINTIES IN NESTED STRUCTURES
EDUARDA FILIPE
Instituto Portugues da Qualidade - Laboratorio Central de Metrologia, Caparica, Portugal
The realization of the ITS-90 scale requires that the laboratories usually have more than one cell for each fixed point. These cells must be compared on a regular basis and the experiments repeated on subsequent days. The measured differences are obtained with two or three standard thermometers, and this run may be repeated some time later. The uncertainty calculation should take into account these time-dependent sources of variability, using the Analysis of Variance for designs consisting of nested or hierarchical sequences of measurements. The technique and the estimation of the variance components are described. An application example of a cell comparison is drafted.
1. Introduction The Temperature Laboratories for the realization of International Temperature Scale of 1990 (ITS-90) usually have more than one cell for each fixed point. The Laboratory may consider one of the cells as a reference cell and its reference value, performing the other(s) the role of working standard(s). Alternatively, the Laboratory may consider its own reference value as the average of the cells. In both cases the cells must be compared at a regular basis and the calculation of the uncertainty of these comparisons performed. A similar situation exists when this Laboratory compares its own reference value with the value of a travelling standard during an inter-laboratory comparison. These comparisons experiments are usually performed with two or three thermometers to obtain the differences between the cells. The repeatability measurements are performed each day at the equilibrium plateau and the experiment is repeated in subsequent days. In the case of internal comparisons, the experiment is repeated some months after. The uncertainty calculation will take into account these time-dependent sources of variability, arising from short-term repeatability, the day-to-day reproducibility and the long-term random variations in the results [1]. These components of uncertainty are evaluated by the statistical analysis of the data
obtained from the experiment using the Analysis of Variance (ANOVA) for designs consisting of nested or hierarchical [2] sequences of measurements. ANOVA is defined [2] as a "technique which subdivides the total variation of a response variable into meaningful components, associated with specific sources of variation". In the application example that we work out, we use a nested design. The estimation of the variance components is drafted in order to evaluate the standard uncertainty obtained by a Type A method in the comparison of two triple point of water cells.
2. General Principles and Concepts
Experimental design is a statistical tool concerned with planning experiments so as to obtain the maximum amount of information from the available resources [3]. This tool is generally used for the improvement and optimisation of processes [4], where the experimenter, by controlling the changes in the inputs and observing the corresponding changes in the outputs, is able to draw statistical inference from the evaluation of the null hypothesis (H0) that the outputs are equal on average. The decision is taken at a declared significance level α (or, alternatively, the p-value corresponding to the observed value of the test statistic is computed), also known as the "producer's" risk.
Figure 1. Nested or hierarchical design (factors P, D and T, with the measurements at the lowest level; here P = 2 and D = 10).
In addition, it can be used to test the homogeneity of a sample (or samples) at the same significance level, to identify results that can be considered as "outliers", or to evaluate the components of variance between the "controllable" factors. In the comparison experiment to be described, the factors are the standard thermometers, the measurements on subsequent days and the run (plateau) measurements. These factors are considered as random samples of the population about which we wish to draw conclusions.
2.1. The Nested or Hierarchical Design. General Model
The nested design is defined [2, 5] as "the experimental design in which each level of a given factor appears in only a single level of any other factor". The objective of this model is to deduce the values of the variance components that cannot be measured directly. The factors (see Fig. 1) are hierarchized like a "tree", and any path from the "trunk" to the "extreme branches" will find the same number of nodes. In this design, each factor is analysed with the one-way analysis of variance model, nested in the subsequent factor.
2.1.1. Model for one factor
Considering firstly only one factor with a levels taken randomly from a large population [3, 6], any observation made at the i-th factor level, with n observations per level, will be denoted by y_ij. The mathematical model that describes the set of data is:

y_ij = M_i + ε_ij = µ + τ_i + ε_ij,    i = 1, 2, ..., a;  j = 1, 2, ..., n    (1)
where M_i is the expected (random) value of the group of observations i, µ the overall mean, τ_i the parameter associated with the i-th factor level, designated the i-th factor effect, and ε_ij the random error component. This model with random factors is called the random effects or components-of-variance model. For the hypothesis testing, the errors and the factor effects are assumed to be normally and independently distributed, respectively with mean zero and variance σ², i.e. ε_ij ~ N(0, σ²), and with mean zero and variance σ_τ², i.e. τ_i ~ N(0, σ_τ²). The variance of any observation is composed of the sum of the variance components, according to:

σ_y² = σ_τ² + σ²    (2)
The test is unilateral and the hypotheses are:

H0: σ_τ² = 0,    H1: σ_τ² > 0    (3)

that is, if the null hypothesis is true, all factor effects are "equal" and each observation is made up of the overall mean plus the random error ε_ij ~ N(0, σ²). The total sum of squares, which is a measure of the total variability in the data, may be expressed by:
Σ_{i=1}^{a} Σ_{j=1}^{n} (y_ij - ȳ)² = Σ_{i=1}^{a} Σ_{j=1}^{n} [(ȳ_i - ȳ) + (y_ij - ȳ_i)]²
    = n Σ_{i=1}^{a} (ȳ_i - ȳ)² + Σ_{i=1}^{a} Σ_{j=1}^{n} (y_ij - ȳ_i)² + 2 Σ_{i=1}^{a} Σ_{j=1}^{n} (ȳ_i - ȳ)(y_ij - ȳ_i)    (4)

As the cross-product term is zero [4], the total variability of the data (SS_T) can be separated into the sum of squares of the differences between the factor-level averages and the grand average (SS_Factor), a measure of the differences between factor levels, and the sum of squares of the differences of the observations within a factor level from the factor-level average (SS_E), due to the random error. Dividing each sum of squares by the respective degrees of freedom, we obtain the corresponding mean squares (MS):

MS_Factor = n Σ_{i=1}^{a} (ȳ_i - ȳ)² / (a - 1),    MS_Error = Σ_{i=1}^{a} Σ_{j=1}^{n} (y_ij - ȳ_i)² / [a(n - 1)]    (5)
The mean square between factor levels (MS_Factor) [7] is an unbiased estimate of the variance σ² if H0 is true, or an overestimate of σ² (see Eq. 7) if it is false. The mean square within factor levels (MS_Error) is an unbiased estimate of the variance σ². In order to test the hypotheses, we use the statistic:

F0 = MS_Factor / MS_Error ~ F_{α, a-1, a(n-1)}    (6)
where F_α is the Fisher-Snedecor sampling distribution* with a - 1 and a(n - 1) degrees of freedom. If F0 > F_{α, a-1, a(n-1)}, we reject the null hypothesis and conclude that the variance σ_τ² is significantly different from zero at the significance level α. The expected value of MS_Factor is [4]:

E(MS_Factor) = E[SS_Factor / (a - 1)] = σ² + n σ_τ²    (7)
The variance component of the factor is then obtained by:
* F distribution (sampling distribution): if χ_u² and χ_v² are two independent chi-square random variables with u and v degrees of freedom, then the ratio F_{u,v} = (χ_u²/u)/(χ_v²/v) is distributed as F with u numerator and v denominator degrees of freedom.
σ_τ² = [E(MS_Factor) - σ²] / n    (8)
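As an illustration of Eqs. 5-8, the short sketch below estimates the one-factor variance component from an a × n array of observations. It is only a sketch under the stated model; the function name is an assumption, and the truncation of a negative estimate to zero is a common convention that is not discussed in the text.

```python
import numpy as np

def one_factor_components(y):
    """Random-effects one-way ANOVA of Eqs. (5)-(8).

    y : array of shape (a, n) -- a factor levels, n observations each.
    Returns (MS_factor, MS_error, F0, sigma2_tau).
    """
    a, n = y.shape
    grand, level = y.mean(), y.mean(axis=1)
    MS_factor = n * np.sum((level - grand) ** 2) / (a - 1)          # Eq. (5)
    MS_error = np.sum((y - level[:, None]) ** 2) / (a * (n - 1))    # Eq. (5)
    F0 = MS_factor / MS_error                                       # Eq. (6)
    sigma2_tau = max((MS_factor - MS_error) / n, 0.0)               # Eq. (8)
    return MS_factor, MS_error, F0, sigma2_tau
```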
2.1.2. Model for three factors
Considering now the three-stage nested design of Fig. 1, the mathematical model is:

y_pdtm = µ + π_p + δ_d + τ_t + ε_pdtm    (9)
where y_pdtm is the (pdtm)-th observation, µ the overall mean, π_p the p-th random level effect of factor P, δ_d the d-th random level effect of factor D, τ_t the t-th random level effect of factor T, and ε_pdtm the random error component. The errors and the level effects are assumed to be normally and independently distributed, respectively with mean zero and variance σ², i.e. ε ~ N(0, σ²), and with mean zero and variances σ_P², σ_D² and σ_T². The variance of any observation is composed of the sum of the variance components, and the total number of measurements, N, is obtained by the product of the dimensions of the factors (N = P × D × T × M). The total variability of the data [8, 9] can be expressed by:

Σ_p Σ_d Σ_t Σ_m (y_pdtm - ȳ)² = DTM Σ_p (ȳ_p - ȳ)² + TM Σ_p Σ_d (ȳ_pd - ȳ_p)²
    + M Σ_p Σ_d Σ_t (ȳ_pdt - ȳ_pd)² + Σ_p Σ_d Σ_t Σ_m (y_pdtm - ȳ_pdt)²    (10)

or SS_T = SS_P + SS_{D|P} + SS_{T|DP} + SS_E.
This total variability of the data is the sum of squares of factor P (SS_P), the P-factor effect, plus the sum of squares of factor D within the same P (SS_{D|P}), plus the sum of squares of factor T within the same D and the same P (SS_{T|DP}), and finally SS_E, the residual variation. Dividing by the respective degrees of freedom, P - 1, P(D - 1), PD(T - 1) and PDT(M - 1), we obtain the mean squares of the nested factors, which would all be estimates of σ² if there were no variation due to the factors. The estimates of the components of the variance are obtained by equating the mean squares to their expected values and solving the resulting equations:
E(MS_P)      = E[SS_P / (P - 1)]           = σ² + M σ_T² + TM σ_D² + DTM σ_P²
E(MS_{D|P})  = E[SS_{D|P} / (P(D - 1))]    = σ² + M σ_T² + TM σ_D²
E(MS_{T|DP}) = E[SS_{T|DP} / (PD(T - 1))]  = σ² + M σ_T²
E(MS_E)      = E[SS_E / (PDT(M - 1))]      = σ²    (11)
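The decomposition of Eq. 10 and the estimates obtained from Eq. 11 can be computed directly from a data array indexed by (plateau, day, thermometer, measurement). The sketch below is an illustrative implementation, not the spreadsheet used by the author; the function and variable names are assumptions.

```python
import numpy as np

def nested_components(y):
    """Mean squares and variance components for the P/D/T/M nested design.

    y : array of shape (P, D, T, M).
    Returns the mean squares and the estimated variance components.
    """
    P, D, T, M = y.shape
    gm = y.mean()
    m_p = y.mean(axis=(1, 2, 3))          # plateau means
    m_pd = y.mean(axis=(2, 3))            # plateau-day means
    m_pdt = y.mean(axis=3)                # plateau-day-thermometer means
    SS_P = D * T * M * np.sum((m_p - gm) ** 2)
    SS_D = T * M * np.sum((m_pd - m_p[:, None]) ** 2)
    SS_T = M * np.sum((m_pdt - m_pd[:, :, None]) ** 2)
    SS_E = np.sum((y - m_pdt[..., None]) ** 2)
    MS = {"P": SS_P / (P - 1), "D": SS_D / (P * (D - 1)),
          "T": SS_T / (P * D * (T - 1)), "E": SS_E / (P * D * T * (M - 1))}
    # equate the mean squares to their expected values, Eq. (11)
    comp = {"sigma2": MS["E"],
            "sigma2_T": (MS["T"] - MS["E"]) / M,
            "sigma2_D": (MS["D"] - MS["T"]) / (T * M),
            "sigma2_P": (MS["P"] - MS["D"]) / (D * T * M)}
    return MS, comp
```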
3. Example of the comparison of two thermometric water triple point cells in a three-stage nested experiment
3.1. Short description of the laboratory work
In the comparison of the two water cells (Fig. 2), JA and HS, we used two standard platinum resistance thermometers (SPRTs), A and B. After the preparation of the ice mantles, the cells were maintained in a thermo-regulated water bath at a temperature of 0.007 °C. This bath can host up to four cells and is able to maintain them at the triple point of water (t = 0.01 °C) for several weeks.
Figure 2. Triple point of water cell; A - Water vapour; B - Water in the liquid phase; C - Ice mantle; D -Thermometer (SPRT) well.
The ice mantle in the cells was prepared according to the laboratory procedure, 48 hours before beginning the measurements. Four measurement differences were obtained daily with the two SPRTs, and this set of measurements was repeated during ten consecutive days. Two weeks later, a second ice mantle was prepared and a complete run was then repeated (Run/Plateau 2).
3.2. Measurement Differences Analysis
Consider the data of Table 1, represented schematically in Figure 3. In this nested experiment we consider the effects of factor P from the Plateaus (P = 2), the effects of factor D from the Days (D = 10) within the same Plateau, the effects of factor T from the Thermometers (T = 2) within the same Day and the same Plateau, and the variation between Measurements (M = 2) for the same Thermometer, Day and Plateau, i.e. the residual variation.
Table 1. Measurement results of the three-stage nested experiment (values in µK)
Plateau 1                               Plateau 2
Day   SPRT A (1, 2)   SPRT B (1, 2)     Day   SPRT A (1, 2)   SPRT B (1, 2)
1     103,  93         93,  93          1     107, 117         74,  54
2      73,  78        118,  68          2     123,  48        103,  83
3      91, 126        130,  96          3     114,  64        102,  72
4      93,  88         78, 118          4      74, 119         70, 105
5     117,  97        118, 118          5      68,  58        104, 104
6     108,  98         80,  80          6      93,  83        100,  70
7     105, 100        128, 108          7      77,  89         84, 104
8      80, 110         67,  77          8      62, 112         91,  51
9      97,  92        104,  84          9      68,  63         69,  68
10    106,  96         63,  68          10     88,  95         78, 113
Figure 3. Schematic representation of the observed temperature differences.
The variance analysis is usually displayed in the ANOVA table, giving the sums of squares, the degrees of freedom, the mean squares, the expected values of the mean squares and the statistics F0 obtained by calculating the ratios of the mean squares of subsequent levels.

Table 2. Analysis of variance table, example of a comparison of two triple point of water cells in a three-level nested design

Source of variation   Sum of squares   Degrees of freedom   Mean square   F0       Expected value of mean square
Plateaus              2187.96          1                    2187.96       5.7296   σ² + 2σ_T² + 4σ_D² + 40σ_P²
Days                  6873.64          18                   381.87        1.0032   σ² + 2σ_T² + 4σ_D²
Thermometers          7613.19          20                   380.66        1.0314   σ² + 2σ_T²
Measurements          14762.50         40                   369.06                 σ²
Total                 31437.29         79
From the ANOVA Table 2, we obtain F0 values that are compared with the F distribution: for α = 5% and 1 and 18 degrees of freedom, F_{0.05,1,18} = 4.4139 for the Plateau/Run effect; for 18 and 20 degrees of freedom, F_{0.05,18,20} = 2.1515 for the Days effect; and for 20 and 40 degrees of freedom, F_{0.05,20,40} = 1.8389 for the Thermometers effect. We observe that the F0 values are smaller than the corresponding F quantiles for the Days and Thermometers factors, so those null hypotheses are not rejected. For the Plateau factor, the null hypothesis is rejected for α = 5% although not for α = 1% (F_{0.01,1,18} = 8.2855), so a significant difference exists between the two Plateau results at the 5% significance level but not at the 1% level. Equating the mean squares to their expected values, we can now calculate the variance components and include them in the uncertainty budget (see Table 3).

Table 3. Uncertainty budget for the components of uncertainty evaluated by a Type A method

Component (Type A evaluation)   Variance (µK²)   Standard deviation (µK)
Plateaus                        45.15            6.7
Days                            0.30             0.5
Thermometers                    5.80             2.4
Measurements                    369.06           19.2
Total                           420.32           20.5
These components of uncertainty, evaluated by a Type A method, reflect the random components of variance due to the factor effects.
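As a quick check of Table 3, the variance components can be reproduced from the mean squares of Table 2 by solving Eq. 11 with P = 2, D = 10, T = 2 and M = 2; the short calculation below is only a verification sketch using the published values.

```python
# Mean squares from Table 2 (P = 2, D = 10, T = 2, M = 2)
MS_P, MS_D, MS_T, MS_E = 2187.96, 381.87, 380.66, 369.06
M, T, D = 2, 2, 10

sigma2 = MS_E                            # Measurements: 369.06 uK^2
sigma2_T = (MS_T - MS_E) / M             # Thermometers: 5.80 uK^2
sigma2_D = (MS_D - MS_T) / (T * M)       # Days:         ~0.30 uK^2
sigma2_P = (MS_P - MS_D) / (D * T * M)   # Plateaus:     ~45.15 uK^2
total = sigma2 + sigma2_T + sigma2_D + sigma2_P   # ~420.3 uK^2
u_A = total ** 0.5                       # ~20.5 uK, Type A standard uncertainty
```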
3.3. Residual Analysis
It was stated (Sections 2.1.1 and 2.1.2) that in this model the errors and the level effects are assumed to be normally and independently distributed. This assumption was checked by drawing a normal probability plot (Henry line) of the obtained residuals (see Fig. 4).
3.4. Remarks
This model for "Type A" uncertainty evaluation, which takes into account these time-dependent sources of variability, is foreseen in the GUM [1]. The value obtained using this nested design, u_A = 20.5 µK (Table 3), is considerably larger than that obtained by calculating the standard deviation of the mean of the 80 measurements, u_A = 2.2 µK. This last approach is generally used in international comparisons and evidently underestimates this component of the standard uncertainty.
Figure 4. Normal probability plot of the residuals of the comparison experiment (residuals grouped in 5 µK classes).
4. Conclusion
The nested-hierarchical design was described as a tool to identify and evaluate components of uncertainty arising from random effects. Applied to measurement, it is suitable for estimating the components of standard uncertainty evaluated by a Type A method in time-dependent situations. An application of the design has been drafted to illustrate the variance components analysis in a three-factor nested model of a short-, medium- and long-term comparison of two thermometric fixed points. The same model can be applied to other numbers of factors and is easily treated in an Excel spreadsheet.
References
1. BIPM et al., Guide to the Expression of Uncertainty in Measurement (GUM), 2nd ed., International Organization for Standardization, Geneva, 1995, pp. 11, 83-87.
2. ISO 3534-3, Statistics - Vocabulary and Symbols - Part 3: Design of Experiments, 2nd ed., International Organization for Standardization, Geneva, 1999, pp. 31 (2.6) and 40-42 (3.4).
3. Milliken, G.A., Johnson, D.E., Analysis of Messy Data. Vol. I: Designed Experiments, 1st ed., London, Chapman & Hall, 1997.
4. Montgomery, D., Introduction to Statistical Quality Control, 3rd ed., New York, John Wiley & Sons, 1996, pp. 496-499.
5. ISO TS 21749, Measurement uncertainty for metrological applications - Simple replication and nested experiments, International Organization for Standardization, Geneva, 2005.
6. Guimaraes, R.C., Cabral, J.S., Estatistica, 1st ed., Amadora, McGraw-Hill de Portugal, 1999, pp. 444-480.
7. Murteira, B., Probabilidades e Estatistica, Vol. II, 2nd ed., Amadora, McGraw-Hill de Portugal, 1990, pp. 361-364.
8. Box, G.E.P., Hunter, W.G., Hunter, J.S., Statistics for Experimenters. An Introduction to Design, Data Analysis and Model Building, 1st ed., New York, John Wiley & Sons, 1978, pp. 571-582.
9. Poirier, J., "Analyse de la Variance et de la Regression. Plans d'Experience", Techniques de l'Ingenieur, R1, 1993, pp. R260-1 to R260-23.
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 161-170)
MEASUREMENT SYSTEM ANALYSIS AND STATISTICAL PROCESS CONTROL
ALISTAIR B. FORBES
National Physical Laboratory, Hampton Road, Teddington, Middlesex, UK, TW11 0LW
In statistical process control, critical characteristics of the outputs of a process are measured in order to assess whether the production capability is meeting requirements and to detect unwanted changes in the process behaviour. The variation in the measured characteristics is the result of a combination of variation in the process and variation in the measurement systems. In commissioning and implementing measurement systems for process control, it is necessary to perform measurement system analysis on candidate measurement methods to determine if they meet specific metrological requirements. There are two general methodologies for measurement systems analysis. The first is described in the Guide to the Expression of Uncertainty in Measurement (GUM), and is used throughout metrology. The second is described in ISO 5725: Accuracy (trueness and precision) of measurement methods and results, and is prevalent in the measurement and testing community. In this paper we try to put both approaches in a common framework. Once a measurement system has been characterised (using either approach), it is then possible to consider what component of the observed variation in the measured process characteristics can be explained by the measurement system. To answer this question it is necessary to consider the likely variation of the behaviour of the measurement system over time. We consider a simple and practical approach that can be implemented using information readily available from the measurement system analysis.
1. Introduction In this paper we consider two approaches to measurement systems analysis and their relationship to statistical process control. The first approach is that described by the Guide to the Expression of Uncertainty in Measurement (GUM [1]). The GUM gives guidelines on how to evaluate and express the uncertainty associated with the output of a system, given estimates of the inputs, their associated uncertainties, and a model specifying how the output depends on the inputs. The uncertainty quantifies the likely variation in the output values if the measurement was to be repeated many times with all inputs allowed to vary in accordance with their uncertainties.
162 Although the GUM has wide applicability, it tends to be used in the uncertainty evaluation associated with a particular set of measurements gathered by a particular measurement system - for example, in the preparation of a calibration certificate. The approach to measurement systems analysis in ISO 5725 [2] addresses measurement uncertainty in terms of trueness and precision. Trueness relates to the difference between the expected measurement result using the method and an accepted or reference value. Precision relates to the closeness of agreement between repeated measurements under stipulated conditions: the more the conditions are allowed to vary, the larger the expected variation of results. The principle behind statistical process control [3] is that if the manufacturing process leading to the final product is sound, then the product itself will also be sound. By means of monitoring the process at key stages, we should be able to detect any change in the underlying system generating these products. In order make valid inferences about the process it is also necessary to take into account the likely variation in the measurements due to the measurement systems. In this, it is important to understand how the variation in the measurements is likely to change over different timescales. This paper is organised as follows. In section 2 we describe the characterisation of stochastic processes in terms of statistical distributions. In sections 3 and 4 we describe the two approaches to measurement systems analysis in terms of characterising stochastic processes, while in section 5 we discuss some main concepts in statistical process control. The problem of assessing the process variation in the presence of measurement variation with a time dependency is considered in section 6. Our concluding remarks are given in section 7.
2. Characterisation of stochastic processes The output from a stochastic process is characterised in terms of statistical distributions. Suppose y = ( y i , . . . , 2/ m ) T is the output from a process for which there is a significant random element. While we are unlikely to be able to explain how the stochastic processes gave rise to the individual data values, we endeavour to explain the aggregate behaviour in terms of a statistical distribution for which we can estimate summarising statistics such as its mean and variance. For some applications, estimates of the mean and variance are sufficient to enable useful inferences about the behaviour of the process to be made. For others, further information such as higher order moments, quartile values or confidence limits are required or highly
163 desirable. Often, this information is derived by assuming that the distribution has a particular parametric form (such as a ^-distribution) for which estimates of the distribution parameters are available (e.g., mean, scale and number of degrees of freedom for a ^-distribution). 3. The G U M approach to measurement uncertainty evaluation In general, metrology measurements are made in a well characterised environment. The behaviour of the instrument is modelled in terms of physical laws for which there is substantial theoretical and/or experimental evidence. In these circumstances, the input quantities X\ influencing the measurement result can be listed and quantified and their effect on the response Y specified in functional form: Y = f{X). From the knowledge of the instrument, a probability distribution is associated to X defining, through the function / , the probability distribution associated with Y. Summarising information about this latter distribution can be determined using the law of propagation of uncertainty [4] (LPU). If Y is a linear combination cTX of random variables X = {X\,..., Xn)T which are such that the expectation E(X) = n and the variance V(X) = V, then Y has expectation E(Y) = cTfi and variance V(Y) = c T Vc. If Y is a nonlinear function Y = f(X) of X, then E(Y) is approximated by /(//) and V(Y) by g T V r g, where g is the vector of partial derivatives of / evaluated at /x. The adequacy of these approximations depends on the nonlinearity of / . For X ~ N(0,1) and Y = X2, for example, they are entirely useless. For this reason, other methods such as Monte Carlo simulation [5,6] (MCS), are being used to provide summary information about the distribution associated with Y. If x g are random draws from the joint distribution associated with X, then yq = f(xg) are random draws from that for Y. Hence, means, variances, quartiles, etc., can be estimated from yq. 3.1. Uncertainty
budgets
Uncertainty budgets represent a systematic approach to evaluating the uncertainty of an output Y in terms of the uncertainties of the inputs Xj. Often the information can be presented in a table or spreadsheet. A simple example is given in Table 1, and corresponds to a model of the form Y = X\ + X2 + X3. If Xj, j = 1,2,3, are associated with normal distributions, then Y is also associated with a normal distribution with variance u2(Y) = 0.01 2 + 0.022 + 0.022 = 0.03 2 . While the uncertainty budget is
164 used to characterise the likely variation in the measured values if the measurement was repeated many times with the inputs varying according to their associated distributions, an actual sequence of repeated measurements will be different from independent draws from a normal distribution. This is because some of the inputs will be constant over varying time scales. For example, the calibration offset is likely to be constant between calibrations. Only the repeatability effect is likely to vary from measurement to measurement. Symbol Xi
x2 x3 u(Y)
Description Calibration offset Environment Repeatability Standard uncertainty
Uncertainty 0.01 0.02 0.02 0.03
Table 1. A simple uncertainty budget. 4.
ISO 5725 Accuracy (trueness and precision) of measurement methods
ISO 5725 [2] describes measurement uncertainty in terms of trueness and precision. Trueness relates to the difference between the expected measurement result using the method and an accepted or reference value. Precision relates to the closeness of agreement between repeated measurements under stipulated conditions. The more the conditions are allowed to vary the larger the expected variation of results. Typical factors that affect the precision are the operator, environment, calibration of the equipment and time elapsed between measurements. The repeatability of a system is the precision when all the influence factors are held as close to constant as possible (e.g. the same operator). The reproducibility is the precision when all likely sources of variation are allowed (e.g. all operators using the equipment). Compared to metrological experiments, measurements are made in less controlled environments. The influence factors are not well characterised and the functional relationship of between the output and inputs is not known precisely. We write the output Y = f(X, Z) as depending on factors X that can be controlled and factors Z over which we have little influence. In order to estimate the influence of X on F , the factors X are deliberately varied and the responses analysed. The observed precision under repeatability conditions is the variation in the measurements yq when all controllable factors are held constant: yq = / ( X Q , Z Q ) , where XQ S -X",
165
i.e., xo is a draw from X and zq G Z. The observed precision under reproducibility conditions is the variation when all controllable factors vary as in normal operation: yq = / ( x g , z g ) with x.q G X and zq G Z. 4.1. Analysis
of
variance
If we assume the influences of factors X and Z are independent and additive so that Y = 53 • fj(Xj) + g(Z), then in a repeatability experiment yq = /o + gq, where f0 = £ \ fj(x0ij), gq = g{zq), x0ij G Xj and zq G Z, from which an estimate of the variation of g{Z) can be determined. Similarly, if we hold all but xk fixed, then yq = f0\k + fk(xq,k) + 9q, where f0\k = Ylj j^k fj(xo,j)- This allows us to estimate the variation of fk(Xk)+g(Z). Together with the information from a repeatability experiment, an estimate of the variation of fk(Xk) can be obtained. Analysis of variance (ANOVA) extends this approach to designing experiments and performing the analysis of the measurement results in order to estimate the standard deviations associated with individual contributing factors [7]. 4.2.
Trueness
Repeatability and reproducibility experiments provide information about the likely variation in measurement results for a fixed type of measurement task. They will not detect systematic effects that influence all the measurements and the incompleteness of the model means that it is difficult to estimate their effect on the output. We can use prior information about the output quantity (available through a more accurate measuring method, reference materials, or calibrated artefacts, for example) to quantify these systematic effects. If the information about the mean y of measurements from a reproducibility study is summarised by y G N(/J, + 5,s2), where 6 represents trueness of the measurement method, and the prior information about \x is summarised by j/o £ N(fi,, SQ), then the information about S is summarised by y — yo G N(6, SQ + s2). 5. Statistical process control The aim of statistical process control (SPC) is to know enough about the usual behaviour of the process in order to a) detect when something has gone wrong, and b) ignore events that can reasonably be explained by chance. Sources of variation inherent in the process are referred to as common or random causes. Assignable or special causes are sources of variation
166 external to the process that need to be investigated and eliminated. We model the outputs Y — f(X, Z) of the process as an unknown function of factors X (over which we have some control) and Z. 5.1. Control
charts
Control charts are used to specify normal behaviour of the process and as a tool in detecting changes in the behaviour of the process. A two-phase approach is used. In the first phase a sampling scheme is employed to set up the control limits. Measurements are made in batches or subgroups {yb,q} and the average j/6 and standard deviation s;, for each subgroup recorded. The measurements within each subgroup are made in conditions close to repeatability conditions: Vb,q =
f(Xb,q,Zb,q)
where Xf,i « x^ in the sense that the variation of Xfci9 about x& is expected to be significantly smaller than the variation of X over the long term. If the sequence of standard deviations Sb is reasonably constant, s& « ay, the process is regarded as stable. Otherwise, assignable causes are assumed to be present. Once stability of the process has been established, a run of subgroup measurements is gathered and the average y and standard deviation OR of the within-subgroup averages y~b a r e calculated. The variation of y~b can be regarded as the variation under reproducibility conditions. It is then assumed that if the process remains stable subsequent subgroup averages are drawn from a normal distribution: y~b € N{y,aR). The actual measured values are recorded on control charts. Events are signalled if their probability of happening by chance (assuming, normal, independent sampling) is less than 3 in 1000, for example, a measurement of y~b for which |j/6 — y\> 3(TR. The underlying philosophy is that assignable causes occur more often than 3 per 1000 so given that an event has occurred, the most likely explanation is an assignable cause. Tests can be tuned to balance the risks associated with false alarms and undetected changes in the process. 6. Measurement systems analysis and S P C In SPC, the observed variation is a combination of process and measurement variation. Ideally, the measurement systems implemented are such that the variation due to measurement is insignificant compared to the process variation. In practice this may not be the case. Given a sequence of measurements, what can be said about the likely contribution of the measurement
167
system to the observed variation? The output of uncertainty budgets, measurement system analysis and analysis of variance are a characterisation of the measurement system that can be used to answer this question. However, the variation in measurement results generally has a dependence on time. The repeatability ay is a measure of the short term variation, the reproducibility <JR, that of the long term, while the measurement uncertainty u(y) will generally include components that can make no contribution to the observed variation. We can include time-dependent information through a time series model defined by parameters a — (<TI, . . . ,
where £j,q+i = ej,q with probability 0 < <x, < 1. The parameters Oj are the standard deviations determined as part the uncertainty budget or analysis of variance, while the ctj are used to model the frequency at which the effects are likely to change (e.g., time between calibrations, change of operator, change of measuring instrument). If we expect the jth factor to change every Uj time steps then we set ctj = 1 — 1/rij. For such a model, the covariance cov(yq,yq+r) between the output yq and that yq+r at r time steps later is n
cov(yq,yq+r)
=
^2^oTj.
The variance matrix V(cr, a) for y is a symmetric Toeplitz matrix [8] defined by cr = (CTI, . . . , <xn)T and a = (on,...,a„)T. (A symmetric Toeplitz matrix is constant along its diagonals and is defined by the first column, in this case Viti = X)"=i a]aT1^ 6.1. Variation
attributable
to
measurement
Suppose the measurement of the process is modelled as yq = fi + 5q + eq,
q = l,...,m,
(1)
where S £ N(0,a'pl) represents the variation in the process, and e e N(0,V(cr,a) that of the measurement system. We assume that from a measurement system analysis we have reliable estimates of cr and a. From observations y = (j/i,... , y m ) T we wish to estimate ap. Define V(X) to be V(X) — V(
yq(a) = yq - a,
168 with respect to a, so that (t(X) is the best estimate of [i, given V(X). Setting X2W = X2(A(-^)i ^)> if y i s generated as specified in (1), the expected value of X 2 ( ( T P ) is m — 1 since it is a sample from a \ 2 distribution with m — 1 degrees of freedom. Thus, an estimate ap of op based on the observed y is given by the A that solves x 2 (A) = m — 1. In a different context, this is precisely the same computational problem considered in [9]. In determining the solution we note that if V has eigenvalue decomposition V = XDXT then
yT(a)V-\X)y(a)
= y(afX(D
+ X2I)-'X^y(a)
m
= £
~2( \
^ L ,
where y(a) = XTy(a). These calculations assume the cr (and a ) are known exactly. The likelihood l(/j,,ap,cr\y) of observing y given /x, up and cr is given by the multivariate normal density function with variance matrix
6.2. Numerical
example
As an example calculation based on the model summarised by Table 1, we have generated m = 250 data points e e iV(0, V) where V = V(cr, a) with a = (0.01,0.02,0.02) T and a = (1 - 1/50,1 - 1/10,0) T , and set y = e + 6, S £ N(0,0.032I). In terms of the example in Table 1, a is set to represent re-calibrating every 50 measurements and the environment changing every 10 measurements. The value of eTV~1e = 256 which is close to its expected value of m = 250 relative to the standard deviation, \f~(2m) = A / ( 5 0 0 ) W 22, of the \m distribution. The value of A such that X2(A) = 249 is ap = 0.0307 which compares well with the value of 0.03 used to generate the data. If we ignore the expected correlation in the measurements and use V = 0.03 2 / instead, the estimated value of ap is 0.024, i.e., about two standard deviations away from the true value. Figure 1 shows the simulated values of the X\ variable (calibration offset), that of X\ +X2 (offset plus and environment), and yq, the simulated combined measurement and process variation.
169 0.15
-0.2 ' 0
' 50
' 100
' 150
' 200
' 250
Figure 1. Data simulating combined process and measurement variation with the measurements produced by a system characterised as in Table 1.
7. Concluding remarks Both the GUM and ISO 5725 are concerned with characterising the stochastic behaviour of measuring systems in terms of statistical distributions. The emphasis in the GUM is on deriving and characterising the distribution associated with the output on the basis of assigned distributions to the inputs and a known functional relationship between the output and the inputs. The emphasis in ISO 5725 is on characterising the output variation in terms of observations arising when the inputs are deliberately varied. The GUM accounts for systematic effects through the functional relationship of the output with the inputs. For ISO 5725, it is assumed that the model is not sufficiently well-defined for this approach and the impact of systematic effects have to be estimated using external information such as reference values determined by a more accurate system. ISO 5725 uses the term 'trueness' for the difference between the expected mean of the measured values and the reference value. In metrology, trueness (and, similarly, 'bias') as a concept tends to be avoided, mainly because measurements are made in a
170 context in which more accurate measuring methods are not available. T h e use of t e r m s such as repeatability, reproducibility, random and systematic effects, bias and trueness, reflects the fact t h a t factors contributing to the uncertainty associated with a measured value have different behaviour with respect t o time. In accounting for t h e observed variation in measurements of a process over time, it is i m p o r t a n t t h a t t h e time-dependent aspects of the measurement system are taking into account. We have considered simple models t h a t are able to do this on the basis of information t h a t is likely to be available. Acknowledgements This work was undertaken as part of t h e UK's National Measurement Syst e m Directorate Software Support for Metrology Programme. I am grateful t o Dr Peter Harris, N P L , for comments on an earlier draft of this paper. References 1. BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, and OIML. Guide to the Expression of Uncertainty in Measurement. Geneva, second edition, 1995. 2. International Organization for Standardization, Geneva. ISO 5725-1: Accuracy (trueness and precision) of measurement methods and results Part 1: General principles and definitions, 1994. 3. International Organization for Standardization, Geneva. ISO 11462: Guidelines for implementation of statistical process control (SPC) - Part 1: Elements of SPC, 2001. 4. K. V. Mardia, J. T. Kent, and J. M. Bibby. Multivariate Analysis. Academic Press, London, 1979. 5. M. G. Cox, M. P. Dainton, A. B. Forbes, P. M. Harris, H. Schwenke, B. R. L. Siebert, and W. Woeger. Use of Monte Carlo simulation for uncertainty evaluation in metrology. In P. Ciarlini, M. G. Cox, E. Filipe, F. Pavese, and D. Richter, editors, AMCTM V, pages 94-106, Singapore, 2001. World Scientific. 6. M. G. Cox and P. M. Harris. Software Support for Metrology Best Practice Guide No. 6: Uncertainty Evaluation. National Physical Laboratory, Teddington, 2004. 7. D. C. Montgomery. Design and analysis of experiments. John Wiley & Sons, New York, 5th edition, 1997. 8. G. H. Golub and C. F. Van Loan. Matrix Computations. John Hopkins University Press, Baltimore, third edition, 1996. 9. M. G. Cox, A. B. Forbes, J. Flowers, and P. M. Harris. Least squares adjustment in the presence of discrepant data. In P. Ciarlini, M. G. Cox, F. Pavese, and G. B. Rossi, editors, AMCTM VI, pages 37-51, Singapore, 2004. World Scientific.
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 171-178)
A BAYESIAN ANALYSIS FOR THE UNCERTAINTY EVALUATION OF A MULTIVARIATE NON LINEAR MEASUREMENT MODEL GABRIELLA PELLEGRINI Department
of Applied Mathematics, University of Florence, Via di S. Marta, 3 - Florence -150139
GAETANO IUCULANO, ANDREA ZANOBINI Department of Electronics and Telecommunications, University of Florence, Via di S. Marta, 3 - Florence -150139 Following the principles of the Bayesian inference a multivariate non linear measurement model is studied. In particular the analysis concerns a measurand modeled as the ratio between two quantities: the first one being the sum of two variables, which represent respectively constant systematic effects and the sample of indications, while the second quantity, is referred to a measurement standard. By assuming that the information about the input quantities are in form of prior joint probability density function and a series of direct measurement data are available, the Bayes' theorem is applied to evaluate the posterior expectation (estimate), the posterior standard uncertainty and the posterior coverage probability. Numerical results are reported to asses the validity of the proposed analysis.
1. Introduction: the Bayesian approach The main characteristic of Bayesian approach is that prior to obtaining the measurement results the experimenter considers his degrees of belief and individual experience for the possible models concerning all the relevant quantities involved in the measurement process and represents them in the form of initial probabilities (called "prior"). Once the measurement process is carried out and the results are obtained Bayes' theorem enables the experimenter to calculate a new set of probabilities which represents revised degrees of belief in the possible models, taking into account the new information provided by the measurements results. Bayes' Theorem: to update our beliefs about the values of certain quantities 9 = (0,,..., 0 m ) on the basis of the observed values of other quantities (the data)
172 y = (yl,...,y„), and of some known relation between Band y , the Bayes' theorem tell us that the updated or posterior distribution is given by:
where 71(0) is the prior probability codifying the previous knowledge about 0 and p(y|e) is the "likelihood". 2. The Measurement Model Referring to the following non linear functional relationship:
M=^XS
(1)
where all the quantities are treated as random variables, we distinguish : • The input quantities: M , the measurand; / , the quantity representing the sample of n indications /, „ I2,..., I„;R , a constant systematic effect; S , the measurement standard; V, an auxiliary variable (nuisance-parameter) associated to the unknowledge of the variance of the output quantities. • The output quantities: / = [I^,^,...,!„} • The measurement data refer only to / a n d they consist of a sample of n observations Ix,I1,...,In obtained by repeating the measurement process in the same repeatability conditions. 2.1. The Prior Stage In the prior stage, the marginal probability density function (PDF), associated with the quantity M, depends on the information that exists about the quantities ./?,£',/, which are supposed to be mutually independent before performing the measurement process. So that the a priori joint density of the quantities R,S,I is given by the product of their a priori marginal densities, that is: / ™ ( w ) = \\fmsAm^^^v)dmdv where fMBSIV(m,r,s,i,
= fR{r)fs{s)f,(i)
(2)
v) is the prior joint density of the input quantities.
The variable V is a priori independent of M and it is independent of I,R,S too, so that we have: /MRSIV (OT> >% s, i, v) = fms,
(m, r, s, i)fY (v)
(3)
173 where
fms,{m>r>s'i)= lfms,v{m^r^s^hv)dv is the a priori joint density of M,R,S,I and fy(v) is the a priori marginal density of V . Introducing the conditional density assuming R,S,I: fM[mpi = r,S = s,I = i), the a priori marginal density of the measurand M can be written as:
/„(M)s jWf^^f,(HI* = r.S = *,/ = •')**&<«
(4)
On the other hand, it is possible to give the marginal density is the form: fu(M)=
Wfms^r^rds
=H\s\fl!S,(r,s,ms-r)drds
(5)
Comparison between eq.s (4) and (5) implies: fu {m\R = r,S = s,I = i)= \s\b(i -ms + r)
(6)
where 8 is the delta Dirac function and consequently, taking into account eq.(2), we obtain the final expression of the prior joint density: / » { m , r , s , i ) = \s\b{i -ms + r)fR (r)fs (s)f, (i)
(7)
Looking at Iul1,...,In as empirical images of / , which is considered the population universal variable, we assume that It,k = \,...,n are mutually independent, assuming M,R,S,I,V, with the same conditional density fh (iJk/jUBsn,), for each Ik,k = l,...,n, where, to save space, the symbol d^ stay for X = x, Y = y and we assume that E{h\\dMSiy}= E{l\dI¥}=
i, F a r { / t | < 0 = Var\fk\dIV}= v
consequently we deduce: E{lk} = E{l}, Var{lk}=E{v}\/k
= \,...,n
(8)
which enforce the consistence of the hypothesis. From the principle of maximum information entropy we deduce that the conditional density of each Ik,k = 1,...,« is normal (Gaussian) with expected value equal to / and variance equal to v, that is:
/,.
felhw)=/,
fclK)=-7==e~^
<9>
V27TV
The likelihood of the single output quantity Ik,k = \,...n is proportional to the expression (9). Now, the Bayes' theorem allows to write the posterior joint density of M,R,S,I,V assuming I_ = i:
174 fwm ( w ' r ' s>»'»V|Z = i) = Wwm (»> r> s>»»v)// d^iiMK )
where the likelihood is given by / , {§dms,v)=f[
f(ik \\dURSIV).
The posterior density of M is given by: fu {m\\L = Lo)=i\\ \fwwv (w> r> s, i, v||/ = i_o )irdsdidv where, by taking into account eq.s (3), (7) and (9), fumy (m,r,s,i, v||/ = JO )= k"fMRSl (m,r,s,i)fp (v)v"/2e" ** '?l ^
(10)
being J 0 = (J0 ,...j'0_ J the observed values of the output quantities. 2.2. First Case of Study If we assume the prior density of F : fv(v) = v~' (Jeffreys' rule), from eq.s (7) and (10) we obtain: fmm (m> r> s> U v]|Z = U)= k"\s\8{i -ms + r)fs {r)fs {s)f, ( j ) - ^ ^ ^
(11)
By integrating out of v , assuming non informative prior density ft (j) we have: /*w('">'V,'>l£ = £»)= * W ~ms
+r
)fn(r)fs{s)w{i)
(12)
being W(i)= l + o- 2 (j-ioj
2
, with jo = - Z v , a2 = - l ( j 0 i -J'J
For 5 = 1, see [1]. Now, by integrating eq. (12) out of i, r and s , and taking ,
+00
.
^ +00
into account the normalization property \fu[m\\l = J0 jdm =fc"'\W{m)dm = 1, we .00
-00
can write the posterior density of the measurand as: I ii
\ Ul
fM\i=L)= ^
l'"l~\s\fR{r)fs{s)w(ms-r)drds
':
/ — - —
d3)
jW(m)dm -oo
From eq. (13) we deduce the posterior expected value of the measurand:
E{M\\L = i0}J"jw(m)dm
[yXr)<}riZsfs(s)dslmW(ms-r)dm
(14)
175 It can be shown that: *]mW(ms - r)dm = ^^-*]w(ms
- r)dm = ^f]w{m)dm
(15)
and by eq.s (14) and (15) we obtain:
The posterior variance of the measurand is: Var{M\\i = z}= E{M%
= i)- £ 2 { M | / =
i)
(17)
by taking into account eq. (16) we find: E{M% = j}= ?-± lys(r)(r
+ If dr ^ ( ] / * 2 )fs{s)ds + (18)
E
1
+ - *zt-Q/s )fs(s)ds+ n—i
-±-l"Mr){r n-i "
+
2
2
i) dr[-"(l/S )fs(s)ds
Eq. (17) can be written in the following form: E{M% = i]=E^E{F}+l+2UE{R}
+
-^
with E{l/S2} = Var{l/S}+ E2{l/s). 3. Numerical Results We consider the model presented before related to a digital voltmeter (see GUM [3] paragraph 4.3.7 example 2) being the measurand M given by: M =l ^ -
(19)
a Consider that the instrument is used to measure a potential difference / , being R the quantity referred to systematic effects and Qs a standard resistor. The arithmetic mean of a number n of independent repeated observations of L = {lu...,I„} is found to be Ta = 0,928571V with standard deviation CT0 = 7.563 xlO"6 V. The range of the random variable associated to systematic effects R is assumed to be: E{R}-4I,UR
176 approximation l/Qs s£'{l/g i }=10" 3 S. It is useful to introduce standardized variables : M.__M-E{\/QsJE{Rh.h>)
R- =
R-E{R)
£{1/&)K/V»-1] I'=M'- R'UR(GJJ^\)X by
assuming
uR = (ao 0 /V«-l)
(20)
ut
= (l - i0)(oJJ^i)' we
have
(21)
M" =f + aR',a>0
and
- V3 < R' < V3 . Taking into account that /* and R' are independent as / and R and that R' and 7 are independent too, the standardized probability density can now be written in the convolution form:
fA^%
=i)= ^rlfr^'-r\\l,
= i)dr,a>\
(22)
2<xV3 -Vva having supposed a prior uniform density for R'. On the other hand being:
fM\l = Lo)=lfJ'Al = L)^
(23)
0
by applying Bayes's theorem we have:
fJ'Al = Lh ¥r{i'k(v)fMr = i'-v = v) Now, by taking into account that I" and v are independent and assuming a prior non informative density for /"and / f ( v ) « 1 / v
("Jeffreys' rule") , we
deduce:
fJ"4L = Lh^fMr=>
K^ f,MA^=h)=-tif h
1=
(24)
-^- •i'+i0,V = v
V^T
Following the principle of Maximum Information Entropy, the conditional density of each Ikk = \,...n is normal, that is: r ii
f,
W-
\
S=j' + J0,K = v
1
V^T
From equations (24) and (25) we deduce:
f
exp
\2^
f
i-i 2v Vi - L — Vn^T *A
*0
(25)
177
T*1
1 2v *-i
n -1
and being £(z0J - i„)= 0 we can write:
/,>(^v|k=i)=4r ex p(V2
1
n
,.»2
(26)
-CT„I
2v/j-l
From eq.s (23) and (26): ex
/,-(''*l£=i)= u 4 r p f -
CT r ~2 v— «-l " '
•-2 >
(27)
l * = *-1 + n-\
and we deduce that /,.(*'"||/ =£0) is proportional to a Student's t with « - l degrees of freedom, whose corresponding distribution density is: r(»/2)
/ . . » = V7i(«-l)r((«-l)/2)l 1 + - n-\ The constant k4, in eq. (22), is determined by applying the normalization property and the posterior density function has the following final expression: AT
JA
*-
ia}
2aV3"V^T]r((«-l)/2)JAt
"-
dr
1
(28)
In Figure 1 is reported the behavior of the posterior density function in eq. (28) for different values of n. The posterior expectation is E\M" ||/ = / 0 } = 0, and the posterior variance is: n-\ n-3 So that the level of confidence for M" may be written as:
Var {M" } = E{M" \\l = i0} = a2 +
P*. = P{~ *«„ <M'< ka„}= 7 / ^ {m\\l = i)dm, am = J a 2
+
The coverage interval for the measurand M has the bounds given by: akrm^M
^ \
178
Numerical results for the level of confidence are reported in table 1 for different values of the parameters n, k and assuming the parameter a = 1. 0.35
0.3
0.25
0.2
0.15
0.1
0.05
0; -
4
-
3
-
2
-
1
0
1
2
3
4
m
Figure 1. The posterior density function for n=5 (*), n=13 (•), n=21 (-) assuming the parameter a=l. Table 1. The level of confidence values for a=l Ph, n=5 n=13 n=21
k=l 0.72169 0.67632 0.67178
k=2 0.95804 0.95834 0.95903
k=3 0.99124 0.99707 0.99792
References 1. Lira I, Kyriazis G., Bayesian inference from measuremnt information Metrologia 36 (1999), pp. 163-9 2. Lira I. and Woger W., Bayesian evaluation of the standard uncertainty and coverage probability in a simple measurement model, Institute of Physics Publishing, Meas. Sci Technol. 12 (2001), pp. 1172-1179 3. BIPM,IEC,IFCC,ISO,IUPAC,and OIML, Guide to the Expression of Uncertainty in Measurement, 1995, ISBN 92-67-10188-9, Second Edition.
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 179-187)
M E T H O D COMPARISON STUDIES B E T W E E N D I F F E R E N T STANDARDIZATION N E T W O R K S
ANDREA KONNERT University
of Dortmund, Department of Statistics, Dortmund, Germany Roche Diagnostics GmbH, Department of Bio statistics, Penzberg, Germany
Diagnostic tests are often traceable to reference materials with assigned values derived within standardization networks. However for some analytes several of these networks exist, where measured values are not comparable among the networks. Method comparison studies are launched to derive rules for transformation of values of one network into values of another network. Bayesian hierarchical models allow the derivation of transformation equations between two networks based on multiple studies as well as the modelling of between-study variation, to make uncertainty calculations of transformed values adequately. Furthermore an approach is developed, to compare a new method comparison study with previous studies.
1. Introduction and Background Diagnostic assays for HbAlc% (Hemoglobin Ale in percent of total Hemoglobin) can be calibrated based on calibrators from different standardization networks. Laboratories in the US use calibrators, based on the NGSP (National Glycohemoglobin Standardization Program), but there are also standardization networks in Japan, Sweden and Australia. As these networks differ by the reference measurement system for HbAlc%, the reported HbAlc% values are not comparable between these networks. To achieve comparability of the results of HbAlc% assays a worldwide accepted reference material is needed, from which all calibrators are derived. The IFCC (International Federation of Clinical Chemistry and Laboratory Medicine) has launched a project for the worldwide establishment of a reference material for the measurement of HbAlc% in human sera [1]. HbAlc% target values based on NGSP standardization are already established for long-term assessment of the glycemic status in patients with diabetes mellitus and therapy goals [2]. For the transformation of the NGSP
180 reference intervals into values based on the IFCC network a transfomation rule (i.e. a master equation) is needed. For the derivation of such a master equation method comparison studies are launched twice a year between the IFCC and the other networks. In the following we will denote these networks as designated comparison methods (DCMs). In Figure 1 the data of these method comparison studies is shown for the IFCC - DCM B relationship. The aim of these studies is to derive a master equation so that IFCCHbAlc% values can be transformed into values of the other networks and vice versa. Besides the transformation, the uncertainty of the transformed values must be calculated, too. One must be aware of variation of the regression lines within the individual studies as well as between the individual studies. Both variations must be taken into account in the uncertainty calculation process. In [3] a master equation is already derived by pooling data from 4 studies. However this approach does not account for between-study variation of the regression lines and results therefore in overoptimistic confidence intervals of the master equation coefficients and the transformed values. Finally criteria are needed to compare a regression line of a new study with the master equation. This allows to conclude on the consistency of the new data with the situation of the previous studies. We present a Bayesian approach for the mentioned tasks. To incorporate between-study variation the data are modelled by a hierarchical linear model. The uncertainty calculation of transformed values is based on the posterior distribution of the parameters. The intercept-slope pair of a new regression line is compared to the posterior predictive region of the master equation coefficients.
2. Hierarchical Linear Model A linear relationship is assumed between the values of the IFCC network and the DCM networks. Five studies are combined to derive the respective IFCC-DCM relationship. If between-study variation of the regression lines is observed this variability must be incorporated within the master equation. In this case the master equation may be written as Y = a+~pX
w
(!)~ H?)4
(1)
181 IFCC vs. DCM B
i
1
1
1
p
4
6
8
10
12
IFCC-HbA1c%
Figure 1. Graphical representation of the data for the IFCC - DCM B relationship. The different symbols indicate d a t a from different studies. For each study a regression line, obtained by simple linear regression, is given, too. The bisecting line is shown as solid line, too.
where X represents an IFCC-HbAlc% value and Y one of the DCMHbAlc% value. The coefficients of the master equation are normally distributed with mean (a, (3)' and variance-covariance matrix S. Bayesian analysis allows to analyze all tasks within one model. Based on a prior distribution Bayesian analysis focuses on the derivation of the posterior distribution of the parameters, i.e. the distribution of the parameters conditional on the observed data. However the theoretical derivation of the posterior is mostly impossible, but Markovian updating schemes provide possibilities to simulate values that converge in density to the posterior.
182
The Gibbs sampler [4] is such an updating scheme. The parameter vector is divided into d subvectors. Each iteration of the Gibbs sampler cycles through the subvectors of the parameter vectors, drawing each subset conditional on the values of the others. The WinBugsl.4 software [5] makes Gibbs sampling very easy, additionally diagnosis tools are implemented to survey the convergence status of the simulations. Given the data of different studies Model (1) may be written as a hierarchical linear model Yi\biye,Xi~N{Xibi,
(2) (3)
where Y$ is the vector of DCM- HbAlc% sample means from study i and Xi the design matrix of study i with IFCC-HbAlc% sample means. Conditional on the intercept a, and slope (3i peculiar to study i, the sample readings of the DCM network are normally distributed with mean Xfii and variance <7^Ini. The intercept-slope pairs of the studies are themselves random draws from a multivariate normal distribution. As Gibbs sampling works best with independent parameters [7,8] the IFCC values were centered around their mean (X) and this centered model is fitted. This ensures the indepence of slope and intercept of the master equation and reduces the burn-in of the Gibbs sampler substantially. Afterwards the intercept of the centered model (ac) is transformed into the intercept of the uncentered model by a = ac — (3 • X. By means of the master equation IFCC-HbAlc% values can be transformed into DCM-HbAlc% and vice versa. Given an IFCC-HbAlc% value x, the DCM-HbAlc% value reads y — a+(3-x, where the vector (a, f3)' is normally distributed with mean (a, /?)' and variance-covariance matrix S. Similarly if a DCM-HbAlc% value y is given, the transformation into an IFCC— HbAlc% value reads x = (y — a)/(3. The first transformation is also known as prediction, the second one as inverse prediction or calibration. Uncertainty calculation of transformed values is incorporated within the model. Being functions of the parameters, the distribution of transformed values is simulated by evaluating the function at each simulated point of the posterior of the parameters. The master equation allows to recalculate measured values into values based on DCM calibrators, too. For the uncertainty of these values the uncertainty component due to the master equation as well as due to routine measurement procedure must be taken into account. Therefore the posteriors of the parameters of the master
183
equation as well as the posterior of the measured values must be known. The Bayesian approach offers also the possibility to check on the consistency of a new study-wise regression line with the previous studies. A new study is represented by a new (intercept, slope) pair, hence if consistency is given this pair should be a random draw from the N(b, E) distribution. Based on the posterior of b and £ the distribution of new draws of coefficients can be assessed, a new regression line is then compared to this distribution. 3. Prior Distribution and Model Checking The full Bayesian model needs the definition of priors for the parameters. To ensure the property of the posterior, conjugate prior distributions [6] for the multivariate normal setting are used, given by a\ ~ InvGamma(ue,
ve), b ~ N(rrib,Q, • I), £ ~ InvWishart(\,
ft,).
The inverse Gamma distribution is the conjugate prior for the normal vari2
ance, with mean given by m = ,Vf n and variance v = -, ^h——sv. The location parameters of the priors of G\ and b were taken from maximumlikelihood analysis of the centered model, whereas the scale parameters are set to 106 to reflect the poor knowledge on the priors. Therefore for the IFCC-DCM A relationship ue = 2,ve= 0.003 and b = (8.10,0.91)' for the IFCC-DCM B relationship these parameters were set to ue =2,ve= 0.006 and b = (7.10,0.97)'. The inverse Whishart distribution is the conjugate prior of the multivariate normal variance-covariance matrix. The inverse variance-covariance matrix (precision matrix) is then Whishart distributed with parameters (A, ft-1). In the following we will refer to ft-1, as in WinBugsl.4 distributions are parametrized in terms of precision matrices. The matrix ft-1 influences the mean as well as the variance of the Whishart distribution. As it is difficult to assess a closed form of the variance of the Whishart distribution different settings for ft-1 were regarded, ranging from 10~ 5 • I to I, with A = 3. We observed that the smaller the multiplier of I, the narrower the posterior distribution of the coefficients and the smaller the uncertainty of transformed values. For selecting the appropriate ft-1 the study-wise coefficients of the 5 studies are compared with the distribution of new draws of coefficients. All 5 pairs of coefficients should be included within the 0.99 confidence region of this distribution. Based on this decision criterion for the IFCC-DCM A relationship ft-1 = 5 • 1 0 - 4 / and for the IFCC-DCM B relationship ft-1 = 1 • 10~ 3 / was chosen.
184 Table 1.
Master Equation coefficients derived from 5 studies.
DCM
Coefficient
Estimate
Var(b)
A
Intercept
2.209
0.003
-0.0004
0.007
-0.0009
Slope
0.905
-0.0004
0.0002
-0.0009
0.0002
Intercept
0.995
0.009
-0.001
0.023
-0.003
Slope
0.967
-0.001
0.0002
-0.003
0.0005
1
fi"
= 0.0005 • /
B 1
Q-
=0.001-/
E
4. Results Multiple chains are run to show that the Gibbs sequences has reached effective convergence (i.e. the inferences on the parameters do not depend on the starting point of the simulations). Lack of convergence can be determined from multiple independent chains by calculating for each parameter the potential scale reduction factor [9] R, given by the quotient of (length of total-chain interval)/(mean length of the within-chain interval). If R is large, this suggests that either the length of the total-chain interval can be decreased by further simulations, or that the mean length of the withinchains intervals will be increased, since the simulated sequences have not made a full tour of the target distribution. If R is close to 1 each chain of the simulated observations is close to the target distribution. Three chains were run with starting points for b — (a, (3)' being (0,10)', (10,10), (10,0). All other starting points were pseudo-randomly generated. Based on the scale reduction factor convergence of the chains could be assessed at least after 25000 runs. To reduce the autocorrelation of the simulated values every 50th value was saved. After the burn-in phase simulations were continued until 1000 repetitions from each chain were obtained. Table 1 displays the estimators of the mean coefficients, the empirical variance-covariance matrix of their posterior and the estimator of S. For both relationships the mean intercepts are far from zero, indicating strong systematic differences between the reference measurement procedures of the IFCC network and the DCMs. This is due to the different specificities of the HbAlc measurement procedures. In the IFCC reference system HbAlc is denned on its molecular structure and specifically measured with a reference method, whereas in the other DCM's HbAlc is defined as the "HbAlc" peak of the chromatogram of the chosen DCM. But these HbAlc peaks contain not only HbAlc, but also to some extend substances that are
185
2.2
DCM A to IFCC 3.5 % - > 1.43%
DCM A to IFCC 5.3 % - > 3.41 %
DCM A to IFCC 12 % - > 10.82 %
IFCC HbA1c%
IFCC HbA1c%
IFCC HbA1c%
DCM B to IFCC
DCM B to IFCC
3.5 % - > 2.59 %
5.3 % - > 4.45 %
2.4
2.8
2.8
IFCC HbA1e%
3.0
4.1
4.3
4.5
IFCC HbA1c%
DCM B to IFCC q
4.7
12 % - > 11.38%
11.0 11.2 11.4 11.6 11.8 12.0 IFCC HbA1c%
Figure 2. Density of inverse predicted values for the IFCC-DCM A and IFCC-DCM B relationship with 0.025 and 0.975 quantiles (dotted lines). The dashed density replays the estimated density based on the pooled model. The points indicate the respective inverse predicted values based on the 5 study-wise regression lines.
not HbAlc. (For further details see [3] and the references given therein.) For the IFCC-DCM A relationship between-study variation of the regression coefficients is much smaller than for the IFCC-DCM B relationship. This is mostly due to the different network structure of DCM A and DCM B. DCM A consists of 10 reference labs, whereas DCM B only of one lab. Differences from study to study of the single lab are directly reflected in the mean of the measured values, whereas this kind of differences are mostly averaged in the DCM A network. Knowing the whole posterior of the parameters, (1 — a) confidence intervals can be given in terms of the a / 2 and (1 — a/2) quantiles of the simulated values. Figure 2 displays the densities of inverse predicted IFCC-Hbalc% values from both DCM networks. The 0.025 and 0.975 quantiles (dotted lines) are given as well as the estimated standard deviations of the densities. Furthermore points indicate the transformed values based on the study-wise regression lines. If no between-study variation is taken into ac-
186 IFCC - DCM A
I F C C - DCMB
Simulated re
StD(lnt)= 0.085
StD{lnt)= 0.051
S!D(Slops)= 0.013 StD(S!ope)= 0.007
Corr{lnt.Slope)= -0.9£9
CoiT(lnt,Slope)= -0.971 . *
J
"
"* *-
0.91
2.21
J
3
lniOmE<ja= 6$-(M "I
_
]
1.9
o
J_
2.0
2.1
*#o«
3 rv;
2.2
2.3
2.4
2.5
2.6
H
1 0.8
1 1.0
1
H
Figure 3. Distribution of new draws of coefficients with regression coefficients of the 5 studies (*) and the coefficients of a new study (+).
count, i.e. the master equation is derived from all data points, ignoring the study structure [3] the uncertainties of transformed values are overoptimistic. Within Figure 2 the densities of this approach are given as dashed lines, too. Especially for the IFCC-DCM B relationship this densities are too narrow, compared with the transformed values based on the study-wise regression lines. The Bayesian approach offers also the possibility to evaluate a new studywise regression line. Based on the posterior of the parameters the distribution of new draws of coefficients can be assessed. A new pair of coefficients is compared to this distribution. Figure 3 displays this distribution with the study-wise regression lines of the 5 studies (*) as well as the pair of coefficients of a new study (+). The new study lies within the posterior, so one can conclude that the relationship is still stable. The same holds for Network B when comparing the posterior of new draws of coefficients with the study-wise coefficients. 5. S u m m a r y Based on the example of the HbAlc% standardization networks we showed how data from multiple regression lines can be compared, taking into account between-experimental variation. The advantage of bayesian modelling is the derivation of posterior distributions of the parameters from which distributions of functions of these parameters can easily be derived.
187 This incorporates the uncertainty calculation of transformed values within the model and avoids error propagation formulas as proposed by GUM [10]. New regression lines can be compared to the posterior of new draws of the coefficients, providing a control instrument for further studies. However care must be taken in the definition of the prior distributions as they may affect the outcomes. We based the decision criterion for the selection of the appropriate priors on the evaluation criteria for new studies. MCMC techniques allow also to enlarge the hierarchical model for incorporation of measurement errors in the predictor variable [11], this subject rests for further application to our data. References 1. J.-O. Jeppsson, U. Kobold, J. Barr et al,Clin Chem Lab Med 40, 1, 78-89 (2002) 2. The Diabetes Control and Complications Trial Reserach Group, New England Journal of Medicine 329, 14, 997-986 (1993) 3. W. Hoelzel, C. Weycamp, J.-O. Jeppson et al, Clinical Chemistry 50, 1, 166-174 (2004) 4. G. Casella and E. I. George, JASA 46, 3, 167-174 (1992) 5. D. Spiegelhalter, A. Thomas, N. Best N. and D. Lunn, WinBUGSUserManual Version 2, http://www.mrc-bsu.cam.ac.uk/bugs, (2004) 6. A. Gelman, J. Carlin, H. Stern and D. Rubin, Bayesian Data Analysis, Second Edition, Chapman & Hall, New York, (2004) 7. A. E. Gelfand, S. K. Sahu and B. P. Carlin, Biometrika 82, 3, 479-488 (1995) 8. G. Zuur, P. H. Garthwaite and R. J. Fryer, Biometrical Journal 44, 2, 433-455 (2002) 9. S. Brooks and A. Gelman, J. Comp. Graph. Stat. 7, 4, 434-455 (1998) 10. ISO International Organization for Standardization, Guide to the Expression of Uncertainty in Measurement, Geneva, (1993) 11. R. Landes, P. Loutzenhiser and S. Vardeman, Technical Report, Iowa State University, http://www.sfb475.uni-dortmund.de/berichte/trl404.pdf, (2004)
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 188-195)
C O N V O L U T I O N A N D U N C E R T A I N T Y EVALUATION
MARIAN JERZY KORCZYNSKI Technical University of Lodz Stefanowskiego 18/22, 90-924 Lodz, Poland MAURICE COX, P E T E R HARRIS Hanmpton
National Physical Laboratory Road, Teddington TW11 OLW,
UK
A review is made of the use of convolution principles, implemented using the Fast Fourier Tranform (FFT) and its inverse, to form the probability distribution for the output quantity of a measurement model of additive, linear or generalized linear form. The approach is contrasted with other methods for uncertainty evaluation, and illustrated with some simple examples.
1. Introduction A number of models of measurement used in uncertainty evaluation take the additive form Y = X1+X2
+ ---+XN,
(1)
where Xi denotes the ith input quantity and Y the output quantity. A broader class of problems takes the linear form Y = c1X1+c2X2
+ -'- + cNXN,
(2)
where the Cj denote multipliers or sensitivity coefficients. A still broader class takes the generalized linear form Y = dfiiXi)
+ c2f2(X2)
+ ••• + cNfN(XN),
(3)
where the fi are given functions and the Xi are mutually independent vectors of input quantities. Some or all of the Xi may be scalar quantities. In the probabilistic approach to uncertainty evaluation the value & of each Xi in models (1) and (2) is assigned a probability density function (PDF) gXi(&), or a joint PDF gxX£i) 1S assigned to each Xi in model (3). The problem addressed is the determination of the PDF gyiv)
189 for the value r) of Y. The uncertainty information required, typically the best estimate of 77, the associated standard uncertainty and a coverage interval for 77, is then deduced from gy(rf). The Guide to the Expression of Uncertainty in Measurement (GUM) provides a framework1,3 for determining gy{rj). The framework is based on assigning gy(ri) to D e a Gaussian distribution (or scaled and shifted redistribution). The expectation of the distribution is given by the value of the measurement model for the expectations of the distributions gxi(£i) or gj£ .(£»)) and its standard deviation by applying the law of propagation of uncertainty to combine the standard deviations of those distributions. The Monte Carlo method 2 ' 3 constitutes a numerical procedure for obtaining gy(r)). The method is based on undertaking a large number of trials, each of which consists of sampling from the distributions gxt (£») or gj^ (£J and forming the corresponding model value. An approximation to gy(rj) is given by a scaled frequency distribution constructed from the model values. A review (section 2) is made of the use of convolution principles, implemented using the Fast Fourier Transform (FFT) and the Inverse FFT (IFFT), to form an approximation to gy(rj) for the class of problems defined by the additive model with two input quantities. The treatment of broader classes of problem is then considered (sections 3 and 4), covering (a) the additive model (1) with a general number of input quantities, (b) the linear form (2), and (c) the generalized linear form (3). The treatment is based on using processes of centralizing, scaling and transforming applied to the input quantities to convert a problem from these broader classes into one defined by the additive model. The concepts are illustrated with examples (section 5). 2. Convolution of Probability Distributions Consider the model Y — X\ + X2, where gx1 (£1) and gx2 (£2) are the PDFs for the values of Xi and X2. Then, the PDF gy(r]) for the value of Y is given by the convolution integral 4 oo
9xdZi)9xa(v
/
- Zi) <%i.
(4)
•00
In some cases, such as when the gxi(£i) are Gaussian or rectangular, this integral can be evaluated analytically to provide an algebraic expression for gy(rj). Generally, a numerical method can be used based on replacing the convolution integral by a convolution sum evaluated using the Fast Fourier Transform. A treatment is provided here for the case that the
190 distributions gxii^i), i = 1,2, have expectation zero. A treatment for broader classes of problem is given in sections 3 and 4. Consider each gxi(€i) over a sufficiently wide interval, [—Wi, Wi], say, centred on zero with Wi > 0. For example, for gx^i) rectangular, the endpoints of the interval would be the least and greatest points at which the PDF is non-zero and, for gxt{^i) Gaussian, the endpoints would be taken such that the probability that £, £ [-Wj, w;] was acceptably small. For instance, for a standard Gaussian PDF (having expectation zero and standard deviation unity) taking Wi = 4 would give that probability as less than 7 x 10" 5 . Let w = w\ +W2, and define the periodic function hxt (£i) with period 2w by
{
0,
&€
\-w,-Wi),
9Xi(€i),€i e [-Wi,Wi\, 0, £i€(wi,w], with the periodicity condition hxMi + 2w£) = hxiiti), Ci G [~w, w], £ integral. Choose a sampling interval T, where 0 < T <S w, in terms of which uniformly-spaced points T]r are defined by r\r = rT, r integral. Let rrii be the number of points r\r in the interval [—«;», Wi], and m the number in the interval [—w, w]. Because the intervals [—Wi, Wi] and [—w, w] are centred on the origin, the numbers rrij and m will be odd. Because w = wi + W2, the numbers will be related by m = mi + m^ ± 1. The product of a constant C and the convolution sum 4 given, for r = - ( m - l ) / 2 , . . . , 0 , . . . , ( m - l ) / 2 , by (m-l)/2
hY{Vr)=C
Y^
hx1{Vs)hx2(Vr-s),
(5)
s=-(m-l)/2
provides an approximation to the convolution integral (4) at the points rjr in the interval [—w, w\. C is a normalizing factor determined such that (m-l)/2 T
53
hY{r]r) = l,
r=-(m-l)/2
and thus the piecewise-linear function joining the points (rjr, hy {Vr)) can be regarded as a PDF.
191 The convolution sum (5) can be implemented efficiently and numerically stably using the Fast Fourier Transform (FFT) and the Inverse FFT (IFFT) 4 as follows. Define vector hxt to have elements h Xi s
'
= fhxi(rls-(m1-i)/2), \0,
s = 0,...,m1 - 1, s = mi,...,M-l,
where M > m. A Matlab implementation of the computational core is then hy =
ifft(fft(hl).*fft(h2))
where h i is a variable containing the elements of hx1, h2 contains those of hx2, andhy contains the values hyirfr), r = — (m —1)/2,... , 0 , . . . ,(m — l)/2, in its first m elements. In order to exploit the efficiency of a radix-2 FFT algorithm, M may be chosen to be the smallest power of two that is no smaller than m. Advice on the choice of sampling interval T (or number of sampling points m) is available5. 3. The Additive Model For the additive model (1) with a general number N of input quantities the PDFs for the values of X\ and X^ are convolved using the treatment of section 2 to produce the PDF for the value of X\ + AV Then, the PDFs for the values of X\ + X2 and A3 are convolved to produce the PDF for the value of X\ + X2 + X3, and so on. 4. Transformations to Additive Form Consider a measurement model of the linear form (2) for which gxi(€i), with mean Xi and standard deviation u(xi), is the PDF for the input quantity Xi. Define Zi
= ^ff,
i = l,...,N.
U{Xi)
The PDFs for the quantities Zi are obtained from those for Xi by centralizing (by the mean) and scaling (by the standard deviation): they are standard PDFs having mean zero and standard deviation unity. Substituting into expression (2), N
Y = ^Ci
(u(xi)Zi + x^ ,
192
i.e., N
Y = ^Xi,
(6)
»=i
where N
Xi = Ciu{xi)Zu
i = l,...,N,
Y = Y -^2
CiXi.
(7)
i=l
Since expression (6) is in the form of a general additive model, the PDF gyiv) f° r Y i s given by the convolution of the PDFs gx.(€i)- Since the model involves input quantities Xi having zero expectation, an approximation to gY(v) is obtained by evaluating a discrete convolution (5) using the F F T (as in section 2). From expression (7) 6 , 9xMi) =
T^9Zi CiU(Xi)
(
j—:) , i =
l,...,N,
\CiU(Xi)J
which shows how to sample gx.(£i) by sampling the standard PDF gzi(£i)Furthermore, N
y = Y^cixi,
9Y(V)=9Y(V-V):
i=i
which shows how to obtain an approximation to gy {rf) from an approximation to gY{r]). Now consider a measurement model of the generalized linear form (3). Define $i = fi(Xi),
i=
l,...,N,
and $;(&) to be the PDF for the quantity $ j , with mean fa and standard deviation u{
193 (4) the sum of squares of n quantities having standard Gaussian distributions, which gives a \n distribution with v = n degrees of freedom (5) the linear combination of a set of quantities having correlated Gaussian distributions, which gives a Gaussian distribution. Substituting into expression (3), N
i=l
which is in the form of a general linear model and may be treated in the manner described above. The PDFs for the quantities $j are obtained by transforming those for X j . 5. Examples 5.1. Convolution distribution
of a Gaussian
and
rectangular
Take gxii^i) as a standard Gaussian distribution and 5x2(^2) as a rectangular distribution with endpoints ±4. Set u>i = 4 and w^ = 4, giving w = 8, m = 201 and M = 256. Figure 1 shows the values of the functions /iXi(£i) and hx2(&) at the points r?r in the interval [—w, w\. The figure also shows (as points) the approximation to the distribution gy (77) obtained by evaluating the convolution sum (5) using the F F T in the manner described in sections 2-4. For comparison are shown the approximation (scaled frequency distribution) to gy(r]) obtained using the Monte Carlo method (with 100 000 trials), and the result obtained from the GUM uncertainty framework (a Gaussian distribution shown as a continuous curve). The figure shows that there is good agreement between the result returned by the Monte Carlo method and that obtained using convolution. However, 100 000 Monte Carlo trials would appear to be insufficient, even for this simple model, to provide an approximation to the PDF gy{rj) that is as 'smooth' as that obtained from discrete convolution based on 256 points. The PDF for the value of Y is clearly not Gaussian and, consequently, the distribution obtained by the application of the GUM uncertainty framework is quite poor, even though, for the additive model (1), that distribution has the correct expectation and standard deviation. For large coverage probabilities, however, the coverage intervals determined from this distribution will be larger than the intervals determined more correctly using convolution or the Monte Carlo method.
194
ODQGE X,
X,
y
Figure 1. Values of the functions ft-x^Ci) a t the points r;r in the interval [—w, w] (left and middle), and the approximation to the P D F for the value of the output quantity Y obtained by convolution (points), the GUM uncertainty framework (curve), and the Monte Carlo method (scaled frequency distribution) for a two-term additive model.
Figure 2.
As Figure 1, but for the measurement of the temperature of a gauge block.
5.2. Temperature
of a gauge
block
A treatment of the uncertainty evaluation for the calibration of a gauge block is given as Example H.l of the GUM 1 . One of the (input) quantities considered in the treatment is the deviation in temperature from a reference temperature of 20 °C of the gauge block to be calibrated. A (sub-)model of the additive form (1) is used to express the quantity, here denoted by Y = 6, in terms of quantities X\ = 6, the mean temperature deviation, and X2 = A, the cyclic variation of temperature in the room in which the calibration is undertaken. A Gaussian distribution is assigned to the vaue of X\ with mean —0.1 °C and standard deviation 0.2 °C, and a U-shaped distribution to the value of X2 with mean 0 °C and standard deviation 0.35 °C. Set wi = 0.2 x 4 °C and w2 = 0.35 x y/2 °C (where V2 is the semi-width of a standard U-shaped distribution), m = 201 and M = 256. Figure 2 shows the (sampled) functions /ixx(Ci) a n d hx2{&), and approximations to the distribution gy(v) obtained (a) by convolution, (b) by the Monte
195 Carlo m e t h o d (with 100 000 trials), and (c) from t h e G U M uncertainty framework. T h e example was also run with a larger number of sampling points (m = 1001 and M — 1024) to confirm there was no change (at least visually) t o the result of the convolution. There is good agreement between the results obtained using convolution and the Monte Carlo method. In this example, the distribution gyiv) i s bimodal and, consequently, very different from the result provided by the G U M uncertainty framework.
6.
Conclusions
For a model of measurement t h a t is in additive, linear or generalized linear form, the use of the Fast Fourier Tranform ( F F T ) and its inverse to undertake a convolution of probability distributions, provides an approach to obtaining t h e distribution for t h e value of t h e o u t p u t quantity of t h e model. T h e approach compares favourably with t h e G U M uncertainty framework for problems for which it is inappropriate to assume t h e distribution for the o u t p u t quantity is Gaussian (or a scaled and shifted i-distribution). T h e method would appear t o compare favourably with the Monte Carlo method in terms of the number of sampling points used t o undertake a discrete convolution compared with the number of Monte Carlo trials t h a t is necessary t o obtain a 'smooth' approximation t o t h e P D F for the value of t h e o u t p u t quantity.
References 1. BIPM, IEC, IFCC, ISO, IUPAC, IUPAP and OIML. Guide to the Expression of Uncertainty in Measurement. ISBN 92-67-10188-9, Second Edition (1995). 2. BIPM, IEC, IFCC, ILAC, ISO, IUPAC, IUPAP and OIML. Guide to the Expression of Uncertainty in Measurement. Supplement 1: Propagation of distributions using a Monte Carlo method. Technical report, Joint Committee for Guides in Metrology. Draft (2005). 3. M. G. Cox and P. M. Harris. Software Support for Metrology Best Practice Guide No. 6: Uncertainty Evaluation. National Physical Laboratory, Teddington, UK (2004). 4. E. O. Brigham. The Fast Fourier Transform. Prentice-Hall, New York (1974). 5. M. J. Korczynski, A. Hetman and P. Fotowicz. Fast Fourier Transformation: An Approach to Coverage Interval Calculation vs. Approximation Methods. Joint International IMEKO TCI and TC7 Symposium, September 21-24, IImenau, Germany (2005). 6. J. R. Rice. Mathematical Statistics and Data Analysis. Duxbury Press. Second edition (1995).
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 196-203)
DIMENSIONAL METROLOGY OF FLEXIBLE PARTS: IDENTIFICATION OF GEOMETRICAL DEVIATIONS FROM OPTICAL MEASUREMENTS CLAIRE LARTIGUE, FRANCOIS THIEBAUT IUTde Cachan, 9av. Division Leclerc, 94234 Cachan cedex - France LURPA-ENS de Cachan-Universite Paris sud-11 PIERRE BOURDET LURPA-ENS de Cachan-Universite Paris sud-11 61 avenue du Pt Wilson, 94235 Cachan cedex - France NABIL ANWER IUTde Saint-Denis, LURPA-ENS de Cachan-Universite Paris sud-11
This paper deals with an approach to identify geometrical deviations of flexible parts from optical measurements. Each step of the approach defines a specific issue to which we try to give an answer. The problem of measurement uncertainties is solved using an original filtering method, leading to only consider a few number of points. These points are registered on a mesh of the CAD model of the constrained geometry. From finite element simulation of the measuring set-up and of external forces, the shape resulting from deflection can be identified. Finally, geometrical deviations are obtained by subtracting geometrical deflections to measured geometrical deviations.
1.
Introduction
Dimensional metrology of flexible parts is now a challenge, for the knowledge of part geometry is of a great importance in sheet metal assembly simulation. Sheet metal assembly consists in joining together one or more parts the characteristics of which are to be largely flexible and of complex forms. The process is often modelled as four sequential steps [1][2]: - Parts are placed in fixtures. - Parts are clamped. - Parts are joined together. - Assembly is released from the fixture. The simulation allowing understanding how dimensional variations propagate is based on a Finite Element Approach (FEA) which requires the finest definition of part geometry [1][2][3]. In fact, actual geometry deviations may affect boundary conditions as shown in figure 1 [2]: the displacement at the point Ptl
197 not only depends on the displacement ALl but is also affected by the form deviation.
Figure 1. Effect of component shape in simulating the assembly process.
More generally, the part geometry is modelled using a CAD system. Following, surfaces are meshed for the finite element calculation. In addition, real part surfaces are measured on inspection points using a classical measuring system: a CMM equipped with a contact probe. It is important to notice that to make simulation consistent, mesh nodes and inspection data must be coincident [2] [3]. As a result, the geometry of the surface is only acquired in a few measuring points which strongly depend on the initial meshing of the surfaces. This may affect the finite element simulation [2]. Moreover, as it is well-known that part set ups influence flexible plate parts' deflections under their own weight [4], parts are measured in the exact position of the assembly to integrate the weight effect. Therefore, classical methods for dimensional measurements of flexible parts are not yet satisfactory. In this paper we propose an approach for dimensional measurements of flexible large-sized parts such as car doors. The successive steps leading to identifying geometrical deviations of flexible parts from optical measurements are proposed. Indeed, 3D optical digitizing systems are suitable for the measurement of large-sized flexible parts for they allow non-contact measurements and are able to deliver in a relatively short time large clouds of points that are representative of object surfaces. The part is set-up on the CMM regardless of the assembly process. Due to its own weight and the supports, part deformations occur leading to displacements that can be of the same order than the geometrical deviations. Therefore, an identification method must be defined in order to extract geometrical deviations due to manufacturing defects only. However, it is here essential to detail some notions about the geometry of flexible parts as their shape may vary in function of their own weight. 2.
What is the geometry of a flexible part?
The free shape of a part (component of an assembly) is the shape the part should have in the state of weightlessness. As this situation is rarely possible, the shape
198 of a component is generally defined in the use state: when joined with other parts defining an over-constrained assembly subjected to external forces (figure 2). The use state defines the constrained geometry which is the support to the definition of the CAD model. It defines the reference geometry on which finite element calculations are conducted. When free from all the constraints, the shape of the component corresponds to the theoretical free shape: when the constraints are applied to the theoretical free shape, the geometry of the assembled component is identical to the CAD model. This theoretical free shape can be calculated from the FEA. By the way, the actual free shape is not identical to the theoretical free shape for it is not possible to elaborate exact geometry. It includes form deviations. At final, the component is measured defining the actual measured shape which includes deflections due to the part set-up and its own weight. These deflections can easily be simulated by the FEA.
Theoretical constrained shape
Theoreiicai shape under weight only
Theoretical free shape
Actual shape under weight only
I Actual constrained shape
r-*-:=.-.-=,-——?• Actual free shape
Figure 2. Various definitions of the geometry of compliant component for assembly.
The standard ISO 10579 defines two ways of tolerancing flexible parts: tolerancing at the free-state and tolerancing under constraints. Indications must be the following: - Conditions of constraints: assembly, external forces (gravity). - Admissible deviations at the free-state. - Admissible deviations at the constrained state. The free-state does not correspond to the state of weightlessness, considering that the position of the part as regards the direction of the gravity is clearly defined. However, tolerance values are greater in the free-state than in the constrained state which generally corresponds to the use state. This section emphasizes difficulties in defining the shape geometry of flexible parts. Nevertheless, we suppose that a CAD model of the part surfaces exists.
199 3.
Optical measurements of flexible parts
Optical measuring means seem more suitable to flexible part measurements than classical measuring means. The measuring system considered in this work is a CMM equipped with a motorized indexing head PH10 from Renishaw (http://www.renishaw.com), which supports a laser-plane sensor. The part is set-up onto reference support points the position of which is clearly defined within the part frame. Note that the set-up must be non-over constrained allowing locating the frame part within the CMM framework. At this stage, the aptitude of the optical means with respect to dimensional measurements must be analyzed. As the scan planning influences measurement accuracy, the set of relative sensor/surface situations must be chosen so that local uncertainties are minimized [5]. Local indicators of quality can also be associated to points and compared to thresholds [5]. At this stage, the actual measured shape is defined as a large cloud of points which coordinates are defined in the CMM framework, RQMM- The position of the locating points is also well-known in the CMM framework and defined by the rotation matrix M. The points are thus defined in the locating framework Rioc by: OlocMi=0^p
+ MDMi
(1)
The restored cloud of points is discrete, inhomogeneous and dense. As data are further exploited, a pseudo-continuous representation of the points is carried out by a spatial 3D representation based on voxels (3D pixels). A voxel is a cube, the dimension of which is defined so that each cube at least contains a minimal number of measured points M, (figure 3).
one \ o \ e l •',
3-a Position of points and locating
3-b voxel and its attributes
Figure 3. Measured points and voxel representation.
200
However, data must be filtered to remove effect of measure noise. The baricentre of the measured points belonging to each voxel is calculated and the least-square plane is fitted to the points. voxj
0~G
i
= -isl
(2) 1=1
The normal to the plane defines the local normal n ,• at the surface at the Gvoxj. A weight coefficient w, can be attributed to each point in function of its quality indicators. The orientation of each normal is outward material. Note that the definition of the voxel representation strongly depends on the surface mesh used for the FEA. Following, the next step consists in the registration of the points on the CAD model in order to identify the actual free shape. 4.
Identification of the part geometry (free-state)
The CAD model on which points must be registered is meshed according to the required mechanical calculations. Therefore, we have to register two clouds of points, one that results from the measurement and the other one which is linked to the meshing. For this purpose, we take advantage of the possibilities offered by the voxel representation and we use the Small Displacement Torsor (SDT) method [7]. Let Pi be a theoretical point corresponding to a node of the mesh and Nt be the local normal to the surface at the Pt point. The problem is to find the corresponding point within the cloud of all measured points. Along the normal Nt, we move a voxel (cube the size of which is sufficiently small to represent a local neighbourhood of the mesh point) since it encounters a set of measured points. The barycentre of these points is calculated as well as the local normal to the voxel as discussed in the previous section: (G,-, «,•) (figure 4). Let § be the deviatioji between G; and P; projected onto the normal Nt: £,• = PjGj -Nj Let'\pA,fij be the components of the SDT associated to the meshed surface, where DA is the displacement of a reference point and Q is the rotation vector. As the displacement of each point Pt can be expressed using (DA, Q), the optimized deviation can be expressed as follows:
et=Si -Dp •Ni=^i -{DA .Nrf-inVAPO-Nt
(3)
201
Therefore, we have to find the coordinates of '{pA,fi)so minimal [7], where m is the number of mesh points.
m
that W = £ e / »=1
IS
meshed surface Figure 4. Point registration.
As only baricentres are used for point registration, the effect of digitizing noise is clearly decreased. By the way, as all the measured points are conserved, form deviations can be evaluated for a large number of points. The last step concerns the evaluation of geometrical deviations. The effect of part flexibility must be here taken into account. Linearity is assumed here, which leads to linearly separate the different sources of deviations as follows: r^Form-, _ r^Measuredj _ xj&*\
(4)
Measured
where [£ ] i s calculated for each point of the mesh as the deviation between the CAD model and the measured surface according equation 3, and £ def is the deviations due to part deflections. £ def results from the simulation of the mechanical behaviour of the part (under its weight) located on the supports, and thus evaluate the displacements due to part deformation. As the mesh of the structure is obtained from the CAD model which is defined considering the constrained geometry, we have to first simulate the theoretical free shape using FEA. To simulate the mechanical behaviour, the relationship between the displacements and the forces [F] is assumed to be linear: [F] = [K] [U] (5) where [K] is the rigidity matrix established through finite element modelling, assuming the material to be elastic and isotropic; [U] is the displacement vector and [F] is the force vector for all mesh knots [3]. In function of the chosen setup, either forces or displacements are imposed at the reference points. Inversing (5), after the two states are simulated, leads to [£def], the displacement vector associated to the deformation of the part when relaxed from its constrained and
202
set-up for measurement. Note that, in most cases, industrial users prefer to measure in the use state in order to be free from mechanical simulations of the effect of the set-up.
5a boundary conditions on the theoretical shape
5b boundary conditions on the actual shape
Figure 5. Influence of the form deviation on mechanical simulations.
At this stage, the form geometry of the part is known including form deviations. Simulation of the assembly process can be carried out taking into account the actual shape, as illustrated in figure 5, for which measured points are simulated. When boundary conditions are applied onto the actual shape, the mechanical behaviour may strongly vary [3]. The mesh is obviously preserved for both simulations. 5.
Conclusions
Knowing the actual shape of a part which is a component of an assembly is essential to correctly simulate the assembly process. This paper has detailed the steps of an approach we propose to identify geometrical deviations of flexible parts from optical measurements. Each step of the approach leads to a specific problem to which we suggest some solutions. Measuring using an optical digitizing system gives large sets of noisy points. To decrease the effect of digitizing noise, we suggest an original filtering method by defining a voxel representation onto the points. The barycentre of the voxel is calculated as the barycentre of the points belonging to the voxel. This leads to only consider a few number of points which are registered on the node points of the mesh of the CAD model (the constrained geometry) using the SDT method. From finite element simulation of the measuring set-up and of external forces, the shape resulting from deflection can be identified. Finally, geometrical deviations are obtained by subtracting geometrical deflections to measured geometrical deviations. The development of this approach is actually in progress.
203
6.
References
1. Camelio J., Jack Hu S., (2003), Modeling variation propagation of multistation assembly systems with compliant parts, J. of Mechanical design, 125: 673-681 2. Dahlstrom S., Lindkvist L., Sodeberg R., (2005), Practical implications in tolearance analysis of sheet metal assemblies. Experiences from an automotive application. CIRP seminar on Computer Aided Tolerancing, Charlotte (USA), CDRom paper 3. Cid G., Thiebaut F., Bourdet P., Falgarone, H., (2005), Geometrical study of assembly behaviour taking into account rigid components'deviations, actual geometric variations and deformations, CIRP seminar on Computer Aided Tolerancing, Charlotte (USA), CDRom paper 4. Lee E.S., Burdekin M., (2001), A hole-plate artifact design for the volumetric error calibration of CMM, IntJAdv ManufTechnol, 17:508-515 5. Lartigue C , Contri A., Bourdet P., Digitised point quality in relation with point exploitation, Measurement 32, pp. 193-203, 2002 6. ISO Standard 10579 (1993), Tolerancement des pieces non rigides 7. Bourdet P., Clement A., (1976), Controlling a Complex Surface with a 3 axis measuring machine, Annals of the CIRP, 25(1):359-361
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 204-211)
D I S T A N C E SPLINES A P P R O A C H TO I R R E G U L A R L Y D I S T R I B U T E D P H Y S I C A L DATA F R O M T H E B R A Z I L I A N NORTHEASTHERN COAST
S. B. MELO Centro de Informtica - UFPE Recife - PE - Brazil E. A. O. LIMA Depariamento de Estatstica e Informtica - Unicap Recife - PE - Brazil M. C. A. FILHO Departamento de Oceanografia - UFPE Recife - PE - Brazil C. C. DANTAS Departamento de Energia Nuclear - UFPE Recife - PE - Brazil In this paper we propose the investigation of an interpolation method that is both simple and smooth with respect to derivative continuity: Distance Splines. This method is applied to physical data acquired along the Northeastern Coast of Brazil, in different seasons of the year.
1. Introduction The analysis of oceanographic data has a profound impact on governmental policies for adequately dealing with coastal resources. The acquisition of physical and chemical data from different depths in certain sites in the ocean a helps producing a map that supports studies on the distribution of nutrients in both time and space. The map is produced by using interpolating or approximating techniques, which allow estimating nutrients a
This information was collected by group of oceanography of UFPE during the years of 2002, 2003 and 2004
205
information at any point in the convex hull of the sites. The unavoidably irregular distribution of the data along the coast, however, poses some obstacles to the mathematical modeling, which is called scattered data modeling (Nielson7). For interpolation, several methods have been proposed (Franke et al 5 ) as well as some different variants, in order to suit different needs, ranging from the simple Shepard (inverse distance approach) and its modified quadratic version, to sofisticated ones such as some statistical approaches and the minimum norm network for large volumetric data. The presence of land barriers or topographic barriers (Ridgway8) may require adjustments in these methods. The amount of interpolating data is another important primary factor to such methods, but in the case of the available data from the Northeastern coast of Brazil, its order of magnitude (in the hundreds) and its lack of land barriers inside its convex hull allow the use of simpler methods. Figure 1 shows a Delaunay Triangulation of the above mentioned sites.
Figure 1.
The Delaunay Triangulation of the data sites in the Brazilian Northeastern Coast
Smoothness assumptions about the involved physical quantities require derivatives continuity to be taken into account. We propose using Distance Approach to Cubic Splines interpolants, which are C 2 functions (second order derivative continuous), and are simple to implement for not very large data sets. In the next section we present this method and in the following section we discuss all aspects involved in the use of this approach as it is applied to our oceanographic data, and point out other directions we are taking in this ongoing research. 2. Distance Approach to Cubic Splines A bivariate interpolation method for irregularly distributed data through a distance approach has been obtained by considering the "volume spline
206 method" in Nielson7 for modeling volumetric scattered data. The method is here briefly named "distance spline approach". Let us recall the definition of the the cubic spline with interior ordered knots x\,..., xn in [a, b], which is piecewise polynomial of degree three, having only continuity in the second derivative at the junctions x^s of each interval [:EJ,2;, + I] (Greville 9 ). They are well known functions in the approximation field and commonly used in multidimensional data fitting problems. The following cubic spline F that interpolates the n data (xi,fi) and that is characterized by the following conditions, is called natural cubic spline: (1) F, F', F" continuous over [a,b]; (2) F(Xi)=fi, i = l ) 2,...,n; (3) F is piecewise cubic, i.e., F is a cubic polynomial on each interval [xi,xi+1), i = l , 2 , . . . , n - l ; (4) F"(xi) = F"(xn) = 0 and F is linear on [o,x\) and [a;n,6]. These last end conditions are called natural end conditions. In order to apply in the bivariate interpolation case the idea of natural end conditions as in Nielson7, we start by letting F : IR2 —> IR be defined as follows: F(x,y)
= Co + cxx + cyy+
£c;||(a; - xity - yi)\\3
By imposing the corresponding end conditions and by defining Vij = (xi — Xj,yi — yj) the euclidean distance between the points i and j in a plane, we get the following system of equations: 0 1 x2 J/2 1^211
\Vl2\
\Vlr>
0
\v2n
|3
C-x
h
Cy
1 0 0 \0
Xn 0 0 0
yn \\Vnl\\3 | K 2 | 0 1 1 0 xx x2 0 2/1 y2
Cl
0 1 %n
Vn
fn
C2
/
\cnJ
0 0
In Nielson7 it is said that, for the volumetric case, when there are more than 300 data points, the condition number of the coefficient matrix may drop to single-precision calculations for data sets of size n = 300 to 500. The model is adequate for smaller data sets, like the ones used in the present work (48 per depth level).
207
3. Interpolating Oceanographic Data The Distance Splines approach was applied for the first time to the irregularly distributed physical data from the Brazilian Northeastern coast. The available data included Nitrate and Phosphate levels acquired at different sites (stations) on the surface level and at 60 mtrs deep. A Delaunay triangulation was produced from the irregular mesh for each situation, since the locations of the stations varied according to the ocean deepness and season. When the values obtained by the interpolant became negative, they were set to zero. The same was done at points outside the corresponding convex hull. These prevent the final interpolant from being C 2 or even C 1 , even though the functions are C2 in the open areas defined by these borders. Since there is no additional physical information about the real function that describes these nutrients in the above mentioned location, this type of modeling becomes a legitimate approximation. For comparison purposes, we produced a piecewise linear interpolation for each case. Figures 2, 3 and 4 show interpolants for Nitrate levels at surface during Winter, Spring and Summer seasons, respectively. Some differences may be observed in the convex hulls, which is due to the fact that the sites have changed across different seasons, as meterological factors affected the data acquisition process. As we can infer from these figures, Nitrate distribution is strongly dependent on the season of the year, reaching a maximum in dispersion during Winter and Summer seasons, with an average concentration four times higher during the Winter than that observed during the Summer. During Spring Season, the Nitrate concentration is two times higher than during Winter, however it is not as much uniformely distributed. Figures 5, 6 and 7 for Nitrate interpolants at 60 mtrs deep during Winter, Spring and Summer seasons, respectively, show the same general behavior as for surface level: the average concentration is four times higher during the Winter than during than during the Summer. During Spring season, though, the Nitrate distribution is more uniform at 60 mtrs deep than at surface level. In the other Figures, we may observe that the Phosphate interpolants follow the same relative tendency as the Nitrate interpolants at both ocean depths, but with a concentration ten times smaller.
4. Final Remarks

We applied the distance approach to cubic splines to Nitrate and Phosphate data from the Brazilian Northeastern Coast in the Winter, Summer and Spring seasons, at the surface level and at 60 m depth (Nitrate only).
Figure 2. Nitrate P.W. Linear and Splines Interpolants: Surface Level - Winter.
Figure 3. Nitrate P.W. Linear and Splines Interpolants: Surface Level - Spring.
Figure 4. Nitrate P.W. Linear and Splines Interpolants: Surface Level - Summer.
The interpolation around the data points produces much smoother functions, as expected for a C² interpolant (excluding the convex hull borders and the borders where the interpolants are zero valued), but the resulting function presents wild wiggles, and at some sites it assumes negative values, owing to the highly irregular nature of the data points, as can be seen in Figure 1. The interpolants were set to zero at points where negative values were found. Although this ends up limiting their smoothness, it is still physically
Figure 5. Nitrate P.W. Linear and Splines Interpolants: 60 m Deep - Winter.
Figure 6. Nitrate P.W. Linear and Splines Interpolants: 60 m Deep - Spring.
Figure 7. Nitrate P.W. Linear and Splines Interpolants: 60 m Deep - Summer.
acceptable, since there is no additional physical information about the nutrients, or at least it is not currently feasible to produce useful geometric information from what is known. The absence of any information about the real function, other than its unstructured samples, is at the core of scattered data modeling. The observed wiggles may or may not be a sign of a more pronounced divergence from the actual concentrations of Nitrate and Phosphate in the ocean. In order to validate the model, it will be necessary
Figure 8. Phosphate P.W. Linear and Splines Interpolants: Surface Level - Winter.
Figure 9. Phosphate P.W. Linear and Splines Interpolants: Surface Level - Spring.

Figure 10. Phosphate P.W. Linear and Splines Interpolants: Surface Level - Summer.
to compare the values it produces against newly acquired values at the ocean sites. As part of this ongoing research, efforts are being concentrated on extending the basic interpolation by generating a minimal rectangular mesh, from a piecewise linear interpolation, containing the convex hull, and interpolating it by some shape-preserving method, such as Monotone Piecewise Bicubic Interpolation (Carlson et al. [1, 2]) or Shape-Preserving Bivariate Interpolation (Costantini et al. [3]). This scheme turns our method into a global approximation. Some validation techniques will be applied to compare it against other interpolating or approximating methods.
References
1. R. E. Carlson, F. N. Fritsch, SIAM J. Numer. Anal. 22, No. 2, 386 (1985).
2. R. E. Carlson, F. N. Fritsch, SIAM J. Numer. Anal. 26, No. 1, 230 (1989).
3. P. Costantini, F. Fontanella, SIAM J. Numer. Anal. 27, No. 2, 488 (1990).
4. R. Franke and G. Nielson, Int. J. Numer. Meth. Eng. 15, 1691 (1980).
5. R. Franke and G. Nielson, Scattered Data Interpolation and Applications: A Tutorial Survey, in Geometric Modeling: Methods and Their Applications, edited by H. Hagen and D. Roller (Springer, Berlin, 1991).
6. G. M. Nielson, Math. Comp. 40, 253 (1983).
7. G. M. Nielson, IEEE Comput. Graph. Appl. 13, 60 (1993).
8. K. R. Ridgway, J. R. Dunn, J. of Atmosph. and Ocean. Techn. 19, No. 9, 1357 (2002).
9. T. N. E. Greville, J. W. Jerome, I. J. Schoenberg, L. L. Schumaker, R. S. Varga, Theory and Applications of Spline Functions, edited by T. N. E. Greville (Academic Press, New York, 1969).
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 212-220)
DECISION-MAKING WITH UNCERTAINTY IN ATTRIBUTE SAMPLING†

LESLIE R PENDRILL, HÅKAN KÄLLGREN
Swedish National Testing & Research Institute (SP), Box 857, SE-501 15 Borås, Sweden

Wherever there are uncertainties, there are risks of incorrect decision-making in conformity assessment, associated both with sampling and with measurement. The present work considers the evaluation and treatment of uncertainty in decision-making, including for the first time the application of an optimised uncertainty methodology in testing and sampling by attribute. This goes beyond traditional statistical sampling plans in that it assigns costs to both testing and the consequences of incorrect decision-making.
1. Introduction

A principal target of conformity assessment is to evaluate and test the intrinsic error of a type of object/product as manufactured in a certain 'process'. If there is not the time or the resources to perform a 100% inspection of the product, the decision-maker will need estimates of test uncertainty in order to make valid deductions about product conformity. In order to evaluate the required intrinsic product error, estimates of the other sources of error which may affect the entity characteristic - i.e. those associated with sampling and measurement - also have to be made, so that their contribution can be removed from the test result, preferably before assessing conformity. Where these latter errors are known, they should be corrected for as far as possible in the measurement/test result. Remaining uncorrected measurement errors and sampling errors will lead to uncertainty [1]. These uncertainties will in turn lead to certain risks of incorrect decision-making in conformity assessment [2].
† Partially financed by the European Thematic Network "Advanced Mathematical and Computational Tools in Metrology", NICe project 04039 and by grant 38:10 National Metrology of the Swedish Ministry of Industry, Employment and Communication.
This paper considers the evaluation of and allowance for uncertainty in sampling by attribute and in the subsequent decision-making. In both types of acceptance sampling - by attribute [3] or by variable [4] - one can make predictions, for a given sample size, about the probability of incorrectly accepting/rejecting products, using various distributions in calculating probabilities [2, 5]. Of considerable practical and economic importance is to set, and to test compliance of product to, a specified limit on the fraction of non-conforming product. As will be shown, an optimised uncertainty methodology - analogous to that familiar in quantitative decision-making by variable [6, 7] - can be introduced in which the costs of attribute sampling are balanced against the risks and consequence costs of exceeding a specification limit on the fraction of non-conforming product. An example is taken from current work [8 - 11] on uncertainty in decision-making when conformity assessing a wide range of measuring instruments covered by legal metrology and the new EU Measurement Instrument Directive (MID) [12].

2. Optimised uncertainties by balancing costs

2.1. Distribution of instrument error in use

An example is taken from legal metrological monitoring of fuel (diesel and petrol) dispensers as a type of measuring instrument, "Measuring systems for the continuous and dynamic measurement other than water MI-005" [12, Annex MI-005]. A typical distribution of instrument error, based on sample data for fixed octane fuel dispensers (flow < 100 L/min; half max flow) from Swedish legal metrology [13, 14], is illustrated in figure 1. In this example, about 0.6% of the instruments sampled lie outside of specification on either side, that is, in total about p = 1.2% of the 4940 instruments sampled have errors exceeding the maximum permissible error MPE = SL = ±0.5e = ±0.5% volume specification limits.
Figure 1. Distribution of measurement instrument error for fixed octane fuel (diesel and petrol) dispensers (flow < 100 L/min; half max flow). Sample size n = 4940, mean μ = 0e, standard deviation σ = 0.2e, MPE = 0.5e [13]. (Histogram: number of instruments vs. instrument error in scale divisions, e; about 0.6% of instruments lie beyond each specification limit.)
2.2. Modelling the costs of instrument error

A model of the consequence costs of exceeding a specification limit (LSL = MPE here) is illustrated in figure 2, where costs are taken as rising linearly with increasing instrument error from zero, and are evaluated as, for instance:

C_LSL = Cost · ∫_{−∞}^{LSL} x · (1/(√(2π) u_measure)) · exp(−(x − x_m)² / (2 u_measure²)) dx   (1)
Here we assume that the distribution of instrument error, x, is dominated by measurement uncertainties, u_measure, and follows a normal (Gaussian) form (that is, that intrinsic and sampling variations are small in comparison). For this specific example of legal metrological control, with 0.6% of petroleum dispensers non-conforming, the consequence costs are found to be C_LSL = −66 k€ and C_USL = +66 k€, evaluated with Eq. (1), where 'Cost' equals the total annual tax income from petroleum fuel sales nationally, estimated at 4.8·10⁹ €/annum. A meter displaying
a dispensed volume less than the true value (i.e. an error below the LSL) will result in a loss of tax income for the state, and vice versa.

Figure 2. Model of instrument error costs beyond a specification limit (sketch: added costs vs. instrument error x, rising beyond SL_L).
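A rough numerical sketch of the tail integral in Eq. (1) is given below. The sigma and MPE values follow the example in the text, but treating the error as a volume fraction and the normalisation of 'Cost' are assumptions of this sketch, so the figure it prints need not reproduce the −66 k€ quoted above.

    # Consequence-cost integral of Eq. (1): cost grows linearly with the
    # instrument error x over the Gaussian tail below the lower limit.
    import math

    def tail_cost(cost, x_m, u, sl_low, n_steps=20000):
        lo = x_m - 10.0 * u                      # practical lower integration limit
        dx = (sl_low - lo) / n_steps
        total = 0.0
        for i in range(n_steps):
            x = lo + (i + 0.5) * dx
            pdf = math.exp(-0.5 * ((x - x_m) / u) ** 2) / (u * math.sqrt(2.0 * math.pi))
            total += x * pdf * dx                # error weighted by its probability
        return cost * total

    # sigma = 0.2 e = 0.2 % volume, MPE = 0.5 % volume (assumed unit convention)
    C_LSL = tail_cost(cost=4.8e9, x_m=0.0, u=0.002, sl_low=-0.005)
    print(f"C_LSL approx {C_LSL / 1e3:.0f} k-euro")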
2.3. Optimised attribute sampling

It might, in the case of petroleum dispensers, be that a national decision puts a cap on the loss of tax revenue due to under-estimated fuel metering of, say, 80 k€. The corresponding limit on the fraction of non-conforming instruments would thus be SL_p = 0.73%. While the fraction of non-conforming instruments, p = 0.6%, appears here to be less than the specification limit SL_p, unless every instrument in the nation is tested there will be a statistical sampling uncertainty in p and a corresponding risk of exceeding the specification limit on the fraction of non-conforming instruments. (This sampling uncertainty is in addition to other sources of sampling uncertainty, such as those associated with entity heterogeneity [15 - 17].) In order to estimate the uncertainty in these estimates of the number of non-conforming units, one can observe that p can be considered as a random variable following the binomial distribution (shown in figure 3) with mean μ = p and standard deviation σ = √(p(1 − p)/n) [18].
Figure 3. Fraction of non-conforming instruments, distributed binomially.
An optimised attribute sampling uncertainty, u_sample, may then be calculated as that which gives a minimum in the overall costs E_{LSL,attr}, by balancing the consequence costs, C_LSL, against the sampling (test) costs, D, with the following formula:

E_{LSL,attr} = C_LSL · Φ_binomial(d, n, SL_p) + D / u_sample   (2)

where a specification limit, SL_p, is set for the fraction of non-conforming entities exceeding a product error (lower) specification limit, from a sample of size n and an actual observed number of non-conforming entities, d. Consequence costs are modelled as scaling with the integrated (binomial) probability of exceeding the specification limit SL_p.
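A minimal sketch of such a cost balance is given below: the consequence cost is weighted by the binomial probability that the observed fraction non-conforming exceeds SL_p, and a test cost grows with the sample size. The per-instrument test cost is an assumed, purely illustrative number; only p, SL_p and the 66 k€ consequence cost are taken from the text.

    # Cost balance in the spirit of Eq. (2), scanned over the sample size n.
    from math import exp, lgamma, log

    def binom_pmf(d, n, p):
        logc = lgamma(n + 1) - lgamma(d + 1) - lgamma(n - d + 1)
        return exp(logc + d * log(p) + (n - d) * log(1.0 - p))

    def prob_exceed(p, n, sl_p):
        """P(d/n > SL_p) for d ~ Binomial(n, p)."""
        d_limit = int(sl_p * n)
        return sum(binom_pmf(d, n, p) for d in range(d_limit + 1, n + 1))

    def total_cost(n, p=0.006, sl_p=0.0073, c_consequence=66e3, c_test_per_item=20.0):
        # c_test_per_item is an illustrative assumption, not a value from the paper
        return c_consequence * prob_exceed(p, n, sl_p) + c_test_per_item * n

    sizes = range(100, 8001, 100)
    n_opt = min(sizes, key=total_cost)
    print(n_opt, round(total_cost(n_opt)))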
Figure 4. Costs as a function of (a) attribute sampling uncertainty (in %) and (b) attribute sample size (number of meters) for fuel dispensers; the ordinate is the economic loss in €, and the dashed line indicates test costs only. Vertical lines mark the actual sampling uncertainty u_sample (and sample size n_sample), the specification limit SL_p, and the 'MID' sampling plan.
Costs at the actual (standard) sampling uncertainty, u_sample, and sample size, n_sample, are shown by the thin solid vertical line, while the thick solid vertical line indicates the costs corresponding to the specification limit, SL_p = 0.73%, on the fraction of non-conforming instruments. Finally, the dashed vertical line marked 'MID' in figure 4 indicates the sampling uncertainty corresponding to the attribute sampling plans typically used in legal metrology, with AQL = 1% and LQL = 7% [12, Annex F]. Test costs dominate at larger sample sizes, but consequence costs rise more rapidly towards small sample sizes. An optimum sampling size and uncertainty minimise the overall costs at about 100 k€, apparently quite close to the traditional 'MID' attribute sampling plans. Note however that the optimised uncertainty methodology goes beyond traditional statistical sampling
plans in that it assigns costs to both testing and the consequences of incorrect decision-making.

3. Conclusion

The development of general procedures about how to make decisions in conformity assessment in the presence of measurement uncertainty is a topic of international study. It has been shown that an optimised uncertainty methodology can be applied to minimising costs in testing and sampling by attribute, as exemplified for one category of instrument covered by the new EU Measurement Instrument Directive (MID). This economic approach is a complement to traditional sampling plans and the treatment of risks in decision-making. It is hoped that this will stimulate a more harmonised approach to measurement uncertainty in decision-making in the future, in both the instrument standards and normative documents in the context of legal metrology, as well as in the broader conformity assessment area (such as health, safety, environmental protection and fair trading).

4. Acknowledgements

Work performed in the framework of the EU project "SofTools_MetroNet" COP 2, including a Short Stay at the Bundesanstalt für Materialforschung und -prüfung (BAM), Berlin (DE), week 2005/05. Thanks to the cooperation with Drs W Hasselbarth, W Bremser, A Schmidt, W Golze of BAM, as well as Dr K Lindlov of JV (NO), Dr K-D Sommer of LMET (DE) and Ms K Mattiasson, SP (SE).

References
1. K Birch and A Williams, "A survey of existing standards in Product and Equipment Conformity", draft report, British Measurement and Testing Association (2002)
2. AFNOR, "Metrology and statistical applications - Use of uncertainty in measurement: presentation of some examples and common practice", FD X07-022 (2004)
3. ISO 2859, "Sampling procedures for inspection by attributes", ISO, Geneva (CH) - under revision (in 10 parts) (1999)
4. ISO/FDIS 3951-1, "Sampling procedures for inspection by variables - Part 1", ISO/TC69/SC 5 - under revision (2005)
5. W Bremser and W Hasselbarth, "Uncertainty in semi-qualitative testing", in Advanced Mathematical and Computational Tools in Metrology VI (P Ciarlini, M G Cox, F Pavese & G B Rossi, eds.), pp 16-23, World Scientific Publishing Company (2004)
6. M Thompson and T Fearn, Analyst 121, 275-8 (1996)
7. M H Ramsey, J Lyn and R Wood, Analyst 126, 1777-83 (2001)
8. NICe/Nordtest project "Measurement uncertainty in legal metrology", http://www.nordtest.org/projects/fqapri.htm, project 04039 co-ordinated by SP (2004)
9. WELMEC, "Guide to Measurement Uncertainty & Decision-making in Legal Metrology: Measurement Instrument Directive", Draft, WELMEC WG4 (2005)
10. H Kallgren, K Lindlov and L R Pendrill, "Uncertainty in conformity assessment in legal metrology (related to MID)", International Metrology Congress, Metrologie 2005, Lyon (FR), 20-23 June (2005)
11. L R Pendrill, B Magnusson and H Kallgren, "Exhaust gas analysers and optimised sampling, uncertainties and costs", CEN-STAR Brussels workshop "Chemical and Environmental Sampling", April 14-15 2005, and Accred. Qual. Assur. (submitted) (2005)
12. MID "Measurement Instrument Directive", EU Commission PE-CONS 3626/04, MID 2004/22/EC (2004)
13. R Ohlon, "Error statistics for fuel meters and weighing instruments monitored 1982", SP Technical Report 1983:05, ISSN 0280-2503 (in Swedish) (1983)
14. L Carlsson, "Error statistics for fuel meters and weighing instruments monitored 1982/83", SP Technical Report 1983:48, ISSN 0280-2503 (in Swedish) (1983)
15. AMC, "What is uncertainty from sampling, and why is it important?", Analytical Methods Committee background paper, Royal Society of Chemistry [www.rsc.org] (2004)
16. M Thompson, Accred. Qual. Assur. 9, 425-6 (2004)
17. EURACHEM/EUROLAB/CITAC, "Estimation of measurement uncertainty arising from sampling", 1st draft Guide, August 2004
18. D C Montgomery, "Introduction to statistical quality control", J Wiley & Sons, ISBN 0-471-30353-4 (1996)
19. R Kacker, N F Zhang and C Hagwood, Metrologia 33, 433-45 (1996)
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 221-228)
COMBINING DIRECT CALCULATION AND THE MONTE CARLO METHOD FOR THE PROBABILISTIC EXPRESSION OF MEASUREMENT RESULTS

GIOVANNI B. ROSSI, FRANCESCO CRENNA
University of Genova - DIMEC, Via all'Opera Pia 15a, 16145 Genova, Italy

MAURICE COX, PETER HARRIS
National Physical Laboratory, Hampton Road, Teddington TW11 0LW, UK

We consider combining the direct calculation of probability distributions and the Monte Carlo method for uncertainty evaluation in the case of a measurement described by a series of repeated observations obtained from a single measuring instrument. We propose dealing with random effects by direct calculation, and using the Monte Carlo method for combining the probability distributions assigned to systematic effects. We demonstrate the application of this approach to the example of gauge block calibration (Example H.1) presented in the Guide to the Expression of Uncertainty in Measurement.
1. Introduction

The Guide to the Expression of Uncertainty in Measurement (GUM) [1] is rooted in probability theory: for example, it describes (so-called) Type A and Type B evaluations of uncertainty using recognized interpretations of probability. A forthcoming supplement to the GUM, on the propagation of distributions, is also based on probability theory, but more strongly so [2]. Direct calculation [3, 4] of joint and conditional probability distributions provides a flexible way for dealing with random variations affecting a measurement, allowing the merging of prior information, possibly from previous calibration or repeatability tests, with repeated observations. Moreover, it permits a straightforward treatment of the effects of finite resolution of a measuring device, as well as particular kinds of non-linearity, such as hysteresis. The Monte Carlo method [5]-[7] provides a numerical procedure to propagate probability distributions through a (deterministic) model. It is particularly flexible and effective in dealing with systematic variations affecting
a measurement, especially in cases where the measurement depends non-linearly on such effects. In this paper we consider combining the direct calculation of probability distributions and the Monte Carlo method in the case of a measurement described by a series of repeated observations obtained from a single measuring instrument. We propose using the Monte Carlo method to combine, through a deterministic model, the distributions assigned to systematic effects, and then to use direct calculation to combine, through a probabilistic model, the Monte Carlo result with information about random effects, including prior information and repeated observations. We present a general probabilistic framework for describing a measurement process (section 2), and its application to the case of a measurement described by a series of repeated observations (section 3). The implementations of the methods of direct calculation and Monte Carlo are summarized (section 4). The application of the proposed approach to the calibration of a gauge block is described (section 5).

2. Probabilistic Framework

In this section we present a probabilistic framework that provides a compact description of a wide class of measurement processes [3]. Although the framework may be used to describe vector-valued measurement processes, we will consider here only scalar processes. Let x ∈ X be the value of the measurand, y ∈ Y a vector of indications of the measuring instrument, and θ ∈ Θ a vector of parameters influencing the measurement process. Then, the probabilistic model relating x and y may be expressed by the parametric conditional probability distribution p(y | x, θ). This distribution describes the process of observation. When y is observed, a probability distribution may be assigned to the value of the measurand, assuming a constant prior distribution for x, of the form

p(x | y) = ∫_Θ p(y | x, θ) [ ∫_X p(y | x, θ) dx ]⁻¹ p(θ | y) dθ   (1)
and the measurement result x̂ defined as x̂ = E(x | y). The distribution defined in formula (1) describes the process of restitution. Finally, the overall measurement process may be described by
In such a derivation we consider that, for each given x, x̂ is a function of y, which in turn is a random variable, conditioned upon x. So we have obtained this formula by propagating the
p(x̂ | x) = ∫_Y δ[x̂ − E(x | y)] [ ∫_Θ p(y | x, θ) p(θ) dθ ] dy.   (2)
For ease of reference, a summary of the main formulae of the framework, for a scalar measurand, is presented in Table 1. It is important to note that we are using here a notation different from that in the GUM. In that document the "estimate of the measurand value" is denoted by y (since it is viewed as the output of the evaluation process), whereas here it is denoted by x̂ (since x is viewed as the input to the observation process and x̂ is its estimate).

Table 1. Probabilistic framework for a scalar measurand.

Process/sub-process | Probabilistic model
Observation  | p(y | x, θ)
Restitution  | p(x | y) = ∫_Θ p(y | x, θ) [ ∫_X p(y | x, θ) dx ]⁻¹ p(θ | y) dθ
Measurement  | p(x̂ | x) = ∫_Y δ[x̂ − E(x | y)] [ ∫_Θ p(y | x, θ) p(θ) dθ ] dy

x ∈ X: the (scalar) measurand; y ∈ Y: N-dimensional observation vector; x̂ ∈ X: the measurement result; θ ∈ Θ: K-dimensional parameter vector.
3. Repeated Observations

Consider a series of repeated observations {y_t, t = 1, ..., N}, collected in a vector y = [y_1, y_2, ..., y_N], and described by the following stochastic model:

y_t = k(v) x + b(v) + w_t,   (3)
where (i) (k, b) is a pair of parameters which in turn depends upon a vector of random variables v, which may account for calibration uncertainty as well as for other multiplicative and additive effects, and (ii) w_t is a sample of a Gaussian white noise process, with p(w) = σ⁻¹ φ(w/σ), φ denoting the standard Gaussian density. The probabilistic model of observation is then given by

p(y | x, k, b, σ) = ∏_t p_w(y_t − kx − b) = ∏_t (1/σ) φ((y_t − kx − b)/σ),   (4)
and that of restitution by

p((x, σ) | y, k, b) ∝ p(σ) ∏_t (1/σ) φ((y_t − kx − b)/σ),   (5)
after assuming a constant prior distribution for x. Finally, we get

p(x | y) = ∫ p((x, σ) | y, k, b) p(k, b) dσ dk db.   (6)
In the proposed approach, the final result is obtained numerically using a two-step procedure:
1. calculate p(k, b) by propagating the distributions assigned to the components of v through the functional models for k(v) and b(v) by the Monte Carlo method;
2. directly calculate p(x | y) using formulae (4)-(6).
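A minimal sketch of step 1 is given below. The influence quantities, their distributions and the models k(v) and b(v) are placeholders chosen for the sketch; they do not describe any particular instrument.

    # Step 1: propagate illustrative distributions for the components of v
    # through assumed functional models k(v) and b(v) by Monte Carlo.
    import numpy as np

    rng = np.random.default_rng(1)
    M = 200_000                                   # number of Monte Carlo trials

    gain_error = rng.normal(0.0, 1e-4, M)         # multiplicative calibration effect (assumed)
    offset_cal = rng.normal(0.0, 0.02, M)         # additive calibration effect (assumed)
    drift      = rng.uniform(-0.01, 0.01, M)      # rectangular additive effect (assumed)

    k_samples = 1.0 + gain_error                  # k(v)
    b_samples = offset_cal + drift                # b(v)

    # the joint sample {(k, b)} approximates p(k, b) for use in step 2
    print(k_samples.mean(), k_samples.std(ddof=1))
    print(b_samples.mean(), b_samples.std(ddof=1))
    print(np.corrcoef(k_samples, b_samples)[0, 1])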
4. Supporting Software

In the previous section we described how a measurement process described by a set of repeated observations might be treated by a combination of direct calculation and the Monte Carlo method. We consider in more detail how such methods work, in order to appreciate better the benefit of their combination.

4.1. Direct calculation

In direct calculation we treat each random variable as a discrete variable on a finite domain. In this case all calculations can be carried out explicitly. Although the approach is inefficient for a general non-linear model, for a linear combination of random variables it is very efficient, since the calculation then reduces to a series of convolutions, which may be implemented using the Fast Fourier Transform (FFT) [11]. Direct calculation is particularly useful for dealing with the random variations associated with repeated observations. In this case the approach is perhaps more flexible than the Monte Carlo method since it provides a posterior distribution for the standard deviation associated with the repeated observations. The approach is also useful for dealing with particular kinds of non-linearity, such as quantisation and hysteresis [4].
Let q be the resolution (quantisation interval) of the indications y_t, and x, k, b and σ discrete random variables defined over a finite range. Let the distributions P(x), P(k, b) and P(σ) be given. Then

P(y_t | x, k, b, σ) = Φ[(y_t + q/2 − kx − b)/σ] − Φ[(y_t − q/2 − kx − b)/σ]   (7)

and

P(y | x, k, b, σ) = ∏_t P(y_t | x, k, b, σ).   (8)

It follows that

P(x, k, b, σ | y) = [P(y)]⁻¹ P(y | x, k, b, σ) P(x) P(k, b) P(σ),   (9)

where

P(y) = Σ_{x,k,b,σ} P(y | x, k, b, σ) P(x) P(k, b) P(σ).   (10)
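A coarse numerical sketch of this discrete posterior is given below; for brevity it fixes k = 1 and b = 0 (the full calculation would also grid over k and b), and the indications, grids and priors are illustrative only.

    # Direct calculation of Eqs. (7)-(10) on a coarse (x, sigma) grid.
    import numpy as np
    from math import erf, sqrt

    def Phi(z):
        return 0.5 * (1.0 + erf(z / sqrt(2.0)))

    y = np.array([10.02, 10.04, 10.03, 10.05, 10.03])   # illustrative indications
    q = 0.01                                            # resolution of the indications

    x_grid = np.linspace(9.95, 10.12, 120)              # values of the measurand
    s_grid = np.linspace(0.005, 0.05, 60)               # values of sigma
    P_x = np.ones_like(x_grid) / x_grid.size            # constant priors
    P_s = np.ones_like(s_grid) / s_grid.size

    post = np.zeros((x_grid.size, s_grid.size))
    for i, x in enumerate(x_grid):
        for j, s in enumerate(s_grid):
            like = 1.0
            for yt in y:                                 # Eqs. (7)-(8)
                like *= Phi((yt + q/2 - x) / s) - Phi((yt - q/2 - x) / s)
            post[i, j] = like * P_x[i] * P_s[j]          # Eq. (9), unnormalized
    post /= post.sum()                                   # Eq. (10) as normalization

    P_x_given_y = post.sum(axis=1)                       # marginal for x
    x_hat = float(np.sum(x_grid * P_x_given_y))
    print(x_hat)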
Direct calculation has a number of features, including that it is possible (a) to combine prior information about σ, obtained through calibration, with fresh information arising from repeated observations in an optimal way, (b) to obtain a distribution for σ, and (c) to plot intermediate joint and conditional probability distributions, which may be useful for checking purposes and for education. Direct calculation has been implemented in the UNCERT software package developed at the University of Genova [4].

4.2. The Monte Carlo method

Given input quantities (here, v) and a (deterministic) model relating the input quantities to an output quantity (here, k or b or both), the Monte Carlo method, as an implementation of the propagation of distributions, constitutes a numerical procedure for obtaining an approximation to the probability distribution for the output quantity k or b, or the joint probability distribution for k and b. The Monte Carlo method is based on undertaking a large number of trials, each of which consists of randomly sampling from the probability distributions for the input quantities and forming the corresponding model value. An approximation to the probability density function is obtained as a scaled frequency distribution constructed in terms of the model values. The Monte Carlo method has a number of features, including that the method makes no assumption about the nature of (a) the model, e.g., that it is linear or mildly non-linear, (b) the input quantities, e.g., that no one dominates,
and (c) the output quantity, e.g., that it is described by a Gaussian distribution. Information about implementing the Monte Carlo method is available [5]-[7].

5. A Case Study: Gauge Block Calibration

We consider the calibration of a gauge block, an example treated in the GUM [1, Example H.1], and also to be included in the forthcoming supplement to the GUM. Various treatments of this example are available [1, 8]. The length of a gauge block is determined by comparing it with a standard of the same nominal length x_0 = 50 mm. Let the difference in temperature between the gauge blocks be denoted by δθ = θ − θ_s, the difference in their coefficients of thermal expansion by δα = α − α_s, and the length of the standard by l_s = l_0 + δl, where l_0 = 50 000 623 nm. The deviation θ of the temperature of the gauge block being calibrated from the reference temperature of 20 °C is expressed as θ = θ̄ + Δ, where Δ describes a cyclic variation of the mean temperature deviation θ̄ from 20 °C. Moreover, suppose N repeated observations y_t of the difference in lengths are made, with δd an unknown systematic deviation affecting the comparator and w_t a random deviation affecting the indication. Then, after neglecting some higher order terms, the model describing an observation is

y_t = x − (l_0 + δl)(1 − δα(θ̄ + Δ) − α_s δθ) + δd + w_t.   (11)

If we collect all the variables giving rise to a systematic deviation in a vector v = [δl, δd, α_s, θ̄, Δ, δα, δθ], we may rewrite the observation model as in formula (3), with k(v) = 1 and

b(v) = −(l_0 + δl)(1 − δα(θ̄ + Δ) − α_s δθ) + δd.   (12)
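A Monte Carlo propagation of b(v) through Eq. (12) might look as follows. The distributions assigned below are rough, clearly labelled stand-ins, not the exact assignments of GUM Example H.1, so the output is only indicative of the mechanics of step 1.

    # Propagate illustrative distributions for the components of v through b(v).
    import numpy as np

    rng = np.random.default_rng(0)
    M = 500_000
    l0 = 50_000_623.0                          # nm, length of the standard

    dl      = rng.normal(0.0, 25.0, M)         # nm, calibration of the standard (assumed)
    dd      = rng.normal(0.0, 7.0, M)          # nm, comparator systematic deviation (assumed)
    alpha_s = rng.uniform(9.5e-6, 13.5e-6, M)  # 1/degC, expansion coeff. of the standard (assumed)
    theta   = rng.normal(-0.1, 0.2, M)         # degC, mean temperature deviation (assumed)
    Delta   = 0.5 * np.sin(rng.uniform(0.0, 2*np.pi, M))  # degC, cyclic variation (assumed)
    d_alpha = rng.uniform(-1e-6, 1e-6, M)      # 1/degC, difference of expansion coeffs. (assumed)
    d_theta = rng.uniform(-0.05, 0.05, M)      # degC, temperature difference (assumed)

    # Eq. (12): b(v) = -(l0 + dl)(1 - d_alpha*(theta + Delta) - alpha_s*d_theta) + dd
    b = -(l0 + dl) * (1.0 - d_alpha * (theta + Delta) - alpha_s * d_theta) + dd
    print(b.mean(), b.std(ddof=1))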
Now, in the GUM presentation of this example it is assumed that the instrument has previously undergone a repeatability test, based on M = 25 repeated observations, on the basis of which the standard deviation of the comparator has been evaluated as σ_0 = 13 nm. Moreover, it is assumed that the instrument indications consist of N = 5 repeated observations, but only the mean of the 5 observations (and not their standard deviation) is used. We consider here how to use all the available information. To do so, we assume the behaviour of the instrument during the measurement is the same as that during the repeatability test. This hypothesis may be checked using the available data: we calculate the standard deviation σ of the 5 observations and use a significance
test to check whether σ is consistent with σ_0.

Figure 1. Calculation results for the gauge-block example (panels compare the linear and non-linear treatments; probability density vs. x − x_0 in nm).
In Fig. 1c-1d we present the final distribution p(x − x_0 | y) and the posterior distribution for the standard deviation p(σ | y).
The example demonstrates the flexibility of the method and its ability to make optimal use of all the available information. The possibility of obtaining separately all intermediate distributions is also a desirable feature, for both checking purposes and for education.

6. Conclusions

In this paper we have proposed a general numerical approach to the probabilistic expression of measurement results. It is based on combining the direct calculation of (absolute and conditional) probability distributions with the Monte Carlo method. We have shown how this approach allows us to deal with repeated observations from a single measuring instrument, subject to general (additive and multiplicative) systematic and random effects, making optimum use of all available information. A complete visualization of all uncertainty sources and their interaction is also possible. We propose studying the application of the method to the case of measurement from multi-channel systems, as a forthcoming development.

References
1. BIPM, IEC, IFCC, ISO, IUPAC, IUPAP and OIML, Guide to the Expression of Uncertainty in Measurement (1995). ISBN 92-67-10188-9.
2. M G Cox, P M Harris, Accreditation and Quality Assurance 8 (2003), 375-379.
3. G B Rossi, Measurement 34 (2003), 85-99.
4. G B Rossi, F Crenna, M Codda, in M G Cox, F Pavese, D Richter, G B Rossi eds, Advanced Mathematical and Computational Tools in Metrology VI, World Scientific (2004), 106-121.
5. M G Cox, P M Harris, B R L Siebert, Izmeritelnaya Technika, 9 (2003), 9-14.
6. M G Cox, P M Harris, Software Support for Metrology Best Practice Guide No. 6: Uncertainty Evaluation, NPL, 2004.
7. M G Cox, P M Harris, NPL Report 40/04, 2004.
8. C M Wang, H K Iyer, Metrologia 42 (2005), 145-153.
9. S D Silvey, Statistical inference, Chapman and Hall, London, 1975.
10. J Gill, Bayesian methods, Chapman and Hall, Boca Raton, 2002.
11. M J Korczynski, M G Cox, P M Harris, this book.
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 229-236)
IMET - A SECURE AND FLEXIBLE APPROACH TO INTERNET-ENABLED CALIBRATION AT JUSTERVESENET
ASMUND SAND, HARALD SLINDE
Justervesenet, Kjeller, Norway
Justervesenet, the Norwegian Metrology Service, has an ongoing project aiming to create a general, secure, and distributed instrument control system using the Internet as a transport medium for control signals and results. The iMet system is presented here, where an operator may control multiple electrical instruments via the Internet regardless of the location of the instruments. The system will initially be used in an Internet-enabled calibration service, where Justervesenet sends a transfer normal to a customer and then controls the calibration of the Unit-Under-Test directly in the customer's laboratory.
1. Introduction

Recent advances in Internet technologies and PC-based instrumentation open new possibilities for Internet-enabled metrology. The ability to operate an electrical instrument remotely via the Internet introduces cost benefits compared to the traditional way of co-locating operator and instrument. Most of the published work on Internet-enabled calibration and distributed measurement systems has a traditional client-server approach. Either the client controls instruments connected to a web server [1-8], or the client downloads procedures from a server to be run locally [9-10]. Some systems also use symmetrical protocols to obtain bi-directional communication between the server and the client [11], though they may not communicate bi-directionally through all firewalls and proxy servers, which might only support the asymmetric hypertext transfer protocol (HTTP). Today's electrical instruments are often expensive and can be sensitive to environmental changes, and should be transported in a secure way with great care. Traditionally, uncertainty due to transportation to and from a calibration laboratory must be added. If instead a specialized transfer normal were sent to the instruments, and the calibration process remotely controlled by a calibration laboratory, this added uncertainty could be taken care of by the calibration laboratory.
Introducing Internet-enabled calibration in this way, more accurate calibrations could also be obtained, because the instrument is calibrated under its normal working conditions. Proper knowledge of the properties of the transfer normal is required. Time and money can also be saved, since the Unit-Under-Test (UUT) is out of operation for a shorter time.

2. System Overview

The iMet system consists of multiple client computers and one public web server. The clients may communicate seamlessly with each other via the server over secure, full-duplex channels, using the hypertext transfer protocol over secure sockets (HTTPS). By using such channels, potential problems concerning the traversal of firewalls, proxy servers and network address translators (NATs) are bypassed, because the HTTP packets are allowed through as long as the connections are initialized from the inside. All clients are authenticated to the server with server-signed X.509 client certificates. The server authenticates itself to the clients using an X.509 server certificate signed by a trusted third-party certificate authority. After connecting to the server, a client may offer his/her instruments as services to an external operator. That means the operator may use the client's instruments as if they were connected locally to the operator's computer. The operator is then able to run measurement procedures locally, which operate on the remote instruments directly. The system architecture is shown in Fig. 1.

Figure 1. The iMet system overview. The operator may operate instruments remotely via the public web server (instrument commands travel over HTTPS along a virtual path from operator to instrument owner). The instruments seem locally connected to the operator's computer.
2.1. Internet Communication

To operate instruments remotely, an underlying system is needed for transferring the operating signals and results between the operator and the
instruments. The iMet system uses Microsoft's .NET Remoting, due to its flexibility in the choice of communication protocols, object serialization and remote method invocation. .NET Remoting is an object-oriented middleware, enabling objects separated by a network to communicate as if they belonged to the same process on a computer. When operating on a remote object, a local proxy representation of the object is needed. When calling an object method on such a local proxy, the call is serialized and sent across the Internet to the real object, where the real method gets called. The method reply is returned in a similar fashion. When using a full-duplex HTTP channel, all clients initially set up two HTTP connections to the server, SENDER and LISTENER. The LISTENER connection is in a pending HTTP request mode. All method invocations from a client to the server are sent in HTTP request packets using the SENDER connection. Similarly, all method invocations from the server to the clients are sent in HTTP responses on the LISTENER connection. The result is that the client and the server may send data to each other simultaneously, without inefficient client polling. Two clients may communicate with each other in a two-step process where the server forwards data from the one to the other using the appropriate connections.

2.2. Operating the Instruments

The physical instruments connected to a client are also partially represented in software. These software instruments contain instrument-specific syntax information and real instrument state information. The software instruments are also used when a measurement procedure communicates with physical instruments. All computers involved in the system implement the InstrumentController interface, which defines three instrument communication methods, "Read", "Write", and "Query". The different method implementations are linked together in such a way that, regardless of which method implementation gets called, the call ends up at the correct instrument. This process is shown in Fig. 2. The system uses LabVIEW and NI VISA for low-level instrument communication. NI VISA is National Instruments' implementation of the Virtual Instrument Software Architecture (VISA), which is a standard for programming instrumentation systems comprising several hardware interfaces, such as GPIB, RS232, and USB. At present, there is no standardized and secure method to authenticate the individual instruments using software. As more and more measurements may be
performed remotely, this ought to be of concern for the instrument manufacturers.
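To make the forwarding idea concrete, a small language-agnostic sketch of the Read/Write/Query interface is given below. The real iMet system is built on .NET Remoting, LabVIEW and NI VISA; the Python classes, method names and the simulated instrument here are purely illustrative.

    # Sketch of the InstrumentController idea: the same three methods exist on
    # every node, and the operator's calls are relayed towards the real device.
    from abc import ABC, abstractmethod

    class InstrumentController(ABC):
        @abstractmethod
        def write(self, resource: str, command: str) -> None: ...

        @abstractmethod
        def read(self, resource: str) -> str: ...

        def query(self, resource: str, command: str) -> str:
            self.write(resource, command)
            return self.read(resource)

    class LocalController(InstrumentController):
        """Talks to physically connected (here: simulated) instruments."""
        def __init__(self):
            self._last = {}
        def write(self, resource, command):
            self._last[resource] = command
        def read(self, resource):
            return f"reply to '{self._last.get(resource, '')}' from {resource}"

    class ForwardingController(InstrumentController):
        """Stands in for the operator's proxy: every call is relayed onwards
        (in the real system, over an HTTPS channel via the server)."""
        def __init__(self, remote: InstrumentController):
            self._remote = remote
        def write(self, resource, command):
            self._remote.write(resource, command)
        def read(self, resource):
            return self._remote.read(resource)

    owner = LocalController()
    operator = ForwardingController(owner)
    print(operator.query("GPIB0::22::INSTR", "READ?"))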
Fig. 2. Instrument operation overview: the operator's, server's and owner's computers each implement the InstrumentController interface (Read, Write, Query); instrument commands travel over HTTPS along a virtual path from the operator's calibration procedures, via the server, to the owner's registered instruments and instrument interfaces, and finally to the physical instruments. Any computer that implements the InstrumentController interface can operate the owner's instruments and run calibration procedures, including the owner. It is possible, though not recommended, to run the calibration procedures on the server directly.
2.3. Time Delays

Operating instruments via the Internet, one will always experience some time delays. Generally, time could be saved by bundling commands together before sending, compared to sending single commands one at a time. An example would be to operate a digital multimeter remotely to do high-resolution DCV readings from an electrical calibrator. The measurement procedure runs locally at the operator's PC. Three bundled commands are needed to set up the instruments. One also needs to wait between 1 and 10 minutes to stabilize the readings. The multimeter might need between 5 and 15 seconds integration time per reading. If the time delay is 1 second on average, the effect of the delay is normally less than 10%. If there is some waiting time between each measurement, it is possible to compensate for the delay times, thus reducing the effect to almost zero. It is also possible to make block readings
on some multimeters, where a number of readings are made consecutively and stored inside the instrument before being returned to the caller, thereby effectively reducing the delay effect to zero. But this reduces the visibility of the process to the operator; the operator initiates the block reading, but he cannot see the details of the measurements before the block is finished. One could also download the calibration procedure, or parts of it, to the computer connected to the instruments, and compile and run it there. This would also reduce the delay effect to zero. The operator might then gain insight into the measurement process if the instrument computer sent him update information. But this would mean that the operator has less control if an unpredicted exception occurs in the procedure or in the data connections, and potential reading data could be lost.

3. Measurement and Calibration Procedures

The iMet system could be used as a general instrument control system, where an operator operates an instrument remotely using single instrument commands. More efficiently, one could construct more complex measurement or calibration procedures for automatically performing lengthy and complicated measurements. The iMet system stores such procedures in a database. The procedures are downloaded, compiled in memory and run when requested, and they may be run on any of the computers implementing the InstrumentController interface. This means that new procedures may be added to the database without the need for recompilation of all system components. The procedures operate on instrument interfaces, thus allowing the same procedure to be used for different instruments implementing the same interface. Such procedures were used when performing the Internet-enabled calibration referenced later in this chapter.

4. Measurement Scenarios

There are different ways the iMet system could be utilized. The operator may sit at any computer with an Internet connection, and he may control any number of measurement or calibration processes at the same time. The important thing is that the operator's location is completely independent of the location of the instruments.

4.1. Operator and Instruments Located at the Same Computer

When a client wants to perform a local measurement, but does not want to or does not know how to write the instrument drivers or measurement procedures himself,
he may use the iMet system to download, compile and run the correct procedure. This way he can be sure of the correctness of the procedure and the generated measurement data. Because the procedures are potentially available from any computer with an Internet connection, the client does not need to store the procedures locally. If many clients in a company perform the same measurements locally, they can be sure of the consistency of all the measurements performed, if all use the iMet system.

4.2. Operator and Instruments Located at Different Computers

When an inexperienced client wants to perform a local measurement, he may outsource the operator function to an external and trusted expert. The expert may use the client's instruments as if they were connected locally, and he may run the measurement procedures on his own computer. This scenario could be used in Internet-enabled calibration, where a National Metrology Institute (NMI) wants to remotely calibrate one of its customers' instruments. The NMI first sends a calibrated transfer standard to the customer. The customer then logs on to the public web server, and hands over the control of the instruments to the NMI. The NMI authority may then download, compile and run the correct calibration procedure, depending on which instruments are connected to the customer. He may also do initial testing on the instruments by sending single instrument commands. The results are stored, and a calibration certificate may be produced after the transfer standard is recalibrated.

5. Internet-Enabled Calibration Test

In June 2005, Justervesenet performed an extensive Internet-enabled calibration test in collaboration with one of their customers in Stavanger, Norway. Two digital multimeters (DMMs), a Fluke 8508A and an HP 3458A, were calibrated and sent to the customer's laboratory, where they were used to do comparative measurements of a multifunction calibrator, a Fluke 5520A. The process was initiated and controlled directly from Justervesenet. Five types of calibrations were performed: DCV, DCI, ACV, ACI and OHM. The same calibration procedure could be used for both multimeters, because they implemented the same instrument interface, as explained before. Only one procedure was used for all calibrations, and the logical structure of the procedure was as follows. The calibrator and the multimeter were set up to generate and measure the correct signal for each calibration point. A certain amount of time was added between each point (for stabilization) and the readings were made equally spaced in time.
The whole process was controlled directly from a laptop computer at Justervesenet. The operator first did some initial testing on the instruments to verify correct behavior. He then downloaded the correct calibration procedure, compiled it in memory and ran the procedure locally. The operator at Justervesenet and the customer used a phone to communicate verbally. The customer handled all instruments and needed guidance on how to connect them to each other and to the computer.

5.1. The Results

The communication between the operator and the instruments worked perfectly. No significant time delays were observed when running the calibration procedure locally at Justervesenet. Because the procedures and the calibration points were stored in a database, it was easy to modify them without the need to restart the applications. When the DMMs were returned to Justervesenet and recalibrated, the results showed that there were slight changes in their calibration values. From the analysis so far, it is difficult to say if this was due to natural drift or to transportation. More analysis of and experience with the DMMs is needed. An example of the change of calibration values can be seen in Table 1.

Table 1. Change of AC calibration values for the HP 3458A and Fluke 8508A after transport.
Frequency (Hz) | Nominal (mA) | Change (mA), HP 3458A | Change (mA), Fluke 8508A
45    | 3.29    | -0.000150 | -0.000100
1000  | 3.29    | -0.000150 | -0.000120
10000 | 3.29    | -0.000190 | -0.000120
1000  | 0.19    | -0.000009 | -0.000002
1000  | 190.00  | -0.005700 | -0.001600
1000  | 1000.00 | -0.064000 | -0.016000
6. Conclusions

The iMet system seems well suited for Internet-enabled metrology and for operating instruments remotely via the Internet. Internet-enabled calibration requires detailed knowledge of the properties of the transfer standard to be used. If one knows the historic behavior of the standard, it is possible, to some degree, to predict the future behavior. This is
important when calculating transport uncertainties for the standard after performing calibrations via the Internet. A web camera should also be integrated with the system, so that the operator can see the instruments.

References
1. F. Pianegiani, D. Macii and P. Carbone, IEEE Trans. Instrum. Meas., vol. 52, pp. 686-692, June 2003
2. M. Bertocco, F. Ferraris, C. Offelli and M. Parvis, IEEE Trans. Instrum. Meas., vol. 47, pp. 1143-1148, October 1998
3. W. Winiecki and M. Karkowski, IEEE Trans. Instrum. Meas., vol. 51, pp. 1340-1346, December 2002
4. M. Bertocco, M. Parvis, IMTC, vol. 2, pp. 648-651, May 2000
5. K. Michal, W. Wieslaw, IMTC, vol. 1, pp. 397-402, May 2001
6. H. Ewald, G. Page, IMTC, vol. 2, pp. 1427-1430, May 2003
7. T. A. Fjeldly and M. S. Shur, "Lab on the Web, Running Real Electronics Experiments via the Internet", John Wiley & Sons, New York, NY (2003). ISBN 0-471-41375-5
8. S. Kolberg and T. A. Fjeldly, Advances in Technology-Based Education: Towards a Knowledge-Based Society, Proc. 2nd Int. Conf. on Multimedia ICTs in Education (m-ICTE2003), Badajoz, Spain, Dec. 2003, A. M. Vilas, J. A. M. Gonzalez, and J. M. Gonzalez, Editors, Vol. 3, pp. 1700-1704, ISBN 84-96212-09-2
9. R. A. Dudley, N. M. Ridler, IEEE Trans. Instrum. Meas., vol. 52, pp. 130-134, February 2003
10. D. Ives, G. Parkin, J. Smith, M. Stevens, J. Taylor and M. Wicks, "Report to the National Measurement System Directorate, Department of Trade and Industry - Use of Internet by Calibration Services: Demonstration of Technology", NPL Report CMSC 49/04, March 2004
11. A. Carullo, M. Parvis and A. Vallan, IMTC, vol. 1, pp. 817-822, May 2002
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 237-244)
MONTE CARLO STUDY ON LOGICAL AND STATISTICAL CORRELATION

BERND R. L. SIEBERT¹, PATRIZIA CIARLINI², DIETER SIBOLD¹
¹ Physikalisch-Technische Bundesanstalt, Bundesallee 100, D-38116 Braunschweig, Germany
² CNR, Istituto per le Applicazioni del Calcolo "M. Picone", V.le del Policlinico 137, 00161 Roma, Italy

Generally, a finite set of repeated simultaneous measurements of two quantities will always show correlation. A possible logical correlation cannot be distinguished from the inevitably existing statistical correlation. This raises the question of how to determine the uncertainty associated with the value of an output quantity that is composed of these "correlated" input quantities. This paper uses the Monte Carlo simulation approach to reveal interesting aspects and to discuss possible answers.

1. Introduction

The Guide to the Expression of Uncertainty in Measurement [1], briefly termed the GUM, provides a consistent basis for the determination of uncertainties in measurement. The standard approach requires a model that reflects the knowledge about the interrelation and links the input quantities X_i with the output quantity Y, i.e. the measurand. The best estimate for the value of the measurand and the uncertainty associated with that estimate are derived from a linear Taylor expansion of this model and the propagation of the uncertainty. A forthcoming supplement to the GUM by the Joint Committee on Guides for Metrology (JCGM) provides guidance for a general procedure. Its central point is to state explicitly the probability distribution functions (PDF) for all input quantities and to propagate these in accordance with the model in order to generate the PDF for the output quantity. A PDF for a quantity reflects, based on knowledge derived from given information, the degree of belief in reasonably possible values of that quantity. The expectation of such a PDF is taken as an estimate of the value and its standard deviation is taken as the uncertainty associ-
238 ated to that estimate. The expanded uncertainty can be determined from that PDF, too. For historical reasons, the GUM distinguishes information gained from repeated measurements (Type A) and any other information (Type B). This distinction is now obsolete, since both types of evaluation lead to a PDF for the considered input quantity X. For instance, in practice often a Gaussian PDF is assumed to describe the variability of a particular quantity that is repeatedly measured. In this case, Bayesian probability theory infers a /-distribution for the values that could reasonably be attributed to the measurand [2]. Correlation between input quantities plays an important role as it can raise or lower the uncertainty associated with the value of an output quantity that depends on these quantities. This paper studies the problem that arises if few repeated simultaneous measurements of two quantities appear to be correlated. The next section summarizes, based on the above mentioned supplement, briefly the general approach for the determination of the PDF for the output quantity. The third section discusses information obtained from few repeated simultaneous measurements of two quantities that are known to be correlated. The fourth section introduces a method for inferring knowledge on the logical correlation from given data that are blurred out by statistical correlation and the fifth section presents some results. The final section summarizes the paper and draws some conclusions. 2. General approach for the determination of uncertainty The general approach for determining the uncertainty associated with the value of a measurand that depends on given input quantities is fully in accordance with the GUM [1]. In this approach, one combines, according to the model, all reasonably possible values of the input quantities in order to explicitly obtain the PDF for the output quantity Y, i.e. the measurand. In practice, the formulation of a suitable model is often a difficult task since a complete theory on modeling does not exist. However, systematic approaches are possible [3]. In essence, modeling follows a Bayesian learning process. The assignment of PDFs to quantities is based on the Principle of Maximum Information entropy (PME) and Bayes's theorem. For details see the for instance the textbook by Sivia [4]. For ease in writing u(x) is abbreviated by u. The combination of all reasonably possible values of the input quantities according to the model can be conveniently formulated by the Markov formula
g_Y(η) = ∫ g_X(ξ) δ(η − f(ξ)) dξ,   (1)
where η is a possible value of the measurand Y and the vector ξ a set of possible values of the input quantities Xᵀ = (X_1, ..., X_N), f is the model function, and g_Y(η) and g_X(ξ) denote the PDFs for the indexed quantities. By integration of Eq. (1) over η one obtains

y = E[Y] = ∫ g_Y(η) η dη   and   u_y² = Var[Y] = ∫ g_Y(η) (η − y)² dη,   (2)
and for a linear model, i.e. f(X) = c_1 X_1 + ... + c_N X_N, one obtains from Eq. (1)

u_y² = cᵀ U_x c = Σ_{i=1}^{N} Σ_{j=1}^{N} c_i u_i r_ij u_j c_j,   (3)

i.e. the standard Gaussian uncertainty propagation. The c_i are the partial derivatives of the full model function with respect to X_i, evaluated at E[X_i] = x_i, U_x is the uncertainty matrix for the values of the input quantities, and the correlation coefficient is defined by

r_ij = u_ij / (u_i u_j),   where u_ij = Cov[X_i, X_j].   (4)
Two physical quantities are logically correlated if there is at least one other quantity that influences both of them. If only a few pairs of measurements are taken, statistical correlation will blur the logical correlation. The Monte Carlo method is a generally applicable method for combining the reasonably possible values of the input quantities concordant with the model in order to obtain the PDF for the output quantity Y. One generates a possible value (sample) of the output quantity, η_k = f(ξ_{1,k}, ..., ξ_{N,k}). The samples ξ_{1,k}, ..., ξ_{N,k} are obtained via random numbers from the corresponding PDFs. A possible correlation between input quantities must be taken into account by using an appropriate sampling procedure. For Gaussian PDFs one can express correlation by a joint PDF,

g_X(ξ) = ((2π)^N |U_x|)^{−1/2} exp{ −½ (ξ − x)ᵀ U_x⁻¹ (ξ − x) },   (5)

and use the Cholesky factorization of the uncertainty matrix U, i.e.

U = Rᵀ R  ⇒  U⁻¹ = R⁻¹ (Rᵀ)⁻¹.   (6)
(6)
In the case of only two correlated quantities, this leads to: £u = *. + ".£.**. a n d £ . * =x2+u2\r^kM
+ -Jl^r^Mh
(?)
where the index k indicates a sample, x^ and x2 are the expected values, r is the correlation coefficient and the index "std" denotes values drawn fromN(0,l), i.e.
240
a normal PDF with expectation zero and variance 1. The Monte Carlo sampling produces a frequency distribution for Y that approximates the PDF for Y and, if all samples have the same statistical weight, the formulae given in Eq. (2) lead to y = EY = nl ± t]k and u\ = Var7 = {n-1)~' ± {Vk - yf.
(8)
3. Monte Carlo simulation of few repeated measurements Assume that one is given a pair of correlated quantities X\ and X2 and knows that their PDF is a joint Gaussians and ux, u2 and r are given. Given this information, one can simulate a set of n simultaneous measurements, ^i,*,^,*- Without loss in generality, one can set uy=\ and u2=cc and use transformations, such that X]=0 and x2=0. Using Eq. (7), one obtains: 1,= ""'jX.s,d
and
*2.*="~'«ZK.*,.d + ^ 1 -'" 2 £.*.s,d)-
(9)
The uncertainties associated with x]}„ and x2i„ are denoted by u]t„ and u2i„, respectively, and the "seen" correlation coefficient by r„. For greater ease in writing, we define
c,j,=(»-i)_1Zfe.«
( 10 )
-*>*)£,** -*J*)>
and obtain w2„ = C u „ andu\ n = cc\2CUn
+ 2rVl-r 2 C, 2 „ + ( l - r 2 ) C 2 2 n ) .
(11)
The "seen" correlation coefficient r„, which results from the superposition of logical and statistical correlation, is given by
rC^+Jl^C^ ^CuyCUn
+ 2r4l^CXXn
(12)
+ (l-rJ)CwJ '
and the "seen" ratio of the uncertainties a„ is given by: < = C;la\r2CXXn + 2rVT^ J C, 2 ,„ + (l -r 2 )C 2 J
.
(13)
If the sample size n tends to infinity, C_{ij,n} tends to δ_ij and, cf. Eqs. (11)-(13), u_{1,∞}, u_{2,∞}, r_∞ and α_∞ tend to u_1, u_2, r and α, respectively. Given α and r, one can simulate a large set S of possible values of C_{ij,n}, denoted by C_{ij,n,s}. Using Eqs. (12) and (13) one then obtains possible values of α_{n,s} and r_{n,s} that might result from an experiment of n repeated measurements. From these one obtains a frequency distribution as an approximation to the corresponding probability distribution function (PDF). We rename α_{n,s} to α_exp and r_{n,s} to r_exp and denote this PDF by p(α_exp, r_exp | α, r, n). In other words, this PDF is the joint probability distribution function of pairs of α_exp and r_exp that might be seen in any one experiment if α, r and n are given. This procedure is in the following termed forward calculation.
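A minimal sketch of this forward calculation is given below; the number of simulated experiments and the parameter values are arbitrary choices for the sketch.

    # Forward calculation: for given alpha, r and sample size n, simulate S
    # experiments of n simultaneous pairs and record the "seen" (alpha_exp, r_exp).
    import numpy as np

    rng = np.random.default_rng(7)

    def forward(alpha, r, n, S):
        a_exp = np.empty(S)
        r_exp = np.empty(S)
        for s in range(S):
            z1 = rng.standard_normal(n)
            z2 = rng.standard_normal(n)
            xi1 = z1                                             # u1 = 1, x1 = 0
            xi2 = alpha * (r * z1 + np.sqrt(1 - r**2) * z2)      # u2 = alpha, x2 = 0
            c = np.cov(xi1, xi2, ddof=1)
            a_exp[s] = np.sqrt(c[1, 1] / c[0, 0])
            r_exp[s] = c[0, 1] / np.sqrt(c[0, 0] * c[1, 1])
        return a_exp, r_exp

    a_exp, r_exp = forward(alpha=0.5, r=0.75, n=16, S=20_000)
    # a 2-D histogram of (a_exp, r_exp) approximates p(alpha_exp, r_exp | alpha, r, n)
    print(a_exp.mean(), r_exp.mean())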
Figure 1. Examples of joint frequency distributions p(α_exp, r_exp | α, r, n) for n = 64 and given values of α and r as indicated in the graphs.
Figure 1 shows examples of such frequency distributions for n = 64. The given values of α and r are indicated in the graphs in Fig. 1. Examples with r > 0 suffice since the abscissas in Fig. 1 are symmetry axes. Clearly, the statistical correlation blurs the given logical correlation increasingly with decreasing sample size n. See also the marginal frequency distributions in Figures 2 and 3. Inspection of Fig. 1 raises the question of what can be inferred from one given experiment that comprises a few repeated simultaneous measurements of a pair of quantities.

4. Inference from experiments

A possible answer to this question can again be given by Monte Carlo simulation, i.e. by computing a set of S possible values of C_{ij,n,s}. For each such set, one now searches for that pair of values α_s and r_s that would lead to the experimentally given values α_exp and r_exp. This consideration leads to Eq. (14),
, u,.,+V1~r/Cu..J
r
(14)
2
^,1JU(r C1JJU + 2r,JT?C„„ + (l - r ' K , J ' which allows to compute rs numerically and, by inverting Eq. (15), to obtain as:
242
<^^{rXu,,+^^C^s+{\-r:)pil„)j.
(15)
This procedure allows to provide a joint frequency distribution of pairs of values of a and r that could have caused the result of one given experiment. Again, for a large sample number S, this frequency distribution is a sufficient approximation for the corresponding PDF. We denote the corresponding PDF by p(a,r\ amp,rexp,n). In the following, this procedure is termed backward calculation. Let us assume that the aim of the measurement is to determine a quantity Y that is given by the model function Y = X\± X2. One would then obtain a PDF g(uy) by g\Uy\<x^.^P,n)= \\p(a,r\a^,r^,n)s\iiy
-Jl±2ra
+ a2)dadr
. (16)
This PDF can then be used to judge the credibility of the stated uncertainty. Again, if the sample size n tends to infinity, Cy,„,s tends to SQ and one can infer from Eqs. (14) and (15) that for n—>oo
/>("> r Kp - % -») -> 5(a ~ °W -r ~ %) •
(! 7 )
Next we compare forward and backward calculations and denote a and r in Eq. (17) by aB and rB and c^xp and rexp, as used in p(aeKp,reKp\ a,r,ri), by a F and rF. Finally, we omit the subscript "exp" in/?(aB,rB| amp,rexv,n) and use a and r for experimentally or otherwise given values. 5. Results Figures 2 and 3 show the marginal frequency distributions (MFD) of p(a7,r¥\ a,r,ri) and p{a^,r^ a,r,n) for (a,r) = (1.0,0.0) and (0.0,0,75), respectively. The MFDs are denoted by g(r B |a,r,«), and analogously for rF, aB and aF. The results from forward and backward calculations are clearly different. As expected, this difference decreases with increasing sample size n. The most probable values (peaks), except for g(rB\cc=l.0,r=0.0,n) and g(rF|a=1.0,?=0.0,«), differ in position. This difference decreases with increasing sample size n, too.
243
Figure 2. Marginal frequency distributions for g(r&\ a=\,r=0,n) and g(rr\ a=l,r=0,n), (left graph) and g{a^ a=l,r=0,n) and g(ofFl o=l,r=0,w) (right graph). The curves with the lower maxima pertain to «=5 and the curves with the higher maxima to «=16. For better legibility, the distributions are normalized to 103. The MFDs for as and ov are not fully shown.
Figure 3. Marginal frequency distributions forg-(ral o=0.5,r=0.75,n) and g(rf\ a=0.5,r=0.75,«) (left graph) and g(aa\ a=0.5,r=0.75,n) and g(af\ a=0.5,r=0.75,«) (right graph). The curves with the lower maxima pertain to n=5 and the curves with the higher maxima to n=16. For better legibility, the distributions are normalized to 103. The MFDs for ae and ae are not fully shown.
In addition, the uncertainty uy for Y-X\±X2 was calculated using Monte Carlo simulation. It can be written asw*n = «~'(l±2F n a n +a^); w riere the subscript n indicates forward (F) or backward (B) calculation. The aB, a F , FB andrF are not necessarily the expectations of the corresponding PDFs! As expected, a F and FF are equal to a and r, respectively. However, Table 1 shows that the values obtained from backward calculations differ considerably. This difference decreases with increasing sample size n and aB approaches a from larger values and FB approaches r from smaller values. Table 1. Expectation estimates aB and rB as derived from computing uy according to Equation (16). The numbers of the upper (lower) row are for sample size n = 5 (n = 16). o=l. 00 ^0.00 r=0.25 f=0.75
«B
2.26 1.15 2.18 1.14 1.66 1.07
r
B
0.00 0.00 0.11 0.22 0.45 0.70
a=0.50 A=0.00
r=0.25 r=0.75
«B
1.13 0.58 1.10 0.57 0.83 0.53
'"B
0.00 0.00 0.11 0.22 0.45 0.70
244
The values of rg for a given r depend only on that r and the sample size n, but not on a. The values of aB for a given a depend on r, but the ratios of aB for different sample sizes n do not depend on r. Instead of the Monte Carlo simulation one can also use the Bayesian theorem to infer p{a&,r^ a,r,ri). Some first results have been obtained. They appear to be consistent with the results here presented 6. Summary and Conclusions In practical metrology, the number of repeated measurements is often small due to the high costs of each measurement. However, if only few pairs of measurements are taken, the possibly existing logical correlation may be blurred out by statistical fluctuations and vice versa, if there is no logical correlation present, the fluctuations may pretend correlation. Monte Carlo simulation and a Gaussian PDF were used to simulate such measurements for a given correlation and to infer the uncertainty of a linearly composed quantity from a given experiment. The trends observed in Section 5 are consistent and easily explained by the fact that a and r as obtained from few repeated measurements represent a highly incomplete knowledge that leads to higher uncertainties for smaller n. From the results presented in this paper we can conclude that in the case of few repeated measurements the uncertainties, obtained following the standard GUM procedure, are generally too small. Further work is needed to provide a general procedure for handling such cases that are very common in metrology. References 1. 2. 3.
4.
Guide to the Expression of Uncertainty in Measurement, International Organization for Standardization (1995). I. Lira, W. W6ger, Evaluation of repeated measurements from the viewpoints of conventional and Bayesian statistics, In this book (2005). K. D. Sommer, B. R L. Siebert, M. Kochsiek, A. Weckenmann, Systematic Approach to the Modelling and Uncertainty Analysis of Measurements, In Proceedings of the 12th International Metrology Congress Lyon 2005 «La maitrise des processus de mesure, recteur du developpement durable », College Francais de Metrologie, Paris Cedex 15, France (ISBN 2-9154/603-6). D. S. Sivia, Data Analysis - A Bayesian Tutorial. Clarendon Press, Oxford (1966).
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 245-252)
THE MIDDLE GROUND IN KEY COMPARISON ANALYSIS: REVISITING THE MEDIAN ALAN G. STEELE, BARRY M. WOOD, ROBERT J. DOUGLAS National Research Council of Canada, M-36 Montreal Road, Ottawa, K1A 0R6 Canada The median of a data set is an order statistic: in a Key Comparison it is the middle value (or the average of the middle pair of values) submitted by the participants. It has been used to select the Key Comparison Reference Value (KCRV), and the 'median absolute deviation' (MAD) has been used as an estimate of its uncertainty. We discuss the median in the context of the alpha-trimmed simple mean and the alpha-trimmed weighted mean. We revisit the median, using Monte Carlo re-sampling to explore the dispersion imputed by the participants' uncertainty claims. Compared to traditional techniques, this study has revealed differences in value, variance, skewness and covariances. We discuss simple insights to be gained from Monte Carlo evaluation of the probability of a given participant being the median laboratory in the Key Comparison, and of the difficulties created by the median for doing simple "degrees of equivalence" arithmetic.
1. Introduction 1.1. The Median Family The median of a data set having N values x-} is obtained by sorting the values to find the middle value if N is odd, or the average of the middle pair of values if N is even. For an MRA[1] Key Comparison, a data set will typically have 5 to 25 values from different laboratories. In the sorting process, it is useful to retain the association of the value Xj with Lab i, as can be done by a keyed sort[2]. For Key Comparisons, the median is sometimes used as the KCRV when there is some disquiet about using either the inverse-variance weighted mean, or the simple mean as the KCRV. The advantage perceived for the median is its robustness against the magnitude of the deviation of a single outlier result. Note that the median is less robust to the sign of the deviation of a single outlier. These three KCRV candidates can be viewed as members of a family of ordered statistics, the alpha-trimmed family, where a fraction a of the N values is dropped from both the low side and the high side of the sorted values[3]. One side of the family is the a-trimmed simple mean. When oc=0, this is just the simple mean (no values are trimmed and the simple mean of the entire
246 data set is calculated). The largest value to consider for a is slightly less than 0.5: a=(N-l)/2N. When N is odd, this trims all but the central result - the median. When N is even, a=(N-l)/2N trims all but the central pair of results (each "half-dropped" and their average taken) - again the median. The other side of the family is the a-trimmed, inverse-variance weighted mean. When a=0, this is just the usual weighted mean (no values are trimmed and the weighted mean of the entire data set is calculated). When N is odd, the upper limit of a=(N-l)/2N trims all but the central result - the median; and if N is even, a=(N- \)I2N trims all but the central pair of results (each "half-dropped" and their weighted average taken) - this time only almost the median. 1.2. Monte Carlo Resampling In a Key Comparison, when an uncertainty is to be calculated for the KCRV, two main resources are available. The dispersion of the reported laboratory values around the KCRV is one resource that can give a range of values that might reasonably be attributed to the measurand (each result has been attributed to the measurand). In different cases evaluated using this resource, the standard uncertainty has been taken variously as the standard deviation, as the standard deviation of the mean (dividing the standard deviation by ViV), or as the median absolute deviation (or MAD, for a median KCRV). Unfortunately, additional assumptions must be introduced to justify any of these calculations as representing the uncertainty of that method's KCRV. The second resource that is available is the set of uncertainty statements as claimed by each laboratory for its result. These may be used in isolation to deduce the uncertainty of either the simple mean or inverse-variance weighted mean KCRV. They may be used along with the dispersion of the results in a generalized chi-squared-like test[4] of the null hypothesis that the values agree with the KCRV within the dispersion expected from the claimed uncertainties. To evaluate the uncertainty of an order statistic, both resources are used, without constraints from the null hypothesis. In its simplest form, a single random variable is associated with the uncertainty claimed by each laboratory, with its Probability Density Function (PDF) shifted and scaled: its expectation value is to be equal to the laboratory's reported value and its width and shape of the PDF are to be those described in the laboratory's uncertainty statement. In Monte Carlo resampling, for each laboratory a computer function is used to produce pseudo-random numbers that replicate its claimed PDF and have the claimed covariance properties (often meaning uncorrelated) with the pseudorandom numbers of the other laboratories. Each set of the N random numbers
247
represents a resample of the N laboratories claims, and the results are evaluated by the candidate algorithm for determining a KCRV, and used to develop a histogram of the frequency distribution of the values that could reasonably be expected for this algorithm, based on the laboratories' claimed values and uncertainties^]. For order statistics, it is important that each laboratory's PDF have its expectation value equal to the laboratory's reported value. 1.3. An Example CCEM-K8 100V: 10V is a Key Comparison[6] with a clear outlier, and serves here as an example of Monte Carlo evaluation of the uncertainties of the whole family of alpha-trimmed statistics. The data are shown in Figure 1, with error bars showing the claimed standard uncertainties and also the 95% coverage intervals. The left two panels show the data in the as-reported order, and in value-sorted order as needed for the median and the a-trimmed statistics. &iph*-tnm
Ji\.
L'^rf?^
Time Ordered
rs-aed Wsi$hs*-d Mes
m
-i«
Sorted by Value Siroftte h%%n
inverse-variance Weighted fears
Figure 1. CCEM-K8 data and sorted data (left two panels) and a surface plot of Monte Carlo histograms, showing its resampling dispersion for both the a-trimmed simple means (including the simple mean at its a=0), and the a-trimmed inverse-variance weighted means (including the weighted mean at its a=0). Both families approach the median's histogram as their values of a approach 0.5.
For both types of a-trimmed means, as a runs from 0 to 0.47, the number of labs omitted runs from 0 to 14 (0-7 from each side). Here, the resampled distribution for any of these KCRV candidates is reasonably well described as a Gaussian. For a > 0.13, the extreme outlier is entirely omitted, and the candidate values have a less rapid variation with a, as shown in Figure 2.
248
Figure 2. For CCEM-K8, from the right panel of Figure 1, the alpha-trimmed family of candidate reference values (round symbols) and their standard deviations (square symbols). The solid symbols to the left show the alpha-trimmed simple mean and its standard deviation; the open symbols to the right show the alpha-trimmed inverse-variance weighted mean. Both converge towards the median in the middle of the graph. The large symbols represent resampling from Gaussians, and the small symbols represent resampling when the claimed degrees of freedom are taken into account.
Figure 2 shows the quantitative detail that may be obtained from Monte Carlo resampling, representing the variations to be expected in the candidate reference values, and in their standard deviations, based on the claims of each participant. These claims might be the participants' full uncertainty budgets, in which case the standard deviation would represent a joint claim of the standard uncertainty of a particular KCRV candidate while all the influence parameters of the uncertainty budgets vary randomly over their natural ranges. The standard deviation evaluated in this way makes no allowance for the possible variation in KCRV algorithm (varying the number of Labs trimmed, or varying the weighting of the mean), although a graph like Figure 2 can convey some idea of the potential variation with algorithm. A prediction for the best repeatability to be expected for a particular KCRV candidate can be obtained by resampling if the 'repeatability' part of participants' uncertainty budgets can be identified and used in the resampling. Figure 2 also reveals a difficulty with the resampled median that does not pertain to either of the untrimmed means: the instance median (Lab 8 in the
249 sorted reported values) is equal to zero, but the expectation of the resampled median distribution is equal to -0.02 U.V/V. The treatment of such a small bias is a comparatively minor difficulty: the standard deviation of the resampled median about the instance median is only 6% larger than the standard deviation of the resampled median about the mean of the resampled median. Nevertheless, clarity is important in expressing exactly what one is using as the KCRV candidate and its standard uncertainty. Although in principle the alpha-trimmed distributions can be non-Gaussian, with skewness beyond that of the simple bias discussed above, even so they usually can be rather well described by Gaussians. There is one striking exception: the covariance of the median with one of the contributing Laboratories cannot always be well represented by a bivariate Gaussian distribution, and the degrees of equivalence to a median KCRV fails as a mechanism for properly deducing the bilateral degrees of equivalence needed for discussing interoperability. 2. Median Resampling and Covariances For any trimmed statistic, the simplest new information that is accessible with Monte Carlo resampling is an accounting of how often, on average, each of the Laboratories is dropped from the resampled statistic. For the median with an odd number of participants, this answers the simple question "How often is Lab j the resampled median?" For the median with an even number of participants, the corresponding question is more a bit more subtle: "What fraction of the weights of resampled medians comes from Lab7?" The treatment of this simple concept goes beyond the simple law of propagation of uncertainty[7]. First order covariances cannot rescue the median KCRV nor make it a helpful mediator of the bilateral degrees of equivalence that are wanted for demonstrations of interoperability. When resampling distributions of (Lab / - Median) and (Lab j - Median) are obtained for calculating the degrees of equivalence, with the specified 95% coverage interval, the characteristics of the resampled statistics provide ample opportunities for difficulties. In what follows, we present and discuss some of these difficulties. Their remedy is relatively simple: when the median is selected as the KCRV, the KCRV should generally be avoided as an intermediary for calculating bilateral degrees of equivalence - unless the median is regarded as a pure number with a conventional standard uncertainty of zero, or if 'the median' is regarded as a
250
method for selecting the 'KCRV laboratory' whose results are to be used for future access to the KCRV value without additional uncertainty components. Below, we use CCT-K2[8] to provide an example with 5 or 6 participants. 2.1. Resampling Covariance The covariance of a participant's resampled value with the resampled median can be illustrated with a Key Comparison having 5 or 6 participants. CCT-K2, Ar triple point, Group A, with 6 participants, provides our example both for 6 participants, and - by dropping the pilot - for 5 participants as well. The data are shown in the left panel of Figure 3. With 5 participants, each participant's resampled value can become the resampled median with Monte Carlo probabilities for Lab 1 of 23%, for Lab 2 6%, Lab 3 0.3%, Lab 4 (the instance median) 42% and for Lab 5 29%. With 6 participants, the resampled median is the average of the central two resampled values, so the corresponding measure, the participants' weights, are 24%, 4%, 0.7%, 32%, 23% and for the pilot 17%.
0.0
(f>
1
W *It
*l_
CD
Q.
E o O
|—1
> C o
0.5
M—i
CD _Z[ CD
-•—1—1
E,
1—1—•—1—1
1.0
-0.5
CD
-1.0
Figure 3. Left panel: the data from the 6 participants of CCT-K2 Ar Group A. Lab 6's result, indicated by the open circle, is dropped to create an artificial 5-Lab comparison for which Lab 4 is the instance median. The right panel shows the 5-Lab comparison's joint resampled distribution of Lab 4 with the median, with the 'blade' being the 42% of the resamples where Lab 4 is the median.
For results xt from Lab i, the correlation coefficient with the median xmed is: ((*,• - < * , » ( ^ -{xj)))l\(x,
-{x,)f){{xmed
-(xjj?)}'1
(1),
where the expectation values (...) are the Monte Carlo resampled averages. The correlation coefficients for our 5-Lab example are: 0.479, 0.062, 0.005, 0.445 and 0.509; and for our 6-Lab example they are 0.536, 0.048, 0.014, 0.356, 0.430
251
and 0.411. This captures the first-order covariance of the laboratories' resampled values with the resampled median, and can accurately represent a general bivariate normal (Gaussian) distribution. Although both the laboratory's distribution and the median's distribution may be well described by Gaussians, their joint distribution can be far from a bivariate normal distribution, as shown in the right panel of Figure 3. With an odd number of participants, the fraction of resamples with Lab i as the median will form a 'blade' in the bivariate distribution of Lab i and the median, with slope=l, since for these xmej = xt. For an even number of Labs, the "blade" has a slope of Vi but is also smeared out by the variability and co-variability of results from the other half of the median. 2.2. Degrees of Equivalence with a Median KCRV The MRA prescribes a 95% coverage interval between a participant and the KCRV. For a median KCRV, this interval can be found directly by resampling. In a resampled distribution of (x, - xmej) for an odd number of comparison participants, as shown in the left panel of Figure 4, every resample with the Lab i value as the median contributes to the 5-function at (x, - xmed) = 0. The integrated area of the 8-function is the fraction of times that Lab i is the median.
-1.0
-0.5
0.0
0.5
(Lab 4 - Median) (mK)
1.0 -1.0
-0.5
0.0
0.5
1.0
(Lab / - Median) (mK)
Figure 4. The distribution of (Lab 4 - Median) for a resampled 5-Lab CCT-K2 Ar comparison (left panel), and of (Lab 4 - Median) and (Lab 1 - Median) for the full 6-Lab comparison (right panel). Note the 5-function at (xt - x„^) - 0, with integrated area equal to the fraction of time Lab 4 is the resample median, when the number of participants is odd.
Although there is a cusp at (xt- xmed) = 0 in the resampled distribution obtained for an even number of participants, as shown in the right panel of Figure 4, no such 8-function appears, and the Lab i weight in the median is simply the area
252
under the curve. The disappearance of the 8-function in this difference distribution is, once more, related to the fact that two Laboratories are averaged to form the median in a comparison with an even number of participants. 3. Conclusions We have explored Monte Carlo techniques for calculating the resampled median value in a Key Comparison, going beyond simply building the median histogram by keeping track of the identity of the laboratories along with their resampled values. The reorganization of the information contained in the laboratories' values and uncertainty claims can be done to answer a very wide variety of questions, ranging from evaluating the probability of a given laboratory being the median, to detailed investigation of the covariance relationship between the laboratory value and the resampled median. Although this latter question has academic appeal in numerical analysis, it is likely that the choice of the median, as a reference value in a Key Comparison, should be limited to using the instance median value with no uncertainty, rather than as a representative distribution used as an estimator of central tendency. References 1. Mutual recognition of national measurement standards and of calibration and measurement certificates issued by national metrology institutes, 1999. 2. Numerical Recipes in C, W.H. Press, B.P. Flannery, S.A. Teukolsky and W.T. Vetterling, Cambridge University Press, (1988). page 248 3. Robust Estimates of Location, Survey and Advances, D.F. Andrews, P.J. Bickel, F.R. Hampel, P.J. Huber, W.H. Rogers, and J.W. Tukey, Princeton University Press, (1972). page 7 4. A.G. Steele and R.J. Douglas, Metrologia 42,253 (2005). 5. M.G. Cox, Metrologia 39, 589 (2002). 6. G. Marullo Reedtz and R. Cerri, Metrologia 40, Technical Supplement, 01001 (2003). 7. ISO Guide to the Expression of Uncertainty in Measurement 8. A.G. Steele, B. Fellmuth, D.I. Head, Y. Hermier, K.H. Kang, P.P.M. Steur and W.L. Tew, Metrologia 39, 551 (2002).
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 253-257)
SYSTEM OF DATABASES FOR SUPPORTING COORDINATION OF PROCESSES UNDER RESPONSIBILITY OF METROLOGY INSTITUTE OF REPUBLIC OF SLOVENIA TANASKO TASIC, MATJAZ URLEB, GORAN GRGIC Metrology Institute of the Republic of Slovenia, Grudnovo nabrezje 17 1000 Ljubljana, Slovenia Metrology institute of the Republic of Slovenia (MIRS) is responsible for wide scope of national metrological activities, including maintaining the system of national and reference standards for physical quantities and chemical measurement, system of legal metrology (type approvals, verifications, precious metals) metrological surveillance of legally controlled instruments and Slovenian business excellence prize. Monitoring of all processes in such a system requires acquiring, manipulation and processing of large amount of data. For accomplishing such a task the database system has to be implemented
1. Introduction Analysis of existing means and tools for monitoring key processes in MIRS within different departments which was performed several years ago gave very interesting results. There were no connections between different databases/lists. Every department had it own list of customers, same customers had several names. Metrological surveillance department used only paper versions of text of regulations and type approvals during their in-situ inspections. It was not easy to reach data about costs for maintaining of traceability for national standards or about errors of particular instruments before calibration or legal verification was performed. All necessary public information was not easily available as well. Programming tools used were i.e. MS Access™, MS Excel™, MS Word™ and Borland Paradox™. Persons who programmed these databases were different and sometimes not employed at MIRS any more. Obviously, something has to be done to prepare some more consistent and transparent solution [1]. In presented article the background, concept, design and two applications are presented. 2. Design of the system of databases The decision was to build the common database system from the scratch using web-enabled technology. Certain project-building approaches were applied: as
254 clear as possible definition of project goals, identification of problems that need solution, identification of criteria for evaluation of project success, identification of stakeholders and their interests and necessary resources. The first dilemma was whether to build the system in-house or to outsource it. Two major problems in case of outsourcing were identified: building and maintaining of such a system requires lot of domain - specific knowledge and MIRS employees responsible for particular areas would need to spend lot of time in interactions with developers, which is almost comparable with the time they spend for building the system. Other problem is cost of building and especially maintaining the system. Since certain knowledge of databases building already existed in MIRS, the decision was to build system within house, after some necessary training courses. Centre of the system is the database of customers. All databases and applications communicate with this database, with different access rights. Identification and authentication of customers is assured by digital certificates (PKI based, issued by Governmental Centre for Informatics), username and password. In continuation, users and applications are organised in groups with specific access rights on particular tables in databases. 3. Database for support of national standards and their traceability Slovenia has a distributed metrology system [3] - which means that besides MIRS, who is primarily responsible for the national measurement standards, these can also be maintained by other laboratories. For the base SI quantities: length, time, frequency, electric current and temperature there are laboratories nominated with national decision - holders of national measurement standards. For other (derived) important physical quantities like electrical power, energy, pressure, flow etc. there is also established system for ensuring traceability of reference standards to the international level. Activities and obligations of the holders of the national and reference standards are contractually agreed with MIRS on a yearly basis. All administrative work in the field of national and reference standards is performed on paper, which means that any kind of data analysis is difficult and time consuming. Laboratories prepare offers for annual planed activities as holders which are for a holder of national standard usually more than 80 pages long and for the holders of reference standards there is an 11-page questionnaire and one to two pages for every reference standard. Introduction of the database enables easy access to the information about the laboratories and data analysis on chosen criteria. The database contains all the important information about the laboratories (important for nomination of a laboratory)
255
and their measurement capabilities. These are scientific research activities, measurement standards and other measuring equipment, measurement capabilities on a certain physical quantity, CMC's, international collaboration, strategy of laboratory's development, financial investments, staff and their references, organization scheme, quality system and other. All this results in a very complex database entity model with 60 tables and 370 attributes. With regard to the input of the data into the database there are two types of data. First there is the data which is submitted directly by the laboratories through the web-forms on a secure internet connection. This is the case when laboratories refresh their information when a change occurs. Up to date information about the laboratory is precondition for nomination of a laboratory for a holder of a national or reference standard. Identification and authentication is assured by digital certificates, username and password. And then there is the data which is submitted by digitally signed XML documents. This is used for laboratory's annual reports and offers. Digitally signed XML documents assure data integrity and identification of the signatory [4]. If someone changes a single character of a signed document, the electronic signature becomes invalid.
••
INSPifcU OftS
PRfcClOUS MirlAL5»
Figure 1: MRS databases - an overview
"
i
256 Different access rights to the database are assigned to different end users like the national EUROMET coordinator, EUROMET contact persons, MIRS employees and general public. Customers will have the ability to search for a laboratory with the desired measurement capabilities for a certain quantity. Our plan for the future is to have even more calibration laboratories in the database, beside holders of national and reference standards. This way the laboratories would have ability to present their capabilities to potential customers for services in the field of metrology. Major benefits for the MIRS are simplified monitoring of laboratories work, simplified gathering of the needs for ensuring traceability, transparency of the system of nomination and authorization of laboratories, easier assessment of the national needs, simplified administrative procedures and easier supervision of the quality system of distributed metrology system. 4. Application for support of metrological surveillance The laws and regulations in the field of metrology cannot be implemented in practice without an efficient state metrological surveillance. Effectiveness of metrological supervision can be significantly increased with informatisation of procedures. Application for support to metrological surveillance [2] is part of MIRS information system and it is built from the relational MySQL database, several internet (PHP) applications for data processing and standalone application for in-situ operation. This is a MS Access database for making official documents (report about controlled instruments, provision about the ban of use of instrument, provision about penalty...) which is installed on laptops of metrological surveillance inspectors. The database contains data about metrological regulations, metrological institutions, type approvals, measuring instruments (location, user, manufacturer, type, serial number, verifications, surveillance examinations ...). One of the most important data which is collected is the measurement error at the time of verification of measurement instrument, which represents the metrological state of instrument. The data will be processed with statistical methods (average value...) and the results will be the basis for planning metrological surveillance and for determination of time periods for verifications. Since the metrological surveillance is performed on the field, the access to database is realised through replica of database (database replication is the process of copying a database so that two or more copies can exchange updates
257
of data). Access through cellular phone network is not acceptable because the coverage of territory with cellular phone network is not 100%. The synchronization conflicts and synchronization errors in replication are solved with use of one way replication (with master database and read-only replicas - data are copied only from master database to replicas). Data which must be changed or added to database on the field are copied from replica to temporary database and when the connection to server is established the synchronization of data is performed. Synchronization conflicts are solved with semi-automatic procedure (intervening of operator which must confirm changes of data). 5. Conclusion Presented concept is suitable for monitoring activities of the distributed metrology system in a small country, covering various aspects from the point of view of responsible organisation, players, public and international partners. Besides facilitating organisational issues it gives transparent information for the wide area of metrology community, from high-end metrological laboratories to the user of measuring instruments. Selected technology is based on Linux [5] operating system with Apache [6] server and MySQL [7]. Access and applications are realized with PHP [8] and sometimes JavaScript [9] scripts. With selected open source technology it is possible to minimise initial expenses for building such a system. If necessary, the platform may be changed afterwards, when basic concepts are clarified References 1.
Flegar, R, Supporting infrastructure of the system for implementation of MID, Workshop on Future Aspects of Software and IT in Legal Metrology (FASIT), 2526 September 2003, Ljubljana, Slovenia 2. Urleb, M, Tasic, T. Databases for support of processes in legal metrology, International Workshop from Data Acquisition to Data Processing and Retrieval, Ljubljana, September 13th - 15th, 2004, Slovenia 3. Skubic I, Slovenian Metrology system and MIRS, International workshop from data acquisition to data processing and retrieval, Ljubljana, September 13th - 15th, 2004, Slovenia 4. http://www.xml4pharma.com/eSignature/ 5. http://www.linux.org 6. http://www.apache.org 7. http://www.php.net 8. http://www.mysql.com 9. http://javascript.internet.com/
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 258-261)
CONTRIBUTION TO SURFACE BEST FIT ENHANCEMENT BY THE INTEGRATION OF THE REAL POINT DISTRIBUTION SEBASTIEN ARANDA, JEAN MAILHE, JEAN MARC LINARES, JEAN MICHEL SPRAUEL EA(MS)2,1.U.T., Avenue Gaston Berger Aix-en-Provence cedex 1, F13625, France The evolution of Coordinate Measuring Machines (CMM) allows increasing the quantity and precision of the data collected from parts surfaces. It results in new kinds of information included in the set of measured points, which were until now hidden by the probe uncertainty. The classical best fit approaches, least squares and infinite norm criteria, do not account for this new available information and can even be invalidated, in some case, by the presence of different bias. Only a statistical approach of the bestfitcan lead to a good integration of this information but it requires first to predict the surface texture left by the manufacturing process.
1. Introduction The classical best fit criteria, which are the most actually used, are based on the minimization of a function A„ depending on distances between the measured points and the surface model. k =2 : Least square k = °° : Infinite norm
=EKI" k=l
These best fit methods do not use the statistical information contained in the set of points. It means that they cannot be employed to obtain statistical conclusions on the optimized surface like the covanance matrix of the evaluated parameters. Other approaches can be investigated by considering the statistical aspect of the set of points [1]. In this case, the set can be described as a sample of the real surface point population. A good estimation of the surface can be reached by maximizing the likelihood function. N
f = Y[f(dk)
(2)
This approach imposes to model the probability density / of the points around the mean surface. Three ways are available to define it:
259 •
The first way consists in employing a generic model of distribution. This method has been investigated by Choi and Kurfess [2]. The authors employed a Beta law because of its capability to describe various dissymmetrical kinds of distributions according to its parameters. Dissymmetrical distributions are obtained in the case turned or milled surfaces. Gaussian ones correspond to high quality surfaces. • The second way would consist in evaluating the probability density directly from the set of points. At the beginning, an arbitrary criterion may be used to find a first approximation and an initial distribution will be derived from the residues. Then, likelihood maximization will be performed and an updated distribution will be derived. By iteration this method converges to the real distribution. However, a lot of points will be required to use such method which will cost long time of calculation and measurement. • The third way consists in predicting the probability density. The deviations between the real (manufactured) surface and the ideal one are essentially explained by the tool trace left on the generated surface. This trace corresponds to the global process signature and can be predicted if all the parameters of the manufacturing process are well known. This paper explains the last way proposed and presents a prediction model of the point's distribution around the ideal surface. 2. The signature Each machining process gives a particular texture to the generated surface. This texture influences the distribution of the residues around the ideal feature by introducing systematic deviations. It corresponds to the signature left by the process on the surface, and can be defined as a function depending on the cutting parameters, the tool geometry and the machining strategy. In first approximation, the measured signature corresponds to the convolution of the trace left by the process and the deviations introduced by the CMM (figure 1). Optical probe
\ ?
Process signature
'fid,)
k
* Real machined surface Ideal surface with a form defect !
"* '• ^\-J\-^J^'\-J^X --'' "'/-*• Ideal surface Figure 1. Measured signature using an optical probe.
k
*/w
CMM signature
Measured signature fid,)
"fid,)
>
)>
260
3. Prediction of the process signature It is theoretically possible to derive the probability density from the trace left by tool. This density has already been analytically expressed in the case of a turning process. This was realized using the geometrical properties of the surface texture which allow simplifying the calculations. In the case of the end milling process, such simplifications are not possible because of the complexity of the tool trace. This paper proposes a solution based on an enhanced discretization of the part, inspired from the classical z-map method. « A 0)=w(t,Pi)
y?
yt
r
r-+ Cutting point: / P(t, rt k)
r
^> -4i
1 1 J 1 ""• /©
/ /
, /
t fe^^/^O
/
O ^ ^ ^ y*~r*-X<~
T$\
• Trace let by the tool
- Followed ' trajectory
oPo, g(t7pi)
. „ „. . , • Cutting edge k: z~h(r)
Minimum-z intersection J^ z-axis $(xi, y i)
Figure 2. Determination of the trace left by the tool in the part coordinate system.
A coordinate system (Op,xp,yp,zp) is linked to the part which is discretized into a set of z-axis A=fSj normal to the machined surface. Each line St intercepts the cutting plane (Op,xp,yp) at given coordinates xt and yt. Theses two coordinates are generated randomly by a Monte Carlo method, assuming a uniform surface probability density. The position of a cutting point P in the part's coordinate system (OpXpypZp) is expressed in the form of a function depending on the cutting parameters (/?,), the followed trajectory (g), the milling tooth number (k), the cutting edge geometry (h), the cutting radius (r) and time (t). The trace left by the tool corresponds to the surface generated by the cutting point when the parameters t, r and k are varying (figure 2). Then, the manufactured surface is predicted by searching the minimum intersection z between the tool trace and the z-axis. It consists, for each z-axis, in solving the equation system (3) and selecting the solution that leads to the minimum z coordinate. OpP.xp=xt
P(t,r,k ) • xp = xt
OpP-yp = ys
P(t,r,k)-y
=y,
(3)
The machined surface is finally represented by the set of minima obtained for the whole z-axis. The particular z-map representation used in the simulation ensures that the resulting signature is provided without any statistical bias and can be employed in statistical methods.
4.
Example of prediction of the measured process signature
Figure 3 shows the measured and predicted distributions obtained for a classical milling surfacing. The sample was realized using a mono-bloc carbide milling tool of 10 mm diameter and two cutting edges. A circular machining cycle path was employed for that operation. The cutting speed was set to 100 m/min and the feed was fixed to 0.2 mm/tooth. The CMM signature was characterized by measuring a gauge glass plane. It was found to correspond to a Gaussian distribution with a standard deviation of 0.32 \im. • Predicted m h/feasured
1
IS
lllllllh. j I 11 iTrr
IIIIIIIIIIIUup
-193 -1,24 -0,55 0,14
~i i rrrpprrrrrrrri i
003
151
2,20
289
358
Figure 3. Comparison between the measured and predicted signatures.
The model gives a correct approximation of the real process signature. However, the experimental distribution shows a mode which is not predicted. It may be explained by some wrenching of the matter produced by the cutting and differences existing in the geometry of the milling teeth. 5. Conclusion In our paper, the surface texture produced by end milling has been simulated. In a statistical approach, such texture has been characterized by the distribution of the points around the perfect associated surface, called process signature. The results show that end milling leads to non-symmetrical and multi-modal distributions. Consequently: the classical best fit methods, which use criteria assuming symmetrical distributions, can no longer be applied when the extent of process signature becomes larger than the deviations introduced by the measurements. Even methods based on a beta law are only a first approximation. A new criterion based on the predicted signature will therefore be implemented in the next future. References 1. J.M. Sprauel, J.M. Linares, P. Bourdet, 7,h CIRP Seminar CAT, 237 (2001). 2. W. Choi, T. Kurfess, J. Cagan, Precision engineering 22, 153 (1998).
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 262-266)
COMPUTATIONAL MODELLING OF SEEBECK COEFFICIENTS OF PT/PD THERMOCOUPLE H. S. AYTEKIN, R. INCE, A. T. INCE Department
of Physics, Yeditepe University, 26 Agustos Kayisdagi, Istanbul 34755, Turkey
Yerlesimi
S. OGUZ TUBITAK-UlusalMetroloji Enstitusu (UME), P.O. Box 54 Gebze, Kocaeli 41470, Turkey This work presents a method in molecular dynamics utilising an artificial neural network designed specifically for the modelling and prediction of thermopower in a Pt/Pd thermocouple. In molecular dynamics simulations, semi-empirical interatomic potentials are employed to compute some thermodynamic properties, i.e., enthalpy, potential energy, volume and density of the metals. Then the artificial neural network model is utilized to predict thermopower of Pt/Pd thermocouple, using the computed equilibrium properties. Predicted results are in good agreement with the existing experimental data.
1. Introduction A detailed understanding of thermoelectric heat transport in even the simplest of metals is quite complicated. Nonetheless, the basic principles are now well established but their application in individual cases proves difficult due to the difficulty of calculating the transport properties in metals. Currently, existing interpolation functions used in thermocouple thermometry have no physical basis; furthermore existing Seebeck theories cannot predict the magnitude and sign of absolute Seebeck coefficients (ASCs) of transition metals simultaneously. If a more accurate Seebeck theory or a computational method were developed, then better interpolation or reference functions can be developed to reduce the number of calibration points. In this paper we present a computational method, consisting of molecular dynamics (MD) simulations and an artificial neural network (ANN), to predict the first derivative values of the Pt/Pd thermocouple's emf outputs from their computed thermodynamic properties.
263 2.
Absolute Seebeck coefficient
The output of a thermocouple can be calculated by integrating the ASCs of two different metals, A and B, using an appropriate sign convention VBJT,Tref)=\SAdT-\sBdT
(1)
where SA and SB are the ASCs of the metals, Tref is the reference temperature, and VB/A is the output of the thermocouple. In practice, however, such calculations based on the existing ASC values do not agree with experimental results. Thermoelectric phenomena were investigated theoretically and numerically by Maes et al. [1], Fujita et al. [2], Skal [3] and Mott et al. [4], where the Seebeck coefficient is directly proportional to absolute temperature. According to Fujita et al., the Seebeck coefficient can be determined from the variation of carrier densities due to different temperatures by
where d, RH, £F, kB, N0 and Kare respectively the dimension, Hall coefficient, Fermi energy, Boltzmann constant, density of states and cell volume. These studies show that thermoelectricity is still in a developmental stage especially for transition metals because they are highly complex nonlinear systems [2]. 3.
Molecular dynamics simulations
MD simulations enable one to determine the trends in thermodynamic properties of real materials with a classical approach. MD simulations [5] were carried out to compute the enthalpy, potential energy, volume and density of pure Pt and Pd metals for a range of temperatures from 0 K to 1600 K , including eight nominal ITS-90 fixed point temperatures, using isobaric-isoenthalpic ensemble (NPH) and constant pressure constant temperature ensemble (NPT). The simulations were performed by solving the classical equations of motion by the 5th order Gear predictor corrector integration scheme with a time step of 2 fs on 5x5x5 unit cells of Pt and Pd consisting of 500 atoms each. In the simulations, the Sutton-Chen interatomic potentials [6] were used. The Parrinello piston mass parameter was chosen as W=400 in the NPH ensemble and the Nose parameter was set to Q=100. The simulations were performed using a temperature gradient of 1 K per step. In the computations, different numbers of corrector steps were used depending upon the specified temperatures.
264 4.
Artificial neural network
ANN modelling is basically a nonlinear statistical analysis technique which links the input data to output data using a particular set of nonlinear functions. In ANN simulations, where a Bayesian regularization training algorithm [7] was used, a fully connected three-layer (9x9x1) backpropagation network was utilised to predict the thermopower values of a Pt/Pd thermocouple. Enthalpy, potential energy, density and volume were computed using MD simulations for Pt and Pd metals separately at the same temperatures in the range of 300 K to 1400 K with 100 K steps. This resulted in nine, including temperature, different input variables. These nine input variables were prepared to reduce the training root-mean-square (rms) error before employing them into the ANN as input variables below. As ANN learns a linear relationship more efficiently than a nonlinear relationship, this property was used in ANN modelling to reduce the training rms error. Therefore, the relationship representing the temperature was taken as linear (7) instead of a nonlinear one (l/T) due to Mott's et al. equation and cell volume was chosen as its reciprocal due to Eq. (2) as input variables in the ANN modelling. Moreover, enthalpy, potential energy, and density were entered directly into the network because the first order linear correlation coefficients between them and ASCs [8,9] of each metal are larger than the coefficients between their reciprocals and ASCs of each metal. In the same way, thermopower values of the Pt/Pd thermocouple [10] were entered directly into the output layer of the network as output variable. Because a tangent sigmoid transfer function, which modulates the output of each neuron to values between -1 and 1, was used in the network, the prepared input and output data sets were normalized according to the following normalization equation (3) before performing the ANN computations. (2(V-V { V V
max
• X\ -V
•
^
mm J
Where Vmax and Vmin are the maximum and minimum values of a variable respectively and each value V was scaled to its normalized value of A. After normalizing each input and output data sets, nine different input variables and the one output variable were employed in random order into the ANN. Later, a training set was constructed to train the ANN by minimizing the rms error without over training the network by removing the randomly chosen thermopower values and computed input variables at temperatures 600 K, 900 K, and 1200 K.
265 The training data sets were used to optimize the ANN experimentally by trial and error until the minimum rms error value for prediction of the test output data sets was obtained. This is especially important at the ITS-90 temperature values predicted here, because international comparisons are carried out at these temperatures. 5.
Results
Possible MD computation errors are due to the constants in the interatomic potentials, number of atoms, truncation error and round-off error. Truncation and round-off errors are generated by Gear's algorithm. The magnitude of those errors reflects the accuracy, while the propagation of those errors reflects the stability of the algorithm. The discrepancies in computed properties can be reduced by increasing the number of atoms and by decreasing the time step since deviation in potential energy is largely due to the number of atoms dependence on potential energy. Nevertheless, use of parallel computing systems can also increase the computation efficiency. In Table 1 the predicted first derivative values d£/drpredicted of the Pt/Pd thermocouple's emf outputs and the difference between the predicted first derivative values and the experimental first derivative values dis/drpredicted - d£/dr exp of the Pt/Pd thermocouple's emf outputs are presented at nominal ITS-90 fixed point temperatures. Table 1. The predicted thermopower values and dE/dT^ad fixed point temperatures for Pt/Pd thermocouple. Nominal ITS-90 Temperature
iffiyiffpredicted (uV/K)
-
fiE/Wpredicted - dE/dTnp (uV/K)
(K) 303
5.570
0.020
430
6.415
-0.015
505
7.045
-0.013
693 933
9.547 13.963
0.009 0.007
1235 1337 1358
19.185 20.631 20.909
-0.003 0.005 0.007
266 6.
Conclusion
In this paper, we developed a computer programme which employs the state-ofthe-art molecular dynamics algorithms based on extended Hamiltonian formalism. This computer programme successfully computed the thermodynamic properties of the given pure metals. In our ANN modelling, twelve data sets were given to ANN to predict the thermopower values of the Pt/Pd thermocouple at eight nominal ITS-90 fixed point temperatures with similar accuracy to literature values. By using this method to predict the effect of the dissolved impurities, the performance or design of such thermocouples might be improved and better interpolation reference functions for thermocouples can be developed. References 1. C. Maes and M. H. Wieren, Thermoelectric Phenomena via an Interacting Particle System, J. Phys. A38, 1005-1020 (2005). 2. S. Fujita, H. C. Ho and S. Godoy, Theory of the Seebeck Coefficient in Alkali and Noble Metals, Mod. Phys. Lett. B13, 2231-2239 (1999). 3. A. S. Skal, New Classic Equations for Conductivity and Thermopower, J. Phys. A30, 5927-5940 (1997). 4. N. F. Mott and H. Jones, The Theory of the Properties of Metals and Alloys, Dover, New York (1962). 5. H. S. Aytekin, A. T. Ince and R. Ince, Investigation of Thermodynamical Properties of Pure Metals Using Molecular Dynamics, Proceedings of 9th International Symposium on Temperature and Thermal Measurements in Industry and Science, 1325-1330, Dubrovnik (2004). 6. A.P. Sutton and J. Chen, Long-range Finnis-Sinclair Potentials, Phil. Mag. Lett. 61, 139-146 (1990). 7. S. Haykin, Neural Networks, Macmillan, New York (1994). 8. N.E. Cusack and P.W. Kendall, The Absolute Scale of Thermoelectric Power at High Temperature, Proc. Phys. Soc. 72, 898-901 (1958). 9. R. B. Roberts, F. Righini and R. C. Compton, Absolute Scale of Thermoelectricity III, Phil. Mag. B52, 1147-1163 (1985). 10. G. W. Burns, D. C. Ripple and M. Battuello, Platinum Versus Palladium Thermocouples: An EMF-Temperature Reference Function for the Range 0 °C to 1500 °C, Metrologia 35, 761-780 (1998).
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 267-270)
DATA EVALUATION AND UNCERTAINTY ANALYSIS IN AN INTERLABORATORY COMPARISON OF A PYCNOMETER VOLUME ELSA BATISTA, EDUARDA FILIPE Institute Portugues da Qualidade, Rua Antonio Gido, 2 - Caparica,
Portugal
The EUROMET comparison "Volume Calibration of a 100 ml Gay-Lussac Pycnometer", between fourteen National Metrology Institutes (NMIs), was performed with a gravimetric method procedure [1, 2], This paper describes the data evaluation (determination of the reference value and a chi-square test) and uncertainty analysis of the results.
1. Introduction Volume measurement is critical in many laboratories and industries. In order to identify and reduce possible errors in intensive liquid handling process it is necessary to calibrate the volumetric glassware equipment. With the purpose of comparing the experimental calibration procedure and uncertainties a EUROMET 692 comparison was performed with 14 participants NMIs. The calibration of a Gay-Lussac pycnometer of 100 ml, representative for all type of laboratory glassware, was carried on and IPQ, as the pilot laboratory, provided the pycnometer volume standard. The suggested method to perform the pycnometer calibration was a gravimetric method, using distilled water at a reference temperature of 20 °C. For the calculation of the volume, the following formula was used [2]: V 2 0 =(m 2 -m,)x
x 1 - ^ - x[l-7(r-20)j PW-PA
\
(1)
PB)
Some laboratories used their own model for the determination of the volume. In this paper is presented the data evaluation and uncertainty of the results of the described comparison, which will allow us to verify and understand the differences between laboratories.
268 1.1.
Correction of the measured results
During the comparison it was necessary to replace the original pycnometer. In order to have comparable results for all laboratories and to proceed with the comparison, a correction was applied to the results of the first pycnometer. The correction was obtained averaging the difference of the values obtained by the laboratories that performed the calibration of both pycnometers, PTB and IPQ. Table 1 - Correction of the volume Difference (ml)
Average (ml)
Uncertainty (ml)
IPQ
0.1109 (IPQ1-IPQ2)
0.1118
0.0016
PTB
0.1120 (PTB1-PTB2)
The value 0.1118 ml was then added to the results of the first pycnometer. The uncertainty and the average value were calculated according to section 2.1. 2. Determination of statistical parameters 2.1. Determination of the reference value (RV) and uncertainty In order to obtain the RV [3] the formula of the weighted mean was used, using the inverses of the squares of the associated standard uncertainty as weights: ^_xJu2(xl) + ... + xJu2(xn) y
£2)
1/M2(JC,) + .... + 1/M 2 U„)
To calculate the standard deviation u(y) next formula was used: L "W = Jl / -« T( j c— —-T— , ) + ... + l / K ( j e ) 2
2
^
B
This approach for the determination of the RV was decided based on the BIPM procedure for key comparisons [3] and it was taken in account that all the participants' laboratories met the necessary conditions. 2.2. Consistency statistical test - Chi-square test To identify the inconsistent results a chi-square test was applied to all results [3], with degrees of freedom v= N -1 2 _(*i-y)2 . u (JC,)
(*„-y)2 u
(xn)
(4)
269 This consistency check is failing if: Pr\z2 (v) > %lbs }< 0.05. Considering now the weighted mean for the 14 laboratories, v = 100.0917 ml with an uncertainty u(y) = 0.0006 ml (k=2), we obtain now for the chi-square test the following r e s u l t s : ^ =41.3602 and ^ 2 (0.05;13) = 22.3620 and the consistency test fails. We note that the value for one of the laboratories SLM: X2SLM = 26.4622 is higher then the total ^r2(0.05;13) = 22.3620. The volume result for the SLM laboratory was then removed from the weighted mean calculation and a new consistent test was performed. The new results are the following: xlbs =14.3879 and ^ 2 (0.05 ;12) = 21.0261. We can conclude that the results are consistent. The new value for the weighted mean y was calculated as the reference value xref and u(y) as the standard uncertainty u(xref): xref = 100.0914 ml, u(xref) = 0.0006 ml with k=2. The volume results for all the laboratories along with the reference value are presented in Figure 1.
•
100.1160 -
Volume Reference value
- - - - Expanded uncertainty of th* nfennce vahft
100.1110 ^100.1060 •=-100.1010 -
J 100.0960 o
j : : i : : l : : ^ : : i : : : : : : : : : t : : j : : T : : H-E-^^
•*• 100.0910 100.0860 1000810 100.0760
^
Laboratories
Figure 1. Laboratories results compared with reference value
3. Uncertainty presentation Each laboratory presented its uncertainty calculation in detail, so it was possible to compare each component separately. The components that were taken in account for the uncertainty analysis were: mass determination, mass pieces density, air density, water density, expansion coefficient of the glass, temperature and other smaller and specific components.
270
0.00160 0.00140 0.00120 %
0.00100
g
0.00080
"3 0.00060
>
0.00040 0.00020 (••«•'
i
0.00000
4* •j?
iiS
,^
^
cP
** Standard uncertainties
Figure 2. Mean average of the presented standard uncertainty components for all the laboratories
The previous figure represents averaged values over all laboratories for the main uncertainty components. As can be seen the major source of uncertainty is the water density followed by the mass determination. 4. Concluding Remarks This comparison involved 14 laboratories and lasted one and half year. One of the major risks was to break the glass pycnometer and this occurred after 5 measurements. Replacing the pycnometer and adding a correction to the first 5 volume results solved the problem. Globally the results are quite satisfactory, the maximum and minimum reported volumes differ less that 0.01%. With the exception of two participants, the laboratories volume results were quite consistent with the reference value, and with each other. The uncertainty budgets were very similar and the major uncertainty component to the final uncertainty was, for the majority of the participants, the water density. References 1. ISO 3507 - Laboratory glassware - Pyknometers, Geneve 1999; 2. ISO 4787 - Laboratory glassware - Volumetric glassware - Methods for use and testing of capacity; Geneve, 1984; 3. M.G.Cox, "The evaluation of key comparison data", Metrologia, 2002, Vol. 39, 589-595.
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 271-275)
P R O P A G A T I O N OF U N C E R T A I N T Y IN DISCRETELY S A M P L E D SURFACE R O U G H N E S S PROFILES *
JAMES K. BRENNAN, ANDREW CRAMPTON, XIANG JIANG, RICHARD K. LEACH* AND PETER M. HARRIS* School of Computing and Engineering, University of Huddersfield, HD1 3DH, UK * National Physical Laboratory, Teddington, TW11 OLW, UK E-mail: [email protected]
The need to evaluate the combined standard uncertainty of an output quantity that itself is a function of many input quantities each with their own associated standard uncertainties is a problem arising in surface profile metrology. This paper presents research in progress that attempts to overcome complications that arise when trying to compute a robust value for this combined standard uncertainty.
1. Introduction Measuring a surface with a contact stylus instrument and then evaluating surface texture profile parameters, such as those defined in ISO: 42871, quantifies certain aspects of the surface. It is important to note that these surface profile points are not uniformly spaced along the x-axis because of many influences during the measurement process 2 . This inconsistency in the spacing of points causes noticeable errors in the calculation of surface texture parameters due to the apparent incompatibility with standard filtering software3. The magnitude of the co-ordinate values that make up a surface profile are very small and susceptible to error in numerical computation and the standard uncertainty associated with these values is equally small. Compound this with the fact that the ISO parameters are defined in terms of a continuous function and have ambiguities in some definitions, there is a need to develop robust software for the safe calculation of these parameters and their subsequent standard uncertainty values. "This research was carried out under a CASE studentship project funded in full by EPSRC and the National Physical Laboratory.
272 X,,Z,, Ufa) X, Z:, 11 fa)
**
)'•
i'(y)
X„, Z,„ Ufa)
Figure 1. Input-output model illustrating the propagation of uncertainty. The model has n sets of input quantities X and Z, that are the profile co-ordinate values estimated by x and z with associated standard uncertainty u(z). There is a single output quantity Y, estimated by the measurement result y with associated standard uncertainty u(y). This concept can be extended to incorporate the standard uncertainty associated with the traverse axis x, u(x).
2. Profile Fitting A novel method of surface profile data fitting has already been presented 4 , this study explored the use of the natural cubic spline 5 (NCS) interpolant for this purpose. This approach fitted each profile peak and valley elements individually, see Figure 2, to allow a more stable data fit as measured profiles usually contain many thousands of points. The result gives the discrete surface profile as a series of mathematically smooth and continious functions. The natural cubic spline possesses minimum curvature properties due to the contstraints imposed upon it 6 and ensures a smooth approximation to the underlying profile as opposed to simple polynomial schemes
! Q-S
Non-uniform spaced profile fitted with natural cubic spline interpolant
X Measured profile points — NCS interpolant Mean Line • •- Profile Element Dividers
5.75
5.76
5.77
5.78
5.79
5.8
5.81
5.82
5.83
5.84
Figure 2. A section of a surface profile fitted using a natural cubic spline interpolant.
273 S i m p l e profile with wictrUtnty bound* NluftrHed
Figure 3. Plot illustrating the concept of the uncertainty coverage intervals. The inner plot shows the given measured profile. The outer plots show the 95% confidence boundary of the given associated standard uncertainty value. Inset, a close-up showing the uncertainty interval of the measured points and how the outer plots have been constructed.
which may exhibit excessive oscillation between points. The advantages of using this method are that, firstly, it fits a series of continuous functions to the profile and thereby represents the profile in the same format that the ISO standard parameters are defined. Secondly, because the profile is now a series of continuous functions it can be re-sampled at equidistant points so that it is compatible with standard filtering procedures. This method for data fitting has been extended to incorporate the propagation of uncertainty for surface texture parameters, utilising both the methodology from the Guide to the Uncertainty of Measurement (GUM) 7 and Monte-Carlo simulation (MCS) approach 8 .
3. Propagation of Uncertainty The software has been developed to replicate the concept shown in Figure 1, and to output a parameter value together with its associated standard uncertainty. The model aims to quantify the extent of the inexactness (u(y)) of the parameter value (y) with respect to the inexactness (u(z)) of the input quantities (z). This inexactness of measured quantities is described the uncertainty in the measurement and provides us with an idea of the measurement quality. Every measured point that makes up a profile
274
Figure 4. Computed uncertainty distribution for two surface profile parameters using GUM and MCS methodologies.
is assigned the same standard uncertainty, given by the intstrument and environmental condition, and so we assume that if a continuous function were to interpolate this profile then this would inherit the same value along its range, see Figure 3. As a result of this, the advantages of the profile fitting method hold when propagating uncertainties for parameter values. From the results gathered so far it can be seen that using this method, with both GUM and MCS procedures, provides a more accurate result for the parameter value and the uncertainty coverage interval 9 . This is compared to results that were generated from an equally spaced reference profile and using the same software. The plots in Figure 4 show results for the uncertainty evaluation of two surface profile parameters, Ra (top) and Rv (bottom), for both MCS (left) and GUM (right) methodologies. In both cases concerning Ra the probability density function (pdf) of the result using profile fitting has more overlap with the reference result than the results obtained from evaluating the raw profile data. The Rv parameter is the absolute minimum value of the profile and for evaluation by GUM methods it requires certain approximation methods to be used, and assigns the output pdf to be Gaussian (the MCS results clearly show a skewed pdf). The plots for the pdfs of the reference profile and the natural cubic spline fitted profile overlay one another in the figure for the Rv results using the GUM methodology (bottom, right).
275
4. Conclusions A robust method of propagating uncertainty through surface texture profile parameters has been presented. The method of data fitting has already shown to overcome the ineffectiveness of the Gaussian profile filter with non-uniform spaced profile points and, using this method in conjunction with existing methods for the evaluation of uncertainty, show a more accurate approximation to both the measurand (parameter value) and its associated uncertainty value. This process of data fitting represents the discrete profile as a series of continuous functions permitting the exact calculation of definite integrals and other phenomena such as maximum peak height and therefore provides a stable basis for both mainstream GUM and Monte-Carlo methods to compute sensitive uncertainty estimates. References 1. ISO 4287: 1997 Geometrical Product Specifications - Surface texture: Profile method terms, definitions and surface texture parameters (Geneva: ISO) 2. Taylor Hobson Ltd 2003 Exploring surface texture 4th Edition Leicester, UK 3. Koenders L, Andreasen J L, De Chiffre L, Jung L and Krger-Sehm R 2004 EUROMET.L-S-11 Comparison on surface texture Metrologia 41 4. Brennan J, Crampton A, Jiang X, Leach R and Harris P 2005 Reconstruction of continuous surface profiles from discretely sampled data 6th Int. euspen Conf, Montpellier, Prance, 8-11 May 5. Powell M J D 1987 Radial basis functions for multivariate interpolation: A review Algorithms for Approximation eds Mason J C, Cox M G Clarendon Press, Oxford 6. Duchon J 1977 Splines minimising rotation invariant semi-norms in sobelov spaces Lecture Notes in Mathematics 5./7 85-100 eds. Schempp W, Zeller K Springer Verlag, Berlin 7. BIPM, IEC, IFCC, ISO, IUPAC, IUPAP and OIML 1995 Guide to the Expression of Uncertainty in Measurement, ISBN 92-67-10188-9, Second Edition 8. Cox M and Harris P 2001 Measurement uncertainty and the propagation of distributions 10th Int. Metrology Congress, Saint Louis, France 9. Ciarlini P and Pavese F 1993 Special reduction procedures to metrological data Numerical Algorithms Vol. 5 p 479-489
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 276-279)
COMPUTER TIME (CPU) COMPARISON OF SEVERAL INPUT FILE FORMATS CONSIDERING DIFFERENT VERSIONS OF MCNPX IN CASE OF PERSPONALISED VOXEL-BASED DOSIMETRY SOPHIE CHIAVASSA1, MANUEL BARDIES1, DIDIER FRANCK2, JEAN-RENE JOURDAIN2, JEAN-FRANCOIS CHATAL1 AND ISABELLE AUBINEAULANIECE2'3 French Institute of Health and Medical Research - INSERM U601, 9 Quai Moncousu 44035 Nantes Cedex, France 2
Institute for Radiological Protection and Nuclear Safety - IRSN, DRPH/SDI, B.P. 17, F-92262 Fontenay-aux-Roses Cedex, France 3
Current addresses: CEA-Saclay, DRT/DETECS/LNHB/LMA, 91191 Gifsur Yvette Cedex, France
To perform accurate dosimetric studies for internal radiotherapy and contamination, a new tool (Oedipe) was developed for personalised voxel-based dosimetry. Patientspecific geometries are created from anatomical images and associated with the Monte Carlo N-Particle extended code, MCNPX, using the Graphical User Interface Oedipe. Computer time (CPU) is a limiting factor to use routinely Oedipe for nuclear medicine, particularly for dosimetric calculations at the tissue level. The Los Alamos National Laboratory developed the new version 2.5e of the MCNPX code in order to reduce CPU time. The presented studies were performed to evaluate the CPU time decrease using MCNPX2.5e in the context of personalised voxel-based dosimetry. Results show that the optimisation is well suited for internal dosimetry and involves an important CPU time reduction by a factor equal or higher to 100. 1. Introduction Internal dosimetric evaluation is currently performed considering standard anthropomorphic mathematical models [1]. Dosimetric evaluations as accurate as possible are needed in internal radiotherapy and particularly in radioimmunotherapy [2]. It is now possible to create patient specific geometries from computational tomo-densitometry or magnetic resonance images. Those geometries consist in a three-dimensional (3D) matrix, which each element corresponds to a volume element called voxel. Associated with a Monte Carlo radiation transport code, those so-called voxel-based geometries allow personalised dosimetric evaluations at the organ or voxel level. In this context, a Graphical User Interface (GUI), named Oedipe (a French acronym standing for
277
Tool for Personalised Internal Dose Assessment) was developed. Oedipe creates voxel-based specific geometries and associates them with the Monte Carlo N-Particle extended code, MCNPX [3]. The main limit of this method is that it is very time consuming, especially for dosimetric estimation at the voxel level. Indeed, Computer Time (CPU) increases with the number of voxels that composes the voxel-based geometry. The MCNPX code, developed by the Los Alamos National Laboratory (LANL) allows the use of an original format, called "Repeated Structure", to define the geometry. This format, which consists in repeating a single volumic element many times, is specially suited to define voxel-based geometries. Moreover, MCNPX offers several formats to specify expected results, so-called Tallies, of the calculation. The version 2.5e [4] of the code was optimised by LANL for the association of the Repeated Structure format to a tally format called "Lattice Tally". The presented studies evaluate the decrease in CPU time resulting from this optimisation when personalised dosimetric calculations at the voxel level are considered. Comparisons with an anterior version considering different MCNPX input file formats were performed. 2. The Monte Carlo N-Particle codes, MCNP and MCNPX MCNPX [3], a general Monte Carlo N-Particle transport code, represents a major extension of the MCNP code, and allows to track all types of particles. These codes treat arbitrary 3D configurations of user-defined materials in geometric cells (volumes) bounded by first-and second-degree surfaces. The cells are defined by the intersections, unions and complements of the regions bounded by the surfaces. Geometry is limited to 99,999 cells. As patient-specific voxel-based geometries are usually composed of millions of voxels, this geometric format, referred here as classic geometric format, is unsuitable for patient-specific voxel-based dosimetry. MCNP and MCNPX propose a second geometric format, called Repeated Structure, which makes possible to describe only once the cells and surfaces of any structure that appears more than once in a geometry. This unit then can be replicated at any location. Replication can be performed in a rectangular grid (lattice). Indeed, patient anatomy is a matrix corresponding to a rectangular lattice and all voxels that compose the patientspecific geometry have a similar shape and size. Each similar voxel (filled with the same material) is then defined only once. Using Repeated Structure format, it is possible to model by MCNP(X) any anthropomorphic geometry of any size. MCNP and MCNPX calculate various tallies such as energy deposition (MeV) or absorbed dose (MeV/g). Three formats allow to specify regions of interest for a tally:
278 -The Tally, referred in the present paper as Classic Tally, is used with the classic geometric format. All the voxels of interest have to be listed. -The Lattice Tally is used with a Repeated Structure geometric format. The lattice indexes are used to define voxels of interest. -The Mesh Tally consists in defining a rectangular, cylindrical, or spherical grid (mesh) superimposed onto the geometry. Particles are tracked through the independent mesh. Tally is then scored in each mesh cell. In order to reduce CPU time required by the code, a Lattice Tally Speedup patch of MCNPX, involving the Repeated Structure and Lattice Tally formats, is now available in the 2.5e version [4], The Lattice Tally Speedup efficiency is optimal when considering the whole lattice. 3. Optimisation of CPU time considering the different available geometry and tally formats The Lattice Tally Speedup runs for the Repeated Structure and Lattice Tally formats using the MCNPX2.5e code. We compared this configuration with a previous version of the code, MCNPX2.4 [5], considering three different formats: Classic geometry with Classic tally, Repeated Structure geometry with Lattice tally and Repeated Structure geometry with Mesh tally. As the dose assessment at the voxel level is the more time consuming calculation, comparisons were performed in such case. Moreover, CPU time increasing with the number of voxels, various geometries were created by the extraction of several slices (1-6,12 and 54) of an initial patient-specific geometry consisting in a matrix of 256 x 256 x 157 voxels. Four densities were considered: air, lungs, soft tissues and hard bones. To obtain smaller geometries, sampling was reduced to 128 x 128 x 157 voxels and several slices were extracted (1-6 and 48). The larger geometry considered for this study (54 slices) corresponds to the trunk of the patient, area of usually major interest in internal radiotherapy. For the purpose of the comparison, it was arbitrary chosen that 1,000,000 photons of iodine 131 (data taken from ICRP 38 [6]) would be uniformly distributed in soft tissues. Results are shown on figure 1. Firstly, the comparison of the two geometric formats shows clearly the limitation of the classic format since only 1 and 6 slices can been defined with respectively 256 x 256 and 128 x 128 voxels in each slice. Secondly, the Repeated Structure format associated with a Mesh Tally is the less time consuming configuration for the MCNPX2.4. Nevertheless, to assess the spatial dose distribution in the geometry of 256 x 256 x 12 voxels, such configuration with the MCNPX2.4 code requires an important CPU time of about 75 hours. The Lattice Tally Speedup reduces this CPU time by a factor equal or higher to 100. Moreover, this factor increases with the lattice size.
279 100 - • - MCNPX2.4 + Repeated Structure + Mesh Tally - * - MCNPX2.4 + Repeated Structure + Lattice Tally - * - MCNPX2.4 + Classic Geometry and Tally -~-MCNPX2.5e + Repeated Structure + Lattice Tally 60
3 0. O
40
20
it /
If /
L
10000
1010000
2010000
3010000
Lattice size (voxels)
Figure 1. CPU time (hours) required by the MCNPX2.4 and MCNPX2.5e codes considering various geometric and tally formats and various lattice sizes.
4. Conclusion The presented comparative studies demonstrate the efficiency of the Lattice Tally Speedup for patient-specific voxel-based dosimetry. Particularly, at the voxel level, an important CPU gain (higher than 100) is obtained for large lattice comparing with an anterior version of the code. The optimised version of MCNPX allows performing voxel-based dosimetry with reasonable CPU time. This paves the way for personalised dosimetric assessment to be performed routinely for both radiation protection and radiotherapeutic purposes. References 1. M.G. Stabin: J. Nucl. Med. 37 538-46 (1996) 2. M. Bardies, P. Pihet: Current Pharmaceutical Design 6 1469-1502 (2000) 3. H.G. Hughes et al: MCNPX the LAHET/MCNP Code Merger XTM-RN 97-012 (1997) 4. J.S. Hendricks: MCNPX, Version 2.5e LA-UR-04-0569 (2004) 5. MCNPX User's Manual, Version 2.4.0 LA-CP-02-408 (LANL, 2002) 6. ICRP-38, International Commission on Radiological Protection, Pergamon Press (1983)
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 280-283)
A NEW APPROACH TO DATUMS ASSOCIATION FOR THE VERIFICATION OF GEOMETRICAL SPECIFICATIONS JEAN-YVES CHOLEY, ALAIN RIVIERE, ANDRE CLEMENT LISMMA, SUPMECA,
93407Saint-Ouen
Cedex,
France
PIERRE BOURDET LURPA, ENS Cachan, 94235 Cachan Cedex,
France
In order to perform the association of a datum when verifying a geometrical specification and instead of moving the geometry as the traditional methods usually do, measured points are considered as perturbations which generate modifications of the nominal geometry by variation of its orientation, location and intrinsic dimensional parameters, without requiring rotation and translation variables. This variational association process can be applied to single datums, datums systems or common datums, with the least squares and minimax criteria.
1. Introduction The variational association method [1] is based on both a vectorial modeling of the geometry, taken out of the CAD system database, such as points and vectors to characterize the MRGE of a TTRS [2] or a GPS situation feature (ISO 174501), and a variational distance function as well as its Jacobian matrix. The influence of all the measured points is taken into account when an optimisation criterion is applied [3]. 2. The variational association 2.1. Definition of the variational distance function The distance function is necessary to associate an ideal to the actual geometry: • (O.x.y.z) is a unique reference system for the CAD model and the CMM. • G(u) is an ideal geometry characterized by its parameter u (position, location and intrinsic characteristics depending of its class). • E = {Mj, i=l..n} is a set of measured points the locations of which are defined by the vectors Mt = OM{.
281
F(Mj, G) is the variational distance function, smallest Euclidian orthogonal distance between Mt and G(u). It inherits the parameter u. ,.--*" Extracted Geometry F
/
M.MQ.J
v 11 /
F(MtG}
Parameters u
Nominal Geometry Gilt) Figure !. Distance function between extracted geometry and nominal geometry.
2.2. The association process It relies on the variational distance function and can be described by figure 2. Extracted Geometiy {F@f>G(uc)),i=l..n} Set of initial distances
Initial Geometry
0W„ Gm i=I~n)
E
Set of optmiEad. distances (residuals)
G(itd
G(u)
L
Optimized Associated Geometry
Modification of parameters H O - * u with respect to the constraint: « norm of {F\M„ G(u)X i=I..n) is minimal»
Figure 2. The variational association process.
For a given measured point Mj, Eq. (1) describes the process, which can be linearized as shown in Eq. (2) and (3): F(Mi,G(u)) F{Ml,G{U))
=
F(Mf,G{u„+du))
= F{M„G(u0))
+
dF{Mi,G{u0))
., ,^,,, ^ M &dF(M.,G(u„)) with: dF(M., G(u„)) = X ' °
, duj
(2) (3)
6U;
Eq. (4) permits to formulate the process as a system of equations that takes into account all the measured points. J is the Jacobian matrix of the variational distance function for parameter un. F = F„+Jdu
(4)
282 3. Optimisation of the association In order to define the associated geometry, it is necessary to determine du while minimizing the norm of the optimized distances Fo+Jdu. Thus, the least squares optimisation is achieved using the pseudo-inverse matrix, whereas the minimax optimisation is treated with an algorithm developed by the PTB [4] and adapted on purpose. 3.1. How to apply the least squares criterion The optimization problem is expressed by Eq. (5), Eq. (6) gives its formal solution, with f being the pseudo-inverse matrix that solves the problem while minimizing the L2-norm of F0+Jdu. Jdu = -F0
(5)
du = -J+F0
(6)
If an additional constraint - Eq. (7) - needs to be taken into account, the use of a Lagrangian multiplier - Eq. (8) - is recommended. The optimized constrained solution is then obtained from the unconstrained one as described in Eq. (9). cTdu = k X= l+
*-«?" , cT(J'jy'c
du' = du + X(JTJf'c
(7) (8) (9)
3.2. How to apply the minimax criterion Applying the PTB combinatory method [5] implies that the "tangent outside material" and the "tangent inside material" geometries are simulated as shown on figure 3. The t-parameters allow to allocate and to permute the contacting points on each side of the "thick" geometry, in order to perform the combinatory process based upon essential subsets of points: F(Mk,G(u))
= tk.e
cIF( Mk, G( u,)) -tk.e = -F(Mk ,G(uJ)
(10) (11)
283 „(••»•
A * *•*••...
« TangmtOutside Material»
JX^X-J
Has on fte oufcide of material
Parameter u
<^^^:^:^K^r^^-G^^=^e
Ideal*
Jfc=-J«iienp«biiliesan «Tanaent Inside Material* .•!.•** . . .* fte mode of material
j I
Figure 3. The non-ideal simulated geometry.
The system of equations has to be written as follow, with dv being du complemented with the thickness parameter e, and A being the modified Jacobian matrix with an additional column for the t-parameters: Adv = -F0
(12)
As long as essential subsets of points are processed, A is a square matrix, thus easily invertible. For each subset, it is then possible to find the solution that minimizes e, allowing to carry on with the PTB algorithm. 4. Perspectives This variational association has been tested on single datums, common datums and datums systems made of cylinders and planes. It may be useful for verification as well as location and alignment a part on a CMM. Based on a unique mathematical model, it can be used to compare the association criteria. Since this association process doesn't use rotations and translations [5], it may also be apply to multi-physic mechatronic systems that conjugate geometry and electromagnetism for instance. References 1. J-Y. Choley, PhD of Ecole Centrale Paris, 2005-05 (2005). 2. A. Clement, C. Vallade, A. Riviere, Advanced Mathematical Tools in Metrology III, World Scientific (1997). 3. P. Bourdet, A. Clement, Annals of the CIRP, page 503, v. 37/1/1988. 4. M.G. Cox, R. Drieschner, J. Kok, Report EUR .15304 EN, Commission of the European Communities (1993). 5. P. Bourdet, L. Mathieu, C. Lartigue, A. Ballu, Advanced Mathematical Tools in Metrology II, World Scientific (1995).
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 284-288)
MEASUREMENTS OF CATALYST CONCENTRATION IN THE RISER OF A FCC COLD MODEL BY GAMMA RAY TRANSMISSION
CARLOS COSTA DANTAS Department of Nuclear Energy, Federal University of Pernambuco - UFPE, Av. Prof. Luiz Freire 1000 CDU 50740-540 Recife, PE.
VALDEMIR ALEXANDRE DOS SANTOS Department of Chemistry, Catholic University of Pernambuco - UNICAP, Rua do Principe 526, Boa Vista, 50050-900, Recife, PE. EMERSON ALEXANDRE DE OLIVEIRA LIMA Department of Statistics and Informatics, Catholic University of Pernambuco UNICAP, Rua do Principe 526, Boa Vista, 50050-900, Recife, PE. SILVIO DE BARROS MELO Informatics Center, Federal University of Pernambuco - UFPE, Av. Prof. Luiz Freire 1000,CDU, 50740-540, Recife, PE. By gamma ray transmission the radial distribution of the catalyst concentration, in the riser of a cold model of a FCC - Fluid Cracking Catalytic type unit, were measured. The solid concentration in the circulating fluidized bed was determined by transmission measurements with a 241Am gamma source and a Nal(Tl) detector. A computational PLC control the cold unit operation and the maintenance of a steady state working regime. Measurements of the air flow, pressure drop and catalyst density are continuously taken for monitoring the process variables. The radial catalyst concentration was in the (2 - 9) kg/m3 interval at mid-height of the riser. The radial distribution was in a rather annulus configuration for this diluted catalyst concentration. 1.
Introduction
To study the flow dynamics of a FCC riser an experimental cold model, operating at room temperature, is used to simulate problems occurring in the hot unit. The components in the circulating gas-solid system are FCC catalyst and air. Along the length L of the riser the total pressure drop APriser results from three contributions: acceleration, gravity and friction, the solely contribution due to gravity APEs is considered in Eq. (1). Nuclear methods can provide non-
285 intrusive measuring techniques and direct measurements from catalyst concentration as it can be evaluated by means of Eq. (2). The axial profile of the catalyst concentration distribution was determined and a one-dimensional mathematical model for describing the riser fluid dynamics was applied [1], [2], For a better understanding of the gas-solid flow in the riser a validation methodology is under study. Therefore, by analyzing each step of experimental and simulation procedures, a discrete modeling approach is required, for evaluate instrumental measurements and variables correlations in the flow parameters. 2. Flow Parameter Measurements The FCC cold model and the gamma ray measurements are associated to study the riser fluid dynamic according to the following investigation strategy. A mean values of the volumetric fraction of solid ss, equal to 1 menus the porosity 8 , between two riser positions pj and p 2 was calculated from the pressure profiles and the Equation proposed in [3]
<'-*> = < ^ 5 ^
(1)
where: Ap is the pressure drop over the distance Ah, p and pf are the solid and air densities. The acceleration and friction effects were neglected. A direct measurement of the solid volumetric fraction was made, using a Beer-Lambert type Equation [4] and obtaining a mean Rvalue along the gamma ray pass length, here specifically on the riser diameter £),-.
(\-e)=
^lnf-
(2)
Where Iv and If are gamma intensity in empty riser and at flow conditions, aP and pP are mass attenuation coefficient and particle density. Discrete Modeling Fluid dynamic models require parameters as e, ss, and gas and solid velocities Ug, Us that are based on measured variables, constant values as ps, p/, g, which are from reference data and parameters calculated from measurement and reference data as friction coefficient and Reynolds number. The experimental data was initially modeled to evaluate instrumental precision on the measurements of: gas flow Qg, solid flow Qs and the pressure drop in riser APriser- To estimate measured and expected values from the control set up the data were fitted in regression curves. As the independent variable of the system, Qg, was also calculated from flow equations as a matter of comparison. A good agreement with the calibration data from the flow meter
286 certificate was obtained. The solid flow Qs in kg/s was evaluated by calibrating the motor rotation rps versus catalyst mass transported per second. The measured variables correlations were investigated by modeling the pressure drop in riser as a function of gas and solid flows AP^e, = f(Qg, Qs) and the solid volumetric fraction as ES = f(APnser, Qs)All these data investigation were carried out with models that were linear in the parameters, the linear least squares method was the a parameter estimator as a=(CTC)~lCTy
(3)
with the matrix C of the independent variables and y the model response variable. The problem of to minimize the residuals r = f(a) = y - a C which is according to the LLS method rmnf(a)=fTf=\y-aC\2
(4)
a
And the uncertainty of the fitted parameters was calculated by means of the uncertainty matrix in Equation (3) and the standard deviation & calculated with residual r values, as usual. Equations (3) and (4) are given by Barker et al [5], the calculations where carried out in the computational program. Measuring the gamma intensities to estimate the catalyst concentration, or catalyst density given in kg/m3 as a mean value on the riser diameter [6]
^=TFXnj; " p i
(5) F
A specific problem emerges in measuring radial catalyst density distribution, by the radial scanning of the riser, as the gamma ray intensities assume a parabolic shape enhancing the experimental errors. Linear transformation and nonlinear models, as fitting a Gaussian peak, in [5 ] for example, do not fits the experimental data adequately because it is not exactly a bell curve shape. Only a three term Gaussian model, fits the parabolic curve of the radial scanning. To estimate the solid fraction es as a function of the relative axial distance z/L in riser and the gamma intensities Iv and If, the models were nonlinear in the parameters. The nonlinear least squares was applied for the parameters estimator and the nonlinear solver provide the Jacobian matrix in order to calculate the uncertainty associated with the fitted parameters. The discrete models were analyzed by Mest on the uncertainty of fitted parameters, by means of the residuals plots to evaluate the errors distribution and correlations. A Monte Carlo simulation was used to the least-squares estimator validation and a comparison of the results. After these tests the model were rejected or finally adopted.
287
3. Results and Discussions A very useful result from the data models investigation was to predict the solid density along the riser diameter ff„ by means of the following equation 2
2
z= ai + a2x+aiy+a4xy+a5x + a6y (6) where: x =AP, y = Qs. Results were compared with the on line measurement using Equation (5). The equation (6) gives also a comparison value for the solid volumetric fraction es from Equation (2), shown a good agreement. The catalyst density evaluation by means of Equation (5) were carried out by means of numerical integration of the area under the Iv and IF curves. The area under the curves was calculated by a Simpson method, direct on the experimental data and also on the spline interpolated data. A measurement of the radial density distribution was achieved by integrating, in narrow limits, along the radius interval the curve in the Gaussian model. / = aiexp(-((^-^)/c,) 2 ) + a 2 expHU-/7 2 )/c2) 2 )+exp(-((A:-fc3)/c3) 2 )
(7)
All the calculations were done through MatLab functions and a fair agreement of the integral values was obtained. Radial catalyst density distribution at 0.80 m, 0.90 m axial heights of the riser is presented in Figure 1.
Figure 1. Radial catalyst density distribution at two riser heights by gamma ray transition measurements.
4. Conclusions To apply a validation methodology for fluid dynamics modeling was necessary a discrete model approach for the experimental data.
288 From the radial catalyst distribution given in Figure 1, it can be seen that a mean value is not representative for the catalyst radial distribution, therefore, a bi-dimensional mathematical model is required for describing the fluid dynamic conditions in riser. Acknowledgments The work is supported by CTPETRO\FINEP\PETROBRAS. The authors wish to thank Dr. Waldir Martignoni, SIX and Eng. Claudia de Lacerda Alvarenga, CENPES/PETROBRAS, for their suggestions and assistance. References 1. A.C.B. Melo, 2004, Ph.D thesis, Department of Nuclear Energy - UFPE. 2. V.A. Santos, C.C. Dantas, Powder Technology 140, (2004) 116-121. 3. T. Grassier and K.E. Wirth, DGZfP Proceedings BB 67-CD, Germany (1999). 4. U. Parasu Veera, Chemical Engineering Journal, 81 (2001) 251-260. 5. R. M. Barker, M. G. Cox, A. B. Forbes, P. M. Harris, Best Practice Guide No. 4, Discrete Modeling and Experimental Data Analysis (2004). 6. R. N. Bartholomew, R. M. Casagrande, Ind. and Eng. Chem., vol. 49, No. 3, March (1957), p. 428.
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 289-292)
SOFTWARE FOR DATA ACQUISITION AND ANALYSIS IN ANGLE STANDARDS CALIBRATION MIRUNA DOBRE, HUGO PIREE FPS Economy - SMD Metrology, Bd. du roi Albert II, 16 1000 - Brussels - Belgium
Careful preparation of experiment is essential in angle standard calibration but is time consuming and mistakes are likely to occur. In order to overcome this problem, we have implemented an integrated acquisition and analysis software. This virtual instrument exploits the possibility to interact with mathematical software in order to perform data analysis. The user interface is an interactive guide to the design of experiment and the performing of the measurement cycle. Therefore this software may be used by lessexperimented technical staff.
1.
Introduction
The calibration of angle standards such as optical polygons, indexing tables, pentaprisms,... , requires no primary standard since any angular value can be realized by dividing a circle into N corresponding angular segments [1]. Each angular segment is compared with a reference angle (figure 1). Angles to be calibrated
Reference angle
Figure 1. Angle standard calibration: the circle division principle.
The comparison involves at least one autocollimator (optical device for small angle measurements). As well described in references [2, 3, 4], measurements form a system of N linear equations A supplementary constraint is the circle closure principle (sum of all angular segments is 360°). The reference angle
290 may therefore be unknown as there are N+l equations available. One of the angular segments can be chosen as reference and compared to all the others (self-calibration). If the reference angle belongs to a second standard, both devices will be calibrated simultaneously (cross-calibration). Each angle of the first standard will be compared with each angle of the second one. For complete statistical analysis, all the angular segments are successively compared with all the others. The measurements will form a system of N*(N-l) equations. This overdetermined system requires a least squares solution (eq. 1). Mn = AK - a2 M Mv = A,- ari 0 = a A. 0= a a.
I>
In matrix notation: M =C X 13 Solution: X= ( C T ' C ) ' 1 ' C T
(1) M
In our laboratory practice the solutions are found by arranging the data into ordered tables and summing columns or rows to calculate angular errors as described by Sim [2]. Tabular arrangement depends on the components of the measurement system, the category of angle standard, the number of division and the calibration method (self or cross-calibration). The method is limited only by the physical realization of circle division, usually done by a mechanical indexing table which allows a smallest increment of 10' of arc. Some of the required divisions (non-integer angles) may not be achieved with enough precision. Yandayan & al. proposed a technique to overcome this limitation [5]. The choice of the right table depends on several factors of the measurement setup. Errors are likely to occur and the preparation of the calibration is very time-consuming. The solution was to design an acquisition software in order to guide the operator through the experimental setup process, to acquire data and finally to calculate angular errors. 2.
Software description
LabView data acquisition software is used in interaction with MatLab mathematical software to create a single operator interface (virtual instrument VI). The flow chart is shown in figure 2. The first step is to input information about calibration: the standard(s) to be calibrated (polygon, indexing table, angle gauges), the available instrumentation (number of autocollimators, number and minimal increment of indexing tables, square mirrors). A permanent concern when designing the user interface was its accessibility to less-experimented technical staff. The user decision is assisted by information
291 about possible experimental schemes, estimations of measurement time and suggestions about the number of measurement scheme repetitions. INPUT CALIBRATION
U U 4
Proposed methods:
CHOOSE Method
SETUP
PARAMETERS
Measurement
A C Q U I R E Data & A N A L Y S E Data
Method 1. Cross-calibration against polygon 2. Cross-calibration against indexing table 3. Self-calibration
Nbr. of
Time
N,
T,
N2
T2
JV?
Process data in Matlab: •S Build coefficient matrix. S Solve the overdetermined linear equations system (least squares) •S Calculate type A uncertainties.
JpS
OUTPUT CALIBRATION
RESULTS
Figure 2. Data acquisition and analysis flow chart. The setup of the measurement is also well documented with step-by-step indications about the necessary adjustments which minimize type B uncertainties: accurate centering of standards, autocollimator adjustment, shielding to avoid air circulation through the optical path. Most autocollimators have the ability to send data to computers through RS232 or IEEE ports. Each acquisition is followed by a manual repositioning of the rotating table. The acquisition step is therefore long and requires a permanent human intervention. Further improvement of our angular calibration facility involves setting up an automatic rotating table. Angular errors can not be estimated before the full experimental scheme is realized. Meanwhile the acquired values are simply displayed in tabular form. Full calculations are done between two repetitions of the scheme and temporary results are displayed in tables and graphics (errors and cumulated errors around the circle) The data analysis process is transparent to the user. It is based on a MatLab script node imbedded in the LabVIEW VI. The script node sends the measurement matrix, creates the coefficient matrix depending on the chosen calibration scheme, uses MatLab capabilities of solving over-determined linear equations systems (by least square computation) and displays the results on the
292 LabVIEW front panel. If all repetitions of experiments has been achieved, the standard uncertainty (type A) is also estimated. The particularity of calibrations constrained by a condition (circle closure equation) is that the standard uncertainty for any angular interval error a, is in fact a combined uncertainty involving all others intervals uncertainties. As demonstrated by Estler [4] if we assume that the standard deviation u0 is constant for all the measurements, the combined standard uncertainty of aj is less than u0 by a factor ofJ(N - l ) / N . Each measurement of one of the other deviations adds a little to the knowledge of a, because of the closure constraint. The raw data is saved in ascii form and analysis results are transferred to a spreadsheet file for further report generation. 3.
Conclusions and perspectives
The calibration of angle standards using self or cross calibration methods involves careful design of experiments and solving of over-determined equations systems. Written laboratory procedures and hand filling of graphical square calculation forms are impractical to use on a daily basis. Data acquisition software is used in conjunction with a mathematical solver and an explicit user interface is designed in order to avoid mistakes in measurement setup and data analysis. This interface helps the user in choosing the optimal experimental design by taking into account the available devices, the required uncertainty and the time factor. Self and cross calibration techniques are very time consuming as they require much more experiments than simple comparison calibration. Further development of angle standards calibration includes the transition to a full automatic system. Such a system involves not only data acquisition and analysis but also automatic positioning of the devices to be calibrated. References 1. Schatz B., Controle des angles Techniques de l'lngenieur. R-1300, (1986). 2. Sim P.J., Angle standards and their calibration, in Modern Techniques in Metrology ed P L Hewitt (Singapore: World Scientific). 102-121 (1984). 3. Reeve C. P., The calibration of indexing tables by subdivision, National Bureau of Standards (US), NBSIR 75-750 (1975). 4. Estler, W. T., 1993, Uncertainty analysis for angle calibrations using circle closure J. Res. Natl. Inst. Stand. TechnoL 103,141-151 (1993). 5. Yandayan T. & al., A novel technique for calibration of polygon angles with non-integer subdivision of indexing table Prec. Eng. 26, 412-424 (2002).
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 293-296)
CALCULATION OF UNCERTAINTIES IN ANALOGUE DIGITAL CONVERTERS - A CASE STUDY* MARIAN JERZY KORCZYNSKI Electrical and Electronic Engineering Department, Institute of Theoretical Electrotechnics, Metrology and Material Science, Technical University of Lodz, 18/22 Sefanowskiego Street, 90-924 Lodz, Poland ANNA DOMANSKA Department of Electrical Engineering, Institute of Telecommunication, Poznan University of Technology, Piotrowo 3A, 60-956 Poznan, Poland The paper describes a case study concerned with the identification and evaluation of uncertainties in Analogue to Digital Converters. The originality of this work is that a newly introduced method of evaluating uncertainties based on using the Fast Fourier Transform is applied, and the results obtained compared to those obtained from traditional methods. It is proposed that in the future instruments might display not only a single value (the best estimate of the measured quantity) but also an interval that contains the measured quantity with a specified probability.
1. Introduction The uncertainty associated with a measurement depends on the accuracy of all elements of the measuring system. The elements of a digital system include sensors, conditioners, analogue-to-digital conversion, and procedures for handling and displaying the acquired data [1]. The value that appears on the display of the system is the best estimate of the measured quantity. However, the measurement is incomplete without a statement of the uncertainty associated with the value. The authors of this paper believe that in the future such instruments will display both the estimate and a statement of the associated uncertainty corresponding to a stated level of confidence. The evaluation of uncertainties will be a part of the data handling procedures of the system, and will include the accuracy of the analogue-todigital conversion as a component contributing to the overall uncertainty. A short description of the evaluation of the uncertainty arising from this component is described in the following sections. A physical model of an Analogue to Digital Converter (ADC) is used as the basis for the evaluation of
* Work partially founded under EU SofTools_MetroNet Contract No G6RT-CT-2001-05061
294 uncertainties [4]. The originality of the work is in the application of a newly developed method for evaluating uncertainties that uses the Fast Fourier Transform (FFT) to undertake numerically the convolution of probability distributions assigned to the input quantities in a linear model of the measurement. 2. Model of Analogue to Digital Converter The model of an ADC used as the basis for the evaluation of uncertainties is illustrated in Fig. 1. The ADC is decomposed into two parts (Fig. la): the first part is represented by a transfer function g(x), which is a continuous non-linear function with a certain gain and offset, and the second part is an ideal quantized element represented by the function Q(z). The noise of the ADC is applied to the output of the model and is denoted by n. A„, F »„
.t
' X
Y
K.
y
A A» V rA. /*• J * ' r
X
•^
\\
-7
'x)
/,
A0.v„
,. 4,.. 1
Q(*)s7\y
4-
• • ! • •
e,n
"1 A„
b)
Figure 1. Model of an ADC: a) transfer functions, b) effects contributing to the overall uncertainty.
The transfer function g(x) is given by z
=g{x)={G
+ A
G)X+AOFF+ANL{X)=:X
+ A
GX+AOFF+ANL(x)
0)
where x is the input signal, G the gain with G = 1, AG an error associated with the gain, A0FF an offset error, and A^, a non-linearity error. The transfer function Q(z) is given by Q(z)=z + Aq(z)
(2)
where Aq denotes a quantisation error. Consequently, y = Q(z)+n = z + Aq(z)+n~x + AGx+A0FF + ANL{x)+ A (x)+n where n is the noise applied at the output of the ADC. The error associated with a sampled value is then given by
(3)
295 e
i=yi-xi~AGxi+A0FF+ANL{xi)+Aq(xi)+ni
(4)
and illustrated in Fig lb. Applying the law of propagation of uncertainty to the model defined by Eq. (4), the standard uncertainty associated with a sample value is expressed in terms of the standard uncertainties associated with the component errors as follows: uc=^u2Gx2 + u20FF + u2NL + u2 +ul
(5)
Manufacturers of ADCs provide information about the component errors, which are assumed to be mutually independent, as follows. The gain, offset and nonlinearity errors are characterized by rectangular distributions with mean values of zero and prescribed limits MG, MOFF and MNL, where |A G |<M G ,
\A0FF\<MOFF,
\ANL\<MNL
(6)
The noise n is characterized by a Gaussian distribution with mean zero and standard deviation an, and the quantisation error by a rectangular distribution with mean value zero and semi-width q/2. This information can be used to provide a coverage interval, corresponding to a stated coverage probability, for the error associated with a sampled value. 3. Example of calculation The evaluation of ADC uncertainty was performed for a 16-bit successive approximation analogue-to-digital converter, bipolar ± 5 V, for which the Full Scale Range is FSR = 10 V, the quantisation is q = 0.15 mV, and N= 16. The specification of the ADC is given in the first three columns of Table 1. Table 1. Component contributions to the uncertainty of an ADC Transfer CharacteristicsAccuracy (G=l) Offset Error (max) Gain error (max) Linearity Error (max) Quantization Error Noise (3 a) (max) Gaussian
Limits of errors ± ± ± ± ±
0,2 %FSR 0,2 % 0,006%FSR 1/2 LSB 0,003%FSR
M(x = 5V) mV MOFF = 20
M 0 = 10 MNL = 0.6 Mq =0,075 V„=0,1 mVrms
u mV »OFF= 1 1 , 5 5
ua = 5,77 «NL=
0,35
Hq = 0,04 u„ = 0,10 uc =31,09
If the error associated with a sampled value is characterized by a Gaussian distribution [4], then for a coverage probability of 0.95, the expanded uncertainty is
296 U = ku =2-31,09 = 62,18 mV
(7)
corresponding to a coverage factor of k = 2. Alternatively, a method [2] that uses the FFT to undertake numerically the convolution of distributions characterizing the components errors may be applied (Fig. 2). The expanded uncertainty obtained, for a coverage probability of 0.95, is 55.84 mV. u„(uniforni)
TTTTT. -*fFFTr-i A„ v„
uH(uniform)
'-% ASL
TTTTTTTTTT - * m
1
uH(uniform)
F g H TTTTTTTTTT A
u„(uniform)
CiU r
" M
\.
u A (Normal)
o,u
*
..tT
T%.
Figure 2. Block diagram of uncertainty evaluation based on applying the FFT method.
4.
Conclusion
An FFT based method has been applied to evaluate the uncertainty for an ADC. The method has been validated using a Monte Carlo method. The method gives a result that is 11 % smaller than that provided by a traditional method. It has the advantage over the Monte Carlo method of permitting the calculation to be undertaken in real-time. References 1. E. Ghiani, N.Locci N., Muscas C , Auto-Evaluation of the Uncertainty in Virtual Instruments, IEEE Tr. on Instrumentation and Measurement, Vol. 53, No. 3, June 2004. 2. Korczynski J. Fotowicz P. Hetman A. Gozdur R. Hlobaz A., Methods of Calculation of Uncertainties in Measurement PAK 2-2005 3. Nuccio S., Spataro C , Approaches to Evaluate the Virtual Instrumentation Measurement Uncertainties, IEEE Tr. on Instrumentation and Measurement, Vol. 51, No. 6, December 2002. 4. Guide to the Expression of Uncertainty in Measurement, BIPM, IEC, ISO, IUPAC, IUPAP, OIML, 1995.
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 297-300)
A S Y M P T O T I C LEAST S Q U A R E S A N D S T U D E N T - T SAMPLING DISTRIBUTIONS
ALISTAIR B. F O R B E S Hampton
National Physical Laboratory Road, Teddington, Middlesex, UK, TW11
OLW
Asymptotic least squares (ALS) methods can be used to implement robust estimation methods using nonlinear least squares algorithms. In particular, this paper shows how maximum likelihood estimates for Student-t sampling distributions can be implemented using ALS.
1. Asymptotic least squares Asymptotic least squares (ALS) approaches [1] have proved to be effective in fitting models to data in which there are outliers. Suppose we wish to fit a model y = (x,a), parametrized by a, describing a response y as a function of a: to data points {(xi,yi)}r^L1. The standard least squares (LS) approach is to estimate a by the solution of the problem m
minE(a) = y^/^aii.j/i.a), a
*—* i=l
/»(£*,&, a) = yt - <j>(xi,a).
(1)
Least squares solutions correspond to maximum likelihood estimates for normal (Gaussian) sampling models: if the model for the measurement data is of the form yi = (j)(xi, a) + e,, where e* e N(0, a2) is drawn from a normal distribution with a known, then the least squares solution a of (1) maximises the probability p(y|a) of observing y with respect to a. In other words, a is the most likely explanation of the data. The normal probability density function accords vanishingly small probabilities to values more than a few standard deviations from the mean. For this reason, a least squares model fit is forced to move the model value closer to an outlying data point, often at the expense of dragging the fit away from the rest of the data. The main idea of ALS is to apply a weighting function to fi that will reduce the effect of large residuals associated with wild points. (ALS is similar to M-estimators in this regard [2].) We look to minimise an objective
298
function of the form m
E{a) = ,£j?(xi,yi,*),
fi = w(fi){fi),
(2)
for some suitable weighting function such as
W{X)
(3)
= (1 + cW 2 '
where c > 0. Optimisation problem (2) can be solved using the GaussNewton algorithm [3], for example. The solution of (2) with the weighting function (3) can be thought of as the maximum likelihood estimate for a sampling distribution with an improper probability density function derived from p(x) = exp{—w 2 (x)x 2 /2}. We use the qualifier 'improper' since for c > 0 the function p{x) does not have a finite integral: p(x) tends to a strictly positive constant as \x\ tends to oo. Rather than starting from a weighting function w(x) and then considering what is the corresponding sampling distribution, we can instead start from an appropriate distribution and then determine a corresponding w(x). For dealing with outliers, we require a distribution which is similar to a Gaussian but has longer tails, according higher probabilities to values far from the mean. 2. Maximum likelihood for student t distributions The Student-i distribution [4] t(fx, a, v) can be regarded as a generalisation of the Gaussian distribution. It is parametrized by location /x, scale parameter a > 0 and shape parameter v > 0, also referred to as the degrees of freedom. Its probability density function is given by 2-1
io\
\
1
„(%,,,„) = g ( y t r ) 1 / a
r((i/ + l)/2)
r(y/2)
1+
-(u+l)/2
I^-M v
The distribution has mean /j,, v > 1, and variance va2/{y — 2),v> 2. As v —* oo, this PDF approaches that for the normal distribution N(fi,cr). Given a model y = cp(x, a) + e, sampling distribution t(0, a, v), and data {(xi,yi)}iL\, maximum likelihood estimates of a, a and v are found by minimising m
- L ( a , a, v\y) = - ^ l o g p ( / j | 0 , o - , i / ) ,
/j(a) = yt -(xi,a).
299 If a and v are regarded as fixed, we can find maximum likelihood estimates of a using an ALS method if we choose w(x) defined by 1/2 1 loV , J(X) =, (!/. +- v1)^L g/f,l + 1 xI \^
W(
Since log(l + e) = e - e 2 /2 + e 3 / 3 . . .,|e| < 1, ty(a;) is well defined at x = 0 with value of (1 + l/i/) 1 / 2 . 3. Numerical example: polynomial fitting We use Monte Carlo simulation [5] to compare ALS with LS approximation using polynomial functions 4>{x) = Y^j=i aj^j(x) where (f>j(x) are basis functions (such as Chebyshev polynomials [6].) We generate sets of data as follows. We fix m the number of data points, mo < m the number of wild points, n the number of basis functions, a the standard deviation for standard noise and ao > a the standard deviation associated with wild points. We form uniformly spaced abscissae x = (x\,..., xm)T and corresponding mxn observation matrix C with c^ = (pj(xi). Given a = ( o i , . . . , a „ ) T , we set yo = C a so that the points {(xi, 2/1,0 )}£Li lie exactly on the polynomial curve specified by parameters a. For each q = 1 , . . . , N, where N is the number of Monte Carlo simulations, we perform the following steps: I Add standard noise: y g = yo + r, r* € A^(0, a2). II Choose at random a subset Iq C { 1 , . . . , m} of mo indices. III Add large perturbations: for each i € Iq, yiq)}'iLi determining parameter estimates a.q and &q, respectively. V Store estimates: A(q, 1 : n) = aj and A(q, 1 : n) = a From these simulations we can compare, for example, the sample standard deviations Uj (LS) and Uj (ALS) of the calculated parameter values {-4(*,j)}™i and {A^J)}^, respectively, j = l , . . . , m . Figure 1 shows straight line polynomial (n = 2) fits to data generated with a = (1.0,1.0) T , m = 26, m0 = 5, a = 0.05 and a0 = 2.0. The ALS fit was generated using a = 0.05 and v = 10. For iV = 1000 Monte Carlo simulations, the differences between a and the mean of ALS estimates {a,} were (0.0010,0.0006)T while those for the LS estimates {a g } were (0.0227,0.0116) T . The standard deviations of the ALS estimates u(ALS) were (0.024,0.021) T while those for the u(LS) were (0.35,0.28) T , more than ten times the corresponding ALS standard deviations.
300 1
«
+
Data O Outliers -LSfit - ALSt fit
1.5
^
®
1
0.5
^ -^3 +
0
-•' 0.5"
1
-1
I
i
-0.8
-0.6
1
-0.4
1
-0.2
i
i
i
i
i
0
0.2
0.4
0.6
0.8
1
Figure 1. LS and ALS straight line polynomial fits to data with outliers. Acknowledgements This work has been supported by the U K ' s Department of Trade and Ind u s t r y National Measurement System Directorate as p a r t of t h e Software Support for Metrology programme. References 1. D. P. Jenkinson, J. C. Mason, A. Crampton, M. G. Cox, A. B. Forbes, and R. Boudjemaa. In P. Ciarlini, M. G. Cox, F. Pavese, and G. B. Rossi, editors, AMCTM VI, pages 67-81, Singapore, 2004. World Scientific. 2. P. J. Huber. Robust Statistics. Wiley, New York, 1980. 3. P. E. Gill, W. Murray, and M. H. Wright. Practical Optimization. Academic Press, London, 1981. 4. M. Evans, N. Hastings, and B. Peacock. Statistical Distributions. Wiley, 2000. 5. M. G. Cox, M. P. Dainton, A. B. Forbes, P. M. Harris, H. Schwenke, B. R. L. Siebert, and W. Woeger. In P. Ciarlini, M. G. Cox, E. Filipe, F. Pavese, and D. Richter, editors, AMCTM V, pages 94-106, Singapore, 2001. World Scientific. 6. D. C. Handscombe and J. C. Mason. Chebyshev Polynomials. Chapman&Hall/CRC Press, London, 2003.
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 301-305)
A STATISTICAL P R O C E D U R E TO Q U A N T I F Y T H E C O N F O R M I T Y OF N E W T H E R M O C O U P L E S W I T H R E S P E C T TO A R E F E R E N C E F U N C T I O N
DANIELA ICHIM, MILENA ASTRUA Istituto di Metrologia "G. Colonnetti" - CNR, 10135 Strada delle Cacce, 13, Torino, Italy
The problem of assessing the degree of conformity of an artefact with respect to a standard model is addressed. Three possible strategies are proposed: the correlation coefficient, a least squares approach and the concordance correlation coefficient. These techniques are compared when applied to assess the conformity of six new Platinum-Palladium thermocouples with respect to the NIST/IMGC emf-temperature function.
1. Introduction Aiming at replacing the presently used type S (Pt/Pt-Rh alloy) thermocouples as secondary reference standards up to 1500 °C, Platinum versus palladium (Pt/Pd) thermocouples have been studied. One of the most complete works on the P t / P d thermocouples is the joint study performed by the National Institute of Standard and Technology (NIST) and the IMGC [1], which provided a standard emf- temperature reference function / * for the range 0 °C to 1500 °C. Six new P t / P d thermocouples were constructed at IMGC, using Pt and Pd wires from three different manufacturers and of different nominal purity [2]. Since many studies on these thermocouples indicated that their behaviour strongly depends on the preparatory treatments, in the thermocouples construction the procedure adopted in [1] was closely followed. Section 2 presents the statistical methods used at IMGC to quantify the conformity of a new artefact with respect to a model; Section 3 discusses the application of these techniques to evaluate the conformity of the six IMGC P t / P d thermocouples with respect to the NIST/IMGC reference function.
302
2. Statistical Techniques Suppose a certain class C of artefact is characterized by describing the relationship between two physical quantities Qr and Qe. Suppose that a model function Qr = f*(Qe) is given. When a new product V (artefact) is manufactured, its belonging to C has to be assessed. Generally, this task is performed by measuring the values of Qr and Qe with V. Normally, the setpoint values of Qe are pre-defined, say Qei,- • • ,QeN- Then, the Q^i i = 1,- • • ,N values of the product V are acquired. The comparison to be performed is between the model sample Xi = f*(Qei),i = 1,. .• ,N and the experimental sample Yi — Q^t, i = 1,... ,N, respectively. Three simple approaches to conformity quantification are presented. 2.1. The correlation
coefficient
Pearson's correlation coefficient p is a well known measure of extent of how much the two data samples are linearly related. The sample correlation coefficient is introduced as r = . Sx0" , where sxv denotes the sample co"xx&yy
variance, while s 2 and s 2 denote the sample variances of X and Y. The main drawback of p is that it measures only the linearity between two variables. If the two samples were different only as to translational term, hence non-conformity, the correlation coefficient would still be close to 1. 2.2.
The least squares
testing
A least squares approach could be based on the following: if the two samples were conform, the pairs (Xi,Yi)i=it...tN would fall on the bisector line. Assume the relationship between X and Y to be modelled as: Yi = po+01*Xi
+ ei,
i = l,...,N
(1)
where £», i = 1 , . . . , N are the errors, supposed to follow a gaussian distribution with zero mean and standard deviation a. Denote by X = (xid) G IR^'2, where xitj = {Xtf-1^ = 1,.. .,N,j = 1,2 and by Y = (Ylt..., YN)1 £ R^' 1 . The estimator /3 = ( X ' X ) " 1 XlY of t 2 / 3 = (/30,/31) G R follows a gaussian distribution of mean f3 and variancecovariance matrix (XX) a2, see [3]. To test the null hypothesis Ho : 0o = 0 against the alternative hypothesis Hi : (3o 7^ 0, the test statistic to = Po/se($o), following the t distribution with N — 2 degrees of freedom (tjv-2) is used, see [3]. To test Ho : /?i = 1 against the two-sided alternative hypothesis, ti = (/3i — 1J /se(fii) ~ ijv-2
303 Table 1. EN
T(°C) 0.01 156.5985 231.928 419.527 660.323 961.78
E
(mV)
0 0.9217 1.4286 2.9643 5.7810 10.8115
Measured emf values of the six IMGC thermocouples. EET (mV) -0.0002 0.9217 1.4290 2.9653 5.7816 10.8130
EEa (mV) -0.0004 0.9197 1.4258 2.9598 5.7731 10.8008
EEaP (mV) -0.0002 0.9199 1.4260 2.9601 5.7735 10.8017
En (mV) 0.0004 0.9224 1.430 2.9654 5.7833 10.8148
E12 (mV) 0.00000 0.9226 1.430 2.9667 5.7836 10.8129
is used. Here se denotes the standard error of the estimator. Then, given a significance level a, to and ii are compared with the critical values £w-2,i-aSuch least squares approach could fail to reject the null hypothesis (conformity) if the data were very scattered (non-conformity). 2.3.
The concordance
correlation
coefficient
The third method is based on the deviations Z» = Xi — Y^i = 1,...,N. If there were conformity, Z would follow a zero-mean gaussian distribution. Let Z(i) < Z ( 2 ) . . . < Z(jv) be the values Z\,..., Zjy ordered in ascending order. The empirical distribution function Fz is denned as: Fz(z)=j/N,
if
Z{j)
(2)
The concordance correlation coefficient, C, was firstly introduced in [4]. The empirical (Fz) and normal (Fv) distribution functions are compared like in the Kolmogorov-Smironov test. Instead of taking the maximum deviation, an indicator is constructed. As in [4], C is estimated by: S2z + S],+
(FZ-FM)2
where SzM = jr E t i {Fzizi) - Fz) [FM{Zi) - FM)
S2z =irEf = 1 (^(^)-^) 2
fy
2
(4)
^TiElAF^i)-^)
and Fz and FAT denote the corresponding means. |C| values close to 1 indicate conformity, while values close to 0 indicate non-conformity. 3. I M G C - C N R thermocouples study The thermoelectric emf values of six IMGC thermocouples (called EN, ET, II, 12, Ea and Eap respectively) have been measured at iV = 6 fixed points defining the ITS - 90 in the range 0 °C to 961.78 °C, see [5]. The obtained
304
values are presented in the table 1. For each thermocouple, its conformity to the model / * has to be assessed. Denote by (Ei,Ti)i=1 N the set of measurements (emf and fixed points temperatures, respectively) acquired for a single thermocouple and let E^1 = f*(Tt),i = 1,...,N. Figure 1 shows the differences Ei — E™, i = 1 , . . . , TV, for each thermocouple. II, 12, EN and ET, produced emf values similar to E™,i = 1 , . . . ,N: for example, the dispersion of each of the four emf values at the fixed point of silver (971.78 °C) was about 3 /xV. The thermoelectric emfs of the other two thermocouples, were clearly lower at each fixed point. The emf values at the Ag point were found to be about 12 pN below the E™ values, which corresponds to a temperature equivalent of about 0.6 °C. The three statistical
4r 2
*
1
0-
1
-2-4-6-
• EN 0 ET O Ea
-8-
X *
86
+
•
0
#
*
V
H
0
6
Eap 1
6
+ 12 -10-
X
-12-
-ifoo
0 200 400, Temperature (C)
600
800
1000
Figure 1. T h e differences between the model function and t h e thermoelectric values of the six IMGC thermocouples.
tools were applied to quantify the conformity of the IMGC thermocouples. Correlation coefficient The estimated values of p were greater than 0.999, for every thermocouple. Based only on p, all thermocouples would be considered in perfect agreement with /*. This is not the right conclusion, as showed in figure 1, too. Least squares testing The least squares results are presented in table 2. to and ii were computed as described in the section 2.2. The critical value £4,0.95 equals 2.1318. When testing the intercept, the null hypothesis was rejected for the thermocouple II only. Testing the slope, the null hypothesis was rejected for
305 Table 2. EN ET Ea Eap 11 12 Table 3.
Least squares estimations.
$o 0.1552 0.2199 -1.159 -1.034 0.7297 1.221
se0o) 0.1911 0.3770 0.5949 0.6192 0.2118 0.5946
$i 0.99980 1.00000 0.99889 0.99896 1.00000 0.99993
seQ3i) 3.7E-05 7.6E-05 1.1E-04 1.2E-04 4.1E-05 1.1E-04
C values for the six IMGC thermocouples.
Thermocouple C
EN 0.58
ET 0.88
Ea 0.13
Eap 0.15
II 0.38
12 0.73
the EN, Ea and Eap thermocouples. Using this approach, it is not easy to quantify the degree of conformity with respect to the model function /*. Concordance correlation coefficient For each of the six analyzed thermocouples, C was estimated by (3). The results are presented in the table 3. This concordance correlation coefficient quantifies the degree of conformity of each of the six IMGC thermocouples with the model function / . Obviously, looking also at the figure 1, the "worst" thermocouples are Ea and Eap, while the most conform are ET, 12 and EN, respectively. The low C value for the II thermocouple is due to the fact that its emf measurements are always higher than the / * values. Actually, this was also identified by the least squares approach, II being the only thermoucouple whose intercept null hypothesis was rejected. 4. Conclusions Three statistical tools for assessing the conformity of new thermocouples with respect to a model function were discussed. Reliable methods proved to be the least squares testing and the concordance correlation coefficient. The latter is also able to quantify the conformity. This study indicates that the conformity between three thermocouples and the standard model can be stated. Further investigations are required for the others. References 1. 2. 3. 4. 5.
Burns G. W., Ripple D. C , Battuello M., Metrologia 35, 761 (1998). Astrua M., Battuello M., Dematteis R., Mangano A., Tempmeko 2004, 465. Draper N., Smith H., Applied Regression Analysis, New York, Wiley. Lin L. I., Biometrics 45, 255 (1989). Supplementary information for the International Temperature Scale of 1990, BIPM, (1990).
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 306-309)
N O N - P A R A M E T R I C M E T H O D S TO EVALUATE DERIVATIVE U N C E R T A I N T Y F R O M SMALL DATA SETS
DANIELA ICHIM Istituto di Metrologia "G. Colonnetti" - CNR, Strada delle Cacce 73, 10135 Torino, Italy. PATRIZIA CIARLINI Istituto
per le Applicazioni Viale del Policlinico
del Calcolo "M.Picone" 137, 00161 Roma, Italy.
CNR,
ELENA BADEA, GIUSEPPE DELLA GATTA Dipartimento
di Chimica
IFM,
Universita di Torino, Torino, Italy.
Via P. Giuria 9,
10125
Linear models describing the behaviour of the molar enthalpy of solution at infinite dilution, Aso\H^, as a function of temperature T are considered in a hydrophobic hydration study. Purpose of this work is to evaluate the Aso\H^ — f(T) function, its point derivative and associate uncertainty, when information available for data modelling is rather limited from a statistical point of view. Methods not requiring the assumption of strict conditions on data probability are used.
1. The problem In a hydrophobic hydration study of small peptides, the molar enthalpy of solution at infinite dilution in water, A s o i i / ^ , here called H(T), has to be modelled by a suitable function f(T) to evaluate the molar heat capacity change for the solution process and its uncertainty. This quantity, A g o i C ^ , here named C(T), is denned as the point derivative of the enthalpy H(T) with respect to temperature T at a given To. In chemical studies, this uncertainty is needed to obtain the uncertainty of the partial molar heat capacity of solutes at infinite dilution, defined as C ^ = A s o iC~ m + Cp>m, where Cp>m is the molar heat capacity of the solid pure compound [1]. It should be underlined that an incorrect choice of the model / sistematically affect the evaluation of the mentioned uncertainty. For low-molecular mass molecule solutions, the procedure adopted to
307
get an estimate of /'(To) was as follows: a) for i = 1 , . . . , N, H(Ti) values from small sets of independent measurements were obtained; b) a model / that approximates the relationship H(Ti) ~ /(Tj) was constructed; c) C(TQ) was evaluated as the / derivative at To. In this experimental field, a few number of temperature (N < 5) is commonly considered and, at each T,, only from 4 to 8 independent measurements of H(T) are acquired. Let us denote these measurements by Hij,i = 1 , . . . , N, j = 1 , . . . , Wj. It must be stressed that the small numbers W{ and N are crucial to the statistical analysis. To avoid misleading least squares (LS) solutions in a polynomial regression for the H(T) approximation, the mentioned problem of having a reduced size of data should be carefully examined [3]. Moreover, if one would correctly infer the uncertainty of the polynomial coefficients as BLUE least square estimates, strict assumptions on data probability distributions should be assumed and tested. In such experimental conditions, it seemed reasonable to choose a linear polynomial f(T) = ao + a\T and to use non-parametric estimators to make inferences. Hence, the requested uncertainty of the derivative at To is the uncertainty of the slope parameter. In the next section, two procedures are proposed and applied. They are based on resampling strategies that allowed us to evaluate the point derivative uncertainty without the normality assumption on data probability. 2. Resampling strategies and regression The first method adopted was based on the bootstrap strategy to evaluate the accuracy of the mean value for each temperature point (step a). The non-parametric bootstrap approach [4] for the mean estimate was applied to the Hij data to obtain point or interval measures of accuracy to be associated to each H(Ti). For each i = 1 , . . . , N, the method consisted in drawing (with replacement) B samples of size W,, say H* • b,j = 1 , . . . , Wj, from the corresponding data set Hitj,j = 1 , . . . , Wi; then, H*b = ^ Y^f=i H*,j,b> b = l,...,B was computed. This strategy allowed us to compute the bootstrap standard estimate of the mean Hi = -^ Yl,j=i Hi,j a n d its standard bootstrap 95% confidence interval as follows: TT* n
i
1 V*^
fj*
— B 2^b=lni,b
s*B(Hi) =^^TEl1(Htb-H*)2 (L*,U*) = (Hi - 1.96s*B(Hi),Hi + 1.96s*B(Hi))
(1)
308 Then, the N pairs of lower and upper bounds, L* and U*, were used to compute 2N regression lines to get a possible slope variability, k = l,... ,2N: N
a(fc) : min J T (Yt - a 0 - a^f
(2)
a
where Yt may equal either L* or U*, i = 1 , . . . , N. Finally, the slope interval (k)
uncertainty was obtained by using the extreme values of a\ from equation (2). The second procedure was based on the permutation notion (resampling without replacement). Being Wi small numbers, the analysis of all permutations of Hitj of size N was possible in a reasonable computing time. The proposed method to estimate the derivative uncertainties was the following: (1) Selection, without replacement, of an index j — (ji,- • • ,JN) €
n£i{i.---,wi} (2) Consideration of enthalpy points Hij1, ^2,j 2 > • • • > HN,JN (3) Computation of LS estimates: aw
/ N \2 : min I ^ Hitji - a0 - aiT- J
(3)
(4) Denoting W = n*=i ^*> computation of required standard error: w
,
w (4)
3. E x a m p l e s The proposed resampling procedures were applied to our study of seven small peptides, for which the enthalpy measurements were acquired at N = 3 points, Ti = 296.84 K, T2 = 306.89 K and T 3 = 316.95 K. Commonly, in hydrophobic hydration studies, the evaluation temperature is the standard value To = 298.15 K. The number of enthalpy measurements, at a given temperature, varied from 4 to 8. To check the normality of these data, the kurtosis was computed and tested by means of bootstrap methodology as in [2]: one of these compounds showed a significant skewed distribution. In table 1, the results were obtained using both resampling methods (BM, PM) and the ordinary LS procedure applied to all the W data. The variability of ai is given by the Min and Max values and its standard deviation (Std). Moreover, the standard error (SE) for the OLS estimate is
309 Table 1. Comparison of uncertainty estimates for a\: bootstrap method (BM) with B = 250, permutation method (PM), ordinary LS standard error (SE) Small P e p t i d e
Min
Max
SE
Std
BM
PM
BM
PM
BM
PM
OLS
SP1
0.015
0.003
0.024
0.038
0.004
0.009
0.001
SP2
0.114
0.108
0.129
0.151
0.006
0.012
0.022
SP3
0.032
0.024
0.041
0.050
0.004
0.007
0.008
SP4
0.358
0.351
0.363
0.368
0.002
0.005
0.004
SP5
0.120
0.113
0.125
0.128
0.002
0.004
0.004
SP6
0.219
0.198
0.230
0.243
0.004
0.011
0.001
0.070
0.064
0.075
0.002
0.004
0.007
SP7
0.082
reported in the last column. It must be noted that the bootstrap intervals (with replacement) were narrower than the intervals estimated by the permutation method (without replacement), as shown by the standard error estimates and by the difference of the two interquartile ranges. This differences can be due to the fact that the proposed bootstrap procedure uses the mean confidence intervals while the permutation method uses the most extreme data combinations. Except for one of the studied compounds, the bootstrap relative uncertainties were lower than 5%, while the permutation relative uncertainties were lower than 10%. The same set of data was also analyzed by the Nelder-Mead optimization algorithm applied to a polynomial model [5]. The average value of uncertainties was 2.5%, for only one compound the uncertainty being about 5%. The use of non-parametric methods, avoiding strict normality assumptions, allows to evaluate realistic uncertainties from small data sets. References 1. L. Abate, B. Palecz, C. Giancola, G. Delia Gatta, J. Chem. Thermodyn. 29, 359 (1997). 2. P. Ciarlini, in AMCTM III, Series on Advances in Mathematics for Applied Sciences, vol. 45, 1997, 12 - 23. 3. B. Siebert, P. Ciarlini, in AMCTM VI, Series on Advances in Mathematics for Applied Sciences, vol. 66, 2004, 122 - 136. 4. B. Efron and R. J. Tibshirani, An Introduction to the Bootstrap, Chapman & Hall, Inc., New York, 1993. 5. G. Delia Gatta, T. Usacheva, E. Badea, B. Palecz, D. Ichim, J. Chem. Thermodyn. (2005) in press.
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 310-315)
ALGORITHMS FOR S C A N N I N G P R O B E M I C R O S C O P Y DATA ANALYSIS*
P. K L A P E T E K Okruzni
Czech Metrology 31, 638 00 Brno,
Institute Czech Republic
In this article two algorithms related to analysis of scanning probe microscopy d a t a are presented. Article focuses namely on algorithms related with advanced mathematical and software tools like wavelet analysis and neural networks. Wavelet denoising application on scanning probe microscopy d a t a is shown to be effective method for processing high resolution images. Neural network processing of data having strong tip-sample interaction artefacts related to sample morphology is presented and it is shown that it can be effectively used in scanning thermal microscopy d a t a analysis.
1. Introduction During last twenty years, scanning probe microscopy (SPM) has became one of the widely used and developed methods for characterising surface structure of solid materials. The basic idea behind all the methods from the SPM family is measuring interaction between small probe and surface material while the probe moves in the plane parallel to the surface. The scanning probe microscopes therefore produce 2D data fields that can be used for many different kinds of mathematical analysis of surface properties. Besides the well known statistical analysis of rough surfaces properties (having origin in profilometry) also new analytical methods are continuously being developed. Within this article two data processing approaches using advanced mathematical and software methods implemented for SPM data analysis are presented. Note that all the algorithms presented in this article were implemented for use within open source software for SPM data analysis Gwyddion * This work is supported by Ministry of Industry and Commerce of Czech Republic within the program TANDEM under contract No. FT-TA/094
311
(http://gwyddion.net) which is developed by initiative of Czech Metrology Institute in a wide international group of volunteers and SPM users. 2. Wavelet denoising In general, the wavelet transform (WT) in one dimension decomposes original signal f(x) as follows: W(a, b) = -$= [°° f(x)r ( ^ 3 . ) dx (1) a V|a| J-oo \ J where parameter a controls scale of the wavelet ip, parameter b controls its shift along the x axis and asterisk denotes the complex conjugate. Using wavelet transform we decompose the original signal into a set of differently shifted and scaled wavelets. Depending on the wavelet orthogonality we can distinguish between the mostly used two types of WT - discrete and continuous wavelet transform 1 . Within atomic force microscopy (AFM) data analysis, we work with both two-dimensional (2D) discrete and continuous wavelet transforms. In this article the denoising of AFM data using 2D DWT will be presented. Atomic force microscope is designed for very large signal to noise ratio while measuring at micrometer scale. However, the noise influence becomes important while measuring very fine structures. Moreover, within some other AFM-related analytical methods (e.g. magnetic force microscopy or near-field scanning optical microscopy) the signal to noise ratio can be much smaller. Thus the denoising procedure needs to be often applied to the AFM data. For data denoising we used 2D DWT in the following manner: (1) We compute direct 2D DWT of the AFM data. (2) We threshold the wavelet coefficients in each scale, i. e. we remove or lower the coefficients that are smaller than predefined threshold value. (3) We compute inverse 2D DWT of the processed wavelets coefficients. The main problem is determination of the threshold value for different scales of the wavelet coefficients. For AFM data processing we use two simple methods — scale adaptive thresholding 2 and scale and space adaptive thresholding 3 . For threshold determination within both the methods we first determine the noise variance guess given by „ _ Median [Fiji ~ 0.6745
a
[
'
312
where Yij corresponds to all the coefficients of the highest scale sub-band of the decomposition (where most of the noise is assumed to be present). Alternatively the noise variance can be obtained in an independent way, for example from the AFM signal variance while not scanning. For each sub-band (for scale adaptive thresholding) or for each pixel neighbourhood within sub-band (for scale and space adaptive thresholding) the variance is the computed as
-V2 = ~ £ Yl
(3)
t»i=i
Threshold value is finally computed as T(arx)=a2/ax,
(4)
where (fx = y max(aV 2 — &2,0). When threshold for given scale is known, we can remove all the coefficients smaller than threshold value (hard thresholding) or we can lower the absolute value of these coefficients by threshold value (soft thresholding).
'
*
s^
,sss
.
*
H*
?y*- _. .*
V
"
-y
S
^ ^
*
s
v
**" *
oc*
*s
$$<™
my: "
•."
*s
—, ^ ^
" ;» t-5
\ V jr£ ' v
B S> ,
^
J?r
-O s s
v-^
-y^\
s
s
' "^
^ ;s
*N
% *"
s
x
&?£&••
':
*^™
air • X i*. &»»*
A V v>
*,
- .-.";v..l , ^Jp ,s ^ *
Figure 1. AFM image of the atomic resolution on mica before (left) and after (right) scale and space adaptive threshold denoising.
In Figure 1 the noisy and denoised AFM images of mica are presented. These images correspond to atomic resolution topography measured in air by commercial AFM (Veeco Explorer). We can see that the periodic structure is enhanced by means of denoising. However, it should be noted that there was no a priori information about the spatial frequencies expected in the data used for denoising. This is the main difference from Fourier
313
transform denoising that is often used for atomic resolution images enhancement. Other important property of the DWT denoising is that it is local and therefore it does not cause unwanted blurring of the data in case that edges or fine structures are present at data.
3. Neural network removal of topography artefacts in thermal imaging Scanning probe microscope can be equipped with many different extensions that enable user to measure simultaneously more different signals (usually topography together with some physical quantity). In this way the values of the physical quantity are usually interpreted as being measured locally in the points corresponding to topography and thus with the approximately same resolution as the topography signal has. In this article we focus on scanning thermal microscopy (measuring simultaneously temperature of thermal conductivity). The interpretation of signals corresponding to different physical quantities is very often complicated by fact that the finite size of SPM tip influences not only the topography itself but also the measured physical quantity, and this influence is often even much stronger that the influence found on topography. This is the reason why most of the interesting applications of the above mentioned SPM modes come from perfectly flat surface and interface systems. However, as the real samples are very often rough there is a need for correcting the topography artefacts from SPM data. In this article the results of topography effect removal from scanning thermal microscopy (SThM) 5 using method developed by Price 6 will be presented. Scanning thermal microscopy is usually based on use of thin platinum wire (having ' V shape) as a SPM probe. Wire resistance is than used for probe temperature calculation. The wire can be also self-heated to enable local thermal conductivity measurements. The topography and thermal conductivity images for solar cell electrode (several nanometers thin chromium plate on rough silicon surface) can be seen in Fig. 2. From Fig. 2 it can be seen that the thermal conductivity image is heavily influenced by topography related artefacts, that are produced by effect of varying local tip-sample junction geometry on different sample positions. This thermal data property heavily influences all the quantitative measurements of local thermal properties of sample surface. As both the tip and sample geometry is unknown (we observe only its convolution by SThM) it is impossible to remove these effects by using any physical model
314
Figure 2. Topography of solar cell electrode (left) raw thermal conductivity image (middle) and processed thermal conductivity image.
of data acquisition. Within this work we have used simple feed-forward neural network trained with back-propagation algorithm for topography influence removal fromthermal data. The network had input layer consisting of 16 height differences corresponding to characterisation of the closest neighbourhood of certain point in the topography image, one hidden layer of typically 15 neurons and output layer representing the modelled thermal output value. The network was trained on several sets of images representing thermally homogeneous materials (with topography artefacts only) to simulate the topographic artefacts well Then after applying the netwerk to unknown data with both topographic and material contrast the topographic artefacts could be separated and eliminated. The result can be seen in Fig. 2.
4. Conclusion In this article two scanning probe microscopy related algorithms using advanced mathematical and software tools were presented. Within the wavelet denoising it is possible to reduce influence of both the high and low frequency noise present in SPM images of high resolution without a priori guess of spatial frequencies that should be present in the image. This approach can therefore prevent to user-induced errors caused by improper selection of the spatial frequencies corresponding to noise within classical Fourier filtering. Neural networks can be used easily for determination and elimination of the tip finite size effects from data representing different physical properties. Using this correction, scanning thermal microscopy can be used for analysis of very fine structures located on rough substrates.
315
References 1. 2. 3. 4. 5.
A. Bultheel, Bull. Belg. Math. Soc. 2, 1 (1995) S. G. Chang, B. Yu, M. VetterH IEEE Trans. Image Processing 9 1532 (2000) S. G. Chang, B. Yu, M. VetterH IEEE Trans. Image Processing 9 1522 (2000) A. Arneodo, N. Decoster, S. G. Roux Eur. Phys. J. B 15 (2000) L. Shi, A. Majumdar, in Applied Scanning Probe Methods, ed. H. Fuchs, S. Hosaba, B. Bhusan, 327 (2003) 6. D. M. Price, "The Development & Applications of Micro-Thermal Analysis & Related Techniques", PhD. thesis, Loughborough University, Loughborough, U.K. (2002)
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 316-319)
ERROR CORRECTION OF A TRIANGULATION VISION MACHINE BY OPTIMIZATION' ALICE MEDA, ALESSANDRO BALSAMO IMGC-CNR,
Istituto di Metrologia "G.Colonnetti", Torino, 10135, Italy
strada delle cacce 73
This paper reports the error model, the identifiability study of its parameters, its software simulation and the experimental validation used to compensate the error of a triangulation vision machine (AVS-Active Vision System) intended for monitoring and diagnosing monuments and other Cultural Heritage. A careful alignment is required but not sufficient to achieve the sought performance. In addition, operation in open air - where temperature fluctuations up to ±15 K are expected - and transportation would require realignment on the spot. As the errors resulting from misalignment are highly repeatable, they are best compensated in software.
1. Introduction The accuracy of most coordinate measuring systems is limited primarily by systematic geometric errors. In our case, this is due mainly to misalignment of the system axes. High performances are achieved only at great cost in alignment time; moreover, the system is portable and works in open air and transportation shocks and thermal excursions -up to ±15 K expected - may spoil the alignment. Since a manual alignment on the spot is impractical, the best way to compensate geometrical errors is by software. In general, the software compensation of a machine needs three steps: 1. the development of a mathematical model x =J[yr, w), where x is the output quantity of the machine - typically the Cartesian coordinates - , y/ the quantity measured along or about the machine axes, and u the model parameters; the above equation may be in implicit form, and/or expressed by simultaneous equations, and is referred to as direct kinematic equation; 2. the identifiability study of the parameters «, both to eliminate redundant ones and to tune the model to the sought accuracy; * This work is supported by the Italian Ministry of Education, University and Research, in the national initiative "Parnaso".
317 3. the design of an experimental plan to derive u, requiring minimum time, usually based on a suitable calibrated artefact, to achieve traceability. This paper describes the software compensation of a contactless triangulation measuring system (AVS, Active Vision System). 2. The AVS: a triangulation vision machine The AVS is a novel instrument developed in a nation-wide project named SIINDA, carried out with the purpose of monitoring, diagnosing and documenting Cultural Heritage by means of an integrated knowledge system. The instrument [1] provides simultaneous measurement of Cartesian and colorimetric coordinates of natural and artificial targets, by acquiring image pairs. This paper focuses on the geometrical capability only. The AVS operating range is 3 m - 8 m with a field of view of about (6x4) m2 at 8 m. Two cameras at known distance 2d aim at a target by independent azimuthal rotations (a, P) and a common elevation, y. The three angles are measured to form an coordinate triplet of the target ( ^ in the direct kinematic equation), converted into the Cartesian coordinates x by triangulation. 3. The instrument model Five straight lines are considered, namely the three rotation axes {a, j3, y) and two camera optical axes (n, a). Each line is modelled with a unit vector a and a localisation point/?, a is parameterized with two direction cosines and/> with the two coordinates of the intersection of the straight line with a coordinate plane. The resulting total number of model parameters, 20 is reduced to 14 by
it'''* i
Figure 1. Nominal instrument model.
318
eliminating the 6 degrees of freedom associated with the reference system. Due to the geometric errors, a target cannot be aimed at exactly by both cameras at the same time, because they share the elevation. The direct kinematic equation is expressed as a set of two simultaneous vectorial equations; the first equation constrains the target to lie on the optical axis ri', the second in the plane defined by the other optical axis & and the rotation axis P (i.e. the elevation of a is irrelevant)13. This way, the equations privilege the n on the a axis; this choice is arbitrary.
(1)
W-/>;)=o
t is the distance of the target to the first camera, b is a unit vector in the xy plane, orthogonal to aa and insensitive to its elevation. The rotated quantities are expressed e.g. as: Pit' = R(ay >r)[R(aa,<*)(Pn
-Pa)
+
Pa\
R(a, 0) are rotation matrices, about the unit vector a, of the angle 0 [2]: R(a, 6) = (1 - cos 0)L20
+sin0-L0+I
where / is the identity matrix and Lg = (a x) the operator of cross product to a. 4. Identifiability of the model parameters The distance between point pairs is taken as a basis to estimate the error parameters, as it is independent of the instrument location and orientation. In its linearised form, it is: |*l - x21 = <*\,2 +e + JuAu
with
/ „ = nf>2 (J, - J2)
whererf1>2is the calibrated value of the distance, e the size error evaluated by the model, «1?2 the unit vector through the point pair, and J\, J2 are the Jacobian matrices of the kinematic equation, evaluated at X\, x2, respectively. The expression of J„ is not trivial and has been derived by intensive use of chain differentiation. The identifiability of the parameters has been investigated both in theory and by numerical simulation: the least-squares problem design and the parameter correlation matrices have been investigated, the former by SVD analysis of the condition number [3], and the latter by searching the maximum b
The prime, e.g. it", denotes a quantity after the rotations (a, P, f).
319 of the absolute values of off-diagonal elements. The result is that some parameters are strictly not identifiable, while others exhibit strong mutual correlations in practical cases, i.e. under the experimental limitations of ranges and artefact size. The number of essential parameters is then reduced to 10. The small loss in model accuracy is well balanced by increased numerical stability and reduction in parameter correlation. 5. Experimental validation A suitable artefact was designed, manufactured and calibrated to validate the model and to achieve traceability. In addition, an experimental plan has been designed with minimum measurement time, to enable in situ execution. For a successful estimation of the model parameters it is sufficient to have 11 positions and orientations in space of the artefact; some extra measurements taken on the same artefact are used to test the error compensation effectiveness. The results (Table 1) show that the compensation improves the size-errors of about one order of magnitude, with residuals close to the noise of the a, p, y rotary tables (40 urad, resulting in 0.2 mm AVS repeatability at 5 m). Table 1. Summary of experimental and simulated results. Experimental: 800 mm size rms/mm
Simulated: raax(8.5x3x3) m3 grid of points
Raw
Compensated
Raw
Compensated
3.00
0.40
20.00
0.14
6. Conclusions A highly non-linear instrument model has been designed, studied, and validated experimentally. The achieved performance with software compensation is deemed satisfactory, as it approaches the system repeatability. The elimination of some nearly non-identifiable parameters from the model has proved effective, as the accuracy loss is limited -especially in the central part of the measuring volume- and the numerical stability has greatly improved. References 1. A. Balsamo et al., "A Portable Stereovision System for Cultural Heritage Monitoring ", under pubblication in Annals of the CIRP 54/2, (2005) 2. Jay P. Fillmore, "A Note on Rotation Matrices", IEEE CG&A 4, (1984) 3. Gene H. Golub, Charles F. Van Loan, "Matrix Computations", The Johns Hopkins University Press (1996)
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 320-324)
ATOMIC CLOCK PREDICTION FOR THE GENERATION OF A TIME SCALE G. PANFILO 1 ' 2 , P. TAVELLA 2 1
Politecnico di Torino, Corso Duca degli Abruzzi 24, 10124 Torino, Italy 2 IEN "Galileo Ferraris", Strada delle Cacce 91, 10135 Torino, Italy
In the generation of a time scale, the prediction of the atomic clock behaviour plays an important role. The international time reference "Universal Coordinated Time" (UTC) is in fact available a posteriori, while the national time scales are realised in real time and they need to be good local approximations of the international UTC. In this case a prediction of the local clock behavior for a period of about 45 days is needed. In this paper two methods to predict are presented; extrapolation from past data and study of the stochastic behavior of the atomic clock error.
1.
Introduction
In time metrology the interest in the prediction [1-4, 6-8] of the atomic clock behaviour is becoming stronger due to new demanding applications as the new European Global Navigation Satellite System Galileo and the international Mutual Recognition Arrangement asking for "equivalent" standards which means a close agreement among national and international time scales. In this paper we have studied the prediction using two different methods: 1) Extrapolation of the past measurement data. 2) Use of stochastic differential equations and stochastic processes to obtain a mathematical model of the behavior of the atomic clocks. The extrapolation (linear or quadratic) from past measurement data is the most common method, widely applied in the metrological institutions. The study of the mathematical clock model requires some mathematical tools as the stochastic differential equations and the stochastic processes. We use it to infer analytically the evaluation of the clock prediction and also its uncertainty. We have applied the studied methods to experimental data from the Italian time scale UTC(IEN) and different atomic clocks as Cesium beams and Hydrogen masers, using as reference the international time scale UTC.
2.
The prediction methods
Two method to predict a time scale are presented:
321 1. A widely applied prediction method is based on the extrapolation from past data. The type of extrapolation (linear or quadratic) depends on the behaviour of the atomic clock. Here we evaluate the clock behaviour on one month of past data and predict it for the following 45 days. Every month the predictions are updated. A posteriori, when the true clock behaviour estimate versus UTC is available, the differences between the true values and the predicted ones are estimated for any 45 day period and also the standard deviation of the prediction error at 45 days is evaluated. We applied this method to Cesium and Hydrogen maser data. The results concerning this method will be presented in the next section. 2. The second method is based on the study of the mathematical clock model. The atomic clock model can be considered as the following system [5]: 2 ' Xl(t)=x0+y0t + dt— + CT1W1{t)+CT2jw2(s)ds 0
^
X2(t) = y0+dt + a2W2(t) where we have two different noises in the phase component Xt: Wi(t) is a phase random walk originated by the White Frequency Noise (WFN) (Brownian motion (or Wiener process) on the phase) and W£t) represents the Random Walk Frequency Noise (RWFN) which leads to on Integrated Brownian process on the phase. The
best
prediction of X\ in tp is: ~t
2
Xi(tp)=Xo+yotp+<J-y
(2)
The estimate of the noises c^ and CJ2 is based on the typical clock stability (Allan Variance), the estimates of x0 ,yo and d depends on the past period T and on the noises. In this case we have considered the E(Wj) = E(W2) = 0. If we apply the law of the propagation uncertainty on the phase deviation X, we obtain: u
where u
the
\ =«L
+U
Stoc ="!„
uncertainty
stoc = u WFN+u RWFN
=(y t +a
on
\ p 2—>
+U t2 j0 p +^UYP
the
+U
WFN +"lwFN
stochastic while
the
part
is
(3)
given
uncertainty
on
by the
deterministic part depends on the estimation method that can be optimized for
322
the different kind of stochastic processes. We have considered T=\ month, tp = 45 days and the uncertainty of the measures u2meassystem = 2 ns . Also in this case we have applied this method to cesium and hydrogen maser data. We thus obtain a new set of predicted values according to (2) and again we can evaluate the "experimental" prediction error at tp = 45 days. Then we estimate the standard expected uncertainty according to (3) and we compare it to the experimental results. 3.
The results
We applied the methods presented to IEN Cesium and Hydrogen maser data. 3.1. Application to Cesium Clock 1) We have considered the difference UTC-UTC(IEN), where UTC(IEN) is a time scale obtained by a single Cesium clock. The test was carried out on around 1.5 years of data. In the Fig. 1 the differences UTC-UTC(IEN) with their predicted values are shown. 1S0 -|
1
-50
54J1/01 (51914)
Daya
30*35432 ^2424)
Figure 1. Predicted (gray lines) and real values (black lines) of UTC-UTC(IEN).
The standard deviation of the prediction errors after 45 days on the whole 1.5 years is 28.5 ns (1 a). We have evaluated also the effect of using a longer observation period T of previous data but the results are not improved. 2) To model the cesium clock signal we consider only the WFN and d = a2 = 0 in (1). From the theory of stochastic process, the best estimation of yo is the difference between the last term and the first term divided by the duration of the interval. The uncertainty in this case is:
323
=
U
2 xQ
+U
2 2 yJp
2
2
+UWFN ~ U'meas.system +
?t TI
2
N 2
_ ? " meas.system
t2p+a2tp
(4)
2
T , yT Additional details about this formula can be found in [7, 9]. We have applied this estimation to an IEN Cesium clock one year data. The experimental standard deviation of the prediction error obtained from this prediction procedure after 45 days is 28 ns (1 o). The theoretical evaluation of (4) leads to on estimate of 32.3 ns (1 o) which is in agreement with the experimental value, apart a slightly pessimistic evaluation. 3.2. Application to HMaser 1) For the HMaser clock the test was performed by using UTC-Hmaser data for a period of about 1 year. The HMaser frequency measures show a frequency drift corrupted by white FM noise. The linear frequency drift is evaluated on one month of past frequency data and predicted for the subsequent 45 days period. Integrating the estimate of the frequency values we obtained the estimate on the time difference UTC-HMaser. The standard deviation after 45 days of prediction on the whole 1 year period is estimated to be 18.8 ns (1 a). 2) The model of the Hydrogen maser is the complete model presented in (1). dW This model can be written as: Xl= X2 + <J\ . dt To estimate the values of parameters appearing in (2) we followed two different estimation procedures: a) First case: we used the least square approach to predict y0 , d and to evaluate the uncertainty. The standard deviation of the prediction error results 18.8 ns (1 o) not so much different from the theoretically expected of 21.5 ns (1 a). In the Fig 2 the prediction error with 1 a and 2 a uncertainties are shown. After 45 days each prediction error is within the uncertainties boundaries (2 a). b) Second case: to remove the effect of the WFN, we filtered the frequency data with a moving average. In this case the model results only a Brownian Motion with a drift on the frequency. The best estimate of the frequency drift is the difference between the last term and the first term divided by the duration of the interval. The best estimation of y0 is the last measured term. Applying this different procedure to experimental data, we obtained the standard deviation of the prediction error after 45 days equal to 7 ns (1 a). From the theoretical analysis we obtain the uncertainty equal to 17.6 ns (1 a). The second estimation procedure seems more powerful in the identification of the deterministic parameters but the theoretically evaluated uncertainty is overestimated. This estimation needs further development
324
Figure 2. Prediction error (gray lines), 1
4.
and 2 a (blank lines)
Conclusion
We have presented two methods to predict a time scale: the extrapolation from the past data and the study of the model of the atomic clock error. We have applied these methods to experimental data from the Italian time scale UTC(IEN) (cesium clock) and Hydrogen masers. We have shown that the model describes the variability of the prediction error. Further analyses are in progress to estimate the uncertainty applied to model of the hydrogen maser clock. References 1. A. Bauch, A. Sen Gupta, E. Staliunene. 19th European Frequency and Time Forum (EFTF). Besancon (France) (2005). 2. L-G. Bernier. 19th European Frequency and Time Forum. Besancon (2005). 3. G. Busca, Q. Wang. Metrologia Vol. 40 265 (2003). 4. J.A. Davis, P.M. Harris, M.G. Cox and P. Wimberley. IEEE Intern. Frequency Control Symposium (FCS) and 17'h EFTF, Tampa (Florida, USA), (2003). 5. L. Galleani, L. Sacerdote, P. Tavella, C. Zucca. Metrologia Vol. 40 257 (2003). 6. C. Greenhall. 17th Annual Conference on Formal Power Series and Algebraic Combinatorics (FPSAC '05), Taormina (Italy) (2005). 7. G.Panfilo, P.Tavella. IEEE Intern. FCS and 17lh EFTF, Tampa (USA) (2003). 8. F. Vernotte, J. Delporte, M. Brunet, T. Tournier. Metrologia Vol. 38 325 (2001). 9. F. Cordara, R. Costa, L. Lorini, D. Orgiazzi, V. Pettiti, I. Sesia, P. Tavella, P. Elia, M. Mascarello, M. Falcone, J. Hahn. Proc. of 36'h Annual Precise Time and Time Interval (PTTI), Washington (USA), (2004).
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 325-329)
SOME PROBLEMS CONCERNING THE ESTIMATE OF THE DEGREE OF EQUIVALENCE IN MRA KEY COMPARISONS AND OF ITS UNCERTAINTY FRANCO PAVESE CAT?, Istituto di Metrologia "G.Colonnetti" (1MGC), strada delle Cacce 73 Torino, 10135 Torino, Italy The meaning of the degree of equivalence of each laboratory mainly depends on the meaning of the key comparison reference value, and this, in turn, depends on the type of key comparison and on the kind of outcomes that each specific key comparison provides. Commonly, the proposed statistical approaches for the treatment of data of MRA exercises do not account for most of these particularities. This paper shortly addresses these problems and the best estimate of the value of the degree of equivalence for certain types of key comparisons and of the uncertainty associated with these values.
1. Introduction The MRA [1] introduced the need for the metrological Institutes to participate in specific exercises, the key comparisons, in order to gain the so-called "international equivalence", defined in the MRA document. This document and the accompanying technical documents do not provide specific guidance or further specification concerning individual types of standards, except for allowing, in exceptional cases, to avoid the definition of the so-called "key comparison reference value" (KCRV). On the other hand, the MRA states that a degree of equivalence (DoE) can always be defined for a laboratory, irrespective to the size of its value and of its associated uncertainty (i.e., to the "quality" of its results), not allowing for rejection of any key comparison data, once formalized by the participants. Only for the purpose of the associated definition of "calibration measurement capabilities" (CMC), a concept of "significant unresolved deviations" related to these degrees of equivalence is introduced, no better specified -today still object of speculations, largely un-resolved. Since the results of the carried-out key comparisons showed a variety of problems arising from apparently inconsistent data or non-Gaussian output probability distribution functions (pdfs), the best that was so far made available in the literature was a variety of tools to check data consistency (see, e.g., [2,3]),
326 for the purpose of screening the data to be considered for the computation of the KCRV. However, the cases of non-consistent data remain essentially helpless. In an effort to find a way to provide a more general solution, valid also for these cases, the author has already proposed the need to consider at least two different classes of standards, i.e., of key comparisons involving them [4-6], summarized here in Section 3. The statistical tools to be used should be adapted to the different cases. In this short paper, a step forward is attempted, by starting from defining the very purpose -and, hence, the meaning- of the key comparison, on which the possibility to define an internationally recognised DoE is based - a purpose and meaning that are not necessarily shared today by all metrologists. Based in part on previous work, some consequences on the usefulness of the KCRV and on the meaning of the DoE and of the uncertainty associated to its value are summarized, arising from the defined meaning of key comparison and from differentiating the key comparisons in classes.
2. Meaning of key comparison A key comparison (KC) should not be seen as an extra-ordinary exercise, but, instead, as a scientific exercise establishing a quantitative measure of the degree of equivalence (DoE) of a given standard (unilateral DoE), as currently maintained in each National Metrology Institute (NMI) (called in the following "local standard") under its normal -though generally the best- working conditions and using the methods currently available at that NMI. The consequence is that the supplied value of the local standard is a representative value of the local population. It is neither a summary statistics of merely the (few) specific measurements performed during the KC, nor the expectation value of the particular pdf inferred from these occasional measurements. Rather, it is assumed to be consistent with the expectation value of the currently maintained local standard, i.e., of the local population of samples normally drawn from that standard, i.e., under routine conditions -not a "special" value nor specific for that KC. The uncertainty associated to the value of the local standard is related to the local capability of realising the given standard. It is neither the uncertainty associated with the values resulting from the (few) specific measurements performed during the KC, nor the second moment of the occasional pdf inferred from these specific measurements. It is evaluated from the local standard as
327
currently maintained, i.e., from the local population of samples normally drawn from that standard (i.e., from the local pdf -incidentally, not necessarily described by only two moments). In other words, unless a local standard is occasionally in a non-"normal" condition, undetectable by the NMI, the data specifically obtained for a KC (generally very limited in number a ) are assumed to measure the state-of-the-art local standards (typical values). Significant differences between NMIs can arise only from undetected (or under-estimated) systematic effects or from deficiencies of the transfer (KC travelling) standard. Without this initial assumption, the unilateral DoE of each NMI would not be possible. In fact, this is the rational why a KC can internationally be taken as a measure of the current "quality" of the NMI values and of the level of the uncertainties associated to the values of their standards. The above-indicated meaning of the data provided as the input to a KC by each NMI, and consequently of their purpose, has not been so far clearly identified in the current literature on the subject, and probably is not even taken as granted. Possibly because of that, suitable means have not been investigated to separate the contribution to the unilateral degree of equivalence of the systematic effects between local standards from that arising from the limitations attached to the transfer (travelling) standard(s). 3. Classes of key comparisons In addition to the above issue, the current literature is proposing means for treating KC data without making any distinction between possible conceptual differences between types of standards involved. It has been shown in several author's contributions and some recent additional work [4-7], that at least two basic classes, or types, of standards should be identified: one (Class 1) based on artifacts, so that a KC involves the comparison of several measurands; a second (Class 2) based on realizations of a single physical state or involving a single transfer standard, so that the corresponding KC involves only a single common measurand. The statistical tools used for the data analysis should consequently be adapted to the specific case.
ª A minimum number is selected, just enough to test this assumption.
4. Degree of equivalence and the KCRV
Obviously, the resulting unilateral DoE is affected by the above considerations. In addition, its meaning (not that of the bilateral DoE between two NMIs) is affected by that of the KCRV. One of the consequences is that, since the NMIs participating in a KC do not necessarily represent the population of the NMIs, but only a de facto group, the KCRV is not an unbiased estimate of the "best approximation" of the value of the SI unit, and, therefore, the DoE relative to the KCRV is biased too, irrespective of the way it is defined. Another consequence is that the KCRV in a Class 1 KC represents a summary value of the given group of artefact standards measured in the KC, while in a Class 2 KC it is the expectation value of the pdf representing the group of physical-state realisations, or of the determinations of the value of a single transfer standard. While for the first class the use of the current statistical tools is suitable, for the second class the use of a KC output pdf, called a "pooled distribution" [8] or "mixture distribution" [9], has been proposed as the correct approach.
The meaning of the uncertainty associated with the value of the unilateral DoE depends heavily on the process leading to the DoE definition. In many KCs, the definition of a KCRV is an ill-posed problem. In the case shown in [10], where a translational NMI-related term is added to the data model (i.e., a "fixed effect" model [11, 6] is used), "unilateral DoEs will be influenced by the choice of resolving condition", so that the value of the resolving constraint can be interpreted as a KCRV, and "different choices of KCRV yield unilateral DoEs whose values differ collectively by a constant". However, unless a definite choice can be obtained from prior knowledge, the arbitrariness of the choice of the constraint value will affect the KCRV, hence the unilateral DoE. Due to this arbitrariness, for Class 2 standards and KCs, the meaning of the KCRV can often be reduced to a mere consensus value, of no statistical relevance, hence with no associated uncertainty. Clearly, when the statistical meaning of the KCRV is lost, an associated uncertainty also becomes meaningless. This is a frequent case, e.g., in temperature KCs, where the pooled output statistical distributions are mono-modal in less than 10% of the cases: the meaning of the KCRV is merely the expectation value of the pooled distribution, a value that it is always possible to define, but with a non-trivial statistical meaning, if any. Alternatively, the definition of a KCRV can be omitted altogether and only bilateral DoEs defined.
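As a rough illustration of the "pooled distribution" [8] / "mixture distribution" [9] idea for a Class 2 comparison, the following Python sketch (not taken from those references; the NMI values, uncertainties and equal weights are invented) forms the mixture of the Gaussian pdfs attributed to the participants, takes its expectation as one candidate KCRV, and counts the modes of the pooled density.

```python
import numpy as np

# Hypothetical Class 2 KC input: value and standard uncertainty reported by each NMI
values = np.array([0.12, -0.05, 0.30, 0.02])   # deviations from nominal, arbitrary units
uncs   = np.array([0.10,  0.08, 0.15, 0.12])   # standard uncertainties
weights = np.full(len(values), 1.0 / len(values))  # equal weights, one choice among many

x = np.linspace(values.min() - 4 * uncs.max(), values.max() + 4 * uncs.max(), 2001)

def gauss(x, mu, sigma):
    """Gaussian pdf assigned to one NMI."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Pooled ("mixture") density: weighted sum of the component densities
pooled = sum(w * gauss(x, mu, s) for w, mu, s in zip(weights, values, uncs))

# Expectation of the mixture: one candidate KCRV (here simply the weighted mean)
kcrv = np.sum(weights * values)

# Count local maxima of the pooled density to see whether it is mono-modal
interior = (pooled[1:-1] > pooled[:-2]) & (pooled[1:-1] > pooled[2:])
n_modes = int(np.count_nonzero(interior))

print(f"candidate KCRV (mixture expectation): {kcrv:.3f}")
print(f"number of modes of the pooled distribution: {n_modes}")
```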
5. Conclusions
The main purpose of the MRA is to assess the DoEs, and one should carefully avoid these values being dependent on mere features of the KC procedures or outcomes. In this sense, some risks may arise from inadequacies of the travelling standards (e.g., drift with time), but other risks may arise from inadequate KC protocols for collecting the input data or for processing these data to obtain the DoEs. The bilateral DoEs are very robust against the latter risks, while for the unilateral DoEs the risk is very high, since they depend on the computation of a KCRV, and the latter is always a biased estimate of the value of the SI unit, since any KC is based on a group of NMIs whose statistical "representativeness" is uncertain. In addition, for the class of KCs of artefacts, the KCRV is basically the value of a "virtual standard", chosen to represent the tested group of artefacts, the uncertainty of which is computed according to GUM indications. For the class of KCs with a single measurand, the KCRV problem is, instead, to define this value, if provided, as an expectation value according to the knowledge gained from the KC: this value can be obtained either from the usual definition of the "expectation value" of a probability distribution, or as a mere consensus value. In either case, it seems difficult, and even questionable, to estimate an uncertainty of reasonable statistical meaning, useful for the MRA purposes, to associate with this value.
References
1. BIPM, MRA, Technical Report, Sevres, France, 1999.
2. M. G. Cox, Metrologia 39, 589 (2002).
3. A. G. Steele, K. D. Hill and R. J. Douglas, Metrologia 39, 269 (2002).
4. P. Ciarlini and F. Pavese, Izmeritelnaya Tekhnika N.4, 68 (in Russian, 2003); Measurement Techniques N.4, 421 (in English, 2003).
5. F. Pavese and P. Ciarlini, PTB Berichte, PTB-IT-10, 63 (2003).
6. F. Pavese, in Advanced Mathematical and Computational Tools in Metrology VI, World Scientific, Singapore, 240 (2004).
7. R. N. Kacker, R. U. Datla and A. Parr, Metrologia 41, 340 (2004).
8. P. Ciarlini, F. Pavese and G. Regoliosi, in Algorithms for Approximation IV (edited by I. J. Anderson, J. Levesley and J. C. Mason), University of Huddersfield, Huddersfield, UK, 138 (2002).
9. A. G. Steele, K. D. Hill and R. J. Douglas, Metrologia 39, 269 (2002).
10. M. G. Cox, P. M. Harris and E. Woolliams, This book, 23 (2006).
11. F. Pavese and P. Ciarlini, Metrologia 27, 145 (1990).
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 330-334)
VALIDATION OF WEB APPLICATION FOR TESTING OF TEMPERATURE SOFTWARE
ALEKSANDER PREMUS, TANASKO TASIC, UROS PALMIN
Laboratory for Metrology Information Technology, Metrology Institute of the Republic of Slovenia, Grudnovo nabrezje 17, 1000 Ljubljana, Slovenia
JOVAN BOJKOVSKI
University of Ljubljana, Faculty of Electrical Engineering, Laboratory for Metrology and Quality, Trzaska cesta 25, 1000 Ljubljana, Slovenia
In this contribution the validation of web-based reference software based on standardised thermocouple calculations is described. In this case it was performed by a modified reference-software method, using three different programming languages. Values of thermoelectric voltage and temperature from the ITS-90 were used for the validation as well. Some observations motivated us to perform an additional validation of the different realisations of the reference modules used for comparison.
1. Introduction
The validated reference software (in the following: SUT) calculates the thermoelectric voltage from a given temperature, or calculates the temperature from a given thermoelectric voltage, for a selected thermocouple type. The SUT is based on the Linux operating system with an Apache server and MySQL. Access and calculations are realized with PHP and PERL scripts. If one claims that an application is the reference software for testing other temperature measuring applications, it should be appropriately validated. For the purpose of comparison, the algorithms defined in the standard [5] were programmed in three different programming languages: National Instruments LabVIEW™ 6.0, MathWorks MATLAB™ 7.0 and Microsoft Excel 2003™. Test cases were realised by a data generator covering the entire range of values for the input variables with a step of 1 °C.
2. Validation of the reference software
For the validation of the SUT, appropriate reference modules were programmed according to the equations from the standard [5]. Test cases were formed with data sets covering the whole temperature range for the particular type of thermocouple.
The same test cases were used for all realisations of the reference software for validation. The comparison of all realisations is presented in figures 2 and 3. The standard [5] gives the result for the thermoelectric voltage calculation with 3 decimal places, whereas the reference modules calculate the polynomial equations with 12 decimal places. In order to prove the inverse calculation and to avoid comparison with numbers with fewer significant figures, the results of t90 → E were fed as input into the polynomials for the calculation of E → t90. At the end the original t90 and t90(E(t90)) were compared. Results are presented in figure 3. The expected results should be the same as the data that was entered in the first calculation. The difference is the cumulative error of both calculations.
Figure 1. The validation process (temperature range 0 °C to 1820 °C).
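The round-trip check t90 → E → t90 described above can be sketched as follows; this is an illustrative Python version written for this text, not one of the LabVIEW/MATLAB/Excel/PERL implementations, and the polynomial coefficients are placeholders rather than the IEC 60584 values.

```python
import numpy as np

# Placeholder coefficients: a linear "thermocouple" whose inverse is exact.
# The real IEC 60584 polynomials are of much higher order, with separately
# published inverse coefficients, so their round-trip error is small but non-zero.
DIRECT_COEFFS  = [0.0, 4.0e-2]   # hypothetical: E [mV] = 0.04 * t90 [deg C]
INVERSE_COEFFS = [0.0, 2.5e+1]   # hypothetical: t90 [deg C] = 25 * E [mV]

def emf_from_t90(t90):
    """Direct calculation t90 -> E (polynomial evaluated in double precision)."""
    return sum(c * t90 ** i for i, c in enumerate(DIRECT_COEFFS))

def t90_from_emf(emf):
    """Inverse calculation E -> t90."""
    return sum(d * emf ** i for i, d in enumerate(INVERSE_COEFFS))

# Data generator: cover a (hypothetical) range with a 1 deg C step, as in the validation
t_grid = np.arange(0.0, 400.0 + 1.0, 1.0)
round_trip = np.array([t90_from_emf(emf_from_t90(t)) for t in t_grid])

# Cumulative error of both calculations, to be compared against an acceptance limit
errors = round_trip - t_grid
print(f"max |t90(E(t90)) - t90| = {np.max(np.abs(errors)):.3e} deg C")
```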
The LabVIEW modules used were the original modules for the thermoelectric voltage and inverse temperature calculations, used in a shell that entered the input values for the test cases and recorded them in a Microsoft Excel table. In the case of MATLAB, Excel and PERL, the modules were programmed according to [5]. In all cases double-precision floating-point variables were used. For the analysis of the results Microsoft Excel was used.
3. Analysis of Results
First, a comparison of the results of the thermoelectric voltage calculations from all 4 sources and the SUT was performed, and all results fitted into the margins given by the standard [5], as shown in figure 2. MATLAB was chosen as the most suitable program for this validation and its results were taken as a reference. In the next comparison all results were compared to the results given by MATLAB.
Figure 2. Difference between the results of the calculations and the ITS-90 voltage table.
Figure 3. Difference between the results of the inverse calculations.
As shown in figure 3, the difference between the results of the calculations was significantly lower than the accuracy of the values in the table [5] (5·10⁻⁴ mV). The exception was the LabVIEW calculation, which differed by 10⁻⁶ mV from the other calculations.
Figure 4. Difference between the results of the inverse calculations and the initial set of data (SUT, LabVIEW, MATLAB, Excel, PERL).
For the inverse calculation, all results were inside the error margins (figure 4) given by the standard [5].
Figure 5. Difference between the results of the calculations.
4. Conclusions
Reference web applications may be very suitable for the validation of standardised metrological software. In this particular case the main goal was to check the numerical accuracy of the SUT. The results were inside the error margins given by the standard [5] and differed from the results of Excel and MATLAB by ±10⁻¹² V and ±10⁻¹⁰ °C. The results from the LabVIEW modules, on the other hand, differed by ±10⁻⁶ °C from the results given by the other applications. It appeared that these differences were caused by a scaling (×1000 or ÷1000) within the original LabVIEW modules and by several changes of variable type from double-precision floating point to single-precision floating point and back. In the case of the presented results for the B-type thermocouple there was also an error in the F₀ coefficient (a figure of weight 10⁻⁶ is missing). After the adjustment of the original LabVIEW module the results fitted within ±10⁻¹² V and ±10⁻¹⁰ °C compared to the results from the other applications.
5. Ideas for the future
As shown in this study, there is a lack of suitable V&V software for the support of the development of temperature software. The SUT is accurate in the whole range of the calculations defined by the coefficients from the standard [5]. It calculates the correct result when the input data is inside the expected range for the selected thermocouple type and gives a clear indication when the inserted data is outside
the defined range. This validation proved that this easy-to-use temperature software may be used as reference software for temperature and inverse calculations. Further development in this area will bring an application for the verification of curve-fitting polynomials.
References
1. ITS-90, The Internet resource for the International Temperature Scale of 1990, http://srdata.nist.gov/its90/main/its90_main_page.html.
2. NPL Report CMSC 04/00 (Cox, M. G. and Harris, P. M.), Guidelines To Help Users Select And Use Software For Their Metrology Applications, September 2000.
3. Tasic, T., Bojkovski, J., Intercomparison Of Metrological Software Modules - Can It Be Useful For Metrological Laboratories?, Joint BIPM-NPL Workshop on the Impact of Information Technology in Metrology, Teddington, UK, September 17, 2002.
4. Palmin, U., Tasic, T., A database for reference datasets suitable for the validation of standardised metrology SW modules - open source solution, Workshop "From Data Acquisition to Data Processing and Retrieval", Ljubljana, Slovenia, September 13-15, 2004.
5. International standard IEC 60584:1995, Thermocouples.
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 335-339)
MEASUREMENT UNCERTAINTY EVALUATION USING MONTE CARLO SIMULATION: APPLICATIONS WITH LOGARITHMIC FUNCTIONS
JOAO A. SOUSA¹, ALVARO S. RIBEIRO², CARLOS O. COSTA², MANUEL P. CASTRO²
¹ Madeira Regional Laboratory of Civil Engineering, 9000-264 Funchal, Portugal
² National Laboratory of Civil Engineering, 1700-066 Lisbon, Portugal
The Guide to the Expression of Uncertainty in Measurement (GUM) treats mainly linear models. Monte Carlo simulation (MCS) allows an alternative approach without limitations on the model. It provides richer information about the measurement model and can serve as an important validation tool. This study focuses on uncertainties associated with level quantities that occur in areas such as electromagnetics, electricity or acoustics. Here logarithmic transformations are used and the standard GUM results may be inappropriate.
1. Introduction
The use of MCS to study and evaluate measurement uncertainties has generally been restricted to a few National Measurement Institutes. The current perception is, therefore, that MCS is particularly suitable for uncommon problems where the conventional "simpler" methodology stated in the GUM [1] might not be appropriate. The reality shows, however, that significant problems related to the application of the GUM may arise in measurement models based on functional relationships with few input quantities (fewer than 4) and non-linear behaviour, which are very common in different areas of metrology. Obviously, any departure from the ideal conditions on which the GUM was developed will only lead to an approximate solution [2]. The advantages of MCS can be illustrated in situations where the relational functions are more complex, such as those involving logarithmic functions of rectangular distributions, as exemplified in this study with a problem in the field of acoustics.
2. Merits and shortcomings of GUM
The problem of using the GUM lies not in its formulation but rather in the interpretation most laboratories use, leading to an inadequate implementation for the task in hand. A few questions have been raised recently concerning the
adequacy of this approach to evaluate and express measurement uncertainties [3], especially in situations where asymmetric distributions, non-linear calibration curves and other particular characteristics exist and whose influence on the results cannot be overlooked. From our experience, two of the more important aspects to consider when deciding on the adequacy of the modelling technique are the number of input quantities to model and, among them, whether there is an input quantity that is dominant [4].
3. MCS - An alternative approach
Monte Carlo simulation is a numerical technique that is able to propagate the probability density functions (pdfs) of the input quantities of a measurement model, instead of just their variances, so that an estimate of the pdf of the output quantity is provided rather than just a single statistical parameter such as the output standard deviation. It is a numerical (alternative) approach to the propagation of uncertainties having several advantages, mentioned earlier, when compared to the corresponding (conventional) analytical procedure. Another important advantage lies in the applicability of MCS as a validation tool. There are specific problems inherent to this approach requiring proper consideration, namely numerical stability, convergence and numerical accuracy, which are in turn related to the optimum number of Monte Carlo trials to be taken in each application. In general the number of trials is not known a priori, and proper analysis is needed to decide when the procedure has delivered the required target accuracy. In all cases, it is required to check the accuracy of the results that define the interval limits (the 2.5 and 97.5 percentiles, considering a confidence interval of 95%), using the number of significant digits and evaluating the percentiles' own 95% confidence intervals [2].
4. Applications to a logarithmic function of a rectangular distribution
Decibel scales have been specialised for use in the fields of acoustics, electricity and electromagnetism, differing from each other by the chosen reference value and the units employed. The decibel is a unit to express values of logarithmic quantities, and all measurements given in decibels are statements either of a ratio between a measurement (a) and a reference value (b) or between two measurements (a/b).
Where (often) the ratio being examined is a squared ratio (as in inputs to power calculations), this equation takes a slightly different form:

L_p = 10 lg(p²/p₀²) dB = 20 lg(p/p₀) dB  ⇔  p = p₀ · 10^(L_p/20)    (1)

The power in a sound wave generally goes with the square of the peak pressure of the sound wave. Sound pressure level measurements are generally based on a reference pressure of 20 µPa or 2.0·10⁻⁵ Pa [5,6]. In acoustics, the measurement of linear variables and logarithmic variables, handling nonlinear changes of variable, makes the use of the propagation of uncertainties adequate only when the uncertainties are sufficiently small, due to the asymmetry entailed by the nonlinear transformation. Considering the example of a logarithmic transformation with a uniform pdf between a = 1 and b = 3, the asymmetric effect produced by the logarithmic transformation is quite apparent. The application of the GUM to this model is not likely to give trustworthy results since in its formulation a Gaussian output pdf, hence symmetric, is assumed.
" rr: •• : »fj! | | I I n f l + l - i - >'*r i:
Ii
a
-
"
m
iWi b
"
log(a)
•
'
•
log(b)
Figure l. The effect of a logarithmic transformation as applied to a uniform pdf between a = l and 6 = 3.
This study considers three levels of uncertainty of sound pressure level (SPL), from which the corresponding variation of pressure was estimated, using the equation SPL = 20 log₁₀(p₂/p₁) with a reference value of 2.0·10⁻⁵ Pa. The results are presented in Table 1.

Table 1. Baseline values for the performed simulations
SPL [dB]    p2 [Pa]
50 ± 1      [0.005637, 0.007096]
50 ± 5      [0.003557, 0.01125]
50 ± 10     [0.002, 0.02]
The uniform pdf was assumed to characterise the variation of pressure within the boundaries of each interval. The MCS procedure began with 60 000 trials applied to the mathematical model stated above for the SPL. The results of the
numerical simulation are expressed in Fig. 2 and 3 below, where the importance of the size of the input uncertainties is immediately striking.
Figure 2. Output pdf of the model 20 log₁₀(p₂/2·10⁻⁵) where p₂ is uniform in [0.005637, 0.007096].
Figure 3. Output pdf of the same model where p₂ is uniform in [0.002, 0.02].
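A minimal Monte Carlo sketch of the simulations behind these figures and Table 2, assuming the model SPL = 20 log₁₀(p₂/p₀) with p₀ = 2.0·10⁻⁵ Pa, a uniform pdf for p₂ over the Table 1 intervals and 60 000 trials; the implementation details (random generator, percentile estimator) are ours, not the authors'.

```python
import numpy as np

P0 = 2.0e-5          # reference pressure in Pa
M = 60_000           # number of Monte Carlo trials used at the first stage

# The three baseline intervals for p2 [Pa] given in Table 1
cases = {
    "50 +/- 1 dB":  (0.005637, 0.007096),
    "50 +/- 5 dB":  (0.003557, 0.01125),
    "50 +/- 10 dB": (0.002,    0.02),
}

rng = np.random.default_rng(1)

for label, (a, b) in cases.items():
    p2 = rng.uniform(a, b, M)            # uniform pdf assigned to the pressure
    spl = 20.0 * np.log10(p2 / P0)       # propagate the full distribution, not just variances
    mean = spl.mean()
    lo, hi = np.percentile(spl, [2.5, 97.5])   # 95 % coverage interval of the output pdf
    print(f"{label}: mean = {mean:.3f} dB, 95% interval = [{lo:.3f}, {hi:.3f}] dB")
```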
As the input uncertainties increase, the departure from the uniform pdf entailed by the logarithmic transformation becomes more and more evident. For these three examples, the results in Table 2 indicate an increasing asymmetry towards the right, illustrated by the differences between the percentiles and the mean values, which almost double for the third interval. As a consequence, if a comparison is made between the results from MCS and GUM there will be reasonable agreement in the first case, but in the other two the results can be as different as 8.2% and 28.5%, respectively, in terms of the measurand estimate.

Table 2. MCS for three different levels of uncertainty in acoustics.
SPL [dB]    SPL_mean [dB]    U_95% [dB]
50 ± 1      50.037           [49.056, 50.954]
50 ± 5      50.925           [45.447, 54.855]
50 ± 10     53.553           [41.757, 59.809]
The results above were validated by a comparison with values obtained from a purely analytical deduction. These are displayed in Table 3.

Table 3. Theoretical results for the same examples.
SPL [dB]    SPL_mean [dB]    U_95% [dB]
50 ± 1      50.038           [49.056, 50.955]
50 ± 5      50.940           [45.457, 54.850]
50 ± 10     53.536           [41.763, 59.802]
Numerical accuracy, however, needs to be checked against the required accuracy. For this purpose, it is necessary to determine the end points of the coverage intervals for the 2.5 and 97.5 percentiles. The number of trials M is sufficiently large when both intervals are less than or equal to 0.5·10⁻q, where q stands for the number of digits after the decimal point in the uncertainty of the measurand, after its value has been rounded to two significant figures.
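One common way to attach an approximate 95% confidence interval to an estimated percentile, and hence to apply the stopping rule just described, uses the order statistics of the Monte Carlo sample together with the normal approximation to the binomial; the sketch below is our reading of such a check, not the authors' code, and reports the interval widths for comparison with 0.5·10⁻q.

```python
import numpy as np

def quantile_ci_width(sample, p, z=1.96):
    """Width of an approximate 95 % confidence interval for the p-quantile,
    using order statistics and the normal approximation to the binomial."""
    x = np.sort(sample)
    m = len(x)
    half = z * np.sqrt(m * p * (1.0 - p))
    r = max(int(np.floor(m * p - half)) - 1, 0)        # lower order statistic (0-based)
    s = min(int(np.ceil(m * p + half)) - 1, m - 1)     # upper order statistic (0-based)
    return x[s] - x[r]

rng = np.random.default_rng(2)
P0 = 2.0e-5
M = 100_000                                 # trial count examined in Table 4
p2 = rng.uniform(0.002, 0.02, M)            # the (50 +/- 10) dB case
spl = 20.0 * np.log10(p2 / P0)

for p in (0.025, 0.975):
    width = quantile_ci_width(spl, p)
    print(f"{p:.3f} percentile: CI width = {width:.4f} dB (limit 0.05 dB for this case)")
```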
From the results included in Table 4 below, we realise that increasing the number of trials to 100 000 eliminates the first convergence problem, since both percentile intervals could be reduced below 0.005 as required by the expression 0.5·10⁻². On the other hand, the coverage interval of the 2.5 percentile for the third tested case of (50 ± 10) dB is far from the limit of 0.05 imposed by its uncertainty of 5.1 dB. In this last case, the problem could only be overcome after a simulation with 750 000 trials.

Table 4. Numerical accuracy for an MCS using M = 100 000 trials.
Tested case     Measurand y    Std. uncertainty u(y)    Coverage interval    2.5 pctil. cov. int.    97.5 pctil. cov. int.
(50 ± 1) dB     50.04          0.58                     [49.06, 50.95]       0.0042                  0.0037
(50 ± 5) dB     50.93          2.8                      [45.44, 54.85]       0.0365                  0.0125
(50 ± 10) dB    53.54          5.1                      [41.74, 59.81]       0.1202                  0.0136
5. Conclusions
The results obtained in this study confirm that nonlinear changes of variables, such as the logarithmic transformation, can produce inaccurate results if handled by models developed for linear applications, and that these results depend on the size of the input uncertainties. The error increases with the increment of the latter, as expected. Perhaps more importantly, the GUM overpredicts the standard uncertainty, and thus additional terms in the equation to calculate the output uncertainty will only exaggerate the error. The terms of the second-order derivatives outweigh the values of the first derivatives as the input uncertainties increase. In practice this can lead to a greater rate of equipment rejection, which can be very costly for industry. Therefore, a method such as MCS is particularly adequate in circumstances where the conventional approach is outside its scope of applicability.
References
1. ISO Guide to the Expression of Uncertainty in Measurement, 2nd Ed., 1995.
2. M. G. Cox, M. P. Dainton, P. M. Harris, SSfM Best Practice Guide No. 6, NPL, Teddington, UK, 2001.
3. M. G. Cox, P. M. Harris, GUM Supplements, CIE Expert Symp. on Uncertainty Evaluation, Austria, 2001.
4. A. S. Ribeiro, J. A. Sousa, M. P. Castro, 10th IMEKO TC7 Advances on Measurement Sciences, St. Petersburg, Russia, 2004.
5. ASQ, The Metrology Handbook, ASQ Quality Press, USA, 2004.
6. L. Kinsler, A. Frey, A. Coppens, J. Sanders, Fundamentals of Acoustics, J. Wiley & Sons, 3rd Ed., USA, 2004.
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 340-343)
REALISATION OF A PROCESS OF REAL-TIME QUALITY CONTROL OF CALIBRATIONS BY MEANS OF THE STATISTICAL VIRTUAL STANDARD
VICTOR I. STRUNOV
Metrology Department, SOOO 'AJAX', Gikalo str., 5, Minsk, 220071, Republic of Belarus
A method for the quality control of the process of calibration of measuring instruments is presented. The essential feature of the method is the statistical processing of the results of calibrations, which are considered as random quantities. The method allows the quality of the calibration process to be estimated whether the analysed time series is stationary or non-stationary. Application of the method does not require any significant financial input and does not need a change in the usual method of operation of calibration laboratories. The main principles of the method and some practical results obtained are briefly discussed.
1. Basis of the method
1.1. Simplified model of the process of calibrations
Usually, as the result of a calibration, the error of a measuring instrument is obtained. In the simplest case, a standard and a comparator are used for the calibration of ordinary instruments. During the process of calibration some value of a physical quantity is measured by means of the ordinary instrument and the standard simultaneously. The error of the ordinary instrument is determined as the difference between the results of such measurements, with the assumption that the true value of the standard's error is equal to zero or precisely known. But this is only an assumption and in our case it cannot be accepted. Actually, the true level of the error of any measuring standard is not precisely known and, moreover, changes with time. It is also necessary to take into account the action of various influence quantities Z, which are not the objects of the measurements (temperature, humidity, atmospheric pressure, supply voltage etc.), but which are able to distort the result of a calibration. In practice, during some period of time the same measuring standard is used for the calibration of a series of identical ordinary measuring instruments. Further, we will consider the process of calibrations as the operation of a measuring system. The simplified model of the measuring system is shown in Figure 1.
Figure 1. Measuring system (simplified model): the working standard and the ordinary instrument, each subject to influence quantities Z, produce the output X_Syst(t_i) = X_Ord(t_i) − X_Std(t_i).
The output signal of the measuring system at any moment of time, X_Syst(t_i), is the difference between the value of the ordinary instrument's output signal X_Ord(t_i) and the standard's output signal X_Std(t_i). The value of this difference is treated as the systematic error of the ordinary measuring instrument and is recorded in the calibration report. Actually, the result which has been obtained (like any result of any measurement) is a random quantity. Thus, the ordinary instruments' errors which have been determined during the analysed period of time represent a time series, or a realisation of a random process.
1.2. Data processing of stationary* time series
As the result of various physical and chemical processes which act "inside" the standard, some drift of the true value of its error takes place with time. The equation for the measuring system's output signal can be given by the formula:

X_Syst(1)(t_i) = ∫₀^{t_i} k*_Ord(t_i − τ) Δ_Ord(τ) dτ − ∫₀^{t_i} k*_Std(t_i − τ) Δ_Std(τ) dτ = Δ_Ord(1)(t_i) − Δ_Std(1)(t_i)    (1)

where Δ_Ord and Δ_Std are the true values of the errors of the ordinary instrument and the standard, k*_Ord(t_i − τ) and k*_Std(t_i − τ) are weighting functions, and τ is a time parameter. In this case we can speak about the first type of the measuring system's excitation.
* Some time series can be considered as stationary if, by means of testing the null hypothesis of stationarity (statistical independence, randomness) against the alternative, the result "null hypothesis is not rejected" has been obtained.
The measuring standard and the ordinary instrument operate under conditions characterised by the influence of different external quantities Z, so the changes in the measuring system's output signal can be considered as the second type of the measuring system's excitation:

X_Syst(2)(t_i) = Σ_{j=1}^{M} ∫₀^{t_i} L_Ord,j(t_i − τ) Z_Ord,j(τ) dτ − Σ_{j=1}^{M} ∫₀^{t_i} L_Std,j(t_i − τ) Z_Std,j(τ) dτ = Δ_Ord(2)(t_i) − Δ_Std(2)(t_i)    (2)

where L_Ord,j(t_i − τ) and L_Std,j(t_i − τ) are weighting functions and M is the number of influencing quantities.
Thus, the output signal of the measuring system at any moment of time t_i can be given by the formula:

X_Syst(t_i) = Δ_Ord(1)(t_i) + Δ_Ord(2)(t_i) − Δ_Std(1)(t_i) − Δ_Std(2)(t_i).    (3)
Let us assume that: 1) the ordinary instruments' errors are independent random quantities distributed symmetrically with a mean value (average) close to zero; 2) the true value of the standard's error does not change considerably with time; and 3) calibrations are carried out under controlled reference conditions and all external influences are not considerable. Thus, if the obtained sample is of a sufficient size N, then:

(1/N) Σ_{i=1}^{N} X_Syst(t_i) ≈ −Δ_Std,    (4)

because, due to the assumptions accepted above, (1/N) Σ_{i=1}^{N} Δ_Ord(t_i) → 0.
Thus, the estimation (with opposite sign) of the true value of the standard's error can be obtained. This estimation serves as a quality parameter of the process. The hypothesis of independence of the results of calibrations can be tested by means of different statistical criteria [1]. The Type A standard uncertainty and the expanded uncertainty can be evaluated in the usual way.
1.3. Data processing of non-stationary time series
If, due to the influence of external factors and (or) the instability of the standard's metrological characteristics, the hypothesis of independence is rejected, the obtained time series must be considered as a dynamic one. Various ways of smoothing can be applied to the statistical processing of such series. The general purpose of smoothing is the identification of the trend. But the solution of this task cannot be strictly unequivocal. The kind and parameters of an approximation function substantially depend on the features of the analysed process. The results of modelling have shown that various kinds of graphical, quasi-analytical and analytical methods can be used for smoothing, but in this case the reliability of the method requires experimental confirmation.
2. Practical application of the offered method
With the purpose of confirming the assumptions put forward, an analysis of the calibrations of micrometer measuring heads 1MIG has been done. The calibrations have been carried out by means of the working standard "PPG-3". Some important results of the analysis are shown in Figure 2.
Figure 2. Practical results of the calibrations process analysis (micrometer heads 1MIG; points - errors of the micrometer heads; 1 - maximum permissible errors for PPG-3; 2 - expanded uncertainty, k = 2; 3 - real-time estimations).
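A minimal sketch of the real-time estimation of section 1.2, applied to simulated data rather than to the 1MIG/PPG-3 record of Figure 2; the instrument errors, the value of the standard's error and the lag-1 autocorrelation check are invented for illustration (the particular independence criterion is our choice among the many possible ones [1]).

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated calibration record: each X_Syst(t_i) = Delta_Ord(t_i) - Delta_Std,
# with symmetric, zero-mean instrument errors and a constant standard error.
N = 80                                  # number of calibrations in the analysed period
delta_std = +0.4                        # true (unknown) error of the working standard, in um
x_syst = rng.normal(0.0, 1.0, N) - delta_std

# Equation (4): the estimate of the standard's error, taken with opposite sign
delta_std_hat = -np.mean(x_syst)
u_typeA = np.std(x_syst, ddof=1) / np.sqrt(N)      # Type A standard uncertainty of the mean
print(f"estimated standard error: {delta_std_hat:.2f} um, u = {u_typeA:.2f} um (k=1)")

# Crude independence check via the lag-1 autocorrelation of the series
r1 = np.corrcoef(x_syst[:-1], x_syst[1:])[0, 1]
print(f"lag-1 autocorrelation: {r1:+.2f} (values far from 0 suggest a non-stationary series)")
```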
The graphical representation of the obtained information (Figure 2) confirms that the instability of the metrological characteristics of the working standard "PPG-3" does not exceed the allowable limits. Therefore it is possible to conclude that the quality of the process of calibrations meets the established requirements.
3. Conclusion
A statistical method for the estimation of the quality of calibrations has been developed. The experimental testing of its application has confirmed its legitimacy.
References
1. Bendat J. S., Piersol A. G., Random Data: Analysis and Measurement Procedures, John Wiley & Sons, New York, 1986.
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 344-349)
AN APPROACH TO UNCERTAINTY ANALYSIS EMPHASIZING A NATURAL EXPECTATION OF A CLIENT
R. WILLINK
Industrial Research Ltd., P.O. Box 31-310, Lower Hutt, New Zealand
E-mail: [email protected]
This article describes an approach to uncertainty analysis that is based on a natural performance expectation of the user of the uncertainty statement, i.e. the recipient. This expectation is that, in the long run, 95% of all the "95% uncertainty intervals" received will contain the corresponding measurand. By placing the focus on the set of intervals received by the user from various sources at various times, errors that are systematic with respect to any set of measurements made by any one laboratory may be treated as random. This enables a defensible method of analysis to be devised to combine "random" and "systematic" errors.
1. Introduction
Several different approaches exist for the analysis of uncertainty in measurement problems. Each acknowledges there is no guarantee that any resulting interval will contain the value of the measurand. So the interval is quoted with a level of confidence or coverage that is typically the figure 0.95, which in some sense is a "probability" that the measurand is enclosed. The concept of probability is, however, contentious, so it is not trivial to evaluate the performance of a method of uncertainty analysis in a way that all will regard as valid. Similarly, there is often no discussion of the practical meaning of this probability, 0.95, for the recipient of the uncertainty statement. This article is based on the premise that the client, or user, should be able to find practical and specific meaning in this figure 0.95. In the author's opinion, this practical meaning is found in the understanding that a 95% uncertainty interval is an interval the like of which contains the value of the measurand on (at least) 95 of every 100 independent occasions, on average.ª This seems to be the only practical meaning of a 95% uncertainty interval to the user. What other practical meaning could there be?
ª Equivalently, a 95% uncertainty interval is one in a series of equally trustworthy, independently determined intervals received over a period of time of which (at least) 95 out of 100, on average, will contain the value of the relevant measurand [1].
With this criterion in mind, we describe a potential approach to uncertainty analysis that involves an unconventional, but valid, way of combining Type A and Type B evaluations [2], or "random" and "systematic" errors. The approach enables information about systematic errors to be expressed in terms of distributions, as in Type B evaluations. Also, it avoids the need to specify prior beliefs about the parameters of sampling distributions in Type A evaluations, which exists in a Bayesian analysis.ᵇ
Our premise is that the perspective that is important when we quote a probability accompanying an uncertainty interval, say 95%, is the perspective of the user. The user should be able to say that 95% of such intervals that he or she receives over a period of time will contain the value of the corresponding measurand. We suppose, reasonably, that these intervals arise from different measurement problems undertaken by one or more laboratories. So errors that are "systematic" with respect to some set of measurements made by a laboratory are "random" with respect to the set of problems encountered by the user. (A schematic illustration of this is given in Figure 1.) This idea is the key to the method. It allows us to treat both types of error in the same way, which facilitates the construction of a method of uncertainty analysis with good performance. This performance is measured by the typical width of the interval generated, subject to the requirement that 95% of the intervals enclose the value of the relevant measurand.
Figure 1. Errors e_I, ..., e_VI that are systematic down a column are random along a row.
ᵇ The usual theoretical justification for assigning a shifted-and-scaled t-distribution to an influence quantity involves giving unknown statistical parameters certain "noninformative" prior distributions in a Bayesian analysis [3, 4, 5]. An alternative justification can be found in "fiducial" probability, but this is out of favour in the statistics community.
2. Method: one "random" and one "systematic" error
The approach is illustrated using the simple type of measurement that involves a sample of normally distributed measurement errors and a bias that is assigned a uniform distribution as a result of expert knowledge. Imagine a measurand given by the equation Y = X₁ + X₂, where X₁ and X₂ are the values of influence quantities that are estimated independently. Suppose that X₁ is estimated from n independent observations each assumed to have been drawn from a normal distribution with mean X₁ and unknown standard deviation σ. Let x₁ and s be the sample mean and usual sample standard deviation of these observations. The error x₁ − X₁ is therefore an element drawn from the normal distribution with mean 0 and standard deviation σ/√n. Also, suppose that X₂ is estimated by expert knowledge to lie within a known amount w of some estimate x₂, with no value being thought more likely than any other. So x₂ − X₂ can be regarded as an element drawn from a uniform distribution with mean zero and known half-width w. Often, the error x₂ − X₂ will be systematic, i.e. this error will not change if the laboratory carried out a similar measurement at a later date. However, when we envisage the set of measurement problems of this type encountered by the user, the nature of this error as being constant with respect to the laboratory becomes irrelevant; the error x₂ − X₂ becomes interpretable as an element drawn from a uniform distribution with mean 0 and half-width w, and w will be an element drawn from some undefined population of half-widths for measurement problems of this type.
The figure y = x₁ + x₂ is the measurement estimate and the total error y − Y is the sum of the two individual errors. Let Z be a standard normal variable, i.e., Z ~ N(0,1), and let U be a variable with the continuous uniform distribution with mean 0 and half-width w, i.e., U ~ U[−w, w]. So, with respect to the set of measurement problems encountered by the user involving that value of w, the total error y − Y is the value of a variable E = (σ/√n)Z + U, with σ being an unknown parameter. Let S be the random variable for which s is the observed value. If there is some function H(S, w, n) such that

Pr{H(S, w, n) > |E|} ≥ 0.95    (1)

for every possible fixed value of σ then the interval

[y − H(s, w, n), y + H(s, w, n)]    (2)

will contain the relevant measurand on at least 95% of the occasions that
the user receives such an interval, provided that the half-width is w.ᶜ This would be true for every possible value of w, from which it follows that the interval (2) will contain the relevant measurand for at least 95% of all measurement problems of this type. If this approach can also be used to define intervals with the same property in other types of measurement problems then the result will be a system generating an interval containing the measurand on at least 95% of occasions as viewed by the user.
Let us write the inequality (1) as

0.95 ≤ Pr{H(S, w, n) > |E|} ≤ 0.95 + δ    (3)
for some value of δ. Choosing H(S, w, n) to minimize δ will encourage narrow intervals and, hence, will give the most informative statements to the user. It is well known that the variables for the sample mean and sample variance are independent when the distribution is normal. So S and E are independent. Despite this, finding a suitable form for H(S, w, n) from theory is difficult. However, we know that when σ/√n ≫ w the value of the function will ideally be equal to the appropriate quantile of the scaled t-distribution, t₉₅,ₙ₋₁S/√n, and when σ/√n ≪ w it will ideally be equal to 0.95w. This suggests a function of the form

H(S, w, n) = [(t₉₅,ₙ₋₁S/√n)² + (0.95w)² + G(S, w, n)]^{1/2}    (4)

with G(S, w, n) being the quantity
being the quantity
f 0 . 3 7 - ^ - ^ ) f ^ ^ y ' 5 ( o . 9 5 w ) - e x p f-0-35*0-95tA , V n n2 J \ y/n J \ i95,n_i5/Vn / The term G(S,w,n) was chosen empirically to keep the probability in (3) small between those extremes, where G(S, w, n) will be negligible and H(S, w, n) will be exact. Simulations for 5 < n < 100 and 0.05a/^/n <w< 25&a/y/n suggest that (3) holds with 5 = 0.004. No significance should be attached to the form of G(S, w,n). A simpler expression would presumably be available if we were satisfied with a larger value of 8. Also, a simpler expression permitting a smaller value of 5 may exist. C
ᶜ A referee provided a helpful restatement of the problem along the following lines. Suppose vᵢ = μ + δ + eᵢ, i = 1, ..., n, where δ is a point drawn from the distribution U[−w, w] and eᵢ is a point drawn from the distribution N(0, σ²). Let v̄ and s be the sample mean and sample standard deviation of {vᵢ}. With n and σ² held fixed, we can generate many such samples, from which we calculate the sets {v̄ⱼ} and {sⱼ}. We wish to find a function H(s, w, n) such that the interval with limits v̄ⱼ ± H(sⱼ, w, n) contains μ at least 95% of the time, irrespective of the values of μ and σ.
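The restatement in the footnote translates directly into a small simulation. The sketch below checks the long-run coverage of intervals of the form v̄ ± [(t₉₅,ₙ₋₁s/√n)² + (0.95w)²]^{1/2}, i.e. the proposed H without the small empirical term G, for one fixed (σ, w, n); it is an illustration of the coverage criterion under these assumptions, not a reproduction of the author's computations.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)

n, w, sigma, mu = 4, 1.0, 1.0, 0.0       # one fixed configuration; sigma**2/n = 0.25 here
t95 = stats.t.ppf(0.975, n - 1)          # the 95 % two-sided quantile t_{95,n-1}
trials = 100_000

covered = 0
for _ in range(trials):
    delta = rng.uniform(-w, w)                       # "systematic" error, random over problems
    v = mu + delta + rng.normal(0.0, sigma, n)       # observations v_i = mu + delta + e_i
    vbar, s = v.mean(), v.std(ddof=1)
    h = np.sqrt((t95 * s / np.sqrt(n)) ** 2 + (0.95 * w) ** 2)   # H without the G term
    covered += (vbar - h) <= mu <= (vbar + h)

print(f"coverage over {trials} problems: {covered / trials:.4f}")
```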
The type of measurement considered is not representative. A practical method would need to be more general and, in particular, would need to accommodate several random components and several systematic components, none of which might be uniform. Also, the method would need to make the resulting uncertainty information transferable to future calculations. So (2) is not being proposed for practical use, and the particular form of G(S, w, n) is irrelevant. Rather, the intention is to indicate a potential approach when the focus is on fulfilling a natural requirement of the user.
3. Comparison of intervals
Many readers will consider this paper to be incomplete without a comparison of (2) with intervals calculated by better-known methods, e.g. the basic method of the Guide to the Expression of Uncertainty in Measurement [2] (Guide) or a method involving the propagation of posterior distributions (in a Bayesian analysis). However, a comparison of different algorithms for calculating uncertainty intervals is only fair if the intervals purport to fulfil the same task, and the author is hesitant to represent claims of actual performance for the method of the Guide and the method of "propagation of distributions". Nevertheless, we compare these intervals here on the understanding that Y should be enclosed on at least 95% of occasions for every possible fixed value of σ and that the intervals are to be kept narrow.
The basic method of the Guide, if applied in this simple situation, would give the interval [5] [y − u_c t₉₅,ν_eff, y + u_c t₉₅,ν_eff], where u_c² = s²/n + w²/3 and ν_eff = (n − 1)n²u_c⁴/s⁴. We call this "Interval G". The usual interval obtained by the propagation of distributions is found by treating X₁ and X₂ as random variables, assigning X₁ the distribution of x₁ + (s/√n)Tₙ₋₁,ᵈ and convolving this with the distribution U[x₂ − w, x₂ + w] assigned to X₂. Numerical integration is used here to identify this interval, "Interval PG". Without loss of generality, we set X₁ = X₂ = 0 and w = 1. Subsequently, 10⁶ measurements were simulated with n = 4 and n = 8 for various fixed σ²/n. Table 1 shows the proportions of intervals that contained the value of the measurand, Y = 0, and the mean ratio of the width of the interval to the width of (2). (The standard error in the estimate of a proportion close to 0.95 based on 10⁶ trials is 0.0002, and the standard error of each "mean ratio of width" was less than 0.0001. So the third decimal place is meaningful in each entry. Another negligible source of error with the Interval PG data relates to the calculation of the interval limits.)
T h i s shifted-and-scaled Student's t-distribution can be justified if, prior to the sample being observed, no value for X\ and log a was thought more probable than any other. 4
In some situations, Interval G encloses the value of the measurand on fewer than 95% of occasions. In contrast, Interval PG encloses the measurand at least 95% of the time but, on average, the interval is slightly wider than need be (when the sample size is small), e.g. almost 5 percent wider than (2) when n = 4 and σ²/n = 0.1. Results for the "ratio of mean widths" were very similar. In particular, the figure 1.049 was unchanged.
n     σ²/n     Interval (2)       Interval G         Interval PG
               prop.    width     prop.    width     prop.    width
4     0.001    0.953    1         1.000    1.188     0.954    1.001
4     0.01     0.951    1         0.997    1.166     0.960    1.022
4     0.1      0.952    1         0.967    1.019     0.963    1.049
4     1        0.953    1         0.937    0.904     0.958    1.025
4     10       0.950    1         0.941    0.965     0.952    1.006
4     100      0.950    1         0.995    0.948     0.950    1.001
8     0.001    0.952    1         1.000    1.189     0.951    0.999
8     0.01     0.952    1         0.997    1.164     0.953    1.003
8     0.1      0.951    1         0.968    1.059     0.955    1.017
8     1        0.953    1         0.974    0.948     0.954    1.006
8     10       0.951    1         0.949    0.991     0.951    0.999
8     100      0.950    1         0.950    0.999     0.950    1.000
Table 1. Proportion of 95% intervals enclosing the value of the measurand in 10⁶ trials, with the mean ratio of width to that of (2).
4. Conclusion
This analysis has focused on what many users would regard as their legitimate expectation for an uncertainty interval, i.e. the idea that, in the long run, 95% of the (independent) "95% uncertainty intervals" that they are quoted will contain the value of the measurand. The fundamental idea is that the "population" of interest is the population of problems encountered by the user. With respect to this population we can regard systematic errors incurred by individual laboratories as being random. The approach leads to intervals with the stated property and with reduced typical width.
References
1. R. Willink and I. Lira, Measurement 38, 61 (2005).
2. BIPM, IEC, IFCC, ISO, IUPAC, IUPAP and OIML, Guide to the Expression of Uncertainty in Measurement (1995). ISBN 92-67-10188-9.
3. R. Kacker and A. Jones, Metrologia 40, 235 (2003).
4. S. J. Press, Bayesian Statistics: Principles, Models, and Applications, Wiley (1989).
5. R. Willink, Metrologia 42, 329 (2005).
Advanced Mathematical and Computational Tools in Metrology VII Edited by P. Ciarlini, E. Filipe, A. B. Forbes, F. Pavese, C. Perruchet & B. Siebert © 2006 World Scientific Publishing Co. (pp. 350-359)
PREPARING FOR A EUROPEAN RESEARCH AREA NETWORK IN METROLOGY: WHERE ARE WE NOW?
MICHAEL KUHNE, WOLFGANG SCHMID
Physikalisch-Technische Bundesanstalt, Bundesallee 100, 38116 Braunschweig, Germany
ANDY HENSON
National Physical Laboratory, Hampton Road, Teddington TW11 0LW, U.K.
Advances in metrology underpin innovation, enable improvements in quality of life, and unlock potential from other areas of science. Increased demand for top-level metrology in traditional areas, allied with the need to support emerging areas of technology, requires a new approach from the European NMIs. Increasing levels of cooperation in R&D will enable the NMIs to significantly improve the impact from funding that cannot reasonably be expected to keep pace with demand growth. Within the iMERA project a coordinated European Metrology Research Programme will be prepared and the institutional and legal conditions for its implementation will be generated.
1. Introduction
Industry, trade, and increasingly quality of life depend on effective, consistent measurements. Therefore the demands on metrology are steadily growing, and can be expected to grow even more rapidly in the future [1]. The drivers behind this pressure may be considered threefold:
• traditional areas of industry are becoming more complex and are requiring broader measurement ranges and lower uncertainties,
• new areas of technology are emerging, e.g. nano-technology or biotechnology,
• areas which are not in themselves new, but in which the value of metrology is increasingly being recognized, e.g. chemistry, clinical medicine or food safety.
One of the roles of the National Metrology Institutes (NMIs) is to address these needs, but more and more NMIs are facing the difficult situation that they need to free resources for the development and delivery of new metrology for new
technologies and sectors whilst still servicing the current and sometimes even growing demands of the existing "traditional" sectors [2] - a situation which is interpreted as the "metrology dilemma". One key approach to address this dilemma is to increase the level of collaboration in metrology, both in the research and development effort and in the delivery of the resulting measurement services. The vision of how this might be achieved in the field of research through the implementation of the European Research Area (ERA) in metrology will be discussed in this paper, and ideas of how to accomplish this ambitious goal explored.
2. EUROMET
In September 1987 EUROMET was created as a European collaboration in measurement standards [3], with the objective to foster collaboration between European NMIs and the coordination of metrological activities and services. The aims of EUROMET are to achieve a closer cooperation in the work on measurement standards within the present decentralized metrological structure, an optimised utilization of resources deployed towards identified metrological needs, and the accessibility of improved measurement services and of national facilities developed in the context of EUROMET collaboration to all members [3]. Today EUROMET comprises 33 member countries plus the Joint Research Centre of the European Union, represented through IRMM. The member NMIs of EUROMET encompass around 4000 employees, of whom about half work in R&D; the expenditure on R&D in metrology is of the order of 370 M€ for the NMIs plus a further 30 M€ for the IRMM [4]. The technical scope of EUROMET activities is divided into ten areas:
• acoustics, ultrasound, and vibration
• electricity and magnetism
• flow
• ionizing radiation
• length
• mass and related quantities
• metrology in chemistry
• photometry and radiometry
• thermometry
• time and frequency
each with a Technical Committee and an elected chairperson.
In addition to these "vertical" TCs, EUROMET has created a "horizontal" TC for Interdisciplinary Metrology called TC INTMET. Its main areas of activity are matters related to the Mutual Recognition Arrangement of the CIPM (CIPM-MRA) and other matters that cross the borders between TCs. A subcommittee of INTMET with the name QS-Forum deals with the review of the Quality Management Systems of the NMIs as required by the CIPM-MRA. This subcommittee has been upgraded to a full TC at the general assembly of EUROMET in 2005 and has now become TC Quality.
Formal collaborations within EUROMET are classified under four different categories:
• comparisons, which are undertaken in order to establish the degree of equivalence between the national standards,
• traceability, where one NMI takes formal traceability from another NMI rather than maintaining its own primary standard,
• consultation on facilities, which means that, on an equal-partner basis, each NMI shares its knowledge and experience with the other NMIs of EUROMET in order to help and assist their further development, and
• cooperation in research, which comprises formal research projects as well as the various forms of exchange of information.
One consequence of the Mutual Recognition Arrangement (MRA) of the CIPM, signed in 1999, is that much of the collaborative work within EUROMET over recent years has focused on meeting the CIPM-MRA obligations, that is the organization of and the participation in comparisons and the technical review of the calibration and measurement capabilities (CMCs) and the NMI quality systems [5]. Numerous collaboration activities have been developed around R&D projects. Generally, this type of collaboration is spontaneous, generated "bottom up", and therefore does not necessarily reflect the strategic aims of the NMIs. The current ad hoc collaborations have added value to the R&D undertaken, although a variety of factors limit the potential impact. These factors range from the lack of strong and formal support at the management level of the participating NMIs, which may result in significant delays in the execution of the projects [5], to barriers arising from incompatibilities when members are trying both to collaborate and to meet national funding timescales.
The MERA Project
Within the EU 5th Framework Programme EUROMET conducted a study "Metrology for the European Research Area" (MERA) [5] which analysed the
353 metrological needs for Europe at the beginning of the 21 st century. The aim of the project was to understand whether and how the "European metrology dilemma" might be addressed through closer collaboration. This involved developing the plans to optimize and increase significantly the impact of European metrology research. The situation within the NMIs is different in each country. Whilst the smaller NMIs have always focused their resources on the most pressing priorities of their country, relying on the capabilities of the larger NMI for the balance, in many larger countries the NMIs have historically provided a comprehensive range of services and research. The pressure on resources means that the assumption that all capability must be provided from within the country, is being questioned even in the larger countries. Whilst smaller countries can identify and concentrate on their priority topics on a unilateral basis, if the larger countries adopt the same approach Europe risks losing vital capability and coherence [2]. Within the MERA project several scenarios for the future metrological infrastructure and their advantages and disadvantages were discussed. These included: (a) Comprehensive national provision, where every country has an NMI, offering a comprehensive metrological service, (b) Selected standard holders, where the NMI provides a service as comprehensive as possible, but maintains primary standards only in some selected (and agreed on the European level) areas, obtaining traceability in the other areas from another NMI, (c) Specialised centres of excellence, where selected NMIs maintain primary standards and offer traceability in one or several areas to all users in Europe, (d) Single European Metrology Institute, where the national NMIs are discontinued. Whilst the current situation lies somewhat between scenarios (a) and (b), the European NMIs see the future in a solution (b) or (c) or a mixture between (b) and (c). In this aspect, it should be noted that the optimum solution may vary between technical areas. Concerning R&D in metrology, the MERA study identified, amongst other results, that: • the aims of various national programmes have significant overlap, • many of the research teams are sub-optimal in terms of critical mass,
354
• •
major/special metrological facilities are expensive to establish, and may not necessarily be fully utilized through national demand alone, many of the topics tackled by research would best be addressed collaboratively.
This indicates the potential for raising the effectiveness and impact through coordinated R&D activities in metrology realized at a European level. A pan European approach to metrological research would increase critical mass, share the burden and create a common scientific and technological perspective. 4.
The iMERA Project
In October 2004, as a consequence of the outcome of the MERA project, EUROMET members submitted a proposal to the European Commission for a European Research Area Network under the 6th Framework Programme. This proposal called iMERA ("implementing MERA") was positively evaluated by the European Commission and will start on 1 April 2005. The aim of the proposal is twofold. Firstly, to facilitate some "bottom-up" but coordinated European research projects and, secondly, to create the conditions and structures within EUROMET to enable the NMIs to conduct coordinated metrological research in identified areas of strategic importance. The proposal also includes the investigation of the potential for collaboration on the basis of Article 169 of the European treaty. This article enables the European Commission to financially support European research in areas where EU member countries are willing to pool their national financial contributions into a European research fund. To facilitate this goal a number of ministries that fund national metrological research are also partners in the iMERA project. If the conditions are deemed appropriate, a proposal for a coordinated European metrology research programme (EMRP) will be prepared within the iMERA project with the aim of realizing it within the EU 7th Framework Programme. 4.1 The Consortium The iMERA consortium comprises 20 partners from 14 countries including the IRMM-JRC (see table 1). They may have the status of a "programme manager" (NMI) or "programme owner" (Ministry or their representative), or both.
355 Table 1. The iMERA consortium Country UK UK Germany
Institution NPL DTI
Programme owner
Programme manager X
X X
Germany
BMWA PTB
France
BNM
X
X
Italy
IMGC
X
X
Sweden
SP
X
Slovakia
UNMS
X X
Slovakia
SMU EZ
X
Netherlands
X
X
Netherlands
NMi
Denmark
DFM
X
Switzerland
METAS JV
X X
Czech Rep.
COSMT CMI
X
Poland Slovenia
GUM MIRS
X
X
Finland
MIKES IRMM
X X
X X
Norway Czech Rep.
EC
X X X X X
4.2 The Project Management The consortium will be managed by a Project Coordination Team, comprising the Project Coordinator and a Network Management Committee, with representatives of the different task groups of the project. A Network Steering Committee composed of senior representatives of 8 countries will be formed, with the responsibility to monitor the progress of the project, to suggest - if necessary - corrective actions and to make the Go/No-Go decision regarding the continuation towards Article 169. A web portal will be established for the project management, for consultation and also for dissemination of the project outputs. A number of workshops will be organized in order to enable input from the practitioners and stakeholders.
4.3 The Tasks
Considering the rather ambitious aim of the project, a significant number of issues in widely diverse areas have to be addressed. The project comprises 37 "tasks" organized into 12 task groups [4], described briefly below. In general, the scheme of activities within a task group is as follows. Following collation and review of the status quo, in order to understand what each country does nationally, and based on an assessment of this information, options for joint activities between the iMERA partners will be identified and started. In the next step these will merge into trans-national activities with a deeper level of collaboration, developing coordinated European solutions. Finally, a joint European Metrology Research Programme will be developed and the legal and organizational conditions for its realization will be established.
4.4 The Task Groups
Foresight: The objective of this task group is twofold: firstly, elaborating the mechanisms for a foresight analysis of the future needs in metrology, including consultation of the stakeholders as the potential users and, secondly, identifying those areas in metrology with the highest potential impact for industry and society. It will commence with collation and assessment of the national foresight studies and will finally lead to the elaboration of a report, "European Metrology Foresight 2007", which is intended to be updated periodically.
Research programme: National programme landscaping will provide an overview of the national methodologies currently used for research programmes and will analyse their strengths and weaknesses. The results of the European Metrology Foresight study elaborated within the former task group, and the subsequent identification of strategic European metrology research activities to support innovation and quality of life, will provide an important thematic input. The final aim of this task group is the preparation of a European metrology research programme. Additionally, the initial experience gained via the realization of pilot joint research projects will provide input on the practical aspects of realizing an EMRP.
Prioritisation: As a first step, an overview of the methodologies currently used to identify metrology R&D priorities at the national level will be undertaken and their strengths and weaknesses analysed. Finally, this task group will consider the opportunities and advantages of a European collaboration in a specific research activity before the decision on the prioritization in the national programme is taken. That means integration of the national needs into European
solutions, rather than first defining the national needs and asking later whether other countries may have similar requirements.
Ownership: Legal aspects will be analysed and, if necessary, adapted in order to enable the national funding contributions to an EMRP. Consensus on the financial contributions of the participating countries must be achieved, and the sustainability of the trans-national activities must be assured for the time when the financial support of the European Commission may have been reduced.
Developing Structures: The realization of an EMRP may require structural and legal changes for the participating institutions, both at a national and a European level. It is expected that EUROMET will play an important role in the preparation of the EMRP and its implementation. Elements which assure the proper participation of all interested parties (NMIs, stakeholders, national Ministries, EC) in the elaboration of the research programme have to be put in place; the mechanisms for the project selection process and for the designation of the executing institutions, based on excellence criteria, have to be developed and implemented; and the required expert committees have to be formed. As in many cases it will be desirable or necessary that a project is executed by a consortium of several NMIs (and possibly other institutions) rather than by a single NMI, the administrative obstacles to doing so have to be eliminated. Of particular interest would be the opening of existing special facilities to metrologists of all European countries, in order to achieve their highly efficient use, to foster collaboration on a European level and possibly the joint establishment and operation of major facilities. Finally, an action plan for the realization of an Article 169-based EMRP, drawing on all the results obtained in this task group, will be established.
Training and Mobility: Among the tasks of this group, joint training courses will be offered in the fields of identified needs, especially for NMI staff from countries with emerging metrology programmes. With the support of the EU mobility programme, the mobility of European metrologists should be improved.
Special needs and expanding the ERA-Net: Strategies will be developed for the participation of EUROMET members who at present do not have well-established research activities. This may be done "actively", by resolving particular problems within a major project, or "passively", as beneficiaries of the results of projects executed in other institutions. Furthermore, it should be possible for countries which are not yet ready to join at the start of iMERA to join the EMRP at a later stage.
Knowledge Transfer, Intellectual Property Issues and Ethical Issues: Aspects of the proper provision and transfer of the results obtained in the execution of an EMRP to all interested (European) users will be addressed, including the related intellectual property issues.
Measuring R&D Impact: The processes to identify the impact of the R&D activities in metrology will be analysed in order to support the foresight and prioritization processes.
Information and Communication Technology Tools: Key aspects related to information and communication technology as a tool in the ERA-Net will be surveyed.
Beyond Europe - Addressing Wider Collaboration: In some metrological challenges a wider collaboration beyond Europe may be desirable or even necessary.
Dissemination, Governance and Consortium Management: Project and consortium management and related activities.
5. Conclusions
A promising approach to increasing the impact of European funding in metrology is the enhanced coordination of metrology R&D activities between the European NMIs. The iMERA project has the ambitious aim of generating the conditions for the elaboration and execution of a coordinated European Metrology Research Programme (EMRP) and of preparing such a programme, which could be realized within the 7th Framework Programme of the European Commission with mixed funding: national contributions and support from the European Union. The coordinated R&D activities will significantly increase their benefit-cost ratio by avoiding unnecessary redundancy and by seeking the most appropriate approach and institution or consortium for the realization of a specific research project. This efficient use of resources will open up the possibility of addressing new fields in metrology, such as nanometrology and biotechnology, thus responding to the future metrological needs of European industry and society. Additionally, the ability to pool the technical and human capacity of all participating European NMIs, particularly in the case of the most ambitious research projects, will enable the required "critical mass" to be achieved, something probably not possible within an individual NMI. In this way iMERA will enable the participating countries to increase significantly the national and European impact of their investment in metrology R&D.
References
1. "Evolving Needs for Metrology in Trade, Industry and Society and the Role of the BIPM", CIPM, 2003, http://www1.bipm.org/en/publications/official/
2. A. Henson, D. Beauvais, F. Redgrave, "Globalisation and the integration of the European measurement systems: The MERA project", Proceedings of the XVII IMEKO World Congress, Dubrovnik, Croatia, 2003, pp. 1578-1580.
3. EUROMET Directory, edited by Gordon Clark, EUROMET Secretary, July 2004, http://www.euromet.org/docs/directory/
4. "Implementing the Metrology European Research Area", Proposal for the Sixth Framework Programme, Co-ordination of National and Regional Activities, 2004 (not public).
5. "Planning the European Research Area in metrology", Final report, MERA project, EUROMET, April 2004, http://www.euromet.org/docs/pubs/docs/Mera final-report.pdf
Author Index
Anwer N, 196
Aranda S, 258
Astrua M, 301
Aubineau-Laniece I, 276
Aytekin H S, hsaytekin@yeditepe.edu.tr, 262
Badea E, 306
Balsamo A, 316
Bar M, 1
Bardies M, manuel.bardies@nantes.inserm.fr, 276
Batista E, 267
Bauer S, 1
Benoit E, 13
Bojkovski J, 330
Bourdet P, 196, 280
Bremser W, wolfram.bremser@bam.de, 130
Brennan J K, 271
Castro M P, 335
Chatal J F, jean-francois.chatal@nantes.inserm.fr, 276
Chiavassa S, 276
Choley J Y, jean-yves.choley@supmeca.fr, 280
Ciarlini P, 142, 237, 306
Clement A, 280
Costa C O, 335
Costa Dantas C, 204, 284
Cox M G, 23, 188, 221
Crampton A, 271
Crenna F, crenna@dimec.unige.it, 221
Dapoigny R, 13
de Araujo Filho M C, 204
de Areas G, 119
de Barros Melo S, 204, 284
de Oliveira Lima E A, 204, 284
Della Gatta G, 306
Dobre M, 289
Domanska A, 293
dos Santos R W, 1
dos Santos V A, 284
Douglas R J, 245
Figueiredo F O, 35
Filipe E, 151, 267
Forbes A B, 161, 297
Fraile R, 119
Franck D, 276
Girao P S, 47
Gomes M I, 35
Greif N, 98
Grgic G, 253
Gross H, 60
Harris P M, 23, 188, 221, 271
Hartmann V, 60
Henson A, 350
Ichim D, 301, 306
Ince A T, 262
Ince R, 262
Iuculano G, iuculano@ingfil.ing.unifi.it, 171
Jiang X, 271
Jourdain J R, 276
Jousten K, 60
Kallgren H, 212
Klapetek P, 310
Konnert A, 179
Korczynski M J, 188, 293
Kühne M, 350
Lartigue C, 196
Leach R, 271
Linares J M, 258
Lindner G, 60
Lira I, 73
Lopez J M, 119
Mailhe J, 258
Maniscalco U, 142
Meda A, 316
Model R, 1
Oguz S, 262
Palmin U, 330
Panfilo G, 320
Pavese F, 325
Pellegrini G, 171
Pendrill L R, 212
Pereira J D, 47
Perruchet C, 85
Piree H, 289
Postolache O, 47
Premus A, 330
Recuero M, mrecuero@sec.upm.es, 119
Regoliosi G, 142
Ribeiro A S, 335
Richter D, 98
Riviere A, alain.riviere@supmeca.fr, 280
Rossi G B, 221
Ruiz M, 119
Sand A, 229
Schmid W, 350
Schrepf H, 98
Sibold D, 237
Siebert B, 237
Sivia D S, 108
Slinde H, 229
Sousa J A, 335
Sprauel J M, 258
Steele A G, alan.Steele@nrc-cnrc.gc.ca, 245
Strunov V I, 340
Tasic T, tanasko.tasic@gov.si, 253, 330
Tavella P, 320
Thiebaut F, 196
Urleb M, 253
Willink R, 344
Woger W, 73
Wood B M, 245
Woolliams E, 23
Zanobini A, 171
Series on Advances in Mathematics for Applied Sciences
Editorial Board
N. Bellomo Editor-in-Charge Department of Mathematics Politecnico di Torino Corso Duca degli Abruzzi 24 10129 Torino Italy
F. Brezzi Editor-in-Charge IMATI-CNR Via Ferrata 1 I-27100 Pavia Italy
M. A. J. Chaplain Department of Mathematics University of Dundee Dundee DD1 4HN Scotland
S. Lenhart Mathematics Department University of Tennessee Knoxville, TN 37996-1300 USA
C. M. Dafermos Lefschetz Center for Dynamical Systems Brown University Providence, RI 02912 USA
P. L. Lions University Paris IX-Dauphine Place du Marechal de Lattre de Tassigny Paris Cedex 16 France
J. Felcman Department of Numerical Mathematics Faculty of Mathematics and Physics Charles University in Prague Sokolovska 83 18675 Praha 8 The Czech Republic
B. Perthame Laboratoire d'Analyse Numerique University Paris VI tour 55-65, 5ieme etage 4, place Jussieu 75252 Paris Cedex 5 France
M. A. Herrero Departamento de Matematica Aplicada Facultad de Matematicas Universidad Complutense Ciudad Universitaria s/n 28040 Madrid Spain
K. R. Rajagopal Department of Mechanical Engrg. Texas A&M University College Station, TX 77843-3123 USA
S. Kawashima Department of Applied Sciences Engineering Faculty Kyushu University 36 Fukuoka 812 Japan
M. Lachowicz Department of Mathematics University of Warsaw Ul. Banacha 2 PL-02097 Warsaw Poland
R. Russo Dipartimento di Matematica Universita degli Studi Napoli II 81100 Caserta Italy
Series on Advances in Mathematics for Applied Sciences
Aims and Scope
This Series reports on new developments in mathematical research relating to methods, qualitative and numerical analysis, and mathematical modeling in the applied and technological sciences. Contributions related to constitutive theories, fluid dynamics, kinetic and transport theories, solid mechanics, system theory and mathematical methods for the applications are welcomed.
This Series includes books, lecture notes, proceedings, and collections of research papers. Monograph collections on specialized topics of current interest are particularly encouraged. Both the proceedings and monograph collections will generally be edited by a Guest editor.
High quality, novelty of the content and potential for the applications to modern problems in applied science will be the guidelines for the selection of the content of this series.
Instructions for Authors
Submission of proposals should be addressed to the editors-in-charge or to any member of the editorial board. In the latter case, the authors should also notify one of the editors-in-charge of the proposal.
Acceptance of books and lecture notes will generally be based on the description of the general content and scope of the book or lecture notes, as well as on a sample of the parts judged by the authors to be most significant. Acceptance of proceedings will be based on the relevance of the topics and of the lecturers contributing to the volume. Acceptance of monograph collections will be based on the relevance of the subject and of the authors contributing to the volume.
Authors are urged, in order to avoid re-typing, not to begin the final preparation of the text until they have received the publisher's guidelines. They will receive from World Scientific the instructions for preparing camera-ready manuscript.
SERIES ON ADVANCES IN MATHEMATICS FOR APPLIED SCIENCES
Vol. 17
The Fokker-Planck Equation for Stochastic Dynamical Systems and Its Explicit Steady State Solution by C. Soize
Vol. 18
Calculus of Variation, Homogenization and Continuum Mechanics eds. G. Bouchitte et al.
Vol. 19
A Concise Guide to Semigroups and Evolution Equations by A. Belleni-Morante
Vol. 20
Global Controllability and Stabilization of Nonlinear Systems by S. Nikitin
Vol. 21
High Accuracy Non-Centered Compact Difference Schemes for Fluid Dynamics Applications by A. I. Tolstykh
Vol. 22
Advances in Kinetic Theory and Computing: Selected Papers ed. B. Perthame
Vol. 23
Waves and Stability in Continuous Media eds. S. Rionero and T. Ruggeri
Vol. 24
Impulsive Differential Equations with a Small Parameter by D. Bainov and V. Covachev
Vol. 25
Mathematical Models and Methods of Localized Interaction Theory by A. I. Bunimovich and A. V. Dubinskii
Vol. 26
Recent Advances in Elasticity, Viscoelasticity and Inelasticity ed. K. R. Rajagopal
Vol. 27
Nonstandard Methods for Stochastic Fluid Mechanics by M. Capinski and N. J. Cutland
Vol. 28
Impulsive Differential Equations: Asymptotic Properties of the Solutions by D. Bainov and P. Simeonov
Vol. 29
The Method of Maximum Entropy by H. Gzyl
Vol. 30
Lectures on Probability and Second Order Random Fields by D. B. Hernandez
Vol. 31
Parallel and Distributed Signal and Image Integration Problems eds. R. N. Madan et al.
Vol. 32
On the Way to Understanding The Time Phenomenon: The Constructions of Time in Natural Science — Part 1. Interdisciplinary Time Studies ed. A. P. Levich
Vol. 33
Lecture Notes on the Mathematical Theory of the Boltzmann Equation ed. N. Bellomo
Vol. 34
Singularly Perturbed Evolution Equations with Applications to Kinetic Theory by J. R. Mika and J. Banasiak
Vol. 35
Mechanics of Mixtures by K. R. Rajagopal and L. Tao
Vol. 36
Dynamical Mechanical Systems Under Random Impulses by R. Iwankiewicz
Vol. 37
Oscillations in Planar Dynamic Systems by R. E. Mickens
Vol. 38
Mathematical Problems in Elasticity ed. R. Russo
Vol. 39
On the Way to Understanding the Time Phenomenon: The Constructions of Time in Natural Science — Part 2. The "Active" Properties of Time According to N. A. Kozyrev ed. A. P. Levich
Vol. 40
Advanced Mathematical Tools in Metrology II eds. P. Ciarlini et al.
Vol. 41
Mathematical Methods in Electromagnetism Linear Theory and Applications by M. Cessenat
Vol. 42
Constrained Control Problems of Discrete Processes by V. N. Phat
Vol. 43
Motor Vehicle Dynamics: Modeling and Simulation by G. Genta
Vol. 44
Microscopic Theory of Condensation in Gases and Plasma by A. L. Itkin and E. G. Kolesnichenko
Vol. 45
Advanced Mathematical Tools in Metrology III eds. P. Ciarlini et al.
Vol. 46
Mathematical Topics in Neutron Transport Theory — New Aspects by M. Mokhtar-Kharroubi
Vol. 47
Theory of the Navier-Stokes Equations eds. J. G. Heywood et al.
Vol. 48
Advances in Nonlinear Partial Differential Equations and Stochastics eds. S. Kawashima and T. Yanagisawa
Vol. 49
Propagation and Reflection of Shock Waves by F. V. Shugaev and L. S. Shtemenko
Vol. 50
Homogenization eds. V. Berdichevsky, V. Jikov and G. Papanicolaou
Vol. 51
Lecture Notes on the Mathematical Theory of Generalized Boltzmann Models by N. Bellomo and M. Lo Schiavo
Vol. 52
Plates, Laminates and Shells: Asymptotic Analysis and Homogenization by T. Lewinski and J. J. Telega
Vol. 53
Advanced Mathematical and Computational Tools in Metrology IV eds. P. Ciarlini et al.
Vol. 54
Differential Models and Neutral Systems for Controlling the Wealth of Nations by E. N. Chukwu
Vol. 55
Mesomechanical Constitutive Modeling by V. Kafka
Vol. 56
High-Dimensional Nonlinear Diffusion Stochastic Processes — Modelling for Engineering Applications by Y. Mamontov and M. Willander
Vol. 57
Advanced Mathematical and Computational Tools in Metrology V eds. P. Ciarlini et al.
Vol. 58
Mechanical and Thermodynamical Modeling of Fluid Interfaces by R. Gatignol and R. Prud'homme
Vol. 59
Numerical Methods for Viscosity Solutions and Applications eds. M. Falcone and Ch. Makridakis
Vol. 60
Stability and Time-Optimal Control of Hereditary Systems — With Application to the Economic Dynamics of the US (2nd Edition) by E. N. Chukwu
Vol. 61
Evolution Equations and Approximations by K. Ito and F. Kappel
Vol. 62
Mathematical Models and Methods for Smart Materials eds. M. Fabrizio, B. Lazzari and A. Morro
Vol. 63
Lecture Notes on the Discretization of the Boltzmann Equation eds. N. Bellomo and R. Gatignol
Vol. 64
Generalized Kinetic Models in Applied Sciences — Lecture Notes on Mathematical Problems by L. Arlotti et al.
Vol. 65
Mathematical Methods for the Natural and Engineering Sciences by R. E. Mickens
Vol. 66
Advanced Mathematical and Computational Tools in Metrology VI eds. P. Ciarlini et al.
Vol. 67
Computational Methods for PDE in Mechanics by B. D'Acunto
Vol. 68
Differential Equations, Bifurcations, and Chaos in Economics by W. B. Zhang
Vol. 69
Applied and Industrial Mathematics in Italy eds. M. Primicerio, R. Spigler and V. Valente
Vol. 70
Multigroup Equations for the Description of the Particle Transport in Semiconductors by M. Galler
Vol. 71
Dissipative Phase Transitions eds. P. Colli, N. Kenmochi and J. Sprekels
Vol. 72
Advanced Mathematical and Computational Tools in Metrology VII eds. P. Ciarlini et al.
SERIES ON ADVANCES IN MATHEMATICS FOR APPLIED SCIENCES
Published:
Mathematical Topics in Nonlinear Kinetic Theory II: The Enskog Equation by N. Bellomo et al.
Discrete Models of Fluid Dynamics ed. A. S. Alves
Fluid Dynamic Applications of the Discrete Boltzmann Equation by R. Monaco and L. Preziosi
Waves and Stability in Continuous Media ed. S. Rionero
A Theory of Latticed Plates and Shells by G. I. Pshenichnov
Recent Advances in Combustion Modelling ed. B. Larrouturou
Linear Kinetic Theory and Particle Transport in Stochastic Mixtures by G. C. Pomraning
Local Stabilizability of Nonlinear Control Systems by A. Bacciotti
Nonlinear Kinetic Theory and Mathematical Aspects of Hyperbolic Systems ed. V. C. Boffi
Vol. 10
Nonlinear Evolution Equations. Kinetic Approach by N. B. Maslova
Vol. 11
Mathematical Problems Relating to the Navier-Stokes Equation ed. G. P. Galdi
Vol. 12
Thermodynamics and Kinetic Theory eds. W. Kosinski et al.
Vol. 13
Thermomechanics of Phase Transitions in Classical Field Theory by A. Romano
Vol. 14
Applications of Pade Approximation Theory in Fluid Dynamics by A. Pozzi
Vol. 15
Advances in Mathematical Modelling of Composite Materials ed. K. Z. Markov
Vol. 16
Advanced Mathematical Tools in Metrology eds. P. Ciarlini et al.
ISSN: 1793-0901
ISBN 981-256-674-0
www.worldscientific.com