DATA HANDLING IN SCIENCE AND TECHNOLOGY - VOLUME 19
Robustness of analytical chemical methods and pharmaceutical technological products
DATA HANDLING IN SCIENCE AND TECHNOLOGY
Advisory Editors: B.G.M. Vandeginste and S.C. Rutan Other volumes in this series:
Volume 1   Microprocessor Programming and Applications for Scientists and Engineers by R.R. Smardzewski
Volume 2   Chemometrics: A Textbook by D.L. Massart, B.G.M. Vandeginste, S.N. Deming, Y. Michotte and L. Kaufman
Volume 3   Experimental Design: A Chemometric Approach by S.N. Deming and S.L. Morgan
Volume 4   Advanced Scientific Computing in BASIC with Applications in Chemistry, Biology and Pharmacology by P. Valko and S. Vajda
Volume 5   PCs for Chemists, edited by J. Zupan
Volume 6   Scientific Computing and Automation (Europe) 1990, Proceedings of the Scientific Computing and Automation (Europe) Conference, 12-15 June, 1990, Maastricht, The Netherlands, edited by E.J. Karjalainen
Volume 7   Receptor Modeling for Air Quality Management, edited by P.K. Hopke
Volume 8   Design and Optimization in Organic Synthesis by R. Carlson
Volume 9   Multivariate Pattern Recognition in Chemometrics, illustrated by case studies, edited by R.G. Brereton
Volume 10  Sampling of Heterogeneous and Dynamic Material Systems: theories of heterogeneity, sampling and homogenizing by P.M. Gy
Volume 11  Experimental Design: A Chemometric Approach (Second, Revised and Expanded Edition) by S.N. Deming and S.L. Morgan
Volume 12  Methods for Experimental Design: principles and applications for physicists and chemists by J.L. Goupy
Volume 13  Intelligent Software for Chemical Analysis, edited by L.M.C. Buydens and P.J. Schoenmakers
Volume 14  The Data Analysis Handbook, by I.E. Frank and R. Todeschini
Volume 15  Adaption of Simulated Annealing to Chemical Optimization Problems, edited by J.H. Kalivas
Volume 16  Multivariate Analysis of Data in Sensory Science, edited by T. Næs and E. Risvik
Volume 17  Data Analysis for Hyphenated Techniques, by E.J. Karjalainen and U.P. Karjalainen
Volume 18  Signal Treatment and Signal Analysis in NMR, edited by D.N. Rutledge
Volume 19  Robustness of Analytical Chemical Methods and Pharmaceutical Technological Products, edited by M.M.W.B. Hendriks, J.H. de Boer and A.K. Smilde
DATA HANDLING IN SCIENCE AND TECHNOLOGY - VOLUME 19 Advisory Editors: B.G.M. Vandeginste and S.C. Rutan
Robustness of analytical chemical methods and pharmaceutical technological products
edited by Margriet M.W.B. Hendriks
Agricultural Mathematics Group, P.O. Box 100, 6700 AC Wageningen, The Netherlands
Jan H. de Boer
Gasunie Research, P.O. Box 19, 9700 MA Groningen, The Netherlands
Age K. Smilde
Laboratory for Analytical Chemistry, University of Amsterdam, Nieuwe Achtergracht 166, 1018 WV Amsterdam, The Netherlands
1996
ELSEVIER
Amsterdam - Lausanne - New York - Oxford - Shannon - Tokyo
ELSEVIER SCIENCE B.V., Sara Burgerhartstraat 25, P.O. Box 211, 1000 AE Amsterdam, The Netherlands
ISBN 0-444-89709-7
© 1996 Elsevier Science B.V. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior written permission of the publisher, Elsevier Science B.V., Copyright & Permissions Department, P.O. Box 521, 1000 AM Amsterdam, The Netherlands. Special regulations for readers in the USA - This publication has been registered with the Copyright Clearance Center Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923. Information can be obtained from the CCC about conditions under which photocopies of parts of this publication may be made in the USA. All other copyright questions, including photocopying outside of the USA, should be referred to the copyright owner, Elsevier Science B.V., unless otherwise specified. No responsibility is assumed by the publisher for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions or ideas contained in the material herein. This book is printed on acid-free paper. Printed in The Netherlands.
PREFACE
The aim of this book is to help those who are working in the area of analytical chemistry or pharmaceutical technology to develop robust analysis methods and pharmaceutical technological products. Robustness is a part of quality assurance. Awareness of the importance of quality is growing. Proof of this growth is the abundance of norms and protocols that are created to assure quality, e.g. Good Laboratory Practice (GLP) and Good Manufacturing Practice (GMP). It is therefore worthwhile to think about the formal methodology which can be used to assess and assure robustness. One of the quality aspects of analytical chemical methods is whether they are rugged or robust against extraneous influences or changing conditions. The latter can be a change in the analyst performing the analysis or a change in instrument, laboratory, supplier of chemicals, etc. In pharmaceutical technology, it is important to develop formulations which are robust to environmental conditions, like temperature changes and relative humidity variation, and that have a long lifetime. These aspects belong to the quality assurance of pharmaceutical formulations. In this book we try to give a framework for assessing and assuring the robustness of analytical chemical methods and pharmaceutical technological products. The methodology to serve these purposes is essentially the same in both areas. The book contains review chapters, explaining the methodology, and chapters with applications from both analytical chemistry and pharmaceutical technology. We hope that this mixture gives a good flavor of the methodology and how it can be used. At a certain point in time all three of us were working in the Research Group Chemometrics at the University Centre for Pharmacy of the University of Groningen in the Netherlands. In this research group, one of the topics was developing methodology for assessing and assuring robustness. The methodology was applied in analytical chemistry and pharmaceutical technology. Some of the authors were or still are part of this research group: J. Wieling, C.A.A. Duineveld, P.M.J. Coenegracht and P. Koopmans. There was a close working relationship with the Department of Pharmaceutical Technology and Biopharmacy, of which C.E. Bos was and G.K. Bolhuis still is a member.
We are glad that all other authors, Y. Vander Heyden, D.L. Massart, S.P. Jones and M. Mulholland, accepted the invitation to participate in the project of writing this book. We thank Gjalt Feenstra for his help in editing some of the figures and, finally, we hope that you will enjoy reading this book!
September 1996
Margriet M.W.B. Hendriks, Wageningen
Jan H. de Boer, Groningen
Age K. Smilde, Amsterdam
TABLE OF CONTENTS

Chapter 1
GENERAL INTRODUCTION TO ROBUSTNESS
1.1 INTRODUCTION
1.2 APPLICATION AREAS AND RELATED ROBUSTNESS QUESTIONS
1.2.1 Pharmaceutical formulations
1.2.2 Analytical chemical methods
1.3 STATISTICAL METHODOLOGY
1.3.1 Taguchi method
1.3.2 RSM and Experimental design
1.3.3 Sequential or simultaneous optimization
1.3.4 Multicriteria optimization
1.3.5 Robustness criteria
1.4 RECOMMENDED READING PATHS
REFERENCES
Chapter 2
STABILITY AND RESPONSE SURFACE METHODOLOGY
2.1 INTRODUCTION
2.1.1 Example
2.2 AN OVERVIEW OF RESPONSE SURFACE METHODOLOGY
2.2.1 First-order designs
2.2.2 Adding center points
2.2.3 Second-order designs
2.2.4 Optimal designs
2.2.5 Other second-order designs
2.2.6 Interim Summary
2.3 ROBUST DESIGN AND RESPONSE SURFACE METHODOLOGY
2.3.1 Response surface modeling of the mean and standard deviation
2.3.2 Analyzing the mean and standard deviation response surfaces
2.3.3 Experimental design with environmental variables
2.3.4 Analysis of experimental designs with environmental variables
2.3.5 Example
2.4 SPLIT-PLOT DESIGNS FOR ROBUST DESIGN
2.4.1 Overview of split-plot designs
2.4.2 Precision of split-plot designs
2.4.3 Variants of split-plot designs
2.4.4 Analysis of split-plot designs for robust experimentation
2.5 CONCLUSIONS
ACKNOWLEDGEMENTS
REFERENCES
Chapter 3
REVIEW OF THE USE OF ROBUSTNESS AND RUGGEDNESS IN ANALYTICAL CHEMISTRY
3.1 INTRODUCTION
3.2 PLACE OF RUGGEDNESS TESTING IN METHOD VALIDATION
3.3 DEFINITIONS OF RUGGEDNESS
3.4 RUGGEDNESS TESTING OF PROCEDURE RELATED FACTORS
3.4.1 The steps of a ruggedness test
3.4.2 Selection of the factors
3.4.3 Selection of the levels of the factors
3.4.4 Selection of the experimental design
3.4.5 Experimental part of the ruggedness test
3.4.6 Analysis of the results
3.4.7 Statistical analysis of the results
3.4.8 Using predefined values to identify chemically relevant factors
3.4.9 Case studies
3.4.10 Expert systems and software packages for ruggedness testing
3.5 RUGGEDNESS TESTING OF NON-PROCEDURE-RELATED FACTORS: THE USE OF NESTED DESIGNS
3.6 CONCLUSIONS
ACKNOWLEDGEMENTS
REFERENCES
Chapter 4
ROBUSTNESS CRITERIA; INCORPORATING ROBUSTNESS EXPLICITLY IN OPTIMIZATION PROCEDURES UTILIZING MULTICRITERIA METHODS
4.1 INTRODUCTION
4.2 A BRIEF INTRODUCTION TO THE TAGUCHI METHODS
4.2.1 Introduction
4.2.2 The loss function
4.2.3 Off-line quality control
4.2.4 Orthogonal arrays
4.3 THE ROBUSTNESS CRITERIA
4.3.1 Introduction
4.3.2 The variance/covariance structure of a mixture
4.3.3 General aspects of the robustness criteria
4.3.4 The Jones method
4.3.5 The Weighted Jones method
4.3.6 The Projected Variance method
4.3.7 The Robustness Coefficient
4.4 MULTICRITERIA DECISION MAKING
4.4.1 Introduction
4.4.2 Theory of MCDM
4.5 THE ROBUSTNESS COEFFICIENT APPLIED IN A MCDM STRATEGY
4.5.1 Introduction
4.5.2 Theory
4.5.3 Experimental
4.5.4 Results and discussion
REFERENCES
Chapter 5
RUGGEDNESS TESTS FOR ANALYTICAL CHEMISTRY
5.1 INTRODUCTION
5.1.1 Designing a protocol for method validation
5.1.2 Summary of the role of a ruggedness test in a method validation program
5.2 SELECTION OF FACTORS TO TEST
5.2.1 Selection of the number of levels at which to test a factor
5.2.2 Selection of factors for HPLC methods
5.2.3 Selection of factors for other analytical methods
5.3 SELECTION OF EXPERIMENTAL DESIGNS
5.3.1 Factorial designs
5.3.2 Star designs
5.3.3 Central composite designs
5.3.4 Box-Behnken designs
5.3.5 Matching the ruggedness test to an efficient design
5.4 TREATMENT OF RESULTS
5.4.1 Measurements for a HPLC Study
5.4.2 Treatment of the results from the ruggedness study
5.4.3 Confounding effects in fractional factorial designs
5.5 EXAMPLE CASE STUDIES
5.5.1 The application of a ruggedness test to the assay of Aspirin and its major degradation product, salicylic acid
5.5.2 The application of a ruggedness test to the assay of Salbutamol and its major degradation product, AH4045
5.6 CONCLUSIONS
REFERENCES
Chapter 6
STABILIZING A TLC SEPARATION ILLUSTRATED BY A MIXTURE OF SEVERAL STREET DRUGS
6.1 INTRODUCTION
6.2 THEORY
6.2.1 Thin Layer Chromatography
6.2.2 Separation problem
6.2.3 Selection of mobile phases
6.2.4 Influence of temperature and relative humidity
6.2.5 Optimization
6.2.6 The Taguchi approach to robustness
6.2.7 Application of a parameter design in optimization
6.2.8 Generalization of parameter design towards Response Surface Methodology (RSM)
6.2.9 Construction of experimental designs
6.2.10 Selection of the dependent variable
6.2.11 Construction of models for the dependent variables
6.2.12 Selection criteria for models
6.2.13 Selection of optimization criteria
6.3 EXPERIMENTAL
6.3.1 Materials and methods
6.3.2 Software
6.4 RESULTS
6.4.1 Introduction
6.4.2 Box-Cox transformation
6.4.3 Selection of models
6.4.4 Chromatographic and empirical models
6.4.5 Determination of a solvent with a high minimum resolution
6.4.6 Determination of a solvent composition with a robust minimum resolution
6.5 CONCLUSIONS
REFERENCES
Chapter 7
ROBUSTNESS OF LIQUID-LIQUID EXTRACTION OF DRUGS FROM BIOLOGICAL SAMPLES
7.1 INTRODUCTION
7.2 THEORY
7.2.1 Liquid-liquid extraction optimisation theory
7.2.2 Optimisation criteria
7.3 EXPERIMENTAL
7.3.1 Validation of robustness criteria by means of a comparison with a simulation experiment
7.3.2 Selection of solvents
7.3.3 The extraction of a group of sulphonamides from plasma
7.4 RESULTS AND DISCUSSION
7.4.1 Validation of robustness criteria by means of a comparison with a simulation experiment
7.4.2 The extraction of a group of sulphonamides
7.5 CONCLUSIONS
ACKNOWLEDGEMENTS
REFERENCES
Chapter 8
THE USE OF A FACTORIAL DESIGN TO EVALUATE THE PHYSICAL STABILITY OF TABLETS AFTER STORAGE UNDER TROPICAL CONDITIONS
8.1 INTRODUCTION
8.1.1 The use of experimental designs in tablet formulation
8.1.2 The use of factorial designs in physical tablet stability studies
8.2 THE USE OF THE RELATIVE CHANGE IN TABLET PARAMETERS IN A FACTORIAL DESIGN
8.2.1 Planning of the design
8.2.2 Tabletting, storage and measurements
8.2.3 Results
8.2.4 Conclusions
8.3 SELECTION OF EXCIPIENTS SUITABLE FOR USE IN TROPICAL COUNTRIES
8.3.1 Planning of the design
8.3.2 Tabletting, storage and measurements
8.3.3 Results
8.3.4 Conclusions
REFERENCES
LIST OF CONTRIBUTORS

J.H. DE BOER
Gasunie Research, P.O. Box 19, 9700 MA Groningen, The Netherlands
G.K. BOLHUIS
Department of Pharmaceutical Technology and Biopharmacy, University Centre for Pharmacy, University of Groningen, A. Deusinglaan 1, 9713 AV Groningen, The Netherlands

C.E. BOS
AUV Veterinary Cooperation, P.O. Box 94, 5430 AB Cuijk, The Netherlands

P.M.J. COENEGRACHT
Research Group Chemometrics, University Centre for Pharmacy, University of Groningen, A. Deusinglaan 1, 9713 AV Groningen, The Netherlands

C.A.A. DUINEVELD
Quest International, P.O. Box 2, 1400 CA Bussum, The Netherlands
Y. VANDER HEYDEN
ChemoAC, Pharmaceutical Institute, Vrije Universiteit Brussel, Laarbeeklaan 103, B-1090 Brussels, Belgium
S.P. JONES
Boeing Computer Services, The Boeing Company, P.O. Box 24346, MS 7L-22, Seattle, WA 98124-0346, USA

P. KOOPMANS
Academic Hospital Groningen, P.O. Box 30001, 9700 RB Groningen, The Netherlands

D.L. MASSART
ChemoAC, Pharmaceutical Institute, Vrije Universiteit Brussel, Laarbeeklaan 103, B-1090 Brussels, Belgium
M. MULHOLLAND
Department of Analytical Chemistry, University of New South Wales, P.O. Box 1, Kensington, New South Wales 2033, Australia

A.K. SMILDE
Laboratory for Analytical Chemistry, Nieuwe Achtergracht 166, 1018 WV Amsterdam, The Netherlands

J. WIELING
BioIntermediair Europe BV, P.O. Box 454, 9700 AL Groningen, The Netherlands
Chapter 1
GENERAL INTRODUCTION TO ROBUSTNESS

AGE K. SMILDE
Laboratory for Analytical Chemistry, University of Amsterdam, Nieuwe Achtergracht 166, 1018 WV Amsterdam, The Netherlands
1.1 INTRODUCTION
In the field of analytical chemistry and pharmaceutical formulations there is a growing awareness of quality issues. To mention a few, Good Laboratory Practice (GLP), Good Manufacturing Practice (GMP) and ISO norms are important topics in both laboratory and industrial environments. Moreover, there is a growing number of regulatory committees concerned with all aspects of quality, both on a national and international level [1-3]. In analytical chemistry, validation of the analytical methods is of utmost importance [4,5]. One of the aspects of this validation is the robustness of analytical methods against variations in experimental circumstances. The term "experimental circumstances" is very broad; it might even include inter-laboratory variation. In this book, only intra-laboratory experimental conditions are considered. No explicit attention is given to inter-laboratory variations, although some of the presented methodology might be useful in that area. In pharmaceutical technology, quality assurance of the pharmaceutical formulation is important. When a pharmaceutical formulation is produced, on-line quality monitoring and control has to be performed in order to check the quality of the outgoing products. The methodology to perform this task is Statistical Process Control (SPC), which is not included in this book. Good textbooks in the area of SPC exist [6-9]. In this book the focus is on off-line quality control, e.g. how to make products that are intrinsically robust against process variations.
This book consists of eight chapters. Chapters 2, 3 and 4 give methodological background and reviews. Chapters 5 to 8 contain the applications, both in the field of analytical chemistry and pharmaceutical formulations. Since the field of which this book tries to give an overview is still under active research, this book is by no means a monograph with well-established and tested methods. There are still a lot of questions and open ends. This book, however, does give some ideas on how to tackle the problems of robustness. An important distinction has to be made with respect to the presented methods. There is a class of methods dealing with testing robustness of a given analytical method or a given pharmaceutical formulation. When related to an analytical method, such strategies are often referred to as "ruggedness testing". The other class of methods aims at building in robustness in the design phase for pharmaceutical formulations or in the method development phase of analytical methods. This second class of methods is sometimes called "quality by design" for obvious reasons. This general introduction will continue with a summary of application areas covered in the following chapters and the related robustness questions which have to be solved. Then the different statistical methods that play a role in solving the questions and which are discussed in the following chapters will be put in a general framework. Recommendations will be given for readers of different levels on how to approach this book in such a way that reading it is fun!
1.2 APPLICATION AREAS AND RELATED ROBUSTNESS QUESTIONS 1.2.1 Pharmaceutical formulations Robustness is a relatively new area of (optimization of) pharmaceutical formulations. Therefore, no review is given of existing applications. Some applications are given in the Chapters 2, 4 and 8; these are explained below. When tablet formulations are made, usually different quality criteria have to be met, e.g., a high crushing strength, a low disintegration time, a pre-set dissolution profile. A tablet consists normally of the pharmacon (the pharmacologically active compound; the drug) and excipients. Hence, a tablet can be made with different relative amounts of excipients (a mixture composition) and this creates room for optimizing a tablet
formulation. Not only do the relative amounts of excipients influence tablet properties, but so do process variables like compression force. Moreover, environmental variables like temperature and humidity can also influence the (long-term) stability of tablets. Different robustness questions can be raised in this area. Suppose that a tablet has to be made with certain relative amounts of excipients. In the weighing of the excipients small errors are made which result in a variation of the mixture composition of excipients. How much are the quality criteria of tablets affected by this variation? Is it possible to select excipient compositions that minimize the quality reducing effects? These questions are posed and answered in Chapter 4. The properties of tablets are also influenced by temperature and humidity fluctuations, e.g., during storage. Is it possible to select excipient compositions that minimize the effect of temperature and humidity fluctuations on the quality of tablets? These questions are raised and treated in a small example in Chapter 2 and extensively in Chapter 8.
1.2.2 Analytical chemical methods
A review of ruggedness testing methods is presented in Chapter 3 and in Chapter 5 examples are given. In these chapters procedures are described that test the robustness or ruggedness of existing methods. Hence, incorporating robustness explicitly in analytical techniques (see Section 1.1) is not discussed.
1.2.2.1 High Performance Liquid Chromatography
High Performance Liquid Chromatography (HPLC) is an analytical chemical method which is used routinely on a large scale. If an HPLC method is developed, the question arises whether the analytical results from this method depend critically on small deviations in the mobile phase composition, the selected UV wavelength for detection, etc. Methods to deal with these problems are outlined in Chapter 5 and examples are given of how to tackle them. Examples of ruggedness testing in HPLC are also given in Chapter 3.
1.2.2.2 Thin Layer Chromatography (TLC)
TLC is a simple, cheap and fast analytical chemical technique which is often used for screening purposes. Due to the set-up of a TLC experiment, not only the mobile phase composition influences retention (and resolution), but also temperature and humidity. Of course, a
TLC experiment can be carried out in controlled conditions, but then the appealing characteristics of simplicity, low cost and speed disappear. In order to make the TLC method simple and robust against temperature and humidity changes, it is possible to select a mobile phase that minimizes the harmful effects of these changes. This is described in Chapter 6.
1.2.2.3 Liquid-liquid extraction
One of the most used sample pre-treatment methods, especially in bioanalysis, is liquid-liquid extraction. Analytes present in an aqueous sample can be extracted from that sample with the use of organic solvents. A pure organic solvent can be used, but also mixtures of organic solvents. It is important in liquid-liquid extraction that as much as possible of the analyte(s) and internal standard are extracted in equal amounts. This defines an optimization problem where a mixture of organic solvents has to be chosen to reach the goals stated above. In mixing the organic solvents, errors are made. How much do these errors affect the quality criteria (maximizing amounts of analytes and internal standards extracted in an equal way)? Is it possible to select an organic solvent mixture that minimizes the effects of these errors? This problem is treated in Chapter 7 and is closely related to the problem discussed in Chapter 4.
1.3 STATISTICAL METHODOLOGY
In this section a brief overview is given of the statistical methods that are discussed in the separate chapters of this book. For the established methodology, references will be given to existing textbooks.
1.3.1 Taguchi method
The Taguchi method consists of a philosophy of quality, experimental design methods to build in robustness and methods to analyze data obtained from the experiments. All these topics are treated in textbooks [10-13]. The philosophy is simple: make products with built-in robustness against all kinds of environmental disturbances and fluctuations. The experimental design methods are presented as linear graphs and orthogonal arrays, but are not essentially different from established designs like Factorial and Fractional Factorial designs [12]. The methods of analyzing
the data consist of criteria formulated in terms of signal-to-noise ratios and (simple) statistical tools to establish the relationship between the design variables, the environmental variables and the signal-to-noise ratios. There is also a lot of criticism of Taguchi's method, especially with respect to the data analysis part. Some of this criticism focuses on the use of Taguchi's quality criteria: the signal-to-noise ratios [15]. A good example of the drawback of these signal-to-noise ratios is given in Chapter 6. A brief introduction to the Taguchi method is given in Chapter 4, whereas in Chapters 2 and 6 some of Taguchi's ideas are touched upon and discussed. The philosophy of Taguchi (building in robustness) is present in Chapters 2, 4, 6, 7 and 8.
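As a concrete illustration of the signal-to-noise criteria mentioned above, the following sketch (not part of the original text) computes two of the commonly quoted Taguchi ratios in Python/NumPy. The response values reuse two formulations from the hypothetical tablet data of Table 2.1 in Chapter 2 and are for illustration only.

```python
import numpy as np

def sn_larger_the_better(y):
    """Taguchi 'larger-the-better' ratio: SN = -10*log10(mean(1/y^2)).
    A larger SN indicates a higher and more robust response."""
    y = np.asarray(y, dtype=float)
    return -10.0 * np.log10(np.mean(1.0 / y**2))

def sn_nominal_the_best(y):
    """Taguchi 'nominal-the-best' ratio based on mean and variance:
    SN = 10*log10(ybar^2 / s^2)."""
    y = np.asarray(y, dtype=float)
    return 10.0 * np.log10(y.mean()**2 / y.var(ddof=1))

# Crushing-strength values of two formulations measured under varying (noise) conditions.
formulation_A = [115, 108, 104, 128, 125, 97, 99, 107, 76]
formulation_B = [121, 95, 101, 103, 111, 107, 106, 107, 108]

for name, y in [("A", formulation_A), ("B", formulation_B)]:
    print(name, round(sn_larger_the_better(y), 2), round(sn_nominal_the_best(y), 2))
```

Formulation B, which has the smaller spread over the noise conditions, obtains the larger nominal-the-best ratio, which is exactly the kind of comparison the criticism cited above [15] examines.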
1.3.2 RSM and Experimental design
The idea of Response Surface Methodology (RSM) is straightforward. Suppose that a tablet has to be made with a high crushing strength. The crushing strength depends on the relative amounts of excipients. The functional relationship between the quality criterion (crushing strength) and the design variables (relative amounts of excipients) may be very complicated, but can be approximated by a Taylor expansion. This results in a linear approximation (first-order approximation) if only one term in the Taylor expansion is considered. If this is not sufficient, a second term can be added and a second-order model is obtained. This results in empirical first-order models like

CR = a0 + a1x1 + a2x2 + e
where CR is crushing strength; x1, x2 are the relative amounts of excipients 1 and 2, respectively, and e is an error term. The parameters a0, a1 and a2 have to be estimated, preferably using data obtained from an experimental design (see later). With the use of such an empirical model, the crushing strength can be optimized with respect to the relative amounts of excipients. The empirical model above can be represented as a surface in 3-D space, where the axes are x1, x2 and CR. Hence, this methodology is called response surface methodology. Good textbooks have appeared in the area of RSM [14,16]. In order to optimize the information content of experiments, it is wise to plan the experiments ahead in a well-defined manner. Formal ways to plan
experiments are the subject of experimental design. The basic idea is to vary systematically all variables that might influence the responses (quality criteria). These ideas are explained in textbooks [14,16]. RSM is discussed in Chapter 2 and experimental design is discussed in Chapters 2, 3 and 5. Both RSM and experimental design techniques are present in all the other chapters in this book. Two special topics of experimental design are used in this book. The first topic is that of mixture designs. As an example, suppose that only two excipients are used in a tablet. Then the above used model for crushing strength has the restriction that x1 + x2 = 1. Designs especially suited for this situation are called mixture designs and are treated in Cornell [17]. In Chapters 4, 6 and 7 these designs are used. The second special topic is the combination of factorial designs and mixture designs. Suppose that, in the example above, not only the two excipients influence the crushing strength but also the compression force and the mixing time of the excipients. The relative amounts of excipients have to be varied in a mixture design, but the compression force and mixing time can be varied according to a factorial design. Hence, a combination between a mixture design and a factorial design is needed. This topic is treated partly in Cornell [17] and extensively in some papers [18,19]. Chapter 6 gives an example of such combined designs.
1.3.3 Sequential or simultaneous optimization
For the optimization of, for instance, a tablet formulation, two strategies are available: a sequential or a simultaneous approach. The sequential approach consists of a series of measurements where each new measurement is performed after the response of the previous one is known. The new experiment is planned according to a direction in the search space that looks promising with respect to the quality criterion which has to be optimized. Such a strategy is also called a hill-climbing method. The Simplex method is a well-known example of such a strategy. Textbooks are available that describe the Simplex methods [20]. In the simultaneous approach the experiments are planned beforehand (preferably using experimental design techniques) and performed randomly. With RSM techniques the obtained experimental data can be used to model the quality criterion as a function of the design variables. Then an optimal setting of the design variables can be calculated. All the optimization experiments described in this book use the simultaneous approach. The simultaneous approach uses in almost all
cases the RSM method, and this approach is also described in the textbooks [14,16]. There are also hybrid forms of the sequential and simultaneous approach to optimization. In a part of the search space a small design is made and the initial experiments are carried out according to this design (simultaneous). Then the direction of steepest ascent is calculated and some experiments are made in this direction (sequential). In a promising new area a new design is made and experiments are performed according to this new design. Optimization is performed by repeating these steps a few times. This hybrid procedure is also described in standard textbooks [14,16]. One of the drawbacks of sequential optimization methods is that optimizing two or more criteria at the same time is hard, if not impossible. If the two or more criteria are combined in one overall criterion, which is sometimes advocated, then ambiguous results are obtained. This is shown in Chapter 4. There are ways to overcome this ambiguity to some extent [21]. Another drawback of a sequential procedure is that it does not give much information on the dependence of the criterion on the design variables. In the context of robustness this is a very serious drawback. This is one of the reasons why sequential optimization methods are not used in this book.
1.3.4 Multi-criteria optimization
In practice often more than one quality criterion is relevant. In the case of the need to build in robustness, at least two criteria are already needed: the quality criterion itself and its associated robustness criterion. Hence, optimization has to be done on more than one criterion simultaneously. If a simultaneous optimization technique is used then there are procedures to deal with multiple optimization criteria. Several methods for multi-criteria optimization have been proposed and recently a tutorial/review has appeared [22]. An introduction to one particular multi-criteria optimization method - the so-called Pareto-Optimality method - is discussed in Chapter 4, where an application of this method is also given.
1.3.5 Robustness criteria
If robustness has to be built in, then the concept of robustness has to be formalized and optimized. This is contrary to the class of methods that check the robustness or ruggedness of existing methods; then the influence
of the variables on the response can be expressed, for instance, as a percentage change of the response. Several ways to formalize the concept of robustness are presented in this book. Robustness can be formalized and expressed as a variance of the quality criterion, which is done in Chapter 7. Another way to formalize robustness is the percentage change of the response, which is done in Chapter 8. It is also possible to express robustness in more complicated ways; examples of those are given in Chapters 2 and 4. In Chapter 6 a maxi-min formalization is chosen: select the TLC-solvent composition in such a way that the minimum resolution between any pair of solutes is maximized.
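The combination of a quality criterion with a robustness criterion, as described in the two subsections above, can be screened with a Pareto-Optimality step. The sketch below is an illustration only (it is not taken from Chapter 4); the candidate settings and their (mean quality, variance) values are hypothetical, with the quality criterion to be maximized and the variance-based robustness criterion to be minimized.

```python
def pareto_optimal(candidates):
    """Return the candidates that are Pareto-optimal when the first criterion
    (quality) is maximized and the second (variance) is minimized. A candidate
    is dominated if another candidate is at least as good on both criteria and
    strictly better on at least one of them."""
    optimal = []
    for i, (q_i, v_i) in enumerate(candidates):
        dominated = any(
            (q_j >= q_i and v_j <= v_i) and (q_j > q_i or v_j < v_i)
            for j, (q_j, v_j) in enumerate(candidates) if j != i
        )
        if not dominated:
            optimal.append((q_i, v_i))
    return optimal

# Hypothetical (mean quality, variance) pairs for candidate settings.
settings = [(106.6, 245.3), (106.6, 51.0), (98.7, 36.0), (105.2, 92.4), (96.7, 130.5)]
print(pareto_optimal(settings))   # -> [(106.6, 51.0), (98.7, 36.0)]
```

The retained points form the Pareto front; a final choice between them still has to trade quality against robustness, which is the subject of the MCDM strategy of Chapter 4.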
1.4 RECOMMENDED READING PATHS
For readers with no prior knowledge of experimental design and RSM
Reading several chapters in the textbook of Box et al. [14] is a good introduction. After that, the introduction to experimental design and RSM methodology in Chapter 2 can be read; an overview is also given in Chapter 3.
For readers with no prior knowledge of optimization methods
In the textbook of Box et al. [14] the basic principles of optimization are also explained. The sequential simplex method is presented in Walters et al. [20]. Multi-criteria optimization is presented in Chapter 4 on an introductory level. For those readers who want to know more about multi-criteria optimization, see the references given in Section 1.3.4 and Chapter 4.
For readers with no prior knowledge of the Taguchi method
The Taguchi method is explained to some extent in Chapter 4. A general introduction is given in [11-13]. For detailed discussions, see the references given in Chapter 4 and Section 1.3.1.
For readers with some knowledge of experimental design and RSM
Start with reading Chapters 2 and 3; this will refresh your memory.
For readers with some knowledge of optimization methods
Start by reading Chapters 2 and 4, which give the background material and should be understandable. If a more detailed understanding of, e.g., multi-criteria optimization is wanted, then the references in Chapter 4 will suffice.
For readers with some knowledge of the Taguchi method
Start by reading Chapters 2 and 4, which give the background material and should be understandable.
REFERENCES
[1] M. Parkany (editor), Quality Assurance for Analytical Laboratories, Royal Society of Chemistry, Cambridge, United Kingdom, 1993 (Proceedings of the Fifth International Symposium on the Harmonization of Internal Quality Assurance Schemes for Analytical Laboratories held in Washington DC, USA, 22-23 July 1993).
[2] Good Laboratory Practice in the Testing of Chemicals, Organization of Economic Co-operation and Development (OECD), Paris, 1982.
[3] Quality Management and Quality Assurance Standards: Guidelines for Selection and Use, 1987-03-15, International Organization for Standardization, 1987.
[4] G. Kateman and L. Buydens, Quality Control in Analytical Chemistry, John Wiley, New York, 1993.
[5] L. Buydens and P. Schoenmakers (editors), Intelligent Software for Chemical Analysis, Elsevier, Amsterdam, 1993.
[6] G.B. Wetherill and D.W. Brown, Statistical Process Control: Theory and Practice, Chapman and Hall, London, 1991.
[7] R.W. Berger and T.H. Hart, Statistical Process Control: A Guide for Implementation, ASQC Quality Press, Milwaukee, 1986.
[8] A. Mitra, Fundamentals of Quality Control and Improvement, MacMillan, New York, 1993.
[9] Th.P. Ryan, Statistical Methods for Quality Improvement, John Wiley, New York, 1989.
[10] T. Bendell (editor), Taguchi Methods, Elsevier Applied Science, Amsterdam, 1989.
[11] G. Taguchi, Introduction to Quality Engineering: Designing Quality into Products and Processes, Kraus International Publications, White Plains, NY, USA, 1986.
[12] G.S. Peace, Taguchi Methods: A Hands-on Approach, Addison Wesley, 1992.
[13] P.J. Ross, Taguchi Techniques for Quality Engineering: Loss Function, Orthogonal Experiments, Parameter and Tolerance Design, McGraw-Hill, New York, 1988.
[14] G.E.P. Box, J.S. Hunter and W.G. Hunter, Statistics for Experimenters, John Wiley, New York, 1978.
[15] G.E.P. Box, Signal-to-noise ratios, performance criteria, and transformations, Technometrics, 30 (1988) 1-40.
[16] G.E.P. Box and N.R. Draper, Empirical Model Building and Response Surfaces, John Wiley, New York, 1986.
[17] J.A. Cornell, Experiments with Mixtures, John Wiley, New York, 1990.
[18] C.A.A. Duineveld, A.K. Smilde and D.A. Doornbos, Comparison of experimental designs combining process and mixture variables. Part 1: Design construction and theoretical evaluation, Chemometrics and Intelligent Laboratory Systems, 19 (1993) 295-308.
[19] C.A.A. Duineveld, A.K. Smilde and D.A. Doornbos, Comparison of experimental designs combining process and mixture variables. Part 2: Design evaluation on measured data, Chemometrics and Intelligent Laboratory Systems, 19 (1993) 309-318.
[20] F.H. Walters, L.R. Parker, S.L. Morgan and S.N. Deming, Sequential Simplex Optimization, CRC Press, Florida, 1991.
[21] C.A.A. Duineveld, C.H.P. Bruins, A.K. Smilde, G.K. Bolhuis, K. Zuurman and D.A. Doornbos, Multicriteria Steepest Ascent, Chemometrics and Intelligent Laboratory Systems, 25 (1994) 183-202.
Chapter 2
STABILITY AND RESPONSE SURFACE METHODOLOGY

STEPHEN P. JONES
Boeing Computer Services, The Boeing Company, P.O. Box 24346, MS 7L-22, Seattle, WA 98124-0346, United States
2.1 INTRODUCTION
In recent years much attention has been focused on the impact of the use of statistics, and in particular experimental design, to improve the quality of products and processes. An important component of the quality of a product is its robustness or stability in the presence of what Taguchi has called noise variables. These noise variables can be from a variety of sources, such as environmental conditions, deterioration of components, or variation in product components and manufacturing processes. It is possible that variation due to these sources will cause variation in the key characteristics of a product or process, resulting in a product of inferior quality. This chapter will examine the application of statistical experimental design to designing a product or process that is robust to variation from environmental variables. It should be understood that the phrase "environmental variables" is to be viewed broadly and is not just limited to variables such as temperature and humidity. In this context, variation from environmental variables is variation that is external to the product and that is outside of the control of the manufacturer during production. Thus, it might also include variation in the conditions in which the customer uses the product, or in the conditions in which the product is stored, or in how the product is maintained and serviced. It should be noted that experiments with this objective of robust design have been run for many years in agricultural research. For example, a paper by Yates and Cochran [1] describes experiments on crop varieties in different regions over several years; the objective being to determine a variety that consistently will produce a good yield over a range of climate
and soil conditions represented by the different regions. They used a graphical analysis of the interaction between the varieties and the regions to investigate the robustness of the varieties to the different regions. It is clear from this description that investigating crop varieties that are robust to environmental variation, whether due to climate, soil, aspect, farming practice, etc., is an application of experimental design techniques to robust design. The experiments conducted to perform ruggedness tests of measurement procedures can also be viewed as experiments to investigate robust design; see, for example, Wernimont [2], and Youden [3,4]. The objective of ruggedness tests is to determine a robust measurement procedure; that is a procedure that will give a consistent (and correct) result under a range of measurement conditions. An industrial example of the use of experimental design for robust design, given in Box and Jones [5], is the case of a manufacturer of medical packaging material who sought a method of manufacture that would yield a robust packaging material. In this context, a robust packaging material is one that can be used to seal medical equipment under a range of sealing process conditions used by its customers, the medical equipment manufacturers. The environmental conditions were the sealing process factors. The objective of the experiment was directed towards achieving a suitable product design so that the variation in the environmental conditions did not result in variation in the product's performance, that is, how well the material seals. Packaging material that would yield a good seal over a range of sealing process conditions would have a competitive advantage since medical equipment manufacturers would not have to operate their sealing process within a narrow tolerance to produce a good seal. Therefore the equipment manufacturer can use less precise equipment or machines that are difficult to control consistently or a less qualified workforce. The motivation for interest in designing robust products and processes is that it is frequently more cost effective to reduce the effect of the environmental variation rather than to eliminate the source of the variation by controlling the environment. Furthermore, in some situations it might be impossible to eliminate or control the environmental variation. As an example, a manufacturer cannot control the variation in the use of their product and so would prefer to design the product to be robust to a wide range of customer usage conditions rather than to impose instructions that
need to be strictly adhered to by the customer. In this way the product design is forgiving of variation beyond the control of the manufacturer. It should be noted that although it has been stated that the environmental variables are beyond the control of the manufacturer in the normal production or usage conditions, it is necessary that they can be controlled for an experiment. The objective of the experiment is to learn how to minimize the influence of the environmental variables on the product or process performance. To accomplish this objective it will be necessary to understand how variation in environmental conditions affects the product or process performance. The methodology that will be described in this chapter requires that the environmental conditions be changed in a controlled, structured manner.
2.1.1 Example
Consider the set of data given in Table 2.1. In this example a tablet formulation is desired that will retain desired properties in both tropical and temperate climates. The actual climatic conditions that will be experienced in practice are beyond the control of the manufacturer but they can be simulated in a laboratory experiment. In this example, experiments are to be run with three constituents of the tablet formulation, say, glidant, lactose, and disintegrant, which will be denoted as A, B, and C, in a 2^3 factorial design. The two levels for each of the factors in the experiment are denoted by -1 and +1. The manufacturer wants a stable, or robust, tablet formulation so that it will retain its efficacy when stored in a range of temperatures and humidities. To yield data on this, for each of the eight tablet formulations, the storage temperature and humidity will be varied in a laboratory experiment following a 3^2 factorial design. In this design the environmental variables are varied in a climate-controlled chamber above and below their nominal settings (denoted by +1, -1, and 0, respectively). A set of hypothetical data for a response of interest, say crushing strength, is shown in Table 2.1. The objective is to determine a combination of the factors glidant (A), lactose (B), and disintegrant (C) that will yield high values for crushing strength across the ranges of temperature and humidity studied in the experiment. At first glance it might appear that the formulation with A=-, B=-, and C=+ gives good values for crushing strength. Indeed at the nominal settings of temperature and humidity (0, 0) the crushing strength is 125 for this design combination, close to the largest response in the data set.
However, calculations of means and standard deviations for the response over the environmental conditions, shown in Table 2.2, reveal that the formulation with A=-, B=+, and C=+ yields an average crushing strength that is identical in magnitude but with considerably less variation as the temperature and humidity variables are changed. This formulation is robust, or stable, to storage in the range of climates represented by the changes in temperature and humidity considered in the experiment.
TABLE 2.1
HYPOTHETICAL DATA SET FOR TABLET FORMULATION EXPERIMENT

                          Environmental Variables
Temperature:        -    -    -    0    0    0    +    +    +
Humidity:           -    0    +    -    0    +    -    0    +
Design Variables
A  B  C
-  -  -           119  106   97  107  107   95   87   88   87
+  -  -           100   95   87  101  119   91  107   87   83
-  +  -           116  112  119  102  101   87  105  105  100
+  +  -           109   93  100   91  102  103   85   88   96
-  -  +           115  108  104  128  125   97   99  107   76
+  -  +           112  113   90  103   94   88   96   97   80
-  +  +           121   95  101  103  111  107  106  107  108
+  +  +           104  103   89  104  102   98   97   89  102
The arrangement containing the tablet design formulations is the inner array and the arrangement containing the environmental variables is the outer array. In this chapter these two arrays will be referred to as the design and environmental arrays and the total design will be called a cross-product array. If there are n1 runs in the design array and n2 runs in the environmental array, and the runs are made independently, then the total experiment will require n1 x n2 runs. Thus, except where both n1 and n2 are small, this could involve a large amount of experimental work. An issue that will be considered in this chapter is how the investigator can construct experimental designs that will require less work than these cross-product arrays and still be able to determine settings for the design variables that are stable (or insensitive) to variation from the environmental variables.
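The cross-product structure and the summary statistics of Table 2.2 can be reproduced with a short script. The sketch below (Python/NumPy; not part of the original chapter) stores the Table 2.1 responses as an 8 x 9 array — inner (design) array runs as rows, outer (environmental) array conditions as columns — and computes the mean and sample standard deviation of each formulation over the environmental conditions.

```python
import numpy as np

# Inner (design) array: the eight formulations of the 2^3 factorial in A, B, C,
# listed in the same order as Table 2.1.
inner = ["---", "+--", "-+-", "++-", "--+", "+-+", "-++", "+++"]

# Each inner-array run is measured at all nine outer-array (temperature, humidity)
# conditions, so the full cross-product experiment needs n1 * n2 = 8 * 9 = 72 runs.
crushing_strength = np.array([
    [119, 106,  97, 107, 107,  95,  87,  88,  87],
    [100,  95,  87, 101, 119,  91, 107,  87,  83],
    [116, 112, 119, 102, 101,  87, 105, 105, 100],
    [109,  93, 100,  91, 102, 103,  85,  88,  96],
    [115, 108, 104, 128, 125,  97,  99, 107,  76],
    [112, 113,  90, 103,  94,  88,  96,  97,  80],
    [121,  95, 101, 103, 111, 107, 106, 107, 108],
    [104, 103,  89, 104, 102,  98,  97,  89, 102],
])

# Mean and sample standard deviation over the environmental conditions,
# reproducing the summary statistics of Table 2.2.
means = crushing_strength.mean(axis=1)
stds = crushing_strength.std(axis=1, ddof=1)
for abc, m, s in zip(inner, means, stds):
    print(f"{abc}  mean = {m:6.2f}  sd = {s:5.2f}")
```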
TABLE 2.2
SUMMARY STATISTICS FOR DATA SET OF TABLE 2.1

Design Variables
A  B  C          Mean    Standard Deviation
-  -  -          99.22   11.21
+  -  -          96.67   11.42
-  +  -         105.22    9.61
+  +  -          96.33    7.81
-  -  +         106.56   15.66
+  -  +          97.00   10.87
-  +  +         106.56    7.14
+  +  +          98.67    6.00

The next section will present an overview of the statistical techniques associated with response surface methodology. In Section 2.3 the applicability of response surface methodology for robust design will be investigated. Section 2.4 will discuss the applicability of an alternative class of experimental designs called split-plot designs and show how the use of these designs can significantly reduce the amount of work required to conduct robust design experiments. Conclusions are given in Section 2.5.
2.2 AN OVERVIEW OF RESPONSE SURFACE METHODOLOGY
The strategy for robust design experiments that will be considered in Section 2.3 is based on the statistical techniques associated with response surface methodology. This section will give an overview of response surface methodology, presenting some of the more common experimental designs that have been developed in this area. To motivate the response surface approach, suppose that there is some response of interest (for example, crushing strength in the tablet formulation example of Section 2.1.1), and a set of quantitative, continuous design variables that are of interest to the researcher (for example, the quantities of glidant, lactose, and disintegrant for the tablet formulation example). One possible objective for the researcher might be to understand and describe the relationship between the design variables and the response. This relationship can be described mathematically by
constructing an empirical model of the response as a function of the design variables over a range of interest. In the case where there is one design variable of interest, say percentage of lactose, the model of the response can be graphed as a curve on an x-y plot, as shown in Figure 2.1. When there are two factors of interest, the model of the response can be represented as a surface, often plotted as a contour diagram, as shown in Figure 2.2. On this plot the lines are contours of constant response and indicate the predicted response for the design variable combination. A response surface can be used to determine optimum factor settings for the response or to indicate a range of factor settings that yield an approximately equivalent response. This latter use indicates a region in the factor space where the response is robust to changes in the factors. For example, in Figure 2.2, it appears that the maximum crushing strength occurs when the percentage of lactose is 23% and the percentage of disintegrant is 3.3%. However, it can also be seen that near the optimum point the crushing strength is more stable or robust to changes in the quantity of lactose than to changes in the quantity of disintegrant.
Figure 2.1 Curve showing the effect of lactose on crushing strength.
Figure 2.2 Response surface showing the effect of lactose and disintegrant on crushing strength
To introduce some notation, let the response of interest be denoted by η and suppose that there are p quantitative, continuous design variables, x1, x2, ..., xp, such that η is a function of the design variables, x1, x2, ..., xp, that is

η = f(x1, x2, ..., xp)                                                  (1)
where the form of f is unknown. If the response is measured at a particular setting of the design variables then the measured response will differ from the true response due to experimental error, that is

y = f(x1, x2, ..., xp) + ε                                              (2)
where y is the measured response and ε is the error. In response surface methodology, it is frequently assumed that f can be approximated in some region of the design variables by a low-degree polynomial. For example, if p=2, and a first-order model is assumed appropriate then

y = β0 + β1x1 + β2x2 + ε                                                (3)
where β0, β1, and β2 are constant coefficients that measure the mean and the effects of x1 and x2 on the response. It is assumed that the x's are controlled and measured with no error in the experiment. Alternatively, the experimenter might assume that f can be approximated by a second-order model so that

y = β0 + β1x1 + β2x2 + β11x1^2 + β22x2^2 + β12x1x2 + ε                  (4)
The rationale for using low degree polynomials to approximate f is based on a Taylor series expansion of f around x=0. The statistical techniques associated with response surface methodology are concerned primarily with two aspects of the experimentation process; the construction of experimental designs that yield data to permit the efficient modeling of the response surfaces, and the analysis of the experimental data and derived response surfaces. The statistical investigation of response surfaces has a history dating back to the pioneering work of George Box and his colleagues in the 1950's; see, for example, Box and Wilson [6], Box [7], Box and Youle [8]. An introduction to the concepts and techniques associated with response surface methodology can be found in Box, Hunter, and Hunter [9] (chapter 15) and Cornell [10]. For an extensive coverage of response surface methodology see Myers [11], Box and Draper [12], Khuri and Cornell [13]. The following sub-sections will describe some of the experimental designs that are commonly used to fit the first-order and second-order model. These designs will be called first-order and second-order designs, respectively.
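The least-squares fitting of such polynomial approximations is routine. The following sketch is not from the original text and uses hypothetical design points and responses; it builds the model matrices for a first-order and a second-order model in two coded variables and estimates the coefficients by ordinary least squares.

```python
import numpy as np

# Hypothetical coded design variables (a 3^2 arrangement) and measured responses.
x1 = np.array([-1, -1,  1,  1,  0,  0,  0, -1,  1], dtype=float)
x2 = np.array([-1,  1, -1,  1,  0, -1,  1,  0,  0], dtype=float)
y  = np.array([95., 102., 97., 108., 106., 101., 107., 100., 103.])

# First-order model of equation (3):  y = b0 + b1*x1 + b2*x2
X1 = np.column_stack([np.ones_like(x1), x1, x2])
b_first, *_ = np.linalg.lstsq(X1, y, rcond=None)

# Second-order model of equation (4): adds squared and cross-product terms.
X2 = np.column_stack([np.ones_like(x1), x1, x2, x1**2, x2**2, x1 * x2])
b_second, *_ = np.linalg.lstsq(X2, y, rcond=None)

print("first-order coefficients :", np.round(b_first, 3))
print("second-order coefficients:", np.round(b_second, 3))
```

The fitted second-order surface can then be plotted as a contour diagram, as in Figure 2.2, or examined analytically for an optimum.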
2.2.1 First-order designs
Suppose that there are p quantitative variables of interest, x1, x2, ..., xp, and that in some region of interest the response can be approximated by the general first-order model

y = β0 + β1x1 + β2x2 + ... + βpxp + ε                                   (5)
A class of experimental designs that are appropriate for obtaining data that will permit the estimation of the coefficients in equation (5) by least squares are the two-level factorial and fractional factorial designs. A single replicate of a two-level full factorial design in p variables will have 2^p experimental runs composed of all possible combinations of the p variables. Such a design will permit the estimation of all p main effects, all possible two-factor interactions, all possible three-factor interactions, ..., the p-factor interaction; a total of 2^p - 1 main effects and interactions.
Frequently an experiment that required all 2^p experimental runs would be too costly to run, especially for p not small. In these situations important information on the effects of the variables may be determined by running only a fraction of the full factorial design. With such designs, called fractional factorial designs, the ability to estimate the effect of some higher-order interactions is lost, and other effects and interactions are aliased together. This aliasing implies that the calculated effects cannot be unambiguously assigned to one of the effects or interactions that are aliased together in the design. To illustrate the concept of aliasing, consider an experiment with three variables, x1, x2, x3, using the fractional factorial design given in Table 2.3. In a two-level fractional factorial design with the two levels of each factor coded -1 and +1, the estimate for the coefficient of a variable is calculated as half of the difference between the average response at the high and the low setting of the variable. Thus, b1, the estimate of β1, the coefficient for x1, will be calculated as

b1 = (1/2)(ȳ+ - ȳ-)                                                     (6)

where ȳ+ and ȳ- are the average responses at the high and low settings of x1.
Now in Table 2.3 the column headed x2x3 has been derived by multiplying together the columns for x2 and x3. This column can be used to calculate the interaction effect of x2 and x3. The interaction effect of two variables measures how the effect of one variable on the response depends on the level of the other variable. From the table it can be seen that the column headed x2x3 is identical to the column headed x1 and so the estimate of the interaction effect of x2x3 will be identical to the estimate of the coefficient for x1. Thus x1 is aliased with x2x3 and the calculated effect in equation (6) cannot be unambiguously assigned to the effect of x1 or the interaction effect x2x3. In general, the aliasing of effects occurs when the calculation of the effects uses identical columns (apart from a switching of signs).
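A small numerical illustration of this aliasing (not part of the original text; the response values are hypothetical) is given below. The half-fraction is built with the defining relation I = x1x2x3, so the x1 column is identical to the product of the x2 and x3 columns, and the effect estimate of equation (6) computed from either column is the same number.

```python
import numpy as np

# 2^(3-1) half-fraction with defining relation I = x1*x2*x3 (as in Table 2.3):
# the x1 column equals the element-wise product of the x2 and x3 columns.
x2 = np.array([ 1,  1, -1, -1], dtype=float)
x3 = np.array([-1,  1, -1,  1], dtype=float)
x1 = x2 * x3                           # identical to the x2*x3 interaction column

y = np.array([96., 110., 104., 99.])   # hypothetical responses y1..y4

def half_effect(column, y):
    """Half the difference between the mean response at +1 and at -1, equation (6)."""
    return 0.5 * (y[column == 1].mean() - y[column == -1].mean())

print(half_effect(x1, y))        # estimate attributed to x1 ...
print(half_effect(x2 * x3, y))   # ... is numerically identical to the x2*x3 estimate
```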
The degree of aliasing in a design can be summarized by stating the design's resolution. In general a design has resolution R if all effects containing k variables are unaliased with any effects containing fewer than R-k variables. Resolution is denoted by the appropriate Roman numeral. A resolution III design has all main effects unaliased with other main effects but may alias them with two-factor interactions. A resolution IV design does not alias main effects with two-factor interactions, but may alias two-factor interactions with one another. A resolution V design does not alias main effects with three-factor interactions nor alias two-factor interactions with one another, but may alias three-factor interactions with two-factor interactions.
TABLE 2.3
FOUR-RUN DESIGN WITH THREE FACTORS

Run    x1    x2    x3    x2x3    Response
1      -1    +1    -1     -1       y1
2      +1    +1    +1     +1       y2
3      +1    -1    -1     +1       y3
4      -1    -1    +1     -1       y4
Although some information is lost when fractional factorial designs are used instead of full factorial designs, the advantage of these designs is that the total number of experimental runs can be reduced considerably. Furthermore, by careful choice of design and allocation of the variables to the design, and by following a sequential approach to experimentation, the experimenter can use fractional factorial designs to obtain information in an economical manner. An excellent description of fractional factorial designs and aliasing can be found in Box, Hunter, and Hunter [9]. This book also contains a description of how these designs can be blocked to remove additional sources of variation from the analysis, thereby increasing the precision of the estimates of the coefficients. There is also a discussion of how experimental designs can be run sequentially, designing the next experiment in the light of information that has been obtained, and the unresolved questions that remain, from the previous experiments. The rationale for sequential experimentation is that the best time to design an experiment is after the experiment has been run, since at that stage more is
known about the process than when the experiment was designed. Box, Hunter, and Hunter recommend that no more than 25% of the experimental budget be devoted to the first experiment, so that sufficient resources are retained to investigate questions that the data from the first experiment will raise. From the discussion of design resolution above, it should be clear that a resolution III design will permit the fitting of all the coefficients for the first-order model in equation (5). However, the resolution III design will alias main effects with two-factor interactions. The aliasing of effects in fractional factorial designs has implications for the fitting of the response surface. The aliasing in the resolution III design implies that the coefficients associated with the main effects will be biased by the presence of any interactions in the true (unknown) model. To illustrate this, consider fitting the first-order model with three variables, x1, x2, x3,

y = β0 + β1x1 + β2x2 + β3x3 + ε                                         (7)

Suppose that the experimenter runs the 2^(3-1) fractional factorial design shown in Table 2.3. With this design each main effect is aliased with the two-factor interaction composed of the other two factors; that is, x1 is aliased with x2x3, x2 is aliased with x1x3, and x3 is aliased with x1x2. This can be verified by multiplying together the appropriate columns, as was done for x2x3. Suppose that the true unknown model is

y = β0 + β1x1 + β2x2 + β3x3 + β12x1x2 + ε                               (8)
Then with the design given in Table 2.3, b3, the experimenter's estimate of β3, will be biased by the coefficient β12. In fact

E(b3) = β3 + β12
Similarly, if the true model is

y = β0 + β1x1 + β2x2 + β3x3 + β23x2x3 + ε
then with the design in Table 2.3, b1, the experimenter's estimate of β1, will be biased by the coefficient β23; that is,

E(b1) = β1 + β23
It can be seen that the use of a fractional factorial design can lead to biases in the estimation of the first-order coefficients from any interactions that are present in the true (unknown) model and that have been aliased with the main effects of the factors. Therefore, the experimenter needs to be aware of the aliasing that occurs with the use of a fractional factorial design and understand the biases that can result in the estimation of the coefficients of the model. A more complete discussion of the biases in estimation of coefficients from using fractional factorial designs, and a description of how the biasing can be calculated for larger fractional factorial designs, can be found in Box and Draper [12] (pp. 65-70) and Myers [11] (pp. 110-114). Some protection against the effect of biases in the estimation of the first-order coefficients can be obtained by running a resolution IV fractional factorial design. With such a design the two-factor interactions are aliased with other two-factor interactions and so would not bias the estimation of the first-order coefficients. In fact the main effects are aliased with three-factor interactions in a resolution IV design, and so the first-order effects would be biased if there were third-order coefficients of the form xj^2 xk in the true model. Fractional factorial designs use n = 4, 8, 16, 32, 64, ... runs, and can be constructed to carry up to p = n-1 variables. (A design that has p = n-1 variables in only n runs is called a saturated design since it cannot hold any more variables.) For values of n that are multiples of 4 but not a power of 2, that is, n = 12, 20, 24, 28, 36, ..., an alternative class of first-order designs is the Plackett-Burman designs; see Plackett and Burman [14]. Plackett-Burman designs may be of use in screening situations, that is, in situations where the experimenter wishes to examine many variables but believes that only a few are of importance. Furthermore, Plackett-Burman designs are particularly useful when following a sequential experimental strategy, since a resolution IV design can be constructed from a Plackett-Burman design by augmenting it with the foldover design, that is, the design in which all of the runs have the signs of all the variables switched. An example of a Plackett-Burman design with 11 factors in 12 runs is given in Table 2.4.
It can be seen that this 12-run design is generated by starting with a particular row of -1's and +1's and generating the next row by cycling through the variables and shifting each sign one place to the right. The shift is repeated to obtain the first eleven runs and then the final run is constructed by adding a final row of -1's. The starting rows for 12-, 20-, and 24-run designs given by Plackett and Burman [14] are as follows:
n=12: +1 +1 -1 +1 +1 +1 -1 -1 -1 +1 -1
n=20: +1 +1 -1 -1 +1 +1 +1 +1 -1 +1 -1 +1 -1 -1 -1 -1 +1 +1 -1
n=24: +1 +1 +1 +1 +1 -1 +1 -1 +1 +1 -1 -1 +1 +1 -1 -1 +1 -1 +1 -1 -1 -1 -1
TABLE 2.4
PLACKETT-BURMAN DESIGN IN 12 RUNS

Run    A   B   C   D   E   F   G   H   I   J   K
1     +1  +1  -1  +1  +1  +1  -1  -1  -1  +1  -1
2     -1  +1  +1  -1  +1  +1  +1  -1  -1  -1  +1
3     +1  -1  +1  +1  -1  +1  +1  +1  -1  -1  -1
4     -1  +1  -1  +1  +1  -1  +1  +1  +1  -1  -1
5     -1  -1  +1  -1  +1  +1  -1  +1  +1  +1  -1
6     -1  -1  -1  +1  -1  +1  +1  -1  +1  +1  +1
7     +1  -1  -1  -1  +1  -1  +1  +1  -1  +1  +1
8     +1  +1  -1  -1  -1  +1  -1  +1  +1  -1  +1
9     +1  +1  +1  -1  -1  -1  +1  -1  +1  +1  -1
10    -1  +1  +1  +1  -1  -1  -1  +1  -1  +1  +1
11    +1  -1  +1  +1  +1  -1  -1  -1  +1  -1  +1
12    -1  -1  -1  -1  -1  -1  -1  -1  -1  -1  -1
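The cyclic construction of Table 2.4 and the foldover augmentation mentioned earlier lend themselves to a few lines of code. The following Python/NumPy sketch (illustrative only) generates the 12-run Plackett-Burman design from the n=12 starting row and stacks it with its foldover, in which every sign is switched, to give a 24-run resolution IV design.

    import numpy as np

    # Starting row for the 12-run Plackett-Burman design (11 factors).
    start = np.array([+1, +1, -1, +1, +1, +1, -1, -1, -1, +1, -1])

    # Runs 1-11: successive cyclic shifts of the starting row; run 12: all -1.
    rows = [np.roll(start, k) for k in range(11)]
    rows.append(-np.ones(11, dtype=int))
    pb12 = np.array(rows)                          # 12 runs x 11 factors

    # Orthogonality check: the off-diagonal elements of X'X are all zero.
    print(pb12.T @ pb12)                           # 12 on the diagonal, 0 elsewhere

    # Foldover: the same runs with all signs switched.  The original design
    # followed by its foldover gives a resolution IV design in which main
    # effects are no longer aliased with two-factor interactions.
    augmented = np.vstack([pb12, -pb12])           # 24 runs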
Plackett and Burman [14] give the method of design construction for all values of n that are multiples of 4 up to 100, except n=92. A disadvantage of Plackett-Burman designs is that the structure of the aliasing is more complex than that of the fractional factorial designs, so that it is harder to determine the effect of any biases in the estimates of the coefficients. Draper and Lin [15] indicate how, in the situation where only a few variables are important, additional runs can be added to Plackett-Burman designs to yield experimental designs with higher resolution or a clearer alias structure. Hamada and Wu [16], under the assumptions that there are few significant variables and that any variable in a significant interaction is likely to have a significant main effect, show how it may be
possible to study a few interactions in Plackett-Burman designs without adding any runs. Box and Meyer [17] describe how a Bayesian analysis can reveal active variables in the complex aliasing that occurs with Plackett-Burman designs. In conclusion, Plackett-Burman designs tend to have a complex alias structure and so the presence of interactions in the true model induces a complex bias structure on the first-order coefficients. Therefore, it is recommended that these designs only be used if the assumption of no second-order interactions is reasonable, or as a part of a sequential strategy of experimentation that would generate a resolution IV design by augmenting the Plackett-Burman design with its foldover design.
2.2.2 Adding center points
When an unreplicated experiment is run, the error or residual sum of squares is composed of both experimental error and lack-of-fit of the model. Thus, formal statistical significance testing of the factor effects can lead to erroneous conclusions if there is lack-of-fit of the model. Therefore, it is recommended that the experiment be replicated so that an independent estimate of the experimental error can be calculated and both lack-of-fit and the statistical significance of the factor effects can be formally tested. In some experimental contexts, however, each experimental run is expensive. Thus it is infeasible to replicate each design point of the experiment to obtain an estimate of the experimental error. When all of the variables are quantitative, an estimate of the experimental error can be obtained by adding to the full factorial, fractional factorial or Plackett-Burman design a number of runs at the center of the design. The center of the design is the midpoint between the low and high settings of the two-level factors in the experiment. Thus, if there are p variables, and the levels of the variables have been coded (-1, +1), then the center of the design is (x1, x2, ..., xp) = (0, 0, ..., 0). If the center point is replicated n0 times in the experiment, then the variance of the response at those runs provides an estimate of the experimental error with n0 - 1 degrees of freedom to statistically test both the lack-of-fit of the model and the significance of the coefficient estimates of the model. Another reason for augmenting the two-level design with center points is that these points allow for an overall test of curvature. It is clear that with only two levels for each variable it is impossible to detect any quadratic effect of the variables. Thus, the underlying model is assumed to
be linear over the experimental region. To examine the quadratic effect of all the variables requires each variable to be run with at least three levels. An overall test of the presence of quadratic effects can be obtained by comparing the average of the center point runs, Yo, with the average of the cube portion of the design, 7,,since the expected value of ( J , -Yo) is
where βii is the quadratic effect of factor xi. A formal statistical test can be constructed by comparing

F = nc n0 (ȳc - ȳ0)^2 / [(nc + n0) σ̂^2]
with the F(1, n0-1) distribution, where σ̂^2 is the estimate of the experimental error from the variance of the n0 center point runs, and nc is the number of runs in the cube portion of the design. If the F-test is significant then there is evidence of a quadratic effect due to at least one of the variables. With the present design, however, the investigator will not be able to determine which of the variables has a quadratic effect on the response. Additional experimentation, perhaps by augmenting the current design with some star points to construct a central composite design (see the section on central composite designs below), will need to be conducted to fully explore the nature of the quadratic response surface.
2.2.3 Second-order designs
Suppose that there are p variables of interest, x1, x2, ..., xp, and that in some region of interest the response can be approximated by the general second-order model

y = β0 + Σ_{i=1..p} βi xi + Σ_{i=1..p} βii xi^2 + Σ_{i<k} βik xi xk + ε        (14)
that is, y = intercept + (first-order terms) + (quadratic terms) + (cross-product terms) + ε
This section will describe some of the classes of experimental designs that are appropriate for obtaining data that will permit the estimation of the coefficients in equation (14) by least squares.
Three-level designs
It is obvious that to be able to estimate the quadratic coefficients, β11, β22, β33, ..., βpp, in equation (14), it is necessary to have at least three distinct levels or settings for the variables. This suggests that a suitable design for estimating the coefficients of the second-order model would be a single replicate of a three-level full factorial design in p variables. This design will have 3^p experimental runs composed of all possible combinations of the p variables. If there are only p=2 or p=3 variables then a full factorial design is often feasible. However, the number of runs required becomes prohibitively large as the number of variables increases. For example, with p=5 variables, the second-order model requires the estimation of 21 coefficients: the mean, five main effects, five pure quadratic terms, and ten two-factor interactions. The three-level full factorial design would require 3^5 = 243 runs. It might be supposed that a smaller design that permitted the estimation of the coefficients of interest could be constructed by taking a fraction of the full factorial. However, the aliasing of three-level designs is very complex and so fractionating a three-level design will not be pursued here. The interested reader may refer to Kempthorne [18]. Therefore, unless the number of factors is small, three-level designs are not usually feasible for response surface studies.
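A quick arithmetic check of these run counts can be written in a couple of lines; the sketch below (Python, illustrative only) compares the number of coefficients in the second-order model, 1 + 2p + p(p-1)/2, with the 3^p runs of a three-level full factorial.

    # Second-order model coefficients versus three-level full factorial runs.
    for p in range(2, 7):
        n_coeff = 1 + 2 * p + p * (p - 1) // 2   # mean + linear + pure quadratic + two-factor interactions
        n_runs = 3 ** p
        print(p, n_coeff, n_runs)
    # For p = 5 this prints "5 21 243", the case discussed in the text.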
Central Composite Designs
An alternative approach to constructing designs for estimating second-order models is to consider building a design from those constructed for the first-order model. In Section 2.2.1, we discussed the use of fractional factorial designs to estimate the coefficients of the first-order model. It was noted that a fractional factorial design of resolution V would yield
unbiased estimates of all coefficients for the main effects and two-factor interactions. To estimate the quadratic coefficients of the second-order model this design could be augmented with additional points where the variables are at additional settings to the fractional design so that each variable has at least three settings. A class of augmented designs, first proposed by Box and Wilson [6] and frequently applied in response surface work, is the central composite design. Composite designs consist of a full or fractional factorial design of at least resolution V; the number of runs in this design will be nc = 2^(p-k), these runs forming a cube portion with coordinates of the form (±1, ±1, ..., ±1); ns = 2p star points with coordinates (±a, 0, 0, ..., 0), (0, ±a, 0, ..., 0), ..., (0, 0, ..., ±a); and n0 center points (0, 0, 0, ..., 0). The use of the terms cube, star and center points is descriptive of the design pattern, as is clear when there are p = 3 variables. In that case the points of the central composite design, shown in Table 2.5, can be represented by the points in Figure 2.3. In Table 2.5, runs 1-8 are the cube portion, runs 9-14 are the star portion, and runs 15-17 are the center points. In general the cube portion might be replicated rc times and the star portion might be replicated rs times. Also, it might be possible to use a fractional factorial design of resolution less than V if the experimenter is prepared to assume that certain interactions are negligible. A central composite design in four variables is shown in Table 2.6. In this table, runs 1-16 are the cube portion, runs 17-24 are the star portion, and runs 25-27 are the center points. The central composite design has several advantages over the three-level design. Firstly, the total number of runs in a central composite design is frequently less than that required for a three-level full factorial design. For example, with p=5 variables 243 runs would be required for the three-level full factorial design, whereas with single replicates for the cube and star portions and four center points, the total number of runs required for a central composite design would be 16 + 10 + 4 = 30 (for the cube portion a 2^(5-1) fractional factorial design could be used).
Figure 2.3 Central composite design with three variables.
TABLE 2.5
CENTRAL COMPOSITE DESIGN WITH THREE VARIABLES

Run     A     B     C
1      -1    -1    -1
2      +1    -1    -1
3      -1    +1    -1
4      +1    +1    -1
5      -1    -1    +1
6      +1    -1    +1
7      -1    +1    +1
8      +1    +1    +1
9      -a     0     0
10     +a     0     0
11      0    -a     0
12      0    +a     0
13      0     0    -a
14      0     0    +a
15      0     0     0
16      0     0     0
17      0     0     0
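A minimal sketch of how the design in Table 2.5 can be assembled from its cube, star, and center portions is given below (Python/NumPy, illustrative only); the star distance a is left as a parameter, and the ordering of the runs is not intended to match the table exactly.

    import numpy as np
    from itertools import product

    def central_composite(p=3, alpha=1.68, n_center=3):
        """Cube (2**p runs), star (2p runs) and center points for p factors."""
        cube = np.array(list(product([-1, +1], repeat=p)))   # full factorial cube portion
        star = np.zeros((2 * p, p))
        for i in range(p):
            star[2 * i, i] = -alpha                           # (-a, 0, ..., 0) pattern
            star[2 * i + 1, i] = +alpha                       # (+a, 0, ..., 0) pattern
        center = np.zeros((n_center, p))
        return np.vstack([cube, star, center])

    design = central_composite()
    print(design.shape)   # (17, 3): 8 cube + 6 star + 3 center runs, as in Table 2.5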
A second advantage of the central composite design is that it lends itself to a sequential approach to experimentation, since the central composite design can be built in sections. For example, an experimenter might initially assume that the response surface can be adequately represented by a first-order model, possibly with the addition of some two-factor interaction terms. Thus they might initially conduct a resolution V fractional factorial design. Following the analysis the experimenter might suspect some nonlinearity and so augment the first design with some center points. If examination of the response at the center point runs indicates the presence of quadratic effects, then the experimenter might be interested in fitting a second-order model. Data to enable this to be accomplished can be obtained by augmenting the design with star points to generate the central composite design. In some situations design augmentation can be accomplished so that the designs are orthogonally blocked, thus allowing for block differences to be eliminated in the analysis and estimation of the coefficients. The central composite design gives the experimenter the flexibility of choosing the value of a, the distance of the star points from the center of the design. One possible criterion for a is to choose it so that the central composite design is rotatable. A rotatable design is one in which the precision of the predicted response is the same at all points equidistant from the center point (0, 0, ..., 0). Rotatability is a useful property for a design since it relieves the experimenter from making any assumption that the underlying response surface is oriented in a particular direction. Rotatability ensures that whatever the orientation of the response surface the precision of the predicted response will not be dependent on the direction from the center of the design, only on the distance from the center of the design. It can be shown that for a central composite design to be rotatable the distance of the star points from the design center must be a = (2^(p-k) rc / rs)^(1/4), where 2^(p-k) is the number of factorial points and rc and rs are the numbers of replicates of the cube and star portions. Therefore, if p = 5 and k = 1, then the design would be rotatable if a = (2^(5-1))^(1/4) = 2. For p = 3, a = (2^3)^(1/4) = 1.68 generates a rotatable design. A possible disadvantage of the central composite design is that it requires five levels of each variable (0, ±1, ±a). In some situations it might be necessary or preferable to have only three different settings of the variables. In this case a can be chosen to be 1 and the design is called a face-centered composite design. These designs are not rotatable.
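The rotatability condition quoted above reduces to a one-line calculation; the following fragment (Python, illustrative only) evaluates a for single replicates of the cube and star portions.

    # a = (2**(p - k) * r_c / r_s) ** 0.25 for a rotatable central composite design.
    def rotatable_alpha(p, k=0, r_cube=1, r_star=1):
        return (2 ** (p - k) * r_cube / r_star) ** 0.25

    print(rotatable_alpha(5, k=1))   # 2.0,  cube is a 2**(5-1) fractional factorial
    print(rotatable_alpha(3))        # 1.68, cube is the full 2**3 factorial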
TABLE 2.6
CENTRAL COMPOSITE DESIGN WITH FOUR VARIABLES

Run     A     B     C     D
1      -1    -1    -1    -1
2      +1    -1    -1    -1
3      -1    +1    -1    -1
4      +1    +1    -1    -1
5      -1    -1    +1    -1
6      +1    -1    +1    -1
7      -1    +1    +1    -1
8      +1    +1    +1    -1
9      -1    -1    -1    +1
10     +1    -1    -1    +1
11     -1    +1    -1    +1
12     +1    +1    -1    +1
13     -1    -1    +1    +1
14     +1    -1    +1    +1
15     -1    +1    +1    +1
16     +1    +1    +1    +1
17     -a     0     0     0
18     +a     0     0     0
19      0    -a     0     0
20      0    +a     0     0
21      0     0    -a     0
22      0     0    +a     0
23      0     0     0    -a
24      0     0     0    +a
25      0     0     0     0
26      0     0     0     0
27      0     0     0     0
The choice of the number of center points and the blocking of composite designs is discussed in Myers [11], Box and Draper [12], and Khuri and Cornell [13]. One final point: it should be noted that the experimenter is not constrained to use a resolution V design or to add star points for all of the factors. In particular, if it is believed that certain two-factor interactions
can be assumed negligible, then it might be possible to use a resolution IV design with a particular assignment of variables to columns of the design. Alternatively, if there are certain pure quadratic effects that are deemed unimportant, then star points for those variables need not be added to the design.
Box-Behnken Designs
Another alternative to the 3^p full factorial is the Box-Behnken design (Box and Behnken [19]). These designs are a class of incomplete three-level factorial designs that either meet, or approximately meet, the criterion of rotatability. A Box-Behnken design for p=3 variables is shown in Table 2.7. This design will estimate the ten coefficients of the second-order model in only fifteen runs, in contrast with the 3^3 = 27 runs required by the full factorial design. This design is shown graphically in Figure 2.4.
TABLE 2.7
BOX-BEHNKEN DESIGN FOR THREE VARIABLES

Run     A     B     C
1      -1    -1     0
2      +1    -1     0
3      -1    +1     0
4      +1    +1     0
5      -1     0    -1
6      +1     0    -1
7      -1     0    +1
8      +1     0    +1
9       0    -1    -1
10      0    +1    -1
11      0    -1    +1
12      0    +1    +1
13      0     0     0
14      0     0     0
15      0     0     0
Table 2.8 gives the runs for a Box-Behnken design in four variables. In this table, the runs are grouped in sets of four, each set of four being
composed of all the combinations of ±1 for the two variables indicated, the other two variables being set at 0. The design is completed with three center points, runs 25-27.
Figure 2.4 Box-Behnken design for three variables.
TABLE 2.8
BOX-BEHNKEN DESIGN FOR FOUR VARIABLES

Runs      A     B     C     D
1-4      ±1    ±1     0     0
5-8       0     0    ±1    ±1
9-12     ±1     0     0    ±1
13-16     0    ±1    ±1     0
17-20    ±1     0    ±1     0
21-24     0    ±1     0    ±1
25-27     0     0     0     0
As was mentioned above for central composite designs, the experimenter can modify these designs if they believe that certain two-factor interactions can be assumed negligible. Box and Jones [20,21] show how this can be done to yield what they call a modified Box-Behnken design that requires fewer runs than the standard Box-Behnken design. A table of Box-Behnken designs for p = 3, 4, ..., 7 variables can be found in Box and Draper [12], and for p = 3, 4, ..., 7, 9, 10, 11, 12, 16 variables in Box and Behnken [19].
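For the smaller Box-Behnken designs the construction can be written directly from the pattern of Tables 2.7 and 2.8: each pair of factors is run through its four ±1 combinations while the remaining factors are held at 0, and center points are appended. The Python/NumPy sketch below is illustrative only; the larger designs tabulated by Box and Behnken (such as the seven-variable design used later in Table 2.12) are built from different incomplete block arrangements, so this pairwise construction should not be assumed to generalize.

    import numpy as np
    from itertools import combinations, product

    def box_behnken_pairs(p=3, n_center=3):
        """Pairwise Box-Behnken construction; reproduces the p = 3 and p = 4 designs."""
        rows = []
        for pair in combinations(range(p), 2):          # each pair of factors in turn
            for levels in product([-1, +1], repeat=2):  # the four +/-1 combinations
                run = np.zeros(p)
                run[list(pair)] = levels
                rows.append(run)
        rows.extend(np.zeros(p) for _ in range(n_center))
        return np.array(rows)

    print(box_behnken_pairs(3).shape)   # (15, 3), as in Table 2.7
    print(box_behnken_pairs(4).shape)   # (27, 4), as in Table 2.8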
2.2.4 Optimal designs
Optimal design theory provides an alternative approach to the selection of an experimental design. For a description of the theory of optimal design see, for example, Atkinson and Donev [22]. To motivate this approach, suppose that an experiment with n runs will be conducted and a model with k coefficients is to be fit to the data. This model can be represented in matrix notation as
y = Xβ + ε        (15)
where y is an n x 1 column vector of response values, β is a k x 1 column vector of coefficients, X is an n x k matrix that defines the runs in the experiment, and ε is an n x 1 column vector of errors. It is commonly assumed that the errors are independent and follow a normal distribution with variance σ^2. One of the questions that the experimenter needs to consider is how to choose good values for the elements of X. It can be shown that the variance of the coefficient estimates, b, of β is σ^2(X'X)^(-1). Furthermore, the variance of the predicted response at any setting of the variables is also a function of (X'X)^(-1). Thus one way to choose good values for the elements of X is to choose them so that (X'X)^(-1) is, in some sense, "small". A number of criteria have been developed, the most popular of which are:
- A-optimality criterion: minimize trace (X'X)^(-1);
- D-optimality criterion: minimize det (X'X)^(-1);
- E-optimality criterion: minimize the maximum eigenvalue of (X'X)^(-1);
- G-optimality criterion: minimize the maximum value of the variance of the predicted response.
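These criteria are straightforward to evaluate for any candidate model matrix X. The sketch below (Python/NumPy, illustrative only; the example matrix is simply an intercept column plus two main-effect columns) computes the A-, D-, and E-criteria as functions of (X'X)^(-1).

    import numpy as np

    def design_criteria(X):
        """A-, D- and E-optimality criteria for a model matrix X (n runs x k terms)."""
        M_inv = np.linalg.inv(X.T @ X)            # (X'X)^-1, proportional to Var(b)
        return {
            "A": np.trace(M_inv),                 # A-optimality: minimize the trace
            "D": np.linalg.det(M_inv),            # D-optimality: minimize the determinant
            "E": np.linalg.eigvalsh(M_inv).max(), # E-optimality: minimize the largest eigenvalue
        }

    # Example: intercept plus two main effects for a 2**2 factorial.
    X = np.array([[1., -1., -1.],
                  [1., +1., -1.],
                  [1., -1., +1.],
                  [1., +1., +1.]])
    print(design_criteria(X))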
Efficient algorithms have been developed that construct D-optimal designs for a given response model, candidate design points, and number of runs (see, for example, Mitchell [23]). Optimal experimental design can be useful in situations where:
- the experimental design region is irregularly shaped due to constraints on the variables;
- it is necessary to augment an existing design;
- designs must be constructed for special models or with a limited number of runs;
- the experimenter has prior knowledge of the form of the model and desires coefficient estimates in a minimal number of runs.
There has been an extensive critique of the role of optimal design theory in practical experimental design; see, for example, Box [24], Box and Draper [12]. One of the underlying assumptions behind optimal experimental design is that, since the designs are only optimal within the region defined by the candidate design points, there is a well-defined region of interest within which experiments can be run. The assumption is that the experimenter has no interest in the response outside of the region defined by the candidate points. In typical response surface studies, however, the region of interest might be poorly defined and might change as the investigation proceeds. Thus, it might be advisable to design the experiment to obtain information about the response beyond the current region of interest defined by the candidate points. Furthermore, optimal design theory assumes that the model is true within the region defined by the candidate design points, since the designs are optimal in terms of minimizing variance as opposed to bias due to lack-of-fit of the model. In reality, the response surface model is only assumed to be a locally adequate polynomial approximation to the truth; it is not assumed to be the truth. Consequently, the experimental design chosen should reflect doubt in the validity of the model by allowing for model lack-of-fit to be tested.
2.2.5 Other second-order designs
There are many other second-order designs that have been proposed in the statistical literature. Some of these designs are based on variants of the central composite design. For example, the designs proposed by Hartley [25] use cube portions in the central composite design of resolution less than V with no two-factor interactions aliased with one another. Second-order designs can also be constructed using irregular fractional factorial designs for the cube portion. Irregular fractional factorial designs (see, for example, John [26] and Maclean and Anderson [27]) are non-orthogonal fractions of a full factorial design. Some second-order designs, such as the uniform shell designs (Doehlert [28]), have been proposed which are not based on the central composite design. A more thorough treatment of additional second-order designs can be found in the texts mentioned earlier: see Myers [11], Box and Draper [12], Khuri and Cornell [13].
2.2.6 Interim Summary
This section has given an overview of some of the experimental designs that are suitable for collecting data to estimate the coefficients of the first-order and second-order model. Many of these designs are based on factorial and fractional factorial designs. It is clear that if the first-order model (equation (5)) is assumed to be valid, then a resolution III design or a Plackett-Burman design can be used, since such a design will estimate all of the βi without bias. However, it has been shown that with a resolution III design the estimates of the coefficients βi will be biased if the true (unknown) model contains interactions. The biases and lack-of-fit of the first-order model due to interaction effects can be examined by running a resolution IV design, which will yield unbiased estimates of all of the βi. A resolution IV design can be obtained from a Plackett-Burman or resolution III fractional factorial design by augmenting the initial design with its foldover design. If the true model contains quadratic terms then the estimate of the intercept, β0, of the first-order model will be biased. The lack-of-fit of the first-order model due to quadratic effects can be tested by adding center points to the design. To construct the central composite design to estimate the coefficients of the second-order model (equation (14)), usually a fractional factorial design of at least resolution V is used. In this case, if the model is valid, then all of the estimates of the main effect coefficients, βi, and the
interaction coefficients, βik, are unbiased. An alternative to the central composite designs for estimating the coefficients of the second-order model is the class of Box-Behnken designs or the designs referenced in Section 2.2.5. Table 2.9 shows the minimum number of runs for a single replicate of a fractional factorial design with the desired resolution for p variables, p = 3, ..., 11.
2.3 ROBUST DESIGN AND RESPONSE SURFACE METHODOLOGY
This section will describe how the techniques associated with response surface methodology, outlined in Section 2.2, can be applied to designing a product or process that is insensitive, or robust, to variation that is difficult
or impossible to control. Two alternative strategies will be outlined in this introduction and will be considered in detail in Sections 2.3.1 - 2.3.5.
TABLE 2.9
MINIMUM NUMBER OF RUNS FOR DESIRED RESOLUTION

Number of        Resolution
variables     III     IV      V
3               4      8      8
4               8      8     16
5               8     16     16
6               8     16     32
7               8     16     64
8              16     16     64
9              16     32    128
10             16     32    128
11             16     32    128

In the first approach it is assumed that the effect of environmental variation on the response is investigated by running a replicated experiment. The replication enables the variation of the response to be estimated at each design point. In this scenario the environmental variation is uncontrolled during the experiment but is assumed to affect the response in a random manner and is captured in the replication. It is acknowledged that the variation that is measured at each design point will be from many sources, including the sources of the environmental variation. However, with this approach the objective of finding design variable settings that minimize the variation in the response can be achieved, although no information will be gained as to how the design variable settings might make the response robust to particular sources of environmental variation. The design and analysis of experiments with this first approach will be covered in Sections 2.3.1 and 2.3.2. In the second approach, the environmental variation is deliberately introduced into the experiment by including in the experimental design environmental variables that are controlled at predetermined settings during the experiment. In this approach it will be possible to estimate how much of the variation is due to the environmental variables and how much is due to unassignable sources. It will be possible also to determine how particular design variable settings might make the response robust to the sources of environmental variation considered in the experiment. The
design and analysis of experiments with this second approach will be covered in Sections 2.3.3 and 2.3.4, and an example will be given in Section 2.3.5. It will be seen that some of the methods for analysis of experiments conducted under the first approach can also be applied to data derived from experiments conducted under the second approach.
2.3.1 Response surface modeling of the mean and standard deviation
In Section 2.2 it was shown that response surface methodology can be applied to enable a researcher to model the effect of multiple quantitative variables on a response with a low-degree polynomial. Frequently, response surface techniques have focused on the mean response as the only response of interest. However, by regarding the variation in the response as an additional response of interest, the researcher can investigate how to achieve a mean response that is on target with minimum variation. In particular, if a researcher replicates each design point in an experiment, then an estimate of the standard deviation at each point can be calculated and used to model the effect of the variables on the variability of the response. To illustrate this approach, suppose that in an experiment on tablet formulation a researcher is interested in understanding how three quantitative variables, pressure force, lactose quantity, and disintegrant quantity, affect crushing strength. Suppose that the objective is to have a mean crushing strength of 125 N with minimum variation. If it is believed that the effect of the variables on the crushing strength can be adequately represented by a second-order polynomial then a 17-run central composite design, shown in Table 2.5, could be run to estimate all of the terms in the second-order model

y = β0 + β1x1 + β2x2 + β3x3 + β11x1^2 + β22x2^2 + β33x3^2 + β12x1x2 + β13x1x3 + β23x2x3 + ε
where the xi are coded settings for the three design variables. Now if each of the design points in the central composite design is replicated five times, so that the complete design has 85 runs, then at each design point we can calculate the average response and the standard deviation of the response. The analysis techniques associated with response surface methodology can then be applied to fit separate models to
the mean and the standard deviation. The researcher is then in a position to determine settings of the variables that will give a mean response that is close to target with minimum variation. (It should be noted that many authors suggest that, for theoretical reasons, the log of the standard deviation, ln(s), be modeled rather than s; see, for example, Bartlett and Kendall [29] and Box [30].) In the context of the tablet formulation example, the model of the mean and the standard deviation can be used to determine which factors affect the mean crushing strength only, which affect the variability in crushing strength only, and which affect both the mean and the variability. The researcher can then choose settings of the variables that will give a mean crushing strength that is consistently close to 125 N. At this stage it is important to stress that the run order of the experimental design, including all replicates, should be completely randomized, since the purpose of the replicates is to provide an estimate of the total variation in the process or product at each design combination. If the replicated experiment is not completely randomized, then it is likely that the variation at each design point will be under-estimated since it will not include a component due to any variation in the set-up of the design variables. This could lead to erroneous conclusions about robust design combinations if certain design combinations have less set-up variation than others. The advantages of using the response surface approach to study both the mean and the variability are that it is easy to apply, no new methods of analysis are required, and the standard analysis methods can be used to bring insight to bear on the dual objective of the mean response and the variability. Some of these methods of analysis are considered in Section 2.3.2. As was mentioned above, a disadvantage of this approach is that the variation that is measured at each replicated design point will be from many sources, including sources of environmental variation, and it will be impossible to attribute the variation to a particular source. Another disadvantage of this approach is that it assumes that the variation experienced at the design points during the course of the experiment is similar to that experienced in practice in the real world. Frequently an experiment will be well-controlled and so the variation experienced will be considerably less than that normally encountered. One of the rationales for the noise arrays and cross-product designs advocated by Taguchi and discussed in Section 2.1 is to deliberately
introduce into the experiment sources of variation that are more in line with what will be encountered in practice. During the experiment the noise (or environmental) variables are changed in a controlled manner that mimics the variation likely to be experienced in practice. Experiments that deliberately introduce the variation into the experiment through the experimental design (called the second approach, above) will be considered in Sections 2.3.3 and 2.3.4.
2.3.2 Analyzing the mean and standard deviation response surfaces
One analysis approach, appropriate if there are only a couple of design variables, is to construct contour plots of the mean response and the standard deviation of the response over the range of the variables. This will enable the researcher to see the constraints and trade-offs that may need to be made to achieve required values for the mean and variability of the response. A more rigorous analysis for simultaneously obtaining a target value for the mean and minimizing the variance has been discussed by Vining and Myers [31]. They propose applying the dual response approach developed by Myers and Carter [32] and state that this approach can satisfy the goals of achieving a target for the mean and for the variance within a more rigorous statistical methodology than that proposed by Taguchi. The objective of the dual response approach of Myers and Carter is to optimize a primary response subject to an appropriate equality constraint on the value of a secondary response. An application of this approach to the study of products and processes that are stable to environmental variation would involve running a response surface design, such as a central composite design or Box-Behnken design, that is replicated at each design point, as described in Section 2.3.1. Since each design point is replicated, the mean and variance can be calculated for each point in the experiment. Separate second-order models are fit to the data from the experiment that adequately describe the effect of the variables on the mean and on the standard deviation of the response. Then these two models are studied using the dual response approach of optimizing a primary response subject to an appropriate equality constraint on the value of a secondary response. The choice of whether to make the mean the primary or the secondary response will depend on the objectives of the experiment. For example, if the objective is to have the mean on target with minimum variation then the dual response approach would suggest minimizing the variance (or
some function of the variance such as ln(s)), subject to the constraint that the mean is at its target value. In this case the variance (or ln(s)) will be the primary response and the mean will be the secondary response. Alternatively, if the objective is to maximize (or minimize) the mean response and keep the variation as small as possible then the dual response approach would suggest optimizing the mean subject to the constraint that the variance is less than some upper bound. In this case the mean will be the primary response and the variance will be the secondary response. As suggested by Vining and Myers [31], the investigator may wish to select several possible constraint values for the variance, find the corresponding optimum values for the mean response subject to these variance constraints, and select a good compromise among these values. Details of the dual response approach can be found in the references given above. It is an extension of ridge analysis (Hoerl [33]; see also Box and Draper [12]). The assumption is that there is a spherical region of interest of the design variables and that the variable combination that optimizes the primary response subject to a constraint on the secondary response is likely to be on the boundary of this region of interest. Thus an additional constraint is introduced, that the optimal value for the primary response is on the boundary of this spherical region. Lagrange multipliers are used to solve this constrained optimization problem. An example of the application of the dual response approach is given in Vining and Myers [31]. The application of the standard nonlinear programming techniques of constrained optimization to analyzing the mean and variance response surfaces has been investigated by Del Castillo and Montgomery [34]. These techniques are appropriate since both the primary and secondary responses are usually quadratic functions. Del Castillo and Montgomery recommend the generalized reduced gradient (GRG) algorithm for the following reasons. Firstly, the GRG algorithm is a primal method, meaning that at each iteration the method searches only through the feasible region to determine a point that improves the primary response. Secondly, the GRG algorithm is one of the most robust nonlinear programming methods in that it can solve a wide variety of problems. Finally, the GRG method is known to work well unless the starting point is far from optimal and the constraints are highly nonlinear. Neither of these conditions is likely to be of concern when applying GRG methods to the dual response problem. Del Castillo and Montgomery also mention that if, in the dual response problem, the
primary response is quadratic and the secondary response is linear, then a simpler method, such as quadratic programming, would be appropriate. An explanation of the GRG algorithm and its application to the dual response problem is given in Del Castillo and Montgomery [34]. In this paper Del Castillo and Montgomery claim that the GRG methodology has an advantage over the dual response method of Vining and Myers [31] in that it allows more constraints (secondary responses, such as cost constraints) to be included in the optimization and the constraints can be of a more flexible form. Furthermore, the optimization can be conducted over non-spherical regions of interest; for example, a cuboidal region defined by design variables within the region -1 ≤ xi ≤ +1.
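As an illustration of the constrained formulation, the sketch below (Python with NumPy and SciPy; the fitted coefficients are invented purely for illustration and are not taken from any experiment in this chapter) minimizes a hypothetical model for ln(s) subject to the fitted mean model being held at its target, over the cuboidal region -1 ≤ xi ≤ +1, using a general-purpose nonlinear programming routine rather than the ridge-analysis machinery described above.

    import numpy as np
    from scipy.optimize import minimize

    # Hypothetical fitted second-order models in two coded design variables.
    def mean_model(x):
        x1, x2 = x
        return 120.0 + 6.0 * x1 + 4.0 * x2 - 3.0 * x1**2 - 2.0 * x1 * x2

    def log_sd_model(x):
        x1, x2 = x
        return 1.5 + 0.8 * x1 - 0.5 * x2 + 0.4 * x1**2

    target = 125.0   # target for the mean response

    # Dual response formulation: ln(s) is the primary response to be minimized,
    # with the mean held at its target value (the secondary response).
    result = minimize(
        log_sd_model,
        x0=np.zeros(2),
        method="SLSQP",
        bounds=[(-1, 1), (-1, 1)],
        constraints=[{"type": "eq", "fun": lambda x: mean_model(x) - target}],
    )
    print(result.x, np.exp(result.fun))   # candidate robust settings and the implied s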
2.3.3 Experimental design with environmental variables
In this section it is supposed that the environmental variation is deliberately introduced into the experiment by including in the experimental design environmental variables that are controlled at predetermined settings during the experiment. Freeny and Nair [35] considered robust design experiments with uncontrollable, but measurable, environmental variables. Their approach will not be considered here; in this chapter it will be assumed that environmental variables can be controlled during the experiment. An advantage of including environmental variables in the experimental design is that the analysis can investigate the effect of design variables on specific sources of environmental variation, with the objective of understanding how particular design variable settings might affect the variation in the response due to changes in the environmental variables. This and the subsequent section will consider the application of response surface methodology to these experiments. Section 2.2.4 will show how split-plot designs can be applied to include environmental variables in the experimental design. An example of this type of experiment is the tablet formulation experiment described in Section 2.1 and given in Table 2.1. The usual method that Taguchi advocates for introducing the environmental variation is to construct an experimental design that contains the environmental variable settings and to completely cross this design with the experimental design that contains the design variables. If there are n1 runs in the design array and n2 runs in the environmental array, and the runs are made independently, then there will be n1 x n2 runs for the total experiment.
Thus, the experimental designs advocated by Taguchi can require a prohibitively large number of runs. An alternative approach is to regard the environmental variables as standard experimental variables and to apply the techniques associated with response surface methodology to the combined set of design and environmental variables (see Welch, Yu, Kang, and Sacks [36], Shoemaker, Tsui, and Wu [37], and Box and Jones [38]). This approach can result in considerably smaller and therefore cheaper experiments. As an example of the reduction in the size of the experiment, consider the tablet formulation study of Table 2.1 which had three quantitative design variables, x1, x2, x3, and two quantitative environmental variables, z1, z2. Suppose that all of the variables, both design and environmental, are to be studied at three settings (coded -1, 0, +1), and that each combination was to be run independently and the experiment fully randomized.
TABLE 2.10
TAGUCHI DESIGN FOR THREE DESIGN VARIABLES AND TWO ENVIRONMENTAL VARIABLES
[The 9-run three-level design array for x1, x2, x3 crossed with the 9-run three-level environmental array for z1, z2; the full 81-run listing is not reproduced here.]
Taguchi's approach of using a separate design and environmental array might result in a nine run fractional factorial design for the design
variables and a nine run full factorial design for the environmental variables. The complete crossed design is shown in Table 2.10. It can be seen that it would require 9 x 9 = 81 runs. This design would yield estimates of linear and quadratic effects for the variables and of the interactions between the design and the environmental variables. However, it does not yield any unbiased estimates of the two-factor interactions among the design variables. An alternative design, based on applying response surface methodology to a combined set of design and environmental variables, could result in a smaller number of runs. One such design is the face-centered composite design described in Section 2.2.3. This design would consist of a 16-run, resolution V, fractional factorial design, augmented by a pair of star points for each factor, and a number (n0) of center points. Such a design, with n0 = 4 center points, is shown in Table 2.11. This design will permit the estimation of all the terms of a full second-order model
y = β0 + Σ_{i=1..3} βi xi + Σ_{i=1..3} βii xi^2 + Σ_{i<k} βik xi xk + Σ_{j=1..2} γj zj + Σ_{j=1..2} γjj zj^2 + γ12 z1 z2 + Σ_{i=1..3} Σ_{j=1..2} δij xi zj + ε
Thus, not only will this design estimate all of the linear and quadratic terms and interactions between the design and the environmental variables, but it will also estimate all of the two-factor interactions among the design variables and among the environmental variables. It will accomplish this in only (26 + n0) runs, compared with the 81 runs for the Taguchi design that yields less information. It might be argued that a more reasonable approach for the Taguchi-type design given in Table 2.10 would be to run a two-level design in the environmental variables, since an experimenter is unlikely to be interested in estimating the quadratic effects of the environmental variables. Such a situation would permit the use of a 2^2 full factorial design for the environmental array, and the complete design would require 9 x 4 = 36 runs. It is noted that this is still more than is required for the composite design in Table 2.11. In fact, under the assumption that the quadratic effects of the environmental variables are not of interest, the design of Table 2.11 could be reduced to (22 + n0) runs by eliminating the four star points for the environmental variables.
TABLE 2.11
FACE-CENTERED CENTRAL COMPOSITE DESIGN FOR THREE DESIGN VARIABLES AND TWO ENVIRONMENTAL VARIABLES
[30 runs in the columns x1, x2, x3, z1, z2: a 16-run resolution V fractional factorial (2^(5-1)) cube portion, ten face-centered star points (one variable at -1 or +1, the others at 0), and four center points.]
As another example of the reduction in the number of runs, consider an experiment to investigate three design and four environmental variables, all at three levels. A Taguchi crossed array might use a 3^(3-1) fractional factorial design for the design array and a 3^(4-1) fractional factorial design
for the environmental array, giving a complete crossed design of 9 x 27 = 243 runs. This design would yield estimates of the linear and quadratic effects for all the variables and of the interactions between the design and the environmental variables. However, it does not yield any unbiased estimates of the two-factor interactions among the design variables. An alternative design would be the seven-variable Box-Behnken design shown in Table 2.12. In this table each group of eight runs consists of all eight combinations of ±1 for the three variables indicated, the other four variables being set at their center point, 0. The design is completed with n0 center points, giving a total of (56 + n0) runs.
TABLE 2.12
BOX-BEHNKEN DESIGN FOR THREE DESIGN VARIABLES AND FOUR ENVIRONMENTAL VARIABLES

Runs      x1    x2    x3    z1    z2    z3    z4
1-8        0     0     0    ±1    ±1    ±1     0
9-16      ±1     0     0     0     0    ±1    ±1
17-24      0    ±1     0     0    ±1     0    ±1
25-32     ±1    ±1     0    ±1     0     0     0
33-40      0     0    ±1    ±1     0     0    ±1
41-48     ±1     0    ±1     0    ±1     0     0
49-56      0    ±1    ±1     0     0    ±1     0
n0         0     0     0     0     0     0     0
The design of Table 2.12 will permit the estimation of all the terms of a full second-order model

y = β0 + Σ_{i=1..3} βi xi + Σ_{i=1..3} βii xi^2 + Σ_{i<k} βik xi xk + Σ_{j=1..4} γj zj + Σ_{j=1..4} γjj zj^2 + Σ_{j<l} γjl zj zl + Σ_{i=1..3} Σ_{j=1..4} δij xi zj + ε
Thus, this design will provide data to estimate not only all of the linear and quadratic terms and the interactions between the design and environmental variables, but also all of the two-factor interactions among the design variables and among the environmental variables. As with the
previous design, the Taguchi crossed design gives less information while requiring more runs than a standard second-order design. Note that even if the environmental variables are at two levels, so that a 2^(4-1) fractional factorial design can be run for the Taguchi environmental array, the complete crossed design has 9 x 8 = 72 runs, still more than the Box-Behnken design of Table 2.12, while providing estimates of fewer coefficients of the second-order model. Shoemaker et al. [37] give several examples of the reduction in the number of experimental runs that can occur when it is assumed that some of the terms in the full second-order model are negligible. The reader is warned, however, that assuming a term is negligible is not an assurance that it can be ignored. The presence of terms in the true model that were assumed negligible will bias the estimates of the other coefficients. Box and Jones [20, 38] showed that by considering the experimental objective, it is possible to construct smaller designs without having to assume that certain interactions are negligible. They showed that if the experimenter's objective is to find the design combination that minimizes the variance, then the second-order effects among the environmental variables (that is, the pure quadratic and interaction terms) are not of interest. Consequently, smaller designs can be constructed by aliasing together interactions among the environmental variables. These designs would still enable the unbiased estimation of all other coefficients of the second-order model, even if the interactions among the environmental variables are not negligible. As an example, consider the experiment with three design variables and four environmental variables described above. A design based on the central composite design could be used that would only require (38 + n0) runs. This would be achieved by using as the cube portion a 32-run, resolution IV design that confounded all of the two-factor interactions among the environmental variables with one another, along with six star points for the design variables, and n0 center points. A face-centered composite design of this form, with four center points, is shown in the example given in Section 2.3.5; see Table 2.13. It should be noted that the use of this experimental design does not require the assumption of negligible interactions among the environmental variables, only that they are not of interest. If non-negligible interactions do exist they will not bias the estimates of the other coefficients of the second-order model. To summarize, it has been shown that combining the design and environmental variables into a single set for a response surface design not
only results in experiments that frequently require fewer runs than Taguchi's designs, but also there is considerable flexibility in choosing the designs so that all of the coefficients of interest can be estimated and runs are not wasted to estimate coefficients that can be ignored.
2.3.4 Analysis of experimental designs with environmental variables
Having considered the advantages of designing an experiment with a combined set of design and environmental variables, as opposed to Taguchi's crossed arrays, this section will consider the analysis of such experiments. It should be noted that in contrast to the previous section there is no pure replication of the design points from the response surface design. Consequently it is not possible to estimate the variance at each design point and to fit a model for the variance. The analysis approach in this section is based on fitting a model to the data without distinction as to whether the variables are design or environmental variables. The explicit modeling of the environmental variables has advantages over the modeling of a summary measure of variation such as the standard deviation, which can lead to erroneous conclusions (see Steinberg and Bursztyn [39]). At this stage it is helpful to consider, in general terms, the objective of an experiment to investigate robustness. Consider an experiment with one design variable, x, and one environmental variable, z. The objective is to determine a setting of x that will yield a response that does not change as z varies. From this description it is clear that information on robustness will be contained in the interaction between x and z.
Figure 2.5 Design x Environment interaction plot
Figure 2.5 shows a possible interaction plot of x and z. In this figure the 0 setting of x yields a response that is approximately constant as the environmental variable, z, is changed. This setting yields a response that is robust, or stable, to the environmental variation, z. In contrast, at the other settings of x the response changes as z is varied, indicating that these settings of x do not make the response robust to the environmental variation, z. A good summary of the analysis methods discussed in this section can be found in Myers, Khuri, and Vining [40]. Similar approaches have been described by Welch et al. [36], Shoemaker et al. [37], and Box and Jones [5,38]; see also Myers [41]. Suppose that a response surface design has been run with n design variables, x1, x2, x3, ..., xn, and m environmental variables, z1, z2, z3, ..., zm. During the experiment the environmental variables are controlled at fixed levels and can be regarded as fixed effects. Suppose that the x's and z's are centered and scaled around 0. In this section, several alternative models for the relationship between the design and environmental variables and the response will be considered. Suppose, initially, that the response from the experiment can be adequately modeled by a first-order model in both the design and the environmental factors:

y_xz = β0 + Σ_{i=1..n} βi xi + Σ_{j=1..m} γj zj + ε
In matrix notation,

y_xz = β0 + x'β + z'γ + ε

where β and x are (n x 1) vectors, γ and z are (m x 1) vectors, and the ε are independent N(0, σε^2). In the experiment the environmental variables are controlled at fixed levels, but in reality the environmental variables have a random effect on the response, y_xz. Thus, the actual variation in the response is

V(y_xz) = γ'Vγ + σε^2
where z are random settings of the environmental variables that affect the response in reality (outside of the experiment), and V is the variance-covariance matrix of z. It is clear from this formula that the variance of the response is independent of x, the settings of the design variables. Consequently, there is no opportunity for achieving a more robust response in the presence of the environmental variation, z, by selecting particular settings for the design variables. Consider, now, a second example where the response from the experiment can be adequately modeled by a model that contains linear terms in the design variables, x, and the environmental variables, z, and also cross-product terms xz. Therefore, if there are n design variables, x1, x2, ..., xn, and m environmental variables, z1, z2, ..., zm, then the response, y_xz, can be represented by

y_xz = β0 + Σ_{i=1..n} βi xi + Σ_{j=1..m} γj zj + Σ_{i=1..n} Σ_{j=1..m} δij xi zj + ε        (22)
In matrix notation,

y_xz = β0 + x'β + z'γ + z'Dx + ε
where β and x are (n x 1) vectors, γ and z are (m x 1) vectors, and where D is an (m x n) matrix that contains the coefficients that measure the interactions between the design and the environmental variables. It is assumed that an experimental design has been conducted that will permit estimation of all these two-factor interactions and the main effects of the design and the environmental factors. Box and Jones [21] discuss experimental designs that accomplish this. Now let

g_j(x) = [∂y_xz/∂z_j]_{z=0} = γj + Σ_{i=1..n} δij xi,    for j = 1, ..., m        (25)
Then g(x) = (g_1(x), g_2(x), ..., g_m(x))' = γ + Dx, and g(x) is a measure of the change in the response, as a function of the design variables, in the direction of z at z = 0. Therefore, we have
y_xz = β0 + x'β + z'g(x) + ε        (26)
Now, as before, in reality the environmental variables have a random effect on the response, y_xz. Therefore the actual variance of the response is

V(y_xz) = g(x)'Vg(x) + σε^2        (27)
where g(x) = γ + Dx and V is the variance-covariance matrix of z. From this formula, it can be seen that the variance of the response is a function of the settings of the design variables. Therefore there is an opportunity for making the response robust to the environmental variation by careful selection of the settings of the design variables. Suppose that from an experiment good estimates of the terms of γ and D are obtained and that the elements of V are known. Then the variance of y_xz can be minimized as a function of the design variables, x. Also from equation (26), the mean response level is

E(y_xz) = β0 + x'β        (28)
under the assumption that the random environmental variables have a mean of zero. It can be seen that both V(y_xz) (equation (27)) and E(y_xz) (equation (28)) are essentially response surface models. From an experiment, estimates of γ, D, σε^2, β0, and β can be derived. Suppose, also, that the elements of V are known, or can be estimated. Then the search for a choice of design variables that yields a response that is robust to the environmental variation and close to target will involve an examination of these two response surfaces. At this point, the scientist might proceed by following
the dual response or constrained optimization approaches discussed in Section 2.3.2, or by simply overlaying contour plots of the mean and variance response surfaces. In practice, of course, there could be considerable uncertainty as to values for the elements of V, although it might be possible to estimate them from historical data. If reliable estimates of the values of the elements of V are unavailable then several alternative guesses could be made and the sensitivity of the conclusions to these estimates could be ascertained. If there is some target value, τ, for the response then a measure of closeness of the mean response to that target is the squared deviation

(E(y_xz) - τ)^2        (29)
Box and Jones [38] discussed the use of a general robustness measure of the form

λ(E(y_xz) - τ)^2 + (1 - λ)V(y_xz)        (30)

where 0 ≤ λ ≤ 1. Selection of a particular value for λ corresponds to a particular weighting of the relative importance of being close to target and having small variation. Suppose, now, that the response from the experiment can be adequately represented by a model as in equation (22) but with the addition of pure quadratic and interaction terms for the design variables, x. For n design variables, x1, x2, ..., xn, and m environmental variables, z1, z2, ..., zm, it is supposed that the model for the experiment is

y_xz = β0 + Σ_{i=1..n} βi xi + Σ_{i=1..n} βii xi^2 + Σ_{i<k} βik xi xk + Σ_{j=1..m} γj zj + Σ_{i=1..n} Σ_{j=1..m} δij xi zj + ε
In matrix notation we have

y_xz = β0 + x'β + x'Bx + z'γ + z'Dx + ε
where β and x are (n x 1) vectors, γ and z are (m x 1) vectors, B is an (n x n) matrix that contains the coefficients that measure the interactions and pure quadratic terms among the design variables, and D is an (m x n) matrix that contains the coefficients that measure the interactions between the design and the environmental variables. As before, let g_j(x) be as in equation (25), so that g(x) = γ + Dx is a measure of the change in the response, as a function of the design variables, in the direction of z at z = 0. Therefore, we have

y_xz = β0 + x'β + x'Bx + z'g(x) + ε        (33)
Now, as before, in reality the environmental variables have a random effect on the response, y_xz. Therefore the actual variance of the response is

V(y_xz) = g(x)'Vg(x) + σ_ε²    (34)
where g(x) = γ + Dx. Therefore, the formula for the variance of the actual response is identical to that of the previous model (see equation (27)) and is a function of the settings of the design variables only through γ and D. Therefore, as before, there is an opportunity for making the response robust to the environmental variation by careful selection of the settings of the design variables. Also from equation (33), the mean response level is
E(y_xz) = β0 + x'β + x'Bx,    (35)
under the assumption that the random environmental variables have a mean of zero. The mean response level is now a function of both the first-order and second-order terms in the design variables. Thus, it can be seen that both V(y_xz) and E(y_xz) are quadratic response surface models in x. From an experiment, estimates of γ, D, σ_ε², β0, β, and B can be derived. Suppose, also, that the elements of V are known, or can be estimated. Then, as before, the search for a choice of design variables that yields a response that is robust to the environmental variation and close to target will involve an examination of these two response surfaces, equations (34) and (35).
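As an illustration of how these two surfaces might be examined numerically, the following Python sketch (with made-up values for β0, β, B, γ, D, V, and σ_ε², not taken from the text) evaluates the mean and variance surfaces of equations (34) and (35) over a grid of design settings and reports the setting with the smallest predicted variance.

```python
import numpy as np
from itertools import product

# Hypothetical second-order estimates for two design variables and two
# environmental variables (illustrative values only, not from the text).
beta0 = 50.0
beta = np.array([2.0, -1.5])            # first-order design coefficients
B = np.array([[0.8, 0.3],               # quadratic/interaction matrix for x
              [0.3, -0.5]])
gamma = np.array([3.0, -2.0])           # environmental main effects
D = np.array([[1.0, 0.5],               # d_ji: interaction of z_j with x_i
              [0.0, -1.2]])
V = np.diag([1.0, 1.0])                 # assumed variance-covariance matrix of z
sigma2_eps = 4.0                        # residual variance

def mean_response(x):
    """Equation (35): E(y_xz) = beta0 + x'beta + x'Bx."""
    return beta0 + x @ beta + x @ B @ x

def var_response(x):
    """Equation (34): V(y_xz) = g(x)'Vg(x) + sigma_eps^2, with g(x) = gamma + Dx."""
    g = gamma + D @ x
    return g @ V @ g + sigma2_eps

# Examine both response surfaces over a coarse grid of coded design settings.
grid = [np.array(p) for p in product(np.linspace(-1, 1, 21), repeat=2)]
x_min_var = min(grid, key=var_response)
print("setting with smallest predicted variance:", np.round(x_min_var, 2))
print("predicted variance:", round(var_response(x_min_var), 2),
      " predicted mean:", round(mean_response(x_min_var), 2))
```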
2.3.5 Example
To illustrate the approach described above, consider an experiment with three design variables, x1, x2, x3, and four environmental variables, z1, z2, z3, z4. The objective was to find a setting of the design variables that will lead to a small response with minimum variability due to the environmental variables. Suppose that it was reasonable to assume that the second-order effects (that is, the pure quadratic and interaction terms) among the noise variables were not of interest. The experimental design used was one based on the face-centered central composite design that only required (38 + n_0) runs. This design, described in Section 2.3.3, has as the cube portion a 32-run, resolution IV design that confounds all of the two-factor interactions of the noise factors with one another, along with six star points for the design factors and n_0 center points. The design, with the responses from the experiment, is shown in Table 2.13. The following model, equation (36), was fit to the data.

y_xz = β0 + Σ_{i=1}^{3} β_i x_i + Σ_{i=1}^{3} β_ii x_i² + Σ_{i<i'} β_ii' x_i x_i' + Σ_{j=1}^{4} γ_j z_j + Σ_{j=1}^{4} Σ_{i=1}^{3} d_ji z_j x_i + ε    (36)
It can be seen that this model contains all main effects, all quadratic terms in the design variables, all interactions among the design variables, and all interactions between the design and the environmental variables. An estimate of the pure experimental error can be obtained from the replication at the four center points. The ANOVA table shown in Table 2.14 indicates that there was no significant lack-of-fit of the model. Parameter estimates and t-statistics for this model are shown in Table 2.15. The following model for the response was derived using the significant effects indicated in Table 2.15.

y_xz = 41.83 + 2.50x1 − 3.91x2 + 4.19z1 − 4.38z3 + 2.69x1x2 + 2.38x1z1 − 2.81x2z3 + ε    (37)
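A least-squares fit of a model of this form can be sketched in Python as follows; the design matrix and responses below are random placeholders standing in for the Table 2.13 data, so only the construction of the model columns (main effects, pure quadratics, design interactions, and design x environment interactions) should be read as the point of the example.

```python
import numpy as np

# Placeholder arrays: 'design' holds coded settings of x1..x3 and z1..z4 for
# each run and 'y' the responses (the actual Table 2.13 values are not used here).
rng = np.random.default_rng(0)
design = rng.choice([-1.0, 0.0, 1.0], size=(42, 7))
y = rng.normal(40, 8, size=42)

x, z = design[:, :3], design[:, 3:]
cols, names = [np.ones(len(y))], ["intercept"]

for i in range(3):                                  # design main effects
    cols.append(x[:, i]); names.append(f"x{i+1}")
for j in range(4):                                  # environmental main effects
    cols.append(z[:, j]); names.append(f"z{j+1}")
for i in range(3):                                  # pure quadratics in the design variables
    cols.append(x[:, i] ** 2); names.append(f"x{i+1}^2")
for i in range(3):                                  # design x design interactions
    for k in range(i + 1, 3):
        cols.append(x[:, i] * x[:, k]); names.append(f"x{i+1}x{k+1}")
for i in range(3):                                  # design x environment interactions
    for j in range(4):
        cols.append(x[:, i] * z[:, j]); names.append(f"x{i+1}z{j+1}")

X = np.column_stack(cols)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)        # ordinary least-squares fit
for name, b in zip(names, coef):
    print(f"{name:8s} {b:7.3f}")
```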
TABLE 2.13
EXPERIMENTAL DESIGN AND DATA SET EXAMPLE
Columns: run (1-42), design variables x1, x2, x3, environmental variables z1, z2, z3, z4 (coded -1, 0, +1), and response y.
Responses y for runs 1-42: 46, 41, 40, 30, 41, 45, 23, 45, 56, 41, 24, 58, 46, 52, 40, 48, 44, 41, 39, 25, 48, 35, 21, 44, 47, 49, 33, 57, 45, 56, 46, 44, 31, 50, 48, 31, 36, 40, 34, 42, 37, 39.
TABLE 2.14
ANOVA TABLE FOR DATA IN TABLE 2.13
Source        df    Sum of Squares    Mean Square    F-ratio    p-value
Model         24    2863.16           114.526        7.614      < 0.0001
Error         16    240.67            15.042
 Lack-of-fit  13    206.67            15.898         1.403      0.4394
 Pure Error    3    34.00             11.333
Total         41    3103.83
This can be re-expressed as

y_xz = 41.83 + 2.50x1 − 3.91x2 + 2.69x1x2 + (4.19 + 2.38x1)z1 + (−4.38 − 2.81x2)z3 + ε    (38)

From this we have as the estimated mean response surface

E(y_xz) = 41.83 + 2.50x1 − 3.91x2 + 2.69x1x2    (39)
and, if we assume that the z's are uncorrelated, then the estimated variance response surface is

V(y_xz) = (4.19 + 2.38x1)² σ²_z1 + (−4.38 − 2.81x2)² σ²_z3 + σ_ε²    (40)
Now from the center point runs an estimate of σ_ε² of 11.33 is obtained. Suppose that from previous studies or additional information it is known that good estimates for σ²_z1 and σ²_z3 are 1.0. Using these estimates, the estimated response surface for the variance is

V(y_xz) = (4.19 + 2.38x1)² + (−4.38 − 2.81x2)² + 11.33.    (41)
From equations (40) and (41), it can be seen that x3 has no effect on either the mean response or the variation. Furthermore, it can be seen that both x1 and x2 have an effect on the variation of the response and that an opportunity exists to minimize the effect of the environmental variables z1 and z3 by a particular selection of these two design variables. It should be
noted that the analysis indicates that the other two environmental variables, z2 and z4, do not affect the response.
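Using the coefficient values of equations (39) and (41), a short Python sketch can reproduce these conclusions by evaluating the two estimated surfaces over the experimental region (the grid resolution is arbitrary):

```python
import numpy as np

def mean_hat(x1, x2):
    # Equation (39): estimated mean response surface
    return 41.83 + 2.50 * x1 - 3.91 * x2 + 2.69 * x1 * x2

def var_hat(x1, x2):
    # Equation (41): estimated variance surface with sigma_z1^2 = sigma_z3^2 = 1
    # and sigma_eps^2 estimated as 11.33 from the center points
    return (4.19 + 2.38 * x1) ** 2 + (-4.38 - 2.81 * x2) ** 2 + 11.33

x1, x2 = np.meshgrid(np.linspace(-1, 1, 201), np.linspace(-1, 1, 201))
m, v = mean_hat(x1, x2), var_hat(x1, x2)

i = np.unravel_index(np.argmin(m), m.shape)
j = np.unravel_index(np.argmin(v), v.shape)
print("smallest mean at x1=%.1f, x2=%.1f (mean %.1f)" % (x1[i], x2[i], m[i]))
print("smallest variance at x1=%.1f, x2=%.1f (s.d. %.2f)" % (x1[j], x2[j], np.sqrt(v[j])))
```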
TABLE 2.15
PARAMETER ESTIMATES FOR DATA IN TABLE 2.13
Variable    Estimate    Standard Error    t        p-value
Intercept   38.58       1.51              25.58    < 0.0001
x1          2.50        0.67              3.76     0.0017
x2          -3.91       0.67              -5.88    < 0.0001
x3          0.38        0.67              0.58     0.5734
z1          4.19        0.69              6.11     < 0.0001
z2          -0.06       0.69              -0.09    0.9285
z3          -4.38       0.69              -6.38    < 0.0001
z4          0.75        0.69              1.09     0.2902
x1^2        4.34        2.31              1.88     0.0787
x2^2        0.34        2.31              0.15     0.8850
x3^2        -0.66       2.31              -0.29    0.7787
x1x2        2.69        0.69              3.92     0.0012
x1x3        1.06        0.69              1.55     0.1408
x2x3        0.44        0.69              0.64     0.5324
x1z1        2.38        0.69              3.46     0.0032
x1z2        -0.88       0.69              -1.28    0.2201
x1z3        -0.06       0.69              -0.09    0.9285
x1z4        -0.19       0.69              -0.27    0.7880
x2z1        0.63        0.69              0.91     0.3755
x2z2        0.50        0.69              0.73     0.4764
x2z3        -2.81       0.69              -4.10    0.0008
x2z4        -0.81       0.69              -1.19    0.2533
x3z1        0.13        0.69              0.18     0.8576
x3z2        0.38        0.69              0.55     0.5920
x3z3        0.69        0.69              1.00     0.3309
x3z4        -0.94       0.69              -1.37    0.1904
Figures 2.6 and 2.7 show contour plots of the mean response and the standard deviation. It can be seen that within the experimental region the best setting of the design variables to minimize the response is to choose x1 = −1 and x2 = +1. In terms of minimizing the variation due to the environmental variables the best setting would be x1 = −1 and x2 = −1. Similar conclusions are reached for a range of alternative choices for the estimates of σ²_z1 and σ²_z3, indicating that the conclusions are not oversensitive to the particular estimates chosen for the variances of the environmental variables. Clearly, a compromise between the two objectives of minimizing the mean response and the variation in the response would need to be reached. One possibility would be to minimize the variation subject to the constraint that the mean response would be less than some target value. Alternatively, one could minimize the mean response subject to the constraint that the variation would be less than some target value. If there are different costs involved in operating the process at different settings of the design variables, then a contour plot showing the operating costs can be constructed to help find an operating condition that has low operating cost and reaches a satisfactory compromise between the two objectives.
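The constrained-optimization compromise described above can be sketched with scipy; the target value of 35 for the mean response is an illustrative assumption, not a figure from the text.

```python
import numpy as np
from scipy.optimize import minimize

mean_hat = lambda x: 41.83 + 2.50 * x[0] - 3.91 * x[1] + 2.69 * x[0] * x[1]
var_hat  = lambda x: (4.19 + 2.38 * x[0]) ** 2 + (-4.38 - 2.81 * x[1]) ** 2 + 11.33

target = 35.0   # illustrative upper bound on the mean response
result = minimize(
    var_hat,
    x0=np.zeros(2),
    bounds=[(-1, 1), (-1, 1)],                       # stay inside the experimental region
    constraints=[{"type": "ineq", "fun": lambda x: target - mean_hat(x)}],
    method="SLSQP",
)
print("compromise setting:", np.round(result.x, 2),
      "mean:", round(mean_hat(result.x), 2),
      "s.d.:", round(np.sqrt(var_hat(result.x)), 2))
```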
2.4 SPLIT-PLOT DESIGNS FOR ROBUST DESIGN In Sections 2.2 and 2.3 we considered the application of response surface methodology to the investigation of the robustness of a product or process to environmental variation. The response surface designs discussed in those sections are appropriate if all of the experimental runs can be conducted independently so that the experiment is completely randomized. This section will consider the application of an alternative class of experimental designs, called split-plot designs, to the study of robustness to environmental variation. A characteristic of these designs is that, unlike the response surface designs, there is restricted randomization of the experiment. Section 2.4.1 contains a brief description of split-plot designs and describes several alternative split-plot arrangements. In Section 2.4.2 the precision of the estimates from these split-plot arrangements is considered. Section 2.4.3 contains a discussion of variants of the standard split-plot arrangements. In Section 2.4.4 there is a discussion of the analysis of split-plot designs and a comparison of the design and analysis of split-plot experiments with the design and analysis methods proposed by Taguchi.
Figure 2.6 Contour plot of the mean response surface
Figure 2.7 Contour plot of the variance response surface
2.4.1 Overview of split-plot designs
Split-plot designs occur in a wide range of applications of experimental design. One application area for split-plot designs is when there are some variables that can be applied only to experimental units that are larger than the units to which the other variables can be applied. As an example, consider an investigation on a heat treat process to determine if the temperature of the quench bath, whether parts are stacked horizontally or vertically, and two different types of fixture, affect warpage in metal castings. Suppose the bath can hold eight parts. Then within each bath we can test the effect of stacking and fixture, for example by running 8 parts according to a replicated 2² design in those two factors. We would run that same set-up for quench baths of different temperatures. To investigate the effect of temperature we need larger experimental units (baths), but to investigate the effect of stacking and fixture we can use smaller experimental units (parts). A second application area is when there is a variable that is difficult or expensive to change and so the randomization is restricted to limit the number of changes of that variable. This is accomplished by conducting the experiment in blocks with the restricted variable held constant within a block, but changed randomly between blocks. In this case the large experimental units are the blocks and the smaller experimental units are the individual runs within a block. An excellent exposition of split-plot experimental designs can be found in D.R. Cox's book, "Planning of Experiments" [42]. He states that split-plot designs are particularly useful when one (or more) factors are what he calls classification factors. These factors are included in the experiment to determine whether they modify the effect of the other factors or indicate how the other factors work. The classification factors are included to examine their possible interaction with the other factors. Lower precision is tolerated for comparisons of the classification factors, in order that the precision of the other factors and the interactions can be increased. In the standard terminology associated with split-plot experiments, the classification factors are called whole-plot factors and are applied to the larger experimental units. The smaller experimental units are called subplots. In the following subsections several alternative experimental arrangements of split-plot experiments will be considered. The tablet formulation data given in the example of Table 2.1 in Section 2.1.1 will be
used to illustrate the applicability of these split-plot experiments to designing robust products and processes. Recall that in that example an experiment is to be run with three constituents of the tablet formulation, the design variables, which will be denoted as A, B, and C, and two environmental variables, storage temperature and humidity. The three design variables are arranged in a factorial design which is crossed with a 3² factorial design containing the environmental variables. A set of hypothetical data was shown in Table 2.1. The objective of the experiment is to determine a combination of the design variables that will yield high values for the response across the ranges of temperature and humidity studied in the experiment. The same set of data will be used in the following subsections to illustrate the different analyses for the alternative split-plot designs. Clearly, in practice the correct analysis of the experiment will depend on the particular experimental arrangement that was adopted.
Design (I): environmental factors as whole-plot factors
Using Cox's concept of classification factors, it seems most reasonable to have the classification factors, that is the whole-plot factors, associated with the environmental variables, since they are in fact included primarily to examine their possible interaction with the design variables. Thus, the first arrangement considered is one in which the whole plots contain the environmental variables and the subplots contain the design variables. Now, suppose that there are m levels of the environmental variables, E_1, E_2, ..., E_j, ..., E_m, applied to the whole plots, that there are n levels of the design variables, D_1, D_2, ..., D_i, ..., D_n, applied to the subplots, and that there are l replicates, r_1, r_2, ..., r_k, ..., r_l, with the whole plots in l randomized blocks. For the tablet formulation example given in Table 2.1 in Section 2.1.1, the environmental variables are temperature and humidity that are varied in a climate-controlled chamber and m is 9, the design variables are the quantities of A, B, and C in the tablet formulation and n is 8, and since there is only one replicate l is 1. For the tablet formulation example of Table 2.1, this split-plot arrangement would require m x n x l = 9 x 8 x 1 = 72 tablet formulation batches to be made but only m x l = 9 operations of the climate chamber. The experiment would be conducted by placing in the climate chamber a complete set of 8 different tablet formulations at the same time. A completely randomized experiment (the cross-product experiment of
Taguchi) with no replication would require not only 72 tablet batches, but also 72 operations of the chamber. It is clear, therefore, that this experimental arrangement can be considerably easier to run than the completely randomized cross-product design. The model for arrangement (I) is
=m
+ rk+ Ej + hjk+ Dj + (DE)v+ egk,
where y is the response of the kthreplicate of the ihlevel of factor D, and Vk thejrhlevel of factor E, m is the overall mean, rk is the random effect of the th k replicate, with rk-N(o, 0," ), El is the fixed effect of thejfh level of E, D is the fixed effect of the ifh level of D, (DE) is the interaction effect of the ! I
irh level of D with the j t hlevel of E, hi,NCO,o:), is the whole-plot error,
e -N(O, rJk
0 : ) is the
subplot error, and h and eVk,are independent. lk
The ANOVA table is shown in Table 2.16. In this table Ê_j, D̂_i, and (D̂E)_ij are estimates of E_j, D_i, and (DE)_ij respectively. It can be seen from the ANOVA table that the sources of variation split into two parts, those coming from the whole plots (Env and RxE) and those coming from the subplots (Design, DxE, and Error). In this case the mean square for Env would be tested against that of RxE, and the mean square for Design and for DxE would be tested against that of Error. Now suppose that there is no replication. Then to test Env, Design, and the interaction DxE, estimates of σ_s² and σ_s² + nσ_w² would be required. One possibility is to construct two normal plots, one for whole-plot and one for subplot contrasts, and to pick out as active contrasts those that fall away from a line (see Daniel [43]). Alternatively, if the design and the environmental factors are factorial combinations it may be possible to assume that higher-order interactions are negligible. If this assumption is reasonable then the whole-plot error can be estimated by pooling the higher-order interactions among the environmental variables, and the subplot error can be estimated by pooling the higher-order interactions among the design factors and between the design and the environmental factors. For example, for the tablet formulation data of Table 2.1, assuming that all contrasts involving three or more factors are estimating error, the following ANOVA table (see Table 2.17) is obtained.
TABLE 2.16
ANOVA TABLE FOR ARRANGEMENT (I)
Source       df               Sum of Squares          Expected Mean Square
Reps (R)     l-1                                      σ_s² + nσ_w² + mnσ_r²
Env (E)      m-1              nl Σ_j Ê_j²             σ_s² + nσ_w² + nl Σ_j E_j² / (m-1)
RxE          (l-1)(m-1)                               σ_s² + nσ_w²
Design (D)   n-1              lm Σ_i D̂_i²             σ_s² + lm Σ_i D_i² / (n-1)
DxE          (m-1)(n-1)       l Σ_i Σ_j (D̂E)_ij²      σ_s² + l Σ_i Σ_j (DE)_ij² / ((m-1)(n-1))
Error        m(l-1)(n-1)                              σ_s²
For the subplot analysis it appears that the effects due to A, and the interaction between B and Humidity are real, with some evidence of an interaction between B and Temperature. It is possible to split the two degrees of freedom for Temperature and Humidity into linear and quadratic contrasts and to construct a normal probability plot for the whole plot contrasts. This would reveal important effects due to the linear components of both Temperature and Humidity.
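The subplot F-tests of Table 2.17 follow directly from the sums of squares once the higher-order interactions have been pooled into an error term; a minimal Python sketch of that calculation, using the values quoted in the table:

```python
import scipy.stats as st

# Sums of squares and degrees of freedom as in Table 2.17 (arrangement (I)).
subplot_effects = {           # tested against the pooled higher-order interactions
    "A": (938.9, 1), "B": (60.5, 1), "C": (144.5, 1),
    "AxB": (24.5, 1), "AxC": (40.5, 1), "BxC": (18.0, 1),
    "AxT": (62.1, 2), "AxH": (17.0, 2), "BxT": (399.0, 2),
    "BxH": (799.1, 2), "CxT": (65.3, 2), "CxH": (97.6, 2),
}
error_ss, error_df = 2759.5, 45   # pooled three-factor and higher interactions
error_ms = error_ss / error_df

for name, (ss, df) in subplot_effects.items():
    f = (ss / df) / error_ms
    p = st.f.sf(f, df, error_df)
    print(f"{name:4s} F = {f:5.2f}  p = {p:.4f}")
```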
Design (II): design factors as whole-plot factors
An alternative arrangement to design (I) would be to have the design variables as the whole-plot factors and the subplots contain the environmental variables. As before, suppose that there are m levels of the environmental variables, E_1, E_2, ..., E_j, ..., E_m, that there are n levels of the design variables, D_1, D_2, ..., D_i, ..., D_n, and that there are l replicates, r_1, r_2, ..., r_k, ..., r_l, with the whole plots in l randomized blocks. The contrast with arrangement (I) is that in this arrangement the environmental variables are applied to the subplots, and not the whole plots, whereas the design variables are applied to the whole plots, not the subplots. For the tablet formulation example, this experimental arrangement would arise if eight tablet formulation batches were made according to the eight different design combinations and that each of these batches were
divided into nine sub-batches and that each of the 72 sub-batches were placed individually in the chamber for the appropriate setting of temperature and humidity. This would require only n x l = 8 tablet formulation batches to be made but would require m x n x l = 72 operations of the chamber. A completely randomized experiment in which there was no replication would have required 72 tablet batches and also 72 operations of the chamber. It is clear, therefore, that this experimental arrangement can be considerably easier to run than the completely randomized design.
TABLE 2.17
ANOVA FOR THE TABLET FORMULATION DATA OF TABLE 2.1 FOR ARRANGEMENT (I)
Source                      df    SS       MS      F-ratio
Whole Plot (Env)
  Temp (T)                  2     1204.1   602.1
  Humidity (H)              2     1199.4   599.7
  TxH                       4     350.5    87.6
Design
  A                         1     938.9    938.9   15.31
  B                         1     60.5     60.5    0.99
  C                         1     144.5    144.5   2.36
  AxB                       1     24.5     24.5    0.40
  AxC                       1     40.5     40.5    0.66
  BxC                       1     18.0     18.0    0.29
Design x Env
  AxT                       2     62.1     31.1    0.51
  AxH                       2     17.0     8.5     0.14
  BxT                       2     399.0    199.5   3.25
  BxH                       2     799.1    399.5   6.52
  CxT                       2     65.3     32.7    0.53
  CxH                       2     97.6     48.8    0.66
Higher Order Interactions   45    2759.5   61.3

F_1,45(.05) = 4.06; F_1,45(.01) = 7.23; F_2,45(.05) = 3.20; F_2,45(.01) = 5.11
The model for experimental arrangement (II) is

y_ijk = m + r_k + D_i + q_ik + E_j + (DE)_ij + e_ijk,    (43)

where, as before, y_ijk is the response of the kth replicate of the ith level of factor D and the jth level of factor E, m is the overall mean, r_k is the random effect of the kth replicate, with r_k ~ N(0, σ_r²), E_j is the fixed effect of the jth level of E, D_i is the fixed effect of the ith level of D, (DE)_ij is the interaction effect of the ith level of D with the jth level of E, and e_ijk ~ N(0, σ_s²) is the subplot error.
In arrangement (II), q_ik ~ N(0, σ_w²) is the whole-plot error, and q_ik and e_ijk are independent. The ANOVA table is shown in Table 2.18. This table shows that the sources of variation can be split into two parts, those coming from the whole plots (Design and RxD) and those coming from the subplots (Env, DxE, and Error). In this case the mean square for Design would be tested against that of RxD, and the mean square for Env and for DxE would be tested against that of Error.
TABLE 2.18
ANOVA TABLE FOR ARRANGEMENT (II)
Source       df               Sum of Squares          Expected Mean Square
Reps (R)     l-1                                      σ_s² + mσ_w² + mnσ_r²
Design (D)   n-1              lm Σ_i D̂_i²             σ_s² + mσ_w² + lm Σ_i D_i² / (n-1)
RxD          (l-1)(n-1)                               σ_s² + mσ_w²
Env (E)      m-1              nl Σ_j Ê_j²             σ_s² + nl Σ_j E_j² / (m-1)
DxE          (m-1)(n-1)       l Σ_i Σ_j (D̂E)_ij²      σ_s² + l Σ_i Σ_j (DE)_ij² / ((m-1)(n-1))
Error        (l-1)n(m-1)                              σ_s²
As has been already pointed out, of course, the constructed data would only be appropriate for the model that reflected the way in which the experiment was carried out. However, to illustrate the analysis the same data from Table 2.1 will be used. The ANOVA table for the tablet formulation data, assuming that the experiment was run according to arrangement (II), is shown in Table 2.19. For this arrangement, higher-order interactions are assumed to be negligible and their sums of squares are pooled to give an estimate of error. From the ANOVA table it appears that in the sub-plot analysis there
are real effects due to both Temperature and Humidity as well as the interaction between B and Humidity. Further analysis would reveal that both the Temperature and Humidity effects are predominantly linear rather than quadratic. A normal plot of the whole-plot contrasts involving the design variables could be constructed and, in this example, would indicate the importance of factor A.
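A normal or half-normal plot of the whole-plot contrasts is straightforward to produce; the sketch below (Python, with placeholder contrast values rather than those of the Table 2.1 data) plots the absolute contrasts against half-normal quantiles so that inert effects fall near a line through the origin and active effects fall away from it.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

def half_normal_plot(effects, labels, title):
    """Half-normal plot of contrasts: inert effects lie near a line through
    the origin, active effects fall away from it (Daniel [43])."""
    absef = np.abs(np.asarray(effects, dtype=float))
    order = np.argsort(absef)
    n = len(absef)
    q = norm.ppf(0.5 + 0.5 * (np.arange(1, n + 1) - 0.5) / n)   # plotting positions
    plt.figure()
    plt.plot(q, absef[order], "o")
    for qi, yi, lab in zip(q, absef[order], np.asarray(labels)[order]):
        plt.annotate(lab, (qi, yi), textcoords="offset points", xytext=(4, 2))
    plt.xlabel("half-normal quantile")
    plt.ylabel("|contrast|")
    plt.title(title)

# Hypothetical whole-plot and subplot contrasts (placeholders, not the Table 2.1 values):
half_normal_plot([5.1, -0.4, 0.8], ["T_lin", "T_quad", "H_lin"], "whole-plot contrasts")
half_normal_plot([6.8, 0.3, -0.5, 4.2], ["A", "B", "C", "BxH"], "subplot contrasts")
plt.show()
```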
TABLE 2.19
ANOVA FOR THE TABLET FORMULATION DATA OF TABLE 2.1 FOR ARRANGEMENT (II)
Source                      df    SS       MS      F-ratio
Whole-Plot (Design)
  A                         1     938.9    938.9
  B                         1     60.5     60.5
  C                         1     144.5    144.5
  AxB                       1     24.5     24.5
  AxC                       1     40.5     40.5
  BxC                       1     18.0     18.0
  AxBxC                     1     72.0     72.0
Env
  Temp (T)                  2     1204.1   602.1   9.86
  Humidity (H)              2     1199.4   599.7   9.82
  TxH                       4     350.5    87.6    1.43
Design x Env
  AxT                       2     62.1     31.1    0.51
  AxH                       2     17.0     8.5     0.14
  BxT                       2     399.0    199.5   3.25
  BxH                       2     799.1    399.5   6.52
  CxT                       2     65.3     32.7    0.53
  CxH                       2     97.6     48.8    0.66
Higher Order Interactions   44    2687.5   61.1

F_1,44(.05) = 4.06; F_1,44(.01) = 7.25; F_2,44(.05) = 3.21; F_2,44(.01) = 5.12
Design (III): strip-block design
Let us now consider an experimental arrangement where the subplot levels are assigned randomly in strips across each block of whole-plot levels. Such arrangements are frequently called strip-block designs. As an illustration of this arrangement, suppose that we have a whole-plot variable with three levels, a_1, a_2, and a_3, a subplot variable with two levels, b_1 and
b_2, and three blocks. Pictorially, the design is represented by the following figure (Figure 2.8).

Block 1: a_2  a_1  a_3
Block 2: a_3  a_2  a_1
Block 3: a_1  a_3  a_2
Figure 2.8 Strip-block design

There is a certain symmetry with the allocation of the two variables, to the extent that it seems that both variables could equally well be designated as the whole-plot variable. Suppose, as before, that there are m levels of the environmental variables, E_1, E_2, ..., E_j, ..., E_m, that there are n levels of the design variables, D_1, D_2, ..., D_i, ..., D_n, and that there are l replicates, r_1, r_2, ..., r_k, ..., r_l. For the tablet formulation example, the strip-block arrangement would arise if n x l = 8 tablet formulation batches were made according to the design combinations and that each of these batches were divided into m x l = 9 sub-batches. One sub-batch from each of the eight batches is then selected at random and these eight are placed in the chamber at the same time at the appropriate setting of temperature and humidity. This design would require only n = 8 tablet formulation batches to be made and only m = 9 operations of the chamber. Strip-block experiments, such as the one described in this section, are clearly considerably easier to run than either the completely randomized product design or either of the split-plot designs described above, that is, arrangements (I) and (II). The model appropriate for the strip-block arrangement is

y_ijk = m + r_k + E_j + h_jk + D_i + q_ik + (DE)_ij + e_ijk,    (44)

where, as before, y_ijk is the response of the kth replicate of the ith level of factor D and the jth level of factor E, m is the overall mean, r_k is the random effect of the kth replicate, with r_k ~ N(0, σ_r²), E_j is the fixed effect of the jth level of E, D_i is the fixed effect of the ith level of D, and (DE)_ij is the interaction effect of the ith level of D with the jth level of E. In arrangement
(III), h_jk ~ N(0, σ_E²), q_ik ~ N(0, σ_D²), e_ijk ~ N(0, σ²), and h_jk, q_ik, and e_ijk are independent. The ANOVA table is shown in Table 2.20. This table shows that the sources of variation can be split into three parts. The mean square for the environmental variables is tested against that of RxE; the mean square for the design variables is tested against that of RxD; and the mean square for the design x environment interactions is tested against that of RxDxE. When there is no replication and the design is sufficiently large, three normal plots can be constructed: one for the design variable contrasts, one for the environmental variable contrasts, and one for the design x environment interaction contrasts. Alternatively, (σ² + nσ_E²) could be estimated by pooling the higher-order interactions among the environmental variables, (σ² + mσ_D²) by pooling the higher-order interactions among the design variables, and σ² by pooling the higher-order interactions among the design x environment interactions.
TABLE 2.20
ANOVA TABLE FOR ARRANGEMENT (III)
Source       df                  Sum of Squares          Expected Mean Square
Reps (R)     l-1                                         σ² + mσ_D² + nσ_E² + mnσ_r²
Env (E)      m-1                 nl Σ_j Ê_j²             σ² + nσ_E² + nl Σ_j E_j² / (m-1)
RxE          (l-1)(m-1)                                  σ² + nσ_E²
Design (D)   n-1                 lm Σ_i D̂_i²             σ² + mσ_D² + lm Σ_i D_i² / (n-1)
RxD          (l-1)(n-1)                                  σ² + mσ_D²
DxE          (m-1)(n-1)          l Σ_i Σ_j (D̂E)_ij²      σ² + l Σ_i Σ_j (DE)_ij² / ((m-1)(n-1))
RxDxE        (l-1)(m-1)(n-1)                             σ²
To illustrate the analysis the same data given in Table 2.1 will be used, assuming that the experiment was conducted as a strip-block design. The ANOVA table for the tablet formulation data is given in Table 2.21.
As was discussed with arrangement (I), it is possible to split the two degrees of freedom for Temperature and Humidity into linear and quadratic contrasts and to construct a normal probability plot for the environmental variable contrasts. This would reveal important effects due to the linear components of both Temperature and Humidity. A normal plot for the design contrasts would indicate that there appears to be a real effect due to A. The analysis of the design x environment interactions is obtained by pooling together higher-order interactions to obtain an estimate of the error term σ². This analysis indicates that the interaction between B and Humidity appears to be the only real interaction effect.
TABLE 2.21
ANOVA FOR THE TABLET FORMULATION DATA OF TABLE 2.1 FOR ARRANGEMENT (III)
Source                      df    SS       MS      F-ratio
Temp (T)                    2     1204.1   602.1
Humidity (H)                2     1199.4   599.7
TxH                         4     350.5    87.6
A                           1     938.9    938.9
B                           1     60.5     60.5
C                           1     144.5    144.5
AxB                         1     24.5     24.5
AxC                         1     40.5     40.5
BxC                         1     18.0     18.0
AxBxC                       1     72.0     72.0
AxT                         2     62.1     31.1    0.51
AxH                         2     17.0     8.5     0.14
BxT                         2     399.0    199.5   3.25
BxH                         2     799.1    399.5   6.52
CxT                         2     65.3     32.7    0.53
CxH                         2     97.6     48.8    0.66
Higher Order Interactions   44    2687.5   61.1

F_2,44(.05) = 3.21; F_2,44(.01) = 5.12
2.4.2 Precision of split-plot designs
The above descriptions of alternative split-plot arrangements show that the ease of experimentation can vary depending on the particular experimental design followed. However, in considering which experimental design to adopt, the investigator should weigh many other criteria besides ease of experimentation. One criterion of importance is the precision of the estimates of the effects that these arrangements yield. It can be shown, see for example Kempthorne [18], Box and Jones [5], that for both designs (I) and (II) the whole-plot effects are determined less precisely than with the cross-product design, but that the subplot variables and the design x environment interactions are determined more precisely. For the strip-block design both the design and environmental variable effects are determined less precisely than with the cross-product design, but the design x environment interactions are determined more precisely. When the strip-block design is compared with both split-plot designs (I) and (II), the whole-plot effects are determined with the same precision, the sub-plot effects are determined with less precision but the design x environment interactions are determined with more precision.

2.4.3 Variants of split-plot designs
Adaptations of the split-plot methodology have been suggested by many authors (see, for example, Kempthorne [18], Cochran and Cox [44]). These authors describe various blocking arrangements to control for other sources of variation in split-plot experiments. The relevance of some of these arrangements to split-plot designs that investigate the influence of environmental variation is discussed in Box and Jones [5]. One of the most useful adaptations of the split-plot design is the confounding of higher-order split-plot interactions when the split-plot treatments are in a factorial design; this was first suggested in Bartlett [45]. Such an experimental design requires fewer sub-plots within each whole plot, but still enables the required effects to be estimated. When the whole-plot design is a factorial design then it may be possible to reduce the number of whole plots required by confounding certain whole-plot interactions. The use of factorial and fractional factorial designs in split-plot arrangements has been investigated by Addelman [46], see also Daniel [47]. As an example of such an arrangement, consider a tablet formulation experiment with two environmental variables, temperature (T) and humidity (H), and five design variables, A, B, C, D, and E; with all of the
variables at two levels. Suppose that it has been decided that the environmental variables will be assigned to the whole plots and the design variables assigned to the sub-plots (design arrangement (I)). Suppose that the chamber can hold no more than 20 batches of tablets at a time. With this constraint it is no longer possible to use a full factorial for the design factors within each whole plot (run of the chamber) since this would require 2⁵ = 32 tablet batches for each run of the chamber. An alternative design would be to use a half-fraction of the design variables for each run of the chamber. Such a design, before randomization, is shown in Table 2.22. With this design the ABCDE five-factor interaction is confounded with the TxH whole-plot contrast. Under the assumption of negligible three-factor and higher-order interactions all main effects and two-factor interactions can be estimated as well as interactions between the design and the environmental variables.

2.4.4 Analysis of split-plot designs for robust experimentation
The appropriate analysis of data obtained from an experiment should be determined by the experimental design used to obtain those data. The fundamental characteristic of split-plot designs is that there are experimental units of different sizes and consequently multiple sources of variation. The analysis needs to take account of this structure and include multiple error terms and to test the significance of effects and interactions against the appropriate error term. This has been illustrated above with the three experimental arrangements for split-plot and strip-block designs. If there is replication of the experiment then an independent estimate of the error terms can be calculated and valid statistical tests, such as ANOVA, can be constructed. In split-plot and strip-block designs that are unreplicated, there is no independent estimate of the appropriate error terms available. Several alternative analysis approaches have been advocated. A number of authors, for example, Mason, Gunst, and Hess [48] (p. 370), suggest estimating the error terms by combining higher-order interactions that are assumed to be negligible. An alternative is to construct separate normal or half-normal probability plots for the effects and interactions calculated from the different types of experimental units. Then under the assumption of effect sparsity the slopes of the lines from the inert effects can be used to estimate the separate error terms. An alternative approach, which also depends on the assumption of effect sparsity, suggested by Box and Jones [5], would be to construct separate
TABLE 2.22
EXAMPLE OF A SPLIT-PLOT DESIGN USING A FRACTIONAL FACTORIAL
Chamber Run 1: Temp = −, Humidity = −; Chamber Run 2: Temp = +, Humidity = −; Chamber Run 3: Temp = −, Humidity = +; Chamber Run 4: Temp = +, Humidity = +. Within each chamber run the sub-plots consist of a 16-run half-fraction of the 2⁵ design in the design variables A, B, C, D, E.
Bayesian probability plots (Box and Meyer [49]) for the contrasts from different types of experimental units. The split-plot arrangement has similarities with Taguchi's crossed designs since both arrangements divide the factors into two groups; in the split-plot terminology the factors are assigned to either whole plots or subplots, in Taguchi's terminology the factors are either assigned to an inner (design) array or an outer (noise) array. Although there are similarities in the appearance of the designs, there are marked differences in the analysis of these designs. Some of these differences reflect different philosophical approaches to data analysis. Taguchi's analysis of robust design experiments is frequently conducted in terms of a performance statistic, such as a signal-to-noise ratio, that is calculated for each point of the design array using data obtained from the environmental (noise) array about that point.
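For comparison, the kind of summary statistic Taguchi's analysis uses can be sketched in a few lines of Python; the responses below are placeholders, and the "larger-the-better" signal-to-noise ratio is shown because the tablet example seeks high responses.

```python
import numpy as np

# Placeholder outer-array responses for two design points (rows), each measured
# across the same set of environmental (noise) conditions (columns).
responses = np.array([
    [46.0, 41.0, 40.0, 30.0, 41.0],
    [56.0, 52.0, 48.0, 44.0, 41.0],
])

# Taguchi 'larger-the-better' signal-to-noise ratio for each design point:
# SN = -10 log10( mean(1 / y^2) ). Each row is collapsed to a single number,
# which is what hides the individual design x environment interactions.
sn = -10.0 * np.log10(np.mean(1.0 / responses**2, axis=1))
for i, value in enumerate(sn, start=1):
    print(f"design point {i}: SN = {value:.2f} dB")
```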
The use of a single-valued summary statistic, such as a signal-to-noise ratio, has been considered in Box [30]. One of the criticisms of these signal-to-noise ratios is that they can obscure information that is contained in the data and thus limit the impact that the experimenter can have as they study the data. From an analysis of a signal-to-noise ratio, the experimenter does not know which of the environmental variables have significant interactions with the design variables and the magnitude of those interactions. This information, if it were available, might suggest ideas to the experimenter as to the underlying scientific theory which might influence the future course of the investigation.
Figure 2.9 Interaction of factor B and humidity in data from Table 2.1

As was discussed in Section 2.3.4, the effects of the environmental variables on the design variables can be determined by studying the interactions between the design and environmental variables. To illustrate this, for the tablet formulation with the split-plot design arrangement (I) it was concluded that there was a significant interaction between humidity and factor B (see Table 2.17). This interaction is illustrated in Figure 2.9. From this figure it can be seen that by using the +1 setting of B the tablet is less sensitive to the changes in humidity. This information could be important to the subject matter specialist who might know of similar
constituents that could be included in the tablet formulation to make it even more robust to changes in the humidity. It is the investigation of these design x environment interactions that is the key to understanding robustness and could lead to new aspects in the investigation and to significant improvements in the robustness of products. Therefore, a preferred analysis is one that identifies the significance and magnitude of individual design x environment interactions rather than an analysis in terms of a signal-to-noise ratio or a standard deviation that would obscure information that would be present in the individual interactions. We have seen that split-plot and strip-block designs can be considerably more convenient to conduct than the cross-product designs advocated by Taguchi. In particular, since in robust designs we are not specifically interested in the main effects of the environmental variables and would be prepared to accept lower precision in our estimates of these main effects, it would generally be more appropriate to have the environmental variables as whole-plot factors, as in arrangement (I). Alternatively, a strip-block design, arrangement (III), might be the most convenient design and would yield precise estimates of the key design x environment interactions. With regard to analysis, since Taguchi's analysis is frequently conducted in terms of a single performance measure, such as a signal-to-noise ratio, that is calculated for each point of the design array, he ignores any information that might be contained in particular design x environment interactions. It is these interactions that are key to understanding robustness of product designs. In contrast, the split-plot and strip-block designs enable efficient estimation of these interactions. Therefore, split-plot designs and strip-block designs are of tremendous value in robust design experiments since they permit the precise estimation of the interactions of interest and can be considerably easier to run than the cross-product designs that have traditionally been advocated.
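An interaction such as the one in Figure 2.9 can be examined directly from the cell means; the sketch below uses hypothetical means (not the Table 2.1 values) purely to show the plotting step.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical cell means (placeholders for the Table 2.1 data) illustrating how a
# design x environment interaction such as B x Humidity can be examined directly.
cells = pd.DataFrame({
    "B":        [-1, -1, -1, +1, +1, +1],
    "Humidity": [-1,  0, +1, -1,  0, +1],
    "mean_y":   [52, 44, 31, 47, 46, 44],
})

for level, grp in cells.groupby("B"):
    plt.plot(grp["Humidity"], grp["mean_y"], marker="o", label=f"B = {level:+d}")

plt.xlabel("Humidity")
plt.ylabel("mean response")
plt.legend()
plt.title("B x Humidity interaction")
plt.show()
```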
2.5 CONCLUSIONS
The concept of designing products and processes that are robust, or stable, to environmental variation is clearly very important. Robust design enables the experimenter to discover how to modify the design of a product or process to minimize the effects due to variation from environmental sources that are difficult, if not impossible, to control.
In this chapter the use of statistical experimental designs in designing products and processes to be robust to environmental conditions has been considered. The focus has been on two classes of experimental design, response surface designs and split-plot designs. The choice of an appropriate experimental design depends on the experimental circumstances. Box and Draper [12] (p. 502, 305) list a series of experimental circumstances that should be considered by the investigator when selecting a response surface design. Many of these considerations also apply to split-plot designs, and to experimental design in general. In the response surface strategy that was discussed in Section 2.3 standard response surface techniques are used to generate two response surface models, one for the mean response and one for the standard deviation of the response (or some function of the standard deviation). The standard deviation measures the stability of the response to the environmental variation. Standard analysis can reveal which factors affect the mean only, which only affect the variability, and which affect both the mean and the variability. The researcher can then apply optimization methods or construct contour plots of the mean and standard deviation response surfaces to determine settings of the design variables that will give a mean response that is close to the target with minimum variation. Taguchi's designs, a cross-product of two experimental designs, one for design variables and one for environmental variables, can require an excessive number of runs. In Section 2.3.3 it was shown how the number of runs can be substantially reduced by constructing a single experimental design that combined both the design variables and the environmental variables. The designs associated with response surface methodology offer considerable flexibility and can be built sequentially so that experimental resources can be used efficiently. The analysis proposed by Taguchi involves the construction of a signal-to-noise ratio that combines both the mean and variability. Thus, with Taguchi's analysis there is a missed opportunity for a deeper understanding of the different variables that might affect the mean and the variability. As has been noted in this chapter, any restriction on the randomization of the experiment will lead the investigator to conduct one of the split-plot designs that were described in Section 2.4. In that section it was shown that the split-plot type designs can be a more efficient way to run robust design experiments than the cross-product arrays of Taguchi. Furthermore, the standard methods of analysis of split-plot experiments, that seek to
estimate individual design x environment interactions, will yield more information than the signal-to-noise ratios proposed by Taguchi.
ACKNOWLEDGEMENTS The author expresses his appreciation to Denis Janky, David Rose, and Rod Tjoelker for their helpful comments on earlier versions of this chapter.
REFERENCES
[1] F. Yates and W.G. Cochran, The analysis of groups of experiments, Journal of Agricultural Science, 28 (1938) 556-580.
[2] G. Wernimont, Ruggedness evaluation of test procedures, Standardization News, 5 (1977) 13-16.
[3] W.J. Youden, Experimental design and ASTM committee, Materials Research and Standards, 1 (1961) 862-867. Reprinted in Precision Measurement and Calibration (Vol. 1, Special Publication 300), Gaithersburg, MD: National Bureau of Standards, 1969, ed. H.H. Ku.
[4] W.J. Youden, Physical measurement and experimental design. Reprinted in Precision Measurement and Calibration (Vol. 1, Special Publication 300, 1961), Gaithersburg, MD: National Bureau of Standards, 1969, ed. H.H. Ku.
[5] G.E.P. Box and S.P. Jones, Split-plot designs for robust product experimentation, Journal of Applied Statistics, 19 (1992) 3-26.
[6] G.E.P. Box and K.B. Wilson, On the experimental attainment of optimum conditions, Journal of the Royal Statistical Society, Series B, 13 (1951) 1-45.
[7] G.E.P. Box, The exploration and exploitation of response surfaces: some general considerations and examples, Biometrics, 10 (1954) 16-60.
[8] G.E.P. Box and P.V. Youle, The exploration and exploitation of response surfaces: an example of the link between the fitted surface and the basic mechanism of the system, Biometrics, 11 (1955) 287-323.
[9] G.E.P. Box, J.S. Hunter and W.G. Hunter, Statistics for Experimenters, New York, Wiley, 1978.
[10] J.A. Cornell, How to Apply Response Surface Methodology, Basic References in Quality Control: Statistical Techniques, Vol. 8, Milwaukee: American Society for Quality Control, 1985.
[11] R.H. Myers, Response Surface Methodology, Boston, Allyn and Bacon, 1971.
[12] G.E.P. Box and N.R. Draper, Empirical Model Building and Response Surfaces, New York, Wiley, 1986.
[13] A.I. Khuri and J.A. Cornell, Response Surfaces: Designs and Analyses, New York, Marcel Dekker, 1987.
[14] R.L. Plackett and J.P. Burman, The design of optimum multifactorial experiments, Biometrika, 33 (1946) 305-325.
[15] N.R. Draper and D.K.J. Lin, Projection properties of Plackett and Burman designs, Technometrics, 34 (1992) 423-428.
[16] M. Hamada and C.F.J. Wu, Analysis of designed experiments with complex aliasing, Journal of Quality Technology, 24 (1992) 130-137.
[17] G.E.P. Box and R.D. Meyer, Finding the active factors in fractional screening experiments, Journal of Quality Technology, 25 (1993) 94-105.
[18] O. Kempthorne, The Design and Analysis of Experiments, New York, Wiley, 1952.
[19] G.E.P. Box and D.W. Behnken, Some new three-level designs for the study of quantitative variables, Technometrics, 2 (1960) 455-475.
[20] G.E.P. Box and S.P. Jones, Robust product designs, part II: second-order models, Report No. 63 (1990), Center for Quality and Productivity Improvement, University of Wisconsin-Madison.
[21] G.E.P. Box and S.P. Jones, Robust product designs, part I: first-order models with design x environment interactions, Report No. 62 (1990), Center for Quality and Productivity Improvement, University of Wisconsin-Madison.
[22] A.C. Atkinson and A.N. Donev, Optimum Experimental Designs, New York, Oxford University Press, 1992.
[23] T.J. Mitchell, An algorithm for the construction of D-optimal experimental designs, Technometrics, 16 (1974) 203-210.
[24] G.E.P. Box, Choice of response surface design and alphabetic optimality, Utilitas Mathematica, 21B (1982) 11-55.
[25] H.O. Hartley, Smallest composite designs for quadratic response surfaces, Biometrics, 15 (1959) 611-624.
[26] P.W.M. John, Statistical Design and Analysis of Experiments, New York, Macmillan, 1971.
[27] R.A. McLean and V.L. Anderson, Applied Factorial and Fractional Designs, New York, Marcel Dekker, 1984.
[28] D.H. Doehlert, Uniform shell designs, Journal of the Royal Statistical Society, Series C, 19 (1970) 231-239.
[29] M.S. Bartlett and D.G. Kendall, The statistical analysis of variance heterogeneity and the logarithm transformation, Journal of the Royal Statistical Society, Series B, 8 (1946) 128-150.
[30] G.E.P. Box, Signal-to-noise ratios, performance criteria, and transformations (with discussion), Technometrics, 30 (1988) 1-40.
[31] G.G. Vining and R.H. Myers, Combining Taguchi and response surface philosophies: a dual response approach, Journal of Quality Technology, 22 (1990) 38-45.
[32] R.H. Myers and W.H. Carter, Jr., Response surface techniques for dual response systems, Technometrics, 15 (1973) 301-317.
[33] A.E. Hoerl, Optimum solution of many variables equations, Chemical Engineering Progress, 55 (1959) 69-78.
[34] E. Del Castillo and D.C. Montgomery, A nonlinear programming solution to the dual response problem, Journal of Quality Technology, 25 (1993) 199-204.
[35] A.E. Freeny and V.N. Nair, Robust parameter design with uncontrolled noise variables, Statistica Sinica, 2 (1992).
[36] W.J. Welch, T.K. Yu, S.M. Kang and J. Sacks, Computer experiments for quality control by parameter design, Journal of Quality Technology, 22 (1990) 15-22.
[37] A.C. Shoemaker, K.L. Tsui and C.F.J. Wu, Economical experimentation methods for robust design, Technometrics, 33 (1991) 415-427.
[38] G.E.P. Box and S.P. Jones, Designing products that are robust to the environment, Total Quality Management, 3 (1992) 265-282.
[39] D.M. Steinberg and D. Bursztyn, Dispersion effects in robust-design experiments with noise factors, Journal of Quality Technology, 26 (1994) 12-20.
[40] R.H. Myers, A.I. Khuri and G.G. Vining, Response surface alternatives to the Taguchi robust parameter design approach, American Statistician, 46 (1992) 131-139.
[41] R.H. Myers, Response surface methodology in quality improvement, Communications in Statistics: Theory and Methods, 20 (1991) 457-476.
[42] D.R. Cox, Planning of Experiments, New York, Wiley, 1958.
[43] C. Daniel, Use of half-normal plots in interpreting factorial two-level experiments, Technometrics, 1 (1959) 311-341.
[44] W.G. Cochran and G.M. Cox, Experimental Design, New York, Wiley, 1957.
[45] M.S. Bartlett, Discussion of "Complex experiments" by F. Yates, Journal of the Royal Statistical Society, Series B, 2 (1935) 224-226.
[46] S. Addelman, Some two-level factorial plans with split-plot confounding, Technometrics, 6 (1964) 253-258.
[47] C. Daniel, Applications of Statistics to Industrial Experimentation, New York, Wiley, 1976.
[48] R.L. Mason, R.F. Gunst and J.L. Hess, Statistical Design and Analysis of Experiments, New York, Wiley, 1989.
[49] G.E.P. Box and R.D. Meyer, An analysis for unreplicated fractional factorials, Technometrics, 28 (1986) 11-18.
Chapter 3
REVIEW OF THE USE OF ROBUSTNESS AND RUGGEDNESS IN ANALYTICAL CHEMISTRY
Y. VANDER HEYDEN AND D.L. MASSART
ChemoAC, Pharmaceutical Institute, Vrije Universiteit Brussel, Laarbeeklaan 103, B-1090 Brussels, Belgium
3.1 INTRODUCTION This review describes the determination of robustness and ruggedness in analytical chemistry. The terms ruggedness and robustness as used in method validation are sometimes considered to be equivalent [1,2]. In other publications a difference is made between the two terms [3]. In the following only the term ruggedness will be used. The ruggedness of an analytical method can generally be described as the ability to reproduce an analytical method in different laboratories or in different circumstances without the occurrence of unexpected differences in the obtained results. A ruggedness test is a part of method validation (Table 3.1) and can be considered as a part of the precision evaluation [2,4,5]. Ruggedness is related to repeatability and reproducibility. Some definitions for ruggedness come very close to those for reproducibility. Certain interpretation methods to identify the significant factors in a ruggedness test use criteria based on results for repeatability or reproducibility. These two items will be considered in Section 3.4.7. The validation of analytical methods is becoming increasingly important, particularly in the pharmaceutical industry. This is due amongst others to the regulations imposed by the drug regulatory agencies [6]. The ruggedness testing should be performed for nearly all analytical methods used in pharmaceutical and biopharmaceutical analysis [2,7] as can be seen in Table 3.2. However, until now no uniform ruggedness testing procedure
exists. This has led to a variety of approaches, proposed by different authors, for the different steps in a ruggedness test. These many approaches and the complexity of the ruggedness tests are two reasons why a ruggedness test is often not performed. The different steps in a ruggedness test will be discussed in this review.
TABLE 3.1
PERFORMANCE CRITERIA FOR METHOD VALIDATION [2,4,5]
* Bias: systematic errors
* Accuracy: random errors and systematic errors
* Precision: random errors
  - repeatability
  - intermediate precision estimate
  - reproducibility
  - ruggedness
* Specificity and selectivity:
  - interference
  - peak purity
* Range
* Linearity
* Sensitivity
* Limits:
  - detection limit, limit of detection (LOD)
  - limit of quantitation (LOQ)
    - lower limit of quantitation (LLQ)
    - higher limit of quantitation (HLQ)
3.2 PLACE OF RUGGEDNESS TESTING IN METHOD VALIDATION
As already mentioned in the introduction, ruggedness is a part of the precision evaluation. Precision is a measure for random errors. Random errors cause imprecise measurements. Another kind of error that can occur is the systematic error. Systematic errors cause inaccurate results and are measured in terms of bias. The total error is defined as the sum of the systematic and random errors.
TABLE 3.2
OVERVIEW OF METHOD VALIDATION IN PHARMACEUTICAL AND BIOPHARMACEUTICAL ANALYSIS (REPRINTED, WITH PERMISSION, FROM [2])
Validation parameters (rows): accuracy, linearity, precision, ruggedness, specificity, selectivity, sensitivity, limit of detection, limit of quantitation, range.
Types of analytical-chemical methods (columns): confirmation of the identity of pure substances; determination of identity of unknown substances; amount of a single pure substance; amount of active substance; limit test (semiquantitative); amount of impurities/degradation products (quantitative); dissolution speed of substances; bioequivalence studies.
Entries: yes = always required; no = not required; * = not always required (depending on the judgement of the experimentator).
From a statistical point of view, precision measures the dispersion of the results around the mean, irrespective of whether that mean is a correct representation of the true value. This requires the calculation of a standard deviation. How this is done depends on the context. Two types of precision are usually distinguished, namely the repeatability and the reproducibility. Repeatability is the precision obtained in the best possible circumstances (same analyst, one instrument, within one day when possible) and reproducibility under the most adverse possible circumstances (different laboratories, different analysts, different instruments, longer periods of time, etc.). Reproducibility can be determined only with interlaboratory experiments. Intermediate situations may and do occur. They are for instance defined in terms of M-factor-different intermediate precision measures, where M is one, two, three or even higher [8,9]. In this definition M refers to the number of factors that are varied to make the estimation. The most likely factors to be varied are time, analyst and instrument. According to this terminology, one estimates e.g. the time-and-analyst-different intermediate precision measure (M=2), when the precision is determined by measuring a sample over a longer period of time in one laboratory by two analysts with one instrument. A protocol about collaborative studies [10] also considers what are called preliminary estimates of precision. Among these the protocol defines the "total within-laboratory standard deviation". This includes both the within-run or intra-assay variation (= repeatability) and the between-run or inter-assay variation. The latter means that one has measured on different days and preferably has used different calibration curves. It can be considered as a within-laboratory reproducibility. These estimates can be determined prior to an interlaboratory method performance study. The total within-laboratory standard deviation may be estimated from ruggedness trials [10]. A third term in the context of precision is robustness or ruggedness. The result of a ruggedness test indicates how tightly controlled the experimental factors should be. The detection of factors that heavily influence the results of a method leads eventually to a more reproducible method.
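As a sketch of how the within-run (repeatability) and between-run components combine into the total within-laboratory standard deviation described above, the following Python code estimates both from balanced measurements of one sample on several days; the data are placeholders.

```python
import numpy as np
import pandas as pd

# Hypothetical replicate measurements of one sample on several days (placeholder data).
data = pd.DataFrame({
    "day":    [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4],
    "result": [99.8, 100.1, 100.0, 100.6, 100.9, 100.7,
               99.5, 99.6, 99.4, 100.2, 100.0, 100.3],
})

k = data.groupby("day").size().iloc[0]                        # replicates per day (balanced)
ms_within = data.groupby("day")["result"].var(ddof=1).mean()  # within-run (repeatability) MS
ms_between = k * data.groupby("day")["result"].mean().var(ddof=1)

s2_r = ms_within                                 # repeatability variance (within-run)
s2_between = max((ms_between - ms_within) / k, 0)  # between-run variance component
s2_total = s2_r + s2_between                     # total within-laboratory variance

print("repeatability s_r         :", round(np.sqrt(s2_r), 3))
print("between-run s_between     :", round(np.sqrt(s2_between), 3))
print("total within-laboratory s :", round(np.sqrt(s2_total), 3))
```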
3.3 DEFINITIONS OF RUGGEDNESS
The terms "ruggedness" and "ruggedness test" were introduced in analytical chemistry by Youden and Steiner [11]. They proposed to perform an experiment in which one can verify whether certain factors of the test procedure have an influence on the response of a method or not. If none of the investigated factors shows an influence on the response the method was considered to be "rugged" or "robust". They introduced the ruggedness test because in collaborative tests it was not unusual to observe unexpected results for an analytical procedure which performed well in the laboratory that developed the method. The explanation for this was found in the fact that the initiating laboratory has a set of conditions, operations and equipment which do not vary. Transferring the procedure to another laboratory causes small changes in a number of these conditions, operations and equipment properties. Some of these produce changes in the response of the method which then lead to the unexpected results in the collaborative test. By performing the ruggedness test the factors causing difficulties in a collaborative test were tracked and could be controlled strictly, thereby avoiding disappointments. Agencies or authorities such as ISO or IUPAC still do not provide any definition of ruggedness. In the chemical literature, however, a ruggedness test was defined as [4,12]: "An intralaboratory experimental study in which the influence of small changes in the operating or environmental conditions on measured or calculated responses is evaluated. The changes introduced reflect the changes that can occur when a method is transferred between different laboratories, different experimentators, different devices, etc.". Some pharmaceutical sources give definitions for ruggedness. Not all definitions, however, are the same. The definition which comes closest to the chemical definition mentioned above is the one of the French Guide for Validation of Analysis Methods [13]. This Guide states that "the ruggedness of an analysis procedure is its capacity to yield exact results in the presence of small changes of experimental conditions such as might occur during the utilisation of these procedures". It continues by defining that by a small change in experimental conditions is meant "any deviation of a parameter of the procedure compared to its nominal value as described
in the method of analysis". A definition similar to this one is also given in reference [2]. Some other sources have definitions that are different from the one given above [7,14]. The US Pharmacopeia [7] defines ruggedness as: "The ruggedness of an analytical method is the degree of reproducibility of test results obtained by the analysis of the same sample under a variety of normal test conditions, such as different laboratories, different analysts, different instruments, different lots of reagents, different elapsed assay times, different assay temperatures, different days, etc. Ruggedness is normally expressed as the lack of influence on test results of operational and environmental variables of the analytical method. Ruggedness is a measure of reproducibility of test results under normal, expected operational conditions from laboratory to laboratory and from analyst to analyst". In fact this is nearly the definition of reproducibility. This definition is also followed by other authors [15]. The Canadian Acceptable Methods document [14] gives more or less a combination of the two definitions described above and considers three levels in the testing of the ruggedness of a method, with the third level being performed only rarely. Level one "requires verification of the basic insensitivity of the method to minor changes in environmental and operational conditions and should include verification of reproducibility by a second analyst". The first part of this definition resembles the French Guide's definition. The second part is a check on the adequacy of the method description and should be done without input from the original analyst. Level two "requires a verification of the effect of more severe changes in conditions, such as the use of chromatographic columns from different manufacturers or the substitution of different equipment, and should be performed in a different laboratory". This second level can be considered as being equivalent to the US Pharmacopeia (USP) definition. The third level of testing is then "a full collaborative testing", which is however done rarely. The extent to which a ruggedness test is performed depends on the general use of the method. Level 1 ruggedness testing is required for all methods. Level 2 testing is performed when a method is intended to be applied at multiple locations or in a number of laboratories at a single location. Collaborative studies are recommended if it is
intended that a method will be used in multiple locations using a variety of equipment. From the definitions given above it can be seen that there are two approaches to ruggedness testing (equivalent to levels 1 and 2 of the Acceptable Methods document [14]). In the first approach the factors to be examined are selected from the set of operating and environmental conditions that are or could be stipulated in the analytical procedure. These kinds of factors can be called procedure related factors. In the second approach non-procedure related factors are considered. Factors such as different laboratories, different analysts, different instruments, different lots of reagents, different days, different columns for HPLC methods or different plates for TLC methods are then examined. In the literature, ruggedness tests concern mainly procedure related factors, but occasionally one of the other factors, e.g. a column factor in HPLC, is examined. This will be discussed further in more detail (see Sections 3.4.2 and 3.4.4.4). The examination of the non-procedure related factors in ruggedness testing is described less frequently and requires another approach than the examination of procedure related factors. In Section 3.4 the strategy and the different possibilities for performing a ruggedness test when mainly procedure related factors are examined will be reviewed. In a later part (Section 3.5) ruggedness testing of non-procedure related factors will be discussed.
3.4 RUGGEDNESS TESTING OF PROCEDURE RELATED FACTORS

3.4.1 The steps of a ruggedness test

A ruggedness test requires an experimental design approach. It consists of the following steps:
1. Selection and identification of the operational or environmental factors to be investigated;
2. Selection of levels for the factors to be examined. In a ruggedness test 2 or 3 levels for each factor are normally considered. The ruggedness for the factors in the intervals between the factor levels is then investigated;
3. Selection of the experimental design;
4. Carrying out the experiments described in the design. This is the experimental part of the ruggedness test;
5. Computation of the effect of the factors on the response(s) of the method, to derive which factors might have experimentally relevant effects;
6. Statistical analysis of the results. In this part of the test statistically significant effects are identified;
7. Drawing chemically relevant conclusions;
8. When necessary, giving advice for improvement of the performance of the method and definition of suitability criteria.
The different steps described above will be explained in more detail in the following sections.
3.4.2 Selection of the factors

As a first step one selects a number of factors to examine. The selected factors should be chosen from the description of the analytical procedure or from environmental parameters which are not necessarily specified explicitly in the analytical method. The factors can be quantitative (continuous, numerical) or qualitative (discrete). The factors to be tested should represent those that are most likely to be changed when a method is transferred, for instance, between different laboratories, different devices, or over time, and that potentially could influence the response of the method. However it is not always obvious which factors will influence a response and which will not. This is one of the reasons why screening designs are used (see Section 3.4.4). They allow a large number of factors to be screened in a relatively small number of experiments. In Table 3.3 a list of different factors investigated in different publications is given [1,4,13,16,17]. The list is not exhaustive and is only shown to give the reader an idea. Table 3.3 shows that many authors have not really understood the nature of ruggedness testing. For instance, changing the type of the reagents in sample preparation, the mobile phase in HPLC [13] or the type of acid used to control the pH of the mobile phase (e.g. orthophosphoric acid, acetic acid, perchloric acid [17]) does not make sense. These kinds of factors are more likely to be examined in a screening design to eliminate factors that are not
significant during an optimization procedure [18]. In a ruggedness test however, one starts from an optimized procedure which was already validated for precision and accuracy and which should not be changed in any detail. Some comments should be made. The selection of the factor "type of acid" in a ruggedness test could be accepted when only the pH is specified by the method, rather than the acid used to bring the solution or the buffer to the desired pH. Clearly, however, in such a case the method is poorly defined. In references [4,13] the sample weight is entered as a covariable. This may make some sense if the design employed later allows one to measure the effect of other factors as influenced by the sample weight (interactions). The design then should be able to estimate interaction effects (see Section 3.4.4). However, in certain cases, one investigates whether the factor sample weight influences the measured concentration. If it does not, this would mean the method is not able to perform the required analysis! Besides, this effect should already have been studied in previous experiments in method validation. Entering the sample weight as a factor can be useful in the case where resolution is considered as the response. Then one is able to detect whether the sample weight influences the resolution between peaks in a chromatogram. However, this would have been studied better in the context of defining the limits of quantification and the range. A group of factors causing problems are HPLC columns. Some articles [4,6,19] propose to include the factor "batch of material" or "manufacturer of material" in a two level design and do this by comparing two columns. However, it is far from evident that these two selected columns are extreme levels for the whole population of batches from one manufacturer or for the population of columns from different manufacturers. The problem could be tackled by examining more than two columns. One possibility is to consider the column factors in the same way as the factors "different laboratories, different analysts, different instruments" and to examine these factors in a nested design (nested ANOVA) [20,21]. These designs will be discussed in Section 3.5. Another possibility is the use of (screening) designs where the factor of interest is examined at more than 2 levels. Some designs of this type are described in Section 3.4.4.5. One also has to realize that one is
able to examine only one of the three mentioned column factors (manufacturer, batch, age) at a time in a Plackett-Burman or a fractional factorial design. Entering two of them requires the use of nested designs (see Section 3.5). A problem that could occur and that is usually overlooked is the possible interaction between the column factor and the other factors of the design. This will be discussed further in Section 3.4.4.4.
3.4.3 Selection of the levels of the factors

In a second step the levels for the chosen factors are selected. For quantitative factors one considers a low and a high extreme level that is respectively smaller and larger than the nominal one. The nominal level is the level for the factor as it is given in the description of the procedure, or the one that is most likely to occur in case it is not specified in the analytical procedure. The levels for the factors are chosen in such a way that they represent the maximum difference in the values of the factors that could be expected to occur when a method is transferred from one laboratory to another without the occurrence of major errors [4]. In some publications only one extreme level is examined [11,22]. In these cases only the influence of changing the factors to one side of the nominal level is examined. Three levels (two extremes and the nominal) are selected if one cannot exclude a nonlinear behaviour of the response as a function of the change of the factors [6,17,19,23] (see Figure 3.1). If a factor that causes a nonlinear change of a response is examined in a two level design where its extreme levels are used, one could find a small effect, E(+1,-1), as is seen in Figure 3.1. One would conclude that the response is robust in the interval between the two extreme levels. However, changing the factor from the nominal to one of the extreme levels causes a considerable change in the response, E(+1,0) and E(0,-1). By examining this factor at three levels it will be observed that the response is not rugged in the interval between the two extreme levels.
TABLE 3.3
FACTORS INVESTIGATED IN RUGGEDNESS TESTS DESCRIBED IN THE LITERATURE

Sample preparation:
- sample weight
- shaking or dissolution time
- sonication time
- temperature of sample preparation
- extraction volume
- wash volume
- centrifugation time
- pH of the solution
- composition of the reagents
- type of reagents

HPLC methods:
- pH of mobile phase
- amount of organic modifier
- buffer concentration or ionic strength
- concentration of tailing suppressor
- flow rate
- acid type in mobile phase
- age of the sample solution
- for gradient elution: the mobile phase factors summed up above, considered for each mobile phase; steepness of the gradient; initial ratio of the mobile phases; final ratio of the mobile phases
- column factors: batch number, manufacturer, age of the column
- detector factors: wavelength
- integration factors: signal-to-noise ratio or sensitivity, method to draw the baseline under a peak

TLC methods:
- batch of plates
- composition of the mobile phase
- developing temperature

GC methods [16]:
- injection temperature
- split flow
- liner type
- temperature rate
- sample matrix
- detector temperature
- column flow
Figure 3.1 Comparison of the observed change of the response by examination at 2 levels and the actual changes for an optimized factor having a non-linear response (response plotted against factor level). E(+1,-1) = observed change when examined at the two extreme levels. E(+1,0), E(0,-1) = actual changes between the method conditions (nominal level) and the extreme levels. -1 = low extreme level, 0 = nominal level, +1 = high extreme level.
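The situation sketched in Figure 3.1 can be reproduced with a few lines of code. The response function below is purely hypothetical; it only serves to show how E(+1,-1) can be small while E(+1,0) and E(0,-1) are not.

```python
# Illustrative only: a response that varies quadratically with a factor,
# so the two extreme levels happen to give nearly the same response.
def response(level):            # hypothetical response curve with an optimum at level 0
    return 100.0 - 8.0 * level ** 2

y_low, y_nom, y_high = response(-1), response(0), response(+1)

E_ext = y_high - y_low          # E(+1,-1): what a two-level design sees
E_up = y_high - y_nom           # E(+1,0)
E_down = y_nom - y_low          # E(0,-1)

print(E_ext, E_up, E_down)      # 0.0, -8.0, +8.0 -> apparently "rugged" at two levels only
```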
A common error is to select levels that are too far apart from each other. In a ruggedness test one selects the extreme levels of the factors to be somewhat larger than the changes that would occur for this factor under normally changing conditions (different laboratories, etc.). In a number of published ruggedness tests one finds levels that are quite far from each other, much further apart than can occur by transferring a method between different laboratories. If one can prove that in this chosen interval the factor is rugged, this is of course excellent. However, since one does not know the effect of the factor in advance, one introduces a large possibility of finding a significant effect which is not relevant for the evaluation of the ruggedness. If in a method description the pH of the mobile phase is 5.0, then one normally should be able to work in an interval between 4.8 and 5.2. This then is the interval proposed to be examined in a ruggedness test, and not for example 4.0 and 6.0. Examples of levels of factors that seem too far from each other and that were tested in different ruggedness tests are given in Table 3.4. It should be noted that the same designs as those applied here are sometimes used in the optimization stage of the method [18,24-27], i.e. before the validation. In that case it makes sense to apply levels that are further apart than in ruggedness testing.
TABLE 3.4
SOME LEVELS OF FACTORS THAT ARE TESTED WITH LARGE INTERVALS IN A RUGGEDNESS TEST (HPLC METHODS)

Factor             Levels as tested in the literature
pH                 nominal ± 1 [4]; nominal ± 0.5 [6,13,19]
flow rate          nominal ± 0.3 ml/min [13]; nominal ± 0.5 ml/min [6,19]
wavelength (UV)    nominal ± 8 to 12 nm [19]
3.4.4 Selection of the experimental design

To examine the ruggedness of the selected factors one could test these factors one variable at a time, i.e. change the level of one factor and keep all other factors at nominal level. The result of this experiment is then compared to the result of experiments with all factors at nominal level. The difference between the two types of experiments gives an idea of the effect of the factor in the interval between the two levels. The disadvantage of this method is that a large number of experiments is required when the number of factors is large. For this reason one prefers to apply an experimental design. In the literature a number of different designs are described, such as saturated fractional factorial designs and Plackett-Burman designs, full and fractional factorial designs, central composite designs and Box-Behnken designs [5]. In practice however, most designs used for the determination of ruggedness are fractional factorials or of the Plackett-Burman type. For this reason we will pay more attention to these designs and to a number of related concepts, such as interaction, confounding, defining relations, aliases, etc. The fractional factorial and Plackett-Burman designs are also called screening designs [28] because they allow a large number of factors to be screened.

3.4.4.1 Full factorial designs
In a full factorial design all combinations between the different factors and the different levels are made. Suppose one has three factors (A, B, C) which will be tested at two levels (- and +). The possible combinations of these factor levels are shown in Table 3.5. Eight combinations can be made. In general, the total number of experiments in a two-level full factorial design is equal to 2^f, with f being the number of factors. The advantage of the full factorial design compared to the one-factor-at-a-time procedure is that not only the effect of the factors A, B and C (main effects) on the response can be calculated but also the interaction effects of the factors. The interaction effects that can be considered here are three two-factor interactions (AB, AC and BC) and one three-factor interaction (ABC). From the 2^3 full factorial design shown in Table 3.5 these seven effects can be calculated. An eighth statistic that can be obtained from this design is the mean result. From a 2^f full factorial design therefore 2^f statistics can be calculated. The
number of effects (statistics) belonging to each type (average; main and multiple-factor interaction effects) is given in Table 3.6 for different full factorial designs.
TABLE 3.5
FULL FACTORIAL DESIGN FOR 3 FACTORS

Experiment   A   B   C   Response
1            -   -   -   y1
2            +   -   -   y2
3            -   +   -   y3
4            +   +   -   y4
5            -   -   +   y5
6            +   -   +   y6
7            -   +   +   y7
8            +   +   +   y8
TABLE 3.6
NUMBER OF STATISTICS THAT CAN BE CALCULATED FOR DIFFERENT FULL FACTORIAL DESIGNS

Statistics              f=2   f=3   f=4   f=5   f=6   f=7   f=8
Average                  1     1     1     1     1     1     1
Main effects             2     3     4     5     6     7     8
Interaction effects
  2-factor               1     3     6    10    15    21    28
  3-factor                     1     4    10    20    35    56
  4-factor                           1     5    15    35    70
  5-factor                                 1     6    21    56
  6-factor                                       1     7    28
  7-factor                                             1     8
  8-factor                                                   1
Total                    4     8    16    32    64   128   256
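The counts in Table 3.6 are binomial coefficients, as expressed by equation (1) below; a quick check, assuming nothing beyond that formula, is:

```python
# Quick check of the counts in Table 3.6: the number of p-factor
# interactions in a 2^f design is the binomial coefficient C(f, p).
from math import comb

for f in range(2, 9):
    counts = [comb(f, p) for p in range(0, f + 1)]   # average, main, 2-factor, ...
    print(f"f={f}: {counts}  total = {sum(counts)} = 2^{f}")
```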
The number of effects (statistics) can be calculated with the following formula [29]. Number of p-factor interactions in a 2^f design:

$\binom{f}{p} = \frac{f!}{p!\,(f-p)!}$   (1)

To explain the concept of interaction, let us consider a two-factor interaction. This interaction occurs when the effect of the first variable obtained at the lowest level (-) of the second variable is different from the effect of the first factor at the highest level (+) of the second one. The effect of one variable is influenced by that of the other and therefore it is said that they "interact". We will try to explain this with an example. Suppose that in Table 3.5 factor B is the "HPLC column" (level (+) = column K and level (-) = column L) and factor A is the "pH of the mobile phase" (level (+) = 5.2 and level (-) = 4.8). A two-factor interaction between the column manufacturer and the pH of the mobile phase occurs when the effect of the pH on the response (e.g. resolution) on column K is different from the effect of the pH on column L. The interaction is calculated as half the difference between the effect of the pH on column K and the effect of the pH on column L. The interaction is called in this example the pH by column interaction and is symbolised by pH x column or AxB or AB or BA.
$\text{Interaction effect (AB)} = \frac{1}{2}\,(\text{effect of pH on column K} - \text{effect of pH on column L})$

or

$\text{Interaction effect (AB)} = \frac{1}{2}\left(E_{A,B(+)} - E_{A,B(-)}\right)$
Let us now consider three-factor interactions (e.g. ABC in Table 3.5) to give a general idea how these and higher-order interaction effects (four-, five-factor interaction effects, etc.) are derived. A three-factor interaction means that a two-factor interaction effect is different at the two levels of the third factor. Two estimates for the AB interaction are available from the experiments, one for each level of factor C. The AB interaction effect is estimated once with C at level (+) (represented by $E_{AB,C(+)}$) and once
with C at level (-) (represented by $E_{AB,C(-)}$). Half the difference between these two estimates gives the three-factor interaction effect (ABC):

$\text{Interaction effect (ABC)} = \frac{1}{2}\left(E_{AB,C(+)} - E_{AB,C(-)}\right)$
As is the case for the two-factor interactions, the three-factor interaction is also symmetric in all its variables: the interaction effects ABC, ACB, BAC, CAB, BCA and CBA all give the same result. Higher-order effects are calculated by analogous reasoning. To compute main and interaction effects (see further Section 3.4.6) one determines:

$E_X = \frac{\sum Y(+)}{n} - \frac{\sum Y(-)}{n}$   (2)

where $E_X$ is the effect of factor X, $\sum Y(+)$ and $\sum Y(-)$ are the sums of the responses where factor X was respectively at level (+) and at level (-), and n is the number of runs from the design where the factor was at level (+) or at level (-). The effect of a factor can be considered as the difference between the mean response at the high level (+) and that at the low level (-). The term n is equal to N/2 when each experiment in the design is performed once, with N being the number of experiments of the design. For factor A from Table 3.5 this gives:

$E_A = \frac{y_2 + y_4 + y_6 + y_8}{4} - \frac{y_1 + y_3 + y_5 + y_7}{4}$   (3)
To calculate the interaction effects following the reasoning given above would require a lot of work. An easier way exists, namely by using the columns of so-called contrast coefficients. The contrast coefficients for the 2^3 design of Table 3.5 are given in Table 3.7. The contrast coefficients for the interactions are obtained by multiplying the corresponding signs of the contributing factors. For instance, the levels for AB are obtained by multiplying the signs of the columns A and B for each experiment. The
interaction effects are then calculated analogously to the main effects. This would give for the interaction effect AB:

$E_{AB} = \frac{y_1 + y_4 + y_5 + y_8}{4} - \frac{y_2 + y_3 + y_6 + y_7}{4}$
TABLE 3.7
COLUMNS OF CONTRAST COEFFICIENTS FOR A 2^3 FULL FACTORIAL DESIGN

Experiment   A   B   C   AB   AC   BC   ABC
1            -   -   -   +    +    +    -
2            +   -   -   -    -    +    +
3            -   +   -   -    +    -    +
4            +   +   -   +    -    -    -
5            -   -   +   +    -    -    +
6            +   -   +   -    +    -    -
7            -   +   +   -    -    +    -
8            +   +   +   +    +    +    +

3.4.4.2 Fractional factorial designs

The disadvantage of full factorial designs is that the number of experiments increases rapidly when the number of factors increases. For 6 factors 64 experiments are required and for 7 factors 128. In practice it is usually not possible to perform such a large number of experiments in a reasonable span of time. For this reason often only a fraction of a full factorial design is performed. This kind of design is called a fractional (or partial) factorial design. Let us first consider a half-fraction factorial design. Only half of the number of experiments needed for a full factorial are performed. For example, for 4 factors 16/2 = 8 experiments are performed. This can be observed from Table 3.8 in which a full factorial design for 4 factors is shown. By selecting 8 appropriate experiments from the set of 16 a half-fraction factorial design is obtained. The experiments 1, 4, 6, 7, 10, 11, 13 and 16 (experiments between brackets) form a half-fraction factorial
design for four factors, while the experiments 2, 3, 5, 8, 9, 12, 14 and 15 form another half-fraction of the full factorial design. This kind of design is symbolized as 2^{4-1}.
TABLE 3.8
SELECTION OF A HALF-FRACTION FACTORIAL DESIGN FROM A FULL FACTORIAL DESIGN FOR 4 FACTORS

Experiment   A   B   C   D
(1)          -   -   -   -
2            +   -   -   -
3            -   +   -   -
(4)          +   +   -   -
5            -   -   +   -
(6)          +   -   +   -
(7)          -   +   +   -
8            +   +   +   -
9            -   -   -   +
(10)         +   -   -   +
(11)         -   +   -   +
12           +   +   -   +
(13)         -   -   +   +
14           +   -   +   +
15           -   +   +   +
(16)         +   +   +   +
By reducing the number of experiments one, of course, also loses some information. In a fractional factorial design not all main and interaction effects can be estimated separately as in a full factorial design. In a half-fraction factorial design the main effect of a factor will be estimated together with a higher-order interaction effect. Let us, for example, consider the half-fraction factorial design formed by the experiments between brackets in Table 3.8 and calculate the columns of contrast coefficients for the three-factor interactions (see Table 3.9). It can be seen that column ABC is equal to column D. This is also the case for the columns A and BCD; B and ACD; C and ABD. In the design formed by the experiments which are between brackets in Table 3.8, one estimates the
total effect of factor D and of the interaction ABC. It is said that factor D and the interaction ABC are confounded with each other, meaning that they cannot be estimated separately. All other main effects are also confounded with a three-factor interaction (A with BCD; B with ACD and C with ABD). By calculating the columns of contrast coefficients for the two-factor interactions, one sees that in this design each two-factor interaction is confounded with another two-factor interaction (e.g. AB with CD). In terms of absolute size, main effects tend to be larger than two-factor interactions, which in turn tend to be larger than three-factor interactions, and so on. In the half-fraction factorial design of Table 3.9 the main effects are expected to be significantly larger than the three-factor interactions with which they are confounded. As a consequence it is supposed that the estimate for the main effect and the interaction together is an estimate for the main effect alone.
TABLE 3.9
THE COLUMNS OF CONTRAST COEFFICIENTS FOR THE THREE-FACTOR INTERACTIONS OF THE HALF-FRACTION FACTORIAL DESIGN FOR FOUR FACTORS SELECTED FROM TABLE 3.8

Exp.   A   B   C   D   ABC   ABD   ACD   BCD
1      -   -   -   -    -     -     -     -
4      +   +   -   -    -     -     +     +
6      +   -   +   -    -     +     -     +
7      -   +   +   -    -     +     +     -
10     +   -   -   +    +     -     -     +
11     -   +   -   +    +     -     +     -
13     -   -   +   +    +     +     -     -
16     +   +   +   +    +     +     +     +

Let us now consider how one selects the experiments from the full factorial to obtain a proper half-fraction factorial design. In practice, to construct a 2^{4-1} design one first constructs a full factorial design for 4-1 = 3 factors (see Table 3.7). The fourth factor (D) will be awarded to one of the columns of the interactions (e.g. to ABC in this case). This means that one confounds factor D with the interaction ABC. In this way the design of Table 3.10 is
obtained which is equal to the half-fraction factorial of Table 3.9. Only the order of the experiments (rows) is different.
TABLE 3.10
HALF-FRACTION FACTORIAL DESIGN FOR 4 FACTORS (2^{4-1})

Experiment   A   B   C   D = ABC
1            -   -   -   -
2            +   -   -   +
3            -   +   -   +
4            +   +   -   -
5            -   -   +   +
6            +   -   +   -
7            -   +   +   -
8            +   +   +   +
The relationship D = ABC in this design is called the generator. The factor D and the three-factor interaction ABC are called aliases of one another because they are confounded. All aliases can be determined with the help of the defining relation or defining contrast (I). It is obtained by multiplying the effects occurring in the generator:

I = D x ABC = ABCD

The alias of any effect can be obtained by multiplying the effect with the defining relation, with as an additional rule that when a term appears an even number of times this term disappears from the product. For instance, the aliases of factor A and of interaction AB are respectively:
A = A x ABCD = A²BCD = BCD
AB = AB x ABCD = A²B²CD = CD

All aliases of this design are shown in Table 3.11. The alias of the defining relation itself is the mean response. This design is called a design of resolution IV. The design resolution is determined by the number of terms in the defining relation. The higher the resolution of the design the higher
the order of the interaction effect confounded with the main effect. In general, in a design of resolution R no p-factor (interaction) effect is confounded with any effect containing less than R-p factors. The design given in Table 3.10 can be symbolized as a 2^{4-1} (IV) design. In this design no main effect (p=1) is confounded with any effect containing less than 3 factors (R-p = 4-1 = 3) and no two-factor interaction (p=2) is confounded with another effect that contains less than two factors (R-p = 4-2 = 2).
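The alias rule just described (terms appearing an even number of times cancel) can be sketched as a symmetric set difference on the factor letters. The function name and the printed list are illustrative only; for the defining relation I = ABCD the output reproduces the aliases listed in Table 3.11.

```python
# Sketch of the alias rule: multiply an effect by the defining relation and
# drop any letter that appears an even number of times (A*A = I).  Factor
# labels are single letters; the defining relation here is I = ABCD.
def multiply(effect: str, relation: str) -> str:
    letters = set(effect) ^ set(relation)        # symmetric difference = cancellation
    return "".join(sorted(letters)) or "Mean"

defining = "ABCD"                                # I = ABCD for the 2^(4-1) design
for effect in ["A", "B", "C", "D", "AB", "AC", "AD", "ABCD"]:
    print(f"{effect} = {multiply(effect, defining)}")
# A = BCD, B = ACD, C = ABD, D = ABC, AB = CD, AC = BD, AD = BC, ABCD = Mean
```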
TABLE 3.11
ALIASES OF A 2^{4-1} DESIGN WITH I = ABCD

A = BCD
B = ACD
C = ABD
D = ABC
AB = CD
AC = BD
AD = BC
ABCD = Mean

Now let us try to generate a quarter-fraction factorial design for 6 factors. In this design one fourth of the experiments of a full factorial design are performed, i.e. 2^6/4 = 16 experiments. The design is symbolized by 2^{6-2}. The first four (6-2) columns in the design are given by the full factorial design for four factors. They are shown in Table 3.12. For the last two columns (E and F) a generator must be defined, for example E = ABCD and F = ABC. This gives two defining relations associated with the generators: I = ABCDE and I = ABCF. There are three defining relations in a quarter-fraction factorial design, meaning that each effect is confounded with three other effects. The third defining relation is obtained by multiplying the two originally obtained relations using the multiplication rules mentioned above:
I = ABCDE x ABCF = A²B²C²DEF = DEF
The defining relations are then I = ABCDE, I = ABCF and I = DEF. The resolution of the design is III since the smallest defining relation contains three terms. This means that certain main effects are confounded with two-factor interactions, e.g. D = EF = ABCE = ABCDF.
TABLE 3.12
QUARTER-FRACTION FACTORIAL DESIGN, 2^{6-2} (IV). GENERATORS: E = ABC AND F = BCD

Experiment   A   B   C   D   E   F
1            -   -   -   -   -   -
2            +   -   -   -   +   -
3            -   +   -   -   +   +
4            +   +   -   -   -   +
5            -   -   +   -   +   +
6            +   -   +   -   -   +
7            -   +   +   -   -   -
8            +   +   +   -   +   -
9            -   -   -   +   -   +
10           +   -   -   +   +   +
11           -   +   -   +   +   -
12           +   +   -   +   -   -
13           -   -   +   +   +   -
14           +   -   +   +   -   -
15           -   +   +   +   -   +
16           +   +   +   +   +   +

Since two-factor interactions tend to be larger than three-factor interactions it would be worthwhile to construct a design of resolution IV (or even higher if possible). In such a design the main effects would be confounded only with three-factor and higher-order interactions. This design could be expected to give better estimates for the main effects than the design of resolution III. Since one is free to define the generators one can define another set of them to try to increase the resolution of the design. For instance one could define E = ABC and F = BCD. By using these latter two generators a design of resolution IV is created. The defining relations
obtained with these generators are I = ABCE, I = BCDF and I = ADEF. The complete design is shown in Table 3.12 and the aliases of this design in Table 3.13.
TABLE 3.13
ALIASES IN THE 2^{6-2} (IV) DESIGN OF TABLE 3.12

A = BCE = ABCDF = DEF
B = ACE = CDF = ABDEF
C = ABE = BDF = ACDEF
D = ABCDE = BCF = AEF
E = ABC = BCDEF = ADF
F = ABCEF = BCD = ADE
AB = CE = ACDF = BDEF
AC = BE = ABDF = CDEF
AD = BCDE = ABCF = EF
AE = BC = ABCDEF = DF
AF = BCEF = ABCD = DE
BD = ACDE = CF = ABEF
BF = ACEF = CD = ABDE
ABD = CDE = ACF = BEF
ABF = CEF = ACD = BDE
Mean = ABCE = BCDF = ADEF

Smaller fractions of a full factorial can be constructed by reasoning analogous to that for the quarter fraction. The defining relations are obtained by making all possible multiplications between the original defining relations derived from the generators. Suppose that a 2^{8-4} design must be created, i.e. a sixteenth fraction of an 8-factor full factorial design. In this design 4 generators have to be defined, creating 4 original defining relations. The total number of defining relations in a sixteenth fraction is fifteen. The eleven remaining defining relations are obtained by multiplying the original ones two by two (6 relations), three by three (4 relations) and all four at a time (one relation). In general, a fractional factorial design can be written as 2^{f-v}, in which f is the number of factors that is examined in the design and 1/2^v is the fraction considered (v = 1, 2, 3, ...). When constructing such a design, v generators have to be defined, giving a total of 2^v - 1 defining relations. The design has 2^{f-v} experiments. Each effect is confounded with 2^v - 1 other effects. The smallest fraction of a design in which the main effects can still be estimated without confounding among each other is called a saturated fractional factorial design. The 2^{7-4} design for example can estimate the main effects of 7 factors in 8 experiments (see Table 3.14). The design is of resolution III and main effects are confounded with two-factor interactions. The design is saturated since it is not possible to increase the level of confounding. These saturated designs are used in a ruggedness test, which means that one assumes that two-factor and higher-order interaction effects are negligible compared to the main effects. The saturated designs then allow an estimation of the main effects of the factors from the design.
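As an illustration of how such a saturated design is built from its generators, the sketch below constructs the 2^(7-4) design of Table 3.14 (generators D=ABC, E=AB, F=AC, G=BC) from a 2^3 base design. The run order follows the standard order used in the tables of this chapter, which may differ from a printed run order elsewhere; the helper names are illustrative.

```python
# Minimal sketch: build the saturated 2^(7-4) design of Table 3.14 from a
# 2^3 full factorial in A, B, C plus the generators D=ABC, E=AB, F=AC, G=BC.
from itertools import product

rows = []
for c, b, a in product([-1, 1], repeat=3):        # 8 runs of the 2^3 base design
    d, e, f, g = a * b * c, a * b, a * c, b * c   # generator columns
    rows.append((a, b, c, d, e, f, g))

print("Exp  " + "  ".join("ABCDEFG"))
for i, row in enumerate(rows, start=1):
    print(f"{i:>3}  " + "  ".join("+" if v > 0 else "-" for v in row))
```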
TABLE 3.14
SATURATED FRACTIONAL FACTORIAL DESIGN FOR 7 FACTORS: 2^{7-4} (III). GENERATORS: D=ABC, E=AB, F=AC, G=BC

Experiment   A   B   C   D   E   F   G
1            -   -   -   -   +   +   +
2            +   -   -   +   -   -   +
3            -   +   -   +   -   +   -
4            +   +   -   -   +   -   -
5            -   -   +   +   +   -   -
6            +   -   +   -   -   +   -
7            -   +   +   -   -   -   +
8            +   +   +   +   +   +   +

3.4.4.3 Plackett-Burman designs

The most important alternative to saturated fractional factorial designs are the Plackett-Burman designs [30]. For N = 4, 8, 16, ... experiments (generally 2^x with x = 2, 3, ...), these designs are saturated designs of resolution III. However, there are also Plackett-Burman designs for 12, 20, 24, ... experiments. Generally, Plackett-Burman designs are described for a number of experiments, N, up to 100, with N being a multiple of four.
Designs of more than 20 or 24 experiments are of no practical use in a ruggedness test, because the time needed to perform these designs becomes too long. The first line for the designs with N = 8, 12, 16, 20 and 24 as described by Plackett and Burman [30] is given below:
An example of a Plackett-Burman design for N = 12 is shown in Table 3.15. The first row in the design is the one given by Plackett and Burman. The following N-2 rows are obtained by a cyclical permutation of one place (i.e. shifting the line by one place) compared to the previous row. The sign of the first factor (A) in the second row is equal to that of the last factor (K) in the first row. The signs of the following N-2 factors in the second row are equal to those of the first N-2 factors of the first row. The third row is derived from the second one in an analogous way. This procedure is repeated N-2 times until all but one line are formed. The last (Nth) row consists completely of minus signs. Since Plackett-Burman designs are saturated designs of resolution III, main effects are confounded with many higher-order effects, among which also a number of two-factor interactions. In the eight experiment design for instance each main effect is confounded with 15 higher-order effects, among which three two-factor interactions. It is possible to define the multiple-factor interactions that are confounded with each main effect. This can be done by constructing columns of contrast coefficients as in the full and fractional factorial designs. The algebraic rules used here however are different from those for the full and fractional factorial designs.

Full and fractional factorial designs:
1. negative and negative → positive
2. negative and positive → negative
3. positive and negative → negative
4. positive and positive → positive
Plackett-Burman designs:
1. negative and negative → negative
2. negative and positive → positive
3. positive and negative → positive
4. positive and positive → negative

The algebraic rules used for the Plackett-Burman designs are opposite to those for the full and fractional factorial designs. In a fractional factorial design (Tables 3.8, 3.10, 3.12, 3.14) there is always a row containing only plus signs. In a Plackett-Burman design however one always has a row containing only minus signs (Table 3.15). Let us compare the N=8 Plackett-Burman design of Table 3.16 with a saturated 2^{7-4} design. After rearranging the Plackett-Burman design in an appropriate way the 2^{7-4} design of Table 3.14 is obtained but with opposite signs. Therefore the algebraic rules to obtain the contrast coefficients must also be different.
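The cyclic construction described above is easy to automate. The sketch below assumes the standard Plackett-Burman first row for N = 12; the function name is illustrative.

```python
# Sketch of the cyclic construction described above: start from a first row,
# rotate it one place to the right N-2 times, and finish with an all-minus row.
def plackett_burman(first_row):
    n = len(first_row) + 1                 # number of experiments
    rows, current = [list(first_row)], list(first_row)
    for _ in range(n - 2):                 # cyclic permutation, N-2 times
        current = current[-1:] + current[:-1]
        rows.append(current)
    rows.append([-1] * (n - 1))            # last row: all factors at (-)
    return rows

pb12_first = [+1, +1, -1, +1, +1, +1, -1, -1, -1, +1, -1]   # standard N = 12 first row
for i, row in enumerate(plackett_burman(pb12_first), start=1):
    print(f"{i:>2} " + " ".join("+" if v > 0 else "-" for v in row))
```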
TABLE 3.15
PLACKETT-BURMAN DESIGN FOR N = 12

Exp.   A  B  C  D  E  F  G  H  I  J  K
1      +  +  -  +  +  +  -  -  -  +  -
2      -  +  +  -  +  +  +  -  -  -  +
3      +  -  +  +  -  +  +  +  -  -  -
4      -  +  -  +  +  -  +  +  +  -  -
5      -  -  +  -  +  +  -  +  +  +  -
6      -  -  -  +  -  +  +  -  +  +  +
7      +  -  -  -  +  -  +  +  -  +  +
8      +  +  -  -  -  +  -  +  +  -  +
9      +  +  +  -  -  -  +  -  +  +  -
10     -  +  +  +  -  -  -  +  -  +  +
11     +  -  +  +  +  -  -  -  +  -  +
12     -  -  -  -  -  -  -  -  -  -  -
In Table 3.16 the columns of contrast coefficients for the two-factor interactions are given. They were obtained using the above stated rules. The contrast coefficients for three- and higher-order interactions can be
obtained analogously. The columns of contrast coefficients that are equal to each other are confounded with each other (e.g. ABC confounded with E). It can be seen that in this design each main effect is confounded with three two-factor interactions (Table 3.17), and with a number of higher-order interactions.
TABLE 3.17
TWO-FACTOR INTERACTIONS CONFOUNDED WITH THE MAIN EFFECTS IN AN N = 8 PLACKETT-BURMAN DESIGN

Factor   Two-factor interactions confounded with the corresponding factor
A        BF  CD  EG
B        AF  CG  DE
C        AD  BG  EF
D        AC  BE  FG
E        AG  BD  CF
F        AB  CE  DG
G        AE  BC  DF

The Plackett-Burman designs, as do the saturated fractional factorial designs, only allow the main effects to be estimated. One assumes that all interaction effects are negligible compared to the main effects. A Plackett-Burman design with N experiments can examine up to N-1 factors. This is a difference from fractional factorial designs. Some saturated fractional factorial designs also contain N-1 factors (e.g. the 2^{7-4} design of Table 3.14), but this is not always the case. The saturated design for 5 factors, for example, is the 2^{5-2} design. In this design only 5 factors are examined in 8 experiments. Each Plackett-Burman design contains a fixed number of factors (a multiple of four minus one). After determination of the number of factors to be examined (these factors will be called real factors), the remaining potential factors in the design are defined as dummy factors. A dummy factor is an imaginary factor. A fractional factorial design on the other hand will be constructed depending only on the number of real factors. Normally no dummies are entered in those designs.
TABLE 3.16
PLACKETT-BURMAN DESIGN FOR 7 FACTORS (N = 8) AND THE COLUMNS OF CONTRAST COEFFICIENTS FOR TWO-FACTOR INTERACTIONS AND FOR A THREE-FACTOR INTERACTION

Exp.   A   B   C   D   E   F   G   ...   ABC
1      +   +   +   -   +   -   -   ...    +
2      -   +   +   +   -   +   -   ...    -
3      -   -   +   +   +   -   +   ...    +
4      +   -   -   +   +   +   -   ...    +
5      -   +   -   -   +   +   +   ...    +
6      +   -   +   -   -   +   +   ...    -
7      +   +   -   +   -   -   +   ...    -
8      -   -   -   -   -   -   -   ...    -
The difference can be seen from the following example. If there are six factors, one can perform a Plackett-Burman design with 8 experiments, containing six real and one dummy factor. Another possibility would be to perform the twelve experiment design, which would contain five dummies. The decision to select a larger design with more experiments could depend on the statistical interpretation one would like to apply (see Section 3.4.7). If the same six factors are examined in a fractional factorial design, one could create an eighth fraction, 2^{6-3} (e.g. with generators D = BC, E = AB and F = AC). This is also a design with 8 experiments, like the Plackett-Burman design, but it contains no dummy factors.

3.4.4.4 Taking into account certain interactions when constructing a design

Usually one considers all interaction effects negligible when performing a ruggedness test. If one suspects certain two-factor interactions to be potentially important, one can take this fact into account when constructing or selecting a design. For instance, the interaction between an HPLC column as a factor and the other factors of the design might be considered as being potentially significant. Suppose one is examining the factors "batch number of the HPLC column" and "concentration of tailing suppressor" in a design together with some other factors. Depending on the degree of endcapping of a column (more or less residual silanol groups), the concentration of tailing suppressor could have a larger or a smaller effect on the asymmetry of a peak on different columns, i.e. there is an interaction between the factors batch number and concentration of tailing suppressor. When both factors are examined in one design the effect on the tailing found for the factor "tailing suppressor concentration" will be the mean effect for both columns. To determine whether these interactions are important one can create a fractional factorial design in which the two-factor interactions of interest are not confounded with each other nor with main effects. In such a design the interaction effects can be estimated directly (see also Section 3.4.10).
3.4.4.5 Designs for ruggedness testing at three levels

When three levels for the factors are examined in a ruggedness test, different designs are theoretically possible.
A first possibility is the use of full factorial designs with three levels [31]. The disadvantage of the three-level factorial designs is that the number of experiments increases very rapidly for a larger number of factors. A second possibility is the use of central composite designs. Their disadvantage is the large number of experiments required even for a low number of factors, e.g. 25 experiments for 4 factors and 273 experiments for 8 factors. Even 25 different experiments can already take an unreasonably long time to be feasible. The central composite designs allow not only main effects but also interaction and quadratic effects to be estimated. However, quadratic effects are normally never considered in a ruggedness test. These designs are mainly used for optimization purposes and less for ruggedness testing. Central composite designs are interpreted using a regression method. To obtain a reasonable regression the levels of the factors should differ enough, but in a ruggedness test broad intervals between the levels should be avoided. Nevertheless, some authors have used them for ruggedness tests [16]. Another possibility that at first sight appears to be attractive are the three-level designs as proposed by Plackett and Burman [30]. However, it was shown that in these designs a confounding occurs between main effects [32]. Well balanced three-level designs (designs without a confounding of main effects) derived from the three-level Plackett-Burman designs have been described, but they can only examine half the number of factors originally proposed by Plackett and Burman. These well balanced designs could be used in ruggedness tests. However, no case studies are known. To test the factors at three levels in a ruggedness test one usually applies the so-called reflected designs [17,19,23]. A reflected design is in fact a two level Plackett-Burman, full or fractional factorial design that is executed twice. Once the design contains the first extreme and the nominal level, and once the other extreme and the nominal level. Both designs have one experiment in common, namely an experiment in which all factors are at nominal conditions. A reflected Plackett-Burman design for 7 factors is shown in Table 3.18.
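A reflected design can be derived mechanically from a two-level design, as sketched below. The mapping used (the (+) levels become the extreme level, the (-) levels the nominal level, and the all-nominal run is kept only once) follows the description above; the first row and the helper names are assumptions made for illustration.

```python
# Sketch: derive a reflected design (three levels: -1, 0, +1) from a two-level
# Plackett-Burman design.  (+) maps to the extreme, (-) to nominal; the
# all-nominal experiment is shared by the two halves.
def plackett_burman(first_row):
    n = len(first_row) + 1
    rows, cur = [list(first_row)], list(first_row)
    for _ in range(n - 2):
        cur = cur[-1:] + cur[:-1]
        rows.append(cur)
    rows.append([-1] * (n - 1))
    return rows

def reflect(two_level_rows):
    upper = [[+1 if v > 0 else 0 for v in row] for row in two_level_rows]
    lower = [[-1 if v > 0 else 0 for v in row] for row in two_level_rows]
    return upper + [row for row in lower if any(row)]   # drop the duplicate all-nominal run

pb8 = plackett_burman([+1, +1, +1, -1, +1, -1, -1])      # standard N = 8 first row
for i, row in enumerate(reflect(pb8), start=1):
    print(f"{i:>2} " + " ".join(f"{v:>2}" for v in row))
```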
TABLE 3.18
REFLECTED DESIGN FOR 7 FACTORS DERIVED FROM THE PLACKETT-BURMAN DESIGN FOR 7 FACTORS

Exp.    A    B    C    D    E    F    G
1      +1   +1   +1    0   +1    0    0
2       0   +1   +1   +1    0   +1    0
3       0    0   +1   +1   +1    0   +1
4      +1    0    0   +1   +1   +1    0
5       0   +1    0    0   +1   +1   +1
6      +1    0   +1    0    0   +1   +1
7      +1   +1    0   +1    0    0   +1
8       0    0    0    0    0    0    0
9      -1   -1   -1    0   -1    0    0
10      0   -1   -1   -1    0   -1    0
11      0    0   -1   -1   -1    0   -1
12     -1    0    0   -1   -1   -1    0
13      0   -1    0    0   -1   -1   -1
14     -1    0   -1    0    0   -1   -1
15     -1   -1    0   -1    0    0   -1

3.4.4.6 Taguchi designs

Taguchi designs allow a method to be optimized with regard to certain factors and, at the same time, the robustness of a number of factors to be tested. The factors to be optimized are called control factors, design variables [33], controllable factors [34] or design parameters [35]. Those to be tested for ruggedness are known as noise factors [33,34], environmental variables [33] or sources of noise [35]. The Taguchi design examines both sets of factors in a combination of two experimental designs. These designs and the treatment of the results are described in other chapters of this book. The use of these designs was introduced to improve the quality of technological products. However, one could consider using the Taguchi designs for optimization and ruggedness testing of analytical methods. The control factors would be the factors to optimize and the noise factors those for which to test the ruggedness. Certain control and noise factors could be the same but examined at different levels. In the inner design (design with
the control factors) of a Taguchi design the factor would be examined over a broader range and possibly at more levels than in the outer design (design with the noise factors). Let us try to explain the above with an example. Suppose that one is trying to optimize k’ of different peaks in an HPLC analysis as a function of the pH and the percentage organic modifier in the mobile phase. The factors in the inner design would be the pH and the percentage organic modifier. In the outer design the ruggedness of the response towards the noise factors is examined. These noise factors could again include the factors pH and percentage organic modifier but now examined over a much smaller interval. The drawback of Taguchi designs is the relatively large number of experiments to perform. No case studies that optimize factors and at the same time test their ruggedness towards noise factors in the field of analytical chemistry are known to us.
3.4.5 Experimental part of the ruggedness test

The experiments are performed according to the chosen design and a response or a number of responses are measured. The sequence in which the experiments are performed can influence the estimation of the effect of a factor [36]. The reason for this lies in the fact that the measurements can be influenced by different sources of error. Each measurement is influenced by uncontrolled factors that cause random error. Measurements can also be influenced by systematic errors, or by systematic errors caused by drift (linear drift, due to time-dependent factors). The occurrence of systematic errors or of drift will affect the estimation of the effects of the factors from the design [36]. If only random errors occurred, the experiments could be performed in any order. Performing the experiments in a randomized way allows more correct estimation of the effects when there are also systematic errors. These errors are due to uncontrolled factors which vary between sets of experiments [36]. An example of this type of error could be the factor "room temperature". Suppose that the factor "room temperature" is not controlled or examined during the performance of a design and that it takes more than one day to execute the complete design. It is then possible that a different room temperature during the different days introduces systematic differences between the sets of experiments carried out on those days.
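Randomizing the run order is straightforward; the sketch below only shuffles the sequence in which the design rows are executed (the run count and the fixed seed are illustrative assumptions).

```python
# Small sketch: randomize the run order of a design so that uncontrolled,
# time-dependent factors are less likely to bias the estimated effects.
import random

n_runs = 8                                   # e.g. a Plackett-Burman design with N = 8
run_order = list(range(1, n_runs + 1))
random.seed(42)                              # fixed seed only to make the example reproducible
random.shuffle(run_order)
print("Execute the design rows in this order:", run_order)
```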
Especially the full and fractional factorial designs are best performed in a random order to avoid the influence of systematic errors, because they are constructed so that one factor is at one level in the first N/2 experiments and at the other level in the last N/2 experiments. The Plackett-Burman designs can be considered as randomized when they are performed in the sequence that is described in the original paper of Plackett and Burman. Sorting the experiments for organizational reasons or experimental constraints, as is proposed in some publications [6,17,19], should preferably be avoided. In some experimental designs it is however too difficult and time-consuming to perform the design in a random order. This is for instance the case when one of the factors is the "column manufacturer" or the "batch number of the column". In this case two columns are used and performing the design in random order is not really indicated. Experiments performed with the different columns must be grouped. The experiments with each column can still be executed in random order. An example of drift which occurs in practice is the ageing of a chromatographic column. The ageing of a column can be considerable in the time needed to perform a design. In a ruggedness test performed in our laboratory of the HPLC method for the determination of tetracycline as described in the USP [37], considerable changes in response were observed due to the ageing of the column in the span of time needed to perform a half-fraction factorial design for 4 factors [38]. A reduction in retention time, capacity factor, relative retention and resolution for tetracycline of up to 14%, 19%, 11% and 18%, respectively, occurred. This effect caused by the ageing of the column could be confounded with the effect of the sorted factor. Grouping of experiments has to be done with caution, always keeping in mind that time-dependent factors (drifting factors) could disturb the effects of the sorted factors. If systematic errors due to drift are expected, then one can perform the design in a well defined randomized way so that the calculated main effects are not biased by the drift [36]. These designs are called anti-drift designs and they are described for full and fractional factorial designs. However, the interaction effects calculated from these designs are still biased by the drift. Drift-free effects can also be obtained by regularly performing experiments at nominal level between the experiments of the design. From the experiments at nominal level the drift can be measured. This allows one to
correct the responses of the design and to obtain drift-free effects. The disadvantage of this method is that more experiments are required.
3.4.6 Analysis of the results

After carrying out the design the results are analysed. In the first instance the results of the design (y1, y2, ..., yN) can be plotted to see if one or a few results differ extremely from the rest of the design, indicating possible errors in the execution of those experiments. Those differing significantly from the rest (according to, for instance, Dixon's test or Grubbs' test) should best be repeated to verify that the extreme value was not due to an error. The effects of the different factors on the response are calculated. This is done according to equation (2). In equation (2) and in the rest of this chapter n represents the number of runs that are performed at one level of a factor, respectively at the (+) or the (-) level. The symbol N indicates the number of different experiments specified in a design. For instance, for a 2^3 design in which each experiment is performed once, N is equal to 8 and n to 4. A theoretical example of how to calculate an effect is shown in equation (3) and some practical results can be observed in the tables belonging to the case studies described in Section 3.4.9 (Tables 3.22, 3.24, 3.26, 3.28).¹
¹ In refs. [13,36] an effect is calculated as:

$E_X = \frac{\sum Y(+) - \sum Y(-)}{N} = \frac{1}{2n}\left(\sum Y(+) - \sum Y(-)\right)$   (2')
An effect found with this equation is half the effect found with the formula used in eq.
(2). The effect obtained using equation (2) describes the effect that occurs when the factor is changed from one extreme level to the other. The use of equation (2') can be justified as an estimation of the effect that occurs when changing a factor from the nominal to an extreme level. However, the conclusions drawn from eq. (2') are only valid if a number of assumptions is fulfilled: (a) the factor is quantitative and not qualitative; (b) the nominal level is situated in the middle of the interval between the two extreme levels; (c) the response is linear in the interval between the two extreme levels. Since these assumptions are not always fulfilled the use of equation (2') is not recommended.
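Equation (2) translates directly into a few lines of code. The design used below is the 2^3 full factorial of Table 3.5; the responses and the nominal response used for the normalized effect of equation (7) are hypothetical numbers, not results from the case studies of Section 3.4.9.

```python
# Sketch of equation (2): the effect of a factor is the mean response at its
# (+) level minus the mean response at its (-) level.
def effect(levels, responses):
    plus = [y for l, y in zip(levels, responses) if l > 0]
    minus = [y for l, y in zip(levels, responses) if l < 0]
    return sum(plus) / len(plus) - sum(minus) / len(minus)

# 2^3 full factorial of Table 3.5 (columns A, B, C) with hypothetical responses
design = [(-1, -1, -1), (1, -1, -1), (-1, 1, -1), (1, 1, -1),
          (-1, -1, 1), (1, -1, 1), (-1, 1, 1), (1, 1, 1)]
y = [97.8, 99.1, 98.2, 99.6, 98.0, 99.0, 98.4, 99.8]
y_nominal = 98.7                     # hypothetical mean response at nominal conditions

for i, name in enumerate("ABC"):
    E = effect([row[i] for row in design], y)
    print(f"E_{name} = {E:+.2f}  ({100 * E / y_nominal:+.1f} % of nominal)")   # eq. (7)
```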
When a factor is examined at three levels, two effects for each factor can be calculated:

$E_{X(+1)} = \frac{\sum Y(+1)}{n} - \frac{\sum Y(0)}{n}$

and

$E_{X(-1)} = \frac{\sum Y(0)}{n} - \frac{\sum Y(-1)}{n}$

where $E_{X(+1)}$ and $E_{X(-1)}$ are the two effects of factor X, respectively between the nominal and the high extreme level and between the nominal and the low extreme level; $\sum Y(+1)$, $\sum Y(-1)$ and $\sum Y(0)$ are the sums of the responses where factor X was respectively at level (+1), (-1) and (0), and n is the number of runs where the factor was at each level. The effects on a response can also be normalized on a scale between 0 and 100% by dividing the effect of a factor by the mean nominal response ($\bar{y}_{nom}$) and multiplying by 100:

$E_X(\%) = \frac{E_X}{\bar{y}_{nom}} \times 100$   (7)
In this way the effect is expressed as a percentage of the response at nominal level. The effects and normalized effects on a response can be arranged from highest to lowest to show which effects have the largest influence on the considered response.
3.4.7 Statistical analysis of the results To determine whether an effect is statistically significant or not, a statistical interpretation method is used. Different possibilities have been described. An overview of them is given below. 3.4.7.1 Normal or half normal probability plots Normal probability plots or half normal probability plots (Birnbaun plots) [24,29] are graphical methods that help to decide which factors are significant. Effects that are normally distributed around zero are effects
that are not different from experimental error (nonsignificant effects) and they tend to fall on a straight line in these plots. Significant effects do not belong to such a normal distribution and deviate from this line. A normal probability plot and a Birnbaun plot are shown in Figure 3.2. In Figure 3.2a the main effects for factors M and A and the two-factor interaction MC are significant. Factor M has the largest influence on the considered response. In Figure 3.2b factor E has a highly significant influence on the response. The effect of factor B can be considered as being on the edge of significance since it is not always obvious how to draw the straight line exactly.
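A half-normal plot of this kind can be sketched as follows; the effect values are invented for illustration and the plot requires matplotlib, which is an assumption rather than something used in this chapter.

```python
# Sketch of a half-normal (Birnbaun) plot: the absolute effects are ranked and
# plotted against half-normal quantiles; points well above the straight line
# through the small effects suggest significant factors.
import statistics
import matplotlib.pyplot as plt

effects = {"A": 0.9, "B": -0.3, "C": 4.2, "D": 0.5, "E": -0.1, "F": 1.1, "G": 0.2}

names, values = zip(*sorted(effects.items(), key=lambda kv: abs(kv[1])))
m = len(values)
nd = statistics.NormalDist()
quantiles = [nd.inv_cdf(0.5 + 0.5 * (i - 0.5) / m) for i in range(1, m + 1)]

plt.scatter(quantiles, [abs(v) for v in values])
for q, v, name in zip(quantiles, values, names):
    plt.annotate(name, (q, abs(v)))
plt.xlabel("half-normal quantile")
plt.ylabel("|effect|")
plt.show()
```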
Figure 3.2a Example of a normal probability plot (taken from ref. [29]).
Figure 3.2b Example of a Birnbaun plot (half-normal probability plot), showing the variables against the empirical distribution of |E_i| (x10). (Reprinted from International Laboratory, volume 16, page 43, 1986. Copyright 1986 by International Scientific Communications Inc. [24])

3.4.7.2 Interpretation methods that use t-tests

The test statistic [11,13,24,27,28,39] is:

$t = \frac{E_X}{(SE)_e}$
where $E_X$ is the effect of factor X and $(SE)_e$ is the standard error of the effect. In fact one tests whether $E_X$ is significantly different from zero or not. One could also say that one is comparing the mean responses at the two levels to see if they are significantly different. The critical value is a t-value ($t_{critical}$) for the relevant number of degrees of freedom and a given
α (usually 0.05). If the |t|-value for an effect exceeds the critical value, the effect is considered to be statistically significant. In this type of application, one often does not try to determine the degrees of freedom correctly. Some authors [11,39] simply use the value 2 for $t_{critical}$. The $t_{critical}$ value (α = 0.05) really depends on the number of degrees of freedom but tends to 2, especially when the number of degrees of freedom becomes larger. All hypothesis tests can be represented in two ways, i.e. using critical values, as described above, or using confidence intervals. The confidence limits are given by:
$E_X \pm t_{critical} \times (SE)_e$

If the confidence interval of the effect of a factor contains zero, then the effect is not significantly different from zero. When zero is outside the confidence interval, then the factor has a statistically significant influence. A convenient way of representing the results is to calculate an $E_{critical}$ above which a calculated effect of a factor will be considered significant:

$E_{critical} = t_{critical} \times (SE)_e$

Only one $E_{critical}$ has to be calculated for each response. The $E_{critical}$ is then compared with the $|E_X|$-values of the factors. The $|E_X|$-values that are larger are significant.
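Putting the pieces together, the sketch below computes E_critical from an (SE)_e estimated with dummy-factor effects (one of the options discussed later in this section) and flags the factors exceeding it. The effect values are illustrative and the use of scipy for the t-value is an assumption.

```python
# Sketch: estimate (SE)e from the effects of dummy factors in a Plackett-Burman
# design and flag factors whose |effect| exceeds E_critical = t_critical * (SE)e.
import math
from scipy import stats

real_effects = {"pH": 1.8, "flow": 0.4, "temperature": -0.2, "wavelength": 0.1}
dummy_effects = [0.15, -0.30, 0.22]                   # illustrative dummy-factor effects

se_e = math.sqrt(sum(e ** 2 for e in dummy_effects) / len(dummy_effects))
t_crit = stats.t.ppf(1 - 0.05 / 2, df=len(dummy_effects))   # alpha = 0.05, two-sided
E_crit = t_crit * se_e

for name, e in real_effects.items():
    flag = "significant" if abs(e) > E_crit else "not significant"
    print(f"{name:12s} E = {e:+.2f}  (E_critical = {E_crit:.2f})  -> {flag}")
```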
Methods that estimate $(SE)_e$ from the variance of the experiments. Since an effect is a difference of means (equation (2)), the standard error of the effect is calculated according to the equation for the standard error of a difference of means:

$(SE)_e = \sqrt{\frac{s_a^2}{n_a} + \frac{s_b^2}{n_b}}$   (11)

where $s_a^2$ and $s_b^2$ estimate the variances of the two sets of measurements and $n_a$ and $n_b$ are the numbers of measurements in those sets. When
adapting equation (1 1) for the calculation of the standard error of an effect it is assumed that
02 =
0:
= cr2
(estimated by s2) and na=nb=n with s2
being the variance of the experiments from the design and n the number of experiments performed at each level for a factor of the design. This gives2
A number of methods have been proposed to determine the variance, s². They are described below.
1. Using a variance (s²) from R replicate measurements at nominal level [31,39]. The standard error of an effect determined from unreplicated runs is given by equation (12). The number of degrees of freedom for t_critical is equal to R−1.
2. Using duplicated runs for the experiments of the design [31]. The variance is then given by s² = Σd_i²/(2N), where d_i is the difference between the duplicated experiments. The standard error of an effect is derived from equation (12) and given by

(SE)_e = √(2s²/N) = √(Σd_i²)/N
Here n is equal to N since there are N measurements performed at each level of a factor. The number of degrees of freedom for t_critical is N.
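A short sketch of this estimate, with invented duplicate results:

import numpy as np

# Hypothetical duplicated results for an N = 8 experiment design.
run1 = np.array([99.2, 101.5, 98.7, 102.9, 97.1, 100.6, 96.3, 103.2])
run2 = np.array([99.6, 101.1, 99.0, 102.4, 97.5, 100.2, 96.8, 103.0])

d = run1 - run2
N = len(d)
s2 = np.sum(d**2) / (2 * N)          # variance estimated from the duplicate differences
se_effect = np.sqrt(2 * s2 / N)      # eq. (12) with n = N measurements per level
print(s2, se_effect)                 # degrees of freedom for t_critical: N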
* Remark:
If eq. (2') is used to calculate the effects this has a consequence for the formula used to estimate (SE)_e. In eq. (2') the effect is considered as "a mean of N individual results [13]" (in which N/2 results have a (+) sign and N/2 a (−) one). The standard error is then estimated as a standard error on the mean of N results, giving (SE)_e = s/√N.
3. Using the variance obtained with the results (y₁, y₂, ..., y_N) from the design, as proposed by [13]. The variance is given by s² = Σ(y_i − ȳ)²/(N−1), where ȳ is the mean of the results. This criterion can however only be used if no significant effects occur [11]. Therefore, this method should not be used. In reference [13], however, this restriction is not made.
Methods that estimate (SE)_e using negligible effects. In the previous methods the standard error for an effect was estimated from the variance of the experiments using equation (12). In the methods described in this section, the variance for an effect is estimated with the help of calculated effects that are considered to be negligible.
1. Using the effects of multiple-factor interactions from full or fractional factorial designs. Multiple-factor interactions (e.g. three- and four-factor interactions) are considered to have a negligible effect. It is then considered that these higher-order interaction effects measure differences arising from experimental error [31].
The variance of an effect is given here by

(SE)_e² = Σ E²_{XiXjXk} / n_{XiXjXk}

where E_{XiXjXk} are the effects of the multiple-factor interactions and n_{XiXjXk} is the number of these effects. The estimated standard error of an effect, (SE)_e, is then given by

(SE)_e = √( Σ E²_{XiXjXk} / n_{XiXjXk} )   (17)
The number of degrees of freedom used for t_critical is n_{XiXjXk}.
2. Using dummy factors in Plackett-Burman designs [24,26,27]. The effect of a dummy factor is considered to be due to experimental error.
(SE)_e² = Σ E²_{dummy,i} / n_dummy
where E_{dummy,i} are the effects of the dummy factors and n_dummy is the number of dummies. The estimated standard error of an effect is then

(SE)_e = √( Σ E²_{dummy,i} / n_dummy )   (19)
The number of degrees of freedom used for t_critical is n_dummy. The equations used to estimate (SE)_e from negligible effects can be explained as follows. The mean effect of the multiple-factor interactions or of the dummies is expected to be zero. The variance of these effects can then be calculated as:
(SE)_e² = Σ (E_{XiXjXk} − 0)² / n_{XiXjXk}   or   (SE)_e² = Σ (E_{dummy,i} − 0)² / n_dummy
In Plackett-Burman designs the main effects are confounded with a number of two-factor and higher-order interactions, as already seen before. In a ruggedness test one is mainly interested in finding the significant main effects. The two-factor and higher-order interactions are often considered to be negligible. Two-factor interactions are usually indeed smaller than the main effects, but they are not always so small that they can be completely neglected. As a consequence they will contribute to the calculated main effect. By using a number of dummies to estimate the (SE)_e value, one obtains a measure for experimental error to which these interaction effects can contribute. The (SE)_e value obtained with the methods of the previous part of this section, where it was estimated from duplicated experiments in the design or from replicated measurements at nominal level, and that can be considered as a measure for the
repeatability, does not take this into account and is therefore expected to be smaller than the one obtained from the method with the dummies. When (SE)_e is estimated from two-factor interactions in fractional factorial designs (instead of from three-factor or higher-order interactions) one has an interpretation criterion which is analogous to the one with dummies used for the Plackett-Burman designs. A possible consequence of using the dummy factor effects or the two-factor interactions is that significant interactions will increase the (SE)_e considerably. This problem can be avoided in the following way. The strongly significant dummy or two-factor interaction effects can be detected with normal probability plots and omitted from the statistical interpretation. Moreover, enough dummies or two-factor interaction effects should be used to estimate (SE)_e (for example at least three).

3.4.7.3 Interpretation methods that use an F-test

Multiple-factor interaction effects can be used to determine a critical effect value for the interpretation of full and fractional factorial designs [29]:
E_critical,1 = √( (Σ E²_{XiXjXk} / m) · F(1,m) )   (20)

where m is the number of interaction effects considered, E_{XiXjXk} is the effect of a multiple-factor interaction and F(1,m) the F-value of the Fisher distribution. The following critical level, derived from the above described statistical critical level, was also found in the literature [40]:
where k depends on the fraction of the factorial design and represents the number of effects that are confounded with each other, e.g. k = 2 for a half-
fraction factorial design, since 2 effects are confounded with each other. The idea behind the use of E_critical,2 is that the calculated main effect for a factor is also partly attributed to a number of multiple-factor interaction effects. The calculated effect for a non-significant factor can, according to [40], become statistically significant when using E_critical,1, due to the contribution of the multiple-factor interaction effects to the calculated effect, while this would not be the case with E_critical,2. However, in the formula for the calculation of E_critical,2 there is a contradiction. On the one hand it considers the multiple-factor interactions as being negligible (see the calculation of E_critical,1 in equation (20) as well as the analogy with equation (14)), while on the other hand it assumes that the multiple-factor interaction effects confounded with the main effects have the same order of magnitude as the main effects (see the use of factor k). Therefore, we do not recommend its use. One can also use the dummy factor effects in a similar way for the interpretation of Plackett-Burman designs. One then uses a formula analogous to equation (20). Instead of using the multiple-factor effects in the formula one uses the dummy factor effects:
E_critical = √( (Σ E²_{dummy,i} / n_dummy) · F(1, n_dummy) )
where E_{dummy,i} is the effect of a dummy factor, n_dummy is the number of dummies used and F(1, n_dummy) the corresponding value of a Fisher distribution. These interpretation criteria are less used in the literature than the t-test methods. However, both methods will yield identical results since F(1,df) = t²(df) [41]. Some authors [29,34] present the statistical interpretation method as an ANOVA table. A general example for a 2³ full factorial design is given in Table 3.19. The sums of squares (SS_X) are obtained with the effect values (E_X) and the number of experiments in the design (N). The mean square
values (MS_X) are calculated with the SS_X and the corresponding number of degrees of freedom (df_X). The variance ratio, F_X, is obtained by dividing the MS_X by the mean square value representing the error (MS_error). The calculated F_X value is then compared to an F_critical that is equal to the tabulated F(df_X, df_error). If F_X is larger the effect is statistically significant. The error term can be approximated in different ways. A first possibility is that, analogously to the above, it is estimated from the multiple-factor interactions (two-, three-factor interactions, etc.) for (fractional) factorial designs [29]. In the example of Table 3.19 the sums of squares of the interactions AB, AC, BC and ABC are summed, giving an SS_error with 4 degrees of freedom. From this SS_error an MS_error is calculated that is used in the calculation of F_X for the main effects. The ANOVA table and equation (20) give of course the same results. A second possibility is described in reference [34]. It consists of summing all SS_X values for which the %SS_X value (see Table 3.19) is smaller than 5% to make an approximation of SS_error. The method assumes that these sums of squares come from effects that are negligible. The 5% value is an arbitrary value. The number of degrees of freedom is also equal to the sum of the df_X. The values for MS_X and F_X are then calculated analogously to what is described above. Thirdly, when the experiments of the design are replicated, a number of degrees of freedom remain after the calculation of the SS_X values and they allow one to calculate an SS_error. This SS_error and the df_error are then used to calculate MS_error and F_X [34].
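As an illustration of how such a table can be built, the following sketch (with an invented 2³ data set; SciPy is assumed for the F-distribution) pools the interaction sums of squares into the error term, as in the first possibility described above. For a two-level design the sum of squares of an effect follows from SS_X = N·E_X²/4.

import itertools
import numpy as np
from scipy import stats

# Hypothetical responses of a 2^3 full factorial design.
levels = np.array(list(itertools.product([-1, 1], repeat=3)))
y = np.array([45.1, 47.9, 44.6, 48.8, 45.5, 48.2, 44.9, 49.5])
N = len(y)

# Columns for the main effects and all interactions.
cols = {"A": levels[:, 0], "B": levels[:, 1], "C": levels[:, 2]}
cols.update({"AB": cols["A"] * cols["B"], "AC": cols["A"] * cols["C"],
             "BC": cols["B"] * cols["C"], "ABC": cols["A"] * cols["B"] * cols["C"]})

effects = {k: y[v == 1].mean() - y[v == -1].mean() for k, v in cols.items()}
ss = {k: N * e**2 / 4 for k, e in effects.items()}           # SS_X from E_X and N

# Pool the interaction sums of squares into the error term (4 degrees of freedom).
ss_error = sum(ss[k] for k in ("AB", "AC", "BC", "ABC"))
ms_error = ss_error / 4

for k in ("A", "B", "C"):
    F = ss[k] / ms_error                                     # MS_X / MS_error, df_X = 1
    p = stats.f.sf(F, 1, 4)
    print(f"{k}: SS = {ss[k]:6.2f}  F = {F:6.2f}  p = {p:.3f}")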
3.4.8 Using predefined values to identify chemically relevant factors When using predefined values [4] no statistical test is performed to identify relevant factors. So-called chemically relevant effects of factors are identified by comparing them with predefined critical values for the responses.
TABLE 3.19
ANOVA TABLE FOR A 2³ FACTORIAL DESIGN

Source of variation (factors)   Degrees of freedom   Sum of squares (SS)   %SS        Mean square   Variance ratio F
A                               df_A (=1)            SS_A                  %SS_A      MS_A          F_A
B                               df_B (=1)            SS_B                  %SS_B      MS_B          F_B
C                               df_C (=1)            SS_C                  %SS_C      MS_C          F_C
AB                              df_AB (=1)           SS_AB                 %SS_AB     MS_AB         (pooled into MS_error)
AC                              df_AC (=1)           SS_AC                 %SS_AC     MS_AC         (pooled into MS_error)
BC                              df_BC (=1)           SS_BC                 %SS_BC     MS_BC         (pooled into MS_error)
ABC                             df_ABC (=1)          SS_ABC                %SS_ABC    MS_ABC        (pooled into MS_error)

General notation: df_X, SS_X, %SS_X, MS_X, F_X
The authors also define a standard error although differently from the one previously given in equation 12. For that reason, when speaking about the standard error defined in reference [4] it will be indicated as the relative standard deviation of the experiments (RSD) since in fact that is what is calculated. The calculated effects are normalized on a scale of 100%. The experiments in the design are duplicated. The effect and the relative standard deviation are calculated as follows:
E_X(%) = (E_X / ȳ) · 100

RSD(%) = (s / ȳ) · 100
where E_X represents the effect of factor X as described above and ȳ is the mean result of a number of measurements at nominal level (obtained from within or outside the design). In this method the effect values E_X(%) are compared with predefined values to identify relevant factors. These predefined values do not represent the limit of statistical significance but the limit of chemical relevance. These limits represent acceptable variations that are allowed to occur in practice. A list of these predefined values for the effect of factors on responses measured in HPLC is shown in Table 3.20. The relative standard deviation is not used in a statistical test [4]. It is only used to check if the repeatability of the method is good enough. If the relative standard deviation is larger than 1%, the repeatability is considered to be insufficient for an HPLC method. In that case the reason for the large relative standard deviation has to be diagnosed prior to the interpretation of the main effects from the ruggedness test. To obtain a standard error as defined earlier in this review one should use the formula described in equation (12) and also presented in reference [17], which in this case would give (SE)_e = √(Σd_i²)/N.
TABLE 3.20
LIST OF SOME PREDEFINED VALUES IN HPLC (REPRINTED WITH PERMISSION FROM REFERENCE [4])

Response            Predefined value
Conc. peak area     1%
Conc. peak height   1%
Plate count         50%
Retention time      10%
Peak area           2%
Peak height         2%
Resolution          50% or 2.5
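As a minimal sketch of how such a comparison could be automated (the effect values below are invented; the limits follow Table 3.20):

# Hypothetical normalized effects E_X(%) of a factor on several HPLC responses,
# compared with the predefined limits of chemical relevance (Table 3.20).
predefined = {"retention time": 10.0, "peak area": 2.0, "resolution": 50.0}
effects_pct = {"retention time": 3.1, "peak area": 2.6, "resolution": 12.4}  # invented

for response, limit in predefined.items():
    e = effects_pct[response]
    verdict = "relevant" if abs(e) > limit else "acceptable"
    print(f"{response}: E_X = {e:+.1f}% (limit {limit}%) -> {verdict}")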
3.4.9 Case studies
3.4.9.1 Ruggedness tests of an HPLC method for the determination of tetracycline.HCl as described in the USP XXII [38]

A Plackett-Burman design (N=12) and a quarter-fraction factorial design (generators: E=ABC, F=BCD) were used to examine six factors. The factors were examined at two levels. The factors and their levels are shown in Table 3.21. The effects and the normalized effects on the following responses were determined: retention time, capacity factor, relative retention of tetracycline and of three by- and degradation products (4-epianhydrotetracycline (EATC), 4-epitetracycline (ETC) and anhydrotetracycline (ATC)) and the resolution between the peaks. Critical effect values were obtained with a t-test (see equations (8) and (10)). The standard error was estimated from dummy factor effects for the Plackett-Burman design (see Section 3.4.7.2, equation (19)) and from two-factor interaction effects for the fractional factorial design (see Section 3.4.7.2, equation (17)). Normal probability plots were also drawn for the normalized effects. The effects of the factors on some responses of tetracycline are given in Table 3.22 and one of the normal probability plots is shown in Figure 3.3. Analogous results are found from the statistical analyses of the Plackett-Burman and the fractional factorial design in spite of the different level of confounding in these designs and of the different ways of estimating (SE)_e.
The normal probability plots also lead to identical conclusions as the statistical analysis with the t-tests.
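The dummy-based critical effect for the resolution can be reproduced from the values reported in Table 3.22; a minimal sketch (SciPy is assumed for the tabulated t-value):

import numpy as np
from scipy import stats

# Dummy factor effects on the resolution, Plackett-Burman design (Table 3.22).
e_dummy = np.array([0.1198, 0.1842, 0.1540, -0.0212, 0.0003])

se_effect = np.sqrt(np.mean(e_dummy**2))          # eq. (19)
t_crit = stats.t.ppf(1 - 0.05 / 2, df=len(e_dummy))
e_crit = t_crit * se_effect                       # eq. (10)
print(f"(SE)_e = {se_effect:.3f}, E_critical = {e_crit:.3f}")   # about 0.12 and 0.31

# The effect of factor F on the resolution (-0.3133) exceeds E_critical,
# while that of factor B (-0.0521) does not.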
[Figure 3.3: normal probability plot with the observed value (=E_X) on the horizontal axis; the points for C, F and BD+CF deviate from the straight line through the remaining effects.]
Figure 3.3 Normal probability plot of the normalized effects for the resolution between epianhydrotetracycline and tetracycline obtained from the fractional factorial design
TABLE 3.21
FACTORS AND THEIR LEVELS FOR THE DETERMINATION OF TETRACYCLINE.HCL (CASE STUDY 1)

Factors                                                        Levels
A. Inorganic substances in mobile phase,
   ratio M(ammonium oxalate) / M(ammonium phosphate)           0.0975 M / 0.195 M     0.1025 M / 0.205 M
B. Dimethylformamide in mobile phase                           260 ml                 280 ml
C. pH of mobile phase                                          7.50                   7.80
D. Flow of mobile phase                                        0.9 ml/min             1.1 ml/min
E. Integration parameter (S/N ratio)                           1                      3
F. Age of column                                               new column             2 weeks used
One concludes for example that for the response capacity factor the ageing of the column (factor F) has the most significant effect. The amount of DMF in the mobile phase (factor B) has a smaller effect, which is on the limit of significance. The effect of the factors A, C, D and E on the capacity factor is not significant. For the retention time the same conclusions as for the capacity factor can be drawn, with the only difference that the flow of the mobile phase (factor D) has a large effect. Concerning the resolution between the different peaks, the ageing of the column (F) has the largest influence. The pH (C) also has a significant effect, while the other factors are not significant. From Figure 3.3 and Table 3.22 it can be observed that the effect estimated for the two-factor interactions BD+CF is also significant. Considering the fact that the main effects of the factors C and F are highly significant, it is most likely that the significant effect found for BD+CF is due to the interaction CF and not to BD nor to one of the other two higher-order interactions confounded in this estimate. For the relative retention analogous results to those for the resolution were obtained. The same factors cause an effect.
TABLE 3.22
EFFECTS, NORMALIZED EFFECTS (%E) AND CRITICAL EFFECT VALUES ON SOME RESPONSES OF TETRACYCLINE FROM THE PLACKETT-BURMAN AND THE FRACTIONAL FACTORIAL DESIGN

Plackett-Burman design
             Retention time      Capacity factor     Relative retention    Resolution
Factors      Effect     %E       Effect     %E       Effect      %E        Effect     %E
A            0.085      1.18     0.048      2.52     -0.0370     -2.14     -0.1148    -6.35
B            -0.363     -5.03    -0.128     -6.68    -0.0502     -2.91     -0.0521    -2.89
C            0.230      3.19     0.071      3.72     0.0912      5.28      0.3584     19.84
D            -1.344     -18.63   -0.067     -3.51    -0.0416     -2.41     -0.2552    -14.13
E            -0.033     -0.46    0.000      -0.01    0.0178      1.03      0.1869     10.34
F            -0.929     -12.88   -0.367     -19.15   -0.2808     -16.27    -0.3133    -17.34
Dummy 1      0.160      2.22     0.080      4.16     0.0727      4.21      0.1198     6.63
Dummy 2      0.176      2.45     0.086      4.49     0.0495      2.87      0.1842     10.19
Dummy 3      0.136      1.88     0.035      1.82     0.0597      3.46      0.1540     8.52
Dummy 4      -0.177     -2.46    -0.060     -3.14    0.0028      0.16      -0.0212    -1.17
Dummy 5      0.005      0.07     -0.010     -0.52    0.0203      1.18      0.0003     0.02
E_critical   0.376      5.21     0.157      8.20     0.125       7.21      0.309      17.13

Quarter-fraction factorial design
             Retention time      Capacity factor     Relative retention    Resolution
Factors      Effect     %E       Effect     %E       Effect      %E        Effect     %E
A            0.096      1.33     0.030      1.54     -0.0122     -0.71     0.0882     4.88
B            -0.200     -2.77    -0.094     -4.93    -0.0237     -1.37     -0.0019    -0.10
C            0.236      3.28     0.097      5.06     0.0991      5.74      0.4166     23.06
D            -1.192     -16.53   -0.004     -0.19    -0.0175     -1.02     0.1374     7.60
E            0.097      1.34     0.021      1.12     0.0141      0.82      -0.0339    -1.87
F            -0.854     -11.84   -0.328     -17.10   -0.2020     -11.70    -0.6431    -35.60
AB+CE        0.161      2.23     0.043      2.26     0.0151      0.87      0.0716     3.96
AC+BE        0.144      2.00     0.045      2.36     0.0086      0.50      -0.0076    -0.42
AD+EF        -0.006     -0.09    0.014      0.74     0.0211      1.22      -0.0822    -4.55
AE+BC        0.216      2.99     0.040      2.08     -0.0225     -1.30     0.0906     5.02
AF+DE        -0.142     -1.97    -0.042     -2.21    -0.0258     -1.50     0.0535     2.96
BD+CF        -0.128     -1.77    -0.050     -2.62    -0.0893     -5.17     -0.4053    -22.43
BF+CD        0.038      0.53     0.034      1.75     0.0010      0.06      -0.0802    -4.44
E_critical   0.324      4.49     0.094      4.93     0.089       5.15      0.393      21.78
E_critical without effect of (BD+CF)      -          -           0.039     2.27       0.153      8.49
3.4.9.2 HPLC assays of acetylsalicylic acid and its major degradation product, salicylic acid [5,17] and of Salbutamol and its major degradation product [5,6]

These two case studies are good examples of ruggedness tests that focus on the type of data analysis described in Section 3.4.6 and where one prefers not to carry out a statistical analysis. They will be described in detail later in this book (Chapter 5). The factors are examined at three levels in reflected Plackett-Burman designs. From the results of the designs normalized effects, E_X(%), are calculated. No statistical interpretation criterion is used to identify significant effects, but possibly relevant factors are determined by plotting the effects, E_X(%), of the different factors (see Figure 3.4).
Figure 3.4 The effects, E_X(%), of the different factors on the resolution (R_s) between Salbutamol and its major degradation product (extracted from ref [4])
Other results obtained from the ruggedness test are the definition of optimized method conditions for the factors and of system suitability criteria for a number of responses. System suitability parameters [6,17] are defined as an interval in which a response can vary for a rugged method. The system suitability criteria are the range of values between which a response (e.g. retention time, capacity factor, number of theoretical plates, resolution) can vary without affecting the quantitative results of the analysis. For instance, a design is performed and the retention time of the main substance varies between 200 s and 320 s without affecting the quantitative determination of the substances. The system suitability criterion for the retention time is then defined as the interval 200 s - 320 s. Optimal values for the factors are selected from the tested levels for the factors (extremes or nominal) as a function of a number of responses of the method (see also references [16,19]). When one changes the method conditions as a result of these findings one has to be aware that a new method is defined. What is done here is in fact a simplistic way of optimizing a method. The optimization of a method, however, is a step that is expected to come much earlier in the method development than the ruggedness testing. One also has to realize that when one defines a new method this requires a new full validation, including a ruggedness test.
TABLE 3.23
FACTORS AND THEIR LEVELS FOR THE DETERMINATION OF LINCOMYCINE A [13]

Factor                                   Nominal level   Minimal level   Maximal level
Concentration lincomycine A (mg/ml)      300             285             315
pH of mobile phase                       4.5             4.0             5.0
Flow of mobile phase (ml/min)            1.0             0.7             1.3
TABLE 3.24
EFFECTS ON THE RESPONSE ANALYTICALLY DETERMINED CONCENTRATION LINCOMYCINE A FOR THE RUGGEDNESS TEST OF CASE STUDY 3 [13]

Factor                            Effect    Confidence interval on the effect   Conclusion
A: Concentration lincomycine A    +15       +1 to +29                           Significant
B: pH of mobile phase             +4.5      -9.5 to +18.5                       Not significant
C: Flow                           +0.5      -13.5 to +14.5                      Not significant
Interaction AB                    -0.25     -14.3 to +13.8                      Not significant
Interaction AC                    -0.75     -14.8 to +13.3                      Not significant
Interaction BC                    -0.75     -14.8 to +13.3                      Not significant
3.4.9.4 Ruggedness test for the determination of water in fertilizers by total distillation with heptane [12]

Seven factors were examined in a 1/16th fraction of a 2⁷ design. The factors and their levels are shown in Table 3.25. The factors were examined at the nominal and at an extreme level. A design was performed on four types of fertilizers containing, when determined under nominal conditions, respectively 18.80%, 1.10%, 3.83% and 1.03% water. For the first three fertilizers the results of the design and the calculated effects are given in Table 3.26. No statistical test was performed to identify significant effects. Important effects are determined by ranking the effects and comparing their values with each other. Due to the large difference in the water content, comparisons between the effects in the different fertilizers (see Table 3.26) are clearer when normalized effects are used. For fertilizer 2, for instance, the effects are smaller in absolute value than for fertilizer 1, but much larger in relative value. The use of the relative normalized effects allows clearer comparisons between effects of a factor on the response of different samples.
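The normalized effect is simply the effect divided by the result obtained under nominal conditions, multiplied by 100; a short sketch reproducing the entries for factor G in Table 3.26:

# Normalized effect (%) = 100 * effect / result under nominal conditions (Table 3.26).
nominal = {1: 18.80, 2: 1.10, 3: 3.83}            # % water in fertilizers 1-3
effect_G = {1: 0.99, 2: 0.11, 3: -0.09}           # effect of factor G (reagent)

for fert, e in effect_G.items():
    print(f"fertilizer {fert}: {100 * e / nominal[fert]:+.2f} %")
# -> +5.27 %, +10.00 %, -2.35 %  (as in Table 3.26)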
TABLE 3.25
THE FACTORS AND THEIR LEVELS EXAMINED FOR THE DETERMINATION OF WATER IN FERTILIZERS (CASE STUDY 4) [12]

Factors                  Nominal level   Extreme level
A. Amount of water       ca. 2 ml        ca. 5 ml
B. Reaction time         0 min           15 min
C. Distillation rate     2 drops/s       6 drops/s
D. Distillation time     90 min          45 min
E. n-Heptane             210 ml          190 ml
F. Aniline               8 ml            12 ml
G. Reagent               new             used
TABLE 3.26
RESULTS FOR THE RUGGEDNESS TEST ON THE DETERMINATION OF WATER IN FERTILIZERS [12]

Amount of water (%)
Exp.   Fertilizer 1   Fertilizer 2   Fertilizer 3
1      18.80          1.10           3.93
2      20.58          1.74           4.10
3      19.90          1.02           3.59
4      18.03          1.19           3.38
5      19.50          1.10           3.49
6      19.16          1.13           3.77
7      19.88          1.24           3.97
8      19.85          1.27           3.40

           Effect                          Normalized effect (%)
Factors    1        2        3             1        2         3
A          0.27     -0.07    -0.07         1.44     -6.36     -1.83
B          -0.10    -0.09    -0.21         -0.53    -8.18     -5.48
C          -0.11    0.21     -0.06         -0.86    19.09     -1.57
D          -0.63    -0.23    -0.27         -3.35    -20.91    -7.05
E          0.08     0.19     0.09          0.43     17.27     2.35
F          0.83     0.11     0.33          4.41     10.00     8.62
G          0.99     0.11     -0.09         5.27     10.00     -2.35
3.4.9.5 Ruggedness test on an HPLC assay for the determination of phenylbutazone and its major degradation products [22]

A Plackett-Burman design for 7 factors (N=8) was applied. The factors are tested at the nominal and an extreme level. They are shown in Table 3.27. For three brands of phenylbutazone (injectable solutions) and a reference solution (standard), experiments according to a design were performed. The responses considered are the amounts of phenylbutazone and two of its degradation products, expressed as a percentage of the theoretical amount of phenylbutazone in the injection solution. To identify significant effects they were compared to a critical value. From reference [22] it is not clear how this critical value was determined. The results for one brand of phenylbutazone are given in Table 3.28. From the results of the three brands it was observed that the only factor with a really significant influence was the age of the reference solution (C in Table 3.28). The ruggedness test shows that the method description should specify that the test solutions have to be freshly prepared shortly before analysis.
TABLE 3.27
FACTORS AND THEIR LEVELS FOR THE DETERMINATION OF PHENYLBUTAZONE (CASE STUDY 5) (REPRINTED WITH PERMISSION FROM [22])

Factors                                               Nominal level      Extreme level
A. Ionic strength of buffer                           0.10 M             0.09 M
B. pH of buffer                                       5.25               5.35
C. Age of the reference solution                      0 h                18 h
D. Concentration in the reference solution            200 mg/ml          180 mg/ml
E. Composition of the mobile phase
   (TRIS/citrate - acetonitrile - tetrahydrofuran)    51 : 46.5 : 2.5    52 : 46 : 2
F. Detection wavelength                               237 nm             239 nm
G. Flow of mobile phase                               2.0 ml/min         1.9 ml/min
TABLE 3.28
SOME RESULTS FOR THE RUGGEDNESS TESTS ON PHENYLBUTAZONE (REPRINTED WITH PERMISSION FROM [22])

Exp.   Deg. prod. I   Deg. prod. II   Phenylbutazone
1      1.559          0.743           97.78
2      1.553          0.739           101.58
3      1.506          0.719           98.39
4      1.649          0.715           102.54
5      1.568          0.743           96.80
6      1.570          0.714           100.82
7      1.533          0.756           95.84
8      1.573          0.748           103.65

Factors                   Deg. prod. I   Deg. prod. II   Phenylbutazone
A                         -0.010         +0.008          +0.795
B                         +0.005         +0.002          -0.860
C                         +0.010         -0.042          -4.945
D                         +0.025         -0.022          +0.075
E                         -0.010         -0.022          +0.970
F                         +0.005         +0.048          +1.035
G                         +0.005         +0.028          -0.860
Critical value P<0.05     0.023          0.057           3.961
Critical value P<0.01     0.031          0.075           5.213
3.4.9.6 Ruggedness test on a TLC assay of a degradation product of diclofenac sodium in a pharmaceutical formulation [1]

The ruggedness was examined on a test solution prepared from a tablet. Each of the factors was tested at three levels. The factors and their levels are:
A) batch of plates: 3 different batches;
B) composition of the mobile phase, ratio dichloromethane / methanol: nominal 92/8, extremes 93/7 and 91/9;
C) developing temperature: nominal 20°C, extremes 15°C and 30°C.
A reflected half-fraction factorial design for three factors (2³⁻¹) was performed. The influence of the factors on the responses recovery (%), resolution between peaks (R_s) and R_F value was calculated. Approximate critical values were obtained using the method given by Youden and Steiner (E_critical = 2(SE)_e). The standard error was estimated from
replicate measurements at nominal level. However, it has to be remarked that the E_critical = 2(SE)_e = 2·√(2s²/4) = √2·s used in reference [1] is the one described by Youden [11] for a design with 8 experiments, while here in fact only a 4-experiment design was performed. For a 4-experiment design the correct E_critical = 2(SE)_e = 2·√(2s²/2) = 2s (see equations (10) and (12), with t_critical considered equal to 2).
3.4.9.7 Ruggedness testing of a gas chromatographic method for residual solvents in steroids [16]

This case study shows that in practice other designs than the two-level screening designs are used for ruggedness testing. Two groups of four factors, i.e. four factors related to the injection process and four factors related to separation and detection, were selected and examined in separate central composite designs. From the results of the experimental designs a functional relationship between a chromatographic response (e.g. area) and the investigated factors is estimated and used to establish the significance of the factors. An example of such a relationship is:
A = 0.307 + 0.0077·I − 0.0040·L   (29)
where A is the chromatographic area; I and L are respectively the factors injection temperature and liner type. Inserting tolerance intervals of the chromatographic responses in this equation results in rugged intervals for the factors. By comparing these intervals with the inaccuracy of the settings of the experimental conditions a statement about the ruggedness of the method is made. The tolerance intervals of the responses are defined by the experimenter, e.g. a 2.5% difference in the area response between two independent analyses is considered acceptable in reference [16], i.e. a value of 0.025·0.307 = 0.0076 for the above mean response. The rugged interval for the injection temperature is then obtained from equation (29):

0.307 ± 0.0076 = 0.307 + 0.0077·I   ⇒   −0.99 < I < +0.99
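The arithmetic above can be checked with a few lines (a sketch; factor L is kept at its nominal, coded level of zero, and the interval for I is expressed in coded units):

# Rugged interval for the (coded) injection temperature I from equation (29),
# A = 0.307 + 0.0077*I - 0.0040*L, with L at its nominal (coded) level 0.
mean_area = 0.307
tolerance = 0.025 * mean_area          # 2.5 % of the mean response, about 0.0076
i_limit = tolerance / 0.0077           # |I| below which A stays within tolerance
print(f"-{i_limit:.2f} < I < +{i_limit:.2f}")   # about -0.99 < I < +0.99 (coded units)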
In practice this gives a rugged interval of 143-157°C for the injection temperature, with 150°C being the nominal level. The rugged interval is then compared to the "inaccuracy of the settings", which was here 150°C ± 2°C, and the conclusion is that the factor is rugged.

3.4.10 Expert systems and software packages for ruggedness testing

A number of software packages and expert systems for ruggedness testing have been developed. RES (commercialized under the name "Shaiker") is an expert system created by Van Leeuwen et al. [4,23] and has been validated and evaluated [42,43]. It uses fractional factorial and Plackett-Burman designs and allows the factors to be tested at two or three levels. The interpretation criteria used here are the predefined values (see Section 3.4.8). Merck also recently proposed an expert system called Ruggedness Method Manager for ruggedness tests of chromatographic assay methods. The system uses fractional factorial designs. Besides the factors to be examined, interactions that could possibly also be relevant have to be defined by the user. The system then calculates a design in which the main effects are not confounded with one of the specified interactions. The interpretation criterion used to identify statistically significant effects is not known to the authors of this chapter. Many commercial statistical or chromatographic software packages also allow one to set up a ruggedness test. This is for instance the case with Statgraphics® Plus, Unscrambler II and DryLab®.
This list is far from complete. Finally, one can use spreadsheets (e.g. Excel, Lotus) to calculate the effects of the factors and to interpret the results. An example with Lotus 1-2-3 is given in reference [44].
3.5 RUGGEDNESS TESTING OF NON-PROCEDURE RELATED FACTORS: THE USE OF NESTED DESIGNS

The examination of more than one of the non-procedure related factors (e.g. different laboratories, analysts, instruments, columns or batches of reagents, days) by Plackett-Burman and fractional factorial designs causes problems. These designs require combinations that are impossible to
execute in practice. For example, if the factors "laboratory" and "analyst" are examined, it is not possible for an analyst to move from one laboratory to another to perform a number of experiments in each laboratory. Similar problems occur when examining different column factors. As already mentioned before, one is able to examine only one of the three column factors at a time in a design (see Section 3.4.2). Suppose for example that one would like to examine the factors "batch" and "manufacturer", among other factors, in a Plackett-Burman design and that the design for seven factors described in Table 3.16 was selected. In this design each factor has two levels, e.g. factor A (= manufacturer) has a (+) level for column K and a (−) level for column L, and factor B (= batch number) has a (+) level for batch B1 and a (−) level for batch B2. This would give the design shown in Table 3.29. The combinations necessary to perform the design as usual are impossible. The Plackett-Burman and fractional factorial designs are not useful here. To study such factors one can use a nested design and interpret the results with a nested or hierarchical ANOVA. An example of a nested design is shown in Figure 3.5 and the ANOVA table for it is given in Table 3.30. The designs and ANOVA are called nested because the subordinate classification is nested within the higher level of classification. An example from practice of nested ANOVA is given by Wernimont [45]. Two different ANOVA models can be considered, namely the Model I or fixed effect model and the Model II or random effect model. In a Model I ANOVA (fixed effect model) differences among groups are assumed to be due to the fixed treatment effects investigated by the experimenter. The term fixed treatment effect means that the effect of a factor is considered to be due to a deviation of the mean of group j from the grand mean. In ruggedness testing a group is the set of experiments performed at one level of a certain factor. With fixed treatment effect is meant that there is a systematic effect when changing from one level of the factor to another. The purpose of the Model I ANOVA is to decompose each result as y_ij = μ + a_j + e_ij, where μ is the population mean, a_j the effect of group j and e_ij the randomly distributed error or residual. Significance in a fixed effect
ANOVA means that at least one group shows a significantly different result from the others.
[Figure 3.5: tree diagram of the nested design; 3 laboratories (a = 3), each with 2 analysts (b = 2), each analyst working on 3 instruments (c = 3, instruments 1-18), and 4 days per instrument (n = 4, days 1-72).]
Figure 3.5 Design for a nested ANOVA in which the factors laboratories, analysts, instruments and days are examined
TABLE 3.29
A PLACKETT-BURMAN DESIGN FOR SEVEN FACTORS OF WHICH TWO ARE CHROMATOGRAPHIC COLUMN FACTORS (MANUFACTURER AND BATCH) IS IMPOSSIBLE

Experiment   Factor A (manufacturer)   Factor B (batch)
1            K                         B1
2            L                         B1
3            L                         B2
4            K                         B2
5            L                         B1
6            K                         B2
7            K                         B1
8            L                         B2
(The factors C-G take the (+) and (−) levels of the design described in Table 3.16.)
In a Model II ANOVA (random effect model) the result can be decomposed as y_ij = μ + A_j + e_ij, where A_j represents a normally distributed variable with mean zero and variance σ_A². In this model one is not interested in a specific effect due to a certain level of the factor, but in the general effect of all levels on the variance. That effect is considered to be normally distributed. Since the effects are random it is of no interest to estimate the magnitude of these random effects for any one group, or the differences from group to group. What can be done is to estimate their contribution σ_A² to the total variance. The subordinate level of a nested ANOVA is always Model II (random effect model). The highest level of classification of a nested ANOVA may be Model I (fixed effect model) or Model II. If it is Model II it is called a pure Model II nested ANOVA. If the highest level is Model I it is called a mixed model nested ANOVA. In nested designs performed for a ruggedness test one does not determine the specific effect of one or a number of groups from a factor but the general effect of all groups of that factor (expressed as a variance). For example, one does not determine the specific effect of one or a number of laboratories but the general effect of all laboratories. In contrast, Plackett-Burman and fractional factorial designs are fixed effect designs. In Plackett-Burman and fractional factorial designs the effect, E_X, that is determined is the fixed effect between the two groups of observations consisting of the two levels of that factor. The use of ANOVA for the statistical interpretation of the effects calculated from the Plackett-Burman and fractional factorial designs is already explained in Section 3.4.7.3. From nested designs one can estimate the part of the total variance of the response that is caused by a factor. Interactions between two factors cannot be estimated from this kind of design. The ANOVA table for the nested design of Figure 3.5 is given in Table 3.30. The F-values for the factors are obtained by dividing the mean square (MS) of a factor by the MS exactly below it in the table. The calculated F-values are significant at the 5% level when they are larger than the critical F-values obtained from an F-table.
TABLE 3.30
ANOVA TABLE FOR THE NESTED DESIGN OF FIGURE 3.5

Source of variation            df           SS         MS         Expected MS                                         F-value             Critical F-value (α = 0.05)
Laboratories                   a − 1        SS_lab     MS_lab     σ²_days + n·σ²_instr + nc·σ²_anal + ncb·σ²_lab      MS_lab/MS_anal      F[a−1, a(b−1)]
Analysts within laboratories   a(b − 1)     SS_anal    MS_anal    σ²_days + n·σ²_instr + nc·σ²_anal                   MS_anal/MS_instr    F[a(b−1), ab(c−1)]
Instruments within analysts    ab(c − 1)    SS_instr   MS_instr   σ²_days + n·σ²_instr                                MS_instr/MS_days    F[ab(c−1), abc(n−1)]
Days within instruments        abc(n − 1)   SS_days    MS_days    σ²_days
Total                          abcn − 1     SS_total   MS_total

Remark: The meaning of a, b, c and n is explained in Figure 3.5.
From the mean square (MS) values and the formulas for the expected mean squares (see Table 3.30) one can calculate the variance components. For the example of Figure 3.5 this gives:

MS_instr = s²_days + n·s²_instr   ⇒   s²_instr = (MS_instr − MS_days) / n
The variance component can be expressed as a percentage of the sum of the variances [21]:
s² = s²_days + n·s²_instr + nc·s²_anal + ncb·s²_lab

in which s²_lab/s² · 100% represents the contribution of the laboratories.
In this way the contribution of the different factors to the total variance is determined. Suppose that the contribution of the factor instruments was most significant in the example of Figure 3.5. If one then wants to obtain a better overall reproducibility the reason for the large variance due to the instruments must be investigated and corrected. The nested designs can be used to perform a ruggedness test following the definition of the US Pharmacopeia [7] or the second level requirements of the Canadian Acceptable Methods [ 141. However, to our knowledge this methodology has not been applied yet for this purpose.
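As an illustration, a sketch of the variance-component calculation with invented mean squares (a = 3, b = 2, c = 3 and n = 4 as in Figure 3.5; the expected mean squares are those of Table 3.30):

# Variance components for the nested design of Figure 3.5, estimated from
# hypothetical mean squares using the expected mean squares of Table 3.30.
a, b, c, n = 3, 2, 3, 4
MS = {"lab": 58.0, "anal": 21.0, "instr": 9.5, "days": 2.1}   # invented values

s2_days  = MS["days"]
s2_instr = (MS["instr"] - MS["days"]) / n
s2_anal  = (MS["anal"] - MS["instr"]) / (n * c)
s2_lab   = (MS["lab"] - MS["anal"]) / (n * c * b)
print(f"s2_days={s2_days:.2f}  s2_instr={s2_instr:.2f}  "
      f"s2_anal={s2_anal:.2f}  s2_lab={s2_lab:.2f}")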
3.6 CONCLUSIONS

The ruggedness of a method can be tested using two types of experimental designs. Procedure related factors on the one hand are examined mainly in screening designs of the Plackett-Burman or
fractional factorial type. Non-procedure related factors on the other hand can be examined in nested designs. As a result of the data analysis one is able to indicate which of the tested factors were not rugged for certain responses. When factors that are not rugged are detected one can decide to change the method or to control the factors in question more strictly. Other results from a ruggedness test described by some authors are the definition of rugged intervals and of system suitability parameters and the selection of optimal values for the factors. Rugged intervals are defined as the interval between the levels of a factor for which no significant effect is seen on a response [19]. System suitability parameters [6,17] are defined as an interval in which a response (e.g. retention time, resolution, number of theoretical plates) is allowed to vary for a robust method. They can be derived from the minimal and maximal result for the considered response as seen with a design in which the quantitative results of the method were found to be rugged. Most applications studied in this chapter concern different types of chromatography. However, the methodology can be useful for ruggedness testing in most other fields of analytical chemistry. This can, for instance, be seen in case study 4 on the determination of water in fertilizers by total distillation (see Section 3.4.9). Finally, it must unfortunately be concluded that the analytical literature about the determination of ruggedness contains many errors and bad practices, the main ones being an inapt choice of levels (too large differences), the inclusion of factors that should not be included (because they should stay constant in properly standardized analytical procedures) and errors in the statistical analysis. The reader is therefore urged not to base his methodology on examples from the literature without having verified that they are correct.
ACKNOWLEDGEMENTS The authors thank the National Foundation for Scientific Research (NFWO) for financial support. Acknowledgements are also made to the editors who gave the permission for reprinting parts of their publications.
REFERENCES
[1] S.W. Sun, Experimental and statistical approach for validating a test procedure in a pharmaceutical formulation, Ph.D. thesis, Laboratoire de Chimie Analytique, Faculté de Pharmacie, Montpellier, 1993.
[2] F.J. van de Vaart et al., Study group "Quality in Pharmaceutical Analysis" of the work group "Quality in Analytical Chemistry", Section Analytical Chemistry of the KNCV, Validation in Pharmaceutical and Biopharmaceutical Analysis, Het Pharmaceutisch Weekblad, 127 (1992) 1229-1235.
[3] J.C. Wahlich, G.P. Carr, Chromatographic system suitability tests - What should we be using?, Journal of Pharmaceutical and Biomedical Analysis, 8 (1990) 619-623.
[4] J.A. Van Leeuwen, L.M.C. Buydens, B.G.M. Vandeginste, G. Kateman, P.J. Schoenmakers, M. Mulholland, RES, an expert system for the set-up and interpretation of a ruggedness test in HPLC method validation. Part 1: The ruggedness test in HPLC method validation, Chemometrics and Intelligent Laboratory Systems, 10 (1991) 337-347.
[5] M. Mulholland, Ruggedness testing in analytical chemistry, TRAC, 7 (1988) 383-389.
[6] M. Mulholland, J. Waterhouse, Investigation of the limitations of saturated fractional factorial experimental designs, with confounding effects for an HPLC ruggedness test, Chromatographia, 25 (1988) 769-774.
[7] The United States Pharmacopeia XXII, The National Formulary XVII, United States Pharmacopeial Convention, Rockville, 1990, p. 1712.
[8] International Organization for Standardization, Accuracy (trueness and precision) of measurement methods and results, ISO/DIS 5725-1 to 5725-3, Draft versions 1990/91.
[9] C. Hartmann, Increasing precision by replicate measurements, Analusis, 22 (1994) 19-21.
[10] W. Horwitz, Protocol for the Design and Interpretation of Collaborative Studies, Pure and Applied Chem., 60 (1988) 855-864.
[11] W.J. Youden, E.H. Steiner, Statistical Manual of the Association of Official Analytical Chemists, The Association of Official Analytical Chemists ed., Arlington, 1975, p. 33-36, 70-71, 82-83.
[12] G.T. Wernimont, Use of Statistics to develop and evaluate Analytical Methods, W. Spendley ed., Association of Official Analytical Chemists, Arlington, USA, 1985, p. 78-82.
[13] Caporal-Gautier J., Nivet J.M., Algranti P., Guilloteau M., Histe M., Lallier M., NGuyen-Huu J.J. and Russotto R., Guide de validation analytique, Rapport d'une commission SFSTP, STP Pharma Pratiques, 2 (1992) 205-239.
[14] Drugs Directorate Guidelines, Acceptable Methods, Health Protection Branch, Health and Welfare Canada, 1992, 20-22.
[15] K. Callewaert, Validation of analytical methods used in the determination of the test substance in biological fluids, EOQ Workshop, Budapest, October 1992.
[16] G. Wynia, P. Post, J. Broersen and F. Maris, Ruggedness testing of a gas chromatographic method for residual solvents in pharmaceutical substances, Chromatographia, 39(5/6) (1994) 355-362.
[17] M. Mulholland, J. Waterhouse, Development and evaluation of an automated procedure for the ruggedness testing of chromatographic conditions in high-performance liquid chromatography, J. Chromatogr., 395 (1987) 539-551.
[18] K. Jones, Optimisation procedure for the silanisation of silicas for reversed-phase high-performance liquid chromatography. I. Elimination of nonsignificant variables, J. Chromatogr., 392 (1987) 1-10.
[19] L. Abdel-Malek, Test de robustesse, Comett Euro training course in advanced HPLC and capillary electrophoresis, Montpellier, September 1993.
[20] International Organization for Standardization, Draft International Standard ISO/DIS 5725-3, Accuracy (trueness and precision) of measurement methods and results. Part 3: Intermediate measures on the precision of a test method, 1991.
[21] R.R. Sokal, F.J. Rohlf, Biometry, the principles and practices of statistics in biological research (second edition), W.H. Freeman, New York, 1981, p. 271.
[22] H. Fabre, V. Meynier de Salinelles, G. Cassanas, B. Mandrou, Validation d'une méthode de dosage par chromatographie en phase liquide haute performance, Analusis, 13 (1985) 117-123.
[23] J.A. Van Leeuwen, L.M.C. Buydens, B.G.M. Vandeginste, G. Kateman, P.J. Schoenmakers, M. Mulholland, RES, an expert system for the set-up and interpretation of a ruggedness test in HPLC method validation. Part 2: The ruggedness expert system, Chemometrics and Intelligent Laboratory Systems, 11 (1991) 37-55.
[24] K. Jones, Optimization of experimental data, International Laboratory, November 1986, 32-45.
[25] K. Jones, Process scale high-performance liquid chromatography. Part I: An optimisation procedure to maximise column efficiency, Chromatographia, 25 (1988) 437-442.
[26] J. Vindevogel, P. Sandra, Resolution optimization in micellar electrokinetic chromatography: use of Plackett-Burman statistical design for the analysis of testosterone esters, Anal. Chem., 63 (1991) 1530-1536.
[27] S.F.Y. Li, Capillary electrophoresis: principles, practice, and applications, Journal of Chromatography Library - volume 52, Elsevier, Amsterdam, 1992, p. 316-318.
[28] Statgraphics® Plus, Statistical Graphics System by Statistical Graphics Corporation, version 6, Reference Manual; Manugistics Inc., Rockville, USA.
[29] E. Morgan, Chemometrics: experimental design, Analytical Chemistry by Open Learning, J. Wiley, Chichester, 1991, p. 118-188.
[30] R.L. Plackett, J.P. Burman, The design of optimum multifactorial experiments, Biometrika, 33 (1946) 305-325.
[31] G. Box, W. Hunter, J. Hunter, Statistics for Experimenters, an introduction to Design, Data analysis and Model Building, J. Wiley, New York, 1978, p. 306-418.
[32] Y. Vander Heyden, M.S. Khots, D.L. Massart, Three-level screening designs for the optimisation or the ruggedness testing of analytical procedures, Anal. Chim. Acta, 276 (1993) 189-195.
[33] J.H. de Boer, Chemometrical Aspects of Quality in Pharmaceutical Technology. The application of robustness criteria and multi criteria decision making in optimization procedures for pharmaceutical formulations, Doctoral thesis, 1992.
[34] J.J. Pignatiello and J.S. Ramberg, Discussion, Journal of Quality Technology, 17 (1985) 198-206.
[35] R.N. Kackar, Off-line Quality Control, Parameter Design, and The Taguchi Method, Journal of Quality Technology, 17 (1985) 176-188.
[36] J.L. Goupy, Methods for experimental design, principles and applications for physicists and chemists, Data Handling in Science and Technology - volume 12, B.G.M. Vandeginste and S.C. Rutan ed., Elsevier, Amsterdam, 1993, pp. 159-177, 421-429.
[37] The United States Pharmacopeia XXII, The National Formulary XVII, United States Pharmacopeial Convention, Rockville, 1990, p. 1337.
[38] Y. Vander Heyden, K. Luypaert, C. Hartmann, D.L. Massart, J. Hoogmartens, J. De Beer, Ruggedness tests on the HPLC assay of the United States Pharmacopeia XXII for tetracycline hydrochloride. A comparison of experimental designs and statistical interpretations, Analytica Chimica Acta, 316 (1995) 15-26.
[39] D.L. Massart, B.G.M. Vandeginste, S.N. Deming, Y. Michotte, L. Kaufman, Chemometrics: a textbook, Elsevier, Amsterdam, 1988, p. 101-106.
[40] H. Leuenberger, W. Becher, A factorial design for compatibility studies in preformulation work, Pharm. Acta Helv., 50 (1975) 88-91.
[41] N.R. Draper, H. Smith, Applied Regression Analysis, second edition, J. Wiley, New York, 1981, p. 102.
[42] J.A. Van Leeuwen, L.M.C. Buydens, B.G.M. Vandeginste, G. Kateman, A. Cleland, M. Mulholland, C. Jansen, F.A. Maris, P.H. Hoogkamer, J.H.M. van den Berg, RES, an expert system for the set-up and interpretation of a ruggedness test in HPLC method validation. Part 3: The evaluation, Chemometrics and Intelligent Laboratory Systems, 11 (1991) 161-174.
[43] F. Maris, R. Hindriks, in "Intelligent software for chemical analysis", L. Buydens, P. Schoenmakers ed., Elsevier, 1993, 202-211.
[44] D.L. Massart, N. Vanden Driessche, A. Van Dessel, Databases and spreadsheets, Chapter 2 in "PCs for chemists", J. Zupan ed., Data Handling in Science and Technology - Volume 5, Elsevier, Amsterdam, 1990, p. 17-41.
[45] G. Wernimont, Design and interpretation of interlaboratory studies of test methods, Anal. Chem., 23 (1951) 1572-1576.
Chapter 4
ROBUSTNESS CRITERIA; INCORPORATING ROBUSTNESS EXPLICITLY IN OPTIMIZATION PROCEDURES UTILIZING MULTICRITERIA METHODS

JAN H. DE BOER
Gasunie Research, P.O. Box 19, 9700 MA Groningen, The Netherlands
4.1 INTRODUCTION

In pharmaceutical technology, one is often engaged with the design and analysis of formulations. A formulation of a solid dosage form (e.g. a tablet) normally consists of the medicinal substance(s) (drug) and a number of excipients. In many cases, the physical properties of the formulation are determined by the physico-chemical properties of these excipients and the process of manufacturing. These formulation properties can be influenced by changing the proportions of the excipients and/or by changing the conditions of the manufacturing process. It is assumed that in most cases the concentration, but almost always the absolute amount, of the active substance is fixed. Also other components, in most cases in minor concentrations, can appear with a fixed concentration in the formulation. These components can be: a lubricant, a glidant and/or a disintegrant. Another assumption is that the total weight of the formulation is constant. When speaking about the optimisation of a (tablet) formulation, then in most cases the attainment of preferable response values is meant, e.g. a high crushing strength, a low disintegration time, a certain dissolution profile etc. Another important desirable property of a formulation can be that the formulation is robust towards (small) deviations (errors) in process conditions or the proportions of the excipients; this means that despite these errors the values of the responses considered remain at (almost) the same level or deviate by only an acceptable amount. It is of course desirable to maintain the properties of any product at exactly the same value during production or use, but on the other hand this can be very expensive
and it is not always needed. For a pharmaceutical formulation it is needed that the product fulfils the demands of registration authorities, but within this framework a compromise between cost and quality is always made. This robustness aspect can also be extended towards environmental factors like temperature and (more important) humidity. With this latter knowledge it can be predicted what the shelf life of a certain formulation under certain conditions is, or what the desired conditions are to keep a certain product during a certain time at a desired quality level. In this chapter three types of robustness criteria will be explained. Two of the three robustness criteria are based on the work and philosophy of Taguchi. For this reason a brief explanation of the Taguchi methods will be given beforehand. There are almost always a number of criteria which the formulation has to fulfil, and in the case of incorporating robustness aspects (as an optimisation criterion) into the optimisation the number of criteria is increased further. It is however almost impossible to fulfil all the criteria in the most optimal way at once. This means that a compromise has to be found between all criteria. A large number of methods is available to search for such a compromise variable setting. One of these methods is Pareto Optimality, which will be explained and applied in this chapter. Pareto Optimality searches for a compromise between the optimisation of a certain tablet property and the optimisation of the robustness of this property.
4.2 A BRIEF INTRODUCTION TO THE TAGUCHI METHODS

4.2.1 Introduction

In this part a brief introduction will be given to the so-called Taguchi methods. The methods for quality improvement which are introduced and described in subsequent parts of this chapter are partly based on the work of Taguchi. For a good understanding and placement of the techniques described in these (sub)sections, the relevant parts of the Taguchi methods are explained in this section. Because enough literature on Taguchi methods is available (see literature references [1-13]) it should not be difficult to obtain more information if one is interested. One of the persons who has made a great impact on the world of quality improvement and control is the Japanese engineer Professor Genichi
Taguchi. He has developed both a philosophy and a methodology for the process of quality improvement, which depend heavily on statistical concepts and tools. Many Japanese firms have applied these methods with great success. Taguchi's ideas can be separated into two fundamental concepts, which are:
1. The loss function, which is the concrete form of Taguchi's definition of quality: "The quality of a product is the loss caused by the product to society from the time the product is shipped".
2. Off-line quality control. A collection of methods to achieve the demanded quality. These methods enclose the following stages:
   1) Systems or functional design
   2) Parameter or targeting design
   3) Allowance or tolerance design
4.2.2 The loss function

In Western companies a common way to define quality is conformance to specification, or stated otherwise, all parts must be within specification limits. With this definition of quality there is no need to improve quality. A product has good quality if its parts are within the stated limits, and this changes immediately to bad quality when one of the limits is passed. If, in a pharmaceutical company, a tablet has to be produced with a target weight of 250 ± 10 mg, then a tablet of 260 mg has good quality and a tablet of 261 mg has an inferior quality. This is, in fact, an unnatural situation; a tablet with a weight of exactly 250 mg is simply the best tablet and every deviation from this target weight gives a loss of quality. This can best be demonstrated using Figure 4.1, which compares a "simplistic" with a "realistic" view on quality. In theory every product property examined should have its own loss function evaluated in order to reach an optimum quality strategy. However, the shape of such a loss function depends on the process or product which is considered and is often difficult to establish. In general the goals of quality improvement (or the shapes of the corresponding loss functions) are often simplified to three types, which are:
1. Nominal the best, e.g. dimension, weight (target = certain chosen value, depicted in Figure 4.2A)
2. Smaller the better, e.g. wear, noise, cost (target = 0, depicted in Figure 4.2B)
3. Larger the better, e.g. strength, yield (target = ∞, depicted in Figure 4.2C)
Figure 4.1 A "simplistic" (---) and a "realistic" (continuous line) view on quality
Figure 4.2 Loss functions for A) nominal value, B) smaller the better and C) larger the better quality characteristics
The quality characteristic of type 'nominal the best', which uses a specific target value combined with one particular loss function, the quadratic, is probably the most commonly used:

Loss = constant · (deviation from target)²

If the constant in the quadratic equation is chosen properly then the quadratic loss function can be used for direct calculation of the financial loss induced by an off-target quality characteristic. Because Taguchi has made no immediate coupling between the concept of loss functions (the more philosophical part) and the off-line quality control methods (the more concrete part), the loss functions probably serve best as an important background concept rather than a practical working tool. One of the misunderstandings with the traditional (Western) way of quality thinking is that it is not needed to improve quality above the specification limit because the product is already good. But a product with increased quality (closer to the target) performs better and induces fewer complaints from customers. The latter reason is more important than most people think, as it will not only save the manufacturer cost in replacing defective products, but it will also contribute to an increased confidence and goodwill of the customer towards the manufacturer. Another misunderstanding is that quality improvement beyond production within specification limits is a cost increasing operation. Many people think that this quality improvement can only be reached by adapting the production process to production with narrower specification limits or by using components or raw materials with higher quality. This way of quality improvement is called "the NASA method" by Taguchi. This means that an initial prototype is produced, the reliability and stability of this prototype are studied, and problems are corrected by requesting better components or elements. Not much fantasy is required to imagine that this is a very expensive way to improve quality. The Taguchi method, however, is to improve quality by changing the design of a product so that it becomes less sensitive to variability. This means that the product is more robust to all sorts of quality decreasing factors. To reach this goal he developed the concept of off-line quality control.
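A minimal sketch of such a quadratic loss function for the tablet-weight example (the cost constant k below is invented and merely scaled so that the loss equals 1 at the 260 mg specification limit):

def quadratic_loss(y, target, k):
    """Taguchi-type quadratic loss: k * (deviation from target)^2."""
    return k * (y - target) ** 2

# Hypothetical tablet weights (mg) around the 250 mg target.
k = 1.0 / (260 - 250) ** 2
for weight in (250, 255, 260, 261):
    print(f"{weight} mg -> loss {quadratic_loss(weight, 250, k):.2f}")

In this view the loss grows gradually with the deviation from the target, so a 261 mg tablet is only marginally worse than a 260 mg tablet, in contrast with the pass/fail picture of the specification-limit approach.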
4.2.3 Off-line quality control
The performance of a product or process depends on a large number of factors. The problem is to find such settings of these factors that the product performs well under a number of conditions and during its intended lifetime. To meet this demand the product has to be robust against a number of factors and/or variables that disturb its operation; these factors are called noise factors. Taguchi has classified the noise factors into three types: 1. Outer noise - environmental variables that affect the performance of a product, for example temperature, dust, humidity. 2. Inner noise - changes in material properties through usage, product deterioration, wear. 3. Variational noise - differences between the individual manufactured units, manufacturing variations. Products are said to have good quality if they are robust against these three types of noise. Of these three types, outer and inner noise are the most important. Robustness against these two types of noise has to be achieved through design. The following procedure describes Taguchi's product development strategy.
TABLE 4.1
THE THREE STAGES OF PRODUCT DEVELOPMENT AND THE POSSIBILITIES OF COUNTERMEASURES AGAINST VARIOUS TYPES OF NOISE, WITH O = COUNTERMEASURES POSSIBLE AND X = COUNTERMEASURES IMPOSSIBLE

Product development stage   Environmental variables   Product deterioration   Manufacturing variations
Product design              O                         O                       O
Process design              X                         X                       O
Manufacturing               X                         X                       O

The development cycle of a product can be separated into three stages, which are: product design, process design and manufacturing. Table 4.1, copied from reference [8], shows in what stage countermeasures against the three types of noise factors can be taken. As can be seen,
countermeasures against environmental variables and product deterioration are only possible at the product design stage. Therefore, the product design phase is the most important one. To arm oneself against the noise effects Taguchi has developed a product development procedure containing three stages, which are:
1. System design - a prototype of a certain product is made using specialised knowledge; statistical design of experiments is not relevant at this stage.
2. Parameter design - the optimum levels of the individual factors are determined using experimental design and other statistical methods with the following objectives: a) the product should have the demanded quality characteristics (on target); b) these characteristics should be robust to the noise factors.
3. Tolerance design - if the parameter design step did not achieve the required results then this step can be used to decrease the tolerances in the production, which, however, involves a cost increase.
The parameter design step in particular gives the best opportunity to build in and/or increase the quality of a product or process. Therefore the rest of this part concentrates on the parameter design step. Before designing any experiment it is important to know all the factors that could possibly affect the quality of the product. After identification of these factors they are ordered into two classes: 1. Control factors - factors that can be set and controlled by the engineer.
2. Noise factors - factors that normally cannot be controlled; however, for the experimental part of this design step the noise factors must be controlled to determine their possible effects. The quality improvements are achieved by exploiting the possible effects of the control factors. Four possible types of effect of the control factors can be distinguished:
1. Control factors affecting the mean are used to adjust the mean (also called signal or adjustment factors). 2. Control factors affecting the variability are used to reduce variability.
3. Control factors affecting both the mean and variability are usually used to reduce variability. 4. Control factors affecting neither the mean nor the variability are used to relax requirements/tolerances with the objective of reducing cost. The challenge is to set up experiments in such a way that settings of the control factors are determined which give a product with good quality characteristics and with the least possible influence of the noise factors. For the experimental section Taguchi has developed a number of tools which are known as orthogonal arrays, linear graphs and signal-to-noise ratios. Only the orthogonal arrays are described in this chapter because they show in the clearest way how off-line quality control works. The linear graphs and the signal-to-noise ratios are not dealt with here because the first does not add anything to the understanding (it is only a practical tool to set up an array), while the latter is not recommended for use by a number of authors because of the ambiguous character of these ratios.

4.2.4 Orthogonal arrays
Orthogonal arrays are experimental designs consisting of either the set of control factors or the set of noise factors. Taguchi has developed a large number of experimental designs, of which the majority already existed. He most commonly used 2^k factorial designs, 2^(k-p) fractional factorials, Plackett-Burman designs, 3^k factorials, and designs constructed from the Latin square, Graeco-Latin square and hyper-Graeco-Latin square. He also used all sorts of combinations of the mentioned designs to permit factors with different numbers of levels to appear in the same experimental design. For each of the orthogonal arrays a code is available of the form La(b^c), where a is the number of experiments, b the number of levels for each factor and c the number of columns in the array. For example, the L8(2^7) orthogonal array is the same as a 2^(7-4) fractional factorial design. With this L8(2^7) orthogonal array the effect of at most 7 factors can be determined in 8 experiments (this is of course only possible if there are no interactions between the 7 factors). But the L8(2^7) orthogonal array can also be used for 4 factors where 3 of the four factors have second-order interactions among each other, while other combinations are also possible. For an ordinary Taguchi analysis two orthogonal arrays are needed, one for
the control factors (called the inner array) and one for the noise factors (called the outer array).
TABLE 4.2
AN EXAMPLE OF A CROSS-PRODUCT ARRAY; FACTORS A, B AND C FORM THE CONTROL ARRAY (AN L4), FACTORS X, Y AND Z FORM THE (ROTATED) NOISE ARRAY (ALSO AN L4). EACH OF THE ARRAYS HAS SETTINGS I-IV. THE BODY OF THE TABLE CONTAINS FICTIVE RESPONSE VALUES

Control array (inner)        Noise array (outer) setting
DP    A   B   C              I     II    III   IV
I     1   1   1              15    22    24    24
II    2   1   2              17    21    20    22
III   1   2   2              10    16    12    23
IV    2   2   1              14    18    17    22
From these two orthogonal arrays a cross-product design is constructed (see Table 4.2 for an example); for each setting of the control factors in the inner array the complete noise factor (outer) array has to be executed to determine the effect of the environmental factors.
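A minimal Python sketch of how such a cross-product design can be generated is given below; the standard L4(2^3) array is assumed for both the inner and the outer array, as in Table 4.2 (the layout is illustrative and not taken from the original text).

```python
# Standard L4(2^3) orthogonal array, levels coded 1 and 2.
L4 = [
    (1, 1, 1),
    (1, 2, 2),
    (2, 1, 2),
    (2, 2, 1),
]

inner = L4  # settings of the control factors A, B, C
outer = L4  # settings of the noise factors X, Y, Z

# Every inner (control) setting is run at every outer (noise) setting: 4 x 4 = 16 runs.
runs = [(abc, xyz) for abc in inner for xyz in outer]
for (a, b, c), (x, y, z) in runs:
    print(f"A={a} B={b} C={c} | X={x} Y={y} Z={z}")
print(len(runs), "experiments in total")
```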
4.3 THE ROBUSTNESS CRITERIA
4.3.1 Introduction
In order to achieve the best quality of a process or a product, the design step of the process or product is a very important one. Most of the quality needed later can be built into the product at the design stage. Here, the best quality is defined as obtaining or approaching as closely as possible those characteristics which are desired. Moreover, one wishes to keep close to these properties when variations in process conditions appear, or when a product deteriorates through use or ageing, or when external (environmental or noise) factors affect the product. To obtain this kind of quality, research has to be done on all the factors influencing the product. So knowledge is gathered about all the factors, and their interaction(s), influencing the quality characteristic(s). This
knowledge can then be used to choose an optimal (or better: a preferred compromise) design of the product or process. When limiting ourselves to one quality characteristic (y), with preferred value τ, a common way to define a measure of quality is the quadratic loss function (as used by Taguchi [13]), which is defined as L(y) = k(y − τ)², where k is a constant coupling the deviation (y − τ) to, for example, an economic quantity. The value of y is affected by the product design factors (factors which can be selected and maintained by the engineer) and by the environmental (noise) factors (factors which are beyond the control of the engineer). The expected loss function can be rewritten as:

E(L(y)) = k·E(y − τ)² = k(σ² + δ²)

where σ² is the variance of y, and δ is E(y) − τ, which is the bias of y. This definition of quality indicates that obtaining a minimal L(y) for a certain product does not always imply that the expected value of the quality characteristic should be on target. A low σ² might be preferable, at the cost of some bias, if the resulting loss is lower than in a situation with no bias and a larger σ². If all the factors influencing the quality characteristic can be separated into two sets - a set of product design factors and a set of environmental factors - then the techniques used and promoted by Taguchi can be used to derive the desired properties. However, the situation described further in this chapter does not include a clear separation of factors into two sets; the product design factors are also the noise factors, which means that the product design factors can be set to a certain mean value, but a certain random variation exists around this mean. The problem involved in the application which is the subject of this chapter is the optimisation of a property of a mixture of compounds (a common situation in pharmaceutical practice, where for example tablet formulations have to be optimised). This property has to be optimised with respect to a certain goal (maximum, minimum or target value) and also with respect to the robustness or ruggedness of the mixture property. This means that, despite any variation in the response or in the independent variables (mixture variables in our case) due to unknown sources, the response values have to be as close as possible to a desired value (target value).
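The decomposition of the expected loss can be checked numerically; the small Monte Carlo sketch below assumes a normally distributed quality characteristic with an arbitrarily chosen mean, standard deviation and constant k.

```python
import random

random.seed(1)
k, tau = 1.0, 250.0
mu, sigma = 252.0, 3.0              # assumed mean and standard deviation of y
delta = mu - tau                    # bias

samples = [random.gauss(mu, sigma) for _ in range(200_000)]
loss_monte_carlo = sum(k * (y - tau) ** 2 for y in samples) / len(samples)
loss_formula = k * (sigma ** 2 + delta ** 2)

print(loss_monte_carlo)             # close to 13
print(loss_formula)                 # exactly 13.0 = 1.0 * (3^2 + 2^2)
```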
In this chapter emphasis will be put on how to describe the robustness of the mixture property in a number of criteria that can be optimised directly. The optimisation of the robustness will be combined with the optimisation of the mixture property itself using a Multicriteria Decision Making strategy, which will be explained later in this chapter.

4.3.2 The variance/covariance structure of a mixture
For tablet formulations a response is usually described as a function of the mixture composition. The mixture components are also the cause of a complication of the robustness problem: the instability caused by errors made in the composition results in a variance/covariance structure of the mixture variables which depends on the mixture composition itself. The relation between the variance/covariance structure of the mixture variables and the mixture composition itself can be derived using partial derivatives, as shown below. The assumptions made for the approximation of the variance/covariance structure of mixture variables are: 1. There is no covariance in the weighing of the amounts of the components, because all the components are weighed individually. 2. The standard deviation of the measurement error is proportional to the weighed amount of the component; therefore, the relative standard deviation (or coefficient of variation, cv) is constant. 3. The components are all weighed with the same relative measurement error, resulting in an equal relative standard deviation for all the components. Considering the described assumptions the following transformation can be derived. Let o_i (i = 1,...,k) be the weighed amounts of the original k components and σ(o_i) the standard deviation of the error in the weighing of component o_i. As a result of the assumptions: cov(o_i, o_j) = 0 and cv(o_i) = constant (i = 1,...,k, j = 1,...,k, i ≠ j). After mixing the original components the fractions x_i (i = 1,...,k) of the mixture are calculated by:

x_i = o_i / (o_1 + o_2 + ... + o_k)     (2)
A dependence between the fractions is introduced, because the fractions of a mixture sum to unity. This results in a covariance between the fractions even if the covariances in the original components are zero.

Calculation of σ²(x_i)
Applying the rules of error propagation, and considering that x_i is a function of the o_j, the error in x_i can be approximated by:

δ(x_i) = Σ_{j=1..k} (∂x_i/∂o_j)·δ(o_j)

with δ(x_i) = random error of fraction x_i; δ(o_j) = random error of component o_j; ∂x_i/∂o_j = partial derivative of x_i with respect to o_j. To calculate the variance of x_i the expression is squared, summed and divided by n (the number of cases). The covariance terms produced after squaring the equation vanish because the covariances between the original components are zero (cov(o_i, o_j) = 0, i = 1, 2,...,k, j = i+1,...,k). For fraction x_i this results in:

σ²(x_i) = [ (Σ_{j≠i} o_j)²·σ²(o_i) + o_i²·Σ_{j≠i} σ²(o_j) ] / (Σ_{j=1..k} o_j)⁴     (5)

As a consequence of assumptions 2 and 3 the variation coefficient of o_i can be rewritten as:
cv(o_i) = σ(o_i)/o_i × 100%  and  v = [cv(o_i)/100]², so

σ²(o_i) = v·o_i²     (6)

Substituting equation (6) into equation (5) results in:

σ²(x_i) = v·o_i²·[ (Σ_{j≠i} o_j)² + Σ_{j≠i} o_j² ] / (Σ_{j=1..k} o_j)⁴
The amount of the original component (o) can be substituted by its fraction (x) using the relations in equation (2), so:

σ²(x_i) = v·x_i²·[ 2·Σ_{j≠i} x_j² + Σ_{s≠i} Σ_{j≠i, j≠s} x_j·x_s ]     (8)

Equation (8) expresses the approximated variance of fraction x_i as a function of the composition of the mixture and the square of the coefficient of variation. For example, for a ternary mixture:
σ²(x₁) = 2·v·x₁²·(x₂² + x₃² + x₂·x₃)
σ²(x₂) = 2·v·x₂²·(x₁² + x₃² + x₁·x₃)
σ²(x₃) = 2·v·x₃²·(x₁² + x₂² + x₁·x₂)
The minimum value of zero of the variance of fraction x₁ is reached if x₁ = 0 (component x₁ not present in the mixture) or if x₁ = 1 (only component x₁ present). The maximum value of v/8 of the variance of x₁ is reached if x₁ = 0.5 and x₂ = 0.5, or if x₁ = 0.5 and x₃ = 0.5. Figure 4.3 depicts σ²(x₁) as a function of the mixture composition.
Figure 4.3 σ²(x₁) as a function of the mixture composition

Calculation of cov(x_i, x_j)
The covariance between two fractions can also be calculated using the rules of error propagation; this leads to:

δ(x_i)·δ(x_j) = [ Σ_{m=1..k} (∂x_i/∂o_m)·δ(o_m) ] · [ Σ_{m=1..k} (∂x_j/∂o_m)·δ(o_m) ]     (9)
After multiplying both blocks in equation (9), summing and dividing by n (and eliminating the covariance terms produced) the following result can be obtained:

cov(x_i, x_j) = Σ_{m=1..k} (∂x_i/∂o_m)·(∂x_j/∂o_m)·σ²(o_m)     (10)

After substituting the partial derivatives by their solutions, equation (10) can be rewritten as:

cov(x_i, x_j) = [ −o_j·(Σ_{m≠i} o_m)·σ²(o_i) − o_i·(Σ_{m≠j} o_m)·σ²(o_j) + o_i·o_j·Σ_{s≠i,s≠j} σ²(o_s) ] / (Σ_{m=1..k} o_m)⁴
Considering that the relative coefficient of variation of every component is equal and constant (equation (6)), and after substituting the amount of each component by its fraction, this results in:

cov(x_i, x_j) = v·x_i·x_j·[ Σ_{s≠i, s≠j} x_s·(x_s − x_i − x_j) − 2·x_i·x_j ]     (12)

In equation (12) the covariance of x_i and x_j is expressed as a function of the composition of the mixture and the square of the coefficient of variation. For example, for a ternary mixture:

cov(x₂, x₃) = v·x₂·x₃·(x₁² − x₁·x₂ − x₁·x₃ − 2·x₂·x₃)
The covariance between x₁ and x₂ varies between zero, reached when x₁ = 0 or x₂ = 0, and an extreme value of −v/8, reached at x₁ = 0.5 and x₂ = 0.5.
Figure 4.4 cov(x_i, x_j) as a function of the mixture composition

In Figure 4.4 the relation between cov(x_i, x_j) and the composition of the mixture has been depicted. The variance of a fraction x_i is completely described by the covariances of this fraction with all the other fractions or, in other words, a variation in fraction x_i interacts with all the other fractions of the mixture. This is expressed in the following relation, which is a combination of equation (8) and equation (12):

σ²(x_i) = − Σ_{j≠i} cov(x_i, x_j)

Or, in the ternary mixture example:

σ²(x₁) = − cov(x₁, x₂) − cov(x₁, x₃)
σ²(x₂) = − cov(x₂, x₁) − cov(x₂, x₃)
σ²(x₃) = − cov(x₃, x₁) − cov(x₃, x₂)
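The formulas above can be checked with a short numerical sketch; the composition and the 5% relative standard deviation used below are arbitrary assumptions, chosen only for illustration.

```python
def var_fraction(i, x, v):
    """Equation (8): variance of fraction x_i for squared coefficient of variation v."""
    k = len(x)
    sq = sum(x[j] ** 2 for j in range(k) if j != i)
    cross = sum(x[j] * x[s] for s in range(k) if s != i
                for j in range(k) if j != i and j != s)
    return v * x[i] ** 2 * (2 * sq + cross)


def cov_fractions(i, j, x, v):
    """Equation (12): covariance between fractions x_i and x_j."""
    k = len(x)
    term = sum(x[s] * (x[s] - x[i] - x[j]) for s in range(k) if s not in (i, j))
    return v * x[i] * x[j] * (term - 2 * x[i] * x[j])


x = (0.5, 0.3, 0.2)        # assumed ternary composition
v = 0.05 ** 2              # 5% relative standard deviation in the weighings

print(var_fraction(0, x, v))
# The variance should equal minus the sum of the covariances with the other fractions:
print(-cov_fractions(0, 1, x, v) - cov_fractions(0, 2, x, v))
```

Both printed values agree, which illustrates the relation between the variance and the covariances given above.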
In a k-component mixture k−1 independent variables are present. The multivariate distribution is then formed using the variances of k−1 fractions and the covariances between the same k−1 fractions. In Figure 4.5 a number of iso-Mahalanobis distance contours of a three-component mixture have been depicted; the square of the variation coefficient (v) is constant. The ellipses drawn in Figure 4.5 are contour lines with the same probability density value. This means that if a mixture is set to the centre point of one of the ellipses in the figure, the probability that the composition of the mixture lies inside the drawn ellipse is the same in all cases.
Figure 4.5 Iso-Mahalanobis distance contours at different mixture compositions projected on a mixture triangle
It can be seen immediately that the uncertainty in the composition of the mixture is largest when all of the components have equal fractions (x₁ = x₂ = x₃ = 1/3), because of the large area within the ellipse. In contrast, a mixture near a vertex of the triangle has a much smaller area inside the drawn ellipse, which indicates that the uncertainty in the mixture composition is considerably smaller there. Finally, when the composition of a mixture reaches a vertex the uncertainty becomes zero (the mixture then consists of only one component, so no compositional errors are possible).
4.3.3 General aspects of the robustness criteria
The three robustness criteria that are explained here (Weighted-Jones (WJ), Projected-Variance (PV) and Robustness-Coefficient (RC)) each describe the robustness of a certain mixture composition in direct relation to the response to be optimised. All three express the concept of robustness as a numerical value that can be calculated for each mixture setting (composition) of interest. So each of the criteria can be calculated as a function of the mixture composition and belongs directly to a certain response of interest. In this way a robustness criterion can be dealt with in a 'normal' way in a mixture optimisation strategy.

4.3.4 The Jones method
The Jones method [14] was introduced as an alternative strategy to the Taguchi method. In fact it is the response surface methodology version of the cross-product designs of Taguchi, without the application of the S/N ratios. For the Jones method an experimental design is first set up for the total number of factors (both product design and environmental). Experiments are performed and a suitable model is chosen to represent the response as a function of the factors considered. Thus no classification of the factors is made beforehand. Then the factors influencing the response are split up into two classes, which are: the product design factors and the environmental (noise) factors.
Figure 4.6 Representation of the response surface of dependent variable y against design factor x and environmental factor z; τ is the ideal response value
The Jones method is explained with the use of Figures 4.6 and 4.7. These figures depict a single product design factor (x), a single environmental factor (z) and a response (y) as the dependent variable. If the product design factor is set to a certain value and the environmental factor is allowed to vary (as in actual cases), a section through Figure 4.6 is the result. This section is shown in Figure 4.6 as the hatched area and is redrawn in Figure 4.7.
Figure 4.7 ȳ_x is the mean predicted response at a constant x, calculated over the design space of z; ŷ_xz is a predicted value of y at a certain combination of x and z

The ideal response value is represented by τ. ȳ_x represents the mean response at a particular value of the product design factor (indicated by the index x), calculated over the region of interest of the environmental factors (R_z), so

ȳ_x = k ∫_{R_z} ŷ_xz dz

with k⁻¹ = ∫_{R_z} dz being an integrating constant, and ŷ_xz being one (predicted) response value at a certain combination of x and z.
Jones uses an integrated squared error loss value [L(x)] as the performance criterion:

L(x) = k ∫_{R_z} [ŷ_xz − τ]² dz

with R_z as the region of interest of the environmental factors. But this L(x) value has in fact the same problem as an S/N ratio: it contains two experimental goals with a fixed relation between them. This problem is solved by separating L(x) into two distinct criteria, which are:

M(x) = k ∫_{R_z} [τ − ȳ_x]² dz

where M(x) represents the mean squared deviation of the mean response from the ideal response at a certain x value, and:

V(x) = k ∫_{R_z} [ŷ_xz − ȳ_x]² dz

where V(x) represents the mean squared variation about the average response. Thus

L(x) = M(x) + V(x)     (17)

To control the relation between M(x) and V(x), which in equation (17) is still completely fixed, a weighing factor λ is introduced, which results in:

R(x) = λ·V(x) + (1 − λ)·M(x),    0 ≤ λ ≤ 1

By adjusting λ, the relative importance of variance [V(x)] and bias [M(x)] can now be controlled by the engineer.
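The Jones criteria are easily evaluated numerically once a model for the response has been fitted. The Python sketch below uses an invented model y(x, z) = 10 + 2x + (1 + 0.8x)z, a uniform environmental region R_z = [−1, 1] and a target τ = 12; all of these are assumptions for illustration only.

```python
import numpy as np

tau = 12.0
z = np.linspace(-1.0, 1.0, 201)           # region of interest R_z of the environmental factor


def y_hat(x, z):
    """Assumed fitted model for the response as a function of x and z."""
    return 10.0 + 2.0 * x + (1.0 + 0.8 * x) * z


def jones_criteria(x):
    yz = y_hat(x, z)
    y_bar = yz.mean()                     # mean response over R_z at constant x
    M = (tau - y_bar) ** 2                # squared deviation of the mean from the target
    V = ((yz - y_bar) ** 2).mean()        # mean squared variation about the mean
    return M, V


lam = 0.5                                 # weighing factor lambda
for x in (-1.0, 0.0, 1.0):
    M, V = jones_criteria(x)
    print(f"x={x:+.1f}  M(x)={M:.3f}  V(x)={V:.3f}  R(x)={lam * V + (1 - lam) * M:.3f}")
```

In this invented example a larger x brings the mean response onto the target (small M) but increases the sensitivity to z (larger V), so the preferred setting depends on the chosen λ.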
Another important advantage of using a response surface methodology based method, like Jones' method, is that a mathematical model is used to describe the response as a function of the product design and environmental variables. Using this model, interpolations of the product design factors can be used, which results in a large increase in potentially suitable settings of the design factors. This is in contrast with Taguchi, who uses only the design points for the selection of the most suitable setting of the product design factors.

4.3.5 The Weighted Jones method
The Weighted Jones method (WJ) is a performance criterion based on the V(x) part of the Jones method described above; the M(x) part has not been changed. While Jones uses a weighing factor (λ) to determine the relative importance of the M(x) and V(x) parts, here a Multicriteria Decision Making (MCDM) method, based on the Pareto Optimality concept [15,16], was used for selecting a compromise setting (this strategy will be explained later in this chapter). The Pareto Optimality method has the advantage of selecting a (compromise) optimal setting without having to choose a weighing factor in advance. The Jones method cannot be used directly in the type of optimisation problem which is considered here (see Introduction). In the first place, no classification of the factors influencing the response into two groups can be made, because in this case the product design factors and the environmental or noise factors are the same. In the second place, Jones uses a uniform distribution of the environmental factors in the calculation of the R(x) or L(x) values, which means that responses calculated at z values which have a lower probability of appearing (usually at the borders of the R_z region) have the same influence on L(x) or R(x) as responses measured at more probable z values (usually at the centre of the R_z region). In view of the central limit theorem it is more likely that the noise factors have a normal distribution than a uniform distribution. However, by incorporating a weighing factor in the calculation of the WJ criterion any (empirical) probability distribution of the environmental factors can be accounted for. Both items described above have been incorporated in the WJ method. The WJ criterion is defined as follows:

WJ_c = ∫_{R_xc} W_x·[ŷ_x − ȳ_xc]² dx

with c as a point of interest in the mixture space; R_xc is the elliptical region around point c which holds a major part (for example 99%) of the probability distribution of the errors in the settings of the mixture variables in point
c (see the variance/covariance structure of a mixture); W_x is the probability density of the distribution of the mixture variables, which acts as a weight; ȳ_xc = ∫_{R_xc} ŷ_x·W_x dx is the mean weighted predicted value of the response, which can be slightly different from the predicted value of y at point c in the case of a non-linear response surface.
Figure 4.8 Example of the WJ calculation. The hatched area is integrated. It can be seen that ȳ_xc is not the same as ŷ_c, which is the predicted value of the response at point c

Figure 4.8 gives an example of the calculation of the WJ criterion. The hatched area in this figure represents the [ŷ_x − ȳ_xc] part of WJ_c, integrated over R_xc.
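For a single mixture variable the WJ criterion can be approximated numerically as in the sketch below; the response model, the normal weight function and the size of the region R_xc are all assumptions made for illustration.

```python
import numpy as np


def y_hat(x):
    """Assumed non-linear response model."""
    return 5.0 + 4.0 * x - 3.0 * x ** 2


def wj(c, sd=0.05, n=2001):
    """Weighted mean squared deviation of y_hat(x) from its weighted mean around point c."""
    x = np.linspace(c - 4 * sd, c + 4 * sd, n)     # region R_xc around the point of interest
    w = np.exp(-0.5 * ((x - c) / sd) ** 2)         # weights W_x from the assumed error distribution
    w /= w.sum()                                   # normalise the weights
    y = y_hat(x)
    y_bar_c = np.sum(w * y)                        # mean weighted predicted response
    return np.sum(w * (y - y_bar_c) ** 2)


for c in (0.2, 0.5, 0.8):
    print(c, wj(c))
```

The criterion is smallest where the response surface is flattest, i.e. where a small error in x propagates least into the response.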
4.3.6 The Projected Variance method
The Projected Variance (PV) method describes robustness as the variance of the response induced by the variance in the independent variable(s), propagated through the response surface. This method was first described by Box [17]. Vuchkov et al. [18] have used this method in the case of second
order polynomials to minimise the variance of the response variable using a constrained optimisation method.
Figure 4.9 Example of error propagation through the response surface in two situations

Figure 4.9 shows an example of error propagation in the case of a single factor and a second-order model: ŷ = b₀ + b₁x + b₁₁x². The amount of propagated error depends on the variance of the independent variable (x) and the gradient of the response function. In the case of setting b the amount of propagated error is reduced by a smaller gradient compared to setting a. In the general case the following assumptions are valid: a response y depends on p independent variables, so ŷ = f(x₁, x₂,..., x_p), or ŷ = x'b if linearity is assumed, where x is the vector of the independent variables and b the vector of regression coefficients. The factors are set precisely during the investigation period, so the function f(x₁, x₂,..., x_p) is calculated without errors in the x_i's. During mass production, however, they can be set only with known tolerances e_i (i = 1, 2,..., p). It is also assumed that the e_i are random variables with the following moments:

E(e_i) = 0;  var(e_i) = σ_i²;  cov(e_i, e_j) = ρ_ij·σ_i·σ_j
The response is not only affected by the tolerances in the x_i; the measurement error (or pure error) in the response also disturbs the response, for which the following assumptions are made:

pure error = ε;  E(ε) = 0;  var(ε) = σ_ε²;  cov(ε_u, ε_t) = 0;  cov(ε_u, e_tj) = 0

where u and t are different experimental conditions. If it is assumed for simplicity that the response surface can locally be approximated using a linear function (this is plausible when the function f(x₁, x₂,..., x_p) describes a smooth surface and/or when the error standard deviation in x is relatively small), then the propagated variance can be calculated by (see reference [19] for a complete derivation):

σ_c² = (b + 2Wc)' C_c (b + 2Wc)

where W = {w_ij} is a p×p matrix with elements w_ii = b_ii and w_ij = ½b_ij (i ≠ j); C_c is the variance/covariance matrix of the errors in the independent variables at point c; b is the vector with the first-order regression coefficients; and b + 2Wc is the gradient (tangent) of the response surface at point c.

Figure 4.10 elucidates this idea. The total variance in the response is thus σ_tot² = σ_ε² + σ_c², which is the sum of the pure error (σ_ε²) and the propagated error at point c (σ_c²). If the function describing the response surface consists only of terms linear in the factors (a 'flat' response surface) then the results of WJ and PV are approximately the same. The proof of this is given in reference [30].
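A compact numerical sketch of the Projected Variance calculation for a second-order model in two factors is given below; the regression coefficients, the error covariance matrix C_c and the pure error variance are arbitrary assumptions.

```python
import numpy as np

# Assumed second-order model: y = b0 + b1*x1 + b2*x2 + b11*x1^2 + b22*x2^2 + b12*x1*x2
b = np.array([2.0, -1.5])                  # first-order coefficients b1, b2
W = np.array([[0.8, 0.5 * 0.3],            # w_ii = bii, w_ij = 0.5*bij
              [0.5 * 0.3, -0.4]])
C = np.diag([0.02 ** 2, 0.03 ** 2])        # variance/covariance matrix of the errors in x1, x2


def propagated_variance(c):
    g = b + 2.0 * W @ c                    # gradient of the response surface at point c
    return float(g @ C @ g)                # sigma_c^2 = g' C g


sigma_eps2 = 0.1 ** 2                      # assumed pure (measurement) error variance
for c in (np.array([0.0, 0.0]), np.array([1.0, 1.0])):
    s2 = propagated_variance(c)
    print(c, s2, s2 + sigma_eps2)          # propagated variance and total variance
```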
4.3.7 The Robustness Coefficient
The Robustness Coefficient (RC) has been developed especially to handle the kind of robustness problem described in part 4.3.2 of this chapter. The RC is extensively described in reference [20]; the behaviour of the RC
has been examined in reference [21]. The RC represents the probability that, given a known variation in the settings of a set of independent (mixture) variables (x_i), the dependent variable (y) can be expected to fall within a predefined interval. If a model has been fitted on the basis of which the response can be predicted, then the RC can be calculated for every mixture composition in the design space, using the variance/covariance structure of the independent variables, the measurement error and the relation between x_i and y.
Figure 4.10 (a) Good variance approximation of PV due to a smooth response surface; (b) good variance approximation of PV due to a small variance in x

The concept of the RC is outlined in Figure 4.11. A minimum amount of overlap (m_o, for example 95%) is demanded between the probability distribution of a predicted value of the response, at a certain value of the independent variable(s), and an interval relative to ŷ_c, which is the prediction of the response at the point of interest (c). The amount of overlap is called p_o. If we assume that at the point of interest p_o is larger
than m_o, and that there is a single independent variable (x), then, normally, two points (a and b) on the x-axis can be identified which have a p_o value exactly equal to the m_o value defined a priori. If the characteristics of the error distribution of x at point c are known then, using this knowledge, the probability can be calculated that x lies between a and b when set to c.
Figure 4.11 Concept of the robustness coefficient

It would now be most logical to let this probability between a and b be the RC, but in the case of more than one independent variable with a multivariate error distribution it is a very complicated problem to calculate an almost always asymmetrical part of this distribution. To handle this problem the
RC will be represented by the smallest symmetrical part around point c which has a p_o value larger than or equal to m_o, i.e. in Figure 4.11 from a to a*. This is, in fact, a pessimistic approximation of the 'real' probability, but this area can be calculated much more easily. If, however, the real error distribution of x is not known (but the variance/covariance structure is), then the RC is represented by the smallest Mahalanobis distance (which is directly related to probability in the case of a known error distribution) between point c and a point with p_o = m_o (here from c to a). In this case only differences in the variances and covariances of the independent variables are taken into account in the RC value. Here the Mahalanobis distance is used because the errors in the fractions of a mixture are not normally distributed.
4.4 MULTICRITERIA DECISION MAKING
4.4.1 Introduction
In pharmaceutical technology research is, among other things, directed at the design of formulations. Considering a tablet system, the physical properties of the formulation depend on the nature and levels of the compounding substances and on the process variables used in manufacturing, such as, for instance, the compression force. Often the formulation has to satisfy a number of demands, such as a high crushing strength, a low disintegration time and a low friability; the price of a formulation can also play a role. Frequently these formulation goals are conflicting: e.g. a low disintegration time may involve a decreased crushing strength, and the addition of excellent lubricants, in order to improve the tablet preparation, may cause a decreasing crushing strength and an increasing disintegration time. Therefore an acceptable, rather than an optimum, solution to the formulation goals is achieved. Different optimisation techniques have been developed [22], which can be divided into two general groups. The first group consists of the model dependent approach, in which mathematical models are generated for every formulation property of interest as a function of the composition of the mixture of the compounding substances and possibly process variables [23]. This method is called the simultaneous approach. The settings of all the variables (mixture components and/or process variables) in the employed model have to be
preselected, so the number of experiments to be performed is known beforehand. After experimentation and calculation of a model, a relation is established between each formulation property separately and the variables in the employed model. When a model adequately describes this relation, predictions of this property can be made by interpolation over the whole range of the boundary values of the variables used, which forms the response surface. In Figure 4.12 the relation between the crushing strength and mixtures of three components (where the factor space can be represented by a triangle) is presented as a contour plot. The composition that gives a desired criterion value can be read directly from the figure.
Figure 4.12 Contour plot and levels of the crushing strength of the tablets (N); mixture triangle with vertices lactose 1 aq. 100 mesh, dried potato starch and lactose anhydrous
In Figure 4.13 the relation between the disintegration time and the mixture composition is depicted. With the application of more than one criterion a combined contour plot (Figure 4.14) can be made, where parts of the factor space that do not satisfy our demands are shaded. So the unshaded area is the region where mixtures can be found that will satisfy our demands.
Figure 4.13 Contour plot and levels of the disintegration time of the tablets (s)
Figure 4.14 Combined contour plot of the crushing strength and the disintegration time; = = disintegration time > 20 s; /// = crushing strength < 100 N; □ = remaining area
The second approach to optimisation is a model independent one. One of these model independent methods is the sequential simplex [24,25], used by Shek et al. [26]. The method is claimed to be ideally suited for the optimisation of formulations [27] because of the relatively low number of experiments to be performed. Disadvantages of the simplex method are that the number of experiments needed to reach an optimum is not known beforehand, which can lead to better but also to worse results compared to a simultaneous approach, and that, if an optimum is reached, nothing is known about the part of the response surface that has not been investigated: other, even higher optima can be present and, more importantly, the stability of the reached optimum against small variations of a criterion is not known. When using a sequential optimisation method one is forced to use one criterion. If more than one formulation goal is demanded, all these criteria have to be combined into a single response, e.g. a weighted linear combination, but this can lead to ambiguous results. For example, when using a composite criterion to maximise the crushing strength and minimise the disintegration time of a given tablet formulation, the following expression could be used:
R_o = a·R_c − b·R_d

with R_o = overall response, R_c = crushing strength, R_d = disintegration time, and a and b positive weighing factors.
In this particular situation two criteria are optimised. The overall responses of the vertices of the present simplex are compared and the simplex moves away from low responses. The optimisation route depends on the measured values of R_c and R_d and on the values of a and b, which have to be chosen before the start of the optimisation procedure. Suppose that in the progress of the optimisation procedure the following results are obtained:

Vertex 1: R_c = 40 N,  R_d = 30 s
Vertex 2: R_c = 90 N,  R_d = 150 s
Vertex 3: R_c = 110 N, R_d = 160 s
Situation A: let a = 2 and b = 1; the R_o values for vertices 1, 2 and 3 are then 50, 30 and 60, respectively. Vertex 3 is better than vertex 1 and vertex 1 is better than vertex 2.
Situation B: let a = 5 and b = 1; the R_o values for vertices 1, 2 and 3 are then 170, 300 and 390. In this case vertex 3 is better than vertex 2 and vertex 2 is better than vertex 1.

Figure 4.15 Illustration of the sequential simplex optimisation procedure in a mixture of three components. Point A is the result of situation A, point B is the result of situation B

These situations are illustrated in Figure 4.15. In situation A the optimisation procedure moves away from vertex 2 and in situation B it moves away from vertex 1. The choice of this search direction depends on the weighing factors and not on the individual responses R_c and R_d. When using a (sequential) optimisation technique with a composite criterion, nothing is known about the performance of the formulation in terms of the individual criteria R_c and R_d. A good choice of the weighing factors should rely upon the performance of the tablet formulation in terms of both criteria. Therefore a good choice of the weighing factors, at the start of the optimisation procedure, is hard to achieve in practice. The Multicriteria Decision Making (MCDM) method that is proposed here [28] is based on the Pareto Optimality (PO) concept; it does not make preliminary assumptions about the weighing factors, and the various responses are considered explicitly. Pareto Optimality, which makes statements about mixtures in the whole factor space, can therefore not be used in combination with a sequential optimisation method.
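The ambiguity of the composite criterion can be reproduced with a few lines of Python; the vertex values are those given above and the two pairs of weighing factors correspond to situations A and B.

```python
# (R_c in N, R_d in s) for the three simplex vertices
vertices = {1: (40.0, 30.0), 2: (90.0, 150.0), 3: (110.0, 160.0)}

for a, b in ((2.0, 1.0), (5.0, 1.0)):
    overall = {v: a * rc - b * rd for v, (rc, rd) in vertices.items()}
    ranking = sorted(overall, key=overall.get, reverse=True)
    print(f"a={a}, b={b}: R_o = {overall}, ranking (best first): {ranking}")
```

With a = 2 the simplex would move away from vertex 2, with a = 5 away from vertex 1, even though the measured responses are identical in both situations.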
4.4.2 Theory of MCDM
The formulation considered as an example here consists of three components. Two criteria, the crushing strength (to be maximised) and the disintegration time (to be minimised), were used to optimise this formulation. The factor space representing these components can be formed by a triangle where each vertex represents a pure component. Measurements of the criteria of mixtures of the three components are made at regular points (according to a simplex lattice design) in the factor space (Figure 4.16). The criteria measured can be related to the mixture composition with the use of a special cubic equation:

y = β₁x₁ + β₂x₂ + β₃x₃ + β₁₂x₁x₂ + β₁₃x₁x₃ + β₂₃x₂x₃ + β₁₂₃x₁x₂x₃ + ε

where y is the modelled criterion, β₁ to β₁₂₃ are the regression coefficients, ε is the measurement error and x₁, x₂ and x₃ are the fractions of the three mixture components. The seven regression coefficients in this model can be estimated by the use of multiple regression. This requires at least seven measurements of the crushing strength and the disintegration time located in the factor space (here the complete mixture triangle).
Figure 4.16 Simplex lattice design for a special cubic model with ten design points
After calculating the models for each criterion, the values of the crushing strength and the disintegration time can be predicted at every mixture composition within the factor space. So far this approach is analogous to most of the simultaneous optimisation methods. However, the optimisation is not continued by preselecting desired values for any criterion to construct contour plots (Figures 4.13 and 4.14), or by searching for acceptable solutions [29]. In the PO approach it is not necessary to preselect acceptable values for any criterion. All the predicted values of the two criteria, crushing strength and disintegration time, at each mixture composition are represented in a two-dimensional PO plot (Figure 4.17). Each point in this plot relates to a pair of those criteria values. The space occupied by these points is called the feasible criteria space.
Figure 4.17 Plot of the feasible criteria space of the crushing strength (N) and the disintegration time (s); ● = Pareto-optimal point; ○ = inferior point
Figure 4.18 Point p with an illustration of the four quadrants

For illustrative purposes, point p of Figure 4.17 is depicted in Figure 4.18 and the space around this point is divided into four quadrants. The following deductions can be made about these four quadrants. Quadrant 1: all the points falling in this quadrant are inferior to point p, because the mixtures considered have a larger disintegration time and a smaller crushing strength. Quadrants 2 and 3: point p is incomparable to the points falling into these regions. Points in quadrant 2 have a worse crushing strength compared to point p but a better disintegration time; for quadrant 3 the reverse is valid. Quadrant 4: whenever a point falls into quadrant 4, point p is inferior, because point p has a worse crushing strength and also a worse disintegration time compared to this point. By taking every point in Figure 4.17 as point p successively, all the inferior points can be removed by applying these three rules; only the non-inferior or Pareto Optimal points remain.
A point in the feasible criteria space is a Pareto Optimal point if there exists no other point in that space which yields an improvement in one criterion without causing a degradation in the other. By evaluating quantitatively the pay-off between a minimal disintegration time and a maximal crushing strength, a choice can be made between the Pareto Optimal points. The method will be illustrated with an example. For an introduction to the theory of MCDM see [30].
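The screening rules described above are easily automated; the sketch below (with invented candidate values) keeps only those points for which no other point has a higher crushing strength together with a lower disintegration time.

```python
def pareto_points(points):
    """points: list of (crushing_strength, disintegration_time) pairs;
    crushing strength is maximised, disintegration time is minimised."""
    optimal = []
    for cs, dt in points:
        dominated = any(cs2 >= cs and dt2 <= dt and (cs2, dt2) != (cs, dt)
                        for cs2, dt2 in points)
        if not dominated:
            optimal.append((cs, dt))
    return optimal


# Fictive predicted criteria values at a few candidate mixture compositions
candidates = [(60, 25), (70, 40), (95, 18), (80, 30), (95, 22), (100, 35)]
print(pareto_points(candidates))   # [(95, 18), (100, 35)]
```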
4.5 THE ROBUSTNESS COEFFICIENT APPLIED IN A MCDM STRATEGY 4.5.1 Introduction
In practice not only the robustness of a response at different mixture compositions is interesting for optimisation purposes but also the value of the product property itself (in our case the predicted value of the response). So an optimisation strategy directed to two criteria has to be applied. Pareto Optimality (PO) is the ideal method to consider both these optimisation goals. 4.5.2 Theory
The example considered here concerns the optimisation of a tablet formulation, consisting of three components. Two criteria, the crushing strength and the RC of the crushing strength, were used to optimise the tablet formulation. In Table 4.3 the experimental design is shown. A quadratic equation of the form:
y = β₁x₁ + β₂x₂ + β₃x₃ + β₁₂x₁x₂ + β₁₃x₁x₃ + β₂₃x₂x₃ + ε     (21)

was fitted to the data of Table 4.3 with the use of ordinary least squares regression. A proportional relationship is assumed between the weighed amount of a pure component and the measurement error made in the weighing. The robustness coefficient is only based on the three components at variable concentrations and not on the components at fixed concentrations (see the experimental section), because with these latter components the RC value cannot be changed (optimised). Therefore only the weighing errors of the components at variable concentrations will be used to calculate the RC; the weighing errors in the components at fixed
concentrations will express themselves in the measurement error of the dependent variable.
TABLE 4.3
MEASUREMENTS OF THE CRUSHING STRENGTH OF A TABLET FORMULATION; x1 = α-LACTOSE; x2 = β-LACTOSE; x3 = RICE STARCH; y-mean = AVERAGE OF TEN MEASUREMENTS OF THE CRUSHING STRENGTH (N)

x1      x2      x3      y-mean
1.000   0.000   0.000   25.0
0.000   1.000   0.000   49.5
0.000   0.000   1.000   57.1
0.500   0.500   0.000   33.3
0.500   0.000   0.500   31.3
0.000   0.500   0.500   70.0
0.333   0.333   0.334   39.6
0.667   0.333   0.000   30.0
0.333   0.667   0.000   37.5
0.667   0.000   0.333   17.1
0.333   0.000   0.667   38.4
0.000   0.667   0.333   56.8
0.000   0.333   0.667   62.1
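The ordinary least squares fit of the quadratic Scheffé model to these data can be reproduced with the short sketch below; the coefficients obtained should be close to those reported in Table 4.4 (small differences may occur depending on the exact fitting procedure used in the original work).

```python
import numpy as np

# Data of Table 4.3: x1, x2, x3 and the mean crushing strength (N)
data = np.array([
    [1.000, 0.000, 0.000, 25.0],
    [0.000, 1.000, 0.000, 49.5],
    [0.000, 0.000, 1.000, 57.1],
    [0.500, 0.500, 0.000, 33.3],
    [0.500, 0.000, 0.500, 31.3],
    [0.000, 0.500, 0.500, 70.0],
    [0.333, 0.333, 0.334, 39.6],
    [0.667, 0.333, 0.000, 30.0],
    [0.333, 0.667, 0.000, 37.5],
    [0.667, 0.000, 0.333, 17.1],
    [0.333, 0.000, 0.667, 38.4],
    [0.000, 0.667, 0.333, 56.8],
    [0.000, 0.333, 0.667, 62.1],
])
x1, x2, x3, y = data.T

# Design matrix of the quadratic Scheffe model of equation (21) (no intercept):
# y = b1*x1 + b2*x2 + b3*x3 + b12*x1*x2 + b13*x1*x3 + b23*x2*x3 + error
X = np.column_stack([x1, x2, x3, x1 * x2, x1 * x3, x2 * x3])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
for name, value in zip(["b1", "b2", "b3", "b12", "b13", "b23"], coef):
    print(f"{name:>4s} = {value:8.3f}")
```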
For the sake of illustration the constant relative standard deviation (cv) of the three variable components was set to 5%. The demands concerning the calculation of the robustness coefficient were to have a deviation in the crushing strength (δy) of at most 10 N with a reliability (m_o) of 0.95. The demands concerning the decision making method were: a maximal value for the crushing strength and a maximal value for the robustness coefficient.

4.5.3 Experimental
The filler-binders used were α-lactose monohydrate 100 mesh (D.M.V., NL-Veghel), anhydrous β-lactose (marketed as Pharmatose 21, D.M.V., NL-Veghel) and modified rice starch (developed as Eratab by Erawan Pharmaceutical Research and Laboratory Co. Ltd., TH-Bangkok and marketed as Primotab ET by Avebe, NL-Veendam). The disintegrant used was Primojel (Avebe, NL-Veendam). The tablets were lubricated with
magnesium stearate (Pharm. Eur. grade, O.P.G., NL-Utrecht), which was sieved through a 210 μm sieve prior to use. Colloidal silicon dioxide (Aerosil 200R, Degussa, D-Frankfurt) was added as a glidant; prior to use the Aerosil was sieved through a 210 μm sieve. The drug used was oxazepam (Pharm. Eur. grade, Pharmachemie, NL-Haarlem).
Tablet composition: Components with a fixed concentration were Primojel (4%), oxazepam (4%), magnesium stearate (1%) and Aerosil 200R (0.2%). The components with a variable concentration were α-lactose, β-lactose and rice starch. The concentrations of these components sum up to 90.8% of the total tablet weight; their individual concentrations can be obtained from the experimental design listed in Table 4.3, where the listed values of x₁-x₃ are fractions of the total amount of 90.8% for these three components.

Preparation of the tablets: All ingredients except magnesium stearate were mixed for 15 minutes in a Turbula mixer (model 2P, W.A. Bachofen, CH-Basle) at a rotation speed of 90 r.p.m. Magnesium stearate was then added and the mixing was continued for 2 minutes. Tablets (250 mg) were prepared on a single-punch tabletting machine (HOKO, NL-Rijswijk), using 9 mm flat punches. Tablets were produced at a load level of 157 MPa. The tabletting was performed in a room with constant temperature (20 ± 1 °C) and relative humidity (45 ± 5%).
Crushing strength: From each batch the crushing strength of 10 tablets was measured using a Schleuniger instrument (model 4M, Dr. K. Schleuniger, CH-Zurich).

4.5.4 Results and discussion
A quadratic equation was found to fit the crushing strength values best compared to other Scheffé models [31]. The calculated coefficients of the quadratic model equation (21) are presented in Table 4.4. Using equation (21), the crushing strength can be predicted at any place within the mixture triangle. A contour plot of the predicted crushing strength is depicted in Figure 4.19. The robustness coefficient can also be calculated at any place in the mixture triangle; the contour plot of the RC values is presented in Figure 4.20.
TABLE 4.4
MODEL FITTING RESULTS FOR THE CRUSHING STRENGTH DATA

Model coefficient    Value      Standard error
b1                   23.554     1.442
b2                   48.825     1.442
b3                   58.649     1.442
b12                 -12.113     5.666
b13                 -52.914     5.666
b23                  40.063     5.666

R²(adj) = 0.903
F-ratio (regression ms/residual ms) = 241.7 (p = 0.000)
Combining the behaviour of both criteria, maximisation of the crushing strength and maximisation of the robustness coefficient, results in a PO plot, as shown in Figure 4.21.

Figure 4.19 Contour plot of the crushing strength with contour lines at levels 25, 30, 35, 40, 45, 50, 55 and 60 N (labeled I-VIII, respectively)
Figure 4.20 Contour plot of the RC of the crushing strength with contour lines at Mahalanobis distances 0.3, 0.6, 1.0, 2.0, 3.0, 4.0, 5.0, 7.0, 10.0, 15.0 and 20.0 (labeled A-K, respectively)
Figure 4.21 PO-plot of the crushing strength (N) vs. the RC of the crushing strength. The points 1-7 correspond to those in Table 4.5
TABLE 4.5
PARETO-OPTIMAL POINTS. x1 = α-LACTOSE; x2 = β-LACTOSE; x3 = RICE STARCH; y1 = PREDICTED VALUE OF THE CRUSHING STRENGTH (N); y2 = RC OF THE CRUSHING STRENGTH

No.   x1      x2      x3      y1       y2
1     0.960   0.020   0.020   23.53    31.63
2     0.920   0.040   0.040   23.64    15.81
3     0.940   0.040   0.020   23.85    12.65
4     0.880   0.060   0.060   23.89    11.46
5     0.900   0.060   0.040   24.01    11.07
6     0.920   0.060   0.020   24.18    10.28
7     0.000   0.380   0.620   64.35     9.84
Every point in Figure 4.21 corresponds directly to a predicted value of the crushing strength and the calculated value of the robustness coefficient at one mixture composition. These points are called Pareto Optimal (PO) points; they are listed in Table 4.5. The PO points can also be placed in the corresponding mixture triangle, which is presented in Figure 4.22.

Figure 4.22 PO points placed in the mixture triangle. The points 1-7 correspond to those in Figure 4.21 and Table 4.5
The PO points are not directly comparable to each other. Moving from point 1 to 2 in the PO plot results in a higher crushing strength but a lower robustness coefficient. Between points 7 and 6 the reverse is valid, moving
from 7 to 6 results in a more stable mixture but with a lower crushing strength. Therefore, for example, a mixture measured at point 1 cannot be called better than a mixture measured at point 7. Considering the set of PO points presented in Table 4.5, the following statement can be made: there exists no mixture composition with a higher robustness coefficient and, simultaneously, a higher crushing strength than a PO mixture composition. The PO concept is clearly presented in Figure 4.22; the pay-off between the two criteria can be evaluated quantitatively. The experimenter can decide how much he or she wants to pay in one criterion to gain in the other.
REFERENCES
[1] G.E.P. Box, S. Bisgaard, The scientific context of quality improvement, Quality Progress, June 1987, pp 54-61.
[2] G.E.P. Box, S. Bisgaard, C. Fung, An explanation and critique of Taguchi's contributions to quality engineering, Report no. 28, University of Wisconsin-Madison, USA, March 1988.
[3] P.M. Burgam, Design of experiments - the Taguchi way, Manufacturing Engineering, May 1985, pp 44-47.
[4] J. Cullen, An introduction to Taguchi methods, Quality Today, September 1987.
[5] B. Gunter, A perspective on the Taguchi methods, Quality Progress, June 1987, pp 44-52.
[6] J.S. Hunter, Statistical design applied to product design, Journal of Quality Technology, 17(4) (1985) 210-221.
[7] R.N. Kackar, Off-line quality control, parameter design, and the Taguchi method, Journal of Quality Technology, 17(4) (1985) 176-188.
[8] R.N. Kackar, Taguchi's quality philosophy: Analysis and commentary, Quality Progress, December 1986, pp 21-29.
[9] R.H. Lochner, J.E. Matar, Designing for quality; an introduction to the best of Taguchi and western methods of statistical experimental design, Chapman and Hall, London-Madras, 1990.
[10] L.P. Sullivan, Reducing variability: A new approach to quality, Quality Progress, July 1984, pp 15-21.
[11] L.P. Sullivan, The power of Taguchi methods, Quality Progress, June 1987, pp 76-79.
[12] J.J. Pignatiello Jr., J.S. Ramberg, Discussion of ref. 7, Journal of Quality Technology, 17(4) (1985) 198-206.
[13] G. Taguchi, Introduction to quality engineering; designing quality into products and processes, American Supplier Institute, Dearborn, USA, 1986.
[14] S.P. Jones, Designs for minimizing the effect of environmental variables, PhD Thesis, University of Wisconsin-Madison, Madison, 1990.
[15] J.L. Cohon, Multiobjective Programming and Planning, Academic Press, New York, 1987.
[16] J.H. de Boer, A.K. Smilde and D.A. Doornbos, Introduction of multi-criteria decision making in optimization procedures for pharmaceutical formulations, Acta Pharmaceutica Technologica, 34 (1988) 140-143.
[17] G.E.P. Box, The effects of errors in the factor levels and experimental design, Technometrics, 5(2) (1963) 247-262.
[18] I.N. Vuchkov and L.N. Boyadjieva, The robustness against tolerances of performance characteristics described by second order polynomials, paper presented at: First international conference-workshop on optimal design and analysis of experiments, Neuchâtel, Switzerland, July 25-28, 1988.
[19] G.E.P. Box, Signal-to-noise ratios, performance criteria and transformations, Technometrics, 30 (1988) 1-17.
[20] J.H. de Boer, A.K. Smilde and D.A. Doornbos, Introduction of a robustness coefficient in optimization procedures: Implementation in mixture design problems. Part I: Theory, Chemometrics and Intelligent Laboratory Systems, 7 (1990) 223-236.
[21] J.H. de Boer, A.K. Smilde and D.A. Doornbos, Introduction of a robustness coefficient in optimization procedures: Implementation in mixture design problems. Part II: Some practical considerations, Chemometrics and Intelligent Laboratory Systems, 10 (1991) 325-336.
[22] D.A. Doornbos, Optimisation in pharmaceutical sciences, Pharmaceutisch Weekblad Scientific Edition, 3 (1981) 549.
[23] J.A. Cornell, Experiments with mixtures, John Wiley & Sons, New York, 1981.
[24] W. Spendley, G.R. Hext, F.R. Himsworth, Sequential application of simplex designs in optimization and evolutionary operation, Technometrics, 4 (1962) 441.
[25] J.A. Nelder, R. Mead, Simplex method for function minimization, Computer Journal, 7 (1965) 308.
[26] E. Shek, M. Ghani, R.E. Jones, Simplex search in optimization of capsule formulation, Journal of Pharmaceutical Sciences, 69 (1980) 1135.
[27] P.L. Gould, Optimisation methods for the development of dosage forms, Int. J. Pharm. Tech. & Prod. Mfr., 5 (1984) 19.
[28] A.K. Smilde, A. Knevelman, P.M.J. Coenegracht, Introduction of multi-criteria decision making in the optimization procedures for high-performance liquid chromatographic separations, Journal of Chromatography, 369 (1986) 1.
[29] J.B. Schwartz, J.R. Flamholz, R.H. Press, Computer optimization of pharmaceutical formulations I: General procedure, Journal of Pharmaceutical Sciences, 62 (1973) 1165.
[30] J.H. de Boer, A.K. Smilde and D.A. Doornbos, Introduction of a robustness coefficient in optimization procedures: Implementation in mixture design problems. Part III: Validation with competing criteria, Chemometrics and Intelligent Laboratory Systems, 15 (1992) 13-28.
[31] H. Scheffé, Experiments with mixtures, Journal of the Royal Statistical Society, Series B, 20(2) (1958) 344-360.
Chapter 5
RUGGEDNESS TESTS FOR ANALYTICAL CHEMISTRY

MARY MULHOLLAND
Department of Analytical Chemistry, University of New South Wales, P.O. Box 1, Kensington, New South Wales 2033, Australia.
5.1 INTRODUCTION
Many quantitative applications of analytical chemistry currently have very stringent requirements regarding the accuracy and precision of results. This is particularly true for applications that assay potentially toxic compounds or those coming under regulatory controls. Over the past decade or so the pharmaceutical industry has become increasingly regulated to control the quality of the analytical methodology employed throughout the various stages in the lifetime of a drug, from stability trials to quality control. This has resulted in an increased awareness of method validation procedures and prompted much research into suitable statistical methods that are both effective and efficient [1,2]. This chapter describes such an investigation into statistical methods for the determination of the ruggedness of an analytical method as part of an overall method validation strategy. Ruggedness testing is carried out as part of a precision study and the goal is to establish the effect of small changes in the method conditions (such as temperature or instrumental settings) on the qualitative and quantitative abilities of the method. A ruggedness test allows:
- The identification of conditions which are critical to the overall method performance.
- The method to be documented in an unambiguous format.
- The specification of system suitability criteria. These are a set of conditions that an instrumental set-up must meet before it can be used for the method. They usually consist of a range of values for
performance characteristics such as retention times for chromatography or absorbance response in spectroscopic methods.

This chapter begins by describing the design of an overall strategy for method validation and the role of a ruggedness test in this strategy. The implementation of a ruggedness test is then discussed in detail, including aspects such as the selection of factors to test, selecting an experimental design, interpreting the results and the final documentation of the validated method.

5.1.1 Designing a protocol for method validation
Ruggedness testing is one part of an overall method validation program and it is therefore important to begin this chapter by giving a brief outline of the levels of protocols for validation, showing clearly where the ruggedness test is performed. There are many ways to design and carry out a method validation study, and most companies provide two levels of validation protocols that describe the preferred methods. The top level protocol describes general procedures and contains an overview of the characteristics that should be tested and the generally accepted pass/fail criteria. These protocols are usually drawn up for a laboratory or group of laboratories, although some companies have company-wide protocols. A top level protocol could, for example, specify that ruggedness is tested for all stability indicating methods that use HPLC and could give an example pass/fail criterion such as "all main effects on the concentration results should be less than 2%". A top level protocol requires consideration of the compounds to be assayed and the application of their results, together with any relevant regulatory requirements. The second level protocol is specific to the method to be validated. This is a more detailed document combining the recommendations of the top level protocol with the requirements of the current method. The decisions on which tests to perform are dependent on the application of the method, for instance:
- How complex is the sample matrix?
- How often is the method to be used?
- How many samples are assayed in one run?
- What is the resolution between chromatographic peaks?
It is clear that a method that is only to be used 5 or 10 times requires much less rigorous testing than a method that is to be used for quality assurance over several years. The considerations for designing protocols for a ruggedness test, covering the selection of both the factors to test and the experimental designs, are described in the relevant sections of this chapter. Figure 5.1 shows the various characteristics and stages in a method validation program. For most quantitative methods of analysis, the method characteristics that require evaluation are accuracy, sensitivity, selectivity, precision and method limitations. Each of these characteristics has contributions from various effects, all of which require consideration within a method validation study.
[Figure 5.1 diagram: METHOD VALIDATION branching into REPEATABILITY, REPRODUCIBILITY and RUGGEDNESS, with INTRA-LABORATORY and INTER-LABORATORY studies]
Figure 5.1 Representation of a method validation program including the characteristics to test and some of the possible methods for a method validation study

Accuracy is defined as the bias of the method; it is a measure of how close the observed result is to the specified quantity. It is usually tested using a method of spiked placebos or of standard addition [3,4]. The expected performance of a method with respect to its accuracy varies enormously from sample to sample. For a simple drug formulation, such as a tablet or injection, assayed by HPLC, an accuracy of around +/- 1% can be expected.
In contrast, a pharmacological assay for a compound in a complex sample matrix such as blood may expect an accuracy level closer to +/- 20%. It is thus important to try to set achievable accuracy levels in the protocol.

Selectivity is often referred to as the specificity of an analytical method and is a measure of the discriminating ability of the technique. The general requirement for specificity is that the method should be capable of unambiguously determining the compounds of interest in the presence of impurities, degradation products and other sample matrix components. A specificity study often involves accelerated degradation studies, to ensure that degradation products will not interfere, and the collection of likely process impurities. Often a placebo sample is assayed to check for interference from the sample matrix.

Sensitivity is a measure of the rate of change of response with concentration. It is dependent on the method of detection and the nature of the analytes. The majority of analytical methods are designed around a linear response within a certain concentration range. However, it is always necessary to validate this linearity, and the slope of the linearity plot determines the sensitivity of the method.

The study of the precision of a method is often the most time and resource consuming part of a method validation program, particularly for methods that are developed for multiple users. The precision is a measure of the random error of the method. It has contributions from the repeatability of various steps in the analytical method, such as sample preparation and sample injection for HPLC [5-9], and from the reproducibility of the whole analytical method from analyst to analyst, from instrument to instrument and from laboratory to laboratory. As a reproducibility study requires a large commitment of time and resources, it is reasonable to ensure the overall ruggedness of the method before it is embarked upon.

The final stage of a method validation study involves testing the method for miscellaneous limitations that need to be determined before the method can be passed on for its intended application. Limitations that are commonly included in the validation study are:
- Lifetimes of samples and standards
- Lifetimes of reagents
- Detection limits
- Limits of quantitation [10]
TABLE 5.1
METHOD CHARACTERISTICS AND TEST PROCEDURES FOR METHOD VALIDATION

Validation characteristic | Contributions | Test procedures
Specificity | Peak purity | Diode array detection test
            | Degradation products | Accelerated degradation
            | Retention behaviour | Chromatogram
Sensitivity | Linearity | Plot of response vs. concentration
Accuracy | Recovery | Spiked placebo test
         | Linearity | Intercept of a linearity plot
Precision | Repeatability | Sample prep replication; injection replication
          | Ruggedness | Factorial designs
          | Reproducibility | Intra-laboratory study; inter-laboratory study
Limitations | Limit of detection | Signal/noise ratio
            | Lifetimes | Degradation studies
Table 5.1 summarises the characteristics of a method that require validation, together with the method features contributing to these characteristics and some example test procedures. It can be seen that there is a certain amount of overlap in the contributions and their test procedures for the various characteristics. For instance, a linearity test can give information on both the accuracy and the sensitivity of a method. The ruggedness test is normally included as part of a precision study; however, it can also contribute to other performance characteristics such as sensitivity and method limitations. Despite this, it fits best as part of a precision study, where it can be used to link repeatability and reproducibility tests effectively and efficiently.
5.1.2 Summary of the role of a ruggedness test in a method validation program

The ruggedness test evaluates the effects of small changes in the method conditions on the analytical performance. Analytical methods often must be designed and optimised for an individual assay; this is particularly true for HPLC, where almost every new method is unique and hence requires complete validation. The method validation study is for the most part carried out within the environment in which the method was developed, often by the same analyst on the same instrument. However, its expected use could be outside this environment, and hence the validation study must establish how rugged the method will be to expected changes in the analytical method conditions. Youden et al. set down guidelines for the validation of analytical methods in their book [11], and they specified the testing of ruggedness prior to a reproducibility study for reasons of efficiency. There are several reasons for careful placement of the ruggedness test in a program of method validation tests. Firstly, the ruggedness test itself can be a complex and time consuming task and thus should be carried out as late in the method validation as possible (i.e. when most other performance characteristics have been established and are acceptable). This reduces the chance of a failed ruggedness test, and for this reason it is recommended that the precision study be one of the last experiments in a validation study. Some validation tests can provide valuable information that helps to design a more efficient ruggedness test. For instance, if the repeatability of the various stages in the method is already established then the order of an experimental design is not so critical, and it is usually sufficient to perform duplicates for each experiment. These features are discussed in more detail in the section on experimental designs. Figure 5.2 illustrates the various stages of designing a method validation study, with examples for a ruggedness test. The first stage involves the selection of the relevant features of the method to test (i.e. the selection of factors for a ruggedness study). The next stage is the selection of a suitable test design; this often involves the use of a formal statistical design such as a factorial design. This stage is followed by a diagnosis of results, which can, for an unsuccessful study, mean the re-design of the whole ruggedness study. If the test is successful then the end results are reported in a suitable format. Each of these stages is discussed in some detail in the following sections.
[Figure 5.2 flowchart: selection of a method feature to test (e.g. temperature, flow rate, % solvent) -> selection of a test design -> make the necessary test measures (main effects) -> diagnosis against pass/fail criteria -> report, with a loop back to re-optimise the test if the diagnosis fails]
Figure 5.2 Stages in the design of a ruggedness test
5.2 SELECTION OF FACTORS TO TEST

The first stage in the design of a ruggedness test is to select the factors to test and the levels at which to test them. Factors are method conditions which are variable and could cause deterioration in the method results. The choice of factors is critical to the ruggedness test: they must be relevant and reflect the changes that are likely to occur over the lifetime of the method, yet they cannot be so extreme as to cause the method to fail and require a repeat of the ruggedness test. Once the factors have been chosen it is necessary to decide the number and the values of the levels at which they should be tested. Again this depends on the changes expected throughout the lifetime of the method and on whether a linear response to those changes is expected. Usually one or two levels are tested above and below the method value.
5.2.1 Selection of the number of levels at which to test a factor

The number of levels selected for a test is usually decided for pragmatic reasons such as test efficiency, and it is rarely necessary to test more than two extreme levels together with the method level for each factor. The following are some guidelines that can be used to decide whether to test one or two extreme levels:
1. If the method under test is a simple analytical test, such as a UV/Vis quantitation, a pH reading or a simple potentiometric method, then it is likely to require only two levels of testing.
2. If the method is more complex, such as HPLC, ICP or AA, then it is more likely to require three levels of testing.
3. If the factor chosen for testing is unlikely to show a linear response to changes then three levels are required. Such factors include UV wavelength, column manufacturers for HPLC, acids for use in acid hydrolysis as a sample preparation procedure for ICP, etc.
4. If a method is developed for use in a strictly controlled regulatory environment, such as for use during pharmaceutical stability studies or clinical trials, then it is a good idea to test at three levels for the increased confidence this provides in the method.
5. Any method developed for widespread use over a number of years is worth a thorough ruggedness test, which will reduce the chance of method failure, and thus is worth testing at three levels.
These are just a few guidelines and many other aspects of the application must be considered in making this decision. However, the above examples serve to demonstrate the importance of pragmatic considerations over technical details associated with individual methods.

5.2.2 Selection of factors for HPLC methods

HPLC is a complex analytical methodology that involves the development of a unique method for each new application. This method development often requires the optimisation of several method conditions to achieve a desired selectivity and sensitivity [12,13]. HPLC is also one of the most commonly applied analytical techniques and is in widespread use throughout the pharmaceutical industry for applications as diverse as quality control, stability studies and clinical trials. These two reasons mean that HPLC has been the focus of most research into ruggedness testing procedures because it is most likely to require extensive ruggedness
studies. Hence, much of this chapter is devoted to the specific needs of HPLC in a ruggedness test. A HPLC method can be divided into five groups, and generally it is recommended to select factors from each of these groups for a thorough ruggedness study:
- Sample preparation
- The HPLC instrumental conditions
- The HPLC column
- The method of detection
- The data handling

If the sample preparation is simple (i.e. does not involve extensive shaking, heating or ultrasonication and does not require more than one dilution step or a derivatisation or extraction stage) then it is not usual to test the ruggedness of the sample preparation, unless the method is for a strictly controlled regulatory environment. However, if a derivatisation or extraction is performed then it makes sense to test factors from these procedures, as they are likely to affect the method ruggedness. Table 5.2 shows some examples of factors that can affect the ruggedness of the sample preparation procedure. The HPLC instrumental conditions include settings such as flow rate and the solvent mix. These are often the most critical factors for a HPLC method and can require extensive optimisation to achieve the desired selectivity. If a formal optimisation study has been carried out for the solvent mix then this can often provide the necessary ruggedness information, and hence these factors can be excluded from the ruggedness study. Otherwise, it is usually essential to test at least the ruggedness to changes in the solvent component with the smallest percentage composition in the mobile phase (as this one is likely to be the most critical). Other factors that may require testing include the pH, especially if it is not sufficiently buffered. Flow rate can be useful to test, as the effects of these changes are not always as expected. Table 5.2 shows other example factors for the testing of HPLC instrumental conditions. The HPLC column is a crucial component in the chemistry of any HPLC method. Unfortunately it is not easy to obtain consistent performance from columns nominally containing the same functional groups, obtained from different manufacturers or even from a single manufacturer but from a different batch of material. As columns have a finite lifetime it is usually a
good idea to evaluate the method's ruggedness to column changes. Example factors are given in Table 5.2.
TABLE 5.2
EXAMPLE FACTORS TO SELECT FOR TESTING THE RUGGEDNESS OF A HPLC METHOD

Sample preparation: sample weight; shake time; filter pore size; centrifuge time; extraction volumes; dilution volumes; derivative reagent conc.; age of sample
HPLC instrument: flow rate; % solvent mix; pH; degassing method; acid or base used to control pH; buffer conc.; temperature; column temperature
Method of detection: UV/visible wavelength; refractive index range; fluorescence emission or absorption wavelengths; time constant
HPLC column: pore or particle size; batch of material; manufacturer of material; column length; column internal diameter; column wash; column age
Data handling: peak start slope; smoothing factors; peak end slope

The detection method can also be a source of potential variation from instrument to instrument. For example, UV detectors usually specify the wavelength accuracy to within +/- 1-3 nm, and a change of this size could prove significant. Another, often ignored, factor is the time constant of the detector. Too low a setting of this factor can show significant noise levels, and too high a value can distort the peak shape. The data handling or integration software can also cause peak distortion if the settings are altered throughout the lifetime of the method, and it is often necessary to test these for ruggedness.
The levels at which to test these factors must be chosen to reflect the variation likely to be seen in practice; instrument manufacturers provide specifications that can give some idea of the expected variation. The method under test will also restrict the levels; for instance, a method showing very small resolution between a critical pair of peaks will not tolerate large changes. In summary, the following method parameters must be considered when selecting factors and their levels for a HPLC ruggedness test [14]:
- Chromatographic results (peak symmetry, resolution, run time of first peak)
- Intended application (stability studies, quality control, clinical trials, pharmacological studies)
- Complexity of method (sample extractions or derivatisation, very small components of the mobile phase)
TABLE 5.3
FACTORS TO TEST FOR SEVERAL ANALYTICAL METHODS

TLC: solvent mix; plate packing material; development procedures
GC: gas flow rates; column manufacturer; injection volume; column temperatures; detection time constant
ICP: gas flow rates; RF power; lifetimes of reagents
Titrations: indicator volumes; temperature of sample; analyte matrix
Electrochemical: age of electrode; temperature of sample; analyte matrix
5.2.3 Selection of factors for other analytical methods

All analytical methods involve some kind of sample preparation, and thus the types of factors selected for a HPLC method, as shown in Table 5.2, are similar for all methods. However, different stages in the preparation will be more critical for some methods. For instance, derivatisation will be more common for gas chromatography methods and hence more critical,
or the removal of interfering ions from an ion selective potentiometric method could also prove critical for that method. Other factors, particularly instrumental controls and settings, which influence ruggedness will vary widely from method to method. Table 5.3 provides a summary of some likely factors for various analytical methods. It can be seen that similar sample preparation factors are selected from method to method, but relevant instrument factors are much more varied from technique to technique.
5.3 SELECTION OF EXPERIMENTAL DESIGNS

For many years the recommended designs for ruggedness testing were saturated fractional factorial designs [11,15-18]. However, other designs (full, fractional and saturated factorials, together with central composite, Box-Behnken and star designs) could provide more thorough solutions for some applications. There is a school of thought that the necessary ruggedness information can be gathered from all the rest of the method validation data without the need for formal statistical designs. Box et al. [20] provide the following arguments against the use of such "happenstance data". First, it is unlikely that data collected over a period of time will remain consistent. Typically standards are modified, instruments change, calibrations drift, changes occur in the materials used, and some cases can even be affected by the weather. It is sometimes possible to take some of these things into account, for instance if it was noted when a column manufacturer was changed. However, much of what is relevant is not usually recorded. The range of variables can be limited by controls, and this prevents the discovery of statistically significant effects. For instance, the column temperature may have been suspected to be critical and hence always controlled within a range that caused no major effects; analysis of these data after their collection could lead to the conclusion that changes in temperature are insignificant. Variables can often be altered simultaneously, and this makes it impossible to determine later which variable caused a given effect. This is known as semi-confounding of effects. Another effect is known as nonsense correlation, and is observed because we inevitably do not measure all the variables that can affect a method. If there is a latent or lurking variable that causes an effect in
parameters x and y, it could be assumed there is a causal relationship between x and y that would in fact be a nonsense correlation. Other problems pointed out by Box et al. [20] are serially correlated errors, dynamic relations and feedback. All the above problems can be overcome by the use of properly designed statistical experiments that employ features such as randomisation, blocking and other suitable controls.
5.3.1 Factorial designs

Box et al. [20] provide a good introduction to factorial designs. The most thorough ruggedness test would involve the application of a full factorial design that tests all main effects and interaction effects.
TABLE 5.4
EXAMPLE LEVELS FOR THREE HPLC FACTORS TESTED AT TWO LEVELS

Factor to test | Method value | Level 1 | Level 2
Wavelength | 254 nm | 252 nm | 256 nm
% Acetonitrile | 10% | 8% | 12%
Sample extraction volume | 5 ml | 4 ml | 6 ml
If, for example, an analyst selects a number of factors to test (k) at a fixed number of levels (l), then the number of required experiments (N) is l^k. For example, if three factors are chosen to test a HPLC method (wavelength, % acetonitrile and sample extraction volume) and these three factors are to be tested at two levels, one above and one below the method value (as shown in Table 5.4), then the number of experiments required by the full factorial experimental design is 2^3, or 8. An illustration of the experimental space covered by this design is shown in Figure 5.3, where the normal method conditions are represented by the centre of the cube and each corner of the cube represents the conditions for one of the eight experiments required for this design. These eight experiments are shown in Table 5.5. These full factorial designs can give values for all main effects and all interaction effects between these three factors; the main effects and interactions are shown in Table 5.6.
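The l^k enumeration is easy to generate mechanically. The following minimal Python sketch (not part of the original text; the factor names and levels are simply those of Table 5.4) lists the eight runs of the two-level full factorial design; the run order will differ from the listing in Table 5.5:

```python
from itertools import product

# Two-level full factorial design for the three HPLC factors of Table 5.4.
# Each factor is tested one step below and one step above the method value.
factors = {
    "wavelength_nm": (252, 256),
    "acetonitrile_pct": (8, 12),
    "extraction_volume_ml": (4, 6),
}

# itertools.product enumerates every combination of levels: l**k = 2**3 = 8 runs.
design = list(product(*factors.values()))

for run_no, levels in enumerate(design, start=1):
    print(run_no, dict(zip(factors, levels)))
```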
Figure 5.3 A full factorial experimental design spanning three factor dimensions
TABLE 5.5
THE EXPERIMENTAL CONDITIONS FOR A FULL FACTORIAL, TWO LEVEL DESIGN TO TEST THREE HPLC FACTORS

Experiment number | Wavelength (nm) | % Acetonitrile | Sample extraction volume (ml)
1 | 252 | 8 | 4
2 | 256 | 8 | 4
3 | 252 | 12 | 4
4 | 256 | 12 | 4
5 | 252 | 8 | 6
6 | 256 | 8 | 6
7 | 252 | 12 | 6
8 | 256 | 12 | 6
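Once runs like those in Table 5.5 have been carried out, the main effects and interactions listed in Table 5.6 can be estimated as simple contrasts on the coded (+/-1) design matrix. A minimal Python illustration follows; the response values are invented placeholders, not data from the chapter:

```python
from itertools import product

# Coded design matrix for three factors A, B, C at two levels (-1, +1).
runs = list(product((-1, 1), repeat=3))

# Placeholder responses (e.g. plate counts), one per run; purely illustrative.
response = [2650, 2580, 2470, 2450, 2520, 2440, 2610, 2860]

def effect(signs):
    """Contrast: sum of sign*response divided by half the number of runs."""
    return sum(s * y for s, y in zip(signs, response)) / (len(runs) / 2)

# Main effects use the factor's own column of the design matrix.
for i, name in enumerate("ABC"):
    print(name, effect([run[i] for run in runs]))

# Interaction contrasts are elementwise products of the corresponding columns.
print("AB", effect([a * b for a, b, _ in runs]))
print("AC", effect([a * c for a, _, c in runs]))
print("BC", effect([b * c for _, b, c in runs]))
print("ABC", effect([a * b * c for a, b, c in runs]))
```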
TABLE 5.6
EFFECTS AND INTERACTION EFFECTS CALCULATED BY A FULL FACTORIAL DESIGN TESTING THREE FACTORS A, B AND C

Main effects | Two factor interactions | Three factor interactions
A | AB | ABC
B | AC |
C | BC |

These two level full factorial designs, which test levels above and below the method conditions, do not perform the same task as designs testing the change from the method level to an extreme level. These designs work only on the assumption that each of the three factors tested will have a linear response in the range between the two levels. In many cases this is not a valid assumption; for instance, a UV/Vis detector is usually operated at an optimum absorbance value or at the absorbance maximum for the compound being assayed. If two levels above and below this maximum were tested then no noticeable change might be observed (see Figure 5.4), whereas the actual change measured against the method value would be much more significant. This problem is widespread in analytical methods, particularly for factors that are optimised, where changes to either side of the method value are likely to cause deterioration in the method's performance. In order to avoid the kind of problem illustrated in Figure 5.4 it is recommended that a three level design be carried out. Figure 5.5 shows a three level full factorial design for the HPLC example which tests three factors, centred around the method conditions. This design allows the testing of changes from the method conditions to extremes on either side. The number of experiments required to perform a full factorial design increases dramatically with the number of factors. Consider a two level design for 7 factors: the full design requires 128 experiments, from which 128 statistics can be calculated to estimate the effects shown in Table 5.7. In terms of absolute magnitude the main effects tend to be larger than two factor interactions, which in turn are larger than three factor interactions, and so on. Beyond a certain order, interaction effects become negligible and can thus be disregarded in the experimental design. To do this, full factorial designs are fractionated to allow the estimation of only a certain level of interaction effects.
[Figure 5.4 diagram: absorbance plotted against wavelength, comparing the small change observed between the two tested wavelengths with the larger actual change from the method value]
Figure 5.4 The difference in the observed response when two wavelengths, 2 and 3, are tested, compared to the actual effect of changing from the method value to either 2 or 3
TABLE 5.7
NUMBER OF EFFECTS CALCULATED FROM A FULL FACTORIAL DESIGN FOR SEVEN FACTORS

Average | Main effects | 2 factor | 3 factor | 4 factor | 5 factor | 6 factor | 7 factor
1 | 7 | 21 | 35 | 35 | 21 | 7 | 1
Full factorial designs can be fractionated by the exclusion of experiments designed to identify higher order effects and such reduced designs are known as fractional factorial designs. The most commonly used designs are half-fractional designs and saturated fractional designs.
Figure 5.5 A three level factorial design

Half-fractional designs are constructed by assuming that all interaction effects higher than first order are negligible. For a study of k factors at two levels the number of experiments (n) is 2^(k-1). For three factors at two levels the number of experiments required is four, and the design is shown in Table 5.8 (the two extreme values to be tested are represented as + and -).
TABLE 5.8
HALF FRACTIONAL DESIGN FOR THREE FACTORS

Experiment No | Factor A | Factor B | Factor C
1 | + | + | +
2 | + | - | -
3 | - | + | -
4 | - | - | +
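A minimal sketch (an illustration, not the author's procedure) of how such a half-fraction can be generated: run a full factorial in the first k-1 factors and set the last factor equal to their interaction column, here C = AB, which is exactly the assumption that the highest-order interaction is negligible:

```python
from itertools import product

# 2**(3-1) half-fraction: full factorial in A and B, with C aliased to the AB product.
# This mirrors the text's assumption that higher-order interactions are negligible.
half_fraction = [(a, b, a * b) for a, b in product((-1, 1), repeat=2)]

for run_no, (a, b, c) in enumerate(half_fraction, start=1):
    print(run_no, {"A": a, "B": b, "C": c})
```

The four runs generated this way are the same treatment combinations as in Table 5.8, although the run order may differ.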
Half-fractional designs reduce the number of experiments by half for a two level design and can prove very efficient; however, the number of experiments can still be prohibitive when a large number of factors require
testing at more than two levels. For these situations some kind of saturated fractional design is recommended. Saturated fractional designs are constructed on the assumption that all interaction effects are insignificant, and the number of experiments is then reduced to k + 1. These designs are particularly useful for an efficient solution to three level designs, as they can be reflected as shown in Figure 5.6. This type of reflected design, which evaluates the experimental space in only two of the eight potential hypercubes, is only valid with saturated factorial designs, as they depend on the assumption that all interaction effects are negligible. With this assumption the results for any of the remaining six hypercubes can be calculated from the results of the two evaluated diagonal hypercubes. Table 5.9 shows the experiments required for a Plackett-Burman design for the testing of seven factors. These Plackett-Burman designs are the most commonly used designs for ruggedness testing of HPLC methods [21]. This is mainly due to their efficiency and their ability to generate a large amount of ruggedness data: for a ruggedness test it is essential to determine whether a method is rugged to many changes, rather than to determine the values of each effect.
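One convenient way to build a saturated two-level design for seven factors is to start from a 2^3 base factorial in three factors and assign the remaining four factors to its interaction columns. The sketch below is an illustration, not the author's listing: the generator assignment D = AB, E = BC, F = ABC, G = AC is an assumption made here because it reproduces the confounding pattern given later in Table 5.13, and the run order it produces will differ from the printed order of Table 5.9:

```python
from itertools import product

# Saturated two-level design for seven factors A..G: a 2**3 base factorial in
# A, B, C with the remaining factors assigned to interaction columns
# (assumed generators: D = AB, E = BC, F = ABC, G = AC).
rows = [(a, b, c, a * b, b * c, a * b * c, a * c)
        for a, b, c in product((1, -1), repeat=3)]

for run_no, row in enumerate(rows, start=1):
    print(run_no, {f: ("+" if v > 0 else "-") for f, v in zip("ABCDEFG", row)})
```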
Figure 5.6 A reflected saturated factorial design
TABLE 5.9
A PLACKETT-BURMAN DESIGN FOR SEVEN FACTORS AT TWO LEVELS

Experiment | A | B | C | D | E | F | G
1 | + | + | + | + | + | + | +
2 | + | - | + | - | - | - | +
3 | - | + | + | - | + | - | -
4 | + | + | - | + | - | - | -
5 | - | - | + | + | - | + | -
6 | - | - | - | + | + | - | +
7 | + | - | - | - | + | + | -
8 | - | + | - | - | - | + | +

Although these saturated designs assume interaction effects to be negligible and only estimate main effects, they have the feature known as confounding, where higher order effects can overwrite the main effects. Thus, if a method is not rugged to higher order effects this will be observed in the values of the main effects. This effect is discussed in more detail later, in the section on evaluation of results.

5.3.2 Star designs

An example star design which tests three factors at three levels is illustrated in Figure 5.7. These designs require 2k + 1 experiments [22] and appear to offer a very efficient solution for a ruggedness test; however, star designs do not give a good overview of the effects of changing factors simultaneously.
Figure 5.7 A star design for three factors tested at three levels
TABLE 5.10
EXPERIMENTAL PROCEDURE FOR A STAR DESIGN

Experiment No | A | B | C
1 | 0 | 0 | 0
2 | 0 | 0 | 1
3 | 0 | 1 | 0
4 | 1 | 0 | 0
5 | 0 | 0 | -1
6 | 0 | -1 | 0
7 | -1 | 0 | 0
These limitations can be seen by comparing the experimental procedure for a three factor, three level star design, shown in Table 5.10, with the experimental procedure for a reflected saturated fractional design, which also tests three factors at three levels, shown in Table 5.11. In Tables 5.10 and 5.11 the method conditions are represented by 0 and the extreme conditions by 1 and -1. The star design requires one measurement at each extreme value, whereas the reflected design has two; this difference becomes more marked for larger k values. Star designs cannot test interaction effects.
TABLE 5.11
EXPERIMENTAL PROCEDURE FOR A REFLECTED SATURATED FACTORIAL DESIGN

Experiment No | A | B | C
1 | 0 | 0 | 0
2 | 1 | 1 | 0
3 | 1 | 0 | 1
4 | 0 | 1 | 1
5 | -1 | -1 | 0
6 | -1 | 0 | -1
7 | 0 | -1 | -1
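The layout of Table 5.11 can be generated by mirroring the upper saturated block through the method conditions, sharing the centre point, so that 2(k + 1) - 1 = 7 runs result. A minimal sketch:

```python
# Reflected saturated design for three factors at three coded levels (-1, 0, +1).
# The "upper" block is a (k+1)-run saturated design on the high side of the
# method value; its mirror image gives the low side, and the all-zero centre
# point (the method conditions) is shared between the two blocks.
upper_block = [(0, 0, 0), (1, 1, 0), (1, 0, 1), (0, 1, 1)]
reflected = upper_block + [tuple(-x for x in run) for run in upper_block[1:]]

for run_no, run in enumerate(reflected, start=1):
    print(run_no, dict(zip("ABC", run)))
```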
5.3.3 Central composite designs

Central composite designs are constructed by a juxtaposition of a two level full factorial and a star design. They can be used to determine the effects of changing from the method value to the extremes and the effects of changing from extreme to extreme.
Figure 5.8 A central composite design
The design, which is illustrated in Figure 5.8, gives the most comprehensive evaluation of the response surface for a given number of experiments. It provides greater efficiency than a three level full factorial design yet essentially obtains the same information. However, as k increases the number of required experiments quickly becomes impractical [22].
5.3.4 Box-Behnken designs

Box-Behnken designs can be defined as simplex sum designs [23]; a design for three factors is constructed from the three dimensional regular simplex. It is a second order rotatable design, and the complete design is identical to the central composite design shown in Figure 5.8.
Box-Behnken designs provide efficient solutions for some k values compared with the central composite design; for example, a design for k = 7 with three levels uses 66 experiments compared to 92 for a similar central composite design.
5.3.5 Matching the ruggedness test to an efficient design

All the designs described above are useful in setting up a ruggedness study. The main constraints around which a ruggedness test is designed are, firstly, the number of factors and levels that require testing and, secondly, the number of experiments needed by a particular experimental design. Table 5.12 shows the number of required experiments for the designs described above with different numbers of factors to test. This table shows clearly the compromise that needs to be made between the thorough study provided by designs such as full factorials and central composites and the efficiency of the reduced designs such as the star and saturated factorial designs. It is unlikely that any ruggedness test could justify the outlay of more than 30 experiments; in fact it is rare that more than 20 experiments would be carried out. If a method of analysis is fast or can be fully automated and requires the testing of few factors (three or fewer) then the larger designs can be considered. Good choices are central composite designs or, if a linear factor response is expected, a full factorial design at two levels. These large designs have the advantage of allowing a complete study where all interaction effects are estimated. However, for the majority of applications the large number of experiments required discourages their use. When a large number of factors need to be tested then it is more efficient to select one of the saturated factorial or star designs, while bearing in mind the limitations of these designs. (This is discussed in more detail in the section on treatment of results.) As can be seen from the discussion above, the most critical limitation is the number of factors that need to be tested. A complex HPLC method may require 10 factors to be tested at three levels, and thus a reflected Plackett-Burman design would provide an efficient solution with 23 experiments. However, a simple UV spectroscopic method may require the testing of only five factors, and thus an eleven experiment star design would be sufficient.
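Most of these experiment counts follow directly from the size formulas quoted above. The short sketch below is an illustration; the reflected Plackett-Burman count is written as 2N - 1, an assumption that matches the 15-run and 23-run examples given in the text for 7 and 10 factors:

```python
import math

def plackett_burman_runs(k: int) -> int:
    """Smallest multiple of four large enough for k factors (k + 1 columns)."""
    return 4 * math.ceil((k + 1) / 4)

def experiment_counts(k: int) -> dict:
    n_pb = plackett_burman_runs(k)
    return {
        "full factorial, 2 levels": 2 ** k,
        "full factorial, 3 levels": 3 ** k,
        "half fraction, 2 levels": 2 ** (k - 1),
        "star (2k + 1)": 2 * k + 1,
        "Plackett-Burman, 2 levels": n_pb,
        "reflected Plackett-Burman, 3 levels": 2 * n_pb - 1,
    }

for k in (3, 5, 7, 10):
    print(k, experiment_counts(k))
```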
TABLE 5.12
THE NUMBER OF EXPERIMENTS REQUIRED FOR EXPERIMENTAL DESIGNS WITH VARIOUS k VALUES

Number of factors | Full factorial, 2 levels | Full factorial, 3 levels | Half factorial, 2 levels | Central composite | Box-Behnken | Star | Plackett-Burman, 2 levels | Reflected Plackett-Burman, 3 levels
2  | 4    | 9     | 2   | -  | -  | 5  | 4  | 7
3  | 8    | 27    | 4   | 20 | 13 | 7  | 4  | 7
5  | 32   | 243   | 16  | -  | -  | 11 | 8  | 15
6  | 64   | 729   | 32  | -  | -  | 13 | 8  | 15
7  | 128  | 2187  | 64  | 92 | 66 | 15 | 8  | 15
8  | 256  | 6561  | 128 | -  | -  | 17 | 12 | 23
10 | 1024 | 59049 | 512 | -  | -  | 21 | 12 | 23
11 | -    | -     | -   | -  | -  | 23 | 12 | 23
15 | -    | -     | -   | -  | -  | 31 | 16 | 31
16 | -    | -     | -   | -  | -  | 33 | 20 | 39
19 | -    | -     | -   | -  | -  | 39 | 20 | 39
5.4 TREATMENT OF RESULTS The first stage in deciding how to treat the results from a ruggedness test is to select a range of parameters to measure which will provide both qualitative and quantitative information on the method’s performance. The second stage is to decide how best to evaluate the main effects, standard errors and interaction effects provided by the selected experimental design. For this discussion we will consider only the application of HPLC, normally one of the most complex analytical methods to evaluate.
5.4.1 Measurements for a HPLC study

For quantitative analysis by either the external or the internal standard method, HPLC requires the use of calibration solutions that are injected under identical conditions. Thus, to fully identify quantitative effects, calibration solutions plus sample solutions need to be analysed for each experiment in a ruggedness test. As duplicate determinations are required for the estimation of standard errors, a single experiment can consist of up to six chromatographic injections, as shown below:
1. Calibration 1
2. Calibration 1
3. Sample 1
4. Sample 1
5. Calibration 2
6. Calibration 2
These are then treated as two duplicate series: 1, 3, 5 and 2, 4, 6. Often the user will decide that a series of four injections is sufficient, with two injections each from a single sample solution and a single calibration solution. However, with modern HPLC equipment it is possible to fully automate a ruggedness test, and hence little is really gained by reducing the number of calibration injections. The first calibration injection is used to identify the number of peaks and their retention times; these will vary throughout the experiments of a ruggedness study as the method conditions are altered. This information can then be used to identify the sample peaks. It is usual to start the ruggedness test by running the experiment with all factors set to their method values. This is because the analyst is familiar with these conditions from the rest of the validation study and thus can evaluate the suitability of
the instrumental set-up. The order of experiments in most formal statistical designs is generated to minimise the possible effects of drifting conditions; however, if a full repeatability study has already been carried out over a time similar to that taken by the ruggedness study, then it is acceptable to let pragmatic reasons prevail. Thus, for instance, if an HPLC column change is required it is sensible to run all the experiments requiring one column sequentially. After ordering the experiments in a pragmatic way, the following data are usually collected for each experiment:
1. The retention times of the sample peaks (t).
2. Sample peak area (a) and sample peak height (h).
3. Concentration in the sample (c). This is normally calculated using both peak areas and peak heights, as it is a good idea to postpone the selection of a calibration technique until after the ruggedness study.
4. Mean number of theoretical plates, N. There are several methods to calculate N; the following calculation is often employed for its convenience, as it uses values that are already collected as part of the data handling:

   N = 2π (t h / a)²    (1)

5. The resolution, Rs, between each peak and its nearest eluting peak in the sample chromatogram, which can be calculated as follows:

   Rs = √N (t2 - t1) / [2 (t1 + t2)]    (2)

   where t1 and t2 are the retention times of the two peaks and N is the mean plate number.
6. A simple estimate of peak symmetry, S, which can be useful and is calculated as follows:

   S = (te - t) / (t - ts)    (3)
   where ts and te are the peak start time and peak end time, respectively. (Note that this estimate of symmetry can often prove difficult to use in practice because of the difficulty of determining peak start and end times. It is often useful to measure peak widths at 10% of the peak height to overcome this problem.)

The above parameters are monitored throughout the ruggedness study and saved for use in the calculation of main effects and standard errors. Measuring so many parameters captures not only the quantitative effects but also the qualitative effects of changes in the method conditions. This has the advantage of allowing the effect of drifting conditions to be estimated and a range of system suitability parameters to be set.
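The chromatographic quantities above translate directly into short helper functions. The sketch below is an illustration only; the resolution and symmetry expressions follow the forms written out above (which are themselves reconstructions from the surrounding definitions), and the numbers used are invented, not measurements from the chapter:

```python
import math

def plate_count(t: float, h: float, a: float) -> float:
    """Theoretical plates from retention time t, peak height h and peak area a."""
    return 2 * math.pi * (t * h / a) ** 2

def resolution(t1: float, t2: float, n_mean: float) -> float:
    """Resolution of two peaks with retention times t1 < t2, given the mean plate count."""
    return math.sqrt(n_mean) * (t2 - t1) / (2 * (t1 + t2))

def symmetry(t: float, t_start: float, t_end: float) -> float:
    """Simple symmetry estimate from the peak start and end times."""
    return (t_end - t) / (t - t_start)

# Illustrative values only:
n1 = plate_count(t=3.1, h=25.0, a=0.55)
n2 = plate_count(t=3.8, h=18.0, a=0.48)
print(round(n1), round(n2))
print(round(resolution(3.1, 3.8, (n1 + n2) / 2), 2))
print(round(symmetry(3.1, 2.9, 3.4), 2))
```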
5.4.2 Treatment of the results from the ruggedness study

For each experimental design used for ruggedness testing, results are calculated separately for each level tested. The usual results estimated by these designs are a main effect for each factor, interaction effects for all combinations of factors, and a standard error that estimates the precision achieved throughout the study. The main effects are calculated by adding together all the values of a given parameter obtained at one level, subtracting the sum of the values obtained at the other level, and dividing by half the number of experiments. For instance, the calculation of the main effect on the plate count N for factor A, using the Plackett-Burman design shown in Table 5.9, is carried out as follows:

   ME(A) = [Σ(N at A = +) - Σ(N at A = -)] / 4    (4)
where the N values, N1 to N8, are the numbers of theoretical plates calculated for experiments 1 to 8, grouped according to the level of factor A in each experiment. The problem with calculating several main effects on parameters with different units is that the relevance of the numbers is difficult to interpret. The same problem is observed for the calculation of the standard errors, SE, which estimate the average error between duplicate experiments and hence give an estimate of the precision throughout the ruggedness test. The standard errors are calculated as follows:

   SE = √[ Σ di² / (2g) ]    (5)
where di is the difference between duplicate experiments and g is the number of degrees of freedom (and is equivalent in this case to the number of experiments). In order to standardise the units and numerical size of the main effects and standard errors for each of the measured parameters, all were recalculated as a percentage of the value obtained at the nominal method conditions (x), hence

   %ME = (ME / x) · 100    (6)

and

   %SE = (SE / x) · 100    (7)
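Pulling these calculations together, the following sketch computes %ME, %SE and the RSD for one measured parameter. It is illustrative only: the duplicate values, the sign column for factor A and the nominal result x are invented placeholders, and the standard error formula is the reconstructed form of equation (5):

```python
import math
import statistics

def main_effect(values, signs):
    """Sum of sign*value divided by half the number of experiments (equation 4)."""
    return sum(s * v for s, v in zip(signs, values)) / (len(values) / 2)

def standard_error(duplicate_pairs):
    """sqrt(sum(di**2) / (2g)) over duplicate differences (equation 5, as reconstructed)."""
    g = len(duplicate_pairs)
    return math.sqrt(sum((a - b) ** 2 for a, b in duplicate_pairs) / (2 * g))

# Illustrative duplicate plate counts for eight experiments (placeholder values):
pairs = [(2652, 2633), (2591, 2563), (2466, 2471), (2494, 2413),
         (2525, 2587), (2487, 2385), (2612, 2617), (2871, 2858)]
means = [(a + b) / 2 for a, b in pairs]

signs_a = [+1, -1, -1, +1, +1, +1, -1, -1]   # hypothetical +/- column for factor A
x = 2650.0                                   # result at nominal method conditions (illustrative)

me = main_effect(means, signs_a)
se = standard_error(pairs)
rsd = 100 * statistics.stdev(means) / statistics.mean(means)
print("%ME", round(100 * me / x, 2), "%SE", round(100 * se / x, 2), "RSD", round(rsd, 2))
```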
This expression of the results allows the main effects and standard errors to be interpreted immediately as a percentage deviation from the values normally observed for that method. A relative standard deviation (RSD) is also calculated for each set of results, to show the overall variation measured throughout the study and as a reassurance that this level of variation is reflected in the observed main effects.

5.4.3 Confounding effects in fractional factorial designs

One of the major limitations in the use of fractionated factorial designs is that they assume higher order effects to be negligible. For half factorial designs all effects above second order are considered negligible, and for saturated designs all effects above first order are assumed negligible. This can have a confounding effect, where these higher order effects overwrite the measured main effects. This can be a problem for some types of applications of these designs, such as process control [24,25]. However, for a ruggedness test confounding can prove advantageous, because the overall aim of a ruggedness test is to establish that the analytical method is
rugged not only to first order effects but to a range of effects likely to be observed throughout the lifetime of the method, and this will include higher order effects. It is thus an advantage if the analyst selects a factorial design with a known confounding pattern [26]. Some Plackett-Burman designs have the feature of a known confounding pattern for second order effects [26]. An awareness of this pattern allows the analyst to investigate suspect results for confounding interaction effects and even sometimes identify the culprit effect.
For example, using the design in Table 5.9, the interaction effect between factors A and B is estimated from the contrast

   (+N1 - N2 - N3 + N4 + N5 + N6 - N7 - N8) / 4

Examination of the main effect of factor D reveals that it is identical to this interaction effect. This can be done for all two factor interaction effects to reveal the confounding pattern shown in Table 5.13.
TABLE 5.13
THE CONFOUNDING PATTERN FOR THE PLACKETT-BURMAN DESIGN SHOWN IN TABLE 5.9

Main effect | Confounding effects
A | BD, CG, EF
B | AD, CE, FG
C | AG, BE, DF
D | AB, CF, EG
E | AF, BC, DG
F | AE, BG, CD
G | AC, BF, DE
Table 5.13 shows the confounding pattern for the Plackett-Burman design shown in Table 5.9. The main effect for this design, calculated for the plate count, is shown in equation (4). The interaction effect between factors A and B is the difference between the main effect of A when B is at the method level and that when B is at its extreme level. An awareness of these confounding effects can help both in the design of a ruggedness study and in the interpretation of results. For instance, factors that are considered likely to interact can be placed such that their interaction overwrites the dummy effect, thus preventing them from altering other main effects and allowing a possible identification of the effect. If a factor is observed to have an effect that cannot be attributed to that factor, e.g. an
effect on retention times caused by altering the sample preparation for a HPLC method, it can be investigated to see whether one of the confounding interaction effects could account for these observations.
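The confounding pattern itself can be verified mechanically: a two-factor interaction is aliased with a main effect exactly when the elementwise product of the two factor columns equals that main-effect column. The sketch below (reusing the assumed generator construction from the earlier design sketch) recovers the pattern of Table 5.13:

```python
from itertools import combinations, product

# Rebuild the saturated seven-factor design from the assumed generators
# D = AB, E = BC, F = ABC, G = AC (see the earlier sketch).
labels = "ABCDEFG"
design = [(a, b, c, a * b, b * c, a * b * c, a * c)
          for a, b, c in product((1, -1), repeat=3)]
columns = {f: [row[i] for row in design] for i, f in enumerate(labels)}

# A two-factor interaction is confounded with a main effect when its product
# column is identical to the main-effect column.
for f in labels:
    aliases = [x + y for x, y in combinations(labels, 2)
               if f not in (x, y)
               and all(cx * cy == cf
                       for cx, cy, cf in zip(columns[x], columns[y], columns[f]))]
    print(f, "confounded with", ", ".join(aliases))
```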
5.5 EXAMPLE CASE STUDIES

This section considers two case studies in detail; both test the ruggedness of HPLC methods. In practice HPLC is the technique most likely to require a ruggedness test, owing both to the complexity of the technique and to its widespread usage. The first example considers the analysis of aspirin together with its main degradation product, salicylic acid, and the second considers the assay of salbutamol together with one of its major degradation products. Both methods are stability indicating and are therefore likely to come under intense regulatory scrutiny.

5.5.1 The application of a ruggedness test to the assay of aspirin and its major degradation product, salicylic acid

The first case study we will consider is the assay of aspirin together with its major degradation product, salicylic acid [19]. This application was selected because the HPLC assay of aspirin is well covered in the literature, and factors to test could be selected from the variety of HPLC conditions used in these published methods. This test was performed using a reflected saturated factorial design requiring a total of 15 experiments.
Chromatographic Conditions

Calibration solutions were prepared by dissolving 60 mg of aspirin and 1.8 mg of salicylic acid in 50 ml of a solvent mix containing acetonitrile, methanol and orthophosphoric acid in the proportions 92:8:0.5 by volume. One of the potential problems with this assay is that the samples and standards could degrade further during the assay, thus altering the original ratio of aspirin to salicylic acid. The above solvent mix has been shown to reduce the rate of degradation to an insignificant level [28]. Sample solutions were prepared by placing one tablet and 9 mg of salicylic acid in a 250 ml volumetric flask, adding 150 ml of the above solvent mixture and ultrasonicating for 15 minutes. The solution was then made up to volume with the solvent and filtered.
Aspirin and salicylic acid were eluted with a mobile phase of aqueous acetonitrile, acidified to pH 2.6 using orthophosphoric acid, at a flow rate of 1.5 ml/min. A UV/Vis detector was employed at a wavelength of 295 nm, with 0.1 aufs sensitivity and a response time of 0.5 s. A 25 cm x 4 mm Lichrocart C18 cartridge column with a guard column of the same material was used, and the column temperature was maintained at 40 °C.
Planning the Ruggedness Test

Before the ruggedness test could be contemplated it was essential to fully validate the method with respect to the other method characteristics. Hence the following tests were carried out: specificity, spectral purity of the chromatographic peaks, linearity of detector response, and repeatability over 100 injections. Satisfactory results were achieved for all these experiments before we continued to the ruggedness test. Table 5.14 shows the factors that were selected, together with the levels tested. A range of factors was chosen such that each part of the method was examined, from the mobile phase to the detection method. The acid type was selected because this was a factor that differed between publications; often a method specified only the pH rather than the acid to use, hence it seemed worthwhile to examine the effect of changing this factor.
TABLE 5.14
FACTORS TESTED

Factor | Method value | Minimum value | Maximum value
Acetonitrile (%) | 25 | 23 | 27
Flow rate (ml/min) | 1.5 | 1.4 | 1.6
Wavelength (nm) | 295 | 290 | 300
Temperature (°C) | 40 | 38 | 45
Acid type | 1. o-Phosphoric acid | 2. Acetic acid | 3. Perchloric acid
Response time (s) | 0.5 | 0.12 | 2
The experimental scheme for a three level reflected saturated fractional design for seven factors is shown in Table 5.15 (note that one factor was retained as a dummy factor to be used as an additional error check). The experimental order of the scheme was sorted on acid type, as this required long equilibration times. This ordering loses some of the features of the initial design, but it is a compromise that can be justified by the fact that
previous method validation tests had shown that the equipment remains stable over the time required for a test, and hence the experimental order is not so critical.
TABLE 5.15
EXPERIMENTAL DESIGN

Exp. No | Acetonitrile (%) | Acid type | Flow rate (ml/min) | Temperature (°C) | Wavelength (nm) | Response time (s)
8  | 25 | 1 | 1.5 | 40 | 295 | 0.12
1  | 27 | 1 | 1.5 | 45 | 295 | 2
5  | 27 | 1 | 1.5 | 45 | 300 | 0.12
7  | 25 | 1 | 1.6 | 40 | 300 | 2
9  | 23 | 1 | 1.6 | 38 | 295 | 0.5
13 | 23 | 1 | 1.5 | 38 | 290 | 0.12
15 | 25 | 1 | 1.4 | 40 | 290 | 0.5
2  | 27 | 3 | 1.4 | 40 | 300 | 0.12
3  | 27 | 3 | 1.5 | 40 | 295 | 2
4  | 25 | 3 | 1.6 | 45 | 295 | 0.12
6  | 25 | 3 | 1.5 | 45 | 300 | 2
10 | 23 | 2 | 1.5 | 40 | 290 | 0.12
11 | 23 | 2 | 1.4 | 40 | 295 | 0.5
12 | 25 | 2 | 1.4 | 38 | 295 | 0.12
14 | 25 | 2 | 1.5 | 38 | 290 | 0.5
A second change was made to the experimental order in that the initial experimental conditions were run first, not last as recommended in the Plackett-Burman scheme. This is again a practical compromise, as it allows the analyst to check the instrumental set-up with well-established conditions.

Results of the Ruggedness Test

(The sequence of injections from experiment 8, the original method conditions, is shown in Figure 2 of reference [19].) The calculation of peak symmetry for this design was abandoned, as our calculation method could not adequately define the peak start and peak end times.
% Acetonitrile
The effect of altering the acetonitrile concentration was observed as a reduction in retention times as the solvent strength increased; peak height increased correspondingly. The resolution between the peaks was slightly reduced with increased solvent strength, but this was insufficient to cause peak overlap and hence deterioration of the quantitative results.

Acid Type
Many significant effects were observed on changing the acid used to control the pH of the mobile phase. Both chromatographic (qualitative) effects, e.g. retention times and resolution, and detection (quantitative) effects, e.g. peak areas and heights, were observed. The changes in peak areas could be attributed to the spectra of both aspirin and salicylic acid undergoing bathochromic shifts when observed in the different acid solutions. Figure 3 of reference [19] shows the observed spectral changes for aspirin in the three different mobile phases.

Temperature
An increase in the temperature was found to reduce the retention times and the plate count. This had the net effect of decreasing the resolution and increasing the peak height. The temperature effect is likely to depend on the thermodynamics of the separation equilibria.

Flow rate
Increasing the flow rate reduced the retention times and the residence time in the flow cell, and hence reduced the peak area. The peak height, however, did not change.
Wavelength The effects of changing wavelengths were the most dramatic of all the observed effects. The peak areas changed by up to 100% from the initial method conditions. The concentration results did not alter significantly as the method was re-calibrated at each wavelength value. However, it should be noted that the limit of detection for the method could be altered significantly. The effects observed on wavelength changes are shown in Table 5.16.
TABLE 5.16
EFFECTS OBSERVED FOR CHANGES IN WAVELENGTH

Chromatographic parameter | 295-290 nm, Aspirin | 295-290 nm, Salicylic acid | 295-300 nm, Aspirin | 295-300 nm, Salicylic acid
Retention time | -2.078 | -2.253 | -0.355 | 0.333
Plates | 0.328 | 0.333 | 0.984 | -0.010
Resolution | -0.021 | | 0.468 |
Peak area | -101.637 | 9.102 | 35.194 | -5.739
Peak height | -82.869 | 8.888 | -39.312 | -6.141
Concentration (area) | -1.008 | -1.391 | 0.394 | 0.637
Concentration (height) | -0.507 | -1.412 | 0.333 | 0.533
Response time
The results for changing the response time showed an increase in peak height for aspirin when the response time was reduced. This suggests that the method value of 0.5 s is slightly too slow to record the aspirin peak without distortion. However, the distortion must be reproducible, as this effect did not significantly alter the concentration results.

Summary of Results
Table 5.17 shows a complete report of the effect of all factors on the plate count. The overall relative standard deviation is displayed to give an estimate of the overall size of the variations observed when all factors change together. The standard error reflects the statistical relevance of the main effects (i.e. a main effect smaller than the standard error is not statistically relevant). In this instance a main effect must be larger than about 0.7% to be considered a real effect and not just a reflection of the overall precision of the method. The results for each factor are given in Table 5.18. The largest effect is around 3% and is due to the change in the acid type used to control the pH of the mobile phase. None of the observed effects was likely to cause a lack of method ruggedness, as no effect caused a critical reduction in the plate count.
TABLE 5.17
SUMMARY OF RESULTS FOR THE RUGGEDNESS OF THE METHOD ON THE PLATE COUNT CALCULATED FOR ASPIRIN

Experiment No. | Replicate 1 | Replicate 2 | Mean
1 | 2651.9 | 2594.4 | 2623.2
2 | 2591.2 | 2563.3 | 2577.2
3 | 2465.5 | 2470.6 | 2468.1
4 | 2413.0 | 2494.0 | 2453.5
5 | 2586.9 | 2524.7 | 2555.8
6 | 2385.2 | 2487.4 | 2436.3
7 | 2616.7 | 2611.6 | 2614.2
8 | 2857.5 | 2871.2 | 2864.3
% Standard Error = 0.692   RSD = 5.362

8 | 2857.5 | 2871.2 | 2864.3
9 | 2778.8 | 2839.6 | 2809.2
10 | 2646.3 | 2583.1 | 2614.7
11 | 2611.5 | 2644.8 | 2628.2
12 | 2657.3 | 2646.5 | 2651.9
13 | 2872.2 | 2786.3 | 2829.3
14 | 2614.0 | 2629.0 | 2621.5
15 | 2839.2 | 2786.7 | 2813.0
% Standard Error = 0.609   RSD = 3.980
Documentation of results
Apart from documenting the list of observed main effects, there are several other relevant pieces of information that can be gleaned from a ruggedness study. It is important to determine whether factors are likely to be time dependent or drifting factors, as the main effects of such factors can be critical to the calibration method used. For the aspirin example two factors could drift: % acetonitrile (which can evaporate from a pre-mixed solvent) and temperature (which can either drift or change dramatically, for instance if the laboratory heating is switched off overnight). Both of these factors affected the peak heights much more significantly than the peak areas, thus suggesting that calibration using peak areas will provide more rugged
results. The results can also be used to respecify the method with the optimum values found for each factor tested; these can be presented together with the tolerated ranges (i.e. those tested in the study). Table 5.19 gives the list of optimum values for this method based on increased sensitivity, maximum N and optimum speed of analysis at a given resolution. Finally, the results can be used to specify system suitability criteria. These are a list of value ranges for qualitative features of a chromatogram that need to be met before a given instrument can be used for the method.
TABLE 5.18
THE MAIN EFFECTS ON THE PLATE COUNT CALCULATED FROM TABLE 5.17 FOR ASPIRIN

Factor and levels | % Main effect
% Acetonitrile (25-27%) | 0.628
% Acetonitrile (25-23%) | 0.303
Acid type (phosphoric acid to perchloric acid) | 3.152
Acid type (phosphoric acid to acetic acid) | 3.489
Flow rate (1.5-1.6 ml/min) | 1.787
Flow rate (1.5-1.4 ml/min) | -0.055
Temperature (40-45 °C) | 1.985
Temperature (40-38 °C) | 0.036
Wavelength (295-300 nm) | 0.984
Wavelength (295-290 nm) | 0.328
Response time (0.5-2 s) | 1.349
Response time (0.5-0.12 s) | 0.386
Many methods to be used for regulatory analysis require these parameters to be provided before a method is acceptable. The ruggedness test provides an ideal opportunity to provide these values, because during the test large changes occur in qualitative parameters such as resolution, and for a successful ruggedness test the quantitative results are assured throughout these ranges. Table 5.20 shows the system suitability parameters derived from the ruggedness test on the aspirin method. The peak area and height responses are specific to the data handling package used [19].
TABLE 5.19
OPTIMISED METHOD CONDITIONS USING RESULTS FROM THE RUGGEDNESS TEST

Method condition | Original method value | Optimum value for salicylic acid | Optimum value for aspirin | Optimum compromise value
Acetonitrile (%) | 25 | 25 | 27 | 25
Flow rate (ml/min) | 1.5 | 1.4 | 1.4 | 1.4
Wavelength (nm) | 295 | 300 | 295 | 290
Temperature (°C) | 40 | 45 | 45 | 45
Acid type | phosphoric acid | phosphoric acid | perchloric acid | perchloric acid
Response time (s) | 0.5 | 0.12 | 0.12 | 0.12
TABLE 5.20
SYSTEM SUITABILITY PARAMETERS

Qualitative parameter | Aspirin | Salicylic acid
Number of theoretical plates | 2450-2860 | 2900-3700
Retention times (seconds) | 200-320 | 240-320
Resolution | 2.8-6.0 |
Peak area response | 2.87-3.86 | 14.6-37.3
Peak height response | 1.88-3.68 | 0.18-1.98
5.5.2 The application of a ruggedness test to the assay of salbutamol and its major degradation product, AH4045

This study was carried out with the intention not only of determining the ruggedness of the method but also of investigating the effect of confounding in the Plackett-Burman designs [26].
Chromatographic Conditions

Calibration solutions were prepared by dissolving 20 mg of salbutamol and 0.2 mg of AH4045 in 100 ml of water. Tablets were dissolved in 20 ml of water by mechanical shaking for 20 minutes. All reference materials for salbutamol, its impurity AH4045 and salbutamol tablets were obtained from Glaxo Group Research (Ware, UK). All solvents were HPLC grade. Sample solutions were filtered through a GF/F filter paper before 5 µl aliquots were injected onto a 20 cm x 4.6 mm i.d., 5 µm nitrile column.

Planning the Ruggedness Test

As for the aspirin example, before the ruggedness test could be contemplated it was essential to fully validate the method with respect to the other method characteristics. Hence the following tests were carried out: specificity, spectral purity of the chromatographic peaks, linearity of detector response, and repeatability over 100 injections. Satisfactory results were achieved for all these experiments before we continued to the ruggedness test. Table 5.21 shows the factors that were selected, together with the levels tested. A range of factors was chosen such that each part of the method was examined. A common problem in HPLC methodology is the specification of columns. The column performance can be crucial to the separation and therefore must be adequately specified.
TABLE 5.21
THE FACTORS SELECTED

Factor | Method value | Minimum value | Maximum value
Column manufacturer | Spherisorb | Techsphere | Hypersil
Flow rate (ml/min) | 2 | 1.5 | 2.5
Wavelength (nm) | 276 | 270 | 280
Temperature (°C) | 45 | 40 | 60
pH | 4.5 | 3 | 5
Propan-2-ol (%) | 5 | 3 | 10
Dummy factor | 1 | 2 | 3
TABLE 5.22
COLUMN SPECIFICATIONS

Commercial name | Pore size (nm) | Surface area (m²/g) | Carbon loading (%)
Spherisorb ODS | 10 | 200 | 9
Hypersil ODS | 8 | 220 | 7
Techsphere C18 | 8 | 200 | 10
Variation in column packing material from different manufacturers can be significant, yet it is not always feasible to specify a given manufacturer in a method, due to problems such as availability. Therefore a variety of column manufacturers were selected as a factor in the ruggedness test. These columns and their specifications are shown in Table 5.22; all columns had a specified particle size of 5 µm and a spherical particle shape. This ruggedness test provided a greater challenge to the method than did the test conditions of the aspirin studies, most factors being tested to larger extreme values. The experimental order of this study was also changed from the ideal randomised and blocked design. This was justified for reasons of automation, where column changes needed to be minimised. As for the aspirin study, the validity of this compromise depends on the fact that the repeatability of the method, over a time span such as that required for the ruggedness test, had previously been established.
Results of the Ruggedness Test
The most dramatic main effects observed were due to column changes, the Hypersil column causing a 53% increase in the resolution as compared to the Spherisorb column. These effects are unexpectedly high given the close correlation of the specifications for these columns. As one of the purposes of this study was to investigate confounding interaction effects, a dummy variable was used to estimate this effect. One dummy variable had the effect of reducing the resolution by 6.3%. The possible confounding effects were identified from the confounding pattern illustrated in Table 5.23. The most likely confounding effect for the dummy is the interaction between the Techsphere column and the flow rate. When the 2 ml/min flow rate was used for this column a significant reduction of resolution was observed, and this could explain the interaction effect. This would mean that when both the flow rate and the column were changed simultaneously the observed effect was much larger than either
individual main effect. The Techsphere column caused significant peak tailing that may not have been observed in the individual main effects but could be revealed when the flow rate was increased and the peaks began to overlap. The probability that this is a real interaction effect is increased when the results for the concentration main effects are examined. These results show unexpectedly high values for the same dummy variable, again suggesting that the peaks are beginning to overlap. Another important result was that the method was not rugged to the use of a Techsphere column, due to an observed main effect of 2% on the concentration results.
TABLE 5.23
CONFOUNDING PATTERN

Main Effect    Confounding First-Order Interaction Effects
Column         pH x Temperature, Flow rate x Dummy, Wavelength x Propan-2-ol
pH             Column x Temperature, Flow rate x Wavelength, Propan-2-ol x Dummy
Flow rate      Column x Dummy, pH x Wavelength, Temperature x Propan-2-ol
Temperature    Column x pH, Flow rate x Propan-2-ol, Wavelength x Dummy
Wavelength     Column x Propan-2-ol, pH x Flow rate, Temperature x Dummy
Propan-2-ol    Column x Wavelength, pH x Dummy, Flow rate x Temperature
Dummy          Column x Flow rate, pH x Propan-2-ol, Temperature x Wavelength
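The confounding pattern of a saturated Plackett-Burman design can also be derived computationally. The sketch below is an illustration, not the software used in this study: it builds the standard eight-run design from its cyclic generator and reports, for each column, the two-factor interactions whose contrast coincides with it. Because the assignment of factors to columns is an assumption here, the printed pattern need not match Table 5.23 line for line.

```python
# Minimal sketch: alias (confounding) pattern of a saturated 8-run
# Plackett-Burman design, derived from products of design columns.
import itertools

generator = [+1, +1, +1, -1, +1, -1, -1]          # standard N=8 PB generator
design = [[generator[(j - i) % 7] for j in range(7)] for i in range(7)]
design.append([-1] * 7)                            # final row of minus signs

# hypothetical factor-to-column assignment, for illustration only
factors = ["Column", "pH", "Flow", "Temp", "Wave", "Propanol", "Dummy"]

def column(design, k):
    return [row[k] for row in design]

aliases = {f: [] for f in factors}
for i, j in itertools.combinations(range(7), 2):
    product = [a * b for a, b in zip(column(design, i), column(design, j))]
    for k in range(7):
        col = column(design, k)
        if product == col or product == [-x for x in col]:
            aliases[factors[k]].append(f"{factors[i]}*{factors[j]}")

for factor, confounded in aliases.items():
    print(f"{factor}: confounded with {confounded}")
```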
The flow rate had a large effect on peak areas, with a 25% decrease causing a 17% increase in observed peak areas. This indicates that the flow-rate repeatability of an instrument must be better than 1.5%, and ideally around 1%, to achieve acceptable results from this assay procedure. The results also show that solvent evaporation and temperature changes need to be minimised to reduce the effects of drifting method conditions. The system suitability criteria for this method had to be derived excluding the results achieved for the non-rugged Techsphere column and are shown in Table 5.24. A fuller description of the results for this study can be found in reference [26].
TABLE 5.24
SYSTEM SUITABILITY CRITERIA

Chromatographic Parameter    Salbutamol    AH4045
Retention times (seconds)    98-250        120-355
Number of Plates             1900-3700     2700-4800
Resolution                   2.7-7.5
5.6 CONCLUSIONS
This chapter has presented a practical guide to the design of a ruggedness test for analytical methods, illustrated by two case studies. The time investment required for a ruggedness study is substantial; however, the benefits from the testing are many, particularly if the method is designed for long-term use [29]. The advantages can be summarised as follows:
- It provides an early warning of potential problems with the method.
- The combination of a flexible set of system suitability factors plus a range of possible valid method changes allows the analyst to achieve acceptable method performance within a valid set of guidelines.
- The effect of drifting factors can be estimated and minimised.
- The results of a ruggedness study can help in the definition of a suitable calibration technique.
Most of these benefits are of significant value when a method is transferred for use outside the laboratory or environment within which it was developed.
REFERENCES
[1] M.J. Cardone, J. Assoc. Off. Anal. Chem., 66 (1983) 1257.
[2] E.L. Inman, J.K. Krischman, P.J. Jimenez, G.D. Winkel, M.L. Persinger and B.S. Rutherford, General method validation guidelines for pharmaceutical samples, J. Chromatogr. Sci., 25 (1987) 252-256.
[3] A. Vanderweilen and E.A. Hardwidge, Pharmaceutical Technology, 66 (1982) March.
[4] E. Debesis, J. Boehlert, T. Givand and J. Sheridan, Submitting HPLC methods to the compendia and regulatory agencies, Pharmaceutical Technology, 6(9) (1982) 120-137.
[5] M. Mulholland, J.A. van Leeuwen and B. Vandeginste, An Expert System to Design an Intelligent Spreadsheet for the Determination of the Precision of an HPLC Method, Analytica Chimica Acta, 223(1) (1989) 183-192.
[6] M. Thompson, Variation of precision with concentration in an analytical system, The Analyst, 113 (1988) 1579-1587.
[7] R. McGill, J.W. Tukey and W.A. Larson, The American Statistician, 32 (1978) 12-16.
[8] W.J. Dixon, Processing data for outliers, Biometrics, 9 (1953) 4.
[9] D.L. Massart, B.G.M. Vandeginste, S.N. Deming, Y. Michotte and L. Kaufman, Chemometrics: A Textbook, Elsevier, Amsterdam, 1988, 93-105.
[10] E.L. Inman and E.C. Rickard, Chromatographic detection limits in pharmaceutical method development, Journal of Chromatography, 447 (1988) 1-12.
[11] W.J. Youden and E.H. Steiner, Statistical Manual of the AOAC. Statistical Techniques for Collaborative Tests. Planning and Analysis of Results of Collaborative Tests, AOAC, Washington, DC, 1975, 33-121.
[12] P.J. Schoenmakers, Optimization of Chromatographic Selectivity, Elsevier, Amsterdam, 1986.
[13] P.J. Schoenmakers and M. Mulholland, An Overview of Contemporary Method Development in Liquid Chromatography, Chromatographia, 25(8) (1988) 737-748.
[14] J.A. van Leeuwen, M. Mulholland, B.G.M. Vandeginste and G. Kateman, An Expert System for the Selection of Factors for a Ruggedness Test of HPLC Methods, Analytica Chimica Acta, 228 (1990) 145-153.
[15] W.J. Youden, Mater. Res. Stand., November (1961) 862-867.
[16] G. Wernimont, ASTM Standardization News, March (1977) 13-16.
[17] C.D. Hendrix, Chem. Technol., March (1979) 167-174.
[18] B. Fischer, Anal. Proc., 21 (1984) 443-448.
[19] M. Mulholland and J. Waterhouse, Development and Evaluation of an Automated Procedure for the Ruggedness Testing of Chromatographic Conditions in HPLC, Journal of Chromatography, 395 (1987) 539-551.
[20] G.E.P. Box, W.G. Hunter and J.S. Hunter, Statistics for Experimenters: An Introduction to Design, Data Analysis and Model Building, Wiley, New York, 1978, 291-453.
[21] R.L. Plackett and J.P. Burman, The design of optimum multifactorial experiments, Biometrika, 33 (1946) 305-325.
[22] S.N. Deming and S.L. Morgan, Experimental Design: A Chemometric Approach (Data Handling in Science and Technology, Vol. 3), Elsevier, Amsterdam, 1987.
[23] G.E.P. Box and D.W. Behnken, Simplex-sum Designs: a class of second order rotatable designs derivable from those of the first order, Ann. Math. Statist., 31 (1960) 838-864.
[24] J. Hudec, M. Polievka and J. Balak, Esterification of glutaric acid with methanol. I. Application of the Plackett-Burman scheme, Petrochemia, 22(1) (1982) 12-19.
[25] B.S. Aswathan and V.J. Victor, Plackett-Burman design for effective screening of process variables, Indian Journal of Technology, 12 (1974) 367-369.
[26] M. Mulholland and J. Waterhouse, Investigation of the Limitations of Saturated Fractional Factorial Designs with Confounding Effects for a HPLC Ruggedness Test, Chromatographia, 25(9) (1988) 769-774.
[27] K. Jones, International Laboratory, November (1986) 32-45.
[28] J. Fogel, P. Epstein and P. Chen, Simultaneous high-performance liquid chromatography assay of acetylsalicylic acid and salicylic acid in film-coated aspirin tablets, Journal of Chromatography, 317 (1984) 507.
[29] L. Buydens, J.A. van Leeuwen, M. Mulholland and B.G.M. Vandeginste, An Expert System for the Validation of HPLC Methods, Trends in Analytical Chemistry, 9 (1990) 58-62.
Chapter 6
STABILIZING A TLC SEPARATION ILLUSTRATED BY A MIXTURE OF SEVERAL STREET DRUGS
C.A.A. DUINEVELD¹, P. KOOPMANS² AND P.M.J. COENEGRACHT
Research Group Chemometrics, University Centre for Pharmacy, University of Groningen, A. Deusinglaan 1, 9713 AV Groningen, The Netherlands
6.1 INTRODUCTION
Thin Layer Chromatography is a valuable analytical technique. It is cheap, fast and simple. Optimization of TLC is therefore of the highest importance and the subject of many studies. A review of optimization methods is given by Nurok [1]. The aim of such optimizations is to find a mobile phase composition at which a good separation of all solutes is possible. However, the mobile phase is not the only factor that influences the retention: the temperature and the relative humidity do so as well. Temperature and relative humidity cause problems because they are difficult to control unless special equipment is used. The variation depends on the weather and the quality of the climate control, but they do vary. Their effect on the retention is different for different solutes. Therefore the resolution can change.
6.2 THEORY
6.2.1 Thin Layer Chromatography
Thin Layer Chromatography (TLC) is methodologically simple. The solutes travel different distances with a mixture of solvents, the mobile phase, along
¹ Present address: Quest International, P.O. Box 2, 1400 CA Bussum, The Netherlands.
² Present address: Academic Hospital Groningen, P.O. Box 30001, 9700 Groningen.
the surface of a thin layer of adsorbent, the stationary phase. Various processes cause the solutes to adsorb on the stationary phase. Since the solutes have different affinities towards the solvent and the stationary phase they show different retention. TLC differs in many respects from High Performance Liquid Chromatography (HPLC). The first difference is that the solutes are not separated over a fixed length (the separation column) but during a fixed time (the development time). Therefore, the chromatographic behaviour is not characterized by the time needed to traverse the column, but by the distance travelled within a certain time span. A second difference is that the composition of the mobile phase may vary over the length of the plate. More volatile components may vaporize, causing a different composition at different places on the TLC plate. Since the retention is measured as distance instead of time, the R_f value, which expresses the position of a substance on a developed plate, should be calculated differently than retention measures are calculated in HPLC. First we introduce the distances which are of importance. These are z_f, the distance between the solvent source and the solvent front, z_0, the distance between the solvent source and the place where the solutes start, and z_s, the distance between the start and final places of the solutes. From these distances it follows that:

R_f = z_s / (z_f - z_0)    (1)
This value is related to R_f', the thermodynamic value. This latter value describes the equilibrium between the solute in the solvent and adsorption on the stationary phase. The relation between R_f and R_f' is R_f' = ξ·R_f, where ξ is a constant which depends on several factors. When the stationary phase is preloaded with solvent molecules from the gas phase ξ is about 1.1. Based on R_f' several other characteristics can be calculated. These are the capacity factor:

k' = (1 - R_f') / R_f'    (2)
and the R_m value:

R_m = log((1 - R_f) / R_f)    (3)
where ξ is neglected. The R_m value is supposed to have linear relationships with basic TLC parameters [2], such as the transfer energy between the mobile and the stationary phase. A chromatographic property of interest is the separation between spots (solutes). The simplest definition is the resolution R_s = (z_s,1 - z_s,2)/(2(σ_1 + σ_2)), with σ_i the bandwidth of the developed spot. Assuming σ_1 = σ_2 and using z_s/σ = √(N·R_f) [2] (N is the plate number) the formula becomes:

R_s = (√N / 4) · (R_f,1 - R_f,2) / √R̄_f    (4)
where R̄_f is the average of the R_f values of the two solutes. In this paper a model is made describing the relation between a response variable (e.g. R_m, k', z_s) and the independent variables (mobile phase composition, temperature and relative humidity). When a separation is optimized it is important to select the best performing response variables; best performing, in this case, meaning giving the most trustworthy predictions. For reasons stated below R_m was used as response variable. The R_m can be used to calculate R_f, k' and R_s values (equation (4)). The value taken for N in this latter calculation was 3000.
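The quantities introduced above can be summarised in a short sketch; the functions follow equations (1)-(4) with ξ neglected and the plate number N = 3000 mentioned in the text, and the numbers in the last line are illustrative only.

```python
# Minimal sketch of the basic TLC quantities (equations (1)-(4)); the
# numerical arguments in the final line are invented for illustration.
import math

def rf(z_s, z_f, z_0):
    """R_f value from solute distance z_s, front distance z_f and start distance z_0."""
    return z_s / (z_f - z_0)

def k_prime(r_f):
    """Capacity factor, equation (2), with xi neglected."""
    return (1.0 - r_f) / r_f

def r_m(r_f):
    """R_m value, equation (3)."""
    return math.log10((1.0 - r_f) / r_f)

def resolution(r_f1, r_f2, n_plates=3000):
    """Resolution between two solutes, equation (4), with N = 3000 as in the text."""
    r_mean = 0.5 * (r_f1 + r_f2)
    return math.sqrt(n_plates) * abs(r_f1 - r_f2) / (4.0 * math.sqrt(r_mean))

print(rf(3.2, 9.0, 1.0), k_prime(0.4), r_m(0.4), resolution(0.45, 0.40))
```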
6.2.2 Separation problem
The chromatographic problem which we investigated is difficult. The solutes are seven nitrogen-containing alkaloids. These alkaloids were selected because they may accompany heroin in illegally sold street drugs. Coenegracht et al. [3] have introduced a four-solvent system to compose mobile phases for the separation of the parent alkaloids in different medicinal dry plant materials, like Cinchona bark and Opium. Through the use of mixture designs and response surface modeling an optimal mobile phase was found for each type of plant material. These new mobile phases resulted in equally good or better separations than those obtained by the procedures of the Pharmacopeias. Although separations were as predicted, the accuracy of the quantitative predictions needed to be improved.
Therefore the influence of temperature and humidity on the robustness will be investigated. The maximum number of alkaloids to be separated in the paper of Coenegracht et al. [3] was six, in opium, and the separation of seven alkaloids in this paper is a severe test of the four-solvent system's suitability for the separation of alkaloids. The alkaloids are: strychnine, quinine, narceine, heroin, noscapine, caffeine and papaverine.
6.2.3 Selection of mobile phases
There are several systems which can be used to select the solvents of the mobile phase. The number of solvents and the particular solvents which are selected depend not only on the chromatographic problem but also on the method which will be used to optimize the system. With response surface methodology it is appropriate to use a minimum number of solvents. For reasons stated below this minimum number of solvents was four. The second question, which solvents to select, is more difficult to answer when a small number of solvents is used, because the consequences of a wrong selection are large. Several approaches are possible to select the solvents. The simplest method is comparison with common solvent systems for the solutes under investigation. A more general approach is to use the selectivity triangle of Snyder [4] in the selection of the solvents. It was decided to use four solvents because:
- such a solvent system should allow sufficient variation of solvent strength and of selectivity;
- the factor space of a four-component mixture is a tetrahedron, which is the highest dimensional simplex which can be visualized;
- in combination with variation of temperature and relative humidity, more solvents would result in an impractically large number of experiments [5];
- Coenegracht et al. [3] proved that it is possible to obtain suitable separations for alkaloids with the use of only four solvents.
In order to be successful with only four solvents it is necessary to use solvents which all have different properties. The most important properties are solvent strength and solvent selectivity. The solvent strength is the overall elution power. The solvent selectivity is the way in which different components are eluted over different distances. Two solvents with the same solvent strength and different selectivity give the same average elution distance on the plate, but the solutes are eluted in a different pattern. By choosing solvents with extremely different properties, intermediate
properties can be obtained by using a mixture of the solvents. There are several methods to classify the properties of solvents. The selectivity triangle of Snyder [4] provides good guidelines, in addition to which it is possible to examine the selectivity properties. The selectivity properties measure the relative importance of the interaction parameters (proton acceptor (x_e), proton donor (x_d) and dipole interactor (x_n)) [6] and therefore sum to one. It is also possible to examine the solvent strength on silica (S_Si) and the solubility parameters (proton donor (δ_a), proton acceptor (δ_b), orientation interaction (δ_o) and induction parameter (δ_in)). Another property of interest is the flow velocity (θ). In Table 6.1 the properties of some common solvents for TLC separations of alkaloids are given. With these solvents five of the eight selectivity groups of Snyder are represented. The problem is now reduced to the selection of four of these solvents. In order to do this both the selectivity group concept and the selectivity properties will be used. Additionally the solubility, the flow velocity and the capability to prevent tailing will be used. Chemisorption of basic solutes can occur since the silanol groups have acidic properties. This causes tailing from the point of application to the final spot and is therefore undesirable. Tailing can be prevented by addition of a basic solvent or by impregnation of the silica layer with a basic solution. Since it was preferred not to pretreat the silica, diethylamine (DEA) was selected as the basic solvent in the mobile phase.
TABLE 6.1
PROPERTIES OF SELECTED SOLVENTS. DATA FROM REFERENCES [2,3,6]

Solvent         x_e    x_d    x_n    Group   S_Si      δ_a   δ_b   δ_o   δ_in   θ
Acetone         0.35   0.23   0.42   VIa     0.53      -     3.0   5.1   1.5    118
Butanol         0.59   0.19   0.25   II      0.54 (b)  a     a     a     a      82
Chloroform      0.25   0.41   0.33   VIII    0.31      6.5   0.5   3.0   0.5    75
Diethylamine    a      a      a      III     0.55 (b)  a     a     a     a      69
Ethyl acetate   0.34   0.23   0.43   VIa     0.48      -     2.7   4.0   1.0    85
Methanol        0.48   0.22   0.31   II      0.70      8.3   8.3   4.9   0.8    65
Toluene         0.25   0.28   0.47   VII     0.22      -     0.6   -     -      77

a No data available. (b) Calculated with S_Si = 0.77 S_alumina.
It was preferred to have one proton donor among the solvents. Chloroform (CHCl3) is the only solvent in Table 6.1 which is a proton donor. This is indicated by both the selectivity parameters and the solubility parameters. Of the remaining solvents toluene had too low a solvent strength. There remained a choice between four solvents. Two of these are in group II (butanol and methanol) and two in group VI (ethyl acetate and acetone) of the Snyder triangle. In general it is preferable to use solvents with about equal flow velocities. When these parameters differ too much this can result in demixing of the mobile phase. Ethyl acetate (EtAc) and methanol (MeOH) were chosen because their flow velocity constants differ less from those of DEA and CHCl3 than do those of acetone and butanol. To summarize, the four selected solvents represent different selectivity aspects: DEA is a strong base, CHCl3 is a proton donor, EtAc is a dipole interactor and MeOH has both proton donor and proton acceptor properties. This means that the solvent systems which can be obtained by mixing these solvents can provide a large variation in selectivity.
6.2.4 Influence of temperature and relative humidity
Relative humidity and temperature are two variables which have influence on the chromatographic behavior of the solutes [2] but which cannot always be set at desired levels. The relative humidity is expected to have a large influence, while temperature has a small influence. In reference [2] it is stated that a temperature change of ±5 degrees seldom exceeds the reproducibility limits of the standard working techniques. It is most feasible to discuss the effect of variation in relative humidity and temperature in terms of activity. Therefore in the following paragraphs first the concept of activity will be introduced. Then the concept will be applied in a short examination of the effect of relative humidity and temperature on the retention. A thorough description of most topics related to activity can be found in [2]. The following gives a short description of activity as it is found relevant for this paper. Activity is a surface property of the adsorbent which, together with the solvent and temperature, gives rise to a particular retention for a given substance. If all other parameters are kept constant then an increase in layer activity will lead to lower R_f values and a decrease of activity to higher R_f values. The activity of a sorbent contains two components, an energy contribution and a surface area contribution. A larger surface energy per unit
surface will lead to stronger interactive forces between sorbent and sorbate. This specific potential surface energy determines the strength of the interaction between the stationary phase and the solvents and solutes. The surface per unit weight of the stationary phase determines the number of sorptive processes that can occur simultaneously. Coverage of the surface therefore causes a decrease of the activity. This is because the most active sites are covered first. A fundamental equation, equation (5), describes the most important phenomena in liquid-solid adsorption chromatography. In this equation the retention of a solute is dependent on its properties and the properties of solvent and stationary phase:

k' = (V_a · W_a / V_0) · 10^(α · f(A,S))    (5)
where V_a is the volume of the amount of solvent molecules which can occupy the free surface of the adsorbent (cm3/g), W_a the weight of adsorbent in a layer (g), V_0 the total volume of void spaces in a layer (cm3), and f(A,S) a dimensionless constant. In this equation both factors which determine the activity can be seen as independent. The first term is the surface component. In this factor V_a is the surface component and W_a/V_0 is approximately the solvent volume on the layer after development. The second term is the energy component. In this term α is a substance-independent, and for weak solvents also solvent-independent, parameter and f(A,S) is a constant dependent on solvent, solute and sorbent. With increasing deactivation of the surface α decreases. The largest activity controlling factor is the relative humidity. Water, adsorbed from the air, can cover a part of the surface (decrease of V_a). When water covers the complete surface, the adsorbent is completely deactivated and liquid-solid chromatography is replaced by liquid-liquid chromatography. In general the relative humidity influences, amongst others, the R_f values, the separation, the front migration velocity and the solvent profile gradients. In terms of equation (5), V_a is determined by the amount of water in the air, as given by the relative humidity. Usually, a higher relative humidity results in a lower activity and thus in lower R_f values. A second effect is that α decreases when water is adsorbed. The most active sites are the first to be covered by water. This means that especially when there is less than 20% coverage of the surface an increase in the coverage results in a decrease in α.
Equation (5) can be adapted to include temperature effects:
where T represents the temperature in Kelvin. This means that at higher temperatures the activity decreases, and that this decrease depends on both the solvent and the sorbent. There may even be an interaction between the temperature and the relative humidity, meaning that the effect of temperature is different at different relative humidities, since α is influenced by the general activity, which is partly controlled by the relative humidity. A simpler modification is the use of a linear relation for the influence of the temperature. After all, the inverse of the temperature, with values around 300 K, is almost as linear as the temperature itself. It is believed that the difference in the values can be neglected in the range covered in this paper.

6.2.5 Optimization
There are several approaches towards the optimization of TLC separations. First a distinction has to be made between the method (design) according to which the experiments are performed and the method by which the resulting chromatograms are evaluated. Although not every evaluation method can be combined with every optimization strategy, it is overly restrictive to let an optimization method be limited by an evaluation method. The common optimization methods for TLC are given by Nurok [1], who mentions, for systems with three or more solvents, three feasible methods:
- The sequential simplex method
- The Prisma method [8]
- Mixture designs and response surfaces
The sequential simplex method is conceptually the most simple method. First an initial set of design points is selected according to a simplex. For these settings experiments are performed. The chromatograms are ordered according to the quality of the separation. The worst chromatogram is eliminated. A new design point is calculated by reflection of the eliminated point in the (hyper)-plane (in the solvent space) spanned by the other points. Depending on the exact algorithm used, this reflection may involve a shrinkage or expansion in the direction of reflection. This is repeated until a
suitable optimum is found. The optimum which is found by such a method depends on the response variable, which should be a value describing the quality of the chromatographic result. When the elution order can change, the simplex method does not guarantee that the global optimum, rather than a local optimum, is found. The Prisma method explicitly uses the solvent strength in a trial and error method. From the solvent strengths of the pure solvents an isoeluotropic domain is determined by dilution with a basic solvent. The resulting domain has the shape of a triangle for a four-solvent system. The strength of the solvents should result in R_f values between 0.2 and 0.8. If necessary the strength of the solvents is adjusted by addition of an extra compound (hexane or a polar compound). Selectivity points are selected on the apices of the triangle. Based on the results new selectivity points are selected. During the final stages of the optimization the solvent strength may be fine tuned by adjusting the hexane concentration. If the best chromatogram does not exhibit adequate resolution, one or more of the primary solvents can be replaced and the optimization procedure repeated. With the mixture design and response surface methodology, coefficients are estimated which determine the properties of the chromatograms as a function of the solvent composition. With this method first design points are selected in a design space of three or four components. The retention observed is modeled and the models are used to predict settings which have desirable chromatographic properties. Several chromatographic properties may be used as dependent variables in this model. Glajch et al. [9] used the chromatographic optimization function (COF) and the resolutions between all peak pairs (R_s). On the basis of the experimental results one model (COF) or several models (R_s) are made. Coenegracht et al. [3] used the logarithm of the k' value and estimated subsequent resolutions from the predicted ln(k') values. The exact mathematical form of the model is determined by the data. With the aid of the least squares method [10] the parameters in the model are estimated. When resolutions are used, these are combined with the aid of overlapping resolution maps. The three methods described in the preceding paragraphs each offer distinct advantages and disadvantages. The first and most obvious difference between the methods is the distinction between the sequential methods (sequential simplex and Prisma method) and the simultaneous method (mixture design). With the sequential method some experiments are performed, these are evaluated, and on the basis of this evaluation new design points are selected, these are evaluated, etc.
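Before turning to the choice between these strategies, the reflection step that drives the sequential simplex method described above can be sketched in a few lines; the quality function below is a hypothetical stand-in for a chromatogram score, and constraints such as the mixture restriction are ignored.

```python
# Minimal sketch of one cycle of the sequential simplex method: drop the
# worst point and reflect it through the centroid of the remaining points.
# Real implementations add expansion/contraction rules and constraints.
def reflect_worst(points, quality):
    """points: list of coordinate lists; quality: callable returning a score."""
    scores = [quality(p) for p in points]
    worst = scores.index(min(scores))                  # worst chromatogram
    rest = [p for i, p in enumerate(points) if i != worst]
    centroid = [sum(c) / len(rest) for c in zip(*rest)]
    reflected = [2 * c - w for c, w in zip(centroid, points[worst])]
    return rest + [reflected]

# toy example in a two-dimensional factor space with an invented optimum
simplex = [[0.2, 0.3], [0.5, 0.1], [0.4, 0.6]]
quality = lambda p: -((p[0] - 0.35) ** 2 + (p[1] - 0.40) ** 2)
for _ in range(5):
    simplex = reflect_worst(simplex, quality)
print(simplex)
```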
With the simultaneous method first all experiments are performed, after which, with the aid of more complex calculations, an optimum is determined. The choice for a sequential or a simultaneous method depends on the expected difficulty of obtaining the optimum and the effort needed to perform one experiment. In this paper the problem is further complicated by the requirement of robustness. The optimization method will be determined after a discussion of robustness and the Taguchi method.

6.2.6 The Taguchi approach to robustness
In the interpretation of the concept of robustness we will follow the Taguchi approach. According to Taguchi [11] the robustness of a process is: "the ability of the process to produce consistently good products with minimal effect from changes in uncontrollable manufacturing influences". In terms of TLC separation this becomes: the ability of the TLC system to produce consistently good separations with minimal effects from changes in uncontrollable external influences. The external influences are relative humidity and temperature. According to Taguchi this should be achieved with a parameter design. A parameter design is a design where two kinds of independent variables are used. These are the controllable influences and the uncontrollable influences, also called control factors and noise factors. The aim of the parameter design is to find settings of the control factors where the variability due to the noise factors is minimal. In order to use a parameter design it is necessary to be able to control the levels of the noise factors during the optimization period. This control of the noise factors is necessary in order to use fixed levels of the noise factors. The responses at the extreme settings of the noise factors determine what variation (noise) in the response values is present. The standard method to handle a parameter design is with the aid of inner and outer arrays. In the inner array a design is made for the control factors. Similarly, in the outer array a design is made for the noise factors. Experiments are performed at each combination of settings of the inner array and the outer array. The results are summarized (mean response and variance) for each setting of the inner array. The mean response is the mean response which is to be expected at the setting of the inner array. The variance is the variance in the response which is caused by changes of the noise factors. The preferable setting of the inner array is the one which combines an attractive mean level with a low variance.
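A minimal sketch of such a crossed inner/outer array is given below; the run_experiment function is a hypothetical stand-in for an actual TLC development, and the factor levels are illustrative only.

```python
# Minimal sketch of a crossed parameter design: every inner-array setting
# (control factors) is run at every outer-array setting (noise factors) and
# summarized by mean and variance.
import statistics

inner = [{"DEA": 0.05, "MeOH": 0.0}, {"DEA": 0.40, "MeOH": 0.0}]          # control factors
outer = [{"T": 20, "RH": 40}, {"T": 20, "RH": 80}, {"T": 30, "RH": 40}]    # noise factors

def run_experiment(control, noise):
    # placeholder response; a real study would measure the resolution here
    return 1.0 + 2.0 * control["DEA"] - 0.01 * (noise["RH"] - 40)

for control in inner:
    responses = [run_experiment(control, noise) for noise in outer]
    print(control, "mean =", round(statistics.mean(responses), 3),
          "variance =", round(statistics.variance(responses), 3))
```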
In the next section the parameter design is explained for the experiments of this paper. This can be used as a guideline to understand the problem and the method by which the problem is tackled.

6.2.7 Application of a parameter design in optimization
For a Taguchi approach to robustness the solvent composition is varied in the inner array, and the relative humidity and temperature in the outer array. The response is the resolution of noscapine and quinine. The data used in this example is a subset of the data set which was collected for this paper. The inner array consists of seven design points, the outer array of three. The three responses obtained at each setting of the inner array are combined to form a so-called signal-to-noise ratio. For maximization of the responses the following equation is used [11]:
S/N = -10 log[ (1/y_1² + 1/y_2² + 1/y_3²) / 3 ]    (7)
where y_1, y_2 and y_3 are the resolutions at the different settings of the outer array. The results are given in Table 6.2. In this table the settings of the control factors (solvent composition) are given on the left hand side and the settings of the noise factors (temperature and relative humidity) at the top. Each of the combinations of control factor and noise factor settings results in one resolution. The signal-to-noise ratio (S/N) summarizes the effect of the noise factors for each setting of the control factors. It is clear from Table 6.2 that the fourth composition is preferable, since the signal-to-noise ratio is highest there. When the original R_s values are examined it appears that at this composition the highest values are found, so this is not surprising. The question now is what happens when this calculation is performed for more components, i.e. when at each design point the minimum resolution over all solutes is taken. This is shown in Table 6.3. The first consequence is that a large amount of information is lost, since it is not detectable which solutes lack resolution. A second consequence is that sometimes the minimum resolution is zero. This means that this setting results in insufficient separations, but also that the inverse (1/y²) cannot be calculated. Therefore the signal-to-noise ratios are calculated on the nonzero R_s values. When the S/N values are examined it appears that the third solvent composition results in the best separation. However, this high value is obtained by neglecting the 0 (no resolution) at 20 °C, 80% relative humidity
(RH). So actually this is a solvent composition where two or more of the solutes are not always separated. Thus, it may be better to use the fourth composition. From a chromatographic viewpoint it is better to examine both the S/N ratios and the R_s values. The R_s values clearly reveal insufficient separations at all solvent compositions. In Geiss [2] an R_s of 0.5 is given as the absolute minimal value, while a value of 1 is barely acceptable. Therefore it is necessary to examine more solvent compositions in order to find a suitable separation. Another problem of this approach is that it is not clear whether some spot cross-over occurs when the temperature or relative humidity change. This information is, however, available. At mobile phase composition DEA=0.05, MeOH=0, CHCl3=0.475, EtAc=0.475 the spot order at 20 °C, 40% RH is noscapine, papaverine, heroin, caffeine, strychnine, quinine. At 20 °C, 80% RH strychnine and caffeine have crossed. Considerably more has happened for a mobile phase composition of DEA=0.4, MeOH=0, CHCl3=0, EtAc=0.6. At 20 °C, 40% RH the order is noscapine, papaverine, heroin, caffeine, strychnine, quinine, while an increase to 80% RH results in the spot order noscapine, quinine, papaverine, heroin, strychnine, caffeine. It is not clear what happens at environmental circumstances which are not tested. The possibility of spot cross-over is prominent in TLC, and resolution is not a response which can easily be interpolated.
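The signal-to-noise calculation of equation (7), restricted to the nonzero resolutions as done for Table 6.3, can be written as a small function; the first example call uses the resolutions of the first row of Table 6.2 for orientation.

```python
# Minimal sketch of the Taguchi signal-to-noise ratio of equation (7),
# computed on the nonzero resolutions only; returns None when every
# resolution is zero (no separation at any noise setting).
import math

def signal_to_noise(resolutions):
    nonzero = [y for y in resolutions if y > 0]
    if not nonzero:
        return None
    return -10.0 * math.log10(sum(1.0 / y ** 2 for y in nonzero) / len(nonzero))

print(signal_to_noise([4.1, 2.6, 6.4]))   # about 11, close to the first row of Table 6.2
print(signal_to_noise([0.0, 0.0, 0.17]))  # computed on the single nonzero value
```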
TABLE 6.2
ORTHOGONAL ARRAY FOR SEPARATION OF NOSCAPINE AND QUININE

                                     Temp. (°C)   20       20       30
DEA      MeOH     CHCl3    EtAc      R.H. (%)     40       80       40       S/N
0        1        0        0                      4.1      2.6      6.4      11.16
0.025    0.5      0.475    0                      1.6      5.5      8.9      8.37
0.05     0        0        0.95                   84.5     12.1     126.6    26.32
0.05     0        0.475    0.475                  139.6    26.2     176.5    32.9
0.4      0        0        0.6                    6.4      0.9      7.9      3.69
0.4      0        0.6      0                      13.6     2.0      13.0     10.76
0.4      0.6      0        0                      0.2      0.1      0.3      -16.17
TABLE 6.3
ORTHOGONAL ARRAY FOR SEPARATION OF STRYCHNINE, QUININE, HEROIN, NOSCAPINE, CAFFEINE AND PAPAVERINE

                                     Temp. (°C)   20       20       30
DEA      MeOH     CHCl3    EtAc      R.H. (%)     40       80       40       S/N
0        1        0        0                      0        0        0.17     -1.53
0.025    0.5      0.475    0                      0        0.12     0.09     -2.00
0.05     0        0        0.95                   0.87     0        0.38     -0.62
0.05     0        0.475    0.475                  0.94     0.12     0.66     -1.37
0.4      0        0        0.6                    0.09     0.11     0.27     -1.87
0.4      0        0.6      0                      0.09     0.19     0.09     -1.98
0.4      0.6      0        0                      0        0        0.08     -2.19
The conclusion is therefore that both more solvent compositions and more environmental circumstances should be tested in order to obtain a suitable and robust separation. A second conclusion that can be drawn is that the signal-to-noise ratio does not contain all relevant information. The consequence is that a different approach is necessary for TLC. This approach will be based on response surface methodology and is developed in the following section.

6.2.8 Generalization of parameter design towards Response Surface Methodology (RSM)
In the previous section it looked as though a gigantic experimental effort would be necessary to get a TLC separation which is robust against changes in temperature and relative humidity. However, with the aid of models based on far fewer experiments it is possible to predict the separations under all experimental situations. Based on these predictions it should be possible to optimize the separation.
The general procedure is:
- A design is constructed which covers the experimental region in a suitable manner. General rules for the construction of such designs can be found in Box and Draper [12].
- Retentions are measured in experiments performed according to the design.
- A suitable response variable is selected. This variable should be chosen such that it has a homoscedastical error and results in simple models. For reasons stated below R_m is chosen (see Section 6.2.10).
- Models are constructed.
- Predictions are made on grid points in the design space. These predictions can in turn be used to predict resolutions.
- The prediction which offers the best trade-off between resolution and robustness is selected.

The use of RSM has many consequences; it not only resolves some problems, but also causes new ones. In the following sections the problems which are encountered are described and solutions are presented.

6.2.9 Construction of experimental designs
The construction of an experimental design for this separation problem is complicated because both mixture and process variables are present. The former variables, which describe the composition of a mixture in terms of fractions, usually result in design spaces which are a subspace of a simplex (e.g. of a triangle or a tetrahedron). Process variables, on the other hand, are really independent. The design space is often a square or a cube. In this paper there are four mixture variables and two process variables. The design space is therefore a part of a tetrahedron in the mixture space, and a square in the process variable space. The boundaries of the feasible mixture space are determined by the chromatographic behavior of the solutes. Outside this region the solutes do not elute at all, or elute on the solvent front. The feasible region concerning the fraction of DEA is from 0 to 0.4. MeOH may be used in all fractions. CHCl3 is used in fractions from 0 to 0.95. Finally, EtAc is also used in fractions from 0 to 0.95. In Figure 6.1 the mixture tetrahedron together with the boundaries is shown. It is not difficult to construct a design which is feasible for this design space. Two approaches are fairly common.
The first design approach uses extreme vertices designs [13]. With this approach design points are placed at all extreme corners of the design space (e.g. (100% MeOH), (40% DEA, 60% MeOH)), at the middle of the edges (e.g. 20% DEA, 80% MeOH), at the middle of the planes (e.g. 20% DEA, 40% MeOH, 20% EtAc), and in the centre (21% DEA, 27% MeOH, 26% CHCl3, 26% EtAc). For this problem this would result in 21 design points. These settings have to be combined with settings of the process variables (temperature and relative humidity). A feasible design for these variables is a factorial design with an additional (duplicated) centre point. Use of all combinations will result in an overly large number of design points.
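The size of such a crossed design is easily illustrated: the sketch below crosses a few of the mixture points mentioned above with a 2 x 2 factorial plus centre point for temperature and relative humidity (with the full 21-point mixture design the same crossing would give 105 runs).

```python
# Minimal sketch of crossing a mixture design with a factorial design for the
# process variables; only a handful of example mixture points are used here.
from itertools import product

mixture_points = [                      # fractions of DEA, MeOH, CHCl3, EtAc
    (0.00, 1.00, 0.00, 0.00),           # a vertex
    (0.40, 0.60, 0.00, 0.00),           # another vertex
    (0.20, 0.80, 0.00, 0.00),           # middle of an edge
    (0.21, 0.27, 0.26, 0.26),           # overall centre
]
process_points = [(20, 40), (20, 80), (30, 40), (30, 80), (25, 60)]  # 2x2 factorial + centre

full_design = [m + p for m, p in product(mixture_points, process_points)]
print(len(full_design), "runs, e.g.", full_design[0])
```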
Figure 6.1 Feasible compositions depicted in the mixture tetrahedron

A second design approach uses the "optimal" design construction method [14,15]. With this approach first a model is constructed, after which the design points are arranged in such a manner that the model can be estimated as precisely as possible. Application of this method has certain complications, involving the model for which the design is constructed, the design property according to which the design is optimal, the search strategy which is used to examine the design space, etc. Application of this method in the TLC-separation problem design space should however not result in any special problems [16]. For the TLC-separation design space both design construction methods can be used. It was decided to use an extreme vertices design. In order to obtain a high quality extreme vertices design it was decided to construct a design with the optimal design method first. This design was examined and the design points of this design were used as a guideline during the construction of the design. Besides the points which cover the design space
as well as possible, this design consists of several replications of the centre point. These replications are necessary to determine the measurement error precisely. Unfortunately, the design as developed could not be performed completely because of a breakdown of the equipment. Therefore some experiments were performed at the temperature and relative humidity of the laboratory. The design used consists of 65 design points. The levels for temperature and relative humidity were not according to a factorial design because of the breakdown. However, experiments for all the settings of the mixture design were performed.
- At level 20 °C, 40% RH there were 16 different solvent compositions.
- At level 20 °C, 42% RH there were 5 different solvent compositions (all mixtures of DEA, MeOH and EtAc).
- At level 20 °C, 45% RH there were 5 different solvent compositions (all mixtures of DEA, MeOH and CHCl3).
- At level 20 °C, 80% RH there were 9 different solvent compositions.
- At level 30 °C, 40% RH there were 7 different solvent compositions.
- At level 30 °C, 80% RH there were 10 different solvent compositions.
- At level 20 °C, 41% RH the centre point (DEA=0.217, MeOH=0.267, CHCl3=0.258, EtAc=0.258) was measured 5 times.
- At level 25 °C, 60% RH the centre point (DEA=0.217, MeOH=0.267, CHCl3=0.258, EtAc=0.258) was measured 8 times.

6.2.10 Selection of the dependent variable
In the previous sections it has been stipulated that there are several response variables which can be modeled. The success of the optimization procedure depends on the selection of the response variable(s). There are several criteria which can be used to select a response variable [12,17]. The response variable should have a homoscedastical error structure and has to change continuously and smoothly. Both experimental data and chromatographic theory can be used to check these properties. When these properties are examined it is clear that the resolution, although it is the response of interest, is not a response which should be modeled. This is because of the typical behavior of the resolution when spots cross (Figure 6.2). In such situations the resolution is not continuous in its derivative and is therefore difficult to model with standard models. These problems are even more pronounced when minimum resolution is used. It is
therefore preferable to construct models describing the retention of each of the solutes and use equation (4) to calculate the resolution.
Figure 6.2 Typical changes of resolution (dotted line) and k' values (solid line) as dependent on an independent variable

From chromatographic theory [2] it is clear that the R_m value should result in simple models. For this reason it is preferred over the k' or the R_f. These latter response values can be calculated from predicted R_m values. It is more difficult to determine the error structure of the R_m. It is believed, however, that logarithmic transformation of the k' values should result in homoscedastical error structures [3]. If there is no theory available to determine a suitable transformation, statistical methods can be used to determine one. The Box-Cox transformation [18] is a common approach to determine whether a transformation of a response is needed. With the Box-Cox transformation the response, y, is taken to different powers λ (e.g. from -2 to 2). If λ = 0 the trans-
formed values can be interpreted as ln(y), otherwise as y^λ. The confidence interval of λ can be used to find a value where λ can be interpreted. Values such as λ = -1 (1/y), λ = 0 (ln(y)) or λ = 0.5 (√y) are preferred over values such as λ = -0.9. It was decided that R_m is the preferred response. When, however, there are convincing contradicting results from the Box-Cox transformation, then an alternative response has to be used.
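A bare-bones sketch of such a λ scan is given below; it uses the normalized power transformation and a single-predictor linear fit, which is a simplification of the models actually used (see Section 6.2.11), and the data are invented.

```python
# Minimal sketch of a Box-Cox style scan over lambda from -2 to 2 in steps
# of 0.2; the lambda giving the smallest residual sum of squares of a simple
# linear fit of the normalized transformed response is reported.
import math

def boxcox(y, lam):
    gm = math.exp(sum(math.log(v) for v in y) / len(y))       # geometric mean
    if abs(lam) < 1e-12:
        return [gm * math.log(v) for v in y]
    return [(v ** lam - 1.0) / (lam * gm ** (lam - 1.0)) for v in y]

def rss_after_linear_fit(x, z):
    """Residual sum of squares of z regressed on x (single predictor plus intercept)."""
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    b = sum((xi - mx) * (zi - mz) for xi, zi in zip(x, z)) / sum((xi - mx) ** 2 for xi in x)
    a = mz - b * mx
    return sum((zi - (a + b * xi)) ** 2 for xi, zi in zip(x, z))

x = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]                 # illustrative predictor
y = [0.9, 1.4, 2.3, 3.8, 6.1, 9.9]                 # illustrative positive response
lambdas = [i * 0.2 - 2.0 for i in range(21)]       # -2 to 2 in steps of 0.2
best = min(lambdas, key=lambda lam: rss_after_linear_fit(x, boxcox(y, lam)))
print("lambda with smallest residual sum of squares:", round(best, 1))
```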
6.2.11 Construction of models for the dependent variables
The construction of models by multiple regression is, in the design space used, somewhat more complicated than usual. This is due to the presence of both mixture and process variables. In this section a short discussion of the consequences of constructing models with both kinds of variables is given. Mixture variables, expressing the composition of the mobile phase as fractions, have the property that they add up to one (the mixture restriction). The consequence is that no intercept can be estimated when the effects of the solvents are evaluated [10,19]. Moreover, interactions and quadratic effects, such as used when the independent variables are process variables, cannot be estimated independently. Mathematically it is better to use blending effects only. Interpretation of these blending effects, i.e. explicitly stating which components are responsible for the non-linear effects, is not possible. When the mixture variables are combined with process variables, the complications caused by the mixture restriction are amplified by the presence of the process variables. For example, when the "interactions" between all mixture variables and a process variable are estimated, it is not possible to estimate the effect of the process variable itself. This means that the models need not contain the simpler model terms when the more complex model terms are present [19]. The terms in the models may be classified by degree of complexity. First order terms cause linear effects, second order terms cause curvature, third order terms more complex curvature, etc. When d, m, c, e, t and h are used for, respectively, the concentrations of DEA, MeOH, CHCl3 and EtAc and the settings of temperature and relative humidity, then examples of second order terms are d*m, d*e, e*t and t*h. The complete first order model is:

y = b1*d + b2*m + b3*c + b4*e + b5*t + b6*h
The complete second order model is:
With the design used it is not possible to construct a complete third order model. The most complex third order model which can be used contains 38 terms. This includes third order blending effects (the so-called special cubic model terms [10], e.g. d*m*c) and temperature and relative humidity dependent blending effects (e.g. d*m*t and e*t*h). Concluding, it may be necessary to construct models which are not symmetrical in the process and the mixture variables, and from which one or more simple model terms are missing while more complex interactions are present.

6.2.12 Selection criteria for models
The use of the best possible models is of paramount importance for the whole optimization. Therefore all information should be used to find this model. In this case there are seven dependent variables, i.e. the R_m values of the solutes. Since the physico-chemical processes which determine the chromatographic separation may largely be the same for all solutes, a good approach is to try to select the same model for all solutes. This also has the advantage that random errors in the observed R_m values have less influence on the selected models. Several criteria can be used to select the best models, such as the F-test on regression, the adjusted correlation coefficient (R²_adj) and the PRESS [20] (predictive error sum of squares). In general, even merely adequate models show significant F values for regression, which means that the hypothesis that the independent variables have no influence on the dependent variables may not be accepted. The F value is less practical for further selection of the best model terms since it hardly makes any distinction between different predictive models. There are alternative criteria besides the prediction power to examine models. These are case deletion diagnostics [17,21], which are used to deter-
mine whether some observation has an overly large influence on the parameter estimates, and measures of collinearity [17,22], which are used to determine whether model terms are highly correlated and therefore impractical for predictions. An example of a case deletion diagnostic is Cook's distance. Collinearity can for instance be measured with the variance inflation factor (VIF). R²_adj and PRESS will be used to select models. Cook's distance and the variance inflation factor will be used to examine further model properties.

6.2.13 Selection of optimization criteria
In chromatographic terms the purpose of this paper is to find a mixture composition which results in a good separation of the solutes, both under standard environmental conditions and for different temperatures and relative humidities. In this section this chromatographic purpose is combined with the Taguchi approach to robustness. This results in mathematical expressions which quantify the separation and the robustness against environmental influences. The purpose of the optimization is to find mixture compositions which result in robust separations. Therefore the separation power of a mobile phase at all combinations of temperature and relative humidity should be examined. The question is how all these values should be combined. In our opinion the signal-to-noise ratio of Taguchi is not a suitable optimization variable. This has two reasons. The first is that R_s = 0 cannot be handled by this response. The second reason is that situations can result with the same S/N values, while in a chromatographic context these situations are different. As an example, suppose that there are three mixture compositions which result in a signal-to-noise ratio which is doubtful (i.e. S/N < 0.6, Figure 6.3). The first (continuous line) has almost adequate resolutions at all temperatures. The second (dashed line) has reasonable resolutions at 20 °C, but no resolution at 30 °C. The third (dotted line) has reasonable resolutions at 30 °C, but no resolution at 20 °C. In the laboratory 20 °C occurs often, therefore the last mobile phase composition is unacceptable. A choice between the first and the second requires judgement from the researcher. There are several advanced mathematical techniques available to evaluate several responses, such as the resolutions at different temperatures and relative humidities [22]. We have chosen a technique called Pareto Optimality since this method is simple and graphical techniques can be used to display the results.
Figure 6.3 Different dependencies of R_s on temperature
One of the more important criteria in the estimation of robustness is the occurrence of different spot orders at different temperatures or relative humidities. This is called spot cross-over. A spot cross-over clearly implies a lack of resolution and an interpretation problem (which spot is which compound). The number of spot cross-overs should be minimal and preferably zero. The number of cross-overs can easily be determined when the elution order on the plates is examined at fixed settings. A second criterion should express the robustness of the separation more subtly than the number of spot cross-overs. The minimum resolution at different temperatures and relative humidities can be combined into an average and a minimum value. It is also of importance to examine what happens in a special region, such as temperatures under 25 °C and relative humidities under 60%, since these circumstances occur more often than others. The exact criteria which are employed depend on the feasible results. When it is impossible to avoid any change of elution order, then the number of changes in the elution order in the region of interest is of high importance. On the other hand, if it is easy to get situations where the solutes always elute in the same order, then the resolutions throughout the whole temperature and relative humidity range are decisive.
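The idea of Pareto optimality for these criteria can be sketched as follows; the candidate compositions and their criterion values are invented, and the three criteria used (number of cross-overs, worst-case minimum resolution, average minimum resolution) follow the discussion above.

```python
# Minimal sketch of Pareto optimality: a candidate is kept if no other
# candidate is at least as good on every criterion and better on at least one.
candidates = {
    # composition label: (cross-overs, worst-case min Rs, average min Rs)
    "A": (0, 0.4, 0.9),
    "B": (1, 0.6, 1.1),
    "C": (0, 0.3, 0.7),
    "D": (2, 0.2, 0.5),
}

def dominates(a, b):
    """a dominates b: no worse on every criterion and strictly better on at least one."""
    no_worse = a[0] <= b[0] and a[1] >= b[1] and a[2] >= b[2]
    strictly = a[0] < b[0] or a[1] > b[1] or a[2] > b[2]
    return no_worse and strictly

pareto = [name for name, crit in candidates.items()
          if not any(dominates(other, crit) for key, other in candidates.items() if key != name)]
print("Pareto-optimal compositions:", pareto)
```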
6.3 EXPERIMENTAL
6.3.1 Materials and methods
Chloroform, ethyl acetate, methanol and the alkaloids were of analytical reagent grade; diethylamine was chemically pure. All reagents were used without further purification. Solutions of the alkaloids were made at concentrations of 1 mg/ml by dissolving in methanol. 20 x 10 cm precoated silica gel 60 F254 TLC plates (Merck) were used. The plates were presaturated in a temperature and humidity controlled tank for 60 minutes. With disposable micropipettes a 2 µl sample was spotted 1 cm from the lower edge of the plate. The spots were detected with the help of a UV lamp at 254 nm.
6.3.2 Software
Calculations were performed on an IBM compatible 80486 computer and a HP Apollo 9000 series 700 model 715/75 using the SAS software package [23]. For prediction of R_m values the SAS procedure PROC REG was used; subsequent calculations based on these R_m values were performed with a variety of home-written macros in the SAS system.
6.4 RESULTS
6.4.1 Introduction
The results can be divided into several parts. The first part concerns the determination of suitable models which describe the R_m as a function of the solvent composition, temperature and relative humidity. This part can be subdivided into the selection of a suitable Box-Cox transformation, the determination of the suitable models, and a discussion of the performance of the models found versus the more chromatographic models such as models (5) and (6). The second part of the results concerns the search for an optimum separation. This can be subdivided into a part concerning an optimum at a fixed setting of temperature and relative humidity and a part concerning the search for an optimum separation which is robust against changes in temperature and relative humidity.
6.4.2 Box-Cox transformation
There are two response representations which can be tested with a Box-Cox transformation. These are k' and R_f. The R_m cannot be used with a Box-Cox
transformation since it can have negative values. Since λ = 0 can be interpreted as a logarithmic transformation, if λ = 0, R_m is a good dependent variable (R_m = log k'; the base of the logarithm is not important). The goal of a Box-Cox transformation is to find a guideline for a general decision about a transformation. Therefore it was decided to examine for each of the solutes both its k' value and its R_f value. Based on the sum of the knowledge of the feasible λ values a decision on an appropriate transformation can be made. One of the responses, narceine, was not used because at some solvent compositions and environmental circumstances the R_f value was zero. The consequence is that the k' value (k' = 1/R_f - 1) is infinite. These values give problems in the Box-Cox transformation. The model which was used in the Box-Cox transformation is a rephrased linear model:
The range of λ values which were examined was -2 to 2 with a stepsize of 0.2. The results are given in Table 6.4. Examination of these values confirms that the k' values should be transformed by a logarithm. There is one exception, however: for quinine 1/√k' should be used. Another result is that, apparently, the R_f values were acceptable without any transformation. It is not difficult to explain why the R_f values seem to be acceptable. When a graph of R_m vs. R_f is made (R_m = log((1 - R_f)/R_f)), and the range of the R_f values is limited to 0.1 to 0.9, then Figure 6.4 results. In this figure it is clearly visible that, especially over the range 0.2 to 0.8, but also in the range 0.1 to 0.9, the relation between R_m and R_f is almost linear. This means that, in the region of preferable R_f values, there is no practical difference between use of the R_f and the R_m. With this in mind the results of the Box-Cox transformation can again be examined. It appears that for all solutes except quinine the R_f range is between 0.14 and 0.86. This means that for these solutes the transformation from R_f to R_m does not have statistical consequences. The only two exceptions are quinine and narceine. These have low R_f values (quinine from 0.04, narceine from 0), and thus the features of the data change when a transformation to R_m is made. Therefore it can be concluded that for all responses except quinine and narceine, the R_m values can be used. For quinine and narceine it may be better to consider some other transformation, for instance 1/√k'. If no
suitable model for the Rf values of quinine and narceine can be found, then this transformation will be used.
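The λ scan described above (λ from −2 to 2 in steps of 0.2, applied within a regression model) can be illustrated in a few lines of code. The sketch below is not the authors' SAS implementation; the function name, the ordinary least-squares fit and the geometric-mean rescaling (used so that residual sums of squares remain comparable across λ) are assumptions of this sketch.

```python
import numpy as np

def boxcox_scan(y, X, lambdas=np.arange(-2.0, 2.01, 0.2)):
    """Scan Box-Cox lambdas for a response y modelled by the design matrix X.

    For each lambda the response is transformed, rescaled by the geometric
    mean of y (so residual sums of squares stay comparable), and fitted by
    ordinary least squares; the lambda with the smallest RSS is preferred."""
    y = np.asarray(y, dtype=float)
    gm = np.exp(np.mean(np.log(y)))              # geometric mean of the response
    results = []
    for lam in lambdas:
        if abs(lam) < 1e-12:
            z = gm * np.log(y)                   # lambda = 0 corresponds to the log transform
        else:
            z = (y**lam - 1.0) / (lam * gm**(lam - 1.0))
        beta, _, _, _ = np.linalg.lstsq(X, z, rcond=None)
        rss = float(np.sum((z - X @ beta)**2))
        results.append((lam, rss))
    return min(results, key=lambda t: t[1]), results
```

Applied to the k' values of a solute such a scan would be expected to point to λ near 0 (a logarithmic transformation), in line with the intervals of Table 6.4.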
TABLE 6.4
RESULTS OF BOX-COX TRANSFORMATION

              Transformation of k' values     Transformation of Rf values
Solute        min. λ       max. λ             min. λ       max. λ
strychnine    -0.6         0                  0.6          1.8
quinine       -0.6         -0.4               0.8          1.6
heroin        -0.6         0                  0.8          2.0
noscapine     -0.2         0.4                0.8          2.0
caffeine      -0.2         0.2                0.4          1.8
papaverine    -0.2         0.2                0.8          2.0
Figure 6.4 Plot of Rm vs. Rf (Rm on the vertical axis from -1.0 to 1.0; Rf on the horizontal axis from 0.1 to 0.9)
6.4.3 Selection of models
The use of the best models is of paramount importance for the whole optimization. Therefore all information should be used to find these models. In this case there are seven dependent variables, namely the Rf values of the solutes. Since the physico-chemical processes which determine the chromatographic separation may largely be the same for all solutes, a good approach is to try to select the same models for all solutes. This has the advantage that random errors in the observed Rf values have less influence on the finally selected models. A problem in the analysis of the data is that there are various missing values for narceine. This results when narceine is not eluted at all. The consequence is that narceine is more difficult to model. It was decided to analyse the other solutes first, after which narceine would be described with the same model.
TABLE 6.5
RESULTS OF MODEL VALIDATION (2nd order model)

Strychnine   0.8065   2.3801
Quinine      0.8479   3.3181
Heroin       0.7395   2.4902
Noscapine    0.8038   1.1543
Caffeine     0.8871   0.8976
Papaverine   0.8466   1.0508
Narceine     0.8977   6.0670
As a first examination a complete second order model was used for all responses. It then appeared that there was a significant lack of fit for all the responses. This implied that third order model terms might be necessary. Further examination of the residuals revealed that there was one observation which had large residual values for all responses. Further examination showed that these large residual values were also present when other models were used. Examination of the laboratory reports showed no reasonable cause for the large residual. After further consideration it was suggested that the solvent must have had the wrong composition in the experiment belonging to this observation. The observation was removed from the data set.
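A minimal sketch of the kind of model fitting and residual screening described above is given below; it assumes a generic least-squares fit of a full second-order model and a simple standardized-residual cut-off, and is not the SAS code actually used.

```python
import itertools
import numpy as np

def quadratic_design_matrix(F):
    """Model matrix of a full second-order model in the factors F (n runs x k factors):
    intercept, linear terms, two-factor interactions and squared terms."""
    n, k = F.shape
    cols = [np.ones(n)] + [F[:, i] for i in range(k)]
    cols += [F[:, i] * F[:, j] for i, j in itertools.combinations(range(k), 2)]
    cols += [F[:, i]**2 for i in range(k)]
    return np.column_stack(cols)

def fit_and_flag_outliers(F, y, threshold=3.0):
    """Fit the second-order model and flag runs whose standardized residual
    exceeds the (illustrative) threshold, as candidates for closer inspection."""
    X = quadratic_design_matrix(F)
    beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    std_resid = resid / resid.std(ddof=X.shape[1])
    return beta, np.where(np.abs(std_resid) > threshold)[0]
```

An observation flagged for all responses at once, as was found here, is a strong hint of an experimental error rather than model misspecification.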
When again a second order model was used for the data it appeared that there still was some lack of fit. Therefore models were constructed which included third order model terms. Further examination of these more complicated models, however, revealed a much too high collinearity in these models. This means that there are systematic effects in the data which cannot be explained in a completely satisfactory manner. It was decided to use the second order model and to neglect further model complexities. The model validation results of the second order model can be found in Table 6.5. It is clear that the model for narceine is inferior to all the other models. This can have three reasons. The first is that the solute has a fundamentally different behaviour from the other solutes. The second reason is that the range of the Rf values of narceine is different, which in turn requires a different model or transformation. The third reason may be the presence of the missing values. There are only 57 observed values for narceine, as opposed to 64 for the other solutes.

6.4.4 Chromatographic and empirical models
The chromatographic behaviour of solutes, as described in equations (5) and (6), only predicts the effects of temperature and relative humidity. According to the formula the Rf should decrease with an increase of the relative humidity (Va decreases because the active sites are covered with water). When the quantitative effects are examined it appears that this effect exists, but with one exception. When methanol is used as solvent an increase in the relative humidity causes an increase in the Rf values for all solutes but strychnine. A second effect predicted from the chromatographic theory is that a temperature increase will decrease the activity. The experimental results show that an increase of temperature can, depending on the composition of the solvent, both increase and decrease the activity.

6.4.5 Determination of a solvent with a high minimum resolution
It is not difficult to determine a solvent composition which has a high minimum resolution, once it has been determined what temperature and relative humidity should be used. Suppose, for example, that a temperature of 20 °C and a relative humidity of 40% have been selected; then it is possible to predict the Rf values and thus the resolutions between the solutes for all solvent compositions. Since there are seven solutes, there are six resolutions. The minimum resolution is the lowest of these values. The problem is now
reduced to the selection of the solvent composition where this minimum resolution obtains its maximum value. The design space was scanned with a step size of 1% to find the maximum of the minimum resolution. It appeared that the maximum value of the minimum resolution (2.1) occurred at composition DEA=0, MeOH=0.18, CHCl3=0, EtAc=0.82. The k' values of the solutes at this composition are given in Table 6.6.
TABLE 6.6
k' VALUES AT THE COMPOSITION WHICH GIVES THE HIGHEST MINIMUM RESOLUTION AT T=20 °C AND RH=40%

Narceine     20.9
Quinine       6.1
Strychnine    3.9
Heroin        2.0
Caffeine      1.4
Papaverine    0.8
Noscapine     0.5
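The search described in this section can be sketched as a simple grid scan over the mixture space. In the sketch below, predict_rf is an assumed helper returning the seven predicted Rf values for a composition, and the resolution is approximated by the difference between adjacent predicted Rf values; the chapter itself uses the chromatographic resolution between spots, so this is only an illustration of the max-min logic.

```python
import numpy as np
from itertools import product

def best_composition(predict_rf, step=0.01):
    """Scan the DEA/MeOH/CHCl3/EtAc mixture space on a grid and return the
    composition maximizing the minimum separation between adjacent spots."""
    grid = np.arange(0.0, 1.0 + 1e-9, step)
    best_x, best_res = None, -np.inf
    for dea, meoh, chcl3 in product(grid, repeat=3):
        etac = 1.0 - dea - meoh - chcl3
        if etac < -1e-9:
            continue                             # outside the mixture simplex
        rf = np.sort(predict_rf(np.array([dea, meoh, chcl3, etac])))
        res = float(np.min(np.diff(rf)))         # worst pair of neighbouring spots
        if res > best_res:
            best_x, best_res = (dea, meoh, chcl3, etac), res
    return best_x, best_res
```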
6.4.6 Determination of a solvent composition with a robust minimum resolution
In this section first a preliminary study is made of the robustness of some mixture compositions. After that a systematic examination of the design space is performed. In Figure 6.5 the effect of temperature and relative humidity on the minimum resolution between the spots is shown for solvent composition DEA=0.217, MeOH=0.267, CHCl3=0.258, EtAc=0.258 (the centre of the design space). It is clear that there are different elution orders, 9 in total. This is mainly caused by quinine. At 25 °C quinine is the second worst eluting solute at 40% RH and the best eluting at 80% RH. At intermediate relative humidities it crosses many other spots. The final interpretation is that this solvent composition does not result in a robust TLC separation. In Figure 6.6 the temperature and relative humidity effects are depicted for solvent composition DEA=0, MeOH=0.18, CHCl3=0, EtAc=0.82 (the point with the highest minimum resolution). In total this resulted in 3 different elution orders. These are caused by two changes of elution order: the first between strychnine and quinine, the second between heroin and caffeine. The interpretation is that this solvent composition also does not result in a robust TLC separation.
Figure 6.5 Minimum resolution vs. temperature (T) and relative humidity (RH) for composition DEA=0.217, MeOH=0.267, CHCl3=0.258, EtAc=0.258
Figure 6.6 Minimum resolution vs. temperature (T) and relative humidity (RH) for composition DEA=0, MeOH=0.18, CHCl3=0, EtAc=0.82
Apparently, a systematic search is necessary to obtain a robust TLC separation. Therefore the complete mixture design space was scanned with a resolution of 4%. For each of these solvent compositions the minimum resolutions and peak orders were calculated for 121 different temperature/relative humidity combinations (11 temperature and 11 relative humidity levels). Based on these 121 resolutions the lowest and the average minimum resolution were estimated. The number of different spot orders was calculated. Besides that, a region of special interest was designated. This region had as upper boundaries 25 °C and 60% RH. Within this region the same values were estimated. There were 87 compositions which showed no spot crossover. To select from these 87 compositions the Pareto optimal points [22] were calculated (maximizing all four criteria). There were nine such points; these are given in Table 6.7. Plots of the minimum resolution for all these compositions were made, and finally the composition DEA=0.08, MeOH=0, CHCl3=0.16, EtAc=0.76 was selected as giving the preferred separation. In Figure 6.7 the change of the minimum resolution at this mixture composition at different temperatures and relative humidities is depicted. It is clear that the resolution is reasonably good for most temperatures and relative humidities, but under very humid conditions the resolution declines.
TABLE 6.7
AVERAGE (Avg.) AND MINIMUM (Min.) OF MINIMAL RESOLUTIONS. A: COMPLETE RANGE OF T AND RH, U: LIMITED RANGE

DEA    MeOH   CHCl3   EtAc    Min. A   Avg. A   Min. U   Avg. U
0.08   0.04   0.32    0.56    0.50     1.08     0.70     0.94
0.08   0.04   0.28    0.60    0.56     1.11     0.70     0.96
0.08   0.04   0.24    0.64    0.63     1.13     0.69     0.97
0.08   0.00   0.24    0.68    0.39     1.32     1.13     1.45
0.08   0.00   0.20    0.72    0.45     1.35     1.12     1.47
0.08   0.00   0.16    0.76    0.46     1.37     1.12     1.49
0.04   0.00   0.16    0.80    0.62     1.37     0.62     1.01
0.04   0.00   0.12    0.84    0.62     1.33     0.62     1.05
0.04   0.00   0.08    0.88    0.47     1.28     0.63     1.08
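The selection of the Pareto optimal points from the 87 candidate compositions, maximizing the four criteria of Table 6.7, can be sketched as follows; the function below is a generic dominance check and is not necessarily the algorithm of reference [22].

```python
import numpy as np

def pareto_optimal(criteria):
    """Indices of Pareto-optimal rows when every column is to be maximized.

    criteria: array of shape (n_compositions, n_criteria), e.g. the columns
    Min. A, Avg. A, Min. U and Avg. U of Table 6.7."""
    c = np.asarray(criteria, dtype=float)
    keep = []
    for i, row in enumerate(c):
        # a row is dominated if some other row is at least as good everywhere
        # and strictly better somewhere
        dominated = np.any(np.all(c >= row, axis=1) & np.any(c > row, axis=1))
        if not dominated:
            keep.append(i)
    return keep
```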
Figure 6.7 Minimum resolution vs. temperature (T) and relative humidity (RH) for composition DEA=0.08, MeOH=0, CHCl3=0.16, EtAc=0.76
6.5 CONCLUSIONS
A robust TLC separation of the seven alkaloids (strychnine, quinine, heroin, noscapine, caffeine, papaverine and narceine) is difficult to obtain. The solutes used have different properties and can therefore not be separated reliably for all possible values of relative humidity and temperature. The optimization method used is acceptable. The approach in which the data are first measured, after which the whole separation problem is analysed, allows rapid evaluation of a multitude of compositions.
REFERENCES
[1] D. Nurok, Strategies for optimizing the Mobile Phase in Planar Chromatography, Chem. Rev., 89 (1989) 363-375.
[2] F. Geiss, Fundamentals of thin layer chromatography (planar chromatography), Dr. Alfred Huthig Verlag, Heidelberg, 1987.
[3] P.M.J. Coenegracht, M. Dijkman, C.A.A. Duineveld, H.J. Metting, E.T. Elema and Th. M. Malingre, A new quaternary mobile phase system for optimization of TLC separations of alkaloids using mixture designs and response surface modelling, Journal of Liquid Chromatography, 14 (1991) 3213-3239.
[4] L.R. Snyder, Classification of the solvent properties of common liquids, J. Chromatogr. Sci., 16 (1978) 223.
[5] C.A.A. Duineveld, A.K. Smilde and D.A. Doornbos, Designs for mixture and process variables applied in tablet formulations, Analytica Chimica Acta, 277 (1993) 455.
[6] C.F. Poole and S.K. Poole, Chromatography today, Elsevier, Amsterdam, 1991.
[7] L.R. Snyder, Principles of adsorption chromatography, Dekker, New York, 1965.
[8] Sz. Nyiredy, C.A.J. Erdelmeier, B. Meier and O. Sticher, The PRISMA mobile phase optimization in thin-layer chromatography: Separation of natural compounds, Planta Medica, 3 (1985) 241-246.
[9] J. Glajch, J.J. Kirkland, K.M. Squire and J.M. Minor, Optimization of solvent strength and selectivity for Reversed-Phase Liquid Chromatography using an interactive mixture-design statistical technique, Journal of Chromatography, 199 (1980) 57.
[10] J.A. Cornell, Experiments with Mixtures: Designs, Models and the Analysis of Mixture Data, John Wiley & Sons, New York, 1990.
[11] G.S. Peace, Taguchi methods: a hands-on approach, Addison Wesley, 1992.
[12] G.E.P. Box and N.R. Draper, Empirical model-building and response surfaces, John Wiley and Sons, New York, 1987.
[13] G.F. Piepel, Programs for generating extreme vertices and centroids of linearly constrained experimental regions, Journal of Quality Technology, 20, 125-139.
[14] V.V. Fedorov, Theory of optimal experiments, Academic Press, New York, 1972.
[15] T.J. Mitchell, An algorithm for the construction of "D-optimal" experimental designs, Technometrics, 16 (1974).
[16] C.A.A. Duineveld, A.K. Smilde and D.A. Doornbos, Comparison of experimental designs combining process and mixture variables. Part I: Design construction and theoretical evaluation, Chemometrics and Intelligent Laboratory Systems, 19 (1993) 295-308.
[17] S. Weisberg, Applied linear regression, second edition, John Wiley & Sons, New York, 1985.
[18] G.E.P. Box and D.R. Cox, An Analysis of Transformations, Journal of the Royal Statistical Society B, 26 (1964) pp. 211-246.
[19] C.A.A. Duineveld, Construction and analysis of mixture-process variables designs as applied to tablet formulations, Ph.D. Thesis, University of Groningen, The Netherlands, 1993.
[20] R.R. Hocking, The analysis and selection of variables in linear regression, Biometrics, 32 (1976) 1-49.
[21] D.A. Belsley, E. Kuh and R.E. Welsch, Regression diagnostics. Identifying influential data and sources of collinearity, John Wiley & Sons, New York, 1980.
[22] M.M.W.B. Hendriks, J.H. de Boer, A.K. Smilde and D.A. Doornbos, Multicriteria decision making, Chemometrics and Intelligent Laboratory Systems, 16 (1992) 175-191.
[23] SAS Institute Inc., Cary, NC, USA, SAS Software Release.
Chapter 7
ROBUSTNESS OF LIQUID-LIQUID EXTRACTION OF DRUGS FROM BIOLOGICAL SAMPLES
JACOB WIELING¹
BioIntermediair Europe BV, Postbus 454, 9700 AL Groningen, The Netherlands
7.1 INTRODUCTION
Currently, a lot of emphasis is put on quality evaluation, quality improvement and quality optimisation in analytical chemical laboratories. The moment good quality of laboratory management procedures (e.g. logistics, such as information flow) has been achieved and ascertained by Good Laboratory Practice (GLP) regulations or other compliance programs, the quality of the chemical procedures may be improved. For example, methods or procedures can be optimised with respect to selectivity, specificity, accuracy, precision and ruggedness. Several definitions of quality have been given in the literature. An explicit definition is given by Taguchi et al. [1]: "the quality of a product is expressed by its loss to society". The parameter design procedure of Taguchi was developed to improve product performance and distinguishes between design variables (controllable variables) and noise variables (non-controllable factors).
¹ The results of the extraction experiments described in this contribution were collected in the laboratories of Pharma Bio-Research International B.V. (PBR), Zuidlaren, The Netherlands (J. Wieling, J.H.G. Jonkman, C.K. Mensink, J. Hempenius), in cooperation with the Chemometrics Research Group of the University Centre for Pharmacy (UCF), Groningen, The Netherlands (D.A. Doornbos, P.M.J. Coenegracht). The co-operation was funded by the Dutch Technology Foundation (STW). The development of the algorithms presented in this paper is a consequence of this co-operation.
The Taguchi method uses a particular experimental design, the goal of which is to select those settings of the design variables which give optimal results for the performance of a product. Moreover, those settings of the noise factors are selected that have minimal effects on the performance of the product. In bioanalysis, High-Performance Liquid Chromatography (HPLC) is the analytical technique most frequently used. Often, extended sample preparation is required to make a biological sample (the matrix) suitable for HPLC-analysis. The compound of interest, the analyte, has to be isolated from the matrix as selectively and quantitatively as possible. The quality of the sample preparation largely determines the quality of the total analysis procedure. In a survey Majors [2] showed that approximately 30% of the error generated during sample analysis was due to sample preparation, which indicates the need for error reduction and quality improvement in sample preparation. A systematic approach to sample preparation methods and optimisation of the quality aspects of sample preparation may enhance the efficiency of total analytical methods. This approach may also enhance the quality and knowledge of the methods developed, which in turn enhances the quality of individual sample analyses. Unfortunately, in bioanalysis, systematic optimisation of sample preparation procedures is not common practice. Attention to systematic optimisation of assay methods has always focused mainly on instrumental analysis problems, such as minimising detection limits and maximising resolution in HPLC. Optimisation of sample extraction has often been performed intuitively, by trial and error. Only a few publications deal with systematic optimisation of liquid-liquid extraction of drugs from biological fluids [3,4,5]. Although application of chemometrics in sample preparation is very uncommon, several optimisation techniques may be used to optimise sample preparation systematically. Those techniques can roughly be divided into simultaneous and sequential methods. The main restrictions of a sequential simplex optimisation [6,7] find their origin in the complexity of the optimisation function needed. This function is a predefined function, often composed of several criteria. Such a composite criterion may lead to ambiguous results [8]. Other important disadvantages of simplex optimisation methods are that local optima are not seldom selected instead of global optima and that the number of experiments needed is not known beforehand.
Regression methods (simultaneous methods) are applied to model criteria as a function of factor adjustments. (Fractional) factorial designs are often used to optimise process variables, e.g. pH or ionic strength. The levels of the factors can be chosen at will. Examples of factorial designs are given in [9,10,11]. A review on the subject has been given by Deming et al. [12]. Mixture designs [13-18] are used for the optimisation of the composition of a mixture. They allow the construction of a response surface (i.e. a model) of a criterion from a relatively small number of preselected experiments. Levels of all variables cannot be chosen arbitrarily since the fractions pj of the components add up to unity (for n components: 0 ≤ pj ≤ 1; p1 + p2 + ... + pn = 1). Once the model function is found to be statistically acceptable (descriptive as well as predictive) it can be used to create the response surface, which can be examined to select a region in the factor space that gives optimum or acceptable values of the criterion modelled. This region is thus also selected on the basis of a regression model. The validity of the optimum may be verified by additional experiments. Mixture experimental designs can be used to optimise the composition of extraction liquids in liquid-liquid extraction in biomedical analysis, which was demonstrated by Wieling et al. [4,5]. In this contribution, new criteria are introduced for the optimisation of liquid-liquid extraction: the minimal partition coefficient for the simultaneous extraction of more than one compound, a robustness criterion for the partition coefficient (CP), the ratio of two partition coefficients (selectivity αij), the minimal selectivity min αij for the simultaneous extraction of more than two compounds and a robustness criterion for the selectivity (Cα) for the simultaneous extraction of more than one analyte.
A simulation experiment is performed to validate the method, which can be formulated in algorithms. The applicability of the algorithms in practice is tested by performing extraction experiments for a group of sulphonamides. The response (partition coefficient) is modelled versus the composition of the extraction liquid. The models are used to predict the criteria within the entire mixture space.
7.2 THEORY

7.2.1 Liquid-liquid extraction optimisation theory
Several studies attempted to relate the partition coefficient P of a solute in a liquid chromatographic or a gas chromatographic system to the composition of the two phases, one of which has a varying composition [19-23]. Tijssen et al. [24] and Schoenmakers [25] derived a relation between the partition coefficient and a binary mobile phase in reversed-phase HPLC from the solubility parameter theory of Hildebrand et al. [26]. Similarly, a relation can be derived for liquid-liquid extraction with extraction liquids composed of three components:
ln P = A q1² + B q2² + C q3² + D q1q2 + E q1q3 + F q2q3 + G q1 + H q2 + I q3 + J     (1)
In this equation A through J are functions of the solubility parameters of the extraction liquid components and q1, q2 and q3 are the fractions of mixture components 1, 2 and 3, respectively. This equation is a canonical form of a mixture equation with three mixture variables, but this complex equation can be simplified since the sum of the fractions of the extraction liquid components equals 1 (q3 = 1 − q1 − q2). It provides considerable insight into the partitioning of a solute between the aqueous and the organic phase in liquid-liquid extraction. Quadratic effects of fractions and blending effects between two fractions are indicated. This factorial-design-like model can be transformed into a quadratic mixture model using the constraint that the sum of the fractions equals 1 (q1 + q2 + q3 = 1). The mixture variables (the fractions of the components in the extraction liquid) are now represented by xi:

ln P = β1x1 + β2x2 + β3x3 + β12x1x2 + β13x1x3 + β23x2x3     (2)
This transformation of a physico-chemical model into an empirical model was previously discussed by Weyland et al. [27]. A ternary nonlinear blending term is often added to improve the descriptive power of this equation. Then, a special cubic mixture model is obtained:

ln P = β1x1 + β2x2 + β3x3 + β12x1x2 + β13x1x3 + β23x2x3 + β123x1x2x3 + ε     (3)
where x1, x2 and x3 are the mixture variables (fractions of the mixture components in the extraction liquid), the β's are the model coefficients to be estimated and ε is the residual error. For estimation of the regression coefficients and the residual error at least as many experiments have to be performed as the number of model coefficients plus one. In a mixture design experiment, the response of a mixture of q components depends only on the fractions of the mixture components and does not depend on the total amount of the mixture. Liquid-liquid extraction is often quantified by the recovery R, i.e. the fraction of the total amount of analyte transferred from the aqueous phase into the organic phase (R = Φorg):

R = P·Vorg / (P·Vorg + Vaq)     (4)
where Vaq and Vorg are the volumes of the aqueous and organic phase, respectively, and P is the partition coefficient of a given analyte under certain conditions (pH, ionic strength, extraction liquid, temperature). This relationship can be transformed into the following relationship:

P = [R / (1 − R)] · (Vaq / Vorg)     (5)
With a constant ratio of the phase volumes throughout all the experiments:

P ∝ R / (1 − R)     (6)
Thus, when recoveries of analytes are measured, these recoveries can be related to the extraction liquid composition as follows:

ln[R / (1 − R)] = β1x1 + β2x2 + β3x3 + β12x1x2 + β13x1x3 + β23x2x3 + β123x1x2x3 + ε     (7)
Summarising, to optimise the partition coefficient P of a solute i, Pi should be maximised by mixing three solvents in the correct proportions. The use of mixture design statistical techniques with the natural logarithm of the partition coefficient as response criterion is a valid way to achieve this.
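As a minimal illustration of this summary, the sketch below converts measured recoveries to partition coefficients with equation (5) and fits ln P with the special cubic mixture model of equation (3) by ordinary least squares; the function names and the use of numpy are assumptions of this sketch, not the authors' software.

```python
import numpy as np

def special_cubic_matrix(x):
    """Model matrix of the special cubic mixture model (equation (3)):
    columns x1, x2, x3, x1x2, x1x3, x2x3, x1x2x3."""
    x1, x2, x3 = x[:, 0], x[:, 1], x[:, 2]
    return np.column_stack([x1, x2, x3, x1*x2, x1*x3, x2*x3, x1*x2*x3])

def fit_ln_p(compositions, recoveries, vaq_over_vorg):
    """Convert recoveries R to partition coefficients P (equation (5)) and fit
    ln P as a special cubic function of the mixture composition."""
    R = np.asarray(recoveries, dtype=float)
    P = R / (1.0 - R) * vaq_over_vorg
    X = special_cubic_matrix(np.asarray(compositions, dtype=float))
    beta, _, _, _ = np.linalg.lstsq(X, np.log(P), rcond=None)
    return beta        # beta1, beta2, beta3, beta12, beta13, beta23, beta123
```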
7.2.2 Optimisation criteria
In the previous section, the optimisation of liquid-liquid extraction with the help of mixture designs, justified by the solubility theory, was examined. A relation was derived between the partition coefficient and the mixture composition for liquid-liquid extraction with extraction liquids composed of three components, and a special cubic mixture model was obtained (equation (3)). Once a relationship between the partition coefficient and the composition of the organic phase has been found, these models can be used to build response surfaces of the partition coefficient or the recoveries of the solutes, or of other criteria.

Partition coefficient
If a substance has to be extracted from an aqueous matrix for quantitative determination, it is important to maximise its partition coefficient. The models for the partition coefficient in relation to the mixture composition can be used to estimate the mixture composition where the partition coefficient reaches its highest value.

Selectivity
The ratio of extraction of two solutes i and j is represented by the selectivity αij, which is the ratio of the partition coefficients Pi and Pj [28]:

αij = Pi / Pj     (8)

For quantitative extraction of two solutes i and j, both Pi and Pj should be maximised. If the ratio of the two partition coefficients is more important than the individual partition coefficients, αij should be used as optimisation criterion (by definition the indices of the partition coefficients are assigned such that αij is always smaller than or equal to 1). Optimal values for αij are those values which are equal to or which approximate unity (αij = 1).
Minimal partition coefficient
Situations can be imagined in which more than one solute has to be extracted from a sample. Such situations are, for instance, the extraction of an analyte simultaneously with an internal standard, or a drug simultaneously with one or more major metabolites or co-drugs. Under these conditions, the aim of an extraction procedure is to extract all substances as quantitatively as possible. However, for each solute to be extracted the optimum composition may be located in another region of the factor space: there may be no combination of mixture variables that guarantees optimum extraction for all substances. The most economical procedure for a liquid-liquid extraction would be a single-step extraction, since extraction procedures including several steps with the same or with different solvents are laborious and economically disadvantageous. Optimisation of extraction of more than one solute, which give different selective interactions (different response surfaces in the same mixture space), may require several extraction steps with different optimal extraction solvents or separate analysis of each analyte. However, procedures can be used which select a composition of the extraction liquid that provides satisfactory partition coefficients or extraction yields for all solutes to be extracted. The criterion of the minimal partition coefficient (Pmin) is introduced now: an extraction liquid composition is optimal if there is no other composition that gives a better partition coefficient for the worst extractable compound (maximise Pmin). If n substances have to be extracted from an aqueous matrix by a given extraction liquid composition, the minimal partition coefficient is defined as:

Pmin = min(P1, P2, ..., Pn)
In an optimisation procedure involving the minimal partition coefficient, the minimal partition coefficient is calculated for all compositions of the extraction liquid (all possible combinations of x1, x2 and x3 within the mixture space). The highest value calculated for this minimal partition coefficient (the maximal minimal partition coefficient) is the optimal value and hence indicates the composition where the partition coefficient of the worst extractable substance is highest.
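A sketch of such a max-min search over the mixture space is given below; the coefficient ordering follows the special cubic model of equation (3), and the grid step is arbitrary.

```python
import numpy as np

def maximin_partition(models, step=0.02):
    """Return the ternary composition maximizing the minimal predicted
    partition coefficient over all analytes.

    models: one coefficient vector (beta1..beta3, beta12, beta13, beta23,
    beta123) of an ln P special cubic model per analyte."""
    best_x, best_pmin = None, -np.inf
    for x1 in np.arange(0.0, 1.0 + 1e-9, step):
        for x2 in np.arange(0.0, 1.0 - x1 + 1e-9, step):
            x3 = 1.0 - x1 - x2
            terms = np.array([x1, x2, x3, x1*x2, x1*x3, x2*x3, x1*x2*x3])
            pmin = min(float(np.exp(np.dot(b, terms))) for b in models)
            if pmin > best_pmin:
                best_x, best_pmin = (x1, x2, x3), pmin
    return best_x, best_pmin
```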
Minimal selectivity
Similar to the criterion of the minimal partition coefficient, one can also use the criterion of the minimal selectivity. This criterion (αmin) is defined as follows: an extraction liquid composition is optimal if there is no other composition that gives a higher value for the minimal selectivity (maximise αmin). In other words: the optimum composition is that composition where the affinity of the n compounds of interest for the extraction liquid is most equal.
A paper by Modin and Schill [29] describes the variation of selectivity due to a change of extraction solvent.
Robustness of P and α
As a result of using mixtures of solvents, small variations in the composition might occur that may have significant effects on the variability of the partition coefficient of a compound or on the selectivity αij of two compounds. For mixture designs these effects were previously noticed and investigated by de Boer et al. [30-32]. With the use of the regression models a partition coefficient for each compound and a selectivity for each pair of substances can be predicted for each composition of the extraction liquid within the factor space. Figure 7.1 demonstrates the response surfaces of the partition coefficients of two compounds i and j in a binary extraction liquid (i.e. an extraction liquid composed of two solvents). The compositions where the partition coefficients of i and j are optimal are represented by Oi and Oj, respectively. The optimal composition with regard to the selectivity is represented by Oα. Two situations are given. In the left part of the figure the shapes of the response surfaces of the partition coefficients are dissimilar for the two compounds. The maximal minimal partition coefficient is found in Oα. Oα also generates the maximal minimal selectivity, which is equal to unity in this case (αij is represented by the minimum value of αij and αji). The selectivity varies largely with the extraction liquid composition. Composition Oα yields a high ratio of Pi and Pj, but this ratio is very sensitive to small fluctuations in the composition of the extraction liquid: a fraction of extraction liquid component one (x1)
that is slightly higher than in Oα makes compound j the best extractable compound; on the contrary, a fraction of extraction liquid component one that is slightly lower than in Oα makes compound j the worst extractable compound. In other words: the robustness of the ratio of the partition coefficients with respect to variability of the extraction liquid composition is low. This robustness depends on the accuracy of the extraction liquid composition and on the response surfaces of the partition coefficients of the compounds i and j.
Figure 7.1 The response surfaces of the partition coefficients and the selectivity of two compounds i and j in a binary extraction liquid (response plotted against the fraction of x1 in x2)

In the right part of Figure 7.1, the maximal minimal partition coefficient is found in Oj. Here, the composition where the ratio of the partition coefficients of compounds i and j reaches its optimum, Oα, is also the composition that gives a robust selectivity. Optima with respect to maximal partition coefficients of the two compounds are obtained with different compositions. The response surfaces are not completely parallel. Little variation in the composition of the extraction liquid does influence
the partition coefficients of both i and j; it influences the minimal partition coefficient (depending on the slopes of the surfaces), but it scarcely alters the ratio of the partition coefficients of i and j. The robustness of this ratio is high. Summarising, the robustness of the partition coefficient (CP) and the robustness of the selectivity (Cα) are functions of the variance in mixture composition (σxi²) and the shape of the response surfaces:
where Pi and Pj are defined by the regression models of the partition coefficients of two solutes i and j, and cov(CPi, CPj) is the covariance of the variances of Pi and Pj due to small changes in the composition of the extraction liquid. Using the rules of the propagation of errors [33,34] a measure of the robustness of the partition coefficient (CP) and the robustness of the selectivity (Cα) can be obtained. Below, a derivation of the robustness of the partition coefficient Pi of a compound i and the selectivity αij of two compounds i and j with respect to variation in the extraction liquid composition is given. The general form of a (special cubic) mixture model for three-component mixtures is given by:

y = β1x1 + β2x2 + β3x3 + β12x1x2 + β13x1x3 + β23x2x3 + β123x1x2x3     (12)
where y is the response factor to be modelled. Then, the partial derivatives of y with respect to x1, x2 and x3 are:

∂y/∂x1 = β1 + β12x2 + β13x3 + β123x2x3     (13a)
∂y/∂x2 = β2 + β12x1 + β23x3 + β123x1x3     (13b)
∂y/∂x3 = β3 + β13x1 + β23x2 + β123x1x2     (13c)
The variance of the response due to possible variation in mixture composition (defined by the variances of the mixture components σx1², σx2² and σx3²) is:

σy² = (∂y/∂x1)²·σx1² + (∂y/∂x2)²·σx2² + (∂y/∂x3)²·σx3² + 2cov(x1,x2)·(∂y/∂x1)(∂y/∂x2) + 2cov(x1,x3)·(∂y/∂x1)(∂y/∂x3) + 2cov(x2,x3)·(∂y/∂x2)(∂y/∂x3)     (14a)

or, writing Ti for ∂y/∂xi:

σy² = T1²·σx1² + T2²·σx2² + T3²·σx3² + 2cov(x1,x2)·T1T2 + 2cov(x1,x3)·T1T3 + 2cov(x2,x3)·T2T3     (14b)
where σx1², σx2² and σx3² are the variances of the fractions x1, x2 and x3 and cov(x1,x2), cov(x1,x3) and cov(x2,x3) are the covariances between the fractions x1 and x2, x1 and x3, and x2 and x3. These covariances between two fractions have been derived by de Boer et al. [30]. In Figure 7.2, the response surface of the partition coefficient of a compound as a function of the composition of a binary extraction liquid is given. Also, the variation in x1 as a function of the mixture composition is given for three different compositions. It can be seen from these three Gaussian curves (a normal distribution for the error in xi is assumed) that the less pure a composition is, the higher the variation in the composition of the extraction liquid becomes. It is shown in Figure 7.2 that, despite relatively high variation in mixture composition, the variance of the partition coefficient is not necessarily high: in mixtures with 10% x1 or 90% x1, the variance of the mixture composition is relatively low. However, for 90% x1 the variation in the partition coefficient (dy3) due to composition error is much larger than for the mixture composition with 45% x1 (dy2), although this latter composition demonstrates a composition variance that is much larger.
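Equations (13a-c) and (14b) translate directly into a short error-propagation routine; the sketch below assumes that the special cubic model coefficients and a covariance matrix of the fractions are available, and is only an illustration of the calculation.

```python
import numpy as np

def variance_of_response(beta, x, cov_x):
    """Propagate mixture-composition uncertainty through the special cubic model
    (equations (13a-c) and (14b)).

    beta:  (b1, b2, b3, b12, b13, b23, b123)
    x:     mixture composition (x1, x2, x3)
    cov_x: 3 x 3 covariance matrix of the fractions x1, x2, x3."""
    b1, b2, b3, b12, b13, b23, b123 = beta
    x1, x2, x3 = x
    grad = np.array([
        b1 + b12*x2 + b13*x3 + b123*x2*x3,     # dy/dx1, equation (13a)
        b2 + b12*x1 + b23*x3 + b123*x1*x3,     # dy/dx2, equation (13b)
        b3 + b13*x1 + b23*x2 + b123*x1*x2,     # dy/dx3, equation (13c)
    ])
    return float(grad @ np.asarray(cov_x, dtype=float) @ grad)   # sigma_y^2
```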
Figure 7.2 The response surface of the partition coefficient of a compound as a function of the composition of a binary extraction liquid (horizontal axis: fraction of mixture component 1, x1)

If the variance of a partition coefficient (which has been modelled with equation (3)) in a given mixture composition has to be estimated, the following derivation is made (y = ln P):

P = e^y = exp(β1x1 + β2x2 + β3x3 + β12x1x2 + β13x1x3 + β23x2x3 + β123x1x2x3)
Then, equations (13a), (13b) and (13c) (the partial derivatives of P with respect to x1, x2 and x3) become:

∂P/∂x1 = P·(β1 + β12x2 + β13x3 + β123x2x3)
∂P/∂x2 = P·(β2 + β12x1 + β23x3 + β123x1x3)
∂P/∂x3 = P·(β3 + β13x1 + β23x2 + β123x1x2)
Now, an equation for the variance of the partition coefficient of a compound due to variation in extraction liquid composition is derived.
The selectivity of an extraction system for two solutes i and j is the ratio of their partition coefficients (equation (8)). The variance of the selectivity is a function of the individual variances of the partition coefficients, which are due to small variations in the extraction liquid composition, and of the covariance σPi,Pj between these partition coefficients (equation (18)). The variances of the individual partition coefficients have been derived above.
The covariance of the partition coefficients can be estimated by the correlation between the tangent planes of the response surfaces in a given mixture composition. This is explained in the next part of this paragraph. The correlation of the slopes of the tangent planes of the response surfaces of ln Pi and ln Pj in the investigated mixture composition M is represented by r.
Then equation (20) for the variance of the selectivity is obtained from equation (19) and from r.
In other words, the robustness of this ratio is a function of the robustness of the individual partition coefficients Pi and Pj and of the parallelism of the tangent planes of the response surfaces in mixture composition M. If the tangent planes in M are more or less parallel, then Pi and Pj are approximately equally affected by a variation in the mixture composition, the correlation of the response surfaces in M is high and the variance in the selectivity is small. If the tangent planes in M have opposite slopes, then Pi and Pj are affected in an opposite way by a variation in the mixture composition, the correlation of the response surfaces in M will be low and the variance in the selectivity high. The calculation of the correlation r (i.e. the parallelism of the response surfaces) is outlined below. The tangent plane in M for a response surface fitted with ln P can also be described by using the adjusted mixture model (i.e. the restrictions of the mixture models are taken into consideration: 0 ≤ xi ≤ 1; x1 + x2 + x3 = 1) (equation (3) becomes equation (21a)):

y = β1x1 + β2x2 + β3(1 − x1 − x2) + β12x1x2 + β13x1(1 − x1 − x2) + β23x2(1 − x1 − x2) + β123x1x2(1 − x1 − x2)     (21a)
In other words, the tangent planes are now described by two variables x1 and x2. The adjusted partial derivatives (equations (13a), (13b) and (13c)) for x1 and x2 are now represented by equations (22a) and (22b):

∂y/∂x1 = β1 − β3 + β12x2 + β13(1 − 2x1 − x2) − β23x2 + β123x2(1 − 2x1 − x2)     (22a)
∂y/∂x2 = β2 − β3 + β12x1 − β13x1 + β23(1 − x1 − 2x2) + β123x1(1 − x1 − 2x2)     (22b)
The correlation between the slopes of the planes is expressed by the cosine of the angle between their normal vectors. The normal vector (norm) of a plane is the line perpendicular to the tangent plane. Here, this vector for i and j is described as follows:
The inner product of two vectors i and j is the product of the norm of i (= ||i||), the norm of j (= ||j||) and the cosine of the angle θ between i and j:

i · j = ||i|| · ||j|| · cos θ
The norms of i and j are:
The correlation r (or the cosine of the angle θ between the two vectors) then becomes:

r = correlation of the tangent planes = cos θ = (i · j) / (||i|| · ||j||)
Now, the criteria for the robustness of the partition coefficient of single compounds and of the selectivity of two compounds have been derived. The following part of this paper deals with the practical use of the criteria. In Figure 7.3, the partition coefficients, their ratio, the correlation of the response surfaces and the variances of the partition coefficients and the ratio are plotted for the extraction of two compounds i and j into a binary extraction liquid. The regression coefficients β1, β2 and β12 for Pi and Pj are 1.0, 3.0 and 4.0, and 3.0, 2.0 and 7.0, respectively. In the middle part of Figure 7.3 it can be seen that the closer the composition of the extraction liquid gets to 100% x1, the more parallel the response surfaces appear (i.e. the closer r comes to 1), with a maximum at composition 85% x1 / 15% x2. Variances of partition coefficients are largest between the maximum of a response surface and those compositions where there is no variation in the independent variables (100% x1 and 100% x2). The variance of the selectivity is largest (lower part of Figure 7.3) at the composition where the selectivity is optimal (20% x1). Figure 7.3 also demonstrates that negative values for the correlation between two response surfaces do not necessarily mean poor values for the robustness; the variability of the extraction liquid composition, and consequently the measured region in the mixture space, also plays an important role.
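The parallelism measure r can be sketched as the cosine of the angle between the normal vectors of the two tangent planes; taking the normal of the plane y = f(x1, x2) as (dy/dx1, dy/dx2, −1) is a common convention and an assumption of this sketch.

```python
import numpy as np

def tangent_plane_correlation(grad_i, grad_j):
    """Cosine of the angle between the normals of the tangent planes of ln Pi
    and ln Pj at a given mixture composition.

    grad_i, grad_j: the adjusted partial derivatives (dy/dx1, dy/dx2) of the
    two ln P surfaces at that composition (equations (22a) and (22b))."""
    ni = np.append(np.asarray(grad_i, dtype=float), -1.0)
    nj = np.append(np.asarray(grad_j, dtype=float), -1.0)
    return float(ni @ nj / (np.linalg.norm(ni) * np.linalg.norm(nj)))
```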
7.3 EXPERIMENTAL

7.3.1 Validation of robustness criteria by means of a comparison with a simulation experiment
In order to assess the validity of the robustness criteria, which were derived in the previous section, an algorithm was developed for the validation of the criteria. This algorithm consisted of two parts, the results of which were compared and consequently used to assess the robustness algorithm. The first part uses the robustness algorithm (i.e. the algorithm introduced and described in the preceding part of this paper) to calculate the variance of the partition coefficients P of two solutes i and j and the selectivities αij, using two preselected models of ln P and seven preselected extraction liquid compositions. The second part uses a computer program to generate partition coefficients P of two solutes i and j and consequently the selectivity αij, using the same two preselected models of ln P and the same seven preselected extraction liquid compositions. By varying the volume of the three components of the extraction liquid using a mean (the preselected extraction liquid compositions) and a standard deviation (the coefficient of variation of a dispensed volume), a normally distributed variance in the extraction liquid composition (noise) is obtained. For each preselected
Figure 7.3 Partition coefficients of two compounds i and j (Pi and Pj) in a binary extraction liquid, the selectivity αij and the correlation coefficient r, and the variances of P and α, all plotted against the fraction of component 1 in the mixture (panels: Partition Coefficient; Selectivity and Correlation Coefficient; Variance in P and Selectivity)
extraction liquid composition, 10^5 extraction liquid compositions were generated, using a predefined coefficient of variation in the volume of a component. Consequently, a variance in the partition coefficient and the selectivity is obtained. In detail, this simulation procedure was as follows:
1. Select a mixture composition with fractions x1, x2 and x3;
2. Select theoretical values of the volumes of the components of the extraction liquid composition, V1, V2 and V3, corresponding with the fractions x1, x2 and x3;
3. Use the computer program to generate values for V1, V2 and V3 with the normally distributed noise; V1*, V2* and V3* are obtained now;
4. Calculate the mixture composition x1*/x2*/x3* originating from V1*, V2* and V3*, which results in xi* = Vi* / (V1* + V2* + V3*);
5. Use the mixture composition generated in step 4 and the models of ln P to calculate values for Pi and Pj;
6. Calculate αij from Pi and Pj;
7. Repeat steps 3 to 6 10^5 times;
8. Calculate the means (Pi,s, Pj,s and αij,s) and the variances (s²Pi,s, s²Pj,s and s²αij,s) of Pi, Pj and αij (the subscript s refers to the simulated value);
9. Compare the values obtained in step 8 with the values obtained from the first part (Pi,c, Pj,c and αij,c, and the variances s²Pi,c, s²Pj,c and s²αij,c; the subscript c refers to the value calculated with the robustness algorithm).
Two separate comparisons were made, each using the same models and extraction liquid compositions, but different coefficients of variation (CV%) for the dispensed volumes of the pure extraction solvents. A small value of CV% was used (1%) and a large value (5%). The last value of 5% is assumed to be a maximum acceptable value; values higher than 5% are unacceptable, keeping in mind the present technological possibilities. The lower the value of the coefficient of variation, the more accurate the algorithm for the robustness will be. Hence, the values of the validation of the algorithm will be better in the case of lower coefficients of variation. The model coefficients for ln P of compounds i and j are given in Table 7.1. The models are, within the mixture triangle, partly parallel and partly perpendicular. Table 7.2 gives the extraction liquid compositions investigated. These compositions are equally spaced over the mixture triangle (Figure 7.4).
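A compact sketch of the simulation part of this procedure (steps 1-9) is given below; the volumes are taken numerically equal to the fractions (only their ratios matter), the generator seed and function name are arbitrary, and the original program was written in Pascal, not Python.

```python
import numpy as np

def simulate_variance(beta_i, beta_j, x, cv, n=10**5, seed=0):
    """Monte Carlo version of steps 1-9: perturb the dispensed volumes with a
    relative standard deviation cv, recompute the fractions, evaluate the ln P
    models and return means and variances of Pi, Pj and the selectivity."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    V = rng.normal(loc=x, scale=cv * x, size=(n, 3))      # noisy component volumes
    X = V / V.sum(axis=1, keepdims=True)                  # noisy fractions x1*, x2*, x3*
    terms = np.column_stack([X[:, 0], X[:, 1], X[:, 2],
                             X[:, 0]*X[:, 1], X[:, 0]*X[:, 2], X[:, 1]*X[:, 2],
                             X[:, 0]*X[:, 1]*X[:, 2]])
    Pi = np.exp(terms @ np.asarray(beta_i, dtype=float))
    Pj = np.exp(terms @ np.asarray(beta_j, dtype=float))
    a = np.minimum(Pi, Pj) / np.maximum(Pi, Pj)           # selectivity, always <= 1
    return {name: (v.mean(), v.var(ddof=1))
            for name, v in {"Pi": Pi, "Pj": Pj, "alpha": a}.items()}
```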
7.3.2 Selection of solvents
The composition of the extraction liquid is very important for the magnitude of the partition coefficient of a solute. Several quantities have been studied to describe solvent properties. Snyder [20,21] grouped the relative selectivity of solvents into solvent selectivity classification groups, each group formed according to proton donating, proton accepting and dipole interaction properties.
TABLE 7.1
REGRESSION MODELS OF LN Pi AND LN Pj FOR THE TWO CASES USED IN THE VALIDATION OF THE ROBUSTNESS ALGORITHM FOR THE PARTITION COEFFICIENT AND THE SELECTIVITY

        β1         β2         β3         β4         β5         β6         β7
ln Pi   0.000000   0.000000   0.000000   2.000000   2.000000   2.000000   6.000000
ln Pj   0.000000   1.000000   1.000000   3.000000   3.000000   6.000000   6.000000
Figure 7.4 Extraction liquid compositions used for the validation of the robustness algorithm (X1, X2 and X3: pure constituents of the extraction liquid: DCM, Clf and tBME, respectively)
TABLE 7.2
COMPOSITIONS OF THE EXTRACTION LIQUIDS USED FOR THE ALGORITHM VALIDATION EXPERIMENT AND FOR THE MODELLING OF THE EXTRACTION OF A NUMBER OF SULPHONAMIDES

Composition   fraction x1 (DCM)   fraction x2 (Clf)   fraction x3 (tBME)
1             1.000               0.000               0.000
2             0.000               1.000               0.000
3             0.000               0.000               1.000
4             0.500               0.500               0.000
5             0.500               0.000               0.500
6             0.000               0.500               0.500
7             0.333               0.333               0.333
Solvents used here for a general liquid-liquid extraction method were selected from Snyder's solvent selectivity triangle. As extraction liquids have to be composed of mixtures of three solvents which may enter into maximum interaction with the analyte, three solvents had to be selected that represent a wide variety of selective interactions. In addition, the solvents should be sufficiently polar to ensure quantitative extraction. Besides selectivity and polarity requirements, the solvents should also meet a few other criteria, mainly for practical reasons: they should not be miscible with water, have low boiling points (for relatively fast evaporation procedures) and have densities sufficiently different from the density of water, for pure solvents as well as for selected binary or ternary mixtures of solvents. The solvents selected were similar to the solvents that Glajch et al. [35] used for normal-phase liquid chromatography. Methyl tert.-butyl ether (a proton acceptor) was selected instead of ethyl ether, since the former is less volatile. The other two selected solvents were methylene chloride (dipole interactions) and chloroform (proton donor). These three solvents meet all practical requirements. The polarity P' [21] of the solvents is 2.5, 3.1 and 4.1, respectively.
7.3.3 The extraction of a group of sulphonamides from plasma
Instruments and Instrumental Conditions
The assay was performed with an HPLC-system consisting of a Spectra-Physics (Spectra Physics, San Jose, CA 95134, USA) model SP8700 solvent delivery system, used at a flow rate of 1.0 ml·min⁻¹, and a Kratos (Kratos Analytical Instruments, Ramsey, NJ 07446, USA) model 757 UV detector, wavelength 260 nm, range 0.005 aufs, rise time 1 second. Injections of extracts into a Zymark (Zymark Corporation Inc., Hopkinton, MA 01748, USA) Z 310 HPLC-injection station, equipped with an electrically controlled Rheodyne valve and a 20 µl sample loop, were performed by a Zymate II robot system. The Zymark Z 310 Analytical Instrument Interface was used to control the HPLC-injection station. Data analysis was performed by means of a Spectra Physics Chromjet SP4400 computing integrator. The analytical column was a Chrompack (Chrompack, 4330 EA Middelburg, The Netherlands) 100 x 4.6 mm Microsphere 3 µm C18 cartridge system. Mixing of plasma samples with a buffer was performed on a vortex mixer type VF2 (Janke und Kunkel GmbH, D-7813 Staufen, Germany), extraction of the samples with an extraction liquid was performed on a Heidolph Reax-2S tumble mixer (Heidolph, D-8420 Kelheim, Germany) and a Heraeus (Heraeus-Christ GmbH, D-3360 Osterrode am Harz, Germany) Labofuge GL was used for centrifuging. Calculations were performed on an IBM 486 computer under MS-DOS 5.0 using the home-made software package SOLEX (Systematical Optimisation of Liquid Extraction) written in Pascal.
Chemicals and Reagents
Sulphonamides were supplied by Sigma (Sigma Chemical Company, St. Louis, MO 63178, USA): Sulphisomidine (SOMI), Sulphathiazole (THIA), Sulphamethizole (METH), Phtalylsulphacetamide (FTAL), Sulphacetamide (ACET), Sulphapyridazine (PYRI), Sulphamerazine (MERA), Sulphamethoxypyridazine (MEPY) and Sulphachloropyridazine (CLPY). Acetonitrile (ACN), tetrahydrofuran (THF), methylene chloride (DCM) and methanol (MeOH) were supplied by Labscan (Labscan Limited, Unit T26, Dublin, Ireland) and were of HPLC grade. Chloroform ChromAR (Clf) was of analytical grade and supplied by Mallinckrodt (Promochem GmbH, D-4230 Wesel, Germany). Methyl tert.-butyl ether Uvasol (tBME), acetic acid (100%) (HAc), triethylamine (TEA), phosphoric acid
(85%), potassium dihydrogenphosphate (KH2PO4) and ammonium acetate were all of analytical grade and supplied by Merck (Merck, D-6100 Darmstadt, Germany). Water was purified by using a Milli-RO-4 and a Milli-Q water purification system (Millipore Corp., Bedford, MA 01730, USA). Unless otherwise stated, Milli-Q water quality was used. All blank plasma samples used in this study were obtained from a single pool of blank plasma. This was done in order to eliminate the effect that may be present as a consequence of the use of different plasma samples. An acetate buffer (pH=5.0; 0.5 M) was prepared by dissolving 3.85 g of ammonium acetate in 100 ml water. pH adjustment was performed using concentrated HAc. A phosphate buffer (pH=3.0; 0.05 M) was prepared by dissolving 6.80 g of KH2PO4 in 1000 ml of water. pH adjustment was performed using concentrated phosphoric acid. To this buffer 4.15 ml of TEA and 10 ml of HAc were added. The mobile phase was prepared by mixing 1 ml acetonitrile, 5 ml THF and 140 ml MeOH and adding phosphate buffer (pH=3.0; 0.05 M) to complete 1000 ml. Extraction solvents were composed according to Figure 7.4 and Table 7.2 using different mixtures of DCM (X1), Clf (X2) and tBME (X3). The stock solutions of sulphonamides were prepared by dissolving 100 mg of the compounds in 100 ml methanol. These solutions were stored at +4 °C and were used to prepare a standard solution (1 mg·l⁻¹). This solution was stored at +4 °C.
Analytical Procedure
An aliquot of 250 µl of plasma to be analysed and 250 µl of the standard solution were pipetted into an 11.5 ml glass tube. Then, 250 µl of acetate buffer solution was added and mixed for ten seconds on a vortex mixer. 9 ml of extraction liquid was added and the solution in the tubes was extracted on a Heidolph tumble mixer. A potential problem is present if the solvents used are mixed in different compositions: a composition can possibly be selected that has a density equal to the density of the aqueous layer. This may give rise to problems with the phase separation. After centrifugation at 4000 rpm for 10 minutes the organic layer was transferred to another glass tube of 11.5 ml and evaporated to dryness under a gentle stream of nitrogen at 55 °C. The residue of the extract was
reconstituted in 1 ml 50% MeOH; 20 µl of the solution was injected into the HPLC-system. For the determination of the absolute analytical recovery (= R = Φorg) of the compounds, the peak heights of prepared samples were compared to the mean peak height from seven direct injections of 10 µl of the standard solution into the HPLC-system. For correct determination of the recoveries and the partition coefficients, each tube was weighed separately before and after the extraction solvent dispensing step (to give win) and before and after phase separation (to give wout). Vin was calculated from win and from the density of the extraction liquid considered. This way the exact volumes used could be measured, which were used to calculate the partition coefficients:

Φorg = R = (peak height extraction / peak height direct injection) · (win / wout)     (27)

P = [R / (1 − R)] · (0.75 / Vin)     (28)

where 0.75 is the volume of the aqueous layer (ml) and Vin is the volume of the organic phase (ml) used for extraction (Vin = win/ρi; ρi is the density of the extraction liquid involved).
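A small helper corresponding to equations (27) and (28) could look as follows; the argument names are illustrative and the 0.75 ml aqueous volume is the value used in this assay.

```python
def recovery_and_partition(ph_extract, ph_direct, w_in, w_out, density,
                           v_aqueous=0.75):
    """Recovery and partition coefficient from the weighings (equations (27) and (28)).

    ph_extract, ph_direct : peak heights of the extracted sample and of the
                            direct injection of the standard solution
    w_in, w_out           : weights of dispensed and recovered extraction liquid (g)
    density               : density of the extraction liquid (g/ml)
    v_aqueous             : aqueous phase volume (ml)"""
    R = (ph_extract / ph_direct) * (w_in / w_out)
    v_in = w_in / density                   # organic phase volume actually used (ml)
    P = (R / (1.0 - R)) * (v_aqueous / v_in)
    return R, P
```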
7.4 RESULTS AND DISCUSSION

7.4.1 Validation of robustness criteria by means of a comparison with a simulation experiment
The response surfaces of the criteria introduced earlier, using the regression models of the artificial compounds in Table 7.1, are plotted in Figure 7.5a-f. From these plots it is confirmed that if pure extraction solvents are used, no variation due to variation in the mixture composition occurs in the partition coefficients Pi and Pj and the selectivity αij. Tables 7.3 and 7.4 list the results of the simulation of the variances of the partition coefficients for two different CV values, CV=1% and CV=5%. Table 7.5 lists the results of the variances of the selectivities for both cases (1% and 5%).
Figure 7.5a The response surface of the partition coefficient of the extraction of compound i used in the validation of the robustness algorithm
Figure 7.5b The response surface of the partition coefficient of the extraction of compound j used in the validation of the robustness algorithm
Figure 7.5c The response surface of the robustness of the partition coefficient of the extraction of compound i used in the validation of the robustness algorithm
Figure 7.5d The response surface of the robustness of the partition coefficient of the extraction of compound j used in the validation of the robustness algorithm
Figure 7.5e The response surface of the selectivity αij of the extraction of compound i and compound j used in the validation of the robustness algorithm
Figure 7.5f The response surface of the robustness of the selectivity αij of the extraction of compound i and compound j used in the validation of the robustness algorithm
TABLE 7.3
RESULTS OF THE VALIDATION OF THE ROBUSTNESS ALGORITHMS FOR THE PARTITION COEFFICIENTS Pi AND Pj FOR A MIXTURE COMPOSITION COEFFICIENT OF VARIATION OF 1%

x1      x2      x3      Pi,c     Pi,s     s²Pi,c   s²Pi,s   Pj,c      Pj,s      s²Pj,c   s²Pj,s
1.0000  0.0000  0.0000  1.00000  1.00000  0.00000  0.00000  1.00000   1.00000   0.00000  0.00000
0.0000  1.0000  0.0000  1.00000  1.00000  0.00000  0.00000  2.71828   2.71828   0.00000  0.00000
0.0000  0.0000  1.0000  1.00000  1.00000  0.00000  0.00000  2.71828   2.71828   0.00000  0.00000
0.5000  0.5000  0.0000  1.64872  1.64871  0.00000  0.00000  3.49034   3.49632   0.00015  0.00015
0.5000  0.0000  0.5000  1.64872  1.64872  0.00000  0.00000  3.49034   3.49442   0.00015  0.00015
0.0000  0.5000  0.5000  1.64872  1.64866  0.00000  0.00000  12.18249  12.18115  0.00000  0.00000
0.3333  0.3333  0.3333  2.42757  2.43237  0.00000  0.00000  9.19100   9.26667   0.00249  0.00252
TABLE 7.4
RESULTS OF THE VALIDATION OF THE ROBUSTNESS ALGORITHMS FOR THE PARTITION COEFFICIENTS Pi AND Pj FOR A MIXTURE COMPOSITION COEFFICIENT OF VARIATION OF 5%

x1      x2      x3      Pi,c     Pi,s     s²Pi,c   s²Pi,s   Pj,c      Pj,s      s²Pj,c   s²Pj,s
1.0000  0.0000  0.0000  1.00000  1.00000  0.00000  0.00000  1.00000   1.00000   0.00000  0.00000
0.0000  1.0000  0.0000  1.00000  1.00000  0.00000  0.00000  2.71828   2.71828   0.00000  0.00000
0.0000  0.0000  1.0000  1.00000  1.00000  0.00000  0.00000  2.71828   2.71828   0.00000  0.00000
0.5000  0.5000  0.0000  1.64872  1.64847  0.00000  0.00000  3.49034   3.52000   0.00381  0.00381
0.5000  0.0000  0.5000  1.64872  1.64861  0.00000  0.00000  3.49034   3.51019   0.00381  0.00379
0.0000  0.5000  0.5000  1.64872  1.64718  0.00000  0.00000  12.18249  12.14843  0.00000  0.00105
0.3333  0.3333  0.3333  2.42757  2.43105  0.00000  0.00001  9.19100   9.42645   0.06226  0.06312
TABLE 7.5
RESULTS OF THE VALIDATION OF THE ROBUSTNESS ALGORITHMS FOR THE SELECTIVITY FOR MIXTURE COMPOSITION COEFFICIENTS OF VARIATION OF 1% AND 5%

                                CV = 1%                                    CV = 5%
x1      x2      x3      αc       αs       s²α,c      s²α,s      αc       αs       s²α,c      s²α,s
1.0000  0.0000  0.0000  1.00000  1.00000  0.0000000  0.0000000  1.00000  1.00000  0.0000000  0.0000000
0.0000  1.0000  0.0000  0.36788  0.36788  0.0000000  0.0000000  0.36788  0.36788  0.0000000  0.0000000
0.0000  0.0000  1.0000  0.36788  0.36788  0.0000000  0.0000000  0.36788  0.36788  0.0000000  0.0000000
0.5000  0.5000  0.0000  0.47237  0.47156  0.0000028  0.0000028  0.47237  0.46832  0.0000697  0.0000702
0.5000  0.0000  0.5000  0.47237  0.47181  0.0000028  0.0000028  0.47237  0.46966  0.0000697  0.0000698
0.0000  0.5000  0.5000  0.13534  0.13535  0.0000000  0.0000000  0.13534  0.13559  0.0000000  0.0000000
0.3333  0.3333  0.3333  0.26412  0.26249  0.0000021  0.0000021  0.26412  0.25790  0.0000514  0.0000517
The tables show that the differences between the variances calculated with the help of the robustness algorithm and the values calculated after the simulation experiment are very small in the extraction liquids selected. For the 5% case these differences are more significant than for the 1% case, mainly due to the approximation in the algorithm. The mean difference between all calculated and simulated values for the partition coefficient is only 0.4%; for the selectivity this is 2.6%. This denotes that the algorithm used to calculate variances in the partition coefficients and selectivities due to small variations in the extraction liquid composition is a good approximation for these variances. Table 7.5 also shows that higher predefined coefficients of variation in the volume of the extraction liquid components give rise to slightly worse predictions of the variance in the selectivity. Generally speaking, it can be said that the algorithms developed and tested here give good approximations of the variances of the partition coefficients of compounds and of selectivities estimated with models obtained from mixture experimental designs.

7.4.2 The extraction of a group of sulphonamides
The mixture compositions in Table 7.2 did not give rise to any problems with phase separation due to equal densities of the organic and the aqueous layers: all organic solvent compositions used could be separated from the aqueous layer. The density and volume measurements after mixing different solvents showed that there was no influence of mixing on the density and volume of the produced liquid. The resulting volume (Vt) of the mixed liquid was equal to the sum of the individual volumes of the different extraction solvents. For a liquid composed of arbitrary volumes of methylene chloride (VDCM), chloroform (VClf) and methyl tert.-butyl ether (VtBME):

Vt = VDCM + VClf + VtBME
A linear relationship (r = 0.9999) was found to describe the density of the produced liquid (ρ) as a function of the individual densities (ρi) and the fractions (xi) of the different solvents in the liquid. For a liquid composed of n extraction solvents:

ρ = Σ(i = 1..n) xi·ρi
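These two relationships can be checked with a few lines of code. The sketch below is purely illustrative: the component volumes and densities are assumed placeholder values, not measurements from this study.

    # Illustrative check of volume and density additivity for a ternary
    # extraction liquid; the volumes (mL) and densities (g/mL) below are
    # assumed placeholder values, not data from this chapter.
    volumes = {"methylene chloride": 40.0, "chloroform": 30.0, "MTBE": 30.0}
    densities = {"methylene chloride": 1.33, "chloroform": 1.49, "MTBE": 0.74}

    v_total = sum(volumes.values())                               # Vt = sum of Vi
    fractions = {k: v / v_total for k, v in volumes.items()}      # xi
    rho_mix = sum(fractions[k] * densities[k] for k in volumes)   # rho = sum xi * rho_i

    print(v_total, round(rho_mix, 3))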
Due to the linearity of these properties it was not necessary to correct for mixing effects on density and volume. The mixture design optimisation techniques used here for the optimisation of liquid-liquid extractions of drugs from biological fluids have previously been tested [4,5]. The authors modelled the extraction of sulphonamides and tricyclic amines with regression models using both mixture variables (extraction liquid composition) and factorial variables (extraction time and extraction intensity). The final objective of that investigation was to study the applicability of combined mixture-factorial experimental designs for the simultaneous optimum choice of the extraction liquid composition (the mixture design part) and the extraction time and extraction intensity (the factorial design part). The optimisation criterion was to maximise the extraction efficiency of the extraction system and to minimise the time needed for a single extraction. The latter criterion was introduced for future application of a laboratory robot with serialised sample processing. Moreover, the ruggedness of the extraction system was evaluated for pairs of amines. It was concluded that mixtures of preselected organic solvents result in higher recoveries than pure solvents. The application of a factorial design incorporated in a mixture experimental design in the optimisation of liquid-liquid extraction of drugs from biological matrices gave good results for the extraction from plasma. Another conclusion was that recoveries can reasonably be optimised by mixing three solvents with different selective interactions. The introduction of two process variables (extraction time and intensity) and the simultaneous evaluation of these variables with the mixture variables make it possible to model the results of such combined experimental designs simultaneously. The use of fractional designs for the optimisation of liquid-liquid extraction of drugs from biological matrices results in a preselected, limited number of design points; the applicability of combined mixture-fractional experimental designs for the simultaneous optimisation of extraction liquid composition, extraction time and extraction intensity has been demonstrated. For the group of tricyclic amines it was shown that there is an interaction between the two process variables (extraction intensity and extraction time); a higher extraction intensity justifies a shorter extraction time. This may result in a distinct decrease in the time needed for an analytical run. Also, an interaction exists between the composition of the extraction liquid and the process variables: the extraction behaviour changes when these variables are varied simultaneously.
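To make the idea of a combined mixture-factorial model more concrete, the sketch below assembles one row of a combined model matrix in which Scheffé-type mixture terms for the three solvent fractions are crossed with two coded process variables. The choice of cross-terms and the variable names are assumptions made for illustration only; they are not the model actually fitted in [4,5].

    import numpy as np

    def combined_model_row(x1, x2, x3, z1, z2):
        # One row of an assumed combined mixture-process model matrix:
        # Scheffe linear mixture terms, their two-way blends, and the three
        # mixture fractions crossed with two coded process variables z1, z2.
        mixture = [x1, x2, x3, x1 * x2, x1 * x3, x2 * x3]
        crossed = [m * z for m in (x1, x2, x3) for z in (z1, z2)]
        return mixture + crossed

    # e.g. the centroid mixture extracted at the high level of both process variables
    row = combined_model_row(1 / 3, 1 / 3, 1 / 3, z1=1, z2=1)
    print(len(row), np.round(row, 3))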
Structurally related compounds demonstrate different extraction behaviours in a ternary liquid-liquid extraction system composed of methylene chloride, chloroform and methyl tert.-butyl ether. In investigations aimed at finding proper internal standards, one should take into account the extraction liquid used for the extraction of both analyte and internal standard. The above findings have led to the further exploration of the algorithms for the optimisation of drug extraction, the results of which have been presented in Section 7.2. Some of the sulphonamides investigated here have also previously been used in another robustness study [3,5]. The authors examined the correlation of the experimental errors of simultaneously extracted sulphonamides and introduced a method for the selection of the proper internal standard in combination with the proper composition of the extraction liquid. One of the findings was that the correlation of the experimental errors in the recoveries of structurally related compounds varies considerably. The extraction liquid chosen strongly affects the correlation of the errors in the recovery of analyte and internal standard. Therefore, the selection of an appropriate extraction liquid is very important for the development of accurate and reproducible assay methods. Selection of unsuitable extraction liquids may introduce errors in internal standard calibration that are larger than the errors in external standard calibration. It was also found that the choice of the internal standard is very important: even compounds that are structurally related to the analyte may demonstrate dissimilar extraction behaviour. The authors recommended selecting as internal standard a compound that is not only structurally related but also demonstrates strongly similar extraction behaviour in the selected liquid. It was concluded that, in general, internal standard calibration gives better results for liquid-liquid extraction than external standard calibration, but that there may be circumstances where external calibration is better. A method was developed for the selection of an extraction liquid and/or an internal standard in liquid-liquid extraction sample preparation prior to HPLC analysis. The quality of routine analysis was used as the selection criterion. This quality was approximated by simulation of 50 analytical runs under different conditions. The quality control results under these conditions were compared to select optimum extraction conditions. The method developed was also useful for the selection of the composition of an extraction liquid that gives the most robust
results for all recoveries and recovery ratios after the extraction of several analytes. The algorithms presented in this paper provide an alternative method for the selection of robust conditions for liquid-liquid extraction of drugs from biological fluids. The advantage of this new method is that response criteria are modelled. The models can subsequently be used to predict responses under conditions that have not been measured (by interpolation in the factor space). The regression models of the partition coefficients of the sulphonamides investigated are listed in Table 7.6. The models are used to design response surfaces of the criteria presented in Section 7.2. Not all possible response surfaces are given, since there are numerous possibilities. The most interesting response surfaces are shown and discussed. Figure 7.6a gives the response surface of the partition coefficient of sulphacetamide. It can be seen that the optimal extraction conditions for sulphacetamide are binary compositions of methylene chloride and methyl tert.-butyl ether. It can also be observed that the partition coefficient is nearly constant along the binary methylene chloride/chloroform axis. Therefore, small variations in binary compositions of methylene chloride and chloroform will not significantly change the partition coefficient. In other words: binary compositions of methylene chloride and chloroform yield robust extractions for sulphacetamide. This conclusion is confirmed by the robustness plot of the partition coefficient of sulphacetamide (Figure 7.6b). This plot also shows that under conditions where the partition coefficient is optimal (binary mixtures of methylene chloride and methyl tert.-butyl ether), the robustness of the partition coefficient reaches a maximum value.
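For readers who wish to reproduce such predictions, the short sketch below evaluates a Scheffé special cubic model of the form referred to in Table 7.6. The coefficients are taken from the first row of Table 7.6 and are here assumed to belong to sulphacetamide (ACET); the assignment of x1, x2 and x3 to methylene chloride, chloroform and methyl tert.-butyl ether, and the scale of the modelled response, follow equation (3) of this chapter and should be checked against it.

    def special_cubic(x1, x2, x3, b):
        # Scheffe special cubic model:
        # y = b1*x1 + b2*x2 + b3*x3 + b12*x1*x2 + b13*x1*x3 + b23*x2*x3 + b123*x1*x2*x3
        b1, b2, b3, b12, b13, b23, b123 = b
        return (b1 * x1 + b2 * x2 + b3 * x3
                + b12 * x1 * x2 + b13 * x1 * x3 + b23 * x2 * x3
                + b123 * x1 * x2 * x3)

    # First row of Table 7.6 (assumed: sulphacetamide, ACET)
    b_acet = (-3.12990, -3.92125, -3.49546, 0.82240, -0.55635, 0.62628, 7.57849)

    # Predict at a binary composition and at the overall centroid
    print(special_cubic(0.5, 0.0, 0.5, b_acet))
    print(special_cubic(1/3, 1/3, 1/3, b_acet))

Evaluating the model over a grid of compositions in the same way produces the response surfaces discussed below.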
Figure 7.6a The response surface of the partition coefficient of sulphacetamide (partition coefficient plot)
Figure 7.6b The response surface of the robustness of the partition coefficient of sulphacetamide (partition coefficient robustness plot)
TABLE 7.6
CALCULATED REGRESSION COEFFICIENTS AND MODEL VALIDATION CRITERIA OF THE SPECIAL CUBIC MODEL (EQUATION (3)) FOR THE LIQUID-LIQUID EXTRACTION OF 9 SULPHONAMIDES WITH THE RESULTS OF EXPERIMENTS 1 TO 7

Solute   β1         β2         β3         β12        β13        β23        β123
ACET     -3.12990   -3.92125   -3.49546    0.82240   -0.55635    0.62628     7.57849
METH     -1.23898   -1.62729   -2.46938    0.79869   -1.97672    0.86805     6.42470
FTAL     -0.66223   -0.56796   -1.50888    0.40112   -4.43728   -0.44674     8.22987
SOMI     -1.34641   -1.42228   -1.68144    0.12553    1.50000   -0.76535    -3.59415
THIA     -1.36930   -2.26967   -1.73014    1.12552    1.71948    0.85975    -2.01305
PYRI      0.12390    0.41029   -0.38426    2.99221   -1.06514   -1.40456    -6.73010
MERA      0.94746    0.59888   -0.14681    5.69229   -2.09237   -1.34413   -25.41497
MEPY      0.27072    0.18291   -0.32809    3.68176   -2.03753   -1.48145   -13.49709
CLPY      0.51784    0.06981    0.56366    0.59898   -3.29647   -2.08825   -10.12001
Figure 7.8a gives the selectivity αij of the extraction of sulphisomidine and sulphathiazole. A high value is obtained when extractions are performed with chloroform. The corresponding robustness plot of the selectivity is given in Figure 7.8b, which gives the value of the variance of the selectivity as a function of the extraction liquid composition. The worst (largest) values are obtained in compositions near the composition of maximum selectivity. Good (small) values are obtained at the maximum selectivity composition itself and in binary compositions of methylene chloride and methyl tert.-butyl ether. Figure 7.9 gives the minimal partition coefficient plot of the extraction of sulphapyridazine, sulphamerazine, sulphamethoxypyridazine and sulphachloropyridazine. It can be seen from this figure that the minimal partition coefficient plot is dominated by two of the four compounds, namely by sulphamerazine in the methylene chloride region and in the methyl tert.-butyl ether region and by sulphapyridazine in the chloroform region. Figure 7.10 represents the minimal selectivity plot after extraction of sulphisomidine, sulphathiazole, sulphamethizole and phthalylsulphacetamide. A maximum is obtained using a binary mixture of methyl tert.-butyl ether and methylene chloride. Minimal values are obtained using a binary mixture of methyl tert.-butyl ether and chloroform.
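The robustness algorithm itself is derived in Section 7.2 and is not repeated here. The sketch below only illustrates the general first-order error-propagation idea that underlies such a robustness plot: an assumed, uncorrelated standard deviation of the mixture fractions is propagated through a special cubic model and the result is checked with a small Monte Carlo simulation. The coefficients are again the first row of Table 7.6; the chapter's own algorithm additionally handles the constraint that the fractions sum to one, which is ignored here for simplicity.

    import numpy as np

    def model(x, b):
        # Scheffe special cubic model evaluated at composition x = (x1, x2, x3)
        x1, x2, x3 = x
        b1, b2, b3, b12, b13, b23, b123 = b
        return (b1*x1 + b2*x2 + b3*x3 + b12*x1*x2 + b13*x1*x3
                + b23*x2*x3 + b123*x1*x2*x3)

    b_acet = (-3.12990, -3.92125, -3.49546, 0.82240, -0.55635, 0.62628, 7.57849)
    x0 = np.array([1/3, 1/3, 1/3])
    sd = 0.01                      # assumed standard deviation of each fraction

    # First-order error propagation: var(y) ~ g' Sigma g, with a numerical gradient g
    eps = 1e-5
    g = np.array([(model(x0 + eps*np.eye(3)[i], b_acet)
                   - model(x0 - eps*np.eye(3)[i], b_acet)) / (2*eps) for i in range(3)])
    var_prop = float(g @ (sd**2 * np.eye(3)) @ g)

    # Monte Carlo check with the same simplified, uncorrelated perturbations
    rng = np.random.default_rng(0)
    sims = np.array([model(x0 + rng.normal(0, sd, 3), b_acet) for _ in range(5000)])
    print(var_prop, sims.var())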
Figure 7.7 The response surface of the partition coefficient of sulphachloropyridazine
Figure 7.8a The response surface of the selectivity αij of the extraction of sulphisomidine and sulphathiazole (selectivity plot)
Figure 7.8b The response surface of the robustness of the selectivity αij of the extraction of sulphisomidine and sulphathiazole (selectivity robustness plot)
Figure 7.9 The response surface of the minimal partition coefficient of the extraction of sulphapyridazine, sulphamerazine, sulphamethoxypyridazine and sulphachloropyridazine (minimal partition coefficient plot)
Figure 7.10 The response surface of the minimal selectivity of the extraction of sulphisomidine, sulphathiazole, sulphamethizole and phthalylsulphacetamide (minimal selectivity plot)
7.5 CONCLUSIONS
A new method has been developed for the selection of robust conditions for liquid-liquid extraction of drugs from biological fluids prior to chromatographic analysis. The method is an alternative to previously developed methods. The method uses algorithms that have been developed for the estimation of the robustness of partition coefficients and selectivities of drugs, which have previously been extracted from plasma according to a mixture experimental design. The algorithms give good predictions of these robustnesses, which was confirmed with a simulation experiment. The models obtained after extraction of nine sulphonamides in ternary mixtures of methylene chloride, chloroform and methyl tert.-butyl ether from spiked plasma samples have been used to design response surface plots of all the criteria involved.
ACKNOWLEDGEMENTS
The author wishes to thank Professor Henk de Snoo, University of Groningen, Department of Mathematics, and Professor Age K. Smilde, University of Amsterdam, Laboratory for Analytical Chemistry, for the mathematical and statistical support during the development of the robustness algorithms.
REFERENCES
[1] G. Taguchi and Y. Wu, Introduction to Off-Line Quality Control, Central Japan Quality Control Association, Nagoya, 1980.
[2] R.E. Majors, An overview of sample preparation, LC-GC International, 4 (1991) 10-14.
[3] J. Wieling, P.M.J. Coenegracht, C.K. Mensink, J.H.G. Jonkman and D.A. Doornbos, Selection of robust combinations of extraction liquid composition and internal standard: Monte Carlo simulation of improvement of methods with liquid-liquid extraction prior to HPLC, Journal of Chromatography, 594 (1992) 45-64.
[4] J. Wieling, H. Dijkstra, C.K. Mensink, J.H.G. Jonkman, P.M.J. Coenegracht, C.A.A. Duineveld and D.A. Doornbos, Chemometrics in bioanalytical sample preparation: a fractionated combined mixture and factorial design for the modelling of the recovery of five tricyclic amines from plasma after liquid-liquid extraction, Journal of Chromatography, 629(2) (1993) 181-199.
[5] J. Wieling, Liquid-Liquid Extraction of Drugs from Biological Fluids: Robotisation and Optimisation, thesis, University of Groningen, 1993.
[6] S.N. Deming and S.L. Morgan, Simplex optimization of variables in analytical chemistry, Analytical Chemistry, 45 (1973) 278A.
[7] C.H. Lochmüller, K.R. Lung and K.R. Cousins, Applications of optimization strategies in the design of intelligent laboratory robotic procedures, Analytical Letters, 18 (A4) (1985) 467.
[8] A.K. Smilde, A. Knevelman and P.M.J. Coenegracht, Introduction of multi-criteria decision making in optimization procedures for HPLC separations, Journal of Chromatography, 369 (1986) 1-10.
[9] S.A. Ahmed, C.A. Lau-Cam and S.M. Bolton, Factorial design in the study of the effects of selected liquid chromatographic conditions on resolution and capacity factors, Journal of Liquid Chromatography, 13(3) (1990) 525-541.
[10] C.H. Lochmüller and K.R. Lung, Application of laboratory robotics in spectrophotometric sample preparation and experimental optimization, Analytica Chimica Acta, 183 (1986) 257-262.
[11] P.H. Hoogkamer and J.H.M. van den Berg, Robuustheid van chromatografische methoden, LAB/ABC, November 1989, 12-14 (in Dutch).
[12] S.N. Deming, J.A. Palasota and J.M. Palasota, Experimental design in chemometrics, Journal of Chemometrics, 10 (1991) 181-192.
[13] H. Scheffé, Experiments with mixtures, Journal of the Royal Statistical Society, Series B, 20 (1958) 344-360.
[14] R.D. Snee, Experimenting with mixtures, Chemtech, November 1979, 702-710.
[15] J.W. Gorman and J.E. Hinman, Simplex lattice designs for multicomponent systems, Technometrics, 4 (1962) 463-487.
[16] P.M.J. Coenegracht, A.K. Smilde, H.J. Metting and D.A. Doornbos, Comparison of optimization methods in reversed-phase HPLC using mixture designs and MCDM, Journal of Chromatography, 485 (1989) 195.
[17] J. Wieling, J. Schepers, J. Hempenius, C.K. Mensink and J.H.G. Jonkman, Optimisation of chromatographic selectivity of 12 sulphonamides in reversed-phase high-performance liquid chromatography using mixture designs and multi-criteria decision making, Journal of Chromatography, 545 (1991) 101-114.
[18] J.A. Cornell, Experiments with mixtures: an update and bibliography, Technometrics, 21 (1979) 95-106.
[19] L. Rohrschneider, Solvent characterization by gas-liquid partition coefficients of selected solutes, Analytical Chemistry, 45 (1973) 1241.
[20] L.R. Snyder, Classification of the solvent properties of common liquids, Journal of Chromatography, 92 (1974) 223-230.
[21] L.R. Snyder, Classification of the solvent properties of common liquids, Journal of Chromatographic Science, 16 (1978) 223-234.
[22] C. Horvath, W. Melander and I. Molnár, Solvophobic interactions in liquid chromatography with nonpolar stationary phases, Journal of Chromatography, 125 (1976) 129.
[23] C. Horvath, W. Melander and I. Molnár, Liquid chromatography of ionogenic substances with nonpolar stationary phases, Analytical Chemistry, 49 (1977) 142.
[24] R. Tijssen, H.A.H. Billiet and P.J. Schoenmakers, Use of the solubility parameter for predicting selectivity and retention in chromatography, Journal of Chromatography, 122 (1976) 185.
[25] P.J. Schoenmakers, A Systematic Approach to Mobile Phase Effects in RP-HPLC, thesis, Delft, 1981.
[26] J.H. Hildebrand, J.M. Prausnitz and R.L. Scott, Regular and Related Solutions, Van Nostrand Reinhold, New York, 1970.
[27] J.W. Weyland, C.H.P. Bruins and D.A. Doornbos, Use of 3-dimensional α-plots for optimization of mobile phase composition for RP-HPLC separation of sulfonamides, Journal of Chromatographic Science, 22 (1984) 31-39.
[28] G. Schill, Separation Methods for Drugs and Related Compounds, Apotekarsocieteten, Stockholm, 1978, 12-13.
[29] R. Modin and G. Schill, Selective extraction of organic compounds as ion-pairs and adducts, Talanta, 22 (1975) 1017.
[30] J.H. de Boer, A.K. Smilde and D.A. Doornbos, Introduction of a robustness coefficient in optimization procedures: implementation in mixture design problems. Part I: Theory, Chemometrics and Intelligent Laboratory Systems, 7 (1990) 223-236.
[31] J.H. de Boer, A.K. Smilde and D.A. Doornbos, Introduction of a robustness coefficient in optimization procedures: implementation in mixture design problems. Part II: Some practical considerations, Chemometrics and Intelligent Laboratory Systems, 10 (1991) 325-336.
[32] J.H. de Boer, A.K. Smilde and D.A. Doornbos, Introduction of a robustness coefficient in optimization procedures: implementation in mixture design problems. Part III: Validation and comparison with competing criteria, Chemometrics and Intelligent Laboratory Systems, 15 (1992) 223-236.
[33] H.H. Ku, Notes on the use of propagation of error formulas, Journal of Research of the National Bureau of Standards, 70C (1966) 263-273.
[34] S.T. Balke, Quantitative Column Liquid Chromatography: A Survey of Chemometric Methods, Elsevier, Amsterdam, 1984, pp. 38-45.
[35] J.L. Glajch, J.J. Kirkland, K.M. Squire and J.M. Minor, Optimization of solvent strength and selectivity for reversed-phase LC using an interactive mixture-design statistical technique, Journal of Chromatography, 199 (1980) 57-79.
Chapter 8
THE USE OF A FACTORIAL DESIGN TO EVALUATE THE PHYSICAL STABILITY OF TABLETS AFTER STORAGE UNDER TROPICAL CONDITIONS
C.E. BOS
A.U.V. Veterinary Cooperation, P.O. Box 94, 5430 AB Cuijk, The Netherlands
G.K. BOLHUIS Department of Pharmaceutical Technology and Biopharmacy, University of Groningen, Antonius Deusinglaan 1, 9713 AV Groningen, The Netherlands
A.K. SMILDE
Laboratory for Analytical Chemistry, Nieuwe Achtergracht 166, 1018 WV Amsterdam, The Netherlands
J.H. DE BOER
Gasunie Research, P.O. Box 19, 9700 MA Groningen, The Netherlands
8.1 INTRODUCTION
In designing tablet formulations, the aim is to develop a formulation such that many quality requirements (chemical, physical as well as microbiological) are met, not only at the moment of preparation but also during the whole shelf life. The studies described in this chapter are restricted to the physical stability of tablets. Physical properties that are of interest are e.g. crushing strength, friability, disintegration time and dissolution profile. One of the causes of
changes in physical properties is water uptake or loss during storage [1, 2]. The extent of this effect depends, among others, on the amount of water present at the moment of preparation, the tablet composition (presence of hygroscopic materials, such as disintegrants) and the packaging and storage conditions (temperature and relative humidity). Especially when used in tropical countries, tablets are frequently exposed to extreme climate conditions, with both high temperatures and high relative humidities. This is especially the case when tablets are stored under consumers' conditions, since tablets are frequently dispensed in non-protective packaging or without any packaging at all. Several authors [3-7] have described the effect of storage conditions on the physical tablet parameters. All tablets under investigation in these studies contained a drug as well as several excipients (filler, binder, disintegrant, lubricant, glidant). The relative humidity during storage had a marked effect on the physical tablet parameters. However, none of these authors expressed the effects in such a way that they are generally applicable. York [8] studied the influence of the storage conditions of powders prior to tabletting on the physical stability of tablets. He found that the initial storage conditions, and thus the initial water content of the tablet ingredients, had a significant effect on the physical tablet stability. Chowhan [9] defined different pathways of physical instability of tablet formulations. These pathways may involve one or more complex physical processes, e.g. change in polymorphism, crystallization, vaporization and adsorption. These pathways, and thus the physical tablet parameters, are influenced by different types of variables: formulation variables (e.g. solubility and hygroscopicity), in-process variables (e.g. moisture content) and aging variables (e.g. temperature and relative humidity). In order to develop physically stable preparations, the effect of tablet composition as well as the effect of storage conditions on the behaviour of physical tablet properties during storage must be studied.

8.1.1 The use of experimental designs in tablet formulation
In the optimization of tablet formulations, different approaches can be used. The 'one variable at a time' method requires many experiments and there is no guarantee that an optimal formulation is achieved. Moreover, the interaction between different factors which may influence the tablet properties will not be detected [10]. The use of an experimental design can be helpful in the optimization of tablet formulations. Mixture designs can be used to describe the response (tablet properties) as a function of the
composition. After describing the response of the different tablet parameters of interest, a selection can be made of the composition at which the tablet formulation meets all requirements [11]. In a factorial design different variables can be varied independently. The effect of the variables, as well as the interaction between the variables, on the response can be described [12, 13]. Many papers have been published in which an experimental design is used to optimize tablet formulations. Both process variables and compositional variables (quantitative and qualitative) were used as independent (adjustable) variables. The studied dependent variables (factors to be optimized) are physical tablet properties directly after preparation, such as weight variation, crushing strength, dissolution profile, disintegration time and friability. Doornbos reviewed papers in which statistical methods were used to optimize tablet formulations [13]. Experimental designs have also been used to study the influence of tablet composition and storage conditions on the chemical stability (degradation of the drug) [e.g. 14-17].

8.1.2 The use of factorial designs in physical tablet stability studies
Only a few authors describe the use of a factorial design to study the influence of storage on the physical tablet parameters. Fenyvesi et al. [1], Vila-Jato et al. [2] and Sangekar et al. [18] studied the effect of adjustable variables on physical tablet properties after storage as a function of the storage conditions. Vila-Jato et al. included storage time in the factorial design as one of the adjustable variables. The absolute values of the physical parameters after storage were studied and no relation was made to the initial values of these tablet parameters. However, the main parameter of interest is the decrease or increase in the physical tablet properties and not the absolute value of these properties after storage. This can be expressed as the ratio of the value of a variable after storage and the initial value of the variable, expressed as a percentage, i.e. the Storage to Initial Ratio (SIR):
SIR(y) = (yt / y0) · 100%    (1)

where y0 and yt are the initial value of the parameter of interest and the value after storage, respectively. Section 8.2 describes the feasibility of the use of the Storage to Initial Ratio as a measure to express the physical stability of tablets. By expressing the stability as the relative value after storage related to the initial value, it is
possible to compare batches of tablets with varying initial values, with respect to their stability during storage. It may also be possible to use the SIR as a response in the optimization of tablet formulations by means of a mixture design. In section 8.3 a study is described in which the SIR of crushing strength and the SIR of disintegration time are used to evaluate batches of tablets prepared for several combinations of filler-binders and disintegrants, with respect to their physical stability after storage under tropical conditions.
8.2 THE USE OF THE RELATIVE CHANGE IN TABLET PARAMETERS IN A FACTORIAL DESIGN
In this section a method is described to assess the influence of formulation and aging variables on the physical stability of tablets. A factorial design was used in which the disintegrant concentration, the compression load, the storage temperature and the storage relative humidity were the independent, adjustable variables. The Storage to Initial Ratios of physical tablet parameters (SIR, equation (1)) were used as the dependent variables. By means of multiple linear regression, the effect of the adjustable variables on the dependent variables was calculated. The aim of this study was to evaluate whether the SIR is a helpful tool to evaluate the physical stability of tablets as a function of formulation, process and storage variables. The Storage to Initial Ratio of two tablet parameters, crushing strength (S) and disintegration time (D), was measured for a combination of one filler-binder (α-lactose monohydrate) and one disintegrant (rice starch), at three concentration levels. α-Lactose monohydrate is a cheap and commonly used filler-binder in the preparation of tablets by direct compression. Rice starch can be used as a disintegrant and it has also been proven to have good binding properties [19]. The influence of a compositional variable (disintegrant concentration), a process variable (compression load) as well as storage conditions (temperature and relative humidity), on the SIR of crushing strength and of disintegration time, was studied in a 3⁴ factorial design. A factorial design is a very general scheme to investigate the influence of independent variables on responses. When the response is normally distributed, statistics like the t-test can be used for inference purposes. The ratio SIR as a response can be analyzed with a factorial design, but some care must be taken in interpreting the results. Generally, an approximate
distribution of a ratio of two normally distributed variables can be derived, but this is very tedious [20, 21]. Hence, approximate variances of the ratio exist and t-tests can be used, but these tests must be interpreted as approximate tests.

8.2.1 Planning of the design
A 3⁴ factorial design was used for the 4 variables studied: disintegrant concentration (C), compression load (F), storage temperature (T) and storage relative humidity (R), at three levels each. Since the levels of each variable were not expressed in units of the same order, comparison of the extent of the contribution of the effect of the different variables was difficult. In order to be able to compare the effects of the different variables, the values of the variable levels were standardized (Table 8.1). The general form of the model which describes the effect of the variables is given by the following formula:
y = β0 + Σi βi·xi + Σi βii·xi² + ΣΣ(i<j) βij·xi·xj + ΣΣ(i≠j) βiij·xi²·xj + ΣΣΣ(i<j<k) βijk·xi·xj·xk + e    (2)
where y is the response, either crushing strength (S) or disintegration time (D), x1 represents the disintegrant concentration, x2 the compression load, x3 the storage temperature and x4 the storage relative humidity. With the aid of multiple linear regression, the model coefficients βi were calculated, which describe the effect of the variables on the tablet parameters and the interactions of the different variables. Since three levels of each variable were studied, it was possible to calculate not only the linear but also the quadratic contribution of the variables. The fourth-order and higher interactions were not included, because no effect is expected from these interactions. Since the aim of this study was to evaluate the use of the change in the tablet properties during storage as a parameter to express the physical stability of tablet formulations, the Storage to Initial Ratio of the crushing strength (SIR(S)) and of the disintegration time (SIR(D)) were calculated and used as response values for equation (2). The calculations of SIR(S) and SIR(D) were performed as in equation (1). The mean of the measurements directly after preparation was used as the initial value.
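A sketch of how such a model can be fitted is given below. It enumerates the 81 standardized factor combinations of the 3⁴ design, builds the columns of equation (2) up to the third-order terms reported in Tables 8.3-8.5, and estimates the coefficients by ordinary least squares. The response values are simulated placeholders, and the exact term set is an assumption based on the coefficients listed later in this chapter.

    import itertools
    import numpy as np

    levels = [-1, 0, 1]
    design = np.array(list(itertools.product(levels, repeat=4)))  # 81 runs: c, f, t, r

    def model_matrix(X):
        # Columns of equation (2): intercept, linear, quadratic, two-factor
        # interactions, xi^2*xj terms and three-factor interactions (assumed term set).
        n = X.shape[1]
        cols = [np.ones(len(X))]
        cols += [X[:, i] for i in range(n)]
        cols += [X[:, i] ** 2 for i in range(n)]
        cols += [X[:, i] * X[:, j] for i in range(n) for j in range(i + 1, n)]
        cols += [X[:, i] ** 2 * X[:, j] for i in range(n) for j in range(n) if i != j]
        cols += [X[:, i] * X[:, j] * X[:, k]
                 for i in range(n) for j in range(i + 1, n) for k in range(j + 1, n)]
        return np.column_stack(cols)

    M = model_matrix(design)

    rng = np.random.default_rng(1)
    y = 90 - 15 * design[:, 0] - 35 * design[:, 3] + rng.normal(0, 2, len(design))  # placeholder SIR data

    beta, *_ = np.linalg.lstsq(M, y, rcond=None)
    print(M.shape, np.round(beta[:5], 2))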
8.2.2 Tabletting, storage and measurements
The tablet ingredients used were α-lactose monohydrate (Ph.Eur. grade, 100 mesh), rice starch (Ph.Eur. grade) and magnesium stearate (Ph.Eur. grade). Before use the magnesium stearate was sieved through a 210 μm sieve. Prior to use, the materials were stored at 20 ± 1 °C and 45 ± 5% relative humidity (RH) for at least one week. Tablets of 250 mg with starch concentrations of 5%, 15% and 25% w/w, respectively, were prepared. Starch and lactose were mixed for 15 minutes in a Turbula mixer at a rotation speed of 90 rpm. Magnesium stearate (0.5% w/w) was added and mixing was continued for 2 minutes. The tablets were prepared on a single punch tabletting machine, using 9 mm flat punches. From each mixture tablets were produced at three compression load levels: 157, 314 and 472 MPa. From each batch tablets were stored at 9 different storage conditions: 3 temperatures * 3 relative humidities. The tablets were stored in open containers, placed in exsiccators with saturated salt solutions in climate chambers. The storage temperatures were 20 ± 1 °C, 30 ± 1 °C and 40 ± 1 °C. The saturated salt solutions used were, respectively, potassium carbonate (44 ± 5% RH), sodium chloride (75 ± 5% RH) and barium chloride (90 ± 5% RH). Both directly after tabletting and after 8 weeks of storage the crushing strength and the disintegration time of the tablets were measured. From each batch the crushing strength of 10 tablets was measured using a Schleuniger instrument. The disintegration time of 6 tablets from each batch was measured using a Ph.Eur. apparatus, with water (37 ± 1 °C) as the test fluid. The tests were performed without disks.
8.2.3 Results
The initial values of crushing strength (S0) and disintegration time (D0) of the nine batches of tablets (3 starch concentrations * 3 compression loads) are given in Table 8.2. Just as was expected, the tablets prepared with a compression load of 157 MPa and low starch concentrations show low values of crushing strength. Initially all tablets disintegrated within 1 minute.
TABLE 8.1
LEVELS OF THE VARIABLES AND THE NORMALIZED VARIABLES

Variable                                       Level
starch concentration % w/w (C)                   5      15     25
normalized starch concentration (c)             -1       0      1
compression load MPa (F)                       157     314    472
normalized compression load (f)                 -1       0      1
storage temperature °C (T)                      20      30     40
normalized storage temperature (t)              -1       0      1
storage relative humidity % RH (R)              44      75     90
normalized storage relative humidity (r)        -1    0.35      1
TABLE 8.2
INITIAL VALUES OF CRUSHING STRENGTH (S0) AND DISINTEGRATION TIME (D0)

starch concentration   compression load   S0^a (N)    D0^b (s)
(% w/w)                (MPa)
5                      157                 18 (3)      49 (4)
5                      314                 41 (3)      39 (2)
5                      472                 54 (4)      41 (2)
25                     157                 47 (10)     21 (1)
25                     314                102 (5)      18 (3)
25                     472                116 (4)      24 (1)
a mean (standard deviation); n = 10
b mean (standard deviation); n = 6
Crushing strength
The influence of standardization of the variables is demonstrated for the model of the initial crushing strength (S0), which can be expressed as a function of the compression load (F) and the starch concentration (C):

S0 = -22.4 - 3.3C + 0.14C² + 0.38F - 4.6·10⁻⁴F² + 5.2·10⁻³CF    (R²adj = 0.97)    (3)

Since the levels of the compression load and the starch concentration are not of the same order of magnitude, the effect of adjusting the concentration apparently is larger than the effect of adjusting the compression load. In order to allow for the comparison of the effects of adjusting the different variables, the values of the variables were standardized (Table 8.1). When the initial crushing strength (S0) is expressed as a function of the standardized starch concentration (c) and the standardized compression load (f), the effects of the compression load and the starch concentration are of the same order:

S0 = 56.3 + 25.4c + 14.1c² + 25.6f - 11.3f² + 8.1cf    (R²adj = 0.97)    (4)
If the Storage to Initial Ratio (SIR) is a measure to evaluate and compare the physical stability of tablets, then the SIR should be independent of the initial values of the tablets. In this study the initial values are varied by applying different levels of compression load. For each compression load level a model is calculated expressing the influence of the standardized, adjustable variables (disintegrant concentration, storage temperature and storage relative humidity) on the crushing strength and on the SIR of crushing strength after storage. These models are compared with each other. Finally the data are combined and a model is calculated in which the compression load is included as an adjustable variable. The observed values of crushing strength (S8) after storage for 8 weeks are shown in Figures 8.1a-c, for the batches of tablets prepared with compression load levels of 157, 314 and 472 MPa, respectively. From these figures a decrease in crushing strength can be seen with increasing storage relative humidity. The influence of the storage temperature on the crushing strength is small. Tablets with high starch concentrations are influenced more strongly by storage than tablets prepared with less disintegrant.
Figures 8.2a-c show the calculated values of the Storage to Initial Ratio of crushing strength after storage for 8 weeks (SIR(S)), for the tablets prepared with 157, 314 and 472 MPa compression load levels, respectively. The equation which describes the effect of the standardized, adjustable variables on the crushing strength or on the SIR of crushing strength after storage for 8 weeks, of tablets prepared at one compression load level, is:

y = β0 + Σi βi·xi + Σi βii·xi² + ΣΣ(i<j) βij·xi·xj + ΣΣ(i≠j) βiij·xi²·xj + β123·x1·x2·x3 + e    (5)
in which y is either the crushing strength (S8) or the SIR of crushing strength (SIR(S)) after storage, and x1 represents the standardized disintegrant concentration (c), x2 the standardized storage temperature (t) and x3 the standardized storage relative humidity (r). The values of the coefficients βi for the models of the crushing strength (S8) and the SIR of crushing strength (SIR(S)), respectively, are given in Tables 8.3 and 8.4, for batches of tablets prepared at the three compression load levels. When comparing the coefficients of the equations that describe the relation between the adjustable variables and the absolute crushing strength after storage or the SIR of crushing strength (Tables 8.3 and 8.4), the most striking difference is the effect of the starch concentration. The concentration coefficient (β1,S8) is larger than 0 in the models which describe the crushing strength after storage (Table 8.3). This means that tablets with increasing starch concentrations have an increasing crushing strength, even after storage. In contrast, the concentration coefficient (β1,SIR(S)) is smaller than 0 in the models which describe the SIR of crushing strength after storage (Table 8.4). This implies that tablets with higher starch concentrations show a larger decrease in crushing strength during storage than tablets with a lower starch concentration. This effect is not detected when the absolute value of the crushing strength after storage is used, because with increasing starch concentrations the initial crushing strength is so high that, even after storage at humid conditions, the resulting crushing strength is still higher than that of tablets prepared with a low starch concentration (Figures 8.1 and 8.2).
TABLE 8.3
COEFFICIENTS FOR THE EQUATION OF THE CRUSHING STRENGTH (S8) AFTER STORAGE

coefficient   variable    157 MPa   314 MPa   472 MPa
β0,S8         intercept     18.2      47.1      61.0
β1,S8         c              7.7      17.8      18.2
β11,S8        c²             9.7      15.5      12.7
β2,S8         t             -4.7      -2.5      -4.0
β22,S8        t²            -4.5      -4.7      -6.9
β3,S8         r            -12.0     -13.9     -18.7
β33,S8        r²              *         *         *
β12,S8        c*t           -2.1        *         *
β13,S8        c*r           -7.3     -12.9     -14.3
β23,S8        t*r            2.4        *         *
β123,S8       c*t*r           *         *         *
β122,S8       c*t²            *       -2.8      -4.6
β112,S8       c²*t           1.4        *         *
β133,S8       c*r²            *         *         *
β113,S8       c²*r           2.6      -3.4      -3.2
β233,S8       t*r²           5.1       7.0       8.7
β223,S8       t²*r            *       -3.1      -3.5
R²adj.                      0.82      0.95      0.97
* = not significant (1%)
TABLE 8.4
COEFFICIENTS FOR THE EQUATION OF THE SIR OF CRUSHING STRENGTH (SIR(S)) AFTER STORAGE

coefficient     variable    157 MPa   314 MPa   472 MPa
β0,SIR(S)       intercept     79.3      91.1      86.0
β1,SIR(S)       c            -16.1     -11.8     -12.9
β11,SIR(S)      c²            13.5       3.8       4.5
β2,SIR(S)       t               *         *       -5.1
β22,SIR(S)      t²           -15.4      -7.3      -8.1
β3,SIR(S)       r            -54.5     -28.4     -27.2
β33,SIR(S)      r²              *         *         *
β12,SIR(S)      c*t           -5.3        *         *
β13,SIR(S)      c*r          -11.9      -7.9      -7.4
β23,SIR(S)      t*r            5.4        *        1.8
β123,SIR(S)     c*t*r           *         *         *
β122,SIR(S)     c*t²            *         *         *
β112,SIR(S)     c²*t            *         *         *
β133,SIR(S)     c*r²          -2.6        *        4.1
β113,SIR(S)     c²*r          30.9       7.6       4.4
β233,SIR(S)     t*r²           2.8       6.9      11.1
β223,SIR(S)     t²*r            *       -4.4      -4.1
R²adj.                        0.76      0.93      0.95
* = not significant (1%)
If the models for the SIR of crushing strength for the batches of tablets prepared at different compression load levels are compared (Table 8.4), little difference is seen between the model coefficients for tablets prepared at 314 or 472 MPa compression load. The 157 MPa model shows some differences in the coefficients. Also, the fit of the model for 157 MPa is not as good as for the other models (R²adj = 0.76). This is attributed to the fact that the initial values of the crushing strength are very low, resulting in large relative changes after storage even if the absolute change is small. By combination of the data, two large models can be calculated which describe the effects of the four adjustable, standardized variables (temperature, relative humidity, starch concentration and compression load) on the crushing strength (S8) and the SIR of crushing strength after storage (SIR(S)), respectively:
S8 = 49.5 + 17.9c + 12.6c² + 22.5f - 11.0f² - 2.6t - 5.4t² - 15.8r + 4.9cf - 11.6cr + 1.5ft - 6.4fr + 1.4tr + 1.4cft - 3.2cf² - 1.3c²r - 1.4ft² - 1.8f²t + 1.6f²r + 7.1tr² - 2.4t²r    (R²adj = 0.96)    (6)
and
SIR(S) (%) = 91.0 - 15.9c + 7.2c² + 4.5f - 8.0f² - 6.8t - 10.6t² - 35.8r - 2.0ct - 9.2cr + 3.4fr + 3.2tr + 2.7cft + 2.2cfr + 4.1cf² - 6.0c²f + 14.3c²r + 3.4ft² - 4.2f²r + 12.3tr²    (R²adj = 0.83)    (7)
The coefficients that are not significant (1%) are not included in the equations. When comparing these equations, the major difference is the change in the coefficients which describe the effect of the starch concentration. The concentration has a positive effect (+17.9) on the crushing strength after storage (equation (6)), whereas the influence of the concentration on the SIR of crushing strength is less than 0 (-15.9, equation (7)). Another difference is that in equation (6), which describes the crushing strength, the coefficient for the compression load is significant, whereas in equation (7), which describes the SIR of crushing strength, the coefficient for the compression load is small compared to the coefficients for the relative humidity and the concentration effect.
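Equation (7) can be used directly to predict the SIR of crushing strength at any combination of standardized levels. The sketch below evaluates it at two illustrative settings (the chosen levels are examples, not conditions singled out in the text): 90% relative humidity and the middle temperature and compression load, for the high and the low starch concentration.

    # A quick numerical check of equation (7); the factor levels chosen here
    # are illustrative (Table 8.1: r = 1 is 90% RH, t = 0 is 30 degrees C,
    # f = 0 is 314 MPa, c = +1/-1 is 25%/5% starch).
    def sir_s(c, f, t, r):
        return (91.0 - 15.9*c + 7.2*c**2 + 4.5*f - 8.0*f**2
                - 6.8*t - 10.6*t**2 - 35.8*r
                - 2.0*c*t - 9.2*c*r + 3.4*f*r + 3.2*t*r
                + 2.7*c*f*t + 2.2*c*f*r + 4.1*c*f**2 - 6.0*c**2*f
                + 14.3*c**2*r + 3.4*f*t**2 - 4.2*f**2*r + 12.3*t*r**2)

    print(sir_s(c=1, f=0, t=0, r=1))    # 25% starch: about 51.6%
    print(sir_s(c=-1, f=0, t=0, r=1))   # 5% starch: about 101.8%

At the high starch level the predicted SIR is roughly 52%, against roughly 102% at the low starch level, which illustrates the concentration effect discussed above.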
Figure 8.1a Crushing strength after storage of tablets prepared with 157 MPa compression load
Figure 8.1b Crushing strength after storage of tablets prepared with 314 MPa compression load
Figure 8.1c Crushing strength after storage of tablets prepared with 472 MPa compression load.
Figure 8.2a Storage to Initial Ratio of crushing strength (SIR(S)) after storage of tablets prepared with 157 MPa compression load.
Figure 8.2b Storage to Initial Ratio of crushing strength (SIR(S)) after storage of tablets prepared with 314 MPa compression load.
Figure 8.2c Storage to Initial Ratio of crushing strength (SIR(S)) after storage of tablets prepared with 472 MPa compression load.
In other words, during storage the crushing strength of the tablets used in this study (prepared with α-lactose monohydrate and rice starch) decreases more with an increase in both starch concentration and storage relative humidity. The effects of the storage temperature and compression load level are slight compared to those of the starch concentration and the relative humidity. It is also seen that the effect of the relative humidity at storage depends on the level of the starch concentration and vice versa. Since the compression load has an effect on the physical stability of tablets, this variable should be considered when the SIR is used as a response in the development of tablet formulations that are physically stable, e.g. by means of mixture designs. However, it should be kept in mind that, in the end, the compression load is subordinate to the initial crushing strength and that, for other, practical reasons, the initial crushing strength usually is adjusted in the range of 40-90 N. Within this range of initial values, the SIR of crushing strength can be helpful in comparing different batches of tablets with respect to their susceptibility to storage conditions.

Disintegration time
The initial disintegration time (D0) can be expressed as a function of the standardized starch concentration (c) and the normalized compression load (f):

D0 = 22.5 - 10.6c + 4.2c² - 1.1f + 6.8f² + 2.7cf    (R²adj = 0.93)    (8)
Just as may be expected, the initial disintegration time of α-lactose monohydrate/starch tablets is influenced by the starch concentration. The compression load level has a small effect on the initial disintegration time. The effect of the concentration is influenced by the level of the compression load. For the disintegration time two models were calculated, which describe the effect of the four variables on the disintegration time and the SIR of disintegration time after storage of lactose/rice starch tablets:

y = β0 + Σi βi·xi + Σi βii·xi² + ΣΣ(i<j) βij·xi·xj + ΣΣ(i≠j) βiij·xi²·xj + ΣΣΣ(i<j<k) βijk·xi·xj·xk + e    (9)
in which y is either the disintegration time (D8) or the Storage to Initial Ratio of disintegration time (SIR(D)) after storage for 8 weeks, and x1 represents the standardized disintegrant concentration (c), x2 the standardized compression load (f), x3 the standardized storage temperature (t) and x4 the standardized storage relative humidity (r). The values of the coefficients (βj) for these models are given in Table 8.5. As expected, an increase in the starch concentration causes a decrease in the disintegration time, even after storage: the value of the coefficient (β1,D8) is negative (-7.7). However, after calculating the SIR of disintegration time, the coefficient of the concentration (β1,SIR(D)) becomes positive (+21.7): tablets with higher starch concentrations have a higher Storage to Initial Ratio for the disintegration time than tablets with lower starch concentrations. The compression load has no significant effect on the disintegration time after storage (β2,D8 is not significant). However, when the SIR of disintegration time is used, the compression load has a large and positive effect (β2,SIR(D) = 11.1); the SIR of disintegration time increases with an increasing compression load level. Surprisingly, the relative humidity (β4,SIR(D)) has no significant effect on the SIR of disintegration time. However, the relative humidity effect is present in many interactions. In both models, which describe the disintegration time and the SIR of disintegration time, many significant interactions between the factors can be detected. Some third-order interaction coefficients (e.g. β122,SIR(D), β114,SIR(D) and β344,SIR(D)) are large. Since the SIR of disintegration time after storage depends on the compression load as well as on the disintegrant concentration, these factors should both be taken into account when the SIR of disintegration time is used as a measure for the physical stability of tablets after storage, in the development stage of tablet formulations.

8.2.4 Conclusions
The four studied adjustable factors (disintegrant concentration, compression load level, storage temperature and storage relative humidity) all influence the Storage to Initial Ratio of the crushing strength and the disintegration time of the tablets investigated in this study. Since the compression load has an influence on the physical stability of tablets, this variable should be considered when the SIR is used as a response in the development of tablet formulations that are physically stable. However, it should be kept in mind that, in the end, the compression load is subordinate to the initial tablet parameters and that the applied
compression load can be used to adjust the initial tablet parameters to relevant levels. Nevertheless, the use of the Storage to Initial Ratio of tablet parameters can be useful in evaluating different batches of tablets with more or less the same initial values, with respect to their physical stability during storage.
TABLE 8.5
COEFFICIENTS FOR THE EQUATION OF THE DISINTEGRATION TIME (D8) AND SIR OF DISINTEGRATION TIME (SIR(D)) AFTER STORAGE

coefficient   variable    D8       SIR(D)
β0            intercept    8.2      60.2
β1            c           -7.7      21.7
β11           c²          30.6      82.4
β2            f             *       11.1
β22           f²           2.7     -21.6
β3            t            9.7      26.8
β33           t²           7.9      22.7
β4            r            9.7      31.1
β44           r²            *         *
β12           c*f         -8.9     -26.4
β13           c*t           *       11.2
β14           c*r        -16.0     -29.4
β23           f*t          3.0      10.5
β24           f*r          4.5      11.0
β34           t*r           *         *
* = not significant (1%)
TABLE 8.5 (CONTINUED)
COEFFICIENTS FOR THE EQUATION OF THE DISINTEGRATION TIME (D8) AND SIR OF DISINTEGRATION TIME (SIR(D)) AFTER STORAGE

coefficient   variable    D8       SIR(D)
β123          c*f*t         *         *
β124          c*f*r       -8.4     -25.8
β134          c*t*r         *         *
β234          f*t*r         *       -6.8
β122          c*f²        -6.9     -31.9
β112          c²*f        12.4      29.1
β133          c*t²        -7.4     -13.5
β113          c²*t         6.3      23.5
β144          c*r²        -4.4        *
β114          c²*r        18.8      51.7
β233          f*t²         9.5       2.6
β223          f²*t          *         *
β244          f*r²          *         *
β224          f²*r          *         *
β344          t*r²       -11.5     -29.4
β334          t²*r         5.5      16.5
R²adj.                    0.91      0.86
* = not significant (1%)
8.3 SELECTION OF EXCIPIENTS SUITABLE FOR USE IN TROPICAL COUNTRIES
In this section a study is described in which tablets prepared with binary blends of a filler-binder and a disintegrant are evaluated with respect to their physical stability after storage under tropical conditions. With the results of this study, a selection can be made of excipients that are suitable for use in tropical countries. Tablet formulations can then be developed with the selected excipients. Tablets were prepared either with an insoluble (dicalcium phosphate dihydrate), a soluble (β-lactose) or a moderately soluble filler-binder (α-lactose monohydrate). As a disintegrant four different starches (corn, rice, potato and tapioca) were used. As a comparison, the effect of two 'super-disintegrants' (crospovidone and sodium starch glycolate) was studied. The disintegrants were added at two concentration levels. The compression load was adjusted in order to obtain tablets with comparable initial crushing strengths. Tablets from each combination of filler-binder and disintegrant were stored under different storage conditions. After storage the crushing strength and the disintegration time were measured. The influence of the standardized storage temperature and standardized relative humidity, as well as the standardized disintegrant concentration, on the physical tablet stability (SIR of crushing strength and of disintegration time) was calculated, as described in Section 8.2. The storage conditions were derived from the climate zones into which the world is divided for stability testing [22, 23]. For each zone the kinetic average temperature and the average relative humidity can be calculated. The humid tropical and the dry tropical climate can be represented by a kinetic average temperature of 31 °C and an average relative humidity of 75%, and by 31 °C and 45% RH, respectively. The kinetic average temperature of the moderate climate zone is 20 °C. The average relative humidity in the moderate climate zone is less than 60%. The storage conditions used in this study are 20 °C and 31 °C, and 44% and 75% relative humidity (RH).

8.3.1 Planning of the design
For each combination of filler-binder and disintegrant a 2³ factorial design was used for the 3 variables studied: disintegrant concentration (C), storage temperature (T) and storage relative humidity (R), at two levels each. Since the levels of each variable were not expressed in units of the same order of
magnitude, the contributions of the different variables could only be compared after the values of the variable levels had been standardized (Table 8.6).
TABLE 8.6
LEVELS OF THE VARIABLES AND THE NORMALIZED VARIABLES

variable                                           level
starch concentration % w/w (C)                      10     20
normalized starch concentration (c)                 -1      1
super-disintegrant concentration % w/w (C)           2      4
normalized disintegrant concentration (c)           -1      1
storage temperature °C (T)                          20     31
normalized storage temperature (t)                  -1      1
storage relative humidity % RH (R)                  44     75
normalized storage relative humidity (r)            -1      1
With the aid of multiple linear regression, model coefficients were calculated which describe the effect of the variables on the physical stability of the tablets. Since two levels of each variable were studied, it was possible to calculate the linear contribution of the variables. The general form of the model which describes the effect of the variables is given by the following formula:

Y = β0 + β1x1 + β2x2 + β3x3 + β4x1x2 + β5x1x3 + β6x2x3 + β7x1x2x3    (10)

in which x1, x2 and x3 are respectively the standardized disintegrant concentration (c), the standardized storage temperature (t) and the standardized relative humidity (r). The response Y is either the Storage to Initial Ratio of the crushing strength (SIR(S)) or of the disintegration time (SIR(D)) after 8 weeks of storage. Since the aim of this study was to detect only the major effects, the significance of the coefficients was tested at the 1% level.
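A compact way to obtain the eight coefficients of equation (10) from a single replicate of the 2³ design is sketched below. Because the coded design matrix is orthogonal, ordinary least squares reduces to dividing each contrast by eight. The response values used here are placeholders, not data from this study.

    import itertools
    import numpy as np

    # 2^3 factorial in coded variables: c (concentration), t (temperature), r (RH)
    runs = np.array(list(itertools.product([-1, 1], repeat=3)))
    c, t, r = runs.T

    # Model matrix of equation (10): intercept, main effects, interactions
    X = np.column_stack([np.ones(8), c, t, r, c*t, c*r, t*r, c*t*r])

    y = np.array([105, 90, 100, 70, 102, 88, 95, 60], float)  # placeholder SIR(S) values (%)

    beta = X.T @ y / 8.0   # orthogonal design: least squares reduces to contrasts / 8
    for name, b in zip(["b0", "b1(c)", "b2(t)", "b3(r)", "b4(ct)", "b5(cr)", "b6(tr)", "b7(ctr)"], beta):
        print(name, round(b, 2))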
The Storage to Initial Ratio (SIR(y)) was calculated as:

SIR(y) = (ys / y0) · 100%    (11)

in which ys is the response after 8 weeks of storage and y0 the average of the response measurements directly after preparation.

8.3.2 Tabletting, storage and measurements
The tablet filler-binders used were α-lactose monohydrate 100 mesh, anhydrous β-lactose and dicalcium phosphate dihydrate. The tablet disintegrants used were corn starch (Ph.Eur. grade), rice starch (Ph.Eur. grade), tapioca starch, potato starch (Ph.Eur. grade), sodium starch glycolate and crospovidone. Magnesium stearate (Ph.Eur. grade) was used as lubricant and sieved through a 150 μm sieve prior to use. Before use all materials were stored at 20 ± 1 °C and 45 ± 5% RH for at least one week. Tablets were prepared from binary blends of a filler-binder with a disintegrant. Tablets prepared with starch contained 10% or 20% w/w of the starch used. Tablets prepared with a 'super-disintegrant' contained 2% or 4% w/w of the disintegrant. The disintegrant and the filler-binder were mixed for 15 minutes in a Turbula mixer at a rotation speed of 90 rpm. Magnesium stearate (0.5% w/w) was added and mixing was continued for 2 minutes. Tablets (250 mg) were prepared on a single punch tabletting machine using 9 mm flat punches. In order to obtain tablets with initial crushing strengths in the range of 40-90 N, the compression load level was adjusted. Tablets prepared with β-lactose were produced at a compression load level of 157 MPa. Dicalcium phosphate dihydrate and α-lactose monohydrate tablets were prepared at compression load levels of 314 and 472 MPa, respectively. From each batch tablets were stored at 4 different storage conditions: 2 temperatures * 2 relative humidities. The tablets were stored in open containers in exsiccators with saturated salt solutions in climate chambers. The storage temperatures were 20 ± 1 °C and 31 ± 1 °C. The saturated salt solutions used were potassium carbonate (44 ± 5% RH) and sodium chloride solution (75 ± 5% RH). Directly after compression and after storage during different time periods (1, 2, 4 and 8 weeks), the crushing strength and the disintegration time of the tablets were measured. From each batch the crushing strength of 10 tablets was measured using a Schleuniger instrument. The disintegration time of 6 tablets from each batch was measured using a Ph.Eur. apparatus, with water (37 ± 1 °C) as the test fluid. The tests were performed without disks.
8.3.3 Results
The four starches used in this study have been proven to be suitable tablet ingredients [19]. Both 'super-disintegrants' used, Primojel® (sodium starch glycolate) and Polyplasdone® XL (crospovidone), are commonly used as a disintegrant in tablets prepared by direct compression. The initial values of crushing strength (S0) and disintegration time (D0) of the tablets are given in Table 8.7. From the results in Section 8.2 it can be concluded that the SIR of crushing strength can be used if the initial values of the crushing strength are of the same order of magnitude. In order to obtain batches of tablets with crushing strengths of the same order of magnitude, the compression load levels were adjusted to 157 MPa for the β-lactose tablets, 314 MPa for the dicalcium phosphate dihydrate tablets and 472 MPa for the α-lactose monohydrate tablets. However, due to the excellent binding properties of rice starch [19], tablets prepared with rice starch had a higher initial crushing strength. The initial disintegration times of the tablets prepared with α-lactose were excellent: all tablets disintegrated within 2 minutes. The β-lactose tablets showed longer, but still adequate, disintegration times. The difference between the effectiveness of sodium starch glycolate and crospovidone in β-lactose tablets was previously studied by Van Kamp et al. [24] and ascribed to the higher capillary action of crospovidone. However, some of the batches of tablets prepared with dicalcium phosphate dihydrate had long disintegration times. The tablets prepared with dicalcium phosphate and potato starch showed the longest disintegration times, with large variation in the results. This is in accordance with previous findings [19, 25]. Bolhuis et al. [25] found that potato starch interacted with the hydrophobic lubricant. This is in contrast to sodium starch glycolate, which promotes a chain reaction of disruption from the outside of the tablet and subsequently a fast disintegration. Bos et al. [19] found that, due to the relatively large particle size of potato starch as compared to rice starch, no continuous hydrophilic network can be formed, resulting in long disintegration times of dicalcium phosphate/potato starch tablets. The reduced efficiency of crospovidone at the low concentration in dicalcium phosphate tablets can be attributed to the fact that the amount is too low for the formation of a continuous hydrophilic network [24]. The crushing strength and the disintegration time were measured after 1, 2, 4 and 8 weeks of storage, at the different conditions. Figures 8.3a and 8.3b show the SIR of crushing strength and of disintegration time of
Figures 8.3a and 8.3b show the SIR of crushing strength and of disintegration time of α-lactose monohydrate/tapioca starch tablets (20% w/w disintegrant) as a function of the storage time. These figures are representative of the behavior of the SIR of crushing strength and of disintegration time for all tablet formulations during storage. The major changes in the crushing strength and in the disintegration time took place within 2 weeks of storage. In order to be sure that no more major changes would take place, in this study the physical stability was expressed as the SIR of crushing strength or of disintegration time after 8 weeks of storage. This was calculated according to equation (11).
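Equation (11) is defined earlier in this chapter. Assuming, as the name and the percentage scale of Figures 8.3a and 8.3b suggest, that the Storage to Initial Ratio is simply the value of the property after storage divided by its initial value and expressed as a percentage, the calculation amounts to the sketch below. The numbers are illustrative only and are not taken from Table 8.7.

```python
def storage_to_initial_ratio(stored_value, initial_value):
    """Storage to Initial Ratio (SIR) in percent.

    Assumed form of equation (11): SIR = 100 * (value after storage) / (initial value).
    """
    return 100.0 * stored_value / initial_value

# Illustrative values only (not study data):
S0, S8 = 62.0, 68.0   # crushing strength (N), initial and after 8 weeks
D0, D8 = 35.0, 28.0   # disintegration time (s), initial and after 8 weeks
print(f"SIR(S) = {storage_to_initial_ratio(S8, S0):.0f}%")   # > 100%: tablets became harder
print(f"SIR(D) = {storage_to_initial_ratio(D8, D0):.0f}%")   # < 100%: faster disintegration
```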
TABLE 8.7
INITIAL VALUES OF CRUSHING STRENGTH (S0) AND DISINTEGRATION TIME (D0) OF TABLETS CONSISTING OF A FILLER-BINDER AND A DISINTEGRANT

Combination of filler-binder        Concentration disintegrant    S0ᵃ (N)    D0ᵇ (s)
and disintegrant                    (% w/w)
α-Lactose monohydrate/
  Maize starch                      10, 20
  Potato starch                     10, 20
  Rice starch                       10, 20
  Tapioca starch                    10, 20
  Na-starch glycolate               2, 4
  Crospovidone                      2, 4
β-Lactose/
  Maize starch                      10, 20
  Potato starch                     10, 20
  Rice starch                       10, 20
  Tapioca starch                    10, 20
  Na-starch glycolate               2, 4
  Crospovidone                      2, 4
(S0 and D0 values for these combinations are not legible in this copy)
TABLE 8.7 (CONTINUED)
INITIAL VALUES OF CRUSHING STRENGTH (S0) AND DISINTEGRATION TIME (D0) OF TABLETS CONSISTING OF A FILLER-BINDER AND A DISINTEGRANT

Combination of filler-binder        Concentration disintegrant    S0ᵃ (N)     D0ᵇ (s)
and disintegrant                    (% w/w)
Dicalcium phosphate dihydrate/
  Maize starch                      10                            69 (4)      44 (33)
                                    20                            51 (3)      13 (2)
  Potato starch                     10                            63 (10)     932 (383)
                                    20                            46 (3)      239 (166)
  Rice starch                       10                            82 (4)      33 (9)
                                    20                            122 (6)     23 (5)
  Tapioca starch                    10                            47 (3)      65 (15)
                                    20                            44 (3)      21 (3)
  Na-starch glycolate               2                             53 (3)      62 (45)
                                    4                             47 (5)      10 (2)
  Crospovidone                      2                             44 (3)      430 (231)
                                    4                             37 (3)      6 (1)
ᵃ mean (standard deviation), n = 10
ᵇ mean (standard deviation), n = 6
Figure 8.3a Storage to Initial Ratio of crushing strength (SIR(S)) of α-lactose monohydrate/tapioca starch tablets (20% w/w disintegrant) as a function of the storage time, at the four storage conditions (20 °C and 31 °C; 44% and 75% RH)
Figure 8.3b Storage to Initial Ratio of disintegration time (SIR(D)) of α-lactose monohydrate/tapioca starch tablets (20% w/w disintegrant) as a function of the storage time, at the four storage conditions (20 °C and 31 °C; 44% and 75% RH)
TABLE 8.8
COEFFICIENTS FOR THE EQUATION OF THE SIR OF CRUSHING STRENGTH (SIR(S)) AFTER STORAGE

SIR(S)(%) = β0,SIR(S) + β1,SIR(S)·c + β2,SIR(S)·t + β3,SIR(S)·r + β4,SIR(S)·ct + β5,SIR(S)·cr + β6,SIR(S)·tr + β7,SIR(S)·ctr

Columns (one per excipient combination): LACS, LAPS, LARS, LATS, LAPR, LAPV, LBCS, LBPS, LBRS, LBTS, LBPR, LBPV, DICS, DIPS, DIRS, DITS, DIPR, DIPV
Rows: β0,SIR(S) to β7,SIR(S) and R²adj
* = not significant (1%)
LA = α-Lactose monohydrate   LB = β-Lactose   DI = Dicalcium phosphate dihydrate
CS = Corn starch   PS = Potato starch   RS = Rice starch   TS = Tapioca starch   PR = Na-starch glycolate   PV = Crospovidone
(the individual coefficient values are not legible in this copy)
Crushing strength

The influence of the adjustable variables (disintegrant concentration as well as storage temperature and relative humidity) on the SIR of crushing strength (SIR(S)) was calculated for each combination of disintegrant and filler-binder. This was expressed as in equation (10). The coefficients of the equations for the different combinations of excipients are given in Table 8.8.

The intercept in the equations (β0,SIR(S)) represents the overall SIR of crushing strength, i.e. the physical stability, of the tablets after storage. For tablets prepared with dicalcium phosphate dihydrate, β0,SIR(S) is always smaller than 100, indicating that overall the tablets showed a decrease in crushing strength after storage. In contrast, for the tablets prepared with β-lactose, β0,SIR(S) is larger than 100, indicating that the overall tablet strength increased after storage. This effect has been described by Bolhuis et al. [26] for β-lactose tablets, prepared with 0.5% magnesium stearate, after storage at 20 °C and 85% RH. It might be explained by the fact that part of the β-lactose dissolves in adsorbed water and recrystallizes as the less soluble α-lactose monohydrate. The overall crushing strength of tablets prepared with α-lactose monohydrate was not influenced by storage (β0,SIR(S) = 100). This was also seen by Bolhuis et al. [25].

Tablets prepared with dicalcium phosphate dihydrate increased in crushing strength with increasing temperature (β2,SIR(S) > 0). The relative humidity had a negative effect on the SIR of crushing strength of the tablets prepared with dicalcium phosphate dihydrate, except for the tablets prepared with potato starch. A significant interaction between the temperature and the relative humidity was also seen (β6,SIR(S) ≠ 0), indicating that the effect of the relative humidity on the SIR of crushing strength of dicalcium phosphate dihydrate tablets depended on the level of the temperature and vice versa. The α-lactose tablets were influenced by the relative humidity too (β3,SIR(S) is significant), but the effect was smaller than for the dicalcium phosphate dihydrate tablets. Of the tablets investigated, the β-lactose tablets were least influenced by storage.

Irrespective of the filler-binder used, in tablets prepared with crospovidone as a disintegrant all three adjustable factors influenced the SIR of crushing strength. The interaction between the concentration and the relative humidity was also significant (β5,SIR(S) ≠ 0).
Of the tablets prepared with sodium starch glycolate, only the β-lactose/sodium starch glycolate combination was influenced by all three adjustable factors (Table 8.8). Moreover, the effect of the relative humidity depended on the level of the temperature as well as on the level of the disintegrant concentration, and vice versa.
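The model underlying Tables 8.8 and 8.9 is an ordinary multiple linear regression of the SIR on the disintegrant concentration (c), the storage temperature (t) and the relative humidity (r), together with all their interactions. A minimal least-squares sketch is given below; the coding of the factors to -1/+1 and the use of numpy are our assumptions about the computation, not a description of the software used by the authors, and the response values are synthetic.

```python
import numpy as np

def fit_sir_model(c, t, r, sir):
    """Least-squares fit of
    SIR = b0 + b1*c + b2*t + b3*r + b4*c*t + b5*c*r + b6*t*r + b7*c*t*r.

    c, t, r : factor levels coded to -1/+1 (low/high); sir : observed SIR values (%).
    Returns the eight coefficients in the order used in Tables 8.8 and 8.9.
    """
    c, t, r, sir = map(np.asarray, (c, t, r, sir))
    X = np.column_stack([np.ones_like(c), c, t, r, c*t, c*r, t*r, c*t*r])
    beta, *_ = np.linalg.lstsq(X, sir, rcond=None)
    return beta

# Illustrative use with made-up data for one excipient combination:
rng = np.random.default_rng(0)
levels = np.array([-1.0, 1.0])
c, t, r = np.meshgrid(levels, levels, levels, indexing="ij")
c, t, r = c.ravel(), t.ravel(), r.ravel()
sir = 100 + 3*t - 5*r + rng.normal(0, 1, c.size)   # synthetic response, not study data
print(np.round(fit_sir_model(c, t, r, sir), 1))
```

With replicate measurements per design point, the same matrix formulation also supports the 1% significance tests indicated by the asterisks in the tables.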
TABLE 8.9
COEFFICIENTS FOR THE EQUATION OF THE STORAGE TO INITIAL RATIO OF DISINTEGRATION TIME (SIR(D)) AFTER STORAGE

SIR(D)(%) = β0,SIR(D) + β1,SIR(D)·c + β2,SIR(D)·t + β3,SIR(D)·r + β4,SIR(D)·ct + β5,SIR(D)·cr + β6,SIR(D)·tr + β7,SIR(D)·ctr

Columns (one per excipient combination): LACS, LAPS, LARS, LATS, LAPR, LAPV, LBCS, LBPS, LBRS, LBTS, LBPR, LBPV, DICS, DIPS, DIRS, DITS, DIPR, DIPV
Rows: β0,SIR(D) to β7,SIR(D) and R²adj
* = not significant (1%)
LA = α-Lactose monohydrate   LB = β-Lactose   DI = Dicalcium phosphate dihydrate
CS = Corn starch   PS = Potato starch   RS = Rice starch   TS = Tapioca starch   PR = Na-starch glycolate   PV = Crospovidone
(the individual coefficient values are not legible in this copy)
The calculations for the three combinations with potato starch showed a poor fit (R²adj is small) when compared to the other combinations with the same filler-binder. This may be caused by a large variation in the measurements of the crushing strength after storage, which makes the effects of the adjustable variables disappear in the 'noise'. It can also mean that the SIR of crushing strength is simply not influenced by these three variables. As an illustration, Figure 8.4 shows the SIRs of crushing strength of β-lactose/potato starch tablets as a function of the storage conditions. This combination had the smallest R²adj (0.10). In this figure the small negative effect of the concentration is visualized. The standard deviations of the measurements of the SIR of crushing strength after storage are depicted by vertical bars.
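For reference, the adjusted coefficient of determination reported in Tables 8.8 and 8.9 penalizes R² for the number of model terms. A small sketch of the usual formula follows; whether the authors used exactly this convention is an assumption, and the values in the example are illustrative only.

```python
import numpy as np

def adjusted_r_squared(y, y_hat, n_params):
    """R2_adj = 1 - (1 - R2) * (n - 1) / (n - p), with p parameters including the intercept.

    Standard definition; the exact convention used in Tables 8.8 and 8.9 is assumed.
    """
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    n = y.size
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_params)

# Illustrative observed and fitted SIR values (%), not study data:
y     = [96, 101, 88, 93, 105, 99, 90, 97]
y_hat = [95, 100, 90, 94, 103, 98, 92, 96]
print(round(adjusted_r_squared(y, y_hat, n_params=4), 2))
```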
Figure 8.4 Storage to Initial Ratio of crushing strength (SIR(S)) of β-lactose/potato starch tablets, as a function of storage temperature (°C) and storage relative humidity (% RH)

Disintegration time

The influence of the adjustable variables (disintegrant concentration and storage temperature and relative humidity) on the SIR of disintegration time (SIR(D)) was also calculated for each combination of disintegrant and filler-binder. This was expressed as in equation (10). The coefficients of the equations for the different combinations of excipients are given in Table 8.9. Each combination behaves differently after storage. In all cases there was an effect of the disintegrant concentration (β1,SIR(D) is significant). In most cases the relative humidity, as well as the interaction between the relative humidity and the disintegrant concentration, plays a role in the disintegration time of tablets prepared with either lactose. The dicalcium phosphate dihydrate/rice starch combination is influenced very strongly by the three factors studied. This combination is not suitable for use in tropical countries, and neither is the combination of β-lactose and crospovidone. Due to the specific behavior of each combination it is not possible to draw general conclusions about these excipients and their SIR of disintegration time after storage; each combination should be considered separately.
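To see why a combination whose SIR(D) responds strongly to temperature and humidity is unattractive for tropical climates, the fitted model can be evaluated at the high-temperature, high-humidity corner of the design. The coefficients below are placeholders chosen only to illustrate the arithmetic; they are not taken from Table 8.9.

```python
# Evaluate SIR(D) = b0 + b1*c + b2*t + b3*r + b4*c*t + b5*c*r + b6*t*r + b7*c*t*r
# at coded factor settings (-1/+1). Placeholder coefficients, not Table 8.9 values.
beta = [120.0, 30.0, 25.0, 60.0, 10.0, 20.0, 35.0, 15.0]

def sir_d(c, t, r, b=beta):
    terms = [1.0, c, t, r, c*t, c*r, t*r, c*t*r]
    return sum(bi * xi for bi, xi in zip(b, terms))

# Mild storage (low temperature, low humidity) versus a tropical-like corner:
print(sir_d(c=+1, t=-1, r=-1))   # mild conditions
print(sir_d(c=+1, t=+1, r=+1))   # hot and humid: disintegration time inflates strongly
```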
8.3.4 Conclusions
From the results of the SIR of crushing strength calculations, a selection can be made as to which excipients are suitable for use in formulations that are to be exposed to extreme climate conditions. Because the tablets prepared with β-lactose or α-lactose were generally not influenced strongly by the storage conditions, these can be used in the optimization of basic tablet formulations. Dicalcium phosphate dihydrate should not be used, because it is strongly affected by the storage conditions. As a disintegrant a starch can be selected; the choice of the starch depends on the filler-binder used. Potato starch can be used in combination with either of the investigated filler-binders. The use of crospovidone must be avoided. Sodium starch glycolate can be used in combination with α-lactose.

From the SIR of disintegration time calculations no general conclusions can be drawn. It is merely possible to detect which specific combinations are influenced strongly by the storage conditions. In this respect the dicalcium phosphate dihydrate/rice starch and the β-lactose/crospovidone combinations were conspicuous. These combinations should be avoided in formulations for use in tropical countries.
REFERENCES

[1] E. Fenyvesi, K. Takayama, J. Szejtli and T. Nagai, Evaluation of cyclodextrin polymer as an additive for furosemide tablet. Chem. Pharm. Bull., 32 (1984) 670.
[2] J.L. Vila-Jato, A. Concheiro and D. Torres, Influence of storage conditions on the characteristics of digoxin-Emcompress tablets. S.T.P. Pharma, 1 (1985) 194.
[3] Z.T. Chowhan, The effect of low- and high-humidity ageing on the hardness, disintegration time and dissolution rate of dibasic calcium phosphate-based tablets. J. Pharm. Pharmacol., 32 (1980) 10.
[4] A.M. Molokhia, Effect of storage on crushing strength, disintegration and drug release from mixed tablet bases. Int. J. Pharm., 22 (1984) 127.
[5] E. Graff, A.H. Ghanem, H. Mahmoud and H. Abdel-Alim, Studies on the direct compression of pharmaceuticals. 19. Effect of moisture on tablet physical parameters and bioavailability. Pharm. Ind., 48 (1986) 292.
[6] S. Kadir, N. Yata, M. Kawata and S. Goto, Effect of humidity aging on disintegration, dissolution and cumulative urinary excretion of p-aminosalicylate formulations. Chem. Pharm. Bull., 34 (1986) 5102.
[7] P.K. Shiromani and J.F. Bavitz, Effect of moisture on the physical and chemical stability of granulations and tablets of the angiotensin converting enzyme inhibitor, enalapril maleate. Drug Dev. Ind. Pharm., 12 (1986) 2467.
[8] P. York, A preliminary study of the physical stability of tablets prepared from powders stored under tropical conditions. Pharm., 31 (1976) 383.
[9] Z.T. Chowhan, Physical paths of instability. Pharm. Techn., (1982) 47.
[10] G.E.P. Box, W.G. Hunter and J.S. Hunter, Statistics for Experimenters. An Introduction to Design, Data Analysis and Model Building. John Wiley & Sons, New York, 1978.
[11] R. Huisman, H.V. Van Kamp, J.W. Weyland, D.A. Doornbos, G.K. Bolhuis and C.F. Lerk, Development and optimization of pharmaceutical formulations using a simplex lattice design. Pharm. Weekbl. Sci., 6 (1984) 185.
[12] H. Stricker, Faktorielle Versuchsplanung der Haltbarkeitsprüfung von Arzneimitteln. Pharm. Ind., 37 (1975) 97.
[13] D.A. Doornbos, Optimization in pharmaceutical sciences. Pharm. Weekbl. Sci., 3 (1981) 33.
[14] C. Ahlneck and J.O. Waltersson, Factorial designs in pharmaceutical preformulation studies. II. Studies on drug stability and compatibility in the solid state. Acta Pharm. Suec., 23 (1986) 139.
[15] H.M. El-Banna, A.A. Ismail and M.A.F. Gadalla, Factorial design of experiment for stability studies in the development of a tablet formulation. Pharmazie, 39 (1984) 163.
[16] H. Leuenberger and W. Becher, A factorial design for compatibility studies in preformulation work. Pharm. Acta Helv., 50 (1975) 88.
[17] J.O. Waltersson, Factorial designs in pharmaceutical preformulation studies. I. Evaluation of the application of factorial designs to a stability study of drugs in suspension form. Acta Pharm. Suec., 23 (1986) 129.
[18] S.A. Sangekar, M. Sarli and P.R. Sheth, Effect of moisture on physical characteristics of tablets prepared from direct compression excipients. J. Pharm. Sci., 61 (1972) 939.
[19] C.E. Bos, G.K. Bolhuis, H. Van Doorne and C.F. Lerk, Native starch in tablet formulations: properties on compaction. Pharm. Weekbl. Sci., 9 (1987) 274.
[20] D.V. Hinkley, Biometrika, 56 (1969) 635.
[21] T. Steerneman and A. Ronner, Testing independence in multivariate linear regression when the number of variables increases. Internal Report, Economics Institute, University of Groningen, 1984.
[22] N. Futscher and P. Schumacher, Haltbarkeitsprüfungen von Arzneispezialitäten. Pharm. Ind., 34 (1972) 479.
[23] W. Grimm, Stability testing in industry for worldwide marketing. Drug Dev. Ind. Pharm., 12 (1986) 1259.
[24] H.V. Van Kamp, G.K. Bolhuis and C.F. Lerk, Effect of both lubrication and addition of disintegrants on properties of tablets prepared from different types of crystalline lactose. Acta Pharm. Suec., 23 (1986) 217.
[25] G.K. Bolhuis, H.V. Van Kamp, C.F. Lerk and F.G.M. Sessink, On the mechanism of action of modern disintegrants. Acta Pharm. Technol., 28 (1982) 111.
[26] G.K. Bolhuis, G. Reichman, C.F. Lerk, H.V. Van Kamp and K. Zuurman, Evaluation of anhydrous α-lactose, a new excipient in direct compression. Drug Dev. Ind. Pharm., 11 (1985) 1657.
INDEX

Accuracy, 193
Activity, 239
Adjusted correlation coefficient, 251
Adsorbent, 238
Aliasing, 19; 99
Alkaloids, 235
Allowance design, 151
Analysis of Variance (ANOVA), 60; 68; 126; 143
Anti-drift design, 113
A-optimality criterion, 33
Augmented design, 29
Bias, 80
Bioanalysis, 266
Birnbaum plots, 115
Blending, 253; 268
Blocking, 30; 69
Box-Behnken design, 31; 45; 211
Box-Cox transformations, 249
Case deletion diagnostic, 251
Central composite design, 26; 110; 211
Collinearity, 252
Column change, 200
Combined contour plot, 176
Composite criterion, 179; 266
Compression load, 312; 316
Confounding, 98; 217
Constrained optimization, 40; 51
Contour plot, 39; 57
Contrast coefficient, 95
Control factor, 111; 155; 242
Controllable influences, 242
Cook's distance, 252
Crushing strength, 13; 181; 309
Curvature, 24; 250
Defining contrast, 99
Defining relation, 99
Design, 12; 155; 202
Design array, 16
Design generator, 99
Design resolution, 100
Design variables, 265
Disintegrant, 149
Disintegration time, 181; 309
Dissolution profile, 309
D-optimality criterion, 33
Dual response, 39
Dummy factor, 106; 120
Elution power, 236
Environmental factor, 150
Environmental variables, 11; 14
Environmental variation, 41
E-optimality criterion, 33
Excipient, 149; 310
Expert systems, 138
Extreme vertices design, 247
Face-centered composite design, 29; 43
Factor, 66; 88; 153
  classification, 59
Factorial design, 13; 203; 312
Feasible criteria space, 181
First order design, 18; 25
First order effect, 218
First order model, 18; 250
Fixed effect, 54
Fixed effect model, 139
Foldover design, 22
Formulation, 149
Fractional factorial design, 18; 96; 202
Friability, 181; 309
Functional design, 151
Glidant, 149
Good Laboratory Practice (GLP), 265
G-optimality criterion, 33
Graeco-Latin design, 155
Half-fractional factorial design, 96; 206
Half normal probability plot, 115
Higher order effects, 218
Higher order interaction, 94
Homoscedastical error, 248
Humidity, 13; 238; 310; 314
Hyper Graeco-Latin design, 155
Inner array, 14; 157; 242
Inner noise, 154
Interaction, 94; 203
Interaction plot, 48
Inter-assay variation, 82
Intra-assay variation, 82
Incomplete three level factorial design, 31
Jones' method, 166
Lack-of-fit, 53
Linearity test, 195
Loss function, 151
  larger the better, 152
  nominal the best, 151
  smaller the better, 152
Lubricant, 149
Lurking variable, 202
Main effect, 203
Manufacturing, 154
Method limitations, 193
M-factor different intermediate precision measures, 82
Mixture design, 241; 267
Modified Box-Behnken design, 32
Model, 29
Multicriteria Decision Making, 175
Multiple linear regression, 312
Nested design, 139
Noise array, 43
Noise factor, 111; 242
Noise variables, 11; 265
Nonlinear behavior, 88
Non-procedure related factors, 85
Nonsense correlation, 202
Normal probability plot, 115
Off-line quality control, 154
One variable at a time, 92
Optimal design, 33; 247
Optimization, 240; 266
Optimization of tablet formulations, 310
Orthogonal array, 156
Outer array, 14; 154
Outer noise, 158
Parameter design, 151; 242; 265
Pareto Optimality, 179; 252
Partition coefficient, 268
Peak symmetry, 201
Physical stability of tablets, 309
Plackett-Burman design, 22; 104; 155; 208
Plots, 115
Precision, 80; 194
Predictive Error Sum of Squares (PRESS), 251
PRISMA method, 241
Procedure related factors, 85
Process design, 154
Product design, 13; 154
Projected Variance method, 170
Protocol for method validation, 192
Quadratic loss function, 158
Quadratic mixture model, 268
Quantitative factor, 88
R²adj, 251
Random effect model, 139
Random errors, 80
Randomized, 38
Recovery, 269
Reflected design, 110
Relative standard deviation, 217
Repeatability, 82; 194
Reproducibility, 82; 194
Resolution (Rs), 20; 201; 235
Response surface, 16; 270
Response surface design, 48
Response surface methodology, 15; 241
RF-value, 234
RM-value, 235
Robust experimentation, 70
Robustness Coefficient, 172
Robustness criteria, 157
Robust design, 12; 35; 57
Rotatable design, 29
Rotatability, 29
Ruggedness, 80
Ruggedness test, 12; 85; 191
Sample preparation, 266
Saturated design, 104
Saturated fractional factorial design, 103; 206
Screening design, 86
Second order design, 25; 34
Second order effects, 218
Second order model, 25; 251
Selection of factors, 197
Selectivity, 194; 238; 270
Semi-confounding, 202
Sequential experimentation, 20
Sequential optimization methods, 178; 241; 266
Shelf life, 154
Signal-to-noise ratio, 72; 168; 243
Simplex optimization method, 240; 266
Simultaneous optimization method, 241; 266
Solid dosage form, 149
Solvent selectivity, 236
Solvent strength, 236
Special cubic model, 180; 268
Specificity, 194
Split-plot design, 57
Spot cross-over, 244; 253
Standard error, 216
Star design, 209
Star point, 28
Statistical experimental design, 11
Strip-block design, 65
Storage to Initial Ratio, 311
System design, 155
System suitability criteria, 191
System suitability parameters, 216
Systematic error, 80; 112
Taguchi, 11; 242; 265
  design, 111
  loss function, 151
  method, 150
Targeting design, 151
Third order model, 251
Three factor interaction, 94
Three level design, 26
Three level factorial design, 110
Thin Layer Chromatography (TLC), 85; 233
TLC separation optimization, 240
Tolerance design, 155
Total error, 80
Tropical conditions, 328
Two factor interaction, 94
Two level factorial design, 18
Uncontrollable influences, 242
Uniform shell design, 34
Validation, 196
Variance Inflation Factor (VIF), 252
Variational noise, 154
Weighted Jones method, 169
Whole-plot factor, 59