Reliability Verification, Testing, and Analysis in Engineering Design
Gary S. Wasserman Wayne State University Detroit, Michigan, U.S.A.
Marcel Dekker, Inc.
New York • Basel
Copyright © 2002 by Marcel Dekker, Inc. All Rights Reserved.

Library of Congress Cataloging-in-Publication Data
A catalog record for this book is available from the Library of Congress.

ISBN: 0-8247-0475-4

This book is printed on acid-free paper.

Headquarters
Marcel Dekker, Inc.
270 Madison Avenue, New York, NY 10016
tel: 212-696-9000; fax: 212-685-4540

Eastern Hemisphere Distribution
Marcel Dekker AG
Hutgasse 4, Postfach 812, CH-4001 Basel, Switzerland
tel: 41-61-260-6300; fax: 41-61-260-6333

World Wide Web
http://www.dekker.com

The publisher offers discounts on this book when ordered in bulk quantities. For more information, write to Special Sales/Professional Marketing at the headquarters address above.

Copyright © 2003 by Marcel Dekker, Inc. All Rights Reserved.

Neither this book nor any part may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, microfilming, and recording, or by any information storage and retrieval system, without permission in writing from the publisher.

Current printing (last digit): 10 9 8 7 6 5 4 3 2 1

PRINTED IN THE UNITED STATES OF AMERICA
Preface
The writing of this book continues a long tradition at Wayne State University (WSU). B. Epstein (1948, 1954, 1958, 1960), a mathematics professor at WSU, pioneered much of the early development of practical life models based on the exponential and extreme value distributions. Later, Kapur and Lamberson (1977) authored a very popular textbook on engineering reliability that remains in wide use today, even though it emphasizes linear estimation schemes, which are no longer in use. To continue this tradition, I have written a book that I believe accurately captures the theory and practice behind many of the design verification techniques used in industry.

This book was inspired by my unique opportunity to live and work in metropolitan Detroit, the motor capital of the world. This has afforded me real-world experiences that I have enjoyed, such as the opportunity to consult with automakers and their tier-1 suppliers. When designing test verification plans, I have seen engineers panic when they realize they do not have the necessary training and background to decide how many items should be placed on test. My motivation for writing this book is the need to provide reliability engineers with a reference book that can help them meet these challenges.

This book can be used in the classroom to expose students to the theory and practice of applied life data analysis, or it can be used as a reference book for practicing reliability engineering professionals and their counterparts. Consultants and academicians working in allied areas are also apt to use this book as a reference.

This text contains numerous worked-out examples using either Microsoft® Excel or MINITAB™ statistical computing software, along with a brief list of suggested exercises at the end of each chapter. The former product was chosen for its ubiquity. The latter was chosen because it has become a very popular choice in the classroom and among Fortune 500 companies that wish to quickly and easily analyze small industrial data sets. The reader is not required to use these products, however, as they are presented only to demonstrate the underlying implementation of efficient computer-based procedures.

In keeping with the spirit of this text, the reader might be amazed to find that reference tables for looking up normal probabilities do not appear in this book. The reader is encouraged to use the built-in routines of Excel and similar software to look up these values. Numerous examples throughout the book should help the reader obtain these values.

Some of the features and design formulas presented in this text are unique and some are just unusual:
In the introductory chapter, an overview of modern reliability thinking of the late twentieth and early twenty-first centuries is presented, including emphasis on understanding what a failure is, the importance of understanding customer usage profiles, and the deployment of reliability throughout the product design process from cradle to grave. Although this chapter cannot serve as a comprehensive overview of reliability management techniques, I am certain the reader will find this information useful. Additional topics include the use of Quality Function Deployment for reliability planning, FMEA/FMECA, and the use of DVP&R and its relationship with FMEA.

The book makes use of Microsoft® Excel spreadsheets and the Tools > Solver and Goal Seek nonlinear search procedures wherever possible. Actual spreadsheets are reproduced along with background information on how the procedures are to be run.

The book demonstrates how Excel can be used to develop both Fisher matrix and likelihood ratio estimates of reliability metrics. In the latter case, the use of maximum likelihood estimation techniques for developing asymptotic properties is clearly explained.

MINITAB™ is used to develop Monte Carlo interval estimates of reliability metrics. This is not new, but the reader will find very few textbooks that cover the practical use of Monte Carlo to this extent. The book shows how simple macros can be written in MINITAB™ to run Monte Carlo estimation routines.

The theory behind the use of rank estimators and the development of small-sample binomial confidence limits for success-failure testing is presented in the Appendices to Chapters 2 and 6. The formulas are shown to be different. Their differences have never before been published in a textbook!

The equivalence between Kaplan–Meier, other product-limit estimators including the modified Kaplan–Meier, and Johnson’s formula for estimating the rank of multiply censored data is also demonstrated. This work has also never appeared in a textbook before.

I closely follow the recommendations by Abernethy (1996), who advises the use of inverse rank regression techniques for estimating the parameters of a distribution (from probability plots). This is not the method usually recommended; however, the latest release of MINITAB™ (version 13) now performs rank regression this way!

One of the most comprehensive and concise discussions of goodness-of-fit methodology, including the use of ordinary regression R² statistics to assess fit, is provided.

The book also describes how to identify mixtures and competing failure modes from the examination of probability plots, and introduces the use of mixture distributional forms by Tarum (1996) for modeling reliability bathtub phenomena.

Chapter 6 presents formulas for extended bogey testing, which are usually presented in a more theoretical context. It also discusses the use of tail testing techniques and fully describes the use of a Bayesian adjustment, which permits a sample size reduction of one.

The book surveys both the use of accelerated life techniques for modeling life versus stress relationships and the use of HALT/HASS (see Chapter 7) for quickly identifying design deficiencies. In the former case, we show how to use the built-in accelerated life-testing routines of MINITAB™ to estimate the parameters of the Arrhenius model by maximum likelihood.

The book attempts to provide balance by surveying the use of computer-aided engineering techniques for design verification. The uses of finite element models, probabilistic design, etc., are surveyed. The reader is made aware that in the future testing will be used exclusively for confirmation, not for troubleshooting deficient designs!

Chapter 10 explains the use of simple quantile–quantile (Q–Q) plots for examining differences between life data sets and for estimating acceleration factors.
The coverage of theory is intentionally varied. For example, in Chapter 3, I introduce basic foundations of distributional modeling, including the use of the Z-transform for developing estimates of normal fractiles. This is done to provide some background and reference material to a broad audience of reliability professionals and students. Much of the more advanced material is placed in appendices, and the more advanced material on likelihood estimation is postponed until Chapter 9. Everyone should find something to gain from this book, and our less experienced readers will hopefully find expanded uses for this text as they continue to progress.

Finally, I wish to thank the many people who have reviewed this text, whether informally, formally, or anonymously. In particular, I wish to acknowledge the following people: Dave Deepak, Harley-Davidson; Dr. Yavuz Goktas, Baxter Healthcare Corp.; Ron Salzman, Ford Motor Company; James McLinn, Rel-Tech; Leonard Lamberson, Western Michigan University; and the students of Reliability Engineering class, IE 7270, Fall Semester 1999, Wayne State University.

Gary S. Wasserman
Contents

Preface

1. A Modern View of Reliability Concepts and Design for Reliability
  1.1 What Is Reliability?
    1.1.1 Definition of Reliability
    1.1.2 What Is a Failure?
    1.1.3 Three Reasons Why Products Fail
    1.1.4 History of Reliability in the United States
  1.2 Overview of Reliability Modeling
    1.2.1 Six Basic Approaches to Modeling Product Reliability
    1.2.2 Failure Mechanisms
    1.2.3 Establishing Reliability Specifications
  1.3 An Overview of Reliability Planning
    1.3.1 Elements of Design for Reliability
    1.3.2 Deploying Reliability Requirements
    1.3.3 Reliability Prediction
    1.3.4 Cost of Reliability and Product Testing
    1.3.5 Reliability Bathtub Curve
    1.3.6 Life-Cycle Costing
  1.4 Enablers for a Successful Reliability Planning Effort
  1.5 Reliability Growth Management
  1.6 Exercises
  Appendix 1A. FMEA/FMECA/DVP&R

2. Preliminaries, Definitions, and Use of Order Statistics in Reliability Estimation
  2.1 Reliability Metrics
    2.1.1 Reliability Functions
    2.1.2 Population Moments
    2.1.3 Worked-Out Examples
  2.2 Empirical Estimates of F(t) and Other Reliability Metrics: Use of Order Statistics
    2.2.1 Naive Rank Estimator
    2.2.2 Mean and Median Rank Estimators
    2.2.3 Use of Rank Estimators of F(t) as a Plotting Position in a Probability Plot
    2.2.4 Beta-Binomial and Kaplan–Meier Confidence Bands on Median Rank Estimator
    2.2.5 Empirical Estimates of Other Reliability Metrics: R(t), λ(t), f(t), and H(t)
    2.2.6 Working with Grouped Data
  2.3 Working with Censored Data
    2.3.1 Categorizing Censored Data Sets
    2.3.2 Special Staged Censored Data Sets
    2.3.3 Nonparametric Estimation of Reliability Metrics Based on Censored Data
    2.3.4 Developing Empirical Reliability Estimates of Warranty or Grouped Censored Data
  2.4 Exercises
  Appendix 2A.

3. A Survey of Distributions Used in Reliability Estimation
  3.1 Introduction
  3.2 Normal Distribution
    3.2.1 Central Tendency
    3.2.2 Properties of Normal Distribution
  3.3 Lognormal Distribution
  3.4 Exponential Distribution
  3.5 Weibull Distribution
    3.5.1 Weibull Power-Law Hazard Function
    3.5.2 Weibull Survival Function
    3.5.3 Properties of the Weibull Distribution
    3.5.4 Three-Parameter Weibull
  3.6 Extreme Value (Gumbel) Distribution
  3.7 Other Distributions Used in Reliability
    3.7.1 Logistic and Log-Logistic Distributions
    3.7.2 Gamma and Log-Gamma Distributions
    3.7.3 Miscellaneous Other Noteworthy Distributions
  3.8 Mixtures and Competing Failure Models
  3.9 Exercises
  Appendix 3A. Background
  Appendix 3B. Weibull Population Moments

4. Overview of Estimation Techniques
  4.1 Introduction
  4.2 Rank Regression and Probability Plotting Techniques
    4.2.1 Normal Probability Plotting Techniques
    4.2.2 Weibull Probability Plotting Techniques
  4.3 Maximum Likelihood Estimation
    4.3.1 Introduction to ML Estimation
    4.3.2 Development of Likelihood Confidence Intervals
    4.3.3 Maximum Likelihood Estimation of Normal Parameters, μ and σ, for Complete Sample Sets
    4.3.4 ML Estimation of Normal Parameters μ and σ² in the Presence of Censoring
    4.3.5 ML Estimation of Weibull Parameters θ and β
  4.4 Simulation-Based Approaches for the Development of Normal and Weibull Confidence Intervals
  4.5 Other Estimators
    4.5.1 Best Linear Estimators of μ and σ
  4.6 Recommendations for Choice of Estimation Procedures
  4.7 Estimation of Exponential Distribution Properties
    4.7.1 Estimating the Exponential Hazard-Rate Parameter, λ, or MTTF Parameter, θ
    4.7.2 Exponential Confidence Intervals
    4.7.3 Use of Hazard Plots
  4.8 Three-Parameter Weibull
  4.9 Exercises
  Appendix 4A. Monte Carlo Estimation
  Appendix 4B. Reference Tables and Charts

5. Distribution Fitting
  5.1 Introduction
  5.2 Goodness-of-Fit Procedures
    5.2.1 Goodness-of-Fit Tests Based on Differences Between Empirical Rank and Fitted Distributions
    5.2.2 Rank Regression Tests
    5.2.3 Other Goodness-of-Fit Tests
  5.3 Exercises
  Appendix 5A.

6. Test Sample-Size Determination
  6.1 Validation/Verification Testing
    6.1.1 Verification Testing
    6.1.2 Specifying a Reliability Requirement
    6.1.3 Success–Failure Testing
    6.1.4 Testing to Failure
    6.1.5 Strategies for Reducing Sample-Size Requirements
    6.1.6 Underlying Distributional Assumption
  6.2 Success Testing
    6.2.1 Bayesian Adjustment to Success Formula
  6.3 Success–Failure Testing
    6.3.1 Use of Binomial Nomograph
    6.3.2 Exact Formulas for Binomial Confidence Limits in Success–Failure Testing
    6.3.3 Large-Sample Confidence Limit Approximation on Reliability
    6.3.4 Bayesian Adjustment to Success–Failure Testing Formula
    6.3.5 Correctness of Binomial Success–Testing Formula
  6.4 Exponential Test-Planning Formulas
    6.4.1 Success Testing Under an Exponential Distribution Assumption Using Alternate Formula
    6.4.2 Extended Bogey Testing Under Exponential Life Model
    6.4.3 Extended Success Testing: Exponential Distribution
    6.4.4 Risks Associated with Extended Bogey Testing
    6.4.5 Reduced Test Duration
  6.5 Weibull Test Planning
    6.5.1 Weibayes Formulas
    6.5.2 Adequacy of Weibayes Model
    6.5.3 Chrysler Success–Testing Requirements on Sunroof Products
  6.6 Tail Testing
  6.7 Failure Testing
  6.8 Other Management Considerations
  6.9 Summary
  6.10 Exercises
  Appendix 6A. Binomial Distribution
  Appendix 6B. Bayesian Estimation of Failure Fraction, p
  Appendix 6C. Weibull Properties

7. Accelerated Testing
  7.1 Accelerated Testing
    7.1.1 Benefits/Limitations of Accelerated Testing
    7.1.2 Two Basic Strategies for Accelerated Testing
  7.2 Highly Accelerated Life Testing (HALT)
  7.3 Accelerated Life Test
    7.3.1 Accelerated Cycling or Time-Compression Strategies
    7.3.2 Stress-Life Relationships at Two Different Stress Levels
  7.4 Use of Physical Models
    7.4.1 The Arrhenius Model
    7.4.2 Other Acceleration Models
  7.5 Use of Linear Statistical Models in Minitab for Evaluating Life Versus Stress Variables Relationships
    7.5.1 Arrhenius Linear Model in Minitab
    7.5.2 Use of Regression with Life Data Procedure in Minitab
    7.5.3 Use of Proportional Hazards Models
  7.6 Closing Comments
  7.7 Exercises
  Appendix 7A. Q–Q Plots
  Appendix 7B. ML Estimation of Parameters in Regression Model with Multiply Censored Life Data

8. Engineering Approaches to Design Verification
  8.1 Computer-Aided Engineering Approaches
    8.1.1 Finite-Element Analysis
    8.1.2 Other Computer-Aided Engineering (CAE) Approaches
  8.2 Probabilistic Design
    8.2.1 Simple Strength Versus Stress Models
    8.2.2 Multivariate Strength Versus Stress Competition
    8.2.3 Probabilistic FEA
  8.3 Parametric Models
  8.4 Summary
  8.5 Exercises
  Appendix 8A. First Order Reliability Method (FORM)

9. Likelihood Estimation (Advanced)
  9.1 Maximum Likelihood (ML) Point Estimation
    9.1.1 Maximum Likelihood Estimation of Exponential Hazard Parameter, λ
    9.1.2 ML Estimates of Normal Parameters, μ and σ²
    9.1.3 Worked-Out Example
    9.1.4 Weibull Distribution: ML Estimation of β and θ
    9.1.5 ML Estimation of Three-Parameter Weibull Distribution
    9.1.6 Other Modified Estimation Procedures for the Three-Parameter Weibull Distribution
  9.2 ML-Based Approaches for Confidence Interval Estimation
    9.2.1 Exponential Confidence Intervals
    9.2.2 Asymptotic (Large-Sample) Confidence Intervals
    9.2.3 Confidence Intervals on Normal Metrics
  9.3 Exercises
  Appendix 9A. Algorithm by Wolynetz (1979) for Obtaining ML Estimates of Normal Parameters, μ and σ
  Appendix 9B. Proof: The Exponential Total Unit Time on Test Variable, T, Follows a Gamma(r, λ) Distribution

10. Comparing Designs
  10.1 Graphical Procedures Based on Probability or Rank Regression Plots
  10.2 Q–Q Plots
    10.2.1 Technical Note: Use of Q–Q Plots
    10.2.2 Inferential Statistics for Using Q–Q Plots
  10.3 Use of Likelihood Theory for Assessing Differences
  10.4 Approximate F-Test for Differences: Weibull and Exponential Distribution
    10.4.1 Test for Differences in the Exponential MTTF Parameter, θ
    10.4.2 Approximate Test for Differences with Weibull Shape Parameter
    10.4.3 Use of Approximate F-Tests
  10.5 Summary
  10.6 Exercises

References
1 A Modern View of Reliability Concepts and Design for Reliability
True Story

ANZ Corporation, a second-tier automotive supplier of trim products, was informed by its customer, an OEM (original equipment manufacturer, e.g., GM, Ford, or Toyota), that its contract would be terminated at the end of the year. ANZ management was not given a direct reason for this decision, but through informal discussions it learned that its customer could no longer tolerate the numerous customer complaints and warranty claims associated with ANZ’s parts. As this program represented 18% of the company’s business, ANZ management could no longer ignore significant quality and reliability problems. The general manager of the business unit realized that ANZ did not have adequate systems in place to ensure that products were designed and built correctly to specification. He cried out, “We need to improve the reliability of our products!”

Great, but what is reliability anyway? What systems or standard reliability processes need to be in place? Do you think that ANZ management was a bit late in its realization?

Author’s note: Within six months of this episode, the company had regained all of its lost business with new contracts, and its worries about reliability were a thing of the past. Is this a correct strategy for the company? Why?
In order to achieve high product reliability, one has to truly understand the definition of the term “reliability.” We begin this chapter (§1.1) with a modern definition of this term, including an overview of what it means for a product to fail in the field. In particular, we point out the importance of (a) understanding the customer’s perspective on reliability and (b) product engineers understanding exactly how the customer will use the product. The use and importance of quality function deployment for ensuring that customer requirements drive the design for reliability process are outlined. In §1.2 we outline basic approaches for modeling reliability, followed by an overview of the use of R (reliability) by C (confidence) specifications for verifying product reliability during design. Finally, we provide an overview of reliability planning in §1.3 and discuss the use and importance of failure mode and effects analysis (FMEA) in the appendix.
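As a rough illustration of what an R by C specification can imply for test planning, the sketch below uses the standard zero-failure (success-run) binomial relation. That relation is not stated in this section and is assumed here: if n units all survive the test, reliability R is demonstrated at confidence C when R^n ≤ 1 − C. The function name and the example targets are hypothetical.

```python
import math


def success_run_sample_size(reliability: float, confidence: float) -> int:
    """Smallest n such that n failure-free units demonstrate `reliability`
    at `confidence`, assuming the zero-failure relation R**n <= 1 - C.
    """
    if not (0.0 < reliability < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("reliability and confidence must lie strictly between 0 and 1")
    # Solve R**n <= 1 - C for n and round up to a whole number of test units.
    return math.ceil(math.log(1.0 - confidence) / math.log(reliability))


if __name__ == "__main__":
    # Hypothetical R by C targets, for illustration only.
    for r, c in [(0.90, 0.90), (0.95, 0.90), (0.99, 0.90)]:
        n = success_run_sample_size(r, c)
        print(f"R = {r:.2f}, C = {c:.2f}: test {n} units with zero failures")
```

The required sample size grows quickly as the reliability target approaches 1, which is one reason the choice of R and C deserves care early in planning.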
1.1 WHAT IS RELIABILITY?

1.1.1 Definition of Reliability
Reliability is often described as “quality over time.” This description is correct as far as it goes; however, quality has many attributes, and reliability is but one of them. A more modern definition, which properly takes into account how the customer uses the product, might be expressed as follows:

Reliability is the probability of a product performing its intended function over its specified period of usage, and under specified operating conditions, in a manner that meets or exceeds customer expectations.

Each phrase in the preceding definition warrants special attention. Each of the following components of this definition is discussed separately:

1. Probability of product performance
2. Intended function
3. Specified life
4. Specified operating conditions
5. Customer expectations (voice of the customer)
Probability

Reliability is a probabilistic phenomenon. As such, we make heavy use of probability distributions to model test and field data. The sources of variation in performance include, but are not limited to, the following:

1. Differences in the supplied materials and components
2. Differences in how the product is processed and assembled
3. Differences in how the customer uses the product
4. Differences in exposure to stresses that affect performance (history of environmental effects, loading, etc.)
5. Interactions among subsystems at the systems level
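Treating reliability as a probability means it can be estimated from data as a survival fraction. The sketch below is a minimal illustration under stated assumptions: the failure times are made-up values, every unit is assumed to have been run to failure (no censoring), and the function name is hypothetical. It estimates R(t) as the fraction of units still working at time t.

```python
from typing import Sequence


def empirical_reliability(failure_times: Sequence[float], t: float) -> float:
    """Naive estimate of R(t): fraction of units whose failure time exceeds t.

    Assumes a complete sample (every unit observed until failure).
    """
    if not failure_times:
        raise ValueError("at least one failure time is required")
    survivors = sum(1 for ft in failure_times if ft > t)
    return survivors / len(failure_times)


if __name__ == "__main__":
    # Hypothetical failure times in hours, for illustration only.
    hours_to_failure = [120.0, 340.0, 560.0, 800.0, 950.0]
    for t in (100.0, 500.0, 1000.0):
        estimate = empirical_reliability(hours_to_failure, t)
        print(f"R({t:.0f} h) is estimated as {estimate:.2f}")
```

With a sample this small the naive fraction is crude; the contents list mean and median rank estimators (Section 2.2.2) as refinements.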
The acceptance of a probabilistic notion of reliability, which admits the possibility of failure, is a source of great concern for organizations. Many organizations are fearful that the public release of reliability information can lead to abuses, particularly in the event of injury lawsuits caused by product malfunction or safety concerns. Some corporations are even beginning to adopt alternate terminology such as “performance-challenged” or “impaired function” in order to avoid words such as “failure” or “defect,” which the public could misinterpret.

Intended Function

The intended function(s) of a product must be identified early in the design process to ensure that important customer requirements are designed into the product. Accordingly, it is important that the fulfillment of each function be understood in terms of the customer’s expectations. The identification of intended functions is also an important task in reliability planning. In the very first steps of constructing a failure mode and effects (criticality) analysis (FMEA/FMECA), every possible intended function of a product is examined to assess how the product might fail to perform it. FMEA/FMECA is discussed in greater detail in Appendix 1A.1 of this chapter.

Sometimes the engineering design team may fail to identify potential customer uses of a product up front in the design process. In some instances the hidden functions of a product may be perceived to be customer abuses or misuses of a product. As an example, consider the following two scenarios:

1. A product team responsible for the design of a flashlight is likely to interpret the use of its flashlight as a spare hammering device as product misuse. This can easily damage a flashlight that is not designed to be a hammer!
2. With today’s popularity of large sport utility vehicles (SUVs), it is very common to see people using the steering wheel to pull themselves into the passenger cab. This can damage the steering column if the design team fails to take this hidden function into account during the design stage.
The first example is easily arguable as an example of customer misuse of a product. However, it might be difficult to argue that the second is an example of unreasonable customer misuse. From a legal viewpoint, product organizations need to take responsibility for all foreseeable misuses of their product. The use of a steering wheel as an assist device to facilitate entry into the passenger cab should be regarded as a foreseeable misuse since the entry step into the cab is higher than in passenger cars, forcing new owners of SUVs to find creative ways to facilitate their entrance into the passenger cab.

Specified Life

The specified life provides a usage or timeframe for analysis. A very myopic organization might simply choose to design a product to be reliable over the stated warranty period. An enlightened organization might choose a design life that is commensurate with how long the product is expected to be used in the field. Depending on the designer’s perspective, the life specification might be based on any of the following criteria:
Useful life: The estimated economic life of the product. This term is used quite extensively in engineering economics and finance. Beyond this life, the costs of maintenance and repair do not justify the continued use of the product. For example, a lawnmower can often be used for a long period of time (about 15 years) simply by making sure that the oil and filters are changed periodically. However, once the cylinders or valves in the engine wear out, the costs to fix or replace the engine are not justified.

Warranty life: From the economic perspective of a manufacturer, the product should perform its intended functions properly over this period to minimize the company’s negative exposure. Tire manufacturers often select a warranty period based on the grade of their products. Companies that wish to excel in a global market understand that it is necessary for their customers to continue to be satisfied well beyond the warranty life.

Design life: Specifications on usage should be based on information coming from the customer and competitive benchmarks. Engineers often define a test bogey to represent an engineer’s specification of product usage under which the reliability must be verified (see Figure 1-4). It is important that careful thought go into the synthesis of this specification to ensure that product unreliability issues toward the end of the useful life do not result in excessive customer dissatisfaction.

Operating Conditions

In order for a product to perform well in the field, it must be designed to perform its intended functions under operating conditions that are representative of how the customer actually uses the product. Often it is difficult for the engineer to have accurate information on product usage. This may lead to an over- or underdesigned product. Robust product design methods are useful in this regard: they help ensure that a product’s performance is insensitive to the extreme combinations of environmental factors that influence performance. Some of the factors that influence performance and should therefore be taken into consideration include

Environmental factors such as temperature, humidity, atmospheric pollutants, and salt concentration.
Load, vibration, and other physical stresses that affect performance over time. (It is important to consider multi-axial stresses.)
Surface-to-surface contact that can cause cracking, and other changes to surface geometry and material properties over time.
Manufacturing variation in raw-material composition and geometry, process settings, and process performance.
Residual stress concentrations due to forming, cutting, heat-treating, and assembly operations.
Contamination created from manufacturing byproducts.
Creep and other stress relaxation phenomena of soft materials.
Other factors that may induce early wearout or fatigue.
Voice of the Customer

Ultimately, it is up to the customer to decide whether or not a product’s performance is acceptable. Each of the three defining criteria for evaluating reliability, namely (a) intended function, (b) specified life, and (c) specified operating conditions, must be evaluated in terms of how (and how long) the customer uses a product and what he or she experiences when using it. In the global competitive marketplace of the late 20th and early 21st centuries, reliability is an important requisite for continued success. On the one hand, when a customer experiences trouble-free performance, it leads to increased levels of customer satisfaction and attracts new and repeat business. On the other hand, poor reliability translates into an erosion of customer satisfaction and loyalty, which has the opposite effect. Accordingly, it is important for the product design team to understand customer expectations.

Quality function deployment (QFD) is a valuable tool to ensure that the customer’s voice is properly captured and that a realistic set of performance and reliability requirements is deployed. Y. Akao first introduced QFD in Japan in 1972 in conjunction with his work at the Mitsubishi Heavy Industries Kobe Shipyard (see Mizuno and Akao, 1994). Toyota successfully used it in the 1970s in a rust prevention study. The American Supplier Institute (ASI) of Dearborn, MI, and GOAL/QPC of Methuen, MA, have subsequently adopted QFD for use in the West.

As Figure 1-1 illustrates, QFD begins with the creation of a product-planning matrix used to describe the translation of customers’ requirements into substitute quality characteristics or design requirements. Subsequently, other QFD planning matrices are used to deploy these requirements onto their final form, which in our case consists of design verification tests that are to be run. In Figure 1-1 we illustrate the use of QFD to deploy the voice of the customer throughout the design process, from the early synthesis of high-level design requirements, to the generation of detailed parts or functional requirements, and onto the study of the impact of governing stresses, their impact on the design, and the specialized reliability verification tests that may be required later (see King, 1989, and Vanooy, 1989).
1.1.2 What Is a Failure?
Because reliability deals with the probability that a given item—a component, a subsystem, a system, etc.—will fulfill its intended function over time, we must be clear about what a product’s intended function is and when a failure occurs: A failure is said to have occurred when one or more intended functions of a product are no longer fulfilled to the customer’s satisfaction. The most serious failures are those caused by a deficiency in the product or process design that was not identified prior to the product’s release to the customer. In such cases design errors might result in the product’s never performing to customer expectations, or it might do so for a while until performance degrades to an unacceptable level. The life models discussed in this reference book are all based on an assumption that a product perfectly fulfills its intended function at its date of release to the customer or sales distribution channel. Loss of Product Function The definition of a failure is obvious when there is a total loss of product function. If something breaks during product usage, and this was not intended, then we all agree that it has failed.* However, this will not be the case if only a partial loss of function is involved. In such instances the very definition of the term ‘‘failure’’ may not be precise when one observes a gradual or intermittent loss of performance over time. We refer to these events as soft failures, as product function may be satisfied—but not fully—and not necessarily to the satisfaction of all customers. As an example, many soft materials, such as weather seals, experience a degradation of material properties over time, resulting in a gradual
* An anonymous referee posed the following question, ‘‘If a clay pigeon breaks after a trapshooting hit, is this event labeled a failure?’’ The answer is ‘‘No!’’ as breakage is the intended function in this case.
FIGURE 1-1 Use of QFD to synthesize reliability requirements based on the voice of the customer for a vehicle braking system. (From Vanooy, 1990.)
decline in performance. Losses in strength, elasticity, and color are just some of the important characteristics that may decline over time. For instance, weather seals may not afford the level of protection against water seepage as the seals lose their elasticity over time. In Table 1-1 failure events are classified according to their extent of loss of intended function. The introductions of unintended and hidden functions, which result in undesirable performance, are also described.
TABLE 1-1 Classifying Failures According to the Level of Loss of Intended Function

| Extent of loss of function | Description of failure event | Example(s) |
|---|---|---|
| Total loss of function (catastrophic) | Design is totally dysfunctional, inoperative, or a completely unexpected event occurs. | (i) Component breakage; (ii) appliance that will not "turn on" |
| Partial/degraded function | Meets some of the specifications or some combination of the specifications but does not fully comply with all attributes or characteristics. This category includes overfunction and degraded function over time. This is a soft failure, in that functions are being satisfied, but not 100%. | (i) Color fading due to UV rays; (ii) bearing wear |
| Intermittent function | Complies with all specifications, but loses some functionality or becomes inoperative due to external impacts such as temperature, humidity, etc. Intermittent failures are associated with an on, suddenly off, recovered-to-on-again condition, or a starts/stops/starts-again series of events. | (i) The effect of electromagnetic radiation on sound fidelity of a radio is intermittent; (ii) a digital camera that will temporarily not function at high temperatures or humidity |
| Unintended function | The interaction of several elements whose independent performance is correct adversely impacts the product or process when synergy exists. Their combined performance leads to an undesirable performance, and hence "unintended function." | The interference of engine, transmission, and axle natural frequencies leading to a "resonance" condition, resulting in a serious noise, vibration, or harshness (NVH) condition |
| Hidden function | Customer uses product in a way that is unintended or unknown to product design team. Customer may end up misusing or abusing product. From a legal viewpoint, the design team should understand that it is their responsibility to foresee product misuses. | (i) Use of steering column in sports utility vehicle to pull passenger into cab; (ii) use of a flashlight as a "hammer" |
Severity of Failure

A dead car battery, an engine stall, or transmission troubles are all examples of dependability failures: A product cannot be dependably used in such instances. Due to the severity of such failures, it is important that the incidence of such events be minimized. Failures associated with safety problems or a major loss of product function are rated highest in severity; they should be carefully examined for their potential for occurrence. Table 1-6 in the appendix classifies failure modes by severity according to the Automotive Industry Action Group (AIAG) industry specification (AIAG, 1995). The severity classification is an important input in failure mode and effects analysis (FMEA). FMEA is used to assist in the identification of potentially critical (important) failure modes up front in the design process. Each failure mode is rated by its criticality (the severity of the failure event multiplied by the likelihood of its occurrence) or by its risk priority number, RPN (severity multiplied by the likelihood of its occurrence and multiplied again by the likelihood of its not being detected during design). A brief overview of FMEA methodology is presented in the appendix to this chapter, Appendix §1A.1.
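The two ratings just described can be sketched in a few lines of Python. This is a minimal illustration only: the 1 to 10 scale for severity follows the AIAG guideline cited here, while applying the same 1 to 10 scale to occurrence and detection, the function names, and the example ratings are assumptions made for the sketch.

```python
def check_rating(name: str, rating: int) -> None:
    """Reject ratings outside the assumed 1-10 scale."""
    if not 1 <= rating <= 10:
        raise ValueError(f"{name} rating must be between 1 and 10, got {rating}")


def criticality(severity: int, occurrence: int) -> int:
    """Criticality = severity x likelihood of occurrence."""
    check_rating("severity", severity)
    check_rating("occurrence", occurrence)
    return severity * occurrence


def risk_priority_number(severity: int, occurrence: int, detection: int) -> int:
    """RPN = severity x occurrence x likelihood of NOT being detected."""
    check_rating("detection", detection)
    return criticality(severity, occurrence) * detection


# Hypothetical failure modes with made-up ratings (severity, occurrence, detection).
failure_modes = {
    "engine stall": (8, 3, 4),
    "glove-box latch seizes": (3, 4, 2),
}

# Rank the failure modes from highest to lowest RPN.
ranked = sorted(
    failure_modes,
    key=lambda mode: risk_priority_number(*failure_modes[mode]),
    reverse=True,
)
for mode in ranked:
    s, o, d = failure_modes[mode]
    print(mode, "criticality =", criticality(s, o), "RPN =", risk_priority_number(s, o, d))
```

Ranking by RPN rather than by severity alone lets a moderately severe but frequent and hard-to-detect failure mode rise to the top of the list.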
When assessing severity, it is important to realize that not all failures involving a complete loss of function are classified as severe. AIAG (1995) provides guidelines for assessing severity on a 1 to 10 scale. These guidelines are reproduced in Table 1-6. A severity rating at a level of 9 to 10 is used to rate any failure mode whose occurrence directly affects the ability of a product to meet federal safety standards. A rating of 7 to 8 is used for any failure mode that stops the operation of a product or system. In complex systems such as automobiles, subsystems might fail catastrophically but not prevent the product (the vehicle) from performing its primary functions. Such failure occurrences might affect the customer only slightly, leading to minor customer annoyances. For example, if an LED display in a vehicle's dashboard temporarily fails, or a latch on a glove compartment seizes, the customer might experience only a slight annoyance from the failure. However, if a failure is associated with only a partial loss of product function, it is still possible for severity to be high, particularly if product safety or dependability issues are involved. For example, if an engine leaks oil or idles very poorly, it can lead to engine breakdowns later on, and so such failure modes should be studied very carefully. Though severity will not be quite as high as that associated with a complete loss of a safety function, it will still be higher than for failure modes that result in only minor customer annoyances. In Figure 1-2 we graphically illustrate a joint classification of failure modes according to both their severity and degree of fulfillment of intended function.

1.1.3 Three Reasons Why Products Fail
In general, failures can occur for any of three basic reasons:

1. Design deficiencies or flaws
   a. An important customer requirement/design feature was omitted. The product will never perform to its requirements.
   b. The product design is deficient in some way, leading to early failures.
   c. The process design is deficient in some way, resulting in defective product or early field failures.
2. Quality control
   a. Nonconforming items produced due to quality-control (QC) problems, causing problems with performance when product is used in the field.
   b. Product is damaged during handling and/or distribution.
3. Misuse
   a. Product is misused by customer or perhaps during service.
FIGURE 1-2 Classification of potential failure modes according to their severity and degree of loss of product function.

In the preceding list we make a distinction between incomplete designs and those designs with deficiencies. Incomplete designs should never be produced if organizations are using sound engineering design and design verification methods. Accordingly, when we speak of design deficiencies, we are referencing designs that are not known to have any significant weaknesses until the produced product has been released from the factory. Deficiencies can be present in either process or product design. We illustrate this by example:

Design deficiency: Examples include early wearout or fracture of materials due to a substandard specification on materials or geometry. In the software industry, the Y2K problem is an example of a deficiency in the software that led to a mammoth effort in 1999 to correct the display and internal usage of dates. In electronics, incorrect specification of materials, for example, can lead to infant mortalities and other field-related issues.

Process deficiency: Examples include inadequate heat-treating, metal forming, or cutting, which leads to the buildup of residual stresses in
notched or formed areas, leading to cracking and other fatigue-related phenomena.

QC-related problems: Examples are attributed to special causes such as abnormalities in the grade of materials, assembly problems, and human errors. They are an important concern to reliability professionals, particularly in electronics, as the majority of reliability problems in electronics have historically been due to quality-related causes (O'Connor, 1991, pp. 200–201). Accordingly, it is important for reliability engineers who work in the failure analysis/returns departments of manufacturers of electromechanical subsystems to work closely with plant quality personnel. This recommendation is increasingly holding true for today's mechanical systems that incorporate complex embedded systems for monitoring and regulation.

In all cases effective teamwork and problem-solving strategies are required to identify the root causes of the failures. Designed experiments may be useful when many potential variables are under exploration. Corrective action must eventually be taken at the product design, process design, or production stage to prevent failure recurrence.

Customer misuse can occur when the product is used under conditions or in ways that were not intended. For example, as discussed in §1.1, consider the customer who needs to use his/her steering wheel to lift himself or herself properly into the cab of a truck or sport utility vehicle, or the customer who uses his/her flashlight as a hammer in an emergency. These unintended functions can result in damage to the product. It is the engineer's responsibility to make sure that the product is robust enough to tolerate a wide range of product usage. The use of robust design techniques is discussed in Chapter 6.

1.1.4 History of Reliability in the United States
The field of reliability in the United States is an indirect outgrowth of problems experienced with the use of electronic equipment during the American World War II effort. Reliability declined as electronic systems grew in complexity, prompting the U.S. Air Force, Navy, and Army to set up individual committees to investigate the reliability problem. In 1952 the U.S. Department of Defense (DOD) coordinated these efforts by establishing the Advisory Group on Reliability of Electronic Equipment (AGREE). Some important developments in the history of reliability engineering are summarized in Table 1-2.

TABLE 1-2 Some Important Developments in the History of Reliability Engineering

| Year | Engineering developments |
|---|---|
| 1941–1945 | WWII: Importance of reliability realized; 60–75% of vacuum tubes in communications equipment were failing. |
| 1950–1951 | Waloddi Weibull proposes a distribution that models failure rates by a general power-law expression. |
| 1952 | AGREE: ad hoc committee of DOD called "Advisory Group on the Reliability of Electronic Equipment." Mil-STD-781 "Reliability Qualification and Production Approval Tests." |
| 1955 | AGREE sets up Mil-STD-441 "Reliability of Military Electronic Equipment" and Mil-R-26667. |
| 1961 | Mil-STD-756 for reporting prediction of weapons' systems reliability. |
| 1965 | Mil-STD-785 Reliability Programs for Systems and Equipment (DOD). |
| 1966 | Society for Reliability Engineers is established. |
| 1978 | IEEE started Reliability Society. |
| 1982 | Mil-Hdbk 217D Reliability Prediction of Electronic Equipment is released. |
| 1994 | Ford Motor Company invites leading reliability professionals to a symposium on the Integration of Reliability and Robust Design Methods within Product and Process Engineering. DOD drops support of Mil-Hdbk 217 and exits the standards business. |
| 1995 | Society of Automotive Engineers (SAE) and Automotive Industry Action Group (AIAG) release a standard for potential failure mode and effects analysis (FMEA). |
| Late 20th to early 21st century | Availability of new computer technologies for visualization (virtual reality), rapid prototyping, and improved computer-aided engineering tools diminishes the need for hardware testing. |

Early efforts, as exemplified by the first release of U.S. Mil-Std-781, were devoted mainly to setting up agreements for inspection and testing to qualify electronics equipment. Concurrently, a Swedish mathematician named Waloddi Weibull proposed a flexible framework for modeling fatigue-related failures. The underlying distribution, the Weibull distribution, is one of the most widely used distributions in reliability. Efforts later began on the development of procedures that can be used for reliability prediction. Mil-Std-756 became one of the first military standards ever developed for prediction. In the late 1960s, U.S. Mil-Std-785 was released. It was a first attempt to describe the elements of a successful reliability program. This included extensive emphasis on (a) the use of failure modes and effects analysis for understanding what can cause product unreliability and (b) the use of prototype testing methods during design.

In the 1990s, the DOD began to adopt total quality management principles in its organizations. It has since exited the standards business and has encouraged privatization of that business. More importantly, organizations have begun to integrate reliability into their design processes. More and more we are seeing reliability engineers with product-line responsibilities. We are also witnessing a greater reliance on the use of computer-aided engineering tools for validating designs. This helps to reduce both cost and design time by reducing reliance on the building of prototypes.
1.2 OVERVIEW OF RELIABILITY MODELING

1.2.1 Six Basic Approaches to Modeling Product Reliability
The six basic approaches to modeling product reliability are summarized in Table 1-3. We survey these approaches.

QC Models

Quality-control (QC) models are used to assess process capability (i.e., conformance to specifications). However, the linkage between process capability and product life, although we know it exists, is not very exact. In Eq. (1.1) we use $f(x)$, the probability density of a quality characteristic, $X$, to model the fraction nonconforming, $p$, as the probability that $X$ falls outside the interval [LSL, USL]:

$$p = 1 - P[\mathrm{LSL} \le X \le \mathrm{USL}] = 1 - \int_{\mathrm{LSL}}^{\mathrm{USL}} f(x)\,dx \tag{1.1}$$

From a modern quality engineering perspective, any deviation of $X$ from its ideal value, denoted by $T$ below, can result in decreased levels of customer satisfaction. A loss function, $L$, advocated by G. Taguchi (see Phadke, 1989), can be used to model quality loss as $L(X) = g(X - T)$. A common form of this loss function is the well-known quadratic loss function given by $L(X) = k(X - T)^2$, where $k$ is a constant, upon which a distribution on loss, $L(X)$, can be obtained. More commonly, the average loss is used as a quality metric. For a quadratic loss function, it is evaluated by

$$E[L] = k \int_x (x - T)^2 f(x)\,dx = k\Big(\underbrace{\sigma^2}_{\text{population variance}} + \underbrace{(\mu - T)^2}_{\text{bias}^2}\Big) \tag{1.2}$$

where $\mu$ and $\sigma^2$ are the mean and variance of $X$.
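As a rough numerical illustration of Eqs. (1.1) and (1.2), the sketch below evaluates the fraction nonconforming and the average quadratic loss. It assumes that the quality characteristic is normally distributed, which the text does not require; the function names, specification limits, and process parameters are made up for the example.

```python
import math


def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """Cumulative distribution function of a normal(mu, sigma) variate."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def fraction_nonconforming(lsl: float, usl: float, mu: float, sigma: float) -> float:
    """Eq. (1.1): p = 1 - P[LSL <= X <= USL], assuming X is normal."""
    return 1.0 - (normal_cdf(usl, mu, sigma) - normal_cdf(lsl, mu, sigma))


def expected_quadratic_loss(k: float, target: float, mu: float, sigma: float) -> float:
    """Eq. (1.2): E[L] = k * (variance + squared bias)."""
    return k * (sigma ** 2 + (mu - target) ** 2)


# Illustrative process: target 10.0, limits 10.0 +/- 0.3, mean shifted to 10.05.
p = fraction_nonconforming(lsl=9.7, usl=10.3, mu=10.05, sigma=0.1)
loss = expected_quadratic_loss(k=2.0, target=10.0, mu=10.05, sigma=0.1)
print(f"fraction nonconforming p = {p:.5f}")
print(f"average quadratic loss E[L] = {loss:.4f}")
```

Note that the average loss is positive even when almost every item is within specification, which is the point of the loss-function view: deviation from the target is penalized, not just nonconformance.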
TABLE 1-3 Six Basic Approaches to Modeling Failure-Related Phenomena

| Reliability model | Description | Representation |
|---|---|---|
| 1. QC models | Use of a continuous distribution to model the distribution of a quality characteristic, $X$, or capability of a process | $p = 1 - P[\mathrm{LSL} \le X \le \mathrm{USL}] = 1 - \int_{\mathrm{LSL}}^{\mathrm{USL}} f(x)\,dx$ |
| 2. Time-to-failure models | Use of a continuous distribution to model the occurrence of a failure event | $F(t) = \int_0^t f(t)\,dt$ |
| 3. Success/failure models | The number of failures or successes in an interval $(0, T]$ is counted. The binomial distribution is often used. | $p(x) = \binom{n}{x} p^x (1 - p)^{n - x}$ for $x = 0, 1, \ldots, n$ |
| 4. Probabilistic design | Stress and strength distributions used to determine probability of interference (failure event). | Prob. failure = prob. stress (L) > strength (S). |
| 5. Performance modeling | The function representing performance over time is modeled and is optimized using design of experiments or Taguchi methods. | $P(t)$, a parametric function, and a functional limit, $L$. |
| 6. Physics of failure model | The stresses that impact product life are modeled using either empirical or physical models. | Mean life ($\theta$) vs. stress ($S$) is modeled. |
Time-to-Failure Models

The second approach uses a probability density, $f(t)$, of a continuous variate, $T$, denoting time-to-failure or another usage metric related to age or duty cycles. It is perhaps the most widely used model for describing reliability over time. $F(t)$, the cumulative distribution function, is used to characterize the population failure fraction at any usage level, $t$:

$$F(t) = \int_0^t f(t)\,dt \tag{1.3}$$

It is important to realize the limitations of the use of a distribution model as suggested by Eq. (1.3): When using Eq. (1.3), we are essentially modeling a failure proportion, $F(t)$, or survival fraction, $R(t) = 1 - F(t)$, over time. We are not modeling the underlying mechanism responsible for failure! Accordingly, it is important to realize that investment in systems for tracking reliability will not, by itself, lead to improved product reliability.
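To make Eq. (1.3) concrete, the sketch below integrates a failure density numerically to obtain the failure fraction and the survival fraction. The choice of a Weibull density, its parameter values, and the function names are assumptions made for illustration; any density defined on nonnegative usage could be substituted.

```python
import math
from typing import Callable


def weibull_pdf(t: float, shape: float, scale: float) -> float:
    """Weibull density f(t); shape and scale are illustrative parameters."""
    if t < 0.0:
        return 0.0
    z = t / scale
    return (shape / scale) * z ** (shape - 1.0) * math.exp(-(z ** shape))


def failure_fraction(pdf: Callable[[float], float], t: float, steps: int = 10_000) -> float:
    """Eq. (1.3): F(t) as the integral of the density from 0 to t (trapezoid rule)."""
    h = t / steps
    total = 0.5 * (pdf(0.0) + pdf(t))
    for i in range(1, steps):
        total += pdf(i * h)
    return total * h


# Assumed life model: shape 2.0, scale 1000 hours.
def pdf(t: float) -> float:
    return weibull_pdf(t, shape=2.0, scale=1000.0)


F_500 = failure_fraction(pdf, 500.0)  # population failure fraction by 500 hours
R_500 = 1.0 - F_500                   # survival fraction at 500 hours
print(f"F(500) = {F_500:.4f}, R(500) = {R_500:.4f}")
```

For the Weibull case the integral has the closed form 1 - exp(-(t/scale)^shape), so the numerical result can be checked directly against it.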
The use of Eq. (1.3) is discussed in greater detail in Chapter 2.

Success–Failure Testing

In design verification testing, items are placed on test for a predetermined duration. The numbers of test items that survive or fail a test are recorded and compared to test requirements. The binomial distribution is used to model the distribution of the number of items that have survived (or failed) a test when $n$ items are placed on test. Actual failure-time information is not used, and as such, sample-size requirements are much greater under a binomial model. The binomial probability distribution is given by

$$bi(r; p, n) = \binom{n}{r} p^r (1 - p)^{n - r} \tag{1.4}$$

The binomial probability, $bi(r; p, n)$, is the probability of observing exactly $r$ failures or successes in $n$ independent trials, with parameter $p$ denoting the independent probability of a single item failing or surviving. The order of occurrence of the $r$ outcomes is not important. The probability that not more than $r$ failures occur, when $p$ denotes the independent probability of occurrence of a single item failing during this period, is given by $Bi(r; p, n)$, the cumulative binomial distribution function:

$$Bi(r; p, n) = \sum_{i=0}^{r} bi(i; p, n) = \sum_{i=0}^{r} \binom{n}{i} p^i (1 - p)^{n - i} \tag{1.5}$$

Success–failure testing is described in great detail in Chapter 4.
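Equations (1.4) and (1.5) translate directly into code. The following is a minimal sketch using only the Python standard library; the function names and the example test plan (22 items, each with a 10% chance of failing) are illustrative assumptions.

```python
from math import comb


def binomial_pmf(r: int, p: float, n: int) -> float:
    """Eq. (1.4): probability of exactly r failures in n independent trials."""
    return comb(n, r) * p ** r * (1.0 - p) ** (n - r)


def binomial_cdf(r: int, p: float, n: int) -> float:
    """Eq. (1.5): probability of not more than r failures in n trials."""
    return sum(binomial_pmf(i, p, n) for i in range(r + 1))


# Illustrative plan: 22 items on test, each failing with probability 0.10.
n_items, p_fail = 22, 0.10
prob_no_failures = binomial_pmf(0, p_fail, n_items)
prob_at_most_one = binomial_cdf(1, p_fail, n_items)
print(f"P(0 failures)   = {prob_no_failures:.4f}")
print(f"P(<= 1 failure) = {prob_at_most_one:.4f}")
```

Because only the count of failures enters the calculation, the times at which items failed are discarded, which is why binomial test plans tend to need larger samples than plans that use failure-time data.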
Probabilistic Design

Stresses vary randomly over time; as such, random high levels of stress can lead to a sudden failure. Strengths vary due to manufacturing and product usage and therefore also possess a distribution. Accordingly, both stress and strength possess distributions, a conceptual model of which is presented in Figure 1-3. Such models are referred to as probabilistic design models. The failure of a structural element due to a random load in excess of its breaking strength is an example of such an occurrence. As another example, consider the performance of electronic devices such as capacitors and resistors, which have a specified resistance to excessive voltages. A random occurrence of a voltage that is far in excess of this resistance will result in a failure. Conceptually, if we allow $Y$, a random variable, to denote strength or resistance to failure and $X$, a random variable, to denote stress or load on a system, the probability of failure, $p_f$, may be modeled by

$$p_f = P(Y < X) \tag{1.6}$$

Probabilistic design is surveyed in greater detail in Chapter 6.

FIGURE 1-3 Stress–strength distribution.
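A small sketch of Eq. (1.6) follows. It assumes, purely for illustration, that stress and strength are independent and normally distributed, in which case the difference Y - X is also normal and the interference probability has a closed form; a seeded Monte Carlo estimate is included as a cross-check. The function names and parameter values are hypothetical.

```python
import math
import random


def standard_normal_cdf(z: float) -> float:
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def interference_probability(mu_strength: float, sd_strength: float,
                             mu_stress: float, sd_stress: float) -> float:
    """p_f = P(Y < X) for independent normal strength Y and stress X."""
    margin_mean = mu_strength - mu_stress
    margin_sd = math.hypot(sd_strength, sd_stress)  # sqrt(sd_Y^2 + sd_X^2)
    return standard_normal_cdf(-margin_mean / margin_sd)


def simulate_interference(mu_strength: float, sd_strength: float,
                          mu_stress: float, sd_stress: float,
                          trials: int = 100_000, seed: int = 1) -> float:
    """Monte Carlo estimate of P(Y < X) under the same assumptions."""
    rng = random.Random(seed)
    failures = sum(
        1 for _ in range(trials)
        if rng.gauss(mu_strength, sd_strength) < rng.gauss(mu_stress, sd_stress)
    )
    return failures / trials


# Illustrative values: strength ~ N(50, 4), stress ~ N(38, 3), same units.
p_exact = interference_probability(50.0, 4.0, 38.0, 3.0)
p_sim = simulate_interference(50.0, 4.0, 38.0, 3.0)
print(f"closed form p_f = {p_exact:.5f}, simulated p_f = {p_sim:.5f}")
```

The simulation route is the more general of the two, since it still applies when the stress or strength distribution is not normal and no closed form is available.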
Performance (Degradation) Modeling

We define $P(t)$, a continuous, observable characteristic, to model performance degradation or cumulative damage. It is a measure of a process's progression toward failure. A conceptual model for the function $P(t)$ is illustrated by Figure 1-4. Many failure phenomena due to wear, corrosion, diffusion, radiation, creep, stress relaxation, cracking, etc. can be modeled this way. The identification of a suitable metric for $P(t)$ is a first step in a dedicated process study to improve reliability. The use of designed experiments enables the identification of settings of a collection of influential design parameters to reduce the mean rate of progression toward failure and/or associated variability around the mean. Robust design techniques allow for the identification of designs that are less sensitive to the effects of uncontrollable noise factors, which tend to accelerate the progression to failure. An illustrated example showing the use of robust design methods to optimize $P(t)$ is presented in Chapter 6.
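One simple way to work with a degradation measure is to fit a trend to periodic readings of P(t) and project when it will reach a failure limit. The sketch below assumes a linear trend in which P(t) increases toward the limit (as with cumulative wear); both the linear form and the readings are assumptions made for illustration, not a model prescribed by the text.

```python
from typing import Sequence, Tuple


def fit_line(times: Sequence[float], values: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of P(t) = intercept + slope * t."""
    n = len(times)
    mean_t = sum(times) / n
    mean_p = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (p - mean_p) for t, p in zip(times, values))
    slope = sxy / sxx
    intercept = mean_p - slope * mean_t
    return intercept, slope


def time_to_limit(intercept: float, slope: float, limit: float) -> float:
    """Usage at which the fitted line reaches the failure limit."""
    if slope <= 0.0:
        raise ValueError("fitted degradation does not progress toward the limit")
    return (limit - intercept) / slope


# Hypothetical wear readings (mm) taken every 100 hours of use.
hours = [0.0, 100.0, 200.0, 300.0, 400.0]
wear_mm = [0.0, 0.5, 1.0, 1.5, 2.0]

intercept, slope = fit_line(hours, wear_mm)
projected_life = time_to_limit(intercept, slope, limit=5.0)
print(f"wear rate = {slope:.4f} mm/hour, projected life = {projected_life:.0f} hours")
```

In practice the degradation path is often nonlinear and varies from unit to unit, so a projection like this is only a starting point for a more careful study.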
Physics of Failure Models

In this approach the reliability engineer seeks to attain a more in-depth understanding of the failure mechanism and dynamics that lead to a failure event. In this case the engineer seeks to understand the effect of stresses such as temperature, loading, etc. on product life using models founded upon physics. The models may also be empirical, based on scientific studies. Such models are discussed in greater detail in Chapter 7.
FIGURE 1-4 Example of failure event defined by test bogey criteria.
1.2.2 Failure Mechanisms
Probabilistic design, performance modeling, and physics of failure models are probably of greater interest to the engineer who seeks a more fundamental understanding of the underlying failure mechanism or physics of failure. Table 1-4 lists a broad range of failure mechanisms and associated models of the underlying failure phenomena.

TABLE 1-4 Examples of Failure Mechanism and Associated Models

| Failure mechanism | Performance/reliability metric | Example |
|---|---|---|
| Fatigue (gradual loss of strength) due to vibration, corrosion, low temperatures (brittle fatigue), cyclic temperatures, or cumulative loading effects | Cumulative damage or crack propagation model/stress concentration model at critical locations. Change in ductility and other material properties over time/use | Cyclic loading on mechanical component results in viscous deformation followed by fracture. In the early 20th century, the Titanic passenger ship was built using high-sulfur steel so that the steel was malleable. Unfortunately, at the low temperatures that existed when the steel hull was hit by an iceberg, the hull suffered from low-temperature brittle fatigue, resulting in far worse damage due to the subsequent crack propagation that occurred. |
| Random overstress or overload or failure due to design weakness | Prob. design model | Fracture; sudden loss of material properties, etc. |
| Evaporation | Moisture/humidity characteristic | Loss of material properties (e.g., lightbulb failure). |
| Friction and excessive wear | Cumulative damage | Failures in drive belts, gears, other machinery; delamination; fretting; galling; pitting; peeling; scratched surface. |
| Contamination from dust and other environmental sources | Increased resistance in electronic circuits | Electrical failures. |
| Fading of appearance | Cumulative U.V. radiation exposure | Any coated or painted surface. |
| Creep | Cumulative deformation under constant stress | Loss of material properties in polymeric components. |
| Corrosion and/or chemical | A measure of oxidation or other compound related to the undesirable chemical process underway | Pitting; coating; rust; burned. |
| Stress relaxation | Loss of elastic or compressive force under constant strain | Loss of sealing function (soft components such as gasket seals). |

1.2.3 Establishing Reliability Specifications
Reliability-based specifications are typically stated in terms of ‘‘life,’’ with life being hours, years, miles, cycles, etc. In establishing a meaningful specification, the design engineer must consider the end user, including the severe end user, the desired level of performance, how long a product should perform at a certain level, the user environment (temperature, humidity, vibration, dust, electrical, etc.), and the set of relevant customer requirements.
Accordingly, in establishing reliability requirements, it is very important for the engineer to have a complete understanding of how the product is to be used. For example, in the auto industry the severe end user is often the 90th or 95th percentile customer—the customer who lives in Calgary, Alberta, in the winter, or in Tucson, AZ, in the summer, or one who delivers packages in the city and thus must start and stop his or her engine quite frequently. In the latter case we refer to this as a duty cycle requirement—the number of times an engine on a delivery truck is typically started over a period of time. If these requirements are not understood, then it is next to impossible for a design team to specify an appropriate set of tests for validating product reliability.
Stating Specifications

To be truly meaningful, reliability and confidence levels must be considered an integral part of a reliability specification, and vice versa. For example, a specification that merely states that an automotive sunroof must be able to be cycled 3650 times without incidence of a severe failure is insufficient. A confidence level must be associated with the reliability target. This is often summarized using R by C notation, where R is the reliability target and C is the prescribed confidence level. For example, an R95C90 reliability specification on the sunroof would signify the following:

The likelihood or confidence that there is a 95% chance or greater that the sunroof will be able to withstand 3650 cycles of use without incidence of a severe failure is at least 90%.

Mathematically, we are expressing a level of confidence that is associated with a lower confidence limit on reliability, $R(3650)$, as follows:

$$P\Big(R(3650) \ge \underbrace{0.95}_{\text{lower confidence limit on } R}\Big) \ge 0.90 \tag{1.7}$$
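One common way to demonstrate an R by C specification is a zero-failure (success-run) test, in which n items are tested to the required life and none is allowed to fail. That test plan is not described in this section, so the sketch below is an assumption made for illustration: under a binomial model with no failures, the smallest n satisfying R^n <= 1 - C is n = ln(1 - C) / ln(R), rounded up. The function names are hypothetical.

```python
import math


def zero_failure_sample_size(reliability: float, confidence: float) -> int:
    """Smallest n such that n survivors out of n demonstrate R with confidence C."""
    if not (0.0 < reliability < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("reliability and confidence must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(reliability))


def demonstrated_confidence(reliability: float, n: int) -> float:
    """Confidence achieved when all n items survive: C = 1 - R^n."""
    return 1.0 - reliability ** n


# R95C90 sunroof example: each item is cycled 3650 times without a severe failure.
n_required = zero_failure_sample_size(reliability=0.95, confidence=0.90)
achieved = demonstrated_confidence(0.95, n_required)
print(f"R95C90 zero-failure test: n = {n_required}, achieved confidence = {achieved:.3f}")
```

The same two functions can be used to compare alternative specifications or to see how much confidence a smaller, affordable sample would actually demonstrate.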
Guidelines for specifying reliability targets according to the severity of the failure classification are presented in Table 1-6 in the appendix to this chapter. There are implied tradeoffs between the level of confidence and the minimum reliability that must be achieved. However, there might not be much difference in test requirements between an R90C95 and an R95C90 specification, for example.

Reliability Specifications Under Degradation or Intermittent Loss of Product Function(s)

Soft failures, which are evidenced by a gradual loss of performance or ability to withstand stresses, require our special attention. Initially, the customer may not notice the loss of performance. Over time, however, this can lead to a catastrophic failure event. For such situations, the engineer must set up an artificial limit, the test bogey limit, beyond which a bogey failure is said to have occurred. Its usage is illustrated by Figure 1-4. The bogey limit is set using subjective information based on engineers' knowledge of the effect of performance degradation on the use of the product. The bogey limit should be set based on customer inputs that relate to a degraded level of performance that is likely not noticed by or does not bother 95% of the customers (but does annoy 5%). (This is treated like a 95% reliability target!)
As an example, engineers must specify a limit of acceptability on tool wear, whereupon policies call for tool replacement. The interpretation of performance (e.g., hole drilling performance) and setting of this limit are subjective and decided by the expert opinion of the design engineers.

Importance of Verification/Validation

In manufacturing we use inspection techniques and control charts to monitor a process for its continued conformance to manufacturing requirements. So, too, in design we employ a form of inspection that we refer to as design verification or design validation to verify that the specified functions are satisfied over a specified life of the product. Verification/validation techniques are of three basic forms:

1. A mock-up, (rapid) prototype, or digital rendering of the product is constructed. The basic features and functions of the product are examined for their conformance to requirements.
2. Use of test procedures to verify satisfaction to requirements. Tests might be pass/fail or items might be run to failure. Process/production validation tests are conducted with product built with process tooling.
3. Use of advanced computer-aided engineering tools such as CATIA® computer-aided design renderings linked with other design tools and finite-element models to verify performance. These are surveyed in Chapter 6.
We seem to be using the terms "validation" and "verification" interchangeably. These terms are clearly very similar, as evidenced by the formal definitions provided by the U.S. Dept. of Health, Education, and Welfare: Food and Drug Administration, Part 820—Quality System Regulation (ASQ, 1998):

1. Validation: confirmation by examination and provision of objective evidence that the particular requirements for a specific intended use can be consistently fulfilled.
2. Verification: confirmation by examination and provision of objective evidence that specified requirements have been fulfilled.
According to ISO 9001: 2000 Element 7.3.5 (Design and Development Verification), verification is characterized as those activities involved in the evaluation of whether design outputs are properly translated from design inputs (e.g., design reviews, CAE, simulation), while validation is a term used for ongoing test activities dedicated toward demonstrating the achievement of design objectives.
1.3 AN OVERVIEW OF RELIABILITY PLANNING

1.3.1 Elements of Design for Reliability
The breadth of a reliability planning effort is best understood if placed in context with the many tasks and deliverables required to take a design from concept to production. Figure 1-5 illustrates a block diagram representation of the important stages involved in this effort. In this representation the significant activities that impact product reliability are noted in each block.

FIGURE 1-5 Activities that impact product reliability from cradle to grave.

For each phase of product design, we illustrate a subset of the design activities that impact product reliability:

1. Concept planning: This process begins with a high-level definition of reliability requirements. Much of the effort during the early stages of product design is highly dependent on the collective knowledge base that the cross-functional team has with similar products. At this stage the only product tangibles might consist of mock-ups (rapid prototypes) and 3D representations on the computer. This phase is quite important from a reliability perspective, as the design team must begin to formalize a reliability plan. Since no hard data is available at this stage, the team will base its plans on customer requirements and whatever field/warranty information it may have on existing similar products. At this stage the team should work on a concept failure modes and effects analysis (CFMEA). A CFMEA is a preliminary design evaluation procedure used to identify early any potential design weakness that may result in safety hazards or reliability problems. This information is used to develop a reliability plan, including preliminary acceptance criteria on performance, and so forth.
2. Product design: Much of these early efforts are repeated, but at a much more detailed level, as the design progresses from concept into detailed product and process design. For example, the CFMEA is used as a starting point for the development of a design FMEA.
3. Process design: The design must now be translated into a detailed process plan. A process FMEA is useful for analyzing and preventing potential manufacturing problems that can impact product reliability and quality.
4. Design verification: The design processes described above lead to the eventual formulation of a Design Verification, Planning, and Reporting (DVP&R) document, which spells out the reliability tests and other verification methods, such as the use of computer-aided engineering tools, that are to be conducted. The allocation of resources for verification testing should be based on the priorities set forth in the FMEA documents. An example of a DVP&R document is presented in Appendix 1A3.
5. Manufacturing: Process and production validation is conducted to verify that the manufacturing process is adequate.
6. Reliability monitoring: The process is continually monitored for any changes that can impact product reliability.

1.3.2 Deploying Reliability Requirements
The early reliability requirements are continually updated and refined as reliability targets, and possibly predictions as well, are formed at all levels of detail: system, subsystem, and component. A generic process for developing and refining reliability requirements is set forth in Table 1-5. This information should be viewed as a starting point only, and not as a specific guideline. Individual organizations will no doubt need to modify these overall guidelines to meet the needs of their industry.
1.3.3 Reliability Prediction
For mission-critical products such as defense or medical products, design engineers are often required to develop an estimate of reliability before the product is ever built. In many cases engineers rely on field data on existing products to develop this estimate. In other cases, particularly for expensive, short-production-run products such as jet aircraft or spacecraft, engineers rely on published data at the component level for the development of a reliability prediction.
TABLE 1-5 The Integration of Reliability Requirements in Product Design

Step 1
Task: Define reliability requirements.
Description: Identify customer and functional requirements. Define operational and environmental profiles; establish acceptance/failure criteria and failure definitions. Establish a system reliability target. Integrate all available information originating from the field, laboratory, and R&D on same/similar products.

Step 2
Task: Integrate reliability requirements into design and allocate reliability.
Description: Cascade reliability requirements into targets for each subsystem, component, or function. Allocate reliability at the lowest level required. Assess technology and feasibility for reliability testing downstream.

Step 3
Task: Identification of significant design-for-reliability characteristics based upon product usage information.
Description: Use failure modes and effects analysis (FMEA) and fault-tree analysis (FTA) to identify critical failure modes and their root causes. Additionally, field failure information on similar products should be used to identify top-priority failure modes. Failures that affect safety should be given the highest priority. Dependability failures, wherein the failure of a given component or subsystem will cause the entire system not to perform its intended functions, should be assigned a higher priority than any other localized failures. Significant reliability characteristics should be identified based upon competitive benchmark and voice-of-the-customer feedback.

Step 4
Task: Reliability prediction.
Description: Reliability prediction estimates may be based on past experience or from reliability databases such as Mil-Hdbk-217F, GIDEP (Government and Industry Data Exchange Program), Bellcore, etc. Identify failure modes wherein the differences between the predicted and target reliabilities are greatest. Identify those failure modes that present the greatest opportunity for reliability improvement.

Step 5
Task: Design/process verification through testing or use of computer-aided engineering tools.
Description: Durability and other validation test procedures should be run to validate product reliability if a product’s performance cannot be realistically simulated using computer-aided engineering tools. There are two phases:
1. Design verification (DV) using prototype parts built by hand
2. Process verification (PV) using production-grade tooling to build the prototypes
It may also be useful to test the competitor’s products. Robust product design should also be used to identify low-cost design improvements that enhance the reliability of the product (see Chapter 8). From a modern design-for-reliability perspective, it is always preferable to conduct design verification using computer-aided engineering tools and then to test a small sample of design or process prototypes to confirm a good design. With market pressures to reduce product lead time, engineers can no longer afford the luxury of relying on testing to identify design deficiencies.

Step 6
Task: Monitor reliability.
Description: Institute reliability reporting and tracking systems to monitor reliability. Selectively use failure analysis techniques to identify the root cause of failure on returned parts.
Many sources of public information can be used to assist in the development of a reliability prediction for equipment. For example, the GIDEP (Government and Industry Data Exchange Program) and NPRD (Nonelectronic Parts Reliability Data) are available to the public. Perhaps the most widely known and used source for reliability prediction for electronics equipment is Mil-Hdbk-217 (1995), the Military Handbook for “Reliability Prediction of Electronic Equipment.” Mil-Hdbk-217, Notice 2, is published by the DOD based on work done by the Reliability Analysis Center and Rome Laboratory at Griffiss AFB, NY (currently at IIT Research Institute, Rome, NY). The Mil-Hdbk-217 handbook contains failure-rate models for predicting the reliability of various semiconductor and other electrical components such as integrated circuits, transistors, diodes, resistors, capacitors, relays, switches, connectors, etc. These failure-rate models are based on the best field data
accumulated for a wide variety of parts and systems; this data is then analyzed and manipulated, with many simplifying assumptions thrown in, to create usable models. For each part in the handbook, a base reliability estimate in failures/hr is provided along with a collection of factors ($\pi$) used to adjust the reliability estimate to account for the effect of elevated or reduced stress levels on product life. For example, elevated temperatures and voltages are known to accelerate the breakdown of dielectric materials, resulting in reduced life. So, too, the actual grade or quality of the materials used (that is, commercial or industrial grade) will affect life. The base model for prediction of the reliability of a single component has the generic form

$\lambda_p = \lambda_b \, \pi_T \, \pi_A \, \pi_R \, \pi_S \, \pi_C \, \pi_Q \, \pi_E \qquad (1.8)$

where
$\lambda_p$ = part failure rate (failures/$10^6$ hr).
$\lambda_b$ = base failure rate (failures/$10^6$ hr).
$\pi_T$ = temperature modification factor (usually based on junction temperatures and Arrhenius relationship [see Chapter 5]).
$\pi_A$ = application factor.
$\pi_R$ = power modification factor.
$\pi_S$ = electrical stress factor.
$\pi_C$ = construction factor.
$\pi_Q$ = quality factor (screening, commercial or industrial grade).
$\pi_E$ = environment modification factor (ground benign, ground fixed, ground mobile, naval sheltered or unsheltered, airborne, space, etc.).

Hybrid and complex part assemblies often require several $\lambda_p$ estimates, one for each segment. An example is described in the worked-out example below for a microprocessor. An overall failure rate for a subassembly is typically obtained by summing the hazard-rate contribution from each part in the subassembly. Common-mode failure-rate contributions from circuit-board connections, etc. should also be taken into consideration.

It is important to note that beginning in 1995, the DOD halted its support of Military Handbook 217 for reliability prediction of electronic components. Its primary reason for doing so was its intent to privatize the standards business. However, there are other reasons why one should consider a deemphasis of this activity in his or her organization. They include

1. The fact that the task of constructing a reliability prediction does not add to the overall effort to build a high-reliability product. Other activities, such as the development of an FMEA form, are designed to remove deficiencies from the product design before the production facility is built.
2. The estimates from Military Handbook 217 are not always very accurate. Much of the information is 10 years behind the times. In fact, there is a vociferous minority who question its usefulness (see, for example, Pecht and Wang, 1998). Nevertheless, it is useful for comparing two designs, such as existing and proposed.
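As a rough illustration of Eq. (1.8) and of the summation of part failure rates over a subassembly, the following Python sketch multiplies a base failure rate by its adjustment factors and then adds up the parts. The two parts and every numerical value are hypothetical placeholders chosen for illustration; they are not taken from Mil-Hdbk-217.

```python
from math import prod

def part_failure_rate(lambda_b, pi_factors):
    """Eq. (1.8): base failure rate times the product of the pi factors.

    lambda_b is in failures per 10^6 hr; pi_factors maps a factor
    label (e.g., "T", "Q", "E") to its dimensionless multiplier.
    """
    return lambda_b * prod(pi_factors.values())

def subassembly_failure_rate(parts):
    """Sum the part failure rates (series assumption, no redundancy)."""
    return sum(part_failure_rate(lb, pf) for lb, pf in parts)

# Hypothetical parts: (base failure rate, pi factors). Illustrative values only.
parts = [
    (0.010, {"T": 1.5, "Q": 2.0, "E": 4.0}),
    (0.002, {"T": 1.1, "S": 0.8, "Q": 1.0, "E": 4.0}),
]

total = subassembly_failure_rate(parts)
print(f"Subassembly failure rate: {total:.5f} failures per 10^6 hr")
```

Any common-mode contributions (board connections and the like) would have to be added to this sum separately; the sketch leaves them out.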
Many commercial electronic product companies are now choosing to use the Bellcore (1997) handbook by Telcordia Technologies (formerly Bellcore) for their reliability predictions. Telcordia Technologies, a spin-off of the old AT&T Bell Labs, previously used Mil-Hdbk-217 for its reliability predictions but found that 217 gave pessimistic numbers for its commercial-quality products. In 1985 Telcordia modified and simplified the models to better reflect its field experience and developed the Bellcore reliability prediction database, which is applicable to commercial electronic products. In addition to the models used to predict the failure rate of electronics components, a similar handbook (NSWC-94/L07) has been developed by the Navy for predicting the reliability of various types of mechanical devices including springs, bearings, seals, motors, brakes, and clutches.

Example 1-1: From Mil-Hdbk-217F, Section 5.3

A Siemens 16-bit microprocessor with a 144-pin connector is to be used in a biomedical device. For low-powered CMOS devices greater than 60,000 gates, we will use Section 5.3 of Mil-Hdbk-217F to estimate part hazard rate. The failure rate, $\lambda_p$, is estimated using

$\lambda_p = \lambda_{BD} \, \pi_{MFG} \, \pi_T \, \pi_{CD} + \lambda_{BP} \, \pi_E \, \pi_Q \, \pi_{PT} + \lambda_{EOS}$ failures/$10^6$ hr

where
$\lambda_{BD}$ = die base failure rate (0.16 custom).
$\pi_{MFG}$ = manufacturing process correction factor (0.55).
$\pi_T$ = temperature correction factor (0.287 Arrhenius CMOS, 50 °C junction temperature).
$\pi_{CD}$ = die complexity correction factor [13 (1.0 μm, die area)].
$\lambda_{BP}$ = package base failure rate [0.0047 (144 pins)].
$\pi_E$ = environmental correction factor [0.5 ground benign ($G_B$)].
$\pi_Q$ = quality correction factor (10 commercial grade).
$\pi_{PT}$ = package-type correction factor (2.9).
$\lambda_{EOS}$ = electrical overstress failure rate (0.065).

$\lambda_p$ = 0.461689 failures/$10^6$ hr.
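The arithmetic of Example 1-1 can be checked with a few lines of Python. The sketch below simply plugs the factor values listed in the example into the Section 5.3 model; the variable names are our own. With the factors as printed, the sum comes to roughly 0.4615 failures per 10^6 hr, slightly below the 0.461689 quoted in the example, presumably because one or more of the printed factors (such as the temperature factor) were rounded for presentation.

```python
# Factor values as listed in Example 1-1 (variable names are illustrative).
lambda_bd = 0.16     # die base failure rate
pi_mfg = 0.55        # manufacturing process correction factor
pi_t = 0.287         # temperature correction factor
pi_cd = 13.0         # die complexity correction factor
lambda_bp = 0.0047   # package base failure rate (144 pins)
pi_e = 0.5           # environmental correction factor (ground benign)
pi_q = 10.0          # quality correction factor (commercial grade)
pi_pt = 2.9          # package-type correction factor
lambda_eos = 0.065   # electrical overstress failure rate

die_term = lambda_bd * pi_mfg * pi_t * pi_cd
package_term = lambda_bp * pi_e * pi_q * pi_pt
lambda_p = die_term + package_term + lambda_eos

print(f"die term      = {die_term:.6f}")
print(f"package term  = {package_term:.6f}")
print(f"lambda_p      = {lambda_p:.6f} failures per 10^6 hr")
```

Breaking the result into its die, package, and overstress terms also shows which contribution dominates, which is the kind of comparison the handbook models remain useful for.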
1.3.4 Cost of Reliability and Product Testing
Management must understand the benefits of having a reliability program. There are fixed costs involved, but they are more than offset by the benefits that will be obtained when potential failure modes are identified and removed early on. That is, the earlier in the life cycle that potential reliability issues can be identified, the less costly it is to correct the problems. During the concept and the earliest stages of detailed product design, the cost of prevention is quite minimal compared to the costs that will be incurred later in manufacturing and assembly if a deficiency is not caught during design. Once a product is manufactured, the cost of field failures will include all the costs associated with product recalls, design fixes, and product repair. If a product failure results in a personal injury, then the costs of settling lawsuits and litigation costs should also be included. More importantly, customer awareness of quality problems can lead to a reduced market presence and lower customer loyalty. If any potential failure modes do escape to the field, a product support team, which includes failure analysis specialists, can be mobilized for rapid containment of any problems and for identification and removal of the root causes responsible for reliability problems.

Conceptually, the economic model, which is illustrated in Figure 1-6, may be used to assess program cost tradeoffs. In this model the allocation of resources to cover the costs of prevention and testing (failure mode and effects analysis, design reviews, life testing) will result in improved reliability on the one hand. On the other hand, failure costs (warranty and other field failure costs) will decline as product reliability improves. These tradeoffs can best be viewed by adding these two cost components. The total reliability program cost function is convex, and so an optimum, R*, exists. It is also important to note that to the left of the optimum, R*, lies a zone of improvement wherein prevention and testing costs result in an ever-greater reduction in failure costs. However, to the right of R* lies a zone of perfectionism, where further investment in testing and prevention, given the current technology level and organization, may be viewed as inefficient because it is not economical. Some may question the validity of such a model in that its use would appear to contradict a philosophy of continuous quality and reliability improvement (see O’Connor, 1995). This is not the case, however. Wasserman (1997) has argued that the cost model is, indeed, quite dynamic. Due to market influences brought on by increased competition and customer expectations, and by the realization that costs for proactive reliability improvement should be viewed as an investment, the dynamics of the cost model will always be such that the optimum, R*, will migrate toward 100% reliability over time.

FIGURE 1-6 Conceptual model for reliability economics. (From O’Connor, 1991.)

1.3.5 Reliability Bathtub Curve
The reliability bathtub curve is a conceptual model for describing reliability-related phenomena at the component level over a product's life cycle. The bathtub curve consists of three stages. Figure 1-7 reveals the three components of the reliability bathtub curve. Note that not all products follow this model. Mechanical systems may not appear to have an infant mortality period, although many do! Some failure modes in electronic products may never appear to have a wearout phase.

Three stages    Characterized by           Caused by
Burn-in         Decreasing failure rate    Manufacturing defects: welding flaws, cracks, defective parts
Useful life     Constant failure rate      Environment, random loads, human error, chance events
Wearout         Increasing failure rate    Fatigue, corrosion, aging, friction
FIGURE 1-7 Reliability bathtub curve; the overall instantaneous failure rate is the sum of the hazard rates associated with (1) weak items, (2) stress-related failures, and (3) wearout.

The three phases are now described in greater detail:

1. Infant mortality phase: a component of the overall failure rate that is most active during the infant mortality phase; its contribution declines to insignificant levels beyond the wear-in period. This relationship is used to account for a subpopulation dominated by quality-control defects due to poor workmanship, contamination, out-of-specification incoming parts and materials, and other substandard manufacturing practices. Examples include premature failures due to improper manufacturing or assembly, the use of substandard materials, or early wearout due to other QC problems. In the software industry, programming bugs introduced during development can lead to premature “failures.” This has led to the development of popular burn-in procedures to eliminate infant mortalities before the product is delivered to the customer. From a modern quality viewpoint, however, one should not accept infant mortalities as a natural occurrence. Instead, organizations should strive to design and produce defect-free products, so that the incidence of infant mortalities is undetectable.
2. Useful life phase: a constant failure-rate (CFR) component curve. This component is used to account for failures due to random, excessively high loads or environmentally induced stresses that exceed the capacity of the part or subsystem to tolerate. Ideally, the magnitude of this failure-rate component should be negligible if sufficiently high safety factors are used during design. That is, the use of effective computer-aided design tools and DVP&R practices should keep the incidence of such failures to undetectable levels. Only “acts of God” should allow for such failure events to occur.
3. Wearout phase: a wearout failure-rate component due to aging phenomena. This can include effects due to fatigue, corrosion, creep, and other aging phenomena. Due to accelerated wearout, the age of onset of wearout, t = t*, is often regarded as the (economic) useful life of a product. Beyond t*, the economics of maintaining an asset may not be justifiable, and hence, it is unlikely that the product would be used much beyond its useful life.* Quality-control problems can also result in the production of items that experience early wearout.
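The caption of Figure 1-7 describes the overall instantaneous failure rate as the sum of three hazard-rate components. A minimal Python sketch of that idea follows. It assumes, purely for illustration, that the infant mortality and wearout components follow Weibull hazards (shape parameter below 1 and above 1, respectively) and that the useful-life component is a constant; the functional forms and all parameter values are hypothetical and are not taken from the figure.

```python
def weibull_hazard(t, beta, eta):
    """Weibull hazard rate h(t) = (beta/eta) * (t/eta)**(beta - 1), for t > 0."""
    return (beta / eta) * (t / eta) ** (beta - 1)

def bathtub_hazard(t):
    """Overall hazard as the sum of three components (hypothetical parameters)."""
    infant = weibull_hazard(t, beta=0.5, eta=2000.0)    # decreasing: weak items
    random_loads = 1.0e-4                               # constant: chance events
    wearout = weibull_hazard(t, beta=4.0, eta=10000.0)  # increasing: aging
    return infant + random_loads + wearout

for hours in (10, 100, 1000, 3000, 8000, 15000):
    print(f"t = {hours:>6} hr   h(t) = {bathtub_hazard(hours):.6f} per hr")
```

With these placeholder parameters the printed hazard falls at first, flattens, and then rises again, which is the bathtub shape. Setting the infant mortality term to zero mimics a product with no detectable early failures.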
1.3.6 Life-Cycle Costing
Life-cycle costing accounts for the total cost of ownership of a product during the lifetime of its use. It includes acquisition cost, which is based on the basic costs of designing, producing, and distributing a product, including marketing and profit for the company. It also includes cost of ownership (repair, routine maintenance, and usage costs) and the ultimate cost to retire the product, including recycling and disposal costs. Reliability and maintainability are strong cost considerations under life-cycle costing. Maintainability is defined to be “the probability that a failed component or system will be restored or repaired to a specified condition within a period of time when maintenance is performed in accordance with prescribed conditions” (Ebeling, 1997, p. 6).
1.4 ENABLERS FOR A SUCCESSFUL RELIABILITY PLANNING EFFORT
If you were the ANZ Corporation, which we described in the very beginning of this chapter, what action plan might you institute to turn things around? Well, there are no absolute answers to this question, but it is the author’s opinion, based upon his experiences working with automakers and their suppliers, that a successful reliability planning effort requires an organization that is committed to the use of (a) comprehensive problem-solving strategies for solving existing problems and (b) a proactive strategy for ensuring the design of reliable product. Accordingly, we look at strategies and techniques for improving the reliability of both existing and new products.

* My students are familiar with wearout phenomena from first-hand experience. Many of our full-time students drive vehicles that are over 10 years old and understand the hardship of having to maintain a vehicle beyond its useful life.

Existing Processes

1. Characterize any existing problems on similar products. Are there any problems? Develop a failure reporting system. Collect warranty data. Perform a Pareto analysis of your top problems in the field.
2. Contain the current problem(s). Implement temporary measures such as increased source inspection, etc. to halt shipment of defective product to the customer.
3. Use basic problem-solving tools to identify root causes. In addition to using basic quality improvement tools, there may be a need for using designed experiments (and Taguchi methods) to understand complex relationships involving a multitude of variables. Develop a failure reporting, analysis, and corrective action system (FRACAS) to help understand failure mechanisms and root causes.
4. Take steps to correct the problem. Identify an engineered solution to eradicate the root causes of the reliability problem.
5. Implement the process change. Implement any design or procedural changes required to prevent any problem(s) from recurring.
6. Verify that the change was successful.
7. Standardize the change, so that the change is permanent. Make sure that the FMEA and all CAE databases are updated. Update any changes to process/operator plans.
8. Use SPC/continuous conformance testing to continually monitor the process.
9. Develop a knowledge database of lessons learned. Look for other applications/opportunities to implement changes.
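Step 1 above calls for a Pareto analysis of the top field problems. A minimal sketch of such a tabulation in Python follows. The failure-mode labels and claim counts are invented for illustration; in practice they would come from a warranty or failure reporting database.

```python
from collections import Counter

def pareto_table(claims):
    """Rank failure modes by claim count and accumulate their percentage share.

    claims is a list of failure-mode labels, one per warranty claim.
    Returns a list of (mode, count, cumulative_percent) tuples.
    """
    counts = Counter(claims)
    total = sum(counts.values())
    rows, running = [], 0
    for mode, n in counts.most_common():
        running += n
        rows.append((mode, n, 100.0 * running / total))
    return rows

# Hypothetical warranty claims (illustrative data only).
claims = (["seal leak"] * 12 + ["belt squeal"] * 5
          + ["switch failure"] * 2 + ["connector corrosion"] * 1)

for mode, n, cum_pct in pareto_table(claims):
    print(f"{mode:<20} {n:>3}   cumulative {cum_pct:5.1f}%")
```

The cumulative column makes it easy to see how few failure modes account for most of the claims, which is where containment and root-cause effort would normally be directed first.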
Future Processes

1. Collect and analyze all reliability information. Use TGW (things gone wrong!), warranty information, etc. on existing products to understand what the top challenges are for the design team in developing a reliable product. Utilize failure reporting, analysis, and corrective action system (FRACAS) information to develop a better understanding of the underlying failure mechanisms and mode of failure.
2. Seek out the voice of the customer. Collect information using customer satisfaction surveys, focus groups, etc. What are the customer expectations vis-à-vis product performance? What functions/features should be incorporated into the product? How is the product to be used in the field? Under what conditions? Use QFD to drive the design process. Identify important customer wants; deploy these desires to synthesize reliability requirements, etc. Benchmark the competition. What is an acceptable design life? What kind of warranty should be offered to the customer (duration, terms, etc.)?
3. Perform feasibility analysis. Build or mock up a concept property. For mechanical systems, the use of rapid-prototyping methodologies is very effective. Use computer-aided engineering (solid models, finite-element methods, variation simulation analysis, dynamic structural analysis, collision detection, etc.) techniques to simulate performance properties of product.
4. Follow a disciplined strategy for design for reliability. Use DFMEA (design failure mode and effects analysis) effectively. Identify opportunities for design verification testing. Use PFMEA (process failure mode and effects analysis) effectively. Identify opportunities for process improvement later on. Conduct formal design reviews. Implement supplier quality programs. Make sure that suppliers are team members and that they employ the same quality and reliability assurance practices that you do. Complete systems integration. The reliability engineer must make sure that system and subsystem reliability requirements are consistent.
5. Implement DVP&R program. Complete a DVP&R (design verification, planning and reporting) plan. Use robust design techniques upstream to optimize the performance of critical functions. Develop computer-aided engineering models and other predictive reliability models for design validation. Follow through with reliability test plans for validation. Have representative design prototypes built and tested. Use this information to make any last-minute changes to design.
6. Production validation. Use sound quality engineering strategies for validating processes. Test the performance and assess the capability of first parts coming off the production line.
7. Reliability monitoring of existing processes. Implement SPC program to monitor production quality. Implement continuous conformance testing and performance assessment program. Implement preventive and predictive maintenance programs, etc.
1.5 RELIABILITY GROWTH MANAGEMENT
Reliability growth management is used during design and development to track and predict reliability improvement. It is most useful for the design of new products of high complexity to ensure that actual reliability growth meets planned reliability growth. For very low-production-run items of high cost and complexity such as commercial aircraft or defense products, there aren’t any allowances for building prototype product. Instead, the built product off the assembly line undergoes a series of tests. Any deficiencies that are detected are corrected immediately, prior to any further testing. This is often referred to as a series of “test-analyze-and-fix (TAAF)” cycles, wherein product improvement leads to continual reliability growth. Reliability growth models are also used to some extent in high-production durable-goods industries such as the automotive industry for tracking reliability growth during advanced product development through manufacturing. The AMSAA (U.S. Army Materiel Systems Analysis Activity) model is widely used for assessing changes in reliability (see Crow, 1974, 1990, and Mil-Hdbk-189). Under this model, the log of the cumulative number of field incidents [ln N(t)] is plotted as a function of the log of the cumulative time that product has been used or tested (ln t). The model is of the form
$N(t) = \lambda t^{\beta}$

or

$\ln N(t) = \ln \lambda + \beta \ln t \qquad (1.9)$
This is illustrated in Figure 1-8. A parametric model is fit, and this information can then be used to predict the likelihood that the planned reliability target or growth level will be achieved.
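Because Eq. (1.9) is linear in ln t, the parameters can be estimated roughly by fitting a straight line to the log-log plot. The sketch below does this with ordinary least squares in plain Python. This is a graphical-style approximation offered for illustration only; it is not necessarily the estimation procedure used in the cited references, and the test data are hypothetical.

```python
import math

def fit_power_law(times, cum_failures):
    """Least-squares fit of ln N(t) = ln(lambda) + beta * ln(t).

    times: cumulative test or usage times (all > 0)
    cum_failures: cumulative number of incidents observed by each time (all > 0)
    Returns (lambda_hat, beta_hat).
    """
    x = [math.log(t) for t in times]
    y = [math.log(n) for n in cum_failures]
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta_hat = sxy / sxx
    lambda_hat = math.exp(y_bar - beta_hat * x_bar)
    return lambda_hat, beta_hat

# Hypothetical cumulative test hours and cumulative incident counts.
times = [100.0, 300.0, 1000.0, 3000.0]
cum_failures = [8, 15, 32, 60]

lam, beta = fit_power_law(times, cum_failures)
print(f"lambda = {lam:.4f}, beta = {beta:.4f}")
print(f"Projected N(5000 hr) = {lam * 5000.0 ** beta:.1f}")
```

A fitted slope below 1 means incidents are accumulating more slowly as testing proceeds, which is the signature of reliability growth on such a plot.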
FIGURE 1-8 Typical reliability growth representation.

1.6 EXERCISES

1. Consider the following six scenarios, which describe a reliability problem. For each scenario,
(i) Categorize the failure mode in terms of its extent of loss of function and severity.
(ii) For each failure mode that you can identify, what might have been the RPN rating during the product design phase?
a. My new coffee cup is shown in Figure 1-9. With this design, the top cover consists of two rotating plates in contact. When the cup is open, the pressure between the plates is allowed to relax, allowing the coffee to seep into the gap between the plates as I tilt my cup to sip my coffee. I usually drink my coffee during my morning commute to the university. By the time that I am halfway to the university, I begin to experience a “drip-drip” onto my clothes as I sip my coffee. The coffee in the open gap space has finally worked its way out and is spilling onto me from any space between the cap and the threads, even when cap is screwed on very tight! (Assume that this has resulted from a design deficiency that was not discovered during product design.)
b. An engine fan belt prematurely begins to squeal more as it is worn and stretches out. This annoys the customer greatly. (Assume that the belt is unlikely to break apart or come off the pulleys for quite a while, however.)
c. The cable that controls the brakes on my bike snaps apart without any notice.
d. The battery meter on my Palm Pilot suddenly no longer correctly shows the remaining life on the 2 AAA batteries that power it. Assume that everything else works fine, however!
e. The seals on a transmission pan begin to wear out and fret, causing some leakage of transmission fluid on the floor of my garage.
f. My new lawnmower is shown in Figure 1-10. Great lawnmower! The oil filler neck is a plastic molded part. It sits on an opening with a grommet to provide a seal to the passageway to the engine. The top of the part is mounted at one location with a screw to hold it in place. Unfortunately, the part is not guarded in any way, as you can see from the figure. When mowing around bushes, swing sets, etc., the filler neck can be unexpectedly hit. Finally, one day the attachment came loose. The oil filler neck separated from the housing, leaving oil gushing out of my lawnmower and lots of smoke! This failure mode constitutes a potential safety hazard! It can also result in excessive engine damage.

FIGURE 1-9 My “coffee cup”.

FIGURE 1-10 New self-propelled lawnmower.

2. For each scenario, identify the potential failure mode and estimate the RPN:
a. The accuracy of a missile defense system is very important. If a missile guidance system is inaccurately programmed, there may be significant unintended civilian casualties. Combat soldiers are well trained on their use, and so likelihood of such an occurrence is low; however, it is difficult to foolproof the system to prevent this from happening.
b. A coffeemaker can become easily plugged up from oxide scale. A particular brand is highly susceptible to this happening in the first year of heavy use. The coffeemaker’s manufacturer does conduct studies on oxide scale formation, but the studies seem to be inadequate.
c. The failure of an airbag switch in a vehicle might lead to injury or death. This is unlikely to happen, given the redundancy in its circuitry. The airbag’s manufacturer conducts multiple testing to prevent this failure mode from occurring.
d. A lighting product is being produced with an on–off switch of low-grade quality. The switch can fail in some products if the customer abuses the switch when using it. The manufacturer doesn’t have the resources to validate the durability of the electronic switch.

3. Briefly answer the following:
a. Quality and reliability are intimately linked. What effect does manufacturing quality have on field performance?
b. What effect does poor or deficient design quality have on field performance?
c. What role does product support have on ensuring reliability growth? From a reliability cost perspective, what cost category should product support be charged to (i.e., prevention, appraisal, or failure)?
d. Why is it important for the reliability analyst to be familiar with distribution theory?
e. A company verifies the reliability of its mobile electronics communications equipment by running life tests of 3000 hr under ambient conditions. Why is the use of such a design verification strategy ineffective for proving out reliability?
f. Why might the use of FMEA detection ratings, as currently specified under the AIAG/SAE standard, encourage prolonged design verification activities?
g. Why is it desirable to solicit the voice of the customer in setting product acceptance limits during product design?
h. What role does the use of reliability prediction have in product design? Are they accurate?
APPENDIX 1A FMEA/FMECA/DVP&R

APPENDIX 1A.1 FAILURE MODE EFFECTS ANALYSIS (FMEA)
Failure mode and effects analysis (FMEA) was originated in the 1950s by the aerospace industry and the U.S. military as FMECA (failure mode effects and criticality analysis). It was adopted by private industry in the late 1960s. FMEA is a reliability-planning tool that consists of a systematized group of activities intended to:

1. Recognize and evaluate the potential failure of a product/process and its effects.
2. Identify the root causes of the potential failure mode at a very fundamental level that is related to the underlying failure mechanism or physics of failure.
3. Prioritize potential failures according to their risk.
4. Identify actions that could eliminate or reduce the chance of the potential failure occurring.
5. Provide a living document for future use and for continuous reliability improvement.

It is complementary to the design process of positively defining what a design must do to satisfy the customer.
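Item 3 above, prioritizing potential failures by risk, is commonly carried out with the risk priority number (RPN) mentioned in the exercises. The Python sketch below assumes the conventional definition, RPN = severity × occurrence × detection, with each rating on a 1 to 10 scale. The failure modes and ratings are hypothetical illustrations and are not intended as answers to the exercises.

```python
from dataclasses import dataclass

@dataclass
class FailureMode:
    description: str
    severity: int    # 1 (negligible) to 10 (hazardous), assumed scale
    occurrence: int  # 1 (remote) to 10 (very frequent), assumed scale
    detection: int   # 1 (almost certain to detect) to 10 (cannot detect)

    @property
    def rpn(self) -> int:
        """Risk priority number: severity x occurrence x detection."""
        return self.severity * self.occurrence * self.detection

# Hypothetical ratings for illustration only.
modes = [
    FailureMode("Fan belt squeal", severity=4, occurrence=6, detection=4),
    FailureMode("Brake cable snaps", severity=10, occurrence=3, detection=6),
    FailureMode("Battery meter misreads", severity=3, occurrence=4, detection=5),
]

# Highest RPN first, so the riskiest items head the action list.
ranked = sorted(modes, key=lambda m: m.rpn, reverse=True)
for m in ranked:
    print(f"{m.description:<24} S={m.severity:>2} O={m.occurrence:>2} "
          f"D={m.detection:>2} RPN={m.rpn}")
```

Ranking on RPN alone can bury a safety-critical item whose occurrence rating is low, so a severity rating of 9 or 10 is usually reviewed regardless of where the product falls in the list.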
There are two types of FMEAs:

1. FMEA: Automotive industry type as described by SAE J1739 (1994) and AIAG (1995)
2. FMECA: Mil-STD-1629A (1984), Military Standard Procedures for Performing a Failure Mode, Effects and Criticality Analysis
We discuss the AIAG=SAE type first as, in some sense, the functionality of the 1629A format may be viewed as a superset of the automotive industry standard. An FMEA is a living document, which should be continuously updated whenever significant changes occur in the design or manufacturing process. The significant changes can include the following:
When new systems, designs, products, processes, or services are designed
When existing systems, designs, products, processes, or services are about to change regardless of reason
When new applications are found for the existing conditions of the systems, designs, products, processes, or service
1A.1.1 Types of FMEA

There are five main types of FMEA as depicted in Figure 1-11. They are

1. Concept FMEA
2. Design FMEA
3. Process FMEA
4. Service FMEA
5. Equipment FMEA

1A.1.2 Design FMEA (DFMEA)
The design FMEA is used to identify and correct potential failure modes at the system, subsystem, and component levels before they are released to the manufacturing environment. DFMEAs help to identify potential special characteristics during the product design stage so that corrective actions can be identified to eliminate the concerns. In synthesizing a DFMEA document, we assume that all components and subsystems are manufactured=assembled within engineering specifications. However, we do allow for the possibility that a design deficiency might lead to unacceptable manufacturing=assembly variation later on. Thus, the linkage between design and process FMEA is apparent. Some of the key objectives of a DFMEA can be summarized as follows (SAE J1739, 1994):
- To drive design improvements as a primary objective
- To address all high-risk failure modes
- To help to allocate resources for design verification activities
- To document “lessons learned” to be used as inputs to failure mode identification at the next product iteration
- To identify key characteristics
- To complete FMEA during a “window of opportunity” where it can best impact product design
- To assemble the right team of people throughout the FMEA process
- To ensure that time spent by the FMEA team is an effective and efficient use of time, with value-added results

FIGURE 1-11 Types of FMEA.
Benefits of DFMEA

The benefits of the design FMEA include the following:
- Establishes a priority for design improvement actions
- Documents the rationale for changes
- Provides information to help through product design verification and testing
- Helps identify the critical or significant characteristics
- Assists in the evaluation of design requirements and alternatives
- Helps identify and eliminate potential safety concerns
- Helps identify potential product failures early in the product development phase
- Improves reliability, reduces warranty
- Saves on prototype development
- Guides the development and use of design verification methods
- Prioritizes testing and validation resources
- Encourages simultaneous engineering
- Provides a formal, living document describing the process
- Reduces engineering design changes; reduces cycle time and costs
- Improves customer satisfaction
- Helps to identify design and manufacturing controls or ensure that defects at any stage do not “escape” to the field

1A.1.3 Process FMEA (PFMEA)
A process FMEA (PFMEA) is a disciplined analysis/method of identifying potential or known process failure modes and providing follow-up and corrective actions before the first production run occurs. It is utilized by the manufacturing responsible engineers/team and is part of the manufacturing planning process. PFMEA involves consideration of labor, machine, methods, material, measurement, and environment. The main purpose of a process FMEA is to identify the potential manufacturing or assembly process causes and process variables on which to focus controls for identifying potential quality problems during manufacturing and for detection of potential failures. In synthesizing a PFMEA document, we make the assumption that the design is correct, and so attention is focused on the identification of potential process deficiencies. We also assume that incoming parts are manufactured to their specification.

Benefits of Process FMEA
- Provides assurance that the manufacture and assembly will not cause failures
- Ensures that potential manufacturing failure modes have been addressed
- Identifies significant process or assembly characteristics
- Provides a mechanism for process improvement changes

1A.1.4 Concept FMEA
A concept FMEA is used at the beginning of program definition when the feasibility of a new program is assessed. The scope of a concept FMEA can be a design concept at a component, subsystem, or system level, or a manufacturing or assembly level.

Benefits of Concept FMEA

- Helps select the optimum design configuration
- Helps assess the design feasibility
- Helps determine if redundancy is required
1A.1.5 Description of DFMEA Form
We describe the DFMEA form in detail. Concept, equipment, and process FMEA forms are similar in structure but differ in terms of guidelines for entering ratings, etc. The FMEA process begins with filling in the header of the FMEA form with all relevant information pertaining to type of FMEA, description, responsible party, date and revision date of preparation, etc. An illustrated example of a design FMEA form is presented in Figure 1-12. In the illustrated example a potential failure mode is listed that has to do with a steering pump seal not performing its intended function properly. It leaks power-steering fluid, which leads to a loss of power-steering assist. One potential cause is described. This is the degradation of the pump seal material due to corrosive chemicals in the power-steering fluid that attack polymeric linkages, resulting in a loss of material sealing properties. The severity, occurrence, and detection ratings are shown along with a resultant risk priority number (RPN). We now describe the information fields of the DFMEA form in detail.
FIGURE 1-12 FMEA template with an example of a failure mode associated with a power-steering pump subsystem.
There are two basic approaches to the development of design FMEA(s):
1. Hardware FMEA(s). Each part/component is listed as in a bill of materials.
2. Functional FMEA(s). Only relevant functions are listed, keeping the size of the FMEA document manageable.
The hardware approach is normally used when hardware items can be uniquely identified from schematics, drawings, and other engineering and design data. In essence, we list a “bill of materials” in column 1 of the FMEA form. The hardware approach is normally utilized in a parts level-up fashion (bottom-up approach); however, it can be initiated at any level of indenture and progress in either direction.

The functional approach is normally used when hardware items cannot be uniquely identified or when system complexity requires analysis from the initial indenture level downward through succeeding indenture levels. The functional approach is normally utilized in an initial indenture level down fashion (top-down approach); however, it can be initiated at any level of indenture and progress in either direction. In this case all the known functions at the level of detail being studied are listed.

Potential Failure Mode

A failure mode is the manner in which a component or system failure occurs; it is the manner in which the part or system does not meet design intent. The failure mode is the answer to the question, “How could the component or system fail?” Note that a potential failure mode may also be the cause of a potential failure mode in a higher level of analysis or the effect of a lower level of analysis (component). All the functions of the item have to be listed in this field. Potential failure modes can be considered in any of the following four categories:
1. No function: There is a complete absence of the intended function.
2. Partial/degraded function: The item does not meet some of the required functions.
3. Intermittent function: The item performs its function intermittently (sometimes).
4. Unintended function: Another function is performed, which was unintended in the design.
Potential Effect of Failure

Potential effect(s) of failure are defined as the effects of the failure mode on the function of the item as perceived by the customer. There are four types of customer: internal customers, external customers, end user, and government/regulatory.

Severity

Severity defines the seriousness of the failure effect on the customer. A definition of the severity rating along with its relationship to reliability is depicted in Table 1-6. A severity rating of 10 or 9 must be addressed with an action and validated afterward to ensure that the rating of 10 or 9 does not recur.

TABLE 1-6 An Adaptation of the Severity Rating System Used in the Auto Industry

| Severity of effect | Description | Severity rating | Reliability target |
|---|---|---|---|
| Safety | Any failure mode that directly affects the ability of a product to meet Federal Safety Standards, or creates a potential product liability issue, or can result in death or extensive property damage. | 9–10 | 100% |
| Major | Any failure mode that stops the operation of a product or system which requires immediate repair (catastrophic). | 7–8 | 95% |
| Moderate | Any failure mode that prevents a product from meeting one of its intended functions, but does not preclude it from satisfying its most important functions. Customer may be annoyed or irritated. | 4, 5, or 6 | 80%–90% |
| Low | Failure causes only a slight customer annoyance. | 2–3 | (none) |
| Minor | Customer may not even notice failure. | 1 | (none) |

Source: Adapted from AIAG, 1995.
Note: Severity ratings are based on guidelines provided by AIAG (1995). The reliability targets were suggested to the author based on discussions with several reliability engineers who work in the auto industry.
Classification

Important quality characteristics are identified as follows:
1. Critical characteristics: those characteristics that can affect compliance with governmental regulations or safe product or service operation. These characteristics must be identified in the drawings and/or procedures, as well as on the FMEA form. Generally, the critical characteristics are defined by
   - The courts, through product liability
   - Regulatory agencies, through formal laws and/or regulations
   - Industrial standards, through generally accepted practices in the industry
   - Customer requisition, through their wants, needs, and expectations
   - Internal engineering requirements, through historical data or leading-edge technology, or experience with product or service
2. Significant characteristics: quality features of a process or product or service on which data should be collected. These characteristics are identified by a consensus of the customer and supplier as well as the FMEA team. All significant characteristics should be designated and agreed upon during the feasibility stage.
3. Key characteristics: measurement indicators that provide rapid feedback as to process and performance issues.
Potential Causes

Potential cause of a failure is defined as the design deficiency (weakness) that causes the item to have a specific failure mode. Every conceivable potential failure cause and/or mechanism should be listed for each failure mode. The cause/mechanism should be listed as concisely and completely as possible so that remedial efforts can be aimed at identifying the root cause and preventing it from occurring. The root cause must be determined for the failure modes with a severity rating of 10 or 9.

Occurrence

Occurrence is the likelihood that a specific failure cause/mechanism will occur during the design lifetime of an item. The likelihood of occurrence of a specific failure cause/mechanism is estimated using the probability of failure of the item. Table 1-7 is designed to guide the FMEA team in assigning an occurrence rating to a specific failure cause/mechanism.
TABLE 1-7 Occurrence Rating System Used by Auto Industry

| Probability of failure | Possible failure rates | Rating |
|---|---|---|
| Very high: Failure is almost inevitable. | 1 in 2 | 10 |
| | 1 in 3 | 9 |
| High: Repeated failures. | 1 in 8 | 8 |
| | 1 in 20 | 7 |
| Moderate: Occasional failures. | 1 in 80 | 6 |
| | 1 in 400 | 5 |
| | 1 in 2000 | 4 |
| Low: Relatively few failures. | 1 in 15,000 | 3 |
| | 1 in 150,000 | 2 |
| Remote: Failure is unlikely. | 1 in 1,500,000 | 1 |

Source: AIAG, 1995.
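As an illustration of how Table 1-7 might be applied, the following minimal Python sketch maps an estimated probability of failure onto an occurrence rating. The function name and the rule for handling probabilities that fall between two tabulated rates (round toward the higher, more conservative rating) are assumptions of this sketch and are not prescribed by AIAG.

```python
# Table 1-7 (AIAG, 1995): "1 in N" failure rates and their occurrence ratings.
OCCURRENCE_TABLE = [
    (2, 10), (3, 9), (8, 8), (20, 7), (80, 6),
    (400, 5), (2000, 4), (15_000, 3), (150_000, 2), (1_500_000, 1),
]


def occurrence_rating(failure_probability: float) -> int:
    """Return an occurrence rating (1-10) for an estimated failure probability.

    Assumption of this sketch: a probability lying between two tabulated
    rates is assigned the rating of the next-higher (worse) tabulated rate.
    """
    if not 0.0 <= failure_probability <= 1.0:
        raise ValueError("failure_probability must lie in [0, 1]")
    # Walk from the rarest rate to the most frequent one and stop at the
    # first tabulated rate that is at least as large as the estimate.
    for n, rating in reversed(OCCURRENCE_TABLE):
        if failure_probability <= 1.0 / n:
            return rating
    return 10  # more frequent than 1 in 2


if __name__ == "__main__":
    for p in (1 / 1_500_000, 1 / 2000, 0.01, 0.4):
        print(f"p = {p:.2e} -> occurrence rating {occurrence_rating(p)}")
```

A team that prefers to round toward the nearer tabulated rate, or to use warranty data directly, would change only the comparison inside the loop.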
Occurrence ratings are set based on actual experience with the product or similar products. In the case of very new, innovative products, the design team will be challenged to define an acceptable estimate of the occurrence rating.

Current Design Controls

Design control is a method or test method used to detect a first-level cause of a potential failure mode and detect a failure mode before it is released to manufacturing. There are three types of design controls/features to consider:
1. Those that prevent the cause/mechanism or failure mode/effect from occurring or that reduce their rate of occurrence
2. Those that detect the cause/mechanism and lead to corrective action
3. Those that detect the failure mode
The preferred approach is to first use type 1 controls, if possible; second, use type 2; third, use type 3. Keep in mind that the desired output of applying design control methods is to detect potential design flaws and then take corrective actions to eliminate them or reduce their rate of occurrence. In this step the FMEA team can contribute to the development of an efficient design verification test plan.

Detection

Detection is the degree of confidence that the specific failure cause/mechanism will be caught using the existing design control methods. Since detection is a relative rating of the effectiveness of the design control to catch the specific failure cause/mechanism, the FMEA team should agree on a consistent evaluation and ranking system throughout the FMEA process. Detection should be estimated using Table 1-8.

TABLE 1-8 Detection Rating System Used by Auto Industry

| Detection | Criteria: Likelihood of detection by design control | Rating |
|---|---|---|
| Absolute uncertainty | Design control will not and/or cannot detect a potential cause/mechanism and subsequent failure mode; or there is no design control. | 10 |
| Very remote | Very remote chance the design control will detect a potential cause/mechanism and subsequent failure mode. | 9 |
| Remote | Remote chance the design control will detect a potential cause/mechanism and subsequent failure mode. | 8 |
| Very low | Very low chance the design control will detect a potential cause/mechanism and subsequent failure mode. | 7 |
| Low | Low chance the design control will detect a potential cause/mechanism and subsequent failure mode. | 6 |
| Moderate | Moderate chance the design control will detect a potential cause/mechanism and subsequent failure mode. | 5 |
| Moderately high | Moderately high chance the design control will detect a potential cause/mechanism and subsequent failure mode. | 4 |
| High | High chance the design control will detect a potential cause/mechanism and subsequent failure mode. | 3 |
| Very high | Very high chance the design control will detect a potential cause/mechanism and subsequent failure mode. | 2 |
| Almost certain | Design control will almost certainly detect a potential cause/mechanism and subsequent failure mode. | 1 |

Source: AIAG, 1995.

Risk Priority Number (RPN)

The risk priority number (RPN) is defined as the product of severity (S), occurrence (O), and detection (D):

$$\mathrm{RPN} = \text{severity } (S) \times \text{occurrence } (O) \times \text{detection } (D)$$

Even though there is no threshold value for RPNs in terms of prioritizing failure modes for corrective actions, the following procedure can be adopted by the FMEA team to utilize the RPNs:
- Under minor risk, no action is taken.
- Under moderate risk, some action may take place.
- Under high risk, definite action will take place.
- Under critical risk, definitive actions will take place and extensive changes are required in the system, design, product, process, and/or service.
Because of the subjective nature of RPN numbers, design teams should never compare RPN ratings extracted from differing FMEA projects, nor should they artificially set RPN thresholds for action.
Risk of Relying on RPN Rating Scheme Engineers should be cautioned against relying on RPN ratings excessively, as the use of RPN ratings goes against a modern design for reliability philosophy, which calls for designing the product ‘‘right the first time!’’ Under such a rating scheme, the design team is rewarded for relying on hardware testing to detect design deficiencies. This can lead to several iterations of prototype build, test, and fix, which is both costly and time-consuming! Instead, the design team ought to rely more on criticality ratings, which are just the product of severity and occurrence ratings.
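To make the contrast between the two rankings concrete, the following minimal Python sketch computes both the RPN (S × O × D) and the criticality rating (S × O) for two line items. The class name, field names, failure modes, and ratings are hypothetical and chosen only for illustration; they are not taken from any FMEA in this text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FmeaLine:
    """One FMEA line item (illustrative field names, not the AIAG form)."""
    failure_mode: str
    severity: int    # S, 1-10
    occurrence: int  # O, 1-10
    detection: int   # D, 1-10

    def __post_init__(self):
        for name in ("severity", "occurrence", "detection"):
            value = getattr(self, name)
            if not 1 <= value <= 10:
                raise ValueError(f"{name} must be between 1 and 10, got {value}")

    @property
    def rpn(self) -> int:
        # Risk priority number: product of all three ratings.
        return self.severity * self.occurrence * self.detection

    @property
    def criticality(self) -> int:
        # Criticality rating: severity times occurrence only.
        return self.severity * self.occurrence


# Hypothetical ratings, for illustration only.
lines = [
    FmeaLine("seal leak", severity=8, occurrence=4, detection=2),
    FmeaLine("connector corrosion", severity=5, occurrence=3, detection=9),
]

by_rpn = sorted(lines, key=lambda item: item.rpn, reverse=True)
by_criticality = sorted(lines, key=lambda item: item.criticality, reverse=True)

if __name__ == "__main__":
    print("Ranked by RPN:        ", [(x.failure_mode, x.rpn) for x in by_rpn])
    print("Ranked by criticality:", [(x.failure_mode, x.criticality) for x in by_criticality])
```

In this made-up case the less severe, less frequent failure mode heads the RPN ranking only because its detection rating is poor, whereas the criticality ranking puts the more severe and more frequent failure mode first.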
Recommended Action

FMEA can be viewed as a Pareto chart for prioritizing the allocation of resources for design improvements. Based on the RPN or criticality ratings, resources should be allocated for further improvement or design verification. This might be reflected in the recommended actions column. The purpose of a recommended action is to eliminate potential failure modes. The FMEA team should prioritize actions based on those failure modes:
- With effects that have the highest severity ratings
- With causes that have the highest severity times occurrence ratings
- With the highest RPNs
The FMEA team should know that only a design revision can bring about a reduction in the severity ranking. Actions such as the following should be considered:
- Design of experiments
- Revised test plans
- Revised design
- Revised material specification
With the action plan, the designer should be sure to provide detailed supporting facts that give traceability to the action plan.

Hierarchical Nature of FMEA Information

The information reported in an FMEA is hierarchically arranged, allowing one or more data entries of the next field to be associated with the data entry of a previous field. That is,
- Each failure mode and effect combination can have one or more causes associated with it.
- Subsequently, each cause can have one or more design controls associated with it.
- Each unique combination of failure mode, effect, cause, and design control will have a unique assignment of severity, occurrence, detection rating, and RPN.
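The hierarchy described above can be sketched as nested records. The following minimal Python example is an illustration only: the class names, the entries (loosely modeled on a fluid-leak failure mode), and all ratings are hypothetical. It shows how each failure mode and effect pairing fans out into causes and design controls, with one RPN per unique combination.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class DesignControl:
    description: str
    detection: int  # D rating assigned to this control


@dataclass
class Cause:
    description: str
    occurrence: int  # O rating assigned to this cause
    controls: List[DesignControl] = field(default_factory=list)


@dataclass
class FailureModeEffect:
    failure_mode: str
    effect: str
    severity: int  # S rating assigned to this effect
    causes: List[Cause] = field(default_factory=list)


def flatten(item: FailureModeEffect) -> Iterator[Tuple[str, str, str, str, int]]:
    """Yield one row per (failure mode, effect, cause, control) with its RPN."""
    for cause in item.causes:
        for control in cause.controls:
            rpn = item.severity * cause.occurrence * control.detection
            yield (item.failure_mode, item.effect,
                   cause.description, control.description, rpn)


# Hypothetical entries and ratings, for illustration only.
leak = FailureModeEffect(
    failure_mode="fluid leak",
    effect="loss of steering assist",
    severity=8,
    causes=[
        Cause("seal material degradation", occurrence=4, controls=[
            DesignControl("fluid compatibility test", detection=3),
            DesignControl("durability test", detection=5),
        ]),
        Cause("seal undersized", occurrence=2, controls=[
            DesignControl("tolerance stack-up review", detection=2),
        ]),
    ],
)

rows = list(flatten(leak))

if __name__ == "__main__":
    for row in rows:
        print(row)
```

One failure mode and effect pairing with two causes and three design controls therefore expands into three line items, which is why FMEA documents grow quickly in length.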
As an example, we reproduce a portion of a design FMEA on a braking system in Figure 1-13 (Augustine, 2001). In this reproduction we see many causes listed for each failure mode. Other FMEAs may appear quite a bit more elaborate and lengthy, due to the detail associated with the reproduction of many cause and design control combinations that are associated with each failure mode and effect pairing.

System Integration

FMEA information at various levels of detail (system, subsystem, and component; or concept, design, and process) must be consistent. To accomplish this, it is important to recognize the linkages between FMEAs at various levels of detail. This is illustrated in Figure 1-14, which shows that the failure mode at one level is synonymous with the cause at the next-higher level of detail. Additional integration of FMEAs is possible. That is, it is desirable for the design FMEA to be clearly linked with the design verification, planning, and reporting (DVP&R) document. Similarly, there should be a linkage between the design FMEA and the process FMEA and between the process FMEA and the process and control plans. These linkages are illustrated in Figures 1-19 and 1-20.

Other Issues

Design engineers who have recently completed their FMEA training often come away with a “false” perception that the FMEA methodology is very straightforward. In fact, this is far from the case, as the veteran reliability engineer is likely to understand that the use of the FMEA methodology is more of an “art” than a “science.”
FIGURE 1-13 FMEA of braking system. (From Augustine, 2001.)
FIGURE 1-14 Illustration of end-effect failure mode at one level viewed as a cause at the next-higher level (next-higher effect).
For example, is it possible, as we illustrate in Figure 1-15 for a gasket seal component, that a “leak” can cause a “leak,” which results in a “leak”? The answer is “yes” if one is not careful with how one details the information fields in an FMEA. This can lead to a nearly worthless FMEA document, as it does little to assist in designing out potential failure modes. The key is always to identify the underlying root causes or underlying failure mechanisms. Information from the failure reporting, analysis, and corrective action system (FRACAS) and other field or warranty information should be used to drive this process. This will allow the design engineer to use problem-solving methodologies to identify corrective actions to design out design deficiencies early in product design. Fault-tree analysis is another useful tool for identifying root causes.

The FMEA effort can only be as good as the FMEA team's ability to work together effectively. FMEA teamwork requires effective team facilitation and leadership. There must be occasions when the team is allowed all the time that they require for brainstorming and other free-thinking activities, while not allowing for criticism. On other occasions the team leader must be able to make sure that the process is expedited by setting ground rules for limiting discussion and for formal voting activities if required.

An FMEA document can be valuable only if it is highly regarded, and not too lengthy. Many FMEA efforts utilize a “hardware approach,” which results in very lengthy documents. It is not uncommon to see a voluminous FMEA on a printed circuit board that consists of line items containing the term “overstress voltage” as the cause of the failure. How valuable can such a document be? It is probably much better for the design team to go against accepted practices, and consider the use of hybrid FMEAs, which contain mostly functional-related failures, while keeping some of the hardware line items where necessary. The key is to come up with a useful document. This should be one of the goals of the team, but not the most important goal, because the most important goal is to come up with a plan for improving reliability by designing out/preventing potential failure modes from occurring in the field.

FIGURE 1-15 “Leak” causes a “leak”?
APPENDIX 1A.2
FAILURE MODE EFFECTS AND CRITICALITY ANALYSIS (FMECA)
Failure mode effects and criticality analysis (FMECA) is carried out in two parts. The first part identifies failure modes and their effects and so is very similar to the auto industry standard without the ratings. An example of such is shown in Figure 1-16.
FIGURE 1-16 FMEA part of FMECA. (Output from ALD1’s FMECA processor system.)
According to Mil-STD-1629A, four severity classifications are used in FMECA. They are
- Category I (Catastrophic): a failure that may cause death or weapon system loss (i.e., aircraft, tank, missile, ship, etc.)
- Category II (Critical): a failure that may cause severe injury, major property damage, or major system damage that will result in mission loss
- Category III (Marginal): a failure that may cause minor injury, minor property damage, or minor system damage that will result in delay or loss of availability or mission degradation
- Category IV (Minor): a failure not serious enough to cause injury, property damage, or system damage, but which will result in unscheduled maintenance or repair

What really differentiates FMECA from FMEA is the criticality analysis portion. Criticality implies much more than severity × occurrence in Mil-STD-1629A. For very complex, mission-critical systems such as aerospace or defense products, a very careful hazard-rate analysis of each failure mode by category is performed. Operational times, the window of opportunity for each failure mode to evolve, are considered. A criticality rating, $C_m$, for each failure mode is established through the use of the following relationship (see Ireson and Coombs, 1996):

$$C_m = \lambda_p\, t\, \alpha\, \beta \tag{1A-1}$$

where
- $\lambda_p$ = Part failure rate.
- $t$ = Specified time interval.
- $\alpha$ = Failure mode ratio: the fraction of failures of the part that will involve the specified failure mode.
- $\beta$ = Failure effect probability: the probability that a specified failure effect will occur, given that the failure mode occurs.

An illustration of criticality analysis is shown in Figure 1-17. The criticality ratings are aggregated by severity classification, as shown in the figure.
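As a rough numerical illustration of Eq. (1A-1), the following minimal Python sketch evaluates $C_m$ for a few failure modes and sums the results within each severity category. It assumes that $\lambda_p$ is a failure rate expressed per hour and $t$ is in hours; the function name and all numerical values are hypothetical.

```python
from collections import defaultdict


def mode_criticality(failure_rate: float, time: float,
                     mode_ratio: float, effect_probability: float) -> float:
    """Criticality of one failure mode, C_m = lambda_p * t * alpha * beta."""
    return failure_rate * time * mode_ratio * effect_probability


# (severity category, lambda_p [1/hr], t [hr], alpha, beta); made-up values.
modes = [
    ("II", 2.0e-6, 1000.0, 0.6, 1.0),
    ("II", 2.0e-6, 1000.0, 0.4, 0.5),
    ("III", 5.0e-6, 1000.0, 1.0, 0.1),
]

# Aggregate the mode criticalities by severity classification.
by_category = defaultdict(float)
for category, lam, t, alpha, beta in modes:
    by_category[category] += mode_criticality(lam, t, alpha, beta)

if __name__ == "__main__":
    for category in sorted(by_category):
        print(f"Category {category}: criticality = {by_category[category]:.2e}")
```

Here the two Category II entries share one failure rate and their mode ratios sum to 1, reflecting two failure modes of the same item.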
APPENDIX 1A.3
DESIGN VERIFICATION, PLANNING & REPORTING (DVP&R)
An illustration of a drop test in a DVP&R report is presented in Figure 1-18. The major fields filled in include
1. Name of the test procedure
2. A description of the test
3. Acceptance criteria
4. Target requirements: duration, pass/fail criteria, number on test, etc.
5. Test responsibilities (name of vendor, Magna)
6. Test stage (i.e., design or process verification, and stage or iteration)
7. Sample quantities on test and type of information recorded

The fields to the right are provided for recording the results of design verification reports. Information includes
1. Number of items put on test, type of information recorded, and completion date
2. Results of test
3. Any notes

FIGURE 1-17 Example of criticality analysis. (Output from ALD1’s FMECA processor system.)

FIGURE 1-18 Sample DVP&R form for a drop test. (Provided by Rob Zweke, IE7270 class, Wayne State University, Fall 1998.)

FIGURE 1-19 A failure mode with high RPN translates into test requirements that appear in the DVP&R.

FIGURE 1-20 Linkage between design and process FMEAs.
Again, we reiterate that the allocation of resources for testing should be based on the priorities determined by the outputs from FMEA documents. This is illustrated in Figure 1-19. Figure 1-20 depicts the relationship between design and process FMEAs.
2 Preliminaries, Definitions, and Use of Order Statistics in Reliability Estimation
Have you heard the story about a plant engineer who was “made” a reliability engineer? The engineer’s lack of training in the field of reliability never seemed to matter until one day a mistake that he made caught up with him. It seems that he incorrectly entered test data into his statistical package. Some of his data was incomplete, and he didn’t know what to do with it, so he ignored the data. He always treated incomplete data sets this way, leading to overly optimistic reliability predictions. This was not noticed until unexpectedly high warranty claims occurred on one particular product that caught management’s attention. Management demanded to know how this could have happened! (Management should have been told that it was its fault for putting this engineer in this position in the first place!) If there is a moral to this story, and there may be many, it is that training is important. The only way to make up for a weak background in the statistical sciences is with training. Organizations that are looking for reliability engineers might want to seek out professionals who have graduate-level coursework in reliability and/or are Certified Reliability Engineers by the American Society for Quality (ASQ). These individuals have to pass a comprehensive exam in order to earn this distinction.
In this chapter we provide a survey of basic definitions and associated mathematics for modeling reliability. Specifically, we discuss
1. Time-to-failure distribution theory, including the introduction of the use of hazard-rate and cumulative hazard-rate formulas. The definition of $B_{10}$ life along with the use of $t_R$ notation, denoting the $100R$th percentile of the survival distribution, are introduced.
2. Use of (a) median and mean rank formulas for providing empirical estimates of reliability and of (b) the cumulative failure distribution are introduced. Applications for both individual and grouped data are presented. The theory behind the development of order statistics is deferred to the appendix.
3. The use of adjusted rank formulas for estimating order statistics in the presence of censoring is discussed. Definitions of left, right, and interval censoring are presented. Kaplan–Meier product-limit estimators are discussed in the appendix, wherein product-limit estimators and the much simpler adjusted rank formulas are shown to be equivalent approaches.

2.1 RELIABILITY METRICS
Speak with information, not opinion!
Anonymous

Reliability is the probability that a system (component) will perform its intended function, as defined by the customer, over a specified period of exposure (including conditions of usage). Accordingly, the ability to work with distribution metrics should comprise a required portion of the training of today’s reliability practitioner, as probabilistic-based metrics are needed to characterize reliability phenomena.
2.1.1 Reliability Functions

Reliability, $R(t)$, is defined as the probability of survival to some time or usage metric. Let $T$, a random variable, denote the time-to-failure of a system or component; $T \ge 0$. Thus,*

$$R(t) = P(T \ge t) \tag{2.1}$$

with initial condition $R(0) = 1.0$ and $\lim_{t \to \infty} R(t) = 0$. Let $f(t)$ denote the probability density function of failure (see Table 2-1); then

$$R(t) = 1 - \int_0^t f(t')\,dt' = \int_t^\infty f(t')\,dt' \tag{2.2}$$

*Referring to our definition of reliability, we speak of a failure occurrence when performance does not meet the minimum criteria for acceptability as defined by the customer.

Similarly, we define the complementary function, $F(t) = 1 - R(t)$, to denote the probability of occurrence of a failure before time $t$. Thus, its boundary conditions are complementary with $R(t)$, with $F(0) = 0.0$ and $\lim_{t \to \infty} F(t) = 1$, and

$$F(t) = \int_0^t f(t')\,dt' = 1 - R(t) \tag{2.3}$$

The shape of the probability density function, $f(t)$, can be found by taking the first derivative of either $F(t)$ or $R(t)$:

$$f(t) = \frac{dF(t)}{dt} = -\frac{dR(t)}{dt} \tag{2.4}$$

Instantaneous Hazard (Failure) Function, $\lambda(t)$

The (instantaneous) hazard-rate or failure-rate function, $\lambda(t)$, is interpreted as a measure of the instantaneous proportional rate of change of the survival fraction. It is obtained as a ratio of the probability density function to the reliability function:

$$\lambda(t) = \frac{f(t)}{R(t)} \tag{2.5}$$

The instantaneous failure-rate function is useful for describing wear-in, wearout, and constant (random) failure phenomena. The reliability bathtub curve is a conceptual model of the hazard-rate function over the life of a product. Conceptually, a product may have three phases: a wear-in phase, a random failure-rate phase, and a wearout phase. This model is discussed and the bathtub curve illustrated earlier (see §1.3.5 and Figure 1-7).

Physical Meaning of $\lambda(t)$. To assist the reader in associating a physical meaning to $\lambda(t)$, we define the following notation:

$N_0$ = the number of operational units at time $t = 0$.
$N(t)$ = the number of operational units at time $t$.

Then,

$$R(t) = \frac{N(t)}{N_0} \tag{2.6}$$
TABLE 2-1 Distributed Metrics Used in Characterizing Reliability

| Metric | Definition | Functional expression |
|---|---|---|
| $f(t)$, probability density function (pdf) | $f(t)\,dt$ represents the probability of failure in the interval $(t, t + dt)$. | $P(a \le t \le b) = \int_a^b f(t)\,dt$ |
| $F(t)$, cumulative failure distribution (cdf) | (1) Probability that a single item fails by time $t$ (2) Population failure fraction by time $t$ | $F(t) = \int_0^t f(t')\,dt'$ |
| $R(t)$, reliability | (1) Probability that a single item survives to time $t$ (2) Population survival fraction by time $t$ | $R(t) = 1 - F(t) = 1 - \int_0^t f(t')\,dt' = \int_t^\infty f(t')\,dt'$ |
| $\lambda(t)$, instantaneous failure rate, or $h(t)$, hazard rate | Conditional probability of failure per unit time | $\lambda(t) = f(t)/R(t)$ |
| $H(t)$, cumulative hazard function | Accumulation of $\lambda(t)$ | $H(t) = \int_0^t \lambda(t')\,dt'$ |
| $t_M$, median life | 50th percentile of the population $(= B_{50} = t_{0.50})$ | $F(t_M) = 0.5 \Rightarrow t_M = F^{-1}(0.5)$ |
| $B_{100p}$ life | $100p$th left percentile of failure distribution | $F(B_{100p}) = p \Rightarrow B_{100p} = F^{-1}(p)$ |
| $B_{10}$ life (see Figure 2-1) | 10th percentile of failure distribution | $F(B_{10}) = 0.10$ |
| $t_R$ life (see Figure 2-1) | $R$th right tail of failure distribution | $F(t_R) = 1 - R \Rightarrow t_R = F^{-1}(1 - R)$ |
| Population mean ($\mu$) | The first population moment, an average distributional measure | $\mu = \int_0^\infty t f(t)\,dt = \int_0^\infty R(t)\,dt$ |
| Population variance ($\sigma^2$) | The second population moment about the mean, a measure of dispersion | $\sigma^2 = \int_0^\infty (t - \mu)^2 f(t)\,dt = \int_0^\infty t^2 f(t)\,dt - \mu^2$ |
| Coefficient of variation ($\gamma$) | The ratio of $\sigma$ to $\mu$ | $\gamma = \sigma/\mu$ |
| Population skewness ($\gamma_3$) | A relative measure of departure from symmetry | $\gamma_3 = \dfrac{E(t - \mu)^3}{\sigma^3} = \dfrac{\int_0^\infty (t - \mu)^3 f(t)\,dt}{\sigma^3}$ |
| Population kurtosis ($\gamma_4$) | A relative measure of how peaked the center of a distribution is, or how heavy the tails are | $\gamma_4 = \dfrac{E(t - \mu)^4}{\sigma^4} = \dfrac{\int_0^\infty (t - \mu)^4 f(t)\,dt}{\sigma^4}$ |
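Several of the metrics in Table 2-1 can be checked numerically. The following minimal Python sketch assumes, purely for illustration, a two-parameter Weibull life model with $R(t) = \exp[-(t/\theta)^\beta]$; the parameter values, function names, and integration settings are hypothetical. It evaluates $\lambda(t) = f(t)/R(t)$, accumulates $H(t)$ by the trapezoidal rule, solves $F(B_{100p}) = p$ in closed form, and approximates $\mu$ as the integral of $R(t)$.

```python
import math

THETA, BETA = 1000.0, 2.0  # hypothetical Weibull scale (hr) and shape


def reliability(t: float) -> float:
    """R(t) for the assumed Weibull model."""
    return math.exp(-((t / THETA) ** BETA))


def pdf(t: float) -> float:
    """f(t) = -dR(t)/dt for the assumed Weibull model."""
    return (BETA / THETA) * (t / THETA) ** (BETA - 1) * reliability(t)


def hazard(t: float) -> float:
    """lambda(t) = f(t) / R(t)."""
    return pdf(t) / reliability(t)


def trapezoid(func, lower: float, upper: float, steps: int) -> float:
    """Composite trapezoidal rule for the integral of func on [lower, upper]."""
    width = (upper - lower) / steps
    total = 0.5 * (func(lower) + func(upper))
    total += sum(func(lower + i * width) for i in range(1, steps))
    return total * width


def cumulative_hazard(t: float, steps: int = 10_000) -> float:
    """H(t): accumulation of lambda(t') from 0 to t."""
    return trapezoid(hazard, 0.0, t, steps)


def percentile_life(p: float) -> float:
    """B_{100p} life: the time at which F(t) = p."""
    return THETA * (-math.log(1.0 - p)) ** (1.0 / BETA)


def mean_life(upper: float = 10_000.0, steps: int = 100_000) -> float:
    """mu as the integral of R(t), truncated at `upper` where R(t) is negligible."""
    return trapezoid(reliability, 0.0, upper, steps)


if __name__ == "__main__":
    t = 500.0
    print("R(t)          :", reliability(t))
    print("exp(-H(t))    :", math.exp(-cumulative_hazard(t)))
    print("B10 life      :", percentile_life(0.10))
    print("median life   :", percentile_life(0.50))
    print("mean (numeric):", mean_life())
    print("mean (gamma)  :", THETA * math.gamma(1.0 + 1.0 / BETA))
```

The closed-form mean $\theta\,\Gamma(1 + 1/\beta)$ printed on the last line is a standard Weibull result and serves only as a cross-check on the numerical integral.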
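Additionally, by Eq. (2.4), $f(t) = -dR(t)/dt$, and so

$$f(t) = \lim_{\Delta t \to 0} \frac{R(t) - R(t + \Delta t)}{\Delta t} \tag{2.7}$$

Combining Eqs. (2.6) and (2.7), we provide an alternate expression for $\lambda(t)$ as follows:

$$\begin{aligned}
\lambda(t) = \frac{f(t)}{R(t)} &= \lim_{\Delta t \to 0} \frac{R(t) - R(t + \Delta t)}{\Delta t \, R(t)} \\
&= \lim_{\Delta t \to 0} \frac{(N(t)/N_0) - (N(t + \Delta t)/N_0)}{\Delta t \, (N(t)/N_0)} \\
&= \lim_{\Delta t \to 0} \frac{[N(t) - N(t + \Delta t)]/\Delta t}{N(t)}
\end{aligned} \tag{2.8}$$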
Thus, we may attach the physical association: $\lambda(t)$ is an instantaneous rate of change in fraction/proportion/probability failed over time. Note: You might sometimes see the units for $\lambda(t)$ expressed in FITS (failures/$10^9$ hr) or in BITS (failures/$10^9$ hr).

Relationship Between $\lambda(t)$ and $R(t)$ and $H(t)$

$R(t)$ may be directly evaluated from an expression for $\lambda(t)$ if it is known. We begin with Eq. (2.5) and integrate both sides over $t$:

$$\lambda(t) = -\frac{dR(t)/dt}{R(t)}$$

$$\int_0^t \lambda(t')\,dt' = -\int_0^t \frac{dR(t')/dt'}{R(t')}\,dt' = -\int_0^t \frac{dR(t')}{R(t')} = -\ln R(t) \tag{2.9}$$

Thus,

$$R(t) = \exp\Bigl[-\underbrace{\int_0^t \lambda(t')\,dt'}_{H(t)}\Bigr] \tag{2.10}$$

The bracketed expression on the right-hand side is referred to as the cumulative hazard function, $H(t) = \int_0^t \lambda(t')\,dt'$. Table 2-2 summarizes these relationships in compact form. It is in the form of a from–to table. Given any metric displayed in a column header, the interrelationship with any of the other reliability metrics displayed in a row heading is summarized in the cell corresponding to the intersection of the selected row and column.

2.1.2 Population Moments
Population Mean ($\mu$) or MTTF

The population mean, $\mu$, or mean time-to-failure (MTTF) is given by

$$\mu = \int_0^\infty t f(t)\,dt = \int_0^\infty R(t)\,dt \tag{2.11}$$

Proof of the alternate form, $\mu = \int_0^\infty R(t)\,dt$, is shown in the appendix; see Appendix §2A.1. The mean time-to-failure (MTTF) is commonly estimated using the sample average estimator, $\bar{t} = \sum_{i=1}^{n} t_i / n$.

Population Variance ($\sigma^2$)

The variance is the second population moment about the mean and is given by

$$\sigma^2 = \int_0^\infty (t - \mu)^2 f(t)\,dt = \int_0^\infty t^2 f(t)\,dt - \mu^2 \tag{2.12}$$

The population variance is a measure of process width. For example, under a normal distribution assumption, $\mu \pm 3\sigma$ will cover 99.73% of the population. The population variance is commonly estimated by $s^2$, the sample variance, where

$$s^2 = \frac{\sum_{i=1}^{n} (t_i - \bar{t})^2}{n - 1}$$

Coefficient of Variation ($\gamma$)

The coefficient of variation is given by

$$\gamma = \frac{\sigma}{\mu} \tag{2.13}$$

This is a popular measure to relate process width in proportion to the population mean. It can be estimated using $s/\bar{t}$.
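The sample estimators just described ($\bar{t}$, $s^2$, and $s/\bar{t}$) are simple to compute directly. The following is a minimal sketch; the function name `sample_moments` and the five failure times are hypothetical values chosen only to make the arithmetic easy to follow.

```python
import math

def sample_moments(times):
    """Return (t_bar, s2, cv): sample mean, sample variance, and s / t_bar."""
    n = len(times)
    t_bar = sum(times) / n
    # Sample variance uses the n - 1 divisor, as in the text.
    s2 = sum((t - t_bar) ** 2 for t in times) / (n - 1)
    cv = math.sqrt(s2) / t_bar
    return t_bar, s2, cv

# Hypothetical failure times (illustration only, not data from the text).
times = [2.0, 4.0, 4.0, 5.0, 10.0]
t_bar, s2, cv = sample_moments(times)
print(t_bar, s2, cv)
```

For these illustrative values the squared deviations sum to 36, so $s^2 = 36/4 = 9$ and the estimated coefficient of variation is $3/5 = 0.6$.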
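TABLE 2-2 Interrelationship Between Reliability Metrics: $f(t)$, $F(t)$, $R(t)$, $\lambda(t)$, and $H(t)$

| From \ To | $f(t)$ | $F(t)$ | $R(t)$ | $\lambda(t)$ | $H(t)$ |
|---|---|---|---|---|---|
| $f(t)$ | – | $F(t) = \int_0^t f(t')\,dt'$ | $R(t) = 1 - \int_0^t f(t')\,dt'$ | $\lambda(t) = \dfrac{f(t)}{1 - \int_0^t f(t')\,dt'}$ | $H(t) = \int_0^t \dfrac{f(t')}{1 - \int_0^{t'} f(t'')\,dt''}\,dt'$ |
| $F(t)$ | $f(t) = \dfrac{dF(t)}{dt}$ | – | $R(t) = 1 - F(t)$ | $\lambda(t) = \dfrac{dF(t)/dt}{1 - F(t)}$ | $H(t) = \int_0^t \dfrac{dF(t')}{1 - F(t')}$ |
| $R(t)$ | $f(t) = -\dfrac{dR(t)}{dt}$ | $F(t) = 1 - R(t)$ | – | $\lambda(t) = -\dfrac{d \ln R(t)}{dt}$ | $H(t) = -\ln R(t)$ |
| $\lambda(t)$ | $f(t) = \lambda(t) \exp\left[-\int_0^t \lambda(t')\,dt'\right]$ | $F(t) = 1 - \exp\left[-\int_0^t \lambda(t')\,dt'\right]$ | $R(t) = \exp\left[-\int_0^t \lambda(t')\,dt'\right]$ | – | $H(t) = \int_0^t \lambda(t')\,dt'$ |
| $H(t)$ | $f(t) = \dfrac{dH(t)}{dt} \exp[-H(t)]$ | $F(t) = 1 - \exp[-H(t)]$ | $R(t) = \exp[-H(t)]$ | $\lambda(t) = \dfrac{dH(t)}{dt}$ | – |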
Population Skewness ($\gamma_3$)

The skewness measure is given by

$$\gamma_3 \equiv \frac{\mu_3}{\sigma^3} = \frac{E(t - \mu)^3}{\sigma^3} = \frac{\int_0^\infty (t - \mu)^3 f(t)\,dt}{\sigma^3} \tag{2.14}$$

$\gamma_3$ is a measure of symmetry. For symmetric distributions, $\gamma_3 = 0$. $\gamma_3$ will be positive for distributions possessing a very long right tail and negative for very long left-tailed distributions. This effect is illustrated in Figure 2-1. It can be estimated using (Kendall et al., 1987)

$$\hat{\gamma}_3 = \frac{n^2\, m_3}{(n - 1)(n - 2)\, s^3} \qquad \text{where} \qquad m_3 = \frac{\sum_{i=1}^{n} (t_i - \bar{t})^3}{n}$$

Population Kurtosis ($\gamma_4$)

The Pearson kurtosis is given by

$$\gamma_4 \equiv \frac{\mu_4}{\sigma^4} = \frac{E(t - \mu)^4}{\sigma^4} = \frac{\int_0^\infty (t - \mu)^4 f(t)\,dt}{\sigma^4} \tag{2.15}$$

Kurtosis is a measure of peakedness, or how heavy the tails of a distribution are. This measure is equal to 3.0 for a normal distribution. This measure will exceed 3 for distributions having very heavy tails and will be less than 3 for distributions having a concentrated probability density at the center (large peak). This effect is illustrated in Figure 2-2. An estimator for the Pearson kurtosis is given by (Kendall et al., 1987)

$$\hat{\gamma}_4 = \frac{n^2\left[(n + 1)\, m_4 - 3(n - 1)\, m_2^2\right]}{(n - 1)(n - 2)(n - 3)\, s^4}$$

where

$$m_4 = \frac{\sum_{i=1}^{n} (t_i - \bar{t})^4}{n} \qquad \text{and} \qquad m_2 = \frac{\sum_{i=1}^{n} (t_i - \bar{t})^2}{n}$$

FIGURE 2-1 Difference between left tail, $B_{10}$ life, and right tail, $t_{0.10}$ life.

FIGURE 2-2 Characteristic shape of distribution associated with different $\gamma_3$-, $\gamma_4$-values.

2.1.3 Worked-Out Examples

Example 2-1 The time-to-failure of a key component is believed to follow a uniform distribution on the interval $(0, a)$. Find general expressions for the reliability metrics $f(t)$, $F(t)$, $R(t)$, $\lambda(t)$, and $H(t)$; the population moments $\mu$ and $\sigma^2$; and the median, $t_M$.
Solution: By definition, the uniform distribution has probability density function $f(t)$ given by

$$f(t) = \frac{1}{a} \quad \text{for } 0 \le t \le a; \quad 0 \text{ elsewhere.}$$

Thus, expressions for $F(t)$ and $R(t)$ are readily obtained:

$$F(t) = \begin{cases} 0.0 & t \le 0 \\ \int_0^t \frac{1}{a}\,dt' = \frac{t}{a} & 0 < t < a \\ 1.0 & t \ge a \end{cases}
\qquad
R(t) = 1 - F(t) = \begin{cases} 1.0 & t \le 0 \\ 1 - t/a & 0 < t < a \\ 0.0 & t \ge a \end{cases}$$

These are followed by expressions for $\lambda(t)$ and $H(t)$:

$$\lambda(t) = \frac{f(t)}{R(t)} = \begin{cases} 0/1.0 = 0.0 & t \le 0 \\ \dfrac{1/a}{1 - t/a} = \dfrac{1}{a - t} & 0 < t < a \\ \text{undefined, since } R(t) = 0 & t \ge a \end{cases}$$

$$H(t) = \int_0^t \lambda(t')\,dt' = \begin{cases} 0.0 & t \le 0 \\ \int_0^t \dfrac{1}{a - t'}\,dt' = \ln a - \ln(a - t) & 0 < t < a \\ \infty & t \ge a \end{cases}$$

$\mu$ and $\sigma^2$ are found using

$$\mu = \int_0^\infty t f(t)\,dt = \int_0^a \frac{t}{a}\,dt = \left.\frac{t^2}{2a}\right|_0^a = \frac{a}{2}$$

or

$$\mu = \int_0^\infty R(t)\,dt = \int_0^a \left(1 - \frac{t}{a}\right) dt = \left.\left(t - \frac{t^2}{2a}\right)\right|_0^a = a - \frac{a}{2} = \frac{a}{2}$$

$$\sigma^2 = \int_0^\infty t^2 f(t)\,dt - \mu^2 = \int_0^a \frac{t^2}{a}\,dt - \left(\frac{a}{2}\right)^2 = \left.\frac{t^3}{3a}\right|_0^a - \frac{a^2}{4} = \frac{a^3}{3a} - \frac{a^2}{4} = \frac{a^2}{12}$$

The median, $t_M$, is found using $F(t_M) = 0.5 = (t_M/a) \Rightarrow t_M = 0.5a$, which is the midpoint of the interval $(0, a)$. The reliability functions for this uniform distribution are charted in Figure 2-3, for $a = 1.0$.

FIGURE 2-3 Graphical illustration of $f(t)$, $F(t)$, $R(t)$, and $H(t)$ ($a = 1.0$ in Example 2-1).

Example 2-2 The failure rate on a new steering pump design is estimated to be $\lambda(t) = 1.6 \times 10^{-9}\, t$, where $t$ is in km driven. Fifty vehicles in a test fleet are each driven for 20,000 km. What is the expected number of steering pump failures over this time period?

Solution: The expected number of failures is

$$50 \times F(20{,}000)$$

To find $F(20{,}000)$, we refer to Table 2-2 and look up the expression $F(t) = 1 - \exp\left[-\int_0^t \lambda(t')\,dt'\right]$, to convert from an instantaneous failure rate to a cumulative failure distribution. In our case

$$F(20{,}000) = 1 - \exp\left[-\int_0^{20{,}000} 1.6 \times 10^{-9}\, t\,dt\right] = 1 - \exp\left[-8.0 \times 10^{-10}\, t^2 \Big|_0^{20{,}000}\right] = 0.274$$

So, $50 \times 0.274 \approx 14$ failures. This calculation is based on an assumption that no vehicles are removed from the study if they are involved in an accident or fail due to another failure mode. It also requires the assumption that no vehicle can experience more than one pump failure during the study period.

Example 2-3 The reliability of a component is given by $R(t) = \exp(-0.03t - 0.010t^2)$, where $t$ is in years of usage. What is $\lambda(t)$? From Table 2-2,

$$\lambda(t) = -\frac{d \ln R(t)}{dt}$$

$$\ln R(t) = -0.03t - 0.010t^2 \Rightarrow \lambda(t) = -\frac{d \ln R(t)}{dt} = 0.03 + 0.02t$$
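The conversions used in Examples 2-2 and 2-3 can also be checked numerically. The sketch below is one possible check, not part of the original solution: it integrates the Example 2-2 failure rate with a trapezoidal rule to obtain $H(t)$ and $F(t)$, and differentiates $\ln R(t)$ from Example 2-3 by central differences. The helper names, the step count, and the difference step `h` are assumptions made for illustration.

```python
import math

def pump_hazard(t):
    """Failure rate from Example 2-2 (per km), lambda(t) = 1.6e-9 * t."""
    return 1.6e-9 * t

def cumulative_hazard(rate, t, steps=10_000):
    """Trapezoidal approximation of H(t), the integral of rate from 0 to t."""
    dt = t / steps
    total = 0.5 * (rate(0.0) + rate(t))
    for k in range(1, steps):
        total += rate(k * dt)
    return total * dt

# Example 2-2: F(t) = 1 - exp(-H(t)), then expected failures in a fleet of 50.
H_20k = cumulative_hazard(pump_hazard, 20_000.0)
F_20k = 1.0 - math.exp(-H_20k)
expected_failures = 50 * F_20k

def log_reliability(t):
    """ln R(t) from Example 2-3, with t in years."""
    return -0.03 * t - 0.010 * t ** 2

def hazard_from_log_r(t, h=1e-6):
    """lambda(t) = -d ln R(t)/dt, approximated by a central difference."""
    return -(log_reliability(t + h) - log_reliability(t - h)) / (2.0 * h)

print(round(F_20k, 3), round(expected_failures, 1))
print(round(hazard_from_log_r(2.0), 4))  # compare with 0.03 + 0.02 * 2
```

Because the failure rate in Example 2-2 is linear in $t$, the trapezoidal rule reproduces $H(20{,}000) = 0.32$ essentially exactly; for a nonlinear rate the step count would control the accuracy.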
2.2 EMPIRICAL ESTIMATES OF F(t) AND OTHER RELIABILITY METRICS: USE OF ORDER STATISTICS
Given a set of ordered observations (i.e., failure times) with $t_1 \le t_2 \le \cdots \le t_{i-1} \le t_i \le t_{i+1} \le \cdots \le t_{n-1} \le t_n$, our challenge is to identify a suitable distribution model and then to estimate the parameters of the distribution. As a precursor to this activity, we introduce a distribution-free model for estimating the distribution function, $F(t)$. We refer to these models as one of the following:

• Empirical
• Nonparametric
• Distribution-free

These models reference only the order of the observation, not the actual value of the observation. Accordingly, we refer to these estimates as order statistics, and we refer to these empirical estimates of $F(t)$, denoted by $\hat{F}(t)$, as rank estimators. The rank estimators are used to generate probability plots of the data, which, in turn, can be used to assess the fit of the data to a prospective distribution. This methodology is discussed in greater detail in Appendix 3B of Chapter 3. Here we describe some very simple models for developing empirical estimates of $F(t)$ or any of the other common reliability metrics discussed in the previous section.

2.2.1 Naive Rank Estimator

Given a set of ordered failure times, $t_1 \le t_2 \le \cdots \le t_i \le t_{i+1} \le \cdots \le t_{n-1} \le t_n$, whose distribution is unknown, our task is to generate distribution-free (nonparametric) estimates of reliability and any of the associated metrics discussed in §2.1. Intuitively, a simple, unbiased empirical distribution of the ordered times would be to assign a probability mass of $1/n$ to each of the $n$ sample failure times (see Table 2-3 and Figure 2-4). To provide a complete description of $\hat{F}(t)$, we first need to define the 0th order statistic. That is, we artificially set $\hat{F}(t_0) \equiv 0$. An expression for the naive rank estimator, $\hat{F}(t)$, is given by

$$\hat{F}(t) = \frac{i}{n} \quad \text{for } t_i \le t < t_{i+1}, \quad i = 0, 1, 2, \ldots, n \tag{2.16}$$

TABLE 2-3 Naive Empirical Distribution on $t_1 \le t_2 \le \cdots \le t_{n-1} \le t_n$

| $i$ | $t_i$ | $p(t_i)$ |
|---|---|---|
| 1 | $t_1$ | $1/n$ |
| 2 | $t_2$ | $1/n$ |
| 3 | $t_3$ | $1/n$ |
| ... | ... | ... |
| $n - 1$ | $t_{n-1}$ | $1/n$ |
| $n$ | $t_n$ | $1/n$ |

2.2.2 Mean and Median Rank Estimators
The naive estimator of Eq. (2.16) is observed to have a significant deficiency in that, for $t \ge t_n$, $\hat{F}(t) = 1.0$. This is an undesirable property, particularly with small sample sizes, as the likelihood of a failure occurring for $t > t_n$ is significant. To improve upon this estimator, Leonard Johnson (1951, 1964) proposed an improvement based on a distribution-free model on the order statistics, $t_i$, $i = 1, 2, \ldots, n$. This distribution is described in Appendix 2A.2, wherein it is shown that the distribution of the order statistic, $W_i = \hat{F}(t_i)$, follows a beta distribution with the expected value given by $E[\hat{F}(t_i)] = i/(n + 1)$. Thus, the mean rank estimator of $F(t_i)$ (sometimes referred to as the Herd–Johnson estimator) is given by

$$\hat{F}(t_i) = \frac{i}{n + 1} \quad \text{for } i = 0, 1, 2, \ldots, n \tag{2.17}$$

FIGURE 2-4 Empirical probability distribution.

Due to the skewness of the beta distribution, Johnson (1951, 1964) proposed the use of the median rank estimator, which is more representative of the location of the distribution. It is the median of the underlying beta distribution model described in the appendix. At that time, the calculation of the exact values of the median ranks required enormous computing resources. Accordingly, several good approximations of the median rank have been suggested. Benard's (1953) approximation is given by

$$\hat{F}(t_i) = \frac{i - 0.3}{n + 0.4} \quad \text{for } i = 0, 1, 2, \ldots, n \tag{2.18}$$

Several textbooks and computer packages, including Minitab®, make use of the following rank estimator proposed by Blom (1958) for plotting normal data:

$$\hat{F}(t_i) = \frac{i - 3/8}{n + 1/4} \quad \text{for } i = 0, 1, 2, \ldots, n \tag{2.19}$$

Gerson (1975) reports that there is significantly less bias in the estimate of the normal parameter, $\sigma$, with the use of Eq. (2.19). In a limited simulation study on the Weibull distribution (the extreme-value distribution formed from logged failure times; see Chapter 3), Kimball (1960) reports that there are less bias and a lower mean square error of the estimate of the Weibull shape parameter, $\beta$, with the use of Eq. (2.19). An exact expression for the median rank involving the Fisher F-distribution is derived in Appendix 2A.2 and presented here:

$$\hat{F}(t_i) = \frac{i}{i + (n + 1 - i)\, F_{2(n+1-i),\,2i,\,0.50}} \tag{2.20}$$

Unfortunately, most Fisher F-tables do not include tabulated values of its 50th percentile. However, it can be readily obtained within Microsoft® Excel. The use of Excel for evaluating Eq. (2.20) is illustrated in Figure 2-5. Tables of exact median rank values are widely available. A summary of the rank estimators that have been discussed appears in Table 2-4. For convenience, a table of median ranks is presented in Table 2-5. However, with the widespread availability of spreadsheet programs on personal or networked computers, median rank estimates can be readily calculated based on the use of Eq. (2.20).
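As an alternative to a spreadsheet, the rank estimators can be compared with a few lines of code. The sketch below does not evaluate the F-distribution form of Eq. (2.20) directly. Instead it assumes, as stated above, that the exact median rank is the median of a beta distribution with parameters $i$ and $n + 1 - i$, and it uses the standard identity that, for integer parameters, this beta cdf equals an upper binomial tail probability. The median is then located by bisection. All function names are illustrative.

```python
from math import comb

def order_stat_cdf(x, i, n):
    """P(W_i <= x) for W_i ~ Beta(i, n + 1 - i), written as P(Binomial(n, x) >= i)."""
    return sum(comb(n, k) * x ** k * (1.0 - x) ** (n - k) for k in range(i, n + 1))

def exact_median_rank(i, n, tol=1e-12):
    """Median of Beta(i, n + 1 - i), found by bisection on the cdf."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if order_stat_cdf(mid, i, n) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def mean_rank(i, n):
    return i / (n + 1)            # Eq. (2.17)

def benard_rank(i, n):
    return (i - 0.3) / (n + 0.4)  # Eq. (2.18)

def blom_rank(i, n):
    return (i - 0.375) / (n + 0.25)  # Eq. (2.19)

n = 10
for i in range(1, n + 1):
    print(i, round(mean_rank(i, n), 4), round(exact_median_rank(i, n), 4),
          round(benard_rank(i, n), 4), round(blom_rank(i, n), 4))
```

Bisection is used here only to keep the sketch free of third-party dependencies; a statistics library with a beta or F quantile function would serve the same purpose.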
FIGURE 2-5 Use of Excel spreadsheet formula to calculate the median rank of the $i = 5$th order statistic in a complete sample of size $n = 10$.

2.2.3 Use of Rank Estimators of F(t) as a Plotting Position in a Probability Plot
The risks and tradeoffs in the choice of a rank estimator are illustrated in Figure 2-6. Both the mean and the median rank estimators are used to construct a Weibull probability plot of a sample data set. It is seen that the median fit and the mean fit differ by a clockwise rotation of the median fit line by a degree or two. Although the parameter estimates obtained with each of these fits will differ, their difference is likely to be less than the underlying differences caused by sampling variation. The use of probability plots is discussed in greater detail in Chapter 3.
TABLE 2-4 Popular Rank Estimators of $F(t)$

| Plotting convention | Formula |
|---|---|
| Uniform "naive" estimator | $i/n$ |
| Mean rank estimator | $i/(n + 1)$ (Herd–Johnson) |
| Median rank estimator, a. | $\dfrac{i}{i + (n + 1 - i)\, F_{2(n+1-i),\,2i,\,0.5}}$ (exact expression) |
| Median rank estimator, b. | $\dfrac{i - 0.3}{n + 0.4}$ (Benard's 1953 approximation) |
| Median rank estimator, c. | $\dfrac{i - 3/8}{n + 1/4}$ (Blom's 1958 approximation) |
TABLE 2-5 Table of Median Ranks Based on the Use of Microsoft® Excel to Evaluate Eq. (2.20)

FIGURE 2-6 Instance of the use of both the mean and median rank plotting method on Weibull probability paper.

2.2.4 Beta-Binomial and Kaplan–Meier Confidence Bands on Median Rank Estimator
In Appendix 2A.2 we give details of a model developed by Leonard Johnson (1951, 1964) of the rank estimator of $F(t_i)$, denoted $W_i$, $i = 1, 2, \ldots, n$, to show that it theoretically should follow a beta distribution with parameters $i$ and $(n + 1 - i)$. Based on the properties of the beta distribution and its relationship with the F-distribution, a $(1 - \alpha)$ confidence limit on the rank estimator, $W_i$, is given by

$$P(W_{i,1-\alpha/2} \le W_i \le W_{i,\alpha/2}) \ge 1 - \alpha$$

with lower confidence limit

$$W_{i,1-\alpha/2} = \frac{i}{i + (n + 1 - i)\, F_{2(n+1-i),\,2i,\,\alpha/2}}$$

and upper confidence limit

$$W_{i,\alpha/2} = \frac{i}{i + (n + 1 - i)\, F_{2(n+1-i),\,2i,\,1-\alpha/2}} \tag{2.21}$$
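The limits of Eq. (2.21) can also be sketched without F-tables. The code below assumes only what is stated above, namely that $W_i$ follows a beta distribution with parameters $i$ and $n + 1 - i$, and it takes the lower and upper limits to be the $\alpha/2$ and $1 - \alpha/2$ quantiles of that distribution. The quantiles are found by bisection on the binomial-tail form of the beta cdf. Function names are hypothetical, and this is an illustrative check rather than the spreadsheet procedure of Figure 2-7.

```python
from math import comb

def rank_cdf(x, i, n):
    """cdf of Beta(i, n + 1 - i) at x, via the equivalent binomial tail sum."""
    return sum(comb(n, k) * x ** k * (1.0 - x) ** (n - k) for k in range(i, n + 1))

def rank_quantile(p, i, n, tol=1e-12):
    """Value w such that P(W_i <= w) = p, located by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if rank_cdf(mid, i, n) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def beta_binomial_limits(i, n, alpha=0.05):
    """Two-sided (1 - alpha) limits on the rank of the i-th of n ordered failures."""
    lower = rank_quantile(alpha / 2.0, i, n)
    upper = rank_quantile(1.0 - alpha / 2.0, i, n)
    return lower, upper

lower, upper = beta_binomial_limits(5, 10, alpha=0.05)
median = rank_quantile(0.5, 5, 10)
print(round(lower, 4), round(median, 4), round(upper, 4))
```

The same `rank_quantile` call with `p = 0.5` returns the exact median rank, so the band and the plotting position come from one distributional model.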
The limits referenced by Eq. (2.21) are often referred to as beta-binomial or binomial limits due to the relationship that also exists between the binomial and beta distributions. As pointed out by Wasserman (1999, 2000a), the reader should be aware of the fact that popular expressions for binomial confidence limits, as introduced in Chapter 4, differ somewhat from the beta limits of Eq. (2.21). The lower limits of both agree, but the upper limits disagree as to the number of degrees of freedom used. On the binomial upper confidence limit, an F-value will be used having $2(n - i)$ numerator degrees of freedom and $2(i + 1)$ denominator degrees of freedom. The use of Excel for the evaluation of Eq. (2.21) is illustrated in Figure 2-7.

In Appendix 2A.3 we present an expression for variance of the Kaplan–Meier rank estimator, $\hat{F}(t_i)$, which in turn can be used to construct standard normal confidence intervals on the rank estimator. The variance (see Lawless, 1982) is given by
Var½F^ ðti Þ ¼ F ðti Þ ^2
i P j¼1
! dj ðnj iÞðnj þ 1 iÞ
ð2:22Þ
where

$$\delta_j = \begin{cases} 1 & \text{if the } j\text{th event is an observed failure} \\ 0 & \text{if the } j\text{th event is a censored reading} \end{cases}$$

(Note: Censoring is discussed in §2.3.) Based on Eq. (2.22), the asymptotically correct, standard normal limits will be of the form

$$P\Bigl[\underbrace{\hat{F}(t_i) - Z_{\alpha/2}\,\sigma_{\hat{F}(t_i)}}_{\mathrm{LCL}_{\hat{F}(t_i)}} \le F(t_i) \le \underbrace{\hat{F}(t_i) + Z_{\alpha/2}\,\sigma_{\hat{F}(t_i)}}_{\mathrm{UCL}_{\hat{F}(t_i)}}\Bigr] \approx 1 - \alpha \tag{2.23}$$

FIGURE 2-7 Use of Excel spreadsheet formula to calculate median rank and 95% confidence limit for the $i = 5$th order statistic in a complete sample of size $n = 10$.
Either of the two nonparametric procedures, namely the beta-binomial or the Kaplan–Meier, can be used to form confidence bands on probability plots. In Figure 2-8 we illustrate the use of distribution-free (nonparametric) 95% confidence bands on a Weibull plot. That is, these confidence bounds would be the same no matter what probability scale the life data was plotted on. Many popular software packages for reliability, such as WinSmith™, provide the option of generating parametric confidence limits on probability plots. The use of such techniques is presented in Chapter 3. The underlying theory of their use is discussed in Chapter 7.

2.2.5 Empirical Estimates of Other Reliability Metrics: R(t), λ(t), f(t), and H(t)
We now illustrate how the rank estimators of $F(t)$ can be used to derive empirical estimates of the metrics $R(t)$, $\lambda(t)$, $f(t)$, and $H(t)$.

Let's start with $R(t)$ first. An expression for $\hat{R}(t_i)$ is readily obtained as $1 - \hat{F}(t_i)$. Based on the mean rank estimator (see Table 2-4),

$$\hat{R}(t_i) = \frac{n + 1 - i}{n + 1} \quad \text{for } t_i \le t < t_{i+1}, \quad i = 0, 1, 2, \ldots, n \tag{2.24}$$

FIGURE 2-8 Nonparametric confidence bands on a Weibull plot (Minitab® V12.1).

The density function is the instantaneous derivative of the cumulative failure distribution [see Eq. (2.4)]. Using the notation $\Delta t_i = (t_{i+1} - t_i)$, we approximate this derivative over the interval $[t_i, t_{i+1})$ as follows, based on a use of the mean rank estimator:

$$\hat{f}(t) = \frac{\hat{F}(t_{i+1}) - \hat{F}(t_i)}{\Delta t_i} = \frac{1}{\Delta t_i\,(n + 1)} \quad \text{for } t_i \le t < t_{i+1} \tag{2.25}$$

The definition of $\lambda(t)$ given by Eq. (2.5) will now be used to construct an empirical estimate of the hazard function, $\lambda(t)$, when a mean rank estimator is used:

$$\hat{\lambda}(t) = \frac{\hat{f}(t)}{\hat{R}(t)} = \frac{1}{\Delta t_i\,(n + 1)} \Big/ \frac{n + 1 - i}{n + 1} = \frac{1}{\Delta t_i\,(n + 1 - i)} \quad \text{for } t_i \le t < t_{i+1} \tag{2.26}$$

To estimate $H(t)$, we make use of Eq. (2.10):

$$\hat{H}(t) = -\ln \hat{R}(t) \tag{2.27}$$

Example 2-4: Use of empirical reliability estimates on complete data sets

Recorded failures for a sample of size $n = 9$ occur at $t$ = 60, 150, 299, 550, 980, 1270, 1680, 2100, and 2400 Kcycles. Develop empirical estimates of the reliability metrics $F(t)$, $R(t)$, $f(t)$, $\lambda(t)$, and $H(t)$ (see Table 2-6).
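The worksheet calculations for this example follow directly from Eqs. (2.24) through (2.27) with the mean rank estimator. A minimal sketch is given below; the variable names and the tuple layout of each row are choices made for illustration, while the failure times are those stated in the example.

```python
import math

# Failure times from Example 2-4, in Kcycles (complete sample, n = 9).
failure_times = [60, 150, 299, 550, 980, 1270, 1680, 2100, 2400]
n = len(failure_times)
t = [0] + failure_times  # t_0 = 0 is the artificial 0th order statistic

rows = []
for i in range(n + 1):
    F_hat = i / (n + 1)            # mean rank estimator, Eq. (2.17)
    R_hat = (n + 1 - i) / (n + 1)  # Eq. (2.24)
    H_hat = -math.log(R_hat)       # Eq. (2.27)
    if i < n:
        dt = t[i + 1] - t[i]
        f_hat = 1.0 / (dt * (n + 1))        # Eq. (2.25)
        lam_hat = 1.0 / (dt * (n + 1 - i))  # Eq. (2.26)
    else:
        # No interval follows the last failure, so f and lambda are left blank.
        f_hat = lam_hat = None
    rows.append((i, t[i], F_hat, R_hat, f_hat, lam_hat, H_hat))

for row in rows:
    print(row)
```

Each tuple holds $i$, $t_i$, $\hat{F}$, $\hat{R}$, $\hat{f}$, $\hat{\lambda}$, and $\hat{H}$ in that order.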
TABLE 2-6 Worksheet for Deriving Empirical Estimates of Reliability Metrics, Including Both Estimators of $H(t)$

| $i$ | $t_i$ | $t_{i+1}$ | $\hat{F}(t_i) = i/10$ | $\hat{R}(t_i) = (10 - i)/10$ | $\hat{f}(t) = 0.1/\Delta t$ | $\hat{\lambda}(t) = 1/[(10 - i)\,\Delta t]$ | $\hat{H}(t)$ |
|---|---|---|---|---|---|---|---|
| 0 | 0 | 60 | 0 | 1 | 0.001667 | 0.001667 | 0 |
| 1 | 60 | 150 | 0.1 | 0.9 | 0.001111 | 0.001235 | 0.105 |
| 2 | 150 | 299 | 0.2 | 0.8 | 0.000671 | 0.000839 | 0.223 |
| 3 | 299 | 550 | 0.3 | 0.7 | 0.000398 | 0.000569 | 0.357 |
| 4 | 550 | 980 | 0.4 | 0.6 | 0.000233 | 0.000388 | 0.511 |
| 5 | 980 | 1270 | 0.5 | 0.5 | 0.000345 | 0.000690 | 0.693 |
| 6 | 1270 | 1680 | 0.6 | 0.4 | 0.000244 | 0.000610 | 0.916 |
| 7 | 1680 | 2100 | 0.7 | 0.3 | 0.000238 | 0.000794 | 1.204 |
| 8 | 2100 | 2400 | 0.8 | 0.2 | 0.000333 | 0.001667 | 1.609 |
| 9 | 2400 | – | 0.9 | 0.1 | – | – | 2.303 |

2.2.6 Working with Grouped Data

We assume that the failure and censor times have been grouped into $k + 1$ intervals of the form $[t_i, t_{i+1})$ for $i = 0, 1, 2, \ldots, k$, with $t_0 \equiv 0$ and $t_{k+1} \equiv \infty$. The width of the intervals can be variable. Most practitioners would refer to such data as histogram data. To ensure clarity in notation and expression, define the following:

$n_i$ = no. of survivors (e.g., operational units) at $t = t_i$, with $n_0 = n$.
$\Delta n_i = n_i - n_{i+1}$.
$\Delta t_i = t_{i+1} - t_i$.
$\bar{t}_i = (t_{i+1} + t_i)/2$.
Estimators of Grouped Reliability Metrics

For grouped data, a well-accepted empirical estimate of $R(t)$ is given by

$$\hat{R}(t_i) = \frac{n_i}{n} \quad \text{for } t_i \le t < t_{i+1} \tag{2.28}$$

Based on Eq. (2.28), expressions for other reliability metrics can be readily obtained:

$$\hat{F}(t_i) = 1 - \hat{R}(t_i) = \frac{n - n_i}{n} \quad \text{for } t_i \le t < t_{i+1} \tag{2.29}$$

$$\hat{f}(t) = \frac{\hat{R}(t_i) - \hat{R}(t_{i+1})}{\Delta t_i} = \frac{\Delta n_i}{n\,\Delta t_i} \quad \text{for } t_i \le t < t_{i+1} \tag{2.30}$$

$$\hat{\lambda}(t) = \frac{\hat{f}(t)}{\hat{R}(t)} = \frac{\Delta n_i}{n_i\,\Delta t_i} \quad \text{for } t_i \le t < t_{i+1} \tag{2.31}$$

To estimate the sample moments, $\bar{t}$ (also known as the mean time-to-failure) and $s^2$, we use the formulas

$$\bar{t} = \sum_{i=0}^{k} \frac{\Delta n_i\, \bar{t}_i}{n} \qquad s^2 = \sum_{i=0}^{k} \frac{\Delta n_i\,(\bar{t}_i - \bar{t})^2}{n - 1} \tag{2.32}$$

Example 2-5: Complete, grouped data set

$n = 200$ prototype units are put on test. At the end of every shift (8 hr), the number of failed items is tallied. The data set is presented in Table 2-7 along with empirical estimates of $R(t)$, $f(t)$, and $\lambda(t)$.
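The grouped estimators of Eqs. (2.28), (2.30), and (2.31) can be applied to this data set in a short loop. The sketch below uses the per-shift failure counts reported in Table 2-7; the variable names and the tuple layout are illustrative choices.

```python
# Example 2-5: n = 200 units, failures tallied at the end of every 8-hr shift.
n = 200
dt = 8.0
failures_per_shift = [3, 2, 6, 5, 9, 15, 19, 20, 29]  # Delta n_i from Table 2-7

survivors = n  # n_0 = n
table = []
for i, dn in enumerate(failures_per_shift):
    R_hat = survivors / n            # Eq. (2.28)
    f_hat = dn / (n * dt)            # Eq. (2.30)
    lam_hat = dn / (survivors * dt)  # Eq. (2.31)
    table.append((i * dt, survivors, dn, R_hat, f_hat, lam_hat))
    survivors -= dn                  # n_{i+1} = n_i - Delta n_i

for start, n_i, dn, R_hat, f_hat, lam_hat in table:
    # f and lambda are scaled by 1000 to match the table's column units.
    print(int(start), n_i, dn, round(R_hat, 3),
          round(1000 * f_hat, 3), round(1000 * lam_hat, 3))
```

Note that $\hat{f}$ divides by the original sample size $n$, while $\hat{\lambda}$ divides by the number of units still at risk, which is why the two columns diverge as failures accumulate.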
TABLE 2-7 Test Data: Prototype Replaceable Unit

| Interval $i$ | $t_i$ (beginning of interval $i$), in hr | $t_{i+1}$ (end of interval $i$), in hr | $n_i$ (beginning of interval) | $\Delta n_i$ (no. failed in interval) | $\hat{R}(t_i) = n_i/200$ | $\hat{f}(t) \times 1000 = \Delta n_i \times 1000/(8 \times 200)$ | $\hat{\lambda}(t) \times 1000 = \Delta n_i \times 1000/(8 \times n_i)$ |
|---|---|---|---|---|---|---|---|
| 0 | 0 | 8 | 200 | 3 | 1.000 | 1.875 | 1.875 |
| 1 | 8 | 16 | 197 | 2 | 0.985 | 1.250 | 1.269 |
| 2 | 16 | 24 | 195 | 6 | 0.975 | 3.750 | 3.846 |
| 3 | 24 | 32 | 189 | 5 | 0.945 | 3.125 | 3.307 |
| 4 | 32 | 40 | 184 | 9 | 0.920 | 5.625 | 6.114 |
| 5 | 40 | 48 | 175 | 15 | 0.875 | 9.375 | 10.714 |
| 6 | 48 | 56 | 160 | 19 | 0.800 | 11.875 | 14.844 |
| 7 | 56 | 64 | 141 | 20 | 0.705 | 12.500 | 17.730 |
| 8 | 64 | 72 | 121 | 29 | 0.605 | 18.125 | 29.959 |

2.3 WORKING WITH CENSORED DATA

Life data is said to be censored when exact information on time-to-failure for a particular test specimen is not available. Reasons for censoring include*

1. A special need to terminate a life test before all specimens have failed.
2. A failure of a different nature has occurred from the one being investigated.
3. A specimen is removed or suspended from test for reasons beyond the control of the investigator, such as a problem with a test stand.

*In the biostatistics field, reasons for having censored data points are well understood by my students. Consider a longitudinal study conducted by a drug company to study long-term efficacy of a new drug. Over a period of years, subjects are likely to leave the study for a variety of reasons, such as moving away, dying, etc.

2.3.1 Categorizing Censored Data Sets
We distinguish between left-censored, right-censored, and interval-censored data. We also distinguish between singly and multiply censored data sets, and between time and failure censored data sets.

Left, Right, and Interval Censoring

An observation is said to be right-censored if it is removed from test prior to failure. In some sense, we cannot fully estimate the right tail of the failure distribution because of this; hence the term "right-censored" (see Figure 2-9). An observation is said to be left-censored if an item has failed, but its exact time on test is not known. This can happen if we do not have exact information on a test item's failure time or time placed on test. In this case, we cannot fully estimate the left tail of the failure distribution (see Figure 2-10). An observation is said to be interval-censored if we know that an item has failed in some time interval, but we do not know its exact time. Histogram or grouped data is, by its definition, interval-censored. In such cases, the data might be regarded as being both right- and left-censored.

FIGURE 2-9 Right-censored data. Some items on test are still operational when their testing is halted.

FIGURE 2-10 Left-censored data. Failure is known only to have occurred before a certain time.
Time and Failure Censoring

In our analysis we pay special attention to right-censored data. A right-censored observation is said to be time-censored, or type I censored, if it is removed from the test at a predetermined time of day. Data is failure-censored, or type II censored, if it is removed once a predetermined number of failures have occurred. Data is said to be singly censored if all censored readings share a common censoring time. Data is said to be multiply censored if test or operating times differ among the censored items. For example, suppose that records are being kept on a fleet of trucks to determine the time-to-failure of transmissions. Trucks retired prior to the observance of a transmission failure (e.g., truck destroyed in a vehicle accident, retired early due to other wear processes, or the test has ended) are multiply censored. In this case data might also be referred to as random right-censored data. Test data might be either multiply failure- or multiply time-censored depending on the type of test run and the criteria used to terminate the test (see Nelson, 1982, 1990). Lawless (1982, p. 35) discusses a common experimental framework that leads to the evolution of multiply time-censored data sets. In this case items are put on test at random or staggered times, and censoring times differ when the test is stopped at some predetermined time of day.

2.3.2 Special Staged Censored Data Sets
Sometimes experiments are run in stages, and special stopping rules are used to decide when to stop testing. Two very common forms are discussed here, both of which are special forms of type II, failure-censored testing:

1. Progressive censoring: $n$ items are put on test and run until $r_1$ failures are recorded. At this point, $n_1$ items are removed from the test, leaving $n - n_1 - r_1$ items on test. The test then continues until $r_2$ items fail, and so on.
2. Sudden-death testing: $n$ items are put on test and run until $r$ units fail. At this point, $n$ new items are put on test, and the test is run until $r$ more recorded failures are obtained, and so on. This allows for a closer study of the left tail of the failure distribution; that is, $\hat{F}(t) = r/n$.

2.3.3 Nonparametric Estimation of Reliability Metrics Based on Censored Data
The development of mean and/or median rank estimators for censored data sets requires specialized procedures for adjusting the ranks. Abernethy (1996) reports the development of a simple formula for adjusting the ranks. It is based on the use of Leonard Johnson's (1964) formula for adjusting the rank increment after each consecutive run of censored observations:

$$\text{Adjusted rank} = \frac{(\text{reverse rank}) \times (\text{previous adj. rank}) + (n + 1)}{1 + \text{reverse rank}} \tag{2.33}$$

The reverse ranks for the ordered set of observations are just $n$, $n - 1$, $n - 2$, etc. Once the adjusted ranks are calculated, then either the mean or median rank estimators may be used to generate $\hat{F}(t)$. We demonstrate the use of this formula by example in Table 2-8, for the data set of Example 2-5. For this example, we have $n = 10$ items on test. Their reverse ranks are 10, 9, 8, 7, ..., 2, 1. The procedure is initialized using an adjusted rank at $t = 0$ of 0. This is particularly important to note when the first observation is an early suspension. Note that the adjusted rank formula is applied only on times when an observed failure occurs. A plot of the empirical failure function based on the use of a mean rank appears in Figure 2-11.*

*For the case of an early suspension at $t = t_1$, we use Eq. (2.33) with the previous adjusted rank set to 0.
TABLE 2-8 Use of Adjusted Rank Method to Derive Mean or Median Rank Estimates

| Rank | Cycles | Status | Rev. rank | Adjusted rank | Mean rank | Median rank |
|---|---|---|---|---|---|---|
| 1 | 544 | Failure | 10 | 1 | 9.1% | 6.7% |
| 2 | 663 | Failure | 9 | 2 | 18.2% | 16.2% |
| 3 | 802 | Suspension | 8 | – | – | – |
| 4 | 827 | Suspension | 7 | – | – | – |
| 5 | 897 | Failure | 6 | $(6 \times 2 + 10 + 1)/(6 + 1) = 3.286$ | 29.9% | 28.4% |
| 6 | 914 | Failure | 5 | $(5 \times 3.29 + 10 + 1)/(5 + 1) = 4.575$ | 41.6% | 41.1% |
| 7 | 939 | Suspension | 4 | – | – | – |
| 8 | 1084 | Failure | 3 | $(3 \times 4.575 + 10 + 1)/(3 + 1) = 6.181$ | 56.2% | 56.7% |
| 9 | 1099 | Failure | 2 | $(2 \times 6.181 + 10 + 1)/(2 + 1) = 7.787$ | 70.8% | 72.2% |
| 10 | 1202 | Suspension | 1 | – | – | – |
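The adjusted rank recursion of Eq. (2.33) is easy to automate. The sketch below applies it to the Table 2-8 observations and then forms the mean rank, $(\text{adjusted rank})/(n + 1)$. The `"F"`/`"S"` status codes and variable names are illustrative conventions. Because the code carries full precision between steps, its adjusted ranks differ from the tabulated ones in the third decimal place (for example, about 4.571 rather than 4.575, since the table substitutes the rounded value 3.29).

```python
# (cycles, status) pairs from Table 2-8; "F" = failure, "S" = suspension.
observations = [
    (544, "F"), (663, "F"), (802, "S"), (827, "S"), (897, "F"),
    (914, "F"), (939, "S"), (1084, "F"), (1099, "F"), (1202, "S"),
]
n = len(observations)

previous_adjusted = 0.0  # the procedure is initialized with an adjusted rank of 0
results = []
for position, (cycles, status) in enumerate(observations):
    reverse_rank = n - position  # n, n - 1, ..., 1
    if status != "F":
        continue  # Eq. (2.33) is applied only at observed failures
    adjusted = (reverse_rank * previous_adjusted + (n + 1)) / (1 + reverse_rank)
    mean_rank = adjusted / (n + 1)
    results.append((cycles, adjusted, mean_rank))
    previous_adjusted = adjusted

for cycles, adjusted, mean_rank in results:
    print(cycles, round(adjusted, 3), f"{100 * mean_rank:.1f}%")
```

A median rank column could be added in the same loop by substituting the adjusted rank for $i$ in one of the median rank formulas of Table 2-4.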
Several other widely used procedures exist for generating empirical estimates of $\hat{F}(t_i)$. In particular, Kaplan–Meier and product-limit estimators are popular alternatives. Wasserman and Reddy (1992) show that these procedures are virtually identical in terms of how the ranks of the observed failure times are adjusted to account for censoring. The Kaplan–Meier and product-limit estimators are discussed in Appendix 2A.3.

FIGURE 2-11 Plot of empirical failure function, $\hat{F}(t)$.

2.3.4 Developing Empirical Reliability Estimates of Warranty or Grouped Censored Data
Life tables originated in epidemiological studies, but are now used extensively in warranty modeling. The data is grouped and censored. Let's look at an iterative procedure for calculating $\hat{R}_i$, the empirical estimate of product reliability at time $t_i$, as follows:

$$\hat{R}_{i+1} = \underbrace{\left(1 - \frac{\Delta n_i}{h'_i}\right)}_{\substack{\text{prob. survives to time } t_{i+1} \\ \text{given it has survived to } t_i}} \times \underbrace{\hat{R}_i}_{\substack{\text{prob. unit} \\ \text{survives to } t_i}} \tag{2.34}$$

where we define the additional notation:*

$c_i$ = No. of removals (censored occurrences) in the $i$th interval.
$h_i$ = No. of test items at risk at the beginning of the $i$th interval, with $h_i = h_{i-1} - \Delta n_{i-1} - c_{i-1}$.
$h'_i$ = Adjusted no. of items at risk at the beginning of the $i$th interval, assuming that the censoring times occur uniformly over the interval, with $h'_i = h_i - c_i/2$.
$\Delta n_i / h'_i$ = Conditional probability of a failure occurring in the $i$th interval given survival to the beginning of the $i$th interval.
$1 - (\Delta n_i / h'_i)$ = Conditional probability of surviving the $i$th interval given survival to time $t_i$.

*Earlier we define $\Delta n_i = n_i - n_{i+1}$ as the number of failures that occur in the $i$th interval.

Example 2-6: Construction of life table estimates of the empirical survival function

An auto supplier collected data on the life of its strut-suspension products from a fleet of 800 vehicles. Some of the vehicles were removed from the study for various reasons such as damage due to vehicle collisions or the failure of other subsystems that indirectly or directly affected strut-suspension performance. The data is presented in Table 2-9. Note the use of the notation $p_i = 1 - (\Delta n_i / h'_i)$, which in combination with Eq. (2.34) is used to evaluate empirical reliability according to the recursive relationship $\hat{R}_{i+1} = p_i \times \hat{R}_i$.
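TABLE 2-9 Analysis of Warranty Data from an Auto Suspension Component

| Miles (thousands) | $\Delta n_i$ | $c_i$ | $h_i$ | $h'_i$ | $p_i$ | $\hat{R}_{i+1}$ |
|---|---|---|---|---|---|---|
| 0–6 | 3 | 3 | 800 | 798.5 | 0.996 | 0.996 |
| 6–12 | 5 | 2 | 794 | 793.0 | 0.994 | 0.990 |
| 12–18 | 10 | 10 | 787 | 782.0 | 0.987 | 0.977 |
| 18–24 | 30 | 9 | 767 | 762.5 | 0.961 | 0.939 |
| 24–30 | 24 | 12 | 728 | 722.0 | 0.967 | 0.908 |
| 30–36 | 50 | 7 | 692 | 688.5 | 0.927 | 0.842 |
| 36–42 | 45 | 11 | 635 | 629.5 | 0.929 | 0.782 |
| 42–48 | 95 | 15 | 579 | 571.5 | 0.834 | 0.652 |
| 48–54 | 140 | 20 | 469 | 459.0 | 0.695 | 0.453 |
| 54–60 | 175 | 18 | 309 | 300.0 | 0.417 | 0.189 |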
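The life table recursion of Eq. (2.34) can be reproduced in a few lines. The sketch below takes the failure and removal counts from Table 2-9 and rebuilds the $h_i$, $h'_i$, $p_i$, and reliability columns; the variable names are illustrative.

```python
# Failure counts (Delta n_i) and removals (c_i) per 6,000-mile band, from Table 2-9.
failures = [3, 5, 10, 30, 24, 50, 45, 95, 140, 175]
removals = [3, 2, 10, 9, 12, 7, 11, 15, 20, 18]

at_risk = 800      # h_0, the fleet size
reliability = 1.0  # reliability at the start of the first interval
life_table = []
for dn, c in zip(failures, removals):
    adjusted_at_risk = at_risk - c / 2.0  # h'_i = h_i - c_i / 2
    p = 1.0 - dn / adjusted_at_risk       # conditional survival over the interval
    reliability *= p                      # Eq. (2.34)
    life_table.append((at_risk, adjusted_at_risk, p, reliability))
    at_risk -= dn + c                     # h_{i+1} = h_i - Delta n_i - c_i

for h, h_adj, p, r in life_table:
    print(h, h_adj, round(p, 3), round(r, 3))
```

Subtracting half of the removals from the number at risk reflects the stated assumption that censoring times occur uniformly over each interval.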
Warranty Modeling

In the commercial durable-goods sector, field and warranty data modeling presents an unusual set of difficulties. Some of the difficulties include, but are not limited to, the following.

• Data is very dirty. Parts are often replaced when no trouble is found. It is difficult to study one failure mode or class of failure modes, since reports coming back from dealers or retail establishments may be very inexact and vague as to reported failure.
• Time or mileage in service data is extremely inexact. The time a product remains at a retail establishment before it is sold may not always be properly taken into account when estimating product usage.
• Only a very small proportion of field data consists of reported failures. In the auto industry we frequently speak of "repairs/1000" (R/1000) or "conditions per 100" (C/100) over a warranty period of three years or more. The data is highly censored.
• Warranty information is often bivariate. For example, in the auto industry it is very common to see warranties based on 36,000 miles or 3 years, whichever comes first. The adjustment of the warranty data to take into account a portion of the population that has exceeded one of the warranty limits is very inexact. The reader might wish to refer to a conference paper by Betasso (1999), who discusses an approach for adjusting the number of units at risk by an estimated proportion of units that have exceeded the mileage limits of warranty coverage.
• It is difficult to assess the number of items removed from service early. For example, a vehicle may be involved in an accident or a natural disaster that requires it to be removed from service either temporarily or permanently. Consumers may retire a product early due to dissatisfaction or other factors.
• Production is staggered over a production period. The data must be compiled based on both when the product was released and how long the customer has used it. At a minimum, we often speak of a "warranty triangle," as warranty data is frequently arranged into a triangular matrix as shown in Figure 2-12. This data must then be translated into a form illustrated by Table 2-9. WinSmith™ provides a capability for directly importing triangular data.

FIGURE 2-12 Warranty triangle. Cells contain number of field incidents by month produced and months-in-service.
2.4 EXERCISES

1. The time-to-failure in operating hours of a critical solid-state power unit has failure rate $\lambda(t)$, given by $\lambda(t) = 0.005$/hr.
   a. What is the reliability if the power unit must operate continuously for 40 hr?
   b. What is the MTTF?
   c. Find $t_{0.90}$ (the 90th percentile of the survival distribution, i.e., the design life corresponding to 90% reliability).
   d. Given that a unit has survived 40 hr, what is the probability that it will survive 30 more hr? Hint: Use $a = 70$ hr and $b = 40$ hr in the following:

$$P(t \ge a \mid t \ge b) = \frac{P((t \ge a) \cap (t \ge b))}{P(t \ge b)} = \frac{P(t \ge a)}{P(t \ge b)} = \frac{R(a)}{R(b)}$$

2. The probability density function for the time-to-failure in years of a drive train component is given by

$$f(t) = 0.2 - 0.02t \quad \text{for } 0 \le t \le 10 \text{ yr}; \quad 0.0 \text{ otherwise}$$

   a. Show that the hazard rate, $\lambda(t)$, is increasing, which is indicative of continuous wearout over time.
   b. Find the mean and median time-to-failure, $t_{0.50}$, and the $B_{10}$ life.

3. The results for a life test with $n = 10$ items on test appear below. The reading at $t = 500$ hr is a suspended item.

   t: 59, 72, 92, 100, 500+, 16,076, 16,265, 79,434, 116,222, 785,868

   a. This data set is said to be i) complete, ii) singly right-censored, iii) multiply left-censored, iv) interval-censored, v) multiply right-censored.
   b. Develop a median rank estimate of $F(16{,}265)$.
   c. Develop a beta-binomial 90% two-sided confidence interval on $F(16{,}265)$.

4. The probability density of failure, $f(t)$, is given by

$$f(t) = t/50 \quad \text{for } 0 \le t \le 10; \quad 0.0 \text{ otherwise}$$

   a. Find $\lambda(t)$.
   b. Find $R(t)$.
   c. Find the MTTF.

5. Warranty data for an automobile component is shown below:

| Mileage (1000 miles) | # failures ($\Delta n_i$) | # vehicles at beginning of interval ($h_i$) | # vehicles censored within band ($c_i$) |
|---|---|---|---|
| 0–2 | 400 | 200,000 | 0 |
| 2–4 | 200 | 199,600 | 0 |
| 4–6 | 150 | 199,400 | 5 |
| 6–8 | 120 | 199,245 | 15 |
| 8–10 | 110 | 199,110 | 150 |
| 10–12 | 105 | 198,850 | 800 |
| 12–14 | 120 | 197,945 | 2,000 |
| 14–16 | 85 | 195,825 | 3,500 |
| 16–18 | 90 | 192,240 | 6,000 |
| 18–20 | 78 | 186,150 | 8,000 |
| 20–22 | 90 | 178,072 | 10,300 |
| 22–24 | 68 | 167,682 | 12,000 |
| 24–26 | 58 | 155,614 | 13,000 |
| 26–28 | 50 | 142,556 | 13,500 |
| 28–30 | 59 | 129,006 | 13,000 |
| 30–32 | 56 | 115,947 | 12,000 |
| 32–34 | 55 | 103,891 | 11,800 |
| 34–36 | 80 | 92,036 | 11,500 |
| >36 | | 80,456 | 80,456 |
| Totals | | | 198,026 |

   Estimate reliability at end of 12K, 24K, and 36K miles.
APPENDIX 2A

APPENDIX 2A.1 DERIVATION OF ALTERNATIVE FORM, $\mu = \int_0^\infty R(t)\,dt$

By definition, the mean time-to-failure (MTTF, or $\mu$) may be evaluated using

$$\mu = \int_0^\infty t f(t)\,dt = -\int_0^\infty t\,\frac{dR(t)}{dt}\,dt = -\int_0^\infty t\,dR(t) \tag{2A.1}$$

We integrate (2A.1) by parts to obtain

$$\mu = -tR(t)\Big|_0^\infty + \int_0^\infty R(t)\,dt = 0 + \int_0^\infty R(t)\,dt \tag{2A.2}$$

The fact that the first integration term, $tR(t)\big|_0^\infty$, vanishes requires a bit more analysis. Clearly, at $t = 0$, $tR(t)$ vanishes. However, it is not so clear that this term vanishes as $t \to \infty$. To show this, write $tR(t)$ as $t/R^{-1}(t) = t/\exp\left[\int_0^t \lambda(t')\,dt'\right]$ [see Eq. (2.10)]. Both the numerator and denominator tend to $\infty$ as $t \to \infty$. We invoke L'Hospital's rule, taking the first derivative of both the numerator and denominator simultaneously, which results in the expression $1/\{\lambda(t)\exp[\int_0^t \lambda(t')\,dt']\}$. Provided this denominator grows without bound, the term $tR(t)$ vanishes as $t \to \infty$.
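The equivalence of the two forms of the MTTF can also be checked numerically. The sketch below is only an illustration: the Weibull-type life model and its parameter values (`THETA`, `BETA`) are assumptions chosen for the check, and the helper `integrate` is a plain trapezoid rule, not a routine taken from the text.

```python
import math

def integrate(func, lower, upper, steps=100_000):
    """Composite trapezoid rule on [lower, upper]."""
    width = (upper - lower) / steps
    total = 0.5 * (func(lower) + func(upper))
    for k in range(1, steps):
        total += func(lower + k * width)
    return total * width

# Hypothetical life model used only for this check (not from the text).
THETA, BETA = 100.0, 2.0

def reliability(t):
    return math.exp(-((t / THETA) ** BETA))

def density(t):
    # f(t) = -dR/dt for the model above
    return (BETA / THETA) * (t / THETA) ** (BETA - 1) * reliability(t)

UPPER = 1000.0  # R(1000) is negligible for these values, so the tail is truncated here
mttf_from_density = integrate(lambda t: t * density(t), 0.0, UPPER)   # Eq. (2A.1)
mttf_from_reliability = integrate(reliability, 0.0, UPPER)            # Eq. (2A.2)
print(mttf_from_density, mttf_from_reliability)
```

Both integrals should agree to within the accuracy of the numerical integration, which is what Eq. (2A.2) asserts.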
APPENDIX 2A.2 THEORETICAL DEVELOPMENT OF MEAN AND MEDIAN RANK ESTIMATORS

Following Leonard Johnson (1951, 1964), we begin with an expression for $f_i(t_i)$, the marginal density of the $i$th order statistic (also see Grosh, 1989):

$$f_i(t_i)\,dt_i = \frac{n!}{(i-1)!\,(n-i)!}\,[F(t_i)]^{i-1}\,[f(t_i)\,dt_i]\,[1 - F(t_i)]^{n-i} \tag{2A.3}$$

The four factors in Eq. (2A.3) are, in order: the number of indistinguishable arrangements associated with this outcome; the likelihood of $(i-1)$ failures occurring before $t = t_i$; the likelihood of a failure occurring in $(t_i, t_i + dt_i)$; and the likelihood of $(n-i)$ failures occurring after $t = t_i$.

The expression given by Eq. (2A.3) is just the density function formed from the trinomial probability of observing $(i-1)$ failures before $t = t_i$ and $(n-i)$ failures after $t = t_i$ (see Figure 2-13).

FIGURE 2-13 Trinomial partition of ordered failure times about $t = t_i$.

To obtain a rank estimator of $W_i = F(t_i)$, we find the density function of the transformed random variable, $W_i = F(t_i)$. Substituting $w_i = F(t_i)$ and $dw_i = f(t_i)\,dt_i$ yields an expression for the density function of $w_i$:

$$g_i(w_i)\,dw_i = \frac{n!}{(i-1)!\,(n-i)!}\,w_i^{\,i-1}(1 - w_i)^{n-i}\,dw_i \quad \text{for } 0 \le w_i \le 1.0 \tag{2A.4}$$

For those who have some familiarity with the beta distribution, Eq. (2A.4) is easily recognized as the density function of a beta distribution having parameters $i$ and $(n + 1 - i)$. The properties of the beta distribution are summarized below.
Beta Distribution

A random variable $W$ follows a beta distribution if it has probability density function given by

$$be(w; a, b) = \frac{1}{\text{Beta}(a, b)}\,w^{a-1}(1 - w)^{b-1}, \quad 0 \le w \le 1 \tag{2A.5}$$

where $\text{Beta}(a, b)$, the beta function, can be expressed as a function of several gamma functions according to

$$\text{Beta}(a, b) = \frac{\Gamma(a)\,\Gamma(b)}{\Gamma(a + b)} \tag{2A.6}$$

Note that $\Gamma(X)$, the gamma function, is just $(X - 1)!$ when $X$ is a positive integer. The properties of the gamma function are summarized in Chapter 3, §3.5.3. The mean of a beta-distributed random variable is given by

$$\mu = \frac{a}{a + b} \tag{2A.7}$$

The mean rank estimator of $F(t_i)$ is $E(W_i)$, the expected value of a beta-distributed random variable. According to Eq. (2A.7), with $a = i$ and $b = n + 1 - i$, the mean rank is evaluated using

$$E[F(t_i)] = \frac{i}{n + 1} \tag{2A.8}$$
For small or large $i$, the beta distributions for the $\{W_i\}$, $i = 1, 2, \ldots, n$, are quite skewed. Thus, it is often better to work with the median rank estimator, which is more representative of the location of the distribution. The median estimator, $\hat{F}_i$, is defined as the 50th percentile of the distribution of $W_i$:

$$\int_0^{\hat{F}_i} g_i(w_i)\,dw_i = 0.5 \tag{2A.9}$$

Rather than seek to evaluate Eq. (2A.9) directly, which is cumbersome, we can make use of an alternate formula that involves the evaluation of F-quantiles. Tables of F-values are readily available, and F-quantiles can be evaluated in spreadsheet format using the built-in statistical add-ons for Excel, for example. Given that $W_i \sim \text{beta}(i, n + 1 - i)$, by Eq. (2A.12), percentiles of $W_i$ can be evaluated using

$$W_{i,\alpha} = \frac{i}{i + (n + 1 - i)\,F_{2(n+1-i),\,2i,\,1-\alpha}} \tag{2A.10}$$

Furthermore, the median of $W_i$, denoted $\hat{F}_i$, is found using

$$\hat{F}_i = \frac{i}{i + (n + 1 - i)\,F_{2(n+1-i),\,2i,\,0.50}} \tag{2A.11}$$
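As a rough computational companion to Eqs. (2A.8) and (2A.9), the following sketch evaluates mean and median ranks without F-tables. It relies on the standard identity that the CDF of a beta($i$, $n+1-i$) variable equals a binomial tail sum, and solves Eq. (2A.9) by bisection. The function names are illustrative, and the `prob` argument is a left-tail cumulative probability, which is an assumption of this sketch rather than the right-tail notation of Eq. (2A.10).

```python
import math

def beta_cdf_integer(w, i, n):
    """CDF at w of a beta(i, n + 1 - i) variable via the binomial tail sum:
    P(W_i <= w) = P(at least i of n independent uniform draws fall below w)."""
    return sum(math.comb(n, k) * w**k * (1.0 - w) ** (n - k)
               for k in range(i, n + 1))

def rank_percentile(i, n, prob=0.5, tol=1e-12):
    """Solve P(W_i <= w) = prob for w by bisection; prob = 0.5 gives the median rank."""
    low, high = 0.0, 1.0
    while high - low > tol:
        mid = 0.5 * (low + high)
        if beta_cdf_integer(mid, i, n) < prob:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)

def mean_rank(i, n):
    """Eq. (2A.8)."""
    return i / (n + 1)

n = 10
for i in range(1, n + 1):
    print(i, round(mean_rank(i, n), 4), round(rank_percentile(i, n), 4))
```

Calling `rank_percentile(i, n, 0.05)` and `rank_percentile(i, n, 0.95)` would give the endpoints of a two-sided 90% interval on $F(t_i)$ under the same assumptions.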
The demonstration of this result is founded on the general relationship between the F-distribution and the beta distribution, which is described in the following section.

2A.2.1 Using the F-Distribution to Evaluate Percentiles of the Beta Distribution

Prove the following: Let $W \sim \text{beta}(a, b)$ over the interval $0 \le W \le 1$. Then

$$W = \frac{(a/b)F}{1 + (a/b)F}$$

where $F$ denotes an F-distributed random variable having $\nu_1 = 2a$ numerator and $\nu_2 = 2b$ denominator degrees of freedom. Therefore, we evaluate the right-tailed quantile, $W_{1-C}$, using

$$W_{1-C} = \frac{(a/b)F_{2a,2b,1-C}}{1 + (a/b)F_{2a,2b,1-C}} = \frac{1}{1 + (b/a)F_{2b,2a,C}} \tag{2A.12}$$

In Eq. (2A.12) we make use of the fact that $F_{\nu_1,\nu_2,C} = 1/F_{\nu_2,\nu_1,1-C}$. To prove this, it is necessary to transform the beta-distributed random variable into an alternate form using $V/(V + 1) = W$. We use the Jacobian transform to derive an expression for $g(v; a, b)$, the density function for $V$:

$$g(v; a, b) = f(w(v); a, b)\left|\frac{dW}{dV}\right| = f(w(v); a, b)\,\frac{1}{(v + 1)^2} = \frac{1}{\text{Beta}(a, b)}\,\frac{v^{a-1}}{(v + 1)^{a+b}} \quad \text{for } 0 \le v < \infty \tag{2A.13}$$

The alternate form compares directly with the F-distribution. The density function of an F-distribution with $\nu_1$ numerator degrees of freedom and $\nu_2$ denominator degrees of freedom, denoted $h(f)$, is

$$h(f) = \frac{1}{B\left(\frac{\nu_1}{2}, \frac{\nu_2}{2}\right)}\left(\frac{\nu_1}{\nu_2}\right)^{\nu_1/2}\frac{f^{(\nu_1/2) - 1}}{\left(\frac{\nu_1}{\nu_2}f + 1\right)^{(\nu_1 + \nu_2)/2}} \quad \text{for } 0 < f \tag{2A.14}$$

The interrelationship between Eqs. (2A.13) and (2A.14) is evident once the substitution $V = (\nu_1/\nu_2)F$ is made. Then

$$g(f) \propto \frac{\left(\frac{\nu_1}{\nu_2}f\right)^{a-1}}{\left(\frac{\nu_1}{\nu_2}f + 1\right)^{a+b}}$$

requiring that (a) $a - 1 = (\nu_1/2) - 1$ and (b) $a + b = (\nu_1 + \nu_2)/2$. Thus, $\nu_1 = 2a$ and $\nu_2 = 2b$.
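The beta–F relationship can be sanity-checked by simulation. The sketch below is an assumption-laden illustration, not a proof: it builds F variates from two chi-square draws (generated as gamma variates with scale 2), applies the transformation $W = (a/b)F/[1 + (a/b)F]$, and compares the sample mean of $W$ with the beta mean $a/(a+b)$ of Eq. (2A.7). The seed, sample size, and parameter values are arbitrary choices.

```python
import random

def beta_via_f(a, b, rng):
    """Draw W = (a/b)F / (1 + (a/b)F), with F built from two chi-square draws."""
    chi2_num = rng.gammavariate(a, 2.0)  # chi-square with 2a degrees of freedom
    chi2_den = rng.gammavariate(b, 2.0)  # chi-square with 2b degrees of freedom
    f_value = (chi2_num / (2 * a)) / (chi2_den / (2 * b))
    scaled = (a / b) * f_value
    return scaled / (1.0 + scaled)

rng = random.Random(2002)   # fixed seed so the run is repeatable
a, b = 3, 8                 # hypothetical parameters, e.g. i = 3 and n = 10
draws = [beta_via_f(a, b, rng) for _ in range(20_000)]
sample_mean = sum(draws) / len(draws)
print(sample_mean, a / (a + b))  # the two values should be close
```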
APPENDIX 2A.3 OVERVIEW OF MODIFIED KAPLAN–MEIER RANK ESTIMATOR

The modified Kaplan and Meier (K–M) estimator is a form of the product-limit estimator. It differs from the original K–M estimator only in the underlying rank estimator. The original K–M procedure was based on the use of the naive estimator, $\hat{R}(t_i) = 1 - i/n$; the modified form is based on use of the mean rank estimator:

$$\hat{R}(t_i) = \frac{n + 1 - i}{n + 1}, \quad i = 0, 1, 2, \ldots, n \tag{2A.15}$$

The product-limit estimator is a conditional expression describing the survivability at $t = t_i$ given that the $(i-1)$th failure occurred at $t = t_{i-1}$:

$$\hat{R}(t_i \mid t_{i-1}) = \frac{\hat{R}(t_i)}{\hat{R}(t_{i-1})} = \frac{\dfrac{n + 1 - i}{n + 1}}{\dfrac{n + 2 - i}{n + 1}} = \frac{n + 1 - i}{n + 2 - i} \tag{2A.16}$$

This estimator is updated only upon the occurrence of an observed failure. That is,

$$\hat{R}(t_i) = \begin{cases} \dfrac{n + 1 - i}{n + 2 - i}\,\hat{R}(t_{i-1}) & \text{if a failure occurs at time } t_i \\[2mm] \hat{R}(t_{i-1}) & \text{if no failures are recorded at } t_i \end{cases} \tag{2A.17}$$

Equation (2A.17) generalizes to

$$\hat{R}(t_i) = \hat{R}(0)\prod_{j=1}^{i}\left(\frac{n + 1 - j}{n + 2 - j}\right)^{\delta_j} \tag{2A.18}$$

where $\hat{R}(0) = 1.0$, $\delta_j = 1$ if the $j$th event is an observed failure, and $\delta_j = 0$ if the event corresponds to a censored observation. Table 2-10 demonstrates the use of the modified K–M estimator using the data set of Table 2-8 in §2.3.3. Note that the mean rank estimates shown earlier in Table 2-8 are in agreement with the modified K–M estimates of $F(t)$ shown in Table 2-10. We also observe that these procedures do not provide useful information beyond the last recorded failure, which, for this example, occurred at $t = 1099$ hr, with $\hat{R} = 0.2922$.

TABLE 2-10 Illustration of Calculations for Product-Limit Estimator (Modified K–M Estimator)

| $i$ | Cycles on test | Status | $\frac{n+1-i}{n+2-i}$ | Calculations | $\hat{R}(t)$ | $\hat{F}(t)$ |
|---|---|---|---|---|---|---|
| 1 | 544 | Failure | 10/11 | (10/11) × 1.000 = | 0.9090 | 0.0910 |
| 2 | 663 | Failure | 9/10 | (9/10) × 0.909 = | 0.8181 | 0.1819 |
| 3 | 802 | Suspension | 8/9 | - | 0.8181 | - |
| 4 | 827 | Suspension | 7/8 | - | 0.8181 | - |
| 5 | 897 | Failure | 6/7 | (6/7) × 0.8181 = | 0.7013 | 0.2987 |
| 6 | 914 | Failure | 5/6 | (5/6) × 0.7013 = | 0.5844 | 0.4156 |
| 7 | 939 | Suspension | 4/5 | - | 0.5844 | - |
| 8 | 1084 | Failure | 3/4 | (3/4) × 0.5844 = | 0.4383 | 0.5617 |
| 9 | 1099 | Failure | 2/3 | (2/3) × 0.4383 = | 0.2922 | 0.7078 |
| 10 | 1202 | Suspension | 1/2 | - | 0.2922 | - |

Variance of K–M Estimator

Lawless (1982) provides an expression for estimating the variance of $\hat{R}(t)$:

$$\text{Var}\left[\hat{R}(t_i)\right] = \hat{R}^2(t_i)\sum_{j=1}^{i}\frac{\delta_j}{(n - j)(n - j + 1)} \tag{2A.19}$$
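The recursion of Eq. (2A.17) is easy to mechanize. The sketch below is one possible implementation under the stated rule (update only on failures), applied to the status column of Table 2-10; the function name `modified_km` and the encoding of statuses as a string are illustrative choices, not part of the original procedure.

```python
def modified_km(statuses):
    """Return R-hat after each ordered event.
    statuses[j] is True for a failure and False for a suspension."""
    n = len(statuses)
    reliability = 1.0          # R-hat(0) = 1.0
    estimates = []
    for i, failed in enumerate(statuses, start=1):
        if failed:             # update only when a failure is observed
            reliability *= (n + 1 - i) / (n + 2 - i)
        estimates.append(reliability)
    return estimates

# Status column of Table 2-10 (F = failure, S = suspension), in time order.
cycles = [544, 663, 802, 827, 897, 914, 939, 1084, 1099, 1202]
status = "FFSSFFSFFS"
r_hat = modified_km([s == "F" for s in status])
for t, s, r in zip(cycles, status, r_hat):
    print(t, s, round(r, 4), round(1.0 - r, 4))
```

The printed values should match the $\hat{R}(t)$ and $\hat{F}(t)$ columns of Table 2-10 up to differences in the last digit, since the table appears to truncate rather than round.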
3 A Survey of Distributions Used in Reliability Estimation
Histograms and other exploratory data models are useful for summarizing time-to-failure data. This information can be used to fit a distribution to the data.

3.1 INTRODUCTION

In this chapter a survey of widely used distributional models for modeling time-to-failure is presented. The properties for each distribution surveyed are summarized in tabular form. These include expressions for the following:

| Metric | Definition |
|---|---|
| (i) $f(t)$, the probability density function | $P(a < t < b) = \int_a^b f(t)\,dt$, to evaluate the likelihood of a failure occurring in the interval $(a, b)$. |
| (ii) $F(t)$, the cumulative distribution function | $P(t < b) = \int_0^b f(t)\,dt$, to evaluate the likelihood of a failure occurring by time $b$. |
| (iii) $R(t)$, the survival function | $P(t > b) = 1 - F(b) = \int_b^\infty f(t)\,dt$, to evaluate the likelihood of survival to time $b$. |
| (iv) $\lambda(t)$, the hazard function or instantaneous failure rate | $\lambda(t) = \dfrac{f(t)}{R(t)}$ |
| (v) Expressions for first and second population moments ($\mu$ and $\sigma^2$) and potentially higher population moments | $\mu = \int_0^\infty t f(t)\,dt = \int_0^\infty R(t)\,dt$, the true mean time to failure. $\sigma^2 = \int_0^\infty (t - \mu)^2 f(t)\,dt = \int_0^\infty t^2 f(t)\,dt - \mu^2$, the true mean squared deviation about mean time to failure. |
| (vi-a) $t_R$, the $R$th quantile of the survival distribution (a right-tailed quantity) | $R(t_R) = \int_{t_R}^\infty f(t)\,dt = R$. |
| (vi-b) $B_{10}$, the 10th percentile of the failure distribution (a left-tailed quantity) | $F(B_{10}) = \int_0^{B_{10}} f(t)\,dt = 0.10$. Can be generalized to $B_p$, the $p$th percentile of the failure distribution. |

A description of each of the distributions, along with a summary of their overall properties, is presented in Table 3-1. As described in the appendix, each of the distributions can be generalized as either a location-scale or log-location-scale distribution. This generalization is often useful when modeling general time-dependent failure phenomena.
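When only the survival function is available in computable form, metrics (iv) and (vi) can be approximated numerically. The following is a minimal sketch under stated assumptions: `hazard` uses a central-difference estimate of $f(t) = -dR/dt$, `survival_quantile` finds $t_R$ by bisection, and the exponential model with `RATE = 0.002` per hour is a hypothetical test case chosen because its answers are known in closed form.

```python
import math

def hazard(reliability, t, step=1e-4):
    """lambda(t) = f(t)/R(t), with f(t) = -dR/dt taken by central difference."""
    density = (reliability(t - step) - reliability(t + step)) / (2.0 * step)
    return density / reliability(t)

def survival_quantile(reliability, target, upper=1e7, tol=1e-9):
    """Find t_R with R(t_R) = target by bisection (R is non-increasing in t)."""
    low, high = 0.0, upper
    while high - low > tol * max(1.0, high):
        mid = 0.5 * (low + high)
        if reliability(mid) > target:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)

RATE = 0.002  # hypothetical constant failure rate per hour

def expo_reliability(t):
    return math.exp(-RATE * t)

b10 = survival_quantile(expo_reliability, 0.90)  # B10 is the same point as t_0.90
print(b10, hazard(expo_reliability, 100.0))
```

For this test case the results can be compared with $-\ln(0.90)/\lambda$ and with the constant rate itself.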
TABLE 3-1 Overview of Classical Time-to-Failure Distributions Used in Reliability

| Distribution | Properties |
|---|---|
| Normal distribution | Widely used to model central tendencies. Useful if the coefficient of variation, $C = \sigma/\mu$, is less than 10%. Tolerances, material properties, etc. are generally modeled using the normal distribution. Widely used in (probabilistic) design for reliability. |
| Lognormal distribution | Useful for modeling fatigue-related phenomena and other stress–strength phenomena. The logged data will be normally distributed. |
| Exponential distribution | Widely used in electronics and probabilistic modeling. Applicable for modeling constant failure-rate phenomena. |
| Weibull distribution | General-purpose distribution used to model time-to-failure phenomena. Originally proposed to model fatigue-related phenomena. Its hazard rate follows a general power-law model, $\lambda(t) = at^b$. |
| Extreme-value distribution (EVD) | Used almost exclusively to model extreme environmental/stress phenomena such as minimum rainfall, maximum load, etc. Logged Weibull data will follow a minimum EVD distribution. |
As referenced in Table 3-1, several of these distributions are interrelated. Specifically,

$$Y \text{ is lognormal} \iff \ln Y \text{ is normal}$$

$$Y \text{ is Weibull} \iff \ln Y \text{ is a minimum extreme value (Gumbel)}$$

These relationships are important to consider, given that the Weibull and normal distributions are much better known and studied. Rather than provide an exhaustive summary of each and every distribution listed, it is much simpler to advise the reader to utilize the more popular distribution forms like the Weibull and normal distributions. If necessary, the reliability analyst is encouraged to apply a simple log or exponential transformation so that the life data can be fit to either a normal or Weibull distribution. Accordingly, only an abbreviated overview of the extreme-value (EVD) and lognormal distributions is presented. Information on other lesser-known or less-used distributions is also provided. This list includes the (log-)gamma, (log-)logistic, inverse Gaussian and gamma, Birnbaum–Saunders, and Gompertz–Makeham distributions.
3.2 NORMAL DISTRIBUTION

Conformists tend to be moderate, just like normal data.

The normal distribution is frequently used to model the distribution of quality-related characteristics and other variable sampling measures. It is, of course, symmetric and possesses a single mode. By the central limit theorem, characteristics involving a sum of many other independent variables are approximately normally distributed. This signifies that the data has central tendency, wherein most of the population lies within several standard deviations of its mean. As illustrated in Figure 3-1, 68.26% of the population lies within $\pm 1\sigma$ unit of the population mean, $\mu$; graduating rapidly to a level of 95.46% for $\mu \pm 2\sigma$ units; 99.73% for $\mu \pm 3\sigma$ units; up to 99.994% for $\mu \pm 4\sigma$ units, and so on.

The normal distribution is used to approximate a wide variety of process behaviors. Despite its widespread popularity, the normal distribution is probably the most overused and incorrectly used distribution. In quality control, in-control processes are modeled as a sequence of stationary (constant), normally distributed random variables. In the real world, however, processes are noisy. Noises are the result of a whole host of uncontrollable factors, resulting in excessive, and unpredictable, process variation. Consequently, process characteristics are often serially correlated and perturbed by random process changes, which invalidates any distributional assumptions.

3.2.1 Central Tendency

Under the central limit theorem and fairly general conditions, the sum $\sum_{i=1}^{n} y_i$ (a sum of $n$ independent, identically distributed random variables, each with finite mean $\mu = E(y_i)$ and finite variance $\sigma^2 = \text{Var}(y_i)$) is asymptotically (i.e., for $n \to \infty$) normally distributed, regardless of the distribution of the $y_i$ random variables. Consequently, the sum $\sum_{i=1}^{n} y_i$ is approximately normally distributed with mean $\sum_{i=1}^{n} \mu = n\mu$ and variance $\sum_{i=1}^{n} \sigma^2 = n\sigma^2$, for sample sizes as small as 4 or 5 for unimodal, symmetric underlying distributions on $y_i$.

FIGURE 3-1 Illustrating central tendency of the standard normal distribution.
A manifestation of the central limit theorem is shown in Figure 3-2, wherein we simulate 100 outcomes from a binomial distribution (i.e., number of successes out of $n$ trials) with $p = 0.10$ and $n = 5$, 10, 20, and 50. Histogram summaries of the simulation are presented. Note the pronounced "bell-shaped" tendency for $n = 20$ and 50. Thus, for failure-related phenomena involving an accumulation of a multitude of effects, the resultant phenomena can often be modeled using a normal distribution.

3.2.2 Properties of Normal Distribution

The probability density function of a normally distributed variable, $f(t)$, follows the well-known bell shape:

$$f(t) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{t - \mu}{\sigma}\right)^2\right], \quad -\infty \le t \le \infty \tag{3.1}$$

FIGURE 3-2 Simulation of 100 outcomes from a binomial distribution with $p = 0.1$; $n = 5$, 10, 20, and 50.
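A simulation in the spirit of Figure 3-2 can be reproduced with a few lines of code. The sketch below is an assumption-based illustration: it draws 100 binomial outcomes for each $n$ with $p = 0.10$ and prints a text histogram. The seed is arbitrary, so the counts will not match the published figure exactly.

```python
import random
from collections import Counter

def binomial_outcome(n, p, rng):
    """Number of successes in n independent trials with success probability p."""
    return sum(rng.random() < p for _ in range(n))

def simulate(n, p=0.10, runs=100, seed=1):
    rng = random.Random(seed)  # arbitrary seed, used only for repeatability
    return Counter(binomial_outcome(n, p, rng) for _ in range(runs))

for n in (5, 10, 20, 50):
    counts = simulate(n)
    print(f"n = {n}")
    for successes in range(max(counts) + 1):
        print(f"  {successes:2d} {'*' * counts[successes]}")
```

Each binomial outcome is itself a sum of $n$ independent 0/1 variables, which is why the histograms become more symmetric as $n$ grows.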
Direct integration of $f(t)$ requires the use of either complex numerical methods or functional approximations of $f(t)$. Alternatively, tabulated values of cumulative probabilities of the standard normal distribution have been developed and are widely available. Percentiles and cumulative probabilities of the survival distribution can then be evaluated by applying the standard normal transformation, $Z = (t - \mu)/\sigma$. The standard normal random variable $Z$ has a population mean $\mu = 0$ and variance $\sigma^2 = 1.0$. Thus, $F(t_R)$, the failure distribution about survival percentile $t_R$, is given by

$$\begin{aligned} F(t_R) &= P(T \le t_R) \\ &= P\left(\frac{T - \mu}{\sigma} \le \frac{t_R - \mu}{\sigma}\right) && \text{subtracting } \mu \text{ and dividing by } \sigma \text{ on both sides} \\ &= P(Z \le Z_R) && \text{where } Z_R = \frac{t_R - \mu}{\sigma} \\ &= \Phi(Z_R) \\ &= 1 - R \end{aligned} \tag{3.2}$$

where
$\Phi(z)$ = cumulative distribution for a standard normal random variable, $N(0, 1)$;
$Z = (t - \mu)/\sigma$, the standard normal random variable;
$Z_R = (t_R - \mu)/\sigma$, the $R$th survival percentile of the standard normal distribution.

The use of the standard right-tail notation $Z_R$ is illustrated with Figure 3-3. The table values were constructed using Microsoft Excel formulas. The following relates the standard metrics back to their untransformed values:

$$t_R = \mu + \sigma Z_R \tag{3.3}$$

The properties of the normal distribution are summarized in Table 3-2 in terms of both the transformed $Z$ and originating variable, $t$. The overall behavior of several of the key reliability metrics is illustrated in Figure 3-4, where it is seen that the standard normal hazard function, $\lambda(z)$, is an increasing function, which is useful for modeling wearout phenomena. A normal plot of 30 data points from a standard normal distribution is also displayed. This procedure is described in §4.2.

FIGURE 3-3 The survival percentile, $t_R$, rescaled with respect to the right-tailed quantile, $Z_R$.
Example 3-1

A wear characteristic is normally distributed with mean $\mu = 15{,}000$ cycles and variance $\sigma^2 = 4 \times 10^6$ cycles$^2$. Find the $B_{10}$ life. (Note: $B_{10}$ life is standard industry notation for the 10th percentile of the failure distribution, with $t_{0.90} = B_{10}$, the 90th percentile of the survival distribution.)

Solution (denoting $T$ as our usage characteristic): $P(T < B_{10}) = 0.10$, or $P[(T - \mu)/\sigma < (B_{10} - \mu)/\sigma] = 0.10$.

$$P\left(Z < \frac{B_{10} - 15{,}000}{2000}\right) = 0.10 \;\Rightarrow\; Z_{0.90} = -Z_{0.10} = \frac{B_{10} - 15{,}000}{2000} = -1.28$$

$$\Rightarrow\; B_{10} = 15{,}000 - 1.28 \times 2000 = 12{,}440 \text{ cycles}$$

Note: To obtain $Z_{0.10}$, we can make use of the NORMSINV function in Excel (see Figure 3-5).
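The same lookup can be done outside a spreadsheet. As a hedged sketch, the snippet below uses `statistics.NormalDist` from the Python standard library in place of NORMSINV; the variable names are illustrative. Because it carries more digits of the standard normal quantile than the rounded value 1.28, the result differs slightly from 12,440 cycles.

```python
from statistics import NormalDist

mu = 15_000.0            # mean, cycles
sigma = (4e6) ** 0.5     # standard deviation = sqrt(variance) = 2000 cycles

z_left_10 = NormalDist().inv_cdf(0.10)            # left-tail 10% point, about -1.28
b10 = mu + sigma * z_left_10                      # Eq. (3.3) with Z_0.90 = -Z_0.10
b10_direct = NormalDist(mu, sigma).inv_cdf(0.10)  # same quantile, computed directly
print(round(z_left_10, 4), round(b10, 1), round(b10_direct, 1))
```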
TABLE 3-2 Distributional Properties of the Normal Distribution

| Property | Expression [in terms of $t$ and $z = (t - \mu)/\sigma$] |
|---|---|
| Probability density function (pdf), $f(t)$ | $\dfrac{1}{\sigma\sqrt{2\pi}}\exp\left[-\dfrac{1}{2}\left(\dfrac{t - \mu}{\sigma}\right)^2\right] = \dfrac{1}{\sigma}\phi(z)$; $-\infty \le t, z \le \infty$ |
| Failure distribution (cdf), $F(t)$ | $\Phi\left(\dfrac{t - \mu}{\sigma}\right) = \Phi(z)$ |
| Reliability, $R(t)$ | $1 - F(t) = 1 - \Phi(z)$ |
| Survival percentile, $t_R$ | $\mu + \sigma Z_R$ |
| Instantaneous failure rate, $\lambda(t)$ | $\dfrac{f(t)}{R(t)} = \dfrac{\dfrac{1}{\sigma\sqrt{2\pi}}\exp\left[-\dfrac{1}{2}\left(\dfrac{t - \mu}{\sigma}\right)^2\right]}{1 - \Phi\left(\dfrac{t - \mu}{\sigma}\right)} = \dfrac{1}{\sigma}\,\dfrac{\phi(z)}{1 - \Phi(z)} = \dfrac{1}{\sigma}\lambda(z)$ |
| Population mean¹, $\mu$ | $\mu$ |
| Population variance, $\sigma^2$ | $\sigma^2$ |
| Skewness, $\mu_3$ | 0 |
| Kurtosis, $\mu_4$ | 3 |

¹ My students often find it confusing to use the notation $\mu$ and $\sigma^2$ to denote the parameters of the normal distribution and as a general notation for the population mean and variance, respectively.

FIGURE 3-4 Reliability metrics illustrated for a standard normal, $N(0, 1)$, variate (Minitab V12.1 Stat > Reliability/Survival > Distribution Overview procedure).

FIGURE 3-5 Use of NORMSINV function in Microsoft Excel to look up standard Z-values.
3.3 LOGNORMAL DISTRIBUTION

Rather than suggest a whole new set of distribution fitting tools for the lognormal distribution, one suggestion is that the practitioner work with the more familiar $\ln T$ random variable, formed simply by applying a natural log transformation on all observed values. Due to the wide use and applicability of the normal distribution, a rich collection of statistical tools for estimation and distribution fitting has been developed. (See Figure 3-6.) Accordingly, specific maximum likelihood (ML) estimation routines of lognormal parameters and the use of dedicated lognormal probability plotting paper are not covered here. Consult references such as Lewis (1996) for more specific information on dedicated procedures for assessing lognormal fits.

FIGURE 3-6 Relationship between the lognormal and normal distributions.

By the central limit theorem, if $T$ can be modeled by a product of many independent terms, then $\ln T$, the logged characteristic, can be modeled as a sum of many terms, and so it is approximately normally distributed. Accordingly, $T$ follows a lognormal distribution since $\ln T$ is approximately normally distributed. As an example, consider the distribution of grain size of particulate matter. Suppose that a particle of some initial size, $d_0$, is subjected to repeated, but independent, impacts and on each impact a proportion, $X_i$, of the particle remains. Then after the initial impact, the size of the particle is $Y_1 = X_1 d_0$; after the second impact, the size is $Y_2 = X_2 X_1 d_0$; and after the $n$th impact, the size is

$$Y_n = X_n X_{n-1} \cdots X_2 X_1 d_0$$

Then

$$\ln Y_n = \ln(d_0) + \sum_{i=1}^{n} \ln X_i \tag{3.4}$$

and by the central limit theorem, $\ln Y_n$ will be approximately normally distributed for large $n$. The model described by Eq. (3.4) can be used to model a wide variety of failure phenomena. The lognormal distribution is a popular distribution for modeling fatigue-related failures in manufactured components and support structures. In such cases metal fatigue results from the initiation of cracks and their growth, due to repeated exposure to physical loads, thermal effects, and so on.

To derive an expression for the cumulative distribution function of a lognormal random variable, we begin with a standard normal formulation involving the log characteristic, $\ln t$, its mean, $\ln t_{med}$, and its variance, $\sigma^2$. Thus,

$$F(t) = P(T \le t) = P(\ln T \le \ln t) = \Phi\left(\frac{\ln t - \ln t_{med}}{\sigma}\right) \tag{3.5}$$
In Eq. (3.5) we make use of the fact that the logarithm is a monotonically increasing function, and thus the cumulative distribution of a lognormal characteristic can readily be expressed in terms of the standard normal cumulative probability. It is easily seen that for $t = t_{med}$, $Z = (\ln t - \ln t_{med})/\sigma = 0$, and $\Phi(Z) = 0.50$. Therefore, $t_{med}$ is the median of the lognormal distribution, the $t_{0.50}$ point. To find the density function, we take the derivative of $\Phi(z)$ with respect to $t$ as follows:

$$\begin{aligned} f(t) &= \frac{d\Phi(z)}{dz}\,\frac{dz}{dt} && \text{where } z = \frac{\ln t - \ln t_{med}}{\sigma} \\ &= \phi(z)\,\frac{1}{\sigma t} && \text{since } \frac{d}{dt}\left(\frac{\ln t - \ln t_{med}}{\sigma}\right) = \frac{1}{\sigma t} \\ &= \frac{1}{\sqrt{2\pi}\,\sigma t}\exp\left[-\frac{1}{2\sigma^2}\left(\ln\frac{t}{t_{med}}\right)^2\right] && \text{for } t > 0 \end{aligned} \tag{3.6}$$

The standard shapes of $f(t)$ and $\lambda(t)$, with $t_{med} = 1$, are illustrated in Figure 3-7 for a wide range of values of the scale factor, $\sigma$. The characteristic long right tail of the lognormal distribution is evident! The hazard function grows to a maximum early in its life and then decreases. As such, Meeker and Escobar (1998) make a note of its suitability for modeling failure times of certain materials or components that exhibit an "early life hardening." The properties of $\ln T$ and $T$ are summarized in Table 3-3.

FIGURE 3-7 Shape and hazard function of lognormal distribution with $t_{med} = 1$ and a range of $\sigma$-values.
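Equations (3.5) and (3.6) translate directly into code. The sketch below is an illustration using `statistics.NormalDist` for $\Phi$ and $\phi$; the function names and the choice $\sigma = 1$ are assumptions made for the example. Evaluating the hazard at a few times shows the rise-then-fall pattern described for Figure 3-7.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def lognormal_cdf(t, t_med, sigma):
    """Eq. (3.5): F(t) = Phi((ln t - ln t_med) / sigma)."""
    return STD_NORMAL.cdf((math.log(t) - math.log(t_med)) / sigma)

def lognormal_pdf(t, t_med, sigma):
    """Eq. (3.6): f(t) = phi(z) / (sigma * t)."""
    z = (math.log(t) - math.log(t_med)) / sigma
    return STD_NORMAL.pdf(z) / (sigma * t)

def lognormal_hazard(t, t_med, sigma):
    """lambda(t) = f(t) / R(t), with R(t) = 1 - F(t)."""
    return lognormal_pdf(t, t_med, sigma) / (1.0 - lognormal_cdf(t, t_med, sigma))

# t_med = 1 as in Figure 3-7; sigma = 1 is one illustrative choice.
for t in (0.1, 0.5, 1.0, 2.0, 5.0, 20.0):
    print(t, round(lognormal_hazard(t, 1.0, 1.0), 4))
```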
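TABLE 3-3 Comparison of Distributional Properties of ln T and T

| Property | $\ln T$ (normally distr.) | $T$ (lognormally distr.) |
|---|---|---|
| Location parameter | $\ln t_{med}$ | $t_{med}$ |
| Scale parameter | $\sigma$ | $\sigma$ |
| Density functions, $f(t)$ | $\dfrac{1}{\sqrt{2\pi}\,\sigma}\exp\left[-\dfrac{1}{2\sigma^2}\left(\ln\dfrac{t}{t_{med}}\right)^2\right]$ | $\dfrac{1}{\sqrt{2\pi}\,\sigma t}\exp\left[-\dfrac{1}{2\sigma^2}\left(\ln\dfrac{t}{t_{med}}\right)^2\right]$ |
| Cum. distr., $F(t)$ | $\Phi\left(\dfrac{\ln t - \ln t_{med}}{\sigma}\right)$ | $\Phi\left(\dfrac{\ln t - \ln t_{med}}{\sigma}\right)$ |
| Pop. mean, $\mu$ | $\ln t_{med}$ | $t_{med}\exp(\sigma^2/2)$ |
| Median | $\ln t_{med}$ | $t_{med}$ |
| Variance, $\sigma^2$ | $\sigma^2$ | $t_{med}^2\exp(\sigma^2)[\exp(\sigma^2) - 1]$ ¹ |

¹ Note: In many textbooks and software packages such as Minitab, the lognormal distribution is parameterized with $\mu = \ln t_{med}$, the population mean of $\ln(T)$.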
Example 3-2: Lognormal fit

A data set given in Table 3-4 consists of $n = 15$ prototype failure times that are to be assessed for a lognormal distribution fit. The logged values are plotted on normal paper in Figure 3-8.

TABLE 3-4 Lognormal Data Set

| $t$ | $\ln T$ |
|---|---|
| 343 | 5.84 |
| 409 | 6.01 |
| 589 | 6.38 |
| 862 | 6.76 |
| 962 | 6.87 |
| 1020 | 6.93 |
| 1256 | 7.14 |
| 1282 | 7.16 |
| 1308 | 7.18 |
| 1344 | 7.20 |
| 1835 | 7.51 |
| 2335 | 7.76 |
| 3482 | 8.16 |
| 5216 | 8.56 |
| 6967 | 8.85 |

FIGURE 3-8 Normal plot of $\ln(T)$ values.
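As a numerical complement to the probability plot, the log transformation of Table 3-4 can be scripted. The sketch below is only one simple way to summarize the logged data: it uses the sample mean and sample standard deviation of $\ln T$ as rough estimates of $\ln t_{med}$ and $\sigma$. This moment-style summary is an assumption of the example, not the plotting procedure used in the text.

```python
import math
from statistics import mean, stdev

failure_times = [343, 409, 589, 862, 962, 1020, 1256, 1282,
                 1308, 1344, 1835, 2335, 3482, 5216, 6967]  # Table 3-4

logs = [math.log(t) for t in failure_times]
mu_log = mean(logs)            # rough estimate of ln(t_med)
sigma_log = stdev(logs)        # sample standard deviation of ln T
t_med_hat = math.exp(mu_log)   # back-transform to the lognormal median

print(round(mu_log, 3), round(sigma_log, 3), round(t_med_hat, 1))
```

The logged values computed this way should agree with the $\ln T$ column of Table 3-4 to two decimal places.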
3.4 EXPONENTIAL DISTRIBUTION

The Fountain-of-Youth Distribution

The exponential distribution is so named for its survival function, which is of the (simple) exponential form

$$R(t) = 1 - F(t) = \exp(-\lambda t) \quad \text{for } t \ge 0 \tag{3.7}$$

The density function is identified by evaluating $-dR(t)/dt$:

$$f(t) = \lambda\exp(-\lambda t) \quad \text{for } t \ge 0 \tag{3.8}$$

The properties of the exponential distribution are summarized in Table 3-5. However, we do call attention to the following unique properties of the exponential distribution:

1. Mean time-to-failure (MTTF):

$$\mu = \int_0^\infty R(t)\,dt = \int_0^\infty \exp(-\lambda t)\,dt = \frac{1}{\lambda} \tag{3.9}$$

2. Constant hazard-rate (memoryless) property:

$$\lambda(t) = \frac{f(t)}{R(t)} = \frac{\lambda\exp(-\lambda t)}{\exp(-\lambda t)} = \lambda \quad \text{for } t \ge 0 \tag{3.10}$$

As evidenced by the calculation of Eq. (3.10), the instantaneous failure- or hazard-rate function, $\lambda(t)$, is a constant for the exponential distribution. When the hazard rate $\lambda(t)$ is constant, the conditional probability of a failure occurring in some future time interval, $(t, t + \Delta t]$, does not depend on the age of the product. We refer to this as the memoryless property of the exponential distribution. It can be described mathematically by creating an expression for $P[T > t + \Delta t \mid T > t]$, the conditional probability of surviving the interval $(t, t + \Delta t)$, given survival to some age, $t$:

$$\begin{aligned} P[T > t + \Delta t \mid T > t] &= \frac{P[T > t + \Delta t \,\cap\, T > t]}{P[T > t]} \\ &= \frac{P[T > t + \Delta t]}{P[T > t]} \\ &= \frac{R(t + \Delta t)}{R(t)} \\ &= \frac{\exp[-\lambda(t + \Delta t)]}{\exp[-\lambda t]} \\ &= \exp[-\lambda\Delta t] \\ &= P(T > \Delta t) \end{aligned} \tag{3.11}$$
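The memoryless property of Eq. (3.11) can be illustrated numerically. In the sketch below, the failure rate and the comparison model with an increasing hazard are both hypothetical choices made for the illustration. The conditional survival probability over the next 100 hours stays fixed with age for the exponential model and falls with age for the comparison model.

```python
import math

def conditional_survival(reliability, age, extra):
    """P[T > age + extra | T > age] = R(age + extra) / R(age)."""
    return reliability(age + extra) / reliability(age)

RATE = 0.001  # hypothetical failure rate per hour

def exponential(t):
    return math.exp(-RATE * t)

def wearout(t):
    # hypothetical model with an increasing hazard rate, for contrast
    return math.exp(-((t / 1000.0) ** 2.0))

for age in (0.0, 500.0, 2000.0):
    print(age,
          round(conditional_survival(exponential, age, 100.0), 6),
          round(conditional_survival(wearout, age, 100.0), 6))
```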
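TABLE 3-5 Distributional Properties of Exponential Distribution

| Property | Expression in terms of $\lambda$ | Expression in terms of $\theta$ |
|---|---|---|
| Probability density function (pdf) | $f(t) = \lambda e^{-\lambda t}$ for $t \ge 0$ | $f(t) = (1/\theta)e^{-t/\theta}$ for $t \ge 0$ |
| Failure distribution (cdf) | $F(t) = \int_0^t \lambda e^{-\lambda t'}\,dt' = 1 - e^{-\lambda t}$ for $t \ge 0$ | $F(t) = \int_0^t \theta^{-1}e^{-t'/\theta}\,dt' = 1 - e^{-t/\theta}$ for $t \ge 0$ |
| Reliability, $R(t)$ | $R(t) = 1 - F(t) = e^{-\lambda t}$ | $R(t) = 1 - F(t) = e^{-t/\theta}$ |
| Survival percentile, $t_R$ | $t_R = -\dfrac{\ln(R)}{\lambda}$ | $t_R = -\theta\ln(R)$ |
| $B_{10}$ life | $B_{10} = -\ln(0.90)/\lambda$ | $B_{10} = -\theta\ln(0.90)$ |
| Instantaneous failure rate, $\lambda(t)$ | $\lambda(t) = \dfrac{f(t)}{R(t)} = \dfrac{\lambda e^{-\lambda t}}{e^{-\lambda t}} = \lambda$ | $\lambda(t) = \dfrac{f(t)}{R(t)} = \dfrac{(1/\theta)e^{-t/\theta}}{e^{-t/\theta}} = \dfrac{1}{\theta}$ |
| Cumulative hazard function, $H(t)$ | $H(t) = \int_0^t \lambda\,dt' = \lambda t$ | $H(t) = \int_0^t \dfrac{1}{\theta}\,dt' = \dfrac{t}{\theta}$ |
| Mean time-to-failure, $\mu$ | $\mu = \int_0^\infty R(t)\,dt = \int_0^\infty e^{-\lambda t}\,dt = 1/\lambda$ | $\mu = \int_0^\infty R(t)\,dt = \int_0^\infty e^{-t/\theta}\,dt = \theta$ |
| Variance, Var($T$) | $\text{Var}(t) = E(t^2) - \mu^2 = \dfrac{2}{\lambda^2} - \dfrac{1}{\lambda^2} = \dfrac{1}{\lambda^2}$ | $\text{Var}(t) = E(t^2) - \mu^2 = 2\theta^2 - \theta^2 = \theta^2$ |
| Standard deviation, $\sigma(T)$ | $\sigma = \sqrt{\text{Var}(t)} = \dfrac{1}{\lambda} = \mu$ | $\sigma = \sqrt{\text{Var}(t)} = \theta = \mu$ |
| Coefficient of variation, $(\sigma/\mu)$ | 1.0 | 1.0 |
| Skewness, $\mu_3$ | $\mu_3 = \dfrac{2}{\lambda^3}$ | $\mu_3 = 2\theta^3$ |
| Coefficient of skewness, $\gamma_3$ | 2.0 | 2.0 |
| Kurtosis, $\mu_4$ | $\mu_4 = \dfrac{9}{\lambda^4}$ | $\mu_4 = 9\theta^4$ |
| Coefficient of kurtosis, $\gamma_4$ | 9.0 | 9.0 |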
According to Eq. (3.11), the conditional probability $P[T > t + \Delta t \mid T > t] = P[T > \Delta t]$, the unconditional probability of surviving the next $\Delta t$ time units. Accordingly, we state that the exponential distribution is memoryless, since the probability of failure in the interval $(t, t + \Delta t)$ does not depend on $t$, the product age or usage, but only on $\Delta t$. Thus, we have renamed the exponential distribution the fountain-of-youth distribution, as a product will not age under an exponential distributional assumption! To understand the implications of this assertion, let's look at a common situation that many of our readers are all too familiar with: the decision to repair or replace a costly consumer appliance.

A costly part of a popular consumer appliance has failed and is in need of replacement. A repair technician is called out. He takes a look at the appliance and says, "Don't even think about repairing this! Your appliance is already quite old, and it is simply not worth my time and the cost of a replacement unit to repair it!" However, the customer suggests that he replace the costly part with one taken from a salvaged appliance that is several model years older. Should the repair technician follow the customer's suggestion? What would you do? Why? In making this decision, are you at all concerned about the age and usage of the recycled unit? What about the age of your appliance?

Under an exponential distribution, the time-to-failure is memoryless, and so age should not make a difference. In this case you should be confident that the reliability of a refurbished unit should be comparable to that of a newly manufactured unit, and so you should not be concerned with the usage history of a refurbished appliance.
The exponential distribution is sometimes re-parameterized in terms of a mean time-to-failure (MTTF) parameter, $\theta$. In this case we have

$$R(t) = 1 - F(t) = \exp(-t/\theta) \quad \text{for } t \ge 0 \tag{3.12}$$

In this formulation the population mean, known more popularly as the mean time-to-failure (MTTF), is equal to $\theta$. We summarize the basic properties of the exponential distribution in terms of both $\lambda$ and $\theta$ in Table 3-5.

The exponential distribution is used widely for modeling time-to-failure of electronic components. The original Mil-Handbook 217 (see Rev F, 1991) for reliability prediction was based on the use of an exponential distribution assumption. The exponential distribution assumption cannot be used, however, if infant mortalities [$\lambda(t)$ decreasing] exist or if wearout phenomena [$\lambda(t)$ increasing] exist. The last release of the military handbook, before the DOD no longer supported it, actually did allow for a time-dependent hazard rate (the Weibull hazard rate) when deemed applicable.

The exponential distribution is also widely used for modeling failure phenomena at the system level. We hypothesize that at the system level numerous failure modes may evolve from several subsystems. The overall effect of these random occurrences is modeled as a constant failure-rate process.
3.5
WEIBULL DISTRIBUTION
Waloddi Weibull first proposed the Weibull distribution in 1937. Weibull showed that his distribution could model a wide range of phenomena, from fatigue life to the heights of adult males in the British Isles. It has enjoyed wide popularity since then because its hazard function follows a power law, which is useful for modeling quite a variety of phenomena.
3.5.1
Weibull Power-Law Hazard Function
An expression for the hazard function (instantaneous failure rate) of the Weibull distribution is presented in Eq. (3.13): lðtÞ ¼
b t b1 y y
for t 0
ð3:13Þ
The flexibility of the Weibull distribution is evidenced by the power-law form of lðtÞ: lðtÞ is decreasing for b < 1; it is increasing for b > 1; and b ¼ 1 corresponds to a constant failure-rate phenomenon, the exponential distribution model. The Weibull can also be used to approximate the normal distribution for b 3:5. The Rayleigh and extreme-value distributions are also special cases of the Weibull distribution. The particular distribution or features of the Weibull over a range of b-values are summarized in Table 3-6.
TABLE 3-6 Features of Weibull Distribution over a Range of $\beta$-values
$\beta$ | Features
< 1.0 | Decreasing failure-rate phenomena
1.0 | Exponential (constant failure rate)
> 1.0 | Increasing failure-rate phenomena
2.0 | Rayleigh single peak (linearly increasing)
3.5 | Normal shape
> 10 | Type I extreme value
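The behavior summarized in Table 3-6 can be checked numerically. The following is a minimal sketch of Eq. (3.13), not code from the book: the function names, the scale value `THETA = 1000.0`, and the two probe times are assumptions chosen only for illustration.

```python
import math


def weibull_hazard(t, theta, beta):
    """Weibull hazard of Eq. (3.13): (beta/theta) * (t/theta)**(beta - 1)."""
    return (beta / theta) * (t / theta) ** (beta - 1.0)


def hazard_trend(theta, beta, t_early=10.0, t_late=1000.0):
    """Classify the hazard by comparing it at an early and a late time.

    The probe times t_early and t_late are arbitrary illustrative choices.
    """
    early = weibull_hazard(t_early, theta, beta)
    late = weibull_hazard(t_late, theta, beta)
    if math.isclose(early, late, rel_tol=1e-12):
        return "constant"
    return "increasing" if late > early else "decreasing"


THETA = 1000.0  # assumed characteristic life, for illustration only
for beta in (0.5, 1.0, 2.0, 3.5):
    print(f"beta = {beta}: hazard is {hazard_trend(THETA, beta)}")
```

Because the hazard is a pure power law in t, comparing two time points is enough to classify its trend; no numerical differentiation is needed.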
3.5.2 Weibull Survival Function
The Weibull survival function has the following simple, exponential form:
$$R(t) = \exp\left[-(t/\theta)^{\beta}\right], \quad t \ge 0 \qquad (3.14)$$
The failure distribution is $F(t) = 1 - R(t) = 1 - \exp[-(t/\theta)^{\beta}]$. The similarity between the Weibull and exponential survival functions [see Eq. (3.7)] is apparent, wherein the exponential survival function is generalized with $(t/\theta)^{\beta}$. If the Weibull times are transformed according to $y = t^{\beta}$, then $R(y) = \exp(-\lambda y)$ with $\lambda = 1/\theta^{\beta}$, which is the exponential survival function on $y$. If $\beta$ is known, such a transformation is helpful, allowing the use of the more tractable, exponential expressions for test planning. The Weibayes procedure is based on this approach. It is discussed in greater detail in Chapter 4.
3.5.3 Properties of the Weibull Distribution
Expressions for the reliability metrics $f(t)$, $F(t)$, $R(t)$, and $\lambda(t)$, along with the population mean, variance, and percentiles of the survival distribution, are summarized in Table 3-7. Figure 3-9 shows graphical plots of the probability density function, $f(t)$; the cumulative failure distribution, $F(t)$; and the instantaneous failure rate, $\lambda(t)$. The representations are evaluated over a range of values of the Weibull shape parameter $\beta = 0.5$, 1.0, 2.0, 4.0, and 10.0. The diversified shapes of $f(t)$, $F(t)$, and $\lambda(t)$ for different values of $\beta$ are evident. Note the appearance of the gamma function, $\Gamma(a)$, in the expressions for the population mean and variance. For a constant $a$, the gamma function is defined by the integral
$$\Gamma(a) = \int_0^{\infty} x^{a-1} e^{-x}\,dx \qquad (3.15)$$
Repeated integration by parts of the right-hand side of Eq. (3.15) reveals that $\Gamma(a) = (a-1)\,\Gamma(a-1)$, so that $\Gamma(a) = (a-1)!$ when $a$ is a positive integer. [It is also useful to note that $\Gamma(1/2) = \sqrt{\pi}$.] The gamma function is indirectly available in Excel as the function gammaln(a), the natural log of the gamma function. The use of this Excel function is illustrated in Figure 3-10. In Appendix 3B we outline the mathematics for generating expressions for the population mean ($\mu$), variance ($\sigma^2$), and other higher-order moments of the Weibull distribution. These expressions, in turn, can be used to obtain expressions for the coefficients of skewness and kurtosis. Useful graphical assists have been developed for evaluating the population mean $\mu$. A chart of the function $\mu/\theta = \Gamma(1 + 1/\beta)$ is presented in Figure 3-11. Alternatively, $\mu$ can be directly evaluated in Excel using the relation $\mu = \theta\,\Gamma(1 + 1/\beta)$.
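Outside of Excel, the same quantities can be evaluated with any language that exposes the gamma function. The sketch below uses Python's standard `math.gamma` and `math.lgamma`; the function names are illustrative assumptions, and the example parameters are the Weibull (1000, 1.2) case from Exercise 7.

```python
import math


def weibull_mean(theta, beta):
    """Population mean: theta * Gamma(1 + 1/beta)."""
    return theta * math.gamma(1.0 + 1.0 / beta)


def weibull_variance(theta, beta):
    """Population variance: theta^2 * [Gamma(1 + 2/beta) - Gamma(1 + 1/beta)^2]."""
    g1 = math.gamma(1.0 + 1.0 / beta)
    g2 = math.gamma(1.0 + 2.0 / beta)
    return theta ** 2 * (g2 - g1 ** 2)


def weibull_median(theta, beta):
    """Population median: theta * (ln 2)^(1/beta)."""
    return theta * math.log(2.0) ** (1.0 / beta)


def gamma_via_gammaln(a):
    """Gamma(a) recovered from its natural log, mirroring Excel's EXP(GAMMALN(a))."""
    return math.exp(math.lgamma(a))


theta, beta = 1000.0, 1.2
print("mean    :", weibull_mean(theta, beta))
print("median  :", weibull_median(theta, beta))
print("variance:", weibull_variance(theta, beta))
```

Working through the log of the gamma function, as Excel does, is mainly useful for large arguments, where $\Gamma(a)$ itself would overflow.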
FIGURE 3-9 Scaled values of $f(t)$, $F(t)$, $R(t)$, and $\lambda(t)$ for $\beta = 0.5$, 1.0, 2.0, 3.5, and 10.0.
3.5.4 Three-Parameter Weibull
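The survival and probability density functions of the three-parameter Weibull distribution are respectively given by
$$R(t) = \exp\left[-\left(\frac{t-\delta}{\theta}\right)^{\beta}\right] \quad \text{for } t \ge \delta \qquad (3.16)$$
$$f(t) = \frac{\beta}{\theta}\left(\frac{t-\delta}{\theta}\right)^{\beta-1} \exp\left[-\left(\frac{t-\delta}{\theta}\right)^{\beta}\right] \quad \text{for } t \ge \delta \qquad (3.17)$$
The addition of a threshold parameter, $\delta$, represents a minimum life before which failure events cannot occur. In effect, the time axis is shifted so that life begins at $t = \delta$, and thus the variance will not change, but the mean time-to-failure becomes
$$\mu = \theta\,\Gamma\left(1 + \frac{1}{\beta}\right) + \delta \qquad (3.18)$$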
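As a minimal sketch of Eqs. (3.16) and (3.18), the functions below evaluate the three-parameter survival function and mean. The function names and the parameter values in the usage lines are hypothetical and are not taken from the book.

```python
import math


def weibull3_survival(t, theta, beta, delta):
    """Eq. (3.16): R(t) = exp(-((t - delta)/theta)**beta) for t >= delta.

    No failure can occur before the threshold delta, so R(t) = 1 there.
    """
    if t <= delta:
        return 1.0
    return math.exp(-(((t - delta) / theta) ** beta))


def weibull3_mean(theta, beta, delta):
    """Eq. (3.18): the two-parameter mean shifted by the threshold delta."""
    return theta * math.gamma(1.0 + 1.0 / beta) + delta


# Hypothetical parameters: theta = 1000, beta = 2, threshold delta = 100.
print(weibull3_survival(50.0, 1000.0, 2.0, 100.0))    # before the threshold
print(weibull3_survival(1100.0, 1000.0, 2.0, 100.0))  # one theta past the threshold
print(weibull3_mean(1000.0, 2.0, 100.0))
```

Setting `delta = 0.0` recovers the two-parameter expressions of Table 3-7.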
TABLE 3-7 Weibull Distribution Properties
Property | Expression
Probability density function (pdf) | $f(t) = \frac{\beta}{\theta}\left(\frac{t}{\theta}\right)^{\beta-1} e^{-(t/\theta)^{\beta}}$ for $t \ge 0$
Failure distribution (cdf) | $F(t) = 1 - e^{-(t/\theta)^{\beta}}$
Reliability, $R(t)$ | $R(t) = e^{-(t/\theta)^{\beta}}$
Survival percentile, $t_R$ | $t_R = \theta(-\ln R)^{1/\beta}$
Instantaneous failure rate, $\lambda(t)$ | $\lambda(t) = \frac{\beta}{\theta}\left(\frac{t}{\theta}\right)^{\beta-1}$
Population mean | $\mu = \theta\,\Gamma(1 + 1/\beta)$
Population median | $T_{50} = \theta(\ln 2)^{1/\beta}$
Population variance | $\sigma^2 = \theta^2\left[\Gamma(1 + 2/\beta) - \Gamma^2(1 + 1/\beta)\right]$
Population skewness | See Appendix 3B
Population kurtosis | See Appendix 3B
FIGURE 3-10 Tabulated values for $\Gamma(a)$ and the use of Excel function, GammaLn, for evaluating $\Gamma(a)$.
FIGURE 3-11 $\mu/\theta$ values for Weibull distribution.
The three-parameter Weibull distribution is used to model phenomena that take a minimum time to evolve, such as failures due to fatigue, corrosion, creep, and other degradation phenomena. The distribution is often used to correct for a poor fit of the data to a two-parameter Weibull distribution. In Chapter 4 we illustrate the use of a three-parameter Weibull fit to remedy the concave appearance of the data on ordinary Weibull plotting paper (see Figure 4-18). Detractors will say that its addition is just a gimmick to force a two-parameter Weibull fit onto a data set when it is not appropriate. Often the data can just as well be fitted to a lognormal distribution (see Abernethy, 1996). It is important to note that the choice to use either lognormal or three-parameter Weibull should be founded not on the data but on the underlying understanding of the physics of failure (see Meeker and Escobar, 1998, p. 270). A word of caution is in order: Kapur and Lamberson (1977, p. 292) and others present an alternate parameterization of the three-parameter Weibull having $\theta$ replaced by $\theta - \delta$. This results in the adjustment $\theta_{KL} = \theta + \delta$.
Negative $\delta$? In his WeiBath™ model Tarum (1996) allows for a negative value of $\delta$ to account for the possibility that part of a product's life might be consumed before $t = 0$. This allows for the scaling of a convex (curvature upward) Weibull fit back to linearity! But how can this be justified? In general, product life begins once the customer uses a product. In such instances, think what can happen to a product during the period prior to $t = 0$, from the time a product leaves the factory for the wholesale and retail distribution locations. During this period, $t < 0$, product damage due to misuse in handling conceivably could occur. Other aging phenomena can possibly set in even before the customer has begun to use the product. For example, in the auto industry one often hears the unattractive metaphor "lot rot" used to account for any aging of the vehicle's appearance and other subsystems while the vehicle is parked on dealer lots for up to 90 days or more. In such cases product aging or failure might occur before $t = 0$!
3.6 EXTREME VALUE (GUMBEL) DISTRIBUTION
Formally, the smallest extreme-value (EV) or Gumbel distribution is referred to as a type I extreme-value distribution of the minimum (see Bain, 1979, Ch. 6). It is the limiting form of an asymptotic (large) number of competing, identical, potential failure modes, with time to failure $T = \lim_{n\to\infty} \min\{t_1, t_2, \ldots, t_n\}$. As an example, Kapur and Lamberson (1977) cite the use of the Gumbel distribution to model the progression of a corrosive process that eventually results in a pinhole opening in an automotive exhaust pipe. In this case we assume that for all practical purposes an infinite number of sites on the exhaust-pipe surface are conducive for corrosive pit growth. At each of these sites, we assume that pit growth will eventually traverse the entire wall thickness, resulting in an eventual pinhole, which constitutes a failure. Each site is, in a sense, "competing in a race" to be the first to penetrate through the wall, at which time a failure is said to have occurred. Mathematically, we describe this distribution with the help of Figure 3-12. Assume that $G(t)$ is any (log-) location scale (see Appendix 3A) distribution with exponential tail; then $F(t) = \lim_{n\to\infty} 1 - (1 - G(t))^n$. This distribution is commonly used to model failure phenomena in a wide variety of applications in both electronic and mechanical systems. $F(t)$, the cumulative distribution of a Gumbel distribution with location parameter $\mu$ and scale parameter $\sigma$, is given by
$$F(t) = P(T \le t) = 1 - \exp\left[-\exp\left(\frac{t-\mu}{\sigma}\right)\right] \quad \text{for } -\infty \le t \le \infty \qquad (3.19)$$
FIGURE 3-12 The minimum EV distribution is the limiting form of an asymptotic large number of competing, identical, potential failure modes, where $F(t) = \lim_{n\to\infty} 1 - (1 - G(t))^n$.
The Weibull and Gumbel distributions have a very interesting relationship, in that $Y = \exp(T)$ is Weibull-distributed with $\theta = \exp(\mu)$ and $\beta = 1/\sigma$, as illustrated in Figure 3-13.
FIGURE 3-13 Equivalence between Gumbel and 2-parameter Weibull with $\theta = \exp(\mu)$ and $\beta = 1/\sigma$.
We can verify this relationship through the development of an expression for the distribution function of $Y = \exp(T)$:
$$
\begin{aligned}
F(y) &= P(Y \le y) = P(\exp(T) \le y) = P(T \le \ln y) \\
&= 1 - \exp\left[-\exp\left(\frac{\ln y - \mu}{\sigma}\right)\right] \\
&= 1 - \exp\left[-\exp\left(\ln\left(\frac{y}{\theta}\right)^{\beta}\right)\right] \quad \text{substituting } \ln\theta \text{ for } \mu \text{ and } \beta \text{ for } 1/\sigma \\
&= 1 - \exp\left[-\left(\frac{y}{\theta}\right)^{\beta}\right] \quad \text{Weibull } (\theta, \beta) \text{ distribution function}
\end{aligned}
\qquad (3.20)
$$
Thus, we have verified the following very remarkable result: If $T$ is distributed according to a Weibull $(\theta, \beta)$ distribution, then $\ln(T)$ is distributed according to a Gumbel $(\mu, \sigma)$ distribution with $\mu = \ln(\theta)$ and $\sigma = 1/\beta$. The EV distribution is often used as an alternative to the Weibull or lognormal distribution. If a standard Weibull or lognormal fit is not acceptable, then it might be desirable for the analyst to assess the adequacy of an EVD fit. Since the analyst is likely to be more familiar with Weibull analysis and to have good computer software to assist in this regard, it might be desirable for practitioners to apply the exponential transformation, $\exp(T)$, to all observations, thus allowing the use of Weibull analysis on the transformed values. However, one major risk should be considered before proceeding this way: The use of an exponential transformation might result in values that exceed the internal number representation of the computer. This is more likely to occur if the computer package does not make use of double-precision formats internally. In such cases the data might need to be rescaled by a constant to ensure that the largest exponential value is not too large.
Given the wide availability of Weibull analysis tools, specific estimation procedures of EVD parameters and the use of dedicated EVD probability plotting paper are not covered here. Consult other references such as Lewis (1996) for more specific information on dedicated procedures for assessing EV fits. The distribution properties of $T$ and $\exp(T)$ are compared in Table 3-8.
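The equivalence and the overflow risk described above can both be illustrated with a short sketch. The function names, the parameter values ($\mu = 5$, $\sigma = 0.5$), and the `shift` argument are assumptions for illustration. Subtracting a constant $c$ from $T$ before exponentiating is one way to rescale: it divides $\exp(T)$ by $e^{c}$, which scales $\theta$ by $e^{-c}$ and leaves $\beta$ unchanged.

```python
import math


def gumbel_min_cdf(t, mu, sigma):
    """Eq. (3.19): type I EV distribution of the minimum."""
    return 1.0 - math.exp(-math.exp((t - mu) / sigma))


def weibull_cdf(y, theta, beta):
    """Two-parameter Weibull failure distribution."""
    return 1.0 - math.exp(-((y / theta) ** beta))


def exp_transform(times, shift=0.0):
    """Return exp(t - shift) for each observation.

    A nonzero shift rescales every transformed value by exp(-shift),
    which keeps large observations inside floating-point range.
    """
    return [math.exp(t - shift) for t in times]


# Assumed Gumbel parameters and the matching Weibull parameters.
MU, SIGMA = 5.0, 0.5
THETA, BETA = math.exp(MU), 1.0 / SIGMA

for t in (3.0, 5.0, 6.0):
    print(t, gumbel_min_cdf(t, MU, SIGMA), weibull_cdf(math.exp(t), THETA, BETA))

# An observation this large overflows a double when exponentiated directly.
try:
    exp_transform([800.0])
except OverflowError:
    print("exp(800) overflows; shifting the data first avoids this:")
    print(exp_transform([800.0], shift=795.0))
```

After fitting a Weibull to the shifted values, the Gumbel location estimate is recovered as $\hat\mu = \ln\hat\theta + c$, with $\hat\sigma = 1/\hat\beta$ unaffected.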
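TABLE 3-8 Comparison of Distributional Properties of $T$ and $\exp(T)$
Property | $T$ (Type I EVD of minimum), for $-\infty \le t \le \infty$ | $\exp(T)$ (Weibull distr.), for $\exp(T) \ge 0$
Location parameter | $\mu = \ln(\theta)$ | $\theta = \exp(\mu)$
Scale parameter | $\sigma$ | $\beta = 1/\sigma$
Density fct. $f(t)$ | $\frac{1}{\sigma}\exp\left(\frac{t-\mu}{\sigma}\right)\exp\left[-\exp\left(\frac{t-\mu}{\sigma}\right)\right]$ | $\frac{\beta}{\theta}\left(\frac{t}{\theta}\right)^{\beta-1} e^{-(t/\theta)^{\beta}}$, $t \ge 0$
Cum. distr. $F(t)$ | $1 - \exp\left[-\exp\left(\frac{t-\mu}{\sigma}\right)\right]$ | $1 - e^{-(t/\theta)^{\beta}}$
Distr. mean | $\mu - 0.5772\,\sigma$ | $\theta\,\Gamma(1 + 1/\beta)$
Variance | $(\pi^2/6)\,\sigma^2$ | $\theta^2\{\Gamma(1 + 2/\beta) - \Gamma^2(1 + 1/\beta)\}$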
The minimum EVD has a sister distribution, the type I EV distribution of the maximum. Its distribution function is given by
$$F(t) = \exp\left[-\exp\left(-\frac{t-\mu}{\sigma}\right)\right] \quad \text{for } -\infty \le t \le \infty \qquad (3.21)$$
It results from finding the limiting distribution of $T = \lim_{n\to\infty} \max\{t_1, t_2, \ldots, t_n\}$, formed by $F(t) = \lim_{n\to\infty} G(t)^n$. The mirror relationship between the two is illustrated in Figure 3-14. The type I EVD of the maxima is used in civil and environmental applications such as modeling flood flows, maximal rainfall, wind gusts, etc. (see Kottegoda and Rosso, 1997). It is interesting to note that there are just three types of limiting distributions. The other two are denoted as type II and type III extreme-value distributions. The properties of the three types are summarized in Table 3-9. The second type, type II, is known as the Frechet distribution and is used extensively in modeling extreme environmental events. It is also interesting to note that the three-parameter Weibull distribution is a type III EVD of the minimum. With the exception of the type I EVD of the minimum, EVDs are not widely used in reliability applications in industry.
FIGURE 3-14 Mirror relationship between type I EV distributions of the maxima and minima.
TABLE 3-9 Asymptotic Extreme Value (EV) Distributions
1a. Gumbel or type I EV distribution of the minimum, with parent distribution having an exponential tail. Normal, lognormal, exponential, Weibull, Gumbel, and type II are possible parents. Note that the log of a Weibull variable follows a type I EVD of the minima.
    Distribution fct.: $F(t) = 1 - \exp\left[-\exp\left(\frac{t-\mu}{\sigma}\right)\right]$ for $-\infty \le t \le \infty$
1b. Type I EV distribution of the maximum.
    Distribution fct.: $F(t) = \exp\left[-\exp\left(-\frac{t-\mu}{\sigma}\right)\right]$ for $-\infty \le t \le \infty$
2. Frechet or type II EV distributions. Parent distributions include Pareto and Student's t. Their tails are thicker than the exponential forms. A log of a type II EV distribution of the maxima is a type I EV distribution of the maxima. Used in environmental applications.
    Maximum: $F(t) = \exp\left[-\left(\frac{t}{\delta}\right)^{-\beta}\right]$; $t > 0$; $\delta, \beta > 0$
    Minimum: $F(t) = 1 - \exp\left[-\left(\frac{-t}{\delta}\right)^{-\beta}\right]$; $t < 0$; $\delta, \beta > 0$
3. Type III EV distributions, having the uniform, the beta, or the type III distribution itself as a parent distribution. Such distributions have a finite upper limit (maximum) or lower limit (minimum). The 3-parameter Weibull distribution is a type III EV distribution of the minima.
    Minimum: $F(t) = 1 - \exp\left[-\left(\frac{t-\delta}{\theta-\delta}\right)^{\beta}\right]$; $t \ge \delta$; $\theta > \delta$; $\beta > 0$
    Maximum: $F(t) = \exp\left[-\left(\frac{\theta - t}{\theta-\delta}\right)^{\beta}\right]$; $t \le \theta$; $\delta < \theta$; $\beta > 0$
3.7 OTHER DISTRIBUTIONS USED IN RELIABILITY
We summarize several other distributions that are used in reliability but do not enjoy the widespread popularity of the normal, lognormal, Weibull, or EVD distribution.
3.7.1 Logistic and Log-Logistic Distributions
The logistic distribution is a two-parameter location-scale distribution with distribution function
$$F(t) = \frac{\exp(z)}{1 + \exp(z)} \quad \text{where } z = \frac{t - \theta}{\sigma} \qquad (3.22)$$
with location parameter $-\infty < \theta < \infty$ and scale parameter $\sigma > 0$. The shape of the logistic distribution is very similar to that of the normal, with "slightly longer tails" (see Meeker and Escobar, 1998). Its standard pdf is given by $f(z) = \exp(z)/(1 + \exp(z))^2$. The logistic distribution is essentially indistinguishable from the normal distribution with small sample sizes. It is widely used in quality control for hazard modeling of percent in conformance, and in survey modeling of percent favorable/satisfied. There is also a companion log-logistic distribution, whose logged values follow a logistic distribution.
3.7.2 Gamma and Log-Gamma Distributions
The gamma distribution's probability density function is given by
$$f(t) = \frac{\lambda(\lambda t)^{k-1} e^{-\lambda t}}{\Gamma(k)} \quad \text{for } t > 0 \qquad (3.23)$$
The gamma distribution has a flexible distributional form, which is used to model phenomena consisting of a sum of exponential random variables. It is widely used in stochastic modeling. In addition, Lawless (1982, pp. 21–22) surveys a log-gamma companion distributional form.
3.7.3 Miscellaneous Other Noteworthy Distributions
We also mention the following distributions that have dedicated uses in the field of reliability:
1. Birnbaum and Saunders (1969) introduce a distribution based on normal properties for modeling the number of cycles necessary to force a fatigue crack to propagate to a critical size that would result in failure (a fracture). $F(t; \theta, \beta) = \Phi(z)$, where
$$z = \frac{1}{\beta}\left(\sqrt{\frac{t}{\theta}} - \sqrt{\frac{\theta}{t}}\right)$$
2. The Gompertz–Makeham distribution is discussed by Meeker and Escobar (1998, p. 108). Its hazard function is very similar to the minimum EVD except for a constant. It is reported to be used to "model human life in middle age and beyond!"
3. The inverse Gaussian has been introduced to model situations where early failures dominate the lifetime distribution (see Martz and Waller, 1982, p. 93). Its hazard rate is similar to that of the lognormal distribution.
4. The inverted gamma distribution is used as a prior distribution in Bayesian reliability for modeling exponential distributional outcomes (see Martz and Waller, 1982, p. 101).
5. Mixture distributions are a weighting of several distributions. They are useful for modeling competing failure-mode phenomena: $F_M(t) = P_1(t) + P_2(t)$
WinSmith™ and Y-Bath™ software provide the capability for modeling Weibull mixtures.
3.8 MIXTURES AND COMPETING FAILURE MODELS
Often our goodness-of-fit procedures will break down in the presence of mixtures and competing failure modes. In such cases the Weibull and normal plots might reveal a "dogleg" or some other systematic inconsistency in the linear fit. This can be due to the lack of homogeneity of our sample. How can this happen? This can occur if the data is a mixture or there are competing failure modes. Their origin is explained:
1. Mixtures: A mixture consists of observations taken from two or more populations. Such a phenomenon is quite common when, for example, warranty data is collected from multiple geographic regions. We know that due to temperature and humidity differences, the hazard rates may vary considerably by geographic location and/or season. The overall failure distribution is a linear combination of the individual failure distributions. For a bi-mixture, let $p$ denote the proportion of data from a given population:
$$F_{overall} = p\,F_1 + (1 - p)\,F_2 \qquad (3.24)$$
Equation (3.24) can be generalized for any $k$ number of subpopulations in a mixture:
$$F_{overall} = \sum_{i=1}^{k} p_i F_i \quad \text{with } \sum_{i=1}^{k} p_i = 1 \qquad (3.25)$$
2. Competing failure modes: Competing risks occur when a population has two or more failure modes and the entire population is at risk from either failure mode. The phenomenon is usually observed when the competing risks are associated with very different hazard rates; for instance, one failure mode is wearout and the other is random or associated with infant mortalities. For example, a semiconductor device can fail due to a random overstress voltage condition or gradually lose its performance over time due to aging phenomena. Many electromechanical or hydromechanical components will exhibit both infant mortalities, due to quality-control problems, and wearout later on. The overall reliability for $k$ competing failure modes is given by
$$R_{overall} = \prod_{i=1}^{k} R_i \qquad (3.26)$$
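The difference between Eqs. (3.25) and (3.26) is easiest to see side by side. The sketch below assumes Weibull subpopulations and Weibull failure modes; the function names and the tuple layouts are illustrative assumptions, and the numeric values in the usage lines are hypothetical.

```python
import math


def weibull_cdf(t, theta, beta):
    """Two-parameter Weibull failure distribution."""
    return 1.0 - math.exp(-((t / theta) ** beta))


def mixture_cdf(t, components):
    """Eq. (3.25): weighted sum of subpopulation failure distributions.

    components is a list of (p_i, theta_i, beta_i) with the p_i summing to 1.
    """
    total_p = sum(p for p, _, _ in components)
    if not math.isclose(total_p, 1.0):
        raise ValueError("mixture proportions must sum to 1")
    return sum(p * weibull_cdf(t, theta, beta) for p, theta, beta in components)


def competing_reliability(t, modes):
    """Eq. (3.26): product of the reliabilities of the competing modes.

    modes is a list of (theta_i, beta_i); every unit is exposed to every mode.
    """
    reliability = 1.0
    for theta, beta in modes:
        reliability *= 1.0 - weibull_cdf(t, theta, beta)
    return reliability


# Hypothetical bi-mixture: 30% of units come from a weaker subpopulation.
print(mixture_cdf(500.0, [(0.3, 400.0, 1.5), (0.7, 2000.0, 1.5)]))
# Hypothetical unit exposed to both an infant-mortality and a wearout mode.
print(competing_reliability(500.0, [(5000.0, 0.5), (1000.0, 4.0)]))
```

In a mixture each unit belongs to exactly one subpopulation, so failure probabilities are averaged; under competing risks each unit must survive every mode, so reliabilities are multiplied.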
Tarum (1996) shows how Eqs. (3.25) and (3.26) can be adapted for use when the underlying distribution is a Weibull. In fact, he has developed YBath™ software for modeling the reliability bathtub phenomenon. For a homogeneous dataset having three types of competing failure modes, the overall survival function is given by
$$R_{overall}(t) = \exp\left[-\left(\frac{t}{\theta_1}\right)^{\beta_1} - \left(\frac{t}{\theta_2}\right)^{\beta_2} - \left(\frac{t}{\theta_3}\right)^{\beta_3}\right] \qquad (3.27)$$
The hazard rate will resemble that of a bathtub curve if we choose $\beta_1 < 1$ (wearin), $\beta_2 = 1.0$ (random), and $\beta_3 > 1$ (wearout). He also allows the combination of several competing risk subsystems as a competing risk mixture with the use of Eq. (3.24). The most complex model in YBath software is a seven-parameter model incorporating a bi-mixture of competing risks, one of the form given by Eq. (3.27) with three competing risks, and another incorporating just two risks. The user should consult the reference by Tarum (1996) for more details on the underlying construction of these models. Data sets taken from two populations, or with two competing failure modes, will often have the characteristic dogleg appearance as illustrated in Figure 3-15 for competing life data on mainframe Winchester disk drives.
Example 3-3: From Raheja (1991, p. 80) and cited by Abernethy (1996, p. 3-15)
7 12 49 140 235 260 320 380 388 437 472 493 524 529 592 600
As shown in the Weibull plot of Figure 3-15, the data appears to have a convex shape (curvature up), presumably due to competing failure modes. The Weibull fit is shown along with a local fit (dashed lines) along both tail regions. We report the Weibull Smith™ local estimates of the competing failure modes with $\beta_1 = 0.50$ (infant mortalities) and $\beta_2 = 4.72$ (wearout). For mixtures of many competing failure modes or subpopulations, the doglegs will disappear and $\beta$ will tend to 1, the exponential case (see Abernethy, 1996, p. 3-15). It is usually best to separate mixtures of failure modes and subpopulations and analyze the groups separately, rather than together, if possible.
FIGURE 3-15 Weibull plot of Raheja (1991) data along with BiWeibull Smith™ fits on left and right tail.
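For independent competing Weibull modes the cumulative hazards add, so the overall hazard implied by Eq. (3.27) is the sum of the individual Weibull hazards. The sketch below shows how a bathtub shape emerges. It is not the YBath implementation, and the three $(\theta, \beta)$ pairs are hypothetical values chosen only to make the shape visible.

```python
def competing_weibull_hazard(t, modes):
    """Overall hazard for competing Weibull modes: the sum of the mode hazards.

    modes is a list of (theta_i, beta_i) pairs; t must be positive.
    """
    return sum((beta / theta) * (t / theta) ** (beta - 1.0) for theta, beta in modes)


# Hypothetical modes: wear-in (beta < 1), random (beta = 1), wearout (beta > 1).
MODES = [(5000.0, 0.5), (2000.0, 1.0), (1000.0, 4.0)]

for t in (1.0, 50.0, 300.0, 1000.0, 2000.0):
    print(f"t = {t:7.1f}  hazard = {competing_weibull_hazard(t, MODES):.6f}")
```

With these values the hazard falls during early life, flattens in mid-life where the random mode dominates, and then rises as the wearout term takes over.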
3.9 EXERCISES
1. Fill in the following table, placing check marks where $\lambda(t)$ behavior is possible for the listed distribution:
Distribution | $\lambda(t)$ decreasing | $\lambda(t)$ constant | $\lambda(t)$ increasing
Normal | | |
Lognormal | | |
Exponential | | |
Weibull | | |
2. Exponential distribution
a. List two reasons why the exponential distribution is often ill suited for modeling the reliability of electronics products.
b. Why is the reliability of personal computer products often modeled adequately using an exponential time-to-failure distribution?
c. Why is the hazard plot a useful graphical procedure for estimating the exponential hazard rate, $\lambda$?
3. A Weibull distribution is fitted with $\hat\theta = 3500$ cycles and $\hat\beta = 1.13$. Estimate the mean and variance. Extra credit: Estimate skewness and Pearson kurtosis (see Appendix 3B).
4. Is the Weibull distribution itself an extreme-value distribution? Explain.
5. Why is it not necessary to introduce special probability plotting procedures for the lognormal, exponential, and EV distributions?
6. The Weibull distribution is approximately normally distributed when $\beta = 3.5$. Can any normal distribution be fit to a Weibull distribution? (The answer is no! Please explain!)
7. Determine the population mean, median, and variance of a Weibull (1000, 1.2) distributed characteristic.
8. Why is the EV distribution likely to be used in the civil engineering profession?
APPENDIX 3A BACKGROUND
APPENDIX 3A.1 LOCATION-SCALE DISTRIBUTION FAMILY
Many of the widely used statistical distributions in reliability belong to the location-scale distribution family. A location-scale distribution family is one in which the cumulative distribution function (cdf) of a random variable, $T$, the time-to-failure, or probability of failure, is given by
$$F(t) = P(T \le t) = G\left(\frac{t - \theta}{\sigma}\right) \quad \text{for } -\infty < \theta < \infty;\ \sigma > 0;\ t > 0 \qquad (3A.1)$$
where
$\theta$ = Location parameter.
$\sigma$ = Scale parameter.
$G(\cdot)$ = Standard (known) cdf.
3A1.1 Log-Location Scale Distribution
A random variable $T$ belongs to a log-location scale distribution family if $Y = \log(T)$ is a member of a location-scale distribution family. That is,
$$F(t) = P(T \le t) = G\left(\frac{y - \theta}{\sigma}\right) \quad \text{where } y = \ln(t), \quad \text{for } -\infty < \theta < \infty;\ \sigma > 0;\ t > 0 \qquad (3A.2)$$
3A1.2 Threshold Log-Location-Scale Distributions
The three-parameter Weibull is an example of a distribution that can be generalized as a log-location scale distribution with a threshold parameter:
$$F(t) = P(T \le t) = G\left(\frac{\ln(t - \delta) - \theta}{\sigma}\right) \quad \text{for } -\infty < \theta < \infty;\ \sigma > 0;\ t > \delta \qquad (3A.3)$$
3A1.3 Location-Scale Distributions
Table 3-10 lists popular distributions used in reliability that can be described using a location-scale distribution form. The function $G(\cdot)$ references any of the generalized location-scale distribution forms presented in Eqs. (3A.1)–(3A.3). The generalized parameters $\theta$, $\sigma$, and $\delta$ head the columns. The rows list many of the popular distributions we have surveyed. For each of the column headers, we list how the distributional parameters for each of the listed distributions correspond with the generalized parameters, $\theta$, $\sigma$, and $\delta$.
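TABLE 3-10 Popular Distributions that Are Used in Reliability in Their Location-Scale Distributional Forms
Distribution | Type | $G(\cdot)$ | $\theta$ | $\sigma$ | $\delta$
Normal $(\mu, \sigma)$ | $y = t$, location | $\Phi\left(\frac{y-\theta}{\sigma}\right)$, cum. standard normal | $\mu$ | $\sigma$ | n/a
Lognormal $(t_{med}, \sigma)$ | $y = \ln(t)$, log-location | $\Phi\left(\frac{y-\theta}{\sigma}\right)$, cum. standard normal | $\ln t_{med}$ | $\sigma$ | n/a
Exponential $(\lambda)$ | $y = t$, location | $1 - \exp\left(-\frac{y-\theta}{\sigma}\right)$ | 0 | $1/\lambda$ | n/a
Weibull $(\beta, \theta)$ | $y = \ln(t)$, log-location | $1 - \exp\left[-\exp\left(\frac{y-\theta}{\sigma}\right)\right]$ | $\ln\theta$ | $1/\beta$ | n/a
Weibull $(\beta, \theta, \delta)$ | $y = \ln(t - \delta)$, threshold log-location | $1 - \exp\left[-\exp\left(\frac{y-\theta}{\sigma}\right)\right]$ | $\ln\theta$ | $1/\beta$ | $\delta$
Gumbel $(\mu, \sigma)$ (smallest EV) | $y = t$, location | $1 - \exp\left[-\exp\left(\frac{y-\theta}{\sigma}\right)\right]$ | $\mu$ | $\sigma$ | n/a
Largest EV | $y = t$, location | $\exp\left[-\exp\left(-\frac{y-\theta}{\sigma}\right)\right]$ | $\mu$ | $\sigma$ | n/a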
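The practical appeal of the location-scale form is that one evaluation routine serves every row of Table 3-10. The sketch below illustrates that idea under assumed names: the dictionary keys and the `location_scale_cdf` signature are hypothetical and do not correspond to any package's API.

```python
import math
from statistics import NormalDist

# Standard (parameter-free) cdfs G(z) that appear in Table 3-10.
STANDARD_CDFS = {
    "normal": NormalDist().cdf,
    "smallest_ev": lambda z: 1.0 - math.exp(-math.exp(z)),
    "largest_ev": lambda z: math.exp(-math.exp(-z)),
}


def location_scale_cdf(t, standard_cdf, location, scale, log=False, threshold=0.0):
    """Evaluate F(t) = G((y - location)/scale), per Eqs. (3A.1)-(3A.3).

    y = t for a location-scale family; y = ln(t - threshold) for a
    (threshold) log-location-scale family.
    """
    y = math.log(t - threshold) if log else t
    return standard_cdf((y - location) / scale)


# Weibull(theta = 1000, beta = 2) expressed in log-location-scale form.
weibull_form = location_scale_cdf(
    500.0, STANDARD_CDFS["smallest_ev"], math.log(1000.0), 1.0 / 2.0, log=True
)
print(weibull_form, 1.0 - math.exp(-((500.0 / 1000.0) ** 2.0)))
```

The two printed values agree, which is the Weibull row of Table 3-10 in action: location $\ln\theta$ and scale $1/\beta$ applied to the smallest-EV standard cdf.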
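APPENDIX 3B WEIBULL POPULATION MOMENTS
We begin with the definition for the $k$th population moment about the origin for the Weibull distribution (Kececioglu, 1991, vol. I, pp. 271–279, and Ebeling, 1997, pp. 77–78).
$$E(t^k) = \int_0^{\infty} t^k f(t)\,dt = \int_0^{\infty} t^k\,\frac{\beta}{\theta}\left(\frac{t}{\theta}\right)^{\beta-1} \exp\left[-\left(\frac{t}{\theta}\right)^{\beta}\right] dt$$
Substituting $y = (t/\theta)^{\beta}$ and using $dy = (\beta/\theta)(t/\theta)^{\beta-1}\,dt$, then
$$E(t^k) = \int_0^{\infty} t^k e^{-y}\,dy$$
Noting that $t^k = \theta^k y^{k/\beta}$,
$$E(t^k) = \theta^k \int_0^{\infty} y^{k/\beta} e^{-y}\,dy = \theta^k\,\Gamma\left(\frac{k}{\beta} + 1\right) \qquad (3B.1)$$
Equation (3B.1) can then be used to express population moment properties of the Weibull distribution:
$$
\begin{aligned}
\mu &= E(t) = \theta\,\Gamma\left(\frac{1}{\beta} + 1\right) \\
\sigma^2 &= E(t^2) - \mu^2 = \theta^2\left[\Gamma\left(\frac{2}{\beta} + 1\right) - \Gamma^2\left(\frac{1}{\beta} + 1\right)\right] \\
\mu_3 &= E(t - \mu)^3 = E(t^3) - 3\mu E(t^2) + 2\mu^3 \\
&= \theta^3\left[\Gamma\left(\frac{3}{\beta} + 1\right) - 3\,\Gamma\left(\frac{1}{\beta} + 1\right)\Gamma\left(\frac{2}{\beta} + 1\right) + 2\,\Gamma^3\left(\frac{1}{\beta} + 1\right)\right] \\
\mu_4 &= E(t - \mu)^4 = E(t^4) - 4\mu E(t^3) + 6\mu^2 E(t^2) - 3\mu^4 \\
&= \theta^4\left[\Gamma\left(\frac{4}{\beta} + 1\right) - 4\,\Gamma\left(\frac{1}{\beta} + 1\right)\Gamma\left(\frac{3}{\beta} + 1\right) + 6\,\Gamma^2\left(\frac{1}{\beta} + 1\right)\Gamma\left(\frac{2}{\beta} + 1\right) - 3\,\Gamma^4\left(\frac{1}{\beta} + 1\right)\right]
\end{aligned}
\qquad (3B.2)
$$
Expressions for the dimensionless quantities $\gamma_3 = E(t - \mu)^3/\sigma^3$ (coefficient of skewness) and $\gamma_4 = E(t - \mu)^4/\sigma^4$ (coefficient of kurtosis) can then be obtained with the use of the relationships formulated in Eq. (3B.2).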
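Equations (3B.1) and (3B.2) translate directly into a few lines of code. The sketch below is one possible arrangement; the function names are assumptions, and the example parameters are those of Exercise 3 ($\hat\theta = 3500$, $\hat\beta = 1.13$).

```python
import math


def weibull_raw_moment(k, theta, beta):
    """Eq. (3B.1): E(t^k) = theta^k * Gamma(k/beta + 1)."""
    return theta ** k * math.gamma(k / beta + 1.0)


def weibull_moment_summary(theta, beta):
    """Return (mean, variance, skewness, Pearson kurtosis) via Eq. (3B.2)."""
    m1, m2, m3, m4 = (weibull_raw_moment(k, theta, beta) for k in (1, 2, 3, 4))
    variance = m2 - m1 ** 2
    mu3 = m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3
    mu4 = m4 - 4.0 * m1 * m3 + 6.0 * m1 ** 2 * m2 - 3.0 * m1 ** 4
    skewness = mu3 / variance ** 1.5
    kurtosis = mu4 / variance ** 2
    return m1, variance, skewness, kurtosis


mean, variance, skewness, kurtosis = weibull_moment_summary(3500.0, 1.13)
print("mean    :", mean)
print("variance:", variance)
print("skewness:", skewness)
print("kurtosis:", kurtosis)
```

A convenient sanity check is $\beta = 1$, the exponential case, for which the skewness is 2 and the Pearson kurtosis is 9 regardless of $\theta$.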
4 Overview of Estimation Techniques
Reliability tests are often costly. Whether or not a concept or design prototype item survives a test, its time on test is a useful information measure that should not be discarded. But how does one make effective use of this information? In this chapter we show how this information can be used to derive parametric point and interval estimates of reliability measures.
4.1 INTRODUCTION
The first step in the analysis of life data is the selection of a potential distribution from the set surveyed in Chapter 3. In order to fit a distribution, the parameters of the distribution must be estimated. Once this is established, other informative reliability metrics such as percentiles of the survival or failure distribution can be estimated. In this chapter we survey popular approaches for the development of point estimates of parameters and other reliability metrics and then introduce methods for the development of the more powerful confidence interval estimates of the same. We focus our discussion on the application of these estimation techniques for the more popular Weibull, exponential, and normal distributions. The methods work for any distribution, however. Per the advice given in Chapter 3, the analyst is encouraged to apply a simple log transformation of his or her data set if he or she wishes to conduct a lognormal distribution fit. Extreme-value distribution fits can be tested using a Weibull fit according to the recommendations of Chapter 3. Alternatively, reliability analysts might want to utilize popular software such as Minitab or ReliaSoft Weibull++, which have built-in capabilities for directly estimating the properties of a whole collection of distributions used in reliability.
In this chapter the exponential distribution is treated as a special case because the development of point and interval estimates of exponential properties is straightforward, as the exact sampling distribution of the exponential MTTF or hazard-rate parameter is well known and related to the chi-square sampling distribution. This is not the case for Weibull or incomplete normal data sets, wherein approximate methods must be used for obtaining confidence interval estimates. This chapter surveys four popular approaches for distribution fitting and parameter estimation of Weibull and normal distribution metrics:
1. Graphical approaches: Probability plotting is a method widely used by reliability practitioners. Rank estimators of $F(t)$ are plotted on special graph paper, constructed so that a good fit will be evidenced by data points that closely hug a straight-line fit to the data. Parameter estimates may then be determined graphically.
2. Rank regression: The raw data used to construct a probability plot is analyzed by applying simple, linear regression techniques to transformed rank statistics and raw observations. Parameter estimates, t-tests for significance, confidence intervals, etc. are developed based on information provided from analysis of variance and the least-squares fit to the data.
3. Maximum likelihood estimation: Maximum likelihood (MLE) methods are based on formal theory and so are appealing to statisticians. The estimates tend to be biased for small samples but are asymptotically correct (consistent). In this chapter we make use of statistical computing packages such as Minitab and ReliaSoft to generate ML properties.
4. Monte Carlo simulation: Monte Carlo (MC) methods constitute a simulation strategy wherein we resample from a distribution with known parameters. Given the availability of powerful, yet inexpensive computing resources in desktop computers, such methods are increasing in popularity. We demonstrate the use of WinSmith software for MC simulation.
Other estimation methods including the use of linear estimation methods and method of moments (and hybrid methods) are mentioned briefly.
4.2 RANK REGRESSION AND PROBABILITY PLOTTING TECHNIQUES
Probability plotting techniques have enjoyed immense popularity due to the ease with which they can be generated manually or with computer-based methods. The analyst can readily assess the goodness-of-fit by examining how well the data can be fitted with a "straight line." In this section we survey the use of probability plotting techniques along with the use of more formal rank regression methods for estimation.
4.2.1 Normal Probability Plotting Techniques
Definition. The normal probability plot is a plot of the (Zscore) order statistic, $Z_i = \Phi^{-1}(\hat F(t_i))$, versus a usage characteristic, $t_i$, the ordered times, on Cartesian paper, with the y-axis relabeled in cumulative probabilities, $\Phi(Z_i)$.
1. The recorded observations, $t_i$, $i = 1, 2, \ldots, n$, represent the ordered raw data in units of time or usage, with $t_1 \le t_2 \le \cdots \le t_i \le t_{i+1} \le \cdots \le t_n$. The index $i$, the order of occurrence or rank of the ordered failure times, must be adjusted if some of the observations are censored readings. Techniques for making this adjustment were presented in Chapter 2.
2. $\hat F(t_i)$ is an order statistic, such as $\hat F(t_i) = (i - 3/8)/(n + 1/4)$, the default rank estimator used internally by Minitab software in the construction of normal probability plots. The index $i$ is the rank or adjusted rank if the data set contains censored observations.
3. $Z_i = \Phi^{-1}(\hat F(t_i))$ converts the order statistic, $\hat F(t_i)$, to standard normal statistics, or Zscores, with the application of a standard normal inverse operator. By the property of the normal distribution, 99.73% of Zscore statistics will fall in the range of $-3$ to $+3$. For small sample sets, 100% of all order statistics will be in this range.
The scatter of the points on the normal plot should appear to be linear if $t$ is normally distributed. If the fit is deemed adequate, then it may be used to obtain estimates of $\mu$ and $\sigma^2$. An eyeball fit of the plotted points is usually sufficient for assessing the adequacy of the normal fit. If not, rank regression methods can be used to assess the fit and arrive at formal estimates of the parameters of the normal distribution. Details for construction of a rank regression model are discussed next.
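The three steps in the definition above can be sketched as follows for a complete (uncensored) sample. The function names are assumptions; the data are the sixteen observations of Example 3-3, used here only to have numbers to plot, and no adjustment for censoring is attempted.

```python
from statistics import NormalDist


def rank_estimator(i, n):
    """Rank estimator F_hat(t_i) = (i - 3/8) / (n + 1/4)."""
    return (i - 0.375) / (n + 0.25)


def normal_plot_points(times):
    """Return (t_i, Z_i) pairs for a normal probability plot.

    Assumes a complete sample: the rank i is simply the position of
    each observation in the sorted data.
    """
    ordered = sorted(times)
    n = len(ordered)
    standard_normal = NormalDist()
    return [
        (t, standard_normal.inv_cdf(rank_estimator(i, n)))
        for i, t in enumerate(ordered, start=1)
    ]


EXAMPLE_3_3 = [7, 12, 49, 140, 235, 260, 320, 380,
               388, 437, 472, 493, 524, 529, 592, 600]

for t, z in normal_plot_points(EXAMPLE_3_3):
    print(f"t = {t:4d}   Zscore = {z:+.3f}")
```

Plotting the Zscore column against the ordered times on ordinary Cartesian axes gives the same picture as plotting $\hat F(t_i)$ on normal probability paper.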
Rank Regression Model We begin with an expression for the cumulative distribution function of a normally distributed characteristic and equate it to a rank estimator of Fðti Þ such as F^ ðti Þ ¼ ði 38Þ=ðn þ 14Þ. Formally, ! i 38 ^F ðti Þ ¼ F ti m^ ¼ ð4:1Þ s^ n þ 14 |fflfflfflfflffl{zfflfflfflfflffl} |fflfflfflffl{zfflfflfflffl} std: normal cdf
rank estimator
The standard normal inverse operator, F1 , is then applied on all of the terms that appear in Eq. (4.1). The resultant relationship is linear between the Zscore statistics and ordered failure times, ti , as follows: ! 3 1 m^ 1 i 8 ZScore F ti ð4:2Þ ¼ þ 1 |fflfflfflffl{zfflfflfflffl}i s^ |{z} s^ nþ4 |{z} |{z} y
intercept
slope
x
Thus, we can fit the following regression model m 1 ZScore ¼ þ t i þ ei |fflfflfflffl{zfflfflfflffl}i s s |{z} |{z} |{z} y x intercept
ð4:3Þ
slope
where $\varepsilon_i$ is an $N(0, 1)$ random variable.

Using Normal Probability Plotting Paper

When a computer is not nearby, the normal plotting paper (Ref 4-1 of Appendix 4B) can be used to visually assess a normal fit. The y-axis is automatically scaled in standard normal ($z$) units; as such, the ordered pair $(t_i, \hat{F}(t_i))$ can be plotted directly on the paper.

Graphical Estimation

The best straight-line fit can be made visually. The mean, $\mu$, can be easily estimated by identifying the x-axis value corresponding to the 50th percentile. This point maps to either Zscore $= 0.0$ in the Cartesian plot representation or to $\Phi(\text{Zscore}) = 0.50$ in the normal probability paper representation. Specifically,

$$\hat{\mu} = t_{0.50} \tag{4.4}$$
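The standard deviation, $\sigma$, can be estimated from the slope of the fitted relationship. However, a simpler method is to take the difference between the 84th percentile of the fit, which corresponds to $Z = 1$, and the 50th percentile (the mean, $\mu$), which corresponds to $Z = 0$. In the survival-percentile notation $t_R$, the 84th percentile of the failure distribution is written $t_{0.16}$. That is,

$$\frac{t_{0.16} - \hat{\mu}}{\hat{\sigma}} = 1.0 = Z_{0.16} \;\Rightarrow\; t_{0.16} = \hat{\mu} + \hat{\sigma} \approx t_{0.50} + \hat{\sigma} \;\Rightarrow\; \hat{\sigma} = t_{0.16} - t_{0.50} \tag{4.5}$$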
The use of Eqs. (4.4) and (4.5) is illustrated in the worked-out example that follows (Example 4-1).
Rank Regression

Rather than rely on the analyst's subjectivity, it is often desirable to develop formal least-squares regression estimates of the parameters of the model expressed by Eq. (4.2). Abernethy (1996, 1998) contends that it is preferable to run an inverse regression; that is, a regression with time $t$ (the $x$ in Eq. (4.2)) as the dependent variable and Zscore (the $y$ in Eq. (4.2)) as the independent variable. The reader is likely to feel a bit uncomfortable running an inverse regression, since we usually think of $y$, our dependent variable, as the order statistic, $\hat{F}(t_i)$, which appears on the y-axis of probability plots. Most reliability software is set up this way. However, the arguments for the use of an inverse regression model are compelling, and they follow.

In an inverse regression run, the Zscore values are order statistics. As such, their values are predetermined by sample size, with the introduction of a potential random component if random censoring is present. The failure times, however, are totally random. Thus, it makes more sense to use Zscore as the independent (regressor) variable and the ordered failure times as the dependent variable. In this case the squared deviations are minimized in the x-direction (time) of the probability plot instead of the more familiar y-direction. Through the use of simulation techniques, Abernethy (1998) argues that there will be less bias in the regression fit if an inverse regression model is run, and that inverse rank regression parameter estimates are more consistent. Abernethy (1996) uses a simulation approach on 1000 samples from a given Weibull distribution with $\theta = 1000$ and $\beta = 3.0$ to demonstrate the benefits of using inverse rank regression methods.

The inverse regression of $t$ upon the Zscore order statistic will be of the form

$$\underbrace{t_i}_{\text{new } y} = \underbrace{\mu}_{\text{intercept}} + \underbrace{\sigma}_{\text{slope}}\,\underbrace{\text{Zscore}_i}_{\text{new } x} + \varepsilon_i \tag{4.6}$$
Setting Up Rank Regression Models

Accordingly, we suggest the following regression procedure to estimate $\mu$ and $\sigma$:

1. Create two columns of information:
   a. Ordered failures, $t_i$
   b. $\hat{F}(t_i)$, using adjusted ranks if the data set contains censored observations
   Only ordered failures and adjusted ranks are analyzed.
2. Create a column of Zscores, where Zscore $= \Phi^{-1}(\hat{F}(t_i))$. Note that in Excel this is accomplished using the function NORMSINV($\hat{F}(t_i)$).
3. Run a simple linear regression with Zscore as the independent variable ($x$) and the ordered failure times as the dependent variable ($y$).
4. $\hat{\mu}$ = intercept in the inverse regression model.
5. $\hat{\sigma}$ = slope in the inverse regression model.
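The five steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the Minitab or Excel workflow used in the text: it uses the bearing-life failure times of Table 4-1, the helper name `least_squares` is hypothetical, and because all four suspensions occur after the last failure, the adjusted ranks are assumed to equal the ordinary ranks.

```python
from statistics import NormalDist

def least_squares(x, y):
    """Return (intercept, slope) of the ordinary least-squares line y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Step 1: ordered failure times from Table 4-1 (hundred-hr); n = 15 units on test,
# the remaining 4 units were suspended after the last failure.
failures = [70.1, 72.0, 75.9, 76.2, 82.0, 84.3, 86.3, 87.6, 88.3, 89.1, 89.4]
n = 15
p = [(i - 0.375) / (n + 0.25) for i in range(1, len(failures) + 1)]

# Step 2: Zscore = inverse standard normal cdf of the rank estimator.
z = [NormalDist().inv_cdf(pi) for pi in p]

# Steps 3-5: inverse regression, with time as the dependent variable.
mu_hat, sigma_hat = least_squares(z, failures)
print(f"mu_hat = {mu_hat:.2f} hundred-hr, sigma_hat = {sigma_hat:.2f} hundred-hr")
```

Swapping the arguments of `least_squares` would give the conventional (non-inverse) regression of Eq. (4.3), from which $\hat{\sigma}$ is the reciprocal of the slope.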
Example 4-1: Right-censored data set

Data is collected on the wearout of $n = 15$ ball bearings in hundred-hr. The test is stopped at 90 hundred-hr, and only 11 failures are recorded. The following information is summarized in Table 4-1:

a. Ranks
b. Ordered bearing life times (BLife)
c. Rank estimator, $p_i = (i - 3/8)/(n + 1/4)$
d. Zscore $= \Phi^{-1}(p_i)$

TABLE 4-1 Bearing Life Data (in hundred-hr)

 i    BLife    (i - 3/8)/(n + 1/4)    ZScore
 1    70.1     0.041                  -1.74
 2    72.0     0.107                  -1.24
 3    75.9     0.172                  -0.94
 4    76.2     0.238                  -0.71
 5    82.0     0.303                  -0.51
 6    84.3     0.369                  -0.33
 7    86.3     0.434                  -0.16
 8    87.6     0.500                   0.00
 9    88.3     0.566                   0.16
10    89.1     0.631                   0.33
11    89.4     0.697                   0.51
Analysis. A normal probability plot will be constructed using two different procedures:

1. The Minitab statistical computing package was used to automatically output a normal probability plot of the data with superimposed 95% confidence bands. The output is displayed in Figure 4-1. All points reside within the 95% confidence band, but we do observe points at each tail that appear to deviate significantly from the straight line. As such, the adequacy of the normal fit must be in doubt. A goodness-of-fit test might help to resolve this issue further. Goodness-of-fit tests are surveyed in Chapter 5. The parameters $\mu$ and $\sigma$ are estimated as follows:

   $$\hat{\mu} = t_{0.50} \approx 88 \text{ hundred-hr}$$

   $$\hat{\sigma} = t_{0.16} - t_{0.50} \approx 101 - 88 = 13 \text{ hundred-hr}$$

2. A plot of the Zscore statistic versus bearing life with superimposed "best straight-line fit" is presented in Figure 4-2. We construct an inverse regression model to estimate $\mu$ and $\sigma$. The fitted relationship is expressed as

   $$\text{BLife} = 86.17 + 10.08\,\text{ZScore}$$

   Based on Eq. (4.6), the parameters are estimated as

   $$\hat{\mu} = 86.17 \text{ hundred-hr}, \qquad \hat{\sigma} = 10.08 \text{ hundred-hr}$$

FIGURE 4-1 Normal probability plot of bearing life data with 95% confidence bands (test stopped at 90 hundred-hr) and annotated graphical estimates of $\mu$ and $\sigma$.

FIGURE 4-2 Bearing Life (hundred-hr) vs. Zscore with superimposed least squares fit from Minitab® Stat > Regression > Fitted Line Plot.

4.2.2 Weibull Probability Plotting Techniques
We begin with the Weibull survival function,

$$R(t) = e^{-(t/\theta)^{\beta}}$$

and take the natural log of both sides, which yields

$$\ln R(t) = -\left(\frac{t}{\theta}\right)^{\beta}$$

By changing signs and taking the natural log of both sides again, we have the linear relation

$$\underbrace{\ln \ln\!\left(\frac{1}{1 - \hat{F}(t)}\right)}_{y} = \underbrace{\beta}_{\text{slope}}\,\underbrace{\ln t}_{x} \; \underbrace{-\,\beta \ln \theta}_{\text{intercept}} \tag{4.7}$$
Thus, a plot of the rank statistic, the double log of $1/(1 - \hat{F}(t_i))$, ought to vary linearly with $\ln t$ when failure times follow a Weibull distribution. Weibull plotting paper is based on this relationship. It is constructed by scaling the x-axis in natural log units while scaling the y-axis in units of the double log of the inverse of $1 - \hat{F}(t)$. It becomes the familiar Weibull plot once the y-axis is relabeled in cumulative probabilities of $F$. Sample Weibull paper is presented in Ref 4-2 in Appendix 4B. A graphical "protractor" for estimating $\beta$ is annotated on the paper (Ford, 1972).

1. To estimate $\beta$, we make use of the "protractor" in the upper left corner, which gives $\beta$ as the slope of the fitted relationship.
2. To estimate $\theta$, we make use of the relationship

   $$F(\theta) = 1 - \exp\!\left(-(\theta/\theta)^{\beta}\right) = 1 - e^{-1} = 0.632 \;\Rightarrow\; \hat{\theta} = t_{0.368}$$
Rank Regression

We parallel the discussion that was used to describe the development of rank regression estimators of the normal parameters, $\mu$ and $\sigma$. We use the linear relationship expressed by Eq. (4.7) to develop rank regression estimates of $\beta$ and $\theta$. Rather than regress the double log of the rank estimator on $\ln t$, an inverse rank regression model is used, where the left-hand side of Eq. (4.7) is treated as the independent variable. The inverse regression is of the form

$$\underbrace{\ln t}_{\text{new } y} = \underbrace{\frac{1}{\beta}}_{\text{slope}}\,\underbrace{\ln(-\ln R(t))}_{\text{new } x} + \underbrace{\ln \theta}_{\text{intercept}} + \varepsilon \tag{4.8}$$
We now demonstrate the use of inverse regression techniques for Weibull analysis.
Example 4-2: Multiply right-censored Weibull data set

A group of $n = 20$ electric motors was tested for 200K revolutions. Two motors were removed from the test for other reasons at 30K and 35K revs, respectively. Two motors were still operational at 200K revs. Thus, a total of only $r = 16$ failures was recorded. The data set is presented in Table 4-2, along with median rank statistics based on the adjusted ranks of the multiply censored data set.

Analysis. Empirical estimates of $\hat{F}(t)$ were obtained with the use of Minitab's default estimator, $\hat{F}(t_i) = (\text{adjusted rank}_i - 3/8)/(n + 1/4)$. A Weibull plot of the data set is shown in Figure 4-3. The data appears to hug the superimposed straight-line fit, which is indicative of a good fit. We now illustrate the use of probability plotting and rank regression techniques for estimating $\theta$ and $\beta$.

TABLE 4-2 Multiply Censored Test Data on n = 20 Electric Motors

K revs    InvRank    AdjRank    F^(t)
 20       20          1.00      0.031
 25       19          2.00      0.080
 30+      18          -         -
 35+      17          -         -
 41       16          3.12      0.135
 53       15          4.24      0.191
 60       14          5.35      0.246
 75       13          6.47      0.301
 80       12          7.59      0.356
 84       11          8.71      0.411
 95       10          9.82      0.467
128        9         10.94      0.522
130        8         12.06      0.577
139        7         13.18      0.632
152        6         14.29      0.687
176        5         15.41      0.743
176        4         16.53      0.798
180        3         17.65      0.853
200+       2          -         -
200+       1          -         -

Note: "+" denotes a suspended item.
FIGURE 4-3 Weibull plot of data in Table 4-2.

Graphical Estimation (see Figure 4-3)

$$\hat{\beta} \approx 1.60, \qquad \hat{\theta} \approx 130\text{K revs}$$
Inverse Rank Regression

The results of an inverse rank regression of $\ln t$ (y-values) versus doubln R $= \ln \ln [1/(1 - \hat{F}(t))]$ (x-values) are presented in Figure 4-4. Note the extremely high level of significance of the fit, with p-values equal to 0 to 3 decimal places and $R^2$ of 98.1%. This is to be expected, as the use of order statistics induces a correlation between time and the median rank statistics. The fitted relationship is

$$\ln T = 4.899 + 0.599\,\text{doubln R}$$

$$\hat{\beta} = \text{inverse of the slope of the fit} = (0.599)^{-1} = 1.67$$

$$\ln \hat{\theta} = \text{intercept of fit} = 4.899 \;\Rightarrow\; \hat{\theta} = 134.2\text{K revs}$$

FIGURE 4-4 Inverse regression analysis of $\ln t$ vs. $\ln \ln(1/R)$ provided by Minitab®.
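The inverse rank regression of Example 4-2 can be approximated with the following sketch. It is not Minitab's routine: the function names are hypothetical, and the adjusted-rank increment, (n + 1 - previous adjusted rank)/(1 + reverse rank), is an assumption that this section does not restate, chosen because it matches the AdjRank column of Table 4-2.

```python
import math

def adjusted_ranks(failed):
    """Adjusted ranks of the failures in a sorted, multiply censored sample.

    failed[k] is True if the k-th ordered observation is a failure.
    """
    n = len(failed)
    ranks, prev = [], 0.0
    for k, is_failure in enumerate(failed):
        if is_failure:
            reverse_rank = n - k  # units still on test, including this one
            prev += (n + 1 - prev) / (1 + reverse_rank)
            ranks.append(prev)
    return ranks

def least_squares(x, y):
    """Return (intercept, slope) of the ordinary least-squares line y = a + b*x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    return y_bar - slope * x_bar, slope

# Table 4-2 (K revs); False marks a suspended item.
times = [20, 25, 30, 35, 41, 53, 60, 75, 80, 84,
         95, 128, 130, 139, 152, 176, 176, 180, 200, 200]
failed = [True, True, False, False] + [True] * 14 + [False, False]
n = len(times)

adj = adjusted_ranks(failed)
f_hat = [(a - 0.375) / (n + 0.25) for a in adj]
fail_times = [t for t, d in zip(times, failed) if d]

# Inverse regression of Eq. (4.8): ln t on ln(-ln R), with R = 1 - F.
x = [math.log(-math.log(1.0 - f)) for f in f_hat]
y = [math.log(t) for t in fail_times]
intercept, slope = least_squares(x, y)

beta_hat = 1.0 / slope
theta_hat = math.exp(intercept)
print(f"beta_hat = {beta_hat:.2f}, theta_hat = {theta_hat:.1f}K revs")
```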
FIGURE 4-5 Inverse rank regression capability of Minitab® V13.1 for the multiply censored data set of Table 4-2.
Commencing with version 13, Minitab has introduced a built-in capability for generating inverse rank regression estimates. The Minitab output is presented in Figure 4-5. The parameter estimates differ slightly, perhaps due to a difference in the internal choice of rank estimator used in the routine.
4.3 MAXIMUM LIKELIHOOD ESTIMATION

4.3.1 Introduction to ML Estimation
We generalize our discussion by considering a multiply right-censored data set consisting of independent observations, $t_1 \le t_2 \le \cdots \le t_{n-1} \le t_n$, with associated survival function $R(t; \theta)$ and failure density function $f(t; \theta)$, where $\theta$ is an array of one or more unknown parameters that are to be estimated (see Table 4-3). The likelihood function, $L(\theta)$, is constructed as follows:

$$L(\theta) = \prod_{i=1}^{n} L_i(\theta) \tag{4.9}$$
TABLE 4-3 $\theta$ Is an Array of Parameters

Distribution    $\theta$
Exponential     $\lambda$ or $\theta$
Normal          $(\mu, \sigma)$
Lognormal       $(t_{med}, \sigma)$
Weibull         $(\beta, \theta)$ or $(\beta, \theta, \delta)$
EVD             $(\mu, \sigma)$ or $(\ln \theta, \sigma)$
where each likelihood term $L_i(\theta)$ is replaced by

   $f(t_i; \theta)$ if $t_i$ is a recorded failure
   $R(t_i; \theta)$ if $t_i$ is a right-censored observation

The maximum likelihood (ML) estimator for $\theta$ is just that unique value of $\theta$, if it should exist, that maximizes $L(\theta)$.

Maximum likelihood (ML) approaches enjoy several advantages over rank regression methods for parameter estimation. Due to the nature of least-squares estimation, the regression fit can be excessively influenced by observations in the tail regions of the distribution. In the left-tailed region, this might be considered a benefit, since it is the region of greatest interest (earliest failures). However, this would never be the case for the right-tailed region! On the other hand, ML methods constitute a formal framework for estimation, and for large samples (asymptotically) the estimates are unbiased (consistent) and of minimum variance (asymptotically efficient). For small samples, however, ML estimates are biased estimators; that is, $E(\hat{\theta}) \ne \theta$. A more formal overview of likelihood estimation is presented in Chapter 7, which illustrates the use of Excel procedures for obtaining likelihood estimates and asymptotic, approximate confidence intervals on reliability metrics.

4.3.2 Development of Likelihood Confidence Intervals
Confidence interval estimation is a much more effective strategy for conveying estimates than point estimation alone. When we develop a single point estimate, $\hat{\theta}$, of $\theta$, we do not associate a level of confidence with it. We recognize that each time a new sample is collected, a different point estimate will be obtained. For that reason, $\hat{\theta}$ has a sampling distribution with expected value $E[\hat{\theta}]$, which will equal $\theta$ if the estimator is unbiased, and with standard error $\mathrm{se}(\hat{\theta}) = \sqrt{E(\hat{\theta} - E[\hat{\theta}])^2}$. Knowledge of the sampling distribution can then be used to develop a confidence interval estimate of $\theta$ and an associated level of confidence, $C = 1 - \alpha$, that the true value of $\theta$ lies in the constructed interval.
Definition. A confidence interval about one or more parameter(s) $\theta$, or about a function of one or more parameters $f(\theta)$, has associated with it a level of confidence, $C$, on the likelihood that the confidence interval contains the true value, $\theta$. It will be one of the following forms:

Two-sided confidence interval on $\theta$:

$$P(\theta_L \le \theta \le \theta_U) \ge C \tag{4.10}$$

One-sided confidence interval on $\theta$:

$$P(\theta \ge \theta_L) \ge C \;\; \text{(one-sided lower confidence interval)} \quad \text{or} \quad P(\theta \le \theta_U) \ge C \;\; \text{(one-sided upper confidence interval)} \tag{4.11}$$
The mathematics of constructing confidence intervals is closely related to hypothesis testing. Given a sample of size $n$ from a population, a confidence interval consists of all values of $\theta = \theta_0$ for which the hypothesis $\theta = \theta_0$ is accepted. For two-sided confidence intervals, we test $\theta = \theta_0$ versus $\theta \ne \theta_0$; for one-sided confidence intervals, we test $\theta = \theta_0$ versus $\theta < \theta_0$ or $\theta > \theta_0$.
However, if the sampling distribution is not known, the confidence interval must be approximated, and this leads to concerns about the efficacy of such intervals. For data sets consisting of one or more suspensions, expressions for the exact confidence limits on normal reliability metrics such as $\mu$, $\sigma$, or $t_R$ cannot be obtained due to the inability to derive expressions for the sampling distributions of $\hat{\mu}$, $\hat{\sigma}$, and $\hat{t}_R$. For all data sets, suspended or complete, the same is true for the development of expressions for the sampling distributions of Weibull metrics, $\hat{\theta}$, $\hat{\beta}$, and $\hat{t}_R$. Instead, we must take advantage of the asymptotic (large sample) properties of likelihood estimators to arrive at approximate confidence intervals on the distributional parameters or reliability metrics of interest. Two types of approximations are commonly used. They are

1. Fisher matrix (FM) asymptotic intervals
2. Likelihood ratio (LR) based procedures
These procedures are discussed in Chapter 9. Fisher matrix (FM) intervals are based on the normal approximation of a sum of $n$ partial derivatives of a log-likelihood function. (Normality is justified by the central limit theorem.) Likelihood ratio (LR) procedures are based on the asymptotic distribution of a ratio of likelihood functions, as used in the application of hypothesis testing. That is, an LR confidence interval may be viewed as consisting of all values of $\theta = \theta_0$ for which the hypothesis test $H_0: \theta = \theta_0$ is not rejected, as explained in §9.2.2. These procedures require the use of unconstrained, nonlinear optimization methods for Weibull and incomplete normal data sets. The underlying mathematics and procedures for generating these confidence intervals are deferred to Chapter 9, wherein we introduce the use of the Excel® Tools > Solver or Goal Seek routine for carrying out the optimization. Fortunately, with the advent of good statistical computing packages, the practitioner need not be familiar with the algorithms used to generate these limits. We will demonstrate this capability with the use of Minitab in worked-out examples later on.

Efficacy of Approximate Confidence Intervals

We must remember that confidence intervals are usually constructed based on outcomes from a single sample. If the process of obtaining a random sample is repeated and the confidence limits are recalculated, the limits will differ each time because of sample variation. That is, both the center and width of confidence intervals will be affected by sample variation. Thus, confidence intervals must possess the following desirable properties:

1. They must be of minimum width, using either the width $H = \theta_U - \theta_L$ for two-sided confidence intervals or the half-width $H = \hat{\theta} - \theta_L$ or $\theta_U - \hat{\theta}$ for one-sided confidence intervals.
2. Ideally, their coverage must be $C \times 100\%$, where coverage is defined as the long-run proportion of time that the true value of $\theta$ will lie in the confidence interval if the process of taking a random sample and recalculating the confidence limits is repeated a large number of times.
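The coverage and width properties just defined can be estimated by simulation. The sketch below is a hypothetical illustration rather than a procedure from this chapter: it deliberately uses an approximate interval for a normal mean, $\bar{x} \pm 1.645\,s/\sqrt{n}$ (a nominal 90% interval built with a standard normal percentile instead of the exact $T$ percentile), and counts how often that interval contains the true mean. The function name, seed, and sample sizes are arbitrary choices.

```python
import math
import random

def coverage_and_width(n, reps, z=1.645, mu=0.0, sigma=1.0, seed=1):
    """Estimate coverage and average width of xbar +/- z*s/sqrt(n) by Monte Carlo."""
    rng = random.Random(seed)
    hits = 0
    total_width = 0.0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        s = math.sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
        half = z * s / math.sqrt(n)
        if xbar - half <= mu <= xbar + half:
            hits += 1
        total_width += 2.0 * half
    return hits / reps, total_width / reps

# Nominal confidence is 90%; the approximation is poorer for the small sample.
cov_small, width_small = coverage_and_width(n=5, reps=20000)
cov_large, width_large = coverage_and_width(n=50, reps=20000)
print(f"n = 5 : coverage = {cov_small:.3f}, average width = {width_small:.3f}")
print(f"n = 50: coverage = {cov_large:.3f}, average width = {width_large:.3f}")
```

The small-sample interval should fall short of its nominal 90% coverage, which is the kind of shortfall that a coverage study of an approximate interval is meant to expose.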
Most of the classical confidence limit expressions for complete samples on the normal parameters, $\mu$ and $\sigma$, do possess this ideal property. However, for incomplete samples, we often make use of the asymptotic (large sample) properties of the maximum likelihood estimator to devise approximate confidence intervals on parameters or reliability metrics of interest. In such cases we must look at both interval width and coverage when evaluating the efficacy of using these approximations. Monte Carlo (MC) simulation constitutes an effective methodology for evaluating the efficacy of confidence interval limits. In this approach, MC methods are used to create many samples from a known distribution. The coverage percentage, or proportion of times that the true value of a parameter is contained in the confidence interval, is tabulated along with information on the average width or half-width of the intervals. Tradeoffs in coverage and width must be considered when evaluating several competing confidence limit expressions. The best $C \times 100\%$ confidence interval will then be that confidence interval having the shortest width among all confidence intervals that have exactly $C$ coverage. Monte Carlo estimation is discussed in §4.4 and Appendix 4A.

4.3.3 Maximum Likelihood Estimation of Normal Parameters, $\mu$ and $\sigma$, for Complete Sample Sets

For complete data sets, the ML estimates of $\mu$ and $\sigma$ are well known and presented here:

$$\hat{\mu} = \bar{x} \tag{4.12}$$

$$\hat{\sigma}^2 = \sum_{i=1}^{n} \frac{(x_i - \bar{x})^2}{n} = \frac{n-1}{n}\,s^2 \tag{4.13}$$

Note: $\hat{\sigma}^2$ is a biased estimator of $\sigma^2$ since $E(\hat{\sigma}^2) = [(n-1)/n]\,E(s^2) = [(n-1)/n]\,\sigma^2$.

Confidence Interval Estimates of $\mu$, $\sigma$, and Percentiles of the Normal Distribution for Complete Samples

Knowledge of the sampling distributions of $\hat{\mu}$ and $\hat{\sigma}$ is needed to create exact confidence intervals on $\mu$ and $\sigma$. When data sets are complete (that is, every item on test has failed), standard sampling distributions such as the standard normal, $T$, or chi-squared distributions are used to construct these confidence intervals. This is not the case when samples consist of one or more suspensions, as the sampling distributions of $\hat{\mu}$ and $\hat{\sigma}$ must be approximated in some way. The confidence intervals will be of the form

$$P(\mu_L \le \mu \le \mu_U) \ge C \tag{4.14}$$

and

$$P(\sigma \le \sigma_U) \ge C \tag{4.15}$$
Only a one-sided confidence interval on $\sigma$ is shown in Eq. (4.15), as we are interested only in obtaining information relevant to setting an upper bound on $\sigma$. The familiar confidence interval expressions on $\mu$ and $\sigma$, cited in every introductory engineering statistics textbook, are based on an assumption that the data set is complete. These expressions, which are repeated here, are based on the use of normal sampling distributions, $T$ and chi-squared, each with $n - 1$ degrees of freedom. These standard distributions are used to model the sampling distributions of $\bar{t}$ and $s^2$, respectively:

$$\frac{\bar{t} - \mu}{s/\sqrt{n}} \sim T_{n-1}, \qquad \frac{(n-1)s^2}{\sigma^2} \sim \chi^2_{n-1} \tag{4.16}$$

This leads immediately to the well-known confidence interval expressions on $\mu$ and $\sigma^2$, which are presented in Eqs. (4.17) and (4.18), written in terms of right-tailed percentiles:

$$P\left[\underbrace{\hat{\mu} - t_{n-1,(1-C)/2}\sqrt{\frac{s^2}{n}}}_{\mu_L} \le \mu \le \underbrace{\hat{\mu} + t_{n-1,(1-C)/2}\sqrt{\frac{s^2}{n}}}_{\mu_U}\right] \ge C \tag{4.17}$$

$$P\left[\sigma^2 \le \underbrace{\frac{(n-1)s^2}{\chi^2_{n-1,C}}}_{\sigma_U^2}\right] \ge C \tag{4.18}$$

A graphical representation of the right-tailed percentiles, $T_{n-1,\alpha}$ and $\chi^2_{n-1,\alpha}$, is presented in Figures 4-6 and 4-7, along with information on how to use Excel's statistical functions to evaluate percentiles of the $T$- and $\chi^2$-distributions, respectively. (Unlike many textbooks, the author has elected not to reproduce tables of normal sampling distribution percentiles. They can be easily obtained with the use of Excel.)

Example 4-3: Confidence intervals on $\mu$ and $\sigma^2$ (no censoring allowed)

Data is collected on a machine characteristic (thread depth). Based on a sample of size $n = 12$, average thread depth is 15 ten-thousandths with a sample variance of $s^2 = 3.7$ ten-thousandths². Construct a two-sided confidence interval on mean depth $\mu$ and a one-sided upper confidence interval on the variance, $\sigma^2$. Use $C = 90\%$.

Solution: Confidence Limits on $\mu$ Using Eq. (4.17)

$$\mu_L = \hat{\mu} - T_{11,\,0.05}\; s/\sqrt{n} = 15 - 1.796\sqrt{\frac{3.7}{12}} = 14.003 \text{ ten-thousandths}$$

$$\mu_U = \hat{\mu} + T_{11,\,0.05}\; s/\sqrt{n} = 15 + 1.796\sqrt{\frac{3.7}{12}} = 15.997 \text{ ten-thousandths}$$
FIGURE 4-6 Evaluating percentiles of the t-distribution with the use of Microsoft® Excel.
FIGURE 4-7 Evaluating percentiles of the chi-squared distribution with the use of Microsoft® Excel.
Confidence Limits on $\sigma^2$ Using Eq. (4.18)

$$\sigma_U^2 = \frac{(n-1)s^2}{\chi^2_{11,\,0.90}} = \frac{(12-1)(3.7)}{5.578} = 7.296 \text{ ten-thousandths}^2$$
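The arithmetic of Example 4-3 can be checked with a short script. This is a sketch only: the two percentile values are copied from the example rather than computed, since the Python standard library has no $T$ or chi-squared quantile function, and the variable names are illustrative.

```python
import math

# Sample summary from Example 4-3 (units: ten-thousandths).
n = 12
mean_depth = 15.0
s2 = 3.7  # sample variance

# Right-tailed percentiles quoted in the example (assumed, not computed here).
t_11_005 = 1.796     # T percentile, 11 degrees of freedom, right-tail area 0.05
chi2_11_090 = 5.578  # chi-squared percentile, 11 degrees of freedom, right-tail area 0.90

# Two-sided 90% limits on the mean, Eq. (4.17).
half_width = t_11_005 * math.sqrt(s2 / n)
mu_lower = mean_depth - half_width
mu_upper = mean_depth + half_width

# One-sided upper 90% limit on the variance, Eq. (4.18).
var_upper = (n - 1) * s2 / chi2_11_090

print(f"mu_L = {mu_lower:.3f}, mu_U = {mu_upper:.3f}")
print(f"sigma_U^2 = {var_upper:.3f}")
```

With a statistics package available, the hard-coded percentiles could be replaced by calls to its quantile functions.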
Confidence Intervals on Percentiles of the Survival Distribution Based on Complete Sample Information

For complete data sets (that is, a failure time is recorded for every item placed on test), the survival percentile of the normal distribution is calculated using (see Table 3-2)

$$\hat{t}_R = \hat{\mu} + \hat{\sigma} Z_R \tag{4.19}$$
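As a minimal sketch of Eq. (4.19), the function below evaluates a normal survival percentile with Python's standard library. It assumes $Z_R = \Phi^{-1}(1 - R)$, which is inferred from the earlier statement that $Z_{0.16} = 1.0$ rather than taken from Table 3-2, and the function name is hypothetical. The inputs are the mean of 10 and standard deviation of 2.0 quoted in the caption of Figure 4-8, reading that figure's percentile as $t_{0.90}$.

```python
from statistics import NormalDist

def survival_percentile(mu, sigma, reliability):
    """Time t_R by which a fraction (1 - R) has failed, so that R(t_R) = R."""
    z_r = NormalDist().inv_cdf(1.0 - reliability)  # assumed convention for Z_R
    return mu + sigma * z_r

t_90 = survival_percentile(mu=10.0, sigma=2.0, reliability=0.90)
print(f"t_0.90 = {t_90:.3f}")

# Consistency check with Eq. (4.5): Z_0.16 should be close to 1.
z_016 = NormalDist().inv_cdf(1.0 - 0.16)
print(f"Z_0.16 = {z_016:.3f}")
```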
Alternatively, Excel can be directly employed to evaluate survival percentiles, as shown earlier. Exact expressions for obtaining interval estimates of $t_R$ are based on the finding that the pivotal quantity $(\hat{t}_R - t_R)/\sigma$ is distributed according to a noncentral t-distribution with noncentrality parameter $Z_R\sqrt{n}$. Statistical computing packages have built-in functions for generating confidence intervals on $t_R$.

FIGURE 4-8 Evaluating the 90th percentile of the survival distribution with the use of Microsoft® Excel for a normally distributed characteristic with mean 10 and standard deviation 2.0.

4.3.4 ML Estimation of Normal Parameters $\mu$ and $\sigma^2$ in the Presence of Censoring

In the real world, data sets are rarely complete: Suspensions can occur randomly due to either unforeseen circumstances or competing failure modes. Additionally, resource limitations on availability of test fixtures and time constraints due to product lead time reduction pressures result in early withdrawal of items under test. Accordingly, today's practitioners need to be familiar with the variety of methods available for analyzing censored data.

Most practitioners are not well versed in the complexities involved in the generation of maximum likelihood estimates of $\mu$ and $\sigma$ in the presence of censoring. In such instances the resultant relationships for obtaining ML estimates of $\mu$ and $\sigma$ require the use of a nonlinear search procedure such as the Newton-Raphson numerical method. For our advanced readers, the mathematical relationships for deriving ML estimates of $\mu$ and $\sigma^2$ are presented in Chapter 7, along with an introduction to the use of Excel for generating easy, one-step ML estimates of $\mu$ and $\sigma$. Fortunately, many of today's statistical computing packages have built-in capabilities for obtaining maximum likelihood estimates of $\mu$ and $\sigma^2$. We illustrate the use of Minitab for obtaining ML estimates of $\mu$ and $\sigma$ for the bearing life data of Table 4-1. The output from Minitab is shown in Figure 4-9.

To develop asymptotically correct confidence intervals on normal parameters $\mu$ and $\sigma$, we make use of Minitab's built-in capability for generating Fisher-matrix confidence intervals. Both point and 95% confidence interval estimates on $\mu$ (location parameter) and $\sigma$ (scale parameter) are shown in Figure 4-9. The reader is encouraged to compare the ML parameter estimates, $\hat{\mu} = 85.56$ and $\hat{\sigma} = 8.71$, with either of the two rank regression estimates worked out in §4.2.1.

FIGURE 4-9 ML estimates and 95% confidence intervals provided by Minitab™ V13.1 Stat > Reliability/Survival > Parametric Right-Censored procedure.
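One way to make the suggested comparison concrete is to evaluate the censored-data log-likelihood of Eq. (4.9) at each set of estimates. The sketch below assumes the four unfailed bearings of Example 4-1 were all suspended at 90 hundred-hr; the function name and the dictionary of candidates are illustrative, not part of the Minitab output.

```python
import math
from statistics import NormalDist

# Bearing life data of Table 4-1 (hundred-hr): 11 failures, 4 units suspended at 90.
failures = [70.1, 72.0, 75.9, 76.2, 82.0, 84.3, 86.3, 87.6, 88.3, 89.1, 89.4]
suspensions = [90.0] * 4

def normal_log_likelihood(mu, sigma):
    """ln L: density terms for failures, survival terms for right-censored units."""
    dist = NormalDist(mu, sigma)
    log_l = sum(math.log(dist.pdf(t)) for t in failures)
    log_l += sum(math.log(1.0 - dist.cdf(c)) for c in suspensions)
    return log_l

candidates = {
    "graphical (Fig. 4-1)": (88.0, 13.0),
    "inverse rank regression": (86.17, 10.08),
    "maximum likelihood": (85.56, 8.71),
}
log_l = {name: normal_log_likelihood(mu, sigma) for name, (mu, sigma) in candidates.items()}
for name, value in log_l.items():
    print(f"{name:>24s}: ln L = {value:.3f}")
```

By construction, the ML estimates should give the largest log-likelihood of the three; how far the others fall below it is one rough measure of how much the estimates really differ.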
4.3.5 ML Estimation of Weibull Parameters $\theta$ and $\beta$
The likelihood equations for estimating Weibull parameters, $\theta$ and $\beta$, are derived in Chapter 7. For multiply right-censored data sets, the relationships for developing ML estimates of Weibull parameters $\theta$ and $\beta$ reduce to

$$\frac{1}{\beta} + \frac{1}{r}\sum_{i=1}^{n} \delta_i \ln t_i - \frac{\sum_{i=1}^{n} t_i^{\beta} \ln t_i}{\sum_{i=1}^{n} t_i^{\beta}} = 0 \tag{4.20}$$

$$\hat{\theta} = \left(\frac{\sum_{\forall i} t_i^{\hat{\beta}}}{r}\right)^{1/\hat{\beta}} \tag{4.21}$$

Note that in Eq. (4.20), we make use of the indicator variable:

$$\delta_i = \begin{cases} 1 & \text{if } t_i \text{ is a recorded failure} \\ 0 & \text{if } t_i \text{ is a right-censoring time} \end{cases}$$

The identification of a value of $\beta$ that satisfies Eq. (4.20) generally requires the use of nonlinear, gradient search procedures. In Chapter 9 we present the use of Excel's Tool > Solver or Tool > Goal Seek procedure for deriving $\hat{\beta}$. Once $\hat{\beta}$ is identified, Eq. (4.21) can then be used to derive $\hat{\theta}$.

Here we will make use of the strong features of Reliasoft Weibull++ V6.0 (2001) software to develop ML estimates for $\theta$ and $\beta$ for the Weibull data set of Table 4-2. Weibull++ software has the capability to generate likelihood contours, which are reproduced in Figure 4-10. The contours are seen to be fairly "well behaved." As such, most nonlinear gradient search routines should work well. There are challenges, however, which are discussed in greater detail in Chapter 9. ML estimates are readily provided by Reliasoft and reproduced in Figure 4-11. The ML estimates are

$$\hat{\beta} = 1.77848 \quad \text{and} \quad \hat{\theta} = 132.8035$$

FIGURE 4-10 Likelihood contours generated by Reliasoft Weibull++ V6.0 software (2001).
FIGURE 4-11 ML estimates of beta and theta from Reliasoft Weibull++ V6.0 software.
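For readers without access to the packages named above, the ML equations of this section can be solved with a simple one-dimensional search. The sketch below applies bisection to Eq. (4.20), using the standard form of the censored Weibull likelihood equation with the sign flipped so that the function increases with $\beta$, and then applies Eq. (4.21). It is an illustrative substitute for the Excel Solver and Weibull++ routines mentioned in the text, not a reproduction of them; the bracketing interval and function names are assumptions.

```python
import math

# Table 4-2 (K revs); delta = 1 for a recorded failure, 0 for a suspension.
times = [20, 25, 30, 35, 41, 53, 60, 75, 80, 84,
         95, 128, 130, 139, 152, 176, 176, 180, 200, 200]
delta = [1, 1, 0, 0] + [1] * 14 + [0, 0]
r = sum(delta)  # number of failures

def score(beta):
    """Negative of the left-hand side of Eq. (4.20); increasing in beta."""
    s0 = sum(t ** beta for t in times)
    s1 = sum(t ** beta * math.log(t) for t in times)
    mean_log_failures = sum(d * math.log(t) for t, d in zip(times, delta)) / r
    return s1 / s0 - 1.0 / beta - mean_log_failures

# Bisection on an assumed bracket; score(lo) < 0 < score(hi) for these data.
lo, hi = 0.1, 10.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if score(mid) < 0.0:
        lo = mid
    else:
        hi = mid
beta_hat = 0.5 * (lo + hi)

# Eq. (4.21): the sum runs over all items, failed or suspended.
theta_hat = (sum(t ** beta_hat for t in times) / r) ** (1.0 / beta_hat)
print(f"beta_hat = {beta_hat:.3f}, theta_hat = {theta_hat:.2f}K revs")
```

Bisection is slower than a gradient method but cannot overshoot, which makes it a convenient check on results reported by other software.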
To develop asymptotically correct confidence intervals on Weibull parameters $\theta$ and $\beta$, we make use of Minitab's built-in capability for generating Fisher-matrix confidence intervals. We reproduce the output from the Minitab Stat > Reliability/Survival > Parametric Right Censored procedure in Figure 4-12 for the sample Weibull data set. Point and interval estimates of the Weibull parameters are shown. Using Monte Carlo simulation, Abernethy (1996) shows that confidence interval approximations of this type perform very poorly. For example, for Weibull data, 90% asymptotic confidence interval approximations have been shown to have coverage percentages as low as 75% when sample sizes are as small as $n = 20$.
4.4 SIMULATION-BASED APPROACHES FOR THE DEVELOPMENT OF NORMAL AND WEIBULL CONFIDENCE INTERVALS
With the advent of widely available, powerful desktop computing resources and analysis software, we are witnessing a return to simulation-based approaches. As a prime example, WinSmith Weibull Analysis software now includes built-in capabilities for devising Monte Carlo-based confidence intervals. Monte Carlo (MC) simulation procedures have been developed for obtaining approximations to the exact confidence intervals about Weibull and normal parameters and reliability metrics (see Lawless, 1982, pp. 226–232).
FIGURE 4-12 ML point and interval estimates of $\beta$, $\theta$, $t_{0.90}$, and $R(100)$ provided by Minitab® Stat > Reliability/Survival > Par. Distr. Analysis-Right Censored procedure.
The MC capabilities of WinSmith software were tested on the Weibull data set of Table 4-2. Monte Carlo (MC) samples are drawn from a hypothetical Weibull distribution with $\theta = \hat{\theta}$ and $\beta = \hat{\beta}$. This is a form of parametric bootstrap sampling (see Hjorth, 1994, §6.1, and Meeker and Escobar, 1998). The results are shown next.

Example 4-4: Monte Carlo confidence limits for Weibull data set of Table 4-2
MC confidence intervals were generated for the data set presented in Table 4-2. The output report from WinSmith is presented and summarized in Figure 4-13. The use of MC methods for the development of approximate confidence intervals is discussed in greater detail in Appendix 4A.1. A word of caution is in order: The use of MC simulation on multiply right-censored observations is still an open area of research. Neither of the approaches outlined in the appendix has been proven. The methodology does seem useful, however. The reader should consult Appendix 4A.1, which provides an overview of strategies for obtaining MC intervals along with two worked-out examples illustrating the use of Minitab macros for manually generating MC confidence limits.

FIGURE 4-13 Monte Carlo percentile confidence and point estimates for $\theta$, $\beta$, and $t_{0.90}$.
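The parametric bootstrap idea behind Example 4-4 can be sketched as follows. This is not the WinSmith algorithm or either approach of Appendix 4A.1: to stay simple it draws complete (uncensored) samples of size 20 from a Weibull distribution with the ML estimates reported in §4.3.5, refits each sample by maximum likelihood, and reads 90% percentile limits off the sorted refitted values. Ignoring the censoring pattern of Table 4-2 is a deliberate simplification, and the seed, replicate count, and function names are arbitrary.

```python
import math
import random

def weibull_mle_complete(sample):
    """ML estimates (beta, theta) for a complete Weibull sample, found by bisection."""
    logs = [math.log(t) for t in sample]
    mean_log = sum(logs) / len(sample)

    def score(beta):
        s0 = sum(t ** beta for t in sample)
        s1 = sum(t ** beta * lt for t, lt in zip(sample, logs))
        return s1 / s0 - 1.0 / beta - mean_log

    lo, hi = 0.05, 20.0  # assumed bracket for beta
    for _ in range(40):
        mid = 0.5 * (lo + hi)
        if score(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    theta = (sum(t ** beta for t in sample) / len(sample)) ** (1.0 / beta)
    return beta, theta

beta_hat, theta_hat = 1.77848, 132.8035  # ML estimates quoted in Section 4.3.5
n, replicates = 20, 1000
rng = random.Random(2002)

betas, thetas = [], []
for _ in range(replicates):
    # random.weibullvariate(alpha, beta): alpha is the scale, beta is the shape.
    sample = [rng.weibullvariate(theta_hat, beta_hat) for _ in range(n)]
    b, th = weibull_mle_complete(sample)
    betas.append(b)
    thetas.append(th)

betas.sort()
thetas.sort()
lower, upper = int(0.05 * replicates), int(0.95 * replicates) - 1
beta_limits = (betas[lower], betas[upper])
theta_limits = (thetas[lower], thetas[upper])
print(f"90% percentile limits on beta : {beta_limits[0]:.2f} to {beta_limits[1]:.2f}")
print(f"90% percentile limits on theta: {theta_limits[0]:.1f} to {theta_limits[1]:.1f}")
```

A fuller treatment would regenerate the suspensions as well, which is where the open questions noted above arise.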
4.5 OTHER ESTIMATORS

4.5.1 Best Linear Estimators of $\mu$ and $\sigma$
Linear estimators were popular in the early days of computing, when computing resources were not widely available to obtain ML estimates. Today they are rarely used: with the advent of inexpensive computing resources, software for developing maximum likelihood estimates has superseded the need for linear estimators. Until the 1980s, reliability textbooks contained many pages devoted to the reproduction of tables in the appendices for constructing linear estimators. For example, the reader might consult textbooks by Kapur and Lamberson (1977) or Mann et al. (1974) for the availability of such tables. The reader is likely to come upon a reference to BLUE (best-linear unbiased estimator) or BLIE (best-linear invariant estimator) estimators. These estimators are formed from linear combinations of the ordered failure times, $\hat{\theta} = a_{1,n} t_1 + a_{2,n} t_2 + \cdots + a_{r,n} t_r$, where the constants $a_{1,n}, a_{2,n}, \ldots, a_{r,n}$ are tabulated or incorporated internally into statistical computing packages. The constants are chosen so that $\hat{\theta}$ has the minimum variance (BLUE) or minimum mean-squared error (BLIE) among all possible linear estimators of this kind. The reference textbooks are often filled with 30 or more pages of tabulated constants for developing linear estimates for sample sizes up to $n = 25$ (see Lawless, 1982, pp. 144–145).
4.6 RECOMMENDATIONS FOR CHOICE OF ESTIMATION PROCEDURES
The uses of rank regression, maximum likelihood, and simulation-based approaches to estimation have been described. It is now up to the reliability analyst to decide on an estimation scheme for his or her data. To assist in making this decision, we provide the following information, as we refer back to the Weibull data set of Table 4-2. Differences in the rank regression and maximum likelihood estimates appear to be significant, but, in fact, their differences are relatively minor compared to the standard error of the estimates of y and b. For example, refer back to the results given by Figure 4-5 (Weibull rank regression) to those presented by Figure 4-12 (Weibull ML estimation). Even with a fairly goodsized data set (n ¼ 20; r ¼ 16 failures), we see the construction of very wide confidence intervals on b, from 1.19 to 2.67. This is a much more important issue than whether or not rank regression methods are preferred over simulation-based or maximum likelihood procedures. In any case, with respect to arguments for and against ML-based approaches, we provide the following arguments given by Abernethy (1996). For small and moderate samples sizes of less than 100 failures, Abernethy (1996) reports that ML estimates tend to be biased. The findings are based upon a Monte Carlo simulation study of 1000 replicates from a known Weibull distribution, comparing inverse rank regression and maximum likelihood estimates. For this reason, Abernethy does not recommend the use of likelihoodbased methods. To alleviate the bias, some work has begun on the development of simple bias correction factors, similar to the n=ðn 1Þ correction factor used for removing the bias on the ML estimator of the normal variance, s2 (see Abernethy, 1999). However, there are also mathematical objections to the use of least-squares (rank regression) methods for Weibull parameter estimation. We briefly allude to
some of the difficulties in §4.2.2. Observations in the tail region can overly influence the regression fit and the corresponding parameter estimates. Abernethy (1996) reports that the lower end (tail) of the fit tends to be overweighted compared to the upper end, particularly so when data is time- (type I) or failure- (type II) censored. From an estimation viewpoint, this is not a good property. Abernethy (1996) also reports that Dr. Suzuki of the University of Tokyo suggests "discarding the lower third of the data before fitting the data." Based on simulation studies, Abernethy (1996) and Fulton (1999) recommend the following:
1. For small sample sizes, $r < 11$ failures, use rank regression for estimating $\theta$ and $\beta$, followed by Monte Carlo simulation for developing confidence interval estimates on any reliability metric.
2. For sample sizes of 11 or greater, use rank regression methods for estimating $\beta$ and $\theta$ in combination with the ML-based Fisher matrix for generating approximate confidence intervals on Weibull parameters and associated reliability metrics of interest.
3. For sample sizes of 11 or greater, and if ML techniques have been used to estimate $\theta$ and $\beta$, use LR methods for the development of confidence intervals on Weibull parameters and associated reliability metrics of interest (despite the fact that they are not symmetric!).
4.7
ESTIMATION OF EXPONENTIAL DISTRIBUTION PROPERTIES
The exponential distribution is extremely easy to work with. It consists of only one parameter: either a mean time-to-failure parameter, $\theta$, or a hazard-rate parameter, $\lambda = 1/\theta$. The exponential distribution is named for its survival function, which is of the (simple) exponential form:
$$R(t) = 1 - F(t) = \exp(-\lambda t) = \exp(-t/\theta) \quad \text{for } t \ge 0 \tag{4.22}$$
Here we show how straightforward it is to obtain maximum likelihood estimates of exponential properties. Because the exact distributions of $\hat{\theta}$ and $\hat{\lambda}$ are related to the chi-square distribution, exact ML-based, one-sided confidence limits on exponential metrics are easy to obtain and are reproduced here. The theory behind the development of these expressions is discussed in greater detail in §9.1.
4.7.1 Estimating the Exponential Hazard-Rate Parameter, $\lambda$, or MTTF Parameter, $\theta$
For complete data sets, the MTTF parameter, $\theta = 1/\lambda$, may be estimated using
$$\hat{\theta} = \frac{\sum_{i=1}^{n} t_i}{n} = \hat{\lambda}^{-1} \tag{4.23}$$
The MTTF parameter, $\theta$, has the characteristic property that $F(\theta) = 1 - \exp(-1) = 0.632$. Thus, the 63.2 percentile of the failure distribution, or 36.8 percentile of the survival distribution, may be used to develop point estimates of $\theta$ and $\lambda$:
$$\hat{\theta} = t_{0.368} = 1/\hat{\lambda} \tag{4.24}$$
Should a data set contain censored observations, it is evident that the use of Eq. (4.23) would lead to an underestimation of the true mean time-to-failure parameter, as the potential failure times of any censored items could be far greater than the times at which they were suspended from the test. Without a formal estimation procedure that can take censoring into account (likelihood methods, for example), it is not obvious how Eq. (4.23) might be properly modified. We now play a game that we play in the classroom! What kinds of simple modifications to Eq. (4.23) might one suggest to properly take into account the effect of censoring mechanisms? To this end, in Table 4-4 we present an enumerated list of all possible sample average estimators, including the one shown in Eq. (4.23). Four possible sample average estimators are shown. For the first three, (1) to (3), the arguments against their use are obvious and are stated in the table. However, the fourth one is quite curious, as there are no apparent arguments that would support its use, yet there are no strong objections against its use either. In Chapter 9, §9.1.1, the fourth estimator is shown to be the maximum likelihood estimator of $\theta$:
$$\text{MTTF} = \frac{\text{total exposure time of all units on test}}{\text{no. of recorded failures}} = \frac{T}{r} \tag{4.25}$$
Generalization of $\hat{\lambda} = r/T$
Equation (4.25) applies to any arbitrary censoring situation (e.g., combinations of left and right censoring and interval censoring). In Table 4-6 we provide formulas for calculating $T$, the total unit exposure time, under a wide range of censoring scenarios.
TABLE 4-4 Possible Estimators of the Exponential MTTF Parameter
(1) $\text{MTTF} = \dfrac{\sum_{i=1}^{n} t_i}{n}$: MTTF is underestimated if some of the observations are censored.
(2) $\text{MTTF} = \dfrac{\sum_{i=1}^{r} t_i}{n}$: MTTF is severely underestimated since censoring times are ignored in the numerator.
(3) $\text{MTTF} = \dfrac{\sum_{i=1}^{r} t_i}{r}$: MTTF is underestimated since censoring times are ignored in the numerator.
(4) $\text{MTTF} = \dfrac{\sum_{i=1}^{n} t_i}{r}$: Least objectionable estimator, but there is no rational argument for suggesting this estimator. We later determine that it is an unbiased estimator.
In Table 4-6,
$t_i$ = time of recorded event (either failure or censored observation);
$t_r$ = time of $r$th recorded failure (stopping time for type II, singly failure-censored test);
$t^*$ = time when test is stopped (for type I, time censoring);
$t_{r+c}$ = time of $r$th recorded failure when $c$ early suspensions occur (stopping time for type II, multiply censored test);
$r$ = number of failures;
$c$ = number of suspended items;
$k$ = number of test stands or test fixturing devices.

For example, consider a replacement test, wherein we have $k$ test fixtures available. Our policy is to keep every test fixture running for $t^*$ time units. That is, we assume that the moment an item fails, it is instantly removed from test and a new test item is placed on test. Under this assumption, Eq. (4.25) generalizes to
$$T = \text{no. of test fixtures } (k) \times \text{duration of test } (t^*)$$

TABLE 4-5 Multiply Right-, Failure-Censored Exponential Data Set
10, 23, 28, 30+, 66, 85, 89+, 102, 119, 144, 150,
156, 170, 210, 272, 286, 328, 367, 402, 406, 494, 535
Note: "+" denotes an item taken off test.
TABLE 4-6 Calculation of Total Unit Exposure Time ($T$)
Type of data | Expression for $T$
Complete sample | $T = \sum_{i=1}^{n} t_i$
Singly time-censored (type I) at $t^*$ | $T = \sum_{i=1}^{r} t_i + (n - r)\,t^*$
Singly failure-censored (type II) at $t_r$ | $T = \sum_{i=1}^{r} t_i + (n - r)\,t_r$
Multiply time-censored at $t^*$ (time-censored with $c$ suspensions) | $T = \sum_{i=1}^{r+c} t_i + (n - r - c)\,t^*$
Multiply failure-censored (failure-censored with $c$ suspensions) at $t_{r+c}$ | $T = \sum_{i=1}^{r+c} t_i + (n - r - c)\,t_{r+c}$
Replacement test | $T = k\,t^*$
General censoring | $T = \sum_{\text{all test items}} (t_{\text{off test}} - t_{\text{on test}})$

Example 4-5:
Use of ML estimation of exponential hazard-rate parameter
The type II, multiply right-censored data set in Table 4-5 is based on $n = 30$ items placed on a test that was stopped at 535 hours upon the occurrence of the 20th recorded failure. Due to unfortunate circumstances, $c = 2$ items were taken off the test early at $t = 30$ and 89 hours, respectively. Based on Eq. (4.25),
$$\hat{\lambda} = \frac{r}{T} = \frac{r}{\sum_{i=1}^{r+c} t_i + (n - r - c)\,t_{r+c}} = \frac{20}{4472 + (30 - 20 - 2) \times 535} = 0.00229/\text{hr}$$
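The arithmetic of Example 4-5 can be checked with a few lines of code. The following is a minimal sketch, not part of the original text; the function name `total_exposure` is a hypothetical helper that implements the multiply failure-censored row of Table 4-6.

```python
# Table 4-5 data (hours). The two early suspensions are kept separately.
failures = [10, 23, 28, 66, 85, 102, 119, 144, 150, 156, 170,
            210, 272, 286, 328, 367, 402, 406, 494, 535]
suspensions = [30, 89]
n = 30  # items placed on test


def total_exposure(failure_times, suspension_times, n_on_test, stop_time):
    """T = sum of all recorded times + (items still running) * stop_time."""
    recorded = list(failure_times) + list(suspension_times)
    still_running = n_on_test - len(recorded)
    return sum(recorded) + still_running * stop_time


r = len(failures)
# The test stops at the last recorded failure (type II censoring).
T = total_exposure(failures, suspensions, n, stop_time=max(failures))
lam_hat = r / T      # ML estimate of the hazard rate, Eq. (4.25)
mttf_hat = T / r     # ML estimate of the MTTF

print(f"T = {T} hr, lambda_hat = {lam_hat:.5f}/hr, MTTF_hat = {mttf_hat:.1f} hr")
```

The same helper covers the singly censored rows of Table 4-6 by passing an empty suspension list, and the time-censored rows by passing $t^*$ as `stop_time`.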
4.7.2
Exponential Confidence Intervals
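In §9.2.1 it is proven that a scaled version of $T$, namely $2\lambda T$, is distributed according to a chi-square distribution with $2r$ degrees of freedom, for which tables are readily available. Accordingly, exact expressions for the confidence limits on $\lambda$, $\theta$, or other reliability metrics are available. Here $\chi^2_{\nu,p}$ denotes the value that a chi-square variate with $\nu$ degrees of freedom exceeds with probability $p$. A $C = 1 - \alpha$ one-sided upper confidence limit on the hazard parameter $\lambda$ is given by
$$P\left(\lambda \le \underbrace{\frac{\chi^2_{2r,(1-C)}}{2T}}_{\lambda_U}\right) \ge C \tag{4.26}$$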
The reader will note the absence of two-sided confidence limit expressions on exponential reliability metrics. This is due to the fact that the author does not believe in the use of two-sided confidence limits for metrics whose ideal value is either "smaller-the-better," as for expressions related to the exponential hazard-rate parameter, $\lambda$, or "larger-the-better," as for expressions related to the MTTF parameter, $\theta$, or percentiles of the exponential distribution.
Equation (4.26) holds exactly for failure-censored data. For type I (time) censoring, the number of recorded failures is a random quantity. To approximately account for this effect, it is customary to adjust the degrees of freedom associated with the chi-square quantities to $2r + 2$ (see §9.2.1).
An inversion of the confidence interval expression (4.26) results in a one-sided lower confidence limit expression on $\theta$, the MTTF parameter:
$$P\left(\theta \ge \underbrace{\frac{2T}{\chi^2_{2r,(1-C)}}}_{\theta_L}\right) = C \tag{4.27}$$
To obtain confidence limits on other reliability metrics, we begin with the $C = 1 - \alpha$ limit on $\theta$:
$$P(\theta_L \le \theta) \ge C$$
which, in turn, may be used to develop a one-sided, $C \cdot 100\%$ lower confidence limit on $R(t)$:
$$P\Big(\underbrace{\exp(-t/\theta_L)}_{R_L} \le \underbrace{\exp(-t/\theta)}_{R}\Big) \ge C \tag{4.28}$$
For percentiles of the survival distribution, $t_R = -\lambda^{-1} \ln R = -\theta \ln R$, and so we multiply the confidence bound on $\theta$ by $-\ln R$:
$$P(\theta_L \le \theta) \ge C \;\Rightarrow\; P\Big(\underbrace{-\theta_L \ln R}_{t_{R,L}} \le -\theta \ln R = t_R\Big) \ge C \tag{4.29}$$
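Equations (4.26) through (4.29) chain together naturally in code. The sketch below is illustrative only and is not from the original text: it avoids third-party libraries by using the closed-form upper-tail probability of a chi-square variate with an even number of degrees of freedom (always the case here, since the degrees of freedom are $2r$ or $2r + 2$) and solving for the quantile by bisection. The function names and the 100-hr mission time are assumptions chosen for the illustration; the values of $T$ and $r$ are those of Example 4-5.

```python
import math


def chi2_upper_quantile(df, tail_area):
    """Return x such that P(chi-square with df d.o.f. > x) = tail_area.

    Uses the closed form valid for EVEN df = 2k only:
    P(X > x) = exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch handles even, positive df only")
    k = df // 2

    def upper_tail(x):
        half = x / 2.0
        term, total = 1.0, 1.0
        for j in range(1, k):
            term *= half / j
            total += term
        return math.exp(-half) * total

    lo, hi = 0.0, 1.0
    while upper_tail(hi) > tail_area:   # expand until the root is bracketed
        hi *= 2.0
    for _ in range(200):                # plain bisection
        mid = 0.5 * (lo + hi)
        if upper_tail(mid) > tail_area:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def exponential_limits(T, r, C, t, R, time_censored=False):
    """One-sided limits of Eqs. (4.26)-(4.29) for exponential data."""
    df = 2 * r + 2 if time_censored else 2 * r   # 2r+2 for type I censoring
    chi2 = chi2_upper_quantile(df, 1.0 - C)
    theta_L = 2.0 * T / chi2
    return {
        "lambda_U": chi2 / (2.0 * T),        # Eq. (4.26)
        "theta_L": theta_L,                  # Eq. (4.27)
        "R_L": math.exp(-t / theta_L),       # Eq. (4.28), lower limit on R(t)
        "tR_L": -theta_L * math.log(R),      # Eq. (4.29), lower limit on t_R
    }


# Example 4-5 totals; the mission time t = 100 hr is a hypothetical choice.
limits = exponential_limits(T=8752, r=20, C=0.90, t=100.0, R=0.90)
for name, value in limits.items():
    print(name, round(value, 5))
```

A statistics library with a chi-square quantile function would replace `chi2_upper_quantile` in practice; the closed form is used here only to keep the sketch self-contained.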
Example 4-6
We wish to be $C = 90\%$ confident of meeting a $\lambda = 0.2\%$/K hr specification (MTTF = 500K hr). We can run the test for 5000 hr, and we agree to allow up to $r = 5$ failures. What sample size is needed?
Solution: This is a type I censored test. If we assume that $r \ll n$, and we discount the possibility of a large number of suspensions, then $T \approx 5n$ K hr. For a one-sided confidence limit on $\lambda$, we make use of the equivalent limit on $\theta$, Eq. (4.27), adjusted for a type I test:
$$\theta_L \approx \frac{2T}{\chi^2_{2r+2,\,1-C}} = 500 \text{ K hr}, \quad \text{with } r = 5 \text{ and } \chi^2_{2 \cdot 5 + 2,\,0.10} = 18.55$$
Accordingly, we need to solve
$$\theta_L \approx \frac{2T}{\chi^2_{2r+2,\,1-C}} = \frac{2 \cdot 5 \cdot n \text{ K hr}}{18.55} = 500 \text{ K hr} \;\Rightarrow\; n = \frac{500 \times 18.55}{10} = 927.5, \text{ rounded up to } 928$$
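The sample-size calculation of Example 4-6 can be sketched in a few lines, which also makes it easy to explore other confidence levels, test durations, or allowed failure counts. This is an illustrative sketch under the same approximation as the example ($T \approx n \times$ test duration); the function names are hypothetical, and the chi-square quantile is obtained from the closed form for even degrees of freedom rather than from tables.

```python
import math


def chi2_upper_quantile(df, tail_area):
    """x with P(chi-square_df > x) = tail_area; even df only (closed form)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch handles even, positive df only")
    k = df // 2

    def upper_tail(x):
        half = x / 2.0
        term, total = 1.0, 1.0
        for j in range(1, k):
            term *= half / j
            total += term
        return math.exp(-half) * total

    lo, hi = 0.0, 1.0
    while upper_tail(hi) > tail_area:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if upper_tail(mid) > tail_area:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def required_sample_size(mttf_target, test_duration, max_failures, confidence):
    """Smallest n with 2*(n*test_duration)/chi2 >= mttf_target (type I test).

    Assumes r << n, so that T is approximately n * test_duration.
    """
    chi2 = chi2_upper_quantile(2 * max_failures + 2, 1.0 - confidence)
    return math.ceil(mttf_target * chi2 / (2.0 * test_duration))


# Example 4-6: MTTF target 500 K hr, 5 K hr test, up to 5 failures, C = 90%.
n_required = required_sample_size(500.0, 5.0, max_failures=5, confidence=0.90)
print(n_required)
```

Calling the function with a longer test duration or a lower confidence level shows how quickly the requirement drops, which is the trade-off discussed in the text that follows.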
This is a very large sample-size requirement. Increasing the risk, $1 - C$, or the test time, or both can reduce the sample-size requirements. Adjustments in the allowed number of failures may also be considered, but its effect on both the numerator and denominator must be considered.
4.7.3 Use of Hazard Plots
Method 1: Accumulating Inverse Ranks for Complete Samples
The hazard plot is a plot of the cumulative hazard function, $H(t)$, versus usage, $t$. From Table 3-5,
$$H(t) = \lambda t \tag{4.30}$$
Accordingly, the hazard plot is a very useful tool for judging the adequacy of an exponential fit. For complete samples, $H(t)$ is just the sum of the inverse ranks of the recorded failures:
$$H(t) \equiv \int_0^t \lambda(t')\,dt' \approx \sum_i \lambda(t_i)\,\Delta t_{i+1}, \quad \text{where } \Delta t_{i+1} = t_{i+1} - t_i \tag{4.31}$$
But
$$\hat{\lambda}(t) = \frac{1}{(n + 1 - i)\,\Delta t_{i+1}} \quad \text{for } t_i < t \le t_{i+1}$$
$$\Rightarrow\; H(t) \approx \sum_i \frac{1}{(n + 1 - i)\,\Delta t_{i+1}}\,\Delta t_{i+1} = \sum_i \frac{1}{n + 1 - i}$$
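The accumulation of inverse ranks is simple enough to sketch directly. The example below is illustrative and not part of the original text; it uses the failure times of Table 4-7, and the function name `cumulative_hazard` is a hypothetical helper.

```python
# Complete sample of failure times from Table 4-7.
times = [139, 271, 306, 344, 553, 1020, 1380, 2708]


def cumulative_hazard(failure_times):
    """Return (time, inverse rank, cumulative hazard) for a complete sample."""
    ordered = sorted(failure_times)
    n = len(ordered)
    rows, H = [], 0.0
    for i, t in enumerate(ordered, start=1):
        inverse_rank = n + 1 - i        # number of items still at risk
        H += 1.0 / inverse_rank         # hazard increment at the i-th failure
        rows.append((t, inverse_rank, H))
    return rows


hazard_rows = cumulative_hazard(times)
for t, inv_rank, H in hazard_rows:
    print(f"{t:6d} {inv_rank:3d} {H:6.3f}")
```

Plotting the third column against the first gives the hazard plot; for exponential data the points should scatter about a straight line through the origin with slope $\lambda$.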
FIGURE 4-14
Cumulative hazard plot of an exponential data set.
Example 4-7:
Hazard plot for complete data set
An illustration of the construction of a hazard plot is presented in Figure 4-14 for the sample data set of Table 4-7. The cumulative hazard function was formed by accumulating the inverse ranks.

TABLE 4-7 Sample Exponential Data
Time | Inv. rank | Hazard value
139 | 8 | 0.125
271 | 7 | 0.268
306 | 6 | 0.435
344 | 5 | 0.635
553 | 4 | 0.885
1020 | 3 | 1.218
1380 | 2 | 1.718
2708 | 1 | 2.718

Method 2: Use of Natural Log of $R(t)$
For multiply censored data sets, we make use of the empirical relationship
$$\hat{H}(t) = -\ln \hat{R}(t) \tag{4.32}$$
Equation (4.32) is simply a logarithmic transform of the complement of the rank estimator of $F(t)$. The adequacy of the linear fit on the hazard plot can be used to assess whether or not $\lambda(t)$ might be increasing or decreasing rather than constant (see Figure 4-15). A convex fit is indicative of an increasing hazard-rate function; a concave fit is indicative of a decreasing hazard-rate function.

FIGURE 4-15 Adequacy of linear fit of hazard plot.
Example 4-8:
Worked-out example (multiply failure censored data)
A test consisting of $n = 16$ electromechanical components was conducted. The test was stopped at 1510 cycles after $r = 10$ failures (see Table 4-8). One test item was suspended from the test early due to reasons not associated with the study. Its censoring time is marked by a "+" next to its entry. The adjusted ranks are shown along with the empirical reliability function, $\hat{R}(t)$, based on use of the mean rank formula. The calculations for $\hat{H}(t) = -\ln \hat{R}(t)$ are shown along with a plot of the cumulative hazard function in Figure 4-16.

TABLE 4-8 Multiply Failure-Censored Data Set ($n = 16$; $r = 10$)
Rank | T (cycles) | Rev. rank | Adj. rank | $\hat{R}(t)$ | $\hat{H}(t)$
1 | 21 | 16 | 1 | 0.941 | 0.061
2 | 188 | 15 | 2 | 0.882 | 0.125
3 | 342 | 14 | 3 | 0.824 | 0.194
4 | 379 | 13 | 4 | 0.765 | 0.268
5 | 488 | 12 | 5 | 0.706 | 0.348
6 | 663+ | 11 | — | — | —
7 | 768 | 10 | 6.09 | 0.642 | 0.444
8 | 978 | 9 | 7.18 | 0.578 | 0.549
9 | 1186 | 8 | 8.27 | 0.513 | 0.667
10 | 1361 | 7 | 9.36 | 0.449 | 0.800
11 | 1510 | 6 | 10.46 | 0.385 | 0.954

FIGURE 4-16 Cumulative hazard plot of data in Table 4-8.

Analysis. In Table 4-8 the ranks are adjusted for suspensions. $H(t)$ is estimated with the use of Eq. (4.32). The hazard plot is presented in Figure 4-16. The fit appears to be adequate. An inverse, linear regression through the origin was run. The output from the regression is shown in Figure 4-17. The fitted model is
$$t = 1662\,\hat{H}(t)$$
Thus, $\hat{\lambda}^{-1} = \text{MTTF} = 1662$ cycles. The model appears to be a very good fit (p-value of 0.000). Residual analysis should be conducted to verify the adequacy of the hazard-rate model. The residuals represent unexplained differences between $t_i$ and $1662\,\hat{H}(t_i)$. A plot of the residuals versus $t$ is also contained in Figure 4-17. The residuals appear to fluctuate randomly about zero, which is an indication of an adequate fit. ML estimates of $\lambda$ and the MTTF were obtained with the use of Eq. (4.25):
$$\hat{\lambda} = \frac{r}{T} = \frac{10}{21 + 188 + 342 + 379 + 488 + 663 + 768 + 978 + 1186 + 1361 + 1510 + (5 \times 1510)} = \frac{10}{15{,}435} = 0.000648 \text{ cycles}^{-1}$$
$$\Rightarrow\; \hat{\theta} = 1/0.000648 \text{ cycles} = 1543.5 \text{ cycles}$$
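The hazard-plot portion of this analysis (adjusted ranks, mean-rank reliability, $\hat{H}(t)$, and the regression through the origin) can be sketched as follows. This is an illustrative reconstruction rather than the author's Minitab procedure; it assumes the usual rank-adjustment rule, in which each failure's rank increment is $(n + 1 - \text{previous adjusted rank})/(1 + \text{reverse rank})$, and the function names are hypothetical.

```python
import math

# (cycles, is_failure) pairs from Table 4-8; the 663+ entry is a suspension.
records = [(21, True), (188, True), (342, True), (379, True), (488, True),
           (663, False), (768, True), (978, True), (1186, True),
           (1361, True), (1510, True)]
n = 16  # items placed on test


def hazard_points(records, n):
    """Return (t, adjusted rank, R_hat, H_hat) for each recorded failure."""
    points, prev_adj = [], 0.0
    for position, (t, is_failure) in enumerate(sorted(records), start=1):
        reverse_rank = n + 1 - position
        if not is_failure:
            continue                      # suspensions only shift later ranks
        adj = prev_adj + (n + 1 - prev_adj) / (1 + reverse_rank)
        R_hat = 1.0 - adj / (n + 1)       # mean rank formula
        points.append((t, adj, R_hat, -math.log(R_hat)))   # Eq. (4.32)
        prev_adj = adj
    return points


def slope_through_origin(points):
    """Least-squares slope of t on H_hat with no intercept (t = slope * H)."""
    num = sum(t * H for t, _, _, H in points)
    den = sum(H * H for _, _, _, H in points)
    return num / den


points = hazard_points(records, n)
mttf_from_plot = slope_through_origin(points)
print(round(mttf_from_plot, 1))
```

The slope estimates the MTTF directly because the regression is run with $t$ as the response, matching the inverse regression described in the analysis.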
FIGURE 4-17 Regression analysis of hazard plot data [Minitab® output].
A one-sided lower 95% confidence limit on the mean time-to-failure was calculated based on the use of Eq. (4.27):
$$\theta_L = \frac{2T}{\chi^2_{2r,\,0.05}} = \frac{2(15{,}435)}{31.4} = 983.1 \text{ cycles}$$
This means that we can assign a level of confidence, $C = 95\%$, that our MTTF is at least 983.1 cycles.
We also constructed a 95% one-sided, lower confidence limit on the B10 life, or $t_{0.90}$. From Eq. (4.29), the lower confidence limit on $t_{0.90}$ is given by
$$t_{0.90,L} = -\theta_L \ln R = -983.1 \ln 0.90 = 103.6 \text{ cycles}$$
Thus, we assign a confidence level of 0.95 that our B10 life, or $t_{0.90}$, exceeds 103.6 cycles.
4.8
THREE-PARAMETER WEIBULL
ML Estimation
The task of obtaining ML parameter estimates of the three-parameter Weibull distribution is quite challenging. The difficulties of ML estimation are discussed in Chapter 9, §9.1.5. In general, the ML equations may have two solutions or none (see Lawless, 1982, p. 192). Additionally, the parameter estimates for $\delta$-values in the neighborhood of $t_1$ are unstable, particularly for $\beta < 1$. For a wide range of conditions, the choice to set $\hat{\delta} = t_1$, or just slightly less than $t_1$, makes a lot of sense in that the likelihood function is quite flat around these values of $\delta$. Another way to handle ML estimation is to find conditional ML estimates of $\theta$ and $\beta$ for a range of $\delta$-values, and then to choose the $\delta$-value that results in the highest overall value of the likelihood function (equivalently, the lowest value of the negative log-likelihood). This procedure is also described in Chapter 7.
Parameter estimation can be readily conducted with the use of Weibull probability plotting techniques. We know that a practical limitation on $\delta$ is that it must not exceed the first-order statistic, $t = t_1$. Graphically, we determine the value of $\delta$ that provides the best linear fit to the data. Regression techniques can be used to find a suitable value for $\delta$, by which the data is scaled by subtracting its value from each time of failure. The resultant data set may then be fitted using conventional two-parameter Weibull techniques. This approach is illustrated in the following worked-out example.

TABLE 4-9 Grinding Wheel Life
Wheel number | Pieces per wheel | Adj. life data (pieces − 19,600)
1 | 22,000 | 2400
2 | 25,000 | 5400
3 | 30,000 | 10,400
4 | 33,000 | 13,400
5 | 35,000 | 15,400
6 | 52,000 | 32,400
7 | 63,000 | 43,400
8 | 104,000 | 84,400
Example 4.9:
Grinding wheel life (Kapur and Lamberson, 1977, p. 313)
A data set from Kapur and Lamberson (1977) on grinding wheel life in pieces is presented in Table 4-9. A Weibull plot of the median ranks is displayed as the uncorrected plot in Figure 4-18. Note the curvature of the fit. To correct for this, the data was rescaled using a minimum life, $\hat{\delta}$, of 19,600 pieces. This value was chosen to be roughly 90% of $t_1 = 22{,}000$ pieces. In Figure 4-18 the rescaled data is plotted on Weibull paper. Note the improved fit! The estimated parameters from WinSmith™ are as follows:
$\hat{\theta} = 24{,}500$ pieces.
$\hat{\beta} = 0.84$.
$\hat{\delta} = 19{,}600$ pieces.

FIGURE 4-18 Weibull plots of grinding wheel life data both before and after scaling data by $\hat{\delta} = 19{,}600$ pieces.
FIGURE 4-19 Use of Microsoft Excel® to estimate $\hat{\delta}$ using MSE regression criteria. Note use of "slope" and "intercept" functions with array formula to develop rank regression estimates.
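The spreadsheet search summarized in Figure 4-19 can also be sketched in code: for each trial $\delta$ in $[0, t_1)$, regress $\ln(t_i - \delta)$ on $\ln(-\ln(1 - \hat{F}(t_i)))$ and keep the $\delta$ with the smallest MSE, as defined in Eq. (4.33). The sketch below is illustrative only. It assumes Benard's approximation, $(i - 0.3)/(n + 0.4)$, for the median ranks and a search step of 100 pieces; neither choice is stated in the original, so the numerical results need not reproduce those of Figure 4-19 exactly. The function names are hypothetical.

```python
import math

# Grinding wheel life (pieces per wheel), Table 4-9.
pieces = [22000, 25000, 30000, 33000, 35000, 52000, 63000, 104000]


def fit_for_delta(times, delta):
    """Inverse rank regression of y = ln(t - delta) on x = ln(-ln(1 - F)).

    Returns (MSE, theta_hat, beta_hat). Median ranks use Benard's
    approximation, which is an assumption of this sketch.
    """
    ordered = sorted(times)
    n = len(ordered)
    xs = [math.log(-math.log(1.0 - (i - 0.3) / (n + 0.4)))
          for i in range(1, n + 1)]
    ys = [math.log(t - delta) for t in ordered]
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx                       # estimates 1/beta
    intercept = y_bar - slope * x_bar       # estimates ln(theta)
    sse = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    return sse / (n - 2), math.exp(intercept), 1.0 / slope


def scan_delta(times, step=100):
    """Grid search over delta in [0, t1) for the smallest MSE."""
    best = None
    for delta in range(0, min(times), step):
        mse, theta, beta = fit_for_delta(times, delta)
        if best is None or mse < best[0]:
            best = (mse, delta, theta, beta)
    return best


best_mse, best_delta, theta_hat, beta_hat = scan_delta(pieces)
print(best_delta, round(best_mse, 4), round(theta_hat), round(beta_hat, 3))
```

A grid search is used instead of a numerical optimizer because the MSE curve is cheap to evaluate and, as in Figure 4-19, plotting MSE against $\delta$ is itself informative.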
A second estimate of $\delta$ was obtained using an inverse regression approach: $y = \ln(t - \delta)$ was regressed upon $x = \ln(-\ln(1 - \hat{F}(t)))$. Excel was used to evaluate the fit for a range of $\delta$-values in $[0, 22{,}000)$. The mean square error (MSE) was used to evaluate the fit. It is a smaller-the-better characteristic. The MSE was calculated from the sum of the squared differences between actual and predicted values of $\ln(t_i - \delta)$:
$$\text{MSE} = \frac{\sum_{i=1}^{n} \left\{ \ln(t_i - \delta) - \ln\hat{\theta} - \hat{\beta}^{-1} \ln\left(-\ln(1 - \hat{F}(t_i))\right) \right\}^2}{n - 2} \tag{4.33}$$
The Excel spreadsheet with accompanying Cartesian plot of MSE versus $\delta$ is presented in Figure 4-19. The minimum value of the MSE statistic, MSE = 0.2052, occurred at $\hat{\delta} = 20{,}500$ pieces. The inverse rank regression estimates of the three Weibull parameters were identified as
$\hat{\theta} = 25{,}079$ pieces.
$\hat{\beta} = 0.860$.
$\hat{\delta} = 20{,}500$ pieces.
It is also interesting to identify the conditional ML estimates of $\theta$ and $\beta$ given $\delta = 20{,}500$. They are
$\hat{\theta} = 24{,}597$ pieces.
$\hat{\beta} = 0.9653$.
The range of reported estimates of $\beta$ and $\delta$ must perplex the reader. This is not unexpected for several reasons:
1. Graphical "best straight-line fits" are subjective in nature. In particular, the estimate of the slope, $\beta$, is very difficult.
2. Regression fits differ greatly from ML estimates. Regression fits are greatly influenced by values in the lower tail region, as explained in §4.2.
3. ML estimates tend to be biased (see Abernethy, 1996).
4.9
EXERCISES
1. Use Excel to generate $n = 30$ Weibull data values for $\beta = 1.25$ and $\theta = 10{,}000$ hr. Failures truncate at $r = 26$. To do this, do the following:
   a. Use the rand() function to generate 30 random uniform values ($U$) in the interval (0, 1).
   b. Use the inverse Weibull function, $t = \theta\,(-\ln(1 - U))^{1/\beta}$.
   c. Replace $t_{27}$, $t_{28}$, $t_{29}$, and $t_{30}$ by $t_{26}$.
2. Develop ML estimates for the data set of Exercise 1.
   a. Create a Weibull plot of the data.
   b. Use Minitab or Reliasoft Weibull++ software or a program of your choice to generate point estimates and 95% confidence limits on $\theta$, $\beta$, and the B10 ($t_{0.90}$) life.
3. We are given the following life data:
   Time 40 43 50 110 150
   suspended
   Test the fit against a lognormal distribution by doing the following:
   a. Plot the logged values on normal probability plotting paper.
   b. Estimate the $\sigma$ and $t_{med}$ parameters of the lognormal distribution.
   c. Estimate B10 life.
   d. Generate inverse rank regression values: $x = \Phi^{-1}(\hat{F}(t))$ and $y = \ln(t)$.
   e. Use simple linear regression to develop point estimates of $t_{med}$ and $\sigma$.
4. Given the following exponential life data set for $n = 10$ prototype items ("+" denotes a censored reading), estimate the following:
   63 313 752 951 1101 1179+ 1182 1328 1433 2776
   a. Assuming an exponential distribution, estimate $\theta$, the MTTF parameter.
   b. Develop a lower 90% confidence limit on $\theta$ and B10 life.
   c. Develop a lower 90% confidence limit on $R(1433)$.
5. Given the Minitab output below:
   1. Find $\hat{\theta}$ and $\hat{\beta}$.
   2. Find 95% confidence interval on $\beta$.
   3. Find 95% confidence interval on $t_{0.10}$.
Minitab output:

Distribution Analysis: Weibull
Variable: Weibull

Censoring Information    Count
Uncensored value         20

Estimation Method: Maximum Likelihood
Distribution: Weibull

Parameter Estimates
Parameter | Estimate | Standard Error | 95.0% Normal CI Lower | 95.0% Normal CI Upper
Shape | 0.8958 | 0.1590 | 0.6326 | 1.2686
Scale | 740.7 | 194.7 | 442.4 | 1240.1

Log-Likelihood = -152.996

Characteristics of Distribution
Characteristic | Estimate | Standard Error | 95.0% Normal CI Lower | 95.0% Normal CI Upper
Mean (MTTF) | 781.3562 | 195.4084 | 478.6004 | 1275.631
Standard Deviation | 873.8590 | 279.1305 | 467.2487 | 1634.311
Median | 492.0013 | 144.6176 | 276.5453 | 875.3188
First Quartile (Q1) | 184.3508 | 76.2000 | 81.9988 | 414.4602
Third Quartile (Q3) | 1066.596 | 266.9132 | 653.1134 | 1741.852
Interquartile Range (IQR) | 882.2450 | 224.1208 | 536.2342 | 1451.523

Table of Percentiles
Percent | Percentile | Standard Error | 95.0% Normal CI Lower | 95.0% Normal CI Upper
1 | 4.3604 | 4.4691 | 0.5849 | 32.5047
2 | 9.5063 | 8.4739 | 1.6568 | 54.5463
3 | 15.0334 | 12.2313 | 3.0515 | 74.0636
4 | 20.8458 | 15.8136 | 4.7130 | 92.2019
5 | 26.8978 | 19.2596 | 6.6104 | 109.4467
6 | 33.1624 | 22.5943 | 8.7238 | 126.0630
7 | 39.6224 | 25.8350 | 11.0391 | 142.2163
8 | 46.2661 | 28.9949 | 13.5461 | 158.0192
9 | 53.0851 | 32.0843 | 16.2373 | 173.5533
10 | 60.0738 | 35.1113 | 19.1066 | 188.8804
20 | 138.8314 | 63.0962 | 56.9682 | 338.3317
30 | 234.3477 | 89.1138 | 111.2193 | 493.7887
40 | 349.9454 | 115.4453 | 183.3129 | 668.0479
50 | 492.0013 | 144.6176 | 276.5453 | 875.3188
60 | 671.8435 | 180.6366 | 396.6505 | 1137.963
70 | 911.2558 | 231.2580 | 554.1442 | 1498.504
80 | 1259.957 | 314.8393 | 772.0699 | 2056.150
90 | 1879.245 | 494.1186 | 1122.466 | 3146.254
APPENDIX 4A
MONTE CARLO ESTIMATION
Monte Carlo (MC) simulation approaches are useful for approximating stochastic relationships when no known exact expression is available. MC methods are computer-intensive, relying on a stream of pseudo-random numbers to simulate realizations from a distribution. In our application we wish to devise confidence intervals on parameters and reliability metrics of interest based on multiply censored data sets. For many (log-)location-scale distributions, it is not possible to derive closed-form expressions for the sampling distribution of the maximum likelihood estimates of the parameters, $\theta$ and $\sigma$, in the presence of censoring. In such cases, the confidence intervals of interest must be approximated. Given the power of today's desktop computers, MC simulation is becoming a popular way to form this approximation. For example, the WinSmith Weibull analysis software now includes built-in capabilities for devising MC-based confidence intervals. Assuming a data set of size $n$ consisting of $r$ recorded failures, MC samples are drawn from a hypothetical (log-)location-scale distribution with parameters $\theta = \hat{\theta}$ and $\sigma = \hat{\sigma}$. This is a form of parametric bootstrap sampling (see Hjorth, 1994, §6.1, and Meeker and Escobar, 1998).
APPENDIX 4A.1
MONTE CARLO SIMULATION STRATEGY
A generalized procedure for Monte Carlo (MC) simulation is illustrated in Figure 4-20. An effective simulation strategy must mimic as closely as possible the underlying process that leads to an evolution of failures and censored observations. A specialized strategy is required for each type of censoring scenario. Abernethy (1996) accomplishes this rather effectively, but crudely, by omitting all random suspension items from the data and then treating the resultant data subset as a complete data set. The resultant subset is fit to the distribution of interest. After each MC sample, the suspended readings are added back. A summary describing a comprehensive strategy for common censoring scenarios is presented in Figure 4-20.* If data is time-censored, then any MC times beyond the censoring time, $t^*$, are reset to $t^*$ and labeled as right-censored. Similarly, if data is failure-censored, then any MC times beyond $t_r$, the $r$th recorded failure, are reset to time $t_r$ and labeled as right-censored. If the data set contains a small number of random suspensions, then the method of Abernethy (1996) is recommended. Otherwise, if the data set contains a large number of suspended items, with $r \ll n$, then alternate strategies should be pursued. For competing failure modes, a censoring distribution might be fit to the censored values, and MC simulation used to generate values from it. Meeker and Escobar (1998, §4.13.3) provide an efficient algorithm for generating MC values in such cases.

* With careful thought, the simulation strategy outlined by Figure 4-20 can be applied to a wide variety of test plans. For example, with sudden-death testing, there are $k$ groups, each run until a single failure is recorded. In this case the simulation should be run as if there are independent, failure-censored samples. The data is then aggregated after the simulation into one larger group for analysis.

FIGURE 4-20 Monte Carlo simulation strategy.

The use of pivotal quantities is sometimes helpful to reduce the demands of Monte Carlo simulation. The latest release of WinSmith provides this capability. For the Weibull distribution, Lawless (1982, p. 147) recommends the use of the parameter-free, scaled quantities:
$$Z_1 = \hat{\beta}\,(\ln\hat{\theta} - \ln\theta), \quad Z_2 = \frac{\hat{\beta}}{\beta}, \quad \text{and} \quad Z_3 = \beta\,(\ln\hat{\theta} - \ln\theta)$$
For the (log-)normal distribution, Lawless (1982) recommends the use of the following scaled quantities:
$$Z_1 = \frac{\hat{\mu} - \mu}{\hat{\sigma}}, \quad Z_2 = \frac{\hat{\sigma}}{\sigma}, \quad \text{and} \quad Z_3 = \frac{\hat{\mu} - \mu}{\sigma}$$
A key advantage in working with pivotal quantities is that you are able to resample from the standard normal distribution. WinSmith software provides the option of conducting MC simulation on pivotal quantities. The theory of their use is outlined by Lawless (1974, 1982). The use of pivotal quantities also allows for direct approximation of the confidence limits.
APPENDIX 4A.2
WORKED-OUT EXAMPLES
Example 4-10:
Normal data set of Table 4-1
We now present the details for the development of MC confidence intervals on the bearing life normal data set of Table 4-1. The ML estimates of $\mu$ and $\sigma$ were $\hat{\mu} = 85.564$ and $\hat{\sigma} = 8.714$ (see Figure 4-9). A Monte Carlo simulation was conducted using $\mu = 85.478$ and $\sigma = 8.713$. According to the procedures outlined by Figure 4-20 and Table 4-10, we repeatedly simulated observations from an $N(85.564, 8.714^2)$ distribution based on a sample of $n = 15$ points. An Exec Macro in Minitab was used to accomplish this.* Any realization that exceeded $t = 90$ was truncated to its time-censored value of 90. The Minitab Stat > Reliability/Survival > Parametric Right Censored procedure was run on each simulated sample to derive ML estimates of $\mu$, $\sigma$, and $t_{0.90}$. The macro was executed 1000 times, and descriptive statistics on the 1000 realizations of $\hat{\mu}$, $\hat{\sigma}$, and $\hat{t}_{0.90}$ were generated. The Exec Macro is shown in Figure 4-21. Descriptive statistics on the 1000 ordered parameter estimates were used to generate MC-based percentile confidence limits on $\mu$ and $\sigma$. They are summarized in Table 4-11. Note the increased width of the MC intervals compared to the asymptotic ML intervals that were presented in Figure 4-9. In general, ML-based confidence intervals are less conservative than MC intervals.

* The Exec Macro procedure in Minitab constitutes a first-generation macro capability. It is a batch file of session commands. Minitab has extended this earlier capability immensely, adding formal local and global macro capabilities.

TABLE 4-10 Monte Carlo Simulation Strategy for Different Censoring Scenarios from $G(\hat{\theta}, \hat{\sigma})$ Distribution
Scenario | No. failures | Time-censored at $t = t^*$ ($c_1$ time-censored) | Failure-censored at $t = t_r$ ($c_1$ failure-censored) | Random censoring ($c_2$ removed early) | Simulation strategy
1 | $n$ | No | No | No | $n$ items from $G(\hat{\theta}, \hat{\sigma})$.
2 | $r$ | Yes | No | No | $n$ items from $G(\hat{\theta}, \hat{\sigma})$; truncate at $t = t^*$ and set censor flag accordingly.
3 | $r$ | No | Yes | No | $n$ items from $G(\hat{\theta}, \hat{\sigma})$; truncate all failure times beyond $t = t_r$, and set censor flag accordingly.
4 | $n - c_2$ | No | No | $c_2$ items | If $c_2$ is small ($\le 4$), keep $c_2$ observations and generate $n - c_2$ items from $G(\hat{\theta}, \hat{\sigma})$.
5 | $n - c_1 - c_2$ | Yes | No | $c_2$ items | If $c_2$ is small ($\le 4$), keep $c_2$ observations and generate $n - c_2$ items from $G(\hat{\theta}, \hat{\sigma})$; truncate at $t = t^*$, and set censor flag accordingly.
6 | $r$ | No | Yes | $c_2$ items | If $c_2$ is small ($\le 4$), keep $c_2$ observations and generate $n - c_2$ items from $G(\hat{\theta}, \hat{\sigma})$; truncate all failure times beyond $t = t_r$, and set censor flag accordingly.

FIGURE 4-21 Minitab Exec Macro to generate Monte Carlo interval estimates of parameters and reliability metrics associated with sample, time-censored data set.

TABLE 4-11 Monte Carlo Percentile Confidence and Median Values for $\mu$, $\sigma$, and $t_{0.90}$
Parameter | 5th percentile | Median | 95th percentile
$\mu$ | 79.737 | 85.551 | 89.964
$\sigma$ | 4.008 | 8.113 | 12.2844
$t_{0.90}$ | 66.552 | 74.980 | 79.993
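The truncation rules of Table 4-10 for singly censored samples (scenarios 2 and 3) can be sketched as follows. This is an illustrative sketch, not a translation of the Minitab Exec Macro of Figure 4-21: it shows only how one simulated sample is generated and flagged, using the normal parameters and censoring time of Example 4-10, and it leaves the censored-data ML fit to whatever software is used. The function names, the seed, and the choice of $r = 10$ in the failure-censored call are hypothetical.

```python
import random


def simulate_time_censored(n, mu, sigma, t_star, rng):
    """Scenario 2: draw n normal lifetimes and truncate at t* (type I)."""
    sample = []
    for _ in range(n):
        t = rng.gauss(mu, sigma)
        if t > t_star:
            sample.append((t_star, False))   # right-censored at t*
        else:
            sample.append((t, True))         # observed failure
    return sorted(sample)


def simulate_failure_censored(n, r, mu, sigma, rng):
    """Scenario 3: draw n lifetimes and truncate beyond the r-th failure."""
    draws = sorted(rng.gauss(mu, sigma) for _ in range(n))
    t_r = draws[r - 1]
    observed = [(t, True) for t in draws[:r]]
    censored = [(t_r, False)] * (n - r)      # survivors censored at t_r
    return observed + censored


rng = random.Random(2002)                    # fixed seed for repeatability
type1_sample = simulate_time_censored(15, 85.564, 8.714, 90.0, rng)
type2_sample = simulate_failure_censored(15, 10, 85.564, 8.714, rng)
print(sum(flag for _, flag in type1_sample), "failures in the type I sample")
```

In a full Monte Carlo run, each such sample would be passed to a censored-data ML routine, and the resulting estimates would be collected over many replicates to form percentile limits such as those in Table 4-11.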
Example 4-11:
Weibull data set of Table 4-2
The data set of Table 4-2 consists of $n = 20$ observations. It is assumed that early observations at $t = 30$ and $t = 35$ are suspended readings due to causes that are beyond the scope of the study. Additionally, there are two suspended readings at $t = 200$, the time at which the test was stopped. Therefore, we adopt a strategy outlined by Table 4-10 with $n = 20$, $c_1 = 2$, and $c_2 = 2$. There are $r = 16$ failures in all. From Figure 4-12 the ML estimates of $\theta$ and $\beta$ are
$\hat{\theta} = 132.8$K revs.
$\hat{\beta} = 1.785$.
To initialize the MC simulation, we first delete the two early suspensions from the data set and develop ML estimates of $\theta$ and $\beta$ for the reduced data set. They are
$\hat{\theta}(\text{reduced}) = 131.73$K revs.
$\hat{\beta}(\text{reduced}) = 1.751$.
A global macro in Minitab was developed to sample 1000 times from this hypothetical distribution.* It is presented in Figure 4-22. Eighteen observations were randomly simulated from a Weibull (131.73, 1.751) distribution. Any observations beyond $t = 200$K revs were treated as time-censored observations at $t = 200$K revs. The two suspensions at $t = 30$ and 35 were incorporated into each simulated sample to bring the sample size back up to $n = 20$. The Minitab Stat > Reliability/Survival > Parametric Right Censored procedure was called during each simulation run, and ML estimates of $\theta$, $\beta$, and $t_{0.90}$ were stored. The procedure was called 1000 times, and the 1000 realizations of $\hat{\theta}$, $\hat{\beta}$, and $\hat{t}_{0.90}$ were generated and stored. The output from the macro is summarized in Table 4-12.

* It should be noted that this macro is written in a full-featured macro language, which differs considerably in its capabilities from the Exec Macro illustrated by Figure 4-21. The Exec Macro is basically a "batch file" of Minitab session commands.
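The simulation described in this example can be sketched without Minitab. The code below is an illustrative reconstruction, not the global macro of Figure 4-22: it draws 18 Weibull(131.73, 1.751) lifetimes by the inverse transform, time-censors them at 200, adds back the two suspensions at 30 and 35, and fits each sample by solving the censored-data Weibull ML equation for $\beta$ by bisection. The function names, the seed, and the bisection bracket [0.01, 50] are assumptions of this sketch, and because the random number stream differs, the percentiles will not match Table 4-12 digit for digit.

```python
import math
import random


def weibull_mle(times, failed):
    """ML estimates (theta, beta) for right-censored Weibull data."""
    r = sum(failed)
    t_max = max(times)
    mean_log_fail = sum(math.log(t) for t, f in zip(times, failed) if f) / r

    def score(beta):
        # Profile ML equation in beta; times are scaled by t_max for stability.
        w = [(t / t_max) ** beta for t in times]
        num = sum(wi * math.log(t) for wi, t in zip(w, times))
        return num / sum(w) - 1.0 / beta - mean_log_fail

    lo, hi = 0.01, 50.0                  # assumed bracket for the root
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if score(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    theta = t_max * (sum((t / t_max) ** beta for t in times) / r) ** (1.0 / beta)
    return theta, beta


def draw_weibull(rng, theta, beta):
    """Inverse-transform draw: t = theta * (-ln(1 - U))^(1/beta)."""
    while True:
        u = rng.random()
        if u > 0.0:
            return theta * (-math.log(1.0 - u)) ** (1.0 / beta)


def percentiles(values):
    """Return the (5th, 50th, 95th) empirical percentiles."""
    s = sorted(values)
    pick = lambda p: s[min(len(s) - 1, int(p * len(s)))]
    return pick(0.05), pick(0.50), pick(0.95)


def monte_carlo(theta, beta, n_sim, t_star, early_suspensions, replicates, seed):
    rng = random.Random(seed)
    est = {"theta": [], "beta": [], "t0.90": []}
    for _ in range(replicates):
        times = list(early_suspensions)
        failed = [False] * len(early_suspensions)
        for _ in range(n_sim):
            t = draw_weibull(rng, theta, beta)
            times.append(min(t, t_star))      # time-censor at t*
            failed.append(t <= t_star)
        if sum(failed) < 2:
            continue                          # too few failures to fit
        th, be = weibull_mle(times, failed)
        est["theta"].append(th)
        est["beta"].append(be)
        est["t0.90"].append(th * (-math.log(0.90)) ** (1.0 / be))
    return {name: percentiles(vals) for name, vals in est.items()}


summary = monte_carlo(131.73, 1.751, n_sim=18, t_star=200.0,
                      early_suspensions=[30.0, 35.0], replicates=1000, seed=42)
for name, (p5, p50, p95) in summary.items():
    print(f"{name:6s} {p5:8.3f} {p50:8.3f} {p95:8.3f}")
```

The percentile limits printed by the sketch play the same role as the rows of Table 4-12; increasing `replicates` tightens the Monte Carlo noise in the limits at the cost of run time.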
FIGURE 4-22 Minitab global macro for simulating 1000 MC samples for Weibull data set of Table 4-2.
TABLE 4-12 Monte Carlo Percentile Confidence and Median Values for $\theta$, $\beta$, and $t_{0.90}$
Parameter | 5th percentile | Median | 95th percentile
$\theta$ | 103.46 | 133.22 | 1667.28
$\beta$ | 1.313 | 1.842 | 2.739
$t_{0.90}$ | 22.98 | 39.38 | 62.44
APPENDIX 4B
REFERENCE TABLES AND CHARTS
Ref 4-1 Normal plotting paper with standard Z units annotated on Y-axis.
Ref 4-2
Sample Weibull plotting paper. (From Ford, 1972.)
5 Distribution Fitting
5.1
INTRODUCTION
Upon surveying the distributions examined in this book, the reader is apt to seek answers to the question, Given a data set, what distribution model should one use to fit the data? To assist the analyst in this area, formal procedures for testing the goodness-of-fit of a particular distribution are introduced. However, the efficiency of such procedures is limited. As such, reliability professionals usually settle for the use of one or two distribution models that make sense for the kind of data they see in their industry. For example, reliability engineers who work with mechanical systems often make use of a Weibull distribution for modeling wearout phenomena. This is not exclusively true, as others, depending on their specific industry or company, make use of the lognormal distribution. On the other hand, reliability engineers who work in the electronics industry often rely on the exponential distribution for modeling time-to-failure phenomena. This has begun to change in certain segments of the electronics industry that have started to use distributions such as the Weibull for modeling time-dependent hazard rates.
5.2 GOODNESS-OF-FIT PROCEDURES
The "eye" works best for assessing goodness-of-fit!
To assist the analyst in this area, formal procedures for testing the goodness-of-fit of a particular distribution are described. Underlying every goodness-of-fit test is an implied hypothesis test of the form

$$H_0: F(t) = F_0(t) \quad \text{(null hypothesis)}$$
$$H_a: F(t) \ne F_0(t) \quad \text{(alternative hypothesis)}$$

It is quite common to conduct this test based on a single sample of size 3–10. This results in a relatively low statistical power of the test procedure, leading to high acceptance rates of the hypothesized distributions. In addition, with small sample sizes it may not be possible to test specifically for goodness-of-fit in the left-tail region, even though that is really the region of greatest interest. With this as a backdrop, we begin our overview of popular procedures for conducting goodness-of-fit assessment. In particular, we describe the Kolmogorov–Smirnov (K–S) test, based on comparisons with an empirical distribution, and a rank regression test, based on a correlation coefficient. It is important to note that the chi-squared goodness-of-fit procedure is not covered, as evidence seems to support the effectiveness of the K–S test procedure over classical chi-squared tests (see Kececioglu, 1994, Vol. 1, p. 729, and Lilliefors, 1967).

5.2.1 Goodness-of-Fit Tests Based on Differences Between Empirical Rank and Fitted Distributions
Definitions
Consider a set of ordered failure times, $t_1, t_2, \ldots, t_n$, with empirical distribution function (EDF)

$$F_n(t) = \frac{i}{n} \quad \text{for } t_i \le t < t_{i+1}, \quad i = 1, 2, \ldots, n-1,$$

and a location-scale distribution with known parameters, $a$ and $b$, $F_0(t; a, b)$ or just $F_0(t)$, with reliability $R_0(t) = 1 - F_0(t)$.
We consider a class of goodness-of-fit tests based on differences between these two distributions. The following statistics are under consideration (see Lawless, 1982, pp. 432–433, and Stephens, 1974):

1. Cramer–von Mises statistic (CM): an integrated sum of the squared differences over the hypothesized density function, $f_0(t)$:

$$W_n^2 = n \int_{-\infty}^{\infty} [F_n(t) - F_0(t)]^2 f_0(t)\,dt \tag{5.1}$$

2. Anderson–Darling (AD): an integrated conditional sum of the squared differences over the hypothesized density function, $f_0(t)$:

$$A_n^2 = n \int_{-\infty}^{\infty} \frac{[F_n(t) - F_0(t)]^2}{F_0(t) R_0(t)} f_0(t)\,dt \tag{5.2}$$

3. Kolmogorov–Smirnov (K–S): maximum absolute difference between the distribution functions:

$$D_n = \sup_t |F_n(t) - F_0(t)| \tag{5.3}$$
Implementation of EDF Test Procedures
Figure 5-1 illustrates the relationship between the empirical distribution function, $F_n(t)$, and the hypothesized distribution, $F_0(t)$. $F_n(t) = i/n$ is a step function with jumps at the order statistics, $t_i$, $i = 1, 2, \ldots, n$. $F_0(t)$ is a known function. Note that the K–S statistics, $D_i^-$ and $D_i^+$, are also shown [see Eq. (5.6)]. Based on this perspective, the test statistics for CM, AD, and K–S goodness-of-fit tests reduce to the following distribution-free expressions (see Lawless, 1982, p. 433):

FIGURE 5-1 Construction of empirical and hypothesized distributions and K–S statistics.

1. Cramer–von Mises statistic (CM):

$$W_n^2 = \sum_{i=1}^{n} \left[ F_0(t_i) - \frac{i - 0.5}{n} \right]^2 + \frac{1}{12n} \tag{5.4}$$

2. Anderson–Darling (AD):

$$A_n^2 = -\sum_{i=1}^{n} \frac{2i - 1}{n} \ln\{F_0(t_i)\, R_0(t_{n+1-i})\} - n \tag{5.5}$$

The AD test is popular for assessing normality. It can be seen to be a weighted sum of the squared differences between the plot points and the best linear fit, placing greater weight in the tails of the distribution.

3. Kolmogorov–Smirnov (K–S):

$$D_n = \max\{D_n^+, D_n^-\} \tag{5.6}$$

where

$$D_n^+ = \max_{1 \le i \le n} \left[ \frac{i}{n} - F_0(t_i) \right], \qquad D_n^- = \max_{1 \le i \le n} \left[ F_0(t_i) - \frac{i-1}{n} \right]$$
The exact distribution of the K–S test statistic is known for all n and is tabulated in Ref. 5-1 of Appendix 5A, under an assumption that the parameters of the distribution are known. This is not the case for the CM and AD statistics; only asymptotic distribution expressions have been worked out. Monte Carlo simulation is generally used to construct tables of percentage points for these distributions under $H_0$ for small sample sizes (Lawless, 1982, p. 433).

Caution. Lawless (1982, p. 434) cautions that "all of these test procedures are of limited value because of the requirement that $F_0(t)$ must be completely specified." That is, the parameters of $F_0(t)$ must be known, which is typically not the case. When the parameters are instead estimated from the data, the distribution of the test statistics becomes much more complex, and so Monte Carlo simulation is commonly used to estimate percentiles of the test statistics under $H_0$. Lilliefors (1967, 1969) has generated specific K–S tables for the normal and exponential distributions based on estimates of the parameters of the distributions. Chandra et al. (1981) and Woodruff et al. (1983) have developed similar resources for the Weibull distribution. These tables are reproduced from Kececioglu (1994, Vol. 1).

For singly right-censored data sets, the test statistics should include only those differences between $F_0(t)$ and $F_n(t)$ over $t \le t^*$, the time a test is stopped. Under random censoring, several modifications have been proposed. This author prefers the method suggested by Michael and Schucany (1979), wherein Johnson's adjusted rank is substituted for the order, $i$, in the test statistic formulas. In all cases Monte Carlo simulation methods had to be used to develop tables of percentage points of the test statistics. The reader should consult Guilbaud (1988) and Lawless (1982) for more information on this topic.
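The role of Monte Carlo simulation in building such tables can be illustrated for the simplest case, in which the parameters of $F_0(t)$ are known. Under $H_0$ the transformed values $F_0(t_i)$ are uniformly distributed on (0, 1), so only uniform random numbers are needed. The following is a minimal Python sketch; the function names, the number of runs, and the seed are illustrative assumptions. It does not cover the estimated-parameter case, which would require re-estimating the parameters inside every simulated sample.

```python
import random


def ks_statistic_uniform(u):
    """K-S statistic D_n of Eq. (5.6) for values u_i = F0(t_i)."""
    u = sorted(u)
    n = len(u)
    d_plus = max((i + 1) / n - ui for i, ui in enumerate(u))
    d_minus = max(ui - i / n for i, ui in enumerate(u))
    return max(d_plus, d_minus)


def ks_critical_value(n, alpha, runs=20000, seed=0):
    """Monte Carlo estimate of the upper-alpha point of D_n under H0."""
    rng = random.Random(seed)
    stats = sorted(
        ks_statistic_uniform([rng.random() for _ in range(n)])
        for _ in range(runs)
    )
    return stats[int((1.0 - alpha) * runs)]


if __name__ == "__main__":
    # Should land close to the tabulated value for n = 12, alpha = 0.10
    print(round(ks_critical_value(12, 0.10), 3))
```

With enough runs the estimate should approach the corresponding entry of Ref. 5-1 in Appendix 5A; the simulation error shrinks roughly with the square root of the number of runs.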
Example 5-1: Use of K–S test
$N = 12$ data points were simulated from a Weibull distribution with $\beta = 1.2$ and $\theta = 500$. The simulated data set appears in Figure 5-2 along with a Weibull plot of the data. The Weibull plot reveals that $t_1$ lies outside the confidence bands, and so the goodness-of-fit must be in doubt. Apply the K–S test for the data set in the figure. The underlying test is of the form

$H_0$: Failure data follow a Weibull distribution with $\theta = 500$, $\beta = 1.2$.
$H_a$: Failure data do not follow a Weibull distribution with $\theta = 500$, $\beta = 1.2$.

Use $\alpha = 0.10$, $n = 12$. The K–S calculations are summarized in Table 5-1. Since $D_{12} = 0.383 > D_{12,0.10} = 0.338$ (from Ref. 5-1 of Appendix 5A), we reject $H_0$ and conclude that the data are not from a Weibull (500, 1.2) distribution. This finding is consistent with the visual assessment of the fit based on the Weibull plot. Of course, we know that the data are from a Weibull distribution, but due to sampling variation, we ended up choosing a sample that does not appear to fit very well. (Is this a type I error?)
FIGURE 5-2 K–S test data set for Weibull distribution (n = 12), Minitab® V13.
Table 5-1 Summary of K–S Test Calculations for Figure 5-2 Data Set

Rank   Time   (i-1)/n (A)   i/n (B)   F0(t; 500, 1.2) (C)   D_n^- (C-A)   D_n^+ (B-C)
1      49     0.0000        0.0833    0.0597                0.0597         0.0236
2      216    0.0833        0.1667    0.3060                0.2226        -0.1393
3      404    0.1667        0.2500    0.5390                0.3723        -0.2890
4      501    0.2500        0.3333    0.6330                0.3830        -0.2997
5      564    0.3333        0.4167    0.6851                0.3518        -0.2684
6      597    0.4167        0.5000    0.7098                0.2931        -0.2098
7      689    0.5000        0.5833    0.7699                0.2699        -0.1866
8      703    0.5833        0.6667    0.7780                0.1947        -0.1114
9      762    0.6667        0.7500    0.8095                0.1428        -0.0595
10     803    0.7500        0.8333    0.8289                0.0789         0.0044
11     973    0.8333        0.9167    0.8917                0.0584         0.0249
12     1466   0.9167        1.0000    0.9736                0.0570         0.0264
Largest value:                                              0.3830         0.0264
$D_{12} = 0.383$

Note: $F_0(t; 500, 1.2) = 1 - \exp\left(-(t/500)^{1.2}\right)$
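The calculations of Table 5-1 can be checked with a few lines of Python. The sketch below evaluates the computational forms of Eqs. (5.4) to (5.6) for the Figure 5-2 data under the hypothesized Weibull (500, 1.2) distribution; the function names and the dictionary keys are illustrative assumptions. The CM and AD values are produced for completeness only, since the text tabulates critical values for the K–S statistic alone.

```python
import math

# Failure times of the Figure 5-2 data set (Table 5-1)
times = [49, 216, 404, 501, 564, 597, 689, 703, 762, 803, 973, 1466]


def weibull_cdf(t, theta=500.0, beta=1.2):
    """Hypothesized distribution F0(t) = 1 - exp(-(t/theta)^beta)."""
    return 1.0 - math.exp(-((t / theta) ** beta))


def edf_statistics(sample, cdf):
    """K-S, Cramer-von Mises, and Anderson-Darling statistics."""
    t = sorted(sample)
    n = len(t)
    f = [cdf(x) for x in t]
    # zero-based index j corresponds to rank i = j + 1
    d_plus = max((j + 1) / n - f[j] for j in range(n))
    d_minus = max(f[j] - j / n for j in range(n))
    w2 = sum((f[j] - (j + 0.5) / n) ** 2 for j in range(n)) + 1.0 / (12 * n)
    a2 = -sum(
        (2 * (j + 1) - 1) / n * math.log(f[j] * (1.0 - f[n - 1 - j]))
        for j in range(n)
    ) - n
    return {"D_plus": d_plus, "D_minus": d_minus,
            "D": max(d_plus, d_minus), "W2": w2, "A2": a2}


if __name__ == "__main__":
    for name, value in edf_statistics(times, weibull_cdf).items():
        print(f"{name:8s} {value:.4f}")
```

The K–S value returned for this data set agrees with the $D_{12} = 0.383$ reported in Table 5-1, which is then compared with the tabulated critical value of 0.338.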
Technical Note: Alternative K–S Test Statistic Procedure
In some textbooks one might come across an alternative formulation of the K–S test statistic, one that involves absolute differences as follows:

$$D_n = \max\{D_n^+, D_n^-\} \tag{5.7}$$

where

$$D_n^+ = \max_{1 \le i \le n} \left| \frac{i}{n} - F_0(t_i) \right|, \qquad D_n^- = \max_{1 \le i \le n} \left| F_0(t_i) - \frac{i-1}{n} \right|$$

Is the use of Eq. (5.7) incorrect? No; it leads to the same value of $D_n$ as Eq. (5.6). To see this, consider the conditions under which a difference might be negative. This can occur only if the fit is so weak that both empirical estimators, $(i-1)/n$ and $i/n$, lie either above or below the fitted relationship described by $F_0(t_i)$. In the latter case, it is evident that the difference $i/n - F_0(t_i)$ will be negative because $F_0(t_i)$ will exceed $i/n$. In this case not only will the difference $F_0(t_i) - (i-1)/n$ remain positive, but it also will be greater than the absolute value of the negative difference. And so, only the positive differenced quantity is important. The same argument holds for the former case, when both EDFs lie above $F_0(t_i)$.
5.2.2 Rank Regression Tests
The idea of using the correlation coefficient associated with a rank regression to assess the goodness-of-fit was suggested as far back as the early 1970s by Filliben (1974). Tarum (1999) conducts Monte Carlo studies on the empirical distribution of the correlation coefficient, $R$, or coefficient of determination, $R^2$. Given a simple linear regression fit of the form $\hat{y} = b_0 + b_1 X$ (see Neter et al., 1990, p. 100),

$$R^2 = \frac{b_1^2 \sum_{i=1}^{n} (X_i - \bar{X})^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \tag{5.8}$$
Tarum's (1999) studies are based on 1000 Monte Carlo simulations from (a) a two-parameter Weibull distribution with $\beta = 3$ and $\theta = 10{,}000$, (b) a three-parameter Weibull with $\delta = 300$, and (c) a lognormal distribution with $\mu = 100$ and $\sigma = 10$, for 56 values of $n$ ranging from $n = 3$ to 1000. Bernard's rank estimator, $\hat{F}(t_i) = (i - 0.3)/(n + 0.4)$, is used. Additionally, to test the effect when the Weibull parameters, $\theta$ and $\beta$, are varied, additional Monte Carlo runs are conducted at $n = 20$ for a range of $\beta$-values from 0.4 to 5.5 and for $\theta$-values ranging from 10 to 100. Monte Carlo results reveal very low sensitivity of the 5th and 10th percentiles of $R^2$ to changes in $\theta$ and $\beta$. A Weibull fit of the 5th and 10th percentiles of $R^2$ is made as a function of sample size, $n$. (The Weibull is a very flexible distribution form for modeling a wide range of distribution shapes!) A best Weibull fit of these percentile values is presented in Figure 5-3 (from Abernethy, 1998). The critical values are plotted as a function of the number of recorded failures, $r$, and two confidence levels, $C = 90\%$ and $95\%$, for two- and three-parameter Weibull and normal distributions. Note that the tendency to accept $H_0$ (the hypothesized distribution under test) increases with increasing $R$ or $R^2$.
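The kind of Monte Carlo study described above can be sketched in Python for the two-parameter Weibull case with complete samples. This is a minimal sketch under stated assumptions, not Tarum's actual procedure: the function names, the choice to regress the transformed Bernard ranks on log time, the run count, and the seed are illustrative, and no attempt is made to reproduce the curves of Figure 5-3.

```python
import math
import random


def bernard_ranks(n):
    """Bernard's rank estimator (i - 0.3)/(n + 0.4), i = 1..n."""
    return [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]


def weibull_r_squared(times):
    """R^2 of Eq. (5.8) for a Weibull rank regression, complete sample."""
    t = sorted(times)
    n = len(t)
    x = [math.log(ti) for ti in t]
    y = [math.log(-math.log(1.0 - f)) for f in bernard_ranks(n)]
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx  # least-squares slope
    return b1 ** 2 * sxx / syy


def r_squared_percentile(n, pct, runs=1000, beta=3.0, theta=10000.0, seed=0):
    """Empirical percentile of R^2 when the data really are Weibull."""
    rng = random.Random(seed)
    values = sorted(
        weibull_r_squared([rng.weibullvariate(theta, beta) for _ in range(n)])
        for _ in range(runs)
    )
    return values[int(pct * runs)]


if __name__ == "__main__":
    for n in (5, 10, 20, 50):
        print(n, round(r_squared_percentile(n, 0.10), 3))
```

An observed $R^2$ that falls below the simulated lower percentile for the same sample size would cast doubt on the hypothesized distribution, which is how the critical values in Figure 5-3 are used.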
Example 5-2: Use of rank regression goodness-of-fit tests
For the normal data set of Table 4-1, we developed rank regression estimates of $\mu$ and $\sigma$, and an adjusted $R^2$ of 94.4% is reported in Figure 4-2 based on $r = 11$ failures and $n = 15$. From Figure 5-3 we see that, for $r = 11$ failures, the critical value of $R^2$ (right-hand ordinate scale) at the 95% level of confidence for normal data is approximately $R^2_{0.95,\text{normal}} \approx 0.86$. Since $R^2 = 0.944 > R^2_{0.95,\text{normal}} \approx 0.86$, we accept $H_0$ and conclude that the normal fit is adequate.

FIGURE 5-3 Critical values of correlation coefficient, $R$, and coefficient of determination, $R^2$ (from Abernethy, 1996).
5.2.3 Other Goodness-of-Fit Tests
Mann's Test for the Weibull Distribution
Mann's test statistic (Mann et al., 1974) for assessing Weibull goodness-of-fit is calculated as follows:

$$M = \frac{k_1 \sum_{i=k_1+1}^{r-1} \ell_i}{k_2 \sum_{i=1}^{k_1} \ell_i}$$

where

$r$ = number of recorded failures,
$k_1 = \lfloor r/2 \rfloor$ and $k_2 = \lfloor (r-1)/2 \rfloor$, where $\lfloor x \rfloor$ is the integer portion of $x$,
$\ell_i = \dfrac{1}{M_i} \ln\left( \dfrac{t_{i+1}}{t_i} \right)$,
$M_i = Z_{i+1} - Z_i$, where $Z_i = \ln\left[ -\ln\left( 1 - \dfrac{i - 0.5}{n + 0.25} \right) \right]$.

The test is applicable for complete or type II censored data sets. For $n \ge 20$, we reject $H_0: F(t) = F_0(t)$ if $M > F_{\alpha, 2k_2, 2k_1}$, an approximation. For small samples, Mann et al. (1973, 1974) provide tabulated values of percentiles of the test statistic. In a limited study Mann et al. (1974) show that the Mann test is somewhat more powerful than EDF test procedures such as CM and AD (and K–S). Table 5-2 here shows a worksheet for developing Mann's statistic.

Example 5-3: Use of Mann's test for Weibull data in Figure 5-2 data set

$r = 12$; $k_1 = 6$; $k_2 = 5$;

$$M = \frac{k_1 \times \text{num. sum}}{k_2 \times \text{denom. sum}} = \frac{6 \times 2.5368}{5 \times 4.1766} = 0.729.$$

Summary Statistics: $M = 0.729$; $F_{0.10,\,2(5),\,2(6)} = F_{0.10,10,12} = 2.19$.

Table 5-2 Worksheet for Developing Mann's Statistic

i    t_i    ln t_i   Z_i       M_i      ln(t_{i+1}/t_i)   l_i
1    49     3.9      -3.1779   1.1424   1.4835            1.2985
2    216    5.4      -2.0355   0.5582   0.6261            1.1217
3    404    6.0      -1.4773   0.3880   0.2152            0.5546
4    501    6.2      -1.0892   0.3080   0.1184            0.3846
5    564    6.3      -0.7813   0.2637   0.0569            0.2156
6    597    6.4      -0.5175   0.2383   0.1433            0.6015
7    689    6.5      -0.2793   0.2252   0.0201            0.0893
8    703    6.6      -0.0541   0.2228   0.0806            0.3618
9    762    6.6       0.1687   0.2327   0.0524            0.2252
10   803    6.7       0.4014   0.2643   0.1920            0.7265
11   973    6.9       0.6657   0.3615   0.4099            1.1340
12   1466   7.3       1.0272

denominator sum (i = 1 to 6) = 4.1766
numerator sum (i = 7 to 11) = 2.5368
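The worksheet of Table 5-2 can be reproduced with a short Python sketch. The function name `mann_statistic` and its optional `n` argument (the total sample size, for type II censored data) are illustrative assumptions; the critical value is not computed here but taken from the summary statistics above.

```python
import math

# Failure times of the Figure 5-2 data set (complete sample, r = n = 12)
times = [49, 216, 404, 501, 564, 597, 689, 703, 762, 803, 973, 1466]


def mann_statistic(failure_times, n=None):
    """Mann's statistic M for Weibull goodness-of-fit.

    failure_times: the r recorded failure times.
    n: total sample size; defaults to r (complete sample).
    """
    t = sorted(failure_times)
    r = len(t)
    n = r if n is None else n
    k1 = r // 2
    k2 = (r - 1) // 2
    z = [math.log(-math.log(1.0 - (i - 0.5) / (n + 0.25)))
         for i in range(1, r + 1)]
    # ell[j] holds l_i for i = j + 1, with i running from 1 to r - 1
    ell = [math.log(t[j + 1] / t[j]) / (z[j + 1] - z[j])
           for j in range(r - 1)]
    numerator = sum(ell[k1:])     # i = k1 + 1, ..., r - 1
    denominator = sum(ell[:k1])   # i = 1, ..., k1
    return k1 * numerator / (k2 * denominator)


if __name__ == "__main__":
    m = mann_statistic(times)
    f_crit = 2.19  # F(0.10; 10, 12) as quoted in the summary statistics
    print(f"M = {m:.3f}; reject H0: {m > f_crit}")
```

For a type II censored sample, passing the full sample size as `n` changes only the $Z_i$ plotting positions; the sums still run over the recorded failures.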
At $\alpha = 0.10$, we do not reject $H_0$, because $M = 0.729 < F_{0.10,10,12} = 2.19$, and so we accept the null hypothesis that the Weibull is an adequate distribution model. Note that we used the F-distribution approximation in arriving at this decision. However, in this case the evidence is quite strong, despite the fact that an exact percentile of the Mann statistic is not used. To confirm this finding, we ran a rank regression test on the data. We used the built-in capability of Minitab to develop Weibull least-squares estimates. The Weibull plot is presented in Figure 5-4. Note a correlation coefficient $R$ of 0.944, which is high enough to judge the Weibull fit to be adequate.

FIGURE 5-4 Rank regression fit of Weibull data (Minitab® V13).

Shapiro and Wilk (1965) Test for Normality
The Shapiro–Wilk test for normality for complete samples is based on the test statistic

$$S^2 = \frac{(a_1 t_1 + a_2 t_2 + \cdots + a_n t_n)^2}{\sum_{i=1}^{n} (t_i - \bar{t})^2} \tag{5.9}$$

The $a_i$, $i = 1, 2, \ldots, n$, are constants that are tabulated and reported along with percentiles of the test statistic. Shapiro et al. (1968) report that this test is more powerful than the collection of empirical distribution function (EDF) tests presented earlier.
5.3 EXERCISES

1. Consider the multiply censored data set that is time-censored at 40 hundred-hr:

T: 24 25 25 28+ 29 30 31 32 35 36 37 36 40+

Test the goodness-of-fit of this data against the Weibull, normal, and lognormal distributions using the K–S test procedure.

2. A life test is stopped at $t = 700$ hr. With $n = 4$, the recorded observations are as follows:

175 500 650 700

The Minitab output for this data set is shown in Figure 5-5. Use this information to develop a goodness-of-fit assessment in two ways.
a. Use the $R$ or $R^2$ test to assess a two-parameter Weibull fit.
b. Use the K–S test to assess adequacy of fit.

3. The following singly censored data set is for a sample of size $n = 8$. The last reading is a censored observation due to the fact that the test was stopped at $t = 1500$ hr.

440 535 676 781 868 953 1225 1500+

Figure 5-6 shows the output from Minitab V13, which was used to generate probability plots based on a least-squares fit. The correlation coefficients are shown for each distribution. Use this information to comment on the goodness-of-fit for each of these distributions.
FIGURE 5-5 Minitab® V13 output for the n = 4 data set.
FIGURE 5-6 ID-plot of life data for four distributions: Weibull, lognormal, exponential, and normal (from Minitab® V13).
APPENDIX 5A
Ref. 5-1 Critical Values of the Kolmogorov–Smirnov (K–S) Test Statistic (Parameters Known)

n      alpha = 0.20   alpha = 0.10   alpha = 0.05   alpha = 0.01
1      0.900          0.950          0.975          0.995
2      0.683          0.776          0.842          0.930
3      0.565          0.636          0.708          0.829
4      0.493          0.565          0.624          0.733
5      0.447          0.509          0.563          0.668
6      0.410          0.468          0.519          0.617
7      0.381          0.436          0.483          0.576
8      0.358          0.409          0.454          0.542
9      0.339          0.388          0.430          0.513
10     0.323          0.369          0.409          0.489
11     0.308          0.352          0.391          0.468
12     0.296          0.338          0.376          0.449
13     0.285          0.325          0.361          0.433
14     0.275          0.314          0.349          0.418
15     0.266          0.304          0.338          0.404
16     0.258          0.295          0.327          0.392
17     0.250          0.286          0.318          0.381
18     0.243          0.278          0.309          0.370
19     0.237          0.271          0.302          0.361
20     0.232          0.265          0.294          0.352
21     0.226          0.259          0.287          0.345
22     0.221          0.253          0.281          0.337
23     0.217          0.248          0.275          0.330
24     0.212          0.242          0.269          0.323
25     0.208          0.237          0.264          0.317
30     0.190          0.218          0.242          0.290
35     0.176          0.202          0.224          0.269
40     0.165          0.189          0.210          0.253
45     0.156          0.179          0.199          0.238
50     0.148          0.170          0.189          0.227
>50    1.07/sqrt(n)   1.22/sqrt(n)   1.36/sqrt(n)   1.63/sqrt(n)
6 Test Sample-Size Determination
Reliability demonstration of very complex systems can be quite costly. Concept or design prototypes are often built by hand and then tested until a catastrophic failure event occurs. Can organizations justify the expenses associated with design verification testing at the systems level? Can they afford not to? The economics are such that companies are often unwilling to spare more than two or three units for testing. Is this a correct strategy? Can reliability be demonstrated with a specified confidence level when sample sizes are so small? Is it possible to demonstrate reliability with a small sample, but by testing the small sample to two or three times the design life? These questions need to be answered.

In formulating design and/or product verification (DV/PV) plans, engineers often ask, "How many items do I need to put on test to demonstrate that my product is reliable?" Once that question is answered, they often ask the follow-up question, "Is there any way that I can justify a lower sample-size requirement?" Having worked for tier-one and tier-two automotive suppliers, this author believes that such questions come up over and over again. They are also the questions that today's graduate engineers are least trained to answer when they come out of college. Most DV/PV tests are pass–fail tests. Exacting test requirements are specified in the design verification, planning, and reporting (DVP&R) document.* Sample sizes, test conditions, etc. are carefully spelled out along with very specific acceptance standards for passing the test, and a specification of the number of items that must pass each test.

*DVP&R is a common term used in automotive-related reliability testing. Those who work in other industries are probably not familiar with this expression. "Design verification" is a phrase with which most engineers are familiar.

The cost to run a test is of greatest importance to design and test organizations. Prototypes of complex products can be extremely expensive to build. For example, in 2002 U.S. dollars, the cost to build up an automotive cockpit can run close to $350,000 per prototype. Even a subsystem as simple as a window glass unit can run $30,000–$40,000 per prototype. Electronic circuit boards can run $5,000–$15,000 per prototype. And these costs do not include the costs to build or modify test equipment and custom test fixturing units! Accordingly, the sample sizes and test durations that are required are often much greater than organizations can justify, given the time and investment that testing demands. To reduce test resources, we describe the use and advantage of extended bogey testing, in which units are tested to a multiple of test life, and tails testing, for reducing sample requirements.

A technical overview of the chapter's coverage now follows: This chapter describes the use of success and success–failure tests for reliability validation. Nonparametric approaches are based on the binomial distribution; parametric approaches under a Weibayes distribution (a Weibull distribution with the added assumption that the shape parameter, β, is known) are surveyed. In the former case we refer to the use of the popular Clopper–Pearson (1934) binomial confidence limits for determining sample sizes. The inconsistencies in the use of such formulas, particularly as they are compared to Leonard Johnson's (1951, 1964) beta-binomial bounds, are pointed out. In the latter case we point out the risks and limitations in assuming that the Weibull shape parameter is known. This is achieved through the use of Monte Carlo simulation and reference to a published Weibull database.
We point out the relatively high level of uncertainty in knowledge of $\beta$ based on the width of confidence intervals and the standard error of $\hat{\beta}$. Accordingly, the reliability engineer is advised to consider a most conservative value for $\beta$, $\beta = 1.0$, particularly when testing for durability. This is reflected in a Chrysler Motors' sunroof test standard, which has been "reverse-engineered" to show that it, too, is based on a value of $\beta = 1$, despite the fact that sunroofs are wearable items. Of special interest, the focus is on success testing (no failures allowed), a popular framework for design verification (DV) testing. The equivalency between both approaches, binomial and Weibayes, is demonstrated.
6.1 VALIDATION/VERIFICATION TESTING

6.1.1 Verification Testing

Design and product verification (DV/PV) tests are conducted to determine if a product will meet its performance requirements. Prototype parts must often be assembled by hand. At the product verification stage, production-grade tooling is used to build test prototypes that are representative of anticipated production. In order to save time, the tests are usually performed under accelerated conditions of temperature, humidity, and other identifiable usage variables such as voltage (in the case of electronics). Each test is fully described in the design verification, planning, and reporting (DVP&R) document. Information on sample size, criteria for passing a test, and either a description of the test or a reference to a test standard are carefully described in the document (see Appendix §1A.3 of Chapter 1). As an example, we illustrate the wide variety of verification tests that might be run on electronics components, subsystems, and systems. In the electronics industry, verification tests require test prototypes to survive exposure to testing under elevated environmental and electrical stresses. Tomase (1989) provides an overview of the assortment of tests that might be run:

1. Power/temperature cycle test (exposes defective solder joints, bad wire bonds, and cracking and brittleness of conformal coatings due to excessive differential temperature expansion between parts)
2. Thermal shock test (rapid temperature changes)
3. Random vibration test (exposes poor support of large components or board assemblies; excessive flexing of wires and component leads)
4. Sine vibration test (predictable vibration)
5. Electrical stress tests (overstressing is a common root cause of failure of electronic devices)
6. Salt spray and salt fog tests (test for resistance to corrosive phenomena)
7. Biased humidity tests
8. Electromagnetic compatibility (EMC) testing (to check for electrical interference)
9. High-temperature endurance testing
10. Mechanical shock test (ability to withstand excessive loads)
11. Drop test (for mobile devices)
The tests are usually single-environment tests, for which acceleration factors can normally be developed to relate failure rates at accelerated conditions to failure rates at normal usage conditions. (The use of acceleration factors is discussed in Chapter 7.) Combined environmental stress testing (CERT) is useful for accelerating the evolution of potential failure modes, and it allows for examination of the effect of an interaction between two or more elevated stresses. However, risks exist that new failure modes may be introduced that might never occur under normal usage conditions, and it is generally not possible to develop an appropriate model for predicting performance under CERT.
6.1.2 Specifying a Reliability Requirement

When we specify reliability, we specify it over a usage interval of time or other duty-cycle metric. Typically, this usage interval is the target design life for a product, although it does not have to be. The target life might be a warranty period, for example, based on a period of time when an organization faces its greatest financial exposure to field problems. In most cases the design engineering team sets this period of time, and test standards are based on it. We refer to this time period as $t = t_b$, a test bogey time. As discussed in Chapter 1, this period of time is artificially set based on a variety of factors, which might include what the company's competitors are doing in this regard. Following the conventions used in the auto industry, we express our reliability requirement using R by C notation. That is, an R by C specification signifies that we want to meet a reliability requirement of $R = R(t_b)$ with level of confidence $C \times 100\%$ over a test period $t_b$. For example, an R90C95 specification requires a confidence level of 95% of meeting a reliability target of 0.90. The reliability, $R(t_b)$, should be viewed as a lower confidence limit on reliability, which we denote as $R_L$. That is, $R_L = 0.90$ implies

$$P[R(t_b) \ge 0.90] \ge 0.95 \tag{6.1}$$

6.1.3 Success–Failure Testing
DV/PV test standards usually consist of a set of criteria for which a predetermined number of test items in a sample of size $n$ must pass. These are referred to as success–failure tests, as we are primarily concerned with the numbers that pass or fail various tests. We discuss two basic types of success–failure tests, with the option that the test duration for either can be extended to allow for reduced sample sizes:

1. Success (bogey) tests: Test a predetermined number of prototypes for a predetermined length of time. All units must pass the test in order for the R by C specification to be achieved. The test is stopped once the test requirements have been met. No distribution assumptions are required, as we are simply modeling $R(t_b)$, the reliability at $t = t_b$, as a binomial parameter, denoting the probability of a single unit surviving a bogey test. Most bogey tests are success tests; that is, all items put on test must satisfy test requirements.
2. Success–failure tests: For noncritical items, it is often sufficient if just $r$ of $n$ items pass the test. This procedure is a generalization of success testing. No distribution assumptions are required.

Extended Bogey Testing. The test period is extended a multiple, $m$, of the bogey life, $t_b$. Either type of test can be extended this way. In this case the test period is of duration $t_e = m t_b$. The quantity $m$ is often referred to as the test-to-field ratio. It is not uncommon for $m$ to be in the range of 1.25 to 2.0. If the bogey test period is $t_b$, then all units are placed on test until they fail or are taken off test at $T = m t_b$ units. This strategy is commonly used to allow for smaller sample sizes. The relationship between $R(t_b)$ and $R(m t_b)$ must be known in order to adjust the test-planning formula. That is, if $R(t_b) = 0.90$, a corresponding value for $R(m t_b)$ with $R(m t_b) \le 0.90$ must be known and substituted in the success–failure test-planning formula. Knowledge of the underlying distribution is generally needed to model this relationship.

6.1.4 Testing to Failure
It is almost always advantageous to test to failure whenever possible. This allows data to be fit to a distribution and parametric estimates of various reliability metrics to be generated. In the event that all items pass a bogey test, and we then stop the test, we are left with no information that can be used to identify potential failure modes, nor are we able to develop any of the parametric estimates we speak of! Accordingly, the advice below should be followed:

The test engineer and design team should always think about finding the time to allow as many test items to run to failure as possible. If time is a consideration, it is recommended that strategies such as step-stress testing (wherein test conditions are periodically elevated to a higher level of stress conditions) or other accelerated test procedures be used to ensure that the remaining survivors on a test do fail. These issues are explored in Chapter 7, wherein we discuss the use of accelerated test procedures.

6.1.5 Strategies for Reducing Sample-Size Requirements

Resource limitations on the availability of test prototypes and/or test equipment often pose a restriction on the number of items that can be tested. Thus, it is extremely important that the design team be aware of strategies that can be used to justify small sample sizes:

1. Extended bogey testing: Test a multiple, $m$, of the test requirement. That is, test to $m t_b$. An assumption on the underlying failure distribution is required to implement this feature.
2. Accelerated testing: Use temperature or another stress factor to accelerate test conditions. The effective time on test is $AF \times t_b$, where $AF$ is an acceleration factor. The development of acceleration factors is discussed in further detail in Chapter 7.
3. Bayesian model: The Bayesian adjustment allows for a sample size of $n - 1$. Additionally, the use of a full Bayesian model, allowing for full use of all prior information on previous test data or on field performance data on similar products, allows for reduced sample sizes.
4. Repair/replace failed components: Failed items taken off test might not have to be discarded. If repairable, they should be repaired and placed back on test. This practice is commonly incorporated in the test-analyze-and-fix development strategy used by aerospace and defense developers of high-end, low-production-run items such as aircraft and defense systems.
5. Tail testing: Assuming a relationship between one or more design characteristics and reliability, we manufacture or screen potential test items for extreme left-tail characteristics that correlate to the left tail of the failure distribution. Our test sample-size requirement is reduced to $pn$, where $p$ is the tail we are testing.

6.1.6 Underlying Distributional Assumption
The assumption of an underlying distribution is very important in the analysis and in how we differentiate test sample-size determination formulas. We consider the following underlying models in our discussion:

1. Success or success–failure testing: No distribution assumptions are made. The underlying test plan formulas are based on the properties of the binomial distribution.
2. Exponential testing: The underlying failure process is typical of constant failure-rate phenomena.
3. Weibayes testing: The underlying failure process is Weibull-distributed with shape parameter known. Thus, $t^{\beta}$ follows an exponential distribution with MTTF parameter $\theta^{\beta}$.
4. Weibull testing: The underlying failure process follows a Weibull distribution with $\beta$ and $\theta$ unknown. Unfortunately, there are no known general models for Weibull test planning at the time of this publication.

6.2 SUCCESS TESTING
In this case a binomial model is used to model test outcomes. Let $R(t_b)$ denote the reliability of a test item subjected to $t_b$ units of usage. It represents the independent probability of success that a test unit will survive the test. Alternatively, the test can be modeled in terms of $F(t_b) = 1 - R(t_b)$, the independent probability that a unit will fail the test. Under a binomial model, it is assumed (a) that each test item in a sample of $n$ test items is subjected to identical test conditions and (b) that each test item has an identical, independent probability, $R(t_b)$, of surviving the test. Given a confidence level $C$, the design formula for a success test is

$$[R_L]^n = 1 - C \tag{6.2}$$

This very simple formula is one of the most useful formulas for reliability test planning. It is used to provide a lower confidence limit, $R_L$, on reliability, $R(t_b)$, assuming that all $n$ items survive the life test. This can be stated as

$$P[R(t_b) \ge R_L] = C \tag{6.3}$$

If a log transformation is applied to both sides of Eq. (6.2), and the terms are rearranged, we obtain the well-known success-testing sample-size determination formula:

$$n = \frac{\ln(1 - C)}{\ln R_L} \tag{6.4}$$
Equation (6.4) is useful for describing tradeoffs between level of confidence and reliability or for determining sample-size requirements under an R by C requirement. A higher reliability requirement can be met without increasing sample size only by reducing the level of confidence we are willing to accept, and vice versa. Note: Because Eq. (6.4) is a ratio of log terms, we might just as well express this requirement as $n = \log(1 - C)/\log(R_L)$, because the base of the log does not matter when ratios are involved.
Example 6-1
The ABC manufacturing company wishes to demonstrate a reliability of 0.99 with a level of confidence of 0.95 (R99C95) for a critical assembly subsystem that is to be tested to 1 design life. What sample size is required to demonstrate this reliability, assuming that all units must pass this test? By Eq. (6.4),
$$n = \log(1 - 0.95)/\log(0.99) \approx 298 \text{ items}$$
This requirement might be viewed as excessive for tests involving very expensive or difficult-to-manufacture items. Considering any of the following might lower sample sizes:
1. Lowering the reliability target, $R(t_b)$
2. Lowering the required level of confidence, $C$
3. Using extended bogey testing, and running a test a multiple $m$ of test life
4. Using tails testing (see §6.6)
The latter two techniques require knowledge of an underlying distribution or, in the case of extended bogey testing, knowledge of how to relate the theoretical failure fraction at the extended life to its value at $t_b$. The use of extended bogey testing is introduced in §6.4.2 for the exponential and §6.5.1 for the Weibull. Tails testing is surveyed in §6.6.
6.2.1 Bayesian Adjustment to Success Formula
Note that Eq. (6.4) is sometimes modified to
$$n = \frac{\ln(1 - C)}{\ln R_L} - 1 \tag{6.5}$$
The origin of this simple adjustment is discussed in Appendix 6B. It is based on a Bayesian model that incorporates a uniform prior distribution on $p$. Equations (6.5) and (6B.8) are equivalent expressions. Note: It is the author’s observation that reliability practitioners rarely use Bayesian methods due to their inherent complexity. However, pragmatism suggests that if there is a scientific justification for using a smaller sample requirement, then go ahead and use it!
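The success-testing relationships of Eqs. (6.2), (6.4), and (6.5) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the `bayesian` flag are assumptions introduced here, not part of the text.

```python
import math

def success_test_sample_size(reliability, confidence, bayesian=False):
    """Zero-failure sample size from Eq. (6.4), or Eq. (6.5) if bayesian=True."""
    n = math.log(1.0 - confidence) / math.log(reliability)
    if bayesian:
        n -= 1.0  # uniform-prior adjustment of Eq. (6.5)
    return n

def demonstrated_reliability(n, confidence):
    """Lower confidence limit R_L from Eq. (6.2) when all n units survive."""
    return (1.0 - confidence) ** (1.0 / n)

# R99C95 requirement of Example 6-1 (unrounded value of n)
print(round(success_test_sample_size(0.99, 0.95), 2))
# Reliability demonstrated by 298 survivors at C = 0.95
print(round(demonstrated_reliability(298, 0.95), 4))
```

The first function returns the unrounded value of $n$ (roughly 298.07 for the R99C95 case); how to round it to a whole number of test items is left to the analyst.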
6.3 SUCCESS–FAILURE TESTING
Success–failure testing comprises a generalization of success testing wherein up to $r$ failures are allowed. A simple estimator of product reliability is the ratio
$$\hat{R}(t_b) = 1 - r/n \tag{6.6}$$
An R by C test specification must satisfy the following relationship, using $R_L$ to denote the reliability requirement at $t = t_b$:
$$Bi(r; R_L, n) = \sum_{i=0}^{r} \binom{n}{i} (1 - R_L)^i R_L^{\,n-i} = 1 - C \tag{6.7}$$
Most readers will recognize $Bi(r; R_L, n)$ as an expression for the cumulative binomial distribution. The binomial distribution is discussed in greater detail in Appendix 6A.1. Equation (6.7) is the formula that Clopper and Pearson (1934) suggest for obtaining confidence limits on a binomial proportion (see Appendix 6A.2). Accordingly, $R_L$ may be viewed as a lower 100C% confidence limit on reliability, $R(t_b)$. Note that for $r = 0$, Eq. (6.7) reduces to the relationship for success testing given by Eq. (6.2). The Tools>Goal Seek procedure in Microsoft Excel can be used to directly search for a solution to Eq. (6.7). Its use is illustrated in the following example.
Example 6-2: Use of Microsoft Excel
A success–failure test is planned with allowances for up to one failure. Based on an R90C90 reliability specification, how many test items should be tested? The use of Excel’s Tools>Goal Seek procedure for determining test requirements is illustrated in Figure 6-1. The cumulative distribution is evaluated in cell D4 using the BINOMDIST function as shown in the Excel formula box. Upon completion of the Goal Seek routine, the result, $n = 38$ (the sample requirement for this test), appears in cell D7.
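As an alternative to a spreadsheet search, Eq. (6.7) can be solved by stepping through integer sample sizes until the cumulative binomial probability drops to $1 - C$ or below. The sketch below assumes this simple linear search; the function names and the `n_max` cap are illustrative choices, not taken from the text.

```python
from math import comb

def binom_cdf(r, n, reliability):
    """Bi(r; R, n): probability of r or fewer failures among n units."""
    p_fail = 1.0 - reliability
    return sum(comb(n, i) * p_fail**i * reliability**(n - i)
               for i in range(r + 1))

def success_failure_sample_size(r, reliability, confidence, n_max=100000):
    """Smallest n with Bi(r; R_L, n) <= 1 - C, per Eq. (6.7)."""
    for n in range(r + 1, n_max + 1):
        if binom_cdf(r, n, reliability) <= 1.0 - confidence:
            return n
    raise ValueError("no sample size found up to n_max")

# R90C90 plan with up to one failure allowed (Example 6-2 reports n = 38)
print(success_failure_sample_size(1, 0.90, 0.90))
```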
6.3.1 Use of Binomial Nomograph
Before the era of inexpensive, powerful desktop computing, the binomial nomograph was frequently used to quickly identify parameters of binomial plans. In Figure 6-2 we reproduce a binomial nomograph that has been modified with R by C notation. The use of the nomograph is illustrated in the cutout contained within the figure. In the cutout area, a quality-control sampling plan is described wherein up to $c$ (or $r$) $= 4$ nonconforming items in a sample of size $n = 95$ are allowed. The binomial nomograph is used to identify two sampling points at the 10% and 95% probabilities of acceptance as shown. At the 10% level of acceptance, the probability of a defective is 0.09; at the 95% level of acceptance, the probability of a defective is 0.02. That is,
$$Bi(4; 1 - 0.02, 95) = 0.95 \quad \text{and} \quad Bi(4; 1 - 0.09, 95) = 0.10$$
FIGURE 6-1 Use of Microsoft Excel to arrive at a sample plan.
FIGURE 6-2 Binomial nomograph.
Example 6-3a: Use of binomial nomograph
A success–failure test is to be run using $n = 10$ prototypes. Up to one failure is allowed if each item is subjected to 5500 cycles of testing. A lower confidence limit of 95% on $R(5500)$ is desired. What reliability can be minimally achieved at this level of confidence as approximated by using a binomial nomograph? A cutout of the binomial nomograph section used to find the properties of this test plan is presented in Figure 6-3. On the nomograph we identify two points:
1. The right scale is in units of $1 - C$. We mark off $1 - C = 0.05$.
2. We then find the point corresponding to $n = 10$; $c$ or $r = 1$.
We connect the two points and extend this to the left scale, which is in units of probability of failure, with $p \approx 0.38$. Thus, $R_L \approx 1 - 0.38 = 0.62$.
FIGURE 6-3 Illustrated use of binomial nomograph.
Example 6-3b: Use of binomial nomograph
What sample size is required to achieve a reliability of 0.90? Referring back to Figure 6-3, we see that for $r = 1$, $C = 0.95$, a sample size of approximately 45 is required (see the dashed line). A simple clockwise rotation of the solid line was made to identify the required sample size.
6.3.2 Exact Formulas for Binomial Confidence Limits in Success–Failure Testing
Given a level of confidence $C$, an exact expression for the lower confidence limit, $R_L$, is given by
$$R_L = \frac{n - r}{n - r + (r + 1)\,F_{2(r+1),\,2(n-r),\,1-C}} \tag{6.8}$$
$F_{2(r+1),\,2(n-r),\,1-C}$ is standard notation for a right-tail quantile of the F-distribution having $2(r + 1)$ numerator degrees of freedom and $2(n - r)$ denominator degrees of freedom. Those curious as to how a cumulative binomial probability could possibly be related to an F-distribution might want to refer to Appendix 6A.2. It describes the duality between the use of the binomial distribution for modeling success–failure test outcomes and the beta distribution, which is a continuous distribution used for modeling a wide variety of phenomena. The transformations necessary to then relate the beta distribution with an F-distribution are described in full detail.
FIGURE 6-4 Right-tailed F-quantile, $F_{0.05,6,16}$.
Example 6-3c: Use of exact formula
For the example in §6.3.1, a success–failure test is to be run using $n = 10$ prototypes. Up to one failure is allowed if each item is subjected to 5500 cycles of testing. A lower confidence limit of 95% on $R(5500)$ is desired. $R_L$ is evaluated using the exact F-test relationship with $\nu_1 = 2(r + 1) = 2(1 + 1) = 4$, $\nu_2 = 2(n - r) = 2(10 - 1) = 18$, $1 - C = 0.05$, from which we can look up $F_{4,18,0.05} = 2.927$ in tables or use a statistical computing procedure to find it.
$$R_L = \frac{n - r}{n - r + (r + 1)\,F_{2(r+1),\,2(n-r),\,1-C}} = \frac{10 - 1}{10 - 1 + (1 + 1)(2.927)} = 0.606$$
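Because Eq. (6.8) is the closed-form solution of Eq. (6.7) for $R_L$, the same limit can be obtained without F-tables by solving the cumulative binomial equation numerically. The sketch below uses bisection, which is a substitute for the F-quantile lookup described above rather than the author's procedure; the function names and tolerance are assumptions made for illustration.

```python
from math import comb

def binom_cdf(r, n, reliability):
    """Bi(r; R, n): probability of r or fewer failures among n units."""
    p_fail = 1.0 - reliability
    return sum(comb(n, i) * p_fail**i * reliability**(n - i)
               for i in range(r + 1))

def lower_reliability_limit(r, n, confidence, tol=1e-10):
    """Solve Bi(r; R_L, n) = 1 - C for R_L by bisection on [0, 1]."""
    lo, hi = 0.0, 1.0
    target = 1.0 - confidence
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # Bi(r; R, n) increases with R, so a value below the target
        # means the root lies above mid.
        if binom_cdf(r, n, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# n = 10 prototypes, up to r = 1 failure, C = 0.95
print(round(lower_reliability_limit(1, 10, 0.95), 3))
```

For $r = 0$ the routine should reproduce $R_L = (1 - C)^{1/n}$ from Eq. (6.2), which gives a quick consistency check.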
Microsoft Excel can be readily used to generate solutions to Eq. (6.8). To illustrate this point, a table of $C = 0.95$ values has been generated for an array of success–failure tests. The data is presented in Table 6-1.
TABLE 6-1 Use of Microsoft Excel to Develop a Table of R by C Data for a Range of Success–Failure Tests, for Various Sample Sizes ($n$) and Numbers of Allowed Failures ($r$), for a C Requirement of $C = 0.95$
6.3.3 Large-Sample Confidence Limit Approximation on Reliability
Generally, limitations on sample sizes and high-reliability expectations do not allow for many failures. As such, the use of a large-sample approximation on reliability, $R$, is usually impractical. The large-sample approximation, which is founded on normal theory, does provide a reasonable approximation when $n(1 - R_L)$ exceeds 5. It is based on the approximation that the estimate of $R(t_b)$ is approximately normally distributed with mean $(1 - r/n)$ and variance $(r/n)(1 - r/n)/n$. The normal approximation is given by
$$R_L = 1 - r/n - Z_{1-C}\left[\frac{(r/n)(1 - r/n)}{n}\right]^{1/2} \tag{6.9}$$
where $Z_{1-C}$ is a standard normal quantile associated with a right tail of $1 - C$.
Example 6-3d: Use of large sample approximation
For the example in §6.3.1, a success–failure test is to be run using $n = 10$ prototypes. Up to one failure is allowed if each item is subjected to 5500 cycles of testing. A lower confidence limit of 95% on $R(5500)$ is desired.
$R_L$ is now evaluated using the large-sample approximation. With $1 - C = 0.05$, $Z_{0.05} = 1.645$,
$$R_L = 1 - r/n - Z_{0.05}\,[(r/n)(1 - r/n)/n]^{1/2} = 1 - (1/10) - 1.645\,(0.1 \times 0.9/10)^{1/2} = 0.90 - 0.16 = 0.74$$
As expected, the approximation does not perform as well with small samples: $n(1 - R_L) = 10(1 - 0.74) = 2.6 < 5$.
6.3.4 Bayesian Adjustment to Success–Failure Testing Formula
Note that Eq. (6.7) is sometimes modified to
$$Bi(r; R_L, n + 1) = \sum_{i=0}^{r} \binom{n+1}{i} (1 - R_L)^i R_L^{\,n+1-i} = 1 - C \tag{6.10}$$
The origin of this simple adjustment is introduced in §6.2.1 and discussed in greater detail in Appendix 6B. This adjustment results in a reduction of the sample requirements by 1. The Bayesian adjustment to Eq. (6.8) results in the modified exact expression
$$R_L = \frac{n + 1 - r}{n + 1 - r + (r + 1)\,F_{2(r+1),\,2(n+1-r),\,1-C}} \tag{6.11}$$
The author has never seen this adjustment used in practice. However, based on the arguments in Appendix 6B, it would seem inconsistent to use such an adjustment in designing success tests but not use it here. Perhaps there is less of a need to do so here, as the sample-size requirements for success–failure tests are generally less than those required for success tests.
6.3.5 Correctness of Binomial Success-Testing Formula
Wasserman (1999, 2000a) points out the misunderstanding and inconsistency in the use of the formula based on the Clopper–Pearson binomial confidence limits [see Eqs. (6A.3) and (6A.4) in Appendix 6A.2]. This inconsistency is worth noting. In success–failure testing, the failure fraction, $p$, a discrete distribution quantity, is generally viewed as having a linkage with some unknown, continuous distribution, $F(t)$. For individual data, each failure occurrence at $t = t_r$, $r = 1, 2, \ldots, n$, is associated with an estimated failure fraction of $\hat{p}_r = r/n$, a naive estimator of $F(t)$. Improved rank estimators for $F(t)$ have been developed. The more widely used estimators, such as the median and mean rank (Herd–Johnson) estimators, have been developed based on a multinomial model of the underlying order statistics (see Wasserman, 2000). Johnson (1951, 1964) proposes the beta-binomial bounds on the rank estimators, $\hat{F}(t_i)$, $i = 1, 2, \ldots, n$. Due to the duality between the beta distribution and the binomial distribution, and the equivalence in relationship between the beta and the F-distributions (as described both in Appendix 2A.2 for beta-binomial bounds and in Appendix 6A.2 for Clopper–Pearson binomial bounds), it is possible to express these bounds in terms of either the binomial distribution or the F-distribution. Expressions for these bounds are summarized in Table 6-2. We see that the lower bounds on the binomial parameter, $p$, match up for both, but there is disagreement in the upper bound, which is the limit in which we are interested. They differ only in degrees of freedom. Specifically, the relationships on the upper bound agree if $r - 1$ is substituted for $r$ in the Clopper–Pearson expressions! That is, the upper confidence limit on $\hat{F}(t_1)$, the beta-binomial bound on the earliest or first failure, $r = 1$, matches exactly with the upper limit on the binomial failure fraction under success testing ($r = 0$). Accordingly, would it not make sense for a very wise reliability engineer to do the following?
Under success testing, even if $r = 1$ failure does occur, the engineer might still consider the test a success in that the beta-binomial upper bounds justify acceptance of R by C specifications!
This scenario is unbelievable, but true! However, it wasn’t until recently that this truth was discovered. The details behind the construction of the test plan formulas in Table 6-2 may be found in Wasserman (1999–2000a).
6.4 EXPONENTIAL TEST-PLANNING FORMULAS
Due to the ease with which parameters and reliability metrics can be estimated, the exponential distribution is a desirable distribution to work with. The exponential distribution is surveyed in Chapter 3, and ML estimation formulas on the MTTF parameter, $\theta$, are presented in Chapter 4. We begin with the maximum likelihood (ML) expression for estimating $\theta$, the mean time-to-failure parameter of the exponential distribution (see §4.7.1, Chapter 4):
$$\hat{\theta} = \frac{T}{r} \tag{6.12}$$
$T$ denotes the total time of exposure for all units placed on test. Formulas for calculating $T$ under various censoring scenarios are summarized in Table 4-6 in Chapter 4. The exponential test-planning formulas are based on the achievement of a specified, minimum mean time-to-failure, $\theta_L$. $\theta_L$ is a lower 100C% confidence limit on the mean time-to-failure parameter, $\theta$. An expression for $\theta_L$, under time censoring, is provided in §4.7.2 and repeated here:
$$P\Bigg(\theta \ge \underbrace{\frac{2T}{\chi^2_{2r+2,\,1-C}}}_{\theta_L}\Bigg) \ge C \tag{6.13}$$
TABLE 6-2 Estimating $p_L$, $p_U$ Based on Clopper–Pearson Binomial Confidence Limits and Johnson’s Beta-Binomial Limits Expressed in Terms of Either the Binomial or F-Distribution
| Limit | Expressed in terms of | Based on Clopper–Pearson | Based on Johnson’s beta-binomial bounds |
| --- | --- | --- | --- |
| $P(p > p_L) \ge C$ | Binomial distribution | $Bi(r - 1; n, p_L) = C$ | $Bi(r - 1; n, p_L) = C$ |
| $P(p > p_L) \ge C$ | F-distribution | $p_L = \dfrac{1}{1 + \frac{n-r+1}{r}\,F_{2(n-r+1),\,2r,\,1-C}}$ | $p_L = \dfrac{1}{1 + \frac{n-r+1}{r}\,F_{2(n-r+1),\,2r,\,1-C}}$ |
| $P(p < p_U) \ge C$ | Binomial distribution | $Bi(r; n, p_U) = 1 - C$ | $Bi(r - 1; n, p_U) = 1 - C$ |
| $P(p < p_U) \ge C$ | F-distribution | $p_U = \dfrac{1}{1 + \frac{n-r}{r+1}\,F_{2(n-r),\,2(r+1),\,C}}$ | $p_U = \dfrac{1}{1 + \frac{n-r+1}{r}\,F_{2(n-r+1),\,2r,\,C}}$ |
Source: Wasserman, 1999–2000a.
To develop an expression for $R_L$, the lower confidence limit on $R(t_b)$, we begin with an expression for a one-sided lower confidence limit on $\theta$:
$$P(\theta_L \le \theta) \ge C \tag{6.14}$$
Then we take an exponential transform on both sides, which preserves this relationship, resulting in the following transformed relationship in terms of reliability at $t = t_b$, the end of the bogey period:
$$P\Big(\underbrace{\exp(-t_b/\theta_L)}_{R_L} \le \underbrace{\exp(-t_b/\theta)}_{R}\Big) \ge C \tag{6.15}$$
Equation (6.15) provides a useful expression for $R_L$ under an assumption that time-to-failure follows a constant-failure model. By combining Eqs. (6.13) and (6.15), we obtain an expression for $R_L$:
$$R_L = \exp(-t_b/\theta_L) = \exp\left(-\frac{t_b\,\chi^2_{2r+2,\,1-C}}{2T}\right) \tag{6.16}$$
Equation (6.16) can be used to determine the total unit time on test requirements for any reliability requirement, under exponential test-planning assumptions. Taking the log of both sides of Eq. (6.16), and rearranging terms, leads to
$$T = -\frac{t_b\,\chi^2_{2r+2,\,1-C}}{2 \ln R_L} \tag{6.17}$$
Now the fun begins! Under a variety of circumstances, Eq. (6.17) can be applied directly to arrive at test requirements for a DV test run for $t_b$ units of usage. For example, for a replacement test, wherein every test stand is kept occupied, $T = n t_b$. For a DV test with very few failures, where $r \ll n$, $T \approx n t_b$. Under these circumstances, the sample-size requirement for a success–failure test that allows for $r$ failures is given by
$$n = -0.5\,\frac{\chi^2_{2r+2,\,1-C}}{\ln R_L} \tag{6.18}$$
Under other circumstances, Eq. (6.17) can be adapted using the following general expressions for time on test, $T$:
$$T = r\hat{\theta} \tag{6.19}$$
$$T = \bar{n}\,t_b \tag{6.20}$$
Equation (6.19) is useful when the mean time-to-failure can be approximated. Equation (6.20) requires knowledge of the average number of units on test. Each of these alternative expressions for $T$ results in a slightly different exponential test-planning formula.
Example 6-3 (cont.): Design of success–failure test under an exponential distribution assumption
Let’s reconsider the worked-out problem of Example 6-2. A success–failure test is planned with allowances for up to one failure. Based on an R90C90 reliability specification, how many test items should be tested? For an R90C90 specification and an allowance of up to $r = 1$ failure, the minimum required number of items on test is determined with the use of Eq. (6.18):
$$n = -0.5\,\frac{\chi^2_{2r+2,\,1-C}}{\ln R_L} = -0.5\,\frac{\chi^2_{4,\,0.1}}{\ln(0.90)} \approx 37$$
It is prudent to check the assumption that $r/n \ll 1$ holds. For this test, $r/n = 1/37$, which is relatively small, and so the use of Eq. (6.18) is justified. (Note the minor reduction in sample-size requirements from 38 to 37 derived earlier.) The implementation of an Excel worksheet for performing this calculation is shown in Figure 6-5.
FIGURE 6-5 Microsoft Excel spreadsheet implementation of exponential test planning calculations.
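The calculation in Eq. (6.18) can also be sketched in Python without a chi-square table. Because the degrees of freedom, $2r + 2$, are always even, the right-tail probability has a finite-sum form, and the quantile can be found by bisection. The helper names below are illustrative assumptions, and the bisection routine stands in for a table lookup or spreadsheet function.

```python
import math

def chi2_sf_even(x, k):
    """Right-tail probability of a chi-square variate with 2k degrees of freedom."""
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= half / j          # (x/2)^j / j!
        total += term
    return math.exp(-half) * total

def chi2_quantile_even(k, right_tail, tol=1e-10):
    """Value x with P(chi-square with 2k d.f. > x) = right_tail, by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_sf_even(hi, k) > right_tail:
        hi *= 2.0                 # expand the bracket until it contains the root
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if chi2_sf_even(mid, k) > right_tail:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def exponential_sample_size(r, reliability, confidence):
    """Eq. (6.18): n = -0.5 * chi2_{2r+2,1-C} / ln(R_L), assuming r << n."""
    q = chi2_quantile_even(r + 1, 1.0 - confidence)
    return -0.5 * q / math.log(reliability)

n = exponential_sample_size(1, 0.90, 0.90)   # R90C90, up to one failure
print(round(n, 2), math.ceil(n))
```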
6.4.1 Success Testing Under an Exponential Distribution Assumption Using Alternate Formula
If no failures are allowed on test, there is simply no advantage in assuming an underlying exponential distribution. To prove this, it is sufficient to show that Eq. (6.18) with $r = 0$ is equivalent to Eq. (6.4). That is,
$$n = \underbrace{\frac{\chi^2_{2,\,1-C}}{-2 \ln R_L}}_{\text{exponential}} = \underbrace{\frac{\ln(1 - C)}{\ln R_L}}_{\text{binomial (success test)}} \tag{6.21}$$
For this statement to be true,
$$\chi^2_{2,\,1-C} = -2 \ln(1 - C) \tag{6.22}$$
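Proof. To prove this, we start with a general expression for the probability density of a chi-squared random variable with $\nu$ degrees of freedom:
$$f_{\chi^2}(x) = \frac{1}{\Gamma(\nu/2)}\left(\frac{1}{2}\right)^{\nu/2} x^{(\nu/2 - 1)}\,e^{-x/2} \quad \text{for } 0 \le x < \infty$$
Substituting $\nu = 2$ yields
$$f_{\chi^2}(x) = \frac{1}{\Gamma(2/2)}\left(\frac{1}{2}\right)^{2/2} x^{(2/2) - 1}\,e^{-x/2} = \frac{1}{2}\,e^{-x/2} \quad \text{for } 0 \le x < \infty \tag{6.23}$$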
which is an exponential density with $\theta = 2$. Accordingly, $\chi^2_{2,\,1-C}$ may be expressed in terms of the right tail of an exponential distribution as shown:
$$P(\chi^2_2 \ge \chi^2_{2,\,1-C}) = \exp\left(-\frac{\chi^2_{2,\,1-C}}{2}\right) = 1 - C$$
Taking the natural log of both sides gives
$$-\frac{\chi^2_{2,\,1-C}}{2} = \ln(1 - C) \quad \text{or} \quad \chi^2_{2,\,1-C} = -2 \ln(1 - C)$$
The result is proven, and so Eq. (6.22) is true. Therefore, for success tests, the parametric assumption of an exponential time-to-failure distribution does not provide any advantage over the distribution-free relation given earlier by Eq. (6.4).
Lower Confidence Limit on Exponential Mean Time-to-Failure, Success Testing (Zero Failures)
We can now show an alternative formula for $\theta_L$, the lower confidence limit on $\theta$, by substituting $\chi^2_{2,\,1-C} = -2 \ln(1 - C)$ into Eq. (6.13) when $r = 0$ (with $T = n t_b$):
$$\theta_L = -\frac{n t_b}{\ln(1 - C)} \tag{6.24}$$
This expression does not involve a lookup of a chi-square quantile.
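A short Python sketch of Eq. (6.24) follows, together with the corresponding reliability limit from Eq. (6.16). The function names and the numerical inputs (22 units tested to 1000 hours at $C = 0.90$) are hypothetical values chosen only for illustration.

```python
import math

def mttf_lower_limit_zero_failures(n, t_b, confidence):
    """Eq. (6.24): theta_L = -n * t_b / ln(1 - C) for a zero-failure test."""
    return -n * t_b / math.log(1.0 - confidence)

def reliability_at_bogey(theta_lower, t_b):
    """R_L = exp(-t_b / theta_L), the left-hand relation in Eq. (6.16)."""
    return math.exp(-t_b / theta_lower)

# Hypothetical test: 22 units, each run 1000 hours with no failures, C = 0.90
theta_l = mttf_lower_limit_zero_failures(n=22, t_b=1000.0, confidence=0.90)
print(round(theta_l, 1), round(reliability_at_bogey(theta_l, 1000.0), 4))
```

Under these assumptions the reliability limit coincides with $(1 - C)^{1/n}$ from Eq. (6.2), which is the equivalence established in this section.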
6.4.2 Extended Bogey Testing Under Exponential Life Model
Resource limitations on the availability of test prototypes and test stands often dictate the use of small sample sizes. To make maximum use of the test resources, testing is often extended to a multiple of test life. It is not uncommon to have test-to-field ratios ($m$) in the range of 1.25 to 2.0. If possible, the tests are run until all test items have experienced a catastrophic loss of function. The use of extended tests introduces two additional major challenges:
1. The extension of the test beyond the test life is likely to introduce a risk that the failures are induced by phenomena not likely to be revealed during normal product use. In this case the analyst should regard such test data as being conservative in nature.
2. The need to relate performance at the extended test life to 1.0 times life requires some assumptions on the underlying distribution to relate reliability between the two different test conditions.
Under extended testing, testing continues for a period, $t_e$, of
$$t_e = m t_b \tag{6.25}$$
Now we need to adjust our design formulas for extended testing. It turns out that the exponential test formulas given by Eqs. (6.15) to (6.17) are still applicable, provided it is clear that for extended testing
$r$ = number of failures observed during the test, including the extended period.
$T$ = total time of exposure of all units on test, including the extended period.
$R_{L,e}$ = reliability at the end of the extended test period, $t_e$.
$$R_{L,e} = \exp(-t_e/\theta_L) = \exp(-m t_b/\theta_L) = [\exp(-t_b/\theta_L)]^m = (R_L)^m \tag{6.26}$$
Therefore, $\ln R_{L,e} = m \ln R_L$. We are now able to reexpress the conditions given by Eq. (6.17) for extended testing. That is, if Eq. (6.17) is true at end of bogey life, $t = t_b$, it must also hold true at end of extended bogey life, $t = t_e$:
$$T = -\frac{t_e\,\chi^2_{2r+2,\,1-C}}{2 \ln R_{L,e}} \tag{6.27}$$
For replacement tests, wherein each test fixture is kept fully occupied, or for tests in which very few or no failures are allowed ($r \ll n$), $T \approx n t_e = n m t_b$. Therefore, we substitute the following into Eq. (6.27):
1. $t_e = m t_b$.
2. $T \approx n t_e = n m t_b$.
3. $\ln R_{L,e} = m \ln R_L$.
Our modified extended sample requirement becomes
$$n = -\frac{\chi^2_{2r+2,\,1-C}}{2m \ln R_L} \tag{6.28}$$
Thus, sample requirements are scaled by a factor of $1/m$ under extended testing. Alternatively, assuming that $n$ is fixed, and so is the reliability specification, we can rewrite Eq. (6.28) in terms of the required test-to-field ratio ($m$):
$$m = -\frac{\chi^2_{2r+2,\,1-C}}{2n \ln R_L} \tag{6.29}$$
Example 6-3 (cont.): Testing a multiple of specified life, exponential planning
Let’s reconsider the example at the beginning of §6.3 and §6.4. A success–failure test is planned with allowances for up to one failure. Based on an R90C90 reliability specification, how many test items should be tested? Let’s now assume that a decision has been made to allow extension of the test to two times life; that is, $t_e = 2t_b$, or $m = 2$.
Applying Eq. (6.28), with $C = 0.90$, $R_L = 0.90$, and $m = 2$ results in the calculation
$$n = -\frac{\chi^2_{4,\,1-0.90}}{2 \times 2 \ln 0.90} \approx 19$$
Thus, the test extension to two times life results in a 50% reduction in sample-size requirements, from 37 to 19.
6.4.3 Extended Success Testing: Exponential Distribution
We need only to substitute $r = 0$ and $\chi^2_{2,\,1-C} = -2 \ln(1 - C)$ into Eqs. (6.28) and (6.29), or equivalently use the extended requirement $T = n m t_b$ in the standard exponential test-planning formula (when $r \ll n$). Thus, for exponential success testing, we have
$$n = \frac{\ln(1 - C)}{m \ln R_L} \tag{6.30}$$
and
$$m = \frac{\ln(1 - C)}{n \ln R_L} \tag{6.31}$$
Bayesian Adjustment to Extended Testing Success Formula
Under a Bayesian assumption, test sample requirements are reduced by 1:
$$n = \frac{\ln(1 - C)}{m \ln R_L} - 1 \tag{6.32}$$
The origin of this simple adjustment is discussed in §6.2.1 and Appendix 6B. It is based on a Bayesian model, which incorporates a uniform prior distribution on $p$.
6.4.4 Risks Associated with Extended Bogey Testing
What happens if more than r failures occur? In particular, what interpretation should be given to test results wherein r or fewer failures occur in (0, tb ) but more than r failures have occurred by t ¼ te ? Such a test outcome becomes a challenge for the reliability analyst. The job is made easier if the reliability engineer can determine that the late failures are due to some kind of damage or wearout phenomenon that is unlikely to occur under normal usage conditions—but this is very hard to prove! Denial of the fact that failure outcomes did occur during the extended period does not constitute an
effective strategy either. (That is, one might simply ignore the late failures and merely increase sample-size requirements to meet m ¼ 1:0 requirements!) The correct advice to give then is to simply acknowledge the fact that the test requirements were not passed, and so appropriate actions need to take place to identify the root cause(s) of the failure phenomena, and to subsequently take corrective action to prevent failure recurrence.
6.4.5 Reduced Test Duration
The mathematics for extended bogey testing is also applicable to reduced-duration bogey testing. Here we are willing to run larger samples in return for shorter test duration. Equation (6.30) can be used to determine sample sizes when $m < 1.0$.
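The exponential success-testing relationships of Eqs. (6.30) to (6.32) can be sketched as follows, covering both extended ($m > 1$) and reduced-duration ($m < 1$) tests. The function names and the `bayesian` flag are assumptions made for illustration, and the loop simply tabulates an R90C90 requirement at a few test-to-field ratios.

```python
import math

def extended_success_sample_size(reliability, confidence, m=1.0, bayesian=False):
    """Eq. (6.30), or Eq. (6.32) with bayesian=True; m < 1 models a shortened test."""
    n = math.log(1.0 - confidence) / (m * math.log(reliability))
    return n - 1.0 if bayesian else n

def required_test_to_field_ratio(reliability, confidence, n):
    """Eq. (6.31): ratio m needed to demonstrate R_L at confidence C with n units."""
    return math.log(1.0 - confidence) / (n * math.log(reliability))

# R90C90 success test at several test-to-field ratios
for m in (0.5, 1.0, 1.25, 2.0):
    print(m, math.ceil(extended_success_sample_size(0.90, 0.90, m)))
```

Halving the test duration ($m = 0.5$) doubles the unrounded sample requirement under this exponential model, which is the tradeoff described above.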
6.5 WEIBULL TEST PLANNING
Due to the underlying complexity of likelihood theory, closed-form expressions for determining testing requirements under a Weibull time-to-failure are not available. The mathematical complexities, which we refer to, do vanish if we can assume that the Weibull shape parameter, $\beta$, is known to some degree. Under such an assumption, which is often referred to as a Weibayes assumption, the transformation $t' = t^\beta$ is exponentially distributed (see Nelson, 1985):
$t' = t^\beta$ is distributed exponentially with MTTF parameter $\theta^\beta$
Proof. In the proof that follows, we make use of $R_w(t) = \exp(-(t/\theta)^\beta)$, the reliability function of a Weibull-distributed time-to-failure random variable. We begin with an expression for the reliability function associated with the random variable $t'$:
$$\begin{aligned}
R(t') &= P(t^\beta \ge t') && \text{the reliability of the r.v. } t' \\
&= P(t \ge t'^{1/\beta}) && \text{both sides to } 1/\beta \text{ power} \\
&= R_w(t'^{1/\beta}) && \text{in terms of reliability of a Weibull r.v.} \\
&= \exp\left[-\left(\frac{t'^{1/\beta}}{\theta}\right)^{\beta}\right] \\
&= \exp\left(-\frac{t'}{\theta^\beta}\right)
\end{aligned}$$
Therefore, $t' = t^\beta$ is distributed exponentially with mean time-to-failure parameter $\theta^\beta$. Thus, we may draw upon all our previous results for estimating the exponential parameter $\theta$, substituting $t'$ for $t$ and $\theta^\beta$ for $\theta$. For extended testing, we have $t_e^\beta = (m t_b)^\beta$. Accordingly, we also need to substitute $m^\beta$ for $m$.
6.5.1 Weibayes Formulas
The Weibull planning formulas under a Weibayes assumption are obtained directly from the exponential test-planning formulas and the use of the following substitutions:
$t^\beta$ for $t$
$\theta^\beta$ for $\theta$
$m^\beta$ for $m$
We now survey the Weibayes formulas.
Weibayes Maximum Likelihood Estimate of $\theta^\beta$
The Weibayes maximum likelihood estimate of $\theta^\beta$ must be $\hat{\theta}^\beta = T_W/r$, where
$$T_W = \sum_{\forall \text{ units on test}} \left(t_{\text{off}}^{\beta} - t_{\text{on}}^{\beta}\right) = \sum_i t_i^{\beta} \tag{6.33}$$
with the second form applying if all units are put on test at $t = 0$. If all units are placed on test at $t = 0$, the ML estimate for $\theta$ is represented by
$$\hat{\theta} = \left(\frac{\sum_{i=1}^{n} t_i^{\beta}}{r}\right)^{1/\beta} \tag{6.34}$$
In this case $T_W$ represents the total time of exposure for all units placed on test in terms of the transformed times, $t^\beta$. The lower confidence limit on $\theta^\beta$ remains
$$P\Bigg(\underbrace{\frac{2T_W}{\chi^2_{2r+2,\,1-C}}}_{\theta_L^\beta} \le \theta^\beta\Bigg) = C \tag{6.35}$$
The reliability bounds become
$$P\Big(\underbrace{\exp(-t_b^\beta/\theta_L^\beta)}_{R_L} \le \underbrace{\exp(-t_b^\beta/\theta^\beta)}_{R}\Big) = C \tag{6.36}$$
Therefore,
$$R_L = \exp\left(-\frac{t_b^\beta\,\chi^2_{2r+2,\,1-C}}{2T_W}\right) \tag{6.37}$$
Weibayes Test Requirements
For replacement tests, or DV testing with few or no failures (i.e., $r \ll n$), $T_W \approx n t_b^\beta$. Using this approximation and rearranging Eq. (6.37) results in the following sample requirement.
Weibayes Success–Failure Testing Sample Requirement
$$n = -\frac{\chi^2_{2r+2,\,1-C}}{2 \ln R_L} \tag{6.38}$$
Shockingly, the sample requirements are identical to the exponential test requirements given by Eq. (6.18). Furthermore, under success testing, we substitute $\chi^2_{2,\,1-C} = -2 \ln(1 - C)$ [see Eq. (6.22)], which results in the identical success-testing formula given by Eq. (6.4), based on the binomial distribution, and Eq. (6.21), based on exponential success testing:
$$n = -\frac{\chi^2_{2,\,1-C}}{2 \ln R_L} \quad \text{or} \quad n = \frac{\ln(1 - C)}{\ln R_L} \tag{6.39}$$
However, this is not the case under extended bogey testing, wherein the assumed value for the Weibull shape parameter, $\beta$, does become a factor in the determination of sample-size requirements.
Extended Bogey Testing
For the extended period, $t_e = m t_b$, and so
$$R_{L,e} = \exp(-t_e^\beta/\theta_L^\beta) = \exp(-m^\beta t_b^\beta/\theta_L^\beta) = [\exp(-t_b^\beta/\theta_L^\beta)]^{m^\beta} = (R_L)^{m^\beta} \tag{6.40}$$
Therefore, $\ln R_{L,e} = m^\beta \ln R_L$. Thus, we need only to substitute $m^\beta$ for $m$ in the exponential, extended test formulas given by Eqs. (6.28) and (6.29):
$$n = -\frac{\chi^2_{2r+2,\,1-C}}{2 m^\beta \ln R_L} \tag{6.41}$$
Equation (6.41) can be rearranged to express test-to-field requirements as a function of the R by C specification and sample size:
$$m = \left[-\frac{\chi^2_{2r+2,\,1-C}}{2n \ln R_L}\right]^{1/\beta} \tag{6.42}$$
Weibayes Extended-Success Testing
For extended testing, we set $r = 0$ and use $\chi^2_{2,\,1-C} = -2 \ln(1 - C)$:
$$n = \frac{\ln(1 - C)}{m^\beta \ln R_L} \tag{6.43}$$
$$m = \left[\frac{\ln(1 - C)}{n \ln R_L}\right]^{1/\beta} \tag{6.44}$$
Bayesian Adjustment to Extended-Testing Success Formula
The use of a Bayesian model allows for a reduction in sample size by one unit [see Eq. (6.32) and Appendix 6B]:
$$n = \frac{\ln(1 - C)}{m^\beta \ln R_L} - 1 \tag{6.45}$$
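The Weibayes extended success-testing formulas, Eqs. (6.43) to (6.45), differ from their exponential counterparts only through the exponent on $m$. A minimal Python sketch is given below; the function names, the `bayesian` flag, and the numerical inputs (R90C90, 10 units, an assumed shape parameter of 2.0) are hypothetical and serve only to show how the two formulas invert each other.

```python
import math

def weibayes_success_sample_size(reliability, confidence, m, beta, bayesian=False):
    """Eq. (6.43), or Eq. (6.45) with bayesian=True, for an assumed shape beta."""
    n = math.log(1.0 - confidence) / (m**beta * math.log(reliability))
    return n - 1.0 if bayesian else n

def weibayes_test_to_field_ratio(reliability, confidence, n, beta):
    """Eq. (6.44): test-to-field ratio m required when only n units are available."""
    return (math.log(1.0 - confidence) / (n * math.log(reliability))) ** (1.0 / beta)

# Hypothetical plan: R90C90, 10 units available, assumed beta = 2.0
m = weibayes_test_to_field_ratio(0.90, 0.90, 10, 2.0)
print(round(m, 4))
# Substituting m back into Eq. (6.43) should recover n = 10
print(round(weibayes_success_sample_size(0.90, 0.90, m, 2.0), 6))
```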
Example 6-4: Extended bogey testing under Weibayes model (Wasserman 2001)
A DV test is conducted to verify the life of roller bearings. A success test is to be run for 60,000 cycles to verify an R95C90 requirement. A value of 1.3 is assumed for the Weibull shape parameter, based on data from Bloch and Geitner (1994), which is reproduced in Appendix 6C.
1. What sample size is required to demonstrate R95C90? From Eq. (6.39), $n = \log(1 - 0.90)/\log(0.95) \approx 45$.
2. Suppose that only 20 items are available for test; what test-to-field ratio ($m$) is required to demonstrate R95C90 with only $n = 20$? From Eq. (6.44),
$$m = \left[\frac{\ln(1 - C)}{n \ln R_L}\right]^{1/\beta} = \left[\frac{\ln(1 - 0.90)}{20 \ln 0.95}\right]^{1/1.3} = 1.8625$$
This means that the test period will need to be extended to
$$t_e = m t_b = 1.8625 \times 60{,}000 \text{ cycles} = 111{,}750 \text{ cycles}$$
The simplicity and ease of working with the Weibayes distribution are evident. Despite these benefits, it is advisable to be fully informed of the risks inherent in its use. This is discussed next.
6.5.2 Adequacy of Weibayes Model
Can anyone really say he or she knows $\beta$ with any reasonable level of certainty? Conventional wisdom says that if you have a long history of failure data, then you do have enough information to treat the Weibull shape parameter $\beta$ as if it is known. For example, suppose that you had abundant data on failures associated with a specific metal fatigue process. Under such circumstances, you are likely to be confident enough to assume a known value for $\beta$. But how good is this assumption? Weibull confidence intervals on $\beta$ can be quite wide, as evidenced by several examples described in the last chapter. This remains true even with sample sizes as large as $n = 20$. In one example the limits range from 1.3 to 2.6. Considering how much the shape of the distribution is impacted by $\beta$, one would have to wonder! To demonstrate this, a Monte Carlo simulation was conducted for a complete sample of $n = 20$ from a Weibull(1000, $\beta$) distribution for $\beta$-values of 0.8, 1.0, 1.2, 1.5, 2.0, and 3.0. The data is reported two ways: as an asymptotic standard error, as is customarily reported by Cohen et al. (1984), or as a percentile confidence interval on $\beta$, a result of a parametric bootstrap on Weibull($\hat{\theta}$, $\hat{\beta}$), the approach recommended by Abernethy (1998). Peruse the tabulated summaries provided by Tables 6-3 and 6-4, wherein the 5th, 50th, and 95th percentiles of $\hat{\sigma}_{\hat{\beta}}$ and of $\hat{\beta}$ itself are summarized. Note that median standard errors on $\beta$, $\hat{\sigma}_{\hat{\beta}}$, range from 0.146 at $\beta = 0.8$ to 0.550 at $\beta = 3.0$. These values are quite significant. For example, a standard error as low as 0.18, the 5th percentile estimate of $\hat{\sigma}_{\hat{\beta}}$ at $\beta = 1.2$, can significantly impact an analyst’s confidence in his or her estimate. An interval only slightly wider than 1.1 standard errors around $\beta = 1.2$ will lead to a lower limit that is less than 1.0, a value not representative of wearout phenomena. This is also reflected in the percentile confidence limits on $\beta$ reported in Table 6-3.
Again, for $\beta = 1.2$, the interval spans a greater range, [0.953, 1.721], which can be indicative of a wide range of failure phenomena, from wear-in ($\beta < 1$) to wearout ($\beta > 1$). This uncertainty is seen to increase for larger values of $\beta$.
TABLE 6-3 ML Estimate, $\hat{\beta}$, from 500 Replicates from Weibull(1000, $\beta$) Distribution, $n = 20$ Complete Sample
| $\beta$ | 0.8 | 1.0 | 1.2 | 1.5 | 2.0 | 3.0 |
| --- | --- | --- | --- | --- | --- | --- |
| 5th pct. | 0.65292 | 0.79246 | 0.95327 | 1.19376 | 1.53046 | 2.38708 |
| 50th pct. | 0.84989 | 1.06133 | 1.26934 | 1.56703 | 2.08614 | 3.17777 |
| 95th pct. | 1.19158 | 1.45646 | 1.72103 | 2.09441 | 2.94095 | 4.36658 |
Source: Minitab V13.0.
This very uncertainty is reflected in published Weibull databases, for which a portion of data from Bloch and Geitner (1994) is reproduced in Appendix 6C. In the very first entry of the table, data on ball bearings is displayed. The stated lower limit on $\beta$ is 0.7, with an upper limit of 3.5 and a typical value of 1.3. The range of uncertainty is significant! How does this uncertainty impact sample-size determination? To answer this question, sample requirements for an R90C90 success test were evaluated for a range of test-to-field ($m$) and $\beta$-values. The objective was to observe the overall effect of our assumption of a known $\beta$-value as it relates to sample-size requirements for a range of test-to-field ratios. Equation (6.43) was used to evaluate test plan requirements for known values of $\beta$ in the interval [0.8, 3.00] and for values of $m$ in the interval [1.0, 2.5]. Results are summarized in Table 6-5. Sample requirements ranged from a low of 1.4 to a high of 21.9. At $m = 1.0$, the sample requirements are not sensitive to the choice of $\beta$, remaining at $n = 21.9$ [see Eq. (6.43) with $m = 1.0$]. For extended testing, sample-size requirements range from 11.2 to 18.3 at $m = 1.25$, with much greater variation as $m$ increases. At $m = 2.5$, sample-size requirements vary from 1.4 to 10.5, a considerable spread. To remain on the conservative side, choose a $\beta$-value on the low side. In most cases this means a choice of $\beta = 1.0$ or less. But is it always realistic to assume an exponential distribution? These are questions that need an answer!
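The sensitivity study described above can be reproduced with a short Python sketch that evaluates Eq. (6.43) over a grid of $m$ and $\beta$ values. The grid matches the values quoted for Table 6-5; the function and variable names are illustrative assumptions, and the results are rounded to one decimal place for display.

```python
import math

def weibayes_success_sample_size(reliability, confidence, m, beta):
    """Eq. (6.43): n = ln(1 - C) / (m**beta * ln R_L)."""
    return math.log(1.0 - confidence) / (m**beta * math.log(reliability))

betas = (0.80, 1.00, 1.25, 1.50, 2.00, 2.50, 3.00)
ratios = (1.00, 1.25, 1.50, 2.00, 2.50)

def sensitivity_table(reliability=0.90, confidence=0.90):
    """Map each test-to-field ratio m to a row of sample sizes, one per beta."""
    return {m: [round(weibayes_success_sample_size(reliability, confidence, m, b), 1)
                for b in betas]
            for m in ratios}

table = sensitivity_table()
print("m     " + "  ".join(f"{b:5.2f}" for b in betas))
for m, row in table.items():
    print(f"{m:4.2f}  " + "  ".join(f"{n:5.1f}" for n in row))
```

Reading across a row shows how strongly the assumed shape parameter drives the sample requirement once $m$ exceeds 1.0.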
TABLE 6-4 Standard Error of ML Estimate, $\hat{\beta}$ (from Fisher Information Matrix), from 500 Replicates from Weibull (1000, $\beta$) Distribution, n = 20 Complete Sample

$\beta$    5th pct.    50th pct.   95th pct.
0.8        0.117128    0.146266    0.203127
1.0        0.147146    0.183494    0.254659
1.2        0.180281    0.222772    0.303512
1.5        0.216166    0.277661    0.382139
2.0        0.290756    0.368304    0.514060
3.0        0.437942    0.549837    0.755755

Source: Minitab V13.0.
TABLE 6-5 Weibayes Sample-Size Requirements for R90C90 Test over a Range of m and $\beta$

          Shape parameter, $\beta$
m         0.80    1.00    1.25    1.50    2.00    2.50    3.00
1.00      21.9    21.9    21.9    21.9    21.9    21.9    21.9
1.25      18.3    17.5    16.5    15.6    14.0    12.5    11.2
1.50      15.8    14.6    13.2    11.9     9.7     7.9     6.5
2.00      12.6    10.9     9.2     7.7     5.5     3.9     2.7
2.50      10.5     8.7     7.0     5.5     3.5     2.2     1.4

6.5.3 Chrysler Success-Testing Requirements on Sunroof Products
The very uncertainty in the Weibull shape parameter is reflected in industry-specific documents dealing with designing success tests. For example, the author has been able to reverse-engineer an old Chrysler Test Standard, PF-9482, for electric sunroofs, which provides a table to determine appropriate sample-size requirements under extended bogey testing to meet an R by C durability specification. The table, Table 6-6 here, has been recreated in Excel. Reliability lower limits are provided for a range of sample sizes and test-to-field ratios under success testing at a C = 90% level of confidence. The practitioner would consult this table to determine appropriate sample-size requirements based on the test-to-field ratio and a reliability specification. After careful analysis, the plans were identified as extended, exponential success-testing ($\beta = 1$) plans with a Bayesian correction factor on n: the substitution of (n + 1) for n. The exact relationship used is found by rearranging Eq. (6.31), substituting (n + 1) for n, and assuming $\beta = 1$ (exponential), resulting in an expression for evaluating $R_L$ as a function of both the level of confidence, C = 90%, and the test-to-field ratio, m:

$$ R_L = \exp\left[\frac{\ln 0.10}{(n+1)\,m}\right] = 10^{-1/[m(n+1)]} \qquad (6.46) $$

Somewhat surprising is the fact that these plans have been confirmed to be exponential testing plans ($\beta = 1.0$), since we know that sunroofs contain wearable items such as weatherstrips and other electromechanical components. It simply is a manifestation of the reality that the Weibull shape parameter is difficult to characterize with any reasonable level of certainty, and so basing the design on a conservative value of $\beta = 1.0$ makes the most sense. (Note: Another reason for choosing $\beta = 1.0$ might be that we are modeling the reliability of a mechanical system that consists of many components and many potential failure modes. Often we assume that for a wide collection of potential random failure modes, overall system wearout is also a random (exponential, constant failure rate) process.)

TABLE 6-6 Reproduction of Chrysler Test Standard PF-9482 with R = 0.90 Cells Shaded

Confidence level: 90%

Sample    Test-to-field ratio (m)
size      1.00    1.25    1.50    2.00    2.50    3.00
 1        0.32    0.40    0.46    0.56    0.63    0.68
 2        0.46    0.54    0.60    0.68    0.74    0.77
 3        0.56    0.63    0.68    0.75    0.79    0.83
 4        0.63    0.69    0.74    0.79    0.83    0.86
 5        0.68    0.74    0.77    0.83    0.86    0.88
 6        0.72    0.77    0.80    0.85    0.88    0.90
 7        0.75    0.79    0.83    0.87    0.89    0.91
 8        0.77    0.81    0.84    0.88    0.90    0.92
 9        0.79    0.83    0.86    0.89    0.91    0.93
10        0.81    0.85    0.87    0.90    0.92    0.93
11        0.83    0.86    0.88    0.91    0.93    0.94
12        0.84    0.87    0.89    0.92    0.93    0.94
13        0.85    0.88    0.90    0.92    0.94    0.95
14        0.86    0.88    0.90    0.93    0.94    0.95
15        0.87    0.89    0.91    0.93    0.94    0.95
16        0.87    0.90    0.91    0.93    0.95    0.96
17        0.88    0.90    0.92    0.94    0.95    0.96
18        0.89    0.91    0.92    0.94    0.95    0.96
19        0.89    0.91    0.93    0.94    0.95    0.96
20        0.90    0.92    0.93    0.95    0.96    0.96
25        0.92    0.93    0.94    0.96    0.97    0.97
30        0.93    0.94    0.95    0.96    0.97    0.98

Note: Test standard is for success testing (r = 0) and Weibull shape parameter of 1.0.

To verify this, we rearrange Eq. (6.46) in terms of finding sample size, n, and plot this relationship as a function of $\beta$ for R = 0.90 and C = 0.90 for differing test-to-field ratios:

$$ n = \frac{\ln(1-C)}{m^{\beta}\,\ln R_L} - 1 \qquad (6.47) $$
Following Wang (1991), we plot this relationship for R90C90 as a function of both $\beta$ and m in Figure 6-6, illustrating the fact that sample-size requirements increase for lower values of $\beta$, except for the case where m = 1. Thus, m = 1 is a conservative choice when $\beta > 1.0$.
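As a numerical cross-check on Eqs. (6.46) and (6.47), the following sketch evaluates the reliability lower limit of the reverse-engineered Chrysler plan and the corresponding sample size. The function names are hypothetical, and the formulas are the ones stated in the text, with (n + 1) substituted for n.

```python
import math

def bayes_exponential_rl(n, m, confidence=0.90):
    """Eq. (6.46): R_L = exp( ln(1 - C) / ((n + 1) * m) ), with beta = 1."""
    return math.exp(math.log(1.0 - confidence) / ((n + 1) * m))

def bayes_sample_size(reliability, confidence, m, beta):
    """Eq. (6.47): n = ln(1 - C) / (m**beta * ln R_L) - 1."""
    return math.log(1.0 - confidence) / (m ** beta * math.log(reliability)) - 1.0

# A few cells of Table 6-6 (C = 90%).
for n, m in [(1, 1.00), (10, 2.00), (20, 1.00), (30, 3.00)]:
    print(f"n = {n:2d}, m = {m:4.2f}: R_L = {bayes_exponential_rl(n, m):.2f}")

# Sample size versus shape parameter at R90C90 for two test-to-field ratios.
for beta in (0.8, 1.0, 2.0, 3.0):
    n1 = bayes_sample_size(0.90, 0.90, 1.0, beta)
    n2 = bayes_sample_size(0.90, 0.90, 2.0, beta)
    print(f"beta = {beta:3.1f}: n(m=1) = {n1:5.2f}, n(m=2) = {n2:5.2f}")
```

At m = 1 the result does not depend on the shape parameter, while for m > 1 the required n falls as the assumed shape parameter rises.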
FIGURE 6-6 Sample size as a function of Weibull shape parameter, $\beta$, and test-to-field ratio, m.

6.6 TAIL TESTING
The use of tail testing is predicated upon an understanding of a relationship between one or more product characteristics and product life. As illustrated by Figure 6-7, we hypothesize that there exists a weakest extreme of a product characteristic and that parts sampled from this region are most susceptible to failure. If tested parts are from the tail region, and they pass the test, then it is very likely that all stronger parts would have also passed the test, had they been put on test. Of course, if they do not pass the test, it is still likely that some or all of the stronger parts may have passed. Regardless of this insight, the fact that tail items have failed a test presents evidence of design weakness that the design team must address. To implement tail testing, we must either sample from the tail region or build prototypes whose characteristics fall in this region. If we identify a tail region of probability p under the hypothesized distribution of a characteristic, then the number of items we must sample, $n_p$, is reduced by this fraction, p:

$$ n_p = p\,n \qquad (6.48) $$

For example, for success tail testing, our design formula becomes

$$ n_p = p\,n = \frac{p\,\ln(1-C)}{\ln R(t_b)} \qquad (6.49) $$

All other design formulas are adjusted this way.
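A short sketch of Eq. (6.49) follows. The tail probability p = 0.25 used in the example call is purely hypothetical, chosen only to show the scaling, and the function name is an assumption.

```python
import math

def tail_test_sample_size(p_tail, reliability, confidence):
    """Eq. (6.49): n_p = p * ln(1 - C) / ln R(t_b)."""
    n_full = math.log(1.0 - confidence) / math.log(reliability)
    return p_tail * n_full

# R90C90 success test with parts drawn from a hypothetical 25% tail region.
n_p = tail_test_sample_size(0.25, 0.90, 0.90)
print(f"n_p = {n_p:.2f} (before rounding up to whole units)")
```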
FIGURE 6-7 Relationship between product characteristic and product life.
Limitations of Tail Testing

The relationship between a product characteristic (or several) and product life is not known with much certainty, for if it were, why would one ever need to do DV/PV testing? Interactions between characteristics and the overall variability in the functional relationship between life and product characteristics introduce significant risk in the use of tail-testing techniques.
6.7 FAILURE TESTING
The most common question the author hears from reliability engineers is the following: "Given a Weibull probability plot, what design life will meet an R by C specification?" The answer to this question and other questions like it involves the use of the formula for setting beta-binomial bounds, which is presented in Chapter 2. The upper bound for the plotting position, in terms of the F-distribution, is given by [see §2.2.4; Eq. (2.21)]

$$ F_U = \frac{i}{i + (n+1-i)\,F_{2(n+1-i),\,2i,\,C}} $$

therefore,

$$ R_L = 1 - F_U = \frac{(n+1-i)\,F_{2(n+1-i),\,2i,\,C}}{i + (n+1-i)\,F_{2(n+1-i),\,2i,\,C}} = \frac{n+1-i}{(n+1-i) + i\,F_{2i,\,2(n+1-i),\,1-C}} \qquad (6.50) $$
Note: To arrive at Eq. (6.50), we use the notation $F_U$, the upper C confidence limit on the rank $\hat{F}(t_i)$, which replaces $W_{i,\alpha}$. We also make use of the identity $1/F_{\nu_1,\nu_2,\alpha} = F_{\nu_2,\nu_1,1-\alpha}$. It is important to note the differences between Eq. (6.50) and Eq. (6.8). Although both evolve from the equivalence of the F-distribution with the underlying beta distribution, Wasserman (1999–2000a) shows that there are subtle differences between the two expressions. The rank limits as expressed by Eq. (6.50) for the first failure (i = 1) are equivalent to the success–failure testing limits as expressed by the binomial confidence limits. In general, the two relationships will match if i + 1 is substituted for i in Eq. (6.50). The differences are due to the asymmetry of the classical expressions for the binomial confidence limits [see Eqs. (6A.3) and (6A.4)], which were first proposed by Clopper and Pearson (1934). This results in slightly different degrees of freedom on the upper confidence limit on the binomial parameter, p, than those resulting from the beta distribution model on the rank distribution as discussed by Johnson (1951, 1964). Refer to the appendices of Chapter 2 (rank model) and Chapter 6 (binomial confidence limits) for more detailed information on the underlying models for both.

Example 6-5: R by C limits on order statistic

The Ford Motor Company (1997) FAO Reliability Guide poses the following question: Suppose I wish to demonstrate an R95C90 level on first failure (i = 1). What sample size do I need? From Eq. (6.50), substituting i = 1, $R_L = 0.95$, and C = 0.90, we have

$$ 0.95 = \frac{n}{n + F_{2,\,2n,\,0.10}} $$
FIGURE 6-8 Use of Microsoft Excel to evaluate R by C limits on $t_i$.
To solve this, we could use the Tools > Goal Seek nonlinear search procedure of Excel, which is illustrated in Figure 6-8. For $R_L = 0.95$, the left-hand side (LHS) of Eq. (6.50) represents a target of 0.95. We find n using Goal Seek such that the right-hand side (RHS) of Eq. (6.50) is equal to 0.95. This corresponds to $n \approx 45$. It is interesting to note that this n is the same n obtained with the use of the success-testing formula,

$$ n = \frac{\ln(1-C)}{\ln R} $$
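The search in Example 6-5 can also be sketched without Excel. The version below is an alternative to Goal Seek, not the procedure shown in Figure 6-8: instead of F-distribution quantiles it relies on the beta-binomial identity of Appendix 6A, under which the upper C limit on the rank of the i-th ordered failure solves 1 - Bi(i - 1; p, n) = C. All function names are hypothetical.

```python
import math

def binom_cdf(r, p, n):
    """Bi(r; p, n): probability of r or fewer failures in n trials."""
    return sum(math.comb(n, k) * p**k * (1.0 - p)**(n - k) for k in range(r + 1))

def rank_upper_bound(i, n, confidence, tol=1e-12):
    """Upper C limit on the rank of the i-th ordered failure.

    Solves 1 - Bi(i - 1; p, n) = C for p by bisection (left side rises with p).
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if 1.0 - binom_cdf(i - 1, mid, n) < confidence:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def smallest_n(i, reliability, confidence):
    """Smallest sample size whose R_L = 1 - F_U meets the reliability target."""
    n = i
    while 1.0 - rank_upper_bound(i, n, confidence) < reliability:
        n += 1
    return n

n_required = smallest_n(i=1, reliability=0.95, confidence=0.90)
n_formula = math.log(1.0 - 0.90) / math.log(0.95)   # success-testing formula
print(n_required, round(n_formula, 2))
```

For i = 1 the search and the closed-form success-testing formula should agree once the latter is rounded up to a whole unit.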
Again, this agrees with our expectations, considering that we have emphasized the equivalence of the beta-binomial confidence limits formula with i = 1 and the binomial confidence limits formula with r = 0 (success testing).

Example 6-6: Determine the bogey life, $t_b$, to satisfy R by C specifications
A Weibull plot is presented in Figure 6-9 for a complete data set of size n = 20. What bogey life, $t_b$, corresponds to an R90C80 specification? We can graphically arrive at this answer by finding the estimated life that lies on the upper confidence band at 0.10 occurrences of failure. This information is annotated on the graph. The design life corresponding to this requirement is 621 time units.

FIGURE 6-9 Sample Weibull output from WinSmith software.
The confidence bands shown in Figure 6-9 are beta-binomial limits. However, it really does not matter what procedure (Fisher matrix, likelihood ratio, Greenwood's formula, etc.) was used to construct the confidence bands, as the graphical procedure will be the same. Excel can be used to identify a more precise estimate. To do so, we make use of the ideal straight-line Weibull fit:

$$ \hat{R}(t) = \exp\left[-\left(\frac{t}{2971}\right)^{1.886}\right] $$

The procedure followed is illustrated by Figure 6-10:

1. From Eq. (6.50), find the rank, i, satisfying the condition on the beta-binomial upper bound on F(t):

$$ F_U = \frac{i}{i + (n+1-i)\,F_{2(n+1-i),\,2i,\,C}}, \qquad 0.10 = \frac{i}{i + (21-i)\,F_{2(21-i),\,2i,\,0.80}} $$

The equivalent rank is i = 1.30. The Excel Tools > Goal Seek procedure was used to determine this rank.
2. The median rank of i is $\hat{F}(t_i) = 0.0494$. We assume that this point lies perfectly on the Weibull fit.
3. Use $\hat{F}(t) = 1 - \exp[-(t/2971)^{1.886}] = 0.0494$ to find $t_b$. This value is 612.1.

FIGURE 6-10 Use of Microsoft Excel Tools > Goal Seek to identify R90C80 life.
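The three steps of Example 6-6 can be sketched as follows. This is a rough stand-in for the Excel Goal Seek procedure, not a reproduction of it: the beta distribution function is evaluated by Simpson's rule so that a fractional rank can be handled, and the helper names (`beta_cdf`, `bisect`, `bogey_life`) are hypothetical. Because the integration is numerical and Excel's F-quantile routine may treat fractional degrees of freedom differently, the results should land close to, but not necessarily exactly on, the rounded values quoted in the example.

```python
import math

def beta_cdf(x, a, b, steps=2000):
    """Be(x; a, b) by composite Simpson's rule; assumes a >= 1 and b >= 1."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    norm = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b))
    h = x / steps
    total = 0.0
    for k in range(steps + 1):
        w = k * h
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        total += weight * w ** (a - 1.0) * (1.0 - w) ** (b - 1.0)
    return norm * total * h / 3.0

def bisect(g, lo, hi, iters=60):
    """Root of g on [lo, hi], assuming g(lo) and g(hi) differ in sign."""
    g_lo = g(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if (g_mid > 0) == (g_lo > 0):
            lo, g_lo = mid, g_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def bogey_life(n, reliability, confidence, theta, beta):
    f_target = 1.0 - reliability
    # Step 1: fractional rank whose upper C bound equals the allowed failure fraction.
    rank = bisect(lambda i: beta_cdf(f_target, i, n + 1 - i) - confidence, 1.0, float(n))
    # Step 2: median rank (50% point) of that order statistic.
    median = bisect(lambda x: beta_cdf(x, rank, n + 1 - rank) - 0.5, 0.0, 1.0)
    # Step 3: invert the fitted Weibull line F(t) = 1 - exp(-(t/theta)**beta).
    t_b = theta * (-math.log(1.0 - median)) ** (1.0 / beta)
    return rank, median, t_b

rank, median_rank, t_b = bogey_life(n=20, reliability=0.90, confidence=0.80,
                                    theta=2971.0, beta=1.886)
print(f"rank = {rank:.2f}, median rank = {median_rank:.4f}, t_b = {t_b:.1f}")
```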
Similar procedures can also be used based on the use of other confidence bands. For example, Minitab makes use of Fisher–matrix (asymptotic) bounds, wherever appropriate. However, the internal computational effort is much greater due to the need to calculate components of the observed Fisher information matrix.
6.8 OTHER MANAGEMENT CONSIDERATIONS
The very definition of reliability includes the requirement that operating conditions be carefully described. Test conditions should be reflective of the aggregate stresses placed on a product as the customer uses it. The common practice of running tests at ambient conditions is not adequate. In order for test results to be useful, they must correlate with field performance. This can happen only if testing is conducted under conditions that are reflective of the ways a product is used. As an example, the test plan for verifying the design of an automotive latching mechanism is illustrated by Figure 6-11. To accurately reflect customer usage, temperature test conditions are varied from very cold to very warm. Humidity factors are included, too. An effective testing strategy must call for combined environmental stress testing. Such a strategy must be employed over the full range of significant environmental factors that impact product performance. Combinations of environmental test factors must be chosen to accurately reflect the range of conditions. Loading profiles during the test must accurately reflect multi-axial stresses during usage.

FIGURE 6-11 Latch mechanism durability test.

6.9 SUMMARY
A general formula for sample-size determination is given by

$$ n = -\frac{\chi^2_{2r+2,\,1-C}}{2\,m^{\beta}\,\ln R_L} $$

and for the test-to-field ratio,

$$ m = \left[-\frac{\chi^2_{2r+2,\,1-C}}{2\,n\,\ln R_L}\right]^{1/\beta} $$

We modify these formulas for specific testing scenarios as follows:

For success tests (r = 0), we substitute $\chi^2_{2r+2,\,1-C} = -2\ln(1-C)$. This results in $n = \ln(1-C)/(m^{\beta}\ln R_L)$ and $m = [\ln(1-C)/(n\ln R_L)]^{1/\beta}$.
For tests that are not extended, we set m = 1.
For exponential tests, substitute $\beta = 1$.
For the Bayesian adjustment, substitute n + 1 for n.

6.10 EXERCISES

1. In today's global competitive marketplace, design engineers have increased pressures to reduce product lead times. How does this impact DV/PV testing?
2. n = 35 items are put on test. What reliability can be demonstrated by a success test at a confidence level of 85%?
3. What are the risks of using extended bogey testing?
4. What sample size is required to satisfy an R98C80 success test?
5. A success–failure test allowing r = 2 failures is to be conducted. What sample size is required to verify R92C90 specifications?
6. An extended success test is to be run. At R90C95, suggest a combination of sample size and test-to-field ratio to achieve this based on
   a. Underlying exponential distribution assumption.
   b. Weibull having a $\beta$ of 1.55.
7. Data is fit to a Weibull distribution. For a complete data set of n = 15 items, the rank regression fit results in the estimates $\hat{\beta} = 1.20$, $\hat{\theta} = 1500$ hr. Find the corresponding R95C90 bogey life.
APPENDIX 6A BINOMIAL DISTRIBUTION

APPENDIX 6A.1 BINOMIAL DISTRIBUTION
The binomial probability, bi(r; p, n), is the probability of observing exactly X = r failures or successes in n independent trials. The order of occurrence of the r outcomes is not important. A parameter, p, denotes the probability of occurrence on each of the n trials. The binomial probability distribution is given by

$$ \mathrm{bi}(r; p, n) = \binom{n}{r} p^{r} (1-p)^{n-r} \qquad (6A.1) $$

The combinatorial coefficient, $\binom{n}{r}$, denotes the number of ways that one can choose r items of one kind from a set of n items, with $\binom{n}{r} = n!/[(n-r)!\,r!]$. The mean of X, a binomially distributed random variable, is E(X) = np. Its variance is Var(X) = np(1 - p). Both the mean and variance may be estimated by using $\hat{p} = r/n$. The cumulative binomial distribution function, Bi(r; p, n), represents the probability of observing r or fewer failures in n independent trials. It is given by

$$ \mathrm{Bi}(r; p, n) = \sum_{i=0}^{r} \mathrm{bi}(i; p, n) = \sum_{i=0}^{r} \binom{n}{i} p^{i} (1-p)^{n-i} \qquad (6A.2) $$
Example 6-7 (Ebeling, 1997)

A small aircraft landing gear has three tires. It can make a safe landing if no more than one tire bursts. Historical records reveal that, on the average, tire bursts occur once in every 2000 landings. What is the probability that a given aircraft will make a safe landing? Answer: The binomial parameter, p, denoting the probability of a single tire bursting is 1/2000 = 0.0005; n = 3, the number of tires. Therefore, the probability of an aircraft making a safe landing may be written as

$$ \mathrm{Bi}(1; 0.0005, 3) = \binom{3}{0} 0.0005^{0} (1-0.0005)^{3} + \binom{3}{1} 0.0005^{1} (1-0.0005)^{2} = 0.99999 $$
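Equations (6A.1) and (6A.2) translate directly into code. The sketch below, with hypothetical function names, evaluates the landing-gear probability of Example 6-7.

```python
import math

def binom_pmf(r, p, n):
    """bi(r; p, n) = C(n, r) * p**r * (1 - p)**(n - r), Eq. (6A.1)."""
    return math.comb(n, r) * p**r * (1.0 - p)**(n - r)

def binom_cdf(r, p, n):
    """Bi(r; p, n): sum of bi(i; p, n) for i = 0..r, Eq. (6A.2)."""
    return sum(binom_pmf(i, p, n) for i in range(r + 1))

# Safe landing: no more than one of three tires bursts, p = 1/2000 per tire.
p_safe = binom_cdf(1, 1 / 2000, 3)
print(f"{p_safe:.8f}")
```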
APPENDIX 6A.2 BINOMIAL CONFIDENCE LIMIT EXPRESSIONS
The so-called Clopper–Pearson (1934) binomial confidence limits may be obtained using any of the following expressions:

Upper one-sided confidence limit on the binomial parameter, p ($p_U$):

$$ \mathrm{Bi}(r; p_U, n) = 1 - C \quad \text{or} \quad 1 - \mathrm{Bi}(r; p_U, n) = C \qquad (6A.3) $$

Lower one-sided confidence limit on the binomial parameter, p ($p_L$):

$$ \mathrm{Bi}(r-1; p_L, n) = C \quad \text{or} \quad 1 - \mathrm{Bi}(r-1; p_L, n) = 1 - C \qquad (6A.4) $$

For two-sided confidence limits, replace 1 - C by $\tfrac{1}{2}(1 - C)$ and C by $\tfrac{1}{2}(1 + C)$. These binomial confidence limit expressions have been suggested to provide the largest (smallest) reasonable values of p based on observing r failures in a sample of size n, with level of confidence C. The binomial confidence limit expressions may be evaluated using trial-and-error methods or approximated using a binomial nomograph or Clopper–Pearson (1934) charts (also available in Lewis, 1996, or Kececioglu, 1994). Alternatively, exact expressions have been developed based on the F-distribution, for which tables and statistical computer procedures are widely available. The demonstration of the use of the equivalent F-distribution relationship requires the use of an intermediary relationship based on the beta distribution. This is discussed next.
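The trial-and-error evaluation mentioned above can be automated with a simple bisection on the cumulative binomial distribution. This is one possible sketch, with hypothetical function names, of solving Eqs. (6A.3) and (6A.4) directly rather than through F-tables.

```python
import math

def binom_cdf(r, p, n):
    """Bi(r; p, n): probability of r or fewer failures in n trials."""
    return sum(math.comb(n, k) * p**k * (1.0 - p)**(n - k) for k in range(r + 1))

def upper_limit(r, n, confidence, tol=1e-12):
    """Solve Bi(r; p_U, n) = 1 - C, Eq. (6A.3). Bi falls as p rises."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if binom_cdf(r, mid, n) > 1.0 - confidence:
            lo = mid          # CDF still too large, so p_U lies above mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def lower_limit(r, n, confidence, tol=1e-12):
    """Solve Bi(r - 1; p_L, n) = C, Eq. (6A.4); taken as 0 when r = 0."""
    if r == 0:
        return 0.0
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if binom_cdf(r - 1, mid, n) > confidence:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Success test (r = 0) on n = 22 units at C = 90%: R_L = 1 - p_U.
p_u = upper_limit(0, 22, 0.90)
print(f"p_U = {p_u:.4f}, R_L = {1.0 - p_u:.4f}")
```

For r = 0 the upper limit has the closed form 1 - (1 - C)^(1/n), which gives a convenient check on the search.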
6A.2.1 Use of Beta Distribution to Evaluate Binomial Confidence Limits
A random variable, $W_{a,b}$, follows a beta distribution with parameters a and b if its probability density function is given by

$$ \mathrm{be}(w; a, b) = \frac{1}{\mathrm{Beta}(a, b)}\, w^{a-1} (1-w)^{b-1} \quad \text{for } 0 \le w \le 1 \qquad (6A.5) $$

where Beta(a, b) denotes the beta function, which can be expressed as a function of several gamma functions according to

$$ \mathrm{Beta}(a, b) = \frac{\Gamma(a)\,\Gamma(b)}{\Gamma(a+b)} = \frac{(a-1)!\,(b-1)!}{(a+b-1)!} \quad \text{if } a \text{ and } b \text{ are integers} $$
We demonstrate the equivalent relation:

$$ 1 - \mathrm{Bi}(r; p, n) = \mathrm{Be}(p; r+1, n-r) = \frac{n!}{r!\,(n-r-1)!} \left[ \int_0^p w^{r} (1-w)^{n-r-1}\, dw \right] \qquad (6A.6) $$

In Eq. (6A.6), Be(p; r + 1, n - r) denotes the cumulative distribution function, $P(W_{r+1,\,n-r} \le p)$, of a beta distribution with parameters a = r + 1, b = n - r. The bracketed term in Eq. (6A.6) is known as the incomplete beta function, for which values have been tabulated in reference textbooks. Using integration by parts, the right-hand side of Eq. (6A.6) can be shown to be equal to the binomial probability, bi(r + 1; p, n), plus another beta distributional term, Be(p; r + 2, n - r - 1):

$$
\begin{aligned}
\mathrm{Be}(p; r+1, n-r)
&= \frac{n!}{r!\,(n-r-1)!} \left[ \int_0^p \underbrace{(1-w)^{n-r-1}}_{U}\; \underbrace{w^{r}\, dw}_{dV} \right] \\
&= \frac{n!}{r!\,(n-r-1)!} \left[ \underbrace{\frac{1}{r+1} \Big[ (1-w)^{n-r-1} w^{r+1} \Big]_0^p}_{UV}
 + \underbrace{\int_0^p \frac{w^{r+1}}{r+1}\,(n-r-1)(1-w)^{n-r-2}\, dw}_{-\int V\, dU} \right] \\
&= \underbrace{\frac{n!}{r!\,(n-r-1)!}\,\frac{1}{r+1}\, p^{r+1} (1-p)^{n-r-1}}_{\mathrm{bi}(r+1;\, p,\, n)}
 + \underbrace{\frac{n!}{r!\,(n-r-1)!}\,\frac{n-r-1}{r+1} \int_0^p w^{r+1} (1-w)^{n-r-2}\, dw}_{\mathrm{Be}(p;\, r+2,\, n-r-1)} \\
&= \mathrm{bi}(r+1; p, n) + \mathrm{Be}(p; r+2, n-r-1)
\end{aligned}
\qquad (6A.7)
$$

The process Eq. (6A.7) describes can be reapplied to the resultant beta distributional term, Be(p; r + 2, n - r - 1), as follows:

$$ \mathrm{Be}(p; r+2, n-r-1) = \mathrm{bi}(r+2; p, n) + \mathrm{Be}(p; r+3, n-r-2) \qquad (6A.8) $$
This process can be repeated for a total of n - r times, resulting in the relation

$$ \mathrm{Be}(p; r+1, n-r) = \sum_{x=r+1}^{n} \mathrm{bi}(x; p, n) = 1 - \mathrm{Bi}(r; p, n) \qquad (6A.9) $$
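The identity in Eq. (6A.9) is easy to spot-check numerically. The sketch below integrates the beta density of Eq. (6A.5) with Simpson's rule for integer parameters and compares the result with the binomial tail; the function names and the test point (n = 20, r = 2, p = 0.10) are arbitrary choices made for illustration.

```python
import math

def binom_cdf(r, p, n):
    """Bi(r; p, n): probability of r or fewer failures in n trials."""
    return sum(math.comb(n, k) * p**k * (1.0 - p)**(n - k) for k in range(r + 1))

def beta_cdf_integer(p, a, b, steps=2000):
    """Be(p; a, b) for integer a, b >= 1 by composite Simpson's rule."""
    # 1 / Beta(a, b) = (a + b - 1)! / ((a - 1)! (b - 1)!) for integer a, b.
    norm = math.factorial(a + b - 1) / (math.factorial(a - 1) * math.factorial(b - 1))
    pdf = lambda w: norm * w ** (a - 1) * (1.0 - w) ** (b - 1)
    h = p / steps
    total = pdf(0.0) + pdf(p)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * pdf(k * h)
    return total * h / 3.0

n, r, p = 20, 2, 0.10
lhs = beta_cdf_integer(p, r + 1, n - r)   # Be(p; r + 1, n - r)
rhs = 1.0 - binom_cdf(r, p, n)            # 1 - Bi(r; p, n)
print(f"Be = {lhs:.6f}, 1 - Bi = {rhs:.6f}")
```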
Thus, the expression for $p_U$, the upper binomial confidence limit, given by Eq. (6A.3), may be reexpressed in terms of an equivalent relationship based on the beta distribution. The equivalent relationship may be expressed as

$$ \mathrm{Be}(p_U; r+1, n-r) = C \qquad (6A.10) $$

or in terms of the right-tail quantile of the beta distribution:

$$ p_U = W_{r+1,\,n-r,\,1-C} \qquad (6A.11) $$

6A.2.2 Using the F-Distribution to Evaluate Beta Distribution Quantiles
Let $W_{a,b} \sim \mathrm{Be}(a, b)$ over the interval $0 \le W \le 1$; then we can use the transform

$$ W_{a,b} = \frac{(a/b)\,F}{1 + (a/b)\,F} $$

where F denotes an F-distributed random variable having $\nu_1 = 2a$ numerator and $\nu_2 = 2b$ denominator degrees of freedom. Thus, to evaluate $W_{a,b,1-C}$, a right-tail quantile, we merely need to look up $F_{2a,2b,1-C}$ or $F_{2b,2a,C}$ in an F-table, and then find $W_{a,b,1-C}$ using

$$ W_{a,b,1-C} = \frac{(a/b)\,F_{2a,\,2b,\,1-C}}{1 + (a/b)\,F_{2a,\,2b,\,1-C}} = \frac{1}{1 + (b/a)\,F_{2b,\,2a,\,C}} \qquad (6A.12) $$

In Eq. (6A.12) we make use of the fact that $F_{\nu_1,\nu_2,C} = 1/F_{\nu_2,\nu_1,1-C}$.

Proof. The details can be found in §2A.2.1.

6A.2.3 Using the F-Distribution to Evaluate the Binomial Distribution
Combining Eqs. (6A.11) and (6A.12) yields

$$ p_U = W_{r+1,\,n-r,\,1-C} = \frac{1}{1 + \dfrac{n-r}{r+1}\,F_{2(n-r),\,2(r+1),\,C}} \qquad (6A.13) $$

Similarly, for $p_L$, the relationships Eq. (6A.4) displays may be expressed in terms of the beta distribution as

$$ \mathrm{Be}(p_L; r, n-r+1) = 1 - C \qquad (6A.14) $$

or in terms of the beta right-tailed quantile,

$$ p_L = W_{r,\,n-r+1,\,C} \qquad (6A.15) $$

or in terms of the F-distributional quantile,

$$ p_L = W_{r,\,n-r+1,\,C} = \frac{1}{1 + \dfrac{n-r+1}{r}\,F_{2(n-r+1),\,2r,\,1-C}} \qquad (6A.16) $$

Equations (6A.13) and (6A.16) comprise standard formulas for binomial confidence limits using the F-distribution. To find the lower confidence limit on R, denoted $R_L$, we note that $R_L = 1 - p_U$:

$$ R_L = 1 - \frac{1}{1 + \dfrac{n-r}{r+1}\,F_{2(n-r),\,2(r+1),\,C}} = \frac{\dfrac{n-r}{r+1}\,F_{2(n-r),\,2(r+1),\,C}}{1 + \dfrac{n-r}{r+1}\,F_{2(n-r),\,2(r+1),\,C}} = \frac{n-r}{n - r + (r+1)\,F_{2(r+1),\,2(n-r),\,1-C}} \qquad (6A.17) $$

This is the desired result.
APPENDIX 6B BAYESIAN ESTIMATION OF FAILURE FRACTION, p
Bayesian approaches to reliability estimation are useful in the presence of limited sample information, as they allow for the incorporation of other useful information from outside the test. For example, consider an automaker that is working on the launch of a new vehicle. In such instances, only a limited number of prototypes can be tested. The use of Bayesian methods allows for the use of a much larger pool of field data on similar makes, or on the same make but from the last model year. This information and other engineering knowledge can go into the formulation of a prior distribution on the parameter, p, of the binomial distribution. Unfortunately, the complexity of Bayesian methods has turned most practitioners away from considering their use. Accordingly, the use of Bayesian methods is not presented as a primary subject area in this reference. For further information, consult Martz and Waller (1982) as a good technical starting point for learning more about Bayesian methods. In this section we describe the use of an uninformative prior and how it might be used to estimate the binomial proportion, p, the failure fraction. The use of this Bayesian model results in a reduction of sample-size requirements from n to n - 1. We describe this model next.
6B.1 OVERVIEW OF BAYESIAN ESTIMATION
Bayesian estimation is a sequential process. We form a prior distribution on a parameter (or parameters) of interest. The parameters of the prior distribution are chosen in such a way that the mean, variance, and other moments of this distribution accurately reflect any prior knowledge on the reliability metric to be estimated. The distribution mean, mode, or median, whichever you choose, will reflect your prior estimate of the metric. The certainty in your belief is reflected by your choice of a variance. A large variance is indicative of a high level of uncertainty. In the binomial estimation model that we describe, we choose a very uninformative prior: a uniform prior on p over the interval (0, 1). Its use conveys only the vague knowledge that the true failure fraction is equally likely to lie anywhere in the domain of p. Formally, the prior distribution, h(p), is the uniform distribution

$$ h(p) = 1 \quad \text{for } 0 \le p \le 1 \qquad (6B.1) $$
Once sample information does become available, we use Bayes' theorem to update our estimate of p. The result is referred to as a posterior distribution. The posterior distribution is constructed using

$$ h(p \mid x) = \frac{f(x \mid p)\,h(p)}{f_x(x)} \qquad (6B.2) $$

where

f(x | p) = observable distribution on the outcome, x (binomial distribution in our case)
$f_x(x)$ = the marginal distribution on x, formed by

$$ f_x(x) = \int_0^1 g(x, p)\, dp \quad \text{where } g(x, p) = f(x \mid p)\,h(p) \qquad (6B.3) $$
In Eq. (6B.3), gðx; pÞ is a joint distribution on the test outcome, x, with the estimated parameter of interest, p. Once the posterior distribution is identified, it may then be used as a subsequent prior distribution for the next sample or test. This process continues sequentially as long as new test information becomes available. The use of a Bayesian estimation procedure allows for efficient combination of past sample information. Generally, the prior distribution is chosen carefully so that its combination with the observable distribution, f ðxjpÞ, produces a posterior distribution of the same distribution form as the prior, but with different parameters. In our case the use of a beta prior distribution in conjunction with a binomial likelihood combines to produce a posterior distribution that also follows a beta distribution. When the prior distribution is selected this way, it is referred to as a conjugate distribution. When a conjugate distribution is used, the Bayesian estimation procedure reduces to a set of simple equations for updating the parameters of the posterior distribution. Often these equations are similar to the use of exponentially weighted moving average used in quality control and forecasting. Unfortunately, due to the complexity of the Weibull distribution, the Bayesian models for estimating the parameters of the Weibull are very difficult to establish.
6B.2 THE USE OF AN UNINFORMATIVE PRIOR ON THE BINOMIAL FRACTION, p
The uniform prior on p, h(p), is expressed by Eq. (6B.1). The distribution of outcomes for modeling success–failure tests is the binomial distribution

$$ f(r \mid p) = \binom{n}{r} p^{r} (1-p)^{n-r} \qquad (6B.4) $$

The marginal on r, $f_r(r)$, is related to the beta distribution [see Eq. (6A.5)]:

$$
\begin{aligned}
f_r(r) &= \int_0^1 f(r \mid p)\,h(p)\, dp = \int_0^1 \binom{n}{r} p^{r} (1-p)^{n-r}\, dp \\
&= \frac{1}{n+1} \int_0^1 \frac{(n+1)!}{r!\,(n-r)!}\, p^{(r+1)-1} (1-p)^{(n-r+1)-1}\, dp \\
&= \frac{1}{n+1} \int_0^1 \mathrm{be}(p; r+1, n-r+1)\, dp = \frac{1}{n+1}
\end{aligned}
\qquad (6B.5)
$$

By Eq. (6B.5), the marginal distribution on the number of failed units, r, is just the uniform probability point mass of 1/(n + 1) on each of the possible outcomes, r = 0, 1, 2, ..., n. The posterior distribution on p, h(p | r), is given by

$$
\begin{aligned}
h(p \mid r) &= \frac{f(r \mid p)\,h(p)}{f_r(r)} = \frac{\binom{n}{r} p^{r} (1-p)^{n-r}}{1/(n+1)} \\
&= \frac{(n+1)\,n!}{r!\,(n-r)!}\, p^{(r+1)-1} (1-p)^{(n-r+1)-1} = \mathrm{be}(p; r+1, n+1-r)
\end{aligned}
\qquad (6B.6)
$$
The posterior distribution is a beta distribution with parameters a = r + 1 and b = n + 1 - r. Comparing Eq. (6A.6) with Eq. (6B.6), it is evident that percentiles of this beta distribution will be related to percentiles of a binomial distribution for a sample size of n + 1. Formally, the percentiles of the posterior distribution of p are related to the binomial as

$$ P(p \le p_U) = \int_0^{p_U} h(p \mid r)\, dp = \mathrm{Be}(p_U; r+1, n+1-r) = 1 - \mathrm{Bi}(r; p_U, n+1) \qquad (6B.7) $$

For a success test, with r = 0, we evaluate the binomial probability:

$$ \mathrm{Bi}(0; p, n+1) = (1-p)^{n+1} = 1 - C \qquad (6B.8) $$

Thus, the use of a vague prior results in success-testing models wherein the quantity n + 1 appears in place of n. This translates into a reduction of the sample requirement by 1. To confirm this, one should examine the expressions given by Eqs. (6B.8) and (6.2).
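The one-unit reduction can be illustrated with a short comparison. The sketch below assumes that the classical success-test requirement is R^n <= 1 - C (taken here to be the content of Eq. (6.2), which is not reproduced in this appendix) and that the uniform-prior version is Eq. (6B.8) with p = 1 - R; the function names are hypothetical.

```python
import math

def classical_success_n(reliability, confidence):
    """Smallest integer n with R**n <= 1 - C (classical success test)."""
    return math.ceil(math.log(1.0 - confidence) / math.log(reliability))

def bayes_uniform_success_n(reliability, confidence):
    """Smallest integer n with R**(n + 1) <= 1 - C, per Eq. (6B.8)."""
    return max(0, math.ceil(math.log(1.0 - confidence) / math.log(reliability) - 1.0))

def bayes_uniform_upper_limit(n, confidence):
    """p_U from (1 - p_U)**(n + 1) = 1 - C for a success test (r = 0)."""
    return 1.0 - (1.0 - confidence) ** (1.0 / (n + 1))

n_classical = classical_success_n(0.90, 0.90)
n_bayes = bayes_uniform_success_n(0.90, 0.90)
print(n_classical, n_bayes, round(1.0 - bayes_uniform_upper_limit(n_bayes, 0.90), 4))
```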
APPENDIX 6C WEIBULL PROPERTIES
A summary of typical, high, and low values of Weibull $\beta$ and $\theta$ for mechanical components appears in Table 6-7. The data, which is taken from Bloch and Geitner (1994), also appears at http://www.barringer1.com/wdbase.htm.
TABLE 6-7 Weibull Properties Database

                               $\beta$                  $\theta$
Components                     Low   Typical  High      Low          Typical      High
Ball bearing                   0.7   1.3      3.5       14,000       40,000       250,000
Roller bearings                0.7   1.3      3.5       9,000        50,000       125,000
Sleeve bearing                 0.7   1        3         10,000       50,000       143,000
Belts, drive                   0.5   1.2      2.8       9,000        30,000       91,000
Bellows, hydraulic             0.5   1.3      3         14,000       50,000       100,000
Bolts                          0.5   3        10        125,000      300,000      100,000,000
Clutches, friction             0.5   1.4      3         67,000       100,000      500,000
Clutches, magnetic             0.8   1        1.6       100,000      150,000      333,000
Couplings                      0.8   2        6         25,000       75,000       333,000
Couplings, gear                0.8   2.5      4         25,000       75,000       1,250,000
Cylinders, hydraulic           1     2        3.8       9,000,000    900,000      200,000,000
Diaphragm, metal               0.5   3        6         50,000       65,000       500,000
Diaphragm, rubber              0.5   1.1      1.4       50,000       60,000       300,000
Gaskets, hydraulics            0.5   1.1      1.4       700,000      75,000       3,300,000
Filter, oil                    0.5   1.1      1.4       20,000       25,000       125,000
Gears                          0.5   2        6         33,000       75,000       500,000
Impellers, pumps               0.5   2.5      6         125,000      150,000      1,400,000
Joints, mechanical             0.5   1.2      6         150,000      1,400,000    10,000,000
Knife edges, fulcrum           0.5   1        6         1,700,000    2,000,000    16,700,000
Liner, recip. comp. cyl.       0.5   1.8      3         20,000       50,000       300,000
Nuts                           0.5   1.1      1.4       14,000       50,000       500,000
"O"-rings, elastomeric         0.5   1.1      1.4       5,000        20,000       33,000
Packings, recip. comp. rod     0.5   1.1      1.4       5,000        20,000       33,000
Pins                           0.5   1.4      5         17,000       50,000       170,000
Pistons, engines               0.5   1.4      3         20,000       75,000       170,000
Pivots                         0.5   1.4      5         300,000      400,000      1,400,000
Pumps, lubricators             0.5   1.1      1.4       13,000       50,000       125,000
Seals, mechanical              0.8   1.4      4         3,000        25,000       50,000
Shafts, cent. pumps            0.8   1.2      3         50,000       50,000       300,000
Springs                        0.5   1.1      3         14,000       25,000       5,000,000

TABLE 6-7 (continued)

                               $\beta$                  $\theta$
Instrumentation                Low   Typical  High      Low          Typical      High
Controllers, solid state       0.5   0.7      1.1       20,000       100,000      200,000
Control valves                 0.5   1        2         14,000       100,000      333,000
Motorized valves               0.5   1.1      3         17,000       25,000       1,000,000
Solenoid valves                0.5   1.1      3         50,000       75,000       1,000,000
Transducers                    0.5   1        3         11,000       20,000       90,000
Transmitters                   0.5   1        2         100,000      150,000      1,100,000
Temperature indicators         0.5   1        2         140,000      150,000      3,300,000
Pressure indicators            0.5   1.2      3         110,000      125,000      3,300,000
Flow instrumentation           0.5   1        3         100,000      125,000      10,000,000
Level instrumentation          0.5   1        3         14,000       25,000       500,000
Electromechanical parts        0.5   1        3         13,000       25,000       1,000,000

                               $\beta$                  $\theta$
Service liquids                Low   Typical  High      Low          Typical      High
Coolants                       0.5   1.1      2         11,000       15,000       33,000
Lubricants, screw compr.       0.5   1.1      3         11,000       15,000       40,000
Lube oils, mineral             0.5   1.1      3         3,000        10,000       25,000
Lube oils, synthetic           0.5   1.1      3         33,000       50,000       250,000
Greases                        0.5   1.1      3         7,000        10,000       33,000

Source: Bloch and Geitner, 1994. Note: Some entries, such as that for cylinders, hydraulic, appear to be anomalous, with typical values for $\theta$ appearing outside low and high limits.
7 Accelerated Testing
Expect the unexpected! In today’s competitive marketplace, product design teams are under immense pressure to reduce product lead times. For example, in the auto industry, product lead times, which often exceeded 48 months 10 years ago, are now below 24 months in some cases. Manpreet Nagvanshi (1999) of General Motors provided Figure 7-1, which reveals a vision of reducing lead times even further, to a level below 18 months by 2002. In turn, test organizations have to adopt new methods for reducing design verification (DV) test time. In the auto industry, for example, it is not uncommon to see a durability test requiring thousands of cycles or hours of test time. (And consider what happens if design changes have to be made and a second DV test event must be scheduled!) Consequently, engineers must identify opportunities to shorten test times. One way to achieve this is through the use of accelerated test procedures.
7.1 ACCELERATED TESTING
FIGURE 7-1 Product lead times in auto industry.

The situation depicted in Figure 7-1 is not unique to the auto industry. The pressure to reduce product lead times is common to all durable-goods manufacturers, including farm machinery, consumer appliances, and electronics. Design and product verification (DV/PV) activities can often require thousands of hours of real-time testing; as such, there are time pressures to reduce test times. Accelerated testing allows for reduced test times by providing test conditions that "speed up" the evolution of failure modes. This is usually achieved by elevating either environmental conditions or conditions of usage. For example, the use of elevated temperatures is a common approach to accelerate test times. As we know from chemistry and physics, the use of elevated temperatures accelerates the kinetics of chemical reactions, resulting in an acceleration of physical-chemical phenomena such as corrosion processes, oxidation, and other chemical breakdowns. Additionally, we know that elevated temperatures lead to expansion and possible phase changes. In the former case, material properties of bonded or composite materials might be affected by dissimilarities in coefficients of expansion. Such phenomena are particularly evident in the electronics industry. There are risks, however, in the use of accelerated test procedures. It may be possible for new or different failure modes to evolve under accelerated conditions. Again, looking at temperature as an acceleration factor, the possibility of phase changes or other high-temperature phenomena may lead to the evolution of failure modes that the customer might never experience under normal usage conditions.
7.1.1
Benefits/Limitations of Accelerated Testing
The benefits of accelerated testing are well known:
• Better understanding of the effect of various stresses on product life
• Improved product reliability through identification and elimination of design and process deficiencies
• Shorter DV/PV test times, which lead to an overall reduction in product lead times
• Reduced product development costs
There are inherent risks in the use of accelerated life test procedures because the physics underlying the evolution of failure modes might be quite different. The risks include the following:
• Anomalous failure modes might be introduced under elevated stress conditions. These are failures customers are not likely to observe under normal product usage conditions.
• The use of an improperly specified acceleration model may lead to erroneous conclusions, since the analysis of failure data under elevated stress conditions cannot then be properly transformed into information that can be used under normal usage conditions.
• The need to specify stress levels as close to operating range as possible while still achieving shortened test duration. Stresses outside product operating limits can lead to damage resulting from phase changes, for example.
• Sometimes accelerated tests may require masking some failure modes so that the test can be run at elevated stresses. For example, a design may use solder that melts at 200°F. In order for a test to run at this temperature, a prototype might be built that uses an alternative grade of solder that can withstand high temperatures.
• If multiple stresses are combined, it may not be possible to develop stress versus life relationships due to the inherent interaction and complexities in their combined application.
7.1.2
Two Basic Strategies for Accelerated Testing
There are two basic strategies for accelerated testing: highly accelerated life testing (HALT) and accelerated life testing (ALT). They differ in the type of failure information communicated back to the engineer. HALT tests elevate stresses to quickly expose design weaknesses for the purpose of quick feedback to the design engineering community for resolution. ALT testing uses the physics of failure to model the time-to-failure, allowing the engineer to project back to real-world usage. In addition to providing qualitative information about failures, ALT testing provides the engineer with quantitative information about failures. We now describe these two basic strategies for accelerated testing in terms of common test procedures used today.
1. Highly accelerated tests: A set of highly elevated stress conditions is identified for testing. No attempt is made to mathematically relate performance under elevated stress conditions to normal usage conditions. At these elevated conditions, potential failure modes are more likely to be observed. Any early failures are interpreted as indicators of potential design or process weaknesses, which should be corrected. For many products a combination of elevated temperature and three-dimensional loading is used to accomplish this. In order to minimize the evolution of unrealistic failure modes, the level of elevation must not be excessive. Feinberg and Gibson (1995) point out that anomalous failures will occur when elevated stresses induce “nonlinearities” in the product. For example, material phase changes from solid to liquid involve a physical-chemical phase transition (e.g., solder melting, intermetallic changes, etc.), which is highly nonlinear. Specific applications of highly accelerated tests include, but are not limited to, the following:
• HALT tests: Tests that are commercially advertised as highly accelerated life tests (HALT). These are specifically designed for use in DV/PV testing to quickly detect any latent deficiencies in a design. They are widely used for verifying the properties of electronics subsystems. A version promoted as FMVT (failure mode verification tests) is a HALT test for mechanical systems.
• Step-stress tests: Test items are subjected to successively higher levels of stress until failure occurs. This approach is used to ensure the evolution of failures. It differs from HALT testing in that stresses are applied in a systematic manner, not randomly. Under step-stress testing, the test is run under nominal test conditions for one life. Then stresses such as temperature, voltage, and vibration are ramped up systematically in 10% increments. To accomplish this, the destruct limits of the product must be identified. Destruct limits are determined by the level of stress at which the product permanently stops functioning. The difference between the destruct limits and nominal is broken down into ten increments.
• Elephant tests: A DV/PV test under which product is subjected to extreme levels of stress (torture tests). Product must pass this test to be qualified.
• Burn-in tests: Used on manufactured electronics components and subassemblies. Elevated humidity and temperature are used to screen out infant mortalities. These are the items that could potentially fail early in their use due to manufacturing defects.
• Environmental stress screening (ESS): Developed as an improvement over traditional burn-in testing. The product is subjected to stresses that are strong enough to detect latent failures but do not take significant life out of the product. Instead of using high levels of temperature and humidity, which might damage sensitive components, ESS involves more moderate temperature cycling, physical vibration, and accelerated rates of circuit activation. Mil-Std-2164 describes standard procedures for the ESS of electronic equipment. Also, the reader might refer to Kececioglu and Sun (1995) for a more comprehensive description of ESS methods.
• Highly accelerated stress screening (HASS): A form of ESS commercially advertised as HASS. HASS allows for the discovery of process changes in manufacturing. It is advertised more as a way to monitor the process for special causes than as a means to screen out defective product.
2. Accelerated life testing (ALT): An acceleration factor or physics-of-failure model is used to relate life at varying elevated and normal usage conditions. ALT works best when only one or two stress factors are elevated at a time, and a single failure mode is studied. It is common to use a single stress factor, such as temperature or voltage, to elevate stress. For example, elevated temperature can lead to acceleration of corrosive and other chemical process phenomena, which can affect product life. In a polymeric material, this would be manifested as a degradation of material properties over time, due to the accompanying destruction of polymeric cross-linkages. The combined use of cyclic loading and temperature is quite common in ALT testing. A difference in the coefficients of expansion of adjoining materials can lead to separation and/or damage due to temperature cycling effects. Be aware of certain risks in the use of ALT on systems. It is entirely possible that the acceleration factor(s) may not result in a uniform elevation or acceleration of failure phenomena across subsystems. This is an important consideration when looking at subsystem- or component-specific failure phenomena.
We now survey these two approaches in greater detail.
7.2
HIGHLY ACCELERATED LIFE TESTING (HALT)
A typical HALT chamber is shown in Figure 7-2. During a HALT test, a series of individual and combined stresses such as multi-axis vibration, temperature cycling, and product power cycling is applied in steps of increasing intensity (well beyond the expected field environments) until the product fails. Based on physics of failure considerations, there is a greater likelihood of observing a potential failure mode under elevated stress conditions. Thus, potential failure modes, which normally might not be discovered until a product is used for some time, are more likely to be discovered under HALT conditions.
FIGURE 7-2 A HALT/HAST chamber by QualMark (1999).
As illustrated by Figure 7-3, close attention is paid to identifying both the operating and destruct limits of a product. The operating limits represent boundaries on product operability, beyond which a product will cease to function properly. However, once the elevated stresses are reduced, the product will function again. Destruct limits denote the boundaries beyond which irreversible damage may occur to the product (e.g., phase changes such as melting). For temperature, the destruct limit might be that temperature at which irreversible phase changes occur. For voltage, this might be the level of voltage at which irreversible damage occurs to a component. For vibration, this might be the operating level at which something breaks.
Typically, HALT testing begins at stress levels that are near the operating limits. Stresses are then gradually increased to a level below the natural tolerances of the destruct limits. Usually, single stresses are applied first, such as thermal, followed by a 6-axis vibration load. These tests are used to identify operating and destruct limits. This might then be followed by ramp rate tests. Finally, a HALT test with combined stresses would be run. Each test might take a minimum of 1–2 days.
The use of HALT methods has been well received by industry. For example, in a QualMark (1999) publication, it is reported that Ronald M. Horrell, Chief of Reliability, United Technologies Corporation, has observed that his “test costs were reduced by a factor of eight and test time by a factor of 30 over conventional reliability testing” with the use of HALT testing.
FIGURE 7-3 Stress levels defined in HALT/HASS (highly accelerated life testing for design and process improvement, Ann Marie Hopf, Array Technology Corporation, Boulder, CO).
There is a significant risk, however, of observing failure modes that would never occur under normal operating conditions. Some industry practitioners have not been able to successfully implement HALT or HASS in their organizations. They feel that the use of HALT overloaded their organizations with the need to investigate unrelated or anomalous failure modes. Consequently, the test engineer should carefully analyze each observed failure for potential for corrective action and the risk for not taking corrective action. A cost-benefit analysis should be conducted. Generally, if problems occur during the early stages of product design and development, fixes are easy to make, and so should be made regardless of the likelihood of such a failure mode ever being observed by a customer. Once production grade tooling is built, such decisions would likely require more careful consideration.
7.3
ACCELERATED LIFE TEST
Accelerated life tests (ALT) for estimating life versus stress may require a significant amount of testing. Usually, a single stress such as temperature or voltage is utilized. To obtain data for ALT testing, a small sample of size n is put on test at conditions at or near maximum stress levels until r items fail. For example, for testing a push button on a cell phone, n = 8 prototypes might be placed on test until r = 4 fail. The test is then rerun under lower stress levels until another r = 4 fail in a small sample of size n = 8. This might be repeated several times. Details for analyzing such test data are provided in this section. We make use of the Minitab® procedure, Stat > Reliability/Survival Analysis > Accelerated Life Testing or Regression with Life Data, for this. But before we proceed, we must first describe some basic models for accelerated life tests. We begin with a very simple model and then introduce the use of parametric models. We initially assume that simple subsystems or components are under test and that only one stress variable is used. Extensions to two or more stress variables are described briefly.
7.3.1
Accelerated Cycling or Time-Compression Strategies
For products whose usage is measured in cycles, the simplest strategy to reduce test time is to accelerate the number of cycles of usage per unit time. In effect, we are reducing test times by an acceleration factor (AF), which is defined as follows:
$$\mathrm{AF} = \frac{\text{cycles of usage per unit of time under accelerated cycling}}{\text{cycles of usage per unit of time under normal cycling}} \tag{7.1}$$
Example 7-1:
Accelerated cycling
Shock absorbers were tested at an accelerated rate of 600 cycles/hr. The normal cycle rate is 200 cycles/hr. Data from this accelerated life test were fit to a Weibull distribution with $\beta = 1.7$ and $\theta_S = 2500$ hr. Develop Weibull reliability metrics under normal cycling conditions.
Solution:
The acceleration factor, AF, is estimated as
$$\mathrm{AF} = \frac{600\ \text{cycles/hr}}{200\ \text{cycles/hr}} = 3.0$$
Assuming that accelerated cycling does not result in the introduction of any new potential failure modes, reliability under normal test conditions is Weibull-distributed with shape parameter $\beta = 1.7$ and with Weibull scale parameter, $\theta_N$, determined by
$$\theta_N = \theta_S \times \mathrm{AF} = 2500 \times 3 = 7{,}500\ \text{hr}$$
and
$$R_N(t) = \exp\left[-\left(\frac{t}{7{,}500}\right)^{1.7}\right]$$
7.3.2
Stress-Life Relationships at Two Different Stress Levels
Here we assume that test data has been collected at two different stress levels (see Figure 7-4):
1. Normal operating conditions
2. Elevated stress conditions
FIGURE 7-4
Dot plots of life data at two different stress levels
We introduce the use of a simple acceleration factor, AF, to relate life back and forth between the two levels:
$$\mathrm{AF} = \frac{\text{life under normal operating conditions}}{\text{life under elevated stress levels}} \tag{7.2}$$
The use of Eq. (7.2) is predicated upon the assumption that under elevated stresses, the underlying form of the failure distribution will not change; only its location changes. This is based on an assumption that no additional failure modes are introduced at the elevated stress condition. Accordingly, we assume that for a two-parameter (log) location-scale distribution such as the normal, lognormal, Weibull, or minimum extreme-value distribution (EVD), only the location parameter will vary as stress conditions are elevated. The overall shape of the distribution, as expressed by the Weibull shape parameter, $\beta$, or the (log-)normal standard deviation, $\sigma$, is assumed to remain constant. In the case of the Weibull distribution, we assume the following:
$$\beta_N = \beta_S = \beta \tag{7.3}$$
$$\mathrm{AF} = \frac{\theta_N}{\theta_S} \tag{7.4}$$
where
$\theta_N, \beta_N$ = Weibull parameters under normal operating conditions.
$\theta_S, \beta_S$ = Weibull parameters under elevated stress conditions.
The AF can be estimated either parametrically (assumption of an underlying distribution) or nonparametrically (distribution-free). Both approaches are demonstrated in the example that follows: Q–Q plots are first used to develop a nonparametric estimate of AF, and Eq. (7.4) is then used to develop a parametric estimate. Methodology for developing Q–Q plots is presented in Appendix 7A of this chapter.
TABLE 7-1 Ordered and Paired Failure Data at Two Different Temperatures (in hr)
Normal (22°C)         35.5   44.3   50.5   79.1   83.8  107.7  150.0  171.2  237.3   271.9
Accelerated (160°C)    3.7    4.4   16.3   18.2   24.8   28.5   31.4   32.8   36.1    50.8
Normal (22°C)        361.9  377.6  380.0  404.1  476.6  602.4  689.2  748.3  774.3  1326.8
Accelerated (160°C)   70.5   80.7   90.5  105.7  118.7  158.9  160.7  183.4  192.5   613.5
Example 7-2:
Estimating acceleration factor, AF
Ordered and paired time-to-failure data are presented in Table 7-1 for two different temperature levels: normal (22°C) and accelerated (160°C). Develop nonparametric and parametric estimates of AF. In the latter case, assume that failure times are Weibull-distributed.
Nonparametric Analysis
Per the suggestion outlined by Modarres et al. (1999, p. 398), the use of Q–Q plots for estimating AF is demonstrated. Since we have two complete samples of size 20, the Q–Q plot is simply a plot of the ordered pairs of data given in Table 7-1. The development of Q–Q plots is discussed in greater detail in Appendix 7A. A Q–Q plot is shown in Figure 7-5. A regression through the origin was run on the ordered pairs to develop a formal estimate of AF. A summary of the regression output is presented in Figure 7-6. The slope of the fit is 2.79, and so we form the estimate
$$\mathrm{AF} = 2.79$$
FIGURE 7-5 Q–Q plot of data.
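The regression through the origin described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the Minitab procedure used in the text: the function name `slope_through_origin` is hypothetical, and the accelerated times are assumed to be the x-variable and the normal times the y-variable, so that the slope estimates AF as defined in Eq. (7.2).

```python
# Ordered failure times (hr) transcribed from Table 7-1.
normal = [35.5, 44.3, 50.5, 79.1, 83.8, 107.7, 150.0, 171.2, 237.3, 271.9,
          361.9, 377.6, 380.0, 404.1, 476.6, 602.4, 689.2, 748.3, 774.3, 1326.8]
accelerated = [3.7, 4.4, 16.3, 18.2, 24.8, 28.5, 31.4, 32.8, 36.1, 50.8,
               70.5, 80.7, 90.5, 105.7, 118.7, 158.9, 160.7, 183.4, 192.5, 613.5]


def slope_through_origin(x, y):
    """Least-squares slope b of the no-intercept model y = b * x."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / sxx


# With two complete samples of equal size, the Q-Q plot simply pairs the
# i-th ordered accelerated time (x) with the i-th ordered normal time (y).
af_qq = slope_through_origin(sorted(accelerated), sorted(normal))
print(f"Nonparametric (Q-Q) estimate of AF: {af_qq:.2f}")
```

Because the slope is a ratio of sums dominated by the largest observations, a single extreme pair (here 613.5 hr versus 1326.8 hr) can pull the estimate noticeably.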
Parametric (Weibull) Analysis
The Weibull plot of both the accelerated and normal life data sets appears in Figure 7-7. Maximum likelihood estimates of Weibull $\beta$ and $\theta$ are presented. The Weibull shape parameters do differ:
$$\beta_N = 1.14\ (\text{normal}) \quad \text{versus} \quad \beta_S = 0.89\ (\text{accelerated})$$
However, their difference is not significant when compared to the standard error of the estimates, which runs in the range of 0.14 to 0.20 (see Figure 7-7). Accordingly, we are justified in estimating the acceleration factor as the ratio of the $\theta$ values [see Eq. (7.4)]:
$$\mathrm{AF} = \frac{\theta_N}{\theta_S} = \frac{387.7\ \text{hr}}{95.0\ \text{hr}} = 4.08$$
Finally, note the differences in the AF estimates from 2.79 (Q–Q plots) to 4.08 (Weibull analysis). Looking at the Q–Q plot in Figure 7-5, it is possible that these differences are due to an ‘‘outlier data point,’’ which has unduly influenced the regression estimate of AF. This is a general problem with the use of regression procedures, such as those discussed earlier on the use of rank regression parametric estimation procedures.
FIGURE 7-6
Regression fit of Q–Q data.
FIGURE 7-7
Weibull plot of normal and accelerated life data for Example 7-2.
Example 7-3:
Use of acceleration factor under an assumed distribution
The time-to-failure of an IC chip is believed to follow an exponential distribution with MTTF parameter $\theta_N$. An accelerated test was conducted, with acceleration factor, AF, estimated as 2.5, and MTTF parameter $\theta_S = 3.5$ days. Estimate the B10 life.
Solution:
By Eq. (7.2),
$$\theta_N = \mathrm{AF} \times \theta_S = 2.5 \times 3.5\ \text{days} = 8.75\ \text{days}$$
$$B_{10} = t_{0.90} = -\theta_N \ln R = -8.75 \ln 0.9 = 0.92\ \text{days}$$
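The same scaling logic applies to any Weibull life model, of which the exponential is the special case with shape parameter equal to 1. The sketch below is a minimal illustration under that assumption; the helper names `scale_to_normal` and `weibull_b_life` are hypothetical, and the inputs are taken from Examples 7-3 and 7-1.

```python
import math


def scale_to_normal(theta_stress, af):
    """Scale parameter under normal use, theta_N = AF * theta_S (Eq. 7.4)."""
    return theta_stress * af


def weibull_b_life(theta, beta, p):
    """Time by which a fraction p has failed: theta * (-ln(1 - p))**(1/beta)."""
    return theta * (-math.log(1.0 - p)) ** (1.0 / beta)


# Example 7-3: exponential life (beta = 1), AF = 2.5, theta_S = 3.5 days.
theta_n_ic = scale_to_normal(3.5, 2.5)
b10_ic = weibull_b_life(theta_n_ic, beta=1.0, p=0.10)
print(f"IC chip: theta_N = {theta_n_ic:.2f} days, B10 = {b10_ic:.2f} days")

# Example 7-1: Weibull life (beta = 1.7), AF = 3.0, theta_S = 2500 hr.
theta_n_shock = scale_to_normal(2500.0, 3.0)
b10_shock = weibull_b_life(theta_n_shock, beta=1.7, p=0.10)
print(f"Shock absorber: theta_N = {theta_n_shock:.0f} hr, B10 = {b10_shock:.0f} hr")
```

Only the scale parameter is multiplied by AF; the shape parameter is carried over unchanged, which is the constant-shape assumption of Eq. (7.3).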
7.4
USE OF PHYSICAL MODELS
The use of physics-of-failure models implies an understanding of the underlying failure mechanism. As such, their use is preferred whenever possible. Such models are parametric, allowing for interpolation of the rate of failure at any stress level.
7.4.1
The Arrhenius Model
The most common model, the Arrhenius model, is used when temperatures are elevated. Under this model, the median failure time ($t_{0.50}$) is related to temperature according to the exponential model:
$$t_{0.50} = A e^{\Delta E / kT} \tag{7.5}$$
where
$A$ = Constant.
$\Delta E$ = Activation energy in electron volts.
$k$ = Boltzmann constant, = 1/11604.83 in units of electron volts/K.
$T$ = Temperature in K, where Kelvin temperature = 273.16 + temperature in °C.
To fit an Arrhenius model, we often take the log of both sides to linearize the relationship between inverse temperature, $T^{-1}$, and life:
$$\ln t_{0.50} = \text{constant} + \Delta E / kT \tag{7.6}$$
Based on Eqs. (7.4) and (7.5), the acceleration factor, $\mathrm{AF} = \theta_N / \theta_S$, is estimated by fitting life data at two different temperatures, $T_1$ and $T_2$:
$$\mathrm{AF} = \frac{\theta_N}{\theta_S} = \frac{A e^{\Delta E / kT_1}}{A e^{\Delta E / kT_2}} = \exp\left[\Delta E \left(\frac{11{,}604}{T_1} - \frac{11{,}604}{T_2}\right)\right] \tag{7.7}$$
or
$$\Delta E = \frac{\ln \mathrm{AF}}{\dfrac{11{,}604}{T_1} - \dfrac{11{,}604}{T_2}} \tag{7.8}$$
Once DE is known, the AF at any other temperature can be immediately calculated with the use of Eq. (7.7).
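Equations (7.7) and (7.8) translate directly into code. The following is a minimal sketch rather than a definitive implementation: the function names are hypothetical, the inputs (AF = 4.08 between 22°C and 160°C, and $\theta_N$ = 387.7 hr) are the Weibull results of Example 7-2, and a rounded Kelvin offset of 273 is assumed for simplicity.

```python
import math

INV_K = 11604.83  # 1/k, in kelvin per electron volt


def activation_energy(af, t1_kelvin, t2_kelvin):
    """Solve Eq. (7.8) for Delta E (eV), given AF between T1 (normal) and T2."""
    return math.log(af) / (INV_K / t1_kelvin - INV_K / t2_kelvin)


def arrhenius_af(delta_e, t1_kelvin, t2_kelvin):
    """Eq. (7.7): acceleration factor between T1 (normal) and T2 (stress)."""
    return math.exp(delta_e * (INV_K / t1_kelvin - INV_K / t2_kelvin))


# Assumed inputs from Example 7-2 (rounded offset of 273 K is an assumption).
t_normal = 22 + 273
t_stress = 160 + 273
delta_e = activation_energy(4.08, t_normal, t_stress)

# With Delta E in hand, the AF and Weibull scale at any other temperature
# follow from Eq. (7.7) and Eq. (7.4).
af_140 = arrhenius_af(delta_e, t_normal, 140 + 273)
theta_140 = 387.7 / af_140
print(f"Delta E = {delta_e:.3f} eV, AF(140 C) = {af_140:.2f}, "
      f"theta_S(140 C) = {theta_140:.0f} hr")
```

Small differences from hand calculations are to be expected, since the code carries $\Delta E$ at full precision instead of rounding it to three decimals.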
Example 7-4:
Determination of activation energy, $\Delta E$.
For the data in Example 7-2, estimate the constant, $\Delta E$, in the Arrhenius equation. Use this information to predict life at 140°C.
Solution (part a, estimation of $\Delta E$):
$T_1$ = 22°C + 273 = 295 K.
$T_2$ = 160°C + 273 = 433 K.
From the Weibull analysis, AF = 4.08. From Eq. (7.8),
$$\Delta E = \frac{\ln \mathrm{AF}}{\dfrac{11{,}604}{T_1} - \dfrac{11{,}604}{T_2}} = \frac{\ln 4.08}{\dfrac{11{,}604}{295} - \dfrac{11{,}604}{433}} = 0.112\ \text{eV}$$
Solution (part b, life at 140°C):
We use Eq. (7.7), with $T_1$ = 295 K, $T_2$ = 140°C + 273 = 413 K, and $\Delta E$ = 0.112 eV, to arrive at an acceleration factor, AF, of 3.52:
$$\mathrm{AF} = \exp\left[\Delta E \left(\frac{11{,}604}{T_1} - \frac{11{,}604}{T_2}\right)\right] = \exp\left[0.112 \left(\frac{11{,}604}{295} - \frac{11{,}604}{413}\right)\right] = 3.52$$
Thus, $\theta_S$ at 140°C is estimated as
$$\theta_S(140^{\circ}\text{C}) = \frac{\theta_N}{\mathrm{AF}} = \frac{387.7\ \text{hr}}{3.52} = 110\ \text{hr}$$
Example 7-5:
Estimating life versus temperature relationships
Life data (hr) at three temperatures (22°C, 102°C, and 177°C) are shown in Table 7-2. Fit an Arrhenius model to the data.
Solution:
Minitab was used to generate Weibull plots and maximum likelihood estimates of the Weibull parameters for the three temperature data sets; these are shown in Figure 7-8. Under an assumption that the shape parameter ($\beta$) does not vary with temperature, we should expect that the fitted Weibull lines are parallel to one another. Some deviation from parallel is to be expected, due to sampling error as evidenced by the magnitude of the standard error of the ML estimates of $\beta$. From Figure 7-8, differences among the estimates of $\beta$ are noted:
$\beta_{22} = 1.80$.
$\beta_{102} = 2.00$.
$\beta_{177} = 1.64$.
However, their differences are unlikely to be very significant, given that the standard errors on $\beta$ range from 0.286 to 0.346, and so differences among $\hat{\beta}$-values are on the order of one standard error on $\beta$. Under an assumption that $\beta$ is not dependent on temperature, it might be preferable to develop ML estimates of $\theta$ subject to the constraint that $\beta$ is the same constant at all three temperatures. A simple way to obtain a combined estimate for $\beta$ is to stack all the observations (for all three temperature levels) into a single column and treat the data as if they belong to the same sample.
TABLE 7-2 Life Data at Three Temperatures
Row   C1 (22 deg C)   C2 (102 deg C)   C3 (177 deg C)
1     113             69               24
2     252             85               56
3     284             120              71
4     300             223              105
5     336             227              105
6     505             275              167
7     585             277              189
8     602             290              193
9     706             292              197
10    708             294              203
11    757             312              213
12    898             340              215
13    967             376              231
14    1012            409              271
15    1192            434              310
16    1317            471              340
17    1431            488              408
18    1435            562              427
19    1639            714              566
20    1793            803              593
FIGURE 7-8 Weibull plot and parameter estimates of data set in Table 7-2 from Stat > Reliability/Survival > Parametric Right-Censoring in Minitab V13.3.
The ML estimate for $\beta$ in the combined data set is the combined estimate for $\beta$ that should be used. (We ignore the combined estimate for $\theta$, of course!) Justification for doing so is predicated upon the observation that the ML estimate must satisfy the following $\theta$-free expression, involving just the ordered observations and $\beta$ (see Chapter 9):
$$\frac{1}{\beta} + \frac{1}{r}\sum_{i=1}^{n} \delta_i \ln t_i - \frac{\sum_{i=1}^{n} t_i^{\beta} \ln t_i}{\sum_{i=1}^{n} t_i^{\beta}} = 0 \tag{7.9}$$
Note the use of the indicator variable:
$$\delta_i = \begin{cases} 1 & \text{if } t_i \text{ is a recorded failure} \\ 0 & \text{if } t_i \text{ is a right-censoring time} \end{cases}$$
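Since Eq. (7.9) involves $\beta$ alone, it can be solved with a one-dimensional root search. The sketch below uses simple bisection, which is an assumption made here for illustration and not necessarily the algorithm Minitab applies; the function names `score` and `solve_beta` are hypothetical. The data are the 60 stacked observations of Table 7-2, all treated as recorded failures.

```python
import math

# Table 7-2 life data (hr), stacked across the three temperatures.
stacked = [
    113, 252, 284, 300, 336, 505, 585, 602, 706, 708,
    757, 898, 967, 1012, 1192, 1317, 1431, 1435, 1639, 1793,   # 22 deg C
    69, 85, 120, 223, 227, 275, 277, 290, 292, 294,
    312, 340, 376, 409, 434, 471, 488, 562, 714, 803,          # 102 deg C
    24, 56, 71, 105, 105, 167, 189, 193, 197, 203,
    213, 215, 231, 271, 310, 340, 408, 427, 566, 593,          # 177 deg C
]
flags = [1] * len(stacked)  # delta_i = 1: every time is a recorded failure


def score(beta, times, failed):
    """Left-hand side of Eq. (7.9) for a trial value of beta."""
    r = sum(failed)
    sum_log_failures = sum(math.log(t) for t, d in zip(times, failed) if d)
    s0 = sum(t ** beta for t in times)
    s1 = sum(t ** beta * math.log(t) for t in times)
    return 1.0 / beta + sum_log_failures / r - s1 / s0


def solve_beta(times, failed, lo=0.05, hi=20.0, tol=1e-10):
    """Bisection root search; requires score(lo) > 0 > score(hi)."""
    if score(lo, times, failed) <= 0 or score(hi, times, failed) >= 0:
        raise ValueError("root is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid, times, failed) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


beta_hat = solve_beta(stacked, flags)
print(f"Combined ML estimate of beta: {beta_hat:.2f}")
```

For censored data, the corresponding entries of `flags` would be set to 0; the censoring times still enter the two sums over $t_i^{\beta}$.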
The output from Minitab is presented in Figure 7-9. The combined ML estimate for $\beta$ is $\hat{\beta} = 1.27$. To find $\theta$, we might choose to rerun the Stat > Reliability/Survival > Parametric Right-Censoring procedure for each sample under the constraint $\hat{\beta} = 1.27$. Minitab has a check box under Options for setting $\beta$ to some fixed value. The use of this procedure is illustrated by Figure 7-10. However, with $\hat{\beta}$ fixed, we might just as well use the well-known Weibayes estimator of $\theta$ given by Eq. (6.34):
$$\hat{\theta} = \left(\frac{\sum_{i=1}^{n} t_i^{\beta}}{r}\right)^{1/\beta} \tag{7.10}$$
FIGURE 7-9 ML estimate of $\beta$ for combined data set from Minitab.
FIGURE 7-10 Using Minitab® V13.3 to generate constrained ML estimates of $\theta$ at 22°, 102°, and 177°C.
In our case we ran the Minitab procedure as illustrated by Figure 7-10, wherein the Weibayes option under Estimate was selected. The output from Minitab is shown in Figure 7-11 and summarized in Table 7-3, where we list the constrained and unconstrained ML estimates of $\beta$ and $\theta$ for the three data sets. Temperature versus life ($\theta$) under the constraint $\hat{\beta} = 1.27$ is summarized graphically by Figure 7-12.
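The Weibayes estimator of Eq. (7.10) is easy to evaluate directly, and the results can be compared with the constrained estimates listed in Table 7-3. The sketch below also fits a straight line to $\ln \theta$ versus 11604.83/T by ordinary least squares, in the spirit of the cutout in Figure 7-12; that least-squares step is an illustrative assumption and is not the ML regression described in Section 7.5. All function and variable names are hypothetical.

```python
import math

BETA = 1.27  # combined shape estimate reported in the text

# Table 7-2 life data (hr), keyed by test temperature in deg C.
data = {
    22: [113, 252, 284, 300, 336, 505, 585, 602, 706, 708,
         757, 898, 967, 1012, 1192, 1317, 1431, 1435, 1639, 1793],
    102: [69, 85, 120, 223, 227, 275, 277, 290, 292, 294,
          312, 340, 376, 409, 434, 471, 488, 562, 714, 803],
    177: [24, 56, 71, 105, 105, 167, 189, 193, 197, 203,
          213, 215, 231, 271, 310, 340, 408, 427, 566, 593],
}


def weibayes_theta(times, beta, r=None):
    """Eq. (7.10): scale estimate for a fixed beta; r = number of failures."""
    r = len(times) if r is None else r  # complete sample: every unit failed
    return (sum(t ** beta for t in times) / r) ** (1.0 / beta)


theta = {temp: weibayes_theta(times, BETA) for temp, times in data.items()}
for temp, value in theta.items():
    print(f"{temp:>3} deg C: theta = {value:.1f} hr")

# Illustrative straight-line fit of ln(theta) against 11604.83 / T(K).
xs = [11604.83 / (temp + 273.16) for temp in data]
ys = [math.log(theta[temp]) for temp in data]
x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
intercept = y_bar - slope * x_bar
print(f"ln(theta) = {intercept:.3f} + {slope:.4f} * (11604.83 / T)")
```

With only three temperature levels the fitted line is a rough guide at best; the slope plays the role of the activation energy $\Delta E$ in Eq. (7.6).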
FIGURE 7-11 Constrained ML estimates of Weibull $\theta$ at different temperatures.
TABLE 7-3 Constrained and Unconstrained Maximum Likelihood Estimates of Weibull Data at Three Different Temperature Levels
Temperature   β      θ       Combined β   θ under β = 1.27
22°C          1.80   947.0   1.27         879.4
102°C         2.00   398.7   1.27         365.8
177°C         1.64   273.1   1.27         256.8
7.4.2
Other Acceleration Models
Inverse Power-Law Model of Voltage Acceleration
The area of electronics reliability is a fruitful area for identifying useful physics models for modeling failure phenomena at different stresses. Voltage acceleration is a common stress factor used in studying the reliability of electronics. Meeker and Escobar (1998) provide an insightful description of the underlying influence of voltage on dielectric materials and on devices like insulating fluids, transformers, and capacitors. In such applications, the application of overstress voltage results in accelerated time-dependent degradation of the resistance of the dielectric to voltage. The model is of the empirical form:
$$\mathrm{AF} = \left(\frac{V_U}{V_S}\right)^{-b} \quad \text{for } b > 1 \tag{7.11}$$
FIGURE 7-12 Temperature versus life (constrained ML estimate of Weibull $\theta$) [$\ln(\theta)$ versus 1/T (K) in cutout].
where
$V_U$ = Voltage under normal usage conditions.
$V_S$ = Voltage under stressed conditions.
Equation (7.11) is derived from kinetic and activation energy considerations. The mean time-to-failure decreases according to the $b$th power of the applied voltage.
Eyring Model
The Eyring model comprises a general model for incorporating both temperature and one or more additional stress variables. It is based on a theoretical derivation from chemical reaction rate theory (kinetics) and quantum mechanics. In this derivation the term $\Delta H$ is used in chemical kinetics to denote the amount of energy needed to start a diffusion or migration process. The Eyring model equation is of the form
$$t_{0.50} = \underbrace{A\,T^{\alpha} e^{\Delta H / kT}}_{\text{temp. term}}\ \underbrace{e^{[B + (C/T)]S_1}}_{\text{stress term, } S_1}\ \underbrace{e^{[D + (E/T)]S_2}}_{\text{stress term, } S_2} \tag{7.12}$$
Elsayed (1996) reports the use of a humidity stress factor in Eq. (7.12), to account for the acceleration of corrosive phenomena in the exterior leads of an encapsulated integrated circuit component. This model is often used to model accelerated stress levels under HAST (highly accelerated stress test) conditions of elevated temperature of 85°C and 85% relative humidity. The model is frequently expanded to accommodate accelerated voltage stresses.
Electromigration Model
Electromigration involves the transport of metal atoms due to electron and conductive “wind” effects. If the electronic current is sufficiently dense, an electron wind is created, causing a metal atom such as copper or aluminum to migrate, which can lead to a loss of current flow once an area of metal atoms becomes depleted. The median time-to-failure can be modeled using Black’s equation (1969). This equation takes into account temperature acceleration effects as follows:
$$t_{0.50} = A J^{-n} e^{\Delta H / kT} \tag{7.13}$$
where $J$ is the current density (e.g., amps/cm²).
Fatigue Life Models
In these models the cyclic loading on a part may eventually lead to a fatigue-related failure. Some well-known empirical relationships for modeling fatigue life include the following two relationships:
1. S–N Curve I. Stress life under predominantly elastic deformation (constant, cyclical loading):
$$N = A S^{-m} \tag{7.14}$$
where
$N$ = Cycles to fatigue-related failure.
$S$ = Constant stress amplitude or range.
$A, m$ = Model parameters.
2. S–N Curve II. Strain life at a notch where stress concentration is high and both elastic and viscous (plastic) deformation take place (constant cyclical strain load):
$$\varepsilon_a = \left(\sigma'_f / E\right)(2N)^{b} + \varepsilon'_f (2N)^{c} \tag{7.15}$$
where
$\varepsilon_a$ = Strain amplitude.
$\sigma'_f$ = Fatigue strength coefficient.
$E$ = Young’s modulus.
$N$ = Cycles to fatigue-related failure.
$\varepsilon'_f$ = Fatigue ductility coefficient.
$b$ = Fatigue strength exponent.
$c$ = Fatigue ductility exponent.
Tool Life (Taylor’s Model)
Taylor’s model is an empirical model for relating tool life as a function of cutting speed:
$$L = \frac{A}{V^{m}} \tag{7.16}$$
where
$L$ = Tool life.
$A$ = Model constant.
$V$ = Cutting speed.
$m$ = Material parameter ($m \approx 8$ for high-strength steels; $m \approx 4$ for carbide steel; $m \approx 2$ for ceramic).
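One practical consequence of Eq. (7.16) is that the ratio of tool lives at two cutting speeds does not depend on the constant A. The short sketch below illustrates this; the function name `tool_life_ratio` and the 10% speed increase are hypothetical choices made only for illustration.

```python
def tool_life_ratio(v_old, v_new, m):
    """L_new / L_old under L = A / V**m; the model constant A cancels."""
    return (v_old / v_new) ** m


# Effect of a 10% increase in cutting speed for the exponents quoted above.
for material, m in [("high-strength steel", 8), ("carbide", 4), ("ceramic", 2)]:
    ratio = tool_life_ratio(100.0, 110.0, m)
    print(f"{material:>20}: life falls to {100 * ratio:.0f}% of its original value")
```

The larger the exponent m, the more sharply tool life responds to a change in cutting speed, which is what makes cutting speed an effective acceleration variable.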
7.5
USE OF LINEAR STATISTICAL MODELS IN MINITAB FOR EVALUATING LIFE VERSUS STRESS VARIABLES RELATIONSHIPS
In this section we explore the use of the accelerated life testing and regression with life data procedures in Minitab for building formal life versus stress relationships.*
7.5.1
Arrhenius Linear Model in Minitab
From Eq. (7.7),
$$\ln \theta = \beta_0 + \beta_1\,\mathrm{ArrTemp} \tag{7.17}$$
where ArrTemp = 11604.83/[Temp(°C) + 273.16] and $\beta_0$ and $\beta_1$ are parameters of a linear statistical model. To complete the synthesis of an Arrhenius linear model, we will need to include the stochastic nature of failure data. For Weibull ($\theta$, $\beta$) data, the logged failure times are distributed according to a minimum EVD with location parameter $\ln \theta$ and scale parameter $\sigma = 1/\beta$. (An overview of the EV distribution is presented in Chapter 3, §3.6.) $\varepsilon$ is a standard EVD error term that is evaluated using the scaled expression
$$\varepsilon = \frac{\ln t - \ln \theta}{\sigma} \tag{7.18}$$
Therefore, percentiles of EV failure times may be evaluated according to
$$\ln t_p = \ln \theta + \sigma \varepsilon_p \tag{7.19}$$
where
$\ln t_p$ = pth right percentile of EV failure time.
$\varepsilon_p$ = pth right percentile of a standard EVD error distribution.
$\sigma$ = EVD scale parameter, where $\sigma = 1/\beta$.
We can then combine the relationships given by Eqs. (7.17) and (7.19) to form the linear statistical model
$$\ln t_p = \beta_0 + \beta_1\,\mathrm{ArrTemp} + \sigma \varepsilon_p \tag{7.20}$$
*The author does not mean to portray himself as a salesman for Minitab. However, the author does not recommend the adaptation of Microsoft Excel for ALT modeling. The reader is advised to evaluate other potentially available statistical computing software for ALT modeling from vendors such as ReliaSoft.
Notes:
1. For lognormally distributed data with parameters $t_{med}$ and $\sigma$, we recognize that the logged failure times will follow a normal distribution with mean $\ln t_{med}$ and standard deviation $\sigma$. In this case $\varepsilon$ is a standard normal error term.
2. Although Eq. (7.20) is treated as a regression model, maximum likelihood methods are normally used to estimate the Weibull parameters and the coefficients $\beta_0$ and $\beta_1$. Their use is necessary, particularly when data sets are incomplete, since least-squares methods cannot be adapted for use on multiply censored data sets.
Example 7-6:
Estimating life versus temperature relationships
For the data set given by Table 7-2, estimate the 50th percentile of the failure distribution at an elevated temperature of 140°C.
Solution:
The Stat > Reliability/Survival > Accelerated Life Testing procedure in Minitab® V13.3 was used to fit the model suggested by Eq. (7.20). (Note to Minitab users: All failure times and temperature values must be stacked in two columns, one column containing ordered readings and the other containing temperature readings.) Minitab provides a plot of the 10th, 50th, and 90th percentiles of t, on a natural log scale, versus temperature, on an ArrTemp scale, reproduced in Figure 7-13. The 50th percentile of the distribution at 140°C can be graphically estimated with the use of Figure 7-13. However, this is not necessary, because we asked Minitab to generate the 50th percentile of the failure distribution at 140°C by clicking on options within the estimate options box. From Figure 7-14, the 50th percentile of the failure distribution at 140°C is 267.2 hr, with 95% confidence limits ranging from 220 hr to 324.4 hr. The fitted model is of the form
$$\ln(t_p) = 3.226 + 0.09140\,\mathrm{ArrTemp} + \frac{1}{1.7781}\,\varepsilon_p$$
By the properties of a minimum EVD, and Eq. (7.17), at 140°C the Weibull parameter $\theta$ is estimated using the relationship
$$\hat{\theta} = \exp(3.226 + 0.09140\,\mathrm{ArrTemp}) = \exp\left(3.226 + 0.09140 \times \frac{11604.83}{140 + 273.16}\right) = 328.07\ \text{hr}$$
$$\hat{\beta} = 1.778 \quad \text{from Minitab output}$$
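The fitted model can be evaluated at any temperature and percentile with a few lines of code. The sketch below is illustrative only: it assumes the coefficients reported above (3.226, 0.09140, and shape 1.7781), assumes the scale term of Eq. (7.20) is $\sigma = 1/\hat{\beta}$, and uses the standard minimum extreme-value quantile $\varepsilon_p = \ln[-\ln(1 - p)]$. The function names are hypothetical.

```python
import math

B0, B1 = 3.226, 0.09140   # fitted coefficients quoted in Example 7-6
SHAPE = 1.7781            # Weibull shape; the EVD scale is sigma = 1 / SHAPE


def arr_temp(temp_c):
    """Arrhenius transformation of a Celsius temperature."""
    return 11604.83 / (temp_c + 273.16)


def weibull_scale(temp_c):
    """Eq. (7.17): theta = exp(b0 + b1 * ArrTemp)."""
    return math.exp(B0 + B1 * arr_temp(temp_c))


def failure_percentile(temp_c, p):
    """Eq. (7.19): time by which a fraction p of units has failed."""
    eps_p = math.log(-math.log(1.0 - p))  # standard minimum EVD quantile
    return math.exp(math.log(weibull_scale(temp_c)) + eps_p / SHAPE)


theta_140 = weibull_scale(140.0)
t50_140 = failure_percentile(140.0, 0.50)
print(f"theta(140 C) = {theta_140:.1f} hr, median life = {t50_140:.1f} hr")
```

The computed median should agree with the Minitab value to within the rounding of the reported coefficients. Confidence limits, which require the covariance matrix of the ML estimates, are outside the scope of this sketch.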
FIGURE 7-13 Output from Minitab® V13.3 Stat > Reliability/Survival > Accelerated Life Testing: regression fit showing 10th, 50th, and 90th percentiles of the fitted model.

FIGURE 7-14 Output from Minitab® V13.3 Stat > Reliability/Survival > Accelerated Life Testing.
7.5.2 Use of Regression with Life Data Procedure in Minitab

We can expand the regression approach introduced in the last section to make use of the general linear model

$$Y_p = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k + \sigma\varepsilon_p \tag{7.21}$$

In this case our dependent variable, $Y_p$, is the pth percentile of the (logged) failure times, and $\varepsilon_p$ denotes the pth percentile of the error distribution; $\sigma$ is a scale factor. Under a general linear model, the $X_i$ predictor variables can be crossed or nested with each other. The predictor variables can be

1. Continuous variables like stress, temperature, and voltage
2. Qualitative/categorical variables like plant location, design, or number of different heat treatment operations under consideration
The error distribution on $\varepsilon$ can be generalized to any popular distribution form such as normal, lognormal, Weibull, extreme-value, etc., depending on the application. Analysis of such data sets is very complicated, particularly in the presence of censoring. In such cases it is not possible to adapt least-squares estimation techniques, and so the parameters are usually estimated by maximum likelihood. A nonlinear search algorithm such as Newton–Raphson must then be used to identify the ML estimates of the model parameters, $\sigma, \beta_0, \beta_1, \ldots, \beta_k$. The use of likelihood estimation procedures for estimating model parameters is discussed in greater detail in Appendix 7B to this chapter.
Example 7-7: Use of Minitab Insulate.mtw data set

Minitab® V12 or later provides a Regression with Life Data procedure that can be used to fit a general linear model to a set of multiply censored failure data. We describe its use for the sample Minitab data set, Insulate.mtw. In this application we investigate the deterioration of an insulation used for electric motors. We want to know whether failure times for the insulation can be predicted from (a) the plant location where it was manufactured and (b) the temperature at which the motor runs. The bogey life of the insulation product is collected at four temperatures (110°C, 130°C, 150°C, and 170°C) and at two plant locations. The data set is shown in Table 7-4. Because the motors generally run between 80°C and 100°C, we want to predict the insulation's behavior at these temperatures.
TABLE 7-4 Insulate.mtw Sample Minitab Data Set

Temperature (°C) | Plant | Failure times
170 | 1 | 343, 869, 244+, 716, 531, 738, 461, 221, 665, 384+
170 | 2 | 394+, 369, 366, 507, 461, 431, 479, 106, 545, 536
150 | 1 | 2134+, 2746, 2859, 1826+, 996, 2733, 3651, 2073, 2291, 1689
150 | 2 | 1533, 1752, 1764, 2042, 1043+, 1214, 3154, 2386, 2190, 1642
130 | 1 | 8290, 10183, 3987, 3545, 4735+, 7919, 4925, 2214, 5351, 3147
130 | 2 | 7304, 6947, 5355, 3308, 4373, 6226, 5117, 3620+, 3128, 4348
110 | 1 | 21900+, 13218, 17610, 7336+, 18397, 13673, 8702, 21900+, 13513, 14482
110 | 2 | 20975, 12090, 17822, 11769, 21900+, 16289, 21900+, 18806, 11143+, 17784

Note: + denotes censored readings.
The regressor (predictor) variables to be used are temperature (°C) and plant location. The underlying linear model under a Weibull distribution assumption is of the form

$$\ln t_p = \text{Intercept} + \text{Coeff}_{\text{ArrTemp}} \times \text{ArrTemp} + \text{Coeff}_{\text{Plant}} \times \text{Plant} + \beta^{-1}\varepsilon_p \tag{7.22}$$

where $\varepsilon_p$ is a standard EVD error term. The output from Stat > Reliability/Survival > Regression with Life Data is shown in Figure 7-15. A probability plot of the standard residuals is shown in Figure 7-16. This can be used to evaluate the adequacy of the linear statistical model. In our case the residuals follow the fitted line in the EVD probability plot adequately.

FIGURE 7-15 Output from Minitab® V13.3 Stat > Reliability/Survival > Regression with Life Data.
Interpreting the Results

The coefficients of the linear model are presented in Figure 7-15. A Weibull time-to-failure distribution is assumed; however, logged values of the times will follow a minimum EVD. We used indicator values of "1" and "2" for plant location. The predictor equation, which describes the relationship between temperature and failure time for the insulation at either plant, becomes, for plant 1,

$$\ln t_p = -15.3411 + 0.83925\,(\text{ArrTemp}) + 0.33978\,\varepsilon_p$$

and for plant 2,

$$\ln t_p = -15.52187 + 0.83925\,(\text{ArrTemp}) + 0.33978\,\varepsilon_p$$

Prediction: When using this Minitab procedure, we clicked on the Estimate option box to request a prediction of life for motors running at 80°C and 100°C at each of the two plant locations. The output for this is also presented in Figure 7-15. Based on this output, we make the following predictions:

1. For motors running at 80°C, insulation from plant 1 is expected to last about 182,093 hr, or 20.77 yr; insulation from plant 2 is expected to last about 151,980 hr, or 17.34 yr.
2. For motors running at 100°C, insulation from plant 1 is expected to last about 41,530 hr, or 4.74 yr; insulation from plant 2 is expected to last about 34,662 hr, or 3.95 yr.

As you can see from the low p-values, the plants are significantly different at the $\alpha = 0.05$ level, and temperature is a significant predictor.

FIGURE 7-16 Probability plot of standard residuals.
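The predictions above can be reproduced, to within coefficient rounding, from the two predictor equations. The sketch below assumes that the quoted lives are 50th percentiles, so that the standard minimum-EVD error term is $\varepsilon_{0.5} = \ln(\ln 2)$, and that the intercepts are negative (which is what reproduces the quoted lives). Function and variable names are illustrative only.

```python
import math

SLOPE = 0.83925                           # coefficient on ArrTemp
SCALE = 0.33978                           # 1/beta, multiplies the EVD error term
INTERCEPT = {1: -15.3411, 2: -15.52187}   # by plant


def arr_temp(temp_c):
    """Arrhenius-transformed temperature: 11604.83 / (T + 273.16)."""
    return 11604.83 / (temp_c + 273.16)


def median_life(plant, temp_c):
    """Median life in hours from the fitted predictor equation (assumes p = 0.5)."""
    eps_median = math.log(math.log(2.0))  # median of the standard minimum EVD
    log_t = INTERCEPT[plant] + SLOPE * arr_temp(temp_c) + SCALE * eps_median
    return math.exp(log_t)


predictions = {(plant, temp): median_life(plant, temp)
               for temp in (80.0, 100.0) for plant in (1, 2)}
for (plant, temp), hours in predictions.items():
    print(f"plant {plant}, {temp:.0f} C: about {hours:,.0f} hr")
```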
7.5.3 Use of Proportional Hazards Models

Proportional hazards (PH) models have been widely used in biostatistics, but only recently have they begun to be used by reliability practitioners in industry (see Elsayed and Chan, 1990, and Dale, 1985). Unlike standard regression models, under a proportional hazards model we assume that applied stresses have a multiplicative, instead of an additive, effect on hazard rate. It is a distribution-free model, which is based on the assumption that the ratio of hazard rates at different stress levels depends only on the stress levels, and not on time. The basic PH model is given by

$$\lambda(t; X_1, X_2, \ldots, X_k) = \lambda_0(t)\exp(\beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k) \tag{7.23}$$

The method of partial likelihood estimation is used to provide estimates of the PH parameters, $\beta_i$, $i = 1, 2, \ldots, k$. For more information on PH modeling, the reader might begin with any of the following references: Elsayed (1996); Meeker and Escobar (1998); Crowder et al. (1991); and Lawless (1982).
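The time-independence of the hazard ratio under Eq. (7.23) can be illustrated numerically. The sketch below is only a demonstration: the Weibull baseline hazard, the coefficient values, and the two stress settings are all hypothetical, chosen for illustration and not taken from the text.

```python
import math


def baseline_hazard(t, shape=1.5, scale=1000.0):
    """Hypothetical Weibull baseline hazard lambda_0(t); any positive function would do."""
    return (shape / scale) * (t / scale) ** (shape - 1.0)


def ph_hazard(t, x, beta):
    """Hazard under the PH model: lambda_0(t) * exp(beta . x)."""
    linear_predictor = sum(b * xi for b, xi in zip(beta, x))
    return baseline_hazard(t) * math.exp(linear_predictor)


beta = [0.02, 0.5]      # hypothetical coefficients
x_low = [50.0, 0.0]     # hypothetical low-stress setting
x_high = [80.0, 1.0]    # hypothetical high-stress setting

# The ratio of hazards is the same at every time t.
ratios = [ph_hazard(t, x_high, beta) / ph_hazard(t, x_low, beta)
          for t in (10.0, 100.0, 1000.0)]
expected_ratio = math.exp(0.02 * (80.0 - 50.0) + 0.5 * (1.0 - 0.0))
print(ratios, expected_ratio)
```

Because the baseline hazard cancels in the ratio, the result does not depend on the form chosen for $\lambda_0(t)$, which is why the model can be treated as distribution-free.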
TABLE 7-5 Acceleration Stresses by Failure Mode

Failure mode | Acceleration stresses | Product applications
Corrosion/oxidation/rusting | Temperature; humidity; voltage (electric potential); residual stresses | Oxidation of metal surfaces; electrical contacts; etc.
Creep and creep-rupture | Mechanical stress; temperature | Plastics; welds; bonds; other joint processes
Diffusion | Temperature; concentration gradient | Plastics; lubricants; etc.
Electromigration | Current density; temperature; temperature gradient | Electronic circuits
Fading | Intensity and cumulative U.V. radiation | Paints; surface finishes; discoloring and aging of plastics
Fatigue (crack initiation and propagation) | Cyclic mechanical stress/strain amplitude or range; cycling temperature range; frequency (usage) | Metals; plastics; composite materials
Wear | Contact force; relative sliding velocity; temperature; lubrication | Solid surfaces in contact (plastics and metals); tires; coatings
7.6 CLOSING COMMENTS

We have provided some basic information on the use of accelerated tests for design/process validation. Accelerated test conditions are also used during development and other studies aimed at improving performance and reliability. Tests of this kind are discussed in Chapter 8. Finally, Table 7-5 provides a tabular summary of some recommended strategies for test acceleration. This information is organized by failure mode. The stresses that might be elevated are listed along with some product examples.
7.7 EXERCISES

1. Contrast the similarities and differences between ESS and burn-in testing.
2. What common goals does the use of HALT and ALT testing strategies have? How do they differ?
3. Why did we run a constrained ML estimation of Weibull $\beta$, forcing a common value of $\beta$ over all temperatures? Under what conditions is this assumption justified?
4. An ALT is run at two different temperatures. At 23°C, we fit $\hat{\beta} = 1.2$ and $\hat{\theta} = 1000$ hr. At 163°C, we fit $\hat{\beta} = 1.2$ and $\hat{\theta} = 333.3$ hr.
   a. What value of the acceleration factor (AF) should one use to estimate the properties at the elevated temperature of 163°C?
   b. Estimate the activation energy under an Arrhenius model of temperature acceleration.
   c. What acceleration factor should one use to relate properties at 123°C from 23°C?
   d. What is your estimate of $\hat{\theta}$ and $\hat{\beta}$ at the elevated temperature condition of 123°C?
5. Reliasoft suggested this ALT experiment at the RAMS 2000 symposium in Philadelphia:
   Object. To model the fatigue life of a paper clip that is measured in the number of cycles-to-failure as it is bent back and forth to a specified angle.
   Experiment. Measure the number of cycles-to-failure of a paper clip using angular limits of 45°, 90°, and 180°. At each angle, test at least n = 5 paper clips. Record your life data.
   Analysis. Use Eq. (7.14), which in natural log units becomes

   $$\ln N = \ln A + m \ln S$$

   where
   S = Stress in degrees or radians.
   A = Constant to be determined.
   N = Number of cycles before failure.
   Assume a lognormal distribution on N (ln N is normally distributed). Estimate the parameters of the model, and use the model to predict median life, $t_{0.50}$, at a 60° angle. Can you develop a lower 95% confidence limit on life at 60°?
6. Accelerated life data were collected at three different temperatures and are summarized in Table 7-6. Fit an Arrhenius model to predict median life at 150°C, and develop a 95% lower confidence limit on life at this temperature.
TABLE 7-6 Arrhenius Data Set

Temperature (°C) | Failure times
160 | 45, 131, 218, 247, 320, 335, 372, 393, 538, 930, 1026, 1072, 1443, 1772, 2029
80 | 92, 304, 365, 377, 433, 819, 989, 1166, 1202, 1403, 1441, 1534, 1957, 2255, 2796
25 | 125, 198, 336, 514, 748, 769, 1058, 1684, 1943, 1955, 2067, 2267, 2620, 2672, 2799
APPENDIX 7A Q–Q PLOTS

Lorenz (1905) introduced the use of Q–Q plots, a distribution-free graphical procedure, for making comparisons between two groups. Regression methods can be used to assess the Q–Q fitted relationship. Fisher (1983) cautions against the use of Q–Q plots when sample sizes are less than 30. In general, Q–Q plots are more sensitive to differences between the tail regions of two distributions. For equally sized groups, Q–Q plots are easily constructed. First, both data groups must be sorted in ascending order. The Q–Q plot is then just a plot of the ordered pairs, $(X_i, Y_i)$, $i = 1, 2, \ldots, n$. P–P plots, their inverse, consist of a plot of the ordered pairs $(\hat{F}_1(t), \hat{F}_2(t))$. We demonstrate the construction of a Q–Q plot for relating the properties of two unequal-sized groups of data. For each data set, one can develop empirical estimates of $F(t)$ using median or mean rank estimators, for example. That is, we have

$$\hat{F}_1(t_j) = p_j, \quad j = 1, 2, \ldots, n_1$$
$$\hat{F}_2(t_k) = p_k, \quad k = 1, 2, \ldots, n_2 \tag{7A.1}$$

The quantile relationships are formed from the inverse relationships:

$$t_j = \hat{F}_1^{-1}(p_j), \quad j = 1, 2, \ldots, n_1$$
$$t_k = \hat{F}_2^{-1}(p_k), \quad k = 1, 2, \ldots, n_2 \tag{7A.2}$$

The empirical relationships expressed by Eq. (7A.2) form the quantile relationships. That is, the ordered times $t_j$, $j = 1, 2, \ldots, n_1$, and $t_k$, $k = 1, 2, \ldots, n_2$, are the quantiles, and so we have constructed two quantile relationships:

$$Q_1(p) = \hat{F}_1^{-1}(p), \qquad Q_2(p) = \hat{F}_2^{-1}(p) \tag{7A.3}$$

For various values of the cumulative percent, p, we identify the ordered pairs $(Q_1(p), Q_2(p))$, which can then be plotted on ordinary graph paper. The relationship between the two groups might then be assessed using formal regression methods or simple visual assessment. The identification of these ordered pairs is easy if sample sizes are the same; otherwise, some interpolation techniques may be required. For equal sample sizes, the ordered failure times $(t_i, t_j)$, for $i = j = 1, 2, \ldots, n$, can be plotted on ordinary graph paper. In other cases, if sample sizes are multiples of each other, that property can be used to identify $(Q_1, Q_2)$ pairs. For example, if $n_2 = 2n_1$, then one can simply pair up every ordered time in group 1 with every other ordered time in group 2.
Modarres et al. (1999) provide a simple approximation to identify Q–Q pairs when sample sizes are not equal:

$$Q_p = \begin{cases} t_{np} & \text{if } np \text{ is an integer} \\ \text{any value in the interval } (t_{[np]},\, t_{[np]+1}) & \text{otherwise} \end{cases}$$

where [x] denotes the greatest integer that does not exceed x. An arguably more exact way to identify Q–Q pairs involves interpolation of the estimated quantile relationship associated with the larger sample. This approach may be needed when sample sizes are small. A visual way to do this is to plot both quantile relationships, and then to identify corresponding pairs, $(Q_1, Q_2)$, from the two empirical quantile functions. A systematic way to accomplish this is to use the ordered Q(p)-values associated with the smaller sample size and estimate Q(p)-values for the larger sample size either graphically or using interpolation methods. This is demonstrated in Figure 7-17 for the data set presented in Table 7-7. The completed Q–Q plot is shown in the inset of Figure 7-17.
TABLE 7-7 Sample X–Y Data Sets ($n_x = 10$; $n_y = 15$)

$Q_x(p_x)$ | $p_x$ | $Q_y(p_y)$ | $p_y$
98.01 | 0.05 | 72.85 | 0.03
98.04 | 0.15 | 72.99 | 0.10
98.21 | 0.25 | 73.26 | 0.17
98.95 | 0.35 | 73.57 | 0.23
99.12 | 0.45 | 74.13 | 0.30
99.34 | 0.55 | 74.68 | 0.37
100.18 | 0.65 | 75.12 | 0.43
101.64 | 0.75 | 75.61 | 0.50
103.33 | 0.85 | 75.64 | 0.57
106.18 | 0.95 | 76.04 | 0.63
 | | 76.24 | 0.70
 | | 76.56 | 0.77
 | | 77.54 | 0.83
 | | 77.83 | 0.90
 | | 80.38 | 0.97
FIGURE 7-17 Graphical method to identify Q–Q pairs.
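The interpolation approach described for unequal sample sizes can be sketched in a few lines, using the data of Table 7-7. The sketch assumes plotting positions of the form (i − 0.5)/n, which reproduce the tabulated p-values to two decimals, and simple linear interpolation between adjacent ordered values of the larger sample; both are assumptions, and the function names are illustrative.

```python
def plotting_positions(n):
    """Assumed plotting positions (i - 0.5)/n for i = 1..n."""
    return [(i - 0.5) / n for i in range(1, n + 1)]


def interp_quantile(p, probs, values):
    """Linearly interpolate the empirical quantile function at p (clamped at the ends)."""
    if p <= probs[0]:
        return values[0]
    if p >= probs[-1]:
        return values[-1]
    for k in range(1, len(probs)):
        if p <= probs[k]:
            w = (p - probs[k - 1]) / (probs[k] - probs[k - 1])
            return values[k - 1] + w * (values[k] - values[k - 1])


# Ordered data from Table 7-7.
x = [98.01, 98.04, 98.21, 98.95, 99.12, 99.34, 100.18, 101.64, 103.33, 106.18]
y = [72.85, 72.99, 73.26, 73.57, 74.13, 74.68, 75.12, 75.61,
     75.64, 76.04, 76.24, 76.56, 77.54, 77.83, 80.38]

px = plotting_positions(len(x))
py = plotting_positions(len(y))

# Pair each quantile of the smaller sample with an interpolated quantile of the larger one.
qq_pairs = [(xi, interp_quantile(p, py, y)) for xi, p in zip(x, px)]
for qx, qy in qq_pairs:
    print(f"{qx:8.2f} {qy:8.3f}")
```

Plotting the resulting ten pairs on ordinary axes gives the Q–Q plot; a different choice of plotting position or interpolation rule would shift the pairs slightly.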
APPENDIX 7B ML ESTIMATION OF PARAMETERS IN REGRESSION MODEL WITH MULTIPLY CENSORED LIFE DATA

7B.1 DESCRIPTION OF GENERAL LINEAR MODEL WITH LIFE DATA

We follow the approach outlined by Meeker and Escobar (1998, §17.3.1) to describe the details of the underlying regression model and likelihood estimation procedure. Assume that we have a time-to-failure distribution that can be generalized by a (log-) location-scale distribution, $G(\theta, \sigma)$, with parameters $\theta$ and $\sigma$. Then we have

$$F(t; \theta, \sigma) = G\left(\frac{t - \theta}{\sigma}\right) \tag{7B.1}$$

or equivalently, for percentile $t_p$, we have $F(t_p; \theta, \sigma) = 1 - p$, and it follows that

$$t_p = \theta + \sigma G^{-1}(1 - p) = \theta + \sigma\varepsilon_p \tag{7B.2}$$

where in Eq. (7B.2) we recognize that the standard inverse of $G(\cdot)$ is a standard error term of the location-scale distribution form. To this, we add a second, linear relationship on the location parameter, $\theta$, to account for the effect of a wide variety of factors on life:

$$\theta = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k \tag{7B.3}$$

Note: Equations (7B.1)–(7B.3) need to be adapted for each life distribution used as follows:

1. Minimum extreme-value distribution: Substitute $\mu$ for $\theta$; $\sigma$ remains the scale parameter.
2. Weibull distribution: Substitute $\ln t$ and $\ln\theta$ for $t$ and $\theta$, respectively, and $1/\beta$ for $\sigma$.
3. Lognormal distribution: Substitute $\ln t$ and $\ln t_{med}$ for $t$ and $\theta$, respectively; $\sigma$ remains the scale parameter.
4. Normal distribution: Substitute $\mu$ for $\theta$.
5. Exponential distribution: Treat as a special case of Weibull with $\sigma = 1$.
7B.2 USE OF ML ESTIMATION PROCEDURES

For multiply right-censored data, the likelihood expression is of the form

$$L(\beta_0, \beta_1, \ldots, \beta_k, \sigma) = \prod_{i=1}^{n} \left[\frac{1}{\sigma}\, g(t_i; \beta_0, \beta_1, \ldots, \beta_k, \sigma)\right]^{\delta_i} \left[1 - G(t_i; \beta_0, \beta_1, \ldots, \beta_k, \sigma)\right]^{1-\delta_i} \tag{7B.4}$$

where we have introduced the additional notation:

$g(\beta_0, \beta_1, \ldots, \beta_k, \sigma)$ = Density function of a location-scale distribution.
$G(\beta_0, \beta_1, \ldots, \beta_k, \sigma)$ = Distribution function of a location-scale distribution.
$\delta_i$ = Indicator variable, with $\delta_i = 1$ if the ith observation is a failure and $\delta_i = 0$ if the observation is right-censored.

Simple closed-form solutions exist for complete data sets. In general, a nonlinear optimization technique such as Newton–Raphson is needed to find the parameter values that maximize Eq. (7B.4). Consult references by Meeker and Escobar (1998) and Lawless (1982) for more detailed information on this model. The use of likelihood estimation techniques is discussed in greater detail in Chapter 9.
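To make Eq. (7B.4) concrete, the sketch below evaluates its logarithm for the Weibull case, where the logged times follow a minimum extreme-value distribution with standardized density $g(z) = \exp(z - e^z)$ and survival function $1 - G(z) = \exp(-e^z)$. It only evaluates the log-likelihood at given parameter values; the maximization step (for example, Newton–Raphson) is not shown. The function name and argument layout are illustrative, and the demonstration uses three readings from Table 7-4 with the plant 1 coefficients simply as convenient inputs.

```python
import math


def sev_log_likelihood(times, failed, covariates, coeffs, sigma):
    """Log of Eq. (7B.4) with a minimum-EVD error term (Weibull life distribution).

    times      : observed failure or censoring times
    failed     : 1 for a failure, 0 for a right-censored reading (delta_i)
    covariates : one list [X1, ..., Xk] per observation
    coeffs     : [beta_0, beta_1, ..., beta_k]
    sigma      : scale parameter (1/beta for a Weibull distribution)

    The density is that of ln(t); the Jacobian term -ln(t) is omitted because
    it does not depend on the parameters.
    """
    total = 0.0
    for t, delta, x in zip(times, failed, covariates):
        location = coeffs[0] + sum(b * xi for b, xi in zip(coeffs[1:], x))
        z = (math.log(t) - location) / sigma
        if delta:
            total += -math.log(sigma) + z - math.exp(z)   # ln[(1/sigma) g(z)]
        else:
            total += -math.exp(z)                         # ln[1 - G(z)]
    return total


# Three plant 1 readings at 170 C from Table 7-4 (the third is censored).
arr_temp_170 = 11604.83 / (170.0 + 273.16)
ll = sev_log_likelihood(
    times=[343.0, 869.0, 244.0],
    failed=[1, 1, 0],
    covariates=[[arr_temp_170]] * 3,
    coeffs=[-15.3411, 0.83925],
    sigma=0.33978,
)
print(f"log-likelihood = {ll:.4f}")
```

A general-purpose optimizer applied to the negative of this function would be one way to obtain the ML estimates.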
8 Engineering Approaches to Design Verification

A design can be verified through testing or the use of computer-aided engineering (CAE) models. The use of CAE models is preferred, however, as it constitutes a proactive strategy for identifying design deficiencies early in the design process. Product testing should be viewed only as a final test or confirmation of a good design. The combination of these verification methods will result in a reduction in the number of prototype build and verification events, leading to a reduction in both cost and product lead times. The advantages of using CAE approaches over conventional DV/PV testing include, but are not limited to, the following:

• Rapid identification of potential design deficiencies
• Reduction in the number of prototype builds and build events
• Reduction in resource requirements (cost) for testing
• Reduction in engineering design changes
• Reduction in overall product lead times

There are some risks and/or disadvantages in its use, including

• Technology is still under development. It is not always available.
• The roles of today's engineers must change to meet the challenges of using today's technology, which is ever-evolving.
• Software can be very expensive to acquire and use.
• CAE models can become quite complex and difficult to use and/or administer. They require advanced training in most cases.
• CAE models do not always correlate very well with real-world performance. For example, CAE is not very useful for evaluation of EMI (electromagnetic interference) related faults. In such cases it is necessary to take a design all the way to the board level for hardwire verification and follow that up again with completed system-level testing and validation.
• If a problem arises from the use of CAE, or one that it did not detect, it might be difficult and costly in both time and money to reprogram the software.

In this chapter we survey a collection of modern alternatives to life testing. We briefly survey the use of computer-aided engineering tools, including finite-element analysis techniques, probabilistic design, degradation modeling, and robust design. As the effectiveness of these tools continues to improve and evolve with the introduction of new technology and methods, it is probable that the reliability engineer of the 21st century will have quite a different role than what we see today. The reliability engineer is likely to be more involved with the product assurance function associated with these new technologies. In this role the engineer will be responsible for the development of new procedures and systems for ensuring the effective use of methodology for modeling potential product performance in the field.
8.1 COMPUTER-AIDED ENGINEERING APPROACHES

8.1.1 Finite-Element Analysis

Finite-element analysis (FEA) is a technique for predicting the response of structures and materials to external stresses such as forces, heat, and vibration. Capabilities for modeling fluid mechanics and their interfaces are also included. This, in turn, can be used to assess the likelihood that a part or structure being simulated will endure the stresses it is subjected to in the field. The process starts with the creation of a geometric model. The model is subdivided (meshed) into small pieces (elements) of simple shapes connected at specific node points. There are a number of computer-aided design (CAD) packages such as CATIA®, Pro/Engineer®, and AutoCAD® for which a three-dimensional solid or wireframe surface model can be saved in Initial Graphics Exchange Specification (IGES) format or another standard. Most FEA software, such as I-DEAS®, FEMAP®, Hypermesh®, LS-Dyna®, Algor®, Abaqus®, and Nastran, provides a preprocessor to model the geometry, if one is not available, or it can access the saved model directly. FEA software can often be used to model both static and dynamic loading situations. Such packages often include dedicated postprocessors for viewing the results and for conducting sensitivity analysis. Increasingly, many of the CAD systems have integrated FEA capabilities (e.g., AutoCAD with ANSYS plug-in and CATIA with FEM® add-on component). A sample output from CATIA for a turbine engine component is presented in Figure 8-1. Two solid model renderings are shown: (a) a visual rendering and (b) a 3D representation showing the FEA mesh, with color-coded elements. The color-coding is used to provide a visual indication of areas under the greatest stress (thermal or mechanical, generally). Design engineers can then direct their attention to areas that are most likely susceptible to external stresses. (They can also use the model to identify locations where it might be safe to remove material to reduce weight!)
8.1.2 Other Computer-Aided Engineering (CAE) Approaches

There are numerous commercially available CAE packages for assessing design properties. Not all are finite-element. Algor, for example, includes capabilities for modeling linear and nonlinear behaviors including heat transfer and fluid flow wherever applicable. Interactions between parts (contact) and other dynamic modeling capabilities are included. In such cases, parts are broken down into high-level elements that are flexible or rigid, kinematic or static, linear or nonlinear, etc. Design teams also develop or customize their own dedicated CAE models to meet their needs. For example, suppliers of U.S. automotive front-end accessory drive (FEAD) systems have developed dedicated design CAE packages for modeling performance of such systems under a wide range of conditions.

FIGURE 8-1 CATIA® solid and mesh rendering of a turbine engine component. (Courtesy of IBM®, from http://www.catia.ibm.com.)
8.2 PROBABILISTIC DESIGN

When we think of engineering design, we generally think of it in a deterministic sense. That is, all design parameters, including material properties, loads, geometry, and boundary conditions, are treated as constants. In the simplest representation we examine the strength of a component or subsystem (its capacity to endure stress) versus stress, which is a composite effect of potential electrical, thermal, mechanical, or chemical loads. Mean and variance information on stress and strength is usually ignored in this analysis. A factor of safety, the ratio of the likely (mean or median) value of strength to that of stress, is chosen such that the chance of the stress exceeding strength is believed to be acceptably small (close to nil!). Safety factors usually run from 1.2 to 4.0, with 2.0 being an average (see Ebeling, 1997, p. 162). The high end of the range is reserved for very critical or safety-related performance requirements. In probabilistic design we recognize that due to variation in material properties, manufacturing, environmental conditions, and so forth, both stress and strength possess a distribution, which should be considered. We now survey probabilistic design models, from simple normal or lognormal competing distributions on stress and strength, to the more complex representations, which require the use of computer methods.
8.2.1 Simple Strength Versus Stress Models

In a simplistic sense we acknowledge the presence of a distribution on stress and strength. As illustrated in Figure 8-2, the probability of failure is evaluated in the interference region, where stress exceeds strength.

FIGURE 8-2 Simple stress versus strength competition model.

In a probabilistic sense we define the following notation:

$p_f$ = Probability of failure, $P(Y < X)$.
$Y$ = Strength of subsystem; a random variable with mean $\mu_y$ and standard deviation $\sigma_y$.
$X$ = Stress placed on subsystem, with mean $\mu_x$ and standard deviation $\sigma_x$.
SF = Safety factor = $\mu_y / \mu_x$.
SM = Safety margin = $\mu_y - \mu_x$.
General Distribution Models of Y − X

For two competing variables, X and Y, we generalize the evaluation of $p_f$ to three scenarios:

1. Known strength, but unknown loads or stress distribution: (8.1)
2. Known stress, but unknown strength distribution: (8.2)
3. Unknown stress and strength: (8.3)
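For reference, the three scenarios are commonly written in the following standard textbook form. This is offered as a hedged sketch and is not a reproduction of the printed Eqs. (8.1)–(8.3): the symbols $y_0$ (a fixed, known strength), $x_0$ (a fixed, known stress), $f_X$, $f_Y$, $F_X$, and $F_Y$ are notation assumed here, and the third expression assumes that X and Y are independent.

```latex
% Scenario 1: strength fixed at y_0, stress X random
p_f = P(X > y_0) = \int_{y_0}^{\infty} f_X(x)\,dx = 1 - F_X(y_0)

% Scenario 2: stress fixed at x_0, strength Y random
p_f = P(Y < x_0) = \int_{-\infty}^{x_0} f_Y(y)\,dy = F_Y(x_0)

% Scenario 3: both stress and strength random (independent)
p_f = P(Y < X) = \int_{-\infty}^{\infty} F_Y(x)\, f_X(x)\,dx
```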
Normal Distribution Assumption of Y − X

If X and Y are independent and normally distributed, then the difference D = Y − X also follows a normal distribution, with mean equal to the safety margin, SM = $\mu_y - \mu_x$. The probability of subsystem failure is $p_f = P(D < 0)$, where D is normally distributed with

$$\mu_D = \text{SM} = \mu_y - \mu_x, \qquad \sigma_D^2 = \sigma_y^2 + \sigma_x^2$$

Thus, we can directly arrive at an expression for $p_f$, the probability of failure, as follows:

$$p_f = P(D < 0) = P\left(Z < \frac{0 - \mu_D}{\sigma_D}\right) = \Phi\left(\frac{-\text{SM}}{\sqrt{\sigma_y^2 + \sigma_x^2}}\right) \tag{8.4}$$

Lognormal Distribution Assumption of Y − X

We assume that X and Y are lognormally distributed with medians $X_{med}$ and $Y_{med}$, respectively. By the properties of the lognormal distribution, ln X, ln Y, and their difference are normally distributed. It follows, as we show, that the probability of failure is $p_f = P(\ln Y - \ln X < 0)$, which can be expressed in terms of the safety factor, SF = $Y_{med}/X_{med}$. This is a normal probability. Specifically,

$$p_f = P(Y - X < 0) = P(\ln Y - \ln X < 0)$$

where ln X is normally distributed with mean $\mu_{\ln X} = \ln X_{med}$ and variance $\sigma_{\ln X}^2 = \sigma_x^2$, and ln Y is normally distributed with mean $\mu_{\ln Y} = \ln Y_{med}$ and variance $\sigma_{\ln Y}^2 = \sigma_y^2$. Therefore, D = ln Y − ln X is normally distributed, with

$$\mu_D = \ln \text{SF} = \ln\left(\frac{Y_{med}}{X_{med}}\right), \qquad \sigma_D^2 = \sigma_y^2 + \sigma_x^2$$

We use this information to generate an expression for $p_f$, the probability of failure, as follows:

$$p_f = P(D < 0) = P\left(Z < \frac{0 - \mu_D}{\sigma_D}\right) = \Phi\left(\frac{-\ln \text{SF}}{\sqrt{\sigma_y^2 + \sigma_x^2}}\right) \tag{8.5}$$

Example 8-1: Estimating $p_f$

Calculate $p_f$, the likelihood of a failure occurrence, based on two hypothetical distributions (see Figure 8-3):

Y (strength) distributed lognormal (1808, 0.1²)
X (stress) distributed lognormal (1096, 0.1²)

From Eq. (8.5), we get

$$p_f = \Phi\left(\frac{-\ln \text{SF}}{\sqrt{\sigma_y^2 + \sigma_x^2}}\right) = \Phi\left(\frac{-\ln(1808/1096)}{\sqrt{0.1^2 + 0.1^2}}\right) = \Phi(-3.54) = 0.02\%$$
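Equations (8.4) and (8.5) are easy to evaluate numerically. The sketch below uses the standard normal distribution from Python's statistics module and checks the result of Example 8-1; the function names are illustrative, and independence of stress and strength is assumed.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and standard deviation 1


def pf_normal(mu_y, sigma_y, mu_x, sigma_x):
    """Eq. (8.4): normal strength Y versus normal stress X (assumed independent)."""
    safety_margin = mu_y - mu_x
    return STD_NORMAL.cdf(-safety_margin / math.hypot(sigma_y, sigma_x))


def pf_lognormal(y_med, sigma_y, x_med, sigma_x):
    """Eq. (8.5): lognormal strength versus lognormal stress (assumed independent)."""
    safety_factor = y_med / x_med
    return STD_NORMAL.cdf(-math.log(safety_factor) / math.hypot(sigma_y, sigma_x))


# Example 8-1: Y ~ lognormal(1808, 0.1^2), X ~ lognormal(1096, 0.1^2)
pf_example = pf_lognormal(1808.0, 0.1, 1096.0, 0.1)
print(f"p_f = {pf_example:.6f}")
```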
8.2.2 Multivariate Strength Versus Stress Competition

The conceptual model of a competition between two single variables, strength and stress, that we have presented so far is very limited. In reality, stresses and strengths are functions of many potentially random variables due to differences in material properties, geometry, boundary conditions, and the like. Accordingly, a much more generalized model is required. For example, in Figure 8-4 we consider the competition between the maximum tension stresses imparted to a rigid beam of weight W from a perpendicular load of magnitude P and yield strength Y. In this simplified model, we ignore the effects of deflection. Maximum tension stresses, which act parallel to the axis of the shaft, can be modeled using the well-known formula

$$S = \frac{Md}{2I} = \frac{L(P + W/2)\,d}{2\left(\dfrac{\pi d^4}{64}\right)} = \frac{32L(P + W/2)}{\pi d^3}$$

where

S = Maximum tension stress.
M = Bending moment.
I = Moment of inertia.
L = Length of beam.
P = Perpendicular load.
W = Weight of beam.
d = Diameter of beam.

Failure occurs when the maximum yield strength, Y, is exceeded by the maximum tension stress, S:

$$p_f = P(Y < S) = P\left(Y < \frac{32L(P + W/2)}{\pi d^3}\right) \tag{8.6}$$

FIGURE 8-3 Lognormal stress versus lognormal strength distribution: worked-out example.

FIGURE 8-4 Bending load, P, on a beam with circular cross-section of diameter, d, and weight, W.
We assume that P, L, Y, W, and d are random variables in Eq. (8.6). Following Wirsching and Ortiz (1996), we generalize the challenge to evaluate the probability expressed by Eq. (8.6) to a generalization involving a multivariate variable, $\mathbf{x}$, as follows:

$$p_f = P(g(\mathbf{x}) \le 0) \tag{8.7}$$

where $\mathbf{x}^t = (Y, P, L, W, d)^t$ and so

$$g(\mathbf{x}) = Y - \frac{32L(P + W/2)}{\pi d^3} \tag{8.8}$$

The canonical form expressed by Eq. (8.7) can become quite difficult to evaluate. Approximate solutions to Eq. (8.7) can be obtained directly with the use of Monte Carlo simulation, which requires sampling from hypothetical distributions on Y, P, L, W, and d. Alternatively, numerical approximations have been developed. We discuss the use of a linear, first-order stochastic approximation (FORM) to approximate $p_f$. It makes use of a first-order Taylor-series approximation of $g(\mathbf{x})$ as follows.
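A Monte Carlo estimate of Eq. (8.7) for the beam model of Eq. (8.8) can be sketched as follows. The means and standard deviations are those tabulated in Example 8-2; the choice of independent normal distributions, the sample size, and the random seed are assumptions made only for this illustration, since the text does not specify the input distributions.

```python
import math
import random


def g(y, p, length, w, d):
    """Limit-state function of Eq. (8.8): strength minus maximum tension stress (Pa)."""
    return y - 32.0 * length * (p + w / 2.0) / (math.pi * d ** 3)


def monte_carlo_pf(means, sigmas, n_samples=20000, seed=1):
    """Fraction of sampled designs with g(x) <= 0, using independent normal inputs (assumed)."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(n_samples):
        draw = [rng.gauss(m, s) for m, s in zip(means, sigmas)]
        if g(*draw) <= 0.0:
            failures += 1
    return failures / n_samples


# Order: Y [Pa], P [N], L [m], W [N], d [m]; values from the table in Example 8-2.
means = (132000.0, 700.0, 2.0, 200.0, 0.5)
sigmas = (1500.0, 50.0, 0.1, 20.0, 0.05)

pf_mc = monte_carlo_pf(means, sigmas)
print(f"Monte Carlo estimate of p_f = {pf_mc:.3f}")
```

Because the mean strength is only slightly above the mean stress in this example, the estimate falls a little below 0.5; the exact figure depends on the seed and on the distributional assumptions.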
Consider the random vector $\mathbf{x} = (x_1, x_2, \ldots, x_d)$, with $\mu_i = E(x_i)$ and $\sigma_i^2 = \text{Var}(x_i)$, $i = 1, 2, \ldots, d$. We form the first-order Taylor-series expansion of $g(\mathbf{x})$ about $\mathbf{x}_0 = (\mu_1, \mu_2, \ldots, \mu_d)$ as follows:

$$g(\mathbf{x}) = g(\mathbf{x}_0) + \sum_{i=1}^{d} \left.\frac{\partial g}{\partial x_i}\right|_{\mathbf{x}_0} (x_i - \mu_i) + \text{higher-order terms} \tag{8.9}$$

The probability of failure, $p_f = P(g(\mathbf{x}) < 0)$, will be approximated by a standard normal approximation using first-order approximations of $\mu_g$ and $\sigma_g$, the population mean and standard deviation of $g(\mathbf{x})$:

$$p_f = P(g(\mathbf{x}) \le 0) \approx P\left(\underbrace{\frac{g(\mathbf{x}) - \mu_g}{\sigma_g}}_{z} \le \frac{0 - \mu_g}{\sigma_g}\right) \approx \Phi(-\beta) \tag{8.10}$$

$\beta = \mu_g / \sigma_g$ is defined by Wirsching and Ortiz (1996) as a safety index. To estimate $\mu_g$ and $\sigma_g$, we use the well-known result that for a linear combination of independent variables (e.g., Vardeman, 1994, p. 254),

$$W = a_0 + a_1 X_1 + a_2 X_2 + \cdots + a_n X_n$$

with $E(X_i) = \mu_i$ and $\text{Var}(X_i) = \sigma_i^2$ for $i = 1, 2, \ldots, n$,

$$E(W) = a_0 + a_1\mu_1 + a_2\mu_2 + \cdots + a_n\mu_n$$

and

$$\text{Var}(W) = a_1^2\sigma_1^2 + a_2^2\sigma_2^2 + \cdots + a_n^2\sigma_n^2 \tag{8.11}$$

Referring back to the first-order (FORM) Taylor-series expansion of Eq. (8.9), we set $\mathbf{x}_0 = (\mu_1, \mu_2, \ldots, \mu_d)^t$, the vector of means. Combined with the result of Eq. (8.11), we approximate $\mu_g$ and $\sigma_g$ as follows:

$$\mu_g = E(g(\mathbf{x})) \approx g(\mathbf{x}_0) + 0 \tag{8.12}$$

(Note: Expected values of the $(x_i - \mu_i)$ terms vanish.)

$$\sigma_g^2 = \text{Var}[g(\mathbf{x})] \approx \underbrace{\text{Var}[g(\mathbf{x}_0)]}_{0} + \sum_{i=1}^{d} \left(\left.\frac{\partial g}{\partial x_i}\right|_{\mathbf{x}_0}\right)^2 \sigma_i^2 + \text{higher-order terms} \tag{8.13}$$

Note that we make use of Eq. (8.11), treating $g(\mathbf{x}_0)$ and $(\partial g/\partial x_i)$ as constants in Eqs. (8.12) and (8.13). The choice of $\mathbf{x}_0 = (\mu_1, \mu_2, \ldots, \mu_d)^t$ results in the simplified expression for $\mu_g$ given by Eq. (8.12), because the $(x_i - \mu_i)$ terms vanish when the expectation operator is applied. More details on the development of the first-order model are presented in the appendix to this chapter. Wirsching and Ortiz (1996) indicate that the approximation given by Eqs. (8.12) and (8.13) is reasonable when $C_i = \sigma_i/\mu_i \le 0.15$, for $i = 1, 2, \ldots, d$. Equation (8.10) has been adapted for lognormal variables. Many improvements have been suggested to the FORM approximation, which have resulted in very computationally challenging procedures that require the use of dedicated computer programs. A popular choice of probabilistic design software is the NESSUS® software by the Southwest Research Institute. Notable enhancements including SORM, a generalized second-order approximation model (see Wu et al., 1990, Melchers, 1987, and Madsen et al., 1986, for example), have been incorporated into the software. A worked-out example illustrating the use of FORM now follows.

Example 8-2: First-order approximation to $p_f$
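Consider the model presented by Eq. (8.7) with the following properties:

Variable | $\mu$ | $\sigma$
L (m) | 2 | 0.1
P (N) | 700 | 50
W (N) | 200 | 20
d (m) | 0.5 | 0.05
Y (kPa) | 132 | 1.5

Note: 1 kPa = 1000 N/m².

From Eqs. (8.8) and (8.12),

$$g(\mathbf{x}) = Y - \frac{32L(P + W/2)}{\pi d^3}$$

and so

$$\mu_g \approx g(\mathbf{x}_0) = \mu_y - \frac{32\mu_L(\mu_P + \mu_W/2)}{\pi\mu_d^3} = 132{,}000 - \frac{32 \times 2\,(700 + 200/2)}{\pi\, 0.5^3} = 1554\ \text{Pa}$$

From Eq. (8.13), with all partial derivatives evaluated at the means,

$$\sigma_g^2 \approx \left(\frac{\partial g}{\partial P}\right)^2\sigma_P^2 + \left(\frac{\partial g}{\partial W}\right)^2\sigma_W^2 + \left(\frac{\partial g}{\partial L}\right)^2\sigma_L^2 + \left(\frac{\partial g}{\partial d}\right)^2\sigma_d^2 + \left(\frac{\partial g}{\partial Y}\right)^2\sigma_y^2$$

$$= \left(\frac{32\mu_L}{\pi\mu_d^3}\right)^2\sigma_P^2 + \left(\frac{(32/2)\mu_L}{\pi\mu_d^3}\right)^2\sigma_W^2 + \left(\frac{32(\mu_P + \mu_W/2)}{\pi\mu_d^3}\right)^2\sigma_L^2 + \left(\frac{3 \times 32\mu_L(\mu_P + \mu_W/2)}{\pi\mu_d^4}\right)^2\sigma_d^2 + \sigma_y^2$$

$$= \left(\frac{32 \times 2}{\pi\, 0.5^3}\right)^2 50^2 + \left(\frac{(32/2) \times 2}{\pi\, 0.5^3}\right)^2 20^2 + \left(\frac{32(700 + 200/2)}{\pi\, 0.5^3}\right)^2 0.1^2 + \left(\frac{3 \times 32 \times 2\,(700 + 200/2)}{\pi\, 0.5^4}\right)^2 0.05^2 + 1500^2$$

$$= 175{,}176{,}343\ \text{Pa}^2$$

$$\sigma_g \approx 13{,}235\ \text{Pa}$$

$$p_f = \Phi\left(\frac{-\mu_g}{\sigma_g}\right) = \Phi\left(\frac{-1554}{13{,}235}\right) < 0.50$$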
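The first-order calculation of Example 8-2 can be scripted directly from Eqs. (8.12) and (8.13). The sketch below keeps the value of π and the standard deviation of d as arguments, because the printed figures (1554 Pa and 175,176,343 Pa²) appear to be reproduced when π is rounded to 3.14 and $\sigma_d$ is taken as 0.01 m; with the tabulated $\sigma_d = 0.05$ m the variance is considerably larger, although $p_f$ remains below 0.50 in both cases. The function name and argument defaults are illustrative.

```python
import math
from statistics import NormalDist


def form_example(sigma_d, pi_value=math.pi):
    """First-order (FORM) estimates of mu_g, sigma_g^2, and p_f for the beam of Eq. (8.8)."""
    mu_L, mu_P, mu_W, mu_d, mu_Y = 2.0, 700.0, 200.0, 0.5, 132000.0
    s_L, s_P, s_W, s_Y = 0.1, 50.0, 20.0, 1500.0

    k = 32.0 / (pi_value * mu_d ** 3)       # common factor 32 / (pi d^3)
    load = mu_P + mu_W / 2.0
    mu_g = mu_Y - k * mu_L * load           # Eq. (8.12)

    # (partial derivative at the means, standard deviation) for each variable
    terms = {
        "P": (-k * mu_L, s_P),
        "W": (-k * mu_L / 2.0, s_W),
        "L": (-k * load, s_L),
        "d": (3.0 * k * mu_L * load / mu_d, sigma_d),
        "Y": (1.0, s_Y),
    }
    var_g = sum((dg * s) ** 2 for dg, s in terms.values())   # Eq. (8.13)
    pf = NormalDist().cdf(-mu_g / math.sqrt(var_g))          # Eq. (8.10)
    return mu_g, var_g, pf


# Settings that reproduce the printed figures (an observation, not a statement of intent).
mu_g_print, var_g_print, pf_print = form_example(sigma_d=0.01, pi_value=3.14)
# Tabulated standard deviation of d with full-precision pi.
mu_g_tab, var_g_tab, pf_tab = form_example(sigma_d=0.05)

print(f"printed-style: mu_g={mu_g_print:.0f} Pa, var={var_g_print:,.0f} Pa^2, pf={pf_print:.3f}")
print(f"tabulated:     mu_g={mu_g_tab:.0f} Pa, var={var_g_tab:,.0f} Pa^2, pf={pf_tab:.3f}")
```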
8.2.3 Probabilistic FEA
With probabilistic FEA, we assume a distribution on material properties or loads. The Institute of Safety Research and Reactor Technology of the Helmholtz research center in Jülich, Germany, reports the development of PERMAS-RA™ finite-element analysis software with probabilistic capability. An approach for incorporating stochastics on boundary conditions has been developed. Additionally, a Monte Carlo capability is added for handling FORM and SORM methods. No doubt, by the time of this printing, more vendors will have advanced probabilistic FEA technology with their own product offerings.
8.3 PARAMETRIC MODELS
We define a broad class of models for which a suitable performance or degradation metric, $P(t)$, is identified. These models can be used to validate product reliability before the final prototype test. A representation of prospective $P(t)$-measures was suggested in Chapter 1 (see §1.2 and Table 1-4). There is a continuum of values of $P(t)$, from a level where performance is at or near perfection to a level where, perhaps, product performance is so compromised that the product might be labeled "defective!" Accordingly, it is very challenging for the reliability analyst to come up with clear-cut acceptance limits for $P(t)$ where, on one side, the product is considered to be functioning adequately and, on the other side, the product is considered to be in a failed state. Sometimes it even makes sense to partition $P(t)$ into multiple states. Given the distribution of customer perceptions about the acceptability of performance for a given $P(t)$-value, it sometimes makes sense to employ fuzzy set theory (see Mohanty, 2001) to model product states. In any case, the use of bogey limits, as discussed in Chapter 1, makes perfect sense, as the engineer is often required to come up with subjective criteria for product acceptability. These criteria are often moving targets, based on benchmarked results of what the competition is doing relative to its bogey testing, and so on. Without loss of generality, in the discussion that follows we refer to this bogey limit as $D$, a degradation limit, which can sometimes be modeled as a random variable to represent the uncertainty and random nature of this limit. $P(t)$ is modeled as one of the following:
1. Parametric fit: A physical fit of a parametric model to degradation data.
2. Physics of failure: A model such as an Eyring model (Chapter 7, §7.4.2) can be used to model progress to failure. The CALCE Electronic Products and Systems Consortium center at the University of Maryland has constructed a database of physical models for modeling electronics reliability.
Failure (catastrophic or "bogey") is said to have occurred when $P(t)$ crosses a degradation limit, $D$. For example, Wasserman (1996) employs a modified Paris model, which is tailored after the work by Lu and Meeker (1993) for modeling degradation phenomena. In this representation the degradation rate is proportional to the accumulated level of degradation, such as that associated with fatigue crack growth. A key assumption is made regarding the existence of a single, critical degradation characteristic, $Y(t)$, whether observable or not, which is strongly related to the failure event (see Figure 8-5):

$$Y(t) = r(t; \phi \mid d)\,\varepsilon_i(d, n) \quad \text{for } i = 1, 2, \ldots, n \tag{8.14}$$

where
$r(t; \phi \mid d)$ or just $r(t)$ = Mean degradation level at time $t$.
$\phi$ = A vector of fixed-effect parameters common for all units.
$\varepsilon_i$ = A lognormal $(1, \sigma_n^2)$ random error term.
$d$ = A vector of control factors.
$n$ = A vector of noise factors.

FIGURE 8-5 Hypothetical degradation path (see Wasserman, 1996).
Note that in Eq. (8.14) we explicitly differentiate between variables that can be controlled, such as design parameters, and those that are essentially uncontrollable (i.e., noise factors), such as the effects of environmental stresses. This allows for a robust design framework, wherein a setting of the design parameters is identified that optimizes product performance in the presence of noise. In this case some of the noise factors are time-dependent, to allow for optimization of reliability—product performance over time. Following the philosophy of Taguchi (see Leon et al., 1987, Phadke, 1989, and Shoemaker and Kacker, 1988), the objective is to identify a setting of the design factors that results in a design that is least sensitive to the noise effects. Examples of noise effects include the following:
1. External effects: environmental effects; customer usage; interactions between systems
2. Internal or within effects: wearout, degradation, or any other time-dependent noise factor phenomena
3. Between effects: manufacturing and assembly variation
Wasserman (1996) demonstrates the advantage of examining internal noise effects such as wearout phenomena, and the resultant relationship between robust design and reliability when internal effects are properly addressed (see Example 8-3). Based on the degradation model given by Eq. (8.14), the conditional distribution of $Y(t)$ on $t$ is lognormal with median $Y_{med} = Y_0 t^a$ and deviation parameter $\sigma = \sigma_n$. According to the properties of the lognormal distribution, the mean and variance of the degradation function, $Y(t)$, are given by

$$\mu_{Y(t)} = Y_0 t^a \exp\left(\frac{\sigma_n^2}{2}\right)$$

$$\sigma^2_{Y(t)} = (Y_0^2 t^{2a})\exp(\sigma_n^2)\,[\exp(\sigma_n^2) - 1.0] = \mu^2_{Y(t)}\,[\exp(\sigma_n^2) - 1.0]$$

respectively. As illustrated by Figure 8-5, the true degradation limit, $D$ itself, can be treated as a random variable if it is not a predetermined test bogey. In such cases it is assumed that $D$ follows a lognormal distribution with location parameter $L_d$ and deviation parameter $\sigma_d$; that is, $D \sim \text{lognormal}(L_d, \sigma_d^2)$, where $\sigma_d = 0.0$ if a test bogey is used. Thus, $p_f$, the probability of failure, is the probability $P(Y(t) > D)$. Equivalently, we may evaluate the probability $p_f = P(\ln Y(t) > \ln D)$ formed by taking the natural log of both sides. This is justified because the natural log is a monotonic transformation applied to both sides.

$$p_f = P(\ln Y(t) > \ln D) = \Phi\left(\frac{\mu_{\ln Y} - \mu_{\ln D}}{\sqrt{\sigma_n^2 + \sigma_d^2}}\right) = \Phi\left(\frac{\ln Y_0 + a\ln t - \ln L_d}{\sigma'_n}\right) = \Phi\left(\frac{\ln t - \ln\,(L_d/Y_0)^{1/a}}{\sigma'_n/a}\right) \tag{8.15}$$

where $\sigma'_n = \sqrt{\sigma_n^2 + \sigma_d^2}$.

In this case the failure distribution can be seen to follow a lognormal distribution, with median and deviation parameter

$$\text{median} = \left(\frac{L_d}{Y_0}\right)^{1/a} \qquad \sigma = \frac{\sigma'_n}{a} \tag{8.16}$$
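Equations (8.15) and (8.16) can be evaluated in a few lines. The following is a minimal sketch, not the author's implementation; the function names and the numerical inputs ($Y_0 = 1$, $a = 0.5$, $L_d = 10$, $\sigma_n = 0.3$) are illustrative assumptions and do not come from the text.

```python
import math

def std_normal_cdf(z):
    """Standard normal CDF, Phi(z), written with the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def degradation_pf(t, y0, a, sigma_n, l_d, sigma_d=0.0):
    """Eq. (8.15): probability that the degradation Y(t) exceeds the limit D.

    sigma_d = 0.0 corresponds to a fixed test bogey.
    """
    sigma_prime = math.sqrt(sigma_n ** 2 + sigma_d ** 2)
    return std_normal_cdf((math.log(y0) + a * math.log(t) - math.log(l_d)) / sigma_prime)

def failure_time_median(y0, a, l_d):
    """Eq. (8.16): median of the lognormal time-to-failure distribution."""
    return (l_d / y0) ** (1.0 / a)

# Hypothetical inputs, chosen only to exercise the formulas.
y0, a, l_d, sigma_n = 1.0, 0.5, 10.0, 0.3
t_median = failure_time_median(y0, a, l_d)
pf_at_median = degradation_pf(t_median, y0, a, sigma_n, l_d)
pf_early = degradation_pf(0.5 * t_median, y0, a, sigma_n, l_d)
print(t_median, pf_at_median, pf_early)
```

At the median failure time the argument of $\Phi$ is zero, so the failure probability there is one-half, which gives a quick sanity check on any set of inputs.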
Example 8-3: (Wasserman, 1996)
In this example we describe the use of robust design to optimize product reliability. The results of a robust design on a thermoplastic component are shown in Figure 8-6. In this experiment an eight-run ($L_8$) orthogonal array of seven design factors, A to G, is shown. The seven design parameters are as follows:
A. Injection pressure
B. Plastic formulation
C. Plastic temperature
D. Cutting time
E. Curing time
F. Mold type
G. Mold design

FIGURE 8-6 Robust design on a plastic component.

At each orthogonal array setting of the seven design factors, randomized experiments on a metric, $P(t)$, were obtained at two different ages:
1. 1 cycle of usage
2. 80,000 cycles of usage
A parametric model was fit at each of the eight control conditions. At each control condition, the Paris model parameters, $a$ and $\sigma_n^2$, are identified from a best fit, and a signal-to-noise (s/n) performance measure, $\eta$, is calculated. The choice of an appropriate s/n measure depends on application assumptions. In this case the s/n measure used is what Taguchi refers to as a "nominal-the-best" measure. The application-specific s/n measure developed is given by

$$\eta = -10\log\left[\sigma^2_{Y_M} + (\mu_{Y_M})^2\right] = -10\log\left(0.5\,\mu^2_{Y(1)}\,T^{2a}\exp(\sigma_n^2)\right) = -10\log\left(0.5\,Y_0^2\,T^{2a}\exp(2\sigma_n^2)\right) \quad \text{for } T^a \gg 1 \tag{8.17}$$
Wasserman (1996) provides additional information on model development and details behind the development of the s=n measures. The output from the designed experiment is presented in Figure 8-6. Based on an analysis of the average effects of these performance statistics, an optimal design was identified.
8.4 SUMMARY
Design verification activities in a modern reliability organization should emphasize the use of CAE models for early identification of design deficiencies. Specific computer-aided approaches include
1. Use of computer-aided engineering tools such as finite-element analysis
2. Use of probabilistic design approaches (strength versus stress competition)
3. Use of mechanistic and/or degradation models

8.5 EXERCISES
1. Explain the advantages in the use of computer-aided engineering (CAE) techniques over traditional DV/PV testing strategies. What limitations are there in the use of CAE techniques?
2. A bolted joint has mean shear strength, $\mu_Y$, of 45 k-lb/in² and standard deviation, $\sigma_Y$, of 4.2 k-lb/in². The joint is loaded such that the shear stress on the joint has a mean value, $\mu_X$, of 38 k-lb/in² and a standard deviation of 6.4 k-lb/in². Assuming that the shear strength and induced stress are independent and normally distributed, find the probability of failure of the bolted joint.
3. Consider a simple DC circuit component with resistor $R$. Suppose that the resistor's rated voltage, $V_R$, has $\mu_V = 101$ volts and $\sigma_V = 0.67$ volts. The actual voltage is calculated by $V = IR$. The current, $I$, has a mean $\mu_I = 25$ amps, with $\sigma_I = 0.333$. The resistor has a mean $\mu_R = 4$ ohms and a standard deviation $\sigma_R = 0.01$. What is the probability of failure? (Hint: Let $g(V_R, I, R) = V_R - IR \Rightarrow P_f$ (probability of failure) $= P(g < 0) \approx \Phi(-\beta)$.)
4. Measure the tread depth and mileage on one of the tires on your car. Determine from the manufacturer or your tire dealer what the tread depth is supposed to be when it is new. Construct a plot of tread depth versus time using the tread-depth information. Extrapolate this fit linearly to an artificial bogey of 3/32 in. (See Scheme 8-1.)

SCHEME 8-1

a. At what mileage is the tread depth expected to be worn down to 3/32 in.?
b. If you worked for a tire manufacturer on tire development, what design parameters would you consider in a designed experiment to increase tread life? Are there cost–life considerations?
c. What noise factors (that is, the uncontrollable factors that affect performance during tire usage) should you consider in a robust designed experiment to increase tread life?
d. What kind of robust designed experiment for reliability might you run to optimize tread life?
APPENDIX 8A FIRST-ORDER RELIABILITY METHOD (FORM)
Following Wirsching and Ortiz (1996), we assume that the system probability of failure, $p_f$, is a function of the joint distribution on random variables $x_1, x_2, \ldots, x_d$ as follows:

$$p_f = P(g(x) \le 0) \approx P\left(\underbrace{\frac{g(x) - \mu_g}{\sigma_g}}_{Z\ \text{r.v.}} \le \frac{0 - \mu_g}{\sigma_g}\right) \approx \Phi(-\beta)$$

where $\mu_g$ and $\sigma_g$ are first-order approximations. The FORM evaluation of $p_f$ is an adaptation of the Hasofer–Lind (1973) algorithm for evaluating the safety index, $\beta = \mu_g/\sigma_g$. First we transform the $x_i$ random variables to standard normal random variables using

$$z_i = \Phi^{-1}(F_{x_i}(x_i)) \quad i = 1, 2, \ldots, d \tag{8A.1}$$

Equation (8A.1) assumes knowledge of the distribution on the $x_i$ random variables. For normally distributed variables, we use $z_i = (x_i - \mu_i)/\sigma_i$ for $i = 1, 2, \ldots, d$. Following Hasofer–Lind (1973), we define the generalized safety index ($\beta'$) as the minimum distance from the origin to the failure surface, $g = 0$, in the transformed space:

$$\beta' = \min\sqrt{z_1^2 + z_2^2 + \cdots + z_d^2} \tag{8A.2}$$

In this representation

$$p_f \approx \Phi(-\beta') \tag{8A.3}$$

Graphically, we view the identification of $\beta'$ as shown in Figure 8-7.

FIGURE 8-7 First-order reliability method identification of safety index, $\beta'$.
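The minimization in Eq. (8A.2) is usually carried out numerically. The appendix does not name a search scheme, so the sketch below swaps in the Hasofer–Lind/Rackwitz–Fiessler (HL-RF) fixed-point iteration, a common choice, under the assumption of independent normal variables so that $z_i = (x_i - \mu_i)/\sigma_i$. The function names and the strength-versus-stress numbers are hypothetical, and the returned distance is unsigned, so the sketch presumes the mean point lies in the safe region.

```python
import math

def hlrf_beta(g, mu, sigma, tol=1e-8, max_iter=100, h=1e-6):
    """Approximate beta' of Eq. (8A.2) for independent normal x_i (assumed).

    g     : limit-state function of the original variables; failure when g(x) <= 0
    mu    : list of means
    sigma : list of standard deviations
    """
    d = len(mu)

    def g_of_z(z):
        # Map standard normal z back to the original variables.
        return g([m + s * zi for m, s, zi in zip(mu, sigma, z)])

    z = [0.0] * d  # start the search at the mean point
    for _ in range(max_iter):
        g0 = g_of_z(z)
        # Central-difference gradient of g with respect to z.
        grad = []
        for i in range(d):
            zp, zm = list(z), list(z)
            zp[i] += h
            zm[i] -= h
            grad.append((g_of_z(zp) - g_of_z(zm)) / (2.0 * h))
        norm_sq = sum(gi * gi for gi in grad)
        # HL-RF update: project onto the linearized failure surface g = 0.
        scale = (sum(gi * zi for gi, zi in zip(grad, z)) - g0) / norm_sq
        z_new = [scale * gi for gi in grad]
        converged = max(abs(p - q) for p, q in zip(z_new, z)) < tol
        z = z_new
        if converged:
            break
    return math.sqrt(sum(zi * zi for zi in z)), z

def std_normal_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

# Hypothetical linear limit state: strength minus stress.
g_linear = lambda x: x[0] - x[1]
beta, z_star = hlrf_beta(g_linear, mu=[50.0, 40.0], sigma=[4.0, 3.0])
pf = std_normal_cdf(-beta)  # Eq. (8A.3)
print(beta, pf)
```

For a linear limit state with normal variables the first-order result is exact, $\beta' = (50 - 40)/\sqrt{4^2 + 3^2} = 2$, which makes this case a convenient check before trying a nonlinear $g$.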
9 Likelihood Estimation (Advanced)
The development of (maximum) likelihood-based procedures for estimation of exponential properties is straightforward. This is not the case, however, for the Weibull and normal distributions, for which we must introduce procedures for obtaining asymptotically correct Fisher-matrix or likelihood ratio confidence intervals on various reliability metrics of interest. The results are generalized for any multiply right-censored data set. We present worked-out examples illustrating the use of Microsoft Excel or Minitab for likelihood estimation for each of the distributions discussed.
9.1 MAXIMUM LIKELIHOOD (ML) POINT ESTIMATION
We consider an arbitrarily censored data set consisting of independent observations, $t_1 \le t_2 \le \cdots \le t_{n-1} \le t_n$, with associated failure density function, $f(t; \theta)$, where $\theta$ is an array of one or more unknown parameters to be estimated. We define a likelihood function, $L(\theta)$, associated with this data set as follows:

$$L(\theta) = \prod_{i=1}^{n} L_i(\theta) \tag{9.1}$$

where each likelihood term, $L_i(\theta)$, is replaced by
$f(t_i; \theta)$ if $t_i$ is a recorded failure
$F(t_i; \theta)$ if $t_i$ is a left-censored observation
$R(t_i; \theta)$ if $t_i$ is a right-censored observation
$F(t_{i,R}; \theta) - F(t_{i,L}; \theta)$ if a failure occurs somewhere in $(t_{i,L}, t_{i,R})$

Note: In more formal presentations of maximum likelihood estimation, Eq. (9.1) is often modified to include a permutation constant, $n!$, in the likelihood expression, inserted to denote the number of arrangements in the ordered data set. We take the liberty of not showing this constant, as it will immediately be seen that this constant drops out in the analysis.

For multiply right-censored observations, Eq. (9.1) may be rewritten as

$$L(\theta) = \prod_{i=1}^{n} [f(t_i; \theta)]^{\delta_i}\,[R(t_i; \theta)]^{1-\delta_i} \qquad \delta_i = \begin{cases} 1 & \text{if } t_i \text{ is a recorded failure} \\ 0 & \text{if } t_i \text{ is a right-censoring time} \end{cases} \tag{9.2}$$
If the data is singly right-censored, it is possible to simplify Eq. (9.1) even further. Assume that of the $n$ items on test, we record $r$ failures and observe $(n - r)$ survivors, all of which have been on test for exactly $t = t^*$ units of exposure. Equation (9.2) becomes

$$L(\theta) = f(t_1; \theta)\, f(t_2; \theta) \cdots f(t_r; \theta)\,[R(t^*; \theta)]^{n-r} \tag{9.3}$$

Equation (9.3) is just the joint density of observing $r$ random failures at times $t_1, t_2, \ldots, t_r$, with $n - r$ survivors taken off test at $t = t^*$, each contributing a probability $R(t^*)$ of survival to the likelihood expression. To obtain the maximum likelihood (ML) estimate of $\theta$, denoted by $\hat\theta$, we treat $t_i$, $i = 1, 2, \ldots, n$, as known sample information in $L(\theta)$, while $\theta$ is treated as the unknown in $L(\theta)$. $\hat\theta$ is then obtained as follows: The ML estimate of $\theta$ is identified by searching over the parameter space of $\theta$ for the value(s) of $\theta$ that maximizes the likelihood of observing the sample; hence the term "maximum likelihood estimation." In deriving ML estimates it is often more convenient to work with the log-likelihood function, which transforms the series of multiplicative terms in Eqs. (9.1) through (9.3) to an additive form. Since the natural log is a monotonic transformation of $L$, optimization of $\ln L$ is equivalent to the optimization of $L$. For multiply right-censored data, we work with

$$\ln L(\theta) = \sum_{i=1}^{n} [\delta_i \ln f(t_i; \theta) + (1 - \delta_i) \ln R(t_i; \theta)] \tag{9.4}$$

For singly right-censored data [see Eq. (9.3)], we work with

$$\ln L(\theta) = \sum_{i=1}^{r} \ln f(t_i; \theta) + (n - r) \ln R(t^*; \theta) \tag{9.5}$$
The ML estimates of $\theta$ are just the solutions to the resultant relationships when the partial derivatives of $\ln L$ are equated to zero. Note: Technically, we should be concerned about the global optimality of the ML estimates. For the class of one- or two-parameter (log-) location-scale distributions that we address, the solutions to $\partial \ln L/\partial\theta = 0$ are always the ML estimates. This is not the case for the three-parameter Weibull distribution, because the introduction of a threshold parameter, $\delta$, can pose some difficulties in identifying the global solution to the optimization problem. These difficulties are discussed in §9.1.5.
9.1.1 Maximum Likelihood Estimation of Exponential Hazard Parameter, $\lambda$
We begin our discussion with a presentation on the ML properties of the exponential distribution; they are very straightforward to evaluate. We describe the underlying model for the development of maximum likelihood estimates of the exponential hazard parameter, $\lambda$, for data that is multiply right-censored. ML properties of the exponential MTTF parameter, $\theta$, are easily obtained by substituting $\hat\theta = \hat\lambda^{-1}$. For multiply censored exponential data, we use $f(t; \lambda) = \lambda\exp(-\lambda t)$ and $R(t; \lambda) = \exp(-\lambda t)$, so that $\ln f(t_i; \lambda) = \ln\lambda - \lambda t_i$ and $\ln R(t) = -\lambda t$. With Eq. (9.4), we arrive at an expression for $\ln L(\lambda)$ as follows:

$$\ln L(\lambda) = \sum_{i=1}^{n} [-\delta_i\lambda t_i + \delta_i\ln\lambda - (1 - \delta_i)\lambda t_i] = -\lambda\sum_{i=1}^{n} t_i + r\ln\lambda = -\lambda T + r\ln\lambda \tag{9.6}$$

where $T$, the total unit time on test or the total exposure time of all units on test, is given by

$$T = t_1 + t_2 + \cdots + t_n \tag{9.7}$$

To maximize the log-likelihood function, we take $d\ln L(\lambda)/d\lambda$ and equate it to zero:

$$\frac{d\ln L(\lambda)}{d\lambda} = \frac{r}{\lambda} - T = 0.0 \;\Rightarrow\; \hat\lambda = \frac{r}{T} \tag{9.8}$$

Therefore, the ML estimate for the exponential MTTF parameter is given by

$$\hat\theta = \frac{T}{r} \tag{9.9}$$
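Equations (9.6) through (9.9) translate directly into code. The following is a minimal sketch with hypothetical exposure times and censoring flags (1 = recorded failure, 0 = right-censored); the function names are illustrative, and the brute-force comparison at the end is only a numerical check that $\hat\lambda = r/T$ sits at the peak of Eq. (9.6).

```python
import math

def exponential_mle(times, failed):
    """Return (lambda_hat, theta_hat) per Eqs. (9.8) and (9.9).

    times  : exposure time of every unit, failed or censored
    failed : 1 if the unit failed, 0 if it was right-censored
    """
    total_time = sum(times)   # T of Eq. (9.7)
    r = sum(failed)           # number of recorded failures
    if r == 0:
        raise ValueError("at least one failure is needed for a finite estimate")
    return r / total_time, total_time / r

def exponential_log_lik(lam, times, failed):
    """Eq. (9.6): ln L(lambda) = -lambda * T + r * ln(lambda)."""
    return -lam * sum(times) + sum(failed) * math.log(lam)

# Hypothetical multiply right-censored data set.
times = [10.0, 20.0, 30.0, 40.0, 50.0]
failed = [1, 1, 0, 1, 0]

lam_hat, theta_hat = exponential_mle(times, failed)
print(lam_hat, theta_hat)

# The log-likelihood at lam_hat should not be exceeded by nearby values.
peak = exponential_log_lik(lam_hat, times, failed)
neighbors = [exponential_log_lik(lam_hat * k, times, failed) for k in (0.9, 1.1)]
print(peak, neighbors)
```

Here $T = 150$ and $r = 3$, so the sketch returns $\hat\lambda = 0.02$ and $\hat\theta = 50$ in whatever time units the data carry.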
Equation (9.8) holds under a broad range of multiply right-censoring conditions. Expressions for the total unit time on test, $T$, are given in Table 4-6 of Chapter 4. For example, for type I right-censored data, $T = t_1 + t_2 + \cdots + t_r + (n - r)t^*$, where $t^*$ is the time the test was stopped. In general for arbitrary censoring, $T$ is determined from the difference between (a) the time each item was taken off test and (b) the time it was put on test. That is,

$$T = \sum_{\text{all items}} (t_{\text{off test}} - t_{\text{on test}})$$

The quantity $r$ in Eq. (9.8) is always the number of recorded failures during the life test. Worked-out examples on the generation of ML estimates for the exponential parameter, $\lambda$, or $\theta$ can be found in Chapter 4, §4.7.1.

9.1.2 ML Estimates of Normal Parameters, $\mu$ and $\sigma^2$
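For complete data sets, the likelihood estimates for $\mu$ and $\sigma$ are well known and easy to obtain:

$$\hat\mu = \bar{x} \quad \text{and} \quad \hat\sigma^2 = \frac{(n - 1)s^2}{n} \quad \text{(a biased estimator of } \sigma^2\text{)}$$

As discussed in §4.2.2, it is not possible to derive closed-form expressions for evaluating $\hat\mu$ and $\hat\sigma$ in the presence of censored data. We now present general expressions for evaluating $L(\mu, \sigma)$, the normal likelihood function, for a multiply right-censored data set. In this case the adaptation of Eq. (9.4) for normal data leads to

$$\ln L(\mu, \sigma) = \sum_{i=1}^{n} [\delta_i \ln f(t_i; \mu, \sigma) + (1 - \delta_i) \ln R(t_i; \mu, \sigma)] \tag{9.10}$$

where

$$f(t_i; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{t_i - \mu}{\sigma}\right)^2\right] \quad \text{and} \quad R(t_i) = 1 - \Phi\left(\frac{t_i - \mu}{\sigma}\right)$$

It is easier to work with the standard normal variate, $z_i$. Substituting $z_i$ for $(t_i - \mu)/\sigma$ and $\ln(1 - \Phi(z_i))$ for $\ln R(t_i)$, we may express the log-likelihood terms as

$$\ln f(t_i; \mu, \sigma) = \text{constant} - \ln(\sigma) - \frac{1}{2}z_i^2 \quad \text{and} \quad \ln R(t_i; \mu, \sigma) = \ln(1 - \Phi(z_i))$$

Ignoring the constants, since they are not involved in the optimization, leads to the modified log-likelihood expression:

$$\ln L(\mu, \sigma) = -r\ln\sigma - \frac{1}{2}\sum_{i=1}^{n}\delta_i z_i^2 + \sum_{i=1}^{n}(1 - \delta_i)\ln R(z_i) \tag{9.11}$$

The first partial derivatives of $\ln L(\mu, \sigma)$ may be expressed as follows:

$$\frac{\partial\ln L(\mu, \sigma)}{\partial\mu} = 0 - \sum_{i=1}^{n}\delta_i z_i\frac{\partial z_i}{\partial\mu} + \sum_{i=1}^{n}(1 - \delta_i)\frac{\dfrac{dR(z_i)}{dz_i}\dfrac{\partial z_i}{\partial\mu}}{R(z_i)} = \frac{1}{\sigma}\left[\sum_{i=1}^{n}\delta_i z_i + \sum_{i=1}^{n}(1 - \delta_i)\lambda(z_i)\right] \tag{9.12}$$

$$\frac{\partial\ln L(\mu, \sigma)}{\partial\sigma} = -\frac{r}{\sigma} - \sum_{i=1}^{n}\delta_i z_i\frac{\partial z_i}{\partial\sigma} + \sum_{i=1}^{n}(1 - \delta_i)\frac{\dfrac{dR(z_i)}{dz_i}\dfrac{\partial z_i}{\partial\sigma}}{R(z_i)} = \frac{1}{\sigma}\left[-r + \sum_{i=1}^{n}\delta_i z_i^2 + \sum_{i=1}^{n}(1 - \delta_i)z_i\lambda(z_i)\right] \tag{9.13}$$

Note that we arrived at the final form shown by Eqs. (9.12) and (9.13) by substituting $\partial z_i/\partial\mu = -1/\sigma$, $dR(z_i)/dz_i = -\phi(z_i)$, $\lambda(z_i) = \phi(z_i)/R(z_i)$, and $\partial z_i/\partial\sigma = -z_i/\sigma$.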
At the MLE point, the bracketed expression [...] in Eqs. (9.12) and (9.13) must vanish:

$$\sum_{i=1}^{n}\delta_i z_i + \sum_{i=1}^{n}(1 - \delta_i)\lambda(z_i) = 0 \tag{9.14}$$

$$-r + \sum_{i=1}^{n}\delta_i z_i^2 + \sum_{i=1}^{n}(1 - \delta_i)z_i\lambda(z_i) = 0 \tag{9.15}$$

The details describing the use of an efficient algorithm for solving the system of equations given by Eqs. (9.14) and (9.15) are presented in the appendix to this chapter.

9.1.3 Worked-Out Example
Example 9-1: Easy ML estimation of normal parameters, $\mu$ and $\sigma$
In this section we attempt to directly maximize Eq. (9.11) with the use of Excel's powerful built-in nonlinear search code, which is part of its Tools > Solver routine. We begin with an example discussed earlier for which a normal plot of the data is presented in Figure 9-1. We use the rank regression estimates to initialize the search. For a well-behaved region about the ML point, we expect this easy ML search procedure to work quite well, although it is always possible that the gradient search procedure may stall just short of the optimum. We discuss this at the end of this section.

FIGURE 9-1 Normal plot of data.

A sample of $n = 7$ component parts was put on test to study the time to wearout, which was believed to follow a normal distribution (see Table 9-1). Due to problems with two of the test stands, two components were removed from the test at 145 and 160 hundred-hr, respectively. An inverse rank regression fit of the data was made, with the inverse fit described by Figure 9-1. The fitted relationship is

$$T = 19.61\,Z_{score} + 161.2$$

Use this information to generate rank regression estimates of $\mu$ and $\sigma$, which, in turn, can be used to initialize an Excel Tools > Solver procedure to develop ML estimates for $\mu$ and $\sigma$.
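The same one-step search can be sketched outside a spreadsheet. The code below maximizes Eq. (9.11) for the Table 9-1 data, starting from the rank regression estimates. It is not the Excel GRG2 routine; a simple derivative-free compass search is swapped in as an assumption to keep the sketch dependency-free, and all function names are illustrative.

```python
import math

# Table 9-1 data: times in hundred-hr; censor flag 1 = failure, 0 = right-censored.
times = [137, 145, 150, 152, 159, 160, 184]
delta = [1, 0, 1, 1, 1, 0, 1]

def log_lik(mu, sigma):
    """Eq. (9.11): normal log-likelihood with the additive constant dropped."""
    total = 0.0
    for t, d in zip(times, delta):
        z = (t - mu) / sigma
        if d == 1:
            total += -math.log(sigma) - 0.5 * z * z
        else:
            # ln R(z), with R(z) = 1 - Phi(z) computed via erfc for accuracy.
            total += math.log(0.5 * math.erfc(z / math.sqrt(2.0)))
    return total

def compass_search(f, x, step=1.0, tol=1e-7):
    """Maximize f by trial moves of +/- step along each coordinate.

    The step is halved whenever no trial move improves f.
    """
    best = f(*x)
    while step > tol:
        improved = False
        for i in range(len(x)):
            for s in (step, -step):
                trial = list(x)
                trial[i] += s
                if trial[1] <= 0:  # second coordinate is sigma; keep it positive
                    continue
                value = f(*trial)
                if value > best:
                    x, best, improved = trial, value, True
        if not improved:
            step /= 2.0
    return x, best

# Initialize with the rank regression estimates (mu = 161.2, sigma = 19.61).
(mu_hat, sigma_hat), ln_l = compass_search(log_lik, [161.2, 19.61])
print(round(mu_hat, 2), round(sigma_hat, 3), round(ln_l, 3))
```

The search should land close to the estimates reported in Table 9-2. Like any local search, it offers no guarantee on a poorly behaved likelihood surface, which is the caveat raised in the text.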
By Eq. (4.6), the rank regression estimates are $\hat\sigma = 19.61$ and $\hat\mu = 161.2$. They will be used to initialize a nonlinear search procedure to identify the ML estimators of $\mu$ and $\sigma$. Wasserman (2001) demonstrates how the Excel Tools > Solver procedure can be used to easily obtain the ML estimates in a single step. Excel's Tools > Solver procedure incorporates a powerful, generalized, reduced-gradient (GRG2) nonlinear optimization code, which, in the majority of instances, can be used to develop ML estimates of $\mu$ and $\sigma$ in a single step. Its use is illustrated in Figure 9-2. Excel Tools > Solver was used to develop one-step, easy ML results. The results, which Table 9-2 shows, are in agreement with those provided by Minitab or WinSmith software.

TABLE 9-1 Normal Data Set
T         137   145   150   152   159   160   184
Censor    1     0     1     1     1     0     1

FIGURE 9-2 Use of Microsoft Excel® Tools > Solver to directly identify the ML parameter estimates of normal parameters, $\mu$ and $\sigma$.

TABLE 9-2 Easy ML Results
ln L        −17.222
$\mu$       159.83
$\sigma$    15.470

Is it always possible to identify the ML estimates of $\mu$ and $\sigma$ in a single step? The answer is "No," but in many cases it should work. To understand why, the reader should refer to Figure 9-6, wherein contours of the unconstrained log-likelihood, $\ln L(\mu, \sigma)$, are plotted over a range of confidence levels. The log-likelihood space appears to be fairly well behaved. In the hill region (near the ML point) the surface seems to have a concave appearance. Accordingly, gradient search procedures, which are based on finding a direction of improvement, should work particularly well if the search is initialized with the rank regression estimates of $\mu$ and $\sigma$. Like any gradient search procedure, however, the risk exists that convergence might be too slow, which could result in early termination of the search. In such cases formal ML algorithms are needed. The use of a formal procedure is illustrated in the appendix to this chapter.

9.1.4 Weibull Distribution: ML Estimation of $\beta$ and $\theta$
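The log-likelihood expression for the two-parameter Weibull distribution is obtained with the use of Eq. (9.4), substituting

$$f(t) = \frac{\beta}{\theta}\left(\frac{t}{\theta}\right)^{\beta-1}e^{-(t/\theta)^\beta} \quad \text{and} \quad R(t) = e^{-(t/\theta)^\beta}$$

the density and survival functions, respectively (see Table 3-7):

$$\ln L(\theta, \beta) = \sum_{i=1}^{n}\delta_i\left[\ln\beta - \beta\ln\theta + (\beta - 1)\ln t_i - \left(\frac{t_i}{\theta}\right)^\beta\right] - \sum_{i=1}^{n}(1 - \delta_i)\left(\frac{t_i}{\theta}\right)^\beta = \sum_{i=1}^{n}\delta_i[\ln\beta - \beta\ln\theta + (\beta - 1)\ln t_i] - \sum_{i=1}^{n}\left(\frac{t_i}{\theta}\right)^\beta \tag{9.16}$$

To find the ML estimates, equate the partial derivatives to zero:

$$\frac{\partial\ln L(\theta, \beta)}{\partial\theta} = -\frac{r\beta}{\theta} + \frac{\beta}{\theta}\sum_{i=1}^{n}\left(\frac{t_i}{\theta}\right)^\beta = 0 \tag{9.17}$$

$$\frac{\partial\ln L(\theta, \beta)}{\partial\beta} = \frac{r}{\beta} + \sum_{i=1}^{n}\delta_i\ln\left(\frac{t_i}{\theta}\right) - \sum_{i=1}^{n}\left(\frac{t_i}{\theta}\right)^\beta\ln\left(\frac{t_i}{\theta}\right) = 0 \tag{9.18}$$

From Eq. (9.17),

$$\sum_{i=1}^{n}\left(\frac{t_i}{\theta}\right)^\beta = r \tag{9.19}$$

yielding

$$\hat\theta = \left(\frac{\sum_{\forall i} t_i^\beta}{r}\right)^{1/\beta} \tag{9.20}$$

We now show an implicit expression for evaluating $\hat\beta$. First Eq. (9.18) is rearranged:

$$\frac{r}{\beta} + \sum_{i=1}^{n}\delta_i\ln t_i - r\ln\theta - \sum_{i=1}^{n}\left(\frac{t_i}{\theta}\right)^\beta\ln t_i + \ln\theta\sum_{i=1}^{n}\left(\frac{t_i}{\theta}\right)^\beta = 0$$

Collecting terms in $\ln\theta$ yields

$$\frac{r}{\beta} + \sum_{i=1}^{n}\delta_i\ln t_i - \sum_{i=1}^{n}\left(\frac{t_i}{\theta}\right)^\beta\ln t_i + \ln\theta\left[\sum_{\forall i}\left(\frac{t_i}{\theta}\right)^\beta - r\right] = 0 \tag{9.21}$$

According to Eq. (9.19), the bracketed expression in Eq. (9.21) vanishes. Then, by substituting Eq. (4.20), a $\theta$-free expression for estimating $\beta$ is obtained:

$$\frac{1}{\beta} + \sum_{i=1}^{n}\delta_i\frac{\ln t_i}{r} - \frac{1}{\sum_{i=1}^{n} t_i^\beta}\sum_{i=1}^{n} t_i^\beta\ln t_i = 0 \tag{9.22}$$

Equation (9.22) must be solved numerically using a numerical analytic method such as Newton–Raphson. Following Nelson (1982) and others, we rearrange Eq. (9.22) to the form expressed by Eq. (9.23), to ensure a rapid convergence to the ML estimate of $\beta$. In this form we treat the left-hand side as a target for the right-hand side, which can be seen to be a monotonic increasing function of $\hat\beta$:

$$\underbrace{\sum_{i=1}^{n}\delta_i\frac{\ln t_i}{r}}_{\text{target}} = \underbrace{\frac{1}{\sum_{\forall i} t_i^\beta}\sum_{\forall i} t_i^\beta\ln t_i - \frac{1}{\beta}}_{\text{Tools > Solver cell}} \tag{9.23}$$

$\hat\theta$ may then be estimated with the use of Eq. (4.20).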
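Because the right-hand side of Eq. (9.23) increases monotonically in $\beta$, a bracketing method is enough to hit the target. The sketch below uses bisection instead of the Newton–Raphson or Solver routes mentioned in the text; that substitution, the function names, the bracket limits, and the data set are all assumptions made for illustration (the Table 4-2 data are not reproduced here).

```python
import math

def weibull_mle(times, failed, lo=1e-3, hi=100.0, iters=200):
    """Solve Eq. (9.23) for beta by bisection, then use Eq. (9.20) for theta.

    times  : exposure time of every unit, failed or censored
    failed : 1 if the unit failed, 0 if it was right-censored
    """
    r = sum(failed)
    t_max = max(times)
    # Left-hand side of Eq. (9.23): a constant for the data set.
    target = sum(d * math.log(t) for t, d in zip(times, failed)) / r

    def rhs(beta):
        # Powers are scaled by t_max to avoid overflow; the ratio is unchanged.
        w = [(t / t_max) ** beta for t in times]
        return sum(wi * math.log(t) for wi, t in zip(w, times)) / sum(w) - 1.0 / beta

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if rhs(mid) < target:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    theta = t_max * (sum((t / t_max) ** beta for t in times) / r) ** (1.0 / beta)
    return beta, theta

# Hypothetical multiply right-censored data set.
times = [30.0, 45.0, 60.0, 80.0, 100.0, 120.0, 150.0]
failed = [1, 1, 0, 1, 1, 0, 1]

beta_hat, theta_hat = weibull_mle(times, failed)
# Check of Eq. (9.19): the sum of (t_i / theta)^beta should equal r.
check_919 = sum((t / theta_hat) ** beta_hat for t in times)
print(beta_hat, theta_hat, check_919)
```

Bisection is slower than Newton–Raphson but needs no derivative and cannot overshoot, which is a reasonable trade when the function is known to be monotonic.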
Worked-out Example 9-2 (cont.): Use of ML equations for estimating properties of censored Weibull data.
The multiply censored data set given by Table 4-2 is re-examined at this time. The data set is also presented in the Excel spreadsheet that appears in Figure 9-3, which illustrates the use of the Excel Tools > Solver procedure for finding the solution to Eq. (9.23). The nonlinear search procedure is initialized using the inverse rank regression estimate of $\beta$, $\hat\beta = 1.508$, whose final value, the ML estimate, is displayed in cell E1. The left-hand side of Eq. (9.23), the target, is determined to be the constant, 4.431, for this data set. It is displayed in cell I10. Tools > Solver is set up to vary the value of $\hat\beta$ until the value of cell I8 is equal to the target. The ML estimate of $\theta$ is displayed in cell I13 with the use of Eq. (4.20). Convergence is rapid. The ML estimates are in agreement with those generated by Minitab in Figure 4.12:

$$\hat\beta = 1.7849 \qquad \hat\theta = \left(\frac{\sum_{\forall i} t_i^{\hat\beta}}{r}\right)^{1/\hat\beta} = 132.805\text{k rev}$$

FIGURE 9-3 Use of Microsoft Excel® Tools > Solver procedure to identify ML estimates of $\theta$ and $\beta$.

9.1.5 ML Estimation of Three-Parameter Weibull Distribution
The log-likelihood function, denoted $\ln L(\delta, \theta, \beta)$, is obtained by substituting the quantity $(t_i - \delta)$ for each occurrence of $t_i$ in Eq. (9.16):

$$\ln L(\delta, \theta, \beta) = r\ln\beta - r\beta\ln\theta + \sum_{i=1}^{n}\delta_i[(\beta - 1)\ln(t_i - \delta)] - \sum_{i=1}^{n}\left(\frac{t_i - \delta}{\theta}\right)^\beta \tag{9.24}$$

The difficulties of obtaining reliable ML parameter estimates of the three-parameter Weibull distribution are well known and summarized by Lawless (1982, pp. 191–192):
1. When $\beta < 1$, the likelihood function is unbounded as $\delta \to t_1$.
2. The solution to the likelihood equations may yield either no solution or two solutions. In the former case set $\hat\delta = t_1$ and $\hat\beta = 1.0$. For the latter case one of the solutions will be a saddle point. In this case, but only for $\beta > 1$, the solution to the ML equations must be compared to the restricted points, $\hat\delta = t_1$ and $\hat\beta = 1.0$, to determine which is the true ML point. For $\beta < 1$, use $\hat\delta = t_1$ and $\hat\beta = 1.0$.
3. The likelihood function is quite flat in the region about $\hat\delta = t_1$ when $\beta$ exceeds 1.
4. $\hat\delta$ must be nonnegative, a fact that must be guarded against in the analysis.

Accordingly, the preferred method for likelihood estimation is to simply choose a range of $\delta$-values that do not exceed the first-order statistic and then find conditional ML estimates of $\theta$ and $\beta$. The likelihood expressions resulting from the solutions to the first two likelihood equations on $\theta$ and $\beta$ are derived using $\partial\ln L(\delta, \theta, \beta)/\partial\theta = 0$ and $\partial\ln L(\delta, \theta, \beta)/\partial\beta = 0$, resulting in a simple modification of Eqs. (9.17) to (9.22) formed by substituting $t_i - \delta$ for each occurrence of $t_i$. The log-likelihood expression must then be compared over the range of $\delta$-values to determine the optimum setting.
FIGURE 9-4 Evaluation of log-likelihood function, $\ln L(\hat\theta, \hat\beta \mid \delta)$, for a range of $\delta$-values that do not exceed $t_1 = 22{,}000$ pieces. The use of Microsoft Excel Tools > Solver is illustrated. A plot of $\ln L$ versus $\delta$ is also displayed.

Worked-out Example 9-3: Develop ML estimates of Weibull parameters $\theta$, $\beta$, and $\delta$ for the data set in §4.8.
Note: Rank regression estimates of $\theta$, $\beta$, and $\delta$ were worked out in Figure 4-19. The use of Excel® is illustrated in Figure 9-4, which shows a plot of the conditional log-likelihood, $\ln L(\hat\theta, \hat\beta \mid \delta)$, for a range of $\delta$-values that do not exceed $t_1 = 22{,}000$ pieces. Because $\beta < 1$, the log-likelihood is expected to be unbounded around $t = t_1 = 22{,}000$ pieces, which is the case. In this case no exact solution exists, and one should simply choose a $\delta$-value close to 22,000. The corresponding $\hat\theta$- and $\hat\beta$-values are shown for various choices for $\delta$, along with details on the construction of appropriate Excel array formulas and the use of Tools > Solver.

9.1.6 Other Modified Estimation Procedures for the Three-Parameter Weibull Distribution
Other modified procedures for estimation not subject to the difficulties alluded to earlier have been developed. Two modified approaches have been suggested and are summarized here.
Modified Maximum Likelihood Estimation (Lehtinen, 1979)
Under this scheme we continue to develop ML estimates on $\theta$ and $\beta$; however, the estimate of the threshold parameter, $\delta$, is based on the expected value of the first (order statistic) failure, $E[T_1] = t_1$. We estimate $\delta$ using

$$\delta = t_1 - \frac{\theta}{n^{1/\beta}}\,\Gamma\left(1 + \frac{1}{\beta}\right) \tag{9.25}$$

For the worked-out example shown by Figure 9-4, we follow the suggestion by Lehtinen (1979) and repetitively solve for the following.
1. Guess $\delta$ and $\beta$. Find

$$\hat\theta(\beta, \delta) = \left[\sum_{\forall i}(t_i - \delta)^\beta / r\right]^{1/\beta}$$

2. Use Tools > Solver, as per Figure 9-4, and maximize $\ln L(\delta, \theta, \beta)$ by varying only $\beta$, subject to

$$\hat\theta(\beta, \delta) = \left[\sum_{\forall i}(t_i - \delta)^\beta / r\right]^{1/\beta}$$

3. Given $\beta$, search for a value of $\delta$ satisfying

$$\delta = t_1 - \frac{\theta}{n^{1/\beta}}\,\Gamma(1 + 1/\beta)$$

[see Eq. (9.25)] and

$$\hat\theta(\beta, \delta) = \left[\sum_{\forall i}(t_i - \delta)^\beta / r\right]^{1/\beta}$$

4. If convergence criteria based on either change in $\delta$ or $\ln L$ are met, stop; otherwise, go back to step 2.

Figure 9-5 illustrates the use of the Microsoft Excel spreadsheet. Upon termination, the procedure yields the following estimates of the Weibull parameters: $\hat\delta = 16{,}964$, $\hat\theta = 30{,}484$, $\hat\beta = 1.195$.
FIGURE 9-5 Use of Microsoft Excel Tools > Solver to identify modified ML estimators of $\delta$, $\theta$, and $\beta$.

Method of Modified Moment Estimation (Cohen et al., 1984)
This approach constitutes a variant of the use of traditional method-of-moments relationships for estimating the Weibull parameters. In this case we replace the third moment with the use of Eq. (9.25), the expected value of the first-order statistic. Refer to Cohen et al. (1984) for more information on the use of this procedure for developing estimates of $\delta$, $\theta$, and $\beta$.

9.2 ML-BASED APPROACHES FOR CONFIDENCE INTERVAL ESTIMATION
The development of exact confidence intervals on the exponential hazard parameter, $\lambda$, or other exponential reliability metrics is straightforward, as the sampling distribution of $\hat\theta = \hat\lambda^{-1}$ can easily be shown to follow a gamma distribution. This is not the case for the other distributions on which we focus our attention. For EVD, Weibull, (log-) normal, and other location-scale distributions that we discuss, the complexity required to obtain point estimates makes it impossible to obtain exact expressions for their sampling distributions. Therefore, we resort to the use of asymptotically correct, normal approximations. We survey both sets of procedures next.

9.2.1 Exponential Confidence Intervals
In order to derive a confidence interval expression about $\lambda$, $R(t)$, $t_R$, or any other exponential reliability metric, the distribution of

$$T = t_1 + t_2 + \cdots + t_r + (n - r)t_r$$

must be determined. Without loss of generality, we will assume that the data set is singly right-censored. It turns out that this result is true for all multiply right-censored data sets. In this case it is useful to work with an alternate expression for $T$. The alternate expression is based on the knowledge that $T$ is the sum of
1. $nt_1$ = Contribution of $n$ items on test during $[0, t_1)$.
2. $(n - 1)(t_2 - t_1)$ = Contribution of $(n - 1)$ items on test during $[t_1, t_2)$.
3. $(n - 2)(t_3 - t_2)$ = Contribution of $(n - 2)$ items on test during $[t_2, t_3)$.
...
r. $(n - r + 1)(t_r - t_{r-1})$ = Contribution of $(n - r + 1)$ items on test during $[t_{r-1}, t_r)$.

Therefore,

$$T = t_1 + t_2 + \cdots + t_r + (n - r)t_r = nt_1 + (n - 1)(t_2 - t_1) + (n - 2)(t_3 - t_2) + \cdots + (n - r + 1)(t_r - t_{r-1})$$
$$\equiv nY_1 + (n - 1)Y_2 + (n - 2)Y_3 + \cdots + (n - r + 1)Y_r \equiv Q_1 + Q_2 + \cdots + Q_r \tag{9.26}$$

In Eq. (9.26), we introduce the notation $Y_i = (t_i - t_{i-1})$, $i = 1, 2, \ldots, r$, the time between the $(i - 1)$th and $i$th recorded failures (with $t_0 = 0$), and $Q_i = (n - i + 1)Y_i$, $i = 1, 2, \ldots, r$. Each of the $Q_i$ random variables is shown to follow an exponential distribution with parameter $\lambda$. Thus, $T$ is the sum of $r$ independent exponential ($\lambda$) random variables, which, by definition, follows a gamma$(r, \lambda)$ distribution. The percentiles of the gamma distribution are expressed in terms of chi-square percentiles, for which tables are widely available. The details behind this proof are presented in Appendix 9B to this chapter. Thus, $T \sim \text{gamma}(r, \lambda)$ and $2\lambda T \sim \chi^2_{2r}$. The properties of the gamma distribution are summarized below.

Gamma Distribution

$$f(t) = \frac{1}{\Gamma(r)}\lambda^r t^{r-1}e^{-\lambda t} \quad \text{for } 0 \le t < \infty;\ \lambda > 0 \tag{9.27}$$

where $\Gamma(r)$ = gamma function $= \int_0^\infty x^{r-1}e^{-x}\,dx = (r - 1)!$, where $r$ is an integer.

$$E(T) = \frac{r}{\lambda} \qquad \text{Var}(T) = \frac{r}{\lambda^2}$$
The chi-square ($\chi^2$) distribution is just a special case of the gamma distribution with $\nu = 2r$ and $\lambda = 1/2$. The properties of the chi-square distribution are as follows:

Chi-Square Distribution

$$f(x) = \frac{1}{\Gamma(\nu/2)}\left(\frac{1}{2}\right)^{\nu/2} x^{(\nu/2 - 1)}\,e^{-x/2} \qquad \text{for } 0 \le x < \infty \tag{9.28}$$

$$E(\chi^2) = \nu \qquad \mathrm{Var}(\chi^2) = 2\nu$$
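It is straightforward to show that $2\lambda T$ is distributed gamma$(r, \tfrac{1}{2})$ or, equivalently, $\chi^2$-distributed with $2r$ degrees of freedom. In what follows, $\chi^2_{2r,p}$ denotes the value that a $\chi^2_{2r}$ random variable exceeds with probability $p$. To arrive at an expression for the confidence intervals about $\lambda$ or $\theta = 1/\lambda$, we begin with $P(\chi^2_{2r,1-\alpha/2} \le \chi^2 \le \chi^2_{2r,\alpha/2}) = 1 - \alpha$, substitute $2\lambda T$ for $\chi^2$, and rearrange terms:

$$\begin{aligned}
P\left(\chi^2_{2r,1-\alpha/2} \le \chi^2 \le \chi^2_{2r,\alpha/2}\right)
 &= P\left(\chi^2_{2r,1-\alpha/2} \le 2\lambda T \le \chi^2_{2r,\alpha/2}\right) \\
 &= P\Bigg(\underbrace{\frac{\chi^2_{2r,1-\alpha/2}}{2T}}_{\lambda_L} \le \lambda \le \underbrace{\frac{\chi^2_{2r,\alpha/2}}{2T}}_{\lambda_U}\Bigg) = 1 - \alpha
\end{aligned} \tag{9.29}$$

Alternatively, this expression may be inverted to provide $(1-\alpha)$ confidence intervals on the mean time-to-failure, $\theta = 1/\lambda$. That is,

$$P\left[\frac{\chi^2_{2r,1-\alpha/2}}{2T} \le \frac{1}{\theta} \le \frac{\chi^2_{2r,\alpha/2}}{2T}\right] = 1 - \alpha \tag{9.30}$$

which, upon inversion, yields the desired expression:

$$P\Bigg[\underbrace{\frac{2T}{\chi^2_{2r,\alpha/2}}}_{\theta_L} \le \theta \le \underbrace{\frac{2T}{\chi^2_{2r,1-\alpha/2}}}_{\theta_U}\Bigg] = 1 - \alpha \tag{9.31}$$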
Note: The results shown are exact for a type II singly right-censored data set. However, for a time-censored test, the number of recorded failures, $r$, at the time the test is stopped is random. Furthermore, the time at which the test is stopped, $t^*$, can lie anywhere in the interval $(t_r, t_{r+1})$. As such, the distribution of $T$ is intractable. Several adjustments for Eqs. (9.30) and (9.31) have been proposed to approximate the true distribution of $T$. The most common adjustment is to change the number of degrees of freedom to $2r + 2$ in Eq. (9.31).
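The interval of Eq. (9.31) can be sketched numerically as follows. This is only an illustrative sketch for the type II censored case: the function names are hypothetical, the inputs in the usage lines are made-up values, and the chi-square percentile is obtained by bisection on the closed-form CDF that holds when the degrees of freedom are even ($2r$), so that only the Python standard library is assumed.

```python
import math

def chi2_cdf_even(x, r):
    """CDF of a chi-square variable with 2r degrees of freedom (r a positive integer).

    For even degrees of freedom the CDF has the closed form
    1 - exp(-x/2) * sum_{k=0}^{r-1} (x/2)^k / k!.
    """
    if x <= 0.0:
        return 0.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, r):
        term *= half / k
        total += term
    return 1.0 - math.exp(-half) * total

def chi2_upper_percentile(p, r, tol=1e-10):
    """Value exceeded with probability p by a chi-square variable with 2r d.f."""
    target = 1.0 - p
    lo, hi = 0.0, 1.0
    while chi2_cdf_even(hi, r) < target:   # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):    # plain bisection
        mid = 0.5 * (lo + hi)
        if chi2_cdf_even(mid, r) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def exponential_mttf_interval(total_time, r, alpha):
    """Two-sided (1 - alpha) limits on theta = 1/lambda, following Eq. (9.31)."""
    theta_lower = 2.0 * total_time / chi2_upper_percentile(alpha / 2.0, r)
    theta_upper = 2.0 * total_time / chi2_upper_percentile(1.0 - alpha / 2.0, r)
    return theta_lower, theta_upper

# Hypothetical inputs: T = 1000 unit-hours accumulated, r = 5 failures, 90% confidence.
theta_l, theta_u = exponential_mttf_interval(1000.0, 5, 0.10)
```

The corresponding limits on $\lambda$ in Eq. (9.29) are simply the reciprocals, $\lambda_L = 1/\theta_U$ and $\lambda_U = 1/\theta_L$.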
9.2.2 Asymptotic (Large-Sample) Confidence Intervals
Due to the numerical complexities of ML estimation, it is often difficult to derive closed-form expressions for the parameter estimates. In turn, it is not possible to model the underlying distribution of $\hat{\theta}$, or even its standard error, $\hat{\sigma}_{\hat{\theta}}$, for that matter. However, by the central limit theorem, the sum of the log-likelihood terms will be approximately normally distributed for a sufficiently large sample size. This allows for the development of asymptotically (large-sample) correct expressions for the standard error, which, in turn, can be used to develop approximate standard normal confidence intervals. We discuss two asymptotic forms:
1. Intervals based on the Fisher (information) matrix
2. Intervals based on the use of likelihood contours or likelihood ratios
The details for justification of the constitutive relations presented in this section are beyond the scope of this textbook. For further information, consult Leemis (1995, §7.3–4), which provides a good, readable introduction to likelihood theory.
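Fisher-Matrix Intervals
Without loss of generality, we consider a location-scale distribution with location parameter $\gamma$ and a nonnegative scale parameter, $\beta$. The asymptotic properties of the Fisher information matrix, $I$, are used to develop asymptotically correct confidence intervals on $\gamma$ and $\beta$:

$$I = \begin{bmatrix}
-E\left(\dfrac{\partial^2 \ln L}{\partial \gamma^2}\right) & -E\left(\dfrac{\partial^2 \ln L}{\partial \gamma\,\partial \beta}\right) \\[2ex]
-E\left(\dfrac{\partial^2 \ln L}{\partial \gamma\,\partial \beta}\right) & -E\left(\dfrac{\partial^2 \ln L}{\partial \beta^2}\right)
\end{bmatrix} \tag{9.32}$$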
Direct evaluation of the information matrix, $I$, is not possible for the class of two-parameter location-scale distributions that we discuss (the normal and Weibull distributions). Even if it were, the result must be regarded as an approximation in that $\gamma$ and $\beta$ are unknown and so must be replaced by their ML estimates. A well-accepted alternative is to work with the observed Fisher information matrix, $F$. The observed matrix, $F$, is obtained by substituting $\hat{\gamma}$ for $\gamma$ and $\hat{\beta}$ for $\beta$, in lieu of taking the expectation. This is a most probable point estimate:

$$F = \begin{bmatrix}
-\dfrac{\partial^2 \ln L}{\partial \gamma^2}(\hat{\gamma}, \hat{\beta}) & -\dfrac{\partial^2 \ln L}{\partial \gamma\,\partial \beta}(\hat{\gamma}, \hat{\beta}) \\[2ex]
-\dfrac{\partial^2 \ln L}{\partial \gamma\,\partial \beta}(\hat{\gamma}, \hat{\beta}) & -\dfrac{\partial^2 \ln L}{\partial \beta^2}(\hat{\gamma}, \hat{\beta})
\end{bmatrix} \tag{9.33}$$

According to ML theory, under certain conditions of regularity (see Nelson, 1982, p. 546), the product

$$(\hat{\gamma} - \gamma,\ \hat{\beta} - \beta)^{t}\,F\,(\hat{\gamma} - \gamma,\ \hat{\beta} - \beta) \sim \chi^2_2 \tag{9.34}$$

and so $(\hat{\gamma}, \hat{\beta})$ is asymptotically distributed according to a bivariate normal with mean $(\gamma, \beta)$ and variance–covariance matrix given by

$$\mathrm{COV}(\hat{\gamma}, \hat{\beta}) \approx F^{-1} = \begin{bmatrix}
\mathrm{Var}(\hat{\gamma}) & \mathrm{Cov}(\hat{\gamma}, \hat{\beta}) \\
\mathrm{Cov}(\hat{\gamma}, \hat{\beta}) & \mathrm{Var}(\hat{\beta})
\end{bmatrix} \tag{9.35}$$

A $100(1-\alpha)\%$ asymptotically correct, standard normal confidence interval on $\gamma$ can then be constructed as follows:

$$P\Big[\underbrace{\hat{\gamma} - Z_{\alpha/2}\,\hat{\sigma}_{\hat{\gamma}}}_{\gamma_L} \le \gamma \le \underbrace{\hat{\gamma} + Z_{\alpha/2}\,\hat{\sigma}_{\hat{\gamma}}}_{\gamma_U}\Big] \approx 1 - \alpha \tag{9.36}$$
For a positive-valued parameter, such as the Weibull shape parameter, $\beta$, it is generally recommended (e.g., Meeker and Escobar, 1998, and Nelson, 1982) that the standard normal confidence interval be constructed about $\ln \hat{\beta}$ rather than about $\hat{\beta}$ directly. Although accuracy may or may not be improved, the sampling distribution of $\ln \hat{\beta}$ is likely to be more symmetric than that of $\hat{\beta}$, and the use of a log transformation ensures that the lower confidence limit will be positive. (Note: The lower confidence limit of the standard normal confidence interval on $\beta$ can be negative!) In this case

$$Z_{\ln \hat{\beta}} = \frac{\ln \hat{\beta} - \ln \beta}{\hat{\sigma}_{\ln \hat{\beta}}}$$

is approximately a standard normal random variable, and the standard error of $\ln \hat{\beta}$ can be approximated using the delta method, which entails taking the variance of a first-order Taylor-series expansion of $\ln \hat{\beta}$, as

$$\hat{\sigma}_{\ln \hat{\beta}} = \frac{\hat{\sigma}_{\hat{\beta}}}{\hat{\beta}}$$

Thus, the approximate standard normal confidence interval about $\ln \hat{\beta}$ is of the form

$$P\Big[\underbrace{\ln \hat{\beta} - Z_{\alpha/2}\,\hat{\sigma}_{\ln \hat{\beta}}}_{\ln \beta_L} \le \ln \beta \le \underbrace{\ln \hat{\beta} + Z_{\alpha/2}\,\hat{\sigma}_{\ln \hat{\beta}}}_{\ln \beta_U}\Big] \approx 1 - \alpha$$

Taking the exponential transform of all sides leads to

$$P\Big(\underbrace{\hat{\beta}/w}_{\beta_L} \le \beta \le \underbrace{\hat{\beta}\,w}_{\beta_U}\Big) \approx 1 - \alpha \tag{9.37}$$
where $w = \exp(Z_{\alpha/2}\,\hat{\sigma}_{\hat{\beta}}/\hat{\beta})$. Equations (9.33) to (9.37) are used later to develop asymptotically correct standard normal confidence intervals on normal and Weibull parameters.

Likelihood Ratio (LR) Confidence Intervals
Without loss of generality, we describe the use of the likelihood ratio statistic for developing confidence intervals about $\gamma$, a parameter of interest. The likelihood ratio (LR) test statistic provides a formal framework for testing

$$H_0: \gamma = \gamma_0 \quad \text{versus} \quad H_a: \gamma \ne \gamma_0$$

As its name implies, the LR test statistic is a ratio of likelihood functions. However, it is more convenient to work with the log form, which is a difference of log-likelihood expressions (see Lawless, 1982, pp. 173–174, for example). Specifically, we reject $H_0$ at some level of confidence, $C = 1 - \alpha$, if

$$2\left(\ln L^* - \ln L^*(\gamma_0)\right) > \chi^2_{1,\alpha} \tag{9.38}$$

where
$\ln L^*$ = log-likelihood function evaluated at the ML point, $\ln L(\hat{\gamma}, \hat{\beta})$
$\ln L^*(\gamma_0) = \max_{\beta} \ln L(\gamma_0, \beta)$

If we denote $\hat{\beta}(\gamma_0)$ as the solution to $\max_{\beta} \ln L(\gamma_0, \beta)$, then

$$\ln L^*(\gamma_0) = \ln L(\gamma_0, \hat{\beta}(\gamma_0)) \tag{9.39}$$
An asymptotically correct confidence interval or region on $\gamma$ consists of all values of $\gamma_0$ for which the null hypothesis, $\gamma = \gamma_0$, is not rejected at the stated level of confidence, $C = 1 - \alpha$. That is, it consists of all values of $\gamma_0$ satisfying

$$\ln L^*(\gamma_0) \ge \ln L^* - \tfrac{1}{2}\chi^2_{1,\alpha} \tag{9.40}$$
Note: Justifying the use of Eq. (9.40) is possible by constructing a second-order Taylor-series expansion of $\ln L^*(\gamma_0)$ about the ML point as follows:

$$\ln L^*(\gamma_0) = \ln L^* + \underbrace{\left.\frac{\partial \ln L^*(\gamma)}{\partial \gamma}\right|_{\gamma = \hat{\gamma}} (\gamma_0 - \hat{\gamma})}_{\text{first-order term vanishes at } \gamma = \hat{\gamma}} + \frac{1}{2}\left.\frac{\partial^2 \ln L^*(\gamma)}{\partial \gamma^2}\right|_{\gamma = \hat{\gamma}} (\gamma_0 - \hat{\gamma})^2 + \cdots \tag{9.41}$$

First, we note that the first-order term in Eq. (9.41) vanishes (ML point). Under certain mild regularity conditions, the second-order term is asymptotically chi-square–distributed (see Nelson, 1982, pp. 546–550). Specifically, by Eqs. (9.33) to (9.35), the inverse of a $1 \times 1$ observed Fisher information matrix on $\hat{\gamma}$ is just its asymptotic variance:

$$\left[-\left.\frac{\partial^2 \ln L^*(\gamma)}{\partial \gamma^2}\right|_{\gamma = \hat{\gamma}}\right]^{-1} \approx \hat{\sigma}^2_{\hat{\gamma}}$$

Therefore,

$$\frac{1}{2}\left.\frac{\partial^2 \ln L^*(\gamma)}{\partial \gamma^2}\right|_{\gamma = \hat{\gamma}} (\gamma_0 - \hat{\gamma})^2 = -\frac{1}{2}\left(\frac{\gamma_0 - \hat{\gamma}}{\hat{\sigma}_{\hat{\gamma}}}\right)^2 = -\frac{1}{2}Z^2 = -\frac{1}{2}\chi^2_1$$

LR confidence intervals can be graphically identified with the use of likelihood contours, which consist of all values of $\gamma$ and $\beta$ for which $\ln L(\gamma, \beta)$ is a constant. Specifically, the contours we construct are solutions to

$$\ln L(\gamma, \beta) = \ln L^* - \tfrac{1}{2}\chi^2_{1,\alpha} \tag{9.42}$$
Solutions to $\ln L^*(\gamma_0) = \ln L^* - \tfrac{1}{2}\chi^2_{1,\alpha}$ will lie on these contours. For the Weibull and normal distributions, the confidence intervals can be graphically estimated by drawing lines that are both perpendicular to the coordinate axis of interest and tangent to the likelihood contour. The use of likelihood contours for graphical estimation of LR confidence intervals is illustrated here for both the Weibull and normal distributions (see Figures 9-6 and 9-11).

LR methods can also be used to derive confidence intervals about reliability metrics such as $t_R$. Without loss of generality, assume that $t_R = g(\gamma, \beta)$ can be rewritten as $\gamma = h(t_R, \beta)$, so that the log-likelihood can be expressed in terms of just $t_R$ and $\beta$ as $\ln L(t_R, \beta)$. Accordingly, a $(1-\alpha)$ asymptotically correct LR confidence interval on $t_R$ will consist of all values of $t_R$ that satisfy

$$\ln L^*(t_R) \ge \ln L^* - \tfrac{1}{2}\chi^2_{1,\alpha} \tag{9.43}$$

where

$$\ln L^*(t_R) = \max_{\beta} \ln L(t_R, \beta) = \ln L(t_R, \hat{\beta}(t_R)) \tag{9.44}$$

and $\hat{\beta}(t_R)$ is the solution to $\partial \ln L(t_R, \beta)/\partial \beta = 0$. The use of Eqs. (9.40) to (9.44) is illustrated with worked-out examples for both the normal and Weibull distributions.
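To make the criterion of Eq. (9.40) concrete, the sketch below applies it to the simplest possible case, the one-parameter exponential model, where there is no nuisance parameter and $\ln L^*(\theta_0)$ is just $\ln L(\theta_0) = -r \ln \theta_0 - T/\theta_0$. It is offered only as an illustration under these assumptions: the function names are hypothetical, the inputs are made-up values, and the two limits are located by simple bisection on either side of the ML point.

```python
import math
from statistics import NormalDist

def chi2_1_upper(alpha):
    """Upper-alpha percentile of chi-square with 1 d.f. (square of a normal quantile)."""
    return NormalDist().inv_cdf(1.0 - alpha / 2.0) ** 2

def exp_loglik(theta, total_time, r):
    """Exponential log-likelihood in terms of the mean theta, censored data."""
    return -r * math.log(theta) - total_time / theta

def lr_interval_exponential(total_time, r, alpha, tol=1e-9):
    """All theta_0 with ln L(theta_0) >= ln L* - chi2_{1,alpha}/2, as in Eq. (9.40)."""
    theta_hat = total_time / r
    cutoff = exp_loglik(theta_hat, total_time, r) - 0.5 * chi2_1_upper(alpha)

    def gap(theta):
        # Non-negative for values of theta inside the LR interval.
        return exp_loglik(theta, total_time, r) - cutoff

    def bisect(inside, outside):
        while abs(outside - inside) > tol * theta_hat:
            mid = 0.5 * (inside + outside)
            if gap(mid) >= 0.0:
                inside = mid
            else:
                outside = mid
        return 0.5 * (inside + outside)

    low_out = theta_hat
    while gap(low_out) >= 0.0:    # walk left until outside the interval
        low_out /= 2.0
    high_out = theta_hat
    while gap(high_out) >= 0.0:   # walk right until outside the interval
        high_out *= 2.0
    return bisect(theta_hat, low_out), bisect(theta_hat, high_out)

# Hypothetical inputs: T = 1000 unit-hours, r = 5 failures, C = 90%.
theta_l, theta_u = lr_interval_exponential(1000.0, 5, 0.10)
```

With two parameters the only change is that `exp_loglik` would be replaced by the profile quantity $\ln L(\gamma_0, \hat{\beta}(\gamma_0))$ of Eq. (9.39), which requires an inner maximization over the nuisance parameter at every trial value.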
9.2.3 Confidence Intervals on Normal Metrics
For complete data sets, standard techniques for devising confidence intervals on $\mu$ and $\sigma$ are well known; they are reviewed in §4.2.2. This is not the case with censored (incomplete) data sets, for which exact methods are not available. Approximate methods based on Monte Carlo simulation and other sampling approximations such as the use of F-approximations have been developed, but they are tedious to use and generally require dedicated computer packages (see Chapter 4, Appendix 4A). A rather simple approximation of the confidence interval limits on $\mu$ that works fairly well even with small sample sizes has been developed by Wolynetz (see Lawless, 1982, p. 232). The approximation results in confidence interval expressions that are identical to the complete-sample case, except that we replace $s$ by $\hat{\sigma}'$, an adjusted estimator:

$$P\Bigg[\underbrace{\hat{\mu} - t_{n-1}\frac{\hat{\sigma}'}{\sqrt{n}}}_{\mu_L} \le \mu \le \underbrace{\hat{\mu} + t_{n-1}\frac{\hat{\sigma}'}{\sqrt{n}}}_{\mu_U}\Bigg] \approx 1 - \alpha \tag{9.45a}$$

$$\hat{\sigma}'^{\,2} = \hat{\sigma}^2\,\frac{n^*}{n-1} \tag{9.45b}$$

based on the adjusted sample size, $n^* = r + \sum_{i=1}^{n} (1 - \delta_i)\,\dot{\lambda}(z_i)$, where $\dot{\lambda}(z_i)$ is the first derivative of $\lambda(z_i) = \phi(z_i)/R(z_i)$. Unfortunately, no such direct approximation has been developed for $\sigma^2$. Instead, we rely on the asymptotic normal properties of the log-likelihood function to arrive at large-sample approximations of the standard error (variance) of $\hat{\mu}$ and $\hat{\sigma}$. These are introduced next.
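Asymptotically Correct Fisher Information Matrix Confidence Intervals on $\mu$ and $\sigma$
As described in §9.2.2, $(\hat{\mu}, \hat{\sigma})$ are asymptotically distributed according to a bivariate normal with variance–covariance matrix, $\mathrm{COV}(\hat{\mu}, \hat{\sigma})$, equal to the inverse of the local Fisher information matrix, $F$:

$$F = \begin{bmatrix}
-\dfrac{\partial^2 \ln L}{\partial \mu^2} & -\dfrac{\partial^2 \ln L}{\partial \mu\,\partial \sigma} \\[2ex]
-\dfrac{\partial^2 \ln L}{\partial \mu\,\partial \sigma} & -\dfrac{\partial^2 \ln L}{\partial \sigma^2}
\end{bmatrix} \tag{9.46}$$

$$\mathrm{COV}(\hat{\mu}, \hat{\sigma}) = F^{-1} = \begin{bmatrix}
\mathrm{Var}(\hat{\mu}) & \mathrm{Cov}(\hat{\mu}, \hat{\sigma}) \\
\mathrm{Cov}(\hat{\mu}, \hat{\sigma}) & \mathrm{Var}(\hat{\sigma})
\end{bmatrix} \tag{9.47}$$

The elements in Eq. (9.46) are evaluated using

$$\frac{\partial^2 \ln L}{\partial \mu^2} = -\frac{1}{\sigma^2}\left[r + \sum_{i=1}^{n} (1 - \delta_i)\,\dot{\lambda}(z_i)\right] \tag{9.48}$$

$$\frac{\partial^2 \ln L}{\partial \sigma\,\partial \mu} = -\frac{1}{\sigma^2}\left[\sum_{i=1}^{n} 2\delta_i z_i + \sum_{i=1}^{n} (1 - \delta_i)\left(\lambda(z_i) + z_i\,\dot{\lambda}(z_i)\right)\right] \tag{9.49}$$

$$\frac{\partial^2 \ln L}{\partial \sigma^2} = -\frac{1}{\sigma^2}\left[-r + \sum_{i=1}^{n} 3\delta_i z_i^2 + \sum_{i=1}^{n} (1 - \delta_i)\left[z_i^2\,\dot{\lambda}(z_i) + 2 z_i\,\lambda(z_i)\right]\right] \tag{9.50}$$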
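A short numerical sketch of Eqs. (9.48) to (9.50) is given below. It assumes that $z_i = (t_i - \mu)/\sigma$, that $\delta_i = 1$ marks a failure and $\delta_i = 0$ a right-censored unit, and that the log-likelihood is $\sum \delta_i[-\ln \sigma - z_i^2/2] + \sum (1 - \delta_i) \ln[1 - \Phi(z_i)]$ up to an additive constant. The function names and the data are hypothetical. The derivative of the standard normal hazard is taken as $\dot{\lambda}(z) = \lambda(z)[\lambda(z) - z]$, which follows from differentiating $\phi(z)/[1 - \Phi(z)]$.

```python
import math

def std_normal_hazard(z):
    """lambda(z) = phi(z) / (1 - Phi(z)) for the standard normal distribution."""
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    survival = 0.5 * math.erfc(z / math.sqrt(2.0))
    return pdf / survival

def hazard_derivative(z):
    """First derivative of the standard normal hazard: lambda * (lambda - z)."""
    lam = std_normal_hazard(z)
    return lam * (lam - z)

def normal_loglik(mu, sigma, times, failed):
    """Right-censored normal log-likelihood (additive constant dropped)."""
    total = 0.0
    for t, d in zip(times, failed):
        z = (t - mu) / sigma
        if d:
            total += -math.log(sigma) - 0.5 * z * z
        else:
            total += math.log(0.5 * math.erfc(z / math.sqrt(2.0)))
    return total

def observed_information(mu, sigma, times, failed):
    """Matrix F of Eq. (9.46): negatives of the second partials in Eqs. (9.48)-(9.50)."""
    r = sum(failed)
    s_mm = s_ms = s_ss = 0.0
    for t, d in zip(times, failed):
        z = (t - mu) / sigma
        if d:
            s_ms += 2.0 * z
            s_ss += 3.0 * z * z
        else:
            lam, dlam = std_normal_hazard(z), hazard_derivative(z)
            s_mm += dlam
            s_ms += lam + z * dlam
            s_ss += z * z * dlam + 2.0 * z * lam
    f_mm = (r + s_mm) / sigma ** 2
    f_ms = s_ms / sigma ** 2
    f_ss = (-r + s_ss) / sigma ** 2
    return [[f_mm, f_ms], [f_ms, f_ss]]

# Hypothetical data (hundred-hr): four failures and two units censored at 170.
times = [120.0, 135.0, 150.0, 160.0, 170.0, 170.0]
failed = [1, 1, 1, 1, 0, 0]
F = observed_information(155.0, 20.0, times, failed)
```

In practice the matrix would be evaluated at the ML estimates $(\hat{\mu}, \hat{\sigma})$; the point (155, 20) above is arbitrary and serves only to exercise the formulas.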
Following Eq. (9.36), a $(1-\alpha)$ standard normal confidence interval on $\mu$ is given by

$$P\Big(\underbrace{\hat{\mu} - Z_{\alpha/2}\sqrt{\mathrm{Var}(\hat{\mu})}}_{\mu_L} \le \mu \le \underbrace{\hat{\mu} + Z_{\alpha/2}\sqrt{\mathrm{Var}(\hat{\mu})}}_{\mu_U}\Big) \approx 1 - \alpha \tag{9.51}$$

Following Eq. (9.37), a $(1-\alpha)$ two-sided, standard normal confidence interval on $\sigma$ is given by

$$P\Bigg(\underbrace{\hat{\sigma}\Big/\exp\left[\frac{Z_{\alpha/2}\sqrt{\mathrm{Var}(\hat{\sigma})}}{\hat{\sigma}}\right]}_{\sigma_L} \le \sigma \le \underbrace{\hat{\sigma}\exp\left[\frac{Z_{\alpha/2}\sqrt{\mathrm{Var}(\hat{\sigma})}}{\hat{\sigma}}\right]}_{\sigma_U}\Bigg) \approx 1 - \alpha \tag{9.52}$$

In practice, only a one-sided upper confidence limit on $\sigma$ should be needed.
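The interval formulas of Eqs. (9.47), (9.51), and (9.52) reduce to a few lines of arithmetic once the observed information matrix is available. The following is a minimal sketch with hypothetical function names; the numerical inputs in the usage lines are the estimates and variances reported for Worked-out Example 9-1 ($\hat{\mu} = 159.85$, $\hat{\sigma} = 15.47$, $\mathrm{Var}(\hat{\mu}) = 40.914$, $\mathrm{Var}(\hat{\sigma}) = 23.202$).

```python
import math
from statistics import NormalDist

def invert_2x2(m):
    """Inverse of a 2x2 matrix, e.g. F^{-1} in Eq. (9.47)."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def normal_interval(estimate, variance, conf):
    """Two-sided standard normal interval, as in Eq. (9.51)."""
    z = NormalDist().inv_cdf(0.5 + conf / 2.0)
    half_width = z * math.sqrt(variance)
    return estimate - half_width, estimate + half_width

def log_interval(estimate, variance, conf):
    """Two-sided interval built on the log scale, as in Eqs. (9.37) and (9.52)."""
    z = NormalDist().inv_cdf(0.5 + conf / 2.0)
    w = math.exp(z * math.sqrt(variance) / estimate)
    return estimate / w, estimate * w

def upper_log_limit(estimate, variance, conf):
    """One-sided upper limit built on the log scale."""
    z = NormalDist().inv_cdf(conf)
    return estimate * math.exp(z * math.sqrt(variance) / estimate)

# Values reported for Worked-out Example 9-1 (hundred-hr).
mu_l, mu_u = normal_interval(159.85, 40.914, 0.90)   # roughly 149.3 and 170.4
sigma_u = upper_log_limit(15.47, 23.202, 0.90)       # roughly 23.1
```

Because the log-scale limits are $\hat{\sigma}/w$ and $\hat{\sigma} w$ with $w > 1$, the lower limit can never be negative, which is the motivation given in §9.2.2 for working with the log transformation.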
Worked-out Example 9-1 (cont.): Illustrated use of Fisher-matrix confidence intervals on $\mu$ and $\sigma$

Based on the use of Eqs. (9.48) to (9.50), the local Fisher information matrix was calculated as

$$F = \begin{bmatrix} 0.025 & -0.006 \\ -0.006 & 0.044 \end{bmatrix}$$

Its inverse is the covariance matrix,

$$\mathrm{COV}(\hat{\mu}, \hat{\sigma}) = \begin{bmatrix} 40.914 & 5.382 \\ 5.382 & 23.202 \end{bmatrix}$$

The standard errors of $\hat{\mu}$ and $\hat{\sigma}$ are just the square roots of the diagonal elements. That is, the standard error of $\hat{\mu}$ is estimated as $40.914^{0.5} = 6.4$; the standard error of $\hat{\sigma}$ is estimated as $23.202^{0.5} = 4.817$. The covariance between $\hat{\mu}$ and $\hat{\sigma}$ is the off-diagonal element, estimated as 5.382. Accordingly, approximate 90% standard normal confidence intervals on $\mu$ and $\sigma$ are given as follows.

Two-Sided 90% Confidence Limits on $\mu$

$$\mu_L = \hat{\mu} - Z_{0.05}\sqrt{\mathrm{Var}(\hat{\mu})} = 159.85 - 1.645(6.4) = 149.32 \text{ hundred-hr}$$

$$\mu_U = \hat{\mu} + Z_{0.05}\sqrt{\mathrm{Var}(\hat{\mu})} = 159.85 + 1.645(6.4) = 170.38 \text{ hundred-hr}$$

One-Sided 90% Upper Confidence Limit on $\sigma$

$$\sigma_U = \hat{\sigma}\exp\left[\frac{Z_{0.10}\sqrt{\mathrm{Var}(\hat{\sigma})}}{\hat{\sigma}}\right] = 15.47\exp\left[\frac{1.282\sqrt{23.2}}{15.47}\right] = 23.05$$

Asymptotically Correct LR Confidence Intervals on $\mu$ and $\sigma$
Based on the use of Eqs. (9.40) and (9.42), a $(1-\alpha)$ likelihood ratio (LR) confidence interval about $\mu$ consists of all values of $\mu_0$ satisfying

$$\ln L^*(\mu_0) \ge \ln L^* - \tfrac{1}{2}\chi^2_{1,\alpha} \tag{9.53}$$

where

$$\ln L^*(\mu_0) = \max_{\sigma} \ln L(\mu_0, \sigma) = \ln L(\mu_0, \hat{\sigma}(\mu_0)) \tag{9.54}$$
In Eq. (9.54), $\hat{\sigma}(\mu_0)$ is the solution to $\partial \ln L(\mu_0, \sigma)/\partial \sigma = 0$ [see Eq. (9.15)]. Similarly, a confidence interval about $\sigma$ consists of all values of $\sigma_0$ satisfying

$$\ln L^*(\sigma_0) \ge \ln L^* - \tfrac{1}{2}\chi^2_{1,\alpha} \tag{9.55}$$

where

$$\ln L^*(\sigma_0) = \max_{\mu} \ln L(\mu, \sigma_0) = \ln L(\hat{\mu}(\sigma_0), \sigma_0) \tag{9.56}$$

In Eq. (9.56), $\hat{\mu}(\sigma_0)$ is the solution to $\partial \ln L(\mu, \sigma_0)/\partial \mu = 0$ [see Eq. (9.14)]. To avoid these complexities, the LR confidence intervals may be identified graphically with the use of likelihood contours. Following Eq. (9.42), the contours are constructed according to the relationship

$$\ln L(\mu, \sigma) = \ln L^* - \tfrac{1}{2}\chi^2_{1,\alpha} \tag{9.57}$$

The use of likelihood contours is illustrated by Figure 9-6 for the worked-out example of §9.1.3. Likelihood contours were constructed with the use of Eq. (9.57) for a range of confidence levels, $C = 90\%$, 95%, and 99%. From Table 9-2, $\ln L^* = -17.22$, and so contours are of the form

$$\ln L(\mu, \sigma) = -17.22 - \tfrac{1}{2}\chi^2_{1,1-C} \tag{9.58}$$
The use of Excel software for generating these contours is illustrated by Figure 9-7. Excel array formulas were used to evaluate $\ln L$. The Tools > Solver procedure was used to identify either $\mu|\sigma$ or $\sigma|\mu$ satisfying Eq. (9.58). Excel recorded macros were developed to automate much of the procedure.

FIGURE 9-6 ML contours of normal parameters, $\mu$ and $\sigma$, for worked-out problem in §9.1.3 at $C = 90\%$, 95%, and 99%. Also shown: $\hat{\mu}(\sigma_0)$ and $\hat{\sigma}(\mu_0)$; ML point; and tangents to $C = 90\%$ contour revealing LR confidence limits on $\mu$ and $\sigma$.

FIGURE 9-7 Use of Microsoft Excel to construct a normal ML contour at level of confidence of 0.90.

Specifically, we would like to identify LR confidence intervals on $\mu$ and $\sigma$ at $C = 90\%$. At $C = 90\%$, $\chi^2_{1,0.10} = 2.706$, and so the $C = 0.90$ contour satisfies the relationship

$$\ln L(\mu, \sigma) = -18.575$$

As illustrated by Figure 9-6, LR confidence intervals may be identified graphically by drawing tangent lines, perpendicular to the coordinate axis of the parameter of interest, to the boundaries of the contour at the extreme values for $\mu$ and $\sigma$ on the contour. The partial likelihood equation solutions, $\hat{\sigma}(\mu_0)$ and $\hat{\mu}(\sigma_0)$, which are also plotted in Figure 9-6, are seen to intersect the $C = 90\%$ contour at the points of tangency. The 90% LR limits on $\mu$ and $\sigma$ are displayed in Table 9-3.

TABLE 9-3 Graphical Estimation of LR Intervals of Normal Parameters, $\mu$ and $\sigma$, Using ML Contours
$\mu_L$     $\mu_U$     $\sigma_L$     $\sigma_U$
147         174         10             28

To obtain accurate LR limits, we need to solve the two nonlinear optimization problems shown in Table 9-4. Actual LR limits were derived with the use of Excel in conjunction with Eqs. (9.53) to (9.57). The search procedures were initialized with the graphical estimates shown in Table 9-3. The alternating application of Tools > Goal Seek procedures is illustrated in Figure 9-8. For each parameter, there is an alternating search to set the first partial derivative of the free parameter to zero, followed by another search procedure on the constrained parameter such that the likelihood $\chi^2$ target is satisfied. Convergence should take but a handful of steps and can be easily automated with the development of user macros. The results of the nonlinear search are summarized by Table 9-5.

TABLE 9-4 Searching for Exact LR Limits on Normal Parameters, $\mu$ and $\sigma$
Solve for LR limits on $\mu$:     $\ln L(\mu, \sigma) = -18.575$ s.t. $\sigma = \hat{\sigma}(\mu)$ [Two solutions; see Eq. (9.15)]
Solve for LR limits on $\sigma$:  $\ln L(\mu, \sigma) = -18.575$ s.t. $\mu = \hat{\mu}(\sigma)$ [Two solutions; see Eq. (9.14)]

FIGURE 9-8 Use of Excel Tools > Goal Seek to solve for LR confidence intervals on normal parameters, $\mu$ and $\sigma$.

TABLE 9-5 Exact LR Limits on Normal Parameters, $\mu$ and $\sigma$
$\mu_L$     $\mu_U$     $\sigma_L$     $\sigma_U$
148.98      173.15      9.98           28.74

Asymptotic Confidence Limits on Percentiles of Normal Survival Distribution
Estimating $t_R$. $t_R$, the $R$th percentile of the survival distribution for a normally distributed time-to-failure, is defined by the relationship $F(t_R) = 1 - R$. It is estimated using

$$\hat{t}_R = \hat{\mu} + \hat{\sigma} Z_R \tag{9.59}$$
Worked-out Example 9-1 (cont.)
$\hat{t}_{0.90} = \hat{\mu} + \hat{\sigma} Z_{0.90} = 159.85 - 20.16 \times 1.282 = 134.0$ hundred-hr. (Note: $Z_{0.90} = -Z_{0.10} = -1.282$.)

As simple as Eq. (9.59) might appear, a closed-form expression for the confidence interval on $t_R$ is not possible. This is due to the fact that the distribution of $\hat{t}_R$ is dependent on the joint distribution of $\hat{\mu}$ and $\hat{\sigma}$. Asymptotic expressions for the standard error of $\hat{t}_R$ require knowledge of the covariance matrix on $\hat{\mu}$ and $\hat{\sigma}$. This requirement is discussed next.

Fisher-Matrix Intervals on Normal Percentile, $t_R$
Lawless (1982) presents an expression for the asymptotic variance of $\hat{t}_R$:

$$\mathrm{Var}(\hat{t}_R) \approx \mathrm{Var}(\hat{\mu}) + Z_R^2\,\mathrm{Var}(\hat{\sigma}) + 2 Z_R\,\mathrm{Cov}(\hat{\mu}, \hat{\sigma}) \tag{9.60}$$

This is used to provide a standard normal approximation of the lower confidence limit on $t_R$:

$$P\left(\hat{t}_R - Z_{\alpha}\sqrt{\mathrm{Var}(\hat{t}_R)} \le t_R\right) \approx 1 - \alpha \tag{9.61}$$

Worked-out Example 9-1 (cont.)
Continuing with the worked-out example of §9.1.3, we demonstrate the use of Eqs. (9.60) and (9.61) to generate an asymptotically correct, 97.5% one-sided lower confidence limit on $t_{0.90}$, the 90th percentile of the survival distribution. To calculate $\mathrm{Var}(\hat{t}_R)$, we use the elements of the covariance matrix, $\mathrm{Var}(\hat{\mu}) = 40.914$, $\mathrm{Var}(\hat{\sigma}) = 23.202$, and $\mathrm{Cov}(\hat{\mu}, \hat{\sigma}) = 5.382$, to develop a large-sample estimate of the variance of $\hat{t}_R$:

$$\mathrm{Var}(\hat{t}_R) \approx \mathrm{Var}(\hat{\mu}) + Z_R^2\,\mathrm{Var}(\hat{\sigma}) + 2 Z_R\,\mathrm{Cov}(\hat{\mu}, \hat{\sigma}) = 40.914 + 1.282^2(23.202) - 2(1.282)(5.382) = 65.25 \text{ hundred-hr}^2$$

Equation (9.61) can now be used to devise an approximate 0.975 standard normal, one-sided lower confidence limit on $t_R$:

$$t_{R,L} = \hat{t}_R - Z_{0.025}\sqrt{\mathrm{Var}(\hat{t}_R)} = 134.0 - 1.96(8.07) = 118.2 \text{ hundred-hr}$$

Asymptotic Confidence Intervals on Normal Reliability
Dodson and Nolan (1995) provide an expression for the approximate lower confidence limit on reliability, $R(t) = 1 - \Phi(Z)$, with $\hat{Z}(t) = (t - \hat{\mu})/\hat{\sigma}$:

$$P\Big[\underbrace{1 - \Phi\Big(\hat{Z}(t) + Z_{\alpha}\sqrt{\mathrm{Var}(\hat{Z}(t))}\Big)}_{R_L(t)} \le R(t)\Big] = 1 - \alpha \tag{9.62}$$

where

$$\mathrm{Var}(\hat{Z}(t)) \approx \frac{\mathrm{Var}(\hat{\mu}) + \hat{Z}^2(t)\,\mathrm{Var}(\hat{\sigma}) + 2\hat{Z}(t)\,\mathrm{Cov}(\hat{\mu}, \hat{\sigma})}{\hat{\sigma}^2} \tag{9.63}$$

Worked-out Example 9-1 (cont.)
Now, suppose that the design life $t_d = 120$ hundred-hr. The reliability at the design life is evaluated as follows.

Point estimate of $R(t_d)$

$$\hat{R}(t_d) = 1 - \Phi\left(\frac{t_d - \hat{\mu}}{\hat{\sigma}}\right) = 1 - \Phi\left(\frac{120 - 159.85}{15.47}\right) = 1 - \Phi(-2.57) = 0.9949$$

One-Sided Lower Confidence Limit on $R(t_d)$
A 95% lower confidence limit on reliability at the design life can be developed based on the use of Eqs. (9.62) and (9.63) as

$$R_L(t_d) = 1 - \Phi\left(\frac{t_d - \hat{\mu}}{\hat{\sigma}} + Z_{0.05}\sqrt{\mathrm{Var}(\hat{Z}(t_d))}\right) = 1 - \Phi\left(-2.57 + 1.645\sqrt{0.696}\right) = 1 - \Phi(-1.198) = 1 - 0.115 = 0.885$$

where

$$\mathrm{Var}(\hat{Z}(t_d)) \approx \frac{\mathrm{Var}(\hat{\mu}) + \hat{Z}^2(t_d)\,\mathrm{Var}(\hat{\sigma}) + 2\hat{Z}(t_d)\,\mathrm{Cov}(\hat{\mu}, \hat{\sigma})}{\hat{\sigma}^2} = \frac{40.9 + 2.57^2(23.2) - 2(2.57)(5.382)}{15.47^2} = 0.696$$
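The arithmetic of Eqs. (9.60) to (9.63) can be reproduced with a short script. The sketch below uses hypothetical function names and takes as inputs the quantities reported in Worked-out Example 9-1; the percentile estimate $\hat{t}_{0.90} = 134.0$ is passed in as reported rather than recomputed. Because it uses unrounded normal quantiles, the results may differ from the hand calculations in the last digit or two.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def percentile_lower_limit(t_hat, z_r, var_mu, var_sigma, cov, conf):
    """One-sided lower limit on t_R from Eqs. (9.60) and (9.61)."""
    var_t = var_mu + z_r ** 2 * var_sigma + 2.0 * z_r * cov   # Eq. (9.60)
    z_alpha = STD_NORMAL.inv_cdf(conf)
    return t_hat - z_alpha * math.sqrt(var_t), var_t

def reliability_lower_limit(t, mu_hat, sigma_hat, var_mu, var_sigma, cov, conf):
    """Point estimate and one-sided lower limit on R(t) from Eqs. (9.62) and (9.63)."""
    z_hat = (t - mu_hat) / sigma_hat
    var_z = (var_mu + z_hat ** 2 * var_sigma + 2.0 * z_hat * cov) / sigma_hat ** 2
    z_alpha = STD_NORMAL.inv_cdf(conf)
    r_hat = 1.0 - STD_NORMAL.cdf(z_hat)
    r_low = 1.0 - STD_NORMAL.cdf(z_hat + z_alpha * math.sqrt(var_z))
    return r_hat, r_low, var_z

# Inputs as reported in Worked-out Example 9-1 (hundred-hr).
z_r = STD_NORMAL.inv_cdf(1.0 - 0.90)          # Z_R for R = 0.90, about -1.282
t_low, var_t = percentile_lower_limit(134.0, z_r, 40.914, 23.202, 5.382, 0.975)
r_hat, r_low, var_z = reliability_lower_limit(120.0, 159.85, 15.47,
                                              40.914, 23.202, 5.382, 0.95)
```

Note that the lower limit on reliability is obtained by moving $\hat{Z}(t)$ upward (toward the mean), which lowers $1 - \Phi(\cdot)$, so `r_low` is always below `r_hat`.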
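LR-Based Intervals on Normal Percentile, $t_R$
Following the general approach outlined in §9.2.2, the likelihood contour, $\ln L(t_R, \sigma)$, is formed by incorporating the relationship for $t_R$ given by Eq. (9.59) into the expression for the normal contour, $\ln L(\mu, \sigma)$, given by Eq. (9.57):

$$\ln L(t_R, \sigma) = -r \ln \sigma - \frac{1}{2\sigma^2}\sum_{i=1}^{n} \delta_i\,(t_i - t_R + Z_R\,\sigma)^2 + \sum_{i=1}^{n} (1 - \delta_i)\ln Q\left[\frac{t_i - t_R}{\sigma} + Z_R\right] = \ln L^* - \tfrac{1}{2}\chi^2_{1,\alpha} \tag{9.64}$$

where $Q(\cdot) = 1 - \Phi(\cdot)$. The LR confidence limits will be of the form

$$\ln L^*(t_R) \ge \ln L^* - \tfrac{1}{2}\chi^2_{1,\alpha} \tag{9.65}$$

where $\ln L^*(t_R) = \max_{\sigma} \ln L(t_R, \sigma)$. If we denote the solution to $\partial \ln L(t_R, \sigma)/\partial \sigma = 0$ as $\hat{\sigma}(t_R)$, then

$$\ln L^*(t_R) = \ln L(t_R, \hat{\sigma}(t_R))$$

The partial likelihood estimate is the solution to

$$\frac{\partial \ln L(t_R, \sigma)}{\partial \sigma} = \frac{1}{\sigma}\left\{-r + \sum_{i=1}^{n} \delta_i \left[\frac{t_i - t_R}{\sigma}\right]\left[\frac{t_i - t_R}{\sigma} + Z_R\right] + \sum_{i=1}^{n} (1 - \delta_i)\left[\frac{t_i - t_R}{\sigma}\right]\lambda\left[\frac{t_i - t_R}{\sigma} + Z_R\right]\right\} = 0 \tag{9.66}$$

In Eq. (9.66), note the definition of the standard hazard expression:

$$\lambda\left[\frac{t_i - t_R}{\sigma} + Z_R\right] = \frac{\phi\left(\dfrac{t_i - t_R}{\sigma} + Z_R\right)}{1 - \Phi\left(\dfrac{t_i - t_R}{\sigma} + Z_R\right)}$$

The LR confidence limits on $t_R$ are found by finding the intersection of the partial likelihood relationship, $\hat{\sigma}(t_R)$, and the likelihood contour.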
FIGURE 9-9 90% normal likelihood contour, $\ln L(t_R, \sigma)$, and $\hat{\sigma}(t_R)$, with tangential lines to identify $C = 0.90$, 2-sided LR limits on survival percentile, $t_R$.
Worked-out Example 9-1 (cont.)
A graphical illustration of how LR confidence limits on $t_R$ are identified is presented in Figure 9-9 for the worked-out example. Likelihood contours are shown for $C = 0.90$, 0.95, and 0.99. The likelihood contour at the 0.90 level of confidence was constructed with the use of Eq. (9.64). The smallest and largest values of $t_R$ on the contour were identified and tangential lines drawn to the abscissa, the $t_R$-axis. The 2-sided 90% confidence limits are

$$t_{R,L} = 120.5 \quad \text{and} \quad t_{R,U} = 151$$

To find exact values of the LR limits on $t_R$, one must solve the nonlinear optimization problem in Table 9-6.

TABLE 9-6 Finding Exact LR Limits on $t_R$
Solve for LR limits on $t_R$:  $\ln L(t_R, \sigma) = -18.575$ s.t. $\sigma = \hat{\sigma}(t_R)$ [Two solutions; see Eq. (9.66)]

The graphical limits were used to initialize a search procedure to find the exact LR limits. Equations (9.64) and (9.66) were implemented in a two-step search procedure, illustrated by Figure 9-10. A Microsoft Excel macro was recorded to automate the alternating Tools > Goal Seek or Solver procedures to find the solutions to both relationships. The exact limits were found to be

$$t_{R,L} = 120.31 \quad \text{and} \quad t_{R,U} = 150.84$$
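In similar fashion, LR confidence limits might then be developed on the normal reliability, $R(t_d)$, or any other reliability metric of interest.

FIGURE 9-10 Use of Microsoft Excel search procedures to identify LR intervals on percentile, $t_R$.

Asymptotically correct normal confidence intervals on Weibull parameters, $\beta$ and $\theta$, are based on the Fisher information matrix.

Fisher-Matrix Expression
The observed Fisher information matrix has the form:

$$F = \begin{bmatrix}
-\dfrac{\partial^2 \ln L}{\partial \theta^2} & -\dfrac{\partial^2 \ln L}{\partial \beta\,\partial \theta} \\[2ex]
-\dfrac{\partial^2 \ln L}{\partial \beta\,\partial \theta} & -\dfrac{\partial^2 \ln L}{\partial \beta^2}
\end{bmatrix} \tag{9.67}$$

The inverse of $F$ is the asymptotic covariance matrix, $\mathrm{COV}(\hat{\theta}, \hat{\beta})$:

$$\mathrm{COV}(\hat{\theta}, \hat{\beta}) = F^{-1} = \begin{bmatrix}
\mathrm{Var}(\hat{\theta}) & \mathrm{Cov}(\hat{\theta}, \hat{\beta}) \\
\mathrm{Cov}(\hat{\theta}, \hat{\beta}) & \mathrm{Var}(\hat{\beta})
\end{bmatrix} \tag{9.68}$$

To evaluate the observed Fisher matrix, $F$, Eqs. (9.17) and (9.18) are used to obtain expressions for the second partial derivatives of the log-likelihood function:

$$\frac{\partial^2 \ln L(\theta, \beta)}{\partial \theta^2} = \frac{r\beta}{\theta^2} - \sum_{i=1}^{n} \frac{\beta(\beta + 1)}{\theta^2}\left(\frac{t_i}{\theta}\right)^{\beta} \tag{9.69}$$

$$\frac{\partial^2 \ln L(\theta, \beta)}{\partial \beta^2} = -\frac{r}{\beta^2} - \sum_{i=1}^{n} \left(\frac{t_i}{\theta}\right)^{\beta} \ln^2\left(\frac{t_i}{\theta}\right) \tag{9.70}$$

$$\frac{\partial^2 \ln L(\theta, \beta)}{\partial \theta\,\partial \beta} = -\frac{r}{\theta} + \sum_{i=1}^{n}\left[\frac{1}{\theta}\left(\frac{t_i}{\theta}\right)^{\beta} + \frac{\beta}{\theta}\ln\left(\frac{t_i}{\theta}\right)\left(\frac{t_i}{\theta}\right)^{\beta}\right] = -\frac{r}{\theta} + \frac{1}{\theta}\sum_{i=1}^{n}\left(\frac{t_i}{\theta}\right)^{\beta}\left[1 + \beta \ln\left(\frac{t_i}{\theta}\right)\right] \tag{9.71}$$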
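Equations (9.69) to (9.71) translate directly into code. The sketch below is illustrative only: the function names and data are hypothetical, and it assumes the right-censored Weibull log-likelihood $\ln L = \sum_{\text{failures}}[\ln \beta - \beta \ln \theta + (\beta - 1)\ln t_i] - \sum_{i=1}^{n}(t_i/\theta)^{\beta}$, with the second sum running over all failed and censored units.

```python
import math

def weibull_loglik(theta, beta, times, failed):
    """Right-censored Weibull log-likelihood (scale theta, shape beta)."""
    total = 0.0
    for t, d in zip(times, failed):
        if d:
            total += math.log(beta) - beta * math.log(theta) + (beta - 1.0) * math.log(t)
        total -= (t / theta) ** beta
    return total

def weibull_second_derivatives(theta, beta, times, failed):
    """Second partials of ln L, following Eqs. (9.69), (9.70), and (9.71)."""
    r = sum(failed)
    s0 = sum((t / theta) ** beta for t in times)
    s1 = sum((t / theta) ** beta * math.log(t / theta) for t in times)
    s2 = sum((t / theta) ** beta * math.log(t / theta) ** 2 for t in times)
    d_tt = r * beta / theta ** 2 - beta * (beta + 1.0) / theta ** 2 * s0   # Eq. (9.69)
    d_bb = -r / beta ** 2 - s2                                             # Eq. (9.70)
    d_tb = -r / theta + (s0 + beta * s1) / theta                           # Eq. (9.71)
    return d_tt, d_bb, d_tb

def weibull_observed_information(theta, beta, times, failed):
    """Matrix F of Eq. (9.67), ordered (theta, beta)."""
    d_tt, d_bb, d_tb = weibull_second_derivatives(theta, beta, times, failed)
    return [[-d_tt, -d_tb], [-d_tb, -d_bb]]

# Hypothetical data: four failures and two units censored at 200.
times = [45.0, 80.0, 110.0, 150.0, 200.0, 200.0]
failed = [1, 1, 1, 1, 0, 0]
F = weibull_observed_information(130.0, 1.8, times, failed)
```

As in the normal case, $F$ would normally be evaluated at $(\hat{\theta}, \hat{\beta})$ and then inverted to give Eq. (9.68); the point (130, 1.8) is arbitrary.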
Based on the use of Eqs. (9.68) to (9.71), the estimated standard errors of the ML estimates, $\hat{\sigma}_{\hat{\theta}}$ and $\hat{\sigma}_{\hat{\beta}}$, may be obtained. As $\beta$ and $\theta$ are positive-valued, we prefer to work with approximate standard normal confidence intervals on $\ln \beta$ and $\ln \theta$. Methods for their construction are discussed in §9.2.2 [see Eq. (9.37)]. Suggested expressions for the asymptotic, Fisher-matrix confidence intervals on $\theta$ and $\beta$ are presented next:

$$P\left(\frac{\hat{\theta}}{\exp\left[\dfrac{Z_{\alpha/2}\sqrt{\mathrm{Var}(\hat{\theta})}}{\hat{\theta}}\right]} \le \theta \le \hat{\theta}\exp\left[\frac{Z_{\alpha/2}\sqrt{\mathrm{Var}(\hat{\theta})}}{\hat{\theta}}\right]\right) \approx 1 - \alpha \tag{9.72}$$

$$P\left(\frac{\hat{\beta}}{\exp\left[\dfrac{Z_{\alpha/2}\sqrt{\mathrm{Var}(\hat{\beta})}}{\hat{\beta}}\right]} \le \beta \le \hat{\beta}\exp\left[\frac{Z_{\alpha/2}\sqrt{\mathrm{Var}(\hat{\beta})}}{\hat{\beta}}\right]\right) \approx 1 - \alpha \tag{9.73}$$
Worked-out Example 9-2: Fisher-matrix confidence intervals on $\beta$ and $\theta$
Equations (9.67) to (9.73) were used to directly output elements of the Fisher information matrix, $F$, and its inverse:

$$F = \begin{bmatrix} 0.0029 & 0.0219 \\ 0.0219 & 7.6126 \end{bmatrix}$$

$$F^{-1} = \begin{bmatrix} \hat{\sigma}^2_{\hat{\theta}} & \mathrm{Cov}(\hat{\theta}, \hat{\beta}) \\ \mathrm{Cov}(\hat{\theta}, \hat{\beta}) & \hat{\sigma}^2_{\hat{\beta}} \end{bmatrix} = \begin{bmatrix} 353.72 & 1.0194 \\ 1.0194 & 0.134 \end{bmatrix}$$

Equations (9.67) to (9.73) were then used to obtain 90% two-sided, asymptotically correct standard normal confidence intervals on $\theta$ and $\beta$. The results are summarized in Table 9-7.

$$P\left(\frac{\hat{\theta}}{\exp\left[\dfrac{Z_{0.05}\sqrt{\mathrm{Var}(\hat{\theta})}}{\hat{\theta}}\right]} \le \theta \le \hat{\theta}\exp\left[\frac{Z_{0.05}\sqrt{\mathrm{Var}(\hat{\theta})}}{\hat{\theta}}\right]\right) \approx 0.90$$

$$P\left(\frac{132.81}{\exp\left[\dfrac{1.645 \times 19.129}{132.81}\right]} \le \theta \le 132.81\exp\left[\frac{1.645 \times 19.129}{132.81}\right]\right) \approx 0.90$$

$$P\Big(\underbrace{104.79}_{\theta_L} \le \theta \le \underbrace{168.32}_{\theta_U}\Big) \approx 0.90$$

$$P\left(\frac{\hat{\beta}}{\exp\left[\dfrac{Z_{0.05}\sqrt{\mathrm{Var}(\hat{\beta})}}{\hat{\beta}}\right]} \le \beta \le \hat{\beta}\exp\left[\frac{Z_{0.05}\sqrt{\mathrm{Var}(\hat{\beta})}}{\hat{\beta}}\right]\right) \approx 0.90$$

$$P\left(\frac{1.785}{\exp\left[\dfrac{1.645 \times 0.361}{1.785}\right]} \le \beta \le 1.785\exp\left[\frac{1.645 \times 0.361}{1.785}\right]\right) \approx 0.90$$

$$P\Big(\underbrace{1.279}_{\beta_L} \le \beta \le \underbrace{2.489}_{\beta_U}\Big) \approx 0.90$$

TABLE 9-7 Fisher-Matrix Intervals on Weibull Parameters, $\theta$ and $\beta$
$\theta_L$     $\theta_U$     $\beta_L$     $\beta_U$
105.48         168.99         1.260         2.470
Likelihood Ratio (LR) Intervals on Weibull Parameters, $\beta$ and $\theta$
Following Eq. (9.40), a $(1-\alpha)$ confidence interval about $\beta$ consists of all values of $\beta_0$ that satisfy (see Lawless, 1982, p. 173)

$$\ln L^*(\beta_0) \ge \ln L^* - \tfrac{1}{2}\chi^2_{1,\alpha} \tag{9.74}$$

where

$$\ln L^*(\beta_0) = \max_{\theta} \ln L(\theta, \beta_0) = \ln L(\hat{\theta}(\beta_0), \beta_0) \tag{9.75}$$

and

$$\hat{\theta}(\beta_0) = \left(\frac{\sum_{\forall i} t_i^{\beta_0}}{r}\right)^{1/\beta_0}$$

is the solution to the partial likelihood equation $\partial \ln L(\theta, \beta_0)/\partial \theta = 0$ [see Eq. (4.20)]. Similarly, a confidence interval about $\theta$ consists of all values of $\theta_0$ satisfying

$$\ln L^*(\theta_0) \ge \ln L^* - \tfrac{1}{2}\chi^2_{1,\alpha} \tag{9.76}$$

where

$$\ln L^*(\theta_0) = \max_{\beta} \ln L(\theta_0, \beta) = \ln L(\theta_0, \hat{\beta}(\theta_0)) \tag{9.77}$$

In this case the determination of $\hat{\beta}(\theta_0)$ requires a nonlinear search to identify the solution to the partial differential equation $\partial \ln L(\theta, \beta)/\partial \beta = 0$ for each evaluation of $\theta = \theta_0$.

Worked-out Example 9-2 (cont.): Graphical LR limits on $\beta$ and $\theta$
To avoid these complexities, the likelihood confidence intervals can be identified graphically by drawing likelihood contours at a given confidence level. Following Eq. (9.42), the contours at $C = (1 - \alpha)$ are constructed according to the relationship

$$\ln L(\theta, \beta) = \ln L^* - \tfrac{1}{2}\chi^2_{1,\alpha} \tag{9.78}$$

FIGURE 9-11 ML contours of Weibull parameters, $\beta$ and $\theta$, for levels of confidence between 0.70 and 0.99, generated by WinSmith™ software, V1.1, and plotted in Minitab® V12.
For the worked-out problem, we plot contours of the log-likelihood function in Figure 9-11, for C ranging from 0.70 to 0.99. For the contour associated with a 90% level of confidence, we have ln L* ¼ 90:708, w21;0:10 ¼ 2:71, and so the contour consists of all values of ln Lðy; bÞ satisfying ln Lðy; bÞ ¼ 92:063 The LR limits were identified by drawing lines tangent to the boundary of contour of the 90th% confidence level as shown in Figure 9-11. A more accurate determination of the LR limits may be obtained with the use of the Excel Tool > Solver routine to estimate the partial likelihood estimates, y^ ðb0 Þ and b^ ðy0 Þ. The graphical estimates of the LR limits given by Table 9-8 were used to initialize the search procedures. To determine the exact values of the LR TABLE 9-8
Graphical Estimates of Weibull LR Intervals
yL
yU
bL
bU
105
170
1.24
2.48
Copyright © 2002 Marcel Dekker, Inc.
TABLE 9-9 Searching for Exact LR Limits on Normal Parameters, m and s Solve for LR limits on y:
Solve for LR limits on b:
ln Lðy; bÞ ¼ 92:063 s:t: b ¼ b^ ðy0 Þ [Two solutions; see Eq. (9.18)]
ln Lðy; bÞ ¼ 92:063 P b s:t: y ¼ y^ ðb0 Þ ¼ ð 8i ti 0 =rÞ1=b0 (two solutions)
limits, we need to solve the pair of nonlinear optimization problems shown in Table 9-9. Figure 9-12 displays the ML contour at the 90th% level of confidence, along with the partial likelihood relations y^ ðb0 Þ and b^ ðy0 Þ and the tangential lines used to identify the LR limits at C ¼ 90%. The use of the Tools > Goal Seek procedure for identifying 90% limits on b is illustrated in Figure 9-13. The LR intervals for b were very straightforward to generate since the constrained ML estimate of y given b is easily evaluated with the use of Eq. (4.20). As we are looking for two solutions—on upper and lower limit—the solutions are not unique. Thus, the search procedures needed to be initialised with either low or high starting values for b^ . The evaluation of limits on y requires the use of a user-defined macro or lots of patience, as for each y under evaluation, b must be estimated as the solution to the relation @ lnðy; bÞ=@b ¼ 0
FIGURE 9-12 Likelihood contour at 90% level of confidence for the worked-out problem. Also shown, the partial likelihood relationships, y^ ðb0 Þ and b^ ðy0 Þ, and their intersection with the contour, which are used to identify the LR limits on y and b.
FIGURE 9-13 Use of Excel 2000 to generate Weibull LR confidence intervals at C = 90%.
[see Eq. (9.18)]. This requires another stage of optimization. The confidence intervals displayed on $\theta$ were generated by repetitive back-and-forth application of a Tools > Goal Seek or Solver procedure on $\partial \ln L(\theta, \beta)/\partial \beta = 0$ and $2\{\ln L^* - \ln L(\theta_0, \hat{\beta}(\theta_0))\} = \chi^2_{1,0.10} = 2.71$. The LR limits obtained after implementation of the Solver procedure are summarized in Table 9-10.
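The search for LR limits on $\beta$ can also be sketched outside a spreadsheet. The following is a minimal illustration, not the author's Excel procedure: the data set, the helper names (`weibull_loglik`, `theta_given_beta`, `profile`, `bisect`, `ternary_max`), and the search brackets are all assumptions made for the example. It uses the closed-form constrained estimate $\hat{\theta}(\beta) = (\sum t_i^{\beta}/r)^{1/\beta}$ and then looks for the two values of $\beta$ at which the profile log-likelihood falls $\tfrac{1}{2}\chi^2_{1,0.10}$ below its maximum.

```python
import math

def weibull_loglik(theta, beta, times, failed):
    """Right-censored Weibull log-likelihood; failed[i] is 1 for a failure, 0 if censored."""
    ll = 0.0
    for t, d in zip(times, failed):
        if d:
            ll += math.log(beta) - beta * math.log(theta) + (beta - 1.0) * math.log(t)
        ll -= (t / theta) ** beta
    return ll

def theta_given_beta(beta, times, failed):
    """Constrained ML estimate of theta for a fixed beta."""
    r = sum(failed)
    return (sum(t ** beta for t in times) / r) ** (1.0 / beta)

def profile(beta, times, failed):
    """Log-likelihood evaluated along the partial likelihood relation theta_hat(beta)."""
    return weibull_loglik(theta_given_beta(beta, times, failed), beta, times, failed)

def bisect(f, lo, hi, iters=200):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def ternary_max(f, lo, hi, iters=200):
    """Maximize a unimodal function on [lo, hi] by ternary search."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)

# Hypothetical data (not the worked-out example): 8 failures, 2 units censored at 200.
times = [42.0, 68.0, 85.0, 101.0, 118.0, 133.0, 150.0, 172.0, 200.0, 200.0]
failed = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
CHI2_1_010 = 2.71  # chi-square value quoted in the text for C = 90%

beta_hat = ternary_max(lambda b: profile(b, times, failed), 0.05, 20.0)
target = profile(beta_hat, times, failed) - 0.5 * CHI2_1_010
gap = lambda b: profile(b, times, failed) - target

# Two solutions: start one search below beta_hat and one above it.
beta_lower = bisect(gap, 0.05, beta_hat)
beta_upper = bisect(gap, beta_hat, 20.0)
print(beta_lower, beta_hat, beta_upper)
```

As in the spreadsheet approach, the lower and upper limits are separated only by where the search starts, here by bracketing each root on one side of $\hat{\beta}$.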
TABLE 9-10 LR Interval Estimates of Weibull Parameters, $\theta$ and $\beta$

| $\theta_L$ | $\theta_U$ | $\beta_L$ | $\beta_U$ |
|---|---|---|---|
| 104.36 | 171.00 | 1.241 | 2.452 |

Asymptotic Confidence Intervals on the Weibull Survival Quantile, $t_R$

We make use of the asymptotic properties of the Fisher information matrix to form an approximate, lower one-sided, standard normal confidence limit on the Weibull survival quantile, $t_R$. The Weibull survival quantile is estimated using

$$\hat{t}_R = \hat{\theta}\,[-\ln R]^{1/\hat{\beta}} \tag{9.79}$$
For practical purposes, we wish to identify $t_{R,L}$, a $(1-\alpha)$ lower confidence limit on $t_R$, defined by

$$P(t_R \ge t_{R,L}) \ge 1 - \alpha$$

Following Nelson (1982), we work with the transformed extreme-value survival percentile, $y_R$:

$$t_R = \exp(y_R) \tag{9.80}$$

A $(1-\alpha)$ lower confidence limit on $t_R$ is expressed in terms of the lower confidence limit on the extreme-value percentile, $y_{R,L}$:

$$t_{R,L} = \exp(y_{R,L}) \tag{9.81}$$
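An approximate $(1-\alpha)$ lower, standard normal confidence limit on the extreme-value percentile, $y_R$, is given by

$$y_{R,L} = \hat{y}_R - Z_\alpha \sqrt{\mathrm{Var}(\hat{y}_R)} \tag{9.82}$$

where

$$\mathrm{Var}(\hat{y}_R) \approx \mathrm{Var}(\hat{\lambda}) + u_R^2\,\mathrm{Var}(\hat{\delta}) + 2u_R\,\mathrm{Cov}(\hat{\lambda}, \hat{\delta}) \tag{9.83}$$

with $u_R = \ln(-\ln R)$ and

$$\mathrm{Var}(\hat{\lambda}) \approx \frac{\mathrm{Var}(\hat{\theta})}{\hat{\theta}^2}, \qquad \mathrm{Var}(\hat{\delta}) \approx \frac{\mathrm{Var}(\hat{\beta})}{\hat{\beta}^4}, \qquad \mathrm{Cov}(\hat{\lambda}, \hat{\delta}) \approx -\frac{\mathrm{Cov}(\hat{\theta}, \hat{\beta})}{\hat{\theta}\hat{\beta}^2}$$

Here $\lambda = \ln\theta$ and $\delta = 1/\beta$ are the extreme-value location and scale parameters, so that $y_R = \lambda + u_R\,\delta$.

Worked-out Example 9-2 (cont.): R90C95 requirements on $t_R$ using Fisher-matrix expressions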
The use of Eqs. (9.79) through (9.83) to obtain a 95% lower confidence limit on the 90th percentile of the survival distribution is now demonstrated. First, by Eq. (9.79),

$$\hat{t}_{0.90} = \hat{\theta}\,[-\ln R]^{1/\hat{\beta}} = 133.51\,[-\ln(0.90)]^{1/1.764} = 37.29 \text{ rev}$$

$$\hat{y}_{0.90} = \ln \hat{t}_{0.90} = \ln(37.29) = 3.619$$

Now, $t_{0.90,0.95} = \exp(y_{0.90,0.95})$. By Eq. (9.82), $y_{0.90,0.95}$, the 95% lower limit of the 10th percentile of the extreme-value distribution, is found using

$$y_{0.90,0.95} = \hat{y}_{0.90} - Z_{0.05}\sqrt{\mathrm{Var}(\hat{y}_{0.90})} = 3.619 - 1.645\sqrt{\mathrm{Var}(\hat{y}_{0.90})}$$

To find $\mathrm{Var}(\hat{y}_{0.90})$, we need the elements of the inverse Fisher matrix, which we calculated earlier:

$$F^{-1} = \begin{bmatrix} \hat{\sigma}^2_{\hat{\theta}} & \mathrm{Cov}(\hat{\theta}, \hat{\beta}) \\ \mathrm{Cov}(\hat{\theta}, \hat{\beta}) & \hat{\sigma}^2_{\hat{\beta}} \end{bmatrix} = \begin{bmatrix} 365.935 & 1.024 \\ 1.024 & 0.131 \end{bmatrix}$$

From these we obtain

$$u_{0.90} = \ln(-\ln 0.90) = -2.250$$

$$\mathrm{Var}(\hat{\lambda}) \approx \frac{\mathrm{Var}(\hat{\theta})}{\hat{\theta}^2} = \frac{365.9}{133.51^2} = 0.02053$$

$$\mathrm{Var}(\hat{\delta}) \approx \frac{\mathrm{Var}(\hat{\beta})}{\hat{\beta}^4} = \frac{0.131}{1.764^4} = 0.01347$$

$$\mathrm{Cov}(\hat{\lambda}, \hat{\delta}) \approx -\frac{\mathrm{Cov}(\hat{\theta}, \hat{\beta})}{\hat{\theta}\hat{\beta}^2} = -\frac{1.024}{133.51 \times 1.764^2} = -0.00246$$

We can now produce an approximation for $\mathrm{Var}(\hat{y}_{0.90})$ as

$$\mathrm{Var}(\hat{y}_{0.90}) = \mathrm{Var}(\hat{\lambda}) + u_{0.90}^2\,\mathrm{Var}(\hat{\delta}) + 2u_{0.90}\,\mathrm{Cov}(\hat{\lambda}, \hat{\delta}) = 0.02053 + (-2.250)^2(0.01347) + 2(-2.250)(-0.00246) = 0.09984$$

Thus,

$$y_{0.90,0.95} = 3.619 - 1.645\sqrt{0.09984} = 3.0989$$

and the lower limit on $t_{0.90}$ is given by

$$t_{0.90,0.95} = \exp(y_{0.90,0.95}) = \exp(3.0989) = 22.173 \text{ rev}$$
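The arithmetic of this example is easy to script. The sketch below simply re-evaluates Eqs. (9.79) through (9.83) with the point estimates and inverse Fisher matrix elements quoted in the example; the function name `weibull_quantile_lower_limit` and its argument names are illustrative choices, not part of the text. Because it carries full precision rather than the rounded intermediate values, the result differs slightly in the last digits from the hand calculation.

```python
import math

def weibull_quantile_lower_limit(theta, beta, var_theta, var_beta, cov_tb, R, z_alpha):
    """Approximate lower confidence limit on t_R via the extreme-value percentile y_R."""
    u = math.log(-math.log(R))                 # u_R = ln(-ln R)
    t_hat = theta * (-math.log(R)) ** (1.0 / beta)
    y_hat = math.log(t_hat)
    var_lam = var_theta / theta ** 2           # Var(lambda_hat), lambda = ln(theta)
    var_del = var_beta / beta ** 4             # Var(delta_hat), delta = 1/beta
    cov_ld = -cov_tb / (theta * beta ** 2)     # Cov(lambda_hat, delta_hat)
    var_y = var_lam + u ** 2 * var_del + 2.0 * u * cov_ld
    y_lower = y_hat - z_alpha * math.sqrt(var_y)
    return t_hat, math.exp(y_lower)

# Values quoted in the worked-out example (R = 0.90, 95% one-sided confidence).
t_hat, t_lower = weibull_quantile_lower_limit(
    theta=133.51, beta=1.764, var_theta=365.935, var_beta=0.131,
    cov_tb=1.024, R=0.90, z_alpha=1.645)
print(round(t_hat, 2), round(t_lower, 2))
```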
Asymptotic Confidence Interval Estimates of the Weibull Reliability, $R(t)$

Here we make use of the asymptotic properties of the Fisher information matrix to form an approximate, lower, one-sided, standard normal confidence limit on the Weibull reliability, $R(t)$. Dodson and Nolan (1995) provide convenient expressions for generating confidence intervals about $R(t)$ based on the Fisher information matrix. $R_L(t)$, the lower confidence limit on $R(t)$, is of the form

$$P(R(t) \ge R_L(t)) \ge 1 - \alpha$$

where the lower limit, $R_L(t)$, is evaluated using

$$R_L(t) = \exp\left\{-\exp\left(\hat{z} + Z_\alpha\sqrt{\mathrm{Var}(\hat{z})}\right)\right\} \tag{9.84}$$

with $z = \beta \ln(t/\theta)$ and

$$\mathrm{Var}(\hat{z}) = \hat{\beta}^2\left[\frac{\mathrm{Var}(\hat{\theta})}{\hat{\theta}^2} + \frac{\hat{z}^2\,\mathrm{Var}(\hat{\beta})}{\hat{\beta}^4} - \frac{2\hat{z}\,\mathrm{Cov}(\hat{\beta}, \hat{\theta})}{\hat{\theta}\hat{\beta}^2}\right]$$
Worked-out Example 9-2 (cont.): R90C95 requirements on reliability at t = 100 rev using Fisher-matrix expressions

A 95%, one-sided, lower confidence limit on reliability at t = 100 rev is now demonstrated:

$$\hat{z} = \hat{\beta}\ln\left(\frac{t}{\hat{\theta}}\right) = 1.764\,\ln\left(\frac{100}{133.51}\right) = -0.5099$$

$$\mathrm{Var}(\hat{z}) = \hat{\beta}^2\left[\frac{\mathrm{Var}(\hat{\theta})}{\hat{\theta}^2} + \frac{\hat{z}^2\,\mathrm{Var}(\hat{\beta})}{\hat{\beta}^4} - \frac{2\hat{z}\,\mathrm{Cov}(\hat{\beta}, \hat{\theta})}{\hat{\theta}\hat{\beta}^2}\right] = 1.764^2\left[\frac{365.935}{133.51^2} + \frac{(-0.5099)^2(0.131)}{1.764^4} - \frac{2(-0.5099)(1.024)}{133.51 \times 1.764^2}\right] = 0.08262$$

Now,

$$R_{0.95}(100) = \exp\left\{-\exp\left(\hat{z} + Z_{0.05}\sqrt{\mathrm{Var}(\hat{z})}\right)\right\} = \exp\left(-\exp\left(-0.5099 + 1.645\sqrt{0.08262}\right)\right) = 0.3815$$
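The same calculation can be scripted as a check on the hand arithmetic. This is a minimal sketch of Eq. (9.84) using the estimates quoted in the example; the function name `weibull_reliability_lower_limit` and its argument names are assumptions made for illustration.

```python
import math

def weibull_reliability_lower_limit(t, theta, beta, var_theta, var_beta, cov_tb, z_alpha):
    """Approximate lower confidence limit on R(t) from inverse Fisher matrix elements."""
    z = beta * math.log(t / theta)
    var_z = beta ** 2 * (var_theta / theta ** 2
                         + z ** 2 * var_beta / beta ** 4
                         - 2.0 * z * cov_tb / (theta * beta ** 2))
    # Adding Z_alpha * sd(z) raises the cumulative hazard and so lowers the reliability.
    return math.exp(-math.exp(z + z_alpha * math.sqrt(var_z)))

# Values quoted in the worked-out example (t = 100 rev, 95% one-sided confidence).
r_lower = weibull_reliability_lower_limit(
    t=100.0, theta=133.51, beta=1.764, var_theta=365.935,
    var_beta=0.131, cov_tb=1.024, z_alpha=1.645)
print(round(r_lower, 4))
```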
LR Confidence Intervals on Weibull Percentile, $t_R$

Upon rearrangement of Eq. (9.79),

$$\theta = \frac{t_R}{(-\ln R)^{1/\beta}} \tag{9.85}$$

Substitution of this relation into $\ln L(\theta, \beta)$ results in a log-likelihood function that is expressed in terms of just $t_R$ and $\beta$, as $\ln L(t_R, \beta)$. A $(1-\alpha)$ LR confidence interval on $t_R$ consists of all values of $t_R$ satisfying

$$\ln L^*(t_R) \ge \ln L^* - \tfrac{1}{2}\chi^2_{1,\alpha} \tag{9.86}$$

where

$$\ln L^*(t_R) = \max_{\beta} \ln L(t_R, \beta) = \ln L(t_R, \hat{\beta}(t_R)) \tag{9.87}$$

To obtain the partial likelihood estimate, $\hat{\beta}(t_R)$, we must solve $\partial \ln L(t_R, \beta)/\partial \beta = 0$, which leads to the expression (see Lawless, 1982, p. 174)

$$\frac{r}{\beta} - r\ln t_R + \sum_{i=1}^{n}\delta_i \ln t_i + \ln R\sum_{i=1}^{n}\left(\frac{t_i}{t_R}\right)^{\beta}\ln\left(\frac{t_i}{t_R}\right) = 0 \tag{9.88}$$

This procedure is described in greater detail in the worked-out problem that follows.

Worked-out Example 9-2 (cont.): Likelihood intervals on $t_R$
The C = 0.90 contour consists of all values of $t_R$ and $\sigma$ that satisfy

$$\ln L(t_R, \sigma) = -18.575$$

This contour is shown in Figure 9-14. Graphical LR limits are estimated as $t_{R,U} = 58.5$ and $t_{R,L} = 19.5$.
FIGURE 9-14 90% likelihood contours on $\beta$ and $t_R$. Also shown: partial likelihood relation, $\hat{\beta}(t_R)$, and graphical LR limits on $t_R$.
Note: We are really only interested in knowing the lower confidence limit (how short our life might be!). Formally, LR limits on $t_R$ are obtained by solving the nonlinear optimization problem contained in Table 9-6. We now outline the use of Excel's Tools > Goal Seek or Solver procedure for identifying a lower LR limit on $t_R$ (a 95% lower limit in this case). We initialize our search with the graphical estimate $t_{R,L} = 19.5$. The search involves two steps, which must be repeated many times before convergence is obtained. First, the Excel Tools > Goal Seek or Solver procedure must be used to identify the conditional likelihood estimate, $\hat{\beta}(t_R)$, by searching for a solution to $\partial \ln L/\partial \beta = 0$. The second step involves another Goal Seek or Solver procedure to search for a $t_R$-value located on the C = 0.90 likelihood contour. A macro written to alternate between these two procedures is illustrated in Figure 9-15. A solution was found, although this search procedure is "quite delicate," and as such, the author cannot guarantee that this procedure will always run smoothly. It is evident that more research needs to be done from the practitioner's viewpoint on implementing this procedure. The likelihood ratio-based confidence limit was identified as

$$P(t_R \ge 19.89018) \ge 0.95$$

FIGURE 9-15 Use of Excel Tools > Goal Seek procedure and array formulas to identify a lower likelihood ratio interval estimate of $t_{0.90,0.10}$.
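The two-step search can be prototyped with nested one-dimensional root finders, which sidesteps some of the delicacy of alternating Goal Seek runs. The sketch below is an illustration under stated assumptions, not the author's macro: the data are hypothetical, the helper names are invented for the example, and the brackets passed to the root finders were chosen by hand. The inner step solves Eq. (9.88) for $\hat{\beta}(t_R)$; the outer step finds the $t_R$ below the ML estimate at which the profile log-likelihood meets the C = 0.90 contour level.

```python
import math

def bisect(f, lo, hi, iters=100):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def ternary_max(f, lo, hi, iters=100):
    """Maximize a unimodal function on [lo, hi] by ternary search."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)

def loglik(t_r, beta, R, times, failed):
    """Weibull log-likelihood in (t_R, beta), after substituting Eq. (9.85) for theta."""
    ll = 0.0
    for t, d in zip(times, failed):
        if d:
            ll += (math.log(beta) - beta * math.log(t_r)
                   + math.log(-math.log(R)) + (beta - 1.0) * math.log(t))
        ll += math.log(R) * (t / t_r) ** beta
    return ll

def beta_given_tr(t_r, R, times, failed):
    """Step 1: solve Eq. (9.88) for beta at a fixed t_R (the score decreases in beta)."""
    r = sum(failed)
    def score(beta):
        s = r / beta
        for t, d in zip(times, failed):
            u = math.log(t / t_r)
            s += d * u + math.log(R) * (t / t_r) ** beta * u
        return s
    return bisect(score, 1e-3, 50.0)

def profile(t_r, R, times, failed):
    return loglik(t_r, beta_given_tr(t_r, R, times, failed), R, times, failed)

# Hypothetical data (not the worked-out example): 8 failures, 2 units censored at 200.
times = [42.0, 68.0, 85.0, 101.0, 118.0, 133.0, 150.0, 172.0, 200.0, 200.0]
failed = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
R = 0.90

t_hat = ternary_max(lambda x: profile(x, R, times, failed), 1.0, max(times))
target = profile(t_hat, R, times, failed) - 0.5 * 2.71   # C = 0.90 contour level

# Step 2: locate the contour crossing below the ML estimate (a 95% lower limit).
t_lower = bisect(lambda x: profile(x, R, times, failed) - target, 1.0, t_hat)
print(t_hat, t_lower)
```

Because each step is a bracketed one-dimensional search, the result does not depend on a starting guess, although the brackets themselves still have to contain the solution.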
9.3 EXERCISES

1. What advantages and disadvantages are there in the use of maximum likelihood techniques for estimating the properties of Weibull, (log-) normal, and extreme-value distributional data?

2. The following singly censored data set with n = 8 was presented in the Exercise set that accompanied Chapter 4. The last reading is a censored observation due to the fact that the test was stopped at t = 1500 hr.

   440   535   676   781   868   953   1225   1550+

   For parts (a) and (b), attempt to answer the questions directly using reliability analysis software. Next, repeat this using Microsoft Excel to develop either LR or Fisher-matrix intervals.

   a. Fit the data to the normal, lognormal, and Weibull distributions. Develop 95% two-sided confidence intervals on the parameters of the distributions.
   b. Find a lower 95% confidence limit on the $B_{10}$ ($t_{0.90}$) life.
   c. Review the use of Monte Carlo (MC) simulation techniques, which are discussed in Chapter 5. Develop MC confidence limits for the Weibull case, and compare your answers to (a) and (b).
APPENDIX 9A ALGORITHM BY WOLYNETZ (1979) FOR OBTAINING ML ESTIMATES OF NORMAL PARAMETERS, $\mu$ AND $\sigma$

9A.1 FORMAL ALGORITHM FOR ESTIMATING $\mu$ AND $\sigma$

The system of equations that must be solved to obtain ML estimates of the normal parameters, $\mu$ and $\sigma$, is presented earlier (see §9.1.2) and reproduced here:

$$\sum_{i=1}^{n}\delta_i z_i + \sum_{i=1}^{n}(1-\delta_i)\lambda(z_i) = 0 \tag{9.14}$$

$$-r + \sum_{i=1}^{n}\delta_i z_i^2 + \sum_{i=1}^{n}(1-\delta_i)z_i\lambda(z_i) = 0 \tag{9.15}$$

For convenience, as we describe shortly, Eq. (9.15) is modified into the equivalent requirement:

$$-r + \sum_{i=1}^{n}\delta_i z_i^2 + \sum_{i=1}^{n}(1-\delta_i)\lambda^2(z_i) + \sum_{i=1}^{n}(1-\delta_i)z_i\lambda(z_i) - \sum_{i=1}^{n}(1-\delta_i)\lambda^2(z_i) = 0 \tag{9A.1}$$
We now describe the algorithm by Wolynetz (1979), which assures that a solution will be obtained. We begin by introducing the notation:

$$w_i = \begin{cases} t_i & \delta_i = 1 \text{ (recorded failure)} \\ E(t \mid t \ge t_i) = \mu + \sigma\lambda(z_i) & \delta_i = 0 \text{ (censored)} \end{cases} \tag{9A.2}$$

Referencing Eq. (9A.2), it is evident that the newly defined $w_i$ terms ought to be regarded as adjusted failure times. That is, for an observed failure, $w_i = t_i$, the failure time. For censored observations, $w_i$ represents an adjustment of the censored time upward to what the failure time might be expected to be if censoring had not occurred. In terms of the $w_i$, $i = 1, 2, \ldots, n$, random variables, substitution of

$$\frac{w_i - \mu}{\sigma} = \begin{cases} z_i & \text{if } \delta_i = 1 \text{ (recorded failure)} \\ \lambda(z_i) & \text{if } \delta_i = 0 \text{ (censored)} \end{cases} \tag{9A.3}$$

into Eqs. (9.14) and (9A.1) results in the transformed requirements:

$$\sum_{i=1}^{n}\frac{w_i - \mu}{\sigma} = 0 \;\Rightarrow\; \hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} w_i = \bar{w} \tag{9A.4}$$

and

$$-r + \sum_{i=1}^{n}\left(\frac{w_i - \mu}{\sigma}\right)^2 - \sum_{i=1}^{n}(1-\delta_i)\dot{\lambda}(z_i) = 0 \;\Rightarrow\; \hat{\sigma}^2 = \frac{\sum_{i=1}^{n}(w_i - \hat{\mu})^2}{r + \sum_{i=1}^{n}(1-\delta_i)\dot{\lambda}(z_i)} \tag{9A.5}$$

TABLE 9-11 Procedures for Obtaining ML Estimates of $\mu$ and $\sigma^2$

1. Form initial estimates of $\mu$ and $\sigma$. One might begin with estimates obtained from the use of normal plotting paper.
2. Evaluate $w_i$, $i = 1, 2, \ldots, n$, using Eq. (9A.2).
3. Update estimates of $\mu$ and $\sigma$ with the use of Eqs. (9A.4) and (9A.5).
4. If the updated values are within the level of accuracy desired, then stop the evaluation process; otherwise, go back to step 2.
In Eq. (9A.5) we introduce $\dot{\lambda}(z_i) = \lambda(z_i)(\lambda(z_i) - z_i)$, the derivative of $\lambda(z_i)$. The use of Eqs. (9A.4) and (9A.5) introduces an iterative procedure for identifying the ML estimators, $\hat{\mu}$ and $\hat{\sigma}$. The algorithm is stable, and convergence is always obtained. Wolynetz (1979) reports that other, more general approaches based on the use of Newton–Raphson methods are not as reliable. The ML estimation procedure can be readily extended to all types of censoring situations, including left- and interval-censored data. Wolynetz (1979) provides a Fortran subprogram for arriving at ML estimates of $\mu$ and $\sigma$ under a set of generalized censoring conditions. We illustrate the use of the algorithm given by Table 9-11 for the worked-out example presented in §9.1.3. The details are described in Figure 9-16.
9A.2 ILLUSTRATED USE OF FORMAL ALGORITHM FOR ESTIMATING $\mu$ AND $\sigma$

Per step 1, we initialized our estimates of $\hat{\mu}$ and $\hat{\sigma}$ with $\hat{\sigma} = 19.61$ and $\hat{\mu} = 161.2$, the rank regression estimates of $\mu$ and $\sigma$. The iterative procedure was implemented in an Excel spreadsheet to evaluate Eqs. (9A.2), (9A.4), and (9A.5). The details are shown in Figure 9-16. Convergence was obtained in five or six iterations. The ML point estimates are

$$\hat{\mu} = 159.85 \text{ hundred-hr} \quad \text{and} \quad \hat{\sigma} = 15.47 \text{ hundred-hr}$$

FIGURE 9-16 Use of iterative method [see Eqs. (9A.4) and (9A.5)] for generating normal ML estimates of $\mu$ and $\sigma$.

This result matches the result obtained with the use of the "easy" ML estimation procedure described in §9.1.3.
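The four steps of Table 9-11 translate almost line for line into code. The following is a minimal sketch for right-censored data only; it is not the Fortran subprogram of Wolynetz (1979), the data set and starting values are hypothetical, and the names `hazard` and `wolynetz` are chosen for illustration. Here $\lambda(z)$ is taken to be the standard normal hazard, $\phi(z)/[1 - \Phi(z)]$, which is consistent with $E(t \mid t \ge t_i) = \mu + \sigma\lambda(z_i)$.

```python
import math

def hazard(z):
    """Standard normal hazard, lambda(z) = phi(z) / (1 - Phi(z))."""
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    survival = 0.5 * math.erfc(z / math.sqrt(2.0))
    return pdf / survival

def wolynetz(times, failed, mu, sigma, tol=1e-8, max_iter=500):
    """Iterate Eqs. (9A.2), (9A.4), and (9A.5) from initial estimates mu, sigma."""
    n, r = len(times), sum(failed)
    for _ in range(max_iter):
        z = [(t - mu) / sigma for t in times]
        lam = [hazard(zi) for zi in z]
        # Step 2: adjusted failure times w_i, Eq. (9A.2).
        w = [t if d else mu + sigma * l for t, d, l in zip(times, failed, lam)]
        # Step 3: updates, Eqs. (9A.4) and (9A.5), with lambda_dot = lambda * (lambda - z).
        mu_new = sum(w) / n
        denom = r + sum(l * (l - zi) for zi, l, d in zip(z, lam, failed) if not d)
        sigma_new = math.sqrt(sum((wi - mu_new) ** 2 for wi in w) / denom)
        # Step 4: stop when both estimates have settled.
        done = abs(mu_new - mu) < tol and abs(sigma_new - sigma) < tol
        mu, sigma = mu_new, sigma_new
        if done:
            break
    return mu, sigma

# Hypothetical data: 8 recorded failures and 2 units censored at 180.
times = [130.0, 142.0, 148.0, 155.0, 160.0, 163.0, 170.0, 175.0, 180.0, 180.0]
failed = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
mu_hat, sigma_hat = wolynetz(times, failed, mu=160.0, sigma=20.0)
print(mu_hat, sigma_hat)
```

A useful check on any implementation is to substitute the converged estimates back into Eqs. (9.14) and (9.15); both left-hand sides should be numerically zero.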
APPENDIX 9B PROOF: THE EXPONENTIAL TOTAL UNIT TIME ON TEST VARIABLE, T, FOLLOWS A GAMMA $(r, \lambda)$ DISTRIBUTION

For a singly, right-censored test data set, the total unit time on test may be expressed as follows:

$$\begin{aligned} T &= t_1 + t_2 + \cdots + t_r + (n-r)t_r \\ &= nt_1 + (n-1)(t_2 - t_1) + (n-2)(t_3 - t_2) + \cdots + (n-r+1)(t_r - t_{r-1}) \\ &\equiv nY_1 + (n-1)Y_2 + (n-2)Y_3 + \cdots + (n-r+1)Y_r \\ &\equiv Q_1 + Q_2 + \cdots + Q_r \end{aligned} \tag{9B.1}$$

Lemma. T is distributed gamma $(r, \lambda)$.

The proof is conducted in three parts:

1. Show that each $Y_i$ is distributed according to an exponential distribution with parameter $(n-i+1)\lambda$.
2. Show that the $Q_i$ random variables are distributed according to an exponential distribution with parameter $\lambda$.
3. Show that T is distributed gamma $(r, \lambda)$.

Part 1

At any failure epoch, $t_{i-1}$, $i = 2, 3, \ldots, r$, $(i-1)$ units have failed, leaving $(n-i+1)$ operational units on test, each competing to be the next one to fail. For each of the $(n-i+1)$ survivors, let $T_{i,j}$ = the remaining life of the jth unit that remains on test, $j = 1, 2, \ldots, (n-i+1)$, at time $t_{i-1}$. By the memoryless property of the exponential distribution, each $T_{i,j}$ is independent and distributed according to an exponential distribution with parameter $\lambda$. Then, $Y_i = \min(T_{i,1}, T_{i,2}, \ldots, T_{i,n-i+1})$, whose distribution is determined using

$$\begin{aligned} P(Y_i \ge y) &= P(T_{i,1} \ge y,\, T_{i,2} \ge y,\, \ldots,\, T_{i,n+1-i} \ge y) \\ &= \underbrace{R(y)R(y)\cdots R(y)}_{n+1-i \text{ terms}} \\ &= \exp(-\lambda y)\exp(-\lambda y)\cdots\exp(-\lambda y) = \exp(-(n+1-i)\lambda y) \end{aligned} \tag{9B.2}$$
Thus, $Y_i$ is distributed according to an exponential distribution with parameter $(n+1-i)\lambda$.

Part 2

The next step is to consider each of the $Q_i = (n+1-i)Y_i$, $i = 1, 2, \ldots, r$, terms in Eq. (9.26). First we express the distribution function of $Q_i$ in terms of $Y_i$:

$$F_{Q_i}(q) = P(Q_i \le q) = P[(n+1-i)Y_i \le q] = P[Y_i \le q/(n+1-i)] \tag{9B.3}$$

In part 1 we established that $Y_i$ is distributed exponential with parameter $(n+1-i)\lambda$, with distribution function $F_{Y_i}(y_i) = 1 - \exp(-(n+1-i)\lambda y_i)$. Accordingly,

$$P[Y_i \le q/(n+1-i)] = 1 - \exp[-(n+1-i)\lambda q/(n+1-i)] = 1 - \exp(-\lambda q) \tag{9B.4}$$

Note: In Eq. (9B.4), we substituted $y_i = q/(n+1-i)$. We recognize the resultant expression on the right-hand side of Eq. (9B.4) as the distribution function of an exponential $(\lambda)$ random variable. Thus $Q_i$, $i = 1, 2, \ldots, r$, is distributed exponential $(\lambda)$.

Part 3

T is the sum of r independent, identically distributed, exponential random variables. By definition, T follows a gamma distribution with parameters r and $\lambda$. The gamma $(r, \lambda)$ distribution, sometimes referred to as an Erlang $(r, \lambda)$ distribution, is used frequently in operations research to model the time to serve r people waiting in line, each being serviced according to the same exponential (memoryless) time-to-service distribution.
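The lemma can be checked informally by simulation. The sketch below is only a sanity check, not part of the proof: it repeatedly simulates a failure-censored exponential life test and compares the sample mean and variance of T with the gamma $(r, \lambda)$ values, $r/\lambda$ and $r/\lambda^2$. The sample size, censoring number, rate, seed, and the function name `total_time_on_test` are arbitrary choices made for the illustration.

```python
import random

def total_time_on_test(n, r, rate, rng):
    """Simulate one test of n units stopped at the r-th failure; return T."""
    lifetimes = sorted(rng.expovariate(rate) for _ in range(n))
    # r observed failure times plus (n - r) survivors, each credited with t_r.
    return sum(lifetimes[:r]) + (n - r) * lifetimes[r - 1]

rng = random.Random(12345)
n, r, rate, reps = 10, 5, 0.01, 20000

samples = [total_time_on_test(n, r, rate, rng) for _ in range(reps)]
mean_T = sum(samples) / reps
var_T = sum((s - mean_T) ** 2 for s in samples) / (reps - 1)

print("mean:", mean_T, "gamma mean r/lambda:", r / rate)
print("variance:", var_T, "gamma variance r/lambda^2:", r / rate ** 2)
```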
10 Comparing Designs

Vive la différence!

In this chapter we look at comparing one design to another, one data set to another, etc. During design and advanced product development, two or more designs might be under consideration. The analyst would like to know if one particular design fares better than another with respect to various performance measures and reliability. Once a product enters production, the need to make comparisons continues, particularly during ongoing efforts to improve existing product and process designs. The need continues as the production phase begins, and a warranty and/or field data analyst is assigned the task of investigating differences in field performance due to customer usage, geography, etc. Several very different approaches have been proposed for investigating differences. We survey the following methods:

1. Graphical comparisons of probability and rank regression plots
2. Use of quantile–quantile (Q–Q) plots
3. Use of likelihood ratio methods including contour plots
4. Approximate F-tests

In most cases our comparisons reduce to a simple hypothesis test on a parameter of interest:

$$H_0: \theta_1 = \theta_2 \quad \text{versus} \quad H_a: \theta_1 \ne \theta_2 \tag{10.1}$$
10.1 GRAPHICAL PROCEDURES BASED ON PROBABILITY OR RANK REGRESSION PLOTS

Probability plots are used to parametrically assess the goodness-of-fit of a particular data set to a hypothetical distribution. Similarly, they can be used to visually assess differences among multiple data sets. Minitab V12.1 and WinSmith are among the popular statistical computing packages that provide capabilities for simultaneous display of two or more probability plots, along with display of 95% beta-binomial or Fisher-matrix confidence bands. Once these are displayed, the analyst can look for an overlap between the displayed data sets. A lack of overlap is taken as evidence that significant differences do exist. We demonstrate this using illustrated examples from a simulated data group. The following case studies are illustrated.

Illustrated Example 10-1: Use of probability plots to discern differences
1. A complete separation of two data sets. No overlap at all of the confidence bands between two data sets provides strong evidence of overall differences. We illustrate such a case in Figure 10-1, wherein normal plots of two data sets, $N(2080, 10^2)$ and $N(2150, 10^2)$, are simultaneously displayed.

2. Left-tail separation around $B_{10}$ life. Often we recognize that we are really interested in the left tail of the distribution. (We should never be interested in right-tail comparisons, as all design/data should be deemed unacceptable if this were the case!) In such cases Abernethy (1996) recommends looking for a separation of confidence bands around a lower B life of interest, such as $B_{10}$. In Figure 10-2 lognormal plots of two complete data sets, Ln(55, 1.1) and Ln(347, 1), are presented. Notice the gap between the two confidence bands around $B_{10}$ and higher. The two data sets are judged to be significantly different, at least at the $B_{10}$ life, in this case.

3. Median $B_{10}$ fit does not reside within confidence bands of other data sets. This is a very weak test for differences. We illustrate this in Figure 10-3, wherein Weibull plots of two complete data sets, Weibull(1.1, 500) and Weibull(1.1, 2000), are simultaneously displayed. Despite large differences in the location parameters, we still find overlap in the lower tail regions. Here the median fit of each data set does not reside in the confidence bands of the other data set in the right-tail region; however, overlap might occur in the left-tail region (see Figure 10-3). Based on Monte Carlo simulation runs, Abernethy (1996) does not recommend this criterion for visual assessment of differences.

FIGURE 10-1 Normal plots of two data sets, $N(2080, 10^2)$ and $N(2150, 10^2)$, generated by Minitab V12.1.

FIGURE 10-2 Lognormal plots of two complete data sets, Ln(55, 1.1) and Ln(347, 1.1).

FIGURE 10-3 Weibull plots of two complete data sets, Weibull(500, 1.1) and Weibull(2000, 1.1).
10.2 Q–Q PLOTS

The development of empirical Q–Q plots is presented in Appendix 7A to Chapter 7. For data sets X and Y, a Q–Q plot is a Cartesian plot of the quantile pairs $Q_x(p)$ and $Q_y(p)$. The quantile relationships are derived by inverting the rank estimators, $\hat{G}(x_p) = p$ and $\hat{G}(y_p) = p$. For equally sized groups, Q–Q plots are easily constructed by first sorting both data groups in ascending order. The Q–Q plot is then just a plot of the ordered pairs $(X_i, Y_i)$, $i = 1, 2, \ldots, n$. For unequal-size groups, the quantile relationship associated with the larger sample must be interpolated in order to identify $Q_x(p)$ and $Q_y(p)$ pairs. Techniques for accomplishing this are discussed in Appendix 7A to Chapter 7.

Once the plot is constructed, a simple linear regression on the Q–Q data can be used to assess the similarities of the two data sets without any assumption on the underlying distributions. Assessment of the degree of similarity between X and Y can be based on the following guidelines:

1. A highly linear relationship provides strong evidence that both X and Y are from the same location-scale distribution type.
2. A satisfactory fit of the data through the origin provides evidence that the location parameters are of nearly equal magnitude; $\theta_x = \theta_y$.
3. A satisfactory regression fit with slope nearly 45° provides evidence that the scale parameters are of nearly equal magnitude; $\sigma_x = \sigma_y$.
4. X and Y are identically distributed if the linearity of the regression fit is sufficient to describe the association between $Q_x(p)$ and $Q_y(p)$, and the straight-line fit is of nearly 45° slope and running through the origin.
To understand why we use these guidelines, we make use of a parametric assumption on the distributions of X and Y. Without loss of generality, we assume that the distributions of X and Y are modeled by the (log-) location-scale distribution forms, $G_x((x - \theta_x)/\sigma_x)$ and $G_y((y - \theta_y)/\sigma_y)$, respectively. The true quantile–quantile relationship between X and Y is obtained as follows:

$$G_x\left(\frac{Q_x(p) - \theta_x}{\sigma_x}\right) = p \;\Rightarrow\; Q_x(p) = \sigma_x G_x^{-1}(p) + \theta_x \tag{10.2}$$

$$Q_y(p) = \sigma_y G_y^{-1}(p) + \theta_y \tag{10.3}$$

From Eqs. (10.2) and (10.3), we observe that if

1. $G_x(\cdot) = G_y(\cdot)$,
2. $\theta_x = \theta_y$, and
3. $\sigma_x = \sigma_y$,

then $Q_x(p) = Q_y(p)$. This is the essence of our argument for using Q–Q plots in this application. The Q–Q relationship is likely to appear linear if (1) is satisfied. Furthermore, if (2) is satisfied, the regressed relationship is likely to run through the origin. Finally, if (3) is satisfied, the slope of the fit should be close to 1.0.

A more practical implementation of Q–Q plots is to simply plot the relationship using identical scales on the x- and y-axes and overlay a 45° line on the plot. Any abnormal scatter of the data away from this relationship is indicative of differences. An analysis of variance of a regression fit without intercept ("through the origin") can be used to examine the significance of the fit, $\hat{\beta}\,(\text{slope}) = 1$. We describe this using several illustrated examples from simulated data.
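For equally sized groups, the Q–Q construction and the through-the-origin regression amount to only a few lines of code. The sketch below is illustrative: the function name `qq_fit_through_origin` is an assumption, the two samples are simulated rather than taken from the figures, and the t-statistic is the ordinary regression statistic for the hypothesis slope = 1, with no adjustment for the ordering of the data.

```python
import math
import random

def qq_fit_through_origin(x, y):
    """Pair the sorted samples and regress sorted y on sorted x with no intercept.

    Returns the slope and its ordinary least-squares standard error.
    Both samples must have the same size.
    """
    xs, ys = sorted(x), sorted(y)
    n = len(xs)
    sxx = sum(a * a for a in xs)
    slope = sum(a * b for a, b in zip(xs, ys)) / sxx
    sse = sum((b - slope * a) ** 2 for a, b in zip(xs, ys))
    std_err = math.sqrt(sse / (n - 1) / sxx)
    return slope, std_err

# Two simulated samples from the same Weibull distribution (scale 500, shape 1.2).
rng = random.Random(1)
sample1 = [rng.weibullvariate(500.0, 1.2) for _ in range(30)]
sample2 = [rng.weibullvariate(500.0, 1.2) for _ in range(30)]

slope, std_err = qq_fit_through_origin(sample1, sample2)
t_stat = (slope - 1.0) / std_err   # test of slope = 1 (a 45-degree line)
print(slope, std_err, t_stat)
```

Plotting the sorted pairs against a 45° line, on identical axis scales, gives the visual counterpart of this fit.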
Illustrated Example 10-2: Use of Q–Q plots to discern differences

1. Figure 10-4. Q–Q plots of the normal data sets from Figure 10-1 are constructed. The plotted points are not even near the 45° line in this case. A statistical test will not provide any more useful information. However, output from the regression ANOVA is presented.

2. Figure 10-5. Q–Q plots of the two lognormal data sets of Figure 10-2 are constructed. Again, the plotted points are not even near the 45° line in this case. A statistical test will not provide any more useful information. However, output from the regression ANOVA is presented.

3. Figure 10-6. Q–Q plots of the two Weibull data sets of Figure 10-3 are constructed. Again, the plotted points are not even near the 45° line in this case. The regression ANOVA output is presented. The underlying hypothesis, $\hat{\beta} = 1$, is soundly rejected by the t-statistic:

   $$t_{\hat{\beta}=1} = \frac{3.134 - 1}{0.0487} = 43.8 \;\rightarrow\; p \approx 0.000$$

4. Figure 10-7. Q–Q plots of Weibull data wherein both groups are sampled from a Weibull(500, 1.2) distribution are constructed. In this case the scatter about the 45° line does not appear to be totally random. However, the regression ANOVA does not provide any strong evidence to conclude that a 45° fit is not satisfactory. The regression equation is Sample2 = 0.954 Sample1. The t-statistic on the underlying hypothesis $\hat{\beta} = 1$ is not significant:

   $$t_{\hat{\beta}=1} = \frac{0.9537 - 1}{0.06663} = -0.69 \;\rightarrow\; p \approx 0.61$$

FIGURE 10-4 Q–Q plots of two complete normal data sets of Figure 10-1.

FIGURE 10-5 Q–Q plots of two lognormal data sets of Figure 10-2.

FIGURE 10-6 Q–Q plots of two Weibull data sets of Figure 10-3.

FIGURE 10-7 Q–Q plots of Weibull data. Both groups sampled from Weibull(500, 1.2).

10.2.1 Technical Note: Use of Q–Q Plots
Fisher (1983) recommends using Q–Q plots only when sample sizes exceed 30. Generally speaking, probability plots can be influenced considerably by the vagaries of random sampling for sample sizes much less than 30, so any inferences based on plots with substantially fewer data would be very tenuous. Q–Q plots are rather sensitive to differences in the tail regions of the distributions. This is because the quantile is a rapidly changing function of p where the density is sparse (Wilk and Gnanadesikan, 1968, p. 5; Gerson, 1975). The reader might also come across a variant of Q–Q plots referred to as P–P plots. P–P plots are a plot of the ordered pairs $\hat{F}_1(t)$ and $\hat{F}_2(t)$. These points can be picked off the probability plots for both data sets. Fisher (1983) notes that P–P plots have greater sensitivity to differences in the center regions of the distributions.

10.2.2 Inferential Statistics for Using Q–Q Plots
Jayasheela (2002) has reported very high type I errors when using regression t-statistics. This is due to the fact that we have induced an artificial association between the two regressor variables when we order both data sets. Consequently, our t-statistic, $t_0 = (\hat{\beta} - 1)/s_{\hat{\beta}}$, is excessively large due to the artificially deflated value of the standard error of $\hat{\beta}$. To work around these difficulties, we follow the approach by Tarum (1999) and make use of Monte Carlo simulation techniques to identify suitable critical values of regression statistics for making inferences on the goodness of the Q–Q plot fit. Note that Tarum (1999) used simulation techniques for identifying critical values of the regression $R^2$ statistic for assessing the goodness-of-fit of rank regression fits. Rank regression fits may be regarded as Q–Q plots of the (logged-) data values regressed upon the inverse transformed values of the median ranks. That is, probability plots are inherently nothing more than Q–Q plots of transformations of actual data values versus the theoretical ranks.

Jayasheela (2002) has developed graphs of critical values of t-statistics and $R^2$ values for Weibull data. Several of these relationships are reproduced in Figures 10-8 and 10-9.

FIGURE 10-8 Plot of t-statistic values for replicates from Weibull distribution (1.2, 1000) at 90% and 95% confidence levels.

FIGURE 10-9 Plot of $R^2$ values for replicates from Weibull distribution (1.2, 1000) at 90% and 95% confidence levels.
10.3 USE OF LIKELIHOOD THEORY FOR ASSESSING DIFFERENCES

Following the notation of Chapter 9, we assume that both data sets (X and Y) follow a particular two-parameter (log-) location-scale distribution form, with parameters $\alpha_x, \beta_x$ and $\alpha_y, \beta_y$, respectively. The underlying hypothesis test for overall differences reduces to the following joint hypothesis:

$$H_0: \alpha_x = \alpha_y \text{ and } \beta_x = \beta_y \quad \text{versus} \quad H_a: \alpha_x \ne \alpha_y \text{ and/or } \beta_x \ne \beta_y$$

The underlying likelihood ratio test statistic is of the form

$$2(\ln L_X + \ln L_Y - \ln L_{X+Y})$$

which follows a $\chi^2_2$ distribution under $H_0$, where

$$\ln L_X = \max_{\alpha_X, \beta_X} \ln L_X(\alpha_X, \beta_X) \quad \text{(max log-likelihood of X data set)}$$

$$\ln L_Y = \max_{\alpha_Y, \beta_Y} \ln L_Y(\alpha_Y, \beta_Y) \quad \text{(max log-likelihood of Y data set)}$$

$$\ln L_{X+Y} = \max_{\alpha_{X+Y}, \beta_{X+Y}} \ln L_{X+Y}(\alpha_{X+Y}, \beta_{X+Y}) \quad \text{(max log-likelihood of merger of X, Y data sets)}$$

Therefore, we reject $H_0$ if

$$2(\ln L_X + \ln L_Y - \ln L_{X+Y}) > \chi^2_{2,\alpha} \tag{10.4}$$
Note: Through simulation studies, Fulton (1999) recommends replacing the multiplier 2 in Eq. (10.4) with 2FF, where FF = (N − 1)/(N + 0.618) and N is the combined sample size, $n_X + n_Y$.

It is also possible to graphically examine for differences with the use of likelihood contours. Under $H_0$, we are constraining both $\alpha$ and $\beta$ for each distribution, leaving zero degrees of freedom under $H_0$. Under the alternative, both $\alpha$ and $\beta$ can freely vary, so we have 2 degrees of freedom. The difference, 2, is the number of degrees of freedom to use in constructing contours. That is, each contour is constructed according to the relationship

$$\ln L(\alpha, \beta) = \ln L^* - \tfrac{1}{2}\chi^2_{2,\alpha} \tag{10.5}$$
We interpret the contour plots as follows: The data sets are judged to be distinctly different if the contours for X and Y do not overlap.

Illustrated Example 10-3

We consider the Weibull data sets of Figure 10-3. For these data sets, WinSmith reports the summary statistics for data sets X, Y, and X + Y (combined) shown in Table 10-1.

TABLE 10-1 WinSmith Analysis of Weibull Data Sets X, Y, and X + Y (Combined)

| Data set | n | $\beta$ | $\theta$ | $\ln L^*$ |
|---|---|---|---|---|
| X | 30 | 1.133 | 576.4 | −218.9272 |
| Y | 30 | 1.098 | 1777 | −253.1754 |
| X + Y | 60 | 0.9331 | 1093 | −481.6183 |

For a level of confidence of C = 90%, $\chi^2_{2,0.10} = 4.605$, and so $0.5\,\chi^2_{2,0.10} = 2.303$. The contours for data sets X and Y satisfy [see Eq. (10.5)]

$$\ln L_X(\beta_X, \theta_X) = -218.9272 - 2.303$$

$$\ln L_Y(\beta_Y, \theta_Y) = -253.1754 - 2.303$$

The likelihood contours at C = 90% are displayed in Figure 10-10. Note the clear separation of the contours. This provides a strong visual indication of differences between the two groups. Based on the LR test [see Eq. (10.4)], we reject $H_0$ and conclude that differences do exist if

$$2(\ln L_X + \ln L_Y - \ln L_{X+Y}) > \chi^2_{2,\alpha}$$

or

$$2(-218.9272 - 253.1754 + 481.6183) = 19.03 > \chi^2_{2,0.10} = 4.605$$

And, as we see, there is strong evidence that differences do exist.

FIGURE 10-10 90% likelihood contours of data sets X and Y of Figure 10-3.
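The LR test of Eq. (10.4) needs only the three maximized log-likelihoods. A small sketch follows, using the values from Table 10-1; the function name `lr_test_two_groups` and the optional `n_total` argument (which applies the Fulton factor FF to the multiplier) are illustrative assumptions. For 2 degrees of freedom, the chi-square upper-tail probability has the closed form $\exp(-x/2)$, so no statistical library is required.

```python
import math

def lr_test_two_groups(lnL_x, lnL_y, lnL_pooled, n_total=None):
    """Likelihood ratio statistic for H0: both groups share the same two parameters.

    If n_total is given, the multiplier 2 is replaced by 2*FF,
    with FF = (N - 1) / (N + 0.618).
    """
    multiplier = 2.0
    if n_total is not None:
        multiplier *= (n_total - 1) / (n_total + 0.618)
    stat = multiplier * (lnL_x + lnL_y - lnL_pooled)
    p_value = math.exp(-0.5 * stat)   # chi-square upper tail, 2 degrees of freedom
    return stat, p_value

# Maximized log-likelihoods from Table 10-1.
stat, p_value = lr_test_two_groups(-218.9272, -253.1754, -481.6183)
stat_ff, p_value_ff = lr_test_two_groups(-218.9272, -253.1754, -481.6183, n_total=60)

print(round(stat, 2), p_value)        # unadjusted statistic
print(round(stat_ff, 2), p_value_ff)  # with the FF adjustment
print("reject H0 at C = 90%:", stat > 4.605)
```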
10.4 APPROXIMATE F-TEST FOR DIFFERENCES: WEIBULL AND EXPONENTIAL DISTRIBUTION

We consider the test for differences on the exponential MTTF parameter, for which exact results are available based on the use of the ML estimate $\hat{\theta} = T/r$ (see Chapter 4, §4.7.1).

10.4.1 Test for Differences in the Exponential MTTF Parameter, $\theta$

In Chapter 9, §9.2.1, we prove that for type II, singly right-censored exponential data,

$$\frac{2T}{\theta} \sim \chi^2_{2r} \tag{10.6}$$

where T is the total unit time on test and r is the number of recorded failures in a sample of size n placed on test. If we substitute the ML estimate, $\hat{\theta} = T/r$, we have

$$\frac{2r\hat{\theta}}{\theta} \sim \chi^2_{2r} \tag{10.7}$$

Note: For type I, time-censored data, the degrees of freedom in the chi-square quantities in Eqs. (10.6) and (10.7) must be adjusted to the approximation of 2(r + 1) degrees of freedom.

To test the hypothesis suggested by Eq. (10.1), we make use of the F-distribution:

$$F_{\nu_1,\nu_2} = \frac{\chi^2_{\nu_1}/\nu_1}{\chi^2_{\nu_2}/\nu_2}$$

Accordingly, the F-test for differences becomes

$$\frac{\hat{\theta}_1}{\hat{\theta}_2} \sim F_{2r_1,2r_2} \tag{10.8}$$
Illustrated Example 10-4

Data from two exponential life tests have $\text{MTTF}_1 = 10{,}000$ cycles based on $r_1 = 10$ failures and $\text{MTTF}_2 = 13{,}500$ cycles based on $r_2 = 8$ failures. Is there evidence to support the assertion that the MTTF for the second data set is greater than the first at C = 0.95?

Solution

$$H_0: \theta_1 = \theta_2 \qquad H_a: \theta_2 > \theta_1$$

Under $H_0$, $F_{2 \cdot 10,\, 2 \cdot 8} = \hat{\theta}_2/\hat{\theta}_1 = 13{,}500/10{,}000 = 1.35$. From Minitab (see Figure 10-11), $F_{20,16,0.05} = 2.28$; therefore, we fail to reject $H_0$, and so we cannot conclude that there are significant differences.
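When F tables or a statistics package are not at hand, the null distribution of the ratio of two exponential MTTF estimates can be approximated by simulation. The sketch below is an assumption-laden illustration, not the Minitab calculation in the example: it relies on the fact that, for failure-censored data, $T = r\hat{\theta}$ is gamma distributed with shape r, so under $H_0$ the common scale $\theta$ cancels in the ratio. The function name `mc_p_value`, the replicate count, and the seed are arbitrary.

```python
import random

def mc_p_value(ratio_obs, r_num, r_den, reps=100000, seed=2024):
    """Monte Carlo estimate of P(theta_hat_num / theta_hat_den >= ratio_obs) under H0."""
    rng = random.Random(seed)
    exceed = 0
    for _ in range(reps):
        # theta_hat / theta = Gamma(r, 1) / r for each failure-censored test.
        num = rng.gammavariate(r_num, 1.0) / r_num
        den = rng.gammavariate(r_den, 1.0) / r_den
        if num / den >= ratio_obs:
            exceed += 1
    return exceed / reps

# Observed ratio MTTF2 / MTTF1, with r2 = 8 and r1 = 10 failures.
p_value = mc_p_value(13500.0 / 10000.0, r_num=8, r_den=10)
print(p_value)
print("reject H0 at the 0.05 level:", p_value < 0.05)
```

A simulated p-value above 0.05 corresponds to failing to reject $H_0$ in the one-sided test.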
FIGURE 10-11 F-value output from Minitab.

10.4.2 Approximate Test for Differences with Weibull Shape Parameter
For testing differences between Weibull location parameters, we adapt Eq. (10.8) under a Weibayes assumption of Chapter 4 that b is known—and the same for both sets—to develop an approximate test for differences in the location parameter, y. Under a Weibayes model, the scaled data, t b exponential (yb ). Thus, under a Weibayes assumption, Eq. (10.8) becomes !b y^ 1 F2r1 ;2r2 ð10:9Þ y^ 2
For testing for differences between $\beta$, we present the empirical approximation by Lawless (1982, p. 157), which is suggested by McCool (1975) and Lawless and Mann (1976):

$$(h_{r,n} + 2)\left(\frac{\beta}{\hat{\beta}}\right) \sim \chi^2_{h_{r,n}} \tag{10.10}$$

where
$\hat{\beta}$ = ML estimate of the Weibull shape parameter $\beta$.
$h_{r,n}$ = An empirical factor determined by Monte Carlo simulation for a type II censored sample of size n, with r recorded failures. These values are reproduced in Table 10-2.

To test for differences between two Weibull shape parameters, $\beta_1$ and $\beta_2$, we may use the approximate F-test given by Lawless (1982, p. 181):

$$\frac{(h_1 + 2)\,h_2\,\hat{\beta}_2}{h_1\,(h_2 + 2)\,\hat{\beta}_1} \sim F_{h_1,h_2} \tag{10.11}$$

Note: In Eq. (10.11), we simplify notation, using $h$ instead of $h_{r,n}$, with $h_1, h_2$ denoting the respective h-factor for each of two groups.
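Eq. (10.11) follows from Eq. (10.10) by the same construction used for the exponential case. As a sketch, assuming the two samples are independent, divide each chi-square quantity by its degrees of freedom and form the ratio:

```latex
F_{h_1,h_2}
\sim \frac{(h_1 + 2)\,(\beta_1/\hat{\beta}_1)\,/\,h_1}
          {(h_2 + 2)\,(\beta_2/\hat{\beta}_2)\,/\,h_2}
= \frac{(h_1 + 2)\,h_2\,\hat{\beta}_2}{h_1\,(h_2 + 2)\,\hat{\beta}_1}
  \cdot \frac{\beta_1}{\beta_2},
```

and the factor $\beta_1/\beta_2$ equals 1 under $H_0: \beta_1 = \beta_2$. When $h_1 = h_2$ the remaining h-terms cancel, leaving the simple ratio $\hat{\beta}_2/\hat{\beta}_1$.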
TABLE 10-2 h(r, n) for Testing for Differences Between Weibull Shape Parameter, β

r/n    n = 5      10      20      40      60      80     100         ∞
0.1        -       -     2.0     6.0    10.0    14.1    18.1    0.205n
0.2        -     2.0     6.2    14.6    23.0    31.5    39.9    0.420n
0.3        -     4.3    10.9    24.0    37.0    50.1    63.2    0.652n
0.4      2.2     6.7    15.8    33.8    51.8    69.9    87.9    0.899n
0.5      3.5     9.1    20.7    44.0    67.3    90.6   113.9    1.165n
0.6      4.7    11.4    25.8    54.7    83.5   112.3   141.1    1.457n
0.7      6.0    14.8    32.6    68.1   103.8   139.5   175.0    1.782n
0.8      7.8    18.5    40.0    83.3   126.4   169.5   212.5    2.155n
0.9     10.3    23.0    49.0   100.9   153.0   204.9   256.9    2.607n
1.0     12.9    29.3    62.4   128.2   194.8   257.6   325.5    3.29n
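Table 10-2 lists h only at selected sample sizes, so an intermediate n requires interpolation. The sketch below assumes simple linear interpolation between neighboring columns for complete data (r/n = 1.0), which is what Illustrated Example 10-5 does for n = 30, and then evaluates the statistic of Eq. (10.11). The function names are hypothetical, and the critical value 1.3022 is taken from the example rather than computed.

```python
# h(n, n) for complete data (r/n = 1.0), copied from the last row of Table 10-2
H_COMPLETE = {5: 12.9, 10: 29.3, 20: 62.4, 40: 128.2,
              60: 194.8, 80: 257.6, 100: 325.5}

def h_complete(n):
    """Linearly interpolate h(n, n) between tabulated sample sizes."""
    sizes = sorted(H_COMPLETE)
    if n < sizes[0] or n > sizes[-1]:
        raise ValueError("n is outside the tabulated range 5..100")
    for lo, hi in zip(sizes, sizes[1:]):
        if lo <= n <= hi:
            weight = (n - lo) / (hi - lo)
            return (1.0 - weight) * H_COMPLETE[lo] + weight * H_COMPLETE[hi]

def shape_statistic(beta_1, h_1, beta_2, h_2):
    """Left-hand side of Eq. (10.11)."""
    return ((h_1 + 2.0) * h_2 * beta_2) / (h_1 * (h_2 + 2.0) * beta_1)

# Values from Illustrated Example 10-5: two complete samples of size 30
h = h_complete(30)
statistic = shape_statistic(1.098, h, 1.133, h)
critical = 1.3022  # F_{95,95,0.10} as reported in the example
reject_h0 = statistic > critical

print(f"h = {h:.1f}, statistic = {statistic:.3f}, reject H0: {reject_h0}")
```

For censored samples the same idea would apply along a different row of the table; only the complete-data row is encoded here to keep the sketch short.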
Illustrated Example 10-5
Consider the data set shown in Table 10-1. We first test for differences in the Weibull shape parameter for the two groups. Using Eq. (10.11), under $H_0$ with $\beta_1 = \beta_2$, and with $r_1 = n_1 = r_2 = n_2 = 30$, so that $r/n = 1.0$ for both groups (complete data) and $h = h_1 = h_2$, we have

$$\frac{\hat{\beta}_2}{\hat{\beta}_1} \sim F_{h,h}$$

For both sets, n = 30, and the empirical factor, $h_{30,30}$, is obtained from Table 10-2 with $r/n = 1.0$ by averaging the tabulated values for n = 20 and n = 40: $h_{30,30} \approx \tfrac{1}{2}(62.4 + 128.2) = 95.3$. Our test statistic is the ratio of the $\hat{\beta}$'s, which in our case is $1.133/1.098 = 1.032$. At the 0.90 level of confidence, $F_{95,95,0.10} = 1.3022$ (reported by Minitab). Therefore, since $\hat{\beta}_2/\hat{\beta}_1 = 1.032 < F_{95,95,0.10} = 1.3022$, we fail to reject the null hypothesis, and we cannot conclude that significant differences between the two groups exist at the C = 0.90 level of confidence.

Given that the $\beta$'s are the same for the two groups and that our combined sample of the two groups in Table 10-1 consists of 60 observations, we use a Weibayes model to test for differences in the Weibull location parameter, $\theta$. We use the combined estimate of $\beta$ in the analysis. Under a Weibayes assumption, we restrict our estimation by imposing the requirement $\hat{\beta} = 0.9331$, the combined estimate for $\beta$ in Table 10-1. This gives the results displayed in Table 10-3.
TABLE 10-3 Weibayes Analysis of Data in Table 10-1

Data set    β          θ
X           0.9331     536.46
Y           0.9331     1669.5
Under Eq. (10.9), our test statistic becomes

$$\frac{\hat{\theta}_1^{\beta}}{\hat{\theta}_2^{\beta}} = \left(\frac{1669.5}{536.46}\right)^{0.9331} = 2.89$$

At the 90% level of confidence, $F_{2\cdot 30,\,2\cdot 30,\,0.10} = 1.395$. Thus, $\hat{\theta}_1^{\beta}/\hat{\theta}_2^{\beta} = 2.89 > F_{2\cdot 30,\,2\cdot 30,\,0.10} = 1.395$; so we reject $H_0$ and conclude that there are significant differences in the location parameter, $\theta^{\beta}$, between the two groups.

As of this printing, the latest version of Minitab® software has been released with enhanced capabilities for testing for differences between two sets of life data. The test makes use of Wald's test statistic (see Nelson, 1982, p. 591), which is based on the Fisher information matrix and is thus asymptotically equivalent to the LR test for differences. Minitab V13.0 software reports the data shown in Figure 10-12. We see that the test for overall differences and differences in the Weibull scale parameter, $\theta$, is very significant (p-value = 0.000). However, the individual test for differences in the Weibull shape parameter, $\beta$, is not significant (p-value = 0.877). This is consistent with the results obtained with the use of approximate F-test statistics.
FIGURE 10-12 Use of Minitab® V13 software to test for differences.
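The Weibayes test of Illustrated Example 10-5 reduces to a single power and a comparison. The following minimal sketch uses the Table 10-3 estimates; the function name `weibayes_statistic` is hypothetical, and the critical value 1.395 is quoted from the example rather than computed.

```python
def weibayes_statistic(theta_num, theta_den, beta):
    """Ratio of scaled location estimates, (theta_num / theta_den) ** beta, as in Eq. (10.9)."""
    return (theta_num / theta_den) ** beta

# Estimates from Table 10-3 (common shape parameter imposed by the Weibayes model)
beta = 0.9331
theta_x, theta_y = 536.46, 1669.5
r_x = r_y = 30  # complete samples, so r = n = 30 in each group

statistic = weibayes_statistic(theta_y, theta_x, beta)
degrees_of_freedom = (2 * r_y, 2 * r_x)  # (60, 60)
critical = 1.395  # F_{60,60,0.10} as quoted in the example
reject_h0 = statistic > critical

print(f"statistic = {statistic:.2f}, df = {degrees_of_freedom}, reject H0: {reject_h0}")
```

The statistic comes out near 2.9, well above the critical value, which agrees with the rejection of $H_0$ reported in the example.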
10.4.3
Use of Approximate F-Tests
The approximate F-tests perform very similarly to the LR and Wald tests for differences. There is therefore no particular advantage in using them when reliability software that implements the LR or Wald tests is available.
10.5
SUMMARY
Four very different methods for judging differences have been presented. Each has its own advantages and disadvantages, which we now summarize.

1. Visual assessment of probability plots
   Advantage: Easy to examine; most software packages provide the capability to display multiple probability plots on one page with confidence bands.
   Disadvantage: Not the most powerful procedure. Difficult to judge when confidence bands partially overlap.

2. Q–Q plots
   Advantage: Nonparametric; easy to construct when sample sizes are equal. Conclusions can be made based on graphical evidence or output from regression ANOVA. Distribution assumptions are not required.
   Disadvantage: Not implemented in most popular reliability software packages. Capability of test procedures is not fully evaluated. However, it is unlikely to be the most powerful procedure, since it is a nonparametric test.

3a. Contour plots
   Advantage: Easy to visually assess differences.
   Disadvantage: Contours are messy to construct. Most software packages, with the exception of WinSmith, do not provide this capability.

3b. Likelihood ratio test
   Advantage: Once the optimum of the log-likelihood functions is calculated, it is straightforward to construct the LR test statistic. Fulton (1999) and Abernethy (1996) have indicated that their preliminary research shows this to be the most powerful.
   Disadvantage: Have to be careful with the degrees of freedom of the chi-square distribution that is used. Computational demands are high if software procedures are not available.

4. Approximate F-tests
   Advantage: Can be calculated by hand with simple lookup for type II censored data.
   Disadvantage: Not recommended if reliability software is available to handle the computations using either LR or Wald's tests.

10.6
EXERCISES

1. Consider the two data sets, A and B.
   a. Generate a Weibull plot of both data sets. Is there any evidence of differences between the two data sets?
   b. Use Q–Q plots to test for differences.
   c. Use LR test statistics to test for differences.
2. Repeat Exercise 1 for data sets A and C.

A          B          C
9.6        49.8       73.6
164.4      108.1      210.4
168.3      112.9      474.8
199.7      122.2      680.9
258.5      416.2      733.1
354.4      480.8      814.4
384.2      562.7      1030.1
583.3      666.1      1057.9
653.2      721.9      1073.6
661.2      749.2      1282.7
776.5      759.8      1824.4
866.6      791.3      1846.4
907.4      934.9      1859.6
999.5      963.1      1878.1
1013.3     973.5      2026.7
1218.5     1221.0     2076.7
1317.8     1535.3     2240.8
1384.4     1536.5     2248.2
1437.3     1604.0     2970.4
1543.8     1605.2     3983.7
1810.9     1866.3     4534.8
2094.6     2208.8     4608.2
2310.5     2233.6     4766.0
2528.6     2427.1     5760.6
2962.2     4620.4     6045.9
References
Abernethy RB. (1998). The New Weibull Handbook. 3rd ed. SAE Publications, Warrendale, PA.
Abernethy RB. (1999). Weibull News, Feb 1999 issue, p. 1.
Augustine S. (2001). FMEA of a braking system. Wayne State University, IE 7270 class, Detroit, MI.
Automotive Industry Action Group (AIAG, 1995). Potential Failure Mode and Effects Analysis (FMEA). 2nd ed. Southfield, MI.
Bain LJ. (1978). Statistical Analysis of Reliability and Life-Testing Models: Theory and Methods. Marcel Dekker, New York.
Benard A, Bos-Levenbach EC. (1953). Het Uitzetten Van Waarnemingenop Waarschijnlykhiedspaper. Statistica 7.
Betasso L. (1999). Life prediction using warranty data. Proceedings of the SAE RMSL 1999 Workshop, Auburn Hills, MI, Warrendale, PA.
Birnbaum ZW, Saunders SC. (1969). A new family of life distributions. J Appl Prob 6:319–327.
Black JR. (1969). Electromigration: A brief survey and some recent results, IEEE Trans Electron Devices 16:338.
Bloch HP, Geitner FK. (1994). Practical Machinery Management for Process Plants, Volume 2: Machinery Failure Analysis and Troubleshooting. 2nd ed. Gulf Publishing Company, Houston, TX.
Blom G. (1958). Statistical Estimates and Transformed Beta Variables. Ch. 12, John Wiley, New York.
Chandra M, Singpurwalla ND, Stephens MA. (1981). Kolmogorov statistics for tests of fit for the extreme value and Weibull distributions. J Amer Stat Assoc 76:729–731.
Chrysler Corp. (1993). Performance Standard PF-9482 for Electric Sliding Sunroof. Chrysler Corporation, Auburn Hills, MI.
Clopper CJ, Pearson ES. (1934). The use of confidence or fiducial limits, illustrated in the case of the binomial. Biometrika 26:404–413.
Cohen AC. (1986). Parameter Estimation in Reliability and Life Plan Models. Marcel Dekker, New York.
Cohen AC, Whitten BJ, Ding Y. (1984). Modified moment estimation for the three-parameter Weibull distribution. J Quality Tech 16(3):159–167.
Crow LH. (1974). Reliability analysis for complex, repairable systems. Reliability and Biometry, Statistical Analysis of Lifelength, SIAM, Philadelphia, PA.
Crow LH. (1990). Evaluating the reliability of repairable systems. Proceedings Annual Reliability and Maintainability Symposium, pp. 275–279.
Crowder MJ, Kimber AC, Smith RL, Sweeting TJ. (1991). Statistical Analysis of Reliability Data. Chapman & Hall, New York.
Dale CJ. (1985). Application of the proportional hazards model in the reliability field. Reliability Engineering 10:1–14.
Department of Defense. (1995). Military Handbook: Reliability Prediction of Electronic Equipment, MIL-HDBK-217F. Notice 2 (217F-2), Defense Printing Service, Philadelphia, PA.
Department of Defense. (1984). Mil-Std-1629A Military Standard Procedures for Performing a Failure Mode, Effects and Criticality Analysis. Notice 2, Defense Printing Service, Philadelphia, PA.
Department of Defense. (1985). Environmental Stress Screening Process for Electronic Equipment. Defense Printing Service, Philadelphia, PA.
Department of Defense. (1981). Mil-HDBK-189: Reliability Growth Management. Defense Printing Service, Philadelphia, PA.
Under Secretary of Defense, DOD. (1988). Enhancing defense standardization with specifications and standards: Cornerstones of quality.
Dept. of Health, Education, and Welfare: Food and Drug Administration, Part 820 - Quality System Regulation, published by Food, Drug, and Cosmetic Division of American Society for Quality, 1998.
Dodson B, Nolan D. (1995). Reliability Engineering Bible. Quality Publishing, Tucson, AZ.
Ebeling C. (1997). An Introduction to Reliability and Maintainability Engineering. McGraw-Hill, New York.
Elsayed EA. (1996). Reliability Engineering. Addison-Wesley, Reading, MA.
Elsayed EA, Chan CK. (1990). Estimation of thin-oxide reliability using proportional hazards models. IEEE Trans in Reliability 39(3):329–335.
Epstein B. (1948). Statistical aspects of fracture problems. J Applied Physics 19.
Epstein B. (1954). Truncated life tests in the exponential case. Annals of Mathematical Stat 25.
Epstein B. (1958). The exponential distribution and its role in life testing. Industrial Quality Control.
Epstein B. (1960). Elements of the theory of extreme values. Technometrics 2(1).
Filliben JJ. (1974). The probability plot correlation coefficient test for normality. Technometrics 17(1):111–117.
Feinberg AA, Gibson GJ. (1995). Accelerated reliability growth methodologies and models, Recent Advances in Life-Testing and Reliability, N. Balakrishnan, ed. Chapter 12, CRC Press, Boca Raton, FL.
Fisher NI. (1983). Graphical methods in nonparametric statistics: A review and annotated bibliography, International Stat Rev 51(1):25–58.
Ford Motor Company. (1997). Module 11: Sample size reduction technique. FAO Reliability Guide (uncontrolled confidential copy), Dearborn, MI.
Ford Motor Company. (1972). Plotting on Weibull Probability Paper, Ford Motor Company North American Automotive Operations, Dearborn, MI.
Fulton W. (1999). Issues with Weibull analysis, presentation to the Southeastern Chapter of the Society for Reliability Engineers, Troy, MI.
Gerson M. (1975). The techniques and uses of probability plotting, Statistician 24:235–257.
Grosh DL. (1989). A Primer of Reliability Theory. John Wiley and Sons, New York.
Guilbaud O. (1988). Exact Kolmogorov-type tests for left-truncated and/or right-censored data. J Amer Stat Assoc 83(401):213–221.
Gumbel EJ. (1958). Statistics of Extremes. Academic Press, New York.
Hallinan AJ. (1993). A review of the Weibull distribution. J Quality Tech 25(2):85–93.
Hasofer A, Lind N. (1974). An exact and invariant first-order reliability format. J Eng Mech Div, ASCE 100:111–121.
Hauser JR, Clausing D. (1989). The house of quality. Harvard Business Rev 63–73.
Hjorth JS. (1994). Computer-Intensive Statistical Methods: Validation, Model Selection, and Bootstrap, Chapman & Hall, Boca Raton, FL.
Ireson WG, Coombs CF. (1996). Handbook of Reliability Engineering and Management. 2nd ed. McGraw-Hill, New York.
Jayasheela, VP. (2002). The development of quantile–quantile plots for comparing life data. M.S. thesis, Wayne State University, Detroit, MI.
Jeng S-L, Meeker WQ. (2000). Comparisons of approximate confidence interval procedures for type I censored data. Technical Report, Iowa State University, Ames, IA.
Johnson L. (1951). The median ranks of sample values in their population with an application to certain fatigue studies. Industrial Mathematics 2:1–6.
Johnson L. (1964). The Statistical Treatment of Fatigue Experiments. Elsevier, NY.
Kapur KC, Lamberson LR. (1977). Reliability in Engineering Design. John Wiley, New York.
Kececioglu D. (1994). Reliability and Life Testing Handbook. Vols 1 & 2. PTR Prentice-Hall, Englewood Cliffs, NJ.
Kececioglu D, Sun F. (1995). Environmental Stress Screening: Its Quantification, Optimization, and Management. Prentice-Hall, Englewood Cliffs, NJ.
Kendall MG, Stuart A, Ord JK. (1987). Kendall’s Advanced Theory of Statistics. Vol. 1: Distribution Theory. 6th ed. Oxford University Press, New York.
Kimball BF. (1960). On the choice of plotting positions on probability paper. J Amer Stat Assoc 55:546–560.
King B. (1989). Better Designs in Half the Time: Implementing Quality Function Deployment in America. Goal/QPC, Methuen, MA.
Klion J. (1992). Practical Electronic Reliability Engineering: Getting the Job Done from Requirement Through Acceptance. Van Nostrand Reinhold, New York.
Kottegoda N, Rosso R. (1997). Statistics, Probability, and Reliability for Civil and Environmental Engineers. McGraw-Hill, New York.
Lawless JF. (1974). Approximations to confidence intervals for parameters in the extreme value and Weibull distributions. Biometrika 61(1):123–129.
Lawless JF. (1982). Statistical Models and Methods for Lifetime Data. John Wiley and Sons, New York.
Lawless JF, Mann NR. (1976). Tests for homogeneity for extreme-value scale parameters. Communications in Stat A5:389–405.
Leemis LM. (1995). Reliability: Probabilistic Models and Statistical Models. Prentice-Hall, Englewood Cliffs, NJ.
Lehtinen EA. (1979). On the estimation for three-parameter Weibull distribution. Technical report by Technical Research Centre of Finland.
Leon R, Shoemaker A, Kacker RN. (1987). Performance measures independent of adjustment: An explanation and extension of Taguchi’s signal-to-noise ratios. Technometrics 29(3).
Lewis EE. (1996). Introduction to Reliability Engineering. 2nd ed. John Wiley and Sons, New York.
Lilliefors HW. (1967). On the Kolmogorov–Smirnov test for normality with mean and variance unknown. J Amer Stat Assoc 62(318):399–402.
Lilliefors HW. (1969). On the Kolmogorov–Smirnov test for the exponential distribution with mean unknown. J Amer Stat Assoc 64(325):387–389.
Liu C-C. (1997). A comparison between the Weibull and lognormal models used to analyze reliability data. PhD dissertation, University of Nottingham.
Lorenz MO. (1905). Methods for measuring the concentration of wealth. J Amer Stat Assoc New Series. (70):209–219.
Lu CJ, Meeker WQ. (1993). Using degradation measures to estimate a time-to-failure distribution. Technometrics 35(2):161–174.
Madsen HO, Krenk S, Lind NC. (1986). Methods of Structural Safety, Prentice-Hall, Englewood Cliffs, NJ.
Mann NR, Fertig KW. (1973). Tables for obtaining confidence bounds and tolerance bounds based on best linear invariant estimates of parameters of the extreme value distribution. Technometrics 17:361–368.
Mann NR, Schafer RE, Singpurwalla ND. (1974). Methods for Statistical Analysis of Reliability and Lifetime Data. John Wiley, New York.
Martz HE, Waller RA. (1982). Bayesian Reliability Analysis. John Wiley and Sons, New York.
Massey FJ. (1951). The Kolmogorov–Smirnov test for goodness of fit. J Amer Stat Assoc 46:68–78.
McCool JI. (1975). Inferential techniques for Weibull populations II. Technical Report ARL-75-0233. Wright Patterson Airforce Base, Dayton, OH.
Meeker WQ, Escobar LA. (1998). Statistical Methods for Reliability Data. John Wiley and Sons, New York.
Melchers RE. (1987). Structural Reliability, Analysis, and Prediction. John Wiley & Sons, New York.
Michael JR, Schucany W. (1979). A new approach to testing goodness of fit with censored samples. Technometrics 21:435–442.
Mizuno S, Akao Y, eds. (1994). QFD, the Customer-driven Approach to Quality Planning and Deployment, Asian Productivity Organization, Tokyo.
Modarres M, Kaminskiy M, Krivtsov V. (1999). Reliability Engineering and Risk Analysis: A Practical Guide. Marcel Dekker, New York.
Nelson W. (1981). Analysis of performance-degradation data from accelerated tests. IEEE Trans on Reliability R-30(2):149–155.
Nelson W. (1982). Applied Life Data Analysis. John Wiley and Sons, New York.
Nelson W. (1985). Weibull analysis of reliability data with few or no failures. J Quality Tech 17(3):140–146.
Nelson W. (1990). Accelerated Testing: Statistical Models, Test Plans, and Data Analysis. John Wiley and Sons, New York.
Neter J, Wasserman W, Kutner M. (1990). Applied Linear Statistical Models. 3rd ed. Irwin Publishing.
Ochi MK. (1990). Applied Probability and Stochastic Processes. John Wiley and Sons, New York.
O’Connor PDT. (1991). Practical Reliability Engineering. 3rd ed. John Wiley and Sons, New York.
Pecht M, Kang W-C. (1998). A critique of Mil-Hdbk-217E reliability prediction methods. CALCE Center, University of Maryland, College Park, MD.
Phadke MS. (1989). Quality Engineering Using Robust Design. Prentice-Hall, Englewood Cliffs, NJ.
Raheja DG. (1991). Assurance Technologies. McGraw-Hill, New York.
Shapiro SS, Wilk MB. (1965). An analysis of variance test for normality (complete samples). Biometrika 52:591–611.
Shapiro S, Wilk MB, Chen J. (1968). A comparative study of various tests of normality. J Amer Stat Assoc 63:1343–1372.
Shoemaker AC, Kacker RN. (1988). A methodology for planning experiments in robust product and process design. Quality and Reliability Engineering International 4(2).
Society for Automotive Engineers. (1994). SAE FMEA J 1739, Failure Mode Effects Analysis. Warrendale, PA.
Stephens MA. (1974). EDF statistics for goodness of fit and some comparisons. J Amer Stat Assoc 69:730–737.
Tarum CD. (1996). Modeling bathtub curve failure data with Weibull equations. SAE Weibull Users Conference, Detroit, MI. (Also see: Weibull mixtures, http://www.bathtubsoftware.com/mixtures.htm, BathTub Software.)
Tarum CD. (1999). Determination of the critical correlation coefficient to establish a good fit for Weibull and lognormal failure distributions. SAE Technical Paper Series, 1999-01-0057, SAE, Warrendale, PA.
Telcordia Technologies. (1997). (Bellcore) Reliability Prediction Procedure for Electronic Equipment (document number TR-332, issue 6). Piscataway, NJ.
Tobias PA, Trindade D. (1986). Applied Reliability. Van Nostrand Reinhold, New York.
Tomase JP. (1989). SAE 890806: The role of reliability engineering in automotive electronics. Society of Automotive Engineers, Warrendale, PA.
US Navy. (1994). Handbook of Reliability Prediction Procedures for Mechanical Equipment (NSWC-94/L07). U.S. Navy David Taylor Research Center, Annapolis, MD.
Vanooy H. (1990). Useful enhancements to the QFD technique. Trans from a Symposium on Quality Function Deployment. Automotive Division of the American Society for Quality Control, pp. 381–440.
Vardeman SB. (1994). Statistics for engineering problem solving. PWS Foundations in Engineering Series.
Wang CJ. (1991). Sample size determination of bogey tests without failures. Quality and Reliability Engineering International 7:35–36.
Wasserman GS. (1996). A modeling framework for relating robustness measures with reliability. Quality Engineering 8(4):681–692.
Wasserman GS. (1999–2000a). Reliability validation via success–failure testing: A hidden controversy. Quality Engineering 12(2):267–276.
Wasserman GS. (1999–2000b). Easy ML estimation of normal and Weibull metrics. Quality Engineering 12(4):569–581.
Wasserman GS. (2001). Reliability verification with few or no failures allowed. Submitted for publication.
Wasserman GS, Lindland JL. (1997). A case study illustrating the existence of dynamics in traditional cost-of-quality models. Quality Engineering 9(1):119–128.
Wasserman GS, Reddy I. (1992). Practical alternatives for estimating the failure probabilities of censored life data. Quality and Reliability Engineering International 8:61–67.
Wilk MB, Gnanadesikan R. (1968). Probability plotting methods for the analysis of data. Biometrika 55:1–17.
Wirsching P. (1992). Reliability methods in mechanical and structural design. 16th Annual Seminar and Workshop, Southwest Research Institute, San Antonio, TX.
Wirsching P, Ortiz K. (1996). Reliability methods in mechanical and structural design, A Seminar and Workshop on Modern Reliability Technology for Design Engineers, Tucson, AZ.
Wolynetz MS. (1979). Maximum likelihood estimation from confined and censored normal data, Algorithm AS 138 and AS 139. Applied Statistics (28):185–206.
Woodruff BW, Moore AH, Dunne EJ, Cortes R. (1983). A modified Kolmogorov–Smirnov test for Weibull distribution with unknown location and scale parameters. IEEE Trans on Reliability R-32(2):209–213.
Wu YT, Millwater HR, Cruse TA. (1990). Advanced probabilistic structural analysis of implicit performance functions. AIAA Journal 28(9).
Software
Abaqus® software (2000). Hibbitt, Karlsson & Sorensen, Inc., Pawtucket, RI 02860-4847.
ALD RAM software (FMECA processor, 2000). Advanced Logistics Developments, Rishon Le Zion 75106, Israel.
Algor® software (2000). Algor, Inc., Pittsburgh, PA 15238-2932.
Altair® Hypermesh® (2000). Altair Engineering, Troy, MI 48084-4603.
Ansys® software (2000). ANSYS, Inc., Canonsburg, PA 15317.
AutoCad® software (2000). Autodesk Inc., San Rafael, CA 94903.
Catia® software (2000). Dassault Systemes, Woodland Hills, CA 91367.
FEMAP® software (2000). SDRC FEMAP Division, Exton, PA 19341.
LS-Dyna® software (2000). Livermore Software Technology Corporation, Livermore, CA 94550.
Minitab V 12.1 Statistical Software (1998). State College, PA 16801-3008.
Microsoft Excel Software for Office 97 & 2000 (1997–1998). Microsoft Corporation, Seattle, WA, with Excel Solver code by Frontline Systems, Inc., Incline Village, NV 89450-4288, ©1995.
Nessus® software. Southwest Research Institute, San Antonio, TX.
Permas-RA™. Institute of Safety Research and Reactor Technology of German Helmholtz research center, Jülich, Germany.
Pro/Engineer (2000). Parametric Technology Corp., Waltham, MA 02453.
Reliasoft© Weibull++ V6.0 (2001). Tucson, AZ 85710.
WinSmith© (Windows version) and WeibullSmith© (DOS version) reliability analysis software (1996). Copyrighted by Fulton Findings™, distributed by SAE, Warrendale, PA.
YBath™ (1999). Bathtub software, Flint, MI.