The stress-strength model and its generalizations


Author: Samuel Kotz | Yan Lumelskii | Marianna Pensky


The Stress-Strength Model and its Generalizations Theory and Applications

Yan Lumelskii Statistics Laboratory, Technion, Israel

Marianna Pensky University of Central Florida, USA

World Scientific New Jersey London Singapore Hong Kong

Published by World Scientific Publishing Co. Pte. Ltd.
5 Toh Tuck Link, Singapore 596224
USA office: Suite 202, 1060 Main Street, River Edge, NJ 07661
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library.

THE STRESS-STRENGTH MODEL AND ITS GENERALIZATIONS THEORY AND APPLICATIONS Copyright © 2003 by World Scientific Publishing Co. Pte. Ltd. All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.

For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.

ISBN 981-238-057-4

Printed by Fulsland Offset Printing (S) Pte Ltd, Singapore

To our families

Preface

The term "stress" acquired a special meaning for the modern person in the second half of the 20th century. We are all continuously under stress and, alas, do not always have the "strength" to overcome it. The stress-strength relationship is nowadays studied in many branches of science, such as psychology, medicine and pedagogy, and the pharmaceutical industry accumulates billion-dollar profits by helping us to overcome, or at least alleviate, psychological stresses.

Broadly speaking, the term stress is used nowadays in two different meanings: 1) structural, mechanical (or engineering) stress, studied in the engineering discipline called "strength of materials", and 2) the more recent concept of psychological stress, which is usually defined as any "external stimulus - from threatening words to the sound of gunshot - which the brain interprets as dangerous". Another way of describing this concept is "a demand, threat or other event that requires an individual to cope with a charged situation."

In engineering, stress in a solid body (liquids do not admit engineering stress) arises due to applied loads and is defined as "the force per unit of area (pounds per square inch, say) that one part of the body exerts on adjacent parts". Some knowledge of strength of materials already existed when the ancient Greeks and Romans built their large structures, but the modern science began at the time of Galileo Galilei (1564-1642), who was one of the first of a long line of mathematicians, physicists and engineers to constitute and develop this field. The subject is now researched in engineering schools, in commercial, industrial and government-supported laboratories, and in various societies such as the earliest of them, the Society for Experimental Stress Analysis, organized in 1943, which together with other institutions publicizes and disseminates the methodology and applications.

Psychological stress is a concept of more recent origin, stimulated by the work of Sigmund Freud (1856-1939), the father of psychoanalysis. More specifically, the Canadian scientist Hans Selye pioneered stress research in 1926 by investigating the effects of stress on the body. Somewhat later, two American psychiatrists, Thomas H. Holmes and Richard Rahe, devised a stress-measuring methodology ranking 43 critical stresses according to the severity of their impact on an individual's life. Since then the investigation of various aspects of psychological stress has been one of the most vigorous and lucrative fields of modern psychology.

The book that you are now holding in your hands has, however, little to do with these activities. It is devoted to a seemingly very simple topic - estimation of the probabilities of the type P(X < Y), P(X

Models for Reliability" (1988) in Handbook of Statistics, Vol. 7. The book by Ivshin and Lumelskii (1995) (which serves as a blueprint and impetus for our volume) was written in Russian and published by the Perm University Press, and has therefore been unavailable, and thus unknown, to Western readers. It is devoted mainly to the point estimation of P(X < Y), focusing on the best unbiased and the maximum likelihood estimation techniques. Interval estimation of P(X < Y) is studied only in the case of the normal distribution, nonparametric methods of estimation are confined to just five pages, and Bayes techniques are completely left out of the book. Johnson's article, although more balanced, is only twenty-eight pages long, which makes it impossible to cover the diverse approaches to and treatments of the stress-strength models that have accumulated in the five decades since their inception in the middle of the 20th century.

The objective of our book is to attempt to fill this void and provide a relatively comprehensive treatment of the stress-strength models. Nevertheless, we have tried to keep the level of this book as elementary as possible. Basic knowledge of undergraduate-level mathematics and statistics is a prerequisite, and many basic concepts are explained in the book with the aim of keeping the exposition as self-contained as possible.

In Chapter 1 we acquaint the reader with the various aspects of the stress-strength models, their history, applications and extensions. Chapter 2 presents the necessary background knowledge, describing statistical techniques associated with estimation of R = P(X < Y). Chapter 3 is devoted to various parametric methods of point estimation of R and some of its multivariate extensions. In this chapter we derive analytical expressions for estimators when X and Y belong to "standard" parametric families. Chapter 4 deals with various parametric techniques of interval estimation of R = P(X < Y).
Chapter 5 is reserved for nonparametric methods of point and interval estimation of R. Chapter 6 concentrates on major extensions and some special cases of the stress-strength models. Finally, Chapter 7 surveys applications which we were able to gather by scanning various diverse sources.

The book can thus be viewed as the first exhaustive review monograph on a subject of interest and importance for both theoretically oriented scientists and practitioners. It can be used as a textbook for lower-level graduate or upper-level undergraduate special topics courses in reliability theory or in statistical methodology in general. Each chapter is supplemented by a number of problems. The book as a whole can serve as a rich and comprehensive source for research projects at the undergraduate and graduate levels for students in statistics and engineering: to locate "blank spots" in the research conducted so far, all that is needed is to casually leaf through the chapters.

Finally, this book is a consolidation of the research of scientists all over the world, from the USA, Canada and Russia, India and Korea, Japan, Eastern Europe and Israel, who have worked on problems related to the stress-strength models. We are indebted to the many authors (too numerous to be listed here individually) who supplied us with copies of their contributions and with valuable advice, and to Professor Norman L. Johnson, with whom we exchanged ideas on this topic, especially in the early stages of the project. We hope that you will enjoy and profit from this outcome of our efforts.

S.K. Washington, D.C., USA
Y.L. Haifa, Israel
M.P. Orlando, Florida, USA
September 2002

Some Notations and Abbreviations

pdf              probability density function
cdf              cumulative distribution function
edf              empirical distribution function
MLE              maximum likelihood estimation (estimator)
LRT              likelihood ratio test
MSE              mean squared error
EB               empirical Bayes
UMVUE            uniformly minimum variance unbiased estimator
HPD region       highest probability density (region)
BVED             bivariate exponential distribution
WMW statistic    Wilcoxon-Mann-Whitney statistic
ROC curve        receiver operating characteristic curve
OD graph         ordinal dominance graph
R                the MLE of R
R                the UMVUE of R
R                the Bayes estimator of R
X                (X_1, ..., X_{n_1})
Y                (Y_1, ..., Y_{n_2})
                 distributed as
                 approximately distributed as
                 the cdf of the standard normal distribution
Γ(·)             the Gamma function
                 the Beta function


Contents

Preface

Chapter 1 The Stress-Strength Models. Mathematics, History, and Applications
1.1 What are the Stress-Strength Models?
1.2 Motivation and Mathematical Formulations
  1.2.1 Motivations
  1.2.2 Mathematical Formulations
1.3 Stress-Strength Models: History and Geography
  1.3.1 History
  1.3.2 Geography
1.4 Applications

Chapter 2 The Theory and Some Useful Approaches
2.1 The Maximum Likelihood Estimators
  2.1.1 The Theory
  2.1.2 Construction of the MLE
  2.1.3 One-parameter Exponential Distribution
  2.1.4 Multivariate Case
2.2 Unbiased Estimation
  2.2.1 The Theory
  2.2.2 Construction of UMVUEs
  2.2.3 One-parameter Exponential Distribution
  2.2.4 A Multivariate Case
2.3 Bayes and Empirical Bayes Estimation of R
  2.3.1 The Theory
  2.3.2 The Choice of a Prior
  2.3.3 One-parameter Exponential Distribution
  2.3.4 Bayes Predictive and Empirical Bayes Estimation
2.4 Interval Estimation
  2.4.1 The Theory
  2.4.2 Exact Methods of Interval Estimation
  2.4.3 Asymptotic Methods of Interval Estimation
  2.4.4 Bayesian Credible Sets
  2.4.5 Hypothesis Testing: Theory and Methods
  2.4.6 One-parameter Exponential Distribution
2.5 Transformation Methods
  2.5.1 The Theory
  2.5.2 Examples of Transformations
  2.5.3 The Rayleigh Distribution
2.6 Exercises

Chapter 3 Parametric Point Estimation
3.1 The Maximum Likelihood Estimation (Univariate Case)
  3.1.1 The Normal Distribution
  3.1.2 The Two-parameter Exponential Distribution
  3.1.3 The Gamma Distribution
  3.1.4 The Truncation Parameter Families
  3.1.5 The Pareto Distribution
  3.1.6 The Weibull Distribution
  3.1.7 Burr Type X and Type XII Distributions
  3.1.8 The Generalized Gamma Distribution
  3.1.9 Other Distributions
3.2 Unbiased Estimation (Univariate Case)
  3.2.1 The Normal Distribution
  3.2.2 The Two-parameter Exponential Distribution
  3.2.3 The Gamma Distribution
  3.2.4 The Truncation Parameter Families
  3.2.5 The Generalized Gamma Distribution
  3.2.6 Other Distributions
3.3 Bayes and Empirical Bayes Estimation (Univariate Case)
  3.3.1 The Normal Distribution
  3.3.2 The One-Parameter Exponential Distribution
  3.3.3 The Weibull Distribution
  3.3.4 The Burr-Type X Distribution
3.4 Elliptical Distributions
  3.4.1 Maximum Likelihood Estimation
  3.4.2 Bayes Estimation
  3.4.3 The Pearson Type II Distribution
  3.4.4 The Multivariate T and Cauchy Distributions
3.5 The Multivariate Normal Distribution
  3.5.1 Maximum Likelihood Estimation
  3.5.2 Unbiased Estimation
  3.5.3 Bayes Estimation
3.6 Bivariate Exponential Distributions (BVED)
  3.6.1 Various Types of Exponential Distributions
  3.6.2 Stress-Strength Estimators for the Marshall-Olkin BVED
  3.6.3 Stress-Strength Probabilities and Their Estimators for Other BVEDs
3.7 Discrete Distributions
  3.7.1 Multivariate Discrete Distributions
  3.7.2 Univariate Discrete Distributions
3.8 Exercises

Chapter 4 Parametric Statistical Inference
4.1 Confidence Intervals Based on Exact Distributions
  4.1.1 The Normal Distribution: Dependent Variables
  4.1.2 The Normal Distribution: Independent Variables
  4.1.3 The Gamma Distribution
  4.1.4 The Generalized Gamma Distribution
  4.1.5 The Burr Type X Distribution
4.2 Asymptotic Confidence Intervals
  4.2.1 The Normal Distribution
  4.2.2 The Left-Truncated Exponential Distribution
  4.2.3 The Two-parameter Exponential Distribution
4.3 Bayesian Credible Sets
  4.3.1 The Normal Distribution: Independent Variables
  4.3.2 The Weibull Distribution
4.4 Hypothesis Testing
  4.4.1 Tests Based on Exact Confidence Intervals
  4.4.2 Tests Based on Generalized p-values
  4.4.3 Bayesian Tests
4.5 Bootstrap
  4.5.1 The Concept of the Bootstrap
  4.5.2 Bootstrap-Based Asymptotic Confidence Intervals
  4.5.3 The Percentile Method
4.6 Exercises

Chapter 5 Nonparametric Models
5.1 Point Estimation of R = P(X

Chapter 6 Some Selected Special Cases
6.1 Stress-Strength Models for System Reliability
  6.1.1 Various Models for System Reliability
  6.1.2 Estimation of System Reliability Based on Numerical Data
  6.1.3 Estimation of System Reliability Based on Count Data
6.2 Estimation of P(X_1 < X_2 < ... < X_k)
  6.2.1 General Case
  6.2.2 Estimation of P(X
6.3
6.4
6.5
6.6

Chapter 7 Applications and Examples
7.1 Applicability of the Stress-Strength Model
7.2 Engineering and Military Applications of the Stress-Strength Model
  7.2.1 The Rocket Motor Case Example
  7.2.2 Comparison of Two Treatments in Engineering Setting
  7.2.3 Military Applications
7.3 Applications in Medicine and Psychology
  7.3.1 Applications Based on Numerical Data
  7.3.2 Applications Based on Categorized Data
7.4 ROC Curves Analysis
  7.4.1 ROC Curves and Their Relation to P(X

Bibliography

Index

Chapter 1

The Stress-Strength Models. Mathematics, History, and Applications

1.1 What are the Stress-Strength Models?

We intend to describe some developments in a remarkable field of enquiry in statistical and probability theory. Why is it remarkable? Although it is relatively, though not especially, long-established (dating from about the middle of the twentieth century) and of limited scope, in the last quarter of the century it has produced an impressive volume of publications (see the Reference List). Many of the papers in this field include the enigmatic "words" "P(X < Y)" (or some slight variations) in the title. This indicates a claim to some relationship to the broader field of (partial) ordering of distributions. A variable Y is said to be stochastically larger than a variable X if the cumulative distribution function (cdf) of Y is never greater than that of X, i.e., using the standard notation,

$$F_Y(t) \le F_X(t) \quad \text{for all } t.$$

An immediate consequence of this relation is that $P(X < Y) \ge 1/2$, since

$$P(X < Y) = \int_{-\infty}^{\infty} F_X(t)\, dF_Y(t) \ge \int_{-\infty}^{\infty} F_Y(t)\, dF_Y(t) = \frac{1}{2}. \tag{1.1}$$
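The inequality (1.1) can be checked numerically. The following Python sketch assumes, purely for illustration, that X and Y are independent with X ~ N(0, 1) and Y ~ N(0.5, 1), so that Y is stochastically larger than X; these distributions and all function names are assumptions made here and do not come from the text. It verifies the cdf ordering on a finite grid (which is only a spot check, not a proof) and then estimates P(X < Y) by simulation.

```python
import math
import random


def normal_cdf(t, mean=0.0, sd=1.0):
    """Cdf of N(mean, sd^2), written in terms of the error function."""
    return 0.5 * (1.0 + math.erf((t - mean) / (sd * math.sqrt(2.0))))


def is_stochastically_larger(cdf_y, cdf_x, grid):
    """Return True if F_Y(t) <= F_X(t) at every grid point (finite-grid check only)."""
    return all(cdf_y(t) <= cdf_x(t) + 1e-12 for t in grid)


def monte_carlo_prob_x_less_y(draw_x, draw_y, n, rng):
    """Estimate P(X < Y) from n independent pairs (X, Y)."""
    hits = sum(draw_x(rng) < draw_y(rng) for _ in range(n))
    return hits / n


rng = random.Random(12345)  # fixed seed so the run is reproducible

# Illustrative (assumed) choice: X ~ N(0, 1), Y ~ N(0.5, 1).
grid = [i / 10.0 for i in range(-60, 61)]
ordered = is_stochastically_larger(
    lambda t: normal_cdf(t, 0.5, 1.0),  # F_Y
    lambda t: normal_cdf(t, 0.0, 1.0),  # F_X
    grid,
)
estimate = monte_carlo_prob_x_less_y(
    lambda r: r.gauss(0.0, 1.0),
    lambda r: r.gauss(0.5, 1.0),
    200_000,
    rng,
)

print("F_Y <= F_X on the grid:", ordered)
print("Monte Carlo estimate of P(X < Y):", round(estimate, 4))
```

In this setting the estimate should land above 1/2, in agreement with (1.1). The converse does not hold in general: P(X < Y) > 1/2 does not by itself imply stochastic ordering.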

This admittedly somewhat tenuous connection may perhaps be regarded as conferring a certain degree of respectability on the field of investigation designated by "P(Y > X)". This would, however, be insufficient to account for the considerable interest in these studies in the last two decades. The basic impetus for these developments can perhaps be ascribed to the specific practical problem of applied statistics encapsulated by the term "stress-strength". In the simplest terms this can be described as an assessment of the "reliability" of a "component" in terms of random variables X, representing the "stress" experienced by the component, and Y, representing the "strength" of the component available to overcome the stress. According to this simplified scenario, if the stress exceeds the strength (X > Y) the component fails, and vice versa. Reliability is then defined as the probability of not failing: P(X < Y).

The germ of this idea was introduced by Birnbaum (1956) and developed by Birnbaum and McCarty (1958). The latter paper is the first to include P(Y < X) in its title. The formal term "stress - strength" appears in the title of Church and Harris (1970). This is the earliest such date in our bibliography, though earlier cases may exist. In the course of time, there have been attempts to introduce further aspects of reality into the picture. We shall discuss various generalizations of the model and its applications in Chapters 6 and 7, respectively. However, there is a substantial number of papers devoted to purely probabilistic problems associated with evaluation of R = P(X < Y) and construction of efficient and reliable estimators of this parameter based on sample values under various assumptions on the distributions of X and Y. Much of the work presupposes that both random variables have distributions belonging to the same family (such as normal, exponential, log-normal, Weibull, etc.) and, more significantly, it assumes independence between them.
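To make the definition R = P(X < Y) concrete, consider a minimal sketch in Python under an assumed model: stress X and strength Y are independent and exponentially distributed with rates a and b. In that case R = a / (a + b), and, because the maximum likelihood estimate of an exponential rate is the reciprocal of the sample mean, the plug-in estimate of R is mean(Y) / (mean(X) + mean(Y)). The rates, sample sizes and function names below are illustrative assumptions, not values taken from the text.

```python
import random


def reliability_exponential(stress_rate, strength_rate):
    """R = P(X < Y) for independent X ~ Exp(stress_rate), Y ~ Exp(strength_rate)."""
    return stress_rate / (stress_rate + strength_rate)


def plug_in_estimate(stress_sample, strength_sample):
    """Plug-in (maximum likelihood) estimate of R under the exponential model."""
    mean_stress = sum(stress_sample) / len(stress_sample)
    mean_strength = sum(strength_sample) / len(strength_sample)
    # Rates are estimated by 1 / mean, so a / (a + b) becomes mean_Y / (mean_X + mean_Y).
    return mean_strength / (mean_stress + mean_strength)


rng = random.Random(2024)  # fixed seed so the run is reproducible

# Assumed illustrative rates: mean stress 0.25, mean strength 1.0.
stress_rate, strength_rate = 4.0, 1.0
n = 50_000
stress = [rng.expovariate(stress_rate) for _ in range(n)]
strength = [rng.expovariate(strength_rate) for _ in range(n)]

exact = reliability_exponential(stress_rate, strength_rate)
empirical = sum(x < y for x, y in zip(stress, strength)) / n
estimated = plug_in_estimate(stress, strength)

print("exact R:            ", exact)
print("empirical frequency:", round(empirical, 4))
print("plug-in estimate:   ", round(estimated, 4))
```

The empirical frequency makes no distributional assumption, whereas the plug-in estimate relies on the exponential model being correct; the two agree here only because the data were simulated from that model.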
Unfortunately, relatively little attention has so far been paid to the more realistic problem of variation in both X and Y over time as a measure of reliability (we discuss this issue in Section 6.5). There have, however, been studies in which X and Y admit certain specified forms of dependence (bivariate and multivariate normal, bivariate exponential, etc.), as indicated in Chapters 3, 4 and 6 and in the bibliography.

Before proceeding to discuss specific cases it is desirable to bear in mind the comments of Harris and Soms (1983) in regard to practical applications of the results. The authors draw attention to the fact that in very many applications the "reliability" (defined as described above) has to be very close to 1 for the device to have any possibility of "useful" life. One consequence is that very large samples may be needed to obtain sufficiently accurate estimates of reliability, since we are here dealing with the extreme tails of distributions. A further, and even more serious, difficulty lies in the sensitivity of "reliability" to small changes from the assumed models for the distributions of X and Y. In the words of Harris and Soms (1983): "... relatively small perturbations of the tail of the strength distribution can make the failure probability far higher than may be desirable, particularly, where failure can be catastrophic." This can lead to cases where "the estimation procedures ... produced results which were significantly contradicted by subsequent experience."

To avoid confusion, let us also keep in mind that forms other than P(X < Y) occasionally appear in the titles of the relevant papers and reports - not only P(Y < X) but also P(X_1 < X_2), P(X < Y), etc. - and that, although X commonly represents "stress" and Y "strength", this is not always the case. Sometimes just the laconic "Evaluation of P(Y < X) for ..." is used.

1.2 Motivation and Mathematical Formulations

1.2.1 Motivations

In an important methodological note written over 30 years ago, Wolfe and Hogg (1971) assert that numerical values of P(X < Y) make more sense to practitioners - particularly those in the medical profession - than the equivalent statements about $(\mu_1 - \mu_2)/\sigma$ (under the normal assumptions), and point out that P(X < Y) can be estimated under many distributional assumptions (not only normality), thus permitting us to avoid the trap of using normal distributions when they are obviously inappropriate. In a sense, Wolfe and Hogg (1971) provide a road-map to the research which resulted in a flood of papers starting from Church and Harris (1970) up to the beginning of the 21st century. Not only were the problems of deriving theoretical expressions for P(X < Y) and its modifications and extensions under various distributional assumptions found to be challenging, but estimation of these probabilities based on samples of various structures also opened new avenues in deriving approximations to variances and confidence bounds. Similar sentiments were expressed some fifteen years later by Halperin et al. (1987), who emphasize the suitability of P(X < Y) estimators for versatile comparisons of two samples, embracing the possibility that the two underlying distributions may differ in one or more parameters.

It might be desirable to elaborate a bit on the assumptions and characteristics of the pivotal quantities involved in this seemingly straightforward model: P(X < Y). In the 1970s, when the first serious attempts were initiated to analyze the reliability of a component by applying probabilistic arguments to a physical model of failure, the term "interference theory" was often used in the engineering literature (see e.g. Mazumdar (1970)). According to this theory a component fails if at any moment the applied stress (often a load) exceeds the component's strength (or resistance). The stress - as we have already mentioned - is a function of the environment in which the component is located and can be estimated from the available technological knowledge about the relevant conditions of the system and the manner in which they interact. Engineers claim (or used to claim) that the values of mechanical stress at different points of time can be computed deterministically given the set of initial values. Church and Harris (1970) provide an example of a missile flight where the initial values of the stress correspond to propulsive force, angles of elevation, atmospheric conditions, etc.

One is tempted to recall the famous assertion of P.S. Laplace (1812) in his "Théorie Analytique des Probabilités" to the effect that, given the initial conditions and some relevant data, one can predict with complete certainty the location of a moving particle at any given time. Laplace believed that "the curve described by a simple molecule of air or any gas is regulated in a manner as certain as the planetary orbits, the only difference between them lies in our ignorance". "Give me the sufficient data", claimed Laplace, "and I will tell you the exact location of a ball on a billiard table". A less rigid approach is to use random variables since, after all, even the initial conditions are random quantities.
This amounts to postulating stress to be a random variable based on a priori considerations. The strength cannot be computed from a priori considerations and can only be estimated by means of statistical methods from the results of tests specifically geared for this purpose. This sets limitations on the amount of data that can be generated and increases the temptation to use expert elicitation and Bayesian methods. However, considerations regarding the nature of stress and strength do not require that they be related in any way. Hence, in the 1970s the vast majority of authors conducted inference about P(X < Y) under the assumption that X and Y are independent variables.
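For the normal case referred to by Wolfe and Hogg, the independence assumption leads to a simple closed form: if X ~ N(mu_x, sd_x^2) and Y ~ N(mu_y, sd_y^2) are independent, then Y - X is normal and P(X < Y) = Phi((mu_y - mu_x) / sqrt(sd_x^2 + sd_y^2)). The Python sketch below evaluates this expression. As an assumption going beyond the independent case, it also accepts a correlation rho for jointly normal (X, Y), which changes only the variance of Y - X; the function and parameter names are hypothetical.

```python
import math


def standard_normal_cdf(z):
    """Cdf of the standard normal distribution via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def reliability_normal(mu_x, sd_x, mu_y, sd_y, rho=0.0):
    """P(X < Y) for jointly normal (X, Y); rho = 0 corresponds to independence."""
    # Var(Y - X) = sd_x^2 + sd_y^2 - 2 * rho * sd_x * sd_y
    var_diff = sd_x ** 2 + sd_y ** 2 - 2.0 * rho * sd_x * sd_y
    if var_diff <= 0.0:
        raise ValueError("Y - X must have positive variance")
    return standard_normal_cdf((mu_y - mu_x) / math.sqrt(var_diff))


# With a common standard deviation, P(X < Y) depends only on the
# standardized difference delta = (mu_y - mu_x) / sd.
for delta in (0.0, 0.5, 1.0, 2.0):
    r = reliability_normal(0.0, 1.0, delta, 1.0)
    print(f"delta = {delta:.1f}   P(X < Y) = {r:.4f}")
```

With a common standard deviation the loop tabulates how a standardized mean difference translates into a probability, which is the kind of restatement Wolfe and Hogg argue is easier for practitioners to interpret.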


As Bilikam (1985) points out in his thought-provoking paper, the strength is "necessarily conditioned on the stress because the physical realization of strength is found only when stress is applied". Bilikam (1985) extends his investigations of the relationship between strength and stress and assumes that both strength and stress are realizations of continuous random processes. He studies, however, very simple models, assuming that X and Y are stochastically independent and are related via time-dependent parameter values $\theta_1(t)$ and $\theta_2(t)$ (see Section 6.5.1). Unfortunately, very little has been done to further investigate the time-dependent stress-strength model. We strongly encourage our readers to pursue this topic in their future research.

1.2.2 Mathematical Formulations

Above we have attempted to provide justification for the model P(X < Y). This, however, is not the only quantity of interest in a variety of practical situations. Constructing and operating complicated devices leads to estimation of system reliability. This occurs when a device under consideration is a combination of k, usually independent, components with strengths $Y_1, \ldots, Y_k$, and each component of the system is subject to a common shock of random magnitude X. The most popular models of this type are parallel systems, which operate successfully whenever at least one of the k components survives, and series systems, which survive only when all of the components are intact. The reliabilities of the parallel and series systems can be expressed as $P(X < \max(Y_1, \ldots, Y_k))$ and $P(X < \min(Y_1, \ldots, Y_k))$, respectively. More diverse probabilities come to light if a system is more complex, for example, if the system functions properly when at least s, 1 < s < k, components survive the shock, or if it consists of a number of independent subsystems, say m, performing different tasks. We shall study models of this type in some detail in Section 6.1.

Another important quantity of interest is $P(X_1 < X_2 < \cdots < X_k)$, which appears in isotonic regression problems where it is necessary to estimate $P(X_1 < X_2 < \cdots < X_k)$ to obtain the level probabilities. An important particular case is estimation of P(X

person's blood pressure has two limits - systolic and diastolic - and his/her blood pressure should lie within these limits. Probabilities of this kind are treated in Section 6.2.

Finally, in some applications the data may consist of two independent random vectors X and Y, and one is interested in estimating the probability that the linear combination A'X + B'Y exceeds a certain level, namely, P(A'X + B'Y + C > 0), where A and B are vectors and C is a scalar. Models of this sort occur when, for example, $X^{(i)}$ is the density of vehicles of type i on a bridge (cars, buses, trucks and so on) and $A^{(i)}$ is the damage (stress) caused by a vehicle of type i, $i = 1, \ldots, k_1$. If the strength of the system is provided by several components (for example, special steel or concrete, or extra strong supports for a bridge), then the strength of the system can be viewed as a linear combination of the random components $Y^{(i)}$, $i = 1, \ldots, k_2$, that is, B'Y. In this model, the stress X and the strength Y are independent. As usual, reliability of the system is defined as the probability that the strength exceeds the stress: P(A'X < B'Y). If we are interested in estimating the probability that the strength exceeds the stress by a fixed value C, the problem reduces to estimating P(A'X + B'Y + C > 0). Estimation of P(A'X + B'Y + C > 0) is studied in detail in Sections 3.4 and 3.5.

1.3 Stress-Strength Models: History and Geography

1.3.1 History

It may be of interest to point out that chronologically the stress-strength model originated not in a parametric but rather in a nonparametric set-up, in the pathbreaking works of Wilcoxon (1945) and Mann and Whitney (1947). The main objective of these investigations was to compare two random variables X and Y which describe the results of two treatments. Wilcoxon, Mann and Whitney introduced the statistic which bears their names and is based on the ranks of the observations on X and Y in the joint sample. They also pointed out the connection between the hypothesis $F_X = F_Y$ and P(X < Y) = 1/2. Their initial effort led to a series of papers studying point and interval estimation of P(X < Y) in the sixties of the last century. Here we should mention Birnbaum (1956), Birnbaum and McCarty (1958) (this was the first paper with P(X < Y) in its title), Govindarajulu (1967, 1968), Owen et al. (1964), Sen (1960, 1967), Van Dantzig (1951) and Zaremba (1965), among others. Nonparametric methods were "safe" in the sense that they imposed no assumptions on X and Y; however, they may have been too inefficient for practical purposes. In a way, this methodology was somewhat akin to the approach encountered by one of the authors of this book during his sojourn in China in the late seventies, where at that time dentures of only one size were available and the patients were compelled to adjust their mouths to these average specifications.

The first attempt to study P(X < Y) under certain parametric assumptions on X and Y was undertaken by Owen et al. (1964), who constructed confidence limits for P(X < Y) when X and Y are dependent or independent normally distributed random variables. In the sixties very little was done to investigate a parametric version of the stress-strength model; in the seventies, however, investigation of the topic gathered some steam. By the end of the seventies, estimation of P(X < Y) had been carried out for the major distributions such as exponential (Kelley et al. (1976), Tong (1974)), normal (Church and Harris (1970), Downton (1973), Woodward and Kelley (1977)), Pareto (Beg and Singh (1979)) and exponential families (Tong (1977)). Also, significant advances in Bayes estimation of P(X < Y) for exponentially or normally distributed X and Y were made by Enis and Geisser (1971). The other milestones of the seventies are the introduction of nonparametric empirical Bayes estimation of P(X < Y) (Ferguson (1973), Hollander and Korwar (1976)) and the study of system reliability (Bhattacharyya and Johnson (1974)).

By the late eighties estimators of P(X < Y) had been obtained for the majority of common distribution families for situations in which X and Y are independent; see e.g. Awad and Gharraf (1986), Beg (1980a,b,c), Constantine et al. (1986), Ismail et al. (1986), Iwase (1987), Reiser and Guttman (1986), Voinov (1984). At the same time, efforts were also made to consider broader, more realistic models.
In view of the successful introduction of a variety of bivariate exponential distributions by Gumbel (1960), Freund (1961), Marshall and Olkin (1967), and Block and Basu (1974), it became possible to study dependent exponential random variables with various types of dependence. Estimators of P(X < Y) for a bivariate exponential random vector (X, Y) were derived by Abu-Salih and Shamseldin (1988), Awad et al. (1981), and Klein and Basu (1985), among others. Pensky (1982) constructed estimators of P(A'X + C > 0) for a normally distributed random vector X with a general variance-covariance matrix. Going further, Bilikam (1985), discussed above, suggested a time-dependent model for X and Y, and Raghava Char et al. (1984) studied stress and strength Markov models for system reliability. Some other important advances of the eighties were investigations of extensions of the "standard" stress-strength models, such as stress-strength models with categorized data (see e.g. Brownie (1988), Halperin et al. (1989), Simonoff et al. (1986)) or with explanatory variables (Guttman et al. (1988)). Another major achievement was the application of the theory to a variety of real-world problems (see e.g. Guttman et al. (1988), Johnstone (1983), Halperin et al. (1987, 1989), Simonoff et al. (1986), Ury and Wiggins (1979)). Some of the above-mentioned works were summarized in the review paper by Johnson (1988).

In the nineties and the years 2000-2002 we have witnessed further developments in stress-strength models. More diverse probabilities such as $P(X_1 < \cdots < X_k)$ and $P(A_1'X_1 + \cdots + A_k'X_k + C > 0)$, where $X_1, \ldots, X_k$ are independent normal vectors, were studied (see e.g. Ivshin (1998), Ivshin and Lumelskii (1994), Hayter and Liu (1996), Miwa et al. (2000)), and some new, less familiar distributions were considered, such as Burr type X (Ahmad et al. (1997), Surles and Padgett (1998, 2000)), mixtures of inverse Gaussian (Akman et al. (1999)), skew-normal (Azzalini and Chiogna (2002), Gupta and Brown (2001)), Weinman multivariate exponential (Cramer and Kamps (1997a), Cramer (2001)), bivariate Pareto (Hanagal (1997a)), elliptical (Pensky (2002)), and generalized gamma (Pham and Almhana (1995), Pensky and Takashima (2002)). The field seems to have reached its maturity.

It is virtually impossible to mention here every author who has contributed to the development of the stress-strength models, and we apologize in advance for inadvertently omitting some names in this brief historical review.

1.3.2 Geography

In our age of globalization, the Internet and instantaneous worldwide communications, geography rarely affects the development of scientific theories. This, however, is not entirely valid as far as the stress-strength models are concerned. The research in this area has been conducted all over the world, and the results have appeared in publications ranging from the Journal of the American Statistical Association to the Pakistan Journal of Statistics, from the Canadian Journal of Statistics to the Chinese Journal of Mathematics, from the Journal of the Korean Statistical Society to the Journal of Mathematical Sciences (which publishes translations of Russian collections), from the Journal of Indian Association for Productivity, Quality Control and


Reliability to Prace Naukowe Instytutu Matematycznego Politechniki Wroclawskiej (Proceedings of the Institute of Mathematics of the Wroclaw Polytechnic). However, the bulk of the results was obtained by American, Russian, Canadian and Indian scientists (some of the latter residing in the USA). The peculiarity of the situation was that the Russian school, being mainly confined to the provincial city of Perm, where almost no foreign publications were available, developed and worked in complete isolation from their Western and Eastern (Indian) colleagues. For this reason, useful techniques in estimation theory developed in Perm from the late sixties to the early eighties (see e.g. Lumelskii (1969a), Lumelskii and Sapoznikov (1969), Lumelskii and Pensky (1982)) were unknown to the rest of the world. On the other hand, due to the lack of information, Russian scientists occasionally "reinvented the wheel", constructing estimators that were already available in the literature. It should also be noted that their derivations of the best unbiased estimators were based on a general technique (described in Section 2.2) and were supplemented by estimators of the corresponding variances. The main thrust of exploration of the Perm school was the maximum likelihood and the best unbiased estimation of $P(X < Y)$ and of $P(A_1'X_1 + \cdots + A_k'X_k + C > 0)$, where $X_1, \ldots, X_k$ are independent normal vectors ($k > 1$). Their work of some twenty-five years culminated in the monograph by Ivshin and Lumelskii (1995) already mentioned above.

1.4 Applications

As was already pointed out, the subject matter of this book, the stress-strength models, originated from a seemingly unrelated problem of classical non-parametric tests of equality of two distribution functions. It then naturally led to expressions of the type P(X < Y), and next it was realized that these quantities can be fruitful for examining the probability of inequality-type relations between two or more random variables under a great variety of conditions and situations. This naturally resulted in applications to numerous engineering problems under the banner of "reliability", provided that the random variables under consideration admit an appropriate interpretation. These will be discussed in detail in Chapter 7, in which a number of specific engineering cases and other applications are described.


Next it became evident that practical applications are by no means confined to engineering or to military problems (only a small portion of the latter research is public knowledge). In fact, the advances in medical statistics in the last twenty years triggered numerous applications to medically oriented problems, of which clinical trials are one of the fastest growing areas. Next came applications in psychology, which required adjustment of the theory to accommodate categorical data. Further natural applications, especially in but not limited to medicine, involve comparison of two or more random variables representing the state of affairs in two or more situations at different time intervals. The new frontier of potential application is real-world problems where the model cannot be viewed as involving independent identically distributed random variables and is more appropriately represented by binary data, leading to the so-called "ROC approach" with a strong dose of logistic regression. One of the recent applications is the challenging problem of estimating the unknown strength characteristics from the observable distribution of stress, which leads to more interesting probabilistic and statistical-theoretical problems. Another possible application, still in its infancy, is the relation between the stress-strength models and quality control concepts, specifically the so-called process capability indices that originated in the quality control literature some twenty years ago. It should be noted that, as sources of numerical data become more widely available and statistical calculation becomes more accessible due to the rapid advances in computer technology, more and more applications are to be expected. The stress-strength relation is a universal, flexible relation easily adaptable to various fields of human endeavor and natural phenomena. It is a powerful tool for comparing and dissecting interrelated situations; the simplicity of the model can be both deceiving and rewarding.

Chapter 2

The Theory and Some Useful Approaches

2.1 The Maximum Likelihood Estimators

2.1.1 The Theory

Maximum likelihood estimation (MLE) is undoubtedly the most popular procedure (at least until now) for estimating the reliability R = P(X < Y), owing to its flexibility and generality. The technique can always be used if the joint distribution of the stress X and the strength Y is a known function with some unknown parameters. A detailed description of the MLE method is presented, for example, in Casella and Berger (1990) and Lehmann and Casella (1998). Here, we shall concentrate on the MLE of the reliability R = P(X < Y).

Assume that a random vector (X, Y) has the probability density function (pdf) $f(x, y|\theta)$ with an unknown scalar or vector-valued parameter $\theta \in \Theta$. The aim is to estimate R on the basis of observations $(X_1, Y_1), \ldots, (X_n, Y_n)$. Note that if X and Y are independent with the pdf of the form

$$f(x, y|\theta) = f_X(x|\theta)\, f_Y(y|\theta), \tag{2.1}$$

the number of observations for X and Y need not be the same. In general, the data is of the form

$$(\underline{X}, \underline{Y}) = (X_1, \ldots, X_{n_1}, Y_1, \ldots, Y_{n_2}) \tag{2.2}$$

with $n_1 = n_2$ if X and Y are dependent.


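Let $f(\underline{X}, \underline{Y}|\theta)$ denote the joint pdf of the data, i.e., for paired observations with $n_1 = n_2 = n$,

$$f(\underline{X}, \underline{Y}|\theta) = \prod_{i=1}^{n} f(X_i, Y_i|\theta). \tag{2.3}$$

Note that if X and Y are independent, (2.3) becomes

$$f(\underline{X}, \underline{Y}|\theta) = \prod_{i=1}^{n_1} f_X(X_i|\theta) \prod_{j=1}^{n_2} f_Y(Y_j|\theta). \tag{2.4}$$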

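Definition 2.1 Given that $(\underline{X}, \underline{Y})$ is observed, the function of $\theta$ defined by $L(\theta|\underline{X}, \underline{Y}) = f(\underline{X}, \underline{Y}|\theta)$ is called the likelihood function.

Definition 2.2 The maximum likelihood estimator (MLE) $\hat\theta = \hat\theta(\underline{X}, \underline{Y})$ of the parameter $\theta$ based on the sample $(\underline{X}, \underline{Y})$ is the parameter value at which the likelihood function $L(\theta|\underline{X}, \underline{Y})$ attains its maximum as a function of $\theta$.

Theorem 2.1 (Invariance property of the MLEs.) If $\hat\theta$ is the MLE of $\theta$, then for any function $\varphi(\theta)$ the MLE of $\varphi(\theta)$ is $\varphi(\hat\theta)$.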
2.1.2 Construction of the MLE

From the description above it follows that maximum likelihood estimation of R involves three steps:

1. calculation of the reliability $R = R(\theta) = P(X < Y)$ as a function of $\theta$;
2. construction of the MLE $\hat\theta$ of the parameter $\theta$;
3. calculation of the MLE $\hat R = R(\hat\theta)$ of R.

We shall focus upon the first two tasks, the last step being trivial. By the definition of reliability, $R(\theta)$ can be calculated as

$$R(\theta) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y|\theta)\, I(x < y)\, dx\, dy. \tag{2.5}$$

If X and Y are independent with the pdfs $f_X(x|\theta)$ and $f_Y(y|\theta)$ and the cumulative distribution functions (cdfs) $F_X(x|\theta)$ and $F_Y(y|\theta)$, respectively, (2.5) can be rewritten as

$$R(\theta) = \int_{-\infty}^{\infty} F_X(z|\theta)\, f_Y(z|\theta)\, dz = \int_{-\infty}^{\infty} \bigl(1 - F_Y(z|\theta)\bigr) f_X(z|\theta)\, dz. \tag{2.6}$$
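The first form of (2.6) is easy to check numerically. The sketch below is only an illustration: the helper names, the Simpson rule, the rates and the truncation point of the integral are all assumptions made here, not part of the text. It integrates $F_X(z) f_Y(z)$ for two exponential laws and compares the result with the closed form $\alpha/(\alpha+\beta)$.

```python
import math

def simpson(g, lo, hi, n=2000):
    """Composite Simpson rule for the integral of g over [lo, hi] (n even)."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3

def reliability_independent(cdf_x, pdf_y, lo, hi):
    """R = P(X < Y) as the integral of F_X(z) * f_Y(z), first form of (2.6)."""
    return simpson(lambda z: cdf_x(z) * pdf_y(z), lo, hi)

# Hypothetical rates, chosen only for illustration.
alpha, beta = 2.0, 0.5

r_numeric = reliability_independent(
    lambda z: 1.0 - math.exp(-alpha * z),  # F_X for an exponential stress
    lambda z: beta * math.exp(-beta * z),  # f_Y for an exponential strength
    0.0, 80.0,                             # the tail beyond 80 is negligible here
)
r_exact = alpha / (alpha + beta)           # closed form for two exponentials
```

For other families only the two lambdas and the integration range need to change; a fixed-step rule is used here for transparency rather than efficiency.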

In the case when either X or Y is exponentially distributed, the probability $R(\theta)$ can be expressed via the Laplace transform (see e.g. Nandy and Aich (1994)). For instance, if X is exponential with the cdf $F_X(x|a) = 1 - \exp\{-ax\}$, $x > 0$, and Y is nonnegative, then

$$R(\theta) = 1 - \int_{0}^{\infty} \exp\{-az\}\, f_Y(z|\theta)\, dz = 1 - \mathcal{L}[f_Y](a), \tag{2.7}$$

where $\mathcal{L}[f_Y](a)$ is the Laplace transform of $f_Y(y|\theta)$ calculated at the point a.

If the pdf of (X, Y) is not available but the characteristic function $\varphi(z_1, z_2; \theta) = E[\exp(i z_1 X + i z_2 Y)]$ is known, one can use an alternative representation for R. The indicator function $I(c > 0)$ can be expressed as

$$I(c > 0) = \frac{1}{2} + \frac{1}{\pi}\, \mathrm{Im} \int_0^{\infty} \frac{e^{icz}}{z}\, dz \tag{2.8}$$

(see e.g. Gradshtein and Ryzhik (1980), 3.721). Here $\mathrm{Im}(z)$ denotes the imaginary part of z and $i = \sqrt{-1}$. Choosing $c = y - x$ and substituting (2.8) into (2.5) yields

$$R(\theta) = \frac{1}{2} + \frac{1}{\pi}\, \mathrm{Im} \int_0^{\infty} \frac{\varphi(-z, z; \theta)}{z}\, dz. \tag{2.9}$$
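As a hedged illustration of (2.9), the sketch below evaluates the integral by a midpoint rule for independent normal X and Y, a case chosen here only because the answer $\Phi\bigl((m_Y - m_X)/\sqrt{s_X^2 + s_Y^2}\bigr)$ is known in closed form. The function names, the parameter values, the truncation point and the quadrature rule are assumptions of this example.

```python
import cmath
import math

# Hypothetical parameters of independent normal stress X and strength Y.
MX, SX = 1.0, 1.0
MY, SY = 2.0, 0.5

def normal_cf(z1, z2):
    """Joint characteristic function E exp(i*z1*X + i*z2*Y) for independent normals."""
    return cmath.exp(1j * (z1 * MX + z2 * MY)
                     - 0.5 * (z1 * z1 * SX * SX + z2 * z2 * SY * SY))

def reliability_from_cf(cf, upper, n=20000):
    """R = 1/2 + (1/pi) * Im of the integral of cf(-z, z)/z over (0, upper)."""
    h = upper / n
    total = 0.0
    for i in range(n):
        z = (i + 0.5) * h              # midpoints keep z away from 0
        total += (cf(-z, z) / z).imag
    return 0.5 + total * h / math.pi

r_cf = reliability_from_cf(normal_cf, upper=10.0)   # integrand is negligible past 10

# Closed form: Y - X is normal with mean MY - MX and variance SX^2 + SY^2.
s = math.sqrt(SX * SX + SY * SY)
r_exact = 0.5 * (1.0 + math.erf((MY - MX) / (s * math.sqrt(2.0))))
```

The upper limit must be chosen so that the characteristic function has decayed; for heavier-tailed laws a larger limit or an adaptive rule would be needed.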

Since relation (2.9) occasionally leads to much easier calculations (as we shall show in Sections 3.4 and 3.5), it can be advantageous to use (2.9) even if the pdf $f(x, y|\theta)$ is available.

Having derived an expression for $R(\theta)$, we now construct the MLE $\hat\theta$ of the unknown parameter $\theta$ as indicated above. Since the logarithmic function is strictly increasing, maximizing $L(\theta|\underline{X}, \underline{Y})$ is equivalent to maximizing the logarithm of the likelihood function, i.e.

$$\ln L(\hat\theta|\underline{X}, \underline{Y}) = \max_{\theta} \ln L(\theta|\underline{X}, \underline{Y}),$$

where $\ln L(\theta|\underline{X}, \underline{Y}) = \ln f(\underline{X}, \underline{Y}|\theta)$ and $f(\underline{X}, \underline{Y}|\theta)$ is given by (2.3) or (2.4).


Due to the invariance property of the maximum likelihood estimators, the MLE of R has the form

$$\hat R = R(\hat\theta). \tag{2.10}$$

The concept of the mean squared error (MSE) of $\hat R$ may require some elaboration. By definition,

$$\mathrm{MSE}(\hat R) = E\bigl(\hat R - R(\theta)\bigr)^2 = \int \bigl(\hat R - R(\theta)\bigr)^2 f(\underline{X}, \underline{Y}|\theta) \prod_{i=1}^{n_1} dX_i \prod_{j=1}^{n_2} dY_j. \tag{2.11}$$

To construct the MLE of $\mathrm{MSE}(\hat R)$ one thus needs to calculate $\mathrm{MSE}(\hat R)$ as a function of $\theta$ using the expression (2.11) and then replace $\theta$ by its MLE $\hat\theta$. Often, however, $\mathrm{MSE}(\hat R)$ is difficult to express not only via elementary but even via special functions. In such cases, the only general technique available so far for evaluation of $\mathrm{MSE}(\hat R)$ is the bootstrap methodology to be discussed in Chapter 4. As we shall see below, a complicated form of $\mathrm{MSE}(\hat R)$ may be a disadvantage of maximum likelihood estimation compared with the unbiased estimation studied in Section 2.2.

2.1.3 One-parameter Exponential Distribution

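We shall now illustrate the derivation of the MLE. For didactic purposes, we shall perform all three steps described above. However, for a number of well-known distribution families the MLEs of the parameters are readily available; these estimators can be found in e.g. Johnson et al. (1994), (1995).

Consider the case when X and Y are independent exponential random variables with the pdfs $f_X(x|\alpha) = \alpha \exp(-\alpha x)$ and $f_Y(y|\beta) = \beta \exp(-\beta y)$, where the vector parameter $\theta = (\alpha, \beta)$ is unknown. From (2.6), we have

$$R = \int_0^{\infty} \bigl[1 - \exp(-\alpha z)\bigr]\, \beta \exp(-\beta z)\, dz = \frac{\alpha}{\alpha + \beta}. \tag{2.12}$$

If $X_1, \ldots, X_{n_1}$ and $Y_1, \ldots, Y_{n_2}$ are samples from $f_X(x|\alpha)$ and $f_Y(y|\beta)$, respectively, then (cf. (2.4))

$$f(\underline{X}, \underline{Y}|\alpha, \beta) = \alpha^{n_1} \beta^{n_2} \exp\{-\alpha n_1 \bar X - \beta n_2 \bar Y\}, \tag{2.13}$$

where

$$\bar X = n_1^{-1} \sum_{i=1}^{n_1} X_i, \qquad \bar Y = n_2^{-1} \sum_{j=1}^{n_2} Y_j. \tag{2.14}$$

The MLEs $\hat\alpha$ and $\hat\beta$ of $\alpha$ and $\beta$ are the values of $\alpha$ and $\beta$ maximizing

$$\ln f(\underline{X}, \underline{Y}|\alpha, \beta) = n_1 \ln \alpha + n_2 \ln \beta - \alpha n_1 \bar X - \beta n_2 \bar Y.$$

Taking the derivatives of the last expression with respect to $\alpha$ and $\beta$ and equating them to zero, we arrive at the equations $n_1/\alpha - n_1 \bar X = 0$ and $n_2/\beta - n_2 \bar Y = 0$, which yield $\hat\alpha = 1/\bar X$ and $\hat\beta = 1/\bar Y$. Hence, the MLE of R is simply

$$\hat R = \frac{\hat\alpha}{\hat\alpha + \hat\beta} = \frac{\bar Y}{\bar X + \bar Y}. \tag{2.15}$$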
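The three steps can be sketched in a few lines for the exponential case. The code below is an illustration under assumptions: the function names and the simulation settings are hypothetical, and the comparison value $2r^2(1+r)^{-4}/n$ with $r = \alpha/\beta$ is the usual first-order (delta-method) approximation to the MSE for equal sample sizes, used here only as a sanity check.

```python
import random

def mle_reliability_exponential(xs, ys):
    """MLE (2.15) of R = P(X < Y): ybar / (xbar + ybar)."""
    xbar = sum(xs) / len(xs)
    ybar = sum(ys) / len(ys)
    return ybar / (xbar + ybar)

def simulated_mse(alpha, beta, n, reps, seed=0):
    """Monte Carlo MSE of the MLE for two samples of size n (rates alpha, beta)."""
    rng = random.Random(seed)
    r_true = alpha / (alpha + beta)
    total = 0.0
    for _ in range(reps):
        xs = [rng.expovariate(alpha) for _ in range(n)]  # expovariate takes a rate
        ys = [rng.expovariate(beta) for _ in range(n)]
        total += (mle_reliability_exponential(xs, ys) - r_true) ** 2
    return total / reps

def first_order_mse(alpha, beta, n):
    """Delta-method approximation 2 r^2 (1 + r)^(-4) / n with r = alpha / beta."""
    r = alpha / beta
    return 2.0 * r * r / ((1.0 + r) ** 4 * n)

if __name__ == "__main__":
    print(mle_reliability_exponential([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
    print(simulated_mse(1.0, 1.0, n=30, reps=4000), first_order_mse(1.0, 1.0, 30))
```

The simulation is seeded so that repeated runs agree; the agreement with the first-order term improves as n grows.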

Note that $\hat R$ depends on the samples via the ratio $\bar X/\bar Y$ only. The MLE (2.15), initially obtained by Tong (1974), is one of the basic formulas of the stress-strength statistical theory. Several authors discussed evaluation of the MSE of $\hat R$ given by (2.15) or construction of its upper bounds (see e.g. Kelley et al. (1976), Sathe and Shah (1981), Chao (1982) and Jana (1997)). Here, we shall follow the approach of Chao (1982), who derived an asymptotic expression for (2.11) in this case. The technique of Chao (1982) is based on expanding $\hat R - R$ into a Taylor series in $(\bar X - 1/\alpha)$ and $(\bar Y - 1/\beta)$ and then determining the bias $B = E\hat R - R$ and $\mathrm{MSE}(\hat R)$. For $n_1 = n_2 = n$, Chao (1982) obtained

$$B = r(r - 1)(1 + r)^{-3} \bigl[\, n^{-1} + (r^2 - 4r \cdots) \cdots \bigr]$$

and

$$\mathrm{MSE}(\hat R) = 2r^2 (1 + r)^{-4} n^{-1} + 4r^2 (2r - 1)(r - 2)(1 + r)^{-6} n^{-2} + o(n^{-2}), \tag{2.16}$$

where $r = \alpha/\beta$. Estimating r by $\hat r = \bar Y/\bar X$ and plugging $\hat r$ into (2.16), we obtain an estimator of the asymptotic MSE given by (2.16).

2.1.4 Multivariate Case

A few remarks on "stress-strength" models where X and Y are independent or dependent random vectors are in order. If $X = (X^{(1)}, \ldots, X^{(k_1)})$ and $Y = (Y^{(1)}, \ldots, Y^{(k_2)})$ are random vectors, one may be interested in the probability $R = P[(X, Y) \in \Omega]$, where $\Omega$ is a subset of the $(k_1 + k_2)$-dimensional Euclidean space. For example, one may consider

$$\Omega_1 = \{(x, y) : A'x > B'y\} \tag{2.17}$$

or

$$\Omega_2 = \{(x, y) : A'x + B'y + C > 0\}, \tag{2.18}$$

where A and B are known vectors and C is a known scalar (see e.g. Pensky (1982), Gupta and Gupta (1990), Ivshin and Lumelskii (1993) and Reiser and Faraggi (1994)). In the case when the vectors X and Y have the same dimension ($k_1 = k_2 = k$), another important quantity is $R = P((X, Y) \in \Omega_3)$, where

$$\Omega_3 = \{(x, y) : x_i < y_i,\ i = 1, \ldots, k\} \tag{2.19}$$

(see e.g. Singh (1981)). Similarly to the one-dimensional case, when constructing the MLE of R, the first step is evaluation of

$$R(\theta) = \iint_{\Omega} f(x, y|\theta)\, dx\, dy.$$

Then, the MLE of R is of the form $\hat R = R(\hat\theta)$, where $\hat\theta$ is the MLE of $\theta$.
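When the integral for $R(\theta)$ over a region $\Omega$ has no convenient closed form, one option is to evaluate it by simulation at a given parameter value. The sketch below is an assumption-laden illustration, not a method from the text: the function names, the choice of independent exponential components and their rates are hypothetical. It estimates $P((X, Y) \in \Omega_3)$ for the region (2.19) and compares it with the product of componentwise probabilities, which is exact under the assumed independence.

```python
import random

def monte_carlo_reliability(draw_xy, in_region, reps=20000, seed=0):
    """Estimate R = P((X, Y) in Omega) as the fraction of draws falling in Omega."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x, y = draw_xy(rng)
        if in_region(x, y):
            hits += 1
    return hits / reps

# Hypothetical model: independent exponential components with these rates.
ALPHAS = [2.0, 3.0]   # rates of the components of X
BETAS = [1.0, 1.0]    # rates of the components of Y

def draw_exponential_vectors(rng):
    x = [rng.expovariate(a) for a in ALPHAS]
    y = [rng.expovariate(b) for b in BETAS]
    return x, y

def in_omega3(x, y):
    """Region (2.19): each component of x is below the matching component of y."""
    return all(xi < yi for xi, yi in zip(x, y))

r_mc = monte_carlo_reliability(draw_exponential_vectors, in_omega3)

# With independent components the probabilities multiply: prod alpha_i/(alpha_i+beta_i).
r_exact = 1.0
for a, b in zip(ALPHAS, BETAS):
    r_exact *= a / (a + b)
```

Replacing `in_omega3` by a test of `A'x + B'y + C > 0` gives the same kind of estimate for the region (2.18).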

2.2 Unbiased Estimation

2.2.1 The Theory

The merit of the MLE approach is that it is universal and allows one to obtain an estimator of R for practically any distribution family. However, estimators derived by the MLE method may be biased, which is undesirable, especially if the sample size is small. In such a situation, a way out might be the construction of an unbiased estimator of R = P(X < Y). Below we shall briefly discuss unbiased estimation of R. A more detailed description of the methods of unbiased estimation and the sufficiency principle on which this method is based can be found in any standard text on statistical inference, e.g. Casella and Berger (1990) or Lehmann and Casella (1998). In this section we shall assume the same parametric set-up as in Section 2.1, postponing the discussion of nonparametric unbiased estimation until


Chapter 5. To develop the procedure one needs to assume that the family of pdfs $f(x, y|\theta)$ has a sufficient statistic.

Definition 2.3 A statistic $T = T(\underline{X}, \underline{Y})$ is said to be a sufficient statistic for $\theta$ if the conditional pdf of the sample given the value of T does not depend on $\theta$.

Intuitively, if T is a sufficient statistic for $\theta$, then T captures all information about the parameter $\theta$ that the sample $(\underline{X}, \underline{Y})$ contains. To find a sufficient statistic for the family of pdfs $f(x, y|\theta)$ the theorem below can be used (see e.g. Casella and Berger (1990) or Lehmann and Casella (1998)).

Theorem 2.2 (Factorization Theorem.) A statistic T is a sufficient statistic for $f(x, y|\theta)$ if and only if there exist functions $q(\cdot|\theta)$ and $h(\underline{X}, \underline{Y})$ (the latter does not depend on $\theta$) such that for all sample points $\underline{X}, \underline{Y}$ and all possible $\theta \in \Theta$ the joint pdf of $(\underline{X}, \underline{Y})$ defined in (2.3) or (2.4) is of the form

$$f(\underline{X}, \underline{Y}|\theta) = q(T(\underline{X}, \underline{Y})|\theta)\, h(\underline{X}, \underline{Y}). \tag{2.20}$$

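There are infinitely many unbiased estimators of R = P(X < Y) based on the sample $(\underline{X}, \underline{Y})$, e.g. $V(\underline{X}, \underline{Y}) = I(X_1 < Y_1)$, $V(\underline{X}, \underline{Y}) = [\min(n_1, n_2)]^{-1} \sum_{j=1}^{\min(n_1, n_2)} I(X_j < Y_j)$, etc. Our objective, however, is to find the one which has the smallest variance (and, consequently, the smallest MSE) for all values of $\theta$.

Definition 2.4 Let $\varphi(\theta)$ be any parametric function. Then the unbiased estimator $V^*(\underline{X}, \underline{Y})$ of $\varphi(\theta)$ is called the uniformly minimum variance unbiased estimator (UMVUE) of $\varphi(\theta)$ if, for any other unbiased estimator $V(\underline{X}, \underline{Y})$ of $\varphi(\theta)$, the inequality

$$\mathrm{Var}\, V^*(\underline{X}, \underline{Y}) \le \mathrm{Var}\, V(\underline{X}, \underline{Y}) \tag{2.21}$$

is valid for any $\theta \in \Theta$.

The existence and construction of a UMVUE of R is based on the following basic theorem (see e.g. Casella and Berger (1990) or Lehmann and Casella (1998)).

Theorem 2.3 (Cramer-Rao-Blackwell). If $V(\underline{X}, \underline{Y})$ is any unbiased estimator of $\varphi(\theta)$ and T is a complete sufficient statistic for $\theta$, then $V^*(\underline{X}, \underline{Y}) = E[V(\underline{X}, \underline{Y})|T]$ is the UMVUE of $\varphi(\theta)$.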
2.2.2 Construction of UMVUEs

Since $V(\underline{X}, \underline{Y}) = I(X_1 < Y_1)$ is the simplest unbiased estimator of R, the UMVUE of R can be obtained as $\hat R = E[I(X_1 < Y_1)|T]$, namely

$$\hat R = \iint I(X_1 < Y_1)\, p(X_1, Y_1|T)\, dX_1\, dY_1, \tag{2.22}$$

where $p(X_1, Y_1|T)$ is the conditional pdf of $(X_1, Y_1)$ given T. In what follows, however, we shall suggest another, somewhat more powerful technique for derivation of the UMVUE of R, based on the UMVUE of the pdf $f(x, y|\theta)$. For this purpose, we shall consider a more general problem of construction of the UMVUE of the joint pdf $f(x_1, \ldots, x_k, y_1, \ldots, y_k|\theta)$ of $(X_1, \ldots, X_k, Y_1, \ldots, Y_k)$ with $k < \min(n_1, n_2)$ (see e.g. Lumelskii and Sapoznikov (1969) or the monograph of Voinov and Nikulin (1993)).

Theorem 2.4 Let $\theta_0 \in \Theta$ be an arbitrary value of $\theta$. Denote by $g(T|\theta_0)$ and $g_{\theta_0}(T|X_1 = x_1, \ldots, X_k = x_k, Y_1 = y_1, \ldots, Y_k = y_k)$ the pdf of $T(\underline{X}, \underline{Y})$ and the conditional density of T for given $X_j = x_j$, $Y_j = y_j$, $j = 1, \ldots, k$, respectively, as $\theta = \theta_0$. Then the UMVUE of $f(x_1, \ldots, x_k, y_1, \ldots, y_k|\theta)$ is of the form

$$\hat f(x_1, \ldots, x_k, y_1, \ldots, y_k) = \prod_{j=1}^{k} f(x_j, y_j|\theta_0) \times \frac{g_{\theta_0}(T|X_1 = x_1, \ldots, X_k = x_k, Y_1 = y_1, \ldots, Y_k = y_k)}{g(T|\theta_0)}. \tag{2.23}$$

Proof Let $q(x_1, \ldots, x_k, y_1, \ldots, y_k, T|\theta)$ be the joint pdf of $X_1, \ldots, X_k, Y_1, \ldots, Y_k$ and T. Then, the conditional pdf of $(X_1, \ldots, X_k, Y_1, \ldots, Y_k)$ given T is a UMVUE of $f(x_1, \ldots, x_k, y_1, \ldots, y_k|\theta)$:

$$\hat f(x_1, \ldots, x_k, y_1, \ldots, y_k) = p(x_1, \ldots, x_k, y_1, \ldots, y_k|T) = \frac{q(x_1, \ldots, x_k, y_1, \ldots, y_k, T|\theta)}{g(T|\theta)}. \tag{2.24}$$

Note that $\hat f(x_1, \ldots, x_k, y_1, \ldots, y_k)$ does not depend on $\theta$. Also, it is an unbiased estimator of $f(x_1, \ldots, x_k, y_1, \ldots, y_k|\theta)$ since the expectation of the right-hand side of (2.24) is equal to

$$\int p(x_1, \ldots, x_k, y_1, \ldots, y_k|T)\, g(T|\theta)\, dT = f(x_1, \ldots, x_k, y_1, \ldots, y_k|\theta).$$

Finally, $\hat f(x_1, \ldots, x_k, y_1, \ldots, y_k)$ is a function of the sufficient statistic T and, therefore, is a UMVUE by the Cramer-Rao-Blackwell theorem.

To complete the proof, observe that $\hat f(x_1, \ldots, x_k, y_1, \ldots, y_k)$ does not depend on $\theta$, so that one can choose any value $\theta = \theta_0$ in the right-hand side of (2.24) and rewrite (2.24) as (2.23).

Since the structure of the equation given by (2.23) is similar to the Bayes formula, the method of construction of the UMVUE of $f(x_1, \ldots, x_k, y_1, \ldots, y_k|\theta)$ based on (2.23) is sometimes called the Bayes method (see e.g. Voinov and Nikulin (1993)). Evidently, formula (2.23) has been designed for dependent X and Y. If X and Y are independent with the pdf (2.1) and sufficient statistics $T_X$ and $T_Y$, then the UMVUE of $f(x_1, \ldots, x_k, y_1, \ldots, y_k|\theta)$ simplifies to

$$\hat f(x_1, \ldots, x_k, y_1, \ldots, y_k) = \hat f_X(x_1, \ldots, x_k)\, \hat f_Y(y_1, \ldots, y_k). \tag{2.25}$$

Here, $\hat f_X(x_1, \ldots, x_k)$ and $\hat f_Y(y_1, \ldots, y_k)$ can be calculated by the Bayes method. Namely,

$$\hat f_X(x_1, \ldots, x_k) = \prod_{j=1}^{k} f_X(x_j|\theta_0)\, \frac{g_{\theta_0}(T_X|X_1 = x_1, \ldots, X_k = x_k)}{g(T_X|\theta_0)}, \tag{2.26}$$
where $g_{\theta_0}(T_X|X_1 = x_1, \ldots, X_k = x_k)$ and $g(T_X|\theta_0)$ are, respectively, the conditional density of $T_X$ given $X_1 = x_1, \ldots, X_k = x_k$ and the pdf of $T_X$ for $\theta = \theta_0$. To obtain the expression for $\hat f_Y(y_1, \ldots, y_k)$ one needs only to replace x's (X's) by y's (Y's) in (2.26).

We are now ready to derive the UMVUE of an arbitrary positive integer power $R^k$ of R. This will be useful for unbiased estimation of R and $\mathrm{Var}(\hat R)$.

Theorem 2.5 The UMVUE of $R^k$ is of the form

$$\widehat{R^k} = \int \hat f(x_1, \ldots, x_k, y_1, \ldots, y_k) \prod_{j=1}^{k} \bigl[I(x_j < y_j)\, dx_j\, dy_j\bigr], \tag{2.27}$$

where $\hat f(x_1, \ldots, x_k, y_1, \ldots, y_k)$ is given by one of the relations (2.23) or (2.25), depending on the sample. In particular, the UMVUEs of R and $R^2$ are, respectively,

$$\hat R = \iint I(x < y)\, \hat f(x, y)\, dx\, dy, \tag{2.28}$$

$$\widehat{R^2} = \iiiint I(x_1 < y_1)\, I(x_2 < y_2)\, \hat f(x_1, x_2, y_1, y_2)\, dx_1\, dx_2\, dy_1\, dy_2. \tag{2.29}$$


We observe that formulae (2.22) and (2.28) provide identical expressions for $\hat R$. The reason is that the estimator $\hat f(x, y)$ is the conditional density of $(X_1, Y_1)$ given T at the point (x, y). However, although identical mathematically, the algorithms based on (2.22) and (2.28) are conceptually quite different in their construction. While (2.22) assumes that we derive the conditional density $p(X_1, Y_1|T)$ for every newly occurring distribution, (2.28) implies that we should utilize unbiased estimators of pdfs derived previously. These estimators can be found, for example, in Voinov and Nikulin (1993).

An advantage of the UMVUE over the MLE is that it is possible to provide a method for constructing a UMVUE of $\mathrm{MSE}(\hat R)$ which, in view of the unbiasedness of $\hat R$, is equal to the variance $\mathrm{Var}(\hat R)$.

Theorem 2.6 The UMVUE $\widehat{\mathrm{Var}}(\hat R)$ of $\mathrm{Var}(\hat R)$ is given by

$$\widehat{\mathrm{Var}}(\hat R) = (\hat R)^2 - \widehat{R^2}, \tag{2.30}$$

where $\hat R$ and $\widehat{R^2}$ are defined in (2.28) and (2.29), respectively.

The validity of this theorem (see Lumelskii (1968) and Ivshin and Lumelskii (1995)) follows from the fact that $\widehat{\mathrm{Var}}(\hat R)$ is an unbiased estimator of $\mathrm{Var}(\hat R)$ and is a function of the sufficient statistic by construction. Consequently, $\widehat{\mathrm{Var}}(\hat R)$ is the UMVUE of $\mathrm{Var}(\hat R)$ by the Cramer-Rao-Blackwell theorem. Evidently, formula (2.30) does not contain the unknown parameter $\theta$; hence, even if the integral in (2.30) cannot be obtained analytically, $\widehat{\mathrm{Var}}(\hat R)$ can always be evaluated numerically, which is hardly a problem nowadays. To appreciate the described methodology, we shall now discuss unbiased estimation of R in some detail, including the derivation of $\hat f(x, y)$.

2.2.3 One-parameter Exponential Distribution

Assume that, analogously to Example 2.1, X and Y are independent exponential random variables with the pdfs $f_X(x|\alpha) = \alpha \exp(-\alpha x)$ and $f_Y(y|\beta) = \beta \exp(-\beta y)$ and unknown $\alpha$ and $\beta$. The objective is to derive the UMVUE $\hat R$ of R and the UMVUE of the variance $\mathrm{Var}(\hat R)$. For this purpose we need to obtain the UMVUE $\hat f(x_1, \ldots, x_k, y_1, \ldots, y_k)$ with k = 1, 2, which, due to the independence of X and Y, are of the form (2.25). Note that the joint density $f(\underline{X}, \underline{Y}|\alpha, \beta)$ is given by (2.13), so that, by the Factorization Theorem (2.20), $T_X = \sum_{j=1}^{n_1} X_j$ and $T_Y = \sum_{j=1}^{n_2} Y_j$ are sufficient statistics for X and Y. We shall derive $\hat f_X(x_1, \ldots, x_k)$ and then obtain $\hat f_Y(y_1, \ldots, y_k)$ by replacing x by y and $n_1$ by $n_2$ in the expression for $\hat f_X(x_1, \ldots, x_k)$.

Setting $\theta_0 = \alpha = 1$ in (2.26) and noting that $T_X$ is a sum of $n_1$ independent standard exponential random variables, we have that $T_X$ obeys the gamma distribution Gamma($n_1$, 1), i.e. $g(T_X|1) = T_X^{n_1 - 1} \exp(-T_X)/\Gamma(n_1)$. Similarly, given $X_1 = x_1, \ldots, X_k = x_k$, we can write $T_X = \sum_{j=1}^{k} x_j + \sum_{j=k+1}^{n_1} X_j$, so that $T_X - \sum_{j=1}^{k} x_j$ is the sum of $(n_1 - k)$ independent exponential variables and

$$g_1(T_X|X_1 = x_1, \ldots, X_k = x_k) = \frac{\bigl(T_X - \sum_{j=1}^{k} x_j\bigr)^{n_1 - k - 1}}{\Gamma(n_1 - k)} \exp\Bigl\{-\Bigl(T_X - \sum_{j=1}^{k} x_j\Bigr)\Bigr\}\, I\Bigl(\sum_{j=1}^{k} x_j < T_X\Bigr).$$

Recall that $f_X(x_j|1) = \exp(-x_j)$ and substitute the expressions for $g(T_X|1)$ and $g_1(T_X|X_1 = x_1, \ldots, X_k = x_k)$ into (2.26). Noting that $T_X = n_1 \bar X$, where $\bar X$ is defined in (2.14), we arrive at

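$$\hat f_X(x_1, \ldots, x_k) = \frac{\Gamma(n_1)}{\Gamma(n_1 - k)}\, \frac{\bigl(n_1 \bar X - \sum_{j=1}^{k} x_j\bigr)^{n_1 - k - 1}}{(n_1 \bar X)^{n_1 - 1}}\, I\Bigl(\sum_{j=1}^{k} x_j < n_1 \bar X\Bigr), \qquad x_j > 0. \tag{2.31}$$

Writing a similar expression for $\hat f_Y(y_1, \ldots, y_k)$ and letting k = 1, we obtain the UMVUE (2.28) of R to be

$$\hat R = \frac{(n_1 - 1)(n_2 - 1)}{n_1 \bar X\, n_2 \bar Y} \iint_W \Bigl(1 - \frac{x}{n_1 \bar X}\Bigr)^{n_1 - 2} \Bigl(1 - \frac{y}{n_2 \bar Y}\Bigr)^{n_2 - 2} dx\, dy, \tag{2.32}$$

where the last integral is calculated over the region $W = \{(x, y) : 0 < x < n_1 \bar X,\ 0 < y < n_2 \bar Y,\ x < y\}$. Since $W = \{(x, y) : x < y < n_2 \bar Y,\ 0 < x < \min(n_1 \bar X, n_2 \bar Y)\}$ and noting that

$$\frac{n_2 - 1}{n_2 \bar Y} \int_x^{n_2 \bar Y} \Bigl(1 - \frac{y}{n_2 \bar Y}\Bigr)^{n_2 - 2} dy = \Bigl(1 - \frac{x}{n_2 \bar Y}\Bigr)^{n_2 - 1}, \tag{2.33}$$

after substitution of (2.33) into (2.32), binomial expansion of the remaining factor and integration over $x \in [0, \min(n_1 \bar X, n_2 \bar Y)]$ we obtain

$$\hat R = \begin{cases} Q_1(n_1, n_2, n_1 \bar X, n_2 \bar Y), & \text{if } n_2 \bar Y \le n_1 \bar X, \\ Q_2(n_1, n_2, n_1 \bar X, n_2 \bar Y), & \text{if } n_2 \bar Y > n_1 \bar X, \end{cases} \tag{2.34}$$

where

$$Q_1(n_1, n_2, u, v) = \sum_{i=0}^{n_1 - 2} (-1)^i\, \frac{(n_1 - 1)!\, (n_2 - 1)!}{(n_1 - 2 - i)!\, (n_2 + i)!} \Bigl(\frac{v}{u}\Bigr)^{i + 1}, \tag{2.35}$$

$$Q_2(n_1, n_2, u, v) = \sum_{i=0}^{n_2 - 1} (-1)^i\, \frac{(n_1 - 1)!\, (n_2 - 1)!}{(n_2 - 1 - i)!\, (n_1 - 1 + i)!} \Bigl(\frac{u}{v}\Bigr)^{i}. \tag{2.36}$$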
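The UMVUE can be computed either directly from the one-dimensional integral that remains after substituting (2.33) into (2.32) or from finite sums obtained by binomial expansion. The sketch below does both and can be used to cross-check one against the other. The function names, the Simpson rule and the specific finite-sum form are assumptions of this example (the sums are one way of writing the integrated result and are not claimed to match any published layout).

```python
import math

def umvue_by_quadrature(n1, n2, xbar, ybar, steps=4000):
    """Integrate (n1-1)/T1 * (1-x/T1)^(n1-2) * (1-x/T2)^(n2-1) over [0, min(T1, T2)]."""
    t1, t2 = n1 * xbar, n2 * ybar
    upper = min(t1, t2)
    h = upper / steps

    def g(x):
        return (n1 - 1) / t1 * (1 - x / t1) ** (n1 - 2) * (1 - x / t2) ** (n2 - 1)

    total = g(0.0) + g(upper)
    for i in range(1, steps):           # composite Simpson rule, steps is even
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3

def umvue_by_series(n1, n2, xbar, ybar):
    """Same quantity as a finite alternating sum (binomial expansion, termwise integration)."""
    t1, t2 = n1 * xbar, n2 * ybar
    c = math.factorial(n1 - 1) * math.factorial(n2 - 1)
    if t2 <= t1:
        return sum((-1) ** i * c
                   / (math.factorial(n1 - 2 - i) * math.factorial(n2 + i))
                   * (t2 / t1) ** (i + 1)
                   for i in range(n1 - 1))
    return sum((-1) ** i * c
               / (math.factorial(n2 - 1 - i) * math.factorial(n1 - 1 + i))
               * (t1 / t2) ** i
               for i in range(n2))
```

Both functions assume sample sizes of at least 2. For large samples the alternating sums can lose accuracy through cancellation, in which case the quadrature form is the safer choice.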

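The estimator (2.34) was initially derived by Tong (1974, 1975). Along with (2.15), it is a basic formula of the stress-strength theory.

We are now in a position to derive the UMVUE of the variance $\mathrm{Var}(\hat R)$. From (2.31), setting k = 2, we have

$$\hat f_X(x_1, x_2) = \frac{(n_1 - 1)(n_1 - 2)\, [n_1 \bar X - x_1 - x_2]^{n_1 - 3}}{(n_1 \bar X)^{n_1 - 1}}\, I(x_1 + x_2 < n_1 \bar X), \qquad x_1, x_2 > 0.$$

Application of (2.30) leads to

$$\widehat{\mathrm{Var}}(\hat R) = (\hat R)^2 - \frac{(n_1 - 1)(n_1 - 2)(n_2 - 1)(n_2 - 2)}{(n_1 \bar X)^{n_1 - 1} (n_2 \bar Y)^{n_2 - 1}}\, H(n_1, n_2, \bar X, \bar Y), \tag{2.37}$$

where $H(n_1, n_2, \bar X, \bar Y)$ is given by the integral

$$H(n_1, n_2, \bar X, \bar Y) = \int_{W^*} (n_1 \bar X - x_1 - x_2)^{n_1 - 3} (n_2 \bar Y - y_1 - y_2)^{n_2 - 3}\, dx_1\, dx_2\, dy_1\, dy_2, \tag{2.38}$$

where $W^* = \{(x_1, x_2, y_1, y_2) : x_1 + x_2 < n_1 \bar X,\ y_1 + y_2 < n_2 \bar Y,\ 0 < x_1 < y_1,\ 0 < x_2 < y_2\}$.

The integral in (2.38) ought to be calculated numerically.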
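One hedged way to carry out the numerical evaluation is sketched below. It relies on a reduction that is an assumption of this example rather than a formula from the text: integrating out $y_1, y_2$ in closed form and then using the fact that the integrand depends on $x_1, x_2$ only through $s = x_1 + x_2$ turns the four-dimensional integral into a one-dimensional one, so that the unbiased estimate of $R^2$ becomes $\frac{(n_1-1)(n_1-2)}{T_1^2}\int_0^{\min(T_1,T_2)} s\,(1 - s/T_1)^{n_1-3}(1 - s/T_2)^{n_2-1}\,ds$ with $T_1 = n_1\bar X$, $T_2 = n_2\bar Y$. Function names are hypothetical and both sample sizes are assumed to be at least 3.

```python
def simpson(g, lo, hi, steps=4000):
    """Composite Simpson rule for the integral of g over [lo, hi] (steps even)."""
    h = (hi - lo) / steps
    total = g(lo) + g(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3

def umvue_r(n1, n2, t1, t2):
    """Unbiased estimate of R from the sufficient statistics t1 = sum(x), t2 = sum(y)."""
    return simpson(
        lambda x: (n1 - 1) / t1 * (1 - x / t1) ** (n1 - 2) * (1 - x / t2) ** (n2 - 1),
        0.0, min(t1, t2))

def umvue_r_squared(n1, n2, t1, t2):
    """Unbiased estimate of R^2 via the one-dimensional reduction in s = x1 + x2."""
    c = (n1 - 1) * (n1 - 2) / (t1 * t1)
    return simpson(
        lambda s: c * s * (1 - s / t1) ** (n1 - 3) * (1 - s / t2) ** (n2 - 1),
        0.0, min(t1, t2))

def umvue_variance(n1, n2, xbar, ybar):
    """Variance estimate (2.30): square of the estimate of R minus the estimate of R^2."""
    t1, t2 = n1 * xbar, n2 * ybar
    r = umvue_r(n1, n2, t1, t2)
    return r * r - umvue_r_squared(n1, n2, t1, t2)
```

A useful internal check is that swapping the roles of the two samples must give the estimate of $(1 - R)^2$, that is, one minus twice the estimate of $R$ plus the estimate of $R^2$.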

2.2.4 A Multivariate Case

So far, we have been discussing only the case when X and Y are scalar random variables. However, all of the results obtained above can easily be generalized to the case when $X = (X^{(1)}, \ldots, X^{(k_1)})$ and $Y = (Y^{(1)}, \ldots, Y^{(k_2)})$ are random vectors. For example, the UMVUE of $f(x_1, \ldots, x_k, y_1, \ldots, y_k|\theta)$ becomes

$$\hat f(x_1, \ldots, x_k, y_1, \ldots, y_k) = \prod_{j=1}^{k} f(x_j, y_j|\theta_0) \times [g(T|\theta_0)]^{-1}\, g_{\theta_0}(T|X_1 = x_1, \ldots, X_k = x_k, Y_1 = y_1, \ldots, Y_k = y_k), \tag{2.39}$$

where $g(T|\theta_0)$ is the pdf of $T(\underline{X}, \underline{Y})$ and $g_{\theta_0}(T|X_1 = x_1, \ldots, X_k = x_k, Y_1 = y_1, \ldots, Y_k = y_k)$ is the conditional density of T for given $X_j = x_j$, $Y_j = y_j$, $j = 1, \ldots, k$, when $\theta = \theta_0$. The UMVUE of $R = P((X, Y) \in \Omega)$ can then be derived as

$$\hat R = \iint I\{(x, y) \in \Omega\}\, \hat f(x, y)\, dx\, dy, \tag{2.40}$$

where $dx = \prod_{i=1}^{k_1} dx^{(i)}$, $dy = \prod_{j=1}^{k_2} dy^{(j)}$ and $\Omega$ is a subset of the $(k_1 + k_2)$-dimensional Euclidean space described in Section 2.1. For example, $\Omega$ can be defined by (2.17), (2.18) or (2.19). The UMVUE of the variance of (2.40) can be determined as

$$\widehat{\mathrm{Var}}(\hat R) = (\hat R)^2 - \iiiint_{W^{**}} \hat f(x_1, x_2, y_1, y_2)\, dx_1\, dx_2\, dy_1\, dy_2. \tag{2.41}$$

Here, $\hat R$ is defined by (2.40) and $W^{**} = \{(x_1, x_2, y_1, y_2) : (x_1, y_1) \in \Omega,\ (x_2, y_2) \in \Omega\}$.

2.3 Bayes and Empirical Bayes Estimation of R

2.3.1 The Theory

Bayes estimators of R are constructed in the same set-up as the MLE or the UMVUE. Let $(X, Y) \sim f(x, y|\theta)$, where $\sim$ indicates "distributed as", and let the sample $(\underline{X}, \underline{Y})$ (see (2.2)) be available for estimation of R = P(X < Y). The Bayesian approach treats the parameter $\theta$ (scalar or vector) not as fixed unknown constant(s) but as a random variable (vector) with the (joint) pdf $\pi(\theta)$ called the prior pdf. This pdf is based on some knowledge available


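to the person carrying out the inference and should be formulated before the data have been obtained.

Definition 2.5 Let $\pi(\theta)$ be a prior pdf of $\theta$. Then the posterior pdf of $\theta$ is

$$\pi(\theta|\underline{X}, \underline{Y}) = \frac{f(\underline{X}, \underline{Y}|\theta)\, \pi(\theta)}{p(\underline{X}, \underline{Y})}. \tag{2.42}$$

Here, $f(\underline{X}, \underline{Y}|\theta)$ is defined by (2.3) or (2.4) and

$$p(\underline{X}, \underline{Y}) = \int_{\Theta} f(\underline{X}, \underline{Y}|\theta)\, \pi(\theta)\, d\theta \tag{2.43}$$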

is the joint unconditional marginal pdf of $\underline{X}$ and $\underline{Y}$. The pdf $\pi(\theta|\underline{X}, \underline{Y})$ in (2.42) is termed posterior since it is derived after $\underline{X}$ and $\underline{Y}$ have been observed, and it can be interpreted as an update of the prior pdf based on the data. Note that, in view of (2.42), the posterior pdf remains invariant if $\pi(\theta)$ is multiplied by a constant. This fact is often used in Bayesian analysis and is expressed by writing "$\pi(\theta) \propto \ldots$", which means "$\pi(\theta)$ is proportional to".

The Bayesian approach to statistical inference in general, and to the problem at hand in particular, is becoming more prominent. We shall therefore briefly present the necessary background. The Bayes estimator $\hat R$ of R can be obtained as the expectation of $R = R(\theta)$ with respect to the posterior pdf $\pi(\theta|\underline{X}, \underline{Y})$:

$$\hat R = \int_{\Theta} R(\theta)\, \pi(\theta|\underline{X}, \underline{Y})\, d\theta. \tag{2.44}$$

The value of $R(\theta)$ in (2.44) can be calculated using one of the expressions (2.5), (2.6) or (2.9). Another way of determining the estimator (2.44) is to derive (if possible) the posterior pdf of R first and then find the estimator $\hat R$ as an expectation over this posterior pdf. The pdf $\pi_R(R|\underline{X}, \underline{Y})$ can be obtained using a transformation of the random variables. For this purpose, one needs to choose a one-to-one transformation $F : \theta \to (R, \theta_R)$ with the inverse $Q = F^{-1}$. Then, the joint posterior pdf of $(R, \theta_R)$ is given by $\pi(Q(R, \theta_R)|\underline{X}, \underline{Y})\, |J_Q(R, \theta_R)|$, where $|J_Q(R, \theta_R)|$ is the Jacobian of the transformation Q, so that

$$\pi_R(R|\underline{X}, \underline{Y}) = \int \pi(Q(R, \theta_R)|\underline{X}, \underline{Y})\, |J_Q(R, \theta_R)|\, d\theta_R. \tag{2.45}$$


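The Jacobian $|J_Q(R, \theta_R)|$ here is the absolute value of the determinant of the matrix of the partial derivatives of $\theta$ with respect to R and the components of $\theta_R$. For example, if $\theta = (\theta_1, \theta_2)$, then $|J_Q(R, \theta_R)|$ is the absolute value of

$$\det \begin{bmatrix} \dfrac{\partial \theta_1}{\partial R} & \dfrac{\partial \theta_1}{\partial \theta_R} \\[2mm] \dfrac{\partial \theta_2}{\partial R} & \dfrac{\partial \theta_2}{\partial \theta_R} \end{bmatrix}.$$

As we have already mentioned above, the most common choice for the Bayes estimator of R is the expectation over (2.45):

$$\hat R = \int_0^1 R\, \pi_R(R|\underline{X}, \underline{Y})\, dR. \tag{2.46}$$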

However, other Bayes estimators, such as the median of $\pi_R(R|\underline{X}, \underline{Y})$ or the value of R maximizing $\pi_R(R|\underline{X}, \underline{Y})$, can be utilized. Each of these variants is used depending on the specific problem at hand. The posterior pdf (2.42) can also be applied for construction of interval estimators of R. We relegate the formulation and discussion of Bayes credible sets to Section 2.4.

2.3.2 The Choice of a Prior

From the discussion above it follows that the starting point of any Bayesian analysis is the choice of the prior. Volumes have been devoted to this problem by the most brilliant researchers in the last 20-30 years. How can one choose $\pi(\theta)$ if no specific information about the values of the parameters is available or the prior information is rather vague? There are several ways to obviate the dilemma. One of the most popular solutions is to take a conjugate prior for $\pi(\theta)$.

Definition 2.6 Let $\mathcal{F}$ denote the class of pdfs $f(x, y|\theta)$. A class $\mathcal{P}$ of prior distributions is said to be a conjugate family for $\mathcal{F}$ if the posterior distribution is in the class $\mathcal{P}$ for all $f \in \mathcal{F}$ and all priors in $\mathcal{P}$.

The advantage of using conjugate priors is that the posterior belongs to the same class as the prior, so that updating the prior reduces to updating its parameters. As a rule, conjugate priors lead to straightforward mathematical calculations, and this may be one of the reasons that it has


been applied for estimation of R by a number of authors (see, e.g. Enis and Geisser (1971), Abu-Salih and Shamseldin (1988), among others).

Another possibility is to use a noninformative prior. The convenience of choosing a noninformative prior lies in the fact that it can be constructed if no knowledge about the values of the parameters is available. However, its shortcoming is that the majority of noninformative prior pdfs turn out to be improper, i.e. while being nonnegative, they do not integrate to one. Historically, it would seem that Laplace (1812) was the first to introduce the uniform noninformative prior $\pi(\theta) = 1$. However, since this prior changes under a one-to-one reparametrization, inference based on the resulting posteriors can often show significant variation. As a partial remedy, Jeffreys (1961) proposed the prior proportional to the positive square root of the determinant of the Fisher information matrix

$$\pi(\theta) = [\det I(\theta)]^{1/2}. \tag{2.47}$$

If $\theta = (\theta_1, \ldots, \theta_k)$, then, under commonly satisfied assumptions (see e.g. Lehmann and Casella (1998)), $I(\theta)$ is the matrix with (i, j)-th element

$$I_{ij}(\theta) = -E\left[\frac{\partial^2}{\partial \theta_i\, \partial \theta_j} \ln f(X, Y|\theta)\right].$$

Yet another possibility, in the absence of specific information about $\theta$, is to match the Bayesian solution with the frequentist solution of the problem. A prior which satisfies this condition is called a matching prior. It is derived by requiring the classical frequentist coverage probability of the posterior region of a real-valued parametric function to match the nominal level with a remainder of the order of $O(n^{-j/2})$. Here n is the sample size, j = 1 for the first order and j = 2 for the second order matching prior (see, e.g. Datta (1996), Datta and Ghosh (1995), Ghosh and Mukerjee (1992), Mukerjee and Dey (1993)). However, unlike the case of Jeffreys's prior, derivation of the matching prior leads to much more extensive calculations. The advantage of Jeffreys's prior is that it remains invariant under any one-to-one reparametrization. But despite its success in the one-parameter case, Jeffreys's prior often runs into serious technical difficulties in the presence of nuisance parameters, that is, when some parameters that are present in a model are not of direct inferential interest. This situation often occurs in the case of the stress-strength model, since we are interested in the value of R and do not need to know the other component $\theta_R$ (see discussion


preceding (2.45)). For this reason, it may be advantageous to use the so-called reference prior introduced by Bernardo (1979) and generalized in the articles by Berger and Bernardo (1989), (1992). This prior is specially devised for multiparameter situations and is derived by dividing the set of parameters into parameters of interest and nuisance parameters. Another attractive feature of reference priors is that they usually satisfy the matching criterion described above. Readers interested in this topic are referred to Kass and Wasserman (1996). Since the derivation of noninformative priors is not always an easy task, a catalog of noninformative priors compiled by Yang and Berger (1997) may be helpful. Several authors have constructed noninformative priors specifically for the stress-strength model (see, e.g. Kim et al. (2000), Lee (1998), Thompson and Basu (1993) and Sun et al. (1998)).

2.3.3 One-parameter Exponential Distribution

Let, as in Example 2.1, $\underline{X}$ and $\underline{Y}$ be independent samples from distributions with pdfs $f_X(x|\alpha) = \alpha \exp(-\alpha x)$ and $f_Y(y|\beta) = \beta \exp(-\beta y)$, respectively. The parameters $\alpha$ and $\beta$ can reasonably be assumed to be independent a priori. Following Enis and Geisser (1971), we shall employ conjugate gamma prior distributions for $\alpha$ and $\beta$ with parameters $\mu, \gamma$ and $\nu, \lambda$, respectively, so that

$$\pi(\alpha, \beta) \propto \alpha^{\mu-1} e^{-\gamma\alpha}\, \beta^{\nu-1} e^{-\lambda\beta}, \qquad \mu, \gamma, \nu, \lambda > 0. \tag{2.48}$$

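(Recall that $\propto$ means "proportional to".) Taking into account that $f(\underline{X}, \underline{Y}|\alpha, \beta)$ is of the form (2.13) and applying the Bayes formula (2.42), we obtain the posterior density of $(\alpha, \beta)$ of the form

$$\pi(\alpha, \beta|\underline{X}, \underline{Y}) \propto \alpha^{n_1+\mu-1} e^{-\alpha(\gamma + n_1\bar{X})}\, \beta^{n_2+\nu-1} e^{-\beta(\lambda + n_2\bar{Y})}. \tag{2.49}$$

Evidently the posterior is also the product of gamma pdfs with the updated parameters

$$\mu^* = n_1 + \mu, \quad \gamma^* = \gamma + n_1\bar{X}, \quad \nu^* = n_2 + \nu, \quad \lambda^* = \lambda + n_2\bar{Y}. \tag{2.50}$$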

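Here $\bar{X}$ and $\bar{Y}$ are the sample means (see (2.14)). In order to derive the posterior pdf of $R$, we shall follow the recipe described in the beginning of this section and consider a one-to-one transformation $F$: $R = \alpha/(\alpha + \beta)$, $\theta_R = \alpha + \beta$ with the inverse $Q$: $\alpha = R\theta_R$, $\beta = \theta_R(1 - R)$. The Jacobian $|J_Q(R, \theta_R)|$ here is

$$\det \begin{pmatrix} \partial\alpha/\partial R & \partial\alpha/\partial\theta_R \\ \partial\beta/\partial R & \partial\beta/\partial\theta_R \end{pmatrix} = \det \begin{pmatrix} \theta_R & R \\ -\theta_R & 1 - R \end{pmatrix} = \theta_R.$$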

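Hence, the joint posterior density of $R$ and $\theta_R$ becomes

$$\pi^*(R, \theta_R|\underline{X}, \underline{Y}) \propto R^{\mu^*-1}(1 - R)^{\nu^*-1}\, \theta_R^{\mu^*+\nu^*-1}\, e^{-\theta_R\lambda^*(1 - BR)}, \tag{2.51}$$

where $0 < R < 1$, $\theta_R > 0$ and

$$B = (\lambda^* - \gamma^*)/\lambda^*. \tag{2.52}$$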

Integrating out $\theta_R \in (0, \infty)$ in (2.51), we arrive at

$$\pi_R(R|\underline{X}, \underline{Y}) = C_R\, R^{\mu^*-1}(1 - R)^{\nu^*-1}(1 - BR)^{-(\mu^*+\nu^*)}, \qquad 0 < R < 1, \tag{2.53}$$

where $C_R$ is the normalizing coefficient ensuring that $\pi_R(R|\underline{X}, \underline{Y})$ integrates to one. The Bayes estimator of $R$ corresponding to the conjugate gamma priors can now be obtained using, for example, formula (2.46) and equation 3.197.3 in Gradshtein and Ryzhik (1980). We obtain

$$\hat{R} = \begin{cases} \dfrac{\mu^*(1 - B)^{\mu^*}}{\mu^* + \nu^*}\; {}_2F_1(\mu^* + \nu^*, \mu^* + 1, \mu^* + \nu^* + 1; B), & \text{for } |B| < 1, \\[2ex] \dfrac{\mu^*}{(\mu^* + \nu^*)(1 - B)^{\nu^*}}\; {}_2F_1\!\left(\mu^* + \nu^*, \nu^*, \mu^* + \nu^* + 1; \dfrac{B}{B - 1}\right), & \text{for } B \le -1. \end{cases} \tag{2.54}$$

Here the parameters $B$, $\lambda^*$, $\gamma^*$, $\mu^*$ and $\nu^*$ are defined in (2.52) and (2.50), and ${}_2F_1(a, b, c; z)$ is the hypergeometric series defined by (see e.g. Abramowitz and Stegun (1992), page 556)

$${}_2F_1(a, b, c; z) = 1 + \frac{ab}{c}\frac{z}{1!} + \frac{a(a+1)b(b+1)}{c(c+1)}\frac{z^2}{2!} + \cdots \tag{2.55}$$

The series in (2.55) is convergent for $|z| < 1$, $c \ne 0, -1, -2, \ldots$, and reduces to a finite sum if $a$ or $b$ is zero or a negative integer. Since $|B/(1 - B)| < 1$ for $B \le -1$, the hypergeometric series in (2.54) are always convergent.

The Bayesian analysis conducted above can also be carried out with noninformative priors. In this section, we shall obtain Jeffreys's prior, postponing the derivation and discussion of the more complicated reference and matching priors to Section 3.3. To derive Jeffreys's prior for $f_X(x|\alpha) = \alpha\exp(-\alpha x)$, note that the Fisher information $I(\alpha) = -E_\alpha\left[\partial^2 \ln f_X(X|\alpha)/\partial\alpha^2\right]$ is equal to $\alpha^{-2}$ in this case, hence Jeffreys's prior for $\alpha$ is simply $\alpha^{-1}$. Analogously, Jeffreys's prior for $\beta$ is $\beta^{-1}$, so that the joint prior pdf $\pi(\alpha, \beta)$ has the form (2.48) with $\gamma = \lambda = \mu = \nu = 0$. Thus, we can use the results of the previous calculations to obtain the posterior of $R$ and the Bayes estimator of $R$ of the forms (2.53) and (2.54), respectively, with $B = (n_2\bar{Y} - n_1\bar{X})/(n_2\bar{Y})$ and $\mu^* = n_1$, $\gamma^* = n_1\bar{X}$, $\nu^* = n_2$, $\lambda^* = n_2\bar{Y}$.
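The conjugate analysis of this subsection can be checked numerically. The following is a minimal sketch, not the authors' implementation: the function names are hypothetical, the closed form is the one obtained by applying the Euler integral representation of the hypergeometric function to the posterior kernel $R^{\mu^*-1}(1-R)^{\nu^*-1}(1-BR)^{-(\mu^*+\nu^*)}$, and it is compared with a plain midpoint-rule evaluation of the posterior mean. Leaving all prior parameters at zero corresponds to the Jeffreys case.

```python
def posterior_params(x, y, mu=0.0, gam=0.0, nu=0.0, lam=0.0):
    """Updated gamma parameters (mu*, gamma*, nu*, lambda*); n1 * xbar equals sum(x)."""
    return len(x) + mu, gam + sum(x), len(y) + nu, lam + sum(y)


def hyp2f1(a, b, c, z, tol=1e-15, max_terms=200000):
    """Gauss hypergeometric series, summed term by term; requires |z| < 1."""
    if abs(z) >= 1.0:
        raise ValueError("series requires |z| < 1")
    term = total = 1.0
    for k in range(max_terms):
        term *= (a + k) * (b + k) / ((c + k) * (k + 1.0)) * z
        total += term
        if abs(term) <= tol * abs(total):
            return total
    raise ArithmeticError("series did not converge")


def bayes_estimate(mu_s, gam_s, nu_s, lam_s):
    """Posterior mean of R in closed form (two branches, split at B = -1)."""
    B = (lam_s - gam_s) / lam_s          # B < 1 whenever gam_s > 0
    lead = mu_s / (mu_s + nu_s)
    if abs(B) < 1.0:
        return lead * (1.0 - B) ** mu_s * hyp2f1(
            mu_s + nu_s, mu_s + 1.0, mu_s + nu_s + 1.0, B)
    return lead * (1.0 - B) ** (-nu_s) * hyp2f1(
        mu_s + nu_s, nu_s, mu_s + nu_s + 1.0, B / (B - 1.0))


def bayes_estimate_numeric(mu_s, gam_s, nu_s, lam_s, m=20000):
    """Posterior mean of R by the midpoint rule on (0, 1); assumes mu*, nu* >= 1."""
    B = (lam_s - gam_s) / lam_s
    num = den = 0.0
    for i in range(m):
        r = (i + 0.5) / m
        w = r ** (mu_s - 1) * (1 - r) ** (nu_s - 1) * (1 - B * r) ** (-(mu_s + nu_s))
        num += r * w
        den += w
    return num / den


# Illustrative data (made up for this sketch), Jeffreys prior.
x = [0.8, 1.1, 0.4, 2.3, 0.9]
y = [1.9, 0.7, 3.1, 2.2, 1.4, 2.8]
params = posterior_params(x, y)
print(bayes_estimate(*params), bayes_estimate_numeric(*params))
```

When $\gamma^* = \lambda^*$ the constant $B$ vanishes, the posterior of $R$ is an ordinary beta distribution, and both functions return $\mu^*/(\mu^* + \nu^*)$, which is a convenient sanity check.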

2.3.4 Bayes Predictive and Empirical Bayes Estimation

The Bayes estimation is closely related to the Bayesian predictive approach, where the goal is to estimate the probability $P_{pred} = P(X_{n_1+1} < Y_{n_2+1})$ for some future observations $X_{n_1+1}$ and $Y_{n_2+1}$. Since the observations are i.i.d., it is easy to show that $P_{pred} = R(\theta)$ and, therefore, the Bayes estimator $\hat{R}$ is an appropriate choice for the predicted value of $P_{pred}$.

Prediction of the probability $P(X_{n_1+1} < Y_{n_2+1})$ is related to the empirical Bayes (EB) estimation problem. In the EB set-up we are required to estimate the probability $R = P(X_{n+1} < Y_{n+1})$ given the data $(X_1, Y_1, \theta_1), \ldots, (X_n, Y_n, \theta_n), (X_{n+1}, Y_{n+1}, \theta_{n+1})$. In these vectors, the i.i.d. parameters $\theta_j$ with the common density $\pi(\theta)$ are unobservable and the pdf $\pi(\theta)$ is unknown. Given $\theta_j$, $j = 1, \ldots, n$, the first two components $(X_j, Y_j)$ are observable and are i.i.d. with a known joint conditional pdf $f(x, y|\theta_j)$. If $\pi(\theta)$ is completely unspecified, the EB model is called nonparametric, while when $\pi(\theta)$ has a known form but contains unknown parameters the EB set-up is referred to as parametric. In both cases, the unknown prior pdf is recovered on the basis of the previous observations $(X_1, Y_1), \ldots, (X_n, Y_n)$, and the resulting estimate $\hat{\pi}(\theta)$ is then used to construct the EB estimator of $R$, which is given by

$$\hat{R}_{EB} = \frac{\int_\Theta R(\theta) f(X_{n+1}, Y_{n+1}|\theta)\hat{\pi}(\theta)\, d\theta}{\int_\Theta f(X_{n+1}, Y_{n+1}|\theta)\hat{\pi}(\theta)\, d\theta}. \tag{2.56}$$

Here $R(\theta)$ is defined in Section 2.1.1. If, in the above setting, $\theta = (\theta_1, \theta_2)$ and $X$ and $Y$ are independent with the pdf $f(x, y|\theta) = f_X(x|\theta_1) f_Y(y|\theta_2)$, the data may be of the form $(X_1, \theta_{11}), (X_2, \theta_{12}), \ldots, (X_{n_1}, \theta_{1n_1})$ and $(Y_1, \theta_{21}), (Y_2, \theta_{22}), \ldots, (Y_{n_2}, \theta_{2n_2})$. In this situation, the samples $\underline{X}$ and $\underline{Y}$ may be of unequal sizes and $R = P(X_{n_1+1} < Y_{n_2+1})$.

Readers interested in learning more about empirical Bayes models and methods are referred to Carlin and Louis (2000) or the paper by Casella (1985).

2.4 Interval Estimation

2.4.1 The Theory

In many applications just knowing a point estimator is not sufficient. For illustration, consider a medical application where $X$ and $Y$ represent responses to an old treatment A and a new treatment B, respectively; the aim is to decide whether one should abandon the old treatment in favor of the new one. If the point estimator $\hat{R}$ of $R = P(X < Y)$ is obtained and is equal to 0.59, we still cannot confidently recommend a course of action since we have no information on the variability of $\hat{R}$. What we really need in such a situation is an interval which covers the unknown value of $R$ with a high probability of at least $1 - \gamma$, where $\gamma > 0$ is small.

Definition 2.7 Let the statistics $L(\underline{X}, \underline{Y})$ and $U(\underline{X}, \underline{Y})$ be such that

$$P(L(\underline{X}, \underline{Y}) < R < U(\underline{X}, \underline{Y})) \ge 1 - \gamma, \qquad 0 < \gamma < 1. \tag{2.57}$$

The interval $(L(\underline{X}, \underline{Y}), U(\underline{X}, \underline{Y}))$ is called the confidence interval for $R$ with the lower and the upper bounds $L(\underline{X}, \underline{Y})$ and $U(\underline{X}, \underline{Y})$, respectively, and with the confidence coefficient $1 - \gamma$.

In the example above we may decide to give up the old treatment if, for instance, $L(\underline{X}, \underline{Y}) = 0.54$ with $1 - \gamma = 0.95$, since we can then assert with confidence 0.95 that $R > 0.5$, that is, that the new treatment is better than the old one. On the other hand, if $L(\underline{X}, \underline{Y}) = 0.43$ we cannot at all be certain that the new treatment is preferable. Note that in this example (as is often the case in medical and engineering applications) we are not interested in the upper bound and may be quite satisfied knowing that $P(R > L(\underline{X}, \underline{Y})) \ge 1 - \gamma$ for some small $\gamma > 0$. The intervals $(L(\underline{X}, \underline{Y}), \infty)$ or $(-\infty, U(\underline{X}, \underline{Y}))$ are said to be one-sided while the interval $(L(\underline{X}, \underline{Y}), U(\underline{X}, \underline{Y}))$ is referred to as a two-sided one.

There exist at least three approaches to the construction of confidence intervals: exact methods, asymptotic methods and Bayesian methods. We shall present them briefly, referring a reader interested in more details to, for example, Chapter 9 of Casella and Berger (1990).

2.4.2 Exact Methods of Interval Estimation

Exact methods can be applied if one or several pivotal quantities (or pivots) are available. A random variable $Q(\underline{X}, \underline{Y}|\theta)$ is said to be a pivotal quantity if the distribution of $Q(\underline{X}, \underline{Y}|\theta)$ is independent of all the parameters. If $Q_j(\underline{X}, \underline{Y}|\theta)$, $j = 1, \ldots, K$, are pivots, then one can find numbers $a_j$ and $b_j$, $j = 1, \ldots, K$, which do not depend on $\theta$, such that

$$P(a_j < Q_j(\underline{X}, \underline{Y}|\theta) < b_j) \ge 1 - \gamma_j, \qquad j = 1, \ldots, K. \tag{2.58}$$

Using the Bonferroni inequality

$$P\Big(\bigcap_{j=1}^{K} E_j\Big) \ge \sum_{j=1}^{K} P(E_j) - (K - 1), \tag{2.59}$$

where $E_j$, $j = 1, \ldots, K$, are any random events, we combine the inequalities (2.58). Solving the inequalities for $R(\theta)$ one obtains

$$P(R(\theta) \in W(\underline{X}, \underline{Y})) \ge 1 - \sum_{j=1}^{K} \gamma_j,$$

where $W(\underline{X}, \underline{Y})$ is a random set on the given sample space.

Exact confidence intervals for $R$ have been derived when $(X, Y)$ is a normally distributed random vector (under various assumptions on the parameters), when $X$ and $Y$ have independent gamma or generalized gamma distributions, and in a nonparametric case when the distribution of $(X, Y)$ is completely unspecified (see, e.g., Birnbaum and McCarty (1958), Owen et al. (1964), Enis and Geisser (1971), Ury (1972), Yang and Mo (1985), Reiser and Guttman (1986), Constantine et al. (1986), Teskin and Kostyukova (1991) and Pensky and Takashima (2002) among others).

2.4.3 Asymptotic Methods of Interval Estimation

Asymptotic confidence intervals have been devised for the cases when construction of exact confidence intervals for $R$ is impossible. Usually, construction of asymptotic confidence intervals is based on the central limit theorem, which states that under "mild regularity conditions" a point estimator $\hat{R}$ of $R$, whether it is the MLE or the UMVUE, is asymptotically normal with the mean $R$ and the variance $\sigma_R^2$ as the sample sizes $n_1$ and $n_2$ of $\underline{X}$ and $\underline{Y}$, respectively, tend to infinity. Thus, constructing an estimator $\hat{\sigma}_R^2$ for $\sigma_R^2$, one can assert that $P(-z_{\gamma/2} < (\hat{R} - R)/\hat{\sigma}_R < z_{\gamma/2}) \approx 1 - \gamma$ for arbitrarily small $\gamma > 0$ provided $n_1$ and $n_2$ are fairly large. Here, $z_\alpha$ is the $100(1 - \alpha)$th percentile, or the upper $\alpha$-cut-off point, of the standard normal distribution; namely, $z_\alpha$ is the solution of the equation

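$$\frac{1}{\sqrt{2\pi}} \int_{z_\alpha}^{\infty} e^{-x^2/2}\, dx = \alpha. \tag{2.60}$$

Alternatively,

$$\Phi(z_\alpha) = 1 - \alpha, \tag{2.61}$$

where $\Phi(z)$ is the standard normal cdf

$$\Phi(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-x^2/2}\, dx. \tag{2.62}$$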
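The cut-off point $z_\alpha$ and the resulting normal-approximation interval are easy to compute. The sketch below uses only the Python standard library; the function names are hypothetical, and clipping the interval to $[0, 1]$ is an added convenience that is not part of the text.

```python
from statistics import NormalDist


def upper_cutoff(alpha):
    """Upper alpha-cut-off point z_alpha, i.e. the solution of Phi(z_alpha) = 1 - alpha."""
    return NormalDist().inv_cdf(1.0 - alpha)


def asymptotic_interval(r_hat, sigma_hat, gamma=0.05):
    """Approximate (1 - gamma) interval r_hat -/+ z_{gamma/2} * sigma_hat, clipped to [0, 1]."""
    z = upper_cutoff(gamma / 2.0)
    lower = max(0.0, r_hat - z * sigma_hat)
    upper = min(1.0, r_hat + z * sigma_hat)
    return lower, upper


# Example with made-up inputs: a point estimate 0.59 with estimated standard deviation 0.05.
print(asymptotic_interval(0.59, 0.05))
```

With these illustrative inputs the lower bound falls below 0.5, so the data would not support a preference for the new treatment in the medical example of Section 2.4.1.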

Confidence intervals for estimators of $R$ based on normal approximations have been studied by Church and Harris (1970), Nandy and Aich (1994a) and Gupta et al. (1999) among others. Another asymptotic technique, which has so far been less popular in applications to estimating $R$, is based on the well-known fact in statistical inference that, for any fixed $R$, the logarithm of the likelihood $-2\ln L(R|\underline{X}, \underline{Y})$ is distributed asymptotically as $\chi^2$ with 1 degree of freedom (see, e.g., Casella and Berger (1990)). The confidence set $W(\underline{X}, \underline{Y})$ with the confidence coefficient $1 - \gamma$ then becomes

$$W(\underline{X}, \underline{Y}) = \left\{R : -2\ln L(R|\underline{X}, \underline{Y}) < \chi^2_\gamma(1)\right\},$$

where $\chi^2_\gamma(1)$, $0 < \gamma < 1$, is the upper $100\gamma$-percent point of the $\chi^2$ distribution with 1 degree of freedom. This approach has been implemented by Madansky (1965) and Easterling (1972).

The advantage of the asymptotic algorithms of interval estimation is that they can be carried out for practically any distribution. However, a shortcoming is that these techniques very often run into serious difficulties and provide crude and unreliable results when $R$ is close to zero or one and the sample sizes are relatively small. Moreover, the recent work by Hallin and Seon (1999) indicates that one ought to take asymptotic results with a grain of salt.

2.4.4 Bayesian Credible Sets

The third group of methods for construction of confidence intervals are the Bayesian methods. There exists a conceptual difference between frequentist and Bayesian confidence intervals. With the former ones, one knows that the interval covers $R$ with the probability of at least $1 - \gamma$. In contrast, the Bayesian set-up allows one to assert that $R$ is inside the interval with a given probability. This is due to the fact that, under the Bayesian model, $R$ is viewed as a random variable having a probability distribution, and all Bayesian assertions about coverage are made with respect to the posterior distribution of the parameter. To keep the distinction between Bayesian and classical interval estimators, the Bayesian interval estimators will be referred to as credible rather than confidence intervals.

Definition 2.8 Let $\pi_R(R|\underline{X}, \underline{Y})$ (see (2.45)) be the posterior pdf of $R$. Then for any set $W \subset [0, 1]$, the credible probability of $W$ is

$$P(R \in W|\underline{X}, \underline{Y}) = \int_W \pi_R(R|\underline{X}, \underline{Y})\, dR.$$

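Evidently, there exist infinitely many credible sets $W$ satisfying the condition $P(R \in W|\underline{X}, \underline{Y}) = 1 - \gamma$. However, since smaller sets are preferable in applications, we shall form the credible set $W$ by taking the highest posterior density (HPD) region defined by

$$W(\underline{X}, \underline{Y}) = \left\{R : \pi_R(R|\underline{X}, \underline{Y}) \ge c_\gamma\right\}, \tag{2.63}$$

where $c_\gamma$ is chosen so that

$$\int_{W(\underline{X}, \underline{Y})} \pi_R(R|\underline{X}, \underline{Y})\, dR = 1 - \gamma. \tag{2.64}$$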
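For a unimodal posterior on $(0, 1)$ the HPD region is an interval, and it can be approximated on a grid by keeping the cells of highest density until their total mass reaches $1 - \gamma$. The following is a rough sketch under that unimodality assumption; `hpd_interval` is a hypothetical helper, and the kernel passed to it need not be normalized.

```python
def hpd_interval(kernel, gamma=0.05, m=20000):
    """Grid approximation of the HPD interval of a unimodal density on (0, 1).

    kernel: unnormalized posterior density of R.
    Returns (lower, upper, c_gamma), where c_gamma is the density threshold.
    """
    h = 1.0 / m
    dens = [kernel((i + 0.5) * h) for i in range(m)]
    total = sum(dens) * h                         # normalizing constant
    # Visit grid cells from the highest density downwards.
    order = sorted(range(m), key=dens.__getitem__, reverse=True)
    mass, lo_idx, hi_idx, c_gamma = 0.0, m, -1, 0.0
    for i in order:
        mass += dens[i] * h / total
        lo_idx, hi_idx = min(lo_idx, i), max(hi_idx, i)
        c_gamma = dens[i] / total
        if mass >= 1.0 - gamma:
            break
    return lo_idx * h, (hi_idx + 1) * h, c_gamma


# Illustrative posterior kernel with made-up parameter values mu* = 5, nu* = 7, B = 0.5.
mu_s, nu_s, B = 5.0, 7.0, 0.5
kernel = lambda r: r ** (mu_s - 1) * (1 - r) ** (nu_s - 1) * (1 - B * r) ** (-(mu_s + nu_s))
print(hpd_interval(kernel))
```

The grid search trades accuracy for simplicity; a root finder applied to the two conditions (equal density at both endpoints, enclosed mass $1 - \gamma$) would give the same interval with fewer density evaluations.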

2.4.5 Hypothesis Testing: Theory and Methods

Suppose one is required to test a hypothesis about $R$ of the type

$$H_0 : R \in \Xi \quad \text{versus} \quad H_1 : R \in \Xi^c, \tag{2.65}$$

where $\Xi$ is a subset of the interval $[0, 1]$ and $\Xi^c$ is the complement of $\Xi$ in $[0, 1]$, i.e. $[0, 1] \setminus \Xi$. Here $H_0$ and $H_1$ are called the null hypothesis and the alternative hypothesis, respectively. Since $R = R(\theta)$, $\theta \in \Theta$, where $\Theta$ is the set of all possible values of parameters, the hypotheses $H_0$ and $H_1$ about $R$ can be re-formulated in terms of $\theta$, that is, $H_0 : \theta \in \Theta_0$ versus $H_1 : \theta \in \Theta_0^c$. Here $\Theta_0^c$ is the complement of $\Theta_0$ in $\Theta$, and $R \in \Xi$ whenever $\theta \in \Theta_0$.

A hypothesis test is a rule that specifies for which sample values $H_0$ is accepted as true and for which sample values it is rejected and, consequently, $H_1$ is accepted as being true. This description is somewhat simplified but should be adequate for our purposes.

Definition 2.9 The subset of the sample space for which $H_0$ is rejected is called the rejection region (or the critical region). The complement of the rejection region is called the acceptance region.

A hypothesis test is subject to two kinds of errors: rejecting $H_0$ while it is true (this error is called Type I Error) or accepting $H_0$ when it is false (Type II Error). For a fixed sample size, it is usually impossible to make the probabilities of both types of errors arbitrarily small. In searching for a good test, it is common to consider only those that control the Type I Error at a specified level. Within this class of tests we then search for tests that have the Type II Error probability as small as possible.

Definition 2.10 We say that the test with the rejection region $\mathcal{R}$ is a level $\gamma$ (or size $\gamma$) test if

$$\sup_{\theta \in \Theta_0} P\big((\underline{X}, \underline{Y}) \in \mathcal{R}\,\big|\,\theta\big) \le \gamma.$$

Typically, the hypothesis test is specified in terms of a test statistic $V = V(\underline{X}, \underline{Y})$ (a computable function of the observations $(\underline{X}, \underline{Y})$), that is, we accept $H_0$ if the values of $V$ are in some subset $W_0$ of $(-\infty, \infty)$. The size of the test carries important information about our decision: if $\gamma$ is small, the decision to reject $H_0$ is quite convincing while, if $\gamma$ is large, this is not the case since the test has a large probability of false rejection of $H_0$.

Another way of reporting the results of a hypothesis test is to provide the p-value. The p-value is especially useful when large (or small) values of the test statistic indicate that $H_0$ is false.

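Definition 2.11 Let $V(\underline{X}, \underline{Y})$ be a test statistic such that large values of $V$ give evidence that $H_0$ is false. A p-value $p(\underline{X}, \underline{Y}) \in [0, 1]$ is a statistic satisfying $P(p(\underline{X}, \underline{Y}) \le \gamma) \le \gamma$ for every $\theta \in \Theta_0$ and every $\gamma \in [0, 1]$ and such that for any sample point $(x, y)$

$$p(x, y) = \sup_{\theta \in \Theta_0} P(V(\underline{X}, \underline{Y}) \ge V(x, y)). \tag{2.66}$$

The p-value can be interpreted as the probability, if hypothesis $H_0$ is true, of obtaining data at least as extreme as the data observed in our particular experiment. Consequently, small values of $p(\underline{X}, \underline{Y})$ can be treated as evidence against $H_0$.

A practical construction of hypothesis tests is closely associated with interval estimation. For example, if $(L(\underline{X}, \underline{Y}), U(\underline{X}, \underline{Y}))$ is a two-sided $(1 - \gamma)$-confidence interval for $R$, then the test which accepts $H_0 : R = R_0$ whenever $R_0 \in (L(\underline{X}, \underline{Y}), U(\underline{X}, \underline{Y}))$ and rejects it otherwise is a size $\gamma$ test since $P(L(\underline{X}, \underline{Y}) < R_0 < U(\underline{X}, \underline{Y})) \ge 1 - \gamma$ when $R = R_0$. For testing $H_0 : R \ge R_0$ versus $H_1 : R < R_0$, a one-sided confidence interval $(0, U(\underline{X}, \underline{Y}))$ will do the job. Indeed, a test that rejects $H_0$ if $R_0 > U(\underline{X}, \underline{Y})$ is of size $\gamma$ since

$$P(\text{reject } H_0|H_0) = P(R_0 > U(\underline{X}, \underline{Y})\,|\,R_0 \le R) \le P(R > U(\underline{X}, \underline{Y})\,|\,R = R_0) = \gamma.$$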

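Similarly, a test which rejects $H_0 : R \le R_0$ whenever $R_0 < L(\underline{X}, \underline{Y})$ is a size $\gamma$ test.

Another approach to testing a hypothesis about $R$ is to construct a Bayesian test. Consider the general problem (2.65). Here the Bayesian test rejects $H_0$ if

$$\frac{P(R \in \Xi^c|\underline{X}, \underline{Y})}{P(R \in \Xi|\underline{X}, \underline{Y})} = \frac{1 - P(R \in \Xi|\underline{X}, \underline{Y})}{P(R \in \Xi|\underline{X}, \underline{Y})} > A, \tag{2.67}$$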

where the numerator and the denominator in (2.67) are the posterior probabilities of $\Xi^c$ and $\Xi$, respectively, and $A$ is a threshold value chosen in advance. For example, if $A = 1$, $H_0$ is rejected if evidence in support of $H_1$ is stronger than evidence in support of $H_0$. However, if one wants to be more conservative in rejection of $H_0$, one should increase the value of $A$. Observe that $P(R \in \Xi|\underline{X}, \underline{Y})$ in (2.67) can be evaluated as

$$P(R \in \Xi|\underline{X}, \underline{Y}) = \int_0^1 I(R \in \Xi)\, \pi_R(R|\underline{X}, \underline{Y})\, dR = \int_\Theta I(R(\theta) \in \Xi)\, \pi(\theta|\underline{X}, \underline{Y})\, d\theta, \tag{2.68}$$

where $R(\theta)$, $\pi(\theta|\underline{X}, \underline{Y})$ and $\pi_R(R|\underline{X}, \underline{Y})$ are defined in (2.5) or (2.6), (2.42) and (2.45), respectively. It is appropriate to use the first half of (2.68) when the expression for $\pi_R(R|\underline{X}, \underline{Y})$ is available since it leads to a one-dimensional integration. However, knowledge of $\pi_R(R|\underline{X}, \underline{Y})$ is not absolutely necessary: one can always numerically calculate the second (multivariate) integral in (2.68).

2.4.6 One-parameter Exponential Distribution

We are back to the set-up of Examples 2.1 or 2.2 but now we are looking for the interval estimator of $R$. The one-parameter exponential distribution is one of the exceptional cases when the exact confidence interval can be constructed. Surprisingly, this derivation was carried out in a purely Bayesian paper of Enis and Geisser (1971), which shows the interconnection between seemingly different approaches. Recall that the MLE of $R = \alpha/(\alpha + \beta)$ is of the form $\hat{R} = \bar{Y}/(\bar{X} + \bar{Y})$ ((2.12) and (2.15)) and note that $n_1\bar{X}$ and $n_2\bar{Y}$ have gamma distributions with parameters $(\alpha, n_1)$ and $(\beta, n_2)$, respectively. In order to obtain the exact confidence interval for $R$ we shall derive the exact distribution of the variable $\alpha n_1\bar{X}/(\alpha n_1\bar{X} + \beta n_2\bar{Y})$. Denote $\xi = \alpha n_1\bar{X}$, $\eta = \beta n_2\bar{Y}$ and observe that $\xi$ and $\eta$ have gamma distributions with the parameters $(1, n_1)$ and $(1, n_2)$, respectively. Introduce now a new set of variables $\zeta = \xi/(\xi + \eta)$, $\tau = \eta$. Expressing the old variables in terms of the new ones, $\xi = \zeta\tau/(1 - \zeta)$, $\eta = \tau$, and obtaining the Jacobian of the transformation $J = (1 - \zeta)^{-2}\tau$, we arrive at the following joint pdf of $\zeta$ and $\tau$:

$$p(\zeta, \tau) = \frac{\zeta^{n_1-1}\tau^{n_1+n_2-1}}{\Gamma(n_1)\Gamma(n_2)(1 - \zeta)^{n_1+1}} \exp\left\{-\frac{\tau}{1 - \zeta}\right\}.$$

Integrating out $\tau$ we have the marginal distribution of $\zeta$

$$p(\zeta) = [B(n_1, n_2)]^{-1}\zeta^{n_1-1}(1 - \zeta)^{n_2-1}, \qquad 0 < \zeta < 1,$$

namely, $\zeta$ has a beta distribution with the known parameters $n_1$ and $n_2$. Therefore, for any $0 < a < b < 1$,

$$P(a < \zeta < b) = I_b(n_1, n_2) - I_a(n_1, n_2), \tag{2.69}$$

where

$$I_x(n_1, n_2) = [B(n_1, n_2)]^{-1}\int_0^x t^{n_1-1}(1 - t)^{n_2-1}\, dt \tag{2.70}$$

is the incomplete beta function (see Abramowitz and Stegun (1992), page 263). To connect $\zeta$ and $R$, it is easy to check by direct calculations that

$$\zeta = \frac{n_1 R(1 - \hat{R})}{n_1 R(1 - \hat{R}) + n_2(1 - R)\hat{R}}, \tag{2.71}$$

hence the right-hand side of (2.71) is a pivotal quantity. Consequently, if $a$ and $b$ in (2.69) are such that for a given $\gamma$

$$I_b(n_1, n_2) - I_a(n_1, n_2) = 1 - \gamma, \tag{2.72}$$

then

$$P\left(a < \frac{n_1 R(1 - \hat{R})}{n_1 R(1 - \hat{R}) + n_2(1 - R)\hat{R}} < b\right) = 1 - \gamma. \tag{2.73}$$

Solving the inequality in (2.73) for $R$, we obtain

$$\frac{n_2\hat{R}a}{n_1(1 - \hat{R})(1 - a) + n_2\hat{R}a} < R < \frac{n_2\hat{R}b}{n_1(1 - \hat{R})(1 - b) + n_2\hat{R}b}. \tag{2.74}$$

The exact confidence interval (2.74) has the advantage of being valid for any values of $n_1$ and $n_2$, large or small. However, for its construction one needs to solve the equation (2.72) for $a$ and $b$. Although the equation has an infinite number of solutions, the objective is to find those that are the closest to each other, namely, those for which $b - a$ is minimal. Since the solution of this optimization problem is often far from trivial, we may wish to replace the exact confidence interval by the asymptotic one provided $n_1$ and $n_2$ are relatively large.

We can base our asymptotic confidence interval on either the UMVUE $\tilde{R}$ or the MLE $\hat{R}$ of $R$. In both instances, we shall use the fact that as $n_1 \to \infty$ and $n_2 \to \infty$, the estimators $\tilde{R}$ and $\hat{R}$ are asymptotically normal with the mean $R$. In Example 2.2 we have derived the unbiased estimator $\tilde{\sigma}^2 = \widehat{\mathrm{Var}}\,\tilde{R}$ for the variance $\mathrm{Var}\,\tilde{R}$ of the form given by (2.37) and (2.38). Then, the asymptotic confidence interval for $R$ with the confidence coefficient $1 - \gamma$ is given approximately by $(\tilde{R} - z_{\gamma/2}\tilde{\sigma}, \tilde{R} + z_{\gamma/2}\tilde{\sigma})$ where, as above, $z_{\gamma/2}$ is the $100(1 - \gamma/2)$th percentile of the standard normal distribution. We warn our readers that even with modern computer facilities, an application of (2.37) and (2.38) may require substantial numerical effort.

If the sample sizes are equal, $n_1 = n_2 = n$, we can obtain an asymptotic confidence interval for $R$ based on the MLE using the asymptotic expression (2.16) for $\mathrm{MSE}(\hat{R})$. Choose $\hat{\sigma}^2 = 2\hat{r}^2(1 + \hat{r})^{-4}n^{-1} + 4\hat{r}^2(2\hat{r} - 1)(\hat{r} - 2)(1 + \hat{r})^{-6}n^{-2}$ with $\hat{r} = \bar{Y}/\bar{X}$. Then, the asymptotic confidence interval based on the MLE is of the form $(\hat{R} - z_{\gamma/2}\hat{\sigma}, \hat{R} + z_{\gamma/2}\hat{\sigma})$, where $z_\alpha$ is defined in (2.60).

Consider now the construction of the credible interval for $R$. If we choose conjugate gamma priors for $\alpha$ and $\beta$, as it has been done in Example 2.3, the posterior pdf of $R$ will be of the form (2.53) with the updated parameters (2.50) and $B = (\lambda^* - \gamma^*)/\lambda^*$. Then, the $(1 - \gamma)$-credible set is the HPD region formally defined in (2.63) and (2.64). However, since $\mu^* > 1$ and $\nu^* > 1$, the pdf (2.53) is unimodal, so that luckily the HPD region reduces to the interval $W(\underline{X}, \underline{Y}) = (L(\underline{X}, \underline{Y}), U(\underline{X}, \underline{Y}))$. Algebraically, in view of (2.63) and (2.64), $L$ and $U$ are such that

$$\pi_R(L|\underline{X}, \underline{Y}) = \pi_R(U|\underline{X}, \underline{Y}) \tag{2.75}$$

and

$$\int_L^U \pi_R(R|\underline{X}, \underline{Y})\, dR = 1 - \gamma. \tag{2.76}$$

The system of equations (2.75) and (2.76) ought to be solved numerically. We have not done it here but an example of solution of the system of equations (2.63) and (2.64) is given in Section 4.3.2.
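The exact interval of this subsection can be sketched in a few lines of standard-library Python. Two assumptions are made here and are not taken from the text: the constants $a$ and $b$ are chosen equal-tailed, $I_a(n_1, n_2) = \gamma/2$ and $I_b(n_1, n_2) = 1 - \gamma/2$, rather than with $b - a$ minimal, and the incomplete beta function is evaluated through its binomial-tail identity, which is valid because $n_1$ and $n_2$ are integers. All function names are hypothetical.

```python
import math
import random


def reg_inc_beta(x, n1, n2):
    """I_x(n1, n2) for integer n1, n2 via the identity I_x = P(Bin(n1 + n2 - 1, x) >= n1)."""
    n = n1 + n2 - 1
    return sum(math.comb(n, j) * x ** j * (1 - x) ** (n - j) for j in range(n1, n + 1))


def inv_reg_inc_beta(p, n1, n2):
    """Solve I_x(n1, n2) = p for x by bisection (I_x is increasing in x)."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if reg_inc_beta(mid, n1, n2) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def exact_interval(x, y, gamma=0.05):
    """Exact (1 - gamma) interval for R = alpha / (alpha + beta), equal-tailed a and b."""
    n1, n2 = len(x), len(y)
    a = inv_reg_inc_beta(gamma / 2.0, n1, n2)
    b = inv_reg_inc_beta(1.0 - gamma / 2.0, n1, n2)
    sx, sy = sum(x), sum(y)                       # n1 * xbar and n2 * ybar
    # Same bounds as the exact interval above, written in terms of the sample sums.
    lower = a * sy / ((1.0 - a) * sx + a * sy)
    upper = b * sy / ((1.0 - b) * sx + b * sy)
    return lower, upper


def coverage(alpha, beta, n1, n2, reps=2000, gamma=0.05, seed=1):
    """Monte Carlo fraction of intervals that contain the true R."""
    rng = random.Random(seed)
    r_true = alpha / (alpha + beta)
    hits = 0
    for _ in range(reps):
        x = [rng.expovariate(alpha) for _ in range(n1)]
        y = [rng.expovariate(beta) for _ in range(n2)]
        lower, upper = exact_interval(x, y, gamma)
        hits += lower < r_true < upper
    return hits / reps


print(coverage(2.0, 1.0, 5, 8))
```

Because the interval is exact, the simulated coverage should stay close to $1 - \gamma$ even for sample sizes as small as these; the equal-tailed choice is merely the easiest solution of the constraint on $a$ and $b$ and is generally not the shortest one.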

2.5 Transformation Methods

2.5.1 The Theory

In Sections 2.1 to 2.4 we have dealt with the construction of both point and interval estimators of $R = P(X < Y)$ when the random vector $(X, Y)$ has the pdf $f(x, y|\theta)$, where $\theta$ is a scalar or a vector-valued parameter. In this section we shall describe how the point or the interval estimators can be derived using the so-called transformation methods. These methods seem to have been overlooked by statisticians and, to the best of our knowledge, have never been applied to the stress-strength problem before. Let, as above, $(X, Y)$ be a random vector with the pdf $f(x, y|\theta)$. We shall assume that there exist random variables $\xi$ and $\eta$ and a monotone function $u(\cdot)$ with the inverse $v = u^{-1}$ such that

$$X = u(\xi) \iff \xi = v(X), \qquad Y = u(\eta) \iff \eta = v(Y). \tag{2.77}$$

Assume also, without loss of generality, that the functions $u$ and $v$ are strictly increasing, so that $(\xi, \eta)$ is the random vector with the pdf

$$g^*(\xi, \eta|\theta) = f(u(\xi), u(\eta)|\theta)\, u'(\xi)\, u'(\eta). \tag{2.78}$$

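Parameterization of the pdf (2.78) can also be carried out in a different manner. Namely, let $(\xi, \eta)$ be the random vector with the pdf $g(\xi, \eta|\tau)$, where the scalar or vector-valued parameter $\tau$ is connected to $\theta$ by the one-to-one transformation $\varrho$ with the inverse $\upsilon$:

$$\theta = \upsilon(\tau) \iff \tau = \varrho(\theta). \tag{2.79}$$

Thus, there exists the following correspondence between $f(x, y|\theta)$ and $g(\xi, \eta|\tau)$:

$$g(\xi, \eta|\tau) = f(u(\xi), u(\eta)|\upsilon(\tau))\, u'(\xi)\, u'(\eta), \tag{2.80}$$

$$f(x, y|\theta) = g(v(x), v(y)|\varrho(\theta))\, v'(x)\, v'(y). \tag{2.81}$$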

Since the function $u$ in (2.77) is monotonically increasing, $P(\xi < \eta) = P(u(\xi) < u(\eta)) = P(X < Y)$, so that for this model $R$ remains invariant. Therefore, if $R_{\xi,\eta}(\tau)$ and $R_{X,Y}(\theta)$ are the two parametric expressions for $R$ in terms of $\tau$ and $\theta$, respectively, then

$$R_{X,Y}(\theta) = R_{\xi,\eta}(\varrho(\theta)), \qquad R_{\xi,\eta}(\tau) = R_{X,Y}(\upsilon(\tau)). \tag{2.82}$$
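The invariance of $R$ under a common increasing transformation is easy to see in a simulation. In the sketch below (names and parameter values are illustrative assumptions), $\xi$ and $\eta$ are exponential with rates $\lambda_1$ and $\lambda_2$, so that $P(\xi < \eta) = \lambda_1/(\lambda_1 + \lambda_2)$, and $u(t) = e^t$ turns them into Pareto-type variables $X$ and $Y$.

```python
import math
import random


def fraction_less(xs, ys):
    """Empirical estimate of P(X < Y) from paired draws."""
    return sum(x < y for x, y in zip(xs, ys)) / len(xs)


def invariance_demo(lam1=2.0, lam2=1.0, n=20000, seed=0):
    """Return the empirical P(xi < eta) and P(u(xi) < u(eta)) for u = exp."""
    rng = random.Random(seed)
    xi = [rng.expovariate(lam1) for _ in range(n)]
    eta = [rng.expovariate(lam2) for _ in range(n)]
    x = [math.exp(t) for t in xi]      # X = u(xi), with u strictly increasing
    y = [math.exp(t) for t in eta]     # Y = u(eta)
    return fraction_less(xi, eta), fraction_less(x, y)


before, after = invariance_demo()
print(before, after, 2.0 / 3.0)
```

Since $u$ preserves the order of every pair, the two empirical fractions agree pair by pair, and both approximate $\lambda_1/(\lambda_1 + \lambda_2)$.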

Suppose that the MLE, the UMVUE, the Bayes estimator as well as the confidence interval for $R = P(\xi < \eta)$ have already been derived. The aim of the theory is to transform these estimators into estimators of $R = P(X < Y)$. In the proofs of the theorems stated below we shall assume, without loss of generality, that $\xi$ and $\eta$ are dependent variables. To obtain proofs for independent $\xi$ and $\eta$ one needs only to replace $\prod_{j=1}^{n} g(\xi_j, \eta_j|\tau)$ and $\prod_{j=1}^{n} f(x_j, y_j|\theta)$ by $\prod_{i=1}^{n_1} g_\xi(\xi_i|\tau)\prod_{j=1}^{n_2} g_\eta(\eta_j|\tau)$ and $\prod_{i=1}^{n_1} f_X(x_i|\theta)\prod_{j=1}^{n_2} f_Y(y_j|\theta)$, respectively. Theorems 2.7-2.10 serve as the basis for transformation of existing estimators of $R$ into the required new ones.

Theorem 2.7 Let $\Psi(\underline{\xi}, \underline{\eta})$ be the MLE of $R$ based on observations $\underline{\xi} = (\xi_1, \ldots, \xi_{n_1})$ and $\underline{\eta} = (\eta_1, \ldots, \eta_{n_2})$, where $n_1 = n_2 = n$ whenever $\xi$ and $\eta$ are dependent. Then the MLE $\hat{R}$ of $R$ based on $\underline{X}$ and $\underline{Y}$ is given by

$$\hat{R} = \Psi(v(\underline{X}), v(\underline{Y})), \tag{2.83}$$

where

$$v(\underline{X}) = (v(X_1), \ldots, v(X_{n_1})), \qquad v(\underline{Y}) = (v(Y_1), \ldots, v(Y_{n_2})). \tag{2.84}$$

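Proof. By definition, the MLE $\hat{\tau}(\underline{\xi}, \underline{\eta})$ of $\tau$ is the value of the parameter maximizing the likelihood function $L(\tau|\underline{\xi}, \underline{\eta}) = \prod_{j=1}^{n} g(\xi_j, \eta_j|\tau)$ and, by the invariance property of the MLEs (Theorem 2.1),

$$\Psi(\underline{\xi}, \underline{\eta}) = R_{\xi,\eta}(\hat{\tau}(\underline{\xi}, \underline{\eta})). \tag{2.85}$$

The correspondence (2.81) implies that the likelihood function for $(\underline{X}, \underline{Y})$ is of the form

$$L(\theta|\underline{X}, \underline{Y}) = \prod_{j=1}^{n} f(X_j, Y_j|\theta) = \prod_{j=1}^{n} g(v(X_j), v(Y_j)|\varrho(\theta))\, v'(X_j)\, v'(Y_j),$$

and, consequently, it is maximized by $\hat{\theta} = \upsilon(\hat{\tau}(v(\underline{X}), v(\underline{Y})))$ (cf. (2.79)). It is easy to see that $\hat{\theta}$ is the MLE of $\theta$, so that, by Theorem 2.1, $\hat{\tau}(v(\underline{X}), v(\underline{Y}))$ is the MLE of $\varrho(\theta)$. Hence, (2.82) implies that the MLE of $R$ is

$$\hat{R}(\underline{X}, \underline{Y}) = R_{\xi,\eta}(\hat{\tau}(v(\underline{X}), v(\underline{Y}))). \tag{2.86}$$

It now remains to compare (2.85) and (2.86).

Theorem 2.8 Let $T_{\xi,\eta} = T_{\xi,\eta}(\underline{\xi}, \underline{\eta})$ be a sufficient statistic for $\tau$ based on $(\underline{\xi}, \underline{\eta})$ and let there exist an UMVUE $\Psi(T_{\xi,\eta})$ of $R$ based on observations $(\underline{\xi}, \underline{\eta})$. Then, $T_{X,Y} = T_{\xi,\eta}(v(\underline{X}), v(\underline{Y}))$ is a sufficient statistic for $\theta$ based on the sample $(\underline{X}, \underline{Y})$ and the UMVUE $\tilde{R}$ of $R$ based on $\underline{X}$ and $\underline{Y}$ is given by

$$\tilde{R} = \Psi(T_{X,Y}). \tag{2.87}$$

Moreover, if $\Psi_1(T_{\xi,\eta})$ is the UMVUE of the variance of the unbiased estimator $\Psi(T_{\xi,\eta})$, then the UMVUE of the variance of $\tilde{R}$ is of the form

$$\widehat{\mathrm{Var}}(\tilde{R}) = \Psi_1(T_{X,Y}). \tag{2.88}$$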

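Proof. The existence and the form of the sufficient statistic for $\theta$ follow directly from relations (2.80), (2.81) and the Factorization Theorem (Theorem 2.2). To prove that $\tilde{R} = \Psi(T_{X,Y})$ is the UMVUE for $R$ we actually need only to show that it is unbiased. Then, since $T_{X,Y}$ is a sufficient statistic for $\theta$, the assertion follows. Unbiasedness is verified directly:

$$E(\tilde{R}) = \int \cdots \int \Psi(T_{\xi,\eta}(v(x), v(y))) \prod_{j=1}^{n}\left[f(x_j, y_j|\theta)\, dx_j\, dy_j\right] = \int \cdots \int \Psi(T_{\xi,\eta}(v(x), v(y))) \prod_{j=1}^{n}\left[g(v(x_j), v(y_j)|\varrho(\theta))\, v'(x_j)\, v'(y_j)\, dx_j\, dy_j\right] = R_{\xi,\eta}(\varrho(\theta)) = R_{X,Y}(\theta). \tag{2.89}$$

Here, $x = (x_1, \ldots, x_n)$, $y = (y_1, \ldots, y_n)$, and we have applied (2.80), (2.81) and (2.82) to derive (2.89). To prove that $\Psi_1(T_{X,Y})$ is the UMVUE of $\mathrm{Var}(\tilde{R})$ one need only note that $\Psi_1(T_{\xi,\eta}) = \Psi^2(T_{\xi,\eta}) - \Psi_2(T_{\xi,\eta})$, where $\Psi_2(T_{\xi,\eta})$ is the UMVUE of $R^2_{\xi,\eta}(\tau)$, and reapply arguments similar to (2.89).

Theorem 2.9 Let $\Psi(\underline{\xi}, \underline{\eta})$ be a Bayes estimator of $R$ based on $(\underline{\xi}, \underline{\eta})$ and the prior pdf $\pi(\tau)$. Then $\hat{R} = \Psi(v(\underline{X}), v(\underline{Y}))$ is a Bayes estimator of $R$ based on the sample $(\underline{X}, \underline{Y})$ and the prior pdf $\pi(\varrho(\theta))|J_\varrho(\theta)|$. Here $|J_\varrho(\theta)|$ is the Jacobian of the transformation $\varrho(\theta)$; the vector functions $v(\underline{X})$ and $v(\underline{Y})$ are defined in (2.84).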

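Proof. By the conditions of the theorem,

$$\Psi(\underline{\xi}, \underline{\eta}) = \frac{\int \cdots \int R_{\xi,\eta}(\tau)\, g(\underline{\xi}, \underline{\eta}|\tau)\, \pi(\tau)\, d\tau}{\int \cdots \int g(\underline{\xi}, \underline{\eta}|\tau)\, \pi(\tau)\, d\tau}.$$

Therefore, due to (2.80), (2.81) and (2.82), the Bayes estimator of $R$ based on the sample $(\underline{X}, \underline{Y})$ and the prior pdf $\pi(\varrho(\theta))|J_\varrho(\theta)|$ is

$$\hat{R} = \frac{\int \cdots \int R_{X,Y}(\theta)\, f(\underline{X}, \underline{Y}|\theta)\, \pi(\varrho(\theta))|J_\varrho(\theta)|\, d\theta}{\int \cdots \int f(\underline{X}, \underline{Y}|\theta)\, \pi(\varrho(\theta))|J_\varrho(\theta)|\, d\theta}.$$

Theorem 2.10 Let $(L(\underline{\xi}, \underline{\eta}), U(\underline{\xi}, \underline{\eta}))$ be the confidence interval for $R$ with the confidence coefficient $(1 - \gamma)$. Then, $(L(v(\underline{X}), v(\underline{Y})), U(v(\underline{X}), v(\underline{Y})))$ is the confidence interval for $R$ based on $(\underline{X}, \underline{Y})$ with the same coverage probability.

Proof. The proof is straightforward and is omitted.

2.5.2 Examples of Transformations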

Reparametrizations (2.80) and (2.81) are valid for many probability distributions. Table 2.1, which illustrates this, is designed for a pair of random variables $X$ and $\xi$ with the pdfs $f(x|\theta) = g(v(x)|\varrho(\theta))\, v'(x)$ and $g(\xi|\tau) = f(u(\xi)|\upsilon(\tau))\, u'(\xi)$, respectively. The transformations are $\xi = v(X)$ and $\tau = \varrho(\theta)$.

2.5.3 The Rayleigh Distribution

Let us consider examples of application of Theorems 2.7-2.9. Let $X$ and $Y$ be independent random variables with the pdfs Rayleigh($\sigma_1$) and Rayleigh($\sigma_2$), respectively, where the pdf of Rayleigh($\sigma$) is of the form (see also Table 2.1)

$$f(x|\sigma) = \sigma x \exp\left\{-\sigma x^2/2\right\}, \qquad x > 0.$$

Given the samples X_ and Y_, we are interested in point and interval estimation of R = P(X < Y).

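Table 2.1 Transformations of random variables.

pdf $f(x|\theta)$ | Transforms | pdf $g(\xi|\tau)$
Weibull: $f(x|\sigma) = (\alpha/\sigma)(x/\sigma)^{\alpha-1}\exp\{-(x/\sigma)^\alpha\}$, $\alpha$ known, $x > 0$ | $v(x) = x^\alpha$; $\theta = \sigma$, $\tau = \lambda$, $\lambda = \sigma^{-\alpha}$ | One-parameter exponential: $g(\xi|\lambda) = \lambda e^{-\lambda\xi}$, $\xi > 0$
Pareto: $f(x|\sigma, \lambda) = \lambda\sigma^\lambda x^{-(\lambda+1)}$, $x > \sigma$ | $v(x) = \ln x$; $\theta = (\sigma, \lambda)$, $\tau = (\mu, \lambda)$, $\mu = \ln\sigma$ | Two-parameter exponential: $g(\xi|\mu, \lambda) = \lambda e^{-\lambda(\xi-\mu)}$, $\xi > \mu$
Rayleigh: $f(x|\sigma) = \sigma x e^{-\sigma x^2/2}$, $x > 0$ | $v(x) = x^2/2$; $\theta = \sigma$, $\tau = \lambda$, $\lambda = \sigma$ | One-parameter exponential: $g(\xi|\lambda) = \lambda e^{-\lambda\xi}$, $\xi > 0$
Power: $f(x|\sigma, \lambda)$, $0 < x < \lambda$ | $v(x) = \ln(1/x)$; $\theta = (\sigma, \lambda)$ | Two-parameter exponential: $g(\xi|\mu, \lambda) = \lambda e^{-\lambda(\xi-\mu)}$, $\xi > \mu$
Burr type X: $f(x|\sigma) = 2\sigma x e^{-x^2}(1 - e^{-x^2})^{\sigma-1}$, $x > 0$ | $v(x) = -\ln(1 - e^{-x^2})$; $\theta = \sigma$, $\tau = \lambda$, $\lambda = \sigma$ | One-parameter exponential: $g(\xi|\lambda) = \lambda e^{-\lambda\xi}$, $\xi > 0$
Burr type XII: $f(x|\sigma) = \sigma\beta x^{\beta-1}(1 + x^\beta)^{-(\sigma+1)}$, $\beta$ known, $x > 0$ | $v(x) = \ln(1 + x^\beta)$; $\theta = \sigma$, $\tau = \lambda$, $\lambda = \sigma$ | One-parameter exponential: $g(\xi|\lambda) = \lambda e^{-\lambda\xi}$, $\xi > 0$
Lognormal: $f(x|\mu, \sigma) = \frac{1}{x\sigma\sqrt{2\pi}}\exp\left\{-\frac{(\ln x - \mu)^2}{2\sigma^2}\right\}$, $x > 0$ | $v(x) = \ln x$; $\theta = \tau = (\mu, \sigma)$ | Normal: $g(\xi|\mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left\{-\frac{(\xi - \mu)^2}{2\sigma^2}\right\}$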

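From Table 2.1 it follows that $\xi = X^2/2$ and $\eta = Y^2/2$ are independent exponential random variables with parameters $\lambda_1 = \sigma_1$ and $\lambda_2 = \sigma_2$, respectively. Hence, by Theorem 2.7, the MLE of $R$ is $\hat{R} = T_y/(T_x + T_y)$, where

$$T_x = \frac{1}{2n_1}\sum_{i=1}^{n_1} X_i^2, \qquad T_y = \frac{1}{2n_2}\sum_{j=1}^{n_2} Y_j^2. \tag{2.90}$$

Similarly, the UMVUE of $R$ is

$$\tilde{R} = \begin{cases} Q_1(n_1, n_2, n_1T_x, n_2T_y), & \text{if } n_2T_y \le n_1T_x, \\ Q_2(n_1, n_2, n_1T_x, n_2T_y), & \text{if } n_2T_y > n_1T_x, \end{cases} \tag{2.91}$$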

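where the functions $Q_1$ and $Q_2$ are defined in (2.35) and (2.36), respectively. To derive the Bayes estimator of $R$ we have to work a bit harder. Since $\varrho$ is the identity transformation, the conjugate prior for $\sigma_1$, $\sigma_2$, by (2.48), is the product of gamma pdfs, i.e.

$$\pi(\sigma_1, \sigma_2) \propto \sigma_1^{\mu_1-1} e^{-\gamma_1\sigma_1}\, \sigma_2^{\mu_2-1} e^{-\gamma_2\sigma_2}.$$

Analogously, the posterior pdf of $(\sigma_1, \sigma_2)$ is

$$\pi(\sigma_1, \sigma_2|\underline{X}, \underline{Y}) \propto \sigma_1^{\mu_1^*-1} e^{-\gamma_1^*\sigma_1}\, \sigma_2^{\mu_2^*-1} e^{-\gamma_2^*\sigma_2}$$

with the updated parameters

$$\mu_1^* = n_1 + \mu_1, \quad \gamma_1^* = \gamma_1 + n_1T_x, \quad \mu_2^* = n_2 + \mu_2, \quad \gamma_2^* = \gamma_2 + n_2T_y. \tag{2.92}$$

The quantities $T_x$ and $T_y$ are defined in (2.90). The posterior distribution of $R$ can be derived in the same manner as in Example 2.3:

$$\pi_R(R|\underline{X}, \underline{Y}) = C_R\, R^{\mu_1^*-1}(1 - R)^{\mu_2^*-1}(1 - BR)^{-(\mu_1^*+\mu_2^*)}, \qquad 0 < R < 1, \tag{2.93}$$

with $B = (\gamma_2^* - \gamma_1^*)/\gamma_2^*$. The Bayes estimator of $R$ corresponding to the conjugate gamma priors is of the form

$$\hat{R} = \begin{cases} \dfrac{\mu_1^*(1 - B)^{\mu_1^*}}{\mu_1^* + \mu_2^*}\; {}_2F_1(\mu_1^* + \mu_2^*, \mu_1^* + 1, \mu_1^* + \mu_2^* + 1; B), & \text{if } |B| < 1, \\[2ex] \dfrac{\mu_1^*}{(\mu_1^* + \mu_2^*)(1 - B)^{\mu_2^*}}\; {}_2F_1\!\left(\mu_1^* + \mu_2^*, \mu_2^*, \mu_1^* + \mu_2^* + 1; \dfrac{B}{B - 1}\right), & \text{if } B \le -1. \end{cases} \tag{2.94}$$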
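Here ${}_2F_1(a, b, c; z)$ is the hypergeometric series defined in (2.55) and already encountered on several occasions. Note that in the case of the noninformative Jeffreys's prior $\pi(\sigma_1, \sigma_2) \propto (\sigma_1\sigma_2)^{-1}$, the updated parameters in (2.92) become $\mu_1^* = n_1$, $\gamma_1^* = n_1T_x$, $\mu_2^* = n_2$, $\gamma_2^* = n_2T_y$.

We shall now discuss the construction of confidence intervals for $R$. By Theorem 2.8, the UMVUE $\tilde{\sigma}^2$ for the variance of $\tilde{R}$ is of the form (2.37), computed from the transformed observations, where the function $H(n_1, n_2, x, y)$ is defined in (2.38). Consequently, the asymptotic confidence interval for $R$ with the confidence coefficient $1 - \gamma$ is $(\tilde{R} - z_{\gamma/2}\tilde{\sigma}, \tilde{R} + z_{\gamma/2}\tilde{\sigma})$, where $z_{\gamma/2}$ is the $100(1 - \gamma/2)$th percentile of the standard normal distribution. One can also derive the $(1 - \gamma)$-credible set $(L, U)$. Since the posterior pdf of $R$ is of the same shape as in the case of the one-parameter exponential distribution, the bounds $L$ and $U$ can be obtained from a system of equations similar to (2.75) and (2.76),

$$\pi_R(L|\underline{X}, \underline{Y}) = \pi_R(U|\underline{X}, \underline{Y}), \qquad \int_L^U \pi_R(R|\underline{X}, \underline{Y})\, dR = 1 - \gamma,$$

which should be solved numerically.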
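The transformation route to the MLE in the Rayleigh case amounts to squaring the observations and reusing the exponential-case estimator. The sketch below assumes the parametrization $f(x|\sigma) = \sigma x\exp\{-\sigma x^2/2\}$, which is what the transformation $v(x) = x^2/2$ with $\lambda = \sigma$ in Table 2.1 implies; the function names and the simulated data are illustrative only.

```python
import math
import random


def rayleigh_sample(sigma, n, rng):
    """Draw n values with pdf sigma * x * exp(-sigma * x^2 / 2) via X = sqrt(2 * xi), xi ~ Exp(sigma)."""
    return [math.sqrt(2.0 * rng.expovariate(sigma)) for _ in range(n)]


def rayleigh_mle_R(x, y):
    """MLE of R = P(X < Y): apply the exponential-case MLE to xi = x^2 / 2, eta = y^2 / 2."""
    t_x = sum(v * v for v in x) / (2.0 * len(x))   # mean of the transformed X sample
    t_y = sum(v * v for v in y) / (2.0 * len(y))   # mean of the transformed Y sample
    return t_y / (t_x + t_y)


rng = random.Random(0)
x = rayleigh_sample(3.0, 5000, rng)
y = rayleigh_sample(1.0, 5000, rng)
print(rayleigh_mle_R(x, y), 3.0 / (3.0 + 1.0))
```

For large samples the estimate should be close to $\sigma_1/(\sigma_1 + \sigma_2)$, the value of $R$ in this model.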

2.6 Exercises

2.1. Let $X$ and $Y$ be independent exponential random variables (see Section 2.1.3). Find the MLE of $P(aX + bY + c > 0)$ where $a$, $b$ and $c$ are known constants.

2.2. Let $X$ and $Y$ be independent exponential random variables (see Section 2.1.3). Find the UMVUE of $P(aX + bY + c > 0)$ where $a$, $b$ and $c$ are known constants.

2.3. As mentioned in Section 2.3.1, a Bayes estimator is usually chosen to be the expectation with respect to the posterior pdf. However, if the posterior pdf is nonsymmetric, it is often advantageous to base a Bayes estimator on the posterior median. Using the posterior pdf (2.53) in Section 2.3.3, derive the posterior median-based Bayes estimator of $R$.

2.4. Using mathematical induction, prove the Bonferroni inequality (2.59).

2.5. The pdf of the Burr type X distribution is given in Table 2.1 or in (3.28). Using transformations, derive the UMVUE (3.81) of $R$ in this case.

2.6. Under the conditions of problem 2.5, derive the Bayes estimator of $R$ corresponding to the conjugate gamma prior or Jeffreys's noninformative prior.

2.7. The pdf of the Burr type XII distribution is given in Table 2.1 or in (3.31). Using transformations, derive the UMVUE of $R$ in this case.

2.8. Under the conditions of problem 2.7, derive the Bayes estimator of $R$ corresponding to the conjugate gamma prior or Jeffreys's noninformative prior.

Chapter 3

Parametric Point Estimation

3.1 The Maximum Likelihood Estimation (Univariate Case)

In this section we shall consider the derivation of MLEs for some distribution families that appear in applications. As already mentioned, when constructing the MLEs, we shall employ the MLEs constructed previously (see e.g. Johnson et al. (1994), (1995)). As above, the estimators are based on the samples $\underline{X} = (X_1, X_2, \ldots, X_{n_1})$ and $\underline{Y} = (Y_1, Y_2, \ldots, Y_{n_2})$ (see (2.2)), where $n_1 = n_2 = n$ if $X$ and $Y$ are dependent variables. Since the volume of this book does not allow us to discuss the construction of the MLEs for every distribution family satisfying the above criterion, we shall elaborate the derivation of the MLEs only for a number of distributions, leaving the rest for exercises. We shall begin with the normal distribution, which dominated statistical practice for over 100 years.

3.1.1

The Normal Distribution

Let X and Y be independent normal variables with the means $\mu_1$, $\mu_2$ and variances $\sigma_1^2$ and $\sigma_2^2$, respectively. In this subsection and subsection 3.2.1 we shall consider independent X and Y only. In the situation when X and Y are dependent normal variables, they can be viewed as a two-dimensional random vector $\mathbf{X} = (X, Y)$, and estimation of $P(X < Y)$ reduces to estimation of $P(\mathbf{A}'\mathbf{X} > 0)$ with $\mathbf{A} = (-1, 1)'$, which is studied in detail in Section 3.5. If X and Y are independent, then $Y - X$ is a normal variable with the mean $\mu_2 - \mu_1$ and the variance $\sigma_1^2 + \sigma_2^2$, so that

$$R = P(X < Y) = \Phi\left(\frac{\mu_2 - \mu_1}{\sqrt{\sigma_1^2 + \sigma_2^2}}\right), \tag{3.1}$$

where $\Phi(z)$ is the cdf of the standard normal distribution (see (2.62)). If the parameters of both stress and strength are unknown, then, by the invariance principle (Theorem 2.1), the MLE of R is

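$$\hat R = \Phi\left(\frac{\bar Y - \bar X}{\sqrt{(n_1 - 1)s_X^2/n_1 + (n_2 - 1)s_Y^2/n_2}}\right), \tag{3.2}$$

where $\bar X$ and $\bar Y$ are defined in (2.14),

$$s_X^2 = \frac{1}{n_1 - 1}\sum_{i=1}^{n_1}(X_i - \bar X)^2, \qquad s_Y^2 = \frac{1}{n_2 - 1}\sum_{j=1}^{n_2}(Y_j - \bar Y)^2. \tag{3.3}$$

Note that $\bar X$, $\bar Y$, $(n_1 - 1)s_X^2/n_1$ and $(n_2 - 1)s_Y^2/n_2$ are the MLEs of $\mu_1$, $\mu_2$, $\sigma_1^2$ and $\sigma_2^2$, respectively. In some cases, the distribution of stress is known or may be reliably calculated. Then $\mu_1$ and $\sigma_1^2$ are available with sufficient accuracy, and the MLE of R is

$$\hat R = \Phi\left(\frac{\bar Y - \mu_1}{\sqrt{\sigma_1^2 + (n_2 - 1)s_Y^2/n_2}}\right).$$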
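The plug-in estimators of this subsection can be sketched in a few lines of Python. This is only an illustration under the stated normality and independence assumptions; the function names `mle_reliability_normal` and `mle_reliability_known_stress` are hypothetical and do not come from the book.

```python
from statistics import NormalDist, fmean


def mle_reliability_normal(xs, ys):
    """Plug-in MLE of R = P(X < Y) for independent normal samples."""
    n1, n2 = len(xs), len(ys)
    xbar, ybar = fmean(xs), fmean(ys)
    # MLEs of the variances divide by n, i.e. they equal (n - 1) * s^2 / n.
    var_x = sum((x - xbar) ** 2 for x in xs) / n1
    var_y = sum((y - ybar) ** 2 for y in ys) / n2
    return NormalDist().cdf((ybar - xbar) / (var_x + var_y) ** 0.5)


def mle_reliability_known_stress(mu1, var1, ys):
    """Same estimator when the stress mean mu1 and variance var1 are known."""
    n2 = len(ys)
    ybar = fmean(ys)
    var_y = sum((y - ybar) ** 2 for y in ys) / n2
    return NormalDist().cdf((ybar - mu1) / (var1 + var_y) ** 0.5)
```

Both functions assume non-degenerate samples, so that the estimated variance under the square root is positive.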

Church and Harris (1970) substitute $\sigma_2^2$ in (3.1) with its UMVUE $s_Y^2$, arriving at

$$R^* = \Phi\left(\frac{\bar Y - \mu_1}{\sqrt{\sigma_1^2 + s_Y^2}}\right). \tag{3.4}$$

3.1.2

The Two-parameter Exponential Distribution

Let X and Y be independent random variables with the pdfs $\mathrm{Ex}(\mu_1, \sigma_1)$ and $\mathrm{Ex}(\mu_2, \sigma_2)$, respectively, where the pdf of the two-parameter exponential distribution $\mathrm{Ex}(\mu, \sigma)$ is of the form

$$f(x|\mu, \sigma) = \sigma \exp\{-\sigma(x - \mu)\}\, I(x > \mu). \tag{3.5}$$

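Then,

$$R = \begin{cases} \dfrac{\sigma_1}{\sigma_1 + \sigma_2}\exp\{-\sigma_2(\mu_1 - \mu_2)\}, & \text{if } \mu_1 > \mu_2, \\[2mm] 1 - \dfrac{\sigma_2}{\sigma_1 + \sigma_2}\exp\{-\sigma_1(\mu_2 - \mu_1)\}, & \text{if } \mu_1 \le \mu_2. \end{cases} \tag{3.6}$$

To derive expression (3.6), represent R in the form

$$R = \iint \sigma_1 \sigma_2 \exp\{-\sigma_1(x - \mu_1) - \sigma_2(y - \mu_2)\}\, dx\, dy,$$

where the integral is evaluated over the region $\{(x, y) : x > \mu_1,\ y > \mu_2,\ x < y\}$. If $\mu_1 > \mu_2$, the above integral can be rewritten as

$$R = \int_{\mu_1}^{\infty} \sigma_1 e^{-\sigma_1(x - \mu_1)} \int_x^{\infty} \sigma_2 e^{-\sigma_2(y - \mu_2)}\, dy\, dx = \sigma_1 e^{\sigma_1\mu_1 + \sigma_2\mu_2} \int_{\mu_1}^{\infty} \exp\{-x(\sigma_1 + \sigma_2)\}\, dx = \frac{\sigma_1}{\sigma_1 + \sigma_2} \exp\{-\sigma_2(\mu_1 - \mu_2)\}.$$

To obtain the formula for R when $\mu_1 < \mu_2$ simply use

$$P(X < Y) = 1 - P(X > Y). \tag{3.7}$$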
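As a hedged numerical check of the closed form for R in the two-parameter exponential case, the sketch below evaluates the two branches (for $\mu_1 > \mu_2$ and $\mu_1 \le \mu_2$) and compares them with a Monte Carlo frequency. The parameters $\sigma_1$, $\sigma_2$ are treated as rates and $\mu_1$, $\mu_2$ as location shifts, as in (3.5); the function names are illustrative assumptions.

```python
import math
import random


def reliability_two_param_exp(mu1, sigma1, mu2, sigma2):
    """R = P(X < Y) for X ~ Ex(mu1, sigma1), Y ~ Ex(mu2, sigma2); sigmas are rates."""
    if mu1 > mu2:
        return sigma1 / (sigma1 + sigma2) * math.exp(-sigma2 * (mu1 - mu2))
    # The case mu1 <= mu2 follows from P(X < Y) = 1 - P(X > Y).
    return 1.0 - sigma2 / (sigma1 + sigma2) * math.exp(-sigma1 * (mu2 - mu1))


def monte_carlo_reliability(mu1, sigma1, mu2, sigma2, draws=100_000, seed=0):
    """Empirical frequency of {X < Y} from simulated shifted exponentials."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(draws):
        x = mu1 + rng.expovariate(sigma1)
        y = mu2 + rng.expovariate(sigma2)
        hits += x < y
    return hits / draws
```

Replacing the four parameters by their MLEs in `reliability_two_param_exp` gives the plug-in estimator of R.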

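When all (or some) of the parameters $\mu_1$, $\mu_2$, $\sigma_1$ and $\sigma_2$ are unknown, the MLE of R is given by (3.6) where the unknown parameters are replaced by their MLEs: $\hat\mu_1 = X_{(1)}$, $\hat\mu_2 = Y_{(1)}$, $\hat\sigma_1 = (\bar X - X_{(1)})^{-1}$ and $\hat\sigma_2 = (\bar Y - Y_{(1)})^{-1}$. For example, if all the parameters are unknown, the MLE of R is

$$\hat R = \begin{cases} \dfrac{\bar Y - Y_{(1)}}{\bar X - X_{(1)} + \bar Y - Y_{(1)}} \exp\left\{-\dfrac{X_{(1)} - Y_{(1)}}{\bar Y - Y_{(1)}}\right\}, & \text{if } X_{(1)} > Y_{(1)}, \\[3mm] 1 - \dfrac{\bar X - X_{(1)}}{\bar X - X_{(1)} + \bar Y - Y_{(1)}} \exp\left\{-\dfrac{Y_{(1)} - X_{(1)}}{\bar X - X_{(1)}}\right\}, & \text{if } X_{(1)} \le Y_{(1)}. \end{cases} \tag{3.8}$$

3.1.3

The Gamma Distribution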

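Let X and Y be independent random variables with the gamma pdfs $\mathrm{Gamma}(\alpha_1, \sigma_1)$ and $\mathrm{Gamma}(\alpha_2, \sigma_2)$, where $\alpha_1$ and $\alpha_2$ are known positive parameters. The pdf $\mathrm{Gamma}(\alpha, \sigma)$ of the gamma distribution is of the form

$$f(x|\alpha, \sigma) = \frac{x^{\alpha - 1} e^{-x/\sigma}}{\Gamma(\alpha)\sigma^{\alpha}}, \quad x > 0. \tag{3.9}$$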

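Observe that

$$R = P(X < Y) = P\left(\frac{X/\sigma_1}{X/\sigma_1 + Y/\sigma_2} < \frac{\sigma_2}{\sigma_1 + \sigma_2}\right). \tag{3.10}$$

It is well known (see e.g. Johnson et al. (1995)) that the random variable $Z = \dfrac{X/\sigma_1}{X/\sigma_1 + Y/\sigma_2}$ has the beta distribution with the pdf

$$f(z|\alpha_1, \alpha_2) = \frac{z^{\alpha_1 - 1}(1 - z)^{\alpha_2 - 1}}{B(\alpha_1, \alpha_2)}, \quad 0 < z < 1, \tag{3.11}$$

where

$$B(a, b) = \Gamma(a)\Gamma(b)/\Gamma(a + b) \tag{3.12}$$

is the beta function. From (3.10) it follows that

$$R = I_{\sigma_2/(\sigma_1 + \sigma_2)}(\alpha_1, \alpha_2). \tag{3.13}$$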

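Here, $I_x(a, b)$ is the incomplete beta function (see (2.70)). Using the relation between the incomplete beta function and the hypergeometric series (see e.g. Abramowitz and Stegun (1992), page 263), we rewrite (3.13) as

$$R = \frac{1}{\alpha_1 B(\alpha_1, \alpha_2)} \left(\frac{\sigma_2}{\sigma_1 + \sigma_2}\right)^{\alpha_1} {}_2F_1\left(\alpha_1, 1 - \alpha_2, \alpha_1 + 1; \frac{\sigma_2}{\sigma_1 + \sigma_2}\right), \tag{3.14}$$

where the hypergeometric series ${}_2F_1(a, b, c; z)$ is defined in (2.55). If $\alpha_2$ is an integer, the hypergeometric series in (3.14) reduces to the finite sum (see (2.55))

$$R = \frac{1}{B(\alpha_1, \alpha_2)} \sum_{j=0}^{\alpha_2 - 1} \frac{(-1)^j\, \Gamma(\alpha_2)}{\Gamma(\alpha_2 - j)\,(\alpha_1 + j)\, j!} \left(\frac{\sigma_2}{\sigma_1 + \sigma_2}\right)^{\alpha_1 + j}. \tag{3.15}$$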

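Similarly, if $\alpha_1$ is an integer, then from (3.7)

$$R = 1 - \frac{1}{B(\alpha_1, \alpha_2)} \sum_{j=0}^{\alpha_1 - 1} \frac{(-1)^j\, \Gamma(\alpha_1)}{\Gamma(\alpha_1 - j)\,(\alpha_2 + j)\, j!} \left(\frac{\sigma_1}{\sigma_1 + \sigma_2}\right)^{\alpha_2 + j}. \tag{3.16}$$

To obtain the MLE of R we need to replace $\sigma_1$ and $\sigma_2$ by their MLEs $\bar X/\alpha_1$ and $\bar Y/\alpha_2$.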
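The finite-sum representation of R for an integer second shape parameter lends itself to a short numerical sketch. The code below assumes the scale parameterization of the gamma distribution (so that the scale MLE with known shape is the sample mean divided by the shape) and an integer $\alpha_2$; the function names are hypothetical.

```python
import math
import random


def beta_fn(a, b):
    """Beta function B(a, b) = Gamma(a) Gamma(b) / Gamma(a + b)."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)


def reliability_gamma(alpha1, sigma1, alpha2, sigma2):
    """R = I_x(alpha1, alpha2) with x = sigma2 / (sigma1 + sigma2), integer alpha2."""
    x = sigma2 / (sigma1 + sigma2)
    total = 0.0
    for j in range(alpha2):
        # comb(alpha2 - 1, j) equals Gamma(alpha2) / (Gamma(alpha2 - j) * j!).
        total += (-1) ** j * math.comb(alpha2 - 1, j) * x ** (alpha1 + j) / (alpha1 + j)
    return total / beta_fn(alpha1, alpha2)


def mle_reliability_gamma(xs, ys, alpha1, alpha2):
    """Plug-in MLE: scales replaced by sample mean / known shape."""
    sigma1_hat = sum(xs) / len(xs) / alpha1
    sigma2_hat = sum(ys) / len(ys) / alpha2
    return reliability_gamma(alpha1, sigma1_hat, alpha2, sigma2_hat)
```

For large integer $\alpha_2$ the alternating sum may lose accuracy to cancellation, so a library implementation of the incomplete beta function would be preferable in practice.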

Constantine et al. (1986) derived a large sample approximation of the bias and the MSE of the estimator (3.14) which is of the form E(R-R)

»

B\al,a2)Q

\g(ai - 1) - (qa 2n2a2

- (a 2 - 1)'

where $\rho = \sigma_1/\sigma_2$. It is worth noting that for fixed $n_1 + n_2$, the asymptotic MSE in (3.17) is minimized by the choice of $n_1/n_2 = \sqrt{\alpha_2/\alpha_1}$, a condition which does not involve the unknown parameter $\rho$.

3.1.4

The Truncation Parameter Families

The family of distributions f(x\/j,, A,a) = [c(fi, A,a)}~lh(x;
(3.18)

is called the doubly truncation parameter family. If parameter A is known, then (3.18) becomes a lower truncated parameter family and similarly, if/i is known, (3.18) is the upper truncation parameter family. It is easy to see that the two-parameter exponential distribution belongs to the lower truncation parameter family with A = oo, h(x; a) = ae~xa and c(fi,a) = e~^a. The other basic representative of the family (3.18) is the uniform distribution for which h(x;a) = l, c(/i,A,(j) = (A —/x)"1.

(3.19)

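If the sample $X_1, \ldots, X_{n_1}$ is available, then the MLEs of the parameters $\mu$ and $\lambda$ are the extreme order statistics $\hat\mu = X_{(1)}$ and $\hat\lambda = X_{(n_1)}$, respectively. The MLE $\hat\sigma$ of $\sigma$ is the solution of the equation

$$\frac{1}{n_1} \sum_{j=1}^{n_1} \frac{1}{h(X_j; \sigma)} \frac{\partial h(X_j; \sigma)}{\partial \sigma} = \frac{1}{c(X_{(1)}, X_{(n_1)}, \sigma)} \frac{\partial c(X_{(1)}, X_{(n_1)}, \sigma)}{\partial \sigma}. \tag{3.20}$$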

Let X and Y be independent random variables with the pdfs (3.18) with the parameters {^\,\\,

P(X < Y) can be represented as

{

r-min(Ai,A2)

h(x;
[ fX2

h(y;
J/ii pmin(Ai,A2) W

h{y;a2)

[ r\i

h{x;
c(M2,A2,a2) [Jj,

i

]

(3.21)

To obtain the MLE of R one needs to replace $\mu_1, \lambda_1, \sigma_1$ and $\mu_2, \lambda_2, \sigma_2$ by $X_{(1)}, X_{(n_1)}, \hat\sigma_1$ and $Y_{(1)}, Y_{(n_2)}, \hat\sigma_2$, respectively. Here $\hat\sigma_1$ and $\hat\sigma_2$ are the solutions of equation (3.20), with X replaced by Y and $n_1$ replaced by $n_2$ in the case of $\hat\sigma_2$. Formula (3.21) provides an easy way to derive the MLE of R for the uniform distribution. In fact, substituting (3.19) into (3.21), replacing $\mu_1, \lambda_1, \mu_2$ and $\lambda_2$ by $X_{(1)}, X_{(n_1)}, Y_{(1)}$ and $Y_{(n_2)}$, respectively, and integrating, we arrive at (3.22)

3.1.5

The Pareto

Distribution

The pdf of the Pareto Pareto(A, n) distribution is given by I

a ) .

(3.23)

If X ~ Pareto(Ai,cri) and Y ~ Pareto(A2,(T2) are independent variables, the MLE of R can be obtained without any additional calculations. The Pareto distribution is of importance in both engineering and economics and serves as the most popular model of the income distribution. Indeed, it is well known (see also Table 2.1) that a Pareto random variable can be obtained from the two-parameter exponential random variable by the logarithmic transformation v(x) = lnx (see (2.77)). Applying Theorem 2.7 to (3.8), one then arrives at

fV

if TX>TY, (3.24)

The Maximum Likelihood Estimation (Univariate Case)

Here, Tx = n^1 YZi 3.1.6

53

lnX

> a n d Tr = n21 £ £ i

The Weibull Distribution

In the case of the Weibull distribution (popular in engineering applications) all the results available in the literature (see, e.g. Johnson (1988) or McCool (1991)) refer to the case when the Weibull distributions have a common known shape parameter $\alpha$. Although the authors found estimators by direct calculations, the MLE $\hat R$ can easily be derived by applying the transformations $\xi = X^{\alpha}$, $\eta = Y^{\alpha}$, which yield one-parameter exponential random variables $\xi$ and $\eta$. Namely, if X and Y have the Weibull distributions with the known common shape parameter $\alpha$ and unknown scale parameters $\sigma_1$ and $\sigma_2$, respectively, then by Theorem 2.7 and (2.15)

R=

Y

"2

-flkr3

-1

= a2 = a.

(3.25)

Consider now a more general situation when X and Y have known but different shape parameters $\alpha_1$ and $\alpha_2$. Then R

JJ

Changing the variables u = {x/<Ji)ai, v = {y/vz)012, integrating over u and expanding exp < — f ^a I vaz > into the Maclaurin series, we derive R

e~ue-vI I 0 < u < I — )

=

«»> I dudu

^7 +1 .

(3.26)

j=o

Note that the series in (3.26) is convergent provided $\alpha_1 < \alpha_2$; otherwise, we can use relation (3.7). Now, replacing $\sigma_1$ and $\sigma_2$ by their MLEs, we arrive at

(3.27)

3.1.7

Burr Type X and Type XII Distributions

In 1942 I.W. Burr published a system of cdfs that might be useful "for purposes of graduation". He suggested twelve types (the first being the above-mentioned Uniform(0,1) distribution). Special attention has been devoted to type XII, but recently type X has also received attention in applications. Although the type X and type XII Burr distributions cannot be considered among the "common" distributions, they are instructive from the analytical and technical aspects of calculating $P(X < Y)$. A random variable X has the Burr type X distribution if its pdf is

$$f(x|\alpha) = 2\alpha x e^{-x^2}(1 - e^{-x^2})^{\alpha - 1}, \quad x > 0. \tag{3.28}$$

Observe that the monotone transformation $v(x) = -\ln(1 - e^{-x^2})$ converts a Burr type X random variable into the one-parameter exponential random variable studied in Example 2.1. Therefore, Theorem 2.7 yields

$$\hat R = T_X/(T_X + T_Y), \tag{3.29}$$

where

$$T_X = -n_1^{-1} \sum_{i=1}^{n_1} \ln[1 - \exp(-X_i^2)], \qquad T_Y = -n_2^{-1} \sum_{j=1}^{n_2} \ln[1 - \exp(-Y_j^2)]. \tag{3.30}$$

Estimator (3.29) has been derived by direct calculations by Ivshin and Lumelskii (1995) and then repeated independently by Ahmad et al. (1997) and Surles and Padgett (1998).
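The transformation argument for the Burr type X case can be tried out numerically. The sketch below is an illustration only: it samples by inverting the cdf $(1 - e^{-x^2})^{\alpha}$, forms the averages of $-\ln(1 - e^{-X_i^2})$ and $-\ln(1 - e^{-Y_j^2})$, and returns the ratio estimator of R. The helper names are hypothetical, and the comparison value $\alpha_2/(\alpha_1 + \alpha_2)$ is the value of $P(X < Y)$ implied by that cdf.

```python
import math
import random


def sample_burr_x(alpha, n, rng):
    """Inverse-cdf sampling from F(x) = (1 - exp(-x^2))^alpha."""
    out = []
    for _ in range(n):
        # Stay strictly inside (0, 1) so that both logarithms are finite.
        u = rng.uniform(1e-12, 1.0 - 1e-12)
        out.append(math.sqrt(-math.log(1.0 - u ** (1.0 / alpha))))
    return out


def t_statistic(sample):
    """Average of -ln(1 - exp(-x^2)); estimates 1 / alpha."""
    return -sum(math.log(-math.expm1(-x * x)) for x in sample) / len(sample)


def mle_reliability_burr_x(xs, ys):
    """Ratio estimator T_X / (T_X + T_Y) of R = P(X < Y)."""
    tx, ty = t_statistic(xs), t_statistic(ys)
    return tx / (tx + ty)
```

Using `math.expm1` avoids loss of precision in $1 - e^{-x^2}$ for small x.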


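The pdf of the Burr type XII distribution (which is becoming more and more widespread in applications) is of the form

$$f(x|\alpha, \beta) = \alpha\beta x^{\beta - 1}(1 + x^{\beta})^{-(\alpha + 1)}, \quad x > 0. \tag{3.31}$$

Assume that X and Y are independent Burr type XII variables with parameters $(\alpha_1, \beta)$ and $(\alpha_2, \beta)$, respectively, i.e. $\beta$ is the common parameter for X and Y. Note that Burr type XII variables can be transformed to one-parameter exponential variables by $v(x) = \ln(1 + x^{\beta})$ (see Table 2.1), so that from (2.12) and (2.82) $R = \alpha_1/(\alpha_1 + \alpha_2)$. Therefore, the MLE of R is

$$\hat R = \hat\alpha_1/(\hat\alpha_1 + \hat\alpha_2), \tag{3.32}$$

where $\hat\alpha_1$ and $\hat\alpha_2$ are the MLEs of $\alpha_1$ and $\alpha_2$:

$$\hat\alpha_1 = \left[\frac{1}{n_1} \sum_{i=1}^{n_1} \ln\left(1 + X_i^{\hat\beta}\right)\right]^{-1}, \qquad \hat\alpha_2 = \left[\frac{1}{n_2} \sum_{j=1}^{n_2} \ln\left(1 + Y_j^{\hat\beta}\right)\right]^{-1}.$$

Here $\hat\beta$ is the MLE of $\beta$ given by the equation (3.33)

The MLE (3.32) was initially derived by Awad and Gharraf (1986). However, when deriving their analog of (3.33), Awad and Gharraf (1986) did not include the sample $\underline{Y} = (Y_1, \ldots, Y_{n_2})$ in their calculations.

3.1.8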

The Generalized Gamma Distribution

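The probability density function (pdf) of the generalized gamma $\mathrm{GG3}(\sigma, \alpha, \beta)$ distribution is of the form

$$f(x|\sigma, \alpha, \beta) = \frac{\beta\, x^{\alpha - 1}}{\sigma^{\alpha}\, \Gamma(\alpha/\beta)} \exp\left\{-\left(\frac{x}{\sigma}\right)^{\beta}\right\}, \quad x > 0. \tag{3.34}$$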

The generalized gamma distribution was introduced by Stacy (1962) and became popular in applications, due mainly to the highly flexible form of its pdf. Since a specific choice of parameters in (3.34) yields a wide variety of familiar pdfs (see Table 3.1), reliability and survival analysis can really benefit from the use of GG3.

Table 3.1 Common distributions as special cases of $\mathrm{GG3}(\sigma, \alpha, \beta)$

parameters                                               distribution
$\beta = 1$                                              Gamma
$\beta = \alpha$                                         Weibull
$\beta = 1$, $\sigma = 2$                                Chi-squared
$\beta = 2$, $\sigma = \sigma\sqrt{2}$, $\alpha = 2$     Rayleigh
$\beta = 2$, $\sigma = \sigma\sqrt{2}$, $\alpha = 1$     Half-normal

Thus, an attractive feature of considering the generalized gamma distribution here is that it allows one to estimate $R = P(X < Y)$ when X and Y belong to different distribution families. The case when $f_X(x)$ and $f_Y(y)$ belong to different distribution families is not unusual from a practical point of view but, because of its seeming technical complexity, has been considered, to the best of our knowledge, only by Nandi and Aich (1994b), who derived the MLEs when $f_X(x)$ is the exponential pdf and $f_Y(y)$ is the inverse Gaussian, half-normal or half-Cauchy pdf. Note that consideration of the generalized gamma distribution does not make the derivations for the gamma or the Weibull distributions obsolete, since the expressions for the estimators of R in the case of GG3 are very cumbersome. Let X and Y be independent variables with $\mathrm{GG3}(\sigma_1, \alpha_1, \beta_1)$ and $\mathrm{GG3}(\sigma_2, \alpha_2, \beta_2)$ distributions, respectively, where parameters $\beta_1$ and $\beta_2$ are known. By formula (2.5),

W

7

Changing variables u = x/ai,v where roo fvz

Jo Jo

= y/a2 we write R = HG{(3i,f32,ai,a2,

ftu°i-'exp{-/i)

F^g

ffa"°a~1exp{-i;^}>j r(g)

(3.35)

dudv.

Expanding exp[—u^1] into power series in u^1, integrating with respect to

The Maximum Likelihood Estimation (Univariate Case)

57

u and using the change of variables y — v02 we obtain HG{Pi,Pi,ai,a2,z) = JX V°° (-l)*» ai+ * ft rfai+aa i uPi\

(3.36)

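Verifying convergence of the series in (3.36) by means of the limit ratio test (see e.g. 0.222 of Gradshtein and Ryzhik (1980)) we conclude that representation (3.36) is valid only for $\beta_2 > \beta_1$. If $\beta_2 < \beta_1$ we can use the relationship $P(X < Y) = 1 - P(X > Y)$, which gives

$$H_G(\beta_1, \beta_2, \alpha_1, \alpha_2, z) = 1 - H_G(\beta_2, \beta_1, \alpha_2, \alpha_1, z^{-1}). \tag{3.37}$$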

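To calculate R in the case of $\beta_1 = \beta_2 = \beta$, note that $\xi = X^{\beta_1}$ and $\eta = Y^{\beta_2}$ have gamma distributions

$$\xi = X^{\beta_1} \sim \mathrm{Gamma}(\alpha_1/\beta_1, \sigma_1^{\beta_1}), \qquad \eta = Y^{\beta_2} \sim \mathrm{Gamma}(\alpha_2/\beta_2, \sigma_2^{\beta_2}). \tag{3.38}$$

Therefore, applying Theorem 2.7 and formula (3.14) we arrive at

$$R = I_{[(\sigma_1/\sigma_2)^{\beta} + 1]^{-1}}(\alpha_1/\beta, \alpha_2/\beta),$$

where $I_z(a, b)$ is the incomplete beta function (see (2.70)). Therefore,

$$H_G(\beta_1, \beta_2, \alpha_1, \alpha_2, z) = I_{z^{\beta}/(1 + z^{\beta})}(\alpha_1/\beta, \alpha_2/\beta), \qquad \beta_1 = \beta_2 = \beta. \tag{3.39}$$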
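The equal-$\beta$ case can be checked numerically with a short sketch. It assumes the GG3 density $\beta x^{\alpha-1}\exp\{-(x/\sigma)^{\beta}\}/(\sigma^{\alpha}\Gamma(\alpha/\beta))$, under which $(X/\sigma)^{\beta}$ is a standard gamma variable with shape $\alpha/\beta$, and takes $z = \sigma_2/\sigma_1$. The incomplete beta function is approximated by Simpson's rule, which is adequate only when both of its arguments are at least 1; all function names are hypothetical.

```python
import math
import random


def incomplete_beta(x, a, b, steps=2000):
    """Regularized I_x(a, b) by Simpson's rule; assumes a >= 1, b >= 1, 0 < x < 1."""
    h = x / steps

    def f(t):
        return t ** (a - 1) * (1 - t) ** (b - 1)

    total = f(0.0) + f(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    beta = math.gamma(a) * math.gamma(b) / math.gamma(a + b)
    return total * h / 3 / beta


def reliability_gg3_equal_beta(sigma1, alpha1, sigma2, alpha2, beta):
    """R = I_{z^beta / (1 + z^beta)}(alpha1 / beta, alpha2 / beta), z = sigma2 / sigma1."""
    zb = (sigma2 / sigma1) ** beta
    return incomplete_beta(zb / (1 + zb), alpha1 / beta, alpha2 / beta)


def sample_gg3(sigma, alpha, beta, rng):
    """Draw X with (X / sigma)^beta ~ Gamma(alpha / beta, 1)."""
    return sigma * rng.gammavariate(alpha / beta, 1.0) ** (1.0 / beta)
```

With $\beta = 1$ the function reduces to the gamma-distribution formula for R.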

Finally, the MLE of R is of the form

$$\hat R = H_G(\beta_1, \beta_2, \alpha_1, \alpha_2, \hat\sigma_2/\hat\sigma_1), \tag{3.40}$$

where $H_G(\beta_1, \beta_2, \alpha_1, \alpha_2, z)$ is given by (3.36), (3.37) and (3.39), whenever $\beta_1 < \beta_2$, $\beta_1 > \beta_2$ or $\beta_1 = \beta_2$, respectively. Estimator (3.40) was derived in Pensky and Takashima (2002).

3.1.9

Other Distributions

1. The beta distribution. The pdf of the beta distribution Beta (0:1,0:2) is given by (3.11). The MLE of R is of the form (3 41)

+ a2 + ft

where we write F(a)/F(a - j) for n i = i ( a ~ *) Here ai, a-i, j9i and ft are the MLEs of ai, 02, /3i and ft, respectively (see e.g. Johnson et al. (1994)). Formula (3.41) was derived by Lenhof and Pensky (2002). 2. The lognormal distribution. The pdf of the lognormal distribution has the form

*

(3.42)

{(^!}

If X and Y are independent lognormal variables with parameters (/ii,<7i) and (/U2,C2), where all parameters are unknown, then

Here

t=l "2 2

. (3.44)

3. The scaled Burr type X distribution. The pdf of the scaled Burr type distribution is given by f(x\a,a) - 2oxe-<x/CT)2(l - e -(*/»> 2 )«-i. If X and Y have independent scaled Burr Type X distributions with parameters (ai,cri) and (02,02). Let oi and 02 be known and at least one of

Unbiased Estimation (Univariate Case)

59

them, say, a\ is a positive integer. Then (3.45)

R=

Here, &\ and d
(3.46)

If X ~ Power(o-i, Ai) and Y ~ Power(
lny ( n 2 ) -T y i lnl

:}

if

(3.47)

where (3.48)

3.2

Unbiased Estimation (Univariate Case)

In this section we shall obtain UMVUEs for a number of basic univariate distributions. Here we shall assume that X and Y are independent. See Section 3.5 for the case of dependent normal variables. The estimators are based on the samples $\underline{X} = (X_1, X_2, \ldots, X_{n_1})$ and $\underline{Y} = (Y_1, Y_2, \ldots, Y_{n_2})$ (see (2.2)).

3.2.1

The Normal Distribution

Let, as in Section 3.1.1, X and Y be independent normal variables with means $\mu_1$, $\mu_2$ and variances $\sigma_1^2$ and $\sigma_2^2$, respectively. As follows from Theorem 2.3, the UMVUE of $R = P(X < Y)$ will in general be a function of the sufficient statistics $\bar X$, $\bar Y$, $s_X^2$ and $s_Y^2$, which are defined in (2.14) and (3.3), respectively, if all of the parameters of the underlying distributions are unknown. A number of authors constructed the UMVUE of R for this popular case, among them Mazumdar (1970), Downton (1973), Chandra and Owen (1975), Woodward and Kelley (1977), Voinov (1984), Mukherjee and Sharan (1985) and Rukhin (1986). Initially, all effort was devoted to the case when X has known parameters $\mu_1$ and $\sigma_1^2$ (Mazumdar (1970), Downton (1973), Woodward and Kelley (1977)). The UMVUE of the pdf of a normal distribution with unknown mean and variance based on $n_2$ observations $\underline{Y}$ is of the form

$$\hat f_Y(y) = \frac{\sqrt{n_2}\, \Gamma(0.5(n_2 - 1))}{\sqrt{\pi}\, s_Y (n_2 - 1)\, \Gamma(0.5(n_2 - 2))} \left[1 - \frac{n_2 (y - \bar Y)^2}{s_Y^2 (n_2 - 1)^2}\right]^{0.5(n_2 - 4)} \times I\left(\sqrt{n_2}\, |y - \bar Y| < (n_2 - 1) s_Y\right). \tag{3.49}$$

Therefore, it follows from (2.28) that

R= ff

1

exp {-0.5
thus R = / ^ $(crf 1(y — Hi))fY(y)dy with fy(y) given by (3.49). Introducing a new variable z = 0.5 + 0.5-v/n^(y — y)/(sy(ri2 — 1)) and noting that z G (0,1) we obtain k=

o/TJU f z^il-z^Qja-^dz,

(3.50)

where
y/n2 J

and $B(a, b)$ is the beta function. Note that (3.50) differs from the formula of Mazumdar (1970) since the latter contains an error. Calculation of $\hat R$ requires one-dimensional numerical integration. Another representation of $\hat R$ can be derived using the change of variable $u = \sqrt{n_2}(y - \bar Y)/(s_Y(n_2 - 1))$, which yields

$$\hat R = \frac{1}{B(0.5, 0.5(n_2 - 2))} \int_{-1}^{1} (1 - u^2)^{0.5(n_2 - 4)}\, \Phi\left(\frac{\bar Y - \mu_1}{\sigma_1} + \frac{s_Y (n_2 - 1) u}{\sigma_1 \sqrt{n_2}}\right) du. \tag{3.51}$$
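The one-dimensional integral over $u \in (-1, 1)$ is easy to evaluate with the standard library. The following sketch uses composite Simpson's rule and assumes known stress parameters $\mu_1$, $\sigma_1$ and at least four strength observations; the function name and the choice of quadrature are assumptions made for illustration, not part of the original derivation.

```python
import math
from statistics import NormalDist, fmean, stdev


def umvue_reliability_known_stress(mu1, sigma1, ys, steps=2000):
    """Simpson approximation of the UMVUE of P(X < Y), X ~ N(mu1, sigma1^2) known."""
    n2 = len(ys)
    if n2 < 4:
        raise ValueError("this sketch needs n2 >= 4 so the integrand stays bounded")
    ybar, s_y = fmean(ys), stdev(ys)  # stdev uses the (n - 1) divisor
    phi = NormalDist().cdf
    shift = (ybar - mu1) / sigma1
    scale = s_y * (n2 - 1) / (sigma1 * math.sqrt(n2))
    power = 0.5 * (n2 - 4)

    def integrand(u):
        return max(0.0, 1.0 - u * u) ** power * phi(shift + scale * u)

    h = 2.0 / steps
    total = integrand(-1.0) + integrand(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(-1.0 + i * h)
    # B(0.5, 0.5 (n2 - 2)) normalizes the weight (1 - u^2)^{0.5 (n2 - 4)}.
    beta = math.gamma(0.5) * math.gamma(0.5 * (n2 - 2)) / math.gamma(0.5 * (n2 - 1))
    return total * h / 3.0 / beta
```

When the sample mean of the strength equals $\mu_1$, symmetry of the weight function gives the value 0.5, which is a convenient sanity check.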

Downton (1973) suggested a normal approximation for (3.51), while Woodward and Kelley (1977) designed a series approximation based on the generalized Laguerre polynomials. However, with modern computer facilities, using the above approximations may well be more complicated than numerical integration based on (3.50) or (3.51). Consider now the case when all parameters $\mu_1$, $\mu_2$, $\sigma_1$ and $\sigma_2$ are unknown. Since X and Y are independent, the UMVUE $\hat f_X(x)$ of the pdf of X is of the form (3.49) with $y$, $\bar Y$, $s_Y$ and $n_2$ being replaced by $x$, $\bar X$, $s_X$ and $n_1$, respectively. Hence, by (2.28),

* (3.52) where B{a,b) is the beta function (see (3.12)) and the setfi0is the intersection of half plane and the square Clsq = [—1,1] x [—1,1], namely,

(3.53) the UMVUE (3.52) has initially been derived by Downton (1973). Some authors (for example, Ivshin and Lumelskii (1995)) discussed the situation when either both \i\ and n% are known while o\ and a2 are unknown, or both \i\ and fi2 are unknown while o\ and 0-2, are known. We shall leave these two special cases for exercises. 3.2.2

The Two-parameter Exponential Distribution

Let X and Y be independent random variables with the pdfs $\mathrm{Ex}(\mu_1, \sigma_1)$ and $\mathrm{Ex}(\mu_2, \sigma_2)$ (see (3.5)). Then, sufficient statistics for the parameters $(\mu_1, \sigma_1)$ and $(\mu_2, \sigma_2)$ are, respectively, $(X_{(1)}, \bar X)$ and $(Y_{(1)}, \bar Y)$, or, equivalently, $(X_{(1)}, T_X)$ and $(Y_{(1)}, T_Y)$, where

$$T_X = \bar X - X_{(1)}, \qquad T_Y = \bar Y - Y_{(1)}. \tag{3.54}$$

Denote $U_X = n_1 T_X + X_{(1)}$, $U_Y = n_2 T_Y + Y_{(1)}$. The UMVUEs $\hat F_X(x)$ and $\hat f_X(x)$ of the cdf $F_X(x)$ and the pdf $f_X(x)$ are then

Fx(x) = [l - nx=k

)ni2l

7 ( X ( 1 ) < x < UX) + I[x > UX), .(3.55)

ni(niT x )"i- 2

62

Parametric Point Estimation

where 8(x) is the Dirac delta-function such that J tp(x)S(x — xo)dx = ip(xo) for any function tp(x) and XQ- Similarly, Fy(y) and fy(y) can be obtained from (3.55) by replacing x, n\, -X'(i), Tx and Ux with y, n2, Y^), Ty and Uy. Since the variables X and Y are independent, the UMVUE R of R = P(X < Y) is given by (2.28) with f{x,y) = fX{x)fy(y). It has the following rather cumbersome form 0, l-flifni.na.^D.rx.rw.Ty), 1 - H2{ni,n2,X{1),Tx,Y(1),TY), TT

/„

„ "!/ rp v" T1 \ j ^ 1 ) x ( i ) , J y j A/]_), J x j j

H2(n2,ni,y (1) ,ry,X (1)> Tx), 1.

if X(1) > Uy, if X{1)
v (1)

^-* 7~r ^

^ rr ^ ^-^T

if V(i) < X(1) < Ux < Uy, if V(i) ^ ^ . (3.56)

where n, m,a,o, c,a) — n m—2

x

fm-2 (m-2\

> V fc=on-2

^ ; . (a + nb-c)" + f c - 2 (c + m d - a - n b ) m - f c - 2 ,

(3.57)

+ fc

H2(n, m, a, b, c, d) =

I1

n

\

n"2

;— I

no J

1-3

(3.58) This formula was initially derived by Beg (1980b) and then was re-invented by Ivshin and Lumelskii (1995). Ivshin (1996) also derived the UMVUE for Var(E). Validity of (3.56) can be proved by direct but tedious integration. For example, in the case when Y^) < X(i)
r"-^1 I 7o

I" \ [

m - i fux-Uy + n2TYt\ni-2] 1 m I ni \ niTx ) J

3

Unbiased Estimation (Univariate Case)

Using binomial formula for ( R =

63

-) n i ~ 2 in the integrand, we arrive at

n2 — L I Uy — -<*(i) \

(ni —

2

— l j ( 7 i 2 — 2,)

n2

x

si * y

which completes the proof. 3.2.3

ITie Gamma Distribution

Let X and F be independent random variables with the Gamma(ai,<7i) and Gamma(a2,o"2) pdfs with known integer-valued shape parameters a\ and a.% and unknown scale parameters G\ and a2. Note that (see Voinov and Nikulin (1993)) 7

(

fx (x) = - ^

n

y _

'>l

r

(

7,

)

-N

r 1(0 < i < mX)

(3.59)

and /y(y) can be obtained from (3.59) by replacing x, n\, a.\ and X by y, n2, a2 and Y. To derive R, we need to integrate the product /x(^)/y(y) over the region {(x, y) : x < y}. We have two different cases here: n\X > n2Y and n\X < n 2 y. Consider the former since the UMVUE for the latter can be derived by interchanging X and Y. If n\X > n2Y, integration is carried out over the region x < y < n2Y < n\X. Therefore, (3.60)

where, Cn,a = [B(a, n\a — a)]" 1 and, by the binomial formula, - l /(n 2 -l)a 2 -l\ { k /

(3.61) ^ " 2 , a 2 2^A;=0

-l ((n2-l)a2-l$-l)k ( x I, k )~^+k \n2Y

Here, (™) = m\/((m — k)\k$ for any nonnegative integer m and k. The last equality in (3.61) is valid since i/>(0) = 1. Combining (3.60) with (3.61)

64

Parametric Point Estimation

and changing the variable z = x/(n\X),

i i\t / X

-o-\o:2+k ,

I n\X

v

we arrive at

\

/

r1

a2+k \n2Y)

h

cti+a2+k—1(-\

V1

Z

s

i

(3.62)

_ \ ( n i — l)c«i — 1 A~\ z)

az\

Introducing a function /"_

fl-3(m,n)o,6,C,d)

1M. 1

B(a + b + k,(m— l)a)

=

(3.63) of positive integer arguments m, n, a and 6, we obtain from (3.63) that

{

1 - H3{ni,n.2,ai,a2,X,Y),

if

n2Y<mX, (3.64)

H3(n2,ni,a2,ai,Y,X),

ifn2Y>mX.

The second part of (3.64) follows from (3.7) and (3.62). The UMVUE of R in the case of the gamma distributions has been studied by Ismail et al. (1986) and Constantine et al. (1986). The expression (3.64) is due to Constantine et al. (1986).

3.2.4

The Truncation Parameter Families

In Section 3.1.4 we elaborated on the MLE of R for the truncation parameter family of the form (3.18). Although it is impossible to derive a general form for the UMVUE of R when $f_X(x)$ and $f_Y(y)$ include unknown parameters $\sigma_1$ and $\sigma_2$, this task becomes quite manageable if all parameters except for the truncation parameters $\mu$ and $\lambda$ are known. The doubly truncation parameter families. Let us first consider the case when X and Y belong to doubly truncation parameter families with the parameters $(\mu_1, \lambda_1)$ and $(\mu_2, \lambda_2)$, respectively, i.e.

=

<x<\1), [cy(M2,A2)]-1/iy(2/)/(/x2<J/
(3.65) (3.66)

Unbiased Estimation (Univariate Case)

65

The sufficient statistics for (fii, Xi) and (/i2> X2) are (-X"(i), X(ni)) and (Y^, Y(n2)), respectively, and the UMVUEs of the pdfs fx(x) and /y(j/) can be constructed on the basis of Theorem 2.4. Denote

9x{x) = [cx(0,l)]~1hx(x),

Gx(x) = f£ gx(z)dz,

Then the UMVUE of fx(x) derived by by Ivshin and Lumelskii (1995) is of the form

fx(x)

[5(x -

= Cl9x(x)I(X{1) <x<

5{x -

with

i =

{nl-2)n^[Gx{X{ni))-Gx{X(1))]'\

and fy(y() c a n be obtained from fx(x) by replacing x,X^,X^ni),Ci,gx and Gx by y, V(i), V(n2), C2,fifyand Gy. Here 5(x) is Dirac's delta function introduced in Section 3.2.2. Recalling that R can be written as (see (2.6)) R = J_ _oo and noting that

Fy(y) = \C1(Gx(y)-Gx(X(1) we rewrite R as R=

[Ci(Gx(y)

/

-

Examining relations between X^, X(ni), 0, if 1 if 1 if if if 1, if

fY(y)dy.

(3.67) Y(i) and F( n2 ) we represent R as X {nx),

{n2)

(3.68)

66

Parametric Point Estimation

Here, by (3.67) (n2) )

+

d[Gx(X(1))

+

C1C2 f

- Gx(X(1))]/n2

= C2[GY(Y{n2))-GY(X(ni))} + C2[GY{X{ni)) -

( l)

" [Gx(y)-Gx(Xw

By similar calculations, we derive that

\

)

(Gx(y) -

Gx(X{1)))gY(y)dy

+ Cx[Gx{Y{n2)) + GX(YW) - 2Gx(X{1))]/n1 + 2/{nxn2). The uniform distribution. In this case, gx(x) = gY{x) = 1, Gx{x) GY(x) = xI(0 < x < 1) + I(x > 1), so that u ,

„

Y

v

v

v

\

(m - 2)(n 2n2in2-(X2)[(X - X-(1)X )(Y (1)(n2) {ni) (ni)

(n2-2)(y(n2) i ) -y ( 1 ) )

|

H2{ni,n2,X{1),X{ni),Y{l),Y{n2))

This estimator was derived by Ivshin and Lumelskii (1995). Ivshin (1996) also derived the UMVUE for Var(fl). The left- or right-truncated families. When parameters Ai and \2 are unknown (with fj-i and /x2 being known), we say that (3.65) and (3.66) are the right-truncated families, and when \i\ and /x2 are unknown (with Ai and \2 being known) we term (3.65) and (3.66) the left-truncated families. Below we consider the left-truncated families noting that the righttruncated families can be transformed into the left-truncated by a simple change of variables (see Section 2.5). In this case, sufficient statistics are

Unbiased Estimation (Univariate Case)

67

and

!x{x) = gx(x)I(x > X(1)) + fc^fi)) with n\

Cl

=

<"'

ni(l - G x (

(3.69) and fy(y) can be obtained from (3.69) by substituting y, Y^), C2, gy and Gy for x, X(1), Cu gx and GX- The UMVUE of i? is of the form

R ={

(3.70)

1 - H(Yw,X(1),n2,ni),

HYW < Xw,

where , I
=

(n2 - 1)(1 -

Gy(X(1)))

— ——jrz—rr— nin2(L — Lry(Y(i)))

(6.11)

Note that Rohatgi (1989) proposed a somewhat more general method for estimation of an arbitrary function R(0i,02) when fx(%) = ai(0i)bi(x) and fy(y) = ^2(^2)^2(2/)- His method can be used for derivation of R when 6\ and #2 are truncation parameters. It is however, harder to obtain general expressions (3.70) and (3.71) using his technique.

The left-truncated exponential distribution. Let X and Y be independent random variables with the pdfs Ex(/ii,<7i) and Ex(/i2,02) (see (3.5)) where parameters a\ and (T2 are known. Therefore, fx(x) and fy(y) are left-truncated families with gx{%) = cr^1 exp(-x/cri) and gY(y) = 0) and Gy{y) = (1 - exp(-y/a2))I(y > 0), expression (3.71) becomes

(3.72) One can also derive an explicit expression for the UMVUE of the variance Var(.R) using formulas (2.29) and (2.30) in Section 2.2. Note that the

68

Parametric Point Estimation

UMVUE of fx(xi)fx(x2)

is of the form

^

xi+X2

f(xi,x2)

= (niCT?)-1^! - 2)e"

"

if min(xi,X2) > -X'(i), and the UMVUE of /y(j/i)/y(j/2) is given by a similar expression. An explicit expression for Var(Ti) can thus be obtained by direct integration and is of the form (see Ivshin and Lumelskii (1995))

Var(.R) = if

)

(3.73)

The uniform distribution with unknown left end. In this situation gx(x) = 9Y{X) = -f(0 < x < 1) and Gx(^) = Gy(x) = a;7(0 < x < 1) + 7(x > 1). Then, the expression (3.71) becomes «'n'in2) = 2 n

1

n

2

(l-y

( 1 )

)

(3.74)

The uniform distribution with unknown right end. Note that R in the case of the right-truncated uniform distribution can be obtained from (3.70) and (3.74) by applying transformations £ = 1 — X, r] = 1 — Y. Sufficient statistics in this case are -X"(ni) and Y(n2) and utilizing techniques of Section 2.5 we arrive at n

)

»

l f X(m)

R=

^

Y(n2),

(3.75) (ni-l)(n 2 +l)y (n2 )

The UMVUE for the variance is of the form i

2

)

Var(jR) =

(3.76) ,

i

r

Formula (3.76) was initially obtained by Ivshin and Lumelskii (1995).

Unbiased Estimation (Univariate Case)

3.2.5

The Generalized Gamma

69

Distribution

Let X and Y be independent variables with the GG3(<7i,ai,/3i) and GG3(
(3.77) H—i

and recall that the UMVUE of the gamma pdf with the known shape parameter o"i and unknown scale parameter o\ is given by formula (3.59). Using ideas of Section 2.6 and the relation (3.38) between the generalized and the regular gamma distribution we conclude that fx (1) can be obtained from the expression (3.59) replacing a\ by ai/Pi, x by x^1, X by Sf1 and multiplying the result by the Jacobian of the transformation /?ix/3l~1. Hence, the UMVUE of fx(x) is

fx(x) =

7(0 < x < S{), B

PI '

Pi

J

1

and fy(y) has a similar expression. Thus, R is given by (2.28) with f(x,y) = fx(x)fy(y)- Integrating, we arrive at if 5i > S2, (3.78)

R= ), if Si < S2, where ,oc2, Pi, 02,ni,n2, Si, S2) =

(3.79)

Using the ratio test one can verify that the series in (3.79) is convergent whenever S2 < Si so the expression (3.79) is computationally tractable. Estimator (3.78) was derived in Pensky and Takashima (2002).

Parametric Point Estimation

70

3.2.6

Other

Distributions

1. The lognormal distribution. Let X ~ Lognormal(/ii,o'i) and Y ~ Lognormal(/i2)Or2)i where the pdf of the lognormal distribution is defined in (3.42). If all parameters are unknown, then R is given by integral (3.52) where fio is a subset of the square fio C £lsq = {(u, v) € [—1,1] x [—1,1]} such that Yi

Here TXi,TYi,TX2

-

0

and TY2 are defined in (3.43) and (3.44).

2. The power distribution. Let X ~ Power(<7i, Ai) and Y ~ Power((T2, X where the pdf of the power distribution is defined in (3.46). If X and Y are independent and all parameters are unknown, then 0, if lny (n2)
if

UY UY.

(3.80)

Here functions Hi and H2 are given by (3.57) and (3.58), respectively, and Tyi are defined in (3.48) and

TY2

i!)-Txi, = lny(n2)-Tyi,

Ux . = niTxi - (ni - l)lnX( n i ), UY = n2TY1 - (n2 - 1) InY(na).

The estimator was originally derived by Beg(1980a). 3. The Pareto distribution. The pdf of the Pareto distribution is given by (3.23). Let X ~ Pareto(Ai,cri) and Y ~ Pareto(A2,cr2) be inde-

Bayes and Empirical Bayes Estimation (Univariate Case)

71

pendent variables. Denote

TY2

= Txl-lnX(1),

Ux =

= Tyi-lnY(1),

UY =

n2Tyi-(n2-l)lny(1).

where TX\ and TY1 are denned in (3.48). Then the UMVUE of R is of the form hr2pexpl with X^, Y^, Tx and Ty being replaced by In X(i), In Y^, and Tyii respectively. Here, as before, functions H\(n, m, a, b, c, d) and (n, m, a, b, c, d) are defined in (3.57) and (3.58). The estimator was originally obtained by direct calculations in Beg and Singh (1979). 4. Burr type X distribution. Let independent random variables X ~ BurrX(ai) and Y ~ BurrX(a2), where the pdf of BurrX(a) is defined in (3.28). Then the UMVUE of R is f Qi(n 2 ,ni,n 2 7y,niTx), R=l

I

ifniTx
where functions Q\ and Q2 are defined in (2.35) and (2.36), respectively, and Tx and Ty are given by (3.30). The estimator (3.81) has been derived by direct calculations by Ivshin and Lumelskii (1995) and then repeated independently by Surles and Padgett (1998).

5. Burr type XII distribution. Let X ~ BurrXII(ai,/3) and Y ~ BurrXII(a2,/3), where /? is a known common parameter (see (3.31)). Then R is of the form (2.91) where

and functions Qi and Q2 are denned in (2.35) and (2.36), respectively. The estimator R has been derived by direct calculations by Awad and Gharraf (1986).

3.3

Bayes and Empirical Bayes Estimation (Univariate Case)

In this section we shall review some results on Bayesian estimation of $R = P(X < Y)$. We begin with the case of the normal distribution investigated by Enis and Geisser (1971), where Jeffreys's prior is employed. We shall then discuss further developments in Bayes estimation for the one-parameter exponential distribution, such as Bayes estimation with reference and matching priors carried out by Thompson and Basu (1993) and Lee (1998). Next, we shall review results of Sun et al. (1998) on the Weibull distribution and conclude the section by discussing the Burr type X distribution (Ahmad et al. (1997), Surles and Padgett (1998) and Kim et al. (2000)). We emphasize that this section by no means exhausts the available results on the Bayes estimation of R. Unfortunately, due to space limitations, we shall omit the Bayesian estimation for the geometric distribution (Ahmad et al. (1995), Maiti (1995)), the Burr type XII distribution (Gharraf (1986)) and the Pareto distribution (Beg and Singh (1979)). These papers cover mainly Bayesian analysis with conjugate priors. Some of the cases listed above are posed as problems at the end of Chapter 3.

3.3.1

The Normal Distribution

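Following Enis and Geisser (1971), we shall study a somewhat more general problem. Namely, we shall assume that $X_i$, $i = 1, \ldots, k$, are independent normal variables with the unknown means $\mu_i$ and a common but unknown variance $\sigma^2$, and consider a known vector $\mathbf{B} = (B_1, \ldots, B_k)'$. The goal is to estimate

$$R = P\left(\sum_{i=1}^{k} B_i X_i > 0\right) = P(\mathbf{B}'\mathbf{X} > 0), \tag{3.82}$$

where $\mathbf{X} = (X_1, \ldots, X_k)'$. Relation (3.82) in the case $k = 2$ ($B_1 = -1$ and $B_2 = 1$) reduces to the univariate case $P(X < Y)$. Let $X_{ij}$, $j = 1, \ldots, n_i$, be a sample of independent observations from $N(\mu_i, \sigma^2)$ and

$$\bar X_i = n_i^{-1} \sum_{j=1}^{n_i} X_{ij}, \qquad \bar{\mathbf{X}}' = (\bar X_1, \ldots, \bar X_k), \qquad N = \sum_{i=1}^{k} n_i. \tag{3.83}$$

Denote the $(k \times k)$ diagonal matrix with $A_{ii} = n_i$ by $\mathbf{A}$ and the pooled sample variance by

$$s^2 = (N - k)^{-1} \sum_{i=1}^{k} \sum_{j=1}^{n_i} (X_{ij} - \bar X_i)^2. \tag{3.84}$$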

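Then the joint likelihood for $\boldsymbol{\mu}' = (\mu_1, \ldots, \mu_k)$ and $\sigma^2$ is proportional to

$$L(\boldsymbol{\mu}, \sigma^2) \propto \sigma^{-N} \exp\left\{-\frac{1}{2\sigma^2}\left[(N - k)s^2 + (\bar{\mathbf{X}} - \boldsymbol{\mu})'\mathbf{A}(\bar{\mathbf{X}} - \boldsymbol{\mu})\right]\right\}.$$

For the joint prior on $\boldsymbol{\mu}$ and $\sigma$ we choose Jeffreys's prior (see Yang and Berger (1997)) $g(\boldsymbol{\mu}, \sigma) \propto \sigma^{-1}$. Combining $L(\boldsymbol{\mu}, \sigma^2)$ with $g(\boldsymbol{\mu}, \sigma)$, we obtain the joint posterior density for $(\boldsymbol{\mu}, \sigma)$

$$p(\boldsymbol{\mu}, \sigma | \bar{\mathbf{X}}, s^2) \propto \sigma^{-(N + 1)} \exp\left\{-\frac{1}{2\sigma^2}\left[(N - k)s^2 + (\bar{\mathbf{X}} - \boldsymbol{\mu})'\mathbf{A}(\bar{\mathbf{X}} - \boldsymbol{\mu})\right]\right\}. \tag{3.85}$$

Note that the scalar $\mathbf{B}'\mathbf{X}$ has the univariate normal distribution with the mean $\mathbf{B}'\boldsymbol{\mu}$ and the variance $\sigma^2 \mathbf{B}'\mathbf{B}$. Therefore, (3.82) is given by

$$R = P(\mathbf{B}'\mathbf{X} > 0) = \Phi\left(\frac{\mathbf{B}'\boldsymbol{\mu}}{\sigma\sqrt{\mathbf{B}'\mathbf{B}}}\right), \tag{3.86}$$

where, as above, $\Phi(\cdot)$ is the cdf of the standard normal distribution. To derive the Bayes estimator of R it is advantageous to use the Bayes predictive approach described in subsection 2.3.3. Before undertaking this task, we recall a few facts about the multivariate Student's T-distribution. If the random k-dimensional vector $\mathbf{X}$ has the pdf $h(\mathbf{x})$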

~

r(v/2)

(W)V> l 1 +

a

where $\mathbf{S}$ is a positive definite $(k \times k)$ matrix, then $\mathbf{X}$ is said to have a k-variate T-distribution $T_k(\boldsymbol{\mu}, \mathbf{S}, \nu, a)$. The special case $T_1(0, 1, \nu, \nu)$ is the usual univariate t-distribution $t_\nu$ with $\nu$ degrees of freedom. Further, it is easy to verify (see Fang et al. (1990)) that if $\mathbf{X}$ has the k-variate T-distribution $T_k(\boldsymbol{\mu}, \mathbf{S}, \nu, \nu)$, then, for any non-null vector of real constants $\mathbf{a} = (a_1, \ldots, a_k)'$, the scalar $\mathbf{a}'(\mathbf{X} - \boldsymbol{\mu})/\sqrt{\mathbf{a}'\mathbf{S}^{-1}\mathbf{a}}$ has the univariate T-distribution $t_\nu$ with $\nu$ degrees of freedom. The predictive density of a future vector observation $\mathbf{X}$ is

74

Parametric Point Estimation

where /(x|/i,<72) is the pdf of the fe-variate normal distribution with the mean fj, and the covariance matrix cr2l, I being the (k x k) identity matrix. It follows from (3.85) that p(x|X, s2) oc [(AT - k)s2 + (x - X)'A(A + I ) " 1 ^ - X)}~N/2.

(3.87)

Therefore, for the future vector observation $X$ given $\bar X$ and $s^2$, the quantity $B'(X - \bar X)/\left(s\sqrt{B'(A + I)A^{-1}B}\right)$ has the univariate T-distribution $t_{N-k}$, and hence (see Section 2.3.3)

$\hat R = P\left( t_{N-k} < \left(s\sqrt{B'(A + I)A^{-1}B}\right)^{-1} B'\bar X \,\Big|\, \bar X, s^2 \right)$. (3.88)
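The estimator (3.88) is a Student t probability evaluated at a standardized contrast of the sample means. The following Python sketch is one way to compute it without tables; the function names `t_cdf` and `bayes_predictive_R` are illustrative and not from the text, and the t cdf is obtained by Simpson's rule rather than by any method the authors prescribe. It assumes that $A = \mathrm{diag}(n_1, \ldots, n_k)$, so that $B'(A + I)A^{-1}B = \sum_i B_i^2 (n_i + 1)/n_i$.

```python
import math

def t_cdf(x, dof, steps=2000):
    """CDF of Student's t with `dof` degrees of freedom (Simpson's rule on [0, x])."""
    if x == 0:
        return 0.5
    log_c = (math.lgamma((dof + 1) / 2) - math.lgamma(dof / 2)
             - 0.5 * math.log(dof * math.pi))

    def pdf(t):
        return math.exp(log_c - (dof + 1) / 2 * math.log1p(t * t / dof))

    h = x / steps  # signed step; `steps` must be even for Simpson's rule
    total = pdf(0.0) + pdf(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 + total * h / 3

def bayes_predictive_R(samples, B):
    """Evaluate (3.88) for k normal samples sharing one unknown variance."""
    k = len(samples)
    n = [len(s) for s in samples]
    N = sum(n)
    means = [sum(s) / len(s) for s in samples]
    # Pooled sample variance, as in (3.84)
    s2 = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s) / (N - k)
    # B'(A + I)A^{-1}B for the diagonal matrix A = diag(n_i)
    quad = sum(b * b * (ni + 1) / ni for b, ni in zip(B, n))
    stat = sum(b * m for b, m in zip(B, means)) / math.sqrt(s2 * quad)
    return t_cdf(stat, N - k)
```

For the two-sample stress-strength problem one would call `bayes_predictive_R([x, y], [-1, 1])`, which estimates $P(X < Y)$.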

Note that the right-hand side of formula (3.88) can be evaluated using tables of the T-distribution (see e.g. Johnson et al. (1998), Chapter 28). For $k = 2$, $X_1 = X$, $X_2 = Y$, $B_1 = -1$ and $B_2 = 1$, (3.88) reduces to

$\hat R = P\left( t_{n_1+n_2-2} < \frac{\bar Y - \bar X}{s\sqrt{(n_1 + 1)/n_1 + (n_2 + 1)/n_2}} \right)$ with $s^2 = (n_1 + n_2 - 2)^{-1}\left[(n_1 - 1)s_X^2 + (n_2 - 1)s_Y^2\right]$. (3.89)

The sample variances $s_X^2$ and $s_Y^2$ are as defined in (3.3).

3.3.2 The One-Parameter Exponential Distribution

We are back to the exponential distribution which guided us steadily in Chapter 2. If $X$ and $Y$ have one-parameter exponential distributions with parameters $\alpha_1$ and $\alpha_2$, respectively, then $R = P(X < Y) = \alpha_1/(\alpha_1 + \alpha_2)$ (see (2.12)). In Example 2.3 we have provided Bayesian estimation with the noninformative Jeffreys's prior $\pi_J(\alpha_1, \alpha_2) \propto (\alpha_1\alpha_2)^{-1}$. The reference prior in this case is given by the theorem due to Thompson and Basu (1993) and Lee (1998).

Theorem 3.1 The reference prior for $R$ with the nuisance parameter $\theta_R = \alpha_1 + \alpha_2$ is unique and coincides with the Jeffreys's prior $\pi_J(\alpha_1, \alpha_2)$.

The theorem implies that the reference prior leads to a proper posterior given by (2.53) and the Bayes estimator of $R$ is of the form (2.54) with $\gamma^* = n_1\bar X$, $\lambda^* = n_2\bar Y$, $B = (n_2\bar Y - n_1\bar X)/(n_2\bar Y)$, $\mu^* = n_1$ and $\nu^* = n_2$.
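The closed-form estimator (2.54) is not reproduced in this section, but the posterior mean of $R$ under the Jeffreys prior can be approximated by simulation. The sketch below assumes the rate parametrization $\alpha e^{-\alpha x}$ (consistent with $R = \alpha_1/(\alpha_1 + \alpha_2)$), under which the prior $(\alpha_1\alpha_2)^{-1}$ gives independent gamma posteriors with shapes $n_1$, $n_2$ and rates $n_1\bar X$, $n_2\bar Y$. The function name `posterior_mean_R` and the Monte Carlo approach are illustrative, not the authors' method.

```python
import random

def posterior_mean_R(x, y, draws=20000, seed=0):
    """Monte Carlo posterior mean of R = a1 / (a1 + a2) under the prior 1/(a1*a2)."""
    rng = random.Random(seed)
    n1, n2 = len(x), len(y)
    sum_x, sum_y = sum(x), sum(y)  # equal to n1 * mean(x) and n2 * mean(y)
    total = 0.0
    for _ in range(draws):
        # random.gammavariate takes (shape, scale); scale = 1 / rate
        a1 = rng.gammavariate(n1, 1.0 / sum_x)
        a2 = rng.gammavariate(n2, 1.0 / sum_y)
        total += a1 / (a1 + a2)
    return total / draws
```

With identical samples for $X$ and $Y$ the result should be close to 0.5 by symmetry; the Monte Carlo error shrinks as `draws` grows.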


Definition 3.1 A prior $\pi$ is said to be an $i$-th order matching prior for a parameter $\theta$ if

where $\theta_n(\alpha)$ is the $\alpha$-th quantile of the posterior density for $\theta$ based on $n$ observations $X$ and $P_\theta$ is the joint pdf of $X$ given $\theta$.

Lee (1998) derived the second order matching prior for the exponential model which can be expressed as

$\pi_M(\alpha_1, \alpha_2) \propto \frac{1}{\alpha_1\alpha_2}\, g(\alpha_1\alpha_2)$, (3.90)

where $g$ is any continuously differentiable positive function. One of the possible priors satisfying (3.90) is $\pi_M(\alpha_1, \alpha_2) \propto 1$, which corresponds to the choice of $\gamma = \lambda = 0$ and $\mu = \nu = 1$ in (2.48). The latter results in a proper posterior of the form (2.53) and the Bayes estimator of $R$ (2.54) with $\gamma^* = n_1\bar X$, $\lambda^* = n_2\bar Y$, $B = (n_2\bar Y - n_1\bar X)/(n_2\bar Y)$, $\mu^* = n_1 + 1$ and $\nu^* = n_2 + 1$. Another possibility is to choose $g \equiv 1$, which would lead to the matching prior that coincides with Jeffreys's and reference priors. For this case the results of the Bayesian analysis were discussed in Section 2.3.3.

3.3.3 The Weibull Distribution

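Let $X$ and $Y$ have independent Weibull pdfs with a common shape parameter $\alpha$ and scale parameters $\sigma_1$ and $\sigma_2$, respectively, where all three parameters are unknown. Here we shall follow Sun et al. (1998) who have derived noninformative priors in this situation. The log-likelihood of $(\sigma_1, \sigma_2, \alpha)$ is

$\ln L(\sigma_1, \sigma_2, \alpha) = (n_1 + n_2)\ln\alpha - \alpha(n_1\ln\sigma_1 + n_2\ln\sigma_2) + (\alpha - 1)\left[\sum_{i=1}^{n_1}\ln X_i + \sum_{j=1}^{n_2}\ln Y_j\right] - \sum_{i=1}^{n_1}(X_i/\sigma_1)^\alpha - \sum_{j=1}^{n_2}(Y_j/\sigma_2)^\alpha$.

It is easy to calculate the Fisher information matrix of $(\sigma_1, \sigma_2, \alpha)$:

$I(\sigma_1, \sigma_2, \alpha) = \begin{pmatrix} n_1\alpha^2/\sigma_1^2 & 0 & -n_1\gamma/\sigma_1 \\ 0 & n_2\alpha^2/\sigma_2^2 & -n_2\gamma/\sigma_2 \\ -n_1\gamma/\sigma_1 & -n_2\gamma/\sigma_2 & (n_1 + n_2)(\gamma_0 + \gamma^2)/\alpha^2 \end{pmatrix}$,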

where $\gamma = 1 + \int_0^\infty e^{-z}\ln z\,dz \approx 0.422785$, $1 - \gamma$ being the Euler constant, and

$\gamma_0 = \int_0^\infty e^{-z}(\ln z)^2\,dz - \left(\int_0^\infty e^{-z}\ln z\,dz\right)^2 \approx 1.64493$. (3.91)
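The two constants in (3.91) can be checked numerically. The sketch below is an illustration only: it substitutes $z = e^u$ to remove the logarithmic singularity at zero and applies the trapezoid rule; the helper name `log_moment` and the truncation limits are arbitrary choices, not part of the text.

```python
import math

def log_moment(power, lo=-40.0, hi=6.0, step=0.001):
    """Trapezoid estimate of the integral of exp(-z) * (ln z)**power over (0, inf).

    After the substitution z = exp(u) the integrand becomes
    u**power * exp(u - exp(u)), which is smooth and decays quickly in both tails.
    """
    n = int(round((hi - lo) / step))
    total = 0.0
    for i in range(n + 1):
        u = lo + i * step
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * (u ** power) * math.exp(u - math.exp(u))
    return total * step

m1 = log_moment(1)        # integral of exp(-z) ln z, minus the Euler constant
m2 = log_moment(2)        # integral of exp(-z) (ln z)^2
gamma = 1.0 + m1          # first constant in (3.91)
gamma0 = m2 - m1 ** 2     # second constant in (3.91), equal to pi^2 / 6
```

The second constant is the variance of $\ln Z$ for a standard exponential $Z$, which is why it equals $\pi^2/6$.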

The Jeffreys's prior for $(\sigma_1, \sigma_2, \alpha)$ is $\pi_J(\sigma_1, \sigma_2, \alpha) \propto \alpha(\sigma_1\sigma_2)^{-1}$. It is symmetric in $\sigma_1$ and $\sigma_2$ and does not depend on the sample sizes $n_1$ and $n_2$. This seems to be unreasonable since if $n_1$ is larger than $n_2$, we have more information about $\sigma_1$ than about $\sigma_2$. To derive the reference prior we ought to identify the nuisance parameters and "order" them in terms of inferential importance. The main parameter of interest is $R = P(X < Y) = \sigma_2^\alpha/(\sigma_1^\alpha + \sigma_2^\alpha)$, and we shall introduce a new parameter $\omega$, a function of $(\sigma_1, \sigma_2, \alpha)$, such that the transformation from $(\sigma_1, \sigma_2, \alpha)$ to $(R, \omega, \alpha)$ is one-to-one. The nuisance parameters $\omega$ and $\alpha$ can either be viewed as being of equal inferential importance, or $\omega$ can be viewed as more important than $\alpha$, or vice versa, which leads to the reference priors $\pi_{R1}$, $\pi_{R2}$ and $\pi_{R3}$, respectively. These are of the form

$\pi_{Rj}(R, \omega, \alpha) = [(1 - R)R\omega\alpha]^{-1}\psi_j(R)$, $j = 1, 2, 3$, (3.92)

where

(3.93)

and $\gamma_0$ is defined in (3.91). A general class of matching priors involves complicated expressions. However, an interesting class of the second order matching priors can be written as

(3.94)

where

(3.95)

The posterior pdfs of $R$ under these priors are given by the following theorem.

Theorem 3.2 Let $G(R, \alpha; X, Y)$ =

^

'-

>- — — 2 -

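Under the prior $\pi_{Rj}$, $j = 1, 2, 3$, defined in (3.92), the posterior density of $R = \sigma_2^\alpha/(\sigma_1^\alpha + \sigma_2^\alpha)$ is

$p_{Rj}(R | X, Y) \propto \psi_j(R)\, R^{n_1 - 1}(1 - R)^{n_2 - 1} \int_0^\infty \alpha^{n_1 + n_2 - 1}\, G(R, \alpha; X, Y)\, d\alpha$. (3.96)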

Under the prior TTM defined in (3.95) the posterior density of R is

Pm(R\K,Y) oc [^(RT'R^H^-RT2'1

I"'<*ni+n*-AG(R,a;X,TQda. Jo

(3.97)

Integration in (3.96) and (3.97) should be performed numerically. To evaluate normalizing constants two-dimensional integration is required. The Bayes estimators are then the expectations over posterior pdfs (3.96) and (3.97).

3.3.4 The Burr-Type X Distribution

Let $X$ and $Y$ be independent random variables with pdfs BurrX($\alpha_1$) and BurrX($\alpha_2$), respectively. Here, in accordance with Table 2.1, the variables $\xi = -\ln(1 - e^{-X^2})$ and $\eta = -\ln(1 - e^{-Y^2})$ are independent and exponentially distributed with the parameters $\alpha_1$ and $\alpha_2$, respectively. Therefore, all results previously derived for the one-parameter exponential distribution can be applied here with minor changes using the techniques of Section 2.5. For example, similarly to the case of the one-parameter exponential distribution, the Jeffreys's prior coincides with the reference prior and is proportional to $(\alpha_1\alpha_2)^{-1}$. One can also choose the second-order matching prior coinciding with Jeffreys's and reference priors, i.e.

$\pi(\alpha_1, \alpha_2) \propto \frac{1}{\alpha_1\alpha_2}$. (3.98)


The priors (3.98) have been derived by Kim et al. (2000) by direct calculations. From Theorem 2.9 it follows that the posterior pdf and the Bayes estimator of $R$ in this situation are of the forms (2.53) and (2.54) with $\gamma^* = n_1 T_X$, $\lambda^* = n_2 T_Y$, $B = (n_2 T_Y - n_1 T_X)/(n_2 T_Y)$, $\mu^* = n_1$ and $\nu^* = n_2$. Here

$T_X = -n_1^{-1}\sum_{i=1}^{n_1}\ln\left(1 - e^{-X_i^2}\right)$, $T_Y = -n_2^{-1}\sum_{j=1}^{n_2}\ln\left(1 - e^{-Y_j^2}\right)$.

We have omitted here the results on Bayes estimation with gamma priors (Ahmad et al. (1997)) that can be obtained automatically from Section 2.3.3 with the aid of Theorem 2.9.

3.4 Elliptical Distributions

Elliptical (or elliptically contoured) distributions have been popular since the early eighties as a natural extension of the multivariate normal distribution with attractive properties, both theoretical and applications oriented (see e.g. Fang et al. (1990)). In this and the next section we shall investigate a generalization of the stress-strength model where instead of a two-component vector $(X, Y)$ we have two independent $k_1$- and $k_2$-component random vectors $X = (X^{(1)}, \ldots, X^{(k_1)})$ and $Y = (Y^{(1)}, \ldots, Y^{(k_2)})$. We shall be interested in estimating the probability $P(A'X + B'Y + C > 0)$ where $A$ and $B$ are known $k_1$- and $k_2$-dimensional vectors and $C$ is a known scalar. This setting and its minor variations were considered in the case when $X$ and $Y$ are normally distributed random vectors by Pensky (1982), Gupta and Gupta (1990) and Ivshin and Lumelskii (1994). It is easy to see that in the one-dimensional case when $k_1 = k_2 = 1$, $C = 0$, $A = (-1)$ and $B = 1$, the problem reduces to estimation of the probability $P(X < Y)$ for independent random variables $X$ and $Y$. The case when $k_1 = 2$, $A = (-1, 1)$ and $B = 0$ allows us to estimate $P(X < Y)$ for dependent variables $X$ and $Y$.

Estimation of $P(A'X + B'Y + C > 0)$ is very important in many practical situations. Consider a technical system which is functioning under a variety of random stresses $X^{(i)}$, $i = 1, \ldots, k_1$, such that the total stress on the system is given by a known linear combination $A'X$ of the stresses. This situation occurs when, for example, $X^{(i)}$ is the density of the vehicles of type $i$ on the bridge (cars, buses, trucks and so on) and $A^{(i)}$ is the damage (stress) caused by a vehicle of type $i$. If the strength of the system is provided by several components (for example, special steel or concrete, extra strong supports for a bridge), then the strength of the system can be viewed as a linear combination of some random components $Y^{(i)}$, $i = 1, \ldots, k_2$, that is $B'Y$. In this model, stress $X$ and strength $Y$ are independent. Reliability of the system is the probability $P(A'X < B'Y)$ that strength exceeds stress. If we are interested in estimating the probability that strength exceeds stress by a fixed value $C$, the problem reduces to estimating $P(A'X + B'Y + C > 0)$.

We shall now derive estimators for $P(A'X + B'Y + C > 0)$ in the situation when vectors $X$ and $Y$ have elliptical distributions with the pdfs of the forms

$p_j(z | \theta_j, \Sigma_j) = |\Sigma_j|^{-1/2} f_j\left((z - \theta_j)'\Sigma_j^{-1}(z - \theta_j)\right)$, $j = 1, 2$, (3.99)

where $\Sigma_j$ is a positive definite symmetric $k_j \times k_j$ matrix and $f_j(z) \geq 0$ is such that

$\int_0^\infty z^{k_j/2 - 1} f_j(z)\, dz = \pi^{-k_j/2}\,\Gamma(k_j/2)$. (3.100)

Here and below, we denote vectors and matrices by bold characters, $W'$ is the transpose of matrix $W$, $|W|$ is the determinant of $W$, and $\mathrm{tr}(W)$ is the trace of $W$. As in the previous sections, we shall assume that i.i.d. samples

$X = (X_1, X_2, \ldots, X_{n_1})$, $Y = (Y_1, \ldots, Y_{n_2})$ (3.101)

having elliptical pdfs (3.99) are available and construct the MLE and the Bayes estimators of $P(A'X + B'Y + C > 0)$. Note that this Section as well as 3.6 follow Pensky (2002).

3.4.1 Maximum Likelihood Estimation

Denote

$R = P(A'X + B'Y + C > 0)$, $R_0 = P(A'X + C > 0)$ (3.102)

and observe that

$R = \int_{R^K} I(A'x + B'y + C > 0)\, p_1(x | \theta_1, \Sigma_1)\, p_2(y | \theta_2, \Sigma_2)\, dx\, dy$, (3.103)


where $R^K$ is the $K$-dimensional Euclidean space with $K = k_1 + k_2$, and $p_j(z | \theta_j, \Sigma_j)$, $j = 1, 2$, are defined in (3.99). Hence, the MLE $\hat R$ of $R$ is of the form (3.103) with $\theta_j$ and $\Sigma_j$ replaced by their MLEs $\hat\theta_j$ and $\hat\Sigma_j$, $j = 1, 2$, respectively. The MLEs of the parameters $\theta_j$ and $\Sigma_j$ of the elliptical distributions have been derived by Anderson et al. (1986). These are:

$\hat\theta_1 = \bar X$, $\hat\theta_2 = \bar Y$, $\hat\Sigma_1 = k_1(\lambda_1 n_1)^{-1} S_X$, $\hat\Sigma_2 = k_2(\lambda_2 n_2)^{-1} S_Y$,

where

$\bar X = n_1^{-1}\sum_{i=1}^{n_1} X_i$, $\bar Y = n_2^{-1}\sum_{i=1}^{n_2} Y_i$, (3.104)

$S_X = \sum_{i=1}^{n_1}(X_i - \bar X)(X_i - \bar X)'$, $S_Y = \sum_{i=1}^{n_2}(Y_i - \bar Y)(Y_i - \bar Y)'$, (3.105)

$\lambda_j = \arg\max_z \{ z^{k_j/2} f_j(z) \}$, $j = 1, 2$. (3.106)

Introducing the new parameters

$a = \sqrt{A'\Sigma_1 A}$, $b = \sqrt{B'\Sigma_2 B}$, $c = A'\theta_1 + B'\theta_2 + C$, (3.107)

we have the MLEs

$\hat a = \sqrt{k_1 A'S_X A/(n_1\lambda_1)}$, $\hat b = \sqrt{k_2 B'S_Y B/(n_2\lambda_2)}$, $\hat c = A'\bar X + B'\bar Y + C$. (3.108)

Since $\Sigma_j$, $j = 1, 2$, are positive definite symmetric matrices, there exist matrices $V_j$ such that $V_j'V_j = \Sigma_j$, $j = 1, 2$. Then, vectors

$\xi = (V_1')^{-1}(X - \theta_1)$, $\eta = (V_2')^{-1}(Y - \theta_2)$ (3.109)

possess spherical distributions with the pdfs $f_1(z'z)$ and $f_2(z'z)$, respectively, where $f_j(t)$, $j = 1, 2$, satisfy the condition (3.100) (see e.g. Fang et al. (1990)). The following theorem and its corollary allow us to obtain the MLE of $R$ and $R_0$.

Theorem 3.3 The probability $R$ and its MLE $\hat R$ are, respectively, of the form

$R = J(a, b, c)$, $\hat R = J(\hat a, \hat b, \hat c)$. (3.110)


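Here, $a$, $b$, $c$ and their MLEs $\hat a$, $\hat b$, $\hat c$ are defined in (3.107) and (3.108), and the function $J(a, b, c)$ is given by

$J(a, b, c) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} I(ax + by + c > 0)\, f_{11}(x)\, f_{21}(y)\, dx\, dy$, (3.111)

where

$f_{j1}(x) = \frac{\pi^{(k_j - 1)/2}}{\Gamma((k_j - 1)/2)} \int_{x^2}^{\infty} (y - x^2)^{(k_j - 3)/2} f_j(y)\, dy$, $j = 1, 2$, (3.112)

are the pdfs of the first components $\xi^{(1)}$ and $\eta^{(1)}$ of the vectors $\xi$ and $\eta$ defined in (3.109). If the characteristic functions $\varphi_1(\omega)$ and $\varphi_2(\omega)$ of $\xi^{(1)}$ and $\eta^{(1)}$ are available,

$\varphi_j(\omega) = \int_{-\infty}^{\infty} e^{i\omega z} f_{j1}(z)\, dz$, $j = 1, 2$, (3.113)

then $J(a, b, c)$ can be expressed as

$J(a, b, c) = \frac{1}{2} + \frac{1}{\pi}\,\mathrm{Im}\int_0^\infty \frac{e^{ic\omega}}{\omega}\,\varphi_1(a\omega)\,\varphi_2(b\omega)\, d\omega$. (3.114)

Here, $\mathrm{Im}(z)$ is the imaginary part of $z$.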
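Formula (3.114) reduces a two-dimensional integral to a one-dimensional one. The Python sketch below illustrates this for the special case of real-valued characteristic functions (which holds for the symmetric components $\xi^{(1)}$ and $\eta^{(1)}$), where the imaginary part of the integrand is $\sin(c\omega)\varphi_1(a\omega)\varphi_2(b\omega)/\omega$. The function name `J_inversion`, the midpoint rule and the truncation point `upper` are illustrative choices, not part of the text; the truncation is adequate only when the characteristic functions decay quickly.

```python
import math
from statistics import NormalDist

def J_inversion(a, b, c, cf1, cf2, upper=40.0, steps=40000):
    """Approximate (3.114) for real-valued characteristic functions cf1, cf2.

    Uses the midpoint rule on (0, upper), which avoids evaluating at w = 0.
    """
    h = upper / steps
    total = 0.0
    for i in range(steps):
        w = (i + 0.5) * h
        total += math.sin(c * w) / w * cf1(a * w) * cf2(b * w)
    return 0.5 + total * h / math.pi

def normal_cf(w):
    """Characteristic function of the standard normal distribution."""
    return math.exp(-0.5 * w * w)

# For standard normal components, a*xi + b*eta is N(0, a^2 + b^2),
# so the result can be compared with a direct normal cdf evaluation.
approx = J_inversion(1.0, 2.0, 1.5, normal_cf, normal_cf)
exact = NormalDist().cdf(1.5 / math.hypot(1.0, 2.0))
```

In this normal example the two values should agree to several decimal places, which serves as a sanity check of the inversion formula.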

Proof. Denote by $\|z\| = \sqrt{z'z}$ the norm of a vector $z$. Observe that in terms of the vectors $\xi$ and $\eta$, the inequality $A'X + B'Y + C > 0$ becomes $A'V_1'\xi + B'V_2'\eta + c > 0$, where $c$ is given by (3.107). It follows from Theorem 2.4 in Fang et al. (1990) that

$A'V_1'\xi = \|V_1 A\|\,\xi^{(1)}$, $B'V_2'\eta = \|V_2 B\|\,\eta^{(1)}$, (3.115)

where $\xi^{(1)}$ and $\eta^{(1)}$ are, as above, the first components of vectors $\xi$ and $\eta$, respectively. The pdfs of these components are obtained by integrating $f_1$ and $f_2$ over the remaining $k_1 - 1$ and $k_2 - 1$ coordinates. With the aid of equation 4.642 in Gradshtein and Ryzhik (1980), these integrals can be rewritten as (3.112). To complete the first half of the proof one needs only to observe that $\|V_1 A\| = a$ and $\|V_2 B\| = b$.


To prove the relation (3.114) we shall use formula (2.8) which implies that

$I(ax + by + c > 0) = \frac{1}{2} + \frac{1}{\pi}\,\mathrm{Im}\int_0^\infty \frac{e^{i(ax + by + c)\omega}}{\omega}\, d\omega$.

Thus, it follows from (3.111) that

$J(a, b, c) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{11}(x)\, f_{21}(y)\left[\frac{1}{2} + \frac{1}{\pi}\,\mathrm{Im}\int_0^\infty \frac{e^{i(ax + by + c)\omega}}{\omega}\, d\omega\right] dx\, dy$. (3.116)

Changing the order of integration in (3.116) we arrive at (3.114).

The following corollary can be obtained immediately from Theorem 3.3.

Corollary 3.1 The probability $R_0$ defined in (3.102) and its MLE can be written as

$R_0 = J(a, c)$, $\hat R_0 = J(\hat a, \hat c)$, (3.117)

where the function $J(a, c)$ has the form

$J(a, c) = \int_{-\infty}^{c/a} f_{11}(x)\, dx = \frac{1}{2} + \int_0^{c/a} f_{11}(x)\, dx = \frac{1}{2} + \frac{1}{\pi}\,\mathrm{Im}\int_0^\infty \frac{e^{ic\omega}}{\omega}\,\varphi_1(a\omega)\, d\omega$. (3.118)

Here $a$, $c$ and their MLEs, $f_{11}(x)$ and $\varphi_1(\omega)$ are given by (3.107), (3.108), (3.112) and (3.113), respectively.

3.4.2 Bayes Estimation

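To construct a Bayes estimator of $R$ one needs to choose a prior pdf $g(\theta_1, \Sigma_1, \theta_2, \Sigma_2)$. Then, the Bayes estimator of $R$ becomes

$\hat R = \frac{\int J(a, b, c)\, L_1(\theta_1, \Sigma_1 | X)\, L_2(\theta_2, \Sigma_2 | Y)\, g(\theta_1, \Sigma_1, \theta_2, \Sigma_2) \prod_{j=1}^{2} d\theta_j\, d\Sigma_j}{\int L_1(\theta_1, \Sigma_1 | X)\, L_2(\theta_2, \Sigma_2 | Y)\, g(\theta_1, \Sigma_1, \theta_2, \Sigma_2) \prod_{j=1}^{2} d\theta_j\, d\Sigma_j}$. (3.119)

Here, $a$, $b$ and $c$ are as given by (3.107), $X$ and $Y$ are samples on $X$ and $Y$, and $L_1(\theta_1, \Sigma_1 | X)$ and $L_2(\theta_2, \Sigma_2 | Y)$ are the likelihood functions

$L_1(\theta_1, \Sigma_1 | X) = \prod_{i=1}^{n_1} p_1(X_i | \theta_1, \Sigma_1)$, $L_2(\theta_2, \Sigma_2 | Y) = \prod_{i=1}^{n_2} p_2(Y_i | \theta_2, \Sigma_2)$, (3.120)

where $p_j(x | \theta_j, \Sigma_j)$, $j = 1, 2$, are defined in (3.99). The integrals in (3.119) are calculated over the space $R^{k_1} \times \mathcal{A}_{k_1} \times R^{k_2} \times \mathcal{A}_{k_2}$, where $\mathcal{A}_k$ is the space of symmetric positive definite $(k \times k)$ matrices. The space $\mathcal{A}_k$ is a subspace of the $k(k + 1)/2$-dimensional Euclidean space and can, in fact, be transformed to the latter by a change of variables. Evaluation of (3.119) requires $[k_1 + k_2 + 0.5k_1(k_1 + 1) + 0.5k_2(k_2 + 1)]$-dimensional integration, which is computationally intractable since even in the case of $k_1 = k_2 = 2$ one needs to perform a 10-dimensional integration. For this reason, in the case when $(\theta_1, \Sigma_1)$ and $(\theta_2, \Sigma_2)$ are a priori independent, i.e.

$g(\theta_1, \Sigma_1, \theta_2, \Sigma_2) = g_1(\theta_1, \Sigma_1)\, g_2(\theta_2, \Sigma_2)$, (3.121)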

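we suggest an algorithm that allows us to circumvent the painful integration over $\theta_j$ and $\Sigma_j$, $j = 1, 2$.

Theorem 3.4 Assume that (3.121) is valid and denote the pdfs of the samples $X$ and $Y$ (see (3.101)) by $q_1(X)$ and $q_2(Y)$, respectively:

$q_1(X) = \int\int L_1(\theta_1, \Sigma_1 | X)\, g_1(\theta_1, \Sigma_1)\, d\theta_1\, d\Sigma_1$, $q_2(Y) = \int\int L_2(\theta_2, \Sigma_2 | Y)\, g_2(\theta_2, \Sigma_2)\, d\theta_2\, d\Sigma_2$. (3.122)

Then the Bayes estimator of $R$ can be written as

$\hat R = \int\int_{\Omega(A, B, C)} \frac{q_1(X_1, \ldots, X_{n_1}, x)}{q_1(X_1, \ldots, X_{n_1})}\,\frac{q_2(Y_1, \ldots, Y_{n_2}, y)}{q_2(Y_1, \ldots, Y_{n_2})}\, dx\, dy$, (3.123)

where the set $\Omega(A, B, C)$ is as follows:

$\Omega(A, B, C) = \{x \in R^{k_1}, y \in R^{k_2} : A'x + B'y + C > 0\}$. (3.124)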

Note that (3.123) requires $(k_1 + k_2)$-dimensional integration only. For example, if $k_1 = k_2 = 2$, (3.123) results in 4-dimensional integration (compare with 10-dimensional integration in (3.119)). Moreover, since the value of the marginal density is essential for Bayes analysis, the expressions for $q_1(X)$ and $q_2(Y)$ may be available from Bayesian analysis conducted previously by another researcher. In the case of $B = 0$, Theorem 3.4 reduces to

Corollary 3.2 If $B = 0$, then the Bayes estimator of $R_0$ can be evaluated as

$P(A, C) = \int_{R^{k_1}} I(A'x + C > 0)\,\frac{q_1(X_1, \ldots, X_{n_1}, x)}{q_1(X_1, \ldots, X_{n_1})}\, dx$. (3.125)

Here, $R_0$ and the marginal density $q_1(X)$ are defined in (3.102) and (3.122), respectively.

3.4.3 The Pearson Type II Distribution

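Let $X$ and $Y$ have pdfs (3.99) with

$f_j(z) = [\Gamma(\alpha_j)]^{-1}\,\pi^{-k_j/2}\,\Gamma\left(\frac{k_j}{2} + \alpha_j\right)(1 - z)^{\alpha_j - 1}\, I(0 \leq z \leq 1)$, (3.126)

where $\alpha_j$ are known parameters, $j = 1, 2$. By formula (3.112), the functions $f_{j1}$ in (3.111) have the forms

$f_{j1}(x) = [B(0.5, \alpha_j + 0.5k_j - 0.5)]^{-1}\,(1 - x^2)^{\alpha_j + 0.5k_j - 1.5}\, I(|x| \leq 1)$, $j = 1, 2$, (3.127)

where $B(\alpha, \beta)$ is the beta function (see (3.12)). Observe that $\lambda_j = k_j/(k_j + 2\alpha_j - 2)$, $j = 1, 2$. The MLE of $R$ is provided by (3.110) with

$\hat a = \sqrt{(k_1 + 2\alpha_1 - 2)A'S_X A/n_1}$, $\hat b = \sqrt{(k_2 + 2\alpha_2 - 2)B'S_Y B/n_2}$, $\hat c = A'\bar X + B'\bar Y + C$, (3.128)

and

$J(a, b, c) = Q(a, b, c, \alpha_1, \alpha_2) = \frac{\int_{-1}^{1}\int_{-1}^{1} I(ax + by + c > 0)\,(1 - x^2)^{\alpha_1 + 0.5k_1 - 1.5}\,(1 - y^2)^{\alpha_2 + 0.5k_2 - 1.5}\, dx\, dy}{B(0.5, \alpha_1 + 0.5k_1 - 0.5)\, B(0.5, \alpha_2 + 0.5k_2 - 0.5)}$. (3.129)

It is easy to see that the last formula requires two-dimensional integration. If it is desirable to reduce estimation of $R$ to one-dimensional integration, one can use a combination of formulae (3.113) and (3.114). By (3.113), (3.127) and formula 3.771.8 of Gradshtein and Ryzhik (1980), the characteristic functions $\varphi_j(\omega)$ can be written as

$\varphi_j(\omega) = \Gamma(\alpha_j + 0.5k_j)\left(\frac{2}{\omega}\right)^{\alpha_j + 0.5k_j - 1} J_{\alpha_j + 0.5k_j - 1}(\omega)$, $j = 1, 2$. (3.130)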

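Here $J_\nu(\cdot)$ is the Bessel function of the first kind which can be presented via infinite series (see 8.402 of Gradshtein and Ryzhik (1980))

$J_\nu(z) = \sum_{k=0}^{\infty} \frac{(-1)^k (z/2)^{\nu + 2k}}{k!\,\Gamma(\nu + k + 1)}$. (3.131)

Substitution of (3.130) into (3.114) leads to

$J(a, b, c) = \frac{1}{2} + \frac{2^{\alpha_1 + \alpha_2 + 0.5(k_1 + k_2) - 2}\,\Gamma(\alpha_1 + 0.5k_1)\,\Gamma(\alpha_2 + 0.5k_2)}{\pi\, a^{\alpha_1 + 0.5k_1 - 1}\, b^{\alpha_2 + 0.5k_2 - 1}} \int_0^\infty \frac{\sin(c\omega)\, J_{\alpha_1 + 0.5k_1 - 1}(a\omega)\, J_{\alpha_2 + 0.5k_2 - 1}(b\omega)}{\omega^{\alpha_1 + \alpha_2 + 0.5(k_1 + k_2) - 1}}\, d\omega$, (3.132)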

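so one-dimensional integration is required to calculate $J(a, b, c)$ using (3.132). Consider now the case of $B = 0$. In this situation, by (3.118),

$J(a, c) = 0.5 + \left[B\left(\frac{1}{2}, \frac{2\alpha_1 + k_1 - 1}{2}\right)\right]^{-1} \int_0^{c/a} (1 - t^2)^{(2\alpha_1 + k_1 - 3)/2}\, I(|t| < 1)\, dt$. (3.133)

It is easy to see that for $|c| \geq a$, the integral in (3.133) is equal to $\frac{1}{2}\,\mathrm{sign}(c)\, B\left(\frac{1}{2}, \frac{2\alpha_1 + k_1 - 1}{2}\right)$, so that $J(a, c) = \frac{1}{2} + \frac{1}{2}\,\mathrm{sign}(c)$. If $|c| < a$, then by 3.197.3 of Gradshtein and Ryzhik (1980),

$\int_0^{c/a} (1 - t^2)^{(2\alpha_1 + k_1 - 3)/2}\, dt = \frac{c}{a}\; {}_2F_1\left(\frac{1}{2}, \frac{3 - 2\alpha_1 - k_1}{2}; \frac{3}{2}; \frac{c^2}{a^2}\right)$. (3.134)

Here, as above, ${}_2F_1(\alpha, \beta; \gamma; z)$ denotes the hypergeometric series defined in (2.55). Finally, combining (3.133) and (3.134) we have $J(a, c) = Q_0(a, c, \alpha_1)$ where

$Q_0(a, c, \alpha_1) = \frac{1}{2} + \frac{c}{a\, B\left(\frac{1}{2}, \frac{2\alpha_1 + k_1 - 1}{2}\right)}\; {}_2F_1\left(\frac{1}{2}, \frac{3 - 2\alpha_1 - k_1}{2}; \frac{3}{2}; \frac{c^2}{a^2}\right)$ if $|c| < a$, and $Q_0(a, c, \alpha_1) = \frac{1}{2} + \frac{1}{2}\,\mathrm{sign}(c)$ if $|c| \geq a$. (3.135)

Note that the hypergeometric series in (3.135) terminates if $2\alpha_1 + k_1 \geq 3$ is an odd integer.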

3.4.4 The Multivariate T and Cauchy Distributions

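Let $X$ and $Y$ have multivariate T-distributions with the pdfs (3.99) where

$f_j(z) = \pi^{-k_j/2}\,\sigma_j^{-k_j/2}\,[\Gamma(\alpha_j/2)]^{-1}\,\Gamma((\alpha_j + k_j)/2)\,(1 + z/\sigma_j)^{-(\alpha_j + k_j)/2}$. (3.136)

Here $\alpha_j, \sigma_j > 0$, $j = 1, 2$, are known parameters. The one-dimensional marginals are given by the univariate t-densities: from (3.112) we derive that $f_{j1}(x) = \left(\sqrt{\sigma_j}\, B(0.5\alpha_j, 0.5)\right)^{-1}(1 + x^2/\sigma_j)^{-(\alpha_j + 1)/2}$, hence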

J(a,b,c) =0.5 + (3.137)

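Then $\hat R$ has the form (3.110) with $\lambda_j = k_j\sigma_j/\alpha_j$, $j = 1, 2$, and

$\hat a = \sqrt{\alpha_1 A'S_X A/(n_1\sigma_1)}$, $\hat b = \sqrt{\alpha_2 B'S_Y B/(n_2\sigma_2)}$, $\hat c = A'\bar X + B'\bar Y + C$.

If $\alpha_1$ and $\alpha_2$ are both odd integers, we can derive a finite sum representation for $J(a, b, c)$ using formula (3.114). Note that, by relation 3.737.1 of Gradshtein and Ryzhik (1980) and (3.113), for $j = 1, 2$,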

= 2 [./^^(0.5^-,0.5)1 " 1 /

dz

Otj — 1 2

.e-"V*J

(3.138)

Substituting (3.138) for
J{a,o,c) — 2 + w ^ i w ^ (ai-l

txn — 1

(3.139)


with r(ii+ta) »l+»2

X

sin [(ii + i2) arctan ( ^—a^^—fc)l , if «i + i 2 > 0, arctan (-==-£—==,-) ,

(3.140)

if u + i2 = 0.

Note that in the case of $B = 0$, (3.137) reduces to

(3.141)

Note that the hypergeometric series in (3.141) terminates whenever $\alpha_1 > 2$ is an even integer. If $\alpha_1$ is an odd integer, equations (3.139) and (3.140) become

(3.142)

where

(^SS)^ ^n (i arctan ( ^ )

) , if * > 0, (3-143)

arCtan

(afe)

.

if t = 0.

Equations (3.141)-(3.143) give a finite sum representation for $P(A, C)$ if $\alpha_1$ is an integer. The value of $J(a, c)$ can also be obtained by using the probability tables for the Student's T-distribution. In fact, from (3.118),

$J(a, c) = \left[\sqrt{\sigma_1}\, B(0.5\alpha_1, 0.5)\right]^{-1} \int_{-\infty}^{c/a} (1 + x^2/\sigma_1)^{-(\alpha_1 + 1)/2}\, dx$.

Changing variables $x = t\sqrt{\sigma_1/\alpha_1}$, we obtain

$J(a, c) = P\left(t_{\alpha_1} < (a\sqrt{\sigma_1})^{-1} c\sqrt{\alpha_1}\right)$, (3.144)

where a random variable $t_\alpha$ has the Student's T-distribution with $\alpha$ degrees of freedom. The multivariate Cauchy distribution is a particular case of a multivariate T-distribution with $\alpha_j = 1$, $j = 1, 2$. It follows then from


(3.139) and (3.140) that

$J(a, b, c) = \frac{1}{2} + \frac{1}{\pi}\arctan\left(\frac{c}{a\sqrt{\sigma_1} + b\sqrt{\sigma_2}}\right)$. (3.145)
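The Cauchy case can be checked by simulation. The sketch below assumes that (3.145) reads $J(a, b, c) = 1/2 + \pi^{-1}\arctan\left(c/(a\sqrt{\sigma_1} + b\sqrt{\sigma_2})\right)$, which follows from the fact that a linear combination of independent Cauchy variables is again Cauchy with the scales added; with $\sigma_1 = \sigma_2 = 1$ the components are standard Cauchy. The function names and the Monte Carlo check are illustrative only.

```python
import math
import random

def J_cauchy(a, b, c, sigma1=1.0, sigma2=1.0):
    """Closed form for P(a*xi + b*eta + c > 0) with Cauchy components (a, b >= 0)."""
    scale = a * math.sqrt(sigma1) + b * math.sqrt(sigma2)
    return 0.5 + math.atan(c / scale) / math.pi

def J_cauchy_mc(a, b, c, draws=200000, seed=1):
    """Monte Carlo estimate of the same probability for standard Cauchy components."""
    rng = random.Random(seed)

    def std_cauchy():
        # Inverse-cdf sampling of a standard Cauchy variable
        return math.tan(math.pi * (rng.random() - 0.5))

    hits = sum(a * std_cauchy() + b * std_cauchy() + c > 0 for _ in range(draws))
    return hits / draws
```

Setting `b = 0` in `J_cauchy` gives the corresponding value of $J(a, c)$.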

To obtain an expression for $J(a, c)$ set $b = 0$ in (3.145).

3.5 The Multivariate Normal Distribution

Estimation of the probability $P(A'X + B'Y + C > 0)$ and its particular cases under the assumption of normality was the subject of investigations by Enis and Geisser (1971), Pensky (1982), Gupta and Gupta (1990) and Ivshin and Lumelskii (1993, 1994) among others. Ivshin and Lumelskii (1994) study unbiased estimation of $P(A'X + B'Y + C > 0)$ and its generalization when $N > 2$ independent normal vectors are given. Pensky (1982) constructed an UMVUE of $P(A'X + C > 0)$. Gupta and Gupta (1990) studied a particular case of estimating $P(A'X + C > 0)$ with $C = 0$: their model is written as $P(A'X > B'Y)$; however, the normally distributed vectors $X$ and $Y$ are assumed to be dependent, so that these two vectors can be combined into a single vector $X$. Enis and Geisser (1971) derived the Bayes estimator of $P(A'X + C > 0)$ based on a noninformative Jeffreys's prior for $C = 0$. Several authors studied the MLE and the UMVUE of $P(X < Y)$ when $(X, Y)$ is a two-dimensional random vector, among them Mukherjee and Sharan (1985) and Roy (1993).

Below, we shall apply the general theory developed in the previous section to obtain the MLE, the UMVUE and the Bayes estimator of $R = P(A'X + B'Y + C > 0)$ and $R_0 = P(A'X + C > 0)$. The pdf of the $k$-variate normal distribution with the mean vector $\theta$ and the covariance matrix $\Sigma$ is of the form

$p(z | \theta, \Sigma) = (2\pi)^{-k/2}\,|\Sigma|^{-1/2}\exp\left\{-\frac{1}{2}(z - \theta)'\Sigma^{-1}(z - \theta)\right\}$. (3.146)

Let $X$ and $Y$ be independent $k_1$- and $k_2$-variate normal vectors with the means $\theta_1$ and $\theta_2$ and the covariance matrices $\Sigma_1$ and $\Sigma_2$, respectively.

3.5.1 Maximum Likelihood Estimation

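The MLE of $R$ is of the form (3.110) where $J(a, b, c)$ is calculated in (3.114). Noting that in this case $f_j(z'z)$ are the $k_j$-variate standard normal densities, we obtain that $\xi^{(1)}$ and $\eta^{(1)}$ have univariate standard normal distributions, so that $\varphi_j(\omega) = e^{-\omega^2/2}$, $j = 1, 2$. Then, from (3.114) utilizing formula 3.952.6 of Gradshtein and Ryzhik (1980), we have

$J(a, b, c) = \frac{1}{2} + \frac{1}{\sqrt{\pi}}\sum_{k=0}^{\infty}\frac{(-1)^k}{k!\,(2k + 1)}\left(\frac{c}{\sqrt{2(a^2 + b^2)}}\right)^{2k + 1}$. (3.147)

Now compare (3.147) with the power series representation of $\mathrm{erf}(z)$:

$\mathrm{erf}(z) = \frac{2}{\sqrt{\pi}}\int_0^z e^{-t^2}\, dt = \frac{2}{\sqrt{\pi}}\sum_{n=0}^{\infty}\frac{(-1)^n z^{2n + 1}}{n!\,(2n + 1)}$ (3.148)

(see e.g. 7.1.1 and 7.1.5 of Abramowitz and Stegun (1992)). Formulae (3.147) and (3.148) imply that

$J(a, b, c) = \frac{1}{2} + \frac{1}{2}\,\mathrm{erf}\left(\frac{c}{\sqrt{2(a^2 + b^2)}}\right) = \Phi\left(\frac{c}{\sqrt{a^2 + b^2}}\right)$, (3.149)

where $\Phi(\cdot)$ is the probability integral. To derive $J(a, c)$ simply set $b = 0$ in (3.149). Thus, using the MLEs of parameters of the multivariate normal distribution we arrive at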

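$\hat R = \Phi\left((A'\bar X + B'\bar Y + C)\big/\sqrt{n_1^{-1}A'S_X A + n_2^{-1}B'S_Y B}\right)$, if $\theta_1, \theta_2, \Sigma_1, \Sigma_2$ unknown,

$\hat R = \Phi\left((A'\theta_1 + B'\theta_2 + C)\big/\sqrt{n_1^{-1}A'S_X^* A + n_2^{-1}B'S_Y^* B}\right)$, if $\theta_1, \theta_2$ known, $\Sigma_1, \Sigma_2$ unknown, (3.150)

$\hat R = \Phi\left((A'\bar X + B'\bar Y + C)\big/\sqrt{A'\Sigma_1 A + B'\Sigma_2 B}\right)$, if $\theta_1, \theta_2$ unknown, $\Sigma_1, \Sigma_2$ known,

where $S_X$ and $S_Y$ are defined in (3.105),

$S_X^* = \sum_{i=1}^{n_1}(X_i - \theta_1)(X_i - \theta_1)'$, $S_Y^* = \sum_{i=1}^{n_2}(Y_i - \theta_2)(Y_i - \theta_2)'$. (3.151)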
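The first case of (3.150), with all parameters unknown, can be computed directly from the two samples. The Python sketch below is illustrative: the names `scatter_quadratic` and `mle_R` are not from the text, observations are represented as tuples of floats, and the quadratic form $A'S_X A$ is computed through the identity $A'S_X A = \sum_i (A'X_i - A'\bar X)^2$ rather than by forming the matrix $S_X$ of (3.105).

```python
import math
from statistics import NormalDist

def scatter_quadratic(sample, weights):
    """Return (w' * mean, w' S w), where S is the scatter matrix of (3.105)."""
    projections = [sum(w * v for w, v in zip(weights, obs)) for obs in sample]
    mean = sum(projections) / len(projections)
    return mean, sum((p - mean) ** 2 for p in projections)

def mle_R(X, Y, A, B, C):
    """MLE of P(A'X + B'Y + C > 0) for normal samples, all parameters unknown."""
    mean_x, quad_x = scatter_quadratic(X, A)
    mean_y, quad_y = scatter_quadratic(Y, B)
    c_hat = mean_x + mean_y + C
    # Raises ZeroDivisionError if both projected samples are constant
    denom = math.sqrt(quad_x / len(X) + quad_y / len(Y))
    return NormalDist().cdf(c_hat / denom)
```

For example, `mle_R(X, Y, (-1.0,), (1.0,), 0.0)` with one-dimensional observations estimates $P(X < Y)$ for independent normal $X$ and $Y$.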

The MLEs (3.150) of the general form have been derived by Anderson et al. (1986). Formula (3.149) implies that if $X = (X, Y)$ is a normal vector, then $R = P(X < Y) = P(A'X > 0)$ with $A = (-1, 1)$. Therefore, in (3.107), $a = \sqrt{A'\Sigma A} = \sqrt{\sigma_1^2 - 2\sigma_{12} + \sigma_2^2}$, where $\sigma_1^2$, $\sigma_2^2$ and $\sigma_{12}$ are the variances of $X$ and $Y$ and the covariance between $X$ and $Y$, respectively. Hence, it follows from Theorem 3.3 and equation (3.149) that

$R = P(X < Y) = \Phi\left[\frac{\mu_2 - \mu_1}{\sqrt{\sigma_1^2 - 2\sigma_{12} + \sigma_2^2}}\right]$, (3.152)

where $\mu_1$ and $\mu_2$ are the means of $X$ and $Y$.

The case of a two-dimensional random vector has been studied by Mukherjee and Sharan (1985) who obtained

(3.153)

where $s_X^2$ and $s_Y^2$ are the sample variances defined in (3.3) and $s_{XY}$ is the corresponding sample covariance based on $\sum_{j=1}^{n}(X_j - \bar X)(Y_j - \bar Y)$. Mukherjee and Sharan (1985) also calculate the asymptotic variance of the estimator (3.153) as $n \to \infty$.

3.5.2 Unbiased Estimation

To find an UMVUE of $R$ observe that the UMVUE of the $k_j$-variate normal density with unknown parameters $\theta_j$ and $\Sigma_j$ based on $n_j$ observations is of the form (see e.g. Voinov and Nikulin (1996))

(3.154) w i t h Fj(z)

1

j

j

3

= [1 - ( n j - l ) - n j z ] " ' ~ i ~ 1 ( 0

< z < n j

1

^

- 1)), j = 1,2.

It is easy to see that (3.154) are the pdfs of fcj-variate Pearson-type II distributions (see (3.126)) with the parameters 0j = Oj, Tlj = nJ1(rij — 1)8j, a.j = 0.5(nj - kj - 1), j = 1,2. The UMVUE R of R is of the form R = J(a, b, c) where

a- xh1~1A'SxA, V n

b = J^2-Z-lB'SyB, c = A'X+B'Y+C (3.155) V n


and the function J(a, b, c) is given by either (3.129) or (3.132). For example, application of (3.129) yields R = Q(a, b, c, (m -ki-

l)/2, (n2 - k2 - l)/2)),

(3.156)

where function Q(a, b, c, a\, a 2 ) is denned in (3.129). Reasoning in a similar manner, we can derive estimators for the other two cases: Q(^A'SXA, ^ B ' S y B , A'0i + B'0 2 + C, {n\ — k{)/2, (n2 - fe)/2), if 0i,0 2 known, S i , S 2 unknown, R=

*

$ T(A'X + B'Y + C)/Jni\ni

- l)A'SiA + n^l{n2 - 1)B'E 2 B^ ,

if 0i,0 2 unknown, S i , S 2 known. (3.157) Here Sx, Sy, S^ and S y are denned in (3.105) and (3.151), respectively. Estimators (3.156) and (3.157) have originally been derived by Ivshin and Lumelskii (1994). If B = 0, then the UMVUE _R0 of RQ is given by 1)A'S X A, A'X + C, (m - fci - l)/2), unknown, , A'd Q 0 ( x / p , xx ++ C,, (ni (i - h)/), if 61,02 known, Si, S 2 unknown, 2

R=

(3.158)

((A'X + if 0i,0 2 unknown, Si, S 2 known, where S x , Sy, S^, S y and function Qo are denned in (3.105), (3.151) and (3.135), respectively. The estimator (3.158) has been initially derived in Pensky (1982). Estimator of Gupta and Gupta (1990) coincides with (3.158) the only difference being that it is expressed via incomplete beta function instead of hypergeometric series. Note that the hypergeometric series in (3.158) terminates if ni > 4 is even. Ivshin and Lumelskii (1994) considered a generalization of estimation of P(A'X + B'Y + C > 0). They have derived an UMVUE of R* = P ( S f c i Aj-Xj + C > On where X.j are independent fcj-variate normal vectors with parameters 6j and S j , j = 1, , m, Aj are known fc^-variate vectors and C is a known scalar.


It is easy to generalize the UMVUE of P(A'X + B'Y + C > 0) to the case of more than two independent normal vectors. If samples of Xj of size rij are available, the UMVUE R* of R* = P fe/=i A J X J + c > °)) i s o f the form (compare with (3.155) and (3.156)) R* =

B(0.5,0.5nj -

Here the set £2* = j x € Rm : Xj € [-1,1], Tlj=\ °-jxj + O o} is the intersection of the m-dimensional hypercube with the sides [—1,1] and halfspace above the hyperplane Y^jLi CLJXJ + c > 0, a,- = and, as above,

i=l

i=l

Singh (1981) derived UMVUEs of P{Xi
Bayes Estimation

Assume that all the parameters $\theta_1$, $\theta_2$, $\Sigma_1$ and $\Sigma_2$ are a priori independent with the flat priors for $\theta_j$ and the inverse Wishart priors or Jeffreys's priors for $\Sigma_j$, $j = 1, 2$. The inverse Wishart distribution is the joint distribution of the $k(k + 1)/2$ elements of the inverse $S^{-1}$ of the sample covariance matrix $S$ based on independent $k$-variate normal observations. In the first case, $g_j(\theta_j, \Sigma_j)$, $j = 1, 2$, are the inverse Wishart pdfs with the parameters $r_j$ and $W_j$, that is

$j = 1, 2$, (3.159)

where, as before, <x means "proportional to". We shall derive q\ (X) since the expression for
(3.160) Integrating (3.160) with respect to 0i and noting that J27=i(x-i~X)' X) = t r ( S ^ 1 S x ) we have

p(X|Ei) oc S ^ exp j - i It is easy to observe that the product p(X|Si)(7i(Wi|0i, Si) is proportional to the pdf of the inverse Wishart distribution with parameters (rj + n\ — 1) and (Sx + W i ) . Hence, integration over S i yields (r-i+fci + n,

lSWr

( l

x l

-2)

»

Consequently, g(X,x) oc |St where X = J n i X + x)/(ni + 1) and SJ = ( X n i + i - X ) ( X n i + i - X )' + 2 " = i ( x i — x )(X» — X*)'. Rearranging S | we obtain g(X 1 ,...,X Wl ,x)

|Sx+Wi|-

Similarly, q(Yu...,Yna,y) ...... Yn,)

<x

For any square positive definite matrix A denote by v A a square positive definite matrix such that y/A \/A = A. Introducing new variables


denoting

and using (3.123), we arrive at ^

^

^

(3.161)

Here the set fi(A,B,C) is defined in (3.124) and we used the fact that |zz' + Ifc| = z'z + 1 for anyfc-variatevector z and (fc x A;) identity matrix Ifc. To obtain the constant Ci 2 in (3.161) note that R = 1 whenever A = 0, B = 0 and C > 0. Since the integrand in (3.161) is proportional to the product of the pdfs of the kj-v&riate T-distributions, j = 1,2, and comparing the integrand in (3.161) with (3.99) and (3.136), we derive that r((r2+n2-l)/2) Therefore, we obtain that .R = J(d, 6, c) where

(3.162) and J(o, 6, c) is given as above by equation (3.137) with aj = 1, a,- = nj + rj - 1, i = 1,2.

(3.163)

If we choose r\ and r 2 so that n^ + rj > 2, j = 1,2, are even, then there exists a finite sum representation of J{a, b, c) given in (3.139) and (3.140). If B = 0, then .Ro = J(d>, c) where J(a, c) is given by (3.141) and sequel, d, c, CXJ, and a j , j = 1,2, are defined in (3.162) and (3.163), respectively. For example, we can write .Ro as

Similarly, we can derive the Bayes estimator when $\Sigma_j$ is distributed according to Jeffreys's prior: $g_j(W_j | \theta_j, \Sigma_j) \propto |\Sigma_j|^{-(k_j + 1)/2}$, $j = 1, 2$ (see Yang and Berger (1994)). Note that Jeffreys's prior is a particular case of (3.159) with $W_j = 0$ and $r_j = 1 - k_j$, $j = 1, 2$. Let $n_j > k_j$. Then $\hat R = J(\hat a^*, \hat b^*, \hat c)$ where

L,

b* = y n;nn 2 + l)B'SyB

c is denned in (3.162) and J(a, b, c) is given by formula (3.137) with <7j

1,

(Xj

Ihj — Kj,

J

-l,^-

If B = 0, the Bayes estimator based on Jeffreys's prior reduces to (see (3.141) and (3.144)) P(A,C) = | +

^^' ^

2

'2'(a')2+a2Z

(3.164)

or

(3.165)

Note that the Bayes estimator of Enis and Geisser (1971) is the particular case of (3.165) when $C = 0$. The estimators coincide in this case. The seeming discrepancy between the two estimators is due to the difference in notation: $S_X$ in (3.165) is given by (3.105) while Enis and Geisser (1971) defined $S_X$ as $S_X = (n_1 - 1)^{-1}\sum_{i=1}^{n_1}(X_i - \bar X)(X_i - \bar X)'$.

3.6 Bivariate Exponential Distributions (BVED)

The term "bivariate exponential" usually (but not necessarily) refers to bivariate distributions with both marginals being exponential. There are by now a large number of different kinds of bivariate exponential distributions. In the following section we shall introduce several types of bivariate exponential distributions and construct estimators of $R$ in the case when $(X, Y)$ is a bivariate exponential random vector. We shall use the abbreviation BVED for "bivariate exponential distribution". Unfortunately, the volume of the present book does not allow us to exhaust the topic, and we shall not be able to cover some available results, such as inference for the Weinmann multivariate exponential distribution studied by Cramer and Kamps (1997) and Cramer (2000).

3.6.1 Various Types of Exponential Distributions

The first attempt to construct a bivariate distribution where $X$ and $Y$ are dependent and have exponential marginals was undertaken by Gumbel (1960), who introduced a BVED with the survival function $\bar F(x, y) = P(X > x, Y > y)$ and pdf given by, respectively,

$\bar F(x, y) = e^{-\lambda_1 x - \lambda_2 y - \lambda_{12}xy}$, $x, y > 0$, (3.166)

$f(x, y) = e^{-\lambda_1 x - \lambda_2 y - \lambda_{12}xy}\left[(\lambda_2 + \lambda_{12}x)(\lambda_1 + \lambda_{12}y) - \lambda_{12}\right]$, (3.167)

where $\lambda_1$, $\lambda_2$, $\lambda_{12}$ are positive. It is easy to check that the marginal distributions of (3.166) are exponential. A drawback of Gumbel's BVED is that it describes no physical reality. To remedy the situation, Freund (1961) introduced another BVED that has been designed, in particular, for the life testing of two-component systems which can function even after one component has failed (for example, a two-engine plane, a person's kidneys, etc.). The joint pdf of Freund's BVED has the form

$f(x, y) = \alpha_1\beta_2\, e^{-\beta_2 y - (\alpha_1 + \beta_1 - \beta_2)x}$ if $0 < x < y$, and $f(x, y) = \beta_1\alpha_2\, e^{-\alpha_2 x - (\alpha_1 + \beta_1 - \alpha_2)y}$ if $0 < y < x$. (3.168)

The marginal pdfs of $X$ and $Y$ for (3.168) are not exponential in this case but rather weighted sums of exponential pdfs. Another drawback of Freund's BVED is that the pdf (3.168) does not possess the bivariate version of the loss of memory property inherent in the univariate exponential distributions:

$P(X > s_1 + t, Y > s_2 + t \,|\, X > t, Y > t) = P(X > s_1, Y > s_2)$. (3.169)

In 1967, Marshall and Olkin suggested another BVED which corresponds to a "fatal shock" model and preserves the memory loss property (3.169). In the model, a two-component system dies after receiving a shock which is always fatal. The shocks to the first, second and to both components are described by Poisson processes with parameters $\lambda_1$, $\lambda_2$ and $\lambda_{12}$, respectively. If $X$ and $Y$ are the life-times of the first and the second components, then the survival function is given by

$\bar F(x, y) = \exp[-\lambda_1 x - \lambda_2 y - \lambda_{12}\max(x, y)]$. (3.170)
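The fatal shock construction translates directly into a simulation: each component fails at the first shock that hits it, so $X = \min(Z_1, Z_{12})$ and $Y = \min(Z_2, Z_{12})$ with independent exponential shock times $Z_1$, $Z_2$, $Z_{12}$. The Python sketch below is an illustration under that reading; the function names are not from the text. It compares the empirical survival frequency with (3.170).

```python
import math
import random

def simulate_marshall_olkin(l1, l2, l12, n, seed=0):
    """Draw n pairs (X, Y) from the fatal shock model with rates l1, l2, l12."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z1 = rng.expovariate(l1)    # shock to the first component only
        z2 = rng.expovariate(l2)    # shock to the second component only
        z12 = rng.expovariate(l12)  # shock to both components
        pairs.append((min(z1, z12), min(z2, z12)))
    return pairs

def survival(l1, l2, l12, x, y):
    """Survival function (3.170)."""
    return math.exp(-l1 * x - l2 * y - l12 * max(x, y))

pairs = simulate_marshall_olkin(1.0, 2.0, 0.5, 100000)
empirical = sum(x > 0.3 and y > 0.2 for x, y in pairs) / len(pairs)
theoretical = survival(1.0, 2.0, 0.5, 0.3, 0.2)
```

A noticeable fraction of simulated pairs has `x == y` exactly, which reflects the common shock hitting both components at once.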

Bivariate Exponential Distributions (BVED)


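It is easy to verify that the marginal distributions of X and Y are exponential and that (3.170) does satisfy the loss of memory property (3.169). These two features make the Marshall-Olkin distribution one of the most popular BVEDs. The lack of absolute continuity (the singularity on the diagonal) may be viewed as an objectionable feature of the Marshall-Olkin BVED. Block and Basu (1974) introduced an absolutely continuous BVED with the survival function and the pdf given, respectively, by

$$\bar F(x, y) = \frac{\lambda}{\lambda_1 + \lambda_2} \exp[-\lambda_1 x - \lambda_2 y - \lambda_{12} \max(x, y)] - \frac{\lambda_{12}}{\lambda_1 + \lambda_2} \exp[-\lambda \max(x, y)], \quad x, y > 0, \tag{3.171}$$

and

$$f(x, y) = \begin{cases} \dfrac{\lambda \lambda_1 (\lambda_2 + \lambda_{12})}{\lambda_1 + \lambda_2} \, e^{-\lambda_1 x - (\lambda_2 + \lambda_{12}) y}, & \text{if } 0 < x < y, \\[2ex] \dfrac{\lambda \lambda_2 (\lambda_1 + \lambda_{12})}{\lambda_1 + \lambda_2} \, e^{-(\lambda_1 + \lambda_{12}) x - \lambda_2 y}, & \text{if } 0 < y < x, \end{cases} \tag{3.172}$$

where

$$\lambda = \lambda_1 + \lambda_2 + \lambda_{12}. \tag{3.173}$$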

Block and Basu's BVED satisfies the loss of memory property (3.169), but unfortunately its marginal pdfs are not exponential but rather mixtures of exponential pdfs. Clearly, you cannot win them all!

3.6.2 Stress-Strength Estimators for the Marshall-Olkin BVED

Since the Marshall-Olkin BVED is not absolutely continuous, we may be interested in estimating three different probabilities:

$$R = P(X < Y) = \lambda_1/\lambda, \quad R_1 = P(X > Y) = \lambda_2/\lambda, \quad R_2 = P(X = Y) = \lambda_{12}/\lambda, \tag{3.174}$$

where, as above, $\lambda$ is given by (3.173). To obtain the MLEs $\hat R$, $\hat R_1$ and $\hat R_2$ of $R$, $R_1$ and $R_2$, we need to substitute the MLEs $\hat\lambda_1$, $\hat\lambda_2$, $\hat\lambda_{12}$ and $\hat\lambda = \hat\lambda_1 + \hat\lambda_2 + \hat\lambda_{12}$ for $\lambda_1$, $\lambda_2$, $\lambda_{12}$ and $\lambda$ in (3.174). Let $(\underline{X}, \underline{Y})$ be a sample with $n_1 = n_2 = n$, since the variables X and Y are dependent. Denote

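$$T_X = \sum_{j=1}^{n} X_j, \quad T_Y = \sum_{j=1}^{n} Y_j, \quad T_1 = \sum_{j=1}^{n} I(X_j < Y_j), \tag{3.175}$$

$$T_2 = \sum_{j=1}^{n} I(X_j > Y_j), \quad T_3 = \sum_{j=1}^{n} I(X_j = Y_j), \quad T_{XY} = \sum_{j=1}^{n} \max(X_j, Y_j), \tag{3.176}$$

$$T^*_{XY} = \sum_{j=1}^{n} \min(X_j, Y_j). \tag{3.177}$$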

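The MLEs $\hat\lambda_1$, $\hat\lambda_2$ and $\hat\lambda_{12}$ are the solutions of the following equations:

$$\begin{aligned} T_X &= T_1/\lambda_1 + T_2/(\lambda_1 + \lambda_{12}), \\ T_Y &= T_2/\lambda_2 + T_1/(\lambda_2 + \lambda_{12}), \\ T_{XY} &= T_3/\lambda_{12} + T_1/(\lambda_2 + \lambda_{12}) + T_2/(\lambda_1 + \lambda_{12}). \end{aligned} \tag{3.178}$$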

The MLEs of the parameters of the Marshall-Olkin BVED were constructed by Proschan and Sullo (1976) and have been studied further by, among others, Basu (1981) and Dinh et al. (1991). We shall follow the approach presented in the most recent of these papers. There are numerous other approaches to estimation of the Marshall-Olkin BVED parameters; interested readers are referred to Kotz et al. (2000). To solve equations (3.178) iteratively, we set

$$\nu_1(m) = \frac{\lambda_1(m)}{\lambda_1(m) + \lambda_{12}(m)}, \quad \nu_2(m) = \frac{\lambda_2(m)}{\lambda_2(m) + \lambda_{12}(m)}.$$

If $T_i > 0$, $i = 1, 2, 3$, the roots are obtained by iterating the following system of equations (starting with m = 0 until it converges):

$$\begin{aligned} \lambda_1(m+1) &= (T_1 + T_2 \nu_1(m))/T_X, \\ \lambda_2(m+1) &= (T_2 + T_1 \nu_2(m))/T_Y, \\ \lambda_{12}(m+1) &= (n - T_2 \nu_1(m) - T_1 \nu_2(m))/T_{XY}. \end{aligned}$$
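The iteration above can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the implementation of Dinh et al. (1991). The function names are hypothetical. The starting values $T_1/T_X$, $T_2/T_Y$, $T_3/T_{XY}$ are an assumption made here. The ratios are taken as $\nu_1 = \lambda_1/(\lambda_1 + \lambda_{12})$ and $\nu_2 = \lambda_2/(\lambda_2 + \lambda_{12})$, which is the choice that makes a fixed point of the iteration satisfy (3.178). The stopping rule is a simple tolerance on successive iterates.

```python
def mo_statistics(xs, ys):
    """Statistics (3.175)-(3.176) for paired observations (X_j, Y_j)."""
    return {
        "n": len(xs),
        "tx": sum(xs),
        "ty": sum(ys),
        "txy": sum(max(x, y) for x, y in zip(xs, ys)),
        "t1": sum(1 for x, y in zip(xs, ys) if x < y),
        "t2": sum(1 for x, y in zip(xs, ys) if x > y),
    }


def mo_mle(n, tx, ty, txy, t1, t2, tol=1e-12, max_iter=100_000):
    """Fixed-point iteration for the Marshall-Olkin likelihood equations."""
    t3 = n - t1 - t2
    if min(t1, t2, t3) <= 0:
        raise ValueError("the iteration is described only for T1, T2, T3 > 0")
    # Starting values: an assumption of this sketch.
    lam1, lam2, lam12 = t1 / tx, t2 / ty, t3 / txy
    for _ in range(max_iter):
        nu1 = lam1 / (lam1 + lam12)
        nu2 = lam2 / (lam2 + lam12)
        new1 = (t1 + t2 * nu1) / tx
        new2 = (t2 + t1 * nu2) / ty
        new12 = (n - t2 * nu1 - t1 * nu2) / txy
        change = abs(new1 - lam1) + abs(new2 - lam2) + abs(new12 - lam12)
        lam1, lam2, lam12 = new1, new2, new12
        if change < tol:
            break
    return lam1, lam2, lam12


if __name__ == "__main__":
    xs = [0.5, 1.2, 0.3, 2.0, 0.7, 1.0]  # made-up data with two ties
    ys = [0.9, 0.4, 0.3, 2.0, 1.5, 0.6]
    lam1, lam2, lam12 = mo_mle(**mo_statistics(xs, ys))
    lam = lam1 + lam2 + lam12
    print("R, R1, R2 estimates:", lam1 / lam, lam2 / lam, lam12 / lam)
```

Stopping after the first pass of the loop gives a one-step estimator from these particular starting values.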

If Ti — 0 for some i, the MLEs are then given explicitly by Jx


Ai 2


=

Basu (1981) advocates the so-called INT estimator, where, instead of the m-th iteration, the solutions $\lambda_1(1)$, $\lambda_2(1)$ and $\lambda_{12}(1)$ of the very first iteration are used. Since solutions of the equations (3.178) may not be easy to find, Awad et al. suggested using moment-type estimates of $\lambda_1$, $\lambda_2$ and $\lambda_{12}$ instead of the MLEs $\hat\lambda_1$, $\hat\lambda_2$ and $\hat\lambda_{12}$. Their method results in the estimators

R =

X+Y

'

Ri =

The estimator $R^*$ can be viewed as a "corrected" version of $\bar Y/(\bar X + \bar Y)$, the corrector $n^{-1} T_3 \bar X$ being the estimator of the product $EX \cdot P(X = Y)$.

Now consider the situation when stress is censored at strength. This situation occurs when, for example, a balloon, a spring or a piece of rubber is stretched beyond its strength. This leads to the destruction of the object, and the stress cannot be recorded. In this case, the stress X is censored at the strength Y, so that only the observations $(Y_j, \min(X_j, Y_j))$, $j = 1, \ldots, n$, are available. In this situation, the likelihood function has the form

$$L(\lambda_1, \lambda_2, \lambda_{12} \mid \underline{X}, \underline{Y}) = (\lambda_2 + \lambda_{12})^n \lambda_1^{T_1} \exp\{-(\lambda_2 + \lambda_{12}) T_Y - \lambda_1 T^*_{XY}\}, \tag{3.179}$$

where $T_1$, $T_Y$ and $T^*_{XY}$ are defined in (3.175)-(3.177). Then the MLEs of $\lambda_1$ and $\lambda_2 + \lambda_{12}$ which maximize (3.179) are $\hat\lambda_1 = T_1/T^*_{XY}$ and $\widehat{\lambda_2 + \lambda_{12}} = n/T_Y$. Hence, it follows from (3.174) that the MLE of R = P(X < Y) is

$$\hat R = T_1 T_Y (n T^*_{XY} + T_1 T_Y)^{-1}. \tag{3.180}$$

The estimator (3.180) is due to Hanagal (1997).
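As a small illustration of the censored case, the sketch below computes the two MLEs and the estimator (3.180) from the recorded pairs (Y_j, min(X_j, Y_j)). The function name and the data are hypothetical. It is assumed that an observation counts towards $T_1$ exactly when the recorded minimum is strictly smaller than the strength, and that $T^*_{XY}$ is the sum of the recorded minima.

```python
def censored_mle(ys, ws):
    """MLEs under censoring of stress at strength.

    ys -- observed strengths Y_j
    ws -- recorded values min(X_j, Y_j)
    Returns (lambda1_hat, (lambda2 + lambda12)_hat, R_hat).
    """
    n = len(ys)
    t1 = sum(1 for y, w in zip(ys, ws) if w < y)  # stress observed: X_j < Y_j
    t_y = sum(ys)
    t_min = sum(ws)
    lam1_hat = t1 / t_min
    lam2_plus_lam12_hat = n / t_y
    r_hat = t1 * t_y / (n * t_min + t1 * t_y)
    return lam1_hat, lam2_plus_lam12_hat, r_hat


if __name__ == "__main__":
    strengths = [2.0, 4.0, 1.5, 3.0]   # made-up data
    recorded = [1.0, 4.0, 1.5, 0.5]    # second and third items broke: stress unrecorded
    print(censored_mle(strengths, recorded))
```

The returned value of R agrees with the ratio of the first MLE to the sum of the two MLEs, which is how (3.180) follows from (3.174).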

3.6.3 Stress-Strength Probabilities and Their Estimators for Other BVEDs

Let (X, Y) follow Gumbel's BVED with the pdf (3.167). Then R = P(X < Y) can be obtained from (2.5) and is of the form (Jana (1994))

$$R = \int_0^\infty (\lambda_1 + \lambda_{12} x) \exp\{-\lambda_{12} x^2 - (\lambda_1 + \lambda_2) x\} \, dx.$$

Simplifying the last expression, we represent it as

$$R = \frac{1}{2} - \frac{1}{2} (\lambda_2 - \lambda_1) \int_0^\infty \exp\{-\lambda_{12} z^2 - (\lambda_1 + \lambda_2) z\} \, dz. \tag{3.181}$$

To estimate R, Jana and Roy (1994) observe that if Z = min(X, Y), then for this model

$$R = 1/2 - (\lambda_2 - \lambda_1) EZ/2. \tag{3.182}$$

Hence, if $\lambda_1 = \lambda_2$, then $R = 1/2$ regardless of $\lambda_{12}$. To prove (3.182) note that, by (3.166), the cdf of Z is $F(z) = 1 - \exp\{-\lambda_{12} z^2 - (\lambda_1 + \lambda_2) z\}$ and that $EZ = \int_0^\infty (1 - F(z)) \, dz$ for any random variable $Z \in (0, \infty)$. Since the marginal distributions of X and Y are exponential with parameters $\lambda_1$ and $\lambda_2$, consistent estimators of $\lambda_1$, $\lambda_2$ and EZ are $n/T_X$, $n/T_Y$ and $T^*_{XY}/n$, respectively, where $T_X$, $T_Y$ and $T^*_{XY}$ are defined in (3.175)-(3.177). Using this fact, Jana and Roy (1994) suggest the following consistent estimator of R:

$$\frac{1}{2} - \frac{1}{2} \left( \frac{n}{T_Y} - \frac{n}{T_X} \right) \frac{T^*_{XY}}{n}.$$
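The identity (3.182) can be checked numerically. The sketch below evaluates R for Gumbel's BVED in two ways: directly from the integral of $(\lambda_1 + \lambda_{12} x)\exp\{-\lambda_{12} x^2 - (\lambda_1 + \lambda_2) x\}$, and through EZ as in (3.182). It also shows a plug-in estimator built from $n/T_X$, $n/T_Y$ and the mean of the minima. The function names, the truncation point of the integrals and the use of a composite Simpson rule are choices made for this illustration only.

```python
import math


def simpson(f, a, b, steps=20_000):
    """Composite Simpson rule on [a, b] with an even number of steps."""
    if steps % 2:
        steps += 1
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def gumbel_r_direct(lam1, lam2, lam12):
    """R = P(X < Y) from the integral representation preceding (3.181)."""
    upper = 50.0 / (lam1 + lam2)  # the integrand is negligible beyond this point
    return simpson(
        lambda x: (lam1 + lam12 * x) * math.exp(-lam12 * x * x - (lam1 + lam2) * x),
        0.0, upper)


def gumbel_r_via_ez(lam1, lam2, lam12):
    """R computed as 1/2 - (lam2 - lam1) * EZ / 2, as in (3.182)."""
    upper = 50.0 / (lam1 + lam2)
    ez = simpson(lambda z: math.exp(-lam12 * z * z - (lam1 + lam2) * z), 0.0, upper)
    return 0.5 - (lam2 - lam1) * ez / 2


def plug_in_estimate(n, t_x, t_y, t_min):
    """Plug-in estimator: n/T_X, n/T_Y and t_min/n substituted into (3.182)."""
    return 0.5 - (n / t_y - n / t_x) * (t_min / n) / 2


if __name__ == "__main__":
    print(gumbel_r_direct(1.0, 2.0, 0.5), gumbel_r_via_ez(1.0, 2.0, 0.5))
```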

If the vector (X, Y) has Freund's BVED with the pdf f(x, y) given by (3.168), calculating R = P(X < Y) in accordance with (2.5), we obtain the familiar

$$R = \alpha_1 \beta_2 \int_0^\infty e^{-(\alpha_1 + \beta_1 - \beta_2) x} \, dx \int_x^\infty e^{-\beta_2 y} \, dy = \frac{\alpha_1}{\alpha_1 + \beta_1}.$$

To derive the MLE $\hat R$ of R, simply replace $\alpha_1$ and $\beta_1$ in the preceding formula by their MLEs (see Hanagal and Kale (1992) for the MLEs of the parameters).

Consider now the case when the vector (X, Y) has the BVED of the Block and Basu type with the pdf (3.172), where $\lambda$ is given by (3.173). Then R = P(X < Y) can be calculated using (2.5) and is given by $R = \lambda_1/(\lambda_1 + \lambda_2)$. The MLE of R then has the form $\hat R = \hat\lambda_1/(\hat\lambda_1 + \hat\lambda_2)$, where $\hat\lambda_1$ and $\hat\lambda_2$ are the MLEs of $\lambda_1$ and $\lambda_2$, respectively, derived by Hanagal (1992).

3.7 Discrete Distributions

3.7.1 Multivariate Discrete Distributions

Let us consider, similarly to Section 3.4, two independent $k_1$- and $k_2$-component random vectors $X = (X^{(1)}, \ldots, X^{(k_1)})$ and $Y = (Y^{(1)}, \ldots, Y^{(k_2)})$. Again, as in Section 3.4, our goal is to estimate the probability $R = P(A'X + B'Y + C > 0)$ on the basis of i.i.d. samples (3.101). Here, as before, A and B are known $k_1$- and $k_2$-dimensional vectors and C is a known scalar.

Multivariate Poisson distribution. The probability that a k-variate vector X with the multivariate Poisson(k, $\Lambda$) distribution takes the value x is given by

$$P(x \mid k, \Lambda) = \exp(-k\lambda) \prod_{j=1}^{k} \frac{(\lambda^{(j)})^{x^{(j)}}}{x^{(j)}!}, \tag{3.183}$$

where $\Lambda = (\lambda^{(1)}, \ldots, \lambda^{(k)})$ is a vector with positive components, $\lambda = k^{-1} \sum_{j=1}^{k} \lambda^{(j)}$ and $x = (x^{(1)}, \ldots, x^{(k)})$ is a k-variate vector with nonnegative integer components. Let independent vectors X and Y have multivariate Poisson distributions Poisson($k_1$, $\Lambda_1$) and Poisson($k_2$, $\Lambda_2$), respectively. Define $\bar X$ and $\bar Y$ by (3.104), and note that $\bar X$ and $\bar Y$ are the MLEs of the vector parameters $\Lambda_1$ and $\Lambda_2$, respectively. Then, from general theory (see Section 2.1.4) it follows that the MLE of R is

$$\hat R = \sum_{A'x + B'y + C > 0} P(x \mid k_1, \bar X) \, P(y \mid k_2, \bar Y), \tag{3.184}$$

where x and y take all possible nonnegative integer values. The series in (3.184) converges quite rapidly, and thus one can use finite sums instead of infinite ones when calculating by means of formula (3.184).

To construct the UMVUE of R, note that by Belyaev and Lumelskii (1988) the UMVUE of $P(x \mid k, \Lambda)$ based on an i.i.d. sample of n observations is given by (3.185), where $X_0 = \sum_{j=1}^{k} \bar X^{(j)}$ and $x_0 = \sum_{j=1}^{k} x^{(j)}$. Then the UMVUE of R is given by (compare with (2.27))

$$\tilde R = \sum_{A'x + B'y + C > 0} P(x \mid k_1, n_1, \bar X) \, P(y \mid k_2, n_2, \bar Y). \tag{3.186}$$
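To illustrate the remark that the series (3.184) can be replaced by a finite sum, the sketch below evaluates the plug-in estimate in the simplest special case, assumed here for illustration: $k_1 = k_2 = 1$, $A = -1$, $B = 1$, $C = 0$, so that R = P(X < Y) for two independent Poisson variables. The function names and the truncation rule (mean plus twelve standard deviations plus a constant) are hypothetical choices, not taken from the text.

```python
import math


def poisson_pmf(x, lam):
    """Poisson probability, computed on the log scale for stability (lam > 0)."""
    return math.exp(-lam + x * math.log(lam) - math.lgamma(x + 1))


def poisson_r_mle(xbar, ybar):
    """Truncated version of (3.184) for R = P(X < Y), univariate Poisson case.

    xbar and ybar are the (positive) sample means, i.e. the MLEs of the
    two Poisson parameters.
    """
    big = max(xbar, ybar)
    upper = int(big + 12 * math.sqrt(big) + 30)  # terms beyond this are negligible
    px = [poisson_pmf(x, xbar) for x in range(upper + 1)]
    py = [poisson_pmf(y, ybar) for y in range(upper + 1)]
    tail = sum(py)
    total = 0.0
    for x in range(upper + 1):
        tail -= py[x]          # tail is now P(x < Y <= upper)
        total += px[x] * tail  # adds P(X = x) * P(Y > x)
    return total


if __name__ == "__main__":
    print(poisson_r_mle(1.3, 2.8))
```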

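Multinomial distribution. Consider an integer m and k-variate vectors x and $\Lambda$ such that

$$0 \le x^{(j)} \le m, \quad \sum_{j=1}^{k} x^{(j)} = m \quad \text{and } x^{(j)} \text{ integer}, \qquad 0 < \lambda^{(j)} < 1, \quad \sum_{j=1}^{k} \lambda^{(j)} = 1. \tag{3.187}$$

We say that a k-variate random vector X has the Multinomial(k, m, $\Lambda$) distribution if P(X = x) is equal to

$$P(x \mid k, m, \Lambda) = m! \prod_{j=1}^{k} \frac{(\lambda^{(j)})^{x^{(j)}}}{x^{(j)}!}. \tag{3.188}$$

The multinomial distribution is a generalization of the binomial distribution. Let independent vectors X and Y have, respectively, Multinomial($k_1$, $m_1$, $\Lambda_1$) and Multinomial($k_2$, $m_2$, $\Lambda_2$) distributions. Then, since the MLEs of $\Lambda_1$ and $\Lambda_2$ are $\bar X/m_1$ and $\bar Y/m_2$, the MLE of R can be written as (compare with (3.184))

$$\hat R = \sum_{A'x + B'y + C > 0} P(x \mid k_1, m_1, \bar X/m_1) \, P(y \mid k_2, m_2, \bar Y/m_2), \tag{3.189}$$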

where the sum is taken over all vectors x and y satisfying (3.187) with $k_1, m_1$ and $k_2, m_2$, respectively. The UMVUE of the probabilities (3.188) based on a sample of n observations is of the form (see Lumelskii and Sapoznikov (1969) or Voinov and Nikulin (1996))

$$P(x \mid k, m, n, \bar X) = \binom{nm}{m}^{-1} \prod_{j=1}^{k} \binom{n \bar X^{(j)}}{x^{(j)}}. \tag{3.190}$$

Then the UMVUE of R is given by (compare with (3.186) and (3.189))

$$\tilde R = \sum_{A'x + B'y + C > 0} P(x \mid k_1, m_1, n_1, \bar X) \, P(y \mid k_2, m_2, n_2, \bar Y). \tag{3.191}$$

The sums in both (3.189) and (3.191) are finite and can be computed directly.

3.7.2 Univariate Discrete Distributions

Now let us consider the situation when X and Y are independent random variables. Note that for estimation of R = P(X < Y) we can use the results of the previous section with $k_1 = k_2 = 1$, $A = -1$ and $B = 1$.

Poisson distribution. The probability that X ~ Poisson($\lambda$) takes the value x can be obtained from (3.183) with k = 1. The MLE and the UMVUE of R then follow directly from formula (3.184) and from formulas (3.185) and (3.186), namely,

$$\hat R = \sum_{x=0}^{\infty} \sum_{y=x+1}^{\infty} P(x \mid 1, \bar X) \, P(y \mid 1, \bar Y), \qquad \tilde R = \sum_{x=0}^{U} \sum_{y=x+1}^{n_2 \bar Y} P(x \mid 1, n_1, \bar X) \, P(y \mid 1, n_2, \bar Y),$$

where $U = \min(n_1 \bar X, n_2 \bar Y - 1)$. Note that the second formula is a finite sum while the first one contains a rapidly converging series.

Binomial distribution. The binomial distribution is a particular case of the multinomial distribution with k = 2. Usually, $x^{(1)}$ is the number of successes in m independent trials, $\lambda^{(1)} = p$ is the probability of success in each trial and $\lambda^{(2)} = 1 - p$. Let X and Y have binomial distributions with parameters $(m_1, p_1)$ and $(m_2, p_2)$, respectively, where $m_1$ and $m_2$ are known. Then the expression for $\hat R$ can be obtained from (3.189) by setting $k_1 = k_2 = 2$, $\hat p_1 = \bar X/m_1$ and $\hat p_2 = \bar Y/m_2$:

$$\hat R = \sum_{x=0}^{m_1} \sum_{y=x+1}^{m_2} P(x \mid m_1, \hat p_1) \, P(y \mid m_2, \hat p_2) \, I(x \le m_2 - 1).$$

To construct the UMVUE of R, note that the UMVUE of the binomial probability $P(x \mid m, p)$ based on observations $X_1, \ldots, X_n$ is of the form

$$P(x \mid n, m, \bar X) = \binom{n \bar X}{x} \binom{n(m - \bar X)}{m - x} \binom{nm}{m}^{-1}. \tag{3.192}$$

Then the UMVUE of R is given by

$$\tilde R = \sum_{x=0}^{m_1} \sum_{y=x+1}^{m_2} P(x \mid n_1, m_1, \bar X) \, P(y \mid n_2, m_2, \bar Y) \, I(x \le m_2 - 1).$$

Negative binomial distribution. We say that a discrete random variable X has the negative binomial distribution NegBin(m, p) if

$$P(X = x) = \binom{m + x - 1}{x} p^x (1 - p)^m, \quad x = 0, 1, \ldots. \tag{3.193}$$

Let X ~ NegBin($m_1$, $p_1$) and Y ~ NegBin($m_2$, $p_2$) be independent random variables and let $m_1$ and $m_2$ be known. Note that the MLEs of $p_1$ and $p_2$ are $\hat p_1 = \bar X/(m_1 + \bar X)$ and $\hat p_2 = \bar Y/(m_2 + \bar Y)$, respectively. Then the MLE of R = P(X < Y) is

$$\hat R = \sum_{x=0}^{\infty} \sum_{y=x+1}^{\infty} P(x \mid m_1, \hat p_1) \, P(y \mid m_2, \hat p_2). \tag{3.194}$$

To construct the UMVUE of R, note that the UMVUE of the negative binomial probability based on n observations is of the form (see Voinov and Nikulin (1993), Table A25)

$$P(x \mid n, m, \bar X) = \binom{n(m + \bar X) - 1}{n \bar X}^{-1} \binom{m + x - 1}{x} \binom{n \bar X - x + m(n - 1) - 1}{n \bar X - x} \tag{3.195}$$

if $x \le n \bar X$ and zero otherwise, so that the UMVUE of R is

$$\tilde R = \sum_{x=0}^{n_1 \bar X} \sum_{y=x+1}^{n_2 \bar Y} P(x \mid n_1, m_1, \bar X) \, P(y \mid n_2, m_2, \bar Y). \tag{3.196}$$

The estimator (3.196) was derived by Ivshin and Lumelskii (1995) and studied by Sathe and Dixit (2001). In the case $m_1 = m_2 = 1$, the negative binomial distribution reduces to the geometric distribution. The estimators $\hat R$ and $\tilde R$ can be evaluated using expressions (3.194) and (3.196). In particular, formula (3.194) takes the form

$$\hat R = (1 - \hat p_1) \hat p_2 / (1 - \hat p_1 \hat p_2). \tag{3.197}$$

Expression (3.197) was originally obtained by Maiti (1995).
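The closed form (3.197) is easy to check against a direct summation. The sketch below assumes the geometric parameterization $P(X = x) = p^x (1 - p)$, $x = 0, 1, \ldots$, that is, (3.193) with m = 1, and the MLE $\hat p = \bar X/(1 + \bar X)$. The function names are hypothetical.

```python
def geometric_r(p1, p2):
    """Closed form (3.197) for R = P(X < Y), geometric X and Y."""
    return (1 - p1) * p2 / (1 - p1 * p2)


def geometric_r_by_summation(p1, p2, terms=2000):
    """Direct evaluation of sum_x P(X = x) P(Y > x), with P(Y > x) = p2**(x + 1)."""
    return sum(p1 ** x * (1 - p1) * p2 ** (x + 1) for x in range(terms))


def geometric_r_mle(xbar, ybar):
    """Plug the MLEs p_hat = mean / (1 + mean) into (3.197)."""
    return geometric_r(xbar / (1 + xbar), ybar / (1 + ybar))


if __name__ == "__main__":
    print(geometric_r(0.3, 0.6), geometric_r_by_summation(0.3, 0.6))
    print(geometric_r_mle(0.8, 2.5))
```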

3.8 Exercises

3.1. X and Y be independent normal variables with parameters (JKI,<TI) and (/X2,02). Using formulas (3.150) and (3.157), derive the MLEs and the UMVUEs of R when fix and \i
Let X ~ ALPHA(/i,(7i) and Y ~ ALPHA(//, 02) be independent random variables and // be a common known parameter. Find a monotone function ) transforming X and Y into independent normal variables and, using Theorems 2.7 - 2.9, derive the MLE R, the UMVUE R and the Bayes estimator R of R = P{X < Y). 3.7. Derive the expression (3.78) for the UMVUE of R in the case of the generalized gamma distribution (see Section 3.2.6).


3.8. Using expressions (3.40) and (3.78) for $\hat R$ and $\tilde R$ in the case of the generalized gamma distribution, derive the MLE and the UMVUE of R for independent X and Y in the following situations: a) X ~ Weibull($\alpha_1$, $\sigma_1$) and Y ~ Gamma($\alpha_2$, $\sigma_2$), $\alpha_1$, $\alpha_2$ known; b) X ~ Rayleigh($\sigma_1$) and Y ~ Gamma($\alpha_2$, $\sigma_2$), $\alpha_2$ known. 3.9. Let X and Y belong to left-truncated families (see Section 3.2.4). Check the validity of Rohatgi's (1989) UMVUE of R: R

—

+ where R((ni,fi2) = P(X < Y), fx{x\fi\)

and fY(y\\muy)

are of the forms

(3.65) and (3.66) with Ai = A2 = oo, and g

3.10. (Bai and Hong (1992), Cramer and Kamps (1997)). Let X and K be two-parameter exponential random variables with the parameters {H\,(7\) and (/i2,<72), respectively, and let /ii = /J2 = M be a common location parameter. In this situation, the UMVUE for the joint pdf f(x, y) of X and Y has the form (Bai and Hong (1992)): f(x,y)

= [CiC^Tx -x + TXY)nx-2(TY -y + C2C^{TX - x + TXY)nx-3(TY -y + TXY)n*~2] x I(TXY <x
where d = (nx - l)(n 2 - l) 2 (n 2 - 2), C2 = (m - l) 2 (ni - 2)(n2 1), C3 = m(m - l ) ^ 1 - 2 ^ 2 - 1 + n2(n2 - l)T^-lT^-2 and TXY, Tx and TY are the following sufficient statistics for /j,, G\ and a2: TXY = min(Xi, Xni, Ylt , Yn2), Tx = Eti(xi ~ TXY), TY = E " i i ( ^ TXY)- Derive the UMVUE of R in this case. 3.11. (Bai and Hong (1992). In the conditions of the previous problem find the MLE of R. Note that the MLE of /x is TXY. 3.12. (Abu-Salih and Shamseldin (1988)). Assume that in the MarshallOlkin BVED Ai - A2. Construct the Bayes estimator of R = P(X <


Y) when Ai(= A2) and A12 are a-priori independent with gamma priors 5(Ai,Ai2) oc A f ^ A ^ - V ^ e " * " * " . 3.13. (Hanagal (1997)). We say that a vector has the bivariate Pareto distribution if

where $\lambda_1$, $\lambda_2$, $\lambda_{12}$ are positive. Assuming that the scale parameter $\beta$ is known, use an appropriate function v(x) (see (2.77)) to transform (X, Y) into a vector ($\xi$, $\eta$) with the Marshall-Olkin BVED. Using Theorem 2.7 and the results of Section 3.6.2, obtain the MLE of R = P(X < Y). 3.14. (Ivshin and Lumelskii (1995)). Use the UMVUE of the squared binomial probability (compare with (3.192))

and Theorem 2.5 to construct the UMVUE of Var($\tilde R$) in the case when X and Y have binomial distributions with parameters $(m_1, p_1)$ and $(m_2, p_2)$, respectively, where $m_1$ and $m_2$ are known. 3.15. (Ahmad et al. (1997)). Using the results of Section 2.3.3 and Theorem 2.9, obtain the Bayes estimator of R based on gamma-distributed priors in the case when the independent variables X and Y have Burr type X distributions. 3.16. (Gupta et al. (1999)). Let X and Y be independent normal variables $N(\mu_1, \sigma_1^2)$ and $N(\mu_2, \sigma_2^2)$, respectively, with a common but unknown coefficient of variation $a = \sigma_1/\mu_1 = \sigma_2/\mu_2$. Find the MLE of R. 3.17. (Gupta and Subramanian (1998)). Solve Problem 3.16 when X and Y are dependent normal variables with unknown correlation coefficient.

Chapter 4

Parametric Statistical Inference

The reader is urged to pay special attention to this chapter. Here the basic concepts, familiar to a novice as well as to a seasoned researcher, are used in a somewhat non-orthodox manner. The results are practically useful, and new avenues for further research are available here in great abundance. In this chapter we discuss the construction of confidence intervals for R = P(X < Y) as well as testing hypotheses concerning R when the distributions of the variables X and Y have known forms with one or several unknown parameters (the so-called parametric inference). The extensive literature on interval estimation in the case when the distributions of X and Y are unspecified (nonparametric inference) will be covered in Chapter 5. As follows from Section 2.4, there are three groups of methods for constructing confidence intervals applicable to our problem: exact, asymptotic and Bayesian methods. We shall study each of them in turn, concluding the chapter with a discussion of hypothesis testing and bootstrap techniques.

4.1 Confidence Intervals Based on Exact Distributions

In Section 2.4, an exact confidence interval for R in the case of the one-parameter exponential distribution was derived. Historically, however, the first exact confidence intervals were obtained in the case of the normal distribution. We draw the reader's attention to the fact that, although construction of exact confidence intervals usually requires cumbersome and occasionally sophisticated calculations, these intervals are far more reliable than the asymptotic ones, especially when R = P(X < Y) is in the vicinity of 1.

4.1.1 The Normal Distribution: Dependent Variables

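Known covariance matrix. Let (X, Y) be a normal vector with unknown mean $(\mu_1, \mu_2)$ and a known covariance matrix $\Sigma$. The objective is to construct a confidence interval for R = P(X < Y) based on the observations $(\underline{X}, \underline{Y})$ with $n_1 = n_2 = n$ (see (2.2)). Denote

$$\zeta = (\mu_2 - \mu_1)/\sigma, \quad \hat\zeta = (\bar Y - \bar X)/\sigma, \quad \sigma^2 = \mathrm{Var}(Y - X), \tag{4.1}$$

and observe that (see (3.152)) $R = \Phi(\zeta)$, where $\Phi(\cdot)$ is the cdf of the standard normal distribution. Note that $\sqrt{n}(\hat\zeta - \zeta)$ is a standard normal variable, hence

$$P(-z_{\gamma/2}/\sqrt{n} < \hat\zeta - \zeta < z_{\gamma/2}/\sqrt{n}) = 1 - \gamma, \tag{4.2}$$

where $z_\alpha$ is the $(1 - \alpha)$ quantile of the standard normal distribution. Now, relation (4.2) and monotonicity of the function $\Phi(\cdot)$ imply that the two-sided $(1 - \gamma)$-confidence interval for R is given by

$$P\left(\Phi(\hat\zeta - z_{\gamma/2}/\sqrt{n}) < R < \Phi(\hat\zeta + z_{\gamma/2}/\sqrt{n})\right) = 1 - \gamma. \tag{4.3}$$

It is, of course, quite straightforward to construct one-sided confidence intervals for R. For example,

$$P\left(R < \Phi(\hat\zeta + z_\gamma/\sqrt{n})\right) = 1 - \gamma. \tag{4.4}$$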
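The two-sided interval (4.3) requires only the standard normal cdf and its quantiles. The following sketch uses Python's standard library. The function name and the sample data are hypothetical, and it is assumed that the statistic is the difference of the sample means divided by the known standard deviation of Y - X, with that standard deviation supplied by the user as `sigma`.

```python
import math
from statistics import NormalDist, fmean


def r_interval_known_sigma(xs, ys, sigma, gamma=0.05):
    """Two-sided (1 - gamma) confidence interval for R = P(X < Y), as in (4.3).

    xs, ys -- paired observations of equal length n
    sigma  -- known standard deviation of Y - X
    Returns (lower, point_estimate, upper).
    """
    n = len(xs)
    std = NormalDist()
    zeta_hat = (fmean(ys) - fmean(xs)) / sigma
    half_width = std.inv_cdf(1 - gamma / 2) / math.sqrt(n)  # z_{gamma/2} / sqrt(n)
    return (std.cdf(zeta_hat - half_width),
            std.cdf(zeta_hat),
            std.cdf(zeta_hat + half_width))


if __name__ == "__main__":
    stress = [9.1, 10.4, 8.7, 11.0, 9.8]       # made-up data
    strength = [11.2, 12.9, 10.1, 12.4, 11.5]
    print(r_interval_known_sigma(stress, strength, sigma=1.5))
```

A one-sided bound such as (4.4) is obtained in the same way, with `gamma` in place of `gamma / 2` and only one end of the interval retained.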

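This confidence interval was derived originally by Owen et al. (1964). Mazumdar (1970) developed confidence intervals in the case when $\mu_1$ is known. His confidence intervals are of the form (4.3) and (4.4) with $\bar X$ replaced by $\mu_1$. Owen et al. (1964) also investigated the sample size n which ensures that

$$P(R < \Phi(\hat\zeta) + \varepsilon) = 1 - \gamma \tag{4.5}$$

for given positive $\gamma$ and $\varepsilon$. Relation (4.5) implies that $P(\Phi(\hat\zeta) > R - \varepsilon) = 1 - \gamma$, hence, by (2.61), $P(\hat\zeta > z_{1-R+\varepsilon}) = 1 - \gamma$. Noting that $\Phi(\zeta) = R$ is equivalent to $\zeta = z_{1-R}$ and that the distribution of $\hat\zeta - \zeta$ is symmetric about zero, we arrive at $P(\hat\zeta - \zeta < z_{1-R} - z_{1-R+\varepsilon}) = 1 - \gamma$. Recall now that $\sqrt{n}(\hat\zeta - \zeta)$ is a standard normal variable, so that $\sqrt{n}(z_{1-R} - z_{1-R+\varepsilon}) = z_\gamma$. Thus, (4.5) is valid provided

$$n > z_\gamma^2 (z_{1-R} - z_{1-R+\varepsilon})^{-2}.$$

Unknown covariance matrix. If the covariance matrix $\Sigma$ is unknown, Var(Y - X) can be estimated by

$$s^2 = (n - 1)^{-1} \sum_{j=1}^{n} \left(Y_j - X_j - (\bar Y - \bar X)\right)^2.$$

In this case, $\sqrt{n}(\bar Y - \bar X)/\sigma \sim N(\sqrt{n}\zeta, 1)$ and $(n - 1)s^2/\sigma^2 \sim \chi^2_{n-1}$, where $\sigma$ is defined in (4.1). Denote

$$\tilde\zeta = (\bar Y - \bar X)/s \tag{4.6}$$

and observe that $\sqrt{n}\tilde\zeta$ has a noncentral T-distribution with (n - 1) degrees of freedom and the noncentrality parameter $\sqrt{n}\zeta$:

$$\sqrt{n}\tilde\zeta \sim t_{n-1}(\sqrt{n}\zeta). \tag{4.7}$$

Since the noncentral T-distribution possesses the monotone likelihood ratio property, the $(1 - \gamma)$-lower confidence bound for $\zeta$ will be $\zeta_{\gamma,1}$, where $\zeta_{\gamma,1}$ is the solution of the equation

$$P\left(t_{n-1}(\sqrt{n}\zeta_{\gamma,1}) < \sqrt{n}\tilde\zeta\right) = 1 - \gamma. \tag{4.8}$$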

Then, the one-sided confidence interval for R with the confidence coefficient $(1 - \gamma)$ is $(\Phi(\zeta_{\gamma,1}), 1)$ with $\Phi(\cdot)$ as above. This confidence interval was suggested by Owen et al. (1964). Equation (4.8) has to be solved numerically. A simplification of (4.8) is possible by using a well-known approximation (see e.g. Johnson et al. (1995), Chapter 31)

$$(t_m(x) - x) / [1 + (t_m(x))^2/(2m)]^{1/2} \sim N(0, 1), \tag{4.9}$$

where ~ stands for "asymptotically distributed as". Applying (4.9) to (4.7) we arrive at

$$\sqrt{n}(\tilde\zeta - \zeta) / [1 + n\tilde\zeta^2/(2(n - 1))]^{1/2} \sim N(0, 1).$$

Thus, a $(1 - \gamma)$ approximate lower confidence bound $\zeta_{\gamma,2}$ on $\zeta$ can be written as

$$\zeta_{\gamma,2} = \tilde\zeta - \sqrt{1/n + \tilde\zeta^2/(2(n - 1))} \; z_\gamma, \tag{4.10}$$

where $z_\gamma$ is as above.

4.1.2 The Normal Distribution: Independent Variables

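Known variances. Consider the case when X and Y are independent normal variables with known variances and sample sizes $n_1$ and $n_2$ which are not necessarily equal. Then

$$[(\bar Y - \bar X) - (\mu_2 - \mu_1)] / \sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2} \sim N(0, 1).$$

Denote

$$M = (\sigma_1^2 + \sigma_2^2) / (\sigma_1^2/n_1 + \sigma_2^2/n_2). \tag{4.11}$$

Let $\zeta$ and $\hat\zeta$ be given by (4.1) with $\sigma = \sqrt{\sigma_1^2 + \sigma_2^2}$. Then $\sqrt{M}(\hat\zeta - \zeta)$ has the standard normal pdf and, consequently, confidence intervals for R are of the form (4.3) or (4.4) with n replaced by M. Note that M = n provided $n_1 = n_2 = n$.

Unknown variances. When the variances $\sigma_1^2$ and $\sigma_2^2$ are unknown, the confidence interval was constructed by Reiser and Guttman (1986) and independently rediscovered by Teskin and Kostyukova (1991). In what follows we follow Reiser and Guttman (1986). Note that $\sigma_1^2$ and $\sigma_2^2$ can be estimated by $s_1^2$ and $s_2^2$ (see (3.3)), respectively. Also,

$$\bar Y - \bar X \sim N(\mu_2 - \mu_1, \, \sigma_1^2/n_1 + \sigma_2^2/n_2), \quad (n_1 - 1)s_1^2/\sigma_1^2 \sim \chi^2_{n_1 - 1} \quad \text{and} \quad (n_2 - 1)s_2^2/\sigma_2^2 \sim \chi^2_{n_2 - 1},$$

with all three random variables above being pairwise independent. Let $\zeta$ and M be defined by (4.1) and (4.11), respectively, with $\sigma = \sqrt{\sigma_1^2 + \sigma_2^2}$. Then $\sqrt{M}\sigma^{-1}(\bar Y - \bar X) \sim N(\sqrt{M}\zeta, 1)$ and

$$s^2/\sigma^2 = (s_1^2 + s_2^2)/(\sigma_1^2 + \sigma_2^2). \tag{4.12}$$

Therefore (see e.g. Johnson et al. (1995), Chapter 29), approximately,

$$s^2/\sigma^2 \sim \chi^2_\nu/\nu \quad \text{with} \quad \nu = \frac{(\sigma_1^2 + \sigma_2^2)^2}{\sigma_1^4/(n_1 - 1) + \sigma_2^4/(n_2 - 1)}. \tag{4.13}$$

Let, as above, $\tilde\zeta$ be of the form (4.6) and recall that $\bar Y - \bar X$ and s are independent. Hence

$$\sqrt{M}\tilde\zeta \sim t_\nu(\sqrt{M}\zeta), \tag{4.14}$$

where $t_\nu(\sqrt{M}\zeta)$ denotes a noncentral T-distribution with $\nu$ degrees of freedom and the noncentrality parameter $\sqrt{M}\zeta$. Note that in the case when $n_1 = n_2 = n$ and $\sigma_1 = \sigma_2$, relation (4.14) holds exactly with M = n and $\nu = 2(n - 1)$ and leads to the confidence intervals suggested by Owen et al. (1964). If $n_1 \ne n_2$ or $\sigma_1 \ne \sigma_2$, we estimate M and $\nu$ by

$$\hat M = \frac{s_1^2 + s_2^2}{s_1^2/n_1 + s_2^2/n_2}, \quad \hat\nu = \frac{(s_1^2 + s_2^2)^2}{s_1^4/(n_1 - 1) + s_2^4/(n_2 - 1)},$$

and arrive at the second approximation

$$\sqrt{\hat M}\tilde\zeta \sim t_{\hat\nu}(\sqrt{\hat M}\zeta). \tag{4.15}$$

Note, however, that M does not require estimation if $n_1 = n_2$. Let $\zeta_{\gamma,3}$ be the solution of the equation

$$P\left(t_{\hat\nu}(\sqrt{\hat M}\zeta_{\gamma,3}) < \sqrt{\hat M}\tilde\zeta\right) = 1 - \gamma. \tag{4.16}$$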

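Then, the $(1 - \gamma)$ approximate lower confidence bound for R is $\Phi(\zeta_{\gamma,3})$. Similarly to (4.8), equation (4.16) can in general be solved only numerically. Applying approximation (4.9) to (4.15) we derive

$$\sqrt{\hat M}(\tilde\zeta - \zeta) / [1 + \hat M\tilde\zeta^2/(2\hat\nu)]^{1/2} \sim N(0, 1).$$

Thus, a $(1 - \gamma)$-level approximate confidence bound $\zeta_{\gamma,4}$ can be obtained as

$$\zeta_{\gamma,4} = \tilde\zeta - \sqrt{1/\hat M + \tilde\zeta^2/(2\hat\nu)} \; z_\gamma. \tag{4.17}$$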
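The closed-form bound (4.17) is simple to compute. The sketch below is an illustration under stated assumptions, not the authors' code. The statistic is taken as the difference of the sample means divided by the square root of the sum of the two sample variances. M and $\nu$ are estimated by plugging the sample variances into (4.11) and into a Satterthwaite-type formula, a choice that reproduces the special case M = n, $\nu$ = 2(n - 1) mentioned in the text for equal sample sizes and variances. The sample variances use the divisor n - 1, assumed to agree with (3.3). All names are hypothetical.

```python
import math
from statistics import NormalDist, fmean, variance


def approximate_lower_bound(xs, ys, gamma=0.05):
    """Approximate (1 - gamma) lower confidence bound for R = P(X < Y), as in (4.17)."""
    n1, n2 = len(xs), len(ys)
    s1_sq, s2_sq = variance(xs), variance(ys)  # divisor n - 1
    s_sq = s1_sq + s2_sq
    zeta = (fmean(ys) - fmean(xs)) / math.sqrt(s_sq)
    m_hat = s_sq / (s1_sq / n1 + s2_sq / n2)
    nu_hat = s_sq ** 2 / (s1_sq ** 2 / (n1 - 1) + s2_sq ** 2 / (n2 - 1))
    z_gamma = NormalDist().inv_cdf(1 - gamma)  # (1 - gamma) quantile
    zeta_low = zeta - math.sqrt(1 / m_hat + zeta ** 2 / (2 * nu_hat)) * z_gamma
    return {
        "M": m_hat,
        "nu": nu_hat,
        "zeta": zeta,
        "zeta_low": zeta_low,
        "R_low": NormalDist().cdf(zeta_low),
    }


if __name__ == "__main__":
    stress = [9.1, 10.4, 8.7, 11.0, 9.8, 10.2]  # made-up data
    strength = [11.2, 12.9, 10.1, 12.4, 11.5]
    print(approximate_lower_bound(stress, strength))
```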

The corresponding bound on R is $\Phi(\zeta_{\gamma,4})$. Since 1/M varies very little for large M, the result (4.17) indicates that for moderately large M even a substantial error in the choice of M will not significantly change the numerical value of the lower bound $\Phi(\zeta_{\gamma,3})$ or $\Phi(\zeta_{\gamma,4})$. Teskin and Kostyukova (1991) also studied some particular cases of the above problem, namely, when the ratio $a = \sigma_2/\sigma_1$ is known. In this situation, $\nu = (a^2 + 1)^2/(1/(n_1 - 1) + a^4/(n_2 - 1))$ and the confidence bound follows trivially from the general case. We leave construction of a confidence interval in this situation as an exercise.

Remark. Although the confidence interval (4.16) is approximate, we have discussed its construction together with exact confidence intervals for two reasons. Firstly, the idea underlying the development of (4.16) is akin to the derivation of the exact confidence interval (4.8). Secondly, for $n_1 = n_2$ and $\sigma_1/\sigma_2$ known, the values of M and $\nu$ are available, so that the confidence interval (4.16) is exact in this case. Note that the condition $n_1 = n_2$ substantially simplifies the arguments and the results related to construction of confidence intervals.

4.1.3 The Gamma Distribution

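Let X and Y be independent random variables with the gamma pdfs Gamma($\alpha_1$, $\sigma_1$) and Gamma($\alpha_2$, $\sigma_2$), where $\alpha_1$ and $\alpha_2$ are known positive parameters (see the definition in (3.9)). Denote

$$\varrho = \sigma_2/\sigma_1, \quad \hat\varrho = (\alpha_1 \bar Y)/(\alpha_2 \bar X). \tag{4.18}$$

It is well known (see e.g. Johnson et al. (1994), Chapter 18) that $2 n_1 \bar X/\sigma_1 \sim$ Gamma($n_1 \alpha_1$, 2) $= \chi^2_{2 n_1 \alpha_1}$ and, similarly, $2 n_2 \bar Y/\sigma_2 \sim \chi^2_{2 n_2 \alpha_2}$, where $\chi^2_a$ denotes the chi-squared distribution with a degrees of freedom. Hence,

$$\frac{\hat\varrho}{\varrho} = \frac{2 n_2 \bar Y/(\sigma_2 n_2 \alpha_2)}{2 n_1 \bar X/(\sigma_1 n_1 \alpha_1)} = \frac{\chi^2_{2 n_2 \alpha_2}/(2 n_2 \alpha_2)}{\chi^2_{2 n_1 \alpha_1}/(2 n_1 \alpha_1)} \sim F(2 n_2 \alpha_2, 2 n_1 \alpha_1), \tag{4.19}$$

where F(a, b) denotes Snedecor's F-distribution with a and b degrees of freedom. Analogously, $\varrho/\hat\varrho \sim F(2 n_1 \alpha_1, 2 n_2 \alpha_2)$. For any $\lambda$ denote by $F_\lambda = F_\lambda(2 n_1 \alpha_1, 2 n_2 \alpha_2)$ the $(1 - \lambda)$ quantile (i.e., the $\lambda$-cut-off point) of the $F(2 n_1 \alpha_1, 2 n_2 \alpha_2)$ distribution. Recall also that the $(1 - \lambda)$ quantile of the $F(2 n_2 \alpha_2, 2 n_1 \alpha_1)$ distribution is related to $F_\lambda$ as follows (see e.g. Johnson et al. (1995), Chapter 27):

$$F_\lambda(2 n_2 \alpha_2, 2 n_1 \alpha_1) = [F_{1-\lambda}(2 n_1 \alpha_1, 2 n_2 \alpha_2)]^{-1} = 1/F_{1-\lambda}.$$

Let $\gamma_1$ and $\gamma_2$ be nonnegative numbers such that $\gamma_1 + \gamma_2 = \gamma$. Then,

$$P(\hat\varrho F_{1-\gamma_2} < \varrho < \hat\varrho F_{\gamma_1}) = 1 - \gamma. \tag{4.20}$$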

Recall now that R = P(X
(c*i ,012)
I gfT1 (c*i, ot2) 1 = 1 — 7 .

(4-21)

The confidence interval (4.21) was originally derived by Constantine et al. (1986). In the case $\alpha_1 = \alpha_2 = 1$, formula (4.21) provides the $(1 - \gamma)$-interval for the one-parameter exponential distribution (4.22), which coincides with the interval derived by Enis and Geisser (1971) (see (2.74)). Note also that if a one-sided confidence interval is required, it is sufficient to choose $\gamma_1 = 0$ or $\gamma_2 = 0$ in (4.21) or (4.22).

4.1.4 The Generalized Gamma Distribution

Let X and Y be independent random variables with the generalized gamma distributions GG3($\sigma_1$, $\alpha_1$, $\beta_1$) and GG3($\sigma_2$, $\alpha_2$, $\beta_2$), respectively, where the pdf of GG3($\sigma$, $\alpha$, $\beta$) is defined in (3.34). If the parameters $\alpha_1$, $\alpha_2$, $\beta_1$ and $\beta_2$ are known, we can derive exact confidence intervals for R, thus providing a technique for constructing exact confidence intervals when X and Y belong to various distribution families (for example, Weibull - Weibull, gamma - Weibull or Weibull - chi-squared). Our goal is to obtain a $(1 - \gamma)$ confidence interval $(R_L, R_U)$ such that $P(R < R_L) = \gamma_1$, $P(R > R_U) = \gamma_2$ and $\gamma_1 + \gamma_2 = \gamma$. The above construction, carried out in Pensky and Takashima (2002), ensures that $P(R \in (R_L, R_U)) = 1 - \gamma$ and allows one to derive one-sided confidence intervals by letting $\gamma_1 = 0$ or $\gamma_2 = 0$.

Let Si and S2 be given by (3.77) and observe that Sf1 and S^2 have independent gamma distributions Gamma f B^£L, cr^1J and Gamma ( s^z,

respectively (see (3.9)). Note that {S1/a1)^

~ Gamma ( ^ , l ) ,

( S 2 / * 2 ) * ~ Gamma {^,

l) . (4.23)

Thus, the cdf of the ratio C, = {Si/ai)/{S2/a2)

is of the form

Q((z) = HG(Pi,P2,niai, n2a2,z),

(4.24)

where Ho(/3i,f32,niai, n 2 a 2 ,z) is defined in (3.35), and this is free of the unknown parameters o\ and
< Ci_72) = 1 - 7-

Solving the last inequality for <JI/G2 and taking into account that R = H(Pi,[32,ai,ot2,o'2/cri) is an increasing function of <J2/GI, we arrive at (1 — 7) confidence interval for R a

„.

„.

Cl-72S'2

Note that to construct a one-sided confidence interval one just needs to set 71 or 7 2 to zero. Construction of the confidence interval above involves evaluation of quantiles £7 of the distribution Q^ which reduces to numerical solution of the equation H{j31,(52,n1ai,n2ct2,z)

= i.

(4.25)

However, if /?i = /32 = (3, one can bypass solution of equation (4.25) and use standard distribution tables. Let Ti = /3i 5^ / n i an<^ T2 = /32 S^ ln2. Observe that Gamma(niai//3, (jj),

n2T2//3 ~ Gamma(n2a2//3,cr^),

so that 2niTi/(/3irf) and 2n2T2/(Pa23) have chi-squared distributions with 2nioti/(3 and 2n2a2/(3 degrees of freedom, respectively. Thus, a \0 — )

~F(2niai/P,2n2a2/f3),


where F(a, b) is the Snedecor's F-distribution with a and 6 degrees of freedom. If F 7 l and Fj_ 72 are quantiles of F(2niai//3,27120:2//3) distribution, then J

C7i

[

Txa2

J

(4.26) K '

)

In terms of statistics Si and S2, (4.26) can be rewritten as |V/3 o

(4.27) (4-27)

i.e. C71 = 17laini/(o;2n2)]1/;3 and
The Burr Type X Distribution

To the best of the authors' knowledge, no other algorithms for construction of exact confidence intervals have been suggested in the literature. However, Theorem 2.10 allows one to derive exact confidence intervals whenever X and Y can be transformed to normal, gamma or one-parameter exponential variables with the help of a monotone one-to-one transformation (see Table 2.1). For example, let X and Y possess BurrX($\alpha_1$) and BurrX($\alpha_2$) pdfs defined in (3.28). The monotonically decreasing function $v(x) = -\ln(1 - e^{-x^2})$ converts the BurrX($\alpha$) random variable into a one-parameter exponential random variable. Let $F_{\gamma_1}$ and $F_{1-\gamma_2}$ be the $(1 - \gamma_1)$ and $\gamma_2$ quantiles of the $F(2 n_1, 2 n_2)$ distribution, respectively, and let Q =

Then, by Theorem 2.10, the $(1 - \gamma)$-confidence interval for R is of the form

$$P\left((Q F_{\gamma_1} + 1)^{-1} < R < (Q F_{1-\gamma_2} + 1)^{-1}\right) = 1 - \gamma.$$

4.2 Asymptotic Confidence Intervals

In this section we shall discuss the popular and widespread construction of confidence intervals based on normal approximations. The technique has been described in Section 2.4.3; however, the main difficulty of its implementation, namely estimation of the variance or the MSE of the estimator, was left aside. In what follows, we start with the normal distribution, in which case the variance can be easily approximated. We then proceed to the left-truncated exponential families, for which an explicit estimator of Var($\tilde R$) is available (see Section 3.2.4). We conclude the section with an example of interval estimation of R = P(X < Y) for the two-parameter exponential families, where the approximate confidence interval is based on the asymptotic expression for MSE($\hat R$).

4.2.1 The Normal Distribution

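Let X and Y be independent normal variables with the pdfs $N(\mu_1, \sigma_1^2)$ and $N(\mu_2, \sigma_2^2)$, and let the parameters of the stress X, $\mu_1$ and $\sigma_1$, be known. In this situation, Church and Harris (1970) suggest a procedure for interval estimation of R based on the statistic

$$T = (\bar Y - \mu_1) / \sqrt{\sigma_1^2 + s_Y^2},$$

where $s_Y$ is defined in (3.3). It is easy to verify that with probability one

$$T \to (\mu_2 - \mu_1) / \sqrt{\sigma_1^2 + \sigma_2^2} \quad \text{as } n_2 \to \infty.$$

Recall that $\bar Y$ and $s_Y$ are independent and that $(n_2 - 1) s_Y^2/\sigma_2^2$ has the $\chi^2_{n_2 - 1}$ distribution, so that $\mathrm{Var}(s_Y^2) = 2\sigma_2^4/(n_2 - 1)$ (see e.g. Casella and Berger (1990)). Therefore, T is asymptotically normal with the mean $(\mu_2 - \mu_1)/\sqrt{\sigma_1^2 + \sigma_2^2}$ and the variance

$$\sigma_T^2 = \frac{1}{\sigma_1^2 + \sigma_2^2} \left[ \frac{\sigma_2^2}{n_2} + \frac{(\mu_2 - \mu_1)^2 \sigma_2^4}{2 (n_2 - 1)(\sigma_1^2 + \sigma_2^2)^2} \right].$$

Hence,

$$P\left( T < (\mu_2 - \mu_1)/\sqrt{\sigma_1^2 + \sigma_2^2} + z_\gamma \sigma_T \right) = 1 - \gamma, \tag{4.28}$$

where $z_\gamma$ is the $(1 - \gamma)$ quantile of the standard normal distribution. Taking into account that R is of the form (3.1) and that $\Phi(\cdot)$ is a monotone function of its argument, we obtain $P(R > \Phi(T - z_\gamma \sigma_T)) = 1 - \gamma$. Replacing $\sigma_T^2$ by its estimator

$$\hat\sigma_T^2 = \frac{1}{\sigma_1^2 + s_Y^2} \left[ \frac{s_Y^2}{n_2} + \frac{T^2 s_Y^4}{2 (n_2 - 1)(\sigma_1^2 + s_Y^2)} \right],$$

we arrive at

$$P(R > \Phi(T - z_\gamma \hat\sigma_T)) = 1 - \gamma. \tag{4.29}$$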
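A lower bound of the form (4.29) can be computed as in the following sketch. It rests on assumptions made for this illustration: the sample variance of Y uses the divisor $n_2 - 1$, and the variance of T is estimated by the delta-method expression obtained from $\mathrm{Var}(\bar Y) = \sigma_2^2/n_2$ and $\mathrm{Var}(s_Y^2) = 2\sigma_2^4/(n_2 - 1)$ with $\sigma_2^2$ replaced by $s_Y^2$. The function name and data are hypothetical.

```python
import math
from statistics import NormalDist, fmean, variance


def asymptotic_lower_bound(ys, mu1, sigma1, gamma=0.05):
    """Asymptotic (1 - gamma) lower bound for R = P(X < Y), stress parameters known.

    ys     -- observed strengths
    mu1    -- known mean of the stress X
    sigma1 -- known standard deviation of the stress X
    """
    n2 = len(ys)
    s_sq = variance(ys)             # sample variance, divisor n2 - 1
    denom = sigma1 ** 2 + s_sq
    t = (fmean(ys) - mu1) / math.sqrt(denom)
    # Delta-method estimate of Var(T); see the assumptions stated above.
    var_t = (s_sq / n2 + t * t * s_sq ** 2 / (2 * (n2 - 1) * denom)) / denom
    z_gamma = NormalDist().inv_cdf(1 - gamma)
    return NormalDist().cdf(t - z_gamma * math.sqrt(var_t))


if __name__ == "__main__":
    strengths = [11.2, 12.9, 10.1, 12.4, 11.5, 13.0]  # made-up data
    print(asymptotic_lower_bound(strengths, mu1=9.5, sigma1=1.0))
```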

This is the asymptotic confidence interval derived originally by Church and Harris (1970). Note that the lower confidence bound in (4.29) coincides with the lower bound $\Phi(\zeta_{\gamma,4})$ of Reiser and Guttman (1986) (cf. (4.17)) when $n_1 \to \infty$.

4.2.2 The Left-Truncated Exponential Distribution

In the case of the normal distribution we are in the fortunate situation that the estimator of R is a monotone function of a statistic whose variance is relatively easy to estimate. Since this is not always the case, alternative techniques have been developed where the confidence intervals are based on the UMVUE $\tilde R$ of R and the UMVUE of its variance (see Theorems 2.5 and 2.6). Let X and Y be independent random variables with the pdfs Ex($\mu_1$, $\sigma_1$) and Ex($\mu_2$, $\sigma_2$) (see (3.5)) where parameters $\sigma_1$ and $\sigma$
(R

- z7/2<7 < R < R + -Z7/2
The Two-parameter Exponential Distribution

In the case of the left-truncated exponential distribution with a known scale parameter one can derive an explicit expression for $\sigma^2 = \mathrm{Var}(\tilde R)$ and then use $\tilde R$ and $\sigma^2$ for constructing an asymptotic confidence interval for R. However, usually, unbiased estimation of Var($\tilde R$) results in a four-dimensional integration which may be somewhat tricky even if performed numerically. In this situation, the procedure we followed in the previous subsection simply would not work. A solution to this difficulty is to base our interval estimator on the MLE $\hat R$ and its asymptotic MSE. Let the parameter $\theta$ in (2.1) be a k-dimensional vector $\theta = (\theta_1, \ldots, \theta_k)$, i.e. $R = R(\theta)$, where $R(\theta)$ is calculated by means of one of the formulae (2.5), (2.6) or (2.9). Then $\hat R = R(\hat\theta)$, where $\hat\theta$ is the MLE of $\theta$. Hence, using the Taylor expansion up to order two, the MSE of $\hat R$ (see (2.11)) can be represented as

$$\mathrm{MSE}(\hat R) \approx \sum_{i=1}^{k} \sum_{j=1}^{k} \frac{\partial R}{\partial \theta_i} \frac{\partial R}{\partial \theta_j} \, E[(\hat\theta_i - \theta_i)(\hat\theta_j - \theta_j)], \tag{4.30}$$

where $\approx$ stands for "approximately equal". Replacing $\theta_j$, $j = 1, \ldots, k$, in (4.30) by their estimators, we obtain the MLE of the asymptotic MSE of $\hat R$.

Gupta and Subramanian (1998) and Gupta et al. (1999) constructed confidence intervals based on the asymptotic MSE in the case when $X$ and $Y$ are normal variables, either dependent or independent, with a common coefficient of variation: $\mu_1/\sigma_1 = \mu_2/\sigma_2$. In what follows, however, we shall consider an example when $X$ and $Y$ have independent two-parameter exponential distributions, delegating the results of Gupta and Subramanian (1998) and Gupta et al. (1999) to the exercise section. In our opinion, the case of the two-parameter exponential distributions is of special importance since it allows us to derive interval estimators for the Pareto and the power distributions by means of one-to-one transformations (see Theorem 2.10 and Table 2.1).

Let $X$ and $Y$ be independent variables with the pdfs $\mathrm{Ex}(\mu_1,\sigma_1)$ and $\mathrm{Ex}(\mu_2,\sigma_2)$, respectively, where $\mathrm{Ex}(\mu,\sigma)$ is defined in (3.5). For the purpose of this section, it will be easier to adhere to a somewhat different parameterization, namely, $\alpha_j = \sigma_j^{-1}$, $j = 1, 2$. Then $X_{(1)}$, $T_X = \bar X - X_{(1)}$, $Y_{(1)}$ and $T_Y = \bar Y - Y_{(1)}$ are both sufficient statistics and the MLEs of $\mu_1$, $\alpha_1$, $\mu_2$ and $\alpha_2$, respectively, namely

$$\hat\mu_1 = X_{(1)}, \quad \hat\mu_2 = Y_{(1)}, \quad \hat\alpha_1 = T_X, \quad \hat\alpha_2 = T_Y. \qquad (4.31)$$

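Moreover (see, e.g., Voinov and Nikulin (1993)), all these statistics are mutually independent with

$$\hat\mu_j \sim \mathrm{Ex}(\mu_j,\, n_j/\alpha_j), \qquad \hat\alpha_j \sim \mathrm{Gamma}(\alpha_j/n_j,\, n_j - 1), \quad j = 1, 2.$$

Thus, it is easy to calculate (see, e.g., Casella and Berger (1990)) that $E\hat\mu_j = \mu_j + \alpha_j/n_j$, $E\hat\alpha_j = (n_j-1)\alpha_j/n_j$, $\mathrm{Var}(\hat\mu_j) = (\alpha_j/n_j)^2$, $\mathrm{Var}(\hat\alpha_j) = (n_j-1)(\alpha_j/n_j)^2$, $j = 1, 2$, so that

$$E(\hat\mu_j - \mu_j)^2 = 2\left(\frac{\alpha_j}{n_j}\right)^2 \quad\text{and}\quad E(\hat\alpha_j-\alpha_j)^2 = \frac{\alpha_j^2}{n_j}, \quad j = 1, 2. \qquad (4.32)$$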
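The two mean squared errors in (4.32) can be checked by simulation. The sketch below is only an illustration: it assumes that $\alpha_j$ acts as a scale parameter, so that an observation can be generated as $\mu + \alpha E$ with $E$ standard exponential, and the function name `simulate_mse` and all numerical settings are hypothetical choices, not part of the original text.

```python
import random

def simulate_mse(mu, alpha, n, reps, seed=0):
    """Monte Carlo estimates of E(mu_hat - mu)^2 and E(alpha_hat - alpha)^2."""
    rng = random.Random(seed)
    se_mu = 0.0
    se_alpha = 0.0
    for _ in range(reps):
        # X = mu + alpha * E with E standard exponential (alpha is the scale).
        x = [mu + alpha * rng.expovariate(1.0) for _ in range(n)]
        mu_hat = min(x)                    # X_(1), the smallest observation
        alpha_hat = sum(x) / n - mu_hat    # sample mean minus X_(1)
        se_mu += (mu_hat - mu) ** 2
        se_alpha += (alpha_hat - alpha) ** 2
    return se_mu / reps, se_alpha / reps

mse_mu, mse_alpha = simulate_mse(mu=1.0, alpha=2.0, n=10, reps=20000)
print(mse_mu, 2 * (2.0 / 10) ** 2)   # simulated value vs. 2 (alpha / n)^2
print(mse_alpha, 2.0 ** 2 / 10)      # simulated value vs. alpha^2 / n
```

With these settings the location error is of order $n^{-2}$ while the scale error is of order $n^{-1}$, which is the comparison used in the argument that follows.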

Denote $N = n_1 + n_2$ and assume that $\lim_{N\to\infty} n_1/N = \lambda$ with $0 < \lambda < 1$. Then, it follows from (4.32) that

$$E(\hat\mu_i - \mu_i)^2 = O(N^{-2}) = o\big(E(\hat\alpha_j - \alpha_j)^2\big).$$

Moreover, since all the statistics in (4.31) are independent, the absolute values of expectations of all cross-products are the same,

$$\big|E[(\hat\mu_i-\mu_i)(\hat\mu_j-\mu_j)]\big| = \big|E[(\hat\alpha_i-\alpha_i)(\hat\alpha_j-\alpha_j)]\big| = \big|E[(\hat\mu_i-\mu_i)(\hat\alpha_j-\alpha_j)]\big| = \frac{\alpha_i\alpha_j}{n_i n_j} = O(N^{-2}), \quad i \ne j.$$

Therefore, when $N$ is large, the only terms left in the asymptotic MSE (4.30) are the terms with $\partial R/\partial\alpha_1$ and $\partial R/\partial\alpha_2$. In the case of the two-parameter exponential distribution $R(\theta)$ is of the form (3.6). Denote by $C_j$ the derivative $\partial R/\partial\alpha_j$, $j = 1, 2$, at the point

c

=

. ^ V & F p[ ^ ] ,

1 s

j

[ f e ^0}

ih x

\ ^\

if

if~H>

where $i = 2$ if $j = 1$ and $i = 1$ if $j = 2$. Now, utilizing formula (4.30), we arrive at the asymptotic MSE of $\hat R$ of the form $N^{-1}\sigma_R^2$ where

$$\sigma_R^2 = \lambda^{-1} C_1^2\alpha_1^2 + (1-\lambda)^{-1} C_2^2\alpha_2^2.$$

Then, the asymptotic two-sided confidence interval for $R$ is

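$X$ and strength $Y$ are independent with $\theta = (\theta_1, \theta_2)$, where the vector or scalar parameter $\theta_1$ ($\theta_2$) is the parameter of $X$ ($Y$) only. We shall describe Johnson's (1988) technique in the case when $\theta_1$ and $\theta_2$ are scalars. Let $\hat\theta_1$ and $\hat\theta_2$ be the MLEs of $\theta_1$ and $\theta_2$ based on independent samples $\underline{X}$ and $\underline{Y}$ such that

$$\sqrt{n_j}\,(\hat\theta_j - \theta_j) \sim N\big(0,\, I_j^{-1}(\theta_j)\big), \quad j = 1, 2, \qquad (4.33)$$

where, as before, $\sim$ means "asymptotically distributed as" and $I_j(\theta_j)$ are the Fisher informations, $j = 1, 2$,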

Here, fx(x\@i) and fy(y\92) are the pdfs of X and Y, respectively (see (2.1)). Note that under assumptions (4.33) one can write +Cg

where as before N = n\ + 712. Hence, under suitable regularity conditions (including the interchangeability of integration and differentiation operators)

l - FY(y\e2)\

A

where $F_X(x\,|\,\theta_1)$ and $F_Y(y\,|\,\theta_2)$ are the cdfs of $X$ and $Y$, respectively (see (2.6)). If, as before, $\lim_{N\to\infty} n_1/N = \lambda$ with $0 < \lambda < 1$, then

uh) - R(91,e2)} ~ where the asymptotic variance
a\ = [Xh{h)}-1 C\ + [(1 -

^

4.3 Bayesian Credible Sets

In this section we shall provide examples of the construction of Bayesian credible sets for $R$. The general method for derivation of Bayesian credible sets has been described in Section 2.4.4. However, there are two difficulties in carrying out this procedure. The first one is to obtain the posterior pdf of $R$, which may not be available even after the Bayes estimator (2.46) is constructed. The second challenge is a numerical solution of the system of equations (2.63) and (2.64). Below we shall present two examples illustrating these issues.

4.3.1 The Normal Distribution: Independent Variables

We now return to the problem of estimation of $R = P(B'X > 0)$, studied by Enis and Geisser (1971), where $X = (X_1, \ldots, X_k)'$ is the normal vector with the mean $\mu' = (\mu_1, \ldots, \mu_k)$ and covariance matrix $\sigma^2 I$, $I$ being the $(k \times k)$ identity matrix. In Section 3.3.1 we have derived a Bayes estimator of $R$ of the form (3.88) based on the statistics (3.83) and (3.84). However, this estimator has been obtained using the Bayes predictive approach and does not allow an immediate construction of a Bayes credible set for $R$ since the posterior pdf of $R$ has not been evaluated. For a derivation of the $(1-\gamma)$-credible set for $R$ we return to Enis and Geisser (1971). Recall that the joint posterior density of $\mu$ and $\sigma^2$ is given by (3.85). Therefore, if $\sigma$ is fixed,

$$p(\mu\,|\,\sigma, \bar X, s^2) \propto \sigma^{-k}\exp\left\{-(\bar X-\mu)'A(\bar X-\mu)/(2\sigma^2)\right\}. \qquad (4.34)$$

Denote

$$\zeta = \frac{B'\mu}{\sigma\sqrt{B'B}}, \qquad v = \frac{(N-k)s^2}{\sigma^2}, \qquad b^2 = \frac{B'A^{-1}B}{B'B}. \qquad (4.35)$$

Here, and in the previous formula, $\bar X$, $N$ and $s^2$ are as defined in (3.83) and (3.84), respectively, and $A$ is the $(k \times k)$ diagonal matrix with $A_{ii} = n_i$. Recall that $R = \Phi(\zeta)$ and the standard normal cdf $\Phi(z)$ is increasing in $z$. Therefore, a $(1-\gamma)$-credible interval for $R$ will be of the form $(\Phi(\zeta_{1\gamma}), \Phi(\zeta_{2\gamma}))$, where $(\zeta_{1\gamma}, \zeta_{2\gamma})$ is the credible interval for $\zeta$, i.e.

$$P(\zeta_{1\gamma} < \zeta < \zeta_{2\gamma}\,|\,\bar X, s^2) = 1-\gamma. \qquad (4.36)$$

To obtain the posterior pdf of $\zeta$, note that (4.34) implies that

(4.37)

Integrating $\mu$ out of the posterior pdf (3.85), we obtain $p(\sigma\,|\,s^2) \propto \sigma^{-(N-k+1)}\exp[-(N-k)s^2/(2\sigma^2)]$. Therefore, the statistic $v$ given $s^2$ has the $\chi^2_{N-k}$ distribution, and, since $\mu$ and $\sigma$ are a-posteriori independent (see (3.85)), it follows from (4.37) that

(4.38)

where

$$T = \frac{B'\bar X}{s\sqrt{(N-k)B'B}}, \quad b > 0. \qquad (4.39)$$

To arrive at the posterior pdf of $\zeta$ we need only to integrate (4.38) with respect to $v$. For this purpose, we combine the terms containing $v$ in the argument of the exponent in (4.38) and expand $\exp(\zeta T\sqrt{v}/b^2)$ into a power series. Integrating the resulting series term-by-term, we obtain

(4.40)

The expression (4.40) can now be used for a numerical evaluation of the values of $\zeta_{1\gamma}$ and $\zeta_{2\gamma}$ ensuring (4.36). However, simultaneous solution of equations (2.63) and (2.64) leads to rather complicated calculations. To avoid this difficulty, one may use the approximation suggested by Enis and Geisser (1971) who observed that

(4.41)

(4.42)

and, moreover, that for fixed values of $\bar X/s$ and for large values of any one of the $n_i$, $i = 1, \ldots, k$, the standardized variable $\xi = (\zeta - E\zeta)/\sqrt{\mathrm{Var}\,\zeta}$ is asymptotically normally distributed $N(0,1)$. Consequently, the approximate $(1-\gamma)$ credible set for $\zeta$ is $\left(E\zeta - z_{\gamma/2}\sqrt{\mathrm{Var}\,\zeta},\ E\zeta + z_{\gamma/2}\sqrt{\mathrm{Var}\,\zeta}\right)$.


Note that if there were just two independent variables X and Y with equal variances a2, then (4.40) - (4.42) are valid with k = 2, r and b2 reduced to -

- n2 - 2) s

(4.43)

and s2 given in (3.89).

4.3.2 The Weibull Distribution

Let $X$ and $Y$ be two independent Weibull random variables with a common shape parameter $\alpha$ and scale parameters $\sigma_1$ and $\sigma_2$, respectively, where all three parameters are unknown. In Theorem 3.2 of Section 3.3.3 we have provided the posterior pdfs of $R = P(X < Y)$ when the priors on $\alpha$, $\sigma_1$ and $\sigma_2$ are reference or matching priors. However, the expressions (3.96) and (3.97) for the posterior pdfs are rather complex, so that it is difficult to evaluate the performance of Bayesian credible sets based on (3.96) and (3.97). Hence, it is worthwhile to consider a numerical example.

In our study relatively small samples $\underline{X}$ = (0.1496, 0.2641, 0.3349, 0.1806, 0.1533, 0.1555, 0.2591, 0.0980, 0.3008, 0.2509) and $\underline{Y}$ = (0.7750, 0.2894, 0.3138, 0.2759, 0.2598, 0.7934, 0.2110, 0.4193, 0.4757, 0.5352, 0.7553, 0.2396) with $n_1 = 10$ and $n_2 = 12$ are drawn from Weibull(2, 0.25) and Weibull(2, 0.5) populations, respectively. It is easy to calculate that the true value of $R = P(X < Y)$ in this case is $R = 0.8$. For a Bayesian analysis it would seem reasonable to assume that, barring supplementary information, the nuisance parameters are of equal inferential importance, which leads us to $j = 1$ in (3.93) and (3.96). Performing numerical integration in (3.96), we arrive at the posterior pdf $p^*(R) = p_{R1}(R\,|\,\underline{X},\underline{Y})$ presented in Figure 4.1 and the Bayes estimate 0.818 of $R$. Since $p^*(R)$ is unimodal, the $(1-\gamma)$ confidence set is an interval $(R_1, R_2)$ where (see (2.63) and (2.64)) $R_1$ and $R_2$ are solutions of the following system of equations

$$p^*(R_1) = p^*(R_2), \qquad F^*(R_2) - F^*(R_1) = 1-\gamma.$$

Here, $F^*(R) = \int_0^R p^*(x)\,dx$ is the posterior cdf of $R$. In our numerical example $\gamma = 0.1$ and the confidence interval for $R$ is (0.70, 0.94). For comparison, the MLE of $R$ for the data above is $\hat R = 0.833$.

Fig. 4.1 The posterior pdf.
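The system $p^*(R_1) = p^*(R_2)$, $F^*(R_2) - F^*(R_1) = 1 - \gamma$ can be solved approximately on a grid by collecting points in decreasing order of density until the required mass is reached. The sketch below illustrates this idea only. The posterior (3.96) is not reproduced here, so a Beta density is used as a purely hypothetical stand-in for a unimodal posterior on (0, 1), and the names `hpd_interval` and `beta_pdf` are assumptions made for the illustration.

```python
import math

def hpd_interval(pdf, gamma, grid_size=20001):
    """Grid approximation to the highest-density interval on (0, 1).

    For a unimodal density this approximates the solution (R1, R2) of
    p(R1) = p(R2), F(R2) - F(R1) = 1 - gamma.
    """
    h = 1.0 / (grid_size - 1)
    grid = [i * h for i in range(grid_size)]
    dens = [pdf(r) for r in grid]
    total = sum(dens) * h                      # normalizing constant (Riemann sum)
    order = sorted(range(grid_size), key=lambda i: dens[i], reverse=True)
    mass, lo, hi = 0.0, 1.0, 0.0
    for i in order:                            # add points from the highest density down
        mass += dens[i] * h / total
        lo, hi = min(lo, grid[i]), max(hi, grid[i])
        if mass >= 1.0 - gamma:
            break
    return lo, hi

def beta_pdf(a, b):
    """Stand-in posterior: Beta(a, b) density (hypothetical, not formula (3.96))."""
    log_c = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    def pdf(r):
        if r <= 0.0 or r >= 1.0:
            return 0.0
        return math.exp(log_c + (a - 1) * math.log(r) + (b - 1) * math.log(1 - r))
    return pdf

posterior = beta_pdf(9.0, 3.0)
r1, r2 = hpd_interval(posterior, gamma=0.1)
print(round(r1, 3), round(r2, 3))
```

For a skewed posterior the resulting interval is not symmetric about the mode, which is why the two equations have to be solved jointly rather than by cutting off equal tail probabilities.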

4.4 Hypothesis Testing

Hypothesis testing is an important and somewhat delicate area of statistical inference. Some authors consider it to be equivalent to the theory of confidence intervals and indeed we shall confirm this in our discussion. In this section we shall consider testing hypotheses about R. We shall start with tests based on the confidence intervals for R derived above. Although in the subsequent subsection we shall consider only the use of exact confidence intervals, asymptotic confidence intervals can evidently be utilized as well. Next we shall describe tests based on generalized p-values introduced by Tsui and Weerahandi (1989) and implemented by Weerahandi and Johnson (1992). The last topic of this section deals with Bayesian tests. The reader is encouraged to compare them with the Bayesian credible intervals discussed above.

4.4.1 Tests Based on Exact Confidence Intervals

As follows from Section 2.4.5, in order to test a hypothesis $H_0 : R = R_0$ one is required to construct a two-sided $(1-\gamma)$-confidence interval for $R$ of the form $(L(\underline{X},\underline{Y}), U(\underline{X},\underline{Y}))$ and reject $H_0$ whenever $R_0$ lies outside this interval. Similarly, if $(L(\underline{X},\underline{Y}), 1)$ is a one-sided $(1-\gamma)$-confidence interval for $R$, then the test which rejects $H_0 : R \le R_0$ whenever $R_0 < L(\underline{X},\underline{Y})$ is a size $\gamma$ test. We shall demonstrate construction of these tests based on confidence intervals derived in Section 4.1.1.

Example 4.1 The Normal distribution: dependent variables. Let $(X, Y)$ be a bivariate normal vector with unknown mean $(\mu_1, \mu_2)$ and unknown covariance matrix $\Sigma$. In Section 4.1.1 we have constructed a $(1-\gamma)$-confidence interval $(\Phi(\zeta_{\gamma,1}), 1)$ for $R$ where $\zeta_{\gamma,1}$ is the solution of the equation (4.8). Therefore, the test which rejects the hypothesis $H_0 : R \le R_0$ whenever $R_0 < \Phi(\zeta_{\gamma,1})$ is a size $\gamma$ test. Also, since equation (4.8) is difficult to solve, one can use an approximate lower confidence bound $\Phi(\zeta_{\gamma,2})$ where $\zeta_{\gamma,2}$ is defined in (4.10).

Example 4.2 The Normal distribution: independent variables. Let $X$ and $Y$ be independent normal variables with unknown means and variances. In Section 4.1.2 we have derived a $(1-\gamma)$ lower confidence bound $\Phi(\zeta_{\gamma,3})$ for $R$ where $\zeta_{\gamma,3}$ is the solution of the equation (4.16). Therefore, analogously to the previous example, the test which rejects the hypothesis $H_0 : R \le R_0$ whenever $R_0 < \Phi(\zeta_{\gamma,3})$ is a size $\gamma$ test. Again, as above, to avoid tedious calculation of $\zeta_{\gamma,3}$, one might prefer to use an approximate lower bound $\Phi(\zeta_{\gamma,4})$ based on the normal approximation to the noncentral $T$-distribution. The value of $\zeta_{\gamma,4}$ is given by (4.17).

Example 4.3 The Gamma distribution. Let $X$ and $Y$ be independent random variables with the gamma pdfs Gamma($\alpha_1, \sigma_1$) and Gamma($\alpha_2, \sigma_2$), where $\alpha_1$ and $\alpha_2$ are known positive parameters, the same as in Section 4.1.3. Then, the $(1-\gamma)$-confidence interval for $R$ is of the form (4.20) where $\hat\rho$ is defined in (4.18). As has been mentioned before, we can obtain one-sided confidence intervals from (4.20) by letting $\gamma_1 = 0$ or $\gamma_2 = 0$. For example, the $(1-\gamma)$-lower confidence bound for $R$ has the form

where $I_z(a, b)$ is the incomplete beta function defined in (2.70) and $F_\gamma = F_\gamma(2n_1\alpha_1, 2n_2\alpha_2)$ is the $(1-\gamma)$ quantile of the $F(2n_1\alpha_1, 2n_2\alpha_2)$ distribution. The test which rejects $H_0 : R \le R_0$ whenever $R_0 < R_{1,(1-\gamma)}$ is a size $\gamma$ test.

Example 4.4 The Exponential distribution. If $\alpha_1 = \alpha_2 = 1$, the gamma distribution reduces to the exponential distribution and the $(1-\gamma)$-confidence interval becomes (4.22). In this case, the size $\gamma$ test rejects $H_0 : R \le R_0$ whenever $R_0 < R_{2,(1-\gamma)}$, where

+ 1],

$\hat\rho$ is defined in (4.18) and $F_\gamma = F_\gamma(2n_1, 2n_2)$.

Reiser et al. (1992) studied testing the hypothesis $H_0 : R = R_0$ versus the alternative $H_1 : R = R_1$ where $0 < R_0 < R_1 < 1$. They were searching for the test which accepts $H_0$ with probabilities $(1-\gamma_0)$ or $\gamma_1$ whenever $R = R_0$ or $R = R_1$, respectively. Here, the values of $\gamma_0$ and $\gamma_1$ can be interpreted as producer's and consumer's risks and are usually small in practical applications. Since in the case of the exponential distributions $R = \rho/(1+\rho)$ (see (2.12)), where $\rho$ is defined in (4.18), the testing problem can be reformulated as $H_0 : \rho = \rho_0$ versus $H_1 : \rho = \rho_1$. Also, since $R_0 < R_1$ implies $\rho_0 < \rho_1$, the hypothesis $H_0$ is accepted whenever $\hat\rho < \rho_c$. The problem is then to find $\rho_c$ and the sample sizes $n_1$ and $n_2$ such that $P(\hat\rho < \rho_c\,|\,\rho = \rho_0) = 1 - \gamma_0$ and $P(\hat\rho < \rho_c\,|\,\rho = \rho_1) = \gamma_1$. Since $\rho/\hat\rho$ has the $F(2n_1, 2n_2)$ distribution (see (4.19)), these equations are equivalent to

$$\rho_0/\rho_c = F_{(1-\gamma_0)}(2n_1, 2n_2), \qquad \rho_1/\rho_c = F_{\gamma_1}(2n_1, 2n_2). \qquad (4.44)$$

It follows from (4.44) that

$$\frac{F_{\gamma_1}(2n_1, 2n_2)}{F_{(1-\gamma_0)}(2n_1, 2n_2)} = \frac{\rho_1}{\rho_0}. \qquad (4.45)$$

Equation (4.45) cannot explicitly provide the actual sample sizes $n_1$ and $n_2$ but only some relation connecting them. To obtain this relation explicitly we use the following approximation for a quantile of the $F$-distribution (see, e.g., Johnson et al. (1995), Chapter 27)

$$\ln F_\gamma(2n_1, 2n_2) \approx \frac{1}{2}\left(\frac{1}{n_2} - \frac{1}{n_1}\right) + z_\gamma\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}, \qquad (4.46)$$

where $z_\gamma$ is the $(1-\gamma)$ quantile of the standard normal distribution. Reiser et al. use (4.46) to derive the sample size $n$ when $n_1 = n_2 = n$ and obtain

$$n = 2(z_{\gamma_0} + z_{\gamma_1})^2/[\ln\rho_1 - \ln\rho_0]^2.$$
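As an illustration of the equal-sample-size formula, the following sketch evaluates $n = 2(z_{\gamma_0} + z_{\gamma_1})^2/[\ln\rho_1 - \ln\rho_0]^2$ with $\rho = R/(1-R)$, which follows from $R = \rho/(1+\rho)$. The function name `sample_size_equal_n` and the chosen values of $R_0$, $R_1$, $\gamma_0$ and $\gamma_1$ are hypothetical, and the result inherits the accuracy of the normal approximation to the $F$ quantile.

```python
import math
from statistics import NormalDist

def sample_size_equal_n(r0, r1, gamma0, gamma1):
    """Approximate common sample size n = n1 = n2 for testing R = r0 vs. R = r1."""
    rho0 = r0 / (1.0 - r0)          # R = rho / (1 + rho)  =>  rho = R / (1 - R)
    rho1 = r1 / (1.0 - r1)
    z0 = NormalDist().inv_cdf(1.0 - gamma0)   # (1 - gamma0) standard normal quantile
    z1 = NormalDist().inv_cdf(1.0 - gamma1)   # (1 - gamma1) standard normal quantile
    return 2.0 * (z0 + z1) ** 2 / (math.log(rho1) - math.log(rho0)) ** 2

n = sample_size_equal_n(r0=0.5, r1=0.8, gamma0=0.05, gamma1=0.05)
print(n, math.ceil(n))   # the required n is the next integer above the computed value
```

Since the formula returns a real number, rounding up gives the smallest integer sample size meeting both risk requirements under the approximation.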

Without the assumption about equality of the sample sizes $n_1$ and $n_2$, formula (4.46) yields the equation

$$\frac{1}{n_1} + \frac{1}{n_2} = \frac{[\ln\rho_1 - \ln\rho_0]^2}{(z_{\gamma_0} + z_{\gamma_1})^2},$$

which may have infinitely many solutions.

Remark. Reiser et al. also solve problems similar to the one considered in Example 4.4 in the situation when $X$ and $Y$ are independent or dependent normal variables. We present these problems as exercises. An interested reader may then consult Reiser et al. (1992).

4.4.2 Tests Based on Generalized p-values

Weerahandi and Johnson (1992) suggested a testing technique based on the generalized p-value. Following these authors, we shall consider independent normal variables $X$ and $Y$ with the means $\mu_1$ and $\mu_2$ and variances $\sigma_1^2$ and $\sigma_2^2$, respectively. Assume that $\mu_1$ and $\sigma_1$ are known (similarly, for example, to Church and Harris (1970), already discussed in some detail in Section 4.2.1). Denote

$$\mu = \frac{\mu_2 - \mu_1}{\sigma_1}, \qquad \sigma = \frac{\sigma_2}{\sigma_1}, \qquad \zeta = \frac{\mu_2-\mu_1}{\sqrt{\sigma_1^2 + \sigma_2^2}} = \frac{\mu}{\sqrt{1+\sigma^2}} \qquad (4.47)$$

(compare with (4.1)) and recall that $R = \Phi(\zeta)$ (see (3.1)). Then, the problem of testing $H_0 : R \le R_0$ versus $H_1 : R > R_0$ becomes equivalent to testing $H_0 : \zeta \le \zeta_0$ versus $H_1 : \zeta > \zeta_0$, where $\zeta_0 = \Phi^{-1}(R_0)$. If $\bar Y$ and $s_Y^2$ are the sample mean and the sample variance of $Y$ based on a sample $\underline{Y}$ of size $n_2$, define

$$\bar Y' = (\bar Y - \mu_1)/\sigma_1, \qquad S^2 = (n_2 - 1)s_Y^2/(n_2\sigma_1^2).$$

Observe that $\bar Y'$ and $S^2$ are independent with

$$\bar Y' \sim N(\mu, \sigma^2/n_2) \quad\text{and}\quad U = n_2S^2/\sigma^2 \sim \chi^2_{n_2-1}.$$

Let $\bar y'$ and $s^2$ be the observed values of $\bar Y'$ and $S^2$, respectively.

The testing procedure of Weerahandi and Johnson (1992) is based on the interesting method suggested by Tsui and Weerahandi (1989) and can be described as follows. If $X$ is the observable random vector with the observed value $x$, $\zeta$ is the parameter of interest and $\nu$ is the vector of nuisance parameters, then $T(X; x, \zeta, \nu)$ is referred to as a generalized test variable provided it satisfies the following requirements:

1. For fixed $x$, the value $T(x; x, \zeta_0, \nu)$, as well as the distribution of $T(X; x, \zeta_0, \nu)$, are free of the nuisance parameter $\nu$.
2. For fixed $x$, fixed $\nu$ and for all $t$, the probability $P(T(X; x, \zeta, \nu) \ge t\,|\,\zeta)$ is nondecreasing in $\zeta$, namely, $T$ is stochastically increasing in $\zeta$.

Now, in analogy with the definition of the "conventional" extreme region, a generalized extreme region is defined as

$$\Omega_\zeta = \{X : T(X; x, \zeta, \nu) \ge T(x; x, \zeta, \nu)\}.$$

Note that under the assumptions above, the larger the value of $\zeta$, the greater is the probability of the extreme region, and this probability is computable without the knowledge of the values of the nuisance parameters. In our model, $\zeta$ is the parameter of interest while $\mu_2$ (or $\sigma_2$) is the nuisance parameter, $X = \underline{Y} = (Y_1, \ldots, Y_{n_2})$ and $\underline{y}$ is the observed value of the sample $\underline{Y}$. Consider the generalized test variable

y'VU - (VU + n2s* Here Z is a standard normal variable and by simple algebraic manipulations T can be re-written as rp

s It is easy to observe that distribution of T is independent of/z and a, that T is stochastically increasing in £ and that at the observed quantities Y' = y' and S2 = s2, the value of T is s~l. Therefore, the generalized extreme region is of the form S \ = { ( ? , S2) :T>s} and the generalized p-value PR is the probability of the extreme region at ( = Co-


E [$ ((y'VU -

(4.48)

where the expectation is taken with respect to the $\chi^2_{n_2-1}$ distribution of the random variable $U$. Recall that, as we mentioned in Section 2.4.5, small values of $p_R$ can be viewed as strong evidence against $H_0$.

4.4.3 Bayesian Tests

To develop a Bayesian test one is required to know the pdf of $R$ or of a random variable which is a monotone function of $R$. To describe the test we return to the example elaborated in Section 4.3.1 where $X = (X_1, \ldots, X_k)'$ is the normal vector with the mean $\mu' = (\mu_1, \ldots, \mu_k)$ and the covariance matrix $\sigma^2 I$. Since $R = \Phi(\zeta)$ where $\zeta$ is defined in (4.35) and the standard normal cdf $\Phi(\cdot)$ is an increasing function of its argument, any hypothesis about $R$ can be reformulated in terms of the parameter $\zeta$. For example, $H_0 : R \ge R_0$ is equivalent to $H_0 : \zeta \ge \zeta_0$ where $\zeta_0 = \Phi^{-1}(R_0)$. To test $H_0$ against the alternative $H_1 : \zeta < \zeta_0$ the posterior pdf (4.40) of $\zeta$ can be used. Hypothesis $H_0$ is rejected whenever

$$\left[1 - \int_{\zeta_0}^{\infty} p(\zeta\,|\,\bar X, s^2)\,d\zeta\right] \Big/ \left[\int_{\zeta_0}^{\infty} p(\zeta\,|\,\bar X, s^2)\,d\zeta\right] > \lambda,$$

where $\lambda$ is a threshold value chosen in advance. The last inequality can be re-written as

$$\int_{\zeta_0}^{\infty} p(\zeta\,|\,\bar X, s^2)\,d\zeta < \frac{1}{1+\lambda}. \qquad (4.49)$$

Combining (4.40) with (4.49) we arrive at the rejection region for Ho consisting of X_ such that

h where b and T are as defined in (4.35) and (4.39), respectively, and

Ij = f tf o

Note that the integrals Ij can be expressed as ifCo>Ooriisodd, otherwise

where $\Gamma(a, z) = \int_z^\infty x^{a-1}e^{-x}\,dx$ is the incomplete gamma function (see, e.g., formula 8.350 of Gradshtein and Ryzhik (1980)). Recall also that if $R = P(X < Y)$ where $X$ and $Y$ are independent normal variables with common variance $\sigma^2$, then $b$ and $T$ are of the form (4.43).

4.5 Bootstrap

Bootstrap was introduced to the world by Efron (1979). A more detailed and practical exposition is presented in his by now classical monograph (1982). Bootstrap is closely related to the jackknife, an earlier technique introduced by Quenouille in 1956. However, it is by far more computationally intensive and its discovery served as the beginning of a new era in the theory and applications of a novel class of nonparametric statistical procedures substituting computer computations for mathematical analysis. It is by now a large-scale industry, and the number of papers and books written on this subject in the statistically-oriented literature in the last decade of the 20th century perhaps exceeds the number of Kims in the Seoul telephone book. A predominant majority of researchers and users consider it to be the panacea for all the problems related to estimation and testing procedures, especially when parameters stem from a complicated functional of the true distribution. A minority warns quite convincingly that a blind, careless and indiscriminate use of bootstrap may be misleading and cause substantial damage by drawing wrong conclusions leading to disastrous consequences, in particular in engineering and medical sciences. In what follows, we shall consider interval estimation on the basis of bootstrap methodology, keeping in mind that any confidence interval can in principle be converted into a hypothesis test using techniques described in Sections 2.4.5 and 4.4.

4.5.1 The Concept of the Bootstrap

The basic idea of the bootstrap is ingeniously simple. Let $Q(\underline{X}, F)$ be a random variable of interest where $\underline{X} = (X_1, \ldots, X_n)$ and the i.i.d. observations $X_i$, $i = 1, \ldots, n$, have a common cdf $F$. For example, $Q$ may be the variance or a percentile of some statistic $T = T(\underline{X})$. To estimate $Q(\underline{X}, F)$, the bootstrap method draws an "independent sample" of values of $Q$ and then uses the "sample average" as an estimator of $Q$. For the purpose of generating this "independent sample", Monte Carlo simulations are used. The following procedure is usually utilized:

1. Draw an i.i.d. sample from the unknown $F$.
2. Construct the estimator $\hat F$ of $F$ on the basis of observations $X_1, \ldots, X_n$ (usually $\hat F$ is the MLE of $F$ in the parametric set-up or the empirical cdf in the nonparametric one).
3. Draw a large number $M$ of i.i.d. bootstrap samples $\underline{X}^{(j)} = (X_1^{(j)}, \ldots, X_n^{(j)})$, $j = 1, \ldots, M$, from $\hat F$ and calculate $Q(\underline{X}^{(j)}, \hat F)$, $j = 1, \ldots, M$.
4. Estimate $Q(\underline{X}, F)$ by

$$Q_B = M^{-1}\sum_{j=1}^{M} Q(\underline{X}^{(j)}, \hat F).$$
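The four steps above can be sketched in the nonparametric set-up, where $\hat F$ is the empirical cdf and drawing from $\hat F$ amounts to sampling the observations with replacement. This is only an illustration under stated assumptions: the quantity $Q$ is taken, hypothetically, to be the squared deviation of the sample median from its target, the function names are invented for the example, and the data are the $X$ sample quoted in Section 4.3.2.

```python
import random
import statistics

def bootstrap_estimate(sample, q, m=2000, seed=0):
    """Estimate Q by averaging q over m resamples drawn from the empirical cdf."""
    rng = random.Random(seed)
    n = len(sample)
    values = []
    for _ in range(m):
        # Step 3: an i.i.d. sample of size n from F-hat (sampling with replacement).
        resample = [rng.choice(sample) for _ in range(n)]
        values.append(q(resample, sample))
    # Step 4: Q_B is the average over the bootstrap replications.
    return sum(values) / m

def squared_error_of_median(resample, original):
    """Q(X*, F-hat): squared deviation of the resample median from the sample median."""
    return (statistics.median(resample) - statistics.median(original)) ** 2

data = [0.1496, 0.2641, 0.3349, 0.1806, 0.1533,
        0.1555, 0.2591, 0.0980, 0.3008, 0.2509]
q_b = bootstrap_estimate(data, squared_error_of_median)
print(q_b)   # bootstrap estimate of the mean squared error of the sample median
```

Passing a different function `q` changes the quantity being estimated without touching the resampling loop, which mirrors the generality of $Q(\underline{X}, F)$ in the description above.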

The above technique in its modification can be applied to construction of confidence intervals for R = P(X < Y) in two different ways. The first one is almost identical to asymptotic confidence intervals described in Section 4.2 with the only difference that the MSE is estimated by bootstrap technique. The second, the so-called percentile method, is based on estimation of the cdf of an estimator of R. We shall briefly discuss both approaches.

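4.5.2 Bootstrap-Based Asymptotic Confidence Intervals

Let, as above, $(X, Y)$ possess the cdf $F_\theta(x, y)$ with an unknown scalar or vector-valued parameter $\theta \in \Theta$ and let $(\underline{X}, \underline{Y})$ be a sample from $F_\theta$ defined in (2.2). Consider an estimator $\hat R = \hat R(\underline{X}, \underline{Y})$ of $R$ on the basis of $(\underline{X}, \underline{Y})$; for example, it can be the MLE, the UMVUE, or the Bayes estimator of $R$. To construct the $(1-\gamma)$ confidence interval for $R$, we first construct the MLE $\hat\theta$ of $\theta$ so that the MLE $\hat F$ of $F$ is of the form $\hat F(x, y) = F_{\hat\theta}(x, y)$. Next, following the procedure described above, we draw independent bootstrap samples $(\underline{X}^{(j)}, \underline{Y}^{(j)})$ from $\hat F(x, y)$, where $\underline{X}^{(j)} = (X_1^{(j)}, \ldots, X_{n_1}^{(j)})$ and $\underline{Y}^{(j)} = (Y_1^{(j)}, \ldots, Y_{n_2}^{(j)})$, $j = 1, \ldots, M$. We then calculate

$$\hat R^{(j)} = \hat R(\underline{X}^{(j)}, \underline{Y}^{(j)})$$

for each one of the bootstrap samples and estimate $\mathrm{MSE}(\hat R)$ by

$$\mathrm{MSE}_B = M^{-1}\sum_{j=1}^{M}\left[\hat R^{(j)} - \hat R\right]^2. \qquad (4.50)$$

It is easy to verify that the bootstrap estimator of $\mathrm{MSE}(\hat R)$ can be written as

$$\mathrm{MSE}_B = \mathrm{Var}_B + (\mathrm{Bias}_B)^2,$$

where $\mathrm{Var}_B$ and $\mathrm{Bias}_B$ are the bootstrap estimators of the variance and the bias of $\hat R$. Explicitly

$$\mathrm{Var}_B = M^{-1}\sum_{j=1}^{M}\left[\hat R^{(j)} - \hat R^*\right]^2 \quad\text{and}\quad \mathrm{Bias}_B = \hat R^* - \hat R,$$

where

$$\hat R^* = M^{-1}\sum_{j=1}^{M}\hat R^{(j)}.$$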

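Now, as in Section 4.2, assuming that $\hat R$ is asymptotically normal as both $n_1$ and $n_2$ tend to infinity, we obtain the approximate $(1-\gamma)$ confidence interval for $R$

$$\left(\hat R - z_{\gamma/2}\sqrt{\mathrm{MSE}_B},\ \hat R + z_{\gamma/2}\sqrt{\mathrm{MSE}_B}\right). \qquad (4.51)$$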

Note that the bootstrap estimator of the MSE can be computed for any estimator of $R$. Hence, the bootstrap can serve as an efficient tool for comparison of various estimators of $R$. After constructing bootstrap estimators for the MSE of the MLE (or the UMVUE, or the Bayes estimator, etc.), we can choose the one with the smallest $\mathrm{MSE}_B$, which will then be used for construction of the confidence intervals. Comparison of various types of estimators and construction of confidence intervals on the basis of the algorithm described above was carried out by Constantine et al. (1989) for the case of the gamma distribution. It can of course be carried out in a similar manner for any other distribution using the technique described above.
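A minimal sketch of this algorithm is given below for one simple parametric family, chosen here only for illustration: independent exponential $X$ and $Y$ with means $\theta_1$ and $\theta_2$, for which $R = P(X < Y) = \theta_2/(\theta_1 + \theta_2)$ and the MLE is $\hat R = \bar Y/(\bar X + \bar Y)$. The data are simulated and therefore hypothetical, and the function names and the number of replications are assumptions of the example.

```python
import math
import random
from statistics import NormalDist

def r_hat(x, y):
    """MLE of R = P(X < Y) for exponential samples: ybar / (xbar + ybar)."""
    xbar, ybar = sum(x) / len(x), sum(y) / len(y)
    return ybar / (xbar + ybar)

def bootstrap_mse_interval(x, y, gamma=0.1, m=2000, seed=0):
    """Parametric bootstrap estimates of variance, bias and MSE, and the normal interval."""
    rng = random.Random(seed)
    xbar, ybar = sum(x) / len(x), sum(y) / len(y)
    r0 = r_hat(x, y)                       # estimate computed from the original samples
    reps = []
    for _ in range(m):
        # Resample from the fitted exponential laws (rate = 1 / estimated mean).
        xs = [rng.expovariate(1.0 / xbar) for _ in x]
        ys = [rng.expovariate(1.0 / ybar) for _ in y]
        reps.append(r_hat(xs, ys))
    r_star = sum(reps) / m
    var_b = sum((r - r_star) ** 2 for r in reps) / m
    bias_b = r_star - r0
    mse_b = sum((r - r0) ** 2 for r in reps) / m
    z = NormalDist().inv_cdf(1.0 - gamma / 2.0)
    half = z * math.sqrt(mse_b)
    # The interval is clipped to [0, 1] because R is a probability.
    return r0, var_b, bias_b, mse_b, (max(0.0, r0 - half), min(1.0, r0 + half))

data_rng = random.Random(1)
x = [data_rng.expovariate(1.0) for _ in range(15)]        # hypothetical stress data, mean 1
y = [data_rng.expovariate(1.0 / 3.0) for _ in range(20)]  # hypothetical strength data, mean 3
r0, var_b, bias_b, mse_b, (lo, hi) = bootstrap_mse_interval(x, y)
print(r0, mse_b, (lo, hi))
```

Replacing `r_hat` by another estimator of $R$ and comparing the resulting values of `mse_b` is one way to carry out the comparison of estimators mentioned above.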

4.5.3 The Percentile Method

In the percentile method, unlike in the method discussed in Section 4.5.2, the assumption of asymptotic normality of $\hat R$ is not required. Let, as above, $\hat R = \hat R(\underline{X}, \underline{Y})$ be a specific estimator of $R$ based on $(\underline{X}, \underline{Y})$. Similarly to the previous section, we draw a large number $M$ of independent bootstrap samples and calculate $\hat R^{(j)} = \hat R(\underline{X}^{(j)}, \underline{Y}^{(j)})$, $j = 1, \ldots, M$, for each one of them. The bootstrap estimator of the cdf of $\hat R$ is then

$$\mathrm{CDF}_B(t) = M^{-1}\sum_{j=1}^{M} I(\hat R^{(j)} \le t),$$

where $I(\cdot)$ is an indicator function. The $(1-\gamma)$ confidence interval for $R$ constructed by means of the percentile method is represented by

$$\left(\mathrm{CDF}_B^{-1}(\gamma/2),\ \mathrm{CDF}_B^{-1}(1 - \gamma/2)\right), \qquad (4.52)$$

where $\mathrm{CDF}_B^{-1}$ is the inverse function of $\mathrm{CDF}_B$. Unfortunately, as the examples in Efron (1982) and numerous subsequent investigations show, the percentile method gives somewhat erratic results, both in terms of the length of the intervals and of their skewness relative to $\hat R$. To remedy the situation the so-called bias-corrected percentile method can be used.

The bias-corrected percentile method is based on the assumption of existence of a transformation to a normal pivotal quantity, namely, a normally distributed quantity whose distribution does not depend on the unknown parameters. Suppose there exists a monotonic increasing function $g(\cdot)$ such that the transformed quantities

$$\varrho = g(R), \qquad \hat\varrho = g(\hat R), \qquad \hat\varrho^{(j)} = g(\hat R^{(j)})$$

satisfy

$$\hat\varrho - \varrho \sim N(-z_0\sigma, \sigma^2), \qquad \hat\varrho^{(j)} - \hat\varrho \sim N(-z_0\sigma, \sigma^2) \qquad (4.53)$$

for some constants $z_0$ and $\sigma$. In other words, $\hat\varrho - \varrho$ is a normal pivotal quantity having the same normal distribution under $F$ and $\hat F$. Denote the bootstrap estimator of the cdf of $\hat\varrho = g(\hat R)$ by

$$\mathrm{CDG}_B(t) = M^{-1}\sum_{j=1}^{M} I(\hat\varrho^{(j)} \le t)$$

and note that since $g$ is an increasing function

$$\mathrm{CDG}_B(g(t)) = \mathrm{CDF}_B(t) \quad\text{and}\quad \mathrm{CDG}_B(t) = \mathrm{CDF}_B(g^{-1}(t)). \qquad (4.54)$$

The standard $(1-\gamma)$ confidence interval for $\varrho$ is given by

$$P\left(\hat\varrho + z_0\sigma - z_{\gamma/2}\sigma < \varrho < \hat\varrho + z_0\sigma + z_{\gamma/2}\sigma\right) = 1-\gamma. \qquad (4.55)$$

Relations (4.53) and (4.54) imply that

$$\mathrm{CDF}_B(\hat R) = \mathrm{CDG}_B(\hat\varrho) = P(\hat\varrho^{(j)} \le \hat\varrho) = \Phi(z_0). \qquad (4.56)$$

Hence

$$z_0 = \Phi^{-1}(\mathrm{CDF}_B(\hat R)). \qquad (4.57)$$

Reapplying (4.53) one can write

$$P\left(\hat\varrho^{(j)} \le \hat\varrho + z_0\sigma \pm z_{\gamma/2}\sigma\right) = \Phi(2z_0 \pm z_{\gamma/2})$$

and, similarly to (4.56), utilizing (4.54) we have

$$\Phi(2z_0 \pm z_{\gamma/2}) = \mathrm{CDG}_B(\hat\varrho + z_0\sigma \pm z_{\gamma/2}\sigma) = \mathrm{CDF}_B\left(g^{-1}(\hat\varrho + z_0\sigma \pm z_{\gamma/2}\sigma)\right)$$

or

$$\hat\varrho + z_0\sigma \pm z_{\gamma/2}\sigma = g\left[\mathrm{CDF}_B^{-1}(\Phi(2z_0 \pm z_{\gamma/2}))\right]. \qquad (4.58)$$

Combining (4.58) with (4.55), we arrive at

$$P\left(\mathrm{CDF}_B^{-1}(\Phi(2z_0 - z_{\gamma/2})) < R < \mathrm{CDF}_B^{-1}(\Phi(2z_0 + z_{\gamma/2}))\right) = 1-\gamma. \qquad (4.59)$$

Note that it is sufficient to be assured only about the existence of the monotonic mapping $g$; its specific form is irrelevant. Moreover, the normal distribution plays no special role in the above argument. Instead of (4.53) we could assume that the pivotal quantity possesses some symmetric distribution other than the normal, in which case $\Phi$ would denote the cdf of this (rather than the standard normal) distribution. The corrected bootstrap interval (4.59) very often performs better than the interval (4.52) obtained by the percentile method directly. It is worth noting that (4.59) can be viewed as the interval based on a combination of the percentile method and the asymptotic bootstrap interval estimation.
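Both intervals can be computed directly from the bootstrap replications, since neither $g$ nor $\sigma$ is needed. The sketch below assumes that the replications $\hat R^{(j)}$ are already available as a list; the inverse of the step function $\mathrm{CDF}_B$ is taken to be its generalized inverse, and the guard against $\mathrm{CDF}_B(\hat R)$ equal to 0 or 1 is an implementation choice of this example, not part of the method as described. The replications used in the demonstration are an artificial, evenly spaced grid.

```python
import math
from statistics import NormalDist

def ecdf(reps, t):
    """CDF_B(t): fraction of bootstrap replications not exceeding t."""
    return sum(1 for r in reps if r <= t) / len(reps)

def ecdf_inverse(sorted_reps, p):
    """Generalized inverse of CDF_B: smallest replication r with CDF_B(r) >= p."""
    m = len(sorted_reps)
    k = max(1, min(m, math.ceil(p * m)))
    return sorted_reps[k - 1]

def percentile_interval(reps, gamma):
    """Plain percentile interval of the form (4.52)."""
    s = sorted(reps)
    return ecdf_inverse(s, gamma / 2), ecdf_inverse(s, 1 - gamma / 2)

def bc_percentile_interval(reps, estimate, gamma):
    """Bias-corrected percentile interval of the form (4.59)."""
    s = sorted(reps)
    nd = NormalDist()
    p0 = ecdf(s, estimate)
    # Guard (assumption of this sketch): inv_cdf requires 0 < p0 < 1.
    p0 = min(max(p0, 0.5 / len(s)), 1 - 0.5 / len(s))
    z0 = nd.inv_cdf(p0)                       # z0 as in (4.57)
    z = nd.inv_cdf(1 - gamma / 2)             # z_{gamma/2}
    lo = ecdf_inverse(s, nd.cdf(2 * z0 - z))
    hi = ecdf_inverse(s, nd.cdf(2 * z0 + z))
    return lo, hi

# Artificial replications on an even grid, used only to exercise the functions.
reps = [i / 1000 for i in range(1, 1000)]
print(percentile_interval(reps, 0.1))
print(bc_percentile_interval(reps, 0.5, 0.1))   # estimate near the bootstrap median
print(bc_percentile_interval(reps, 0.6, 0.1))   # estimate above the bootstrap median
```

When the estimate sits at the median of the replications, $z_0$ is close to zero and the two intervals nearly coincide; when it sits above the median, both endpoints of the corrected interval move upward.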

4.6 Exercises

4.1. Let $X$ and $Y$ be independent lognormal variables with parameters $(\mu_1, \sigma_1)$ and $(\mu_2, \sigma_2)$, where the pdf of the lognormal distribution is given in (3.42). Using Theorem 2.10 and results of Section 4.1.2, derive $(1-\gamma)$ one-sided and two-sided confidence intervals for $R$.

4.2. Let $X$ and $Y$ be independent Burr type XII variables with parameters $(\alpha_1, \beta)$ and $(\alpha_2, \beta)$, respectively, i.e. $\beta$ is the common parameter for $X$ and $Y$. Using formula (4.22) and Theorem 2.10, derive $(1-\gamma)$ one-sided and two-sided confidence intervals for $R$.

4.3. Let $X$ and $Y$ be independent Weibull variables with parameters $(\alpha, \sigma_1)$ and $(\alpha, \sigma_2)$, where the common parameter $\alpha$ is known. Using formula (4.22) and Theorem 2.10, derive a $(1-\gamma)$ confidence interval for $R$.

4.4. (Teskin and Kostyukova (1991)). Let $X$ and $Y$ be independent normal variables with unknown means and variances but known ratio $a = \sigma_2/\sigma_1$. Construct a lower confidence bound for $R$. Consider the cases $n_1 = n_2$ and $n_1 \ne n_2$ separately.

4.5. Let $X$ and $Y$ be independent Weibull variables with parameters $(\alpha_1, \sigma_1)$ and $(\alpha_2, \sigma_2)$, respectively, where the shape parameters $\alpha_1$ and $\alpha_2$ are known and different. Using results of Section 4.1.4, describe how the $(1-\gamma)$ confidence interval for $R$ can be constructed.

4.6. Using the UMVUE (3.76) of $\mathrm{Var}(\tilde R)$ in the case of the uniform distribution with unknown right end, derive the confidence interval for $R$ based on normal approximation.

4.7. In the conditions of problem 4.1, derive a Bayesian credible set for $R$.

4.8. (Reiser et al. (1992)). Solve the problem of Example 4.4 in the case when $X$ and $Y$ are a) independent normal variables; b) dependent normal variables.

4.9. (Gupta et al. (1999)). Let $X$ and $Y$ be independent normal variables $N(\mu_1, \sigma_1^2)$ and $N(\mu_2, \sigma_2^2)$, respectively, with a common but unknown coefficient of variation $a = \sigma_1/\mu_1 = \sigma_2/\mu_2$. Using techniques of Section 4.2.3 for estimation of $\mathrm{MSE}(\hat R)$, find the asymptotic confidence interval for $R$.

4.10. (Gupta and Subramanian (1998)). Solve problem 4.9 when $X$ and $Y$ are dependent normal variables with unknown correlation coefficient.

4.11. Describe the procedure for the construction of bootstrap-based asymptotic confidence intervals in the case when $X$ and $Y$ have independent two-parameter exponential distributions.

Chapter 5

Nonparametric Models

Since the seventies of the twentieth century, the allure of nonparametric models has become almost irresistible in statistical methodology due to a number of factors. Among those are the rise of the discipline of Data Analysis and of robust estimation procedures spearheaded by J. Tukey, P. J. Huber and F. Hampel, and unprecedented advances in computer technology. Psychologically these models are quite appealing since they free us from the constraints of a distributional "straitjacket". However, the price paid in lesser efficiency and ambiguity of the results sometimes turns out to be quite heavy.

This chapter deals with the nonparametric stress-strength model where the distributions of $X$ and $Y$ are unknown. This version of the problem is quite important not only because it is the only set-up that can be used in a number of applications but also since it historically preceded the parametric formulation of the problem. In this chapter we shall essentially follow the same well-trodden route as in the three earlier chapters of the book. We start with construction of the point estimator $\hat R$ of $R = P(X < Y)$ and investigate its properties. Section 5.2 deals with numerous estimators of the variance of $\hat R$. In Section 5.3 we shall provide confidence intervals for $R$ based on $\hat R$. Section 5.4 is devoted to the nonparametric Bayesian approach to the problem. Finally, Section 5.5 describes a probabilistic design approach to the problem.

5.1 Point Estimation of R = P(X < Y)

5.1.1 Initial Results. The WMW Statistic

Chronologically, the stress-strength model started in a nonparametric setup in the simple but pioneering and ingenious works of Wilcoxon (1945) and Mann and Whitney (1947). These authors considered comparison of two independent random variables $X$ and $Y$ with continuous cdfs $F_X$ and $F_Y$, respectively. The aim was to test the hypothesis $H_0 : F_X = F_Y$ by testing $H_0 : P(X < Y) = P(X > Y) = 1/2$. These ideas can be traced to the European continental authors in the early years of the 20th century (see Hald (1998)) but it was Wilcoxon, Mann and Whitney who put them on the front burner.

The basic idea of Wilcoxon (1945) was to apply ranking methods to testing of $H_0$. He came up with the following procedure. Let $\underline{X} = (X_1, \ldots, X_{n_1})$ and $\underline{Y} = (Y_1, \ldots, Y_{n_2})$ be two samples from $X$ and $Y$, respectively. Wilcoxon (1945) suggests forming the sample of $N = n_1 + n_2$ observations and ranking each of the observations $X_i$, $i = 1, \ldots, n_1$, and $Y_j$, $j = 1, \ldots, n_2$, in this overall sample. Denote by $TR_X$ and $TR_Y$ the sums of ranks of observations $X_i$ and $Y_j$ in the joint sample. Since under the assumption that $F_X = F_Y$ the expected sum of ranks for $X$ is $E\,TR_X = n_1(N+1)/2$ and similarly the expected sum of ranks for $Y$ is $E\,TR_Y = n_2(N+1)/2$, statistics $TR_X$ and $TR_Y$ can be applied to test the null hypothesis $H_0$.

Mann and Whitney (1947) developed Wilcoxon's idea further. Let $X$ and $Y$ be continuous random variables. They define the statistic $W$ counting the number of times that an $X$ precedes a $Y$ in a combined sample:

$$W = \sum_{i=1}^{n_1}\sum_{j=1}^{n_2} I(X_i < Y_j). \qquad (5.1)$$

This statistic is commonly called the Wilcoxon-Mann-Whitney (WMW) statistic. It is evident that the expectation of (5.1) is $EW = n_1n_2R$, hence statistic $W$ can be used not just for testing the hypothesis about the equality of $F_X$ and $F_Y$ but also for statistical inference concerning $R$. Statistics $W$, $TR_X$ and $TR_Y$ are interrelated via the equation

$$W = n_1n_2 + n_1(n_1+1)/2 - TR_X = TR_Y - n_2(n_2+1)/2. \qquad (5.2)$$
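The pair-counting definition of $W$ and its two rank representations can be checked numerically. The sketch below assumes there are no ties in the joint sample; the helper names `wmw_statistic` and `rank_sums` are invented for the illustration, and the data are the two samples quoted in Section 4.3.2.

```python
def wmw_statistic(x, y):
    """W: number of pairs (i, j) with x[i] < y[j]."""
    return sum(1 for xi in x for yj in y if xi < yj)

def rank_sums(x, y):
    """Sums of ranks TR_X and TR_Y in the joint sample (no ties assumed)."""
    joint = sorted([(v, "x") for v in x] + [(v, "y") for v in y])
    tr_x = sum(rank for rank, (_, label) in enumerate(joint, start=1) if label == "x")
    tr_y = sum(rank for rank, (_, label) in enumerate(joint, start=1) if label == "y")
    return tr_x, tr_y

x = [0.1496, 0.2641, 0.3349, 0.1806, 0.1533,
     0.1555, 0.2591, 0.0980, 0.3008, 0.2509]
y = [0.7750, 0.2894, 0.3138, 0.2759, 0.2598, 0.7934,
     0.2110, 0.4193, 0.4757, 0.5352, 0.7553, 0.2396]
n1, n2 = len(x), len(y)

w = wmw_statistic(x, y)
tr_x, tr_y = rank_sums(x, y)

# Both rank representations should reproduce the pair count.
print(w, n1 * n2 + n1 * (n1 + 1) // 2 - tr_x, tr_y - n2 * (n2 + 1) // 2)
print(w / (n1 * n2))   # proportion of pairs with x[i] < y[j]
```

The rank-based form needs only one sort of the joint sample, whereas the direct count visits all $n_1 n_2$ pairs, which is the practical reason for preferring the rank representation with large samples.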

To verify the first equality in (5.2) note that, with $X_{(1)} < \ldots < X_{(n_1)}$ denoting the ordered observations $X_i$,

$$TR_X = \sum_{i=1}^{n_1}\Big[i + \sum_{j=1}^{n_2} I(Y_j < X_{(i)})\Big] = \frac{n_1(n_1+1)}{2} + n_1n_2 - W.$$

The validity of the second equality follows from the fact that $TR_X = N(N+1)/2 - TR_Y$ (since the total sum of the ranks is $TR_X + TR_Y = N(N+1)/2$). Owen et al. (1964) have shown that the rank representation (5.2) remains in force even if random variables $X$ and $Y$ are not continuous provided $X_i \ne X_j$ and $Y_i \ne Y_j$ for $i \ne j$. In such a situation, whenever $X_i = Y_j$, one needs to rank $Y_j$ first and then $X_i$.

5.1.2 Nonparametric UMVUE of R

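Since $EW = n_1 n_2 R$, the unbiased estimator of $R$ is

$$\hat{R} = \frac{W}{n_1 n_2} = \frac{1}{n_1 n_2} \sum_{i=1}^{n_1} \sum_{j=1}^{n_2} I(X_i < Y_j). \qquad (5.3)$$

Using (5.2), $\hat{R}$ can be written as

$$\hat{R} = 1 - \frac{1}{n_2}\left[\frac{T_{RX}}{n_1} - \frac{n_1 + 1}{2}\right] = \frac{1}{n_1}\left[\frac{T_{RY}}{n_2} - \frac{n_2 + 1}{2}\right]. \qquad (5.4)$$

An alternative rank representation of $\hat{R}$ is given by

$$\hat{R} = \frac{1}{N}\left[\frac{T_{RY}}{n_2} - \frac{T_{RX}}{n_1}\right] + \frac{1}{2}. \qquad (5.5)$$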
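The equivalence of the pairwise-count form (5.3) and the rank forms (5.2), (5.4) and (5.5) can be checked numerically. The following is a minimal sketch, assuming samples without ties; the function names and the sample values are illustrative and do not come from the text.

```python
from fractions import Fraction


def rank_sums(xs, ys):
    """Return (T_RX, T_RY): rank sums of xs and ys in the pooled sample (no ties assumed)."""
    pooled = sorted([(v, "x") for v in xs] + [(v, "y") for v in ys])
    t_rx = sum(rank for rank, (_, label) in enumerate(pooled, start=1) if label == "x")
    t_ry = sum(rank for rank, (_, label) in enumerate(pooled, start=1) if label == "y")
    return t_rx, t_ry


def wmw_statistic(xs, ys):
    """W of (5.1): number of pairs (i, j) with X_i < Y_j."""
    return sum(1 for x in xs for y in ys if x < y)


def r_hat_from_ranks(xs, ys):
    """R-hat computed from the rank representation (5.5)."""
    n1, n2 = len(xs), len(ys)
    t_rx, t_ry = rank_sums(xs, ys)
    return Fraction(1, n1 + n2) * (Fraction(t_ry, n2) - Fraction(t_rx, n1)) + Fraction(1, 2)


# Hypothetical data, used only to illustrate the identities.
xs = [1.2, 3.4, 0.7]
ys = [2.5, 4.1, 3.9, 0.9]
w = wmw_statistic(xs, ys)
r_hat = Fraction(w, len(xs) * len(ys))  # estimator (5.3)
```

Exact rational arithmetic is used so that the identities hold without rounding error.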

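The ratios $T_{RX}/n_1$ and $T_{RY}/n_2$ are the average ranks of the observations from $X$ and $Y$ in the combined sample. To evaluate the variance of $\hat{R}$, denote

$$v_1 = \int_{-\infty}^{\infty} [F_X(x)]^2 \, dF_Y(x), \qquad v_2 = \int_{-\infty}^{\infty} [1 - F_Y(x)]^2 \, dF_X(x). \qquad (5.6)$$

Taking into account that $\mathrm{Var}(\hat{R}) = (n_1 n_2)^{-2} EW^2 - R^2$, where

$$EW^2 = n_1 n_2 \left[ R + (n_1 - 1) v_1 + (n_2 - 1) v_2 + (n_1 - 1)(n_2 - 1) R^2 \right],$$

we obtain

$$\mathrm{Var}(\hat{R}) = \frac{1}{n_1 n_2} \left[ R + (n_1 - 1) v_1 + (n_2 - 1) v_2 - (n_1 + n_2 - 1) R^2 \right]. \qquad (5.7)$$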

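Comparing (5.6) with (2.6), we arrive at

$$R^2 \le v_1 \le R, \qquad R^2 \le v_2 \le R, \qquad (5.8)$$

thus

$$\frac{1}{n_1 n_2} R(1 - R) \le \mathrm{Var}(\hat{R}) \le \frac{n_1 + n_2 - 1}{n_1 n_2} R(1 - R). \qquad (5.9)$$

Recalling that $0 \le R \le 1$ we have $R(1 - R) \le 1/4$, and since $(n_1 + n_2)/(n_1 n_2) \le 2/\min(n_1, n_2)$, inequality (5.9) implies that $\mathrm{Var}(\hat{R}) \le [2\min(n_1, n_2)]^{-1}$. Van Dantzig (1951) provides a sharper upper bound on $\mathrm{Var}(\hat{R})$:

$$\mathrm{Var}(\hat{R}) \le [4\min(n_1, n_2)]^{-1}. \qquad (5.10)$$

Note also that under the assumption that $F_X = F_Y$ we have $R = 1/2$ and $v_1 = v_2 = 1/3$, and hence in this case

$$\mathrm{Var}(\hat{R}) = \frac{n_1 + n_2 + 1}{12\, n_1 n_2}.$$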
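The variance formula (5.7), the bounds (5.9) and (5.10), and the special case $F_X = F_Y$ can be verified with exact arithmetic. This is a small sketch under the notation of (5.6) and (5.7); the function names and the sample sizes are illustrative choices.

```python
from fractions import Fraction


def variance_of_r_hat(r, v1, v2, n1, n2):
    """Var(R-hat) from (5.7) for given R, v1, v2 and sample sizes."""
    return (r + (n1 - 1) * v1 + (n2 - 1) * v2 - (n1 + n2 - 1) * r * r) / (n1 * n2)


def variance_bounds(r, n1, n2):
    """Lower and upper bounds of (5.9) and the distribution-free bound (5.10)."""
    lower = r * (1 - r) / (n1 * n2)
    upper = (n1 + n2 - 1) * r * (1 - r) / (n1 * n2)
    van_dantzig = Fraction(1, 4 * min(n1, n2))
    return lower, upper, van_dantzig


# Case F_X = F_Y: R = 1/2 and v1 = v2 = 1/3 (sample sizes are hypothetical).
n1, n2 = 5, 8
var_equal = variance_of_r_hat(Fraction(1, 2), Fraction(1, 3), Fraction(1, 3), n1, n2)
lower, upper, van_dantzig = variance_bounds(Fraction(1, 2), n1, n2)
```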

Equation (5.7) and inequality (5.10) show that $\hat{R}$ is the UMVUE of $R$ with the variance of the order $O(1/\min(n_1, n_2))$. Moreover, the estimator $\hat{R}$ possesses two other useful properties: it is admissible and minimax under a wide class of loss functions. To clarify the importance of these features, we shall introduce the following definitions.

Definition 5.1 A loss function $L$ is any function of two variables $x$ and $y$ satisfying $L(x, y) \ge 0$ such that $L(x, x) = 0$. The expected value of the loss function with respect to the sample distribution, $EL(\hat{R}, R)$, is called the risk. Quite often, a loss function is chosen to depend on the difference of arguments, i.e. $L(x, y) = l(x - y)$ where $l(z)$ is convex.

Definition 5.2 An estimator $\hat{R}$ of $R$ is said to be inadmissible for the loss function $L(x, y)$ if there exists another estimator $R'$ of $R$ such that $EL(R', R) \le EL(\hat{R}, R)$ for any $F_X$ and $F_Y$, with strict inequality for at least one pair $F_X$, $F_Y$, and it is admissible if no such estimator $R'$ exists.

Definition 5.3 An estimator $\hat{R}$ of $R$ satisfying

$$\inf_{R'} \sup_{F_X, F_Y} EL(R', R) = \sup_{F_X, F_Y} EL(\hat{R}, R),$$

where the infimum is taken over all possible estimators $R'$ of $R$, is called a minimax estimator.

Theorem 5.1 (Yu and Govindarajulu (1995)). Let $\underline{X}$ and $\underline{Y}$ be two independent samples from $F_X$ and $F_Y$, respectively. Then the UMVUE $\hat{R}$ of $R$ is admissible under any loss function of the form $(\hat{R} - R)^2 h(F_X, F_Y)$ where $h(x, y)$ is any positive function. In particular, $\hat{R}$ is minimax under the loss function $(\hat{R} - R)^2 \sigma^2(F_X, F_Y)$ where $\sigma^2(F_X, F_Y) = \mathrm{Var}(\hat{R})$ is given by the expression (5.7).

Theorem 5.1 states that whenever a loss function is the product of $(\hat{R} - R)^2$ and a positive function of $F_X$ and $F_Y$, there is no estimator which is superior to $\hat{R}$ for all possible distributions $F_X$ and $F_Y$. Moreover, if the loss function is of the form $(\hat{R} - R)^2 \sigma^2(F_X, F_Y)$, the estimator $\hat{R}$ minimizes the maximum risk among all possible estimators of $R$.

5.2 Estimation of the Variance of R

5.2.1 Estimators Based on Rank Statistics

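In the preceding section we have discussed upper bounds on $\mathrm{Var}(\hat{R})$. However, to assess appropriately the quality of $\hat{R}$ in any particular situation it is desirable to know the value of $\mathrm{Var}(\hat{R})$ as accurately as possible. This can be achieved by estimating $\mathrm{Var}(\hat{R})$. Sen (1960) was the first to construct an estimator of $\mathrm{Var}(\hat{R})$. Denote

$$U_{10}(X_i) = \frac{1}{n_2} \sum_{j=1}^{n_2} I(X_i < Y_j), \qquad U_{01}(Y_j) = \frac{1}{n_1} \sum_{i=1}^{n_1} I(X_i < Y_j), \qquad (5.11)$$

and let, furthermore,

$$S_{10}^2 = \frac{1}{n_1 - 1} \sum_{i=1}^{n_1} \left[ U_{10}(X_i) - \hat{R} \right]^2, \qquad S_{01}^2 = \frac{1}{n_2 - 1} \sum_{j=1}^{n_2} \left[ U_{01}(Y_j) - \hat{R} \right]^2. \qquad (5.12)$$

It follows directly from (5.3) and (5.11) that

$$\hat{R} = \frac{1}{n_1} \sum_{i=1}^{n_1} U_{10}(X_i) = \frac{1}{n_2} \sum_{j=1}^{n_2} U_{01}(Y_j).$$

Sen (1960) proposes a consistent estimator for $\mathrm{Var}(\hat{R})$ of the form

$$S_1^2 = \frac{S_{10}^2}{n_1} + \frac{S_{01}^2}{n_2}. \qquad (5.13)$$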
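As an illustration of (5.11) to (5.13), the sketch below computes Sen's (1960) estimator directly from the pairwise comparisons. It assumes samples without ties and at least two observations in each sample; the function name and the data are hypothetical.

```python
from fractions import Fraction


def sen_1960_variance(xs, ys):
    """S_1^2 of (5.13), built from U_10 and U_01 of (5.11) and S_10^2, S_01^2 of (5.12)."""
    n1, n2 = len(xs), len(ys)
    u10 = [Fraction(sum(1 for y in ys if x < y), n2) for x in xs]
    u01 = [Fraction(sum(1 for x in xs if x < y), n1) for y in ys]
    r_hat = sum(u10) / n1  # equals sum(u01) / n2
    s10_sq = sum((u - r_hat) ** 2 for u in u10) / (n1 - 1)
    s01_sq = sum((u - r_hat) ** 2 for u in u01) / (n2 - 1)
    return s10_sq / n1 + s01_sq / n2


# Hypothetical data, used only for illustration.
xs = [1.2, 3.4, 0.7]
ys = [2.5, 4.1, 3.9, 0.9]
s1_sq = sen_1960_variance(xs, ys)
```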

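To obtain a representation of $S_1^2$ in terms of ranks, let $X_{(1)} < \cdots < X_{(n_1)}$ and $Y_{(1)} < \cdots < Y_{(n_2)}$ be the ordered samples and denote the ranks of $X_{(i)}$ and $Y_{(j)}$ in the combined sample by $T_{X(i)}$ and $T_{Y(j)}$, respectively. Then $U_{10}(X_{(i)}) = (n_2 - T_{X(i)} + i)/n_2$, $i = 1, \ldots, n_1$, and similarly, $U_{01}(Y_{(j)}) = (T_{Y(j)} - j)/n_1$, $j = 1, \ldots, n_2$. Hence, using the rank representations (5.4) for $\hat{R}$, one can write $S_{10}^2$ and $S_{01}^2$ in the form

$$S_{10}^2 = \frac{1}{n_1 - 1} \sum_{i=1}^{n_1} \left[ \frac{n_2 - T_{X(i)} + i}{n_2} - \hat{R} \right]^2, \qquad S_{01}^2 = \frac{1}{n_2 - 1} \sum_{j=1}^{n_2} \left[ \frac{T_{Y(j)} - j}{n_1} - \hat{R} \right]^2.$$

The advantage of the estimator $S_1^2$ is its simplicity; unfortunately, its drawback is that it is biased, as will be shown below. This fact prompted Sen (1967) to introduce another estimator of $\mathrm{Var}(\hat{R})$ which is unbiased. He uses the functions

$$U_1(X_i, X_j, Y_k) = \tfrac{1}{2} \left[ I(X_i < Y_k) I(Y_k < X_j) + I(X_j < Y_k) I(Y_k < X_i) \right],$$

$$U_2(X_i, Y_j, Y_k) = \tfrac{1}{2} \left[ I(Y_k < X_i) I(X_i < Y_j) + I(Y_j < X_i) I(X_i < Y_k) \right],$$

$$U_3(X_i, X_j, Y_k, Y_l) = \tfrac{1}{2} \left[ I(X_i < Y_k) I(Y_l < X_j) + I(X_j < Y_l) I(Y_k < X_i) \right],$$

where, as above, $I(\cdot)$ is an indicator function, and constructs the statistics

$$V_1 = \left[ \binom{n_1}{2} n_2 \right]^{-1} \sum_{i<j} \sum_{k=1}^{n_2} U_1(X_i, X_j, Y_k), \quad V_2 = \left[ n_1 \binom{n_2}{2} \right]^{-1} \sum_{i=1}^{n_1} \sum_{j<k} U_2(X_i, Y_j, Y_k), \quad V_3 = \left[ \binom{n_1}{2} \binom{n_2}{2} \right]^{-1} \sum_{i<j} \sum_{k<l} U_3(X_i, X_j, Y_k, Y_l). \qquad (5.14)$$

Note that the expressions for $V_1$, $V_2$ and $V_3$ are somewhat complex and may require a considerable amount of computation (especially for large $n_1$ and $n_2$).

To obtain the expression for the estimator of the variance observe that $EU_1(X_i, X_j, Y_k) = \int_{-\infty}^{\infty} F_X(x)[1 - F_X(x)] \, dF_Y(x) = R - v_1$. Similarly, $EU_2(X_i, Y_j, Y_k) = R - v_2$, where $v_1$ and $v_2$ are defined in (5.6), and $EU_3(X_i, X_j, Y_k, Y_l) = R(1 - R)$. Hence, $V_1$, $V_2$ and $V_3$ are unbiased estimators of $R - v_1$, $R - v_2$ and $R(1 - R)$, respectively, so that (5.7) implies that an unbiased estimator of $\mathrm{Var}(\hat{R})$ is of the form

$$S_2^2 = \frac{1}{n_1 n_2} \left[ (n_1 + n_2 - 1) V_3 - (n_1 - 1) V_1 - (n_2 - 1) V_2 \right]. \qquad (5.15)$$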

Sen (1967) also provides rank representations of $V_1$, $V_2$ and $V_3$. Hollander and Wolfe (1999) provide an illuminating example of an application of Sen's (1967) estimators in the second edition of their famous book. As already pointed out, the estimators (5.13) and (5.15) both possess certain disadvantages: the first one is biased and the second one is somewhat laborious for practical purposes. Hilgers (1981) proposes an alternative, easily computable estimator of $\mathrm{Var}(\hat{R})$ which is unbiased.

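Denote

$$V_{01}^2 = \frac{1}{n_2} \sum_{j=1}^{n_2} [U_{01}(Y_j)]^2, \qquad V_{10}^2 = \frac{1}{n_1} \sum_{i=1}^{n_1} [1 - U_{10}(X_i)]^2,$$

where $U_{01}(Y_j)$ and $U_{10}(X_i)$ are defined in (5.11). The quantities $V_{01}^2$ and $V_{10}^2$ can be written as

$$V_{01}^2 = \frac{1}{n_1^2 n_2} \sum_{k=1}^{n_2} (T_{Y(k)} - k)^2, \qquad V_{10}^2 = \frac{1}{n_1 n_2^2} \sum_{k=1}^{n_1} (T_{X(k)} - k)^2,$$

where $T_{X(k)}$ and $T_{Y(k)}$ are the $k$-th smallest ranks in samples $\underline{X}$ and $\underline{Y}$, respectively. An unbiased estimator of $\mathrm{Var}(\hat{R})$ is then given by

$$S_3^2 = \frac{1}{(n_1 - 1)(n_2 - 1)} \left\{ n_1 \left[ V_{01}^2 - \hat{R}^2 \right] + n_2 \left[ V_{10}^2 - (1 - \hat{R})^2 \right] - \hat{R}(1 - \hat{R}) \right\}. \qquad (5.16)$$

Unbiasedness of the estimator $S_3^2$ follows from the relations

$$EV_{01}^2 = \frac{R}{n_1} + \frac{n_1 - 1}{n_1} \int_{-\infty}^{\infty} [F_X(x)]^2 \, dF_Y(x) = \frac{R}{n_1} + \frac{n_1 - 1}{n_1} v_1; \qquad (5.17)$$

similarly

$$EV_{10}^2 = \frac{1 - R}{n_2} + \frac{n_2 - 1}{n_2} \int_{-\infty}^{\infty} [F_Y(x)]^2 \, dF_X(x) = n_2^{-1} \left[ 1 - R + (n_2 - 1)(1 - 2R) \right] + n_2^{-1} (n_2 - 1) v_2. \qquad (5.18)$$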
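The unbiasedness of Hilgers' estimator can be checked by brute force in the special case $F_X = F_Y$, where all $\binom{N}{n_1}$ allocations of ranks to the first sample are equally likely and $\mathrm{Var}(\hat{R}) = (N + 1)/(12 n_1 n_2)$. The sketch below assumes the form of $S_3^2$ written in (5.16), no ties, and $n_1, n_2 \ge 2$; the function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations


def hilgers_variance(xs, ys):
    """S_3^2 of (5.16), computed from U_01, U_10 of (5.11)."""
    n1, n2 = len(xs), len(ys)
    u10 = [Fraction(sum(1 for y in ys if x < y), n2) for x in xs]
    u01 = [Fraction(sum(1 for x in xs if x < y), n1) for y in ys]
    r_hat = sum(u10) / n1
    v01_sq = sum(u * u for u in u01) / n2
    v10_sq = sum((1 - u) ** 2 for u in u10) / n1
    bracket = (n1 * (v01_sq - r_hat ** 2)
               + n2 * (v10_sq - (1 - r_hat) ** 2)
               - r_hat * (1 - r_hat))
    return bracket / ((n1 - 1) * (n2 - 1))


def mean_under_equal_cdfs(n1, n2):
    """Exact expectation of S_3^2 when F_X = F_Y, by enumerating all rank allocations."""
    n = n1 + n2
    values = []
    for x_ranks in combinations(range(1, n + 1), n1):
        y_ranks = [r for r in range(1, n + 1) if r not in x_ranks]
        values.append(hilgers_variance(list(x_ranks), y_ranks))
    return sum(values) / len(values)
```

Since the estimator depends on the data only through the ranks, enumerating rank allocations is sufficient for this check.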

Now, since $E(\hat{R}^2) = \mathrm{Var}(\hat{R}) + R^2$ where $\mathrm{Var}(\hat{R})$ is given in (5.7), equalities (5.16) - (5.18) imply that $ES_3^2 = \mathrm{Var}(\hat{R})$. Observe that the estimator $S_3^2$ enjoys both the simplicity of Sen's first (1960) estimator and the unbiasedness of the second. In fact, there exists a direct correspondence between the estimators $S_1^2$ and $S_3^2$. Indeed, one can express $S_1^2$ as

$$S_1^2 = [n_1(n_1 - 1)]^{-1} \sum_{i=1}^{n_1} \left[ (1 - U_{10}(X_i)) - (1 - \hat{R}) \right]^2 + [n_2(n_2 - 1)]^{-1} \sum_{j=1}^{n_2} \left[ U_{01}(Y_j) - \hat{R} \right]^2$$

$$= (n_1 - 1)^{-1} \left[ V_{10}^2 - (1 - \hat{R})^2 \right] + (n_2 - 1)^{-1} \left[ V_{01}^2 - \hat{R}^2 \right]$$

$$= S_3^2 + [(n_1 - 1)(n_2 - 1)]^{-1} \left[ 1 - \hat{R}(1 - \hat{R}) - V_{10}^2 - V_{01}^2 \right].$$

The last equation shows that the estimator $S_1^2$ is biased with

$$\mathrm{Bias}(S_1^2) = [(n_1 - 1)(n_2 - 1)]^{-1} E\left[ 1 - \hat{R}(1 - \hat{R}) - V_{10}^2 - V_{01}^2 \right] = [n_1 n_2]^{-1} \left[ R(1 - R) - (v_1 - R^2) - (v_2 - R^2) \right]. \qquad (5.19)$$

(The last equality in (5.19) follows directly from (5.7), (5.17) and (5.18).) Sen (1967) has shown that the bias (5.19) is always nonnegative and of the order $O([n_1 n_2]^{-1})$; consequently, $S_1^2$ slightly overestimates the variance. However, this small positive bias overcomes a somewhat undesirable property of Hilgers' estimator $S_3^2$, namely, that it vanishes whenever $\hat{R} = 0$ or 1. (The situations when $\hat{R} = 0$ or 1 can be viewed as being of minor practical importance since they most likely imply that the probability $R$ itself equals 0 or 1, which is of little interest in practice.)

5.2.2 Estimators Based on Empirical Distribution Functions

Definition 5.4 An empirical distribution function (edf) of $F_X(x)$ based on a random sample $\underline{X} = (X_1, \ldots, X_n)$ is the step function

$$\hat{F}_X(x) = \frac{1}{n} \sum_{i=1}^{n} I(X_i \le x). \qquad (5.20)$$

It is well known (see e.g. Hogg and Craig (1978)) that $E\hat{F}_X(x) = F_X(x)$ and $\mathrm{Var}[\hat{F}_X(x)] = n^{-1} F_X(x)[1 - F_X(x)]$.

The edf can thus be used to estimate $\mathrm{Var}(\hat{R})$. Indeed, note that

$$\hat{v}_1 = \int_{-\infty}^{\infty} [\hat{F}_X(x)]^2 \, d\hat{F}_Y(x), \qquad \hat{v}_2 = \int_{-\infty}^{\infty} [1 - \hat{F}_Y(x)]^2 \, d\hat{F}_X(x)$$

are consistent estimators of $v_1$ and $v_2$, respectively, so that

$$S_4^2 = \frac{1}{n_1 n_2} \left[ \hat{R} + (n_1 - 1) \hat{v}_1 + (n_2 - 1) \hat{v}_2 - (n_1 + n_2 - 1) \hat{R}^2 \right] \qquad (5.21)$$

is a consistent estimator of the variance. The estimator $S_4^2$ may seem to be somewhat cumbersome, and Govindarajulu (1968) suggested another estimator $S_5^2$ which is also based on the edfs. Denote

$$v_3 = \int_{-\infty}^{\infty} [F_Y(x)]^2 \, dF_X(x), \qquad \hat{v}_3 = \int_{-\infty}^{\infty} [\hat{F}_Y(x)]^2 \, d\hat{F}_X(x).$$

Evidently, $v_2 = 2R - 1 + v_3$ and $\hat{v}_3$ is a consistent estimator of $v_3$. Direct calculations show that

$$\mathrm{Var}(\hat{R}) = (n_1 n_2)^{-1} \left\{ n_1 [v_1 - R^2] + n_2 [v_3 - (1 - R)^2] + [1 - R(1 - R) - v_1 - v_3] \right\}. \qquad (5.22)$$

If $n_1$ and $n_2$ are sufficiently large, the last term on the right-hand side of (5.22) is substantially smaller than the sum of the first two terms. Hence, $\mathrm{Var}(\hat{R}) \approx (n_1 n_2)^{-1} \left( n_1 [v_1 - R^2] + n_2 [v_3 - (1 - R)^2] \right)$ and

$$S_5^2 = (n_1 n_2)^{-1} \left( n_1 [\hat{v}_1 - \hat{R}^2] + n_2 [\hat{v}_3 - (1 - \hat{R})^2] \right) \qquad (5.23)$$

is a consistent estimator of $\mathrm{Var}(\hat{R})$.

5.2.3 Jackknife Estimators

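It may perhaps be helpful to first briefly review the concept of a jackknife estimator. It was introduced by Quenouille (1956) and extended by Tukey (1958). It was popularized in Gray and Schucany (1972) and has since become one of the most popular estimators in statistical practice. Suppose that we estimate a functional $\theta = \theta(F)$ of the cdf $F$ by means of its empirical counterpart

$$\hat{\theta} = \hat{\theta}(X_1, \ldots, X_n) = \theta(\hat{F}), \qquad (5.24)$$

where $\hat{F}$ is the edf based on the observations $X_1, \ldots, X_n$. Let

$$\hat{\theta}_{(i)} = \hat{\theta}(X_1, \ldots, X_{i-1}, X_{i+1}, \ldots, X_n) = \theta(\hat{F}_{(i)}),$$

where $\hat{F}_{(i)}$ is the edf based on the $(n - 1)$ observations $(X_1, \ldots, X_{i-1}, X_{i+1}, \ldots, X_n)$ (omitting the observation $X_i$), and

$$\hat{\theta}_{(\cdot)} = \frac{1}{n} \sum_{j=1}^{n} \hat{\theta}_{(j)}. \qquad (5.25)$$

Then the estimator of $\mathrm{Var}(\hat{\theta})$, called the jackknife estimator, is of the form (see e.g. Efron (1982))

$$\widehat{\mathrm{Var}}_J(\hat{\theta}) = \frac{n - 1}{n} \sum_{j=1}^{n} \left[ \hat{\theta}_{(j)} - \hat{\theta}_{(\cdot)} \right]^2. \qquad (5.26)$$

The reasoning behind the estimator (5.26) is as follows. The variance of the estimator $\hat{\theta}(X_1, \ldots, X_n)$ given by (5.24) is approximately equal to $\sigma_0^2/n$ for large values of $n$ and some constant $\sigma_0^2$. On the other hand, $\hat{\theta}_{(j)}$, $j = 1, \ldots, n$, can be treated as a "sample" of estimators $\hat{\theta}$ based on $n - 1$ observations, and $\hat{\theta}_{(\cdot)}$ defined by (5.25) plays the role of the mean of this "sample". Therefore, $(n - 1)^{-1} \sum_{j=1}^{n} [\hat{\theta}_{(j)} - \hat{\theta}_{(\cdot)}]^2$ is the sample variance of this "sample", and (5.26) rescales it to the scale of $\mathrm{Var}(\hat{\theta})$. For the sample mean, for instance, (5.26) reduces to the familiar estimator $[n(n - 1)]^{-1} \sum_{j=1}^{n} (X_j - \bar{X})^2$.

To apply this idea to $\hat{R}$, denote by $U_{X(i)}$ and $U_{Y(j)}$ the estimators of the form (5.3) obtained by omitting $X_i$ and $Y_j$, respectively,

$$U_{X(i)} = \frac{1}{(n_1 - 1) n_2} \sum_{k \ne i} \sum_{l=1}^{n_2} I(X_k < Y_l), \qquad U_{Y(j)} = \frac{1}{n_1 (n_2 - 1)} \sum_{k=1}^{n_1} \sum_{l \ne j} I(X_k < Y_l),$$

based on $(n_1 - 1) n_2$ and $n_1 (n_2 - 1)$ pairs of observations, respectively, and observe that their averages $U_{X(\cdot)} = U_{Y(\cdot)} = \hat{R}$. Hence, the jackknife estimator of $\mathrm{Var}(\hat{R})$ is

$$S_6^2 = \frac{n_1 - 1}{n_1} \sum_{i=1}^{n_1} \left[ U_{X(i)} - \hat{R} \right]^2 + \frac{n_2 - 1}{n_2} \sum_{j=1}^{n_2} \left[ U_{Y(j)} - \hat{R} \right]^2. \qquad (5.27)$$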
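A two-sample jackknife of this kind can be sketched as follows. The code assumes the leave-one-out form of (5.27) with the factors $(n_1 - 1)/n_1$ and $(n_2 - 1)/n_2$, samples without ties, and at least two observations per sample; the function name and data are hypothetical.

```python
from fractions import Fraction


def jackknife_variance(xs, ys):
    """Two-sample jackknife estimate of Var(R-hat) in the form (5.27)."""
    n1, n2 = len(xs), len(ys)
    wins_x = [sum(1 for y in ys if x < y) for x in xs]  # pairs lost when X_i is omitted
    wins_y = [sum(1 for x in xs if x < y) for y in ys]  # pairs lost when Y_j is omitted
    w = sum(wins_x)
    r_hat = Fraction(w, n1 * n2)
    leave_x = [Fraction(w - c, (n1 - 1) * n2) for c in wins_x]
    leave_y = [Fraction(w - c, n1 * (n2 - 1)) for c in wins_y]
    part_x = Fraction(n1 - 1, n1) * sum((u - r_hat) ** 2 for u in leave_x)
    part_y = Fraction(n2 - 1, n2) * sum((u - r_hat) ** 2 for u in leave_y)
    return part_x + part_y


# Hypothetical data, used only for illustration.
xs = [1.2, 3.4, 0.7]
ys = [2.5, 4.1, 3.9, 0.9]
s6_sq = jackknife_variance(xs, ys)
```

Each leave-one-out estimate is obtained by subtracting the pairs involving the omitted observation from $W$, so the whole computation needs only one pass over the pairwise comparisons.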

The estimator (5.27) was originally introduced by Cheng and Chao (1984), who provide a slightly different but equivalent representation of (5.27), and was studied by Shirahata (1993).

5.3 Interval Estimation of R

In this section we shall consider the construction of confidence intervals for $R$ based on $\hat{R}$. First we shall briefly review the following approaches to interval estimation of $R$: 1) confidence intervals based on the Chebyshev (or the Hoeffding) inequality or the Kolmogorov-Smirnov statistics; 2) asymptotic confidence intervals based on a normal approximation; 3) confidence intervals based on pivotal quantities; 4) confidence intervals constructed by means of the bootstrap method.

5.3.1 Confidence Intervals Based on Classical Inequalities

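Denote $\nu = \min(n_1, n_2)$ and recall that $\hat{R}$ is the unbiased estimator of $R$ with the variance bounded by $(4\nu)^{-1}$ (cf. (5.10)). Therefore, for any $\varepsilon > 0$, the Uspensky (see e.g. Bennett (1962)) and the Chebyshev inequalities yield

$$P(R < \hat{R} + \varepsilon) \ge 1 - (1 + 4\nu\varepsilon^2)^{-1} \qquad (5.28)$$

and

$$P(|\hat{R} - R| < \varepsilon) \ge 1 - (4\nu\varepsilon^2)^{-1}, \qquad (5.29)$$

respectively. The two-sided confidence interval (5.29) was originally suggested by Ury (1972) while the one-sided version (5.28) appears in Yang and Mo (1985), who also derived the Hoeffding-type bounds for $\hat{R}$:

$$P(R < \hat{R} + \varepsilon) \ge 1 - e^{-2\nu\varepsilon^2}, \qquad (5.30)$$

$$P(|\hat{R} - R| < \varepsilon) \ge 1 - 2e^{-2\nu\varepsilon^2} \qquad (5.31)$$

(see Hoeffding (1963)). Combining (5.28) with (5.30) and (5.29) with (5.31), we obtain the result derived in Yang and Mo (1985):

$$P(R < \hat{R} + \varepsilon) \ge 1 - \min\left\{ (1 + 4\nu\varepsilon^2)^{-1}, \; e^{-2\nu\varepsilon^2} \right\}, \qquad P(|\hat{R} - R| < \varepsilon) \ge 1 - \min\left\{ (4\nu\varepsilon^2)^{-1}, \; 2e^{-2\nu\varepsilon^2} \right\}.$$

Therefore, the one-sided and the two-sided $(1 - \gamma)$-confidence intervals for $R$ are

$$P(R < \hat{R} + \varepsilon_{\gamma,1}) \ge 1 - \gamma, \qquad P(\hat{R} - \varepsilon_{\gamma,2} < R < \hat{R} + \varepsilon_{\gamma,2}) \ge 1 - \gamma, \qquad (5.32)$$

where

$$\varepsilon_{\gamma,1} = \min\left\{ \left[ \frac{1 - \gamma}{4\nu\gamma} \right]^{1/2}, \left[ \frac{-\ln \gamma}{2\nu} \right]^{1/2} \right\}, \qquad \varepsilon_{\gamma,2} = \min\left\{ (4\nu\gamma)^{-1/2}, \left[ \frac{\ln(2/\gamma)}{2\nu} \right]^{1/2} \right\}. \qquad (5.33)$$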
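The half-widths in (5.33) are obtained by solving the bounds (5.28) to (5.31) for $\varepsilon$ at level $\gamma$ and taking the smaller solution in each case. A minimal sketch, with an illustrative function name and hypothetical sample sizes:

```python
import math


def classical_half_widths(n1, n2, gamma):
    """Return (eps_one_sided, eps_two_sided) obtained by inverting (5.28)-(5.31)."""
    nu = min(n1, n2)
    uspensky = math.sqrt((1.0 - gamma) / (4.0 * nu * gamma))   # from (5.28)
    hoeffding_one = math.sqrt(-math.log(gamma) / (2.0 * nu))   # from (5.30)
    chebyshev = 1.0 / math.sqrt(4.0 * nu * gamma)              # from (5.29)
    hoeffding_two = math.sqrt(math.log(2.0 / gamma) / (2.0 * nu))  # from (5.31)
    return min(uspensky, hoeffding_one), min(chebyshev, hoeffding_two)


# Hypothetical sample sizes and confidence level 0.95.
eps_one, eps_two = classical_half_widths(50, 80, 0.05)
```

For small $\gamma$ the Hoeffding-type term is the smaller one, while for large $\gamma$ the Uspensky and Chebyshev terms take over, which is the comparison behind Exercise 5.4.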

Note that the confidence bounds in (5.32) involve the size of the smaller sample only, hence they would not perform very well if the difference in sample sizes $|n_1 - n_2|$ is large.

5.3.2 Confidence Intervals Based on the Kolmogorov-Smirnov Statistics

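Confidence intervals for $R$ based on the Kolmogorov-Smirnov statistics were derived by Birnbaum and McCarty (1958) and chronologically are the first ones. Following these authors directly and using (2.6) and integration by parts, we represent $\hat{R} - R$ as

$$\hat{R} - R = \int_{-\infty}^{\infty} \hat{F}_X(x) \, d\hat{F}_Y(x) - \int_{-\infty}^{\infty} F_X(x) \, dF_Y(x)$$

$$= \int_{-\infty}^{\infty} \hat{F}_X(x) \, d[\hat{F}_Y(x) - F_Y(x)] + \int_{-\infty}^{\infty} (\hat{F}_X(x) - F_X(x)) \, dF_Y(x) \qquad (5.34)$$

$$= \int_{-\infty}^{\infty} [F_Y(x) - \hat{F}_Y(x)] \, d\hat{F}_X(x) + \int_{-\infty}^{\infty} (\hat{F}_X(x) - F_X(x)) \, dF_Y(x).$$

Denoting the Kolmogorov-Smirnov statistics by

$$D^-_{n_1} = \sup_{z \in (-\infty, \infty)} [F_X(z) - \hat{F}_X(z)], \qquad D^+_{n_1} = \sup_{z \in (-\infty, \infty)} [\hat{F}_X(z) - F_X(z)], \qquad D^+_{n_2} = \sup_{z \in (-\infty, \infty)} [\hat{F}_Y(z) - F_Y(z)]$$

(see e.g. Sprent (1989)), we obtain from (5.34) that $P(R < \hat{R} + \varepsilon) \ge P(D^-_{n_1} + D^+_{n_2} < \varepsilon)$. It is well known (see e.g. Csaki (1984)) that for any $z$, $P(D^-_n < z) = P(D^+_n < z) = F^*_n(z)$, where $F^*_n(z)$ is a cdf which depends solely on $n$ but not on $F_X$ or $F_Y$. Thus,

$$P(R < \hat{R} + \varepsilon) \ge P(D^+_{n_1} + D^+_{n_2} < \varepsilon) = F^*_{n_1, n_2}(\varepsilon), \qquad (5.35)$$

where $F^*_{n_1, n_2}$ is the convolution of the cdfs $F^*_{n_1}$ and $F^*_{n_2}$:

$$F^*_{n_1, n_2}(\varepsilon) = \int_0^{\varepsilon} F^*_{n_1}(\varepsilon - z) \, dF^*_{n_2}(z). \qquad (5.36)$$

Using an exact expression for $F^*_n(z)$,

$$F^*_n(z) = 1 - z \sum_{j=0}^{[n(1 - z)]} \binom{n}{j} \left( 1 - z - \frac{j}{n} \right)^{n - j} \left( z + \frac{j}{n} \right)^{j - 1}, \qquad z \in [0, 1], \qquad (5.37)$$

where $[n(1 - z)]$ is the integer part of $n(1 - z)$, and (5.36), one can evaluate numerically the right-hand side of (5.35). However, when reversing the problem, i.e. if the confidence coefficient $(1 - \gamma)$ is given and we are required to find $\varepsilon_{\gamma,3}$ such that $F^*_{n_1, n_2}(\varepsilon_{\gamma,3}) \ge 1 - \gamma$, the calculations based on (5.36) and (5.37) become very cumbersome. This prompted Birnbaum and McCarty (1958) to derive an asymptotic expression for $F^*_{n_1, n_2}$ as $\nu \to \infty$.

Theorem 5.2 If $\nu = \min(n_1, n_2) \to \infty$, then we have uniformly in $\varepsilon \in [0, 1]$

$$F^*_{n_1, n_2}(\varepsilon) - Q^+_{n_1, n_2}(\varepsilon) \to 0, \qquad (5.38)$$

where

$$Q^+_{n_1, n_2}(\varepsilon) = 1 - \frac{n_1}{N} e^{-2 n_2 \varepsilon^2} - \frac{n_2}{N} e^{-2 n_1 \varepsilon^2} - \frac{2\sqrt{2\pi}\, n_1 n_2 \varepsilon}{N^{3/2}} \, e^{-2 n_1 n_2 \varepsilon^2 / N} \left[ \Phi\!\left( \frac{2 n_1 \varepsilon}{\sqrt{N}} \right) + \Phi\!\left( \frac{2 n_2 \varepsilon}{\sqrt{N}} \right) - 1 \right];$$

as above, $N = n_1 + n_2$ and $\Phi(\cdot)$ is the standard normal cdf.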

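Proof. The proof of this statement is based on the classical result for the Kolmogorov-Smirnov statistic (see e.g. Csaki (1984)) which states that

$$\lim_{n \to \infty} P(D^+_n < z/\sqrt{n}) = 1 - e^{-2z^2} = L(z).$$

Therefore, $[F^*_n(z) - L(z\sqrt{n})] \to 0$ uniformly in $z$. To complete the proof one needs only to note that the limit in Theorem 5.2 is the convolution

$$Q^+_{n_1, n_2}(\varepsilon) = \int_0^{\varepsilon} L((\varepsilon - z)\sqrt{n_1}) \, dL(z\sqrt{n_2}). \qquad (5.39)$$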
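Both ingredients of the one-sided bound can be evaluated numerically. The sketch below implements the finite sum (5.37) and compares a closed-form evaluation of the convolution (5.39), with $L(z) = 1 - e^{-2z^2}$, against a direct midpoint-rule integration. The function names and the step count are illustrative choices; the closed form is the one obtained by carrying out the integral in (5.39).

```python
import math
from statistics import NormalDist


def ks_one_sided_cdf(n, z):
    """Exact P(D_n^+ < z) from the finite sum (5.37)."""
    if z <= 0.0:
        return 0.0
    if z >= 1.0:
        return 1.0
    total = 0.0
    for j in range(math.floor(n * (1.0 - z)) + 1):
        base = 1.0 - z - j / n
        if base <= 0.0:  # boundary term is zero; also guards against rounding
            continue
        total += math.comb(n, j) * base ** (n - j) * (z + j / n) ** (j - 1)
    return 1.0 - z * total


def limit_cdf_closed_form(n1, n2, eps):
    """Closed-form value of the convolution (5.39)."""
    n = n1 + n2
    phi = NormalDist().cdf
    tail = (2.0 * math.sqrt(2.0 * math.pi) * n1 * n2 * eps / n ** 1.5
            * math.exp(-2.0 * n1 * n2 * eps ** 2 / n)
            * (phi(2.0 * n1 * eps / math.sqrt(n)) + phi(2.0 * n2 * eps / math.sqrt(n)) - 1.0))
    return (1.0 - (n1 / n) * math.exp(-2.0 * n2 * eps ** 2)
            - (n2 / n) * math.exp(-2.0 * n1 * eps ** 2) - tail)


def limit_cdf_numeric(n1, n2, eps, steps=20000):
    """Midpoint-rule evaluation of the integral in (5.39)."""
    h = eps / steps
    total = 0.0
    for k in range(steps):
        z = (k + 0.5) * h
        cdf_part = 1.0 - math.exp(-2.0 * n1 * (eps - z) ** 2)
        density = 4.0 * n2 * z * math.exp(-2.0 * n2 * z * z)  # derivative of L(z*sqrt(n2))
        total += cdf_part * density * h
    return total
```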

Formulas (5.35) - (5.37) and Theorem 5.2 provide a one-sided confidence interval for $R$. To obtain a two-sided confidence interval for $R$ one is required to use the statistics

$$D_{n_1} = \sup_{z \in (-\infty, \infty)} |\hat{F}_X(z) - F_X(z)|, \qquad D_{n_2} = \sup_{z \in (-\infty, \infty)} |\hat{F}_Y(z) - F_Y(z)| \qquad (5.40)$$

with the limiting distribution (see e.g. Smirnov (1948))

$$\lim_{n \to \infty} P(\sqrt{n}\, D_n < z) = \sum_{k=-\infty}^{\infty} (-1)^k e^{-2k^2 z^2} = L^*(z).$$

It is easy to observe from (5.34) and (5.40) that

$$P(|\hat{R} - R| < \varepsilon) \ge P(D_{n_1} + D_{n_2} < \varepsilon)$$

and using an analog of (5.39) it can be shown that the limiting distribution for $D_{n_1} + D_{n_2}$ is

$$Q_{n_1, n_2}(\varepsilon) = \int_0^{\varepsilon} L^*((\varepsilon - z)\sqrt{n_1}) \, dL^*(z\sqrt{n_2}). \qquad (5.41)$$

A disadvantage of the Birnbaum-McCarty confidence intervals is that they are quite conservative. In fact, Ury (1972) comments that for $\gamma = 0.5$ and $n_1 = n_2$, Chebyshev's intervals require about 1/3 of the sample size needed for the Birnbaum-McCarty bound. This is due to the fact, as observed by Yang and Mo (1985), that the Birnbaum-McCarty bound is actually a simultaneous confidence bound for all

$$R(z) = P(X < Y + z) = \int_{-\infty}^{\infty} F_X(x + z) \, dF_Y(x), \qquad z \in (-\infty, \infty),$$

rather than just for $R = R(0)$. Indeed, let

$$\hat{R}(z) = (n_1 n_2)^{-1} \sum_{i=1}^{n_1} \sum_{j=1}^{n_2} I(X_i < Y_j + z) = \int_{-\infty}^{\infty} \hat{F}_X(x + z) \, d\hat{F}_Y(x);$$

then, analogously to the derivation leading to (5.34),

$$\hat{R}(z) - R(z) = \int_{-\infty}^{\infty} [F_Y(x) - \hat{F}_Y(x)] \, d\hat{F}_X(x + z) + \int_{-\infty}^{\infty} (\hat{F}_X(x + z) - F_X(x + z)) \, dF_Y(x).$$

Hence, $\sup_z [R(z) - \hat{R}(z)] \le D^-_{n_1} + D^+_{n_2}$ and thus $P(\sup_z [R(z) - \hat{R}(z)] < \varepsilon) \ge P(D^+_{n_1} + D^+_{n_2} < \varepsilon) = F^*_{n_1, n_2}(\varepsilon)$ (compare with (5.35)). This simultaneous confidence intervals interpretation was elaborated by Arsham (1986).

5.3.3 Confidence Intervals Based on the Asymptotic Normality

As was just mentioned above, both the inequalities-based confidence intervals and the Birnbaum-McCarty confidence intervals are too conservative (the first ones due to the use of the minimal sample size and the second ones because they are actually simultaneous confidence intervals for $R(z)$). Moreover, both types of confidence intervals are geared towards the least favorable pair of distributions $F_X$ and $F_Y$, which may not be the case in practice. To remedy the situation, a number of researchers utilized with some success the asymptotic normality of $\hat{R}$ for obtaining more stringent confidence intervals. Indeed, if

$$\sigma^2 = \frac{n_1 n_2 \mathrm{Var}(\hat{R})}{N} = \frac{R + (n_1 - 1) v_1 + (n_2 - 1) v_2 - (N - 1) R^2}{N}, \qquad (5.42)$$

where, as before, $n_1 + n_2 = N$, then $\sqrt{n_1 n_2 / N}\,(\hat{R} - R)$ is asymptotically normally distributed with zero mean and variance $\sigma^2$ as $n_1$ and $n_2$ tend to infinity. Since

$$\frac{1}{\nu} \le \frac{n_1 + n_2}{n_1 n_2} = \frac{1}{n_1} + \frac{1}{n_2} \le \frac{2}{\nu},$$

where, as in Section 5.3.1, $\nu = \min(n_1, n_2)$, one can use $\nu$ instead as the normalizing coefficient, namely, $\sqrt{\nu}(\hat{R} - R)$ will be asymptotically normally distributed with zero mean and variance $\nu \mathrm{Var}(\hat{R})$. As noted by Govindarajulu (1968) this implies, using van Dantzig's (1951) upper bound (5.10), that $\nu \mathrm{Var}(\hat{R}) \le 1/4$. Hence, the one-sided and two-sided $(1 - \gamma)$-confidence intervals are of the form

$$P\left( R < \hat{R} + \frac{z_{\gamma}}{2\sqrt{\nu}} \right) \ge 1 - \gamma, \qquad P\left( |\hat{R} - R| < \frac{z_{\gamma/2}}{2\sqrt{\nu}} \right) \ge 1 - \gamma, \qquad (5.43)$$

respectively, where $z_{\alpha}$ is the $(1 - \alpha)$ quantile of the standard normal distribution.
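The two-sided interval in (5.43) needs only $\hat{R}$, the sample sizes and a normal quantile. A minimal sketch follows; the function name is illustrative, and the clipping of the endpoints to $[0, 1]$ is an added convenience that is not part of (5.43).

```python
import math
from statistics import NormalDist


def conservative_normal_interval(r_hat, n1, n2, gamma=0.05):
    """Two-sided interval of (5.43): half-width z_{gamma/2} / (2 sqrt(min(n1, n2)))."""
    nu = min(n1, n2)
    z = NormalDist().inv_cdf(1.0 - gamma / 2.0)
    half_width = z / (2.0 * math.sqrt(nu))
    # Clipping to [0, 1] is an assumption of this sketch, since R is a probability.
    return max(0.0, r_hat - half_width), min(1.0, r_hat + half_width)


# Hypothetical inputs.
lower, upper = conservative_normal_interval(0.75, 30, 50, gamma=0.05)
```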

Even these refined intervals (5.43) are still quite conservative since they are based on van Dantzig's upper bound (5.10) for the variance, which may sometimes be much larger than the actual value of the variance. Hilgers (1981) suggested using confidence intervals based on asymptotic normality and an estimator of the standardized variance $\sigma^2$. He proves the following statement.

Theorem 5.3 If $\min(n_1, n_2) \to \infty$ and $S_N^2$ is a sequence of estimators for $\mathrm{Var}(\hat{R})$ such that $(n_1 n_2 / N)[S_N^2 - \mathrm{Var}(\hat{R})]$ converges in probability to zero, then $(\hat{R} - R)/S_N$ is asymptotically normal with zero mean and unit variance.

One could verify that the estimators $S_j^2$, $j = 1, \ldots, 6$, in Section 5.2 (equations (5.13), (5.15), (5.16), (5.21), (5.23) and (5.27)) satisfy the conditions of Theorem 5.3, so that the asymptotic $(1 - \gamma)$-confidence intervals can be expressed as

$$P(R < \hat{R} + z_{\gamma} S_j) \ge 1 - \gamma, \qquad P(|\hat{R} - R| < z_{\gamma/2} S_j) \ge 1 - \gamma, \qquad (5.44)$$

for all $j = 1, \ldots, 6$. The reader may wish to consult Shirahata (1993) and Cheng and Chao (1984) for a comparison of the various types of confidence intervals.

5.3.4 Confidence Intervals Based on Pivotal Quantities

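In this subsection we shall review two different methods of interval estimation of $R$ based on pivotal quantities suggested by Halperin et al. (1987) and Feigin et al. (2001). Further details on the Halperin et al. (1987) approach are given in Chapter 7.

The technique of Halperin et al. (1987) is based on the inequality (5.9) which implies that

$$\mathrm{Var}(\hat{R}) = \frac{[(n_1 + n_2 - 2)\rho + 1] R(1 - R)}{n_1 n_2}$$

for some $\rho \in [0, 1]$. From this equality it follows that

$$\rho = \frac{n_1 n_2 \mathrm{Var}(\hat{R}) - R(1 - R)}{(n_1 + n_2 - 2) R(1 - R)}$$

and, thus, $\rho$ can be estimated by

$$\hat{\rho} = \frac{n_1 n_2 S_j^2 - \hat{R}(1 - \hat{R})}{(n_1 + n_2 - 2) \hat{R}(1 - \hat{R})},$$

where $S_j^2$ is one of the estimators of the variance presented in Section 5.2, $j = 1, \ldots, 6$. Halperin et al. (1987) use Govindarajulu's estimator (equation (5.21)) and propose the following pivotal quantity:

$$Z_1 = \frac{(\hat{R} - R)\sqrt{n_1 n_2}}{\sqrt{[(n_1 + n_2 - 2)\hat{\rho} + 1] R(1 - R)}}. \qquad (5.45)$$

It follows from (5.45) that with probability of at least $(1 - \gamma)$, $0 < \gamma < 1$,

$$\frac{|\hat{R} - R| \sqrt{n_1 n_2}}{\sqrt{[\hat{\rho}(n_1 + n_2 - 2) + 1] R(1 - R)}} \le z_{\gamma/2}, \qquad (5.46)$$

where, as above, $z_{\gamma/2}$ is the $(1 - \gamma/2)$ quantile of the standard normal distribution. Solving the last inequality for $R$, we obtain

$$P\left( \frac{2\hat{R} + H_1 - \sqrt{H_1^2 + 4 H_1 \hat{R}(1 - \hat{R})}}{2(1 + H_1)} \le R \le \frac{2\hat{R} + H_1 + \sqrt{H_1^2 + 4 H_1 \hat{R}(1 - \hat{R})}}{2(1 + H_1)} \right) \ge 1 - \gamma, \qquad (5.47)$$

where

$$H_1 = \frac{z_{\gamma/2}^2 \, [\hat{\rho}(n_1 + n_2 - 2) + 1]}{n_1 n_2}.$$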

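It is easy to verify that the expression under the square root in (5.47) is positive for all $0 < \gamma < 1$.

A similar but slightly different method for the construction of a confidence interval for $R$ has more recently been provided by Feigin et al. (2001). The authors note that $v_1$ and $v_2$ in (5.6) can be represented as

$$v_1 = P[\max(X_1, X_2) < Y_1] \qquad \text{and} \qquad v_2 = P[X_1 < \min(Y_1, Y_2)]$$

and, hence,

$$\hat{v}_1 = \frac{1}{n_1 (n_1 - 1) n_2} \sum_{i \ne j} \sum_{k=1}^{n_2} I(\max(X_i, X_j) < Y_k), \qquad \hat{v}_2 = \frac{1}{n_1 n_2 (n_2 - 1)} \sum_{i=1}^{n_1} \sum_{j \ne k} I(X_i < \min(Y_j, Y_k)) \qquad (5.48)$$

are unbiased estimators of $v_1$ and $v_2$, respectively. It is easy to observe that $\hat{v}_1$ and $\hat{v}_2$ are modified versions of Sen's statistics $V_1$ and $V_2$ defined in (5.14) with $EV_i = R - v_i$, $i = 1, 2$.

Now let $\hat{v}_3 = (n_1 - 1)\hat{v}_1 + (n_2 - 1)\hat{v}_2$. Since $\mathrm{Var}(\hat{R})$ is of the form (5.7), Feigin et al. (2001) propose the pivotal quantity

$$Z_2 = \frac{\sqrt{n_1 n_2}\,(\hat{R} - R)}{\sqrt{R + \hat{v}_3 - (n_1 + n_2 - 1) R^2}} \qquad (5.49)$$

(cf. (5.45)). Thus,

$$P\left( \frac{\sqrt{n_1 n_2}\,|\hat{R} - R|}{\sqrt{R + \hat{v}_3 - (n_1 + n_2 - 1) R^2}} \le z_{\gamma/2} \right) \ge 1 - \gamma$$

(cf. (5.46)). Solving the last inequality for $R$ we arrive at

$$P\left( \frac{A - \sqrt{A^2 - B^2 C^2}}{B^2} \le R \le \frac{A + \sqrt{A^2 - B^2 C^2}}{B^2} \right) \ge 1 - \gamma, \qquad (5.50)$$

where

$$H^2 = \frac{z_{\gamma/2}^2}{n_1 n_2}, \qquad A = \hat{R} + \frac{H^2}{2}, \qquad B^2 = 1 + (n_1 + n_2 - 1) H^2, \qquad C^2 = \hat{R}^2 - H^2 \hat{v}_3.$$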
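The interval (5.50) can be computed directly from the data. The sketch below assumes the notation $H^2$, $A$, $B^2$, $C^2$ given above, samples without ties and at least two observations per sample; the function name and the data are hypothetical, and the guard against a negative discriminant is an added safety measure of this sketch.

```python
import math
from statistics import NormalDist


def feigin_interval(xs, ys, gamma=0.05):
    """Return (lower, upper, r_hat, v1_hat, v2_hat) for the interval (5.50)."""
    n1, n2 = len(xs), len(ys)
    r_hat = sum(1 for x in xs for y in ys if x < y) / (n1 * n2)
    # Unbiased estimators (5.48): ordered pairs i != j of X's (resp. j != k of Y's).
    count1 = sum(1 for i in range(n1) for j in range(n1) if i != j
                 for y in ys if max(xs[i], xs[j]) < y)
    count2 = sum(1 for x in xs for j in range(n2) for k in range(n2)
                 if j != k and x < min(ys[j], ys[k]))
    v1_hat = count1 / (n1 * (n1 - 1) * n2)
    v2_hat = count2 / (n1 * n2 * (n2 - 1))
    v3_hat = (n1 - 1) * v1_hat + (n2 - 1) * v2_hat
    z = NormalDist().inv_cdf(1.0 - gamma / 2.0)
    h_sq = z * z / (n1 * n2)
    a = r_hat + h_sq / 2.0
    b_sq = 1.0 + (n1 + n2 - 1) * h_sq
    c_sq = r_hat ** 2 - h_sq * v3_hat
    root = math.sqrt(max(a * a - b_sq * c_sq, 0.0))
    return (a - root) / b_sq, (a + root) / b_sq, r_hat, v1_hat, v2_hat


# Hypothetical data, used only for illustration.
xs = [1.2, 3.4, 0.7]
ys = [2.5, 4.1, 3.9, 0.9]
lower, upper, r_hat, v1_hat, v2_hat = feigin_interval(xs, ys)
```

Since the interval rests on a normal approximation, its endpoints may fall outside $[0, 1]$ for very small samples such as the one used here.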

It is not difficult to show that $A^2 - B^2 C^2$ is positive for all possible values of $\hat{v}_1$, $\hat{v}_2$ and $\gamma \in [0, 1]$.

5.3.5 Confidence Intervals Constructed by Bootstrap Method

The bootstrap procedure for the construction of confidence intervals has been described in some detail in Section 4.5. There, we assumed that $(X, Y)$ have the cdf $F_{\theta}(x, y)$ with an unknown scalar or vector-valued parameter $\theta \in \Theta$, where $\Theta$ is a set of the parameter values, and then constructed the MLE $\hat{F}$ of $F$ of the form $\hat{F}(x, y) = F_{\hat{\theta}}(x, y)$. We base our interval estimators on the independent bootstrap samples from $\hat{F}(x, y)$. In a nonparametric situation, the form of $F_{\theta}(x, y)$ is unknown, so that the MLE of $F$ is the joint edf

$$\hat{F}(x, y) = n^{-1} \sum_{j=1}^{n} I(X_j \le x, Y_j \le y) \qquad (5.51)$$

provided $X$ and $Y$ are dependent, or simply the product of the marginal edfs

$$\hat{F}_X(x) \hat{F}_Y(y) = (n_1 n_2)^{-1} \sum_{i=1}^{n_1} \sum_{j=1}^{n_2} I(X_i \le x) I(Y_j \le y) \qquad (5.52)$$

in the independent case. After generating the bootstrap samples $(\underline{X}^{(j)}, \underline{Y}^{(j)})$, $j = 1, \ldots, M$, using (5.51) or (5.52), we construct the bootstrap confidence intervals in the same manner as it was done in Section 4.5. Cheng and Chao (1984) applied the percentile method (without bias correction) described in Section 4.5.3 to construct bootstrap confidence intervals for $R$.

5.4 Nonparametric Bayes and Empirical Bayes Estimation

In this section we shall consider the nonparametric Bayes and empirical Bayes approaches to estimation of $R$. To understand the material below it would be helpful to be familiar with the main concepts of measure theory, which cannot be reviewed even briefly in a book of this size. Hence, readers who are not equipped with this knowledge are encouraged to study measure theory using any of the numerous standard textbooks (such as Billingsley (1995) or Shiryaev (1996)) or to skim this section in order to obtain a general idea of the methodology.

5.4.1 Dirichlet Process Preliminaries

It was Ferguson (1973) who initiated nonparametric Bayes estimation by introducing the by now classical Dirichlet process, which is highly flexible and versatile in assigning prior measures. Before studying his definition, it is desirable to recall the Dirichlet distribution.

Definition 5.5 Let $Z_i$, $i = 1, \ldots, k$, be independent random variables with the pdfs Gamma$(1, \alpha_i)$ of the form (3.9) where $\alpha_i \ge 0$ for all $i$ and $\alpha_i > 0$ for some $i$, $i = 1, \ldots, k$. The Dirichlet distribution with parameters $(\alpha_1, \ldots, \alpha_k)$ is defined as the distribution of $(Y_1, \ldots, Y_k)$ where

$$Y_i = \frac{Z_i}{\sum_{j=1}^{k} Z_j}, \qquad i = 1, \ldots, k.$$

The Dirichlet distribution is always singular with respect to Lebesgue measure in $k$-dimensional space since $Y_1 + \cdots + Y_k = 1$. However, if $\alpha_i > 0$ for all $i$, the $(k - 1)$-dimensional distribution of $(Y_1, \ldots, Y_{k-1})$ is absolutely continuous with the pdf

$$f(y_1, \ldots, y_{k-1}) = \frac{\Gamma(\alpha_1 + \cdots + \alpha_k)}{\Gamma(\alpha_1) \cdots \Gamma(\alpha_k)} \left( \prod_{i=1}^{k-1} y_i^{\alpha_i - 1} \right) \left( 1 - \sum_{i=1}^{k-1} y_i \right)^{\alpha_k - 1} \times I((y_1, \ldots, y_{k-1}) \in S), \qquad (5.53)$$

where $I(\cdot)$ is the indicator function and $S$ is the simplex

$$S = \left\{ (y_1, \ldots, y_{k-1}) : \; y_i \ge 0, \; i = 1, \ldots, k - 1, \; \sum_{i=1}^{k-1} y_i \le 1 \right\}.$$

For $k = 2$, expression (5.53) reduces to the density of the beta distribution.

Let $(\mathcal{X}, \mathcal{A})$ be a measurable space. Ferguson (1973) defined the following stochastic process $\{P(A), A \in \mathcal{A}\}$.

Definition 5.6 Let $(\mathcal{X}, \mathcal{A})$ be a measurable space. Let $\alpha$ be a nonnull finite measure (nonnegative and finitely additive) on $(\mathcal{X}, \mathcal{A})$. The measure $P(A)$ is a Dirichlet process on $(\mathcal{X}, \mathcal{A})$ with parameter $\alpha$ if for every $k = 1, 2, \ldots$, and every measurable partition $(B_1, \ldots, B_k)$ of $\mathcal{X}$, the vector $(P(B_1), \ldots, P(B_k))$ has the Dirichlet distribution with the parameter $(\alpha(B_1), \ldots, \alpha(B_k))$.

Definition 5.7 Let $P$ be a random probability measure on $(\mathcal{X}, \mathcal{A})$. We say that $X_1, \ldots, X_n$ is a sample of size $n$ from $P$ if for any positive integer $m$ and measurable sets $A_1, \ldots, A_m$, $C_1, \ldots, C_n$

$$\mathcal{P}\{X_1 \in C_1, \ldots, X_n \in C_n \mid P(A_1), \ldots, P(A_m), P(C_1), \ldots, P(C_n)\} = \prod_{j=1}^{n} P(C_j)$$

with probability one. Here $\mathcal{P}\{\cdot\}$ denotes the probability of an event.

Intuitively, we may view a sample of size $n$ from a Dirichlet process as follows. The process chooses a random distribution $F$, and then, given $F$, $X_1, \ldots, X_n$ is a random sample from $F$. The following theorem provides the conditional distribution of a Dirichlet process $P$ given a sample $X_1, \ldots, X_n$ from $P$.

Theorem 5.4 (Ferguson (1973)). Let $P$ be a Dirichlet process on $(\mathcal{X}, \mathcal{A})$ with parameter $\alpha$, and let $X_1, \ldots, X_n$ be a sample of size $n$ from $P$. Then the conditional distribution of $P$ given $X_1, \ldots, X_n$ is a Dirichlet process with parameter $\alpha + \sum_{i=1}^{n} \delta_{X_i}$, where $\delta_x$ is a measure assigning mass one to the point $x$.

The following statement combining Theorems 3 and 4 of Ferguson (1973) explains how one can calculate the expectations of the integrals with respect to random probability measures.

Theorem 5.5 Let $P$ be the Dirichlet process with the parameter $\alpha$ and let $f_1$ and $f_2$ be measurable real-valued functions defined on $(\mathcal{X}, \mathcal{A})$. If $\int |f_j| \, d\alpha < \infty$, $j = 1, 2$, and $\int |f_1 f_2| \, d\alpha < \infty$, then $\int |f_j| \, dP < \infty$ with probability one, and the expectations of the integrals are of the form

$$E \int f_j \, dP = \int f_j \, d(EP) = [\alpha(\mathcal{X})]^{-1} \int f_j \, d\alpha, \qquad j = 1, 2,$$

$$E \left[ \int f_1 \, dP \int f_2 \, dP \right] = [\alpha(\mathcal{X}) + 1]^{-1} [\alpha(\mathcal{X})]^{-1} \left[ \int f_1 f_2 \, d\alpha + \int f_1 \, d\alpha \int f_2 \, d\alpha \right].$$

5.4.2 Nonparametric Bayes Estimation of R

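Before starting to describe estimation of $R$, we shall consider a more general problem of nonparametric Bayes estimation of a distribution function $F(t) = P((-\infty, t])$ under the squared loss. Let $\mathcal{X} = \mathcal{R} = (-\infty, \infty)$ and $f_t(x) = I(-\infty < x \le t)$. Then $F(t) = \int f_t \, dP$ and, by Theorem 5.5, the Bayes estimator of $F(t)$ in the no-sample problem is

$$F^{(0)}(t) = EF(t) = \frac{\alpha((-\infty, t])}{\alpha(\mathcal{R})}. \qquad (5.54)$$

Since, by Theorem 5.4, the posterior distribution of $P$ given the observations is a Dirichlet process with the parameter $\alpha + \sum_{i=1}^{n} \delta_{X_i}$, the Bayes estimator of $F$ based on a sample $X_1, \ldots, X_n$ can be obtained by replacing $\alpha$ by $(\alpha + \sum_{i=1}^{n} \delta_{X_i})$ in (5.54), namely

$$\tilde{F}(t) = \frac{\alpha((-\infty, t]) + \sum_{i=1}^{n} \delta_{X_i}((-\infty, t])}{\alpha(\mathcal{R}) + n}. \qquad (5.55)$$

To construct a nonparametric Bayes estimator of $R = \int F_X(x) \, dF_Y(x)$ under the squared loss, we shall choose two finite measures $\alpha_1$ and $\alpha_2$. As a prior for $(F_X, F_Y)$, we assume that $F_X$ and $F_Y$ are the distribution functions of random probability measures $P_1$ and $P_2$, respectively, where $P_1$ and $P_2$ are independent and $P_j$ is chosen by a Dirichlet process with the parameter $\alpha_j$, $j = 1, 2$.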
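The Bayes estimator (5.55) of a distribution function is a weighted average of the prior guess and the edf. A minimal sketch, assuming that the measure $\alpha$ is specified through its total mass $\alpha(\mathcal{R})$ and a normalized prior cdf, so that $\alpha((-\infty, t])$ is their product; the function names, the uniform prior and the data are illustrative.

```python
def dp_posterior_cdf(t, sample, prior_mass, prior_cdf):
    """Bayes estimator (5.55): [alpha((-inf, t]) + #{X_i <= t}] / [alpha(R) + n]."""
    n = len(sample)
    count_below = sum(1 for x in sample if x <= t)
    return (prior_mass * prior_cdf(t) + count_below) / (prior_mass + n)


def uniform_cdf(t):
    """Hypothetical prior guess: uniform distribution on [0, 1]."""
    return min(max(t, 0.0), 1.0)


# Hypothetical sample and prior mass alpha(R) = 2.
sample = [0.2, 0.4, 0.9]
estimate = dp_posterior_cdf(0.5, sample, prior_mass=2.0, prior_cdf=uniform_cdf)
```

With prior mass 2 and three observations the prior guess receives weight 2/5 and the edf receives weight 3/5, which is the role played by the weights $\rho_j$ in the estimation of $R$.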

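If no samples from $F_X$ and $F_Y$ are available, it then follows from Theorem 5.5 that the Bayes rule for the no-sample problem is

$$R_0 = \int_{-\infty}^{\infty} F_X^{(0)}(x) \, dF_Y^{(0)}(x), \qquad (5.56)$$

where $F_X^{(0)} = EF_X$ and $F_Y^{(0)} = EF_Y$ are given by (5.54) with $\alpha$ replaced by $\alpha_1$ and $\alpha_2$, respectively:

$$F_X^{(0)}(x) = \frac{\alpha_1((-\infty, x])}{\alpha_1(\mathcal{R})}, \qquad F_Y^{(0)}(x) = \frac{\alpha_2((-\infty, x])}{\alpha_2(\mathcal{R})}. \qquad (5.57)$$

Given the samples, the Bayes rule is

$$\tilde{R} = \int_{-\infty}^{\infty} \tilde{F}_X(x) \, d\tilde{F}_Y(x), \qquad (5.58)$$

where $\tilde{F}_X(x)$ and $\tilde{F}_Y(x)$ are the Bayes estimators of $F_X(x)$ and $F_Y(x)$ of the form (5.55). Introducing

$$\rho_j = \frac{\alpha_j(\mathcal{R})}{\alpha_j(\mathcal{R}) + n_j}, \qquad j = 1, 2, \qquad (5.59)$$

we rewrite $\tilde{F}_X(x)$ and $\tilde{F}_Y(x)$ as

$$\tilde{F}_X(x) = \rho_1 F_X^{(0)}(x) + (1 - \rho_1) \hat{F}_X(x), \qquad \tilde{F}_Y(x) = \rho_2 F_Y^{(0)}(x) + (1 - \rho_2) \hat{F}_Y(x), \qquad (5.60)$$

where $\hat{F}_X(x)$ and $\hat{F}_Y(x)$ are the edfs based on the samples $\underline{X} = (X_1, \ldots, X_{n_1})$ and $\underline{Y} = (Y_1, \ldots, Y_{n_2})$, respectively. Substituting (5.60) into (5.58) we arrive at the estimator

$$\tilde{R} = \rho_1 \rho_2 R_0 + \rho_1 (1 - \rho_2) \frac{1}{n_2} \sum_{j=1}^{n_2} F_X^{(0)}(Y_j) + (1 - \rho_1) \rho_2 \frac{1}{n_1} \sum_{i=1}^{n_1} \left( 1 - F_Y^{(0)}(X_i) \right) + (1 - \rho_1)(1 - \rho_2) \hat{R}, \qquad (5.61)$$

where $\hat{R}$ is the UMVUE of $R$ given by (5.3). The estimator (5.61) was proposed by Ferguson (1973).

5.4.3 Nonparametric Empirical Bayes Estimation of R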

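This estimation requires a rather delicate construction. Let $(\underline{X}^{(i)}, \underline{Y}^{(i)})$, $i = 1, \ldots, m$, be two independent sequences of independent random vectors of observations with respective random probability measures $(P_{1i}, P_{2i})$, $i = 1, \ldots, m$. Here,

$$\underline{X}^{(i)} = (X_1^{(i)}, \ldots, X_{n_{1i}}^{(i)}), \qquad \underline{Y}^{(i)} = (Y_1^{(i)}, \ldots, Y_{n_{2i}}^{(i)}), \qquad i = 1, \ldots, m, \qquad (5.62)$$

are samples of $X^{(i)}$ and $Y^{(i)}$, respectively. Let $P_{1i}$ and $P_{2i}$ be independent with $P_{ji}$ having a common Dirichlet process prior with parameter $\alpha_j$, $j = 1, 2$. Let $F_{Xi}$ and $F_{Yi}$ be the distribution functions corresponding to $P_{1i}$ and $P_{2i}$, respectively, $i = 1, \ldots, m$. Our objective is to estimate

$$R^{(m)} = \int_{-\infty}^{\infty} F_{Xm}(x) \, dF_{Ym}(x)$$

on the basis of the observations (5.62). If the parameters $\alpha_1$ and $\alpha_2$ were known, the Bayes estimator of $R^{(m)}$ based on the samples $(\underline{X}^{(m)}, \underline{Y}^{(m)})$ would be of the form

$$\tilde{R}^{(m)} = \rho_{1m} \rho_{2m} R_0 + \rho_{1m} (1 - \rho_{2m}) \bar{F}_{X,m}^{(0)} + (1 - \rho_{1m}) \rho_{2m} (1 - \bar{F}_{Y,m}^{(0)}) + (1 - \rho_{1m})(1 - \rho_{2m}) \hat{R}_m, \qquad (5.63)$$

where $\hat{R}_m$ is the estimator (5.3) based on $(\underline{X}^{(m)}, \underline{Y}^{(m)})$,

$$\rho_{jm} = \frac{\alpha_j(\mathcal{R})}{\alpha_j(\mathcal{R}) + n_{jm}}, \quad j = 1, 2, \qquad \bar{F}_{X,m}^{(0)} = \frac{1}{n_{2m}} \sum_{j=1}^{n_{2m}} F_X^{(0)}(Y_j^{(m)}), \qquad \bar{F}_{Y,m}^{(0)} = \frac{1}{n_{1m}} \sum_{i=1}^{n_{1m}} F_Y^{(0)}(X_i^{(m)}), \qquad (5.64)$$

and the quantities $F_X^{(0)}$ and $F_Y^{(0)}$ are defined in (5.57).

In an empirical Bayes (EB) analysis, one or several prior parameters in (5.63) are assumed to be unknown and ought to be estimated from data. Hollander and Korwar (1976) were the first to construct an EB estimator of $R^{(m)}$. They treat $R_0$, $\bar{F}_{X,m}^{(0)}$ and $\bar{F}_{Y,m}^{(0)}$ as unknown but assume that $\alpha_j(\mathcal{R})$ are known and $n_{ji} = n_j$, $j = 1, 2$, $i = 1, \ldots, m$. The corresponding EB estimator of $R^{(m)}$ then becomes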


m—lm—1 p(m)

_

QlQ2

i-

^ V ^ p

(m - I) 2 *—' *-i


. m—1

, Ql(l-Q2) V^

m- 1

J

p

^

m-1

f

(5.65)

where $\rho_1$ and $\rho_2$ are defined in (5.59). Comparing (5.65) with (5.63) it is easy to see that $R_0$ and the two prior-based terms of (5.63) are estimated on the basis of only the past data $(\underline{X}^{(i)}, \underline{Y}^{(i)})$, $i = 1, \ldots, m - 1$, but not the current one. Ghosh and Lahiri (1992) generalize the result of Hollander and Korwar (1976). They construct an EB estimator for unequal sample sizes in the cases when $\alpha_j(\mathcal{R})$, $j = 1, 2$, are either known or unknown. They also include the current data in the estimation of $R_0$ and the two prior-based terms of (5.63). Specifically, in the case when $\alpha_j(\mathcal{R})$, $j = 1, 2$, are known, they denote

(1-£,-*),

J -

1,2,

and estimate £ 0 , F%£ and F-f?£ by

respectively. Then, the nonparametric EB of Rg R^B2

is of the form

=

QlmQ2mR*o + Qlm(l -

e2m)F{^

+

(1 - Qlm)Q2m(l ~ F$£) + (1 - ffim)(l - Q2m)Rmm,

where $\rho_{1m}$ and $\rho_{2m}$ are defined in (5.64).

Ghosh and Lahiri (1992) also construct a nonparametric EB estimator when $\alpha_j(\mathcal{R})$, $j = 1, 2$, are unknown and ought to be estimated from observations. This estimator is given by a rather complicated expression and is not reproduced herein.

We should mention that, whenever $n_{ji} > 1$, $j = 1, 2$, $i = 1, \ldots, m$, all the EB estimators turn out to be optimal in the sense that the quadratic risk of the EB estimator approaches the quadratic risk of the Bayes estimator as $m \to \infty$.

5.5 Probability Design Approach to Estimation of R

The probability design approach to the estimation of R was introduced by Kapur (1975). This approach differs from all the other by using a very different set of data for estimating R. Specifically, Kapur (1975) assumes that the stress X and the strength Y are independent and that there exists an "interference" interval within which the stress and the strength are likely to interact. This interference interval is then partitioned into n subintervals, and it is assumed that instead of the samples from X and Y some bounds are available on the probabilities of the stress and the strength falling into the interference interval as well as on the probabilities of the stress and the strength falling into each subinterval of the interference interval. These bounds are supposed to be known a-priori or constructed from existing data. Kapur (1975) uses his technique to estimate "unreliability" 1 - R = P(X > Y). Following the approach adopted in this book, we shall however present Kapur's method for estimating R = P(X < Y). Let -Xmjj, be the lower bound for X and YmaK be the upper limit for Y that occur with high probability. Then the interference interval is [Xm\n, Ymax] and /

Fx{x)dFY{x). in

Partition [Xm;n, Fmax] into n subintervals with the endpoints a,-, j = 0, < an = F max . Denote i.e. Xmin = a0 < at < Pi = P[a^i < X < Oi], qi = P[a,i-i < Y < a*].

, n,

(5.66)

Then (5.67) 4=1

t=l

Probability Design Approach to Estimation of R

165

and R can be approximated by the sum of probabilities P(ak-i < X < a f e , a i _ i

i = 1,-

,n, k = 1,-

,i.

Let the conditional probabilities be denoted by y | a j _ i <X

< a*),

i = l,---,n.

(5.68)

Consequently, P(X < Y, aj_i < X < aj,aj_i < V < aj) = Pigi^i and the representation for R becomes (5.69) Introducing n

v = L»=i

we represent i? as (5.70) Let bounds on (5.66) and (5.67) be available (5.71) (5.72) Then the upper (lower) confidence bound Ru (RL) for R is constructed by maximizing (minimizing) (5.69) or (5.70) under constraints (5.71) and (5.72). Computationally, this leads to a standard quadratic programming problem. The interested readers are referred to e.g. Charnes and Cooper (1961). After the confidence bounds for R are obtained , the point estimator of R is given by R = (Rv + RL)/2. Kapur (1975) assumes that strength dominates stress on each subinterval, so that all iVs are equal to one and v = 1. Park and Clark (1986) note that Kapur's assumption that i/j = 1, , n, leads to overestimation of both RL and Ru- To remedy the i — 1, , n, which corresponds to situation, they suggested to set i/j = 1/2, i — 1,


the case of equal probabilities that strength or stress dominates within the i-th interval. Hence, the objective function to be maximized and minimized is (5.70) with $\nu = 1/2$. Park and Clark's confidence interval, although more realistic than Kapur's, has the shortcoming that it sometimes fails to cover the actual value of R. Melloy and Cavalier (1989) provide an example of such a situation and suggest a "play it safe" strategy. They propose to use $\nu = 1$ for maximizing R and $\nu = 0$ for minimizing it. Melloy and Cavalier's interval is obviously the longest of the three, but it is the one most likely to contain the actual value of R. Shen (1992) notes that Kapur (1975), Park and Clark (1986) and Melloy and Cavalier (1989) suggest identical values of $\nu_i$ for the minimization or maximization of R. He attempts to improve the methods of the previous authors by introducing variable values for $\nu_i$. Denote

$$\bar p_i = (L_{p,i} + U_{p,i})/2, \qquad \bar q_i = (L_{q,i} + U_{q,i})/2.$$

Shen's (1992) suggestion is to use

$$\nu_i = \begin{cases} 1, & \text{if } \bar p_i < \bar q_i, \\ 0.5, & \text{if } \bar p_{i-1} < \bar q_{i-1} \text{ and } \bar p_{i+1} > \bar q_{i+1}, \\ 0, & \text{otherwise,} \end{cases}$$

for the maximization of R and

$$\nu_i = \begin{cases} 1, & \text{if } \bar q_i < \bar p_i, \\ 0.5, & \text{if } \bar q_{i-1} < \bar p_{i-1} \text{ and } \bar q_{i+1} > \bar p_{i+1}, \\ 0, & \text{otherwise,} \end{cases}$$
for the minimization of R. The author demonstrates by means of extensive simulations that his method leads to more precise point estimators and more realistic bounds for R. We conclude this section by noting that there are also two other optimization approaches to the estimation of R. The first one is due to GeungHo Kim (1981) and is based on mathematical programming techniques. The second was developed by Wang and Liu (1996) and is based on fuzzy reliability. Unfortunately, owing to space limitations, we are unable to cover these techniques and refer the interested reader to the original papers.
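The effect of the choice of $\nu$ in the competing proposals can be illustrated numerically. The following is a minimal sketch, assuming the interference-sum representation $R \approx \sum_{i}\sum_{k<i} p_k q_i + \nu \sum_i p_i q_i$ and fixed (rather than optimized) values of $p_i$ and $q_i$. The function name and the numerical values are hypothetical and serve only as an illustration.

```python
def interference_reliability(p, q, nu):
    """Evaluate R = sum_i sum_{k<i} p_k q_i + nu * sum_i p_i q_i.

    p[i] and q[i] are the probabilities that the stress X and the strength Y
    fall into the (i+1)-th subinterval of the interference interval; nu is the
    common probability that X < Y when both fall into the same subinterval.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    off_diagonal = 0.0
    cumulative_p = 0.0  # running sum of p_k over k < i
    for p_i, q_i in zip(p, q):
        off_diagonal += cumulative_p * q_i
        cumulative_p += p_i
    diagonal = sum(p_i * q_i for p_i, q_i in zip(p, q))
    return off_diagonal + nu * diagonal


# Hypothetical subinterval probabilities (illustration only).
p = [0.10, 0.20]
q = [0.30, 0.40]
r_nu_one = interference_reliability(p, q, 1.0)    # nu = 1 (Kapur)
r_nu_half = interference_reliability(p, q, 0.5)   # nu = 1/2 (Park and Clark)
r_nu_zero = interference_reliability(p, q, 0.0)   # nu = 0 (Melloy and Cavalier, minimization)
```

In an actual application the $p_i$ and $q_i$ would themselves be varied within their bounds by the optimizer; the sketch only shows how strongly the diagonal term depends on $\nu$.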


5.6


Exercises

5.1. Using the fact that TRX + TRY = N(N + 1)/2, derive formula (5.5) from (5.4).

5.2. Prove that the estimator given by (5.15) is an unbiased estimator of $\mathrm{Var}(\hat R)$.

5.3. Derive an unbiased estimator of $\mathrm{Var}(\hat R)$ based on the statistics (5.48).

5.4. Investigate for what values of $\gamma$ the confidence bounds in (5.33) are based on the Hoeffding-type inequalities (5.30) and (5.31).

5.5. Write an explicit finite-sum representation for $F_{n_1,n_2}(\varepsilon)$ using formulas (5.36) and (5.37).

5.6. Derive a series representation for $Q_{n_1,n_2}(\varepsilon)$ given by (5.41). Explore the convergence of the series.

5.7. Project. Let unbiased estimators $\hat\vartheta_1$ and $\hat\vartheta_2$ of $\vartheta_1$ and $\vartheta_2$ be given by Feigstat. Denote $v(a) = (1 - a)\hat R + (n_1 - 1)\hat\vartheta_1 + (n_2 - 1)\hat\vartheta_2$ and observe that $E[a\hat R + v(a) - (n_1 + n_2 - 1)\hat R^2] = \mathrm{Var}(\hat R)$ for any number a. Hence, the pivotal quantity can be replaced by a quantity $Z_3$ based on $v(a) - (n_1 + n_2 - 1)\hat R^2$. Construct a confidence interval based on $Z_3$ similarly to (5.50). Study which value of a provides the shortest confidence interval.

5.8. Write an explicit expression for the estimator (5.61) if $\alpha_1((-\infty, x]) = \alpha_2((-\infty, x]) = \alpha((-\infty, x])$ with

a) $\alpha((-\infty, x]) = \int_{-\infty}^{x} 0.5 \exp(-|z|)\, dz$;

b) $\alpha((-\infty, x]) = \int_{-\infty}^{x} (\sqrt{2\pi})^{-1} \exp(-z^2/2)\, dz$.

Chapter 6

Some Selected Special Cases

The arsenal of statistical distributions is truly inexhaustible. New distributions are being discovered literally on a weekly basis, prompted by either theoretical considerations or by pressing practical applications, or both. Glance through any recent issue of "Annals of Statistics" or "Statistics in Medicine" to be reassured of the validity of this assertion. Each of these distributions can, of course, be used for studying the probability P(X < Y) and its estimation, and we hope that the readers will engage themselves in this rewarding activity. We shall refer to this type of investigation as the "mainstream" case of the stress-strength models. These situations were tackled in the preceding chapters, involving the most popular and interesting (in our subjective opinion) distributions. We also studied the case when X and Y are random vectors and the probabilities of inequalities of some linear combination of X and Y are estimated. These two models, however, do not cover all the situations that appear in various applications or theoretical exercises. The first group of additional models studied in this chapter stems from problems in system reliability and naturally leads to the estimation of probabilities of inequalities of the type $X < \max(Y_1, \ldots, Y_k)$ and their modifications. These models will be considered in Section 6.1. Estimation of probabilities of inequalities other than X < Y which are not covered by the previous sections (e.g. P(X < Y < Z) or $P(X_1 < X_2 < \cdots < X_m)$) will constitute the content of Section 6.2. Section 6.3 is devoted to linear model formulations in the stress-strength set-up, such as stress-strength models with explanatory variables or ANOVA. Section 6.4 reviews stress-strength models with grouped and categorical data. Finally, Section 6.5 considers


briefly stochastic processes formulations of stress-strength models. This chapter is an excellent source for further research.

6.1

Stress-Strength Models for System Reliability

6.1.1

Various Models for System Reliability

The stress-strength models for system reliability occur when a device under consideration is a combination of k, usually independent, components with the strengths $Y_1, \ldots, Y_k$, and each component of the system is subject to a common shock of random magnitude X. If the system functions when at least s, $1 \le s \le k$, components survive the shock, we talk about an s-out-of-k system. As a typical example of an s-out-of-k system one may consider (see Johnson (1988)) a panel consisting of k identical solar cells which maintains an adequate power output if at least s of the cells are active during the course of the mission. The external force interfering with the operation of the cells may possibly be extreme temperatures, and the strength of a cell, in this context, may be taken as its capacity to withstand these extreme temperatures. If we assume that the stresses and the strengths of the components are i.i.d. with the pdfs $f_X$ and $f_Y$ and the cdfs $F_X$ and $F_Y$, respectively, then the reliability (i.e. the probability of successful operation) of the s-out-of-k system is given by the well-known relation

$$R_{s,k} = \sum_{j=s}^{k} \binom{k}{j} \int_{-\infty}^{\infty} [1 - F_Y(x)]^{j}\, [F_Y(x)]^{k-j}\, dF_X(x). \tag{6.1}$$

If the system is operating successfully whenever at least one of the k components survives (s = 1), it is termed parallel, by analogy with electric circuits. If, however, the system survives only when all of the components are intact (s = k), we are dealing with a series system. In view of (6.1), the reliabilities of parallel and series systems are represented by

$$R_{1,k} = P(X < \max(Y_1, \ldots, Y_k)) = 1 - \int_{-\infty}^{\infty} [F_Y(x)]^{k}\, dF_X(x) \tag{6.2}$$

and

$$R_{k,k} = P(X < \min(Y_1, \ldots, Y_k)) = \int_{-\infty}^{\infty} [1 - F_Y(x)]^{k}\, dF_X(x), \tag{6.3}$$


respectively. Estimation of the reliability of an s-out-of-k system has been discussed by Myhre and Saunders (1968), Madansky (1965), Easterling (1972), Bhattacharyya and Johnson (1975, 1977) and Choi and Kim (1983), among many other sources. Particular cases of parallel and series systems have been studied by Gupta (1972), Bhattacharyya and Johnson (1974), Chandra and Owen (1975, 1977), Rinco (1973), Singh (1981), Gupta and Gupta (1988) and Ivshin and Lumelskii (1995). So far, we have assumed that the component strengths $Y_1, \ldots, Y_k$ are i.i.d. random variables and all of them are subjected to a common random stress X independent of the Y's. Johnson (1988) outlines several extensions of the above model for representing the reliability structure of more complex systems.

1. Non-identical component strength distributions. When the components of a system are of different structure, the assumption of identical strength distributions may not be quite realistic. Suppose that out of k components, $k_1$ belong to one category and $k_2 = k - k_1$ to the other. Denote the strength distribution of the components of the i-th category by $F_{Y_i}$, i = 1, 2. Assume now that all the k components are exposed to a common stress X having the distribution $F_X$ and the system operates successfully if at least s out of k components withstand the stress. The system reliability is then of the form

$$R_{s,k_1,k_2} = \sum_{j_1, j_2} \binom{k_1}{j_1} \binom{k_2}{j_2} \int_{-\infty}^{\infty} [1 - F_{Y_1}(x)]^{j_1} [F_{Y_1}(x)]^{k_1 - j_1} [1 - F_{Y_2}(x)]^{j_2} [F_{Y_2}(x)]^{k_2 - j_2}\, dF_X(x), \tag{6.4}$$

where the summation is over all possible pairs $(j_1, j_2)$ with $0 \le j_1 \le k_1$ and $0 \le j_2 \le k_2$ such that $s \le j_1 + j_2 \le k$. For example, in the case of $k_1 = k_2 = 1$ and s = 1, the possible choices for $(j_1, j_2)$ are (0,1), (1,0) and (1,1), so that in this case (6.4) becomes

$$R_{1,1,1} = 1 - \int_{-\infty}^{\infty} F_{Y_1}(x) F_{Y_2}(x)\, dF_X(x).$$

2. Subsystems with independent stresses. In a more complex situation a system may consist of a number of independent subsystems, say m, performing different tasks. Within each subsystem, the components have independent and identically distributed strengths and are subjected to a common stress, so that each subsystem has a structure of an s-out-of-k


stress-strength model. The strength and the stress distributions, as well as the parameters s and k, may vary among the subsystems. The representation of the system reliability depends here on the manner in which the subsystems are combined in the total system. For example, if the subsystems have a series connection, the system fails whenever one of the subsystems becomes inoperative, and the system reliability is in this case given by

$$R = R_1 R_2 \cdots R_m,$$

where $R_i$ is the reliability of the i-th subsystem, $i = 1, \ldots, m$. If the subsystems are connected in parallel, the system functions properly provided at least one of the subsystems survives. In this case we have

$$R = 1 - \prod_{i=1}^{m} (1 - R_i).$$

Evidently, there is a multitude of variations in between the above two extreme cases.

6.1.2

Estimation of System Reliability Based on Numerical Data

Estimation of reliability of an s-out-of-k system in the case of exponential distributions. Consider the case when $f_X(x)$ and $f_Y(y)$ are both one-parameter exponential, $f_X(x|\alpha_1) = \alpha_1 \exp(-\alpha_1 x)$ and $f_Y(y|\alpha_2) = \alpha_2 \exp(-\alpha_2 y)$, and samples of sizes $n_1$ and $n_2$ from $f_X$ and $f_Y$, respectively, are available. Calculation of system reliability in the exponential case has been carried out by Bhattacharyya and Johnson (1974). To evaluate (6.1) they used the relation between binomial probabilities and the cdf of the beta distribution (see e.g. Johnson et al. (1992))

$$K(u) = [B(s, k-s+1)]^{-1} \int_0^u x^{s-1} (1-x)^{k-s}\, dx = \sum_{j=s}^{k} \binom{k}{j} u^j (1-u)^{k-j}. \tag{6.5}$$

Denote $\lambda = \alpha_1/\alpha_2$. Noting that $F_Y(x) = 1 - \exp(-\alpha_2 x)$ and changing the variable of integration to $u = \exp(-\alpha_1 x)$, we represent (6.1) as

$$R_{s,k} = \int_0^1 K(u^{1/\lambda})\, du.$$

Utilizing the transformation $u = y^{\lambda}$ and integrating by parts we easily obtain

$$R_{s,k} = 1 - \frac{B(s+\lambda,\, k-s+1)}{B(s,\, k-s+1)} = 1 - \frac{(k-s)!}{B(s,\, k-s+1)} \prod_{j=s}^{k} \frac{1}{j+\lambda}. \tag{6.6}$$

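Using the partial fraction expansion for the product of reciprocals in (6.6) we finally arrive at

$$R_{s,k} = 1 - [B(s, k-s+1)]^{-1} \sum_{j=0}^{k-s} (-1)^j \binom{k-s}{j} (s+j+\lambda)^{-1}. \tag{6.7}$$

Hence, the MLE of $R_{s,k}$ can be written as (compare with Section 2.1.3)

$$\tilde R_{s,k} = 1 - [B(s, k-s+1)]^{-1} \sum_{j=0}^{k-s} (-1)^j \binom{k-s}{j} (s+j+\tilde\lambda)^{-1}, \qquad \tilde\lambda = \bar Y / \bar X. \tag{6.8}$$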
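The closed form for the exponential s-out-of-k system is easy to check numerically. The sketch below is an illustration under stated assumptions: it takes $\lambda = \alpha_1/\alpha_2$, evaluates the reliability both as the finite alternating sum $1 - [B(s,k-s+1)]^{-1}\sum_j (-1)^j \binom{k-s}{j}(s+j+\lambda)^{-1}$ and as the beta-function ratio $1 - B(s+\lambda, k-s+1)/B(s, k-s+1)$, and plugs in $\tilde\lambda = \bar Y/\bar X$ for the MLE. All function names are hypothetical.

```python
import math


def beta_fn(a, b):
    """Beta function B(a, b), computed via log-gamma for numerical stability."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))


def reliability_sum(s, k, lam):
    """Finite alternating-sum form of the s-out-of-k reliability."""
    total = sum((-1) ** j * math.comb(k - s, j) / (s + j + lam)
                for j in range(k - s + 1))
    return 1.0 - total / beta_fn(s, k - s + 1)


def reliability_beta_ratio(s, k, lam):
    """Equivalent beta-function ratio form."""
    return 1.0 - beta_fn(s + lam, k - s + 1) / beta_fn(s, k - s + 1)


def reliability_mle(s, k, x_sample, y_sample):
    """Plug-in MLE: lambda is replaced by (mean of Y) / (mean of X)."""
    x_bar = sum(x_sample) / len(x_sample)
    y_bar = sum(y_sample) / len(y_sample)
    return reliability_sum(s, k, y_bar / x_bar)
```

For s = k = 1 both forms reduce to $\lambda/(1+\lambda) = \alpha_1/(\alpha_1+\alpha_2)$, the usual expression for P(X < Y) with exponential stress and strength.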

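To obtain the UMVUE of the system reliability in this case, note that for any a > 0

$$\psi_a(\lambda) = (a+\lambda)^{-1} = a^{-1}\{1 - E[I(aX_1 < Y_1)]\},$$

where $I(\cdot)$ is the indicator function. Recall that $\lambda = \alpha_1/\alpha_2$. Thus, using (2.31) and Theorem 2.4 we write the UMVUE of $\psi_a(\lambda)$ as

$$\hat\psi_a(\lambda) = a^{-1}\left[1 - \frac{(n_1-1)(n_2-1)}{n_1 \bar X\, n_2 \bar Y} \iint_W \left(1 - \frac{x}{n_1 \bar X}\right)^{n_1-2} \left(1 - \frac{y}{n_2 \bar Y}\right)^{n_2-2} dx\, dy\right], \tag{6.9}$$

where the set $W = \{(x, y) : 0 < x < n_1 \bar X,\ 0 < y < n_2 \bar Y,\ ax < y\}$ (compare with (2.32)). Changing variables in (6.9) to $u = x/(n_1 \bar X)$ and $v = y/(n_2 \bar Y)$, integrating over $v \in (aTu, 1)$ and using the notation $T = (n_1 \bar X)/(n_2 \bar Y)$, we can represent $\hat\psi_a(\lambda)$ as

$$\hat\psi_a(\lambda) = a^{-1}\left[1 - (n_1 - 1) \int_0^{\min(1,\, 1/(aT))} (1-u)^{n_1-2} (1 - aTu)^{n_2-1}\, du\right]. \tag{6.10}$$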


Using the integral representation of the hypergeometric series (see e.g. formula 9.111 in Gradshtein and Ryzhik (1980)) we rewrite (6.10) as

$$\hat\psi_a(\lambda) = a^{-1}\left[1 - {}_2F_1(1 - n_2, 1; n_1; aT)\right]. \tag{6.11}$$

The hypergeometric series in (6.11) is convergent provided aT < 1. For aT > 1, an alternative expression for $\hat\psi_a(\lambda)$ can be used:

$$\hat\psi_a(\lambda) = a^{-1}\, {}_2F_1(1 - n_1, 1; n_2; (aT)^{-1}). \tag{6.12}$$

Combining (6.7) with (6.11) and (6.12), we arrive at the UMVUE

$$\hat R_{s,k} = 1 - [B(s, k-s+1)]^{-1} \sum_{j=0}^{k-s} (-1)^j \binom{k-s}{j} \hat\psi_{s+j}(\lambda).$$

6.1.3

Estimation of System Reliability Based on Count Data

Suppose that a complex mechanism is constructed from a number of different types of components and separate observations have been made on the reliability of each one of the components of the system. Namely, the data consist of vectors $x = (x_1, \ldots, x_m)$ and $n = (n_1, \ldots, n_m)$, where we have observed $x_i$ successes in $n_i$ trials of the i-th component, $i = 1, \ldots, m$. The number of successes $x_i$ has the binomial distribution

$$p(x_i; \theta_i) = \binom{n_i}{x_i} \theta_i^{x_i} (1 - \theta_i)^{n_i - x_i}, \qquad i = 1, \ldots, m.$$

The importance of this setting stems from the fact that numerical measurements of strengths and stresses are often much more expensive and involved than observations of survival and failure, which may often be obtained simply by visual inspection of the components. Hence, although numerical measurements are usually more informative than counts, it may be economically advantageous to collect a large number of samples of count data rather than obtain numerical measurements on fewer components. Madansky (1965), Myhre and Saunders (1968a,b) and Easterling (1972) worked out confidence intervals for the system reliability R for data of this type, while Choi and Kim (1983) developed a Bayesian sequential procedure for the estimation of R. Madansky (1965) derives a confidence interval for R by inverting the likelihood ratio test (LRT) for the hypothesis $H_0 : R = R_0$ (see e.g. Casella and Berger (1990), Chapter 9). For example, in the case of the series system, $R = \prod_{j=1}^{m} \theta_j$ and the likelihood ratio is given by

$$L(R_0) = \sup_{\theta:\ \prod_i \theta_i = R_0}\ \prod_{i=1}^{m} p(x_i; \theta_i) \Big/ \sup_{\theta}\ \prod_{i=1}^{m} p(x_i; \theta_i). \tag{6.13}$$

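Taking into account (see e.g. Casella and Berger (1990), Section 8.4.1) that $-2 \ln L(R_0)$ has approximately the chi-squared distribution with one degree of freedom, we obtain a $(1 - \gamma)$-confidence set for R of the form

$$W = \{R : -2 \ln L(R) < \chi^2_{1-\gamma}(1)\}, \tag{6.14}$$

where $\chi^2_{\alpha}(1)$ is the $100\alpha$th percentile of the chi-squared distribution with 1 degree of freedom. It is easy to observe that the supremum in the denominator of (6.13) is achieved for $\hat\theta_i = x_i/n_i$. To obtain the logarithm of the numerator of $L(R_0)$ we shall first maximize the Lagrangian

$$\sum_{i=1}^{m} \ln p(x_i; \theta_i) - \lambda \left( \ln \prod_{i=1}^{m} \theta_i - \ln R_0 \right),$$

where $\lambda$ is a Lagrange multiplier. Then the maximizing set of $\theta_i$'s is given by

$$\tilde\theta_i = \tilde\theta_i(\lambda) = (x_i - \lambda)/(n_i - \lambda), \qquad i = 1, \ldots, m,$$

where $\lambda < x_i$ for all i, since $0 < \tilde\theta_i(\lambda) < 1$. Hence, substituting $\hat\theta_i$ and $\tilde\theta_i$ into (6.13), we obtain

$$\ln L(R_0) = \sum_{i=1}^{m} x_i \ln\left(1 - \frac{\lambda}{x_i}\right) - \sum_{i=1}^{m} n_i \ln\left(1 - \frac{\lambda}{n_i}\right), \tag{6.15}$$

where $\lambda$ satisfies the equation

$$\prod_{i=1}^{m} [(x_i - \lambda)/(n_i - \lambda)] = R_0. \tag{6.16}$$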

Now a $(1 - \gamma)$-confidence set (6.14) is obtained by replacing $R_0$ by R in (6.15) and (6.16) and noting that

$$R(\lambda) = \prod_{i=1}^{m} (x_i - \lambda)/(n_i - \lambda)$$

is a monotonically decreasing function of $\lambda$. Observe also that $-2 \ln L(R)$ is a monotonically increasing (decreasing) function of $\lambda$ for $\lambda > 0$ ($\lambda < 0$). Hence, the set of $\lambda$'s such that $-2 \ln L(R(\lambda)) < \chi^2_{1-\gamma}(1)$ will be an interval $[\lambda_1^*, \lambda_2^*]$, where $\lambda_1^* < 0 < \lambda_2^*$ are the solutions of the equation

$$-2 \ln L(R(\lambda)) = \chi^2_{1-\gamma}(1).$$

Since $R(\lambda)$ is a monotonically decreasing function of $\lambda$, the $(1 - \gamma)$-confidence interval for R is indeed of the form $(R(\lambda_2^*), R(\lambda_1^*))$. This result can easily be modified for parallel systems by simply replacing R with 1 - R, $\theta_i$ with $(1 - \theta_i)$ and $x_i$ with $n_i - x_i$, $i = 1, \ldots, m$. Moreover, as Madansky (1965) shows, his method can be generalized to the case where the system is composed of m components in series and s subsystems that are in series with the m components but are themselves composed of components in parallel. This extension is based on a monotonicity property analogous to the one used above.

Myhre and Saunders (1968a,b) applied Madansky's (1965) method to more diverse systems of elements. Let the state of the i-th component be a Bernoulli random variable $y_i$ taking on the value one for success and zero for failure, $i = 1, \ldots, m$. Then, following Myhre and Saunders (1968a), the system is a coherent monotone structure if there exists a non-decreasing Boolean function $\varphi(y_1, \ldots, y_m)$ (i.e. taking only the values zero and one) of the states of the components which serves as an indicator of the state of the structure. If $E y_i = \theta_i$ is the reliability of the i-th component, then $R = E\varphi(y_1, \ldots, y_m) = h(\theta_1, \ldots, \theta_m)$, and the statistic

$$\hat R = h(\hat\theta_1, \ldots, \hat\theta_m) \tag{6.17}$$

is asymptotically normally distributed as $n_j \to \infty$, $j = 1, \ldots, m$, with the mean $R = h(\theta_1, \ldots, \theta_m)$ and the variance

$$\sum_{i=1}^{m} \left(\frac{\partial h}{\partial \theta_i}\right)^2 \frac{\theta_i (1 - \theta_i)}{n_i}. \tag{6.18}$$

To obtain confidence intervals for R based on the statistic (6.17), one needs only to replace $\theta_i$ in (6.18) by their estimators $\hat\theta_i = x_i/n_i$, $i = 1, \ldots, m$.
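Madansky's likelihood-ratio interval for a series system, described earlier in this section, reduces to two one-dimensional root searches in the Lagrange multiplier $\lambda$. The following is a minimal sketch rather than Madansky's own algorithm. It assumes $0 < x_i < n_i$ for every component, uses $-2\ln L(\lambda) = -2\left[\sum_i x_i \ln(1-\lambda/x_i) - \sum_i n_i \ln(1-\lambda/n_i)\right]$ and $R(\lambda) = \prod_i (x_i-\lambda)/(n_i-\lambda)$, and obtains the chi-squared critical value with one degree of freedom as the square of a standard normal quantile. The function names and the bisection scheme are hypothetical choices.

```python
import math
from statistics import NormalDist


def series_reliability(x, n, lam):
    """R(lam) = prod_i (x_i - lam) / (n_i - lam); lam = 0 gives the point estimate."""
    return math.prod((xi - lam) / (ni - lam) for xi, ni in zip(x, n))


def neg2_log_lr(x, n, lam):
    """-2 ln L as a function of the Lagrange multiplier lam."""
    log_lr = sum(xi * math.log(1 - lam / xi) - ni * math.log(1 - lam / ni)
                 for xi, ni in zip(x, n))
    return -2.0 * log_lr


def bisect_root(func, lo, hi, iterations=200):
    """Plain bisection; func(lo) and func(hi) must have opposite signs."""
    f_lo = func(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def lr_interval_series(x, n, gamma=0.05):
    """Approximate (1 - gamma) likelihood-ratio interval for a series system."""
    if any(not 0 < xi < ni for xi, ni in zip(x, n)):
        raise ValueError("this sketch assumes 0 < x_i < n_i for all components")
    # Chi-squared(1) quantile of order 1 - gamma equals a squared normal quantile.
    crit = NormalDist().inv_cdf(1 - gamma / 2) ** 2

    def excess(lam):
        return neg2_log_lr(x, n, lam) - crit

    x_min = min(x)
    hi = 0.5 * x_min
    while excess(hi) < 0:          # -2 ln L grows without bound as lam -> min x_i
        hi = 0.5 * (hi + x_min)
    lam_upper = bisect_root(excess, 0.0, hi)

    lo = -1.0
    while excess(lo) < 0:          # -2 ln L also grows as lam -> -infinity
        lo *= 2.0
    lam_lower = bisect_root(excess, lo, 0.0)

    # R(lam) is decreasing, so the larger root gives the lower limit.
    return series_reliability(x, n, lam_upper), series_reliability(x, n, lam_lower)
```

For instance, with the hypothetical counts x = (45, 48) out of n = (50, 50), the point estimate is 0.9 × 0.96 = 0.864 and `lr_interval_series([45, 48], [50, 50])` returns limits on either side of it.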

6.2

Estimation of $P(X_1 < X_2 < \cdots < X_k)$

In the previous sections we have considered estimation of the probabilities of various inequalities. Estimation of $R_k = P(X_1 < X_2 < \cdots < X_k)$ is perhaps one of the important cases which has not yet been covered. It is also of interest from the theoretical point of view. Estimation of $R_k$ appears in isotonic regression problems, where it is essential to estimate $R_k$ to find the level probabilities (see, e.g., Miwa et al. (2000)). An important particular case is the estimation of $R_3 = P(X < Y < Z)$, which represents the situation where the strength Y should not only be greater than the stress X but also smaller than the stress Z. For example, many devices cannot function at high temperatures, nor can they at very low temperatures. Similarly, a person's blood pressure has two limits, the systolic and the diastolic pressure, and his/her blood pressure must lie within these limits. This section presents the estimation of $R_k$ in general, followed by a more detailed elaboration of the case k = 3.

6.2.1

General Case

Using the techniques of Sections 2.1 and 2.2, it is easy to represent estimators of $R_k$ in integral form. Indeed, if $\hat f(x_1, \ldots, x_k)$ is the MLE or UMVUE of the joint pdf of $X_1, \ldots, X_k$ based on observations on $X_1, \ldots, X_k$, then the MLE or UMVUE of $R_k$ is of the form

$$\hat R_k = \int \cdots \int \hat f(x_1, \ldots, x_k)\, I(x_1 < \cdots < x_k)\, dx_1 \cdots dx_k. \tag{6.19}$$

Hence, the main challenge is to compute integrals of the form (6.19). There are several ways to simplify the expression for $\hat R_k$. The first is to use the change of variables $z_1 = x_1$, $z_j = x_j - x_{j-1}$, $j \ge 2$, which results in

$$\hat R_k = \int_{-\infty}^{\infty} \left[ \int_0^{\infty} \cdots \int_0^{\infty} \hat f(z_1, z_1 + z_2, \ldots, z_1 + \cdots + z_k)\, dz_2 \cdots dz_k \right] dz_1. \tag{6.20}$$

If the $X_j$'s are independent, one can utilize the following recursive relationship suggested by Hayter and Liu (1996). Denote by $\hat f_j(\cdot)$ the MLE or the UMVUE of the pdf of $X_j$, $j = 1, \ldots, k$, and define

$$r_j(x) = P(X_1 < \cdots < X_j < x), \quad j \ge 1, \qquad r_0(x) = 1. \tag{6.21}$$

Then, it is easy to observe that

$$r_j(x) = \int_{-\infty}^{x} r_{j-1}(y)\, \hat f_j(y)\, dy, \qquad j = 1, \ldots, k, \tag{6.22}$$

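and $R_k = r_k(\infty)$. We shall illustrate these comments by evaluating (6.19) explicitly for a number of basic distributions.

The exponential distribution. Let $X_j$, $j = 1, \ldots, k$, be independent exponential random variables with $f_j(x) = \alpha_j \exp(-\alpha_j x)$. The MLE of $f_j(x)$ is $\tilde f_j(x) = \tilde\alpha_j \exp(-\tilde\alpha_j x)$ (here we are using "~" instead of "^" to emphasize that this is an MLE), where $\tilde\alpha_j$ is the inverse of the average of the observations on $X_j$. Now (6.20) becomes

$$\tilde R_k = \tilde\alpha_1 \cdots \tilde\alpha_k \int_0^{\infty} \cdots \int_0^{\infty} \exp\Big\{ -\sum_{j=1}^{k} \tilde\alpha_j (z_1 + \cdots + z_j) \Big\}\, dz_1 \cdots dz_k,$$

so that the MLE of $R_k$ is given by the elegant, intuitively plausible expression

$$\tilde R_k = \prod_{j=1}^{k-1} \frac{\tilde\alpha_j}{\tilde\alpha_j + \tilde\alpha_{j+1} + \cdots + \tilde\alpha_k}.$$

The normal distribution. If $X_j$, $j = 1, \ldots, k$, have independent normal distributions with the means $\theta_j$ and variances $\sigma_j^2$, an application of formula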

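(6.19) yields

$$R_k = \int_{-\infty}^{\infty} \int_0^{\infty} \cdots \int_0^{\infty} \frac{1}{(2\pi)^{k/2} \sigma} \exp\Big\{ -\frac{1}{2} \sum_{j=1}^{k} \frac{(z_1 + \cdots + z_j - \theta_j)^2}{\sigma_j^2} \Big\}\, dz_2 \cdots dz_k\, dz_1, \tag{6.23}$$

where $\sigma = \prod_{j=1}^{k} \sigma_j$. Define $\mu_1 = \theta_1$, $\mu_j = \theta_j - \theta_{j-1}$, $j \ge 2$, and change variables $z_j = y_j + \mu_j$, $j = 1, \ldots, k$. Then the argument of the exponent in (6.23) becomes

$$\Psi(y_1, \ldots, y_k) = -\frac{1}{2} \sum_{j=1}^{k} \sigma_j^{-2} (y_1 + \cdots + y_j)^2.$$

Simplifying $\Psi(y_1, \ldots, y_k)$ we express it as

$$\Psi(y_1, \ldots, y_k) = -\frac{1}{2} \Big( \sum_{l=1}^{k} \sigma_l^{-2} \Big) \Big[ y_1 + \Big( \sum_{l=1}^{k} \sigma_l^{-2} \Big)^{-1} \sum_{i=2}^{k} y_i \sum_{m=i}^{k} \sigma_m^{-2} \Big]^2 + \Psi^*, \qquad \Psi^* = -\frac{1}{2} \sum_{i=2}^{k} \sum_{j=2}^{k} w_{ij}\, y_i y_j,$$

where $\Psi^*$ depends on $y_2, \ldots, y_k$ only and W is a symmetric matrix with the entries

$$w_{ij} = \sum_{m=\max(i,j)}^{k} \sigma_m^{-2} - \Big( \sum_{l=1}^{k} \sigma_l^{-2} \Big)^{-1} \sum_{l=i}^{k} \sigma_l^{-2} \sum_{m=j}^{k} \sigma_m^{-2}, \qquad i, j = 2, \ldots, k.$$

Integrating over $y_1$, we express $R_k$ in terms of the orthant normal probability

$$R_k = P(Y_2 > -\mu_2, \ldots, Y_k > -\mu_k), \tag{6.24}$$

where $Y = (Y_2, \ldots, Y_k)$ is a (k - 1)-dimensional normal vector with zero mean and covariance matrix $W^{-1}$. Some analytic approximations to the probabilities (6.24) can be found, for example, in Gupta (1963) and references therein. More recent references are given in Kotz et al. (2000). To obtain the MLE of $R_k$ one needs only to replace $\mu_j$ and $\sigma_j^2$ by their MLEs in (6.24). Miwa et al. (2000) also suggest calculating the probability $R_k$ directly using the recursive relationships (6.21) and (6.22). This approach leads to the application of numerical integration techniques, which can be quite time-consuming even with modern computers. However, if the $r_j(x)$ are approximated by piecewise cubic polynomials, then the integration in (6.22) can be performed analytically. Miwa et al. (2000) provide coefficients for such polynomials and discuss a choice of grid points.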
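For the exponential case discussed earlier in this section, both routes to $R_k$ can be compared directly. The sketch below is an illustration only: it assumes independent exponential variables with rates $\alpha_1, \ldots, \alpha_k$, evaluates the product form $\prod_{j=1}^{k-1} \alpha_j/(\alpha_j + \cdots + \alpha_k)$, and checks it against a crude trapezoidal implementation of the recursion $r_j(x) = \int_0^x r_{j-1}(y) f_j(y)\, dy$ with $r_0 = 1$. The grid parameters are arbitrary choices suited to rates of order one, and the function names are hypothetical.

```python
import math


def ordered_probability_exponential(rates):
    """P(X_1 < ... < X_k) for independent exponentials with the given rates."""
    prob = 1.0
    tail = sum(rates)  # alpha_j + ... + alpha_k for the current j
    for rate in rates[:-1]:
        prob *= rate / tail
        tail -= rate
    return prob


def ordered_probability_recursion(rates, x_max=40.0, steps=4000):
    """Approximate r_k(infinity) by the trapezoidal rule on [0, x_max]."""
    h = x_max / steps
    grid = [i * h for i in range(steps + 1)]
    r_prev = [1.0] * (steps + 1)  # r_0(x) = 1
    for rate in rates:
        integrand = [r * rate * math.exp(-rate * x) for r, x in zip(r_prev, grid)]
        r_curr = [0.0]
        for i in range(1, steps + 1):
            r_curr.append(r_curr[-1] + 0.5 * h * (integrand[i - 1] + integrand[i]))
        r_prev = r_curr
    return r_prev[-1]


def mle_ordered_probability(samples):
    """Plug-in MLE: each rate is the inverse of the corresponding sample mean."""
    rates = [len(sample) / sum(sample) for sample in samples]
    return ordered_probability_exponential(rates)
```

With equal rates the product form gives 1/k!, as it must by symmetry, which provides a quick sanity check of either routine.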

6.2.2

Estimation of P(X < Y < Z)

Remarkably, numerous papers devoted to the estimation of P(X < Y < Z) have appeared, scattered in the literature, over the last 25 years or so. Estimation of $R_3 = P(X < Y < Z)$ based on independent samples $(X_1, \ldots, X_{n_1})$, $(Y_1, \ldots, Y_{n_2})$ and $(Z_1, \ldots, Z_{n_3})$ was studied by Chandra and Owen (1975), Hlawka (1975), Singh (1980), Dutta and Sriwastav (1986) and Ivshin (1998). Hlawka (1975) and Singh (1980) concentrate on the nonparametric version of the problem, while the other authors deal with the parametric set-up. Hlawka (1975) suggests estimating $R_3$ by the U-statistic

$$U_3 = (n_1 n_2 n_3)^{-1} \sum_{i=1}^{n_1} \sum_{j=1}^{n_2} \sum_{l=1}^{n_3} I(X_i < Y_j < Z_l).$$

He shows that $U_3$ is an unbiased consistent estimator of $R_3$ with the variance bounded by $(4 n_1 n_2 + n_1 n_3 + 4 n_2 n_3 + 5 n_1 + 2 n_2 + 5 n_3 + 4)/(180\, n_1 n_2 n_3)$. He then proves that $\sqrt{n}(U_3 - R_3)$ is asymptotically normal whenever $n_1 = n_2 = n_3 = n$.
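The U-statistic $U_3 = (n_1 n_2 n_3)^{-1} \sum_i \sum_j \sum_l I(X_i < Y_j < Z_l)$ need not be computed by a triple loop. The sketch below, with a hypothetical function name, uses the fact that for each $Y_j$ the number of admissible triples is the number of X's below $Y_j$ times the number of Z's above $Y_j$; the sorting and binary search are an implementation choice and not part of Hlawka's paper.

```python
from bisect import bisect_left, bisect_right


def u_statistic_r3(xs, ys, zs):
    """Proportion of triples (X_i, Y_j, Z_l) with X_i < Y_j < Z_l."""
    xs_sorted = sorted(xs)
    zs_sorted = sorted(zs)
    total = 0
    for y in ys:
        below = bisect_left(xs_sorted, y)                     # number of X_i < y
        above = len(zs_sorted) - bisect_right(zs_sorted, y)   # number of Z_l > y
        total += below * above
    return total / (len(xs) * len(ys) * len(zs))
```

The cost is dominated by the two sorts, so the estimator can be evaluated for large samples without enumerating all $n_1 n_2 n_3$ triples.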

Singh (1980) considers a special case when the cdfs $F_X(\cdot)$ and $F_Z(\cdot)$ of X and Z, respectively, are known, while the cdf $F_Y(\cdot)$ of Y is unknown but the observations $(Y_1, \ldots, Y_{n_2})$ are available. He provides an unbiased estimator of $R_3$ of the form

$$\hat R_3 = n_2^{-1} \sum_{i=1}^{n_2} F_X(Y_i)\,[1 - F_Z(Y_i)]$$

with the variance

$$\mathrm{Var}(\hat R_3) \le n_2^{-1} \{ E[F_X(Y_i)(1 - F_Z(Y_i))] - R_3^2 \} = n_2^{-1} (R_3 - R_3^2) \le 1/(4 n_2).$$

A useful representation of $R_3$ as

$$R_3 = P(X < Y) - P(X < Y,\ Z < Y) \tag{6.25}$$

is also given by Singh (1980). Dutta and Sriwastav (1986) deal with the estimation of $R_3$ when X, Y and Z are exponential random variables, which is a particular case of the general model considered in Section 6.2.1.


Ivshin (1988) investigates the MLE and UMVUE of $R_3$ when X, Y and Z are either uniform or exponential random variables with unknown location parameters. The article utilizes formula (6.19) with k = 3. For example, if X, Y and Z are uniform with the pdfs $\theta_i^{-1} I(0 < x < \theta_i)$, i = 1, 2 and 3 for X, Y and Z, respectively, the MLE $\tilde R_3$ and the UMVUE $\hat R_3$ of $R_3$ are given by formidable but rather straightforward expressions

$$\tilde R_3 = \tilde p_1 I(X_{(n_1)} < Y_{(n_2)} < Z_{(n_3)}) + \tilde p_2 I(Y_{(n_2)} < \min(X_{(n_1)}, Z_{(n_3)})) + \tilde p_3 I(X_{(n_1)} < Z_{(n_3)} < Y_{(n_2)}) + \tilde p_4 I(Z_{(n_3)} < \min(X_{(n_1)}, Y_{(n_2)})),$$

$$\hat R_3 = \hat p_1 I(X_{(n_1)} < Y_{(n_2)} < Z_{(n_3)}) + \hat p_2 I(Y_{(n_2)} < \min(X_{(n_1)}, Z_{(n_3)})) + \hat p_3 I(X_{(n_1)} < Z_{(n_3)} < Y_{(n_2)}) + \hat p_4 I(Z_{(n_3)} < \min(X_{(n_1)}, Y_{(n_2)})),$$

where the weights of the MLE are the values of $R_3$ at $\theta_1 = X_{(n_1)}$, $\theta_2 = Y_{(n_2)}$, $\theta_3 = Z_{(n_3)}$ in the respective cases,

$$\tilde p_1 = 1 - \frac{X_{(n_1)}}{2 Y_{(n_2)}} - \frac{Y_{(n_2)}}{2 Z_{(n_3)}} + \frac{X_{(n_1)}^2}{6 Y_{(n_2)} Z_{(n_3)}}, \qquad \tilde p_2 = \frac{Y_{(n_2)}}{2 X_{(n_1)}} - \frac{Y_{(n_2)}^2}{3 X_{(n_1)} Z_{(n_3)}},$$

$$\tilde p_3 = \frac{1}{2 Y_{(n_2)}} \Big[ Z_{(n_3)} - X_{(n_1)} + \frac{X_{(n_1)}^2}{3 Z_{(n_3)}} \Big], \qquad \tilde p_4 = \frac{Z_{(n_3)}^2}{6 X_{(n_1)} Y_{(n_2)}},$$

and the weights $\hat p_1, \ldots, \hat p_4$ of the UMVUE are similar but more cumbersome rational functions of the same order statistics that also involve the sample sizes $n_1$, $n_2$ and $n_3$ (see Ivshin (1988)). Here $X_{(n_1)}$, $Y_{(n_2)}$ and $Z_{(n_3)}$ are the largest observations on X, Y and Z, respectively. To obtain the expressions for $\tilde R_3$ and $\hat R_3$ one needs to perform the integration in (6.19), keeping in mind that the MLE and the UMVUE of the uniform pdf $f(x) = \theta^{-1} I(0 < x < \theta)$ based on $X_1, \ldots, X_n$ are, respectively, $\tilde f(x) = X_{(n)}^{-1} I(0 < x < X_{(n)})$ and $\hat f(x) = (n-1) I(x < X_{(n)})/(n X_{(n)}) + [1 - (n-1)x/(n X_{(n)})]\, \delta(x - X_{(n)})$, where $\delta(\cdot)$ stands for the Dirac delta-function (see the definition in Section 3.2.2).

We conclude this section by noting the paper of Chandra and Owen (1975), which is concerned with a slightly different problem, namely, the estimation of $P(X_1 < Y, \ldots, X_l < Y)$ and $P(X < Y_1, \ldots, X < Y_l)$. It is easy to observe that the first of these probabilities is related to $R_3$ by formula (6.25), where l = 2. Chandra and Owen (1975) construct MLEs and UMVUEs of the above-mentioned probabilities in some special cases. We shall leave this material as exercises.

6.3

Linear Models Formulations for Stress-Strength Systems

6.3.1

Stress-Strength Models with Explanatory Variables

Often an experimenter has access to the measurements of some auxiliary (explanatory) variables that affect the strength or influence the stress. This situation is particularly prevalent in modern medical applications. The additional information can play an important role in the analysis. Suppose that X depends on $k_1$ explanatory variables $z_1$ and Y depends on $k_2$ explanatory variables $z_2$, namely,

$$X | z_1 = \beta_1' z_1 + \varepsilon_1, \qquad Y | z_2 = \beta_2' z_2 + \varepsilon_2, \tag{6.26}$$

where $\beta_j$ are $k_j$-dimensional vectors and $\varepsilon_j$ are random variables with specific distributions, j = 1, 2. Some of the explanatory variables may be common, as often happens in drug trials when the remission times are adjusted for the age of the patients. The model (6.26) and its minor modifications were considered by Guttman et al. (1988), Aminzadeh (1997) and Lee and Park (1998).

Normal stress and strength, equal variances. Guttman et al. (1988) assume that $\varepsilon_1$ and $\varepsilon_2$ are independent normal variables with zero means and variances $\sigma_1^2$ and $\sigma_2^2$, respectively. Assuming that $n_1$ ($n_2$) observations on X (Y) are available, they introduce the notation:

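$$X = (X_1, \ldots, X_{n_1})', \qquad Y = (Y_1, \ldots, Y_{n_2})', \qquad W_1 = Z_1' Z_1, \qquad W_2 = Z_2' Z_2, \tag{6.27}$$

where $Z_j$ is the $n_j \times k_j$ matrix whose rows are the observed values of $z_j'$, j = 1, 2. It is easy to show that for specified values $z_1$ and $z_2$ we have

$$R = R(z_1, z_2) = P(X < Y | z_1, z_2) = \Phi(\zeta), \tag{6.28}$$

where

$$\zeta = (\beta_2' z_2 - \beta_1' z_1) / \sqrt{\sigma_1^2 + \sigma_2^2} \tag{6.29}$$

and $\Phi(\cdot)$, as above, is the standard normal cdf. Following Guttman et al. (1988), consider first the case of $\sigma_1 = \sigma_2 = \sigma$. Then the sufficient statistics for the model are the least squares estimators $\hat\beta_1$, $\hat\beta_2$ and $S^2$, where

$$(N - k) S^2 = (n_1 - k_1) S_1^2 + (n_2 - k_2) S_2^2, \tag{6.30}$$

with $N = n_1 + n_2$, $k = k_1 + k_2$ and

$$(n_1 - k_1) S_1^2 = X'X - \hat\beta_1' Z_1' X, \qquad (n_2 - k_2) S_2^2 = Y'Y - \hat\beta_2' Z_2' Y.$$

In view of the normality assumptions, from standard linear model theory (see e.g. Seber (1977)) we have

$$\hat\beta_j \sim N(\beta_j, \sigma^2 W_j^{-1}), \quad j = 1, 2, \qquad (N - k) S^2 / \sigma^2 \sim \chi^2_{N-k}, \tag{6.31}$$

with $\hat\beta_1$, $\hat\beta_2$ and $S^2$ independently distributed. Denote (cf. (6.29))

$$\hat\zeta = (\hat\beta_2' z_2 - \hat\beta_1' z_1) / (\sqrt{2}\, S), \tag{6.32}$$

$$a_i^2 = z_i' W_i^{-1} z_i, \quad i = 1, 2, \qquad a^2 = a_1^2 + a_2^2. \tag{6.33}$$

Then an estimate for R is given by $\hat R = \Phi(\hat\zeta)$ (cf. (6.28)). To obtain a $(1 - \gamma)$ lower confidence bound for R note that

$$\sqrt{2}\, \hat\zeta / a \sim t_{N-k}(\Lambda), \tag{6.34}$$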


where $t_{N-k}(\Lambda)$ stands for the noncentral T-distribution with (N - k) degrees of freedom and noncentrality parameter $\Lambda = \sqrt{2}\, \zeta / a$ (compare with (4.14)). Denoting the cdf of this noncentral T-distribution by $F(t; N - k, \Lambda)$, an exact $(1 - \gamma)$ lower bound $\zeta_\gamma$ for $\zeta$ can be obtained by solving the equation

$$F(\sqrt{2}\, \hat\zeta / a;\ N - k,\ \Lambda) = 1 - \gamma \tag{6.35}$$

for $\Lambda$, say $\Lambda_\gamma$, and then determining $\zeta_\gamma$ as $\zeta_\gamma = a \Lambda_\gamma / \sqrt{2}$. Hence, a $(1 - \gamma)$ lower confidence bound for R is

$$R_\gamma(z_1, z_2) = \Phi(\zeta_\gamma) = \Phi(a \Lambda_\gamma / \sqrt{2}). \tag{6.36}$$
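The lower bound can be computed by a one-dimensional root search once the noncentral T cdf is available. The following is a rough sketch under stated assumptions, not a production routine: it takes $\hat\zeta$, $a$ and the degrees of freedom as given, represents the noncentral T cdf as $F(t; \nu, \Lambda) = E\,\Phi(t\sqrt{W/\nu} - \Lambda)$ with $W \sim \chi^2_\nu$, evaluates that expectation by Simpson's rule (assuming $\nu \ge 2$), and solves $F(\sqrt{2}\hat\zeta/a; \nu, \Lambda) = 1 - \gamma$ for $\Lambda$ by bisection. In practice a library implementation of the noncentral T-distribution would be preferable; all names below are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def noncentral_t_cdf(t, df, ncp, steps=2000):
    """P(T <= t) for T = (Z + ncp) / sqrt(W / df), Z ~ N(0,1), W ~ chi2(df).

    Uses composite Simpson's rule in W (steps must be even); assumes df >= 2.
    """
    if df < 2:
        raise ValueError("this sketch assumes df >= 2")
    upper = df + 12.0 * math.sqrt(2.0 * df) + 40.0  # chi2 mass beyond is negligible
    h = upper / steps
    log_norm = -(0.5 * df) * math.log(2.0) - math.lgamma(0.5 * df)

    def integrand(w):
        if w == 0.0:
            density = 0.5 if df == 2 else 0.0
        else:
            density = math.exp(log_norm + (0.5 * df - 1.0) * math.log(w) - 0.5 * w)
        return _STD_NORMAL.cdf(t * math.sqrt(w / df) - ncp) * density

    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0


def lower_confidence_bound(zeta_hat, a, df, gamma=0.05):
    """(1 - gamma) lower bound Phi(a * Lambda_gamma / sqrt(2)) for R."""
    t_obs = math.sqrt(2.0) * zeta_hat / a
    target = 1.0 - gamma

    def excess(ncp):  # decreasing in ncp
        return noncentral_t_cdf(t_obs, df, ncp) - target

    hi = t_obs
    while excess(hi) >= 0.0:
        hi += 1.0
    lo = hi - 1.0
    while excess(lo) < 0.0:
        lo -= 1.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if excess(mid) >= 0.0:
            lo = mid
        else:
            hi = mid
    ncp_gamma = 0.5 * (lo + hi)
    return _STD_NORMAL.cdf(a * ncp_gamma / math.sqrt(2.0))
```

With zero noncentrality the routine reproduces the central T-distribution, which gives a convenient check against standard tables before it is used for the bound.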

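Similarly to Section 4.1.2, one can use the normal approximation to the noncentral T-distribution to avoid the technical difficulties involved in the numerical solution of equation (6.35).

Normal stress and strength, non-equal variances. In the situation of non-equal variances $\sigma_1 \ne \sigma_2$, the lower bound derived by Guttman et al. (1988) can be obtained by generalizing the technique of Reiser and Guttman (1986) (see Section 4.1.2). In this case, denote

$$\hat\zeta = (\hat\beta_2' z_2 - \hat\beta_1' z_1) / \sqrt{S_1^2 + S_2^2}, \tag{6.37}$$

$$M = (\sigma_1^2 + \sigma_2^2) / (a_1^2 \sigma_1^2 + a_2^2 \sigma_2^2), \tag{6.38}$$

where $a_1$ and $a_2$ are defined in (6.33). Note that

$$\hat\beta_2' z_2 - \hat\beta_1' z_1 \sim N(\beta_2' z_2 - \beta_1' z_1,\ a_1^2 \sigma_1^2 + a_2^2 \sigma_2^2) \tag{6.39}$$

is independently distributed from $S_1^2 + S_2^2$, which is a sum of weighted chi-squared random variables. Approximating the distribution of $S_1^2 + S_2^2$ by

$$S_1^2 + S_2^2 \sim (\sigma_1^2 + \sigma_2^2)\, \chi^2_\nu / \nu, \qquad \nu = (\sigma_1^2 + \sigma_2^2)^2 \Big/ \Big[ \frac{\sigma_1^4}{n_1 - k_1} + \frac{\sigma_2^4}{n_2 - k_2} \Big],$$

and using (6.37) and (6.39), we derive (cf. (4.14))

$$\sqrt{M}\, \hat\zeta \sim t_\nu(\sqrt{M}\, \zeta), \tag{6.40}$$

where $\zeta$ is given in (6.29) and M and $\nu$ can be approximated by

$$\hat M = (S_1^2 + S_2^2) / (a_1^2 S_1^2 + a_2^2 S_2^2) \tag{6.41}$$

and

$$\hat\nu = (S_1^2 + S_2^2)^2 \Big/ \Big[ \frac{S_1^4}{n_1 - k_1} + \frac{S_2^4}{n_2 - k_2} \Big], \tag{6.42}$$

respectively. An approximate $(1 - \gamma)$ lower confidence bound for $\zeta$, say $\zeta_\gamma$, is then found by solving

$$F\big(\sqrt{\hat M}\, \hat\zeta;\ \hat\nu,\ \sqrt{\hat M}\, \zeta\big) = 1 - \gamma \tag{6.43}$$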

for $\zeta$ (cf. (6.35)). The lower $(1 - \gamma)$ confidence bound for R is then $R_\gamma = \Phi(\zeta_\gamma)$. Of course, here, analogously to Section 4.1.2, one can use the normal approximation to the noncentral T-distribution to avoid the often troublesome solution of equation (6.43).

Exponential stress and strength. Aminzadeh (1997) proposes the following regression model:

$$X | z_1 = \exp\{\beta_1' z_1 + \varepsilon_1\}, \qquad Y | z_2 = \exp\{\beta_2' z_2 + \varepsilon_2\}, \tag{6.44}$$

where X and Y have exponential pdfs $f_X(x) = \theta_1^{-1} \exp(-x/\theta_1)$ and $f_Y(y) = \theta_2^{-1} \exp(-y/\theta_2)$ with $\theta_1 = \exp(\beta_1' z_1)$ and $\theta_2 = \exp(\beta_2' z_2)$. Under this model, $\varepsilon_i$, i = 1, 2, have the pdfs $f_\varepsilon(\varepsilon) = \exp(\varepsilon - \exp(\varepsilon))$, and, since $\theta_1$ and $\theta_2$ are the means of X and Y, $R = R(z_1, z_2) = P(X < Y | z_1, z_2) = \theta_2/(\theta_1 + \theta_2)$, a very familiar formula for the exponential case. To construct a confidence interval for R the author introduces the auxiliary function $\Psi(R) = R/(1 - R) = \theta_2/\theta_1 = \exp(\beta_2' z_2 - \beta_1' z_1)$. Aminzadeh (1997) also derives the MLEs $\hat\beta_1$ and $\hat\beta_2$ of $\beta_1$ and $\beta_2$ and then uses the asymptotic normality of the MLE to construct confidence intervals for R. Asymptotically we have

$$\hat\beta_2' z_2 - \hat\beta_1' z_1 \sim N\big(\beta_2' z_2 - \beta_1' z_1,\ [z_1' W_1^{-1} z_1 + z_2' W_2^{-1} z_2]/2\big),$$


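where $W_j$, j = 1, 2, are defined in (6.27).

Nonparametric model for stress and strength. Lee and Park (1998) consider a model very similar to (6.26) but with no parametric assumptions on X and Y. Namely, they study

$$X = \mu_1 + \beta_1'(z_1 - \bar z_1) + \varepsilon_1, \qquad Y = \mu_2 + \beta_2'(z_2 - \bar z_2) + \varepsilon_2, \tag{6.45}$$

with unknown $\mu_j$ and $\beta_j$, and

$$\bar z_j = n_j^{-1} \sum_{i=1}^{n_j} z_{ji}, \qquad j = 1, 2.$$

The only conditions imposed on $\varepsilon_j$ are $E(\varepsilon_j) = 0$ and $\mathrm{Var}(\varepsilon_j) = \sigma_j^2$, j = 1, 2. If the parameters $\mu_j$ and $\beta_j$, j = 1, 2, were known, we would have had $R = P(X < Y | z_1, z_2) = P(\varepsilon_1 - \varepsilon_2 < \tau)$ with

$$\tau = \mu_2 - \mu_1 + \beta_2'(z_2 - \bar z_2) - \beta_1'(z_1 - \bar z_1). \tag{6.46}$$

Hence, one should estimate R by

$$(n_1 n_2)^{-1} \sum_{i=1}^{n_1} \sum_{j=1}^{n_2} I(\varepsilon_{1i} - \varepsilon_{2j} < \tau). \tag{6.47}$$

Now, since $\tau$ is unknown and $\varepsilon_{1i}$ and $\varepsilon_{2j}$ are unavailable, the authors derive the least squares estimators $\hat\mu_j$ and $\hat\beta_j$ of $\mu_j$ and $\beta_j$, j = 1, 2, respectively. They set $\hat\tau = \hat\mu_2 - \hat\mu_1 + \hat\beta_2'(z_2 - \bar z_2) - \hat\beta_1'(z_1 - \bar z_1)$, $\hat\varepsilon_{1i} = X_i - \hat\mu_1 - \hat\beta_1'(z_{1i} - \bar z_1)$ and $\hat\varepsilon_{2j} = Y_j - \hat\mu_2 - \hat\beta_2'(z_{2j} - \bar z_2)$ and construct the estimator of $R(z_1, z_2)$ of the form

$$\hat R(z_1, z_2) = \frac{1}{n_1 n_2} \sum_{i=1}^{n_1} \sum_{j=1}^{n_2} I(\hat\varepsilon_{1i} - \hat\varepsilon_{2j} < \hat\tau). \tag{6.48}$$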

Lee and Park (1998) prove consistency of the estimator (6.48) and show that under regularity conditions this estimator is asymptotically normal. They also provide an estimator for the asymptotic variance and propose an asymptotic lower confidence bound for R. Their computations are quite instructive, and we recommend further study of their paper.
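A residual-based estimator of this kind is straightforward to prototype. The sketch below is a simplified illustration and not Lee and Park's procedure in full: it assumes a single scalar explanatory variable for each of X and Y, fits each centered regression by least squares, and counts the pairs of residuals with $\hat\varepsilon_{1i} - \hat\varepsilon_{2j} < \hat\tau$, which under the centered model is the event X < Y at the chosen covariate values. The function names are hypothetical, and no variance estimate or confidence bound is attempted.

```python
def fit_centered_line(z, y):
    """Least squares fit of y = mu + beta * (z - z_bar) + error for scalar z."""
    n = len(z)
    z_bar = sum(z) / n
    mu_hat = sum(y) / n  # the intercept of a centered regression is the mean of y
    s_zz = sum((zi - z_bar) ** 2 for zi in z)
    beta_hat = sum((zi - z_bar) * yi for zi, yi in zip(z, y)) / s_zz
    residuals = [yi - mu_hat - beta_hat * (zi - z_bar) for zi, yi in zip(z, y)]
    return mu_hat, beta_hat, z_bar, residuals


def residual_based_estimate(z1, x, z2, y, z1_new, z2_new):
    """Estimate P(X < Y) at covariate values (z1_new, z2_new) from residual pairs."""
    mu1, beta1, z1_bar, res1 = fit_centered_line(z1, x)
    mu2, beta2, z2_bar, res2 = fit_centered_line(z2, y)
    tau_hat = mu2 - mu1 + beta2 * (z2_new - z2_bar) - beta1 * (z1_new - z1_bar)
    count = sum(1 for r1 in res1 for r2 in res2 if r1 - r2 < tau_hat)
    return count / (len(res1) * len(res2))
```

When the fitted strength line lies far above the fitted stress line at the chosen covariate values, every residual pair satisfies the inequality and the estimate equals one, as expected.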

6.3.2

ANOVA Formulations of Stress-Strength Models

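In statistical quality control and reliability analysis, observations on X and Y can sometimes be divided into groups. For example, a quality control engineer may decide to compare the failure times X and Y of two products A and B where both products are tested under various sets of conditions (temperature, pressure, exposure to sun, air, etc.). When the number of batches in a population is high, a possible model for the measurements could be the random effect model considered by Aminzadeh (1991). Let $X_{ij}$ and $Y_{ij}$ be the i-th measurements in batch j for X and Y, respectively, where $i = 1, \ldots, n_1$, $j = 1, \ldots, k_1$ for X, and $i = 1, \ldots, n_2$, $j = 1, \ldots, k_2$ for Y. Aminzadeh (1991) suggests a one-way ANOVA random effect model for X and Y of the form

$$X_{ij} = \mu_1 + Q_j + e_{ij}, \tag{6.49}$$

$$Y_{ij} = \mu_2 + T_j + \varepsilon_{ij}, \tag{6.50}$$

where $\mu_1$ and $\mu_2$ are the overall means of populations X and Y and $Q_j$ and $T_j$ are the random batch effects. Under the assumption that all $Q_j$, $T_j$, $e_{ij}$ and $\varepsilon_{ij}$ are independent normal random variables with zero means and variances $\sigma_Q^2$, $\sigma_T^2$, $\sigma_e^2$ and $\sigma_\varepsilon^2$, respectively, we obtain from (6.49) and (6.50) that

$$X_{ij} \sim N(\mu_1,\ \sigma_1^2 = \sigma_Q^2 + \sigma_e^2), \qquad Y_{ij} \sim N(\mu_2,\ \sigma_2^2 = \sigma_T^2 + \sigma_\varepsilon^2).$$

Using the well-known results for the one-way random ANOVA model, we estimate $\mu_1$ and $\mu_2$ by $\bar X = \sum_{i,j} X_{ij}/(n_1 k_1)$ and $\bar Y = \sum_{i,j} Y_{ij}/(n_2 k_2)$, respectively, and the variances by

$$\hat\sigma_1^2 = (s_e^2 (n_1 - 1) + s_Q^2)/n_1, \qquad \hat\sigma_2^2 = (s_\varepsilon^2 (n_2 - 1) + s_T^2)/n_2, \tag{6.51}$$

where $s_e^2$ and $s_Q^2$ ($s_\varepsilon^2$ and $s_T^2$) are the mean squares within and between batches for $X_{ij}$ ($Y_{ij}$). Note that

$$\bar X \sim N(\mu_1,\ [n_1 \sigma_Q^2 + \sigma_e^2]/(n_1 k_1)), \qquad \bar Y \sim N(\mu_2,\ [n_2 \sigma_T^2 + \sigma_\varepsilon^2]/(n_2 k_2)), \tag{6.52}$$

and $\hat\sigma_1^2$ and $\hat\sigma_2^2$ have approximately the chi-squared distributions

$$\hat\sigma_i^2 \sim \sigma_i^2\, \chi^2_{\nu_i} / \nu_i, \qquad i = 1, 2,$$

with

$$\nu_i = \frac{(\alpha_i + 1)^2 (k_i - 1) k_i n_i^2}{(n_i \alpha_i + 1)^2 k_i + (n_i - 1)(k_i - 1)}, \tag{6.53}$$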


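where $a_1 = \sigma_Q/\sigma_\varepsilon$ and $a_2 = \sigma_T/\sigma_\delta$.

Observe that R = P(X < Y) is of the familiar form $R = \Phi(\zeta)$ with $\zeta = (\mu_2 - \mu_1)/\sqrt{\sigma_1^2 + \sigma_2^2}$. To construct confidence bounds for R = P(X < Y), the cases of equal ($\sigma_1^2 = \sigma_2^2$) and non-equal ($\sigma_1^2 \neq \sigma_2^2$) variances ought to be considered separately.

Equal variances. If $\sigma_1^2 = \sigma_2^2 = \sigma^2$, then the pooled estimator of variance based on (6.53) will be

$$\hat\sigma^2 = \frac{\nu_1 \hat\sigma_1^2 + \nu_2 \hat\sigma_2^2}{\nu_1 + \nu_2} \sim \frac{\sigma^2}{\nu_1 + \nu_2}\,\chi^2_{\nu_1 + \nu_2}$$

and, hence, the estimator of $\zeta$ is $\hat\zeta = (\bar Y - \bar X)/(\sqrt{2}\,\hat\sigma)$. Since $\bar Y - \bar X \sim N(\mu_2 - \mu_1, b\sigma^2)$ with

$$b = \frac{n_1 a_1^2 + 1}{(a_1^2 + 1)\, n_1 k_1} + \frac{n_2 a_2^2 + 1}{(a_2^2 + 1)\, n_2 k_2},$$

$T = \sqrt{2/b}\,\hat\zeta$ has a noncentral T-distribution with $(\nu_1 + \nu_2)$ degrees of freedom and the non-centrality parameter $\sqrt{2/b}\,\zeta$. The lower confidence bound for R can be constructed as it has been done in Section 6.3.1.

Non-equal variances. The procedure here is analogous to the one in the case of equal variances. In the case of non-equal variances we approximate the distribution of $\hat\sigma_1^2 + \hat\sigma_2^2$ in $\hat\zeta = (\bar Y - \bar X)/\sqrt{\hat\sigma_1^2 + \hat\sigma_2^2}$ by $\hat\sigma_1^2 + \hat\sigma_2^2 \sim (\nu^*)^{-1}(\sigma_1^2 + \sigma_2^2)\,\chi^2_{\nu^*}$, where $\nu^* = (\sigma_1^2 + \sigma_2^2)^2/(\sigma_1^4/\nu_1 + \sigma_2^4/\nu_2)$. Let

$$b^* = \frac{(n_1 - 1)\sigma_Q^2 + \sigma_1^2}{n_1 k_1} + \frac{(n_2 - 1)\sigma_T^2 + \sigma_2^2}{n_2 k_2}.$$

In this case, $T^* = \sqrt{(\sigma_1^2 + \sigma_2^2)/b^*}\,\hat\zeta$ has a non-central T-distribution with the non-centrality parameter $\sqrt{(\sigma_1^2 + \sigma_2^2)/b^*}\,\zeta$ and $\nu^*$ degrees of freedom. For construction of the lower bound for R one needs to approximate $\nu^*$ by using $\hat\sigma_1^2$ and $\hat\sigma_2^2$ instead of $\sigma_1^2$ and $\sigma_2^2$.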
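The point estimation step of the ANOVA formulation can be illustrated with a minimal Python sketch. It assumes a balanced layout (every batch has the same number of observations), uses the variance estimator of the form (6.51) and the plug-in estimate $\hat R = \Phi(\hat\zeta)$ with $\hat\zeta = (\bar Y - \bar X)/\sqrt{\hat\sigma_1^2 + \hat\sigma_2^2}$. The function names and the toy data are hypothetical and serve only as an illustration; confidence bounds are not computed.

```python
import math


def phi(x):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def random_effects_summary(batches):
    """Summaries for a balanced one-way layout.

    `batches` is a list of k batches, each with n observations.
    Returns (grand mean, within mean square, between mean square,
    estimate of the total variance as in (6.51)).
    """
    k = len(batches)
    n = len(batches[0])
    batch_means = [sum(b) / n for b in batches]
    grand = sum(batch_means) / k
    ms_between = n * sum((m - grand) ** 2 for m in batch_means) / (k - 1)
    ms_within = sum(
        (x - m) ** 2 for b, m in zip(batches, batch_means) for x in b
    ) / (k * (n - 1))
    var_total = ((n - 1) * ms_within + ms_between) / n
    return grand, ms_within, ms_between, var_total


def reliability_estimate(x_batches, y_batches):
    """Plug-in estimate of R = P(X < Y) under the normal random effect model."""
    xbar, _, _, v1 = random_effects_summary(x_batches)
    ybar, _, _, v2 = random_effects_summary(y_batches)
    zeta_hat = (ybar - xbar) / math.sqrt(v1 + v2)
    return phi(zeta_hat)


# Hypothetical toy data: two batches of two measurements for each product.
x_data = [[1.0, 3.0], [5.0, 7.0]]
y_data = [[4.0, 6.0], [8.0, 10.0]]
print(random_effects_summary(x_data))
print(reliability_estimate(x_data, y_data))
```

For unbalanced layouts the mean squares and their expectations change, so the sketch would need to be adapted before use.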

6.4

Stress-Strength Models with Grouped and Categorical Data

In many practical situations, instead of numerical measurements on X and Y, the data are presented in the form of ordered categories. Namely, the counts $n_{ij}$ are available which represent the number of observations on X (i = 1) and Y (i = 2) that fall into the category $C_j$, $j = 1, \ldots, K$. These categories may be obtained simply by discretizing all possible values of X and Y by the partition $-\infty \le c_0 < c_1 < \cdots < c_{K-1} < c_K \le \infty$, so that $C_j = (c_{j-1}, c_j]$, where the cut-off points $c_j$ may or may not be available. For example, Hochberg (1981), in a study sponsored by the US Department of Transportation, considers comparison of the injury distributions of belted and unbelted drivers involved in accidents, letting $C_1, \ldots, C_K$ be ordered injury categories ranging from least to most severe injury, in which case the $c_j$ cannot be obtained. Gastwirth and Krieger (1991) use the above model in measuring economic inequality. Income data are typically reported in grouped form: the number of persons whose income falls in each interval is reported and, hence, the cut-off points are recorded. Other examples of applications of the above model in medicine and psychology will be considered in more detail in Section 7.3.2.

The model with categorical data was studied by Hochberg (1981), Simonoff et al. (1986), Halperin et al. (1989), Gastwirth and Krieger (1991) and Edwardes (1995) among others. A major difference between continuous and categorical data is the possibility of ties, i.e., P(X = Y) may not be zero. For this reason, it is sensible to estimate a quantity different from R = P(X < Y) which takes the latter possibility into account. One of the candidates may be R* = P(X < Y) + (1/2)P(X = Y), which is considered in Section 6.4.2; the point estimators discussed below target $\Lambda = P(X < Y) - P(X > Y)$.

6.4.1

Point Estimation

Evidently, the vectors $N_i = (n_{i1}, n_{i2}, \ldots, n_{iK})$, i = 1, 2, have multinomial distributions with probabilities

$$P_{1j} = \int_{c_{j-1}}^{c_j} f_X(x)\,dx, \qquad P_{2j} = \int_{c_{j-1}}^{c_j} f_Y(y)\,dy, \qquad j = 1, \ldots, K, \qquad (6.54)$$

for an observation from X or Y, respectively, to fall into category j, $j = 1, \ldots, K$. If the cut-off points $c_j$ are not available, the most natural approach to estimation of $\Lambda$ is the nonparametric one. Otherwise, one can use a parametric approach or one of the two parametric-nonparametric alternatives suggested by Simonoff et al. (1986).

Nonparametric estimator. Following Hochberg (1981) we denote the group index of an observation with the value a by J(a), so that J(X) is a random variable with the values $1, 2, \ldots, K$. Let

$$\hat p_{ij} = n_{ij}/n_i, \qquad i = 1, 2, \quad j = 1, \ldots, K,$$

be estimators of $P_{ij}$ using frequencies. Then the WMW-type estimator of $\Lambda$ is

$$\hat\Lambda_{WMW} = \sum_{i=1}^{K-1} \sum_{j=i+1}^{K} \hat p_{1i}\,\hat p_{2j} - \sum_{i=2}^{K} \sum_{j=1}^{i-1} \hat p_{1i}\,\hat p_{2j}. \qquad (6.55)$$
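The WMW-type estimator (6.55) is a pair of double sums over cell frequencies, which the following Python sketch makes explicit. It is only an illustration: the function names and the counts are hypothetical, and $n_i$ is taken to be the total count of sample i. The sketch also returns the estimated probability of a tie, from which the tie-adjusted quantity P(X < Y) + (1/2)P(X = Y) can be estimated in the same way.

```python
def cell_frequencies(counts):
    """Relative frequencies n_ij / n_i for one sample."""
    total = sum(counts)
    return [c / total for c in counts]


def wmw_components(counts_x, counts_y):
    """Estimates of P(J(X) < J(Y)), P(J(X) > J(Y)) and P(J(X) = J(Y))."""
    p1 = cell_frequencies(counts_x)
    p2 = cell_frequencies(counts_y)
    k = len(p1)
    less = sum(p1[i] * p2[j] for i in range(k) for j in range(i + 1, k))
    greater = sum(p1[i] * p2[j] for i in range(k) for j in range(i))
    ties = sum(a * b for a, b in zip(p1, p2))
    return less, greater, ties


def lambda_wmw(counts_x, counts_y):
    """WMW-type estimator: first half of (6.55) minus its second half."""
    less, greater, _ = wmw_components(counts_x, counts_y)
    return less - greater


def r_star_hat(counts_x, counts_y):
    """Frequency estimate of P(X < Y) + (1/2) P(X = Y)."""
    less, _, ties = wmw_components(counts_x, counts_y)
    return less + 0.5 * ties


# Hypothetical counts in K = 3 ordered categories.
x_counts = [2, 1, 1]
y_counts = [1, 1, 2]
print(lambda_wmw(x_counts, y_counts), r_star_hat(x_counts, y_counts))
```

Note that the three components always add up to one, so the two summaries are linked by $\hat R^* = (\hat\Lambda_{WMW} + 1)/2$.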

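Parametric estimator. Assuming that $X \sim N(\mu_1, \sigma_1^2)$ and $Y \sim N(\mu_2, \sigma_2^2)$, the MLE of $\Lambda$ is of the familiar form

$$\hat\Lambda_{MLE} = 2\Phi\left(\frac{\hat\mu_2 - \hat\mu_1}{\sqrt{\hat\sigma_1^2 + \hat\sigma_2^2}}\right) - 1. \qquad (6.56)$$

Here

$$\hat\mu_i = \sum_{j=1}^{K} \hat p_{ij}\,\bar c_j, \qquad \hat\sigma_i^2 = \sum_{j=1}^{K} \hat p_{ij}\,\bar c_j^2 - (\hat\mu_i)^2, \qquad i = 1, 2, \qquad (6.57)$$

with $\bar c_j = (c_{j-1} + c_j)/2$. The performance of the estimator (6.56) is highly sensitive to violation of normality assumptions.

Two parametric/nonparametric compromises. A more robust parametric estimator can be constructed by deriving $\hat\mu_i$ and $\hat\sigma_i$, i = 1, 2, using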

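(6.57) and then estimating $P_{ij}$, i = 1, 2, $j = 1, \ldots, K$, by the difference

$$\tilde p_{ij} = \Phi\left(\frac{c_j - \hat\mu_i}{\hat\sigma_i}\right) - \Phi\left(\frac{c_{j-1} - \hat\mu_i}{\hat\sigma_i}\right).$$

Then a "pseudo-MLE" estimator of $\Lambda$ will be

$$\hat\Lambda_{PML} = \sum_{i=1}^{K-1} \sum_{j=i+1}^{K} \tilde p_{1i}\,\tilde p_{2j} - \sum_{i=2}^{K} \sum_{j=1}^{i-1} \tilde p_{1i}\,\tilde p_{2j}. \qquad (6.58)$$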

Note that $\hat\Lambda_{PML}$ is essentially the same as $\hat\Lambda_{WMW}$ given by (6.55), except that the cell probability estimates are derived parametrically rather than from frequency estimates. The behavior of (6.58) depends on whether X and Y are normally distributed, but to a somewhat lesser extent than that of $\hat\Lambda_{MLE}$.
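The parametric estimator (6.56) and the pseudo-MLE (6.58) can be compared on grouped data with a short Python sketch. It is a hedged illustration, not the authors' implementation: the function names and the data are hypothetical, the cut-off points are assumed to be finite so that the midpoints in (6.57) exist, and, as an additional assumption of this sketch, the two outer cells are treated as open-ended when the normal cell probabilities are computed, so that they sum to one.

```python
import math


def phi(x):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def grouped_normal_fit(counts, cuts):
    """Moment-type fit (6.57) from counts and K+1 finite cut-off points."""
    n = sum(counts)
    mids = [(cuts[j] + cuts[j + 1]) / 2.0 for j in range(len(counts))]
    mu = sum(c * m for c, m in zip(counts, mids)) / n
    var = sum(c * m * m for c, m in zip(counts, mids)) / n - mu * mu
    return mu, math.sqrt(var)


def lambda_mle(counts_x, counts_y, cuts):
    """Estimator of the form (6.56)."""
    mu1, s1 = grouped_normal_fit(counts_x, cuts)
    mu2, s2 = grouped_normal_fit(counts_y, cuts)
    return 2.0 * phi((mu2 - mu1) / math.sqrt(s1 * s1 + s2 * s2)) - 1.0


def normal_cell_probs(mu, sigma, cuts):
    """Normal cell probabilities; outer cells are open-ended (assumption)."""
    cdf = [0.0] + [phi((c - mu) / sigma) for c in cuts[1:-1]] + [1.0]
    return [cdf[j + 1] - cdf[j] for j in range(len(cdf) - 1)]


def lambda_pml(counts_x, counts_y, cuts):
    """Pseudo-MLE of the form (6.58) built from fitted cell probabilities."""
    p1 = normal_cell_probs(*grouped_normal_fit(counts_x, cuts), cuts)
    p2 = normal_cell_probs(*grouped_normal_fit(counts_y, cuts), cuts)
    k = len(p1)
    less = sum(p1[i] * p2[j] for i in range(k) for j in range(i + 1, k))
    greater = sum(p1[i] * p2[j] for i in range(k) for j in range(i))
    return less - greater


# Hypothetical grouped data with cut-off points 0 < 1 < 2 < 3.
cuts = [0.0, 1.0, 2.0, 3.0]
x_counts = [5, 10, 5]
y_counts = [2, 8, 10]
print(lambda_mle(x_counts, y_counts, cuts), lambda_pml(x_counts, y_counts, cuts))
```

The fit fails when all observations of a sample fall into a single cell, since the estimated standard deviation is then zero; such degenerate inputs are not handled here.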

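Another method suggested by Simonoff et al. (1986) does not require the underlying model to be normal; only smoothness of the pdfs $f_X$ and $f_Y$ is required. Under this condition, the probabilities $P_{ij}$ can be estimated more accurately by a "roughness penalty method" as follows. Rather than estimate the probabilities $P_{ij}$, i = 1, 2, $j = 1, \ldots, K$, using the log-likelihoods

$$\sum_{j=1}^{K} n_{ij} \log P_{ij}, \qquad \sum_{j=1}^{K} P_{ij} = 1, \qquad i = 1, 2,$$

the estimators are now defined as the values that maximize the log-likelihoods modified by "roughness penalties", that is, the sum over $j = 1, \ldots, K$ above minus a penalty term summed over the adjacent cells $j = 1, \ldots, K-1$ which penalizes abrupt changes between neighbouring probabilities. The method yields smoothed probability estimators $p^*_{ij}$ and results in a nonparametric estimator of $\Lambda$ (since only smoothness of $f_X$ and $f_Y$ is assumed):

$$\hat\Lambda_S = \sum_{i=1}^{K-1} \sum_{j=i+1}^{K} p^*_{1i}\,p^*_{2j} - \sum_{i=2}^{K} \sum_{j=1}^{i-1} p^*_{1i}\,p^*_{2j}. \qquad (6.59)$$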

It is instructive to note that the four estimators (6.55), (6.56), (6.58) and (6.59) can be viewed as occupying a continuum from most parametric to least parametric as follows: $\hat\Lambda_{MLE} \to \hat\Lambda_{PML} \to \hat\Lambda_S \to \hat\Lambda_{WMW}$.

6.4.2

Confidence Intervals

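Confidence intervals based on asymptotic normality. Since the estimators $\hat\Lambda_{MLE}$, $\hat\Lambda_{PML}$, $\hat\Lambda_S$ and $\hat\Lambda_{WMW}$ are all asymptotically normal with the mean $\Lambda$, the $(1 - \gamma)$ asymptotic confidence interval for $\Lambda$ is of the form

$$\left(\hat\Lambda - z_{1-\gamma/2}\sqrt{\widehat{\mathrm{Var}}(\hat\Lambda)},\ \ \hat\Lambda + z_{1-\gamma/2}\sqrt{\widehat{\mathrm{Var}}(\hat\Lambda)}\right),$$

where $z_{1-\gamma/2}$ is the corresponding quantile of the standard normal distribution. Estimation of $\mathrm{Var}(\hat\Lambda_{WMW})$ was initially carried out by Hochberg (1981) who shows that

$$\mathrm{Var}(\hat\Lambda_{WMW}) = (n_1 n_2)^{-1}\big[P(X < Y) + P(X > Y) - (n_1 + n_2 - 1)\Lambda^2 + (n_2 - 1)(\pi_{XYY} + \pi_{YYX} - 2\pi_{YXY}) + (n_1 - 1)(\pi_{XXY} + \pi_{YXX} - 2\pi_{XYX})\big], \qquad (6.60)$$

where

$$\pi_{XXY} = P(J(Y_i) > \max[J(X_j), J(X_l)]),$$

$$\pi_{YXX} = P(J(Y_i) < \min[J(X_j), J(X_l)]), \qquad \pi_{XYX} = P(J(X_j) < J(Y_i) < J(X_l)),$$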

where, as above, J(a) is the group index of an observation with the value a, and $X_j$, $X_l$ and $Y_i$ are original (not discretized) X and Y observations ($\pi_{YYX}$, $\pi_{XYY}$ and $\pi_{YXY}$ are similarly defined). The variance (6.60) can consistently be estimated from the estimators of P(X < Y) and P(Y < X), using the first and the second halves of formula (6.55), and from the corresponding sample proportions for the probabilities $\pi$, each normalized by the number of triples of observations involved (for example, $[n_1 n_2 (n_2 - 1)]^{-1}$ for triples consisting of one X and two Y observations). Edwardes (1995) provides an estimator of the variance (6.60) for more complex sample designs.

Confidence intervals based on pivotal quantities. The pivotal-based confidence interval for R* = P(X < Y) + (1/2)P(X = Y) has been suggested by Halperin et al. (1989) and is derived in the spirit of the technique discussed in Section 5.3.4. The value of R* is

$$R^* = \sum_{i=1}^{K-1} \sum_{j=i+1}^{K} P_{1i} P_{2j} + \frac{1}{2} \sum_{j=1}^{K} P_{1j} P_{2j} \qquad (6.61)$$

and its estimator $\hat R^*$ is obtained by substituting $\hat p_{ij}$ in place of $P_{ij}$, i = 1, 2, $j = 1, \ldots, K$. A straightforward but somewhat tedious computation shows that $\hat R^*$ has variance

$$V = \frac{R^* - (n_1 + n_2 - 1)(R^*)^2 + (n_2 - 1)A + (n_1 - 1)B - \frac{1}{4}\sum_{j=1}^{K} P_{1j} P_{2j}}{n_1 n_2}, \qquad (6.62)$$

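where

$$A = \sum_{i=1}^{K} P_{1i}\Big(\sum_{j=i+1}^{K} P_{2j} + \tfrac{1}{2} P_{2i}\Big)^2, \qquad B = \sum_{j=1}^{K} P_{2j}\Big(\sum_{i=1}^{j-1} P_{1i} + \tfrac{1}{2} P_{1j}\Big)^2. \qquad (6.63)$$

It is easy to verify that $(R^*)^2 \le A, B \le R^*$, so that the last three terms of $n_1 n_2 V$ are bounded below by $(n_1 + n_2 - 2)(R^*)^2 - 1/4$ and above by $(n_1 + n_2 - 2)R^*$. Thus, for some $\theta \in (0, 1)$,

$$(n_2 - 1)A + (n_1 - 1)B - \tfrac{1}{4}\sum_{j=1}^{K} P_{1j} P_{2j} = \theta\big[(n_1 + n_2 - 2)(R^*)^2 - 1/4\big] + (1 - \theta)(n_1 + n_2 - 2)R^* = (n_1 + n_2 - 2)\big[R^* - \theta R^*(1 - R^*) - (n_1 + n_2 - 2)^{-1}\theta/4\big]. \qquad (6.64)$$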


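Since the term $(n_1 + n_2 - 2)^{-1}\theta/4$ is asymptotically negligible as $n_i \to \infty$, i = 1, 2, it is ignored and the modified version of V obtained from (6.62)-(6.64) will be

$$V^* = (n_1 n_2)^{-1}\big[n_1 + n_2 - 1 - (n_1 + n_2 - 2)\theta\big] R^*(1 - R^*). \qquad (6.65)$$

To derive an estimator of $\theta$ we rely on equation (6.64), ignoring the asymptotically negligible terms $\frac{1}{4}\sum_j P_{1j} P_{2j}$ and $[4(n_1 + n_2 - 2)]^{-1}\theta$; hence

$$\theta = \frac{(n_1 + n_2 - 2)R^* - (n_2 - 1)A - (n_1 - 1)B}{(n_1 + n_2 - 2)R^*(1 - R^*)}. \qquad (6.66)$$

Let $\hat A$ and $\hat B$ be the estimators obtained by replacing $P_{ij}$ by $\hat p_{ij}$ in formulae (6.63). Tedious but straightforward computations show that A and B have unbiased estimators $\tilde A$ and $\tilde B$, given by (6.67) and expressed through $\hat p_{ij}$ and $\hat q_{ij} = 1 - \hat p_{ij}$, i = 1, 2, $j = 1, \ldots, K$. Ignoring the terms of order $O(1/(n_1 n_2))$, an unbiased estimator of $R^*(1 - R^*)$ is given by

$$\frac{(n_1 n_2 - n_1 - n_2 + 2)\hat R^* - n_1 n_2 (\hat R^*)^2}{(n_1 - 1)(n_2 - 1)} + \frac{\tilde A}{n_1 - 1} + \frac{\tilde B}{n_2 - 1}. \qquad (6.68)$$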

Substituting $\hat R^*$, $\tilde A$, $\tilde B$ and (6.68) into (6.66) results in a consistent estimator $\hat\theta_0$ of $\theta$. If $\hat\theta_0 < 0$ or $\hat\theta_0$ is indeterminate (since then $\hat A = \hat B = 0$), we define $\hat\theta = 0$; if $\hat\theta_0 > 1$, then $\hat\theta = 1$; otherwise, $\hat\theta = \hat\theta_0$.

We may now compute the pivotal quantity $(\hat R^* - R^*)^2/\hat V^*$, where $\hat V^*$ is obtained from expression (6.65) by substituting $\hat\theta$ for $\theta$. Denote $g = (n_1 + n_2 - 1) - (n_1 + n_2 - 2)\hat\theta$. The confidence interval is then obtained as the solution set in R* of the quadratic inequality

$$n_1 n_2 (\hat R^* - R^*)^2/[g R^*(1 - R^*)] \le \chi^2_{1-\gamma}(1),$$

where $\chi^2_{1-\gamma}(1)$, as above, is the $(1 - \gamma)$-quantile of the chi-squared distribution with one degree of freedom. Thus, the confidence limits are given explicitly by

$$\frac{2\hat R^* + C \pm \sqrt{C^2 + 4C\hat R^*(1 - \hat R^*)}}{2(C + 1)}$$

with $C = g\chi^2_{1-\gamma}(1)/(n_1 n_2)$.

Confidence intervals derived by means of optimization techniques. Gastwirth and Krieger (1991) studied probabilistic upper and lower bounds on P(X < Y) when X and Y are not independent and the probabilities $P_{ij}$, i = 1, 2, $j = 1, \ldots, K$, are available. Assuming that X and Y are bounded (and, thus, can be scaled to the [0,1] interval), they derived their bounds under various conditions on X and Y, somewhat analogously to Section 5.5. For example, if X and Y have means $\mu_1$ and $\mu_2$, respectively, with $\mu_1 \ge \mu_2$, the lower bound for P(X < Y) is zero and the upper one is $1 - \mu_1 + \mu_2$ (since $\mu_1 - \mu_2 = E(X - Y) \le P(X > Y)$ for variables on [0,1]). The above approach is somewhat related to the probability design method described in Section 5.5.
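The pivotal-quantity interval for R* described above reduces to solving a quadratic inequality, which can be sketched in Python as follows. This is a hedged illustration under stated assumptions: the function name is hypothetical, the estimate $\hat\theta$ is passed in as an argument (the unbiased estimators of A and B needed for it are not reproduced here), and the chi-squared quantile is supplied by the caller, with the default 3.841459 being the 0.95 quantile for one degree of freedom.

```python
import math


def pivotal_interval(r_hat, n1, n2, theta_hat=0.0, chi2_quantile=3.841459):
    """Solution set of n1*n2*(r_hat - r)^2 <= g * chi2 * r * (1 - r) in r."""
    # Truncate theta_hat to [0, 1], as in the definition of the estimator.
    theta_hat = min(1.0, max(0.0, theta_hat))
    g = (n1 + n2 - 1) - (n1 + n2 - 2) * theta_hat
    c = g * chi2_quantile / (n1 * n2)
    root = math.sqrt(c * c + 4.0 * c * r_hat * (1.0 - r_hat))
    lower = (2.0 * r_hat + c - root) / (2.0 * (c + 1.0))
    upper = (2.0 * r_hat + c + root) / (2.0 * (c + 1.0))
    return lower, upper


# Hypothetical inputs: estimate 0.7 from two samples of size 25.
print(pivotal_interval(0.7, 25, 25, theta_hat=0.5))
```

Unlike an interval of the form estimate plus or minus a multiple of the standard error, the limits obtained this way always stay inside [0, 1].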

6.5

Stochastic Processes Formulations of Stress-Strength Systems

The stochastic process formulations for strength-stress systems were developed by Basu and Ebrahimi (1983), Raghava Char et al. (1984), Bilikam (1985) and Aminzadeh (1999). It was Bilikam (1985) who justified the use of stochastic processes in reliability models. He writes: "The strength...is necessarily conditioned on the stress because the physical realization of strength is found only when stress is applied."

6.5.1

General Stochastic Systems

Basu and Ebrahimi (1983) and Bilikam (1985) consider a general formulation where both stress X and strength Y vary in time, i.e. X = X(t) and Y = Y(t), and the cdfs of X and Y depend on t via parameters $\theta_1(t)$ and $\theta_2(t)$, respectively. The reliability can then be assessed at any given instant of time t as R(t) = P(X(t) < Y(t)) (the probability of a successful operation at t) or over a period of time $(0, t_0)$ as

$$R_{t_0} = P\Big(\sup_{0 \le t \le t_0} [X(t) - Y(t)] < 0\Big) \qquad (6.69)$$

(the probability of absence of a failure before the time $t = t_0$).

Bilikam (1985) obtains expressions for R(t) for various distributions. For example, if X(t) and Y(t) possess extreme value distributions with the cdfs $F_X(x) = 1 - \exp[-(\exp x)/\theta_1]$ and $F_Y(y) = 1 - \exp[-(\exp y)/\theta_2]$, with the parameters $\theta_j = \theta_j(t)$, j = 1, 2, being functions of t, the reliability is then the familiar expression $R(t) = \theta_2/(\theta_1 + \theta_2)$. Bilikam (1985) then considers various time-dependent models for $\theta_j$ such as $\ln\theta_j = a_j - b_j \ln t$.
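The extreme value example can be checked numerically with a small Python sketch. It evaluates $R(t) = \theta_2(t)/(\theta_1(t) + \theta_2(t))$ under the model $\ln\theta_j = a_j - b_j \ln t$ and compares it with a Monte Carlo estimate. The function names and parameter values are hypothetical. The simulation uses the fact that, for the cdfs quoted above, $\exp X$ and $\exp Y$ are exponential with means $\theta_1$ and $\theta_2$, so comparing them is equivalent to comparing X and Y.

```python
import math
import random


def theta(t, a, b):
    """Time-dependent parameter with ln(theta) = a - b * ln(t)."""
    return math.exp(a - b * math.log(t))


def reliability(t, a1, b1, a2, b2):
    """R(t) = theta_2(t) / (theta_1(t) + theta_2(t))."""
    th1 = theta(t, a1, b1)
    th2 = theta(t, a2, b2)
    return th2 / (th1 + th2)


def simulated_reliability(t, a1, b1, a2, b2, n=20000, seed=0):
    """Monte Carlo estimate of P(X(t) < Y(t)) on the exp scale."""
    rng = random.Random(seed)
    th1 = theta(t, a1, b1)
    th2 = theta(t, a2, b2)
    hits = 0
    for _ in range(n):
        u = th1 * rng.expovariate(1.0)  # exp(X): exponential with mean th1
        v = th2 * rng.expovariate(1.0)  # exp(Y): exponential with mean th2
        hits += u < v
    return hits / n


# Hypothetical parameters: at t = 1, theta_1 = 1 and theta_2 = 3.
print(reliability(1.0, 0.0, 0.5, math.log(3.0), 0.5))
print(simulated_reliability(1.0, 0.0, 0.5, math.log(3.0), 0.5))
```

When $b_1 = b_2$ the ratio of the two parameters does not depend on t, so R(t) is constant; unequal $b_j$ make the reliability drift over time.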

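Basu and Ebrahimi (1983) studied (6.69) in the case when X(t) and Y(t) are Brownian motions or when the strength is decreasing while the stress remains fixed. Let X(t) and Y(t) be independent Brownian motions with the mean value functions $\mu_1 t$ and $\mu_2 t$ and covariance kernels $\sigma_1^2 \min(s, t)$ and $\sigma_2^2 \min(s, t)$, respectively, and let $z(0) = y(0) - x(0)$. Then

$$R_{t_0} = 1 - \int_0^{t_0} \frac{|z(0)|}{\sigma\sqrt{2\pi t^3}} \exp\left\{-\frac{(z(0) + \mu t)^2}{2\sigma^2 t}\right\} dt, \qquad (6.70)$$

where $\mu = \mu_2 - \mu_1$, $\sigma^2 = \sigma_1^2 + \sigma_2^2$,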

when x(0) < y(0) and is zero otherwise. The formula (6.70) for $R_{t_0}$ fits very well within the framework of formulas for the cases when X and Y are normally distributed. In the case when the strength is decreasing while the stress is fixed, X(t) = X, the reliability is found to be (cf. (6.69))

$$R_{t_0} = P\Big(X - \inf_{0 \le t \le t_0} Y(t) < 0\Big) = P(X < Y(t_0)) = \int F_X(x)\,dF_Y(x),$$

where $F_Y$ is the cdf of $Y(t_0)$, in accordance with the general formula (2.6) introduced in Chapter 2.

6.5.2

Markov Models for System Reliability

Raghava Char et al. (1984) consider Markov models for system reliability with discrete time. They assume that the stresses (i.e. attacks) arrive at discrete points of time 1, 2, 3, ..., and at any moment j, j = 1, 2, ..., an attack occurs with probability a, 0 < a < 1. Let $P_k$ be the probability that the unit survives the k-th attack given that it has survived the previous (k - 1) attacks, $k \ge 1$, and define $X_k$ to be the number of attacks successfully withstood among the first k encounters, provided that the item has not failed up to then. If it has already failed, we shall describe this situation by the statement $X_k = F$, where F denotes "failed". Under these conditions, $\{X_k, k \ge 1\}$ is a random walk on $\{0, 1, 2, \ldots, F\}$, with F being an absorbing state. It is then easy to obtain that $P(X_{k+1} = i + 1 \mid X_k = i) = aP_{i+1}$, $P(X_{k+1} = i \mid X_k = i) = 1 - a$, $P(X_{k+1} = F \mid X_k = i) = a(1 - P_{i+1})$ and $P(X_{k+1} = F \mid X_k = F) = 1$. Let N(a) be the time to absorption of $X_k$: $N(a) = \inf\{k : X_k = F\}$. Under the condition $\lim_{k\to\infty} \prod_{j=1}^{k} P_j = 0$, Raghava Char et al. (1984) find the characteristic function of N(a). They also show that aN(a) converges in distribution to a random variable whose characteristic function is determined by the sequence $P_k$. From this general result the authors derive the limiting distribution of aN(a) for various scenarios of system reliability. For example, if $P_k = p$, then the limiting distribution is exponential with the cdf $1 - \exp(-(1 - p)x)$, so that N(a) is approximately exponentially distributed with the mean $[(1 - p)a]^{-1}$ for small a. Raghava Char et al. (1984) also investigate the case of attacks on series and parallel systems.

6.5.3

Stochastic Time Series Models

Recently, Aminzadeh (1999) studied the prediction problem when $\underline{X} = (X_1, \ldots, X_{n_1})$ and $\underline{Y} = (Y_1, \ldots, Y_{n_2})$ represent observed values of two correlated time series $X_t$ and $Y_t$, respectively, and $X_{n_1+m_1}$ and $Y_{n_2+m_2}$ denote the values of $X_t$ and $Y_t$ at the future times $t = n_1 + m_1$ and $t = n_2 + m_2$, $m_1, m_2 = 1, 2, \ldots$. In practice one is often required to estimate $R = P(X_{n_1+m_1} < Y_{n_2+m_2})$ where $n_1 + m_1 = n_2 + m_2 = \tau$. Here R may describe the reliability at the moment $\tau$, where $X_t$ and $Y_t$ are the stress and strength at the instant t; or, when $X_t$ and $Y_t$ are prices of two stocks, R can be interpreted as the probability that $Y_t$ is doing better than $X_t$ at that particular moment $t = \tau$. Aminzadeh (1999) investigates stationary autoregressive (AR), moving average (MA) and autoregressive moving average (ARMA) models for the time series $X_t$ and $Y_t$ as well as the case of nonstationary ARMA models. For example, if $X_t$ and $Y_t$ follow an AR model of order k, then

$$X_t - \mu_1 = \sum_{j=1}^{k} \theta_{1,j}(X_{t-j} - \mu_1) + \varepsilon_{1,t},$$

$$Y_t - \mu_2 = \sum_{j=1}^{k} \theta_{2,j}(Y_{t-j} - \mu_2) + \varepsilon_{2,t},$$

where $\theta_{i,j}$, $j = 1, \ldots, k$, are autoregressive parameters, $\mu_i$ are the means of $X_t$ and $Y_t$, respectively, and $\varepsilon_{i,t}$ are white noise processes with $E(\varepsilon_{i,t}) = 0$, $\mathrm{Cov}(\varepsilon_{i,t}, \varepsilon_{i,s}) = 0$ for $t \ne s$ and $\mathrm{Var}(\varepsilon_{i,t}) = \sigma_i^2$, i = 1, 2. Since the assumption of independence is very restrictive in practice, we shall assume, following the author, that the joint probability distribution of $\varepsilon_{1,t}$ and $\varepsilon_{2,t}$ is bivariate normal with $\mathrm{Cov}(\varepsilon_{1,t+s}, \varepsilon_{2,t+r}) = \rho\sigma_1\sigma_2$ for any s and r. Denoting $E(X_{m_1+n_1} \mid \underline{X}) = \xi_1(m_1)$, $E(Y_{m_2+n_2} \mid \underline{Y}) = \xi_2(m_2)$, $h(i, 0) = 0$, $h(i, 1) = \theta_{i,1}$, $h(i, m) = \sum_{j=1}^{k} \theta_{i,j} h(i, m - j)$, i = 1, 2, and recalling that k is the order of the AR model under consideration, we obtain the familiar expression $R = \Phi(\zeta)$, where the numerator of $\zeta$ is

$$(\mu_2 - \mu_1) + \sum_{i=1}^{2} \sum_{j=1}^{k} (-1)^i\,\theta_{i,j}\,(\xi_i(m_i - j) - \mu_i)$$

and the denominator is the standard deviation of the prediction error of $Y_\tau - X_\tau$, expressed through $\sigma_1$, $\sigma_2$, $\rho$ and the weights h(i, m).

Estimation of R requires estimation of the parameters involved in the expression for $\zeta$, which is a tedious but straightforward task.
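To make the prediction problem concrete, the following Python sketch computes $R = P(X_\tau < Y_\tau)$ for the simplest case of two AR(1) series with known parameters. It is an illustration under explicitly stated assumptions rather than a reproduction of Aminzadeh's formulas: both series are observed up to the same time and predicted m steps ahead, the innovations are jointly normal, and, as a simplifying assumption of this sketch, they are correlated only contemporaneously with correlation rho. All names are hypothetical.

```python
import math


def phi(x):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def ar1_forecast(last, mu, theta, sigma, m):
    """m-step-ahead forecast mean, error variance and weights for AR(1)."""
    mean = mu + theta ** m * (last - mu)
    weights = [theta ** l for l in range(m)]  # weights of the last m shocks
    var = sigma ** 2 * sum(w * w for w in weights)
    return mean, var, weights


def ar1_reliability(x_last, y_last, mu1, mu2, th1, th2, s1, s2, rho, m):
    """P(X < Y) at the forecast horizon, given the last observed values."""
    mx, vx, wx = ar1_forecast(x_last, mu1, th1, s1, m)
    my, vy, wy = ar1_forecast(y_last, mu2, th2, s2, m)
    # Covariance of the two forecast errors under contemporaneous
    # correlation only (an assumption of this sketch).
    cov = rho * s1 * s2 * sum(a * b for a, b in zip(wx, wy))
    return phi((my - mx) / math.sqrt(vx + vy - 2.0 * cov))


# Hypothetical values: stress last seen at 1.2, strength last seen at 2.5.
print(ar1_reliability(1.2, 2.5, 1.0, 2.0, 0.6, 0.4, 0.5, 0.7, 0.3, m=3))
```

In practice the parameters would be replaced by estimates from the observed series, which is the tedious part referred to in the text; the sketch also does not guard against a zero prediction-error variance.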

6.6

Exercises

6.1. (Myhre and Saunders (1968b)). Using formulas (6.17) and (6.18), write an explicit expression for the asymptotic $(1 - \gamma)$ confidence intervals for $R_{1,k} = P(X < \max(Y_1, \ldots, Y_k))$ and $R_{k,k} = P(X < \min(Y_1, \ldots, Y_k))$.

6.2. (Chandra and Owen (1975)). Derive the MLE of P(X ... I(x > $\lambda$) ($\theta_i > 0$, $\lambda > 0$ is the common parameter). Use formula (6.20) directly or produce estimators by using monotone transformations $Y_j = \ln(X_j/\lambda)$.

6.5. (Ivshin (1998)). Derive the MLE and the UMVUE of $R_3 = P(X < Y < Z)$ when X, Y and Z are independent exponentially distributed random variables with the pdfs $\exp(-(x - \theta_j))\,I(x > \theta_j)$, $\theta_j$ unknown, j = 1, 2, 3.

6.6. Using formula (6.20), derive the MLE of P(X < Y < Z) in the case when X, Y and Z have independent gamma distributions with a common integer shape parameter m.

6.7. Let X, Y and Z be independent binomial random variables with parameters $(m_j, p_j)$, j = 1, 2, 3, respectively. Derive the MLE and the UMVUE of P(X

and B given by (6.67) and (6.67) is an unbiased estimator of V*. Construct a pivotal quantity (R* - R*)/^n1n2{V - (m + n2- l)(R*)2) ~ iV(0,l) and derive confidence bounds for R* similarly to (5.50).

Chapter 7

Applications and Examples

7.1

Applicability of the Stress-Strength Model

In previous chapters we have discussed in some detail the following two main topics: 1) Expressions for the probability P(X < Y) and its generalizations for various distributions of X and Y; 2) Expressions and properties of various estimators of P(X < Y) and its generalizations based on a random sample as well as other sampling procedures. We have seen that these topics often involve challenging calculations and are of great usefulness when viewed from probabilistic and statistical aspects. We have also attempted to describe in the earlier chapters the genesis and motivation for probabilities of the type P(X < Y) and their connection with the classical non-parametric tests of equality of two distribution functions based on the extensively used and popular Wilcoxon-Mann-Whitney statistic.

It would seem that Birnbaum (1956) was perhaps one of the first researchers who dealt with the model P(X < Y) in the stress-strength context. It is worth quoting Birnbaum's "Illustration" as presented in his pioneering paper, which may serve as the road-map of the research in the last 45 years. His ideas are in resonance with the observations by Bilikam (1985) which appeared in the engineering literature some 30 years later:

"An illustration. If structural components of a mechanism are mass produced, the strength at failure Y of each single component (equals stress at which this component will fail) may be considered a random variable. The component is installed in an assembly and exposed to a stress which reaches its maximum value X, again a random variable. If Y < X, then the component will fail in use. In this situation, p = P(Y < X) is the probability that failure will occur because, due to a chance, a component with relatively low strength was paired off with a high stress. It clearly is of interest to estimate this probability, preferably from samples of X and Y alone, since installing the components in complete assemblies and trying them out under conditions of actual use may involve nearly prohibitive expense and effort. It also will be important to be able to estimate p without knowing the distribution of the strengths of the components, or of the stresses, or both." (Birnbaum (1956), p. 14).

R.A. Johnson (1988) in his survey paper in Handbook of Statistics, Vol. 7, interprets R = P(X < Y) somewhat more liberally - as the probability that a unit in an operating environment performs satisfactorily when - as usual - X is the stress placed on the unit; specifically, X is taken to be the maximum value attained by a "critical stress". He points out an early application by Lloyd and Lipow (1962) where X represents the maximum chamber pressure generated by the ignition of a solid propellant in a rocket engine. (We shall return to rocket engine applications in the sequel on several occasions). In a subsequent paper by Kececioglu (1972), X represents the torsion stress, which is the most critical type of stress for a rotating steel shaft on a (by now obsolete) computer. The message of these examples is that, in practice, the stress variable X is usually difficult to model accurately due to lack of sufficient data, and therefore the various models of P(X < Y) discussed in preceding chapters, where X is assumed to have many different distributions, seem to have more than a passing theoretical significance.
With regard to the strength variable, Johnson (1988), like many other Bayesians, advocates the elicitation of expert opinion. The most prominent examples of applications of the P(X < Y) relationship in engineering and medicine as presented in Johnson's (1988) survey article are:

a) Rocket engines. Here X usually represents the maximal chamber pressure generated by ignition of a solid propellant while Y is assumed to be the strength of the rocket chamber, so that P(X < Y) is simply the probability of successful firing of an engine. We shall indicate below further instances when this model is used.

b) Two-treatment comparisons. This is an old technique motivated by the close relation between Wilcoxon-type tests and the P(X < Y) models. Typically drug I is assigned to one group of subjects and drug II to the other. If X and Y represent the remission times with these two drugs, say, $X_1, X_2, \ldots, X_{n_1}$ and $Y_1, Y_2, \ldots, Y_{n_2}$, respectively, the main interest of the researcher is to estimate P(X < Y). Here the terminology "stress-strength" may not be appropriate, but the net result is evidently the same. Indeed the very first application of the P(X < Y) relationship - as explained in the Introduction and Chapter 1 - originated from the classical Wilcoxon test, already available in Wilcoxon's (1945) groundbreaking paper, and it deals with a two-treatment comparison. Wilcoxon provides results of the fly spray tests on two preparations in terms of percentage of mortality. He compares the percent killed in sample A versus the percent killed in sample B (each involving 8 observations), concluding by means of this test that sample B provides a lower percent; thus preparation B should be considered less effective. Another example involving paired comparisons, motivated by R.A. Fisher's experimental data on the differences in height between cross- and self-fertilized corn plants, determines the significance of these differences.

c) Response models. A certain unit - be it a receptor in a human eye or ear or any other organ (including sexual) - operates only if it is stimulated by a source of random magnitude Y and the stimulus exceeds a lower threshold specific for that unit. In this case P(unit functions) is equivalent to the familiar P(X < Y) - a stress-strength relationship.

d) Earthquake resistance. R. Mensing (1984) in his personal communication to R.A. Johnson (1988) provides the following example which captures many aspects of the problem at hand.
In a study of the risk of a nuclear power plant (or some other tall or spacious building) it is necessary to assess the ability of the steam or another generator to withstand the stresses due to the ground motion as a result of an earthquake. (The same applies to the ability of a very tall building to withstand the impact of a terrorist missile attack which - unfortunately - is now a very concrete rather than a hypothetical example). Since at the time (the year 1984) when this example was provided there were no data available concerning the strength of such generators and methods of their estimation, the author solicited the opinion of 5 experts who provided estimates of the 10-th, 50-th and 90-th percentiles of the relevant strength variable expressed as a peak acceleration in ft/sec². The values of the 90-th percentile provided by the experts range from 103 ft/sec² to 48 ft/sec². (Remarkably, the assessment of the 10-th percentile was proportionally more divergent - from 81 ft/sec² to 32 ft/sec²). Based on these data, R. Mensing advocates modeling the strength by the time-honored log-normal distribution and estimates its parameters by a weighted least squares procedure. He also utilizes the same distribution to model the distribution of the stress at the base of the steam generator, with the mean value about 1/2 of that of the log-normal distribution representing strength. This leads to a rather optimistic and encouraging estimator of P(ln X < ln Y) ≈ Φ(3.52) = 0.99978. R.A. Johnson (1988) also points out that this type of model - perhaps using the multivariate normal distribution - can be extended to several components characterizing strength and the common stress due to an earthquake, which can also be visualized as a multivariate normal variable, leading to estimation of something like P(X < Y), X and Y both being multivariate normal variables.

It was evident from the discussions above that the concept of "stress-strength relations" is quite basic and natural, reflecting sound relationships among various real-world phenomena, and one would expect to have an avalanche of applications discussed in the literature. To our surprise and some disappointment - after an extensive literature search - we have located only some 25 papers (out of an overall bibliography of more than 300 items) in which explicit applications are provided, and often only in a sketchy form. It should in all honesty be pointed out that there are quite likely numerous reports and papers of a semi-classified and classified nature on this subject not available to the general audience (see e.g. Section 7.2.3).
We think that a reason for such a discrepancy is that the original direction of the research on the stress-strength problem was carried out in the USA, USSR, Canada and India by mainly theoretically oriented statisticians whose interest in applications may have been somewhat marginal. A vast number of sources in our possession usually pay only lip service to the paramount importance of relations of the type P(X < Y) in reliability theory and then immediately proceed to the "main business" of deriving some ingenious expressions for P(X < Y) and its estimators under various assumptions on X and Y. Only relatively few authors (that are known to us) have taken advantage of the enormous applications-oriented potentials hidden in this type of probabilistic-statistical problems.


However, even the small collection of papers dealing with practical applications - which we were able to uncover - serves as a strong indication of the versatility of the stress-strength relationship and - as we shall see below - of its relevance in various sciences (not necessarily limited to engineering and medicine).

7.2

Engineering and Military Applications of the Stress-Strength Model

7.2.1

The Rocket Motor Case Example

The strength of the rocket motor case versus the operating pressure - for reasons which are not clear to us - attracted substantial attention of statisticians working on stress-strength models. The four authors of the paper by Guttman et al. (1988) are deliberately vague about its specific origin and curtly assert that the application to be described below was "brought to our attention by scientists". We can only naively speculate that it has perhaps been related to the exploration of outer space technology which was quite popular in the eighties of the past century (especially in the USA and USSR). Another - more sinister - possibility is, of course, that we are dealing with military applications in the period when the Star Wars program was launched. This may perhaps explain the incomplete data provided by Guttman et al. (1988), who are hiding behind the confidentiality cloak. In the paper under review, Y denotes the strength of the rocket motor case and X the operating pressure (which is the stress that the motor must withstand). The reliability of the motor case, denoted by R(z), depends on the ambient temperature Z, so the authors assume a model involving explanatory variables and normally distributed error terms for the stress and the strength described in detail in Section 6.3.1. Specifically, in this example, model (6.26) is used with z = (1, Z) and independent errors $\varepsilon_i \sim N(0, \sigma_i^2)$, i = 1, 2. Table 7.1 presents a portion of the data in Table 1 provided by Guttman et al. (1988). The pressure values are rounded to the third decimal place. (In the original paper the authors cite 51 values of (X, Z) and 17 values of Y.)

Table 7.1 A portion of the data presented by Guttman et al. (1988)

Z, temperature (°C)    X, operating pressure (psi)    Y, chamber burst strength (psi)
-39                    5.89                           15.30
-39                    5.85                           16.75
-39                    6.03                           16.00
-21                    7.32                           17.50
59                     7.74
59                     7.57
59                     7.91

The authors study this example under the assumptions of equal and non-equal variances. If variances are equal ( defined in (6.37), (6.41) and (6.42), respectively. Namely,

for the rocket motor case example, 0.999999 using the summary statistics $\bar X$, $\bar Y$, $s_1^2$ and $s_2^2$ and conclude that the observed data provide strong evidence in favor of $H_1$, the p-value being equal to 0.0000042. Nandi and Aich (1996b) return to the Weerahandi and Johnson (1992) data, slightly modifying the assumptions on the means $\mu_1$ and $\mu_2$, which somewhat simplifies the computations and leads to a shorter 95% HPD credible interval than the corresponding frequentist interval reported by Weerahandi and Johnson (1992).

7.2.2

Comparison of Two Treatments in Engineering Setting

Comparing strength of two types of steel. In the preceding section we have described the rocket motor case application. In the same paper, Guttman et al. (1988) analyze the data presented in Duncan's (1986) classical text dealing with the results of measuring shear strength for spot welds for two different gauges of steel (a typical two-treatment problem). An explanatory variable Z is naturally the weld diameter (measured in units of 0.001 inches). Here X and Y represent the strength of two types of steel, the first corresponding to the .040" gauge and the second to the .064" gauge. Evidently we are concerned next with estimating R = P(X < Y | Z = z). The authors claim - referring to Duncan's work of 1974 (not cited in their paper) - that the strengths depend linearly on weld diameter (presumably as a first approximation). They utilize the model (6.26) with $\varepsilon_i \sim N(0, \sigma_i^2)$ and $z_i = (1, Z)$, so that each regression has two coefficients, i = 1, 2, and also initially assume that the variances $\sigma_i^2$ are equal (which turns out to be an incorrect assumption). The data are presented in Table 7.2.

Table 7.2 Data on Shear Strengths of Two Gauges of Steel

.040" gauge          .064" gauge
X       Z            Y       Z
350     140          680     190
380     155          800     200
385     160          780     209
450     165          885     215
465     175          975     215
485     165          1,025   215
535     195          1,100   230
555     185          1,030   250
590     195          1,175   265
605     210          1,300   250
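The data of Table 7.2 are small enough to be analyzed directly. The Python sketch below fits the two straight lines by least squares, computes the residual variance estimates and evaluates a plug-in estimate of R = P(X < Y | Z = z) under an assumed value of the variance ratio $\sigma_2^2/\sigma_1^2$. It is a hedged illustration: the function names are hypothetical and the plug-in form used here is an assumption of this sketch, chosen because it reproduces the values reported in the text to about three decimals; it is not claimed to be identical to formulas (6.28) and (6.32).

```python
import math

# Table 7.2: shear strength and weld diameter for the two gauges.
X = [350, 380, 385, 450, 465, 485, 535, 555, 590, 605]
ZX = [140, 155, 160, 165, 175, 165, 195, 185, 195, 210]
Y = [680, 800, 780, 885, 975, 1025, 1100, 1030, 1175, 1300]
ZY = [190, 200, 209, 215, 215, 215, 230, 250, 265, 250]


def phi(x):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def fit_line(z, v):
    """Least squares fit v = intercept + slope * z; also returns the SSE."""
    n = len(z)
    zbar = sum(z) / n
    vbar = sum(v) / n
    szz = sum((a - zbar) ** 2 for a in z)
    szv = sum((a - zbar) * (b - vbar) for a, b in zip(z, v))
    slope = szv / szz
    intercept = vbar - slope * zbar
    sse = sum((b - intercept - slope * a) ** 2 for a, b in zip(z, v))
    return intercept, slope, sse


def plug_in_reliability(z0, ratio=1.0):
    """Plug-in estimate of P(X < Y | Z = z0) for an assumed variance ratio."""
    b10, b11, sse1 = fit_line(ZX, X)
    b20, b21, sse2 = fit_line(ZY, Y)
    df = len(X) + len(Y) - 4
    var1 = (sse1 + sse2 / ratio) / df  # estimate of sigma_1^2
    diff = (b20 + b21 * z0) - (b10 + b11 * z0)
    return phi(diff / math.sqrt((1.0 + ratio) * var1))


print(fit_line(ZX, X)[:2], fit_line(ZY, Y)[:2])
print(fit_line(ZX, X)[2] / 8, fit_line(ZY, Y)[2] / 8)
print(plug_in_reliability(200), plug_in_reliability(200, ratio=10.0))
```

Running the sketch gives fitted coefficients close to (-216.33, 3.99) and (-569.47, 6.90), residual variances close to 876.20 and 9,980.16, and estimates of R at Z = 200 close to 0.986 and 0.9878 for variance ratios 1 and 10, in agreement with the figures quoted for this example.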

For these data ($n_1 = n_2 = 10$) we estimate

$$\hat\beta_1' = (-216.33, 3.99), \qquad \hat\beta_2' = (-569.47, 6.90)$$

and for Z = 200 we evidently have in this set-up the two-dimensional row vectors $z_1 = z_2 = (1, 200)$, which (according to formulas (6.28) and (6.32)) yields $\hat R = \Phi(2.193) = 0.986$ (namely, for Z = 200 the gauge Y is very likely to possess a stronger shear than the gauge X). Again this conclusion is evident from the raw data (small sample size notwithstanding), especially if we note that the diameters for gauge Y are generally larger than those for X. As was alluded to above, the trouble with this example is that the estimators of the variances are $S_1^2 = 876.20$ and $S_2^2 = 9,980.16$, respectively, which seems to invalidate the assumption $\sigma_1^2 = \sigma_2^2$. The authors revise their calculations by assuming that $\sigma_2^2/\sigma_1^2 = 10$. This leads to a similar conclusion $\hat R = \Phi(2.250) = 0.9878$ for Z = 200 and the approximate lower bound Φ(0.8334) = 0.7977. When no assumptions are imposed on the ratio of the residual variances, the approximate lower confidence bound for R for Z = 200 (using the same model) is found to be Φ(0.6060) = 0.7277, which is lower than the other two bounds cited above (when the assumptions $\sigma_2^2/\sigma_1^2 = 1$ and $\sigma_2^2/\sigma_1^2 = 10$, respectively, are utilized).

The authors emphasize that their procedure used for analyzing Duncan's (1986) data assumes normality and a linear dependence on the explanatory variables. These assumptions could and should be checked using, for example, the residual analysis techniques prevalent in the statistical literature. The value of R = P(X < Y) depends heavily on the tails of the distributions involved, thus the confidence bound will possibly be sensitive to departures from normality. The authors conclude with a recommendation to extend their theory (involving explanatory variables) to non-normal distributions and check its applicability by means of real world data. To the best of our knowledge this has not yet been carried out.

Carbon fiber strength example. In two more recent papers Surles and Padgett (1998), (2000) deal with inference on P(X < Y) in the Burr type X model with the cdfs $F(x|\theta) = (1 - e^{-x^2})^\theta$, $x, \theta > 0$ (compare with (3.28)) or $F(x|\theta, \sigma) = (1 - e^{-(x/\sigma)^2})^\theta$, $x, \theta, \sigma > 0$ (the scaled version). In both papers the authors provide an application to carbon fiber strength data collected by Bader and Priest (1982). Tensile strength data (in GPa) for single carbon fibers and "impregnated 1000-carbon fiber tows" were obtained under tension at gauge lengths of 1, 10, 20 and 50mm (single fibers) and the lengths of 20, 50, 150 and 300mm (impregnated tows of 1000 fibers). Earlier Durham and Padgett (1990) fitted a Weibull distribution to this failure data with unsatisfactory result.
The Burr type X model seems to be more appropriate and the two-parameter model - not surprisingly - provides even a better fit. For the inference on R= P(X < Y) where X represents the strength of 20mm

Applications and Examples

fiber and Y the strength of a 50mm fiber (the terminology "stress-strength" may not be quite applicable in this example), Surles and Padgett (1998) calculate the MLE of R to be 0.57284 with sample sizes $n_1 = 69$ (20mm single fibers) and $n_2 = 65$ (50mm single fibers). Their conclusion (again not very surprisingly!) is that a longer fiber is weaker than a shorter one. They use some formal tests to reach this conclusion. A more ambitious analysis utilizing the two-parameter Burr type X distribution (with different scale parameters) resulted in the MLE $\hat{R} = 0.616592$, and various formal tests once more confirm (with a higher confidence) the conclusion reached when using the one-parameter Burr type X model.

Comparison of motorettes insulation. Gupta et al. (1999) concentrate on estimation of P(X < Y) in the normal case with a common coefficient of variation $\gamma = \sigma/\mu$ and exemplify their theoretical results by means of data taken from W. Nelson's (1990) well-known text. The data represent the hours to failure of 20 motorettes with a new class-H insulation run at 240°C and 220°C. Nelson (1990) claims that the lognormal distribution fits the data adequately for both temperature regimes. The variables X and Y are here the logarithms of the failure times and we have 10 observations for each temperature. The null hypothesis $H_0 : \sigma_1/\mu_1 = \sigma_2/\mu_2 = \gamma$ was not rejected using a score test of Gupta and Ma (1996) since the p-value was close to 1. The authors present a confidence interval on R = P(X < Y) taking into consideration the equality of the coefficients of variation and compare it with the confidence interval on R (for the same data) proposed by Reiser and Guttman (1986) without the assumption of a common coefficient of variation. The interval based on the Gupta et al. (1999) method for the data described above is of length 0.101, while the Reiser and Guttman procedure yields a somewhat wider interval of length 0.124. Independent simulation results based on 10,000 observations with the mean values $\mu_1(\mu_2) = 0.5(0.8)$ and various $\gamma \in [0.2, 2]$, for sample sizes $n_1(n_2) = 30(32)$ with $\alpha = 0.05$ and $\alpha = 0.10$, and for $n_1(n_2) = 20(25)$ using the same $\alpha$'s, show that taking into account the equality of the coefficients of variation may reduce the length of the confidence interval two-fold. Robustness of the Gupta et al. (1999) methodology has not been addressed as yet. In this application, as in many others, we do not have, strictly speaking, a stress-strength comparison and the problem is closer to

an equivalent problem of testing equality of two (log)normal distributions.

Reliability of cables and piping. In a rather obscure Finnish technical report (dated 1977) T. Mankamo investigates the common cause failures (CCF) problem, emphasizing the case of dependent failures. The failure condition of a structural item is determined as follows. Let N identical items (components) be loaded by a common stress. The common stress is treated as a random variable with the pdf $f_X(x)$. Each item has an identical resistance to stress, which is also treated as a random variable with the pdf $f_Y(x)$ and the cdf $F_Y(x)$. Then the failure condition is expressed by the familiar X > Y, i.e. the stress X exceeds the structural resistance, and the probability of failure P(X > Y) is given by the expression $\int_{-\infty}^{\infty} F_Y(x) f_X(x)\,dx$ (compare with (2.6)). The author deals with the normal and lognormal cases and the well known model of failures out of N, apparently being unaware of the voluminous literature existing in this area by 1977. (This is by no means the only example of the lack of coordination between researchers in various countries.) The author refers in a footnote to a very similar study conducted by A.D.S. Carter as presented at the NCRS symposium in Bradford, UK, in February 1977. He also mentions the following potential applications: 1. reliability of pre-stressing cables of a prestressed concrete pressure vessel during an overpressure accident; 2. multiple breaks of piston casing in a BWR control rod insertion mechanism under scram conditions; 3. integrity of standby injection piping under pressure build-up when initiating the system (parallel loops may be loaded by a common counter pressure). This case is discussed in some detail.

7.2.3 Military Applications

Military applications are another obvious and challenging area in which numerous scenarios fit very well into the framework where the reliability should appropriately be defined as the probability that system "strength" exceeds in-use "stress". There are no doubt numerous classified (confidential, FYEO, etc.) military-oriented papers in English (and, we dare say, also in Russian and other languages) in which the stress-strength relations are utilized and which are unavailable to us. However, one can taste the flavor of these

investigations from the paper by M. A. Johnstone (of the US Military Academy) published in a 1983 issue of the Journal of the Washington Academy of Sciences. After developing an appropriate theory based on the Bayesian approach, the author does not mince words and presents an example in which the reliability of an anti-tank sabot round fired against a Soviet T-62 tank is defined as the "probability that a given round will penetrate its target". The ranges at which the round will be fired in battle represent the stress distribution (assumed to be normal with $\mu = 1600$ meters and $\sigma = 100$ meters). The strength distribution represents the distribution of ranges at which a given round just penetrates the target. The reliability is defined in the natural manner: the probability that the strength exceeds the in-use stress. The author utilizes quantal response data, obtained by testing a sample of identical test specimens at a number of stimulus levels and observing whether the response occurs, namely whether the applied stimulus exceeds the critical level of stimulus associated with the test specimen. In the stress-strength context a response is observed when a specimen fails, i.e. strength is less than stress. For the problem at hand, Johnstone (1983) uses a particular test strategy for selecting the stimulus, or stress, level(s) which are applied to the test specimens: the so-called Churchman two-stimuli designs are used to generate the data. These designs involve testing two samples of test specimens: one of size $n_1$ at stress level $Y_1$ and the other of size $n_2$ at level $Y_2$. Let the observed numbers of failures be $m_1$ and $m_2$, respectively. Stress levels $Y_1$ and $Y_2$ are chosen so as to satisfy the inequalities $0 < m_1/n_1 < m_2/n_2 < 1$.

Observe that in our offensive against a hypothetical (and by now non-existent) Soviet T-62 tank, to complete the mission we must fully penetrate the armor. For each round there exists a critical range at which one will not be able to complete this task. The population of rounds is considered by Johnstone (1983) to have a strength distribution of the corresponding critical ranges. In his example, level $Y_1$ corresponds to a distance of 50 meters from an armor plate similar to the armor of the tank we wish to destroy; $n_1$ rounds are fired at this distance and the number of non-penetrations $m_1$ is recorded. If the proportion of non-penetrations exceeds 50%, the rounds are then fired from a distance twice as close to the plate (level $Y_2$) and the number of non-penetrations $m_2$ is recorded. If, however, the proportion of non-penetrations at level $Y_1$ is less than 50%, the rounds are then

fired from a distance "one half further" from the plate (alternative level $Y_2$) and, as above, the corresponding $n_2$ and $m_2$ are recorded. The goal is for the two probabilities of non-penetration at levels $Y_1$ and $Y_2$ to differ by at least 20%. These data are then used to estimate the parameters of the strength distribution of the critical ranges for the sabot round. The author dismisses the classical approach to estimation of parameters since (in his opinion) it does not allow us to measure the uncertainty of the reliability which he defines, as usual, as

$$R = P(X > Y) = P(X - Y > 0) = P(Z > 0),$$

where Z = X - Y. He calculates R using the expression

$$R = \int_0^{\infty} n(z \mid \mu_S - \mu_E, \sigma_S^2 + \sigma_E^2)\,dz,$$

where $n(z \mid \mu, \sigma^2)$ is the pdf of the normal distribution with mean $\mu$ and variance $\sigma^2$ and, "for the purpose of this paper", $\mu_E$ and $\sigma_E$ are assumed to be given values. The Churchman type data described above yield observed values of the binomial random variable $b(m_1 \mid n_1, p_1)$, where $p_1$ is estimated by

$$\int_{-\infty}^{Y_1} n(x \mid \mu_S, \sigma_S^2)\,dx = m_1/n_1 \qquad (7.1)$$

(at stress level $Y_1$). Analogously, at stress level $Y_2$ we estimate $p_2$, which results in an equation similar to (7.1). Solving the two equations simultaneously, we obtain the estimators $\hat{\mu}_S$ and $\hat{\sigma}_S^2$, allowing us to estimate R by

$$\hat{R} = \int_0^{\infty} n(z \mid \hat{\mu}_S - \mu_E, \hat{\sigma}_S^2 + \sigma_E^2)\,dz.$$
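As a rough illustration of this plug-in calculation: when the strength distribution is normal, the two equations of type (7.1) can be solved in closed form, since writing $z_i = \Phi^{-1}(m_i/n_i)$ gives $Y_i = \mu_S + \sigma_S z_i$ for i = 1, 2. The sketch below is an example under these assumptions only, not Johnstone's (1983) own procedure. The function names and the test counts are hypothetical; only the stress parameters (1600 m and 100 m) are taken from the text.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: cdf is Phi, inv_cdf is Phi^{-1}


def fit_strength(y1, m1, n1, y2, m2, n2):
    """Solve the two equations of type (7.1) for (mu_S, sigma_S).

    Assumes a normal strength distribution, so that the failure
    probability at stress level y is Phi((y - mu_S) / sigma_S).
    """
    p1, p2 = m1 / n1, m2 / n2
    if not 0 < p1 < p2 < 1:
        raise ValueError("need 0 < m1/n1 < m2/n2 < 1")
    if not y1 < y2:
        raise ValueError("need y1 < y2 so that sigma_S is positive")
    z1, z2 = STD_NORMAL.inv_cdf(p1), STD_NORMAL.inv_cdf(p2)
    sigma_s = (y2 - y1) / (z2 - z1)
    mu_s = y1 - sigma_s * z1
    return mu_s, sigma_s


def reliability(mu_s, sigma_s, mu_e, sigma_e):
    """R = P(Z > 0) for Z ~ N(mu_S - mu_E, sigma_S^2 + sigma_E^2)."""
    return STD_NORMAL.cdf((mu_s - mu_e) / sqrt(sigma_s**2 + sigma_e**2))


# Hypothetical test results (not from the paper): 2 failures out of 20 at
# stress level 1500 and 12 failures out of 20 at stress level 2000.
mu_s, sigma_s = fit_strength(1500.0, 2, 20, 2000.0, 12, 20)
r_hat = reliability(mu_s, sigma_s, mu_e=1600.0, sigma_e=100.0)
print(f"mu_S = {mu_s:.1f}, sigma_S = {sigma_s:.1f}, R = {r_hat:.4f}")
```

The closed form works only because exactly two stress levels give two equations in two unknowns; with more levels one would presumably need a likelihood-based (probit-type) fit instead.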

Johnstone (1983) then proceeds in detail to estimate R using a Bayesian approach with uniform priors on $p_1$ and $p_2$, the unknown probabilities of failure at $Y_i$, i = 1, 2. From the observations the joint posterior distribution for $p_1$ and $p_2$ is developed, and the conditional marginal posterior distributions of $p_1$ and $p_2$ (given $p_1 < p_2$) are used to determine the posterior distribution for the reliability. This approach is straightforward for our readers who have absorbed the theory presented in Section 2.3, but the application is a daring one. The author suggests applying this approach to data originating from other experimental designs but we have not been able to locate

any later literature citations. He also recommends developing a methodology to update reliability parameters via the actual reliability results of fielded systems.

7.3 Applications in Medicine and Psychology

7.3.1 Applications Based on Numerical Data

Turning now to medical-pharmaceutical applications (more precisely, comparison of the efficiencies of two or more drugs), out of the multitude of examples provided in recent years in numerous medically oriented statistical journals (whose number mushrooms almost monthly), we shall analyze two papers which use the model R = P(A'X > B'Y) in the multivariate normal case. The point estimation procedure for P(A'X > B'Y) is discussed in Section 3.5. Here X and Y are random vectors (not necessarily independent) of dimensions $k_1$ and $k_2$, respectively, possessing a multivariate normal distribution, and A and B are two known vectors. Gupta and Gupta (1990) pointed out that the problem arises in a system to which energy is supplied by $k_1$ sources and is consumed via $k_2$ sources. Their example, however, deals with the well-known data of Morrison (1976) in his popular text on multivariate analysis. The data depict changes in the level of three biochemical components found in the brains of 24 mice of the same strain randomly divided into two groups, with the second group receiving periodic administration of a certain drug. Both samples received the same diet and care (although two mice in the first (control) group died of natural causes). It would seem that the measurements on the two groups should be considered independent (as pointed out in the follow-up paper by Reiser and Faraggi (1994)), while Gupta and Gupta (1990) treat them as dependent. Both papers assume multivariate normality with $A = (1/k_1, \ldots, 1/k_1)'$ and $B = (1/k_2, \ldots, 1/k_2)'$, essentially estimating $R = P\left(\sum_i X^{(i)} < \sum_i Y^{(i)}\right)$, where $Y^{(i)}$, i = 1, 2, 3, are the amounts of the three biochemical components in micrograms per gram of the brain tissue of the mice which did not receive the drug and, similarly, $X^{(i)}$, i = 1, 2, 3, are the corresponding amounts for the drugged mice. The Gupta and Gupta (1990) estimates of R are 0.7324 (the MLE) and 0.7171 (the UMVUE). Based on simulation results (unfortunately details are not provided) the authors conclude that these values are "approximately" two standard deviations above 0.5 and thus conclude that the drug has an effect on the level of biochemical components in the

brain. The same conclusion is reached by Morrison (1976) by using standard multivariate tests, and further analysis may be necessary to convince practitioners that the stress-strength method is advantageous for these applications. Reiser and Faraggi (1994), in a follow-up paper, challenge the conclusions of Gupta and Gupta (1990) by pointing out that the 95% lower confidence bounds are 0.4736 (using the "exact" method) and 0.4782 (using an approximation of non-central t-distributions by means of a standard normal cdf). Evidently the point estimator of R, the range of which is 0 < R < 1, does not tend to normality (unless the true value is in the vicinity of 0.5) and Gupta and Gupta's final argument may perhaps be misleading. Indeed, Reiser and Faraggi (1994) obtain an approximate lower confidence bound of 0.32 on $R = P\left(\sum_i X^{(i)} < \sum_i Y^{(i)}\right)$ for Morrison's data assuming independence and conclude that the assertion that the true value of R > 0.5 is unwarranted (admitting that the sample sizes are rather small). Perhaps after all the stress-strength approach does provide some additional insight!

An earlier investigation along these lines is due to Ury and Wiggins (1979), who use the P(X < Y) approach to compare lung function tests of smokers and nonsmokers based on the ratio of forced expiration volume in one second to forced vital capacity. Their tool is based on the upper bound for the variance of an estimator $\hat{R}$ of R = P(X < Y) when the distributions of X and Y are unknown but assumed to be continuous, symmetric and differing at most by a shift parameter. The theory was discussed in Sections 5.2 and 5.3. Ury and Wiggins (1979) provide an upper bound on the variance of $\hat{R}$ of the form

$$\frac{(n_1 + n_2)^2\,[17(n_1 + n_2)^2 - 40(n_1 + n_2) + 24]}{\cdots\,(\cdots\, n_2 - 1)^3},$$

where $n_1$ and $n_2$ are the sample sizes. The authors do not provide the source of their data and present only the number of cases (292) in which the smoker ratios ($X_i$) exceed the nonsmoker ratios ($Y_i$) out of 400 comparisons. Calculations yield that the 90% confidence interval on the value of P(X < Y) is (0, 0.57) and the 75% interval is (0.08, 0.460), ignoring ties.

Akman et al. (1998) return us to data of the Morrison (1976) type utilized by Gupta and Gupta (1990) and Reiser and Faraggi (1994), comparing a control group of animals with five groups of animals (guinea pigs in their case) injected with different doses of tubercle bacilli. This is a real

world data set as reported by T. Bjerkdal (1960). The control group consists of 107 animals while the five injected groups contain 72 guinea pigs each. The data provide the survival of the animals after a 2-year period and the dosages were expressed as the logarithm of the number of bacillary units in 0.5 ml of "challenge" solution. The regimen (logarithm of the number of bacillary units) was restricted to 4.3 and 6.6, yielding two data sets. Here the P(X < Y) model was applied when X and Y are assumed to have a mixed inverse Gaussian (MIG) distribution of the form

$$f_\theta(x) = (1 - \theta) f(x) + \theta g(x), \quad 0 \le \theta \le 1,$$

where f(x) is the pdf of the inverse Gaussian distribution and

$$g(x) = \mu^{-1} x f(x), \qquad \mu = \int_{-\infty}^{\infty} x f(x)\,dx$$

(the so-called length biased inverse Gaussian (LBIG) pdf). The authors tested the appropriateness of their model by means of the Kolmogorov-Smirnov test, which did not reject the MIG fit. The degree of applicability of the MIG distribution to the guinea pigs data is, however, not clear. For both data sets the parameters $\mu$, $\lambda$ and $\theta$ were estimated using MLE (the authors do not provide any information about the estimates of the parameter $\nu$). The value of R = P(X < Y) was estimated using bootstrap and jackknife methodology (described in Section 4.5), resulting in $\hat{R}_B = 0.7407$ and $\hat{R}_J = 0.7402$. The authors also construct a standard univariate kernel density estimate of the density of T = X - Y using the normal kernel (and another estimate based on all possible differences $T_{ij} = X_i - Y_j$). The bandwidths were chosen to be h = 32.72 and h = 27.00, respectively. The results are of interest due to their robustness, yielding the estimates of R of 0.7371, 0.7279 and 0.7078 for the mixture of IG, the LBIG and the original IG model, respectively.

7.3.2 Applications Based on Categorized Data

All the data utilized in the previous applications can be characterized as continuous data using mostly parametric approach (with a heavy emphasis

on the normality assumptions). We were estimating R = P(X < Y) given samples of sizes $n_1$ and $n_2$ from X and Y, respectively. However, in medical and pharmaceutical applications as well as in psychology this type of data is rarely available or, in some cases, does not make much sense. In a vast majority of cases, the data is not assigned to any particular distribution or family and is treated by the nonparametric techniques described in Chapter 5 and Section 6.4.

Applications in psychology. Simonoff et al. (1986) focus their attention on data provided as a two-way contingency table of frequencies $\{n_{ij}\}$, i = 1, 2, j = 1, ..., K. The rows correspond to the X and Y variables and the columns to ordered response categories. The continuous variables are discretized by $c_j$, j = 1, ..., K (see (6.54)), and the counts $n_{ij}$ represent the numbers of observations of X and Y in the interval $(c_{j-1}, c_j)$. The sets of counts for each row are distributed as multinomial vectors with probabilities $p_{ij}$, i = 1, 2, j = 1, ..., K, defined in (6.54). Since for categorical data P(X = Y) is not necessarily zero, the authors concentrate on inference about $\Delta = P(X < Y) - P(Y < X)$ and $R^* = P(X < Y) + \frac{1}{2}P(X = Y)$. Recall that these quantities are connected by the linear relationship $\Delta = 2R^* - 1$. Point estimation of $\Delta$ is treated in detail in Section 6.4.1. Two applications of these techniques based on real-world data are of a psychological nature while the third is based on the data in Cochran's (1954) well known paper. The first application stems from Oskamp's (1962) data comparing the performances of staff (X) and trainees (Y) in correctly interpreting diagnostic tests provided to psychologically disturbed patients, using Pettitt's (1984) discretization of the data (only the integer values of the test score were recorded). Data is presented in Table 7.3. Note that this is a sparse table with an average of less than 2 observations per cell. The value of $\Delta$ is estimated by the WMW type statistic (6.55), and for the above data $\hat{\Delta}_{WMW} = -0.4348$, corresponding to $\hat{R}^* = 0.283$. (Brownie (1988) points at typographical errors corrected herein.) A hypothesis test of $H_0 : p_{1j} = p_{2j}$ based on $\hat{\Delta}_{WMW}$ rejects $H_0$ with the p-value 0.013; namely, the probability that a trainee outperforms a staff member is approximately 0.3. It would seem that the distributions are bimodal. The second application is borrowed from B.S. Everitt's (1977) popular

Table 7.3 Analysis of data on diagnostic tests (Oskamp, 1962). Rows correspond to staff and trainees. Columns are the integer values of the performance of the experts.

Data       62  63  66  68  69  70  71  72  73  74  75  76
Staff       0   0   0   1   2   3   1   1   3   3   4   3
Trainees    1   1   1   1   3   5   4   0   2   5   0   0

text and provides an age-oriented classification of 223 boys into nonliars and inveterate liars. Data is presented in Table 7.4.

Table 7.4 Analysis of data on inveterate liars (Everitt, 1977). The rows are the groups corresponding to whether or not the boy is an inveterate liar. The columns form age groups.

Age group          5-7  8-9  10-11  12-13  14-15
Inveterate liars     6   18     19     27     25
Nonliars            15   31     31     32     19
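The estimates quoted for Tables 7.3 and 7.4 can be checked by direct pair counting. The following is a minimal sketch, assuming that the WMW type statistic reduces to the plain plug-in form in which each probability is replaced by the corresponding observed proportion (formula (6.55) itself is not reproduced in this section, so this form is an assumption). The function name `wmw_from_table` is hypothetical.

```python
def wmw_from_table(x_counts, y_counts):
    """Return (delta_hat, r_star_hat) for two rows of ordered-category counts.

    delta_hat  estimates P(X < Y) - P(Y < X);
    r_star_hat estimates P(X < Y) + 0.5 * P(X = Y) = (delta_hat + 1) / 2.
    """
    if len(x_counts) != len(y_counts):
        raise ValueError("rows must have the same number of categories")
    n_x, n_y = sum(x_counts), sum(y_counts)
    less = greater = 0  # numbers of (X, Y) pairs with X < Y and with X > Y
    y_below = 0         # running count of Y observations in lower categories
    for x_j, y_j in zip(x_counts, y_counts):
        greater += x_j * y_below
        less += x_j * (n_y - y_below - y_j)
        y_below += y_j
    delta = (less - greater) / (n_x * n_y)
    return delta, (delta + 1) / 2


# Table 7.3: X = staff, Y = trainees (scores 62, 63, 66, 68, ..., 76).
staff = [0, 0, 0, 1, 2, 3, 1, 1, 3, 3, 4, 3]
trainees = [1, 1, 1, 1, 3, 5, 4, 0, 2, 5, 0, 0]
# Table 7.4: X = inveterate liars, Y = nonliars (five age groups).
liars = [6, 18, 19, 27, 25]
nonliars = [15, 31, 31, 32, 19]

for name, x, y in [("Table 7.3", staff, trainees), ("Table 7.4", liars, nonliars)]:
    delta, r_star = wmw_from_table(x, y)
    print(f"{name}: delta = {delta:.4f}, R* = {r_star:.3f}")
```

Under this assumption the counts give -0.4348 and 0.283 for Table 7.3 and -0.1901 and 0.405 for Table 7.4, in agreement with the values reported in the text.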

Here, $\hat{\Delta}_{WMW} = -0.1901$ and $\hat{R}^* = 0.405$. Namely, the probability that a liar is younger than a nonliar is estimated to be around 0.4.

Clinical trials applications. Another important application of the P(X < Y) model is related to clinical trials of medical treatments or drugs. The first example we are going to discuss here is not the most important, but we shall proceed with it since it is presented in the same Simonoff et al.

(1986) paper considered in the previous subsection. The application deals with a leprosy treatment. The data is presented in Table 7.5 below.

Table 7.5 Analysis of data on leprosy patients (Cochran, 1954). Rows are the initial condition of the patient (little or much infiltration). The columns are the change in health after 48 weeks of treatment.

                                        Improvement
Infiltration   Worse   No change   Slight   Moderate   Marked
Little            11          53       42         27       11
Much               1          13       16         15        7

A quick glance at the table indicates that a patient with much infiltration is more likely to improve than one with little infiltration. Formally, $\hat{\Delta}_{WMW} = 0.2326$, while other normal-based estimators are not applicable. The hypothesis $H_0 : p_{1j} = p_{2j}$ is strongly rejected with the p-value approaching 1. Note that $\chi^2$ tests on each table would not reject the independence between rows and columns (X versus Y). This is due to the fact that such a test does not take into account the ordering of the columns.

For a more detailed discussion of medical applications we turn to the work of Halperin et al. (1987), (1989), who analyze data on diabetic and gallstone treatment trials. In two seminal papers, Halperin et al. (1987), (1989), written shortly before Halperin's untimely demise in 1988 with two different sets of co-authors, all experts in clinical trials, devise a new method for confidence interval estimation of R = P(X < Y) using a distribution-free approach. It is closely related to the WMW statistic and uses the fact that the variance of a two-sample Wilcoxon statistic can be bounded by explicit functions of R (see (5.7)). In the first paper (1987) a pivotal quantity (5.45) is derived which allows us to construct an approximate confidence interval on R of the form (5.47) (a comprehensive description of the Halperin et al. (1987) method is presented in Section 5.3.4). The authors (among them John Lachin, a world authority on diabetes clinical trials) once more emphasize the importance and intuitive appeal of the parameter R = P(X < Y) for comparing two samples which may

arise from distributions that differ in more than one parameter. This is a broader model than various parametric or semiparametric models involving a single parameter. The application presented in Halperin et al. (1987) deals with data obtained from the Diabetes Control and Complications Trial (DCCT) containing two randomized groups of insulin-dependent diabetics. They concentrate on the percentage of hemoglobin that is glycosylated (HbA1c), which represents, at a given time, an "integration" of the blood glucose level over a period of at least the past four weeks. In the authors' opinion, HbA1c represents a single convenient measure of the degree of control of blood glucose levels. Comparison of the experimental (intensive) group with the standard group, using the P(X < Y) approach, provides information about the better control, which involves not only the shift in location but also the possibility of reduced inter-individual variation among patients in the experimental group. The 1987 data cited by Halperin et al. (1987) consist of samples of size 90 of adult diabetics in each treatment group which were followed for 6 months or more. The authors utilize subsamples of sizes 40 and 20 (comparing the two treatments twice). Here the amounts of HbA1c at 6 months were used. Firstly, a standard t-test was applied and the results were found significant in both cases, with the sample size of 40 yielding more pronounced differences. (Recall that the t-test is based on the normality assumption.) Next, one-sided lower confidence limits on P(X < Y) (5.44) are obtained using confidence intervals (described in detail in Section 5.3.2) based on the asymptotic normality and Sen's (1967) (see (5.15)) and Govindarajulu's (1968) (see (5.23)) estimators of variance. Halperin et al. (1987) also present their lower bound for R given by (5.47). The authors note that, based on simulations, their method is preferable when constructing confidence intervals for samples as small as 20. For the subsample size of 20, the 95% lower confidence limit using (5.47) is calculated to be 0.628; for the larger subsample of size 40 the corresponding bound is higher: 0.746. The corresponding estimates of P(X < Y) are 0.765 and 0.831, respectively. Extensive simulations were performed by Halperin et al. (1987) to evaluate the comparative adequacy of several confidence interval procedures under various scenarios and three distribution-free methods. The results are not very encouraging. Even the least conservative Halperin et al. (1987) method for R = P(X < Y) = 0.9 and n = 20 yields a coverage deviation which is "sufficiently less than zero", so that one might not wish to use this

method; however, Sen's (1967) and Govindarajulu's (1968) methods "are even much worse in this case". For n = 40 there were no negative estimates, but the signed deviation from the nominal one-sided coverage of the lower confidence limits for P(X < Y) (under an underlying exponential distribution) clearly indicates that the Halperin et al. (1987) method results in a substantially smaller signed deviation from the nominal value even for sample sizes of n = 80, especially when P(X < Y) is at extreme values such as 0.1, 0.2, 0.7, 0.8 or 0.9. Returning to the estimates of P(X < Y) and the 95% lower confidence limits for the DCCT study, which resulted in point estimates in the vicinity of 0.7-0.8, the authors cautiously conclude that: "It does not seem reasonable to postulate P(X < Y) < 0.5."

The Halperin et al. (1989) paper is devoted to distribution-free confidence intervals for R* = P(X < Y) + 0.5P(X = Y) in the case of categorical and right-censored continuous data, using an adaptation of the Halperin et al. (1987) approach. The pivotal quantity in this case is $(\hat{R}^* - R^*)^2/\hat{V}$, where $R^*$ is defined in (6.61), its estimator $\hat{R}^*$ is obtained by substituting $n_{ij}/n_i$ in place of $p_{ij}$, i = 1, 2, j = 1, ..., K, and $\hat{V}$ is obtained from expression (6.65) by substituting $\hat{R}^*$ for $R^*$ and $\hat{\theta}$ for $\theta$. A detailed description of the interval estimation technique is provided in Section 6.4.2. The application of the above confidence procedure provided by Halperin et al. (1989) deals with hepatic toxicity data originating from the US National Cooperative Gallstone Study (NCGS). The study was a placebo-controlled double-blind clinical trial with the aim of assessing the efficacy of chenodeoxycholic acid (chenodiol) for the dissolution of gallstones. Due to early concerns about potential hepatotoxicity, a separate initial study of hepatic morphology was carried out some 7 years earlier. Each of the 126 patients was treated with a low dose (375 mg/day; $n_1 = 56$) or a high dose (the double amount; $n_2 = 61$). Liver biopsies were obtained at baseline and after 9 and 24 months of treatment. Two morphologists (A and B) provided evaluations on each of 89 variables defined on categorical and ordinal scales. The authors focused their attention on the report of morphologist A at 9 months, who evaluated the condition of the portal triads (see Table 7.6). The variable X (Y) corresponds to the low- (high-) dose variable; thus if R* is greater than 1/2, it would imply a higher risk associated with the 750 mg/day treatment. The estimate of R* is $\hat{R}^* = 0.5815$, and the 95% confidence

Table 7.6 Enlargement of portal triads

             Normal   Mild   Moderate   Severe   Total
High dose        44      1         15        1      61
Low dose         49      2          5        0      56

interval is (0.508, 0.652) which reflects, in the authors' words, a significantly higher risk of portal triad enlargement for the higher dose group. Hochberg (1981) carried out a similar analysis and obtained the interval (0.511, 0.652) on R*. Using the parameter A = P(X

                         High dose   Low dose
Unequivocally normal             0          0
Probably normal                 13         13
Mildly abnormal                 40         30
Moderately abnormal              3          5
Severely abnormal                2          0
Total                           58         48

Note that the sample sizes are somewhat smaller, especially among the low dose patients (due to insufficient or poor biopsy). Surprisingly, the overall assessment provided inconclusive results, yielding $\hat{R}^* = 0.5162$ and the 95% confidence interval for R* of the form (0.4230, 0.6082). This discrepancy can perhaps be explained by noting that the distinction between the categories "probably normal" and "unequivocally normal" (no unequivocal cases were identified) is rather blurred and the assessor may tend to "play safe" by lumping cases into the "probably" category. The same conclusion may be reached when comparing the "mildly" and "moderately abnormal" category data in the overall assessment, which are quite opposite to the "mild" and "moderate" data obtained when evaluating enlargement of portal triads. In our opinion, classifications of the type "overall assessment" are too vague and possibly not even very reliable.
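The two point estimates can be verified by hand. The following worked check assumes the plug-in form of the estimator (each $p_{ij}$ replaced by $n_{ij}/n_i$), with X the low-dose row and Y the high-dose row; it only counts pairs and does not reproduce the confidence interval calculation.

```latex
% Portal triads (Table 7.6): low dose X = (49, 2, 5, 0), high dose Y = (44, 1, 15, 1)
\begin{align*}
\#\{X < Y\} &= 49 \cdot 17 + 2 \cdot 16 + 5 \cdot 1 = 870, \\
\#\{X = Y\} &= 49 \cdot 44 + 2 \cdot 1 + 5 \cdot 15 + 0 \cdot 1 = 2233, \\
\hat{R}^* &= \frac{870 + 2233/2}{56 \cdot 61} = \frac{1986.5}{3416} \approx 0.5815 .
\end{align*}

% Overall assessment: low dose X = (0, 13, 30, 5, 0), high dose Y = (0, 13, 40, 3, 2)
\begin{align*}
\#\{X < Y\} &= 13 \cdot 45 + 30 \cdot 5 + 5 \cdot 2 = 745, \\
\#\{X = Y\} &= 13 \cdot 13 + 30 \cdot 40 + 5 \cdot 3 = 1384, \\
\hat{R}^* &= \frac{745 + 1384/2}{48 \cdot 58} = \frac{1437}{2784} \approx 0.5162 .
\end{align*}
```

In both tables the ties account for well over half of all pairs, which is one reason the two estimates stay close to 1/2.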

7.4 ROC Curves Analysis

For a good part of the 20th century the assumption of independent random samples from continuous distributions dominated applications of statistical methodology. Most of the earlier work in the area of the stress-strength relationship follows this pattern. From the mid-1980s we begin to observe deviations from this set-up, mainly because real-world sources of data were not conforming to the i.i.d. continuous model. In fact, a substantial amount of categorized data plays an important role, especially in medically oriented applications (while engineering applications continued to adhere to the assumption of random samples, sometimes supplemented, as we have seen above, by explanatory variables). One of the developments of this type is the analysis of ROC (receiver operating characteristic) curves. This topic was a real hit in the last decade with a large number of publications appearing. In this volume we shall cite only a few of them, referring the interested reader to Swets and Pickett (1982) or the more recent Swets (1996).

7.4.1 ROC Curves and Their Relation to P(X < Y)

A ROC curve is a particular type of ordinal dominance (OD) graph. Consider random variables X and Y, and for every real number c plot a point T(c) in a Cartesian coordinate system with the coordinates (P(X < c), P(Y < c)). The collection of the points T(c) forms a ROC graph. Note that the coordinates of T(c) lie between 0 and 1, so that the ROC graph is always located within the unit square $\{(x, y) : 0 \le x \le 1, 0 \le y \le 1\}$. Moreover, by letting $c \to \pm\infty$ we conclude that the ROC graph always starts at (0,0)

and ends at (1,1). If X and Y are both continuous, their OD graph is also a continuous curve, while for discrete X and Y the OD graph will be a collection of distinct points. Note that these two cases are interrelated: if for continuous variables X and Y the probabilities P(X < c), P(Y < c) are available only for a limited number of values of c, one ends up with a graph constituted by a finite number of points located on the ROC graph for (X, Y). On the other hand, a discrete graph can be converted into a continuous curve by connecting the consecutive points on the graph. The relation between OD graphs and the P(X < Y) model was originally pointed out by Bamber (1975) and brought a variety of methods developed for inference about P(X < Y) into the analysis of ROC curves. Bamber (1975) observed that the area above the OD graph for continuous X and Y is equal to

$$A(X, Y) = \int_0^1 P(X < c)\,dP(Y < c) = P(X < Y) \qquad (7.2)$$

since P(X = Y) = 0 in this case. For discrete X and Y he shows that (recall Section 6.4)

$$A(X, Y) = R^* = P(X < Y) + \frac{1}{2}P(X = Y). \qquad (7.3)$$

It is evident that the OD graph for (X, Y) can be obtained by rotating the OD graph for (Y,X) and A(X,Y) = 1 - A(Y,X). In view of the relations (7.2) and (7.3), the area A(X, Y) can be utilized as a measure of the size of difference between two populations with A(X, Y) = 1 if and only if the distribution of X lies entirely below the distribution of Y. On the other hand, if X and Y are identically distributed, A(X, Y) = 1/2. However, A(X, Y) is more commonly applied to measure how accurately a given test differentiates two populations. Consider the "yes-no" signal detection experiment. In this experiment, the observer is told to respond "yes" if he/she thinks that the signal was presented on the trial and to respond "no" otherwise. It is assumed that the observer performs this task as follows. First, he/she adopts (often subjectively) an impression strength criterion, say c. Then, on each trial if the impression strength reaches or exceeds the criterion, he/she responds

"yes", and responds "no" otherwise. Let $I_s$ and $I_n$ be continuous random variables denoting the strengths of sensory impressions aroused by signal and noise events, respectively. Then

$$P(\text{yes} \mid \text{signal}) = P(I_s > c) \quad \text{and} \quad P(\text{yes} \mid \text{noise}) = P(I_n > c).$$

The $(I_n, I_s)$ ROC curve (or yes-no ROC curve) is then a collection of points $(P(I_n > c), P(I_s > c))$ in a unit square. It is easy to observe that this is the rotated OD graph for $(I_n, I_s)$, so that the area below the graph is equal to $P(I_n < I_s)$. The curve can be estimated from data on $P(I_n > c)$ and $P(I_s > c)$ under certain (parametric or nonparametric) assumptions. However, sometimes no such data is available. Consider the case when an expert performing an experiment on each trial assesses his/her degree of confidence that a signal was indeed presented on that trial (for instance, "definitely no", "probably no", "questionable", "probably yes", "definitely yes"). For this purpose, he/she is given a confidence scale consisting of K confidence levels which are obtained by simply discretizing all possible values of $I_n$ and $I_s$ by means of a partition $-\infty < c_0 < c_1 < \cdots < c_{K-1} < c_K < \infty$. The expert concludes that the signal belongs to category $C_j$ if it lies between $c_{j-1}$ and $c_j$, $j = 1, \ldots, K$, although the cut-off points $c_j$ may not be explicitly available. The data allow one to plot several sample points on the ROC curve, and essentially coincide with the categorical data for the P(X < Y) model discussed in some detail in Section 6.4. In other words, the ROC curve analysis described above is yet another version of the stress-strength model and can be performed by the variety of methods developed in Section 6.4. The specific (and somewhat controversial) assumption of ROC curve analysis, however, is the existence of a monotone transformation such that on a transformed scale $I_n$ and $I_s$ are normally distributed with possibly different means and variances (see e.g. Brownie (1988) and Metz et al. (1998)). Since in a number of applications the cut-off points are unknown, this is equivalent to the assumption that $(I_n, I_s)$ are normally distributed, and the methods developed in Section 6.4 can be applied in this case.
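To illustrate the normal case, the following sketch traces the yes-no ROC curve for independent normal $I_n$ and $I_s$ and compares the trapezoid-rule area below it with the standard closed form $P(I_n < I_s) = \Phi\big((\mu_s - \mu_n)/\sqrt{\sigma_n^2 + \sigma_s^2}\big)$ for independent normal variables. The parameter values, the grid of cut-offs and the function names are assumptions made for illustration only.

```python
import math

def norm_cdf(z):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def roc_points(mu_n, sd_n, mu_s, sd_s, cutoffs):
    """Points (P(I_n > c), P(I_s > c)) for normal I_n and I_s."""
    return [(1.0 - norm_cdf((c - mu_n) / sd_n),
             1.0 - norm_cdf((c - mu_s) / sd_s)) for c in cutoffs]

def area_below(points):
    """Trapezoid-rule area under points sorted by the horizontal coordinate."""
    pts = sorted(points)
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))

# Hypothetical parameters, chosen only for illustration.
mu_n, sd_n, mu_s, sd_s = 0.0, 1.0, 1.0, 1.5
cutoffs = [-10.0 + 0.01 * i for i in range(2501)]  # c runs from -10 to 15
auc = area_below(roc_points(mu_n, sd_n, mu_s, sd_s, cutoffs))
closed_form = norm_cdf((mu_s - mu_n) / math.hypot(sd_n, sd_s))
print(round(auc, 4), round(closed_form, 4))
```

The two numbers should agree up to the discretization error of the cut-off grid.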

7.4.2 Applications of ROC Curves

ROC curve analysis has been used in various fields of medical imaging, radiology, psychiatry, nondestructive testing and manufacturing inspection systems (see e.g. Hsiao et al. (1989), Metz (1989), Nockemann et al. (1991), Reiser (2000), Swets (1996) and Swets and Pickett (1982)). Here, we shall consider an example of comparison of predictive validities of aptitude tests studied by Humphreys and Swets (1991). These authors compare two methods of assessment of air pilots' training which were in use during World War II. The training was broken into nine steps, stanine, which represented weighted raw-score composites of different tests. The results of these tests provide information about the expected date of graduation (when pilot's wings and a commission as a second lieutenant were awarded). The tests and their weights entering into the pilot stanine were changed from time to time as research information accumulated. One of these changes took place in 1942-43. The data in Humphreys and Swets (1991) represent three classes. Below, we choose just two of them, the class 43-H tested on the pilot stanine of December 1942 and the class 44-1 tested on the stanine of November 1943. Note that during this period there was a change in the training standards: between the 43-H and 44-1 classes, the decision was made to use a minimum cut-off of 4 on the pilot stanine as a prerequisite for entry to the pilot training program.

Table 7.8 Pass/fail classification in pilot training classes

               Class 43-H                Class 44-1
Stanine   No. passed   No. failed   No. passed   No. failed
   9          663           45          683           17
   8          565          101          718           76
   7          988          249         1166          159
   6         1184          486         1306          327
   5         1127          708          962          405
   4          841          827          359          282
   3          401          620            2            3
   2          148          397            1            0
   1           43          214            0            0
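The counts in Table 7.8 can be turned into empirical ROC points directly. The sketch below treats each cut-off "stanine ≥ s" as one operating point, joins the points by straight lines and computes the area below them, which equals the empirical P(failing score < passing score) plus half the probability of a tie. This is a nonparametric illustration under those assumptions; it is not the normal-model fit used by Humphreys and Swets (1991), so the resulting areas need not coincide with the values they report. Function and variable names are hypothetical.

```python
# Counts from Table 7.8, listed for stanines 9, 8, ..., 1.
passed_43h = [663, 565, 988, 1184, 1127, 841, 401, 148, 43]
failed_43h = [45, 101, 249, 486, 708, 827, 620, 397, 214]
passed_44 = [683, 718, 1166, 1306, 962, 359, 2, 1, 0]
failed_44 = [17, 76, 159, 327, 405, 282, 3, 0, 0]

def roc_points(passed, failed):
    """Empirical ROC points, one per cut-off 'stanine >= s', from (0, 0) to (1, 1)."""
    n_pass, n_fail = sum(passed), sum(failed)
    points, cum_pass, cum_fail = [(0.0, 0.0)], 0, 0
    for p, f in zip(passed, failed):  # highest stanine first
        cum_pass += p
        cum_fail += f
        points.append((cum_fail / n_fail, cum_pass / n_pass))
    return points

def trapezoid_area(points):
    """Area below the polyline joining consecutive ROC points."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

auc_43h = trapezoid_area(roc_points(passed_43h, failed_43h))
auc_44 = trapezoid_area(roc_points(passed_44, failed_44))
print(f"43-H: {auc_43h:.3f}   44-1: {auc_44:.3f}")
```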

Using the data presented in Table 7.8, the authors tested the assumption that the later stanine assessment of a student's performance is "better" than the earlier one. The data for each stanine is assumed to be normal with unknown partition values. The normality assumptions were tested by chi-square tests, which provided satisfactory fits. The ROC curves for both classes are constructed with $I_n$ and $I_s$ being the unknown raw scores of the failing and passing trainees. These ROC curves are presented in Fig. 7.1 and 7.2. The area under the ROC curve for each class was calculated and turned out to be 0.734 for the 43-H class and 0.714 for the 44-1 class, thus showing no significant difference between the two stanines.

Fig. 7.1 ROC curve for the pass-fail data in Class 43-H.

7.5 Some Other Applications

7.5.1 Estimation of Strength Characteristics from the Distribution of Stress

Fig. 7.2 ROC curve for the pass-fail data in Class 44-1.

Using the US Air Force Material Laboratory report (1974), Durham and Padgett (1990) analyze simplified data on windgust loading experiments with sheets of steel alloy. The models utilized by the authors are not the conventional P(X < Y) models but are related to a probabilistic interpretation of Miner's rule (Birnbaum and Saunders (1968)). Durham and Padgett (1990) study estimation of characteristics of the cdf of the magnitude of strength Y of an item under the assumption that the cdf of the stress X applied in a typical loading is known while the cdf of Y is unknown but assumed to be continuous. Neither X nor Y is directly observable and the estimates of characteristics of Y are obtained semi-parametrically. Specifically, a simple stress-strength model with i.i.d. loadings postulates that loads are applied to the item until it fails (namely, the current load exceeds the strength). The number of loadings until failure is recorded for each item and there are no other failure modes. Under these assumptions the conditional distribution of the number N of loadings until failure is given by the geometric distribution

$$P(N = n \mid Y = y) = p_y^{\,n-1} q_y, \qquad n = 1, 2, \ldots,$$

where $p_y = P(X_{ij} < Y_i \mid Y_i = y) = F_X(y)$ and $q_y = 1 - p_y$. Here $i$ is the index for the item, $i = 1, \ldots, k$, $X_{ij}$, $j = 1, 2, \ldots$, are the loadings applied to item $i$, which are random variables, and $Y_i$ is the strength of item $i$ on test, which is a random variable as well. Hence,

$$P(N > n \mid Y = y) = p_y^{\,n}.$$

The unconditional probability that the number of loadings until failure of a system exceeds $n$ is

$$R(n) = P(N > n) = \int_0^\infty p_y^{\,n}\, dF_Y(y),$$

where $F_Y(\cdot)$ is the cdf of $Y$. Equivalently, introducing the unconditional probability

$$\pi(n) = P(N = n) = \int_0^\infty p_y^{\,n-1} q_y\, dF_Y(y),$$

we have

$$R(n) = 1 - \sum_{m=1}^{n} \pi(m).$$
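The relation between R(n) and the probabilities P(N = m) can be checked on a toy case. The sketch below assumes, purely for illustration, a stress that is uniform on (0, a), so that $p_y = y/a$, and a hypothetical two-point strength distribution; the integrals over $F_Y$ then reduce to finite sums, evaluated here in exact rational arithmetic.

```python
from fractions import Fraction

a = Fraction(10)  # assumed upper end of the uniform stress range
# Hypothetical two-point strength distribution: value -> probability.
strength = {Fraction(4): Fraction(1, 2), Fraction(8): Fraction(1, 2)}

def R(n):
    """R(n) = P(N > n) = E[p_Y^n], with p_y = F_X(y) = y / a."""
    return sum(w * (y / a) ** n for y, w in strength.items())

def pi(n):
    """pi(n) = P(N = n) = E[p_Y^(n-1) q_Y], the mixed geometric pmf."""
    return sum(w * (y / a) ** (n - 1) * (1 - y / a) for y, w in strength.items())

for n in range(1, 6):
    # R(n) equals one minus the cumulative probability of failing within n loadings.
    assert R(n) == 1 - sum(pi(m) for m in range(1, n + 1))
print(R(1), R(2))
```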

Assuming that $F_X(\cdot)$ is uniform on $(0, a)$, $a > 0$, and the support of $F_Y(\cdot)$ is contained in $(0, a)$, we have $p_y = y/a$, and from the above expression for $R(n)$ we obtain $R(n) = a^{-n} E(Y^n)$. Namely, the mean and variance of the strength $Y$ are

$$\mu_Y = a R(1) \quad \text{and} \quad \sigma_Y^2 = a^2 \left[ R(2) - R^2(1) \right].$$
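A small simulation can make the moment relations concrete. The sketch below assumes stress uniform on (0, a) and, as a hypothetical choice, strengths drawn uniformly on (2, 8), so that the true mean is 5 and the true variance is 3. It counts loadings to failure for each item and plugs the empirical proportions of items surviving more than one and more than two loadings into the two formulas. All names and parameter values are illustrative assumptions, not part of Durham and Padgett (1990).

```python
import random

def loadings_to_failure(strength, a, rng):
    """Number of i.i.d. uniform(0, a) loads up to and including the first one that is not below the strength."""
    n = 1
    while rng.uniform(0.0, a) < strength:
        n += 1
    return n

def moment_estimates(counts, a):
    """Plug empirical R(1), R(2) into mu_Y = a R(1) and sigma_Y^2 = a^2 [R(2) - R(1)^2]."""
    k = len(counts)
    r1 = sum(n > 1 for n in counts) / k
    r2 = sum(n > 2 for n in counts) / k
    return a * r1, a * a * (r2 - r1 * r1)

rng = random.Random(0)
a = 10.0
# Hypothetical strengths: uniform on (2, 8), so the true mean is 5 and the true variance is 3.
strengths = [rng.uniform(2.0, 8.0) for _ in range(200_000)]
counts = [loadings_to_failure(y, a, rng) for y in strengths]
mean_hat, var_hat = moment_estimates(counts, a)
print(mean_hat, var_hat)
```

With a large number of simulated items the two estimates should be close to 5 and 3; with only a handful of items, as in real experiments, the variance estimate in particular can be very noisy.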

Now one can estimate $\mu_Y$ and $\sigma_Y^2$ noting that unbiased and consistent estimators of $R(1)$ and $R(2)$ are

$$\hat{R}(1) = (\text{number of } N_i > 1)/k, \qquad \hat{R}(2) = (\text{number of } N_i > 2)/k,$$

respectively. Here $N_i$ is the least upper bound of the set $\{n : X_{i1} < Y_i, \ldots, X_{i,n-1} < Y_i\}$, $n \ge 1$, and as above $k$ is the number of items on the test. Details of the procedure are given in Durham and Padgett (1990), who approximated the distribution $F_Y(y)$ by means of a discrete distribution using the method of Deely and Kruse (1968). In the example provided by the authors, the mean stress per loading is 32 kpsi. Each of the seven components was repeatedly stressed until failure. The stress cdf was assumed to be exponential, $F_X(x) = 1 - \exp(-\lambda x)$ with $\lambda = 1/32 \approx 0.031$ (kpsi$^{-1}$). The ordered observed numbers of loadings to

failure were: 1, 1, 2, 3, 4, 5, 14. The problem is to estimate the mean strength. The estimator depends only upon the choice of the weights $y_{jk}$, $j = 1, \ldots, 7$, at which the cdf $F_Y$ is approximated. Indeed, the choice of $y_{jk}$: 25 (25) 175 (that is, 25, 50, ..., 175) yields an estimated mean strength of 55.1 kpsi and an estimated standard deviation of strength of 10.1 kpsi. An alternative choice of $y_{jk}$: 10 (20) 130 results in an estimated mean strength of 54.9 kpsi. See Durham and Padgett (1990) for further details. A similar procedure can be used for a model of cumulative damage in which loading $j$ produces a random amount of damage $D_j$, which is manifested by a reduction in the strength of the item; the damage accumulates linearly until the item fails (namely, as above, the current load exceeds the current strength), and again there are no other failure modes. The basic change in the derivation is that now

$$P(N > n \mid Y = y) = P\Big( \sum_{j=1}^{n} D_j \le y \Big).$$

Consequently,

$$R(n) = \int_0^\infty P\Big( \sum_{j=1}^{n} D_j \le y \Big)\, dF_Y(y),$$

which can be written as

$$R(n) = \int_0^\infty F_D^{*n}(y)\, dF_Y(y).$$

Here $F_D^{*n}(y)$, the cdf of $\sum_{j=1}^{n} D_j$, is the $n$-fold convolution of $F_D$, $D_j$ being the damage caused by load $j$, a random variable. The same procedure as described above can be used to estimate the characteristics of the strength: just replace $(F_X(\cdot))^n$ by the convolution of $F_D$.

7.5.2 A Relation Between the Stress-Strength Model and the Process Capability Index

Finally, we shall briefly comment on the as yet unexplored relation between the process capability indices and the stress-strength model - two seemingly distinct fields of study within the general framework of statistical quality control.

The use of process capability indices (see e.g. Kotz and Johnson (2002) for a recent survey) is motivated by a desire to have an index related to the probability that an attribute (Z) of a component (size, density, elastic strength, etc.) falls within fixed specification limits (sometimes the "specification interval" is one-sided, e.g. $(-\infty, A)$ or $(A, \infty)$, and only a lower or upper limit is specified). However, in some circumstances it may be desirable to have an "index" allowing for possibly varying limits, $T_L$ or $T_U$, say, for lower and upper limits respectively. We are then interested in $P(T_L < Z < T_U)$. If only one limit ($T_L$ or $T_U$) is finite, we are back to a stress-strength model calculation of the P(X < Y) type. The more complicated analysis for $P(T_L < Z < T_U)$ corresponds to the generalization P(X < Y < Z) of the stress-strength model studied in detail in Section 6.2.2. An analysis covering less restrictive situations than those described in that section could turn out to be useful and revealing in bridging and unifying the two approaches.
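As a purely illustrative sketch of the quantity discussed above, the code below estimates the probability that an attribute falls between two random limits by Monte Carlo, assuming independent normal distributions for the lower limit, the attribute and the upper limit. The distributions, their parameters and the function names are hypothetical. For comparison it also evaluates the one-sided case, which reduces to a P(X < Y) calculation with the standard closed form for independent normal variables.

```python
import math
import random

def norm_cdf(z):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_within_limits(n_draws, rng):
    """Monte Carlo estimate of P(T_L < Z < T_U) for independent normal T_L, Z, T_U."""
    hits = 0
    for _ in range(n_draws):
        t_low = rng.gauss(-2.0, 0.5)  # hypothetical lower limit
        z = rng.gauss(0.0, 1.0)       # hypothetical attribute
        t_up = rng.gauss(2.0, 0.5)    # hypothetical upper limit
        hits += t_low < z < t_up
    return hits / n_draws

rng = random.Random(1)
two_sided = prob_within_limits(200_000, rng)
# With only the upper limit finite, the problem reduces to P(Z < T_U).
one_sided = norm_cdf((2.0 - 0.0) / math.hypot(1.0, 0.5))
print(two_sided, one_sided)
```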

Bibliography

Abramowitz, M., Stegun, I.A. (1992) Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. Reprint of the 1972 edition. Dover Publications, New York.
Abu-Salih, M.S., Shamseldin, A.A. (1988) Bayesian estimation of P(X < Y) for a bivariate exponential distribution. Arab Gulf J. Sci. Res. A. Math. Phys. Sci., 6(1), 17-26.
Abusev, R.A., Kolegova, N.V. (1998) On estimators of probabilities of linear inequalities in the case of multivariate T-distributions. In Statistical Methods of Estimation and Hypothesis Testing, 18-24, Perm State University, Perm (in Russian).
Ahmad, K.E., Fakhry, M.E., Jaheen, Z.F. (1995) Bayes estimation of P(Y > X) in the geometric case. Microelectron. Reliab., 35(5), 817-820.
Ahmad, K.E., Fakhry, M.E., Jaheen, Z.F. (1997) Empirical Bayes estimation of P(Y < X) and characterization of Burr-type X model. J. Statist. Plan. Inf., 64, 297-308.
Akman, O., Sansgiry, P., Minnotte, M.C. (1999) On the estimation of reliability based on mixture inverse Gaussian distributions, pp. 121-128. In Applied Statistical Science, IV, Nova Science Publishers.
Al-Hussaini, E.K., Mousa, M.A.M., Sultan, K.S. (1997) Parametric and nonparametric estimation of P(Y < X) for finite mixtures of lognormal components. Commun. Statist. - Theory Meth., 26, 1269-1289.
Aminzadeh, M.S. (1991) Confidence bounds for Pr(X > Y) in 1-way ANOVA random model. IEEE Trans. Reliab., 40, 537-541.
Aminzadeh, M.S. (1997) Estimation of reliability for exponential stress-strength models with explanatory variables. Appl. Math. Comput., 84, 269-274.
Aminzadeh, M.S. (1999) Estimation of P(Z < Y) for correlated stochastic time series models. Appl. Math. Comput., 104, 179-189.
Anderson, T.W., Fang, K.T., Hsu, H. (1986) Maximum-likelihood estimates and likelihood-ratio criteria for multivariate elliptically contoured distributions. Canadian Journ. Statist., 14, 55-59.
Arsham, H. (1986) A generalized confidence region for stress-strength reliability. IEEE Trans. Reliab., 35, 586-588.
Awad, A.M., Azzam, M.M., Hamdan, M.A. (1981) Some inference results on Pr(X < Y) in the bivariate exponential model. Commun. Statist. - Theory Meth., 10, 2515-2525.
Awad, A.M., Fayoumi, M. (1985) Estimate of P(X < Y) in case of the double exponential distribution. In Proceedings of the Seventh Conference on Probability Theory, Aug. 29-Sept. 4, 1982, Brasov, Romania. VNU Science Press, Utrecht, Netherlands, 527-531.
Awad, A.M., Gharraf, M.K. (1986) Estimation of P(Y < X) in the Burr case: a comparative study. Commun. Statist. - Simul. Comp., 15, 389-403.
Azzalini, A., Chiogna, M. (2002) Stress-strength model for skew-normal distributions. Submitted.
Bader, M.G., Priest, A.M. (1982) Statistical aspects of fibre and bundle strength in hybrid composites. In Progress in Science and Engineering Composites, eds. Hayashi, T., Kawata, K., Umekawa, S., vol. ICCM-IV, Tokyo, 1129-1136.
Bai, D.S., Hong, Y.W. (1992) Estimation of Pr(X < Y) in the exponential case with common location parameters. Commun. Statist. - Theory Meth., 21, 269-282.
Baklizi, A. (2001) Estimation of P(X < Y) in the exponential distribution with censored data. Pakistan J. Statist., 17, 143-149.
Bamber, D. (1975) The area above the ordinal dominance graph and the area below the receiver operating characteristic graph. J. Math. Psychol., 12, 387-415.
Barton, D.E. (1961) Unbiased estimation of a set of probabilities. Biometrika, 48, 227-229.
Basu, A.P. (1967) On the large sample properties of a generalized Wilcoxon-Mann-Whitney statistic. Ann. Math. Statist., 38, 905-915.
Basu, A.P. (1977) A generalized Wilcoxon Mann-Whitney statistic with some applications in reliability. In The Theory and Applications of Reliability, with Emphasis on Bayesian and Nonparametric Methods. Conf., Univ. South Florida, Tampa, Fla., 1975, 1, 131-149. Academic Press, New York.
Basu, A.P. (1981) The estimation of P(X < Y) for distributions useful in life testing. Naval Res. Logist. Quart., 28, 383-392.
Basu, A.P. (1988) Multivariate exponential distributions and their applications in reliability. In Handbook of Statistics, eds. Krishnaiah, P.R. and Rao, C.R., 7, 467-476.
Basu, A., Ebrahimi, N. (1983) On the reliability of stochastic systems. Statist. Probab. Letters, 1, 265-267.
Basu, D. (1964) Estimates of reliability for some distributions useful in life testing. Technometrics, 6, 215-219.
Bechhofer, R.E. (1954) A single-sample multiple decision procedure for ranking means of normal populations with known variances. Ann. Math. Stat., 25, 16-39.
Beg, M.A. (1980a) Estimation of P(Y < X) for truncation parameters distributions. Commun. Statist. - Theory Meth., 9, 327-345.
Beg, M.A. (1980b) On the estimation of P(Y < X) for the two-parameter exponential distribution. Metrika, 27, 29-34.
Beg, M.A. (1980c) Estimation of P(Y < X) for exponential family. IEEE Trans. Reliab., 29, 158-159.
Beg, M.A. (1983) Unbiased estimators and tests for truncation and scale parameters. Amer. J. Math. Mgmt. Sci., 3, 251-274.
Beg, M.A., Singh, N. (1979) Estimation of P(Y < X) for the Pareto distribution. IEEE Trans. Reliab., 28, 411-414.
Belyaev, Y., Lumelskii, Y. (1988) Multidimensional Poisson walks. Journ. Math. Sciences, 40, 162-165.
Bennett, G. (1962) Probability inequality for the sum of independent random variables. J. Amer. Statist. Assoc., 57, 33-45.
Berger, J., Bernardo, J.M. (1989) Estimating a product of means: Bayesian analysis with reference priors. J. Amer. Statist. Assoc., 84, 200-207.
Berger, J., Bernardo, J.M. (1992) On the development of reference priors (with discussion). In Bayesian Statistics 4, eds. J.M. Bernardo, J.O. Berger, A.P. Dawid and A.F.M. Smith, Oxford University Press, Oxford, UK, pp. 35-60.
Bernardo, J.M. (1979) Reference posterior distributions for Bayesian inference (with discussion). J. R. Statist. Soc., B41, 113-147.
Bhattacharyya, G.K. (1977) Reliability estimation from survivor count data in a stress-strength setting. IAPQR Trans. - J. Indian Assoc. for Prod., Qual. and Reliab., 2, 1-15.
Bhattacharyya, G.K., Guttman, I., Johnson, R.A., Reiser, B. (1986) Statistical Inference for Stress-Strength Models With Covariates. Tech. Report 8, University of Toronto, Dept. of Statistics.
Bhattacharyya, G.K., Johnson, R.A. (1974) Estimation of reliability in a multicomponent stress-strength model. J. Amer. Statist. Assoc., 69, 966-970.
Bhattacharyya, G.K., Johnson, R.A. (1975) Stress-strength models for system reliability. In Proc. Symp. on Reliability and Fault Tree Analysis. Ed. Barlow, R.E., Fussell, J.B., Singpurwalla, N.D. SIAM, Philadelphia, 509-532.
Bhattacharyya, G.K., Johnson, R.A. (1977) Estimation of system reliability by nonparametric techniques. Bulletin of Mathematical Society of Greece (Memorial volume), 94-105.
Bhattacharyya, G.K., Johnson, R.A. (1981) Stress-strength models for reliability: overview and recent advances. Proceedings of the Twenty-sixth Conference on the Design of Experiments in Army Research, Development and Testing. New Mexico State Univ., Las Cruces, 1980. ARO Rep. 81, 2, U.S. Army Res. Office, Research Triangle Park, N.C., 531-548.
Bilikam, J.E. (1985) Some stochastic stress-strength processes. IEEE Trans. Reliab., 34, 269-274.

Billingsley, P. (1995) Probability and Measure. Wiley, New York.
Birnbaum, Z.W. (1956) On a use of Mann-Whitney statistics. Proc. Third Berkeley Symp. in Math. Statist. Probab., Vol. 1, 13-17, University of California Press, Berkeley, CA.
Birnbaum, Z.W., McCarty, B.C. (1958) A distribution-free upper confidence bound for Pr(Y < X) based on independent samples of X and Y. Ann. Math. Statist., 29, 558-562.
Birnbaum, Z.W., Saunders, S.C. (1968) A probabilistic interpretation of Miner's rule. SIAM J. Applied Mathematics, 16, 637-652.
Bjerkdal, T. (1960) Acquisition of resistance in guinea pigs injected with different doses of virulent tubercle bacteria. Amer. J. Hygiene, 72, 130-148.
Block, H.W., Basu, A.P. (1974) A continuous bivariate exponential extension. J. Amer. Statist. Assoc., 69, 1031-1037.
Brownie, C. (1988) Estimating Pr(X < Y) in categorized data using "ROC" analysis. Biometrics, 44, 615-621.
Burr, I.W. (1942) Cumulative frequency functions. Ann. Math. Statist., 13, 215-222.
Carlin, B.P., Louis, T.A. (2000) Bayes and Empirical Bayes Methods for Data Analysis. Chapman and Hall, London.
Casella, G. (1985) An introduction to empirical Bayes data analysis. The American Statistician, 39, 83-87.
Casella, G., Berger, R. (1990) Statistical Inference. Duxbury Press, California.
Chandra, S., Owen, D.B. (1975) On estimating the reliability of a component subject to several different stresses (strengths). Nav. Res. Logist. Quart., 22, 31-39.
Chandra, S., Owen, D.B. (1977) On an estimator of the probability $P(X_1 < Y, X_2 < Y, \ldots, X_N < Y)$. South African Statist. J., 11, 149-154.
Chao, A. (1982) On comparing estimators of P(Y < X) in the exponential case. IEEE Trans. Reliab., 31, 389-392.
Chao, A., Cheng, K. (1985) Interval estimators for highly reliable stress-strength models. Chinese J. Math., 13, 131-136.
Charnes, A., Cooper, W.W. (1961) Management Models and Industrial Applications of Linear Programming. Wiley, New York.
Chaturvedi, A., Surinder, K. (1999) Further remarks on estimating the reliability function of exponential distribution under type I and type II censoring. Brazilian J. Probab. Statist., 13(1), 29-39.
Cheng, K.F., Chao, A. (1984) Confidence intervals for reliability from stress-strength relationships. IEEE Trans. Reliab., 33, 246-249.
Choi, S.S., Kim, J.J. (1983) A Bayes reliability estimation from life test in a stress-strength model. J. Korean Statist. Soc., 12, 1-9.
Church, J.D., Harris, B. (1970) The estimation of reliability from stress-strength relationship. Technometrics, 12, 49-54.
Cochran, W.G. (1954) Some methods for strengthening common $\chi^2$ tests. Biometrics, 10, 417-451.

Constantine, K., Karson, M., Tse, S.-K. (1986) Estimators of P(Y < X) in the gamma case. Commun. Statist. - Simul. Comput., 15, 365-388.
Constantine, K., Karson, M., Tse, S.-K. (1989) Bootstrapping estimates of P(Y < X) in the gamma case. J. Statist. Comput. Simul., 33, 217-231.
Cramer, E. (2001) Inference for stress-strength models based on Weinman multivariate exponential samples. Commun. Statist. - Theory Meth., 30, 331-346.
Cramer, E., Kamps, U. (1997a) The UMVUE of P(X < Y) based on type-II censored samples from Weinman multivariate exponential distributions. Metrika, 46, 93-121.
Cramer, E., Kamps, U. (1997b) A note on UMVUE of Pr(X < Y) in the exponential case. Commun. Statist. - Theory Meth., 26, 1051-1055.
Csaki, E. (1984) Empirical distribution function. In Handbook of Statistics. Ed. Krishnaiah, P.R., and Sen, P.K., Vol. 4, Elsevier, North Holland, 405-430.
Dahel, S. (1989) Bias in a stress-strength problem. IEEE Trans. Reliab., 38, 386-387.
Datta, G.S., Ghosh, M. (1995) Some remarks on noninformative priors. J. Amer. Statist. Assoc., 90, 1357-1363.
Datta, G.S. (1996) On priors providing frequentist validity of Bayesian inference for multiple parametric functions. Biometrika, 83, 287-298.
Deely, J.J., Kruse, R.L. (1968) Construction of sequences estimating the mixing distribution. Ann. Math. Stat., 39, 286-288.
DeLong, E.R., Sen, P.K. (1981) Estimation of Pr(X > Y) based on progressively truncated versions of the Wilcoxon-Mann-Whitney statistic. Commun. Statist. - Theory Meth., 10, 963-981.
DeLong, E.R., Sen, P.K. (1982/83) The extended two-sample problem: progressively truncated estimation of P{X > Y}. Statist. Decis., 1(2), 147-170.
Dinh, K.T., Singh, J., Gupta, R.C. (1991) Estimation of reliability in bivariate distributions. Statistics, 22, 409-417.
Downton, F. (1973) On the estimation of Pr(Y < X) in the normal case. Technometrics, 15, 551-558.
Duncan, A.G. (1986) Quality and Industrial Statistics, 5th ed. Homewood, IL: Richard D. Irwin.
Durham, S.D., Padgett, W.J. (1990) Estimation for a probabilistic stress-strength model. IEEE Trans. Reliab., 39, 199-203.
Dutta, K., Srivastava, G.L. (1987) An n-standby system with P(X < Y < Z). IAPQR Trans., 12, 95-97.
Easterling, R. (1972) Approximate confidence limits for system reliability. J. Amer. Statist. Assoc., 67, 220-222.
Edwardes, M.D. de B. (1995) A confidence interval for Pr(X < Y) - Pr(X > Y) estimated from simple cluster samples. Biometrics, 51(2), 571-578.
Efron, B. (1979) Bootstrap methods: another look at the jackknife. Ann. Statist., 7, 1-26.
Efron, B. (1982) The Jackknife, the Bootstrap and Other Resampling Plans. CBMS-NSF monograph, 38, SIAM, Philadelphia.
Enis, P., Geisser, S. (1971) Estimation of the probability that Y > X. J. Amer. Statist. Assoc., 66, 162-168.
Everitt, B.S. (1977) The Analysis of Contingency Tables. London, Chapman and Hall.
Fang, K.-T., Kotz, S., Ng, K.-W. (1990) Symmetric Multivariate and Related Distributions. Chapman and Hall, London, UK.
Feigin, P.D., Lumelskii, Ya.P. (2000) On confidence limits for the difference of two binomial parameters. Commun. Statist. - Theory Meth., 29, 131-141.
Feigin, P.D., Lumelskii, Ya.P., Volkovich, Z.E. (2001) On Monte Carlo simulation of confidence bounds for reliability problems. In Proceedings of 15-th European Simulation Multiconference "Modelling and Simulation 2001", Prague, 719-721.
Ferguson, T.S. (1973) A Bayesian analysis of some nonparametric problems. Ann. Statist., 1, 209-230.
Freund, J.E. (1961) A bivariate extension of the exponential distribution. J. Amer. Statist. Assoc., 56, 971-977.
Gastwirth, J.L., Krieger, A.M. (1991) On bounding $P(X_2 < X_1)$ from grouped data. Scand. J. Statist., 18, 111-117.
Ghosh, M., Lahiri, P. (1992) Estimation of $P(X^{(1)} < X^{(2)})$: a nonparametric empirical Bayes approach. In Order Statistics and Nonparametrics: Theory and Applications (P.K. Sen and I.M. Salama, eds.) Elsevier Science, Netherlands, Amsterdam, 247-261.
Ghosh, J.K., Mukerjee, R. (1992) Non-informative priors (with discussion). In Bayesian Statistics 4, eds. J.M. Bernardo, J.O. Berger, A.P. Dawid and A.F.M. Smith, Oxford University Press, Oxford, UK, 321-344.
Ghurye, S.G., Olkin, I. (1969) Unbiased estimation of some multivariate probability densities and related functions. Ann. Math. Statist., 40, 1261-1271.
Gnedenko, B.V. (1943) Sur la distribution limite du terme maximum d'une série aléatoire. Ann. Math., 44, 423-453.
Govindarajulu, Z. (1967) Two sided confidence limits for P(X > Y) based on normal samples of X and Y. Sankhya, 29, 35-40.
Govindarajulu, Z. (1968) Distribution-free confidence bounds for P(X < Y). Ann. Inst. Statist. Math., 20, 229-238.
Govindarajulu, Z. (1974) Fixed-width confidence intervals for P(X < Y). Reliability and Biometry. Statistical Analysis of Lifelength. Proc. Conf., Florida State Univ., Tallahassee, Fla. 1973. SIAM, Philadelphia, 747-757.
Govindarajulu, Z. (1976) A note on distribution-free confidence bounds for P(X < Y) when X and Y are dependent. Ann. Inst. Statist. Math., 28, 307-308.
Gradshtein, I.S., Ryzhik, I.M. (1980) Tables of Integrals, Series, and Products. Academic Press, New York.
Gray, H.L., Schucany, W.R. (1972) The Generalized Jackknife Statistic. Marcel Dekker, New York.

Gumbel, E.J. (1960) Bivariate exponential distribution. J. Amer. Statist. Assoc., 55, 698-707.
Gupta, C.G., Brown, N. (2001) Reliability studies of the skew-normal distribution and its application to a strength-stress model. Commun. Statist. - Theory Meth., 30, 2427-2445.
Gupta, R.C., Gupta, R.D. (1987) A comparison of various estimators of reliability. Comput. Statist. Data Anal., 5, 215-226.
Gupta, R.C., Ma, S. (1996) Testing the equality of coefficients of variation in k normal populations. Commun. Statist. - Theory Meth., 25, 115-132.
Gupta, R.C., Ramakrishnan, S., Zhou, X. (1999) Point and interval estimation of P(X < Y): the normal case with common coefficient of variation. Ann. Inst. Statist. Math., 51, 571-584.
Gupta, R.C., Subramanian, S. (1998) Estimation of reliability in a bivariate normal distribution with equal coefficients of variation. Commun. Statist. - Simul., 27, 675-698.
Gupta, R.D., Gupta, R.C. (1988) Estimation of $P(Y_p > \max(Y_1, Y_2, \ldots, Y_{p-1}))$ in the exponential case. Commun. Statist. - Theory Meth., 17, 911-924.
Gupta, R.D., Gupta, R.C. (1990) Estimation of Pr(a'x > b'y) in the multivariate normal case. Statistics, 21, 91-97.
Gupta, R.P. (1972) Reliability estimation of a system comprised of k elements from the same truncated exponential model. Statist. Neerl., 26, 55-59.
Gupta, S.S. (1963) Probability integrals of multivariate normal and multivariate t. Ann. Math. Stat., 63, 792-828.
Gupta, S.S. (1963) Bibliography on the multivariate normal integrals and related topics. Ann. Math. Stat., 63, 829-838.
Guttman, I., Johnson, R.A., Bhattacharyya, G.K., Reiser, B. (1988) Confidence limits for stress-strength models with explanatory variables. Technometrics, 30, 161-168.
Hald, A. (1998) A History of Mathematical Statistics From 1750 to 1930. Wiley, New York.
Hallin, M., Seoh, M. (1997) When does Edgeworth beat Berry and Esseen? Numerical evaluations of Edgeworth expansions. J. Statist. Plann. Inference, 63, 19-38.
Halperin, M., Gilbert, P.R., Lachin, J.M. (1987) Distribution-free confidence intervals for $\Pr(X_1 < X_2)$. Biometrics, 43, 71-80.
Halperin, M., Hamdy, M.I., Thall, P.F. (1989) Distribution-free confidence intervals for a parameter of Wilcoxon-Mann-Whitney type for ordered categories and progressive censoring. Biometrics, 45, 509-521.
Hanagal, D.D. (1992) Some inference results in modified Freund's bivariate exponential distribution. Biom. J., 34, 745-756.
Hanagal, D.D. (1995) Testing reliability in a bivariate exponential stress-strength model. J. Indian Statist. Assoc., 33, 41-45.
Hanagal, D.D. (1997a) Note on estimation of reliability under bivariate Pareto stress-strength model. Statist. Papers, 38, 453-459.

Hanagal, D.D. (1997b) Estimation of reliability when stress is censored at strength. Commun. Statist. - Theory Meth., 26, 911-919.
Hanagal, D.D. (1999) Estimation of reliability of a component subjected to bivariate exponential stress. Statist. Papers, 40, 211-220.
Hanagal, D.D., Kale, B.K. (1992) Large sample tests for testing symmetry and independence in some bivariate exponential models. Commun. Statist. - Theory Meth., 21, 2625-2643.
Harris, B., Soms, A.P. (1983) A note on a difficulty inherent in estimating reliability from stress-strength relationships. Naval Res. Logist. Quart., 30, 659-662.
Hayter, A.J., Liu, W. (1996) A note on the calculation of $\Pr\{X_1 < X_2 < \cdots < X_k\}$. The American Statistician, 50(4), 365.
Hilgers, R. (1981) On asymptotically distribution-free confidence bounds for $P(X_1 > X_2)$ based on samples not necessarily independent. Biom. J., 23, 627-633.
Hilgers, R. (1981) On an unbiased variance estimator for the Wilcoxon-Mann-Whitney statistic based on ranks. Biom. J., 23, 653-661.
Hlawka, P. (1975) Estimation of the parameter p = P(X < Y < Z). Prace Nauk. Inst. Mat. Politechn. Wroclaw. No. 11, Ser. Stud. i Materialy No. 10 Problemy rachunku prawdopodobienstwa. 55-65 (in Polish).
Hochberg, Y. (1981) On the variance estimate of a Wilcoxon-Mann-Whitney statistic for group ordered data. Comm. Statist. - Theory Meth., 10, 1719-1732.
Hoeffding, W. (1963) Probability inequalities for sums of bounded random variables. J. Amer. Statist. Assoc., 58, 13-30.
Hogg, R.V., Craig, A.T. (1978) Introduction to Mathematical Statistics. Fourth ed. Macmillan Publishing Co., New York.
Holla, M.S. (1967) Reliability estimation of the truncated exponential model. Technometrics, 9, 332-335.
Hollander, M., Korwar, R.M. (1976) Nonparametric empirical Bayes estimation of the probability that X < Y. Comm. Statist. - Theory Meth., 5, 1369-1383.
Hollander, M., Wolfe, D.A. (1999) Nonparametric Statistical Methods. Second ed. Wiley, New York.
Hsiao, J.K., Bartko, J.J., Potter, W.Z. (1989) Diagnosing diagnoses. Archives of General Psychiatry, 46, 664-667.
Humphreys, L.G., Swets, J.A. (1991) Comparison of predictive validities measured with biserial correlations and ROCs of signal detection theory. Journal of Applied Psychology, 76, 316-321.
Hurt, J. (1980) Estimates of probability for the normal distribution. Aplikace Matematiky, 25, 432-444.
Hwang, T.Y., Hu, C.Y. (1990) More comparisons of MLE with UMVUE for exponential families. Ann. Inst. Statist. Math., 42, 65-75.
Ismail, R., Jeyaratnam, S., Panchapakesan, S. (1986) Estimation of P(X > Y) for gamma distributions. J. Statist. Comput. Simul., 26, 253-267.

Ivshin, V.V. (1996) Unbiased estimators of P(X < Y) and their variances in the case of uniform and two-parameter exponential distributions. J. Math. Sci., 81(4), 2790-2793.
Ivshin, V.V. (1998) On the estimation of the probabilities of a double linear inequality in the case of uniform and two-parameter exponential distributions. J. Math. Sci., 88, 819-827.
Ivshin, V.V., Lumelskii, Ya.P. (1993) Unbiased estimators for linear influence in the case of multivariate normal distribution. Proceedings of the Sixth international Vilnius conference on probability theory and mathematical statistics, Vilnius, 1993, 1, 152-153 (in Russian).
Ivshin, V.V., Lumelskii, Ya.P. (1994) Unbiased estimators for density functions and probabilities of linear inequalities in the multivariate normal case. Stability Problems for Stochastic Models. Frontiers in Pure and Applied Probability, 3, 71-80.
Ivshin, V.V., Lumelskii, Ya.P. (1995) Statistical Estimation Problems in "Stress-Strength" Models. Perm University Press, Perm, Russia.
Iwase, K. (1987) On UMVU estimators of Pr(Y < X) in the two-parameter exponential case. Mem. Fac. Hiroshima Univ., 9, 21-24.
Jana, P.K. (1994) Estimation of P(Y < X) in the bivariate exponential case due to Marshall-Olkin. J. Indian Statist. Assoc., 31, 25-37.
Jana, P.K. (1997) Comparison of some stress-strength reliability estimators. Calcutta Statist. Assoc. Bull., 47, 239-247.
Jana, P.K., Roy, D. (1994) Estimation of reliability under stress-strength model in a bivariate exponential set-up. Calcutta Statist. Assoc. Bull., 44, 175-181.
Jeevanand, E.S. (1997) Bayes estimation of $P(X_2 < X_1)$ for a bivariate Pareto distribution. Statistician, 46, 93-99.
Jeevanand, E.S. (1998) Estimation of reliability under stress-strength model for the Marshall-Olkin bivariate exponential distribution. IAPQR Trans., 23(4), 133-136.
Jeevanand, E.S., Nair, N.U. (1994) Estimating P[X > Y] from exponential samples containing spurious observations. Commun. Statist. - Theory Meth., 23, 2629-2642.
Jeffreys, H. (1961) Theory of Probability. Oxford University Press, Oxford, UK.
Johnson, B.McK. (1975) Bounds on the variance of the U-statistic for symmetric distributions with shift alternatives. Ann. Statist., 3, 955-958.
Johnson, N.L., Kotz, S., Balakrishnan, N. (1994) Continuous Univariate Distributions. Vol. 1. Wiley, New York.
Johnson, N.L., Kotz, S., Balakrishnan, N. (1995) Continuous Univariate Distributions. Vol. 2. Wiley, New York.
Johnson, N.L., Kotz, S., Balakrishnan, N. (1997) Discrete Multivariate Distributions. Wiley, New York.

Johnson, N.L., Kotz, S., Kemp, A.W. (1992) Univariate Discrete Distributions. Wiley, New York.
Johnson, R.A. (1988) Stress-strength Models for Reliability. In Handbook of Statistics. Ed. Krishnaiah, P.R. and Rao, C.R., Vol. 7, Elsevier, North Holland, 27-54.
Johnstone, M.A. (1983) Bayesian estimation of reliability in the stress-strength context. J. Washington Acad. of Sci., 73, 140-150.
Kass, R.E., Wasserman, L. (1996) The selection of prior distributions by formal rules. J. Amer. Statist. Assoc., 91, 1343-1370.
Kakati, M.C. (1987) Multivariate stress-strength model. IAPQR Trans., 12(1), 87-92.
Kapur, E.C. (1975) Reliability bounds in probability design. IEEE Trans. Reliab., 24, 193-195.
Kattan, A.K.A. (1997) On interference theory for half-alpha distributions. Pakistan J. Statist., 13, 261-266.
Kelley, G.D., Kelley, J.A., Schucany, W.R. (1976) Efficient estimation of P(Y < X) in the exponential case. Technometrics, 18, 359-360.
Kececioglu, D. (1972) Reliability analysis of mechanical components and systems. Nuclear Eng. Des., 9, 257-290.
Kim, G.-H. (1981) Bounds for stress-strength interference via mathematical programming. Naval Res. Log. Quart., 28, 75-81.
Kim, D.H., Sang, G.H., Jang, S.C. (2000) Noninformative priors for stress-strength system in Burr-type X model. Journ. Korean Stat. Soc., 29, 17-27.
Klebanov, L.B. (1979) Unbiased parametric estimation of probability distributions. Mat. Zametki, 25, 743-750 (in Russian).
Klein, J.P., Basu, A.P. (1985) Estimating reliability for bivariate exponential distributions. Sankhya, B47, 346-353.
Kotz, S., Balakrishnan, N., Johnson, N.L. (2000) Continuous Multivariate Distributions. Vol. 1. Wiley, New York.
Kotz, S., Johnson, N.L. (2002) Process capability indices. A review, 1992-2000. (With discussion). Journ. Quality Technol., 34, 2-53.
Laplace, P. (1812) Théorie Analytique des Probabilités. Courcier, Paris.
Lee, G. (1998) Development of matching priors for P(X < Y) in exponential distributions. J. Korean Statist. Soc., 27, 421-433.
Lee, S., Park, E. (1998) Confidence intervals for the stress-strength models with explanatory variables. J. Korean Statist. Soc., 27, 435-449.
Lehmann, E.L. (1959) Testing Statistical Hypotheses. Wiley, NY.
Lehmann, E.L., Casella, G. (1998) Theory of Point Estimation. Springer-Verlag, NY.
Lenhof, S., Pensky, M. (2002) Estimation of P(X < Y) for beta-distributed random variables. Submitted.
Lieberman, G.J., Resnikoff, G.J. (1955) Sampling plans for inspection by variables. J. Amer. Statist. Assoc., 50, 457-516.
Lloyd, D.K., Lipow, M. (1962) Reliability, Management, Methods and Mathematics. Prentice-Hall, Englewood Cliffs, NJ.
Lumelskii, Ya.P. (1968) Unbiased sufficient estimators of probabilities in the case of the multivariate normal distribution. Vest. MGU, Mathematics, No. 6, 14-17 (in Russian).
Lumelskii, Ya.P. (1969a) Confidence limits for linear functions of unknown parameters. Theor. Probab. Appl., 14, 364-367.
Lumelskii, Ya.P. (1969b) Unbiased estimators in the case of the Poisson distribution. In Scient. Records of the Perm State University, 218, 234-240, Perm (in Russian).
Lumelskii, Ya.P. (1995) On inadmissibility of biased estimators relative to the quadratic loss. J. Math. Sci., 75, 1401-1403.
Lumelskii, Ya.P., Pensky, M. (1982) Unbiased estimation of characteristics of random variables. In Mathematical Statistics and Its Applications, 8, 114-122, Tomsk (in Russian).
Lumelskii, Ya.P., Pensky, M. (1985) Statistical control and unbiased estimation of deviations of random characteristics. In Proceedings of the All-Union Conference "Application of Multivariate Statistical Analysis in Economy and Quality Control". Tartu, Estonia, 1985, 41-42.
Lumelskii, Ya.P., Sapoznikov, P.N. (1969) Unbiased estimators of probability densities. Theor. Veroyat. Primen., 14, 372-380.
Mace, A.E. (1964) Sample Size Determination. Reinhold.
Madansky, A. (1965) Approximate confidence limits for the reliability of series and parallel systems. Technometrics, 7, 495-503.
Maiti, S.S. (1995) Estimation of P(X < Y) in the geometric case. J. Indian Statist. Assoc., 33, 87-91.
Mankamo, T. (1977) Common load model. A tool for common cause failure analysis. Technical Report, 31, Electrical Engineering Laboratory, Valtion Teknillinen Tutkimuskeskus Technical Research Center, Helsinki, Finland.
Mann, H.B., Whitney, D.R. (1947) On a test whether one of two random variables is stochastically larger than the other. Ann. Math. Statist., 18, 50-60.
Maritz, J., Lwin, T. (1989) Empirical Bayes Methods. Chapman & Hall, London.
Marshall, A.W., Olkin, I. (1967) A multivariate exponential distribution. J. Amer. Statist. Assoc., 62, 30-44.
Mathai, A.M. (1997) Jacobians of Matrix Transformations and Functions of Matrix Argument. World Scientific Publ., Singapore.
Mazumdar, M. (1970) Some estimates of reliability using interference theory. Naval Res. Logist. Quart., 17, 159-165.
McCool, J.I. (1991) Inference on P(Y < X) in the Weibull case. Commun. Statist. - Simul. Comput., 20, 129-148.
Melloy, B.J., Cavalier, T.M. (1989) Bounds for the probability of failure resulting from stress/strength interference. IEEE Trans. Reliab., 38, 383-385.
Mensing, R. (1984) Personal communication.
Metz, C.E. (1989) Some practical issues of experimental design and data analysis in radiological ROC studies. Investigative Radiology, 24, 234-245.
Metz, C.E., Herman, B.A., Shen, J.H. (1998) Maximum likelihood estimation of receiver operating characteristic (ROC) curves from continuously distributed data. Stat. Med., 17, 1033-1053.
Miwa, T., Hayter, A.J., Liu, W. (2000) Calculations of level probabilities for normal random variables with unequal variances with applications to Bartholomew's test in unbalanced one-way models. Comput. Statist. Data Anal., 34, 17-32.
Miwa, T., Hayter, A.J., Kuriki, S. (2001) The evaluation of general non-central orthant probabilities. J. Royal Stat. Soc. Ser. B, to be published.
Morrison, D.F. (1976) Multivariate Statistical Methods. McGraw-Hill, New York.
Mukerjee, R., Dey, D.K. (1993) Frequentist validity of posterior quantiles in the presence of the nuisance parameter: high order asymptotics. Biometrika, 80, 499-505.
Mukherjee, S.P., Saran, L.K. (1985) Estimation of failure probability from a bivariate normal stress-strength distribution. Microelect. Reliab., 25, 699-702.
Myhre, J.M., Saunders, S.C. (1968a) On confidence limits for the reliability of system. Ann. Math. Statist., 39, 1463-1472.
Myhre, J.M., Saunders, S.C. (1968b) Comparison of two methods of obtaining approximate confidence intervals for system reliability. Technometrics, 10, 37-49.
Nandi, S.B., Aich, A.B. (1994a) A note on estimation of P(X > Y) for some distributions useful in life-testing. IAPQR Trans., 19, 35-44.
Nandi, S.B., Aich, A.B. (1994b) A note on confidence bounds for P(X > Y) in bivariate normal samples. Sankhya, Ser. B, 56, 129-136.
Nandi, S.B., Aich, A.B. (1996a) A note on testing hypothesis regarding P(X > Y) in bivariate normal samples. IAPQR Trans., 21, 149-153.
Nandi, S.B., Aich, A.B. (1996b) Hypothesis-test for reliability in a stress-strength model with prior information. IEEE Trans. Reliab., 45, 129-131.
Nelson, W. (1990) Accelerated Testing. Wiley, New York.
Nikulin, M.S., Voinov, V.G. (1993) Unbiased estimators of multivariate discrete distributions and chi-square goodness-of-fit test. Qüestiió, 17, 301-326.
Nikulin, M., Voinov, V. (1996) Tables of the best possible unbiased estimates for functions of parameters of multinomial and negative multinomial distributions. Journ. Math. Sciences, 81, 2363-2367.
Nikulin, M., Voinov, V. (2000) Unbiased estimation in reliability and similar problems. In Recent Advances in Reliability Theory. Methodology, Practice and Inference. Eds. Limnios, M. and Nikulin, M., Birkhauser, Boston, pp. 435-448.
Nockemann, C., Heidt, H., Thomsen, N. (1991) Reliability in NDT: ROC study of radiographic weld inspections. Nondestructive Testing and Evaluation International, 24, 235-245.
Oskamp, S. (1962) The relationship of clinical experience and training methods to several criteria of clinical prediction. Psychological Monographs, 76, No. 28.
Owen, D.B., Craswell, K.J., Hanson, D.L. (1964) Nonparametric upper confidence bounds for P(Y < X) and confidence limits for P(Y < X) when X and Y are normal. J. Amer. Statist. Assoc., 59, 906-924.
Pandit, S.M., Sheikh, A.K. (1980) Reliability and optimal replacement via coefficient of variation. In Proc. Prevention Reliab. Comput., St. Louis, 102.
Papadopoulos, A.S. (1983) Empirical Bayes confidence bounds for the Weibull distribution. J. Inform. Optim. Sci., 4, 43-47.
Park, J.W., Clark, G.M. (1986) A computational algorithm for reliability bounds in probability design. IEEE Trans. Reliab., 35, 30-31.
Patil, G.P., Wani, J.K. (1966) Minimum variance unbiased estimation of the distribution function admitting a sufficient statistics. Ann. Inst. Statist. Math., 18, 39-47.
Patnaik, P.B. (1949) The non-central χ²- and F-distributions and their applications. Biometrika, 36, 202-232.
Pensky, M. (1982) Unbiased estimation of probabilities defined by linear inequalities. In Application of the random search, 124-132, Kemerovo (in Russian).
Pensky, M. (2002) Estimation of probabilities of linear inequalities for independent elliptic random vectors. Sankhya, to be published.
Pensky, M., Takashima, R. (2002) Estimation of P(X < Y) for the generalized gamma distributions. Submitted.
Peszek, I., Rukhin, A.L. (1993) Estimating normal distribution function and normal density. Statist. Decis., 11, 391-406.
Pettitt, A.N. (1984) Tied, grouped continuous and ordered categorical data: A comparison of two models. Biometrika, 71, 35-42.
Pham, T., Almhana, J. (1995) The generalized gamma distribution: its hazard rate and stress-strength model. IEEE Trans. Reliab., 44, 392-397.
Pham-Gia, T., Turkkan, N. (1998) Distribution of the linear combination of two general beta variables and applications. Comm. Statist. - Theory Meth., 27, 1851-1869.
Pham-Gia, T. (2000) Distributions of the ratios of independent beta variables and applications. Comm. Statist. - Theory Meth., 29, 2693-2715.
Pieruschka, E. (1963) Principles of Reliability. Prentice Hall, Englewood Cliffs, NJ, pp. 278-281.
Prasanta, K.J. (1998) Estimation of P[Y < X] under a bivariate exponential stress-strength model. In Frontiers in Probability and Statistics. Papers from 2nd International Triennial Symposium on Probability and Statistics, Calcutta, December 30, 1994-January 2, 1995, Mukherjee, S.P., Basu, S.K., Sinha, B.K. eds. Narosa Publishing House, New Delhi, 207-214.
Priebe, C.E., Cowen, L.J. (1999) A generalized Wilcoxon-Mann-Whitney statistic. Comm. Statist. - Theory Meth., 28, 2871-2878.
Proschan, F., Sullo, P. (1976) Estimating the parameters of a multivariate exponential distribution. J. Amer. Statist. Assoc., 71, 465-472.

Pugh, E.L. (1963) The best estimate of reliability in the exponential case. Oper. Res., 11, 57-61.
Quenouille, M. (1956) Notes on bias in estimation. Biometrika, 43, 353-360.
Raghava Char, A.C.N., Kesava Rao, B., Pandit, S.N.N. (1984) Stress and strength Markov models of the system reliability. Sankhya, Ser. B, 46, 147-156.
Reiser, B., Faraggi, D. (1994) Confidence bounds for Pr(a'x > b'y). Statistics, 25, 107-111.
Reiser, B., Faraggi, D., Guttman, I. (1992) Choice of sample size for testing the P(X > Y). Commun. Statist. - Theory Meth., 21, 559-569.
Reiser, B., Guttman, I. (1986) Statistical inference for Pr(Y < X): the normal case. Technometrics, 28, 253-257.
Reiser, B., Guttman, I. (1987) A comparison of three point estimators for P(Y < X) in the normal case. Comput. Statist. Data Anal., 5, 59-66.
Reiser, B., Guttman, I. (1989) Sample size choice for reliability verification in stress-strength models. Can. J. Statist., 17, 253-259.
Reiser, B. (2000) Measuring the effectiveness of diagnostic markers in the presence of measurement error through the use of ROC curves. Stat. Med., 19, 2115-2129.
Rinco, S. (1983) Estimation of P{Y_p > max(Y_1, Y_2, ..., Y_{p-1})}: predictive approach in exponential case. Can. J. Statist., 11, 239-244.
Rohatgi, V.K. (1989) Unbiased estimation of parametric functions in sampling from two one-truncated parameter families. Austral. J. Statist., 31, 327-332.
Roy, D. (1993) Estimation of failure probability under a binomial normal stress-strength distribution. Microelect. Reliab., 33, 2285-2287.
Rukhin, A. (1986) Estimating normal tail probabilities. Naval Res. Logist. Quart., 33, 91-99.
Sathe, Y.S., Dixit, U.J. (2001) Estimation of P(X < Y) in the negative binomial distribution. J. Statist. Plann. Inference, 93, 83-92.
Sathe, Y.S., Shah, S.P. (1981) On estimating P(X > Y) for the exponential distribution. Commun. Statist. - Theory Meth., 10, 39-47.
Sathe, Y.S., Varde, S.D. (1969) Minimum variance unbiased estimation of reliability for the truncated exponential distribution. Technometrics, 11, 609-619.
Sathe, Y.S., Varde, S.D. (1969) On minimum variance unbiased estimation of probability. Ann. Math. Statist., 40, 710-714.
Scheaffer, R. (1976) On the computation of certain minimum variance unbiased estimators. Technometrics, 18, 497-499.
Schechtman, E. (1983) A conservative nonparametric distribution-free confidence bound for the shift in the change point problem. Comm. Statist. - Theory Meth., 12, 2455-2464.
Seber, G.A.F. (1977) Linear Regression Analysis. Wiley, New York.
Selvavel, K. (1989) Unbiased estimation in sampling from two one-truncation parameter families when both samples are type II censored. Commun. Statist. - Theory Meth., 18, 3519-3531.
Sen, P.K. (1960) On some convergence properties of U-statistics. Calcutta Stat. Assoc. Bull., 10, 1-18.
Sen, P.K. (1967) A note on asymptotically distribution-free confidence intervals for Pr(X < Y) based on two independent samples. Sankhya, Ser. A, 29, 95-102.
Shen, K. (1992) An empirical approach to obtaining bounds for the failure probability through stress-strength interference. Reliab. Eng. Systems Safety, 36(1), 79-84.
Shirahata, S. (1993) Estimate of variance of Wilcoxon-Mann-Whitney statistic. J. Japanese Soc. Comput. Statist., 6, 1-10.
Shiryaev, A.N. (1996) Probability. Springer-Verlag, New York.
Simion, E., Preda, V., Constantinescu, N., Barboi, M. (2000) Reliability analysis of the stress-strength. In Proceedings of the Sixth International Symposium For Design and Technologies For Electronic Modules, Sept. 21-24, 2000, 29-32.
Simonoff, J.S., Hochberg, Y., Reiser, B. (1986) Alternative estimation procedures for P(X < Y) in categorized data. Biometrics, 42, 895-907.
Singh, N. (1980) On the estimation of Pr(X_1 < Y < X_2). Commun. Statist. - Theory Meth., 9, 1551-1561.
Singh, N. (1981) MVUE of Pr(X < Y) for multivariate normal populations: an application to stress-strength models. IEEE Trans. Reliab., 30, 192-193.
Sinha, B.K., Zieliński, R. (1997) Estimating P{X > Y} in exponential model revisited. Statistics, 29, 299-316.
Sinha, S.K. (1989) A note on the variance of the uniformly-minimum-variance-unbiased-estimator of the reliability function of exponential life distribution. Calcutta Statist. Assoc. Bull., 38, 237-240.
Shah, S.P., Sathe, Y.S. (1982) Erratum: "On estimating P(X > Y) for the exponential distribution", Comm. Statist. - Theory Meth., 1981, 10, 39-47. Comm. Statist. - Theory Meth., 11, 2357.
Smirnov, N. (1948) Table for estimating the goodness of fit of empirical distributions. Ann. Math. Statist., 19, 279-281.
Sprent, P. (1989) Applied Nonparametric Statistical Methods. Chapman & Hall, London.
Stacy, E.W. (1962) A generalization of the gamma distribution. Ann. Math. Stat., 33, 1187-1192.
Sun, D., Ghosh, M., Basu, A.P. (1998) Bayesian analysis for a stress-strength system under noninformative priors. Canad. J. Statist., 26, 323-332.
Surles, J.G., Padgett, W.J. (1998) Inference for P(Y < X) in the Burr type X model. J. Appl. Statist. Sci., 7, 225-238.
Surles, J.G., Padgett, W.J. (2001) Inference for reliability and stress-strength for a scaled Burr type X distribution. Lifetime Data Analysis, 7, 187-200.
Swets, J.A., Pickett, R.M. (1982) Evaluation of Diagnostic Systems: Methods from Signal Detection Theory. Academic Press, New York.

Swets, J.A. (1996) Signal Detection Theory and ROC Analysis in Psychology and Diagnostics. Collected Papers. Lawrence Erlbaum Assoc., New Jersey.
Teskin, O.I., Kostyukova, T.M. (1991) Interval estimation of exponent of reliability using the "load-strength" rejection method. Journ. Soviet Math., 56, 2434-2438.
Thompson, R.D., Basu, A.P. (1993) Bayesian reliability of stress-strength systems. In Advances in Reliability, ed. Basu, A.P., Elsevier Science Publishers, Amsterdam, 411-421.
Tong, H. (1974) A note on the estimation of P(Y < X) in the exponential case. Technometrics, 16, 625. Errata: Technometrics, 17, 395.
Tong, H. (1977) On the estimation of P(Y < X) for exponential families. IEEE Trans. Reliab., 26, 54-56.
Tsui, K.W., Weerahandi, S. (1989) Generalized p-values in significance testing of hypotheses in the presence of nuisance parameters. J. Amer. Statist. Assoc., 84, 602-607.
Tukey, J.W. (1958) A problem of Berkson, and minimum variance orderly estimators. Ann. Math. Statist., 29, 588-592.
Ury, H.K. (1972) On distribution-free confidence bounds for Pr{Y < X}. Technometrics, 14, 577-581.
Ury, H.K., Wiggins, A.D. (1976) A general upper bound for the variance of the Wilcoxon-Mann-Whitney U-statistic for symmetric distributions with shift alternatives. Brit. J. Math. Statist. Psychol., 29, 263-267.
Ury, H.K., Wiggins, A.D. (1979) Distribution-free confidence bounds for Pr{Y < X} when F(x) and G(y) = F(x - δ) are continuous and symmetric. Commun. Statist. - Theory Meth., 8, 1247-1253.
Van Dantzig, D. (1951) On the consistency and power of Wilcoxon's two-sample test. Koninklijke Nederlandse Akademie van Wetenschappen Proceedings, Ser. A, 54, 1-8.
Varde, S.D. (1969) Life testing and reliability estimation for the two-parameter exponential distribution. J. Amer. Statist. Assoc., 64, 621-631.
Vedernikova, A.P., Lumelskii, Ya.P. (1991) Unbiased estimation of linear functionals in the case of inverse normal distribution. J. Soviet Math., 56, 2407-2409.
Vysokovskii, E.S. (1966) Reliability of tools used in semi-automatic lathes. Russian Eng. J., 46(6), 46-50.
Voinov, V.G. (1984) On unbiased estimation of P(Y < X) in the normal case. Zapiski Nauchn. Sem. LOMI, 136, 5-12 (in Russian).
Voinov, V.G., Nikulin, M.S. (1993) Unbiased Estimators and Their Applications. Volume 1: Univariate Case. Kluwer Academic Publishers, Dordrecht, Netherlands.
Voinov, V.G., Nikulin, M.S. (1996) Unbiased Estimators and Their Applications. Volume 2: Multivariate Case. Kluwer Academic Publishers, Dordrecht, Netherlands.
Wang, J.D., Liu, T.S. (1996) Fuzzy reliability using a discrete stress-strength interference model. IEEE Trans. Reliab., 45, 145-149.
Weerahandi, S., Johnson, R.A. (1992) Testing reliability in a stress-strength model when X and Y are normally distributed. Technometrics, 34, 83-91.
Wilcoxon, F. (1945) Individual comparisons by ranking methods. Biometrics Bull., 1, 80-83.
Wolfe, D.A., Hogg, R.V. (1971) On constructing statistics and reporting data. The American Statistician, 25, 27-30.
Woodward, W.A., Grey, H.L. (1975) Minimum variance estimation in the gamma distribution. Commun. Statist., 4, 907-922.
Woodward, W.A., Kelley, G.D. (1977) Minimum variance unbiased estimation of P(X < Y) in the normal case. Technometrics, 19, 95-98.
Wu, K.F., Fan, J.C., Li, Y.W. (1990) Strongly consistent estimation for a multivariate linear relationship model. Acta Math. Appl. Sinica, 13, 90-98 (in Chinese).
Yang, M.C.K., Mo, T.C. (1984) Some improvements on the Birnbaum-McCarty bound for P(Y < X). Statist. Probab. Letters, 2, 127-132.
Yang, M.C.K., Mo, T.C. (1985) Distribution-free confidence bounds for Pr{Y < X} of an r-out-of-k system. IEEE Trans. Reliab., 34, 499-503.
Yang, M.C.K., Mo, T.C. (1985) Distribution-free confidence bounds for P(X_1 + X_2 + ... + X_k < z). J. Amer. Statist. Assoc., 80, 227-230. (Correction: J. Amer. Statist. Assoc., 81, 1132.)
Yang, R., Berger, J. (1997) A catalog of noninformative priors. ISDS Discussion Paper 97-42, Duke University.
Yu, Q.Q., Govindarajulu, Z. (1995) Admissibility and minimaxity of the UMVU estimator of P{X < Y}. Ann. Statist., 23, 598-607.
Zacks, S. (1971) The Theory of Statistical Inference. Wiley, New York.
Zalkikar, J.N., Tiwari, R.C., Jammalamadaka, S.R. (1986) Bayes and empirical Bayes estimation of the probability that Z > X + Y. Commun. Statist. - Theory Meth., 15, 3079-3101.
Zaremba, S.C. (1965) Note on the Wilcoxon-Mann-Whitney statistic. Ann. Math. Statist., 36, 1058-1060.
