)
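(4.56)
Hence
$$z_0 = \Phi^{-1}\left(\mathrm{CDF}_B(\hat R)\right). \qquad (4.57)$$
Reapplying (4.53) one can write
$$P\left(g(\hat R^*) < g(\hat R) + z_0\sigma \pm z_{\gamma/2}\sigma\right) = \Phi(2z_0 \pm z_{\gamma/2})$$
and, similarly to (4.56), utilizing (4.54) we have
$$\Phi(2z_0 \pm z_{\gamma/2}) = \mathrm{CDF}_B\left(g^{-1}\left(g(\hat R) + z_0\sigma \pm z_{\gamma/2}\sigma\right)\right)$$
or
$$g(\hat R) + z_0\sigma \pm z_{\gamma/2}\sigma = g\left[\mathrm{CDF}_B^{-1}\left(\Phi(2z_0 \pm z_{\gamma/2})\right)\right]. \qquad (4.58)$$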
Combining (4.58) with (4.55), we arrive at
$$P\left(\mathrm{CDF}_B^{-1}(\Phi(2z_0 - z_{\gamma/2})) < R < \mathrm{CDF}_B^{-1}(\Phi(2z_0 + z_{\gamma/2}))\right) = 1 - \gamma. \qquad (4.59)$$
Note that it is sufficient to be assured only of the existence of the monotonic mapping $g$; its specific form is irrelevant. Moreover, the normal distribution plays no special role in the above argument. Instead of (4.53) we could assume that the pivotal quantity possesses some symmetric distribution other than the normal, in which case $\Phi$ would denote the cdf of this (rather than the standard normal) distribution. The corrected bootstrap interval (4.59) very often performs better than the interval (4.52) obtained by the percentile method directly. It is worth noting that (4.59) can be viewed as an interval based on a combination of the percentile method and the asymptotic bootstrap interval estimation.
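The interval (4.59) can be illustrated with a short Python sketch. It is only a minimal illustration under stated assumptions: the function names are hypothetical, the bootstrap replicates are drawn by resampling each sample with replacement (one possible choice of the estimated distribution), $\mathrm{CDF}_B$ is taken as the empirical cdf of the replicates, and a simple order statistic is used as the empirical quantile.

```python
import random
from statistics import NormalDist


def wmw_estimate(x, y):
    """Proportion of pairs (x_i, y_j) with x_i < y_j."""
    return sum(xi < yj for xi in x for yj in y) / (len(x) * len(y))


def bootstrap_replicates(x, y, n_boot=500, seed=0):
    """Replicates of the estimate from samples redrawn with replacement (assumption)."""
    rng = random.Random(seed)
    reps = []
    for _ in range(n_boot):
        xb = [rng.choice(x) for _ in x]
        yb = [rng.choice(y) for _ in y]
        reps.append(wmw_estimate(xb, yb))
    return reps


def bc_percentile_interval(r_hat, replicates, gamma=0.05):
    """Bias-corrected percentile interval in the spirit of (4.57) and (4.59)."""
    nd = NormalDist()
    reps = sorted(replicates)
    m = len(reps)
    # CDF_B evaluated at r_hat, clipped so that the normal quantile stays finite.
    p0 = sum(r <= r_hat for r in reps) / m
    p0 = min(max(p0, 0.5 / m), 1.0 - 0.5 / m)
    z0 = nd.inv_cdf(p0)                      # (4.57)
    z = nd.inv_cdf(1.0 - gamma / 2.0)        # z_{gamma/2}

    def quantile(p):
        # Simple empirical quantile: an order statistic of the replicates.
        return reps[min(m - 1, max(0, int(p * m)))]

    return quantile(nd.cdf(2 * z0 - z)), quantile(nd.cdf(2 * z0 + z))
```

When the replicates are centred at the estimate, $z_0$ is close to zero and the interval is close to the plain percentile interval; the correction matters when the bootstrap distribution is shifted relative to the estimate.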
4.6
Exercises
4.1. Let X and Y be independent lognormal variables with parameters $(\mu_1, \sigma_1^2)$ and $(\mu_2, \sigma_2^2)$, where the pdf of the lognormal distribution is given in (3.42). Using Theorem 2.10 and results of Section 4.1.2, derive $(1-\gamma)$ one-sided and two-sided confidence intervals for R.
4.2. Let X and Y be independent Burr type XII variables with parameters $(\alpha_1, \beta)$ and $(\alpha_2, \beta)$, respectively, i.e. $\beta$ is the common parameter for X and Y. Using formula (4.22) and Theorem 2.10, derive $(1-\gamma)$ one-sided and two-sided confidence intervals for R.
4.3. Let X and Y be independent Weibull variables with parameters $(\alpha, \sigma_1)$ and $(\alpha, \sigma_2)$, where the common parameter $\alpha$ is known. Using formula (4.22) and Theorem 2.10, derive a $(1-\gamma)$ confidence interval for R.
4.4. (Teskin and Kostyukova (1991)). Let X and Y be independent normal variables with unknown means and variances but known ratio $a = \sigma_2/\sigma_1$. Construct a lower confidence bound for R. Consider the cases $n_1 = n_2$ and $n_1 \ne n_2$ separately.
4.5. Let X and Y be independent Weibull variables with parameters $(\alpha_1, \sigma_1)$ and $(\alpha_2, \sigma_2)$, respectively, where the shape parameters $\alpha_1$ and $\alpha_2$ are known and different. Using results of Section 4.1.4, describe how the $(1-\gamma)$ confidence interval for R can be constructed.
4.6. Using the UMVUE (3.76) of $\mathrm{Var}(\hat R)$ in the case of the uniform distribution with unknown right end, derive the confidence interval for R based on normal approximation.
4.7. Under the conditions of problem 4.1, derive a Bayesian credible set for R.
4.8. (Reiser et al. (1992)). Solve the problem of Example 4.4 in the case when X and Y are a) independent normal variables; b) dependent normal variables.
4.9. (Gupta et al. (1999)). Let X and Y be independent normal variables $N(\mu_1, \sigma_1^2)$ and $N(\mu_2, \sigma_2^2)$, respectively, with a common but unknown coefficient of variation $a = \sigma_1/\mu_1 = \sigma_2/\mu_2$. Using techniques of Section 4.2.3 for estimation of $\mathrm{MSE}(\hat R)$, find the asymptotic confidence interval for R.
4.10. (Gupta and Subramanian (1998)). Solve problem 4.9 when X and Y are dependent normal variables with unknown correlation coefficient.
4.11. Describe the procedure for the construction of bootstrap-based asymptotic confidence intervals in the case when X and Y have independent two-parameter exponential distributions with parameters $(\mu_1, \sigma_1)$ and $(\mu_2, \sigma_2)$.
Chapter 5
Nonparametric Models
Since the seventies of the twentieth century, the allure of nonparametric models has become almost irresistible in statistical methodology due to a number of factors. Among those are the rise of the discipline of Data Analysis and of robust estimation procedures spearheaded by J. Tukey, P. J. Huber and F. Hampel, and unprecedented advances in computer technology. Psychologically these models are quite appealing since they free us from the constraints of a distributional "straitjacket". However, the price paid in lower efficiency and ambiguity of the results sometimes turns out to be quite heavy.
This chapter deals with the nonparametric stress-strength model where the distributions of X and Y are unknown. This version of the problem is quite important, not only because it is the only set-up that can be used in a number of applications but also because it historically preceded the parametric formulation of the problem. In this chapter we shall essentially follow the same well-trodden route as in the three earlier chapters of the book. We start with the construction of the point estimator $\hat R$ of $R = P(X < Y)$ and investigate its properties. Section 5.2 deals with numerous estimators of the variance of $\hat R$. In Section 5.3 we shall provide confidence intervals for R based on $\hat R$. Section 5.4 is devoted to the nonparametric Bayesian approach to the problem. Finally, Section 5.5 describes a probabilistic design approach to the problem.
5.1
Point Estimation of R = P(X < Y)
5.1.1
Initial Results. The WMW Statistic
Chronologically, the stress-strength model started in a nonparametric setup in the simple but pioneering and ingenious works of Wilcoxon (1945) and Mann and Whitney (1947). These authors considered comparison of two independent random variables X and Y with continuous cdfs $F_X$ and $F_Y$, respectively. The aim was to test the hypothesis $H_0: F_X = F_Y$ by testing $H_0: P(X < Y) = P(X > Y) = 1/2$. These ideas can be traced to the European continental authors in the early years of the 20th century (see Hald (1998)) but it was Wilcoxon, Mann and Whitney who put them on the front burner.
The basic idea of Wilcoxon (1945) was to apply ranking methods to testing of $H_0$. He came up with the following procedure. Let $\underline X = (X_1, \ldots, X_{n_1})$ and $\underline Y = (Y_1, \ldots, Y_{n_2})$ be two samples from X and Y, respectively. Wilcoxon (1945) suggests forming the sample of $N = n_1 + n_2$ observations and ranking each of the observations $X_i$, $i = 1, \ldots, n_1$, and $Y_j$, $j = 1, \ldots, n_2$, in this overall sample. Denote by $TR_X$ and $TR_Y$ the sums of ranks of the observations $X_i$ and $Y_j$ in the joint sample. Since under the assumption that $F_X = F_Y$ the expected sum of ranks for X is $E\,TR_X = n_1(N+1)/2$ and similarly the expected sum of ranks for Y is $E\,TR_Y = n_2(N+1)/2$, the statistics $TR_X$ and $TR_Y$ can be applied to test the null hypothesis $H_0$.
Mann and Whitney (1947) developed Wilcoxon's idea further. Let X and Y be continuous random variables. They define the statistic W counting the number of times that an X precedes a Y in a combined sample:
$$W = \sum_{i=1}^{n_1}\sum_{j=1}^{n_2} I(X_i < Y_j), \qquad (5.1)$$
where $I(\cdot)$ is the indicator function.
This statistic is commonly called the Wilcoxon-Mann-Whitney (WMW) statistic. It is evident that the expectation of (5.1) is $EW = n_1 n_2 R$, hence the statistic W can be used not just for testing the hypothesis about the equality of $F_X$ and $F_Y$ but also for statistical inference concerning R. The statistics W, $TR_X$ and $TR_Y$ are interrelated via the equation
$$W = n_1 n_2 + \frac{n_1(n_1+1)}{2} - TR_X = TR_Y - \frac{n_2(n_2+1)}{2}. \qquad (5.2)$$
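The relation (5.2) is easy to check numerically. The following Python sketch (function names are illustrative, and the data are assumed to contain no ties) computes W by direct counting and the rank sums from the pooled ordered sample.

```python
def wmw_statistic(x, y):
    """W of (5.1): number of pairs (x_i, y_j) with x_i < y_j."""
    return sum(1 for xi in x for yj in y if xi < yj)


def rank_sums(x, y):
    """Rank sums TR_X and TR_Y in the pooled sample (no ties assumed)."""
    pooled = sorted([(v, "x") for v in x] + [(v, "y") for v in y])
    tr_x = sum(rank for rank, (_, label) in enumerate(pooled, start=1) if label == "x")
    tr_y = sum(rank for rank, (_, label) in enumerate(pooled, start=1) if label == "y")
    return tr_x, tr_y


x = [1.2, 3.4, 0.5]
y = [2.0, 4.1, 0.9, 5.0]
n1, n2 = len(x), len(y)
w = wmw_statistic(x, y)
tr_x, tr_y = rank_sums(x, y)
# Both expressions in (5.2) reproduce W.
print(w, n1 * n2 + n1 * (n1 + 1) // 2 - tr_x, tr_y - n2 * (n2 + 1) // 2)
```

For these data the pooled ranks of the X's are 1, 3 and 5, so $TR_X = 9$ and $TR_Y = 19$, and all three printed values coincide.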
To verify the first equality in (5.2) note that
$$TR_X = \sum_{i=1}^{n_1}\left[\sum_{k=1}^{n_1} I(X_k \le X_i) + \sum_{j=1}^{n_2} I(Y_j < X_i)\right] = \frac{n_1(n_1+1)}{2} + n_1 n_2 - W.$$
The validity of the second equality follows from the fact that $TR_X = N(N+1)/2 - TR_Y$ (since the total sum of the ranks is $TR_X + TR_Y = N(N+1)/2$). Owen et al. (1964) have shown that the rank representation (5.2) remains in force even if the random variables X and Y are not continuous, provided $X_i \ne X_j$ and $Y_i \ne Y_j$ for $i \ne j$. In such a situation, whenever $X_i = Y_j$, one needs to rank $Y_j$ first and then $X_i$.
5.1.2
Nonparametric UMVUE of R
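Since $EW = n_1 n_2 R$, the unbiased estimator of R is
$$\hat R = \frac{W}{n_1 n_2} = \frac{1}{n_1 n_2}\sum_{i=1}^{n_1}\sum_{j=1}^{n_2} I(X_i < Y_j). \qquad (5.3)$$
Using (5.2), $\hat R$ can be written as
$$\hat R = \frac{1}{n_1}\left[\frac{TR_Y}{n_2} - \frac{n_2+1}{2}\right] = 1 - \frac{1}{n_2}\left[\frac{TR_X}{n_1} - \frac{n_1+1}{2}\right]. \qquad (5.4)$$
An alternative rank representation of $\hat R$ is given by
$$\hat R = \frac{1}{N}\left[\frac{TR_Y}{n_2} - \frac{TR_X}{n_1}\right] + \frac{1}{2}. \qquad (5.5)$$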
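The ratios $TR_X/n_1$ and $TR_Y/n_2$ are the average ranks of the X and Y observations in the combined sample. Denote
$$v_1 = \int_{-\infty}^{\infty} [F_X(x)]^2\,dF_Y(x), \qquad v_2 = \int_{-\infty}^{\infty} [1 - F_Y(x)]^2\,dF_X(x). \qquad (5.6)$$
Taking into account that $\mathrm{Var}(\hat R) = (n_1 n_2)^{-2}EW^2 - R^2$, where
$$EW^2 = n_1 n_2\left[R + (n_1-1)v_1 + (n_2-1)v_2 + (n_1-1)(n_2-1)R^2\right],$$
we obtain
$$\mathrm{Var}(\hat R) = \frac{1}{n_1 n_2}\left[R + (n_1-1)v_1 + (n_2-1)v_2 - (n_1+n_2-1)R^2\right]. \qquad (5.7)$$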
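Comparing (5.6) with (2.6), we arrive at
$$R^2 \le v_1 \le R, \qquad R^2 \le v_2 \le R, \qquad (5.8)$$
thus
$$\frac{1}{n_1 n_2}R(1-R) \le \mathrm{Var}(\hat R) \le \frac{n_1+n_2-1}{n_1 n_2}R(1-R). \qquad (5.9)$$
Recalling that $0 \le R \le 1$ we have $R(1-R) \le 1/4$, and since $(n_1+n_2)/(n_1 n_2) \le 2/\min(n_1,n_2)$, inequality (5.9) implies that $\mathrm{Var}(\hat R) \le [2\min(n_1,n_2)]^{-1}$. Van Dantzig (1951) provides a sharper upper bound on $\mathrm{Var}(\hat R)$:
$$\mathrm{Var}(\hat R) \le [4\min(n_1,n_2)]^{-1}. \qquad (5.10)$$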
Note also that under the assumption that $F_X = F_Y$, we have $v_1 = v_2 = 1/3$ and hence in this case
$$\mathrm{Var}(\hat R) = \frac{n_1+n_2+1}{12\,n_1 n_2}.$$
Equation (5.7) and inequality (5.10) show that $\hat R$ is the UMVUE of R with the variance of the order $O(1/\min(n_1,n_2))$. Moreover, the estimator $\hat R$ possesses further useful properties: it is admissible and minimax under a wide class of loss functions. To clarify the importance of these features, we shall introduce the following definitions.
Definition 5.1 A loss function L is any function of two variables x and y satisfying $L(x,y) \ge 0$ such that $L(x,x) = 0$. The expected value of the loss function with respect to the sample distribution, $EL(\hat R, R)$, is called the risk. Quite often, a loss function is chosen to depend on the difference of arguments, i.e. $L(x,y) = l(x-y)$ where $l(z)$ is convex.
Definition 5.2 An estimator $\hat R$ of R is said to be inadmissible for the loss function $L(x,y)$ if there exists another estimator $R'$ of R such that $EL(R',R) \le EL(\hat R,R)$ for any $F_X$ and $F_Y$, with strict inequality for at least one pair $F_X$, $F_Y$, and it is admissible if no such estimator $R'$ is available.
Definition 5.3 An estimator $\hat R$ of R satisfying
$$\inf_{R'}\sup_{F_X,F_Y} EL(R',R) = \sup_{F_X,F_Y} EL(\hat R,R),$$
where the infimum is taken over all possible estimators of R, is called a minimax estimator.
Theorem 5.1 (Yu and Govindarajulu (1995)). Let $\underline X$ and $\underline Y$ be two independent samples from $F_X$ and $F_Y$, respectively. Then the UMVUE $\hat R$ of R is admissible under any loss function of the form $(\hat R - R)^2 h(F_X,F_Y)$ where $h(x,y)$ is any positive function. In particular, $\hat R$ is minimax under the loss function $(\hat R - R)^2\sigma^{-2}(F_X,F_Y)$ where $\sigma^2(F_X,F_Y) = \mathrm{Var}(\hat R)$ is given by the expression (5.7).
Theorem 5.1 states that whenever a loss function is the product of $(\hat R - R)^2$ and a positive function of $F_X$ and $F_Y$, there is no estimator which is superior to $\hat R$ for all possible distributions $F_X$ and $F_Y$. Moreover, if the loss function is of the form $(\hat R - R)^2\sigma^{-2}(F_X,F_Y)$, so that the risk of $\hat R$ is constant, the estimator $\hat R$ minimizes the maximum risk among all possible estimators of R.
5.2
Estimation of the Variance of $\hat R$
5.2.1
Estimators Based on Rank Statistics
In the preceding section we have discussed upper bounds on $\mathrm{Var}(\hat R)$. However, to assess appropriately the quality of $\hat R$ in any particular situation it is desirable to know the value of $\mathrm{Var}(\hat R)$ as accurately as possible. This can be achieved by estimating $\mathrm{Var}(\hat R)$. Sen (1960) was the first to construct an estimator of $\mathrm{Var}(\hat R)$. Denote
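$$U_{10}(X_i) = \frac{1}{n_2}\sum_{j=1}^{n_2} I(X_i < Y_j), \qquad U_{01}(Y_j) = \frac{1}{n_1}\sum_{i=1}^{n_1} I(X_i < Y_j) \qquad (5.11)$$
and, furthermore, let
$$S_{10}^2 = \frac{1}{n_1-1}\sum_{i=1}^{n_1}\left[U_{10}(X_i) - \hat R\right]^2, \qquad S_{01}^2 = \frac{1}{n_2-1}\sum_{j=1}^{n_2}\left[U_{01}(Y_j) - \hat R\right]^2. \qquad (5.12)$$
It follows directly from (5.3) and (5.11) that
$$\hat R = \frac{1}{n_1}\sum_{i=1}^{n_1} U_{10}(X_i) = \frac{1}{n_2}\sum_{j=1}^{n_2} U_{01}(Y_j).$$
Sen (1960) proposes a consistent estimator for $\mathrm{Var}(\hat R)$ of the form
$$S_1^2 = \frac{S_{10}^2}{n_1} + \frac{S_{01}^2}{n_2}. \qquad (5.13)$$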
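The estimator (5.13) translates almost line by line into code. The sketch below is a direct, unoptimized illustration (the function name is hypothetical, and both samples are assumed to contain at least two observations).

```python
def sen1960_variance(x, y):
    """Return (R_hat, S_1^2) following (5.11)-(5.13)."""
    n1, n2 = len(x), len(y)
    # U_10(X_i): fraction of Y's exceeding X_i; U_01(Y_j): fraction of X's below Y_j.
    u10 = [sum(xi < yj for yj in y) / n2 for xi in x]
    u01 = [sum(xi < yj for xi in x) / n1 for yj in y]
    r_hat = sum(u10) / n1
    s10 = sum((u - r_hat) ** 2 for u in u10) / (n1 - 1)
    s01 = sum((u - r_hat) ** 2 for u in u01) / (n2 - 1)
    return r_hat, s10 / n1 + s01 / n2


r_hat, s1_sq = sen1960_variance([1, 3], [2, 4])
print(r_hat, s1_sq)
```

For the toy samples (1, 3) and (2, 4) one has $U_{10} = (1, 1/2)$ and $U_{01} = (1/2, 1)$, so $\hat R = 0.75$ and $S_1^2 = 0.125$.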
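To obtain a representation of $S_1^2$ in terms of the ranks, let the observations in each sample be indexed in increasing order and denote the ranks of $X_i$ and $Y_j$ in the combined sample by $T_{X_i}$ and $T_{Y_j}$, respectively. Then $U_{10}(X_i) = (n_2 - T_{X_i} + i)/n_2$, $i = 1, \ldots, n_1$, and similarly, $U_{01}(Y_j) = (T_{Y_j} - j)/n_1$, $j = 1, \ldots, n_2$. Hence, using the rank representations (5.4) for $\hat R$, one can write $S_{10}^2$ and $S_{01}^2$ in the form
$$S_{10}^2 = \frac{1}{n_1-1}\sum_{i=1}^{n_1}\left[\frac{n_2 - T_{X_i} + i}{n_2} - \hat R\right]^2, \qquad S_{01}^2 = \frac{1}{n_2-1}\sum_{j=1}^{n_2}\left[\frac{T_{Y_j} - j}{n_1} - \hat R\right]^2.$$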
The advantage of the estimator $S_1^2$ is its simplicity; unfortunately, its drawback is that it is biased, as will be shown below. This fact prompted Sen (1967) to introduce another estimator of $\mathrm{Var}(\hat R)$ which is unbiased. He uses the functions
$$U_1(X_i,X_j,Y_k) = \tfrac12\left[I(X_i < Y_k)I(Y_k < X_j) + I(X_j < Y_k)I(Y_k < X_i)\right],$$
$$U_2(X_i,Y_j,Y_k) = \tfrac12\left[I(Y_k < X_i)I(X_i < Y_j) + I(Y_j < X_i)I(X_i < Y_k)\right],$$
$$U_3(X_i,X_j,Y_k,Y_l) = \tfrac12\left[I(X_i < Y_k)I(Y_l < X_j) + I(X_j < Y_l)I(Y_k < X_i)\right],$$
where, as above, $I(\cdot)$ is an indicator function, and constructs the statistics
$$V_1 = \frac{\sum_{i \ne j}\sum_{k} U_1(X_i,X_j,Y_k)}{n_1(n_1-1)n_2}, \quad V_2 = \frac{\sum_{i}\sum_{j \ne k} U_2(X_i,Y_j,Y_k)}{n_1 n_2(n_2-1)}, \quad V_3 = \frac{\sum_{i \ne j}\sum_{k \ne l} U_3(X_i,X_j,Y_k,Y_l)}{n_1(n_1-1)n_2(n_2-1)}. \qquad (5.14)$$
Note that the expressions for $V_1$, $V_2$ and $V_3$ are somewhat complex and may require a considerable amount of computation (especially for large $n_1$ and $n_2$). To obtain the expression for the estimator of the variance observe that
$$EU_1(X_i,X_j,Y_k) = \int_{-\infty}^{\infty} F_X(x)[1 - F_X(x)]\,dF_Y(x) = R - v_1.$$
Similarly, $EU_2(X_i,Y_j,Y_k) = R - v_2$, where $v_1$ and $v_2$ are defined in (5.6), and $EU_3(X_i,X_j,Y_k,Y_l) = R(1-R)$. Hence, $V_1$, $V_2$ and $V_3$ are unbiased estimators of $R - v_1$, $R - v_2$ and $R(1-R)$, respectively, so that (5.7) implies that an unbiased estimator of $\mathrm{Var}(\hat R)$ is of the form
$$S_2^2 = \frac{1}{n_1 n_2}\left[(n_1+n_2-1)V_3 - (n_1-1)V_1 - (n_2-1)V_2\right]. \qquad (5.15)$$
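A brute-force Python sketch of (5.14) and (5.15) is given below. It is only an illustration (the function names are hypothetical) and enumerates all index combinations, so it is practical for small samples only; both samples are assumed to have at least two observations.

```python
from itertools import permutations


def _lt(a, b):
    """Indicator I(a < b)."""
    return 1.0 if a < b else 0.0


def sen1967_variance(x, y):
    """Unbiased estimator S_2^2 of (5.15) by direct enumeration of (5.14)."""
    n1, n2 = len(x), len(y)
    v1 = sum(0.5 * (_lt(xi, yk) * _lt(yk, xj) + _lt(xj, yk) * _lt(yk, xi))
             for xi, xj in permutations(x, 2) for yk in y)
    v1 /= n1 * (n1 - 1) * n2
    v2 = sum(0.5 * (_lt(yk, xi) * _lt(xi, yj) + _lt(yj, xi) * _lt(xi, yk))
             for xi in x for yj, yk in permutations(y, 2))
    v2 /= n1 * n2 * (n2 - 1)
    v3 = sum(0.5 * (_lt(xi, yk) * _lt(yl, xj) + _lt(xj, yl) * _lt(yk, xi))
             for xi, xj in permutations(x, 2) for yk, yl in permutations(y, 2))
    v3 /= n1 * (n1 - 1) * n2 * (n2 - 1)
    return ((n1 + n2 - 1) * v3 - (n1 - 1) * v1 - (n2 - 1) * v2) / (n1 * n2)
```

Since the estimator depends on the data only through the ordering of the pooled sample, its unbiasedness under $F_X = F_Y$ can be checked exactly for tiny samples: averaging it over all equally likely orderings with $n_1 = n_2 = 2$ should give $(n_1+n_2+1)/(12 n_1 n_2) = 5/48$.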
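Sen (1967) also provides rank representations of $V_1$, $V_2$ and $V_3$. Hollander and Wolfe (1999) provide an illuminating example of an application of Sen's (1967) estimators in the second edition of their well-known book. As already pointed out, the estimators (5.13) and (5.15) both possess certain disadvantages: the first one is biased and the second one is somewhat laborious for practical purposes. Hilgers (1981) proposes an alternative, easily computable estimator of $\mathrm{Var}(\hat R)$ which is unbiased.
Denote
$$V_{01}^2 = \frac{1}{n_2}\sum_{j=1}^{n_2}\left[U_{01}(Y_j)\right]^2, \qquad V_{10}^2 = \frac{1}{n_1}\sum_{i=1}^{n_1}\left[1 - U_{10}(X_i)\right]^2,$$
where $U_{01}(Y_j)$ and $U_{10}(X_i)$ are defined in (5.11). The quantities $V_{01}^2$ and $V_{10}^2$ can be written as
$$V_{01}^2 = \frac{1}{n_1^2 n_2}\sum_{k=1}^{n_2}\left(T_{Y(k)} - k\right)^2, \qquad V_{10}^2 = \frac{1}{n_1 n_2^2}\sum_{k=1}^{n_1}\left(T_{X(k)} - k\right)^2,$$
where $T_{X(k)}$ and $T_{Y(k)}$ are the k-th smallest ranks in samples $\underline X$ and $\underline Y$, respectively. An unbiased estimator of $\mathrm{Var}(\hat R)$ is then given by
$$S_3^2 = \frac{n_2\left[V_{10}^2 - (1-\hat R)^2\right] + n_1\left[V_{01}^2 - \hat R^2\right] - \hat R(1-\hat R)}{(n_1-1)(n_2-1)}. \qquad (5.16)$$
Unbiasedness of the estimator $S_3^2$ follows from the relations
$$EV_{01}^2 = \frac{R}{n_1} + \frac{n_1-1}{n_1}\int_{-\infty}^{\infty}[F_X(x)]^2\,dF_Y(x) = \frac{R}{n_1} + \frac{n_1-1}{n_1}v_1; \qquad (5.17)$$
similarly
$$EV_{10}^2 = \frac{1-R}{n_2} + \frac{n_2-1}{n_2}\int_{-\infty}^{\infty}[F_Y(x)]^2\,dF_X(x) = n_2^{-1}\left[1 - R - (n_2-1)(2R-1)\right] + n_2^{-1}(n_2-1)v_2. \qquad (5.18)$$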
Now, since $E(\hat R^2) = \mathrm{Var}(\hat R) + R^2$ where $\mathrm{Var}(\hat R)$ is given in (5.7), equalities (5.16) - (5.18) imply that $ES_3^2 = \mathrm{Var}(\hat R)$. Observe that the estimator $S_3^2$ enjoys both the simplicity of Sen's first (1960) estimator and the unbiasedness of the second. In fact, there exists a direct correspondence between the estimators $S_1^2$ and $S_3^2$. Indeed, one can express $S_1^2$ as
$$S_1^2 = [n_1(n_1-1)]^{-1}\sum_{i=1}^{n_1}\left[(1 - U_{10}(X_i)) - (1-\hat R)\right]^2 + [n_2(n_2-1)]^{-1}\sum_{j=1}^{n_2}\left[U_{01}(Y_j) - \hat R\right]^2$$
$$= (n_1-1)^{-1}\left[V_{10}^2 - (1-\hat R)^2\right] + (n_2-1)^{-1}\left[V_{01}^2 - \hat R^2\right]$$
$$= S_3^2 + [(n_1-1)(n_2-1)]^{-1}\left[1 - \hat R(1-\hat R) - V_{10}^2 - V_{01}^2\right].$$
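The correspondence between the estimators of Sen (1960) and Hilgers (1981) can be checked numerically. The sketch below assumes the forms of $V_{10}^2$, $V_{01}^2$ and of the two estimators used in this section; the function names are illustrative, and the exact unbiasedness check relies on the fact that under $F_X = F_Y$ all orderings of the pooled sample are equally likely.

```python
from itertools import permutations


def sen_and_hilgers(x, y):
    """Return (S_1^2, S_3^2, difference term) for two samples of size >= 2."""
    n1, n2 = len(x), len(y)
    u10 = [sum(xi < yj for yj in y) / n2 for xi in x]
    u01 = [sum(xi < yj for xi in x) / n1 for yj in y]
    r = sum(u10) / n1
    v10 = sum((1.0 - u) ** 2 for u in u10) / n1
    v01 = sum(u ** 2 for u in u01) / n2
    d = (n1 - 1) * (n2 - 1)
    s1 = (v10 - (1 - r) ** 2) / (n1 - 1) + (v01 - r ** 2) / (n2 - 1)
    s3 = (n2 * (v10 - (1 - r) ** 2) + n1 * (v01 - r ** 2) - r * (1 - r)) / d
    gap = (1 - r * (1 - r) - v10 - v01) / d
    return s1, s3, gap


def mean_s3_over_orderings(n1, n2):
    """Exact mean of S_3^2 under F_X = F_Y, by enumerating all orderings."""
    vals = [sen_and_hilgers(p[:n1], p[n1:])[1]
            for p in permutations(range(1, n1 + n2 + 1))]
    return sum(vals) / len(vals)


s1, s3, gap = sen_and_hilgers([0.3, 1.7, 2.2], [0.9, 2.5, 3.1, 1.1])
print(s1 - s3 - gap)                 # zero up to rounding
print(mean_s3_over_orderings(2, 2))  # compare with (n1+n2+1)/(12 n1 n2) = 5/48
```

The enumeration grows factorially, so the exact check is meant for very small sample sizes only.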
The relation between $S_1^2$ and $S_3^2$ derived above shows that the estimator $S_1^2$ is biased with
$$\mathrm{Bias}(S_1^2) = [(n_1-1)(n_2-1)]^{-1}E\left[1 - \hat R(1-\hat R) - V_{10}^2 - V_{01}^2\right] = [n_1 n_2]^{-1}\left[R(1-R) - (v_1 - R^2) - (v_2 - R^2)\right]. \qquad (5.19)$$
(The last equality in (5.19) follows directly from (5.7), (5.17) and (5.18).) Sen (1967) has shown that the bias (5.19) is always nonnegative and of the order $O([n_1 n_2]^{-1})$; consequently $S_1^2$ slightly overestimates the variance. However, this small positive bias overcomes a somewhat undesirable property of Hilgers' estimator $S_3^2$, namely, that it vanishes whenever $\hat R = 0$ or 1. (The situations when $\hat R = 0$ or 1 can be viewed as being of minor practical importance, since they quite likely imply that the probability R itself is 0 or 1, which is of little interest in practice.)
5.2.2
Estimators Based on Empirical Distribution Functions
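Definition 5.4 An empirical distribution function (edf) of $F_X(x)$ based on a random sample $\underline X = (X_1, \ldots, X_n)$ is the step function
$$\hat F_X(x) = \frac{1}{n}\sum_{i=1}^{n} I(X_i \le x). \qquad (5.20)$$
It is well known (see e.g. Hogg and Craig (1978)) that $E\hat F_X(x) = F_X(x)$ and $\mathrm{Var}[\hat F_X(x)] = n^{-1}F_X(x)[1 - F_X(x)]$. The edf can thus be used to estimate $\mathrm{Var}(\hat R)$. Indeed, note that
$$\hat v_1 = \int_{-\infty}^{\infty}[\hat F_X(x)]^2\,d\hat F_Y(x), \qquad \hat v_2 = \int_{-\infty}^{\infty}[1 - \hat F_Y(x)]^2\,d\hat F_X(x)$$
are consistent estimators of $v_1$ and $v_2$, respectively, so that
$$S_4^2 = \frac{1}{n_1 n_2}\left[\hat R + (n_1-1)\hat v_1 + (n_2-1)\hat v_2 - (n_1+n_2-1)\hat R^2\right] \qquad (5.21)$$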
is a consistent estimator of the variance. The estimator $S_4^2$ may seem to be somewhat cumbersome, and Govindarajulu (1968) suggested another estimator $S_5^2$ which is also based on the edfs. Denote
$$v_3 = \int_{-\infty}^{\infty}[F_Y(x)]^2\,dF_X(x), \qquad \hat v_3 = \int_{-\infty}^{\infty}[\hat F_Y(x)]^2\,d\hat F_X(x).$$
Evidently, $v_2 = 2R - 1 + v_3$ and $\hat v_3$ is a consistent estimator of $v_3$. Direct calculations show that
$$\mathrm{Var}(\hat R) = (n_1 n_2)^{-1}\left\{n_1[v_1 - R^2] + n_2[v_3 - (1-R)^2] + [1 - R(1-R) - v_1 - v_3]\right\}. \qquad (5.22)$$
If $n_1$ and $n_2$ are sufficiently large, the last term on the right-hand side of (5.22) is substantially smaller than the sum of the first two terms. Hence,
$$\mathrm{Var}(\hat R) \approx (n_1 n_2)^{-1}\left(n_1[v_1 - R^2] + n_2[v_3 - (1-R)^2]\right)$$
and
$$S_5^2 = (n_1 n_2)^{-1}\left(n_1[\hat v_1 - \hat R^2] + n_2[\hat v_3 - (1-\hat R)^2]\right) \qquad (5.23)$$
is a consistent estimator of $\mathrm{Var}(\hat R)$.
5.2.3
Jackknife Estimators
It may be helpful to first briefly review the concept of a jackknife estimator. It was introduced by Quenouille (1956) and extended by Tukey (1958). It was popularized by Gray and Schucany (1972) and has since become one of the most popular estimators in statistical practice. Suppose that we estimate a functional $\theta = \theta(F)$ of the cdf F by means of its empirical counterpart
$$\hat\theta = \hat\theta(X_1, \ldots, X_n) = \theta(\hat F) \qquad (5.24)$$
where $\hat F$ is the edf based on the observations $X_1, \ldots, X_n$. Let
$$\hat\theta_{(i)} = \hat\theta(X_1, \ldots, X_{i-1}, X_{i+1}, \ldots, X_n) = \theta(\hat F_{(i)})$$
where $\hat F_{(i)}$ is the edf based on the $(n-1)$ observations $(X_1, \ldots, X_{i-1}, X_{i+1}, \ldots, X_n)$ (omitting the observation $X_i$), and
$$\hat\theta_{(\cdot)} = \frac{1}{n}\sum_{j=1}^{n}\hat\theta_{(j)}. \qquad (5.25)$$
Then the estimator of $\mathrm{Var}(\hat\theta)$, called the jackknife estimator, is of the form (see e.g. Efron (1982))
$$\widehat{\mathrm{Var}}_J(\hat\theta) = \frac{n-1}{n}\sum_{j=1}^{n}\left[\hat\theta_{(j)} - \hat\theta_{(\cdot)}\right]^2. \qquad (5.26)$$
The reasoning behind the estimator (5.26) is as follows. The variance of the estimator $\hat\theta(X_1, \ldots, X_n)$ given by (5.24) is approximately equal to $c_0/n$ for large values of n and some constant $c_0$. On the other hand, $\hat\theta_{(j)}$, $j = 1, \ldots, n$, can be treated as a "sample" of estimators $\hat\theta$ based on $n-1$ observations, and $\hat\theta_{(\cdot)}$ defined by (5.25) plays the role of the mean of this "sample". The sum of squared deviations $\sum_{j=1}^{n}[\hat\theta_{(j)} - \hat\theta_{(\cdot)}]^2$ therefore measures the variability of the estimator based on $(n-1)$ observations, and the factor $(n-1)/n$ in (5.26) rescales it to the sample size n; for the sample mean, (5.26) reproduces exactly the usual estimator $s^2/n$. Applying these ideas to the estimation of $\mathrm{Var}(\hat R)$ we introduce the estimators
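$$U_{X(i)} = \frac{1}{(n_1-1)n_2}\sum_{k \ne i}\sum_{j=1}^{n_2} I(X_k < Y_j), \qquad U_{Y(j)} = \frac{1}{n_1(n_2-1)}\sum_{i=1}^{n_1}\sum_{l \ne j} I(X_i < Y_l)$$
based on $(n_1-1)n_2$ and $n_1(n_2-1)$ pairs of observations, respectively, and observe that their averages are $U_{X(\cdot)} = U_{Y(\cdot)} = \hat R$. Hence, the jackknife estimator of $\mathrm{Var}(\hat R)$ is
$$S_6^2 = \frac{n_1-1}{n_1}\sum_{i=1}^{n_1}\left[U_{X(i)} - \hat R\right]^2 + \frac{n_2-1}{n_2}\sum_{j=1}^{n_2}\left[U_{Y(j)} - \hat R\right]^2. \qquad (5.27)$$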
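A compact Python sketch of the jackknife estimator (5.27) follows. It is an illustration only: the function name is hypothetical, and the leave-one-out estimates are obtained by subtracting the contribution of the omitted observation from W rather than by recomputing the statistic.

```python
def jackknife_variance(x, y):
    """Return (R_hat, S_6^2) with S_6^2 as in (5.27); needs n1, n2 >= 2."""
    n1, n2 = len(x), len(y)
    # row[i]: number of Y's above X_i; col[j]: number of X's below Y_j.
    row = [sum(xi < yj for yj in y) for xi in x]
    col = [sum(xi < yj for xi in x) for yj in y]
    w = sum(row)
    r_hat = w / (n1 * n2)
    # Leave-one-out estimates: drop one X (resp. one Y) and renormalize.
    ux = [(w - r) / ((n1 - 1) * n2) for r in row]
    uy = [(w - c) / (n1 * (n2 - 1)) for c in col]
    s6 = ((n1 - 1) / n1 * sum((u - r_hat) ** 2 for u in ux)
          + (n2 - 1) / n2 * sum((u - r_hat) ** 2 for u in uy))
    return r_hat, s6


print(jackknife_variance([1, 3], [2, 4]))
```

For the toy samples (1, 3) and (2, 4) the leave-one-out values are (0.5, 1.0) and (1.0, 0.5), their averages equal $\hat R = 0.75$, and the estimate is 0.125.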
The estimator (5.27) was originally introduced by Cheng and Chao (1984), who provide a slightly different but equivalent representation of (5.27), and was studied by Shirahata (1993).
5.3
Interval Estimation of R
In this section we shall consider the construction of confidence intervals for R based on $\hat R$. We shall briefly review the following approaches to interval estimation of R: 1) confidence intervals based on the Chebyshev (or the Hoeffding) inequality or the Kolmogorov-Smirnov statistics; 2) asymptotic confidence intervals based on a normal approximation; 3) confidence intervals based on pivotal quantities; 4) confidence intervals constructed by means of the bootstrap method.
5.3.1
Confidence Intervals Based on Classical Inequalities
Denote $\nu = \min(n_1,n_2)$ and recall that $\hat R$ is the unbiased estimator of R with the variance bounded by $(4\nu)^{-1}$ (cf. (5.10)). Therefore, for any $\varepsilon > 0$, the Uspensky (see e.g. Bennett (1962)) and the Chebyshev inequalities yield
$$P(R < \hat R + \varepsilon) \ge 1 - (1 + 4\nu\varepsilon^2)^{-1} \qquad (5.28)$$
and
$$P(|\hat R - R| < \varepsilon) \ge 1 - (4\nu\varepsilon^2)^{-1}, \qquad (5.29)$$
respectively. The two-sided confidence interval (5.29) was originally suggested by Ury (1972), while the one-sided version (5.28) appears in Yang and Mo (1985), who also derived the Hoeffding-type bounds for R:
$$P(R < \hat R + \varepsilon) \ge 1 - e^{-2\nu\varepsilon^2}, \qquad (5.30)$$
$$P(|\hat R - R| < \varepsilon) \ge 1 - 2e^{-2\nu\varepsilon^2} \qquad (5.31)$$
(see Hoeffding (1963)). Combining (5.28) with (5.30) and (5.29) with (5.31), we obtain the result derived in Yang and Mo (1985):
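$$P(R < \hat R + \varepsilon) \ge \max\left\{1 - (1 + 4\nu\varepsilon^2)^{-1},\ 1 - e^{-2\nu\varepsilon^2}\right\}, \qquad P(|\hat R - R| < \varepsilon) \ge \max\left\{1 - (4\nu\varepsilon^2)^{-1},\ 1 - 2e^{-2\nu\varepsilon^2}\right\}.$$
Therefore, the one-sided and the two-sided $(1-\gamma)$-confidence intervals for R are
$$P(R < \hat R + \varepsilon_{\gamma,1}) \ge 1 - \gamma, \qquad P(\hat R - \varepsilon_{\gamma,2} < R < \hat R + \varepsilon_{\gamma,2}) \ge 1 - \gamma, \qquad (5.32)$$
where
$$\varepsilon_{\gamma,1} = \min\left\{\sqrt{\frac{\gamma^{-1} - 1}{4\nu}},\ \sqrt{\frac{\ln(\gamma^{-1})}{2\nu}}\right\}, \qquad \varepsilon_{\gamma,2} = \min\left\{\sqrt{\frac{1}{4\nu\gamma}},\ \sqrt{\frac{\ln(2\gamma^{-1})}{2\nu}}\right\}. \qquad (5.33)$$
Note that the confidence bounds in (5.32) involve the size of the smaller sample only, hence they would not perform very well if the difference in sample sizes $|n_1 - n_2|$ is large.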
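The half-widths of the inequality-based intervals are obtained by setting each lower bound in (5.28) - (5.31) equal to $1 - \gamma$, solving for $\varepsilon$, and taking the smaller solution. A minimal Python sketch (the function names are illustrative):

```python
import math


def halfwidth_one_sided(n1, n2, gamma):
    """Smaller of the Uspensky-type (5.28) and Hoeffding-type (5.30) half-widths."""
    nu = min(n1, n2)
    uspensky = math.sqrt((1.0 / gamma - 1.0) / (4.0 * nu))
    hoeffding = math.sqrt(math.log(1.0 / gamma) / (2.0 * nu))
    return min(uspensky, hoeffding)


def halfwidth_two_sided(n1, n2, gamma):
    """Smaller of the Chebyshev (5.29) and Hoeffding-type (5.31) half-widths."""
    nu = min(n1, n2)
    chebyshev = 1.0 / math.sqrt(4.0 * nu * gamma)
    hoeffding = math.sqrt(math.log(2.0 / gamma) / (2.0 * nu))
    return min(chebyshev, hoeffding)


print(halfwidth_one_sided(25, 40, 0.05), halfwidth_two_sided(25, 40, 0.05))
```

For small $\gamma$ the Hoeffding-type bound gives the shorter interval, whereas for large $\gamma$ (for example $\gamma = 0.5$ in the one-sided case) the moment-based bound is shorter.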
5.3.2
Confidence Intervals Based on the Kolmogorov-Smirnov Statistics
Confidence intervals for R based on the Kolmogorov-Smirnov statistics were derived by Birnbaum and McCarty (1958) and are chronologically the first ones. Following these authors directly and using (2.6) and integration by parts, we represent $\hat R - R$ as
$$\hat R - R = \int_{-\infty}^{\infty}\hat F_X(x)\,d\hat F_Y(x) - \int_{-\infty}^{\infty}F_X(x)\,dF_Y(x)$$
$$= \int_{-\infty}^{\infty}\hat F_X(x)\,d[\hat F_Y(x) - F_Y(x)] + \int_{-\infty}^{\infty}(\hat F_X(x) - F_X(x))\,dF_Y(x)$$
$$= \int_{-\infty}^{\infty}[F_Y(x) - \hat F_Y(x)]\,d\hat F_X(x) + \int_{-\infty}^{\infty}(\hat F_X(x) - F_X(x))\,dF_Y(x). \qquad (5.34)$$
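Denoting the Kolmogorov-Smirnov statistics by
$$D^-_{n_1} = \sup_{z \in (-\infty,\infty)}[F_X(z) - \hat F_X(z)], \quad D^+_{n_1} = \sup_{z \in (-\infty,\infty)}[\hat F_X(z) - F_X(z)], \quad D^+_{n_2} = \sup_{z \in (-\infty,\infty)}[\hat F_Y(z) - F_Y(z)]$$
(see e.g. Sprent (1989)), we obtain from (5.34) that $P(R < \hat R + \varepsilon) \ge P(D^-_{n_1} + D^+_{n_2} < \varepsilon)$. It is well known (see e.g. Csaki (1984)) that for any z, $P(D^-_n < z) = P(D^+_n < z) = F^*_n(z)$ where $F^*_n(z)$ is a cdf which depends solely on n but not on $F_X$ or $F_Y$. Thus,
$$P(R < \hat R + \varepsilon) \ge P(D^+_{n_1} + D^+_{n_2} < \varepsilon) = F^*_{n_1,n_2}(\varepsilon) \qquad (5.35)$$
where $F^*_{n_1,n_2}$ is the convolution of the cdfs $F^*_{n_1}$ and $F^*_{n_2}$:
$$F^*_{n_1,n_2}(\varepsilon) = \int_0^{\varepsilon}F^*_{n_1}(\varepsilon - z)\,dF^*_{n_2}(z). \qquad (5.36)$$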
Using the exact expression for $F^*_n(z)$,
$$F^*_n(z) = 1 - z\sum_{j=0}^{[n(1-z)]}\binom{n}{j}\left(\frac{j}{n} + z\right)^{j-1}\left(1 - z - \frac{j}{n}\right)^{n-j}, \quad z \in [0,1], \qquad (5.37)$$
where $[n(1-z)]$ is the integer part of $n(1-z)$, together with (5.36), one can evaluate numerically the right-hand side of (5.35). However, when reversing the problem, i.e. if the confidence coefficient $(1-\gamma)$ is given and we are required to find $\varepsilon_{\gamma,3}$ such that $F^*_{n_1,n_2}(\varepsilon_{\gamma,3}) \ge 1 - \gamma$, the calculations based on (5.36) and (5.37) become very cumbersome. This prompted Birnbaum and McCarty (1958) to derive an asymptotic expression for $F^*_{n_1,n_2}$ as $\nu \to \infty$.
Theorem 5.2 If $\nu = \min(n_1,n_2) \to \infty$, then we have uniformly in $\varepsilon \in [0,1]$
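$$F^*_{n_1,n_2}(\varepsilon) - G_{n_1,n_2}(\varepsilon) \to 0,$$
where
$$G_{n_1,n_2}(\varepsilon) = 1 - \frac{n_1}{N}e^{-2n_2\varepsilon^2} - \frac{n_2}{N}e^{-2n_1\varepsilon^2} - \frac{2\sqrt{2\pi}\,n_1 n_2\,\varepsilon}{N^{3/2}}\,e^{-2n_1 n_2\varepsilon^2/N}\left[\Phi\left(\frac{2n_2\varepsilon}{\sqrt N}\right) - \Phi\left(-\frac{2n_1\varepsilon}{\sqrt N}\right)\right]; \qquad (5.38)$$
as above, $N = n_1 + n_2$ and $\Phi(\cdot)$ is the standard normal cdf.
Proof. The proof of this statement is based on the classical result for the Kolmogorov-Smirnov statistic (see e.g. Csaki (1984)) which states that
$$\lim_{n \to \infty}P(D^+_n < z/\sqrt n) = 1 - e^{-2z^2} = L(z).$$
Therefore, $[F^*_n(z) - L(z\sqrt n)] \to 0$ uniformly in z. To complete the proof one needs only to note that
$$G_{n_1,n_2}(\varepsilon) = \int_0^{\varepsilon}L((\varepsilon - z)\sqrt{n_1})\,dL(z\sqrt{n_2}). \qquad (5.39)$$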
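The closed form of the limiting convolution can be checked against a direct numerical evaluation of the integral of $L((\varepsilon - z)\sqrt{n_1})$ with respect to $L(z\sqrt{n_2})$. The sketch below is illustrative only: the function names are hypothetical, the closed form is the one obtained by carrying out that integration with $L(z) = 1 - e^{-2z^2}$, and a midpoint rule is used for the numerical integral.

```python
import math
from statistics import NormalDist


def limit_cdf_closed_form(eps, n1, n2):
    """Closed form of the convolution of L(z sqrt(n1)) and L(z sqrt(n2))."""
    n = n1 + n2
    phi = NormalDist().cdf
    root_n = math.sqrt(n)
    last = (2.0 * math.sqrt(2.0 * math.pi) * n1 * n2 * eps / n ** 1.5
            * math.exp(-2.0 * n1 * n2 * eps ** 2 / n)
            * (phi(2.0 * n2 * eps / root_n) - phi(-2.0 * n1 * eps / root_n)))
    return (1.0 - n1 / n * math.exp(-2.0 * n2 * eps ** 2)
            - n2 / n * math.exp(-2.0 * n1 * eps ** 2) - last)


def limit_cdf_numeric(eps, n1, n2, steps=20000):
    """Midpoint-rule evaluation of the same convolution integral."""
    h = eps / steps
    total = 0.0
    for k in range(steps):
        z = (k + 0.5) * h
        density = 4.0 * n2 * z * math.exp(-2.0 * n2 * z * z)   # d/dz of L(z sqrt(n2))
        total += (1.0 - math.exp(-2.0 * n1 * (eps - z) ** 2)) * density * h
    return total


print(limit_cdf_closed_form(0.3, 20, 30), limit_cdf_numeric(0.3, 20, 30))
```

The two values should agree to several decimal places; the closed form is symmetric in $n_1$ and $n_2$, vanishes at $\varepsilon = 0$ and tends to one as $\varepsilon$ grows.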
Formulas (5.35) - (5.37) and Theorem 5.2 provide a one-sided confidence interval for R. To obtain a two-sided confidence interval for R one is required to use the statistics
$$D_{n_1} = \sup_{z \in (-\infty,\infty)}|\hat F_X(z) - F_X(z)|, \qquad D_{n_2} = \sup_{z \in (-\infty,\infty)}|\hat F_Y(z) - F_Y(z)| \qquad (5.40)$$
with the limiting distribution (see e.g. Smirnov (1948))
$$\lim_{n \to \infty}P(\sqrt n\,D_n < z) = \sum_{k=-\infty}^{\infty}(-1)^k e^{-2k^2z^2} = L^*(z).$$
It is easy to observe from (5.34) and (5.40) that
$$P(|\hat R - R| < \varepsilon) \ge P(D_{n_1} + D_{n_2} < \varepsilon)$$
and using an analog of (5.39) it can be shown that the limiting distribution for $D_{n_1} + D_{n_2}$ is
$$Q_{n_1,n_2}(\varepsilon) = \int_0^{\varepsilon}L^*((\varepsilon - z)\sqrt{n_1})\,dL^*(z\sqrt{n_2}). \qquad (5.41)$$
A disadvantage of the Birnbaum-McCarty confidence intervals is that they are quite conservative. In fact, Ury (1972) comments that for $\gamma = 0.5$ and $n_1 = n_2$, Chebyshev's intervals require about 1/3 of the sample size needed for the Birnbaum-McCarty bound. This is due to the fact, as observed by Yang and Mo (1985), that the Birnbaum-McCarty bound is actually a simultaneous confidence bound for all
$$R(z) = P(X < Y + z) = \int_{-\infty}^{\infty}F_X(x+z)\,dF_Y(x), \quad z \in (-\infty,\infty),$$
rather than just for $R = R(0)$. Indeed, let
$$\hat R(z) = (n_1 n_2)^{-1}\sum_{i=1}^{n_1}\sum_{j=1}^{n_2}I(X_i < Y_j + z) = \int_{-\infty}^{\infty}\hat F_X(x+z)\,d\hat F_Y(x),$$
then, analogously to the derivation leading to (5.34),
$$\hat R(z) - R(z) = \int_{-\infty}^{\infty}[F_Y(x) - \hat F_Y(x)]\,d\hat F_X(x+z) + \int_{-\infty}^{\infty}(\hat F_X(x+z) - F_X(x+z))\,dF_Y(x).$$
Hence, $\sup_z[R(z) - \hat R(z)] \le D^-_{n_1} + D^+_{n_2}$ and thus $P(R(z) - \hat R(z) < \varepsilon \text{ for all } z) \ge P(D^+_{n_1} + D^+_{n_2} < \varepsilon) = F^*_{n_1,n_2}(\varepsilon)$ (compare with (5.35)). This simultaneous confidence interval interpretation was elaborated by Arsham (1986).
5.3.3
Confidence Intervals Based on the Asymptotic Normality
As mentioned above, both the inequality-based confidence intervals and the Birnbaum-McCarty confidence intervals are too conservative (the former because they use only the minimal sample size, the latter because they are actually simultaneous confidence intervals for $R(z)$). Moreover, both types of confidence intervals are geared towards the least favorable pair of distributions $F_X$ and $F_Y$, which may not be the case in practice. To remedy the situation, a number of researchers utilized with some success the asymptotic normality of $\hat R$ to obtain tighter confidence intervals. Indeed, if
$$\sigma^2 = \frac{n_1 n_2\,\mathrm{Var}(\hat R)}{N} = \frac{R + (n_1-1)v_1 + (n_2-1)v_2 - (N-1)R^2}{N} \qquad (5.42)$$
where, as before, $n_1 + n_2 = N$, then $\sqrt{n_1 n_2/N}\,(\hat R - R)$ is asymptotically normally distributed with zero mean and variance $\sigma^2$ as $n_1$ and $n_2$ tend to infinity. Since
$$\frac{1}{\nu} \le \frac{n_1 + n_2}{n_1 n_2} \le \frac{2}{\nu}$$
where, as in Section 5.3.1, $\nu = \min(n_1,n_2)$, one can use $\nu$ instead as the normalizing coefficient, namely, $\sqrt\nu(\hat R - R)$ will be asymptotically normally distributed with zero mean and variance $\nu\,\mathrm{Var}(\hat R)$. As noted by Govindarajulu (1968) this implies, using van Dantzig's (1951) upper bound (5.10), that $\nu\,\mathrm{Var}(\hat R) \le 1/4$. Hence, the one-sided and two-sided $(1-\gamma)$-confidence intervals are of the form
$$P\left(R < \hat R + \frac{z_\gamma}{2\sqrt\nu}\right) \ge 1 - \gamma, \qquad P\left(|\hat R - R| < \frac{z_{\gamma/2}}{2\sqrt\nu}\right) \ge 1 - \gamma, \qquad (5.43)$$
respectively, where $z_\alpha$ is the $(1-\alpha)$ quantile of the standard normal distribution.
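The half-widths in (5.43) depend on the data only through $\nu$, so they can be tabulated in advance. A minimal Python sketch (the function name is illustrative; the normal quantile is taken from the standard library):

```python
import math
from statistics import NormalDist


def conservative_normal_halfwidths(n1, n2, gamma=0.05):
    """One-sided and two-sided half-widths z/(2 sqrt(nu)) of (5.43)."""
    nu = min(n1, n2)
    nd = NormalDist()
    one_sided = nd.inv_cdf(1.0 - gamma) / (2.0 * math.sqrt(nu))
    two_sided = nd.inv_cdf(1.0 - gamma / 2.0) / (2.0 * math.sqrt(nu))
    return one_sided, two_sided


print(conservative_normal_halfwidths(25, 40))
```

With $\nu = 25$ and $\gamma = 0.05$ the two-sided half-width is about 0.196, compared with about 0.272 for the Hoeffding-type bound (5.31) at the same $\nu$ and $\gamma$.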
Even these refined intervals (5.43) are still quite conservative, since they are based on van Dantzig's upper bound (5.10) for the variance, which may sometimes be much larger than the actual value of the variance. Hilgers (1981) suggested using confidence intervals based on asymptotic normality and an estimator of the standardized variance $\sigma^2$. He proves the following statement.
Theorem 5.3 If $\min(n_1,n_2) \to \infty$ and $S_N^2$ is a sequence of estimators for $\mathrm{Var}(\hat R)$ such that $(n_1 n_2/N)[S_N^2 - \mathrm{Var}(\hat R)]$ converges in probability to zero, then $(\hat R - R)/S_N$ is asymptotically normal with zero mean and unit variance.
One could verify that the estimators $S_j^2$, $j = 1, \ldots, 6$, in Section 5.2 (equations (5.13), (5.15), (5.16), (5.21), (5.23) and (5.27)) satisfy the conditions of Theorem 5.3, so that the asymptotic $(1-\gamma)$-confidence intervals can be expressed as
$$P(R < \hat R + z_\gamma S_j) \ge 1 - \gamma, \qquad P(|\hat R - R| < z_{\gamma/2}S_j) \ge 1 - \gamma, \qquad (5.44)$$
for all $j = 1, \ldots, 6$. The reader may wish to consult Shirahata (1993) and Cheng and Chao (1984) for a comparison of the various types of confidence intervals.
5.3.4
Confidence Intervals Based on Pivotal Quantities
In this subsection we shall review two different methods of interval estimation of R based on pivotal quantities, suggested by Halperin et al. (1987) and Feigin et al. (2001). Further details on the Halperin et al. (1987) approach are given in Chapter 7. The technique of Halperin et al. (1987) is based on the inequality (5.9), which implies that
$$\mathrm{Var}(\hat R) = \frac{[(n_1+n_2-2)\rho + 1]R(1-R)}{n_1 n_2}$$
for some $\rho \in [0,1]$. From this equality it follows that
$$\rho = \frac{n_1 n_2\,\mathrm{Var}(\hat R) - R(1-R)}{(n_1+n_2-2)R(1-R)}$$
and, thus, $\rho$ can be estimated by
$$\hat\rho = \frac{n_1 n_2 S_j^2 - \hat R(1-\hat R)}{(n_1+n_2-2)\hat R(1-\hat R)},$$
where $S_j^2$ is one of the estimators of the variance presented in Section 5.2, $j = 1, \ldots, 6$. Halperin et al. (1987) use Govindarajulu's estimator $S_4^2$ (equation (5.21)) and propose the following pivotal quantity:
$$\frac{(\hat R - R)\sqrt{n_1 n_2}}{\sqrt{[\hat\rho(n_1+n_2-2) + 1]R(1-R)}}. \qquad (5.45)$$
It follows from (5.45) that with probability of at least $(1-\gamma)$, $0 < \gamma < 1$,
$$\frac{|\hat R - R|\sqrt{n_1 n_2}}{\sqrt{[\hat\rho(n_1+n_2-2) + 1]R(1-R)}} \le z_{\gamma/2} \qquad (5.46)$$
where, as above, $z_{\gamma/2}$ is the $(1-\gamma/2)$ quantile of the standard normal distribution. Solving the last inequality for R, we obtain
$$P\left(\frac{H_1 + 2\hat R - \sqrt{H_1^2 + 4H_1\hat R(1-\hat R)}}{2(H_1+1)} < R < \frac{H_1 + 2\hat R + \sqrt{H_1^2 + 4H_1\hat R(1-\hat R)}}{2(H_1+1)}\right) \ge 1 - \gamma, \qquad (5.47)$$
where
$$H_1 = \frac{z_{\gamma/2}^2[\hat\rho(n_1+n_2-2) + 1]}{n_1 n_2}.$$
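The steps leading to (5.47) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name is hypothetical, any variance estimate from Section 5.2 may be passed as `s2`, and the estimate of R is assumed to lie strictly between 0 and 1 so that the estimate of $\rho$ is defined.

```python
import math
from statistics import NormalDist


def halperin_interval(r_hat, s2, n1, n2, gamma=0.05):
    """Two-sided interval of the form (5.47) for a given variance estimate s2."""
    if not 0.0 < r_hat < 1.0:
        raise ValueError("r_hat must lie strictly between 0 and 1")
    z = NormalDist().inv_cdf(1.0 - gamma / 2.0)
    rho_hat = (n1 * n2 * s2 - r_hat * (1.0 - r_hat)) / ((n1 + n2 - 2) * r_hat * (1.0 - r_hat))
    h1 = z * z * (rho_hat * (n1 + n2 - 2) + 1.0) / (n1 * n2)
    root = math.sqrt(h1 * h1 + 4.0 * h1 * r_hat * (1.0 - r_hat))
    denom = 2.0 * (h1 + 1.0)
    return (h1 + 2.0 * r_hat - root) / denom, (h1 + 2.0 * r_hat + root) / denom


print(halperin_interval(0.7, 0.004, 20, 25))
```

Because the endpoints solve $(\hat R - R)^2 = H_1 R(1-R)$, the interval always contains the point estimate and stays inside [0, 1], unlike the symmetric normal interval (5.44).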
It is easy to verify that the expression under the square root in (5.47) is positive for all $0 < \gamma < 1$. A similar but slightly different method for the construction of a confidence interval for R has more recently been provided by Feigin et al. (2001). The authors note that $v_1$ and $v_2$ in (5.6) can be represented as
$$v_1 = P[\max(X_1,X_2) < Y_1] \quad \text{and} \quad v_2 = P[X_1 < \min(Y_1,Y_2)]$$
and, hence,
$$\hat v_1 = \frac{1}{n_1(n_1-1)n_2}\sum_{i \ne j}\sum_{k=1}^{n_2}I(\max(X_i,X_j) < Y_k), \qquad \hat v_2 = \frac{1}{n_1 n_2(n_2-1)}\sum_{i=1}^{n_1}\sum_{j \ne k}I(X_i < \min(Y_j,Y_k)) \qquad (5.48)$$
are unbiased estimators of $v_1$ and $v_2$, respectively. It is easy to observe that $\hat v_1$ and $\hat v_2$ are modified versions of Sen's statistics $V_1$ and $V_2$ defined in (5.14) with $EV_i = R - v_i$, $i = 1, 2$.
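Now let $\hat v_3 = (n_1-1)\hat v_1 + (n_2-1)\hat v_2$. Since $\mathrm{Var}(\hat R)$ is of the form (5.7), Feigin et al. (2001) propose the pivotal quantity
$$\frac{\sqrt{n_1 n_2}\,(\hat R - R)}{\sqrt{R + \hat v_3 - (n_1+n_2-1)R^2}} \qquad (5.49)$$
(cf. (5.45)). Thus, with probability of at least $(1-\gamma)$,
$$\frac{\sqrt{n_1 n_2}\,|\hat R - R|}{\sqrt{R + \hat v_3 - (n_1+n_2-1)R^2}} \le z_{\gamma/2}$$
(cf. (5.46)). Solving the last inequality for R we arrive at
$$P\left(\frac{A_2 - \sqrt{A_2^2 - B_2C_2}}{B_2} < R < \frac{A_2 + \sqrt{A_2^2 - B_2C_2}}{B_2}\right) \ge 1 - \gamma, \qquad (5.50)$$
where
$$H_2 = \frac{z_{\gamma/2}^2}{n_1 n_2}, \qquad A_2 = \hat R + \frac{H_2}{2}, \qquad B_2 = 1 + (n_1+n_2-1)H_2, \qquad C_2 = \hat R^2 - H_2\hat v_3.$$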
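A Python sketch of the construction (5.48) - (5.50) is given below. It is an illustration only: the function name is hypothetical, ties are assumed absent, and the double sums in (5.48) are evaluated through counts (if c of the X's lie below $Y_k$, then c(c-1) ordered pairs have their maximum below $Y_k$, and similarly for the Y's above $X_i$).

```python
import math
from bisect import bisect_left, bisect_right
from statistics import NormalDist


def feigin_interval(x, y, gamma=0.05):
    """Return (R_hat, lower, upper) following (5.48)-(5.50); needs n1, n2 >= 2."""
    n1, n2 = len(x), len(y)
    xs, ys = sorted(x), sorted(y)
    below = [bisect_left(xs, yk) for yk in y]          # number of X's below each Y_k
    above = [n2 - bisect_right(ys, xi) for xi in x]    # number of Y's above each X_i
    r_hat = sum(below) / (n1 * n2)
    v1 = sum(c * (c - 1) for c in below) / (n1 * (n1 - 1) * n2)
    v2 = sum(d * (d - 1) for d in above) / (n1 * n2 * (n2 - 1))
    v3 = (n1 - 1) * v1 + (n2 - 1) * v2
    z = NormalDist().inv_cdf(1.0 - gamma / 2.0)
    h2 = z * z / (n1 * n2)
    a2 = r_hat + h2 / 2.0
    b2 = 1.0 + (n1 + n2 - 1) * h2
    c2 = r_hat ** 2 - h2 * v3
    root = math.sqrt(a2 * a2 - b2 * c2)
    return r_hat, (a2 - root) / b2, (a2 + root) / b2


print(feigin_interval([1, 3], [2, 4]))
```

For very small samples the roots of the quadratic may fall outside [0, 1]; truncating the interval to [0, 1] in that case is a natural convention, although it is not part of the formulas above.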
It is not difficult to show that $A_2^2 - B_2C_2$ is positive for all possible values of $\hat v_1$, $\hat v_2$ and $\gamma \in [0,1]$.
5.3.5
Confidence Intervals Constructed by Bootstrap Method
The bootstrap procedure for the construction of confidence intervals has been described in some detail in Section 4.5. There, we assumed that $(X, Y)$ have the cdf $F_\theta(x,y)$ with an unknown scalar or vector-valued parameter $\theta \in \Theta$, where $\Theta$ is the set of parameter values, and then constructed the MLE $\hat F$ of F of the form $\hat F(x,y) = F_{\hat\theta}(x,y)$. We base our interval estimators on the independent bootstrap samples from $\hat F(x,y)$. In a nonparametric situation, the form of $F_\theta(x,y)$ is unknown, so that the MLE of F is the joint edf
$$\hat F(x,y) = n^{-1}\sum_{j=1}^{n}I(X_j \le x, Y_j \le y) \qquad (5.51)$$
provided X and Y are dependent, or simply the product of the marginal edfs
$$\hat F_X(x)\hat F_Y(y) = (n_1 n_2)^{-1}\sum_{i=1}^{n_1}\sum_{j=1}^{n_2}I(X_i \le x)I(Y_j \le y) \qquad (5.52)$$
in the independent case. After generating the bootstrap samples $(\underline X^*_j, \underline Y^*_j)$, $j = 1, \ldots, M$, using (5.51) or (5.52), we construct the bootstrap confidence intervals in the same manner as was done in Section 4.5. Cheng and Chao (1984) applied the percentile method (without bias correction) described in Section 4.5.3 to construct bootstrap confidence intervals for R.
5.4
Nonparametric Bayes and Empirical Bayes Estimation
In this section we shall consider the nonparametric Bayes and empirical Bayes approaches to the estimation of R. To understand the material below it would be helpful to be familiar with the main concepts of measure theory, which it is impossible to review even briefly in a book of this size. Hence, readers who are not equipped with this knowledge are encouraged to study measure theory using any of the numerous standard textbooks (such as Billingsley (1995) or Shiryaev (1996)) or to skim this section in order to obtain the general ideas of the methodology.
5.4.1
Dirichlet Process Preliminaries
It was Ferguson (1973) who initiated nonparametric Bayes estimation by introducing the by now classical Dirichlet process, which is highly flexible and versatile in assigning prior measures. Before studying his definition, it would be desirable to recall the Dirichlet distribution.
Definition 5.5 Let $Z_i$, $i = 1, \ldots, k$, be independent random variables with the pdfs Gamma$(1, \alpha_i)$ of the form (3.9) where $\alpha_i \ge 0$ for all i and $\alpha_i > 0$ for some i, $i = 1, \ldots, k$. The Dirichlet distribution with parameters $(\alpha_1, \ldots, \alpha_k)$ is defined as the distribution of $(Y_1, \ldots, Y_k)$ where
$$Y_j = \frac{Z_j}{\sum_{i=1}^{k}Z_i}, \quad j = 1, \ldots, k.$$
The Dirichlet distribution is always singular with respect to Lebesgue measure in k-dimensional space since $Y_1 + \cdots + Y_k = 1$. However, if $\alpha_i > 0$ for all i, the $(k-1)$-dimensional distribution of $(Y_1, \ldots, Y_{k-1})$ is absolutely continuous with the pdf
$$f(y_1, \ldots, y_{k-1}) = \frac{\Gamma(\alpha_1 + \cdots + \alpha_k)}{\Gamma(\alpha_1)\cdots\Gamma(\alpha_k)}\prod_{i=1}^{k-1}y_i^{\alpha_i - 1}\left(1 - \sum_{i=1}^{k-1}y_i\right)^{\alpha_k - 1} \times I((y_1, \ldots, y_{k-1}) \in S), \qquad (5.53)$$
where $I(\cdot)$ is the indicator function and S is the simplex
$$S = \left\{(y_1, \ldots, y_{k-1}):\ y_i \ge 0,\ \sum_{i=1}^{k-1}y_i \le 1\right\}.$$
For $k = 2$, expression (5.53) reduces to the density of the beta distribution. Let $(\mathcal X, \mathcal A)$ be a measurable space. Ferguson (1973) defined the following stochastic process $\{P(A), A \in \mathcal A\}$.
Definition 5.6 Let $(\mathcal X, \mathcal A)$ be a measurable space. Let $\alpha$ be a nonnull finite measure (nonnegative and finitely additive) on $(\mathcal X, \mathcal A)$. The measure $P(A)$ is a Dirichlet process on $(\mathcal X, \mathcal A)$ with parameter $\alpha$ if for every $k = 1, 2, \ldots$, and every measurable partition $(B_1, \ldots, B_k)$ of $\mathcal X$, the vector $(P(B_1), \ldots, P(B_k))$ has the Dirichlet distribution with the parameter $(\alpha(B_1), \ldots, \alpha(B_k))$.
Definition 5.7 Let P be a random probability measure on $(\mathcal X, \mathcal A)$. We say that $X_1, \ldots, X_n$ is a sample of size n from P if for any positive integer m and measurable sets $A_1, \ldots, A_m$, $C_1, \ldots, C_n$
$$\mathcal P\{X_1 \in C_1, \ldots, X_n \in C_n \mid P(A_1), \ldots, P(A_m), P(C_1), \ldots, P(C_n)\} = \prod_{i=1}^{n}P(C_i)$$
with probability one. Here $\mathcal P\{\cdot\}$ denotes the probability of an event.
Intuitively, we may view a sample of size n from a Dirichlet process as follows. The process chooses a random distribution F, and then, given F, $X_1, \ldots, X_n$ is a random sample from F. The following theorem provides the conditional distribution of a Dirichlet process P given a sample $X_1, \ldots, X_n$ from P.
Theorem 5.4 (Ferguson (1973)). Let P be a Dirichlet process on $(\mathcal X, \mathcal A)$ with parameter $\alpha$, and let $X_1, \ldots, X_n$ be a sample of size n from P. Then the conditional distribution of P given $X_1, \ldots, X_n$ is a Dirichlet process with parameter $\alpha + \sum_{i=1}^{n}\delta_{X_i}$, where $\delta_x$ is a measure assigning mass one to the point x.
The following statement, combining Theorems 3 and 4 of Ferguson (1973), explains how one can calculate the expectations of integrals with respect to random probability measures.
Theorem 5.5 Let P be the Dirichlet process with the parameter $\alpha$ and let $f_1$ and $f_2$ be measurable real-valued functions defined on $(\mathcal X, \mathcal A)$. If $\int|f_j|\,d\alpha < \infty$, $j = 1, 2$, and $\int|f_1f_2|\,d\alpha < \infty$, then $\int|f_j|\,dP < \infty$ with probability one, and the expectations of the integrals are of the form
$$E\int f_j\,dP = \int f_j\,d(EP) = [\alpha(\mathcal X)]^{-1}\int f_j\,d\alpha, \quad j = 1, 2,$$
$$E\left[\int f_1\,dP\int f_2\,dP\right] = [\alpha(\mathcal X) + 1]^{-1}[\alpha(\mathcal X)]^{-1}\left[\int f_1f_2\,d\alpha + \int f_1\,d\alpha\int f_2\,d\alpha\right].$$
5.4.2
Nonparametric Bayes Estimation of R
Before starting to describe estimation of $R$, we shall consider a more general problem of nonparametric Bayes estimation of a distribution function $F(t) = P((-\infty, t])$ under the squared loss. Let $\mathcal{X} = \mathcal{R} = (-\infty, \infty)$ and $f_t(x) = I(-\infty < x \le t)$. Then $F(t) = \int f_t \, dP$ and, by Theorem 5.5, the Bayes estimator of $F(t)$ in the no-sample problem is
$$E F(t) = \frac{\alpha((-\infty, t])}{\alpha(\mathcal{R})}. \qquad (5.54)$$
Since, by Theorem 5.4, the posterior distribution of $P$ given the observations is a Dirichlet process with the parameter $\alpha + \sum_{i=1}^{n} \delta_{X_i}$, the Bayes estimator of $F$ based on a sample $X_1, \ldots, X_n$ can be obtained by replacing $\alpha$ by $\alpha + \sum_{i=1}^{n} \delta_{X_i}$ in (5.54), namely
$$\tilde F(t) = \frac{\alpha((-\infty, t]) + \sum_{i=1}^{n} \delta_{X_i}((-\infty, t])}{\alpha(\mathcal{R}) + n}. \qquad (5.55)$$
To construct a nonparametric Bayes estimator of $R = \int F_X(x) \, dF_Y(x)$ under the squared loss, we shall choose two finite measures $\alpha_1$ and $\alpha_2$. As a prior for $(F_X, F_Y)$, we assume that $F_X$ and $F_Y$ are the distribution functions of random probability measures $P_1$ and $P_2$, respectively, where $P_1$ and $P_2$ are independent and $P_j$ is chosen by a Dirichlet process with the parameter $\alpha_j$, $j = 1, 2$.
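As a rough illustration of the Bayes estimator (5.55) of a distribution function, the following Python sketch evaluates the posterior mean of $F(t)$ under a Dirichlet process prior. It assumes, purely for illustration, that the prior measure has the form $\alpha((-\infty, t]) = M F_0(t)$ with total mass $M = \alpha(\mathcal{R})$ and a standard normal guess $F_0$; the function name `dp_posterior_cdf`, the data and these prior choices are hypothetical and are not taken from Ferguson (1973).

```python
from bisect import bisect_right
from statistics import NormalDist


def dp_posterior_cdf(sample, prior_cdf, prior_mass):
    """Return t -> [alpha((-inf, t]) + #{X_i <= t}] / [alpha(R) + n], cf. (5.55).

    Assumption: alpha((-inf, t]) = prior_mass * prior_cdf(t).
    """
    xs = sorted(sample)
    n = len(xs)

    def f_tilde(t):
        count = bisect_right(xs, t)  # number of observations <= t
        return (prior_mass * prior_cdf(t) + count) / (prior_mass + n)

    return f_tilde


# Made-up data and a standard normal prior guess carrying total mass 2.
sample = [-1.0, 0.0, 1.0, 2.0]
f_tilde = dp_posterior_cdf(sample, NormalDist().cdf, prior_mass=2.0)
print(f_tilde(0.0))
```

The estimate is a weighted average of the prior guess and the empirical distribution function, with the weight of the prior equal to $\alpha(\mathcal{R})/[\alpha(\mathcal{R}) + n]$, so a small prior mass lets the data dominate.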
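If no samples from $F_X$ and $F_Y$ are available, it then follows from Theorem 5.5 that the Bayes rule for the no-sample problem is
$$R_0 = \int_{-\infty}^{\infty} F_X^{(0)}(x) \, dF_Y^{(0)}(x) \qquad (5.56)$$
where $F_X^{(0)} = E F_X$ and $F_Y^{(0)} = E F_Y$ are given by (5.54) with $\alpha$ replaced by $\alpha_1$ and $\alpha_2$, respectively:
$$F_X^{(0)}(x) = \frac{\alpha_1((-\infty, x])}{\alpha_1(\mathcal{R})}, \qquad F_Y^{(0)}(x) = \frac{\alpha_2((-\infty, x])}{\alpha_2(\mathcal{R})}. \qquad (5.57)$$
Given the samples, the Bayes rule is
$$\tilde R = \int_{-\infty}^{\infty} \tilde F_X(x) \, d\tilde F_Y(x) \qquad (5.58)$$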
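where $\tilde F_X(x)$ and $\tilde F_Y(x)$ are the Bayes estimators of $F_X(x)$ and $F_Y(x)$ of the form (5.55). Introducing
$$\varrho_j = \alpha_j(\mathcal{R}) / [\alpha_j(\mathcal{R}) + n_j], \quad j = 1, 2, \qquad (5.59)$$
we rewrite $\tilde F_X(x)$ and $\tilde F_Y(x)$ as
$$\tilde F_X(x) = \varrho_1 F_X^{(0)}(x) + (1 - \varrho_1) \hat F_X(x), \qquad \tilde F_Y(x) = \varrho_2 F_Y^{(0)}(x) + (1 - \varrho_2) \hat F_Y(x), \qquad (5.60)$$
where $\hat F_X(x)$ and $\hat F_Y(x)$ are the edfs based on samples $X = (X_1, \ldots, X_{n_1})$ and $Y = (Y_1, \ldots, Y_{n_2})$, respectively. Substituting (5.60) into (5.58) we arrive at the estimator
$$\tilde R = \varrho_1 \varrho_2 R_0 + \varrho_1 (1 - \varrho_2) \frac{1}{n_2} \sum_{j=1}^{n_2} F_X^{(0)}(Y_j) + (1 - \varrho_1) \varrho_2 \frac{1}{n_1} \sum_{i=1}^{n_1} \big(1 - F_Y^{(0)}(X_i)\big) + (1 - \varrho_1)(1 - \varrho_2) \hat R, \qquad (5.61)$$
where $\hat R$ is the UMVUE of $R$ given by (5.3). The estimator (5.61) was proposed by Ferguson (1973).
5.4.3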
Nonparametric Empirical Bayes Estimation of R
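This estimation requires a rather delicate construction. Let $(X^{(i)}, Y^{(i)})$, $i = 1, \ldots, m$, be two independent sequences of independent random vectors of observations with respective random probability measures $(P_{1i}, P_{2i})$, $i = 1, \ldots, m$. Here,
$$X^{(i)} = (X_1^{(i)}, \ldots, X_{n_{1i}}^{(i)}), \qquad Y^{(i)} = (Y_1^{(i)}, \ldots, Y_{n_{2i}}^{(i)}), \qquad i = 1, \ldots, m, \qquad (5.62)$$
are samples from $P_{1i}$ and $P_{2i}$, respectively. Let $P_{1i}$ and $P_{2i}$ be independent with $P_{ji}$ having a common Dirichlet process prior with parameter $\alpha_j$, $j = 1, 2$. Let $F_{Xi}$ and $F_{Yi}$ be the distribution functions corresponding to $P_{1i}$ and $P_{2i}$, respectively, $i = 1, \ldots, m$. Our objective is to estimate
$$R^{(m)} = \int_{-\infty}^{\infty} F_{Xm}(x) \, dF_{Ym}(x)$$
on the basis of the observations (5.62). If the parameters $\alpha_1$ and $\alpha_2$ were known, the Bayes estimator of $R^{(m)}$ based on samples $(X^{(m)}, Y^{(m)})$ would be of the form
$$\tilde R^{(m)} = \varrho_{1m} \varrho_{2m} R_0 + \varrho_{1m} (1 - \varrho_{2m}) \bar F_{Xm}^{(0)} + (1 - \varrho_{1m}) \varrho_{2m} (1 - \bar F_{Ym}^{(0)}) + (1 - \varrho_{1m})(1 - \varrho_{2m}) \hat R_m, \qquad (5.63)$$
where
$$\varrho_{jm} = \frac{\alpha_j(\mathcal{R})}{\alpha_j(\mathcal{R}) + n_{jm}}, \qquad \bar F_{Xm}^{(0)} = \frac{1}{n_{2m}} \sum_{j=1}^{n_{2m}} F_X^{(0)}(Y_j^{(m)}), \qquad \bar F_{Ym}^{(0)} = \frac{1}{n_{1m}} \sum_{i=1}^{n_{1m}} F_Y^{(0)}(X_i^{(m)}), \qquad (5.64)$$
$\hat R_m$ is the UMVUE (5.3) computed from $(X^{(m)}, Y^{(m)})$, and the functions $F_X^{(0)}$ and $F_Y^{(0)}$ are defined in (5.57). In an empirical Bayes (EB) analysis, one or several prior parameters in (5.63) are assumed to be unknown and ought to be estimated from data. Hollander and Korwar (1976) were the first to construct an EB estimator of $R^{(m)}$. They treat $R_0$, $\bar F_{Xm}^{(0)}$ and $\bar F_{Ym}^{(0)}$ as unknown but assume that $\alpha_j(\mathcal{R})$ are known and $n_{ji} = n_j$, $j = 1, 2$, $i = 1, \ldots, m$. The corresponding EB estimator of $R^{(m)}$ then becomes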
m—lm—1 p(m)
_
QlQ2
i-
^ V ^ p
(m - I) 2 *—' *-i
. m—1
, Ql(l-Q2) V^
m- 1
J
p
^
m-1
f
(5.65)
where $\varrho_1$ and $\varrho_2$ are defined in (5.59). Comparing (5.65) with (5.63) it is easy to see that the quantities $R_0$, $\bar F_{Xm}^{(0)}$ and $\bar F_{Ym}^{(0)}$ entering (5.63) are estimated on the basis of only the past data $(X^{(i)}, Y^{(i)})$, $i = 1, \ldots, m-1$, but not the current one. Ghosh and Lahiri (1992) generalize the result of Hollander and Korwar (1976). They construct an EB estimator for unequal sample sizes in the cases when $\alpha_j(\mathcal{R})$, $j = 1, 2$, are either known or unknown. They also include the current data in the estimation of $R_0$, $\bar F_{Xm}^{(0)}$ and $\bar F_{Ym}^{(0)}$. Specifically, in the case when $\alpha_j(\mathcal{R})$, $j = 1, 2$, are known, they denote
(1-£,-*),
J -
1,2,
and estimate £ 0 , F%£ and F-f?£ by
respectively. Then, the nonparametric EB of Rg R^B2
is of the form
=
QlmQ2mR*o + Qlm(l -
e2m)F{^
+
(1 - Qlm)Q2m(l ~ F$£) + (1 - ffim)(l - Q2m)Rmm,
where $\varrho_{1m}$ and $\varrho_{2m}$ are defined in (5.64). Ghosh and Lahiri (1992) also construct a nonparametric EB estimator when $\alpha_j(\mathcal{R})$, $j = 1, 2$, are unknown and ought to be estimated from observations. This estimator is given by a rather complicated expression and is not reproduced herein. We should mention that, whenever $n_{ji} > 1$, $j = 1, 2$, $i = 1, \ldots, m$, all the EB estimators turn out to be optimal in the sense that the quadratic risk of the EB estimator approaches the quadratic risk of the Bayes estimator as $m \to \infty$.
5.5
Probability Design Approach to Estimation of R
The probability design approach to the estimation of $R$ was introduced by Kapur (1975). This approach differs from all the others in using a very different set of data for estimating $R$. Specifically, Kapur (1975) assumes that the stress $X$ and the strength $Y$ are independent and that there exists an "interference" interval within which the stress and the strength are likely to interact. This interference interval is then partitioned into $n$ subintervals, and it is assumed that, instead of the samples from $X$ and $Y$, some bounds are available on the probabilities of the stress and the strength falling into the interference interval as well as on the probabilities of the stress and the strength falling into each subinterval of the interference interval. These bounds are supposed to be known a priori or constructed from existing data. Kapur (1975) uses his technique to estimate the "unreliability" $1 - R = P(X > Y)$. Following the approach adopted in this book, we shall, however, present Kapur's method for estimating $R = P(X < Y)$.
Let $X_{\min}$ be the lower bound for $X$ and $Y_{\max}$ be the upper limit for $Y$ that occur with high probability. Then the interference interval is $[X_{\min}, Y_{\max}]$ and
$$R \approx \int_{X_{\min}}^{Y_{\max}} F_X(x) \, dF_Y(x).$$
Partition $[X_{\min}, Y_{\max}]$ into $n$ subintervals with the endpoints $a_i$, $i = 0, \ldots, n$, i.e. $X_{\min} = a_0 < a_1 < \cdots < a_n = Y_{\max}$. Denote
$$p_i = P[a_{i-1} < X < a_i], \qquad q_i = P[a_{i-1} < Y < a_i]. \qquad (5.66)$$
Then (5.67) 4=1
t=l
and $R$ can be approximated by the sum of probabilities
$$P(a_{k-1} < X < a_k, \ a_{i-1} < Y < a_i), \quad i = 1, \ldots, n, \ k = 1, \ldots, i.$$
Let the conditional probabilities be denoted by
$$\nu_i = P(X < Y \mid a_{i-1} < X < a_i, \ a_{i-1} < Y < a_i), \quad i = 1, \ldots, n. \qquad (5.68)$$
Consequently, $P(X < Y, \ a_{i-1} < X < a_i, \ a_{i-1} < Y < a_i) = p_i q_i \nu_i$ and the representation for $R$ becomes
$$R = \sum_{i=1}^{n} q_i \Big( \sum_{k=1}^{i-1} p_k + \nu_i p_i \Big). \qquad (5.69)$$
Introducing
$$\nu = \sum_{i=1}^{n} \nu_i p_i q_i \Big/ \sum_{i=1}^{n} p_i q_i,$$
we represent $R$ as
$$R = \sum_{i=1}^{n} q_i \sum_{k=1}^{i-1} p_k + \nu \sum_{i=1}^{n} p_i q_i. \qquad (5.70)$$
Let bounds on (5.66) and (5.67) be available (5.71) (5.72). Then the upper (lower) confidence bound $R_U$ ($R_L$) for $R$ is constructed by maximizing (minimizing) (5.69) or (5.70) under constraints (5.71) and (5.72). Computationally, this leads to a standard quadratic programming problem. The interested readers are referred to e.g. Charnes and Cooper (1961). After the confidence bounds for $R$ are obtained, the point estimator of $R$ is given by $\hat R = (R_U + R_L)/2$. Kapur (1975) assumes that strength dominates stress on each subinterval, so that all $\nu_i$'s are equal to one and $\nu = 1$. Park and Clark (1986) note that Kapur's assumption that $\nu_i = 1$, $i = 1, \ldots, n$, leads to overestimation of both $R_L$ and $R_U$. To remedy the situation, they suggested setting $\nu_i = 1/2$, $i = 1, \ldots, n$, which corresponds to
the case of equal probabilities that strength or stress dominates within the $i$-th interval. Hence, the objective function to be maximized and minimized is (5.70) with $\nu = 1/2$. Park and Clark's confidence interval, although more realistic than Kapur's, has the shortcoming that it sometimes fails to cover the actual value of $R$. Melloy and Cavalier (1989) provide an example of such a situation and suggest a "play it safe" strategy. They propose to use $\nu = 1$ for maximizing $R$ and $\nu = 0$ for minimizing it. Melloy and Cavalier's interval is obviously the longest of the three, but it is the one most likely to contain the actual value of $R$. Shen (1992) notes that Kapur (1975), Park and Clark (1986) and Melloy and Cavalier (1989) suggest identical values for $\nu_i$ for minimization or maximization of $R$. He attempts to improve the methods of the previous authors by introducing variable values for $\nu_i$. Denote
$$\bar p_i = (L_{p,i} + U_{p,i})/2, \qquad \bar q_i = (L_{q,i} + U_{q,i})/2.$$
Shen's (1992) suggestion is to use
$$\nu_i = \begin{cases} 1, & \text{if } \bar p_i < \bar q_i, \\ 0.5, & \text{if } \bar p_{i-1} < \bar q_{i-1} \text{ and } \bar p_{i+1} > \bar q_{i+1}, \\ 0, & \text{otherwise,} \end{cases}$$
for the maximization of $R$ and
$$\nu_i = \begin{cases} 1, & \text{if } \bar q_i < \bar p_i, \\ 0.5, & \text{if } \bar q_{i-1} < \bar p_{i-1} \text{ and } \bar q_{i+1} > \bar p_{i+1}, \end{cases}$$
for the minimization of $R$. The author demonstrates by means of extensive simulations that his method leads to more precise point estimators and more realistic bounds for $R$. We conclude this section by noting that there are also two other optimization approaches to the estimation of $R$. The first one is due to Geung-Ho Kim (1981) and is based on mathematical programming techniques. The second was developed by Wang and Liu (1996) and is based on fuzzy reliability. Unfortunately, due to space limitations, we are unable to cover these techniques and refer the interested readers to the original papers.
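To make the role of the conditional probabilities $\nu_i$ more tangible, here is a minimal Python sketch of the discretized approximation. It assumes that the representation of $R$ has the form $R \approx \sum_i q_i (\sum_{k<i} p_k + \nu_i p_i)$, which follows from the independence of $X$ and $Y$ and from $P(X < Y, \text{both in the } i\text{-th subinterval}) = p_i q_i \nu_i$, and that all probability mass lies inside the interference interval. The subinterval probabilities are made up and are held fixed, so the sketch only compares the choices $\nu_i = 1$, $1/2$ and $0$; it does not carry out the optimization over the bounds on $p_i$ and $q_i$. The function name `design_reliability` is hypothetical.

```python
def design_reliability(p, q, nu):
    """Approximate R = P(X < Y) from subinterval probabilities.

    p[i], q[i]: probabilities that stress X and strength Y fall into the
    (i+1)-th subinterval; nu[i]: assumed P(X < Y | both in that subinterval).
    Assumption: all probability mass lies inside the interference interval.
    """
    total = 0.0
    cum_p = 0.0  # P(X lies in a strictly lower subinterval)
    for p_i, q_i, nu_i in zip(p, q, nu):
        total += q_i * (cum_p + nu_i * p_i)
        cum_p += p_i
    return total


p = [0.5, 0.3, 0.2]  # made-up stress probabilities
q = [0.1, 0.3, 0.6]  # made-up strength probabilities
n = len(p)
r_kapur = design_reliability(p, q, [1.0] * n)       # nu_i = 1
r_park_clark = design_reliability(p, q, [0.5] * n)  # nu_i = 1/2
r_lower = design_reliability(p, q, [0.0] * n)       # nu_i = 0
print(r_lower, r_park_clark, r_kapur)
```

For fixed $p_i$ and $q_i$ the approximation is linear in each $\nu_i$, so the value for $\nu_i = 1/2$ is the midpoint of the values for $\nu_i = 0$ and $\nu_i = 1$.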
5.6
Exercises
5.1. Using the fact that $T_{RX} + T_{RY} = N(N+1)/2$, derive formula (5.5) from (5.4).
5.2. Prove that the estimator $S^2$ given by (5.15) is an unbiased estimator of Var$(\hat R)$.
5.3. Derive an unbiased estimator of Var$(\hat R)$ based on statistics (5.48).
5.4. Investigate for what values of $\gamma$ the confidence bounds in (5.33) are based on the Hoeffding-type inequalities (5.30) and (5.31).
5.5. Write an explicit finite-sum representation for $F_{n_1, n_2}(\varepsilon)$ using formulas (5.36) and (5.37).
5.6. Derive a series representation for $Q_{n_1, n_2}(\varepsilon)$ given by (5.41). Explore convergence of the series.
5.7. Project. Let unbiased estimators $\hat v_1$ and $\hat v_2$ of $v_1$ and $v_2$ be given by Feigstat. Denote $v(a) = (1 - a)\hat R + (n_1 - 1)\hat v_1 + (n_2 - 1)\hat v_2$ and observe that $E[a \hat R + v(a) - (n_1 + n_2 - 1)\hat R^2] = \mathrm{Var}(\hat R)$ for any number $a$. Hence, the pivotal quantity in npiv2 can be replaced by $v(a) - (n_1 + n_2 - 1)\hat R^2$. Construct a confidence interval based on $Z_3$ similarly to (5.50). Study which value of $a$ provides the shortest confidence interval.
5.8. Write an explicit expression for the estimator (5.61) if $\alpha_1((-\infty, x]) = \alpha_2((-\infty, x]) = \alpha((-\infty, x])$ with
a) $\alpha((-\infty, x]) = \int_{-\infty}^{x} 0.5 \exp(-|z|) \, dz$;
b) $\alpha((-\infty, x]) = \int_{-\infty}^{x} (\sqrt{2\pi})^{-1} \exp(-z^2/2) \, dz$.
Chapter 6
Some Selected Special Cases
The arsenal of statistical distributions is truly inexhaustible. New distributions are being discovered literally on a weekly basis, prompted by theoretical considerations, by pressing practical applications, or by both. Glance through any recent issue of "Annals of Statistics" or "Statistics in Medicine" to be reassured of the validity of this assertion. Each of these distributions can, of course, be used for studying the probability $P(X < Y)$ and its estimation, and we hope that the readers will engage themselves in this rewarding activity. We shall refer to this type of investigation as the "mainstream" case of the stress-strength models. These situations were tackled in the preceding chapters, involving the most popular and interesting (in our subjective opinion) distributions. We also studied the case when $X$ and $Y$ are random vectors and the probabilities of inequalities of some linear combination of $X$ and $Y$ are estimated. These two models, however, do not cover all the situations that appear in various applications or theoretical exercises. The first group of additional models studied in this chapter stems from the problems in system reliability and naturally leads to estimation of probabilities of inequalities of the type $X < \max(Y_1, \ldots, Y_k)$ and their modifications. These models will be considered in Section 6.1. Estimation of probabilities of inequalities other than $X < Y$ which are not covered by the previous sections (e.g. $P(X < Y < Z)$ or $P(X_1 < X_2 < \cdots < X_m)$) will constitute the content of Section 6.2. Section 6.3 is devoted to linear model formulations in the stress-strength set-up, such as stress-strength models with explanatory variables or ANOVA. Section 6.4 reviews stress-strength models with grouped and categorical data. Finally, Section 6.5 briefly considers stochastic process formulations of stress-strength models. This chapter is an excellent source for further research.
6.1
Stress-Strength Models for System Reliability
6.1.1
Various Models for System Reliability
The stress-strength models for system reliability occur when a device under consideration is a combination of $k$ usually independent components with the strengths $Y_1, \ldots, Y_k$ and each component of the system is subject to a common shock of a random magnitude $X$. If the system functions when at least $s$, $1 \le s \le k$, components survive the shock, we talk about an s-out-of-k system. As a typical example of an s-out-of-k system one may consider (see Johnson (1988)) a panel consisting of $k$ identical solar cells which maintains an adequate power output if at least $s$ of the cells are active during the course of the mission. The external force interfering with the operation of the cells may possibly be extreme temperatures, and the strength of a cell, in this context, may be taken as its capacity to withstand these extreme temperatures. If we assume that the stress has the pdf $f_X$ and the cdf $F_X$ and that the strengths of the components are i.i.d. with the pdf $f_Y$ and the cdf $F_Y$, then the reliability (i.e. the probability of successful operation) of the s-out-of-k system is given by the well known relation
$$R_{s,k} = \sum_{j=s}^{k} \binom{k}{j} \int_{-\infty}^{\infty} [1 - F_Y(x)]^{j} F_Y^{k-j}(x) \, dF_X(x). \qquad (6.1)$$
If the system is operating successfully whenever at least one of the $k$ components survives ($s = 1$), it is termed parallel, by analogy with electric circuits. If, however, the system survives only when all of the components are intact ($s = k$), we are dealing with a series system. In view of (6.1), the reliabilities of parallel and series systems are represented by
$$R_{1,k} = P(X < \max(Y_1, \ldots, Y_k)) = 1 - \int_{-\infty}^{\infty} F_Y^{k}(x) \, dF_X(x) \qquad (6.2)$$
and
$$R_{k,k} = P(X < \min(Y_1, \ldots, Y_k)) = \int_{-\infty}^{\infty} [1 - F_Y(x)]^{k} \, dF_X(x), \qquad (6.3)$$
respectively. Estimation of reliability of the s-out-of-k system has been discussed by Myhre and Saunders (1968), Madansky (1965), Easterling (1972), Bhattacharyya and Johnson (1975, 1977) and Choi and Kim (1983) among many other sources. Particular cases of parallel and series systems have been studied by Gupta (1972), Bhattacharyya and Johnson (1974), Chandra and Owen (1975, 1977), Rinco (1973), Singh (1981), Gupta and Gupta (1988) and Ivshin and Lumelskii (1995). So far, we have assumed that the components $Y_1, \ldots, Y_k$ are i.i.d. random variables and all of them are subjected to a common random stress $X$ independent of the $Y$'s. Johnson (1988) outlines several extensions of the above model for representing the reliability structure of more complex systems.
1. Non-identical component strength distributions. When the components of a system are of different structure, the assumption of identical strength distributions may not be quite realistic. Suppose that out of $k$ components, $k_1$ belong to one category and $k_2 = k - k_1$ to the other. Denote the strength distribution of the components of the $i$-th category by $F_{Y_i}$, $i = 1, 2$. Assume now that all the $k$ components are exposed to a common stress $X$ having the distribution $F_X$ and the system operates successfully if at least $s$ out of $k$ components withstand the stress. The system reliability is then of the form
$$R_{s,k_1,k_2} = \sum_{j_1, j_2} \binom{k_1}{j_1} \binom{k_2}{j_2} \int_{-\infty}^{\infty} [1 - F_{Y_1}(x)]^{j_1} F_{Y_1}^{k_1 - j_1}(x) [1 - F_{Y_2}(x)]^{j_2} F_{Y_2}^{k_2 - j_2}(x) \, dF_X(x), \qquad (6.4)$$
where the summation is over all possible pairs $(j_1, j_2)$ with $0 \le j_1 \le k_1$ and $0 \le j_2 \le k_2$ such that $s \le j_1 + j_2 \le k$. For example, in the case of $k_1 = k_2 = 1$ and $s = 1$, the possible choices for $(j_1, j_2)$ are $(0,1)$, $(1,0)$ and $(1,1)$, so that in this case (6.4) becomes
$$R_{1,1,1} = 1 - \int_{-\infty}^{\infty} F_{Y_1}(x) F_{Y_2}(x) \, dF_X(x).$$
2. Subsystems with independent stresses. In a more complex situation a system may consist of a number of independent subsystems, say $m$, performing different tasks. Within each subsystem, the components have independent and identically distributed strengths and are subjected to a common stress, so that each subsystem has the structure of an s-out-of-k stress-strength model. The strength and the stress distributions as well as the parameters $s$ and $k$ may vary among the subsystems. Representation of the system reliability depends here on the manner in which subsystems are combined in the total system. For example, if subsystems have a series connection, the system fails whenever one of the subsystems becomes inoperative, and the system reliability is in this case given by
$$R = R_1 R_2 \cdots R_m$$
where $R_i$ is the reliability of the $i$-th subsystem, $i = 1, \ldots, m$. If subsystems are connected in parallel, the system functions properly provided at least one of the subsystems survives. In this case we have
$$R = 1 - \prod_{i=1}^{m} (1 - R_i).$$
Evidently, there is a multitude of variations in between the above two extreme cases.
6.1.2
Estimation of System Reliability Based on Numerical Data
Estimation of reliability of s-out-of-k system in the case of exponential distributions. Consider the case when $f_X(x)$ and $f_Y(y)$ are both one-parameter exponential, $f_X(x|\alpha_1) = \alpha_1 \exp(-\alpha_1 x)$ and $f_Y(y|\alpha_2) = \alpha_2 \exp(-\alpha_2 y)$, and samples of sizes $n_1$ and $n_2$ from $f_X$ and $f_Y$, respectively, are available. Calculation of system reliability in the exponential case has been carried out by Bhattacharyya and Johnson (1974). To evaluate (6.1) they used the relation between binomial probabilities and the cdf of the beta distribution (see e.g. Johnson et al. (1992))
$$K(u) = [B(s, k-s+1)]^{-1} \int_0^u x^{s-1} (1 - x)^{k-s} \, dx = \sum_{j=s}^{k} \binom{k}{j} u^j (1 - u)^{k-j}. \qquad (6.5)$$
Denote $\lambda = \alpha_1/\alpha_2$. Noting that $F_Y(x) = 1 - \exp(-\alpha_2 x)$ and changing the variable of integration to $u = \exp(-\alpha_1 x)$ we represent (6.1) as
$$R_{s,k} = \int_0^1 K(u^{1/\lambda}) \, du.$$
Utilizing the transformation $u = y^{\lambda}$ and integrating by parts we easily
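obtain
$$R_{s,k} = 1 - \frac{B(s + \lambda, k-s+1)}{B(s, k-s+1)} = 1 - \frac{(k-s)!}{B(s, k-s+1)} \prod_{j=0}^{k-s} (s + j + \lambda)^{-1}. \qquad (6.6)$$
Using the partial fraction expansion for the product of reciprocals in (6.6) we finally arrive at
$$R_{s,k} = 1 - [B(s, k-s+1)]^{-1} \sum_{j=0}^{k-s} (-1)^j \binom{k-s}{j} (s + j + \lambda)^{-1}. \qquad (6.7)$$
Hence, the MLE of $R_{s,k}$ can be written as (compare with Section 2.1.3)
$$\tilde R_{s,k} = 1 - [B(s, k-s+1)]^{-1} \sum_{j=0}^{k-s} (-1)^j \binom{k-s}{j} (s + j + \tilde\lambda)^{-1}, \qquad \tilde\lambda = \bar Y / \bar X. \qquad (6.8)$$
To obtain the UMVUE of the system reliability in this case, note that for any $a > 0$
$$\psi_a(\lambda) = (a + \lambda)^{-1} = a^{-1} \{1 - E[I(a X_1 < Y_1)]\}$$
where $I(\cdot)$ is the indicator function. Recall that $\lambda = \alpha_1/\alpha_2$. Thus, using (2.31) and Theorem 2.4 we write the UMVUE of $\psi_a(\lambda)$ as
$$\hat\psi_a(\lambda) = a^{-1} \left[1 - \frac{(n_1 - 1)(n_2 - 1)}{n_1 \bar X \, n_2 \bar Y} \iint_W \left(1 - \frac{x}{n_1 \bar X}\right)^{n_1 - 2} \left(1 - \frac{y}{n_2 \bar Y}\right)^{n_2 - 2} dx \, dy \right] \qquad (6.9)$$
where the set $W = \{(x, y) : 0 < x < n_1 \bar X, \ 0 < y < n_2 \bar Y, \ ax < y\}$ (compare with (2.32)). Changing variables in (6.9) to $u = x/(n_1 \bar X)$ and $v = y/(n_2 \bar Y)$, integrating over $v \in (aTu, 1)$ and using the notation $T = (n_1 \bar X)/(n_2 \bar Y)$, we can represent $\hat\psi_a(\lambda)$ as
$$\hat\psi_a(\lambda) = a^{-1} \left[1 - (n_1 - 1) \int_0^{\min(1, (aT)^{-1})} (1 - u)^{n_1 - 2} (1 - aTu)^{n_2 - 1} \, du \right]. \qquad (6.10)$$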
Using the integral representation of the hypergeometric series (see e.g. formula 9.111 in Gradshtein and Ryzhik (1980)) we rewrite (6.10) as
$$\hat\psi_a(\lambda) = a^{-1} [1 - {}_2F_1(1 - n_2, 1; n_1; aT)]. \qquad (6.11)$$
The hypergeometric series in (6.11) is convergent provided $aT < 1$. For $aT > 1$, an alternative expression for $\hat\psi_a(\lambda)$ can be used:
$$\hat\psi_a(\lambda) = a^{-1} {}_2F_1(1 - n_1, 1; n_2; (aT)^{-1}). \qquad (6.12)$$
Combining (6.7) with (6.11) and (6.12), we arrive at the UMVUE
$$\hat R_{s,k} = 1 - [B(s, k-s+1)]^{-1} \sum_{j=0}^{k-s} (-1)^j \binom{k-s}{j} \hat\psi_{s+j}(\lambda).$$
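The exponential-case formulas can be checked numerically. The Python sketch below is only an illustration under the stated model ($X$ and $Y$ exponential with rates $\alpha_1$ and $\alpha_2$, $\lambda = \alpha_1/\alpha_2$): it evaluates the beta-function form $1 - B(s+\lambda, k-s+1)/B(s, k-s+1)$ that the integration by parts leads to, the partial-fraction sum, and a midpoint-rule approximation of $\int_0^1 K(u^{1/\lambda}) \, du$, and it plugs $\tilde\lambda = \bar Y/\bar X$ into the closed form to obtain the MLE. All function names and the sample values are hypothetical.

```python
import math


def beta_fn(a, b):
    """Beta function B(a, b) computed through log-gamma."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))


def reliability_closed_form(s, k, lam):
    """R_{s,k} = 1 - B(s + lam, k - s + 1) / B(s, k - s + 1)."""
    return 1.0 - beta_fn(s + lam, k - s + 1) / beta_fn(s, k - s + 1)


def reliability_partial_fractions(s, k, lam):
    """Same quantity through the alternating partial-fraction sum."""
    total = sum((-1) ** j * math.comb(k - s, j) / (s + j + lam)
                for j in range(k - s + 1))
    return 1.0 - total / beta_fn(s, k - s + 1)


def reliability_numeric(s, k, alpha1, alpha2, steps=20000):
    """Midpoint rule for the integral of K(u^(1/lam)) over (0, 1)."""
    lam = alpha1 / alpha2

    def big_k(w):  # binomial tail: at least s of k components survive
        return sum(math.comb(k, j) * w ** j * (1.0 - w) ** (k - j)
                   for j in range(s, k + 1))

    h = 1.0 / steps
    return h * sum(big_k(((i + 0.5) * h) ** (1.0 / lam)) for i in range(steps))


def reliability_mle(s, k, x, y):
    """Plug-in MLE with lam estimated by mean(y) / mean(x)."""
    lam_hat = (sum(y) / len(y)) / (sum(x) / len(x))
    return reliability_closed_form(s, k, lam_hat)


print(reliability_closed_form(2, 3, 1.5))
print(reliability_partial_fractions(2, 3, 1.5))
print(reliability_numeric(2, 3, alpha1=3.0, alpha2=2.0))
print(reliability_mle(2, 3, x=[0.2, 0.4], y=[0.5, 0.7]))  # made-up samples
```

The first three printed values should agree up to the integration error, which gives a quick consistency check of the closed form; for $s = k = 1$ the expression reduces to $\lambda/(1 + \lambda) = \alpha_1/(\alpha_1 + \alpha_2)$.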
6.1.3
Estimation of System Reliability Based on Count Data
Suppose that a complex mechanism is constructed from a number of different types of components and separate observations have been made on the reliability of each one of the components of the system. Namely, the data consist of vectors $x = (x_1, \ldots, x_m)$ and $n = (n_1, \ldots, n_m)$ where we have $x_i$ successes in $n_i$ trials of the $i$-th component, $i = 1, \ldots, m$. The observed number of successes $x_i$ has the binomial distribution
$$p(x_i; \theta_i) = \binom{n_i}{x_i} \theta_i^{x_i} (1 - \theta_i)^{n_i - x_i}, \quad i = 1, \ldots, m.$$
The importance of this setting stems from the fact that numerical measurements of strength and stresses are often much more expensive and involved than observations of survival and failure, which may often be obtained simply by visual inspection of the components. Hence, although numerical measurements are usually more informative than counts, it may be economically advantageous to collect a large number of samples of count data rather than obtain numerical measurements with fewer components. Madansky (1965), Myhre and Saunders (1968a,b) and Easterling (1972) worked out confidence intervals for the system reliability $R$ for data of this type, while Choi and Kim (1983) developed a Bayesian sequential procedure for estimation of $R$. Madansky (1965) derives a confidence interval for $R$ by inverting the likelihood ratio test (LRT) for the hypothesis $H_0 : R = R_0$ (see e.g. Casella and
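Berger (1990), Chapter 9). For example, in the case of the series system, $R = \prod_{i=1}^{m} \theta_i$ and the likelihood ratio is given by
$$L(R_0) = \sup_{\theta : \prod_{i} \theta_i = R_0} \prod_{i=1}^{m} p(x_i; \theta_i) \Big/ \sup_{\theta} \prod_{i=1}^{m} p(x_i; \theta_i). \qquad (6.13)$$
Taking into account (see e.g. Casella and Berger (1990), Section 8.4.1) that $-2 \ln L(R_0)$ has approximately the chi-squared distribution with one degree of freedom, we obtain a $(1 - \gamma)$-confidence set for $R$ of the form
$$W = \{R : -2 \ln L(R) < \chi^2_{1-\gamma}(1)\} \qquad (6.14)$$
where $\chi^2_{\alpha}(1)$ is the $100\alpha$-th percentile of the chi-squared distribution with 1 degree of freedom. It is easy to observe that the supremum in the denominator of (6.13) is achieved for $\hat\theta_i = x_i/n_i$. To obtain the logarithm of the numerator of $L(R_0)$ we shall first maximize the Lagrangian
$$\sum_{i=1}^{m} \ln p(x_i; \theta_i) - \lambda \Big( \ln \prod_{i=1}^{m} \theta_i - \ln R_0 \Big)$$
where $\lambda$ is a Lagrange multiplier. Then the maximizing set of $\theta_i$'s is given by
$$\theta_i(\lambda) = (x_i - \lambda)/(n_i - \lambda), \quad i = 1, \ldots, m,$$
where $\lambda < x_i$ for all $i$, since $0 < \theta_i(\lambda) < 1$. Hence, substituting $\theta_i(\lambda)$ and $\hat\theta_i$ into (6.13), we obtain
$$\ln L(R_0) = \sum_{i=1}^{m} x_i \ln\Big(1 - \frac{\lambda}{x_i}\Big) - \sum_{i=1}^{m} n_i \ln\Big(1 - \frac{\lambda}{n_i}\Big) \qquad (6.15)$$
where $\lambda$ satisfies the equation
$$\prod_{i=1}^{m} [(x_i - \lambda)/(n_i - \lambda)] = R_0. \qquad (6.16)$$
Now a $(1 - \gamma)$-confidence set (6.14) is obtained by replacing $R_0$ by $R$ in (6.15) and (6.16) and noting that
$$R(\lambda) = \prod_{i=1}^{m} \frac{x_i - \lambda}{n_i - \lambda}$$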
is a monotonically decreasing function of $\lambda$. Observe also that $-2 \ln L(R)$ is a monotonically increasing (decreasing) function of $\lambda$ for $\lambda > 0$ ($\lambda < 0$). Hence, the set of $\lambda$'s such that $-2 \ln L(R(\lambda)) < \chi^2_{1-\gamma}(1)$ will be an interval $[\lambda_1^*, \lambda_2^*]$ where $\lambda_1^* < 0 < \lambda_2^*$ are the solutions of the equation
$$-2 \ln L(R(\lambda)) = \chi^2_{1-\gamma}(1).$$
Since $R(\lambda)$ is a monotonically decreasing function of $\lambda$, the $(1 - \gamma)$-confidence interval for $R$ is indeed of the form $(R(\lambda_2^*), R(\lambda_1^*))$. This result can easily be modified for parallel systems by simply replacing $R$ with $1 - R$, $\theta_i$ with $(1 - \theta_i)$ and $x_i$ with $n_i - x_i$, $i = 1, \ldots, m$. Moreover, as Madansky (1965) shows, his method can be generalized to the case where the system is composed of $m$ components in series and $s$ subsystems that are in series with the $m$ components but are themselves composed of components in parallel. This extension is based on a monotonicity property analogous to the one used above. Myhre and Saunders (1968a,b) applied Madansky's (1965) method to more diverse systems of elements. Let the state of the $i$-th component be a Bernoulli random variable $y_i$ taking on value one for success and zero for failure, $i = 1, \ldots, m$. Then, following Myhre and Saunders (1968a), the system is a coherent monotone structure if there exists a non-decreasing Boolean function $\varphi(y_1, \ldots, y_m)$ (i.e. taking only values zero and one) of the state of components which serves as an indicator of the state of the structure. If $E y_i = \theta_i$ is the reliability of the $i$-th component, then $R = E \varphi(y_1, \ldots, y_m) = h(\theta_1, \ldots, \theta_m)$ is the reliability of the system, and the statistic
$$\hat R = h(\hat\theta_1, \ldots, \hat\theta_m) \qquad (6.17)$$
is asymptotically normally distributed as $n_j \to \infty$, $j = 1, \ldots, m$, with the mean $R = h(\theta_1, \ldots, \theta_m)$ and the variance
$$\sum_{i=1}^{m} \Big( \frac{\partial h}{\partial \theta_i} \Big)^2 \frac{\theta_i (1 - \theta_i)}{n_i}. \qquad (6.18)$$
To obtain confidence intervals for $R$ based on statistic (6.17), one needs only to replace $\theta_i$ in (6.18) by their estimators, $\hat\theta_i = x_i/n_i$, $i = 1, \ldots, m$.
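The two interval constructions of this subsection can be sketched for a series system, $R = \prod_i \theta_i$, as follows. The Python code below is a minimal illustration, not Madansky's or Myhre and Saunders' own procedure: it assumes $0 < x_i < n_i$ for every component, uses $-2\ln L = -2[\sum x_i \ln(1 - \lambda/x_i) - \sum n_i \ln(1 - \lambda/n_i)]$ together with $R(\lambda) = \prod (x_i - \lambda)/(n_i - \lambda)$, locates the two roots by plain bisection, and obtains the chi-squared quantile with one degree of freedom as the square of a standard normal quantile. The normal-approximation interval uses the usual delta-method variance $R^2 \sum_i (1 - \theta_i)/(n_i \theta_i)$ for a product, which is assumed here to be what the variance formula amounts to in the series case. Function names and data are hypothetical.

```python
import math
from statistics import NormalDist


def neg2_log_lr(lam, x, n):
    """-2 ln L as a function of the Lagrange multiplier lam."""
    log_l = sum(xi * math.log(1.0 - lam / xi) for xi in x)
    log_l -= sum(ni * math.log(1.0 - lam / ni) for ni in n)
    return -2.0 * log_l


def series_reliability(lam, x, n):
    """R(lam) = prod (x_i - lam) / (n_i - lam); decreasing in lam."""
    return math.prod((xi - lam) / (ni - lam) for xi, ni in zip(x, n))


def bisect_root(func, lo, hi, iters=200):
    """Plain bisection; assumes func(lo) and func(hi) have opposite signs."""
    f_lo = func(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def lrt_interval_series(x, n, gamma=0.05):
    """LRT-based (1 - gamma) interval; requires 0 < x_i < n_i for all i."""
    crit = NormalDist().inv_cdf(1.0 - gamma / 2.0) ** 2  # chi-squared(1) quantile

    def g(lam):
        return neg2_log_lr(lam, x, n) - crit

    # Positive root lies in (0, min x_i); g grows without bound at the right end.
    lam_pos = bisect_root(g, 0.0, min(x) * (1.0 - 1e-12))
    # Negative root: widen the bracket until g becomes nonnegative.
    lo = -1.0
    while g(lo) < 0:
        lo *= 2.0
    lam_neg = bisect_root(g, lo, 0.0)
    return series_reliability(lam_pos, x, n), series_reliability(lam_neg, x, n)


def normal_interval_series(x, n, gamma=0.05):
    """Delta-method interval for R = prod theta_i (assumed variance form)."""
    theta = [xi / ni for xi, ni in zip(x, n)]
    r_hat = math.prod(theta)
    var = r_hat ** 2 * sum((1.0 - t) / (ni * t) for t, ni in zip(theta, n))
    half = NormalDist().inv_cdf(1.0 - gamma / 2.0) * math.sqrt(var)
    return r_hat - half, r_hat + half


x, n = [45, 48], [50, 50]  # made-up success counts and numbers of trials
print(lrt_interval_series(x, n))
print(normal_interval_series(x, n))
```

The likelihood ratio interval always stays inside $(0, 1)$, whereas the normal-approximation interval is symmetric about $\hat R$ and may extend beyond 1 when the estimated reliability is high and the samples are small.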
6.2
Estimation of $P(X_1 < X_2 < \cdots < X_k)$
In previous sections we have considered estimation of probabilities of various inequalities. Estimation of $R_k = P(X_1 < X_2 < \cdots < X_k)$ is perhaps one of the important cases which has not yet been covered. It is also of interest from the theoretical point of view. Estimation of $R_k$ appears in isotonic regression problems, where it is essential to estimate $R_k$ to find the level probabilities (see, e.g., Miwa et al. (2000)). The important particular case is estimation of $R_3 = P(X < Y < Z)$, which represents the situation where the strength $Y$ should not only be greater than stress $X$ but also smaller than stress $Z$. For example, many devices cannot function at high temperatures, nor can they at very low temperatures. Similarly, a person's blood pressure has two limits, the systolic and diastolic pressures, and his/her blood pressure must lie within these limits. This section presents estimation of $R_k$ in general, followed by a more detailed elaboration of the case $k = 3$.
6.2.1
General Case
Using techniques of Sections 2.1 and 2.2, it is easy to represent estimators of $R_k$ in integral forms. Indeed, if $\hat f(x_1, \ldots, x_k)$ is the MLE or UMVUE of the joint pdf of $X_1, \ldots, X_k$ based on observations on $X_1, \ldots, X_k$, then the MLE or UMVUE of $R_k$ is of the form
$$\hat R_k = \int \cdots \int \hat f(x_1, \ldots, x_k) I(x_1 < \cdots < x_k) \, dx_1 \cdots dx_k. \qquad (6.19)$$
Hence, the main challenge is to compute the integrals of the form (6.19). There are several ways to simplify the expression for $\hat R_k$. The first is to
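use the change of variables $z_1 = x_1$, $z_j = x_j - x_{j-1}$, $j \ge 2$, which results in
$$\hat R_k = \int_{-\infty}^{\infty} \left[ \int_0^{\infty} \cdots \int_0^{\infty} \hat f(z_1, z_1 + z_2, \ldots, z_1 + \cdots + z_k) \, dz_2 \cdots dz_k \right] dz_1. \qquad (6.20)$$
If the $X_j$'s are independent, one can utilize the following recursive relationship suggested by Hayter and Liu (1996). Denote by $\hat f_j(\cdot)$ the MLE or the UMVUE of the pdf of $X_j$, $j = 1, \ldots, k$, and define
$$r_j(x) = P(X_1 < \cdots < X_j < x), \quad j \ge 1, \qquad r_0(x) \equiv 1. \qquad (6.21)$$
Then, it is easy to observe that
$$r_j(x) = \int_{-\infty}^{x} r_{j-1}(y) \hat f_j(y) \, dy, \quad j = 1, \ldots, k, \qquad (6.22)$$
and $\hat R_k = r_k(\infty)$. We shall illustrate these comments by evaluating explicitly (6.19) for a number of basic distributions.
The exponential distribution. Let $X_j$, $j = 1, \ldots, k$, be independent exponential random variables with $f_j(x) = \alpha_j \exp(-\alpha_j x)$. The MLE of $f_j(x)$ is $\tilde f_j(x) = \tilde\alpha_j \exp(-\tilde\alpha_j x)$ (here we are using "~" instead of "^" to emphasize that this is an MLE) where $\tilde\alpha_j$ is the inverse of the average of observations on $X_j$. Now (6.20) becomes
$$\tilde R_k = \tilde\alpha_1 \cdots \tilde\alpha_k \int_0^{\infty} \cdots \int_0^{\infty} \exp\Big\{ -\sum_{j=1}^{k} z_j \sum_{i=j}^{k} \tilde\alpha_i \Big\} \, dz_1 \cdots dz_k$$
so that the MLE of $R_k$ is given by the elegant, intuitively plausible expression
$$\tilde R_k = \prod_{j=1}^{k} \frac{\tilde\alpha_j}{\sum_{i=j}^{k} \tilde\alpha_i}.$$
The normal distribution. If $X_j$, $j = 1, \ldots, k$, have independent normal distributions with the means $\theta_j$ and variances $\sigma_j^2$, an application of formula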
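(6.19) yields
$$R_k = \int_{-\infty}^{\infty} \int_0^{\infty} \cdots \int_0^{\infty} (2\pi)^{-k/2} \sigma^{-1} \exp\Big\{ -\frac{1}{2} \sum_{j=1}^{k} \sigma_j^{-2} (z_1 + \cdots + z_j - \theta_j)^2 \Big\} \, dz_2 \cdots dz_k \, dz_1, \qquad (6.23)$$
where $\sigma = \prod_{j=1}^{k} \sigma_j$. Define $\mu_1 = \theta_1$, $\mu_j = \theta_j - \theta_{j-1}$, $j \ge 2$, and change variables $z_j = y_j + \mu_j$, $j = 1, \ldots, k$. Then the argument of the exponent in (6.23) becomes
$$\Psi(y_1, \ldots, y_k) = -\frac{1}{2} \sum_{j=1}^{k} \sigma_j^{-2} (y_1 + \cdots + y_j)^2.$$
Simplifying $\Psi(y_1, \ldots, y_k)$ we express it as
$$\Psi(y_1, \ldots, y_k) = -\frac{1}{2} \Big( \sum_{l=1}^{k} \sigma_l^{-2} \Big) \Big[ y_1 + \Big( \sum_{l=1}^{k} \sigma_l^{-2} \Big)^{-1} \sum_{j=2}^{k} y_j \sum_{m=j}^{k} \sigma_m^{-2} \Big]^2 + \Psi^*(y_2, \ldots, y_k), \qquad \Psi^*(y_2, \ldots, y_k) = -\frac{1}{2} \sum_{i=2}^{k} \sum_{j=2}^{k} W_{ij} y_i y_j,$$
where $\Psi^*$ depends on $y_2, \ldots, y_k$ only and $W$ is a symmetric matrix with the entries
$$W_{ij} = \sum_{m=\max(i,j)}^{k} \sigma_m^{-2} - \Big( \sum_{l=1}^{k} \sigma_l^{-2} \Big)^{-1} \sum_{l=i}^{k} \sum_{m=j}^{k} \sigma_l^{-2} \sigma_m^{-2}, \quad i, j = 2, \ldots, k.$$
Integrating over $y_1$, we express $R_k$ in terms of the orthant normal probability
$$R_k = P(Y_2 > -\mu_2, \ldots, Y_k > -\mu_k) \qquad (6.24)$$
where $Y = (Y_2, \ldots, Y_k)$ is a $(k-1)$-dimensional normal vector with zero mean and covariance matrix $W^{-1}$. Some analytic approximations to probabilities (6.24) can be found, for example, in Gupta (1963) and references therein. More recent references are given in Kotz et al. (2000). To obtain the MLE of $R_k$ one needs only to replace $\mu_j$ and $\sigma_j^2$ by their MLEs in (6.24). Miwa et al. (2000) also suggest calculating the probability $R_k$ directly using the recursive relationships (6.21) and (6.22). This approach leads to the application of numerical integration techniques, which can be quite time consuming even with modern computers. However, if $r_j(x)$ are approximated by piecewise cubic polynomials, then integration in (6.22) can be performed analytically. Miwa et al. (2000) provide coefficients for such polynomials and discuss a choice of grid points.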
6.2.2
Estimation of $P(X < Y < Z)$
Remarkably, there are numerous papers devoted to the estimation of $P(X < Y < Z)$ scattered in the literature of the last 25 years or so. Estimation of $R_3 = P(X < Y < Z)$ based on independent samples $(X_1, \ldots, X_{n_1})$, $(Y_1, \ldots, Y_{n_2})$ and $(Z_1, \ldots, Z_{n_3})$ was studied by Chandra and Owen (1975), Hlawka (1975), Singh (1980), Dutta and Sriwastav (1986) and Ivshin (1998). Hlawka (1975) and Singh (1980) concentrate on the nonparametric version of the problem while the other authors deal with the parametric set-up. Hlawka (1975) suggests estimating $R_3$ by the U-statistic
$$U_3 = (n_1 n_2 n_3)^{-1} \sum_{i=1}^{n_1} \sum_{j=1}^{n_2} \sum_{l=1}^{n_3} I(X_i < Y_j < Z_l).$$
He shows that $U_3$ is an unbiased consistent estimator of $R_3$ with the variance bounded by $(4 n_1 n_2 + n_1 n_3 + 4 n_2 n_3 + 5 n_1 + 2 n_2 + 5 n_3 + 4)/(180 n_1 n_2 n_3)$. He then proves that $\sqrt{n}(U_3 - R_3)$ is asymptotically normal whenever $n_1 = n_2 = n_3 = n$.
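The triple sum defining $U_3$ need not be evaluated term by term. The following Python sketch, offered as one possible implementation rather than Hlawka's own, counts for each $Y_j$ the number of $X_i$ strictly below it and the number of $Z_l$ strictly above it, using sorted samples and binary search. The function name `u3_statistic` and the data are hypothetical.

```python
from bisect import bisect_left, bisect_right


def u3_statistic(x, y, z):
    """U-statistic estimate of P(X < Y < Z) from three independent samples."""
    xs, zs = sorted(x), sorted(z)
    total = 0
    for y_j in y:
        below = bisect_left(xs, y_j)             # number of X_i < y_j
        above = len(zs) - bisect_right(zs, y_j)  # number of Z_l > y_j
        total += below * above
    return total / (len(x) * len(y) * len(z))


# Made-up samples; ties are counted as failures of the strict inequalities.
print(u3_statistic([1.0, 2.0], [1.5, 3.0], [2.5, 4.0]))
```

Sorting reduces the cost from $n_1 n_2 n_3$ indicator evaluations to roughly $(n_1 + n_3) \log(n_1 + n_3) + n_2 \log(n_1 n_3)$ operations.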
Singh (1980) considers a special case when the cdfs $F_X(\cdot)$ and $F_Z(\cdot)$ of $X$ and $Z$, respectively, are known while the cdf $F_Y(\cdot)$ of $Y$ is unknown but the observations $(Y_1, \ldots, Y_{n_2})$ are available. He provides an unbiased estimator of $R_3$ of the form
$$\hat R_3 = n_2^{-1} \sum_{i=1}^{n_2} F_X(Y_i) [1 - F_Z(Y_i)]$$
with the variance
$$\mathrm{Var}(\hat R_3) \le n_2^{-1} \{ E[F_X(Y_i)(1 - F_Z(Y_i))] - R_3^2 \} = n_2^{-1} (R_3 - R_3^2) \le 1/(4 n_2).$$
A useful representation of $R_3$ as
$$R_3 = P(X < Y) - P(X < Y, Z < Y) \qquad (6.25)$$
is also given by Singh (1980). Dutta and Sriwastav (1986) deal with the estimation of $R_3$ when $X$, $Y$ and $Z$ are exponential random variables, which is a particular case of the general model considered in Section 6.2.1.
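Singh's estimator is a simple sample average of $F_X(Y_i)[1 - F_Z(Y_i)]$. A minimal Python sketch follows; it assumes, as in Singh's set-up, that the cdfs of $X$ and $Z$ are known, and it uses uniform cdfs on $(0, 1)$ only as a made-up example. The names `singh_estimator` and `uniform_cdf` are hypothetical.

```python
def singh_estimator(y_sample, cdf_x, cdf_z):
    """Average of F_X(Y_i) * (1 - F_Z(Y_i)); F_X and F_Z are assumed known."""
    return sum(cdf_x(y) * (1.0 - cdf_z(y)) for y in y_sample) / len(y_sample)


def uniform_cdf(t):
    """cdf of the uniform distribution on (0, 1), used as an illustration."""
    return min(max(t, 0.0), 1.0)


y_sample = [0.25, 0.75]  # made-up observations on Y
print(singh_estimator(y_sample, uniform_cdf, uniform_cdf))
```

Each summand lies between 0 and 1, which is what makes the variance bound $1/(4 n_2)$ available without any knowledge of the distribution of $Y$.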
Ivshin (1998) investigates the MLE and UMVUE of $R_3$ when $X$, $Y$ and $Z$ are either uniform or exponential random variables with the unknown location parameters. The article utilizes the formula (6.19) with $k = 3$. For example, if $X$, $Y$ and $Z$ are uniform with the pdfs $\theta_i^{-1} I(0 < x < \theta_i)$, $i = 1, 2$ and $3$ for $X$, $Y$ and $Z$, respectively, the MLE $\tilde R_3$ and the UMVUE $\hat R_3$ of $R_3$ are given by formidable but rather straightforward expressions
$$\tilde R_3 = \tilde p_1 I(X_{(n_1)} < Y_{(n_2)} < Z_{(n_3)}) + \tilde p_2 I(Y_{(n_2)} < \min(X_{(n_1)}, Z_{(n_3)})) + \tilde p_3 I(X_{(n_1)} < Z_{(n_3)} < Y_{(n_2)}) + \tilde p_4 I(Z_{(n_3)} < \min(X_{(n_1)}, Y_{(n_2)})),$$
$$\hat R_3 = \hat p_1 I(X_{(n_1)} < Y_{(n_2)} < Z_{(n_3)}) + \hat p_2 I(Y_{(n_2)} < \min(X_{(n_1)}, Z_{(n_3)})) + \hat p_3 I(X_{(n_1)} < Z_{(n_3)} < Y_{(n_2)}) + \hat p_4 I(Z_{(n_3)} < \min(X_{(n_1)}, Y_{(n_2)})),$$
where
P3 = \Z\n3) ~ Pi
+ (Y(n2) ~ = ^(
and A
_
("a-AJ(»a)-^l»iV
,
l
2
(ni))2
p2
= 2n*n2n3X
P3
=
+
-. P4 =
(n1-l)(n3-l)(2y(n2)-X(ni)) ) """ 1n\n2n3Z(n3) > V ^ + (ni-l)(n 2 -l)(n3-l)Y (ana)
nin,5,y/..i
"T"
(ni-l)(n 2 -l)(n 3 -l)(y (n2) -Z (n3) ) 2
—
Here $X_{(n_1)}$, $Y_{(n_2)}$ and $Z_{(n_3)}$ are the largest observations on $X$, $Y$ and $Z$, respectively. To obtain expressions for $\tilde R_3$ and $\hat R_3$ one needs to perform integration in (6.19) keeping in mind that the MLE and the UMVUE of the uniform pdf $f(x) = \theta^{-1} I(0 < x < \theta)$ based on $X_1, \ldots, X_n$ are, respectively, $\tilde f(x) = X_{(n)}^{-1} I(0 < x < X_{(n)})$ and $\hat f(x) = (n-1) I(x < X_{(n)})/(n X_{(n)}) + [1 - (n-1)x/(n X_{(n)})] \delta(x - X_{(n)})$ where $\delta(\cdot)$ stands for the Dirac delta-function (see definition in Section 3.2.2). We conclude this section by noting the paper of Chandra and Owen (1975) which is concerned with a slightly different problem, namely, estimation of $P(X_1 < Y, \ldots, X_l < Y)$ and $P(X < Y_1, \ldots, X < Y_l)$. It is easy to observe that the first of these probabilities is related to $R_3$ by formula (6.25) where $l = 2$. Chandra and Owen (1975) construct MLEs and UMVUEs of the above mentioned probabilities in some special cases. We shall leave this material as exercises.
6.3
Linear Models Formulations for Stress-Strength Systems
6.3.1
Stress-Strength Models with Explanatory Variables
Often an experimenter has access to the measurements of some auxiliary (explanatory) variables that affect the strength or influence the stress. This situation is particularly prevalent in modern medical applications. The additional information can play an important role in analysis. Suppose that X depends on $k_1$ explanatory variables $z_1$ and Y depends on $k_2$ explanatory variables $z_2$, namely,
$X|z_1 = \beta_1' z_1 + \varepsilon_1, \quad Y|z_2 = \beta_2' z_2 + \varepsilon_2,$ (6.26)
where $\beta_j$ are $k_j$-dimensional vectors and $\varepsilon_j$ are random variables with specific distributions, j = 1, 2. Some of the explanatory variables may be common, as often happens in drug trials when the remission times are adjusted for the age of the patients. The model (6.26) and its minor modifications were considered by Guttman et al. (1988), Aminzadeh (1997) and Lee and Park (1998).
Normal stress and strength, equal variances. Guttman et al. (1988) assume that $\varepsilon_1$ and $\varepsilon_2$ are independent normal variables with zero means and variances $\sigma_1^2$ and $\sigma_2^2$, respectively. Assuming that $n_1$ ($n_2$) observations for X (Y) are available, they introduce the notation:
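$\mathbf{X} = (X_1, \ldots, X_{n_1})', \quad \mathbf{Y} = (Y_1, \ldots, Y_{n_2})', \quad W_1 = Z_1' Z_1, \quad W_2 = Z_2' Z_2,$ (6.27)
where $Z_j$ is the $n_j \times k_j$ matrix of observed values of the explanatory variables, j = 1, 2. It is easy to show that for specified values $z_1$ and $z_2$ we have
$R = R(z_1, z_2) = P(X < Y | z_1, z_2) = \Phi(\zeta)$ (6.28)
where
$\zeta = (\beta_2' z_2 - \beta_1' z_1)/\sqrt{\sigma_1^2 + \sigma_2^2}$ (6.29)
and $\Phi(\cdot)$ as above is the standard normal cdf. Following Guttman et al. (1988), consider first the case of $\sigma_1 = \sigma_2 = \sigma$. Then the sufficient statistics for the model are $\hat\beta_1$, $\hat\beta_2$ and $S^2$, with
$(N - k)S^2 = (n_1 - k_1)S_1^2 + (n_2 - k_2)S_2^2,$ (6.30)
where $N = n_1 + n_2$, $k = k_1 + k_2$, $(n_1 - k_1)S_1^2 = \mathbf{X}'\mathbf{X} - \hat\beta_1' Z_1' \mathbf{X}$ and $(n_2 - k_2)S_2^2 = \mathbf{Y}'\mathbf{Y} - \hat\beta_2' Z_2' \mathbf{Y}$.
In view of the normality assumptions, from standard linear model theory (see e.g. Seber (1977)) we have
$\hat\beta_j \sim N(\beta_j, \sigma^2 W_j^{-1}), \ j = 1, 2, \quad (N - k)S^2/\sigma^2 \sim \chi^2_{N-k},$ (6.31)
with $\hat\beta_1$, $\hat\beta_2$ and $S^2$ independently distributed. Denote (cf. with (6.29))
$\hat\zeta = (\hat\beta_2' z_2 - \hat\beta_1' z_1)/(\sqrt{2}\, S),$ (6.32)
$a_i^2 = z_i' W_i^{-1} z_i, \ i = 1, 2, \quad a^2 = a_1^2 + a_2^2.$ (6.33)
Then an estimate for R is given by $\hat R = \Phi(\hat\zeta)$ (cf. (6.28)). To obtain a $(1 - \gamma)$ lower confidence bound for R note that
$\sqrt{2}\,\hat\zeta/a \sim t_{N-k}(\lambda),$ (6.34)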
where $t_{N-k}(\lambda)$ stands for the noncentral T-distribution with $(N - k)$ degrees of freedom and noncentrality parameter $\lambda$ (compare with (4.14)). Denoting the cdf of this noncentral T-distribution by $F(t; N - k, \lambda)$ where $\lambda = \sqrt{2}\,\zeta/a$, an exact $(1 - \gamma)$ lower bound $\zeta_\gamma$ for $\zeta$ can be obtained by solving the equation
$F(\sqrt{2}\,\hat\zeta/a;\ N - k,\ \lambda) = 1 - \gamma$ (6.35)
for $\lambda$, say $\lambda_\gamma$, and then determining $\zeta_\gamma$ as $\zeta_\gamma = a\lambda_\gamma/\sqrt{2}$. Hence, a $(1 - \gamma)$ lower confidence bound for R is
$R_\gamma(z_1, z_2) = \Phi(\zeta_\gamma) = \Phi(a\lambda_\gamma/\sqrt{2}).$ (6.36)
Similarly to Section 4.1.2 one can use the normal approximation to the noncentral T-distribution to avoid the technical difficulties involved in the numerical solution of equation (6.35).
Normal stress and strength, non-equal variances. In the situation of non-equal variances $\sigma_1 \neq \sigma_2$, the lower bound derived by Guttman et al. (1988) can be obtained by generalizing the technique of Reiser and Guttman (1986) (see Section 4.1.2). In this case, denote
$\hat\zeta = (\hat\beta_2' z_2 - \hat\beta_1' z_1)/\sqrt{S_1^2 + S_2^2},$ (6.37)
$M = (\sigma_1^2 + \sigma_2^2)/(a_1^2\sigma_1^2 + a_2^2\sigma_2^2),$ (6.38)
where $a_1$ and $a_2$ are defined in (6.33). Then
$\hat\beta_2' z_2 - \hat\beta_1' z_1 \sim N(\beta_2' z_2 - \beta_1' z_1,\ a_1^2\sigma_1^2 + a_2^2\sigma_2^2)$ (6.39)
is independently distributed from $S_1^2 + S_2^2$ which is a sum of weighted chi-squared random variables. Approximating the distribution of $S_1^2 + S_2^2$ by that of $(\sigma_1^2 + \sigma_2^2)\chi^2_\nu/\nu$ and using (6.37) and (6.39), we derive (cf. (4.14))
$\sqrt{M}\,\hat\zeta \sim t_\nu(\sqrt{M}\,\zeta),$ (6.40)
where $\zeta$ is given in (6.29) and M and $\nu$ can be approximated by
$\hat M = (S_1^2 + S_2^2)/(a_1^2 S_1^2 + a_2^2 S_2^2)$ (6.41)
and
$\hat\nu = (S_1^2 + S_2^2)^2 \Big/ \left[\frac{S_1^4}{n_1 - k_1} + \frac{S_2^4}{n_2 - k_2}\right],$ (6.42)
respectively. An approximate $(1 - \gamma)$ lower confidence bound for $\zeta$, say, $\zeta_\gamma$, is then found by solving
$F(\sqrt{\hat M}\,\hat\zeta;\ \hat\nu,\ \sqrt{\hat M}\,\zeta) = 1 - \gamma$ (6.43)
for $\zeta$ (cf. (6.35)). The lower $(1 - \gamma)$ confidence bound for R is then $R_\gamma = \Phi(\zeta_\gamma)$. Of course, here, analogously to Section 4.1.2, one can use the normal approximation to the noncentral T-distribution to avoid the often troublesome solution of equation (6.43).
Exponential stress and strength. Aminzadeh (1997) proposes the following regression model:
$X|z_1 = \exp\{\beta_1' z_1 + \varepsilon_1\}, \quad Y|z_2 = \exp\{\beta_2' z_2 + \varepsilon_2\},$ (6.44)
where X and Y have exponential pdfs $f_X(x) = \theta_1^{-1}\exp(-x/\theta_1)$ and $f_Y(y) = \theta_2^{-1}\exp(-y/\theta_2)$ with $\theta_1 = \exp(\beta_1' z_1)$ and $\theta_2 = \exp(\beta_2' z_2)$. Under this model, $\varepsilon_i$, i = 1, 2, have the pdfs $f_\varepsilon(\varepsilon) = \exp(\varepsilon - \exp(\varepsilon))$, and, since $\theta_1$ and $\theta_2$ are the means of X and Y, $R = R(z_1, z_2) = P(X < Y | z_1, z_2) = \theta_2/(\theta_1 + \theta_2)$, a very familiar formula for the exponential case. To construct a confidence interval for R the author introduces the auxiliary function $\Psi(R) = R/(1 - R) = \theta_2/\theta_1 = \exp(\beta_2' z_2 - \beta_1' z_1)$. Aminzadeh (1997) also derives the MLEs $\hat\beta_1$ and $\hat\beta_2$ of $\beta_1$ and $\beta_2$ and then uses the asymptotic normality of the MLE to construct confidence intervals for R. Asymptotically we have
$\hat\beta_1' z_1 - \hat\beta_2' z_2 \sim N(\beta_1' z_1 - \beta_2' z_2,\ z_1' W_1^{-1} z_1 + z_2' W_2^{-1} z_2),$
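where $W_j$, j = 1, 2, are defined in (6.27).
Nonparametric model for stress and strength. Lee and Park (1998) consider a model very similar to (6.26) but with no parametric assumptions on X and Y. Namely, they study
$X = \mu_1 + \beta_1'(z_1 - \bar z_1) + \varepsilon_1, \quad Y = \mu_2 + \beta_2'(z_2 - \bar z_2) + \varepsilon_2,$ (6.45)
with unknown $\mu_j$ and $\beta_j$, and
$\bar z_j = n_j^{-1}\sum_{i=1}^{n_j} z_{ji}, \quad j = 1, 2.$
The only conditions imposed on $\varepsilon_j$ are $E(\varepsilon_j) = 0$ and $\mathrm{Var}(\varepsilon_j) = \sigma_j^2$, j = 1, 2. If the parameters $\mu_j$ and $\beta_j$, j = 1, 2, were known, we would have had $R = P(X < Y | z_1, z_2, \mu_1, \mu_2) = P(\varepsilon_1 - \varepsilon_2 < \tau)$ with
$\tau = \mu_2 - \mu_1 + \beta_2'(z_2 - \bar z_2) - \beta_1'(z_1 - \bar z_1).$ (6.46)
Hence, one should estimate R by
$\frac{1}{n_1 n_2}\sum_{i=1}^{n_1}\sum_{j=1}^{n_2} I(\varepsilon_{1i} - \varepsilon_{2j} < \tau).$ (6.47)
Now, since $\tau$ is unknown and $\varepsilon_{1i}$ and $\varepsilon_{2j}$ are unavailable, the authors derive the least squares estimators $\hat\mu_j$ and $\hat\beta_j$ of $\mu_j$ and $\beta_j$, j = 1, 2, respectively. They set $\hat\tau = \hat\mu_2 - \hat\mu_1 + \hat\beta_2'(z_2 - \bar z_2) - \hat\beta_1'(z_1 - \bar z_1)$, $\hat\varepsilon_{1i} = X_i - \hat\mu_1 - \hat\beta_1'(z_{1i} - \bar z_1)$ and $\hat\varepsilon_{2j} = Y_j - \hat\mu_2 - \hat\beta_2'(z_{2j} - \bar z_2)$ and construct the estimator of $R(z_1, z_2, \mu_1, \mu_2)$ of the form
$\hat R(z_1, z_2, \hat\mu, \hat\beta) = \frac{1}{n_1 n_2}\sum_{i=1}^{n_1}\sum_{j=1}^{n_2} I(\hat\varepsilon_{1i} - \hat\varepsilon_{2j} < \hat\tau).$ (6.48)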
Lee and Park (1998) prove consistency of the estimator (6.48) and show that under regularity conditions this estimator is asymptotically normal. They also provide an estimator for the asymptotic variance and propose an asymptotic lower confidence bound for R. Their computations are quite instructive, and we recommend further study of their paper.
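The construction of the residual-based estimator can be sketched in a few lines of Python. The sketch below is a simplified illustration, not the authors' procedure: it assumes a single scalar explanatory variable for each of X and Y, uses the fact that X < Y is equivalent to $\varepsilon_1 - \varepsilon_2 < \tau$ under the centred linear model, and the names `fit_line` and `reliability_estimate` are hypothetical.

```python
from statistics import mean


def fit_line(z, y):
    """Least squares fit of y = mu + beta * (z - zbar); returns (mu, beta, zbar)."""
    zbar = mean(z)
    mu = mean(y)                      # with a centred regressor the intercept is the mean
    sxx = sum((zi - zbar) ** 2 for zi in z)
    beta = sum((zi - zbar) * yi for zi, yi in zip(z, y)) / sxx
    return mu, beta, zbar


def reliability_estimate(z1_obs, x_obs, z2_obs, y_obs, z1, z2):
    """Plug-in estimate of P(X < Y | z1, z2) from least squares residuals."""
    mu1, b1, z1bar = fit_line(z1_obs, x_obs)
    mu2, b2, z2bar = fit_line(z2_obs, y_obs)
    # Estimated threshold tau at the covariate values of interest
    tau = mu2 - mu1 + b2 * (z2 - z2bar) - b1 * (z1 - z1bar)
    res1 = [x - mu1 - b1 * (z - z1bar) for z, x in zip(z1_obs, x_obs)]
    res2 = [y - mu2 - b2 * (z - z2bar) for z, y in zip(z2_obs, y_obs)]
    # Proportion of residual pairs with e1 - e2 < tau
    hits = sum(1 for e1 in res1 for e2 in res2 if e1 - e2 < tau)
    return hits / (len(res1) * len(res2))
```

With several explanatory variables the two calls to `fit_line` would be replaced by a general least squares fit; the counting step is unchanged.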
6.3.2
ANOVA Formulations of Stress-Strength Models
In statistical quality control and reliability analysis observations for X and Y can sometimes be divided into groups. For example, a quality control engineer may decide to compare failure times X and Y of two products A and B where both products are tested under various sets of conditions (temperature, pressure, exposure to sun, air, etc.). When the number of batches in a population is high, a possible model for measurements could be the random effect model considered by Aminzadeh (1991). Let $X_{ij}$ and $Y_{ij}$ be the i-th measurements in batch j for X and Y, respectively, where $i = 1, \ldots, n_1$, $j = 1, \ldots, k_1$ for X, and $i = 1, \ldots, n_2$, $j = 1, \ldots, k_2$ for Y. Aminzadeh (1991) suggests a one-way ANOVA random effect model for X and Y of the form
$X_{ij} = \mu_1 + Q_j + \varepsilon_{ij},$ (6.49)
$Y_{ij} = \mu_2 + T_j + \delta_{ij},$ (6.50)
where $\mu_1$ and $\mu_2$ are the overall means of populations X and Y and $Q_j$ and $T_j$ are the random batch effects. Under the assumption that all $Q_j$, $T_j$, $\varepsilon_{ij}$ and $\delta_{ij}$ are independent, we obtain from (6.49) and (6.50) that
$X_{ij} \sim N(\mu_1, \sigma_1^2), \quad Y_{ij} \sim N(\mu_2, \sigma_2^2) \quad (\sigma_1^2 = \sigma_Q^2 + \sigma_\varepsilon^2, \ \sigma_2^2 = \sigma_T^2 + \sigma_\delta^2).$
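Using the well known results for the one-way random ANOVA model we estimate $\mu_1$ and $\mu_2$ by $\bar X = \sum X_{ij}/(n_1 k_1)$ and $\bar Y = \sum Y_{ij}/(n_2 k_2)$, respectively, and the variances by
$\hat\sigma_1^2 = (s_\varepsilon^2 (n_1 - 1) + s_Q^2)/n_1, \quad \hat\sigma_2^2 = (s_\delta^2 (n_2 - 1) + s_T^2)/n_2,$ (6.51)
where $s_\varepsilon^2$ and $s_Q^2$ ($s_\delta^2$ and $s_T^2$) are mean squares within and between batches for $X_{ij}$ ($Y_{ij}$). Note that
$\bar X \sim N(\mu_1, [n_1\sigma_Q^2 + \sigma_\varepsilon^2]/(n_1 k_1)), \quad \bar Y \sim N(\mu_2, [n_2\sigma_T^2 + \sigma_\delta^2]/(n_2 k_2)),$ (6.52)
and $\hat\sigma_1^2$ and $\hat\sigma_2^2$ have approximately the chi-squared distributions
$\hat\sigma_i^2 \sim (\sigma_i^2/\nu_i)\,\chi^2_{\nu_i}, \quad i = 1, 2,$
with
$\nu_i = \frac{(\alpha_i + 1)^2 (k_i - 1) k_i n_i^2}{k_i (n_i\alpha_i + 1)^2 + (k_i - 1)(n_i - 1)},$ (6.53)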
Some Selected Special Cases
where ai =
^r/^e-
Observe that R = P(X < Y) is of the familiar form with C= To construct confidence bounds for R = P(X < Y), cases of equal (<j\ — a\) and non-equal {a\ ^ a^) variances ought to be considered separately. Equal variances. If o\= o\ = a2, then the pooled estimator of variance based on (6.53) will be a1 =
u2) ~
and, hence, the estimator of £ is
Since Y - X~ JV(/x2 - ViM2) with n\ai + 1 0 = —T-,
7T +
T = \flJbC, has a noncentral T-distribution with (y\ + u2) degrees of freedom and the non-centrality parameter ^/2/6£. The lower confidence bound for R can be constructed as it has been done in Section 6.3.1. Non-equal variances. The procedure here is analogous to one in the case of equal variances. In the case of nonequal variances we approximate distribution of a\ + a\ in £ by a\ + <x| ~ C^*)" 1 ^! + ^DxL*) where v* = (<j\ + <JI)2I(?\IV1 + al/v2). Let s.(jil-i)*l
+
|
(n2-1)^ + 4 n2k2
In this case, T* = \J{o\ + cr|)/6* C, has a non-central T-distribution with the non-centrality parameter \/{cr\ + cr'^/b* ^ and v* degrees of freedom. For construction of the lower bound for R one needs to approximate v* by using a\ and d\ instead of
6.4
Stress-Strength Models with Grouped and Categorical Data
In many practical situations, instead of numerical measurements on X and Y the data are presented in the form of ordered categories. Namely, the counts $n_{ij}$ are available which represent the number of observations on X (i = 1) and Y (i = 2) that fall into the category $C_j$, $j = 1, \ldots, K$. These categories may be obtained simply by discretizing all possible values of X and Y by the partition $-\infty \le c_0 < c_1 < \cdots < c_{K-1} < c_K \le \infty$, so that $C_j = (c_{j-1}, c_j]$ where the cut-off points $c_j$ may or may not be available. For example, Hochberg (1981) in a study sponsored by the US Department of Transportation considers comparison of the injury distributions of belted and unbelted drivers involved in accidents, letting $C_1, \ldots, C_K$ be ordered injury categories ranging from least to most severe injury, in which case $c_j$ cannot be obtained. Gastwirth and Krieger (1991) use the above model in measuring economic inequality. Income data typically are reported in grouped form: the number of persons whose income falls in each interval is reported and, hence, the cut-off points are recorded. Other examples of applications of the above model in medicine and psychology will be considered in more detail in Section 7.3.2. The model with categorical data was studied by Hochberg (1981), Simonoff et al. (1986), Halperin et al. (1989), Gastwirth and Krieger (1991) and Edwardes (1995) among others. It is easy to note that a major difference between continuous and categorical data is the possibility of ties, i.e. P(X = Y) may not be zero. For this reason, it is sensible to estimate a quantity different from R = P(X < Y) which takes the latter possibility into account. One of the candidates may be $R^* = P(X < Y) + \frac{1}{2}P(X = Y)$; another is $\Lambda = P(X < Y) - P(X > Y)$.
6.4.1
Point Estimation
Evidently, vectors $N_i = (n_{i1}, n_{i2}, \ldots, n_{iK})$, i = 1, 2, have multinomial distributions with probabilities
$P_{1j} = \int_{c_{j-1}}^{c_j} f_X(x)\,dx, \quad P_{2j} = \int_{c_{j-1}}^{c_j} f_Y(y)\,dy, \quad j = 1, \ldots, K,$ (6.54)
for an observation from X or Y, respectively, to fall into category j, $j = 1, \ldots, K$. If the cut-off points $c_j$ are not available, the most natural approach to estimation of $\Lambda$ is the nonparametric one. Otherwise, one can use a parametric approach or one of the two parametric-nonparametric alternatives suggested by Simonoff et al. (1986).
Nonparametric estimator. Following Hochberg (1981) we denote the group index of an observation with the value a by J(a), so that $J(\cdot)$ is a random variable with the values $1, 2, \ldots, K$. Let
$\hat p_{ij} = n_{ij}/n_i, \quad i = 1, 2, \ j = 1, \ldots, K,$
be estimators of $P_{ij}$ using frequencies. Then the WMW-type estimator of $\Lambda$ is
$\hat\Lambda_{WMW} = \sum_{i=1}^{K-1}\sum_{j=i+1}^{K} \hat p_{1i}\hat p_{2j} - \sum_{i=2}^{K}\sum_{j=1}^{i-1} \hat p_{1i}\hat p_{2j}.$ (6.55)
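The WMW-type estimator is simply the estimated probability that X falls in a lower category than Y minus the estimated probability that it falls in a higher one. A minimal Python sketch of this computation from the two vectors of category counts follows; the function name `wmw_lambda` is a hypothetical choice made for illustration.

```python
def wmw_lambda(counts_x, counts_y):
    """WMW-type estimate of Lambda = P(X < Y) - P(X > Y) from ordered category counts."""
    if len(counts_x) != len(counts_y):
        raise ValueError("both samples must use the same K categories")
    n1, n2 = sum(counts_x), sum(counts_y)
    p1 = [c / n1 for c in counts_x]     # relative frequencies for X
    p2 = [c / n2 for c in counts_y]     # relative frequencies for Y
    k = len(p1)
    # First half: X in category i, Y in a strictly higher category j
    less = sum(p1[i] * p2[j] for i in range(k) for j in range(i + 1, k))
    # Second half: X in category i, Y in a strictly lower category j
    greater = sum(p1[i] * p2[j] for i in range(k) for j in range(i))
    return less - greater
```

The two partial sums `less` and `greater` are also the natural frequency estimates of P(X < Y) and P(X > Y) taken separately.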
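Parametric estimator. Assuming that $X \sim N(\mu_1, \sigma_1^2)$ and $Y \sim N(\mu_2, \sigma_2^2)$, the MLE of $\Lambda$ is of the familiar form
$\hat\Lambda_{MLE} = 2\Phi\left(\frac{\hat\mu_2 - \hat\mu_1}{\sqrt{\hat\sigma_1^2 + \hat\sigma_2^2}}\right) - 1.$ (6.56)
Here
$\hat\mu_i = \sum_{j=1}^{K}\hat p_{ij}\bar c_j, \quad \hat\sigma_i^2 = \sum_{j=1}^{K}\hat p_{ij}\bar c_j^2 - (\hat\mu_i)^2, \quad i = 1, 2,$ (6.57)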
with $\bar c_j = (c_{j-1} + c_j)/2$. The performance of the estimator (6.56) is highly sensitive to violation of normality assumptions.
Two parametric/nonparametric compromises. A more robust parametric estimator can be constructed by deriving $\hat\mu_i$ and $\hat\sigma_i$, i = 1, 2, using (6.57) and then estimating $P_{ij}$, i = 1, 2, $j = 1, \ldots, K$, by the inference
$\tilde P_{ij} = \Phi\big((c_j - \hat\mu_i)/\hat\sigma_i\big) - \Phi\big((c_{j-1} - \hat\mu_i)/\hat\sigma_i\big).$
Then a "pseudo-MLE" estimator of $\Lambda$ will be
$\hat\Lambda_{PML} = \sum_{i=1}^{K-1}\sum_{j=i+1}^{K}\tilde P_{1i}\tilde P_{2j} - \sum_{i=2}^{K}\sum_{j=1}^{i-1}\tilde P_{1i}\tilde P_{2j}.$ (6.58)
Note that $\hat\Lambda_{PML}$ is essentially the same as $\hat\Lambda_{WMW}$ given by (6.55), except that the cell probability estimates are derived parametrically rather than from frequency estimates. The behavior of (6.58) depends on whether X and Y are normally distributed but to a somewhat lesser extent than that of $\hat\Lambda_{MLE}$.
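To make the difference between the parametric and the pseudo-MLE estimators concrete, the following Python sketch computes both from grouped counts and known, finite cut-off points. It is an illustration under stated assumptions: the grouped moments are computed from the interval midpoints, the cell probabilities are taken as normal-cdf differences over each interval $(c_{j-1}, c_j]$ (so they need not sum exactly to one), and all function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def grouped_moments(counts, cuts):
    """Mean and variance computed from category frequencies and interval midpoints."""
    n = sum(counts)
    mids = [(lo + hi) / 2 for lo, hi in zip(cuts[:-1], cuts[1:])]
    p = [c / n for c in counts]
    mu = sum(pj * m for pj, m in zip(p, mids))
    var = sum(pj * m * m for pj, m in zip(p, mids)) - mu * mu
    return mu, var


def lambda_mle(counts_x, counts_y, cuts):
    """Parametric estimate 2 * Phi((mu2 - mu1) / sqrt(var1 + var2)) - 1."""
    mu1, v1 = grouped_moments(counts_x, cuts)
    mu2, v2 = grouped_moments(counts_y, cuts)
    return 2 * NormalDist().cdf((mu2 - mu1) / sqrt(v1 + v2)) - 1


def normal_cell_probs(mu, var, cuts):
    """Assumed cell probabilities: normal cdf differences over each interval."""
    dist = NormalDist(mu, sqrt(var))
    return [dist.cdf(hi) - dist.cdf(lo) for lo, hi in zip(cuts[:-1], cuts[1:])]


def lambda_pml(counts_x, counts_y, cuts):
    """Pseudo-MLE: the WMW-type formula applied to the fitted normal cell probabilities."""
    mu1, v1 = grouped_moments(counts_x, cuts)
    mu2, v2 = grouped_moments(counts_y, cuts)
    p1 = normal_cell_probs(mu1, v1, cuts)
    p2 = normal_cell_probs(mu2, v2, cuts)
    k = len(p1)
    less = sum(p1[i] * p2[j] for i in range(k) for j in range(i + 1, k))
    greater = sum(p1[i] * p2[j] for i in range(k) for j in range(i))
    return less - greater
```

Both functions require `cuts` to contain the K + 1 finite end points $c_0, \ldots, c_K$ and both samples to have positive grouped variance.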
Another method suggested by Simonoff et al. (1986) does not require the underlying model to be normal, but only smoothness of the pdfs $f_X$ and $f_Y$ is required. Under this condition, the probabilities $P_{ij}$ can more accurately be estimated by a "roughness penalty method" as follows. Rather than estimate the probabilities $P_{ij}$, i = 1, 2, $j = 1, \ldots, K$, using the log-likelihoods
$\sum_{j=1}^{K} n_{ij}\ln P_{ij}, \quad \sum_{j=1}^{K} P_{ij} = 1, \quad i = 1, 2,$
the estimators are now defined as values that maximize the log-likelihoods modified by "roughness penalties"
$\sum_{j=1}^{K} n_{ij}\ln P_{ij} - \beta_i\sum_{j=1}^{K-1}\big[\ln(P_{ij}/P_{i,j+1})\big]^2,$
where $\beta_i > 0$ are smoothing parameters. The method yields smoothed probability estimators $p^*_{ij}$ and results in a nonparametric estimator of $\Lambda$ (since only smoothness of $f_X$ and $f_Y$ is assumed):
$\hat\Lambda_S = \sum_{i=1}^{K-1}\sum_{j=i+1}^{K} p^*_{1i}p^*_{2j} - \sum_{i=2}^{K}\sum_{j=1}^{i-1} p^*_{1i}p^*_{2j}.$ (6.59)
It is instructive to note that the four estimators (6.55), (6.56), (6.58) and (6.59) can be viewed as occupying a continuum from most parametric to least parametric as follows: $\hat\Lambda_{MLE} \to \hat\Lambda_{PML} \to \hat\Lambda_S \to \hat\Lambda_{WMW}$.
6.4.2
Confidence Intervals
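Confidence intervals based on asymptotic normality. Since the estimators $\hat\Lambda_{MLE}$, $\hat\Lambda_{PML}$, $\hat\Lambda_S$ and $\hat\Lambda_{WMW}$ are all asymptotically normal with the mean $\Lambda$, the $(1 - \gamma)$ asymptotic confidence interval for $\Lambda$ is of the form
$\big(\hat\Lambda - z_{\gamma/2}\sqrt{\widehat{\mathrm{Var}}(\hat\Lambda)},\ \hat\Lambda + z_{\gamma/2}\sqrt{\widehat{\mathrm{Var}}(\hat\Lambda)}\big).$
Estimation of $\mathrm{Var}(\hat\Lambda_{WMW})$ was initially carried out by Hochberg (1981) who shows that
$\mathrm{Var}(\hat\Lambda_{WMW}) = (n_1 n_2)^{-1}\big[P(X < Y) + P(X > Y) - (n_1 + n_2 - 1)\Lambda^2 + (n_1 - 1)(\pi_{XXY} + \pi_{YXX} - 2\pi_{XYX}) + (n_2 - 1)(\pi_{YYX} + \pi_{XYY} - 2\pi_{YXY})\big],$ (6.60)
where
$\pi_{XXY} = P(J(Y_i) > \max[J(X_j), J(X_l)]),$
$\pi_{YXX} = P(J(Y_i) < \min[J(X_j), J(X_l)]),$
$\pi_{XYX} = P(J(X_j) < J(Y_i) < J(X_l)),$
where, as above, J(a) is the group index of an observation with a value a and $X_j$, $X_l$ and $Y_i$ are original (not discretized) X and Y observations ($\pi_{YYX}$, $\pi_{XYY}$ and $\pi_{YXY}$ are similarly defined). The variance (6.60) can consistently be estimated from the estimators of P(X < Y) and P(Y < X) using the first and the second halves of formula (6.55) and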
=
[nw2(n2 - I)]
($\hat\pi_{YXX}$ and $\hat\pi_{XYX}$ are defined similarly). Edwardes (1995) provides an estimator of the variance (6.60) for more complex sample designs.
Confidence intervals based on pivotal quantities. The pivotal-based confidence interval for $R^* = P(X < Y) + \frac{1}{2}P(X = Y)$ has been suggested by Halperin et al. (1989) and is derived in the spirit of the technique discussed in Section 5.3.4. The value of $R^*$ is
$R^* = \sum_{i=1}^{K-1}\sum_{j=i+1}^{K} P_{1i}P_{2j} + \frac{1}{2}\sum_{j=1}^{K} P_{1j}P_{2j}$ (6.61)
and its estimator $\hat R^*$ is obtained by substituting $\hat p_{ij}$ in place of $P_{ij}$, i = 1, 2, $j = 1, \ldots, K$. A straightforward but somewhat tedious computation shows that $\hat R^*$ has variance
$V = \frac{R^* - (n_1 + n_2 - 1)(R^*)^2 + (n_2 - 1)A + (n_1 - 1)B - \frac{1}{4}\sum_{j=1}^{K} P_{1j}P_{2j}}{n_1 n_2},$ (6.62)
where
$A = \sum_{i=1}^{K} P_{1i}\Big(\sum_{j=i+1}^{K} P_{2j} + \tfrac{1}{2}P_{2i}\Big)^2, \quad B = \sum_{j=1}^{K} P_{2j}\Big(\sum_{i=1}^{j-1} P_{1i} + \tfrac{1}{2}P_{1j}\Big)^2.$ (6.63)
It is easy to verify that $(R^*)^2 \le A, B \le R^*$, so that the last three terms of $n_1 n_2 V$ are bounded below by $(n_1 + n_2 - 2)(R^*)^2 - 1/4$ and above by $(n_1 + n_2 - 2)R^*$. Thus, for some $\theta \in (0, 1)$
$(n_2 - 1)A + (n_1 - 1)B - \frac{1}{4}\sum_{j=1}^{K} P_{1j}P_{2j} = \theta[(n_1 + n_2 - 2)(R^*)^2 - 1/4] + (1 - \theta)(n_1 + n_2 - 2)R^* = (n_1 + n_2 - 2)\big[R^* - \theta R^*(1 - R^*) - (n_1 + n_2 - 2)^{-1}\theta/4\big].$ (6.64)
Since the term $(n_1 + n_2 - 2)^{-1}\theta/4$ is asymptotically negligible as $n_i \to \infty$, i = 1, 2, it is ignored and the modified version of V obtained from (6.62)-(6.64) will be
$V^* = (n_1 n_2)^{-1}[n_1 + n_2 - 1 - (n_1 + n_2 - 2)\theta]R^*(1 - R^*).$ (6.65)
To derive an estimator of $\theta$ we rely on equation (6.64) ignoring the asymptotically negligible terms $\sum_j P_{1j}P_{2j}$ and $[4(n_1 + n_2 - 2)]^{-1}\theta$, hence
$\theta = \frac{(n_1 + n_2 - 2)R^* - (n_2 - 1)A - (n_1 - 1)B}{(n_1 + n_2 - 2)R^*(1 - R^*)}.$ (6.66)
Let $\hat A$ and $\hat B$ be the estimators obtained by replacing $P_{ij}$ by $\hat p_{ij}$ in formulae (6.63). Tedious but straightforward computations show that A and B have unbiased estimators
(6.67) , K. Ignoring the terms of order where qij = 1 — p^-, i = 1,2, j = 1, O(l/(nin 2 )) an unbiased estimator of R*(l — R*) is given by
-rn-n2 (ni
+ 2)R* - nin2(R*)2 A B _ 1 ) ( n 2 _ i) + ^Zi + ^ Z T -
(6.68)
Substituting $\hat R^*$, the unbiased estimators (6.67) of A and B, and (6.68) into (6.66) results in a consistent estimator $\hat\theta_0$ of $\theta$. If $\hat\theta_0 < 0$ or $\hat\theta_0$ is indeterminate (since then $\hat A = \hat B = 0$), we define $\hat\theta = 0$; if $\hat\theta_0 > 1$, then $\hat\theta = 1$; otherwise, $\hat\theta = \hat\theta_0$.
We may now compute the pivotal quantity $(\hat R^* - R^*)^2/\hat V^*$ where $\hat V^*$ is obtained from expression (6.65) by substituting $\hat\theta$ for $\theta$. Denote $g = (n_1 + n_2 - 1) - (n_1 + n_2 - 2)\hat\theta$. The confidence interval is then obtained as the solution set in $R^*$ of the quadratic inequality
$n_1 n_2(\hat R^* - R^*)^2/[gR^*(1 - R^*)] \le \chi^2_{1-\gamma}(1)$
where $\chi^2_{1-\gamma}(1)$ as above is the $(1 - \gamma)$-quantile of the chi-squared distribution with one degree of freedom. Thus, the confidence limits are given explicitly by
$\frac{C + 2\hat R^* \pm \sqrt{C^2 + 4C\hat R^*(1 - \hat R^*)}}{2(C + 1)}$
with $C = g\chi^2_{1-\gamma}(1)/(n_1 n_2)$.
Confidence intervals derived by means of optimization techniques. Gastwirth and Krieger (1991) studied probabilistic upper and lower bounds on P(X < Y) when X and Y are not independent and probabilities $P_{ij}$, i = 1, 2, $j = 1, \ldots, K$, are available. Assuming that X and Y are bounded (and, thus, can be scaled to the [0,1] interval) they derived their bounds under various conditions on X and Y somewhat analogously to that of Section 5.5. For example, if X and Y have means $\mu_1$ and $\mu_2$, respectively, the lower bound for P(X < Y) is zero and the upper one is $1 + \mu_1 - \mu_2$. The above approach is somewhat related to the probability design method described in Section 5.5.
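The explicit confidence limits of the pivotal-quantity interval for $R^*$ described above are the two roots of a quadratic, which is easy to check numerically. The Python sketch below computes $\hat R^*$ from category counts and then the two limits; it takes the truncated estimate $\hat\theta$ as an input supplied by the user rather than computing it, and the function names are hypothetical. The chi-squared quantile with one degree of freedom is obtained as the square of a standard normal quantile.

```python
from math import sqrt
from statistics import NormalDist


def r_star_hat(counts_x, counts_y):
    """Frequency estimate of R* = P(X < Y) + P(X = Y) / 2 from ordered category counts."""
    n1, n2 = sum(counts_x), sum(counts_y)
    p1 = [c / n1 for c in counts_x]
    p2 = [c / n2 for c in counts_y]
    k = len(p1)
    less = sum(p1[i] * p2[j] for i in range(k) for j in range(i + 1, k))
    ties = sum(a * b for a, b in zip(p1, p2))
    return less + 0.5 * ties


def pivotal_interval(r_hat, n1, n2, theta_hat, gamma):
    """Roots of n1*n2*(r_hat - r)^2 = g * r * (1 - r) * chi2_{1-gamma}(1) in r."""
    g = (n1 + n2 - 1) - (n1 + n2 - 2) * theta_hat
    # (1 - gamma) quantile of chi-squared(1) equals the squared normal (1 - gamma/2) quantile
    chi2_q = NormalDist().inv_cdf(1 - gamma / 2) ** 2
    c = g * chi2_q / (n1 * n2)
    half_width = sqrt(c * c + 4 * c * r_hat * (1 - r_hat))
    lower = (c + 2 * r_hat - half_width) / (2 * (c + 1))
    upper = (c + 2 * r_hat + half_width) / (2 * (c + 1))
    return lower, upper
```

Unlike a symmetric normal-theory interval, the limits obtained in this way always stay inside [0, 1].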
6.5
Stochastic Processes Formulations of Stress-Strength Systems
The stochastic process formulations for strength-stress systems were developed by Basu and Ebrahimi (1983), Raghava Char et al. (1984), Bilikam (1985) and Aminzadeh (1999). It was Bilikam (1985) who justified the use of stochastic processes in reliability models. He writes: "The strength...is necessarily conditioned on the stress because the physical realization of strength is found only when stress is applied."
6.5.1
General Stochastic Systems
Basu and Ebrahimi (1983) and Bilikam (1985) consider a general formulation where both stress X and strength Y vary in time, i.e. X = X(t) and Y = Y(t), and the cdfs of X and Y depend on t via parameters $\theta_1(t)$ and $\theta_2(t)$, respectively. The reliability can then be assessed at any given instant of time t as R(t) = P(X(t) < Y(t)) (the probability of a successful operation at t) or at a period of time $(0, t_0)$ as
$R_{t_0} = P\Big\{\sup_{0 \le t \le t_0}[X(t) - Y(t)] < 0\Big\}$ (6.69)
(the probability of absence of a failure before the time $t = t_0$).
Bilikam (1985) obtains expressions for R(t) for various distributions. For example, if X(t) and Y(t) possess extreme value distributions with the cdfs $F_X(x) = 1 - \exp[-\exp(x)/\theta_1]$ and $F_Y(y) = 1 - \exp[-\exp(y)/\theta_2]$ with the parameters $\theta_j = \theta_j(t)$, j = 1, 2, being functions of t, the reliability is then the familiar expression $R(t) = \theta_2/(\theta_1 + \theta_2)$. Bilikam (1985) then considers various time-dependent models for $\theta_j$ such as $\ln\theta_j = a_j - b_j\ln t$.
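For the extreme value example with the log-linear time dependence $\ln\theta_j = a_j - b_j\ln t$, the instantaneous reliability can be traced over time with a few lines of Python. This is only a sketch of the formula $R(t) = \theta_2(t)/(\theta_1(t) + \theta_2(t))$; the function names and the parameter values used in the note below are illustrative assumptions, not values from Bilikam (1985).

```python
from math import exp, log


def theta(t, a, b):
    """Time-dependent parameter with ln(theta) = a - b * ln(t), for t > 0."""
    return exp(a - b * log(t))


def reliability(t, a1, b1, a2, b2):
    """R(t) = theta_2(t) / (theta_1(t) + theta_2(t)) for stress (1) and strength (2)."""
    th1 = theta(t, a1, b1)
    th2 = theta(t, a2, b2)
    return th2 / (th1 + th2)
```

With $a_1 = a_2$ and $b_1 = 0$, $b_2 = 1$ (the strength parameter decaying like 1/t while the stress parameter stays constant) the reliability equals 1/2 at t = 1 and decreases to 1/3 at t = 2.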
Basu and Ebrahimi (1983) studied (6.69) in the case when X(t) and Y(t) are Brownian motions or when the strength is decreasing while the stress remains fixed. Let X(t) and Y(t) be independent Brownian motions with the mean value functions $\mu_1 t$ and $\mu_2 t$ and covariance kernels $\sigma_1^2\min(s, t)$ and $\sigma_2^2\min(s, t)$, respectively. Then
$R_{t_0} = 1 - \int_0^{t_0}\frac{|z(0)|}{\sqrt{2\pi}\,\sigma t^{3/2}}\exp\left\{-\frac{(z(0) + \mu t)^2}{2\sigma^2 t}\right\}dt,$
where $\mu = \mu_2 - \mu_1$, $\sigma^2 = \sigma_1^2 + \sigma_2^2$ and $z(0) = y(0) - x(0)$. If the means of the Brownian motions X(t) and Y(t) are fixed ($EX(t) = \mu_1$ and $EY(t) = \mu_2$) but all the other assumptions above are still valid, then
$R_{t_0} = 2\Phi[(\mu_1 - \mu_2)/(\sigma^2 t_0)]$ (6.70)
when x(0) < y(0) and is zero otherwise. The formula (6.70) for $R_{t_0}$ fits very well within the framework of formulas for the cases when X and Y are normally distributed. In the case when the strength is decreasing while the stress is fixed, X(t) = X, the reliability is found to be (cf. (6.69))
$R_{t_0} = P\Big\{X - \inf_{0 \le t \le t_0} Y(t) < 0\Big\} = P(X < Y(t_0)) = \int F_X(x)\,dF_Y(x),$
where $F_Y$ is the cdf of $Y(t_0)$, in accordance with the general formula (2.6) introduced in Chapter 2.
6.5.2
Markov Models for System Reliability
Raghava Char et al. (1984) consider Markov models for system reliability with discrete time. They assume that the stresses (i.e. attacks) arrive at discrete points of time $1, 2, 3, \ldots$ and at any moment j, $j = 1, 2, \ldots$, an attack occurs with probability a, 0 < a < 1. Let $P_k$ be the probability that the unit survives the k-th attack given it has survived the previous (k - 1) attacks, $k \ge 1$, and define $X_k$ to be the number of attacks successfully withstood among the first k encounters provided that the item has not failed up to then. If it has already failed we shall describe this situation by the statement $X_k = F$, where F denotes "failed". Under these conditions, $\{X_k, k \ge 1\}$ is a random walk on $\{0, 1, 2, \ldots, F\}$, with F being an absorbing state. It is then easy to obtain that $P(X_{k+1} = i + 1 | X_k = i) = aP_{i+1}$, $P(X_{k+1} = i | X_k = i) = 1 - a$, $P(X_{k+1} = F | X_k = i) = a(1 - P_{i+1})$ and $P(X_{k+1} = F | X_k = F) = 1$. Let N(a) be the time to absorption of $X_k$: $N(a) = \inf[k : X_k = F]$. If $\lim_{k\to\infty}\prod_{j=1}^{k} P_j = 0$, Raghava Char et al. (1984) find the characteristic function of N(a).
They also show that aN(a) converges in distribution and identify the characteristic function of the limit. From this general result the authors derive the limiting distribution of aN(a) for various scenarios of system reliability. For example, if $P_k = p$, then the limiting distribution is exponential with the cdf $1 - \exp(-(1 - p)x)$, so that N(a) is approximately exponentially distributed with the mean $[(1 - p)a]^{-1}$ for small a. Raghava Char et al. (1984) also investigate the case of attacks on series and parallel systems.
6.5.3
Stochastic Time Series Models
Recently, Aminzadeh (1999) studied the prediction problem when $\mathbf{X} = (X_1, \ldots, X_{n_1})$ and $\mathbf{Y} = (Y_1, \ldots, Y_{n_2})$ represent observed values of two correlated time series $X_t$ and $Y_t$, respectively, and $X_{n_1+m_1}$ and $Y_{n_2+m_2}$ denote the values of $X_t$ and $Y_t$ at the future times $t = n_1 + m_1$ and $t = n_2 + m_2$, $m_1, m_2 = 1, 2, \ldots$. In practice one often is required to estimate $R = P(X_{n_1+m_1} < Y_{n_2+m_2})$ where $n_1 + m_1 = n_2 + m_2 = \tau$. Here R may describe the reliability at the moment $\tau$ where $X_t$ and $Y_t$ are the stress and strength at the instant t, or, when $X_t$ and $Y_t$ are prices of two stocks, R can be interpreted as the probability that $Y_t$ is doing better than $X_t$ at that particular moment $t = \tau$. Aminzadeh (1999) investigates stationary autoregressive (AR), moving average (MA) and autoregressive moving average (ARMA) models for time series $X_t$ and $Y_t$ as well as the case of nonstationary ARMA models. For example, if $X_t$ and $Y_t$ follow an AR model of order k, then
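$X_t - \mu_1 = \sum_{j=1}^{k}\theta_{1,j}(X_{t-j} - \mu_1) + \varepsilon_{1,t}, \quad Y_t - \mu_2 = \sum_{j=1}^{k}\theta_{2,j}(Y_{t-j} - \mu_2) + \varepsilon_{2,t},$
where $\theta_{i,j}$ are autoregressive parameters, $\mu_i$ are the means of $X_t$ and $Y_t$, respectively, and $\varepsilon_{i,t}$ are white noise processes with $E(\varepsilon_{i,t}) = 0$, $\mathrm{Cov}(\varepsilon_{i,t}, \varepsilon_{i,s}) = 0$ and $\mathrm{Var}(\varepsilon_{i,t}) = \sigma_i^2$, i = 1, 2, $t = 1, \ldots, k$. Since the assumption of independence is very restrictive in practice, we shall assume, following the author, that the joint probability distribution of $\varepsilon_{1,t}$ and $\varepsilon_{2,t}$ is a bivariate normal with $\mathrm{Cov}(\varepsilon_{1,t+s}, \varepsilon_{2,t+r}) = \rho\sigma_1\sigma_2$ for any s and r. Denoting $E(X_{m_1+n_1}|\mathbf{X}) = \xi_1(m_1)$, $E(Y_{m_2+n_2}|\mathbf{Y}) = \xi_2(m_2)$, $h(i, 0) = 0$, $h(i, 1) = \theta_{i,1}$, $\ldots$, $h(i, m) = \sum_{j=1}^{k}\theta_{i,j}h(i, m - j)$, i = 1, 2, and recalling that k is the order of the AR model under consideration, we obtain the familiar expression $R = \Phi(\zeta)$ where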
(M2 - MI) + E?=i E?=i ei,j(Zi(mi - 3) - IM)
Estimation of R requires estimation of the parameters involved in the expression for $\zeta$, which is a tedious but straightforward task.
6.6
Exercises
6.1. (Myhre and Saunders (1968b)). Using formulas (6.17) and (6.18), write an explicit expression for the asymptotic (1 — 7) confidence intervals , Yk)) and Rk,k = P(X < min(Yi, , Yk)). for Rhk = P(X < max(y1, 6.2. (Chandra and Owen (1975)). Derive the MLE of P{Xi+Vl{x > A) (0,- > 0, A > 0 is the common parameter). Use formula (6.20) directly or produce estimators by using monotone transformations Yj = ln(Xj/X). 6.5. (Ivshin (1998)) Derive the MLE and the UMVUE of R3 = P(X < Y < Z) when X, Y and Z are independent exponentially distributed random variables with the pdfs exp(-(z-0j) I(x > 6j, 6j unknown, j = 1,2,3.. 6.6. Using formula (6.20), derive the MLE of P{X < Y < Z) in the case when X, Y and Z, have independent gamma distributions with a common integer shape parameter m. 6.7. Let X, Y and Z be independent binomial random variables with parameters (rrij,pj), j = 1,2,3, respectively. Derive the MLE and the UMVUE of P(X
and B given by (6.67) and (6.67) is an unbiased estimator of V*. Construct a pivotal quantity (R* - R*)/^n1n2{V - (m + n2- l)(R*)2) ~ iV(0,l) and derive confidence bounds for R* similarly to (5.50).
Chapter 7
Applications and Examples
7.1
Applicability of the Stress-Strength Model
In previous chapters we have discussed in some detail the following two main topics: 1) Expressions for the probability P(X < Y) and its generalizations for various distributions of X and Y; 2) Expressions and properties of various estimators of P(X < Y) and its generalizations based on a random sample as well as other sampling procedures. We have seen that these topics often involve challenging calculations and are of great usefulness when viewed from probabilistic and statistical aspects. We have also attempted to describe in the earlier chapters the genesis and motivation for probabilities of the type P(X < Y) and their connection with the classical non-parametric tests of equality of two distribution functions based on the extensively used and popular Wilcoxon-Mann-Whitney statistic. It would seem that Birnbaum (1956) was perhaps one of the first researchers who dealt with the model P(X < Y) in the stress-strength context. It is worth quoting Birnbaum's "Illustration" as presented in his pioneering paper which may serve as the road-map of the research in the last 45 years. His ideas are in resonance with the observations by Bilikam (1985) which appeared in engineering literature some 30 years later: "An illustration. If structural components of a mechanism are mass produced, the strength at failure Y of each single component (equals stress at which this component will fail) may be considered a random variable.
The component is installed in an assembly and exposed to a stress which reaches its maximum value X, again a random variable. If Y < X, then the component will fail in use. In this situation, p = P(Y < X) is the probability that failure will occur because, due to a chance, a component with relatively low strength was paired off with a high stress. It clearly is of interest to estimate this probability, preferably from samples of X and Y alone, since installing the components in complete assemblies and trying them out under conditions of actual use may involve nearly prohibitive expense and effort. It also will be important to be able to estimate p without knowing the distribution of the strengths of the components, or of the stresses, or both." (Birnbaum (1956), p. 14). R.A. Johnson (1988) in his survey paper in Handbook of Statistics, Vol. 7, interprets R = P(X < Y) somewhat more liberally, as the probability that a unit in an operating environment performs satisfactorily when, as usual, X is the stress placed on the unit; specifically, X is taken to be the maximum value attained by a "critical stress". He points out an early application by Lloyd and Lipow (1962) where X represents the maximum chamber pressure generated by the ignition of a solid propellant in a rocket engine. (We shall return to rocket engine applications in the sequel on several occasions.) In a subsequent paper by Kececioglu (1972), X represents the torsion stress which is the most critical type of stress for a rotating steel shaft on a (by now obsolete) computer. The message of these examples is that, in practice, the stress variable X is usually difficult to model accurately due to lack of sufficient data, and therefore various models of P(X < Y) where X is assumed to have many different distributions discussed in preceding chapters seem to have more than a passing theoretical significance.
With regard to the strength variable, Johnson (1988), as many other Bayesians, advocates the expert opinion elicitation. The most prominent examples of applications of the P(X < Y) relationship in engineering and medicine as presented in Johnson's (1988) survey article are:
a) Rocket engines. Here X usually represents the maximal chamber pressure generated by ignition of a solid propellant while Y is assumed to be the strength of the rocket chamber so that P(X < Y) is simply the probability of successful firing of an engine. We shall indicate below further instances when this model is used.
b) Two-treatment comparisons. This is an old technique motivated by the close relation between Wilcoxon-type tests and the P(X < Y) models. Typically drug I is assigned to one group of subjects and drug II to the other. If X and Y represent the remission times with these two drugs, say, $X_1, X_2, \ldots, X_{n_1}$ and $Y_1, Y_2, \ldots, Y_{n_2}$, respectively, the main interest of the researcher is to estimate P(X < Y). Here the terminology "stress-strength" may not be appropriate, but the net result is evidently the same. Indeed the very first application of the P(X < Y) relationship, as explained in the Introduction and Chapter 1, originated from the classical Wilcoxon test already available in Wilcoxon's (1945) groundbreaking paper, and it deals with a two-treatment comparison. Wilcoxon provides results of the fly spray tests on two preparations in terms of percentage of mortality. He compares the percent killed in sample A versus the percent killed in sample B (each involving 8 observations), concluding by means of this test that sample B provides a lower percent; thus preparation B should be considered less effective. Another example involving paired comparisons, motivated by R.A. Fisher's experimental data on the differences in height between cross- and self-fertilized corn plants, determines the significance of these differences.
c) Response models. A certain unit - be it a receptor in a human eye or ear or any other organ (including sexual) - operates only if it is stimulated by a source of random magnitude Y and the stimulus exceeds a lower threshold specific for that unit. In this case P(unit functions) is equivalent to the familiar P(X < Y) - a stress-strength relationship.
d) Earthquake resistance. R. Mensing (1984) in his personal communication to R.A. Johnson (1988) provides the following example which captures many aspects of the problem at hand.
In a study of the risk of a nuclear power plant (or some other tall or spacious building) it is necessary to assess the ability of the steam or another generator to withstand the stresses due to the ground motion as a result of an earthquake. (The same applies to the ability of a very tall building to withstand the impact of a terrorist missile attack which - unfortunately - is now a very concrete rather than a hypothetical example). Since at the time (the year 1984) when this example was provided there were no data available concerning the strength of such generators and methods of their estimation, the author solicited the opinion of 5 experts who provided estimates of the 10-th, 50-th and 90-th percentiles of the relevant strength variable expressed as a peak acceleration in ft/sec². The values of the 90-th percentile provided by the experts range from 103 ft/sec² to 48 ft/sec². (Remarkably, the assessment of the 10-th percentile was proportionally more divergent - from 81 ft/sec² to 32 ft/sec²). Based on these data, R. Mensing advocates modeling the strength by the time-honored log-normal distribution and estimates its parameters by a weighted least squares procedure. He also utilizes the same distribution to model the distribution of the stress at the base of the steam generator with the mean value about 1/2 of that of the log-normal distribution representing strength. This leads to a rather optimistic and encouraging estimator of $P(\ln X < \ln Y) \approx \Phi(3.52) = 0.99978$. R.A. Johnson (1988) also points out that this type of model - perhaps using the multivariate normal distribution - can be extended to several components characterizing strength and the common stress due to an earthquake which can also be visualized as a multivariate normal variable, leading to estimation of something like P(X < Y), X and Y both being multivariate normal variables.
It was evident from the discussions above that the concept of "stress-strength relations" is quite basic and natural, reflecting sound relationships among various real-world phenomena, and one would expect to have an avalanche of applications discussed in the literature. To our surprise and some disappointment - after an extensive literature search - we have located only some 25 papers (out of an overall bibliography of more than 300 items) in which explicit applications are provided and often only in a sketchy form. It should in all honesty be pointed out that there are quite likely numerous reports and papers of a semi-classified and classified nature on this subject not available to a general audience (see e.g. Section 7.2.3).
We think that a reason for such a discrepancy is that the original research on the stress-strength problem was carried out in the USA, USSR, Canada and India by mainly theoretically oriented statisticians whose interest in applications may have been somewhat marginal. A vast number of sources in our possession usually pay only lip service to the paramount importance of relations of the type $P(X < Y)$ in reliability theory and then immediately proceed to the "main business" of deriving some ingenious expressions for $P(X < Y)$ and its estimators under various assumptions on $X$ and $Y$. Only relatively few authors (that are known to us) have taken advantage of the enormous applications-oriented potential hidden in this type of probabilistic-statistical problem.
However, even the small collection of papers dealing with practical applications - which we were able to uncover - serves as a strong indication of the versatility of the stress-strength relationship and - as we shall see below - its relevance in various sciences (not necessarily limited to engineering and medicine).

7.2 Engineering and Military Applications of the Stress-Strength Model

7.2.1 The Rocket Motor Case Example
The strength of the rocket motor case versus the operating pressure - for reasons which are not clear to us - attracted substantial attention of statisticians working on stress-strength models. The four authors of the paper by Guttman et al. (1988) are deliberately vague about its specific origin and curtly assert that the application to be described below was "brought to our attention by scientists". We can only naively speculate that it has perhaps been related to the exploration of outer space technology which was quite popular in the eighties of the past century (especially in the USA and USSR). Another - more sinister - possibility is, of course, that we are dealing with military applications in the period when the Star Wars program was launched. This may perhaps explain the incomplete data provided by Guttman et al. (1988), who are hiding behind the confidentiality cloak. In the paper under review, $Y$ denotes the strength of the rocket motor case and $X$ the operating pressure (which is the stress that the motor must withstand). The reliability of the motor case, denoted by $R(z)$, depends on the ambient temperature $Z$, so the authors assume a model involving explanatory variables and normally distributed error terms for the stress and the strength, described in detail in Section 6.3.1. Specifically, in this example, (6.26) becomes

$$X = z\beta_1 + \varepsilon_1, \qquad Y = z\beta_2 + \varepsilon_2,$$

where $z = (1, Z)$ and $\varepsilon_i \sim N(0, \sigma_i^2)$, $i = 1, 2$, are independent. Table 7.1 presents a portion of the data in Table 1 provided by Guttman et al. (1988). The pressure values are rounded up to the third decimal place. (In the original paper the authors cite 51 values of $(X, Z)$ and 17 values of $Y$.)

Table 7.1 A portion of the data presented by Guttman et al. (1988)

Z, temperature (°C):               -39    -39    -39    -21     59     59     59
X, operating pressure (psi):      5.89   5.85   6.03   7.32   7.74   7.57   7.91

Y, chamber burst strength (psi): 15.30  16.75  16.00  17.50
The authors study this example under the assumptions of equal and non-equal variances. If variances are equal (defined in (6.37), (6.41) and (6.42), respectively. Namely,
for the rocket motor case example, 0.999999 using the summary statistics $\bar X$, $\bar Y$, $s_1^2$ and $s_2^2$ and conclude that the observed data provide strong evidence in favor of $H_1$, the p-value being equal to 0.0000042. Nandi and Aich (1996b) return to the Weerahandi and Johnson (1992) data, slightly modifying the assumptions on the means $\mu_1$ and $\mu_2$, which somewhat simplifies the computations and leads to a shorter 95% HPD credible interval than the corresponding frequentist interval reported by Weerahandi and Johnson (1992).

7.2.2 Comparison of Two Treatments in Engineering Setting
Comparing strength of two types of steel. In a preceding section we have described the rocket motor case application. In the same paper, Guttman et al. (1988) analyze the data presented in Duncan's (1986) classical text dealing with the results of measuring shear strength of spot welds for two different gauges of steel (a typical two-treatment problem). A natural explanatory variable $Z$ is the weld diameter (measured in units of 0.001 inches). Here $X$ and $Y$ represent the strength of two types of steel, the first corresponding to .040" and the second to .064". Evidently we are concerned next with estimating $R = P(X < Y \mid Z = z)$. The authors claim - referring to Duncan's work of 1974 (not cited in their paper) - that the strengths depend linearly on weld diameter (presumably as a first approximation). They utilize the model (6.26) with $\varepsilon_i \sim N(0, \sigma_i^2)$, $z_i = (1, Z)$ and $k_i = 2$, $i = 1, 2$, and also initially assume that the variances $\sigma_i^2$ are equal (which turns out to be an incorrect assumption). The data is presented in Table 7.2.

Table 7.2 Data on Shear Strengths of Two Gauges of Steel
    .040" gauge          .064" gauge
     X       Z            Y       Z
    350     140          680     190
    380     155          800     200
    385     160          780     209
    450     165          885     215
    465     175          975     215
    485     165        1,025     215
    535     195        1,100     230
    555     185        1,030     250
    590     195        1,175     265
    605     210        1,300     250
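The least-squares fits and plug-in estimates reported for Table 7.2 can be checked with a short script. The sketch below is only an illustration: formulas (6.28) and (6.32) are not reproduced in this section, so it assumes the plug-in form $\hat R = \Phi\big(z(\hat\beta_2 - \hat\beta_1)/\sqrt{\hat\sigma_1^2 + \hat\sigma_2^2}\big)$, with a pooled residual variance on $n_1 + n_2 - 4$ degrees of freedom in the equal-variance case. The helper names `fit_line` and `phi` are ours.

```python
from math import erf, sqrt

def phi(t):
    """Standard normal cdf."""
    return 0.5 * (1.0 + erf(t / sqrt(2.0)))

def fit_line(z, resp):
    """Least-squares fit resp = b0 + b1 * z; returns (b0, b1, residual SS)."""
    n = len(z)
    z_bar, r_bar = sum(z) / n, sum(resp) / n
    s_zz = sum((zi - z_bar) ** 2 for zi in z)
    s_zr = sum((zi - z_bar) * (ri - r_bar) for zi, ri in zip(z, resp))
    b1 = s_zr / s_zz
    b0 = r_bar - b1 * z_bar
    sse = sum((ri - b0 - b1 * zi) ** 2 for zi, ri in zip(z, resp))
    return b0, b1, sse

# Table 7.2: shear strength and weld diameter for the two gauges.
x = [350, 380, 385, 450, 465, 485, 535, 555, 590, 605]
z_x = [140, 155, 160, 165, 175, 165, 195, 185, 195, 210]
y = [680, 800, 780, 885, 975, 1025, 1100, 1030, 1175, 1300]
z_y = [190, 200, 209, 215, 215, 215, 230, 250, 265, 250]

b0_x, b1_x, sse_x = fit_line(z_x, x)
b0_y, b1_y, sse_y = fit_line(z_y, y)

# Separate residual variance estimates (n - 2 degrees of freedom each).
s2_x = sse_x / (len(x) - 2)
s2_y = sse_y / (len(y) - 2)

# Difference of the fitted mean strengths at weld diameter Z = 200.
z0 = 200.0
diff = (b0_y + b1_y * z0) - (b0_x + b1_x * z0)
df = len(x) + len(y) - 4

# Assumed equal variances: pooled estimate, Var(Y - X) = 2 * sigma^2.
pooled = (sse_x + sse_y) / df
arg_equal = diff / sqrt(2.0 * pooled)
r_equal = phi(arg_equal)

# Assumed known ratio sigma_2^2 / sigma_1^2 = 10: Var(Y - X) = 11 * sigma_1^2.
ratio = 10.0
s2_1 = (sse_x + sse_y / ratio) / df
arg_ratio = diff / sqrt((1.0 + ratio) * s2_1)
r_ratio = phi(arg_ratio)

print(f"beta_1 = ({b0_x:.2f}, {b1_x:.2f}), beta_2 = ({b0_y:.2f}, {b1_y:.2f})")
print(f"S1^2 = {s2_x:.2f}, S2^2 = {s2_y:.2f}")
print(f"equal variances: Phi({arg_equal:.3f}) = {r_equal:.4f}")
print(f"ratio 10:        Phi({arg_ratio:.3f}) = {r_ratio:.4f}")
```

Under these assumptions the fitted coefficients and residual variances agree with the values quoted for this data set, and the two arguments of $\Phi$ come out at roughly 2.19 and 2.25.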
For this data ($n_1 = n_2 = 10$) we estimate $\hat\beta_1' = (-216.33, 3.99)$, $\hat\beta_2' = (-569.47, 6.90)$ and for $Z = 200$ we evidently have in this set-up the two-dimensional row vectors $z_1 = z_2 = (1, 200)$, which (according to formulas (6.28) and (6.32)) yields $\hat R = \Phi(2.193) = 0.986$ (namely, for $Z = 200$ the gauge $Y$ is very likely to possess a stronger shear than the gauge $X$). Again this conclusion is evident from the raw data (small sample size notwithstanding), especially if we note that the diameters of gauge $Y$ are uniformly larger than those of $X$. As was alluded to above, the trouble with this example is that the estimators of the variances are $S_1^2 = 876.20$ and $S_2^2 = 9{,}980.16$, respectively, which seems to invalidate the assumption $\sigma_1^2 = \sigma_2^2$. The authors revise their calculations by assuming that $\sigma_2^2/\sigma_1^2 = 10$. This leads to a similar conclusion $\hat R = \Phi(2.250) = 0.9878$ for $Z = 200$ and the approximate lower bound $\Phi(0.8334) = 0.7977$. When no assumptions are imposed on the ratio of the residual variances, the approximate lower confidence bound for $R$ for $Z = 200$ (using the same model) is found to be $\Phi(0.6060) = 0.7277$, which is lower than the other two bounds cited above (when the assumptions $\sigma_2^2/\sigma_1^2 = 1$ and $\sigma_2^2/\sigma_1^2 = 10$, respectively, are utilized).

The authors emphasize that their procedure used for analyzing Duncan's (1986) data assumes normality and a linear dependence on the explanatory variables. These assumptions could and should be checked using, for example, the residual-analysis techniques prevalent in the statistical literature. The value of $R = P(X < Y)$ depends heavily on the tails of the distributions involved, thus the confidence bound will possibly be sensitive to departures from normality. The authors conclude with a recommendation to extend their theory (involving explanatory variables) to non-normal distributions and to check its applicability by means of real-world data. To the best of our knowledge this has not yet been carried out.

Carbon fiber strength example. In two more recent papers Surles and Padgett (1998), (2000) deal with inference on $P(X < Y)$ in the Burr type X model with the cdfs $F_1(x \mid \theta) = (1 - e^{-x^2})^{\theta}$, $x, \theta > 0$ (compare with (3.28)) or $F_2(x \mid \theta, \sigma) = (1 - e^{-(x/\sigma)^2})^{\theta}$, $x, \theta, \sigma > 0$ (the scaled version). In both papers the authors provide an application to carbon fiber strength data collected by Bader and Priest (1982). Tensile strength data (in GPa) for single carbon fibers and "impregnated 1000-carbon fiber tows" were obtained under tension at gauge lengths of 1, 10, 20 and 50mm (single fibers) and lengths of 20, 50, 150 and 300mm (impregnated tows of 1000 fibers). Earlier Durham and Padgett (1990) fitted a Weibull distribution to this failure data with an unsatisfactory result.
The Burr type X model seems to be more appropriate and the two-parameter model - not surprisingly - provides an even better fit. For the inference on $R = P(X < Y)$, where $X$ represents the strength of 20mm fiber and $Y$ the strength of 50mm fiber (the terminology "stress-strength" may not be quite applicable in this example), Surles and Padgett (1998) calculate an MLE of $R$ to be 0.57284 with sample sizes $n_1 = 69$ (20mm single fibers) and $n_2 = 65$ (50mm single fibers). Their conclusion (again not very surprisingly!) is that a longer fiber is weaker than a shorter one. They use some formal tests to reach this conclusion. A more ambitious analysis utilizing the two-parameter Burr type X distribution (with different scale parameters) resulted in an MLE $\hat R = 0.616592$, and various formal tests once more confirm (with a higher confidence) the conclusion reached when using the one-parameter Burr X model.

Comparison of motorettes insulation. Gupta et al. (1999) concentrate on estimation of $P(X < Y)$ in the normal case with a common coefficient of variation $\gamma = \sigma/\mu$ and exemplify their theoretical results by means of the data taken from W. Nelson's (1990) well-known text. It represents the hours to failure of 20 motorettes with a new class-H insulation run at 240°C and 220°C. Nelson (1990) claims that the lognormal distribution fits the data adequately for both temperature regimes. The variables $X$ and $Y$ are here the logarithms of the failure times and we have 10 observations for each temperature. The null hypothesis $H_0: \sigma_1/\mu_1 = \sigma_2/\mu_2 = \gamma$ was not rejected using a score test of Gupta and Ma (1996) since the p-value was close to 1. The authors present a confidence interval on $R = P(X < Y)$ taking into consideration the equality of the coefficients of variation and compare it with the confidence interval on $R$ (for the same data) proposed by Reiser and Guttman (1986) without the assumption of a common coefficient of variation. The interval based on the method of Gupta et al. (1999) for the data described above is of length 0.101, while the Reiser and Guttman procedure yields a somewhat wider interval of length 0.124.
Independent simulation results based on 10,000 observations with parameters - the mean values - $\mu_1(\mu_2) = 0.5(0.8)$ and various $\gamma \in [0.2, 2]$, for sample sizes $n_1(n_2) = 30(32)$, $\alpha = 0.05$ and $\alpha = 0.10$, and for $n_1(n_2) = 20(25)$ using the same $\alpha$'s, show that taking into account the equality of coefficients of variation may reduce the length of the confidence interval two-fold. Robustness of the Gupta et al. (1999) methodology has not been addressed as yet. In this application - as in many others - we do not have, strictly speaking, a stress-strength comparison and the problem is closer to an equivalent problem of testing equality of two (log)normal distributions.

Reliability of cables and piping. In a rather obscure Finnish technical report (dated 1977) T. Mankamo investigates the common cause failures (CCF) problem, emphasizing the case of dependent failures. The failure condition of a structural item is determined as follows. Let $N$ identical items (components) be loaded by a common stress. The common stress is treated as a random variable with the pdf $f_X(x)$. Each item has an identical resistance to stress, which is also treated as a random variable with the pdf $f_Y(x)$ and the cdf $F_Y(x)$. Then the failure condition is expressed by the familiar $X > Y$, i.e. the stress $X$ exceeds the structural resistance, and the probability of failure $P(X > Y)$ is given by the expression $\int_{-\infty}^{\infty} F_Y(z) f_X(z)\,dz$ (compare with (2.6)). The author deals with the normal and lognormal cases and the well-known model of failures out of $N$, apparently being unaware of the voluminous literature existing in this area by 1977. (This is by no means the only example of the lack of coordination between researchers in various countries.) The author refers in a footnote to a very similar study conducted by A.D.S. Carter as presented at the NCRS symposium in Bradford, UK, in February 1977. He also mentions the following potential applications:

1. reliability of pre-stressing cables of a prestressed concrete pressure vessel during an overpressure accident;
2. multiple breaks of piston casing in the BWR control rod insertion mechanism under scram conditions;
3. integrity of standby injection piping under pressure build-up when initiating the system (parallel loops may be loaded by a common counter pressure). This case is discussed in some detail.
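The failure probability described for Mankamo's common-stress set-up can be illustrated numerically. The following is a minimal sketch, not Mankamo's own computation: the parameter values are hypothetical, the normal case is used because a closed form is available for comparison, and the "k out of N" function assumes (our assumption) that the item resistances are independent of each other and of the stress, so that items fail independently only conditionally on the common stress.

```python
from math import comb, sqrt
from statistics import NormalDist

def integrate(func, lo, hi, steps=4000):
    """Composite trapezoidal rule for func on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (func(lo) + func(hi))
    for i in range(1, steps):
        total += func(lo + i * h)
    return total * h

# Hypothetical parameters, chosen only for illustration.
stress = NormalDist(mu=100.0, sigma=15.0)      # common stress X
resistance = NormalDist(mu=160.0, sigma=20.0)  # resistance Y of one item

# Integration limits covering essentially all of the stress distribution.
LO = stress.mean - 10.0 * stress.stdev
HI = stress.mean + 10.0 * stress.stdev

def p_fail_single():
    """P(X > Y) computed as the integral of F_Y(z) f_X(z) dz."""
    return integrate(lambda z: resistance.cdf(z) * stress.pdf(z), LO, HI)

def p_fail_closed_form():
    """Closed form of P(X > Y) for independent normal X and Y."""
    spread = sqrt(stress.variance + resistance.variance)
    return NormalDist().cdf((stress.mean - resistance.mean) / spread)

def p_k_of_n_fail(k, n):
    """P(exactly k of n items fail) under one common stress (assumed model)."""
    def integrand(z):
        q = resistance.cdf(z)  # failure probability of one item at stress z
        return comb(n, k) * q ** k * (1.0 - q) ** (n - k) * stress.pdf(z)
    return integrate(integrand, LO, HI)

single = p_fail_single()
closed = p_fail_closed_form()
all_three = p_k_of_n_fail(3, 3)

print(f"P(X > Y): integral {single:.6f}, closed form {closed:.6f}")
print(f"P(all 3 of 3 fail) = {all_three:.3e}, independent value = {single ** 3:.3e}")
```

Because all items see the same stress, the probability that every item fails exceeds the product of the single-item failure probabilities, which is the dependence effect that the common cause failure literature is concerned with.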
7.2.3 Military Applications
Military applications are another obvious and challenging area in which numerous scenarios fit very well into the framework where reliability is appropriately defined as the probability that system "strength" exceeds in-use "stress". There are no doubt numerous classified (confidential, FYEO, etc.) military-oriented papers in English (and, we dare say, also in Russian and other languages) in which the stress-strength relations are utilized and which are unavailable to us. However, one can taste the flavor of these investigations from the paper by M. A. Johnstone (of the US Military Academy) published in a 1983 issue of the Journal of the Washington Academy of Sciences. After developing an appropriate theory based on the Bayesian approach, the author does not mince words and presents an example in which the reliability of an anti-tank sabot round fired against a Soviet T-62 tank is defined as the "probability that a given round will penetrate its target". The ranges at which the round will be fired in battle represent the stress distribution (assumed to be normal with $\mu = 1600$ meters and $\sigma = 100$ meters). The strength distribution represents a distribution of ranges at which a given round just penetrates the target. The reliability is defined in the natural manner: the probability that the strength exceeds the in-use stress.

The author utilizes quantal response data: a sample of identical test specimens is tested at a number of stimulus levels and one observes whether the response occurs, namely whether the applied stimulus exceeds the critical level of stimulus associated with the test specimen. In the stress-strength context a response is observed when a specimen fails, i.e. strength is less than stress. For the problem at hand, Johnstone (1983) uses a particular test strategy for selecting the stimulus, or stress, level(s) applied to the test specimens: the so-called Churchman two-stimuli designs are used to generate the data. These designs involve testing two samples of test specimens: one of size $n_1$ at stress level $Y_1$ and the other of size $n_2$ at level $Y_2$. Let the observed numbers of failures be $m_1$ and $m_2$, respectively. Stress levels $Y_1$ and $Y_2$ are chosen so as to satisfy the inequalities $0 < m_1/n_1 < m_2/n_2 < 1$.
Observe that in our offensive against a hypothetical (and by now non-existent) Soviet T-62 tank, to complete the mission we must fully penetrate the armor. For each round there exists a critical range at which one will not be able to complete this task. The population of rounds is considered by Johnstone (1983) to have a strength distribution of the corresponding critical ranges. In his example, level $Y_1$ corresponds to the distance of 50 meters from an armor plate similar to the armor of the tank we wish to destroy; $n_1$ rounds are fired at this distance and the number of non-penetrations $m_1$ is recorded. If the proportion of non-penetrations exceeds 50%, the rounds will then be fired from a distance twice as close to the plate (level $Y_2$) and the number of non-penetrations $m_2$ is recorded. If, however, the proportion of non-penetrations at level $Y_1$ is less than 50%, the rounds are then fired from a distance "one half further" from the plate (alternative level $Y_2$) and, as above, the corresponding $n_2$ and $m_2$ are recorded. The goal is to ensure that the two probabilities of non-penetration at levels $Y_1$ and $Y_2$ differ by at least 20%. This data is then used to estimate the parameters of the strength distribution of the critical ranges for the sabot round. The author dismisses the classical approach to estimation of parameters since (in his opinion) it does not allow us to measure the uncertainty of the reliability, which he defines, as usual, as

$$R = P(X > Y) = P(X - Y > 0) = P(Z > 0),$$

where $Z = X - Y$. He calculates $R$ using the expression

$$R = \int_0^{\infty} n(z \mid \mu_S - \mu_E,\ \sigma_S^2 + \sigma_E^2)\,dz,$$

where $n(z \mid \mu, \sigma^2)$ is the pdf of the normal distribution with mean $\mu$ and variance $\sigma^2$ and "for the purpose of this paper" $\mu_E$ and $\sigma_E$ are assumed to be given values. The Churchman type data described above yields observed values of the binomial random variable $b(m_1 \mid n_1, p_1)$ where $p_1$ is estimated by

$$\int_{-\infty}^{Y_1} n(x \mid \mu_S, \sigma_S^2)\,dx = m_1/n_1 \qquad (7.1)$$

(at stress level $Y_1$). Analogously, at stress level $Y_2$ we estimate $p_2$, which results in an equation similar to (7.1). Solving the two equations simultaneously, we obtain the estimators $\hat\mu_S$ and $\hat\sigma_S^2$, allowing us to estimate $R$ by

$$\hat R = \int_0^{\infty} n(z \mid \hat\mu_S - \mu_E,\ \hat\sigma_S^2 + \sigma_E^2)\,dz.$$
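The classical plug-in calculation just described reduces to two probit equations and one normal cdf evaluation. The sketch below illustrates it under stated assumptions: the stress levels and failure counts are hypothetical (the paper's counts are not reported here), only $\mu_E = 1600$ and $\sigma_E = 100$ are taken from the example, and the function names are ours. It does not implement Johnstone's Bayesian procedure.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def fit_strength(y1, p1, y2, p2):
    """Solve Phi((y_i - mu_S) / sigma_S) = p_i, i = 1, 2, for (mu_S, sigma_S).

    Requires y1 < y2 and 0 < p1 < p2 < 1, as in the Churchman design.
    """
    q1, q2 = STD_NORMAL.inv_cdf(p1), STD_NORMAL.inv_cdf(p2)
    sigma_s = (y2 - y1) / (q2 - q1)
    mu_s = y1 - sigma_s * q1
    return mu_s, sigma_s

def reliability(mu_s, sigma_s, mu_e, sigma_e):
    """P(strength - stress > 0) for independent normal strength and stress."""
    return STD_NORMAL.cdf((mu_s - mu_e) / sqrt(sigma_s ** 2 + sigma_e ** 2))

# Hypothetical two-stimuli data: (stress level in meters, failures, sample size).
y1, m1, n1 = 1500.0, 3, 20
y2, m2, n2 = 2000.0, 12, 20

# Stress distribution quoted in the example: normal, mean 1600 m, sd 100 m.
mu_e, sigma_e = 1600.0, 100.0

mu_s, sigma_s = fit_strength(y1, m1 / n1, y2, m2 / n2)
r_hat = reliability(mu_s, sigma_s, mu_e, sigma_e)

print(f"mu_S = {mu_s:.1f} m, sigma_S = {sigma_s:.1f} m, R = {r_hat:.4f}")
```

The closed form used in `reliability` is simply the integral of the normal pdf with mean $\hat\mu_S - \mu_E$ and variance $\hat\sigma_S^2 + \sigma_E^2$ over $(0, \infty)$.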
Johnstone (1983) then proceeds in detail to estimate $R$ using the Bayesian approach with uniform priors on $p_1$ and $p_2$ - the unknown probabilities of failure at $Y_i$, $i = 1, 2$. From the observations the joint posterior distribution for $p_1$ and $p_2$ is developed and the conditional marginal posterior distributions of $p_1$ and $p_2$ (given $p_1 < p_2$) are used to determine the posterior distribution for reliability. This approach is straightforward for readers who have absorbed the theory presented in Section 2.3, but the application is a daring one. The author suggests applying this approach to data originating from other experimental designs, but we have not been able to locate any later literature citations. He also recommends developing a methodology to update reliability parameters via the actual reliability results of fielded systems.

7.3 Applications in Medicine and Psychology

7.3.1 Applications Based on Numerical Data
Turning now to medical-pharmaceutical applications (more precisely, comparison of efficiencies of two or more drugs), out of the multitude of examples provided in recent years in numerous medically oriented statistical journals (whose number mushrooms almost monthly), we shall analyze two papers which use the model $R = P(A'X > B'Y)$ in the multivariate normal case. The point estimation procedure for $P(A'X > B'Y)$ is discussed in Section 3.5. Here $X$ and $Y$ are random vectors (not necessarily independent) of dimensions $k_1$ and $k_2$, respectively, possessing a multivariate normal distribution, and $A$ and $B$ are two known vectors. Gupta and Gupta (1990) pointed out that the problem arises in a system to which energy is supplied by $k_1$ sources and is consumed via $k_2$ sources. Their example, however, deals with the well-known data of Morrison (1976) in his popular text on multivariate analysis. The data depicts changes in the level of three biochemical components found in the brains of 24 mice of the same strain randomly divided into two groups, with the second group receiving periodic administration of a certain drug. Both samples received the same diet and care (although two mice in the first (control) group died of natural causes). It would seem that the measurements on the two groups should be considered independent (as pointed out in the follow-up paper by Reiser and Faraggi (1994)) while Gupta and Gupta (1990) treat them as dependent. Both papers assume multivariate normality with $A = (1/k_1, \ldots, 1/k_1)'$ and $B = (1/k_2, \ldots, 1/k_2)'$, essentially estimating $R = P(\sum_i X^{(i)} < \sum_i Y^{(i)})$, where $Y^{(i)}$, $i = 1, 2, 3$, are the amounts of the three biochemical components in micrograms per gram of the brain tissue of the mice which did not receive the drug and, similarly, $X^{(i)}$, $i = 1, 2, 3$, are the corresponding amounts for the drugged mice. The Gupta and Gupta (1990) estimators of $R$ are $\hat R = 0.7324$ (the MLE) and $\tilde R = 0.7171$ (the UMVUE).
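For reference, the quantity $P(A'X > B'Y)$ has a closed form under the normality assumption. The display below is a sketch in our own notation ($\mu_X$, $\mu_Y$, $\Sigma_{XX}$, $\Sigma_{YY}$ and $\Sigma_{XY}$ for the mean vectors, covariance matrices and cross-covariance matrix are not symbols taken from the papers under review): since $A'X - B'Y$ is a linear combination of jointly normal variables, it is itself normal.

```latex
W = A'X - B'Y \sim N\!\left(A'\mu_X - B'\mu_Y,\;
      A'\Sigma_{XX}A + B'\Sigma_{YY}B - 2A'\Sigma_{XY}B\right),
\qquad
R = P(W > 0)
  = \Phi\!\left(\frac{A'\mu_X - B'\mu_Y}
      {\sqrt{A'\Sigma_{XX}A + B'\Sigma_{YY}B - 2A'\Sigma_{XY}B}}\right).
```

When the two groups are treated as independent, $\Sigma_{XY} = 0$ and the cross term drops out; treating them as dependent keeps it, which is the point on which the two papers differ.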
Based on simulation results (unfortunately details are not provided) the authors conclude that these values are "approximately" two standard deviations above 0.5 and thus conclude that the drug has an effect on the level of biochemical components in the brain. The same conclusion is reached by Morrison (1976) using standard multivariate tests, and further analysis may be necessary to convince practitioners that the stress-strength method is advantageous for these applications. Reiser and Faraggi (1994), in a follow-up paper, challenge the conclusions of Gupta and Gupta (1990) by pointing out that the 95% lower confidence bounds are 0.4736 (using the "exact" method) and 0.4782 (using an approximation of non-central t-distributions by means of a standard normal cdf). Evidently the point estimator of $R$, whose range is $0 < R < 1$, does not tend to normality (unless the true value is in the vicinity of 0.5) and Gupta and Gupta's final argument may perhaps be misleading. Indeed, Reiser and Faraggi (1994) obtain an approximate lower confidence bound of 0.32 on $R = P(\sum_i X^{(i)} < \sum_i Y^{(i)})$ for Morrison's data assuming independence and conclude that the assertion that the true value of $R > 0.5$ is unwarranted (admitting that the sample sizes are rather small). Perhaps after all the stress-strength approach does provide some additional insight!

An earlier investigation along these lines is due to Ury and Wiggins (1979), who use the $P(X < Y)$ approach to compare lung function tests of smokers and nonsmokers based on the ratio of forced expiration volume in one second to forced vital capacity. Their tool is based on the upper bound for the variance of an estimator $\hat R$ of $R = P(X < Y)$ when the distributions of $X$ and $Y$ are unknown but assumed to be continuous, symmetric and differing at most by a shift parameter. The theory was discussed in Sections 5.2 and 5.3. Ury and Wiggins (1979) provide an upper bound on the variance of $\hat R$ of the form $(n_1 + n_2)^2[17(n_1 + n_2)^2 - 40(n_1 + n_2) + 24]$ [...] $(n_2 - 1)^3$, where $n_1$ and $n_2$ are sample sizes. The authors do not provide the source of their data and present only the number of cases (292) in which the smoker ratios ($X_i$) exceed the nonsmoker ratios ($Y_i$) out of 400 comparisons.
Calculations yield that the 90% confidence interval on the value of $P(X < Y)$ is (0, 0.57) and the 75% interval is (0.08, 0.460), ignoring ties.

Akman et al. (1998) return us to data of the type of Morrison's (1976), utilized by Gupta and Gupta (1990) and Reiser and Faraggi (1994), comparing a control group of animals with five groups of animals (guinea pigs in their case) injected with different doses of tubercle bacilli. This is real-world data as reported by T. Bjerkedal (1960). The control group consists of 107 animals while the five injected groups contain 72 guinea pigs each. The data provides survival of the animals after a 2-year period, and the dosages were expressed as the logarithm of the number of bacillary units in 0.5 ml of "challenge" solution. The regimen (logarithm of the number of bacillary units) was restricted to 4.3 and 6.6, yielding two data sets. Here the $P(X < Y)$ model was applied when $X$ and $Y$ are assumed to have a mixed inverse Gaussian (MIG) distribution of the form

$$f_\theta(x) = (1 - \theta) f(x) + \theta g(x), \quad 0 < \theta < 1,$$

where $f(x)$ is the pdf of the inverse Gaussian distribution and

$$g(x) = \mu^{-1} x f(x), \qquad \mu = \int_{-\infty}^{\infty} x f(x)\,dx$$

(the so-called length biased inverse Gaussian (LBIG) pdf). The authors tested the appropriateness of their model by means of the Kolmogorov-Smirnov test, which did not reject the MIG fit. The degree of applicability of the MIG distribution to the guinea pig data is, however, not clear. For both data sets the parameters $\mu$, $\lambda$ and $\theta$ were estimated using MLE (the authors do not provide any information about the estimates of the parameter $\nu$). The value of $R = P(X < Y)$ was estimated using the bootstrap and jackknife methodology (described in Section 4.5), resulting in $\hat R_B = 0.7407$ and $\hat R_J = 0.7402$. The authors also construct a standard univariate kernel density estimate of the density of $T = X - Y$ using the normal kernel (and another estimate based on all possible differences $T_{ij} = X_i - Y_j$). The bandwidths were chosen to be $h = 32.72$ and $h = 27.00$, respectively. The results are of interest due to their robustness, yielding the estimates of $R$ of 0.7371, 0.7279 and 0.7078 for the mixture of IG, the LBIG and the original IG model, respectively.

7.3.2 Applications Based on Categorized Data
All the data utilized in the previous applications can be characterized as continuous data treated mostly by a parametric approach (with a heavy emphasis on the normality assumptions). We were estimating $R = P(X < Y)$ given samples of sizes $n_1$ and $n_2$ from $X$ and $Y$, respectively. However, in medical and pharmaceutical applications as well as in psychology this type of data is rarely available or, in some cases, does not make much sense. In a vast majority of cases, the data is not assigned to any particular distribution or family and is treated by the nonparametric techniques described in Chapter 5 and Section 6.4.

Applications in psychology. Simonoff et al. (1986) focus their attention on data provided as a two-way contingency table of frequencies $\{n_{ij}\}$, $i = 1, 2$, $j = 1, \ldots, K$. The rows correspond to the $X$ and $Y$ variables and the columns to ordered response categories. The continuous variables are discretized by $c_j$, $j = 1, \ldots, K$ (see (6.54)), and the counts $n_{ij}$ represent the number of observations of $X$ and $Y$ in the interval $(c_{j-1}, c_j)$. The sets of counts for each row are distributed as multinomial vectors with probabilities $P_{ij}$, $i = 1, 2$, $j = 1, \ldots, K$, defined in (6.54). Since for categorical data $P(X = Y)$ is not necessarily zero, the authors concentrate on inference about $\Delta = P(X < Y) - P(Y < X)$ and $R^* = P(X < Y) + \frac{1}{2}P(X = Y)$. Recall that these quantities are connected by the linear relationship $\Delta = 2R^* - 1$. Point estimation of $\Delta$ is treated in detail in Section 6.4.1. Two applications of these techniques based on real-world data are of a psychological nature while the third is based on the data in Cochran's (1954) well-known paper. The first application stems from Oskamp's (1962) data comparing the performances of staff ($X$) and trainees ($Y$) in correctly interpreting diagnostic tests provided to psychologically disturbed patients, using Pettitt's (1984) discretization of the data (only the integer values of the test score were recorded). Data is presented in Table 7.3. Note that this is a sparse table with an average of less than 2 observations per cell.
Table 7.3 Analysis of data on diagnostic tests (Oskamp, 1962). Rows correspond to staff and trainees. Columns are the integer value of the performance of the experts.

Data        62   63   66   68   69   70   71   72   73   74   75   76
Staff        0    0    0    1    2    3    1    1    3    3    4    3
Trainees     1    1    1    1    3    5    4    0    2    5    0    0

The value of $\Delta$ is estimated by the WMW type statistic (6.55), and for the above data $\hat\Delta_{WMW} = -0.4348$, corresponding to $\hat R^* = 0.283$. (Brownie (1988) points at typographical errors corrected herein.) A hypothesis test of $H_0: P_{1j} = P_{2j}$ based on $\hat\Delta_{WMW}$ rejects $H_0$ with the p-value 0.013; namely, the probability that a trainee outperforms a staff member is approximately 0.3. It would seem that the distributions are bimodal.

The second application is borrowed from B.S. Everitt's (1977) popular text and provides an age-oriented classification of 223 boys into nonliars and inveterate liars. Data is presented in Table 7.4.

Table 7.4 Analysis of data on inveterate liars (Everitt, 1977). The rows are the groups corresponding to whether or not the boy is an inveterate liar. The columns form age groups.

Age group           5-7   8-9   10-11   12-13   14-15
Inveterate liars      6    18      19      27      25
Nonliars             15    31      31      32      19
Here, $\hat\Delta_{WMW} = -0.1901$ and $\hat R^* = 0.405$. Namely, the probability that a liar is younger than a nonliar is estimated to be around 0.4.

Clinical trials applications. The other important application of the $P(X < Y)$ model is related to clinical trials of medical treatments or drugs. The first example we are going to discuss here is not the most important, but we shall proceed with it since it is presented in the same Simonoff et al. (1986) paper considered in the previous subsection. The application deals with a leprosy treatment. The data is presented in Table 7.5 below.

Table 7.5 Analysis of data on leprosy patients (Cochran, 1954). Rows are the initial condition of the patient (little or much infiltration). The columns are the change in health after 48 weeks of treatment.

                              Change in health
                                        Improvement
Infiltration   Worse   No change   slight   moderate   marked
Little           11       53         42        27        11
Much              1       13         16        15         7
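The WMW-type estimates quoted for Tables 7.3 to 7.5 can be recomputed directly from the counts. The statistic (6.55) is not reproduced in this section, so the sketch below assumes the usual form for ordered categories, $\hat\Delta = \sum_{j<k}(n_{1j}n_{2k} - n_{1k}n_{2j})/(n_1 n_2)$, together with $\hat R^* = (\hat\Delta + 1)/2$; the function name `wmw_delta` is ours.

```python
def wmw_delta(row_x, row_y):
    """Estimate P(X < Y) - P(Y < X) from two rows of counts over ordered categories."""
    n_x, n_y = sum(row_x), sum(row_y)
    less = greater = 0
    for j, count_x in enumerate(row_x):
        less += count_x * sum(row_y[j + 1:])   # pairs with X in a lower category than Y
        greater += count_x * sum(row_y[:j])    # pairs with X in a higher category than Y
    return (less - greater) / (n_x * n_y)

# (X row, Y row) for each table, in the column order printed in the text.
tables = {
    "7.3 staff vs trainees": (
        [0, 0, 0, 1, 2, 3, 1, 1, 3, 3, 4, 3],
        [1, 1, 1, 1, 3, 5, 4, 0, 2, 5, 0, 0],
    ),
    "7.4 liars vs nonliars": (
        [6, 18, 19, 27, 25],
        [15, 31, 31, 32, 19],
    ),
    "7.5 little vs much infiltration": (
        [11, 53, 42, 27, 11],
        [1, 13, 16, 15, 7],
    ),
}

results = {}
for name, (row_x, row_y) in tables.items():
    delta = wmw_delta(row_x, row_y)
    results[name] = (delta, (delta + 1.0) / 2.0)  # (Delta-hat, R*-hat)
    print(f"Table {name}: Delta = {delta:.4f}, R* = {results[name][1]:.4f}")
```

With ties split evenly through $R^*$, a negative $\hat\Delta$ means that the first row tends to fall in higher categories than the second, and a positive one the reverse.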
A quick glance at Table 7.5 indicates that a patient with much infiltration is more likely to improve than one with little infiltration. Formally, $\hat\Delta_{WMW} = 0.2326$, while other normal-based estimators are not applicable. The hypothesis $H_0: P_{1j} = P_{2j}$ is strongly rejected, with a p-value close to 0. Note that $\chi^2$ tests on each table would not reject the independence between rows and columns ($X$ versus $Y$). This is due to the fact that such a test does not take into account the ordering of the columns.

For a more detailed discussion of medical applications we turn to the work of Halperin et al. (1987), (1989), who analyze data on diabetic and gallstone treatment trials. In these two seminal papers, written shortly before Halperin's untimely demise in 1988 with two different sets of co-authors (experts in clinical trials), a new method for confidence interval estimation of $R = P(X < Y)$ using a distribution-free approach was devised. It is closely related to the WMW statistic and uses the fact that the variance of a two-sample Wilcoxon statistic can be bounded by explicit functions of $R$ (see (5.7)). In the first paper (1987) a pivotal quantity (5.45) is derived which allows us to construct an approximate confidence interval on $R$ of the form (5.47) (a comprehensive description of the Halperin et al. (1987) method is presented in Section 5.3.4). The authors (among them John Lachin - a world authority on diabetes clinical trials) once more emphasize the importance and intuitive appeal of the parameter $R = P(X < Y)$ for comparing two samples which may arise from distributions that differ in more than one parameter. This is a broader model than various parametric or semiparametric models involving a single parameter.

The application presented in Halperin et al. (1987) deals with data obtained from the Diabetes Control and Complications Trial (DCCT) containing two randomized groups of insulin-dependent diabetics. They concentrate on the percentage of hemoglobin that is glycosylated (HbA1c), which represents - at a given time - an "integration" of the blood glucose level over the period of at least the past four weeks. In the authors' opinion, HbA1c represents a single convenient measure of the degree of control of blood glucose levels. Comparison of the experimental (intensive) group with the standard group, using the $P(X < Y)$ approach, provides information about the better control, which involves not only the shift in location but also the possibility of reduced inter-individual variation among patients in the experimental group. The 1987 data cited by Halperin et al. (1987) consists of samples of size 90 of adult diabetics in each treatment group which were followed for 6 months or more. The authors utilize subsamples of sizes 40 and 20 (comparing the two treatments twice). Here the amounts of HbA1c at 6 months were used. Firstly, a standard t-test was applied and the results were found significant in both cases, with the sample size of 40 yielding more pronounced differences. (Recall that the t-test is based on the normality assumption.) Next, one-sided lower confidence limits on $P(X < Y)$ (5.44) are obtained using confidence intervals (described in detail in Section 5.3.2) based on the asymptotic normality and Sen's (1967) (see (5.15)) and Govindarajulu's (1968) (see (5.23)) estimators of variance. Halperin et al. (1987) also present their lower bound for $R$ given by (5.47). The authors note that, based on simulations, their method is preferable when constructing confidence intervals for samples as small as 20.
For the subsample of size 20, the 95% lower confidence limit using (5.47) is calculated to be 0.628; for the larger subsample of size 40 the corresponding bound is higher: 0.746. The corresponding estimators of P(X < Y) are 0.765 and 0.831, respectively. Extensive simulations were performed by Halperin et al. (1987) to evaluate the comparative adequacy of several confidence interval procedures under various scenarios and three distribution-free methods. The results are not very encouraging. Even the least conservative Halperin et al. (1987) method for R = P(X < Y) = 0.9 and n = 20 yields a coverage deviation which is "sufficiently less than zero", so that one might not wish to use this
method; however, Sen's (1967) and Govindarajulu's (1968) methods "are even much worse in this case". For n = 40 there were no negative estimates, but the signed deviation from the nominal one-sided coverage of lower confidence limits for P(X < Y) (under an underlying exponential distribution) clearly indicates that the Halperin et al. (1987) method results in a substantially smaller signed deviation from the nominal value even for sample sizes of n = 80, especially when P(X < Y) takes extreme values such as 0.1, 0.2, 0.7, 0.8 or 0.9. Returning to the estimates of P(X < Y) and 95% lower confidence limits for the DCCT study, which resulted in point estimates in the vicinity of 0.7-0.8, the authors cautiously conclude that: "It does not seem reasonable to postulate P(X < Y) < 0.5." The Halperin et al. (1989) paper is devoted to distribution-free confidence intervals for R* = P(X < Y) + 0.5P(X = Y) in the case of categorical and right-censored continuous data, using an adaptation of the Halperin et al. (1987) approach. The pivotal quantity in this case is $(\hat{R}^* - R^*)^2 / V$, where $R^*$ is defined in (6.61), its estimator $\hat{R}^*$ is obtained by substituting $n_{ij}/n_i$ in place of $P_{ij}$, $i = 1, 2$, $j = 1, \ldots, K$, and V is obtained from expression (6.65) by substituting $\hat{R}^*$ for $R^*$ and $\hat{\theta}$ for $\theta$. A detailed description of the interval estimation technique is provided in Section 6.4.2. The application of the above confidence procedure provided by Halperin et al. (1989) deals with hepatic toxicity data originating from the US National Cooperative Gallstone Study (NCGS). The study was a placebo-controlled double-blind clinical trial with the aim of assessing the efficacy of chenodeoxycholic acid (chenodiol) for the dissolution of gallstones. Due to early concerns about potential hepatotoxicity, a separate initial study of hepatic morphology was carried out some 7 years earlier. Each of the 126 patients was treated with a low dose (375 mg/day; $n_1 = 56$) or a high dose (the double amount, $n_2 = 61$).
Liver biopsies were obtained at baseline and after 9 and 24 months of treatment. Two morphologists (A and B) provided evaluations on each of 89 variables defined on categorical and ordinal scales. The authors focused their attention on the report of morphologist A at 9 months, who evaluated the condition of portal triads (see Table 7.6). The variable X (Y) corresponds to the low- (high-) dose variable; thus if R* is greater than 1/2, it would imply a higher risk associated with the 750 mg/day treatment. The estimated R* is $\hat{R}^* = 0.5815$, and the 95% confidence interval is (0.508, 0.652), which reflects - in the authors' words - a significantly higher risk of portal triad enlargement for the higher dose group. Hochberg (1981) carried out a similar analysis and obtained the interval (0.511, 0.652) for R*.

Table 7.6 Enlargement of portal triads

            High dose   Low dose
Normal          44         49
Mild             1          2
Moderate        15          5
Severe           1          0
Total           61         56

Using the parameter A = P(X
                       High dose   Low dose
Unequivocally normal        0          0
Probably normal            13         13
Mildly abnormal            40         30
Moderately abnormal         3          5
Severely abnormal           2          0
Total                      58         48
Note that the sample sizes are somewhat smaller, especially among the low dose patients (due to insufficient or poor biopsy). Surprisingly, the overall assessment provided inconclusive results, yielding $\hat{R}^* = 0.5162$ and the 95% confidence interval for R* of the form (0.4230, 0.6082). This discrepancy can perhaps be explained by noting that the distinction between the category "probably normal" and "unequivocally normal" (no unequivocal cases were identified) is rather blurred and the assessor may tend to "play safe" by lumping cases into the "probably" category. The same conclusion may be reached when comparing the "mildly" and "moderately abnormal" categories in the overall assessment, which are quite opposite to the "mild" and "moderate" data when evaluating enlargement of portal triads. In our opinion, classifications of the type "overall assessment" are too vague and possibly not even very reliable.
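The two point estimates quoted above can be reproduced directly from the cell counts. The following is a minimal sketch, assuming the plug-in form of R* = P(X < Y) + 0.5P(X = Y) with relative frequencies in place of the cell probabilities; the function name `r_star` and the variable names are illustrative choices, not notation from the cited papers.

```python
from fractions import Fraction


def r_star(x_counts, y_counts):
    """Plug-in estimate of R* = P(X < Y) + 0.5 * P(X = Y).

    Both arguments list category counts in increasing order of severity.
    """
    n_x, n_y = sum(x_counts), sum(y_counts)
    x_below = 0          # number of X observations in strictly lower categories
    total = Fraction(0)
    for x_j, y_j in zip(x_counts, y_counts):
        # each Y observation in category j exceeds x_below X's and ties with x_j
        total += y_j * (x_below + Fraction(x_j, 2))
        x_below += x_j
    return total / (n_x * n_y)


# Table 7.6 (portal triads): X = low dose, Y = high dose
low_triads, high_triads = [49, 2, 5, 0], [44, 1, 15, 1]
# Overall assessment table
low_overall, high_overall = [0, 13, 30, 5, 0], [0, 13, 40, 3, 2]

print(round(float(r_star(low_triads, high_triads)), 4))    # 0.5815
print(round(float(r_star(low_overall, high_overall)), 4))  # 0.5162
```

Exact rational arithmetic is used only to keep the count-based computation free of rounding; the confidence intervals require the variance expression of Section 6.4.2 and are not attempted here.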
7.4 ROC Curves Analysis
For a good part of the 20th century the assumption of independent random samples from continuous distributions dominated applications of statistical methodology. Most of the earlier work in the area of stress-strength relationships follows this pattern. From the middle of the 1980s we begin to observe deviations from this set-up, mainly because real-world sources of data did not conform to the i.i.d. continuous model. In fact, a substantial amount of categorized data plays an important role, especially in medically oriented applications (while engineering applications continued to adhere to the assumption of random samples, sometimes supplemented, as we have seen above, by explanatory variables). One of the developments of this type is the analysis of ROC (receiver operating characteristic) curves. This topic attracted a great deal of attention in the last decade, with a large number of publications appearing. In this volume we shall cite only a few of them, referring the interested reader to Swets and Pickett (1982) or the more recent Swets (1996).

7.4.1 ROC Curves and Their Relation to P(X < Y)
A ROC curve is a particular type of ordinal dominance (OD) graph. Consider random variables X and Y, and for every real number c plot a point T(c) in a Cartesian coordinate system with the coordinates (P(X < c), P(Y < c)). The collection of the points T(c) forms a ROC graph. Note that the coordinates of T(c) lie between 0 and 1, so that the ROC graph is always located within the unit square $\{(x, y) : 0 \le x \le 1, 0 \le y \le 1\}$. Moreover, by letting $c \to \pm\infty$ we conclude that the ROC graph always starts at (0,0)
and ends at (1,1). If X and Y are both continuous, their OD graph is also a continuous curve, while for discrete X and Y the OD graph will be a collection of distinct points. Note that these two cases are interrelated: if for continuous variables X and Y the probabilities P(X < c), P(Y < c) are available only for a limited number of values of c, one ends up with a graph constituted by a finite number of points located on the ROC graph for (X, Y). On the other hand, a discrete graph can be converted into a continuous curve by connecting the consecutive points on the graph. The relation between OD graphs and the P(X < Y) model was originally pointed out by Bamber (1975) and brought a variety of methods developed for inference about P(X < Y) into the analysis of ROC curves. Bamber (1975) observed that the area above the OD graph for continuous X and Y is equal to
$$A(X, Y) = \int_0^1 P(X < c)\, dP(Y < c) = P(X \le Y) = P(X < Y) \qquad (7.2)$$
since P(X = Y) = 0 in this case. For discrete X and Y he shows that (recall Section 6.4)
$$A(X, Y) = R^* = P(X < Y) + 0.5\, P(X = Y). \qquad (7.3)$$
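The discrete identity (7.3) can be checked by a short calculation once consecutive points of the OD graph are joined by straight lines. The derivation below is a sketch under the following assumptions, introduced here only for illustration: the categories are labelled $1, \ldots, K$, $p_j = P(X = j)$, $q_j = P(Y = j)$, the graph passes through the points $(F_j, G_j)$ with $F_j = \sum_{i \le j} p_i$, $G_j = \sum_{i \le j} q_i$ and $F_0 = G_0 = 0$, and the area above the graph is computed as $\int x\, dy$ along the polygonal line.

```latex
\begin{align*}
A(X, Y) &= \sum_{j=1}^{K} \frac{F_{j-1} + F_j}{2}\,\bigl(G_j - G_{j-1}\bigr)
         = \sum_{j=1}^{K} q_j \Bigl( F_{j-1} + \tfrac{1}{2}\, p_j \Bigr) \\
        &= \sum_{j=1}^{K} q_j\, P(X < j) + \tfrac{1}{2} \sum_{j=1}^{K} p_j\, q_j
         = P(X < Y) + \tfrac{1}{2}\, P(X = Y) = R^* .
\end{align*}
```

Each summand is the area of a trapezoid between one straight segment of the graph and the vertical axis, so ties contribute exactly half of their probability mass.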
It is evident that the OD graph for (X, Y) can be obtained by rotating the OD graph for (Y,X) and A(X,Y) = 1 - A(Y,X). In view of the relations (7.2) and (7.3), the area A(X, Y) can be utilized as a measure of the size of difference between two populations with A(X, Y) = 1 if and only if the distribution of X lies entirely below the distribution of Y. On the other hand, if X and Y are identically distributed, A(X, Y) = 1/2. However, A(X, Y) is more commonly applied to measure how accurately a given test differentiates two populations. Consider the "yes-no" signal detection experiment. In this experiment, the observer is told to respond "yes" if he/she thinks that the signal was presented on the trial and to respond "no" otherwise. It is assumed that the observer performs this task as follows. First, he/she adopts (often subjectively) an impression strength criterion, say c. Then, on each trial if the impression strength reaches or exceeds the criterion, he/she responds
"yes", and responds "no" otherwise. Let $I_s$ and $I_n$ be continuous random variables denoting the strengths of sensory impressions aroused by signal and noise events, respectively. Then
P(yes|signal) = $P(I_s > c)$ and P(yes|noise) = $P(I_n > c)$.
The $(I_n, I_s)$ ROC curve (or yes-no ROC curve) is then a collection of points $(P(I_n > c), P(I_s > c))$ in a unit square. It is easy to observe that this is the rotated OD graph for $(I_n, I_s)$, so that the area below the graph is equal to $P(I_n < I_s)$. The curve can be estimated from data on $P(I_n > c)$ and $P(I_s > c)$ under certain (parametric or nonparametric) assumptions. However, sometimes no such data is available. Consider the case when an expert performing an experiment assesses on each trial his/her degree of confidence that a signal was indeed presented on that trial (for instance, "definitely no", "probably no", "questionable", "probably yes", "definitely yes"). For this purpose, he/she is given a confidence scale consisting of K confidence levels which are obtained by simply discretizing all possible values of $I_n$ and $I_s$ by means of a partition $-\infty \le c_0 < c_1 < \cdots < c_{K-1} < c_K \le \infty$. The expert concludes that the signal belongs to category $C_j$ if it lies between $c_{j-1}$ and $c_j$, $j = 1, \ldots, K$, although the cut-off points $c_j$ may not be explicitly available. The data allows one to plot several sample points on the ROC curve, and essentially coincides with the categorical data for the P(X < Y) model discussed in some detail in Section 6.4. In other words, the ROC curve analysis described above is yet another version of the stress-strength model and can be performed by the variety of methods developed in Section 6.4. The specific (and somewhat controversial) feature of the ROC curve analysis, however, is the assumed existence of a monotone transformation such that on a transformed scale $I_n$ and $I_s$ are normally distributed with possibly different means and variances (see e.g. Brownie (1988) and Metz et al. (1998)). Since in a number of applications the cut-off points are unknown, this is equivalent to the assumption that $(I_n, I_s)$ are normally distributed, and the methods developed in Section 6.4 can apply in this case.
7.4.2 Applications of ROC Curves
ROC curve analysis has been used in various fields of medical imaging, radiology, psychiatry, nondestructive testing and manufacturing inspection systems (see e.g. Hsiao et al. (1989), Metz (1989), Nockemann et al. (1991), Reiser (2000), Swets (1996) and Swets and Pickett (1982)). Here, we shall consider an example of comparison of predictive validities of aptitude tests studied by Humphreys and Swets (1991). These authors compare two methods of assessment of air pilots' training which were in use during World War II. The training was broken into nine steps, stanine, which represented weighted raw-score composites of different tests. The results of these tests provide information about the expected date of graduation (when pilot's wings and a commission as a second lieutenant were awarded). The tests and their weights entering into the pilot stanine were changed from time to time as research information accumulated. One of these changes took place in 1942-43. The data in Humphreys and Swets (1991) represent three classes. Below, we choose just two of them, the class 43-H tested on the pilot stanine of December 1942 and the class 44-1 tested on the stanine of November 1943. Note that during this period there was a change in the training standards: between the 43-H and 44-1 classes, the decision was made to use a minimum cut-off of 4 on a pilot stanine as a prerequisite for entry to the pilot training program.

Table 7.8 Pass/fail classification in pilot training classes

                   43-H                      44-1
Stanine   No. passed   No. failed   No. passed   No. failed
   9          663           45          683           17
   8          565          101          718           76
   7          988          249         1166          159
   6         1184          486         1306          327
   5         1127          708          962          405
   4          841          827          359          282
   3          401          620            2            3
   2          148          397            1            0
   1           43          214            0            0
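The counts in Table 7.8 can be turned into empirical ROC points by lowering the stanine cut-off one step at a time. The sketch below is an illustration only: it treats the passing trainees as the "signal" group and the failing trainees as the "noise" group, joins the points by straight lines and computes the area by the trapezoidal rule. The function names are hypothetical, and the resulting area is a distribution-free quantity that need not coincide with an area obtained under a normality assumption.

```python
def roc_points(passed, failed):
    """Empirical ROC points; counts are ordered from the highest stanine down."""
    n_pass, n_fail = sum(passed), sum(failed)
    points = [(0.0, 0.0)]
    cum_pass = cum_fail = 0
    for p, f in zip(passed, failed):
        cum_pass += p
        cum_fail += f
        # x: share of failing trainees at or above the cut-off,
        # y: share of passing trainees at or above the cut-off
        points.append((cum_fail / n_fail, cum_pass / n_pass))
    return points


def trapezoid_area(points):
    """Area under the polygonal line through consecutive ROC points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Class 43-H counts from Table 7.8, stanines 9 down to 1
passed_43h = [663, 565, 988, 1184, 1127, 841, 401, 148, 43]
failed_43h = [45, 101, 249, 486, 708, 827, 620, 397, 214]

points_43h = roc_points(passed_43h, failed_43h)
area_43h = trapezoid_area(points_43h)
print(f"empirical area for class 43-H: {area_43h:.3f}")
```

The same two functions apply unchanged to the 44-1 columns of the table.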
Using data presented in Table 7.8, the authors were testing the assumption that the later stanine assessment of a student's performance is "better" than the earlier one. The data for each stanine is assumed to be normal with unknown partition values. The normality assumptions were tested by chi-square tests and provided satisfactory fits. The ROC curves for both classes are constructed with $I_n$ and $I_s$ being the unknown raw scores of the failing and passing trainees. These ROC curves are presented in Fig. 7.1 and 7.2. The area under the ROC curve for each class was calculated and turned out to be 0.734 for the 43-H class and 0.714 for the 44-1 class, thus showing no significant differences between the two stanines.

Fig. 7.1 ROC curve for the pass-fail data in Class 43-H.

7.5 Some Other Applications

7.5.1 Estimation of Strength Characteristics from the Distribution of Stress
Fig. 7.2 ROC curve for the pass-fail data in Class 44-1.

Using the US Air Force Material Laboratory report (1974), Durham and Padgett (1990) analyze simplified data on windgust loading experiments with sheets of steel alloy. The models utilized by the authors are not the conventional P(X < Y) models but are related to a probabilistic interpretation of Miner's rule (Birnbaum and Saunders (1968)). Durham and Padgett (1990) study estimation of characteristics of the cdf of the magnitude of strength Y of an item under the assumption that the cdf of the stress X applied in a typical loading is known while the cdf of Y is unknown but assumed to be continuous. Neither X nor Y is directly observable and the estimates of characteristics of Y are obtained semi-parametrically. Specifically, a simple stress-strength model with i.i.d. loadings postulates that loads are applied to the item until it fails (namely, the current load exceeds the strength). The number of loadings until failure is recorded for each item and there are no other failure modes. Under these assumptions the conditional distribution of the number N of loadings until failure is given by the geometric distribution
$$P(N = n \mid Y = y) = p_y^{\,n-1} q_y, \qquad n = 1, 2, \ldots,$$
where $p_y = P(X_{ij} < Y_i \mid Y_i = y) = F_X(y)$ and $q_y = 1 - p_y$. Here i is the
index for the item, $i = 1, \ldots, k$, $X_{ij}$, $j = 1, 2, \ldots$, are the loadings applied to item i, which are random variables, and $Y_i$ is the strength of item i on test, which is a random variable as well. Hence, $P(N > n \mid Y = y) = p_y^n$. The unconditional probability that the number of loadings until failure of a system exceeds n is
$$R(n) = P(N > n) = \int_0^\infty p_y^n \, dF_Y(y),$$
where $F_Y(\cdot)$ is the cdf of Y. Equivalently, introducing the unconditional probability
$$P(N = n) = \pi(n) = \int_0^\infty p_y^{\,n-1} q_y \, dF_Y(y),$$
we have
$$R(n) = 1 - \sum_{m=1}^{n} \pi(m).$$
Assuming that $F_X(\cdot)$ is uniform on (0, a), a > 0, and the support of $F_Y(\cdot)$ is contained in (0, a), from the above expression of R(n) we obtain $R(n) = a^{-n} E(Y^n)$. Namely, the mean and the variance of the strength Y are
$$\mu_Y = a R(1) \quad \text{and} \quad \sigma_Y^2 = a^2 [R(2) - R^2(1)].$$
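Under the uniform-stress assumption these two relations become plug-in estimates once R(1) and R(2) are replaced by the observed fractions of items that survive more than one and more than two loadings. A minimal sketch follows; the function name `strength_moments`, the loading counts and the value of a are hypothetical and serve only to illustrate the arithmetic.

```python
def strength_moments(loadings_to_failure, a):
    """Plug-in estimates of the mean and variance of strength.

    Assumes stress uniform on (0, a); loadings_to_failure holds the observed
    number of loadings until failure for each tested item.
    """
    k = len(loadings_to_failure)
    r1 = sum(n > 1 for n in loadings_to_failure) / k   # share with N_i > 1
    r2 = sum(n > 2 for n in loadings_to_failure) / k   # share with N_i > 2
    mean = a * r1
    # In small samples r2 - r1**2 may turn out negative.
    variance = a ** 2 * (r2 - r1 ** 2)
    return mean, variance


# Hypothetical data: five items, stress uniform on (0, 10)
mean_hat, var_hat = strength_moments([1, 3, 4, 6, 9], a=10.0)
print(mean_hat, var_hat)
```

Because the variance estimate is a difference of two sample proportions, a negative value in small samples signals that the data are too sparse rather than an error in the computation.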
Now one can estimate $\mu_Y$ and $\sigma_Y^2$ noting that unbiased and consistent estimators of R(1) and R(2) are
$$\hat{R}(1) = (\text{number of } N_i > 1)/k, \qquad \hat{R}(2) = (\text{number of } N_i > 2)/k,$$
respectively. Here $N_i$ is the least upper bound of the set $\{n : X_{i1} < Y_i, \ldots, X_{i,n-1} < Y_i\}$, $n \ge 1$, and as above k is the number of items on the test. Details of the procedure are given in Durham and Padgett (1990), who approximated the distribution $F_Y(y)$ by means of a discrete distribution using the method of Deely and Kruse (1968). In the example provided by the authors, the mean stress per loading is 32 kpsi. Each of the seven components was repeatedly stressed until failure. The stress cdf was assumed to be exponential, $F_X(x) = 1 - \exp(-\lambda x)$ with $\lambda = 1/32 \approx 0.031$ (kpsi$^{-1}$). The ordered observed numbers of loadings to
failure were: 1, 1, 2, 3, 4, 5, 14. The problem is to estimate the mean strength. The estimator depends only upon the choice of the points $y_{jk}$, $j = 1, \ldots, 7$, at which the cdf $F_Y$ is approximated. Indeed, the choice of $y_{jk}$: 25 (25) 175 yields the estimated mean strength of 55.1 kpsi and the estimated standard deviation of strength 10.1 kpsi. An alternative choice of $y_{jk}$: 10 (20) 130 results in an estimated mean strength of 54.9 kpsi. See Durham and Padgett (1990) for further details. A similar procedure can be used for a model of cumulative damage in which loading j produces a random amount of damage $D_j$, which is manifested by a reduction in the strength of the item, and the damage accumulates linearly until the item fails (namely, as above, the current load exceeds the current strength) and again there are no other failure modes. The basic change in derivation is that now
$$P(N > n \mid Y = y) = P\Bigl(\sum_{j=1}^{n} D_j \le y\Bigr).$$
Consequently,
$$R(n) = \int_0^\infty P\Bigl(\sum_{j=1}^{n} D_j \le y\Bigr)\, dF_Y(y),$$
which can be written as
$$R(n) = \int_0^\infty F^*_{D_n}(y)\, dF_Y(y).$$
Here $F^*_{D_n}(y)$, the cdf of $\sum_{j=1}^{n} D_j$, is the n-fold convolution of $F_{D_j}$, $D_j$ being the damage caused by load j, a random variable. The same procedure as described above can be used to estimate the characteristics of the strength: just replace $(1 - F_X(\cdot))^n$ by the convolution of $F_{D_j}$.

7.5.2 A Relation Between the Stress-Strength Model and the Process Capability Index
Finally, we shall briefly comment on the as yet unexplored relation between the process capability indices and the stress-strength model - two seemingly distinct fields of study within the general framework of statistical quality control.
The use of process capability indices (see e.g. Kotz and Johnson (2002) for a recent survey) is motivated by a desire to have an index related to the probability that an attribute (Z) of a component (size, density, elastic strength, etc.) falls within fixed specification limits (sometimes the "specification interval" is one-sided - e.g. $(-\infty, A)$ or $(A, \infty)$ - and only a lower or upper limit is specified). However, in some circumstances it may be desirable to have an "index" allowing for possibly varying limits - $T_L$ or $T_U$, say, for lower and upper limits respectively. We are then interested in $P(T_L < Z < T_U)$. If only one limit ($T_L$ or $T_U$) is finite, we are back to the stress-strength model calculation of the P(X < Y) type. The more complicated analysis for $P(T_L < Z < T_U)$ corresponds to the generalization P(X < Y < Z) of the stress-strength model studied in detail in Section 6.2.2. An analysis covering less restrictive situations than those described in that section could turn out to be useful and revealing in bridging and unifying the two approaches.
Bibliography
Abramowitz, M., Stegun, LA. (1992) Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. Reprint of the 1972 edition. Dover Publications, New York. Abu-Salih, M. S., Shamseldin, A. A.(1988) Bayesian estimation of P(X < Y) for a bivariate exponential distribution. Arab Gulf J. Sci. Res. A. Math. Phys. Sci., 6(1), 17-26. Abusev, R.A., Kolegova, N.V. (1998) On estimators of probabilities of linear inequalities in the case of multivariate T-distributions. In Statistical Methods of Estimation and Hypothesis Testing, 18 - 24, Perm State University, Perm (in Russian). Ahmad, K.E., Fakhry, M.E., Jaheen, Z.F. (1995) Bayes estimation of P(Y > X) in the geometric case. Microelectron. reliab., 35(5), 817-820. Ahmad, K.E., Fakhry, M.E., Jaheen, Z.F. (1997) Empirical Bayes estimation of P(Y < X) and characterization of Burr-type X model. J. Statist. Plan. Inf., 64, 297 - 308. Akman, O., Sansgiry, P., Minnotte, M.C. (1999) On the estimation of reliability based on mixture inverse Gaussian distributions, pp. 121 - 128. In Applied Statistical Science, IV, Nova Science Publishers. Al-Hussaini, E.K., Mousa, M.A.M,, Sultan, K.S. (1997) Parametric and nonparametric estimation of P(Y < X) for finite mixtures of lognormal components. Commun. Statist. - Theory Meth. , 26, 1269-1289. Aminzadeh, M.S.(1991) Confidence bounds for Pr(X > Y) in 1-way ANOVA random model. IEEE Trans. Reliab., 40, 537-541. Aminzadeh, M.S. (1997) Estimation of reliability for exponential stress-strength models with explanatory variables. Appl. Math. Comput, 84, 269-274. Aminzadeh, M.S. (1999) Estimation of P(Z < Y) for correlated stochastic time series models. Appl. Math. Comput, 104, 179 - 189. Anderson, T.W., Fang, K.T., Hsu, H. (1986) Maximum-likelihood estimates and likelihood-ratio criteria for multivariate elliptically contoured distributions. 233
Canadian Journ. Statist, 14, 55-59. Arsham, H. (1986) A generalized confidence region for stress-strength reliability. IEEE Trans. Reliab., 35, 586 - 588. Awad, A.M., Azzam, M.M., Hamdan, M.A. (1981) Some inference results on Pr(Jf < Y) in the bivariate exponential model. Commun. Statist. - Theory Meth., 10, 2515-2525. Awad, A.M., Fayoumi, M. (1985) Estimate of P(X < Y) in case of the double exponential distribution. In Proceedings of the Seventh Conference on Probability Theory, Aug. 29-Sept. 4, 1982, Brasov, Romania. Vnuscience Press, Utrecht, Netherlands, 527-531. Awad, A.M., Gharraf, M.K. (1986) Estimation of P(Y < X) in the Burr case: a comparative study. Commun. Statist. - Simul. Comp., 15, 389-403. Azzalini, A., Chiogna, M. (2002) Stress-strength model for skew-normal distributions. - submitted. Bader, M.G., Priest, A.M. (1982) Statistical aspects of fibre and bundle strength in hybrid composites. In Progress in Science and Engineering Composites, eds. Hayashi, T., Kawata, K., Umekawa, S., vol. ICCM-IV, Tokyo, 11291136. Bai, D.S., Hong, Y.W. (1992) Estimation of Pr(X < Y) in the exponential case with common location parameters. Commun. Statist. - Theory Meth., 21, 269-282. Baklizi, A. (2001) Estimation of P{X < Y) in the exponential distribution with censored data. Pakistan J. Statist, 17, 143-149. Bamber, D. (1975) The area above the ordinal dominance graph and the area below the receiver operating characteristic graph. J. Math. Psychol., 12, 387-415. Barton, D.E. (1961) Unbiased estimation of a set of probabilities. Biometrika, 48, 227-229. Basu, A.P. (1967) On the large sample properties of a generalized WilcoxonMann-Whitney statistic. Ann. Math. Statist., 38, 905-915. Basu, A.P. (1977) A generalized Wilcoxon Mann-Whitney statistic with some applications in reliability. In The Theory and Applications of Reliability, with Emphasis on Bayesian and Nonparametric Methods. Conf., Univ. South Florida, Tampa, Fla., 1975, 1, 131-149. Academic Press, New York. 
Basu, A.P. (1981) The estimation of P(X < Y) for distributions useful in life testing. Naval Res. Logist Quart 28, 383-392. Basu, A.P. (1988) Multivariate exponential distributions and their applications in reliability. In Handbook of Statistics, eds. Krishnaiah, P.R. and Rao, C.R., 7, 467-476. Basu, A., Ebrahimi, N. (1983) On the reliability of stochastic systems. Statist. Probab. Letters, 1, 265-267. Basu, D. (1964) Estimates of reliability for some distributions useful in life testing. Technometrics, 6, 215-219. Bechhofer, R.E. (1954) A single-sample multiple decision procedure for ranking
means of normal populations with known variances. Ann. Math. Stat, 25, 16-39. Beg, M.A. (1980a) Estimation of P(Y < X) for truncation parameters distributions. Commun. Statist. - Theory Meth., 9, 327—345. Beg, M.A. (1980b) On the estimation of P(Y < X) for the two-parameter exponential distribution. Metrika, 27, 29-34. Beg, M.A. (1980c) Estimation of P(V < X) for exponential family. IEEE Trans. Reliab., 29, 158 - 159. Beg, M.A. (1983) Unbiased estimators and tests for truncation and scale parameters. Amer. J. Math. Mgmt. Sci., 3, 251-274. Beg, M.A., Singh, N. (1979) Estimation of P(Y < X) for the Pareto distribution. IEEE Trans. Reliab., 28, 411-414. Belyaev, Y., Lumelskii, Y. (1988) Multidimensional Poisson walks. Journ. Math. Sciences, 40, 162-165. Bennett, G. (1962) Probability inequality for the sum of independent random variables. J. Amer. Statist. Assoc. , 57, 33-45. Berger, J., Bernardo, J.M. (1989) Estimating a product of means: Bayesian analysis with reference priors. J. Amer. Statist. Assoc. , 84, 200 -207. Berger, J., Bernardo, J.M. (1992) On the development of reference priors (with discussion). In Bayesian Statistics 4, eds. J.M. Bernardo, J. O. Berger, A.P. Dawid and A.F.M. Smith, Oxford University Press, Oxford, UK, pp. 35-60. Bernardo, J.M. (1979) Reference posterior distributions for Bayesian inference (with discussion). J.R. Statist. Soc, B41, 113-147. Bhattacharyya, G.K. (1977) Reliability estimation from survivor count data in a stress-strength settings. IAPQR Trans. - J. Indian Assoc. for Prod., Qual. and Reliab., 2, 1-15. Bhattacharyya, G.K,, Guttman, I., Johnson, R.A., Reiser, B. (1986) Statistical Inference for Stress-Strength Models With Covariates. Tech. Report 8, University of Toronto, Dept. of Statistics. Bhattacharyya, G.K., Johnson, R.A. (1974) Estimation of reliability in a multicomponent stress-strength model. J. Amer. Statist. Assoc, 69, 966-970. Bhattacharyya, G.K., Johnson, R.A. 
(1975) Stress-strength models for system reliability. In Proc. Symp. on Reliability and Fault Tree Analysis. Ed. Barlow, R. E., Fussell, J.B., Singpurwalla, N.D. SIAM, Philadelphia, 509-532. Bhattacharyya, G.K., Johnson, R.A. (1977) Estimation of system reliability by nonparametric techniques. Bulletin of Mathematical Society of Greece (Memorial volume), 94-105, Bhattacharyya, G.K., Johnson, R.A. (1981) Stress-strength models for reliability: overview and recent advances. Proceedings of the Twenty-sixth Conference on the Design of Experiments in Army Research, Development and Testing. New Mexico State Univ., Las Cruces, 1980. ARO Rep. 81, 2, U. S. Army Res. Office, Research Triangle Park, N.C., 531-548. Bilikam, J.E. (1985) Some stochastic stress-strength processes. IEEE Trans. Reliab., 34, 269 - 274.
Billingsley, P. (1995) Probability and Measure. Wiley, New York. Birnbaum, Z.W. (1956) On a use of Mann-Whitney statistics. Proc. Third Berkeley Symp. in Math. Statist. Probab., Vol. 1, 13-17, University of California Press, Berkeley, CA. Birnbaum, Z.W., McCarty, B.C. (1958) A distribution-free upper confidence bounds for Pr(V < X) based on independent samples of X and Y. Ann. Math. Statist, 29, 558-562. Birnbaum, Z.W., Saunders, S.C. (1968) A probabilistic interpretation of Miner's rule. SIAM J. Applied Mathematics, 16, 637-652. Bjerkdal, T. (1960) Acquisition of resitance in guinea pigs injected with different doses of virulent tubercle bacteria. Amer. J. Hygiene, 72, 130-148. Block, H.W., Basu, A.P. (1974) A continuous bivariate exponential extension. J. Amer. Statist. Assoc, 69, 1031-1037. Brownie, C. (1988) Estimating Pr(X < Y) in categorized data using "ROC" analysis. Biometrics, 44, 615-621. Burr, I.W. (1942) Cumulative frequency functions. Ann. Math. Statist, 13, 215222. Carlin, B.P., Louis, T.A. (2000) Bayes and Empirical Bayes Methods for Data Analysis. Chapman and Hall, London. Casella, G. (1985) An introduction to empirical Bayes data analysis. The American Statistician, 39, 83-87. Casella, G., Berger, R. (1990) Statistical Inference. Duxbury Press, California. Chandra, S., Owen, D.B. (1975) On estimating the reliability of a component subject to several different stresses (strengths). Nav. Res. Logist. Quart., 22, 31-39. Chandra, S., Owen, D.B. (1977) On an estimator of the probability P(Xi < Y,X2 < Y, ,XN < Y). South African Statist. J., 11, 149-154. Chao, A. (1982) On comparing estimators of P(V < X) in the exponential case. IEEE Trans. Reliab., 31, 389-392. Chao, A., Cheng, K. (1985) Interval estimators for highly reliable stress-strength models. Chinese J. Math., 13, 131-136. Charnes, A., Cooper, W. W. (1961) Management Models and Industrial Applications of Linear Programming. Wiley, New York. Chaturvedi, A., Surinder, K. 
(1999) Further remarks on estimating the reliability function of exponential distribution under type I and type II censoring. Brazilian J. Probab. Statist, 13(1), 29 - 39. Cheng, K.F, Chao, A. (1984) Confidence intervals for reliability from stressstrength relationships. IEEE Trans. Reliab., 33, 246-249. Choi, S.S., Kim, J. J. (1983) A Bayes reliability estimation from life test in a stress-strength model. J. Korean Statist. Soc, 12, 1-9. Church, J.D., Harris, B. (1970) The estimation of reliability from stress-strength relationship. Technometrics, 12, 49-54. Cochran, W.G. (1954) Some methods for strengthening common x 2 tests. Biometrics, 10, 417-451.
Constantine, K., Karson, M., Tse, S.-K. (1986) Estimators of P(Y < X) in the gamma case. Commun. Statist. - Simul. Comput, 15, 365-388. Constantine, K., Karson, M., Tse, S.-K. (1989) Bootstrapping estimates of P(Y < X) in the gamma case. J.Statist. Comput. Simul., 33, 217-231. Cramer, E. (2001) Inference for stress-strength models based on Wienman multivariate exponential samples. Commun. Statist. - Theory Meth., 30, 331346. Cramer, E., Kamps, U. (1997a) The UMVUE of P(X < Y) based on typeII censored samples from Weinman multivariate exponential distributions. Metrika, 46, 93-121. Cramer, E., Kamps, U. (1997b) A note on UMVUE of Pr(X < Y) in the exponential case. Commun. Statist. - Theory Meth., 26, 1051-1055. Csaki, E. (1984) Empirical distribution function. In Handbook of Statistics. Ed. Krishnaiah, P.R., and Sen, P.K., Vol. 4, Elsevier, North Holland, 405-430. Dahel, S. (1989) Bias in a stress-strength problem. IEEE Trans. Reliab., 38, 386-387. Datta, G.S., Ghosh, M. (1995) Some remarks on noninformative priors. J. Amer. Statist. Assoc. , 90, 1357 - 1363. Datta, G.S. (1996) On priors providing frequentist validity of Bayesian inference for multiple parametric functions. Biometrika, 83, 287-298. Deely, J.J., Kruse, R.L. (1968) Construction of sequences estimating the mixing distribution. Ann. Math. Stat, 39, 286-288. DeLong, E.R., Sen, P.,K. (1981) Estimation of Pr(X > Y) based on progressively truncated versions of the Wilcoxon-Mann-Whitney statistic. Commun. Statist. - Theory Meth., 10, 963-981. DeLong, E.R., Sen, P.K. (1982/83) The extended two-sample problem: progressively truncated estimation of P{X > Y}. Statist. Decis., 1(2), 147-170. Dinh, K.T., Singh, J., Gupta, R.C. (1991) Estimation of reliability in bivariate distributions. Statistics, 22, 409-417. Downton, F. (1973) On the estimation of Pr(Y < X) in the normal case. Technometrics, 15, 551-558. Duncan, A.G. (1986) Quality and Industrial Statisitcs, 5th ed. Homewood, IL: Richard D. Irwin. 
Durham, S.D., Padgett, W.J. (1990) Estimation for a probabilistic stress-strength model. IEEE Trans. Reliab., 39, 199-203. Dutta, K., Srivastava, G.L. (1987) An n-standby system with P(X < Y < Z). IAPQR Trans., 12, 95-97. Easterling, R. (1972) Approximate confidence limits for system reliability. J. Amer. Statist. Assoc, 67, 220-222. Edwardes, M.D. de B. (1995) A confidence interval for Pr(X < Y) - Pr(X > Y) estimated from simple cluster samples. Biometrics, 51, 2, 571-578. Efron, B. (1979) Bootstrap methods: another look at the jackknife. Ann. Statist., 7, 1-26. Efron, B. (1982) The Jackknife, the Bootstrap and Other Resampling Plans.
CBMS-NSF monograph, —bf 38, SIAM, Philadelphia. Enis, P., Geisser, S. (1971) Estimation of the probability that Y > X. J. Amer. Statist. Assoc, 66, 162-168. Everitt, B.S. (1977) The Analysis of Contingency Tables. London, Chapman and Hall. Fang, K.-T., Kotz, S., Ng, K.-W. (1990) Symmetric Multivariate and Related Distributions. Chapman and Hall, London, UK. Feigin, P.D., Lumelskii, Ya.P. (2000) On confidence limits for the difference of two binomial parameters. Commun. Statist. - Theory Meth., 29, 131-141. Feigin, P.D., Lumelskii, Ya.P., Volkovich, Z.E. (2001) On Monte Carlo simulation of confidence bounds for reliability problems. In Proceedings of 15-th European Simulation Multiconference "Modelling and Simulation 2001", Prague, 719-721. Ferguson, T.S. (1973) A Bayesian analysis of some nonparametric problems. Ann. Statist., 1, 209-230. Freund, J.E. (1961) A bivariate extension of the exponential distribution. J.Amer. Statist. Assoc, 56, 971-977. Gastwirth, J.L., Krieger, A.M. (1991) On bounding P(X2 < X\) from grouped data. Scand. J. Statist, 18, 111-117. Ghosh, M., Lahiri, P. (1992) Estimation of P(XW < X ( 2 ) ): a nonparametric empirical Bayes approach. In Order Statistics and Nonparametrics: Theory and Applications (P.K. Sen and I.M.Salama, eds.) Elsevier Science, Netherlands, Amsterdam, 247-261. Ghosh, J.K., Mukerjee, R. (1992) Non-informative priors (with discussion). In Bayesian Statistics 4, eds. J.M. Bernardo, J. O. Berger, A.P. Dawid and A.F.M. Smith, Oxford University Press, Oxford, UK, 321 -344. Ghurye, S.G., Olkin, I. (1969) Unbiased estimation of some multivariate probability densities and related functions. Ann. Math. Statist., 40, 1261-1271. Gnedenko, B.,V. (1943) Sur la distribution limite du terme maximum d'une serie aleatoire. Ann. Math., 44, 423-453. Govindarajulu, Z. (1967) Two sided confidence limits for P(X > Y) based on normal samples of X and Y. Sankhyd, 29, 35-40. Govindarajulu, Z. 
(1968) Distribution-free confidence bounds for P(X < Y). Ann. Inst. Statist. Math., 20, 229-238. Govindarajulu, Z. (1974) Fixed-width confidence intervals for P(X < Y). Reliability and Biometry. Statistical Analysis of Lifelength. Proc. Conf., Florida State Univ., Tallahassee, Fla. 1973. SIAM, Philadelphia, 747-757. Govindarajulu, Z. (1976) A note on distribution-free confidence bounds for P(X < Y) when X and Y are dependent. Ann. Inst. Statist. Math., 28, 307-308. Gradshtein, I.S., Ryzhik, I.M. (1980) Tables of Integrals, Series, and Products. Academic Press, New York. Gray, H.L., Schucany, W.R. (1972) The Generalized Jackknife Statistic. Marcel Dekker, New York.
Gumbel, E.J. (1960) Bivariate exponential distribution. J. Amer. Statist. Assoc., 55, 698-707. Gupta, C.G., Brown, N. (2001) Reliability studies of the skew-normal distribution and its application to a strength-stress model. Commun. Statist. - Theory Meth., 30, 2427-2445. Gupta, R.C., Gupta, R.D. (1987) A comparison of various estimators of reliability. Comput. Statist. Data Anal., 5, 215-226. Gupta, R.C., Ma, S. (1996) Testing the equality of coefficients of variation in k normal populations. Commun. Statist. - Theory Meth., 25, 115-132. Gupta, R.C., Ramakrishnan, S., Zhou, X. (1999) Point and interval estimation of P(X < Y): the normal case with common coefficient of variation. Ann. Inst. Statist. Math., 51, 571-584. Gupta, R.C., Subramanian, S. (1998) Estimation of reliability in a bivariate normal distribution with equal coefficients of variation. Commun. Statist. Simul., 27, 675-698. Gupta, R.D., Gupta, R.C. (1988) Estimation of P(Yp > max(Y1, Y2, ..., Yp-1)) in the exponential case. Commun. Statist. - Theory Meth., 17, 911-924. Gupta, R.D., Gupta, R.C. (1990) Estimation of Pr(a'x > b'y) in the multivariate normal case. Statistics, 21, 91-97. Gupta, R.P. (1972) Reliability estimation of a system comprised of k elements from the same truncated exponential model. Statist. Neerl., 26, 55-59. Gupta, S.S. (1963) Probability integrals of multivariate normal and multivariate t. Ann. Math. Stat., 63, 792-828. Gupta, S.S. (1963) Bibliography on the multivariate normal integrals and related topics. Ann. Math. Stat., 63, 829-838. Guttman, I., Johnson, R.A., Bhattacharyya, G.K., Reiser, B. (1988) Confidence limits for stress-strength models with explanatory variables. Technometrics, 30, 161-168. Hald, A. (1998) A History of Mathematical Statistics From 1750 to 1930. Wiley, New York. Hallin, M., Seoh, M. (1997) When does Edgeworth beat Berry and Esseen? Numerical evaluations of Edgeworth expansions. J. Statist. Plann. Inference, 63, 19-38. 
Halperin, M., Gilbert, P.R., Lachin, J.M. (1987) Distribution-free confidence intervals for Pr(X1 < X2). Biometrics, 43, 71-80. Halperin, M., Hamdy, M.I., Thall, P.F. (1989) Distribution-free confidence intervals for a parameter of Wilcoxon-Mann-Whitney type for ordered categories and progressive censoring. Biometrics, 45, 509-521. Hanagal, D.D. (1992) Some inference results in modified Freund's bivariate exponential distribution. Biom. J., 34, 745-756. Hanagal, D.D. (1995) Testing reliability in a bivariate exponential stress-strength model. J. Indian Statist. Assoc., 33, 41-45. Hanagal, D.D. (1997a) Note on estimation of reliability under bivariate Pareto stress-strength model. Statist. Papers, 38, 453-459.
Hanagal, D.D. (1997b) Estimation of reliability when stress is censored at strength. Commun. Statist. - Theory Meth., 26, 911-919. Hanagal, D.D. (1999) Estimation of reliability of a component subjected to bivariate exponential stress. Statist. Papers, 40, 211-220. Hanagal, D.D., Kale, B.K. (1992) Large sample tests for testing symmetry and independence in some bivariate exponential models. Commun. Statist. - Theory Meth., 21, 2625-2643. Harris, B., Soms, A.P. (1983) A note on a difficulty inherent in estimating reliability from stress-strength relationships. Naval Res. Logist. Quart., 30, 659-662. Hayter, A.J., Liu, W. (1996) A note on the calculation of Pr{X1 < X2 < ... < Xk}. The American Statistician, 50(4), 365.
Hilgers, R. (1981) On asymptotically distribution-free confidence bounds for P(X1 > X2) based on samples not necessarily independent. Biom. J., 23, 627-633. Hilgers, R. (1981) On an unbiased variance estimator for the Wilcoxon-Mann-Whitney-statistic based on ranks. Biom. J., 23, 653-661. Hlawka, P. (1975) Estimation of the parameter p = P(X < Y < Z). Prace Nauk. Inst. Mat. Politechn. Wroclaw. No. 11, Ser. Stud. i Materialy No. 10 Problemy rachunku prawdopodobienstwa. 55-65 (in Polish). Hochberg, Y. (1981) On the variance estimate of a Wilcoxon-Mann-Whitney statistic for group ordered data. Comm. Statist. - Theory Meth., 10, 1719-1732. Hoeffding, W. (1963) Probability inequalities for sums of bounded random variables. J. Amer. Statist. Assoc., 58, 13-30. Hogg, R.V., Craig, A.T. (1978) Introduction to Mathematical Statistics. Fourth ed. Macmillan Publishing Co., New York. Holla, M.S. (1967) Reliability estimation of the truncated exponential model. Technometrics, 9, 332-335. Hollander, M., Korwar, R.M. (1976) Nonparametric empirical Bayes estimation of the probability that X < Y. Comm. Statist. - Theory Meth., 5, 1369-1383. Hollander, M., Wolfe, D.A. (1999) Nonparametric Statistical Methods. Second ed. Wiley, New York. Hsiao, J.K., Bartko, J.J., Potter, W.Z. (1989) Diagnosing diagnoses. Archives of General Psychiatry, 46, 664-667. Humphreys, L.G., Swets, J.A. (1991) Comparison of predictive validities measured with biserial correlations and ROCs of signal detection theory. Journal of Applied Psychology, 76, 316-321. Hurt, J. (1980) Estimates of probability for the normal distribution. Aplikace Matematiky, 25, 432-444. Hwang, T.Y., Hu, C.Y. (1990) More comparisons of MLE with UMVUE for exponential families. Ann. Inst. Statist. Math., 42, 65-75. Ismail, R., Jeyaratnam, S., Panchapakesan, S. (1986) Estimation of P(X > Y) for gamma distributions. J. Statist. Comput. Simul., 26, 253-267.
Ivshin, V.V. (1996) Unbiased estimators of P(X < Y) and their variances in the case of uniform and two-parameter exponential distributions. J. Math. Sci., 81(4), 2790-2793. Ivshin, V.V. (1998) On the estimation of the probabilities of a double linear inequality in the case of uniform and two-parameter exponential distributions. J. Math. Sci., 88, 819-827. Ivshin, V.V., Lumelskii, Ya.P. (1993) Unbiased estimators for linear influence in the case of multivariate normal distribution. Proceedings of the Sixth international Vilnius conference on probability theory and mathematical statistics, Vilnius, 1993, 1, 152-153 (in Russian). Ivshin, V.V., Lumelskii, Ya.P. (1994) Unbiased estimators for density functions and probabilities of linear inequalities in the multivariate normal case. Stability Problems for Stochastic Models. Frontiers in Pure and Applied Probability, 3, 71-80. Ivshin, V.V., Lumelskii, Ya.P. (1995) Statistical Estimation Problems in "Stress-Strength" Models. Perm University Press, Perm, Russia. Iwase, K. (1987) On UMVU estimators of Pr(Y < X) in the two-parameter exponential case. Mem. Fac. Hiroshima Univ., 9, 21-24. Jana, P.K. (1994) Estimation of P(Y < X) in the bivariate exponential case due to Marshall-Olkin. J. Indian Statist. Assoc., 31, 25-37. Jana, P.K. (1997) Comparison of some stress-strength reliability estimators. Calcutta Statist. Assoc. Bull., 47, 239-247. Jana, P.K., Roy, D. (1994) Estimation of reliability under stress-strength model in a bivariate exponential set-up. Calcutta Statist. Assoc. Bull., 44, 175-181. Jeevanand, E.S., Nair, N.U. (1994) Estimating P[X > Y] from exponential samples containing spurious observations. Commun. Statist. - Theory Meth., 23, 2629-2642. Jeevanand, E.S. (1997) Bayes estimation of P(X2 < X1) for a bivariate Pareto distribution. Statistician, 46, 93-99. Jeevanand, E.S. (1998) Estimation of reliability under stress-strength model for the Marshall-Olkin bivariate exponential distribution. IAPQR Trans., 23(4), 133-136. Jeffreys, H. (1961) Theory of Probability. Oxford University Press, Oxford, UK. Johnson, B.McK. (1975) Bounds on the variance of the U-statistic for symmetric distributions with shift alternatives. Ann. Statist., 3, 955-958. Johnson, N.L., Kotz, S., Balakrishnan, N. (1994) Continuous Univariate Distributions. Vol. 1. Wiley, New York. Johnson, N.L., Kotz, S., Balakrishnan, N. (1995) Continuous Univariate Distributions. Vol. 2. Wiley, New York. Johnson, N.L., Kotz, S., Balakrishnan, N. (1997) Discrete Multivariate Distributions. Wiley, New York.
Johnson, N.L., Kotz, S., Kemp, A.W. (1992) Univariate Discrete Distributions. Wiley, New York. Johnson, R.A. (1988) Stress-strength Models for Reliability. In Handbook of Statistics. Ed. Krishnaiah, P.R. and Rao, C.R., Vol. 7, Elsevier, North Holland, 27-54. Johnstone, M.A. (1983) Bayesian estimation of reliability in the stress-strength context. J. Washington Acad. of Sci., 73, 140-150. Kass, R.E., Wasserman, L. (1996) The selection of prior distributions by formal rules. J. Amer. Statist. Assoc., 91, 1343-1370. Kakati, M.C. (1987) Multivariate stress-strength model. IAPQR Trans., 12(1), 87-92. Kapur, E.C. (1975) Reliability bounds in probability design. IEEE Trans. Reliab., 24, 193-195. Kattan, A.K.A. (1997) On interference theory for half-alpha distributions. Pakistan J. Statist., 13, 261-266. Kelley, G.D., Kelley, J.A., Schucany, W.R. (1976) Efficient estimation of P(Y < X) in the exponential case. Technometrics, 18, 359-360. Kececioglu, D. (1972) Reliability analysis of mechanical components and systems. Nuclear Eng. Des., 9, 257-290. Kim, G.-H. (1981) Bounds for stress-strength interference via mathematical programming. Naval Res. Log. Quart., 28, 75-81. Kim, D.H., Sang, G.H., Jang, S.C. (2000) Noninformative priors for stress-strength system in Burr-type X model. Journ. Korean Stat. Soc., 29, 17-27. Klebanov, L.B. (1979) Unbiased parametric estimation of probability distributions. Mat. Zametki, 25, 743-750 (in Russian). Klein, J.P., Basu, A.P. (1985) Estimating reliability for bivariate exponential distributions. Sankhyā, B47, 346-353. Kotz, S., Balakrishnan, N., Johnson, N.L. (2000) Continuous Multivariate Distributions. Vol. 1. Wiley, New York. Kotz, S., Johnson, N.L. (2002) Process capability indices. A review, 1992-2000. (With discussion). Journ. Quality Technol., 34, 2-53. Laplace, P. (1812) Théorie Analytique des Probabilités. Courcier, Paris. Lee, G. (1998) Development of matching priors for P(X < Y) in exponential distributions. J. 
Korean Statist. Soc., 27, 421-433. Lee, S., Park, E. (1998) Confidence intervals for the stress-strength models with explanatory variables. J. Korean Statist. Soc., 27, 435-449. Lehmann, E.L. (1959) Testing Statistical Hypotheses. Wiley, NY. Lehmann, E.L., Casella, G. (1998) Theory of Point Estimation. Springer-Verlag, NY. Lenhof, S., Pensky, M. (2002) Estimation of P(X < Y) for beta-distributed random variables. Submitted. Lieberman, G.J., Resnikoff, G.J. (1955) Sampling plans for inspection by variables. J. Amer. Statist. Assoc., 50, 457-516. Lloyd, D.K., Lipow, M. (1962) Reliability, Management, Methods and Mathematics. Prentice-Hall, Englewood Cliffs, NJ. Lumelskii, Ya.P. (1968) Unbiased sufficient estimators of probabilities in the case of the multivariate normal distribution. Vest. MGU, Mathematics, No. 6, 14-17 (in Russian). Lumelskii, Ya.P. (1969a) Confidence limits for linear functions of unknown parameters. Theor. Probab. Appl., 14, 364-367. Lumelskii, Ya.P. (1969b) Unbiased estimators in the case of the Poisson distribution. In Scient. Records of the Perm State University, 218, 234-240, Perm (in Russian). Lumelskii, Ya.P. (1995) On inadmissibility of biased estimators relative to the quadratic loss. J. Math. Sci., 75, 1401-1403. Lumelskii, Ya.P., Pensky, M. (1982) Unbiased estimation of characteristics of random variables. In Mathematical Statistics and Its Applications, 8, 114-122, Tomsk (in Russian). Lumelskii, Ya.P., Pensky, M. (1985) Statistical control and unbiased estimation of deviations of random characteristics. In Proceedings of the All-Union Conference "Application of Multivariate Statistical Analysis in Economy and Quality Control". Tartu, Estonia, 1985, 41-42. Lumelskii, Ya.P., Sapoznikov, P.N. (1969) Unbiased estimators of probability densities. Theor. Veroyat. Primen., 14, 372-380. Mace, A.E. (1964) Sample Size Determination. Reinhold. Madansky, A. (1965) Approximate confidence limits for the reliability of series and parallel systems. Technometrics, 7, 495-503. Maiti, S.S. (1995) Estimation of P(X < Y) in the geometric case. J. Indian Statist. Assoc., 33, 87-91. Mankamo, T. (1977) Common load model. A tool for common cause failure analysis. Technical Report, 31, Electrical Engineering Laboratory, Valtion Teknillinen Tutkimuskeskus Technical Research Center, Helsinki, Finland. Mann, H.B., Whitney, D.R. (1947) On a test whether one of two random variables is stochastically larger than the other. Ann. Math. Statist., 18, 50-60. Maritz, J., Lwin, T. (1989) Empirical Bayes Methods. Chapman & Hall, London. 
Marshall, A.W., Olkin, I. (1967) A multivariate exponential distribution. J. Amer. Statist. Assoc., 62, 30-44. Mathai, A.M. (1997) Jacobians of Matrix Transformations and Functions of Matrix Argument. World Scientific Publ., Singapore. Mazumdar, M. (1970) Some estimates of reliability using interference theory. Naval Res. Logist. Quart., 17, 159-165. McCool, J.I. (1991) Inference on P(Y < X) in the Weibull case. Commun. Statist. - Simul. Comput., 20, 129-148. Melloy, B.J., Cavalier, T.M. (1989) Bounds for the probability of failure resulting from stress/strength interference. IEEE Trans. Reliab., 38, 383-385. Mensing, R. (1984) Personal communication. Metz, C.E. (1989) Some practical issues of experimental design and data analysis
in radiological ROC studies. Investigative Radiology, 24, 234-245. Metz, C.E., Herman, B.A., Shen, J.H. (1998) Maximum likelihood estimation of receiver operating characteristic (ROC) curves from continuously distributed data. Stat. Med., 17, 1033-1053. Miwa, T., Hayter, A.J., Liu, W. (2000) Calculations of level probabilities for normal random variables with unequal variances with applications to Bartholomew's test in unbalanced one-way models. Comput. Statist. Data Anal., 34, 17-32. Miwa, T., Hayter, A.J., Kuriki, S. (2001) The evaluation of general non-central orthant probabilities. J. Royal Stat. Soc. Ser. B, to be published. Morrison, D.F. (1976) Multivariate Statistical Methods. McGraw-Hill, New York. Mukerjee, R., Dey, D.K. (1993) Frequentist validity of posterior quantiles in the presence of the nuisance parameter: high order asymptotics. Biometrika, 80, 499-505. Mukherjee, S.P., Saran, L.K. (1985) Estimation of failure probability from a bivariate normal stress-strength distribution. Microelect. Reliab., 25, 699-702. Myhre, J.M., Saunders, S.C. (1968a) On confidence limits for the reliability of system. Ann. Math. Statist., 39, 1463-1472. Myhre, J.M., Saunders, S.C. (1968b) Comparison of two methods of obtaining approximate confidence intervals for system reliability. Technometrics, 10, 37-49. Nandi, S.B., Aich, A.B. (1994a) A note on estimation of P(X > Y) for some distributions useful in life-testing. IAPQR Trans., 19, 35-44. Nandi, S.B., Aich, A.B. (1994b) A note on confidence bounds for P(X > Y) in bivariate normal samples. Sankhyā, Ser. B, 56, 129-136. Nandi, S.B., Aich, A.B. (1996a) A note on testing hypothesis regarding P(X > Y) in bivariate normal samples. IAPQR Trans., 21, 149-153. Nandi, S.B., Aich, A.B. (1996b) Hypothesis-test for reliability in a stress-strength model with prior information. IEEE Trans. Reliab., 45, 129-131. Nelson, W. (1990) Accelerated Testing. Wiley, New York. Nikulin, M.S., Voinov, V.G. 
(1993) Unbiased estimators of multivariate discrete distributions and chi-square goodness-of-fit test. Qüestiió, 17, 301-326. Nikulin, M., Voinov, V. (1996) Tables of the best possible unbiased estimates for functions of parameters of multinomial and negative multinomial distributions. Journ. Math. Sciences, 81, 2363-2367. Nikulin, M., Voinov, V. (2000) Unbiased estimation in reliability and similar problems. In Recent Advances in Reliability Theory. Methodology, Practice and Inference. Eds. Limnios, M. and Nikulin, M., Birkhauser, Boston, pp. 435-448. Nockemann, C., Heidt, H., Thomsen, N. (1991) Reliability in NDT: ROC study of radiographic weld inspections. Nondestructive Testing and Evaluation International, 24, 235-245. Oskamp, S. (1962) The relationship of clinical experience and training methods
to several criteria of clinical prediction. Psychological Monographs, 76, No. 28. Owen, D.B., Craswell, K.J., Hanson, D.L. (1964) Nonparametric upper confidence bounds for P(Y < X) and confidence limits for P(Y < X) when X and Y are normal. J. Amer. Statist. Assoc., 59, 906-924. Pandit, S.M., Sheikh, A.K. (1980) Reliability and optimal replacement via coefficient of variation. In Proc. Prevention Reliab. Comput., St. Louis, 102. Papadopoulos, A.S. (1983) Empirical Bayes confidence bounds for the Weibull distribution. J. Inform. Optim. Sci., 4, 43-47. Park, J.W., Clark, G.M. (1986) A computational algorithm for reliability bounds in probability design. IEEE Trans. Reliab., 35, 30-31. Patil, G.P., Wani, J.K. (1966) Minimum variance unbiased estimation of the distribution function admitting a sufficient statistics. Ann. Inst. Statist. Math., 18, 39-47. Patnaik, P.B. (1949) The non-central χ²- and F-distributions and their applications. Biometrika, 36, 202-232. Pensky, M. (1982) Unbiased estimation of probabilities defined by linear inequalities. In Application of the random search, 124-132, Kemerovo (in Russian). Pensky, M. (2002) Estimation of probabilities of linear inequalities for independent elliptic random vectors. Sankhyā, to be published. Pensky, M., Takashima, R. (2002) Estimation of P(X < Y) for the generalized gamma distributions. Submitted. Peszek, I., Rukhin, A.L. (1993) Estimating normal distribution function and normal density. Statist. Decis., 11, 391-406. Pettitt, A.N. (1984) Tied, grouped continuous and ordered categorical data: A comparison of two models. Biometrika, 71, 35-42. Pham, T., Almhana, J. (1995) The generalized gamma distribution: its hazard rate and stress-strength model. IEEE Trans. Reliab., 44, 392-397. Pham-Gia, T., Turkkan, N. (1998) Distribution of the linear combination of two general beta variables and applications. Comm. Statist. - Theory Meth., 27, 1851-1869. Pham-Gia, T. 
(2000) Distributions of the ratios of independent beta variables and applications. Comm. Statist. - Theory Meth., 29, 2693-2715. Pieruschka, E. (1963) Principles of Reliability. Prentice Hall, Englewood Cliffs, NJ, pp. 278-281. Prasanta, K.J. (1998) Estimation of P[Y < X] under a bivariate exponential stress-strength model. In Frontiers in Probability and Statistics. Papers from 2nd International Triennial Symposium on Probability and Statistics, Calcutta, December 30, 1994-January 2, 1995, Mukherjee, S.P., Basu, S.K., Sinha, B.K., eds. Narosa Publishing House, New Delhi, 207-214. Priebe, C.E., Cowen, L.J. (1999) A generalized Wilcoxon-Mann-Whitney statistic. Comm. Statist. - Theory Meth., 28, 2871-2878. Proschan, F., Sullo, P. (1976) Estimating the parameters of a multivariate exponential distribution. J. Amer. Statist. Assoc., 71, 465-472.
Pugh, E.L. (1963) The best estimate of reliability in the exponential case. Oper. Res., 11, 57-61. Quenouille, M. (1956) Notes on bias in estimation. Biometrika, 43, 353-360. Raghava Char, A.C.N., Kesava Rao, B., Pandit, S.N.N. (1984) Stress and strength Markov models of the system reliability. Sankhyā, Ser. B, 46, 147-156. Reiser, B., Faraggi, D. (1994) Confidence bounds for Pr(a'x > b'y). Statistics, 25, 107-111. Reiser, B., Faraggi, D., Guttman, I. (1992) Choice of sample size for testing the P(X > Y). Commun. Statist. - Theory Meth., 21, 559-569. Reiser, B., Guttman, I. (1986) Statistical inference for Pr(Y < X): the normal case. Technometrics, 28, 253-257. Reiser, B., Guttman, I. (1987) A comparison of three point estimators for P(Y < X) in the normal case. Comput. Statist. Data Anal., 5, 59-66. Reiser, B., Guttman, I. (1989) Sample size choice for reliability verification in stress-strength models. Can. J. Statist., 17, 253-259. Reiser, B. (2000) Measuring the effectiveness of diagnostic markers in the presence of measurement error through the use of ROC curves. Stat. Med., 19, 2115-2129. Rinco, S. (1983) Estimation of P{Yp > max(Y1, Y2, ..., Yp-1)}: predictive approach in exponential case. Can. J. Statist., 11, 239-244. Rohatgi, V.K. (1989) Unbiased estimation of parametric functions in sampling from two one-truncated parameter families. Austral. J. Statist., 31, 327-332. Roy, D. (1993) Estimation of failure probability under a binomial normal stress-strength distribution. Microelect. Reliab., 33, 2285-2287. Rukhin, A. (1986) Estimating normal tail probabilities. Naval Res. Logist. Quart., 33, 91-99. Sathe, Y.S., Dixit, U.J. (2001) Estimation of P(X < Y) in the negative binomial distribution. J. Statist. Plann. Inference, 93, 83-92. Sathe, Y.S., Shah, S.P. (1981) On estimating P(X > Y) for the exponential distribution. Commun. Statist. - Theory Meth., 10, 39-47. Sathe, Y.S., Varde, S.D. 
(1969) Minimum variance unbiased estimation of reliability for the truncated exponential distribution. Technometrics, 11, 609-619. Sathe, Y.S., Varde, S.D. (1969) On minimum variance unbiased estimation of probability. Ann. Math. Statist., 40, 710-714. Scheaffer, R. (1976) On the computation of certain minimum variance unbiased estimators. Technometrics, 18, 497-499. Schechtman, E. (1983) A conservative nonparametric distribution-free confidence bound for the shift in the change point problem. Comm. Statist. - Theory Meth., 12, 2455-2464. Seber, G.A.F. (1977) Linear Regression Analysis. Wiley, New York. Selvavel, K. (1989) Unbiased estimation in sampling from two one-truncation parameter families when both samples are type II censored. Commun. Statist.
- Theory Meth., 18, 3519-3531. Sen, P.K. (1960) On some convergence properties of U-statistics. Calcutta Stat. Assoc. Bull., 10, 1-18. Sen, P.K. (1967) A note on asymptotically distribution-free confidence intervals for Pr(X < Y) based on two independent samples. Sankhyā, Ser. A, 29, 95-102. Shen, K. (1992) An empirical approach to obtaining bounds for the failure probability through stress-strength interference. Reliab. Eng. Systems Safety, 36(1), 79-84. Shirahata, S. (1993) Estimate of variance of Wilcoxon-Mann-Whitney statistic. J. Japanese Soc. Comput. Statist., 6, 1-10. Shiryaev, A.N. (1996) Probability. Springer-Verlag, New York. Simion, E., Preda, V., Constantinescu, N., Barboi, M. (2000) Reliability analysis of the stress-strength. In Proceedings of the Sixth International Symposium For Design and Technologies For Electronic Modules, Sept. 21-24, 2000, 29-32. Simonoff, J.S., Hochberg, Y., Reiser, B. (1986) Alternative estimation procedures for P(X < Y) in categorized data. Biometrics, 42, 895-907. Singh, N. (1980) On the estimation of Pr(X1 < Y < X2). Commun. Statist. - Theory Meth., 9, 1551-1561. Singh, N. (1981) MVUE of Pr(X < Y) for multivariate normal populations: an application to stress-strength models. IEEE Trans. Reliab., 30, 192-193. Sinha, B.K., Zieliński, R. (1997) Estimating P{X > Y} in exponential model revisited. Statistics, 29, 299-316. Sinha, S.K. (1989) A note on the variance of the uniformly-minimum-variance-unbiased-estimator of the reliability function of exponential life distribution. Calcutta Statist. Assoc. Bull., 38, 237-240. Shah, S.P., Sathe, Y.S. (1982) Erratum: "On estimating P(X > Y) for the exponential distribution", Comm. Statist. - Theory Meth., 1981, 10, 39-47. Comm. Statist. - Theory Meth., 11, 2357. Smirnov, N. (1948) Table for estimating the goodness of fit of empirical distributions. Ann. Math. Statist., 19, 279-281. Sprent, P. (1989) Applied Nonparametric Statistical Methods. Chapman & Hall, London. Stacy, E.W. 
(1962) A generalization of the gamma distribution. Ann. Math. Stat., 33, 1187-1192. Sun, D., Ghosh, M., Basu, A.P. (1998) Bayesian analysis for a stress-strength system under noninformative priors. Canad. J. Statist., 26, 323-332. Surles, J.G., Padgett, W.J. (1998) Inference for P(Y < X) in the Burr type X model. J. Appl. Statist. Sci., 7, 225-238. Surles, J.G., Padgett, W.J. (2001) Inference for reliability and stress-strength for a scaled Burr type X distribution. Lifetime Data Analysis, 7, 187-200. Swets, J.A., Pickett, R.M. (1982) Evaluation of Diagnostic Systems: Methods from Signal Detection Theory. Academic Press, New York.
Swets, J.A. (1996) Signal Detection Theory and ROC Analysis in Psychology and Diagnostics. Collected Papers. Lawrence Erlbaum Assoc., New Jersey. Teskin, O.I., Kostyukova, T.M. (1991) Interval estimation of exponent of reliability using the "load-strength" rejection method. Journ. Soviet Math., 56, 2434-2438. Thompson, R.D., Basu, A.P. (1993) Bayesian reliability of stress-strength systems. In Advances in Reliability, ed. Basu, A.P., Elsevier Science Publishers, Amsterdam, 411-421. Tong, H. (1974) A note on the estimation of P(Y < X) in the exponential case. Technometrics, 16, 625. Errata: Technometrics, 17, 395. Tong, H. (1977) On the estimation of P(Y < X) for exponential families. IEEE Trans. Reliab., 26, 54-56. Tsui, K.W., Weerahandi, S. (1989) Generalized p-values in significance testing of hypotheses in the presence of nuisance parameters. J. Amer. Statist. Assoc., 84, 602-607. Tukey, J.W. (1958) A problem of Berkson, and minimum variance orderly estimators. Ann. Math. Statist., 29, 588-592. Ury, H.K. (1972) On distribution-free confidence bounds for Pr{Y < X}. Technometrics, 14, 577-581. Ury, H.K., Wiggins, A.D. (1976) A general upper bound for the variance of the Wilcoxon-Mann-Whitney U-statistic for symmetric distributions with shift alternatives. Brit. J. Math. Statist. Psychol., 29, 263-267. Ury, H.K., Wiggins, A.D. (1979) Distribution-free confidence bounds for Pr{Y < X} when F(x) and G(y) = F(x - θ) are continuous and symmetric. Commun. Statist. - Theory Meth., 8, 1247-1253. Van Dantzig, D. (1951) On the consistency and power of Wilcoxon's two-sample test. Koninklijke Nederlandse Akademie van Wetenschappen Proceedings, Ser. A, 54, 1-8. Varde, S.D. (1969) Life testing and reliability estimation for the two-parameter exponential distribution. J. Amer. Statist. Assoc., 64, 621-631. Vedernikova, A.P., Lumelskii, Ya.P. (1991) Unbiased estimation of linear functionals in the case of inverse normal distribution. J. Soviet Math., 56, 2407-2409. 
Vysokovskii, E.S. (1966) Reliability of tools used in semi-automatic lathes. Russian Eng. J., 46(6), 46-50. Voinov, V.G. (1984) On unbiased estimation of P(Y < X) in the normal case. Zapiski Nauchn. Sem. LOMI, 136, 5-12 (in Russian). Voinov, V.G., Nikulin, M.S. (1993) Unbiased Estimators and Their Applications. Volume 1: Univariate Case. Kluwer Academic Publishers, Dordrecht, Netherlands. Voinov, V.G., Nikulin, M.S. (1996) Unbiased Estimators and Their Applications. Volume 2: Multivariate Case. Kluwer Academic Publishers, Dordrecht, Netherlands. Wang, J.D., Liu, T.S. (1996) Fuzzy reliability using a discrete stress-strength
interference model. IEEE Trans. Reliab., 45, 145-149. Weerahandi, S., Johnson, R.A. (1992) Testing reliability in a stress-strength model when X and Y are normally distributed. Technometrics, 34, 83-91. Wilcoxon, F. (1945) Individual comparisons by ranking methods. Biometrical Bull., 1, 80-83. Wolfe, D.A., Hogg, R.V. (1971) On constructing statistics and reporting data. The American Statistician, 25, 27-30. Woodward, W.A., Grey, H.L. (1975) Minimum variance estimation in the gamma distribution. Commun. Statist., 4, 907-922. Woodward, W.A., Kelley, G.D. (1977) Minimum variance unbiased estimation of P(X < Y) in the normal case. Technometrics, 19, 95-98. Wu, K.F., Fan, J.C., Li, Y.W. (1990) Strongly consistent estimation for a multivariate linear relationship model. Acta Math. Appl. Sinica, 13, 90-98 (in Chinese). Yang, M.C.K., Mo, T.C. (1984) Some improvements on the Birnbaum-McCarty bound for P(Y < X). Statist. Probab. Letters, 2, 127-132. Yang, M.C.K., Mo, T.C. (1985) Distribution-free confidence bounds for Pr{Y < X} of an r-out-of-k system. IEEE Trans. Reliab., 34, 499-503. Yang, M.C.K., Mo, T.C. (1985) Distribution-free confidence bounds for P(X1 + X2 + ... + Xk < z). J. Amer. Statist. Assoc., 80, 227-230. (Correction: J. Amer. Statist. Assoc., 81, 1132.) Yang, R., Berger, J. (1997) A catalog of noninformative priors. ISDS Discussion Paper 97-42, Duke University. Yu, Q.Q., Govindarajulu, Z. (1995) Admissibility and minimaxity of the UMVU estimator of P{X < Y}. Ann. Statist., 23, 598-607. Zacks, S. (1971) The Theory of Statistical Inference. Wiley, New York. Zalkikar, J.N., Tiwari, R.C., Jammalamadaka, S.R. (1986) Bayes and empirical Bayes estimation of the probability that Z > X + Y. Commun. Statist. - Theory Meth., 15, 3079-3101. Zaremba, S.C. (1965) Note on the Wilcoxon-Mann-Whitney statistic. Ann. Math. Statist., 36, 1058-1060.
Index

F-distribution, 114, 117, 128
T-distribution
    multivariate, 73, 86, 94
    noncentral, 111, 113, 127, 184, 185, 188, 215
    univariate, 73, 74, 87
U-statistic, 176, 180
p-value, 34, 35, 210, 217, 219
    generalized, 126, 129, 130, 207
s-out-of-k system, 170-172

acceptance region, 34
ambient temperature, 205, 206
average ranks, 141

Bayes
    credible set, 33, 123
    estimation of R, 23, 28, 71, 82, 92, 158
    predictive approach, 29
    test, 35, 131
Bayes method for construction of UMVUE, 19
beta distribution, 50, 58
binomial distribution, 103
bivariate Pareto distribution, 107
Bonferroni inequality, 31
bootstrap, 132, 150
    confidence interval, 133, 136, 157
    estimator of the variance, 134
    samples, 133, 158
Burr type X distribution, 43, 54, 71, 77, 117
    scaled, 58
Burr type XII distribution, 43, 55, 71
BVED, 95
    Gumbel, 96, 100
    Marshall and Olkin, 96, 97
    Block and Basu, 97, 100
    Freund, 96, 100

categorized data, 189, 216, 225
chi-squared distribution, 56, 114, 116, 175, 187, 194
Churchman two-stimuli design, 212
clinical trials, 218, 219, 221
coherent monotone structure, 176
confidence bounds
    lower, 30
    upper, 30
confidence coefficient, 30
confidence interval, 30
    asymptotic, 31
    exact, 31
conjugate prior, 25, 27, 28, 38, 44, 45, 72
count data, 174
Cramer-Rao-Blackwell theorem, 18, 41
critical stress, 202
cumulative damage, 230

Diabetes Control and Complication Trials, 220

elliptical distribution, 78
empirical Bayes estimation, 23, 29, 158
empirical distribution function, 147, 148, 157, 161
estimator
    admissible, 142
    minimax, 142

factorization theorem, 17
Fisher information matrix, 26

gamma distribution, 49, 56, 63, 114, 127
generalized extreme region, 130
generalized gamma distribution, 55, 69, 115
generalized test variable, 130
geometric distribution, 104, 228

half-normal distribution, 56
highest posterior density (HPD), 33, 38, 125
hypergeometric series, 28, 45, 50, 85, 87, 91, 174
    definition, 28

incomplete beta function, 37, 50, 91, 115, 128
incomplete gamma function, 57, 132
inverse Gaussian distribution, 216

jackknife estimator, 148, 149
Jeffreys's prior, 26, 28, 73-77, 88, 92, 94

Kolmogorov-Smirnov statistic, 149, 151, 152

likelihood function, 12, 13, 32, 40, 73, 75, 82, 93, 99, 191
likelihood ratio, 111
    test, 174
lognormal distribution, 43, 58, 70, 211
loss function, 143, 160
loss of memory property, 96, 97

matching prior, 26, 72, 75-77, 125
mixed inverse Gaussian distribution, 216
multinomial distribution, 102, 103, 189, 217
multivariate Cauchy distribution, 86, 87

negative binomial distribution, 104
noninformative prior, 26, 27, 45, 71, 75, 207
normal distribution, 110, 120, 127, 178, 210, 211, 225
    multivariate, 72, 74, 88-90, 92, 131, 179, 198, 204, 214
    univariate, 32, 45, 47, 59, 60, 72, 112, 118, 123, 127, 129, 152, 154, 182, 184, 212, 213, 227

one-parameter exponential distribution, 14, 20, 27, 36, 43, 74, 178
ordering of distributions, 1

parallel system, 170, 172, 176, 197
Pareto distribution, 43, 52, 70
Pearson type II distribution, 84, 90
pivotal quantity, 31, 37, 135, 136, 155-157, 167, 193, 194, 219, 221
Poisson distribution
    multivariate, 101
    univariate, 103
posterior pdf, 24, 25, 27-29, 33, 38, 44, 45, 73, 75, 77, 78, 123-125, 131, 160, 213
power distribution, 43, 59, 70
process capability index, 230

Rayleigh distribution, 42, 43, 56
reference prior, 27, 28, 74-77, 125
rejection region, 34, 131
reliability, 2-4, 170, 171, 187, 195, 196, 198, 205, 207, 211-214
ROC curve, 223-227

stochastically larger, 1
sufficient statistic, 17-21, 40, 41, 61, 65, 66, 106, 120, 183
system reliability, 5-7, 169-174, 197

truncation parameter family, 51, 64, 67
    doubly, 51, 64
    lower, 51, 67, 68
    upper, 51, 68
two-parameter exponential distribution, 43

uniform distribution, 51, 52, 66, 68, 181, 229
uniform noninformative prior, 26, 213

Weibull distribution, 43, 53, 56, 75, 125
Wilcoxon test, 203
Wilcoxon-Mann-Whitney (WMW) statistic, 6, 140, 201, 219