$$
\int_t^\infty u^{-p}e^{-iu}\,du=\alpha_p+i\beta_p+\frac{t^{1-p}}{p-1}-\frac{it^{2-p}}{p-2}+\frac{t^{3-p}}{2(3-p)}-\frac{it^{4-p}}{6(4-p)}+o(t^2).
$$
From (2.1) we have for small $t>0$
$$
\varphi_p(t)=\Bigl(1-it-\frac{t^2}{2}+\frac{it^3}{6}-\cdots\Bigr)\Bigl\{1+\frac{p}{2}\,it+(r_p+i\delta_p)t^{p-1}+\frac{(p-1)(3-p)}{2}\,t^2+\cdots\Bigr\},
$$
where $r_p$ is some negative constant and $\delta_p$ is some real constant. Hence it follows from (2.1) that
$$
\varphi_p(t)=1+\frac{p-2}{2}\,it+\Bigl(r_p+i\delta_p\frac{t}{|t|}\Bigr)|t|^{p-1}+\frac{(p-2)(3-p)}{2}\,t^2+\cdots.
$$
Letting
$$
\alpha_n(t)=\mathrm{E}[\exp(itZ_n)]\quad\text{with}\quad Z_n=n^{-1/(p-1)}\sum_{i=1}^n\Bigl(X_i-\frac{p-2}{2}\Bigr),
$$
we obtain
$$
\log\alpha_n(t)=\Bigl(r_p+i\delta_p\frac{t}{|t|}\Bigr)|t|^{p-1}+\frac{(p-2)(3-p)}{2}\,n^{-(3-p)/(p-1)}t^2+\cdots. \tag{2.6}
$$
As is immediately seen from the above, the asymptotic distribution of $Z_n$ is a stable law with characteristic exponent $p-1$ ([2], [3]), and the order of the next term is $n^{-(3-p)/(p-1)}$ ($>n^{-1}$). Next we shall obtain the asymptotic expansion of the density function $g(x;p-1)$ corresponding to $\alpha_n(t)$ given by (2.6). From (2.6) we have for $2<p<3$
If $p>3$, then letting $\varphi_n(t)=\mathrm{E}[\exp(itZ_n)]$ with $Z_n=\sum_{i=1}^nX_i/\sqrt n$ and $\log\alpha_n(t)=n\log\varphi_p(t/\sqrt n)$, we have
$$
\log\alpha_n(t)=-\frac{1}{2}t^2+(p-1)\alpha_p^*\,n^{(3-p)/2}|t|^{p-1}+o\bigl(n^{(3-p)/2}\bigr)
$$
for $3<p$.
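The stable scaling $n^{-1/(p-1)}$ can be seen in simulation. The sketch below is our own illustration, not from the paper, with an assumed Pareto-type law whose tail exponent equals $p-1$; it checks that the spread of the centered sums is comparable across sample sizes under the stable normalization.

```python
import random

# Our own illustration (not from the paper): for X with Pareto tail P(X > x) = x^{-a},
# 1 < a < 2 (here a plays the role of p - 1), the centered sums require the stable
# normalization n^{-1/a} * sum(X_i - EX) rather than n^{-1/2}; cf. the exponent in (2.6).
random.seed(7)
a = 1.5                                    # tail exponent p - 1, i.e. p = 2.5
mean = a / (a - 1.0)                       # E X for the Pareto(a) law on [1, infinity)

def scaled_sum(n):
    s = sum(random.random() ** (-1.0 / a) for _ in range(n)) - n * mean
    return s / n ** (1.0 / a)              # stable normalization n^{-1/(p-1)}

def iqr(vals):
    v = sorted(vals)
    return v[3 * len(v) // 4] - v[len(v) // 4]

i1 = iqr([scaled_sum(400) for _ in range(800)])
i2 = iqr([scaled_sum(1600) for _ in range(800)])
print(i1, i2)                              # comparable spread: the normalization is stable
```

Under the square-root normalization the corresponding spread would grow like $n^{1/a-1/2}$ instead of stabilizing.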
with the estimators $\hat\theta_n$ in $\mathcal U_k$ is contained in $\Phi_{1/2}$. In order to obtain the upper bound of $\lim_{n\to\infty}P_{n,\theta_0}(A_{\hat\theta_n,\theta_0})$ in $\mathcal U_k$ it is sufficient to find a sequence $\{\varphi_n\}$ of tests which maximizes $\lim_{n\to\infty}E_{n,\theta_0}(\varphi_n)$. It is shown by the Neyman-Pearson fundamental lemma that the maximizing test $\varphi_n^*$ in $\Phi_{1/2}$ has the rejection region $S_n$ satisfying
$$
\sum_{i=1}^n\log\frac{f(X_i,\theta_0-rc_n^{-1})}{f(X_i,\theta_0)}\geq k_n,
$$
where $k_n$ is some constant. Then it follows from (2.3) that the upper
MASAFUMI AKAHIRA AND KEI TAKEUCHI
bound of $\lim_{n\to\infty}P_{n,\theta_0}(A_{\hat\theta_n,\theta_0})$ in $\mathcal U_k$ is given by $\lim_{n\to\infty}E_{n,\theta_0}(\varphi_n^*)$. Hence the $k$th order asymptotic distribution of the DLE $\hat\theta_{DL}$ attains the bound of the $k$th order asymptotic distributions of $k$th order AMU estimators at $r$. In a similar way to the case $r>0$, we also obtain for $r<0$ the upper bound of $\lim_{n\to\infty}P_{n,\theta_0}(A_{\hat\theta_n,\theta_0})$ in $\mathcal U_k$ of the same form. Hence the desired result also holds for the case $r<0$. In later sections it will be seen that $\hat\theta_{DL}$ is asymptotically efficient up to second order. Note that the DLE usually depends on $r$ in cases of third order and higher. In the subsequent discussion we shall deal with the case when $c_n=\sqrt n$.
3. Second order asymptotic efficiency

Suppose that $X_1,X_2,\dots,X_n,\dots$ is a sequence of i.i.d. random variables with a density $f(x,\theta)$ satisfying (i)-(iv).
(i) $\{x: f(x,\theta)>0\}$ does not depend on $\theta$;
(ii) For almost all $x[\mu]$, $f(x,\theta)$ is three times continuously differentiable in $\theta$;
(iii) For each $\theta\in\Theta$,
$$
0<I(\theta)=E_\theta\Bigl[\Bigl\{\frac{\partial}{\partial\theta}\log f(X,\theta)\Bigr\}^2\Bigr]<\infty;
$$
(iv) There exist
$$
J(\theta)=E_\theta\Bigl[\Bigl\{\frac{\partial^2}{\partial\theta^2}\log f(X,\theta)\Bigr\}\Bigl\{\frac{\partial}{\partial\theta}\log f(X,\theta)\Bigr\}\Bigr]
\quad\text{and}\quad
K(\theta)=E_\theta\Bigl[\Bigl\{\frac{\partial}{\partial\theta}\log f(X,\theta)\Bigr\}^3\Bigr],
$$
and the following holds:
$$
E_\theta\Bigl[\frac{\partial^3}{\partial\theta^3}\log f(X,\theta)\Bigr]=-3J(\theta)-K(\theta).
$$
DISCRETIZED LIKELIHOOD METHODS

In [11] we have shown in the following way that an MLE is second order asymptotically efficient. Let $\hat\theta_{ML}$ be a maximum likelihood estimator. By Taylor expansion we have
$$
0=\sum_{i=1}^n\frac{\partial}{\partial\theta}\log f(X_i,\hat\theta_{ML})
=\sum_{i=1}^n\frac{\partial}{\partial\theta}\log f(X_i,\theta)
+\Bigl\{\sum_{i=1}^n\frac{\partial^2}{\partial\theta^2}\log f(X_i,\theta)\Bigr\}(\hat\theta_{ML}-\theta)
+\frac{1}{2}\sum_{i=1}^n\Bigl\{\frac{\partial^3}{\partial\theta^3}\log f(X_i,\theta^*)\Bigr\}(\hat\theta_{ML}-\theta)^2,
$$
where $|\theta^*-\theta|\leq|\hat\theta_{ML}-\theta|$. Putting $T_n=\sqrt n(\hat\theta_{ML}-\theta)$ we obtain
$$
0=\frac{1}{\sqrt n}\sum_{i=1}^n\frac{\partial}{\partial\theta}\log f(X_i,\theta)
+\frac{1}{n}\Bigl\{\sum_{i=1}^n\frac{\partial^2}{\partial\theta^2}\log f(X_i,\theta)\Bigr\}T_n
+\frac{1}{2n\sqrt n}\sum_{i=1}^n\Bigl\{\frac{\partial^3}{\partial\theta^3}\log f(X_i,\theta^*)\Bigr\}T_n^2.
$$
Set
$$
Z_1(\theta)=\frac{1}{\sqrt n}\sum_{i=1}^n\frac{\partial}{\partial\theta}\log f(X_i,\theta);\qquad
Z_2(\theta)=\frac{1}{\sqrt n}\sum_{i=1}^n\Bigl\{\frac{\partial^2}{\partial\theta^2}\log f(X_i,\theta)+I(\theta)\Bigr\};\qquad
W(\theta)=\frac{1}{n}\sum_{i=1}^n\frac{\partial^3}{\partial\theta^3}\log f(X_i,\theta).
$$
Then it follows that $W(\theta)$ converges in probability to $-3J(\theta)-K(\theta)$. Hence the following theorem holds:

THEOREM 1 ([11]). Under conditions (i)-(iv),
$$
\sqrt n(\hat\theta_{ML}-\theta)=\frac{Z_1(\theta)}{I(\theta)}
+\frac{1}{\sqrt n}\Bigl\{\frac{Z_1(\theta)Z_2(\theta)}{I(\theta)^2}
-\frac{3J(\theta)+K(\theta)}{2I(\theta)^3}Z_1(\theta)^2\Bigr\}
+o_p\Bigl(\frac{1}{\sqrt n}\Bigr) \tag{3.1}
$$
up to order $n^{-1/2}$ as $n\to\infty$. Put
$$
\hat\theta_{ML}^*=\hat\theta_{ML}+\frac{K(\hat\theta_{ML})}{6nI(\hat\theta_{ML})^2}.
$$
Then $\hat\theta_{ML}^*$ is second order AMU. From Theorem 1 we have established the following:

THEOREM 2 ([11]). Under conditions (i)-(iv), $\hat\theta_{ML}^*$ is second order asymptotically efficient.
In the sequel we obtain the same result for the DLE. We further assume the following:
(v) For given $r$, the function
$$
a_n(\theta,r)=\log L(\theta+rn^{-1/2},\tilde x_n)-\log L(\theta,\tilde x_n)
$$
is locally monotone in $\theta$ with probability larger than $1-o(n^{-1})$.

Remark. In the usual situation this is generally true, since $(1/n)(\partial^2/\partial\theta^2)$
$\log L(\theta,\tilde x_n)$ is asymptotically equal to $-I(\theta)$ $(<0)$, and $a_n(\theta,r)$ is of smaller order than $n$, usually of constant order ($O(1)$), as is shown below. Let $\hat\theta_n$ be a DLE. Since
$$
\sum_{i=1}^n\log f(X_i,\hat\theta_n+rn^{-1/2})-\sum_{i=1}^n\log f(X_i,\hat\theta_n)=a_n,
$$
it follows by Taylor expansion that
$$
\frac{r}{\sqrt n}\sum_{i=1}^n\frac{\partial}{\partial\theta}\log f(X_i,\hat\theta_n)
+\frac{r^2}{2n}\sum_{i=1}^n\frac{\partial^2}{\partial\theta^2}\log f(X_i,\hat\theta_n)
+\frac{r^3}{6n\sqrt n}\sum_{i=1}^n\frac{\partial^3}{\partial\theta^3}\log f(X_i,\theta^*)=a_n,
$$
where $|\theta^*-\hat\theta_n|\leq r/\sqrt n$.
Since $(1/n)\sum_{i=1}^n\{(\partial^3/\partial\theta^3)\log f(X_i,\theta)\}$ converges in probability to $-3J(\theta)-K(\theta)$, it is seen that
$$
rZ_1(\hat\theta_n)+\frac{r^2}{2}\Bigl\{-I(\hat\theta_n)+\frac{1}{\sqrt n}Z_2(\hat\theta_n)\Bigr\}
-\frac{r^3}{6\sqrt n}\{3J(\hat\theta_n)+K(\hat\theta_n)\}+o_p\Bigl(\frac{1}{\sqrt n}\Bigr)=a_n. \tag{3.2}
$$
On the other hand we have
$$
Z_1(\hat\theta_n)=Z_1(\theta)+\frac{1}{\sqrt n}\{Z_2(\theta)-\sqrt n\,I(\theta)\}T_n
-\frac{3J(\theta)+K(\theta)}{2\sqrt n}T_n^2+o_p\Bigl(\frac{1}{\sqrt n}\Bigr), \tag{3.3}
$$
where $T_n=\sqrt n(\hat\theta_n-\theta)$. It follows that
$$
T_n=\frac{1}{I(\theta)}\Bigl\{-Z_1(\hat\theta_n)+Z_1(\theta)+\frac{1}{\sqrt n}Z_2(\theta)T_n
-\frac{3J(\theta)+K(\theta)}{2\sqrt n}T_n^2\Bigr\}+o_p\Bigl(\frac{1}{\sqrt n}\Bigr). \tag{3.4}
$$
Since
$$
I(\hat\theta_n)=I(\theta)+\frac{1}{\sqrt n}\{2J(\theta)+K(\theta)\}T_n+o_p\Bigl(\frac{1}{\sqrt n}\Bigr),
$$
$$
J(\hat\theta_n)=J(\theta)+\frac{1}{\sqrt n}J'(\theta)T_n+o_p\Bigl(\frac{1}{\sqrt n}\Bigr),\qquad
K(\hat\theta_n)=K(\theta)+\frac{1}{\sqrt n}K'(\theta)T_n+o_p\Bigl(\frac{1}{\sqrt n}\Bigr),\qquad
Z_2(\hat\theta_n)=Z_2(\theta)-J(\theta)T_n+o_p(1),
$$
it follows from (3.2) and (3.4) that
$$
T_n=-\frac{a_n'}{rI}-\frac{r^2(3J+K)}{6I\sqrt n}+\frac{Z_1}{I}+\frac{Z_1Z_2}{I^2\sqrt n}
-\frac{(3J+K)Z_1^2}{2I^3\sqrt n}
+\frac{r}{2I\sqrt n}\Bigl(Z_2-\frac{3J+K}{I}Z_1\Bigr)+o_p\Bigl(\frac{1}{\sqrt n}\Bigr)
$$
up to order $n^{-1/2}$, where $a_n=-(r^2I/2)+a_n'$ with $a_n'=O(1/\sqrt n)$, so that $\hat\theta_n$ is AMU. In order to have second order asymptotic median unbiasedness of $\hat\theta_n$ we put
$$
a_n'=-\frac{r^3\{3J(\theta)+K(\theta)\}}{6\sqrt n}-\frac{rK(\theta)}{6I(\theta)\sqrt n}+o\Bigl(\frac{1}{\sqrt n}\Bigr),
$$
and we denote by $T_n^*$ the $T_n$ corresponding to this particular value of $a_n$.
Then we may also denote $T_n^*=\sqrt n(\hat\theta_n^*-\theta)$, and it is shown that $\hat\theta_n^*$ is second order AMU. Hence we have established the following theorem.

THEOREM 3. Under conditions (i)-(v), the DLE $\hat\theta_n^*$ with $a_n$ defined above satisfies
$$
T_n^*=\sqrt n(\hat\theta_n^*-\theta)
=\frac{Z_1(\theta)}{I(\theta)}
+\frac{1}{\sqrt n}\Bigl\{\frac{K(\theta)}{6I(\theta)^2}
+\frac{Z_1(\theta)Z_2(\theta)}{I(\theta)^2}
-\frac{3J(\theta)+K(\theta)}{2I(\theta)^3}Z_1(\theta)^2
+\frac{r}{2I(\theta)}\Bigl(Z_2(\theta)-\frac{3J(\theta)+K(\theta)}{I(\theta)}Z_1(\theta)\Bigr)\Bigr\}
+o_p\Bigl(\frac{1}{\sqrt n}\Bigr)
$$
up to order $n^{-1/2}$ as $n\to\infty$.

Remark.
Since we have
$$
E_\theta\Bigl[Z_1(\theta)\Bigl\{Z_2(\theta)-\frac{3J(\theta)+K(\theta)}{I(\theta)}Z_1(\theta)\Bigr\}\Bigr]=0,
$$
we obtain
$$
V_\theta(T_n^*)=\frac{1}{I(\theta)}+o\Bigl(\frac{1}{\sqrt n}\Bigr),
$$
and the third order cumulant of $T_n^*$ is equal to that of the MLE up to the order $n^{-1/2}$. Hence the asymptotic expansion of the distribution of $T_n^*$ is equal to that of $\sqrt n(\hat\theta_{ML}-\theta)$ up to the order $n^{-1/2}$. Therefore the DLE $\hat\theta_n^*$ is asymptotically equivalent to the MLE $\hat\theta_{ML}$ up to that order.
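The discretized likelihood equation itself is easy to solve numerically. The following sketch is our own illustration, not a prescription of the paper: exponential model, bisection, and the leading choice $a_n=-r^2I/2$ with $I$ estimated at the MLE. The resulting DLE stays within $O(1/\sqrt n)$ of the MLE, in line with the equivalence just stated.

```python
import math, random

# Discretized likelihood equation: sum_i [log f(X_i, t + r/sqrt(n)) - log f(X_i, t)] = a_n,
# solved by bisection for the exponential model f(x, t) = t exp(-t x).
def loglik(xs, t):
    return len(xs) * math.log(t) - t * sum(xs)

def dle(xs, r, a_n):
    n = len(xs)
    h = r / math.sqrt(n)
    g = lambda t: loglik(xs, t + h) - loglik(xs, t) - a_n
    lo, hi = 0.05, 50.0
    # g is decreasing in t, which reflects the local monotonicity in condition (v)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

random.seed(2)
theta, n, r = 1.0, 500, 1.0
xs = [random.expovariate(theta) for _ in range(n)]
theta_ml = n / sum(xs)
a_n = -r ** 2 * (1.0 / theta_ml ** 2) / 2.0   # first-order choice a_n = -r^2 I/2, I plugged in
est = dle(xs, r, a_n)
print(abs(est - theta_ml))                    # small: DLE and MLE differ only at order 1/sqrt(n)
```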
4. Third order asymptotic efficiency

We proceed to the problem of third order asymptotic efficiency. We further assume the following:
(vi) For almost all $x[\mu]$, $f(x,\theta)$ is four times continuously differentiable in $\theta$;
(vii) There exist
$$
L(\theta)=E_\theta\Bigl[\Bigl\{\frac{\partial^3}{\partial\theta^3}\log f(X,\theta)\Bigr\}\Bigl\{\frac{\partial}{\partial\theta}\log f(X,\theta)\Bigr\}\Bigr],\qquad
M(\theta)=E_\theta\Bigl[\Bigl\{\frac{\partial^2}{\partial\theta^2}\log f(X,\theta)\Bigr\}^2\Bigr],
$$
$$
N(\theta)=E_\theta\Bigl[\Bigl\{\frac{\partial^2}{\partial\theta^2}\log f(X,\theta)\Bigr\}\Bigl\{\frac{\partial}{\partial\theta}\log f(X,\theta)\Bigr\}^2\Bigr],\qquad
H(\theta)=E_\theta\Bigl[\Bigl\{\frac{\partial}{\partial\theta}\log f(X,\theta)\Bigr\}^4\Bigr],
$$
and the following holds:
$$
E_\theta\Bigl[\frac{\partial^4}{\partial\theta^4}\log f(X,\theta)\Bigr]=-4L(\theta)-3M(\theta)-6N(\theta)-H(\theta).
$$
Let $\hat\theta_n$ be a DLE. Since
$$
\sum_{i=1}^n\log f(X_i,\hat\theta_n+rn^{-1/2})-\sum_{i=1}^n\log f(X_i,\hat\theta_n)=a_n,
$$
it follows that
$$
\frac{r}{\sqrt n}\sum_{i=1}^n\frac{\partial}{\partial\theta}\log f(X_i,\hat\theta_n)
+\frac{r^2}{2n}\sum_{i=1}^n\frac{\partial^2}{\partial\theta^2}\log f(X_i,\hat\theta_n)
+\frac{r^3}{6n\sqrt n}\sum_{i=1}^n\frac{\partial^3}{\partial\theta^3}\log f(X_i,\hat\theta_n)
+\frac{r^4}{24n^2}\sum_{i=1}^n\frac{\partial^4}{\partial\theta^4}\log f(X_i,\theta^*)=a_n,
$$
162 DISCRETIZED LIKELIHOOD METHODS 49 n
(1/n) Z (a'lao') log f(X;,, 0) converges in i=1 probability to -4L(0)-3M(0)-6N(6)-H(0), it is seen that
where
I0*-o1
Since
Z1(on)+ 2 { -I(B&)+ _ + 6n Z3(B
Z2(Bn)}
ban
{3J(9n) +K(on)}
4L(9n )+ 3M(&)+6N(Bn)+ H(Bn)}_ r , n)- 24n {
where Z3(B)= 1
' {iog f(x, 0)+ 3J(8)+K(0)}
Hence (4.1)
Z,(on)- rn + 2 I(Bn)- rn Z2(&)+ 6rn {3J(en)+K(&)} - 6n Zs(On)+- 24n {4L(Bn)+3M(Bn)+6N(Bn)+H(en)}
On the other hand we have
$$
Z_1(\hat\theta_n)=Z_1(\theta)+\frac{1}{\sqrt n}\{Z_2(\theta)-\sqrt n\,I(\theta)\}T_n
-\frac{3J(\theta)+K(\theta)}{2\sqrt n}T_n^2
-\frac{1}{6n}\{4L(\theta)+3M(\theta)+6N(\theta)+H(\theta)\}T_n^3+o_p\Bigl(\frac{1}{n}\Bigr),
$$
where $T_n=\sqrt n(\hat\theta_n-\theta)$. We obtain
$$
T_n=\frac{1}{I(\theta)}\Bigl[-Z_1(\hat\theta_n)+Z_1(\theta)+\frac{1}{\sqrt n}Z_2(\theta)T_n
-\frac{3J(\theta)+K(\theta)}{2\sqrt n}T_n^2
-\frac{1}{6n}\{4L(\theta)+3M(\theta)+6N(\theta)+H(\theta)\}T_n^3\Bigr]+o_p\Bigl(\frac{1}{n}\Bigr). \tag{4.2}
$$
Since
$$
I(\hat\theta_n)=I(\theta)+\frac{1}{\sqrt n}\{2J(\theta)+K(\theta)\}T_n
+\frac{1}{2n}\{2L(\theta)+2M(\theta)+5N(\theta)+H(\theta)\}T_n^2+o_p\Bigl(\frac{1}{n}\Bigr);
$$
$$
J(\hat\theta_n)=J(\theta)+\frac{1}{\sqrt n}\{L(\theta)+M(\theta)+N(\theta)\}T_n+o_p\Bigl(\frac{1}{\sqrt n}\Bigr);\qquad
K(\hat\theta_n)=K(\theta)+\frac{1}{\sqrt n}\{3N(\theta)+H(\theta)\}T_n+o_p\Bigl(\frac{1}{\sqrt n}\Bigr);
$$
$$
L(\hat\theta_n)=L(\theta)+\frac{1}{\sqrt n}L'(\theta)T_n+o_p\Bigl(\frac{1}{\sqrt n}\Bigr),
$$
and similarly for $M(\hat\theta_n)$, $N(\hat\theta_n)$ and $H(\hat\theta_n)$;
$$
Z_2(\hat\theta_n)=Z_2(\theta)-J(\theta)T_n-\frac{1}{2\sqrt n}\{2L(\theta)+M(\theta)+N(\theta)\}T_n^2+o_p\Bigl(\frac{1}{\sqrt n}\Bigr);\qquad
Z_3(\hat\theta_n)=Z_3(\theta)+o_p(1),
$$
it follows from (4.1) that
$$
Z_1(\hat\theta_n)=\frac{a_n}{r}+\frac{r}{2}I(\theta)
-\frac{r}{2\sqrt n}Z_2(\theta)
+\frac{r}{2\sqrt n}\{3J(\theta)+K(\theta)\}T_n
+\frac{r^2}{6\sqrt n}\{3J(\theta)+K(\theta)\}
-\frac{r^2}{6n}Z_3(\theta)
+\frac{r^3}{24n}\mathcal L(\theta)
+\frac{r}{4n}\mathcal L(\theta)T_n^2
+\frac{r^2}{6n}\mathcal L'(\theta)T_n
+o_p\Bigl(\frac{1}{n}\Bigr)
$$
up to order $n^{-1}$ as $n\to\infty$, where $\mathcal L=4L+3M+6N+H$ and $\mathcal L'=3L+3M+6N+H$. Under conditions (i)-(vii) we then have from (4.2)
$$
T_n=-\frac{a_n''}{rI}+\frac{Z_1}{I}
+\frac{1}{\sqrt n}\Bigl\{\frac{K}{6I^2}
+\frac{r}{2I}\Bigl(Z_2-\frac{3J+K}{I}Z_1\Bigr)
+\frac{Z_1Z_2}{I^2}-\frac{(3J+K)Z_1^2}{2I^3}\Bigr\}
+\frac{1}{n}R+o_p\Bigl(\frac{1}{n}\Bigr) \tag{4.3}
$$
up to order $n^{-1}$ as $n\to\infty$, where
$$
a_n=-\frac{r^2I}{2}-\frac{r^3\{3J+K\}}{6\sqrt n}-\frac{rK}{6I\sqrt n}+a_n''
$$
with $a_n''=O(1/n)$, and $R$ collects the terms of order $n^{-1}$, a polynomial in $Z_1$, $Z_2$, $Z_3$ and $r$ whose coefficients depend on $I$, $J$, $K$, $\mathcal L$ and $\mathcal L'$. We denote by $T_n^*$ the $T_n$ corresponding to this particular value of $a_n$; then we may also write $T_n^*=\sqrt n(\hat\theta_n^*-\theta)$. On the other hand, under conditions (i)-(vii), we have obtained in [11]
$$
\sqrt n(\hat\theta_{ML}-\theta)=\frac{Z_1}{I}
+\frac{1}{\sqrt n\,I^2}\Bigl\{Z_1Z_2-\frac{3J+K}{2I}Z_1^2\Bigr\}
+\frac{1}{nI^3}\Bigl\{\frac{1}{2}Z_1^2Z_3+Z_1Z_2^2-\frac{3(3J+K)}{2I}Z_1^2Z_2
+\frac{(3J+K)^2}{2I^2}Z_1^3-\frac{4L+3M+6N+H}{6I}Z_1^3\Bigr\}
+o_p\Bigl(\frac{1}{n}\Bigr). \tag{4.4}
$$
Let $\bar\theta_{ML}$ be a modified MLE so that it is third order AMU. Comparing (4.3) with (4.4), we see that $\sqrt n(\bar\theta_{ML}-\theta)$ and $T_n^*$ are essentially different in the order $n^{-1}$. We put $T_{ML}^*=\sqrt n(\bar\theta_{ML}-\theta)$. Note that the difference between $T_{ML}^*$ and $T_n^*$ appears in the linear term of order $n^{-1/2}$ and in other terms of order $n^{-1}$. It is shown in [12], [13] and [14] that the asymptotic cumulants are determined up to order $n^{-1}$ by the terms of order up to $n^{-1/2}$ if the first term is equal to $Z_1(\theta)/I(\theta)$, and that the fourth order cumulant is identical in the first term of order $n^{-1}$ for all asymptotically efficient estimators. For the third order cumulants we have
$$
\kappa_3\bigl(T_{ML}^*-E_\theta(T_{ML}^*)\bigr)=\frac{\beta_3(\theta)}{\sqrt n}+o\Bigl(\frac{1}{n}\Bigr),\qquad
\kappa_3\bigl(T_n^*-E_\theta(T_n^*)\bigr)=\frac{\beta_3(\theta)}{\sqrt n}+\frac{\gamma_3(\theta)}{n}+o\Bigl(\frac{1}{n}\Bigr).
$$
Therefore there is a difference between the asymptotic distributions of $\sqrt n(\bar\theta_{ML}-\theta)$ and $T_n^*$ in the term of order $n^{-1}$ if $\gamma_3(\theta)\neq0$. By Theorem 3 and (4.3) we may write
$$
T_n^*=\frac{Z_1}{I}-\frac{3J+2K}{6I^2\sqrt n}+\frac{r}{\sqrt n}L^*+\frac{1}{\sqrt n}(Q-c)+\frac{1}{n}R+o_p\Bigl(\frac{1}{n}\Bigr),
$$
where
$$
L^*=\frac{1}{2I}\Bigl(Z_2-\frac{3J+K}{I}Z_1\Bigr),\qquad
Q=\frac{1}{I^2}\Bigl(Z_1Z_2-\frac{3J+K}{2I}Z_1^2\Bigr),\qquad
c=-\frac{J+K}{2I^2}.
$$
We decompose
$$
\frac{\beta_3(\theta)}{\sqrt n}+\frac{\gamma_3(\theta)}{n}+o\Bigl(\frac{1}{n}\Bigr)
=-\frac{1}{2I^3}E_\theta(Z_1^3)+\frac{3}{2I}E_\theta\bigl[Z_1\bigl(T_n^*-E_\theta(T_n^*)\bigr)^2\bigr]
+\frac{3}{2In}E_\theta\bigl[Z_1(rL^*+Q-c)^2\bigr]+o\Bigl(\frac{1}{n}\Bigr). \tag{4.5}
$$
Note that $E_\theta(Q)=c$. We have $E_\theta(Z_1^3)=K/\sqrt n$. For the second term of the right-hand side of (4.5) we use the following lemma (see [12], Lemma 4.1).

LEMMA. Suppose that $X_1,X_2,\dots,X_n,\dots$ is a sequence of i.i.d. random variables with a density $f(x,\theta)$ which is differentiable in $\theta$ for almost all $x[\mu]$. Suppose further that $T_\theta$ is a $\mathcal B^{(n)}$-measurable function on $\mathcal X^{(n)}$ into $R^1$ and a differentiable function of $\theta$, and that
$$
E_\theta(T_\theta)=\int\cdots\int T_\theta\prod_{i=1}^n f(x_i,\theta)\prod_{i=1}^n d\mu(x_i)
$$
may be differentiated in $\theta$ under the integral sign. Then we have
$$
E_\theta(Z_1T_\theta)=\frac{1}{\sqrt n}\frac{\partial}{\partial\theta}E_\theta(T_\theta)-\frac{1}{\sqrt n}E_\theta\Bigl(\frac{\partial T_\theta}{\partial\theta}\Bigr).
$$
By the lemma we obtain
$$
E_\theta\bigl[Z_1\bigl(T_n^*-E_\theta(T_n^*)\bigr)^2\bigr]
=\frac{1}{\sqrt n}\frac{\partial}{\partial\theta}V_\theta(T_n^*)
-\frac{2}{\sqrt n}E_\theta\Bigl[\bigl(T_n^*-E_\theta(T_n^*)\bigr)\frac{\partial}{\partial\theta}\bigl(T_n^*-E_\theta(T_n^*)\bigr)\Bigr]
=\frac{K}{I^2\sqrt n}+o\Bigl(\frac{1}{n}\Bigr). \tag{4.6}
$$
We also obtain
$$
E_\theta\bigl[Z_1(rL^*+Q-c)^2\bigr]=2rE_\theta\bigl[Z_1L^*(Q-c)\bigr]+o(1)
=\frac{r}{I}\Bigl\{\frac{M}{I}-\frac{(3J+K)K}{2I^2}\Bigr\}+o(1). \tag{4.7}
$$
From (4.5), (4.6) and (4.7) we have
$$
\gamma_3(\theta)=\frac{3r}{2I(\theta)^2}\Bigl[\frac{M(\theta)}{I(\theta)}-\frac{\{3J(\theta)+K(\theta)\}K(\theta)}{2I(\theta)^2}\Bigr].
$$
Since the third order asymptotic distribution of $T_n^*$ attains the bound of the third order asymptotic distributions of third order AMU estimators at $r$, there is no third order AMU estimator which uniformly attains the bound if $\gamma_3(\theta)$ is not equal to zero. Hence we have established:

THEOREM 4. Under conditions (i)-(vii), $\bar\theta_{ML}$ is not third order asymptotically efficient if $\gamma_3(\theta)\neq0$.
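As a concrete check (our own worked example, not from the paper), take the exponential density $f(x,\theta)=\theta e^{-\theta x}$, for which $I=1/\theta^2$, $J=0$, $K=-2/\theta^3$ and $M=1/\theta^4$. Taking $\gamma_3(\theta)$ in the form $(3r/2I^2)[M/I-(3J+K)K/(2I^2)]$ given above, a short calculation shows it simplifies to $-(3r/2)\theta^2\neq0$.

```python
# Hypothetical worked example: exponential density f(x, t) = t exp(-t x).
# Then d/dt log f = 1/t - x and d2/dt2 log f = -1/t^2, so
#   I = 1/t^2, J = 0, K = E[(d log f)^3] = -2/t^3, M = E[(d2 log f)^2] = 1/t^4.
def gamma3(theta, r):
    I = 1.0 / theta ** 2
    J = 0.0
    K = -2.0 / theta ** 3
    M = 1.0 / theta ** 4
    return (3.0 * r / (2.0 * I ** 2)) * (M / I - (3.0 * J + K) * K / (2.0 * I ** 2))

print(gamma3(1.0, 1.0))  # simplifies to -(3 r / 2) theta^2, here -1.5
```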
5. Maximum log-likelihood estimator

Instead of the equation (1.1) we may take a solution $\hat\theta_n$ of the discretized likelihood equation
$$
\sum_{i=1}^n\log f(X_i,\hat\theta_n+rc_n^{-1})-\sum_{i=1}^n\log f(X_i,\hat\theta_n-rc_n^{-1})=0. \tag{5.1}
$$
The solution $\hat\theta_n$ is the value of $\theta$ which maximizes
$$
\frac{1}{2rc_n^{-1}}\int_{-rc_n^{-1}}^{rc_n^{-1}}\log L(\theta+t)\,dt,
$$
where $L(\theta)$ denotes the likelihood function. Then $\hat\theta_n$ is called the maximum log-likelihood estimator (MLLE) ([3]). If $\log L(\hat\theta_n+t)$ is locally linearized, $\hat\theta_n$ agrees with the maximum probability estimator of Weiss and Wolfowitz ([15]). Now we shall consider a location parameter case in which the density function $f$ satisfies the following assumption:

(viii) $f(x,\theta)=f(x-\theta)$ and $f(x)$ is symmetric with respect to the origin.

It follows by the symmetry of $f$ that the solution $\hat\theta_n$ of (5.1) is AMU. We also have
$J(\theta)=K(\theta)=0$. Let $\hat\theta_n$ be an MLLE. Then
$$
\sum_{i=1}^n\log f\bigl(X_i-\hat\theta_n-(r/\sqrt n)\bigr)-\sum_{i=1}^n\log f\bigl(X_i-\hat\theta_n+(r/\sqrt n)\bigr)=0.
$$
Without loss of generality we assume that $\theta_0=0$. Since
$$
\Bigl[\sum_{i=1}^n\log f\bigl(X_i-\hat\theta_n-(r/\sqrt n)\bigr)-\sum_{i=1}^n\log f(X_i)\Bigr]
-\Bigl[\sum_{i=1}^n\log f\bigl(X_i-\hat\theta_n+(r/\sqrt n)\bigr)-\sum_{i=1}^n\log f(X_i)\Bigr]=0,
$$
it follows by Taylor expansion around $\theta=0$, putting $T_n=\sqrt n\,\hat\theta_n$ and using $J(0)=K(0)=0$, that
$$
2rZ_1+2r\Bigl(\frac{1}{\sqrt n}Z_2-I(0)\Bigr)T_n+\frac{r}{n}Z_3\Bigl(T_n^2+\frac{r^2}{3}\Bigr)
-\frac{r}{3n}\{4L(0)+3M(0)+6N(0)+H(0)\}\bigl(T_n^3+r^2T_n\bigr)+o_p\Bigl(\frac{1}{n}\Bigr)=0.
$$
Hence, under conditions (i)-(vii), it follows that
$$
T_n=\frac{Z_1}{I}+\frac{1}{\sqrt n}\,\frac{Z_1Z_2}{I^2}
+\frac{1}{nI^3}\Bigl\{Z_1Z_2^2+\frac{1}{2}Z_1^2Z_3-\frac{1}{6I}(4L+3M+6N+H)Z_1^3\Bigr\}
+\frac{r^2}{6In}Z_3-\frac{r^2(4L+3M+6N+H)}{6I^2n}Z_1+o_p\Bigl(\frac{1}{n}\Bigr) \tag{5.2}
$$
up to order $n^{-1}$ as $n\to\infty$. As was stated previously, the difference in the order $n^{-1}$ terms between (4.4) and (5.2) does not affect the asymptotic distribution up to the order $n^{-1}$. Hence we have established:

THEOREM 5.
Under conditions (i)-(iv) and (vi)-(viii), the MLLE $\hat\theta_n$ is asymptotically equivalent to the MLE $\hat\theta_{ML}$ up to order $n^{-1}$.

It is shown in [12], [13] and [14] that $\hat\theta_{ML}$ maximizes the symmetric probability $P_{n,\theta}\{\sqrt n|\hat\theta_{ML}-\theta|\leq u\}$
up to the order $n^{-1}$ among all regular AMU estimators. Therefore it is seen that
$$
\lim_{n\to\infty}n\bigl[P_{n,\theta}\{\sqrt n|\hat\theta_{ML}-\theta|\leq u\}-P_{n,\theta}\{\sqrt n|\hat\theta_{DL}-\theta|\leq u\}\bigr]\geq0
$$
for all $u$, where $\hat\theta_{DL}$ denotes the DLE. The asymptotic distribution of the MLLE $\hat\theta_{MLL}$ is equal to that of $\hat\theta_{ML}$ up to the order $n^{-1}$, but not in the order $n^{-3/2}$, and we can show that
$$
\lim_{n\to\infty}n^{3/2}\bigl[P_{n,\theta}\{\sqrt n|\hat\theta_{MLL}-\theta|\leq u\}-P_{n,\theta}\{\sqrt n|\hat\theta_{ML}-\theta|\leq u\}\bigr]<0
$$
in general situations. Hence, for symmetric intervals, $\hat\theta_{MLL}$ is not fourth order asymptotically efficient.
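The smoothed-likelihood characterization of the MLLE in this section can be tried numerically. The sketch below is our own illustration under stated assumptions (logistic location likelihood, standard normal data, Simpson's rule for the local average, ternary search for both maximizers); it compares the MLLE with the MLE, which by Theorem 5 should nearly coincide.

```python
import math, random

# Our own sketch: MLLE for a symmetric location model, here the logistic density
# f(x - t), compared with the MLE. The MLLE maximizes the locally averaged log
# likelihood (1/2h) * integral_{-h}^{h} log L(t + s) ds with h = r/sqrt(n).
def loglik(xs, t):
    return sum(-(x - t) - 2.0 * math.log1p(math.exp(-(x - t))) for x in xs)

def smoothed(xs, t, h):
    # Simpson's rule on [-h, h] with 5 nodes, then divide by the interval length 2h
    nodes = [-h, -h / 2, 0.0, h / 2, h]
    wts = [1.0, 4.0, 2.0, 4.0, 1.0]
    return sum(w * loglik(xs, t + s) for w, s in zip(wts, nodes)) * (h / 2) / 3.0 / (2 * h)

def argmax(fun, lo, hi):
    for _ in range(100):                    # ternary search, valid for a concave objective
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if fun(m1) < fun(m2):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)

random.seed(3)
n, r = 300, 1.0
xs = [random.gauss(0.0, 1.0) for _ in range(n)]
h = r / math.sqrt(n)
mle = argmax(lambda t: loglik(xs, t), -2.0, 2.0)
mlle = argmax(lambda t: smoothed(xs, t, h), -2.0, 2.0)
print(abs(mlle - mle))                      # the two agree to high order (Theorem 5)
```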
6. Concluding remarks on discretized likelihood methods 1)

If we define an asymptotically efficient estimator as a $\hat\theta_n$ which maximizes $\lim_{n\to\infty}P_{n,\theta}\{c_n|\hat\theta_n-\theta|\leq u\}$, then the estimator $\hat\theta_n^*$ satisfying the following equation (6.1) is asymptotically efficient:
$$
\prod_{i=1}^n\frac{f(X_i,\hat\theta_n^*+rc_n^{-1})}{f(X_i,\hat\theta_n^*)}
-\prod_{i=1}^n\frac{f(X_i,\hat\theta_n^*-rc_n^{-1})}{f(X_i,\hat\theta_n^*)}=a_n, \tag{6.1}
$$
where $a_n$ is chosen so that $\hat\theta_n^*$ is AMU. Then (6.1) is also expressed as
$$
\exp\{\log L(\hat\theta_n^*+rc_n^{-1})-\log L(\hat\theta_n^*)\}
-\exp\{\log L(\hat\theta_n^*-rc_n^{-1})-\log L(\hat\theta_n^*)\}=a_n. \tag{6.2}
$$
If $\exp(\log L)$ is linearized, then (6.2) is reduced to the type of (5.1), where in this case the right-hand side of (5.1) is not always zero. Hence it is seen from the above that asymptotic efficiency (including the higher order cases) may be systematically discussed by the discretized likelihood methods. Further, the discretized likelihood methods may also be applied to the statistical estimation theory of fixed sample size.

UNIVERSITY OF ELECTRO-COMMUNICATIONS
UNIVERSITY OF TOKYO
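Equations (6.1) and (6.2) are two ways of writing the same quantity, since $\exp\{\log L(\theta+\delta)-\log L(\theta)\}=\prod_i f(X_i,\theta+\delta)/f(X_i,\theta)$. The sketch below (our own check, with an assumed normal location model) confirms the identity numerically.

```python
import math, random

# f(x, t): normal density with mean t and variance 1 (a hypothetical concrete model).
def f(x, t):
    return math.exp(-(x - t) ** 2 / 2) / math.sqrt(2 * math.pi)

def loglik(xs, t):
    return sum(math.log(f(x, t)) for x in xs)

random.seed(4)
n = 50
xs = [random.gauss(0.0, 1.0) for _ in range(n)]
t, d = 0.1, 1.0 / math.sqrt(n)              # candidate theta and increment r c_n^{-1}

# Left-hand side of (6.1): difference of likelihood-ratio products
prod_plus = math.prod(f(x, t + d) / f(x, t) for x in xs)
prod_minus = math.prod(f(x, t - d) / f(x, t) for x in xs)
lhs_61 = prod_plus - prod_minus

# Left-hand side of (6.2): difference of exponentiated log-likelihood differences
lhs_62 = math.exp(loglik(xs, t + d) - loglik(xs, t)) - math.exp(loglik(xs, t - d) - loglik(xs, t))

print(abs(lhs_61 - lhs_62))                 # agree up to rounding error
```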
1) This section is based on Akahira and Takeuchi [3].

REFERENCES

[1] Akahira, M. (1975). Asymptotic theory for estimation of location in non-regular cases, I: Order of convergence of consistent estimators, Rep. Statist. Appl. Res., JUSE, 22, 8-26.
[2] Akahira, M. and Takeuchi, K. (1976). On the second order asymptotic efficiency of estimators in multiparameter cases, Rep. Univ. Electro-Comm., 26, 261-269.
[3] Akahira, M. and Takeuchi, K. (1977). Asymptotic properties of estimators obtained from discretized likelihood equations, Annual Meeting of the Mathematical Society of Japan.
[4] Chibisov, D. M. (1972). On the normal approximation for a certain class of statistics, Proc. 6th Berkeley Symp. Math. Statist. Prob., 1, 153-174.
[5] Chibisov, D. M. (1973). Asymptotic expansions for Neyman's C(a) tests, Proc. 2nd Japan-USSR Symp. on Prob. Theory, (Lecture Notes in Math. 330), Springer-Verlag, 16-45.
[6] Efron, B. (1975). Defining the curvature of a statistical problem (with applications to second order efficiency), Ann. Statist., 3, 1189-1242.
[7] Ghosh, J. K. and Subramanyam, K. (1974). Second order efficiency of maximum likelihood estimators, Sankhya, A, 36, 325-358.
[8] Pfanzagl, J. (1973). Asymptotic expansions related to minimum contrast estimators, Ann. Statist., 1, 993-1026.
[9] Pfanzagl, J. (1975). On asymptotically complete classes, Statistical Inference and Related Topics, Proc. of the Summer Research Institute on Statistical Inference for Stochastic Processes, 2, 1-43.
[10] Pfanzagl, J. and Wefelmeyer, W. (1978). A third order optimum property of the maximum likelihood estimator, J. Multivariate Anal., 8, 1-29.
[11] Takeuchi, K. and Akahira, M. (1976). On the second order asymptotic efficiencies of estimators, Proc. of the 3rd Japan-USSR Symp. on Prob. Theory, (Lecture Notes in Math. 550), Springer-Verlag, 604-638.
[12] Takeuchi, K. and Akahira, M. (1978). Third order asymptotic efficiency of maximum likelihood estimator for multiparameter exponential case, Rep. Univ. Electro-Comm., 28, 271-293.
[13] Takeuchi, K. and Akahira, M. (1978). On the asymptotic efficiency of estimators (in Japanese), A report of the Symposium on Some Problems of Asymptotic Theory, Annual Meeting of the Mathematical Society of Japan, 1-24.
[14] Takeuchi, K. and Akahira, M. (1979). Third order asymptotic efficiency of maximum likelihood estimator in general case, to appear.
[15] Weiss, L. and Wolfowitz, J. (1967). Maximum probability estimators, Ann. Inst. Statist. Math., 19, 193-206.
Ann. Inst. Statist. Math., 31 (1979), Part A, 403-415
ASYMPTOTIC OPTIMALITY OF THE GENERALIZED BAYES ESTIMATOR IN MULTIPARAMETER CASES KEI TAKEUCHI AND MASAFUMI AKAHIRA
(Received July 28, 1978; revised Oct. 24, 1979)
Abstract

The higher order asymptotic efficiency of the generalized Bayes estimator is discussed in multiparameter cases. For all symmetric loss functions, the generalized Bayes estimator is second order asymptotically efficient in the class $A_2$ of all second order asymptotically median unbiased (AMU) estimators and third order asymptotically efficient in a restricted class $D$ of estimators.

1. Introduction

The expansion of a generalized Bayes estimator with respect to a loss function of the type $L(\theta)=|\theta|^a$ $(a>1)$ was obtained by Gusev [5]. His result can be extended to all symmetric loss functions. Strasser [8] also obtained asymptotic expansions of the distribution of the generalized Bayes estimator. In the one parameter case the second order (or third order) asymptotic efficiency of the generalized Bayes estimator has been discussed by Takeuchi and Akahira [12]. It is shown in this paper that in the multiparameter case, for all symmetric loss functions, the generalized Bayes estimator $\hat\theta$ admits the asymptotic expansion
$$
\sqrt n(\hat\theta-\theta)=U-\frac{1}{2\sqrt n}I^{-1}V+\frac{1}{\sqrt n}I^{-1}W+o_p\Bigl(\frac{1}{\sqrt n}\Bigr),
$$
where the symbols of the right-hand side are defined in the context. The asymptotic distributions of the estimators are the same up to the order $n^{-1}$ except for a constant location shift. Therefore, if it is properly adjusted to be asymptotically median unbiased, it is third order asymptotically efficient among the estimators belonging to the class $D$ ([3], [4], [11]) whose element $\hat\theta$ is third order AMU and is asymptotically expanded as
$$
\sqrt n(\hat\theta-\theta)=U+\frac{1}{\sqrt n}Q+o_p\Bigl(\frac{1}{\sqrt n}\Bigr)
$$
with $Q_\alpha=O_p(1)$ $(\alpha=1,\dots,p)$ and $\tilde E[U_\alpha^kQ_\beta]=o(1)$ $(k=1,2)$ for all $\alpha,\beta=1,\dots,p$, where $\tilde E$ denotes asymptotic expectation, $U=(U_1,\dots,U_p)'$ and $Q=(Q_1,\dots,Q_p)'$.
2. Results

Let $(\mathcal X,\mathcal B)$ be a sample space. We consider a family of probability measures on $\mathcal B$, $\mathcal P=\{P_\theta:\theta\in\Theta\}$, where the index set $\Theta$ is called a parameter space. We assume that $\Theta$ is an open set in a Euclidean $p$-space $R^p$ with a norm denoted by $\|\cdot\|$. An element $\theta$ of $\Theta$ may then be denoted by $(\theta_1,\dots,\theta_p)$. Consider the $n$-fold direct products $(\mathcal X^{(n)},\mathcal B^{(n)})$ of $(\mathcal X,\mathcal B)$ and the corresponding product measures $P_\theta^{(n)}$ of $P_\theta$. An estimator of $\theta$ is defined to be a sequence $\{\hat\theta_n\}$ of $\mathcal B^{(n)}$-measurable functions $\hat\theta_n$ on $\mathcal X^{(n)}$ into $\Theta$ $(n=1,2,\dots)$. For simplicity we denote an estimator by $\hat\theta$ instead of $\{\hat\theta_n\}$; $\hat\theta$ may be denoted by $(\hat\theta_1,\dots,\hat\theta_p)$. For an increasing sequence $\{c_n\}$ of positive numbers ($c_n$ tending to infinity) an estimator is called consistent with order $\{c_n\}$ (or $\{c_n\}$-consistent for short) if for every $\varepsilon>0$ and every $\vartheta\in\Theta$ there exist a sufficiently small positive number $\delta$ and a sufficiently large number $L$ satisfying
$$
\limsup_{n\to\infty}\ \sup_{\theta:\|\theta-\vartheta\|<\delta}P_\theta^{(n)}\{c_n\|\hat\theta-\theta\|\geq L\}<\varepsilon\quad([1]).
$$
For each $k=1,2,\dots$, a $\{c_n\}$-consistent estimator $\hat\theta$ is a $k$th order asymptotically median unbiased (or $k$th order AMU) estimator if for each $\vartheta\in\Theta$ and each $\alpha=1,\dots,p$ there exists a positive number $\delta$ such that
$$
\limsup_{n\to\infty}\ \sup_{\theta:\|\theta-\vartheta\|<\delta}c_n^{k-1}\Bigl|P_\theta^{(n)}\{\hat\theta_\alpha\leq\theta_\alpha\}-\frac{1}{2}\Bigr|=0;\qquad
\limsup_{n\to\infty}\ \sup_{\theta:\|\theta-\vartheta\|<\delta}c_n^{k-1}\Bigl|P_\theta^{(n)}\{\hat\theta_\alpha\geq\theta_\alpha\}-\frac{1}{2}\Bigr|=0.
$$
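Median unbiasedness of the kind just defined can be probed by simulation. A minimal sketch of ours (exponential model, an assumed example not taken from the paper) estimates $P(\hat\theta\leq\theta)$ for the MLE $\hat\theta=1/\bar X$:

```python
import random

# Monte Carlo check (our own illustration) of asymptotic median unbiasedness:
# estimate P(theta_hat <= theta) for the exponential-model MLE theta_hat = 1/xbar.
random.seed(5)
theta, n, reps = 1.0, 100, 4000
below = 0
for _ in range(reps):
    xbar = sum(random.expovariate(theta) for _ in range(n)) / n
    if 1.0 / xbar <= theta:
        below += 1
print(below / reps)  # close to 1/2 for large n (first order AMU)
```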
For $\hat\theta$ $k$th order AMU, we say that $G_0(t,\theta_\alpha)+c_n^{-1}G_1(t,\theta_\alpha)+\cdots+c_n^{-(k-1)}G_{k-1}(t,\theta_\alpha)$ is the $k$th order asymptotic (marginal) distribution of $c_n(\hat\theta_\alpha-\theta_\alpha)$ if
$$
\lim_{n\to\infty}c_n^{k-1}\bigl|P_\theta^{(n)}\{c_n(\hat\theta_\alpha-\theta_\alpha)\leq t\}
-G_0(t,\theta_\alpha)-c_n^{-1}G_1(t,\theta_\alpha)-\cdots-c_n^{-(k-1)}G_{k-1}(t,\theta_\alpha)\bigr|=0.
$$
We note that the $G_i(t,\theta_\alpha)$ $(i=1,\dots,k-1;\ \alpha=1,\dots,p)$ may in general be
absolutely continuous functions; hence the asymptotic marginal distributions for any fixed $n$ may not be distribution functions. Suppose that $\hat\theta$ is $k$th order AMU, has the $k$th order marginal asymptotic distribution $G_0(t,\theta_\alpha)+c_n^{-1}G_1(t,\theta_\alpha)+\cdots+c_n^{-(k-1)}G_{k-1}(t,\theta_\alpha)$ $(\alpha=1,\dots,p)$, and that the joint distribution of $\hat\theta$ admits an asymptotic expansion up to $k$th order, i.e., up to the order $c_n^{-(k-1)}$. Let $\theta_0\ (\in\Theta)$ be arbitrary but fixed, and denote $\theta_0$ by $(\theta_{01},\dots,\theta_{0p})$. Let $\alpha$ be arbitrary but fixed in $1,\dots,p$. We consider the problem of testing the hypothesis $H^+:\theta_\alpha=\theta_{0\alpha}+tc_n^{-1}$ $(t>0)$ against $K:\theta_\alpha=\theta_{0\alpha}$. Put
$$
\Phi_{1/2}=\bigl\{\{\phi_n\}:E_{\theta_{0\alpha}+tc_n^{-1}}(\phi_n)=1/2+o(c_n^{-(k-1)}),\ 0\leq\phi_n(\tilde x_n)\leq1\ \text{for all }\tilde x_n\in\mathcal X^{(n)}\ (n=1,2,\dots)\bigr\}.
$$
Putting $A_{\hat\theta_\alpha,\theta_{0\alpha}}^n=\{c_n(\hat\theta_\alpha-\theta_{0\alpha})\leq t\}$, we have
$$
\lim_{n\to\infty}P_{\theta_{0\alpha}+tc_n^{-1}}^{(n)}(A_{\hat\theta_\alpha,\theta_{0\alpha}}^n)
=\lim_{n\to\infty}P_{\theta_{0\alpha}+tc_n^{-1}}^{(n)}\{\hat\theta_\alpha\leq\theta_{0\alpha}+tc_n^{-1}\}=\frac{1}{2}.
$$
Hence it is seen that the sequence $\{\chi_{A_{\hat\theta_\alpha,\theta_{0\alpha}}^n}\}$ of the indicators (or characteristic functions) of $A_{\hat\theta_\alpha,\theta_{0\alpha}}^n$ $(n=1,2,\dots)$ belongs to $\Phi_{1/2}$. If
$$
\sup_{\{\phi_n\}\in\Phi_{1/2}}\lim_{n\to\infty}c_n^{k-1}\bigl\{E_{\theta_{0\alpha}}^{(n)}(\phi_n)
-H_0^+(t,\theta_{0\alpha})-c_n^{-1}H_1^+(t,\theta_{0\alpha})-\cdots-c_n^{-(k-1)}H_{k-1}^+(t,\theta_{0\alpha})\bigr\}=0,
$$
then we have $G_0(t,\theta_{0\alpha})\leq H_0^+(t,\theta_{0\alpha})$; and for any positive integer $j\ (\leq k)$, if $G_i(t,\theta_{0\alpha})=H_i^+(t,\theta_{0\alpha})$ $(i=1,\dots,j-1)$, then $G_j(t,\theta_{0\alpha})\leq H_j^+(t,\theta_{0\alpha})$. Consider next the problem of testing the hypothesis $H^-:\theta_\alpha=\theta_{0\alpha}+tc_n^{-1}$ $(t<0)$ against $K:\theta_\alpha=\theta_{0\alpha}$. If
$$
\sup_{\{\phi_n\}\in\Phi_{1/2}}\lim_{n\to\infty}c_n^{k-1}\bigl\{E_{\theta_{0\alpha}}^{(n)}(\phi_n)
-H_0^-(t,\theta_{0\alpha})-c_n^{-1}H_1^-(t,\theta_{0\alpha})-\cdots-c_n^{-(k-1)}H_{k-1}^-(t,\theta_{0\alpha})\bigr\}=0,
$$
then we have $G_0(t,\theta_{0\alpha})\geq H_0^-(t,\theta_{0\alpha})$; $\hat\theta$ is said to be $k$th order asymptotically efficient in the class $A_k$ of all $k$th order AMU estimators if the $k$th order asymptotic marginal distribution of $\hat\theta$ uniformly attains the bound of the $k$th order asymptotic marginal distributions of $k$th order AMU estimators, that
is, for each $\alpha=1,\dots,p$,
$$
G_i(t,\theta_\alpha)=\begin{cases}H_i^+(t,\theta_\alpha)&\text{for }t>0,\\[2pt] H_i^-(t,\theta_\alpha)&\text{for }t<0,\end{cases}\qquad i=0,\dots,k-1\quad([2],[4],[9]).
$$
[Note that for $t=0$ and each $\alpha=1,\dots,p$ we have $G_i(0,\theta_\alpha)=H_i^+(0,\theta_\alpha)=H_i^-(0,\theta_\alpha)$ $(i=0,\dots,k-1)$ from the condition of $k$th order asymptotic median unbiasedness.] $\hat\theta$ is said to be third order asymptotically efficient in the class $D$ if the third order asymptotic marginal distribution of $\hat\theta$ uniformly attains the bound of the third order asymptotic marginal distributions of estimators in $D$. It is shown in general by Pfanzagl and Wefelmeyer [6], [7] and Akahira and Takeuchi [3], [4], [10], [11], [14] that there exist second order asymptotically efficient estimators but no third order asymptotically efficient estimator in the class $A_3$. But it was also shown in [4], [10] and [11] that if we restrict the class of estimators appropriately, we have asymptotically efficient estimators within the restricted class, and that the maximum likelihood estimator belongs to the class of higher order asymptotically efficient estimators. We assume that for each $\theta\in\Theta$, $P_\theta$ is absolutely continuous with respect to a $\sigma$-finite measure $\mu$. We denote the density $dP_\theta/d\mu$ by $f(x,\theta)$; the joint density is then $\prod_{i=1}^n f(x_i,\theta)$.
In the subsequent discussion we shall deal with the case $c_n=\sqrt n$. Let $\Theta=R^p$. Let $L_n(u)$ $(u=(u_1,\dots,u_p)\in R^p)$ be a bounded, non-negative and quasi-convex function around the origin, i.e., for any $c$ the set $\{u:L_n(u)\leq c\}\ (\subset R^p)$ is convex and contains the origin, and let $\pi(\theta)$ be a non-negative function. Define a posterior density $p_n(\theta\,|\,\tilde x_n)$ and a posterior risk $r_n(d\,|\,\tilde x_n)$ by
$$
p_n(\theta\,|\,\tilde x_n)=\pi(\theta)\prod_{i=1}^n f(x_i,\theta)\Bigl/\int_\Theta\pi(\theta)\prod_{i=1}^n f(x_i,\theta)\,d\theta
$$
and
$$
r_n(d\,|\,\tilde x_n)=\int_\Theta L_n(d-\theta)\,p_n(\theta\,|\,\tilde x_n)\,d\theta,
$$
respectively, where $\tilde x_n=(x_1,\dots,x_n)$. Now suppose that $\lim_{n\to\infty}L_n(u/\sqrt n)=L^*(u)$ for all $u\in R^p$. We define
$$
r_n^*(d\,|\,\tilde x_n)=\int_\Theta L^*(\sqrt n(d-\theta))\,p_n(\theta\,|\,\tilde x_n)\,d\theta.
$$
An estimator $\hat\theta$ is called a generalized Bayes estimator with respect to a loss function $L^*$ and a prior density $\pi$ if
$$
r_n^*(\hat\theta\,|\,\tilde x_n)=\inf_{d\in\Theta}r_n^*(d\,|\,\tilde x_n).
$$
Then $\hat t=\sqrt n(\hat\theta-\theta)$ may also be called a generalized Bayes estimator w.r.t. $L^*$ and $\pi$. Since
$$
\lim_{n\to\infty}\Bigl|\inf_{d\in\Theta}\int_\Theta L_n(d-\theta)\bar p(\theta)\,d\theta-\inf_{d\in\Theta}\int_\Theta L^*(\sqrt n(d-\theta))\bar p(\theta)\,d\theta\Bigr|=0
$$
uniformly in every posterior density $\bar p(\theta)$, it follows that for a generalized Bayes estimator
$$
\lim_{n\to\infty}\Bigl|\inf_{d\in\Theta}r_n(d\,|\,\tilde x_n)-r_n^*(\hat\theta\,|\,\tilde x_n)\Bigr|=0.
$$
Suppose that X1, X2, • • • , Xn, • • • is a sequence of i.i.d. random variables with a density f (x, 0) satisfying (i)-(iv).
(i) {x: f(x, 0)>0} does not depend on 0. (ii) For almost all x[,u], f (x, 0) is three times continuously differentiable in 0. (a =1, • • • , p). (iii) For each a, R (a, j9 =1, • • • , p) 0
log f (X, 0) < 0 _ - EB I a2
aoaa6^
(iv) There exist Jad•r(6) =E8
LI
a2
a6Qaos
a log f(x, 6)} log f( x , 6)} {__
K(0)=E a L a log f(x, 6)} { a log f(x, 6)}
a6a a6^
• l a6, log
f(x, 6)} ]
and War(6
) = EB L a6sa6^a6,
log f (X, 0)]
and the following holds : Ee [ a9 log AX, 0)] _ -J ^.r(6)-J,.s(6)-J^r.a(6)-Kaar(6)
a6Qa6pa6r
It can be shown in the same way as in [9] that a maximum likelihood estimator (MLE) is second order asymptotically efficient. Let $\hat\theta$ be an MLE. By Taylor expansion we have
$$
0=\sum_{i=1}^n\frac{\partial}{\partial\theta_\alpha}\log f(X_i,\hat\theta)
=\sum_{i=1}^n\frac{\partial}{\partial\theta_\alpha}\log f(X_i,\theta)
+\sum_\beta\Bigl\{\sum_{i=1}^n\frac{\partial^2}{\partial\theta_\alpha\partial\theta_\beta}\log f(X_i,\theta)\Bigr\}(\hat\theta_\beta-\theta_\beta)
+\frac{1}{2}\sum_\beta\sum_\gamma\Bigl\{\sum_{i=1}^n\frac{\partial^3}{\partial\theta_\alpha\partial\theta_\beta\partial\theta_\gamma}\log f(X_i,\theta^*)\Bigr\}(\hat\theta_\beta-\theta_\beta)(\hat\theta_\gamma-\theta_\gamma),
$$
where $\|\theta^*-\theta\|\leq\|\hat\theta-\theta\|$. Putting $T=\sqrt n(\hat\theta-\theta)$ we obtain
$$
0=\frac{1}{\sqrt n}\sum_{i=1}^n\frac{\partial}{\partial\theta_\alpha}\log f(X_i,\theta)
+\frac{1}{n}\sum_\beta\Bigl\{\sum_{i=1}^n\frac{\partial^2}{\partial\theta_\alpha\partial\theta_\beta}\log f(X_i,\theta)\Bigr\}T_\beta
+\frac{1}{2n\sqrt n}\sum_\beta\sum_\gamma\Bigl\{\sum_{i=1}^n\frac{\partial^3}{\partial\theta_\alpha\partial\theta_\beta\partial\theta_\gamma}\log f(X_i,\theta^*)\Bigr\}T_\beta T_\gamma,
$$
where $T=(T_1,\dots,T_p)'$. Set
$$
Z_\alpha(\theta)=\frac{1}{\sqrt n}\sum_{i=1}^n\frac{\partial}{\partial\theta_\alpha}\log f(X_i,\theta);\qquad
Z_{\alpha\beta}(\theta)=\frac{1}{\sqrt n}\sum_{i=1}^n\Bigl\{\frac{\partial^2}{\partial\theta_\alpha\partial\theta_\beta}\log f(X_i,\theta)+I_{\alpha\beta}(\theta)\Bigr\};
$$
$$
W_{\alpha\beta\gamma}(\theta)=\frac{1}{n}\sum_{i=1}^n\frac{\partial^3}{\partial\theta_\alpha\partial\theta_\beta\partial\theta_\gamma}\log f(X_i,\theta).
$$
Then it follows that $W_{\alpha\beta\gamma}(\theta)$ converges in probability to $-\{J_{\alpha\beta\cdot\gamma}(\theta)+J_{\beta\gamma\cdot\alpha}(\theta)+J_{\alpha\gamma\cdot\beta}(\theta)+K_{\alpha\beta\gamma}(\theta)\}$. We put $P_{\alpha\beta\gamma}(\theta)=J_{\alpha\beta\cdot\gamma}(\theta)+J_{\beta\gamma\cdot\alpha}(\theta)+J_{\alpha\gamma\cdot\beta}(\theta)+K_{\alpha\beta\gamma}(\theta)$. Hence the following theorem holds:

THEOREM 1. Under conditions (i)-(iv),
$$
\sqrt n(\hat\theta_{ML}-\theta)=U-\frac{1}{2\sqrt n}I^{-1}V+\frac{1}{\sqrt n}I^{-1}W+o_p\Bigl(\frac{1}{\sqrt n}\Bigr),
$$
where $I=(I_{\alpha\beta})$ and $P=(P_{\alpha\beta\gamma})$ are matrices, $V=\bigl(\sum_\beta\sum_\gamma P_{\alpha\beta\gamma}U_\beta U_\gamma\bigr)$, $W=\bigl(\sum_\beta U_\beta Z_{\alpha\beta}\bigr)$ and $U=(U_\alpha)$ are $p$-dimensional column vectors, $U_\alpha=\sum_\beta I^{\alpha\beta}Z_\beta$, and $I^{\alpha\beta}$ denotes the $(\alpha,\beta)$-element of the inverse of the information matrix $I$. Since the proof of the theorem is essentially the same as that of the one parameter case ([9]), it is omitted. Put
$$
\hat\theta_{ML}^*=\hat\theta_{ML}+\frac{1}{6n}I^{-1}Y,\quad\text{where }
Y=\Bigl(\sum_\beta\sum_\gamma K_{\alpha\beta\gamma}(\hat\theta_{ML})\,I^{\beta\gamma}(\hat\theta_{ML})\Bigr)\text{ is a column vector.}
$$
Then $\hat\theta_{ML}^*$ is second order AMU. From Theorem 1 we have established the following:

THEOREM 2. Under conditions (i)-(iv), $\hat\theta_{ML}^*$ is second order asymptotically efficient in the class $A_2$.

Since the proof of the theorem is essentially the same as that of the one parameter case ([9]), it is omitted. It will now be shown that the generalized Bayes estimator w.r.t. a loss function and a prior density is second order asymptotically efficient. Let $\theta_0$ be the true value of the parameter $\theta\ (\in\Theta)$, and denote $\theta$ and $\theta_0$ by $(\theta_1,\dots,\theta_p)'$ and $(\theta_{01},\dots,\theta_{0p})'$ respectively. Further we assume the following:
(v) For each $\alpha=1,\dots,p$, $\pi(\theta)$ is twice partially differentiable in $\theta_\alpha$.
Then we have
$$
p_n(\theta\,|\,\tilde x_n)/p_n(\theta_0\,|\,\tilde x_n)
=\exp\Bigl[\sum_{i=1}^n\log f(x_i,\theta)-\sum_{i=1}^n\log f(x_i,\theta_0)+\log\pi(\theta)-\log\pi(\theta_0)\Bigr]
$$
$$
=\exp\Bigl[\sqrt n\sum_\alpha Z_\alpha(\theta_0)(\theta_\alpha-\theta_{0\alpha})
+\frac{1}{2}\sum_\alpha\sum_\beta\{\sqrt n\,Z_{\alpha\beta}(\theta_0)-nI_{\alpha\beta}(\theta_0)\}(\theta_\alpha-\theta_{0\alpha})(\theta_\beta-\theta_{0\beta})
+\frac{n}{6}\sum_\alpha\sum_\beta\sum_\gamma W_{\alpha\beta\gamma}(\theta^*)(\theta_\alpha-\theta_{0\alpha})(\theta_\beta-\theta_{0\beta})(\theta_\gamma-\theta_{0\gamma})
+\sum_\alpha\frac{\pi_\alpha(\theta_0)}{\pi(\theta_0)}(\theta_\alpha-\theta_{0\alpha})+o_p\Bigl(\frac{1}{\sqrt n}\Bigr)\Bigr],
$$
where $\pi_\alpha(\theta)=\partial\pi(\theta)/\partial\theta_\alpha$ $(\alpha=1,\dots,p)$. Putting $t_\alpha=\sqrt n(\theta_\alpha-\theta_{0\alpha})$ $(\alpha=1,\dots,p)$ we obtain
$$
p_n(\theta\,|\,\tilde x_n)/p_n(\theta_0\,|\,\tilde x_n)
=\exp\Bigl[\sum_\alpha Z_\alpha(\theta_0)t_\alpha
+\frac{1}{2}\sum_\alpha\sum_\beta\Bigl\{\frac{1}{\sqrt n}Z_{\alpha\beta}(\theta_0)-I_{\alpha\beta}(\theta_0)\Bigr\}t_\alpha t_\beta
-\frac{1}{6\sqrt n}\sum_\alpha\sum_\beta\sum_\gamma P_{\alpha\beta\gamma}(\theta_0)t_\alpha t_\beta t_\gamma
+\frac{1}{\sqrt n\,\pi(\theta_0)}\sum_\alpha\pi_\alpha(\theta_0)t_\alpha+o_p\Bigl(\frac{1}{\sqrt n}\Bigr)\Bigr]
$$
$$
=\Bigl(\exp\frac{1}{2}\sum_\alpha\sum_\beta I_{\alpha\beta}(\theta_0)U_\alpha U_\beta\Bigr)
\Bigl[\exp\Bigl\{-\frac{1}{2}\sum_\alpha\sum_\beta I_{\alpha\beta}(\theta_0)(t_\alpha-U_\alpha)(t_\beta-U_\beta)\Bigr\}\Bigr]
\Bigl\{1+\frac{1}{\sqrt n\,\pi(\theta_0)}\sum_\alpha\pi_\alpha(\theta_0)t_\alpha
+\frac{1}{2\sqrt n}\sum_\alpha\sum_\beta Z_{\alpha\beta}(\theta_0)t_\alpha t_\beta
-\frac{1}{6\sqrt n}\sum_\alpha\sum_\beta\sum_\gamma P_{\alpha\beta\gamma}(\theta_0)t_\alpha t_\beta t_\gamma
+o_p\Bigl(\frac{1}{\sqrt n}\Bigr)\Bigr\}
=q_n(t,\theta_0\,|\,\tilde x_n),\ \text{say}, \tag{1}
$$
where $t=(t_1,\dots,t_p)'$. Let $\hat t=\sqrt n(\hat\theta-\theta_0)$. Then the posterior risk is given by
$$
r_n^*(\hat t\,|\,\tilde x_n)=\frac{p_n(\theta_0\,|\,\tilde x_n)}{n^{p/2}}\int_{R^p}L^*(\hat t-t)\,q_n(t,\theta_0\,|\,\tilde x_n)\,dt. \tag{2}
$$
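The minimization of the posterior risk in (2) can be illustrated numerically. The following sketch is our own hypothetical example, not from the paper: normal location model, flat prior $\pi\equiv1$, and squared error in place of a bounded quasi-convex loss. Under these assumptions the generalized Bayes estimate coincides with the posterior mean, here the sample mean.

```python
import math, random

# Our own illustration with simplifying assumptions: generalized Bayes estimate
# for the normal location model, flat prior, squared-error loss. The posterior
# risk minimizer is then the posterior mean, which for N(theta, 1) data is xbar.
random.seed(6)
n = 40
xs = [random.gauss(0.3, 1.0) for _ in range(n)]
xbar = sum(xs) / n

def log_post(t):                  # log posterior density up to an additive constant
    return -sum((x - t) ** 2 for x in xs) / 2.0

# posterior risk r(d) = integral (d - t)^2 p(t | x) dt, computed on a grid around xbar
grid = [xbar + (k - 200) * 0.005 for k in range(401)]
w = [math.exp(log_post(t) - log_post(xbar)) for t in grid]
s = sum(w)

def risk(d):
    return sum(wi * (d - t) ** 2 for wi, t in zip(w, grid)) / s

best = min(grid, key=risk)
print(abs(best - xbar))           # the minimizer sits at the posterior mean xbar
```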
Further we assume the following:
(vi) $L^*(u)$ is a convex function;
(vii) For each $\alpha=1,\dots,p$, $\int L^*(-u)\,q_n(u+t,\theta_0\,|\,\tilde x_n)\,du$ is continuously partially differentiable with respect to $t_\alpha$ under the integral sign.

By (2) and the assumption (vi) it is shown that the generalized Bayes estimator $\hat t$ w.r.t. $L^*(\cdot)$ and $\pi(\cdot)$ is given as a solution $\hat u$ of the equations
$$
\frac{\partial}{\partial u_\alpha}\int L^*(u-t)\,q_n(t,\theta_0\,|\,\tilde x_n)\,dt=0\qquad(\alpha=1,\dots,p).
$$
Since by (vii)
$$
\frac{\partial}{\partial u_\alpha}\int L^*(u-t)\,q_n(t,\theta_0\,|\,\tilde x_n)\,dt
=\frac{\partial}{\partial u_\alpha}\int L^*(-t)\,q_n(u+t,\theta_0\,|\,\tilde x_n)\,dt
=\int L^*(-t)\Bigl\{\frac{\partial}{\partial t_\alpha}q_n(t+u,\theta_0\,|\,\tilde x_n)\Bigr\}dt\qquad(\alpha=1,\dots,p),
$$
the generalized Bayes estimator $\hat t$ is obtained as a solution of the equations
$$
\int L^*(-u)\Bigl\{\frac{\partial}{\partial t_\alpha}q_n(\hat t+u,\theta_0\,|\,\tilde x_n)\Bigr\}du=0\qquad(\alpha=1,\dots,p). \tag{3}
$$
Since $\hat t=\sqrt n(\hat\theta-\theta_0)$ and $t=\sqrt n(\theta-\theta_0)$, $\hat t$ may indeed be called the generalized Bayes estimator. From (1) and (3) we have
$$
0=\int L^*(-u)\exp\Bigl[-\frac{1}{2}\sum_\beta\sum_\gamma I_{\beta\gamma}(\theta_0)(\hat t_\beta+u_\beta-U_\beta)(\hat t_\gamma+u_\gamma-U_\gamma)\Bigr]
\Bigl[-\sum_\beta I_{\alpha\beta}(\theta_0)(\hat t_\beta+u_\beta-U_\beta)
\Bigl\{1+\frac{1}{\sqrt n\,\pi(\theta_0)}\sum_\gamma\pi_\gamma(\theta_0)(\hat t_\gamma+u_\gamma)
+\frac{1}{2\sqrt n}\sum_\gamma\sum_\delta Z_{\gamma\delta}(\theta_0)(\hat t_\gamma+u_\gamma)(\hat t_\delta+u_\delta)
-\frac{1}{6\sqrt n}\sum_\gamma\sum_\delta\sum_\varepsilon P_{\gamma\delta\varepsilon}(\theta_0)(\hat t_\gamma+u_\gamma)(\hat t_\delta+u_\delta)(\hat t_\varepsilon+u_\varepsilon)\Bigr\}
+\frac{1}{\sqrt n\,\pi(\theta_0)}\pi_\alpha(\theta_0)
+\frac{1}{\sqrt n}\sum_\beta Z_{\alpha\beta}(\theta_0)(\hat t_\beta+u_\beta)
-\frac{1}{2\sqrt n}\sum_\beta\sum_\gamma P_{\alpha\beta\gamma}(\theta_0)(\hat t_\beta+u_\beta)(\hat t_\gamma+u_\gamma)\Bigr]du
+o_p\Bigl(\frac{1}{\sqrt n}\Bigr).
$$
Putting $\tilde u_\alpha=\hat t_\alpha-U_\alpha$ $(\alpha=1,\dots,p)$, the same equation holds with $\hat t_\beta+u_\beta-U_\beta$ replaced by $\tilde u_\beta+u_\beta$ and $\hat t_\beta+u_\beta$ replaced by $\tilde u_\beta+u_\beta+U_\beta$ throughout. Further we assume the following:
(viii) $L^*(u)$ is a symmetric loss function about the origin.

We define
  M = ∫ L*(−u) exp[ −½ Σ_α Σ_β I_{αβ}(θ₀) u_α u_β ] du ;

  ρ_{αβ} = ∫ L*(−u) u_α u_β exp[ −½ Σ_α Σ_β I_{αβ}(θ₀) u_α u_β ] du   (α, β = 1, ..., p) ;

  σ_{αβγδ} = ∫ L*(−u) u_α u_β u_γ u_δ exp[ −½ Σ_α Σ_β I_{αβ}(θ₀) u_α u_β ] du   (α, β, γ, δ = 1, ..., p).

Note that by (viii)

  ∫ L*(−u) u_α exp( −½ Σ_α Σ_β I_{αβ}(θ₀) u_α u_β ) du
    = ∫ L*(−u) u_α u_β u_γ exp( −½ Σ_α Σ_β I_{αβ}(θ₀) u_α u_β ) du = 0
(α, β, γ = 1, ..., p). Substituting the expansion above into (3), integrating term by term, and using (viii) together with the moments M, ρ_{αβ} and σ_{αβγδ}, we obtain a linear equation for ũ = (ũ₁, ..., ũ_p)' whose coefficients involve π_α(θ₀), Z_{αβ}(θ₀), ρ_{αβγ}(θ₀) and U_α.
Using a matrix representation we obtain

  (IρI − MI) ũ = (1/√n)(Iρ − ME) π* + (1/√n){ L − ½ (Iρ − ME) V + (Iρ − ME) W } + o_p(1/√n) ,

where E is the unit matrix, I = (I_{αβ}(θ₀)), ρ = (ρ_{αβ}), π* is the column vector with components π_α(θ₀)/π(θ₀), and L, V and W are column vectors with components

  L_α = −(1/6) Σ_β Σ_γ Σ_δ Σ_ε I_{αβ}(θ₀) ρ_{γδε}(θ₀)(σ_{βγδε} + 3 U_γ U_δ ρ_{βε}) ;
  V_α = Σ_β Σ_γ ρ_{αβγ}(θ₀)(ρ_{βγ} + U_β U_γ M) ;
  W_α = Σ_β U_β Z_{αβ}(θ₀) .

Since it is derived from (vi) that the matrix IρI − MI is positive definite, it follows that

(4)  ũ = (1/√n) I⁻¹ π* + (1/√n)(IρI − MI)⁻¹ L − (1/(2√n)) I⁻¹ V + (1/√n) I⁻¹ W + o_p(1/√n) .
Hence we have

(5)  t̂_α = ũ_α + U_α   (α = 1, ..., p),

where ũ = (ũ₁, ..., ũ_p)' is given by (4). Since t̂ = √n(θ̂ − θ₀), we modify θ̂ to be second order AMU and denote the modified estimator by θ̂*. From Theorem 1, (4) and (5) it follows that the MLE θ̂_ML is asymptotically equivalent to the generalized Bayes estimator θ̂* up to order n^{-1/2}. By Theorem 2 it is seen that θ̂* is second order asymptotically efficient in the class A₂.

We have defined in [4], [11] and [12] the class D as the set of all third order AMU estimators θ̂ satisfying the following:
(a) θ̂ is asymptotically expanded as

  √n (θ̂ − θ) = U + (1/√n) Q + o_p(1/√n)

with Q_α = O_p(1) (α = 1, ..., p) and E(U_α Q_β^k) = o(1) (k = 1, 2) for all α, β = 1, ..., p, where E denotes the asymptotic expectation and

  U_α = Σ_β I^{αβ}(θ) (1/√n) Σ_i (∂/∂θ_β) log f(X_i, θ)   (α = 1, ..., p);

(b) the joint distribution of θ̂ admits an Edgeworth expansion.

It follows from (4) and (5) that the generalized Bayes estimator θ̂* belongs to the class D. Then the asymptotic marginal distribution of θ̂* is equivalent to that of the MLE θ̂_ML up to order n^{-1}.
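The asymptotic equivalence of the MLE and the generalized Bayes estimator up to order n^{-1/2} can be made concrete in a toy model (our illustration, not the paper's: Bernoulli(p) with uniform prior and quadratic loss, where the Bayes estimate is the posterior mean (S+1)/(n+2) and the MLE is S/n).

```python
import numpy as np

# Illustration (ours): for Bernoulli(p) with uniform prior and quadratic loss,
# Bayes estimate = (S+1)/(n+2), MLE = S/n.  Their difference equals
# (1 - 2*S/n)/(n+2) = O(1/n), so sqrt(n) times the gap vanishes: the two
# estimators agree up to order n**(-1/2), as in the setting of Theorem 3.
rng = np.random.default_rng(0)
p, ns = 0.3, [100, 400, 1600]
max_gap = []
for n in ns:
    S = rng.binomial(n, p, size=2000)
    mle = S / n
    bayes = (S + 1) / (n + 2)
    max_gap.append(np.max(np.abs(bayes - mle)))
# n * gap stays bounded (by n/(n+2) < 1), so the gap itself is O(1/n)
scaled = [n * g for n, g in zip(ns, max_gap)]
```

Since the gap is O(1/n), it is invisible at the first and second order but matters exactly at the third-order level where class D comparisons are made.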
Since θ̂_ML is third order asymptotically efficient in the class D ([4], [10], [11]), θ̂* is also. Hence we have established:

THEOREM 3. Under the assumptions (i)-(viii), the generalized Bayes estimator θ̂* is second order asymptotically efficient in the class A₂ and also third order asymptotically efficient in the class D.

UNIVERSITY OF TOKYO
UNIVERSITY OF ELECTRO-COMMUNICATIONS
REFERENCES

[1] Akahira, M. (1975). Asymptotic theory for estimation of location in non-regular cases, I: Order of convergence of consistent estimators, Rep. Statist. Appl. Res., JUSE, 22, 8-26.
[2] Akahira, M. and Takeuchi, K. (1976). On the second order asymptotic efficiency of estimators in multiparameter cases, Rep. Univ. Electro-Comm., 26, 261-269.
[3] Akahira, M. and Takeuchi, K. (1979). Discretized likelihood methods: Asymptotic properties of discretized likelihood estimators (DLE's), Ann. Inst. Statist. Math., 31, A, 39-56.
[4] Akahira, M. and Takeuchi, K. (1979). The Concept of Asymptotic Efficiency and Higher Order Asymptotic Efficiency in Statistical Estimation Theory, Lecture Note.
[5] Gusev, S. I. (1975). Asymptotic expansions associated with some statistical estimators in the smooth case I. Expansions of random variables, Theory Prob. Appl., 20, 470-498.
[6] Pfanzagl, J. and Wefelmeyer, W. (1978). A third-order optimum property of the maximum likelihood estimator, J. Multivariate Anal., 8, 1-29.
[7] Pfanzagl, J. and Wefelmeyer, W. (1979). Addendum to "A third-order optimum property of the maximum likelihood estimator", J. Multivariate Anal., 9, 179-182.
[8] Strasser, H. (1977). Asymptotic expansions for Bayes procedures, Recent Developments in Statistics (ed. J. R. Barra et al.), North-Holland, 9-35.
[9] Takeuchi, K. and Akahira, M. (1976). On the second order asymptotic efficiencies of estimators, Proceedings of the Third Japan-USSR Symposium on Probability Theory (eds. G. Maruyama and J. V. Prokhorov), Lecture Notes in Mathematics 550, Springer-Verlag, Berlin, 604-638.
[10] Takeuchi, K. and Akahira, M. (1978). Third order asymptotic efficiency of maximum likelihood estimator for multiparameter exponential case, Rep. Univ. Electro-Comm., 28, 271-293.
[11] Takeuchi, K. and Akahira, M. (1978). On the asymptotic efficiency of estimators (in Japanese), A report of the Symposium on Various Problems of Asymptotic Theory, Annual Meeting of the Mathematical Society of Japan, 1-24.
[12] Takeuchi, K. and Akahira, M. (1978). Asymptotic optimality of the generalized Bayes estimator, Rep. Univ. Electro-Comm., 29, 37-45.
[13] Takeuchi, K. and Akahira, M. (1979). Note on non-regular asymptotic estimation: What "non-regularity" implies, Rep. Univ. Electro-Comm., 30, 63-66.
[14] Takeuchi, K. and Akahira, M. (1979). Third order asymptotic efficiency of maximum likelihood estimator in general case, (to appear).
Rep. Univ. Electro-Comm. 30-1, (Sci. & Tech. Sect.), pp. 63-66 August, 1979
Note on Non-Regular Asymptotic Estimation: What "Non-Regularity" Implies*
Kei TAKEUCHI** and Masafumi AKAHIRA***

Abstract
In the asymptotic theory of statistical estimation the authors tried to compare the regular versus the non-regular situation to clarify the significance and implication of each of the regularity conditions. This paper summarizes the results thus far obtained.
1. Introduction
Asymptotic theory of estimation is usually based on a set of regularity conditions. The regular case has been studied by many authors, from Cramér [5] to Huber [6] and Pfanzagl [7], among others. The non-regular case has been studied by various authors on various occasions, and Weiss and Wolfowitz [9] gave general results covering non-regular cases. The authors tried to compare the regular versus the non-regular situation to clarify the significance and implication of each of the regularity conditions ([3]). In this paper we summarize the results thus far obtained.
2. Results
Suppose that X₁, X₂, ..., X_n, ... is a sequence of i.i.d. random variables whose distribution F_ξ depends on an unknown parameter ξ ∈ Ξ ⊂ R^p. A parameter to be estimated is considered to be a real-valued function θ = θ(ξ) of ξ. In the regular case it is assumed that
(i) the distribution F_ξ is dominated by some measure μ with density f_ξ;
(ii) the support, i.e. the set A_ξ = {x : f_ξ(x) > 0}, is independent of ξ;
(iii) f_ξ(x) is continuous in ξ for almost all x [μ];
(iv) f_ξ(x) is continuously differentiable (up to a specified order) in ξ for almost all x [μ];
(v) the information matrix I_ξ = E[{(∂/∂ξ) log f_ξ(x)}{(∂/∂ξ) log f_ξ(x)}'] exists.
Moreover it is usually assumed that
(vi) θ(ξ) is continuously differentiable (up to a specified order) in ξ.
Then, with some other set of regularity conditions, it is shown that if the maximum likelihood estimator (m.l.e.) is θ̂ = θ(ξ̂), where ξ̂ is the m.l.e. of ξ, then √n(θ̂ − θ) is asymptotically normal with mean 0 and variance (∂θ/∂ξ)' I_ξ⁻¹ (∂θ/∂ξ), and θ̂ is asymptotically efficient; it is also shown to be second order asymptotically efficient.

* Received on May 30, 1979. This paper was submitted to the Second Vilnius Conference on Probability Theory and Mathematical Statistics, 1977, which the authors could not attend to present it.
** University of Tokyo
*** Statistical Laboratory, University of Electro-Communications
The conclusion involves the following statements:
(a) the order of convergence of estimators is n^{1/2};
(b) the asymptotic bound for the asymptotically efficient estimators is given by the normal distribution;
(c) the m.l.e. asymptotically attains that bound uniformly;
(d) it also attains the bound up to the order n^{-1/2} (Pfanzagl [7], Takeuchi and Akahira [8]).
If any of the above assumptions (i)-(v) fails to hold, some of the conclusions above may not necessarily hold true.
(1) In the undominated case, if the family of probability distributions is decomposed into disjoint dominated subclasses such that Ξ = ∪_a Ξ_a, and correspondingly the sample space 𝒳 is also decomposed into 𝒳 = ∪_a 𝒳_a with

  Pr{X ∈ 𝒳_a | ξ} = 1  if ξ ∈ Ξ_a ;   Pr{X ∈ 𝒳_a | ξ} = 0  if ξ ∉ Ξ_a ,

then each subclass Ξ_a can be treated separately. For other undominated cases little, if anything, is known.
(2) If the support depends on ξ, the order of convergence of estimators is not necessarily n^{1/2}. Assuming smoothness of θ(ξ), the order usually depends on the power α defined by

  1 − P_{ξ'}{A_ξ} = O(|ξ − ξ'|^α)  as ξ' approaches ξ,

where P_{ξ'} is the probability measure with the density f_{ξ'}. If f_ξ(x) is sufficiently regular, the maximum order of convergence is n^{1/α} when 0 < α < 2 (Akahira [1]).

(3) Discontinuity of f_ξ(x) in ξ plays a similar role to a non-identical support (actually the latter can be dealt with as the case when f_ξ(x) = 0 for some x).

(4) When the support depends on ξ, a (uniformly) asymptotically efficient estimator may not exist, and the m.l.e. may not be asymptotically good. In the case of convergence of order n^{1/α} (α < 2) the m.l.e. is not asymptotically efficient, nor even asymptotically admissible. Suppose that

  f(x − θ) = c exp{−(x − θ)²/2}  if |x − θ| < 1,  and f(x − θ) = 0 otherwise,

where c is some constant. Then the m.l.e. is asymptotically equivalent to

  min_i X_i + 1 with probability 1/2 ;   max_i X_i − 1 with probability 1/2 ,

while the estimator θ* = (min_i X_i + max_i X_i)/2 is asymptotically uniformly more concentrated. (The asymptotic distributions of the two are equal except for the scale being twice as large for the m.l.e. as for θ*.) In this case there exists a one-sided asymptotically efficient estimator, but no estimator is asymptotically uniformly efficient. If we define two-sided efficiency by inf over ξ of lim_{n→∞} Pr{|θ̂ − θ| ≤ a c_n^{-1}}, together with the asymptotically median unbiased (a.m.u.) condition on θ̂ (Takeuchi and Akahira [8]), then θ* above is uniformly (in ξ) asymptotically two-sided efficient. Also Weiss and Wolfowitz's maximum probability estimator is uniformly inferior to θ* (Akahira and Takeuchi [3], [4]). When the order of convergence is (n log n)^{1/2} and the bound of the asymptotic distributions is normal, then the m.l.e. is asymptotically efficient.

(5) If the density is not smooth enough, while the Fisher information is well defined, then the m.l.e. is asymptotically efficient but not second order (that is, in the order n^{-1/2}) asymptotically efficient as in the regular case. Suppose that

  f(x − θ) = (1/2) exp{−|x − θ|} .

The asymptotic distribution of the m.l.e. θ̂ is given by

  Φ(t) − (t²/(2√n)) φ(t) sgn(t) ,

where Φ(u) = ∫_{−∞}^u φ(t) dt with φ(t) = (1/√(2π)) e^{−t²/2}, while the bound of the asymptotic distributions of a.m.u. estimators is given by

  Φ(t) + (t²/(6√n)) φ(t) sgn(t) ,

which can not be uniformly attained (Takeuchi and Akahira [8]) but is attained for a specified t₀ by an estimator θ̂ satisfying

  −Σ_i |X_i − θ̂| + Σ_i |X_i − θ̂ + t₀/√n| − c = 0 ,

where c is determined so that θ̂ be a.m.u. (Akahira and Takeuchi [2], [3]).

(6) When the Fisher information is infinite, the asymptotic distribution can be obtained if the distribution of (∂/∂θ) log f(X, θ) belongs to the domain of attraction of some stable law, and the m.l.e. may or may not be asymptotically efficient. Suppose that X_i = (X_{1i}, X_{2i}) (i = 1, 2, ...) with

  X_{2i} = θ X_{1i} + U_i   (i = 1, 2, ...),

where the U_i are i.i.d. with density f(u) for which I = ∫ {f'(u)}²/f(u) du = ∞.
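The truncated-normal example in (4) lends itself to simulation. The sketch below (ours; the sample size, parameter value, and rejection sampler are arbitrary choices) compares one m.l.e.-type branch, min(X) + 1, with the midrange θ* = (min(X) + max(X))/2, both of which converge at the non-regular rate n.

```python
import numpy as np

# Monte Carlo sketch (ours) for density c*exp(-(x-theta)^2/2) on |x-theta| < 1.
# The m.l.e. behaves like min(X)+1 (or max(X)-1); the midrange theta* is
# asymptotically more concentrated.  Both errors are O_p(1/n) (alpha = 1).
rng = np.random.default_rng(0)

def sample(theta, n):
    out = np.empty(0)
    while out.size < n:                    # rejection sampling from the
        u = rng.uniform(theta - 1, theta + 1, 2 * n)   # truncated normal
        keep = rng.random(2 * n) < np.exp(-0.5 * (u - theta) ** 2)
        out = np.concatenate([out, u[keep]])
    return out[:n]

theta, n, reps = 2.0, 500, 1000
err_min, err_mid = [], []
for _ in range(reps):
    x = sample(theta, n)
    err_min.append(abs(x.min() + 1 - theta))            # m.l.e.-type branch
    err_mid.append(abs((x.min() + x.max()) / 2 - theta))  # midrange theta*
mae_min, mae_mid = np.mean(err_min), np.mean(err_mid)
```

The mean absolute error of the midrange comes out roughly half that of the min-based estimator, matching the "scale twice as large" remark, while n times either error stays bounded.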
The above examples were chosen to illustrate situations that also arise in more general cases.
References
[1] Akahira, M.: "Asymptotic theory for estimation of location in non-regular cases, I: Order of convergence of consistent estimators," Rep. Stat. Appl. Res., JUSE, 22, 8-26 (1975).
[2] Akahira, M. and Takeuchi, K.: "Discretized likelihood methods: Asymptotic properties of discretized likelihood estimators (DLE's)," Ann. Inst. Statist. Math., 31, Part A, 39-56 (1979).
[3] Akahira, M. and Takeuchi, K.: "The Concept of Asymptotic Efficiency and Higher Order Asymptotic Efficiency in Statistical Estimation Theory," Lecture Note (1979).
[4] Akahira, M. and Takeuchi, K.: "Remarks on the asymptotic efficiency and inefficiency of maximum probability estimators," (to appear).
[5] Cramér, H.: "Mathematical Methods of Statistics," Princeton University Press (1946).
[6] Huber, P.: "The behavior of maximum likelihood estimates under non-standard conditions," Proc. Fifth Berkeley Symp. on Math. Statist. Prob., 1, 221-233 (1967).
[7] Pfanzagl, J.: "Asymptotic expansions related to minimum contrast estimators," Ann. Statist., 1, 993-1026 (1973).
[8] Takeuchi, K. and Akahira, M.: "On the second order asymptotic efficiencies of estimators," Proc. Third Japan-USSR Symp. on Probability Theory, Lecture Notes in Mathematics 550, Springer-Verlag, 604-638 (1976).
[9] Weiss, L. and Wolfowitz, J.: "Maximum Probability Estimators and Related Topics," Lecture Notes in Mathematics 424, Springer-Verlag (1974).
Austral. J. Statist., 22 (3), 1980, 332-335
A NOTE ON PREDICTION SUFFICIENCY (ADEQUACY) AND SUFFICIENCY¹
MASAFUMI AKAHIRA AND KEI TAKEUCHI
University of Electro-Communications and University of Tokyo

Summary
Let (𝒳, 𝒜) be a measurable space and 𝒫 a family of probability measures on 𝒜. Let ℬ and 𝒞 be sub-σ-algebras of 𝒜 and ℬ₀ a sub-σ-algebra of ℬ. It is shown that if ℬ₀ is prediction sufficient (adequate) for ℬ with respect to 𝒞 and 𝒫, and 𝒟 is sufficient for ℬ₀∨𝒞 with respect to 𝒫, then 𝒟 is sufficient for ℬ∨𝒞 with respect to 𝒫; that if 𝒫 is homogeneous and (ℬ₀; ℬ, 𝒞) is Markov for 𝒫, and ℬ₀∨𝒞 is sufficient for ℬ∨𝒞 with respect to 𝒫, then ℬ₀ is sufficient for ℬ with respect to 𝒫; and by example that the Markov property is necessary for the latter proposition to hold.
1. Introduction
Prediction sufficiency (adequacy) was defined and discussed by, among others, Skibinsky (1967), Takeuchi & Akahira (1975), and Torgersen (1977). The purpose of this note is to show the relation between prediction sufficiency and the "usual" concept of sufficiency in the following situation. Suppose that the observations are given sequentially in time in two sets. Let (X, Y) be a set of observations where X is observed first and Y comes later. In such a situation one may be interested to know how much information in X should be reserved to be later combined with Y without any loss of information in the combined sample. Technically it amounts to giving the condition on the statistic T = t(X) for the combined statistic (T, Y) to be sufficient for (X, Y). It is shown that it is sufficient that T be prediction sufficient for Y, and it is also necessary under further conditions. We may further add that in this formulation the distribution of Y can depend on X in a general manner, so that we can include the sequential design case in which the design (that is, the observation scheme) of Y is determined depending on X. We will, essentially, use the framework of Skibinsky (1967). The notion of adequacy in Skibinsky's paper is, however, replaced by the notion of prediction sufficiency.

¹ Manuscript received September 13, 1978; revised January 16, 1979.
2. Results
Let (𝒳, 𝒜) be a measurable space and 𝒫 a family of probability measures on 𝒜. We denote by 𝒜₁∨𝒜₂ the smallest σ-algebra which contains each member of two subclasses 𝒜₁, 𝒜₂ of 𝒜. Let χ_A denote the indicator function of a set A. ℬ and 𝒞 denote sub-σ-algebras of 𝒜, and ℬ₀ denotes a sub-σ-algebra of ℬ.

Definition 1. ℬ₀ is said to be sufficient for ℬ with respect to 𝒫, symbolically ℬ₀ suf (𝒫; ℬ), if for each B ∈ ℬ there exists a ℬ₀-measurable function φ_B^{ℬ₀} such that for every p ∈ 𝒫

  φ_B^{ℬ₀} = E_p^{ℬ₀} χ_B  a.e. [p],

where E_p^{ℬ₀} χ_B denotes the conditional expectation under p of χ_B given ℬ₀.

Definition 2 (Skibinsky, 1967; Takeuchi & Akahira, 1975). ℬ₀ is said to be prediction sufficient (adequate) for ℬ with respect to 𝒞 and 𝒫, symbolically ℬ₀ pred. suf (𝒫; ℬ, 𝒞), if ℬ₀ suf (𝒫; ℬ) and (ℬ₀; ℬ, 𝒞) is Markov for 𝒫, that is, ℬ and 𝒞 are conditionally independent given ℬ₀ for each p ∈ 𝒫.

Lemma. Let (𝒳, 𝒜, p) be a probability space. (ℬ₀; ℬ, 𝒞) is Markov for p if and only if for every B ∈ ℬ and every C ∈ 𝒞

  E_p^{ℬ₀∨𝒞}(χ_{B∩C}) = χ_C E_p^{ℬ₀}(χ_B)  a.e. [p].

Proof. Suppose that (ℬ₀; ℬ, 𝒞) is Markov for p. For every B ∈ ℬ, every C ∈ 𝒞, every B₀ ∈ ℬ₀ and every C' ∈ 𝒞,

  ∫_{B₀∩C'} χ_{B∩C} dp = E_p(χ_{B₀∩B} χ_{C'∩C})
    = E_p[E_p^{ℬ₀}(χ_{B₀∩B} χ_{C'∩C})]
    = E_p[(E_p^{ℬ₀} χ_{B₀∩B})(E_p^{ℬ₀} χ_{C'∩C})]
    = E_p[E_p^{ℬ₀}(χ_{C'∩C} E_p^{ℬ₀} χ_{B₀∩B})]
    = E_p[χ_{C'∩C} E_p^{ℬ₀} χ_{B₀∩B}]
    = E_p[χ_{B₀∩C'∩C} E_p^{ℬ₀} χ_B]
    = ∫_{B₀∩C'} χ_C E_p^{ℬ₀} χ_B dp .

Since {B₀ ∩ C : B₀ ∈ ℬ₀, C ∈ 𝒞} generates the σ-algebra ℬ₀∨𝒞, it follows that for every B ∈ ℬ and every C ∈ 𝒞

  E_p^{ℬ₀∨𝒞} χ_{B∩C} = χ_C E_p^{ℬ₀} χ_B  a.e. [p].
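The Lemma can be verified mechanically on a finite probability space. In this toy construction (ours; the probabilities qg, rg are arbitrary), b and c are conditionally independent given g, so (ℬ₀; ℬ, 𝒞) is Markov for the σ-algebras ℬ₀ = σ(g), ℬ = σ(g, b), 𝒞 = σ(c).

```python
import itertools

# Finite-space check (our toy example) of the Lemma's identity
#   E^{B0 v C}(chi_{B∩C}) = chi_C * E^{B0}(chi_B)
# Sample points are omega = (g, b, c); b and c are conditionally
# independent given g by construction.
qg = {0: 0.2, 1: 0.7}     # P(b = 1 | g)
rg = {0: 0.6, 1: 0.3}     # P(c = 1 | g)
prob = {}
for g, b, c in itertools.product([0, 1], repeat=3):
    pb = qg[g] if b else 1 - qg[g]
    pc = rg[g] if c else 1 - rg[g]
    prob[(g, b, c)] = 0.5 * pb * pc

def cond_exp(indicator, partition):
    # E[indicator | sigma(partition)] evaluated pointwise on sample points
    out = {}
    for w in prob:
        atom = [v for v in prob if partition(v) == partition(w)]
        out[w] = sum(indicator(v) * prob[v] for v in atom) / sum(prob[v] for v in atom)
    return out

chi_BC = lambda w: 1.0 if (w[1] == 1 and w[2] == 1) else 0.0   # B = {b=1}, C = {c=1}
chi_B = lambda w: 1.0 if w[1] == 1 else 0.0
lhs = cond_exp(chi_BC, lambda w: (w[0], w[2]))   # condition on B0 v C = sigma(g, c)
eb = cond_exp(chi_B, lambda w: w[0])             # condition on B0 = sigma(g)
rhs = {w: (1.0 if w[2] == 1 else 0.0) * eb[w] for w in prob}
gap = max(abs(lhs[w] - rhs[w]) for w in prob)
```

The maximal pointwise gap between the two sides is numerically zero, exactly as the Lemma asserts.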
Conversely, if for every B ∈ ℬ and every C ∈ 𝒞

  E_p^{ℬ₀∨𝒞} χ_{B∩C} = χ_C E_p^{ℬ₀} χ_B  a.e. [p],

then for every B ∈ ℬ and every C ∈ 𝒞

  E_p^{ℬ₀}(χ_{B∩C}) = E_p^{ℬ₀}(E_p^{ℬ₀∨𝒞} χ_{B∩C}) = E_p^{ℬ₀}(χ_C E_p^{ℬ₀} χ_B) = E_p^{ℬ₀} χ_B · E_p^{ℬ₀} χ_C  a.e. [p].

Hence (ℬ₀; ℬ, 𝒞) is Markov for p.

Theorem 1. Let 𝒟 be a sub-σ-algebra of ℬ₀∨𝒞. Suppose that ℬ₀ pred. suf (𝒫; ℬ, 𝒞) and 𝒟 suf (𝒫; ℬ₀∨𝒞). Then 𝒟 suf (𝒫; ℬ∨𝒞).

Proof. For every B ∈ ℬ, every C ∈ 𝒞 and every p ∈ 𝒫

  E_p^{ℬ₀∨𝒞} χ_{B∩C} = χ_C E_p^{ℬ₀} χ_B  (by the Lemma)
    = χ_C φ_B^{ℬ₀}  (by ℬ₀ suf (𝒫; ℬ))  a.e. [p].

Since χ_C φ_B^{ℬ₀} is a ℬ₀∨𝒞-measurable function and 𝒟 suf (𝒫; ℬ₀∨𝒞), it follows that there exists a 𝒟-measurable function φ_{B∩C} such that for every p ∈ 𝒫

  φ_{B∩C} = E_p^{𝒟}(χ_C φ_B^{ℬ₀}) = E_p^{𝒟} χ_{B∩C}  a.e. [p].

Since {B ∩ C : B ∈ ℬ, C ∈ 𝒞} generates ℬ∨𝒞, it is seen that 𝒟 suf (𝒫; ℬ∨𝒞).

For Theorem 2 we assume that 𝒫 is homogeneous, that is, p and q are mutually absolutely continuous for every p and q in 𝒫.

Theorem 2. Suppose that (ℬ₀; ℬ, 𝒞) is Markov for 𝒫 and ℬ₀∨𝒞 suf (𝒫; ℬ∨𝒞). Then ℬ₀ suf (𝒫; ℬ).

Proof. Since 𝒫 is homogeneous, ℬ₀∨𝒞 suf (𝒫; ℬ∨𝒞) and (ℬ₀; ℬ, 𝒞) is Markov for 𝒫, it follows by the Lemma that for every B ∈ ℬ there exists a ℬ₀-measurable function φ_B^{ℬ₀} such that for every p ∈ 𝒫

  E_p^{ℬ₀}(χ_B) = E_p^{ℬ₀}(E_p^{ℬ₀∨𝒞} χ_B) = E_p^{ℬ₀}(φ_B^{ℬ₀}) = φ_B^{ℬ₀}  a.e. [p].

Hence ℬ₀ suf (𝒫; ℬ).

Remark. It is shown by the following counter-example that the homogeneity of 𝒫 is necessary for Theorem 2. Suppose that ℬ = 𝒞 = {∅, B, Bᶜ, 𝒳}, ℬ₀ = {∅, 𝒳} and 𝒫 = {p, q} with p(B) = q(Bᶜ) = 1. Then (ℬ₀; ℬ, 𝒞) is Markov for 𝒫 and ℬ₀∨𝒞 suf (𝒫; ℬ∨𝒞), but ℬ₀ is not sufficient for ℬ with respect to 𝒫.
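The counter-example in the Remark can be spelled out in a few lines (our illustration): on the two-point space {0, 1} with B = {1}, the trivial σ-algebra ℬ₀ forces the sufficiency witness φ_B to be a constant, and no single constant works for both p and q.

```python
# Counter-example from the Remark, made explicit (our illustration):
# sample space {0, 1}, B = {1}; B0 trivial; P = {p, q}, p(B) = 1, q(B) = 0.
p = {0: 0.0, 1: 1.0}
q = {0: 1.0, 1: 0.0}

# For the trivial sigma-algebra B0, E_m^{B0}(chi_B) is the constant m(B):
Ep_B0 = sum(m for w, m in p.items() if w == 1)   # = p(B) = 1
Eq_B0 = sum(m for w, m in q.items() if w == 1)   # = q(B) = 0

# B0 suf (P; B) would need ONE B0-measurable (hence constant) phi_B with
# phi_B = 1 a.e. [p] and phi_B = 0 a.e. [q].  Since p charges w = 1 and q
# charges w = 0, the constant would have to be 1 and 0 at the same time.
needs = {1: Ep_B0, 0: Eq_B0}     # values forced at the points of positive mass
constant_works = (needs[0] == needs[1])
```

Under homogeneity this escape is impossible: p-null sets are q-null, so the two almost-everywhere constraints would coincide, which is exactly where Theorem 2 uses the assumption.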
The following example shows that the Markov property is necessary for Theorem 2 to hold. Suppose that (X, Y, Z) are jointly normally distributed with means (θ, 0, θ), where θ is the unknown parameter, and with known variances and covariances to be specified later. Let ℬ₀, ℬ and 𝒞 be the σ-algebras induced by, respectively, X, (X, Y) and Z. Let us assume that Y can be expressed as Y = X − Z + U, where U is independent of X and Z. We further assume that Cov(X, Z) = 0 and V(X) = V(Z) = V(U) = 1, where Cov and V denote covariance and variance, respectively. Then it follows that E(U) = 0, V(Y) = 3, Cov(Y, X) = 1 and Cov(Y, Z) = −1. And given (X, Z), Y is conditionally normally distributed with mean X − Z and variance 1; hence its conditional distribution is independent of θ. Therefore (X, Z) is sufficient for (X, Y, Z), or equivalently ℬ₀∨𝒞 is sufficient for ℬ∨𝒞. However, X is not sufficient for (X, Y), that is, ℬ₀ is not sufficient for ℬ, because given X, Y is conditionally normally distributed with mean X − θ and variance 2, which is dependent on θ. Nor are Y and Z conditionally independent given X; hence (ℬ₀; ℬ, 𝒞) is not Markov.

References
Skibinsky, M. (1967). Adequate subfields and sufficiency. Ann. Math. Statist. 38, 155-161.
Takeuchi, K. & Akahira, M. (1975). Characterizations of prediction sufficiency (adequacy) in terms of risk functions. Ann. Statist. 3, 1018-1024.
Torgersen, E. N. (1977). Prediction sufficiency when the loss function does not depend on the unknown parameter. Ann. Statist. 5, 155-163.
Rep. Univ. Electro-Comm. 31-1, (Sci. & Tech. Sect.), pp. 89-96 August, 1980
Third Order Asymptotic Efficiency and Asymptotic Completeness of Estimators* Kei TAKEUCHI** and Masafumi AKAHIRA***
Abstract
In the previous paper it was shown that the maximum likelihood estimator (MLE) was third order asymptotically efficient in multiparameter exponential cases. In this paper it is shown that the result extends to more general cases. The concept of asymptotic completeness of an estimator is introduced and it is shown that the MLE is higher order asymptotically complete in the appropriate classes.
1. Introduction
In the previous papers [6], [7] and [3] Takeuchi and Akahira have discussed the third order asymptotic efficiency of the maximum likelihood estimator (MLE) and of the generalized Bayes estimator in multiparameter exponential cases. In this paper we show that it is possible to extend the result on the MLE to more general cases in a very straightforward manner by introducing appropriate classes of estimators. Further we introduce the concept of asymptotic completeness of an estimator, in the sense that it can beat asymptotically any other "regular" estimator after appropriate transformations if necessary, and show that the MLE is higher order asymptotically complete in the appropriate classes.

* Received on June 10, 1980.
** University of Tokyo
*** Statistical Laboratory, University of Electro-Communications

2. Third order asymptotic efficiency of the maximum likelihood estimator in general cases
In order to generalize the results in the previous paper [6], it is necessary to define appropriate classes of estimators within which asymptotic distributions are to be compared. As was shown in Akahira and Takeuchi [3], it is possible to establish the second order asymptotic efficiency of the modified MLE within the class of second order AMU estimators without any further restriction of the class of estimators to be considered. On the contrary, in the previous paper [6] we restricted the class of estimators to a special type of functions of the sufficient statistic which we called the extended regular estimators. As will be shown later, we can not usually have third order asymptotically efficient estimators without such a restriction. In more general cases, where sufficient statistics of finite dimension do not exist, the argument in the previous paper [6] can not be applied directly. We shall establish the third order asymptotic efficiency of the modified MLE and also of the modified generalized Bayes estimator within the classes C and D of estimators which are defined below. For simplicity's sake
we shall first consider the real parameter (one-dimensional) case, and assume that the sample is independently and identically distributed, with all necessary regularity conditions to be specified later.

Let (𝒳, ℬ) be a sample space. We consider a family of probability measures on ℬ, 𝒫 = {P_θ : θ ∈ Θ}, where the index set Θ is called a parameter space. We assume that Θ is an open set in a Euclidean 1-space R¹. Consider the n-fold direct products (𝒳⁽ⁿ⁾, ℬ⁽ⁿ⁾) of (𝒳, ℬ) and the corresponding product measures P_{θ,n} of P_θ. For each n = 1, 2, ..., the points of 𝒳⁽ⁿ⁾ will be denoted by x̃_n = (x₁, ..., x_n). An estimator of θ is defined to be a sequence {θ̂_n} of ℬ⁽ⁿ⁾-measurable functions θ̂_n on 𝒳⁽ⁿ⁾ into Θ (n = 1, 2, ...). For simplicity we denote an estimator by θ̂_n instead of {θ̂_n}. For an increasing sequence {c_n} of positive numbers (c_n tending to infinity) an estimator θ̂_n is called consistent with order {c_n} (or {c_n}-consistent for short) if for every ε > 0 and every ϑ ∈ Θ there exist a sufficiently small positive number δ and a sufficiently large positive number L satisfying the following:

  lim sup_{n→∞} sup_{θ : |θ−ϑ|<δ} P_{θ,n}{ c_n |θ̂_n − θ| ≥ L } < ε .
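The notion of order of convergence can be illustrated numerically (our sketch, using two standard textbook models rather than examples from the paper): the sample mean in N(θ, 1) is {√n}-consistent, while the MLE max(X) in U(0, θ) is {n}-consistent, so c_n times the error stays bounded in probability in each case.

```python
import numpy as np

# Illustration (ours) of {c_n}-consistency: c_n * |estimator - theta| has
# stable quantiles as n grows.  N(theta,1): c_n = sqrt(n); U(0,theta): c_n = n.
rng = np.random.default_rng(0)
theta, reps = 1.0, 2000
qs_mean, qs_max = [], []
for n in (100, 400, 1600):
    xm = rng.normal(theta, 1.0, (reps, n)).mean(axis=1)     # sample mean
    um = rng.uniform(0.0, theta, (reps, n)).max(axis=1)     # uniform MLE
    qs_mean.append(np.quantile(np.sqrt(n) * np.abs(xm - theta), 0.95))
    qs_max.append(np.quantile(n * (theta - um), 0.95))
# qs_mean hovers near the N(0,1) quantile 1.96; qs_max near the Exp(1)
# quantile 3.0 -- both bounded, confirming the respective orders.
```

The second model is exactly the kind of non-regular (support-dependent) case in which c_n grows faster than √n.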
For each k = 1, 2, ..., a {c_n}-consistent estimator θ̂_n is called k-th order asymptotically median unbiased (or k-th order AMU) if for any ϑ ∈ Θ there exists a positive number δ such that

  lim_{n→∞} sup_{θ : |θ−ϑ|<δ} c_n^{k−1} | P_{θ,n}{θ̂_n ≤ θ} − 1/2 | = 0 ;
  lim_{n→∞} sup_{θ : |θ−ϑ|<δ} c_n^{k−1} | P_{θ,n}{θ̂_n ≥ θ} − 1/2 | = 0 .

For θ̂_n k-th order AMU, G₀(t, θ) + c_n^{−1} G₁(t, θ) + ... + c_n^{−(k−1)} G_{k−1}(t, θ) is called the k-th order asymptotic distribution of c_n(θ̂_n − θ) (or of θ̂_n for short) if

  lim_{n→∞} c_n^{k−1} | P_{θ,n}{c_n(θ̂_n − θ) ≤ t} − G₀(t, θ) − c_n^{−1} G₁(t, θ) − ... − c_n^{−(k−1)} G_{k−1}(t, θ) | = 0 .
Note that G_i(t, θ) (i = 1, ..., k−1) may be a general absolutely continuous function; hence the asymptotic distribution for any fixed n may not be a distribution function.

Suppose that θ̂_n is k-th order AMU and has the k-th order asymptotic distribution G₀(t, θ) + c_n^{−1} G₁(t, θ) + ... + c_n^{−(k−1)} G_{k−1}(t, θ). Letting θ₀ (∈ Θ) be arbitrary but fixed, we consider the problem of testing the hypothesis

  H⁺ : θ = θ₀ + t c_n^{−1}  (t > 0)  against  K : θ = θ₀ .

Put

  Φ_{1/2} = { {φ_n} : E_{θ₀+tc_n^{−1}, n}(φ_n) = 1/2 + o(c_n^{−(k−1)}), 0 ≤ φ_n(x̃_n) ≤ 1 for all x̃_n ∈ 𝒳⁽ⁿ⁾ (n = 1, 2, ...) } .

Putting A_{θ̂_n, θ₀} = {c_n(θ̂_n − θ₀) ≤ t} we have

  lim_{n→∞} P_{θ₀+tc_n^{−1}, n}(A_{θ̂_n, θ₀}) = lim_{n→∞} P_{θ₀+tc_n^{−1}, n}{θ̂_n ≤ θ₀ + t c_n^{−1}} = 1/2 .

Hence it is seen that the sequence {χ_{A_{θ̂_n, θ₀}}} of the indicators of A_{θ̂_n, θ₀} (n = 1, 2, ...) belongs to Φ_{1/2}. If

  sup_{{φ_n} ∈ Φ_{1/2}} lim_{n→∞} c_n^{k−1} { E_{θ₀, n}(φ_n) − H₀⁺(t, θ₀) − c_n^{−1} H₁⁺(t, θ₀) − ... − c_n^{−(k−1)} H_{k−1}⁺(t, θ₀) } = 0 ,

then we have G₀(t, θ₀) ≤ H₀⁺(t, θ₀); and for any positive integer j (≤ k − 1), if G_i(t, θ₀) = H_i⁺(t, θ₀) (i = 0, 1, ..., j − 1), then G_j(t, θ₀) ≤ H_j⁺(t, θ₀).
Consider next the problem of testing the hypothesis H⁻ : θ = θ₀ + t c_n^{−1} (t < 0) against K : θ = θ₀. If

  inf_{{φ_n} ∈ Φ_{1/2}} lim_{n→∞} c_n^{k−1} { E_{θ₀, n}(φ_n) − H₀⁻(t, θ₀) − c_n^{−1} H₁⁻(t, θ₀) − ... − c_n^{−(k−1)} H_{k−1}⁻(t, θ₀) } = 0 ,

then we have G₀(t, θ₀) ≥ H₀⁻(t, θ₀); and for any positive integer j (≤ k − 1), if G_i(t, θ₀) = H_i⁻(t, θ₀) (i = 0, 1, ..., j − 1), then G_j(t, θ₀) ≥ H_j⁻(t, θ₀).

θ̂_n is called k-th order asymptotically efficient if its k-th order asymptotic distribution attains uniformly the bound of the k-th order asymptotic distributions of k-th order AMU estimators, that is, for each θ ∈ Θ

  G_i(t, θ) = H_i⁺(t, θ)  for t > 0 ;   G_i(t, θ) = H_i⁻(t, θ)  for t < 0   (i = 0, 1, ..., k − 1)

([2], [3], [4], [5]). [Note that for t = 0 we have G_i(0, θ) = H_i⁺(0, θ) = H_i⁻(0, θ) (i = 0, ..., k−1) from the condition of k-th order asymptotic median unbiasedness.] Thus if θ̂_n* is k-th order asymptotically efficient in the above sense, then for any k-th order AMU estimator θ̂_n we have
(2.1)  lim_{n→∞} c_n^{k−1} [ P_{θ,n}{−a ≤ c_n(θ̂_n* − θ) ≤ b} − P_{θ,n}{−a ≤ c_n(θ̂_n − θ) ≤ b} ] ≥ 0

for all a, b > 0 and θ ∈ Θ. Note that the k-th order asymptotic efficiency defined above gives a sufficient condition for (2.1) but not a necessary one.

We assume that for each θ ∈ Θ, P_θ is absolutely continuous with respect to a σ-finite measure μ. We denote the density dP_θ/dμ by f(x, θ). Then the joint density of (X₁, ..., X_n) is given by ∏_{i=1}^n f(x_i, θ). In this section we shall deal with the case when c_n = √n.
First we shall restrict our attention to the class of (first order) asymptotically efficient estimators, which we shall call the class A. It has been well established that any estimator θ̂_n in the class A can be expressed as

  √n (θ̂_n − θ) = (1/(I(θ) √n)) (∂/∂θ) L_n + o_p(1) ,

where L_n = Σ_{i=1}^n log f(X_i, θ).
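This class-A expansion can be checked by Monte Carlo in a model where every ingredient is explicit (our sketch, using the exponential density f(x, θ) = θ e^{−θx}, for which I(θ) = 1/θ², ∂L_n/∂θ = n/θ − Σx_i, and the MLE is 1/x̄):

```python
import numpy as np

# Check (ours) of sqrt(n)(theta_hat - theta) = (1/(I(theta)sqrt(n))) dL_n/dtheta
# + o_p(1) for the exponential model f(x, theta) = theta*exp(-theta*x).
rng = np.random.default_rng(0)
theta, reps = 1.0, 2000
rem = {}
for n in (100, 1600):
    x = rng.exponential(1.0 / theta, (reps, n))
    xbar = x.mean(axis=1)
    lhs = np.sqrt(n) * (1.0 / xbar - theta)                  # sqrt(n)(MLE - theta)
    score = theta**2 / np.sqrt(n) * (n / theta - n * xbar)   # (1/(I sqrt(n))) dL_n/dtheta
    rem[n] = np.mean(np.abs(lhs - score))                    # size of the remainder
```

The mean remainder shrinks like n^{-1/2} (here it is in fact √n(1 − θx̄)²/x̄ exactly), which is the o_p(1) term of the expansion and, at the next order, the Q(θ)/√n term of the class C below.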
Secondly, we call the class of estimators which are second order AMU and second order asymptotically efficient the class B. Then any estimator θ̂_n which is second order AMU and for which the distribution of √n(θ̂_n − θ) admits the Edgeworth expansion up to the order n^{−1/2} belongs to the class B. Now we carry the idea one step further. We call the class C the class of estimators θ̂_n which are third order AMU, for which the distribution of √n(θ̂_n − θ) admits the Edgeworth expansion up to the order n^{−1}, and for which

  √n (θ̂_n − θ) = (1/(I(θ) √n)) (∂/∂θ) L_n + (1/√n) Q(θ) + o_p(1/√n) ,

where Q(θ) is a quantity of stochastic order 1. We say that the estimator θ̂_n belongs to the class D if in the above we further have
  E[Z₁(θ){Q(θ)}²] = o(1) ,
where E stands for the asymptotic mean. It is to be noted that Lemma 4.1 in the previous paper [6] can be generalized straightforwardly as follows.

Lemma 2.1. Suppose that Z_θ is a function of X₁, ..., X_n and θ and is differentiable in θ. Then

  E_θ(U Z_θ) = (1/√n) { (∂/∂θ) E_θ(Z_θ) − E_θ(∂Z_θ/∂θ) } ,

where U = (1/√n) Σ_{i=1}^n (∂/∂θ) log f(X_i, θ), provided that the differentiation under the integral sign of E_θ(Z_θ) is allowed.

In the following discussions we assume that the expectations of the relevant quantities always exist and that the asymptotic value of the expectation and the asymptotic mean are the same, at least up to the required order.

Remark: The problem of non-existence of the relevant moments vis-a-vis asymptotic moments can be evaded in the following way. Suppose that we have a smooth and monotone increasing transformation η = ψ(θ) such that η is bounded. Instead of θ we may consider η as the parameter to be estimated. Since η is bounded we may assume that all the estimators are also bounded, hence have moments of any order, and we can easily formulate the condition for the asymptotic moments and the moments of the asymptotic distribution being equal. If we consider the estimation of η instead of θ, the asymptotic theory based on moments is applicable in a straightforward way. Transforming back to the estimator of θ by θ̂ = ψ⁻¹(η̂) and considering the monotonicity of the transformation of the distribution, the results are also applicable to the estimation of θ.

Let us first consider the class C. Suppose that √n(θ̂_n − θ) is expanded as

  √n (θ̂_n − θ) = Z₁(θ)/I(θ) + (1/√n) Q(θ) + o_p(1/√n) ,

where Q(θ) = O_p(1) and Z₁(θ) = (1/√n) Σ_i (∂/∂θ) log f(X_i, θ). Then we have the following lemma.

Lemma 2.2. E_θ[Z₁(θ) Q(θ)] = o(1).

Proof. Define T_θ = √n(θ̂_n − θ) − Z₁(θ)/I(θ) = (1/√n) Q(θ) + o_p(1/√n). Applying Lemma 2.1 with Z_θ = T_θ gives

  E_θ[Z₁(θ) T_θ] = (1/√n) { (∂/∂θ) E_θ(T_θ) − E_θ(∂T_θ/∂θ) } = o(1/√n) ,

since for θ̂_n in the class C the quantity in the braces is o(1). From this we have

  E_θ[Z₁(θ) Q(θ)] = √n E_θ[Z₁(θ) T_θ] + o(1) = o(1).
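The differential identity behind Lemma 2.1 is the classical fact E_θ[(∂L_n/∂θ) Z] = (∂/∂θ) E_θ[Z] − E_θ[∂Z/∂θ]; it can be sanity-checked numerically (our sketch, normal location model with Z_θ = x̄ − θ, for which both sides equal 1 after multiplying by √n/... the unnormalized form is used below):

```python
import numpy as np

# Monte Carlo check (ours) of E[(dL_n/dtheta) * Z] = d/dtheta E[Z] - E[dZ/dtheta]
# for X_i ~ N(theta, 1) and Z_theta = xbar - theta:
#   dL_n/dtheta = sum(x_i - theta);  d/dtheta E[Z] = 0;  E[dZ/dtheta] = -1,
# so the right-hand side is 1, and the left-hand side is E[n*(xbar-theta)^2] = 1.
rng = np.random.default_rng(0)
theta, n, reps = 0.5, 50, 50000
x = rng.normal(theta, 1.0, (reps, n))
xbar = x.mean(axis=1)
score = n * (xbar - theta)                 # sum_i (x_i - theta)
lhs = np.mean(score * (xbar - theta))      # should be close to 1
```

Dividing both sides by √n gives exactly the normalized statement of Lemma 2.1 with U = score/√n.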
Thus we have the asymptotic cumulants of √n(θ̂_n − θ) up to the order n^{−1}:

  E_θ[√n(θ̂_n − θ)] = (1/√n) μ₁(θ) + (1/n) μ₂(θ) + o(1/n) ;
  V_θ[√n(θ̂_n − θ)] = 1/I(θ) + (1/n) v₂(θ) + o(1/n) ;
  κ₃[√n(θ̂_n − θ)] = (1/√n) β₃(θ) + (1/n) γ₃(θ) + o(1/n) ;
  κ₄[√n(θ̂_n − θ)] = (1/n) β₄(θ) + o(1/n) .

We can show, in exactly the same way as in the previous papers [5] and [6], that

  β₃(θ) = −{3J(θ) + 2K(θ)}/I(θ)³ ;
  β₄(θ) = −(1/I(θ)⁴){3H*(θ) + 12N*(θ) + 4L(θ)} + (1/I(θ)⁵){2J(θ) + K(θ)}{J(θ) + K(θ)} ,

where

  L(θ) = E_θ[{(∂³/∂θ³) log f(X, θ)}{(∂/∂θ) log f(X, θ)}] ;
  N*(θ) = E_θ[{(∂²/∂θ²) log f(X, θ) + I(θ)}{(∂/∂θ) log f(X, θ)}²] ;
  H*(θ) = E_θ[{(∂/∂θ) log f(X, θ)}⁴] − 3 I(θ)² .

Since the odd order cumulants affect only the asymmetric aspects of the asymptotic distribution, the asymptotic expansion of P_{θ,n}{√n |θ̂_n − θ| ≤ a} is independent of them. Hence we have the following theorem.

Theorem 2.1. Let θ̂_{ML}* be the modified MLE in the class C and θ̂_n be any other estimator in the class C. Then it holds that

  lim_{n→∞} n [ P_{θ,n}{√n |θ̂_{ML}* − θ| ≤ a} − P_{θ,n}{√n |θ̂_n − θ| ≤ a} ] ≥ 0

for all a > 0 and all θ ∈ Θ.

For the class D we further have that γ₃(θ) = 0 and μ₁(θ) = β₃(θ)/6; hence all the asymptotic moments are determined up to the order n^{−1} except v₂. Since v₂ is minimized for the modified MLE, we have the following theorem.

Theorem 2.2. Let θ̂_{ML}** be the modified MLE in the class D and θ̂_n be any other estimator in the class D. Then it holds that

  lim_{n→∞} n [ P_{θ,n}{−a ≤ √n(θ̂_{ML}** − θ) ≤ b} − P_{θ,n}{−a ≤ √n(θ̂_n − θ) ≤ b} ] ≥ 0

for all a > 0, all b > 0 and all θ ∈ Θ.
3. The concept of asymptotic completeness of an estimator
We consider first the case of a real-valued parameter. The concepts of asymptotic efficiency and higher order asymptotic efficiency give a very strong condition for the "goodness" of an estimator, in that it uniformly maximizes asymptotically the probability of its lying within a
neighborhood of the true value, irrespective of the parameter value and the choice of the neighborhood. The only somewhat artificial restriction is that the estimators considered should be (higher order) asymptotically median unbiased. It is, however, easily seen from the context that the only requirement is that the class of estimators to be compared must have the same asymptotic "location", whatever the exact meaning of the word may be, without which nothing can be derived. Therefore one may consider different types of definition of "location", and accordingly different types of "adjustment" of the estimator to satisfy the condition. Thus we may proceed in the following way. Suppose that we want to establish the asymptotic optimality of an estimator θ̂_n⁰, e.g. the MLE, in some class of estimators. What has been virtually established is that for any estimator θ̂_n in the class, if both θ̂_n and θ̂_n⁰ are adjusted as

  θ̂_n* = θ̂_n + (1/n) g(θ̂_n) ;   θ̂_n⁰* = θ̂_n⁰ + (1/n) h(θ̂_n⁰) ,

so that θ̂_n* and θ̂_n⁰* are both (higher order) asymptotically median unbiased, then θ̂_n⁰* is (higher order) asymptotically uniformly "better" than θ̂_n*. If we change the definition of "location", the way of "adjustment" will be changed, but the same results hold for the adjusted estimators. This fact leads to the following question: instead of comparing two adjusted estimators θ̂_n* and θ̂_n⁰*, isn't it possible to compare θ̂_n and the adjusted θ̂_n*? Naturally, the way θ̂_n* is adjusted depends on the θ̂_n to be compared, but it is independent of the parameter. And we introduce the following concept. An estimator θ̂_n* is called asymptotically complete with respect to a class M of estimators if for any θ̂_n ∈ M there is an "adjusted" estimator

  θ̂_n** = θ̂_n* + g_n(θ̂_n*)
so that for any a>0, b>0 and 0 we have lim [P9,- {-a0.
(3.1)
Now let L be a loss function such that L(0)=0 and L(u) is an increasing function for u>0 and a decreasing function for u<0 and it is bounded. If On* is asymptotically complete in the above sense, then for any estimator On in M there is On** which is a function of On* for which lim [ Ee, n( L( cn( On**-B)))- E9,n(L(cn( On- 0)))]<0. (3. 2) n-co
Higher order asymptoic completeness can be defined in a similar way. For each k=2, 3, ... , an estimator On* is called to be k-th order asymptotically complete with respect to a class M of estimators if for any On E M there is an "adjusted" estimator On**=On*+gn(On*) so that for any a>0, b>0 and 0 we have
lim cnA -'[Pe,n {-a0.
(3.3)
n-'oo
Further On* is called to be k-th order asymptotically symmetrically complete with respect to a class M of estimators when (3. 3) holds when a= b.
If On* is k-th order asymptotically complete
with respect to M, then for any estimator On in M there is On** which is a function of On'e`
197
August, 1980 Third Order Asymptotic Efficiency and Asymptotic Completeness of Estimators 95
for which limcnk '[ Ee, n( L( cn( Bn * *-6)))- E9,n(L(cn(Bn- 0)))]S0.
(3.4)
n-.oo
If 6n* is k-th order asymptotically symmetrically complete with respect to M, then (3. 4) holds for all symmetric loss function around the origin. Unfortunately the above definition is meaningless as such, since no estimator can asymptotically complete, which can be seen below. Let On* be an asymptotically efficient estimator (e. g. MLE) and 1/ n (0n*-0) be asymptotically distributed with mean 0 and variance 1/I, where I is the Fisher information. Let On be an AMU estimator with asymptotic variance larger than 1/I. Obviously On is not asymptotically efficient and inferior to On*. A naturally adjusted estimator of On* corresponding to On will be On**=On*+c/v n
which has the same
asymptotic mean with On. Now take (0<)b(0) large enough. Then we have lim [Pe,n{-a
which contradicts the condition of asymptotic completeness. However we may adjusted On* we can not get uniformity irrespective of a and b.
It is, of
course, possible to get the reverse inequality if we adjust On* depending on a and b, but the nice uniformity would be lost. There is nothing unnatural in this problem : If there is a bias in the estimators, the estimator with large variance may hit upon the true value by chance but one with small variance has little chance to get close to the true value. Therefore we must modify the concept of the asymptotic completeness and call On* asymptotically symmetrically complete if (3. 1) holds when a=b, then (3. 2) holds for all symmetric loss function. The class F of estimators here is more specifically defined as the class of estimators On such that cn(On-6) has the asymptotic distribution in the sense that lim Pe, n {cn(On-0)
where the convergence is locally uniform in 0 and Fo(a) is absolutely continuous in a and continuous in 6, and we further assume that cn is equal to the maximum order of consistency. Now we have the following theorem, the proof of which is omitted. Theorem 3. 1. Let On* be asymptotically efficient and the density of the asymptotic distribution of cn(On-B) is symmetric around the origin. Then On* is asymptotically symmetrically complete with respect to the class F.
In this case it is seen that On* need not be modified. Higher order asymptotic compleness has, however, more meaningful implication. Now we consider the regular case with cn=Vn and we define the classes B*, C* and D* which are defined in the same way as the class B, C and D, respectively but omitting the condition of higher order asymptotic median unbiasedness. Then the following theorems hold : Theorem 3.2. The MLE is second order asymptically complete with respect to the class B*. Theorem 3.3. The MLE is third order asymptotically symmetrically complete with respect to the class C*.
198 96 Kei TAKEUCHI and Masafumi AKAHIRA
Theorem 3.4. The MLE is third order asymptotically complete with respect to the class D*. The outline of the proof of Theorem 3. 4 is as follows :
Let 65ED* and the asymptotic bias of Bn be Ee.n(Un(0n-6))_nµi(B)+o( n and also of Bn* E9,n(V n (Bn*-0))=n Ftl*(B) 1 In ).
Let the modified estimator fi0n** be defined as 11 (7n**.e,,*+ i {1ai*(en*)- f.Li*(en*)} n
Then we can prove exactly as of Theorem 6. 1 in the previous paper [ 6 ] that
lim n[Pe, n
a < I/ n (O n **-0)0
n-+oo
for all a, b>0 and 0, because the asymptotic moments of V _n (6n-0) and V n (en**-B) are the same up to the order n-' except the second term in the variance which is smaller for On*. The proofs of Theorem 3. 2 and 3. 3 are similar and even simpler. The classes C* and D* are much natural classes than classes C and D and Theorems 3. 2, 3. 3 and 3. 4 may be nearly the final conclusion of the asymptotic derivability of the MLE because we have the problem of the choice of the way of modification which should be based on the consideration of the asymptotic bias versus asymptotic variance, which can not be decided once and for all, but what is established in the above implies that we only need to consider the estimators which are modified MLE but it is not necessary to take any other (regular) estimators into consideration.
4. Conclusion remark The conclusion of this paper can be easily generalized to the multiparameter cases and the discussion is very much similar to that in the multiparameter exponential case. So we omit the details of the derivation.
References Akahira, M.: Asymptotic theory for estimation of location in non-regular cases, I : Order of convergence of consistent estimators , Rep. Stat. Appl. Res., JUSE, 22, 8-26 (1975). [ 2 ] Akahira, M.: A note on the order asymptotic efficiency of estimators in an autoregressive process, Rep. Univ. Electro-Comm., 26, 143-149 (1975). [ 3 ] Akahira, M. and Takeuchi, K.: The Concept of Asymptotic Efficiency and Higher Order
[1]
Asymptotic Efficiency in Statistical Estimation Theory, Lecture Note (1979) ; Revised (1980). [ 4 ] Akahira, M. and Takeuchi, K.: Discretized likelihood methods : Asymptotic properties of discretized likelihood estimators (DLE's), Ann. Inst. Statist. Math., 31, Part A, 39-56 (1979). [ 5 ] Takeuchi, K. and Akahira, M.: On the second order asymptotic efficiencies of estimators, Proceedings of the Third Japan-USSR Symposium on Probability Theory, Lecture Notes in Mathematics, 550, Springer-Verlag, 604-638 (1976). [ 6 ] Takeuchi, K. and Akahira, M.: Third order asymptotic efficiency of maximum likelihood estimator for multiparameter exponential case, Rep. Univ. Electro-Comm., 28, 271-293 (1978). [ 7 ] Takeuchi, K. and Akahira, M.: Asymptotic optimality of the generalized Bayes estimator in multiparameter cases, Ann. Inst. Statist. Math., 31, Part A, 403-415 (1979).
199
Statistics & Decisions 1, 17-38 (1982) © Akademische Verlagsgesellschaft 1982
ON ASYMPTOTIC DEFICIENCY OF ESTIMATORS IN POOLED SAMPLES IN THE PRESENCE OF NUISANCE PARAMETERS Masafumi Akahira and Kei Takeuchi
Received : revised version : May 4, 1982
Abstract The concept of asymptotic deficiency is extended to the case when a common parameter 6 is estimated from m sets of independent samples each of size n , and the asymptotic deficiencies of some asymptotically efficient estimators relative to the maximum likelihood estimator based on the pooled sample are discussed in the presence of nuisance parameters.
1. Introduction. The third order asymptotic efficiency of the maximum likelihood estimator (MLE) in the case of independently and identically distributed
( i.i.d.)
samples has been established by Pfanzagl and Wefelmeyer ( 1978 ), Akahira and Takeuchi ( 1978, 1981a ) and Ghosh , Sinha and Wieand ( 1980), and it can be formulated in terms of asymptotic deficiency introduced by Hodges and Lehmann ( 1970) ( Akahira, 1981 ; Akahira and Takeuchi 1981a, 1981b). Let X1, X2, Xn , ...
be i.i.d. random variables according to
the distribution with a density function f(x,6) where 6 is a real
AMS Subject Classification ( 1980): 62F12, 62F10. Key words and phrases :
Asymptotic deficiency , asymptotically efficient
estimator , maximum likelihood estimator , pooled samples, nuisance parameter, loss of information , asymptotic unbiasedness , ancillary statistics.
200 18
AKAHIRA-TAKEUCHI
parameter . Under a suitable set of regularity conditions it can be proved that the MLE 0n of 0 can be asymptotically expanded into the form
^(6n- 0) = I Z + 1 Q* + opI 1 1 , T \Jn/ n CC where Z 1 G a0 log f(xi,0) and I = E0[{ae log f(x,0)}2] and that vrn- i=l which admits the same for any asymptotically efficient estimator
60
type of expansion x(00 0)
1 Z+ 1 Q+ op\ 1 1 n t An- 0
we have * V(Q0) > V(Q )
where V designates variance. This implies that we can construct a "modified " estimator ( relative to 0n) of the form
0** = 6* + h(0*) n n n n so that 0n* has the same asymptotic bias as
6n up to the order n-1
and that
lim n[P0 n{rI 6*n *-0 < a} - PO n n ,
{,rI 0n-0
< all > 0
for all 0 and all a > 0. Moreover, if n1 is defined so that the n (with Q**) based on the sample of size n1 has modified estimator 0 ** the same asymptotic distribution as that of the estimator
6n
(with Q0)
based on the sample of size n(>nI) up to the order n ' , we have
lim(n-n1 ) = I{V(Q0)- V(Q**)} n-o (Akahira, 1981). We call the above limit the asymptotic deficiency of 6n relative to the MLE. Note that the right-hand side of the above
201 AKAHIRA-TAKEUCHI
19
equation does not necessarily mean the difference of the variances of estimators but of the asymptotic variances , hence we do not need to bother about the remainder terms ( Akahira and Takeuchi , 1981a). In this paper we extend this concept to the case when a common parameter 0 is estimated from m independent sets of samples and discuss the deficiencies of some asymptotically efficient estimators relative to the MLE based on the pooled sample from the distribution with different nuisance parameters.
2. Asymptotic Deficiencies Of Estimators For Samples From The Distributions With Different Nuisance Parameters. Suppose that it is required to estimate an unknown real-valued parameter 0 based on m samples of size n whose values are
Xis (i = 1,...,m;
j = 1,...,n) from the distributions with different nuisance parameters Now we assume that for each i Xil, Xi2, ...' Xin
Ei (i = 1,...,m ).
are independently and identically distributed according to a density function
f(x,0,Ei)
with respect to a a- finite measure p.
We assume further regularity conditions: (A.1) For each i the set {xlf(x,0,Ei) > 0} does not depend on 0 and Ei. (A.2) For each i , for almost all x[ul, f (x,0,^i) is three times continuously partially differentiable in 0 and Ci. (A.3) For each 0 c 0 and each i
0 < I80 = E [{ae log f(Xij,0 )}2l 2 -E[a2 log f (Xij,Oi)l
< -o;
0 < Iii = E[{---L log f(X1^e ,^i)}2l i
202 AKAHIRA-TAKEUCHI
20
2 -E[a 2 log f ( X.j6,Ei)) i < CO
(A.4) The parameters are defined to be "orthogonal" in the sense that 2 E[862 log f i
(Xi^,O,Ei)^ = 0 (i
Remark that this assumption is not necessarily restrictive, since 901Y ( i = 1,...,m) so
otherwise we can redefine the parameters that they have the above orthogonality. (A.5) There exist
2 -log f (Xij,e, i)}{se log f ( Xij,0,^i)}] 000 - E[ { De 2 2 se2 log f ( Xij,6,^i)}{ai log f ( Xij,6 ,Ei)}] JeeE = E[{ 2 JUF = E[{808 1 log f
Je - E[{
(Xii ,O, i)}{a6 log f ( Xij,O,Ei)}]
2 868 1 log f(Xij,O , i)}{a i log f (Xij,e ,ci)}I
2 Je = E[ { a 2 log f (Xij,6,i)}{se log f (Xij,U,Fj)}} i
K000
= E[
{-
Ke0 = El {-56
log f(Xi3,U, i)}3 ]
log f (Xis , e , Ei)
,
}2 {al
log f ( Xis , e , i) } ]
2 2 E[{a 2 log f ( Xij,6,Ei)} 2l - 160 M0066 = 30 2 Me0 = E[{8A8ki log f(Xij,U, i)}2] , and the following hold:
203 AKAHIRA-TAKEUCHI
21
3 E[ -2
log f (Xij,A,Ei) l = -3J600-K B00 1 3 3 log ,B, i)l = -ji E[ 2 f (Xij
aA ai E[ 2 log f(Xij,B, i)J _ -Je aAa^i for i = 1, .
., M.
From the assumption ( A.4) it is noted that K88 Denote the log likelihood functions
= JBcA - JBA ' Li(8,Ci) ( i = 1,...,m) and
L(A,El,...,Em) by n c Li(8,Ci) = G log f(xij,8,^i) (i = 1,...,m) j=1 m m n L(8,^1,..., m) _ L.( 8, log f (xij,0,Ei) i=1 i=l j=l
I E
When are known, we denote by 80 the MLE of 0 based on L(0, 1,..., m).
For each i = 1, ..., m let
of 0 and based on
Ai and be the MLE's
L i,E1 (0 ) , respectively .
Let
Em
be the MLE ' s of 8 , Cl, ..., ^m based on L(0 , 1,..., m) , respectively. ( i) The MLE 80 when
are known.
Since A0 is the MLE of 0 for known parameters
^l' ..Em '
it follows that
ae L(BO,l,...,m) = 0
Then it is seen by a similar way as Akahira and Takeuchi ( 1981a, page 92) that the expansion of 0 0 is given by
'(e0 A)
+I Ze + 1 *2 z00Z0 - 31 *3 Ze2 + 0 1 I VI 2^ fn zA+ QO+0 (say) ✓n p (i)
where
204 AKAHIRA-TAKEUCHI
22
m
in
Ze, Zee
Ze
I
zee , it
=
m
i
in
i
i=1 1A6> J i=1 JeBA
m and K* _ Ke66 with i=1
Ze 1 An-
a6
Li(,Ei) 2
zee =
1 {a 2 L. ( A, i) + nIeA} n ae
Hence it is easily seen that 1 * * *2 (J*+K*) 2 (2.1) V(QO) = I *4 (M I - J ) + 21 *4 * m
where M =
I i=1
Me000*
(ii) The MLE 's 61, ..., . Since a * * aA L(8 1'..., m) = 0
0 (1 = 1,....m)
it follows by a similar way as the case (i) that the stochastic expansion of 6 is given by
r(e*-A)
* * 1 * 1 * 3J +K *2 2iT 1*3 ze 1* z0 + ,r I* z6ez0 1 m
1
i
i_
1
Z
+ YrnI* i =11 ii z6EZ I*2
i=1 T
1 6i Z ( + op l 2 \ 1 / 2i I i=1 I1 Ze + I
1
QO + 1 vrn-
Q1 r
+ o 1 p
* * Z + 1 Q + 0 1 ( say) I ,T p (^n::)
1
(say)
205 23
AKAHIRA-TAKEUCHI
where
z = 1
Li (e
a
a2
zed
Yrn-
aeaE1
Li(O, i)
(2.2) V(Q*) = V(QO) + V(Q1)
with i2 i2 1 mc 1 i Jere JO V(Q1) = I*2 1 = 1 1 i Mee - Ii - Ii
1 CmC 00 1 1
i2 1
Jere
i2 1CIl Jed
+ *2 iLl Ii I* Ii + 2I*2 i-1 Ii2 (iii) The weighted estimator by ^^a = 2/ae2)
We consider the estimator m C i
1 `fee ei
e _ i= mC
gee
iLl According to the remark by R. A. Fisher (1925) that the second order derivative of the loglikelihood function gives the "actual" information of the MLE , we may compute. By a similar way as in case ( i) we have * (2.3)
ze
+
*
In 3J1 +K1 * * 6861 He ze z e I 2 i=1
1
1I * 2 zeeze -
Iee
+
1 me 3Jieee Keee i i2 + 2v I*2 i=l l
ze ee
206 AKAHIRA-TAKEUCHI
24
+
1
,T
i f 1 Zi Zi - I m J8 Zi z*
m CC
I*
i11
I
i 1 JOSE
8 12
I*2
+ op
i=l
IU
^
r 11
=1 i I Z 21 I* _ *Z 8 I
+1Q 0 1 1 2 +p + 1 Q lr 1 Q2 0 r
z+ I
/rn- An-
1
Q + 0 1
(say)
p
vrn-
where m
Q2 =
2 21
1I 1
(3J688
*
+K660)
i
2
z 8 i Z8
(I lee
Since Cov(QO,Q2) = Cov(Q1,Q2) = 0 it follows that (2.4) V(Q) = V(QO) + V(Q1) + V(Q2) with m (3J i 888+K i888) 2 _ (3J*+K*)2 V(Q2) = 21 *2 i=l let 21 *4 (iv) The weighted estimator by Ie8-I68
m = i=1 C i i=1 188 Then the stochastic expansion of is given by
207
AKAHIRA-TAKEUCHI
1
^(e-e)
*
Ze
1
+ * ✓n I
I
m
1
i
i zeez e
i=1 186
I*2 i= 1
IN
+
i i Jeee +Ke6e
*
i2
z i2 e
Ziz
6 6
6
1 1 zi zi T I* i=1 I e
in
2^ I i=1 Iee
1 m 2J N i e+K ei66
+
1
25
i
i
+ 1 J6 e+K66E zizi v I* i=1 Ii I 00
M Ti
i 2Je^e+ KOe zZ*+o( 1 i=1 I e p
0 Z it 1 C' * L. 12 P T T 2V I i=1 I 1
Z e + 1 QO+ I
1
QI+
✓n
,r
*
ze + 1 + 0
Q3+
1
f
Q4+op 1
r
1 1 (say),
p
I
1
where in Q3 I
i i i 1 in Jeee+Keee 12 Z Z +Z e8 e IN e 21 i=1 IN
1
i=1
1 C 2Je0e+Kei i e6 i * _ 1 * * 3J*+K* *2 zeze 1 *2 zeeze + 2I*3 Ze I *2 i=1 Ii
ee
i i i in J6 6+K6e ze Ze = * Q4 I i=1 I
* 66
z
We divide Q3 into two parts Q5 and Q6 as follows: 1 m C^
Q3
I iLl
* 1 in Jeee i* zeez 6 1* i=l Ii Z6z6 ee ee
1
i
i
1
Ii z66z e
*
I*
1 me i *2 1 mC i i z e + I* icl J66eze + 2 I* ix (Jeee+Ke0e) Ii _ 1 m
ze
I* i=1 I1
ee
m
- Ze i _ J8 Z
I*
6
I
z1
Z
+ 21* ixi (Jeee + K6ee) 16 - I
es = Q5 + Q6 (say) .
z
208 26
AKAHIRA-TAKEUCHI
Since Cov(Qi,Q.) = 0 (i i j; i,j = 0, 1, 4, 5, 6) , it follows that (2.5) v(Q) = V(QO) + V(Q1) + V(Q4) + v(Q5) + v(Q6)
(v) The weighted estimator by I08_= Ie ( e,Ei) (i We consider the estimator me L 160 Ii
9* = 1=1 m
Then we have
r(ee) _
Ze +
1
QO +
1 Q1
+
1
Q4 +
1
I
Q7 + 0
P(1 An-
1 z * + 1 Q* + oP (say) I where i C^ Jeee i * J *2 __ 1 m 1 i i - 1 * 1 Cm ZeeZe I* i11 Ii Z 0Ze + I* Ze ZeeZe Q7 1* ae
1* iLl Ii
i
ee i
i
i
i2 1 m 3J66e+K6e6 i 1 m 3J66e+Ke6e X Z Z X Z +*3 21*2 i=1 i2 e I i=1 Iee e e
* *
+KZe *2 _ 3J 21*4
Since 2
m Q7 = Q5 21 *2 iLl(3J000+K000) Ze 1 ( Il e
Ze
=Q5-Q2 and Cov(Qi,Qj) = 0 (i # j; i,j = 0,1,2,4,5), it follows that (2.6) V(Q*) = V(QO) + V(Q1) + V(Q2) + V(Q4) + V(Q5) .
209
AKAHIRA-TAKEUCHI
27
(vi) The asymptotically best estimator based on 6. andi (i = 1, ,m). The estimators S and S have the property of being functions of 8i and
i (i = 1,..., m) alone. Now let us construct an estimator
6 = f61t...,6m,
1,..., m)
which is asymptotically best among the estimators which depend only on 6i and
i (i = 1,...,m).
We assume that f is twice continuously
partially differentiable in 6i and
i (i
Assuming the
condition of consistency
Denoting of
Z1 =C19 .... m Em ;
e1 b = of
k J
0 6l 61e,...,6m =6, El Ell...l Em Em 81e,...'0m 0 , El Ell..., EM =E M
a2f eQk aZlD^k for i, j, k, 2, = 1, ..., m , we have from the consistency condition that m ai = 1; bk = 0 i=1 and also
( k = 1,...,m)
210 AKAHIRA-TAKEUCHI
28
m
aai m m c = ; L c.. = 0 ; j=1 iJ ae i=1 j=l 1J C aai mm iLl dik 0 ' dik aEk ' e Qk = 0 for i, k, Q = 1, ..., in. Taylor's expansion of 0** yields that
e** = 0
in m m + ai (01 0) + 2 I cij( ei- e)(0 cci=1 i=1 j=1
j -0)
+ G X dik(el e) (Ekk) + op(n) i=l k=l **
,r(e -0 )
The asymptotic variance of
is given by
2 n 2 mCC ai n ai V(0i) L i i=1 i=1 00 m of which minimum value under the condition I ai = 1 is attained when i=1 i i ai = m.00 = 180 C i I
(i = 1 ,...,m) .
i==1 00 Then we have aai dik
aCk
2Je0+K00 s ik
Ie0(2Je^0+Kee^) *2
for i, k = 1, ..., m , where 6ik denotes Kronecker ' s delta. It follows from these conditions that m
m
m
,T(e Iie ,r(si-e) + 1 cijn(ei-e)(0.-0) I
i=1
2vrn-
i=l
j=1
m Cm n(Oi-0)(^k i=l k=l dik vrn-
k) + oplYrn)
211
AKAHIRA-TAKEUCHI
29
and substituting the stochastic expansion of 0i we get m
r(0 **-0) Z I
O+
1 *
0
Ze
Z1
Ji m Zee _ e9e + 1
i =1 7i 00 109
I
Zi * Zi
V I i
=1 I^
0
m cc i i i 1 * L Jed Z12 + 1 * C J09+K90 ZiZi n 37n I i=1 I 7 I * i== 1 I80ICc 0 i i 1 m 2JeE0+Ke0 Zi I*2
i=1
I
E
Z 0
i - d J OO+Ke0 ZiZi CC CC + 1 L L 2Vn I* i=1 j=1 Ieeie9 ij I*Iee e 9
1
Ze + 1 Q0 + 1 Ql + opl 1 / (say) , I \\ ^/ where 1
Q1
ci
J000+K990 i ^- - d . Z Z Z i 21 i=1 j=1 1601e0 ^ I*Iee 0 0
Since we have Cov(QO,Ql) = 0 we should minimize V(Ql) in order to minimize the asymptotic deficiency of 0** relative to 6* under the condition
m ta2J K I(2J*+K*) i 00000 00 + (2.7) ,l =c ij = a6 = I* - 1*2
for i = 1, ..., m. It is expressed by
V(Q*) _ * c - d J099*K000 1 2I i=1 j=1 ii i^ I 109190 and noting that cij = cji we get by applying Lagrange 's method of
212 30
AKAHIRA-TAKEUCHI
constrained minimization i
c ij
8i3
Jeea *K eee
+ 10000
^`i+Aj) (i,j = 1,...^m)
where ai (i = 1,...,m ) are Lagrangean multipliers.
From (2.7) we have
m m 0= I I ci. =-J*K +2I A1IA0 . i=1 j=l I i=1 Hence C i = - J*+K* i=1 i I 0e21 2 Again from (2.7) it follows that i
i
ci = J006*K000 + A1Ie0I* + a.Iee 160 j=1 ;1 2J6ee+Ke0e Iee(2J*+K*)
i _ J000 180(3J*+K*) ^i10e
I
21
*2
for i = 1, ..., in. * Consequently the minimum of V(Q1)
is attained when
I1 Ji +I3 Ji Ii Ii (3J*+K*) ii +Ki + a eee eee ee eee 00 000 ee 00 cij = Z* 1*2 ij 1*
and * __ 1 * m J ie00 1 3J*+K* *2 Z e Z Q1 1*2 0 i=1 Zi 0 21 *3 0 ee Hence the asymptotically best estimator a is expanded , after some rearrangements of terms into the form
/_(9**-e) Z* + 1 Q0 + 1 Q1 + 1 Q4 + 1 Q5 + 0 1 ✓n p (^n::) YST Z fn F
213
AKAHIRA-TAKEUCHI
31
and the asymptotic deficiency of 0 relative to 0 is given by I{V(Q4) + V(Q5)}. Comparing the above with the stochastic expansion of 0 , we get
0 - = - n Q6 + op (1) n m
1* (J000+K000 )( 0i-0)2 + op(n) . 21 i=1
Hence an estimator which is asymptotically equivalent to 0** up to the order n-1 is given m
0*** = 0 _ 1*
2I i=l
(3600+Ke00 )(
0i e)2
where I*, 3000 and K000 are estimators of I*, J000 and K000 respectively, obtained substituting 0 and Ci for
0 and
Ei (i = 1,
...,m). The adjusting term m 1* X (J000+K000)(0i 2I i=1
-0)2
may be considered as the estimated non-linear effect in the best combination of 0.'s. i From the above we have established the following theorem. THEOREM 2.1. Assume that (A.1) - (A.5 ) hold. The stochastic expansions of the estimators 00, 0*, 0, 0, s and 0** are given by
,T(e0-0) z0 + 1 QO + o 1 ; ,T P (,' ) Z0 + 1 Q0 + 1 Ql + op\1/ Yrnvrn✓n(6-0) = * Z0 + 1 QO + 1 Q1 + 1 Q2 + 0 1 p F ,r ,T
/
^(80) _ * Z* +1Q + 1 Q +1Q + 1 Q +1Q +o C1); 0 F 0 Fn 1 p F 5 v 6 + vrnFn 4
r(e*-0)
1
QO + 1 Ql - 1 Q2 + 1 Q4 + 1 Q5+opl 1 ) Fn F vrnFn r `vln-
Z*
I
214 32
AKAHIRA-TAKEUCHI
* Z0 + 1 QO + 1 Ql + 1 Q4 + 1 Q5 + o l rn r dn` r p rn respectively , where
Q5 and Q6 are given in the above Q0' Q1' Q2' Q4' and Cov (Qi,Q.) = 0 for i ¢ j (i,j = 0,1,2,4,5,6) except for i = 2 and j = 6. The asymptotic deficiencies of 0, e , 6* and A** relative to 0 are given in the table below.
Estimator Tn
Asymptotic deficiency of Tn relative to n I*V(Q2)
I*{V(Q4) + v(Q5) + v(Q6)} I*{V(Q2) + v(Q4) + v(Q5)}
I*{v(Q4) + v(Q5)}
6**
From Theorem 2.1 it is seen that 0 is asymptotically better than ** 0, 8, S and 0 up to order n . Using the estimator 0 we may construct an estimator 0* asymptotically equivalent to 0 as follows: in + 0* = 0 - ml ( 3J600 Ke00)(6 0)2 C i 1=1 l l `^00 2 i where J606
= J600 ( 6.) and K660 - K600 (
0i) (see Akahira and Takeuchi,
1981b). We can formulate still another estimator , i.e., the linear estimator 0 given by (m i* L 3AA 04
e* = =1m
,
i I AAA
i=1 where for each i = 1, ..., m the weight `966 is defined by 2
`^00 n a 2 Li(0'Y ae
215 AKAHIRA-TAKEUCHI
33
(0 may be replaced by and i). Then we get
Z0 + 1 QO + 1 Ql - 1 Q2 + op(1 j r Therefore the asymptotic deficiency of 0* relative to 0* is equal to that of 0. Moreover , we have from the above, i i* `^66 + `_00 c i me me
+6 1 m 2 2 iLl
iLl `^0A
iLl `'00 Thus the estimator
* 0i = 0 + op(n)
( 0* + 0)/2 is asymptotically equivalent to 0* up
to the third order, i.e., order n - l , that is , the asymptotic deficiency * * is zero , which can also be shown by the of (0 +0 / 2 relative to 0 more direct way as follows. Let 0 be the MLE .
Then we have
1ma Li(0, 0 = n 1 a0
- e )2
k.)(e*-e) + 1 1Gm a3 L i Z)(6*i = n 1 a2 L (0 (0 i' ae2 i i' i i 2n i a03
+0 (1) p n
Similarly we have 1m 0 = n 1 a6 Li(0i,Ei) 3 2 Li(0*. i 0*) +2n I a 3 Li(0*, *) (0 6*) 2+0p(1 ) = n a 2 a0 1 a0 And noting that 1 n 1
I 1
m
a3
1
m
0 3 Li(0i.^i) = T aA m
a2
*
1
n
in
a3
E
3
Li(0 * ,Z.) + op(1 )
1 a0
a2
n 1 a02 Ll(0 , i) = n ae2 Li(0 , i) + op(1) we have m 2 2 I 2 Li(0i,^i) + a 2 Li (0, n a i) (0*-0i) = 0 (n) * Z*)l
l t ae
Thus we get
ae
216 34
AKAHIRA-TAKEUCHI
m
i
i*
^
E `(Jee + use) ei i
o
ee + 00
Hence we have established the following theoremTHEOREM 2 . 2. Under the assumptions (A.1) - (A.5), the estimator a is asymptotically equivalent to the MLE 6 up to the order n -^ and its * asymptotic deficiency relative to 6* is zero , and the estimator (e +e)/2 is asymptotically equivalent to 6 up to the order n and its asymptotic deficiency relative to 6 is zero.
3. Discussion. V(Qi) (i = 0,1,...,6)
can be interpreted in the following way. V(QO)
consists of two parts, i.e., * * 2 ) and (J +K)) *4 (M*I* - J*2 I 21 the first being the "loss of information " of the MLE and the second due to the condition of (higher order asymptotic ) unbiasedness ( Amara, 1980). These two factors are unavoidable even when we know the true values of the nuisance parameter. Then for the MLE 6 we have three additional terms i2 i2 1 1 1 i _ J6c0 _ J0
rr (MeeE
I*2 i1l Ii
I1
Il
i2 1 C Je 2I*2
iY
i2 I
and i2 1 t° 1 _ 1 Je;6 1*2 i11 Ii I Ii
ee
of which the first is the "loss of information " due to the presence of
217
AKAHIRA-TAKEUCHI
35
unknown nuisance parameters , the second is the unbiasedness effect and the third represents the loss due to pooling. V(Q2) can be considered as representing the effect of the nonlinearity of the optimum combination of the MLE ' s to get the MLE based on the whole sample. Comparing 0 with 0 , it is shown that the information contained 60 (i = 1, ..., m) are lost when we use I66 (i = 1, '3 ... ,m) in its stead, and V(Q5) represents the loss with respect to 0
in the statistics
and V(Q4) with respect to Ci (i = 1,...,m). Comparison of 0 and 0 does not lead to any uniform conclusion, which may seem strange since in 0 a better estimator 0 of 0 is used instead of 0 (i = 1,...,m) , but it can be interpreted that 0 could be better since I00(0i'Ei) may be closer to J 60 than 160O'Ei). Comparing 0** with 0* , it is shown that the loss of information caused by restricting the estimators to be functions of 0i and Ci (i = 1,...,m) alone is expressed by V(Q4) + V(Q5). Of the two terms Q4 and Q5 the latter is the weighted mean of the asymptotically ancillary statistics JI
ZAe - 000 Zei
(i = 1,...,m)
lee and is easy to interpret , but the former is a weighted sample covariance between
( Z/I e0 ) - ( Z/I ) and Zi/Iii , but its meaning is harder to
grasp. In case of the symmetric location -scale problem in which the density function g(x,0,Ei )
is expressed as
1 f for each i = 1, ..., m and where f is an even function, we have for each i = 1, ..., m ,
218 AKAHIRA-TAKEUCHI
36
I00 2, 2 ; Ie = EI{ae (x, e, )}{a g(X,0, )}^ = 0 i
i
L
where A and B are constants determined as
A = with
f(x)d,; B = J x2{^' ( x)}2 f(x)dx
fi(x) = log f ( x). Further we obtain for each i = 1, ..., m
K600 = - 0 i -=0; i J600 JeO^ = 2 i3 (3A+C) Jere = 2 13 (-A+C) ; K60 _ i3 (-A-C) where C = I x{^'(x)}3 f(x)dx . And for each i = 1, ..., m , JO - 0; JO = 0 It is further calculated that for each i = 1, ..., m ,
where
D = J{4"(x)}2 f(x)dx and F = J x2{c" ( x)}2 f(x)dx . Therefore , in the estimation of the common location parameter of different populations of the same shape but with different scale parameters, we have Q2 = 0 and Q6 - 0 , thus we have 6 = 8 and 0 = 0 . Especially , when the underlying distribution is normal, we have for each i = 1, ..., m ,
J06^ 2E3 {31
x2 f(x)dx - J x4 f ( x)dx} = 0
and Je,O + Klan = 0 , which implies that Q4 0 and also Z60 = 0 hence Q1 = Q5 = 0. Then we see that 0*, 0 and 0 ** asymptotically equivalent , and the most commonly used estimator
are all
219 AKAHIRA-TAKEUCHI
37
6i/j 8=i=1 j./Z2 11 n n 0. = i = n X Xij and 1 = nll (Xij-% i2 (1 = 1,...,m) is, j=1 j=1 although not identical with the MLE , asymptotically equivalent to it up
with
to the order n 1.
Acknowledgements. This paper was written while the first author was at the Limburgs Universitair Centrum in Belgium as a guest professor .
The visit was sup-
ported by a grant of the Belgian National Scientific Foundation (N.F.W.O.). He is grateful to Professor H. Callaert of the LUC for inviting him and for his interest in this and related work.
References Akahira, M .: On asymptotic deficiency of estimators .
Austral. J . Statist.
23, 67-72 (1981). Akahira, M . and Takeuchi , K.: Asymptotic Efficiency of Statistical Estimators: Concepts and Higher Order Asymptotic Efficiency , Lecture Notes in Statistics 7, New York-Heidelberg -Berlin, Springer ( 1981a). Akahira, M . and Takeuchi , K.: On asymptotic deficiency of estimators in pooled samples .
Technical Report of the Limburgs Universitair
Centrum, Belgium ( 1981b). Amari, S. : Theory of information space : A differential -geometrical foundation of statistics. RAAG Report , 106 (1980). Fisher, R. A .: Theory of statistical estimation .
Proc. Camb . Phi.. Soc.
22, 700-725 ( 1925). Ghosh, J. K., Sinha , B. K. and Wieand , H. S.: Second order efficiency of the mle with respect to any bounded bowl-shaped loss function. Ann. Statist. 8 , 506-521 ( 1980). Hodges, J. L . and Lehmann , E. L.: Deficiency . 783-801 ( 1970).
Ann. Math. Statist. 41,
220 38
AKAHIRA-TAKEUCHI
Pfanzagl , J. and Wefelmeyer , W.: A third order optimum property of the maximum likelihood estimator. J. Multivariate Anal. 8, 1-29
( 1978).
Takeuchi , K. and Akahira , M.: Third order asymptotic efficiency of maximum likelihood estimator for multiparameter exponential case. Rep. Univ. Electro -Comm. 28, 271 -293 (1978).
Masafumi Akahira Kei Takeuchi Department of Mathematics Faculty of Economics University of Electro- Communications University of Tokyo Chofu, Tokyo 182 Hongo , Bunkyo-ku Japan Tokyo 113 Japan
221
Ann. Inst . Statist. Math. 37 (1985), Part A, 17-26
ESTIMATION OF A COMMON PARAMETER FOR POOLED SAMPLES FROM THE UNIFORM DISTRIBUTIONS MASAFUMI AKAHIRA AND KEI TAKEUCHI (Received Sept. 6, 1983; revised Mar. 12, 1984)
Summary The problem to estimate a common parameter for the pooled sample from the uniform distributions is discussed in the presence of nuisance parameters. The maximum likelihood estimator (MLE) and others are compared and it is shown that the MLE based on the pooled sample is not (asymptotically) efficient. 1. Introduction In regular cases the asymptotic deficiencies of asymptotically efficient estimators were calculated in pooled sample from the same distribution (Akahira [5]) and in the presence of nuisance parameters (Akahira and Takeuchi [7]). In non-regular cases the asymptotic optimality of estimators was discussed in Akahira [4], Akahira and Takeuchi [6], Ibragimov and Has'minskii [10], Jureckova [11] and others, and also recently a Monte Carlo study on the estimator considered in Akahira [1]-[3] has been done by Antoch [9]. In this paper we consider the problem to estimate an unknown real-valued parameter 0 based on m samples of size n from the uniform with different distributions on the interval (0-, B + ^1) nuisance parameters which is treated as a typical example in non-regular cases. In some cases the MLE and other estimators will be compared and it will be shown that the MLE based on the pooled sample is not better for both a sample of a fixed size and a large sample. Related results can be found in Akai [8].
2. Results Suppose that it is required to estimate an unknown real-valued Key words and phrases: Maximum likelihood estimator, weighted estimator, uniform distributions. 17
222 18 MASAFUMI AKAHIRA AND KEI TAKEUCHI
parameter 0 based on m samples of size n whose values are Xif (i=1, • • , m ; j =1, • • • , n) from the uniform distributions on the interval (0 0+^i ) with different nuisance parameters ^i (i=1,• • •, m). For each i let X(l)<X(2)<• .. <XX(,,) be order statistics from Xi1, XXE, • • •, Xi,,. We consider some cases. C a s e I. ^i = E (i =1, • • • , m) are unknown.
The MLE 0ML of 0 based on the pooled sample {X1} is given by d^L = 2 (min Xi( 3)+max Xi(n)) • lsism 15i$m
An estimator B„ of 0 based on the sample of size n is called to be (asymptotically ) median unbiased if Pr {&n 0} = Pr {dnB} = 2
^lim IPr {d&S0} - 1 I =lira Pr {dn 2
B} -
n--
2
=0 uniformly in some neigh-
borhood of B) (e.g. see Akahira and Takeuchi [6]). Then it is shown that dML is one-sided asymptotically efficient in the sense that for any asymptotically median unbiased estimator d„ lim [Pr {n(&IL-0)S t} - Pr {n(dn-0)St}]? 0
for all t>0 ;
{n(dn- 0) St}]S0
for all t<0 ,
or lim [Pr {n(#ML-0) St} -Pr n-.m
since in this case d,L is actually the MLE from the sample of size mn (see Akahira and Takeuchi [6]). Case II.
ξ_i (i=1,…,m) are known. The MLE θ̂_ML of θ based on the pooled sample {X_ij} is given by

  θ̂_ML = (1/2){ max_{1≤i≤m}(X_{i(n)} − ξ_i) + min_{1≤i≤m}(X_{i(1)} + ξ_i) }.

Then it can be shown, in a similar way as in [4] and [6], that θ̂_ML is two-sided asymptotically efficient in the sense that for any asymptotically median unbiased estimator θ̂_n

  lim_{n→∞} [Pr{n|θ̂_ML − θ| ≤ t} − Pr{n|θ̂_n − θ| ≤ t}] ≥ 0  for all t > 0.

(For the definition see Akahira and Takeuchi [6], page 72, and Akahira [4].)
ESTIMATION OF A COMMON PARAMETER FOR POOLED SAMPLES
Case III. ξ_i (i=1,…,m) are unknown.

For each i the MLE's θ̂_i and ξ̂_i of θ and ξ_i based on the sample X_{i1},…,X_{in} are given by

  θ̂_i = (1/2)(X_{i(1)} + X_{i(n)}),  ξ̂_i = (1/2)(X_{i(n)} − X_{i(1)}),

respectively. Let θ̂_ML be the MLE of θ based on the pooled sample {X_ij}. Then it will be shown that θ̂_ML is not two-sided asymptotically efficient. We define

  θ*_n = (1/2){ max_{1≤i≤m}(θ̂_i + ξ̂_i − ξ_i⁰) + min_{1≤i≤m}(θ̂_i − ξ̂_i + ξ_i⁰) }.

Then it can be shown from Case II that θ*_n is an asymptotically locally best estimator of θ at ξ_i = ξ_i⁰ (i=1,…,m) in the sense that for any asymptotically median unbiased estimator θ̂_n

  lim_{n→∞} [Pr{n|θ*_n − θ| ≤ t} − Pr{n|θ̂_n − θ| ≤ t}] ≥ 0  for all t > 0.

First we shall obtain the MLE θ̂_ML based on the pooled sample {X_ij}. Let f_i(x, θ) be the density function of the uniform distribution on the interval (θ−ξ_i, θ+ξ_i). The likelihood function L(θ; ξ_1,…,ξ_m) is given by
  L(θ; ξ_1,…,ξ_m) = ∏_{i=1}^m ∏_{j=1}^n f_i(x_ij, θ)
    = 1 / ( 2^{mn} ∏_{i=1}^m ξ_i^n )  for x_{i(n)} − ξ_i ≤ θ ≤ x_{i(1)} + ξ_i (i=1,…,m);
    = 0 otherwise.

In order to obtain the MLE θ̂_ML it is enough to find θ minimizing ∏_{i=1}^m ξ_i
under the condition ξ_i ≥ max{θ − x_{i(1)}, x_{i(n)} − θ} for all i. Let θ* be some estimator of θ based on the pooled sample {X_ij}. For each i we put ξ_i* = max{θ* − x_{i(1)}, x_{i(n)} − θ*}. Then we have for each i

  ξ_i* = max{θ* − θ̂_i + ξ̂_i, θ̂_i + ξ̂_i − θ*} = ξ̂_i + |θ̂_i − θ*|.

Hence the MLE θ̂_ML is given by the θ* minimizing

  ∏_{i=1}^m ( ξ̂_i + |θ̂_i − θ*| ),

that is, such an estimator θ* is given by one of the estimators θ̂_i (i=1,…,m). Since
  ∏_{i=1}^m ( ξ̂_i + |θ̂_i − θ*| ) = ∏_{i=1}^m ξ̂_i · ∏_{i=1}^m ( 1 + |θ̂_i − θ*| / ξ̂_i ),

for sufficiently large n it is asymptotically equivalent to

  ∏_{i=1}^m ξ̂_i ( 1 + Σ_{i=1}^m |θ̂_i − θ*| / ξ̂_i ).

Hence it is seen that for sufficiently large n the MLE θ̂_ML is asymptotically equivalent to a weighted median with the weights 1/ξ̂_i (i=1,…,m). Next we shall discuss the comparison among θ̂_ML, the weighted estimator and other estimators. We consider the case when m = 2. Then for each i = 1, 2, X_{i1},…,X_{in} are independently, identically and uniformly distributed random variables on (θ−ξ_i, θ+ξ_i), and the joint density function f_n(x, y; θ, ξ_i) of θ̂_i and ξ̂_i is given by
(2.1)  f_n(x, y; θ, ξ_i) = n(n−1) y^{n−2} / (2 ξ_i^n)  for 0 ≤ y ≤ ξ_i and θ − ξ_i + y ≤ x ≤ θ + ξ_i − y;
      = 0 otherwise.
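As a concrete illustration of the profiled likelihood derived above, the following sketch computes the pooled MLE by minimizing ∏(ξ̂_i + |θ̂_i − θ*|) on a grid. The sample sizes and ξ_i values are illustrative choices of ours, not taken from the paper; NumPy is assumed available.

```python
import numpy as np

rng = np.random.default_rng(0)
theta, xis, n = 0.0, [0.5, 2.0], 200   # hypothetical true parameter and nuisance values

# Component MLEs: midrange theta_hat_i and half-range xi_hat_i of each sample.
theta_hat, xi_hat = [], []
for xi in xis:
    x = rng.uniform(theta - xi, theta + xi, size=n)
    theta_hat.append((x.min() + x.max()) / 2)
    xi_hat.append((x.max() - x.min()) / 2)
theta_hat, xi_hat = np.array(theta_hat), np.array(xi_hat)

def profiled(t):
    # prod_i (xi_hat_i + |theta_hat_i - t|): the quantity the pooled MLE minimizes
    return np.prod(xi_hat + np.abs(theta_hat - t))

grid = np.linspace(theta_hat.min(), theta_hat.max(), 20001)
theta_ml = grid[np.argmin([profiled(t) for t in grid])]
# For m = 2 the product is concave between the two midranges, so the minimizer
# is one of them -- the "weighted median" of two points.
assert min(abs(theta_ml - th) for th in theta_hat) < 1e-9
```

For m = 2 this reduces to picking the θ̂_i of the sample with the smaller ξ̂_i, in agreement with the discussion below.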
For each i = 1, 2 the density function f_n(x; θ, ξ_i) of θ̂_i is given by

(2.2)  f_n(x; θ, ξ_i) = n ( ξ_i − |x − θ| )^{n−1} / (2 ξ_i^n)  for θ − ξ_i < x < θ + ξ_i;
      = 0 otherwise.
Also for each i = 1, 2 the conditional density function f_n(x | y; θ, ξ_i) of θ̂_i given ξ̂_i is given by

(2.3)  f_n(x | y; θ, ξ_i) = 1 / ( 2(ξ_i − y) )  for θ − ξ_i + y ≤ x ≤ θ + ξ_i − y;
      = 0 otherwise,

that is, the conditional distribution of θ̂_i given ξ̂_i is the uniform distribution on the interval (θ − (ξ_i − ξ̂_i), θ + (ξ_i − ξ̂_i)). For two (asymptotically) median unbiased estimators θ̂¹ and θ̂² of θ, θ̂¹ is called (asymptotically) better than θ̂² if

  Pr{n|θ̂¹ − θ| ≤ t | ξ̂_1, ξ̂_2} ≥ Pr{n|θ̂² − θ| ≤ t | ξ̂_1, ξ̂_2} a.e. for all t > 0

  ( lim_{n→∞} [Pr{n|θ̂¹ − θ| ≤ t} − Pr{n|θ̂² − θ| ≤ t}] ≥ 0 for all t > 0 ),

and then for simplicity we denote it symbolically by θ̂¹ ≻ θ̂² (θ̂¹ ≻_as θ̂²), where Pr{A | ξ̂_1, ξ̂_2} denotes the conditional probability of A given ξ̂_1 and ξ̂_2. From (2.3) we see that the conditional density of θ̂_1 − θ and θ̂_2 − θ given ξ̂_1 and ξ̂_2
is given by

  f_n(x_1, x_2 | ξ̂_1, ξ̂_2) = 1 / (4 τ_1 τ_2)  for |x_i| ≤ τ_i (i = 1, 2);
  = 0 otherwise,

where τ_i = ξ_i − ξ̂_i (i = 1, 2). If c_1τ_1 > c_2τ_2, then the conditional density function f_n(y | ξ̂_1, ξ̂_2) of θ̂_0 − θ, with θ̂_0 = c_1θ̂_1 + c_2θ̂_2, given ξ̂_1 and ξ̂_2 is given by

(2.4)  f_n(y | ξ̂_1, ξ̂_2) = (c_1τ_1 + c_2τ_2 + y) / (4 c_1 c_2 τ_1 τ_2)  for −(c_1τ_1 + c_2τ_2) ≤ y < −(c_1τ_1 − c_2τ_2);
      = 1 / (2 c_1 τ_1)  for |y| ≤ c_1τ_1 − c_2τ_2;
      = (c_1τ_1 + c_2τ_2 − y) / (4 c_1 c_2 τ_1 τ_2)  for c_1τ_1 − c_2τ_2 < y ≤ c_1τ_1 + c_2τ_2.
If c_1τ_1 < c_2τ_2, then

(2.5)  f_n(y | ξ̂_1, ξ̂_2) = (c_1τ_1 + c_2τ_2 + y) / (4 c_1 c_2 τ_1 τ_2)  for −(c_1τ_1 + c_2τ_2) ≤ y < c_1τ_1 − c_2τ_2;
      = 1 / (2 c_2 τ_2)  for |y| ≤ −c_1τ_1 + c_2τ_2;
      = (c_1τ_1 + c_2τ_2 − y) / (4 c_1 c_2 τ_1 τ_2)  for −c_1τ_1 + c_2τ_2 < y ≤ c_1τ_1 + c_2τ_2.
If c_i = ξ̂_j / (ξ̂_1 + ξ̂_2) = c_i′ (say) (i ≠ j; i, j = 1, 2), then

(2.6)  c_2′τ_2 / (c_1′τ_1) = ξ̂_1(ξ_2 − ξ̂_2) / ( ξ̂_2(ξ_1 − ξ̂_1) ).

If c_i = ξ̂_j² / (ξ̂_1² + ξ̂_2²) = c_i″ (say) (i ≠ j; i, j = 1, 2), then

(2.7)  c_2″τ_2 / (c_1″τ_1) = ξ̂_1²(ξ_2 − ξ̂_2) / ( ξ̂_2²(ξ_1 − ξ̂_1) ).
Hence we have the following:

(2.8)  c_2′τ_2 ≥ c_1′τ_1 if and only if ξ̂_1(ξ_2 − ξ̂_2) ≥ ξ̂_2(ξ_1 − ξ̂_1);
(2.9)  c_2″τ_2 ≥ c_1″τ_1 if and only if ξ̂_1²(ξ_2 − ξ̂_2) ≥ ξ̂_2²(ξ_1 − ξ̂_1).
On the other hand, as is seen from the above discussion of the MLE, the MLE θ̂_ML based on the pooled sample {X_ij} is given by

  θ̂_ML = θ̂_1 if ξ̂_1 < ξ̂_2;  θ̂_ML = θ̂_2 if ξ̂_1 > ξ̂_2.
The conditional density of the MLE θ̂_ML given ξ̂_1 and ξ̂_2 is given by (2.3). Now we consider two cases, i.e., ξ_1 = ξ_2 and ξ_1 ≠ ξ_2. Note that the first case, Case III₁, was also mentioned in Case I with another optimality criterion.

Case III₁. ξ_1 = ξ_2 = ξ. In the case when c_i = c_i′ = ξ̂_j / (ξ̂_1 + ξ̂_2) (i ≠ j; i, j = 1, 2),

  ξ̂_1 ⋛ ξ̂_2 if and only if ξ̂_1(ξ − ξ̂_2) ⋛ ξ̂_2(ξ − ξ̂_1), i.e., c_2′τ_2 ⋛ c_1′τ_1.

Hence we have

(2.10)  ξ̂_1 < ξ̂_2 if and only if τ_1 > c_1′τ_1 + c_2′τ_2 > τ_2 (with all inequalities reversed when ξ̂_1 > ξ̂_2).

We also obtain

  c_1″τ_1 + c_2″τ_2 − (c_1′τ_1 + c_2′τ_2) = ξ̂_1 ξ̂_2 (ξ̂_1 − ξ̂_2)² / ( (ξ̂_1 + ξ̂_2)(ξ̂_1² + ξ̂_2²) ) ≥ 0.
From (2.4) to (2.10) we have established the following theorem.
Fig. 2.1. Comparison of the conditional densities of c_1θ̂_1 + c_2θ̂_2 − θ given ξ̂_1 and ξ̂_2 with (i) c_i = c_i′ = ξ̂_j/(ξ̂_1 + ξ̂_2) (i ≠ j; i, j = 1, 2) and (ii) c_i = c_i″ = ξ̂_j²/(ξ̂_1² + ξ̂_2²) (i ≠ j; i, j = 1, 2), and of (iii) the MLE θ̂_ML given ξ̂_1 and ξ̂_2. They are given by (2.4)/(2.5) and (2.3), respectively.
THEOREM 2.1. If ξ_1 = ξ_2 and ξ̂_1 < ξ̂_2, then

  (ξ̂_2θ̂_1 + ξ̂_1θ̂_2)/(ξ̂_1 + ξ̂_2) ≻ (ξ̂_2²θ̂_1 + ξ̂_1²θ̂_2)/(ξ̂_1² + ξ̂_2²) ≻ θ̂_ML = θ̂_1;

if ξ_1 = ξ_2 and ξ̂_1 > ξ̂_2, then

  (ξ̂_2θ̂_1 + ξ̂_1θ̂_2)/(ξ̂_1 + ξ̂_2) ≻ (ξ̂_2²θ̂_1 + ξ̂_1²θ̂_2)/(ξ̂_1² + ξ̂_2²) ≻ θ̂_ML = θ̂_2.
We assume that c_1 = c_2 = 1/2 and ξ_1 = ξ_2 = ξ. By (2.2) it follows that for each i = 1, 2 the asymptotic density of n(θ̂_i − θ) is given by

(2.11)  f_i(x) = (1/(2ξ)) exp(−|x|/ξ).

Then the characteristic function φ(t) of the asymptotic density function of n[{(θ̂_1 + θ̂_2)/2} − θ] is given by (1 + ξ²t²/4)^{−2}. We may represent φ(t) as follows:

(2.12)  φ(t) = 1/(2(1 + ξ²t²/4)) + (1 − ξ²t²/4)/(2(1 + ξ²t²/4)²).

Since

  ∫_{−∞}^{∞} (1/ξ) exp(−(2/ξ)|y|) exp(ity) dy = 1/(1 + ξ²t²/4)

and

  ∫_{−∞}^{∞} (2/ξ²) |y| exp(−(2/ξ)|y|) exp(ity) dy
Fig. 2.2. Comparison of the asymptotic densities of n(θ̂_1 − θ) and n[{(θ̂_1 + θ̂_2)/2} − θ] given by (2.11) and (2.13), respectively.
  = (1 − ξ²t²/4)/(1 + ξ²t²/4)²,

it follows from (2.12) that the asymptotic density of n[{(θ̂_1 + θ̂_2)/2} − θ] is given by

(2.13)  f(x) = (1/ξ)( 1/2 + |x|/ξ ) exp(−(2/ξ)|x|).
From (2.11) and (2.13) we have established the following theorem.

THEOREM 2.2. If ξ_1 = ξ_2 = ξ, then

  (θ̂_1 + θ̂_2)/2 ≻_as θ̂_ML = { θ̂_1 if ξ̂_1 < ξ̂_2; θ̂_2 if ξ̂_1 > ξ̂_2 }.

Case III₂. ξ_1 ≠ ξ_2. We consider only the case when ξ_1 < ξ_2. We assume that c_1ξ_1 ≠ c_2ξ_2.
By (2.2) it follows that for each i = 1, 2 the asymptotic density of n(θ̂_i − θ) is given by

(2.14)  f_i(x) = (1/(2ξ_i)) exp(−|x|/ξ_i),

and also its characteristic function φ_i(t) is given by

  φ_i(t) = 1/(1 + ξ_i²t²).

Hence the characteristic function φ*(t) of the asymptotic density of n(θ̂_0 − θ) = n(c_1θ̂_1 + c_2θ̂_2 − θ) is given by

  φ*(t) = 1 / ( (1 + c_1²ξ_1²t²)(1 + c_2²ξ_2²t²) ).
Fig. 2.3. Comparison of the asymptotic densities of n(θ̂_1 − θ), n(θ̂_0 − θ) = n(c_1θ̂_1 + c_2θ̂_2 − θ) and n(θ̂_2 − θ) given by (2.14), (2.15) and (2.14), respectively, when ξ_1 < ξ_2.
Since

  φ*(t) = ( 1/(c_1²ξ_1² − c_2²ξ_2²) ) { c_1²ξ_1²/(1 + c_1²ξ_1²t²) − c_2²ξ_2²/(1 + c_2²ξ_2²t²) },

it follows that the asymptotic density f(x) of n(θ̂_0 − θ) is given by

(2.15)  f(x) = ( 1/(2(c_1²ξ_1² − c_2²ξ_2²)) ) { c_1ξ_1 exp(−|x|/(c_1ξ_1)) − c_2ξ_2 exp(−|x|/(c_2ξ_2)) }.

There exists a positive number t_0 such that

  ∫_0^{t_0} f_1(x) dx = ∫_0^{t_0} f(x) dx,

where f_1(x) and f(x) are given by (2.14) and (2.15), respectively. Hence we have established the following theorem.

THEOREM 2.3.
Suppose that ξ_1 < ξ_2 and c_1ξ_1 ≠ c_2ξ_2. Then

  θ̂_ML = θ̂_1 ≻_as θ̂_0

in some neighborhood of 0 in the sense that

  lim_{n→∞} [Pr{n|θ̂_ML − θ| ≤ t} − Pr{n|θ̂_0 − θ| ≤ t}] ≥ 0  for all t with 0 < t ≤ t_0.

Further,

  θ̂_0 ≻_as θ̂_ML = θ̂_1

far away from 0 in the sense that

  lim_{n→∞} [Pr{n|θ̂_0 − θ| ≤ t} − Pr{n|θ̂_ML − θ| ≤ t}] ≥ 0  for all t > t_0.

Remark. From (2.14) and (2.15) it is easily seen that θ̂_0 ≻_as θ̂_2 and θ̂_ML = θ̂_1 ≻_as θ̂_2. The case ξ_1 > ξ_2 may be treated quite similarly.

UNIVERSITY OF ELECTRO-COMMUNICATIONS
UNIVERSITY OF TOKYO
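The Case III₂ comparison lends itself to a quick Monte Carlo check. The sketch below (our own illustrative values of ξ_1, ξ_2, n, not the paper's) forms the pooled MLE and the 1/ξ̂-weighted estimator for m = 2 and tabulates the concentration Pr{n|θ̂ − θ| ≤ t} near zero and in the tail:

```python
import numpy as np

rng = np.random.default_rng(1)
theta, xi1, xi2, n, reps = 0.0, 0.5, 2.0, 200, 4000   # illustrative values

x1 = rng.uniform(theta - xi1, theta + xi1, size=(reps, n))
x2 = rng.uniform(theta - xi2, theta + xi2, size=(reps, n))
t1 = (x1.min(axis=1) + x1.max(axis=1)) / 2          # theta_hat_1 (midrange)
t2 = (x2.min(axis=1) + x2.max(axis=1)) / 2          # theta_hat_2
r1 = (x1.max(axis=1) - x1.min(axis=1)) / 2          # xi_hat_1 (half-range)
r2 = (x2.max(axis=1) - x2.min(axis=1)) / 2          # xi_hat_2

mle = np.where(r1 < r2, t1, t2)      # pooled MLE: component with the smaller half-range
c1 = r2 / (r1 + r2)                  # weights proportional to 1/xi_hat_i
weighted = c1 * t1 + (1 - c1) * t2

# With xi1 << xi2 the MLE almost always equals theta_hat_1:
assert np.mean(r1 < r2) > 0.99
for t in (0.2, 2.0):                 # concentration near 0 and in the tail
    print(t, np.mean(n * np.abs(mle - theta) <= t),
             np.mean(n * np.abs(weighted - theta) <= t))
```

The printed probabilities behave in the manner of Theorem 2.3: the MLE is the more concentrated estimator for small t, while the weighted combination catches up for larger thresholds.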
REFERENCES

[1] Akahira, M. (1975). Asymptotic theory for estimation of location in non-regular cases, I: Order of convergence of consistent estimators, Rep. Statist. Appl. Res., JUSE, 22, 8-26.
[2] Akahira, M. (1975). Asymptotic theory for estimation of location in non-regular cases, II: Bounds of asymptotic distributions of consistent estimators, Rep. Statist. Appl. Res., JUSE, 22, 99-115.
[3] Akahira, M. (1976). A remark on asymptotic sufficiency of statistics in non-regular cases, Rep. Univ. Electro-Comm., 27, 125-128.
[4] Akahira, M. (1982). Asymptotic optimality of estimators in non-regular cases, Ann. Inst. Statist. Math., A, 34, 69-82.
[5] Akahira, M. (1982). Asymptotic deficiencies of estimators for pooled samples from the same distribution, Probability Theory and Mathematical Statistics, Lecture Notes in Mathematics, 1021, 6-14, Springer, Berlin.
[6] Akahira, M. and Takeuchi, K. (1981). Asymptotic Efficiency of Statistical Estimators: Concepts and Higher Order Asymptotic Efficiency, Lecture Notes in Statistics 7, Springer, New York.
[7] Akahira, M. and Takeuchi, K. (1982). On asymptotic deficiency of estimators in pooled samples in the presence of nuisance parameters, Statistics and Decisions, 1, 17-38, Akademische Verlagsgesellschaft.
[8] Akai, T. (1982). A combined estimator of a common parameter, Keio Science and Technology Reports, 35, 93-104.
[9] Antoch, J. (1984). Behaviour of estimators of location in non-regular cases: A Monte Carlo study, Asymptotic Statistics 2, Proceedings of the 3rd Prague Symposium on Asymptotic Statistics, 185-195, North-Holland, Amsterdam.
[10] Ibragimov, I. A. and Has'minskii, R. Z. (1981). Statistical Estimation: Asymptotic Theory, Springer, New York.
[11] Jurečková, J. (1981). Tail-behaviour of location estimators in non-regular cases, Commentationes Mathematicae Universitatis Carolinae, 22, 365-375.
Rep. Stat. Appl. Res., JUSE Vol. 32, No. 3, Sept., 1985, pp. 17-22
A-Section

A Note on the Minimum Variance Unbiased Estimation When the Fisher Information is Infinity

Masafumi AKAHIRA* and Kei TAKEUCHI**
Abstract In the case when the Fisher information is infinity, it is shown that the locally minimum variance of unbiased estimators is equal to zero. Some examples are also given.
1. Introduction In non-regular cases the minimum variance unbiased estimation has been studied by Chapman and Robbins [6], Kiefer [7], Polfeldt [8], Vincze [10], Akahira [1], Akahira and Takeuchi [2], [3], [4], [9], Akahira, Puri and Takeuchi [5] and others. In the previous paper [2], Akahira and Takeuchi introduced the concept of the one-directional distribution and discussed the locally minimum variance unbiased estimation for its family. In this paper, from another point of view, we treat the case when the Fisher information is infinite and show that the locally minimum variance of unbiased estimators is equal to zero. Further we give two examples.
2. Results Let (X, Y) be a pair of random variables defined on a product space 𝔛 × 𝔜 with a joint probability density function (j.p.d.f.) f_θ(x, y) with respect to a σ-finite measure μ_{X,Y}, where θ is a real-valued parameter. Then it follows that for almost all (x, y) [μ_{X,Y}]

  f_θ(x, y) = f_θ(x | y) f_θ(y),

Received September 11, 1985
* Department of Mathematics, University of Electro-Communications, 1-5-1 Chofugaoka, Chofu-shi, Tokyo 182, Japan. (The author is visiting Queen's University in Canada from April to July 1985.)
** Faculty of Economics, University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113, Japan.
(Key words) Locally minimum variance, Unbiased estimator, Fisher information.

where f_θ(x | y) and f_θ(y) denote a conditional p.d.f. of X given y and a marginal p.d.f. of Y, respectively. We assume that for almost all (x, y) [μ_{X,Y}], f_θ(x, y), f_θ(x | y) and f_θ(y) are continuously differentiable in θ. Since

  log f_θ(x, y) = log f_θ(x | y) + log f_θ(y) a.e. μ_{X,Y},

it follows that

  I_{X,Y}(θ) = E_θ^Y[ I_{X|Y}(θ) ] + I_Y(θ),

where I_{X,Y}(θ), I_{X|Y}(θ) and I_Y(θ) denote the amounts of Fisher information with respect to f_θ(x, y), f_θ(x | y) and f_θ(y), respectively, e.g.,

  I_{X,Y}(θ) = ∫∫_{𝔛×𝔜} ( ∂ log f_θ(x, y)/∂θ )² f_θ(x, y) dμ_{X,Y},

and E_θ^Y(·) is the expectation w.r.t. f_θ(y). We denote J_θ(y) = I_{X|Y}(θ). Here we assume the following conditions (A.1) to (A.3).

(A.1) 0 < J_θ(y) < ∞ for almost all y [μ_Y] and all θ, and E_{θ_0}^Y[J_{θ_0}(Y)] = ∞ (and hence I_{X,Y}(θ_0) = ∞) for some θ_0.

(A.2) There exists a sequence {B_n} of measurable sets of the domain of Y such that b_n = E_θ^Y[J_{θ_0}(Y) χ_{B_n}(Y)] is finite for each n and tends to infinity as n → ∞.

(A.3) There exists an estimator θ̂_y(X) such that for all θ and almost all y

  E_θ^{X|y}(θ̂_y(X)) = θ  and  V_θ^{X|y}(θ̂_y(X)) = 1/J_θ(y),

where E_θ^{X|y}(·) and V_θ^{X|y}(·) denote the conditional expectation and the conditional variance of X given y. We shall show that the locally minimum variance of unbiased estimators of θ is equal to zero.

Theorem: Assume that the conditions (A.1) to (A.3) hold. Then

  inf_{θ̂(X,Y) : unbiased} V_{θ_0}(θ̂(X, Y)) = 0.

Proof: Let an estimator θ* = θ*(X, Y) be

  θ* = (1/b_n) J_{θ_0}(Y) χ_{B_n}(Y) { θ̂_Y(X) − θ_0 } + θ_0.
By the assumptions (A.2) and (A.3) we have

  E_θ^{X,Y}(θ*) = (1/b_n) E_θ^Y[ J_{θ_0}(Y) χ_{B_n}(Y) { E_θ^{X|Y}(θ̂_Y(X)) − θ_0 } ] + θ_0 = θ

for all θ; hence θ* is an unbiased estimator of θ, where E_θ^{X,Y}(·) denotes the expectation w.r.t. f_θ(x, y). We also obtain by (A.2) and (A.3)

  E_{θ_0}^{X,Y}(θ*²) = (1/b_n²) E_{θ_0}^Y[ J_{θ_0}²(Y) χ_{B_n}(Y) E_{θ_0}^{X|Y}(θ̂_Y(X) − θ_0)² ]
    + (2θ_0/b_n) E_{θ_0}^Y[ J_{θ_0}(Y) χ_{B_n}(Y) { E_{θ_0}^{X|Y}(θ̂_Y(X)) − θ_0 } ] + θ_0²
    = (1/b_n²) E_{θ_0}^Y[ J_{θ_0}(Y) χ_{B_n}(Y) ] + θ_0² = 1/b_n + θ_0².

Then it follows that the variance of θ* at θ_0 is given by

  V_{θ_0}^{X,Y}(θ*) = 1/b_n,

which tends to zero as n → ∞ because of (A.2). Thus we complete the proof.
Next we shall show some examples.

Example 1. Let X_i = θY_i + U_i (i = 1, 2, …, n), where (Y_i, U_i) (i = 1, …, n) are independently and identically distributed (i.i.d.) random vectors, and for each i, Y_i and U_i are mutually independent. Let f(u) be a known density function of U_i. Assume that

  I_U = ∫ {f′(u)}²/f(u) du < ∞.

Putting Y = (Y_1, …, Y_n) we have I_Y(θ) = 0, since the distribution of Y is independent of θ, and E^Y(J_θ(Y)) = n E(Y_1²) I_U. Assume that E(Y_1²) = ∞ and that f(u) is the density of the standard normal distribution. For any K > 0 and each i = 1, …, n we define

  Y_{i,K} = Y_i for |Y_i| ≤ K;  Y_{i,K} = 0 for |Y_i| > K.

Since E(Y_1²) = ∞, it is easily seen that E(Y_{1,K}²) tends to infinity as K → ∞. We also define an estimator

  θ̂_K = Σ_{i=1}^n Y_{i,K} X_i / ( n E(Y_{1,K}²) ).

It is clear that θ̂_K is an unbiased estimator of θ. Since the variance of θ̂_K at θ = 0 is given by

  V_0(θ̂_K) = E_0(θ̂_K²) = E( Σ_{i=1}^n Y_{i,K}² ) / ( n² {E(Y_{1,K}²)}² ) = 1 / ( n E(Y_{1,K}²) ),
it follows that lim_{K→∞} V_0(θ̂_K) = 0; hence inf_{θ̂ : unbiased} V_0(θ̂) = 0.

Example 2. Let Y be an unobservable random variable having the chi-square distribution with one degree of freedom. Assume that X_1, …, X_n are independently, identically, normally and conditionally distributed random variables with mean θ and variance Y given Y, where n ≥ 5. Then we have

  I_{X,Y}(θ) = n E_θ[ E^{X|Y}{ (X_1 − θ)²/Y² | Y } ] = n E_θ(1/Y) = ∞.

For any K > 0 we define an estimator

  θ̂_K = c_K X̄ / Σ_{i=1}^n (X_i − X̄)²  for Σ_{i=1}^n (X_i − X̄)² > 1/K;
  θ̂_K = 0  for Σ_{i=1}^n (X_i − X̄)² ≤ 1/K,

where c_K is a constant such that θ̂_K is an unbiased estimator of θ. Then we shall show that lim_{K→∞} V_0(θ̂_K) = 0.
First we easily see that Σ_{i=1}^n (X_i − X̄)² is expressed as a product YZ of two independent random variables Y and Z, where Z has the chi-square distribution with n−1 degrees of freedom. Let f_1(y) and f_{n−1}(z) be the density functions of Y and Z, respectively. By the unbiasedness condition on θ̂_K we obtain

(2.1)  1 = c_K ∫∫_{yz > 1/K} ( 1/(yz) ) f_1(y) f_{n−1}(z) dy dz.

We also have for the variance of θ̂_K at θ = 0

(2.2)  V_0(θ̂_K) = c_K² ∫∫_{yz > 1/K} ( 1/(n y z²) ) f_1(y) f_{n−1}(z) dy dz = v_K (say).

Putting

(2.3)  F(Kz) = ∫_{1/(Kz)}^∞ (1/y) f_1(y) dy,

we have

  F(Kz) = c′ ∫_{1/(Kz)}^∞ y^{−3/2} e^{−y/2} dy ≤ c √(Kz),

where c and c′ are constants. By (2.1) and (2.3) we obtain

  1 = c_K ∫_0^∞ (1/z) F(Kz) f_{n−1}(z) dz ≍ A √K c_K,

which implies c_K = O(K^{−1/2}), where A is some constant. By (2.2) and (2.3) we also have

  v_K = (c_K²/n) ∫_0^∞ (1/z²) F(Kz) f_{n−1}(z) dz ≍ B √K c_K² = O(K^{−1/2}),

where B is some constant. Then we obtain lim_{K→∞} V_0(θ̂_K) = lim_{K→∞} v_K = 0. Hence we have inf_{θ̂ : unbiased} V_0(θ̂) = 0.
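The rates c_K = O(K^{−1/2}) and v_K = O(K^{−1/2}) in Example 2 can be checked by simulation. The sketch below (the sample size n = 8 and the Monte Carlo design are choices of ours, not the paper's) estimates 1/c_K = E[1{YZ > 1/K}/(YZ)] and v_K/c_K² from draws of Y ~ χ²₁ and Z ~ χ²_{n−1}:

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 8, 500_000                 # n >= 5 as the example requires
y = rng.chisquare(1, reps)           # unobservable variance Y ~ chi^2_1
z = rng.chisquare(n - 1, reps)       # independent Z ~ chi^2_{n-1}, so sum(X_i - Xbar)^2 = YZ

def ck_vk(K):
    keep = (y * z > 1.0 / K)
    inv_ck = np.mean(keep / (y * z))              # Monte Carlo version of (2.1)
    vk_over_ck2 = np.mean(keep / (n * y * z**2))  # Monte Carlo version of the integral in (2.2)
    return 1.0 / inv_ck, vk_over_ck2 / inv_ck**2

(c1, v1), (c2, v2), (c3, v3) = ck_vk(1e0), ck_vk(1e2), ck_vk(1e4)
assert c1 > c2 > c3 and v1 > v2 > v3   # both shrink toward 0 as K grows
```

Printing the three pairs shows both sequences shrinking by roughly a factor of ten per hundredfold increase in K, consistent with the O(K^{−1/2}) rate.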
Acknowledgements The authors wish to thank the referees for their comments. The paper was written while the first author was at Queen's University in Canada as a visiting professor. The visit was supported by a grant of the Natural Sciences and Engineering Research Council of Canada. He is grateful to Professor Colin R. Blyth for inviting him.

REFERENCES
[1] Akahira, M. (1984): "On the Bhattacharyya inequality in non-regular case," Sūrikaisekikenkyūsho Kōkyūroku (Proc. Symp., Res. Inst. Math. Sci., Kyoto Univ.), 538, 65-80.
[2] Akahira, M. and Takeuchi, K.: "The lower bound of the variance of unbiased estimators for one-directional distributions" (submitted for publication).
[3] Akahira, M. and Takeuchi, K.: "Locally minimum variance unbiased estimator in a discontinuous density function," to appear in Metrika.
[4] Akahira, M. and Takeuchi, K. (1985): Non-Regular Statistical Estimation, Monograph.
[5] Akahira, M., Puri, M. L. and Takeuchi, K.: "Bhattacharyya bound of variances of unbiased estimators in non-regular cases," to appear in the Annals of the Institute of Statistical Mathematics.
[6] Chapman, D. G. and Robbins, H. (1951): "Minimum variance estimation without regularity assumptions," Ann. Math. Statist., 22, 581-586.
[7] Kiefer, J. (1952): "On minimum variance in non-regular estimation," Ann. Math. Statist., 23, 627-630.
[8] Polfeldt, T. (1970): "The order of the minimum variance in a non-regular case," Ann. Math. Statist., 41, 667-672.
[9] Takeuchi, K. and Akahira, M.: "A note on minimum variance," to appear in Metrika.
[10] Vincze, I. (1979): "On the Cramér-Fréchet-Rao inequality in the non-regular case," In: Contributions to Statistics. The Jaroslav Hájek Memorial Volume, Academia, Prague, 253-263.
Ann. Inst. Statist. Math. 38 (1986), Part A, 35-44
BHATTACHARYYA BOUND OF VARIANCES OF UNBIASED ESTIMATORS IN NONREGULAR CASES MASAFUMI AKAHIRA, MADAN L. PURI* AND KEI TAKEUCHI
(Received May 19, 1984; revised Jan. 16, 1985)
Summary The Bhattacharyya bound is generalized to nonregular cases in which the support of the density depends on the parameter, while the density is differentiable several times with respect to the parameter within the support. An example is discussed in which the bound is shown to be sharp.

1. Introduction

It is well known that the Cramér-Rao and the Bhattacharyya bounds are most important and very useful for the variances of unbiased estimators. They are, however, not applicable in non-regular cases where the support of the distribution depends on the parameter. The same is true of the more general and simpler bounds of Hammersley [6], Chapman and Robbins [2], Kiefer [8], Fraser and Guttman [5], Fend [4] and Chatterji [3], among others. (For an exposition of some of this work along with extensions in different directions, see Polfeldt [10], [11] and the recent papers of Vincze [15], Khatri [7] and Móri [9], among others.) In his paper, Polfeldt [10] discussed the lower bound of the variances of unbiased estimators when the class of probability measures is one-sided, that is, when P_{θ1} is absolutely continuous with respect to P_{θ2} (symbolically, P_{θ1} ≪ P_{θ2}). In this note, our main interest is to obtain the Bhattacharyya bound when for any θ1, θ2 with θ1 ≠ θ2, neither P_{θ1} ≪ P_{θ2} nor P_{θ2} ≪ P_{θ1}.

* Research supported by NSF Grant MCS-8301409.
Key words and phrases: Cramér-Rao bound, Bhattacharyya bound, unbiased estimator.

2. Results

Let 𝔛 be an abstract sample space with x as its generic point, 𝔄 a σ-field of subsets of 𝔛, and let Θ be a parameter space assumed to be an open set in the real line. Let 𝒫 = {P_θ : θ ∈ Θ} be a class of probability measures on (𝔛, 𝔄). We assume that for each θ ∈ Θ, P_θ(·) is absolutely continuous with respect to a σ-finite measure μ. We denote dP_θ/dμ by f(x, θ). For each θ ∈ Θ, we denote by A(θ) the set of points in 𝔛 for which f(x, θ) > 0. We shall consider the Bhattacharyya bound of variances of unbiased estimators at some specified point θ_0 in Θ. We make the following assumptions:

(A.1) μ( ( ∩_{ε>0} ∪_{|h|<ε} A(θ_0 + h) ) − A(θ_0) ) = 0.

(A.2) There exist ε > 0 and a nonnegative function p(x) such that p(x) ≥ f(x, θ) for every θ ∈ (θ_0 − ε, θ_0 + ε), and ∫_{A(θ)} |φ(x)| f(x, θ) dμ < ∞ implies ∫_{A(θ)} |φ(x)| p(x) dμ < ∞.

(A.3) For some positive integer k

  lim sup_{h→0} sup_{x ∈ ∪_{j=1}^i A(θ_0+jh) − A(θ_0)} | Σ_{j=1}^i (−1)^j C(i,j) f(x, θ_0 + jh) | / ( |h|^i p(x) ) < ∞,  i = 1, …, k,

where C(i, j) denotes the binomial coefficient.
First we prove the following lemma.

LEMMA. Assume that (A.1), (A.2) and (A.3) hold. If φ(x) is any measurable function for which ∫_𝔛 |φ(x)| p(x) dμ < ∞, then for i = 1, …, k

  lim_{h→0} (1/h^i) ∫_{∪_{j=1}^i A(θ_0+jh) − A(θ_0)} Σ_{j=1}^i (−1)^j C(i,j) φ(x) f(x, θ_0 + jh) dμ = 0.

PROOF. By (A.2) and (A.3) it follows that for every i = 1, …, k and every θ_0 ∈ Θ, there exist positive numbers ε and K_i such that

  |h|^{−i} | Σ_{j=1}^i (−1)^j C(i,j) f(x, θ_0 + jh) | ≤ K_i p(x)  for |h| < ε and x ∈ A(θ_0)^c,

where A(θ_0)^c denotes the complement of the set A(θ_0). Also, it follows from (A.1) that for every j = 1, …, i, there exist a sequence {ε_{jn}} of positive numbers converging to zero as n → ∞ and a monotone nonincreasing sequence {S_{jn}} of measurable sets such that |jh| < ε_{jn} implies
  A(θ_0 + jh) − A(θ_0) ⊂ S_{jn}  and  μ( ∩_{n=1}^∞ S_{jn} ) = 0.

If for each j = 1, …, i, |jh| < ε_{jn}, then

  | (1/h^i) ∫_{∪_{j=1}^i A(θ_0+jh) − A(θ_0)} Σ_{j=1}^i (−1)^j C(i,j) φ(x) f(x, θ_0 + jh) dμ |
    ≤ K_i ∫_{∪_{j=1}^i A(θ_0+jh) − A(θ_0)} |φ(x)| p(x) dμ
    ≤ K_i ∫_{∪_{j=1}^i S_{jn}} |φ(x)| p(x) dμ
    ≤ K_i Σ_{j=1}^i ∫_{S_{jn}} |φ(x)| p(x) dμ,

which tends to zero as n → ∞.
The proof follows.
The assumption (A.2) together with the condition in the
above Lemma is satisfied with p(x) _ Z cif (x, 0) when the following i=1
holds : For each Bo and s > 0, there exist countable points 01 , 02, • • • and positive constants c1, c2,• • • such that U A(6i)DA(B) for all 0 E (Boi=1
s, 00+s ) and that
and F, cif(x, Bi)>f(x, 0) for all 0 E (Bo-s, Bo
ci
i=1
+ s) and almost all x[P].
We assume the following : (A.4) For each x E A(00), f (x, 0) is k-times continuously differentiable in 0 at 0=0g. (A.5) For each i=1,• •, k, x, 00+h) 8Bi f( lim sup <00 , h-•0 XEA (80)
where p(x) is defined in (A.2).
P(x)
We now prove our main theorem on the Bhattacharyya bound of the variances of unbiased estimators. THEOREM 2.1. Assume that (A.1) to (A.5) hold. Let g(0) be an estimable function which is k-times differentiable over ®. Let g(x) be an unbiased estimator of g(0) satisfying
5
I 9(x) I P( x )dp < oo .
A(B0)
Further, let A be a k x k non- negative definite matrix whose elements are
240
38 MASAFUMI AKAHIRA, MADAN L. PURI AND KEI TAKEUCHI
A£f= ( 1 J a£f(x, 00) of f(x, 00) d ^` J Ax, eo) 1. ae£ aef
i, j=1,•••,k.
A(80)
Assume that 2££, i =1, • • • , k are finite. If d is nonsingular at 00, then (2.1) Var (9) Z (gu)(0 o), ... , g(k)(00 ))A-i(g(i)(0 0), ... , g(x)(0o))' , 00
where g(£)( 0) is the i-th order derivative of g(0). PROOF. Denote
c(x)f(x, B)dp
go(0)= J
J f(x, B)=f(x, 0+h) -f(x, 0) .
and
A(00)
Then, by (A.5), we have (2.2)
1-
£ Wg0(0)]
'&'(00) = li m
'= 6a =lhm h£
I-0 h
g(x)dkf( x , 0o) dp
£ A(60)
= lim
9'(x ) -£ dhf(x, Bo)dp A(60)
( !j f(x, die =1m J 9(x ){ _Bo+h) 9
A(B)
9(x)
[ ae £ f( x ,
0) ] d p
A(B0)
0
where dhg(0)=dh 1(dhg(B)), i=1,• • •, k and 0<<1. Since W) = f 9(x) A(B)
• f(x, 0)d1 , we obtain for each i=1, • • •, k
(2.3)
d n(g(00) - go(00)) =E
f
=1
(-1)1 (
{g(0o+jh)-g o( Bo +jh)}
fE (-1)J^ j I l
-
=
5
J
4(x)f(x, 9o +jh)dte
A(60+fh)
9(x)f(x, eo +jh)dq}
A(B0)
£ E(-1 )f(j)
(5 - A(60+fh)
• g'(x)f(x,
A(80)f1 A (90+fh)
A(00)-A (00+fh)^
8o +j h)dp}
= J=l (-1)f (j) l^A(80+fh) J
=f^ (- 1)f (j) S
J )9(x)flx, A(B0) n A (Oo+f h)
A(60+fh)- A(80)
i(x)f(x, Bo+jh)dp
0o
+jh)dp}
241
BHATTACHARYYA BOUND OF VARIANCES OF UNBIASED ESTIMATORS 39
=fE
(-1)'(j )
t
1
gr(x)f(x, Bo+jh)dp
U A(Oo+kh)-A(Oo) k=1
f J 1)J >
l /9(x)f(x, Bo+jh)du
U A(Bo+kh)-A(eo) k=1
By (2.3) and Lemma, we have for each i=1, • •, k, [-j.g(o)]
(2.4)
=lim hi d Lg(eo) =lim h; dn(g(60)-g0(00))+li V¢ dng(eo) h-o h-0
^ =lim ht dhg(O0)= Bf go(B) e_e
0
From (2.2) and (2.4) we obtain for each i=1, • • •, k, (2.5)
L 8B' g(B)]
a-BO & A(Bp)
)[-8Bi .f (x, 0)] a_ odf,
Proceeding now as in the regular case (see e.g. Zacks [16], page 190), one can show that the Bhattacharyya bound of the variances of the unbiased estimators of g(0) is given by (2.1).
3. Example We consider the location parameter case, i.e., f(x, 0)=f(x-0), and unbiased estimators of 0.
We assume that for any pZ1, the density function f(x) is given by c(1 - x2)p-1
f(x)=
f 0
if
if
Ixk< 1 IxIz1
where c=1/B(1/2, p) with B(a, p)=sax 1(1-x)0-1dx (a>0, p>0).
Case ( i ) : Let p = 1. Then the distribution is uniform, and it is easy to check that min Vareo (0)=0 for any specific value 0o. (See Takeuchi B: unbiased
[14]). Case ( ii): Let p=2. In this case , it is easy to check that the Fisher
information
51
[f'(x)/f(x)]2f(x)dx=00.
For any s > 0 , we define an estimator B. which satisfies
242 40 MASAFUMI AKAHIRA, MADAN L. PURI AND KEI TAKEUCHI
if IxIs1 -6 ,
- c,f'(x) /f(x)
0 if 1-6<jxj51 , where c, is a constant determined from the equations
51
(3.1)
1 B,(x) f (x)dx = 0
5' 1
and
B,(x) f'(x)dx = -1 .
We shall determine B,(x) for x outside the interval [-1, 1] from the unbiasedness condition (11'1 +9
(3.2)
B,(x)f(x- 6)dx=6. J -1 +9
First consider the case 0 < 0:!9 1, and define g(6) = f 1 1+8 #,(x) f (x - 6)dx .
(3.3)
Since B,(x) and f'(x) are bounded, g(O) is differentiable and g'(6)= e,(x)f'(x-6)dx. _ -1+9
If we assume that B,(x) is bounded for 1<x52, the right hand side of (3.3) is also differentiable, and we have by (3.2) and (3.3) 1+9
(3.4) 1-g'(B)=-J
6,(x)f'(x-6)dx .
Differentiating (3.4) again, and noting that lim f'(x)= -3/2, we have x-.1-0 1+9
g"(6) 2 0,(1+0)- f 1
(3.5)
B,(x)f"(x - 6)dx
If B. satisfies (3.5), then it also satisfies (3.4) since lim g'(6) _ -1; it 9-.0
also satisfies (3.3) since lim g(6)=0 by (3.1). 9-.0
Since the integral equation (3.5) is of Volterra's second type, it follows that the solution B,(x) exists and is bounded. Repeating the same process, we have the solution e,(x) for all x>1. Similarly, we can construct 9,(x) for x<-1. On the other hand,
(3.6) Vargo (e,) = c' ,
E +&
[f'(x)lf(x)]Ef(x)dx
while from (3.1), we have (3.7) c, 1x1 1 J
f'(x)dx = 0 -1+.
Hence Vargo (9,)=c,.
and J
c, 1x1 1
-1+.
[ f'(x)/ f (x)]'f (x)dx =1 .
243
BHATTACHARYYA BOUND OF VARIANCES OF UNBIASED ESTIMATORS 41
[ f'(x)/ f (x)]2 f (x)dx = oo, we have from (3.7),
Now, since lim r 1 .moo
inf Vare, (B)=0 for any specified value Bo.
lim c,=0. Consequently , `4
9:
unbiased
Before proceeding on to the next cases, we note the following : (i=1, .. • , k) given in Theorem 2.1 are finite,
If k < p/2, then A since
5
1 -1 (1-
X 2)p-2k-l dx < oo
Also, 1 aif(x -e) aff(x-e) dx
c6+1
(3.8) 9
-1 f(x -B)
ae f
ae{
= f' (-1)t+> f(x) f"'(
x)fcf'(x)dx ; i, j = 1,..., k,
We also obtain for Ixl<1, f<1'(x)= -2c(p-1)x(1- x2)P-2 ; fc2>(x)= -2c(p-1){(1-
(3.9)
x2)P-
2 - 2(p-2)x2(1- x2)P -2]
f(2)(x) = 4c(p-1) ( p - 2) {3x(1- x2)P-2 - 2(p - 3)x2 (1- x2)P-1) ; .. .
If i+ j is an odd number, if follows by (3.8) and (3.9) that 2 = 0 since f"'(x) f 1>'(x) is an odd function. From (3.8) and (3.9), we have 211=
4c(p -1)2B(
3 , p-2)
Ail=8c(p-1)2(p-2) {2(p-3)B(2 , p-4) -3B(2 , p-3)}
222=
4c(p-1)2{B (2 , p-2)-4(p-2)B(
+4(p-2)2B(5, p
2,
p-3)
-4)}
233 = 16c(p-1)2(p-2)2 {9B (3 , p-4) -12(p-3)B(2 , p-5)
+4(p-3)2B 2 , p_6)} ;... ( Case (iii) : Let p=3,4. Then, we have, for any unbiased estimator B(X) of 0,
(3.10)
511
O(x)f(x)dx=0
244
42 MASAFUMI AKAHIRA, MADAN L. PURI AND KEI TAKEUCHI
11
(3.11)
B(x)f'(x)dx =-1
(3.12) 5 1
^(x)f(x)dx=0, k=2,..., p-1. 2
p-1
Noting that ^- [ {
ck f (k)(x) (
/f (
1 k=i
x)] dx <
implies c2= • • = c,-, = 0, we
have by Takeuchi and Akahira [13], that the infimum of Varo (B)= B2(x)f(x)dx under (3.10), (3.11) and (3.12) is given by inf Varo (B) ^11
= 1/A11 , where 21,=
5[
any 8>0 there exists
and
5'
9 : ( 3 .1 0 )-( 3 .12)
f'(x)/f(x)]2f(x)dx=(p-1 )(2p-1)/(p - 2) and for
0,(x) in (-1, 1) satisfying (3.10), (3.11) and (3.12),
B;(x) f (x)dx < (1/A,,) -{- s.
We can extend B,(x) for x outside (-1, 1+e
1) from the unbiasedness condition
0,(x) f (x - B)dx = B.
First, we
-1+B
consider the case when 0< 0<1, and define g(O) (0) =
5 1'
9,(x) f (x - B)dx. 1 +B
In a similar way to the case (ii), we have for k=2,• • •, p-1, 1+0 „
(3.13)
(- 1)k +lg(k +1)(O)= BkO,(1+0)- 0,(x)f ( k +1)(x-B)dx l Bk= lim f( k)(x)
Since the integral equation (3.13) is again of Volterra's second type, it follows that the solution B,(x) exists. Repeating the process, we can construct an unbiased estimator B,(x) for all values of x. Then, it follows that inf Var8, (B)=1/A,, for any specified value 00. 9: unbiased
Case (iv) : Let p=5, 6. Note that
J 11 [f'(x)f"(x)/f (x)]dx = 0 J11
[f"(x)/f(x)]2f(x)dx=
(p-1)(2p-1)(2p-3) (2p'-7p+8) (p- 2) (p- 3) (p-4)
= 222 (say)
and J1 imply c,= • .. =cp_1=0.
1 Y, ck f(k)(x)/f (x)I 2 f(x)dx < oo
1 k=i
Then we see that
Vargo (B)Z(1, 0)( '111 0 0 222
0 = Ali /_(1) 1
245
BHATTACHARYYA BOUND OF VARIANCES OF UNBIASED ESTIMATORS 43
for any specified 00, where 211 is defined above (3.8). Here again, as in the previous case inf Var,0 (8)=1/211 for any specific 00. 6: unbiased
Case (v) : Let p=7. In this case, we see that k=3. Using Theorem 2.1, (3.8) and (3.9), we obtain
Vargo (e)> ICI
3
1 ill = E('-_211233
)]_1
where
0
213
222
0
0
283
with 2ii=144cB(2, 5) ; 2ia=1440c {8B(2, 3 )-3B(3, 4)} 2E2 =144c {B (2 , 5) - 20B (2 , 4) +100B (2 , 3) } 233=144O0c {9B(3, 3)-48B (2, 2)+64B(2, 1)} We also obtain 2ia = {8B(5/2, 3)-3B(3/2, 4)}2 2112aa B(3/2, 5){9B(3/2, 3)-48B(5/2, 2)+64B(7/2, 1)} Here again inf Vargo (B)=[211(1-(213/211213)]-1 for any specific 0 0 , i.e. 9: unbiased
we have a sharp bound. Case (vi) : For pZ 8 we can continue in a similar manner by choosing k=[(p-1)/2], where [s] denotes the largest integer less than or equal to s. The above discussion establishes that here the bound is sharp but generally it is not attainable.
Acknowledgement The authors are indebted to Professor Michael Perlman for the very careful and critical examination of the first version of the paper.
246 44 MASAFUMI AKAHIRA, MADAN L. PURI AND KEI TAKEUCHI
His constructive comments and suggestions for improvements are gratefully acknowledged. They are also grateful to the editor and the referee for their helpful comments. UNIVERSITY OF ELECTRO - COMMUNICATIONS INDIANA UNIVERSITY UNIVERSITY OF TOKYO
REFERENCES

[1] Bhattacharyya, A. (1946). On some analogues of the amount of information and their use in statistical estimation, Sankhya, 8, 1-32.
[2] Chapman, D. G. and Robbins, H. (1951). Minimum variance estimation without regularity assumptions, Ann. Math. Statist., 22, 581-586.
[3] Chatterji, S. D. (1982). A remark on the Cramér-Rao inequality, in Statistics and Probability: Essays in Honor of C. R. Rao, North Holland Publishing Company, 193-196.
[4] Fend, A. V. (1959). On the attainment of the Cramér-Rao and Bhattacharyya bounds for the variance of an estimate, Ann. Math. Statist., 30, 381-388.
[5] Fraser, D. A. S. and Guttman, I. (1952). Bhattacharyya bounds without regularity assumptions, Ann. Math. Statist., 23, 629-632.
[6] Hammersley, J. M. (1950). On estimating restricted parameters, J. Roy. Statist. Soc., B, 12, 192-240.
[7] Khatri, C. G. (1980). Unified treatment of Cramér-Rao bound for the non-regular density functions, J. Statist. Plann. Inf., 4, 75-79.
[8] Kiefer, J. (1952). On minimum variance in non-regular estimation, Ann. Math. Statist., 23, 627-630.
[9] Móri, T. F. (1983). Note on the Cramér-Rao inequality in the non-regular case: The family of uniform distributions, J. Statist. Plann. Inf., 7, 353-358.
[10] Polfeldt, T. (1970). Asymptotic results in non-regular estimation, Skand. Akt. Tidskr. Suppl., 1-2.
[11] Polfeldt, T. (1970). The order of the minimum variance in a non-regular case, Ann. Math. Statist., 41, 667-672.
[12] Rao, C. R. (1973). Linear Statistical Inference and Its Applications, Wiley, New York.
[13] Takeuchi, K. and Akahira, M. (1986). A note on minimum variance, in press in Metrika.
[14] Takeuchi, K. (1961). On the fallacy of a theory of Gunnar Blom, Rep. Stat. Appl. Res., JUSE, 9, 34-35.
[15] Vincze, I. (1979). On the Cramér-Fréchet-Rao inequality in the non-regular case, in Contributions to Statistics, The Jaroslav Hájek Memorial Volume, Academia, Prague, 253-262.
[16] Zacks, S. (1971). Theory of Statistical Inference, John Wiley.
Metrika, Volume 33, 1986, page 85-91
A Note on Minimum Variance By K. Takeuchi1 and M. Akahira2
Summary: Minimizing $\int \{\theta(x)\}^2 f(x)\,d\mu$ is discussed under the unbiasedness conditions $\int \theta(x) f_i(x)\,d\mu = c_i$ $(i = 1, \dots, p)$ and the condition (A): $f_i(x)$ $(i = 1, \dots, p)$ are linearly independent and $\int \{f_i(x)\}^2/f(x)\,d\mu < \infty$ $(i = 1, \dots, k;\ k \le p)$.
1 Introduction

Let $(\mathcal{X}, B)$ be a sample space. We consider a family $\mathcal{P} = \{P_\theta : \theta \in \Theta\}$ of probability measures on $B$, where the index set $\Theta$ is called a parameter space. In minimum variance unbiased estimation theory, the locally best unbiased estimator $\gamma^*$ of $\gamma = g(\theta)$ at $\theta = \theta_0$ is obtained by minimizing

$$\int_{\mathcal{X}} \{\gamma(x)\}^2\, dP_{\theta_0}(x)$$

under the condition that

$$\int_{\mathcal{X}} \gamma(x)\, dP_\theta(x) = g(\theta) \quad \text{for all } \theta \in \Theta. \tag{1.1}$$

When the parameter space $\Theta$ is a finite set $\{\theta_0, \theta_1, \dots, \theta_p\}$, the condition (1.1) reduces to

$$\int_{\mathcal{X}} \gamma(x)\, dP_{\theta_i}(x) = g(\theta_i) = c_i \ \text{(say)} \quad (i = 0, 1, \dots, p),$$

and it is easily derived that the minimizing solution $\gamma^*$ is obtained as

$$\gamma^*(x) = \sum_{i=0}^{p} \lambda_i f_i(x),$$
1 Kei Takeuchi, Faculty of Economics, University of Tokyo, Hongo, Bunkyo-ku, Tokyo 113, Japan. 2 Masafumi Akahira, Statistical Laboratory, Department of Mathematics, University of Electro-Communications, Chofu, Tokyo 182, Japan. 0026-1335/86/020085-091 $2.50 © 1986 Physica-Verlag, Vienna
where

$$f_i(x) = \frac{dP_{\theta_i}}{dP_{\theta_0}}(x),$$

provided that

(C.1) for each $i = 1, \dots, p$, $P_{\theta_i}$ is absolutely continuous with respect to $P_{\theta_0}$;

(C.2) $\displaystyle\int_{\mathcal{X}} \left(\frac{dP_{\theta_i}}{dP_{\theta_0}}(x)\right)^2 dP_{\theta_0}(x) < \infty \quad (i = 1, \dots, p)$;

(C.3) $f_i(x)$ $(i = 1, \dots, p)$ are linearly independent.
The purpose of this note is to investigate the situation where for some $i$ the first and/or the second condition does not hold. Note that the third condition is obviously necessary. First we consider the case where we assume the first condition but not the second. To deal with this case we need a more subtle formulation of the condition in place of (C.2), which is given in the next section.
2 Results

Let $f(x)$ and $f_i(x)$ $(i = 1, \dots, p)$ be measurable functions. We consider minimizing

$$\int_{\mathcal{X}} \{\theta(x)\}^2 f(x)\, d\mu$$

under the unbiasedness conditions

$$\int_{\mathcal{X}} \theta(x) f_i(x)\, d\mu = c_i \quad (i = 1, \dots, p)$$

and the condition (A): $f_i(x)$ $(i = 1, \dots, p)$ are linearly independent,

$$\int_{\mathcal{X}} \frac{\{f_i(x)\}^2}{f(x)}\, d\mu < \infty \quad (i = 1, \dots, k;\ k \le p),$$

and

$$\int_{\mathcal{X}} \frac{\left\{\sum_{i=1}^{p} a_i f_i(x)\right\}^2}{f(x)}\, d\mu < \infty$$

implies $a_{k+1} = \dots = a_p = 0$.
Note that in the above $f$ and $f_i$ $(i = 1, \dots, p)$ need not be density functions. In fact, in some applications we may take the derivatives of the density with respect to the parameter $\theta$ as the $f_i$ and obtain a generalization of the Bhattacharyya inequality. Also, generally $f$ is equal to one of the $f_i$'s, but this is not essential for the subsequent discussion. Let $A_{11}$ be a $k \times k$ non-negative definite matrix whose elements are

$$\lambda_{il} = \int_{\mathcal{X}} \frac{f_i(x) f_l(x)}{f(x)}\, d\mu \quad (i, l = 1, \dots, k).$$
Theorem: Under the condition (A),

$$\inf_{\theta:\ \text{unbiased}} \int_{\mathcal{X}} \{\theta(x)\}^2 f(x)\, d\mu = c_{(k)}' A_{11}^{-1} c_{(k)}$$

holds, where $c_{(k)} = (c_1, \dots, c_k)'$ is given in the above. The equality is not usually attained.

Proof: We put
$$A_n = \left\{x : \max_{1 \le i \le p} \frac{|f_i(x)|}{f(x)} \le n\right\}.$$

We define the class $C_n$ of estimators such that for any $\theta_n \in C_n$, $\theta_n(x) = 0$ for $x \notin A_n$. We also put

$$\lambda_{il}^{(n)} = \int_{A_n} \frac{f_i(x) f_l(x)}{f(x)}\, d\mu \quad (i, l = 1, \dots, p)$$

and denote the matrix $(\lambda_{il}^{(n)})$ by $\Lambda_n$. Then we have

$$\min_{\theta_n \in C_n:\ \text{unbiased}} \operatorname{Var}_\theta(\theta_n) = c' \Lambda_n^{-1} c,$$
where $c = (c_1, \dots, c_p)'$ is given in the condition (A). On the other hand we may partition the matrix $\Lambda_n$ into blocks as

$$\Lambda_n = \begin{pmatrix} \Lambda_{11}^{(n)} & \Lambda_{12}^{(n)} \\ \Lambda_{21}^{(n)} & \Lambda_{22}^{(n)} \end{pmatrix},$$

where $\Lambda_{11}^{(n)}$ and $\Lambda_{22}^{(n)}$ are $k \times k$ and $(p-k) \times (p-k)$ matrices, respectively.
Let $c = (c_1', c_2')'$ with $c_1 \in R^k$ and $c_2 \in R^{p-k}$. Then the condition (A) implies that

$$\lim_{n \to \infty} c' \Lambda_n c = \infty \tag{2.1}$$

for every vector $c = (c_1', c_2')'$ such that $c_2 \ne 0$. To prove the theorem it is necessary and sufficient to show that

$$c' \Lambda_n^{-1} c \to c_{(k)}' A_{11}^{-1} c_{(k)} \quad \text{as } n \to \infty,$$

and since the matrix $\Lambda_n$ is non-negative definite it is enough to show that

$$\left(\Lambda_{22}^{(n)} - \Lambda_{21}^{(n)} \Lambda_{11}^{(n)\,-1} \Lambda_{12}^{(n)}\right)^{-1} \to 0 \quad \text{as } n \to \infty,$$

that is, the minimum characteristic root of the matrix $\bar\Lambda_n = \Lambda_{22}^{(n)} - \Lambda_{21}^{(n)} \Lambda_{11}^{(n)\,-1} \Lambda_{12}^{(n)}$ diverges to infinity as $n$ tends to infinity. Assume otherwise; then we have a sequence $\{c_n\}$ of vectors with $c_n = (c_{1n}', c_{2n}')'$, $\|c_{2n}\| = 1$, such that

$$c_n' \Lambda_n c_n = c_{2n}' \bar\Lambda_n c_{2n} = \rho_n \ \text{(say)}$$

and $\rho_n \uparrow \rho^* < \infty$. Then there exists a subsequence $\{c_{n_j}\}$ of $\{c_n\}$ such that $c_{n_j} \to c^*$ as $j \to \infty$, where $c^* = (c_1^{*\prime}, c_2^{*\prime})'$ with $\|c_2^*\| = 1$. We can show that

$$c^{*\prime} \Lambda_n c^* \to \rho^* \quad \text{as } n \to \infty,$$

which contradicts (2.1). Assume otherwise; then there exist $\varepsilon > 0$ and $n_0$ such that $c^{*\prime} \Lambda_{n_0} c^* > \rho^* + \varepsilon$. Since $\Lambda_n$ is monotone increasing in $n$ in the sense that $c' \Lambda_n c$ is so for any vector $c$, it follows that for $n \ge n_0$

$$c_n' \Lambda_{n_0} c_n \le c_n' \Lambda_n c_n \le \rho^*.$$
Hence

$$\rho^* + \varepsilon < c^{*\prime} \Lambda_{n_0} c^* = \lim_{n \to \infty} c_n' \Lambda_{n_0} c_n \le \rho^*,$$

which is a contradiction. This completes the proof.

Next we consider minimizing

$$\int_{S} \{\theta(x)\}^2 f(x)\, d\mu$$

under the unbiasedness condition

$$\int \theta(x) f_i(x)\, d\mu = \left(\int_S + \int_{S^c}\right) \theta(x) f_i(x)\, d\mu = c_i \quad (i = 1, \dots, p),$$
where $f_i(x)$ $(i = 1, \dots, p)$ are linearly independent. Assume that we have a condition similar to (A) above:

$$\int_S \frac{\{f_i(x)\}^2}{f(x)}\, d\mu < \infty \quad (i = 1, \dots, k),$$

and

$$\int_S \frac{\left\{\sum_{i=1}^{p} a_i f_i(x)\right\}^2}{f(x)}\, d\mu < \infty$$

implies $a_{k+1} = \dots = a_p = 0$, and further we assume that $f_i(x)$ $(i = 1, \dots, p)$ are linearly independent within $S$. Let $A_{11}$ be a $k \times k$ non-negative definite matrix whose elements are

$$\lambda_{il} = \int_S \frac{f_i(x) f_l(x)}{f(x)}\, d\mu \quad (i, l = 1, \dots, k).$$

For any estimator $\theta(x)$ we define a vector $d = (d_1, \dots, d_p)'$ by

$$d_i = \int_{S^c} \theta(x) f_i(x)\, d\mu \quad (i = 1, \dots, p).$$
Suppose that $d$ is given; then the unbiasedness condition reduces to

$$\int_S \theta(x) f_i(x)\, d\mu = c_i - d_i \quad (i = 1, \dots, p),$$

and under this condition we have from the above theorem that

$$\inf_{\substack{\theta:\ \text{unbiased with}\\ \text{a fixed } d}} \int_S \{\theta(x)\}^2 f(x)\, d\mu = (c_1 - d_1)' A_{11}^{-1} (c_1 - d_1),$$

where $c = (c_1', c_2')'$ and $d = (d_1', d_2')'$ with $c_1, d_1 \in R^k$ and $c_2, d_2 \in R^{p-k}$. It is obvious that the set of possible vectors $d_1$ is a linear space, which is denoted by $D_1$. Then we have

$$\inf_{\theta:\ \text{unbiased}} \int_S \{\theta(x)\}^2 f(x)\, d\mu = \min_{d_1 \in D_1} (c_1 - d_1)' A_{11}^{-1} (c_1 - d_1).$$

If the dimension of $D_1$ is $l$, we have a basis $d_1^*, \dots, d_l^*$ of $D_1$ and any $d_1 \in D_1$ can be expressed as

$$d_1 = b_1 d_1^* + \dots + b_l d_l^* = D^* b,$$

where $D^* = (d_1^*, \dots, d_l^*)$ is a $k \times l$ matrix and $b = (b_1, \dots, b_l)'$. Then it is straightforward to show that

$$\min_{d_1 \in D_1} (c_1 - d_1)' A_{11}^{-1} (c_1 - d_1) = \min_{b} (c_1 - D^*b)' A_{11}^{-1} (c_1 - D^*b) = c_1' A_{11}^{-1} c_1 - c_1' A_{11}^{-1} D^* (D^{*\prime} A_{11}^{-1} D^*)^{-1} D^{*\prime} A_{11}^{-1} c_1,$$

and the minimum is attained by $b^* = (D^{*\prime} A_{11}^{-1} D^*)^{-1} D^{*\prime} A_{11}^{-1} c_1$. If, moreover, $f_i$ $(i = 1, \dots, p)$ are not linearly independent in $S$, the vector $c - d$ is restricted to a linear space $E$ and

$$\inf_{\substack{\theta:\ \text{unbiased with } d;\\ c - d \in E}} \int_S \{\theta(x)\}^2 f(x)\, d\mu = (c_0 - d_0)' A_0^{-1} (c_0 - d_0),$$

where $A_0$ is the maximum rank principal minor of $A_{11}$ and $c_0$ and $d_0$ are the corresponding subvectors of $c$ and $d$. Then $d_0$ lies on a hyperplane $H$ of some dimensionality and

$$\inf_{\theta:\ \text{unbiased}} \int_S \{\theta(x)\}^2 f(x)\, d\mu = \min_{d_0 \in H} (c_0 - d_0)' A_0^{-1} (c_0 - d_0).$$
Note that even when the parameter space $\Theta$ is infinite, minimizing the variance under a finite number of conditions is often applied to obtain a lower bound for the variances of unbiased estimators, as in the case of the Cramér-Rao and Bhattacharyya bounds. (See Hammersley; Chapman/Robbins; Kiefer; Fraser/Guttman; Fend; Sen/Ghosh; Chatterji; Akahira/Puri/Takeuchi; and also Isii in relation to mathematical programming.)
References Akahira M , Puri ML, Takeuchi K (1986) Bhattacharyya bound of variances of unbiased estimators in non-regular cases . To appear in the Ann Inst Statist Math 38 Chapman DG, Robbins H (1951 ) Minimum variance estimation without regularity assumptions. Ann Math Statist 22:581-586 Chatterji SD (1982) A remark on the Cramer-Rao inequality. In: Kallianpur, Krishnaiah, Ghosh (eds) Statistics and probability : Essays in honor of C R Rao. North Holland , New York, pp 193-196 Fend AV (1959 ) Bounds for the variance of an estimate. Ann Math Statist 30:381-388 Fraser DAS, Guttman I (1952) Bhattacharyya bounds without regularity assumptions . Ann Math Statist 23 :629-632 Hammersley JM (1950) On estimating restricted parameters . J Roy Statist Soc (B ) 12:192-240 Isii K (1964) Inequalities of the types of Chebyshev and Cramer -Rao and mathematical programming . Ann Inst Statist Math 16:277-293 Kiefer J (1952) On minimum variance in non-regular estimation . Ann Math Statist 23:627-629 Sen PK, Ghosh BK ( 1976) Comparison of some bounds in estimation theory . Ann Statist 4: 755 -765 ; Correction. Ann Statist 5:593
Received August 11, 1983
254 Metrika, Volume 33, 1986, page 217-222
A Note on Optimum Spacing of Observations from a Continuous Time Simple Markov Process By K. Takeuchi' and M. Akahira2
Summary : Assume that X(r) is a continuous time simple Markov process with a parameter 0. The problem is to choose observation points to < rl < ... < rT which provide with the maximum possible information on 0. Suppose that the observation points are equally spaced , that is, for t = 1, ..., T, rt - rt-1 = s is constant. Then the optimum value for s is obtained.
1 Introduction In various practical situations especially in engineering applications of time series analysis, it often happens that the process itself is continuous in time, but measurements can be made only at discrete time points. Then the problem of choosing sample points arises, and its most natural formulations would be "what is the best set of sample points in continuous time points which would give the largest possible amount of information with respect to some unknown parameter given the size of the sample?" In this paper we deal with the problem in nearly the simplest possible case, that is, the underlying process is simple Markov and the measurement points are equally spaced, and calculate the optimum spacing. It is conjectured that the solution given is also optimum even when the sample points can be arbitrarily spaced, but the authors have been unable to prove it yet. The model can be also generalized in various ways, which will be discussed in subsequent papers. As far as the authors are aware, there seems to be little literature which has dealt with this problem, and we mention Taga (1966) who dealt with a similar problem in connection with the least squares prediction.
2 Optimum Spacing of Observations Suppose that X(r) is a stationary Gaussian process with a continuous time parameter r, and also suppose that E(X(r)) = 0, Var (X(-r)) = a2 and Cov (X(rl),X(r2 )) = 0 1T2 - Tl1a2, that is, X(r) is a simple Markov process with a parameter 0, where 0 < 0 < 1. We as-
1 Kei Takeuchi, Faculty of Economics , University of Tokyo, Hongo, Bunkyo-ku, Tokyo 113, Japan. 2 Masafumi Akahira, Department of Mathematics , University of Electro- Communications , Chofu, Tokyo 182, Japan. 0026 - 1335/86/03040217 - 222 $2.50 © 1986 Physica-Verlag , Vienna
255
218 K. Takeuchi and M. Akahira
sume that 0 and a2 are unknown , and it is required to estimate 0 based on T+ 1 measurements on X(T), which are denoted as X(ro ), X(Ti), ..., X(TT). The problem is to choose such observation points To < 7-1 < ... < TT that provide with the maximum possible information on 0. For simplicity 's sake we assume that for t = 1 , ..., T, Tt - Tt_ i = s is constant, that is, the observation points are equally spaced , and we want to get the optimum value for s . In order to simplify the notation we denote Xt = X(Tt) (t = 0, 1, ..., T) and also put p = 0s. Then we have the following simple Markov process Xt = pXt_1 + Ut (t = 1, 2, ...), where {Ut} is a sequence of independently, identically and normally distributed random variables with mean 0 and variance 02 ( 1 - p2), and X0 is a normal random variable with mean 0 and variance a2 and for each t, Xo is independent of Ut. The log-likelihood function of p and a2 based on the sample (Xe, Xi, ..., XT) is given by I logL(p , a2)=-
T-1 T XO+4T +(1+p2) xt -2p E xtxi_1
2 2 I 2(1p )0 t=1 t=1
T T+1 T+1 - 2log(1-P 2) - 2 loga2- 2 log21r.
The first and second order partial derivatives with respect to p and a2 are also given by T-1 T P 2 xo + XT +(I + p2) E x2 - 2p T xtxt-1 8 log L(p, a2) ap (1-p) a2 t=1 t=1 T 2 p 1 xt -
T P Z xtxt_1 + T
12)0, (1-p t1 t=1 1-p 2;
a a02 log
L(p, a2) I
T-1 JX2+xT +(1 +p2
2(1-p2) a4
T T+1 xtxt_1 E x2 -2p 2a2 t=1 t=1 T-1
a2 1 + 3p2
T
- Ti log L (P, u2) _ (I -p2) 302 xo +xT t (1 + p2) t El x2 - 2p t El xtxt_1 I 4p T-1 2 T (P -t=ixtxt t=i xt +(I-P2)2a2 + I T I 1 2 - T( (1-p2)a2 t=1 t
1 +p2).
( 1-p2)2'
(1)
256 A Note on Optimum Spacing of Observations 219
a2 8(a2 )2 log
L(P, a2 )
T-1 2 T 1 T+1 {x^x+(1+P2) i I1 At - 2p t E1 xtxt-1 2a4 (1-p2)a6
a2 apaa log
_L j L( P, a2) = Q 2
a
1 aP
log L (P, a2) +
Tp 1- 2
(2)
(3)
P j.
It follows by (1), (2) and (3) that the information amounts on p and a2 are obtained by
a2 l T(1 + p2 ) Ipp =-ELape logL(P, 0 2 ) = (1 _p2)2
1
a'
.[,,2,,2 = - E
a(a2
)2 log L(P, a2)
T+ 1 2a4
a2 2 1 TP .1p,2 E apaa2 logL(p, a ) _ (1 _p2)a2 .
Hence the inverse of the information matrix is given by
IPP
Ipa2
Y
1
T(l + p2) Tp (1 -p2)2
(1 - p2)a2
Tp T+ 1 I Ipa2 Ia2°2 )
\(1-p2)a2
2a4
where T 2p2 J(P)- 1-p2 {1+(T+1)( 1-p2)}•
We can regard J(p) as the amount of information obtainable from the sample with respect to the parameter p, and when T is large , for any best asymptotically normal estimator A (which may be the maximum likelihood estimator, the least squares estimator, etc.). T+ 1 (,3-p) is asymptotically normal with mean 0 and variance (T+ 1)J(p)-1. And for the initial parameter 0 with which we are concerned, the in-
257 220 K. Takeuchi and M. Akahira
formation is given by - I ap `z s202s - 2 ( 2T 02s I(8) - J(0s) = T + ) aB 1-Bzs T+1 1-Bzs
(4)
since p = O. Now we shall obtain such value of s which maximizes 1(0). Since (T+ 2T pz ) (logp)pz z 0 B z (log )2 I(0 T+1 1- p z ) 1 -p Putting t = ( 1 - pz)/pz we have
402(log 0 )2I(0) _ T { log (1 + )}2 + 0 (1) =It + 0( 1) (say). In order to get s maximizing 1(0) given by (4), it is enough to take the solution to of dItldt = 0, i.e., log (1 + t) = 21;/(1 +:;) which is about 3.926. It is also noted that for 1; =:;o, p _ 0.451. Hence s maximizing 1(0) is given by s =-{log (1 + to)}/(2 log 0) -0.797/log 0. Even when T is small, the amount of the information provided by the sample may have some value, and it is interesting to calculate the value of 0 which maximizes
T
z Bz(log B )zI(B)= en
{1+(T+1)(1 -e')},
(5)
where (3=2slog0. Then the optimum solution is given by the negative root of the equation 2 1 2 -- + =0 Q 1-ceQ 1-eQ which is independent of 0, where c = (T- 1)/(T + 1). Table 1 provides us with the values of Q which maximize (5) for some values of T. Note that for T = 1, i.e. the case of two observations , the optimum is given by s = 0. For comparison if T = 3 we have c = 1/2, hence 0 =-1.151, while for T - 00, i.e. c = 1, we have R = -1.594 which says that for small T, the optimum spacing is smaller than for large, but is at least 73 % of the asymptotically optimum spacing if T>3. The optimum value of s depends on the unknown parameter 0, hence we can not apply the result to the practical case but the autocorrelation of the successive measurements for the optimum spacing is independent of 0, and we may get some rough idea about the optimum spacing since in general cases we may have some prior information about the possible speed of decreasing of Cov (X(t), X(t + s)).
258 221
A Note on Optimum Spacing of Observations Table 1 T
2
3
4
5
9
-0.926
-1.151
-1.261 -1.327 -1.371 -1.402
T
9
10
20
0
-1.444
-1.459
-1.526 -1.566 -1.580 -1.594
50
6
7
8 -1.426
100
Otherwise we may resort to a two -stage procedure by first observing Tt values based on an initial guess and then observing T2 (= T - TI) values corresponding to the estimated optimum spacing based on the first sample. Now let us consider the problem of testing the hypothesis 0 = 0 o against the alternative 0 * 00 . Then asymptotically , the locally most powerful test and the locally most powerful unbiased test are equivalent to the one -sided and the two-sided tests respectively of the maximum likelihood estimator (e.g. Rao 1965). When the sample spacing is given , the hypothesis is transformed into H : p = po = 00'. Then if T is large, the tests based on the maximum likelihood estimator are given by the procedure: One-sided case; Reject H iff
T + 1 (P - po) > ua/ J(p0) (or
Two-sided case ; Reject H iff
T+1 (,a-po)
T+1 _ I^ - po I > u«/2 /
J(po),
where un denotes the a-upper quantile of the standard normal distribution. Under the contiguous alternative K : p = po + r/ T +I- the asymptotic powers of the above tests are represented as monotone increasing functions of r J(p0) , where r is a real number. Transforming back to the original parameter, the asymptotic power for the alternative 0 = Oo + t/\/7T1 is given as a function of t I dp/d0 I J = t \/i. Hence the power is maximized when I(00) is maximized. Therefore the optimum spacing is also given by s =-0.797/log 00. Remark: In the above discussion we excluded the possibility of 0 being equal to zero. It may be of some importance, however, in practice to test the hypothesis 0 = 0 against 0 > 0. Then it can be shown that the local power is larger when s is smaller and the maximum of the local power can not be attained, which can be also seen from the fact thats-0as0- 0. Acknowledgements: The authors are indebted to the referees for their helpful comments which resulted in improvements at various points of the paper . They also wish to thank Mr . H. Osawa of the University of Electro-Communications for making the numerical calculation of Table 1.
259
222
K. Takeuchi and M . Akahira: A Note on Optimum Spacing of Observations
References Rao CR (1965) Linear statistical inference and its applications. Wiley, New York Taga Y (1966) Optimum time sampling in stochastic processes (in Japanese). Proc Inst Statist Math 14:59-61
Received February 8, 1984 (Revised version July 10, 1984)
260
ON THE BOUND OF THE ASYMPTOTIC DISTRIBUTION OF ESTIMATORS WHEN THE MAXIMUM ORDER OF CONSISTENCY DEPENDS ON THE PARAMETER *
Masafumi Akahira(*) and Kei Takeuchi(**)
Abstract In this papar, the case when the order of consistency depends on the parameter is discussed, and in the simple unstable process the asymptotic means and variances of the log-likelihood ratio test statistic are obtained under the null and the alternative hypotheses. Further its asymptotic distribution is also discussed.
(*) Department of Mathematics, University of Electro-Communications, Chofu, Tokyo 182, Japan. The author is on leave and visiting Queen's University from April to July 1985. Faculty of Economics, University of Tokyo, Hongo Bunkyo-ku, Tokyo 113, Japan.
AMS Subject Classification (1980). 62F11, 62M10. Key words and phrases: Order of consistency, Asymptotically median unbiased estimator, Autoregressive process, Log-likelihood ratio, Asymptotic distribution.
This paper is retyped with the correction of typographical errors.
261
1. INTRODUCTION In the regular case it is known that the order of consistency is equal to V^n_, but in the non-regular case it is not always so, e.g.
nl/a (0 < a < 2 ),
n log n
etc., which are independent of the unknown parameter (e.g. see Akahira , 1975a; Akahira and Takeuchi, 1981; Vostrikova, 1984 ).
However , the order of consistency
may depend on it in the case of the unstable process. Here a discussion on that will be done. In the autoregressive process { Xt} with Xt = 9Xt_1+ Ut (t = 1, 2, ...), where { Ut} is a sequence of independently and identically distributed random variables and X0 = 0, it is known that the asymptotic distribution of the least squares estimator of 9 is normal for 101 < 1, Cauchy for 101 > 1 (e.g . White , 1958 ; Anderson, 1959) and some one for 101 = 1 (Rao , 1978 ). Further, in the case when 101 < 1 the asymptotic efficiency of estimators was studied by Akahira ( 1976 ) and Kabaila (1983) for the ARMA process , and their higher order asymptotic efficiency has been discussed by Akahira ( 1975b, 1979, 1982 , 1984 ) and Taniguchi ( 1983) for the ARMA process . In this paper in the first order AR process with 10 1 > 1 the asymptotic means and variances of the loglikelihood ratio test statistic are obtained under the null and the alternative hypothesis.
262
Further its asymptotic distribution is also discussed. 2. RESULTS Let (X, B) be a sample space and 0 be a parameter space, which is assumed to be an open set in an Euclidean 1-space R'. We shall denote by (X(T), 13(T)) the T-fold direct product of (X,B). For each T = 1, 2, ..., the points of X(T) will be denoted by ;VT = (x1, ... , XT). We consider a sequence of classes of probability measures {PB,T : 0 E O} (T = 1, 2, ...) each defined on (X (T), B(T)) such that for each T = 1, 2, ... and each 0 E 4 the following holds: PO,T (B(T)) = PB,T+1 (B(T) X X) for all B(T) E B(T). An estimator of 0 is defined to be a sequence {9T} of B(T)-measurable functions. For simplicity we may denote an estimator 9T instead of {9T}. For an increasing sequence {CT} (CT tending to infinity) an estimator 0T is called consistent with order {cT} (or CT-consistent for short) if for every e > 0 and every i9 E O there exists a positive number L such that
lira lim sup
L-9oo T-* oo O:18-i9! < 6
(Akahira, 1975a).
PO,T
{CTIT
-
01
_> L I = 0
263
It is known that the order of convergence of consistent estimators is equal to VTn_ is the regular case, but not always so, e.g . nl/a (0 < a < 2), n log n etc., in the non-regular case when the support of the density depends on the parameter 9. In both cases the order of consistency is independent of 9. If CT can not be decided independently of 9, we may change cT(9) instead of cT in the above definition. However, in such a definition we shall not be able to determine uniquely the value of cT (9) at Oo (Takeuchi , 1974). Then a similar phenomenon to "superefficiency" happens . Indeed , if 0T has an order cT of consistency independent of 0, then for a specified value 90 we define an estimator 9T as { d _ 1 (OT - 90 )
+"
9T
for 16T - 901 _< CT1/2 for 16T - 001 > CT
1/2
where f o r each T = 1, 2, ..., dT is a constant with dT > 1. We define 4 (0) as follows:
4(00)
= CTdT
264
and for any 0 90 CT
for
4(6) _ CT{1- 10 } for Case (i)
0
=
10 - Sol 10 -901
< cT-1 /2 >CT1
/2
90. Since IBT - 9o l < CT1 1 2 implies CTdT 1eT-
901 = CT I9T - 901, it follows that POo,T
{ 4 (oo)16
- eoI
IOT - 90 I
> L}
>Ll
+ Poo , T { I9T - 001
Case (ii) 0 90 . In the case when
001
T1 /2
OT - 00 1 >
10 - OoI
<
> CT1/2 CT1
}
/2, I9T -
implies 4 (9)IOT -9I < 4(9)I9T-9o I < 1, and CT1
/2 implies 4 (0) 16T -
91
= CT (O) IOT - 91 <
CTI9T - 91. Then we have for L > 1 PO,T
{GT(e) IOT
-91
>L}
-e1
>L
}.
-
10 - 001 > CT 1 / 2, IeT - 90 1 < CT-1 /2 implies 4 (0)IBT - e1 < 4(e)I9T - 001 < CTI9T - 91, and 001 > CT1 /2 implies I eT In the case when
- 1/2
CT(e)OT-e1 =CT
1- 10-901
IeT-e0I
265
Then we have for L > 1 PO,T {c
(9)M -91 > L }
> L}.
Hence in both cases (i) and ( ii) we obtain for L > 1
PO,T {(o)IoT -
01
> L}
L} + Peo,T {cTIOT - 0.1 > 4(2} Letting n -+ oo and L -* oo, we see that the righthand side of the above inequality tends to zero locally uniformly. Since {dT} can be a sequence tending to infinity as T -* oo and for any 9 h 00 , 4(9)/ cT -4 1 as T -p oo, it is possible to make the order of convergence arbitrarily large at 9 = 90. Hence we can not decide uniquely the value cT(9) at the point.
A CT-consistent estimator 9T is defined to be asymptotically median unbiased (AMU) if for any 19 E O there exists a positive number b such that lim sup T->oo 0: 10-191<6
lim sup
T-+oo 0:10-a91
PO,T{OT<9}-2 = 0;
IPB,T {9T >
9}
-2I
=0.
266
Let OT be an AMU estimator satisfying the following: (2.0) lim [ Po,T{cT
(OT_e )
< t} - 2
> 0 for some t > 0;
T->oo
lim __ [ PO,T {CT
(OT
T-4oo
- o)
If for any sequence {c' } of order of consistency with (2.0), lim 4/cT < oo, then
{CT}
is called maximum or-
T-+oo
der of consistency. There are most cases when the maximum order is uniquely determined, but there may not exist an estimator which satisfies (2.0) and uniformly attain the bound of PB,T
{CT(OT
-9)
in the class of AMU estimators with ( 2.0). It is noted that the condition ( 2.0) means non-constant of the concentration probability of the estimator in some interval involving the origin. We consider a simple autoregressive (AR) process {Xt} which is defined by Xt = 9Xt_1 + Ut ( t = 1, 2, ...), where { Ut} is a sequence of independently, identically and normally distributed random variables with mean 0 and variance 1 and X0 = 0.
267
In the case when 191 < 1, the bound of the asymptotic distribution of the all AMU estimators is obtained up to the n-1/2 and it is shown that a (modified) least squares estimator is (second order) asymptotically efficient (e.g., see Akahira, 1975b, 1976, 1979, 1982, 1984). In the subsequent discussion we shall treat the case when 101 > 1. Then it is known that the order {cT(B)} of consistency is given by CT
(0)
- 191T for1 01 >1; = 1, I T fot 101
(e.g. see Anderson, 1959; Rao, 1978). Letting 100 1 > 1, we deal with the problem of testing the hypothesis H : 0 = Oo + (u/cT(9o)) against the alternative K : 0 = 00. Putting 01 = e0 + (u/cT(90)) we consider the log-likelihood ratio LT given by T
(2.1) LT = (90 - 0 ) ( E XtXt1 t=2
- 00+ 2
01
E t-1 t=2
Then we shall obtain the asymptotic mean and variance of LT under H:0=0 iandK:0=90.
268
For each k = 0, 1,.. . , t we have (2.2) 2 9kVe ( Xt -k) = 9kQtk
EB (Xtxt - k) = 0E9 (Xt- l Xt -k) Since for each t = 1, 2, ...,
V0 (Xt) = e2Ve (Xt-1) + V (Ut) , it follows that o. = e2ot 1+1, (t=1,2,...),
(2.3)
where o,02 = 0. First we obtain
(2.4) T
VB (LT)
(BO - el)2
{ v9
E xtxt-1 t=1
T - (Oo
T 2
+ 61) Cove
Xtxt-1, Xt -1 t=2 t=2
+ (0o+01 )2Ve TL 4 L, t 1 t=2
By (2.2 ) and (2 . 3) we have (2.5) T
Ve xtxt - 1 t=2 T
_ E Ve (Xtxt-1) + 2 Y Cove (Xtxt-1, Xt'Xt'-1) t=2 t
_ i (tee o 1 + 62a 1) + 4 t=2 t
Qt
^t le2(t'-t).
(Say)
269
We also obtain by (2.2) and (2.3) (2.6) T
V0
T E
= 2 E of 1 + 4
Xi 1 t=2
t=2
of le2(t'-t) ;
t
(2.7) T Cove
T^
T
Xtxt -1,> Xt 1
t=2
t=2 T
= 2 Ut ort le2(t,-t
)-1 + ^t le2(t'-t)+1 + Qt le
t
t1. Since by (2.2) T
T
E0 Xtxt -1
Ee
T
T-1
= EB (XtXt-1) = e = Qt 1 = 0
t=1 t=2 T-1 T = 2 Ut E X2 2 1 t=1 t=2
t=2
t=1
it follows from ( 2.1) that T-1
T-1 2
(2.8) EB (LT ) = (60
- el)
01 6 E vt - 2 Qt 00+
t=1
t=1 T-1
=(e0 -01 )B0 (e_ 2ell ^vt
t_1
Since by (2.3) ^2= 1+e2+...+
02(t -1)
-
T-1 2 92 (T-1)
t=1
-
8
it follows that
2-1
2
Ort =
(92
-1
T - 1
02 - 1)2
- 1.
1
^
Ort
270
By (2.8) and (2.9) we have ( 2 . 10 )
Eeo (L T ) = u - 1) 22 + o(1) ;
( 2 . 11 )
EB 1 (L T ) = -
290 (00
u
200 (00 _ 1)2
+0(1) .
Since 00 - 01 = -u90 T and 00 + 01 = 200 + u00 T = 200 + o ( 1), it follows from (2.3) to (2.7) and (2.9) that
(2.12) Veo (
LT) = u200-2T
T
(Qt
at 1
- 0oot i) + o(1)
t=2
T =u2002TEO.t 1
+o(1)
t=2
z
00 (00 - 1)
a + o(1).
Similarly we have
(2.13)
u VB, (LT) _ 00 (00 _ 1)2 + o(1).
In the case when 00 < -1, we have similar asymptotic means and variances of LT under H and K to those of this case.
271
Case (II): Oo = 1. Since by (2.2) and (2.3) T T-1 T-1 (T -1)T E El (XtXt-1) _ E E1 (Xt) _ t =
t=2
t=1
t=1
2
it follows from (2.1) that z E1(LT) = 4 + o(1).
(2.14)
In a similar way as the case (I), we obtain under H 0=61=1+(u/T) (2.15) u2
E01(LT)=-2T2
1- 01(T -1)
(T -1)61 2
(1- 6-2)(1- 61)
1 -6-2
u2 61T _ 02 T - 1 2T2 (61_1)2 61-1 Since 1 _ T 61-1 2u+T^ it follows from ( 2.15) that
(2.16) l2 2 r l E91 (LT) = -2T2 (1+TI - (1+TJ } T 'T z 2 ( 2u 2
2T
+ T)
8 (e2u - 1) + 4 + 0(1).
272
We also have the asymptotic variances as (2.17) u2
Vl(LT)
T
u2
T-1
=T21: ct 1=71: t=2 t=1
u2
t= 2 +o(1);
(2.18) Ve i (L T )
u2 e2(T-1) - 1 T - 1
2
2T
=^a Zt -1 =
t=2 u2
T 2 2_ 2 2 (e1 1) 61
U ) 2(T-1 )
T2 -f T
_ 2 _ 1 _ 2e2"-1
U
4u2 2u
T2
( 2u -}-
(T
u2 2 T)
-
2u + T
( ) +0 1
= 1 (e2u _ 1) _ 2 + o(1)
In the case when Bo = - 1, we have similar asymptotic means and variances of LT under H and K to those of this case.
Hence we established the following theorem.
1)T
273
Theorem . In the AR process the asymptotic means and variances of the log-likelihood ratio LT under H : 0 = 01 and K : 0 = Oo for 1001 > 1 are given by the following:
1001=1
1001>1 Eo.(LT )
4 + 0(1)
+ 0(1)
^ aeo(e
Eo,(LT )
^( B + 0 ( 1) -2O
-8(e2u - 1 ) + 4 + 0(1)
Veo(LT )
eo(00 12 + 0(1)
2 + 0(1)
Vol (LT)
o eo(e
2
+ 0(1)
4(e2u - 1) - z + 0(1)
Table 2.1
Remark. By the Taylor expansion it is shown that for
I0o1
=1 1 2u u2 Eol(LT)=- 8(e -1)+2+0(1 )=-4 +... u2
Vey (LT) =
2 + ... ,
274
in which the leading terms are equal to those by E9o(LT) and V90(LT), respectively. Next we shall consider the asymptotic distribution of LT under H and K. Using the discussion by Basawa and Brockwell (1984, page 165), we see that LT converges in distribution to {ao (u) Y + be;(u)Z} Y (i = 0, 1) under H and K, respectively, where for each i = 0, 1, ae; (u) and b92 (u) are constants, and where Y and Z are mutually, independently and normally distributed random variables with mean 0 and variance 1. By the fact we may obtain the bound of the asymptotic distribution of the all AMU estimators, but there might not exist an AMU estimator whose asymptotic distribution uniformly attains it.
ACKNOWLEDGEMENTS The paper was written while the first author was at Queen's University in Canada as a visiting professor. The visit was supported by a grant of the Natural Sciences and Engineering Council of Canada. He is grateful to Professor Colin R. Blyth for inviting him.
275
REFERENCES Akahira, M. (1975a). Asymptotic theory for estimation of location in nonregular cases, I: Order of convergence of consistent estimators. Rep. Stat. Appl. Res., JUSE, 22, 8-26. Akahira, M. (1975b). A note on the second order asymptotic efficiency of estimators in an autoregressive process. Rep. Univ. Electro-Comm., 26, 143-149. Akahira, M. (1976). On the asymptotic efficiency of estimators in an autoregressive process. Ann. Inst. Statist. Math., 28, 35-48. Akahira, M. (1979 ). On the second order asymptotic optimality of estimators in an autoregressive process. Rep. Univ. Electro-Comm ., 29, 213-218. Akahira, M. (1982). Second order asymptotic optimality of estimators in an autoregressive process with unknown mean. Selecta Statistica Canadiana VI, 19-36. Akahira, M. (1984). Asymptotic deficiency of the estimator of a parameter of an autoregressive process with the missing observation. Rep. Stat. Appl. Res., JUSE, 31, 1-13. Akahira, M. and Takeuchi, K. (1981 ). Asymptotic Efficiency of Statistical Estimators : Concepts and Higher Order Asymptotic Efficiency. Lecture Notes in Statistics 7, Springer-Verlag, New York. Anderson , T. W. (1959). On asymptotic distributions of estimates of stochastic difference equations. Ann. Math. Statist., 30, 676-687. Basawa, I. V. and Brockwell, P. J. (1984). Asymptotic conditional inference for regular nonergodic models with an application to autoregressive processes . Ann. Statist., 12, 161-171. Kabaila, P. (1983 ). On the asymptotic efficiency of the estimators of the parameters of an ARMA process. Journal of Time Series Analysis, 4, 37-47. Rao, M . M. (1978 ). Asymptotic distribution of an estimator of the boundary parameter of an unstable process. Ann. Statist., 6, 185-190. Takeuchi , K. (1974). Tokeiteki suitei no Zenkinriron (Asymptotic Theory of Statistical Estimation ). (In Japanese ) Kyoiku-Shuppan , Tokyo. Taniguchi, M. (1983 ). On the second order asymptotic efficiency of estimators of Gaussian ARMA processes . 
Ann. Statist., 11, 157-169. Vostrikova, L. Ju. (1984). On criteria for ca-consistency of estimators. Stochastics 11, 265-290. White, J. S. (1958). The limiting distribution of the serial correlation coefficient in the explosive case. Ann. Math. Statist., 29, 1188-1197. Requ en Mai 1985
276 Masafumi Akahira 1 and Kei Takeuchi 2
ON THE DEFINITION OF ASYMPTOTIC EXPECTATION ABSTRACT In this paper a definition of asymptotic expectation is given and its fundamental properties are discussed. 1. INTRODUCTION In the asymptotic theory of estimation, the concept of asymptotic expectation is widely used (e.g., Akahira and Takeuchi , 1981 ; Ibragimov and Has'minskii, 1981 ; Lehmann , 1982), and it is usually remarked that it can be different from the asymptotic value of expectation . It is, however , not sufficiently accurately defined in the literature, especially when the asymptotic distribution does not exist. In the paper we shall give a definition of the asymptotic expectation and show its properties , e.g., its linearity and a Markov type inequality. We shall also obtain the necessary and sufficient conditions for the convergences in probability and distribution.
2. RESULTS Let $\{X_n\}$ be a sequence of non-negative random variables. For any sequence $\{A_n\}$ of positive numbers we define an $A_n$-censored sequence by
$X_n^*(A_n) = \min\{X_n, A_n\}$, $n = 1, 2, \ldots$.
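The effect of censoring can be illustrated numerically. The sequence below is hypothetical (not from the paper): it carries a rare spike that inflates the ordinary mean, while a slowly growing censoring level, chosen so that $\Pr\{X_n > A_n\} \to 0$, suppresses it. A minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

def censored_mean(samples, level):
    # empirical E(X_n*(A_n)) = E(min(X_n, A_n))
    return float(np.minimum(samples, level).mean())

def sample_X(n, size=200_000):
    # hypothetical X_n: |Z| plus a spike of height n occurring with
    # probability 1/n; the spike inflates E(X_n) by about 1 while its
    # probability vanishes as n grows
    z = np.abs(rng.standard_normal(size))
    spike = (rng.random(size) < 1.0 / n) * float(n)
    return z + spike

n = 1000
x = sample_X(n)
A_n = np.sqrt(n)                  # censoring level with Pr{X_n > A_n} -> 0
raw_mean = float(x.mean())        # stays near E|Z| + 1
cen_mean = censored_mean(x, A_n)  # approaches E|Z| = sqrt(2/pi) ~ 0.80
```

Here the censored mean approaches the "stable" part of the expectation, which is exactly the quantity the definitions below capture.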
For $\{X_n\}$ we denote by $\mathcal{A}(X_n)$ the set of all sequences $\{A_n\}$ of positive numbers satisfying $\Pr\{X_n > A_n\} \to 0$ as $n \to \infty$.
¹ Department of Mathematics, University of Electro-Communications, Chofu, Tokyo 182, Japan.
² Faculty of Economics, University of Tokyo, Hongo, Bunkyo-ku, Tokyo 113, Japan.
I. B. MacNeill and G. J. Umphrey (eds.), Foundations of Statistical Inference, 199-208. © 1987 by D. Reidel Publishing Company.
We define the upper and the lower asymptotic expectations of $\{X_n\}$ as
$\overline{\mathrm{As.E}}(X_n) = \inf_{\{A_n\} \in \mathcal{A}(X_n)} \limsup_{n\to\infty} E(X_n^*(A_n))$
and
$\underline{\mathrm{As.E}}(X_n) = \inf_{\{A_n\} \in \mathcal{A}(X_n)} \liminf_{n\to\infty} E(X_n^*(A_n))$,
respectively. Note that they can be infinite.

Definition 2.1. If $\overline{\mathrm{As.E}}(X_n) = \underline{\mathrm{As.E}}(X_n) < \infty$, then we call the common value the asymptotic expectation of $\{X_n\}$ and denote it by $\mathrm{As.E}(X_n)$.

The following theorem establishes that the above definition reduces to the usual concept when $X_n$ has an asymptotic distribution.

Theorem 2.1. If $X_n$ has a proper asymptotic distribution $F$, that is,
$\Pr\{X_n \le x\} \to F(x)$ as $n \to \infty$
for every continuity point $x$ of $F$, then
$\mathrm{As.E}(X_n) = \int_0^\infty x\,dF(x)$,
provided that the right-hand side is finite.

Proof. Denote by $F_n(x)$ the distribution function of $X_n$ and put $\mu = \int_0^\infty x\,dF(x)$. Since for any fixed positive number $A$
$|E(X_n^*(A)) - E(X^*(A))| \le \left|\int_0^A x\,dF_n(x) - \int_0^A x\,dF(x)\right| + A\,|F_n(A) - F(A)|$,
we have
$|E(X_n^*(A)) - E(X^*(A))| \to 0$ as $n \to \infty$.  (2.1)
Since $\mu$ is finite, it follows that
$|E(X^*(A)) - \mu| \to 0$ as $A \to \infty$.  (2.2)
Then it follows from (2.1) and (2.2) that for any sequence $\{\varepsilon_m\}$ of positive numbers there exists a sequence $\{A_m\}$ of positive numbers such that
$|E(X^*(A_m)) - \mu| < \varepsilon_m$  $(m = 1, 2, \ldots)$.  (2.3)
For the sequence $\{A_m\}$ there is a monotone increasing sequence $\{N(m)\}$ of positive integers such that for $n \ge N(m)$
$|E(X_n^*(A_m)) - E(X^*(A_m))| < \varepsilon_m$.  (2.4)
For $n$ satisfying $N(m) \le n < N(m+1)$ we define $m = N^{-1}(n)$. Denoting $A_{N^{-1}(n)}$ and $\varepsilon_{N^{-1}(n)}$ by $A_n$ and $\varepsilon_n$, respectively, we have by (2.3) and (2.4)
$|E(X_n^*(A_n)) - \mu| < 2\varepsilon_n$ for sufficiently large $n$.
Letting $\varepsilon_n \to 0$ as $n \to \infty$, we obtain $|E(X_n^*(A_n)) - \mu| \to 0$ as $n \to \infty$. This completes the proof.

In order to obtain some properties of the asymptotic expectation we first prove some lemmas.
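Before the lemmas, Theorem 2.1 can be checked numerically. The sketch below uses an assumed sequence $X_n = \mathrm{Exp}(1) + 1/n$, which converges in distribution to $F = \mathrm{Exp}(1)$ with $\int_0^\infty x\,dF(x) = 1$; the censored expectations approach that integral:

```python
import numpy as np

rng = np.random.default_rng(1)

def censored_expectation(n, level, size=400_000):
    # empirical E(X_n*(A_n)) for the assumed sequence X_n = Exp(1) + 1/n
    x = rng.exponential(1.0, size) + 1.0 / n
    return float(np.minimum(x, level).mean())

# A_n = log(n) + 5 tends to infinity while Pr{X_n > A_n} -> 0
vals = [censored_expectation(n, level=np.log(n) + 5) for n in (100, 1000, 10000)]
# each value should be close to the limiting mean 1
```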
Lemma 2.1. If $\overline{\mathrm{As.E}}(X_n) < \infty$, then every sequence $\{A_n\}$ of positive numbers tending to infinity as $n \to \infty$ belongs to $\mathcal{A}(X_n)$.

Proof. Assume that there exists a sequence $\{A_n'\}$ of positive numbers such that $\lim_{n\to\infty} A_n' = \infty$ and $\{A_n'\} \notin \mathcal{A}(X_n)$. Then for some $\varepsilon > 0$ there exists a subsequence $\{A_{n_j}'\}$ of $\{A_n'\}$ such that $\Pr\{X_{n_j} > A_{n_j}'\} \ge \varepsilon$. Since for any $\{A_n\} \in \mathcal{A}(X_n)$, $\Pr\{X_n > A_n\} \to 0$ as $n \to \infty$, it follows that for sufficiently large $j$, $\Pr\{X_{n_j} > A_{n_j}\} < \varepsilon$, hence $A_{n_j} > A_{n_j}'$. Then for sufficiently large $j$
$E(X_{n_j}^*(A_{n_j})) \ge E(X_{n_j}^*(A_{n_j}')) \ge A_{n_j}' \Pr\{X_{n_j} > A_{n_j}'\} \ge \varepsilon A_{n_j}'$,
and the last term tends to infinity as $j \to \infty$. Hence $\limsup_{n\to\infty} E(X_n^*(A_n)) = \infty$, which contradicts the condition of the lemma. This completes the proof.

We denote by $\mathcal{A}_\infty(X_n)$ the subset of $\mathcal{A}(X_n)$ whose elements $\{A_n\}$ satisfy $\lim_{n\to\infty} A_n = \infty$.
Lemma 2.2. The following hold:
$\overline{\mathrm{As.E}}(X_n) = \inf_{\{A_n\}\in\mathcal{A}_\infty(X_n)} \limsup_{n\to\infty} E(X_n^*(A_n))$;
$\underline{\mathrm{As.E}}(X_n) = \inf_{\{A_n\}\in\mathcal{A}_\infty(X_n)} \liminf_{n\to\infty} E(X_n^*(A_n))$.

Proof. If the left-hand side of either equality is infinite, the right-hand side is also infinite, hence we may assume that the right-hand side is finite. For any $\{A_n\} \in \mathcal{A}(X_n)$ put $p_n = \Pr\{X_n > A_n\}$. Since $p_n \to 0$ as $n \to \infty$, putting $A_n' = \max\{A_n, p_n^{-1/2}\}$ we have $A_n' \to \infty$ as $n \to \infty$, hence $\{A_n'\} \in \mathcal{A}_\infty(X_n)$. Since $A_n' \ge A_n$, we obtain
$E(X_n^*(A_n')) \ge E(X_n^*(A_n))$.  (2.5)
Since
$E(X_n^*(A_n')) - E(X_n^*(A_n)) \le (A_n' - A_n)\Pr\{X_n > A_n\} \le p_n^{-1/2} p_n = p_n^{1/2}$,
it follows from (2.5) that
$\lim_{n\to\infty} \{E(X_n^*(A_n')) - E(X_n^*(A_n))\} = 0$.
Hence for any $\{A_n\} \in \mathcal{A}(X_n)$ there exists a sequence $\{A_n'\} \in \mathcal{A}_\infty(X_n)$ such that
$\lim_{n\to\infty} \{E(X_n^*(A_n')) - E(X_n^*(A_n))\} = 0$.
This fact leads to the conclusion of the lemma.

Lemma 2.3. If $\overline{\mathrm{As.E}}(X_n) < \infty$, then there exists a sequence $\{A_n^*\} \in \mathcal{A}_\infty(X_n)$ such that
$\limsup_{n\to\infty} E(X_n^*(A_n^*)) = \overline{\mathrm{As.E}}(X_n)$;
$\liminf_{n\to\infty} E(X_n^*(A_n^*)) = \underline{\mathrm{As.E}}(X_n)$.
Proof. For a sequence $\{\varepsilon_m\}$ of positive numbers such that $\varepsilon_m \downarrow 0$, there exists a double array $\{A_{n,m}\}$ of positive numbers such that
$\limsup_{n\to\infty} E(X_n^*(A_{n,m})) \le \overline{\mathrm{As.E}}(X_n) + \varepsilon_m$.
Without loss of generality we assume that, for each $n$, $A_{n,m}$ is monotone decreasing in $m$. We put $p_{n,m} = \Pr\{X_n > A_{n,m}\}$. Then for each $n$, $p_{n,m}$ is monotone increasing in $m$, and for each $m$, $\lim_{n\to\infty} p_{n,m} = 0$. For any $m$ there exists $n(m)$ such that $p_{n,m} < \varepsilon_m$ for $n \ge n(m)$. For any $n$ let $m^{-1}(n)$ be the maximum value of $m$ satisfying $n(m) \le n$. Since $m^{-1}(n) \to \infty$ as $n \to \infty$, putting $A_n^{**} = A_{n, m^{-1}(n)}$ we have
$\Pr\{X_n > A_n^{**}\} \le \varepsilon_{m^{-1}(n)}$,
of which the right-hand side tends to $0$ as $n \to \infty$. For any fixed $n_0$ we obtain
$\limsup_{n\to\infty} E(X_n^*(A_n^{**})) \le \limsup_{n\to\infty} E(X_n^*(A_{n, m^{-1}(n_0)})) \le \overline{\mathrm{As.E}}(X_n) + \varepsilon_{m^{-1}(n_0)}$.
Since $\varepsilon_{m^{-1}(n_0)} \to 0$ as $n_0 \to \infty$, we have
$\limsup_{n\to\infty} E(X_n^*(A_n^{**})) \le \overline{\mathrm{As.E}}(X_n)$.  (2.6)
On the other hand, it follows from the definition of $\overline{\mathrm{As.E}}(X_n)$ that the reverse inequality of (2.6) holds, hence equality holds in (2.6). In a similar way it is shown that there exists a sequence $\{A_n^{***}\} \in \mathcal{A}_\infty(X_n)$ such that
$\liminf_{n\to\infty} E(X_n^*(A_n^{***})) = \underline{\mathrm{As.E}}(X_n)$.
We put $A_n^* = \min\{A_n^{**}, A_n^{***}\}$. Then we have $\Pr\{X_n > A_n^*\} \to 0$ as $n \to \infty$ and
$\limsup_{n\to\infty} E(X_n^*(A_n^*)) \le \limsup_{n\to\infty} E(X_n^*(A_n^{**})) = \overline{\mathrm{As.E}}(X_n)$;  (2.7)
$\liminf_{n\to\infty} E(X_n^*(A_n^*)) \le \liminf_{n\to\infty} E(X_n^*(A_n^{***})) = \underline{\mathrm{As.E}}(X_n)$.  (2.8)
Returning to the definitions of $\overline{\mathrm{As.E}}(X_n)$ and $\underline{\mathrm{As.E}}(X_n)$ we see that the reverse inequalities of (2.7) and (2.8) hold. This completes the proof.

Lemma 2.4. Assume that $\overline{\mathrm{As.E}}(X_n) < \infty$. Let $\{A_n^*\}$ be a sequence satisfying the condition of Lemma 2.3, and let $\{B_n^*\}$ be any sequence such that $B_n^* \to \infty$ as $n \to \infty$ and $B_n^* \le A_n^*$ for all $n \ge n_0$ with some $n_0$. Then $\{B_n^*\}$ also satisfies the condition of Lemma 2.3.

Proof. By Lemmas 2.1 and 2.2 we have $\{B_n^*\} \in \mathcal{A}(X_n)$ and
$\limsup_{n\to\infty} E(X_n^*(B_n^*)) \le \limsup_{n\to\infty} E(X_n^*(A_n^*))$;  (2.9)
$\liminf_{n\to\infty} E(X_n^*(B_n^*)) \le \liminf_{n\to\infty} E(X_n^*(A_n^*))$.  (2.10)
From the definitions of $\overline{\mathrm{As.E}}(X_n)$ and $\underline{\mathrm{As.E}}(X_n)$ we see that the reverse inequalities of (2.9) and (2.10) hold. This completes the proof.

Lemma 2.5. Assume that $\overline{\mathrm{As.E}}(X_n) < \infty$ and $\overline{\mathrm{As.E}}(Y_n) < \infty$. Then
(i) $\overline{\mathrm{As.E}}(X_n) + \overline{\mathrm{As.E}}(Y_n) \ge \overline{\mathrm{As.E}}(X_n + Y_n)$;
(ii) $\underline{\mathrm{As.E}}(X_n) + \underline{\mathrm{As.E}}(Y_n) \le \underline{\mathrm{As.E}}(X_n + Y_n)$;
(iii) $\underline{\mathrm{As.E}}(X_n + Y_n) \le \underline{\mathrm{As.E}}(X_n) + \overline{\mathrm{As.E}}(Y_n) \le \overline{\mathrm{As.E}}(X_n + Y_n)$.
Further, if $X_n \ge Y_n$, then
$\overline{\mathrm{As.E}}(X_n - Y_n) \le \overline{\mathrm{As.E}}(X_n) - \underline{\mathrm{As.E}}(Y_n)$;
$\underline{\mathrm{As.E}}(X_n - Y_n) \ge \underline{\mathrm{As.E}}(X_n) - \overline{\mathrm{As.E}}(Y_n)$.
Proof. (i) By Lemma 2.3 we take $\{A_n\} \in \mathcal{A}_\infty(X_n)$ and $\{B_n\} \in \mathcal{A}_\infty(Y_n)$ such that
$\overline{\mathrm{As.E}}(X_n) = \limsup_{n\to\infty} E(X_n^*(A_n))$; $\overline{\mathrm{As.E}}(Y_n) = \limsup_{n\to\infty} E(Y_n^*(B_n))$.
Let $c_n = \min\{A_n, B_n\}$. Since
$E((X_n + Y_n)^*(c_n)) \le E(X_n^*(A_n)) + E(Y_n^*(B_n))$,
it follows from Lemmas 2.3 and 2.4 that
$\limsup_{n\to\infty} E((X_n + Y_n)^*(c_n)) \le \overline{\mathrm{As.E}}(X_n) + \overline{\mathrm{As.E}}(Y_n)$.
By Lemma 2.2 we obtain
$\overline{\mathrm{As.E}}(X_n + Y_n) \le \overline{\mathrm{As.E}}(X_n) + \overline{\mathrm{As.E}}(Y_n)$.
(ii) By Lemma 2.3 we take a sequence $\{c_n\} \in \mathcal{A}_\infty(X_n + Y_n)$ such that
$\underline{\mathrm{As.E}}(X_n + Y_n) = \liminf_{n\to\infty} E((X_n + Y_n)^*(c_n))$.
Since
$E(X_n^*(c_n/2)) + E(Y_n^*(c_n/2)) \le E((X_n + Y_n)^*(c_n))$,
it follows from Lemmas 2.3 and 2.4 that
$\liminf_{n\to\infty} E(X_n^*(c_n/2)) + \liminf_{n\to\infty} E(Y_n^*(c_n/2)) \le \underline{\mathrm{As.E}}(X_n + Y_n)$.
By Lemma 2.2 we have
$\underline{\mathrm{As.E}}(X_n) + \underline{\mathrm{As.E}}(Y_n) \le \underline{\mathrm{As.E}}(X_n + Y_n)$.
(iii) By Lemma 2.3 we take $\{c_n\} \in \mathcal{A}_\infty(X_n + Y_n)$ such that
$\overline{\mathrm{As.E}}(X_n + Y_n) = \limsup_{n\to\infty} E((X_n + Y_n)^*(c_n))$.
Since
$E((X_n + Y_n)^*(c_n)) \ge E(X_n^*(c_n/2)) + E(Y_n^*(c_n/2))$,
it follows that
$\limsup_{n\to\infty} E((X_n + Y_n)^*(c_n)) \ge \liminf_{n\to\infty} E(X_n^*(c_n/2)) + \limsup_{n\to\infty} E(Y_n^*(c_n/2))$.
By Lemmas 2.2, 2.3 and 2.4 we have
$\overline{\mathrm{As.E}}(X_n + Y_n) \ge \underline{\mathrm{As.E}}(X_n) + \overline{\mathrm{As.E}}(Y_n)$.
Similarly we obtain
$\underline{\mathrm{As.E}}(X_n + Y_n) \le \underline{\mathrm{As.E}}(X_n) + \overline{\mathrm{As.E}}(Y_n)$.
The proof of the remaining assertions follows directly from (iii). This completes the proof.

From the above lemmas we have the following theorems.

Theorem 2.2. Assume that $\mathrm{As.E}(X_n)$ and $\mathrm{As.E}(Y_n)$ exist. Then
$\mathrm{As.E}(X_n + Y_n) = \mathrm{As.E}(X_n) + \mathrm{As.E}(Y_n)$.
Further, if $X_n \ge Y_n$, then
$\mathrm{As.E}(X_n - Y_n) = \mathrm{As.E}(X_n) - \mathrm{As.E}(Y_n)$.
The proof of the theorem is derived directly from Lemma 2.5.

Theorem 2.3 (Markov-type inequality). If $\overline{\mathrm{As.E}}(X_n) < \infty$, then for any $c > 0$
$\limsup_{n\to\infty} \Pr\{X_n > c\} \le \overline{\mathrm{As.E}}(X_n)/c$;
$\liminf_{n\to\infty} \Pr\{X_n > c\} \le \underline{\mathrm{As.E}}(X_n)/c$.
Proof. By Lemma 2.3 there exists a sequence $\{A_n\}$ such that
$\overline{\mathrm{As.E}}(X_n) = \limsup_{n\to\infty} E(X_n^*(A_n))$
and $A_n \to \infty$ as $n \to \infty$. Then we can take $n_0$ such that for $n \ge n_0$, $A_n > c$ and
$\Pr\{X_n > c\} = \Pr\{X_n^*(A_n) > c\} \le E(X_n^*(A_n))/c$.
Hence we have
$\limsup_{n\to\infty} \Pr\{X_n > c\} \le \overline{\mathrm{As.E}}(X_n)/c$.
The other inequality is obtained similarly. This completes the proof.

Theorem 2.4. $X_n$ converges in probability to zero if and only if $\overline{\mathrm{As.E}}(X_n) = 0$.

Proof. The proof of necessity is clear. If $X_n$ does not converge in probability to zero, then there exists a positive number $c$ such that $\limsup_{n\to\infty} \Pr\{X_n > c\} > 0$, and it follows from Theorem 2.3 that this contradicts $\overline{\mathrm{As.E}}(X_n) = 0$. Thus the proof is completed.
Theorem 2.5. Assume that the random variables $X_n$ and $X$ have the distribution functions $F_n$ and $F$, respectively, and that the moment generating function (m.g.f.) of $X$,
$g(\theta) = \int e^{\theta x}\,dF(x)$,
exists for all $\theta$ in some open interval $I$ containing the origin. Then $X_n$ converges in distribution to $X$ as $n \to \infty$, i.e., $X_n \xrightarrow{D} X$, if and only if $\mathrm{As.E}(e^{\theta X_n}) = g(\theta)$ for all $\theta \in I$.

Proof. If $X_n \xrightarrow{D} X$, then $e^{\theta X_n} \xrightarrow{D} e^{\theta X}$, hence from Theorem 2.1
$\mathrm{As.E}(e^{\theta X_n}) = g(\theta)$ for all $\theta \in I$.
Conversely, if $\mathrm{As.E}(e^{\theta X_n}) = g(\theta)$ for all $\theta \in I$, then it follows from Theorem 2.3 that the sequence $\{F_n\}$ is tight; hence for any $\varepsilon > 0$ there exist $n_0 > 0$ and $K > 0$ such that $\Pr\{|X_n| > K\} < \varepsilon$ for $n \ge n_0$. Then there exists a subsequence $\{n_i\}$ of $\{n\}$ such that $F_{n_i}$ converges in distribution to some distribution $F_0$ as $i \to \infty$ and
$\mathrm{As.E}(e^{\theta X_{n_i}}) = \lim_{i\to\infty} \int e^{\theta x}\,dF_{n_i}(x) = g(\theta)$ for all $\theta \in I$.
If a subsequence $\{F_{m_i}\}$ of $\{F_n\}$ does not converge to $F_0$, then there is some positive number $\delta$ such that the set $A_\delta = \{n : \Lambda(F_n, F_0) > \delta\}$ is infinite, where $\Lambda(F, G)$ denotes the Lévy distance between two one-dimensional distributions. We then have a subsequence $\{F_{m_i'}\}$ of $\{F_{m_i}\}$ such that $m_i' \in A_\delta$ $(i = 1, 2, \ldots)$ and $F_{m_i'}$ converges in distribution to some distribution $F_0'$ as $i \to \infty$. By the first part of the proof we have $F_0' = F_0$, which contradicts the assumption. This completes the proof.

Further, if $Y_n$ converges in probability to zero as $n \to \infty$ and a function $f(x, y)$ is continuous in $y$, then it may be shown that $\mathrm{As.E}(f(X_n, Y_n)) = \mathrm{As.E}(f(X_n, 0))$.

For general real random variables $X_n$ we have $X_n = X_n^+ - X_n^-$, where $X_n^+ = \max(X_n, 0)$ and $X_n^- = \max(-X_n, 0)$. Then we define the upper and the lower asymptotic expectations of $\{X_n\}$ as
$\overline{\mathrm{As.E}}(X_n) = \overline{\mathrm{As.E}}(X_n^+) - \underline{\mathrm{As.E}}(X_n^-)$;
$\underline{\mathrm{As.E}}(X_n) = \underline{\mathrm{As.E}}(X_n^+) - \overline{\mathrm{As.E}}(X_n^-)$,
respectively, and if $-\infty < \underline{\mathrm{As.E}}(X_n) = \overline{\mathrm{As.E}}(X_n) < \infty$ we call the common value the asymptotic expectation of $\{X_n\}$ and denote it by $\mathrm{As.E}(X_n)$.
In a way similar to the case of non-negative random variables it may be shown that
$\overline{\mathrm{As.E}}(-X_n) = -\underline{\mathrm{As.E}}(X_n)$; $\underline{\mathrm{As.E}}(-X_n) = -\overline{\mathrm{As.E}}(X_n)$;
$\overline{\mathrm{As.E}}(X_n + Y_n) \le \overline{\mathrm{As.E}}(X_n) + \overline{\mathrm{As.E}}(Y_n)$;
$\underline{\mathrm{As.E}}(X_n + Y_n) \ge \underline{\mathrm{As.E}}(X_n) + \underline{\mathrm{As.E}}(Y_n)$,
provided that $\overline{\mathrm{As.E}}(|X_n|) < \infty$ and $\overline{\mathrm{As.E}}(|Y_n|) < \infty$. Hence
$\mathrm{As.E}(X_n + Y_n) = \mathrm{As.E}(X_n) + \mathrm{As.E}(Y_n)$,
provided that $\mathrm{As.E}(X_n)$ and $\mathrm{As.E}(Y_n)$ exist.
ACKNOWLEDGMENTS The paper was written while the first author was at Queen's University in Canada as a visiting Professor. The visit was supported by a grant of the Natural Sciences and Engineering Research Council of Canada. He is grateful to Professor Colin R. Blyth for inviting him.
REFERENCES
Akahira, M., and K. Takeuchi (1981). Asymptotic Efficiency of Statistical Estimators: Concepts and Higher Order Asymptotic Efficiency. Lecture Notes in Statistics 7. New York: Springer.
Ibragimov, I. A., and R. Z. Has'minskii (1981). Statistical Estimation: Asymptotic Theory. New York: Springer.
Lehmann, E. L. (1982). Theory of Point Estimation. New York: Wiley and Sons.
Metrika, Volume 34, 1987, pages 1-15
Locally Minimum Variance Unbiased Estimator in a Discontinuous Density Function

By M. Akahira¹ and K. Takeuchi²
Abstract: The exact forms of the locally minimum variance unbiased estimators and their variances are given in the case of a discontinuous density function.

Key words and phrases: Locally minimum variance unbiased estimator, non-regular cases, discontinuous density function.
1 Introduction

In the theory of minimum variance unbiased estimation in regular cases, it is usually assumed that the density function is continuous in the unknown parameter θ to be estimated. In non-regular cases minimum variance unbiased estimation was studied by Chapman and Robbins (1951), Kiefer (1952), Polfeldt (1970), Vincze (1979), Takeuchi and Akahira (1986), Akahira, Puri and Takeuchi (1986) and others. Though the Chapman-Robbins-Kiefer type discussion does not require continuity of the density function in θ, the resulting bounds are not usually sharp, and the precise form of the locally minimum variance unbiased (LMVU) estimator is not given. In this paper we give an example where the density function is discontinuous in θ, obtain the exact forms of the LMVU estimators of the parameter and their variances, and illustrate some aspects of the LMVU estimators in such a situation.
2 Results

Suppose that the density function $f(x, \theta)$ (with respect to a $\sigma$-finite measure $\mu$) is not continuous in $\theta$, while the support $A(\theta) = \{x \mid f(x, \theta) > 0\}$ can be defined to be independent of $\theta$. Note that this condition does not affect the existence of an LMVU
¹ M. Akahira, Statistical Laboratory, Department of Mathematics, University of Electro-Communications, Chofu, Tokyo 182, Japan.
² K. Takeuchi, Faculty of Economics, University of Tokyo, Hongo, Bunkyo-ku, Tokyo 113, Japan.
0026-1335/87/1/1-15 $2.50 © 1987 Physica-Verlag, Heidelberg
estimator, because it can be shown that an LMVU estimator at $\theta = \theta_0$ exists if
$\int_{A(\theta_0)} \{f(x, \theta)\}^2 / f(x, \theta_0)\,d\mu(x) < \infty$
(e.g. see Barankin 1949 and Stein 1950). But the LMVU estimator sometimes behaves very strangely.

Let $X_1, \ldots, X_n$ be independently and identically distributed random variables according to the following density function w.r.t. the Lebesgue measure:
$f(x, \theta) = p$ for $0 < x \le \theta$ and $\theta + 1 < x \le 2$; $f(x, \theta) = q$ for $\theta < x \le \theta + 1$; $f(x, \theta) = 0$ otherwise,  (2.1)
where the parameter has the range $0 < \theta < 1$, and $p$ and $q$ are constants with $0 < p < q$ and $p + q = 1$.

Fig. 2.1. The density function $f(x, \theta)$ given by (2.1).
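A sketch of the density (2.1) in code (the values $p = 0.2$, $\theta = 0.3$ are illustrative assumptions, not from the paper): it checks that $f$ integrates to one and that the constant $c = 1/(pq) - 4$ appearing below is positive whenever $p \ne q$:

```python
import numpy as np

def f(x, theta, p):
    # two-level density (2.1): p on (0, theta] and (theta+1, 2], q on (theta, theta+1]
    q = 1.0 - p
    x = np.asarray(x, dtype=float)
    inside = (0 < x) & (x <= 2)
    middle = (theta < x) & (x <= theta + 1)
    return np.where(inside, np.where(middle, q, p), 0.0)

p, theta = 0.2, 0.3
edges = np.linspace(0.0, 2.0, 200_001)
mids = (edges[:-1] + edges[1:]) / 2
total = float(np.sum(f(mids, theta, p) * np.diff(edges)))  # midpoint rule, ~ 1
c = 1.0 / (p * (1 - p)) - 4.0                              # > 0 since p != 1/2
```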
Then the LMVU estimator $\hat\theta_n^* = \hat\theta_n^*(x_1, \ldots, x_n)$ at $\theta = \theta_0$ has the form
$\hat\theta_n^*(x_1, \ldots, x_n) = \int_0^1 \prod_{i=1}^n \frac{f(x_i, \eta)}{f(x_i, \theta_0)}\,dG_n(\eta)$,  (2.2)
where $G_n(\eta)$ is some signed measure over the closed interval $[0, 1]$ (e.g. see Stein 1950). First we shall construct the LMVU estimator at $\theta = \theta_0$ given by the form (2.2) in the case when $n = 1$, i.e., $x_1 = x$. We put
$K_{\theta_0}(\theta, \eta) = \int_0^2 \pi_{\theta_0,\eta}(x)\,f(x, \theta)\,dx$ with $\pi_{\theta_0,\eta}(x) = \frac{f(x, \eta)}{f(x, \theta_0)}$.  (2.3)
In order to obtain the LMVU estimator $\hat\theta_1^*(x)$ at $\theta = \theta_0$ it is enough to take some signed measure $G$ over $[0, 1]$ such that
$E_\theta[\hat\theta_1^*(X)] = \int_0^2 \left\{\int_0^1 \frac{f(x, \eta)}{f(x, \theta_0)}\,dG(\eta)\right\} f(x, \theta)\,dx = \int_0^1 K_{\theta_0}(\theta, \eta)\,dG(\eta) = \theta$ for all $\theta \in [0, 1]$.
For $\eta > \theta_0$ we have
$\pi_{\theta_0,\eta}(x) = p/q$ for $\theta_0 < x \le \eta$; $q/p$ for $\theta_0 + 1 < x \le \eta + 1$; $1$ otherwise.  (2.4)
For $\eta < \theta_0$ we also obtain
$\pi_{\theta_0,\eta}(x) = q/p$ for $\eta < x \le \theta_0$; $p/q$ for $\eta + 1 < x \le \theta_0 + 1$; $1$ otherwise.  (2.5)
Note that for $\eta = \theta_0$, $\pi_{\theta_0,\eta}(x) = 1$. In order to calculate $K_{\theta_0}(\theta, \eta)$ we consider the two cases $\theta > \theta_0$ and $\theta < \theta_0$. Note that for $\theta = \theta_0$, $K_{\theta_0}(\theta_0, \eta) = 1$.
Case 1: $\theta > \theta_0$. If $\eta < \theta_0$, then we have by (2.5)
$K_{\theta_0}(\theta, \eta) = \int_0^\eta p\,dx + \int_\eta^{\theta_0} q\,dx + \int_{\theta_0}^{\theta} p\,dx + \int_{\theta}^{\eta+1} q\,dx + \int_{\eta+1}^{\theta_0+1} p\,dx + \int_{\theta_0+1}^{\theta+1} q\,dx + \int_{\theta+1}^{2} p\,dx = 1$.
If $\theta_0 \le \eta \le \theta$, then we obtain by (2.4)
$K_{\theta_0}(\theta, \eta) = \int_0^{\theta_0} p\,dx + \int_{\theta_0}^{\eta} \frac{p^2}{q}\,dx + \int_{\eta}^{\theta} p\,dx + \int_{\theta}^{\theta_0+1} q\,dx + \int_{\theta_0+1}^{\eta+1} \frac{q^2}{p}\,dx + \int_{\eta+1}^{\theta+1} q\,dx + \int_{\theta+1}^{2} p\,dx = 1 + c(\eta - \theta_0)$,
where
$c = \frac{1}{pq} - 4$  $(> 0)$.
If $\theta \le \eta$, then we have by (2.4)
$K_{\theta_0}(\theta, \eta) = \int_0^{\theta_0} p\,dx + \int_{\theta_0}^{\theta} \frac{p^2}{q}\,dx + \int_{\theta}^{\eta} p\,dx + \int_{\eta}^{\theta_0+1} q\,dx + \int_{\theta_0+1}^{\theta+1} \frac{q^2}{p}\,dx + \int_{\theta+1}^{\eta+1} q\,dx + \int_{\eta+1}^{2} p\,dx = 1 + c(\theta - \theta_0)$.
Hence we obtain for $\theta > \theta_0$
$K_{\theta_0}(\theta, \eta) = 1$ for $\eta \le \theta_0$; $1 + c(\eta - \theta_0)$ for $\theta_0 \le \eta \le \theta$; $1 + c(\theta - \theta_0)$ for $\theta \le \eta$,  (2.6)
where $c = \frac{1}{pq} - 4$.
Case 2: $\theta < \theta_0$. In a similar way to Case 1 we have by (2.4) and (2.5)
$K_{\theta_0}(\theta, \eta) = 1$ for $\theta_0 \le \eta$; $1 + c(\theta_0 - \eta)$ for $\theta \le \eta \le \theta_0$; $1 + c(\theta_0 - \theta)$ for $\eta \le \theta$.  (2.7)
We take a signed measure $G^*$ over $[0, 1]$ satisfying $G^*(\{0\}) = -1/c$, $G^*(\{1\}) = 1/c$, $G^*(\{\theta_0\}) = \theta_0$ and $G^*((0, \theta_0) \cup (\theta_0, 1)) = 0$.
Since by (2.6) and (2.7)
$K_{\theta_0}(\theta, 0) = 1$ for $\theta > \theta_0$; $1 + c(\theta_0 - \theta)$ for $\theta < \theta_0$,
and
$K_{\theta_0}(\theta, 1) = 1 + c(\theta - \theta_0)$ for $\theta > \theta_0$; $1$ for $\theta < \theta_0$,
we obtain
$E_\theta[\hat\theta_1^*(X)] = \int_0^2 \left\{\int_0^1 \pi_{\theta_0,\eta}(x)\,dG^*(\eta)\right\} f(x, \theta)\,dx = \int_0^1 K_{\theta_0}(\theta, \eta)\,dG^*(\eta) = \frac{1}{c}\{K_{\theta_0}(\theta, 1) - K_{\theta_0}(\theta, 0)\} + K_{\theta_0}(\theta, \theta_0)\,\theta_0 = \theta$
for all $\theta \in [0, 1]$, which shows the unbiasedness. Hence the LMVU estimator $\hat\theta_1^*(x)$ at $\theta = \theta_0$ is given by
$\hat\theta_1^*(x) = \int_0^1 \pi_{\theta_0,\eta}(x)\,dG^*(\eta) = \frac{1}{c}\{\pi_{\theta_0,1}(x) - \pi_{\theta_0,0}(x)\} + \theta_0$,
where
$\pi_{\theta_0,1}(x) - \pi_{\theta_0,0}(x) = (p-q)/p$ for $0 < x \le \theta_0$; $(p-q)/q$ for $\theta_0 < x \le 1$; $(q-p)/q$ for $1 < x \le \theta_0 + 1$; $(q-p)/p$ for $\theta_0 + 1 < x \le 2$.
Fig. 2.2. The values of $\pi_{\theta_0,1}(x) - \pi_{\theta_0,0}(x)$ for all $x \in [0, 2]$.
Note that $\hat\theta_1^*(x)$ can take values outside the interval $[0, 1]$, which is the range of the parameter $\theta$. Since for $\theta > \theta_0$
$E_\theta[\pi_{\theta_0,1}(X) - \pi_{\theta_0,0}(X)] = c(\theta - \theta_0)$,
$E_\theta[\{\pi_{\theta_0,1}(X) - \pi_{\theta_0,0}(X)\}^2] = c\{1 + c(\theta - \theta_0)\}$,
we obtain for $\theta > \theta_0$
$V_\theta(\hat\theta_1^*(X)) = \frac{1}{c} + (\theta - \theta_0)\{1 - (\theta - \theta_0)\}$,
where $V_\theta$ designates the variance at $\theta$. In a similar way we have for $\theta < \theta_0$
$V_\theta(\hat\theta_1^*(X)) = \frac{1}{c} + (\theta_0 - \theta)\{1 - (\theta_0 - \theta)\}$.
Hence the variance of the LMVU estimator $\hat\theta_1^*(x)$ is given by
$V_\theta(\hat\theta_1^*(X)) = \frac{1}{c} + |\theta - \theta_0|\,(1 - |\theta - \theta_0|)$.  (2.8)
Note that
$V_\theta(\hat\theta_1^*(X)) \ge \frac{1}{c} = V_{\theta_0}(\hat\theta_1^*(X))$ for all $\theta \in [0, 1]$.
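The unbiasedness and the variance formula (2.8) for $n = 1$ can be verified by direct numerical integration (a sketch; $p = 0.2$, $\theta_0 = 0.3$, $\theta = 0.55$ are illustrative choices):

```python
import numpy as np

p, q, theta0 = 0.2, 0.8, 0.3
c = 1.0 / (p * q) - 4.0

def f(x, th):
    # density (2.1)
    return np.where((0 < x) & (x <= 2),
                    np.where((th < x) & (x <= th + 1), q, p), 0.0)

def estimator(x):
    # theta1*(x) = (1/c){pi_{theta0,1}(x) - pi_{theta0,0}(x)} + theta0
    pi1 = f(x, 1.0) / f(x, theta0)
    pi0 = f(x, 0.0) / f(x, theta0)
    return (pi1 - pi0) / c + theta0

xs = np.linspace(1e-6, 2.0 - 1e-6, 400_000)
dx = xs[1] - xs[0]

def mean_var(th):
    # Riemann quadrature of E_th and V_th of the estimator
    w = f(xs, th) * dx
    m = float(np.sum(estimator(xs) * w))
    v = float(np.sum((estimator(xs) - m) ** 2 * w))
    return m, v

theta = 0.55
m, v = mean_var(theta)
v_formula = 1.0 / c + abs(theta - theta0) * (1 - abs(theta - theta0))
# m ~ theta (unbiasedness) and v ~ v_formula (formula (2.8))
```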
Fig. 2.3. The variance of the LMVU estimator $\hat\theta_1^*(x)$ at $\theta = \theta_0$ given by (2.8), when $\theta_0 > 1/2$.
Next we shall construct the LMVU estimator at $\theta = \theta_0$ given by the form (2.2) in the case when $n > 1$. From (2.3) we have
$\int_0^2 \cdots \int_0^2 \left\{\prod_{i=1}^n \frac{f(x_i, \eta)}{f(x_i, \theta_0)}\right\} \prod_{i=1}^n f(x_i, \theta)\,dx_1 \cdots dx_n = \{K_{\theta_0}(\theta, \eta)\}^n = K_{\theta_0}^n(\theta, \eta)$ (say).  (2.9)
By (2.6) and (2.7) we obtain for $\theta > \theta_0$
$K_{\theta_0}^n(\theta, \eta) = 1$ for $\eta \le \theta_0$; $\{1 + c(\eta - \theta_0)\}^n$ for $\theta_0 \le \eta \le \theta$; $\{1 + c(\theta - \theta_0)\}^n$ for $\theta \le \eta$,  (2.10)
and for $\theta < \theta_0$
$K_{\theta_0}^n(\theta, \eta) = 1$ for $\theta_0 \le \eta$; $\{1 + c(\theta_0 - \eta)\}^n$ for $\theta \le \eta \le \theta_0$; $\{1 + c(\theta_0 - \theta)\}^n$ for $\eta \le \theta$,  (2.11)
where $c = \frac{1}{pq} - 4$. Note that for $\theta = \theta_0$, $K_{\theta_0}^n(\theta_0, \eta) = 1$.
We take a signed measure $G_n^*$ over $[0, 1]$ satisfying
$\frac{dG_n^*}{d\eta}(\eta) = \left(1 - \frac{1}{n}\right) \frac{\mathrm{sgn}(\eta - \theta_0)}{(1 + c|\eta - \theta_0|)^n} = g_n(\eta)$ (say) for $0 < \eta < 1$,  (2.12)
$G_n^*(\{0\}) = -\frac{1}{nc(1 + c\theta_0)^{n-1}}$, $G_n^*(\{1\}) = \frac{1}{nc\{1 + c(1 - \theta_0)\}^{n-1}}$, $G_n^*(\{\theta_0\}) = \theta_0$.
Note that $G_1^*$ is consistent with the $G^*$ given in the above case $n = 1$. Then the estimator
$\hat\theta_n^*(x_1, \ldots, x_n) = \int_0^1 \prod_{i=1}^n \frac{f(x_i, \eta)}{f(x_i, \theta_0)}\,dG_n^*(\eta)$  (2.13)
is unbiased. Indeed, we have by (2.9), (2.10) and (2.11) for $\theta > \theta_0$
$E_\theta(\hat\theta_n^*) = \int_0^2 \cdots \int_0^2 \left\{\int_0^1 \prod_{i=1}^n \frac{f(x_i, \eta)}{f(x_i, \theta_0)}\,dG_n^*(\eta)\right\} \prod_{i=1}^n f(x_i, \theta)\,dx_1 \cdots dx_n = \int_0^1 K_{\theta_0}^n(\theta, \eta)\,dG_n^*(\eta)$
$= G_n^*(\{0\}) K_{\theta_0}^n(\theta, 0) + G_n^*(\{1\}) K_{\theta_0}^n(\theta, 1) + G_n^*(\{\theta_0\}) K_{\theta_0}^n(\theta, \theta_0) + \int_0^{\theta_0} g_n(\eta)\,d\eta + \int_{\theta_0}^{\theta} \{1 + c(\eta - \theta_0)\}^n g_n(\eta)\,d\eta + \{1 + c(\theta - \theta_0)\}^n \int_\theta^1 g_n(\eta)\,d\eta$
$= -\frac{1}{nc(1 + c\theta_0)^{n-1}} + \frac{\{1 + c(\theta - \theta_0)\}^n}{nc\{1 + c(1 - \theta_0)\}^{n-1}} + \theta_0 - \left(1 - \frac{1}{n}\right)\int_0^{\theta_0} \frac{d\eta}{\{1 + c(\theta_0 - \eta)\}^n} + \left(1 - \frac{1}{n}\right)(\theta - \theta_0) + \left(1 - \frac{1}{n}\right)\{1 + c(\theta - \theta_0)\}^n \int_\theta^1 \frac{d\eta}{\{1 + c(\eta - \theta_0)\}^n} = \theta$.
Similarly we obtain for $\theta < \theta_0$, $E_\theta(\hat\theta_n^*) = \theta$, and for $\theta = \theta_0$, $E_{\theta_0}(\hat\theta_n^*) = \theta_0$. Hence the estimator $\hat\theta_n^*$ is the LMVU estimator.

Next we shall obtain the estimator explicitly. Putting
$\pi_{\theta_0,\eta}^n(\mathbf{x}) = \prod_{i=1}^n \pi_{\theta_0,\eta}(x_i)$
with $\mathbf{x} = (x_1, \ldots, x_n)$, we have from (2.13)
$\hat\theta_n^*(\mathbf{x}) = \int_0^1 \pi_{\theta_0,\eta}^n(\mathbf{x})\,dG_n^*(\eta) = \pi_{\theta_0,0}^n(\mathbf{x})\,G_n^*(\{0\}) + \pi_{\theta_0,1}^n(\mathbf{x})\,G_n^*(\{1\}) + \theta_0 + \int_0^{\theta_0} \pi_{\theta_0,\eta}^n(\mathbf{x})\,g_n(\eta)\,d\eta + \int_{\theta_0}^1 \pi_{\theta_0,\eta}^n(\mathbf{x})\,g_n(\eta)\,d\eta$.
For $\eta = 0, 1$ we have
$\pi_{\theta_0,0}^n(\mathbf{x}) = \prod_{i=1}^n \pi_{\theta_0,0}(x_i) = \left(\frac{q}{p}\right)^a \left(\frac{p}{q}\right)^b = \left(\frac{q}{p}\right)^{a-b}$
and
$\pi_{\theta_0,1}^n(\mathbf{x}) = \prod_{i=1}^n \pi_{\theta_0,1}(x_i) = \left(\frac{p}{q}\right)^{a'} \left(\frac{q}{p}\right)^{b'} = \left(\frac{p}{q}\right)^{a'-b'}$
with $a = \#\{x_i \mid 0 < x_i \le \theta_0\}$, $b = \#\{x_i \mid 1 < x_i \le \theta_0 + 1\}$, $a' = \#\{x_i \mid \theta_0 < x_i \le 1\}$ and $b' = \#\{x_i \mid \theta_0 + 1 < x_i \le 2\}$, where $\#\{\ \}$ designates the number of elements of the set $\{\ \}$. Note that for $0 < \eta < \theta_0$
$\pi_{\theta_0,\eta}^n(\mathbf{x}) = \left(\frac{q}{p}\right)^s \left(\frac{p}{q}\right)^t = \left(\frac{q}{p}\right)^{s-t}$,  (2.14)
where $s = \#\{x_i \mid \eta < x_i \le \theta_0\}$ and $t = \#\{x_i \mid \eta + 1 < x_i \le \theta_0 + 1\}$.
If $\theta_0 < \eta < 1$, then
$\pi_{\theta_0,\eta}^n(\mathbf{x}) = \left(\frac{p}{q}\right)^{s'-t'}$,
where $s' = \#\{x_i \mid \theta_0 < x_i \le \eta\}$ and $t' = \#\{x_i \mid \theta_0 + 1 < x_i \le \eta + 1\}$. Hence we obtain
$\hat\theta_n^*(\mathbf{x}) = -\frac{(q/p)^{a-b}}{nc(1 + c\theta_0)^{n-1}} + \frac{(p/q)^{a'-b'}}{nc\{1 + c(1 - \theta_0)\}^{n-1}} + \theta_0 + \int_0^{\theta_0} \left(\frac{q}{p}\right)^{s-t} g_n(\eta)\,d\eta + \int_{\theta_0}^1 \left(\frac{p}{q}\right)^{s'-t'} g_n(\eta)\,d\eta$,  (2.15)
where
$g_n(\eta) = \left(1 - \frac{1}{n}\right) \frac{\mathrm{sgn}(\eta - \theta_0)}{(1 + c|\eta - \theta_0|)^n}$  $(0 < \eta < 1)$
and $s$, $t$, $s'$, $t'$ depend on $\eta$.
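The defining properties of the signed measure (2.12) can be verified numerically (a sketch; the parameter values are illustrative assumptions). It checks that the total mass is $G_n^*([0,1]) = \theta_0$ and that $\int_0^1 K_{\theta_0}^n(\theta, \eta)\,dG_n^*(\eta) = \theta$ for a $\theta > \theta_0$, which is the unbiasedness computation above:

```python
import numpy as np

p, q, theta0, n = 0.2, 0.8, 0.3, 5
c = 1.0 / (p * q) - 4.0

# atoms of G_n* from (2.12)
atom0 = -1.0 / (n * c * (1 + c * theta0) ** (n - 1))
atom1 = 1.0 / (n * c * (1 + c * (1 - theta0)) ** (n - 1))
atom_theta0 = theta0

def g_dens(eta):
    # absolutely continuous part g_n(eta) of (2.12)
    return (1 - 1.0 / n) * np.sign(eta - theta0) / (1 + c * np.abs(eta - theta0)) ** n

def K_n(theta, eta):
    # K^n_{theta0}(theta, eta) from (2.10); valid for theta >= theta0
    return np.where(eta < theta0, 1.0,
                    np.where(eta <= theta, (1 + c * (eta - theta0)) ** n,
                             (1 + c * (theta - theta0)) ** n))

eta = np.linspace(0.0, 1.0, 200_001)
w = g_dens(eta) * (eta[1] - eta[0])          # quadrature weights

mass = atom0 + atom1 + atom_theta0 + float(np.sum(w))   # ~ theta0
theta = 0.7
expect = (atom0 * float(K_n(theta, 0.0)) + atom1 * float(K_n(theta, 1.0))
          + atom_theta0 * 1.0               # K^n(theta, theta0) = 1
          + float(np.sum(K_n(theta, eta) * w)))          # ~ theta
```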
For $n \ge 2$ it may be difficult to calculate the variance of the LMVU estimator $\hat\theta_n^*$ directly from (2.15), but it is possible to do it by (2.13). Putting
$\tilde K_{\theta_0}(\theta, \eta, \eta') = \int_0^2 \pi_{\theta_0,\eta,\eta'}(x)\,f(x, \theta)\,dx$ with $\pi_{\theta_0,\eta,\eta'}(x) = \pi_{\theta_0,\eta}(x)\,\pi_{\theta_0,\eta'}(x)$,
we have
$E_\theta(\hat\theta_n^{*2}) = \int_0^2\cdots\int_0^2 \int_0^1\!\!\int_0^1 \prod_{i=1}^n \frac{f(x_i, \eta)\,f(x_i, \eta')}{\{f(x_i, \theta_0)\}^2}\,dG_n^*(\eta)\,dG_n^*(\eta') \prod_{i=1}^n f(x_i, \theta)\,dx_i = \int_0^1\!\!\int_0^1 \tilde K_{\theta_0}^n(\theta, \eta, \eta')\,dG_n^*(\eta)\,dG_n^*(\eta')$,  (2.16)
where $\tilde K_{\theta_0}^n = \{\tilde K_{\theta_0}\}^n$. Here we shall obtain the variance of the estimator $\hat\theta_n^*$ at $\theta = \theta_0$. By (2.4) and (2.5) we have for $\eta' \le \eta < \theta_0$
$\pi_{\theta_0,\eta,\eta'}(x) = 1$ for $0 < x \le \eta'$; $q/p$ for $\eta' < x \le \eta$; $(q/p)^2$ for $\eta < x \le \theta_0$; $1$ for $\theta_0 < x \le \eta' + 1$; $p/q$ for $\eta' + 1 < x \le \eta + 1$; $(p/q)^2$ for $\eta + 1 < x \le \theta_0 + 1$; $1$ for $\theta_0 + 1 < x \le 2$.  (2.17)
<1?'
<0o.
From (2.17) we have
(l+c(Oo-'0) if ??'
(2.18)
297 12 M. Akahira and K . Takeuchi
By (2.4) and (2.5) we have for $\eta < \theta_0 \le \eta'$
$\pi_{\theta_0,\eta,\eta'}(x) = 1$ for $0 < x \le \eta$; $q/p$ for $\eta < x \le \theta_0$; $p/q$ for $\theta_0 < x \le \eta'$; $1$ for $\eta' < x \le \eta + 1$; $p/q$ for $\eta + 1 < x \le \theta_0 + 1$; $q/p$ for $\theta_0 + 1 < x \le \eta' + 1$; $1$ for $\eta' + 1 < x \le 2$.  (2.19)
Then we obtain for $\eta < \theta_0 \le \eta'$, $\tilde K_{\theta_0}(\theta_0, \eta, \eta') = 1$. From (2.18) and (2.19) we have
$\tilde K_{\theta_0}(\theta_0, \eta, \eta') = 1 + c(\theta_0 - \eta)$ if $\eta' \le \eta < \theta_0$; $1 + c(\theta_0 - \eta')$ if $\eta \le \eta' < \theta_0$; $1$ if $\eta < \theta_0 \le \eta'$.  (2.20)
In a similar way to the above we have
$\tilde K_{\theta_0}(\theta_0, \eta, \eta') = 1$ if $\eta' < \theta_0 \le \eta$; $1 + c(\eta' - \theta_0)$ if $\theta_0 \le \eta' \le \eta$; $1 + c(\eta - \theta_0)$ if $\theta_0 \le \eta \le \eta'$.  (2.21)
In order to calculate $E_{\theta_0}(\hat\theta_n^{*2})$ we obtain from (2.16)
$E_{\theta_0}(\hat\theta_n^{*2}) = G_n^*(\{0\}) \int_0^1 \tilde K_{\theta_0}^n(\theta_0, 0, \eta')\,dG_n^*(\eta') + G_n^*(\{1\}) \int_0^1 \tilde K_{\theta_0}^n(\theta_0, 1, \eta')\,dG_n^*(\eta') + G_n^*(\{\theta_0\}) \int_0^1 \tilde K_{\theta_0}^n(\theta_0, \theta_0, \eta')\,dG_n^*(\eta') + \int_0^{\theta_0}\!\!\int_0^1 \tilde K_{\theta_0}^n(\theta_0, \eta, \eta')\,dG_n^*(\eta')\,g_n(\eta)\,d\eta + \int_{\theta_0}^1\!\!\int_0^1 \tilde K_{\theta_0}^n(\theta_0, \eta, \eta')\,dG_n^*(\eta')\,g_n(\eta)\,d\eta$.  (2.22)
By (2.20) and (2.21) we have
$\tilde K_{\theta_0}(\theta_0, 0, \eta') = 1 + c(\theta_0 - \eta')$ for $\eta' < \theta_0$; $1$ for $\theta_0 \le \eta'$,  (2.23)
$\tilde K_{\theta_0}(\theta_0, 1, \eta') = 1$ for $\eta' < \theta_0$; $1 + c(\eta' - \theta_0)$ for $\theta_0 \le \eta'$.  (2.24)
It is also clear that
$\tilde K_{\theta_0}(\theta_0, \theta_0, \eta') = 1$ for all $\eta' \in [0, 1]$.  (2.25)
By (2.12), (2.23), (2.24) and (2.25) we have
$G_n^*(\{0\}) \int_0^1 \tilde K_{\theta_0}^n(\theta_0, 0, \eta')\,dG_n^*(\eta') = 0$;  (2.26)
$G_n^*(\{1\}) \int_0^1 \tilde K_{\theta_0}^n(\theta_0, 1, \eta')\,dG_n^*(\eta') = \frac{1}{nc\{1 + c(1 - \theta_0)\}^{n-1}}$;  (2.27)
$G_n^*(\{\theta_0\}) \int_0^1 \tilde K_{\theta_0}^n(\theta_0, \theta_0, \eta')\,dG_n^*(\eta') = \theta_0^2$.  (2.28)
By (2.12) and (2.20) we also have for $0 < \eta < \theta_0$
$\tilde K_{\theta_0}^n(\theta_0, \eta, 0)\,G_n^*(\{0\}) + \tilde K_{\theta_0}^n(\theta_0, \eta, 1)\,G_n^*(\{1\}) + \tilde K_{\theta_0}^n(\theta_0, \eta, \theta_0)\,G_n^*(\{\theta_0\}) + \left(\int_0^\eta + \int_\eta^{\theta_0} + \int_{\theta_0}^1\right) \tilde K_{\theta_0}^n(\theta_0, \eta, \eta')\,dG_n^*(\eta') = \eta$.
Then we have for $n \ge 2$
$\int_0^{\theta_0}\!\!\int_0^1 \tilde K_{\theta_0}^n(\theta_0, \eta, \eta')\,dG_n^*(\eta')\,g_n(\eta)\,d\eta = \int_0^{\theta_0} \eta\,g_n(\eta)\,d\eta$
$= -\frac{\theta_0}{2c} + \frac{1}{2c^2}\log(1 + c\theta_0)$ for $n = 2$; $-\frac{\theta_0}{cn} + \frac{1}{c^2 n(n-2)}\left\{1 - \frac{1}{(1 + c\theta_0)^{n-2}}\right\}$ for $n > 2$.  (2.29)
In a similar way to the above we obtain by (2.12) and (2.21)
$\int_{\theta_0}^1\!\!\int_0^1 \tilde K_{\theta_0}^n(\theta_0, \eta, \eta')\,dG_n^*(\eta')\,g_n(\eta)\,d\eta$
$= \frac{\theta_0}{2c} - \frac{1}{2c\{1 + c(1 - \theta_0)\}} + \frac{1}{2c^2}\log\{1 + c(1 - \theta_0)\}$ for $n = 2$; $\frac{\theta_0}{cn} - \frac{1}{cn\{1 + c(1 - \theta_0)\}^{n-1}} + \frac{1}{c^2 n(n-2)}\left\{1 - \frac{1}{\{1 + c(1 - \theta_0)\}^{n-2}}\right\}$ for $n > 2$.  (2.30)
From (2.22) and (2.26)-(2.30) we have
$E_{\theta_0}(\hat\theta_n^{*2}) = \theta_0^2 + \frac{1}{2c^2}\log\{(1 + c\theta_0)(1 + c(1 - \theta_0))\}$ for $n = 2$; $\theta_0^2 + \frac{1}{c^2 n(n-2)}\left\{2 - \frac{1}{(1 + c\theta_0)^{n-2}} - \frac{1}{\{1 + c(1 - \theta_0)\}^{n-2}}\right\}$ for $n > 2$.
Hence, since $E_{\theta_0}(\hat\theta_n^*) = \theta_0$, the variance of the LMVU estimator $\hat\theta_n^*$ is given by
$V_{\theta_0}(\hat\theta_n^*) = \frac{1}{2c^2}\log\{(1 + c\theta_0)(1 + c(1 - \theta_0))\}$ for $n = 2$; $\frac{1}{c^2 n(n-2)}\left\{2 - \frac{1}{(1 + c\theta_0)^{n-2}} - \frac{1}{\{1 + c(1 - \theta_0)\}^{n-2}}\right\}$ for $n > 2$.
Note that for sufficiently large $n$
$V_{\theta_0}(\hat\theta_n^*) = \frac{2}{c^2 n^2} + o\!\left(\frac{1}{n^2}\right)$,
i.e., the variance of the LMVU estimator $\hat\theta_n^*$ is of order $n^{-2}$. The calculation of the variance $V_\theta(\hat\theta_n^*)$ at an arbitrary $\theta$ is more complicated than that of $V_{\theta_0}(\hat\theta_n^*)$.
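The closed-form variance can be evaluated numerically against its leading term (a sketch with illustrative parameter values):

```python
import numpy as np

p, q, theta0 = 0.2, 0.8, 0.3
c = 1.0 / (p * q) - 4.0

def var_lmvu(n):
    # V_{theta0}(theta_n*) from the display above
    if n == 2:
        return np.log((1 + c * theta0) * (1 + c * (1 - theta0))) / (2 * c ** 2)
    return (2 - (1 + c * theta0) ** (-(n - 2))
              - (1 + c * (1 - theta0)) ** (-(n - 2))) / (c ** 2 * n * (n - 2))

# ratio of the exact variance to the asymptote 2/(c^2 n^2); tends to 1 from above
ratios = [var_lmvu(n) * (c ** 2 * n ** 2) / 2.0 for n in (10, 100, 1000)]
```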
References
Akahira M, Puri ML, Takeuchi K (1986) Bhattacharyya bound of variances of unbiased estimators in non-regular cases. Ann Inst Statist Math 38A:35-44
Barankin EW (1949) Locally best unbiased estimates. Ann Math Statist 20:477-501
Chapman DG, Robbins H (1951) Minimum variance estimation without regularity assumptions. Ann Math Statist 22:581-586
Kiefer J (1952) On minimum variance in non-regular estimation. Ann Math Statist 23:627-630
Polfeldt T (1970) The order of the minimum variance in a non-regular case. Ann Math Statist 41:667-672
Stein C (1950) Unbiased estimates of minimum variance. Ann Math Statist 21:406-415
Takeuchi K, Akahira M (1986) A note on minimum variance. Metrika 33:85-91
Vincze I (1979) On the Cramér-Fréchet-Rao inequality in the non-regular case. In: Contributions to Statistics. The Jaroslav Hájek Memorial Volume. Academia, Prague, 253-263
Received June 12, 1984
Ann. Inst. Statist. Math. 39 (1987), Part A, 593-610
THE LOWER BOUND FOR THE VARIANCE OF UNBIASED ESTIMATORS FOR ONE-DIRECTIONAL FAMILY OF DISTRIBUTIONS MASAFUMI AKAHIRA AND KEI TAKEUCHI
(Received Sept. 13, 1985; revised Feb. 15, 1986)
Summary

In this paper we introduce the concept of one-directionality, which includes the cases of location (and scale) parameters and of selection parameters as well as other cases, and establish theorems giving sharp lower bounds and conditions for the existence of a zero-variance unbiased estimator for this class of non-regular distributions.

Key words and phrases: Cramér-Rao bound, Bhattacharyya bound, unbiased estimator, one-directional family of distributions, sharp lower bound.

1. Introduction

For the lower bound for the variance of unbiased estimators, the most famous is the so-called Cramér-Rao bound. But the Cramér-Rao bound and its Bhattacharyya extension assume a set of regularity conditions. Chapman and Robbins [2], Kiefer [4] and Fraser and Guttman [3] obtained bounds under much less stringent assumptions, but they still require independence of the support from the parameter θ, or, almost equivalently, that the distribution with θ ≠ θ₀ be absolutely continuous with respect to that with θ = θ₀, where θ₀ is the specified parameter value at which the variance is evaluated. Recently, in non-regular cases the Cramér-Rao bound has been discussed by Vincze [8], Móri [5] and others. In a previous paper, Akahira, Puri and Takeuchi [1] obtained a Bhattacharyya-type bound for the variance of unbiased estimators in non-regular cases. In this paper we introduce the concept of one-directionality, which includes the cases of location (and scale) parameters and of selection parameters as well as other cases, and show that the bound for the variance of unbiased estimators is sharp, in the sense that the actual infimum of the variance of unbiased estimators equals the bound at a specified θ₀, for this class of non-regular distributions. We also establish that for a wide class of non-regular distributions the infimum of the variance of unbiased estimators can be zero when the sample size is not smaller than 2.

A simple but rather general instance of a class of distributions of which no member dominates another is one characterized by a real parameter θ such that the distribution shifts monotonically as θ changes. For one-dimensional random variables the case can be visualized by the following example.
[Figure: density curves f(x, θ₁), f(x, θ₂), f(x, θ₃) shifting to the right, θ₁ < θ₂ < θ₃.]
A mathematical definition for such cases in a rather general set-up is given in Section 2, where such a family is termed a one-directional family of distributions.
2. Definition of the one-directional family of distributions

We assume that we are given a model consisting of a sample space $(\mathcal{X}, \mu)$ and a family $\mathcal{P} = \{P_\theta : \theta \in \Theta\}$ of probability measures, where the parameter space $\Theta$ is an open subset of the Euclidean 1-space $R^1$. Throughout the subsequent discussion we shall assume the following:
(A.2.1) For each $\theta \in \Theta$, $P_\theta$ is absolutely continuous with respect to a $\sigma$-finite measure $\mu$, and the corresponding density w.r.t. $\mu$ is $f(x, \theta)$.
Let $A(\theta)$ be a support of $f(x, \theta)$, that is, $A(\theta) = \{x : f(x, \theta) > 0\}$. The determination of $A(\theta)$ is not unique insofar as any null set may be added to it, but in the sequel we take one fixed determination of $A(\theta)$ for every $\theta \in \Theta$ which satisfies the following:
(A.2.2) For any distinct points $\theta_1$ and $\theta_2$ in $\Theta$, neither $A(\theta_1) \supset A(\theta_2)$ nor $A(\theta_1) \subset A(\theta_2)$.
(A.2.3) For $\theta_1 < \theta_2 < \theta_3$,
$A(\theta_1) \cap A(\theta_3) \subset A(\theta_1) \cap A(\theta_2)$ and $A(\theta_1) \cap A(\theta_3) \subset A(\theta_2) \cap A(\theta_3)$.
(A.2.4) If $\theta_n$ tends to $\theta$ as $n \to \infty$, then
$\mu\bigl((\limsup_{n\to\infty} A(\theta_n)) \mathbin{\triangle} A(\theta)\bigr) = \mu\bigl((\liminf_{n\to\infty} A(\theta_n)) \mathbin{\triangle} A(\theta)\bigr) = 0$,
where $E \mathbin{\triangle} F$ denotes the symmetric difference of two sets $E$ and $F$.
(A.2.5) For any two points $\theta_1$ and $\theta_2$ in $\Theta$ with $\theta_1 < \theta_2$, there exists a finite number of points $\xi_i$ such that $\theta_1 = \xi_0 < \xi_1 < \cdots < \xi_k = \theta_2$ and $\mu(A(\xi_{i-1}) \cap A(\xi_i)) > 0$ $(i = 1, \ldots, k)$.
Then $\mathcal{P}$ is called a one-directional family of distributions if the conditions (A.2.1) to (A.2.5) hold.
3. The lower bound for the variance of unbiased estimators when X is a real random variable

Now suppose that $X$ is a real random variable with a density function $f(x, \theta)$ whose support is an open interval $(a(\theta), b(\theta))$; then the condition of one-directionality means that $a(\theta)$ and $b(\theta)$ are both monotone and continuous functions. Without loss of generality we assume that $a(\theta)$ and $b(\theta)$ are monotone increasing. Therefore we can formulate the following problem. Let $\mathcal{X} = \Theta = R^1$, and let $X$ be a real random variable with a density function $f(x, \theta)$ (with respect to the Lebesgue measure $\mu$) satisfying the following conditions (A.3.1) to (A.3.7):
(A.3.1) $f(x, \theta) > 0$ for $a(\theta) < x < b(\theta)$ and $f(x, \theta) = 0$ for $x \le a(\theta)$, $x \ge b(\theta)$, where $f(x, \theta)$ is continuous in $x$ and $\theta$ for $a(\theta) < x < b(\theta)$, and $a(\theta)$ and $b(\theta)$ are differentiable with $a'(\theta) > 0$, $b'(\theta) > 0$ for all $\theta$.
(A.3.2) $\lim_{x \to a(\theta)+0} f(x, \theta) = \lim_{x \to b(\theta)-0} f(x, \theta) = 0$, and for some positive integer $p$
$\lim_{x \to a(\theta)+0} \frac{\partial^i}{\partial\theta^i} f(x, \theta) = \lim_{x \to b(\theta)-0} \frac{\partial^i}{\partial\theta^i} f(x, \theta) = 0$  $(i = 1, \ldots, p - 1)$,
$\lim_{x \to a(\theta)+0} \frac{\partial^p}{\partial\theta^p} f(x, \theta) = A_p(\theta)$,  $\lim_{x \to b(\theta)-0} \frac{\partial^p}{\partial\theta^p} f(x, \theta) = B_p(\theta)$,
where $A_p(\theta)$ and $B_p(\theta)$ are non-zero, finite and continuous in $\theta$.
(A.3.3) $(\partial^i/\partial\theta^i) f(x, \theta)$ $(i = 1, \ldots, p)$ are linearly independent.
(A.3.4) For some $\theta_0 \in \Theta$,
$\int_{a(\theta_0)}^{b(\theta_0)} \frac{\{(\partial^i/\partial\theta^i) f(x, \theta_0)\}^2}{f(x, \theta_0)}\,d\mu(x)$
is finite for each $i = 1, \ldots, k$, and
$\int_{a(\theta_0)}^{b(\theta_0)} \frac{\{\sum_{i=k+1}^p c_i\,(\partial^i/\partial\theta^i) f(x, \theta_0)\}^2}{f(x, \theta_0)}\,d\mu(x)$
is infinite unless $c_{k+1} = \cdots = c_p = 0$.
(A.3.5) For $\theta_0 \in \Theta$ there exist a positive number $\varepsilon$ and a positive-valued measurable function $\varphi(x)$ such that $\varphi(x) \ge f(x, \theta)$ for every $x \in A(\theta)$ and every $\theta \in (\theta_0 - \varepsilon, \theta_0 + \varepsilon)$, and such that $\int_{A(\theta)} |\tau(x)|\,f(x, \theta)\,d\mu < \infty$ for every $\theta \in (\theta_0 - \varepsilon, \theta_0 + \varepsilon)$ implies
$\int_{\bigcup_{\theta \in (\theta_0 - \varepsilon, \theta_0 + \varepsilon)} A(\theta)} |\tau(x)|\,\varphi(x)\,d\mu < \infty$.
(A.3.6) For each $i = 1, \ldots, p + 1$,
$\limsup_{h \to 0}\ \sup_{x \in \bigcup_{j=1}^i A(\theta_0 + jh) - A(\theta_0)} \frac{\bigl|\sum_{j=0}^i (-1)^j \binom{i}{j} f(x, \theta_0 + jh)\bigr|}{|h|^i\,\varphi(x)} < \infty$.
(A.3.7) For each $i = 1, \ldots, p + 1$,
$\limsup_{h \to 0}\ \sup_{x \in A(\theta_0)} \frac{\bigl|\sum_{j=0}^i (-1)^j \binom{i}{j} f(x, \theta_0 + jh)\bigr|}{|h|^i\,\varphi(x)} < \infty$.
Note that the conditions (A.3.5), (A.3.6) and (A.3.7) are assumed in order to obtain the Bhattacharyya bound for the variance of unbiased estimators (see Akahira et al. [1]).
First we consider the special case $p = 0$; then we have to modify the condition (A.3.2) slightly as follows:
(A.3.2)' $\lim_{x \to a(\theta)+0} f(x, \theta) = A_0(\theta) > 0$,  $\lim_{x \to b(\theta)-0} f(x, \theta) = B_0(\theta) > 0$.
In the following theorem we shall obtain a lower bound.

Theorem 3.1. Let $g(\theta)$ be continuously differentiable over $\Theta$, and let $\hat g(x)$ be an unbiased estimator of $g(\theta)$. If for $p = 0$ and a fixed $\theta_0$ the conditions (A.3.1), (A.3.2)', (A.3.5), (A.3.6) and (A.3.7) hold, then
$\min_{\hat g:\ \mathrm{unbiased}} V_{\theta_0}(\hat g) = 0$,
where $V_{\theta_0}(\hat g)$ denotes the variance of $\hat g$ at $\theta = \theta_0$.
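The claim can be illustrated with a toy model before the proof (this example is ours, not the paper's): for $X \sim \mathrm{Uniform}(\theta, \theta + 1)$, $g(\theta) = \theta$ and $\theta_0 = 0$, the step estimator $\hat g(x) = \lfloor x \rfloor$ is unbiased for every $\theta \in [0, 1)$ and has variance $0$ at $\theta = \theta_0$, exhibiting the zero minimum:

```python
import numpy as np

theta0 = 0.0

def ghat(x):
    # step estimator: constant g(theta0) on (theta0, theta0 + 1), then
    # extended outward by unit jumps, in the spirit of the Volterra-type
    # construction used in the proof
    return np.floor(x - theta0) + theta0

def mean_var(theta, m=1_000_000):
    # deterministic midpoint quadrature over the support (theta, theta + 1)
    x = np.linspace(theta, theta + 1, m, endpoint=False) + 0.5 / m
    vals = ghat(x)
    mu = float(vals.mean())
    return mu, float(((vals - mu) ** 2).mean())

mu_mid, var_mid = mean_var(0.5)        # unbiased at theta = 0.5, variance > 0
mu_at_t0, var_at_t0 = mean_var(theta0) # zero variance at theta = theta0
```

Unbiasedness holds because $\int_\theta^{\theta+1} \lfloor x \rfloor\,dx = \theta$ for $\theta \in [0, 1)$, while on $(\theta_0, \theta_0 + 1)$ the estimator is constant, so its variance at $\theta_0$ vanishes.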
PROOF. We first define ĝ(x) = g(θ₀) for a(θ₀) < x < b(θ₀). Since a(θ₀) < a(θ) and b(θ₀) < b(θ) for θ > θ₀, it follows from the unbiasedness condition that

\[
(3.1)\quad g(\theta)=\int_{a(\theta)}^{b(\theta)}\hat g(x)f(x,\theta)\,d\mu
=\int_{a(\theta)}^{b(\theta_{0})}\hat g(x)f(x,\theta)\,d\mu
+\int_{b(\theta_{0})}^{b(\theta)}\hat g(x)f(x,\theta)\,d\mu .
\]

Putting

\[
h(\theta)=\int_{a(\theta)}^{b(\theta_{0})}\hat g(x)f(x,\theta)\,d\mu ,
\]

we have by (3.1)

\[
(3.2)\quad \int_{b(\theta_{0})}^{b(\theta)}\hat g(x)f(x,\theta)\,d\mu=g(\theta)-h(\theta).
\]
Differentiating both sides of (3.2) with respect to θ, we obtain

\[
(3.3)\quad b'(\theta)B_{0}(\theta)\hat g(b(\theta))
+\int_{b(\theta_{0})}^{b(\theta)}\hat g(x)\Bigl\{\frac{\partial}{\partial\theta}f(x,\theta)\Bigr\}d\mu
=g'(\theta)-h'(\theta).
\]

Differentiation under the integral sign is admitted because of (A.3.7) with p = 0. If ĝ(x) satisfies (3.3), then it also satisfies (3.1) since g(θ₀) = h(θ₀). Since by (A.3.1) and (A.3.2)′, b′(θ)B₀(θ) > 0, the integral equation (3.3) is of Volterra's second type; hence the solution ĝ(x) exists for b(θ₀) ≤ x < b(θ). Similarly we can construct ĝ(x) for x < a(θ₀). Repeating the same process we can define ĝ(x) for all x. Hence we have

\[
\min_{\hat g:\ \text{unbiased}} V_{\theta_{0}}(\hat g)=0.
\]
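The existence argument above rests on the fact that a Volterra equation of the second kind can be solved by forward stepping from the boundary. The following sketch (ours, not the paper's equation) illustrates this with the toy equation x(s) = 1 + ∫₀ˢ x(u) du, whose exact solution is x(s) = eˢ; `solve_volterra2`, `q` and `K` are our own illustrative names.

```python
import math

# Solve x(s) = q(s) + \int_0^s K(s,u) x(u) du by forward stepping with the
# trapezoidal rule: x(s) appears in the quadrature with weight h*K(s,s)/2 and
# is moved to the left-hand side, exactly the "second type" structure that
# makes step-by-step construction possible.
def solve_volterra2(q, K, s_max, m):
    h = s_max / m
    xs = [q(0.0)]
    for i in range(1, m + 1):
        s = i * h
        acc = q(s) + h * K(s, 0.0) * xs[0] / 2 \
              + h * sum(K(s, j * h) * xs[j] for j in range(1, i))
        xs.append(acc / (1 - h * K(s, s) / 2))
    return xs

# q = 1, K = 1  =>  x(s) = exp(s)
xs = solve_volterra2(lambda s: 1.0, lambda s, u: 1.0, 1.0, 2000)
```

With 2000 steps the trapezoidal scheme reproduces e at s = 1 to a few parts in 10⁷, which is the sense in which the solution of (3.3) "exists" constructively.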
Thus we complete the proof.

The following useful lemma is a special case of the result by Takeuchi and Akahira [7].

LEMMA 3.1. Let g(θ) be p-times differentiable over Θ. Suppose that the conditions (A.3.3) and (A.3.4) hold and G is the class of all estimators ĝ(x) of g(θ) for which

\[
\int_{a(\theta_{0})}^{b(\theta_{0})}\hat g(x)f(x,\theta_{0})\,d\mu=g(\theta_{0}),\qquad
\int_{a(\theta_{0})}^{b(\theta_{0})}\hat g(x)\Bigl\{\frac{\partial^{i}}{\partial\theta^{i}}f(x,\theta_{0})\Bigr\}d\mu=g^{(i)}(\theta_{0})\quad(i=1,\dots,p),
\]

where g^{(i)}(θ) is the i-th order derivative of g(θ) with respect to θ. Then

\[
\inf_{\hat g\in G}V_{\theta_{0}}(\hat g)
=(g^{(1)}(\theta_{0}),\dots,g^{(k)}(\theta_{0}))\,\Lambda^{-1}\,(g^{(1)}(\theta_{0}),\dots,g^{(k)}(\theta_{0}))',
\]

where Λ is a k × k matrix whose elements are

\[
\lambda_{ij}=\int_{a(\theta_{0})}^{b(\theta_{0})}\frac{1}{f(x,\theta_{0})}
\Bigl\{\frac{\partial^{i}}{\partial\theta^{i}}f(x,\theta_{0})\Bigr\}
\Bigl\{\frac{\partial^{j}}{\partial\theta^{j}}f(x,\theta_{0})\Bigr\}d\mu\quad(i,j=1,\dots,k).
\]
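The algebraic form of the bound in Lemma 3.1 is the classical Bhattacharyya bound, and it can be sanity-checked numerically on a hypothetical regular family (our example, not from the paper): for N(θ, 1), g(θ) = θ² and k = 2, the bound equals 4θ₀² + 2, the exact variance of the unbiased estimator X² − 1.

```python
import math

# Compute Lambda numerically for N(theta0, 1): with z = x - theta0 and f the
# normal density, df/dtheta = z*f and d^2 f/dtheta^2 = (z^2 - 1)*f, so
# lambda_11 = 1, lambda_12 = 0, lambda_22 = 2 and the bound is 4*theta0^2 + 2.
def bhattacharyya_bound(theta0, lo=-12.0, hi=12.0, m=200_000):
    h = (hi - lo) / m
    lam = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(m):
        z = (lo + (i + 0.5) * h) - theta0
        f = math.exp(-z * z / 2) / math.sqrt(2 * math.pi)
        d = [z * f, (z * z - 1) * f]      # partial^i f / partial theta^i
        for a in range(2):
            for b in range(2):
                lam[a][b] += d[a] * d[b] / f * h
    det = lam[0][0] * lam[1][1] - lam[0][1] * lam[1][0]
    inv = [[lam[1][1] / det, -lam[0][1] / det],
           [-lam[1][0] / det, lam[0][0] / det]]
    g = [2 * theta0, 2.0]                 # g'(theta0), g''(theta0)
    return sum(g[a] * inv[a][b] * g[b] for a in range(2) for b in range(2))
```

Here the integration grid and the choice g(θ) = θ² are assumptions made for the illustration only.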
The proof is omitted. In the following theorem we shall get a sharp lower bound.

THEOREM 3.2. Let g(θ) be (p+1)-times differentiable over Θ. Let ĝ(x) be an unbiased estimator of g(θ). If for p ≥ 1 and a fixed θ₀ the conditions (A.3.1) to (A.3.7) hold, then

\[
\inf_{\hat g:\ \text{unbiased}}V_{\theta_{0}}(\hat g)=v_{k}(\theta_{0}),
\]

that is, the bound v_k(θ₀) is sharp, where

\[
v_{k}(\theta_{0})=(g^{(1)}(\theta_{0}),\dots,g^{(k)}(\theta_{0}))\,\Lambda^{-1}\,(g^{(1)}(\theta_{0}),\dots,g^{(k)}(\theta_{0}))'
\]

with the k × k matrix Λ given in Lemma 3.1.

PROOF. From the unbiasedness condition of ĝ(x), (A.3.2) and (A.3.7) we have

\[
(3.4)\quad \int_{a(\theta_{0})}^{b(\theta_{0})}\hat g(x)f(x,\theta_{0})\,d\mu=g(\theta_{0}),
\]
\[
(3.5)\quad \int_{a(\theta_{0})}^{b(\theta_{0})}\hat g(x)\Bigl\{\frac{\partial^{i}}{\partial\theta^{i}}f(x,\theta_{0})\Bigr\}d\mu=g^{(i)}(\theta_{0})\quad(i=1,\dots,p).
\]

By Lemma 3.1 it follows that the sharp lower bound of

\[
V_{\theta_{0}}(\hat g)=\int_{a(\theta_{0})}^{b(\theta_{0})}\{\hat g(x)-g(\theta_{0})\}^{2}f(x,\theta_{0})\,d\mu(x)
\]

under (3.4) and (3.5) is given by \((g^{(1)}(\theta_{0}),\dots,g^{(k)}(\theta_{0}))\Lambda^{-1}(g^{(1)}(\theta_{0}),\dots,g^{(k)}(\theta_{0}))'\), i.e.,

\[
(3.6)\quad \inf_{\hat g:\ (3.4),(3.5)}V_{\theta_{0}}(\hat g)
=(g^{(1)}(\theta_{0}),\dots,g^{(k)}(\theta_{0}))\,\Lambda^{-1}\,(g^{(1)}(\theta_{0}),\dots,g^{(k)}(\theta_{0}))'
=v_{k}(\theta_{0})\ \text{(say)}.
\]

Note that the right-hand side of (3.6) is the Bhattacharyya bound for the variance of unbiased estimators at θ = θ₀ (see Akahira et al. [1]). From (3.6) it follows that for any ε > 0 there exists ĝ_ε(x) on the interval (a(θ₀), b(θ₀)) satisfying (3.4), (3.5) and

\[
V_{\theta_{0}}(\hat g_{\varepsilon})\le v_{k}(\theta_{0})+\varepsilon .
\]
We can extend ĝ_ε(x) for x outside (a(θ₀), b(θ₀)) from the unbiasedness condition

\[
(3.7)\quad \int_{a(\theta)}^{b(\theta)}\hat g_{\varepsilon}(x)f(x,\theta)\,d\mu(x)=g(\theta)\quad\text{for all }\theta\in\Theta .
\]

For θ > θ₀, i.e., b(θ) ≥ b(θ₀), we put

\[
h(\theta)=\int_{a(\theta)}^{b(\theta_{0})}\hat g_{\varepsilon}(x)f(x,\theta)\,d\mu(x).
\]

By (3.7) we obtain

\[
(3.8)\quad \int_{b(\theta_{0})}^{b(\theta)}\hat g_{\varepsilon}(x)f(x,\theta)\,d\mu(x)=g(\theta)-h(\theta).
\]

Differentiating both sides of (3.8) (p+1) times with respect to θ, recursively, we have by (A.3.1), (A.3.2), (A.3.5), (A.3.6) and (A.3.7)

\[
\int_{b(\theta_{0})}^{b(\theta)}\hat g_{\varepsilon}(x)\Bigl\{\frac{\partial}{\partial\theta}f(x,\theta)\Bigr\}d\mu(x)=g^{(1)}(\theta)-h^{(1)}(\theta),
\]
\[
\vdots
\]
\[
\int_{b(\theta_{0})}^{b(\theta)}\hat g_{\varepsilon}(x)\Bigl\{\frac{\partial^{p}}{\partial\theta^{p}}f(x,\theta)\Bigr\}d\mu(x)=g^{(p)}(\theta)-h^{(p)}(\theta),
\]
\[
(3.9)\quad B_{p}(\theta)b'(\theta)\hat g_{\varepsilon}(b(\theta))
+\int_{b(\theta_{0})}^{b(\theta)}\hat g_{\varepsilon}(x)\Bigl\{\frac{\partial^{p+1}}{\partial\theta^{p+1}}f(x,\theta)\Bigr\}d\mu
=g^{(p+1)}(\theta)-h^{(p+1)}(\theta).
\]

Differentiation under the integral sign is admitted because of (A.3.7). If ĝ_ε(x) satisfies (3.9), then it also satisfies (3.8) since g^{(i)}(θ₀) = h^{(i)}(θ₀) (i = 1, …, p) and g(θ₀) = h(θ₀). Note that h^{(p+1)}(θ) is determined by the values of ĝ_ε(x) for a(θ) < x ≤ b(θ₀). Since the equation (3.9) is of Volterra's second type, the solution ĝ_ε(x) exists for x ≥ b(θ₀). Similarly we can construct ĝ_ε(x) for x ≤ a(θ₀). Hence we obtain

\[
\inf_{\hat g:\ \text{unbiased}}V_{\theta_{0}}(\hat g)=v_{k}(\theta_{0}),
\]
i.e., v_k(θ₀) is a sharp bound. Thus we have completed the proof.

We shall give one example corresponding to each of the situations where the conditions (A.3.1), (A.3.2)/(A.3.2)′ to (A.3.7) are assumed.

Example 3.1. Let X be a real random variable with a density function f(x, θ) (with respect to the Lebesgue measure μ) satisfying each case.

(i) Location parameter case. The density function is of the form f(x − θ) and satisfies the following:

\[
f(x)>0\ \text{for } a<x<b,\qquad f(x)=0\ \text{for } x\le a,\ x\ge b,
\]

and \(\lim_{x\to a+0}f(x)>0\), \(\lim_{x\to b-0}f(x)>0\), and f(x) is continuously differentiable in the open interval (a, b).
(ii) The case of estimation of g(θ) = θ with the density function

\[
f(x-\theta)=
\begin{cases}
c\,(1-(x-\theta)^{2})^{q-1} & \text{for } |x-\theta|<1,\\
0 & \text{for } |x-\theta|\ge 1,
\end{cases}
\]
is discussed in Akahira et al. [1], where q > 1 and c is some constant.

(iii) Scale parameter case. The density function is of the form f(x/θ)/θ with

\[
f(x)>0\ \text{for } a<x<b\ (0<a),\qquad f(x)=0\ \text{otherwise},
\]

and f(x) satisfies the same conditions as in (i).

(iv) Selection parameter case (e.g., Morimoto and Sibuya [6]). Consider a family of density functions whose supporting intervals depend on a selection parameter θ and are of the form (θ, b(θ)), where −∞ < θ < b(θ) < ∞:

\[
f(x,\theta)=
\begin{cases}
\dfrac{p(x)}{F(\theta)} & \text{for } \theta<x\le b(\theta),\\
0 & \text{otherwise},
\end{cases}
\]

where p(x) > 0 a.e. and \(F(\theta)=\int_{\theta}^{b(\theta)}p(x)\,d\mu(x)\). Note that the cases (i), (iii) and (iv) correspond to the case p = 0, and (ii) corresponds to the case p = q − 1, where q is an integer.
4. The lower bound for the variance of unbiased estimators for a sample of size n of real-valued observations

Now suppose that we have a sample of size n, (X₁, …, X_n), of which the X_i's are independently and identically distributed according to the distribution characterized in the previous section. Then we can define the statistics

\[
Y=\max_{1\le i\le n}X_{i}+\min_{1\le i\le n}X_{i},\qquad
Z=\max_{1\le i\le n}X_{i}-\min_{1\le i\le n}X_{i},
\]

and we may concentrate our attention on the estimators depending only on Y and Z. Since they are not sufficient statistics, we may lose some information by doing so. More generally, for a sample of size n, (X₁, …, X_n), from a population in a one-directional family of densities with a support A(θ), we can define the two statistics

\[
\bar\theta=\sup\{\theta\mid X_{i}\in A(\theta)\ (i=1,\dots,n)\},\qquad
\underline\theta=\inf\{\theta\mid X_{i}\in A(\theta)\ (i=1,\dots,n)\},
\]

and also define \(Y=\tfrac12(\bar\theta+\underline\theta)\), \(Z=\tfrac12(\bar\theta-\underline\theta)\). There are various ways of defining the pair of statistics Y and Z, but, disregarding their construction, we assume that there exists a pair (Y, Z) which satisfies the following. Let Y and Z be real-valued statistics based on a sample (X₁, …, X_n) of size n for n ≥ 2. We assume that (Y, Z) has a joint probability density function f_θ(y, z) (with respect to the Lebesgue measure μ_{y,z}) satisfying

\[
f_{\theta}(y,z)=f_{\theta}(y\,|\,z)\,h_{\theta}(z)\quad\text{a.e.},
\]

where f_θ(y|z) is the conditional density function of Y given Z = z with respect to the Lebesgue measure μ_y and h_θ(z) is the density function of Z with respect to the Lebesgue measure μ_z. Note that if Z is ancillary, h_θ(z) is independent of θ. We assume the following condition:

(A.4.1) For almost all z [μ_z],

\[
f_{\theta}(y\,|\,z)>0\ \text{for } a_{z}(\theta)<y<b_{z}(\theta),\qquad
f_{\theta}(y\,|\,z)=0\ \text{for } y\le a_{z}(\theta),\ y\ge b_{z}(\theta),
\]

where a_z(θ) and b_z(θ) are strictly monotone increasing functions of θ for almost all z [μ_z] which depend on z, and

\[
h_{\theta}(z)>0\ \text{for } c<z<d,\qquad h_{\theta}(z)=0\ \text{for } z\le c,\ z\ge d,
\]
where c and d are constants independent of θ. We also assume that for almost all z [μ_z], f_θ(y|z), instead of f(x, θ), satisfies the conditions (A.3.2) to (A.3.7); we call the corresponding conditions (A.4.2) to (A.4.7). Let ĝ(y, z) be any unbiased estimator of g(θ). We define

\[
(4.1)\quad \phi_{z}(\theta)=\int_{a_{z}(\theta)}^{b_{z}(\theta)}\hat g(y,z)f_{\theta}(y\,|\,z)\,d\mu_{y}\quad\text{for a.a. }z\ [\mu_{z}].
\]

Further we assume the following condition:

(A.4.8) φ_z(θ₀) = g(θ₀) for a.a. z [μ_z] and φ_z^{(i)}(θ₀) = 0 for a.a. z [μ_z] (i = 1, …, k), where φ_z^{(i)}(θ) is the i-th order derivative of φ_z(θ) with respect to θ.

In the following theorem we shall show that the sharp bound is equal to zero.
THEOREM 4.1. Let g(θ) be (p+1)-times differentiable over Θ. Let ĝ(X₁, …, X_n) be an unbiased estimator of g(θ). If n ≥ 2 and, for a fixed θ₀, the conditions (A.4.1) to (A.4.8) hold, then

\[
\inf_{\hat g:\ \text{unbiased}}V_{\theta_{0}}(\hat g)=0.
\]

PROOF. From the unbiasedness condition we obtain

\[
(4.2)\quad \int_{c}^{d}\phi_{z}(\theta)h_{\theta}(z)\,d\mu_{z}=g(\theta)\quad\text{for all }\theta\in\Theta .
\]

First we assume that φ_z(θ) is given; then under (A.4.1) to (A.4.7) we have by Lemma 3.1

\[
(4.3)\quad \inf_{\hat g:\ (4.1)}\int_{a_{z}(\theta_{0})}^{b_{z}(\theta_{0})}\{\hat g(y,z)-\phi_{z}(\theta_{0})\}^{2}f_{\theta_{0}}(y\,|\,z)\,d\mu_{y}
=(\phi_{z}^{(1)}(\theta_{0}),\dots,\phi_{z}^{(k)}(\theta_{0}))\,\Delta^{-1}\,(\phi_{z}^{(1)}(\theta_{0}),\dots,\phi_{z}^{(k)}(\theta_{0}))'
=v_{k}(\theta_{0}\,|\,z)\ \text{(say)},
\]

where φ_z^{(i)}(θ) is the i-th order derivative of φ_z(θ) with respect to θ and Δ is a k × k matrix whose elements are
\[
\delta_{ij}=\int_{a_{z}(\theta_{0})}^{b_{z}(\theta_{0})}\frac{1}{f_{\theta_{0}}(y\,|\,z)}
\Bigl\{\frac{\partial^{i}}{\partial\theta^{i}}f_{\theta}(y\,|\,z)\Big|_{\theta=\theta_{0}}\Bigr\}
\Bigl\{\frac{\partial^{j}}{\partial\theta^{j}}f_{\theta}(y\,|\,z)\Big|_{\theta=\theta_{0}}\Bigr\}d\mu_{y}.
\]

From (4.3) it follows that for any ε > 0 there exists ĝ_ε(y, z) such that

\[
(4.4)\quad \int_{a_{z}(\theta)}^{b_{z}(\theta)}\hat g_{\varepsilon}(y,z)f_{\theta}(y\,|\,z)\,d\mu_{y}=\phi_{z}(\theta)\quad\text{for all }\theta\in\Theta,
\]
\[
(4.5)\quad \int_{a_{z}(\theta_{0})}^{b_{z}(\theta_{0})}\{\hat g_{\varepsilon}(y,z)-\phi_{z}(\theta_{0})\}^{2}f_{\theta_{0}}(y\,|\,z)\,d\mu_{y}\le v_{k}(\theta_{0}\,|\,z)+\varepsilon .
\]

Since by (4.2)

\[
\int_{c}^{d}\int_{a_{z}(\theta)}^{b_{z}(\theta)}\hat g_{\varepsilon}(y,z)f_{\theta}(y\,|\,z)h_{\theta}(z)\,d\mu_{y}\,d\mu_{z}=g(\theta)
\]

for all θ ∈ Θ, it follows from (4.5) that

\[
(4.6)\quad \int_{c}^{d}\int_{a_{z}(\theta_{0})}^{b_{z}(\theta_{0})}\{\hat g_{\varepsilon}(y,z)-\phi_{z}(\theta_{0})\}^{2}f_{\theta_{0}}(y\,|\,z)h_{\theta_{0}}(z)\,d\mu_{y}\,d\mu_{z}
\le\int_{c}^{d}v_{k}(\theta_{0}\,|\,z)h_{\theta_{0}}(z)\,d\mu_{z}+\varepsilon .
\]

By the condition (A.4.8), v_k(θ₀|z) = 0 for a.a. z [μ_z]. From (4.6) we obtain
\[
(4.7)\quad \int_{c}^{d}\int_{a_{z}(\theta_{0})}^{b_{z}(\theta_{0})}\{\hat g_{\varepsilon}(y,z)-g(\theta_{0})\}^{2}f_{\theta_{0}}(y\,|\,z)h_{\theta_{0}}(z)\,d\mu_{y}\,d\mu_{z}<\varepsilon .
\]

Putting ĝ_ε(x₁, …, x_n) = ĝ_ε(y, z), we have by (4.4) and (4.7)

\[
E_{\theta}(\hat g_{\varepsilon})=g(\theta)\quad\text{for all }\theta\in\Theta,\qquad
V_{\theta_{0}}(\hat g_{\varepsilon})<\varepsilon .
\]

Letting ε → 0, we obtain

\[
\inf_{\hat g:\ \text{unbiased}}V_{\theta_{0}}(\hat g)=0.
\]
Thus we complete the proof.

Now we shall find a function φ_z(θ) satisfying the condition (A.4.8) when Θ = R¹, g(θ) = θ and μ_z is the Lebesgue measure. Without loss of generality, we put θ₀ = 0. We define

\[
(4.8)\quad \phi_{z}(\theta)=\frac{M\,\mathrm{sgn}\,\theta\,|\theta|^{k+2}(z-c)^{k}}
{h_{\theta}(z)\bigl\{(d-z)^{k+2}+|\theta|^{k+2}(z-c)^{k+2}\bigr\}},
\]

where M is a constant and k is a positive integer. Then we have

\[
\int_{c}^{d}\phi_{z}(\theta)h_{\theta}(z)\,dz
=M\,\mathrm{sgn}\,\theta\,|\theta|^{k+2}\int_{c}^{d}\frac{(z-c)^{k}}{(d-z)^{k+2}+|\theta|^{k+2}(z-c)^{k+2}}\,dz
\]
\[
=\frac{M\,\mathrm{sgn}\,\theta\,|\theta|^{k+2}}{d-c}\int_{0}^{\infty}\frac{u^{k}}{1+|\theta|^{k+2}u^{k+2}}\,du
\qquad\Bigl(\text{after the transformation } u=\frac{z-c}{d-z}\Bigr)
\]
\[
=\frac{M\,\mathrm{sgn}\,\theta\,|\theta|}{d-c}\int_{0}^{\infty}\frac{v^{k}}{1+v^{k+2}}\,dv
=\frac{MK\theta}{d-c},
\]

where \(K=\int_{0}^{\infty}v^{k}/(1+v^{k+2})\,dv\) is a constant. If we put M = (d−c)/K, then

\[
\int_{c}^{d}\phi_{z}(\theta)h_{\theta}(z)\,dz=\theta .
\]

And it is easily seen that φ_z(0) = 0 for a.a. z and φ_z^{(i)}(0) = 0 for a.a. z (i = 1, …, k).
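The normalization in (4.8) can be checked numerically. Below we use the standard evaluation K = (π/(k+2))/sin(π/(k+2)) of the constant K (a known closed form, not stated in the paper) and verify that the z-integral reproduces θ; the function name and grid are our own.

```python
import math

# Verify: with M = (d-c)/K, the integral of phi_z(theta) h_theta(z) over (c, d)
# equals theta, for both signs of theta.  The h_theta factor cancels inside
# phi_z h_theta, so only the rational integrand below remains.
def phi_integral(theta, k, c, d, m=200_000):
    K = (math.pi / (k + 2)) / math.sin(math.pi / (k + 2))
    M = (d - c) / K
    a = abs(theta)
    h = (d - c) / m
    s = 0.0
    for i in range(m):
        z = c + (i + 0.5) * h
        s += (z - c)**k / ((d - z)**(k + 2) + a**(k + 2) * (z - c)**(k + 2)) * h
    return M * math.copysign(a**(k + 2), theta) * s
```

A midpoint rule suffices because the integrand is bounded on (c, d) for θ ≠ 0.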
Thus it is shown that φ_z(θ) given by (4.8) satisfies the condition (A.4.8).

We consider estimation of the location parameter θ. Let X₁ and X₂ be independently and identically distributed with a density function f(x, θ) of the form f(x − θ) which satisfies the following:

\[
f(x)>0\ \text{for } a<x<b,\qquad f(x)=0\ \text{for } x\le a,\ x\ge b,
\]

and \(\lim_{x\to a+0}f(x)>0\), \(\lim_{x\to b-0}f(x)>0\), and f(x) is continuously differentiable in the open interval (a, b). We define

\[
Y=\tfrac12(X_{1}+X_{2}),\qquad Z=\tfrac12(X_{1}-X_{2}).
\]

Then if the conditions (A.4.1) to (A.4.8) are assumed, we have

\[
\inf_{\hat\theta:\ \text{unbiased}}V_{\theta_{0}}(\hat\theta)=0.
\]

Note that Z is an ancillary statistic, but that (Y, Z) is not sufficient unless f(x) is constant for a < x < b.
5. A second type approach to obtain the lower bound for the variance of unbiased estimators

Suppose that (X, Y) is a pair of real random variables with a joint density function f(x, y, θ) (with respect to the Lebesgue measure μ) which has the product set (0, a(θ)) × (0, b(θ)) of two open intervals as its support A(θ), where a(θ) is a monotone increasing function and b(θ) is a monotone decreasing function. We assume the condition

\[
\text{(A.5.1)}\quad \inf_{(x,y)\in A(\theta)}f(x,y,\theta)>0 .
\]

Let the marginal density functions of X and Y be f₁(x, θ) and f₂(y, θ) with respect to the Lebesgue measures μ_x and μ_y, respectively. We further make the following assumption:

(A.5.2) The density functions f₁(x, θ) and f₂(y, θ) are continuously differentiable in θ and satisfy the conditions (A.3.5), (A.3.6) and (A.3.7) for k = 1 when f₁(x, θ) and f₂(y, θ) are substituted for f(x, θ).

In the following theorem we shall show that the sharp bound is equal to zero.

THEOREM 5.1. Let Θ = R¹. Suppose that X and Y are random variables with a joint density function f(x, y, θ) (with respect to a σ-finite measure μ) satisfying (A.5.1) and (A.5.2) for a fixed θ₀. Let g(θ) be continuously differentiable over Θ. Let ĝ(x, y) be an unbiased estimator of g(θ). Then

\[
\inf_{\hat g:\ \text{unbiased}}V_{\theta_{0}}(\hat g)=0 .
\]
PROOF. We first define ĝ(x, y) = g(θ₀) for (x, y) ∈ A(θ₀). In order to extend ĝ(x, y) for (x, y) outside A(θ₀) using the unbiasedness condition, we consider unbiased estimators ĝ₁(x) and ĝ₂(y) of g(θ) with respect to f₁(x, θ) and f₂(y, θ), respectively, such that ĝ₁(x) = g(θ₀) for 0 < x ≤ a(θ₀) and ĝ₂(y) = g(θ₀) for 0 < y ≤ b(θ₀). For θ > θ₀, i.e., a(θ) > a(θ₀), we put

\[
h_{1}(\theta)=g(\theta_{0})\int_{0}^{a(\theta_{0})}f_{1}(x,\theta)\,d\mu_{x},
\]

and also for θ < θ₀, i.e., b(θ) > b(θ₀),

\[
h_{2}(\theta)=g(\theta_{0})\int_{0}^{b(\theta_{0})}f_{2}(y,\theta)\,d\mu_{y}.
\]

Since ĝ₁(X) and ĝ₂(Y) are unbiased estimators of g(θ), it follows that

\[
(5.1)\quad \int_{a(\theta_{0})}^{a(\theta)}\hat g_{1}(x)f_{1}(x,\theta)\,d\mu_{x}=g(\theta)-h_{1}(\theta)\quad\text{for all }\theta\ge\theta_{0},
\]
\[
\phantom{(5.1)}\quad \int_{b(\theta_{0})}^{b(\theta)}\hat g_{2}(y)f_{2}(y,\theta)\,d\mu_{y}=g(\theta)-h_{2}(\theta)\quad\text{for all }\theta\le\theta_{0}.
\]

Since the supports of the density functions f₁(x, θ) and f₂(y, θ) are the open intervals (0, a(θ)) and (0, b(θ)), respectively, it follows from (A.5.1) that

\[
(5.2)\quad 0<\lim_{x\to a(\theta)-0}f_{1}(x,\theta)=a_{1}(\theta)\ \text{(say)},\qquad
0<\lim_{y\to b(\theta)-0}f_{2}(y,\theta).
\]

Differentiating both sides of (5.1), we have by (A.5.2)

\[
(5.3)\quad \hat g_{1}(a(\theta))\,a_{1}(\theta)\,a'(\theta)
+\int_{a(\theta_{0})}^{a(\theta)}\hat g_{1}(x)\Bigl\{\frac{\partial}{\partial\theta}f_{1}(x,\theta)\Bigr\}d\mu_{x}
=g'(\theta)-h_{1}'(\theta)
\]
for all θ ≥ θ₀. Since the equation (5.3) is of Volterra's second type, the solution ĝ₁(x) exists for all x > a(θ₀). If ĝ₁(x) satisfies (5.3), then it also satisfies (5.1) since g(θ₀) = h₁(θ₀). Similarly we can construct the unbiased estimator ĝ₂(y) for all y > b(θ₀). We define an estimator

\[
(5.4)\quad \hat g(x,y)=
\begin{cases}
g(\theta_{0}) & \text{for } 0<x\le a(\theta_{0}),\ 0<y\le b(\theta_{0}),\\
\hat g_{1}(x) & \text{for } a(\theta_{0})<x,\ 0<y\le b(\theta_{0}),\\
\hat g_{2}(y) & \text{for } 0<x\le a(\theta_{0}),\ b(\theta_{0})<y .
\end{cases}
\]

Then ĝ(X, Y) is an unbiased estimator of g(θ) with variance 0 at θ = θ₀.
Indeed, we have from (5.1) and (5.4), for all θ ≥ θ₀,

\[
E_{\theta}[\hat g(X,Y)]
=\int_{0}^{a(\theta_{0})}\!\!\int_{0}^{b(\theta)}g(\theta_{0})f(x,y,\theta)\,d\mu
+\int_{a(\theta_{0})}^{a(\theta)}\!\!\int_{0}^{b(\theta)}\hat g_{1}(x)f(x,y,\theta)\,d\mu
\]
\[
=g(\theta_{0})\int_{0}^{a(\theta_{0})}f_{1}(x,\theta)\,d\mu_{x}
+\int_{a(\theta_{0})}^{a(\theta)}\hat g_{1}(x)f_{1}(x,\theta)\,d\mu_{x}=g(\theta).
\]

Similarly we have E_θ[ĝ(X, Y)] = g(θ) for all θ ≤ θ₀. Hence we see that ĝ(X, Y) is an unbiased estimator of g(θ). We also have, for all θ ≥ θ₀,

\[
V_{\theta}(\hat g(X,Y))
=g^{2}(\theta_{0})\int_{0}^{a(\theta_{0})}f_{1}(x,\theta)\,d\mu_{x}
+\int_{a(\theta_{0})}^{a(\theta)}\hat g_{1}^{2}(x)f_{1}(x,\theta)\,d\mu_{x}-g^{2}(\theta).
\]
When θ = θ₀, we obtain V_{θ₀}(ĝ(X, Y)) = 0. Thus we complete the proof.

We can give the following example.

Example 5.1. Let X₁, …, X_n and Y₁, …, Y_n be independently and identically distributed random variables with the uniform distributions U(0, θ) and U(0, 1/θ), respectively. Put

\[
T_{1}=\max_{1\le i\le n}X_{i},\qquad T_{2}=\max_{1\le i\le n}Y_{i}.
\]

Then the unbiased estimator θ̂(T₁, T₂) of θ with variance 0 at θ = 1 is given by

\[
(5.5)\quad \hat\theta(t_{1},t_{2})=
\begin{cases}
\hat\theta_{1}(t_{1}) & \text{for } 1\le t_{1},\ 0<t_{2}<1,\\
\hat\theta_{2}(t_{2}) & \text{for } 1\le t_{2},\ 0<t_{1}<1,\\
1 & \text{for } t_{1}<1,\ t_{2}<1,
\end{cases}
\]

where

\[
\hat\theta_{1}(t_{1})=\Bigl(1+\frac1n\Bigr)t_{1}\ \text{for } 1\le t_{1},\qquad
\hat\theta_{2}(t_{2})=\Bigl(1-\frac1n\Bigr)\frac{1}{t_{2}}\ \text{for } 1\le t_{2}.
\]
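As a quick numerical sanity check (ours, not in the paper): for θ ≥ 1 we have T₂ < 1 almost surely, so the estimator reduces to a function of T₁ alone, and its unbiasedness and the variance formula (5.7) below can be verified by deterministic integration against the density of T₁.

```python
# For theta >= 1, T1 = max X_i has density n t^(n-1) / theta^n on (0, theta),
# and theta_hat = (1 + 1/n) t1 if t1 >= 1, else 1.  Midpoint integration gives
# the mean and variance; the variance should match
#   (1 - (n+1)^2 / (n(n+2))) * (theta^(-n) - theta^2).
def check(n, theta, m=400_000):
    est = lambda t: (1 + 1/n) * t if t >= 1 else 1.0
    h = theta / m
    mean = second = 0.0
    for i in range(m):
        t = (i + 0.5) * h
        w = n * t**(n - 1) / theta**n * h   # density times dt
        mean += est(t) * w
        second += est(t)**2 * w
    var = second - mean**2
    formula = (1 - (n + 1)**2 / (n * (n + 2))) * (theta**(-n) - theta**2)
    return mean, var, formula

mean, var, formula = check(5, 2.0)
```

Here n = 5 and θ = 2 are arbitrary illustrative choices.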
Indeed, we can easily see that the estimator θ̂(T₁, T₂) is unbiased. We also have, for 0 < θ ≤ 1,

\[
(5.6)\quad V_{\theta}(\hat\theta(T_{1},T_{2}))=\Bigl\{1-\frac{(n-1)^{2}}{n(n-2)}\Bigr\}(\theta^{n}-\theta^{2}),
\]

and, for θ ≥ 1,

\[
(5.7)\quad V_{\theta}(\hat\theta(T_{1},T_{2}))=\Bigl\{1-\frac{(n+1)^{2}}{n(n+2)}\Bigr\}(\theta^{-n}-\theta^{2}).
\]

From (5.6) and (5.7) we obtain V₁(θ̂(T₁, T₂)) = 0. Thus the unbiased estimator θ̂(T₁, T₂) given by (5.5) has variance 0 at θ = 1.

As a further case of this situation, let X and Y be independent real random variables with density functions (1/θ)f(x/θ) and θf(θy) (with respect to the Lebesgue measure μ), where θ is a positive-valued parameter, which satisfy the following:

(A.5.3)
\[
f(x)>0\ \text{for } 0<x<1,\qquad f(x)=0\ \text{otherwise},
\]

and f(x) is (p+1)-times continuously differentiable in the open interval (0, 1), and for each i = 0, 1, …, p

\[
0<\lim_{x\to 0+0}\frac{f^{(i)}(x)}{x^{p-i}}<\infty,\qquad
0<\lim_{x\to 1-0}\frac{f^{(i)}(x)}{(1-x)^{p-i}}<\infty .
\]
By Theorem 4.1 we have the following:

THEOREM 5.2. Let g(θ) be an estimable function which is (p+1)-times differentiable over R¹. Let ĝ(X, Y) be an unbiased estimator of g(θ). If the conditions (A.5.3) and (A.4.8) on p_θ(x|t) hold, then

\[
\inf_{\hat g:\ \text{unbiased}}V_{\theta_{0}}(\hat g)=0,
\]

where p_θ(x|t) denotes the conditional density function of X given XY = t.
PROOF. Letting T = XY, we have the conditional density function p_θ(x|t) of X given T = t:

\[
(5.8)\quad p_{\theta}(x\,|\,t)=
\begin{cases}
\dfrac{\dfrac1x f\Bigl(\dfrac{x}{\theta}\Bigr)f\Bigl(\dfrac{\theta t}{x}\Bigr)}
{\displaystyle\int_{\theta t}^{\theta}\frac1x f\Bigl(\frac{x}{\theta}\Bigr)f\Bigl(\frac{\theta t}{x}\Bigr)dx}
& \text{for } \theta t<x<\theta,\\[3mm]
0 & \text{otherwise},
\end{cases}
\]

for almost all t [μ]. Since for θ = 1 and almost all t [μ]

\[
p_{1}(x\,|\,t)=
\begin{cases}
c_{t}\,\dfrac1x\,f(x)f\Bigl(\dfrac{t}{x}\Bigr) & \text{for } t<x<1,\\
0 & \text{otherwise},
\end{cases}
\]
we obtain from (5.8)

\[
p_{\theta}(x\,|\,t)=\frac1\theta\,p_{1}\Bigl(\frac{x}{\theta}\,\Big|\,t\Bigr)
\quad\text{for all }\theta\text{ and a.a. }t\ [\mu],
\]

where \(c_{t}=\bigl(\int_{t}^{1}\frac1x f(x)f(t/x)\,dx\bigr)^{-1}\). Putting

\[
g_{t}(x)=
\begin{cases}
c_{t}\,\dfrac1x\,f\Bigl(\dfrac{t}{x}\Bigr) & \text{for } t<x<1,\\
0 & \text{otherwise},
\end{cases}
\]

for almost all t [μ], we have

\[
(5.9)\quad p_{1}(x\,|\,t)=f(x)g_{t}(x),\quad\text{a.a. }t\ [\mu];
\]
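The scaling relation p_θ(x|t) = (1/θ)p₁(x/θ|t) can be checked numerically; the density f(u) = 6u(1−u) below is an assumed toy choice (it need not satisfy (A.5.3) for the scaling identity to hold), and the function names are ours.

```python
def f(u):
    # an assumed density on (0, 1), for illustration only
    return 6 * u * (1 - u) if 0 < u < 1 else 0.0

def p_cond(theta, t, x, m=50_000):
    # conditional density of X given XY = t, as in (5.8): proportional to
    # f(x/theta) * f(theta*t/x) / x on (theta*t, theta), normalized numerically.
    lo, hi = theta * t, theta
    h = (hi - lo) / m
    norm = 0.0
    for i in range(m):
        u = lo + (i + 0.5) * h
        norm += f(u / theta) * f(theta * t / u) / u * h
    return f(x / theta) * f(theta * t / x) / x / norm

theta, t, x = 1.5, 0.3, 0.8
lhs = p_cond(theta, t, x)
rhs = p_cond(1.0, t, x / theta) / theta
```

The two values agree up to quadrature error, which is the content of the scaling relation.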
hence the same conditions on p₁(x|t) as (A.4.1) and (A.4.2) hold. Indeed, we have by (A.5.3) and (5.9)

\[
(5.10)\quad \lim_{x\to t+0}p_{1}(x\,|\,t)=f(t+0)g_{t}(t+0)=\frac{c_{t}}{t}f(t+0)f(1-0)=0,\qquad
\lim_{x\to 1-0}p_{1}(x\,|\,t)=f(1-0)g_{t}(1-0)=0,
\]

for almost all t [μ]. By Leibniz's formula we obtain

\[
(5.11)\quad \frac{\partial^{i}}{\partial x^{i}}p_{1}(x\,|\,t)
=\sum_{j=0}^{i}\binom{i}{j}f^{(j)}(x)\,g_{t}^{(i-j)}(x),\quad\text{a.a. }t\ [\mu].
\]

By (A.5.3) we have, for i = 1, …, p−1 and a.a. t [μ],

\[
(5.12)\quad \lim_{x\to 1-0}\frac{\partial^{i}}{\partial x^{i}}p_{1}(x\,|\,t)=0,\qquad
\lim_{x\to 1-0}\frac{\partial^{p}}{\partial x^{p}}p_{1}(x\,|\,t)=c_{t}^{*}\ne 0,
\]

where c_t* is finite. Since

\[
g_{t}^{(i)}(x)=c_{t}\sum_{j=0}^{i}(-1)^{j}\binom{i}{j}\frac{j!}{x^{j+1}}\,
\frac{\partial^{i-j}}{\partial x^{i-j}}f\Bigl(\frac{t}{x}\Bigr),
\]

it follows by (A.5.3) and (5.11) that, for almost all t [μ],

\[
(5.13)\quad \lim_{x\to t+0}\frac{\partial^{i}}{\partial x^{i}}p_{1}(x\,|\,t)=0\quad(i=1,\dots,p-1),\qquad
\lim_{x\to t+0}\frac{\partial^{p}}{\partial x^{p}}p_{1}(x\,|\,t)=D_{t}\ne 0,
\]
where D_t is finite. It is seen by (5.9), (5.10), (5.12) and (5.13) that the same condition on p₁(x|t) as (A.4.2) holds. For i = 1, …, [p/2],

\[
0<\int_{t}^{1}\frac{\bigl\{(\partial^{i}/\partial x^{i})p_{1}(x\,|\,t)\bigr\}^{2}}{p_{1}(x\,|\,t)}\,dx<\infty,
\quad\text{a.a. }t\ [\mu],
\]

and

\[
\int_{t}^{1}\frac{\bigl\{\sum_{i=[p/2]+1}^{p}c_{i}(\partial^{i}/\partial x^{i})p_{1}(x\,|\,t)\bigr\}^{2}}{p_{1}(x\,|\,t)}\,dx,
\quad\text{a.a. }t\ [\mu],
\]

is infinite unless c_{[p/2]+1} = … = c_p = 0, where [s] denotes the largest integer less than or equal to s, since, as x → t+0 or x → 1−0, the sum Σᵢ cᵢ(∂^i/∂x^i)p₁(x|t) behaves like a polynomial in x−t or 1−x of degree p−i* when c_{i*} ≠ 0 and c_{i*+1} = … = c_p = 0, while the denominator behaves like one of degree p. Hence the same condition on p₁(x|t) as (A.4.4) holds for k = [p/2]. And also from (A.5.1) it is seen that, as x → t+0 or x → 1−0, the functions {(∂^i/∂x^i)p₁(x|t)}/p₁(x|t) (i = 0, 1, …, k) approach polynomials of different degrees; hence they are linearly independent. Putting

\[
\varphi_{1}(x)=\sup_{x':\,|x'-x|<\varepsilon}p_{1}(x'\,|\,t)
\]

for an appropriate ε > 0, the same conditions on p₁(x|t) as (A.4.5) to (A.4.7) hold. By Theorem 4.1 we obtain the conclusion of Theorem 5.2.

When g(θ) = θ, Example 5.1 can also serve as an example of Theorem 5.2.
Acknowledgements. The authors wish to thank the associate editor and the referee for useful comments.

University of Electro-Communications, Tokyo*
University of Tokyo
REFERENCES
[1] Akahira, M., Puri, M. L. and Takeuchi, K. (1984). Bhattacharyya bound of variances of unbiased estimators in non-regular cases, Ann. Inst. Statist. Math., 38, 35-44.
[2] Chapman, D. G. and Robbins, H. (1951). Minimum variance estimation without regularity assumptions, Ann. Math. Statist., 22, 581-586.
[3] Fraser, D. A. S. and Guttman, I. (1952). Bhattacharyya bounds without regularity assumptions, Ann. Math. Statist., 23, 629-632.
[4] Kiefer, J. (1952). On minimum variance in non-regular estimation, Ann. Math. Statist., 23, 627-629.
[5] Móri, T. F. (1983). Note on the Cramér-Rao inequality in the non-regular case: The family of uniform distributions, J. Statist. Plann. Inference, 7, 353-358.
[6] Morimoto, H. and Sibuya, M. (1967). Sufficient statistics and unbiased estimation of a restricted selection parameter, Sankhyā, A27, 15-40.
[7] Takeuchi, K. and Akahira, M. (1983). A note on minimum variance, Metrika, 33, 85-91.
[8] Vincze, I. (1979). On the Cramér-Fréchet-Rao inequality in the non-regular case, in Contributions to Statistics, the Jaroslav Hájek Memorial Volume, Academia, Prague, 253-262.

* Now at University of Tsukuba.
STATISTICAL THEORY AND DATA ANALYSIS II
K. Matusita (Editor)
© Elsevier Science Publishers B.V. (North-Holland), 1988
SECOND ORDER ASYMPTOTIC EFFICIENCY IN TERMS OF ASYMPTOTIC VARIANCES OF THE SEQUENTIAL MAXIMUM LIKELIHOOD ESTIMATION PROCEDURES

Kei TAKEUCHI and Masafumi AKAHIRA
Under suitable regularity conditions, the Bhattacharyya type bound for asymptotic variances of estimation procedures is obtained. It is also shown that the modified maximum likelihood estimation procedure attains the bound if the stopping rule is properly determined.
1. INTRODUCTION

Sequential estimation procedures can be defined in two stages: (a) definition of a stopping rule, and (b) definition of the estimation procedure once the stopping rule is determined. In this paper we restrict ourselves to the class of estimation procedures related with a sequence of sequential sampling procedures where the sample size tends stochastically to infinity and is asymptotically constant in the sense that its coefficient of variation tends to zero. Then we can show that the asymptotic variance of the estimator must satisfy the Bhattacharyya type bound, and, for the one-parameter case, the modified maximum likelihood (ML) estimation procedure attains the bound if the stopping rule is properly defined. The above results can be interpreted to mean that the asymptotic loss (deficiency) of the ML estimation procedure can be reduced to zero when we apply an appropriate sequential estimation procedure. This fact can be established also for the asymptotic distribution of the sequential ML estimation procedure, which will be dealt with in a subsequent paper by the same authors.
2. BHATTACHARYYA BOUND AND MAXIMUM LIKELIHOOD ESTIMATION PROCEDURE

Let X₁, X₂, …, X_n, … be a sequence of independent and identically distributed random variables with a density function f(x, θ) with respect to a σ-finite measure μ, where θ is a real-valued parameter. Suppose that the sample size n is determined according to some sequential rule. Actually, we consider a sequence of sequential estimation procedures {Π_a : a = 1, 2, …} such that E_{θ,a}(n) = ν_a(θ) becomes large uniformly in θ as a → ∞, where for each a we define a stopping rule and estimators based on it and consider the asymptotic distribution of \(\sqrt{\nu_{a}}\,(\hat\theta_{a}-\theta)\) as a → ∞. For simplicity we denote ν_a(θ) by ν. In order to consider the second order asymptotic efficiency, we assume the following conditions.

(A.1) E_θ(n) = ν + o(1), V_θ(n)/ν = O(1), E_θ(n^k/ν^k) = O(1) (k = 2, 3, 4), {(∂/∂θ)ν_a(θ)}/ν_a(θ) = O(1) and {(∂²/∂θ²)ν_a(θ)}/ν_a(θ) = O(1).
(A.2) {x : f(x, θ) > 0} does not depend on θ.

(A.3) For almost all x [μ], f(x, θ) is four times continuously differentiable in θ. In the Taylor expansion

\[
\log\frac{f(x,\theta+h)}{f(x,\theta)}=\sum_{i=1}^{3}\frac{h^{i}}{i!}\,\ell^{(i)}(\theta,x)+h^{4}R(x,h),
\]

R(x, h) is uniformly bounded by a function G(x) which has moments up to the fourth order, where ℓ^{(i)}(θ, x) = (∂^i/∂θ^i)ℓ(θ, x) (i = 1, 2, 3, 4) and ℓ(θ, x) = log f(x, θ).

(A.4) For each θ, 0 < I(θ) = E_θ[{ℓ^{(1)}(θ, X)}²] < ∞, and

\[
J(\theta)=E_{\theta}[\ell^{(1)}(\theta,X)\ell^{(2)}(\theta,X)],\qquad
K(\theta)=E_{\theta}[\{\ell^{(1)}(\theta,X)\}^{3}],
\]
\[
L(\theta)=E_{\theta}[\ell^{(1)}(\theta,X)\ell^{(3)}(\theta,X)],\qquad
M(\theta)=E_{\theta}[\{\ell^{(2)}(\theta,X)\}^{2}]-I^{2}(\theta),
\]
\[
N(\theta)=E_{\theta}[\{\ell^{(1)}(\theta,X)\}^{2}\ell^{(2)}(\theta,X)]+I^{2}(\theta),\qquad
H(\theta)=E_{\theta}[\{\ell^{(1)}(\theta,X)\}^{4}]-3I^{2}(\theta),
\]

and both J(θ) and K(θ) are differentiable in θ, and E_θ[ℓ^{(3)}(θ, X)] = −3J(θ) − K(θ) and E_θ[ℓ^{(4)}(θ, X)] = 4L(θ) + 3M(θ) + 6N(θ) + H(θ).
Z,"= ¢1 and
)(o,x ), Z2, = 1- Y {(2)(e, x.)+ r(e)), vi=1 -Vvi=1 Z 3'V 1_ {e«'(e, X)-3J(e)-K(e)}. s,^ Vv;=1
The following lemma is very useful for calculations of cumulants.

LEMMA 2.1. Suppose that Y_θ is a function of X₁, …, X_n and θ and is differentiable in θ. Then

\[
E_{\theta}(Z_{1,\nu}Y_{\theta})
=\frac{1}{\sqrt{\nu}}\frac{d}{d\theta}E_{\theta}(Y_{\theta})
-\frac{1}{\sqrt{\nu}}E_{\theta}\Bigl(\frac{\partial Y_{\theta}}{\partial\theta}\Bigr)
\]

and

\[
E_{\theta}(Z_{1,\nu}^{2}Y_{\theta})
=\frac{1}{\sqrt{\nu}}\frac{d}{d\theta}E_{\theta}(Z_{1,\nu}Y_{\theta})
-\frac{1}{\sqrt{\nu}}E_{\theta}\Bigl(Y_{\theta}\frac{\partial Z_{1,\nu}}{\partial\theta}\Bigr)
-\frac{1}{\sqrt{\nu}}E_{\theta}\Bigl(Z_{1,\nu}\frac{\partial Y_{\theta}}{\partial\theta}\Bigr),
\]

provided that differentiation under the integral signs of E_θ(Y_θ) and E_θ(Z_{1,ν}Y_θ) is allowed. The proof is omitted since the lemma is similar to Lemmas 5.1.1 and 5.1.2 in Akahira and Takeuchi [2] (see also Lemmas 2.1.1 and 2.1.2 in Akahira [1]).
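The mechanism behind the first identity can be sketched as follows (our reconstruction, with n momentarily treated as fixed; the sequential case combines this with Wald-type identities). Differentiating \(E_\theta(Y_\theta)=\int Y_\theta \prod_i f(x_i,\theta)\,d\mu\) under the integral sign brings down the score sum:

```latex
\frac{d}{d\theta}E_\theta(Y_\theta)
 =E_\theta\!\Bigl(\frac{\partial Y_\theta}{\partial\theta}\Bigr)
  +E_\theta\!\Bigl(Y_\theta\sum_{i=1}^{n}\ell^{(1)}(\theta,X_i)\Bigr)
 =E_\theta\!\Bigl(\frac{\partial Y_\theta}{\partial\theta}\Bigr)
  +\sqrt{\nu}\,E_\theta\bigl(Z_{1,\nu}Y_\theta\bigr).
```

Solving for \(E_\theta(Z_{1,\nu}Y_\theta)\) gives the first identity; applying the same step with Y_θ replaced by Z_{1,ν}Y_θ gives the second.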
In the following theorem we have the Bhattacharyya type bound for asymptotic variances.

THEOREM 2.1. Assume that the conditions (A.1) to (A.4) hold. Then for any asymptotically unbiased estimator θ̂_n of θ, i.e., E_θ(θ̂_n) = θ + o(1/ν), it holds that

\[
V_{\theta}(\sqrt{\nu}\,(\hat\theta_{n}-\theta))
\ge\frac{1}{I(\theta)}
+\frac{1}{2\nu(\theta)I^{2}(\theta)}\Bigl\{\frac{J(\theta)+K(\theta)}{I(\theta)}+\frac{2\nu'(\theta)}{\nu(\theta)}\Bigr\}^{2}
+o\Bigl(\frac{1}{\nu(\theta)}\Bigr),
\]

where E_θ(·) and V_θ(·) designate the asymptotic mean and the asymptotic variance, respectively, and ν′(θ) = (∂/∂θ)ν(θ).

REMARK. In the above theorem, for each k = 0, 1, 2, … we mean by the asymptotic mean of Z_ν up to the order ν^{−k} the quantity μ_ν such that Z_ν = Z̃_ν + o_p(ν^{−k}) and E(Z̃_ν) = μ_ν; note that the above lemma also holds for the asymptotic mean.
PROOF. Since E_θ(θ̂_n) = θ + o(1/ν), we have √ν E_θ(θ̂_n Z_{1,ν}) = 1, and, since

\[
\sum_{i=1}^{n}\frac{\partial^{2}}{\partial\theta^{2}}\log f(X_{i},\theta)=\sqrt{\nu}\,Z_{2,\nu}-nI,
\]

it follows that

\[
E_{\theta}[\sqrt{\nu}\,(\hat\theta_{n}-\theta)Z_{1,\nu}]=1+o(1),\qquad
E_{\theta}\Bigl[\sqrt{\nu}\,(\hat\theta_{n}-\theta)\Bigl(\sqrt{\nu}\,Z_{1,\nu}^{2}+Z_{2,\nu}-\frac{n}{\sqrt{\nu}}I\Bigr)\Bigr]=o(1),
\]

where ν, ν′ and I denote ν(θ), ν′(θ) = (d/dθ)ν(θ) and I(θ), respectively; similarly, J and K below stand for J(θ) and K(θ). Then we define the extended information matrix B by

\[
B=\begin{pmatrix}
E_{\theta}(Z_{1,\nu}^{2}) & E_{\theta}\bigl[Z_{1,\nu}\bigl(\sqrt{\nu}Z_{1,\nu}^{2}+Z_{2,\nu}-\tfrac{n}{\sqrt{\nu}}I\bigr)\bigr]\\[1mm]
E_{\theta}\bigl[Z_{1,\nu}\bigl(\sqrt{\nu}Z_{1,\nu}^{2}+Z_{2,\nu}-\tfrac{n}{\sqrt{\nu}}I\bigr)\bigr] &
E_{\theta}\bigl[\bigl(\sqrt{\nu}Z_{1,\nu}^{2}+Z_{2,\nu}-\tfrac{n}{\sqrt{\nu}}I\bigr)^{2}\bigr]
\end{pmatrix}.
\]

Since by Lemma 2.1 and Wald's identity [3]

\[
E_{\theta}(Z_{1,\nu}^{2})=I(\theta)+o(1),\qquad
E_{\theta}\Bigl[Z_{1,\nu}\Bigl(\sqrt{\nu}Z_{1,\nu}^{2}+Z_{2,\nu}-\frac{n}{\sqrt{\nu}}I\Bigr)\Bigr]=J+K+\frac{2I\nu'}{\nu}+o(1),
\]
\[
E_{\theta}\Bigl[\Bigl(\sqrt{\nu}Z_{1,\nu}^{2}+Z_{2,\nu}-\frac{n}{\sqrt{\nu}}I\Bigr)^{2}\Bigr]
=E_{\theta}\Bigl[\nu Z_{1,\nu}^{4}-2InZ_{1,\nu}^{2}+\frac{n^{2}}{\nu}I^{2}+2\sqrt{\nu}\,Z_{1,\nu}^{2}Z_{2,\nu}+\cdots\Bigr]
=3\nu I^{2}-2\nu I^{2}+\nu I^{2}+o(\nu)=2\nu I^{2}+o(\nu),
\]

it follows that

\[
B^{-1}=\begin{pmatrix}B^{11} & *\\ * & *\end{pmatrix},
\]

where

\[
B^{11}=\frac{1}{I}\Bigl\{1-\frac{(J+K+2I\nu'/\nu)^{2}}{2\nu I^{3}}\Bigr\}^{-1}+o\Bigl(\frac1\nu\Bigr)
=\frac{1}{I}+\frac{1}{2\nu I^{2}}\Bigl(\frac{J+K}{I}+\frac{2\nu'}{\nu}\Bigr)^{2}+o\Bigl(\frac1\nu\Bigr).
\]
Hence we have V_θ(√ν(θ̂_n − θ)) ≥ B¹¹. This completes the proof.

In order to show that the modified ML estimation procedure attains the Bhattacharyya type bound given in the above theorem, we need the following lemma.
LEMMA 2.2. Assume that the condition (A.1) holds. If

\[
\sqrt{\nu}\,(\hat\theta_{n}-\theta)=\frac{1}{I(\theta)}Z_{1,\nu}+\frac{1}{\sqrt{\nu}}Q+o_{p}\Bigl(\frac{1}{\sqrt{\nu}}\Bigr),
\]

where Q = O_p(1), then

\[
V_{\theta}(\sqrt{\nu}\,(\hat\theta_{n}-\theta))=\frac{1}{I(\theta)}+\frac{1}{\nu}V_{\theta}(Q)+o\Bigl(\frac1\nu\Bigr).
\]

PROOF. Putting T = √ν(θ̂_n − θ), we have

\[
V_{\theta}(\sqrt{\nu}\,(\hat\theta_{n}-\theta))
=E_{\theta}\Bigl[\Bigl(T-\frac{1}{I(\theta)}Z_{1,\nu}\Bigr)^{2}\Bigr]
+\frac{2}{I(\theta)}E_{\theta}(Z_{1,\nu}T)-\frac{1}{I^{2}(\theta)}E_{\theta}(Z_{1,\nu}^{2})-\{E_{\theta}(T)\}^{2}
=\frac{1}{\nu}V_{\theta}(Q)+\frac{2}{I(\theta)}E_{\theta}(Z_{1,\nu}T)-\frac{1}{I(\theta)}+o\Bigl(\frac1\nu\Bigr).
\]

Since by Lemma 2.1

\[
E_{\theta}(Z_{1,\nu}T)
=\frac{1}{\sqrt{\nu}}\frac{d}{d\theta}E_{\theta}(T)-\frac{1}{\sqrt{\nu}}E_{\theta}\Bigl(\frac{\partial T}{\partial\theta}\Bigr)
=\frac{1}{\sqrt{\nu}}\frac{d}{d\theta}E_{\theta}(T)+1-E_{\theta}\Bigl[\frac{\nu'}{2\nu}\,(\hat\theta_{n}-\theta)\Bigr]
=1+o\Bigl(\frac1\nu\Bigr),
\]
the result follows.

Denote by θ̂_ML the maximum likelihood estimator of θ based on the sample (X₁, …, X_n). Let θ̂*_ML be the ML estimator modified so that E_θ(θ̂*_ML) = θ + o(ν⁻¹); generally this automatically ensures that E_θ(θ̂*_ML) = θ + o(ν^{−3/2}).

THEOREM 2.2. If the stopping rule is so determined that sampling is stopped at the n satisfying

\[
(2.1)\quad -\sum_{i=1}^{n}\ell^{(2)}(\hat\theta_{ML},X_{i})=\nu(\hat\theta_{ML})I(\hat\theta_{ML})+c(\hat\theta_{ML})+e
\]

with

\[
c(\theta)=\frac{J(\theta)\nu'(\theta)}{I(\theta)\nu(\theta)}+\frac{L(\theta)}{I(\theta)}+\frac{\nu''(\theta)}{2\nu(\theta)}
+\frac{1}{2I(\theta)}\{2L(\theta)+M(\theta)+N(\theta)\}
\]

and some random variable e with E_θ(e) = o(1), then the asymptotic variance of θ̂*_ML is given by

\[
V_{\theta}(\sqrt{\nu}\,(\hat\theta^{*}_{ML}-\theta))
=\frac{1}{I(\theta)}+\frac{1}{2\nu(\theta)I^{2}(\theta)}\Bigl\{\frac{J(\theta)+K(\theta)}{I(\theta)}+\frac{2\nu'(\theta)}{\nu(\theta)}\Bigr\}^{2}
+o\Bigl(\frac{1}{\nu(\theta)}\Bigr),
\]

which is equal to the Bhattacharyya type bound; that is, the modified maximum likelihood estimation procedure has the second order asymptotic efficiency property in terms of asymptotic variance.
PROOF. Since by (2.1)

\[
-\sum_{i=1}^{n}\ell^{(2)}(\hat\theta_{ML},X_{i})=\nu(\hat\theta_{ML})I(\hat\theta_{ML})+c(\hat\theta_{ML})+e,
\]

it follows by the Taylor expansion that

\[
(2.2)\quad -\sum_{i=1}^{n}\ell^{(2)}(\theta,X_{i})
-\Bigl\{\sum_{i=1}^{n}\ell^{(3)}(\theta,X_{i})\Bigr\}(\hat\theta_{ML}-\theta)
-\frac12\Bigl\{\sum_{i=1}^{n}\ell^{(4)}(\theta,X_{i})\Bigr\}(\hat\theta_{ML}-\theta)^{2}
\]
\[
=\nu I+(\nu'I+\nu I')(\hat\theta_{ML}-\theta)
+\frac12(\nu''I+2\nu'I'+\nu I'')(\hat\theta_{ML}-\theta)^{2}+c(\theta)+o_{p}(1),
\]

where I′ and ν″ are defined by I′ = I′(θ) = (d/dθ)I(θ) and ν″ = ν″(θ) = (∂²/∂θ²)ν(θ). Since I′ = 2J + K, I″ = 2L + 2M + 5N + H (using J′ = L + M + N and K′ = H + 3N), Σᵢℓ^{(2)} = √νZ_{2,ν} − nI and Σᵢℓ^{(3)} = √νZ_{3,ν} − n(3J+K), writing u = √ν(θ̂_ML − θ) we obtain from (2.2)

\[
nI-\sqrt{\nu}\,Z_{2,\nu}
+\Bigl\{\frac{n}{\sqrt{\nu}}(3J+K)-Z_{3,\nu}\Bigr\}u
-\frac{n}{2\nu}(4L+3M+6N+H)u^{2}
=\nu I+(\nu'I+\nu I')\frac{u}{\sqrt{\nu}}
+\frac{1}{2\nu}(\nu''I+2\nu'I'+\nu I'')u^{2}+c(\theta)+o_{p}(1),
\]

which, with u = Z_{1,ν}/I + O_p(1/√ν), implies

\[
(2.3)\quad \frac{n-\nu}{\sqrt{\nu}}\,I
=Z_{2,\nu}+\Bigl(\frac{\nu'}{\nu}-\frac{J}{I}\Bigr)Z_{1,\nu}
+\frac{1}{\sqrt{\nu}}\Bigl[\frac{1}{I}Z_{1,\nu}Z_{3,\nu}
+\frac{4L+3M+6N+H}{2I^{2}}Z_{1,\nu}^{2}
+\frac{\nu''I+2\nu'(2J+K)+\nu(2L+2M+5N+H)}{2I^{2}\nu}Z_{1,\nu}^{2}
+c(\theta)\Bigr]+o_{p}\Bigl(\frac{1}{\sqrt{\nu}}\Bigr).
\]

In order to determine the stopping rule so that E_θ(n) = ν + o(1), we take expectations in (2.3); since E_θ(Z_{1,ν}Z_{3,ν}) = L(θ) + o(1) and E_θ(nZ_{1,ν}²) = νI + o(ν), this yields the constant c(θ) given in the statement of the theorem. We also obtain from (2.3)

\[
(2.4)\quad \frac{n-\nu}{\nu}Z_{1,\nu}
=\frac{1}{I\sqrt{\nu}}Z_{1,\nu}Z_{2,\nu}
-\frac{J}{I^{2}\sqrt{\nu}}Z_{1,\nu}^{2}
+\frac{\nu'}{I\nu\sqrt{\nu}}\,\nu\cdot\frac{1}{\nu}\,Z_{1,\nu}^{2}
+o_{p}\Bigl(\frac{1}{\sqrt{\nu}}\Bigr)
=\frac{1}{I\sqrt{\nu}}Z_{1,\nu}Z_{2,\nu}
+\Bigl(\frac{\nu'}{\nu}-\frac{J}{I}\Bigr)\frac{Z_{1,\nu}^{2}}{I\sqrt{\nu}}
+o_{p}\Bigl(\frac{1}{\sqrt{\nu}}\Bigr).
\]

On the other hand we have

\[
0=\frac{1}{\sqrt{\nu}}\sum_{i=1}^{n}\ell^{(1)}(\hat\theta_{ML},X_{i})
=Z_{1,\nu}+\Bigl(Z_{2,\nu}-\frac{nI}{\sqrt{\nu}}\Bigr)(\hat\theta_{ML}-\theta)
-\frac{3J+K}{2\sqrt{\nu}}\,\nu(\hat\theta_{ML}-\theta)^{2}
+o_{p}\Bigl(\frac{1}{\sqrt{\nu}}\Bigr),
\]

which implies

\[
(2.5)\quad \sqrt{\nu}\,(\hat\theta_{ML}-\theta)
=\frac{1}{I}Z_{1,\nu}
+\frac{1}{I^{2}\sqrt{\nu}}Z_{1,\nu}Z_{2,\nu}
-\frac{n-\nu}{I\nu}Z_{1,\nu}
-\frac{3J+K}{2I^{3}\sqrt{\nu}}Z_{1,\nu}^{2}
+o_{p}\Bigl(\frac{1}{\sqrt{\nu}}\Bigr).
\]

From (2.2), (2.4) and (2.5) we obtain

\[
\sqrt{\nu}\,(\hat\theta_{ML}-\theta)
=\frac{1}{I}Z_{1,\nu}
-\frac{1}{\sqrt{\nu}}\Bigl\{\frac{J+K}{2I^{3}}+\frac{\nu'}{\nu I^{2}}\Bigr\}Z_{1,\nu}^{2}
+o_{p}\Bigl(\frac{1}{\sqrt{\nu}}\Bigr);
\]

hence, by Lemma 2.2,

\[
V_{\theta}(\sqrt{\nu}\,(\hat\theta_{ML}-\theta))
=\frac{1}{I}+\frac{1}{2\nu I^{2}}\Bigl(\frac{J+K}{I}+\frac{2\nu'}{\nu}\Bigr)^{2}+o\Bigl(\frac1\nu\Bigr).
\]

Since the asymptotic variances of θ̂_ML and θ̂*_ML are equal up to the order of ν⁻¹, the conclusion of the theorem follows.
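The final application of Lemma 2.2 uses V_θ(Z_{1,ν}²) = 2I² + o(1), Z_{1,ν} being asymptotically N(0, I); a sketch of that step (our reconstruction, with Q denoting the coefficient of the Z_{1,ν}² term above):

```latex
Q=-\Bigl\{\frac{J+K}{2I^{3}}+\frac{\nu'}{\nu I^{2}}\Bigr\}Z_{1,\nu}^{2},
\qquad
\frac{1}{\nu}V_\theta(Q)
=\frac{2I^{2}}{\nu}\Bigl\{\frac{J+K+2I\nu'/\nu}{2I^{3}}\Bigr\}^{2}
=\frac{1}{2\nu I^{2}}\Bigl(\frac{J+K}{I}+\frac{2\nu'}{\nu}\Bigr)^{2},
```

which is exactly the second term of the Bhattacharyya type bound in Theorem 2.1.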
REFERENCES
[1] Akahira, M. (1986). The Structure of Asymptotic Deficiency of Estimators, Queen's Papers in Pure and Applied Mathematics No. 75, Queen's University Press, Kingston, Ontario, Canada.
[2] Akahira, M. and Takeuchi, K. (1981). Asymptotic Efficiency of Statistical Estimators: Concepts and Higher Order Asymptotic Efficiency, Lecture Notes in Statistics 7, Springer-Verlag, New York.
[3] Wald, A. (1973). Sequential Analysis, Dover Publications, Inc., New York.

University of Tokyo
University of Tsukuba
SECOND AND THIRD ORDER ASYMPTOTIC COMPLETENESS OF THE CLASS OF ESTIMATORS*

Masafumi Akahira, Fumiko Hirakawa, and Kei Takeuchi

1. Introduction

The concept of higher order asymptotic efficiency of an estimator θ̂_n of a parameter θ in an open subset Θ of R^p is usually defined by the property that for any other estimator θ̃_n satisfying some condition we have

\[
\lim_{n\to\infty}n^{(k-1)/2}\bigl[P_{\theta,n}\{\sqrt{n}\,(\hat\theta_{n}-\theta)\in C\}
-P_{\theta,n}\{\sqrt{n}\,(\tilde\theta_{n}-\theta)\in C\}\bigr]\ge 0
\]

for all θ ∈ Θ and any convex set C containing the origin. And usually the condition imposed is k-th order asymptotic median unbiasedness and the like, and θ̂_n is usually a "modified" maximum likelihood estimator (MLE) (see [4], [6], [7] and [8]). However, the condition of asymptotic median unbiasedness is rather arbitrary, and a more meaningful property is that of higher order "asymptotic completeness," which is defined as follows: θ̂_n is called k-th order asymptotically complete if for any estimator θ̃_n within some class of estimators we can construct

\[
\hat\theta_{n}^{*}=\hat\theta_{n}+h_{n}(\hat\theta_{n}),
\]

{h_n} being a sequence of functions depending on θ̃_n, so that we have

\[
\lim_{n\to\infty}n^{(k-1)/2}\bigl[P_{\theta,n}\{\sqrt{n}\,(\hat\theta_{n}^{*}-\theta)\in C\}
-P_{\theta,n}\{\sqrt{n}\,(\tilde\theta_{n}-\theta)\in C\}\bigr]\ge 0
\]

for all θ ∈ Θ and any convex set C containing the origin. Actually, the definition was given in the monograph [4] by Akahira and Takeuchi, but there the discussion was limited to the class of asymptotically median unbiased estimators in its application. The concept is also discussed in [10], but again the class of estimators was limited to that of regular functions of sufficient statistics. The purpose of this paper is to show that, for any class of estimators which admit Edgeworth expansions but are not necessarily asymptotically median unbiased, we get the second order asymptotic completeness of the MLE, and the third order asymptotic completeness of the MLE θ̂_ML together with the second order derivative of the log-likelihood function evaluated at the MLE, which is denoted by Z₂(θ̂_ML) later. (This paper is retyped with the correction of typographical errors.)
It follows from this higher order asymptotic completeness property that, in any "decision theoretic" set-up where we consider a sequence of loss functions of the type L_n(u) = L*(√n u) for sufficiently large n, for any Bayes decision rule δ_n with smooth prior we can construct δ*_n = θ̂_ML + h_n(θ̂_ML, Z₂(θ̂_ML)), {h_n} being a sequence of functions depending on δ_n, such that

\[
\lim_{n\to\infty}n\bigl(E[L_{n}(\delta_{n}-\theta)]-E[L_{n}(\delta_{n}^{*}-\theta)]\bigr)\ge 0
\]

for all θ ∈ Θ. The concept of asymptotic completeness is related to that of asymptotic sufficiency. A sequence {T_n} of statistics is called higher order asymptotically sufficient if for any sequence {T̃_n} we can construct T*_n = g_n(T_n, U_n), {g_n} being a sequence of functions depending on T̃_n, where U_n is a random variable independent of θ, such that the asymptotic distribution of T*_n coincides with that of T̃_n up to the order n^{−(k−1)/2}. (The concept of higher order asymptotic sufficiency may be given in other ways, but basically it amounts to the above.) Asymptotic sufficiency, however, has generally been derived from the statement that

\[
T_{n}^{*}-\tilde T_{n}=o_{p}(n^{-k/2}),
\]

which is actually much stronger than the equivalence of asymptotic distributions up to the order n^{−(k−1)/2}, where X_n = o_p(Y_n) means that X_n/Y_n converges to zero in probability as n → ∞. For example, in Takeuchi [9] and in Bickel, Götze and van Zwet [5], it was discussed that for any Bayes decision procedure δ_n it is possible to get δ*_n = g_n(θ̂_ML) such that δ*_n − δ_n = o_p(n^{−3/2}) only if the loss function is symmetric. However, when the loss is not symmetric, it was shown in [9] that even with Z₂ we cannot construct δ*_n = g_n(θ̂_ML, Z₂(θ̂_ML)) such that δ*_n − δ_n = o_p(n^{−3/2}). But, in order that δ_n and δ*_n have asymptotically equivalent distributions up to the same order, hence asymptotically the same risk, it is not necessary that δ_n and δ*_n be stochastically equivalent up to the same order. As a simple illustration of this fact, let us assume that

\[
\delta_{n}=\hat\theta_{n}+\frac{1}{n}\varepsilon_{n},
\]

where ε_n has the property that E(ε_n | θ̂_n) = o(1). Then

\[
\delta_{n}-\hat\theta_{n}=\frac{1}{n}O_{p}(1)=O_{p}(n^{-1}),
\]
but

  E[exp(itδn)] = E[exp{it(θ̂n + εn/n)}]
              = E[exp(itθ̂n)] + (it/n)E[εn exp(itθ̂n)] + O(1/n²)
              = E[exp(itθ̂n)] + o(1/n) ,

which means that the distributions of θ̂n and δn are asymptotically equivalent up to the order n^{-1}. Actually, from what is derived in the paper, it is shown that, under a sequence of decision problems with a proper set of regularity conditions, for any Bayes decision rule δn with some smooth prior there exists a δ*n(θ̂ML, Z2(θ̂ML)) such that

  Pθ,n{δn ∈ A} = Pθ,n{δ*n ∈ A} + o(1/n)

for all measurable sets A in the decision space. The main purpose of this paper is to establish that θ̂ML is generally second order asymptotically complete, and that θ̂ML together with Z2(θ̂ML) is third order asymptotically complete. Although this result is given in the framework of "point" estimation theory, it is also applicable to other problems of inference including testing hypotheses and interval estimation, which will be discussed in subsequent papers.
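The characteristic-function argument above can be checked numerically on an assumed two-point toy distribution (our own example, not the paper's): take θ̂ = ±1 with probability 1/2 each, and ε = ±1 independent of θ̂, so that E(ε | θ̂) = 0. For δn = θ̂ + ε/n the two characteristic functions factor exactly, and the gap is O(1/n²) = o(1/n).

```python
import math

# E[exp(it*theta_hat)] = cos(t); by independence,
# E[exp(it*delta_n)] = cos(t) * cos(t/n), so the gap is |cos t|(1 - cos(t/n)).
def char_fn_gap(t, n):
    return abs(math.cos(t)) * (1.0 - math.cos(t / n))

gap10, gap100 = char_fn_gap(1.0, 10), char_fn_gap(1.0, 100)
ratio = gap10 / gap100   # about (100/10)^2 = 100, confirming the 1/n^2 rate
```

Multiplying n by 10 shrinks the gap by roughly 100, which is exactly the o(1/n) behaviour used in the text.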
2. Preliminaries
Let X be an abstract space, elements of which are denoted by x, and let B be a σ-field of subsets of X. Let Θ be a parameter space, which is assumed to be an open subset of the Euclidean p-space R^p with the usual norm denoted by ‖·‖. We shall denote by (X^(n), B^(n)) the n-fold direct product of (X, B). We consider a sequence of classes of probability measures {Pθ,n : θ ∈ Θ} (n = 1, 2, ...), each defined on (X^(n), B^(n)), such that for each n and each θ ∈ Θ the following holds:

  Pθ,n(B^(n)) = Pθ,n+1(B^(n) × X)

for all B^(n) ∈ B^(n). An estimator of θ is defined to be a sequence {θ̂n}, where θ̂n is a B^(n)-measurable function from X^(n) into Θ (n = 1, 2, ...). For simplicity we denote an estimator as θ̂n instead of {θ̂n}. For an increasing sequence of positive numbers {cn} (cn tending to infinity) an estimator θ̂n is called cn-consistent if for every η ∈ Θ there exists a sufficiently small positive number δ such that
  lim_{L→∞} lim sup_{n→∞} sup_{θ:‖θ−η‖<δ} Pθ,n{cn‖θ̂n − θ‖ > L} = 0

(Akahira [1]).
3. One parameter case
Suppose that Θ is an open subset of R¹. Let X1, X2, ..., Xn, ... be a sequence of independently and identically distributed (i.i.d.) real random variables with a density function f(x, θ) with respect to a σ-finite measure μ, where θ ∈ Θ. In the subsequent discussion it is enough to consider only the case cn = √n. For each k = 1, 2, ..., 0 < α < 1 and a continuously differentiable function a(θ) of θ, a cn-consistent estimator θ̂n is called a k-th order asymptotically (α, a)-biased (as. (α, a)-biased) estimator if, for any η ∈ Θ, there exists a positive number δ such that

  lim_{n→∞} sup_{θ:|θ−η|<δ} n^{(k-1)/2} |Pθ,n{√(nI(θ))(θ̂n − θ) ≤ a(θ)} − α| = 0 ,
  lim_{n→∞} sup_{θ:|θ−η|<δ} n^{(k-1)/2} |Pθ,n{√(nI(θ))(θ̂n − θ) ≥ a(θ)} − 1 + α| = 0 ,

where I(θ) denotes the Fisher information of f, and α and a(θ) are determined from the outset. Now we assume that α is a constant. Alternatively, we may fix a(θ) to be a constant letting α depend on θ, and the subsequent discussions apply exactly in a similar way. For each k = 1, 2, ... we denote by Bk(α, a) the class of k-th order as. (α, a)-biased estimators for which the distribution of √(nI(θ))(θ̂n − θ) − a(θ) admits the Edgeworth expansion up to the k-th order. If an estimator θ̂n belongs to the class Bk(α, a), then we call it a Bk(α, a)-estimator. For each k = 1, 2, ..., a Bk(α, a)-estimator θ*n is called k-th order asymptotically efficient in the class Bk(α, a) if for any Bk(α, a)-estimator θ̂n

  Pθ,n{−t1 ≤ √(nI(θ))(θ*n − θ) − a(θ) ≤ t2} ≥ Pθ,n{−t1 ≤ √(nI(θ))(θ̂n − θ) − a(θ) ≤ t2} + o(n^{-(k-1)/2})
for all t1, t2 > 0 and θ ∈ Θ. We assume the following conditions.

(A.1) {x : f(x, θ) > 0} does not depend on θ.

(A.2)k For almost all x[μ], f(x, θ) is k times continuously differentiable in θ. In the Taylor expansion

  log{f(x, θ + h)/f(x, θ)} = Σ_{i=1}^{k} (h^i/i!) ℓ^(i)(θ, x) + h^k R(x, h) ,

R(x, h) is uniformly bounded by a function G(x) which has moments up to the k-th order, where ℓ^(i)(θ, x) = (∂^i/∂θ^i)ℓ(θ, x) (i = 1, ..., k) with ℓ(θ, x) = log f(x, θ).

(A.3) For each θ ∈ Θ

  0 < I(θ) = Eθ[{ℓ^(1)(θ, X)}²] = −Eθ[ℓ^(2)(θ, X)] < ∞ ,

and I(θ) is differentiable in θ.

(A.4) There exist

  J(θ) = Eθ[ℓ^(1)(θ, X)ℓ^(2)(θ, X)] and K(θ) = Eθ[{ℓ^(1)(θ, X)}³] ,

both J(θ) and K(θ) are differentiable in θ, and Eθ[ℓ^(3)(θ, X)] = −3J(θ) − K(θ).

In the following theorem we obtain the asymptotic distribution of a B2(α, a)-estimator up to the second order, i.e., the order n^{-1/2}.

Theorem 3.1. Assume that the conditions (A.1), (A.2)3, (A.3) and (A.4) hold and that the asymptotic cumulants of a biased best asymptotically normal (BAN) estimator θ̂n are given as follows: for Tn = √n(θ̂n − θ)

  Eθ(Tn) = C0(θ) + C1(θ)/√n + o(1/√n) ,
  Vθ(Tn) = 1/I(θ) + C2(θ)/√n + o(1/√n) ,
  κ3,θ(Tn) = C3(θ)/√n + o(1/√n) ,

where Ci(θ) (i = 0, 1, 2, 3) are differentiable functions of θ. Then the second order asymptotic distribution of the second order as. (α, a)-biased BAN estimator θ*n, of the form θ*n = θ̂n − n^{-1/2}u1(θ̂n) − n^{-1}u2(θ̂n), is given by

(3.1) Pθ,n{√(nI(θ))(θ*n − θ) ≤ t + a(θ)}
    = Φ(t + vα) + (1/√n)φ(t + vα)[ −(1/6)I^{3/2}(θ)C3(θ)(t² + 2vα t)
      + {C0′(θ) − I(θ)C2(θ)/2 − a′(θ)/√I(θ) + (2J(θ) + K(θ))(a(θ) − vα)/(2I^{3/2}(θ))}t ] + o(1/√n) ,

where Φ(u) = ∫_{−∞}^u φ(x)dx with φ(x) = (1/√(2π))e^{−x²/2} and Φ(vα) = α.
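The quantities I, J and K of conditions (A.3) and (A.4) can be checked on a concrete model. The following is our own illustrative sketch, assuming the exponential density f(x, θ) = θ e^{−θx} (x > 0), for which the identity Eθ[ℓ^(3)] = −3J − K of condition (A.4) can be verified in closed form.

```python
import random

# For f(x, theta) = theta*exp(-theta*x):
#   l(theta, x) = log theta - theta*x,
#   l1 = 1/theta - x,  l2 = -1/theta^2,  l3 = 2/theta^3.
# Closed forms: I = E[l1^2] = 1/theta^2, J = E[l1*l2] = 0 (l2 is constant and
# E[l1] = 0), K = E[l1^3] = -2/theta^3, and E[l3] = 2/theta^3.
theta = 1.7
I_closed = 1.0 / theta ** 2
J_closed = 0.0
K_closed = -2.0 / theta ** 3
E_l3 = 2.0 / theta ** 3            # identity: E[l^(3)] = -3J - K

# Monte Carlo confirmation of the Fisher information I = E[l1^2]:
random.seed(0)
m = 200_000
I_mc = sum((1.0 / theta - random.expovariate(theta)) ** 2 for _ in range(m)) / m
```

Here J vanishes for this particular model; in general J and K enter the second order terms of the expansions below.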
Remark. In the above theorem, u1(θ) and u2(θ) are given as follows:

  u1(θ) = C0(θ) − (a(θ) − vα)/√I(θ) ,
  u2(θ) = C1(θ) − C0(θ)u1′(θ) + (√I(θ)/2){C2(θ) − 2u1′(θ)/I(θ)}[a(θ) − √I(θ){C0(θ) − u1(θ)}]
        + (1/6)I(θ)C3(θ)[{a(θ) − √I(θ)(C0(θ) − u1(θ))}² − 1] .

In the following theorem we shall obtain the bound for the asymptotic distributions of B2(α, a)-estimators up to the order n^{-1/2}.

Theorem 3.2. Assume that the conditions (A.1), (A.2)3, (A.3) and (A.4) hold. Then the bound F*(t, θ) for the asymptotic distributions of √(nI(θ))(θ̂n − θ) for all B2(α, a)-estimators θ̂n is given by

  F*(t, θ) = Φ(t + vα) + (1/√n)φ(t + vα)[ {3J(θ) + 2K(θ)}t²/(6I^{3/2}(θ))
           − (t/√I(θ)){a′(θ) − a(θ)(2J(θ) + K(θ))/(2I(θ)) − vαK(θ)/(6I(θ))} ]

in the sense that for any B2(α, a)-estimator θ̂n

  Pθ,n{√(nI(θ))(θ̂n − θ) ≤ t + a(θ)} ≤ F*(t, θ) + o(1/√n) for all t > 0 and all θ ∈ Θ ,
  Pθ,n{√(nI(θ))(θ̂n − θ) ≤ t + a(θ)} ≥ F*(t, θ) + o(1/√n) for all t < 0 and all θ ∈ Θ .

From Theorems 3.1 and 3.2 we have the following:

Theorem 3.3. Under the conditions (A.1), (A.2)3, (A.3) and (A.4) it holds that for the biased BAN estimator θ̂n

  C3(θ) = −{3J(θ) + 2K(θ)}/I³(θ) and C0′(θ) ≤ I(θ)C2(θ)/2 for all θ ∈ Θ ,

and further, if C0′(θ) = I(θ)C2(θ)/2, the second order as. (α, a)-biased BAN estimator θ*n is second order asymptotically efficient in the class B2(α, a).

Sketches of the proofs of Theorems 3.1, 3.2 and 3.3 are given below, since the detailed ones are given in Akahira [2].

Sketch of the proof of Theorem 3.1. Let

  θ*n = θ̂n − u1(θ̂n)/√n − u2(θ̂n)/n ,
where u1(θ) is continuously differentiable in θ and u2(θ) is a function of θ. Then we have the following asymptotic cumulants of Tn = √(nI(θ))(θ*n − θ):

(3.2) Eθ(Tn) = √I(C0 − u1) + (√I/√n)(C1 − C0u1′ − u2) + o(1/√n) ,
(3.3) Vθ(Tn) = 1 + (I/√n)(C2 − 2u1′/I) + o(1/√n) ,
(3.4) κ3,θ(Tn) = (1/√n)I^{3/2}C3 + o(1/√n) ,

where I, C0, u1, u1′, u2, C2 and C3 denote I(θ), C0(θ), u1(θ), u1′(θ) = du1(θ)/dθ, u2(θ), C2(θ) and C3(θ). From (3.2) to (3.4) it follows that the Edgeworth expansion of the distribution of √(nI(θ))(θ*n − θ) is given by

(3.5) Pθ,n{√(nI(θ))(θ*n − θ) ≤ t + √I(C0 − u1)}
    = Φ(t) − (1/√n)φ(t)[ √I(C1 − C0u1′ − u2) + (I/2)(C2 − 2u1′/I)t + (1/6)I^{3/2}C3(t² − 1) ] + o(1/√n) .

By the second order as. (α, a)-biased condition we have the following u1 and u2:

(3.6) u1 = C0 − (a − vα)/√I ,
(3.7) u2 = C1 − C0u1′ + (√I/2)(C2 − 2u1′/I){a − √I(C0 − u1)} + (1/6)IC3[{a − √I(C0 − u1)}² − 1] ,

where Φ(vα) = α. Since I′(θ) = 2J(θ) + K(θ), it follows from (3.5), (3.6) and (3.7) that (3.1) holds. This completes the proof.

Sketch of the proof of Theorem 3.2. Let θ0 be arbitrary but fixed in Θ. Consider the problem of testing the hypothesis H : θ = θ1 against the alternative A : θ = θ0, where θ1 = θ0 + O(1/√n). In order to obtain the upper bound of

  Pθ0,n{√(nI(θ0))(θ̂n − θ0) ≤ t + a(θ0)}
for each t > 0 and all B2(α, a)-estimators θ̂n, that is, under the second order as. (α, a)-biased condition

  Pθ1,n{√(nI(θ1))(θ̂n − θ1) ≤ a(θ1)} = α + o(1/√n) ,

we first take

  θ1 = θ0 + t/√(nI) − (1/(n√I)){a′ − a(2J + K)/(2I)} ,

where a, a′, I, J and K denote a(θ0), a′(θ0), I(θ0), J(θ0) and K(θ0). Setting

  Tn = Σ_{i=1}^n log{f(Xi, θ0)/f(Xi, θ1)} ,

by the Edgeworth expansion of the distribution of Tn we take the following an so that the test with the rejection region {Tn > an} has the level α + o(1/√n):

  an = −t²/2 − vα t + (1/(6I^{3/2}√n)){(3J + 2K)t² + 3(J + K)at + K(vα² − 1)} + o(1/√n) .
In a similar way to the above we have the asymptotic power of the test with the rejection region {Tn > an} as follows:

(3.8) Pθ0,n{Tn > an} = Φ(t + vα) + (1/√n)φ(t + vα)[ (3J + 2K)t²/(6I^{3/2})
      − (t/√I){a′ − a(2J + K)/(2I) − vαK/(6I)} ] + o(1/√n) = F*(t, θ0) (say).

By the fundamental lemma of Neyman and Pearson it is seen that the asymptotic power of the most powerful test of the level α + o(1/√n) is given by (3.8). Since θ0 is arbitrary, we have the desired result for all t > 0. In a similar way to the case t > 0, we also obtain the conclusion for all t < 0.

Sketch of the proof of Theorem 3.3. By Theorems 3.1 and 3.2 we have
(3.9) F*(t, θ) − Pθ,n{√(nI(θ))(θ*n − θ) ≤ t + a(θ)}
    = (1/√n)φ(t + vα)[ {3J(θ) + 2K(θ) + I³(θ)C3(θ)}(t² + 2vα t)/(6I^{3/2}(θ)) − {C0′(θ) − I(θ)C2(θ)/2}t ] + o(1/√n) .

If one assumes that 3J(θ0) + 2K(θ0) + I³(θ0)C3(θ0) ≠ 0 for some θ0 ∈ Θ, it leads to a contradiction to the bound F*(t, θ). Hence it follows that

(3.10) C3(θ) = −{3J(θ) + 2K(θ)}/I³(θ) for all θ ∈ Θ .

Since F*(t, θ) is the bound for the asymptotic distributions of √(nI(θ))(θ̂n − θ) for all B2(α, a)-estimators θ̂n, it follows from (3.9) and (3.10) that

  C0′(θ) ≤ I(θ)C2(θ)/2 for all θ ∈ Θ .

If C0′(θ) ≡ I(θ)C2(θ)/2, then it is seen that the asymptotic distribution of √(nI(θ))(θ*n − θ) attains the bound F*(t, θ) for all t and all θ ∈ Θ. Hence the desired result follows.

Let θ̂ML be the maximum likelihood estimator (MLE) of θ. Then the second order asymptotic efficiency of the MLE is given as follows:

Theorem 3.4. Assume that the conditions (A.1), (A.2)3, (A.3) and (A.4) hold. Then the estimator θ*ML modified from the MLE to be in B2(α, a) is second order asymptotically efficient in the class B2(α, a).

The proof follows from Theorems 3.1 and 3.3, since C0(θ) ≡ C2(θ) ≡ 0 and C3(θ) ≡ −{3J(θ) + 2K(θ)}/I³(θ) in the asymptotic cumulants of the MLE (see Akahira and Takeuchi [4], page 90).

Theorem 3.5. Assume that the conditions (A.1), (A.2)3, (A.3) and (A.4) hold. Suppose that
θ̂n is any BAN estimator for which the Edgeworth expansion up to the order n^{-1/2} is valid and the coefficients Ci(θ) (i = 0, 1, 2, 3) in its asymptotic cumulants as in Theorem 3.1 are continuously differentiable in θ. Then the MLE is second order asymptotically complete in the sense that there exists a modified MLE θ*ML, of the form

  θ*ML = θ̂ML + (1/√n)q1(θ̂ML) + (1/n)q2(θ̂ML) ,

such that

  Pθ,n{−t1 ≤ √(nI(θ))(θ*ML − θ) − a(θ) ≤ t2} ≥ Pθ,n{−t1 ≤ √(nI(θ))(θ̂n − θ) − a(θ) ≤ t2} + o(1/√n)

for all t1, t2 > 0 and all θ ∈ Θ.

Sketch of the proof. For fixed α, define an(θ) by

  Pθ,n{√(nI(θ))(θ̂n − θ) ≤ an(θ)} = α + o(1/√n) ,
  Pθ,n{√(nI(θ))(θ̂n − θ) ≥ an(θ)} = 1 − α + o(1/√n) ,

locally uniformly in θ; then an(θ) can be expanded as

  an(θ) = a(θ) + n^{-1/2}b(θ) + o(n^{-1/2}) ,

where a(θ) and b(θ) are continuously differentiable. Consider the class B2(α, an) with an(θ) thus defined. Analogously to the proof of Theorem 3.4 we can show that within the class there exists a modified MLE θ*ML, of the form

  θ*ML = θ̂ML + (1/√n)q1(θ̂ML) + (1/n)q2(θ̂ML) ,

which is second order asymptotically efficient in the class B2(α, an). This implies the desired result.

For the third order asymptotic completeness we further assume the following:

(A.5) There exist Eθ[ℓ^(1)(θ, X)ℓ^(3)(θ, X)], Eθ[{ℓ^(2)(θ, X)}²], Eθ[{ℓ^(1)(θ, X)}²ℓ^(2)(θ, X)] and Eθ[{ℓ^(1)(θ, X)}⁴].

Theorem 3.6. Assume that the conditions (A.1), (A.2)4, (A.3), (A.4) and (A.5) hold. Suppose that θ̂n is second order asymptotically efficient in B2(α, a) and the Edgeworth expansion of its distribution up to the order n^{-1} is valid. Then the pair of statistics (θ̂ML, Z2(θ̂ML)) is third order asymptotically complete in the sense that there exists a modified MLE θ*ML, of the form

  θ*ML = θ̂ML + (1/√n)q1(θ̂ML) + (1/n)q2(θ̂ML, Z2(θ̂ML)) + (1/(n√n))q3(θ̂ML, Z2(θ̂ML)) ,

such that

  Pθ,n{−t1 ≤ √(nI(θ))(θ*ML − θ) − a(θ) ≤ t2} ≥ Pθ,n{−t1 ≤ √(nI(θ))(θ̂n − θ) − a(θ) ≤ t2} + o(1/n)

for all t1, t2 > 0 and θ ∈ Θ, where

  Z2(θ) = (1/√n) Σ_{i=1}^n ℓ^(2)(θ, Xi) + √n I(θ) .
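The statistic Z2(θ) centers the observed second derivative of the log-likelihood at its expectation −I(θ), so it is Op(1) and measures curvature fluctuation. As a small illustration (our own hypothetical example, assuming a Poisson(λ) model, not the paper's), Z2 reduces to a standardized deviation of the sample mean:

```python
import math

# Poisson(lam): l(lam, x) = x*log(lam) - lam - log(x!), so
# l^(2)(lam, x) = -x/lam^2 and I(lam) = 1/lam.
def Z2(lam, xs):
    n = len(xs)
    return sum(-x / lam ** 2 for x in xs) / math.sqrt(n) + math.sqrt(n) * (1.0 / lam)

xs = [3, 1, 4, 1, 5, 9, 2, 6]
lam = 4.0
n = len(xs)
xbar = sum(xs) / n
# Algebraically, Z2(lam) = -sqrt(n)*(xbar - lam)/lam^2 for this model:
z2_closed = -math.sqrt(n) * (xbar - lam) / lam ** 2
```

Both expressions agree exactly, and Z2 stays bounded in probability as n grows, which is what allows it to carry the extra third order information alongside θ̂ML.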
Sketch of the proof. The asymptotic cumulants κi (i = 1, 2, 3, 4) of Tn = √n(θ̂n − θ) have the following form:

  κ1 = Eθ(Tn) = μ10(θ) + μ11(θ)/√n + μ12(θ)/n + o(1/n) ,
  κ2 = Vθ(Tn) = 1/I(θ) + μ21(θ)/√n + μ22(θ)/n + o(1/n) ,
  κ3 = κ3,θ(Tn) = Eθ[{Tn − Eθ(Tn)}³] = μ31(θ)/√n + μ32(θ)/n + o(1/n) ,
  κ4 = κ4,θ(Tn) = Eθ[{Tn − Eθ(Tn)}⁴] − 3{Vθ(Tn)}² = μ42(θ)/n + o(1/n) .

Then it is seen that the coefficients μ10(θ), μ11(θ), μ21(θ), μ31(θ) and μ42(θ) in the above are determined from the second order asymptotic efficiency of θ̂n, but μ22(θ) and μ32(θ) are not (e.g. see Akahira and Takeuchi [4]). And it follows that, in the Edgeworth expansion of the distribution Pθ,n{√(nI(θ))(θ̂n − θ) − a(θ) ≤ t}, the only undecided term has the form 3μ32(θ) + {μ22(θ)/I(θ)}w(t), where w(t) is some linear function of t. It follows by the discretized likelihood method in Akahira and Takeuchi [3], [4] that for any θ̂n ∈ B2(α, a) with μ32(θ) and μ22(θ) in its asymptotic cumulants, any fixed θ0 ∈ Θ and each real t, there exists a second order as. (α, a)-biased estimator θ̂n^t, with μ32^t(·) and μ22^t(·) in its asymptotic cumulants, such that

(3.11) 3μ32^t(θ0) + {μ22^t(θ0)/I(θ0)}w(t) ≤ 3μ32(θ0) + {μ22(θ0)/I(θ0)}w(t) .
For any given θ̂n, let μ22*(θ) be the corresponding μ22(θ); then it is shown that there exists t0 = t0(θ0) such that μ22^{t0}(θ0) = μ22*(θ0). Since, by (3.11), for any μ32(·)

  3μ32^{t0}(θ0) + {μ22^{t0}(θ0)/I(θ0)}w(t0) ≤ 3μ32(θ0) + {μ22*(θ0)/I(θ0)}w(t0) ,

it follows that

(3.12) μ32^{t0}(θ0) ≤ μ32(θ0) .

From (3.11) and (3.12) we have

  3μ32^{t0}(θ0) + {μ22^{t0}(θ0)/I(θ0)}w(t) ≤ 3μ32(θ0) + {μ22(θ0)/I(θ0)}w(t)

for all real t. Hence, for any second order asymptotically efficient estimator θ̂n in B2(α, a),

(3.13) Pθ0,n{−t1 ≤ √(nI(θ0))(θ̂n^{t0} − θ0) − a(θ0) ≤ t2} ≥ Pθ0,n{−t1 ≤ √(nI(θ0))(θ̂n − θ0) − a(θ0) ≤ t2} + o(1/n)

for all t1, t2 > 0.
Now define

  θ*ML = θ̂ML + (1/√n)h1(θ̂ML) + (1/n)h2(θ̂ML) + (1/(n√n))h3(θ̂ML)

such that Eθ(θ*ML) is asymptotically equal to Eθ(θ̂n) up to the order n^{-1}; then θ*ML and θ̂n have the same values for μ10(θ), μ11(θ), μ21(θ), μ31(θ) and μ42(θ), but the coefficient μ32(θ) of the order n^{-1} in the third order cumulant of θ*ML is identically equal to zero. By the discretized likelihood method in Akahira and Takeuchi [3], [4] it follows that

  θ̂n^{t0} = θ*ML + {t0(θ0)/(2nI(θ0))}Z2(θ0) + op(1/n) .

Now define

  θ̃n^{t0} = θ*ML + {t̂0/(2nI(θ̂ML))}Z2(θ̂ML) , where t̂0 = t0(θ̂ML) .

Then it can be shown that

(3.14) θ̃n^{t0} = θ̂n^{t0} + (1/(n√n))W + op(1/(n√n)) ,

where W is some stochastic quantity of magnitude of order 1 and Eθ0(W) = o(1). Since θ̂n^{t0} can be stochastically expanded as

  √n(θ̂n^{t0} − θ0) = Z1(θ0)/I(θ0) + (1/√n)Q + (1/n)R + op(1/n) ,

where Z1(θ) = (1/√n)Σ_{i=1}^n ℓ^(1)(θ, Xi), it follows from (3.14) that

  √n(θ̃n^{t0} − θ0) = Z1(θ0)/I(θ0) + (1/√n)Q + (1/n)(R + W) + op(1/n) ,

where both Q and R are certain stochastic quantities of magnitude of order 1. Then it is shown in Akahira and Takeuchi [4] that the asymptotic distributions of the above two estimators θ̂n^{t0} and θ̃n^{t0} coincide up to the order n^{-1} if Eθ0(W) = o(1). Hence it follows from (3.13) that for any second order asymptotically efficient estimator θ̂n in B2(α, a)

  Pθ0,n{−t1 ≤ √(nI(θ0))(θ̃n^{t0} − θ0) − a(θ0) ≤ t2} ≥ Pθ0,n{−t1 ≤ √(nI(θ0))(θ̂n − θ0) − a(θ0) ≤ t2} + o(1/n)

for all t1, t2 > 0. Since θ̃n^{t0} is independent of θ0 and θ0 is arbitrary, we have the conclusion of the theorem.
4. Multiparameter case
Assume that Θ is an open subset of R^p. Let X1, X2, ..., Xn, ... be a sequence of i.i.d. real random variables with a density function f(x, θ) with respect to a σ-finite measure μ, where θ ∈ Θ. Denote by θ̂n = (θ̂n1, ..., θ̂np)′ an estimator of θ = (θ1, ..., θp)′ ∈ Θ. We assume that the distribution of √n(θ̂n − θ) admits the Edgeworth expansion up to the order n^{-1}, and that θ̂n has the stochastic expansion

  √n(θ̂nα − θα) = Uα + (1/(2√n))Qα + (1/n)Wα + op(1/n) (α = 1, ..., p)

with

  Uα = Σ_{β=1}^p I^{αβ}(θ) (1/√n) Σ_{i=1}^n (∂/∂θβ) log f(Xi, θ) ,

Qα = Op(1) and Wα = Op(1) (α = 1, ..., p), whose asymptotic cumulants are the following: for Yα = √n(θ̂nα − θα) (α = 1, ..., p)

  Eθ(Yα) = μ0α(θ) + μ1α(θ)/√n + μ2α(θ)/n + o(1/n) ;
  Covθ(Yα, Yβ) = I^{αβ}(θ) + Cαβ^(1)(θ)/√n + Cαβ^(2)(θ)/n + o(1/n) ;
  κ3,θ(Yα, Yβ, Yγ) = καβγ^(1)(θ)/√n + καβγ^(2)(θ)/n + o(1/n) ;
  κ4,θ(Yα, Yβ, Yγ, Yδ) = καβγδ(θ)/n + o(1/n)

for α, β, γ, δ = 1, ..., p, where I^{αβ}(θ) is the (α, β)-element of the inverse matrix of the Fisher information matrix. For each k = 1, 2, ..., an estimator θ*n is called k-th order asymptotically efficient in the class Ak of the estimators θ̂n with the same bias

  Eθ(θ̂nα) = θα + Σ_{i=0}^{k-1} μiα(θ) n^{-(i+1)/2} + o(n^{-k/2}) (α = 1, ..., p)

up to the order n^{-k/2} if for any estimator θ̂n ∈ Ak

  Pθ,n{√n(θ*n − θ) − μ0(θ) ∈ C} ≥ Pθ,n{√n(θ̂n − θ) − μ0(θ) ∈ C} + o(n^{-(k-1)/2})

for any convex set C of R^p containing the origin and all θ ∈ Θ.
We shall use the following notations: for α, β, γ, δ = 1, ..., p

  Jαβγ = Eθ[{(∂²/∂θα∂θβ) log f(X, θ)}{(∂/∂θγ) log f(X, θ)}] ;
  Kαβγ = Eθ[{(∂/∂θα) log f(X, θ)}{(∂/∂θβ) log f(X, θ)}{(∂/∂θγ) log f(X, θ)}] ;
  J^{αβγ} = Σ_{α′=1}^p Σ_{β′=1}^p Σ_{γ′=1}^p I^{αα′}I^{ββ′}I^{γγ′} Jα′β′γ′ ;
  K^{αβγ} = Σ_{α′=1}^p Σ_{β′=1}^p Σ_{γ′=1}^p I^{αα′}I^{ββ′}I^{γγ′} Kα′β′γ′ ;
  Mαβγδ = Eθ[{(∂²/∂θα∂θβ) log f(X, θ)}{(∂²/∂θγ∂θδ) log f(X, θ)}] − Iαβ Iγδ ;
  M^{αβγδ} = Σ_{α′} Σ_{β′} Σ_{γ′} Σ_{δ′} I^{αα′}I^{ββ′}I^{γγ′}I^{δδ′} Mα′β′γ′δ′ ;
  Jγ^{αβ} = Σ_{α′} Σ_{β′} I^{αα′}I^{ββ′} Jα′β′γ ;
  Jβ^{αγ} = Σ_{α′} Σ_{γ′} I^{αα′}I^{γγ′} Jα′βγ′ ;
  Kγ^{αβ} = Σ_{α′} Σ_{β′} I^{αα′}I^{ββ′} Kα′β′γ .
Theorem 4.1. Assume that θ̂n is any asymptotically efficient estimator in the class of the estimators with the same bias μ0α(θ) (α = 1, ..., p) up to the order n^{-1/2} in the above, which admits the Edgeworth expansion of its distribution up to the order n^{-1/2}. Then the MLE θ̂ML is second order asymptotically complete in the sense that there exists a modified MLE θ*ML, with the same asymptotic bias as θ̂n up to the order n^{-1}, of the form

  θ*ML = θ̂ML + (1/√n)q1(θ̂ML) + (1/n)q2(θ̂ML) ,

such that

  Pθ,n{√n(θ*ML − θ) − μ0(θ) ∈ C} ≥ Pθ,n{√n(θ̂n − θ) − μ0(θ) ∈ C} + o(1/√n)

for any convex set C of R^p containing the origin and all θ ∈ Θ, where μ0(θ) = (μ01(θ), ..., μ0p(θ))′. The proof is omitted since it is done in a similar way to that of Theorem 3.4.

Theorem 4.2. Assume that θ̂n is second order asymptotically efficient in the class of the estimators with the same bias up to the order n^{-1} in the above, which admits the Edgeworth expansion of its distribution up to the order n^{-1}. Then the pair of statistics (θ̂ML, Z2(θ̂ML)) is third order asymptotically complete in the sense that there exists a modified MLE θ*ML, with the same asymptotic bias as θ̂n up to the order n^{-3/2}, of the form
  θ*ML = θ̂ML + (1/√n)q1(θ̂ML) + (1/n)q2(θ̂ML, Z2(θ̂ML)) + (1/(n√n))q3(θ̂ML, Z2(θ̂ML)) ,

such that

  Pθ,n{√n(θ*ML − θ) − μ0(θ) ∈ C} ≥ Pθ,n{√n(θ̂n − θ) − μ0(θ) ∈ C} + o(1/n)

for any convex set C of R^p containing the origin and all θ ∈ Θ, where

  Z2(θ) = (1/√n) Σ_{i=1}^n (∂²/∂θ∂θ′) log f(Xi, θ) + √n I(θ)

with the Fisher information matrix I(θ).
with the Fisher information matrix I(9). Sketch of the proof. First it is noted that the coefficients µ0a(8), µ1a(9), C,(,p(9), r.(1),(0) and 00 xao.ya (O) (a, /Q, ry, 8 = 1, ... , p) in the above asymptotic cumulants of On are determined from the second order asymptotic efficiency of On- In order to obtain BAIL with more concentration probability than On up to the order n-1 by the Edgeworth expansion of the distribution of On, it is enough to find an estimator On = B;, which minimizes CQR (0) = Cove (Q,, Q,), given ^cap7(0). Putting V a p = Ea=1 Ia0 (8/88.y)U0 with Iao = 1`0 (0) (a,# = 1, ... , p), we have for a, /3, y = 1,...,P
(4.1) Kap•y(8) = E9 (Q«V, ) + E0 (QoV7«) + Ee (Q. V«o) We also have for a, /3, ry, 8 = 1.... )P1 (4.2) E9 (UOU-1Q,) =2 (µlai17 - Ka07 - JO"*') + o(1), P
(4.3)
ll E9 (UOVy6Qa) =2 Ala (K-Y60 + J7O.6) + (Kb o" + Jb a ) 6'=1 P
(Kryb"6 + Jry6 '6 - JaaJ76"6 + Moa76 6'=1
+ o(1), where for each a = 1,...,p, µ1a denotes µia(O). From the second order asymptotic efficiency of On we obtain (4.4) Eo (UaQa) = o(1) (a = 1, ... ,P)• Then it follows by the Lagrange method that, under the conditions (4.1) to (4.4) and E9 (Qa) _ 2µ1a + 0(1), Q. = Qa minimizing Cove (Qa, Qo) have the following: for a = 1, ... , p
340 26
(4.5) Q« =2µ1. + 710-f (0) Vp7 - p7(0)U. p=17=1 p=1Y=1 +^^Aa7(0)(upU7-Ip7)+EEEAary6(0) /3=17=1
0=1'y=16=1
(UpV76 + Kap7 + Jn7•0) , where rl«7(0), µ0"7(0), A17(0) and Apry6(9) (a,0,-y,6 = 1,. .. , p) are Lagrangian multipliers determined so that the conditions are satisfied . Hence it is seen the desired estimator 9 has Qa in its stochastic expansion. Substituting 9ML for 0 in (4.5 ) and making it to have the same asymptotic bias as On up to the order n-312, we obtain the desired result.
References

[1] M. Akahira: Asymptotic theory for estimation of location in non-regular cases, I: Order of convergence of consistent estimators, Rep. Stat. Appl. Res., JUSE 22 (1975), 8-26.
[2] M. Akahira: The structure of asymptotic deficiency of estimators, to appear in Queen's Papers in Pure and Applied Mathematics.
[3] M. Akahira and K. Takeuchi: Discretized likelihood methods - Asymptotic properties of discretized likelihood estimators (DLE's), Ann. Inst. Statist. Math., 31 (1979), 39-56.
[4] M. Akahira and K. Takeuchi: Asymptotic Efficiency of Statistical Estimators: Concepts and Higher Order Asymptotic Efficiency, Lecture Notes in Statistics 7, Springer-Verlag, New York-Heidelberg-Berlin (1981).
[5] P. J. Bickel, F. Götze and W. R. van Zwet: A simple analysis of third-order efficiency of estimates, Proceedings of the Berkeley Conference in Honor of Jerzy Neyman and Jack Kiefer, Vol. II (L. M. LeCam and R. A. Olshen, eds.), Wadsworth (1985), 749-768.
[6] J. K. Ghosh, B. K. Sinha and H. S. Wieand: Second order efficiency of the mle with respect to any bounded bowl-shaped loss function, Ann. Statist., 8 (1980), 506-521.
[7] J. Pfanzagl and W. Wefelmeyer: A third order optimum property of the maximum likelihood estimator, J. Multivariate Anal., 8 (1978), 1-29.
[8] J. Pfanzagl and W. Wefelmeyer: Asymptotic Expansions for General Statistical Models, Lecture Notes in Statistics 31, Springer-Verlag, Berlin-Heidelberg-New York-Tokyo (1985).
[9] K. Takeuchi: Higher order asymptotic efficiency of estimators in decision procedures, Proc. Third Purdue Symp. on Decision Theory and Related Topics, Vol. II (J. Berger and S. Gupta, eds.), Academic Press, New York (1982), 351-361.
[10] K. Takeuchi and K. Morimune: Third-order efficiency of the extended maximum likelihood estimators in a simultaneous equation system, Econometrica 53 (1985), 177-200.
M. Akahira
Department of Mathematics
University of Electro-Communications
Chofu, Tokyo 182, Japan

F. Hirakawa
Department of Mathematics
Science University of Tokyo
Yamazaki, Noda-shi, Chiba 278, Japan

K. Takeuchi
Faculty of Economics
University of Tokyo
Hongo, Bunkyo-ku, Tokyo 113, Japan
Ann. Inst. Statist. Math. Vol. 41, No. 4, 725-752 (1989)
HIGHER ORDER ASYMPTOTICS IN ESTIMATION FOR TWO-SIDED WEIBULL TYPE DISTRIBUTIONS MASAFUMI AKAHIRA' AND KEI TAKEUCHI2 'Institute of Mathematics, University of Tsukuba, Ibaraki 305, Japan 'Research Center for Advanced Science and Technology , University of Tokyo, Komaba, Meguro-ku, Tokyo 156, Japan
(Received May 18, 1988; revised January 25, 1989)
Abstract. We consider the estimation problem of a location parameter θ on a sample of size n from a two-sided Weibull type density f(x − θ) = C(α)exp(−|x − θ|^α) for −∞ < x < ∞, −∞ < θ < ∞ and 1 < α < 3/2, where C(α) = α/{2Γ(1/α)}.
Key words and phrases: 2α-th order asymptotically median unbiased estimator, 2α-th order asymptotic distribution, 2α-th order asymptotic efficiency, Edgeworth expansion, maximum likelihood estimator.
1. Introduction

Higher order asymptotics have been studied by Pfanzagl and Wefelmeyer (1978, 1985), Ghosh et al. (1980), Akahira and Takeuchi (1981), Akahira (1986) and Akahira et al. (1988), among others, under suitable regularity conditions. In non-regular cases when the regularity conditions do not necessarily hold, higher order asymptotics were discussed by Akahira and Takeuchi (1981), Akahira (1987, 1988a, 1988b), Pfanzagl and Wefelmeyer (1985), Sugiura and Naing (1987) and others. In this paper we consider the estimation problem of a location parameter θ on a sample of size n from a two-sided Weibull type density f(x − θ) = C(α)exp(−|x − θ|^α) for −∞ < x < ∞, −∞ < θ < ∞ and 1 < α < 3/2, where C(α) = α/{2Γ(1/α)}. It is noted that the Fisher information amount and the first order derivative of f(x) at x = 0 exist, but the second order derivative of f(x) at x = 0 does not. It is also seen in Akahira (1975)
that the order of consistency is equal to n^{1/2} in this situation. Then we shall obtain the bound for the distribution of asymptotically median unbiased estimators of θ up to the 2α-th order, i.e., the order n^{-(2α-1)/2}. We shall also get the asymptotic distribution of the maximum likelihood estimator (MLE) of θ up to the 2α-th order and see that the MLE is not generally 2α-th order asymptotically efficient. Further, we shall obtain the amount of the loss of asymptotic information of the MLE.

2. The 2α-th order asymptotic bound for the distribution of second order AMU estimators

Let X1, ..., Xn, ... be a sequence of independent and identically distributed (i.i.d.) random variables with a two-sided Weibull type density f(x − θ) = C(α)exp{−|x − θ|^α} for −∞ < x < ∞, where θ is a real-valued parameter, 1 < α < 3/2 and C(α) = α/{2Γ(1/α)} with the Gamma function Γ(u), i.e., Γ(u) = ∫₀^∞ x^{u-1}e^{-x}dx (u > 0).
We denote by Pθ,n the n-fold product of the probability measure Pθ with the above density f(x − θ). An estimator θ̂n of θ based on X1, ..., Xn is called a 2α-th order asymptotically median unbiased (AMU) estimator if for any η ∈ R¹ there exists a positive number δ such that

  lim_{n→∞} sup_{θ:|θ−η|<δ} n^{(2α-1)/2} |Pθ,n{θ̂n ≤ θ} − 1/2| = 0 ,
  lim_{n→∞} sup_{θ:|θ−η|<δ} n^{(2α-1)/2} |Pθ,n{θ̂n ≥ θ} − 1/2| = 0 .
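A deterministic toy illustration of the median unbiasedness requirement above (our own example, not from the paper): for an odd sample size n from a continuous distribution symmetric about θ, the sample median M is exactly median unbiased, since M ≤ θ exactly when at least (n+1)/2 observations fall at or below θ, each of which happens with probability 1/2.

```python
from math import comb

# P(M <= theta) = P(Bin(n, 1/2) >= (n+1)/2) = 1/2 exactly for odd n,
# by the symmetry of the binomial(n, 1/2) distribution.
n = 11
p_le = sum(comb(n, j) for j in range((n + 1) // 2, n + 1)) / 2 ** n
```

The AMU condition weakens this exact property to an asymptotic one, uniformly in shrinking neighbourhoods of η and at the rate n^{-(2α-1)/2}.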
We denote by A2α the class of all best asymptotically normal and 2α-th order AMU estimators. For θ̂n 2α-th order AMU, G0(t, θ) + n^{-(2α-1)/2}G1(t, θ) is defined to be the 2α-th order asymptotic distribution of √n(θ̂n − θ) (or of θ̂n for short) if for any t ∈ R¹ and each θ ∈ R¹

  lim_{n→∞} n^{(2α-1)/2} |Pθ,n{√n(θ̂n − θ) ≤ t} − G0(t, θ) − n^{-(2α-1)/2}G1(t, θ)| = 0 .

In order to obtain the bound for the distribution of 2α-th order AMU estimators of θ, for arbitrary but fixed θ0, we consider the problem of testing the hypothesis H : θ = θ0 + tn^{-1/2} (t > 0) against the alternative K : θ = θ0. Then the log-likelihood ratio test statistic Zn is given by

  Zn = Σ_{i=1}^n log{f(Xi − θ0)/f(Xi − θ0 − tn^{-1/2})}
     = −Σ_{i=1}^n (|Xi − θ0|^α − |Xi − θ0 − tn^{-1/2}|^α) .
In order to obtain the asymptotic cumulants of Zn we need the following lemma.

LEMMA 2.1. If hΔ(x) = (x + Δ)^α − x^α for Δ > 0, then

  ∫₀^∞ hΔ²(x)e^{-x^α}dx = αΓ(2 − 1/α)Δ² + α(α − 1)Γ(2 − 2/α)Δ³ − {(1 + γ)/(2α + 1)}Δ^{2α+1} + o(Δ^{2α+1}) ,
  ∫₀^∞ hΔ³(x)e^{-x^α}dx = α²Γ(3 − 2/α)Δ³ + O(Δ⁴) ,

where

  γ = α(α − 1)Γ(α − 1)Γ(3 − 2α) / {2(2α − 1)Γ(2 − α)} .
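A quick numerical check of Lemma 2.1 (an illustrative sketch; the quadrature settings below are ad hoc choices of ours): we verify the leading Δ² term of the first expansion at a small Δ, together with the Gamma identity α²Γ(3 − 2/α) = 2α(α − 1)Γ(2 − 2/α), which ties the Δ³ coefficients of the two expansions together.

```python
import math

alpha = 1.25

def h(x, d):
    return (x + d) ** alpha - x ** alpha

def integral_h2(d, upper=30.0, steps=150_000):
    # composite trapezoidal rule for the integral of h(x)^2 * exp(-x^alpha)
    dx = upper / steps
    total = 0.5 * (h(0.0, d) ** 2 + h(upper, d) ** 2 * math.exp(-upper ** alpha))
    for i in range(1, steps):
        x = i * dx
        total += h(x, d) ** 2 * math.exp(-(x ** alpha))
    return total * dx

d = 0.01
lead = alpha * math.gamma(2 - 1 / alpha) * d ** 2
ratio = integral_h2(d) / lead          # tends to 1 as Delta -> 0
lhs = alpha ** 2 * math.gamma(3 - 2 / alpha)
rhs = 2 * alpha * (alpha - 1) * math.gamma(2 - 2 / alpha)
```

The identity follows from Γ(z + 1) = zΓ(z) with z = 2 − 2/α, and it is exactly the cancellation used in the proof of Lemma 2.2 below, where the Δ³ terms of I₁ + I₃ drop out.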
PROOF. First we have

(2.1) ∫₀^∞ hΔ²(x)e^{-x^α}dx = ∫₀^∞ (x + Δ)^{2α}e^{-x^α}dx − 2∫₀^∞ (x + Δ)^α x^α e^{-x^α}dx + ∫₀^∞ x^{2α}e^{-x^α}dx .

Since for β > 0

(2.2) ∫₀^∞ x^{β-1}e^{-x^α}dx = (1/α)Γ(β/α) ,

it follows by integration by parts that

(2.3) ∫₀^∞ (x + Δ)^{2α}e^{-x^α}dx
    = −Δ^{2α+1}/(2α + 1) + {α/(2α + 1)}∫₀^∞ (x + Δ)^{2α+1}x^{α-1}e^{-x^α}dx
    = −Δ^{2α+1}/(2α + 1) + {α/(2α + 1)}∫₀^∞ {x^{3α} + (2α + 1)x^{3α-1}Δ + α(2α + 1)x^{3α-2}Δ² + (1/3)α(2α − 1)(2α + 1)x^{3α-3}Δ³}e^{-x^α}dx + O(Δ⁴)
    = (1/(2α + 1))Γ(3 + 1/α) + 2Δ + αΓ(3 − 1/α)Δ² + (1/3)α(2α − 1)Γ(3 − 2/α)Δ³ − Δ^{2α+1}/(2α + 1) + O(Δ⁴) .

From (2.2), we obtain

  ∫₀^∞ (x + Δ)^α x^α e^{-x^α}dx
    = −{α/(α + 1)}∫₀^∞ (x + Δ)^{α+1}(x^{α-1} − x^{2α-1})e^{-x^α}dx
    = −{α/(α + 1)}∫₀^∞ {x^{α+1} + (α + 1)x^αΔ + (1/2)α(α + 1)x^{α-1}Δ² + (1/6)α(α − 1)(α + 1)x^{α-2}Δ³}(x^{α-1} − x^{2α-1})e^{-x^α}dx
      − {α/(α + 1)}∫₀^∞ R(Δ)x^{α-1}e^{-x^α}dx + O(Δ⁴) ,

where

  R(Δ) = (x + Δ)^{α+1} − x^{α+1} − (α + 1)Δx^α − (1/2)α(α + 1)Δ²x^{α-1} − (1/6)α(α − 1)(α + 1)Δ³x^{α-2} .

The remainder term R(Δ) of the Taylor expansion is represented by

  R(Δ) = Kα ∫₀^Δ (Δ − t)³(x + t)^{α-3}dt ,

where 0 ≤ t ≤ Δ and Kα = α(α + 1)(α − 1)(α − 2)/6. Since 1 − x^α ≤ e^{-x^α} ≤ 1, it follows that

  0 ≤ ∫₀^∞ (x + t)^{α-3}x^{α-1}(1 − e^{-x^α})dx ≤ ∫₀^1 x^{3α-4}dx + ∫₁^∞ x^{2α-4}dx = α/{3(α − 1)(3 − 2α)} ,

so that the corresponding part of the remainder contributes only O(Δ⁴) = o(Δ^{2α+1}). Since ∫₀^∞ (x + t)^{α-3}x^{α-1}dx = t^{2α-3}B(α, 3 − 2α), we have

  ∫₀^∞ R(Δ)x^{α-1}e^{-x^α}dx = ∫₀^∞ R(Δ)x^{α-1}dx − ∫₀^∞ R(Δ)x^{α-1}(1 − e^{-x^α})dx
    = Kα ∫₀^Δ (Δ − t)³{∫₀^∞ (x + t)^{α-3}x^{α-1}dx}dt + o(Δ^{2α+1})
    = Kα B(α, 3 − 2α)∫₀^Δ t^{2α-3}(Δ − t)³dt + o(Δ^{2α+1})
    = Kα B(α, 3 − 2α)B(2α − 2, 4)Δ^{2α+1} + o(Δ^{2α+1})
    = −{(α + 1)(α − 1)Γ(α − 1)Γ(3 − 2α)/(4(2α + 1)(2α − 1)Γ(2 − α))}Δ^{2α+1} + o(Δ^{2α+1}) .

Hence we obtain

(2.4) ∫₀^∞ (x + Δ)^α x^α e^{-x^α}dx
    = (1/α)Γ(2 + 1/α) + Δ + {(α − 1)/2}Γ(2 − 1/α)Δ² + (1/6)(α − 1)(α − 2)Γ(2 − 2/α)Δ³
      + {α(α − 1)Γ(α − 1)Γ(3 − 2α)/(4(2α + 1)(2α − 1)Γ(2 − α))}Δ^{2α+1} + o(Δ^{2α+1}) ,

and by (2.2), ∫₀^∞ x^{2α}e^{-x^α}dx = Γ(2 + 1/α)/α. From (2.1) to (2.4) we have

  ∫₀^∞ hΔ²(x)e^{-x^α}dx = αΓ(2 − 1/α)Δ² + α(α − 1)Γ(2 − 2/α)Δ³ − {(1 + γ)/(2α + 1)}Δ^{2α+1} + o(Δ^{2α+1}) ,

where γ = α(α − 1)Γ(α − 1)Γ(3 − 2α)/{2(2α − 1)Γ(2 − α)}. We also obtain

  ∫₀^∞ hΔ³(x)e^{-x^α}dx = ∫₀^∞ {(x + Δ)^α − x^α}³e^{-x^α}dx = α³Δ³∫₀^∞ x^{3α-3}e^{-x^α}dx + O(Δ⁴) = α²Γ(3 − 2/α)Δ³ + O(Δ⁴) .

Thus we complete the proof.

In the following lemma we obtain the asymptotic mean, variance and third-order cumulant of Zn under H and K.

LEMMA 2.2. The asymptotic mean, variance and third-order cumulant of Zn are given as follows: under K : θ = θ0,
  Eθ0(Zn) = (I/2)t² − (k/2)t^{2α+1}n^{-(2α-1)/2} + o(n^{-(2α-1)/2}) ,
  Vθ0(Zn) = It² − kt^{2α+1}n^{-(2α-1)/2} + o(n^{-(2α-1)/2}) ,
  κ3,θ0(Zn) = O(n^{-(2α-1)/2}) ,

and under H : θ = θ0 + tn^{-1/2},

  Eθ0+tn^{-1/2}(Zn) = −(I/2)t² + (k/2)t^{2α+1}n^{-(2α-1)/2} + o(n^{-(2α-1)/2}) ,
  Vθ0+tn^{-1/2}(Zn) = It² − kt^{2α+1}n^{-(2α-1)/2} + o(n^{-(2α-1)/2}) ,
  κ3,θ0+tn^{-1/2}(Zn) = O(n^{-(2α-1)/2}) ,

where

  I = Eθ[{(∂/∂θ) log f(X − θ)}²] = −Eθ[(∂²/∂θ²) log f(X − θ)] = α(α − 1)Γ(1 − (1/α))/Γ(1/α)

and

  k = α{B(α + 1, α + 1) + γ/(2α + 1)}/Γ(1/α)

with B(u, v) = ∫₀¹ x^{u-1}(1 − x)^{v-1}dx (u, v > 0).

PROOF. Without loss of generality, we assume that θ0 = 0. Putting
ΨΔ(x) = |x|^α − |x − Δ|^α with Δ > 0, we have Zn = −Σ_{i=1}^n ΨΔ(Xi), where Δ = tn^{-1/2}. Since

(2.5) ΨΔ(x) = x^α − (x − Δ)^α for x ≥ Δ ; x^α − (Δ − x)^α for 0 < x < Δ ; (−x)^α − (Δ − x)^α for x ≤ 0 ,

it follows that

(2.6) E0[ΨΔ(X)] = C(α)[ −∫₀^∞ {(x + Δ)^α − x^α}e^{-x^α}dx + ∫₀^Δ {x^α − (Δ − x)^α}e^{-x^α}dx + ∫_Δ^∞ {x^α − (x − Δ)^α}e^{-x^α}dx ]
    = C(α)(I₁ + I₂ + I₃) (say).

Putting hΔ(x) = (x + Δ)^α − x^α, we have from Lemma 2.1

(2.7) I₁ + I₃ = ∫₀^∞ {(x + Δ)^α − x^α}{e^{-(x+Δ)^α} − e^{-x^α}}dx
    = ∫₀^∞ hΔ(x){e^{-hΔ(x)} − 1}e^{-x^α}dx
    = −∫₀^∞ hΔ²(x)e^{-x^α}dx + (1/2)∫₀^∞ hΔ³(x)e^{-x^α}dx + O(Δ⁴)
    = −αΓ(2 − 1/α)Δ² + {(1 + γ)/(2α + 1)}Δ^{2α+1} + o(Δ^{2α+1}) ,

since the Δ³ terms cancel by α²Γ(3 − 2/α) = 2α(α − 1)Γ(2 − 2/α). We also obtain

(2.8) I₂ = ∫₀^Δ {x^α − (Δ − x)^α}e^{-x^α}dx
    = ∫₀^Δ {x^α − (Δ − x)^α}{1 − x^α + (1/2)x^{2α} + O(x^{3α})}dx
    = −Δ^{2α+1}/(2α + 1) + Δ^{2α+1}∫₀^1 u^α(1 − u)^α du + O(Δ^{3α+1})
    = {B(α + 1, α + 1) − 1/(2α + 1)}Δ^{2α+1} + O(Δ^{3α+1}) ,

where B(u, v) denotes the Beta function. From (2.7) and (2.8) we have

  I₁ + I₂ + I₃ = −αΓ(2 − 1/α)Δ² + {B(α + 1, α + 1) + γ/(2α + 1)}Δ^{2α+1} + o(Δ^{2α+1}) .

Since C(α) = α/{2Γ(1/α)} and α²Γ(2 − 1/α) = α(α − 1)Γ(1 − 1/α), it follows from (2.6), (2.7) and (2.8) that

(2.9) E0[ΨΔ(X)] = −{α(α − 1)Γ(1 − (1/α))/(2Γ(1/α))}Δ² + [α{B(α + 1, α + 1) + γ/(2α + 1)}/(2Γ(1/α))]Δ^{2α+1} + o(Δ^{2α+1}) .

From (2.5), we obtain

(2.10) E0[ΨΔ²(X)] = ∫₀^∞ {x^α − (x + Δ)^α}²f(x)dx + ∫₀^Δ {x^α − (Δ − x)^α}²f(x)dx + ∫_Δ^∞ {x^α − (x − Δ)^α}²f(x)dx
    = C(α)(I₁′ + I₂′ + I₃′) (say).

Since hΔ(x) = (x + Δ)^α − x^α, it follows from Lemma 2.1 that

(2.11) I₁′ + I₃′ = ∫₀^∞ {(x + Δ)^α − x^α}²{e^{-x^α} + e^{-(x+Δ)^α}}dx
    = ∫₀^∞ hΔ²(x){1 + e^{-hΔ(x)}}e^{-x^α}dx
    = 2∫₀^∞ hΔ²(x)e^{-x^α}dx − ∫₀^∞ hΔ³(x)e^{-x^α}dx + O(Δ⁴)
    = 2αΓ(2 − 1/α)Δ² − {2(1 + γ)/(2α + 1)}Δ^{2α+1} + o(Δ^{2α+1}) ,

where the Δ³ terms again cancel. Since

  I₂′ = ∫₀^Δ {x^α − (Δ − x)^α}²e^{-x^α}dx
     = ∫₀^Δ {x^α − (Δ − x)^α}²(1 − x^α)dx + O(Δ^{3α+1})
     = ∫₀^Δ {x^{2α} − 2x^α(Δ − x)^α + (Δ − x)^{2α}}dx + O(Δ^{3α+1})
     = {2/(2α + 1) − 2B(α + 1, α + 1)}Δ^{2α+1} + O(Δ^{3α+1}) ,

we obtain from (2.10) and (2.11)

(2.12) E0[ΨΔ²(X)] = C(α)(I₁′ + I₂′ + I₃′)
    = {α(α − 1)Γ(1 − (1/α))/Γ(1/α)}Δ² − [α{B(α + 1, α + 1) + γ/(2α + 1)}/Γ(1/α)]Δ^{2α+1} + o(Δ^{2α+1}) .

From (2.5), we have

(2.13) E0[ΨΔ³(X)] = ∫₀^∞ {x^α − (x + Δ)^α}³e^{-x^α}C(α)dx + ∫₀^Δ {x^α − (Δ − x)^α}³f(x)dx + ∫_Δ^∞ {x^α − (x − Δ)^α}³f(x)dx
    = C(α)(I₁″ + I₂″ + I₃″) (say).

Since hΔ(x) = (x + Δ)^α − x^α, it follows that

(2.14) I₁″ + I₃″ = −∫₀^∞ {(x + Δ)^α − x^α}³e^{-x^α}dx + ∫_Δ^∞ {x^α − (x − Δ)^α}³e^{-x^α}dx = ∫₀^∞ hΔ³(x){e^{-hΔ(x)} − 1}e^{-x^α}dx .

Since

  ∫₀^∞ hΔ⁴(x)e^{-x^α}dx = α⁴Δ⁴∫₀^∞ x^{4α-4}e^{-x^α}dx + o(Δ⁴) = O(Δ⁴) ,

it follows from (2.14) that

(2.15) I₁″ + I₃″ = O(Δ⁴) .

We also have

(2.16) I₂″ = ∫₀^Δ {x^α − (Δ − x)^α}³e^{-x^α}dx = O(Δ^{3α+1}) .

From (2.13), (2.15) and (2.16), we obtain

(2.17) E0[ΨΔ³(X)] = O(Δ⁴) .

Putting Δ = tn^{-1/2} with t > 0, we obtain from (2.9), (2.12) and (2.17)

  E0[Zn] = −nE0[Ψ_{tn^{-1/2}}(X)]
    = {α(α − 1)Γ(1 − (1/α))/(2Γ(1/α))}t² − [α{B(α + 1, α + 1) + γ/(2α + 1)}/(2Γ(1/α))]t^{2α+1}n^{-(2α-1)/2} + o(n^{-(2α-1)/2})
    = (I/2)t² − (k/2)t^{2α+1}n^{-(2α-1)/2} + o(n^{-(2α-1)/2}) ,

  V0(Zn) = nV0(Ψ_{tn^{-1/2}}(X)) = It² − kt^{2α+1}n^{-(2α-1)/2} + o(n^{-(2α-1)/2}) ,

  κ3,0(Zn) = −nκ3,0(Ψ_{tn^{-1/2}}(X)) = −nE0[{Ψ_{tn^{-1/2}}(X) − E0[Ψ_{tn^{-1/2}}(X)]}³] = O(n^{-(2α-1)/2}) ,

where

  I = Eθ[{(∂/∂θ) log f(X − θ)}²] = −Eθ[(∂²/∂θ²) log f(X − θ)] = α(α − 1)Γ(1 − (1/α))/Γ(1/α)

and

  k = α{B(α + 1, α + 1) + γ/(2α + 1)}/Γ(1/α) .

In a similar way to the case under K : θ = 0, we can obtain the asymptotic mean, variance and third-order cumulant under H : θ = tn^{-1/2}. Thus we complete the proof.

In order to get the bound for the 2α-th order asymptotic distribution of 2α-th order AMU estimators, we need the following.

LEMMA 2.3. Assume that the asymptotic mean, variance and third order cumulant of Zn under the distributions Pθ,n (θ = θ0, θ0 + tn^{-1/2}) are given in the following form:

  Eθ(Zn) = μ(t, θ) + n^{-(2α-1)/2}c1(t, θ) + o(n^{-(2α-1)/2}) ,
  Vθ(Zn) =
=
u2(t, 0)
K3, e(Zn) =
o( n
+
(2a
n (2a-1)12C2
(t, 0) + o( n
(2a-1)/2) 12a-1)/2)
-1)/2)
Then 1 - (2a- 1)/2 Pe,n{Zn< ao} = 2 + o(n
if and only if ao
=,u(t, 0) +
Cl(t,0)
n (2a
-1)/2 + o(n(2a-1)/2)
The proof is essentially given in Akahira and Takeuchi (( 1981), pp. 132, 133). In the following theorem we obtain the 2a-th asymptotic bound for the distribution of 2a-th order AMU estimators of 0. THEOREM 2.1. The bound for the 2a-th order asymptotic distribution of 2a-th order AMU estimators of 0 is given by
,p(t) - CO ItI2acb (t)n(2a -1 )/2 sgn t + o(n (2a - 1)/2) ,
354 ASYMPTOTICS FOR WEIBULL TYPE DISTRIBUTION 737 that is, for any en E A2a
PQn{
In(
6n - 0):5: t
} :5 c (t) - Cot2a4
(t) n (2a- 1)/2
+ o(n
(2a-1) /2)
for all t>0, PB,n{ In (en - 0) C t} > 0(t) + CoI tI2ao(t)n(2a-1)/2 + o(n(2a-1)/2)
for all t<0,
where a{B(a + 1, a + 1) + (y/(2a + 1))} Co= 21a+(1/2)F(1/a) and 0(t) and 4(t) denote the standard normal distribution function and its density function, respectively. PROOF. Without loss of generality, we assume that 00 = 0. We consider the case when t > 0. In order to choose ao such that
(2.18)
Ptn '/ z,n{Zn
< ao} =
2
+ o(n
(2a-1)/2)
we have by Lemmas 2.2 and 2.3
ao = - 2 t2 +
2 t2a+ln( 2a-1)/2 + o
(n
(2a-1)/2)
Since (2.19) Po ,n{ZZ ? ao } = Po,n{ - (Zn - It2 - ao) <- It2} , putting Wn = - (Zn - It2 - ao), we have from Lemma 2.2 ) = ktza+ln( 2a-1)/2 + o n (2a 1)/z) (
(2.20)
E'o(Wn
(2.21)
Vo(Wn)
(2.22)
K3,o(Wn ) = o(n(2a -])/2)
= It2 - kt2a
+ln(2a-1)/2 + o(n(2a-1)/2)
We obtain by (2.19) to (2.22) and the Edgeworth expansion (2.23) Po,n{Zn?ao}=Po,n{Wn!5 It2} k 2a
-(2a - 1)/2 -(2a-1)/2
_ ^(^t) - 2^ t ^(f t)n + o(n ) .
355
738 MASAFUMI AKAHIRA AND KEI TAKEUCHI
Here , from (2.18), the definition of Z„ and the fundamental lemma of Neyman-Pearson it is noted that a test with the rejection region {Zn >_ ao} is 1/2 + o(n (2a-I)i2). the most powerful test of level Let On be any 2a-th order AMU estimator . Putting A& = {V 0n < t}, we have '1z
Ptn"', n (A&) = Ptn ' Z, n {an < to
} =
I
O (n
_ (2a- 1)/2
) .
2 +
Then it is seen that XA . of the indicator of A& is a test of level 1/ 2 + o(n (2a-')/2). From (2.23), we obtain for any en E A2a
Po,n{/ On
< Po,n{Wn< It2} k(V
t)
2
t2a
o( V I t)n )2a-1)l2 + o(n
(2a- 0/2 )
that is, (2.24)
Po,n{
In On < t} <- p (t) -
Cot2a
4, (
t)f
(2a -1)/2
+ o(o (2a
- 1)/2)
for all t > 0, where k a{B(a + 1, a + 1) + (y/(2a + 1))} Co = 21a+( 1/2) - 2 ja+('12)r(1 / a) Hence we see that the bound for the 2a-th order distribution of 2a-th order AMU estimators for all t > 0 is given by (2.24). In a similar way to the case t > 0, we can obtain the 2a-th order bound for all t < 0. Thus we complete the proof. Remark 2.1. The result of Theorem 2.1 holds for 2/ 3 < a < 1, where the information amount I must be expressed as a2T(2 - (1 / a))/ T(1 / a). The proof is omitted since it is essentially similar to the above.
3. The 2a-th order asymptotic distribution of the maximum likelihood estimator In this section we obtain the 2a-th order asymptotic distribution of the maximum likelihood estimator (MLE) and compare it with the 2a-th order asymptotic bound obtained in the previous section. We denote by 0o and OML the true parameter and the MLE, respective-
ly. It is seen that for real t , O ML < 00 +
to 112 if and only if
356 ASYMPTOTICS FOR WEIBULL TYPE DISTRIBUTION 739 n
(a
log f (Xi -
00 - In-'/2 ) < 0 .
Without loss of generality, we assume that 00 = 0. Hence we see that for each real t
(3.1)
if and only if I E (dl A) log f(X; - to-'/z) > 0 . s7n
OML < to '/z
Since a f-log f(x)= - al x i -' sgn x, we put () 32 Un = - I E (dl A) log f(X; - to '^z ) ,In =1 a
i/z I a i sgn (X; - to '/z) n II = I X; - to
_
In order to obtain the asymptotic cumulants of Un, we need the following lemma. LEMMA 3 .1.
f
If he(x) = (x + d)a - xa for d > 0, then
a
x -Ie X'he(x)dx=T(2- II )d+a
2
IT ( 2- 2 ) .4 2
Y dza + o(dza) - 2a
(3.3)
fo
xza -ze X"he(x)dx = T( 3 - a)A+
2
(a - I)T 3- a dz
+o(A2),
(3.4)
fo
fo (3.5)
x3a-3e
Xuhe (x)dx = (3- 3-)T ( 3 - 3) d + O(d a a
xa I X'ha (x)dx = aT ( 3
f xza-zeX'he(
?a
x)dx = aT ( 4 -
a
dz +0(43)
)A2 + O(43)
)
,
357
740 MASAFUMI AKAHIRA AND KEI TAKEUCHI
fo x3° - 3e x°he (x)dx = O(A2) ,
(3.6)
f oxa-'e X'hA(x)dx = O(d3 ) ,
fo
xz
° ze
x'he (x) A = O(d 3) .
PROOF. From (2.2), we have fo x a- Ie xvhe(x)dx a a-1 f
2a-1
x°
=fa (x+d) x e dx-fox e dx I (x + A)a+1{( a - 1)xa-2 - ax 2a z }ex°dx - 1 a+1 0 a
a + 1 fo {
xa
+1 + ( a+ 1)dxa
+
a(
1)
a2
A2xa -1
I
. {(a - 1)xa-2 - ax2a-2}e x°dx
a+1
f
R*(d) xa
2e
xadx -
a
+ O(d 3) ,
where R*(A) = (x + A)
a+t - xa+1 - (a
+ 1)dxa - 2 a(a + 1)d2xa-1 .
In a similar way to (2.4), we obtain
fo
R *(d)xa -
2e "dx
= 2 a(a + 1)(a - 1) f o (d - t)2 (f o (x + t)a-2xa-2e xadx) dt
= I a(a + 1)(a - 1) B(2a - 2, 3) B(a - 1, 3 - 2a) A2a + o(d2a) - (a + 1)T(a - 1)T(3 - 2a) d2a + o(d2a) . 4(2a - 1)T(2 - a) Hence , we have
358 ASYMPTOTICS FOR WEIBULL TYPE DISTRIBUTION 741
fo XI-1 e- x'hA (x) 3a-1 -x°
2a-1 -xQ
a+1 (a-1)fo x e dx-afo x e dx + (a - 1)(a + 1)d fo x2"-2e-XQdx - a(a + 1)d fo x3a-2e-XQdx + 2 a(a - 1)(a + 1)d2
-
a2(a +
f
0
1)d2 0
fo
x2"-
x3a-3e X
-
3e-XQdx
Qdx
1
2 1 A a - 1)T(a - 1)T( 3 - 2a) d2a + o(d2a a 4(2a - 1)T(2 - a)
a+ 1{ a a 1 -2+ a (a- 1)(a+ 1)T(2- I )d
-(a+l)T(3- 1 )d a
+ 2 (a- 1)(a+ 1)T(2- a) d2 - 2 a(a+1)T ( 3- 2 )d2} 1 - (a - 1)1'(a - 1)1'(3 - 2a) A2a + O(d3) a 4(2a - 1)F(2 - a) T(2- a ) d + a 2 I T
2- a )d2 2a Ala+o(d2a).
In a similar way to the above , we can obtain (3.3) and (3.4). We also have 2 fo xa -'e XQhe (x)dx = f o (x + d )2ax" 'e XQdx - 2 f o (x + d)"x2" 'e XQdx
+f0:
x3a-le
XQdx
359 742 MASAFUMI AKAHIRA AND KEI TAKEUCHI
= f 0^ {x2a + 2ax2a -'d + a(2a - 1)x2°`- 2d2 } xa -'e ".dx - 2 fo j xa + axa-'d + 2 a(a - 1)xa-2d2 l x2a-Ie- "dx
+ 2 + O(d 3)
a
aF(3_ 2 )d2+O(A) , a
and similarly get (3 . 5) and (3.6). From (2.2), we obtain fo
xa
-' e X"he (x)dx =
fo
xa
'e "Q(axa 'd )3dx + o(d3)
=a2T(4- 3 )d3+o(d3) a =
O(A)
,
and similarly have (3.7). Thus we complete the proof. In the following lemma , we obtain the asymptotic mean, variance and third order cumulant of U. LEMMA 3.2. The asymptotic mean, variance and third order cumulant of U„ are given for t > 0 as follows:
Eo (U„) = - I t + - t2an (2a- 1)/2 + o (n (2a-1)i2) 2 (2a -I)/2) Vo(U„) = I+ o(n -1)/2 ) K3,o(U„) = o(n(2a
1
where I is given in Lemma 2.2 and
k' = PROOF. (3.8)
a2{B(a, a + 1) + (y/ a)}
T(1/a)
Putting d = tn'/2, we obtain
Eon l X - d l a- ' sgn (X - d)] = C(a) { f,- (x - A)'-'e-"dx -
fo (A - x)'-'eX.dx
360 ASYMPTOTICS FOR WEIBULL TYPE DISTRIBUTION 743
-f 0 (d - x)«-'e I xl adx
= C(a)
I fo,
x« i e(x +d)° dx -
Jo (x + d)«
I
'e x' dx
[
a-1 - x « -1 } e x.dx - x" = C(a) f0 xa_ 1 (e_^4)° e - } dx - f o {(x + A) d
- f o (d - x)'- 'e- xadx = C(a)(J, + J2 + J3)
(say) .
Putting hd (x) = (x + d)« - x«, we have from Lemma 3.1
(3.9)
«
-'{e (x +d)' - e J, = f, 0 x = fo, x«-Iex 0
__
}dx
°{e ha(x) - 1}dx
x«-te
Jo
-x"
xahd(x)dx + 2 J0 x« 'e xQhv (x)dx + O(d3)
=-I'(2- a )d- a2 1 T 2- a )d2+ 2 T(3- a )d2 d2 « +
O(A2 a)
+ Za
(1
_ +)r(1 _1a'A+
a )d2 a2 1 2 - 2
+ 2 d2«+o(d2a).
From (2.2), we obtain
(3.10)
- J2 =
f0
-,e x"dx - f o x«- I e x0dx (x + d)«
« a
+ f o (x + d)«x« 'e x°dx - a .
Since by a similar way to (2.4)
361 744 MASAFUMI AKAHIRA AND KEI TAKEUCHI
f o (x + d)axa-'e X°dx = a +F 2- a )d+ a2 1 T(2-
a
)d2
Y
j2a
2a
+o(d2a),
it follows from (3.10) that
(3.11)
)dJ2= -T( 2a a
( 1- a
)d2
a
2 Ala
+
+ O(d2a)
Za
We also have
(3.12)
d a' -X°
-J3=,fo(d-x)-e dx =
fo (d
-
x)a-'( 1
- xa)dx +
O(A3a)
a a
-d2aro'(1- d)a'(^ )a^ dx+O(d3a)
= da - B(a, a + 1),42a + 0('43a) . a
From (3.8), (3.9), (3.11) and (3.12), we obtain (3.13)
Eo[IX-dla-'sgn(X-d)] =C(a){-2(a- 1)a- 1 )d a a + ( B(a, a + 1) + Y )deal+OWet) a JJ =-(a-1)F(1 (1/a))d
T(1/a)
+ a{B(a, a + 1) + (y/ a)} d2a + O(d2a) . 2T(1/a) Next, we have
(3.14)
Eo[IX - d 12a-2]
362 ASYMPTOTICS FOR WEIBULL TYPE DISTRIBUTION 745
= C(a) { f (x = C(a) f f o x2a
d)2a
-2e- x°dx +f
-2e(x+d)°
(d - x)2a-2e I xl °dx
I
dx + f 0 (x + d)2a72e-x°dx G 2a-2 x°
+ f 0 (d - x) e dx
[
(x +,J), x2a 2e- x °dx + f0 {e = C(a) 2 fo
+f0 {(x +
d)2a-2 -
ex° } x2a
2dx
x2a-2}e-x°dx
G 2a-2 x° +f0
(d - x) e dx
= C(a)[Ji + J2' + J3 + Jd]
(say) .
From (2.2), we obtain
(3.15) J(=2 f0 x2a-2e x°dx = a T ( 2 - a) = a ( 1 - a)F(l - a Since
f
0
Jz = 0 {e
(x +e)°
- ex. } x2a-2dx
= f°° x2a- 2e x°{e ha(x) - 1) dx 0 _ _f0
x2a-
_ 6 f0
ze x°he (x)dx + f 0 x2a ze xhv(x)dx
xz°-2e
x° he (x) dx + O(d3) ,
it follows from Lemma 3.1 that (3.16)
J2=-T( 3- a )d+(a -1)T(3- 3 )d2+0(d3).
From (2.2), we have
- 2 - x2a-2je x°dx (3.17) J3 = f0 {(x + d)2a
363 746 MASAFUMI AKAHIRA AND KEI TAKEUCHI
1 42 a - I
+
a f -(x
2a- 1 2a- 1 0
+ A)2a- Ix
a
-'e- x*dx
- a r (.^2aa> 1
1
2a
-1 _ 1 2a - 1
2a -14 aT +
2a
a
a
{f0 x32edx + (2a - 1)d f0 x3a-3e-x'dx
1
+ (a - 1)(2a - 1)42
fo
2a-i
-aT 2-
2a1 1 d
x3a-4e-
xadx
_
}
_
a+ 2a1 1 T 3
+T(3- a )4 +(a-1)T(3-
3
+0 (A 2)
-
a
)42 +O(43)
1 42a-I+T (3- a )d+(a -1)F(3- a )d2 2a - 1 +0(42). We also obtain
(3.18)
J4
=f0 (4 -x) = f' (A - x
za -z
)za
X. -2dx
za -i
2a -142a-
x
e dx
43a
(4
_ x)2a -2dx + O(44.- 1)
' fo (4 )a (1 - a
) 2a -
2 d
dx + O(A 1
i
2a- 1
- B(a + 1, 2a - 1)A ' + O(44a ^)
From (3.14) to (3.18), we have Eo[IX-4Iza-2] =
a r 1 '(1/a) al a 1 + I'(1/ )) T(3- a )42+0(42),
a
)
364 ASYMPTOTICS FOR WEIBULL TYPE DISTRIBUTION 747
hence , by (3.13) (3.19)
Vo ( IX - AI a-'sgn(X -d))
=aT(1 /a)r(1 +(a- 1)
-a)
[ F(1/a)T(3- a T1-la))_ 2
-(a- 1){ (F(1/ ) ]A2+o(42). Third, we have (3.20)
Eo [I X- A
I3a
-
3 sgn (X-d)]
= C(a) { f- (x -
d)3a-3e x°dx -fo (A_ x)3a - 3e X. dx 3a -f 0 (d - x) 3e iXi°dx }
I
- x3a- 3
= C(a) f0 -
f
{e (x+d)° - e x°}dx
m
0
fA
{(X +
(A
d)3a - 3 - x3a - 31
-
3
e-
x
e °dx
1
(say) .
= C(a)(Ji" + J2" + J3') Since JI,
x3a-3 -f0
_
{
e
-
(x
3e
+d)°
- e x° }dx = f0 x3a-3ex°
° h d( x)dx +
2
f0
x3a
{e ha(X) _ 1 }dx
3e X°he(x) dx + o(d2)
it follows by Lemma 3.1 that
(3.21) Jr = -3I 1-a)T(3- a )d+O(d2). From (2.2), we have
365 748 MASAFUMI AKAHIRA AND KEI TAKEUCHI (3.22) - J2" = f 0
+ d)3a-3 - x3a-3 le
{(X
x-°dx
= 3(a - 1)Afo x3a-4e ".dx + o(A) =3(1- a)r(3- a )A+o(A)
and also (3.23)
d - J3' = f 0 (d - x )3a-3e -X adx 3a-2 _ 3 - 2 -
A4a -2
f
d ( d
-3 )a ( 1 - d )la
d dx +
O(A5a-2)
A3a-2
3a - 2
+ O(A4a-2) .
From (3.20) to (3.23), we obtain
Eo[IX-
A13a -
3(a- 1). r
3sgn (X-A)]=
a
)A+o(A),
hence (3.24) K3, o (I X
- Al"
=Eo[{IX
sgn (X- A))
-Ala-1
sgn(X
-A)- Eo[IX- Al a-lsgn (X-A)]}3]
=O(A). Putting A = to 1/2 with t > 0, we have from (3.2), (3.13), ( 3.19) and (3.24) Eo(U„) =
a\/n Eo[I X -
_ - a(a - 1)
to 1 /21a
-1
r(1- (I/ a)) T(1/a)
sgn (X - to '/2)] I
+ a2{B(a, a+ 1) + (Y/ a)) 2a - (ea-1)/2 2r(1/a) + o(n
(2a-l)/2)
I t + kf
t2an( 2a -1)/2
+o(
n
(2a -1)/ 2
)
Vo(U„) = a2Vo(I X - to-1/21 a-1 sgn (X - to 1/2))
366 ASYMPTOTICS FOR WEIBULL TYPE DISTRIBUTION 749
T(1 - (I/a)) + o n (2a-1)/2) = a a- 1 ( T(1 ) /a) = I+ o(n (2a-1)/2) 3
K3,o(U
n) = K3,o(I X - tol121a-1 sgn (X - tn- '/2))
= o(
n -(2a- 1)/2)
where I= a(a - 1)T( 1 - (1 / a))/T(1 / a) and k'= a2{B(a, a + 1) + (y/ a)}/ T(1 /a). This completes the proof. In the following theorem , we obtain the 2a-th order asymptotic distribution of the MLE of 0. THEOREM 3 . 1. BML of 0 is given by
(3.25)
P8, n{
The 2a-th order asymptotic distribution of the MLE
In (BML - 0) <- t} C1 I t l2a o ( t)n (2a- 1)/2 sgn t = 2(t) -
+ (
n (2a-1)/2)
where C, = (2a + 1) Co with Co = a{B(a + 1, a + 1) + (y/ (2a + 1))}/ {2la+l l/2'T( 1 / a)}, and also the MLE is not 2a-th order asymptotically efficient in the sense that its 2a-th order asymptotic distribution does not uniformly attain the bound given in Theorem 2.1. IX)" is symmetric about the PROOF. Since the density f(x) = C(a)e origin , we see that the MLE of 0 is a 2a-th order AMU. We consider the case when t > 0. Using the Edgeworth expansion , we have by (3.1), (3.2) and Lemma 3.2 Po,n{v( 0ML =
PP,n{Un
0)
C
:5 t}
01 t2a`N ( V t)nl2a -,1/2 + o(n(
_ k(^t) -
2a-1)/2
2 ,[1-
that is,
(3.26)
PO,n{ In (OML - 0) < t} k' 2a (2a - 1)/2 (2a-,)/2 + o(n ) 2Ia +( 1/2) t s (t)n
367
750 MASAFUMI AKAHIRA AND KEI TAKEUCHI
4( t)n (2a - 1)/2 + o (n (2a -1)/2) = 0(t) - Clt2a ,
where C1 = k'/{2Ia
+( 1/2) }.
In a similar way to the case t > 0, we obtain for t < 0 In (OML - 0)
(3.27) PB,n{ =
0(t)
+
< t}
C1I t I2ao( t) n (2a-1)/2
+ o(
n
(2a-1)/2)
Hence (3.26) and (3.27) imply (3.25). Since k' = a2{B(a, a+ 1) + (y/ a)}/ ['(1/a) and Co = a{B(a + 1, a +. 1) + (y/(2a + 1))}/{2Ia+(1/2)T(l /a)], it is seen that C1 = (2a + 1) Co. Since Cl > Co for 1 < a < 3/ 2, it follows from Theorem 2.1 and (3.25) that the MLE is not 2a-th order asymptotically efficient in the sense that its 2a-th order asymptotic distribution does not uniformly attain the bound given in Theorem 2.1. This completes the proof. Remark 3 . 1. In the double exponential distribution case, that is, the case when a = 1, it is shown in Akahira and Takeuchi (1981) that the bound for the second order asymptotic distribution of the second order AMU estimators of 0 is given by t2 o(n 1/2) rh(t) - 6 0( t)n '/2 sgn t +
,
and the second order distribution of the MLE of 0, i.e., the median of X1,..., X, is given by
(3.28)
fi(t) - 2 0(t)n_n '/2 sgn t + o(n 'l2)
The results coincide with the case when a = 1 is substituted in the formulae of Theorems 2.1 and 3 . 1, but note that the proofs of these theorems do not include the case for a = 1 since it does not automatically hold that F(a)=(a- 1)T( a- 1)fora= 1.
4. The amount of the loss of asymptotic information of the maximum likelihood estimator In the section we obtain the amount of the loss of asymptotic information of the MLE BML using its second order asymptotic distribution (3.25). Differentiating the right-hand side of (3.25) w.r.t. t, we have the second order asymptotic density g(t) of In (BML - 0) as follows: (4.1) g(t) = 0 (t){1 - Cl(2altl2a-I
_ Itl2a+1)n (2a-1)/2} + o (n(2a-1)/2)
368 ASYMPTOTICS FOR WEIBULL TYPE DISTRIBUTION
751
for -oc0
I
(4.2)
^tl
a^
(t) dt = 27c f(,'o tae 11 /2dt
=
2a/2
f (after transformation u = t2/2) a2 1 ).
Since, for sufficiently large n, d
d log g(t) t - C, {2a(2a -
1)l tl2a - 2 - (2a +
1)l tl2a}
it follows from (4.1) and (4.2) that the asymptotic information amount of the MLE is given by
IML = nI
IM L
f dt log g(t) 12g(t)dt
(p(t)[t2 + C,{4a(2a -
= nI f
- 2(3a + 1)ltl2 + o(n(3/2)-a) 2a+(3/2)
= nI { 1 - C, T(a +
I)l
t l2a-1
a +1 + ltl 2a +3 } n -(2 a - 1)/2 ]dt
1)n-(2a-1)/2
} + 0(n(3/2)-a) .
Hence, the amount of the loss L of asymptotic information of the MLE is given by 2a+(3/2 )
L = nI - IML = C,IT( a + 1)n(3/2)-a + o(n(3/2)-a) 2a+(1/2)a (2a + 1)T(a + 1){B(a + 1, a + 1 ) + (y/(2a + 1 _ Y,j a-(1/2)r(1/a) + o(n(3/2)-a) .
369 752 MASAFUMI AKAHIRA AND KEI TAKEUCHI
In a similar way to the above , it follows from (3.28) that, in the double-exponential distribution case , namely , when a = 1, the amount of the loss of asymptotic information of the MLE is given by 2 2n/7c + o(f), since I= 1. Acknowledgement The authors wish to thank the referee for pointing out two errors in the original version. REFERENCES
Akahira, M. (1975). Asymptotic theory for estimation of location in non-regular cases, II: Bounds of asymptotic distributions of consistent estimators , Rep. Statist . Appl. Res. JUSE, 22, 99-115. Akahira, M. (1986). The Structure of Asymptotic Deficiency of Estimators, Queen 's Papers in Pure and Appl. Math ., No. 75 , Queen 's University Press, Kingston, Ontario, Canada. Akahira , M. (1987). Second order asymptotic comparison of estimators of a common parameter in the double exponential case, Ann. Inst. Statist. Math., 39, 25-36. Akahira , M. (1988a). Second order asymptotic properties of the generalized Bayes estimators for a family of non-regular distributions , Statistical Theory and Data Analysis II, Proceedings of the Second Pacific Area Statistical Conference, (ed. K. Matusita), 87-100, North-Holland , Amsterdam-New York. Akahira, M. (1988b ). Second order asymptotic optimality of estimators for a density with finite cusps, Ann. Inst. Statist. Math., 40, 311-328. Akahira, M. and Takeuchi , K. (1981). Asymptotic Efficiency of Statistical Estimators: Concepts and Higher Order Asymptotic Efficiency, Lecture Notes in Statistics 7, Springer , New York. Akahira, M., Hirakawa, F. and Takeuchi , K. (1988 ). Second and third order asymptotic completeness of the class of estimators , Probability Theory and Mathematical Statistics, Proceedings of the Fifth Japan- USSR Symposium on Probability Theory, (eds. S. Watanabe and Yu . V. Prokhorov), Lecture Notes in Mathematics 1299, 11-27, Springer, Berlin. Ghosh, J. K., Sinha , B. K. and Wieand , H. S. (1980). Second order efficiency of the mle with respect to any bounded bowl -shaped loss function , Ann. Statist., 8, 506-521. Pfanzagl, J. and Wefelmeyer , W. (1978). A third -order optimum property of maximum likelihood estimator , J. Multivariate Anal., 8, 1-29. Pfanzagl, J. and Wefelmeyer, W. (1985). Asymptotic Expansions for General Statistical Models, Lecture Notes in Statistics 31, Springer , Berlin.
Sugiura, N. and Naing , M. T. (1987). Improved estimators for the location of double exponential distribution , Contributed Papers of 46 Session of ISI, Tokyo, 427-428.
370 SEQUENTIAL ANALYSIS, 8(4), 333-359 (1989)
THIRD ORDER ASYMPTOTIC EFFICIENCY OF THE SEQUENTIAL MAXIMUM LIKELIHOOD ESTIMATION PROCEDURE Masafumi Akahira Kei Takeuchi Institute of Mathematics Research Center for Advanced University of Tsukuba Science and Technology Ibaraki 305 University of Tokyo Japan 4 -6-1 Kowaba, Meguro-ku Tokyo 156, Japan Keywords and Phrases :
Sequential estimation procedure ; stopping rule ; third
order asymptotic efficiency ; maximum likelihood estimation procedure ; Wald identity ; asymptotically median unbiased estimator ; Edgeworth expansion.
ABSTRACT Under suitable regularity conditions , the third order asymptotic bounds for distributions of regular estimators are obtained . It is shown that the modified maximum likelihood estimation procedure combined with appropriate stopping rule is uniformly third order asymptotically efficient in the sense that its asymptotic distribution attains the bound uniformly in stopping rules up to the third order. 1JNTRODUCTION We consider a class of sequences of sequential estimation procedures {IIa : a=1,2, •••}, where for each a , IIQ denotes a sequential estimation procedure, that is, an estimating method combined with a stopping rule. We consider such a sequence that the expected sample size tends to infinity as a--- , also that the sample size •n is almost constant in the sense that Va(n)/{Ea(n )}2- 0 as a--00 or more precisely Va(n)/Ea( n)=O(1) as a-- -. In the previous paper by the same authors ( 1987), the Bhattacharyya type bound for asymptotic variances of estimation procedures is obtained , up to the second order , and it is also shown that the modified maximum likelihood estimation procedure attains the boun', if the stopping rule is properly determined. 333
371 334
AKAHIRA
AND
TAKEUCHI
In this paper it is shown that the third order asymptotic bounds for distributions of regular estimators which are completely similar to the fixed sample case, can be obtained, and it is also shown that for a choice of appropriate stopping rule, we may attain the third order bound uniformly which is generally impossible for the fixed sample case. 2. Wald's identity and moments Suppose that XI, ••• , Xn are independently and identically distributed (i. i. d.) random variables according to a distribution with a density f(x, 8) with respect to a a-finite measure p , where 0 is a real -valued parameter. We assume an usual set of regularity conditions on f(x , 0). And now we assume that a sequential sampling rule is given , with which we continue to observe X1, X2, ••• , Xn until n=No, and make an inference (estimation , test, etc.) based on XI, , XNO. Whether or not we stop sampling at n is decided based on XI, , Xn. In the subsequent discussions we consider asymptotic case where we actually consider a sequence of sequential inference rules {IIQ }(a=1, 2, ••• ) in which expected sample size va (8)=Eg0(n) approaches to infinity as a-a, and consider limiting properties of inference taken. In what follows we actually consider asymptotically almost fixed sample size rule, that is, we assume that n/va(8 ) converges in probability to 1 uniformly in 8 as a--. More precisely we assume the following (A.1) For any fixed point 80, va(8)/va(8o ) = 0(1), Ee,Q(n) =% (O) + O(1/v,,(8)), Vo,Q(n)/vQ(8)=O ( 1),Eo,Q(nk)/{va ( 8)}k=0(1) (k = 2,3,4), and {(ak/aek)va(e)}/vQ(8) = O(1) (k =1,2 ), uniformly in a neighborhood of 80, where E8,0(•) and V0, 0(•) designate expectation and variance , respectively. As the basic tool of the analysis we shall make frequent use of the following fundamental lemma of A.Wald (1959 ). Suppose that XI, , Xn are i.i.d. random variables. Proposition 2.1. Let Zn = g(X1)+ ••• +g(Xn) and 4(t)=E[exp {itg(XI)}]. Then we have E[eitZ„ / {4(t)}n] = 1.
This proposition can be easily generalized to the following : Proposition 2 . 2. LetZ„i = gi(X1)+ ••• +gi(Xn )(j=1, ••• , k), and 4 ( t1, , tk) =E(exp {itlgl ( X1)+ ••• +itkgk(X1)}]. Then we have Psp(it1Z +--.+itkZR)
E
)=1. l
From this we have the following lemma.
(2.1)
372 THIRD
ORDER
ASYMPTOTIC
EFFICIENCY
335
Lemma 2 .1. Let Zni U= 1,••,k) be defined as above , and denote pj =E[gj(Xi)] Q=1,---,k) and p J1,...jk.)= E[{gjl (X1)-pjl}...{ gjk.(X1 )- Pjk .}], where (j1 ,... jk,}c{1,...,k}.
Then we have
E[Zni] =E(n)pj (j =1, ••• ,k) ; E[(Znu -pjl )( Zn12 - p12)] =E(n)p(11J2); E[(Znii -pjl)(Zn12 - Pj2 )(Zn13 - pJ3 )] = E(n)p(jl j2j3) + E(nZnii ) p(2J3)
+ E(nZni2 )p(j 1 j3 ) + E(nZ j3 )p(j i j2) E[(Zn11-pji)(Z i2-pJ2 )(Zrj3-pJ3XZn14-pj4)] =E(n)p(11 J2J3J4 ) - E(n2){p 0 1 J2)p(J3J4) + 003)02M +p(11J4)P(12J3)}+E(nZn11Zn12 )p(j3 j4) + E(nZai2Zr,i3 ) pUlj4) +E(nZn13Zn14 )p(jij2)+ E(nZn11Zn13 )p(12 j4) + E(nZni2Zni4 )p(ji j3)+ E(nZ,j1Znj4 )p(12j3) +E(nZnji )p(j2J3J4) + E(nZni2 )p(1 j3J4) +E(nZni3 )p(l j2J4) + E(nZni4 )p(11 j2j3),
where {it, •••jk'}c{l,•• ,k}, provided that the differentiation under the integral sign of the left-hand side of (2.1) is allowed. Proof.
We shall give the proof of the last equality since the previous ones
are far easier to prove . For simplicity we shall write 1,2,3,4 instead of j1, j2 , j3, j4, and also assume pja=0 (a=1,2,3,4) without loss of generality. Putting tj=itj (j =1,2,3,4) in Proposition 2.2 we have ezp(t1Za 1-E L
+t2Z.t n
+t^.n)
JI
(01,t2, 3,Qr
= E[exp{tiZnl + t2Zn2 + t3Zn3 + t4Zn4 - n4r(ti,t2,t3,t4)}]
=E(e$) (say ), ( 2.2) where ip( tl,t2,t3,t4)= 1og4(ti,t2 , t3,t4). Differentiating both sides of (2.2) we obtain
a4
E(e'1')=0
m1m2at3at4
Since differentiation and expectation can be interchanged, we have ( O_EI at
a4
t
1"?34
_E at1at2at3\at4e^^^
373 336
AKAHIRA AND TAKEUCHI
-E
1Iat 1 & 1&
I(
&"
+
4 2 33
-E[a((
at 4 a!
)e*1
a3.t + -t. 2k +
t
+ +±'2k'
&1 &2&3 4 813814 &2 812814 813 at4 812813 &4 &3 &2
)
e^
_ a -E ael ((+ ++34+2++u+3+$4$23++4+34)2)e-O} 1
E 1 (1234 + +134+2+$34$12++124+3 + +24+13++14$23+'P4 '123
++14+3+2++4+13+2+ t4+3+12+4)1+234 + +14)24)34
+$1$3 24 + +1$4 23 + +14AA4)e+J'
where +a = alt/ata, +ap = a2+/ataatp, +apr = a31t/ataatpatr and +apr6 = a4Vataatpatrat8 (a,P,r,S = 1,••• ,4). Then we obtain 0 = E[ - np ( 1,2,3,4 ) - np(1,3 ,4)Zn2 + n2p(3 ,4)p(1,2) - np(1,2,4)Zn3 + n2p(2,4)p(1,3 ) + n2p(1,4)p(2,3 ) - np(1,2 , 3)Zn4 -np(1,4)Zn2Zn3 -np(1,3 ) Zn2Zn4-np ( 1,2)Zn3Zn4 - np(2,3 ,4)Zn1-np(3,4)Zn1Zn2 - np(2,4)Zn 1Zn3 - np(2,3 )Zn1Zn4 + Zn1Zn2Zn3Zn4],
hence E[Zn 1Zn2Zn3Zn4] = E(n)p(1 ,2,3,4) -E(n2){p(1 , 2)p(3,4) + p(1,3)p(2,4) + p(1,4)p( 2,3)} + E(nZn 1Zn2)p(3,4) + E(nZn2Zn3)p(1,4) +E(nZn3Zn4) p(1,2) +E(nZn1Zn3 )p(2,4) +E( nZn2Zn4)p(1,3) +E(nZn1Zn4) p(2,3) +E(nZn1 )p(2,3,4) +E(nZn2) p(1,3,4) + E(nZn3 ) p(1,2,4) +E(nZn4 ) p(1,2,3). This completes the proof. Remark 2 .1. From Lemma 2.1 it follows that the fourth order cumulant of Zni1, Zi2, Zi3 and Zni4 is given by K(Zni1, Zni2, Zni3, Znl4) = E(n)p(1, j2, j3,j4 ) - V(n){p(11,j2)p(j3,j4)+p(1J3)p(2J4) + A 1,J4)p(2,13)} + Cov(n, Z& I Zni2 ) p(j3 j4) + Cov(n,Zl2Zi3)p(j 1 X14) + Cov(n, Zni3ZJ4 )p( 1 J2) + Cov(n,Zn11Z13 )p(12, J4 ) + Cov(n ,Zni2Zni4)p(j 1„13) + Cov(n,ZnilZni4)p(12j3 ) +E(nZnii) p(12j3,j4)
374 THIRD ORDER ASYMPTOTIC
EFFICIENCY
337
+ E(nZnj2 )11(j 1 j3J4 ) + E(nZl3 ) p(j 1 j2J4) +E(nZr,i4)11(j1i2i3)•
3. Asymptotic cumulants of sequential estimation procedures In the same situation as in the previous section, we shall obtain asymptotic cumulants of sequential estimation procedures . For simplicity we denote va(6) by v. Let ® be a parameter space which is assumed to be an open subset of Euclidean 1space RI. In order to obtain the asymptotic cumulants of sequential estimation procedures, we assume the following conditions. (A.2) {x:f(x,6)>0} does not depend on 0, and f( x,61)/f(x,02) is not equal to constant for all disjoint points 81, 62. (A.3) For almost all x[pJ, f( x,8) is four times continuously differentiable in 6. In the Taylor expansion 4 hi
log
ftzre) _ X e( i)(e,x)+h4R(x,h) i=1
R(x,h) is uniformly bounded by a function G(x) which has moments up to the fourth, where 60(6,x) = (ai/aei)e(6,x)(i = 1,2,3,4 ) with e( 6,x) = log f(x,6) (A.4) For each 0, E8[e(I)(6,X)] = 0, 0<1(0) =Ee[{e(I)(6,X)}2] _ -Ee[e(2)(6,X)]
'\/vi=1
and z = -^ (e, 3,(9,X.) 3,v
v L vi=l
t
-3J(O)-K(e)}.
The following lemma is very useful for calculations of asymptotic cumulants.
375 338
AKAHIRA AND TAKEUCHI
Suppose that Ye is a function of X1, ••• , Xr, and 0 and is Lemma M . differentiable in 0. Then
dy
I
0 1 ^, Yd _ vv d e Ee( Yd-
I/ J VV Ee\ ae
and d EB(Zl"Yd -Vv dOE0(zly -,/-,EO (Y0 a01r)
J.E0(zlV ae )'
provided that the differentiations under the integral signs ofEe(Ye) and Ee(Z1,,,Ye) are allowed , respectively. The proof is omitted since the lemma is similar to Lemmas 5.1.1. and 5.1.2 in Akahira and Takeuchi (1981) ( see also Lemmas 2.1.1 and 2.1.2 in Akahira, 1986). An estimator 9a=ea (X1, , X,) of 0 is called to be -*/v-consistent if for any e>0 and any q of ®, there exist a sufficiently small positive number 8 and a sufficiently large positive number L satisfying the following : A
lim
Pe {\/ v e le I <s
I0a-0I?t}<e.
For each k=1,2,•••, a '/v-consistent estimator 0a is called k-th order asymptotically median unbiased (or k-th order AMU) estimator if for any q E there exists a positive number 8 such that lim
lim CF-
e... •I -q I< 8
SUP
qi <
v(k-
i)1z1 p {0
1 5e}- I =e, a 2
n 1 1Y`k-lua1Pe{ea?e}- I = 0.
2
A Let C be the class of the all bias -adjusted BAN estimators 0a which are third order AMU and asymptotically expanded as _ A zIV (0) 1 1 1 ^/v(9a-0 )= 1(0) +V -Q+vR+ oP(v),
where Q=Op(1), aQ/a0=Op(1), R=Op(1), aRIa0=Op(1), and the distribution of v/v(0a-0) admits the Edgeworth expansion up to the order v-1. If an estimator 0a belongs to the class C, then we call it C-estimator. However, in the subsequent discussion, the term R is not explicitely needed, hence we simply write
z (0) 1 ^v (e.-8)= «0) + ^-Q+O (v)' V v
(3.1)
although the above stochastic expansion is necessary in order to validate the proofs, since they actually depends on two facts that ( i) the asymptotic distribution of dv (9a - 0) is equivalent to that of
376 THIRD ORDER ASYMPTOTIC
EFFICIENCY
339
Z,,Y(e) 1 1 Q+x r(e) +^, v up to the order v-1, and that ( ii) the asymptotic moments of R and their derivatives are of 0(1).
Theorem 3 . 1. Assume that the conditions (A.1) to (A.5) hold. Let 6a be an Cestimator with (3.1). Then its asymptotic cumulants xi (i =1,2,3,4 ) up to the fourth and the order o(v-1) are given as follows : For Ta = Vv (6,-0)
µ1(e) µ2(e) 1 K1=Ea(Ta)=-+-+o 1/v v v
KZ-Ve( °)
K3=K3e( a
1 2µl(9) 2 ✓ µ1(e) c(9) + v +oY I(s)+ v1(9) v I(9)
p3(9) r3(e) I )= Ee[{Ta-Ee (a)}3)= =+-+o
r4,,(T
v
v
v
(i4(9) r4(O) )}2= - + - +.(-1 ) V V V
where 1 3J(9)+2K(O) 3J µt(9)= 61(9) X3(9) with P3( O)= - - 2
t3(e> yr (e)
1
µ2(e) = 61(9) r 3(9) with
3 r3(e)= 21(0) Ee[Z
( 1 ,V Q- µt(9))2)
I2(2J(9)+K(9)XJ(9)+K(9)) - 3H (9)+4L(9)+12N(9 ) X4(0)= J4 IS(e) (9)
3 JZ(O) 14(9) (M(O)- 1(0)
5v' 27v'2 4v" + (30J(O)+ 19K(O)) + - , 314(O)v J3(O)v2 13(9)v
3 J v' (n-v)1 Ee(W^ with W= ZZ -( - )Z - r4(9)= -v--,Y J4(9) 1 v vv z(6) = V9(Q), v'=(a/aO)va(0), and v" = (a2/a02)va(0).
Remark 3.1. If v=va(6) is independent of 0, then it is seen that the cumulants in Theorem 3.1 coincide with those in non-sequential case given in Theorem 2.1.1 in Akahira (1986 ), since v'=v"= 0 and E9(W2) = M(0) - {J2(0)/I(0)}.
377
340
AKAHIRA
AND
Remark 3 .2. Note that pl(9),
TAKEUCHI
{33(0) and p4(0) in Theorem 3.1 are
n independent of the specific estimator 0a.
A Theorem 3 .2. Assume that the conditions (A.1) to (A.5) hold. Let 9Q be an Cestimator with (3.1). Then the Edgeworth expansion of the distribution of 9Q up to the order v-1 is given by Pe{V I(9)v (9a-e);5 t)
I(e) ./I(0) p3(e) 2 12(e){p4(e )+ r4(e)}
e $(t)-
=fi(t)-
(t3- 3t)$(t)
24v
6w
13(0)p2(e) I(e){de )+ µi(0)} - 10t3 + 15t)$(t)- 2v 140) 72v (16
12(0 ) P 3(o)pl(e) 2 1
I(0)vi(0) r3(0) 2
6v t $(t)-
sv
t(t -3)$(t)+o(Y).
(3.2)
where 111 (9), 112(9), p3(9), p4(9), T3(9), r4(()) and i(9) are given in Theorem 3.1, and (t)- 40)dx with $(s)=(1/^/2rt)e 12 f 2
Theorem 3.3 Assume that the conditions (A.1) to (A.5) hold. Let 6, be any Cestimator with (3.1). Let Ee(WQ*)=rEg(W2)/1+o(1) with W=Z2,v-{(J/1)-(v%v)}ZI,v -{(n-v)I/-/v }, Then the asymptotic bound for the distributions of 0, and a fixed stopping rule up to the order v- t is given as follows :
P8{V I(e)r (ea-e)5 t) (())V
(Z) $(t)- 6Vv
P3(0) 2 I2(9
t $ (t)-
) p4(e) 3
24v
(t -3t)$(t)
13(9)2(0) I(0)p2(0) (t6-10t3 + 15t (t)- 2v t$(t) 72v IZ(e)S3(e)pl(e) 2 - t(t -3 6v
(t)-
2v
1 Ap(t){J(0)+R(9)+
4vI3(e) v
EE(W2) t Ap(t){(-+rV'l(e))2+ 2vI2(0) 2 4
I(0)}2+o V
(i)
(3.3)
378 THIRD ORDER ASYMPTOTIC
EFFICIENCY
341
for all t > O (t<0), where p1(0), p3( 0) and p4(0) are given in Theorem 3.1, and Q*=Q-Pt(e).
Corollary 3.1. Assume that the conditions (A.1) to (A.5) hold. Let θ̂_a be any C-estimator with (3.1). Then the asymptotic bound for the distributions of θ̂_a with any stopping rule satisfying (A.1) up to the order v⁻¹ is given by

Pθ{√(I(θ)v)(θ̂_a − θ) ≤ t}
≤ (≥) Φ(t) − {I(θ)√I(θ)β3(θ)/(6√v)} t²φ(t) − {I²(θ)β4(θ)/(24v)}(t³ − 3t)φ(t)
− {I³(θ)β3²(θ)/(72v)}(t⁵ − 10t³ + 15t)φ(t) − {I(θ)μ1²(θ)/(2v)} tφ(t)
− {I²(θ)β3(θ)μ1(θ)/(6v)} t(t² − 3)φ(t) − {1/(4vI³(θ))} tφ(t){J(θ) + K(θ) + (2v′/v)I(θ)}² + o(1/v)  (3.4)

for all t > 0 (t < 0).
Denote by θ̂_ML the maximum likelihood (ML) estimator of θ based on the sample (X1, ..., Xn). Let θ*_ML be the estimator modified from the ML estimator to be third order AMU. Then it follows in a similar way to Akahira and Takeuchi (1981) that under the condition (A.1)

√v(θ̂_ML − θ) = Z1,v/I + {Z1,v/(I²√v)}{Z2,v − (n−v)I/√v} − {(3J + K)/(2I³√v)}Z1,v² + o_p(1/√v).  (3.5)
Theorem 3.4. Assume that the conditions (A.1) to (A.5) hold. If the stopping rule is so determined that the observation is stopped at n satisfying

v(θ̂_ML)I(θ̂_ML) + c(θ̂_ML) + ε = −Σ_{i=1}^{n} (∂²/∂θ²) log f(X_i, θ̂_ML)  (3.6)

with

c(θ) = Jv′/(Iv) + L/I − Iv″/(2v) + {2L + M + N}/(2I)  (3.7)

and some random variable ε with Eθ(ε) = o(1), and the stopping rule of (3.6) satisfies (A.1), then the modified maximum likelihood estimation procedure combined with the stopping rule is uniformly third order asymptotically efficient in the sense that its asymptotic distribution attains the bound (3.4), up to the order v⁻¹, uniformly in t and stopping rules.
379
342
AKAHIRA
AND
TAKEUCHI
It has been well known that for fixed-sample estimation Eθ(W²) = Eθ[{Z2,v − (J/I)Z1,v}²] represents the minimum loss of information for any best asymptotically normal estimator (Fisher, 1925; Rao, 1961), or the curvature of the model (Efron, 1975; Amari, 1985). The above result establishes that by the choice of an appropriate stopping rule the loss of information can be reduced to zero (!), or the model can be made flat (!). It is also to be noted that the right-hand side of (3.6) can be interpreted as the "realized" or ex post amount of information, and the stopping rule concerned is that we continue to sample until the "realized" amount of information is (nearly) equal to the expected amount of information corresponding to the predetermined expected sample size. In practical cases it may be troublesome, though not really difficult, to determine c precisely, and also to check the condition (A.1); but we may rather stop sampling as soon as the right-hand side of (3.6) exceeds v(θ*_ML)I(θ*_ML) + c(θ*_ML). Then the expected sample size will not be exactly equal to v, but the difference Eθ(n) − v(θ) will be of small magnitude o(1).
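The stopping rule described above can be sketched in code. The following toy simulation (ours, not from the paper) uses a logistic location model, where the observed information −Σᵢ ℓ″(Xᵢ; θ) is random, and keeps sampling until the realized information reaches the expected information v₀I of a target sample size v₀. The Fisher information I = 1/3 of the standard logistic, and the evaluation at the true θ instead of the ML estimate, are simplifying assumptions:

```python
import math
import random

def logistic_sample(rng):
    # inverse-CDF draw from the standard logistic distribution
    u = rng.random()
    return math.log(u / (1.0 - u))

def observed_info_increment(z):
    # -(d^2/dtheta^2) log f(x; theta) at theta = 0 for the logistic
    # location model: equals 2*sigma(z)*(1 - sigma(z)), sigma the logistic cdf
    s = 1.0 / (1.0 + math.exp(-z))
    return 2.0 * s * (1.0 - s)

def stopping_time(rng, v0, fisher_info=1.0 / 3.0):
    # sample until the realized information reaches the expected
    # information v0 * I of the target sample size v0
    target = v0 * fisher_info
    realized, n = 0.0, 0
    while realized < target:
        realized += observed_info_increment(logistic_sample(rng))
        n += 1
    return n

rng = random.Random(0)
v0 = 300
ns = [stopping_time(rng, v0) for _ in range(200)]
mean_n = sum(ns) / len(ns)
print(round(mean_n))  # close to v0 = 300
```

With the seed fixed, the average stopping time over 200 replications comes out close to v₀, illustrating Eθ(n) − v = o(1) in this idealized setting.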
Proof of Theorem 3.1. Putting U = T_a − Eθ(T_a), we have

U = √v(θ̂_a − Eθ(θ̂_a)),  (3.8)

where Z1,v, I and μ1 denote Z1,v(θ), I(θ) and μ1(θ), respectively. Since

U² = (U − Z1,v/I)² + (2/I)Z1,vU − (1/I²)Z1,v²,

it follows that

Eθ(U²) = (1/v)Vθ(Q*) + (2/I)Eθ(Z1,vU) − 1/I + o(1/v),  (3.9)

where Q* = Q − μ1. Since (3.8) gives, on differentiating with respect to θ,

∂U/∂θ = −{v′/(2v)}U − √v − μ1′ + v′μ1/(2v) + o_p(1/√v),  (3.10)

it follows by Lemma 3.1 that

Eθ(Z1,vU) = −(1/√v)Eθ(∂U/∂θ) = 1 + μ1′/√v − v′μ1/(2v√v) + o(1/v),  (3.11)

where v′ and μ1′ denote v′(θ) = (∂/∂θ)vθ(θ) and μ1′ = (∂/∂θ)μ1(θ), respectively. From (3.8), (3.9) and (3.11) we have

κ2 = Vθ(T_a) = Eθ(U²) = 1/I + 2μ1′/(I√v) − v′μ1/(Iv√v) + (1/v)Vθ(Q*) + o(1/v).  (3.12)
Since

U³ = (U − Z1,v/I)³ + (3/I)(U − Z1,v/I)²Z1,v + (3/I²)Z1,v²U − (2/I³)Z1,v³,

it follows that

Eθ(U³) = Eθ((U − Z1,v/I)³) + (3/I)Eθ((U − Z1,v/I)²Z1,v) + (3/I²)Eθ(Z1,v²U) − (2/I³)Eθ(Z1,v³).  (3.13)

Since Eθ(U²) = 1/I + o(1) and I′(θ) = 2J(θ) + K(θ), the terms on the right-hand side of (3.13) are computed from (3.10) and Lemma 3.1, where J and K denote J(θ) and K(θ), respectively. Since

Eθ(Z1,v³) = K/√v + 3v′I/v + o(1/v),  (3.14)

we have from (3.13) and (3.14)

κ3 = κ3,θ(T_a) = Eθ(U³) = (1/√v)β3 + (1/v)τ3(θ) + o(1/v),  (3.15)

where β3 = β3(θ) = −{(3J + 2K)/I³} − {3v′/(vI²)} and τ3(θ) = 3Eθ(Z1,vQ*²)/(2I³).
The fourth moment is treated in the same way. Since

U⁴ = (U − Z1,v/I)⁴ + (4/I)(U − Z1,v/I)³Z1,v + (6/I²)Z1,v²U² − (8/I³)Z1,v³U + (3/I⁴)Z1,v⁴,  (3.16)

Eθ(U⁴) reduces to Eθ(Z1,vU³), Eθ(Z1,v²U²), Eθ(Z1,v³U) and Eθ(Z1,v⁴), which are computed from (3.10), (3.14), (3.15) and Lemma 3.1 in (3.17) and (3.18). Putting

W = Z2,v − (J/I − v′/v)Z1,v − (n−v)I/√v,  (3.20)

we have by Lemma 3.1

Eθ(WU²) = Eθ(W(U − Z1,v/I)²) + (2/I)Eθ(Z1,vWU) − (1/I²)Eθ(Z1,v²W),  (3.21)

and

Eθ(Z1,vW) = −(1/√v)Eθ(∂W/∂θ) + o(1).  (3.22)

From (3.11) and (3.22) we obtain

Eθ(UW) = o(1).  (3.23)

From (3.8) we also have

∂U/∂θ = −{v′/(2v)}U − √v + o_p(1/√v).  (3.24)

Since I′(θ) = 2J(θ) + K(θ) and J′(θ) = L(θ) + M(θ) + N(θ), differentiating (3.20) with respect to θ and applying (3.22) and (3.24) repeatedly, we obtain the expansions (3.25)–(3.34) for Eθ(Z1,vUW), Eθ(WU²), Eθ(Z1,v⁴) and Eθ(nZ1,v²); in particular

Eθ(WU²) = −{1/(I²√v)}{J(J+K)/I − (M+N) − (3J+K)(v′/v)} + {1/(I√v)}Eθ(W²) + o(1/√v).  (3.29)

Collecting these expansions, we arrive at

Eθ(U⁴) = (6/I)Eθ(U²) − 3/I² + (1/v)β4 + {3/(I⁴v)}Eθ(W²) + o(1/v) (say).  (3.35)
Since by (3.12)

{Vθ(T_a)}² = {Eθ(U²)}² = (2/I)Eθ(U²) − 1/I² + o(1/v),

it follows from (3.35) that

κ4 = Eθ[(T_a − Eθ(T_a))⁴] − 3{Vθ(T_a)}² = Eθ(U⁴) − (6/I)Eθ(U²) + 3/I²
   = (1/v)β4 + {3/(I⁴v)}Eθ(W²) = (1/v)β4 + (1/v)τ4(θ) (say).  (3.36)

We put

Eθ(T_a) = μ1(θ)/√v + μ2(θ)/v + o(1/v).

It follows from (3.12), (3.15) and (3.36) that the Edgeworth expansion of the distribution of θ̂_a up to the order v⁻¹ is given by

Pθ{√(Iv)(θ̂_a − θ) ≤ t}
= Φ(t) − (√I μ1/√v)φ(t) − (√I μ2/v)φ(t) − {I√I β3/(6√v)}(t² − 1)φ(t)
− {I²(β4 + τ4)/(24v)}(t³ − 3t)φ(t) − {I³β3²/(72v)}(t⁵ − 10t³ + 15t)φ(t)
− {Iμ1²/(2v)} tφ(t) − {I√I τ3/(6v)}(t² − 1)φ(t) − {I²β3μ1/(6v)} t(t² − 3)φ(t) + o(1/v),  (3.37)

where Φ(t) = ∫_{−∞}^{t} φ(x)dx with φ(x) = (1/√(2π))e^{−x²/2}

(see also Kendall and Stuart, 1969). By the third order AMU condition of θ̂_a we have from (3.37)

Pθ{θ̂_a ≤ θ} = 1/2 − (√I μ1/√v)φ(0) − (√I μ2/v)φ(0) + {I√I β3/(6√v)}φ(0) + {I√I τ3/(6v)}φ(0) + o(1/v) = 1/2 + o(1/v),

hence

μ1 = Iβ3/6 and μ2 = Iτ3/6.  (3.38)

Therefore we obtain

Eθ(T_a) = I(θ)β3(θ)/(6√v) + I(θ)τ3(θ)/(6v) + o(1/v).

Thus we complete the proof.
Proof of Theorem 3.2. It follows from (3.37) with (3.38).

Proof of Theorem 3.3. From (3.2) we have
Pθ{√(I(θ)v)(θ̂_a − θ) ≤ t}
= Φ(t) − {I√I β3/(6√v)} t²φ(t) − {I²β4/(24v)}(t³ − 3t)φ(t) − {I³β3²/(72v)}(t⁵ − 10t³ + 15t)φ(t)
− {Iμ1²/(2v)} tφ(t) − {I²β3μ1/(6v)} t(t² − 3)φ(t) − {I/(2v)}Vθ(Q*) tφ(t)
− {1/(8I²v)}Eθ(W²)(t³ − 3t)φ(t) − {1/(4I√I v)}Eθ(Z1,vQ*²) t²φ(t) + o(1/v),  (3.39)
where Q* = Q − μ1, and I, μ1, μ2, β3 and β4 denote I(θ), μ1(θ), μ2(θ), β3(θ) and β4(θ), respectively. By Lemma 3.1 we obtain

Eθ(Z1,vQ*²) = 2Eθ[Q*·½{Z2,v − (J/I − v′/v)Z1,v − (n−v)I/√v}] + o(1) = Eθ(WQ*) + o(1),  (3.40)

where W is given by (3.20). We also have from Lemma 3.1

Eθ(Z1,vWQ*) = −(1/√v)Eθ((∂W/∂θ)Q*) − (1/√v)Eθ(W(∂Q*/∂θ)) + o(1),  (3.41)

and, since ∂W/∂θ has the expansion (3.42) obtained by differentiating (3.20), it follows from (3.23), (3.41) and (3.42) that

Eθ(Z1,vWQ*) = ½Eθ(W²) + o(1).

By Lemma 3.1 we also have

Eθ(Z1,v²Q*) = −{(J + K)/I + 2v′/v} + o(1).
If we take Q0* as the Q* for which Vθ(Q*) is minimized under the conditions

Eθ(WQ*) = (r/I)Eθ(W²) + o(1),  (3.43)
Eθ(Z1,vWQ*) = ½Eθ(W²) + o(1),  (3.44)
Eθ(Z1,v²Q*) = −{(J + K)/I + 2v′/v} + o(1)  (3.45)

for real r, then Q0* must have the form

Q0* = A0 + A1W + A2WZ1,v + A3Z1,v²,

where Ai (i = 0, 1, 2, 3) are constants. Since Covθ(W, Z1,vW) = o(1), Covθ(W, Z1,v²) = o(1) and Covθ(Z1,vW, Z1,v²) = o(1), it follows that

Vθ(Q0*) = (1/I²)(r² + ¼)Eθ(W²) + {1/(2I⁴)}{J + K + (2v′/v)I}² + o(1).  (3.46)
Hence we have from (3.39), (3.40), (3.43), (3.44), (3.45) and (3.46)

Pθ{√(I(θ)v)(θ̂_a − θ) ≤ t}
≤ Φ(t) − {I√I β3/(6√v)} t²φ(t) − {I²β4/(24v)}(t³ − 3t)φ(t) − {I³β3²/(72v)}(t⁵ − 10t³ + 15t)φ(t)
− {Iμ1²/(2v)} tφ(t) − {I²β3μ1/(6v)} t(t² − 3)φ(t)
− {Eθ(W²)/(2vI²)} tφ(t){(t/2 + r√I)² + ¼} − {1/(4vI³)} tφ(t){J + K + (2v′/v)I}² + o(1/v)

for all t > 0, with the reversed inequality for t < 0. This completes the proof.
Proof of Corollary 3.1. It is straightforward from Theorem 3.3.

Proof of Theorem 3.4. From (3.5) we have

√v(θ̂_ML − θ) = Z1,v/I + {Z1,v/(I²√v)}{Z2,v − (n−v)I/√v} − {(3J + K)/(2I³√v)}Z1,v² + o_p(1/√v)
= Z1,v/I + {Z1,v/(I²√v)}{W + (J/I − v′/v)Z1,v} − {(3J + K)/(2I³√v)}Z1,v² + o_p(1/√v)
= Z1,v/I + Z1,vW/(I²√v) − {(J + K)/(2I³) + v′/(vI²)}Z1,v²/√v + o_p(1/√v).  (3.47)
In a similar way to Theorem 2.2 in Takeuchi and Akahira (1988), we obtain by Taylor expansion of both sides of (3.6) about θ the expansion (3.48) of (n−v)I in terms of Z1,v, Z2,v, Z3,v, Z1,vZ3,v, Z1,v² and nZ1,v², whose leading term is √v{Z2,v − (J/I − v′/v)Z1,v}, where v = vθ(θ), v′ = (∂/∂θ)vθ(θ), v″ = (∂²/∂θ²)vθ(θ), Z1,v = Z1,v(θ), Z2,v = Z2,v(θ), Z3,v = Z3,v(θ), I = I(θ), J = J(θ), K = K(θ), L = L(θ), M = M(θ), N = N(θ) and H = H(θ). In order to determine n satisfying Eθ(n) = v + o(1), we have from (3.48)

c(θ) = Jv′/(Iv) + L/I − Iv″/(2v) + {2L + M + N}/(2I).

From (3.48) we also obtain

Z2,v − (J/I − v′/v)Z1,v − (n−v)I/√v = o_p(1),

that is,

W = o_p(1),  (3.49)

hence by (3.47)

√v(θ̂_ML − θ) = Z1,v/I − {(J + K)/(2I³) + v′/(vI²)}Z1,v²/√v + o_p(1/√v).

Since, by Takeuchi and Akahira (1988),

Vθ(√v(θ̂_ML − θ)) = 1/I + {1/(2vI³)}{J + K + (2v′/v)I}² + o(1/v),

the estimator θ*_ML modified from the ML estimator θ̂_ML to be third order AMU has the same asymptotic variance as that of θ̂_ML. Then it follows from Akahira (1986) and Akahira and Takeuchi (1981) that the modified ML estimator θ*_ML has the
following asymptotic distribution up to the order v⁻¹:

Pθ{√(I(θ)v)(θ*_ML − θ) ≤ t}
= Φ(t) − {I(θ)√I(θ)β3(θ)/(6√v)} t²φ(t) − {I²(θ)β4(θ)/(24v)}(t³ − 3t)φ(t)
− {I³(θ)β3²(θ)/(72v)}(t⁵ − 10t³ + 15t)φ(t) − {I(θ)μ1²(θ)/(2v)} tφ(t)
− {I²(θ)β3(θ)μ1(θ)/(6v)} t(t² − 3)φ(t) − {1/(4vI³(θ))} tφ(t){J(θ) + K(θ) + (2v′/v)I(θ)}² + o(1/v),

where μ1(θ), β3(θ) and β4(θ) are given in Theorem 3.1. Since, by (3.49), Eθ(W²) = o(1), it is seen from (3.4) that the asymptotic distribution of θ*_ML coincides with the bound up to the order v⁻¹ uniformly in t and stopping rules. This completes the proof.
References

Akahira, M. (1986). The Structure of Asymptotic Deficiency of Estimators. Queen's Papers in Pure and Applied Mathematics No. 75, Queen's University Press, Kingston, Ontario, Canada.
Akahira, M. and Takeuchi, K. (1981). Asymptotic Efficiency of Statistical Estimators: Concepts and Higher Order Asymptotic Efficiency. Lecture Notes in Statistics 7, Springer-Verlag, New York.
Amari, S. (1985). Differential-Geometrical Methods in Statistics. Lecture Notes in Statistics 28, Springer-Verlag, Berlin.
Efron, B. (1975). Defining the curvature of a statistical problem (with application to second order efficiency). Ann. Statist., 3, 1189-1242.
Fisher, R. A. (1925). Theory of statistical estimation. Proc. Camb. Phil. Soc., 22, 700-725.
Kendall, M. G. and Stuart, A. (1969). The Advanced Theory of Statistics, Vol. 1, 3rd ed., Griffin, London.
Rao, C. R. (1961). Asymptotic efficiency and limiting information. Proc. Fourth Berkeley Symp. Math. Statist. Prob., 1, 531-545.
Takeuchi, K. and Akahira, M. (1988). Second order asymptotic efficiency in terms of asymptotic variances of the sequential maximum likelihood estimation procedures. In: 2nd Pacific Area Statistical Conference, Statistical Theory and Data Analysis II, North-Holland, Amsterdam, 191-196.
Wald, A. (1947). Sequential Analysis. John Wiley, New York.
Received December 1987; Revised June 1989. Recommended by J.K. Ghosh.
Pub. Inst. Stat. Univ. Paris XXXV, fasc. 1, 1990, 3 à 9
First order asymptotic efficiency in semiparametric models implies infinite asymptotic deficiency * Masafumi Akahira(*) and Kei Takeuchi(**)
Abstract. For semiparametric models, it is shown that, under fairly general regularity conditions, the asymptotic deficiency of the maximum likelihood estimator, or of any regular best asymptotically normal estimator, is infinite.
1. Introduction

For various types of semiparametric models, it has been shown that uniformly asymptotically efficient estimators may be obtained under some regularity conditions (e.g. see Hájek, 1962 and Takeuchi, 1971). More precisely, let us assume that X1, ..., Xn are independent
(') Institute of Mathematics, University of Tsukuba, Ibaraki 305, Japan (") Research Center for Advanced Science and Technology, University of Tokyo, Komaba, Meguro-ku, Tokyo 156, Japan Key words and phrases : Semiparametric models, asymptotic deficiency, maximum likelihood estimator. AMS Subject Classification (1980): 62F12, 62F10
*This paper is retyped with the correction of typographical errors.
and identically distributed random variables according to a density function f(x, θ) with respect to a σ-finite measure μ, where θ is a real-valued parameter, and, given θ, f belongs to some class F which can not be identified by a finite number of parameters. It is naturally required that f1(x, θ1) ≠ f2(x, θ2) for any distinct θ1 and θ2 and f1, f2 ∈ F. Then we can construct estimators θ̂_n such that θ̂_n is asymptotically efficient for the model with known f, that is, in regular cases, √n(θ̂_n − θ) is asymptotically normal N(0, I_f⁻¹), where

I_f = ∫ {(∂/∂θ) log f(x, θ)}² f(x, θ) dμ(x),

whatever the true f ∈ F may be. The pertinent question here is whether we can also achieve uniform second order asymptotic efficiency, or does the distribution of √n(θ̂_n − θ) attain the second order asymptotic bound? In the subsequent discussion we will show that, in a sense, the loss of third order asymptotic efficiency, or asymptotic deficiency, is generally infinite under fairly general conditions for semiparametric models, which, although it does not directly answer the above question, may shed some light on the possible higher order asymptotic efficiency of estimators under semiparametric models.
2. Infinite deficiency in semiparametric models

Suppose that X1, ..., Xn are independently and identically distributed (i.i.d.) real random variables with a density function f(x, θ, ξ) with respect to a σ-finite measure μ, where θ is a real-valued parameter to be estimated and ξ is a real-valued nuisance parameter. We assume the following conditions (A.1) to (A.5).
(A.1) The set {x : f(x, θ, ξ) > 0} does not depend on θ.
(A.2) For almost all x[μ], f(x, θ, ξ) is three times continuously differentiable in θ and ξ.

(A.3) For each θ and each ξ,

0 < I00(θ, ξ) = E[{ℓ0(θ, ξ, X)}²] = −E[ℓ00(θ, ξ, X)] < ∞,
0 < I11(θ, ξ) = E[{ℓ1(θ, ξ, X)}²] = −E[ℓ11(θ, ξ, X)] < ∞,

where ℓ0(θ, ξ, x) = (∂/∂θ)ℓ(θ, ξ, x), ℓ00(θ, ξ, x) = (∂²/∂θ²)ℓ(θ, ξ, x), ℓ1(θ, ξ, x) = (∂/∂ξ)ℓ(θ, ξ, x) and ℓ11(θ, ξ, x) = (∂²/∂ξ²)ℓ(θ, ξ, x), with ℓ(θ, ξ, x) = log f(x, θ, ξ).
(A.4) The parameters are defined to be "orthogonal" in the sense that E[ℓ01(θ, ξ, X)] = 0, where ℓ01(θ, ξ, x) = (∂²/∂θ∂ξ)ℓ(θ, ξ, x).

Note that the condition (A.4) is not necessarily restrictive, because otherwise we can redefine the parameter g(θ, ξ) so that we have the above orthogonality.
(A.5) There exist

J000 = E[ℓ00(θ, ξ, X)ℓ0(θ, ξ, X)],  J001 = E[ℓ00(θ, ξ, X)ℓ1(θ, ξ, X)],
J010 = E[ℓ01(θ, ξ, X)ℓ0(θ, ξ, X)],  J011 = E[ℓ01(θ, ξ, X)ℓ1(θ, ξ, X)],
J110 = E[ℓ11(θ, ξ, X)ℓ0(θ, ξ, X)],
K000 = E[{ℓ0(θ, ξ, X)}³],  K001 = E[{ℓ0(θ, ξ, X)}²ℓ1(θ, ξ, X)],
M0000 = E[{ℓ00(θ, ξ, X)}²] − I00²,  M0101 = E[{ℓ01(θ, ξ, X)}²],

and the following holds:

E[ℓ000(θ, ξ, X)] = −3J000 − K000,  E[ℓ001(θ, ξ, X)] = −J010,  E[ℓ011(θ, ξ, X)] = −J011,

where ℓ000(θ, ξ, x) = (∂³/∂θ³)ℓ(θ, ξ, x), ℓ001(θ, ξ, x) = (∂³/∂θ²∂ξ)ℓ(θ, ξ, x) and ℓ011(θ, ξ, x) = (∂³/∂θ∂ξ²)ℓ(θ, ξ, x).
From the condition (A.5) it is noted that K001 = −J010 − J001. We put

Z0 = (1/√n) Σ_{i=1}^{n} ℓ0(θ, ξ, Xi),   Z1 = (1/√n) Σ_{i=1}^{n} ℓ1(θ, ξ, Xi),
Z00 = (1/√n) Σ_{i=1}^{n} {ℓ00(θ, ξ, Xi) + I00},   Z01 = (1/√n) Σ_{i=1}^{n} ℓ01(θ, ξ, Xi).

Let θ* and ξ* be the maximum likelihood estimators (MLEs) of θ and ξ, respectively. Then we have the following.

Theorem 2.1. Assume that the conditions (A.1) to (A.5) hold. Then the MLE θ* of θ has the following stochastic expansion:

√n(θ* − θ) = (1/I00)Z0 + (1/√n)[(1/I00²)Z00Z0 − {(3J000 + K000)/(2I00³)}Z0² + {1/(I00I11)}Z01Z1 − {J010/(I00²I11)}Z0Z1 − {J011/(2I00I11²)}Z1²] + o_p(1/√n).

The proof is given in the paper by Akahira and Takeuchi (1982) and also in section 4.2 of the monograph by Akahira (1986). We put

Q0 = (1/I00²)Z00Z0 − {(3J000 + K000)/(2I00³)}Z0²,
Q1 = {1/(I00I11)}Z1{Z01 − (J010/I00)Z0 − (J011/I11)Z1} + {J011/(2I00I11²)}Z1².

Then we have the following:
Theorem 2.2. Assume that the conditions (A.1) to (A.5) hold. Then the asymptotic deficiency of the MLE θ* is given by

d = I00{V(Q0) + V(Q1)}
  = (1/I00²)(M0000I00 − J000²) + (J000 + K000)²/(2I00³) + {1/(I00I11)}{M0101 − J010²/I00 − J011²/I11} + J011²/(2I00I11),
where V(·) designates the asymptotic variance. The proof is given in the paper by Akahira and Takeuchi (1982) and also in section 4.2 of the monograph by Akahira (1986). Now, in order to consider a semiparametric model, we assume the condition (A.6).

(A.6) For each θ, each ξ and almost all x[μ], f(x, θ, ξ) = f(x − θ, ξ) and f(x, ξ) = f(−x, ξ).

Then we have the following corollary.

Corollary 2.1. Assume that the conditions (A.1) to (A.6) hold. Then the asymptotic deficiency of the MLE θ* is given by

(2.1)  d = (1/I00²)(M0000I00 − J000²) + (J000 + K000)²/(2I00³) + {1/(I00²I11)}(M0101I00 − J010²).

Since, by the condition (A.6), J011 = 0, the proof is easily derived from Theorem 2.2. It is also noted that the inequalities M0000I00 − J000² ≥ 0 and M0101I00 − J010² ≥ 0 hold from the Schwarz inequality. Next we consider a semiparametric model of the type:
(2.2)  f(x − θ, ξ) = Cξ {f0(x − θ)}^{1−ξ} {g(x − θ)}^{ξ} for 0 ≤ ξ ≤ 1,

where Cξ is some constant, depending on ξ, with C0 = C1 = 1, and f0(x) and g(x) are certain density functions. Then we assume the condition (A.7).

(A.7)  ∫ {f0′(x)}²/f0(x) dμ(x) = 1.
It is easily seen that the condition (A.7) implies I00 = 1. From (2.2) and (A.7) it follows that, by the Schwarz inequality,

M0101 − J010² = ∫ {g′(x)/g(x)}² f0(x) dμ − [∫ {g′(x)/g(x)}{f0′(x)/f0(x)} f0(x) dμ]² = ρ(f0, g) (say)

is nonnegative. Then we have the following.
Theorem 2.3. Assume that the conditions (A.1) to (A.7) in the semiparametric model (2.2) hold. Let g(x) be of the form Ka{g0(x)}^a with some constant Ka for a > 0 and some function g0(x). If ρ(f0, g0) > 0, then the asymptotic deficiency of the MLE θ* tends to infinity as a → ∞.

Proof. From the assumptions we have

lim_{a→∞} (M0101 − J010²) = lim_{a→∞} a²ρ(f0, g0) = ∞,

hence, by (2.1), it follows that the asymptotic deficiency of the MLE θ* tends to infinity as a → ∞. This completes the proof.

Denote by F a semiparametric model consisting of density functions f(x) with f(x) = f(−x). Then we have the following.

Corollary 2.2. Assume that for any f0 ∈ F there exists g0 ∈ F such that ρ(f0, g0) > 0, and that for any ξ with 0 ≤ ξ ≤ 1 and any a > 0, C(ξ, a) f0^{1−ξ} g0^{aξ} ∈ F, which is denoted by f_a(x, ξ), where C(ξ, a) is some constant. If the conditions (A.1) to (A.7) on f_a(x − θ, ξ) for every a > 0 hold in the case when θ is a location parameter, then the asymptotic deficiency of the MLE of θ tends to infinity as a → ∞.
The proof is straightforward from Theorem 2.3.

Example. Let f0(x) be a non-normal density and g0(x) be a normal density with mean 0 and variance 1. Then we consider a mixture f_σ of the densities f0 and g0 as

f_σ(x, ξ) = C(ξ, σ) {f0(x)}^{1−ξ} {g0(x)}^{ξ/σ²} for 0 ≤ ξ ≤ 1,

where C(ξ, σ) is some constant with C(0, σ) = 1. If the assumptions of Corollary 2.2 are satisfied for 1/σ² instead of a in the case when θ is a location parameter, i.e., for f_σ(x − θ, ξ) and g0(x − θ), then the asymptotic deficiency of the MLE of θ tends to infinity as σ → 0.
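As a numerical illustration (ours, not the paper's) of the scaling ρ(f0, g) = a²ρ(f0, g0) used in the proof of Theorem 2.3, take f0 the standard logistic density and g0 the standard normal density, so that g′/g = −ax for g ∝ g0^a; then ρ(f0, g0) = ∫x²f0 dμ − (∫x f0′ dμ)² = π²/3 − 1:

```python
import math

def f0(x):
    # standard logistic density
    e = math.exp(-x)
    return e / (1.0 + e) ** 2

def df0(x):
    # derivative of the logistic density
    e = math.exp(-x)
    return e * (e - 1.0) / (1.0 + e) ** 3

def simpson(h, lo, hi, n=4000):
    # composite Simpson rule on [lo, hi] with n (even) subintervals
    w = (hi - lo) / n
    s = h(lo) + h(hi)
    for i in range(1, n):
        s += h(lo + i * w) * (4 if i % 2 else 2)
    return s * w / 3.0

def rho(a, lo=-40.0, hi=40.0):
    # rho(f0, g) for g proportional to g0^a with g0 standard normal,
    # so that g'/g = -a*x
    t1 = simpson(lambda x: (a * x) ** 2 * f0(x), lo, hi)
    t2 = simpson(lambda x: (-a * x) * df0(x), lo, hi)
    return t1 - t2 ** 2

print(rho(1.0))             # pi^2/3 - 1, about 2.2899
print(rho(5.0) / rho(1.0))  # 25.0, the a^2 scaling
```

The ratio is exactly a² because both integrals scale polynomially in a on the same grid; the divergence of the deficiency as a → ∞ (or σ → 0 in the example) follows.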
References

Akahira, M. (1986). The Structure of Asymptotic Deficiency of Estimators. Queen's Papers in Pure and Applied Mathematics No. 75, Queen's University Press, Kingston, Ontario, Canada.
Akahira, M. and Takeuchi, K. (1982). On asymptotic deficiency of estimators in pooled samples in the presence of nuisance parameters. Statistics & Decisions 1, 17-38.
Hájek, J. (1962). Asymptotically most powerful rank order tests. Ann. Math. Statist., 33, 1124-1147.
Takeuchi, K. (1971). A uniformly asymptotically efficient estimator of a location parameter. Jour. Amer. Statist. Assoc., 66, 292-301.

Reçu en juin 1988
Austral. J. Statist., 32(3), 1990, 281-291
LOSS OF INFORMATION ASSOCIATED WITH THE ORDER STATISTICS AND RELATED ESTIMATORS IN THE DOUBLE EXPONENTIAL DISTRIBUTION CASE MASAFUMI AKAHIRA1 AND KEI TAKEUCHI2
University of Tsukuba and University of Tokyo
Summary Fisher (1934) derived the loss of information of the maximum likelihood estimator (MLE) of the location parameter in the case of the double exponential distribution. Takeuchi & Akahira (1976) showed that the MLE is not second order asymptotically efficient. This paper extends these results by obtaining the (asymptotic) losses of information of order statistics and related estimators, and by comparing them via their asymptotic distributions up to the second order. Key words: Loss of information ; double exponential distribution ; second order asymptotic efficiency; maximum likelihood estimator ; median.
1. Introduction

Fisher (1934), starting from his fundamental paper (1922), discussed estimators of the location parameter of a double exponential (two-sided exponential) distribution as a typical example of non-regular estimation. He showed that the maximum likelihood estimator (MLE), which is equal to the sample median in this case, has asymptotic loss of information of order √n, as compared to constant order in regular cases. In Takeuchi & Akahira (1976) we showed that the MLE does not have second order asymptotic efficiency, unlike regular cases. This paper extends these results by showing that it is possible to construct an estimator which is asymptotically better than the MLE both in terms of asymptotic variance and asymptotic loss of information in the second order, but that it is impossible to have an estimator which is uniformly better than the MLE in the second order expansion of the distribution. We thus conclude that there is no second order asymptotically efficient estimator.

Received May 1989; revised September 1989.
1 Institute of Mathematics, University of Tsukuba, Ibaraki 305, Japan.
2 Research Center for Advanced Science and Technology, University of Tokyo, 4-6-1 Komaba, Meguro-ku, Tokyo 156, Japan.
Acknowledgements. The authors thank the associate editor and the referee for valuable comments, in particular for pointing out the form of proof used after Lemma 2.2.
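The fact the introduction builds on — that the ML estimator of the double exponential location parameter is the sample median, since the median minimizes Σᵢ|xᵢ − θ| — can be checked directly; the snippet below (ours) compares the log likelihood at the sample median with its value at random candidate points:

```python
import random

rng = random.Random(1)
# a sample from an arbitrary continuous distribution; the argmax property
# of the median holds for any data set of odd size
xs = [rng.gauss(0.0, 2.0) for _ in range(101)]

def loglik(theta):
    # double exponential log likelihood at theta, up to an additive constant
    return -sum(abs(x - theta) for x in xs)

median = sorted(xs)[len(xs) // 2]
candidates = [rng.uniform(-6.0, 6.0) for _ in range(1000)]
assert all(loglik(median) >= loglik(t) for t in candidates)
print("the sample median maximizes the likelihood")
```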
2. The Loss of Information

Suppose that X1, ..., Xn are independently and identically distributed (i.i.d.) random variables with the double exponential density function

f(x, θ) = f0(x − θ) = ½ exp(−|x − θ|).

First we compute the amount of information contained in the order statistics around the median, or equivalently the loss of information by discarding all other order statistics. We use the following lemma.

Lemma 2.1. (Fisher, 1925; Rao, 1961) The loss of information associated with any statistic T, which is a function of a sample of size n obtained from the population with the density function f(x, θ), is given by
Eθ[Vθ(Σ_{i=1}^{n} (∂/∂θ) log f(Xi, θ) | T)],

where Eθ(·) denotes unconditional expectation and Vθ(· | T) the conditional variance given T. Now let Tk be the set of the central 2k+1 order statistics X(s−k+1) ≤ ··· ≤ X(s+k+1) obtained from a sample of size n = 2s+1 from the double exponential distribution. Then we have the following lemma.

Lemma 2.2. For each k = 0, 1, ..., s−1 the loss of information Lk associated with Tk is given by
Lk = Eθ[Vθ(Σ_{i=1}^{2s+1} sgn(Xi − θ) | Tk)]  (2.1)
   = {2(2s+1)!/((s−k−1)!(s+k)!)} ∫_{1/2}^{1} (2u^{s−k−1} − u^{s−k−2})(1−u)^{s+k} du.  (2.2)

Proof. Equation (2.1) follows directly from Lemma 2.1 and the assumed form of the density function. Given Tk, the set of the first s−k order statistics X(1), ..., X(s−k) is a random ordered sample from the distribution with
a density f0(x)/F0(x(s−k+1)) on x ≤ x(s−k+1), where F0(x) is the double exponential distribution function of the Xi. Similarly, the last s−k are from f0(x)/[1 − F0(x(s+k+1))] on x ≥ x(s+k+1). Define u = F0(X(s−k+1)), v = 1 − F0(X(s+k+1)), wi = F0(X(i)), zi = 1 − F0(X(i)). Then, ignoring the ordering of the first and last (s−k) X's, w1, ..., w(s−k) and z(s+k+2), ..., z(2s+1) are random samples from uniform distributions on [0, u] and [0, v] respectively. Given Tk, each sgn(Xi − θ) is determined for s−k+1 ≤ i ≤ s+k+1, so we have

V(Σ_{i=1}^{2s+1} sgn(Xi − θ) | Tk) = (s−k)(V1 + V2),

where V1 = 1 − {∫_0^u sgn(x − θ) dw/u}², which corresponds to the conditional variance of X for X ≤ X(s−k+1), V2 corresponds to that of X for X ≥ X(s+k+1), and

w = F0(x − θ) = ½ e^{x−θ} for x ≤ θ,  w = 1 − ½ e^{−(x−θ)} for x > θ.

Then we get

V1 = 0 for X(s−k+1) ≤ θ,  V1 = (2u − 1)/u² for X(s−k+1) > θ,
V2 = 0 for X(s+k+1) ≥ θ,  V2 = (2v − 1)/v² for X(s+k+1) < θ,

and hence Lk = (s−k){E(V1) + E(V2)}. Symmetry now implies that

E(V1) = E(V2) = {(2s+1)!/((s−k)!(s+k)!)} ∫_{1/2}^{1} {(2u − 1)/u²} u^{s−k}(1−u)^{s+k} du,

and from this result equation (2.2) follows as required. This result can also be expressed in the following form.

Theorem 2.1. For each k = 0, 1, ..., s−2, the loss of information Lk associated with Tk is given by
_ 2s 1 2, k+1 +^k`2(k-j+1)( 2s 12 2s Lk s-j) 2(2s+1) - (s)(2) s-k-1 0 s-k-1 (2.3)
Proof. The essence of the proof lies in recognizing the integrals in (2.2) as incomplete beta functions, which in turn are tail probabilities of binomial random variables. Specifically, using Y(n) to denote a symmetric binomial random variable with parameter n, i.e., Pr{Y(n) = p} = C(n, p)(1/2)^n, we have

{n!/((p−1)!(n−p)!)} ∫_{1/2}^{1} x^{p−1}(1−x)^{n−p} dx = Pr{Y(n) ≤ p−1}

for integers p = 1, ..., n. Then (2.2) gives

L_k = 2(2s+1)[ 2 Pr{Y(2s) ≤ s−k−1} − {2s/(s−k−1)} Pr{Y(2s−1) ≤ s−k−2} ].

Using symmetry twice, this equals

L_k = 2(2s+1)[ {2s/(s−k−1)} Σ_{i=s−k−1}^{s−1} Pr{Y(2s−1) = i} − Pr{|Y(2s) − s| ≤ k} − (k+1)/(s−k−1) ].   (2.4)

Since s Pr{Y(2s−1) = i−1} = i Pr{Y(2s) = i}, the first term in (2.4) equals

Σ_{j=0}^{k} {2(s−j)/(s−k−1)} Pr{Y(2s) = s−j}.

Substitution in (2.4) completes the proof of (2.3).

Remark 2.1. For the case k = 0, Theorem 2.1 yields

L_0 = 2(2s+1)[ {(s+1)/(s−1)} C(2s, s)(1/2)^{2s} − 1/(s−1) ].

This expression is consistent with a result in Fisher (1934, p.300).
3. The Asymptotic Loss of Information

Evaluating the three terms in Theorem 2.1 for fixed k and large s shows that asymptotically the loss of information is

L_k = 4√(s/π) [1 + o(1)] − 4(k+1) + O(1/√s).   (3.1)
But in order to improve this quantity we must increase k with s, and (3.1) suggests that we take k = r√s + o(√s) with a non-negative constant r.

From Theorem 2.1 and the Cramér–Rao inequality it follows that, for any unbiased estimator θ̂ which is a function of T_k only, we have for fixed k

V_θ(θ̂) ≥ 1/I_{T_k} = 1/(n − L_k) = (1/n){1 + L_k/n + o(L_k/n)},   (3.2)

where I and I_{T_k} = I − L_k denote the Fisher information of the whole sample and of T_k respectively, and I = n.

Theorem 3.1. For k = r√s + o(√s) = (1/2)ρ√n + o(√n), where ρ = r√2 > 0 and n = 2s+1, the loss of information L_k associated with T_k is given by

L_k = 4[φ(ρ) − ρ{1 − Φ(ρ)}]√n + o(√n),   (3.3)

where Φ(x) = ∫_{−∞}^{x} φ(u) du with φ(u) = e^{−u²/2}/√(2π).

Proof. Using the normal approximation to the binomial distribution we have
Pr{Y(2s) = s} = {1/√(πs)}(1 − 1/(8s) + O(s^{−2})).   (3.4)

Since Pr{Y(2s) = s+j} = Pr{Y(2s) = s−j}, the summation term in (2.3) equals

Σ_{j=0}^{k} {2(k−j+1)/(s−k−1)} Pr{Y(2s) = s+j}
 = {2(k+1)/(s−k−1)} Σ_{j=0}^{k} Pr{Y(2s) = s+j} − {2/(s−k−1)} Σ_{j=0}^{k} j Pr{Y(2s) = s+j}
 = {2(r√s+1)/(s−r√s−1)} ∫_0^{ρ} φ(u) du − {(1−e^{−r²})√(2s)}/{√(2π)(s−r√s−1)} + o(1/√s)   (3.5)

after approximation by Riemann integrals, using Pr{Y(2s) = s+j} ≈ √(2/s) φ(j√(2/s)) and k√(2/s) = ρ. By (2.3), (3.4) and (3.5) we obtain

L_k = 4[φ(ρ) − ρ{1 − Φ(ρ)}]√n + o(√n),

which completes the proof.

We now consider the explicit form of estimators depending on T_k. As the simplest one, we put θ̂_k = (X_(s−k+1) + X_(s+k+1))/2. The proof of the next result is in Section 4.
Theorem 3.2. For k = r√s + o(√s) = (1/2)ρ√n + o(√n), where ρ = r√2 > 0 and n = 2s+1, the estimator θ̂_k has the stochastic expansion

√n(θ̂_k − θ) = Z + {1/(4√n)}{(Z−ρ)|Z−ρ| + (Z+ρ)|Z+ρ|} + o_p(n^{−1/2})

and asymptotic variance

V_θ(√n θ̂_k) = 1 + c_ρ n^{−1/2}(1 + o(1)),   (3.6)

where Z = √n{F_0(X_(s−k+1) − θ) + F_0(X_(s+k+1) − θ) − 1} with F_0(·) the distribution function of the X_i, equivalently Z = √n(U_(s−k+1) + U_(s+k+1) − 1) with U_(1), ..., U_(n) the order statistics from the uniform distribution on the interval (0, 1), and

c_ρ = 2[2φ(ρ) + ρ{2Φ(ρ) − 3/2}].   (3.7)
Remark 3.1. Since φ(x) > x{1 − Φ(x)} for x > 0, it follows for any ρ > 0 that c_ρ > ρ. Define ρ₀ by Φ(ρ₀) = 3/4, so that ρ₀ ≈ 0.67. Then c_ρ has a minimum value at ρ = ρ₀. Hence θ̂_k has minimum variance at ρ = ρ₀.

Remark 3.2. In particular, for k = 0, i.e., ρ = 0, we obtain from Theorem 3.2

V_θ(√n θ̂₀) = 1 + 2√(2/π) n^{−1/2} + o(n^{−1/2}).   (3.8)

Note that θ̂₀ is the median, i.e., the MLE in this case.

Remark 3.3. Since

V_θ(θ̂_k) = n^{−1}{1 + c_ρ n^{−1/2}[1 + o(1)]},

it follows from (3.2) that c_ρ ≥ L_k/√n. In this case we have from (3.3) and Theorem 3.2

c_ρ − (1/√n) L_k = ρ.
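The properties of c_ρ stated in Remarks 3.1–3.3 can be confirmed directly (an illustrative sketch of ours): c_ρ − L_k/√n = ρ holds identically, the minimizer ρ₀ satisfies Φ(ρ₀) = 3/4, and c_{ρ₀} ≈ 1.27 while c₀ = 2√(2/π) ≈ 1.60.

```python
from math import erf, exp, pi, sqrt

def phi(x):
    return exp(-x*x/2)/sqrt(2*pi)

def Phi(x):
    return 0.5*(1 + erf(x/sqrt(2)))

def c(rho):
    # equation (3.7): c_rho = 2[2 phi(rho) + rho{2 Phi(rho) - 3/2}]
    return 2*(2*phi(rho) + rho*(2*Phi(rho) - 1.5))

def loss_coeff(rho):
    # equation (3.3): L_k / sqrt(n) -> 4[phi(rho) - rho{1 - Phi(rho)}]
    return 4*(phi(rho) - rho*(1 - Phi(rho)))

rho_min = min((i/10000 for i in range(30001)), key=c)
print(rho_min, c(rho_min), c(0.0))   # ~0.6745, ~1.271, ~1.596
```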
Corollary 3.1. For k = r√s + o(√s) = (1/2)ρ√n + o(√n), where ρ = r√2 > 0 and n = 2s+1, the loss of information L_k associated with the estimator θ̂_k satisfies L_k ≤ c_{ρ₀}√n + o(√n), where ρ₀ is defined in Remark 3.1.

Proof. From (3.2) we have V_θ(θ̂_k) ≥ (n − L_k)^{−1}, which implies that

L_k ≤ c_ρ√n + o(√n).
Therefore, by putting ρ = ρ₀ we have L_k ≤ c_{ρ₀}√n + o(√n).

Remark 3.4. Since ρ₀ ≈ 0.67, it is seen that

L_k ≤ c_{ρ₀}√n + o(√n) ≈ 1.27√n + o(√n).   (3.9)

On the other hand, it follows from (3.1) that the loss L₀ of the MLE θ̂₀, i.e., the median, is given by

L₀ = 2√(2/π) √n + o(√n) ≈ 1.60√n + o(√n).   (3.10)

From (3.9) and (3.10) we see that the estimator θ̂_k is asymptotically better than the MLE θ̂₀ in the sense that L_k < L₀ + o(√n). In Akahira & Takeuchi (1981, p.97) it is shown that the asymptotic distribution of θ̂₀ up to the order n^{−1/2} is given by

Pr{√n(θ̂₀ − θ) ≤ t} = Φ(t) − {1/(2√n)} t²φ(t) sgn t + o(n^{−1/2}) = F_n(t) (say).   (3.11)

Then the asymptotic density f_n(t) of θ̂₀ is given by

f_n(t) = F_n′(t) = φ(t) − {1/(2√n)}(2t − t³)φ(t) sgn t + o(n^{−1/2}).

Hence the asymptotic variance of θ̂₀ is given by

V_θ(√n θ̂₀) = ∫_{−∞}^{∞} t² f_n(t) dt = 1 + 2√(2/π) n^{−1/2}(1 + o(1)),

which is consistent with (3.8). Next we obtain the asymptotic distribution of θ̂_k up to the order n^{−1/2}. The proof is given in Section 4.

Theorem 3.3. For k = r√s + o(√s) = (1/2)ρ√n + o(√n), where ρ = r√2 > 0 and n = 2s+1, the asymptotic distribution of θ̂_k up to the order n^{−1/2} is given by

Pr{√n(θ̂_k − θ) ≤ t} = Φ(t) − {φ(t)/(2√n)}[ −ρt + (1/2){(t−ρ)|t−ρ| + (t+ρ)|t+ρ|} ] + o(n^{−1/2}).

Remark 3.5. Letting r = 0, we see that, for k = 0, the asymptotic distribution of the estimator θ̂₀ is consistent with that given by (3.11). From (3.11) and Theorem 3.3 we have the following (the proof is straightforward).
Corollary 3.2. For the estimators θ̂₀ and θ̂_k,

Pr{√n(θ̂₀ − θ) ≤ t} > Pr{√n(θ̂_k − θ) ≤ t} + o(1/√n)   for t < −ρ and for 0 < t < ρ,
Pr{√n(θ̂₀ − θ) ≤ t} < Pr{√n(θ̂_k − θ) ≤ t} + o(1/√n)   for −ρ < t < 0 and for t > ρ.
Remark 3.6. From Corollary 3.2 we see that, for 0 < t < ρ, θ̂₀ is asymptotically better than θ̂_k in the sense of the concentration probability. That is,

Pr{√n|θ̂₀ − θ| ≤ t} > Pr{√n|θ̂_k − θ| ≤ t} + o(1/√n)   for 0 < t < ρ,

and, for t > ρ, θ̂_k is asymptotically better than θ̂₀ in the same sense as the above, i.e.,

Pr{√n|θ̂₀ − θ| ≤ t} < Pr{√n|θ̂_k − θ| ≤ t} + o(1/√n)   for t > ρ.
Recently, Sugiura & Naing (1989) extended the above results to other estimators based on a weighted linear combination of the sample median and pairs of order statistics.
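A small Monte Carlo experiment (ours; the sample sizes and seed are arbitrary choices) illustrates the variance expansion (3.6): for double exponential samples the normalized variance of θ̂_k is close to 1 + c_ρ/√n.

```python
import random
from math import erf, exp, pi, sqrt

def c(rho):
    # equation (3.7)
    phi = exp(-rho*rho/2)/sqrt(2*pi)
    Phi = 0.5*(1 + erf(rho/sqrt(2)))
    return 2*(2*phi + rho*(2*Phi - 1.5))

random.seed(12345)
s, k, reps = 100, 7, 10000
n = 2*s + 1
rho = (k/sqrt(s))*sqrt(2)
acc = 0.0
for _ in range(reps):
    # double exponential (Laplace) variates with theta = 0
    x = sorted(random.choice((-1.0, 1.0))*random.expovariate(1.0)
               for _ in range(n))
    theta_hat = 0.5*(x[s - k] + x[s + k])   # (X_(s-k+1) + X_(s+k+1))/2, 0-based
    acc += n*theta_hat**2
mc_var = acc/reps
pred = 1 + c(rho)/sqrt(n)
print(mc_var, pred)
```

The Monte Carlo value exceeds 1 by roughly c_ρ/√n, as (3.6) predicts.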
4. Proofs

In this section we give the proofs of Theorems 3.2 and 3.3, based on the following lemmas.
Lemma 4.1. Let Y₁, ..., Y_n be i.i.d. random variables with an exponential density function

f(y, θ) = { e^{−y/θ}/θ   for y > 0,   (4.1)
          { 0            for y ≤ 0,

and let g(Y₁, ..., Y_n) be homogeneous in Y₁, ..., Y_n of degree 0, so that g(κY₁, ..., κY_n) = g(Y₁, ..., Y_n) for all κ > 0. Then g(Y₁, ..., Y_n) and Σ_{i=1}^{n} Y_i are independent.

Proof. It is clear that Σ_{i=1}^{n} Y_i is a complete sufficient statistic for θ. Since the distribution of g(Y₁, ..., Y_n) does not depend on θ, then, by the theorem of Basu (1955), it is independent of Σ_{i=1}^{n} Y_i.
Lemma 4.2. Under the same conditions as Lemma 4.1,

E[g(Y₁, ..., Y_n)] = E[g(Y₁, ..., Y_n)(Σ_{i=1}^{n} Y_i)^a] / E[(Σ_{i=1}^{n} Y_i)^a]

for all real numbers a. The proof is straightforward from Lemma 4.1.

Proof of Theorem 3.2. Without loss of generality, we assume that θ = 0. Let F₀(x) be the distribution function of X_i (i = 1, ..., n). Then we have X_(i) = F₀^{−1}(U_(i)) (i = 1, ..., n), where the U_(i) are order statistics from the uniform distribution on the interval (0, 1). We also obtain

F₀^{−1}(u) = −{sgn(u − 1/2)} log(1 − |2u − 1|).

We put U = U_(s−k+1) and V = U_(s+k+1). Then we have

√n θ̂_k = (√n/2){F₀^{−1}(U) + F₀^{−1}(V)}
 = √n(U + V − 1) + n^{−1/2}{√n(U − 1/2)|√n(U − 1/2)|} + n^{−1/2}{√n(V − 1/2)|√n(V − 1/2)|} + o_p(n^{−1/2}).

Putting X = √n(U − 1/2), Y = √n(V − 1/2), we obtain

√n θ̂_k = X + Y + n^{−1/2}(X|X| + Y|Y|) + o_p(n^{−1/2}).   (4.2)

If Y₁, ..., Y_{2s+2} are i.i.d. random variables with the exponential density function (4.1) with θ = 1, then it is known that

U_(j) =_d Σ_{i=1}^{j} Y_i / Σ_{i=1}^{2s+2} Y_i

for each j = 1, ..., 2s+1. Using this fact we have, by Lemma 4.2,

E[(U + V − 1)²] = E[(Σ_{i=1}^{s−k+1} Y_i − Σ_{i=s+k+2}^{2s+2} Y_i)²] / E[(Σ_{i=1}^{2s+2} Y_i)²] = (s−k+1)/{(s+1)(2s+3)}.
Hence we obtain

V(X + Y) = E[(X + Y)²] = E[{√n(U + V − 1)}²] = (2s+1)(s−k+1)/{(s+1)(2s+3)} = 1 − ρ/√n + o(n^{−1/2}).   (4.3)

Since E(X − Y) = E[√n(U − V)] = −ρ + o(1), it follows similarly that

V(X − Y) = E[{√n(U − V)}²] − ρ² = n^{−1/2}(ρ + o(1)).

We put
Z = X + Y,   W = s^{1/4}(X − Y + ρ).

Since

X = (1/2){(X + Y) + (X − Y)} = (1/2)(Z − ρ + s^{−1/4}W),
Y = (1/2){(X + Y) − (X − Y)} = (1/2)(Z + ρ − s^{−1/4}W),

we have from (4.2)

√n θ̂_k = Z + {1/(4√n)}[(Z − ρ)|Z − ρ| + (Z + ρ)|Z + ρ|] + o_p(n^{−1/2}).   (4.4)
Hence

V(√n θ̂_k) = E(Z²) + {1/(2√n)}{E[Z(Z − ρ)|Z − ρ|] + E[Z(Z + ρ)|Z + ρ|]} + o(n^{−1/2}).   (4.5)

For any constant c we have

E[Z(Z − c)|Z − c|] = E[|Z − c|³] + cE[(Z − c)|Z − c|].   (4.6)

We obtain

∫_{−∞}^{∞} |x − c|³ φ(x) dx = 2(c² + 2)φ(c) + (c³ + 3c){2Φ(c) − 1},   (4.7)
∫_{−∞}^{∞} (x − c)|x − c| φ(x) dx = −2cφ(c) + (c² + 1){1 − 2Φ(c)}.   (4.8)
Since Z is asymptotically normally distributed with mean 0 and variance 1 − r/√s = 1 − ρ/√n, we have from (4.6)–(4.8) that

E[Z(Z − c)|Z − c|] = 4φ(c) + 2c{2Φ(c) − 1}.

Hence we obtain

E[Z(Z − ρ)|Z − ρ|] = E[Z(Z + ρ)|Z + ρ|] = 4φ(ρ) + 2ρ{2Φ(ρ) − 1}.   (4.9)

From (4.3), (4.5) and (4.9) we have

V(√n θ̂_k) = 1 + 2n^{−1/2}[2φ(ρ) + ρ{2Φ(ρ) − 3/2}] + o(n^{−1/2}).

The coefficient of n^{−1/2} here yields c_ρ as at (3.7), proving (3.6).

Proof of Theorem 3.3. Assume without loss of generality that θ = 0. From (4.4) we have that Pr{√n θ̂_k ≤ t} equals
Pr{Z + {1/(4√n)}[(Z − ρ)|Z − ρ| + (Z + ρ)|Z + ρ|] ≤ t + o_p(n^{−1/2})}
 = Pr{Z ≤ t − {1/(4√n)}[(t − ρ)|t − ρ| + (t + ρ)|t + ρ|] + o_p(n^{−1/2})}.

Since Z is asymptotically normally distributed with mean 0 and variance 1 − ρ/√n + o(n^{−1/2}), we obtain

Pr{√n θ̂_k ≤ t} = Φ(t) − {1/(2√n)} φ(t)( ρt + sgn(t)[(|t| − ρ)⁺]² ) + o(n^{−1/2}),

which agrees with the statement of Theorem 3.3.
References

AKAHIRA, M. & TAKEUCHI, K. (1981). Asymptotic Efficiency of Statistical Estimators: Concepts and Higher Order Asymptotic Efficiency. Lecture Notes in Statistics 7. New York: Springer-Verlag.
BASU, D. (1955). On statistics independent of a complete sufficient statistic. Sankhya 15, 377-380.
FISHER, R.A. (1922). On the mathematical foundations of theoretical statistics. Philos. Trans. Roy. Soc. London Ser. A 222, 309-368.
FISHER, R.A. (1925). Theory of statistical estimation. Proc. Cambridge Philos. Soc. 22, 700-725.
FISHER, R.A. (1934). Two new properties of mathematical likelihood. Proc. Roy. Soc. London Ser. A 144, 285-307.
RAO, C.R. (1961). Asymptotic efficiency and limiting information. Proc. Fourth Berkeley Symp. Math. Statist. Probab. 1, 531-545.
SUGIURA, N. & NAING, M.T. (1989). Improved estimators for the location of double exponential distribution. Comm. Statist. A—Theory Methods 18, 541-554.
TAKEUCHI, K. & AKAHIRA, M. (1976). On the second order asymptotic efficiencies of estimators. In Proc. Third Japan-USSR Symp. Probab. Theory, eds. G. Maruyama and J.V. Prokhorov, 604-638. Lecture Notes in Mathematics 550. Berlin: Springer-Verlag.
Ann. Inst. Statist. Math. Vol. 43, No. 2, 297-310 (1991)
BOOTSTRAP METHOD AND EMPIRICAL PROCESS

MASAFUMI AKAHIRA¹ AND KEI TAKEUCHI²
¹Institute of Mathematics, University of Tsukuba, Tsukuba, Ibaraki 305, Japan
²Research Center for Advanced Science and Technology, University of Tokyo, 4-6-1 Komaba, Meguro-ku, Tokyo 156, Japan
(Received August 24, 1989; revised February 22, 1990)
Abstract. In this paper we consider the sampling properties of the bootstrap process, that is, the empirical process obtained from a random sample of size n (with replacement) of a fixed sample of size n from a continuous distribution. The cumulants of the bootstrap process are given up to the order n^{−1} and their unbiased estimation is discussed. Furthermore, it is shown that the bootstrap process has an asymptotic minimax property for some class of distributions up to the order n^{−1/2}.
Key words and phrases: Bootstrap process, cumulants, unbiased estimators, asymptotic minimax property.
1. Introduction

The bootstrap method may be reviewed from different viewpoints. In this paper, we intend to consider the sampling properties of the bootstrap process, that is, the empirical process derived from the bootstrap sampling, i.e. that obtained from a random sample of size n (with replacement) of a fixed sample of size n of a certain distribution. Let X₁, ..., X_n be a sample of size n from a population with the distribution function F(t) and let X₁*, ..., X_n* be a bootstrap sample of size n, that is, a random sample of size n from X₁, ..., X_n. Let the empirical distribution functions obtained from (X₁, ..., X_n) and (X₁*, ..., X_n*) be denoted by F_n(t) and F_n*(t), respectively. It is well known that √n(F_n(t) − F(t)) approaches a Gaussian process as n → ∞, and, given F_n(t), √n(F_n*(t) − F_n(t)) conditionally approaches a Gaussian process with the same variance and covariance as √n(F_n(t) − F(t)) with F(t) replaced by F_n(t). Hence √n(F_n*(t) − F_n(t)) can be considered to be a consistent estimator of √n(F_n(t) − F(t)). Note that √n(F_n(t) − F(t)) cannot usually be observed since F(t) is unknown, whereas the distribution of √n(F_n*(t) − F_n(t)) can be completely computed from the sample. We further investigate how √n(F_n*(t) − F_n(t)) will differ from √n(F_n(t) − F(t)) in higher order terms and we discuss possible improvements on F_n*(t). In many problems of statistical inference, the procedures will depend on the distribution of a statistic T_n under an unknown distribution F(t), which in many
cases can be discussed in terms of √n(F_n(t) − F(t)), at least asymptotically. Hence √n(F_n*(t) − F_n(t)) can be used instead of √n(F_n(t) − F(t)) in the derivation of the asymptotic distribution of the statistic. It will be shown that the method is in a sense asymptotically efficient in a nonparametric (or semiparametric) framework. A first order approximation was considered in the work of Efron (1979, 1982), and Beran (1982) considered a second order approximation from a different viewpoint. The purpose of this paper is to compute the cumulants up to the order n^{−1} and to show that the bootstrap process is in a sense, asymptotically, the best estimator of the empirical process up to the order n^{−1/2}, whereas in terms of the order n^{−1} there are many complications and, although a slight improvement is possible over the usual bootstrap process, no uniformly optimal results seem to be obtainable. The bootstrap method is used to estimate the distribution of some statistic T_n under a general unknown population distribution, and it is shown that it is asymptotically best up to the second order in the sense that the estimator of the asymptotic variance as well as that of the asymptotic distribution of T_n cannot be uniformly improved if the class of possible population distributions is sufficiently wide.
2. Unbiased estimation of cumulants of the empirical process

In the framework of Section 1, we put W_n(t) = √n(F_n(t) − F(t)) and W_n*(t) = √n(F_n*(t) − F_n(t)). Consequently we have the following.

LEMMA 2.1. The cumulants of W_n(t) are given, up to the fourth order, as follows:

E[W_n(t)] = 0,
Cov(W_n(t₁), W_n(t₂)) = F(t₁)(1 − F(t₂))   for t₁ ≤ t₂,
κ₃(W_n(t₁), W_n(t₂), W_n(t₃)) = (1/√n)F(t₁)(1 − 2F(t₂))(1 − F(t₃))   for t₁ ≤ t₂ ≤ t₃,
κ₄(W_n(t₁), W_n(t₂), W_n(t₃), W_n(t₄)) = (1/n)F(t₁)(1 − F(t₄))(1 − 4F(t₂) − 2F(t₃) + 6F(t₂)F(t₃))   for t₁ ≤ t₂ ≤ t₃ ≤ t₄.
The proof is given in Section 4, but Lemma 2.1 may also be derived from Lemma 3.1 of Withers (1983). From Lemma 2.1 we have the following.

LEMMA 2.2. Given F_n(t), the conditional cumulants of W_n*(t) are given, up to the fourth order, as follows:

E[W_n*(t) | F_n(t)] = 0,
Cov(W_n*(t₁), W_n*(t₂) | F_n(t₁), F_n(t₂)) = F_n(t₁)(1 − F_n(t₂))   for t₁ ≤ t₂,
κ₃(W_n*(t₁), W_n*(t₂), W_n*(t₃) | F_n(t₁), F_n(t₂), F_n(t₃)) = (1/√n)F_n(t₁)(1 − 2F_n(t₂))(1 − F_n(t₃))   for t₁ ≤ t₂ ≤ t₃,
κ₄(W_n*(t₁), W_n*(t₂), W_n*(t₃), W_n*(t₄) | F_n(t₁), F_n(t₂), F_n(t₃), F_n(t₄)) = (1/n)F_n(t₁)(1 − F_n(t₄))(1 − 4F_n(t₂) − 2F_n(t₃) + 6F_n(t₂)F_n(t₃))   for t₁ ≤ t₂ ≤ t₃ ≤ t₄.

LEMMA 2.3. The following hold:

E[F_n(t₁)(1 − F_n(t₂))] = {1 − (1/n)}F(t₁)(1 − F(t₂))   for t₁ ≤ t₂,
E[F_n(t₁)(1 − 2F_n(t₂))(1 − F_n(t₃))] = {1 − (1/n)}{1 − (2/n)}F(t₁)(1 − 2F(t₂))(1 − F(t₃))   for t₁ ≤ t₂ ≤ t₃,
E[F_n(t₁)(1 − F_n(t₄))(1 − 4F_n(t₂) − 2F_n(t₃) + 6F_n(t₂)F_n(t₃))]
 = {1 − (1/n)}{1 − (2/n)}{1 − (3/n)}F(t₁)(1 − F(t₄))(1 − 4F(t₂) − 2F(t₃) + 6F(t₂)F(t₃)) − (1/n){1 − (1/n)}F(t₁)(1 − F(t₄))   for t₁ ≤ t₂ ≤ t₃ ≤ t₄.

The proof is given in Section 4. From Lemmas 2.2 and 2.3 it is seen that, given F_n(t), the conditional cumulants of W_n*(t) are not unbiased estimators of the corresponding cumulants of W_n(t).

LEMMA 2.4. The (unconditional) cumulants of W_n*(t) are given, up to the fourth order, as follows:
E[W_n*(t)] = 0,
Cov(W_n*(t₁), W_n*(t₂)) = {1 − (1/n)}F(t₁)(1 − F(t₂))   for t₁ ≤ t₂,
κ₃(W_n*(t₁), W_n*(t₂), W_n*(t₃)) = (1/√n){1 − (1/n)}{1 − (2/n)}F(t₁)(1 − 2F(t₂))(1 − F(t₃))   for t₁ ≤ t₂ ≤ t₃,
κ₄(W_n*(t₁), W_n*(t₂), W_n*(t₃), W_n*(t₄))
 = (1/n){1 − (1/n)}{1 − (2/n)}{1 − (3/n)}F(t₁)(1 − F(t₄))(1 − 4F(t₂) − 2F(t₃) + 6F(t₂)F(t₃))
 − (1/n²){1 − (1/n)}F(t₁)(1 − F(t₄))
 + (1/n)F(t₁)(1 − F(t₄))(3 − 8F(t₂) − 4F(t₃) + 12F(t₂)F(t₃))
 − (2/n²)F(t₁)(1 − F(t₄))(3 − 10F(t₂) − 5F(t₃) + 15F(t₂)F(t₃))
 + (3/n³)F(t₁)(1 − F(t₄))(1 − 4F(t₂) − 2F(t₃) + 6F(t₂)F(t₃)) + o(1/n²)   for t₁ ≤ t₂ ≤ t₃ ≤ t₄.

The proof is given in Section 4. From Lemmas 2.1, 2.2 and 2.3 we also have the following.
THEOREM 2.1. Unbiased estimators of the covariance Cov(W_n(t₁), W_n(t₂)) and of the third order cumulant κ₃(W_n(t₁), W_n(t₂), W_n(t₃)) are given by

{n/(n−1)} Cov(W_n*(t₁), W_n*(t₂) | F_n(t₁), F_n(t₂)) = {n/(n−1)}F_n(t₁)(1 − F_n(t₂))   for t₁ ≤ t₂,
{n²/((n−1)(n−2))} κ₃(W_n*(t₁), W_n*(t₂), W_n*(t₃) | F_n(t₁), F_n(t₂), F_n(t₃)) = {n√n/((n−1)(n−2))}F_n(t₁)(1 − 2F_n(t₂))(1 − F_n(t₃))   for t₁ ≤ t₂ ≤ t₃,

respectively.

PROOF. From Lemmas 2.1, 2.2 and 2.3 we have for t₁ ≤ t₂

E[Cov(W_n*(t₁), W_n*(t₂) | F_n(t₁), F_n(t₂))] = E[F_n(t₁)(1 − F_n(t₂))] = {1 − (1/n)}F(t₁)(1 − F(t₂)) = {(n−1)/n} Cov(W_n(t₁), W_n(t₂)),

hence {n/(n−1)} Cov(W_n*(t₁), W_n*(t₂) | F_n(t₁), F_n(t₂)) = {n/(n−1)}F_n(t₁)(1 − F_n(t₂)) is an unbiased estimator of Cov(W_n(t₁), W_n(t₂)). In a similar way we obtain for t₁ ≤ t₂ ≤ t₃

E[κ₃(W_n*(t₁), W_n*(t₂), W_n*(t₃) | F_n(t₁), F_n(t₂), F_n(t₃))] = E[(1/√n)F_n(t₁)(1 − 2F_n(t₂))(1 − F_n(t₃))] = (1/√n){(n−1)(n−2)/n²}F(t₁)(1 − 2F(t₂))(1 − F(t₃)) = {(n−1)(n−2)/n²}κ₃(W_n(t₁), W_n(t₂), W_n(t₃)),

hence {n²/((n−1)(n−2))}κ₃(W_n*(t₁), W_n*(t₂), W_n*(t₃) | F_n(t₁), F_n(t₂), F_n(t₃)) = {n√n/((n−1)(n−2))}F_n(t₁)(1 − 2F_n(t₂))(1 − F_n(t₃)) is an unbiased estimator of κ₃(W_n(t₁), W_n(t₂), W_n(t₃)). Thus we complete the proof.

Remark 2.1. Let X₁*, ..., X*_{n−1} be a bootstrap sample of size n−1, that is, a random sample of size n−1 from X₁, ..., X_n. We put W*_{n−1}(t) = √n(F*_{n−1}(t) − F_n(t)) with the empirical distribution function F*_{n−1}(t) of X₁*, ..., X*_{n−1}. Then it follows from Lemmas 2.1, 2.2, 2.3 and Theorem 2.1 that

Cov(W*_{n−1}(t₁), W*_{n−1}(t₂) | F_n(t₁), F_n(t₂)) = {n/(n−1)}F_n(t₁)(1 − F_n(t₂))

is also an unbiased estimator of Cov(W_n(t₁), W_n(t₂)), but Cov(W_n*(t₁), W_n*(t₂) | F_n(t₁), F_n(t₂)) is not unbiased for it. Hence it is desirable to use the bootstrap sample of size n−1 in place of size n. The biases of the higher order cumulants also become smaller.
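The unbiasedness in Theorem 2.1 can be verified exactly for small n by enumerating the trinomial distribution of (F_n(t₁), F_n(t₂)) (an illustrative sketch of ours, not part of the paper):

```python
from math import comb

def expect_cov_term(n, F1, F2):
    # E[F_n(t1)(1 - F_n(t2))] for t1 <= t2, by exact trinomial enumeration:
    # a = #{X_i <= t1}, b = #{t1 < X_i <= t2}
    q = F2 - F1
    total = 0.0
    for a in range(n + 1):
        for b in range(n - a + 1):
            pmf = comb(n, a)*comb(n - a, b) * F1**a * q**b * (1 - F2)**(n - a - b)
            total += pmf * (a/n) * (1 - (a + b)/n)
    return total

n, F1, F2 = 12, 0.3, 0.7
e = expect_cov_term(n, F1, F2)
print(e, (1 - 1/n)*F1*(1 - F2), (n/(n - 1))*e)
```

The raw conditional covariance has expectation (1 − 1/n)F(t₁)(1 − F(t₂)), and after the n/(n−1) correction it matches Cov(W_n(t₁), W_n(t₂)) = F(t₁)(1 − F(t₂)) exactly.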
3. Minimax property of the bootstrap estimator

In this section we consider the estimation problem, based on the i.i.d. sample X₁, ..., X_n, of some real parameter θ which can be defined as a functional θ = Ψ(F) of a continuous distribution F. Then the natural estimator is θ̂_n = Ψ(F_n), where F_n is the empirical distribution function. We shall show that the bootstrap estimator of the distribution of θ̂_n has a minimax property for some parametric family of distributions. We assume the following condition.

(A.1) The functional Ψ is Fréchet differentiable up to the third order, that is, there are functions ∂Ψ/∂F, ∂²Ψ/∂F∂F and ∂³Ψ/∂F∂F∂F such that

(3.1) Ψ(G) − Ψ(F) = ∫_{−∞}^{∞} (∂Ψ/∂F) d(G − F) + (1/2)∫∫ (∂²Ψ/∂F∂F) d(G − F)d(G − F) + (1/6)∫∫∫ (∂³Ψ/∂F∂F∂F) d(G − F)d(G − F)d(G − F) + o(‖G − F‖³),

where ‖G − F‖ = sup_x |G(x) − F(x)|. Putting W_n(x) = √n(F_n(x) − F(x)), we have from (3.1)

(3.2) √n(θ̂_n − θ) = ∫ φ₁(x) dW_n(x) + {1/(2√n)} ∫∫ φ₂(x, y) dW_n(x)dW_n(y) + {1/(6n)} ∫∫∫ φ₃(x, y, z) dW_n(x)dW_n(y)dW_n(z) + o_p(1/n),

where φ₁(x) = (∂Ψ/∂F)(x), φ₂(x, y) = (∂²Ψ/∂F∂F)(x, y) and φ₃(x, y, z) = (∂³Ψ/∂F∂F∂F)(x, y, z). We also assume that the following holds.

(A.2) ∫ φ₁(x) dF(x) = 0, ∫ φ₂(x, t) dF(t) = ∫ φ₂(s, y) dF(s) = 0, ∫ φ₃(x, y, u) dF(u) = ∫ φ₃(x, t, z) dF(t) = ∫ φ₃(s, y, z) dF(s) = 0, and the functions φ₂(x, y) and φ₃(x, y, z) are symmetric in (x, y) and (x, y, z), respectively.

Furthermore, writing T_n = √n(θ̂_n − θ), we assume the following condition.

(A.3) E(T_n⁴) < ∞.
LEMMA 3.1. Assume that the conditions (A.1), (A.2) and (A.3) hold. Then the asymptotic cumulants of T_n are given as follows:

E(T_n) = {1/(2√n)}{∫ φ₂(x, x) dF(x) − ∫∫ φ₂(x, y) dF(x)dF(y)} + o(1/n) = (1/√n)b₁ + o(1/n)   (say),

V(T_n) = ∫ φ₁²(x) dF(x) − {∫ φ₁(x) dF(x)}²
 + (1/n){ ∫ φ₁(x)φ₂(x, x) dF(x) − 3∫∫ φ₁(x)φ₂(x, y) dF(x)dF(y) + 2(∫ φ₁(x) dF(x))∫∫ φ₂(x, y) dF(x)dF(y) }
 + {1/(2n)}{ ∫ φ₂(x, x) dF(x) − ∫∫ φ₂(x, y) dF(x)dF(y) }²
 + (1/n){ ∫∫ φ₂(x, y)φ₂(y, y) dF(x)dF(y) − 2∫∫∫ φ₁(x)φ₃(x, y, z) dF(x)dF(y)dF(z) + ∫∫∫∫ φ₁(x)φ₃(y, z, u) dF(x)dF(y)dF(z)dF(u) } + o(1/n)
 = v₀ + (1/n)v₁ + o(1/n)   (say),

κ₃(T_n) = E[{T_n − E(T_n)}³]
 = (1/√n){ ∫ φ₁³(x) dF(x) − 3(∫ φ₁²(x) dF(x))(∫ φ₁(x) dF(x)) + 2(∫ φ₁(x) dF(x))³
 + 3∫∫ φ₁(x)φ₁(y)φ₂(x, y) dF(x)dF(y) − 3(∫ φ₁²(x) dF(x))∫∫ φ₂(x, y) dF(x)dF(y)
 + 3(∫ φ₁(x) dF(x))∫∫ φ₂(x, y) dF(x)dF(y) + (3/2)(∫ φ₁(x) dF(x))²∫ φ₂(x, x) dF(x) } + o(1/n)
 = (1/√n)β₃ + o(1/n)   (say),

κ₄(T_n) = E[{T_n − E(T_n)}⁴] − 3{V(T_n)}² = O(1/n).
The proof is omitted since Lemma 3.1 is similar to Theorem 3.1 of Withers (1983).

Remark 3.1. With the condition (A.3) and from the fact that there exists a finite positive constant c such that

P{√n sup_x |F_n(x) − F(x)| > r} ≤ ce^{−2r²}   (Dvoretzky et al. (1956))

holds for all r > 0 and all positive integers n, it follows that the above expansion with the remainder term is valid. For an estimator θ̂_n* based on the bootstrap sample X₁*, ..., X_n* of size n, we put T_n* = √n(θ̂_n* − θ̂_n).

LEMMA 3.2. Assume that the conditions (A.1), (A.2) and (A.3) hold. Then the conditional asymptotic cumulants of T_n*, given the empirical distribution function F_n, have the following form:

E[T_n* | F_n] = (1/√n)b₁ + (1/n)ξ₁ + o_p(1/n),
V(T_n* | F_n) = v₀ + (1/√n)ξ₂ + (1/n)v₁ + o_p(1/n),
κ₃(T_n* | F_n) = (1/√n)β₃ + (1/n)ξ₃ + o_p(1/n),
κ₄(T_n* | F_n) = κ₄(T_n) + o_p(1/n),

where ξ₁ = O_p(1), ξ₂ = O_p(1), ξ₃ = O_p(1), and b₁, v₀, v₁ and β₃ are the constants given in Lemma 3.1. The proof is given in Section 4.

Remark 3.2. In order to evaluate the bootstrap estimator θ̂_n*, it is seen from Lemmas 3.1 and 3.2 that the variance of ξ₂ = √n(V(T_n* | F_n) − V(T_n)) + o_p(1/√n) plays an important part.

LEMMA 3.3.
Under the conditions (A.1), (A.2) and (A.3), the variance of ξ₂ is given by

V(ξ₂) = ∫ φ₁²(x){φ₁(x) − 2m}² dF(x) − {∫ φ₁²(x) dF(x) − 2m²}²,

where m = ∫ φ₁(x) dF(x). The proof is given in Section 4.

Now we consider a parametric family F = {F_θ : θ ∈ Θ} of distribution functions, where Θ is an open set of R¹ containing the origin. Take F_{θ₀} as the previous distribution function F. We assume that, for each θ ∈ Θ, the distribution function F_θ is absolutely continuous with respect to a σ-finite measure μ, and denote dF_θ(x)/dμ(x) by f_θ(x). For each θ ∈ Θ, we put

v_θ = ∫ φ₁²(x) f_θ(x) dμ − {∫ φ₁(x) f_θ(x) dμ}².
Since θ̂_n = Ψ(F_n) is an asymptotically unbiased estimator of θ, we have by Taylor expansion of v_θ around θ = θ₀

v_{θ̂_n} = v_{θ₀} + [∂v_θ/∂θ]_{θ=θ₀}(θ̂_n − θ₀) + o_p(1/√n),

hence the variance of v_{θ̂_n} is given by

V_{θ₀}(v_{θ̂_n}) = ([∂v_θ/∂θ]_{θ=θ₀})² V_{θ₀}(θ̂_n) + o(1/n).

Assume that the Fisher information amount I(θ) exists, i.e.

0 < I(θ) = ∫ {∂ log f_θ(x)/∂θ}² f_θ(x) dμ < ∞;

then we have by the Cramér–Rao inequality that

(3.3) nV_{θ₀}(v_{θ̂_n}) ≥ ([∂v_θ/∂θ]_{θ=θ₀})²/I(θ₀) + o(1),

provided that differentiation under the integral sign is allowed. We further restrict our attention to a family of subclasses

F_ψ = {F_θ : dF_θ(x)/dμ = f_θ(x) with log(f_θ(x)/f_{θ₀}(x)) = c(θ) + θψ(x) a.e. [μ], c(0) = 0}

of F, where ψ(x) is a function with finite variance under f_{θ₀}. Then we have the following.

THEOREM 3.1. Assume that the conditions (A.1), (A.2) and (A.3) hold. Then the bootstrap estimator v_n* of v_{θ₀} has a minimax property in the above family, i.e.

max_{F_ψ} min_{θ̂_n} nV_{θ₀}(v_{θ̂_n}) = nV_{θ₀}(v_n*) + o(1),

provided that differentiation under the integral sign is allowed. The proof is given in Section 4.

Remark 3.3. From Theorem 3.1 we see that the maximum of the relative efficiency of the bootstrap estimator v_n* is equal to 1 + o(1), i.e.

max_{F_ψ} [ min_{θ̂_n} nV_{θ₀}(v_{θ̂_n}) / nV_{θ₀}(v_n*) ] = 1 + o(1).

It also follows from Theorem 3.1 that in a semiparametric situation where the class of distributions is sufficiently wide to include F_ψ, it is impossible to get an estimator with a smaller asymptotic variance than v_n*.
4. Proofs

In this section the proofs of the lemmas and theorems are given. In order to prove Lemma 2.1 we use the following.

LEMMA 4.1. Let Z be a real random variable. Assume that, for each i = 1, 2, 3, 4, Y_i = 1 for Z ≤ c_i and Y_i = 0 for Z > c_i, where c₁ ≤ c₂ ≤ c₃ ≤ c₄. Then

κ₃(Y₁, Y₂, Y₃) = E[(Y₁ − p₁)(Y₂ − p₂)(Y₃ − p₃)] = p₁(1 − 2p₂)(1 − p₃),
κ₄(Y₁, Y₂, Y₃, Y₄) = E[(Y₁ − p₁)(Y₂ − p₂)(Y₃ − p₃)(Y₄ − p₄)] − Cov(Y₁, Y₂)Cov(Y₃, Y₄) − Cov(Y₁, Y₃)Cov(Y₂, Y₄) − Cov(Y₁, Y₄)Cov(Y₂, Y₃)
 = p₁(1 − p₄)(1 − 4p₂ − 2p₃ + 6p₂p₃),

where, for each i = 1, 2, 3, 4, p_i = P{Z ≤ c_i} and Cov(·,·) denotes the covariance.

PROOF. It is seen that p₁ ≤ p₂ ≤ p₃ ≤ p₄. Since E(Y_i) = p_i (i = 1, 2, 3), E(Y₁Y₂) = E(Y₁) = p₁, E(Y₂Y₃) = E(Y₂) = p₂ and E(Y₁Y₂Y₃) = E(Y₁) = p₁, it follows that

κ₃(Y₁, Y₂, Y₃) = E(Y₁Y₂Y₃) − p₁E(Y₂Y₃) − p₂E(Y₁Y₃) − p₃E(Y₁Y₂) + 2p₁p₂p₃ = p₁(1 − 2p₂)(1 − p₃).
In a similar way, we have

E[(Y₁ − p₁)(Y₂ − p₂)(Y₃ − p₃)(Y₄ − p₄)] = {p₁(1 − 2p₂)(1 − p₃) + p₁p₂p₃}(1 − p₄),
Cov(Y_i, Y_j) = p_i(1 − p_j)   (1 ≤ i ≤ j ≤ 4).

Hence we obtain

κ₄(Y₁, Y₂, Y₃, Y₄) = E[(Y₁ − p₁)(Y₂ − p₂)(Y₃ − p₃)(Y₄ − p₄)] − Cov(Y₁, Y₂)Cov(Y₃, Y₄) − Cov(Y₁, Y₃)Cov(Y₂, Y₄) − Cov(Y₁, Y₄)Cov(Y₂, Y₃) = p₁(1 − p₄)(1 − 4p₂ − 2p₃ + 6p₂p₃).

Thus we complete the proof.

PROOF OF LEMMA 2.1. Since W_n(t) = √n(F_n(t) − F(t)), it is easily seen that E[W_n(t)] = 0 and Cov(W_n(t₁), W_n(t₂)) = F(t₁)(1 − F(t₂)) for t₁ ≤ t₂. From Lemma 4.1 we have

κ₃(W_n(t₁), W_n(t₂), W_n(t₃)) = (1/√n)F(t₁)(1 − 2F(t₂))(1 − F(t₃))   for t₁ ≤ t₂ ≤ t₃,
κ₄(W_n(t₁), W_n(t₂), W_n(t₃), W_n(t₄)) = (1/n)F(t₁)(1 − F(t₄))(1 − 4F(t₂) − 2F(t₃) + 6F(t₂)F(t₃))   for t₁ ≤ t₂ ≤ t₃ ≤ t₄.
This completes the proof.
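Lemma 4.1 can be checked by direct enumeration for a discrete Z (our sketch, not part of the paper; the thresholds are arbitrary):

```python
def cumulants_check(c1, c2, c3, c4, N=10):
    # Z uniform on {1,...,N}; Y_i = 1{Z <= c_i}, so p_i = c_i/N
    p = [c/N for c in (c1, c2, c3, c4)]
    Y = lambda z, c: 1.0 if z <= c else 0.0
    E = lambda f: sum(f(z) for z in range(1, N + 1))/N
    k3 = E(lambda z: (Y(z, c1) - p[0])*(Y(z, c2) - p[1])*(Y(z, c3) - p[2]))
    m4 = E(lambda z: (Y(z, c1) - p[0])*(Y(z, c2) - p[1])
                     *(Y(z, c3) - p[2])*(Y(z, c4) - p[3]))
    cov = lambda i, j: p[i]*(1 - p[j])          # Cov(Y_i, Y_j) for i <= j
    k4 = m4 - cov(0, 1)*cov(2, 3) - cov(0, 2)*cov(1, 3) - cov(0, 3)*cov(1, 2)
    return (k3, p[0]*(1 - 2*p[1])*(1 - p[2]),
            k4, p[0]*(1 - p[3])*(1 - 4*p[1] - 2*p[2] + 6*p[1]*p[2]))

print(cumulants_check(2, 4, 7, 9))   # k3 matches p1(1-2p2)(1-p3) = 0.012
```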
In order to prove Lemma 2.3 we use the following.

LEMMA 4.2. Suppose that, for each i = 1, 2, 3, 4, Y_i is a real random variable with mean E(Y_i) = m_i. Then

E[Y₁(1 − Y₂)] = m₁(1 − m₂) − σ₁₂,
E[Y₁(1 − 2Y₂)(1 − Y₃)] = m₁ − m₁m₃ − 2m₁m₂ + 2m₁m₂m₃ − σ₁₃ − 2σ₁₂ + 2m₁σ₂₃ + 2m₂σ₁₃ + 2m₃σ₁₂ + 2κ₁₂₃,
E[Y₁(1 − Y₄)(1 − 4Y₂ − 2Y₃ + 6Y₂Y₃)] = m₁ − 4(σ₁₂ + m₁m₂) − 2(σ₁₃ + m₁m₃) − (σ₁₄ + m₁m₄)
 + 6(κ₁₂₃ + m₁σ₂₃ + m₂σ₁₃ + m₃σ₁₂ + m₁m₂m₃) + 4(κ₁₂₄ + m₁σ₂₄ + m₂σ₁₄ + m₄σ₁₂ + m₁m₂m₄) + 2(κ₁₃₄ + m₁σ₃₄ + m₃σ₁₄ + m₄σ₁₃ + m₁m₃m₄)
 − 6(κ₁₂₃₄ + m₄κ₁₂₃ + m₃κ₁₂₄ + m₂κ₁₃₄ + m₁κ₂₃₄ + m₁m₄σ₂₃ + m₂m₄σ₁₃ + m₃m₄σ₁₂ + m₁m₃σ₂₄ + m₂m₃σ₁₄ + m₁m₂σ₃₄ + m₁m₂m₃m₄ + σ₁₂σ₃₄ + σ₁₃σ₂₄ + σ₁₄σ₂₃),

where, for 1 ≤ i ≤ j ≤ k ≤ r ≤ 4, σ_ij = Cov(Y_i, Y_j), κ_ijk = κ₃(Y_i, Y_j, Y_k) and κ_ijkr = κ₄(Y_i, Y_j, Y_k, Y_r).

PROOF. The first identity is easily derived. Since

E(Y₁Y₂Y₃) = κ₁₂₃ + m₁σ₂₃ + m₂σ₁₃ + m₃σ₁₂ + m₁m₂m₃,

it follows that

E[Y₁(1 − 2Y₂)(1 − Y₃)] = E(Y₁) − E(Y₁Y₃) − 2E(Y₁Y₂) + 2E(Y₁Y₂Y₃) = m₁ − m₁m₃ − 2m₁m₂ + 2m₁m₂m₃ − σ₁₃ − 2σ₁₂ + 2m₁σ₂₃ + 2m₂σ₁₃ + 2m₃σ₁₂ + 2κ₁₂₃.

Since

E[Y₁Y₂Y₃Y₄] = κ₁₂₃₄ + σ₁₂σ₃₄ + σ₁₃σ₂₄ + σ₁₄σ₂₃ + m₄κ₁₂₃ + m₃κ₁₂₄ + m₂κ₁₃₄ + m₁κ₂₃₄ + m₁m₄σ₂₃ + m₂m₄σ₁₃ + m₃m₄σ₁₂ + m₁m₃σ₂₄ + m₂m₃σ₁₄ + m₁m₂σ₃₄ + m₁m₂m₃m₄,

it follows that

E[Y₁(1 − Y₄)(1 − 4Y₂ − 2Y₃ + 6Y₂Y₃)] = m₁ − 4E(Y₁Y₂) − 2E(Y₁Y₃) − E(Y₁Y₄) + 6E(Y₁Y₂Y₃) + 4E(Y₁Y₂Y₄) + 2E(Y₁Y₃Y₄) − 6E(Y₁Y₂Y₃Y₄),
hence the desired result follows.

PROOF OF LEMMA 2.3. For i = 1, 2, 3, 4, we put Y_i = F_n(t_i) and m_i = F(t_i) = E[F_n(t_i)]. Then we have σ_ij = (1/n)m_i(1 − m_j), κ_ijk = (1/n²)m_i(1 − 2m_j)(1 − m_k) and κ_ijkr = (1/n³)m_i(1 − m_r)(1 − 4m_j − 2m_k + 6m_jm_k) for 1 ≤ i ≤ j ≤ k ≤ r ≤ 4. From Lemmas 4.1 and 4.2 we have the conclusion of Lemma 2.3.

PROOF OF LEMMA 2.4. From Lemmas 2.2 and 2.3 it follows that
E[W_n*(t)] = E[E[W_n*(t) | F_n(t)]] = 0,
Cov(W_n*(t₁), W_n*(t₂)) = E[Cov(W_n*(t₁), W_n*(t₂) | F_n(t₁), F_n(t₂))] = {1 − (1/n)}F(t₁)(1 − F(t₂))   for t₁ ≤ t₂,
κ₃(W_n*(t₁), W_n*(t₂), W_n*(t₃)) = E[κ₃(W_n*(t₁), W_n*(t₂), W_n*(t₃) | F_n(t₁), F_n(t₂), F_n(t₃))] = (1/√n){1 − (1/n)}{1 − (2/n)}F(t₁)(1 − 2F(t₂))(1 − F(t₃))   for t₁ ≤ t₂ ≤ t₃.

In a similar way, we have

κ₄(W_n*(t₁), W_n*(t₂), W_n*(t₃), W_n*(t₄)) = E[κ₄(W_n*(t₁), W_n*(t₂), W_n*(t₃), W_n*(t₄) | F_n(t₁), F_n(t₂), F_n(t₃), F_n(t₄))]
 + Cov(Cov(W_n*(t₁), W_n*(t₂) | F_n(t₁), F_n(t₂)), Cov(W_n*(t₃), W_n*(t₄) | F_n(t₃), F_n(t₄)))
 + Cov(Cov(W_n*(t₁), W_n*(t₃) | F_n(t₁), F_n(t₃)), Cov(W_n*(t₂), W_n*(t₄) | F_n(t₂), F_n(t₄)))
 + Cov(Cov(W_n*(t₁), W_n*(t₄) | F_n(t₁), F_n(t₄)), Cov(W_n*(t₂), W_n*(t₃) | F_n(t₂), F_n(t₃)))
 = E[κ₄(W_n*(t₁), W_n*(t₂), W_n*(t₃), W_n*(t₄) | F_n(t₁), F_n(t₂), F_n(t₃), F_n(t₄))] + γ₄   (say).
Since, by Lemma 2.2,

Cov(W_n*(t_i), W_n*(t_j) | F_n(t_i), F_n(t_j)) = F_n(t_i)(1 − F_n(t_j))   for t_i ≤ t_j,

it follows that

Cov(Cov(W_n*(t₁), W_n*(t₂) | F_n(t₁), F_n(t₂)), Cov(W_n*(t₃), W_n*(t₄) | F_n(t₃), F_n(t₄)))
 = Cov(F_n(t₁)(1 − F_n(t₂)), F_n(t₃)(1 − F_n(t₄)))
 = E[F_n(t₁)(1 − F_n(t₂))F_n(t₃)(1 − F_n(t₄))] − E[F_n(t₁)(1 − F_n(t₂))]E[F_n(t₃)(1 − F_n(t₄))].
We put Y_i = F_n(t_i) and m_i = F(t_i) = E[F_n(t_i)] for i = 1, 2, 3, 4. Then we have from Lemma 4.2

γ₄ = E[Y₁(1 − Y₂)Y₃(1 − Y₄)] − E[Y₁(1 − Y₂)]E[Y₃(1 − Y₄)]
 + E[Y₁(1 − Y₃)Y₂(1 − Y₄)] − E[Y₁(1 − Y₃)]E[Y₂(1 − Y₄)]
 + E[Y₁(1 − Y₄)Y₂(1 − Y₃)] − E[Y₁(1 − Y₄)]E[Y₂(1 − Y₃)]
 = (1/n)m₁(1 − m₄)(3 − 8m₂ − 4m₃ + 12m₂m₃) − (2/n²)m₁(1 − m₄)(3 − 10m₂ − 5m₃ + 15m₂m₃) + (3/n³)m₁(1 − m₄)(1 − 4m₂ − 2m₃ + 6m₂m₃) + o(1/n²).

Hence we obtain from Lemmas 2.2 and 2.3

κ₄(W_n*(t₁), W_n*(t₂), W_n*(t₃), W_n*(t₄))
 = (1/n){1 − (1/n)}{1 − (2/n)}{1 − (3/n)}F(t₁)(1 − F(t₄))(1 − 4F(t₂) − 2F(t₃) + 6F(t₂)F(t₃))
 − (1/n²){1 − (1/n)}F(t₁)(1 − F(t₄))
 + (1/n)F(t₁)(1 − F(t₄))(3 − 8F(t₂) − 4F(t₃) + 12F(t₂)F(t₃))
 − (2/n²)F(t₁)(1 − F(t₄))(3 − 10F(t₂) − 5F(t₃) + 15F(t₂)F(t₃))
 + (3/n³)F(t₁)(1 − F(t₄))(1 − 4F(t₂) − 2F(t₃) + 6F(t₂)F(t₃)) + o(1/n²).

Thus we complete the proof.

PROOF OF LEMMA 3.2. From Lemma 3.1, it follows that the conditional cumulants of T_n*, given F_n, have the form
E(T_n* | F_n) = (1/√n)b₁(F_n) + o_p(1/n) = (1/√n)b₁ + (1/n)ξ₁ + o_p(1/n),
V(T_n* | F_n) = v₀(F_n) + (1/n)v₁(F_n) + o_p(1/n) = v₀ + (1/√n)ξ₂ + (1/n)v₁ + o_p(1/n),
κ₃(T_n* | F_n) = (1/√n)β₃(F_n) + o_p(1/n) = (1/√n)β₃ + (1/n)ξ₃ + o_p(1/n),
κ₄(T_n* | F_n) = (1/n)β₄(F_n) + o_p(1/n) = (1/n)β₄ + o_p(1/n),

where b₁(F_n), v₀(F_n), v₁(F_n), β₃(F_n) and β₄(F_n) denote the quantities of Lemma 3.1 with F replaced by F_n, and β₄ is the constant with κ₄(T_n) = (1/n)β₄ + o(1/n).
6 =
1
00
1 00:
( x)dW ( x) - (l/ / 00
:
(x)01 (y)dW ( x)dW(y) ) ^ 00^ 001 m
00
-
f 01(x)q51(y)dWn(x)dF(y) - f f 01(x)cb1(y)dF(x)dWn(y) - Il 00 00 01(x)dWn(x) 0i (x)dWn (x) - 2m
i_:
J 00
\ 12 x)dWn(x) I - (1/^) (f &1( 00
where m = f 0. 01(x)dF(x). Then we have
E(ξ_2) = -(1/√n) E[ ∫∫ φ_1(x) φ_1(y) dW_n(x) dW_n(y) ]
       = -(1/√n) { ∫ φ_1^2(x) dF(x) - ( ∫ φ_1(x) dF(x) )^2 }
       = O(1/√n).

We also obtain

E(ξ_2^2) = E[ ∫∫ φ_1(x) φ_1(y) { φ_1(x) φ_1(y) - 4m φ_1(x) + 4m^2 } dW_n(x) dW_n(y) ] + o(1/√n)
         = ∫ φ_1^2(x) { φ_1^2(x) - 4m φ_1(x) + 4m^2 } dF(x)
           - { ∫∫ φ_1(x) φ_1(y) { φ_1(x) φ_1(y) - 4m φ_1(x) + 4m^2 } dF(x) dF(y) }·? ... 
         = ∫ φ_1^2(x) { φ_1(x) - 2m }^2 dF(x) - { ∫ φ_1^2(x) dF(x) - 2m^2 }^2 + o(1/√n).

Since V(ξ_2) = E(ξ_2^2) + o(1/√n), we have the desired result.
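The moment calculations above rest on the covariance structure of the empirical process W_n. As a quick numerical sanity check (an illustrative sketch, not part of the paper), one can verify by simulation that for i.i.d. U(0,1) samples the empirical c.d.f. satisfies n·Cov(F_n(s), F_n(t)) ≈ F(s)(1 − F(t)) for s ≤ t, which is the leading-order covariance behind these moments:

```python
import random

random.seed(0)

# Simulation check: n * Cov(Fn(s), Fn(t)) should approach F(s)(1 - F(t))
# for s <= t, with F the uniform c.d.f. on (0, 1).
def emp_cov(n, s, t, reps=20000):
    acc_st = acc_s = acc_t = 0.0
    for _ in range(reps):
        xs = [random.random() for _ in range(n)]
        fs = sum(x <= s for x in xs) / n   # Fn(s)
        ft = sum(x <= t for x in xs) / n   # Fn(t)
        acc_st += fs * ft
        acc_s += fs
        acc_t += ft
    return acc_st / reps - (acc_s / reps) * (acc_t / reps)

n, s, t = 50, 0.3, 0.7
approx = n * emp_cov(n, s, t)
exact = s * (1 - t)  # F(s)(1 - F(t)) for the uniform distribution
print(approx, exact)
```

The agreement is within Monte Carlo error for moderate `reps`; the exact small-sample corrections are precisely the O(1/n) and O(1/n^2) terms tracked in the lemmas above.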
PROOF OF THEOREM 3.1. Since the scaling of θ is arbitrary, we may assume without loss of generality that θ_0 = 0 and I(0) = 1. It is enough to obtain ψ(x) which maximizes [∂v_ψ/∂θ]_{θ=0} under the condition

1 = ∫ { [(∂/∂θ) log f_θ(x)]_{θ=0} }^2 f_0(x) dμ - ∫ { c'(0) + ψ(x) }^2 f_0(x) dμ,

that is, to get ψ(x) which minimizes

∫ { c'(0) + ψ(x) }^2 f_0(x) dμ

under the condition

∫ φ_1(x) { φ_1(x) - 2m } { c'(0) + ψ(x) } f_0(x) dμ = 1.

We put h(x) = φ_1(x){ φ_1(x) - 2m }. With the Lagrange multipliers λ_0 and λ_1 we have ψ(x) + c'(0) = λ_0 h(x) + λ_1, and it follows that

λ_0 ∫ h^2(x) f_0(x) dμ + λ_1 ∫ h(x) f_0(x) dμ = 1,
λ_0 ∫ h(x) f_0(x) dμ + λ_1 = 0,

hence

λ_0 = 1 / [ ∫ h^2(x) f_0(x) dμ - { ∫ h(x) f_0(x) dμ }^2 ],
λ_1 = - ∫ h(x) f_0(x) dμ / [ ∫ h^2(x) f_0(x) dμ - { ∫ h(x) f_0(x) dμ }^2 ].

From Lemma 3.3 we have

∫ { c'(0) + ψ(x) }^2 f_0(x) dμ = 1 / [ ∫ h^2(x) f_0(x) dμ - { ∫ h(x) f_0(x) dμ }^2 ] = 1 / { V(ξ_2) + o(1) },

hence, by (3.3),

max_ψ min n V_{θ_0}(v̂_{e_n}) = V(ξ_2) + o(1) = n V_{θ_0}(v̂_{e_n}^*) + o(1).
This completes the proof.

Acknowledgements

The authors wish to thank the referees for useful comments and the Shundoh International Foundation for a grant which enabled the first author to present the results of this paper at the 47th Session of the International Statistical Institute in Paris, 1989.

REFERENCES

Beran, R. (1982). Estimated sampling distributions: the bootstrap and competitors, Ann. Statist., 10, 212-225.
Dvoretzky, A., Kiefer, J. and Wolfowitz, J. (1956). Asymptotic minimax character of the sample distribution function and of the classical multinomial estimator, Ann. Math. Statist., 27, 642-669.
Efron, B. (1979). Bootstrap methods: another look at the jackknife, Ann. Statist., 7, 1-26.
Efron, B. (1982). The Jackknife, the Bootstrap and Other Resampling Plans, CBMS Regional Conference Series in Applied Mathematics 38, SIAM, Philadelphia.
Withers, C. S. (1983). Expansions for the distribution and quantiles of a regular functional of the empirical distribution with applications to nonparametric confidence intervals, Ann. Statist., 11, 577-587.
SEQUENTIAL ANALYSIS, 10(1&2), 27-43 (1991)
SECOND ORDER ASYMPTOTIC EFFICIENCY IN TERMS OF THE ASYMPTOTIC VARIANCE OF SEQUENTIAL ESTIMATION PROCEDURES IN THE PRESENCE OF NUISANCE PARAMETERS
Masafumi Akahira, Institute of Mathematics, University of Tsukuba, Ibaraki 305, Japan
Kei Takeuchi, Research Center for Advanced Science and Technology, University of Tokyo, 4-6-1 Komaba, Meguro-ku, Tokyo 156, Japan
Key words and phrases: Sequential estimation procedure; stopping rule; Bhattacharyya type bound; asymptotically median unbiased estimator; maximum likelihood estimation procedure; second order asymptotic efficiency.
Abstract. In the presence of nuisance parameters, the Bhattacharyya type bound for the asymptotic variance of estimation procedures is obtained. It is shown that the modified maximum likelihood (ML) estimation procedure together with an arbitrary stopping rule does not generally attain the bound. Further it is shown that the modified ML estimation procedure with an appropriate stopping rule is second order asymptotically efficient in some class of estimation procedures in the sense that it attains the lower bound for the asymptotic variance in the class.
Copyright © 1991 by Marcel Dekker, Inc.
1. Introduction

In the previous papers by Takeuchi and Akahira (1988) and Akahira and Takeuchi (1989), the second and third order asymptotic efficiencies were investigated, for the one-parameter case, in terms of the asymptotic variance and the asymptotic distribution of sequential estimation procedures, respectively. In this paper, in the presence of nuisance parameters, the second order asymptotic efficiency is discussed in a class of estimation procedures related to a sequence of sequential sampling procedures in which the sample size tends stochastically to infinity and is asymptotically constant in the sense that its coefficient of variation tends to zero. It is shown that the asymptotic variance of asymptotically unbiased estimators satisfies a Bhattacharyya type inequality, and that the modified maximum likelihood estimation procedure together with any stopping rule does not generally attain the bound. The second fact is essentially different from the one-parameter case. Further, if the class of asymptotically unbiased estimators is restricted, the modified maximum likelihood estimation procedure together with an appropriate stopping rule is second order asymptotically efficient in the class in the sense that its asymptotic variance attains the lower bound for the asymptotic variance of estimators in the class. A related discussion from the viewpoint of differential geometry is given by Okamoto et al. (1990).

2. Notations and assumptions

Let X_1, X_2, ..., X_n, ... be a sequence of independent and identically distributed random variables with a density function f(x, θ, ξ) with respect to a σ-finite measure μ, where θ is a real-valued parameter to be estimated and ξ is a real-valued nuisance parameter. Suppose that the sample size n is determined according to some sequential rule. Specifically, we consider a sequence of sequential estimation procedures {Π_a : a = 1, 2, ...} such that E_{θ,ξ,a}(n) = v_a(θ, ξ) becomes large uniformly in θ and ξ as a → ∞, where for each a we define a stopping rule and estimators based on it and consider the asymptotic distribution of √v (θ̂_a - θ) as a → ∞. For simplicity we denote v_a(θ, ξ) by v. In order to consider the second order asymptotic efficiency we assume the following conditions (A.1) to (A.6).

(A.1) E_{θ,ξ}(n) = v + o(1), V_{θ,ξ}(n)/v = O(1), E_{θ,ξ}(n^k)/v^k = O(1) (k = 2, 3, 4), and {(∂^k/∂θ^k) v}/v = O(1) (k = 1, 2), uniformly in θ and ξ.
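The requirement in (A.1) that the sample size be asymptotically constant, with coefficient of variation tending to zero, can be illustrated with a simple toy stopping rule (an assumption for this sketch only, not the rule used in the paper):

```python
import random, statistics

random.seed(1)

# Toy rule: draw X_i ~ N(0,1) and stop at the first n with
# sum_{i<=n} (1 + X_i^2) >= c.  Then E(n) grows like c/2 while
# SD(n) = O(sqrt(c)), so the coefficient of variation of n tends to
# zero as the expected sample size grows.
def stopping_time(c):
    total, n = 0.0, 0
    while total < c:
        x = random.gauss(0.0, 1.0)
        total += 1.0 + x * x
        n += 1
    return n

def coeff_var(c, reps=400):
    ns = [stopping_time(c) for _ in range(reps)]
    return statistics.pstdev(ns) / statistics.mean(ns)

cv_small = coeff_var(200.0)     # E(n) around 100
cv_big = coeff_var(10000.0)     # E(n) around 5000
print(cv_small, cv_big)
```

The coefficient of variation shrinks roughly like v^{-1/2}, which is what makes the expansions in the following sections, with E(n) = v + o(1), workable.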
(A.2) The set {x : f(x, θ, ξ) > 0} does not depend on θ and ξ.

(A.3) For almost all x [μ], f(x, θ, ξ) is three times continuously differentiable in θ and ξ.

(A.4) For each θ and each ξ,

0 < I_00(θ, ξ) = E[ {l_0(θ, ξ, X)}^2 ] < ∞   and   0 < I_11(θ, ξ) = E[ {l_1(θ, ξ, X)}^2 ] < ∞,

where l(θ, ξ, x) = log f(x, θ, ξ), l_0 = (∂/∂θ) l, l_1 = (∂/∂ξ) l, l_00 = (∂^2/∂θ^2) l, l_01 = (∂^2/∂θ∂ξ) l and l_11 = (∂^2/∂ξ^2) l.

(A.5) The moments J_abc(θ, ξ) = E[ l_ab(θ, ξ, X) l_c(θ, ξ, X) ] (a, b, c = 0, 1), K_000 = E[ {l_0(θ, ξ, X)}^3 ], K_001 = E[ {l_0(θ, ξ, X)}^2 l_1(θ, ξ, X) ], K_111 = E[ {l_1(θ, ξ, X)}^3 ] and M_0101 = E[ {l_01(θ, ξ, X)}^2 ] exist, and the following holds:

E[ l_000(θ, ξ, X) ] = -3J_000 - K_000,   E[ l_111(θ, ξ, X) ] = -3J_111 - K_111,
E[ l_001(θ, ξ, X) ] = -J_010,   E[ l_011(θ, ξ, X) ] = -J_011,

where l_000(θ, ξ, x) = (∂^3/∂θ^3) l(θ, ξ, x), l_111(θ, ξ, x) = (∂^3/∂ξ^3) l(θ, ξ, x), l_001(θ, ξ, x) = (∂^3/∂θ^2∂ξ) l(θ, ξ, x) and l_011(θ, ξ, x) = (∂^3/∂θ∂ξ^2) l(θ, ξ, x).

(A.6) E[ l_0(θ, ξ, X) l_1(θ, ξ, X) ] = 0 for all θ and ξ.

From the conditions (A.5) and (A.6) it is noted that K_001 = -J_001 - J_010. We put
Z_0 = (1/√v) Σ_{i=1}^n l_0(θ, ξ, X_i),    Z_1 = (1/√v) Σ_{i=1}^n l_1(θ, ξ, X_i),
Z_00 = (1/√v) Σ_{i=1}^n { l_00(θ, ξ, X_i) + I_00 },    Z_11 = (1/√v) Σ_{i=1}^n { l_11(θ, ξ, X_i) + I_11 },
Z_01 = (1/√v) Σ_{i=1}^n l_01(θ, ξ, X_i),

where I_00 and I_11 denote I_00(θ, ξ) and I_11(θ, ξ), respectively.

3. Bhattacharyya type bound for the asymptotic variance of asymptotically unbiased estimators

In order to obtain the Bhattacharyya type bound, we need the following lemma, which is useful for calculations of asymptotic cumulants.

LEMMA 3.1. Suppose that Y_{θ,ξ} is a function of X_1, ..., X_n, θ and ξ, and is differentiable in θ and ξ. Then

E(Z_0 Y_{θ,ξ}) = (1/√v) (∂/∂θ) E(Y_{θ,ξ}) - (1/√v) E( ∂Y_{θ,ξ}/∂θ ),
E(Z_1 Y_{θ,ξ}) = (1/√v) (∂/∂ξ) E(Y_{θ,ξ}) - (1/√v) E( ∂Y_{θ,ξ}/∂ξ ),
E(Z_0 ∂Y_{θ,ξ}/∂θ) = (∂/∂θ) E(Z_0 Y_{θ,ξ}) - E( Y_{θ,ξ} ∂Z_0/∂θ ) - √v E(Z_0^2 Y_{θ,ξ}),
E(Z_1 ∂Y_{θ,ξ}/∂ξ) = (∂/∂ξ) E(Z_1 Y_{θ,ξ}) - E( Y_{θ,ξ} ∂Z_1/∂ξ ) - √v E(Z_1^2 Y_{θ,ξ}),

provided that partial differentiation under the integral sign of E(Y_{θ,ξ}) and E(Z_i Y_{θ,ξ}) (i = 0, 1) with respect to θ and ξ is allowed.

The proof is omitted since the lemma is similar to Lemmas 5.1.1 and 5.1.2 in Akahira and Takeuchi (1981) (see also Lemmas 2.1.1 and 2.1.2 in Akahira, 1986). In the following theorem we have the Bhattacharyya type bound for the asymptotic variance of asymptotically unbiased estimators of θ.

THEOREM 3.1. Assume that the conditions (A.1) to (A.6) hold. Then for any asymptotically unbiased estimator θ̂ of θ, i.e., E(θ̂) = θ + o(v^{-1}),
V( √v (θ̂ - θ) ) ≥ 1/I_00 + D/v + o(1/v),

where

D = (1/(2 I_00^4)) { J_000 + K_000 + (2 v_0/v) I_00 }^2
  + (1/(I_00^3 I_11)) { J_001 - (v_1/v) I_00 }^2
  + J_011^2 / (2 I_00^2 I_11^2),

and where E(·), V(·), v_0 and v_1 designate the asymptotic mean, the asymptotic variance, (∂/∂θ) v(θ, ξ) and (∂/∂ξ) v(θ, ξ), respectively. The quantity (1/I_00) + (D/v) is called the Bhattacharyya type bound for the asymptotic variance of asymptotically unbiased estimators of θ.

PROOF. Putting

log L = Σ_{i=1}^n log f(X_i, θ, ξ),
we have the Bhattacharyya type bound for the asymptotic variance of asymptotically unbiased estimators of θ analogously as in the one parameter case (Takeuchi and Akahira, 1988), which is given by

(1, 0, 0, 0, 0) ( B_11 , B_12 ; B_12^t , B_22 )^{-1} (1, 0, 0, 0, 0)^t,

where

B_11 = E[ (1/(v L^2)) ( L_0^2 , L_0 L_1 ; L_1 L_0 , L_1^2 ) ],
B_12 = E[ (1/(v L^2)) ( L_0 L_00 , L_0 L_01 , L_0 L_11 ; L_1 L_00 , L_1 L_01 , L_1 L_11 ) ],
B_22 = E[ (1/(v L^2)) ( L_00^2 , L_00 L_01 , L_00 L_11 ; L_01 L_00 , L_01^2 , L_01 L_11 ; L_11 L_00 , L_11 L_01 , L_11^2 ) ],

with L_0 = (∂/∂θ)L, L_1 = (∂/∂ξ)L, L_00 = (∂^2/∂θ^2)L, L_01 = L_10 = (∂^2/∂θ∂ξ)L, L_11 = (∂^2/∂ξ^2)L, and B_12^t denotes the transposed matrix of B_12 (see, e.g., Zacks, 1971, pages 189-190). Then we have

(3.1)  B_11 = ( I_00 , 0 ; 0 , I_11 ) + ( o(1) ),
(3.2)  B_12 = ( E[Z_0(Z_00 + √v Z_0^2 - (n/√v) I_00)] , E[Z_0(Z_01 + √v Z_0 Z_1)] , E[Z_0(Z_11 + √v Z_1^2 - (n/√v) I_11)] ;
               E[Z_1(Z_00 + √v Z_0^2 - (n/√v) I_00)] , E[Z_1(Z_01 + √v Z_0 Z_1)] , E[Z_1(Z_11 + √v Z_1^2 - (n/√v) I_11)] ),

(3.3)  B_22 = ( 2v I_00^2 , 0 , 0 ; 0 , v I_00 I_11 , 0 ; 0 , 0 , 2v I_11^2 ) + ( o(v) ).
Since by Lemma 3.1

E(n Z_0) = (1/√v) (∂/∂θ) E(n) = v_0/√v,

we have

(3.4)  E[ Z_0 ( Z_00 + √v Z_0^2 - (n/√v) I_00 ) ] = J_000 + √v E(Z_0^3) - (v_0/v) I_00 + o(1),

where v_0 = (∂/∂θ) v. Since (∂/∂θ) I_00 = 2J_000 + K_000 and ∂Z_0/∂θ = -(v_0/(2v)) Z_0 + Z_00 - (n/√v) I_00, it follows from Lemma 3.1 that

(3.5)  √v E(Z_0^3) = (∂/∂θ) E(Z_0^2) - E( ∂Z_0^2/∂θ )
     = 2J_000 + K_000 - E[ 2 Z_0 { -(v_0/(2v)) Z_0 + Z_00 - (n/√v) I_00 } ] + o(1)
     = K_000 + (3 v_0/v) I_00 + o(1).

From (3.4) and (3.5) we have

(3.6)  E[ Z_0 ( Z_00 + √v Z_0^2 - (n/√v) I_00 ) ] = J_000 + K_000 + (2 v_0/v) I_00 + o(1).

Since by Lemma 3.1

√v E(Z_0^2 Z_1) = (∂/∂ξ) E(Z_0^2) - E( ∂Z_0^2/∂ξ )
     = 2J_010 + K_001 - E[ 2 Z_0 { -(v_1/(2v)) Z_0 + Z_01 } ] + o(1)
     = K_001 + (v_1/v) I_00 + o(1),

we obtain

(3.7)  E[ Z_0 ( Z_01 + √v Z_0 Z_1 ) ] = E(Z_0 Z_01) + √v E(Z_0^2 Z_1)
     = J_010 + K_001 + (v_1/v) I_00 + o(1) = -J_001 + (v_1/v) I_00 + o(1),

where v_1 = (∂/∂ξ) v, the last equality following from K_001 = -J_001 - J_010. Since by Lemma 3.1

√v E(Z_0 Z_1^2) = (∂/∂θ) E(Z_1^2) - E( ∂Z_1^2/∂θ ) = K_011 + (v_0/v) I_11 + o(1),

where K_011 = E[ l_0 {l_1}^2 ] = -J_011 - J_110, it follows that

(3.8)  E[ Z_0 ( Z_11 + √v Z_1^2 - (n/√v) I_11 ) ] = J_110 + K_011 + o(1) = -J_011 + o(1).

In a similar way to the above we have

(3.9)  E[ Z_1 ( Z_11 + √v Z_1^2 - (n/√v) I_11 ) ] = J_111 + K_111 + (2 v_1/v) I_11 + o(1),
(3.10)  E[ Z_1 ( Z_01 + √v Z_0 Z_1 ) ] = -J_110 + (v_0/v) I_11 + o(1),
(3.11)  E[ Z_1 ( Z_00 + √v Z_0^2 - (n/√v) I_00 ) ] = -J_010 + o(1).
From (3.6) to (3.11) we obtain

B_12 = ( J_000 + K_000 + (2 v_0/v) I_00 , -J_001 + (v_1/v) I_00 , -J_011 ;
         -J_010 , -J_110 + (v_0/v) I_11 , J_111 + K_111 + (2 v_1/v) I_11 ) + ( o(1) ),

hence by (3.1) to (3.3) the (1,1) element of ( B_11 - B_12 B_22^{-1} B_12^t )^{-1} is equal to

1/I_00 + (1/v) [ (1/(2 I_00^4)) { J_000 + K_000 + (2 v_0/v) I_00 }^2
               + (1/(I_00^3 I_11)) { J_001 - (v_1/v) I_00 }^2
               + J_011^2 / (2 I_00^2 I_11^2) ] + o(1/v)
 = 1/I_00 + D/v + o(1/v).

Therefore we have

V( √v (θ̂ - θ) ) ≥ 1/I_00 + D/v + o(1/v).
This completes the proof.

4. The second order asymptotic efficiency of the maximum likelihood estimation procedure

In order to obtain the asymptotic variance of the modified maximum likelihood (ML) estimator, we need the following lemma.

LEMMA 4.1. Assume that the condition (A.1) holds. If an asymptotically unbiased estimator θ̂_n, i.e., E(θ̂_n) = θ + o(v^{-1}), admits the expansion

√v (θ̂_n - θ) = Z_0/I_00 + Q/√v + o_p(1/√v),

where Q = O_p(1), then

V( √v (θ̂_n - θ) ) = 1/I_00 + (1/v) V(Q) + o(1/v).

The proof is omitted since the lemma is given in Akahira (1986) and Takeuchi and Akahira (1988). Let θ* and ξ* be the ML estimators of θ and ξ, respectively. Under suitable regularity conditions it can be shown that √v (θ* - θ) and √v (ξ* - ξ) are of order O_p(1). In order to establish this we first need the consistency of the ML estimators, from which the asymptotic joint normality of ( √v (θ* - θ), √v (ξ* - ξ) ) follows easily; for the consistency a separate set of regularity conditions similar to those given by Wald (1949) is needed. Since this part is outside our main concern, the detailed discussion is omitted. By Taylor's expansion we have
(4.1)  0 = (1/√v) Σ_{i=1}^n l_0(θ*, ξ*, X_i)
     = (1/√v) Σ_{i=1}^n l_0(θ, ξ, X_i) + (1/v) Σ_{i=1}^n l_00(θ, ξ, X_i) · √v (θ* - θ)
       + (1/v) Σ_{i=1}^n l_01(θ, ξ, X_i) · √v (ξ* - ξ)
       + (1/(2v√v)) Σ_{i=1}^n l_000(θ, ξ, X_i) · v (θ* - θ)^2
       + (1/(2v√v)) Σ_{i=1}^n l_011(θ, ξ, X_i) · v (ξ* - ξ)^2
       + (1/(v√v)) Σ_{i=1}^n l_001(θ, ξ, X_i) · v (θ* - θ)(ξ* - ξ) + o_p(1/√v),
(4.2)  0 = (1/√v) Σ_{i=1}^n l_1(θ*, ξ*, X_i)
     = (1/√v) Σ_{i=1}^n l_1(θ, ξ, X_i) + (1/v) Σ_{i=1}^n l_11(θ, ξ, X_i) · √v (ξ* - ξ)
       + (1/v) Σ_{i=1}^n l_01(θ, ξ, X_i) · √v (θ* - θ)
       + (1/(2v√v)) Σ_{i=1}^n l_111(θ, ξ, X_i) · v (ξ* - ξ)^2
       + (1/(2v√v)) Σ_{i=1}^n l_001(θ, ξ, X_i) · v (θ* - θ)^2
       + (1/(v√v)) Σ_{i=1}^n l_011(θ, ξ, X_i) · v (θ* - θ)(ξ* - ξ) + o_p(1/√v).

From (4.1) and (4.2) we obtain

0 = Z_0 + ( (1/√v) Z_00 - (n/v) I_00 ) √v (θ* - θ) + (1/√v) Z_01 √v (ξ* - ξ)
    - (1/(2√v)) (3J_000 + K_000) v (θ* - θ)^2 - (1/(2√v)) J_011 v (ξ* - ξ)^2
    - (1/√v) J_010 v (θ* - θ)(ξ* - ξ) + o_p(1/√v),

0 = Z_1 + ( (1/√v) Z_11 - (n/v) I_11 ) √v (ξ* - ξ) + (1/√v) Z_01 √v (θ* - θ)
    - (1/(2√v)) (3J_111 + K_111) v (ξ* - ξ)^2 - (1/(2√v)) J_010 v (θ* - θ)^2
    - (1/√v) J_011 v (θ* - θ)(ξ* - ξ) + o_p(1/√v),

hence

(4.3)  √v (θ* - θ) = Z_0/I_00 - ((n - v)/v)(Z_0/I_00) + (1/(I_00^2 √v)) Z_00 Z_0 + (1/(I_00 I_11 √v)) Z_01 Z_1
     - (1/(2 I_00^3 √v)) (3J_000 + K_000) Z_0^2 - (1/(2 I_00 I_11^2 √v)) J_011 Z_1^2
     - (1/(I_00^2 I_11 √v)) J_010 Z_0 Z_1 + o_p(1/√v).
Let θ** be the ML estimator of θ modified so that E(θ**) = θ + o(v^{-1}); generally the modification automatically ensures that E(θ**) = θ + o(v^{-3/2}). From (4.3) we have the following.

THEOREM 4.1. Assume that the conditions (A.1) to (A.6) hold. If the stopping rule is so determined that sampling is stopped at n satisfying

(4.4)  - Σ_{i=1}^n l_00(θ**, ξ*, X_i) = v(θ**, ξ*) I_00(θ**, ξ*) + o_p(√v),
(4.5)  E[ v(θ**, ξ*) ] = v(θ, ξ) + o( v(θ, ξ) ),

then the asymptotic variance of the modified ML estimator θ** is given
by

V( √v (θ** - θ) ) = 1/I_00 + (1/v) [ (1/(2 I_00^4)) { J_000 + K_000 + (2 v_0/v) I_00 }^2
   + (1/(I_00^3 I_11)) { J_001 - (v_1/v) I_00 }^2 + J_011^2 / (2 I_00^2 I_11^2)
   + (1/(I_00^2 I_11)) { M_0101 - (J_010^2 / I_00) - (J_011^2 / I_11) } ] + o(1/v),

and it does not generally attain the Bhattacharyya type bound given in Theorem 3.1.

REMARK 4.1. In the one parameter case it is shown by Takeuchi and Akahira (1988) that the modified ML estimation procedure with some stopping rule attains the Bhattacharyya type bound. However, in the presence of nuisance parameters this is not always true, as is stated in Theorem 4.1; that is an essential difference between the two cases. It is noted that the modified ML estimation procedure with the stopping rule (4.4) attains the bound if and only if

l_01(θ, ξ, x) = a_0(θ, ξ) + a_1(θ, ξ) l_0(θ, ξ, x) + a_2(θ, ξ) l_1(θ, ξ, x)   a.e. [μ]

for all θ and all ξ, where a_i(θ, ξ) (i = 0, 1, 2) are certain functions of θ and ξ, which is equivalent to M_0101 = (J_010^2 / I_00) + (J_011^2 / I_11).
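The attainment condition of Remark 4.1 can be checked numerically on a concrete family. For N(θ, e^ξ) at θ = 0, ξ = 0 one has l_0 = x, l_1 = (x^2 - 1)/2 and l_01 = -x, so l_01 is exactly linear in the scores and the condition should hold with equality. The sketch below verifies this by Monte Carlo; reading J_010 = E[l_01 l_0] and J_011 = E[l_01 l_1] is an assumption about the paper's notation:

```python
import random

random.seed(2)

# N(theta, e^xi) at (0, 0): l0 = x, l1 = (x^2 - 1)/2, l01 = -x.
# Since l01 = -l0, Remark 4.1's condition should give
#   M0101 = J010^2 / I00 + J011^2 / I11   (gap = 0 up to sampling noise).
xs = [random.gauss(0.0, 1.0) for _ in range(200000)]
mean = lambda v: sum(v) / len(v)
l0 = xs
l1 = [(x * x - 1.0) / 2.0 for x in xs]
l01 = [-x for x in xs]
I00 = mean([a * a for a in l0])
I11 = mean([a * a for a in l1])
M0101 = mean([a * a for a in l01])
J010 = mean([a * b for a, b in zip(l01, l0)])
J011 = mean([a * b for a, b in zip(l01, l1)])
gap = M0101 - J010 ** 2 / I00 - J011 ** 2 / I11
print(gap)
```

For families where l_01 is not a linear combination of 1, l_0 and l_1, the same computation produces a strictly positive gap, which is exactly the extra term (1/(I_00^2 I_11 v)) E(W_01^2) in Theorem 4.1.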
REMARK 4.2. The equation (4.4) with (4.5) is attained if

- Σ_{i=1}^n l_00(θ**, ξ*, X_i) = v(θ**, ξ*) I_00(θ**, ξ*) + c(θ**, ξ*),

where c(θ, ξ) is of constant order and determined so as to satisfy (4.5). Further, in Corollary 4.1 below, it is shown that the modified ML estimator has the minimum asymptotic variance in some class of asymptotically unbiased estimators.

PROOF OF THEOREM 4.1. Since by (4.4) and (4.5)

- Σ_{i=1}^n l_00(θ*, ξ*, X_i) = v(θ*, ξ*) I_00(θ*, ξ*) + O_p(√v),

it follows by the Taylor expansion that

(4.6)  - Σ_{i=1}^n l_00(θ, ξ, X_i) - { Σ_{i=1}^n l_000(θ, ξ, X_i) } (θ* - θ) - { Σ_{i=1}^n l_001(θ, ξ, X_i) } (ξ* - ξ)
     = v(θ, ξ) I_00(θ, ξ) + I_00(θ, ξ) { v_0 (θ* - θ) + v_1 (ξ* - ξ) }
       + v(θ, ξ) [ { (∂/∂θ) I_00(θ, ξ) } (θ* - θ) + { (∂/∂ξ) I_00(θ, ξ) } (ξ* - ξ) ] + o_p(√v),

where v_0 and v_1 denote (∂/∂θ) v(θ, ξ) and (∂/∂ξ) v(θ, ξ), respectively. From (4.6) we have

n I_00 - √v Z_00 + (√v/I_00)(3J_000 + K_000) Z_0 + (√v/I_11) J_010 Z_1
  = v I_00 + (v_0/√v) Z_0 + (v_1/√v)(I_00/I_11) Z_1 + (√v/I_00)(2J_000 + K_000) Z_0 + (√v/I_11)(J_010 - J_001) Z_1 + o_p(√v),

hence

(4.7)  ((n - v)/√v) I_00 = Z_00 - (J_000/I_00) Z_0 - (J_001/I_11) Z_1 + (v_0/v) Z_0 + (v_1 I_00/(v I_11)) Z_1 + o_p(1).
From (4.3) and (4.7) we obtain

(4.8)  √v (θ* - θ)
  = Z_0/I_00 - (1/(I_00^2 √v)) Z_0 { Z_00 - (J_000/I_00) Z_0 - (J_001/I_11) Z_1 + (v_0/v) Z_0 + (v_1 I_00/(v I_11)) Z_1 }
    + (1/(I_00^2 √v)) Z_00 Z_0 + (1/(I_00 I_11 √v)) Z_01 Z_1
    - (1/(2 I_00^3 √v)) (3J_000 + K_000) Z_0^2 - (1/(2 I_00 I_11^2 √v)) J_011 Z_1^2
    - (1/(I_00^2 I_11 √v)) J_010 Z_0 Z_1 + o_p(1/√v)
  = Z_0/I_00 - (1/(2 I_00^3 √v)) { J_000 + K_000 + (2 v_0/v) I_00 } Z_0^2
    + (1/(I_00^2 I_11 √v)) { J_001 - (v_1/v) I_00 } Z_0 Z_1
    + (1/(I_00 I_11 √v)) Z_1 { Z_01 - (J_010/I_00) Z_0 - (J_011/I_11) Z_1 }
    + (1/(2 I_00 I_11^2 √v)) J_011 Z_1^2 + o_p(1/√v).
Hence it follows from (4.8) and Lemma 4.1 that the asymptotic variance of the modified ML estimator θ** of θ is given by

V( √v (θ** - θ) ) = 1/I_00 + (1/v) [ (1/(2 I_00^4)) { J_000 + K_000 + (2 v_0/v) I_00 }^2
   + (1/(I_00^3 I_11)) { J_001 - (v_1/v) I_00 }^2 + J_011^2 / (2 I_00^2 I_11^2)
   + (1/(I_00^2 I_11)) { M_0101 - (J_010^2 / I_00) - (J_011^2 / I_11) } ] + o(1/v).

It is also seen from Theorem 3.1 that θ** does not generally attain the Bhattacharyya type bound. Thus we complete the proof.

Next we consider a class of asymptotically unbiased estimators of θ. Let C be the subclass of all asymptotically best asymptotically normal and unbiased estimators θ̂_n, with E(θ̂_n) = θ + o(v^{-1}), which are asymptotically expanded as

√v (θ̂_n - θ) = Z_0/I_00 + Q_0/√v + o_p(1/√v),

where Q_0 = O_p(1) and ∂Q_0/∂θ = O_p(1). Then we have the following theorem.

THEOREM 4.2. Assume that the conditions (A.1) to (A.6) hold. Then, for any estimator θ̂_n ∈ C,
(4.9)  V( √v (θ̂_n - θ) ) ≥ 1/I_00 + (1/v) [ D + (1/(2 I_00^2)) E(W_00^2) + (1/(I_00^2 I_11)) E(W_01^2) ] + o(1/v),

where D is given in Theorem 3.1, and

W_00 = Z_00 - { (J_000/I_00) - (v_0/v) } Z_0 - ((n - v)/√v) I_00,
W_01 = Z_01 - (J_010/I_00) Z_0 - (J_011/I_11) Z_1.

REMARK 4.3. In the lower bound of (4.9), E(W_00^2) can be reduced to zero by the stopping rule, as is seen from the formula above for the ML estimators, while

E(W_01^2) = M_0101 - (J_010^2 / I_00) - (J_011^2 / I_11)

is independent of the stopping rule.

A sketch of the proof of Theorem 4.2 is given as follows; the details are similar to those in Akahira and Takeuchi (1989).
Since

E(Z_0^2 Q_0) = -(1/I_00) { J_000 + K_000 + (2 v_0/v) I_00 } + o(1),
E(Z_0 Z_1 Q_0) = (1/I_00) { J_001 - (v_1/v) I_00 } + o(1),
E(Z_1^2 Q_0) = (1/I_00) J_011 + o(1),
E(Z_0 W_00 Q_0) = (1/I_00) E(W_00^2) + o(1),
E(Z_1 W_01 Q_0) = (1/I_00) E(W_01^2) + o(1),

it can be shown that under these conditions

V(Q_0) ≥ (1/(2 I_00^4)) { J_000 + K_000 + (2 v_0/v) I_00 }^2 + (1/(I_00^3 I_11)) { J_001 - (v_1/v) I_00 }^2
       + J_011^2 / (2 I_00^2 I_11^2) + (1/(2 I_00^2)) E(W_00^2) + (1/(I_00^2 I_11)) E(W_01^2) + o(1).

Then the conclusion of the theorem follows from Lemma 4.1.

COROLLARY 4.1. Assume that the conditions (A.1) to (A.6) hold. Then the modified ML estimation procedure together with the stopping rule (4.4) is second order asymptotically efficient in the class C in the sense that it attains the lower bound of (4.9).

PROOF. It is seen from (4.3) that the modified ML estimator belongs to the class C. Since, by the stopping rule (4.4), E(W_00^2) = o(1), the conclusion of the corollary follows from Theorem 4.1 and Remark 4.2.

REMARK 4.4. Almost all "regular" type estimators in particular cases belong to the class C, as in the non-sequential cases.
References

Akahira, M. (1986). The Structure of Asymptotic Deficiency of Estimators. Queen's Papers in Pure and Applied Mathematics No. 75, Queen's University Press, Kingston, Ontario, Canada.
Akahira, M. and Takeuchi, K. (1981). Asymptotic Efficiency of Statistical Estimators: Concepts and Higher Order Asymptotic Efficiency. Lecture Notes in Statistics 7, Springer-Verlag, New York.
Akahira, M. and Takeuchi, K. (1989). Third order asymptotic efficiency of the sequential maximum likelihood estimation procedure. Sequential Analysis 8, 333-359.
Okamoto, I., Amari, S. and Takeuchi, K. (1990). Asymptotic theory of sequential estimation: Differential geometrical approach. To be published in the Annals of Statistics.
Takeuchi, K. and Akahira, M. (1988). Second order asymptotic efficiency in terms of asymptotic variances of the sequential maximum likelihood estimation procedures. In: Statistical Theory and Data Analysis II, Proceedings of the Second Pacific Area Statistical Conference (K. Matusita, ed.), 191-196, North-Holland.
Wald, A. (1949). Note on the consistency of the maximum likelihood estimate. Ann. Math. Statist., 20, 595-601.
Zacks, S. (1971). The Theory of Statistical Inference. John Wiley, New York.

Received November 1989; Revised November 1990. Recommended by J. K. Ghosh.
Rep. Stat. Appl. Res., JUSE, Vol. 38, No. 1, 1991, pp. 1-9
A-Section
Asymptotic Efficiency of Estimators for a Location Parameter Family of Densities with the Bounded Support

Masafumi AKAHIRA* and Kei TAKEUCHI**
Abstract. We consider the estimation problem on a location parameter of a density function with support a finite interval and contact of the power α-1 at both endpoints, where 1 < α < 2.
1. Introduction

Let f_0(x) be a density function which vanishes outside the open interval (-1, 1) and is twice continuously differentiable in (-1, 1), and consider the location parameter family of densities f(x, θ), θ ∈ R^1, defined by f(x, θ) = f_0(x - θ), x ∈ R^1. We assume that f_0(x) ~ A(1+x)^{α-1} as x → -1+0 and f_0(x) ~ B(1-x)^{α-1} as x → 1-0. Suppose that X_1, ..., X_n are independently and identically distributed random variables according to the density f_0(x - θ). In a similar setup to the above, the order of minimum variance was obtained by Polfeldt [4], and it was shown by Akahira [1] and Woodroofe [7] that, for α = 2, the maximum likelihood estimator has an asymptotic normal distribution. It was also shown by Akahira [1] that the order of consistency is equal to n^{1/α}, (n log n)^{1/2} and n^{1/2} for 0 < α < 2, α = 2 and α > 2, respectively, and, for 0 < α ≤ 1 and α ≥ 2, the asymptotic accuracy of estimators in terms of their asymptotic distributions was discussed by Akahira [2]. Related results can be found in Smith [5]. In the case 1 < α < 2, however, it has seemed difficult to discuss the asymptotic efficiency of estimators. In this paper the bound for the asymptotic distribution of asymptotically median unbiased estimators based on the sample is obtained, and it is shown that the bias-adjusted maximum likelihood estimator is not asymptotically efficient in the sense that its asymptotic distribution does not uniformly attain the bound.

Received February 16, 1990
* Institute of Mathematics, University of Tsukuba, Tsukuba, Ibaraki 305, Japan.
** Research Center for Advanced Science and Technology, University of Tokyo, 4-6-1 Komaba, Meguro-ku, Tokyo 156, Japan.
(Keywords) Location parameter family, Asymptotic efficiency, Asymptotic distribution, Asymptotically median unbiased estimator, Maximum likelihood estimator.
2. Assumptions and lemmas

Let f_0(x) be a density function with respect to the Lebesgue measure and consider the location parameter family f(x, θ), θ ∈ R^1, defined by f(x, θ) = f_0(x - θ) for x ∈ R^1. We assume the following conditions on f_0(x).

(A.1) f_0(x) > 0 for |x| < 1, f_0(x) = 0 for |x| ≥ 1.

(A.2) f_0(x) is twice continuously differentiable in the open interval (-1, 1) and

lim_{x→-1+0} (1+x)^{1-α} f_0(x) = A,  lim_{x→-1+0} (1+x)^{2-α} f'_0(x) = A',  lim_{x→-1+0} (1+x)^{3-α} f''_0(x) = A'',
lim_{x→1-0} (1-x)^{1-α} f_0(x) = B,  lim_{x→1-0} (1-x)^{2-α} f'_0(x) = B',  lim_{x→1-0} (1-x)^{3-α} f''_0(x) = B'',

where 1 < α < 2.

Suppose that X_1, ..., X_n are independent and identically distributed real random variables according to the above density function f_0(x - θ) with a location parameter θ. Then we consider the estimation problem on the parameter. In the above formulation, it was shown by Akahira [1] that the order of consistency is equal to n^{1/α}. Now, in order to get the bound for the asymptotic distribution of asymptotically median unbiased estimators θ̂_n of θ, we shall obtain the moment generating function of
(2.1)  Z_h = log f_0(X) - log f_0(X - h)  for -1 + h < X < 1,
           = 0  otherwise,

where h is a sufficiently small positive number. Here an estimator θ̂_n of θ is called asymptotically median unbiased (AMU) if

lim_{n→∞} n^{1/α} | P_θ( θ̂_n ≤ θ ) - (1/2) | = lim_{n→∞} n^{1/α} | P_θ( θ̂_n ≥ θ ) - (1/2) | = 0

uniformly in some neighborhood of θ. From (2.1) we have the moment generating function

(2.2)  φ(t) = E_0[ exp(t Z_h) ]
     = ∫_{-1+h}^{1} ( exp(t z_h) ) f_0(x) dx + ∫_{-1}^{-1+h} f_0(x) dx
     = ∫_{-1+h}^{1} [ exp{ t log( f_0(x)/f_0(x-h) ) } ] f_0(x) dx + ∫_{-1}^{-1+h} f_0(x) dx
     = ∫_{-1+h}^{1} ( f_0(x)/f_0(x-h) )^t f_0(x) dx + ∫_{-1}^{-1+h} f_0(x) dx.
Since ( f_0(x)/f_0(x-h) )^t = exp( t h f'_0(x)/f_0(x) ) + o(h) = 1 + ( t h f'_0(x)/f_0(x) ) + o(h), we put

R_1(x) = ( f_0(x)/f_0(x-h) )^t - 1 - ( t h f'_0(x)/f_0(x) ).

Then we have from (2.2)

(2.3)  φ(t) = ∫_{-1+h}^{1} R_1(x) f_0(x) dx + ∫_{-1+h}^{1} f_0(x) dx + t h ∫_{-1+h}^{1} f'_0(x) dx + ∫_{-1}^{-1+h} f_0(x) dx.

For the first term of the right-hand side of (2.3) we have the following.

Lemma 2.1. For 0 < δ < 1,

∫_{-1+h}^{1} R_1(x) f_0(x) dx = h^α { A G_{1h}(t) + B G_{2h}(t) } + o(h^{αδ}),

where

G_{1h}(t) = ∫_0^{h^{δ-1}} [ {(u+1)/u}^{(α-1)t} - 1 - {(A'/A) t/(u+1)} ] (u+1)^{α-1} du,
G_{2h}(t) = ∫_1^{1+h^{δ-1}} [ {(u-1)/u}^{(α-1)t} - 1 + {(B'/B) t/(u-1)} ] (u-1)^{α-1} du.

Remark 2.1. From Lemma 2.1 we see that

lim_{h→0} G_{1h}(t) = ∫_0^{∞} [ { 1 + (1/u) }^{(α-1)t} - 1 - {(A'/A) t/(u+1)} ] (u+1)^{α-1} du = G_1(t) (say),
lim_{h→0} G_{2h}(t) = ∫_1^{∞} [ { (u-1)/u }^{(α-1)t} - 1 + {(B'/B) t/(u-1)} ] (u-1)^{α-1} du = G_2(t) (say).
Proof of Lemma 2.1. First we have for 0 < δ < 1,

(2.4)  ∫_{-1+h}^{1} R_1(x) f_0(x) dx = ( ∫_{-1+h}^{-1+h+h^δ} + ∫_{-1+h+h^δ}^{1-h^δ} + ∫_{1-h^δ}^{1} ) R_1(x) f_0(x) dx = I_1 + I_2 + I_3 (say).

(i) I_1. From the conditions (A.1) and (A.2) we have for 0 < δ < 1,

I_1 = ∫_{-1+h}^{-1+h+h^δ} [ {(1+x)/(1+(x-h))}^{(α-1)t} - 1 - {(A'/A) t h/(1+x)} ] A (1+x)^{α-1} dx + o(h^{αδ})
    = A ∫_0^{h^δ} [ {(y+h)/y}^{(α-1)t} - 1 - {(A'/A) t h/(y+h)} ] (y+h)^{α-1} dy + o(h^{αδ})
    = A h^α ∫_0^{h^{δ-1}} [ {(u+1)/u}^{(α-1)t} - 1 - {(A'/A) t/(u+1)} ] (u+1)^{α-1} du + o(h^{αδ})
    = A h^α G_{1h}(t) + o(h^{αδ}) (say).

(ii) I_2. Putting g(h) = { f_0(x)/f_0(x-h) }^t, we have g'(h)/g(h) = t f'_0(x-h)/f_0(x-h), hence

{ g''(h)/g(h) } - { g'(h)/g(h) }^2 = t [ { -f''_0(x-h)/f_0(x-h) } + { f'_0(x-h)/f_0(x-h) }^2 ].

Since g(h) = 1 + { t h f'_0(x)/f_0(x) } + (h^2/2) g''(ρh) with 0 < ρ < 1, we have

R_1(x) = (h^2/2) g''(ρh)  and  g''(h) = t [ { -f''_0(x-h)/f_0(x-h) } + (t+1) { f'_0(x-h)/f_0(x-h) }^2 ] g(h).

Since f_0(x) ~ A (1+x)^{α-1} as x → -1+0, we have

g''(h) ~ t { -(A''/A) + (t+1)(A'/A)^2 } (1+x-h)^{-2} { 1 + (h/(1+x-h)) }^{(α-1)t}  as x → -1+h+0.

Taking h (> 0) with h^δ < ε for any fixed ε with 0 < ε < 1, we obtain for -1+h+h^δ < x < -1+h+ε,

(2.5)  0 < { 1 + (h/(1+x-h)) }^{(α-1)t} ≤ max{ 1, (1 + h^{1-δ})^{(α-1)t} }.
We also have

(2.6)  ∫_{-1+h+h^δ}^{-1+h+ε} { (1+x)^{α-1} / (1+x-ρh)^2 } dx
     = ∫_{h^δ}^{ε} { (y+h)^{α-1} / (y+(1-ρ)h)^2 } dy
     ≤ ∫_{h^δ}^{ε} (y+h)^{α-1} / y^2 dy
     = ∫_{h^{δ-1}}^{ε h^{-1}} ( h u + h )^{α-1} / (h u)^2 · h du
     ≤ h^{α-2} ∫_{h^{δ-1}}^{∞} (u+1)^{α-1} / u^2 du
     = h^{α-2} · O( h^{(δ-1)(α-2)} ) = O( h^{(α-2)δ} ).

Then it follows from (2.5) and (2.6) that

(2.7)  ∫_{-1+h+h^δ}^{-1+h+ε} R_1(x) f_0(x) dx = O( h^{2+(α-2)δ} ).

In a similar way to the above we have

(2.8)  ∫_{1-ε}^{1-h^δ} R_1(x) f_0(x) dx = O( h^{2+(α-2)δ} ).

Since |g''(h)| ≤ M_ε with some constant M_ε for -1+ε < x < 1-ε, we have

(2.9)  ∫_{-1+h+ε}^{1-ε} R_1(x) f_0(x) dx = O(h^2).

From (2.7), (2.8) and (2.9) we obtain

I_2 = ∫_{-1+h+h^δ}^{1-h^δ} R_1(x) f_0(x) dx = O( h^{2+(α-2)δ} ).
(iii) I_3. In a similar way to the case (i), we have

I_3 = B ∫_{1-h^δ}^{1} [ {(1-x)/(1-(x-h))}^{(α-1)t} - 1 + {(B'/B) t h/(1-x)} ] (1-x)^{α-1} dx + o(h^{αδ})
    = B ∫_{h}^{h+h^δ} [ {(y-h)/y}^{(α-1)t} - 1 + {(B'/B) t h/(y-h)} ] (y-h)^{α-1} dy + o(h^{αδ})
    = B h^α ∫_1^{1+h^{δ-1}} [ {(u-1)/u}^{(α-1)t} - 1 + {(B'/B) t/(u-1)} ] (u-1)^{α-1} du + o(h^{αδ})
    = B h^α G_{2h}(t) + o(h^{αδ}) (say).
From (2.4), (i), (ii) and (iii) we have for 0 < δ < 1,

∫_{-1+h}^{1} R_1(x) f_0(x) dx = h^α { A G_{1h}(t) + B G_{2h}(t) } + o(h^{αδ}).

Thus we complete the proof.

For the third term of the right-hand side of (2.3) we have the following.

Lemma 2.2. For 1 < α < 2,

∫_{-1+h}^{1} f'_0(x) dx = -A' h^{α-1}/(α-1) + o(h^{α-1}).

Proof. From the conditions (A.1) and (A.2) it follows that

∫_{-1+h}^{1} f'_0(x) dx = - ∫_{-1}^{-1+h} f'_0(x) dx = - ∫_{-1}^{-1+h} A' (1+x)^{α-2} dx + o(h^{α-1}) = -A' h^{α-1}/(α-1) + o(h^{α-1}),

which completes the proof.

From Lemmas 2.1 and 2.2 we have the following.

Lemma 2.3. The moment generating function φ(t) of Z_h is given by

φ(t) = E_0[ exp(t Z_h) ] = 1 + h^α { -(A'/(α-1)) t + A G_{1h}(t) + B G_{2h}(t) } + o(h^α).

Remark 2.2. Putting H_h(t) = -(A'/(α-1)) t + A G_{1h}(t) + B G_{2h}(t), we see that H_h(0) = 0 since G_{1h}(0) = G_{2h}(0) = 0. The proof of Lemma 2.3 is straightforward from Lemmas 2.1 and 2.2.

For each i = 1, ..., n, we define

Z_{hi} = log f_0(X_i) - log f_0(X_i - h)  for -1 + h < X_i < 1,
       = 0  otherwise.

Putting W_n = Σ_{i=1}^n Z_{hi}, from Lemma 2.3 and Remark 2.2 we have as the moment generating function φ_n(t) of W_n

φ_n(t) = E_0[ exp(t W_n) ] = { E_0[ exp(t Z_h) ] }^n = { 1 + h^α H_h(t) + o(h^α) }^n.

We also put h = a n^{-1/α}; then

(2.10)  φ_n(t) = [ 1 + ( a^α H(t)/n ) + o(1/n) ]^n = exp{ a^α H(t) } + o(1),

where H(t) = -(A'/(α-1)) t + A G_1(t) + B G_2(t). A similar discussion to the above can be done for h < 0.
3. The bound for the asymptotic distribution of AMU estimators and its comparison with that of the MLE

In order to obtain the bound for asymptotic distributions of AMU estimators, we consider the problem of testing H: θ = a n^{-1/α} = h against K: θ = 0, since the order of consistency is equal to n^{1/α} for this problem. Then the acceptance region is of the form

Π_{i=1}^n Z_{hi} = 0   or   Σ_{i=1}^n Z_{hi} ≤ C,

where C is some constant. In a similar way to Akahira and Takeuchi [3] (e.g. pages 57, 84), it is seen that the upper bound for the asymptotic distribution P_{θ_0}( n^{1/α} (θ̂_n - θ_0) ≤ a ) of AMU estimators θ̂_n for a > 0 is given by

(3.1)  P_0( Π_{i=1}^n Z_{hi} = 0 ) + P_0( Σ_{i=1}^n Z_{hi} ≤ C ).
Since, by Lemma 2.2, for h > 0,

P_0( Z_{h1} = 0 ) = ∫_{-1}^{-1+h} f_0(x) dx = (A/α) h^α + o(h^α),

it follows that, for h > 0,

P_0( Π_{i=1}^n Z_{hi} = 0 ) = 1 - { 1 - P_0( Z_{h1} = 0 ) }^n = 1 - { 1 - (A/α) h^α + o(h^α) }^n.

Letting h = a n^{-1/α} with a > 0, we have

(3.2)  P_0( Π_{i=1}^n Z_{hi} = 0 ) = 1 - { 1 - ( A a^α/(α n) ) + o(1/n) }^n = 1 - exp{ -( A a^α/α ) } + o(1).
On the other hand, it follows from (2.10) that the asymptotic density p_n(x) of W_n = Σ_{i=1}^n Z_{hi} is given by

p_n(x) = (1/2π) ∫_{-∞}^{∞} e^{-itx} φ_n(it) dt,

where i denotes the imaginary unit, hence

(3.3)  P_0( W_n ≤ C ) = ∫_{-∞}^{C} p_n(x) dx.

In consideration of the AMU condition, we determine the constant C so that, under the hypothesis H: θ = a n^{-1/α} = h,

(3.4)  P_h( W_n ≤ C ) = 1/2 + o(1).
M. AKAHa2A and K. TAKEUCHI Indeed, since the moment generating function of Zh, under H:0 = h, i.e., yr(t) = Eh [exp (t Zh)] = (exp (t zh)) fo(x- h)dx, it follows in a similar way to the one under K:O = 0 that it is calculated. Hence the above constant C can be obtained. From (3.2) to (3.4) we can calculate the upper bound, i.e. (3.1) for a > 0 and also similarly the lower bound for a < 0. Next, we shall consider the maximum likelihood estimator (MLE). Denote by ONn, the MLE. We also define the likelihood function L(0) by n
L(0) =
rl fo (Xi - 0) > 0 for X(n) - 1 <0<X(l) + 1,
i=1
= 0 otherwise, where min X(l) == max t i Xi. It is known that the order of consistency is equal to Xi,, X(,,)
When 00 is the true parameter, it can be shown that the event " ONU, < 00 + an-lia " is equivalent to the one " (a/a0) log L(00+ an-11a)<0 ". We put h=an-l/(X. Since n
log L(0) = E log fo (Xi - 0) i =1
it follows that n
(a/a0)log L(Oo + h) = -14 (Xi - Oo - h)/fo(Xi - Oo - h) for X(n)-1<0o+h<X(1)+1. i=1
Without loss of generality, we assume that 00 = 0. We also put U = - fo (X - h)/fo (X - h) for =0 for
IX - q < 1, IX - N _1.
(3.5)
Then it is shown that the density f(u) of U is given by

(3.6)  f(u) = (A'/(α-1)) / (1 + u^2)^{(α+1)/2} + (A'/(α-1)) g(u)  for -∞ < u < ∞.

Since 1/(1 + u^2)^{(α+1)/2} - 1/(1 + |u|)^{α+1} = O( |u|^{-(α+2)} ), it follows that, as |u| → ∞,

f(u) = (A'/(α-1)) / (1 + |u|)^{α+1} + r(u),  where r(u) = O( |u|^{-(α+2)} ).

It is also seen that

(3.7)  ∫ r(u) e^{itu} du = ∫ r(u) du + o( |t|^α ).

In a similar way as in Takeuchi and Akahira [6], it follows that

(3.8)  ∫ (α/2) e^{itu} / (1 + |u|)^{α+1} du = 1 + k_α |t|^α + o( |t|^α ),

where k_α is some constant.
Asymptotic Efficiency for Densities with Bounded Support where ka is some constant. From (3.6), (3.7) and (3.8) it is seen that the characteristic function 4 (t) of U is given by fi(t) = J e'`uf (u)du=1+k'a ltla+o( Itla), where k'a is some constant. We also see that the bias-adjusted MLE is not asymptotically efficient in the sense that its asymptotic distribution does not uniformly attain the bound, for Zh of (2.1) and U of (3.5) are not equivalent random variables. Remark 3.1. In this paper, the support of the density f0(x) is assumed to be the open interval (-1,1) in the conditions (A.1) and (A.2). However, the results of the paper still hold when the interval (-1,1) is replaced by a finite open interval (a,b) in the conditions. REFERENCES [1] Akahira, M. (1975). Asymptotic theory for estimation of location in non -regular cases, I: Order of convergence of consistent estimators. Rep. Stat. Appl. Res., JUSE, 22, 826.
[2] Akahira, M. (1975). Asymptotic theory for estimation of location in non-regular cases, II: Bounds of asymptotic distributions of consistent estimators. Rep. Stat. Appl. Res., JUSE, 22, 99-115.
[3] Akahira, M. and Takeuchi, K. (1981). Asymptotic Efficiency of Statistical Estimators: Concepts and Higher Order Asymptotic Efficiency. Lecture Notes in Statistics 7, Springer, New York.
[4] Polfeldt, T. (1970). The order of minimum variance in a non-regular case. Ann. Math. Statist., 41, 667-672.
[5] Smith, R. L. (1985). Maximum likelihood estimation in a class of non-regular cases. Biometrika, 72, 67-90.
[6] Takeuchi, K. and Akahira, M. (1976). On Gram-Charlier-Edgeworth type expansion of the sums of random variables (II). Rep. Univ. Electro-Comm., 27, 117-123.
[7] Woodroofe, M. (1972). Maximum likelihood estimation of a translation parameter of a truncated distribution. Ann. Math. Statist., 43, 113-122.
Journal of Computing and Information, Vol. 2, No. 1, 1991, pp. 71-92. Institutum Gaussianum
A definition of information amount applicable to non-regular cases*
Masafumi Akahira, Institute of Mathematics, University of Tsukuba, Ibaraki 305, Japan
Kei Takeuchi, Research Center for Advanced Science and Technology, University of Tokyo, Komaba, Tokyo 156, Japan
* The results of this paper have been presented by the first author at the Second International Symposium on Probability and Information Theory at McMaster University, Canada, August 1985.
1 Introduction

The amount of information contained in a sample and in a statistic (or an estimator) plays an important role in the theory of statistical inference, as was shown in papers and books by R. A. Fisher (1925, 1934, 1956), Kullback (1959) and others. There are, however, various definitions of the amount of information; some are more convenient (such as the Fisher information) but more restricted in their applications. In this paper we shall discuss a definition of the amount of information between two distributions which is always well defined, symmetric, and additive for independent samples, and for which the information contained in a statistic is never greater than that in the whole sample, with equality holding if and only if the statistic is sufficient. Hence we can also discuss the asymptotic relative efficiency of a statistic (or an estimator) through the ratio of the informations contained in the statistic and in the sample, in a systematic and unified way, both in regular and non-regular cases. We shall give several examples and discuss some details of the structure of information.
2 The definition of information amount

There are various definitions of the distance between two distributions, or of amounts of information for random variables. Here we consider the following quantity. Let X be a random variable defined over an abstract sample space 𝒳, and let P and Q be absolutely continuous with respect to a σ-finite measure µ. We define an amount of information between P and Q as
I_X(P, Q) = −8 log ∫ (dP/dµ · dQ/dµ)^{1/2} dµ.   (2.1)
Here the integral above is called the affinity between P and Q (see, e.g., Matusita, 1955). The quantity is independent of the choice of the measure µ. Indeed, if ν is another σ-finite measure dominating P and Q, then

∫ (dP/dν · dQ/dν)^{1/2} dν = ∫ (dP/dµ · dQ/dµ)^{1/2} dµ.
Taking µ = P + Q, we see that I_X(P, Q) is always well defined, since P and Q are absolutely continuous w.r.t. P + Q. If P ≠ Q, then I_X(P, Q) > 0. So long as P and Q are not disjoint, that is, for any measurable set A, P(A)Q(A) + (1 − P(A))(1 − Q(A)) > 0, I_X(P, Q) is finite. When P and Q are disjoint, we define I_X(P, Q) = ∞. If X and Y are independent random variables with distributions P_i and Q_i (i = 1, 2), respectively, then it is easily seen that

I_{X,Y}(P₁ × Q₁, P₂ × Q₂) = I_X(P₁, P₂) + I_Y(Q₁, Q₂),

where P_i × Q_i denotes the product measure of P_i and Q_i (i = 1, 2).
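The definition and the additivity property are easy to explore numerically. The sketch below (our own illustrative code, not part of the paper) computes I_X(P, Q) = −8 log Σ_i √(p_i q_i) for finite discrete distributions and checks additivity on a pair of product distributions; the particular probability vectors are arbitrary choices.

```python
import math

def info_amount(p, q):
    """I(P,Q) = -8 log sum_i sqrt(p_i q_i) for finite discrete P, Q."""
    affinity = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    return -8.0 * math.log(affinity)

def product(p, q):
    """Joint pmf of two independent coordinates, flattened."""
    return [pi * qj for pi in p for qj in q]

P1, P2 = [0.5, 0.5], [0.8, 0.2]
Q1, Q2 = [0.3, 0.7], [0.6, 0.4]

lhs = info_amount(product(P1, Q1), product(P2, Q2))
rhs = info_amount(P1, P2) + info_amount(Q1, Q2)
print(lhs, rhs)  # additivity: the two values agree
```

Additivity holds because the affinity of a product measure factors into the product of the affinities, so the −8 log turns the product into a sum.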
Let T = t(X) be a statistic. We denote by P_{X|T}, Q_{X|T} the conditional distributions of X given T, and by P_T, Q_T the distributions of T, which are absolutely continuous w.r.t. σ-finite measures µ₁ and µ₂, respectively. Then we obtain the following:
Proposition. It holds that

I_T(P, Q) ≤ I_X(P, Q),

where the equality holds if and only if T is pairwise sufficient for P and Q.

Proof. We have

∫ (dP_X/dµ · dQ_X/dµ)^{1/2} dµ = ∫ { ∫ (dP_{X|T}/dµ₁ · dQ_{X|T}/dµ₁)^{1/2} dµ₁ } (dP_T/dµ₂ · dQ_T/dµ₂)^{1/2} dµ₂
 ≤ ∫ (dP_T/dµ₂ · dQ_T/dµ₂)^{1/2} dµ₂,

since the inner integral is at most 1; hence, by (2.1), I_T(P, Q) ≤ I_X(P, Q). The equality holds if and only if P_{X|T} = Q_{X|T} for a.a. T, that is, T is pairwise sufficient for P and Q. This completes the proof.

Next we define a new distribution K_T of T by

dK_T/dµ₂ = c (dP_T/dµ₂ · dQ_T/dµ₂)^{1/2},   ∫ dK_T = 1,

where c is a normalizing constant. Since
∫ (dP_X/dµ · dQ_X/dµ)^{1/2} dµ = (1/c) ∫ exp{ −(1/8) I_{X|T}(P, Q) } dK_T,   1/c = exp{ −(1/8) I_T(P, Q) },

it follows that

I_X(P, Q) = −8 log E_{K_T}[ exp{ −(1/8) I_{X|T}(P, Q) } ] + I_T(P, Q),   (2.2)
where I_{X|T}(P, Q) is the amount of information between the conditional distributions of X given T, and E_{K_T} denotes the expectation w.r.t. the distribution K_T. The constant 8 in the amount I_X(P, Q) of information is chosen to make the connection with the Fisher information. Indeed, if the distributions depend on a real-valued parameter θ, we simply denote I_X(P_{θ₁}, P_{θ₂}) by I(θ₁, θ₂). Suppose that, in a neighborhood of some parameter θ₀, P_θ is absolutely continuous w.r.t. P_{θ₀} and dP_θ/dP_{θ₀} is continuously differentiable w.r.t. θ. Letting µ = P_{θ₀}, we have for sufficiently small Δθ

I(θ₀, θ₀ + Δθ) = −8 log ∫ (dP_{θ₀+Δθ}/dP_{θ₀})^{1/2} dP_{θ₀}
 = −8 log ∫ { 1 + d(P_{θ₀+Δθ} − P_{θ₀})/dP_{θ₀} }^{1/2} dP_{θ₀}
 = −8 log [ 1 − (1/8) ∫ { d(P_{θ₀+Δθ} − P_{θ₀})/dP_{θ₀} }² dP_{θ₀} + o((Δθ)²) ]
 = ∫ { (∂/∂θ)(dP_θ/dP_{θ₀}) |_{θ=θ₀} }² dP_{θ₀} · (Δθ)² + o((Δθ)²)
 = I(θ₀)(Δθ)² + o((Δθ)²),
where I(θ₀) denotes the amount of Fisher information. If P_θ is not absolutely continuous w.r.t. P_{θ₀}, we denote by A(θ) the support of the distribution P_θ. Putting

π(θ) = ∫_{A(θ₀)} dP_θ,

we obtain π(θ) ≤ 1. If for sufficiently small Δθ

π(θ₀ + Δθ) = π(θ₀) − π₀|Δθ| + o(|Δθ|),

then

I(θ₀, θ₀ + Δθ) = −8 log ( 1 − (1/2) π₀|Δθ| + o(|Δθ|) ) = 4π₀|Δθ| + o(|Δθ|),

where π₀ denotes the rate of decrease of π(θ) at θ₀. Suppose that X₁, ..., X_n are independent and identically distributed (i.i.d.) random variables with a density function f(x, θ), where θ is a real-valued parameter. Then, for θ and θ + Δθ, the amount of information for (X₁, ..., X_n) is given by nI(θ, θ + Δθ). Suppose furthermore that there exists an estimator T_n = t_n(X₁, ..., X_n) which is consistent with the order c_n. Then it follows that
lim_{n→∞} I_{T_n}(θ, θ + c_n^{−1}Δ) > 0

for some nonzero constant Δ, and

lim_{n→∞} nI(θ, θ + c_n^{−1}Δ) > 0,   (2.3)

which gives the bound for consistency (see Example 3.5).
3 Examples

In this section we give some examples of the amount of information.

Example 3.1. Suppose that for two distributions P_j (j = 1, 2), X₁, ..., X_n are independently, identically and normally distributed with mean µ_j and variance σ_j². Then the amount of information between the two distributions for each X_i is given by
I(1, 2) = −8 log ∫ (1/√(2π σ₁ σ₂)) exp{ −(x − µ₁)²/(4σ₁²) − (x − µ₂)²/(4σ₂²) } dx
 = −8 log [ √( 2σ₁σ₂/(σ₁² + σ₂²) ) exp{ −(µ₁ − µ₂)²/(4(σ₁² + σ₂²)) } ]
 = 4 log ( (σ₁² + σ₂²)/(2σ₁σ₂) ) + 2(µ₁ − µ₂)²/(σ₁² + σ₂²).
Hence it is seen that the amount of information for (X₁, ..., X_n) is equal to n times the above value. We put T₁ = X̄ = (1/n) Σ_{i=1}^{n} X_i and T₂ = Σ_{i=1}^{n} (X_i − X̄)². Since

I_{T₁}(1, 2) = 4 log ( (σ₁² + σ₂²)/(2σ₁σ₂) ) + 2n(µ₁ − µ₂)²/(σ₁² + σ₂²)

and

I_{T₂}(1, 2) = 4(n − 1) log ( (σ₁² + σ₂²)/(2σ₁σ₂) ),

it follows that I_{T₁}(1, 2) + I_{T₂}(1, 2) = nI(1, 2).
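The per-observation closed form can be checked against direct numerical integration of the affinity. The following sketch (with illustrative parameter values of our own choosing) compares −8 log ∫ √(f₁ f₂) dx with the displayed expression.

```python
import math

def normal_pdf(x, mu, sig):
    return math.exp(-(x - mu) ** 2 / (2 * sig ** 2)) / (sig * math.sqrt(2 * math.pi))

def info_numeric(mu1, sig1, mu2, sig2, lo=-50.0, hi=50.0, steps=200000):
    # I(1,2) = -8 log of the affinity, integrated by the trapezoid rule
    h = (hi - lo) / steps
    s = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        w = 0.5 if i in (0, steps) else 1.0
        s += w * math.sqrt(normal_pdf(x, mu1, sig1) * normal_pdf(x, mu2, sig2))
    return -8.0 * math.log(s * h)

mu1, sig1, mu2, sig2 = 0.0, 1.0, 1.5, 2.0
closed = 4 * math.log((sig1**2 + sig2**2) / (2 * sig1 * sig2)) \
         + 2 * (mu1 - mu2) ** 2 / (sig1**2 + sig2**2)
print(info_numeric(mu1, sig1, mu2, sig2), closed)  # the two values agree
```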
This equality should hold since (T₁, T₂) is a sufficient statistic.

Example 3.2. Suppose that for two distributions P_{θ_j} (j = 1, 2), X₁, ..., X_n are i.i.d. random variables according to a t-distribution with 3 degrees of freedom, i.e., with the density

f(x − θ) = c/( 1 + (x − θ)² )²,

where c is some constant. Then the amount of information between the two distributions for each X_i is given by

I(1, 2) = 8 log ( 1 + (θ₁ − θ₂)²/4 ).
Example 3.3. Suppose that for two distributions P_j (j = 1, 2), X₁, ..., X_n are i.i.d. random variables with an exponential density

f(x; θ_j, ξ_j) = (1/θ_j) exp{ −(x − ξ_j)/θ_j } for x ≥ ξ_j,
             = 0 for x < ξ_j,

where θ_j > 0 and −∞ < ξ_j < ∞ (j = 1, 2). If ξ₁ < ξ₂, then the amount of information between the two distributions for each X_i is given by

I(1, 2) = −8 log ∫_{ξ₂}^{∞} (θ₁θ₂)^{−1/2} exp{ −(x − ξ₁)/(2θ₁) − (x − ξ₂)/(2θ₂) } dx
 = −8 log [ ( 2√(θ₁θ₂)/(θ₁ + θ₂) ) exp{ −(ξ₂ − ξ₁)/(2θ₁) } ]
 = 4 log ( (θ₁ + θ₂)²/(4θ₁θ₂) ) + 4(ξ₂ − ξ₁)/θ₁.

We put T₁ = min_{1≤i≤n} X_i and T₂ = X̄ − min_{1≤i≤n} X_i. Since T₁ is distributed according to an exponential distribution and it can be written that
nT₂ = Σ_{i=1}^{n} X_(i) − nX_(1) = Σ_{i=1}^{n−1} (n − i)( X_(i+1) − X_(i) ),

which is equal to the sum of n − 1 i.i.d. exponential random variables, it follows that

I_{T₁}(1, 2) = 4 log ( (θ₁ + θ₂)²/(4θ₁θ₂) ) + 4n(ξ₂ − ξ₁)/θ₁   (3.1)

and

I_{T₂}(1, 2) = 4(n − 1) log ( (θ₁ + θ₂)²/(4θ₁θ₂) ),   (3.2)

where X_(1) ≤ X_(2) ≤ ... ≤ X_(n). Then we have

I_{T₁,T₂}(1, 2) = I_{T₁}(1, 2) + I_{T₂}(1, 2) = nI(1, 2),   (3.3)

which shows that the pair (T₁, T₂) is sufficient. Also, (3.1) and (3.2) show that T₁ is sufficient when θ is known, i.e., θ = θ₁ = θ₂, and that T₁ has only 1/n of the total information when ξ is known, i.e., ξ = ξ₁ = ξ₂. This fact can be interpreted as showing that T₂ is asymptotically sufficient when ξ is known.

Example 3.4. Suppose that for two distributions P_j (j = 1, 2), X₁, ..., X_n are i.i.d. random variables according to a uniform distribution on the interval [θ_j − τ_j/2, θ_j + τ_j/2]. If τ₁ < τ₂, then the amount of information between the two distributions for each X_i is given as
I(1, 2) = −8 log ( ℓ/√(τ₁τ₂) ),

where ℓ is the length of the overlap of the two supports; explicitly,

I(1, 2) = ∞ for θ₁ + τ₁/2 ≤ θ₂ − τ₂/2 or θ₂ + τ₂/2 ≤ θ₁ − τ₁/2,

I(1, 2) = −4 log [ { (τ₁ + τ₂)/2 − |θ₂ − θ₁| }² / (τ₁τ₂) ]
  for θ₁ − τ₁/2 < θ₂ − τ₂/2 < θ₁ + τ₁/2 < θ₂ + τ₂/2 or θ₂ − τ₂/2 < θ₁ − τ₁/2 < θ₂ + τ₂/2 < θ₁ + τ₁/2,

I(1, 2) = 4 log ( τ₂/τ₁ ) for θ₂ − τ₂/2 ≤ θ₁ − τ₁/2 and θ₁ + τ₁/2 ≤ θ₂ + τ₂/2.
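All three cases are instances of the single overlap rule above; a small sketch of our own (arbitrary parameter values) makes this concrete.

```python
import math

def info_uniform(th1, t1, th2, t2):
    """I(1,2) = -8 log( overlap / sqrt(t1*t2) ) for U[th_j - t_j/2, th_j + t_j/2]."""
    lo = max(th1 - t1 / 2, th2 - t2 / 2)
    hi = min(th1 + t1 / 2, th2 + t2 / 2)
    if hi <= lo:
        return math.inf          # disjoint supports
    return -8.0 * math.log((hi - lo) / math.sqrt(t1 * t2))

# containment case: the support of P1 lies inside that of P2, giving 4 log(t2/t1)
print(info_uniform(0.0, 1.0, 0.1, 3.0), 4 * math.log(3.0))
```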
We put T₁ = max_{1≤i≤n} X_i − min_{1≤i≤n} X_i and T₂ = ( max_{1≤i≤n} X_i + min_{1≤i≤n} X_i )/2. Then the joint density of (T₁, T₂) is given by

f(t₁, t₂) = n(n − 1) t₁^{n−2}/τ^n for 0 < t₁ < τ, θ − τ/2 < t₂ − t₁/2 and t₂ + t₁/2 < θ + τ/2,
         = 0 otherwise.

Hence the density of T₁ is obtained as

f(t₁) = n(n − 1) t₁^{n−2}(τ − t₁)/τ^n for 0 < t₁ < τ,
      = 0 otherwise,

and the amount of information between the two distributions is also given by

I_{T₁}(1, 2) = 4n log (τ₂/τ₁) for τ₁ < τ₂.
If τ₁ = τ₂ = τ, it is easily seen that I_{T₁}(1, 2) = 0. However, it should be noted that T₂ alone cannot be sufficient even when τ is known, i.e., τ = τ₁ = τ₂, since T₁ and T₂ are not independent. The amount of information for (X₁, ..., X_n) is also given by

nI(1, 2) = −8n log ( 1 − |θ₁ − θ₂|/τ ) for |θ₁ − θ₂| < τ,
         = ∞ for |θ₁ − θ₂| ≥ τ.   (3.4)
Since the conditional distribution of T₂ given T₁ is a uniform distribution on the interval [θ − (τ − T₁)/2, θ + (τ − T₁)/2], it follows that the conditional information amount is given by

I_{T₂|T₁}(1, 2) = −8 log ( 1 − |θ₁ − θ₂|/(τ − T₁) ) for |θ₁ − θ₂| < τ − T₁,
              = ∞ otherwise.
Then we have

E_{T₁}[ exp{ −(1/8) I_{T₂|T₁}(1, 2) } ] = E[ ( τ − T₁ − |θ₁ − θ₂| )⁺/( τ − T₁ ) ]
 = ∫₀^{τ−|θ₁−θ₂|} n(n − 1)( τ − t₁ − |θ₁ − θ₂| ) t₁^{n−2}/τ^n dt₁
 = ( 1 − |θ₁ − θ₂|/τ )^n,

where (·)⁺ denotes the positive part. Hence

−8 log E_{T₁}[ exp{ −(1/8) I_{T₂|T₁}(1, 2) } ] = −8n log ( 1 − |θ₁ − θ₂|/τ )

for |θ₁ − θ₂| < τ, which is equal to (3.4), as is expected from (2.2), since
I_{T₁}(1, 2) = 0. We also obtain the density of T₂:

f(t₂) = ∫₀^{τ−2|t₂−θ|} f(t₁, t₂) dt₁ = (n/τ^n)( τ − 2|t₂ − θ| )^{n−1}

for |t₂ − θ| < τ/2. Hence the amount of information for T₂ is given by
I_{T₂}(1, 2) = −8 log ∫ (n/τ^n)( τ − 2|t₂ − θ₁| )^{(n−1)/2} ( τ − 2|t₂ − θ₂| )^{(n−1)/2} dt₂

for 0 < |θ₂ − θ₁| < τ. The explicit form of the above integral is complicated, so we consider the case when n is large and θ₁ and θ₂ are sufficiently close. Putting Δ/n = |θ₂ − θ₁|, we have for sufficiently large n

I_{T₂}(1, 2) ≈ −8 log ∫ (n/τ) exp{ −(n/τ)( |t₂| + |t₂ − Δ/n| ) } dt₂ = 8{ Δ/τ − log ( 1 + Δ/τ ) }.   (3.5)

On the other hand we obtain from (3.4)

nI(1, 2) ≈ −8n log ( 1 − Δ/(nτ) ) ≈ 8Δ/τ

for sufficiently large n. Therefore the loss of information of T₂ is asymptotically equal to the value 8 log(1 + Δ/τ).

Example 3.5. Suppose that X₁, ..., X_n are i.i.d. random variables with a
triangular density function

f(x − θ) = 1 − |x − θ| for |x − θ| < 1,
         = 0 for |x − θ| ≥ 1.

For a small positive number Δθ, we have

∫ { f(x) f(x − Δθ) }^{1/2} dx
 = ∫_{−1+Δθ}^{0} { (1 + x)(1 + x − Δθ) }^{1/2} dx + ∫_{0}^{Δθ} { (1 − x)(1 + x − Δθ) }^{1/2} dx + ∫_{Δθ}^{1} { (1 − x)(1 − x + Δθ) }^{1/2} dx
 = 2 ∫_{0}^{1−Δθ} √( y² + yΔθ ) dy + ∫_{−Δθ/2}^{Δθ/2} √( (1 − Δθ/2)² − u² ) du
 = 1 + ((Δθ)²/4) log ( Δθ/4 ) − (3/8)(Δθ)² + o((Δθ)²).
In a similar way to the case Δθ > 0, we obtain for Δθ < 0

∫_{−1}^{1+Δθ} { f(x) f(x − Δθ) }^{1/2} dx = 1 + ((Δθ)²/4) log ( |Δθ|/4 ) − (3/8)(Δθ)² + o((Δθ)²).

Hence we have for Δθ ≠ 0

I(θ, θ + Δθ) = −8 log ∫ { f(x) f(x − Δθ) }^{1/2} dx
 = −8 log [ 1 + ((Δθ)²/4) log ( |Δθ|/4 ) − (3/8)(Δθ)² + o((Δθ)²) ]
 = −2(Δθ)² log ( |Δθ|/4 ) + 3(Δθ)² + o((Δθ)²).
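The expansion of the affinity can be checked by direct numerical integration of two shifted triangular densities; the step Δθ below is our own illustrative choice.

```python
import math

def tri(x):
    # triangular density 1 - |x| on (-1, 1)
    return max(0.0, 1.0 - abs(x))

def affinity(delta, steps=200000):
    # left-Riemann integral of sqrt(f(x) f(x - delta)) over the support
    lo, hi = -1.0, 1.0 + delta
    h = (hi - lo) / steps
    return sum(math.sqrt(tri(lo + i * h) * tri(lo + i * h - delta)) * h
               for i in range(steps))

d = 0.05
approx = 1 + (d * d / 4) * math.log(d / 4) - 3 * d * d / 8
print(affinity(d), approx)  # agree up to o(d^2)
```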
In order that nI(θ, θ + Δθ) = O(1), i.e. −n(Δθ)² log |Δθ| = O(1), we need Δθ = O( (n log n)^{−1/2} ). Letting c_n = (n log n)^{1/2}, we see that (2.3) holds, i.e., lim_{n→∞} nI(θ, θ + c_n^{−1}Δ) > 0 for some nonzero constant Δ. Hence it follows that the bound for consistency is equal to (n log n)^{1/2}.

Example 3.6. Suppose that for a distribution P_θ, X₁, ..., X_n are i.i.d. random variables with a truncated normal density

f(x − θ) = c e^{−(x−θ)²/2} for |x − θ| < 1,
         = 0 for |x − θ| ≥ 1,
where c is some constant. For 0 < θ < 2, the affinity ρ(0, θ) between P₀ and P_θ is given by

ρ(0, θ) = ∫ ( dP₀/dx · dP_θ/dx )^{1/2} dx = ∫_{θ−1}^{1} c exp{ −( x² + (x − θ)² )/4 } dx
 = e^{−θ²/8} P₀{ |X| < 1 − θ/2 }.

For small θ > 0, the amount of information between P₀ and P_θ for X₁ is given by

I(0, θ) = θ² − 8 log P₀{ |X| < 1 − θ/2 }
 = θ² − 8 log [ 1 − 2√(2π) c { Φ(1) − Φ(1 − θ/2) } ]   (3.6)
 = θ² − 8 log ( 1 − kθ − (1/4)kθ² + o(θ²) )
 = 8kθ + { 1 + 2k(2k + 1) } θ² + o(θ²),

where Φ(x) is the standard normal distribution function and k = c e^{−1/2}. Since X̄ = Σ_{i=1}^{n} X_i/n is asymptotically normally distributed with mean θ and variance (1 − 2k)/n, it follows that for large n the amount of information between the two distributions for X̄ is given by
I_{X̄}(0, θ) ≈ −8 log ∫ √( n/(2π(1 − 2k)) ) exp{ −n( x² + (x − θ)² )/( 4(1 − 2k) ) } dx = nθ²/(1 − 2k),   (3.7)

where k = c e^{−1/2}. We put T₁ = n( X_(1) + 1 − θ ) and T₂ = n( X_(n) − 1 − θ ), where X_(1) = min_{1≤i≤n} X_i and X_(n) = max_{1≤i≤n} X_i. Since the asymptotic densities of T₁ and T₂ are given by
g₁(t) = k e^{−kt} for t > 0; 0 for t ≤ 0,

and

g₂(t) = k e^{kt} for t < 0; 0 for t ≥ 0,

respectively, it follows that for large n and θ > 0 the amounts of information between the two distributions for X_(1) and X_(n) are given by

I_{X_(1)}(0, θ) ≈ −8 log ∫_{nθ}^{∞} k e^{−k(t − nθ/2)} dt = 4knθ   (3.8)

and

I_{X_(n)}(0, θ) ≈ −8 log ∫_{−∞}^{0} k e^{k(t − nθ/2)} dt = 4knθ.   (3.9)
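The small-θ expansion (3.6) can be verified numerically. In the sketch below (our own code; θ is an arbitrary small value), c and k are computed from their definitions, and the exact I(0, θ) = θ² − 8 log P₀{|X| < 1 − θ/2} is compared with 8kθ + {1 + 2k(2k + 1)}θ².

```python
import math

def h(x):
    # unnormalized truncated-normal density on (-1, 1)
    return math.exp(-x * x / 2)

def integral(f, lo, hi, steps=100000):
    # midpoint rule
    step = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * step) for i in range(steps)) * step

c = 1.0 / integral(h, -1.0, 1.0)
k = c * math.exp(-0.5)

theta = 0.02
p0 = c * integral(h, -(1 - theta / 2), 1 - theta / 2)   # P_0{|X| < 1 - theta/2}
exact = theta ** 2 - 8 * math.log(p0)                   # from (3.6)
expansion = 8 * k * theta + (1 + 2 * k * (2 * k + 1)) * theta ** 2
print(exact, expansion)  # agree up to o(theta^2)
```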
It is noted that X̄ is asymptotically independent of X_(1) and X_(n), and (X̄, X_(1), X_(n)) is asymptotically sufficient. It follows from (3.6) to (3.9) that, for large n and small θ > 0, the amount of information between the two distributions for (X₁, ..., X_n) has the following relationship:

I_{X₁,...,X_n}(0, θ) = nI_{X₁}(0, θ) = 8knθ + { 1 + 2k(2k + 1) } nθ² + o(nθ²) = I_{X_(1)}(0, θ) + I_{X_(n)}(0, θ) + I_{X̄}(0, θ).

Suppose that for n = 2m + 1, X₁, ..., X_n are independently and identically distributed random variables according to a density f(x − θ) w.r.t. the Lebesgue measure. Let T be the median of X₁, ..., X_n, symbolically
T = med X_i. We shall obtain the amount of information between the two distributions for T and the asymptotic relative efficiency of T w.r.t. (X₁, ..., X_n). First, the density g(t − θ) of T is given by

g(t − θ) = ( (2m + 1)!/(m! m!) ) { F(t − θ) }^m { 1 − F(t − θ) }^m f(t − θ),

where F(x − θ) denotes the distribution function of X_i. Then it follows that for θ₁ = θ + Δ and θ₂ = θ − Δ (we may take θ = 0) the affinity ρ(θ₁, θ₂) between the two distributions with densities f(x − θ_j) (j = 1, 2) is given by

ρ(θ₁, θ₂) = ( (2m + 1)!/(m! m!) ) ∫ { F(t − Δ) F(t + Δ)( 1 − F(t − Δ) )( 1 − F(t + Δ) ) }^{m/2} { f(t − Δ) f(t + Δ) }^{1/2} dt.

Putting

H(t) = F(t − Δ) F(t + Δ)( 1 − F(t − Δ) )( 1 − F(t + Δ) ),

we have

(d/dt) log H(t) = f(t − Δ)/F(t − Δ) + f(t + Δ)/F(t + Δ) − f(t − Δ)/( 1 − F(t − Δ) ) − f(t + Δ)/( 1 − F(t + Δ) ).

If f(−t) = f(t) for all real t and f(t)/( 1 − F(t) ) is a monotone increasing function of t, it follows that the four terms on the right-hand side above are monotone decreasing functions of t, which implies that (d/dt) log H(t) is a monotone decreasing function of t. Since f(−Δ) = f(Δ) and F(Δ) = 1 − F(−Δ), it follows that

(d/dt) log H(0) = f(−Δ)/F(−Δ) + f(Δ)/F(Δ) − f(−Δ)/( 1 − F(−Δ) ) − f(Δ)/( 1 − F(Δ) ) = 0,
implying that log H(t), i.e., H(t), takes its maximum value at t = 0. We also have

(d²/dt²) log H(0) = 2f′(Δ)( 1 − 2F(Δ) )/( F(Δ)( 1 − F(Δ) ) ) − 2f(Δ)²{ F(Δ)² + ( 1 − F(Δ) )² }/( F(Δ)²( 1 − F(Δ) )² ) = −σ² (say).

Then we obtain

log H(t) = log H(0) − (σ²/2) t² + ....

Hence we have for sufficiently large m

ρ(θ₁, θ₂) ≈ ( (2m + 1)!/(m! m!) ) H(0)^{m/2} ∫ e^{−σ²mt²/2} { f(t − Δ) f(t + Δ) }^{1/2} dt.

Since by the Stirling formula for large factorials

(2m)!/(m! m!) ≈ 2^{2m}/√(mπ),

it follows that
ρ(θ₁, θ₂) ≈ { 4H(0)^{1/2} }^m ( (2m + 1)/√(mπ) ) ∫ e^{−σ²mt²/2} { f(t − Δ) f(t + Δ) }^{1/2} dt
 ≈ { 4F(Δ)( 1 − F(Δ) ) }^m · 2√2 f(Δ)/σ
 = { 2√( F(Δ)( 1 − F(Δ) ) ) }^n { Q(Δ) }^{−1/2} (say),

where

Q(Δ) = σ² F(Δ)( 1 − F(Δ) )/( 2f(Δ)² ) = { F(Δ)² + ( 1 − F(Δ) )² }/( F(Δ)( 1 − F(Δ) ) ) − f′(Δ)( 1 − 2F(Δ) )/f(Δ)².

Hence the amount of information between the two distributions for T = med X_i is given by

I_T(1, 2) = −4n log ( 4F(Δ)( 1 − F(Δ) ) ) + 4 log Q(Δ) + o(1).

On the other hand, the amount of information between the two distributions for each X_i is given by

I(1, 2) = −8 log ∫ f(x − Δ)^{1/2} f(x + Δ)^{1/2} dx = I(Δ) (say).

Then the asymptotic relative efficiency of T = med X_i w.r.t. (X₁, ..., X_n) is obtained as

I_T(1, 2)/( nI(1, 2) ) → −4 log ( 4F(Δ)( 1 − F(Δ) ) )/I(Δ).   (3.10)

We also have for small Δ > 0
I(Δ) = −8 log ∫ f(x − Δ)^{1/2} f(x + Δ)^{1/2} dx
 ≈ −8 log [ 1 − (Δ²/2) ∫ f′(x)²/f(x) dx ]
 ≈ 4Δ² ∫ f′(x)²/f(x) dx = 4Δ² I,   (3.11)

where I denotes the Fisher information of f. Further we obtain for small Δ > 0

4F(Δ)( 1 − F(Δ) ) = 1 − 4( F(Δ) − 1/2 )² ≈ 1 − 4f(0)²Δ².   (3.12)

Hence it follows from (3.10) to (3.12) that the asymptotic relative efficiency of T = med X_i w.r.t. (X₁, ..., X_n) is given by 4f(0)²/I, which is equal to the Pitman efficiency. In particular, letting f(x) = e^{−|x|}/2, we have for Δ > 0

F(Δ) = 1 − (1/2) e^{−Δ}
and

I(Δ) = −8 log ∫ (1/2) e^{−( |x − Δ| + |x + Δ| )/2} dx
 = −8 log ( Δ e^{−Δ} + e^{−Δ} )
 = −8 log ( e^{−Δ}( 1 + Δ ) )
 = 8( Δ − log ( 1 + Δ ) ).

Hence we see that the asymptotic relative efficiency of T = med X_i w.r.t. (X₁, ..., X_n) is given by

( Δ − log ( 2 − e^{−Δ} ) ) / ( 2( Δ − log ( 1 + Δ ) ) ),

which tends to 1 as Δ → 0.
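The limiting behavior of this ratio is easy to confirm numerically (our own sketch; the values of Δ are arbitrary).

```python
import math

def are(delta):
    # asymptotic relative efficiency of the median, double-exponential case
    num = delta - math.log(2 - math.exp(-delta))
    den = 2 * (delta - math.log(1 + delta))
    return num / den

for d in (1.0, 0.1, 0.01):
    print(d, are(d))  # tends to 1 as d -> 0
```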
References

[1] Fisher, R.A. (1925). Theory of statistical estimation. Proc. Camb. Phil. Soc., 22, 700-725.
[2] Fisher, R.A. (1934). Two new properties of mathematical likelihood. Proc. Roy. Soc. London, A-144, 285-307. [3] Fisher, R.A. (1956). Statistical Methods and Scientific Inference. Oliver & Boyd, Edinburgh. [4] Kullback, S. (1959). Information Theory and Statistics. John Wiley, New York.
[5] Matusita, K. (1955). Decision rules based on the distance for problems of fit, two samples and estimation. Ann. Math. Statist., 26, 631-640.
Rep. Stat. Appl. Res., JUSE, Vol. 39, No. 4, 1992, pp. 1-13
A-Section
Unbiased Estimation in Sequential Binomial Sampling

Masafumi AKAHIRA*, Kei TAKEUCHI**, and Ken-ichi KOIKE*
Abstract

In the case of sequential Bernoulli trials, a sufficient condition for a parametric function to be unbiasedly estimable is given, and the existence of a discontinuous unbiasedly estimable function is shown using non-randomized sample size procedures.

1. Introduction

It is easy to observe that in the usual "regular" case, any parametric function which has unbiased estimators must be continuous or differentiable in the original parameter (e.g. see Zacks (1971) and Lehmann (1983)). However, in a sequential case, or a randomized sample size case when the sample size is not bounded, the continuity of an estimable function does not necessarily follow. The purpose of this paper is to show that, for the simplest case of the Bernoulli sequence, we can construct an unbiased estimator of a discontinuous function of p using a sequential procedure with a stopping rule depending on the path and a non-randomized estimator. Singh (1964) claimed to have obtained the same results for a more general class of problems by randomizing the size of the sample but, as will be discussed later, the authors found a flaw in his proof which seems to invalidate his main theorem. Our procedure is non-randomized, but it is not a procedure based on minimal sufficient statistics, hence in a sense it is equivalent to a randomized procedure based on a sufficient statistic. It is an open problem whether we can construct a non-randomized sequential unbiased estimator of a discontinuous function based on the minimal sufficient statistic alone. In view of our results, the contention of Bhandari and Bose (1990) that an estimable parameter must be continuous in p in the sequential estimation setup cannot be valid, since the general theoretical framework is essentially the same, although the class of procedures in their paper is different from ours.

2. Definitions and Preliminaries
Now we have a sequence of independent and identically distributed Bernoulli random variables X₁, X₂, ..., X_n, ... with P(X₁ = 1) = p and P(X₁ = 0) = 1 − p, where 0 < p < 1, and determine a stopping rule. The set of paths is defined as follows. Let R_n be the set of

Received September 3, 1992
* Institute of Mathematics, University of Tsukuba, Tsukuba, Ibaraki 305, Japan
** Research Center for Advanced Science and Technology, University of Tokyo, 4-6-1 Komaba, Meguro-ku, Tokyo 156, Japan
(Key words) Sequential Bernoulli trial, Stopping rule, Discontinuous estimable function, Non-randomized procedure.
all possible Bernoulli sequences of length n, that is, a set of 2^n sequences of 0's and 1's, and denote R = ∪_{n=1}^∞ R_n. Now let S_n be a subset of R_n, and we stop sampling if the observed sequence of size n belongs to S_n. We denote S = ∪_{n=1}^∞ S_n. Suppose that (i₁, ..., i_M) ∈ S_M for some positive integer M; then it is required that, for any k < M, (i₁, ..., i_k) ∉ S_k, and this is equivalent to the condition that if (i₁, ..., i_M) ∈ S_M, then (i₁, ..., i_M, i_{M+1}, ..., i_{M+m}) ∉ S_{M+m} for any positive integer m and any (i_{M+1}, ..., i_{M+m}) with i_{M+j} = 0 or 1 (j = 1, ..., m). For each path (X₁, ..., X_n) ∈ S_n, we denote the terminal point by the pair (X, Y), where X = Σ_{i=1}^n X_i and Y = Σ_{i=1}^n (1 − X_i). We denote by T the terminal point at which sampling is stopped. Suppose that we have a stopping rule defined as above. Then we obtain the probability that the procedure stops at the point T = (x, y) as

P{ T = (x, y) } = N(x, y) p^x (1 − p)^y,

where N(x, y) is the number of paths in S from the origin to the point (x, y) and is independent of p. We assume that N(x, y) > 0 for all x and all y with x + y ≥ 1, and that the stopping rule is closed in the sense that Σ_{x,y} P{ T = (x, y) } = 1. Then we have

Σ_{x′≤x, y′≤y} N(x′, y′) · _{x+y−x′−y′}C_{x−x′} ≤ _{x+y}C_x.
First we have, for any estimator π(X, Y) depending on the stopping rule,

E_p[ π(X, Y) ] = Σ_n Σ_{x+y=n} π(x, y) N(x, y) p^x (1 − p)^y.

Next we want a sufficient condition for g(p) to be unbiasedly estimable, that is, for there to exist an unbiased estimator ĝ(X, Y) such that E_p[ |ĝ(X, Y)| ] < ∞. Now suppose that there exists an asymptotically unbiased estimator ĝ_n(X, Y) of g(p), that is,

lim_{n→∞} Σ_{x+y=n} ĝ_n(x, y) _nC_x p^x (1 − p)^y = g(p),   (2.1)

where ĝ_n(x, y) is defined for x + y = n, 0 ≤ x ≤ n. Define

g_n(p) = Σ_{x+y=n} ĝ_n(x, y) _nC_x p^x (1 − p)^y;   (2.2)

then lim_{n→∞} g_n(p) = g(p) for all p. Also define h_n(p) = g_n(p) − g_{n−1}(p) for n = 1, 2, ..., and h₀(p) = 0; then Σ_{n=0}^∞ h_n(p) = g(p). We put ĝ₀(0, 0) = ĝ_{n−1}(−1, n) = ĝ_{n−1}(n, −1) = 0 for n = 1, 2, .... Then
G_n(x, y) = ĝ_n(x, y) − (x/n) ĝ_{n−1}(x − 1, y) − (y/n) ĝ_{n−1}(x, y − 1)   (2.3)

for n = 1, 2, ..., and G₀(0, 0) = 0. Then we have the following.

Lemma 2.1.

h_n(p) = Σ_{x+y=n} G_n(x, y) _nC_x p^x (1 − p)^y.
Proof. Since h_n(p) = g_n(p) − g_{n−1}(p) for n = 1, 2, ..., it follows from (2.2) that

h_n(p) = Σ_{x+y=n} ĝ_n(x, y) _nC_x p^x (1 − p)^y − Σ_{x+y=n−1} ĝ_{n−1}(x, y) _{n−1}C_x p^x (1 − p)^y.   (2.4)

Since, for x + y = n − 1,

_{n−1}C_x p^x (1 − p)^y = _{n−1}C_x p^x (1 − p)^{n−1−x}( p + (1 − p) )
 = _{n−1}C_x p^{x+1} (1 − p)^{n−x−1} + _{n−1}C_x p^x (1 − p)^{n−x}
 = ((x + 1)/n) _nC_{x+1} p^{x+1} (1 − p)^{n−x−1} + ((n − x)/n) _nC_x p^x (1 − p)^{n−x},

using _{n−1}C_x = ((x + 1)/n) _nC_{x+1} and _{n−1}C_x = ((n − x)/n) _nC_x, it follows that

Σ_{x+y=n−1} ĝ_{n−1}(x, y) _{n−1}C_x p^x (1 − p)^y
 = Σ_{x=0}^{n−1} ((x + 1)/n) ĝ_{n−1}(x, n − 1 − x) _nC_{x+1} p^{x+1} (1 − p)^{n−x−1}
  + Σ_{x=0}^{n−1} ((n − x)/n) ĝ_{n−1}(x, n − 1 − x) _nC_x p^x (1 − p)^{n−x}
 = Σ_{x+y=n} (x/n) ĝ_{n−1}(x − 1, y) _nC_x p^x (1 − p)^y + Σ_{x+y=n} (y/n) ĝ_{n−1}(x, y − 1) _nC_x p^x (1 − p)^y.   (2.5)

From (2.3), (2.4) and (2.5) we have

h_n(p) = Σ_{x+y=n} { ĝ_n(x, y) − (x/n) ĝ_{n−1}(x − 1, y) − (y/n) ĝ_{n−1}(x, y − 1) } _nC_x p^x (1 − p)^y
 = Σ_{x+y=n} G_n(x, y) _nC_x p^x (1 − p)^y.

This completes the proof.

Thus, if we put, for x + y = n = 1, 2, ..., with e(0, 0) = 0,

e(x, y) = G_n(x, y) _nC_x / N(x, y),

we have

Σ_n Σ_{x+y=n} e(x, y) N(x, y) p^x (1 − p)^y = g(p).
Hence we have established the following.

Lemma 2.2. Suppose that there exists an asymptotically unbiased estimator of g(p). Then g(p) is unbiasedly estimable provided that

Σ_n Σ_{x+y=n} |e(x, y)| N(x, y) p^x (1 − p)^y < ∞.   (2.6)

A sufficient condition for (2.6) is that there exists a sequence {c_n} of positive numbers with Σ_n c_n < ∞ such that

Σ_{x+y=n} |e(x, y)| N(x, y) p^x (1 − p)^y = Σ_{x+y=n} |G_n(x, y)| _nC_x p^x (1 − p)^y ≤ c_n   (2.7)

for all p.

3. The Existence of a Discontinuous Unbiasedly Estimable Function and its Related Results
In this section, under the setup of the previous one, we show that a discontinuous unbiasedly estimable function exists, and we also obtain related results.

Example 3.1. We consider the case when

g(p) = 1 for p > 1/2, 0 for p = 1/2, −1 for p < 1/2.

Then we may define

ĝ_n(x, y) = (1/a_n)(x − y) for |x − y| ≤ a_n,
         = 1 for x − y > a_n,   (3.1)
         = −1 for x − y < −a_n,

where {a_n} is an increasing sequence of positive numbers such that lim_{n→∞} (a_n/n^γ) = c (> 0) for some 1/2 < γ < 1. We define

g_n(p) = E_p[ ĝ_n(X, Y) ] = Σ_{x+y=n} ĝ_n(x, y) _nC_x p^x (1 − p)^y.
From (3.1) we obtain

lim_{n→∞} g_n(p) = 1 for p > 1/2, 0 for p = 1/2, −1 for p < 1/2.

Indeed, in view of the order of the weak convergence of X/n to p, we have

ĝ_n(X, Y) →p 1 as n → ∞ for p > 1/2,
ĝ_n(X, Y) →p −1 as n → ∞ for p < 1/2,

where →p means convergence in probability. Since ĝ_n(x, y) is bounded, we obtain

g_n(p) = E_p[ ĝ_n(X, Y) ] → 1 as n → ∞ for p > 1/2,
g_n(p) = E_p[ ĝ_n(X, Y) ] → −1 as n → ∞ for p < 1/2.

And when p = 1/2, we have g_n(p) = E_p[ ĝ_n(X, Y) ] → 0 as n → ∞,

since X − Y is symmetrically distributed around zero for p = 1/2 and ĝ_n is an odd function of x − y. Putting h_n(p) = g_n(p) − g_{n−1}(p) for n = 1, 2, ..., we have Σ_{n=1}^∞ h_n(p) = g(p).
Setting e(0, 0) = 0 and, for x + y = n = 1, 2, ...,

e(x, y) = { ĝ_n(x, y) − (x/n) ĝ_{n−1}(x − 1, y) − (y/n) ĝ_{n−1}(x, y − 1) } _nC_x / N(x, y),

where ĝ₀(0, 0) = ĝ₀(−1, 1) = ĝ₀(1, −1) = 0, we have by Lemma 2.1

h_n(p) = Σ_{x+y=n} e(x, y) N(x, y) p^x (1 − p)^y.

Hence it is enough to prove that Σ_n Σ_{x+y=n} |e(x, y)| N(x, y) p^x (1 − p)^y < ∞. In order to do so, we shall show that

Σ_{x+y=n} |e(x, y)| N(x, y) p^x (1 − p)^y = O( n^{−γ−(1/2)} ),
from which the above follows immediately, since γ + (1/2) > 1. First we have for sufficiently large n

Σ_{x+y=n} | ĝ_n(x, y) − (x/n) ĝ_{n−1}(x − 1, y) − (y/n) ĝ_{n−1}(x, y − 1) | _nC_x p^x (1 − p)^y
 = Σ_{|x−y|≤a_{n−1}−1} | ... | _nC_x p^x (1 − p)^y + Σ_{a_{n−1}−1<|x−y|≤a_n+1} | ... | _nC_x p^x (1 − p)^y + Σ_{|x−y|>a_n+1} | ... | _nC_x p^x (1 − p)^y   (3.2)
 = I₁ + I₂ + I₃ (say) if a_{n−1} + 1 < a_n,
 = I′₁ + I′₂ + I′₃ (say) if a_n ≤ a_{n−1} + 1,
where | ... | denotes the same expression as that in the first term of (3.2). Second, since on |x − y| ≤ a_{n−1} − 1 all the ĝ's are in their linear range, so that the expression inside | ... | equals (x − y){ a_n^{−1} − (n − 1)/(n a_{n−1}) }, we have

I₁ ≤ ( | a_n^{−1} − a_{n−1}^{−1} | + 1/(n a_{n−1}) ) E_p[ |X − Y| ] for p = 1/2,
I₁ ≤ ( | a_n^{−1} − a_{n−1}^{−1} | + 1/(n a_{n−1}) ) (a_{n−1} − 1) Σ_{|x−y|≤a_{n−1}−1} _nC_x p^x (1 − p)^y for p ≠ 1/2.   (3.3)

From the condition on a_n we have

a_n^{−1} − a_{n−1}^{−1} = c^{−1}( n^{−γ} − (n − 1)^{−γ} )( 1 + o(1) ) = O( n^{−γ−1} ).   (3.4)

It is also seen that, in the terms I_i and I′_i (i = 2, 3),

| ĝ_n(x, y) − (x/n) ĝ_{n−1}(x − 1, y) − (y/n) ĝ_{n−1}(x, y − 1) | = O( n^{−γ} ).   (3.5)

We consider the case when p = 1/2. Since E_p[ |X − Y| ] = O(n^{1/2}) for p = 1/2, it follows from (3.3) and (3.4) that, for p = 1/2,

I₁ = O( n^{−γ−(1/2)} ).
Since, from (3.5),

Σ_{a_{n−1}−1<|x−y|} | ... | _nC_x p^x (1 − p)^y ≤ O( n^{−γ} ) P_p( |X − Y| > a_{n−1} − 1 ) = O( n^{−γ} exp{ −c′ n^{−1}( a_{n−1} − 1 )² } )

for a_n ≤ a_{n−1} + 1 and p = 1/2, it follows that

I′₂ + I′₃ = o( n^{−γ−(1/2)} ),

where | ... | means the left-hand side of (3.5) and c′ is some positive constant. In a similar way to the above we have I₂ + I₃ = o( n^{−γ−(1/2)} ) for a_{n−1} + 1 < a_n and p = 1/2. Next we consider the case when p ≠ 1/2. Since a_{n−1} = cn^γ + o(n^γ) for 1/2 < γ < 1, we have for p < 1/2
Σ_{|x−y|≤a_{n−1}−1, x+y=n} _nC_x p^x (1 − p)^y = P_p( |2X − n| ≤ a_{n−1} − 1 )
 = O( exp{ −(2p − 1)² n/( 8p(1 − p) ) } ) = o( n^{−γ+(1/2)} ),

which implies that, for p < 1/2,

I₁ ≤ ( | a_n^{−1} − a_{n−1}^{−1} | + 1/(n a_{n−1}) ) (a_{n−1} − 1) Σ_{|x−y|≤a_{n−1}−1} _nC_x p^x (1 − p)^y = o( n^{−γ−(1/2)} ).

Hence we have for p < 1/2

I₁ = o( n^{−γ−(1/2)} ).

In a similar way to the case p < 1/2, we also obtain I₁ = o( n^{−γ−(1/2)} ) for p > 1/2.
Since, from (3.5),

Σ_{a_{n−1}−1<|x−y|} | ... | _nC_x p^x (1 − p)^y ≤ Σ_{|x−y|>a_{n−1}−1, x+y=n} | ... | _nC_x p^x (1 − p)^y = o( n^{−γ−(1/2)} )

for a_n ≤ a_{n−1} + 1 and p ≠ 1/2, it follows that I′₂ + I′₃ = o( n^{−γ−(1/2)} ), where | ... | means the left-hand side of (3.5). In a similar way to the above, we have I₂ + I₃ = o( n^{−γ−(1/2)} ) for a_{n−1} + 1 < a_n and p ≠ 1/2. Hence we have

I₁ + I₂ + I₃ = O( n^{−γ−(1/2)} ) and I′₁ + I′₂ + I′₃ = O( n^{−γ−(1/2)} ) for p = 1/2,
I₁ + I₂ + I₃ = o( n^{−γ−(1/2)} ) and I′₁ + I′₂ + I′₃ = o( n^{−γ−(1/2)} ) for p ≠ 1/2.
Therefore we obtain

Σ_n Σ_{x+y=n} | ĝ_n(x, y) − (x/n) ĝ_{n−1}(x − 1, y) − (y/n) ĝ_{n−1}(x, y − 1) | _nC_x p^x (1 − p)^y < ∞,

which implies that g(p) is discontinuous in p but, by Lemma 2.2, unbiasedly estimable.
Example 3.2. Suppose that g(p) is continuous in p on the closed interval [0, 1] and that there are constants K > 0 and δ > 0 such that

| g(p) − λ g( p − (1 − λ)Δ ) − (1 − λ) g( p + λΔ ) | ≤ K λ(1 − λ) Δ^{1+δ}

for all p ∈ [0, 1], all 0 < λ < 1 and sufficiently small Δ > 0. Then we may define

ĝ_n(x, y) = g( x/(x + y) ) for x + y = n = 1, 2, ...,

and ĝ₀(0, 0) = ĝ_{n−1}(−1, n) = ĝ_{n−1}(n, −1) = 0 for n = 1, 2, .... Obviously, (2.1) holds. From the condition and (2.3) we have, for x + y = n ≥ 2,

G_n(x, y) = g(x/n) − (x/n) g( (x − 1)/(n − 1) ) − ((n − x)/n) g( x/(n − 1) ),

which is the left-hand side of the condition with p = x/n, λ = x/n and Δ = 1/(n − 1). For sufficiently large n = x + y we thus have

| G_n(x, y) | ≤ K ( x(n − x)/n² ) ( 1/(n − 1) )^{1+δ};

hence

Σ_{x+y=n} | G_n(x, y) | _nC_x p^x (1 − p)^y ≤ K ( 1/(n − 1) )^{1+δ} (1/n²) Σ_{x=0}^{n} x(n − x) _nC_x p^x (1 − p)^{n−x} = K p(1 − p)/( n(n − 1)^δ ).

Since

Σ_{n=2}^∞ 1/( n(n − 1)^δ ) < ∞ for δ > 0,

it follows from (2.7) that g(p) is unbiasedly estimable.

Now suppose we have a sequence {g^a(p)}_{a=1,2,...} of functions of p on [0, 1] such that the estimator e^a(x, y) of g^a(p) of the form
e^a(x, y) = G^a_n(x, y) _nC_x / N(x, y) for x + y = n = 1, 2, ...;  e^a(0, 0) = 0

is unbiased for each a = 1, 2, ..., where, in a similar way to G_n(x, y) of (2.3), for each a the G^a_n(x, y) are defined under the condition of the existence of an asymptotically unbiased estimator ĝ^a_n(X, Y) of g^a(p). Suppose that lim_{a→∞} g^a(p) = g*(p) and lim_{a→∞} G^a_n(x, y) = G*_n(x, y). Then we define

e*(x, y) = G*_n(x, y) _nC_x / N(x, y) for x + y = n = 1, 2, ...;  e*(0, 0) = 0.

If, for each n, there exists a function H_n(x, y) for x + y = n = 0, 1, 2, ... such that

| G^a_n(x, y) | ≤ H_n(x, y) for x + y = n = 0, 1, 2, ... and a = 1, 2, ...,
Σ_{n=0}^∞ Σ_{x+y=n} H_n(x, y) _nC_x p^x (1 − p)^y < ∞,

then

Σ_{n=0}^∞ Σ_{x+y=n} e*(x, y) N(x, y) p^x (1 − p)^y = lim_{M→∞} Σ_{n=0}^{M} Σ_{x+y=n} G*_n(x, y) _nC_x p^x (1 − p)^y
 = lim_{M→∞} lim_{a→∞} Σ_{n=0}^{M} Σ_{x+y=n} G^a_n(x, y) _nC_x p^x (1 − p)^y
 = lim_{a→∞} Σ_{n=0}^{∞} Σ_{x+y=n} G^a_n(x, y) _nC_x p^x (1 − p)^y
 = lim_{a→∞} Σ_n Σ_{x+y=n} e^a(x, y) N(x, y) p^x (1 − p)^y
 = lim_{a→∞} g^a(p) = g*(p);

hence g*(p) is also unbiasedly estimable.

Remark. From the above two examples and the above discussion, we can derive a fairly large class of unbiasedly estimable functions. Still, it seems difficult to give a clear-cut simple statement of necessary and sufficient conditions for a parameter to be unbiasedly estimable, since we need some condition or other to guarantee (2.6). Similar conditions must have been necessary for the main theorem of Singh (1964), giving a necessary and sufficient condition for unbiased estimability with randomized sample size procedures. More precisely, let g(θ) be the limit of a sequence {p_n(θ)} of polynomials on Ω, and let N be a stopping variable such that
Pe(N=n)= 1 ,
2
n=1,2 ,..... 9E 12,
and then there is a sequence $Y_1,Y_2,\ldots$ such that
$$E_{\theta}(Y_n)=2^{n}\bigl(p_n(\theta)-p_{n-1}(\theta)\bigr),\qquad n=1,2,\ldots,\ \theta\in\Omega,$$
where $p_0(\theta)=0$, and define $Y=Y_n$ if $N=n$. Then he obtains
$$E_{\theta}(Y)=\sum_{n=1}^{\infty}E(Y\mid N=n)\,P_{\theta}(N=n)=\sum_{n=1}^{\infty}2^{n}\bigl(p_n(\theta)-p_{n-1}(\theta)\bigr)\frac{1}{2^{n}}=\lim_{n\to\infty}p_n(\theta)=g(\theta),\qquad\theta\in\Omega.$$
But, in order that $E_{\theta}(Y)$ exists, we must have $E_{\theta}(|Y|)<\infty$.
Since
$$E_{\theta}(|Y|)=\sum_{n=1}^{\infty}E(|Y|\mid N=n)\,P_{\theta}(N=n)\ \ge\ \sum_{n=1}^{\infty}2^{n}\,|p_n(\theta)-p_{n-1}(\theta)|\,\frac{1}{2^{n}}=\sum_{n=1}^{\infty}|p_n(\theta)-p_{n-1}(\theta)|,$$
a necessary (but not sufficient) condition is that $\sum_{n=1}^{\infty}|p_n(\theta)-p_{n-1}(\theta)|<\infty$, which may not always be the case. As an example, first define the function
$$f(x)=\begin{cases}-1&\text{for }-1<x<0,\\[2pt] 0&\text{for }x=0,\pm 1,\\[2pt] 1&\text{for }0<x<1.\end{cases}$$
Then it is easily shown that this function can be expanded into a Fourier series as
$$f(x)=\frac{4}{\pi}\sum_{k=0}^{\infty}\frac{1}{2k+1}\,\sin(2k+1)\pi x,\tag{3.6}$$
where the convergence is pointwise or in the L²-sense, but neither absolute nor uniform. Now, transforming the variable x into θ by
$$\theta=\frac{1+\sin\pi x}{2},$$
f is transformed into a discontinuous function $f^{*}(\theta)$ for 0 ≤ θ ≤ 1 given by
$$f^{*}(\theta)=\begin{cases}-1&\text{for }0\le\theta<\tfrac12,\\[2pt] 0&\text{for }\theta=\tfrac12,\\[2pt] 1&\text{for }\tfrac12<\theta\le 1.\end{cases}$$
Since $\sin(2k+1)\pi x$ can be expressed as a polynomial in $\sin\pi x$ of degree 2k + 1, each term on the right-hand side of (3.6) can be expanded as
$$\frac{4}{\pi}\,\frac{1}{2k+1}\,u_{2k+1}(\theta),$$
where $u_{2k+1}(\theta)$ is a polynomial of degree 2k + 1. Hence we may define
$$p_{n}(\theta)=\frac{4}{\pi}\sum_{j=1}^{n}\frac{1}{j}\,u_{j}(\theta),$$
where $u_{j}(\theta)$ is defined to be equal to 0 when j is even. Then, obviously,
$$\lim_{n\to\infty}p_{n}(\theta)=f^{*}(\theta)\qquad\text{for }0\le\theta\le 1,$$
since
$$f^{*}(\theta)=\frac{4}{\pi}\sum_{k=0}^{\infty}\frac{1}{2k+1}\,u_{2k+1}(\theta),$$
but
$$\sum_{n=2}^{\infty}|p_{n}(\theta)-p_{n-1}(\theta)|=\frac{4}{\pi}\sum_{k=1}^{\infty}\frac{1}{2k+1}\,|u_{2k+1}(\theta)|=\frac{4}{\pi}\sum_{k=1}^{\infty}\frac{1}{2k+1}\,|\sin(2k+1)\pi x|,$$
which is not always finite (e.g. put x = 1/2). Note that, for 0 ≤ p ≤ 1, f(p) is equal to the g(p) discussed in Example 3.1.
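A quick numerical illustration (a sketch; the series and the evaluation point x = 1/2 are taken from the example above) that the partial sums of $\sum_n |p_n(\theta)-p_{n-1}(\theta)|$ grow without bound:

```python
import math

# Partial sums of (4/pi) * sum_k |sin((2k+1) pi x)| / (2k+1) at x = 1/2.
# Every odd term then has |sin| = 1, so the sums grow like log K and the
# series diverges, as claimed in the text.
def partial_sum(K, x=0.5):
    return sum(4 / math.pi * abs(math.sin((2 * k + 1) * math.pi * x)) / (2 * k + 1)
               for k in range(K))

print(round(partial_sum(10), 3), round(partial_sum(100), 3), round(partial_sum(1000), 3))
```

The printed values keep increasing with the number of terms, confirming the divergence.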
REFERENCES

[1] Bhandari, S. K. and Bose, A. (1990). Existence of unbiased estimates in sequential binomial experiments. Sankhyā 52, Ser. A, 127-130.
[2] Lehmann, E. L. (1983). Theory of Point Estimation. Wiley, New York.
[3] Singh, R. (1964). Existence of unbiased estimates. Sankhyā 26, Ser. A, 93-96.
[4] Zacks, S. (1971). The Theory of Statistical Inference. Wiley, New York.
KEI TAKEUCHI (*) - MASAFUMI AKAHIRA (**)
Interval estimation with varying confidence levels
CONTENTS: 1. Introduction. - 2. Reconsideration of Neyman's procedure. - 3. Conditional procedures. - 4. Example of estimated confidence interval. - 5. Consideration of the lengths and levels of intervals. - 6. Solution to Wald's formulation. References. Summary. Riassunto. Key words and phrases.
1. INTRODUCTION
Interval estimation is usually formalized, if we adopt a non-Bayesian standpoint, as a procedure with a fixed confidence level, covering the true value of the parameter with a preassigned probability. Such a procedure is mathematically obtained from a class of test procedures; thus the confidence level is predetermined and fixed. But when we use interval estimation procedures in practical cases, such a procedure may not be quite appropriate, especially when the width of the confidence interval depends on the unknown parameter and may become impractically wide; we may then want a narrower interval even at the cost of a lower confidence level. In such a case the confidence level may depend on the unknown parameter value, provided that we can estimate the actual confidence level. We would like to look into the matter from a more general and flexible viewpoint and discuss
(*) Research Center for Advanced Science and Technology, University of Tokyo, 4-6-1 Komaba, Meguro-ku, Tokyo 156, Japan. (**) Institute of Mathematics, University of Tsukuba, Tsukuba, Ibaraki 305, Japan.
possible generalization of the procedure with varying confidence levels.
2. RECONSIDERATION OF NEYMAN'S PROCEDURE
The most commonly used interval estimation procedure is that of confidence intervals as formalized by Neyman (1937) and advocated by the so-called "frequentists" as opposed to the Bayesians and Fisherians. The confidence interval is a random interval $[\underline{\theta}(X),\overline{\theta}(X)]$ based on the sample X with the property that
$$P_{\omega}\{\underline{\theta}(X)<\theta(\omega)<\overline{\theta}(X)\}\ \ge\ 1-\alpha\qquad\text{for all }\omega\in\Omega.$$
Neyman (1937) emphasized that there is no meaning in the expression
$$P_{\omega}\{\underline{\theta}(X)<\theta(\omega)<\overline{\theta}(X)\mid X=x\},$$
that is, the confidence level (or coefficient) should be clearly distinguished from the probability of the inequality being true given the sample, a main point of disagreement with the Bayesians and Fisherians. He argues that, once X = x is observed, the statement $\underline{\theta}(x)<\theta(\omega)<\overline{\theta}(x)$ is either true or false, hence the probability of the above statement can take only the value 0 or 1. Even without accepting the Bayesian viewpoint that we could talk about the probability of the above type of statement given X = x, based on a subjective concept of probability, we can still point out apparent "contradictions" brought about by confidence interval procedures in some cases. For example, in the interval estimation of the ratio of the two means of normal populations, the interval obtained can sometimes be the whole line, and then it is obvious that the obtained interval contains the true value with probability 1. In other cases the calculated interval can reduce to the empty set, hence the probability must be equal to 0, and the confidence coefficient may become meaningless.
Now, analyzing the nature of the problem more precisely, we may proceed as follows. Let us define
$$\chi_{\omega}(\underline{\theta}(X),\overline{\theta}(X))=\begin{cases}1&\text{if }\underline{\theta}(X)<\theta(\omega)<\overline{\theta}(X),\\ 0&\text{otherwise},\end{cases}$$
which is a random variable; once X = x is given, the value of $\chi_{\omega}$ is determined to be either 1 or 0, just as Neyman (1937) insisted. But, since $\chi_{\omega}$ depends both on X and on the unknown parameter θ, we cannot know the value of $\chi_{\omega}$. We would then like to "estimate" the value of $\chi_{\omega}$. If $p=E_{\omega}[\chi_{\omega}(\underline{\theta}(X),\overline{\theta}(X))]$ is known, we can regard p as the estimator of the value of $\chi_{\omega}$. Such a procedure can be justified by the following argument. Suppose that the unobserved random value $\chi_{\omega}$ is known to take only the two values 0 and 1 with known probability $p=P_{\omega}\{\chi_{\omega}=1\}=1-P_{\omega}\{\chi_{\omega}=0\}$, and that we have no further information on $\chi_{\omega}$. In this case let $\hat{\chi}$ with $0\le\hat{\chi}\le 1$ be the estimator (or predictor) of $\chi_{\omega}$. Now let the loss due to such an estimator be $\phi_0(\hat{\chi})$ if $\chi_{\omega}=0$ and $\phi_1(\hat{\chi})$ if $\chi_{\omega}=1$. Naturally we can assume that $\phi_0(0)=\phi_1(1)=0$ and that $\phi_0$ and $\phi_1$ are monotone increasing and decreasing functions, respectively. Then the expected loss is equal to
$$E_{\hat{\chi}}=p\,\phi_1(\hat{\chi})+(1-p)\,\phi_0(\hat{\chi}),$$
and the optimum value of $\hat{\chi}$ is given by the equation
$$\frac{\partial}{\partial\hat{\chi}}E_{\hat{\chi}}=p\,\phi_1'(\hat{\chi})+(1-p)\,\phi_0'(\hat{\chi})=0,\tag{2.1}$$
or
$$\phi_0'(\hat{\chi})/p=-\phi_1'(\hat{\chi})/(1-p).$$
In the case where the three conditions (i) $\phi_1(\hat{\chi})=\phi_0(1-\hat{\chi})$, (ii) $\phi_0$ is convex, and (iii) $\phi_0'(\hat{\chi})/\hat{\chi}=\phi_0'(1-\hat{\chi})/(1-\hat{\chi})$ hold for all $0<\hat{\chi}<1$, we have $\hat{\chi}=p$ as the unique solution of (2.1). Two such examples are $\phi_0(\hat{\chi})=\hat{\chi}^2$ and $\phi_0(\hat{\chi})=-\log(1-\hat{\chi})$, which are two of the most natural candidates for $\phi_0$. Now, in the case of the confidence interval, we may estimate $\chi_{\omega}(\underline{\theta}(X),\overline{\theta}(X))$ by a function of X, which we shall denote by $\hat{\chi}(X)$. If $p=P_{\omega}\{\underline{\theta}(X)<\theta(\omega)<\overline{\theta}(X)\}=E_{\omega}(\chi_{\omega})$ is known, we may put $\hat{\chi}(X)=p$, but not necessarily. We may ask what will be the "best" estimator $\hat{\chi}(X)$ of $\chi_{\omega}(\underline{\theta}(X),\overline{\theta}(X))$. Before discussing the construction of such procedures we would like to point out that $\hat{\chi}$ can be regarded as the estimated posterior confidence coefficient based on the sample, which allows for such possibilities as: (a) When an ancillary statistic T exists, $\hat{\chi}$ would be the estimator of the conditional probability $P_{\omega}\{\underline{\theta}(X)<\theta(\omega)<\overline{\theta}(X)\mid T\}$. (b) If $P_{\omega}\{\underline{\theta}(X)<\theta(\omega)<\overline{\theta}(X)\}$ is not constantly equal to $1-\alpha$, we may estimate it based on the sample; such a procedure may be useful when we cannot obtain a similar confidence interval.
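The claim that $\hat{\chi}=p$ minimizes the expected loss for both choices of $\phi_0$ can be checked numerically; the following is a minimal sketch (the value p = 0.73 and the grid search are arbitrary illustration choices, not from the paper):

```python
import math

# Check that chi_hat = p minimizes E = p*phi1(x) + (1-p)*phi0(x)
# for phi0(x) = x^2 (so phi1(x) = (1-x)^2) and for
# phi0(x) = -log(1-x) (so phi1(x) = -log x).
def expected_loss(x, p, kind):
    if kind == "quadratic":
        return p * (1 - x) ** 2 + (1 - p) * x ** 2
    return -p * math.log(x) - (1 - p) * math.log(1 - x)

p = 0.73
grid = [i / 1000 for i in range(1, 1000)]
results = {kind: min(grid, key=lambda x: expected_loss(x, p, kind))
           for kind in ("quadratic", "log")}
print(results)  # both minimizers sit at x = p = 0.73
```

Both losses are strictly convex in the estimator, so the grid minimizer lands exactly on p in each case.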
3. CONDITIONAL PROCEDURES
Let us suppose that there exists a confidence procedure with the exact level $1-\alpha$, and also a statistic T such that, given T = t, the conditional probability of the confidence interval including the true value θ(ω) depends on T but not on ω. Let $\chi_{\omega}$ be the function defined as above from the confidence procedure, and let
$$\pi(T)=E_{\omega}[\chi_{\omega}\mid T]=P_{\omega}\{\underline{\theta}(X)\le\theta(\omega)\le\overline{\theta}(X)\mid T\}.$$
Then, for both of the choices $\phi_0(\hat{\chi})=\hat{\chi}^2$ and $\phi_0(\hat{\chi})=-\log(1-\hat{\chi})$, it can easily be shown that $\pi(T)$ is a better estimator of $\chi_{\omega}$ than the constant $\pi_0=1-\alpha$. As a special case, suppose that there exists a transformation group G operating on the sample space and the parameter space with the property that $P_{g\omega}\{X\in A\}=P_{\omega}\{gX\in A\}$ for all ω and all g,
and assume that θ(ω) = ω. Then a confidence set S(X) is said to be invariant with respect to G if S(gX) = gS(X). Assume that G is transitive on Ω, i.e. for any $\omega_1,\omega_2\in\Omega$ there is a $g\in G$ such that $g\omega_1=\omega_2$. Then $P_{\omega}\{S(X)\ni\omega\}$ is independent of ω, since we can fix $\omega_0$ and find $g\in G$ such that $g\omega_0=\omega$, and we have
$$P_{\omega}\{S(X)\ni\omega\}=P_{g\omega_0}\{S(X)\ni g\omega_0\}=P_{\omega_0}\{S(gX)\ni g\omega_0\}=P_{\omega_0}\{gS(X)\ni g\omega_0\}=P_{\omega_0}\{S(X)\ni\omega_0\}.$$
Now let T be the maximal invariant statistic, that is, T(gX) = T(X) for all g. Then, under the above assumption, X can be transformed one-to-one into (g(X), T(X)), where g(X) is an element of G; hence we may write X = (g, T) with $g\in G$. Then, for an invariant confidence interval, we have S(X) = S(g, T) = gS(e, T), where e is the unit element of G. So, given T, we have
$$P_{\omega}\{S(X)\ni\omega\mid T\}=P_{\omega_0}\{S(gX)\ni g\omega_0\mid T\}=P_{\omega_0}\{gS(e,T)\ni g\omega_0\mid T\}=P_{\omega_0}\{S(e,T)\ni\omega_0\mid T\},$$
which is independent of g, hence of ω. Therefore we may use $P_{\omega_0}\{S(X)\ni\omega_0\mid T\}$ as the estimator of $\chi_{\omega}$ derived from S. In the simplest case, the location parameter problem, let $X=(X_1,\ldots,X_n)$ and let $[\underline{\theta},\overline{\theta}]$ be a location invariant interval estimator $[\underline{\theta}(X),\overline{\theta}(X)]$ of θ, i.e. one satisfying
$$\underline{\theta}(X_1+a,\ldots,X_n+a)=\underline{\theta}(X_1,\ldots,X_n)+a,\qquad\overline{\theta}(X_1+a,\ldots,X_n+a)=\overline{\theta}(X_1,\ldots,X_n)+a.$$
Then the conditional level of $[\underline{\theta},\overline{\theta}]$, given $T=(X_2-X_1,\ldots,X_n-X_1)$, is computed in the following manner. Let
$$\underline{\theta}(X_1,\ldots,X_n)=X_1+\underline{\theta}(0,X_2-X_1,\ldots,X_n-X_1)=X_1+\underline{\theta}_0(T),$$
$$\overline{\theta}(X_1,\ldots,X_n)=X_1+\overline{\theta}(0,X_2-X_1,\ldots,X_n-X_1)=X_1+\overline{\theta}_0(T).$$
Then it can be shown that
$$P_{\theta}\{\underline{\theta}(X)\le\theta\le\overline{\theta}(X)\mid T\}=P_{\theta}\{X_1+\underline{\theta}_0(T)\le\theta\le X_1+\overline{\theta}_0(T)\mid T\}=\int_{\underline{\theta}(x)}^{\overline{\theta}(x)}\prod_{i=1}^{n}f(x_i-u)\,du\Big/\int_{-\infty}^{\infty}\prod_{i=1}^{n}f(x_i-u)\,du.$$
Note that the above is equal to the fiducial probability of Fisher (1930) (see also Fraser (1961)).
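As a sketch of how this conditional (fiducial) level can be evaluated in practice, the following computes the ratio of integrals numerically for a normal location model, where the closed form $\Phi(\sqrt{n}(\overline{\theta}-\bar{x}))-\Phi(\sqrt{n}(\underline{\theta}-\bar{x}))$ is available for comparison (the sample values and the interval are hypothetical illustration choices):

```python
import math

# Numerically evaluate
#   integral_{lo}^{hi} prod_i f(x_i - u) du / integral_R prod_i f(x_i - u) du
# for the standard normal density f, and compare with the closed form.
def fiducial_level(xs, lo, hi, f, grid=20001, span=10.0):
    xbar = sum(xs) / len(xs)
    us = [xbar - span + 2 * span * i / (grid - 1) for i in range(grid)]
    w = [math.exp(sum(math.log(f(x - u)) for x in xs)) for u in us]
    du = us[1] - us[0]
    num = sum(wi for u, wi in zip(us, w) if lo <= u <= hi) * du
    return num / (sum(w) * du)

phi = lambda t: math.exp(-t * t / 2) / math.sqrt(2 * math.pi)
Phi = lambda t: 0.5 * (1 + math.erf(t / math.sqrt(2)))

xs = [0.3, -0.5, 1.1, 0.2]            # hypothetical sample
n, xbar = len(xs), sum(xs) / len(xs)
lo, hi = xbar - 0.8, xbar + 0.8       # a location-invariant interval
exact = Phi(math.sqrt(n) * (hi - xbar)) - Phi(math.sqrt(n) * (lo - xbar))
approx = fiducial_level(xs, lo, hi, phi)
print(round(exact, 4), round(approx, 4))
```

For the normal density the product over the sample is proportional to a normal density in u, so the two numbers agree to the accuracy of the quadrature.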
4. EXAMPLE OF ESTIMATED CONFIDENCE INTERVAL
Another case where an "estimated confidence level" may be of value is that of a discrete distribution, where the coverage probability can be well below the nominal level unless randomization is allowed. Now suppose that the sample X can take only integer values and assume that
$$p_{\theta}(x)=P_{\theta}\{X=x\},\qquad x=0,1,2,\ldots,$$
where θ is a real parameter. We also assume that $p_{\theta}(x+1)/p_{\theta}(x)$ is monotone increasing in θ for all x. Then a "good" confidence interval is given as $\underline{\theta}(x)<\theta<\overline{\theta}(x)$, where $\underline{\theta}(x)$ and $\overline{\theta}(x)$ are monotone increasing functions with the condition that
$$P_{\theta}\{\underline{\theta}(X)<\theta<\overline{\theta}(X)\}\ \ge\ 1-\alpha,$$
or
$$P_{\theta}\{X\ge\overline{\theta}^{-1}(\theta)\}+P_{\theta}\{X\le\underline{\theta}^{-1}(\theta)\}\ \le\ \alpha\qquad\text{for all }\theta.$$
For many practical cases the sum of the two terms above can be well below α, thus giving an unnecessarily conservative
interval. In general it is difficult to remedy the situation without violating the condition for some value of θ, but we may adjust the conclusion by giving an estimator of the posterior confidence level given X = x. Suppose that X is distributed according to the binomial distribution B(n, θ), 0 < θ < 1, and that an interval estimator of θ is required. Now suppose that a confidence interval estimation procedure $\underline{\theta}(X)<\theta<\overline{\theta}(X)$ is given. Then we may calculate
$$E_{\theta}[\chi(\underline{\theta}(X)<\theta<\overline{\theta}(X))]=\sum_{x=0}^{n}{}_nC_x\,\theta^{x}(1-\theta)^{n-x}\,\chi(x,\theta)=p(\theta)\quad\text{(say)}$$
for 0 < θ < 1, where $\chi(x,\theta)$ is the indicator of $\underline{\theta}(x)<\theta<\overline{\theta}(x)$. And we can construct an estimator $\hat{p}=\hat{p}(x)$ of p(θ) and may use it as an estimated confidence level. Since the function p(θ) is discontinuous (at the points where $\theta=\underline{\theta}(x)$ or $\overline{\theta}(x)$ for some x), no unbiased estimator of p(θ) exists. We want to make
$$E_{\theta}[\{\hat{p}-p(\theta)\}^{2}]=\sum_{x=0}^{n}{}_nC_x\,(\hat{p}(x)-p(\theta))^{2}\,\theta^{x}(1-\theta)^{n-x}=R(\theta)\quad\text{(say)}$$
as small as possible. Since R(θ) above depends on the unknown θ, R(θ) cannot be uniformly minimized, and we may resort to a "Bayesian" approach and make
$$\int_{0}^{1}R(\theta)\,\omega(\theta)\,d\theta=\text{minimum}$$
with some weight function ω(θ) (or "prior" distribution, if you like). Then the solution is given by
$$\hat{p}(x)=\frac{\displaystyle\int_{0}^{1}\omega(\theta)\,p(\theta)\,\theta^{x}(1-\theta)^{n-x}\,d\theta}{\displaystyle\int_{0}^{1}\omega(\theta)\,\theta^{x}(1-\theta)^{n-x}\,d\theta}=\frac{1}{C_{\omega}(x)}\int_{0}^{1}\omega(\theta)\,p(\theta)\,\theta^{x}(1-\theta)^{n-x}\,d\theta\quad\text{(say)},$$
where $C_{\omega}(x)=\int_{0}^{1}\omega(\theta)\,\theta^{x}(1-\theta)^{n-x}\,d\theta$.
Recalling the definition of p(θ), we have
$$\hat{p}(x)=\frac{1}{C_{\omega}(x)}\sum_{y=0}^{n}{}_nC_y\int_{0}^{1}\omega(\theta)\,\theta^{y+x}(1-\theta)^{2n-x-y}\,\chi(y,\theta)\,d\theta=\frac{1}{C_{\omega}(x)}\sum_{y=0}^{n}{}_nC_y\int_{\underline{\theta}(y)}^{\overline{\theta}(y)}\omega(\theta)\,\theta^{y+x}(1-\theta)^{2n-x-y}\,d\theta.$$
Here ω(θ) may be chosen arbitrarily, but a natural choice will be the beta density
$$\omega(\theta)=\text{const.}\;\theta^{p-1}(1-\theta)^{q-1}\qquad\text{for }p>0\text{ and }q>0.$$
Then $\hat{p}(x)$ is expressed in terms of incomplete beta functions. Usually a (non-randomized) confidence interval is defined so that we have
$$p(\theta)\ \ge\ 1-\alpha\qquad\text{for all }\theta,$$
but for small n the difference $p(\theta)-(1-\alpha)$ is sometimes quite large. Then $\hat{p}(x)$ defined as above (which is the "posterior" mean of p(θ)) becomes larger than $1-\alpha$, and gives more relevant information on the accuracy of the confidence interval obtained. We may also define $\underline{\theta}(X)$ and $\overline{\theta}(X)$ so that the estimated confidence level $\hat{p}(x)$ is equal to $1-\alpha$ for $x=0,1,\ldots,n$. Since there are 2(n+1) values of $\underline{\theta}(x)$ and $\overline{\theta}(x)$ to be determined, while the condition $\hat{p}(x)=1-\alpha$ for $x=0,1,\ldots,n$ gives only n+1 equations to be satisfied, there will still be n+1 more conditions possibly to be imposed from other considerations.
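A numerical sketch of p(θ) and the weighted ("posterior mean") estimator $\hat{p}(x)$; the simple interval $\theta\in(x/n-d,\,x/n+d)$ and the uniform weight ω(θ) = 1 (p = q = 1 in the beta choice above) are hypothetical illustration choices, not from the text:

```python
from math import comb

# p(theta): total binomial probability of the x's whose interval covers theta;
# p_hat(x): posterior mean of p(theta) under a flat weight, by midpoint rule.
n, d = 10, 0.15

def coverage(theta):
    return sum(comb(n, x) * theta**x * (1 - theta)**(n - x)
               for x in range(n + 1)
               if x / n - d < theta < x / n + d)

def p_hat(x, grid=2000):
    ts = [(i + 0.5) / grid for i in range(grid)]
    w = [t**x * (1 - t)**(n - x) for t in ts]
    return sum(wi * coverage(t) for t, wi in zip(ts, w)) / sum(w)

print(round(coverage(0.5), 4), [round(p_hat(x), 3) for x in (0, 5, 10)])
```

The function coverage(θ) is the step function p(θ) of the text, and p_hat smooths it against the posterior weight, which is why an exact unbiased estimator need not exist while this weighted estimate is easy to compute.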
5. CONSIDERATION OF THE LENGTHS AND LEVELS OF INTERVALS
We may consider a few types of confidence intervals with estimated confidence levels. One is the confidence interval with a constant estimated level. That is, we shall determine an interval $[\underline{\theta}(X),\overline{\theta}(X)]$ with the property that the estimator $\hat{p}$ of $E_{\omega}[\chi_{\omega}(\underline{\theta}(X),\overline{\theta}(X))]$ is constant and equal to $1-\alpha$, which means that $E_{\omega}[\chi_{\omega}(\underline{\theta}(X),\overline{\theta}(X))]=1-\alpha$ for all ω and that $\hat{p}=1-\alpha$ is a good estimator of the random variable $\chi_{\omega}$. In the case of invariant estimation with a transitive transformation group G, we may choose $\underline{\theta}$ and $\overline{\theta}$ so as to satisfy the condition
$$P_{\omega_0}\{\underline{\theta}(X)<\theta(\omega_0)<\overline{\theta}(X)\mid T\}=1-\alpha;$$
in the case of a location parameter, $\underline{\theta}$ and $\overline{\theta}$ must satisfy the condition
$$\int_{\underline{\theta}}^{\overline{\theta}}\prod_{i=1}^{n}f(X_i-\theta)\,d\theta\Big/\int_{-\infty}^{\infty}\prod_{i=1}^{n}f(X_i-\theta)\,d\theta=1-\alpha.$$
Another case is an interval with constant length and estimated confidence level. As an example, suppose that $X_1,\ldots,X_n$ are independently, identically and normally distributed with mean θ and unknown variance σ². Then it is intuitively clear (and may be rigorously proved) that the best invariant interval of fixed length (irrespective of σ) is of the form $[\bar{X}-l,\bar{X}+l]$, where $\bar{X}=\sum_{i=1}^{n}X_i/n$. Then
$$P\{\bar{X}-l<\theta<\bar{X}+l\}=2\Phi(\sqrt{n}\,l/\sigma)-1,$$
where
$$\Phi(u)=\int_{-\infty}^{u}\frac{1}{\sqrt{2\pi}}\,e^{-x^{2}/2}\,dx.$$
An unbiased estimator of $p=2\Phi(\sqrt{n}\,l/\sigma)-1$ is given by
$$\hat{p}=\begin{cases}1&\text{if }|X_1-X_2|<\sqrt{2n}\,l,\\ 0&\text{otherwise}.\end{cases}$$
By applying the Rao-Blackwell theorem we get the uniformly minimum variance unbiased estimator of p as
$$p^{*}=E\Bigl[\hat{p}\,\Big|\,\bar{X},\sum_{i=1}^{n}(X_i-\bar{X})^{2}\Bigr]=c\int_{-a}^{a}(1-r^{2})^{(n-4)/2}\,dr,$$
where c is a constant and $a=\min\bigl(1,\ \sqrt{n}\,l\big/\sqrt{\textstyle\sum_{i=1}^{n}(X_i-\bar{X})^{2}}\,\bigr)$. As a second example, suppose that $X_1,\ldots,X_n$ are independently, identically and uniformly distributed on the interval $[\theta-(1/2),\,\theta+(1/2)]$. Then a sufficient statistic is given by the pair $(\min_{1\le i\le n}X_i,\ \max_{1\le i\le n}X_i)$, and a natural estimator is given by $\hat{\theta}=(\min_i X_i+\max_i X_i)/2$; we may consider an interval of the type $[\hat{\theta}-l,\hat{\theta}+l]$. We shall denote $R=\max_i X_i-\min_i X_i$; then the distribution of R is independent of θ, hence R is an ancillary statistic. Given R, $\hat{\theta}$ is distributed according to the uniform distribution centered at θ with length equal to $1-R$. Then
$$P\{\hat{\theta}-l<\theta<\hat{\theta}+l\mid R\}=\min\{1,\ 2l/(1-R)\}.$$
Therefore, for the fixed-length interval, the estimated confidence level is given by the above formula. And if we put $l=(1-\alpha)(1-R)/2$, we have a confidence interval with constant estimated level. Now we shall consider the problem of determining the length of the interval simultaneously. In the case of a location parameter with known scale, we may first fix l, choose $\hat{\theta}_0$, and define an interval $[\hat{\theta}_0-l,\hat{\theta}_0+l]$ so that
$$\hat{p}=\int_{\hat{\theta}_0-l}^{\hat{\theta}_0+l}\prod_{i=1}^{n}f(x_i-\theta)\,d\theta\Big/\int_{-\infty}^{\infty}\prod_{i=1}^{n}f(x_i-\theta)\,d\theta$$
is maximized. More generally we may determine l so as to balance the probability of inclusion and the length of the interval. A simple approach is to define a loss function by
$$f(p)+g(l),$$
where f(p) is decreasing in p, the probability of inclusion of the true parameter, and g(l) is increasing in the length l of the interval. An intuitively appealing choice will be
$$f(p)=-\log p\qquad\text{and}\qquad g(l)=cl,$$
where c is a positive constant. Generally p is unknown, hence we may replace f(p) by an unbiased estimator $\widehat{f(p)}$ and choose l so that $\widehat{f(p)}+g(l)$ is minimized. It is often difficult to obtain an unbiased estimator of f(p); we may therefore replace $\widehat{f(p)}$ by $f(\hat{p})$, where $\hat{p}$ is an unbiased estimator of p. For the above choice, the optimum value of l is given by
$$-\frac{1}{p}\,\frac{dp}{dl}+c=0,\qquad\text{or}\qquad\frac{dp}{dl}=cp.$$
In the case of the normal distribution with known scale we have $p=2\Phi(\sqrt{n}\,l/\sigma)-1$, hence $dp/dl=2(\sqrt{n}/\sigma)\,\phi(\sqrt{n}\,l/\sigma)$, and we obtain the optimum value of l from the equation
$$\frac{2\sqrt{n}}{\sigma}\,\phi\!\left(\frac{\sqrt{n}\,l}{\sigma}\right)=c\,\Bigl\{2\Phi\!\left(\frac{\sqrt{n}\,l}{\sigma}\right)-1\Bigr\},\qquad\text{or}\qquad\phi(u)=\frac{c\sigma}{2\sqrt{n}}\,\{2\Phi(u)-1\},\tag{5.1}$$
and $l=\sigma u/\sqrt{n}$. Since the ratio $\phi(u)/\{2\Phi(u)-1\}$ is monotone decreasing in u > 0, and approaches ∞ as u → 0 and 0 as u → ∞, we always have a unique solution of (5.1) in u > 0. In the case of a location parameter with unknown scale, consideration of invariance leads to the posterior estimator of the confidence level of the interval $[\hat{\theta}_0-l,\hat{\theta}_0+l]$ given by
$$\hat{p}=\int_{0}^{\infty}\!\!\int_{\hat{\theta}_0-l}^{\hat{\theta}_0+l}\frac{1}{\tau^{n}}\prod_{i=1}^{n}f\!\left(\frac{x_i-\theta}{\tau}\right)d\theta\,d\tau\Big/\int_{0}^{\infty}\!\!\int_{-\infty}^{\infty}\frac{1}{\tau^{n}}\prod_{i=1}^{n}f\!\left(\frac{x_i-\theta}{\tau}\right)d\theta\,d\tau.$$
In the case of the normal distribution we have
$$\hat{p}=\int_{\sqrt{n}(\hat{\theta}_0-l-\bar{X})/\sqrt{Z}}^{\sqrt{n}(\hat{\theta}_0+l-\bar{X})/\sqrt{Z}}g(w)\,dw\Big/\int_{-\infty}^{\infty}g(w)\,dw,$$
where $g(w)=C_n/(1+w^{2})^{n/2}$ for $-\infty<w<\infty$, with some constant $C_n$, and $Z=\sum_{i=1}^{n}(X_i-\bar{X})^{2}$. We also have
$$\frac{\partial\hat{p}}{\partial l}=\sqrt{\frac{n}{Z}}\,\Bigl\{g\bigl(\sqrt{n}(\hat{\theta}_0+l-\bar{X})/\sqrt{Z}\bigr)+g\bigl(\sqrt{n}(\hat{\theta}_0-l-\bar{X})/\sqrt{Z}\bigr)\Bigr\}\Big/\int_{-\infty}^{\infty}g(w)\,dw.$$
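Returning to the uniform-distribution example earlier in this section, the conditional coverage formula min{1, 2l/(1−R)} given the ancillary R can be checked by simulation; this sketch approximates the conditioning by coarse binning on R (the sample size, l, and the bin width are arbitrary choices):

```python
import random

# Monte Carlo check of P{theta_hat - l < theta < theta_hat + l | R}
# for the uniform distribution on [theta - 1/2, theta + 1/2], with
# theta_hat the midrange and R = max - min the range (ancillary).
random.seed(1)
n, l, theta = 5, 0.05, 0.0
hits = {}
for _ in range(200000):
    xs = [theta + random.random() - 0.5 for _ in range(n)]
    lo, hi = min(xs), max(xs)
    est, R = (lo + hi) / 2, hi - lo
    cnt = hits.setdefault(round(R, 1), [0, 0])   # coarse bin on R
    cnt[0] += 1
    cnt[1] += est - l < theta < est + l

for R, (m, k) in sorted(hits.items()):
    if m > 5000:
        print(R, round(k / m, 3), round(min(1.0, 2 * l / (1 - R)), 3))
```

Within each well-populated bin the empirical conditional coverage agrees with min{1, 2l/(1−R)} evaluated at the bin center, up to binning bias and Monte Carlo noise.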
Another approach to the choice of levels and lengths is to start from the loss functions defined by Wald (1971 ) and others . Suppose
502
15 that an interval [ , 0] is given . Then the loss from the statement that 0 lies in between 0 and 0 may have the form
c(,0) if0<00, d (0-, 0) if 0 < 0 . A simple and plausible choice will be
c( ,0)
=0 -0,
d(,0)=
c(0-0) 2 +0-0,
and
4@ ,e)=cL-
0)2 + -0
with a given constant c. Now our choice of 0 and 0 depends on the loss function L (0 , 6, 0) and we would like to choose 0 and 0 so that E [L (, 0 , 0)] be minimized . We may use such (, 0) with the estimated risk r" (0 , 0) of E [L ( , 0 , 0)] and also with the estimated level p . Suppose that Xl ,..., X„ are independently , identically and normally distributed with mean 0 and known variance 62 . Considering the interval of the type [X - 1, X + 1] as [ , 0] , we have
er
+c
J
(O
_
+1)2L
4
(/';;(
_0))
d5
- 00 = 21 + 2n 2 J (u- " 1)2,) (u) du a Fria
=21+2cn2
(n Cr
+
"__I)^- 2c61^1)
2 6
Vn
6
503 16 The value of 1 minimizing E [L ( , 0 , 0)] is obtained from the equation 2ca ^1 (u - )a (u) du 0 = 1 - f nn V f l/a
=
1-2ca(^ ''
(Inl
a
"l a
1 1-fi1 ^1))} Cr
that is 1 = a u / / and 0(u) - u{l - t (u)} = In- /(2ca).
(6.1)
Since the left-hand side of (6.1) is not larger than ^ (0) = 1 / . = 0.3989 and approaches to 0 as u -+ oo , we have a unique solution for 1 if 2ca / " > 1/0.3989 = 2.5069, but if 2ca / " < 2.5069, we should put I = 0. In case of unknown scale a , let us consider the standardized loss
L( ,6,0) =
and the class of interval estimators of the type [ , 0] with 0 = X - IS and 0 = X + IS , where c and t are constants, and S = E"=1(X; - X)2/ n . Then we can show that E[L(0 ,0, 0)]
=
2ncEr(nt2S2
t 2a
2ctS
(...JtS)l
.,In- Cr a
+ 1) I1 -^
(,I'ntS)} ( 6 . 2)
504
17 where the expectation is taken with respect to the distribution of S. Differentiating (6.2) with respect to t, we have
E[LL,6
2 2aS ( ../6 tS)}
nc
2
J tS
- 2InS
Hence, for optimum t, we obtain (S l 2c f ^S (/tS ) E(a )
= n EL a
a
ntS2
{
}
( \/tS) ]
a2
.
(6.3)
Since S / a is distributed independently of a , both sides of (6.3) is independent of a , hence t can be determined from the equation (6.3) dependent on c .
REFERENCES FISHER, R.A. (1930). Inverse probability, Proc. Camb. Phil. Soc., 26, 528-535. FRASER, D.A.S. (1961). The fiducial method and invariance, Biometrika 48, 261-280. NEYMAN, J. (1937). Outline of a theory of statistical estimation based on the classical theory of probability, Phil. Trans. Roy. Soc., A236, 333-380. WALD, A. (1971). Statistical Decision Functions, Chelsea Pub. Comp., Bronx, New York. Interval estimation with varying confidence levels SUMMARY
Usually confidence interval is defined as an interval with preassigned confidence level I - a for all the value of parameters. More generally, however we may consider interval estimation procedures with confidence coefficient varying according to the value of the unknown parameter, and associated procedure to estimate the actual level. Such a consideration leads to more general procedures including conditional procedures given the ancillary.
505
18 Stima per intervallo con livelli di confidenza variabili RIASSUNTO
Un intervallo di confidenza a di solito definito come un intervallo con un prefissato livello di confidenza 1- a per tutti i valori del parametro. Piu in generale , pero, si possono considerare metodi di lima per intervallo con un livello di confidenza variabile in funzione del valore del parametro incognito , a cui sono associate procedure per stimare 1'effettivo livello di confidenza. Questa considerazione porta a procedimenti di carattere piu generale , the includono it condizionamento rispetto a statistiche ancillari.
KEY WORDS AND PHRASES
Confidence interval; confidence level; ancillary statistic ; invariance; loss function.
506 Stat. Sci. & Data Anal., pp. 375-382 K. Matsusita et al. (Eds) © VSP 1993
Second Order Asymptotic Bound for the Variance of Estimators for the Double Exponential Distribution MASAFUMI AKAHIRA and KEI TAKEUCHI Institute of Mathematics, University of Tsukuba, Tsukuba, Ibaraki 305, Japan Research Center for Advanced Science and Technology, University of Tokyo 4-6-1 Komaba, Meguro-ku, Tokyo 156, Japan
Abstract. The Bhattacharyya type bound for the variance of unbiased estimators of a location parameter of the double exponential distribution is obtained, and also the loss of information of the maximum likelihood estimator based on the distribution rounded off is discussed.
Key words: Bhattacharyya type bound, unbiased estimator, loss of information, maximum likelihood estimator.
1. INTRODUCTION
The double exponential distribution with unknown location parameter case is first order regular but second order non-regular, in the sense that the density admits the first order differentiability with respect to the parameter but not the second order. From that property it follows that the first order asymptotic theory of regular estimation can be applied but in the second order, the situation becomes non-regular. For example, R.A. Fisher, already in the paper [1], noted that the loss of information of the maximum likelihood estimator is of order V n instead of constant order as in the regular case. In Akahira and Takeuchi [2], [3], Akahira [4], Sugiura and Naing [5] among others, the second order (or more precisely next to the first order) properties of the maximum likelihood estimator and other estimators were discussed. In this paper we shall discuss the problem from a different point of view, and shed lights on the situation. Indee .!, we shall obtain the Bhattacharyya type bound for the variance of unbiased estimators of
507 376
M. Akahira and K. Takeuchi
a location parameter of the double exponential distribution and the loss of information of the maximum likelihood estimator based on the distribution rounded off.
2. THE BHATTACHARYYA TYPE BOUND FOR THE VARIANCE OF UNBIASED ESTIMATORS Suppose that Xi,.. . , X. are independent and identically distributed random variables with a double exponential ( two-sided exponential ) density f (x - 0) _ (1/2) exp(- Ix - 01) (-oo < x < oo ; -oo < 9 < oo). Then, for an unbiased estimator 9 = 9(X1, ... , X,) of the location parameter 9, we have the Cramer-Rao bound, i.e. Ve(9) > 1/nI(9) = 1/n, where V9( • ) denotes the variance and
I(0) = Es [{(8/89) log f (X
- 9)} 2]
= Ee [{sgn(X - 9)}2J = 1.
Since f (x) is not twice differentiable, we can not further differentiate log f (x), hence the Bhattacharyya bound can not be obtained. However, we can obtain a similar bound as follows. Define Z1(9) = 1 E sgn(Xi - 0). n i=1
Then, for an unbiased estimator 9 of 0, we have I
= i=1 i=1
n
where x = (x1i ... , xn). Indeed, we obtain a
J...J^(x).....
n
n
fl f( x . - 9) rj dxi = 1,
^=1 i=1 since the differentiation under the integral sign is allowed. Hence 11 f (Xi n 7-^ 1 = r ... f B(x) a9 - 9) ll dxi log fJ f (Xi - 0) i=1 i=1 i=1
J
n a
r
i=1 i=1 i=1
_ /... r JZl (9)e(x ) fl f (Xi - 0) [J dxi, J
J
i=1
i=1
(2 .1)
508 Second order asymptotic bound 377 which implies that (2.1) holds. From (2 . 1) it follows that
.. f zl ( e + oe)e(X)
f (xi
- B - AO) Ij dxi
n
from which we have
... J { Zl(9 +L9)_Zl(9) +,/zl (e)} e(X) f j f(x- e) fl dx= 0( 1). (2.2) i AO Indeed, from ( 2.1) and (2.2) we have
0 = f ... f z (e + ,)i(X) II f (x; - e - oe) - z, (e)e(X) II f( x; - e) } jI dx; i=1 i=1 :_1 1
= f f {zl (B + AO) - zl (e)} e(X) jI f( xi - e) rj dxi i =1
i= 1
+ oe f f zl (e + oe)8(X) ae jI f (x; - e) 1 =1
+ AO f . f zl (e')8(x)
= f • • • f { z (9 + l
AO)
ae
dxi
+ o(AO)
i=1
t f(x; - e' + AO) ft dx, + o(AO) =1
i=1
- zl(e)} e(X) II f( Xi - e) jI dxi i=1
+ oe f ... f zl (e')e(X)
i=1
a fl A xi - e') fl dxi + o(O 9)
= f i =1
i= 1
a
n
n
n
+ AO f ... f zl (e)B(X) { To log fl f (Xi - B) [J f (xi - 9) [f dxi + o(AO) i=1
i=1
=f
i=1
; i=1 f (Xi
+ AO f ... f ^B(X)z^ (e)
i=1
- e) 11 dxi + of°e>, i=1 i=1
which implies that (2.2) holds. Putting, for AO = O(11v/n-),
z2(e) = Z1(e
+ Q9) - z1(e ) +./ n-Z (0),
509 M. Akahira and K. Takeuchi
378
we have from (2.2)
f...JZ2(9)^(x)llf(x. - 9) ll dx, = o(1).
(2.3)
Note that Z2(9) depends on 09, but for the sake of simplicity we omit AO from the expression. Then it follows from (2.1) and (2.3) that the Bhattacharyya type bound for the variance of unbiased estimators 6 of 9 is given by Ve(9) ?
I"
(0)/n,
(2.4)
where I11(9)/n is the ( 1,1)-element of the matrix nEe[Zi (9)] J Ee[Z1(9) Z2(9)] 1 C / Ee[Z1(9) Z2(9)1 Ee[Z22(9)1 ) that is, (2.5)
111(9) = [E9 [Z2(e)] - {Ee[Z1 ( 9)Z2(9)]}2 /Ee[Z22 (9)1 ] (see, e. g., Zacks [6]). Since 1E sgn(Xi - 9) = 1 Z1(9) = f i=1
x;
(say),
i=1
it follows that 2
Ee[Zi (9)1 = EB
[{
7E
=x: }
=n
Ee[X,2] = 1.
(2.6)
Since Z2 (9)
= n
1 {sgn(Xi
=109
7= E
= I
y,
- 9 - 09) - sgn(Xi - 9)} + fZ1 (9)
+ Vfn-Z2(B) (say)
i=1
and Ee[Y] = -1 + (09/2) + o(09), Ee[Y2] = (2/M9) - 1 + o(1), Ee[X;Yi] = -1 + (09/2) + o(09) (i = 1, ... , n), it follows that Ee[Z1( 9)Z2(9 )] = Es
L fi x; n i -1
1 [ n
= nEe
C^Y \
X Y + i=1
=-l+ 2e +
+ ^Z^ (e)I J
n i -1
xix; Xk
XiY, + nEe i #j
o(09).
i
l
k
(2.7)
510 379
Second order asymptotic bound Since
1,78 I Y,)2 = nEe[Y;2]
]
=n2+
12 ] E's [z(9)Y
n(n -1 ) { Ee(Y,)}2
09 +°( E9)
,
lEe Xi2+ XiX^Yk
i=1
n
i=1
i#
j
k=1
n08 = nEe [Yi] = -n + 2 + o(n09), n
4
J ^
Ee [Zi (e)] = z Ee (E X; = z EB n
\i=1
n
X;2Xk2
Xj4 +3
i=1
i#
k
=g{n+3n ( n-1)} =3- 2, n it follows that Ee[Zz (9)] = n +o9 +2n (-
n
1+ 2B)+
= 2n + o(n).
( -n 3
) +oI nJ
From (2 .4) to (2 . 8) we have the following theorem. Theorem.
The Bhattacharyya type bound for the variance of unbiased estimators
B of 9 is given by n n 2n 00)), Vare (,fn-9)? (1+2n+ O(n))•
This gives the second order bound for the variance of unbiased estimators of 9. But the second term of the right-hand side of (2.9) is of order n-1, while, for all estimators thus far discussed, the second order in the asymptotic variance is of order n-1/2, and it is a situation still to be investigated and explained. The bound (2.9) may not be sharp. The related results can be found in Akahira and Takeuchi [3].
511
380
M. Akahira and K. Takeuchi
3. THE LOSS OF INFORMATION OF THE ESTIMATOR BASED ON THE DISTRIBUTION ROUNDED OFF In a practical situation, all data are given only up to some decimal unit, hence all "real" distributions are not strictly continuous but are discrete distributions up to rounding off. And rounding off incurs loss of information, which tends to zero as the width of the rounding goes to zero, so the continuous distribution can be considered as the limiting case , hence a practical approximation for the actual situation where the width of round-off is small enough. In the regular case we do not need to go beyond the above consideration, but in a non-regular case, we must take another aspect into consideration. When the data is rounded, the class of distributions is no longer irregular, admitting differentiation with respect to the parameter as many times as we want. Hence the "asymptotic deficiency" of the maximum likelihood estimator (MLE) and other estimators becomes of order 0(1), but not of higher order as in the limiting non-regular case. Therefore, if we restrict our attention to the class of the MLE and other similar regular types of estimators, these must be optimum "compromise" for the width of rounding between the loss of information in the data due to rounding and the loss of information (asymptotic deficiency) due to estimators. In this section we round off the double exponential distribution, consider the loss of information by the distribution and the loss of information of the MLE, and deal with their sum as the loss of information of the MLE based on the distribution. In the case when the density is given by f(x - 9) = (1/2)exp(-Ix - 91) (-oo < x < oo; -00 < 9 < oo), we consider to round off the density as follows. Without loss of generality we assume that 0 < 9 < h. For any integer j we define p,(9) by (.i+1)h
P,i (9)
=jhf f( x - 9)dx.
Since P0(6) = (2 - e-0 - ee-h) /2, (3.1) it follows that
= ( e-e - ee-h ) /2 = (h - 29)/2 + o(h), -g e8 -h ) /2 = -1 + (h/2) + o(h). PO (9) = - ( e Pu(9)
(3.2) (3.3)
512 381
Second order asymptotic bound
Then it follows from (3.1 ) to (3.3) that the amount Ih(O) of information , Jh(O) and Mh(O) on the distribution rounded off are defined and calculated by
a
Ih(e) _ . {logPi (o)}2 Pi(0) = L, {logj(o)}2Pj(o) + {
_
E
logPo(o)}2Po(o)
Pi(e) + {po(e) /Po(9 )} 2po(9) = 1 - 29
i#o
Jh(e) {
49
loPi (o)} {
z
(
1-
h / + o(h),
(3.4)
. logPj (o) } Pi(e)
a logPo ( 9)} {logPo (o)} Po(o) _ 1 +0(1), h
az
2
Mh(8) {.logPi(o) + Ih(9)} Pi(B) J12
02 2
(e)pi( 0)
+ { 892logPo(o) +Ih(9)} PO (0)
i#o
= Ih(e)(1- Po ( e)) + {(PI (0)/Po (e)) - (P o(9)/po (e))2 + Ih(9 ) } 2 Po (9) (3.6)
= -1 + (2/h) + o( 1).
Since the amount I(9) of information on the double exponential density f( x -9) is equal to 1, the loss of information by rounding off is defined by n{I(9) - Ih ( 9)}, and the loss Dh(O) of information ( asymptotic deficiency) of the MLE by rounding off is also given by {Ih (9)Mh(9) - Jh(9)}/Ih(9 ) ( see Akahira [7]). Hence the loss of information of the MLE based on the distribution rounded off is given by
n{I(θ) − I_h(θ)} + D_h(θ) = nI(θ) − {nI_h(θ) − D_h(θ)}.   (3.7)

In order to find the h minimizing this, we seek the h maximizing the average value of the second bracket in (3.7), i.e.

(1/h) ∫_0^h {nI_h(θ) − D_h(θ)} dθ,   (3.8)
since θ is considered to be random in the interval [0, h). From (3.4), (3.5) and (3.6) we have

D_h(θ) = {I_h(θ)M_h(θ) − J_h(θ)}/I_h(θ) = (2/h){1 − 2θ(1 − θ/h)} + o(1),

hence

nI_h(θ) − D_h(θ) = (n − 2/h){1 − 2θ(1 − θ/h)} + o(1).

Then we obtain

(1/h) ∫_0^h {nI_h(θ) − D_h(θ)} dθ = (n − 2/h)(1 − h/3) + o(1).
Hence the value of h maximizing (3.8) is seen to be h = √(6/n), and the corresponding maximum value of (3.8) is n − (2√(6n)/3) + (2/3). Now, in the second term, a quantity of order √n appears, unlike the regular case where it would be of constant order, and the coefficient is still not satisfactorily explained in more general contexts.
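The optimization above is easy to check numerically. The following sketch (the function names are ours) encodes the averaged retained information (n − 2/h)(1 − h/3) and confirms that a grid search reproduces the closed-form maximizer h = √(6/n):

```python
import math

def avg_retained_info(h, n):
    # Average over theta in [0, h) of n*I_h(theta) - D_h(theta),
    # using the leading-term expression (n - 2/h)(1 - h/3) derived above.
    return (n - 2.0 / h) * (1.0 - h / 3.0)

def optimal_h(n):
    # Setting the derivative 2/h**2 - n/3 to zero gives h = sqrt(6/n).
    return math.sqrt(6.0 / n)

n = 100
h_star = optimal_h(n)
# Brute-force maximization over a fine grid of rounding widths.
h_grid = max((k * 1e-4 for k in range(1, 20000)),
             key=lambda h: avg_retained_info(h, n))
print(h_star, h_grid)
```

At h = √(6/n) the maximum value agrees with n − 2√(6n)/3 + 2/3, so the loss relative to the continuous model is indeed of order √n.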
REFERENCES
1. R. A. Fisher, Proc. Cambridge Philos. Soc., 22, 700-725 (1925).
2. M. Akahira and K. Takeuchi, Asymptotic Efficiency of Statistical Estimators: Concepts and Higher Order Asymptotic Efficiency, Lecture Notes in Statistics 7, Springer, New York (1981).
3. M. Akahira and K. Takeuchi, Austral. J. Statist., 32, 281-291 (1990).
4. M. Akahira, Ann. Inst. Statist. Math., 40, 311-328 (1988).
5. N. Sugiura and M. T. Naing, Comm. Statist. A - Theory Methods, 18, 541-554 (1989).
6. S. Zacks, The Theory of Statistical Inference, Wiley, New York (1971).
7. M. Akahira, The Structure of Asymptotic Deficiency of Estimators, Queen's Papers in Pure and Applied Mathematics 75, Queen's University Press, Kingston, Ontario, Canada (1986).
Statistica Neerlandica (1993) Vol. 47, nr. 3, pp. 221-223
On the application of the Minkowski-Farkas theorem to sampling designs M. Akahira Institute of Mathematics University of Tsukuba, Ibaraki 305, Japan
K. Takeuchi Research Center for Advanced Science and Technology University of Tokyo, 4-6-1 Komaba, Meguro-ku, Tokyo 156, Japan
In this paper we consider the following problem: let {π_k} be a sequence satisfying 0 < π_k < 1 (k = 1, ..., N) and Σ_{k=1}^N π_k = n. Does there exist an unordered sampling design such that, for each k = 1, ..., N, the inclusion probability of unit k is equal to π_k? It is shown that the problem can be solved by a straightforward application of the Minkowski-Farkas theorem.
Key words & Phrases: finite population, label, Minkowski-Farkas theorem, sampling design, unit, inclusion probability.
1 Introduction
In Chapter 1 of the book by CASSEL et al. (1977), discussing finite population inference in survey sampling, the basic problem of determining a sampling design attaining given inclusion probabilities is posed. A finite population is a collection of N units, where N < ∞. We use the label k to represent the physically existing unit U_k, talk about U_k as "the unit k", and thus denote the population by P = {1, ..., k, ..., N}. A nonempty set s with s ⊂ P is called an unordered sample. We assume that the number of elements of s is equal to n, which is called the effective sample size. The set of all sets s of effective sample size n will be denoted by S. The number of elements of S is equal to the binomial coefficient NCn, which is denoted by M, and we denote the elements by s_1, ..., s_M. Let C_k = {s : k ∈ s} (k = 1, ..., N). If a function p(s) on S satisfies p(s) ≥ 0 for all s ∈ S and Σ_{s∈S} p(s) = 1, then p(s) is called an unordered sampling design.
Our problem is the following: let {π_k} be a sequence satisfying 0 < π_k < 1 (k = 1, ..., N) and Σ_{k=1}^N π_k = n. Is there an unordered sampling design p(s) such that Σ_{s∈C_k} p(s) = π_k (k = 1, ..., N)? Note that Σ_{s∈C_k} p(s) is called the inclusion probability of unit k. As a result related to this problem, a simple method to achieve the desired π_k (k = 1, ..., N) is given by MADOW (1949) (see also CASSEL et al., 1977). It is also shown in HANURAV (1966) that if a suitable drawing mechanism exists then the
existence of a design is established. Further, DUPACOVA (1979) answers the question of the existence of a design with given inclusion probabilities affirmatively. A related discussion is given by HAJEK (1981). Here, however, the existence is proved simply by applying the Minkowski-Farkas theorem, as shown in the following section.
2 The existence theorem of sampling designs
First, we state the Minkowski-Farkas theorem, which solves the above problem.

LEMMA (Minkowski-Farkas). Let A be an N × M matrix. In order that for an N-dimensional vector π there exists a nonnegative M-dimensional vector p such that Ap = π, it is necessary and sufficient that, for any N-dimensional vector y with A′y ≥ 0, it holds that y′π ≥ 0, where ′ denotes transposition and 0 denotes the zero vector.

The proof is omitted since this is a well-known theorem (e.g. see NIKAIDO, 1968). Using the above lemma we have the following theorem.
THEOREM. Let {π_k} be a sequence satisfying 0 < π_k < 1 (k = 1, ..., N) and Σ_{k=1}^N π_k = n. Then there exists an unordered sampling design p(s) such that

Σ_{s∈C_k} p(s) = π_k   (k = 1, ..., N).   (1)
PROOF. Since Σ_{i=1}^M p(s_i) = 1, we may represent a sampling design with (1) as a nonnegative M-dimensional vector p = (p(s_1), ..., p(s_M))′ satisfying Ap = π, where A is an (N + 1) × M matrix and π is an (N + 1)-dimensional vector given by

    A = ( a_11 a_12 ... a_1M
          ...
          a_N1 a_N2 ... a_NM
          1    1    ... 1    ),     π = (π_1, ..., π_N, 1)′,

where a_ki = 1 for s_i ∈ C_k and a_ki = 0 for s_i ∉ C_k. From the Lemma it follows that there exists a solution p(s) with (1) if, for any (N + 1)-dimensional vector y such that

A′y ≥ 0,   (2)

it holds that y′π ≥ 0. If we write y = (y_1, ..., y_N, z)′ with y_1 ≤ ... ≤ y_N, then (2) implies that Σ_{k=1}^N a_ki y_k + z ≥ 0 for i = 1, ..., M. Since z ≥ −min_i Σ_{k=1}^N a_ki y_k = −Σ_{k=1}^n y_k, we have
y′π = Σ_{k=1}^N π_k y_k + z ≥ Σ_{k=1}^N π_k y_k − Σ_{k=1}^n y_k
    = Σ_{k=n+1}^N π_k y_k − Σ_{k=1}^n (1 − π_k) y_k
    ≥ Σ_{k=n+1}^N π_k y_n − Σ_{k=1}^n (1 − π_k) y_n = 0.
This completes the proof.
❑
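The existence argument is constructive in the sense that a design can be computed by linear programming over the M = NCn subsets. The following sketch (assuming SciPy is available; the function name is ours) finds a design p(s) matching given inclusion probabilities by solving exactly the system Ap = π used in the proof:

```python
from itertools import combinations
import numpy as np
from scipy.optimize import linprog

def sampling_design(pi, n):
    # Find a nonnegative p over all n-subsets with A p = (pi_1,...,pi_N, 1)',
    # the Minkowski-Farkas system of the proof above.
    N = len(pi)
    subsets = list(combinations(range(N), n))
    M = len(subsets)
    A = np.zeros((N + 1, M))
    for i, s in enumerate(subsets):
        for k in s:
            A[k, i] = 1.0   # a_ki = 1 when unit k lies in subset s_i
        A[N, i] = 1.0       # last row: the probabilities sum to one
    b = np.append(np.asarray(pi, dtype=float), 1.0)
    # Pure feasibility problem: any feasible vertex will do, so c = 0.
    res = linprog(c=np.zeros(M), A_eq=A, b_eq=b, bounds=(0, None))
    if not res.success:
        raise ValueError("no feasible design found")
    return dict(zip(subsets, res.x))

design = sampling_design([0.9, 0.8, 0.7, 0.4, 0.2], n=3)
```

Summing the returned probabilities over the subsets containing unit k recovers π_k, as guaranteed by the Theorem.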
References CASSEL, C. M., C. E. SARNDAL and J. H. WRETMAN (1977), Foundations of inference in survey sampling, Wiley, New York.
DUPACOVA, J. (1979), A note on rejective sampling, in: Contributions to Statistics (Hajek Memorial Volume), Academia, Prague, 71-78.
HAJEK, J. (1981), Sampling from a finite population, Marcel Dekker, New York.
HANURAV, T. V. (1966), Some aspects of unified sampling theory, Sankhya A 28, 175-204.
MADOW, W. G. (1949), On the theory of systematic sampling II, Annals of Mathematical Statistics 20, 333-354.
NIKAIDO, H. (1968), Convex structures and economic theory, Academic Press, New York.
Received: March 1991, revised: February 1992.
COMMUN. STATIST.-SIMULA., 26(3), 1103-1128 (1997)
RANDOMIZED CONFIDENCE INTERVALS OF A PARAMETER FOR A FAMILY OF DISCRETE EXPONENTIAL TYPE DISTRIBUTIONS
Masafumi Akahira, Kei Takeuchi and Kunihiko Takahashi
Institute of Mathematics, University of Tsukuba, Ibaraki 305, Japan
Faculty of International Studies, Meiji-Gakuin University, Kamikuramachi 1598, Totsuka-ku, Yokohama 244, Japan
Key Words and Phrases: Randomized test; Exponential type distribution; Randomized confidence interval; Edgeworth expansion.
ABSTRACT For a family of one-parameter discrete exponential type distributions, the higher order approximation of randomized confidence intervals derived from the optimum test is discussed. Indeed, it is shown that they can be asymptotically constructed by means of the Edgeworth expansion. The usefulness is seen from the numerical results in the case of Poisson and binomial distributions.
Copyright © 1997 by Marcel Dekker, Inc.

1. INTRODUCTION
For discrete distributions it is usually impossible to obtain a non-randomized test or confidence interval with a given size, and the actual size is often quite different from the prescribed level. Randomized procedures, although quite satisfactory in theory, are not easily acceptable to practitioners. However, there is still something to be said for randomized procedures. Although practitioners, and even theoreticians in actual applications, may well be quite reluctant to apply randomized tests, randomized confidence intervals may not seem so objectionable if they are properly formulated. It should be noted that there are (infinitely) many ways to construct a randomized confidence interval from a given randomized test, and it is necessary to choose an intuitively desirable method. Suppose that our statistic T takes integer values and the "optimum" test for the real parameter θ = θ_0 (in some sense or other) is given by the test function φ of the form
φ(T) = 1 for T < c_1(θ_0) or T > c_2(θ_0),
        γ_1(θ_0) for T = c_1(θ_0),
        γ_2(θ_0) for T = c_2(θ_0),
        0 for c_1(θ_0) < T < c_2(θ_0),

where c_1(θ_0) and c_2(θ_0) are integers, 0 ≤ γ_i(θ_0) < 1 (i = 1, 2), and all are increasing in θ_0 (see, e.g., Lehmann (1986)). Then the test is equivalent to the non-randomized test φ* based on Y := T + U, where U is a random variable distributed uniformly over (0, 1) and independent of T, and φ* is given by
φ*(Y) = 1 for Y < c_1(θ_0) + γ_1(θ_0) or Y ≥ c_2(θ_0) − γ_2(θ_0) + 1,
         0 otherwise.

We write y(θ_0) := c_1(θ_0) + γ_1(θ_0) and ȳ(θ_0) := c_2(θ_0) − γ_2(θ_0) + 1. Then y(θ) and ȳ(θ) are both monotone increasing in θ, and we may assume that they are strictly monotone and continuous. This yields that a randomized confidence interval [θ̲(Y), θ̄(Y)] can be obtained directly from the equations ȳ(θ̲) = Y and y(θ̄) = Y.
Such a confidence interval has the property that it depends on the random variable U and moves to the left or right accordingly. Such confidence intervals, we should think, do not look so objectionable. This situation surely obtains for discrete exponential type distributions (including the Poisson, binomial, negative binomial, etc.) (see, e.g., Kendall, Stuart and Ord (1994), Johnson, Kotz and Kemp (1992), Molenaar (1973), Takeuchi and Fujino (1981)). The randomized confidence interval is approximated up to higher order by means of the Edgeworth expansion for a family of one-parameter exponential type distributions. It is also seen from numerical results that the approximation is very useful in the case of Poisson and binomial distributions.
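The equivalence between the randomized test on T and the non-randomized test on Y = T + U can be made concrete: for each integer t, the conditional rejection probability given T = t is the same under both procedures. A small sketch (the function names and cutoff values are our own illustration):

```python
def randomized_reject_prob(t, c1, c2, g1, g2):
    # Rejection probability of the randomized test at integer T = t.
    if t < c1 or t > c2:
        return 1.0
    if t == c1:
        return g1
    if t == c2:
        return g2
    return 0.0

def smoothed_reject_prob(t, c1, c2, g1, g2):
    # P_U{Y < c1 + g1 or Y >= c2 - g2 + 1} for Y = t + U, U ~ Uniform(0, 1).
    lo, hi = c1 + g1, c2 - g2 + 1
    below = min(max(lo - t, 0.0), 1.0)        # measure of {u: t + u < lo}
    above = min(max(t + 1 - hi, 0.0), 1.0)    # measure of {u: t + u >= hi}
    return min(below + above, 1.0)

# For every integer t the two rejection probabilities coincide, so the
# test based on Y reproduces the randomized test on T exactly.
```

This is precisely why the boundary functions y and ȳ built from Y carry all the information of the original randomized test.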
2. CONFIDENCE INTERVALS OF A PARAMETER FOR DISCRETE EXPONENTIAL TYPE DISTRIBUTIONS
Suppose that X_1, ..., X_n are independent and identically distributed nonnegative integer-valued random variables according to a one-parameter exponential type distribution with discrete density function f(x, θ) = h(x) exp{η(θ)x − C(θ)} for x = 0, 1, 2, ... and θ ∈ R¹, where h(x), η(θ) and C(θ) are real-valued functions. For r = 1, 2, ..., we denote by κ_r(θ) the r-th cumulant of the distribution. In particular, we denote κ_1(θ) and κ_2(θ) by μ(θ) and σ²(θ), respectively. Letting T = Σ_{i=1}^n X_i, we standardize as
Z_n = (T − nμ(θ))/(√n σ(θ)).

Then the r-th cumulant of Z_n is given by n^{−(r−2)/2} β_r(θ), where β_r(θ) = κ_r(θ)/{σ(θ)}^r for r = 3, 4, .... Hence the Edgeworth expansion of the distribution of T is given by
P_θ{T ≤ x} = Φ(z) − φ(z) [ (β_3(θ)/(6√n))(z² − 1) + (β_4(θ)/(24n))(z³ − 3z)
             + (β_3²(θ)/(72n))(z⁵ − 10z³ + 15z) − (1/(24nσ²(θ))) z ] + o(1/n),

where z = {x + (1/2) − nμ(θ)}/{√n σ(θ)}, Φ(z) = ∫_{−∞}^z φ(t) dt and φ(t) = (1/√(2π)) e^{−t²/2}. Then we consider the problem of testing the hypothesis H: θ = θ_0 against the alternative hypothesis K: θ ≠ θ_0. For this problem we obtain the uniformly most powerful unbiased randomized test function φ given by

φ(T) = 1 for T < t_1 or T > t_2,
        u_i for T = t_i (i = 1, 2),   (2.1)
        0 for t_1 < T < t_2,
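As a numerical check of the Edgeworth expansion above, the following sketch evaluates it in the Poisson case, where μ(θ) = σ²(θ) = θ, β_3 = θ^{−1/2} and β_4 = θ^{−1}, and compares it with the exact distribution of T (the function names are ours):

```python
import math

def phi(t):
    return math.exp(-t * t / 2.0) / math.sqrt(2.0 * math.pi)

def Phi(t):
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def edgeworth_poisson_cdf(x, theta, n):
    # P_theta{T <= x} for T = X_1 + ... + X_n, X_i ~ Poisson(theta),
    # using the expansion with continuity-corrected z = (x + 1/2 - n*mu)/(sqrt(n)*sigma).
    mu, sig = theta, math.sqrt(theta)
    b3, b4 = 1.0 / math.sqrt(theta), 1.0 / theta
    z = (x + 0.5 - n * mu) / (math.sqrt(n) * sig)
    corr = (b3 / (6 * math.sqrt(n))) * (z**2 - 1) \
         + (b4 / (24 * n)) * (z**3 - 3 * z) \
         + (b3**2 / (72 * n)) * (z**5 - 10 * z**3 + 15 * z) \
         - z / (24 * n * sig**2)
    return Phi(z) - phi(z) * corr

def exact_poisson_cdf(x, lam):
    # P{T <= x} for T ~ Poisson(lam), summing the density directly.
    s, term = 0.0, math.exp(-lam)
    for k in range(int(x) + 1):
        s += term
        term *= lam / (k + 1)
    return s
```

For nθ around 20 the two functions already agree to a few parts in a thousand over the central range of x.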
where t_1, t_2, u_1 and u_2 are determined from the equations

E_{θ_0}[φ(T)] = α,   E_{θ_0}[Tφ(T)] = nαμ(θ_0).   (2.2)

Suppose that U is uniformly distributed on the interval [0, 1] and independent of T. Letting Y := T + U, we can regard (2.1) as a function of Y:
φ*(Y) = 1 for Y < t_1 + u_1 or Y ≥ t_2 − u_2 + 1,
         0 for t_1 + u_1 ≤ Y < t_2 − u_2 + 1.

Taking into account the adjustment arising because U is uniformly distributed on the interval [0, 1], we put y(θ_0) := t_1 + u_1 − (1/2) and ȳ(θ_0) := t_2 − u_2 + (1/2). Then we get a randomized confidence interval [θ̲(Y), θ̄(Y)] of θ at level 1 − α by solving the equations ȳ(θ̲) = Y and y(θ̄) = Y.

Theorem. The functions y(θ) and ȳ(θ) are approximated up to the third order, i.e. with error o(1/n), as
y(θ) = nμ + √n σ(−u_{α/2} + Δ_1),
ȳ(θ) = nμ + √n σ(u_{α/2} + Δ_2),   (2.3)
521 RANDOMIZED CONFIDENCE INTERVALS 1107 where 2 1
72n 6^3u2' 2 + 032
( 4U3/2 - 15U,,/2)
1 04 {12u,(1 - ui) - 1} ua/2 / 2 - 3U,,/ 2) - 241 nCr2 24n(ua {u2(1-u2)-ui(1 - ul)}+o(1) (2.4)
1
4nwua/2 2 /^ _ h'3 2 /32 3 A2
6 -u
/2 -
72n (
4u
12
- 15U,,/2)
+ n(u«/2 - 3u,,/2 ) + 24 4no2ua/2
n
1 24no2 {12u2(1 - u2) - 1} ua/2
{u2(1 - u2) - U1 ( I - ui)} + 0 ( n) (2.5)
with u_α the upper 100α percentile of the standard normal distribution. The proof is given in Section 4.

Corollary 1. A randomized confidence interval [θ̲(Y), θ̄(Y)] of θ at level 1 − α is approximated up to the third order by the solutions θ̄(Y) and θ̲(Y) of the equations

nμ + √n σ(−u_{α/2} + Δ_1) = Y

and

nμ + √n σ(u_{α/2} + Δ_2) = Y,

respectively, where Δ_1 and Δ_2 are given by (2.4) and (2.5). The proof is straightforward from the Theorem.
Corollary 2. The first and second order approximations of the functions y(θ) and ȳ(θ) are given by

y(θ) = nμ − √n σ u_{α/2} + o(√n),
ȳ(θ) = nμ + √n σ u_{α/2} + o(√n),   (2.6)

and

y(θ) = nμ − √n σ u_{α/2} + (1/6) σ β_3 u²_{α/2} + O(1),
ȳ(θ) = nμ + √n σ u_{α/2} + (1/6) σ β_3 u²_{α/2} + O(1),   (2.7)

respectively.
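Corollary 2 is easy to evaluate in the Poisson case: since β_3 = 1/√λ, the second order term (1/6)σβ_3u²_{α/2} shifts both boundaries by the same amount. A sketch (the function names are ours):

```python
import math

def poisson_moments(lam):
    # For Poisson(lam): mu = lam, sigma = sqrt(lam),
    # beta_3 = kappa_3 / sigma^3 = 1/sqrt(lam).
    return lam, math.sqrt(lam), 1.0 / math.sqrt(lam)

def boundaries(lam, n, u):
    # First-order (2.6)-type and second-order (2.7)-type approximations
    # of the boundaries y and y-bar for T = sum of n Poisson(lam) observations.
    mu, sigma, beta3 = poisson_moments(lam)
    lower1 = n * mu - math.sqrt(n) * sigma * u
    upper1 = n * mu + math.sqrt(n) * sigma * u
    corr = sigma * beta3 * u * u / 6.0   # skewness shifts both limits the same way
    return (lower1, upper1), (lower1 + corr, upper1 + corr)
```

For a right-skewed distribution the correction is positive, so both acceptance boundaries move to the right relative to the symmetric normal approximation.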
522 1108 AKAHIRA , TAKAHASHI, AND TAKEUCHI The proof is straightfoward from ( 2.3), (2.4 ) and (2.5). Remark The third order approximations (2.3) of y(0) and W( 0) are computed in the following way. Let Oo be any fixed in R . We can determine positive integers t1 and t2 so that Eeo [O*( Y)] = a, since 0 < u, < 1 for i = 1, 2. Substituting them for (2.3), we obtain solutions ul and u2 from the equations (2.3) with y ( Oo)=tl + ul-(1/2) and W( 0o)=i2 - u2+(1/2 ), and get the third order approximations (2.3) of y( Oo) and Y(Oo). In order to compare the approximations of y(0 ) and W(0), we also consider two versions (I) and (11) of the third order approximation (2.3) as follows.
y(0) = nµ + do, ( -ua/2 + A ol ), (I)
Y(0)
= nµ + /o'(ua/2 + OZ),
where

Δ_1° = (β_3/(6√n)) u²_{α/2} + (β_3²/(72n))(4u³_{α/2} − 15u_{α/2}) − (β_4/(24n))(u³_{α/2} − 3u_{α/2}),

Δ_2° = (β_3/(6√n)) u²_{α/2} − (β_3²/(72n))(4u³_{α/2} − 15u_{α/2}) + (β_4/(24n))(u³_{α/2} − 3u_{α/2}).
(II)  y(θ) = nμ + √n σ(−u_{α/2} + Δ_1),
      ȳ(θ) = nμ + √n σ(u_{α/2} + Δ_2),   (2.9)

where

Δ_1 = Δ_1° + (1/(24nσ²)){12u_1(1 − u_1) − 1} u_{α/2} − (1/(4nσ²u_{α/2})){u_2(1 − u_2) − u_1(1 − u_1)} + o(1/n),

Δ_2 = Δ_2° + (1/(24nσ²)){12u_2(1 − u_2) − 1} u_{α/2} − (1/(4nσ²u_{α/2})){u_2(1 − u_2) − u_1(1 − u_1)} + o(1/n),

with u_1 and u_2 obtained as the solutions of the equations (2.7) with y(θ) = t_1 + u_1 − (1/2) and ȳ(θ) = t_2 − u_2 + (1/2).
523 RANDOMIZED CONFIDENCE INTERVALS 1109 The numerical comparison of the approximations (2.3), (2.6), (2.7), (2.8) and (2.9) are obtained in Section 3 where the underlying distribution is Poisson and binomial. Using the approximations (2.3) and (2.8) we can also obtain randomized confidence intervals which are derived from y(B) = Y and V(12) = Y. Indeed, we get numerical results in the case of Poisson and binomial distributions in Section 3.
3. RANDOMIZED CONFIDENCE INTERVALS IN THE CASE OF POISSON AND BINOMIAL DISTRIBUTIONS
In this section we consider the Poisson distribution with density function

f_1(x, λ) = e^{−λ} λ^x / x!   (x = 0, 1, 2, ...; λ > 0)

and the binomial distribution with density function

f_2(x, p) = nCx p^x (1 − p)^{n−x}   (x = 0, 1, ..., n; 0 < p < 1).
In the Poisson case with λ = n, the errors of the approximations (2.3), (2.6), (2.7), (2.8) and (2.9) are given for n = 1, 2, ..., 40 and α = 0.05 in Tables 3.1 and 3.2, where the true values of t_1 + u_1 − (1/2) and t_2 − u_2 + (1/2) are derived from (2.2). From the tables it is seen that the third order approximations (2.3) and (2.9) are nearer to the true values than the others. In particular, the approximation (2.9) may be recommended, since the computation of (2.9) is easier than that of (2.3). From Corollary 1 we can asymptotically obtain a randomized confidence interval [λ̲, λ̄] of λ at level 1 − α from the solutions λ̲ and λ̄ of the equations

Y = λ + √λ (u_{α/2} + Δ_1′),   (3.1)

Y = λ + √λ (−u_{α/2} + Δ_2′),   (3.2)

respectively, where

Δ_1′ = (1/(6√λ)) u²_{α/2} − (1/(72λ))(4u³_{α/2} − 15u_{α/2}) + (1/(24λ))(u³_{α/2} − 3u_{α/2}) + u_{α/2}/(24λ),   (3.3)

Δ_2′ = (1/(6√λ)) u²_{α/2} + (1/(72λ))(4u³_{α/2} − 15u_{α/2}) − (1/(24λ))(u³_{α/2} − 3u_{α/2}) − u_{α/2}/(24λ).   (3.4)

We also have a randomized confidence interval [λ̲_0, λ̄_0] of λ at level 1 − α from the solutions λ̲_0 and λ̄_0 of the equations (3.1) and (3.2) with Δ_1° and Δ_2° instead of Δ_1′ and Δ_2′, respectively, i.e. when one disregards the final terms in Δ_1′ and Δ_2′. The graphs of [λ̲, λ̄] and [λ̲_0, λ̄_0] are given in Figures 3.1 and 3.2, respectively; they are very close to the acceptance region [t_1 + u_1 − (1/2), t_2 − u_2 + (1/2)]. Hence they are useful in practice. In the binomial case, the errors of the approximations (2.3), (2.6), (2.7), (2.8) and (2.9) are given for n = 5 to 50 in steps of 5, n = 50 to 100 in steps of 10, p = 0.1 to 0.5 in steps of 0.1, and α = 0.05 in Tables 3.3 to 3.12, where the true values of t_1 + u_1 − (1/2) and t_2 − u_2 + (1/2) are derived from (2.2). From the tables it is seen that the third order approximations (2.3) and (2.9) are nearer to the true values than the others. In particular, the approximation (2.9) may be recommended, since the computation of (2.9) is easier than that of (2.3). From Corollary 1 we can asymptotically obtain a randomized confidence interval [p̲, p̄] of p at level 1 − α from the solutions of the equations
Y = np + √(np(1 − p)) (−u_{α/2} + Δ_1° + u_{α/2}/(24np(1 − p))),   (3.5)

Y = np + √(np(1 − p)) (u_{α/2} + Δ_2° − u_{α/2}/(24np(1 − p))),   (3.6)

where Δ_1° and Δ_2° are given in the Remark, with β_3 = (q − p)/√(pq) and β_4 = (1 − 6pq)/(pq), q = 1 − p. We also have a randomized confidence interval [p̲_0, p̄_0] of p at level 1 − α from the solutions p̲_0 and p̄_0 of the equations (3.5) and (3.6) with Δ_1° and Δ_2° instead of Δ_1° + u_{α/2}/(24np(1 − p)) and Δ_2° − u_{α/2}/(24np(1 − p)), respectively. The graphs of [p̲, p̄] and [p̲_0, p̄_0] are given in Figures 3.3, 3.4, 3.5 and 3.6, for n = 20, 30; they are very close to the acceptance region [t_1 + u_1 − (1/2), t_2 − u_2 + (1/2)]. Hence they are useful in practice.
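A confidence limit such as p̄ can be computed from an equation like (3.5) by bisection in p, since the boundary is monotone in p over the range of interest. The sketch below uses the third order quantities as written above; the bracket [0.05, 0.95] and the function names are our own assumptions for moderate p:

```python
import math

def delta1_0(p, n, u):
    # (2.8)-type third-order correction in the binomial case; beta_3 and
    # beta_4 are the standardized cumulants of a single Bernoulli(p) trial.
    q = 1.0 - p
    b3 = (q - p) / math.sqrt(p * q)
    b4 = (1.0 - 6.0 * p * q) / (p * q)
    rn = math.sqrt(n)
    return (b3 / (6 * rn)) * u**2 + (b3**2 / (72 * n)) * (4 * u**3 - 15 * u) \
           - (b4 / (24 * n)) * (u**3 - 3 * u)

def lower_boundary(p, n, u):
    # The lower acceptance boundary y(p), as in the (3.5)-type equation.
    s = math.sqrt(n * p * (1.0 - p))
    return n * p + s * (-u + delta1_0(p, n, u) + u / (24 * n * p * (1 - p)))

def upper_limit(y_obs, n, u, lo=0.05, hi=0.95):
    # Solve y(p) = Y for p by bisection; y(p) is increasing on [lo, hi]
    # for moderate p, so the bracket is maintained at every step.
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if lower_boundary(mid, n, u) < y_obs:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For example, with Y = 8 observed out of n = 20 trials and u_{α/2} = 1.96, the upper confidence limit comes out somewhat above the point estimate 0.4, as expected.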
TABLE 3.1. The errors of the approximations of y(λ) for α = 0.05 in the Poisson case.
λ | t1 | t2 | true value | 1st approx. (2.6) | 2nd approx. (2.7) | 3rd approx. (2.3) | 3rd approx. (2.8) | 3rd approx. (2.9)
-0.1232 0.6662 1.1755
---0.5861 -0.5581
-0.1223 0.0541 0.0821
0.0037 0.0008 -0.0054
-0.0884 0.0247 0.0559
0.0238 -0.0112 0.0081
2 3 4 5
0 0 1 1
6 7 9 10
6
2
11
1.7708
-0.5717
0.0685
0.0062
0.0444
-0.0030
7 8 9 10
2 3 4 4
13 14 16 17
2.4522 3.0191 3.7151 4.4340
-0.6378 -0.5627 -0.5950 -0.6319
0.0025 0.0775 0.0453 0.0083
-0.0211 0.0018 0.0033 -0.0140
-0.0197 0.0568 0.0257 -0.0103
-0.0134 0.0034 -0.0027 -0.0081
11
5
18
5.0763
-0.5767
0.0635
-0.0005
0.0458
0.0023
12 13 14 15 16 17 18 19 20 21 22 23
6 7 7 8 9 10 10 11 12 13 13 14
19 21 22 23 24 26 27 28 29 31 32 33
5.8001 6.5644 7.2689 7.9934 8.7618 9.5530 10.2943 11.0480 11.8336 12.6389 13.4408 14.2042
-0.5896 -0.6311 -0.6024 -0.5843 -0.6016 -0.6341 -0.6097 -0.5913 -0.5988 -0.6206 -0.6338 -0.6038
0.0506 0.0091 0.0378 0.0560 0.0386 0.0061 0.0305 0.0490 0.0414 0.0197 0.0064 0.0364
0.0033 -0.0044 -0.0057 0.0011 0.0024 -0.0039 -0.0050 -0.0001 0.0020 -0.0009 -0.0062 -0.0025
0.0337 -0.0072 0.0221 0.0408 0.0239 -0.0081 0.0167 0.0355 0.0283 0.0069 -0.0061 0.0242
-0.0012 -0.0064 -0.0001 0.0009 -0.0011 -0.0052 -0.0006 0.0008 -0.0006 -0.0031 -0.0038 0.0004
24
15
34
14.9945
-0.5963
0.0439
0.0006
0.0319
0.0004
25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
16 17 17 18 19 20 21 22 22 23 24 25 26 27 27
35 37 38 39 40 41 43 44 45 46 47 48 50 51 52
15.8059 16.6298 17.4515 18.2387 19.0495 19.8695 20.7050 21.5485 22.3655 23.1801 24.0087 24.8492 25.6981 26.5533 27.3882
-0.6057 -0.6237 -0.6358 -0.6098 -0.6006 -0.6046 -0.6176 -0.6357 -0.6246 -0.6085 -0.6040 -0.6090 -0.6201 -0.6353 -0.6281
0.0345 0.0166 0.0045 0.0304 0.0396 0.0356 0.0226 0.0045 0.0156 0.0317 0.0363 0.0313 0.0202 0.0049 0.0121
0.0016 -0.0010 -0.0051 -0.0025 0.0000 0.0012 0.0004 -0.0023 -0.0035 -0.0015 0.0002 0.0011 0.0003 -0.0019 -0.0031
0.0228 0.0050 -0.0068 0.0193 0.0287 0.0249 0.0121 -0.0059 0.0054 0.0216 0.0263 0.0215 0.0105 -0.0046 0.0027
-0.0006 -0.0027 --0.0034 0.0001 0.0004 -0.0003 -0.0013 -0.0029 -0.0013 0.0003 0.0002 -0.0003 -0.0006 -0.0025 -0.0013
40
28
53
28.2172 -0.6131
0.0272
-0.0016
0.0179
0.0001
TABLE 3.2. The errors of the approximations of ȳ(λ) for α = 0.05 in the Poisson case.
λ | t1 | t2 | true value | 1st approx. (2.6) | 2nd approx. (2.7) | 3rd approx. (2.3) | 3rd approx. (2.8) | 3rd approx. (2.9)
3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
0 0 1 1 2 2 3 4 4 5 6 7 7 8 9 10 10 11 12 13 13
6 7 9 10 11 13 14 16 17 18 19 21 22 23 24 26 27 28 29 31 32
5 . 5735 7.1729 8 . 6403 10 . 1244 11.4788 12. 8924 14.2559 15 .5476 16.8953 18.2041 19.4546 20 .7479 22.0312 23.2786 24 .4918 25 .7571 27 .0055 28 .2288 29.4269 30 .6463 31 .8709
-0.8017 -0.7781 -0.7204 -0.7418 -0.6779 -0.7068 -0.7123 -0.6677 -0.6974 -0.7036 -0.6651 -0.6812 -0.6977 -0.6877 -0.6520 -0.6760 -0.6901 -0.6855 -0.6617 -0.6646 -0.6779
-0.1615 -0.1379 -0.0801 -0.1016 -0.0377 -0.0666 -0.0721 -0.0275 -0.0571 -0.0634 -0.0249 -0.0409 -0.0575 -0.0475 -0.0117 -0.0357 -0.0499 -0.0453 -0.0215 -0.0244 -0.0376
-0.1287 -0.0221 -0.0465 -0.0090 -0.0222 0 . 0050 -0.0115 -0.0203 0.0032 -0.0067 -0.0107 -0.0014 -0.0007 -0.0064 -0.0079 -0.0004 0.0000 -0.0043 -0.0063 -0.0033 0 . 0016
-0.1199 -0.1042 --0.0508 -0.0753 -0.0137 -0.0444 -0.0513 -0.0079 -0.0385 -0.0457 -0.0079 -0.0246 -0.0418 -0.0323 0 .0030 -0.0215 -0.0360 -0.0318 -0.0083 - 0.0116 -0.0251
--0.0147 -0.0565 -0.0062 -0.0016 -0.0024 -0.0041 -0.0213 -0.0018 -0.0023 -0.0070 -0.0061 -0.0015 -0.0020 -0.0066 -0.0039 -0.0011 -0.0012 -0.0034 -0.0058 -0.0008
23
14
33
33 . 0851
-0.6945
-0.0452
- 0.0012
-0.0330
-0.0007
24 25 26 27
15 16 17 17
34 35 37 38
34 . 2783 35 .4542 36 .6565 37 .8577
-0.6765 -0.6544 -0.6626 -0.6734
-0.0363 -0.0142 -0.0224 -0.0332
-0.0040 -0.0051 -0.0023 0.0014
-0.0243 -0.0024 -0.0109 -0.0219
-0.0012 -0.0034 -0.0043 -0.0006
28
18
39
39 . 0522
-0.6811
-0.0408
-0.0006
-0.0297
-0.0006
29 30 31 32 33 34 35
19 20 21 22 22 23 24
40 41 43 44 45 46 47
40.2307 41.3957 42.5651 43.7528 44.9388 46 . 1055 47.2657
-0.6760 -0.6606 -0.6525 -0.6656 -0.6747 -0.6771 -0.6704
-0.0358 -0.0203 -0.0123 -0.0254 -0.0344 -0.0368 -0.0302
-0.0028 -0.0040 -0.0038 0.0001 0.0007 -0.0010 -0.0026
-0.0248 -0.0096 -0.0017 -0.0150 -0.0242 -0.0267 -0.0202
-0.0007 -0.0019 -0.0047 -0.0018 -0.0005 -0.0004 -0.0007
36
25
48
48 .4158
-0.6560
-0.0158
-0.0034
-0.0060
-0.0018
37 38 39 40
26 27 27 28
50 51 52 53
49 .5740 50.7451 51 .9110 53 .0702
-0.6520 -0.6631 -0.6711 -0.6743
-0.0118 -0.0229 -0.0308 -0.0341
-0.0030 0.0000 0 . 0007 -0.0006
-0.0021 -0.0133 -0.0214 -0.0248
-0.0076 -0.0015 -0.0005 -0.0003
* Note that t2 = 5 is taken in (2.3) for λ = 2.
FIGURE 3.1. The graphs of the acceptance region [t1 + u1 − (1/2), t2 − u2 + (1/2)] (solid) and the randomized confidence interval [λ̲, λ̄] (dashed) for the Poisson distribution.

FIGURE 3.2. The graphs of the acceptance region [t1 + u1 − (1/2), t2 − u2 + (1/2)] (solid) and the randomized confidence interval [λ̲_0, λ̄_0] (dashed) for the Poisson distribution.
TABLE 3.3. The errors of the approximations of y(p) for p = 0.1 and α = 0.05 in the binomial case.
n | t1 | t2 | true value | 1st approx. (2.6) | 2nd approx. (2.7) | 3rd approx. (2.3) | 3rd approx. (2.8) | 3rd approx. (2.9)
6
--
--
( 0.0723 )
---
( 0.0554)
--
7 7 8 9
0 .2089 0 .5485 0.7268 0 . 9703
--0.5271 -0.4456 -0.4146
0.0828 -0.0149 0 .0666 0.0976
-0.0090 --0.0049 0.0051 0.0043
0 .0674 -0.0291 0 . 0533 0 . 0850
0.0177 0.0007 -0.0100 0.0059
1
10
1 . 3243
--0.4820
0 . 0302
-0.0158
0.0183
- 0.0016
2 3 3
11 12 14
1 .8782 2 .5866 3 . 2030
-0.4327 -0.5061 -0.4621
0 .0795 0 .0061 0 .0501
0.0044 -0.0011 -0.0066
0 . 0686 -0.0039 0 .0407
-0.0008 -0.0032 0.0026
15
3.8709
-0.4490
0 . 0632
0.0027
0 . 0543
-0.0014
16
4 . 6176
-0.4975
0 .0147
0.0002
0.0063
-0.0032
ti
t2
5 10 15 20
0 0 0 0
3 4 4 5
25
0
30 35 40 45
0 1 1 1
50
60 70 80 90
4
100
5
3rd approx. (2.9)
- -
TABLE 3.4. The errors of the approximations of ȳ(p) for p = 0.1 and α = 0.05 in the binomial case.
n | t1 | t2 | true value | 1st approx. (2.6) | 2nd approx. (2.7) | 3rd approx. (2.3) | 3rd approx. (2.8) | 3rd approx. (2.9)
*5 *10 15 20 25 30 35 40 45 50 60 70 80 90 100
0 0 0 0 0 0 1 1 1 1 2 3 3 4 5
3 4 4 5 6 7 7 8 9 10 11 12 14 15 16
2.5473 3.5666 4.4486 5.3145 6.1109 6.8258 7.4928 8.3172 9.0633 9.7170 11.1542 12.4455 13.8292 15.1582 16.4114
-0.7620 -0.7325 -0.7072 -0.6713 -0.6849 -0.6710 -0.6053 -0.5142 -0.5984 -0.6190 -0.5593 -0.5997 -0.5260 -0.5701 -0.5801 -0.5315
-0.2498 -0.2203 -0.1950 -0.1591 -0.1727 -0.1588 -0.0931 -0.0020 -0.0862 -0.1068 -0.0471 -0.0875 -0.0138 -0.0579 -0.0679 -0.0193
-0.0696 -0.1583 -0.1629 -0.0988 -0.0737 -0.0364 0.0035 -0.0187 -0.0290 -0.0141 0.0050 -0.0143 -0.0114 0 .0014 -0.0090 -0.0087
-0.1657 -0.1827 -0.1684 -0.1374 -0.1539 -0.1419 -0 .0777 0.0122 -0.0729 -0.0942 -0.0352 -0.0766 -0.0038 -0.0485 -0.0590 -0.0109
---0.0220 -0.0271 -0.0137 -0.0142 -0.0096 -0.0081 -0.0104 -0.0073 -0.0042 -0.0059
* Note that t2 = 2 and t2 = 3 are taken for n = 5 and n = 10, respectively, in (2.3).
TABLE 3.5. The errors of the approximations of y(p) for p = 0.2 and α = 0.05 in the binomial case.
n | t1 | t2 | true value | 1st approx. (2.6) | 2nd approx. (2.7) | 3rd approx. (2.3) | 3rd approx. (2.8) | 3rd approx. (2.9)
--
--0.0273 0.0782 -0.0398 0.0707 0.0394 -0.0133
---0.0100 0.0054 -0.0105 0.0012 0.0018 -0.0088
--0.0278 0.0786 -0.0495 0.0711 0.0398 -0.0130
3 rd approx. ( 2.9 )
---0.0035 -0.0047 0.0072 0.0057 -0.0060 -0.0105
5 10 15 20 *25 30 35 40
0 0 0 1 2 2 3 3
3 5 6 8 9 11 12 13
0.3205 0.7999 1.5040 2.0194 2.7066 3.4391
--0.3060 -0.4239 -0.3135 -0.3447 -0.3974
45
4
15
4 . 0719
-0.3310
0 .0531
-0.0006
0 . 0534
0.0036
50 60 70 80 90 100
5 6 8 9 11 13
16 18 21 23 26 28
4.7940 6.2934 7.7867 9.3670 10.9061 12.5501
-0.3376 -0.3661 -0.3460 -0.3792 -0.3436 -0.3899
0.0465 0.0180 0.0382 0.0050 0.0405 -0.0058
0.0017 -0.0028 0.0011 -0.0032 0.0003 -0.0011
0.0468 0.0183 0.0384 0.0052 0.0404 -0.0056
-0.0030 0.0009 -0.0025 -0.0019 -0.0006 -0.0006
* Note that t1=1 is taken in (2.3) for n=25.
TABLE 3.6. The errors of the approximations of ȳ(p) for p = 0.2 and α = 0.05 in the binomial case.
n | t1 | t2 | true value | 1st approx. (2.6) | 2nd approx. (2.7) | 3rd approx. (2.3) | 3rd approx. (2.8) | 3rd approx. (2.9)
40
1
0
1
1.4500
-0.4660
-0 . 08 1 9
-0.0
5
0
3
3.3516
-0.5986
-0.2144
-0.1104
-0.2153
---
10 15 20 25 30 35 40
0 0 1 2 2 3 3
5 6 8 9 11 12 13
5.0395 6.4285 7.9946 9.3304 10.7246 12.0932 13.3582
-0.5603 -0.3921 -0.4885 -0.4105 -0.4305 -0.4551 -0.3998
-0.1762 -0.0080 -0.1044 -0.0263 -0.0464 -0.0709 -0.0157
-0.0517 -0.0082 -0.0165 -0.0056 -0.0022 -0.0094 -0.0033
-0.1768 -0.0085 -0.1048 -0.0267 -0.0468 -0.0713 -0.0160
---0.0104 -0.0203 -0.0028 -0.0130 -0.0059 -0.0001
45
4
15
14.6699
-0.4108
-0.0266
-0.0012
-0.0269
-0.0072
50 60 70 80
5 6 8 9
16 18 21 23
15.9878 18.4532 20 . 9933 23.4012
-0.4442 -0.3805 -0.4340 -0.3890
--0.0601 0 .0037 -0.0499 -0.0049
-0.0046 -0.0031 -0.0031 -0.0021
-0.0603 0 .0034 -0.0501 -0.0051
-0.0053 -0.0053 -0.0033 -0.0018
90
11
26
25 . 8608
-0.4233
-0.0391
-0.0010
-0.0394
-0.0035
100
13
28
28.2475
-0.4077
-0.0235
-0.0022
-0.0237
-0.0002
TABLE 3.7. The errors of the approximations of y(p) for p = 0.3 and α = 0.05 in the binomial case.
n | t1 | t2 | true value | 1st approx. (2.6) | 2nd approx. (2.7) | 3rd approx. (2.3) | 3rd approx. (2.8) | 3rd approx. (2.9)
5 10 15 20 25 30 35 40 45 50 60 70 80 90 100
0 0 1 2 3 4 5 7 8 9 11 14 16 19 21
4 6 8 10 12 14 16 18 20 22 25 29 32 36 39
0.4578 1.2334 2.1928 3.2299 4.3222 5.4632 6.5798 7.7084 8.8690 11.2817 13.7220 16.1995 18.7186 21.2584
--0.2981 -0.2120 -0.2095 -0.2207 -0.2417 -0.2768 -0.2603 -0.2335 -0.2200 -0.2389 -0.2366 -0.2329 -0.2394 -0.2401
--0.0420 0.0441 0.0466 0.0354 0.0144 -0.0207 --0.0042 0.0226 0.0361 0.0172 0.0195 0.0232 0.0167 0.0160
--0.0324 -0.0068 -0.0029 -0.0042 -0.0068 -0.0081 -0.0015 0.0007 0.0020 -0.0026 0.0005 -0.0009 0.0003 -0.0013
--0.0285 0.0551 0.0561 0.0439 0.0222 -0.0135 0.0025 0.0289 0.0421 0.0227 0.0246 0.0279 0.0212 0.0203
-0.0387 0.0040 0.0048 0.0022 -0.0028 -0.0120 -0.0007 -0.0021 0.0002 --0.0002 -0.0014 0.0010 -0.0012 0.0002
TABLE 3.8. The errors of the approximations of ȳ(p) for p = 0.3 and α = 0.05 in the binomial case.
n | t1 | t2 | true value | 1st approx. (2.6) | 2nd approx. (2.7) | 3rd approx. (2.3) | 3rd approx. (2.8) | 3rd approx. (2.9)
5 10 15 20 25 30 35 40 45 50 60 70 80 90 100
0 1 2 4 5 7 8 10 12 13 17 20 24 27 31
4 7 10 12 15 17 20 22 25 27 32 36 41 45 50
1.4500 3.9458 6.1639 8.2807 10.3077 12.2816 14.2139 16.1087 17.9790 19.8158 21.6133 25.2386 28.7968 32.3057 35.7999 39.2561
-0.4374 -0.3236 -0.3021 -0.2910 -0.2908 -0.2944 -0.2951 -0.2985 -0.2907 -0.2623 -0.2814 -0.2822 -0.2723 -0.2791 -0.2744
-0.1813 -0.0675 -0.0460 -0.0349 -0.0347 -0.0383 -0.0390 -0.0424 -0.0346 -0.0062 -0.0253 -0.0261 -0.0162 -0.0230 -0.0183
-0.0647 0.0000 -0.0051 -0.0050 -0.0035 -0.0014 0 . 0006 -0.0007 -0.0008 -0.0023 -0.0011 -0.0004 -0.0015 -0.0001 -0.0009
-0.2003 -0.0810 -0.0570 --0.0444 -0.0432 -0.0461 -0.0462 -0.0491 -0.0409 -0.0122 -0.0308 -0.0312 -0.0209 -0.0275 -0.0226
0.0110 0.0031 0.0008 0.0015 0.0027 0.0033 -0.0019 -0.0041 -0.0028 0.0011 -0.0025 0.0000 -0.0018 0.0004
TABLE 3.9. The errors of the approximations of y(p) for p = 0.4 and α = 0.05 in the binomial case.
n | t1 | t2 | true value | 1st approx. (2.6) | 2nd approx. (2.7) | 3rd approx. (2.3) | 3rd approx. (2.8) | 3rd approx. (2.9)
-1.0034 2.4375
-0.0398 -0.1563
-0.0883 -0.0282
-0.0068 -0.0187
(0.0090) 0.1081 -0.0121
-0.0095 -0.0213
12
3.7917
-0.0858
0.0423
0.0074
0.0563
0.0028
15 17 20 22 25 27 32 36 41
5.3145 6.8302 8.4735 10.0134 11.6788 13.3310 16.6831 20.0640 23.5517
-0.1154 -0.0893 -0.1540 -0.0861 -0.1199 -0.1205 -0.1206 -0.0974 -0.1398
0.0127 0.0387 -0.0260 0.0419 -0.0918 0.0075 0.0074 0.0306 -0.0118
-0.0090 0.0050 -0.0064 0.0014 0.0019 -0.0084 0.0015 0.0032 -0.0007
0.0252 0.0501 -0.0154 0.0518 0.0175 0.0164 0.0155 0.0381 -0.0048
-0.0040 0.0019 -0.0112 0.0016 0.0004 -0.0023 0.0003 0.0009 0.0015
27
45
26.9914
-0.1005
0.0276
0.0008
0.0342
0.0006
31
50
30.5388
-0.1406
-0.0126
-0.0007
-0.0063
0.0013
t1
t2
true value
5 10 15
0 1 2
4 7 10
20
4
25 30 35 40 45 50 60 70 80
5 7 8 10 12 13 17 20 24
90
100
n
TABLE 3.10. The errors of the approximations of ȳ(p) for p = 0.4 and α = 0.05 in the binomial case.
n | t1 | t2 | true value | 1st approx. (2.6) | 2nd approx. (2.7) | 3rd approx. (2.3) | 3rd approx. (2.8) | 3rd approx. (2.9)
1 5 10 15 20 25
0 0 1 2 4 5
1 4 7 10 12 15
4.3411 7.2323 9.8911 12.4210 14.9742
0.4107 -0.1941 -0.1959 -0.1723 -0.1269 -0.1733
-0.0660 -0.0679 -0.0443 0.0011 -0.0452
4 -0.0245 -0.0096 0.0094 -0.0073 0.0039
0.4762 -0.0940 -0.0877 -0.0604 -0.0129 -0.0578
--0.0022 0.0038 -0.0093 0.0016
30
7
17
17.3929
-0.1338
-0.0057
-0.0053
-0.0171
-0.0047
35 40 45 50 60 70 80 90
8 10 12 13 17 20 24 27
20 22 25 27 32 36 41 45
19.8326 22.2320 24.5560 26.9491 31.5537 36.1878 40.7248 45.2554
-0.1521 -0.1593 -0.1149 -0.1596 -0.1162 -0.1544 -0.1367 -0.1463
-0.0240 -0.0312 0.0131 -0.0316 0.0119 -0.0263 -0.0086 -0.0183
0.0043 -0.0029 -0.0021 0.0022 -0.0016 -0.0013 0.0012 -0.0016
-0.0346 -0.0411 0.0038 -0.0404 0.0038 -0.0338 -0.0156 -0.0249
0.0021 0.0004 0.0015 0.0009 0.0012 0.0004 -0.0002 0.0000
100
31
50
49.7385
-0.1367
-0.0086
0.0011
-0.0149
-0.0001
TABLE 3.11. The errors of the approximations of y(p) for p = 0.5 and α = 0.05 in the binomial case.
n | t1 | t2 | true value | 1st approx. (2.6) | 2nd approx. (2.7) | 3rd approx. (2.3) | 3rd approx. (2.8) | 3rd approx. (2.9)
5 10 15 20 25 30
0 2 4 6 8 10
5 8 11 14 17 20
0.3000 1.8244 3.6782 5.6165 7.6042 9.6291
0.0087 0.0766 0.0264 0.0009 -0.0041 0.0033
0.0087 0.0766 0.0264 0.0009 -0.0041 0.0033
-0.0533 0.0117 0.0092 0.0058 0.0044 0.0043
0.0394 0.0984 0.0441 0.0163 0.0096 0.0159
-0.0231 0.0011 0.0039 0.0074 0.0070 0.0046
35
12
23
11.6861
0.0163
0.0163
0.0044
0.0279
0.0020
40 45 50 60 70 80 90 100
14 16 18 22 27 31 36 40
26 29 32 38 43 49 54 60
13.7730 15.8891 18.0351 22.4229 26.7791 31.2185 35.6933 40.1824
0.0291 0.0370 0.0354 -0.0138 0.0218 0.0163 0.0098 0.0178
0.0291 0.0370 0.0354 -0.0138 0.0218 0.0163 0.0098 0.0178
0.0037 0.0020 -0.0008 -0.0023 0.0022 -0.0023 0.0018 -0.0016
0.0399 0.0472 0.0451 -0.0049 0.0300 0.0240 0.0170 0.0247
0.0004 0.0001 0.0003 -0.0047 0.0003 -0.0005 0.0008 -0.0001
TABLE 3.12. The errors of the approximations of V(p) for p = 0.5 and α = 0.05 in the binomial case.

   n   t1   t2   true value   1st approx.(2.6)   2nd approx.(2.7)   3rd approx.(2.3)   3rd approx.(2.8)   3rd approx.(2.9)
   1         1      1.4500      0.0300                               -0.0406            -0.0387
   5    0    5      4.7000     -0.0087            -0.0087             0.0533            -0.0394            -0.0231
  10    2    8      8.1756     -0.0766            -0.0766            -0.0117            -0.0984            -0.0011
  15    4   11     11.3218     -0.0264            -0.0264            -0.0092            -0.0441            -0.0039
  20    6   14     14.3835     -0.0009            -0.0009            -0.0058            -0.0163            -0.0074
  25    8   17     17.3958      0.0041             0.0041            -0.0044            -0.0096            -0.0070
  30   10   20     20.2709     -0.0033            -0.0033            -0.0043            -0.0159            -0.0046
  35   12   23     23.3139     -0.0163            -0.0163            -0.0044            -0.0279            -0.0020
  40   14   26     26.2270     -0.0291            -0.0291            -0.0037            -0.0399            -0.0004
  45   16   29     29.1109     -0.0370            -0.0370            -0.0020            -0.0472            -0.0001
  50   18   32     31.9649     -0.0354            -0.0354             0.0008            -0.0451            -0.0003
  60   22   38     37.5771      0.0138             0.0138             0.0023             0.0049             0.0047
  70   27   43     43.2209     -0.0218            -0.0218            -0.0022            -0.0300            -0.0003
  80   31   49     48.7815     -0.0161            -0.0161             0.0023            -0.0240             0.0005
  90   36   54     54.3067     -0.0098            -0.0098            -0.0018            -0.0170            -0.0008
 100   40   60     59.8176     -0.0178            -0.0178             0.0016            -0.0247             0.0001
FIGURE 3.3. The graphs of the acceptance region [t1 + u1 − (1/2), t2 − u2 + (1/2)] and the randomized confidence interval [p̲, p̄] for the binomial distribution when n = 20 (solid line: the acceptance region; dashed line: the randomized confidence interval).
FIGURE 3.4. The graphs of the acceptance region [t1 + u1 − (1/2), t2 − u2 + (1/2)] and the randomized confidence interval [p̲0, p̄0] for the binomial distribution when n = 20 (solid line: the acceptance region; dashed line: the randomized confidence interval).
FIGURE 3.5. The graphs of the acceptance region [t1 + u1 − (1/2), t2 − u2 + (1/2)] and the randomized confidence interval [p̲, p̄] for the binomial distribution when n = 30 (solid line: the acceptance region; dashed line: the randomized confidence interval).
FIGURE 3.6. The graphs of the acceptance region [t1 + u1 − (1/2), t2 − u2 + (1/2)] and the randomized confidence interval [p̲0, p̄0] for the binomial distribution when n = 30 (solid line: the acceptance region; dashed line: the randomized confidence interval).
RANDOMIZED CONFIDENCE INTERVALS

4. PROOF

In this section we give a proof of the Theorem. Now, letting t be a nonnegative integer and 0 ≤ u < 1, we define, for y = t + u,

  Fn(y) := Pθ{Tn ≤ t − 1} + u Pθ{Tn = t},
  Mn(y) := Σ_{x=0}^{t−1} x Pθ{Tn = x} + u t Pθ{Tn = t},

respectively. Then the condition (2.2) turns out to be

  Fn(t2 − u2 + 1) − Fn(t1 + u1) = 1 − α,
  Mn(t2 − u2 + 1) − Mn(t1 + u1) = nμ(θ0)(1 − α).
For simplicity we denote μ(θ0), σ(θ0) and β_j(θ0) by μ, σ and β_j, respectively. If n is large and |t − nμ|/(√n σ) is small, then we have

  Fn(y) = (1 − u)[Φ(z′) − φ(z′){(β3/(6√n))(z′^2 − 1) + (β4/(24n))(z′^3 − 3z′) + (β3^2/(72n))(z′^5 − 10z′^3 + 15z′) − z′/(24nσ^2)}]
      + u[Φ(z) − φ(z){(β3/(6√n))(z^2 − 1) + (β4/(24n))(z^3 − 3z) + (β3^2/(72n))(z^5 − 10z^3 + 15z) − z/(24nσ^2)}] + o(1/n),  (4.1)

where z′ = (t − (1/2) − nμ)/(√n σ) and z = (t + (1/2) − nμ)/(√n σ). On the other hand we have

  Mn(y) − nμFn(y) = (1 − u) Σ_{x=0}^{t−1} (x − nμ) Pθ{Tn = x} + u Σ_{x=0}^{t} (x − nμ) Pθ{Tn = x}.  (4.2)
Then we obtain for large n

  Σ_{x=0}^{t} (x − nμ) Pθ{Tn = x}
   = Σ_{x=0}^{t} (x − nμ) (1/(√n σ)) φ(z_x){1 + (β3/(6√n))(z_x^3 − 3z_x) + (β4/(24n))(z_x^4 − 6z_x^2 + 3) + (β3^2/(72n))(z_x^6 − 15z_x^4 + 45z_x^2 − 15)} + o(1/√n),  (4.3)

where z_x = (x − nμ)/(√n σ). For the first term of the right-hand side of (4.3) we have, as its integral approximation,

  (1/(√n σ)) Σ_{x=0}^{t} z_x φ(z_x) = ∫_{−∞}^{z} xφ(x)dx − (1/(24nσ^2)) [d/dx (xφ(x))]_{x=z} + o(1/n)
   = −φ(z) + (1/(24nσ^2))(z^2 − 1)φ(z) + o(1/n),

where z = {t + (1/2) − nμ}/(√n σ). Since the other terms of the right-hand side of (4.3) can also be approximated by integrals, we have from (4.3)

  Σ_{x=0}^{t} (x − nμ) Pθ{Tn = x}
   = √n σ [∫_{−∞}^{z} xφ(x){1 + (β3/(6√n))(x^3 − 3x) + (β4/(24n))(x^4 − 6x^2 + 3) + (β3^2/(72n))(x^6 − 15x^4 + 45x^2 − 15)}dx + (1/(24nσ^2))(z^2 − 1)φ(z)] + o(1/√n).  (4.4)

Letting H_j(x) be the Hermite polynomials, i.e. H_j(x) = {(−d/dx)^j φ(x)}/φ(x) (j = 0, 1, 2, …), we obtain

  ∫_{−∞}^{z} xφ(x)H_j(x)dx = [−xφ(x)H_{j−1}(x)]_{−∞}^{z} + ∫_{−∞}^{z} φ(x)H_{j−1}(x)dx = −zφ(z)H_{j−1}(z) − φ(z)H_{j−2}(z)

for j = 2, 3, …. Then we have from (4.4)
  Σ_{x=0}^{t} (x − nμ) Pθ{Tn = x}
   = −√n σ φ(z){1 + (β3/(6√n))z^3 + (β4/(24n))(z^4 − 2z^2 − 1) + (β3^2/(72n))(z^6 − 9z^4 + 9z^2 + 3) − (1/(24nσ^2))(z^2 − 1)} + o(1/√n).  (4.5)
Since, for a given smooth function G and a small h,

  (1 − u)G(t) + uG(t + h) = G(t + uh) + (1/2)u(1 − u)h^2 G″(t + uh) + o(h^2),

it follows from (4.1), (4.2) and (4.5) that

  Fn(t + u) = Φ(z0) − φ(z0){(β3/(6√n))(z0^2 − 1) + (β4/(24n))(z0^3 − 3z0) + (β3^2/(72n))(z0^5 − 10z0^3 + 15z0) + (1/(24nσ^2))(12u(1 − u) − 1)z0} + o(1/n),  (4.6)

  Mn(t + u) − nμFn(t + u)
   = −√n σ φ(z0){1 + (β3/(6√n))z0^3 + (β4/(24n))(z0^4 − 2z0^2 − 1) + (β3^2/(72n))(z0^6 − 9z0^4 + 9z0^2 + 3) + (1/(24nσ^2))(12u(1 − u) − 1)(z0^2 − 1)} + o(1/√n),  (4.7)

where z0 = {t + u − (1/2) − nμ}/(√n σ). Then the condition (2.2) is represented by

  Fn(t2 + (1 − u2)) − Fn(t1 + u1) = 1 − α,  (4.8)
  Mn(t2 + (1 − u2)) − nμFn(t2 + (1 − u2)) − {Mn(t1 + u1) − nμFn(t1 + u1)} = 0.  (4.9)
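The expansion (4.6) lends itself to a quick numerical sanity check in the Bernoulli-sum case. The sketch below is not part of the paper's derivation: it assumes Tn is a binomial B(n, p) sum, so that μ = p, σ^2 = pq and the standardized third cumulant is β3 = (1 − 2p)/√(pq), and it compares the exact randomized distribution function Fn(t + u) with the continuity-corrected normal term Φ(z0) and with the first Edgeworth correction.

```python
import math

def binom_pmf(n, p, x):
    # exact binomial point probability
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

def F_exact(n, p, t, u):
    # randomized distribution function F_n(t+u) = P{T_n <= t-1} + u*P{T_n = t}
    return sum(binom_pmf(n, p, x) for x in range(t)) + u * binom_pmf(n, p, t)

def F_approx(n, p, t, u, order=2):
    # continuity-corrected normal approximation at z0 = (t+u-1/2-n*mu)/(sqrt(n)*sigma),
    # optionally adding the leading skewness correction term
    mu, sigma = p, math.sqrt(p * (1 - p))
    beta3 = (1 - 2 * p) / sigma          # standardized third cumulant of one summand
    z0 = (t + u - 0.5 - n * mu) / (math.sqrt(n) * sigma)
    phi = math.exp(-z0 * z0 / 2) / math.sqrt(2 * math.pi)
    Phi = 0.5 * (1 + math.erf(z0 / math.sqrt(2)))
    if order == 1:
        return Phi
    return Phi - phi * beta3 / (6 * math.sqrt(n)) * (z0 * z0 - 1)

n, p, t, u = 40, 0.3, 15, 0.5
exact = F_exact(n, p, t, u)
err1 = abs(F_approx(n, p, t, u, order=1) - exact)
err2 = abs(F_approx(n, p, t, u, order=2) - exact)
print(exact, err1, err2)
```

For n = 40 and p = 0.3 both approximations should agree with the exact value to within about 10^−2; the remaining terms of (4.6) are of order 1/n.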
Let

  z1 := (t1 + u1 − (1/2) − nμ)/(√n σ),
  z2 := (t2 − u2 + (1/2) − nμ)/(√n σ) = {t2 + (1 − u2) − (1/2) − nμ}/(√n σ).

Putting Δ1 := z1 + u_{α/2} and Δ2 := z2 − u_{α/2}, we see that Δ1 = O(1/√n) and Δ2 = O(1/√n), where u_α is the upper 100α percentile of the standard normal distribution. Then we have

  Φ(z1) = Φ(−u_{α/2}) + φ(u_{α/2})Δ1 + (1/2)u_{α/2}φ(u_{α/2})Δ1^2 + o(1/n) = (α/2) + φ(u_{α/2})Δ1 + (1/2)u_{α/2}φ(u_{α/2})Δ1^2 + o(1/n),
  Φ(z2) = 1 − (α/2) + φ(u_{α/2})Δ2 − (1/2)u_{α/2}φ(u_{α/2})Δ2^2 + o(1/n),
  (z1^2 − 1)φ(z1) = (u_{α/2}^2 − 1)φ(u_{α/2}) + (u_{α/2}^3 − 3u_{α/2})φ(u_{α/2})Δ1 + o(1/√n),
  (z2^2 − 1)φ(z2) = (u_{α/2}^2 − 1)φ(u_{α/2}) − (u_{α/2}^3 − 3u_{α/2})φ(u_{α/2})Δ2 + o(1/√n).
From (4.6) we have

  Fn(t2 + (1 − u2)) − Fn(t1 + u1)
   = Φ(z2) − Φ(z1) − (β3/(6√n)){(z2^2 − 1)φ(z2) − (z1^2 − 1)φ(z1)}
    − φ(z2){(β4/(24n))(z2^3 − 3z2) + (β3^2/(72n))(z2^5 − 10z2^3 + 15z2) + (1/(24nσ^2))(12u2(1 − u2) − 1)z2}
    + φ(z1){(β4/(24n))(z1^3 − 3z1) + (β3^2/(72n))(z1^5 − 10z1^3 + 15z1) + (1/(24nσ^2))(12u1(1 − u1) − 1)z1} + o(1/n)
   = 1 − α + (Δ2 − Δ1)φ(u_{α/2}) + (β3/(6√n))(Δ1 + Δ2)(u_{α/2}^3 − 3u_{α/2})φ(u_{α/2}) − (1/2)(Δ1^2 + Δ2^2)u_{α/2}φ(u_{α/2})
    − φ(u_{α/2}){(β4/(12n))(u_{α/2}^3 − 3u_{α/2}) + (β3^2/(36n))(u_{α/2}^5 − 10u_{α/2}^3 + 15u_{α/2}) + (1/(12nσ^2))(6u1(1 − u1) + 6u2(1 − u2) − 1)u_{α/2}} + o(1/n).

From (4.8) we obtain
  0 = Δ2 − Δ1 − (1/2)u_{α/2}(Δ1^2 + Δ2^2) + (β3/(6√n))(u_{α/2}^3 − 3u_{α/2})(Δ1 + Δ2)
    − (β4/(12n))(u_{α/2}^3 − 3u_{α/2}) − (β3^2/(36n))(u_{α/2}^5 − 10u_{α/2}^3 + 15u_{α/2})
    − (1/(12nσ^2)){6u1(1 − u1) + 6u2(1 − u2) − 1}u_{α/2} + o(1/n),  (4.10)

hence

  Δ2 − Δ1 = (1/2)u_{α/2}(Δ1^2 + Δ2^2) − (β3/(6√n))(u_{α/2}^3 − 3u_{α/2})(Δ1 + Δ2)
    + (β4/(12n))(u_{α/2}^3 − 3u_{α/2}) + (β3^2/(36n))(u_{α/2}^5 − 10u_{α/2}^3 + 15u_{α/2})
    + (1/(12nσ^2)){6u1(1 − u1) + 6u2(1 − u2) − 1}u_{α/2} + o(1/n).  (4.11)
From (4.7) and (4.9) we have

  0 = Mn(t2 + (1 − u2)) − nμFn(t2 + (1 − u2)) − {Mn(t1 + u1) − nμFn(t1 + u1)}
   = −√n σ [φ(z2) − φ(z1) + (β3/(6√n)){z2^3 φ(z2) − z1^3 φ(z1)} + (1/(2nσ^2))φ(u_{α/2})(u_{α/2}^2 − 1){u2(1 − u2) − u1(1 − u1)}] + o(1/√n).  (4.12)

Since

  z1^3 φ(z1) = −u_{α/2}^3 φ(u_{α/2}) − (u_{α/2}^4 − 3u_{α/2}^2)φ(u_{α/2})Δ1 + O(1/n),
  z2^3 φ(z2) = u_{α/2}^3 φ(u_{α/2}) − (u_{α/2}^4 − 3u_{α/2}^2)φ(u_{α/2})Δ2 + O(1/n),

it follows that

  z2^3 φ(z2) − z1^3 φ(z1) = 2u_{α/2}^3 φ(u_{α/2}) − (u_{α/2}^4 − 3u_{α/2}^2)φ(u_{α/2})(Δ2 − Δ1) + O(1/n).  (4.13)
Since

  φ(z2) − φ(z1) = −(Δ1 + Δ2)u_{α/2}φ(u_{α/2}) + (1/2)(Δ2^2 − Δ1^2)(u_{α/2}^2 − 1)φ(u_{α/2}) + o(1/n),

it follows from (4.12) and (4.13) that

  0 = −u_{α/2}(Δ1 + Δ2) + (1/2)(u_{α/2}^2 − 1)(Δ2^2 − Δ1^2) + (β3/(3√n))u_{α/2}^3
    − (β3/(6√n))(u_{α/2}^4 − 3u_{α/2}^2)(Δ2 − Δ1) + (1/(2nσ^2))(u_{α/2}^2 − 1){u2(1 − u2) − u1(1 − u1)} + o(1/n),

hence

  Δ1 + Δ2 = (β3/(3√n))u_{α/2}^2 + (1/2)(u_{α/2} − u_{α/2}^{−1})(Δ2^2 − Δ1^2)
    − (β3/(6√n))(u_{α/2}^3 − 3u_{α/2})(Δ2 − Δ1) + (1/(2nσ^2))(u_{α/2} − u_{α/2}^{−1}){u2(1 − u2) − u1(1 − u1)} + o(1/n).  (4.14)
From (4.11) and (4.14) we have

  Δ1 = (β3/(6√n))u_{α/2}^2 − (1/2)u_{α/2}Δ1^2 − (1/(4u_{α/2}))(Δ2^2 − Δ1^2) + (β3/(6√n))(u_{α/2}^3 − 3u_{α/2})Δ1
    − (β4/(24n))(u_{α/2}^3 − 3u_{α/2}) − (β3^2/(72n))(u_{α/2}^5 − 10u_{α/2}^3 + 15u_{α/2})
    − (1/(24nσ^2)){12u1(1 − u1) − 1}u_{α/2} − (1/(4nσ^2 u_{α/2})){u2(1 − u2) − u1(1 − u1)} + o(1/n),

  Δ2 = (β3/(6√n))u_{α/2}^2 + (1/2)u_{α/2}Δ2^2 − (1/(4u_{α/2}))(Δ2^2 − Δ1^2) − (β3/(6√n))(u_{α/2}^3 − 3u_{α/2})Δ2
    + (β4/(24n))(u_{α/2}^3 − 3u_{α/2}) + (β3^2/(72n))(u_{α/2}^5 − 10u_{α/2}^3 + 15u_{α/2})
    + (1/(24nσ^2)){12u2(1 − u2) − 1}u_{α/2} − (1/(4nσ^2 u_{α/2})){u2(1 − u2) − u1(1 − u1)} + o(1/n),

hence

  Δ1 = (β3/(6√n))u_{α/2}^2 + (β3^2/(72n))(4u_{α/2}^3 − 15u_{α/2}) − (β4/(24n))(u_{α/2}^3 − 3u_{α/2})
    − (1/(24nσ^2)){12u1(1 − u1) − 1}u_{α/2} − (1/(4nσ^2 u_{α/2})){u2(1 − u2) − u1(1 − u1)} + o(1/n),

  Δ2 = (β3/(6√n))u_{α/2}^2 − (β3^2/(72n))(4u_{α/2}^3 − 15u_{α/2}) + (β4/(24n))(u_{α/2}^3 − 3u_{α/2})
    + (1/(24nσ^2)){12u2(1 − u2) − 1}u_{α/2} − (1/(4nσ^2 u_{α/2})){u2(1 − u2) − u1(1 − u1)} + o(1/n).

Thus we complete the proof.

ACKNOWLEDGEMENTS

This research was supported in part by the Venture Business Laboratory, University of Tsukuba.
Received October, 1996.
MASAFUMI AKAHIRA (*) - KEI TAKEUCHI (**)
The existence of a test with the largest order of consistency in the case of a two-sided Gamma type distribution

CONTENTS: 1. Introduction. 2. Order of consistency in a general case. 3. The largest order of consistency in the case of a two-sided Gamma type distribution. 4. The existence of a test with the largest order n^2 of consistency. References. Summary. Riassunto. Key words and phrases.
1. INTRODUCTION
The order of consistency, i.e. the order of convergence of consistent estimators based on a random sample of size n, is discussed in Akahira (1975a, 1975b), Akahira and Takeuchi (1981, 1995) and others. In the regular case the order is √n but, in non-regular cases, the order is not always so. For example, let X1, …, Xn be independent and identically distributed (i.i.d.) real random variables with a density f0(x − θ) (with a location parameter θ) satisfying f0(x) > 0 for a < x < b, f0(x) = 0 otherwise, f0(x) ~ A′(x − a)^{α−1} as x → a + 0, and f0(x) ~ B′(b − x)^{β−1} as x → b − 0, where α ≤ β, 0 < A′ < ∞ and 0 < B′ < ∞. Then the order of consistency is n^{1/α} for 0 < α < 2, (n log n)^{1/2} for α = 2 and n^{1/2} for α > 2 (see, e.g. Akahira (1975a), Akahira and Takeuchi (1981, 1991, 1995) and Woodroofe (1972)). Hence the order is not always √n in such cases. In each case, the above is also shown to be the largest order of consistency, in the sense that there does not exist a consistent estimator with order greater than the above (see Akahira (1975a)). As one of other non-regular cases, we consider the case when the density f0(x) satisfies f0(x) ~ C|x|^{−α} as |x| → 0, where 0 < α < 1 and C is some positive constant. In the estimation problem of a location parameter θ, it is shown in Akahira (1975b) that the largest order of consistency is n^{1/(1−α)} for 0 < α < 1/2; the median of X1, …, Xn is an {n^{1/(1−α)}}-consistent estimator for 0 < α < 1, and its asymptotic distribution is also given. Related results are found in Polfeldt (1970a, 1970b), Ibragimov and Has'minskii (1981), Smith (1985, 1989) and Woodroofe (1974). In this paper we deal with the case of a two-sided Gamma type distribution with a density f0(x) satisfying f0(x) = |x|^{−1/2} g(x), where g(x) is a bounded continuous positive-valued function on R^1 − {0}, and g′(x)/g(x) is bounded on R^1. For a random sample X1, …, Xn of size n from the distribution, it is shown that the largest order of consistency is n^2. In a problem of testing the hypothesis H: θ = θ0 against K: θ = θ0 + t n^{−2}, a test with a rejection region {min_{1≤i≤n} |Xi| > k n^{−2}} is given as one with that order, where t ≠ 0 and k is a positive constant. The power function of the test is asymptotically given.

(*) Institute of Mathematics, University of Tsukuba, Ibaraki 305, Japan.
(**) Faculty of International Studies, Meiji-Gakuin University, Kamikurata-cho 1598, Totsuka-ku, Yokohama 244, Japan.
2. ORDER OF CONSISTENCY IN A GENERAL CASE
Let 𝒳 be an abstract sample space whose generic point is denoted by x, ℬ a σ-field of subsets of 𝒳, and {Pθ : θ ∈ Θ} a set of probability measures on ℬ, where Θ is called a parameter space. We assume that Θ is an open subset of Euclidean 1-space R^1. We denote by (𝒳^{(n)}, ℬ^{(n)}) the n-fold direct product of (𝒳, ℬ). For each n = 1, 2, …, the points of 𝒳^{(n)} will be denoted by x̃ = (x1, …, xn). Consider the n-fold product measure P_θ^n of Pθ. An estimator of θ is defined to be a sequence {θ̂n} of ℬ^{(n)}-measurable functions θ̂n on 𝒳^{(n)} into Θ. For simplicity we denote {θ̂n} by θ̂n. For an increasing sequence of positive numbers {cn} (cn tending to infinity) an estimator θ̂n is called {cn}-consistent if for any η ∈ Θ there exists a sufficiently small positive number δ such that

  lim_{L→∞} lim sup_{n→∞} sup_{θ: |θ−η|<δ} P_θ^n {cn |θ̂n − θ| ≥ L} = 0.
For any two points θ1 and θ2 in Θ there exists a σ-finite measure μn such that P_{θ1}^n and P_{θ2}^n are absolutely continuous with respect to μn. Then for any points θ1 and θ2 in Θ we define

  In(θ1, θ2) = ∫_{𝒳^{(n)}} (dP_{θ1}^n/dμn) log{(dP_{θ1}^n/dμn)/(dP_{θ2}^n/dμn)} dμn,

which is called the amount of the Kullback-Leibler information in (X1, …, Xn) for discriminating between P_{θ1}^n and P_{θ2}^n when P_{θ1}^n represents the true distribution. It is easily seen that, for each n, In is independent of μn. The following theorem shows that a necessary condition for the existence of a consistent estimator is that the limit of the amount of the Kullback-Leibler information is infinite.

THEOREM 2.1. Suppose that, for each n = 1, 2, …, the support {x̃ : (dP_θ^n/dμn)(x̃) > 0} does not depend on θ. If there exists a {cn}-consistent estimator, then for any θ ∈ Θ and any nonzero a

  lim_{L→∞} lim_{n→∞} In(θ, θ + aL cn^{−1}) = ∞.

The proof is omitted since it is quite similar to that of Theorem 2.2.2 in Akahira and Takeuchi (1981). The above theorem is useful to obtain the largest order of consistency.
3. THE LARGEST ORDER OF CONSISTENCY IN THE CASE OF A TWO-SIDED GAMMA TYPE DISTRIBUTION
Let 𝒳 = Θ = R^1, and we suppose that, for each θ ∈ Θ, Pθ is absolutely continuous with respect to the Lebesgue measure and constitutes a location parameter family. Then we denote the density dPθ/dx by fθ(x), with fθ(x) = f0(x − θ). Let X1, X2, …, Xn, … be a sequence of i.i.d. real random variables with the density f0(x − θ) with respect to the Lebesgue measure. We assume that f0(x) = |x|^{−1/2} g(x), where g(x) is a bounded continuous positive-valued function on R^1, symmetric around the origin, twice differentiable on R^1 − {0}, and g′(x)/g(x) is bounded on R^1. Here, for x = 0, g′(0 ± 0) is taken instead of g′(0). A distribution with the above density f0(x) is called a two-sided Gamma type distribution. For example, when g(x) = {1/(2Γ(1/2))} e^{−|x|}, f0(x) is called the density of the two-sided Gamma distribution. For θ1 and θ2 in Θ, the amount of the Kullback-Leibler (K-L) information in X1 is

  I1(θ1, θ2) = ∫ f0(x − θ1) log{f0(x − θ1)/f0(x − θ2)} dx,

and that of the K-L information In(θ1, θ2) in (X1, …, Xn) becomes nI1(θ1, θ2), since X1, …, Xn are i.i.d. random variables. Then we have the following.

LEMMA 3.1. The value of the K-L information is given by I1(0, Δ) = K√|Δ| + O(|Δ|^{3/2}) for small |Δ|, where

  K = (1/2) g(0) [∫_0^∞ {log(1 + 1/u)}{(1 + u)^{−1/2} − u^{−1/2}} du − ∫_0^1 {log((1 − u)/u)} u^{−1/2} du].
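Lemma 3.1 can be illustrated numerically. The sketch below is not from the paper: it fixes the concrete two-sided Gamma density f0(x) = |x|^{−1/2} e^{−|x|}/(2Γ(1/2)) (the example above), estimates I1(0, Δ) by Monte Carlo, and checks that the information scales like √Δ, i.e. that dividing Δ by 4 roughly halves I1(0, Δ); the constant K itself is not computed.

```python
import math
import random

SQRT_PI = math.sqrt(math.pi)

def f0(x):
    # two-sided Gamma(1/2) type density: f0(x) = |x|^(-1/2) e^(-|x|) / (2*Gamma(1/2))
    return abs(x) ** -0.5 * math.exp(-abs(x)) / (2 * SQRT_PI)

def kl_info(delta, xs):
    # Monte Carlo estimate of I1(0, delta) = E_0[log f0(X) - log f0(X - delta)]
    return sum(math.log(f0(x) / f0(x - delta)) for x in xs) / len(xs)

random.seed(1)
# |X| ~ Gamma(1/2, 1): if Z ~ N(0,1) then Z^2/2 ~ Gamma(1/2, 1); attach a random sign
xs = [random.choice((-1, 1)) * random.gauss(0, 1) ** 2 / 2 for _ in range(200_000)]

i_big, i_small = kl_info(0.04, xs), kl_info(0.01, xs)
print(i_big, i_small, i_big / i_small)  # ratio should be near sqrt(0.04/0.01) = 2
```

Using the same simulated sample for both values of Δ keeps the ratio stable; the O(|Δ|^{3/2}) remainder shows up as a small deviation of the ratio from 2.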
Proof. Let Δ be a small positive number. Since f0(x) is symmetric about x = 0, it follows that

  I1(0, Δ) = −∫_0^∞ {log f0(x + Δ) − log f0(x)}{f0(x + Δ) − f0(x)}dx + ∫_0^Δ {log f0(Δ − x) − log f0(x)}f0(x)dx
       = J1 + J2 (say).  (3.1)

Since, by the boundedness of g′(x)/g(x),

  log f0(x + Δ) − log f0(x) = −(1/2)log(1 + Δ/x) + log g(x + Δ) − log g(x)
       = −(1/2)log(1 + Δ/x) + Δ g′(x + ξΔ)/g(x + ξΔ)
       = −(1/2)log(1 + Δ/x) + O(Δ),

where 0 < ξ < 1, we have
  J1 = ∫_0^∞ {(1/2)log(1 + Δ/x) + O(Δ)}{(x + Δ)^{−1/2} g(x + Δ) − x^{−1/2} g(x)}dx
    = (1/2)∫_0^∞ {log(1 + Δ/x)}{(x + Δ)^{−1/2} g(x + Δ) − x^{−1/2} g(x)}dx
     + O(Δ)∫_0^∞ {(x + Δ)^{−1/2} g(x + Δ) − x^{−1/2} g(x)}dx.  (3.2)

First we obtain

  ∫_0^∞ {log(1 + Δ/x)}{(x + Δ)^{−1/2} g(x + Δ) − x^{−1/2} g(x)}dx
   = ∫_0^∞ {log(1 + Δ/x)}{(x + Δ)^{−1/2} − x^{−1/2}}g(x)dx + ∫_0^∞ {log(1 + Δ/x)}(x + Δ)^{−1/2}{g(x + Δ) − g(x)}dx
   = √Δ ∫_0^∞ {log(1 + 1/u)}{(1 + u)^{−1/2} − u^{−1/2}}g(Δu)du + √Δ ∫_0^∞ {log(1 + 1/u)}(1 + u)^{−1/2}{g(Δu + Δ) − g(Δu)}du.  (3.3)
Since

  −∞ < ∫_0^∞ {log(1 + 1/u)}{(1 + u)^{−1/2} − u^{−1/2}}u du < 0,

it follows from the boundedness of g′(u) that

  ∫_0^∞ {log(1 + 1/u)}{(1 + u)^{−1/2} − u^{−1/2}}g(Δu)du
   = g(0)∫_0^∞ {log(1 + 1/u)}{(1 + u)^{−1/2} − u^{−1/2}}du + Δ∫_0^∞ {log(1 + 1/u)}{(1 + u)^{−1/2} − u^{−1/2}}u g′(ξΔu)du
   = g(0)∫_0^∞ {log(1 + 1/u)}{(1 + u)^{−1/2} − u^{−1/2}}du + O(Δ),  (3.4)

where 0 < ξ < 1. Since 0 < ∫_0^∞ {log(1 + 1/u)}(1 + u)^{−1/2}du < ∞, it follows from the boundedness of g′(u) that

  ∫_0^∞ {log(1 + 1/u)}(1 + u)^{−1/2}{g(Δu + Δ) − g(Δu)}du = O(Δ).  (3.5)

From (3.3), (3.4) and (3.5) we have

  ∫_0^∞ {log(1 + Δ/x)}{(x + Δ)^{−1/2} g(x + Δ) − x^{−1/2} g(x)}dx
   = g(0)√Δ ∫_0^∞ {log(1 + 1/u)}{(1 + u)^{−1/2} − u^{−1/2}}du + O(Δ^{3/2}).  (3.6)
Since g′(x)/g(x) is bounded, we obtain

  ∫_0^∞ {(x + Δ)^{−1/2} g(x + Δ) − x^{−1/2} g(x)}dx = ∫_Δ^∞ t^{−1/2} g(t)dt − ∫_0^∞ x^{−1/2} g(x)dx = −∫_0^Δ t^{−1/2} g(t)dt = O(Δ^{1/2}),

hence

  O(Δ)∫_0^∞ {(x + Δ)^{−1/2} g(x + Δ) − x^{−1/2} g(x)}dx = O(Δ^{3/2}).  (3.7)

From (3.2), (3.6) and (3.7) we have

  J1 = (1/2) g(0) √Δ ∫_0^∞ {log(1 + 1/u)}{(1 + u)^{−1/2} − u^{−1/2}}du + O(Δ^{3/2}).  (3.8)

Note that

  −∞ < ∫_0^∞ {log(1 + 1/u)}{(1 + u)^{−1/2} − u^{−1/2}}du < 0.
Next we have

  J2 = ∫_0^Δ {log f0(Δ − x) − log f0(x)}f0(x)dx
    = −(1/2)∫_0^Δ {log((Δ − x)/x)}x^{−1/2} g(x)dx + ∫_0^Δ {log g(Δ − x) − log g(x)}x^{−1/2} g(x)dx.  (3.9)

Since

  ∫_0^Δ {log((Δ − x)/x)}x^{−1/2} g(x)dx = √Δ [∫_0^1 {log(1 − u)}u^{−1/2} g(Δu)du − ∫_0^1 (log u)u^{−1/2} g(Δu)du]

and g(x) is bounded, we have

  ∫_0^Δ {log((Δ − x)/x)}x^{−1/2} g(x)dx = O(Δ^{1/2}).  (3.10)

Since g(x) and g′(x)/g(x) are bounded, it follows that

  ∫_0^Δ {log g(Δ − x) − log g(x)}x^{−1/2} g(x)dx = O(Δ^{3/2}).  (3.11)

Since g′(x) is bounded and ∫_0^1 |log((1 − u)/u)| u^{−1/2}du < ∞, it follows from (3.9), (3.10) and (3.11) that

  J2 = −(1/2)√Δ {g(0)∫_0^1 {log((1 − u)/u)}u^{−1/2}du + O(Δ)} + O(Δ^{3/2})
    = −(1/2) g(0) √Δ ∫_0^1 {log((1 − u)/u)}u^{−1/2}du + O(Δ^{3/2}).  (3.12)
From (3.1), (3.8) and (3.12) we have

  I1(0, Δ) = (1/2) g(0) √Δ [∫_0^∞ {log(1 + 1/u)}{(1 + u)^{−1/2} − u^{−1/2}}du − ∫_0^1 {log((1 − u)/u)}u^{−1/2}du] + O(Δ^{3/2})
       = K√Δ + O(Δ^{3/2}) (say).

In a similar way to the case Δ > 0, we obtain for Δ < 0

  I1(0, Δ) = K√|Δ| + O(|Δ|^{3/2}).

Thus we complete the proof.

THEOREM 3.1. Suppose that X1, X2, …, Xn, … is a sequence of i.i.d. random variables with the density f0(x − θ). Then the largest order of consistency is n^2, in the sense that there does not exist a consistent estimator with order greater than order n^2.

Proof. From Lemma 3.1 we have for sufficiently large n and any nonzero a

  In(θ, θ + aL cn^{−1}) = nI1(θ, θ + aL cn^{−1}) = nK(|a|L)^{1/2} cn^{−1/2} + O(nL^{3/2} cn^{−3/2}).

If order cn is greater than order n^2, then

  lim_{n→∞} In(θ, θ + aL cn^{−1}) = K(|a|L)^{1/2} lim_{n→∞} n cn^{−1/2} = 0.

Hence it follows from Theorem 2.1 that there does not exist a consistent estimator of θ with order greater than order n^2. Thus we complete the proof.
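The rate comparison in the proof can be made concrete. In the sketch below (illustrative constants only; K depends on g and is not computed here), In(θ, θ + aL cn^{−1}) ≈ nK(|a|L)^{1/2} cn^{−1/2} stays bounded when cn = n^2 but tends to zero for any faster rate, which is exactly what Theorem 2.1 forbids for a consistent estimator.

```python
# I_n(theta, theta + aL/c_n) ~ n*K*sqrt(|a|L)/sqrt(c_n)   (Lemma 3.1)
K, aL = 0.5, 1.0          # illustrative positive constants; K depends on g


def info(n, cn):
    return n * K * (aL / cn) ** 0.5


at_n2 = [info(n, n ** 2) for n in (10 ** 2, 10 ** 4, 10 ** 6)]
faster = [info(n, n ** 2.2) for n in (10 ** 2, 10 ** 4, 10 ** 6)]
print(at_n2)   # constant in n: the K-L information stays bounded
print(faster)  # decreasing to 0: consistency at this rate is impossible
```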
4. THE EXISTENCE OF A TEST WITH THE LARGEST ORDER n^2 OF CONSISTENCY
In the previous section it was shown that the largest order of consistency is n^2. In this section we show that there exists a test with order n^2 of consistency. We consider the problem of testing the hypothesis H: θ = θ0 against the alternative K: θ = θ0 + t n^{−2}, where t ≠ 0. Without loss of generality we assume that θ0 = 0. In this problem we take a test with a rejection region Rn := {(x1, …, xn) : min_{1≤i≤n} |xi| > Cn}, where Cn is a small positive constant depending on n. Since for sufficiently large n

  P0{|X1| ≤ Cn} = ∫_{−Cn}^{Cn} f0(x)dx = 2∫_0^{Cn} x^{−1/2} g(x)dx ~ 4g(0)Cn^{1/2},

it follows that

  P0{min_{1≤i≤n} |Xi| > Cn} = [1 − P0{|X1| ≤ Cn}]^n ~ {1 − 4g(0)Cn^{1/2}}^n.  (4.1)

Letting Cn = k n^{−2} with a positive constant k, from (4.1) we have for sufficiently large n

  P0{min_{1≤i≤n} |Xi| > Cn} ~ (1 − 4g(0)√k n^{−1})^n ~ e^{−4g(0)√k}.

Taking 0 < α < 1 as the level of the test, we obtain e^{−4g(0)√k} = α, hence

  k = (log α)^2 / (16 g^2(0)) =: k0 (say).

Putting Cn* = k0 n^{−2}, we have the following.
has a power function
553
103 { min l Xi l > Cn R n (t) : = Pn_Z In 1
I/ z
exp[-2g(0) {(Itl +ko)1
+(ko
/2 -
_t)1 /zll
(ItI-ko) 1
for Itl <- ko,
/z}]
for sufficiently large n, where t * 0.
for
Iti
> ko
J
Proof. In the testing problem we consider the test

  φn(x1, …, xn) = 1 if min_{1≤i≤n} |xi| > Cn*, and φn(x1, …, xn) = 0 if min_{1≤i≤n} |xi| ≤ Cn*.

Since Cn* = k0 n^{−2}, it follows that φn is a test of level α with order n^2 of consistency. The power function βn(t) of the level α test φn is given by

  βn(t) = E_{tn^{−2}}[φn(X1, …, Xn)] = P_{tn^{−2}}{min_{1≤i≤n} |Xi| > Cn*}
     = [1 − P_{tn^{−2}}{|X1| ≤ k0 n^{−2}}]^n
     = [1 − P0{−(k0 + t)n^{−2} < X1 < (k0 − t)n^{−2}}]^n.  (4.2)

Putting an := −(k0 + t)n^{−2} and bn := (k0 − t)n^{−2}, we have for |t| < k0 and sufficiently large n

  P0{an ≤ X1 ≤ bn} = ∫_{an}^{bn} f0(x)dx = ∫_{an}^{bn} |x|^{−1/2} g(x)dx
   = ∫_0^{−an} x^{−1/2} g(x)dx + ∫_0^{bn} x^{−1/2} g(x)dx
   ~ 2g(0){(−an)^{1/2} + bn^{1/2}}
   = 2g(0){(k0 + t)^{1/2} + (k0 − t)^{1/2}} n^{−1}.  (4.3)

From (4.2) and (4.3) we obtain for |t| < k0 and sufficiently large n

  βn(t) ~ [1 − 2g(0){(k0 + t)^{1/2} + (k0 − t)^{1/2}} n^{−1}]^n
     ~ exp[−2g(0){(k0 + t)^{1/2} + (k0 − t)^{1/2}}]  (4.4)
     ≥ e^{−4g(0)√k0} = α.
Since for t > k0 and sufficiently large n

  P0{an ≤ X1 ≤ bn} = ∫_{an}^{bn} f0(x)dx = ∫_{an}^{bn} |x|^{−1/2} g(x)dx
   = ∫_0^{−an} x^{−1/2} g(x)dx − ∫_0^{−bn} x^{−1/2} g(x)dx
   ~ 2g(0){(−an)^{1/2} − (−bn)^{1/2}}
   = 2g(0){(k0 + t)^{1/2} − (t − k0)^{1/2}} n^{−1},

we have from (4.2)

  βn(t) ~ [1 − 2g(0){(k0 + t)^{1/2} − (t − k0)^{1/2}} n^{−1}]^n
     ~ exp[−2g(0){(k0 + t)^{1/2} − (t − k0)^{1/2}}]  (4.5)
     > e^{−2√2 g(0)√k0} > e^{−4g(0)√k0} = α.
In a similar way to the above we obtain for t ≤ −k0 and sufficiently large n

  βn(t) ~ [1 − 2g(0){(k0 − t)^{1/2} − (−k0 − t)^{1/2}} n^{−1}]^n
     ~ exp[−2g(0){(k0 − t)^{1/2} − (−k0 − t)^{1/2}}]  (4.6)
     > e^{−2√2 g(0)√k0} > e^{−4g(0)√k0} = α.

The desired result follows from (4.4), (4.5) and (4.6).

Remark 4.1. The power function βn(t) of the level α test with the rejection region {min_{1≤i≤n} |Xi| > Cn*} is asymptotically given as in Figure 4.1.
Fig. 4.1 - Asymptotic power function of the level α test.
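The constants of Theorem 4.1 are easy to check numerically for the two-sided Gamma example, where g(0) = 1/(2√π) and P0{|X1| ≤ C} = erf(√C). The following sketch is not from the paper (the erf identity holds only for this particular g): it verifies that Cn* = k0 n^{−2} gives asymptotic level α and evaluates the asymptotic power function.

```python
import math

alpha = 0.05
g0 = 1 / (2 * math.sqrt(math.pi))   # g(0) for the two-sided Gamma(1/2) example
k0 = (math.log(alpha)) ** 2 / (16 * g0 ** 2)

# level check: e^{-4 g(0) sqrt(k0)} = alpha by construction, and the exact
# finite-n level (1 - erf(sqrt(k0)/n))^n approaches it with an O(1/n) error
assert abs(math.exp(-4 * g0 * math.sqrt(k0)) - alpha) < 1e-12
n = 500
level = (1 - math.erf(math.sqrt(k0) / n)) ** n
print(level)  # close to 0.05


def power(t):
    # asymptotic power function of Theorem 4.1
    if abs(t) <= k0:
        return math.exp(-2 * g0 * (math.sqrt(k0 + t) + math.sqrt(k0 - t)))
    return math.exp(-2 * g0 * (math.sqrt(k0 + abs(t)) - math.sqrt(abs(t) - k0)))


print(power(0.0))       # equals alpha: the power is smallest at t = 0
print(power(10 * k0))   # larger than alpha and increasing in |t|
```

The two branches of the power function match continuously at |t| = k0, where both reduce to exp(−2g(0)(2k0)^{1/2}).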
REFERENCES
AKAHIRA, M. (1975a) Asymptotic theory for estimation of location in non -regular cases, I: Order of convergence of consistent estimators . Rep. Stat. App!. Res., JUSE, 22, 8-26.
556
106 AKAHIRA, M. (1975b) Asymptotic theory for estimation of location in non-regular cases, 11: Bounds of asymptotic distributions of consistent estimators. Rep. Stat. Appl. Res., JUSE, 22, 99-115. AKAHIRA, M. and TAKEUCHI, K. (1981) Asymptotic Efficiency of Estimators: Concepts and Higher Order Asymptotic Efficiency. Lecture Notes in Statistics 7, Springer, New York. AKAHIRA, M. and TAKEUCHI, K. (1991) Asymptotic efficiency of estimators for a location parameter family of densities with the bounded support. Rep. Stat. Appl. Res., JUSE, 38, 1-9. AKAHIRA, M. and TAKEUCHI, K. (1995) Non-Regular Statistical Estimation. Lecture Notes in Statistics 107, Springer, New York. IBRAGIMOV, I.A. and HAS'MINSKII, R.Z. (1981). Statistical Estimation: Asymptotic Theory. Springer, New York. POLFELDT, T. (1970a) Asymptotic results in non-regular estimation. Skand. Akt. Tidskr. Suppl., 1-2. POLFELDT, T. (1970b) The order of the minimum variance in a non-regular case. Ann. Math. Statist., 41, 667-672. SMITH, R.L. (1985) Maximum likelihood estimation in a class of non-regular cases. Biometrica 72, 67-90. SMITH, R.L. (1989) A survey of non-regular problems. Bull. /nt. Statist. Inst., 53, 353-372. WOODROOFE, M. (1972) Maximum likelihood estimation of a translation parameter of a truncated distribution. Ann. Math. Statist., 43, 113-122. WOODROOFE, M. (1974) Maximum likelihood estimation of a translation parameter of a truncated distribution II. Ann. Statist., 2, 474-488.
The existence of a test with the largest order of consistency in the case of a two-sided Gamma type distribution SUMMARY
For a random sample (X,,..., X„) of size n from a two-sided Gamma type distribution, it is shown that the largest order of consistency is equal to n2. In a problem of testing the hypothesis H : 0 = 00 against K : 0 = 00 + to 2, a test with the rejection region { min,s;s„ I X; 1 > kn-2) is given as one with the order, where t * 0 and k is a positive constant. The power function of the test is also asymptotically given.
Esistenza di on test con it piu ampio ordine di consistenza nel caso di distribuzione del tipo Gamma a due lati RIASSUNTO
Per un campione casuale (X,, ..., X,) di ampiezza n estratto da una distribuzione del tipo Gamma bilaterale, si dimostra the l'ordine di consistenza piit grande 6 uguale a n2. In
557
107 un problema di verifica dell'ipotesi H : 0 = 00 contro K : 0 = 0O + to2, viene fornito on test con la regione di rifiuto { min 1y ,, I X, I > kn 2 } dove t # 0 e K e una costante positiva. Viene anche data, asintoticamente , la funzione di potenza del test.
KEY WORDS AND PHRASES
Order of consistency , two-sided Gamma distribution, Kullback- Leibler information.
[Manuscript received October, 1997; final version received December, 1997]
558 COMMUN. STATIST.-THEORY METH., 28(3&4), 705-726 (1999)
THE HIGHER ORDER LARGE-DEVIATION APPROXIMATION FOR THE DISTRIBUTION OF
THE SUM OF INDEPENDENT DISCRETE RANDOM VARIABLES Masafumi Akahira Kei Takeuchi Kunihiko Takahashi Institute of Mathematics Faculty of International Studies University of Tsukuba Meiji-Gakuin University Ibaraki 305-8571 Kamikurata-cho 1598 Japan Totsuka-ku, Yokohama 244-0816 Japan Key Words and Phrases: Large- deviation approximation; Normal approximation; Edgeworth expansion; Binomal case; Negative-binomial case.
ABSTRACT For a sum of not identically but independently distributed discrete random variables, its higher order large-deviation approximations are given. They are compared with the normal and Edgeworth type approximations in various cases. Consequently, the large-deviation approximations give sufficiently accurate results. 1. INTRODUCTION The Edgeworth and allied expansions are discussed by Barndorff-Nielsen and Cox (1989). In particular the tilted Edgeworth expansion is closely connected with the large deviation (see Daniels (1954, 1987)). Related results are
705 Copyright 0 1999 by Marcel Dekker . Inc.
www . dekker.com
559 706 AKAHIRA, TAKAHASHI , AND TAKEUCHI found in Barndorff-Nielsen and Cox (1979), Booth and Wood (1995), Gatto and Ronchetti (1996), Harvill and Newton (1995) and Lieberman (1994). The distributions of a sum of not identically but independently distributed random variables are difficult to calculate exactly. Obviously the normal and Edgeworth type approximations can be applied when the number of independent random variables is not too small, but it is not always sufficiently accurate especially for the tail part. In such cases, large-deviation approximations could give better approximations especially for tails. In this paper, we discuss large-deviation approximations for the distribution of the sum of discrete random variables and show in various cases they give sufficiently accurate results.
2. LARGE-DEVIATION APPROXIMATIONS Suppose that X1, ..., X,,, ... is a sequence of independent integer-valued random variables and, for each j = 1, ... , n, ... , X j is distributed according to a probability function pj(x) = P {Xj = x}
for x = O, f1, ±2,... .
Putting S„ :_ X„ we denote a probability function of S„ by pn(y) := P {Sn =y}
(2.1)
for y = 0, ±1, ±2,. ... Denote the moment generating function (m.g. f.) of X j by Mj(9) := E[ e'x' (2.2) for each j = 1, ... , n,... , assuming that Mj (9)'s exist for values of 0 in an open interval a which includes 0. Now, for each j, we consider a discrete exponential family Pj := {pj,e(x) : 0 E &} of probability functions pj,o(x ) Pe {Xj = x} = pj(x)eexMj(0)-1 (2.3) for x = 0, ±1, ±2,..., where pj,o(x) = pj(x). Denote a probability function of S„ by pn,e(y) Pe {Sn = y}
(2.4)
560 HIGHER ORDER LARGE-DEVIATION APPROXIMATION 707 for y = 0, f1, f2, ... , where pn,o(y) = pn(y)• From ( 2.1) to ( 2.4) we have n
pn,e(y) = pn(y)e° ll Mj(0) -1. (2.5) j=1
On the other hand, it follows from (2.2) that, for each j, the characteristic function of Xj, under the family Pj, is given by
Ee [eitx;] = 1:e='pj,B(x) = Mj(0)-1Mj(0 +it). (2.6) X
Since, by (2.6), the characteristic function of Sn, under the families {Pj}, is n n Ee [ettSn] _ 11 Mj(0+it)f[Mj(0)-1
l
j=1
j=1
By the Fourier inverse transform, we have ,r
n
n
H
Mj(B + it) J Mj(0)- 1e- 'tydt. (2.7) Pn,e(y) =2^ - f „ j=1 j=1 From (2.5) and (2.7) we obtain n
pn(y) = e-By fJ Mj(0) pn,e(y) j=1 n
1
n
n
11
Mj (0)-1 e- `tydt. (2.8) = e-By 11 Mj (0) 2^ f 11 Mj(0 + it) j =1 * j=1 j=1 Letting Kn(0) :=
1 log Mj(0), we have n
Mi(0) =
ettn(e)
j=1 From (2.8) and (2.9) we obtain ex.,(e)-ey R xn (e+it)-xn ( e)-`tydt. pn(y) = 2^ f
(2.10)
For small ItI, we have by the Taylor expansion
Kn(0 +
K, 2) (0) (it)2 + 1 Kn(3) (0) ( it)3 + .. . it) - K.(0) = Knl) (9)it + 1 2 6
561 708 AKAHIRA, TAKAHASHI, AND TAKEUCHI where K5°`' (9) = (d" /de-) K.(9) for a = 1, 2,.... Then we consider an estimator 8 := B(S) for 0 such that K.(" (e) = y
(2.11)
when Sn = y for y = 0,±I,±2 ..... Using the estimator 0, we have the following expansion of pn(y). Theorem 1. I f K (0) = 0(n) (j = 2, 3, ... ), then the probability function pn(y) of the sum S. is asymptotically given by
pn(y)
1 eK„ A -h L1 + K(2)(9) 2 - 5143) B)} s +0 8{Kn (9)} 24{Kn (B)}} = L JJ 2lrK(2)( 9)
The proof is omitted, since it is essentially similar to Chapter 2 of Jensen (1995). Next we define the amount of the Kullback- Leibler information between probability functions pj,e(•) and p3(•) as I3(0, 0) := E p1,e ( x) log X
pje(x) pi(x)
for j = 1,... , n,.... From ( 2.3) we have for each j ( I.i(0, 0) = 1: pi, e(x) log eezM.i(e)-1) X
=
eae logM3(6) - log Mj (0),
hence, in a similar way to the above, it follows from (2.5) that the amount of the Kullback-Leibler information between pn,e(.) and pn(•) is given by .6(y) In* (0, 0) _ Epn,e(y) log p Y pn(y) n
E I;(9, 0) = eKnll(0) - Kn(0).
(2.12)
j=1
From (2.10) and (2.11) we have
I;,(9, 0) = By - Kn(9).
(2.13)
562 709
HIGHER ORDER LARGE-DEVIATION APPROXIMATION Then we obtain the following.
Corollary. If K,(,')(9) = 0(n) (j = 2,3.... ); then the probability function pn(y) of the sum Sn has the asymptotic form
pn(y) =
1
e- In(B,o)
K
Kn4)(O)
+
8{K2)(9) }2
n3)"O)}
_ 5 {K 2 +0 24{K2(6)}3 (h )
].
The proof is straighforward from Theorem 1 and (2.13). We also have the tail probability of Sn as follows. Theorem 2. If K,(,') (B) = 0(n) (j = 2, 3,.. .), then
00
I
{ y} = nnlnJ-n✓
P Sn >- 27rKn2)(9)
_6 6z
Le -
Z2 K(n3) (6)z
-
2K(2)(6)
2{Kn2)(0)}2
+ Kn4)(B) - 5{Kn)(0)}2 +O ( W2 8{Kn2)(0)}2 24{Kn2)(9)}3 \l (2.14)
for all y>E(S,,). The proof is given in Section 5. Note that the formula (2.14) does not include any term in which 1 - (D(x) appears, as are the formulas discussed by Jensen (1995) and others, where '(x) denotes the standard normal distribution function. Both can be related to each other through 11 33.5 1 - 4'(x) _ 0 (x) 2 x3 + TS x7 (- )n1.3.5...(2n-1) 1 + + 2n+l
+
I\ x 2n+3 1 /
as x - co, where O(x) is the standard normal density function. It is to be remarked that in (2.14) the right-hand side is meaningful only when 9 is not too small, which implies that y - E(S,,) is of order n. This is in contrast to the usual large-deviation approximation (such as discussed
563
710 AKAHIRA, TAKAHASHI, AND TAKEUCHI in Lugannani and Rice (1980) and Robinson (1982)) which is based on the normal density, where 9 is assumed to be small, that is y - E(S,,) is of o(n). The formula (2.14) can be regarded as truely large-deviation approximation. We could further improve the approximation (2.14) by setting the limit of summation by actual possible largest value.
3. BINOMIAL CASES

Suppose that X_1, ..., X_n, ... is a sequence of independent random variables and, for each j = 1, ..., n, ..., X_j is distributed according to the binomial distribution B(1, p_j) with probability function

f_{p_j}(x) = P\{X_j = x\} = p_j^x q_j^{1-x}

for x = 0, 1, where 0 < p_j < 1 and q_j = 1 - p_j. Let S_n := \sum_{j=1}^{n} X_j.
Then we consider the exact distribution, the large-deviation approximations, the normal approximation and the Edgeworth expansion for S_n and compare them numerically.

(i) Exact distribution. We obtain the exact distribution of S_n recursively as follows:

P\{S_k = 0\} = P\{X_1 = 0, ..., X_k = 0\} = q_1 \cdots q_k

for k = 1, ..., n, and

P\{S_k = y\} = P\{X_k = 1\} P\{S_k = y \mid X_k = 1\} + P\{X_k = 0\} P\{S_k = y \mid X_k = 0\} = p_k P\{S_{k-1} = y-1\} + q_k P\{S_{k-1} = y\}

for y = 1, ..., k and k = 1, ..., n, where P\{S_k = y\} = 0 for y \ge k + 1.

(ii) Large-deviation approximations. Since the m.g.f. of X_j is given by M_j(\theta) = p_j e^\theta + q_j for each j = 1, ..., n, it follows that p_{j,\theta}(x) = P_\theta\{X_j = x\} = (p_j e^\theta)^x q_j^{1-x} / (p_j e^\theta + q_j) for j = 1, ..., n. Since

K_n(\theta) = \sum_{j=1}^{n} \log M_j(\theta) = \sum_{j=1}^{n} \log(p_j e^\theta + q_j),
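The recursion can be sketched in a few lines of Python (our illustration, with `ps` the list p_1, ..., p_n):

```python
def poisson_binomial_pmf(ps):
    """Exact distribution of S_n = X_1 + ... + X_n with X_j ~ B(1, p_j),
    built up by P{S_k = y} = p_k P{S_{k-1} = y-1} + q_k P{S_{k-1} = y}."""
    pmf = [1.0]                          # S_0 is a point mass at 0
    for p in ps:
        new = [0.0] * (len(pmf) + 1)
        for y, prob in enumerate(pmf):
            new[y] += (1.0 - p) * prob   # X_k = 0
            new[y + 1] += p * prob       # X_k = 1
        pmf = new
    return pmf

# case (i) of the comparisons below: p_j = 0.05 j, j = 1, ..., 19
pmf = poisson_binomial_pmf([0.05 * j for j in range(1, 20)])
```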
we have

K_n^{(1)}(\theta) = \sum_{j=1}^{n} \frac{p_j e^\theta}{p_j e^\theta + q_j} = \sum_{j=1}^{n} p_{j,\theta}(1).

Then we take \theta such that

\sum_{j=1}^{n} p_{j,\theta}(1) = y.   (3.1)

Putting \tau := e^\theta, we have from (3.1)

y = \sum_{j=1}^{n} p_{j,\theta}(1) = \sum_{j=1}^{n} \frac{p_j \tau}{p_j \tau + q_j},   (3.2)

and also

n - y = \sum_{j=1}^{n} \frac{q_j}{p_j \tau + q_j}.   (3.3)
From (3.2) or (3.3) we can obtain the value of \tau. We put \hat p_j := p_{j,\theta}(1) and \hat q_j := 1 - \hat p_j. Then we have

I_n^*(\theta, 0) = \sum_{j=1}^{n} \left\{ p_{j,\theta}(1) \log \frac{p_{j,\theta}(1)}{p_j} + p_{j,\theta}(0) \log \frac{p_{j,\theta}(0)}{q_j} \right\} = \sum_{j=1}^{n} \left\{ \hat p_j \log \frac{\hat p_j}{p_j} + \hat q_j \log \frac{\hat q_j}{q_j} \right\}.
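Equation (3.2) has a unique root in \tau because its left-hand side is strictly increasing in \tau; a bisection sketch on a log scale (hypothetical helper, not from the paper):

```python
def solve_tau(ps, y, lo=1e-12, hi=1e12, iters=200):
    """Solve sum_j p_j*tau/(p_j*tau + q_j) = y for tau = e^theta (0 < y < n);
    the mean function is strictly increasing, so bisection converges."""
    def mean(tau):
        return sum(p * tau / (p * tau + (1.0 - p)) for p in ps)
    for _ in range(iters):
        mid = (lo * hi) ** 0.5       # geometric midpoint: tau lives on a log scale
        if mean(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo * hi) ** 0.5
```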
Since pies p_e2e = E Kn2)(e) j=1 pjee + qj (pjee + qj)2 n
_ Epj,o(1) {1 - pj,e(1)} j=1
it follows that n
Kn2^(B) j=1
(3.4)
565 712 AKAHIRA , TAKAHASHI , AND TAKEUCHI In a similar way to the above , we obtain n
Kn3)(e)
_ 1, 0j4j (4j j=1 n
K(4) (e) _ 3Pj4j( 1 - 6Pj4j). j=1
Since n
n
n
= 7T(pjea
+ 4j) _ Qj eK(e) = ft Mj(B) j=1 j=1 j=1 4i
it follows from Theorem 1 and (3.5) that the first order large-deviation approximation LD1 of pn(y) is given by 1 Pn(y) = eK„(a )- ev (1 + o(1)) 27rK,(,2) (B) ['^1 tllgj^ j=1 qj 27rLij =lpjgi
v
for y = 0, 1, ... , n.
From the Corollary we have the second order largedeviation approximation LD2 of pn (y) is given by {K,(,(2, 8)} 3
Pp(y) = 1 + K((8) _ 5 +On (b)1 2 24{Kn (8)} ( )} 2irK,(,2)(B) 8{42)
n gj l
_ 27r
E 1 Pj4j
^ )T j = 1 qi 1
+
K(4)(i) ([ten 2 8 (Lj=lPigi)
5{K,(,3)()}2
- 9 +O( 24 ()
)
]
(3.9)
for y = 0,1, ... , n, where K,(,3) (B) and K,(,4) (B) are given by (3.6) and (3.7), respectively. In the special cases y = 0 and y = n, it is easily seen that pn(O) = q1... qn and pn(n) = p1... pn.
566 HIGHER ORDER LARGE-DEVIATION APPROXIMATION 713
Since the cumulant of Sn are given by n µn := E(Sn) _ Epj,
n
vn := V (Sn) = Epjgj,
j=1 n
j=1
K3 (Sn) _ Epjgj(4j -pj), j=1
K3,n
n
/C4 ,n
:= k4(Sn)
= I:pjgj(1 - 6pjgj), j=1
it follows that the Edgeworth expansion of the distribution of Sn is given by P{Sn=t}=P Sn-P,=t-µn {
vn
vn
1 0(y) 5 1 + ',n (y3 - 3y) + K4' 2 (y4 - 6y2 +3) vn l 6vn3/2 24vn 2
+
-
- 15y4 + 45y2 -15) } + o (3.10) n JJJl ( 3=),
where y (t - µn) vn . The first term of the final expression of (3.10) is called the normal approximation of the distribution of Sn. We consider the cases when (i) {pj}^91 = 0.05(0.05)0.95, (ii) {pj}^01 = 0.03(0.03)0.60, (iii) {pj}r 1 is uniformly distributed on the interval (0, 1). In each case, we numerically compare the exact distribution of Sn, the normal approximation, the Edgeworth one (3 .10), the first order large-deviation one LD1 (3.8) and the second order large-deviation one LD2 (3.9) (see Tables 3.1 to 3.3). It is seen that the large-deviation approximation LD1 and LD2 are much better than the others and are also sufficiently accurate especially for the tail part. Note that the normal approximation and the Edgeworth one correspond to LD1 and LD2, respectively, in view of the order. Next we consider the one-sided probability of Sn = En , Xj the binomial case. In a similar way to the proof of Theorem 2 we have n-y
P{Sn > y} _ E pn(y + z) Z=o n-Y 2 (3) (9)z 1 eKn^a)-eY e 9z 1 - z - Kn 27fxn2)(0) K(nV4)(0)
+ 8{Kn2)(e)}2
z=o
2Kn2)(e) 2{Kn2)(6)}2
_ 5{Kn3)(O)}2 24{Kn2)(B)}3 +O ()
]
(3.11)
567 AKAHIRA, TAKAHASHI , AND TAKEUCHI
714
for y = 0,1, ... , n. Putting m:= n - y, we obtain 1 ?o ' e_ 1
- e-e
e (m+1)e)e-e _ _ (m + 1 )e-(m+1)e + (1 E ze -ez z=o 1 - e-e (1 - e-e)2 M 2e-(m+l)e 1 2 -ez _ _ (m+ 1 )
z
e
z=o
+ ( ) {-(2m + 3) e-(m+2)e 1 - e - e 1- e-e 3 + (2m + 1)e-(m+3)e + e-e ( 1 + e-e) }
From (3.11) we have P{Sn > y} K(4)(8) 5(43)(B))2 1 eK.de)-ey 1 - e-(m+1)e 1 + K(2)^°` 1 e-e R(K!2)(A)12 2d(K12)(A1)3 /2 7r
(m + 1) 2e'( m+1)e
1
2Kn2)(B) { 1 - e-B m+3)e + e-e ( 1 + e-e)) } + (1-e-) 1 e 3 (-(2m + 3)e-(m+2)e + (2m + 1)e- ( ))J
I
(m + 1)e-(m+1)e + (1 - e-(m+1)e)e-e +0 2 {Kn2)(B) }2 1 -e- 0 (1 - e-e)2 Kn3)(9)
n2
Letting f = ee, we have P {Sn > y} 1
n
n
2^ ti=1 Mj i=1 9i 1
- -(m+1)
{1+
Kn4)(B) 1 1 - T- 8(E n=1 Pi4i)2 1
(m + 1)2T-(m+1)
n 2 F_i =1 Pi 9i
1 - T-1
+
-
5 {Kn3)(9) }2 24(E^ 1 PiQi)3 }
(m +3)+T - 1(1 +T-1))^ 1 1 ) 3 \-(2m+3 ) T-(m+2)+(2m +1)r (1-T-J
568 HIGHER ORDER LARGE-DEVIATION APPROXIMATION 715 TABLE 3.1. The values of the exact distribution P{S„ = y} of S,,,, and the relative errors of the normal approximation, the Edgeworth one (3.10), the first order large-deviation one LD1 (3.8), the second order large-deviation one LD2 (3.9) when {pj}.^91 = 0.05(0.05)0.95. Exact(7o)
y
Normal
Edgeworth
LD1
LD2
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
0.0000 0.0001 0.0025 0.0293 0.2122 1.0319 3.5182 8.6510 15.6199 20.9349 20.9349 15.6199 8.6510 3.5182 1.0319 0.2122 0.0293 0.0025 0.0001
11M63 2.4706 0.8305 0.2991 0.0905 0.0091 -0.0144 -0.0120 -0.0140 0.0065 0.0065 -0.0140 -0.0120 -0.0144 0.0091 0.0905 0.2991 0.8305 2.4706
-0.5734 -0.0435 0.0076 0.0040 -0.0000 -0.0006 -0.0000 0.0001 -0.0000 -0.0000 0.0001 -0.0000 -0.0006 -0.0000 0.0040 0.0076 -0.0435 -0.5734
0.0000 0.0742 0.0321 0.0196 0.0141 0.0112 0.0095 0.0085 0.0079 0.0076 0.0076 0.0079 0.0085 0.0095 0.0112 0.0141 0.0196 0.0321 0.0742
0.0000 -0.0080 -0.0024 -0.0010 -0.0005 -0.0002 -0.0002 -0.0001 -0.0001 -0.0001 -0.0001 -0.0001 -0.0001 -0.0002 -0.0002 -0.0005 -0.0010 -0.0024 -0.0080
19
0.0000
11.0363
-
0.0000
0.0000
TABLE 3.2. The values of the exact distribution P{S,, = y} of S,-, and the relative errors of the normal approximation, the Edgeworth one (3.10), the first order large-deviation one LD1 (3.8), the second order large -deviation one LD2 (3.9) when {pj }^o 1 = 0.03(0.03)0.60. y 0
Exact o 0.0263
Normal 2.7822
Edgeworth 0.0933
LD1 0.0000
LD2 0.0000
1 2 3 4 5 6
0.2970 1.5463 4.9248 10.7461 17.0540 20.3938
0.5926 0.1126 -0.0290 -0.0548 -0.0334 0.0024
0.0512 0.0032 -0.0058 -0.0014 0.0018 0.0005
0.0840 0.0416 0.0273 0.0201 0.0159 0.0132
-0.0059 -0.0013 -0.0006 -0.0003 -0.0002 -0.0002
7
18.7873
0.0312
-0.0014
0.0114
-0.0002
8 9 10
13.5171 7.6552 3.4236
0.0378 0.0139 -0.0416
-0.0002 0.0025 0.0011
0.0102 0.0094 0.0089
-0.0001 -0.0001 -0.0001
11
1.2081
-0.1226
-0.0071
0.0087
-0.0001
12 13
0.3348 0.0722
-0.2185 -0.3169
-0.0131 0.0087
0.0089 0.0094
-0.0001 -0.0001
14 15 16
0.0120 0.0015 0.0001
-0.4057 -0.4748 -0.5146
0.1017 0.3331 0.8263
0.0105 0.0122 0.0152
-0.0002 -0.0003 -0.0005
17
0.0000
-0.5092
1.8805
0.0210
-0.0010
18
0.0000
-0.4175
4.4459
0.0339
-0.0022
19 20
0.0000 0.0000
-0.0774 1.6449
12.7190 60.6571
0.0764 0.0000
-0.0075 0.0000
569 716 AKAHIRA , TAKAHASHI, AND TAKEUCHI
TABLE 3.3. The values of the exact distribution PIS,, = y} of S, and the relative errors of the normal approximation , the Edgeworth one (3.10), the first order large-deviation one LD1 (3.8), the second order large-deviation one LD2 (3.9) when { pj } 1 is uniformly distributed on the interval (0, 1), that is, pl, ... , p20 are given by 0.305146, 0.715095, 0.612101, 0.672283, 0.447648, 0.268358 , 0.434328, 0.552620, 0.608603, 0.130255, 0.941095, 0.141198, 0.164085, 0.693920, 0.565611, 0.977985, 0.0513902 , 0.877854, 0.451323, 0.0628465, respectively. y
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
Exact o
0.00 0.0001 0.0015 0.0210 0.1670 0.8584 3.0390 7.7151 14.4053 20.0831 21 . 0635 16. 6429 9.8669 4.3488 1.4041 0.3251 0.0524 0.0056 0.0004 0.0000 0.0000
Normal
.118 5.5998 1 .5567 0.5223 0. 1642 0.0291 -0.0142 -0.0172 -0.0057 0.0055 0.0087 0.0025 -0.0091 -0.0167 -0.0059 0.0457 0. 1788 0.4856 1 .2369 3.5355 15. 4703
Edgeworth
-0.0944 0.0311 0.0138 0.0001 -0.0025 -0.0007 0.0007 0.0006 -0.0002 -0.0006 -0.0003 0.0006 0. 0019 0.0024 -0.0065 -0.0668 -0.3813 -
LD1
00 0.0564 0.0186 0. 0123 0.0119 0. 0120 0.0117 0.0110 0.0102 0.0093 0.0085 0.0077 0.0072 0.0070 0. 0074 0.0091 0.0125 0. 0192 0. 0338 0.0789 0.0000
LD2
0.0000 -0.0122 -0.0036 -0.0012 -0.0000 0.0003 0.0003 0.0003 0.0002 0.0001 0.0000 -0.0000 -0.0002 -0.0003 -0.0004 -0.0005 -0.0007 -0.0011 -0.0022 -0.0064 0.0000
TABLE 3.4. The values of the exact tail probability P{S„ > y} of Sn , and the relative errors of its Edgeworth approximation (3.13) and its second order largedeviation approximation LD2 (3. 12) when {pj };91 = 0.05 (0.05)0.95. y
1 11 12 13
Exact o
50.000029.0651 13.4452 4.7942
Edgeworth
LD2
O.OUOO -0.0000 0.0000 0.0004
-0.8356 -0.2010 -0.0627
14
1.2761
0.0014
-0.0232
15 16 17 18 19
0.2442 0.0320 0.0027 0.0001 0.0000
-0. 0006 -0.0293 0.2154 -
-0.0097 -0.0048 -0.0038 -0.0083 -
570 717
HIGHER ORDER LARGE-DEVIATION APPROXIMATION
TABLE 3.5. The values of the exact tail probability P{Sn > y} of S,,, and the relative errors of its Edgeworth approximation (3.13) and its second order large-deviation approximation LD2 (3.12) when {pj}j0-1 = 0.03(0.03)0.60. LD2
y
Exact(%)
Edgeworth
7 8 9
45.0120 26.2247 12.7076
0.0002 0.0003 -0.0003
-0.9933 -0.2467
10 11
5.0524 1.6288
-0.0032 -0.0067
-0.0840 -0.0346
12 13 14 15 16 17 18 19 20
0.4207 0.0859 0.0136 0.0016 0.0001 0.0000 0.0000 0.0000 0.0000
0.0004 0.0451 0.1758 0.4772 1.1191 2.5290 6.0939 18.1006 91.0675
-0.0161 -0.0082 -0.0044 -0.0026 -0.0018 -0.0017 -0.0025 -0.0076 -
Kn3)(9) _ (m + 1)T-(-+l) + (1 - T-(7`+1))T-1 +0 1 j , (1 -T-1 ) 2 } () 2(E 1pj4j) 2 1 -T-1 (3.12) where n
n
K(3)(e) _ pj9j( 4j -pj ), K(4)(e) _ EPj4j( 1 - %4j)j=1 j=1
On the other hand, by the Edgeworth expansion we have P{Sn > y} = 1 - CZ) + O(x)
I 6vn2 (x2 -1) + 24n (z3 - 3z) 2 +74x3 (zs - 10x3 + 15z) 24vn z + o n
(n
I ),
)J (3.13)
vn, In the cases (i) to (iii), we compare the where z = (y - 0.5 - p) Edgeworth expansion (3.13) of the tail probability of Sn and the second order large-deviation approximation (3.12) (see Tables 3.4 to 3.6).
571
718 AKAHIRA , TAKAHASHI , AND TAKEUCHI TABLE 3.6. The values of the exact tail probability PIS,, > y} of S,,, and the relative errors of its Edgeworth approximation (3.13) and its second order large-deviation approximation LD2 (3.12) when {pj},^0_1 is uniformly distributed on the interval (0,1), that is, pl,... p2o are given by 0.305146, 0.715095, 0.612101, 0.672283, 0.447648, 0.268358, 0.434328, 0.552620, 0.608603, 0.130255,0.941095,0.141198,0.164085,0.693920,0.565611,0.977985,0.0513902, 0.877854, 0.451323, 0.0628465, respectively. y
Exact(%)
Edgeworth
10
53.7097
-0.0002
-
11 12
32.6462 16.0033
-0.0003 0.0001
-0.2793
13 14
6.1365 1.7877
0.0012 0.0023
-0.0836 -0.0306
15 16 17 18 19 20
0.3836 0.0585 0.0060 0.0004 0.0000 0.0000
-0.0014 -0.0307 0.1676 -0.7757 -
-0.0129 -0.0061 -0.0036 -0.0032 -0.0067 -
LD2
TABLE 3.7. The values of the exact tail probability P{S„ > y} of S, and the relative errors of its normal approximation , the saddlepoint approximation with the first term only, the saddlepoint (sp.) expansion, the Lugannani-Rice formula and the large-deviation approximation LD2 when pj =p = 0.15 (j = 1,... , n) for n =10, 20. The values except the exact and the large-deviation approximation LD2 are calculated from Table 2.4.4 of Jensen (1995, page 44). n=10,p=0.15
2 45.57 0.097
4
5
Exact(%) Normal
5.00 -0.234
0.987 -0.601
8.67 x 10-0.994
3.33 x 10-1.000
Saddlepoint
-0.137
-0.050
-0.035
-0.007
0.021
Sp: expansion Lugannani-Rice LD2
0.004 0.004 -
0.004 0.008 -0.070
0. 005 0.013 -0.021
0.010 0.038 -0.003
0.024 0.081 -0.009
y
9
n=20,p=0.15
Exact(%)
6 6.73
8 0.592
18 2.07 x 10-
Normal Saddlepoint
-0.128 -0.043
-0.591 -0.027
-1.000 0. 010
-1.000 0.039
Sp: expansion Lugannani-Rice LD2
0.001 0.003 -0.126
0.002 0.005 -0.022
0.005 0. 039 -0.005
0.021 0.084 -0.008
y
19
3.80 x 10
572 HIGHER ORDER LARGE-DEVIATION APPROXIMATION 719 In the independently, identically and binomially distributed case, Jensen (1995, page 44) gives the table to compare the exact tail probability with the normal approximation, the saddlepoint approximation with the first term only, the saddlepoint expansion and the Lugannani-Rice formula. So, we add our large-deviation approximation LD2 to the table when p; = p = 0.15 (j = 1, ... , n) for n = 10, 20 (see Table 3.7). The LD2 seems to be better than the others for the tail part in the case.
4. NEGATIVE-BINOMIAL CASE Suppose that Xi, ... , X,,, ... is a sequence of independent random variables and, for each j = 1,. - - , n,..., X1 is distributed according to the negative-binomial distribution NB(r;, p;) with a probability function
C
x+r;-1 x} x p;) q; fx; (x) = P{X; = = rj x f o r x = 0,1, 2, ... , where 0 < p; < 1 and qj = 1 - p;. Then we consider the exact distribution, large-deviation approximations , normal approximation and Edgeworth expansion for S„ = E 1 X; and compare them numerically. (i) Exact distribution. We obtain the exact distribution of S„ recursively follows : Letn=20andr;=rfor j=1,...,20 . Letg1= •••= qs=a,qa="'= q1o=b,g11= •••= q15=c,g16 =• •=qzo=d . Then we have P{X; = x} = r(
P{ X; = x} = r(
P{X; = x} = r(
P{X; = x} = r(
r + 1) (r + x - 1 ) ar(1 - a)x X! (x=0,1,2,... ;j=1,...,5), r+1)..•(r+x - 1)br(1-b)x x! (x=0,1,2 ,...; j=6,...,10), r + (r + x - 1 ) cr(1 c)X! (x=0,1,2,...;j=11,...,15), r + • • • (r + x 1) dr(1 d)x x! (x=0,1,2 ,... j=16,...,20).
Let Si =Ss1X;,Sz6X;,S3;11 X; andS4E^°_1sX;. Then it is seen that Si, Sz, S3 and S4 are distributed according to the negative-
573 720 AKAHIRA, TAKAHASHI , AND TAKEUCHI binomial distributions NB(5r, a), NB(5r, b), NB(5r, c) and NB(5r, d), respectively. Since Si and S2 are independent, the probability function of Si + S2 is given by
P {S1 +S2=y}P {S1= k}P{S2=y-k} k=o for y = 0,1, 2, .... Letting S12 := Si + S2, we have
S1 +S2+ S3 =S12+S3. Since S12 and S3 are independent, in a similar way the probability function of S12 + S3 = Sl + SZ + S3 can be given. Repeating the above process, we also get the probability function of Sl + SS + S3 + S4. (ii) Large-deviation approximations. Since the m.g.f. of Xj is given by Mj(O) = E( eex;) = p,,' ( 1 - gjee)-r3 for small
101
and j = 1,... , n. Since n
n
Kn(0) = E log Mj(B) = E {rj log pj - rj log(1 - gje0) } , j=1 j=1 it follows that
K' (0)
r7g7eB B
j=1
- qje
Then we take 8 such that rjgjee = Y. E 1-gjee j =1 Putting T = e9, we have from (4.1) n y 1: rjgjT = 1-q.f j=1
We also obtain K,2) (B) = rjgjee = 1:
rjgjT
j=1 (1 - gjea)2 j=1 (1 - gjT)2
(4.2)
574 HIGHER ORDER LARGE-DEVIATION APPROXIMATION 721 From Theorem 1, it follows that the first order large -deviation approximation LD1 of pn(y) is given by pn(y) = 1 eK" V27rKnz)(B)
(B)- ey(1 + 0(1))
1 { fjMj( O)}e(1+o(1))
2rKz )( 6) j=1 ) 1 nn P r 11(1-qjT/ V2ir^nj -1 ( 1 -4 jr
T y(1 +0(1)) (4.3)
for y = 0, 1,2,.... Since K(3)(9)=C`r q (1 +qjT) (4.4) 3 L. j=1 (1 - qjT) K(4)(e) _ rjgj(1 +4q,T +q?TZ) (4.5) 4 j=1 (1 -qjT) it follows from Theorem 1 that the second order large-deviation approximation LD2 of pn is given by 1 n
pj )rj T-y
pn (y)
2^j=1 1 - qjT
r K(n4)(B) - 5{Kn3)(e)} 2 +0 Ill+ 8{K2)(B)}2 24{Kn2)(B)}3 h)]
(
(4.6) ,
where Kn2)(8), Kn3)(9) and Kn4)(B) are given by (4.2), (4.4) and (4.5), respectively. Since the cumulants of Sn are given by n
/in
E(Sn)
= E gE , j=1
K3, n
K3(Sn)
vn
Pi
n
V (Sn) gjrj z j=1 pj
p3gj)rj _ qj(1 + j=1 n
k4,n
^s ('yn)
=Ei,rr
4 qj), ( 1+4g j + z j=1 pJ4
it follows that the Edgeworth expansion of the distribution of Sn is given by
575 722 AKAHIRA, TAKAHASHI , AND TAKEUCHI TABLE 4.1. The values of the exact distribution P{ S„ = y} of S,,, and the relative errors of the normal approximation , the Edgeworth one (4 .7), the first order large-deviation one LD1 (4.1) and second order large-deviation one LD2 (4.6), when n = 20, rj = 1(j = 1,... ,20), 41 = ... = 95 = 0.1, 46 = ... = 410 = 0.2, 4u=...=415=0.3,416=...=420=0.4.
y
Normal 3.1201 0.5443 -0.0044 -0.1701 -0.2028
Edgeworth
1 2 3 4
Exact o 0.2529 1.2644 3.3506 6.2587 9.2491
0.3185 0.1379 0.0296 -0.0061 -0.0090
0.0845 0.0422 0.0281 0.0210
-0.0059 -0.0012 -0.0005 -0.0002
5 6
11.5098 12.5389
-0.1711 -0.1026
-0.0024 0.0017
7 8 9 10 11 12 13 14 15 16
0.0168 0.0139
-0.0001 -0.0001
12.2775 11.0112 9.1744 7.1794 5.3226 3.7646 2.5548 1.6716 1.0586 0.6512
-0.0147 0.0766 0.1543 0.2011 0.2025 0. 1503 0.0453 -0.1018 -0.2733 -0.4481
0.0007 -0.0015 0.0002 0.0059 0.0080 -0.0028 -0.0267 -0.0475 -0.0387 0.0164
0.0119 0.0103 0.0091 0.0081 0.0073 0.0066 0.0060 0.0055 0.0050 0.0046
-0.0000 -0.0000 -0.0000 -0.0000 -0.0000 -0.0000 -0.0000 -0.0000 -0.0000 -0.0000
17
0.3902
-0.6079
0.1057
0.0043
-0.0000
LD1
D2
.
TABLE 4.2. The values of the exact distribution P{S,., = y} of S, and the relative errors of the normal approximation, the Edgeworth one (4.7), the first order large-deviation one LD1 (4.1) and second order large-deviation one LD2 (4.6), when n = 20, rj = 2(j = 1,... , 20), 41 = ... = 45 = 0.1, 46 = • = 410 = 0.2, 411 =... =415=0.3, 416=•••=42o=0.4.
y 0 1 2
Exact o 0.0006 0.0064 0.0329
Normal 97. 8639 17.9615 5.7426
Edgeworth -0.6928
LD1 .000 0.0844 0.0422
LD2 0.0000 -0.0060 -0.0012
3
0.1164
2.3357
-0.0674
0.0281
-0.0005
4 5 6 7 8 9 10 11 12 13 14 15
0.3173 0.7115 1.3660 2.3082 3.5026 4.8463 6.1873 7.3591 8.2185 8.6742 8.7003 8.3318
1.0422 0.4515 0.1504 -0.0110 -0.0961 -0.1349 -0.1433 -0.1306 -0. 1028 -0.0647 -0.0205 0.0259
0.0456 0.0460 0.0266 0.0101 0.0006 -0.0031 -0.0030 -0.0015 0.0000 0.0006 0.0004 -0.0003
0.0210 0.0168 0.0140 0.0120 0.0105 0.0093 0.0084 0.0076 0.0069 0.0064 0.0059 0.0055
-0.0003 -0.0002 -0.0001 -0.0001 -0.0001 -0.0000 -0.0000 -0.0000 -0.0000 -0.0000 -0.0000 -0.0000
16
7.6491
0.0700
-0.0006
0.0051
-0.0000
17 18 19 20
6.7559 5.7583 4.7493 3.7996
0.1076 0.1342 0. 1460 0.1397
0.0001 0.0015 0.0028 0.0027
0.0048 0.0045 0.0043 0.0040
-0.0000 -0.0000 -0.0000 -0.0000
576 HIGHER ORDER LARGE-DEVIATION APPROXIMATION 723 TABLE 4.3. The values of the exact distribution PIS,, = y} of S,,, and the relative errors of the normal approximation, the Edgeworth one (4 . 7), the first order large-deviation one LD1 (4.1) and second order large -deviation one LD2 ( 4.6), when n = 20, rr = 1 (j = 1,... , 20), 91 = ... = 4s = 0.2 , 46 = ... = 41o = 0.4, 411=...=91s = 0.6,916= ...= 42o=0.8.
ExacqToj T ' o.uuuu 1 0.0001 2 0.0004 3 0.0017 4 0.0049 0.0122 5 0.0265 6 7 0.0519 8 0.0931 9 0.1551 0.2427 10 11 0.3599 0.5091 12 0.6910 13 14 0.9042 1.1454 15 1.4091 16 1.6887 17 y
Normal Edgeworth 715-4W 916.1823 219.6504 73.7118 30.7225 14.8687 7.9959 4.6291 -0.2211 0.1071 2.8151 1.7612 0.2037 1.1110 0.2100 0.1831 0.6900 0.1469 0.4070 0.2111 0.1113 0.0728 0.0801 0.0543 -0.0261 0.0341 -0.0969 -0.1471 0.0188
1 U.-OW 0.0845 0.0422 0.0281 0.0210 0.0168 0.0139 0.0119 0.0103 0.0091 0.0081 0.0073 0.0066 0.0060 0.0055 0.0050 0.0046 0.0043
LD2 0.0000 -0.0059 -0.0012 -0.0005 -0.0002 -0.0001 -0.0001 -0.0001 -0.0000 -0.0000 -0.0000 -0.0000 -0.0000 -0.0000 -0.0000 -0.0000 -0.0000 -0.0000
18
1.9764
-0.1817
0.0078
0.0040
-0.0001
19
2.2640
-0.2043
0.0004
0.0037
-0.0001
20
2.5432
-0.2171
-0.0042
0.0035
-0.0001
21 22 23
2.8062 3.0462 3.2573
-0.2221 -0.2205 -0.2134
-0.0067 -0.0075 -0.0072
0.0033 0.0032 0.0030
-0.0001 -0.0001 -0.0002
24
3.4351
-0.2015
-0.0062
0.0029
-0.0002
25 26 27 28
3.5766 3.6801 3.7452 3.7729
-0.1854 -0.1658 -0.1432 -0.1179
-0.0049 -0.0036 -0.0024 -0.0014
0.0028 0.0028 0.0027 0.0027
-0.0002 -0.0002 -0.0003 -0.0003
29 30
3.7648 3.7237
-0.0905 -0.0614
-0.0008 -0.0005
0.0027 0.0027
-0.0003 -0.0004
31 32 33 34
3.6526 3.5553 3.4357 3.2975
-0.0310 0.0001 0.0316 0.0628
-0.0004 -0.0004 -0.0004 -0.0003
0.0027 0.0027 0.0028 0.0029
-0.0004 -0.0004 -0.0004 -0.0005
35
3.1447
0.0932
0.0000
0.0030
-0.0005
36 37 38 39 40 41 42 43 44 45
2.9810 2.8098 2.6344. 2.4575 2.2815 2.1086 1.9404 1.7785 1.6237 1.4770
0.1223 0.1497 0.1746 0.1966 0.2151 0.2297 0.2399 0.2454 0.2458 0.2409
0.0007 0.0018 0.0032 0.0048 0.0064 0.0078 0.0087 0.0087 0.0075 0.0048
0.0031 0.0031 0.0033 0.0034 0.0035 0.0036 0.0038 0.0039 0.0040 0.0042
-0.0005 -0.0005 -0.0005 -0.0005 -0.0005 -0.0006 -0.0006 -0.0006 -0.0006 -0.0006
577 724 AKAHIRA, TAKAHASHI, AND TAKEUCHI
P{sn = t} = P Sn F
ln = t - fit,
vn
1n
}
1 ^(y) S 1 + K3"2 (y3 - 3y) + K4' Z (y4 -6 Y2 +3) 1n
l
6vn
2412
2
-15)}+0(n , (4.7) +7211(ys-15y4+45y2 n JJJ ) where y := (t - Ft„) vn. The first term of the final expansion of (4.7) is called the normal approximation of the distribution of Sn. We consider the cases when (i) n = 20 , ri = 1 (j = 1 ,... , 20), q1 qs=0.1,qo=...=qio=0.2,qii=...=q1s=0 . 3,g1o= ...= q2o=0.4, (ii)n=20 , r3=2(j = 1,...,20), q1 =...= q5=O.1,gs=...=q1o=0.2, q11= ••=qis=0 . 3,q16 =•••=q2o = 0.4,(iii)n=2.0,rj = 1(j=1,...,20), q1=...=qs=0 . 2,q6=...=q1o = 0.4,qu=...=q1s=0 . 6,g16=..._ q2o = 0 . 8. In each case , we compare the exact distribution of S" , the normal approximation , the Edgeworth one (4 . 7), the first order large -deviation one LD1 (4. 3) and the second order large-deviation one LD2 (4.6) (see Tables 4.1 to 4.3). It is seen that the large-deviation approximation LDI and LD2 are much better than the others . Note that the normal approximation and the Edgeworth one correspond to LDI and LD2, respectively, in view of the order.
5. PROOF In this section we give the proof of Theorems 2 in Section 2. Proof of Theorem 2. From (2.10) we have for any z > 0 Pn(y + z) = eK.(B)- e(y+.) . 1 7r eK.,(B+=t)-K-(B)-=t (y+z)dt. 27r f We also obtain 1
eK„ (e+4t)- K„ (B)-tt(y+z)dt
27r _n 1 Kn2)(B) 24 Kn2 )(B) -n K,(,)(B)
iu Kn2)(e) - Kn(B) _ iu(y + z) du Kn2)(e)
578 HIGHER ORDER LARGE-DEVIATION APPROXIMATION 725
1 ,r KA21(B) iuz 1 exp - + (iU)2 21
x(2)(6) J a K,,25(9) K 2)(e) 2 3 + 1x(3) (e) iu
4 iu
+ 1 K14) (6)
}
)
6 x(2)(e) 24
du+O
(
u
(V'IO )
)
3
1
1
) O(u
2ZU
+ 1 x^3
)(e)
iu
-
2,rKn2)(9) ^00
K 2)(9) 6
K.2)(9)
4
1 + 24
iu
iu
z2u2
{Kn3)(B) }2 2Kn2)(B) + 72
VKn2)(e)
{Kn2)(B)}2} du+O (k ) 1
2
)(^) )
]
Kn3)(9)z Kn4)(9)
z2
2{Kn2)(e)}2 + 8{K,2)(9)}2
2^rK(2) a ^1- 2 Kn2)(e) n ( )
5{Kn3)(
(
)}
2
(2)
+
24{Kn ()} 3
O
\ (7
J
where cb(u) = (1 / 27r ) e-u2/2. Hence we have P{Sn > y} = 00E + z) Z=o K(3)(9)z 1 eKn(B)-9y 00 z2 E e-9: 1 - 2{Kn2)(9)}2 )(B) 27rKn2 )(8) :-0 2Kn2 + Kn4)(e) - 5{Kn3)(B)}2 +O
8{Kn2)(9) }2
24{Kn2)(8)}3
C1
n2 / } J
(5.1)
for all integer y > E(Sn). Thus we complete the proof.
ACKNOWLEDGEMENTS The authors wish to thank the referee for his comment on the independently, identically and binomially distributed case in Jensen (1995).
579 726 AKAHIRA , TAKAHASHI, AND TAKEUCHI
REFERENCES Barndorff-Nielsen, 0. E. and Cox, D. R. (1979). Edgeworth and saddlepoint approximations with statistical applications (with discussion).
J. Roy.
Statist. Soc. Ser. B, 41, 279-312. Barndorff-Nielsen, 0. E. and Cox, D. R. (1989). Asymptotic Techniques for Use in Statistics. Chapman and Hall, London. Booth, J. G. and Wood, A. T. A. (1995). An example in which th LugannaniRice saddlepoint formula fails . Statist. Probab . Lett., 23, 53-61. Daniels, H. E. (1954). Saddlepoint approximations in statistics. Ann. Math. Statist., 25, 631-650. Daniels, H. E. (1987). Tail probability approximations. Int. Statist. Review, 55, 37-48. Gatto, R. and Ronchetti, E. (1996). General saddlepoint approximations of marginal densities and tail probabilities. J. Amer. Statist. Assoc., 91, 666-673. Harvill, J. L. and Newton, H. J. (1995). Saddlepoint approximations for the difference of order statistics. Biometrika 82, 226-231. Jensen, J. L. (1995 ). Saddlepoint Approximations . Clarendon Press , Oxford. Lieberman, 0. (1994). On the approximation of saddlepoint expansions in statistics. Econometric Theory 10, 900-916. Lugannani, R. and Rice, S. (1980). Saddlepoint approximation for the distribution of the sum of independent random variables. Adv. Appl. Prob., 12, 475-490. Robinson, J. (1982). Saddlepoint approximations for permutation tests and confidence intervals. J. Roy. Statist. Soc. Ser. B, 44, 91-101. Received February , 1998; Revised October, 1998.
580 Ann. Inst . Statist. Math. Vol. 53, No. 3, 427-435 (2001) ©2001 The Institute of Statistical Mathematics
INFORMATION INEQUALITIES IN A FAMILY OF UNIFORM DISTRIBUTIONS MASAFUMI AKAHIRA' AND KEI TAKEUCHI2 'Institute of Mathematics, University of Tsukuba, Ibaraki 305-8571, Japan 2Faculty of International Studies, Meiji-Gakuin University , Kamikurata-cho 1598, Totsuka-ku, Yokohama 244-0816, Japan (Received June 2, 1999; revised January 21, 2000)
Abstract. For a family of uniform distributions, it is shown that for any small s > 0 the average mean squared error (MSE) of any estimator in the interval of 0 values of length e and centered at 0o can not be smaller than that of the midrange up to the order o(n-2) as the size n of sample tends to infinity. The asymptotic lower bound for the average MSE is also shown to be sharp. Key words and phrases: Best location equivariant estimator, average mean squared error, sufficient statistic.
1. Introduction Estimation of the mean 0 of the uniform distribution with known range, which may be assumed to be equal to 1, is simple but a typical case of non-regular estimation. It is known that the variance of the locally best unbiased estimator at any 0 = 00 is equal to zero even when the sample size is equal to one (see, e.g. Akahira and Takeuchi (1995)), while the best location equivariant estimator 0* = 1 I min Xi + max Xi 2 1 0 the average mean squared error of any estimator 0 in the interval of 0 values of length e and centered at 00 can not be smaller than that of 0*. More precisely we shall prove that for any estimator 0 based on the sample of size n 1 1 0o +e /2 n2Ee[(9 - 0)2]d0 > -
lim lim - J
-On-.oo E 0o-e/2 e-
2
*
This means that in a sense asymptotically 0* can be regarded as uniformly best. The result can be generalized to the case of estimation of the unknown location parameter 0 with the density f (x - 0), where f has the following conditions: (i) f (x) > 0 for a < x < b, f (x) = 0 otherwise. ( ii) lima .a+o AX) = lima-.b-o f (X) = A > 0.
581
428 MASAFUMI AKAHIRA AND KEI TAKEUCHI
(iii) f is continuously differentiable in the interval (a, b). Indeed, it is derived from the fact that the estimator which minimizes eo+E/2
Eo[(9 - 0)2]d9
I0. -E/2
is the Bayes estimator with respect to the uniform prior over the interval [90 - (e/2), 90 + (e/2)], and that is asymptotically equivalent to the one with respect to the prior over the entire interval and asymptotically equal to the estimator 9* = (mini
Suppose that X be distributed uniformly over the interval [0 - (1/2), 0 + (1/2)]. Let 9(X) be an unbiased estimator of 0, i.e. B+1/2 ^
9(x)dx=0
for all 0E R.
B-1/2 1
Denote the variance of 9 by 0+1/2
v( 0) := V0 (9(X))
(2.1)
=
{9(x ) - 0}2dx.
0-1/2
Now we have the following. THEOREM 2 . 1.
For any 0 E R
r0+1/2
J
1 /2
r1/2
v(t)dt = f v(t)dt =
0-1/2
1/2
J
1/2
{9(x) - x}2dx + J x2dx >_ VB(X) =
1/2
1/2
12.
PROOF. Denote
O(x) := 9(x) - X. Then O (x) is a periodic function with periodicity 1, i.e. V'(x + 1) = V)( x) for almost all x. Indeed, since f B+1/2
O(x)dx = 0
for all 9 E R,
B-1/2
by differentiation we have
( This shows the periodicity of fi(x).
o_)
=o
a.e.
582 INFORMATION INEQUALITIES IN UNIFORM CASES 429
Now we have 9+1/2
(2.2)
{b(x) + x - 0}2dx
v(9) = B-1/2
=
r 9+1/2 9 +1/2 9+1/2 (x - 0)2dx. (x - 0)0(x)dx + b2 (x)dx + 2 -1/2 o -1/2 o - 1/2
f
f
Since V) is a periodic function, if we express 0 by n + p with an integer n and 0 < p < 1, we have r 1/2 p+1/2 02(x)dx 02(x)dx + 02(x)dx = 02(x)dx = -1/2 P p-1/2 1/2 1/2 1/2 f p-1/2 1/2 = zb2(x)dx + zb2(x)dx V)2(x)dx 1/2 1/2 p -1/2 0+1/2
(2.3)
1
and
J
(2.4)
0
J
J f
9+1/2
p+1/2
(x - 0)b(x)dx =
p+1/2
(x - p)'b(x)dx =
J
p - 1/2
-1/2
J
J
x^/i(x)dx
p-1/2 p+1/2
1/2
1 = xb(x)dx + J fp r f = xz)(x)dx + J J
xb(x)dx + x'(x)dx = f 1 /2 p -1/2 1/2
p-1/2
-1/2 1/2
(x + 1)(x)dx 1/2 -1/2
' 1/2
1/z
V,(x)dx.
From (2.2), (2.3) and (2.4) we have f1/2
V(0) = v(p) =
J
1/2 fp-1/2
1/2
zli2(x)dx +
x2dx + 2 xb(x)dx +
f
1/2 1/21/2
J
b(x)dx
1/2
Therefore, if we can prove that fl
{
/z
rp-1/2
x (x)dx + iL / 2
J
0(x)dx dp = 0, 1/2
then the theorem is established. We denote rp-1/2
`I`( p)
(2.6)
/
O(x)dx. 1/2
Then we have 1@(0) _ 'Y(1) = 0, and it is shown that (2.7)
fo
{
1/2
f
xO(x)dx 1/2
1/2
dp =
J
xo(x)dx
1/2
Jo -1
Cp
2/7P \p 21
1
= -I `F(p)dp• 0
p - 1)W'(p)dp
583 MASAFUMI AKAHIRA AND KEI TAKEUCHI
430
From (2.6) and (2.7) we get (2.5). ❑ For estimators not necessarily unbiased, we have the following. Let M(9) be the mean squared error (MSE) of an estimator 9(X)
THEOREM 2.2. of 0, i.e.
0+1/2
{9(x) -
M(9):-
0}2dx.
10-1/2
Then for any 00 E R 1
9o+E /2
M(9)d9 >
eo-E/2
(
e-
2
for e >
for e < 1.
12 (1 - 2
PROOF. We assume 00 = 0 without loss of generality. Then we have for e > 1 f E/2 (2.9)
J
M(9)dO E/2 E/2 B+1/2
{ B(x) - 9}2dx
= dB
J = dO J f^:/2
B-1/22
E/2 B+1/2
f ,-/2
{ 92 (x) - 298(x) + 02 }dx
B -1/22
f
{92(x) - 2y9(x)}dxdy
f(x,y) l lx -yl <1/ 2, 0<x<(e/2)+(1/2)}
+{2(x) - 2y(x)}dxdy + 12 ff (x,y)flx-yl<1 / 2,-(E/2)-(1 / 2)<x<0} 3
Il + 12 + 12, where
Il = ff
{ 92(x) - 2y9 (x)}dydx, (x,y)IIx-yl <1 / 2,0<x<(6 / 2)+(1/2)}
I2 = ff { 92(x) - 2y9(x)}dydx. (x,y) I Ix-yI <1 / 2,-(6/2)-(1/2) <x<0}
Then we have +1/2 6 / 2+1/2 6/2 E/2-1/2 x 1/2 6
(2.10) Il = f -
J
f
f - + f
0
x
{92(x) - 2y9(x)}dydx
/2-1/2 x -1/2
pE/2-1/2
{92(x) - 2x9(x)}dx
0
E/2+1/
+ 1/2-1/2
2 {(e - ) ( x+-f9 ) x- x+ - +x() -- ^ (X) - () } dx
584 INFORMATION INEQUALITIES IN UNIFORM CASES 431 fE/2-1/2 f 6/21/2
> -
J
0
1
/E-
1)
2
- -x+-/ \ 2 +- I
x2dx-
4
/2-1/2
(
dx
/
24 (E - 1)3 - 48 (6e2 - 8e + 3) = 48 (-2e 3 + 2e - 1), where the equality holds for
B(x) =
for 0<x< 2 2'
ix 1 e
2 ( 2 +x- 2
for 2-2<x<2
2
Similarly we obtain
12 > 48 (-2e3 + 2e - 1).
(2.11)
From (2.9), (2.10) and (2.11) we have the inequality (2.8) for e > 1. For e < 1, we have
(2.12)
f
E/2
M(O)dO E/ 2
= {e2 (x) - 20(x)}dxdB ff (x ,e) II x- 0I<1/2,I0l<E/2} 3
=I1+12+12, where Ii and 12' denote I1 and 12, respectively. Then we obtain
(2.13)
Ii =
1
e/2 0+1/2
f {B2(X) - 298(x)}dxd0 B-1/2 -E/2+1/2 E / 2 E/2+1/2 E/2
+ ff {B2 (x) I E/ 2 E/2+1 / 2 x_1/2 ) E /2+1/2 E/2+1/2
EB2(x)dxf0
1-
6
/ 2+1 / 2
(2 -x+ 1
{92(x)- (+x_) O(x)}dx 1
> -4 f
E
/2+1 //2 E
E
/2 +1 /2
(2
x
6
1
+2)l
(2
Similarly we have
+x
2)
12 - 2) dx
3
(2.14) 12 > - 48 . From (2.12), (2.13) and (2.14) we get the inequality (2.8) for e < 1. ❑ COROLLARY 2.1. For the case when the range is equal to t instead of 1, let Me(0) be the MSE of an estimator 0(X) of 0, i.e. e+e/2 Mt(0) :=
{B(x) - 0}2dx. B-e/2
585 MASAFUMI AKAHIRA AND KEI TAKEUCHI
432
Then for any Bo E R 3
rBp+E/2 Me (B)d9 > 12 Bp
for
(Q2e - 2
-E/2
_
E > Q,
for E < Q.
1 2 (1 2f
-
Let Y := X/P and 0' := 0/L. Then it follows that Y
OUTLINE OF THE PROOF.
uniformly distributed on the interval [0' - (1/2), 0' + ( 1/2)]. Letting Bo := 0o/Q and := E lf, from Theorem 2.2 we have 3
e 1 12(^ 2)
MW(0)de >
J3 3
12kt) \1 (Q2
f 1
2't 3
E- 2/
11
3
12 1 2P/
3. Asymptotic lower bound for the average mean squared error Now suppose that X1,. .. , X,,, are independently, identically and uniformly distributed over the interval [0 - (1/2), 0 + (1/2)]. Let Y :_ (X(l) + X(,,,))/2 and R = X(„) - X(l), where X(1) := mini
JE : =
J
EB [(B - 0)2] do E/2
E/2
= ER J EB [(6 - 0)2
1
R] dO
E/2
From Corollary 2.1, we have
EB [(B - 0)2 1 R] de >
(3.1)
(1-R) 2e_ (1-R) 3} for 13 2
e> 1-R,
for
E < 1 - R.
J
E/2
12
1
2 ( l E R) }
The density of R is given by for 0 < R < 1, (3.2) f(R) _ n(n - 1)R'-2(l - R) 0 otherwise,
586 INFORMATION INEQUALITIES IN UNIFORM CASES 433
hence it follows that for any e < 1 (3 .3)
JE>nn-1 1 el R )2 f 1-e +n(n-1)
1( 1 R -1
e
0 12{1 2(1e- R)}
2(n + 1)(n + 2)
R)dR
Rn-2(1 -R)dR
- 12e(1 - e)n-1 + 4 (n - 1)e(1 - )n n+1+ n(n-1)e 1-e n+2
n(n-1)e
4(n +(11)) 12(n + 2) ( ) 1 1 n (1 - )n- - 6 (n - 1)(1 - )n (n + 1)(n + 2)(n + 3) + 24
+n(n - 1) (1 - 6)n+1 - n(n - e)n+2 + n(n - 1) (1 - )n+3 1) (1 4(n + 1) 6(n + 2) 24(n + 3) + 12e3(1 - e)n-1 - 12 (n - 1)e3(1 - E)n - 244(l - e)n-1.
Then we have the following. THEOREM 3 . 1.
For any estimator B = 0(X(1 ), X(n)) 2 e/2
(3.4) lim lim n2 JE = lim limn -+0 n-oo e-+0 n-oo a
J
Ee [( B - 0)2] dO > 1. E /2
2
The proof is straightforwardly derived from (3.3). We also have a somewhat weaker result.

COROLLARY 3.1. For any estimator $\hat\theta = \hat\theta(X_{(1)}, X_{(n)})$,

(3.5)
\[
\lim_{\epsilon\to 0}\lim_{n\to\infty} n^{2}\sup_{|\theta|<\epsilon/2} E_\theta[(\hat\theta-\theta)^{2}] \ge \frac12 .
\]

The proof is omitted since (3.5) is easily derived from Theorem 3.1. Note that the equalities in (3.4) and (3.5) are attained by $\theta^* := \frac12(X_{(1)}+X_{(n)})$. Indeed, since the probability density function of $\theta^*$ is given by
\[
f_{\theta^*}(y) =
\begin{cases}
n(1-2|y-\theta|)^{n-1} & \text{for } \theta-\dfrac12 \le y \le \theta+\dfrac12,\\
0 & \text{otherwise},
\end{cases}
\]
it follows that

(3.6)
\[
E_\theta[(\theta^*-\theta)^{2}] = \frac{1}{2(n+1)(n+2)} .
\]
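Equation (3.6) can be confirmed by integrating $u^2$ against the density of $\theta^* - \theta$. The check below is an editorial addition with an illustrative function name.

```python
def midrange_mse(n, m=100_000):
    """E[(theta* - theta)^2] from the density n*(1 - 2|u|)**(n-1), u = y - theta.

    Composite midpoint rule over u in [-1/2, 1/2]; the midpoints avoid the
    kink of |u| at u = 0, so the quadrature stays second-order accurate.
    """
    h = 1.0 / m
    s = 0.0
    for i in range(m):
        u = -0.5 + (i + 0.5) * h
        s += u * u * n * (1.0 - 2.0 * abs(u)) ** (n - 1)
    return s * h
```

For $n = 2$ this gives $1/24 = 1/(2 \cdot 3 \cdot 4)$, in agreement with (3.6).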
Since $J_\epsilon = \epsilon/\{2(n+1)(n+2)\}$ for $\theta^*$ by (3.6), it is easily seen that
\[
\lim_{\epsilon\to 0}\lim_{n\to\infty}\frac{n^{2}}{\epsilon}J_\epsilon
= \lim_{\epsilon\to 0}\lim_{n\to\infty}\frac{n^{2}}{2(n+1)(n+2)} = \frac12 ,
\]
which implies that the equality in (3.4) is attained by $\theta^*$. Since by (3.6)
\[
\lim_{n\to\infty} n^{2}\sup_{|\theta|<\epsilon/2} E_\theta[(\theta^*-\theta)^{2}]
= \lim_{n\to\infty}\frac{n^{2}}{2(n+1)(n+2)} = \frac12 ,
\]
the equality in (3.5) is seen to be attained by $\theta^*$. Finally we shall show that the Mori type inequality is easily derived from the inequality (3.1). Letting $c := \epsilon/2$, we have from (3.1)
\[
\int_{-c}^{c} E_\theta[(\hat\theta-\theta)^{2} \mid R]\,d\theta
\ge \frac{1}{12}\Bigl\{2c(1-R)^{2}-\frac{(1-R)^{3}}{2}\Bigr\}
\quad\text{for } c \ge \frac{1-R}{2},
\]
hence

(3.7)
\[
\int_{-c}^{c} E_\theta[(\hat\theta-\theta)^{2}]\,d\theta
= E_R\Bigl[\int_{-c}^{c} E_\theta[(\hat\theta-\theta)^{2} \mid R]\,d\theta\Bigr]
\ge \frac{1}{12} E_R\Bigl[2c(1-R)^{2}-\frac12(1-R)^{3}\Bigr]
\]
for large $c$. Since by (3.2)
\[
E_R[(1-R)^{2}] = \frac{6}{(n+1)(n+2)}, \qquad
E_R[(1-R)^{3}] = \frac{24}{(n+1)(n+2)(n+3)},
\]
it follows from (3.7) that
\[
\frac{1}{2c}\int_{-c}^{c} E_\theta[(\hat\theta-\theta)^{2}]\,d\theta
\ge \frac{1}{2(n+1)(n+2)}-\frac{1}{2c(n+1)(n+2)(n+3)}
\]
for large $c$, which implies
\[
\lim_{c\to\infty}\frac{1}{2c}\int_{-c}^{c} E_\theta[(\hat\theta-\theta)^{2}]\,d\theta
\ge \frac{1}{2(n+1)(n+2)} .
\]
This type of inequality is given by Mori (1983).
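The two moments of $1-R$ used above follow from the Beta-type density (3.2). A quick numerical confirmation, added editorially with an illustrative function name:

```python
def moment_1mR(n, k, m=100_000):
    """E[(1-R)**k] under the density (3.2): n(n-1) * R**(n-2) * (1-R) on (0, 1)."""
    h = 1.0 / m
    s = 0.0
    for i in range(m):
        R = (i + 0.5) * h  # midpoint rule; integrand is smooth on (0, 1)
        s += (1.0 - R) ** k * n * (n - 1) * R ** (n - 2) * (1.0 - R)
    return s * h
```

Setting $k = 0$ also checks that the density integrates to 1.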
Acknowledgement
The authors thank the referee for the kind comments.
REFERENCES

Akahira, M. and Takeuchi, K. (1995). Non-Regular Statistical Estimation, Lecture Notes in Statistics, No. 107, Springer, New York.
Khatri, C. G. (1980). Unified treatment of Cramér-Rao bound for the nonregular density functions, J. Statist. Plann. Inference, 4, 75-79.
Mori, T. F. (1983). Note on the Cramér-Rao inequality in the nonregular case: The family of uniform distributions, J. Statist. Plann. Inference, 7, 353-358.
Vincze, I. (1979). On the Cramér-Fréchet-Rao inequality in the non-regular case, Contributions to Statistics, The J. Hájek Memorial Volume, 253-262, Academia, Prague.
Permissions

The editors and World Scientific Publishing Co. Pte. Ltd. would like to thank the original publishers of the joint papers of Akahira and Takeuchi for granting permissions to reprint specific papers in this volume. The following list contains the credit lines for those articles.

[1] Reprint from Annals of Statistics, 3 ©1975 by the Institute of Mathematical Statistics.
[2] Reprint from Rep. Univ. Electro-Commun., 26 ©1976 by The University of Electro-Communications.
[3] Reprint from Rep. Univ. Electro-Commun., 27 ©1976 by The University of Electro-Communications.
[4] Reprint from Rep. Univ. Electro-Commun., 27 ©1976 by The University of Electro-Communications.
[5] Reprint from Lecture Notes in Mathematics 550 ©1976 by Springer-Verlag.
[6] Reprint from Ann. Inst. Statist. Math., 29 ©1977 by The Institute of Statistical Mathematics.
[7] Reprint from Rep. Univ. Electro-Commun., 28 ©1978 by The University of Electro-Communications.
[8] Reprint from Rep. Univ. Electro-Commun., 28 ©1978 by The University of Electro-Communications.
[9] Reprint from Rep. Univ. Electro-Commun., 29 ©1978 by The University of Electro-Communications.
[10] Reprint from Rep. Stat. Appl. Res., JUSE, 26 ©1979 by Union of Japanese Scientists and Engineers.
[11] Reprint from Rep. Stat. Appl. Res., JUSE, 26 ©1979 by Union of Japanese Scientists and Engineers.
[12] Reprint from Ann. Inst. Statist. Math., 31 ©1979 by The Institute of Statistical Mathematics.
[13] Reprint from Ann. Inst. Statist. Math., 31 ©1979 by The Institute of Statistical Mathematics.
[14] Reprint from Rep. Univ. Electro-Commun., 30 ©1979 by The University of Electro-Communications.
[15] Reprint from Austral. J. Statist., 22 ©1980 by Blackwell Publishing Asia.
[16] Reprint from Rep. Univ. Electro-Commun., 31 ©1980 by The University of Electro-Communications.
[17] Reprint from Statistics & Decisions, 1 ©1982 by Oldenbourg Wissenschaftsverlag GmbH.
[18] Reprint from Ann. Inst. Statist. Math., 37 ©1985 by The Institute of Statistical Mathematics.
[19] Reprint from Rep. Stat. Appl. Res., JUSE, 32 ©1985 by Union of Japanese Scientists and Engineers.
[20] Reprint from Ann. Inst. Statist. Math., 38 ©1986 by The Institute of Statistical Mathematics.
[21] Reprint from Metrika, 33 ©1986 by Physica-Verlag.
[22] Reprint from Metrika, 33 ©1986 by Physica-Verlag.
[23] Reprint from Publ. Inst. Stat. Univ. Paris, 31 ©1986 by LSTA, Université Pierre et Marie Curie.
[24] Reprint from Foundations of Statistical Inference, Advances in the Statistical Sciences Vol. 2 (ed. McNeill) ©1987 by Kluwer Academic Publishers.
[25] Reprint from Metrika, 34 ©1987 by Physica-Verlag.
[26] Reprint from Ann. Inst. Statist. Math., 39 ©1987 by The Institute of Statistical Mathematics.
[28] Reprint from Lecture Notes in Mathematics 1299 ©1988 by Springer-Verlag.
[29] Reprint from Ann. Inst. Statist. Math., 41 ©1989 by The Institute of Statistical Mathematics.
[31] Reprint from Publ. Inst. Stat. Univ. Paris, 35 ©1990 by LSTA, Université Pierre et Marie Curie.
[32] Reprint from Austral. J. Statist., 32 ©1990 by Blackwell Publishing Asia.
[33] Reprint from Ann. Inst. Statist. Math., 43 ©1991 by The Institute of Statistical Mathematics.
[35] Reprint from Rep. Stat. Appl. Res., JUSE, 38 ©1991 by Union of Japanese Scientists and Engineers.
[37] Reprint from Rep. Stat. Appl. Res., JUSE, 39 ©1992 by Union of Japanese Scientists and Engineers.
[38] Reprint from Metron, 50 ©1992 by Dipartimento di Statistica, Probabilità e Statistiche Applicate, Università degli Studi di Roma.
[39] Reprint from Statistical Sciences and Data Analysis (ed. Matusita et al.) ©1993 by VSP International Science Publishers.
[40] Reprint from Statistica Neerlandica, 47 ©1993 by Blackwell Science Ltd.
[42] Reprint from Metron, 55 ©1997 by Dipartimento di Statistica, Probabilità e Statistiche Applicate, Università degli Studi di Roma.
[44] Reprint from Ann. Inst. Statist. Math., 53 ©2001 by The Institute of Statistical Mathematics.