DISTRIBUTION THEORY OF
RUNS AND PATTERNS AND ITS
APPLICATIONS A Finite Markov Chain Imbedding Approach
JAMES C. FU • W. Y. WENDY LOU
JAMES C. FU University of Manitoba, Canada
W. Y. WENDY LOU University of Toronto, Canada
World Scientific
New Jersey • London • Singapore • Hong Kong
Published by World Scientific Publishing Co. Pte. Ltd. 5 Toh Tuck Link, Singapore 596224 USA office: Suite 202, 1060 Main Street, River Edge, NJ 07661 UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE
Library of Congress Cataloging-in-Publication Data Fu, James C. Distribution theory of runs and patterns and its applications / James C. Fu, W. Y. Wendy Lou. p. cm. Includes bibliographical references and index. ISBN 981-02-4587-4 (alk. paper) 1. Markov processes. 2. Random variables. 3. Distribution (Probability theory). I.Lou, W.Y.Wendy. II. Title. QA274.7.F8 2003 519.2'33-dc21
2003053824
British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library.
Copyright © 2003 by World Scientific Publishing Co. Pte. Ltd. All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.
For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.
Printed in Singapore.
To our parents
Preface
It is the purpose of this book to provide a rigorous, comprehensive introduction to the finite Markov chain imbedding technique for studying the distributions of runs and patterns from a unified and intuitive viewpoint, away from the lines of traditional combinatorics. Over the past two decades, a considerable number of new results related to the distributions of runs and patterns have been obtained through this approach. The central theme of finite Markov chain imbedding, as the name suggests, is to properly imbed the random variables of interest into the framework of a finite Markov chain, and the resulting representations of the underlying distributions are compact and very amenable to further study of associated properties. In this book, the concept of finite Markov chain imbedding is systematically developed, and its utility is illustrated through practical applications to a variety of fields, including the reliability of engineering systems, hypothesis testing, quality control, and continuity measurement in the health care sector. This book is restricted to discrete sample spaces, a restriction which serves to make this work accessible to a wider audience by simplifying the theoretical results and their applications. The runs and patterns considered herein are largely defined on sequences of Markov-dependent two- as well as multi-state trials with practical applications in mind; those defined on random permutations of integers, such as the Eulerian and Simon Newcomb numbers, are also treated using an additional insertion procedure. The content of this book is geared mainly towards researchers who are using the distribution theory of runs and patterns in various applied areas of statistics, probability and combinatorics, but it could also form the basis
of a one-semester special-topics course at the fourth-year undergraduate or at the first-year graduate level. We wish to acknowledge the assistance of Y. M. Chang and B. C. Johnson for proofreading early drafts of the book, as well as the encouragement from our colleagues at the University of Manitoba and the University of Toronto. We are also indebted to our families for their endless support. Lastly, we wish to thank Ms. E. H. Chionh of World Scientific Publishing Co. for her patience and managerial support.
JAMES C. FU, Winnipeg, Manitoba
W. Y. WENDY LOU, Toronto, Ontario
Contents
Chapter 1  Introduction  1

Chapter 2  Finite Markov Chain Imbedding  5
2.1 Finite Markov Chain  5
2.2 Chapman-Kolmogorov Equation  8
2.3 Tree-Structured Markov Chain  9
2.4 Runs and Patterns  10
2.5 Finite Markov Chain Imbedding  13
2.6 Absorbing State  18
2.7 First-Entry Probability  21

Chapter 3  Runs and Patterns in a Sequence of Two-State Trials  25
3.1 Introduction  25
3.2 Number of Non-Overlapping Consecutive k Successes  26
3.3 Number of Success Runs of Length Greater Than or Equal to k  31
3.4 Number of Overlapping Consecutive k Successes  33
3.5 Number of Runs of Exactly k Successes  35
3.6 The Distribution of the Longest Success Run  37
3.7 Waiting-Time Distribution of a Success Run  40
3.8 Numerical Examples  43
3.9 Number of Successes in Success Runs of Length Greater Than or Equal to k  44

Chapter 4  Runs and Patterns in Multi-State Trials  49
4.1 Introduction  49
4.2 Forward and Backward Principle with Non-Overlap Counting  49
4.3 Overlap Counting  57
4.4 Series Pattern  59
4.5 Joint Distribution  60

Chapter 5  Waiting-Time Distributions  63
5.1 Introduction  63
5.2 The Waiting Time of a Simple Pattern  65
5.3 The Waiting Time of a Compound Pattern  66
5.4 Probability Generating Function  72
5.5 Mean of Waiting Time W(Λ)  76
5.6 More About Generating Functions  80
5.7 Spectrum Analysis and Large Deviation Approximation  84
5.8 Probability Generating Function of W(r, Λ)  87
5.9 Scan Statistics  89

Chapter 6  Random Permutations  97
6.1 Introduction  97
6.2 Successions  99
6.3 Eulerian and Simon Newcomb Numbers  107

Chapter 7  Applications  117
7.1 Introduction  117
7.2 System Reliability  117
7.2.1 Consecutive-k-out-of-n:F System  119
7.2.2 Linearly Connected System  126
7.3 Hypothesis Testing  127
7.4 Sequential Continuity  133
7.5 Quality Control Schemes  140
7.5.1 Simple Control Schemes  141
7.5.2 Compound Control Schemes  147

Bibliography  153

Index  161
Chapter 1
Introduction
The occurrence of runs and patterns in a sequence of discrete trial outcomes or random permutations is an important concept in various areas of science, including reliability engineering, quality control, psychology, sociology, DNA sequence matching, and hypothesis testing. Results for the probability distributions of elementary runs and patterns were derived sporadically in the literature until about the 1940s, when a number of pioneering studies on more complex runs and patterns were published: for example, Wishart and Hirshfeld (1936), Cochran (1938), Mood (1940), Wald and Wolfowitz (1940), Mosteller (1941), and Wolfowitz (1943). Most of these studies focused on finding the conditional distribution of success runs given the total number of successes in a sequence of two-state trials. A recent book by Balakrishnan and Koutras (2002) provides an excellent, comprehensive review of historical and current developments in the distribution theory of runs and scan statistics. Traditionally, the distributions of runs and patterns were studied via combinatorial analysis. For example, Mood (1940) wrote: "The distribution problem is, of course, a combinatorial one, and the whole development depends on some identities in combinatory analysis". However, finding the appropriate combinatorial identities to derive the probability distributions can be difficult, if not impossible, for complex runs and patterns, and this perhaps is the reason why the exact distributions of many common statistics defined on runs and patterns remain unknown. Furthermore, the required identities often differ even for similar runs and patterns, and hence, even in the simplest case of independent and identically distributed (i.i.d.) two-state trials (so-called "Bernoulli trials"), each new distribution
problem typically has to be studied on a case-by-case basis using the combinatorial approach. For example, only relatively recently did Philippou and Makri (1986) and Hirano (1986), independently and via combinatory analysis, obtain the exact distribution of the traditional runs statistic $N_{n,k}$:

$$P(N_{n,k} = x) = p^n \sum_{m=0}^{k-1} \; \sum_{\substack{x_1, \ldots, x_k \ge 0 \\ x_1 + 2x_2 + \cdots + kx_k = n - m - kx}} \binom{x_1 + \cdots + x_k + x}{x_1, \ldots, x_k, x} \left(\frac{q}{p}\right)^{x_1 + \cdots + x_k},$$

for $x = 0, 1, \cdots, [n/k]$, with success and failure probabilities denoted by $p$ and $q = 1 - p$, respectively. Another approach for determining an exact probability distribution involves deriving the generating function; for the waiting time of the first success run of length $k$, for example, Feller (1968) obtained

$$\varphi(s) = \frac{p^k s^k (1 - ps)}{1 - s + q p^k s^{k+1}}.$$
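The multinomial sum above can be checked numerically. The following sketch (ours, not from the book; the function names and parameter values are purely illustrative) evaluates the formula and compares it against a brute-force enumeration of all $2^n$ success/failure sequences:

```python
from itertools import product
from math import factorial, prod

def pmf_formula(n, k, x, p):
    """P(N_{n,k} = x) via the Philippou-Makri / Hirano multinomial sum."""
    q = 1.0 - p

    def compositions(target, i):
        # yield tuples (x_i, ..., x_k) with sum_{j >= i} j * x_j == target
        if i > k:
            if target == 0:
                yield ()
            return
        for xi in range(target // i + 1):
            for rest in compositions(target - i * xi, i + 1):
                yield (xi,) + rest

    total = 0.0
    for m in range(k):                       # length of the trailing partial run
        t = n - m - k * x
        if t < 0:
            continue
        for xs in compositions(t, 1):
            s = sum(xs)                      # x_1 + ... + x_k
            coef = factorial(s + x) // (prod(factorial(v) for v in xs) * factorial(x))
            total += coef * (q / p) ** s
    return p ** n * total

def pmf_bruteforce(n, k, x, p):
    """P(N_{n,k} = x) by enumerating all 2^n sequences (Feller's counting)."""
    total = 0.0
    for seq in product((0, 1), repeat=n):
        count, run = 0, 0
        for b in seq:
            run = run + 1 if b else 0
            if run == k:                     # non-overlap: restart after a full run
                count, run = count + 1, 0
        if count == x:
            total += p ** sum(seq) * (1 - p) ** (n - sum(seq))
    return total
```

With, say, $n = 8$, $k = 3$ and $p = 0.6$, the two computations agree for every $x = 0, 1, 2$, which is a useful sanity check on the combinatorial identity.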
For more complex runs and patterns, the generating functions can be difficult to differentiate a large number of times, and approximation techniques may have to be employed. Feller used the method of partial fraction expansion, which can require efficient numerical methods for computing roots of polynomials. Throughout this book, we will study distribution problems of runs and patterns from what is, in our opinion, a more unified and intuitive viewpoint, away from the lines of traditional combinatorics. The approach taken is to properly imbed the random variable $X_n(\Lambda)$ into a finite Markov chain $\{Y_t\}$, so that the probability of $X_n(\Lambda) = x$ can be expressed in terms of the probability of the Markov-chain outcome $Y_n$ residing in a subset $C_x$ of the state space; i.e.

$$P(X_n(\Lambda) = x) = P(Y_n \in C_x),$$
where the right-hand-side probability can be easily computed through the transition probability matrices of the Markov chain. This representation of the underlying distribution of $X_n(\Lambda)$ is compact, easy to compute, and quite amenable to further analysis. The method is largely dependent on being able to construct a proper Markov chain associated with the random variable $X_n(\Lambda)$, but once the chain is constructed, the linearity of the Markov chain reduces the computational complexity often associated with combinatorial and generating-function techniques for computing the exact pdfs of runs and patterns. Early results in the distribution theory of runs and patterns were derived almost exclusively under the assumption of Bernoulli or i.i.d. multi-state trials. A great advantage of the finite Markov chain imbedding technique is that it can be applied not only to i.i.d. cases, but, with little additional effort, also to Markov-dependent multi-state trials, regardless of the counting procedures specified for overlapping patterns (overlap vs. non-overlap counting); it can also be extended to various types of runs and patterns on random permutations. Recently this method has been adopted by a number of researchers to study various distributions of runs and patterns: for example, Antzoulakos (1999, 2001), Boutsikas and Koutras (2000a,b), Doi and Yamamoto (1998), Fu (1985, 1986, 1996), Fu and Koutras (1994), Fu and Lou (2000a,b), Han and Aki (2000a,b), Johnson (2002), Koutras (1996a,b, 1997a,b, 2003), Koutras and Alexandrou (1995), Lou (1996, 2000, 2001), and Nishimura and Sibuya (1997). We will touch on some of these recent works, but our corresponding formulations may differ slightly in order to treat all problems using a common imbedding approach.
This book is not a review book for the theory of runs and patterns, nor is it intended to be used primarily as a course textbook; it is mainly aimed at researchers in applied statistics and probability who are interested in using the finite Markov chain imbedding technique to study the distributions of runs and patterns arising in specific applications. The contents of the book are largely based on recent developments in this area, but are presented in a manner that does not require knowledge of advanced concepts in mathematics or probability; a background in probability theory, at the level of, for example, Feller's (1968) book "An Introduction to Probability Theory and Its Applications, Volume I", is assumed. This book is organized in the following way. In Chapter 2, we introduce the basic ideas and techniques of finite Markov chain imbedding. This chapter lays the foundation for computing the pdfs of runs and patterns,
including waiting-time distributions. Chapter 3 examines the distributions of runs and patterns associated with two-state trials, and in Chapter 4, the extension to multi-state trials via the forward and backward principle is treated. Chapter 5 mainly studies the waiting-time distributions of simple and compound patterns, as well as their generating functions and large deviation approximations. In Chapter 6, the finite Markov chain imbedding technique is extended to the study of distributions of patterns in random permutations of integers, focusing in detail on the Eulerian and the Simon Newcomb numbers. Chapter 7 covers several applications of the distribution theory of runs and patterns in the areas of the reliability of engineering systems, hypothesis testing, continuity measurement in health care, and quality control.
Chapter 2
Finite Markov Chain Imbedding
2.1 Finite Markov Chain
Let $\Omega = \{1, 2, \cdots, m\}$ ($m < \infty$) be a finite state space, and let $\{Y_t\} = \{Y_0, Y_1, \cdots, Y_t, \cdots\}$ be a sequence of random variables defined on $\Omega$.

Definition 2.1 The sequence $\{Y_t\}$ will be called a finite Markov chain if, for any sequence $\{Y_0 = i_0, Y_1 = i_1, \cdots, Y_{t-1} = i_{t-1}, Y_t = i_t\}$, $t = 1, 2, \cdots$, we have

$$P(Y_t = i_t \mid Y_{t-1} = i_{t-1}, \cdots, Y_0 = i_0) = P(Y_t = i_t \mid Y_{t-1} = i_{t-1}). \tag{2.1}$$

In other words, the sequence is a Markov chain if the probability that the system enters the state $i_t$ at time $t$ depends only on the immediately preceding state $i_{t-1}$ at time $t - 1$. Or more succinctly, viewed from the state at time $t - 1$, the future is independent of the past. The conditional probabilities

$$P(Y_t = j \mid Y_{t-1} = i) = p_{ij}(t), \tag{2.2}$$

$i, j \in \Omega$, are called one-step transition probabilities for the system at time $t$. The transition probabilities $p_{ij}(t)$, $1 \le i, j \le m$, may be represented as an $m \times m$ matrix

$$M_t = (p_{ij}(t)) = \begin{pmatrix} p_{11}(t) & p_{12}(t) & \cdots & p_{1m}(t) \\ p_{21}(t) & p_{22}(t) & \cdots & p_{2m}(t) \\ \vdots & \vdots & & \vdots \\ p_{m1}(t) & p_{m2}(t) & \cdots & p_{mm}(t) \end{pmatrix}.$$
The matrices $M_t$, $t = 1, 2, \cdots$, are called one-step transition probability matrices.

Definition 2.2 A Markov chain $\{Y_0, Y_1, \cdots\}$ is homogeneous if the transition probabilities are constant in time, i.e. $P(Y_t = j \mid Y_{t-1} = i) = p_{ij}$ for any $i, j \in \Omega$, and all $t = 1, 2, \cdots$.

This definition is equivalent to saying that the transition probability matrices of a homogeneous Markov chain may be represented by the single matrix

$$M = M_t = (p_{ij}), \quad \text{for all } t = 1, 2, \cdots,$$

where the transition probabilities $p_{ij}$ are independent of the time index $t$. The set of probabilities at time 0, $P(Y_0 = i)$ for $i = 1, \cdots, m$, is referred to as the initial distribution of the Markov chain. Given an initial distribution and the transition probabilities of a Markov chain, the joint distribution of the chain can be computed as follows:

$$P(Y_n = i_n, \cdots, Y_1 = i_1, Y_0 = i_0) = P(Y_n = i_n \mid Y_{n-1} = i_{n-1}) \cdots P(Y_1 = i_1 \mid Y_0 = i_0) P(Y_0 = i_0). \tag{2.3}$$

Markov chains have been used in the modeling of a vast number of applications. Here we give two simple examples that are often seen in applied probability theory:

Example 2.1 (Gambler's ruin problem). Consider a gambler who wins and loses a dollar with probabilities $p$ and $q = 1 - p$, respectively. Suppose the gambler has an initial capital of $a$ dollars. The gambler quits playing the game either when he is out of capital ("ruined") or when he attains a fortune of $a + b$ dollars (with net gain $b > 0$). The sequence of the gambler's amounts of capital, $\{Y_t, t = 0, 1, 2, \cdots\}$, forms a homogeneous Markov chain with state space $\Omega = \{0, 1, 2, \cdots, a, a + 1, \cdots, a + b\}$ and the following transition probabilities:

$$p_{ij} = \begin{cases} p & \text{if } j = i + 1, \\ q & \text{if } j = i - 1, \end{cases}$$
for $i = 1, 2, \cdots, a + b - 1$, $p_{00} = p_{a+b,a+b} = 1$, and zero elsewhere. The states 0 and $a + b$ are referred to as absorbing states since, once entered,
they will never be exited. The Markov chain has the transition probability matrix

$$M = \begin{pmatrix} 1 & 0 & & & & \\ q & 0 & p & & O & \\ & \ddots & \ddots & \ddots & & \\ & & q & 0 & p & \\ & O & & q & 0 & p \\ & & & & 0 & 1 \end{pmatrix},$$

with rows and columns indexed by the states $0, 1, \cdots, a + b$, where $O$ represents a zero matrix, and the chain has the initial distribution $P(Y_0 = a) = 1$. ◊

Example 2.2 (Randomly inserting balls into urns). Consider a sequence of independent trials, each consisting of inserting a ball at random into one of $k$ urns. We say the system $\{Y_t : t = 0, 1, \cdots\}$ is in state $i$ if exactly $i$ urns are occupied. This system forms a Markov chain on the state space $\Omega = \{0, 1, \cdots, k\}$ with transition probabilities

$$p_{ij} = \begin{cases} i/k & \text{if } j = i, \\ (k - i)/k & \text{if } j = i + 1, \\ 0 & \text{otherwise}, \end{cases}$$
for $i = 0, 1, \cdots, k$, and initial distribution $P(Y_0 = 0) = 1$. The transition probability matrix is given by
$$M = \begin{pmatrix} 0 & 1 & & & & \\ & \frac{1}{k} & \frac{k-1}{k} & & & \\ & & \frac{2}{k} & \frac{k-2}{k} & & \\ & & & \ddots & \ddots & \\ & & & & \frac{k-1}{k} & \frac{1}{k} \\ & & & & & 1 \end{pmatrix},$$

with rows and columns indexed by the states $0, 1, \cdots, k$ and all unmarked entries equal to zero. ◊
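Both examples are easy to verify numerically. The sketch below is ours, not from the book; the parameter values ($a = 3$, $b = 2$, $p = 0.45$ for the gambler, $k = 4$ urns) are chosen only for illustration. It builds the two transition matrices and checks the chains against classical closed-form answers:

```python
def gambler_matrix(a, b, p):
    """Transition matrix of Example 2.1 on states 0, ..., a+b."""
    N, q = a + b, 1.0 - p
    M = [[0.0] * (N + 1) for _ in range(N + 1)]
    M[0][0] = M[N][N] = 1.0                  # absorbing boundaries
    for i in range(1, N):
        M[i][i + 1] = p                      # win a dollar
        M[i][i - 1] = q                      # lose a dollar
    return M

def urn_matrix(k):
    """Transition matrix of Example 2.2 on states 0, ..., k (occupied urns)."""
    M = [[0.0] * (k + 1) for _ in range(k + 1)]
    for i in range(k):
        M[i][i] = i / k                      # ball lands in an occupied urn
        M[i][i + 1] = (k - i) / k            # ball opens a new urn
    M[k][k] = 1.0
    return M

def step(xi, M):
    """One step of the distribution update xi -> xi M."""
    return [sum(xi[i] * M[i][j] for i in range(len(xi)))
            for j in range(len(M[0]))]

# Gambler's ruin: P(reach a+b) approaches the classical ratio formula.
a, b, p = 3, 2, 0.45
xi = [0.0] * (a + b + 1); xi[a] = 1.0        # initial distribution P(Y0 = a) = 1
M = gambler_matrix(a, b, p)
for _ in range(5000):                        # run until absorption is (numerically) complete
    xi = step(xi, M)
r = (1 - p) / p
win = xi[a + b]                              # should be close to (1 - r**a) / (1 - r**(a+b))

# Urns: expected occupied urns after n balls is k * (1 - (1 - 1/k)**n).
k, n = 4, 4
eta = [0.0] * (k + 1); eta[0] = 1.0          # initial distribution P(Y0 = 0) = 1
Mu = urn_matrix(k)
for _ in range(n):
    eta = step(eta, Mu)
expected = sum(i * eta[i] for i in range(k + 1))
```

The agreement of `win` with $(1 - (q/p)^a)/(1 - (q/p)^{a+b})$ and of `expected` with $k(1 - (1 - 1/k)^n)$ confirms that the two matrices were assembled correctly.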
Further examples of this type may be found in, for example, Feller (1968) and Ross (2000). Of course, there will also be many more examples of Markov chains in later sections of this book.

2.2 Chapman-Kolmogorov Equation
For a non-homogeneous Markov chain $\{Y_t\}$, the $n$-step transition probabilities $P(Y_t = j \mid Y_{t-n} = i) = p_{ij}^{(n)}(t)$ can be obtained from the one-step transition probabilities by an important identity known as the Chapman-Kolmogorov equation. If $n = 2$, we have, for $t \ge 2$,

$$p_{ij}^{(2)}(t) = \sum_k P(Y_{t-1} = k \mid Y_{t-2} = i) P(Y_t = j \mid Y_{t-1} = k) = \sum_k p_{ik}(t-1) p_{kj}(t), \tag{2.4}$$

which corresponds to summing over all possible intermediate states $k$ in the transition from state $i$ to state $j$. If $\{Y_t\}$ is a homogeneous Markov chain, then Eq. (2.4) yields the two-step ($n = 2$) transition probabilities
$$p_{ij}^{(2)} = \sum_k P(Y_{t-1} = k \mid Y_{t-2} = i) P(Y_t = j \mid Y_{t-1} = k) = \sum_k p_{ik} p_{kj}, \tag{2.5}$$

which are independent of $t$. Hence, from Eq. (2.5), the two-step transition probability matrix $M^{(2)} = (p_{ij}^{(2)})$ satisfies the identity $M^{(2)} = M^2$. Similarly, for the $n$-step transition probabilities of a homogeneous Markov chain, the Chapman-Kolmogorov identity

$$p_{ij}^{(n)} = \sum_k p_{ik}^{(s)} p_{kj}^{(n-s)} \tag{2.6}$$

holds for every intermediate step $s = 1, \cdots, n - 1$. It follows from Eqs. (2.5) and (2.6) that

$$M^{(n)} = M^{(s)} M^{(n-s)} = M^n. \tag{2.7}$$

For a homogeneous Markov chain $\{Y_t\}$, and any subset $C$ of the state space $\Omega$, it follows from Eq. (2.7) that the conditional probability of the system state $Y_n$ residing in $C$ at time index $n$, given the initial distribution $\xi_0 = (P(Y_0 = 1), \cdots, P(Y_0 = m))$, is

$$P(Y_n \in C \mid \xi_0) = \xi_0 M^n U'(C), \tag{2.8}$$
where $U'(C)$ denotes the transpose of $U(C)$, $U(C) = \sum_{i \in C} e_i$, and $e_i = (0, \cdots, 0, 1, 0, \cdots, 0)_{1 \times m}$ is a unit row vector with 1 corresponding to the $i$-th state and zero elsewhere. More generally, if $\{Y_t\}$ is a non-homogeneous Markov chain, it can be shown (e.g. Feller 1968) that the conditional probability of $Y_n \in C$ given $\xi_0$ can be simply represented as

$$P(Y_n \in C \mid \xi_0) = \xi_0 \Big(\prod_{t=1}^{n} M_t\Big) U'(C). \tag{2.9}$$

Equations (2.8) and (2.9) are two indispensable tools for evaluating probabilities of various events associated with homogeneous and non-homogeneous Markov chains, respectively.
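As a quick numerical illustration of Eqs. (2.5)–(2.8) (our sketch; the 3-state chain is invented for the purpose), the probability $\xi_0 M^n U'(C)$ computed by matrix powers agrees exactly with a direct sum of path probabilities over all length-$n$ trajectories:

```python
from itertools import product

# A small homogeneous 3-state chain (entries invented for illustration).
M = [[0.5, 0.3, 0.2],
     [0.1, 0.6, 0.3],
     [0.4, 0.4, 0.2]]
xi0 = [1.0, 0.0, 0.0]                 # chain starts in state 1 (index 0)

def matmul(A, B):
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matpow(A, n):
    R = [[float(i == j) for j in range(len(A))] for i in range(len(A))]
    for _ in range(n):
        R = matmul(R, A)
    return R

n, C = 4, {0, 2}                      # subset C of the state space
Mn = matpow(M, n)
lhs = sum(xi0[i] * Mn[i][j] for i in range(3) for j in C)   # xi0 M^n U'(C)

rhs = 0.0                             # brute force over all trajectories
for path in product(range(3), repeat=n):
    full = (0,) + path                # the initial state has index 0
    prob = 1.0
    for a, b in zip(full, full[1:]):  # multiply one-step probabilities
        prob *= M[a][b]
    if full[-1] in C:                 # keep paths ending inside C
        rhs += prob
```

The equality of `lhs` and `rhs` is exactly the Chapman-Kolmogorov identity summed over all intermediate states.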
2.3 Tree-Structured Markov Chain

In order to broaden the possible applications, it is useful to consider a simple extension of the preceding methodology to Markov chains defined on state spaces of different sizes. Let $\{Y_t\}$ be a sequence of random variables defined on a sequence of state spaces $\{\Omega_t\}$, respectively. The sequence $\{Y_t\}$ is referred to as a tree-structured Markov chain if $\{Y_t\}$ is a Markov chain with transition matrices

$$M_t = (p_{ij}(t)), \quad t = 1, 2, \cdots,$$

where, for $i \in \Omega_{t-1}$ and $j \in \Omega_t$,

$$p_{ij}(t) = P(Y_t = j \mid Y_{t-1} = i).$$

Note that since the state spaces in the sequence $\{\Omega_t\}$ may have different sizes, the transition matrices $M_t$ may be rectangular instead of square; i.e. $M_t$, $t = 1, 2, \cdots$, are $\mathrm{card}(\Omega_{t-1}) \times \mathrm{card}(\Omega_t)$ matrices, where $\mathrm{card}(\Omega_t)$ stands for the cardinal number of the state space $\Omega_t$. The sequence of transition probability matrices $\{M_t\}$ still determines the tree-structured Markov chain $\{Y_t\}$, and the Chapman-Kolmogorov equation remains applicable. For any subset $C \subseteq \Omega_n$, the conditional probability of $Y_n \in C$ given the initial distribution $\xi_0$ can be evaluated via

$$P(Y_n \in C \mid \xi_0) = \xi_0 \Big(\prod_{t=1}^{n} M_t\Big) U_n'(C), \tag{2.10}$$

where $U_n(C) = \sum_{i \in C} e_i$ and $e_i = (0, \cdots, 0, 1, 0, \cdots, 0)$ is a $1 \times \mathrm{card}(\Omega_n)$ unit row vector with 1 at the location associated with state $i$. If all the state spaces are identical ($\Omega_1 = \cdots = \Omega_n$), then Eq. (2.10) reduces to Eq. (2.9).

2.4 Runs and Patterns
Traditionally, within a sequence of Bernoulli (i.i.d. success-failure) trials, a run denotes a sequence of consecutive successes or failures. For example, a success run of size 4 implies the pattern SSSS. Several runs statistics that are often used in statistics and applied probability for a sequence of $n$ Bernoulli trials are: (i) $N_{n,k}$, the number of non-overlapping consecutive $k$ successes, in the sense of Feller's (1968) counting; (ii) $M_{n,k}$, the number of overlapping consecutive $k$ successes; (iii) $E_{n,k}$, the number of success runs of size exactly $k$, in the sense of Mood's (1940) counting; (iv) $G_{n,k}$, the number of success runs of size greater than or equal to $k$; and (v) $L_n(S)$, the size of the longest success run. Perhaps the easiest way to understand the precise definitions of these runs statistics and the overlap/non-overlap counting procedure is by means of the following example. Suppose that there are $n = 10$ Bernoulli trials, with realization SSFSSSSFFF. Then $L_{10}(S) = 4$, and for $k = 2$, we have $N_{10,2} = 3$, $M_{10,2} = 4$, $E_{10,2} = 1$ and $G_{10,2} = 2$. From the definitions of these runs statistics, it follows by inspection that the following relationships are always true: $E_{n,k} \le G_{n,k} \le N_{n,k} \le M_{n,k}$; $E_{n,k} = G_{n,k} - G_{n,k+1}$; and $L_n(S) < k$ if and only if $N_{n,k} = 0$.

To extend the definitions of runs, let us consider a sequence $\{X_t\}_{t=1}^{n}$ of $n$ multi-state trials, each of which has $m \ge 2$ states or symbols as possible outcomes. These symbols are denoted by $b_1, b_2, \cdots, b_m$, and occur with probabilities $p_1, p_2, \cdots, p_m$, respectively. We then define three types of general patterns: a simple pattern, a compound pattern, and a series pattern.

Definition 2.3 $\Lambda$ is a simple pattern if $\Lambda$ is composed of a specified sequence of $k$ symbols, i.e. $\Lambda = b_{i_1} \cdots b_{i_k}$, with $i_j \in \{1, \cdots, m\}$ for all $j = 1, \cdots, k$. The length of the pattern is fixed, and the symbols in the pattern may be repeated.

Success runs and failure runs of size $k$ are thus simple patterns under this definition, and in fact any fixed-length sequence of successes and failures, say $\Lambda = SSFSF$, can be considered as a simple pattern within a sequence of $n$ two-state ($m = 2$) trials. Next, let $\Lambda_1$ and $\Lambda_2$ be two simple patterns of lengths/sizes $k_1$ and $k_2$, respectively. We say that $\Lambda_1$ and $\Lambda_2$ are distinct if neither $\Lambda_1$ belongs to $\Lambda_2$ nor $\Lambda_2$ belongs to $\Lambda_1$. We define $\Lambda_1 \cup \Lambda_2$ to denote the occurrence of either pattern $\Lambda_1$ or pattern $\Lambda_2$, and define $\Lambda_1 * \Lambda_2$ to denote the occurrence of pattern $\Lambda_1$ followed by pattern $\Lambda_2$ (with possibly a gap between them).

Definition 2.4 $\Lambda$ is a compound pattern if it is the union of $1 \le l < \infty$ overlapping/non-overlapping distinct simple patterns, i.e. $\Lambda = \cup_{i=1}^{l} \Lambda_i$.

Definition 2.5 $\Lambda$ is a series pattern if $\Lambda$ is composed of an ordered sequence of $1 \le l < \infty$ non-overlapping distinct simple patterns $\Lambda_i$, i.e. $\Lambda = \Lambda_1 * \Lambda_2 * \cdots * \Lambda_l$.

Throughout this book, the random variable $X_n(\Lambda)$ represents the number of occurrences of the pattern $\Lambda$ in a sequence of $n$ multi-state trials, using either overlap or non-overlap counting. In order to clarify the three pattern definitions and the two types of counting conventions for multi-state trials, we give the following example.

Example 2.3 Let $\{X_t\}_{t=1}^{16}$ be a sequence of sixteen four-state trials, where the possible outcomes for each trial are A, C, G, and T. Let $\Lambda_1 = AGAG$ and $\Lambda_2 = AGT$ be two distinct simple patterns, $\Lambda = \Lambda_1 \cup \Lambda_2$ be a compound pattern, and $\Lambda^* = \Lambda_1 * \Lambda_2$ be a series pattern.
Suppose the realization of this sequence of sixteen trials is TAGAGAGTCAGAGTCC; then (i) $X_{16}(\Lambda_1)$ equals 3 with overlap counting and equals 2 with non-overlap counting, (ii) $X_{16}(\Lambda)$ equals 5 with overlap counting and equals 3 with non-overlap counting, and
(iii) $X_{16}(\Lambda^*)$ equals one. ◊
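All of the counting conventions in this section can be reproduced mechanically. The sketch below is ours: the helper names are made up, and the non-overlap rule for a compound pattern is interpreted here as "count the occurrence that completes first, then resume scanning after its last symbol" — an assumption that happens to reproduce the counts of the two examples above:

```python
def run_lengths(seq):
    """Lengths of the maximal success runs in an S/F string."""
    return [len(r) for r in seq.split('F') if r]

def runs_statistics(seq, k):
    runs = run_lengths(seq)
    N = sum(r // k for r in runs)               # non-overlapping (Feller) counting
    M = sum(r - k + 1 for r in runs if r >= k)  # overlapping counting
    E = sum(1 for r in runs if r == k)          # runs of size exactly k (Mood)
    G = sum(1 for r in runs if r >= k)          # runs of size >= k
    L = max(runs, default=0)                    # longest success run
    return N, M, E, G, L

def count_overlap(text, patterns):
    """Overlap counting: every occurrence of every simple pattern counts."""
    return sum(1 for i in range(len(text))
               for p in patterns if text.startswith(p, i))

def count_nonoverlap(text, patterns):
    """Non-overlap counting (assumed rule): count the occurrence that
    completes first, then resume scanning after its last symbol."""
    count, start, j = 0, 0, 0
    while j < len(text):
        j += 1
        if any(j - len(p) >= start and text.startswith(p, j - len(p))
               for p in patterns):
            count, start = count + 1, j
    return count

def count_series(text, patterns):
    """Series pattern: the simple patterns must occur in the given order,
    without overlap (gaps between them are allowed)."""
    count, pos = 0, 0
    while True:
        p = pos
        for pat in patterns:
            j = text.find(pat, p)
            if j < 0:
                return count
            p = j + len(pat)
        count, pos = count + 1, p
```

For the realization SSFSSSSFFF with $k = 2$, `runs_statistics` returns (3, 4, 1, 2, 4), matching $N_{10,2}$, $M_{10,2}$, $E_{10,2}$, $G_{10,2}$ and $L_{10}(S)$; the three pattern counters reproduce the counts of Example 2.3.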
The above definitions of runs and patterns in a sequence of multi-state trials can also be extended to random permutations $\{\pi : \pi = (\pi(1), \cdots, \pi(n))\}$ of the $n$ integers $\{1, 2, \cdots, n\}$. For example, the Eulerian number $E(\pi, n)$, the number of rises in a random permutation $\pi$ (see Carlitz 1964, Tanny 1973, and Worpitzky 1883), could be viewed as a random variable $X_n(\Lambda)$ with the pattern $\Lambda$ being a rise. Mathematically, the Eulerian number can be defined as

$$E(\pi, n) = X_n(\Lambda) = \sum_{i=0}^{n-1} I(\pi, i),$$

where

$$I(\pi, i) = \begin{cases} 1 & \text{if } \pi(i) < \pi(i+1), \\ 0 & \text{otherwise}, \end{cases}$$

for $i = 1, \cdots, n - 1$, with $I(\pi, 0) = 1$ by convention (the starting "gap" preceding the first element is always considered a rise). For example, the number of rises $E(\pi, 9)$ in the random permutation $\pi = (321459768)$ of 9 integers is 5.

In view of the above definitions and examples, one should expect the exact distribution of the random variable $X_n(\Lambda)$ to depend heavily on three important factors: (a) the structure of the pattern $\Lambda$, (b) the structure of the sequence $\{X_t\}_{t=1}^{n}$ of $n$ trials (or random permutations), and (c) the counting procedure (overlap or non-overlap counting). Due to these factors, the analytical determination of exact distributions via traditional approaches such as combinatorics can be quite challenging and is generally complex, involving special identities and lengthy algebra. Consequently, the exact distributions of many statistics used in practical applications have never been studied using such methods, especially when the underlying trials are non-i.i.d. (e.g. Markov-dependent). In the following section, we describe a technique that allows one to obtain a compact matrix representation for the exact distribution in a relatively simple and universal manner by imbedding the random variable $X_n(\Lambda)$ into a finite Markov chain; the resulting expression is also very amenable to further analysis of statistical properties, to the development of large deviation approximations, and to efficient numerical implementation for the computation of exact probabilities.
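The rise-counting convention for the Eulerian example above is easy to check numerically (our sketch; the permutation is the one from the text):

```python
from itertools import permutations
from collections import Counter

def eulerian_rises(perm):
    """Number of rises E(pi, n); the leading 'gap' counts as a rise."""
    return 1 + sum(1 for i in range(len(perm) - 1) if perm[i] < perm[i + 1])

# The permutation (321459768) from the text has 5 rises.
assert eulerian_rises((3, 2, 1, 4, 5, 9, 7, 6, 8)) == 5

# Summing over all permutations of {1,...,4} recovers the classical
# Eulerian numbers 1, 11, 11, 1 (for 1, 2, 3, 4 rises, respectively).
hist = Counter(eulerian_rises(p) for p in permutations(range(1, 5)))
```

The histogram over all $4! = 24$ permutations matches the fourth row of the Eulerian-number triangle, a useful consistency check on the convention $I(\pi, 0) = 1$.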
2.5 Finite Markov Chain Imbedding
The finite Markov chain imbedding technique for finding the distribution of the random variable $X_n(\Lambda)$ has its early origins in a series of papers by Fu (1985, 1986), Fu and Hu (1987), Chao and Fu (1989, 1991), and Fu and Lou (1991). The term "finite Markov chain imbeddable" to describe a random variable was formally introduced by Fu and Koutras (1994). Let $T_n = \{0, 1, \cdots, n\}$ be an index set, and let $\Omega = \{a_1, a_2, \cdots, a_m\}$ be a finite state space.

Definition 2.6 The non-negative integer-valued random variable $X_n(\Lambda)$ is finite Markov chain imbeddable if: (a) there exists a finite Markov chain $\{Y_t : t \in T_n\}$ defined on a finite state space $\Omega$ with initial probability vector $\xi_0$, (b) there exists a finite partition $\{C_x : x = 0, 1, \cdots, l_n\}$ of the state space $\Omega$, and (c) for every $x = 0, 1, \cdots, l_n$, we have

$$P(X_n(\Lambda) = x) = P(Y_n \in C_x \mid \xi_0).$$
Let $\{M_t\}_{t=1}^{n}$ be the sequence of $m \times m$ transition probability matrices of the finite Markov chain $\{Y_t\}$ defined on the state space $\Omega$ with initial probability distribution $\xi_0 = (P(Y_0 = a_1), P(Y_0 = a_2), \cdots, P(Y_0 = a_m))$.

Theorem 2.1 If $X_n(\Lambda)$ is finite Markov chain imbeddable, then

$$P(X_n(\Lambda) = x) = \xi_0 \Big(\prod_{t=1}^{n} M_t\Big) U'(C_x), \tag{2.11}$$

where $U(C_x) = \sum_{r : a_r \in C_x} e_r$, $e_r$ is a $1 \times m$ unit row vector corresponding to state $a_r$, $\xi_0$ is the initial probability vector, and $M_t$, $t = 1, \cdots, n$, are the transition probability matrices of the imbedded Markov chain.

Proof. Since $X_n(\Lambda)$ is finite Markov chain imbeddable, it follows from Definition 2.6(a) that there exists a finite Markov chain $\{Y_t : t \in T_n\}$ with initial probability vector $\xi_0$. By the Chapman-Kolmogorov equation described in Section 2.2, for every $a_r \in \Omega$, we have

$$P(Y_n = a_r \mid \xi_0) = \xi_0 \Big(\prod_{t=1}^{n} M_t\Big) e_r'.$$

Furthermore, it follows from Definitions 2.6(b) and (c) that, for every $x = 0, 1, \cdots, l_n$,

$$P(X_n(\Lambda) = x) = P(Y_n \in C_x \mid \xi_0) = \sum_{r : a_r \in C_x} P(Y_n = a_r \mid \xi_0) = \xi_0 \Big(\prod_{t=1}^{n} M_t\Big) U'(C_x). \qquad \Box$$
The $k$-th moment $E(X_n^k(\Lambda))$, $k = 1, 2, \cdots$, can be written as

$$E(X_n^k(\Lambda)) = \xi_0 \Big(\prod_{t=1}^{n} M_t\Big) V_k', \tag{2.12}$$

where

$$V_k = \sum_{x=0}^{l_n} x^k U(C_x).$$

Similarly, the probability generating function for the random variable $X_n(\Lambda)$ can be written as

$$\varphi(s) = \xi_0 \Big(\prod_{t=1}^{n} M_t\Big) W'(s), \tag{2.13}$$

where

$$W(s) = \sum_{x=0}^{l_n} s^x U(C_x).$$
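To make Theorem 2.1 concrete, the sketch below (ours; it previews the imbedding of the statistic $N_{n,k}$ treated at length in Chapter 3) builds a chain on states $(x, j)$, where $x$ is the accumulated count and $j$ the length of the current partial success run, and evaluates $\xi_0 \big(\prod_{t=1}^{n} M_t\big) U'(C_x)$ step by step:

```python
def nnk_distribution(n, k, p):
    """P(N_{n,k} = x), x = 0, ..., [n/k], via the imbedding of Theorem 2.1."""
    ln = n // k
    states = [(x, j) for x in range(ln + 1) for j in range(k)]
    idx = {s: i for i, s in enumerate(states)}
    q, m = 1.0 - p, len(states)

    M = [[0.0] * m for _ in range(m)]
    for (x, j), i in idx.items():
        M[i][idx[(x, 0)]] += q                 # failure: partial run resets
        if j + 1 < k:
            M[i][idx[(x, j + 1)]] = p          # success: partial run grows
        elif x < ln:
            M[i][idx[(x + 1, 0)]] = p          # k-th success: count one run
        else:
            M[i][i] += p                       # unreachable in n steps; keeps M stochastic

    xi = [0.0] * m
    xi[idx[(0, 0)]] = 1.0                      # initial distribution xi_0
    for _ in range(n):                         # xi_0 M^n, one step at a time
        xi = [sum(xi[i] * M[i][j] for i in range(m)) for j in range(m)]
    # P(N_{n,k} = x) = P(Y_n in C_x), with C_x = {(x, j) : j = 0, ..., k-1}
    return [sum(xi[idx[(x, j)]] for j in range(k)) for x in range(ln + 1)]
```

For small $n$ the result can be confirmed against direct enumeration of all $2^n$ sequences; the chain has only $([n/k] + 1)k$ states, which is what makes the imbedding representation computationally attractive.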
Moments and probability generating functions will be discussed within the context of specific applications in subsequent sections.

Example 2.4 (Number of pairs of identical successive outcomes). Let $\{X_t : t = 0, 1, \cdots, n\}$ be a sequence of homogeneous Markov-dependent $m$-state trials with transition probability matrix $A_{m \times m} = (p_{ij})$ and initial probability distribution $\pi_0 = (P(X_0 = 1), \cdots, P(X_0 = m)) = (1/m, 1/m, \cdots, 1/m)$. Define a sequence of indicator functions

$$I_t = \begin{cases} 1 & \text{if } X_t = X_{t-1}, \\ 0 & \text{otherwise}, \end{cases}$$
for $t = 1, \cdots, n$. In this example, we are interested in the number of times that a particular outcome (one of $m$ possible outcomes) at a given trial is repeated at the immediately following trial. In mathematical terms, we define the pattern $\Lambda$ to denote such a repeated outcome, a pattern which is present at time index $1 \le t \le n$ if $X_{t-1} = X_t$ or, equivalently in terms of the above indicator function, if $I_t = 1$. The runs statistic

$$X_n(\Lambda) = \sum_{t=1}^{n} I_t$$

then corresponds to the number of times that the pattern $\Lambda$ occurred in the sequence of Markov-dependent $m$-state trials $\{X_t\}_{t=0}^{n}$. In the healthcare sector, for example, the statistic $X_n(\Lambda)/n$ is known as SECON, and forms the primary measure of sequential continuity in a series of $n$ patient visits to $m$ possible healthcare providers (Steinwachs 1979). One difficulty here is that the random variables $\{I_t\}$ are not independent and do not together form a Markov chain, even if the sequence $\{X_t\}_{t=0}^{n}$ were drawn from i.i.d. $m$-state trials. In fact, one can show that the random variables $\{I_t\}$ are dependent and positively correlated by proving that $\mathrm{Cov}(I_i, I_j) > 0$ for all $i$ and $j$, with $\mathrm{Cov}(I_i, I_j) \to 0$ as $|i - j| \to \infty$. However, as given below, the exact distribution can still be readily obtained using the finite Markov chain imbedding technique.

First, we decompose the transition probability matrix $A$ into two matrices $G$ and $D$, where the latter matrix contains only the diagonal elements of $A$; i.e.

$$A_{m \times m} = G_{m \times m} + D_{m \times m},$$

where

$$G = \begin{pmatrix} 0 & p_{12} & \cdots & p_{1m} \\ p_{21} & 0 & \cdots & p_{2m} \\ \vdots & & \ddots & \vdots \\ p_{m1} & p_{m2} & \cdots & 0 \end{pmatrix} \quad \text{and} \quad D = \begin{pmatrix} p_{11} & & & \\ & p_{22} & & \\ & & \ddots & \\ & & & p_{mm} \end{pmatrix}.$$
*
f iHUluXt) S
(0,X0)
\
Finite Markov Chain
16
Imbedding
with transition probability matrix IG O
D G
o\
o
D
M = G O
\o
D l)
where M is an (n + l)m x (n + l)rra matrix, O represents the m x m zero matrix, and J is the m x m identity matrix. The states in M are arranged in lexicographical (dictionary) order. Lastly, define the partition {Cx : x = 0,1, 2, • • •, n} on the state space f2 as Cx = {(x,v) :v =
l,2,---,m}.
Given the above definitions for the Markov chain $\{Y_t\}$, the random variable $X_n(\Lambda)$ is, by Definition 2.6, finite Markov chain imbeddable, and its exact distribution follows from Theorem 2.1: for $0 \le x \le n$,
$$P(X_n(\Lambda) = x) = \xi_0 \begin{pmatrix} G & D & & & O \\ & G & D & & \\ & & \ddots & \ddots & \\ O & & & G & D \\ & & & & I \end{pmatrix}^{\!n} U'(C_x), \tag{2.14}$$

where $\xi_0 = (\pi_0, 0, \cdots, 0)_{1 \times (n+1)m}$ is the initial distribution of the state vector $Y_0$, and $U(C_x) = (0, \cdots, 0, 1, 1, \cdots, 1, 0, \cdots, 0)$ is a $1 \times (n+1)m$ row vector with 1 at the coordinates associated with states in $C_x$ and zero elsewhere. Further details and a numerical example for this problem will be given in Chapter 7. ◊

Koutras and Alexandrou (1995) introduced the notion of finite Markov chain imbeddable variables of binomial type (MVBs), and many common statistics for runs and patterns fall into this special category. Let $\{C_x\} = \{[(x, v) : v = 1, \cdots, r], \text{ for } x = 0, 1, \cdots, l_n\}$ be a partition of the state space $\Omega$.

Definition 2.7 A random variable $X_n(\Lambda)$ is finite Markov chain imbeddable of binomial type if (i) $X_n(\Lambda)$ is finite Markov chain imbeddable as in Definition 2.6, and (ii) $P(Y_t = (y, j) \mid Y_{t-1} = (x, i)) = 0$ for all $y \neq x$ or $x + 1$.
Finite Markov Chain Imbedding
17
For any MVB, introduce two r x r transition probability matrices: At(x) = (ay(*)) = (P(Yt = (x, j ) | y t - i =
(x,i)))
and Bt(x) = (bijit)) = (P(Yt = (x + 1, j)\Yt-X = (x,i))) Then the transition probability matrices Mt of the imbedded Markov chain have the following form: fAt(0)
Bt(0) At{l)
\ Bt{\)
o (2.15)
Mt O \
At(ln-1)
Bt(ln-1) At(ln)
/
for t = 1, • • • ,n, where the states are arranged in lexicographical (dictionary) order. There are many statistics for runs and patterns with transition matrices that are of this form, such as, for example, the runs statistics Nn
= x\£0) = an(x)l
, for all x = 0,1, • • •, ln,
(2.16)
where 1 = (1, • • •, 1) . Decompose Mt as Mt = Kt + Ht, where Kt is a diagonal matrix with components At(x), for x = 0,1, • • •, /„, and Ht is an upper-diagonal matrix with components Bt(x), for x = 0,1, • • •, ln — 1. From backward multiplication, £o(lT,-=i Af j) = £o(II,-=i Mj)Mt — ^o(lIi=i Mj)(Kt + Ht), it can be shown that the following recursive equations hold: at(0) at{x)
=
(2.17)
a t _i(0)A t (0) at-i(x
- l)Bt-i(x
- l) + at-i(x)At{x),
x = 1, • • • ,lr.
Finite
18
Markov
Chain
Imbedding
Equation (2.17) provides an efficient algorithm for computing the probabilities P(Xn(A) = x\£0) = OLn(x)l , for allx = 0,1, • • •, ln, and this is especially important when the dimension of the transition matrices Mt is large and the computational effort in naively calculating £0(Il™=i Mt)U (C x ) becomes prohibitive. Prom backward multiplication, the finite Markov chain imbedding technique often provides a recursive equation in a form similar to Eq. (2.17), a form which cannot, in general, be so easily obtained through the traditional combinatorial or renewal methods. 2.6
Absorbing State
In this section, some useful expressions are derived for the probability of entering an absorbing state. For clarity of the exposition, the focus will be on homogeneous Markov chains, but the ideas may be readily generalized to non-homogeneous cases. A state a € ft is called an absorbing state if, once the system enters state a, it never leaves; i.e. paa = 1 (and pa\, = 0 for any b j^ a). Let A = {ai, • • •, ccfc} be the set of all absorbing states of a homogeneous Markov chain {Yt} with transition probability matrix M. Under appropriate arrangement of the state space, the transition probability matrix M can always be written in the following form: M
=
(...^(r?}-.y..><..fe-M..i..^.(.™-y.?5.*:....) V
^fcx(m-fc)
i
Ikxk
(2 i s )
J
where m and k (m > k) are the numbers of states in fi and A, respectively. The matrix N defined by Eq. (2.18) is referred to as the essential transition probability submatrix of the Markov chain. It plays an important role in studying the exact distributions of Markov-chain-imbeddable random variables, especially for associated distributions of waiting times. Let £ 0 = (£ : 0 ) i x m be the initial distribution, where £=(£i,- • -,£,m-k), 0 = (0,- • -,0)iXfc, and YA^I & = *> anc ^ * et i1 '• °)ixm be a row vector, where 1 = (l,---,l)i X (m-fc)- The reason why we assume that the initial distribution has the form (£ : 0) is strictly for practical reasons, as most systems always start in a non-absorbing state. Theorem 2.2 Given a transition probability matrix M of a homogeneous Markov chain {Yt} in the form of Eq. (2.18), the probability for the time index n when the system first enters the set of absorbing states can be
Absorbing State
19
obtained from P(Yn e A, r „ _ ! £ A, ••-,¥!<£ A|£ 0 ) = ^Nn-\l Proof.
- JV)l'.
(2.19)
Since M has the form of Eq. (2.18), it follows that M -
1
= p ^ - j - ^ - L )
,
(2.20)
where K"„_i = (J + N -\ + Nn~2)C. Also, as all the states in A are absorbing states, it follows from the Chapman-Kolmogorov equation that P{Yn-liA,---,Y1iA\£0)
=
P ( F n _ ! G Q - A|£ 0 )
=
(£:0)Mn_1(l:0)'.
(2.21)
Equation (2.19) may then be deduced using Eqs. (2.20) and (2.21) via
= =
P ( y n _ ! ^A,---,Yxi (£ : 0 ) M
n_1
A\iQ) - P{Yn i A, y n _ ! i A, • • •, Y1 $ A\£0)
( J - M ) ( l : 0)' = £JV n _ 1 (I - JV)l'. D
In view of Eqs. (2.20) and (2.21), the following theorems are immediate consequences. Theorem 2.3
For any state i G Q — A,
p(y n _i = i,y„- 2 M > - , i i ^ 4 £ 0 ) = €Jv n - 1 e;.
(2.22)
Proof. Utilizing the same arguments as in the proof of Theorem 2.2 and replacing l ' by e-, Eq. (2.22) follows directly from Eqs. (2.20) and (2.21).
• Theorem 2.4 For any absorbing state j G A, the probability of the system first entering the absorbing state j at the n-th trial is P(Yn = j,Yn^
$A,---,Yxi
where C • is the j-th column of matrix C.
A\£0) = ZN^C),
(2.23)
20
Finite Markov Chain
Imbedding
Proof. For any j G A, it follows from the definition of the Markov chain and Theorem 2.3 that P(Yn=j,Yn-1tA,---,Y1
J2
P(Yn-i=i,Yn-2tA,---,Y1
i€Q-A
xP(Yn=j\Yn-1=i) n l
J2
ZN - e'lPij
SAT""1 £
eipii=iNn-1C'j.
ieii-A
D
To illustrate Theorems 2.2 to 2.4 and their relationships, we provide the following simple example. Example 2.5 Let {Yt} be a homogeneous Markov chain defined on the state space Q = {1,2,3,4} with initial distribution £ 0 = (£ : 0) = (1,0 : 0,0) and transition probability matrix 1 / 1/2 1/4 1/4 0 2 1/3 1/3 3 1 0 0 4 \ 0 0 0
M
0 \ 1/3 0
1 I
where A = {3,4} is the set of absorbing states. For n = 3, the firstentry probabilities of the system entering the absorbing states 3 and 4 are, respectively: 1/2
i/^y
(I/A\4
1/2
1/4 \
( 0
W = 3,yaM,^MI€o) = (i,o)(^
^To
)
_ i_ 12'
and ?(i3 = 4
1
y ^ i
1
r
1
^ y
= (i 1 o)
1/3 1/3/
1 3
V /
_5_ 72'
Further, by Theorem 2.2, the probability that the system first enters the subset A at the third trial is P(Y3 €A,Y2i (1.0)
1/2 1/3
A, Yl i A\i0) = ZNn~"{I 1/4V 1/3;
1 0
0\_/l/2 1/ Vl/3
1/4 \ 1/3;
N)\ 11 72'
First-Entry Probability
21
As a check on the above results, note that P(Y3 eA,Y2£
A,Y1 $ A\i0) = P(Y3 = 3, Y2 £ A,Yt £ A\ZQ) +P(Y3=4,y2M,Y1£A|£0) = ^ . O
More generally, since for every i € Q — A,
Pi Yl i jeA =
it follows that YljeA ^j
= 1_
(I ~ N)l
Pih J2 IEQ-A , and hence
^ ^ J V " - 1 ^ ^ =^Ar™-1(/-AT)l'.
(2.24)
jeA
2.7
First-Entry Probability
Using the ideas of Section 2.6, we can find the first-entry probability for any subset B C 0,. Given the subset B, the transition probability matrix M of a homogeneous Markov chain {Yt} can again always be arranged in the following form: M
=
n-B
( N
B
'J'
§ ) •
(2-25)
Theorem 2.5 Let {Yt} be a homogeneous Markov chain with transition probability matrix M, in the form of Eq. (2.25), and with initial distribution £ 0 = (£ : 0). Then, for a subset B of size k contained in the state space £7 of size m, the following relations hold true: (i) For all j e B, P(Yn = j,Yn--,
iB,---,Yxi
B\£Q) = iN^B],
(2.26)
where B • is the j-th column of matrix B(m_fc)Xfc, and (ii) P(Yn e B,Yn.! n 1
= $N ~ (I
- N)i
£ B,- • • ,YX £ B\Z0) = ] T fW"-1-^-jeB
(2.27)
22
Finite Markov Chain
Imbedding
Proof. Define a new Markov chain {Zt} on the state space O, where {Zt} is the same as {Yt} for all states i £ fl — B and where all states j G B are taken as absorbing states. Then the transition probability matrix M* for the Markov chain {Zt} has the form
Since, for each n, P(Yn=j,Yn-1$B,---,Y1tB\£0) =
P(Zn =
j,Zn-1<£B,---,Z1$B\£0),
Result (i) follows from Theorem 2.4. Similarly, Result (ii) follows immediately from (i) and the fact that (J — iV)l = YIJEB Bj'-' The above proof is guided by the fact that all the states in B are absorbing states with respect to the new Markov chain {Zt}, and hence, for example, for every i £ Q, — B, the probability P(Zn-\ = i) can be partitioned into two parts as follows: P{Zn-X = i|£ 0 ) = P(Z„_! = i, Zn-2 i B, • • •, Zx $ B\i0) +P(Zn-i
= i and at least one of Z„_2, • • •, Z\ is in B\£Q),
where the second part is always zero (since i e f2 — B and p*^ = 0 for all j e B). Note that, in Theorem 2.5, we assumed the initial distribution £ 0 = (£ : 0), which is equivalent to saying P(Yo € fi — B) = 1. Consequently, the probability P(Yn e B,Yn-i <£ B, • • •, Yi ^ B\£0) is referred to as the first-entry probability. Example 2.6 Let {Yt} be a homogeneous Markov chain defined on the state space Q = {1,2, 3,4,5} with transition probability matrix
M=
1 /1/2 2 1/4 3 1/4 4 0
1/4 1/2 1/4 0
1/4 0 0 0
0 1/4 0
5V 0
0
0
0
0 \ 0 1/2. 1 0
1/
First-Entry
23
Probability
Suppose B = {3,4}, then the transition probability matrix of {Yt} can be re-arranged as
M
•
/ 1/2 1/4 0 1/4 \ 0
1/4 1/2 0 1/4 0
0 1/4 0 0 1 0 1/2"' 0 0 0
""""6"" l 1) = 1, the first-entry
Given an initial distribution of YQ, say P(Yo probability for Yn = 3 is P(Yn = 3,y n _! £B,---,Y1
o \ 1/4 0
= 1) 1/2 1/4 0
= (1,0,0)
1/4 1/2 0
0 \ n " 1 /1/4 0 1/ \ o
°
For n = 3, this yields P(Y3 = 3,Y2 <£ B,YX £ B\Y0 = 1) = 5/64. Note that since state "5" is an absorbing state, the computations can be reduced further to P(Yn = 3 , y n - i tB,---,Y1<£B\Y0
= l) = (1,0)
1/2 1/4
1/4 1/2
n-l
1/4 0 O
Let A contain all of the absorbing states of {Yt}, and let B* = A U B. The transition probability matrix M*, corresponding to the associated Markov chain {Zt} in which all the states in B* are taken as absorbing states, can then be re-arranged as M* =
n-B* B*
N*
6
B* I
where N* is the essential submatrix of M*, and B* is the matrix formed from M* by deletion of all columns associated with states in Q, — B* and of all rows associated with states in B*. The following corollary then holds. Corollary 2.1
For every j € B,
P(Yn=j,Yn.1^B,---,Y1^B\(C:0))=e(NT~1B*', where B* is the j-th column of the matrix B*.
(2.28)
24
Finite Markov Chain
Imbedding
The proof of the above corollary is straightforward, and left to the reader. Note that the size of the matrix N* is smaller than or equal to the size of N.
Chapter 3
Runs and Patterns in a Sequence of Two-State Trials
3.1
Introduction
Although runs and patterns in a sequence of Bernoulli trials are special cases of those defined on multi-state trials, they merit a separate chapter due to their long history, the large quantity of associated results, and their broad application to numerous fields. The focus of this chapter will be on deriving the distributions for the most common and useful runs statistics in Bernoulli trials via the finite Markov chain imbedding technique, and also on extending these results to sequences of Markov-dependent two-state trials. Techniques for obtaining the recursive equations and probability generating functions of runs statistics through the finite Markov chain imbedding approach are also introduced. These tools can be very useful for studying certain characteristics of the distributions of runs, such as the mean, variance, and higher moments. The following runs statistics, defined traditionally on a sequence of n Bernoulli trials, are treated in this chapter: (i) (ii) (iii) (iv) (v) (vi)
NUtk, the number of non-overlapping consecutive k successes; Gn>k, the number of success runs of size greater than or equal to k; Mn>k, the number of overlapping consecutive k successes; En>k, the number of runs of exactly k successes; Ln(S), the size of the longest success run; and Sn,k, the total number of successes in success runs of length greater than or equal to k.
The waiting-time distribution of a success run is also treated, and some 25
26
Runs and Patterns in a Sequence of Two-State
Trials
numerical results for the distributions of the above runs statistics are included.
3.2
Number of Non-Overlapping Consecutive k Successes
The number of runs of non-overlapping consecutive k successes, Nn>k, in a sequence of n Bernoulli trials is probably the most important runs statistic, not only for its broad application to various areas but also for its connection with other runs statistics; in distribution theory, the distribution of Nn
k in a sequence of n Bernoulli trials was given independently by Philippou and Makri (1986) and by Hirano (1986) as
kxk=n—m—kx
(3.1) where x = 0,1, • • •, [n/k] ([n/k] is the integer part of n/k), and the success and failure probabilities are denoted by p and q = 1 — p, respectively. Godbole (1990) gave an alternative formula for the pdf of Nn^ with k > 1:
P(Nn,k = x) =
y+ x x
Y, [(n—kx)/k]
* E <-i>tt1)(,"'r")- <«> 0<j<[{n-kx-v)/k]
\
J
/ \
y
/
for x = 0,1, • • •, [n/k]. Formula (3.2) has an advantage over (3.1) in that it is easier to evaluate by computer for large n. Hirano and Aki (1987, 1993) studied some properties of this distribution, and extended the results to the case of two-state Markov-dependent trials. To begin our study of Nn^ using the method of finite Markov chain imbedding, let us consider the state space Q. = {(x, i) : x = 0,1, • • •, ln, and i = 0,1, • • •, k — 1}, where ln = [n/k] is the maximum number of non-overlapping success runs of length k that can occur within the n trials. We define a finite homogeneous
Number of Non-Overlapping
27
Consecutive k Successes
Markov chain {Yt : t = 0,1, • • •, n} on Q as follows: Yt = (Nt,k,Et),
for
l
(3.3)
where Nttk is the number of non-overlapping consecutive k successes that
occurred over the first t trials Xi, • • • ,Xt. The "ending block" Et equals m modulo k, where m represents the number of trailing successes (possibly zero) that exist in the sequence after the first t trials:
FFSSFSSNote that Et = 0 either if m is a positive multiple of k or if the t-th outcome is F. This ending-block variable keeps track of the number of successes in a possible partial run associated with the t-th trial. For example, given n = 10 Bernoulli trials with outcomes {FSFFSSSFSS} and a chosen success run length of k = 2, the realization of the imbedded Markov chain {Yt : t = 1, 2, • • •, 10} with respect to these ten outcomes is: {Yi = (0,0), Y2 = (0,1), Y3 = (0,0), Y4 = (0,0), y 5 = (0,1), Y6 = (1,0), Y7 = (1,1), Y8 = (1,0), Yg = (1,1), Yio = (2, 0)}. Note that for a given sequence of outcomes {FS • • • SF}, the realization of {Yt} is always unique. Define the subsets Cx = {(a;, i) : i = 0,1, • • •, k - 1}, 0 < x < ln.
(3.4)
The collection of subsets {Cx : x = 0,1, • • •, /„} forms a partition of the state space f2. Since {Xt} is, for the moment, a sequence of Bernoulli trials, it follows from the above definitions that Yt has a transition probability matrix Mt = (P(x,i)(y,j)) f° r all t = 1, 2, • • •, n, with the transition probabilities P(x,i)(y,j) given by the following equation: for 1 < t < n and 0 < x < ln, P(xMvd)
=
p Y
(t
= (vJWt-i
= (x,i))
q if y = x and j = 0, for i = 0,1, • • •, k — 1 p if y = x and j = i + 1, for i = 0,1, • • •, k — 2 p if y = x + 1 and j = 0, for i = k — 1, and x — u, 1, • • •, tji
1 0
1
if y = x = ln and j — i = k — 1 otherwise.
(3.5)
28
Runs and Patterns in a Sequence of Two-State
Trials
As an illustration, the transition probability matrix Mt of the imbedded Markov chain {Yt} associated with the random variable JV^ is given by (0,0) (0,1) (1.0) (1,1) (2,0) (2,1)
Mt{NB<2)
/
Q
q 0 0 0 v 0
0 0 p 0 q 0 q 0 0 0 0
P
0 0 0 \ 0 0 0 p 0 0 0 p 0 0 q p 0 0 1 /
(3.6)
6x6
for 1 < t < 5. For the case where {Xt} is a sequence of independent but non-identically distributed two-state trials with probabilities P(Xt = S) = pt and P(Xt = F) = qt, for t = l , - - - , n , the transition matrices Mt for the imbedded Markov chain {Yt} remain unchanged except that the probability p is replaced by pt and q is replaced by qt- In general for independent but nonidentically distributed two-state trials, the transition probability matrices can be written as
°\
(At
(3.7)
Mt{Nn,k)
At
Bt
At
)
d
Xd
for t = 1, 2, • • •, n, where (It Pt Qt 0
0 pt
\qt
0
0
°\ Pt 0/
fexfe
Bt is a k x k matrix having pt at the entry (k, 1) and zero elsewhere, A\ is the same as At except for having its last row replaced with (0, 0, • • •, 0,1), and the dimension of Mt(Nn^) is given by d = k(ln + 1). Hence, by virtue of Theorem 2.1, we may state that
P(Nn,k = x) = £ 0 (II Mt(Nn>k))U' (Cx), x = 0,1, • • •, ln, t=i
(3-
Number of Non- Overlapping Consecutive k Successes
29
where £ 0 = (1,0, • • • ,0)i X d, and U (Cx) is the transpose of the vector U(CX) = (0, • • •, 0, 1, • • •, 1, 0, • • •, 0) with ones at the locations associated with the states in Cx. Equation (3.8) yields the exact distribution of Nn>k for both identically as well as non-identically distributed independent twostate trials. In view of the transition probability matrix in Eq. (3.7), Nn^ is finite Markov chain imbeddable of binomial type in the sense of Definition 2.7 (Koutras and Alexandrou 1995). If {Xt} is a non-homogeneous Markov chain with transition probability matrices pFF(t) PsF(t)
pFS(t) Pss(t)
a minor modification to the imbedding procedure is needed to obtain the distribution of Nn
= (*, 7) =>• there are x runs of k consecutive successes in the first t trials with m > 0 trailing successes such that mmodfc = 0,
Yt
=
(x,0)
=> there are x runs of k consecutive successes in the first t trials with m — 0 trailing successes (Xt = F), and Yt = (x,i), for i = 1, • • •, k — 1, is defined as given in Eq. (3.3). The difference between the states (x, 7) and (x, 0) can be seen from the following example: for a success run length of k = 2, Yg(SFFFSSSS) = (2,7) and Yg(SFSSFSSF) = (2,0). Note that the ending block Et now contains not only the required information on subpatterns but also implies the outcome of Xt, enabling the assignment of transition probabilities for the imbedded Markov chain.
30
Runs and Patterns in a Sequence of Two-State
Trials
The transition probability matrices corresponding to these definitions may then be readily constructed. The imbedded Markov chain associated with the random variable N$:2, as considered in Eq. (3.6) for Bernoulli trials, has the following transition matrices M\ under non-homogeneous Markov-dependent trials: for t = 1, • • •, n, /
M*t(N5i2)
V
(0,0) PFf(t) Psp(t) 0 0 0 0 0
o
(0,1) PFSH)
0 0 0 0 0 0 0
(1.7) 0 Pss(t) 0 0 0 0 0 0
(1.0) 0 0 PSF(t) PFF{t) VSF(£)
0 0 0
(1.1) 0 0 Pss(t) PFS(t) 0 0 0 0
(2,7) 0 0 0 0 Pss(t) 0 0 0
(2,0) 0 0 0 0 0 PSF(t) PFF{t) 0
(2,1) 0 0 0 0 0 Pss(t) PFs(t) 1
Note the similar banded structure of Ml(N^) in comparison to Mt{N^^) of Eq. (3.6) for Bernoulli trials. As it is straightforward to derive the general form of Mt(Nntk) analogous to Eq. (3.7), we leave this to the interested reader. When the sequence {Xt} is i.i.d., the initial distribution £ 0 can be defined by P(YQ = (0,0)) = 1, yielding, for k > 1, the transition probabilities P(YX = (0, l ) | r 0 = (0,0)) =p and P{Y1 = (0,0)|F 0 = (0,0)) = q=(l-p). However, when {Xt} is a sequence of Markov-dependent random variables, one must be careful about assuming P(YQ = (0,0)) = 1, which would imply that the transition probabilities between YQ and Y\ are given by P{Y1 = (0,1)|F 0 = (0,0)) =pFS(l) anAP{Y1 = (0,0)|Y0 = (0,0)) = pFF(l), independent of pSF and pss. In order to avoid this type of bias, it is useful to create a dummy state 0 as the initial state for YQ. We then define P(Y0 = 0) = 1, and the transition probabilities P{YX = (0,1)|Y0 = 0) = Ps and P(Yi = (0,0)|Yo = 0) =PF- It follows that, for iV5)2, the corresponding imbedded Markov chain {Yt} is defined on the expanded state space fl = {0, (0,0), (0,1), (1,0), • • •} with transition probability matrices of the following form: 0 (0,0) / 0 i pF
(0,1) ps
(1,0) 0
61 M*t(N5,2)
Vo
(1,1) 0
Number of Success Runs of Length Greater Than or Equal to k
31
Note that the finite Markov chain imbedding procedure used to obtain the exact distribution of Nn^ remains the same, except for minor differences in the transition probability matrices, regardless of whether the sequence of trials {Xt} is i.i.d., independent but non-identically distributed, or Markovdependent.
3.3
Number of Success Runs of Length Greater Than or Equal to k
For a sequence of two-state trials, the random variable Gn,k is defined as the number of success runs of length greater than or equal to k. Let's consider a finite Markov chain {Yj : t = 0,1, • • •, n} defined on the state space fl = {(x, i) : x = 0,1 • • •, ln, and i = 7, 0,1, • • •, k - 1} - {(0,7)}, where ln = [(n + l)/(fc + 1)]. For a sequence of outcomes of the first t trials with m trailing successes, say FS • • • F SS • • • <S, we define the Markov chain m
Yt = (Gt,k,Et),
l
(3.9)
where Gt,k is the number of success runs of length greater than or equal to k in the sequence {Xt}, and Et is the ending block variable with Et = m if m = 0,1,••-,& — 1, and Et = 7 if m > k. To illustrate this definition, consider a minimum run-length of k = 2 and the following twelve outcomes of two-state trials: FSFFSSFSSSFS, for which G12,2 = 2. It follows from Eq. (3.9) that the realization of the Markov chain {Yt : t = 1, • • •, 12}
is {Y1 = (0,0),Y2 = (0,1),y 3 = (o,o),n = (0,0),Y5 = (0,1), y6 = (l, 7 ),y 7 = (1,0),Y8 = (1,1),Y9 = (2,7),Yio = (2, 7 ),Yii = (2,0), and Y12 = (2,1)}. Note that the ending block Et = 7 can occur only when there are at least k trailing successes, in which case Gn>k > 1; for this reason, the state (0,7) was excluded in the above definition of the state space
a From the definition of the imbedded Markov chain given by Eq. (3.9), the one-step transition probabilities in Mt(Gnik) for independent but nonidentically distributed trials are specified by the following equation: for
Runs and Patterns in a Sequence of Two-State Trials
32
t = !,-••, n, qt
if y = x and j = 0, for x = 0,1, • • •, ln, and i = 7 , 0 , 1 ,
Pt pt
if j/ = a; and j = i = 7, for x = 0,1, • • •, / n if j/ = a; and j = i + 1, for a; = 0,1, • • •, ln, and i = 7, 0,
Pt
ify = x + l,z = A; — 1 and j = 7, for a; = 0,1, • • •,
1 0
it y = x = ln and j = x = k — 1 otherwise.
• • • , k -
P(x,i)(y,j)(t)
= \
1
1, • • - , & - 2
(3.10) For the special case of n = 5 and k = 2, the transition probability matrices Mt(Gntk) are given by (0,0) f qt (0,1) Qt 0 (1,7) 0 (1,0) = 0 (1,1) 0 (2,7) 0 (2,0) (2,1) V 0
Mt(G5,2)
for t = 1, • • •, 5. In general, Mt(Gn:k) /
A
t
Pt 0 0 0 0 0 0 0
Pt Pt 0 0 0 0 0
0 0 0
0 0 qt qt qt 0 0 0
pt 0 0 0 0
0 0 0 0 pt pt 0 0
0 0 0 0 0 qt qt 0
0 \ 0 0 0 0 0 Pt
1 /
is a bi-diagonal block matrix of the form Pte'k Pt
O qtei
At Mt{Gn,k)
0
\
O
pte'k (3.11)
=
\ o
o
Pt
qtei
At J
where e\ = (1,0, • • •, 0) and efe = (0, • • •, 0,1) are 1 x k unit row vectors,
Number of Overlapping Consecutive k Successes
33
and At is given by ( It
Pt
0
\qt
0
0
0
\
At = Pt
0 /
The transition probability matrix At, in the context of demography, is often referred to as the Leslie matrix, or more generally, as a renewal-type matrix (see Seneta 1981). The dimension of Mt{Gn,k) is equal to (ln + l)(k+l) — 1. The matrix A*t in Eq. (3.11) is the same as At except for the last row, which is replaced by (0,0, • • •, 0,1). We define the partition {Cx : x = 0,1, • • •, ln} for 12 as C0 Cx
= =
{(0,i):* = 0 , l , - " , f c - l } , {(x,i) :i = 7 , 0 , 1 , • • • , £ ; - 1}, for x = 1, •••,/„,
from which it follows that P(GUtk = x) = P(Yn € Cx) for all x = 0,1, • • • ,ln. The distribution function, moments, and probability generating function can now be easily computed through Eqs. (2.11), (2.12) and (2.13), respectively. For the case of i.i.d. trials, all transition probabilities would be constant, and an extension to Markov-dependent trials may be carried out as described for the statistic Nntk in the previous section; in the remainder of this chapter on two-state trials, we focus primarily on the case of independent but non-identically distributed trials.
3.4
Number of Overlapping Consecutive k Successes
The random variable Mn^ is defined as the number of overlapping consecutive k successes in a sequence of n independent two-state trials. The imbedded Markov chain {Yt : t = 0,1, • • •, n} associated with Mn>k may be defined as Yt = (Mt,k,Et),
4 = 1,2,
(3.12)
Runs and Patterns in a Sequence of Two-State
34
Trials
on the state space fl
=
{(x,i)
: x = 0 , 1 , •••,ln
— l , a n d i ~ 7 , 0 , 1 , ••• ,k — 1}
U{(Zn,7)}"{(0,7)}, where ln = n — k + 1, Mttk is the number of overlapping consecutive k successes in the first t trials, and Et is the ending block variable keeping track of the number of trailing successes m: 7 m
Et
if m > k if m = 0,1, • • •, k — 1.
(3.13)
With overlap-counting, it is easy to verify that the probabilities for the transition probability matrices Mt = (P(x,i)(y,j){t)) can be obtained from the following equation: qt
if y = x and j = 0, for x = 0,1 • • •, ln, and i = 7,0,1,
Pt
if y — x and j = i + 1, for x = 0,1 • • •, ln, and i = 0,1,
• • • , k - l
• • • , k - 2 t
P(x,i)(v,j)( )
=
(
pt
if y = x + 1, j = 7 and i =fc— 1, for a; = 0,1, • • •,
Pt 1 0
if y = x + 1 and j = i = 7, for x = 0,1 • • •, ln — 1 if y = x = ln and j = i = 7 otherwise. (3.14) The corresponding partition of the state space f2 can be specified as follows:
C0 = {(0,0 :i = 0 , l , - - - , f c - l } , Cx
=
{(x,i) : i = 7 , 0 , 1 , • • - , & - 1}, cc = 1, •••,ln-
a„
=
{(/n,7)}-
1,
For n = 4 and fc = 2, for example, the transition probability matrices
Number of Runs of Exactly k Successes
35
M t (M 4 ,2), t = 1,2,3,4, are
M t (M 4 , 2 ) =
(0,0) qt ( (0,1) qt 0 (1,7) (1.0) 0 (1,1) 0 (2,7) (2,0)
(2,1) (3,7)
0 0 pt 0 Pt 0 0 qt 0 0 qt 0 0 ! 0 ! qt qt
Pt 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0
Vo
0 0 0 0
0 0 0
0 0
0 0 Pt 0 0 0 ! Pt Pt \ 0 0 qt 0 qt 0 qt 0 0
0 0 0 0 0 j 0
pt 0 0
0
0 0
\
0
0
.
(3.15)
Pt 0 Pt
1 /
For general n and k, the transition probability matrices continue to have a banded form similar to M^M^) in Eq. (3.15), and are of dimension ln(k + 1). The distribution and moments for the random variable Mn^ can again be computed through Eqs. (2.11) and (2.12), respectively. 3.5
Number of Runs of Exactly k Successes
The imbedded Markov chain {Yj} associated with the random variable En
(3.16)
on the state space Q = {(x,i) : x = 0,1, ••-,/„, and i = /?,7,0,1, • • - , & - 1} - {(0,7)}, where ln = [(n + l)/(fc + 1)], Et,k is the number of success runs of size exactly k in the first t trials, and the ending block Et is defined according to the number of trailing successes m in the first t trials as follows: Et
m 7 (3
if m = 0, !,••-,& — 1 if rn = k if m > k.
(3.17)
The two ending-block states (3 and 7 carry the following interpretation: (i) Waiting state (x, 7), x = 1, • • •, /„: Yt = [x, 7) means that m = k and that the x-th success run of size
Runs and Patterns in a Sequence of Two-State
36
Trials
k has occurred at the t-th trial, and (ii) Overflow state (x, /3), x = 1, • • •, ln: Yt = (x,(3) means that m > k and that exactly x success runs of size k have appeared prior to the last m + 1 outcomes (FS• • • S).
With these ending blocks in mind, we can easily construct the partition for the state space Q: Co = {(0, i) : i = /3,0,1, • • •, k — 1}, and Cx = {(x, i) : i = 7, (3, 0,1, • • •, k - 1}, for x = 1, • • •, ln. The probabilities for the transition matrices Mt{Et,k) of the imbedded Markov chain {Yt} are specified by the following equation: qt pt
if y = x and j = 0, for x = 0,1, • • •, ln, and i = 7, /?, 0,1, •••,ft-l if y = x and j = i + 1, for x = 0,1 • • •, ln, and i = 0,1, • • • , k - 2
P(x,i)(y,j)(t) = \ Pt if y = x + 1, j pt if y = x - 1, j Pt if j/ = x and j 1 if y — x = ln 0 otherwise.
= 7 and i = k - 1, for x = 0,1, • • •, ln - 1 = j3 and i = 7, for x = 1, • • •, ln = i = f3, for a; — 0,1 • • •, /„ and j = i = k — 1
(3.18) As an illustration, consider the case n = 5 and A; = 2, for which we have the transition probability matrices (0,/?) (0,0) (0,1) (1,7)
(1.0) Mt(Es,2)
= (1,0) (1,1) (2,7) (2,/3) (2,0) (2,1)
fft 0 0 Pt 0 0 0 0 0 0
\°
Pt 0 qt pt 0 Qt 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 pt 0 0 0 0 0 0 0 0
0 0 0 0 pt 0 0
0 0 0 Qt
pt 0 0 0
0 0 0 0
Qt Qt Qt
0 0 0 0 0 Pt 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 Pt 0 0 0 0 Pt 0 0 0 0
0 0 0 0 0 0 0 Qt Qt Qt
0
0 \ 0 0 0 0 0 0 0 0 Pt
1 / (3. 19)
The Distribution
of the Longest Success
Run
37
In general, the transition probability matrices of the Markov chain {Yt} associated with En
3.6
The Distribution of the Longest Success Run
Let Ln(S) be the length of the longest success run in a sequence of twostate trials. For the case of n independent tosses of a fair coin, let An(k) be the number of sequences of length n in which the longest run of successes (heads) is less than or equal to k. Since all the sequences are equally likely with probability 1/2™, the distribution of the longest run is P(Ln(S)
2~nAn(k),
(3.20)
where An(k) satisfies the recursive equation (Schilling 1990)
An(k)
S , = o An-i-j{k) 2n 1
if n > k if n < k if fc = 0.
(3.21)
For biased coins (p ^ 1/2), combinatorial analysis yields n
(3.22) x=0
for 1 < k < n, and P(Ln(S) = 0) = qn, where Cn\k) is the number of sequences of length n in which exactly x successes occur, but in which no more than k of these successes occur consecutively. C„ (k) can be obtained through the recursive equation
C^(k)
(") 0
if x < k < n if k < x = n.
(3.23)
More generally, suppose that the probabilities of success and failure could be different in each trial, equal to pt and qt, respectively, for t = 1, • • •, n. The following theorem derives the distribution for Ln(S):
38
Runs and Patterns
Theorem 3.1
in a Sequence of Two-State
Trials
For 0 < k < n, n
P(Ln{S) < k) =
t([[Nt)llx{k+1),
(3.24)
t=i
where £ = (1,0, •••,0) is a 1 x (k + 1) unit row vector, and Nt is, as indicated below, the (k + 1) x (k + 1) essential submatrix of the following transition probability matrix: 0 1
(
qt
Pt
Qt
0
0 Pt
•
•
0
•
0
0 \ 0
(
Mt
ct
Nt i
1 k a
Qt
\°
0 0
• •
0 0
Pt
1
J
(fc+2)x(fc+2)
(3.25)
Proof. The longest success run in a sequence of two-state trials is related to the runs statistics Nn^, Gn,k and Mn^ in the following simple way: L„(S) < k if and only if iV„,/c+i = G n , fc+ i = M „ | k + i = 0. Thus P(Ln(S) < k) = P(NUik+i = 0), and we can complete the proof by considering Eq. (3.8) for P(Nntk+i = x) with x = 0. Changing states (0,0), (0,1), • • •, (0, fc) in this application of Eq. (3.8) to states 0,1,2, • • •, k, respectively, and combining all other states into the absorbing state a, we obtain n
P(Ln(S)
k+1 =0) = ^ 0 ( J j M t ) ( l , • • •, 1,0)', t=i
where £o = (1,0, • • •, 0)i x ( f c + 2 ) = (£ : 0). With the notation (1, • • •, l,0)ix(fc+2) = ( 1 : 0)> and since the product of the transition probability matrices has the form
II M < the theorem follows immediately.
I$=1Nt\Ct{n) 6 l D
The Distribution
Corollary 3.1
of the Longest Success Run
39
Given 1 < k < n, we have the recursive equation k
P(Ln(S)
n
< k)+J2ln-i i=\
[J
PjP(Ln-i-i(S)
< k),
j=n—i+l
(3.26) with P(Ln{S)
= 0) = n " = i
< n) = 1 for k = n.
Proof. It follows from the structure of the transition probability matrices Mt given by Eq. (3.25) that (i) Mte0
= g t ( l , - - - , l , 0 ) ' l x ( f e + 2 ) , and
(ii) Mte'i = ptei-x,
for i = 1, • • •, k,
where e, = (0, • • •, 0,1, 0, • • •, 0) is a 1 x (k + 2) unit row vector having 1 at the coordinate associated with state i, for i = 0,1, • • •, k. Since (1, • • •, 1,0) = $2i=o e i ' o u r r e s u l t is a direct consequence of (i), (ii) and backward multiplications of Eq. (3.24). • Taking pt = qt = 1/2 for allt = 1, • • •, n, and multiplying P(Ln(S) < k) by 2 n , Eq. (3.26) yields the recursive equation (3.21) for An(k). Theorem 3.1 can also be extended to the longest failure run Ln{F) and the longest run statistic Ln = ma,x(Ln(S), Ln{F)). For i.i.d. cases and for large n, there are several outstanding results on the length of the longest success run. Renyi (1970), Csorgo (1979), Erdos and Renyi (1970) and Erdos and Revesz (1975) show that, a £ n - > o o , Ln(b)
g.s
logi/p n
This result is often referred to as the new law of large numbers. In Chapter 5, we will develop a large deviation approximation for the probability of Ln(S) under i.i.d. (and homogeneous Markov-dependent) trials: P(Ln(S)
< k) - exp{-n/3},
where (3 = — log A[i] and A[i] is the largest eigenvalue of the essential transition probability submatrix Nt (with constant pt and qt) given by Eq. (3.25).
40
Runs and Patterns in a Sequence of Two-State
3.7
Trials
Waiting-Time Distribution of a Success R u n
Let Λ = S···S be the simple pattern of k consecutive successes, and define the random variable W(Λ) as the waiting time for pattern Λ to occur, i.e.

    W(Λ) = inf{n : X_{n-k+1} = X_{n-k+2} = ... = X_n = S}.

For example, given k = 3, W(Λ) = 6 means that the pattern SSS occurs for the first time after six trials, as in SFFSSS. The distribution of W(Λ) for Bernoulli trials is often referred to as the geometric distribution of order k (see Aki 1985 and Hirano 1986).

Theorem 3.2 For a given pattern length k ≥ 1 and a sequence of Bernoulli trials {X_t}, the distribution of W(Λ) is given by

    P(W(Λ) = n) = ξ N^{n-1}(Λ)(I - N(Λ)) 1',                    (3.27)

where ξ = (1, 0, ..., 0) is a 1 × k row vector, and N(Λ) is the k × k essential transition probability submatrix of

             ( q  p  0  ...  0 | 0 )
             ( q  0  p  ...  0 | 0 )
    M(Λ)  =  ( :  :     ...    | : )                            (3.28)
             ( q  0  0  ...  p | 0 )
             ( q  0  0  ...  0 | p )
             ( 0  0  0  ...  0 | 1 )_{(k+1)×(k+1)},

with states 0, 1, ..., k-1 followed by the absorbing state α. It further follows that the probability generating function of W(Λ) is given by

    φ_W(s) = p^k s^k (1 - ps) / (1 - s + q p^k s^{k+1}).        (3.29)
Proof. Given Λ, n, and k ≤ n, it follows from the definitions of W(Λ) and N_{n,k} that these two random variables have the following relationship: W(Λ) ≤ n if and only if N_{n,k} ≥ 1, for all n ≥ k. Hence P(W(Λ) ≤ n) = P(N_{n,k} ≥ 1), and

    P(W(Λ) = n) = P(N_{n,k} ≥ 1) - P(N_{n-1,k} ≥ 1)
                = P(N_{n-1,k} = 0) - P(N_{n,k} = 0).             (3.30)
Since we only need the general result for P(N_{n,k} = 0) to complete the proof, we can, as in the previous section on the longest success run, replace the states (0,0), ..., (0, k-1) defined in Section 3.2 by the states 0, 1, ..., k-1, and combine all other states into an absorbing state α. Under this reduced state space, the transition probability matrix M(N_{n,k}) simplifies to M(Λ). Equation (3.27) of Theorem 3.2 is then an immediate consequence of Eqs. (3.8), (3.30) and Theorem 3.1. Note that, for 0 ≤ i ≤ k-1,

    e_i N(Λ) = q e_0 + p e_{i+1}

(with the convention e_k = 0). Given ξ = e_0, using the above forward-multiplication result yields the following recursive equation:

    P(W(Λ) = n) = ξ N^{n-1}(Λ)(I - N(Λ)) 1' = Σ_{i=1}^{k} q p^{i-1} P(W(Λ) = n - i).   (3.31)

The above recursive equation and the boundary condition P(W(Λ) = k) = p^k lead to the following recursive equation for the probability generating function φ_W(s):

    φ_W(s) = p^k s^k + φ_W(s) Σ_{i=1}^{k} q p^{i-1} s^i.
Summing the finite power series yields the explicit result for φ_W(s) given in Eq. (3.29), a result which was first derived by Feller (1968) using renewal theory. □

Let us define the matrices

         ( q  p  0  ...  0 )            ( 0  0  ...  0 )
         ( q  0  p  ...  0 )            ( :  :       : )
    A  = ( :  :     ...    ) ,     B  = ( 0  0  ...  0 ) ,
         ( q  0  0  ...  p )            ( p  0  ...  0 )_{k×k}
         ( q  0  0  ...  0 )_{k×k}

    A* = (1)_{1×1},  and  B* = (0, ..., 0, p)'_{k×1},
and let W(m, Λ) be the waiting time for the m-th run of k consecutive successes (under non-overlap counting). Similar to the above development for W(Λ), the distribution of the random variable W(m, Λ) can also be obtained using Eq. (3.27) by replacing the transition probability matrix M(Λ) by

                ( A  B            )
                (    A  B         )
    M(m, Λ)  =  (       ...  ...  )
                (          A  B*  )
                (             A*  )_{(mk+1)×(mk+1)}.

Note that the transition probability matrix M(Λ) in Eq. (3.28) is the special case of M(m, Λ) with m = 1. Since W(m, Λ) = Σ_{i=1}^{m} W_i(Λ), where W_i(Λ) represents the waiting time from the (i-1)-th occurrence until the i-th occurrence of pattern Λ, and since the random variables W_i(Λ) are i.i.d., it follows from Eq. (3.29) that the probability generating function of W(m, Λ) is

    φ_{W(m,Λ)}(s) = φ_W^m(s) = [ p^k s^k (1 - ps) / (1 - s + q p^k s^{k+1}) ]^m.   (3.32)
The probability generating function φ_W(s) always exists for all |s| ≤ 1. This follows from its definition and the fact that

    |φ_W(s)| ≤ Σ_{n=1}^{∞} |s|^n P(W = n) ≤ Σ_{n=1}^{∞} P(W = n) = 1.

However, the probability generating function φ_W(s) may exist beyond the region |s| ≤ 1. The exact region varies from problem to problem. We shall return to discuss the largest region of existence for φ_W(s) in Section 5.7. The distribution of W(m, Λ) can also be obtained through the equation

    P(W(m, Λ) = n) = (1/n!) (d^n/ds^n) φ_{W(m,Λ)}(s) |_{s=0},

an approach which may be most easily accomplished using symbolic manipulation software (e.g. MAPLE or MATLAB). Further treatment of waiting-time distributions is given in Chapter 5 for both simple and compound patterns under i.i.d. as well as Markov-dependent multi-state trials.
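Equation (3.27) is also convenient to evaluate directly. The following Python sketch (ours, not from the book; numpy is assumed and the names are illustrative) computes P(W(Λ) = n) from the essential submatrix N(Λ) and checks the result against a brute-force enumeration of all 2^n outcome sequences:

```python
import itertools
import numpy as np

def waiting_time_pmf(n, k, p):
    """P(W(Lambda) = n) via Eq. (3.27): xi N^{n-1} (I - N) 1', where N
    is the k x k essential submatrix N(Lambda) of Eq. (3.28)."""
    q = 1.0 - p
    N = np.zeros((k, k))
    for i in range(k):
        N[i, 0] = q              # failure: back to state 0
        if i < k - 1:
            N[i, i + 1] = p      # success: one step closer to the run
    xi = np.zeros(k); xi[0] = 1.0
    return float(xi @ np.linalg.matrix_power(N, n - 1)
                 @ (np.eye(k) - N) @ np.ones(k))

def waiting_time_pmf_brute(n, k, p):
    """Check by direct enumeration of all 2^n outcome sequences (1 = S)."""
    pat, total = (1,) * k, 0.0
    for seq in itertools.product((0, 1), repeat=n):
        ends = [t for t in range(k, n + 1) if seq[t - k:t] == pat]
        if ends and ends[0] == n:   # first run of k successes ends at trial n
            s = sum(seq)
            total += p ** s * (1 - p) ** (n - s)
    return total
```

For large n, the recursion (3.31) evaluates the same probabilities with O(nk) arithmetic operations, avoiding matrix powers altogether.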
3.8 Numerical Examples
Before studying more complex runs statistics, in this section we provide some numerical results for the runs statistics and waiting times described in previous sections in order to illustrate the theoretical results. Given the transition probability matrix (or matrices) of the imbedded Markov chain {Y_t}, in general we need only two kinds of formulae, in the forms of Eqs. (3.8) and (3.27), for evaluating the distributions of X_n(Λ) and W(Λ), respectively. The formulae are simple and computationally efficient, suitable even for very large n. The numerical results presented here can also serve the purpose of checking one's programming computations. In all the examples considered, the computational time for obtaining each distribution is minimal, a fraction of a second on a current PC.

Table 3.1 gives the exact distributions and means of the random variables E_{15,2}, G_{15,2}, N_{15,2}, M_{15,2} and L_{15}(S) under the assumption that {X_t}_{t=1}^{15} is a sequence of independent two-state trials with probabilities p_t = 1/(t+1) for t = 1, 2, ..., 15.
Table 3.1 The exact distributions and means of five runs statistics with n = 15 and p_t = 1/(t+1).

    x      E_{15,2}   G_{15,2}   N_{15,2}   M_{15,2}   L_{15}(S)
    0      .73200     .67163     .67163     .67163     .06250
    1      .24771     .30120     .29046     .24125     .60913
    2      .01976     .02646     .03602     .06881     .26135
    3      .00051     .00070     .00184     .01516     .05531
    4      3.99e-07   5.27e-08   .00004     .00270     .00991
    5      5.08e-09   5.69e-09   4.85e-07   .00040     .00156
    6                            2.28e-09   .00005     .00021
    7                            3.10e-12   5.68e-06   .00003
    8                                       5.54e-07   2.88e-06
    9                                       4.81e-08   2.85e-07
    10                                      3.71e-09   2.58e-08
    11                                      2.55e-10   2.14e-09
    12+                                     1.58e-11   1.64e-10
    Mean   .28879     .35625     .36821     .43750     1.34668
Table 3.2 gives the waiting-time distributions of the first success run of length k, for various values of k and state probabilities p_t. Additional numerical results on waiting-time distributions will be provided in Chapters 5 and 7.

Table 3.2 Some waiting-time distributions of the first success run of length k.

    n      p = .9            p_t = t/(t+1)     p_t = 1 - 1/2^t
           k=2     k=3       k=2     k=3       k=2     k=3     k=5
    2      .8100             .3333             .3750
    3      .0810   .7290     .2500   .2500     .3281   .3281
    4      .0810   .0729     .2000   .2000     .2051   .3076
    5      .0154   .0729     .1111   .1667     .0710   .1987   .2980
    6      .0088   .0729     .0595   .1429     .0177   .1118   .2933
    7      .0023   .0198     .0271   .0937     .0028   .0397   .1940
    8      .0010   .0144     .0117   .0611     .0003   .0111   .1104
    9      .0003   .0091     .0046   .0383             .0026   .0588
    10     .0001   .0038     .0017   .0219             .0004   .0303
    11             .0024     .0006   .0122             .0001   .0108
    12             .0013     .0002   .0066                     .0032
    13             .0007     .0001   .0034                     .0008
    14             .0004             .0017                     .0002
    15+            .0002             .0008
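The non-homogeneous columns of Table 3.2 can be reproduced with Eq. (3.27) once the single matrix power N^{n-1} is replaced by a product of time-varying matrices. A small sketch (ours; numpy is assumed, names are illustrative):

```python
import numpy as np

def waiting_time_pmf_nonhom(n, k, p_seq):
    """P(W = n) for the first success run of length k under independent
    trials with P(trial t = S) = p_seq[t-1], using time-varying versions
    of the matrix in Eq. (3.28): states 0, ..., k-1 plus an absorbing
    state (index k) entered when the run is completed."""
    xi = np.zeros(k + 1); xi[0] = 1.0
    absorbed_before = 0.0
    for t in range(1, n + 1):
        p = p_seq[t - 1]; q = 1.0 - p
        M = np.zeros((k + 1, k + 1))
        for i in range(k):
            M[i, 0] = q          # failure resets the run
            M[i, i + 1] = p      # success advances (state k-1 -> absorbed)
        M[k, k] = 1.0
        if t == n:
            absorbed_before = xi[k]   # P(W <= n-1)
        xi = xi @ M
    return float(xi[k] - absorbed_before)

# reproduce the p_t = t/(t+1), k = 2 column of Table 3.2
p_t = [t / (t + 1.0) for t in range(1, 21)]
row = [waiting_time_pmf_nonhom(n, 2, p_t) for n in (2, 3, 4)]
```

The first three entries of `row` agree with the tabulated values .3333, .2500, .2000 to the table's four decimals.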
3.9 Number of Successes in Success Runs of Length Greater Than or Equal to k

Let S_{n,k} be the total number of successes in success runs of length greater than or equal to k, for k = 1, ..., n. It can be written as

    S_{n,k} = Σ_{i=k}^{n} i R_n(i),                              (3.33)

where R_n(i), for i = k, ..., n, is the number of success runs of length exactly equal to i in a sequence {X_t}. For k = 1, Eq. (3.33) is equivalent to the
total number of successes in the sequence {X_t}; i.e.

    S_{n,1} = Σ_{i=1}^{n} I_{X_i},

where I_{X_i} = 1 if the i-th trial is a success and zero otherwise. If {X_t} is a sequence of Bernoulli trials, then S_{n,1} has binomial and normal distributions as exact and limiting distributions, respectively. More generally, for the case where {X_t} is a sequence of Markov-dependent random variables, the statistic S_{n,k} also has a normal limiting distribution, as determined by Nagaev (1957) for k = 1 and by Fu, Lou, Bai and Li (2002) for k ≥ 2. In this section, we only study the exact distribution of S_{n,k} with k ≥ 2.

Let L_j, for j ≥ 2, be the length of the success run located between the (j-1)-th and j-th failures in the sequence {X_t}, with L_1 = 0 if the first trial is a failure and L_1 = l if the first l trials are successes and the (l+1)-th trial is a failure. For a given time index t, let m_t be the number of failures in the subsequence X_1, X_2, ..., X_t, and let L_t^* represent the number of successes that occur after the m_t-th failure in this subsequence. Note that 0 ≤ L_t^* ≤ t, and
    S_{t,k} = Σ_{j=1}^{m_t} L_j(k) + L_t^*(k),                   (3.34)

where

    L_j(k) = L_j I_{{L_j ≥ k}},  and  L_t^*(k) = L_t^* I_{{L_t^* ≥ k}}.   (3.35)
Here I_{{L_j ≥ k}} is an indicator function equal to one if L_j ≥ k and zero otherwise (I_{{L_t^* ≥ k}} is defined similarly). To capture the relevant information in the subsequence X_1, X_2, ..., X_t, we define a new sequence of random variables in the form of the two-component vector

    Y_t = (S_{t,k}, E_t(k)),  t = 1, ..., n,                     (3.36)

where S_{t,k} indicates the total number of successes in success runs of length greater than or equal to k in the first t trials, and E_t(k) is the ending-block random variable given by

    E_t(k) = L_t^* (1 - I_{{L_t^* ≥ k}}) + k^+ I_{{L_t^* ≥ k}}.
In this expression, k^+ is a symbol representing the state where L_t^* is greater than or equal to k. The ending block E_t(k) represents the length of the success run counting backward from the t-th trial, with E_t(k) = 0 if the t-th trial is a failure and E_t(k) = k^+ if the length is greater than or equal to k. More specifically, consider that from the m_t-th (or most recent) failure F to the end of the subsequence X_1, X_2, ..., X_t, we can only have the following possible outcomes: {F, FS, ..., FS···S, or S···S if m_t = 0}; the random variable E_t(k) is equal to the number of successes in these outcomes if the number is less than k, and E_t(k) = k^+ if it is equal to or greater than k. This ending block of the first t trials provides essential information about the transition probabilities from Y_t to Y_{t+1}.

We define the state space

    Ω = {(u, v) : u = 0, k, ..., n-1, n; v = 0, 1, ..., k-1, k^+}    (3.37)

with size d = card(Ω) = (n - k + 2)(k + 1), and consider here the case where the sequence {X_t} is a homogeneous Markov chain with transition probabilities p_FF, p_FS, p_SF, and p_SS. In our counting procedure, the sequence of random vectors Y_t = (S_{t,k}, E_t(k)), t = 1, 2, ..., n, defined on Ω obeys the following rules:

(i) Given Y_{t-1} = (x, 0), then Y_t = (x, 0) with probability p_FF if the outcome of the t-th trial is F, and Y_t = (x, 1) with probability p_FS if the outcome of the t-th trial is S.

(ii) Given Y_{t-1} = (x, y) for 1 ≤ y ≤ k-2, then Y_t = (x, 0) with probability p_SF if the outcome of the t-th trial is F, and Y_t = (x, y+1) with probability p_SS if the outcome of the t-th trial is S.

(iii) Given Y_{t-1} = (x, k-1), then Y_t = (x, 0) with probability p_SF if the outcome of the t-th trial is F, and Y_t = (x+k, k^+) with probability p_SS if the outcome of the t-th trial is S.

(iv) Given Y_{t-1} = (x, k^+), then Y_t = (x, 0) with probability p_SF if the outcome of the t-th trial is F, and Y_t = (x+1, k^+) with probability p_SS if the outcome of the t-th trial is S.

In view of our construction, the sequence {Y_t = (S_{t,k}, E_t(k)) : t = 1, 2, ..., n} forms a homogeneous Markov chain with transition probability matrix

    M = (p_{(x,y),(u,v)})_{d×d},
where the transition probabilities p_{(x,y),(u,v)}, under lexicographical ordering of the states (·,·), can be specified explicitly as follows. Given (x, y) ∈ Ω,

                      p_FF  if y = v = 0 and u = x,
                      p_FS  if y = 0, v = 1, and u = x,
                      p_SF  if y ≠ 0, v = 0, and u = x,
    p_{(x,y),(u,v)} = p_SS  if 1 ≤ y ≤ k-2, v = y+1, and u = x,      (3.38)
                      p_SS  if y = k-1, v = k^+, and u = x+k,
                      p_SS  if y = k^+, v = k^+, and u = x+1,
                      0     otherwise.

Hence the random variable S_{n,k} is finite Markov chain imbeddable, and the exact probabilities can be obtained from

    P(S_{n,k} = x) = ξ_0 M^{n-1} U'(C_x),  x = 0, k, ..., n,      (3.39)

where the 1 × d row vector ξ_0 = (q_0, p_0, 0, ..., 0) is the initial distribution of Y_1, the partition {C_x} is defined as

    C_x = {(x, y) : y = 0, 1, ..., k-1, k^+},  x = 0, k, ..., n,

and U'(C_x) is the transpose of the 1 × d row vector U(C_x) = (0, ..., 0, 1, ..., 1, 0, ..., 0) with ones at the coordinates corresponding to states in C_x.

To gain further insight on the effects of the various parameters, the distributions of S_{n,k} for a few representative cases are shown graphically in Figure 3.1 with n = 15, 30, and 60, where the initial distribution ξ_0 = (1, 0, ..., 0) is assumed. When p_SS is small (e.g. p_SS = 0.2), the parameters' effects on the distribution are less pronounced, and hence in Figure 3.1, only cases with a large value of p_SS (= 0.8) are presented. For purposes of comparison, the expectations are also included. For fixed n, the effect of k and p_FS can be summarized as follows. For small k (k = 2), the distribution of S_{n,k}
[Figure 3.1: nine panels (a)-(i), rows n = 15, 30, 60 and columns k = 2, 4, 7; horizontal axis: number of successes.]

Fig. 3.1 Representative distributions of S_{n,k} with n = 15, 30, 60, k = 2, 4, 7, p_SS = 0.8 and p_FS = 0.15 (solid line), 0.5 (dotted line), 0.85 (dashed line). The means of the distributions are, in ascending order of p_FS, as follows: ES_{n,k} = (a) 5.27, 9.76, 11.44; (b) 4.09, 7.65, 8.98; (c) 2.30, 4.43, 5.24; (d) 11.44, 20.05, 23.09; (e) 9.35, 16.43, 18.93; (f) 6.01, 10.61, 12.25; (g) 23.78, 40.62, 46.41; (h) 19.89, 33.98, 38.83; (i) 13.43, 22.97, 26.25.
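The imbedding of Section 3.9 is straightforward to program. The sketch below (ours, not from the book; numpy is assumed, names are illustrative, and k ≥ 2 as in the text) builds the chain from rules (i)-(iv), applies Eq. (3.39), and checks the resulting distribution against a brute-force enumeration of all 2^n sequences:

```python
import itertools
import numpy as np

def snk_distribution(n, k, pFF, pFS, pSF, pSS, q0, p0):
    """Distribution of S_{n,k} via the imbedding of Section 3.9 (k >= 2):
    states (u, v), u in {0, k, ..., n}, v in {0, ..., k-1, 'k+'},
    with P(S_{n,k} = x) = xi_0 M^{n-1} U'(C_x) as in Eq. (3.39)."""
    u_vals = [0] + list(range(k, n + 1))
    v_vals = list(range(k)) + ['k+']
    states = [(u, v) for u in u_vals for v in v_vals]   # lexicographic order
    idx = {s: i for i, s in enumerate(states)}
    M = np.zeros((len(states), len(states)))
    for (u, v) in states:
        i = idx[(u, v)]
        if v == 0:                       # rule (i): last trial was F
            moves = [((u, 0), pFF), ((u, 1), pFS)]
        elif v == 'k+':                  # rule (iv): current run already >= k
            moves = [((u, 0), pSF), ((u + 1, 'k+'), pSS)]
        elif v == k - 1:                 # rule (iii): run reaches k; add k at once
            moves = [((u, 0), pSF), ((u + k, 'k+'), pSS)]
        else:                            # rule (ii): 1 <= v <= k-2
            moves = [((u, 0), pSF), ((u, v + 1), pSS)]
        for s2, pr in moves:
            M[i, idx.get(s2, i)] += pr   # overflow targets are never reached
    xi = np.zeros(len(states))
    xi[idx[(0, 0)]] = q0                 # Y_1 = (0, 0): first trial F
    xi[idx[(0, 1)]] = p0                 # Y_1 = (0, 1): first trial S
    xi = xi @ np.linalg.matrix_power(M, n - 1)
    dist = {u: 0.0 for u in u_vals}
    for (u, v), i in idx.items():
        dist[u] += xi[i]                 # sum over the partition C_u
    return dist

def snk_brute(n, k, pFF, pFS, pSF, pSS, q0, p0):
    """Check by enumerating all 2^n sequences (1 = S, 0 = F)."""
    step = {(0, 0): pFF, (0, 1): pFS, (1, 0): pSF, (1, 1): pSS}
    dist = {}
    for seq in itertools.product((0, 1), repeat=n):
        w = p0 if seq[0] else q0
        for a, b in zip(seq, seq[1:]):
            w *= step[(a, b)]
        s, run = 0, 0
        for x in seq + (0,):             # sentinel failure flushes the last run
            if x == 1:
                run += 1
            else:
                if run >= k:
                    s += run             # successes in runs of length >= k
                run = 0
        dist[s] = dist.get(s, 0.0) + w
    return dist
```

The two computations agree exactly; only the matrix version remains feasible for the n = 60 cases of Figure 3.1.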
Chapter 4

Runs and Patterns in Multi-State Trials

4.1 Introduction
In Chapter 3, we discussed key ideas of the finite Markov chain imbedding technique to obtain the exact distributions for the numbers of success runs and patterns in a sequence of two-state trials. The main goal of this chapter is to extend the finite Markov chain imbedding technique to study the number of runs and patterns in a sequence of multi-state trials. It might seem that, in principle, the extension should be somewhat straightforward and should require only minor modifications. However, this is not the case, especially when the pattern is complex and the sequence {Xt} consists of Markov-dependent multi-state trials. The main difficulties are due to the complexity of constructing a proper finite Markov chain associated with the random variable Xn(A), especially in the process of obtaining the transition probabilities. In order to overcome these difficulties, we introduce the forward and backward principle. Our focus in this chapter will be on using the forward and backward principle to obtain the distributions of simple and compound patterns. In fact, the forward and backward principle plays an indispensable role in constructing the imbedded Markov chain for almost every application covered by this book.
4.2 Forward and Backward Principle with Non-Overlap Counting

Let us start with the simple case that {X_t}_{t=1}^{n} is a sequence of i.i.d. multi-state trials. Each trial has m (m ≥ 2) possible outcomes (states or symbols), labeled S = {b_1, ..., b_m} and occurring with probabilities p_1, ..., p_m,
respectively. Denote by X_n(Λ) the number of non-overlapping simple patterns Λ in the sequence {X_t}. First, we would like to introduce the forward and backward principle for the finite Markov chain imbedding technique, a principle which will guide the construction of an imbedded Markov chain {Y_t} and the determination of its transition probability matrices. For convenience of discussion, the forward and backward principle is introduced via the following example.

Example 4.1 Let us consider a simple pattern Λ = b_1 b_1 b_2 in a sequence of n three-state trials (S = {b_1, b_2, b_3}).

(i) Decompose the pattern Λ = b_1 b_1 b_2 into a set of sequential subpatterns S(Λ) = {b_1, b_1 b_1, b_1 b_1 b_2}. Define

    E = S ∪ S(Λ) = {b_1, b_2, b_3, b_1 b_1, b_1 b_1 b_2}          (4.1)

as a set of ending blocks induced by the pattern b_1 b_1 b_2 with respect to the sequence of trials {X_t}.

(ii) Let ω = (x_1, ..., x_n) be a realization of a sequence of n three-state trials. We define a state space

    Ω = {(u, v) : u = 0, 1, ..., [n/3], v ∈ E} ∪ {∅} - {(0, b_1 b_1 b_2)}    (4.2)

and a Markov chain

    {Y_t = (X_t(Λ), E_t), t = 1, 2, ..., n}                      (4.3)

operating on ω as Y_t(ω) = (u, v), for t = 1, ..., n, where

    u = X_t(Λ)(ω) = the total number of non-overlapping patterns Λ in the first t trials, counting forward from the first trial to the t-th trial, and

    v = E_t(ω) = the longest ending block in E, counting backward from X_t.
The definitions of u and v for the sequence of the first t trials are graphically illustrated in Figure 4.1. To make the imbedded Markov chain {Y_t} and the concept of the longest ending block more transparent, let us consider the following realization,
Fig. 4.1 Diagram illustrating the forward and backward counting procedure. Note that when t' = t and the u-th pattern occurred at the t-th trial, then, under non-overlap counting, the longest possible ending block is the pattern itself (v = Λ).
ω = (b_3 b_1 b_2 b_1 b_1 b_2 b_1), of a sequence of seven three-state trials. Applying the forward and backward principle, the corresponding realization of the imbedded Markov chain {Y_t} on ω is given by {Y_1(ω) = (0, b_3), Y_2(ω) = (0, b_1), Y_3(ω) = (0, b_2), Y_4(ω) = (0, b_1), Y_5(ω) = (0, b_1 b_1), Y_6(ω) = (1, b_1 b_1 b_2) and Y_7(ω) = (1, b_1)}. Note that for every given ω, the realization of the imbedded Markov chain Y_t(ω) = (u, v) is uniquely determined by the above procedures (i) and (ii) under non-overlap counting. In plain words, the ending block v represents the status of forming the next pattern Λ for the subsequence {x_1, ..., x_t} containing u complete patterns.

(iii) The imbedded Markov chain {Y_t} is homogeneous and its transition probability matrix M = (p_{(x,z),(u,v)}) may be determined as follows. For example, given Y_5(ω) = (0, b_1 b_1), since X_6 can only be one of the three possible outcomes b_1, b_2 and b_3, the forward and backward counting procedure yields

                               ( (0, b_1 b_1)      if X_6 = b_1 (with probability p_1)
    Y_5(ω) = (0, b_1 b_1)  →   ( (1, b_1 b_1 b_2)  if X_6 = b_2 (with probability p_2)    (4.4)
                               ( (0, b_3)          if X_6 = b_3 (with probability p_3),

and Y_5(ω) goes to any other state with probability zero. In this manner, all of the transition probabilities P(Y_t = (u, v) | Y_{t-1} = (x, z)) can be obtained. The dummy state ∅ will be added as an initial state with P(Y_0 = ∅) = 1 and with transition probabilities P(Y_1 = (0, b_i) | Y_0 = ∅) = p_i for i = 1, 2, 3. Note that the state (0, Λ = b_1 b_1 b_2) was deleted from the state space, since whenever the ending block v equals Λ there must be at least one occurrence
of the pattern in the sequence (i.e. u ≥ 1 if v = Λ).

(iv) Given n, we have the following partition on the state space Ω:

    {C_∅ = [∅], C_0 = [(0, b_1), (0, b_2), (0, b_3), (0, b_1 b_1)], and
     C_x = [(x, v) : v ∈ E], x = 1, ..., [n/3]}.                 (4.5)

For n = 5 and the initial probability P(Y_0 = ∅) = 1, it follows from the above procedures (i) to (iv) that the imbedded Markov chain {Y_t}_{t=0}^{5} is defined on the state space Ω = {∅, (0, b_1), (0, b_2), (0, b_3), (0, b_1 b_1), (1, b_1 b_1 b_2), (1, b_1), (1, b_2), (1, b_3), (1, b_1 b_1)} with transition probability matrix M (rows and columns ordered as listed, with Λ = b_1 b_1 b_2)

                  ∅  (0,b_1) (0,b_2) (0,b_3) (0,b_1b_1) (1,Λ) (1,b_1) (1,b_2) (1,b_3) (1,b_1b_1)
    ∅             0   p_1     p_2     p_3     0          0     0       0       0       0
    (0,b_1)       0   0       p_2     p_3     p_1        0     0       0       0       0
    (0,b_2)       0   p_1     p_2     p_3     0          0     0       0       0       0
    (0,b_3)       0   p_1     p_2     p_3     0          0     0       0       0       0
    (0,b_1b_1)    0   0       0       p_3     p_1        p_2   0       0       0       0
    (1,Λ)         0   0       0       0       0          0     p_1     p_2     p_3     0       (4.6)
    (1,b_1)       0   0       0       0       0          0     0       p_2     p_3     p_1
    (1,b_2)       0   0       0       0       0          0     p_1     p_2     p_3     0
    (1,b_3)       0   0       0       0       0          0     p_1     p_2     p_3     0
    (1,b_1b_1)    0   0       0       0       0          0     0       0       0       1

The probabilities P(X_5(Λ) = x) = ξ_0 M^5 U'(C_x), x = 0, 1, can then be easily computed. ○
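The forward and backward procedure of Example 4.1 is easy to mechanize. The sketch below (ours, not from the book; names are illustrative) propagates the probability vector of {Y_t} instead of writing out M as in Eq. (4.6) — the transition step is exactly the longest-ending-block update of procedure (ii) — and checks the result against direct enumeration of all sequences:

```python
import itertools

def pattern_dist(n, pattern, symbols, probs):
    """P(X_n(Lambda) = x) for a simple pattern under i.i.d. multi-state
    trials, via the forward and backward principle with non-overlap
    counting.  States are (u, v), v an ending block in E = S u S(Lambda)."""
    p = dict(zip(symbols, probs))
    E = list(symbols) + [pattern[:i] for i in range(2, len(pattern) + 1)]

    def step(state, j):
        u, v = state
        w = ('' if v == pattern else v) + j     # non-overlap: used trials dropped
        v2 = max((e for e in E if w.endswith(e)), key=len)  # longest ending block
        return (u + 1, v2) if v2 == pattern else (u, v2)

    dist = {(0, ''): 1.0}                       # the dummy initial state
    for _ in range(n):
        nxt = {}
        for state, w in dist.items():
            for j in symbols:
                s2 = step(state, j)
                nxt[s2] = nxt.get(s2, 0.0) + w * p[j]
        dist = nxt
    out = {}
    for (u, v), w in dist.items():
        out[u] = out.get(u, 0.0) + w            # sum over the partition C_u
    return out

def pattern_dist_brute(n, pattern, symbols, probs):
    """Check: enumerate all m^n sequences, counting greedily left to right."""
    p = dict(zip(symbols, probs))
    out, k = {}, len(pattern)
    for seq in itertools.product(symbols, repeat=n):
        s = ''.join(seq)
        c, i = 0, 0
        while i + k <= len(s):
            if s[i:i + k] == pattern:
                c, i = c + 1, i + k             # non-overlap: skip the match
            else:
                i += 1
        w = 1.0
        for ch in s:
            w *= p[ch]
        out[c] = out.get(c, 0.0) + w
    return out
```

With pattern 'aab' standing in for b_1 b_1 b_2, `pattern_dist(5, 'aab', 'abc', (p1, p2, p3))` reproduces the probabilities obtained from Eq. (4.6).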
To demonstrate the applicability of the forward and backward principle to compound patterns, let us consider the following example.

Example 4.2 Given n = 4 and a compound pattern Λ = Λ_1 ∪ Λ_2 consisting of the union of two distinct simple patterns Λ_1 = b_1 b_2 and Λ_2 = b_3 b_1, we are interested in finding the distribution of the random variable X_4(Λ), the number of occurrences of either Λ_1 or Λ_2 in a sequence of four i.i.d. three-state trials. Proceeding as in the previous example, one obtains the imbedded Markov chain {Y_t}_{t=0}^{4} defined on the state space

    Ω = {∅, (0, b_1), (0, b_2), (0, b_3), (1, b_1 b_2), (1, b_3 b_1),
         (1, b_1), (1, b_2), (1, b_3), (2, b_1 b_2), (2, b_3 b_1)}       (4.7)

with transition probability matrix (rows and columns ordered as listed)

                  ∅  (0,b_1) (0,b_2) (0,b_3) (1,b_1b_2) (1,b_3b_1) (1,b_1) (1,b_2) (1,b_3) (2,b_1b_2) (2,b_3b_1)
    ∅             0   p_1     p_2     p_3     0          0          0       0       0       0          0
    (0,b_1)       0   p_1     0       p_3     p_2        0          0       0       0       0          0
    (0,b_2)       0   p_1     p_2     p_3     0          0          0       0       0       0          0
    (0,b_3)       0   0       p_2     p_3     0          p_1        0       0       0       0          0
    (1,b_1b_2)    0   0       0       0       0          0          p_1     p_2     p_3     0          0
    (1,b_3b_1)    0   0       0       0       0          0          p_1     p_2     p_3     0          0       (4.8)
    (1,b_1)       0   0       0       0       0          0          p_1     0       p_3     p_2        0
    (1,b_2)       0   0       0       0       0          0          p_1     p_2     p_3     0          0
    (1,b_3)       0   0       0       0       0          0          0       p_2     p_3     0          p_1
    (2,b_1b_2)    0   0       0       0       0          0          0       0       0       1          0
    (2,b_3b_1)    0   0       0       0       0          0          0       0       0       0          1

The probabilities P(X_4(Λ) = x) = ξ_0 M^4 U'(C_x), x = 0, 1, 2, can again be computed with ease. ○
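The same longest-ending-block update handles compound patterns. The sketch below (ours; names illustrative, and the patterns are assumed to share a common length k, as in Example 4.2 and Theorem 4.1) counts an occurrence whenever the ending block equals any one of the simple patterns, then resets:

```python
import itertools

def compound_dist(n, patterns, symbols, probs):
    """Non-overlap counting of a compound pattern (distinct simple
    patterns of a common length) for i.i.d. trials, propagating the
    probability vector over states (u, ending block)."""
    p = dict(zip(symbols, probs))
    E = list(symbols)
    for pat in patterns:
        E += [pat[:i] for i in range(2, len(pat) + 1)]
    E = list(dict.fromkeys(E))                  # remove duplicates

    def step(state, j):
        u, v = state
        w = ('' if v in patterns else v) + j    # reset after a counted pattern
        v2 = max((e for e in E if w.endswith(e)), key=len)
        return (u + 1, v2) if v2 in patterns else (u, v2)

    dist = {(0, ''): 1.0}
    for _ in range(n):
        nxt = {}
        for state, w in dist.items():
            for j in symbols:
                s2 = step(state, j)
                nxt[s2] = nxt.get(s2, 0.0) + w * p[j]
        dist = nxt
    out = {}
    for (u, v), w in dist.items():
        out[u] = out.get(u, 0.0) + w
    return out

def compound_dist_brute(n, patterns, symbols, probs):
    """Check: scan each sequence left to right, counting the earliest
    occurrence that does not reuse trials of an earlier counted one."""
    p = dict(zip(symbols, probs))
    out = {}
    for seq in itertools.product(symbols, repeat=n):
        s = ''.join(seq)
        c, used = 0, 0
        for t in range(1, n + 1):
            if any(t - len(pat) >= used and s[t - len(pat):t] == pat
                   for pat in patterns):
                c, used = c + 1, t
        w = 1.0
        for ch in s:
            w *= p[ch]
        out[c] = out.get(c, 0.0) + w
    return out
```

With Λ_1 = 'ab' and Λ_2 = 'ca' standing in for b_1 b_2 and b_3 b_1, `compound_dist(4, ('ab', 'ca'), 'abc', (p1, p2, p3))` reproduces the probabilities obtained from Eq. (4.8).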
The method can also be extended, with simple modifications, to the case where {X_t} is a sequence of Markov-dependent multi-state trials.

Example 4.3 Let us return to Example 4.1, but consider here that {X_t} is a sequence of Markov-dependent three-state trials with transition probability matrix

         ( p_11  p_12  p_13 )
    A  = ( p_21  p_22  p_23 ).
         ( p_31  p_32  p_33 )

Our goal is to determine the distribution of the pattern Λ = b_1 b_1 b_2 in a sequence of five trials. Analogous to Example 4.1, the transition probabilities of the imbedded Markov chain can be obtained for each state by the following argument. Given Y_3 = (0, b_1 b_1), for example, we have

                         ( (0, b_1 b_1)      if X_4 = b_1 (with probability p_11)
    Y_3 = (0, b_1 b_1) → ( (1, b_1 b_1 b_2)  if X_4 = b_2 (with probability p_12)    (4.9)
                         ( (0, b_3)          if X_4 = b_3 (with probability p_13).

Equations (4.4) and (4.9) are equivalent except that the probabilities p_1, p_2 and p_3 are replaced by p_11, p_12, and p_13, respectively. Hence, the imbedded Markov chain {Y_t} is defined here on the same state space Ω and
has the transition probability matrix (with Λ = b_1 b_1 b_2, and p_1, p_2, p_3 the initial probabilities of the first trial)

                  ∅  (0,b_1) (0,b_2) (0,b_3) (0,b_1b_1) (1,Λ) (1,b_1) (1,b_2) (1,b_3) (1,b_1b_1)
    ∅             0   p_1     p_2     p_3     0          0     0       0       0       0
    (0,b_1)       0   0       p_12    p_13    p_11       0     0       0       0       0
    (0,b_2)       0   p_21    p_22    p_23    0          0     0       0       0       0
    (0,b_3)       0   p_31    p_32    p_33    0          0     0       0       0       0
    (0,b_1b_1)    0   0       0       p_13    p_11       p_12  0       0       0       0
    (1,Λ)         0   0       0       0       0          0     p_21    p_22    p_23    0       (4.10)
    (1,b_1)       0   0       0       0       0          0     0       p_12    p_13    p_11
    (1,b_2)       0   0       0       0       0          0     p_21    p_22    p_23    0
    (1,b_3)       0   0       0       0       0          0     p_31    p_32    p_33    0
    (1,b_1b_1)    0   0       0       0       0          0     0       0       0       1

where the transition probabilities P(Y_t = (u, v) | Y_{t-1} = (x, z)) are obtained as illustrated through Eq. (4.9). Note that the transition probability matrix in Eq. (4.10) has the exact same form as the matrix in Eq. (4.6) for the i.i.d. case. ○

In view of the state transitions outlined by Eqs. (4.4) and (4.9), leading to the transition probability matrices of Examples 4.1 to 4.3 given in Eqs. (4.6), (4.8), and (4.10), we define the following notation: given Y_{t-1} = (x, z) ∈ Ω and X_t = j ∈ S,

    (u, v) = ⟨(x, z), j⟩_Ω,                                      (4.11)

where the state (u, v) ∈ Ω is the result of forward and backward (non-overlap) counting when an additional outcome X_t = j is included. For every (x, z) ∈ Ω, we also define L(z) ∈ S to be the last element of the ending block z. Then for the general case, the transition probabilities of the imbedded Markov chain {Y_t} are specified by the following equation:

    P(Y_t = (u, v) | Y_{t-1} = (x, z)) =
        p_ij  if X_t = j ∈ S, L(z) = i and (u, v) = ⟨(x, z), j⟩_Ω,      (4.12)
        0     otherwise,

where p_ij are the transition probabilities of the Markov chain {X_t}. If {X_t} is a sequence of i.i.d. multi-state trials, then Eq. (4.12) becomes

    P(Y_t = (u, v) | Y_{t-1} = (x, z)) =
        p_j  if X_t = j ∈ S and (u, v) = ⟨(x, z), j⟩_Ω,                 (4.13)
        0    otherwise.
Theorem 4.1 Assuming that {X_t} is a homogeneous Markov chain with transition probability matrix A = (p_ij)_{m×m}, and Λ = ∪_{i=1}^{l} Λ_i is a compound pattern generated by l distinct simple patterns Λ_i having the same length k, then the imbedded Markov chain {Y_t = (X_t(Λ), E_t), t = 1, 2, ..., n} corresponding to the random variable X_n(Λ)

(i) is defined on the state space

    Ω = {∅} ∪ {(x, z) : x = 0, 1, ..., [n/k], z ∈ E}
        - {(0, Λ_i) : i = 1, ..., l} - {([n/k], z) : k[n/k] + z(k) > n},    (4.14)

where E = S ∪_{i=1}^{l} S(Λ_i) and z(k) = (length of z) mod k,

(ii) has the transition probability matrix

    M = (p_{(x,z)(u,v)})_{d×d},                                  (4.15)

where the transition probabilities are given by

                     p_j   if (x, z) = ∅, u = 0, v = j, for all j ∈ S,
                     p_ij  if (u, v) = ⟨(x, z), j⟩_Ω, x ≤ [n/k], j ∈ S,
    p_{(x,z)(u,v)} =       L(z) = i, and kx + z(k) < n,                     (4.16)
                     1     if (u, v) = (x, z), x = [n/k], and k[n/k] + z(k) = n,
                     0     otherwise,

with d = card(Ω), the size of the state space Ω, equal to

    d = 1 + ([n/k] + 1) × card{E} - l - card{([n/k], z) : z ∈ E, k[n/k] + z(k) > n},    (4.17)

and (iii) yields the distribution

    P(X_n(Λ) = x) = ξ_0 M^n U'(C_x),  x = 0, 1, ..., [n/k],      (4.18)

where ξ_0 is the initial distribution specified by P(Y_0 = ∅) = 1 and

    C_∅ = [∅],  C_0 = [(0, z) : z ∈ E] - [(0, Λ_i) : i = 1, 2, ..., l],
    C_x = [(x, z) : z ∈ E],  1 ≤ x < [n/k],  and                 (4.19)
    C_{[n/k]} = [([n/k], z) : z ∈ E, k[n/k] + z(k) ≤ n],

are the partitions of the state space Ω.
Note that the state space Ω and its size d are functions of n, the structure of the patterns Λ_i, i = 1, ..., l, and the common pattern length k. The reader may verify that the results of Examples 4.1 to 4.3 follow directly from Theorem 4.1.

Proof. Given n, since the length of every pattern is k, the maximum number of patterns is [n/k] (under non-overlap counting). The set E = S ∪_{i=1}^{l} S(Λ_i) contains all possible ending blocks generated by S and all the patterns, and it follows that for forward and backward non-overlap counting, the state space has the form {(x, z) : x = 0, 1, ..., [n/k], z ∈ E}. The states {(0, Λ_i) : i = 1, ..., l} are deleted because if the ending block is Λ_i, then there must be at least one pattern Λ_i in the sequence (x ≥ 1), and so the states {(0, Λ_i)} are unreachable; by the same token, the states {([n/k], z) : k[n/k] + z(k) > n} cannot occur either and can be deleted. Thus the state space Ω of the imbedded Markov chain has the form given by Eq. (4.14), and its size d is determined by Eq. (4.17). Given (x, z) ∈ Ω, 0 ≤ x ≤ [n/k], and kx + z(k) < n, if X_t = j ∈ S and (u, v) = ⟨(x, z), j⟩_Ω, then, as described in Eqs. (4.9) and (4.12), it follows that

    p_{(x,z)(u,v)} = P(Y_t = (u, v) | Y_{t-1} = (x, z)) = p_ij,

where i = L(z). If Y_{t-1} = ([n/k], z) and k[n/k] + z(k) = n, then t - 1 = n; for convenience, we assign the transition probabilities for these states to be P(Y_t = ([n/k], z) | Y_{t-1} = ([n/k], z)) = 1. This completes the construction of Eq. (4.16) and the transition probability matrix M. The partitions of the state space Ω, given by Eq. (4.19), are a direct consequence of the definition of the imbedded Markov chain introduced in Eq. (4.3). Therefore, the distribution for the compound pattern Λ in Eq. (4.18) is an immediate consequence of Theorem 2.1. This completes the proof. □

The above Theorem 4.1 naturally also holds for simple patterns, the special case when l = 1. When the pattern lengths k_i, i = 1, ..., l, are
not all equal, the forward and backward principle can still be used to find the distribution for the compound pattern Λ. In principle, the forward and backward counting procedure is applicable for any number of patterns l of varying sizes k_i, but it is not simple to write down the general form of the state space and the transition probability matrix of the imbedded Markov chain. We will discuss this problem in Chapter 5 using the duality relationship between X_n(Λ) and the waiting time W(Λ).
4.3 Overlap Counting
Let us consider that the sequence {X_t} is a homogeneous Markov chain defined on the state space S = {a, b, c} with transition probability matrix

    A = (p_ij),  i, j = a, b, c.                                 (4.20)

Let Λ be a simple pattern of length k. The basic difference between overlap counting and non-overlap counting is that when the pattern Λ is formed, a part of Λ will be counted toward forming the next pattern Λ under overlap counting, up to the last (k - 1) trials.

Definition 4.1 An ending block E° generated by the pattern Λ is the longest ending block (E° ≠ Λ) that, after each occurrence of Λ under overlap counting, can be assigned as the initial ending block for the next occurrence of Λ. We write E° ⊂ Λ, with respect to overlap counting.

For example, under overlap counting:

• if Λ = aca, then E° = a,
• if Λ = abcab, then E° = ab, and
• if Λ = a···a (of length k), then E° = a···a (of length k - 1).

For a pattern such as Λ = abc, there does not exist an E°, in which case overlap and non-overlap counting are the same. Note that under overlap counting, since the first pattern requires k elements and every additional pattern requires only k - card(E°) elements, the largest possible number of patterns Λ that can occur in n (n ≥ k) trials is

    l° = 1 + [ (n - k) / (k - card(E°)) ].                       (4.21)
In order to illustrate the minor differences arising from the two types of counting procedures, we provide the following example.

Example 4.4 Let us consider the number of patterns Λ = aca occurring in n = 5 i.i.d. three-state trials. Under non-overlap counting, the transition
probability matrix M associated with the imbedded Markov chain is

               ∅  (0,a) (0,b) (0,c) (0,ac) (1,Λ) (1,a) (1,b) (1,c) (1,ac)
    ∅          0   p_a   p_b   p_c   0      0     0     0     0     0
    (0,a)      0   p_a   p_b   0     p_c    0     0     0     0     0
    (0,b)      0   p_a   p_b   p_c   0      0     0     0     0     0
    (0,c)      0   p_a   p_b   p_c   0      0     0     0     0     0
    (0,ac)     0   0     p_b   p_c   0      p_a   0     0     0     0
    (1,Λ)      0   0     0     0     0      0     p_a   p_b   p_c   0
    (1,a)      0   0     0     0     0      0     p_a   p_b   0     p_c
    (1,b)      0   0     0     0     0      0     p_a   p_b   p_c   0
    (1,c)      0   0     0     0     0      0     p_a   p_b   p_c   0
    (1,ac)     0   0     0     0     0      0     0     0     0     1
where (1, Λ) = (1, aca). Under overlap counting, the transition probability matrix M° associated with the imbedded Markov chain is

               ∅  (0,a) (0,b) (0,c) (0,ac) (1,Λ) (1,a) (1,b) (1,c) (1,ac) (2,Λ)
    ∅          0   p_a   p_b   p_c   0      0     0     0     0     0      0
    (0,a)      0   p_a   p_b   0     p_c    0     0     0     0     0      0
    (0,b)      0   p_a   p_b   p_c   0      0     0     0     0     0      0
    (0,c)      0   p_a   p_b   p_c   0      0     0     0     0     0      0
    (0,ac)     0   0     p_b   p_c   0      p_a   0     0     0     0      0
    (1,Λ)      0   0     0     0     0      0     p_a   p_b   0     p_c    0
    (1,a)      0   0     0     0     0      0     p_a   p_b   0     p_c    0
    (1,b)      0   0     0     0     0      0     p_a   p_b   p_c   0      0
    (1,c)      0   0     0     0     0      0     p_a   p_b   p_c   0      0
    (1,ac)     0   0     0     0     0      0     0     p_b   p_c   0      p_a
    (2,Λ)      0   0     0     0     0      0     0     0     0     0      1

The primary difference between the two matrices arises after the first pattern has occurred. With probability p_c, the state (1, Λ) goes to state (1, c) under non-overlap counting, while (1, Λ) goes to (1, ac) under overlap counting, which also entails the additional state (l° = 2, Λ). ○
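In code, the only change needed for overlap counting is that the ending block is not reset after a match, so the reusable suffix E° of the pattern is carried forward automatically. A sketch (ours; names illustrative), checked against counting all positions where the pattern ends:

```python
import itertools

def occurrence_dist(n, pattern, symbols, probs, overlap):
    """Non-overlap vs overlap counting of a simple pattern for i.i.d.
    trials.  The only difference: after a match, non-overlap counting
    resets the ending block, while overlap counting keeps it, so the
    reusable suffix E° is carried into the next search."""
    p = dict(zip(symbols, probs))
    E = list(symbols) + [pattern[:i] for i in range(2, len(pattern) + 1)]

    def step(state, j):
        u, v = state
        if v == pattern and not overlap:
            v = ''                              # non-overlap: drop used trials
        w = v + j
        v2 = max((e for e in E if w.endswith(e)), key=len)
        return (u + 1, v2) if v2 == pattern else (u, v2)

    dist = {(0, ''): 1.0}
    for _ in range(n):
        nxt = {}
        for state, w in dist.items():
            for j in symbols:
                s2 = step(state, j)
                nxt[s2] = nxt.get(s2, 0.0) + w * p[j]
        dist = nxt
    out = {}
    for (u, v), w in dist.items():
        out[u] = out.get(u, 0.0) + w
    return out

def overlap_count_brute(n, pattern, symbols, probs):
    """Overlap counting = number of *all* positions where the pattern ends."""
    p = dict(zip(symbols, probs))
    out, k = {}, len(pattern)
    for seq in itertools.product(symbols, repeat=n):
        s = ''.join(seq)
        c = sum(1 for t in range(k, n + 1) if s[t - k:t] == pattern)
        w = 1.0
        for ch in s:
            w *= p[ch]
        out[c] = out.get(c, 0.0) + w
    return out

d_non = occurrence_dist(5, 'aca', 'abc', (0.4, 0.3, 0.3), overlap=False)
d_over = occurrence_dist(5, 'aca', 'abc', (0.4, 0.3, 0.3), overlap=True)
```

For Λ = aca and n = 5, the supports differ exactly as Eq. (4.21) predicts: at most [n/k] = 1 occurrence under non-overlap counting, but up to l° = 1 + (5-3)/(3-1) = 2 under overlap counting.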
If {X_t} is a homogeneous Markov chain with transition probabilities given by Eq. (4.20), the transition probability matrix M° of the above example (under overlap counting) becomes

               ∅  (0,a)  (0,b)  (0,c)  (0,ac) (1,Λ)  (1,a)  (1,b)  (1,c)  (1,ac) (2,Λ)
    ∅          0   p_a    p_b    p_c    0      0      0      0      0      0      0
    (0,a)      0   p_aa   p_ab   0      p_ac   0      0      0      0      0      0
    (0,b)      0   p_ba   p_bb   p_bc   0      0      0      0      0      0      0
    (0,c)      0   p_ca   p_cb   p_cc   0      0      0      0      0      0      0
    (0,ac)     0   0      p_cb   p_cc   0      p_ca   0      0      0      0      0
    (1,Λ)      0   0      0      0      0      0      p_aa   p_ab   0      p_ac   0
    (1,a)      0   0      0      0      0      0      p_aa   p_ab   0      p_ac   0
    (1,b)      0   0      0      0      0      0      p_ba   p_bb   p_bc   0      0
    (1,c)      0   0      0      0      0      0      p_ca   p_cb   p_cc   0      0
    (1,ac)     0   0      0      0      0      0      0      p_cb   p_cc   0      p_ca
    (2,Λ)      0   0      0      0      0      0      0      0      0      0      1

where p_a, p_b, and p_c are the given transition probabilities from state ∅ to states (0, a), (0, b), and (0, c), respectively. Again, the extension from i.i.d. to Markov-dependent trials remains straightforward. The concept considered in the above example can be extended to the case of overlap up to the last d trials (1 ≤ d ≤ k - 1), as introduced by Aki and Hirano (2000). It is a significant advantage of the finite Markov chain imbedding technique that the extension from non-overlap counting to overlap counting is direct and simple.
4.4 Series Pattern

The distribution of the number of series patterns Λ = Λ_1 * Λ_2 can be obtained in almost the same manner as for a simple pattern, with minor modifications at the state after the first pattern Λ_1 has occurred.

Example 4.5 Let us consider a sequence of n = 5 i.i.d. three-state trials drawn from S = {a, b, c}, and the series pattern Λ = ab * cc generated by the two simple patterns Λ_1 = ab and Λ_2 = cc. Define the ending block set E = {a, ā, ab*, ab*c, ab*cc} and the state space Ω = {∅, (0, a), (0, ā), (0, ab*), (0, ab*c), (1, ab*cc), (1, a), (1, ā)}, where ā stands for b or c, and ab* represents ab*a or ab*b. The corresponding imbedded Markov chain {Y_t} has the transition probability matrix M
given by

                  ∅  (0,a)  (0,ā)     (0,ab*)   (0,ab*c) (1,ab*cc) (1,a) (1,ā)
    ∅             0   p_a    p_b+p_c   0         0        0         0     0
    (0,a)         0   p_a    p_c       p_b       0        0         0     0
    (0,ā)         0   p_a    p_b+p_c   0         0        0         0     0
    (0,ab*)       0   0      0         p_a+p_b   p_c      0         0     0
    (0,ab*c)      0   0      0         p_a+p_b   0        p_c       0     0
    (1,ab*cc)     0   0      0         0         0        0         p_a   p_b+p_c
    (1,a)         0   0      0         0         0        0         1     0
    (1,ā)         0   0      0         0         0        0         0     1

Hence

    P(X_5(Λ) = x) = ξ_0 M^5 U'(C_x),  x = 0, 1.
Note that if Y_t is at the state (0, ab*) or (0, ab*c), this means that the pattern Λ_1 = ab has occurred before or at the t-th trial. Now if the realization of X_{t+1} is a or b, then Y_{t+1} has to be at state (0, ab*), an event occurring with transition probability p_a + p_b, and if the realization of X_{t+1} is c, then Y_{t+1} advances to state (0, ab*c) or (1, ab*cc), respectively, occurring with transition probability p_c. ○

The above example highlights the differences between series and simple patterns with regard to the transition probability matrices of the imbedded Markov chains. By slightly enlarging the state space Ω, replacing (i, ā), i = 0, 1, with (i, b) and (i, c), and replacing (0, ab*) with (0, ab*a) and (0, ab*b), the above example can be easily extended to the case of Markov-dependent three-state trials.
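A series pattern can be tracked by adding a phase component to the state: phase 0 while Λ_1 is being built, phase 1 while waiting for Λ_2. The sketch below (ours; names illustrative) uses the within-phase ending block (the longest suffix of the history that is a prefix of the current sub-pattern) and checks the result against a greedy scan:

```python
import itertools

def series_dist(n, pat1, pat2, symbols, probs):
    """Distribution of the number of series patterns Lambda = pat1 * pat2
    (pat1 first, then pat2 entirely after it) in n i.i.d. trials, as in
    Example 4.5, tracking (count, phase, ending block within the phase)."""
    p = dict(zip(symbols, probs))

    def longest_prefix_suffix(w, pat):
        # longest suffix of w that is a prefix of pat (possibly empty)
        return max((pat[:i] for i in range(len(pat) + 1)
                    if w.endswith(pat[:i])), key=len)

    def step(state, j):
        u, phase, v = state
        pat = pat1 if phase == 0 else pat2
        v2 = longest_prefix_suffix(v + j, pat)
        if v2 == pat:                      # sub-pattern complete
            if phase == 0:
                return (u, 1, '')          # now wait for pat2 (the '*' gap)
            return (u + 1, 0, '')          # whole series pattern counted
        return (u, phase, v2)

    dist = {(0, 0, ''): 1.0}
    for _ in range(n):
        nxt = {}
        for state, w in dist.items():
            for j in symbols:
                s2 = step(state, j)
                nxt[s2] = nxt.get(s2, 0.0) + w * p[j]
        dist = nxt
    out = {}
    for (u, _, _), w in dist.items():
        out[u] = out.get(u, 0.0) + w
    return out

def series_dist_brute(n, pat1, pat2, symbols, probs):
    """Check: greedy left-to-right scan with str.find."""
    p = dict(zip(symbols, probs))
    out = {}
    for seq in itertools.product(symbols, repeat=n):
        s = ''.join(seq)
        c, pos = 0, 0
        while True:
            i1 = s.find(pat1, pos)
            if i1 < 0:
                break
            i2 = s.find(pat2, i1 + len(pat1))
            if i2 < 0:
                break
            c, pos = c + 1, i2 + len(pat2)
        w = 1.0
        for ch in s:
            w *= p[ch]
        out[c] = out.get(c, 0.0) + w
    return out
```

For Λ = ab * cc this reproduces the distribution obtained from the 8-state matrix of Example 4.5, without having to merge states such as ā = {b, c} by hand.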
4.5
Joint Distribution
Finding the joint distribution of two numbers of runs, say X_n(A_1) and X_n(A_2), in a sequence of two- or multi-state trials {X_t} using the finite Markov chain imbedding technique is similar to finding the exact distribution of X_n(A) introduced in Section 4.2. In general, the imbedded Markov chain {Y_t} associated with the joint distribution of X_n(A_1) and X_n(A_2) has the form Y_t = (X_t(A_1), X_t(A_2), E_t),
t = 1, 2, ..., n.   (4.22)
The state space Ω and the ending block E_t for {Y_t} depend very much on the structure of the patterns A_1 and A_2. The transition probability matrices M_t of the imbedded Markov chain can be constructed using the same principles described in earlier sections. In the following, we give one example to demonstrate the procedure for finding the joint distribution. Example 4.6 Let X_n be the total number of success runs X_n(S) and failure runs X_n(F) in a sequence of n two-state trials. For every n, X_n = X_n(S) + X_n(F), and the numbers of runs X_n(S) and X_n(F) are related in the following way: if there are x success runs, then there can only be x + 1, x, or x − 1 failure runs. It follows that there can only be four types of states Y_t = (X_t(S), X_t(F), E_t), where the ending block E_t is either S or F: (i) (x, x − 1, S), (ii) (x, x + 1, F), (iii) (x, x, S), and (iv) (x, x, F). Consider the outcomes of ten two-state trials ω = (SSFFSFSSSF). The realization of the imbedded Markov chain {Y_t} is {Y_1 = (1, 0, S), Y_2 = (1, 0, S), Y_3 = (1, 1, F), Y_4 = (1, 1, F), Y_5 = (2, 1, S), Y_6 = (2, 2, F), Y_7 = (3, 2, S), Y_8 = (3, 2, S), Y_9 = (3, 2, S), Y_10 = (3, 3, F)}. The state space Ω has the form Ω = {(1, 0, S), (0, 1, F), (1, 1, S), (1, 1, F), ..., (l_n, l_n, S), (l_n, l_n, F)}, where l_n = [(n + 1)/2]. For the case of independent but non-identically distributed two-state trials, the definition of Y_t yields the transition probability matrices M_t, for t = 2, 3, ..., n, with nonzero entries

p_t((x, x−1, S), (x, x−1, S)) = p_t,   p_t((x, x−1, S), (x, x, F)) = q_t,
p_t((x, x+1, F), (x, x+1, F)) = q_t,   p_t((x, x+1, F), (x+1, x+1, S)) = p_t,
p_t((x, x, S), (x, x, S)) = p_t,       p_t((x, x, S), (x, x+1, F)) = q_t,
p_t((x, x, F), (x, x, F)) = q_t,       p_t((x, x, F), (x+1, x, S)) = p_t,   (4.23)

and all other entries zero. Given ξ_1 = (p_1, q_1, 0, ..., 0), it follows that the joint distribution of X_n(S) and X_n(F) is given by

P(X_n(S) = x, X_n(F) = y) = ξ_1 (∏_{t=2}^{n} M_t) U′(C_{(x,y)}),   (4.24)
where, if y = x + 1, then C_{(x,x+1)} = {(x, x + 1, F)}; if y = x − 1, then C_{(x,x−1)} = {(x, x − 1, S)}; if y = x, then C_{(x,x)} = {(x, x, S), (x, x, F)}; and C_{(x,y)} = ∅ elsewhere. ◊ Again, with some simple modifications to the transition probability matrices, the above results also hold for i.i.d. as well as homogeneous and non-homogeneous Markov-dependent two-state trials. The marginal distributions of X_n(S) and X_n(F) can be obtained by projecting the joint distribution onto the partitions generated by the random variables X_n(S) and X_n(F), respectively. More generally, the joint distribution of l ≥ 2 random variables X_n(A_1), X_n(A_2), ..., X_n(A_l) can be obtained in the same fashion with an (l + 1)-dimensional Markov chain {Y_t = (X_t(A_1), ..., X_t(A_l), E_t)}.
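The recursion behind Eq. (4.22) can be sketched directly as a forward pass over the states (X_t(S), X_t(F), E_t). The values n = 10 and P(S) = 0.6 below are illustrative assumptions, and the brute-force enumeration over all 2^n sequences is only a consistency check.

```python
import itertools
from collections import defaultdict

# Assumed example: n = 10 i.i.d. two-state trials with P(S) = 0.6
n, p, q = 10, 0.6, 0.4

# Forward pass over states (X_t(S), X_t(F), E_t), mirroring Eq. (4.22)
probs = {(1, 0, "S"): p, (0, 1, "F"): q}          # distribution of Y_1
for t in range(2, n + 1):
    nxt = defaultdict(float)
    for (xs, xf, last), pr in probs.items():
        if last == "S":
            nxt[(xs, xf, "S")] += pr * p          # current S run continues
            nxt[(xs, xf + 1, "F")] += pr * q      # a new F run starts
        else:
            nxt[(xs + 1, xf, "S")] += pr * p      # a new S run starts
            nxt[(xs, xf, "F")] += pr * q          # current F run continues
    probs = dict(nxt)

joint = defaultdict(float)                        # P(X_n(S) = x, X_n(F) = y)
for (xs, xf, last), pr in probs.items():
    joint[(xs, xf)] += pr

# Brute-force check over all 2^n sequences
def run_counts(seq):
    xs = sum(1 for i, c in enumerate(seq) if c == "S" and (i == 0 or seq[i - 1] != "S"))
    xf = sum(1 for i, c in enumerate(seq) if c == "F" and (i == 0 or seq[i - 1] != "F"))
    return xs, xf

brute = defaultdict(float)
for seq in itertools.product("SF", repeat=n):
    brute[run_counts(seq)] += p ** seq.count("S") * q ** seq.count("F")
```

The marginal distributions are then obtained by summing the joint table over one coordinate, exactly as the projection argument above describes.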
Chapter 5
Waiting-Time Distributions
5.1
Introduction
The geometric distribution is the simplest waiting-time distribution of a success in a sequence of Bernoulli trials. Its probability distribution function is given by

P(W(S) = n) = p q^{n−1},   n = 1, 2, ....   (5.1)

The waiting time of the r-th success has a negative binomial distribution given by

$$
P(W(r, S) = n) = \binom{n-1}{r-1} p^r q^{n-r}, \qquad n = r, r+1, \ldots.
$$
For the pattern A = S···S of k consecutive successes, as discussed in Section 3.7, the waiting time W(A) has a geometric distribution of order k. This distribution has been studied by many authors, including, for example, Aki (1985), Aki and Hirano (1988), Aki, Kuboki and Hirano (1984), Feller (1968), Hirano (1986), Hirano and Aki (1987), Koutras (1996b, 1997a), Mohanty (1994), Philippou (1986), and Philippou, Georghiou and Philippou (1983). In view of the form of the geometric distribution of order k given in Section 3.7, we define a large family of geometric-type distributions. Definition 5.1 A waiting-time random variable W(A) has a general geometric distribution if W(A) is Markov chain imbeddable with probability distribution function

P(W(A) = n) = ξ N^{n−1} (I − N) 1′,   n = 1, 2, ...,   (5.2)
where N is the essential transition probability submatrix (defined in Section 2.6) of M corresponding to the imbedded Markov chain {Y_t}. For example, the waiting-time random variable W(r, A), the waiting time for the r-th occurrence of a compound pattern A = ∪_{i=1}^{l} A_i in homogeneous Markov-dependent multi-state trials, also has a general geometric distribution, i.e. its distribution can be computed via Eq. (5.2). The class of general geometric distributions is extremely large. It covers all well-known discrete waiting-time distributions, including the waiting-time distributions of the extended Stirling family (see Nishimura and Sibuya 1997). Table 5.1 lists some general geometric distributions commonly seen in practice for Bernoulli trials:

Table 5.1 Some common general geometric waiting-time distributions for Bernoulli trials.

Distribution                                Compound Pattern                              R.V.
Geometric, G(p)                             A = S                                         W(A)
Geometric of order k, G_k(p)                A = S···S of length k                         W(A)
Negative Binomial, NB(r, p)                 r-th occurrence of A = S                      W(r, A)
Negative Binomial of order k, NB_k(r, p)    r-th occurrence of A = S···S of length k      W(r, A)
Sooner                                      A = ∪_{i=1}^{l} A_i, l ≥ 2                    W(A)
Scan                                        (defined in Section 5.9)                      S_n(r)
It appears that all well-known waiting-time distributions can be viewed as special cases of the general geometric distribution for the r-th occurrence of a compound pattern (r ≥ 1). In this chapter, we will focus on constructing the imbedded Markov chain {Y_t} for the waiting time of a compound pattern. The average waiting times and the generating functions for various waiting-time distributions will be obtained through Eq. (5.2). Further, we also study the large deviation approximation for the tail probability P(W(A) > n). The dual relationship between the random variables X_n(A) and W(r, A),

{X_n(A) < r} if and only if {W(r, A) > n},
gives the probability identity

P(X_n(A) < r) = P(W(r, A) > n).   (5.3)

This identity provides a way to compute the exact distribution of the number of occurrences of pattern A in n trials by computing the tail probabilities of its corresponding waiting-time random variable. For example, the distributions of the longest run statistic and the scan statistic are obtained in this book through such dual relationships.
5.2
The Waiting Time of A Simple Pattern
Let W(A) be the waiting time for the first occurrence of a simple pattern A = b_{i_1} b_{i_2} ··· b_{i_k}, with i_j ∈ {1, ..., m}, in a sequence of i.i.d. m-state trials {X_t}. The imbedded Markov chain {Y_t} associated with the waiting-time random variable W(A) is a homogeneous Markov chain defined on the state space

Ω = {0} ∪ S ∪ S(A) = {0, b_1, b_2, ..., b_m, b_{i_1} b_{i_2}, ..., b_{i_1}···b_{i_{k−1}}, α}

(α = A), with a transition probability matrix of the form

$$
M = (p_{u,v}) = \begin{pmatrix} N & C \\ 0 & 1 \end{pmatrix},
\tag{5.4}
$$

where α is an absorbing state and Y_t = α means that the pattern A has occurred on or before the t-th trial. For convenience, a dummy state "0" is added to the state space Ω so that we can take P(Y_0 = 0) = 1 as the initial distribution of the imbedded Markov chain. We will provide the details of how to construct the transition probabilities p_{u,v} in the next section. Theorem 5.1 If {X_t} is a sequence of i.i.d. m-state trials, then the distribution of the waiting time for a simple pattern A = b_{i_1} b_{i_2} ··· b_{i_k} is given by

P(W(A) = n) = ξ N^{n−1} C = ξ N^{n−1} (I − N) 1′,   (5.5)

where the matrix N and the column vector C are defined by Eq. (5.4).
Proof. Given Y_0 = 0, since {Y_t} is the imbedded Markov chain for W(A), it follows that W(A) = n if and only if the Markov chain {Y_t} enters the state α for the first time at the n-th trial; i.e.

P(W(A) = n | Y_0 = 0) = P(Y_n = α, Y_{n−1} ≠ α, ..., Y_1 ≠ α | Y_0 = 0).   (5.6)
The result (5.5) follows immediately from Theorem 2.2, Eqs. (5.4) and (5.6), and the identity (I − N)1′ = C (rows summing to one). □ Example 5.1 Consider a sequence of i.i.d. four-state trials {X_t} with possible outcomes a, c, g, and t, occurring with probabilities p_a, p_c, p_g, and p_t, respectively. Suppose that we are interested in finding the waiting-time distribution for the pattern A = acac. The distribution of the waiting-time random variable W(A) is determined by Eq. (5.5) through the imbedded Markov chain {Y_t} defined on the state space Ω = {0} ∪ S ∪ S(A) = {0, a, c, g, t, ac, aca, α} with transition probability matrix

$$
M = \begin{pmatrix}
0 & p_a & p_c & p_g & p_t & 0 & 0 & 0 \\
0 & p_a & 0 & p_g & p_t & p_c & 0 & 0 \\
0 & p_a & p_c & p_g & p_t & 0 & 0 & 0 \\
0 & p_a & p_c & p_g & p_t & 0 & 0 & 0 \\
0 & p_a & p_c & p_g & p_t & 0 & 0 & 0 \\
0 & 0 & p_c & p_g & p_t & 0 & p_a & 0 \\
0 & p_a & 0 & p_g & p_t & 0 & 0 & p_c \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix},
$$

with rows and columns indexed in the order 0, a, c, g, t, ac, aca, α.
If {X_t} is Markov dependent, the probability distribution of W(A) can also be obtained through Eq. (5.5), with some simple modifications to the transition probabilities in M (without changing the state space). The explicit form of M for Markov-dependent trials as well as compound patterns will be given in the next section. ◊
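Eq. (5.5) can be exercised numerically on this example. The probabilities below (p_a = p_c = 0.3, p_g = p_t = 0.2) are illustrative assumptions; the brute-force check confirms that ξ N^{n−1}(I − N)1′ equals the probability that the first occurrence of acac ends exactly at trial n.

```python
import itertools
import numpy as np

# Assumed example probabilities for the i.i.d. four-state (a, c, g, t) trials
p = {"a": 0.3, "c": 0.3, "g": 0.2, "t": 0.2}
pa, pc, pg, pt = p["a"], p["c"], p["g"], p["t"]

# Non-absorbing states 0, a, c, g, t, ac, aca (alpha = acac is dropped, leaving N)
N = np.array([
    [0, pa, pc, pg, pt, 0,  0],   # 0
    [0, pa, 0,  pg, pt, pc, 0],   # a
    [0, pa, pc, pg, pt, 0,  0],   # c
    [0, pa, pc, pg, pt, 0,  0],   # g
    [0, pa, pc, pg, pt, 0,  0],   # t
    [0, 0,  pc, pg, pt, 0,  pa],  # ac
    [0, pa, 0,  pg, pt, 0,  0],   # aca  (a 'c' here completes the pattern)
])
xi = np.eye(7)[0]
ones = np.ones(7)

def pW(n):          # P(W(acac) = n) via Eq. (5.5)
    return xi @ np.linalg.matrix_power(N, n - 1) @ (ones - N @ ones)

# Brute-force check: the first occurrence of 'acac' ends exactly at trial n
def brute(n):
    total = 0.0
    for seq in itertools.product("acgt", repeat=n):
        s = "".join(seq)
        if s.endswith("acac") and "acac" not in s[:-1]:
            total += np.prod([p[x] for x in s])
    return total
```

As a quick sanity point, P(W = 4) must equal (p_a p_c)^2, the probability that the first four trials spell acac.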
5.3
The Waiting Time of A Compound Pattern
Before extending Theorem 5.1 to the more general case where {X_t} is an m-state homogeneous Markov chain defined on S = {b_1, ..., b_m} with transition probabilities (p_{ij}) and where A is a compound pattern of l distinct simple patterns (A = ∪_{i=1}^{l} A_i), we first discuss a specific extension of Example 5.1 to illustrate the imbedding procedure. Let A = A_1 ∪ A_2 be a
compound pattern composed of two distinct simple patterns A_1 = acac and A_2 = ctc. Define the imbedded homogeneous Markov chain {Y_t} on the state space

Ω = {0} ∪ S ∪ S(A_1) ∪ S(A_2) = {0, a, c, g, t, ac, aca, ct, α_1, α_2},

where α_1 and α_2 are absorbing states corresponding to A_1 and A_2, respectively. Given Y_{t−1} = u ∈ Ω, say u = ac, it follows from the forward and backward principle (no pattern A has occurred in counting forward, and the longest ending block in the state space Ω is obtained from counting backward) that there are only four possible outcomes of Y_t = v, corresponding to the outcomes X_t = a, c, g, and t:

v = aca with probability p_{ca} if X_t = a,
v = c   with probability p_{cc} if X_t = c,
v = g   with probability p_{cg} if X_t = g,
v = ct  with probability p_{ct} if X_t = t,   (5.7)

where p_{ij}, i, j = a, c, g, t, are the transition probabilities of the Markov chain {X_t}. Thus the transition probabilities of {Y_t} are

P(Y_t = v | Y_{t−1} = ac) =
  p_{ca}  if X_t = a, v = aca,
  p_{cc}  if X_t = c, v = c,
  p_{cg}  if X_t = g, v = g,
  p_{ct}  if X_t = t, v = ct.   (5.8)
In view of Eqs. (5.7) and (5.8), the imbedded Markov chain {Yt} has the transition probability matrix
$$
M = \begin{pmatrix}
0 & p_a & p_c & p_g & p_t & 0 & 0 & 0 & 0 & 0 \\
0 & p_{aa} & 0 & p_{ag} & p_{at} & p_{ac} & 0 & 0 & 0 & 0 \\
0 & p_{ca} & p_{cc} & p_{cg} & 0 & 0 & 0 & p_{ct} & 0 & 0 \\
0 & p_{ga} & p_{gc} & p_{gg} & p_{gt} & 0 & 0 & 0 & 0 & 0 \\
0 & p_{ta} & p_{tc} & p_{tg} & p_{tt} & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & p_{cc} & p_{cg} & 0 & 0 & p_{ca} & p_{ct} & 0 & 0 \\
0 & p_{aa} & 0 & p_{ag} & p_{at} & 0 & 0 & 0 & p_{ac} & 0 \\
0 & p_{ta} & 0 & p_{tg} & p_{tt} & 0 & 0 & 0 & 0 & p_{tc} \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}
= \begin{pmatrix} N & C \\ 0 & I \end{pmatrix},
\tag{5.9}
$$

with rows and columns indexed in the order 0, a, c, g, t, ac, aca, ct, α_1, α_2, and with p_a, p_c, p_g, p_t in the first row being the initial probabilities of X_1.
where A denotes the collection of absorbing states α_1 and α_2. For the general case, we define the state space

Ω = {0} ∪ S ∪ (∪_{i=1}^{l} S(A_i)),   (5.10)

where S = {b_1, ..., b_m} and S(A_i) = {all sequential subpatterns of A_i}. Let α_i, i = 1, ..., l, represent the absorbing states corresponding to the simple patterns A_1, ..., A_l. Note that the state space Ω discussed earlier in this section is a special case of Eq. (5.10) with A_1 = acac and A_2 = ctc. Let A = {α_1, ..., α_l} be the set of all absorbing states. To generalize Eq. (5.8), for given Y_{t−1} = u ∈ Ω − A − {0} and X_t = z ∈ S, we define

v = ⟨u, z⟩_Ω = the longest ending block in Ω with respect to the forward and backward counting procedure.   (5.11)

Further, for each u ∈ Ω − A − {0}, we define a subset

[u : S] = {v : v ∈ Ω, v = ⟨u, z⟩_Ω, z ∈ S}.   (5.12)
Theorem 5.2 If {X_t} is a sequence of homogeneous Markov-dependent m-state trials and A = ∪_{i=1}^{l} A_i is a compound pattern consisting of l simple patterns A_1, ..., A_l, then (i) the imbedded Markov chain {Y_t} defined on the state space Ω given by Eq. (5.10) has a transition probability matrix

$$
M = (p_{u,v})_{d \times d} = \begin{pmatrix} N_{(d-l)\times(d-l)} & C \\ 0 & I \end{pmatrix},
\tag{5.13}
$$

with u, v ∈ Ω and transition probabilities p_{u,v} given by

p_{u,v} = P(Y_t = v | Y_{t−1} = u) =
  p_z     if u = 0, v = z, z ∈ S,
  p_{xz}  if u ∈ Ω − A − {0} and v = ⟨u, z⟩_Ω ∈ [u : S],
  1       if u ∈ A and v = u,
  0       otherwise,   (5.14)

where x stands for the last symbol of u, p_z are the initial probabilities from state 0 to z, p_{xz} are the transition probabilities of the Markov chain {X_t}, and d = card(Ω); (ii) given the initial distribution ξ_0 = (ξ : 0) with P(Y_0 = 0) = 1, the waiting-time distribution of the compound pattern A is given by

P(W(A) = n) = ξ N^{n−1} (I − N) 1′;   (5.15)

and (iii) for i = 1, ..., l,

P(W(A) = n, W(A_i) = n) = ξ N^{n−1} C_i,   (5.16)

where N is defined by Eqs. (5.13) and (5.14), and C_i is the i-th column vector of the matrix C. For example, taking u = ac, Eq. (5.14) leads to Eq. (5.8) of our example at the beginning of this section. Further, taking every u ∈ Ω, Eq. (5.14) yields the transition probability matrix M given by Eq. (5.9). Equations (5.10) and (5.14) provide a simple way to construct the imbedded Markov chain {Y_t} on the state space Ω with transition probability matrix M. Given A = ∪_{i=1}^{l} A_i, an automatic computer algorithm to construct the state space Ω and the transition probability matrix M, as well as to compute the distribution of the waiting time W(A) and its generating function, is given by Fu and Chang (2002). The algorithm is based on Eqs. (5.10)-(5.16). Note that if {X_t} is a sequence of i.i.d. m-state trials, then Eq. (5.14) reduces to

p_{u,v} =
  p_z  if X_t = z, v ∈ [u : S] and u ∈ Ω − A,
  1    if u ∈ A and v = u,
  0    otherwise.   (5.17)
Proof. In view of the concept introduced in Eqs. (5.7) and (5.8), the results given by Eqs. (5.13) and (5.14) are direct consequences of the structure of the state space Ω, the forward and backward counting procedure, and the fact that {X_t} is a homogeneous Markov chain. Since, for every n = 1, 2, ...,

Y_n ∈ Ω − A ⟺ W(A) > n,   (5.18)

it follows that

P(W(A) > n) = P(Y_n ∈ Ω − A) = ξ_0 M^n U′(Ω − A),   (5.19)

where ξ_0 = (ξ : 0) and U(Ω − A) = (1, ..., 1, 0, ..., 0) = (1 : 0). Hence our result (ii) can be concluded from Theorem 2.2 and Eq. (5.19); i.e.

P(W(A) = n) = P(W(A) > n − 1) − P(W(A) > n)
            = ξ_0 M^{n−1} U′(Ω − A) − ξ_0 M^n U′(Ω − A)
            = ξ N^{n−1} 1′ − ξ N^n 1′
            = ξ N^{n−1} (I − N) 1′.

From the definition of W(A_i) and {Y_t}, the following two events are equivalent:

{W(A) = n, W(A_i) = n} ⟺ {Y_t ∈ Ω − A, t = 1, ..., n − 1, and Y_n = α_i}.   (5.20)

Hence, it follows from Eq. (5.20) and Theorem 2.4 that

P(W(A) = n, W(A_i) = n) = P(Y_n = α_i, Y_t ∈ Ω − A, t = 1, ..., n − 1)
                        = Σ_{u ∈ Ω − A} p_{u, α_i} P(Y_{n−1} = u)
                        = ξ N^{n−1} C_i.

This completes the proof. □

Note that

P(W(A) = n) = Σ_{i=1}^{l} P(W(A) = n, W(A_i) = n) = Σ_{i=1}^{l} ξ N^{n−1} C_i = ξ N^{n−1} (I − N) 1′.
Although the above theorem is proved only under the assumption that {Xt} is a homogeneous Markov chain, the theorem remains true after minor modifications if the trials Xt are: (i) i.i.d. (Theorem 5.1 is a special case for a simple pattern), (ii) independent but non-identically distributed, or (iii) non-homogeneous Markov-dependent. We provide the following example to illustrate this point.
Example 5.2 (A) Let {X_t} be a sequence of Bernoulli trials and A a compound pattern consisting of A_1 = SSS and A_2 = FFF. The imbedded Markov chain {Y_t} associated with the waiting-time random variable W(A) is defined on the state space Ω = {0, S, SS, F, FF, α_1, α_2} with transition probability matrix

$$
M = \begin{pmatrix}
0 & p_S & 0 & p_F & 0 & 0 & 0 \\
0 & 0 & p_S & p_F & 0 & 0 & 0 \\
0 & 0 & 0 & p_F & 0 & p_S & 0 \\
0 & p_S & 0 & 0 & p_F & 0 & 0 \\
0 & p_S & 0 & 0 & 0 & 0 & p_F \\
0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}
= \begin{pmatrix} N & C \\ 0 & I \end{pmatrix},
$$

with rows and columns indexed in the order 0, S, SS, F, FF, α_1, α_2.
The distribution of the waiting time W(A) can be obtained from Eq. (5.15). (B) If {Xt} is a sequence of non-homogeneous Markov-dependent trials, then the transition probability matrices of the imbedded Markov chain {Yt} for the waiting-time random variable W(A) are, for t = 1, 2, • • •, n,
$$
M_t = \begin{pmatrix}
0 & p_S & 0 & p_F & 0 & 0 & 0 \\
0 & 0 & p_{SS}(t) & p_{SF}(t) & 0 & 0 & 0 \\
0 & 0 & 0 & p_{SF}(t) & 0 & p_{SS}(t) & 0 \\
0 & p_{FS}(t) & 0 & 0 & p_{FF}(t) & 0 & 0 \\
0 & p_{FS}(t) & 0 & 0 & 0 & 0 & p_{FF}(t) \\
0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}
= \begin{pmatrix} N(t) & C(t) \\ 0 & I \end{pmatrix},
\tag{5.21}
$$

with states ordered as in part (A). The distribution of W(A) is given by

P(W(A) = n) = ξ (∏_{t=1}^{n−1} N(t)) (I − N(n)) 1′,   (5.22)

and, for i = 1, 2,

P(W(A) = n and W(A_i) = n) = ξ (∏_{t=1}^{n−1} N(t)) C_i(n),   (5.23)
where the matrices N(t) and C(t) are defined by Eq. (5.21), and C_i(n), i = 1, 2, are the columns of the matrix C(n). ◊
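The construction of Theorem 5.2 (Eqs. (5.10) and (5.14)) mechanizes naturally, which is what the Fu and Chang (2002) algorithm does. The following is an independent sketch of such a construction for i.i.d. trials, not their implementation; the function name, the symbol probabilities, and the choice of Example 5.2's patterns SSS and FFF with p_S = p_F = 0.5 are all illustrative assumptions.

```python
import numpy as np

def build_imbedding(patterns, probs):
    """Sketch of the state space (Eq. 5.10) and matrix M (Eqs. 5.13-5.14)
    for i.i.d. trials; `probs` maps each symbol to its probability."""
    symbols = list(probs)
    # dummy state "" plus all proper prefixes of the simple patterns
    live = [""] + sorted({pat[:j] for pat in patterns for j in range(1, len(pat))}
                         | set(symbols), key=lambda s: (len(s), s))
    states = live + list(patterns)            # absorbing states last
    index = {s: i for i, s in enumerate(states)}
    d = len(states)
    M = np.zeros((d, d))
    for pat in patterns:                      # absorbing rows
        M[index[pat], index[pat]] = 1.0
    for u in live:
        for z in symbols:
            w = u + z
            hit = next((pat for pat in patterns if w.endswith(pat)), None)
            if hit is not None:               # a simple pattern just completed
                v = hit
            else:                             # longest ending block still in the space
                v = next(w[i:] for i in range(len(w) + 1) if w[i:] in index)
            M[index[u], index[v]] += probs[z]
    return states, M

states, M = build_imbedding(["SSS", "FFF"], {"S": 0.5, "F": 0.5})
k = len(states) - 2                           # number of non-absorbing states
N, C = M[:k, :k], M[:k, k:]
xi0 = np.eye(k)[0]                            # start at the dummy state
mean_w = xi0 @ np.linalg.inv(np.eye(k) - N) @ np.ones(k)
pw3 = xi0 @ np.linalg.matrix_power(N, 2) @ (np.ones(k) - N @ np.ones(k))
```

For this pattern pair the chain has 7 states (dummy, S, F, SS, FF, and two absorbing states), P(W = 3) = p^3 + q^3 = 0.25, and the mean waiting time from the dummy state is one step more than the value 6 obtained in Example 5.6 below for a Y_1 start.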
Note that, for a given k (1 ≤ k ≤ n), the probability that the longest run (success or failure run) satisfies L_n < k is equal to the probability that the waiting time W(A) > n, with A = A_1 ∪ A_2, A_1 = F···F (k F's) and A_2 = S···S (k S's), so that

P(L_n < k) = P(W(A) > n).
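This duality can be verified numerically for k = 3: P(L_10 < 3), computed as ξ_0 N^10 1′ from the chain of Example 5.2, agrees with direct enumeration over all 2^10 sequences. The value p = 0.6 is an illustrative assumption.

```python
import itertools
import numpy as np

p, q, n, k = 0.6, 0.4, 10, 3   # assumed example values

# Non-absorbing states of the chain for A = SSS u FFF, ordered 0, S, SS, F, FF
N = np.array([
    [0, p, 0, q, 0],
    [0, 0, p, q, 0],
    [0, 0, 0, q, 0],   # SS: another S would be absorbed into alpha_1
    [0, p, 0, 0, q],
    [0, p, 0, 0, 0],   # FF: another F would be absorbed into alpha_2
])
xi0 = np.eye(5)[0]
p_tail = xi0 @ np.linalg.matrix_power(N, n) @ np.ones(5)   # P(W(A) > n)

# Brute force: longest run (success or failure) shorter than k
def longest_run(seq):
    best = cur = 1
    for a, b in zip(seq, seq[1:]):
        cur = cur + 1 if a == b else 1
        best = max(best, cur)
    return best

brute = sum(p ** s.count("S") * q ** s.count("F")
            for s in itertools.product("SF", repeat=n) if longest_run(s) < k)
```

The same tail vector ξ_0 N^n, evaluated for a range of n, gives the full distribution of the longest run statistic.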
5.4
Probability Generating Function
In Chapter 3, the probability generating function of the waiting-time random variable W(A) for k consecutive successes A = S···S in Bernoulli trials,

φ_W(s) = p^k s^k (1 − ps) / (1 − s + q p^k s^{k+1}),

was developed through the recursive equation (3.31). For i.i.d. and homogeneous Markov-dependent multi-state trials, the probability generating
function and the mean waiting time for a simple pattern A are given by the following theorem. Theorem 5.3 Suppose that M is the transition probability matrix, in the form of Eq. (5.4), of the imbedded Markov chain {Y_t} for the waiting-time random variable W(A) of a simple pattern A; then (i)

φ_W(s) = 1 + (s − 1) ξ (I − sN)^{−1} 1′,   (5.24)

and (ii)

EW(A) = ξ (I − N)^{−1} 1′.   (5.25)
Proof. It follows from the definition of the generating function and from Theorem 5.1 that

φ_W(s) = Σ_{n=1}^∞ s^n P(W(A) = n)
       = Σ_{n=1}^∞ s^n [P(W(A) > n − 1) − P(W(A) > n)]
       = Σ_{n=1}^∞ s^n ξ N^{n−1} 1′ − Σ_{n=1}^∞ s^n ξ N^n 1′
       = s Σ_{n=0}^∞ s^n ξ N^n 1′ − Σ_{n=0}^∞ s^n ξ N^n 1′ + 1
       = 1 + (s − 1) ξ (Σ_{n=0}^∞ s^n N^n) 1′
       = 1 + (s − 1) ξ (I − sN)^{−1} 1′.

This proves the first part of the theorem. The second part, the mean of the waiting time W(A), is an immediate consequence of

φ′_W(s)|_{s=1} = d/ds {1 + (s − 1) ξ (I − sN)^{−1} 1′}|_{s=1} = ξ (I − N)^{−1} 1′.
In the following, we extend Theorem 5.3 to the case where A is a compound pattern. Theorem 5.4 If A = ∪_{i=1}^{l} A_i is a compound pattern of l distinct simple patterns, then Eqs. (5.24) and (5.25) hold. Proof. For a compound pattern A = ∪_{i=1}^{l} A_i, we have

P(W(A) = n) = Σ_{i=1}^{l} P(W(A) = n and W(A_i) = n),   (5.26)

for n = 1, 2, .... For each i = 1, ..., l, the generating function φ_i(s) of the sequence {P(W(A) = n and W(A_i) = n)}_{n=1}^∞ is given by

φ_i(s) = Σ_{n=1}^∞ s^n P(W(A) = n and W(A_i) = n)
       = Σ_{n=1}^∞ s^n P(Y_n = α_i, Y_{n−1} ∉ A, ..., Y_1 ∉ A)
       = Σ_{n=1}^∞ s^n ξ N^{n−1} C_i
       = s ξ (I − sN)^{−1} C_i.   (5.27)

In view of Eq. (5.26), we have

φ_W(s) = Σ_{i=1}^{l} φ_i(s) = s ξ (I − sN)^{−1} Σ_{i=1}^{l} C_i.   (5.28)

Since

(I − N) 1′ = Σ_{i=1}^{l} C_i

and

s ξ (I − sN)^{−1} N 1′ = ξ (I − sN)^{−1} 1′ − 1,

Eqs. (5.24) and (5.25) follow immediately. □
Example 5.3 Let {X_t} be a sequence of Bernoulli trials and A = A_1 ∪ A_2 be a compound pattern composed of A_1 = SS and A_2 = FF. The imbedded Markov chain {Y_t} for the waiting time W(A) has the state space Ω = {S, F, α_1, α_2} and the transition probability matrix

$$
M = \begin{pmatrix}
0 & q & p & 0 \\
p & 0 & 0 & q \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
= \begin{pmatrix} N & C \\ 0 & I \end{pmatrix},
$$

with an initial distribution for Y_1 of (p, q, 0, 0). Note that for convenience in obtaining the inverse (I − sN)^{−1} below by hand, here we use the initial distribution of Y_1 instead of Y_0 (at the dummy state 0). In the following, we obtain the probability generating function of W(A) by two methods: the recursive method introduced in Section 3.7, and Theorem 5.4. A simple manipulation via Theorem 5.2(iii) yields the following recursive equation:

φ_1(s) = Σ_{n=2}^∞ s^n ξ N^{n−2} C_1 = p^2 s^2 + p^2 q s^3 + pq s^2 φ_1(s),

where ξ = (p, q). Hence

φ_1(s) = (p^2 s^2 + p^2 q s^3) / (1 − pq s^2).

Similarly, we have

φ_2(s) = (q^2 s^2 + q^2 p s^3) / (1 − pq s^2).

It follows from Eq. (5.28) that

φ_W(s) = ((p^2 + q^2) s^2 + pq s^3) / (1 − pq s^2).

The above result can also be obtained from the definition of the generating function and Theorem 5.2(ii), as well as directly from Theorem 5.4:

φ_W(s) = Σ_{n=2}^∞ s^n ξ N^{n−2} (I − N) 1′ = s^2 ξ (I − sN)^{−1} (I − N) 1′ = ((p^2 + q^2) s^2 + pq s^3) / (1 − pq s^2). ◊
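A quick numerical check of this example (the values p = 0.6 and s = 0.5 are illustrative assumptions): the truncated series Σ s^n ξ N^{n−2}(I − N)1′ matches the closed form.

```python
import numpy as np

p, q, s = 0.6, 0.4, 0.5       # assumed example values, |s| < 1

# N for Omega = {S, F} (alpha_1, alpha_2 removed); initial xi = (p, q) for Y_1
N = np.array([[0.0, q],
              [p, 0.0]])
C = np.array([[p, 0.0],       # S -> SS = alpha_1
              [0.0, q]])      # F -> FF = alpha_2
xi = np.array([p, q])

# phi_W(s) = sum_{n>=2} s^n xi N^{n-2} (I - N) 1'  (series truncated numerically)
series = sum(s ** n * xi @ np.linalg.matrix_power(N, n - 2) @ C.sum(axis=1)
             for n in range(2, 200))

closed_form = ((p ** 2 + q ** 2) * s ** 2 + p * q * s ** 3) / (1 - p * q * s ** 2)
```

Since N^2 = pq I here, the tail of the series decays geometrically and truncation at n = 200 is far below machine precision.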
5.5
Mean of Waiting Time W(A)
In Theorems 5.3 and 5.4, it was shown that the mean waiting time of a compound pattern is EW(A) = ξ (I − N)^{−1} 1′. This formula is efficient only if the size of N is not too large, and finding an analytic form of EW(A) from it is rather hard. In view of this complexity, we provide the following alternative result, which often yields an analytic form for the mean waiting time EW(A) with less effort. For the imbedded Markov chain {Y_t} of the waiting-time random variable W(A), the state space and the transition probability matrix can always be relabeled as Ω = {1, 2, ..., w, α_1, ..., α_l} and

$$
M = \begin{pmatrix} N & C \\ 0 & I \end{pmatrix},
$$
where w denotes the number of non-absorbing states. Theorem 5.5

EW(A) = S_1 + ··· + S_w,   (5.29)

where (S_1, ..., S_w) is the solution of the following simultaneous equations:

S_i = ξ e_i′ + (S_1, ..., S_w) N e_i′,   i = 1, ..., w,   (5.30)

and N e_i′ is the i-th column vector of the matrix N.
Proof. It follows from the definition of EW(A) and Theorem 5.2(ii) that

EW(A) = Σ_{n=0}^∞ P(W(A) > n) = Σ_{n=1}^∞ ξ N^{n−1} 1′ = Σ_{i=1}^{w} Σ_{n=1}^∞ ξ N^{n−1} e_i′ = S_1 + ··· + S_w,

where S_i = Σ_{n=1}^∞ ξ N^{n−1} e_i′, i = 1, 2, ..., w. Note that

ξ N^{n−1} e_i′ = ξ N^{n−2} N e_i′,   n ≥ 2,

and hence

S_i = ξ e_i′ + Σ_{n=2}^∞ ξ N^{n−2} N e_i′ = ξ e_i′ + (S_1, ..., S_w) N e_i′.

This completes the proof. □
Example 5.4 Let {X_t} be a sequence of Bernoulli trials with P(X = S) = p and P(X = F) = q, and let W(A) be the waiting time for the pattern A = S···S of k consecutive successes. The imbedded Markov chain {Y_t}, defined on the state space

Ω = {F, S, SS, ..., S···S (k − 1 S's), α},

has the transition probability matrix of Eq. (3.28):

$$
M = \begin{pmatrix}
q & p & 0 & \cdots & 0 & 0 \\
q & 0 & p & \cdots & 0 & 0 \\
\vdots & & & \ddots & & \vdots \\
q & 0 & 0 & \cdots & p & 0 \\
q & 0 & 0 & \cdots & 0 & p \\
0 & 0 & 0 & \cdots & 0 & 1
\end{pmatrix}
= \begin{pmatrix} N & C \\ 0 & 1 \end{pmatrix}.
$$

For i.i.d. cases, we set the initial distribution to Y_0 = F with probability one, i.e. ξ = (1, 0, ..., 0)_{1×k}. Hence it follows that (here w = k)

S_1 = Σ_{n=1}^∞ ξ N^{n−1} e_1′ = 1 + q(S_1 + S_2 + ··· + S_k)

and

Σ_{n=1}^∞ ξ N^{n−1} e_i′ = p S_{i−1},   i.e. S_i = p S_{i−1},   i = 2, ..., k.

This yields the mean waiting time

EW(A) = S_1 + ··· + S_k = (1 − p^k) / (q p^k). ◊
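Eqs. (5.29)-(5.30) reduce to a small linear solve: in matrix form the system reads S = ξ + S N, i.e. S = ξ (I − N)^{−1}. The sketch below (p = 0.5 and k = 3 are illustrative assumptions) recovers the mean (1 − p^k)/(q p^k), which equals 14 for these values.

```python
import numpy as np

p, q, k = 0.5, 0.5, 3          # assumed example values

# Non-absorbing states F, S, SS, ..., S^{k-1}; Eq. (3.28) without the last column
N = np.zeros((k, k))
N[:, 0] = q                    # any failure returns the chain to state F
for i in range(k - 1):
    N[i, i + 1] = p            # a success extends the current run
xi = np.eye(k)[0]              # start at F with probability one

# Solve Eq. (5.30): S_i = xi e_i' + (S_1,...,S_w) N e_i', i.e. S = xi + S N
S = xi @ np.linalg.inv(np.eye(k) - N)
mean_w = S.sum()               # Eq. (5.29)

closed_form = (1 - p ** k) / (q * p ** k)
```

The same two lines of linear algebra work for any imbedded chain once N is written down, which is the practical content of Theorem 5.5.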
◊ Example 5.5 For the pattern A = SFS in Bernoulli trials, the imbedded Markov chain {Y_t} of the waiting time W(A) has the state space Ω = {F, S, SF, α} and the transition matrix

$$
M = \begin{pmatrix}
q & p & 0 & 0 \\
0 & p & q & 0 \\
q & 0 & 0 & p \\
0 & 0 & 0 & 1
\end{pmatrix}.
$$

Given ξ = (1, 0, 0), it follows from Eq. (5.30) that

S_1 = 1 + q(S_1 + S_3),
S_2 = p(S_1 + S_2),
S_3 = q S_2.

Solving the above three simultaneous equations yields

EW(A) = 1/p^2 + 1/(pq) + 1/p. ◊
Example 5.6 Let A = A_1 ∪ A_2 be a compound pattern with A_1 = FFF and A_2 = SSS. The imbedded Markov chain associated with the waiting time W(A) in Bernoulli trials has the state space Ω = {F, S, FF, SS, α_1, α_2} and the transition probability matrix

$$
M = \begin{pmatrix}
0 & p & q & 0 & 0 & 0 \\
q & 0 & 0 & p & 0 & 0 \\
0 & p & 0 & 0 & q & 0 \\
q & 0 & 0 & 0 & 0 & p \\
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}
= \begin{pmatrix} N & C \\ 0 & I \end{pmatrix}.
$$

Choosing the initial probability ξ = (q, p, 0, 0) for Y_1, we then have the following system of simultaneous equations:

S_1 = q + q(S_2 + S_4),
S_2 = p + p(S_1 + S_3),
S_3 = q S_1,
S_4 = p S_2.

Basic manipulation yields

E[W(A) | ξ = (q, p, 0, 0)] = S_1 + S_2 + S_3 + S_4 = 2(1 + pq + p^2 q^2) / (1 − 2pq − p^2 q^2). ◊
Although all of the examples above are for i.i.d. cases, the results of Theorem 5.5 also hold for multi-state homogeneous Markov-dependent trials with some simple modification to the transition probability matrix M. The connection between Theorem 5.4 and Theorem 5.5 for finding EW(A) is straightforward. Since (S_1, ..., S_w) is the solution of the simultaneous equations S_i = ξ e_i′ + (S_1, ..., S_w) N e_i′, i = 1, ..., w, it follows that (S_1, S_2, ..., S_w) = ξ (I − N)^{−1}. Hence we have

EW(A) = S_1 + ··· + S_w = ξ (I − N)^{−1} 1′.

Further, since the expected waiting time depends on ξ_0 = (ξ : 0), the initial distribution of the imbedded Markov chain {Y_t}, the latter should be set according to the particular application, especially when {X_t} is a Markov chain and A is a compound pattern. If the dummy state {0} is added in Example 5.6 with P(Y_0 = 0) = 1, for instance, then the imbedded Markov chain {Y_t} is defined on the state space Ω_0 = {0, F, S, FF, SS, α_1, α_2} with transition probability matrix

$$
M_0 = \begin{pmatrix}
0 & q & p & 0 & 0 & 0 & 0 \\
0 & 0 & p & q & 0 & 0 & 0 \\
0 & q & 0 & 0 & p & 0 & 0 \\
0 & 0 & p & 0 & 0 & q & 0 \\
0 & q & 0 & 0 & 0 & 0 & p \\
0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix},
$$

and the mean waiting time of W(A) equals

E[W(A) | ξ = (1, 0, 0, 0, 0)] = 1 + 2(1 + pq + p^2 q^2) / (1 − 2pq − p^2 q^2).

Note that

E[W(A) | ξ = (1, 0, 0, 0, 0)] − E[W(A) | ξ = (q, p, 0, 0)] = 1.

This is due to the fact that an additional step is required by starting at Y_0 instead of Y_1, and we have the relationship

ξ_0 N_0 = (1, 0, 0, 0, 0) N_0 = (0, q, p, 0, 0) = (0 : ξ).

5.6
More About Generating Functions
Obtaining the explicit form of the probability generating function φ_W(s) through Theorems 5.3 and 5.4 is often a hard task, because it requires finding the analytic form of the inverse of the matrix (I − sN). Parallel to Theorem 5.5, the analytic form of φ_W(s) can be written in terms of Φ_W(s) = Σ_{n=1}^∞ s^n P(W(A) ≥ n), the generating function of the cumulative probabilities {P(W(A) ≥ n)}.
Theorem 5.6 Assume that the imbedded Markov chain {Y_t} corresponding to the waiting time W(A) is defined on the state space Ω = {1, 2, ..., w, α_1, ..., α_l} with a transition probability matrix M having the form of Eq. (5.13). Then (i)

Φ_W(s) = φ_1(s) + ··· + φ_w(s),   (5.31)

where (φ_1(s), φ_2(s), ..., φ_w(s)) is the solution of the simultaneous equations

φ_i(s) = s ξ e_i′ + s (φ_1(s), ..., φ_w(s)) N e_i′,   (5.32)

for i = 1, ..., w, and (ii)

φ_W(s) = 1 + Φ_W(s) − Φ_W(s)/s.   (5.33)

Proof. The proof of part (i) is the same as the proof of Theorem 5.5. We leave this to the reader. The second part of the theorem follows from the definitions and

φ_W(s) = Σ_{n=1}^∞ s^n [P(W(A) > n − 1) − P(W(A) > n)]
       = Σ_{n=1}^∞ s^n ξ N^{n−1} 1′ − Σ_{n=1}^∞ s^n ξ N^n 1′
       = Σ_{n=1}^∞ s^n ξ N^{n−1} 1′ − (1/s) Σ_{n=0}^∞ s^{n+1} ξ N^n 1′ + 1
       = 1 + Φ_W(s) − (1/s) Φ_W(s).

This completes the proof. □
Example 5.7 Consider A = SS in Bernoulli trials. The imbedded Markov chain {Y_t} for W(A) has the state space Ω = {F, S, α} and the transition probability matrix

$$
M = \begin{pmatrix}
q & p & 0 \\
q & 0 & p \\
0 & 0 & 1
\end{pmatrix}.
$$

Given ξ = (1, 0), it follows that

φ_1(s) = s + qs(φ_1(s) + φ_2(s))   and   φ_2(s) = ps φ_1(s).

Simple manipulation yields

Φ_W(s) = (s + p s^2) / (1 − qs − qp s^2)

and

φ_W(s) = 1 + Φ_W(s) − Φ_W(s)/s = p^2 s^2 / (1 − qs − qp s^2) = p^2 s^2 (1 − ps) / (1 − s + q p^2 s^3). ◊
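For a fixed s, the simultaneous equations (5.32) are linear in (φ_1(s), ..., φ_w(s)) and can be solved numerically as φ = s ξ (I − sN)^{−1}. The sketch below checks Example 5.7's closed forms at the assumed values p = 0.6 and s = 0.5.

```python
import numpy as np

p, q, s = 0.6, 0.4, 0.5        # assumed example values

# Non-absorbing states F, S for A = SS; xi = (1, 0)
N = np.array([[q, p],
              [q, 0.0]])
xi = np.array([1.0, 0.0])

# Eq. (5.32) in matrix form: phi = s*xi + s*phi N  =>  phi = s*xi (I - sN)^{-1}
phi = s * xi @ np.linalg.inv(np.eye(2) - s * N)
Phi = phi.sum()                               # Eq. (5.31)
pgf = 1 + Phi - Phi / s                       # Eq. (5.33)

closed_Phi = (s + p * s ** 2) / (1 - q * s - q * p * s ** 2)
closed_pgf = p ** 2 * s ** 2 / (1 - q * s - q * p * s ** 2)
```

Evaluating the pair at several values of s in (0, 1) traces out the whole generating function without ever inverting (I − sN) symbolically.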
The probability generating function φ_W(s) can also be obtained through a first-step analysis. Write

φ_W(s) = Σ_{n=1}^∞ s^n ξ N^{n−1} (I − N) 1′ = Σ_{i=1}^{w} ξ_i ϕ_i(s),   (5.34)

where, for i = 1, ..., w,

ϕ_i(s) = Σ_{n=1}^∞ s^n e_i N^{n−1} (I − N) 1′   (5.35)

is the probability generating function with initial distribution e_i. For each i, it follows from this first-step analysis that

ϕ_i(s) = s e_i (I − N) 1′ + Σ_{n=2}^∞ s^n e_i N^{n−1} (I − N) 1′
       = s e_i (I − N) 1′ + s Σ_{n=2}^∞ s^{n−1} (e_i N) N^{n−2} (I − N) 1′
       = s e_i (I − N) 1′ + s Σ_{n=2}^∞ Σ_{j=1}^{w} s^{n−1} p_{ij} e_j N^{n−2} (I − N) 1′
       = s e_i (I − N) 1′ + s Σ_{j=1}^{w} p_{ij} ϕ_j(s).   (5.36)
Equations (5.34), (5.35), and (5.36) yield the following theorem. Theorem 5.7 Assume that the imbedded Markov chain {Y_t} corresponding to the waiting time W(A) is defined on the state space Ω = {1, 2, ..., w, α_1, ..., α_l} with a transition probability matrix M having the form of Eq. (5.13). Then, given ξ_0 = (ξ : 0), the probability generating function of W(A) is

φ_W(s) = Σ_{i=1}^{w} ξ_i ϕ_i(s),

where (ϕ_1(s), ..., ϕ_w(s)) is the solution of the simultaneous equations

ϕ_i(s) = s e_i (I − N) 1′ + s e_i N (ϕ_1(s), ..., ϕ_w(s))′,

for i = 1, ..., w, and e_i N is the i-th row of the matrix N.
For the special case ξ = (1, 0, ..., 0), Aki and Hirano (1999, 2000), Aki (1999), and Han and Aki (1998, 2000a) refer to the probability generating function φ_W(s) = ϕ_1(s). They also refer to this approach for obtaining the distribution of the waiting-time random variable W(A) as the conditional probability generating function technique. In addition to the above i.i.d. examples, we would like to give an additional example to illustrate that our results are also applicable for the case where {X_t} is a two-state homogeneous Markov chain with transition probability matrix

$$
A = \begin{pmatrix} p_{FF} & p_{FS} \\ p_{SF} & p_{SS} \end{pmatrix}.
$$
Waiting- Time
84
Distributions
Example 5.8 Let us consider the waiting-time random variable W(A) of a compound pattern A = Ai U A2, where Ai = SS and A2 = FF. Given the initial probabilities P(X± = S) = p and P{X\ = F) = q = 1 — p, the imbedded Markov chain {Yt} associated with the waiting time W(A) has the state space 0. = {0, S, F, a>\, 0.2} and the transition probability matrix
M
S F «2
( 0 0 0 0
p 0
q
0
PsF
Pss
0 0 0
0 1 0
PFS
0 0
\°
0 0
AT
PFF
6
0 1
c I
Given the initial distribution £ = (1,0,0), Theorems 5.5 and 5.6 yield the mean waiting time and the probability generating function, respectively, as P +
E[W(A)] = 1 + 1
PPSF
QPFS PSFPFS
1
PSFPFS
and Vw(s) =Ps
s
psA + qpFSs° 1 i
2 PSFPFSS
qs* + p p S F s u 1 'PSFPFSS
o 5.7
Spectrum Analysis and Large Deviation Approximation
Let A[ij, A[2], • • •, X[w], be the ordered eigenvalues (in the sense of |A[i]| > IA[2] > • • • > lAr, |) of the essential transition probability submatrix N, and let 'Hl,r)2,---,riw, be the column eigenvectors corresponding to the eigenvalues \^, in the sense that Nrii = X[i]f]i, for all i = 1, • • • ,w. Since AT is a substochastic matrix, it is known by the Perron-Frobenius Theorem (Seneta 1981) that the largest eigenvalue A[ij is unique and a real number between zero and one (0 < A[i] < 1). Since 1 can always be written as a linear combination of the eigenvectors r]t, i.e.,
J2air>'i' we obtain the following theorem.
(5.37)
Spectrum Analysis and Large Deviation
Approximation
85
Theorem 5.8 / / the transition probability matrix M corresponding to the waiting time W(A) has the form of Eq. (5.13), then (i) W
P(W(A)>n) = Y,Ci\^\
(5.38)
i=l
where Ci = a^rj,, i = 1, • • •, w, (ii)
$w(s) = J2snP(W(A) >n) = ^2^ n=l
CiS
i=l
and $ w ( s ) exists for \s\ < 1/A[x], and (Hi) w
-\
r-s
Proof.
<5-39)
W
It follows from Eq. (5.37) that, for every n, w
P(W(A)>n) = tN^l
n 1
=(-N - Y,*iv'i i=l
w i=\ w
= E^,-1.
(5-4°)
i=i
This completes the proof of part (i). The result (ii) follows immediately from the definition of $ w ( s ) and Eq. (5.40). Note that &w{s) exists if and only if Y^Li 'Mil'8™ < °°> f° r all i = 1, • • •, u>. Since all the series are power series and A^j is the largest eigenvalue (0 < A^j < 1), $ w ( s ) exists if and only if \s\ < 1/A^j, and summing the series gives
*"W = £ r ^ n The result (iii) follows from Theorem 5.6(ii) and Eq. (5.41).
( 5 - 41 ) •
Waiting- Time
86
Distributions
The largest eigenvalue A[i] is always less than 1 (0 < A[i] < 1), even though its value varies from problem to problem. Hence all the probability generating functions and the cumulative generating functions always exist in the range \s\ < 1. Further, it follows from Eqs. (5.39) and (5.41) that w
»
p
(5 42)
EW(A) = -v^OOUi = <Mi) = E r~T- • T h e o r e m 5.9 As n —> oo, the tail probability P(W(A) zero exponentially in the following sense: lim - logP{W(A)
-
> n) converges to
>n) = -/3(A),
(5.43)
n—>oo n
where /3(A) = — logA^j. The tail probability of the general geometric distribution is determined by the largest eigenvalue of N, and, for large n, Eq. (5.43) provides a large deviation approximation for the tail probability; i.e. P(W(A)
>n) = exp{-n/3(A)[l + o(l)]} ~ exp-fnlogAmJ.
(5.44)
Proof. It can be shown that Yl7=i ai£7li = 1 a n d ^ i = ai£*7i > 0- Since A[i] is unique and A[i] > |A[j]|, for a l i i = 2, • • • ,w, by Eq. (5.38) we have the following inequality 1 - g l t l l ^ l
}
*
P(W(A)>n) \
<
n—1
CiA: t=2
Ml]
Given an arbitrarily small number e > 0, since |A[j]/Am|n 1 —> 0 as n —> oo for alH = 2, • • •, w, there exists a number no such that for all n > no, C i A ^ ^ l - e} < P(W(A) > n) < C i A j ^ l + e}.
(5.45)
Taking the log of both sides of Eq. (5.45) and dividing by n, then as n —> oo, we have lim - log P(W (A) > n) = log Am. n—»-oo 77,
This completes the proof.
l J
•
Probability Generating Function of W(r, A)
87
In the language of large deviation theory, we can say that the tail probability P(W(Λ) > n) obeys the first-order large deviation principle in the sense of Eq. (5.43). The positive constant β(Λ) = −log λ_[1] is often referred to as the exponential rate of P(W(Λ) > n) tending to zero. Numerical examples for the large deviation approximation in Eq. (5.44) are provided in Section 7.2. Following Theorem 5.9, we can prove a slightly stronger result.

Theorem 5.10

$$\lim_{n\to\infty}\frac{P(W(\Lambda)>n)}{C_1\lambda_{[1]}^{n-1}} = 1. \tag{5.46}$$

Proof. It follows from Eq. (5.38) that there exists a constant c such that

$$1 - c\left|\frac{\lambda_{[2]}}{\lambda_{[1]}}\right|^{n-1} \le \frac{P(W(\Lambda)>n)}{C_1\lambda_{[1]}^{n-1}} \le 1 + c\left|\frac{\lambda_{[2]}}{\lambda_{[1]}}\right|^{n-1}. \tag{5.47}$$

Since λ_[1] > |λ_[2]|, the result is an immediate consequence of the above inequality by letting n tend to infinity. ∎

This yields, for large n, that the tail probability of W(Λ) can be approximated by

$$P(W(\Lambda) > n) \approx C_1\lambda_{[1]}^{n-1}.$$
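As a numerical illustration of this approximation, the following sketch computes the exact tail by matrix powers and compares its decay with the largest eigenvalue. The essential submatrix below is our own reconstruction for the pattern Λ = SS in i.i.d. two-state trials (transient states "no progress" and "S"), in the spirit of Example 5.7; it is an assumption for illustration, not code from the text.

```python
import numpy as np

# Essential submatrix N for the waiting time of SS (assumed setup):
# from "no progress": F keeps no progress (q), S moves to state S (p);
# from "S": F resets (q), S completes the pattern (absorption, prob p).
p, q = 0.5, 0.5
N = np.array([[q, p],
              [q, 0.0]])
xi = np.array([1.0, 0.0])              # start in the empty state
one = np.ones(2)

lam1 = max(abs(np.linalg.eigvals(N)))  # largest eigenvalue lambda_[1]

def tail(n):
    """P(W > n) = xi N^n 1'."""
    return xi @ np.linalg.matrix_power(N, n) @ one

# Successive tail ratios approach lambda_[1], and -(1/n) log P(W > n)
# approaches beta = -log lambda_[1], as Theorems 5.9 and 5.10 assert.
print(tail(41) / tail(40))             # ~ 0.809017
print(-np.log(tail(200)) / 200, -np.log(lam1))
```

For p = q = 0.5 the largest eigenvalue is (q + √(q² + 4pq))/2 ≈ 0.809, so the tail halves roughly every three trials.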
5.8 Probability Generating Function of W(r, Λ)
The waiting-time random variable W(r, Λ) can be viewed as the sum of r waiting times of the pattern Λ under non-overlap counting: W(r, Λ) = W_1(Λ) + ··· + W_r(Λ). It is easy to see that the {W_i(Λ)} are i.i.d. (each distributed as W(Λ)) if {X_t} are i.i.d. multi-state trials and the counting is non-overlapping. Then the probability generating function of W(r, Λ) is

$$\Psi_{W(r,\Lambda)}(s) = \big[\Psi_{W(\Lambda)}(s)\big]^r. \tag{5.48}$$
Let E° be the ending block with respect to overlap counting (see the definition of E° in Section 4.3). Given ξ(E°), the random variables W_i°(Λ), i = 2, ···, r, are conditionally independent and identically distributed, and we denote these variables by W°(Λ). Note that the imbedded Markov chain {Y_t} for W°(Λ) has the same transition probability matrix as the imbedded Markov chain for the waiting time W(Λ) (under non-overlap counting), and has initial distribution ξ(E°), where ξ(E°) is a unit vector with one at state E° and zero elsewhere. Hence the probability generating function for W°(Λ) is

$$\Psi_{W^{\circ}}(s) = \sum_{n=1}^{\infty} s^n\,\xi(E^{\circ})\,N^{n-1}(I-N)\,\mathbf{1}'. \tag{5.49}$$

It follows that the probability generating function of W°(r, Λ) is

$$\Psi_{W^{\circ}(r,\Lambda)}(s) = \Psi_W(s)\,\big[\Psi_{W^{\circ}}(s)\big]^{r-1}. \tag{5.50}$$

If {X_t} is a sequence of homogeneous Markov-dependent trials, Λ is a simple pattern, and counting is non-overlapping, then the probability generating function of W(r, Λ) has the same form as Eq. (5.50), except that the initial distribution for the second and subsequent patterns, ξ(∅_e), is degenerate at the state ∅_e, the initial state for counting the second pattern Λ when e is the last element of pattern Λ. The transition probability from ∅_e to v ∈ Ω is given by

$$p_{\emptyset_e, v} = \begin{cases} p_{eb} & \text{if } v = \langle \emptyset_e, b\rangle \in \Omega \text{ with } b \in S,\\ 0 & \text{otherwise.} \end{cases}$$
Further, Eq. (5.50) also works for the case where {X_t} is a sequence of homogeneous Markov-dependent trials, Λ is a simple pattern, and counting is overlapping. To illustrate Theorem 5.6 and Eq. (5.50), we provide the following example.

Example 5.9 Assume {X_t} is a sequence of Bernoulli trials and Λ = SS. It follows that Λ yields E° = S with respect to overlap counting. The imbedded Markov chain {Y_t} associated with W(Λ) has the transition probability matrix given in Example 5.7, where it was shown that

$$\Psi_W(s) = \frac{p^2 s^2}{1 - qs - qps^2}.$$
Since ξ(E°) = (0, 1), it follows from Example 5.7 and Theorem 5.6 that

$$\phi_1^{\circ}(s) = s\big(q\,\phi_1^{\circ}(s) + p\,\phi_2^{\circ}(s)\big) \qquad\text{and}\qquad \phi_2^{\circ}(s) = sq\,\phi_1^{\circ}(s) + sp.$$

Solving these simultaneous equations gives

$$\phi_1^{\circ}(s) = \frac{p^2 s^2}{1-qs-qps^2} \qquad\text{and}\qquad \Psi_{W^{\circ}}(s) = \phi_2^{\circ}(s) = \frac{ps(1-qs)}{1-qs-qps^2}.$$

It follows that

$$\Psi_{W^{\circ}(r,\Lambda)}(s) = \Psi_W(s)\,\big[\Psi_{W^{\circ}}(s)\big]^{r-1} = \frac{p^{r+1}s^{r+1}(1-qs)^{r-1}}{(1-qs-qps^2)^r}$$

and

$$EW^{\circ}(r,\Lambda) = \frac{1}{p} + \frac{r}{p^2} = \Big(\frac{1}{p}+\frac{1}{p^2}\Big) + \frac{r-1}{p^2}. \;\;\circ$$
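The mean in this example can be checked numerically. The sketch below is our own illustration (the unknowns m_F and m_S, hypothetical names, are the expected numbers of additional trials until the next overlapping occurrence of SS, starting from a history ending in F — or nothing — and in S, respectively); it solves the first-step equations and reproduces EW°(r, Λ) = 1/p + r/p².

```python
import numpy as np

# First-step analysis for overlapping occurrences of SS (assumed setup):
#   m_F = 1 + p*m_S + q*m_F      (a trial from "ending in F")
#   m_S = 1 + q*m_F              (from "ending in S": prob p finishes in 1 step)
p = 0.3
q = 1.0 - p
A = np.array([[1.0 - q, -p],
              [-q,       1.0]])
m_F, m_S = np.linalg.solve(A, np.array([1.0, 1.0]))

# EW°(r, Λ) = EW(Λ) + (r - 1) EW°(Λ), with EW(Λ) = m_F and EW°(Λ) = m_S.
r = 5
EW = m_F + (r - 1) * m_S
print(EW, 1 / p + r / p**2)    # the two values agree
```

The solve gives m_F = 1/p + 1/p² and m_S = 1/p², i.e. each overlapping occurrence after the first costs 1/p² trials on average, which is exactly the (r − 1)/p² term above.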
In general, it follows from Eq. (5.50) that

$$\frac{d}{ds}\Psi_{W^{\circ}(r,\Lambda)}(s)\Big|_{s=1} = \Psi_W'(1) + (r-1)\,\Psi_{W^{\circ}}'(1).$$

This is equivalent to saying that EW°(r, Λ) = EW(Λ) + (r − 1) EW°(Λ).

5.9 Scan Statistics
Let {X_i} be a sequence of n i.i.d. or homogeneous Markov-dependent two-state trials. The scan statistic S_n(r) of window size r for the sequence {X_i} is defined as

$$S_n(r) = \max_{r \le t \le n} S(r,t), \tag{5.51}$$
where S(r, t) is the partial sum given by

$$S(r,t) = \sum_{k=t-r+1}^{t} X_k.$$
The scan statistic has been successfully used in numerous areas, such as cluster analysis (Naus 1965 and Huntington and Naus 1975), the generalized birthday problem (Saperstein 1972 and Naus 1974), DNA pattern matching (Sheng and Naus 1994), and the reliability of k-within-consecutive-r-out-of-n:F systems (Chao, Fu and Koutras 1995). Naus (1974, Theorem 1) provided a combinatorial result for the conditional distribution of S_n(r) given Σ_{i=1}^{n} X_i = a, the total number of successes in n trials. The result is briefly described in the following. Consider n positions (trials) being divided into L disjoint groups each of size r, i.e. n = rL. Let σ denote a partition of the a 1s into L numbers (n_1, ···, n_L), where n_i ≥ 0 represents the number of 1s in the i-th group and Σ_{i=1}^{L} n_i = a. Let Γ_s denote the collection of all such partitions σ where the L non-negative integers are each smaller than s (i.e. n_i < s for i = 1, ···, L). By using a theorem of Karlin and McGregor (1959), Naus (1974) gave the following formula:

$$P\Big(S_n(r) < s \,\Big|\, \sum_{i=1}^{n} X_i = a\Big) = \frac{(r!)^L}{\binom{n}{a}} \sum_{\sigma\in\Gamma_s} \det|d_{ij}|, \tag{5.52}$$

where d_ij = 1/(c_ij!(r − c_ij)!), d_ij = 0 if either factorial argument is negative, and, for i, j = 1, ···, L,

$$c_{ij} = \begin{cases} (j-i)s - \sum_{k=i}^{j} n_k + n_i & \text{if } i < j,\\[3pt] (j-i)s + \sum_{k=j}^{i} n_k & \text{if } i \ge j. \end{cases}$$

The unconditional probability P(S_n(r) < s) can be obtained by averaging the conditional probability over the binomial distribution of Σ_{i=1}^{n} X_i, i.e.

$$P(S_n(r) < s) = \sum_{a=0}^{n} \binom{n}{a} p^a q^{n-a}\, P\Big(S_n(r) < s \,\Big|\, \sum_{i=1}^{n} X_i = a\Big).$$

The major difficulty encountered in the computation for large n is that the conditional probability involves a sum of large determinants (det|d_ij|) over the many terms in Σ_{σ∈Γ_s}, and the number of terms in this sum tends
to infinity, as n → ∞, exponentially fast. Naus (1982) pointed out that Eq. (5.52) can be computed only if r is small and the ratio r/n is relatively large. Even with today's computers, r is still restricted to about less than 20, and n to less than 100, when Eq. (5.52) is used. Because of the complexity of computing the exact probability, a considerable number of approximations and bounds have been developed for P(S_n(r) < s); see, for example, Naus (1982), Glaz (1989, 1992), and Chen and Glaz (1997, 1999). Some of the bounds are excellent. Two recent books, one by Glaz, Naus, and Wallenstein (2001) and the other by Balakrishnan and Koutras (2002), provide comprehensive studies of various types of scan statistics.

Koutras and Alexandrou (1995) studied the exact distribution of the scan statistic S_n(r) using the finite Markov chain imbedding technique. The following alternative formula for the exact distribution is based on Fu (2001), where the tail probability of the scan statistic P(S_n(r) < s) is cast in terms of the transition probability matrix of the imbedded Markov chain corresponding to the waiting time of a compound pattern Λ = ∪_{i=1}^{l} Λ_i. To make this link clearer, we first give the following example.

Example 5.10 Suppose that the window size is r = 5 and that s = 3; then the event S_n(5) < 3 occurs in a sequence of n trials {X_i}_{i=1}^{n} if and only if none of the six simple patterns Λ_1 = SSS, Λ_2 = SFSS, Λ_3 = SSFS, Λ_4 = SFFSS, Λ_5 = SFSFS, and Λ_6 = SSFFS occurs in the sequence. If we define the compound pattern Λ_{5,3} = ∪_{i=1}^{6} Λ_i, and let W(Λ_{5,3}) be the waiting time of the compound pattern Λ_{5,3}, then we have

$$P(S_n(5) < 3) = P(W(\Lambda_{5,3}) \ge n+1).$$

Furthermore, we may note that the number of simple patterns Λ_i associated with the event S_n(5) < 3 is

$$l = \binom{1}{0} + \binom{2}{1} + \binom{3}{2} = 6.$$

From this example, it should be clear that it is sufficient to find the distribution of the waiting time of the compound pattern Λ when we want to find the distribution of S_n(r). ○

In general, given r and s, where 0 < s ≤ r, we define a collection of
simple patterns

$$\mathcal{F}_{r,s} = \Big\{\Lambda_1 = \underbrace{S\cdots S}_{s},\ \Lambda_2 = SFS\cdots S,\ \cdots,\ \Lambda_l = \underbrace{S\cdots S\,F\cdots F\,S}_{r}\Big\} \tag{5.53}$$

and the compound pattern

$$\Lambda_{r,s} = \bigcup_{i=1}^{l}\Lambda_i \tag{5.54}$$

induced by all simple patterns Λ_i ∈ F_{r,s}. We define the usual waiting-time random variable for the compound pattern Λ_{r,s} as

$$\begin{aligned} W(\Lambda_{r,s}) &= \inf\{n: \text{the number of trials required to obtain the compound pattern } \Lambda_{r,s}\}\\ &= \text{the minimum number of trials required to obtain any one of the simple patterns } \Lambda_i,\ \Lambda_i\in\mathcal{F}_{r,s}. \end{aligned} \tag{5.55}$$

Theorem 5.11
Given r, s, and n, where 0 < s < r < n,
(i) the waiting-time random variable associated with the scan statistic S_n(r) is W(Λ_{r,s}), where Λ_{r,s} is the compound pattern defined by Eqs. (5.53) and (5.54), and the number of simple patterns is

$$l = \sum_{x=0}^{r-s}\binom{s-2+x}{x};$$

(ii) the imbedded homogeneous Markov chain {Y_t} is defined on the state space Ω = {∅, S, F} ∪_{i=1}^{l} S(Λ_i), for Λ_i ∈ F_{r,s}, and has a transition probability matrix

$$M_{r,s} = \begin{pmatrix} N_{r,s} & C\\ O & I \end{pmatrix},$$

where the transition probabilities p_{u,v} of the matrix M_{r,s} are defined by Eq. (5.14); and

(iii) we have

$$P(S_n(r) < s) = P(W(\Lambda_{r,s}) \ge n+1) = \xi\,N_{r,s}^{\,n}\,\mathbf{1}'.$$
Proof. Given r, s and n, it is clear that if S_n(r) < s, then no Λ_i ∈ F_{r,s} has occurred in the sequence {X_i}_{i=1}^{n}, and the converse is also true. Hence the waiting time of the compound pattern Λ_{r,s} is related to S_n(r) in the following way:

$$S_n(r) < s \iff W(\Lambda_{r,s}) \ge n+1, \quad \text{for any } 1 \le s \le r.$$

This proves the first parts of Theorem 5.11(i) and (iii). Let x be the number of Fs between the first and the last S (the s-th S). In view of Example 5.10 and the definition of F_{r,s}, the patterns in F_{r,s} are created by the permutations of (s − 2) Ss and x (x = 0, ···, r − s) Fs between the first and last S, and hence the number of patterns in F_{r,s} is

$$l = \binom{s-2}{0} + \binom{s-1}{1} + \cdots + \binom{r-2}{r-s} = \sum_{x=0}^{r-s}\binom{s-2+x}{x}.$$

This completes the proof of the second part of Theorem 5.11(i). Each subpattern in S(Λ_i), i = 1, ···, l, has to be a state in Ω with respect to the imbedded Markov chain {Y_t}, and ∅ is the initial state; hence

$$\Omega = \{\emptyset, F, S\} \cup_{i=1}^{l} S(\Lambda_i).$$

It follows from Section 5.3 that the waiting-time random variable W(Λ_{r,s}) is finite Markov chain imbeddable. The transition probability matrix M of the imbedded Markov chain {Y_t} can be rearranged in the form

$$M = \begin{pmatrix} N_{r,s} & C\\ O & I \end{pmatrix},$$

where the absorbing states α_1, ···, α_l correspond to the patterns Λ_i ∈ F_{r,s}. Hence Theorem 5.11(ii) is an immediate consequence of Theorem 5.2. This completes the proof. ∎

Again we provide an example of the transition probability matrices of the imbedded Markov chain for both i.i.d. and Markov-dependent two-state trials.

Example 5.11 Given r = 4 and s = 3, it follows that F_{4,3} = {Λ_1 = SSS, Λ_2 = SFSS and Λ_3 = SSFS} and that the state space is Ω = {∅, F, S, SF, SS, SFS, SSF, α} after combining the absorbing states α_1, α_2 and α_3 together as α. If {X_i} is a sequence of i.i.d. two-state trials, then the transition probability matrix of the imbedded Markov chain {Y_t} defined
on Ω has the form (rows and columns ordered as ∅, F, S, SF, SS, SFS, SSF, α):

$$M = \begin{pmatrix} 0 & q & p & 0 & 0 & 0 & 0 & 0\\ 0 & q & p & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & q & p & 0 & 0 & 0\\ 0 & q & 0 & 0 & 0 & p & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & q & p\\ 0 & 0 & 0 & q & 0 & 0 & 0 & p\\ 0 & q & 0 & 0 & 0 & 0 & 0 & p\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}.$$
If the sequence of two-state trials is a homogeneous Markov chain with transition probability matrix

$$A = \begin{pmatrix} p_{FF} & p_{FS}\\ p_{SF} & p_{SS} \end{pmatrix},$$
then the transition probability matrix M* corresponding to the imbedded Markov chain {Y_t} is (same ordering of states ∅, F, S, SF, SS, SFS, SSF, α)

$$M^* = \begin{pmatrix} 0 & q & p & 0 & 0 & 0 & 0 & 0\\ 0 & p_{FF} & p_{FS} & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & p_{SF} & p_{SS} & 0 & 0 & 0\\ 0 & p_{FF} & 0 & 0 & 0 & p_{FS} & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & p_{SF} & p_{SS}\\ 0 & 0 & 0 & p_{SF} & 0 & 0 & 0 & p_{SS}\\ 0 & p_{FF} & 0 & 0 & 0 & 0 & 0 & p_{FS}\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}.$$
Note that here we assume P(X_1 = S) = p, P(X_1 = F) = q, and the initial distribution P(Y_0 = ∅) = 1 in both cases. The two transition probability matrices share the same form, and the changes from matrix M to matrix M* are straightforward. Since P(W(Λ_{r,s}) ≥ n + 1) = P(S_n(r) < s), application of Theorem 5.6 yields the probability generating function for the sequence {P(S_n(4) < 3)} as follows:

$$\Psi_W(t) = \frac{t\,(1 + pt + p^2t^2 + p^2qt^3 - p^3qt^4 - p^3q^2t^5)}{1 - qt - pqt^2 - p^2q^2t^4 + p^3q^3t^6}. \;\;\circ$$
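The identity P(S_n(4) < 3) = P(W(Λ_{4,3}) ≥ n + 1) behind this example can be verified directly. The sketch below is our own illustration: the state set and suffix-update rule reconstruct the imbedding generically (an Aho–Corasick-style longest-suffix automaton, with hypothetical helper names), and the result is checked against brute-force enumeration of all 2^n sequences.

```python
from itertools import product

patterns = ["SSS", "SFSS", "SSFS"]        # F_{4,3} from Example 5.11

# States: the empty word, "F", and every proper prefix of a pattern
# (these are exactly the transient states of the example's matrix M).
states = {"", "F"} | {w[:i] for w in patterns for i in range(1, len(w))}

def step(state, c):
    """Longest suffix of state+c that is again a state; None = absorbed."""
    x = state + c
    if any(x.endswith(w) for w in patterns):
        return None                        # some pattern just completed
    for i in range(len(x)):
        if x[i:] in states:
            return x[i:]
    return ""

p, q, n = 0.5, 0.5, 10
dist = {"": 1.0}                           # chain starts in the empty state
for _ in range(n):
    nxt = {}
    for w, mass in dist.items():
        for c, pr in (("S", p), ("F", q)):
            v = step(w, c)
            if v is not None:
                nxt[v] = nxt.get(v, 0.0) + mass * pr
    dist = nxt
fmci = sum(dist.values())                  # P(S_n(4) < 3) = P(W > n)

brute = 0.0
for t in product("SF", repeat=n):
    seq = "".join(t)
    if max(seq[i:i + 4].count("S") for i in range(n - 3)) < 3:
        brute += p ** seq.count("S") * q ** seq.count("F")

print(fmci, brute)                         # the two probabilities agree
```

The brute-force pass costs 2^n operations, while the chain evolution costs only n times the (fixed) number of transient states — which is the practical point of the imbedding.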
Tables 5.2 and 5.3 provide selected numerical examples. Specifically, Table 5.2 provides some cumulative distributions of the scan statistic S_n(r) for r = 5 and n = 10, 50, 500, and Table 5.3 provides selected probabilities P(S_n(r) < s) for r = 10, 15, 20 and large n.

Table 5.2 Selected cumulative distributions of S_n(5).

  n    s               i.i.d. p = q = 0.5    Markov-dependent p_FF = 0.3, p_SF = 0.4
  10   0               0.0009766             0.0050388
       1               0.0253906             0.0965779
       2               0.1816406             0.4486013
       3               0.5478516             0.8453266
       4               0.8906250             0.9860426
       5               1                     1
       Mean            3.3535156             2.6184271
       Std. deviation  0.9581207             0.9024778
  50   0               8.881784×10^-16       6.7356773×10^-12
       1               1.7635697×10^-9       2.9342188×10^-6
       2               0.0000418             0.0085945
       3               0.0215953             0.3396303
       4               0.4481247             0.9071361
       5               1                     1
       Mean            3.9830427             3.1873302
       Std. deviation  0.7501573             0.7636984
  500  0               3.0549364×10^-151     9.9184531×10^-112
       1               5.4756424×10^-90      4.6027994×10^-57
       2               5.0166948×10^-46      4.1206202×10^-22
       3               3.4424189×10^-18      1.1910137×10^-5
       4               0.0001974             0.3549406
       5               1                     1
       Mean            4.5302382             3.7446362
       Std. deviation  0.5407803             0.6269622
As can be seen from Tables 5.2 and 5.3, given a value of r, for every s ≤ r the probability P(S_n(r) < s) tends to zero very fast as n increases. In fact, even P(S_n(r) < r) tends to zero exponentially fast, in the sense that

$$\lim_{n\to\infty}\frac{1}{n}\log P(S_n(r) < r) = -\beta, \tag{5.56}$$
Table 5.3 Selected P(S_n(r) < s) for large r and n.

  n    r   s   i.i.d. p = q = 0.5   Markov-dependent p_FF = 0.3, p_SF = 0.4
  100  15  4   6.7451941×10^-15     2.1329403×10^-9
  100  20  4   6.6456841×10^-18     4.1583113×10^-11
  500  10  3   1.4369406×10^-79     6.2128156×10^-48
  500  10  5   4.0460520×10^-34     7.4002280×10^-14
  500  15  4   1.1297610×10^-73     2.2560900×10^-43
  500  20  4   1.1020938×10^-86     2.3334601×10^-53
where β = −log λ_[1] and λ_[1] is the largest eigenvalue of the essential transition probability submatrix of the imbedded Markov chain {Y_t} corresponding to the waiting-time random variable W(Λ_{r,r}). This is a direct result of Theorem 5.9.

In this section, we have focused our study on the scan statistic S_n(r). There are two other commonly used scan statistics, W_{s,r} and D_n(s), as defined below:

(i) Given s and r, let W_{s,r} be the waiting time until s successes first appear within a window of r consecutive trials, W_{s,r} = inf{n : S_n(r) ≥ s}.
(ii) Given n and s, let D_n(s) be the length of the smallest window that contains at least s successes, D_n(s) = inf{r : S_n(r) ≥ s}.

From the definitions, the two statistics W_{s,r} and D_n(s) are closely related to the scan statistic S_n(r), and their distributions can be obtained in terms of the distribution of S_n(r). These relationships can be mathematically expressed as

$$P(S_n(r) < s) = P(W_{s,r} > n) = P(D_n(s) > r), \quad \text{for } 0 < s \le r \le n. \tag{5.57}$$
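The relationship between S_n(r) and D_n(s) is easy to confirm by exhaustive enumeration for a small n. The sketch below is our own illustration with hypothetical helper names; D_n(s) is taken to be n + 1 when no window of any length contains s successes.

```python
from itertools import product

def scan_stat(seq, r):
    """S_n(r): maximum number of successes in any window of length r."""
    return max(sum(seq[i:i + r]) for i in range(len(seq) - r + 1))

def smallest_window(seq, s):
    """D_n(s): length of the smallest window containing >= s successes."""
    n = len(seq)
    for r in range(1, n + 1):
        if scan_stat(seq, r) >= s:
            return r
    return n + 1                 # convention when s successes never occur

n, r, s = 8, 4, 3
lhs = rhs = 0
for seq in product((0, 1), repeat=n):
    if scan_stat(seq, r) < s:
        lhs += 1
    if smallest_window(seq, s) > r:
        rhs += 1

print(lhs, rhs)                  # equal counts, as Eq. (5.57) asserts
```

Since every binary sequence is enumerated, equality of the two counts is exactly the event-level identity {S_n(r) < s} = {D_n(s) > r}, independent of the success probability.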
Chapter 6
Random Permutations
6.1 Introduction
Let Z_n = {1, 2, ···, n} be a collection of n integers, and let Π(Z_n) = {π_n = (π(1), π(2), ···, π(n)) : π(i) ∈ Z_n} be the set of all permutations generated by the integers in Z_n. A succession of size 2 has occurred at the i-th place if π(i) + 1 = π(i+1), and a rise (increase) has occurred at the i-th place if π(i) < π(i+1). Successions and rises can be viewed as patterns in the permutation π_n. For example, the number of permutations in Π(Z_n) with exactly k rises, A(n, k), is known as the Eulerian number. More generally, let S be a collection of N integers with [s]-specification [s] = [s_1, ···, s_n], where s_i (≥ 1), for i = 1, ···, n, is the number of integers i and s_1 + ··· + s_n = N. For example, if S = {1, 1, 2, 3, 3}, then [s] = [2, 1, 2] and s_1 + s_2 + s_3 = 5. Denote by Π_N(S) = {π_N = (π(1), ···, π(N)) : π(i) ∈ S} the collection of all the permutations generated by the integers in S. The number of permutations in Π_N(S) having exactly k rises (or falls), A([s], k), is called the Simon Newcomb number. It is easy to see that Π(Z_n) and A(n, k) are special cases of Π_N(S) and A([s], k), respectively, when the [s]-specification is [1, ···, 1]. In the literature, the Eulerian and Simon Newcomb numbers are probably the most heavily studied and celebrated numbers associated with random permutations in combinatorial analysis. There are many other interesting patterns and runs statistics besides these two, such as the numbers of levels, peaks and waves. A considerable amount of literature in combinatorial analysis has treated such numbers of runs and patterns in a random permutation π_N ∈ Π_N(S): e.g. Abramson and Moser (1967), Dwass (1973),
Jackson and Reilly (1976), Johnson (2001), Kaplansky (1944), Reilly and Tanny (1979), Roselle (1968) and Tanny (1976). The books by MacMahon (1915), Riordan (1958) and David and Barton (1962) provide a good overview of early developments in this area; Johnson (2002) gave a short review of current developments. Recently Fu (1995), Fu, Lou and Wang (1999) and Johnson and Fu (2000) extended the finite Markov chain imbedding technique to study the distribution of runs and patterns defined on random permutations. Whenever the context is clear, we will simplify the notation π_n and π_N to π.
If we assume that all the permutations in Π(Z_n) are equally likely, then, for example, the probability of exactly k rises in a permutation π can be written as

$$P(R_n(\pi) = k) = A(n,k)/n!,$$

where R_n(π) denotes the number of rises in a random permutation π. Furthermore, note that every permutation π ∈ Π(Z_n) is a realization of inserting n integers one by one into gaps between integers (including the two end gaps). For example, consider first that the permutation π = (31524687) can be decomposed into 8 sub-permutations {π_8 = π, π_7 = (3152467), π_6 = (315246), π_5 = (31524), π_4 = (3124), π_3 = (312), π_2 = (12), and π_1 = (1)} by removing the largest integer at each step. This decomposition is unique. Conversely, the random permutation π_8 is a realization of inserting the integers 1 to 8 one by one randomly into the gaps between integers:

$$\{(1) \to (12) \to (312) \to \cdots \to (3152467) \to (31524687)\}.$$

In general, under the above random insertion procedure, all the permutations in Π(Z_n) are equally likely and have probability 1/n!. The most important aspect of the above insertion procedure is that it provides a way to study the distributions of runs and patterns on random permutations via the finite Markov chain imbedding technique without combinatorial analysis. Basically, the sequence of insertions will be viewed as a sequence of trials, and treated as in previous chapters.
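The uniformity claim can be checked by enumerating all insertion paths for a small n. Below is a sketch (our own illustration; `insert_path`, a hypothetical name, encodes the gap chosen at each insertion), confirming that the 2·3·4 = 24 insertion paths for n = 4 produce the 24 permutations exactly once each.

```python
from itertools import product
from collections import Counter

def insert_path(choices):
    """Build a permutation of 1..n by inserting t = 2, 3, ... one at a time;
    choices[t-2] picks one of the t gaps (0-based) of the current word."""
    perm = [1]
    for t, gap in enumerate(choices, start=2):
        perm.insert(gap, t)
    return tuple(perm)

n = 4
# When integer t is inserted, the word has t-1 entries, hence t gaps.
paths = list(product(*[range(t) for t in range(2, n + 1)]))
counts = Counter(insert_path(c) for c in paths)

print(len(paths), len(counts))   # 24 paths, 24 distinct permutations
```

Because the path-to-permutation map is a bijection and each gap is chosen uniformly, every permutation receives probability 1/n!, as stated above.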
6.2 Successions
Let π = (π(1), ···, π(n)) be a random permutation in Π(Z_n). A succession of size 2 is a pair of integers (π(i), π(i+1)) which satisfies

$$\pi(i+1) = \pi(i) + 1. \tag{6.1}$$

The number of successions of size 2 in π, X_n(π, 2), is the total number of pairs (π(i), π(i+1)), i = 1, 2, ···, (n − 1), that satisfy condition (6.1) (with overlap counting). More generally, for 2 ≤ m ≤ n, we define a sequence of index functions for successions of size m: for i = 1, ···, n − m + 1,

$$I_n(\pi, m, i) = \begin{cases} 1 & \text{if } \pi(i+m-1) = \pi(i+m-2)+1 = \cdots = \pi(i)+m-1,\\ 0 & \text{otherwise.} \end{cases} \tag{6.2}$$

We then define the random variable

$$X_n(\pi, m) = \sum_{i=1}^{n-m+1} I_n(\pi, m, i), \tag{6.3}$$
the total number of successions of size m in the permutation π.

Every permutation π ∈ Π(Z_n) can be viewed as a sequence of n sub-permutations {π_1, π_2, ···, π_n}, where π_t is obtained by inserting the integer t into one of the gaps of the sub-permutation π_{t−1}, for t = 2, ···, n, with π_1 = (1). Let Ω = {0, 1, 2, ···, n − 1} be a state space and T_n = {0, 1, ···, n} be an index set. For m = 2, we define a sequence of transformations Y_t : Π(Z_n) → Ω, for t = 1, 2, ···, n and every π ∈ Π(Z_n), as

$$Y_t(\pi) = X_t(\pi_t, 2) = \text{the total number of successions of size 2 in the sub-permutation } \pi_t \text{ induced by } \pi. \tag{6.4}$$

For example, consider again the permutation π = (31524687), which can be decomposed into 8 sub-permutations by deleting the largest integer one by one from the permutation π: π_8 = π, π_7 = (3152467), π_6 = (315246), π_5 = (31524), π_4 = (3124), π_3 = (312), π_2 = (12) and π_1 = (1). Hence π is a realization of the random insertion procedure. Further, the realization of the imbedded chain {Y_t(π)}_{t=1}^{8} with respect to π = (31524687) is {Y_1(π) = 0, Y_2(π) = 1, Y_3(π) = 1, Y_4(π) = 1, Y_5(π) = 0, Y_6(π) = 0, Y_7(π) = 1 and Y_8(π) = 0}. From the method of insertion and the above example, it may
be observed that if Y_{t−1}(π) = k (for t = 2, ···, n and 0 ≤ k ≤ t − 2), then Y_t(π) can only be at state k − 1, k or k + 1. Since the integer "t" has equal probability of being inserted into any one of the t positions (gaps) of the sub-permutation π_{t−1}, the transition probabilities associated with the imbedded chain {Y_t} are specified by the following equation: for t = 2, ···, n and 0 ≤ k ≤ t − 2,

$$P(Y_t = x \mid Y_{t-1} = k) = \begin{cases} k/t & \text{if } x = k-1,\\ (t-k-1)/t & \text{if } x = k,\\ 1/t & \text{if } x = k+1, \end{cases} \tag{6.5}$$

with P(Y_0 = 0) = 1 and P(Y_1 = 0 | Y_0 = 0) = 1. It follows from Eqs. (6.4) and (6.5) that the sequence {Y_t : t ∈ T_n} forms a non-homogeneous Markov chain on Ω having transition probability matrices given by

$$M_t(n) = \begin{pmatrix} \frac{t-1}{t} & \frac{1}{t} & & & & \\ \frac{1}{t} & \frac{t-2}{t} & \frac{1}{t} & & & \\ & \frac{2}{t} & \frac{t-3}{t} & \frac{1}{t} & & \\ & & \ddots & \ddots & \ddots & \\ & & & \frac{t-2}{t} & \frac{1}{t} & \frac{1}{t} \\ & & & & & I_{n-t+1} \end{pmatrix} \tag{6.6}$$

for t = 2, ···, n, where M_t(n) contains the entries p_ij(t : n),

$$p_{ij}(t:n) = P(Y_t = j \mid Y_{t-1} = i), \quad i, j = 0, 1, \cdots, t-2,$$

defined by Eq. (6.5), and I_{n−t+1} is the (n − t + 1) × (n − t + 1) identity matrix. Let a(t) = (a_0(t), a_1(t), ···, a_{n−1}(t)), t = 1, 2, ···, n, be n vectors with components

$$a_i(t) = P(Y_t = i \mid Y_0 = 0), \quad i = 0, 1, \cdots, n-1. \tag{6.7}$$

For t = 0, the initial distribution of Y_0 gives a(0) = (1, 0, ···, 0), and for t = 1, the transition probability matrix M_1(n) has
entry p_00(1 : n) = P(Y_1 = 0 | Y_0 = 0) = 1 and has zero at all the other entries. It follows from Eqs. (6.4), (6.5), and (6.6) that X_n(π, 2) is finite Markov chain imbeddable with respect to the insertion procedure, and that

$$P(X_n(\pi, 2) = i) = P(Y_n(\pi) = i \mid Y_0 = 0). \tag{6.8}$$

Theorem 6.1 Given the initial distribution a(0), the distribution of X_n(π, 2) can be obtained by

$$P(X_n(\pi, 2) = i) = a(0)\Big(\prod_{t=1}^{n} M_t(n)\Big)e_i', \quad i = 0, 1, \cdots, n-1, \tag{6.9}$$

where M_t(n) is given by Eq. (6.6). Further, for n > 1, a_i(n) satisfies the following recursive equation:

$$a_i(n) = \begin{cases} \frac{n-1}{n}a_0(n-1) + \frac{1}{n}a_1(n-1) & \text{if } i = 0,\\[3pt] \frac{1}{n}a_{i-1}(n-1) + \frac{n-i-1}{n}a_i(n-1) + \frac{i+1}{n}a_{i+1}(n-1) & \text{if } 1 \le i \le n-3,\\[3pt] \frac{1}{n}a_{n-3}(n-1) + \frac{1}{n}a_{n-2}(n-1) & \text{if } i = n-2,\\[3pt] \frac{1}{n}a_{n-2}(n-1) & \text{if } i = n-1. \end{cases} \tag{6.10}$$

Proof. Since X_n(π, 2) is finite Markov chain imbeddable and {Y_t} is a non-homogeneous Markov chain with transition probability matrices given by Eq. (6.6), the first part of the theorem follows immediately from Theorem 2.1. The recursive equation follows from

$$a_i(n) = a(0)\Big(\prod_{t=1}^{n} M_t(n)\Big)e_i' = (a_0(n-1), a_1(n-1), \cdots, a_{n-1}(n-1))\,M_n(n)\,e_i'. \qquad\blacksquare$$
The recursive equation (6.10) provides a simple and direct way to show that the limiting distribution of the number of successions of size 2, X_n(π, 2), is a Poisson distribution with parameter λ = 1.

Theorem 6.2 Given a(0) = (1, 0, ···, 0),

$$\lim_{n\to\infty} P(X_n(\pi, 2) = x) = \frac{1}{x!}\,e^{-1}, \quad x = 0, 1, \cdots. \tag{6.11}$$

In order to prove the above result, we need the following simple lemma.

Lemma 6.1 Given a(0) = (1, 0, ···, 0), it follows that
(i) a_x(x + 1) = [(x + 1)!]^{−1}, for x = 0, 1, 2, ···, n − 1;
(ii) a_x(t) = 0, for t < x.

Proof. (i) For Y_{x+1} = x, the permutation π_{x+1} has to have the form (123 ··· x(x+1)), and this event has probability 1/(x + 1)!. (ii) For t < x, Y_t = x cannot happen, and hence a_x(t) = 0. ∎

Proof (Theorem 6.2). For x = 0, it follows from Eq. (6.10) and some simple algebra that

$$a_0(n) = \Big(1 - \frac{1}{n}\Big)a_0(n-1) + \frac{1}{n}a_1(n-1) = \Big(1 - \frac{1}{n}\Big)^{n-1}\big[a_0(1) + o(1)\big]. \tag{6.12}$$

Note that a_0(1) = 1. Furthermore, for fixed x (x ≥ 1), it again follows from Eq. (6.10) that

$$a_x(n) = \frac{1}{n}a_{x-1}(n-1) + \frac{n-x-1}{n}a_x(n-1) + \frac{x+1}{n}a_{x+1}(n-1) = \Big(1 - \frac{1}{n}\Big)^{n-1}\big[a_{x-1}(x) + o(1)\big]. \tag{6.13}$$

Taking n → ∞, the result (6.11) follows immediately from Eqs. (6.12) and (6.13), and from Lemma 6.1. This completes the proof. ∎
For m ≥ 3, the random variable X_n(π, m) has a degenerate limiting distribution at zero.

Theorem 6.3 For m ≥ 3,

$$\lim_{n\to\infty} P(X_n(\pi, m) = x) = \begin{cases} 1 & \text{if } x = 0,\\ 0 & \text{if } x \ge 1. \end{cases} \tag{6.14}$$
Proof. Let, for i = 1, 2, ···, n − m + 1, E_i be the event such that π(i) + m − 1 = π(i+1) + m − 2 = ··· = π(i+m−1). It follows from the definition of E_i and the random insertion procedure that P(E_i) = 1/n^{m−1} + o(1/n^{m−1}). From the Bonferroni inequality, it follows that

$$P(X_n(\pi,m) \ge 1) = P\Big(\bigcup_{i=1}^{n-m+1} E_i\Big) \le \sum_{i=1}^{n-m+1} P(E_i) \le \frac{1}{n^{m-2}} + o\Big(\frac{1}{n^{m-2}}\Big). \tag{6.15}$$
Taking n → ∞, the result (6.14) is an immediate consequence of inequality (6.15). ∎

From Theorems 6.2 and 6.3, we may conclude that in a very large random permutation, the number of successions of size 2 has a Poisson distribution with λ = 1, and that, with probability tending to one, there are no successions of size greater than 2. To illustrate Theorems 6.1 and 6.2, we give in Table 6.1 some numerical results for the exact distribution of X_n(π, 2) and its limiting distribution, the Poisson distribution with λ = 1.

Since the number of insertions can be viewed as an index set, we can define a waiting-time (number of insertions) random variable as follows:

$$W(k; 2) = \text{the smallest number of insertions required to have } k \text{ successions of size 2.} \tag{6.16}$$

With a minor modification to the Markov chain {Y_t} defined by Eq. (6.4), we are able to obtain a new imbedded Markov chain for the waiting-time random variable W(k; 2). Given k, we define a Markov chain {Y_t(k); t = 1, 2, ···} on the state space Ω_k = {0, 1, ···, k − 1, k} with the following transition probability matrices: for 2 ≤ t ≤ k + 1,

$$M_t(k) = \begin{pmatrix} \frac{t-1}{t} & \frac{1}{t} & & & \\ \frac{1}{t} & \frac{t-2}{t} & \frac{1}{t} & & \\ & \ddots & \ddots & \ddots & \\ & \frac{t-2}{t} & \frac{1}{t} & \frac{1}{t} & \\ & & & & I_{k-t+2} \end{pmatrix},$$
Table 6.1 Selected distributions of X_n(π, 2), and its limiting distribution, the Poisson distribution with λ = 1.

  x    n = 5     n = 10        n = 15        n = 20        Poisson(1)
  0    .44167    .40467        .39241        .38627        .36788
  1    .36667    .36788        .36788        .36788        .36788
  2    .15000    .16555        .17168        .17474        .18394
  3    .03333    .04905        .05314        .05518        .06131
  4    .00833    .01073        .01226        .01303        .01533
  5              .00184        .00225        .00245        .00307
  6              .00025        .00034        .00038        .00051
  7              .00003        .00004        .00005        .00007
  8              .248×10^-5    .487×10^-5    .593×10^-5    .912×10^-5
  9              .276×10^-6    .473×10^-6    .608×10^-6    .101×10^-5
  10                           .406×10^-7    .558×10^-7    .101×10^-6
  11                           .306×10^-8    .461×10^-8    .922×10^-8
  12                           .209×10^-9    .346×10^-9    .768×10^-9
  13                           .107×10^-10   .236×10^-10   .591×10^-10
  14                           .765×10^-12   .148×10^-11   .422×10^-11
  15                                         .844×10^-13   .281×10^-12
  16                                         .438×10^-14   .176×10^-13
  17                                         .211×10^-15   .103×10^-14
  18                                         .781×10^-17   .575×10^-16
  19                                         .411×10^-18   .302×10^-17
and for t ≥ k + 2,

$$M_t(k) = \begin{array}{c} 0\\ 1\\ \vdots\\ k-1\\ k \end{array}\!\!\begin{pmatrix} \frac{t-1}{t} & \frac{1}{t} & & & \\ \frac{1}{t} & \frac{t-2}{t} & \frac{1}{t} & & \\ & \ddots & \ddots & \ddots & \\ & & \frac{k-1}{t} & \frac{t-k}{t} & \frac{1}{t}\\ & & & 0 & 1 \end{pmatrix}. \tag{6.17}$$
The transition probability matrices M_t(n) and M_t(k) for the Markov chains {Y_t} and {Y_t(k)}, respectively, are very similar. The major difference is that the absorbing state "k" of the Markov chain {Y_t(k)} corresponds to the states k, k + 1, ···, (n − 1) of the Markov chain {Y_t}. The Markov chain {Y_t(k)} specified by the transition probability matrices M_t(k) has two important characteristics: (i) if Y_t(k) ≤ k − 1, it implies Y_i(k) ≤ k − 1 for all i ≤ t, and (ii) if Y_t(k) = k, it implies Y_i(k) = k for all i ≥ t. These two characteristics and the definitions of the two random variables Y_t(k) and W(k; 2) yield the conclusion that they are one-to-one related in the following way:

$$W(k; 2) \ge n \iff Y_{n-1}(k) \le k-1. \tag{6.18}$$
Hence the following theorem holds.

Theorem 6.4 For n ≥ k + 1 and k = 1, 2, ···, we have

$$P(W(k;2) = n) = a(0)\Big(\prod_{t=1}^{n-1} M_t(k)\Big)(I - M_n(k))\,U'(k),$$

where M_t(k) is defined by Eq. (6.17), a(0) = (1, 0, ···, 0)_{1×(k+1)}, and U(k) = (1, 1, ···, 1, 0)_{1×(k+1)}.

Proof. It follows from Eq. (6.18) that

$$\begin{aligned} P(W(k;2) = n) &= P(W(k;2) \ge n) - P(W(k;2) \ge n+1)\\ &= P(Y_{n-1}(k) \le k-1) - P(Y_n(k) \le k-1)\\ &= a(0)\Big(\prod_{t=1}^{n-1} M_t(k)\Big)(I - M_n(k))\,U'(k). \qquad\blacksquare \end{aligned}$$
For the special case k = 1, the distribution of the waiting time for the first succession of size 2 can be determined by the following recursive equation.

Theorem 6.5 For n ≥ 2 and with the convention P(W(1;2) = 1) = 0,

$$P(W(1;2) = n) = \frac{1}{n}\Big(1 - \sum_{i=1}^{n-1} P(W(1;2) = i)\Big).$$
Proof. For k = 1 and t = 1, the transition probability matrix M_1(1) is given by

$$M_1(1) = \begin{pmatrix} 1 & 0\\ 0 & 1 \end{pmatrix}.$$

For k = 1 and t = 2, 3, ···, the matrices M_t(k) given by Eq. (6.17) reduce to

$$M_t(1) = \begin{pmatrix} \frac{t-1}{t} & \frac{1}{t}\\ 0 & 1 \end{pmatrix}.$$

Since, by Theorem 6.4,

$$\sum_{i=1}^{n-1} P(W(1;2) = i) = 1 - (1,0)\Big(\prod_{t=1}^{n-1} M_t(1)\Big)(1,0)',$$

we have

$$P(W(1;2) = n) = (1,0)\Big(\prod_{t=1}^{n-1} M_t(1)\Big)(I - M_n(1))(1,0)' = \frac{1}{n}(1,0)\Big(\prod_{t=1}^{n-1} M_t(1)\Big)(1,0)'. \qquad\blacksquare$$

For example, it follows from the above theorem that P(W(1;2) = 2) = 1/2, P(W(1;2) = 3) = 1/6, and P(W(1;2) = 4) = 1/12. Note that (3421) and (2134) are the only two permutations with W(1;2) = 4 among the 24 possible permutations obtained from inserting the four integers 1, 2, 3 and 4. Comparing the result of Theorem 6.4 with Eqs. (5.2) and (5.22), loosely speaking, we could say that the waiting time W(k; 2) has a general geometric distribution with respect to the insertion procedure on random permutations.
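Both the recursion of Theorem 6.5 and the small-permutation count just quoted can be verified mechanically. The sketch below (our own illustration, with a hypothetical helper name; exact arithmetic via fractions) runs the recursion and, independently, enumerates all 24 insertion paths for n = 4.

```python
from itertools import product
from fractions import Fraction

# Recursion of Theorem 6.5: P(W(1;2)=n) = (1/n)(1 - sum_{i<n} P(W(1;2)=i)).
probs = {}
total = Fraction(0)
for n in range(2, 5):
    probs[n] = Fraction(1, n) * (1 - total)
    total += probs[n]

# Brute force: the first insertion step at which a succession appears
# (the chain absorbs there, even if a later insertion breaks the pair).
def first_succession(choices):
    perm = [1]
    for t, gap in enumerate(choices, start=2):
        perm.insert(gap, t)
        if any(perm[i + 1] == perm[i] + 1 for i in range(len(perm) - 1)):
            return t
    return None

paths = list(product(*[range(t) for t in range(2, 5)]))   # 24 paths for n = 4
count4 = sum(1 for c in paths if first_succession(c) == 4)

print(probs[2], probs[3], probs[4], count4)   # 1/2 1/6 1/12 2
```

The two paths with first succession at step 4 are exactly the permutations (3421) and (2134) mentioned above, and 2/24 = 1/12 agrees with the recursion.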
6.3 Eulerian and Simon Newcomb Numbers
Let us consider the following identity:

$$\frac{1-x}{1 - x\exp\{\lambda(1-x)\}} = 1 + \sum_{n=1}^{\infty}\sum_{k=1}^{n} A(n,k)\,x^k\,\frac{\lambda^n}{n!}. \tag{6.19}$$
The integers A(n, k), k = 1, 2, ···, n, on the right-hand side of Eq. (6.19) are known as the Eulerian numbers. Euler first introduced this identity in his famous book "Institutiones calculi differentialis" (1755, pp. 485–487), and the formula for A(n, k) given by him has the following form:

$$A(n,k) = \sum_{j=0}^{k}(-1)^j\binom{n+1}{j}(k-j)^n, \quad n \ge k \ge 1. \tag{6.20}$$

He also showed that A(n, k) satisfies the recursive equation

$$A(n,k) = kA(n-1,k) + (n-k+1)A(n-1,k-1). \tag{6.21}$$
Worpitzky (1883) proved that the Eulerian numbers may also be defined via the identity

$$x^n = \sum_{k=1}^{n}\binom{x+k-1}{n}A(n,k). \tag{6.22}$$
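These three classical facts can be cross-checked mechanically. The sketch below (our own illustration) computes A(n, k) from the closed form (6.20) and verifies the recursion (6.21) and Worpitzky's identity (6.22) over a small range.

```python
from math import comb

def eulerian(n, k):
    # Euler's closed form, Eq. (6.20)
    return sum((-1) ** j * comb(n + 1, j) * (k - j) ** n for j in range(k + 1))

# Recursion (6.21): A(n,k) = k A(n-1,k) + (n-k+1) A(n-1,k-1)
ok_rec = all(
    eulerian(n, k) == k * eulerian(n - 1, k) + (n - k + 1) * eulerian(n - 1, k - 1)
    for n in range(2, 8) for k in range(1, n + 1)
)

# Worpitzky's identity (6.22): x^n = sum_k C(x+k-1, n) A(n,k)
ok_wor = all(
    x ** n == sum(comb(x + k - 1, n) * eulerian(n, k) for k in range(1, n + 1))
    for x in range(1, 6) for n in range(1, 6)
)

print(ok_rec, ok_wor)    # True True
```

The closed form handles the boundary terms automatically, since A(m, k) = 0 for k > m (the alternating sum is a full finite difference of a lower-degree polynomial) and A(m, 0) = 0 for m ≥ 1.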
Roselle (1968) showed that the Eulerian numbers A(n, k) equal the number of permutations in Π(Z_n) with exactly k rises, as mentioned in Section 6.1.

Consider a deck S of N cards containing s_i cards of face value i, for i = 1, 2, ···, n, with s_1 + ··· + s_n = N. Turn over the top card and put it down face up. Turn over the second card and place it on top of the first card if its face value is less than or equal to the first; otherwise, start a new pile. The Simon Newcomb number A([s], k) with respect to the specification [s] = [s_1, ···, s_n] is defined to be the number of all possible permutations of the deck S which result in exactly k piles or rises, a definition equivalent to that given in Section 6.1. Dillon and Roselle (1969) gave the following formula for the Simon Newcomb numbers: for k = 1, 2, ···, n,

$$A([s],k) = \sum_{j=0}^{k}(-1)^j\binom{N+1}{j}\prod_{i=1}^{n}\binom{s_i+k-j-1}{s_i}. \tag{6.23}$$
Given a permutation π ∈ Π_N(S), we define a random variable R_N(π), the number of rises (increases) in the permutation π, as

$$R_N(\pi) = \sum_{i=1}^{N} I(\pi, i),$$

where the I(π, i) are index functions defined as

$$I(\pi, i) = \begin{cases} 1 & \text{if } \pi(i) > \pi(i-1),\\ 0 & \text{otherwise,} \end{cases}$$
for i = 2, 3, ···, N, and I(π, 1) = 1 by convention.

The insertion procedure associated with the [s]-specification considered here can be described as inserting integers one by one randomly between gaps, starting with s_1 integers "1", followed by s_2 integers "2", and continuing this procedure until all the integers have been inserted. The main purpose here is to show that the Simon Newcomb number A([s], k) (or the number of rises R_N(π)) is finite Markov chain imbeddable with respect to the above insertion procedure. Our approach also establishes a recursive equation for A([s], k). The Eulerian numbers A(n, k) will be treated as a special case of the Simon Newcomb numbers with [s] = [1, ···, 1].

Let s* = max(s_1, ···, s_n) and N* = N − s* + 1. Dillon and Roselle (1969) showed that N* is the maximum number of rises (by convention, the first gap is a rise and the last gap is a fall) in a random permutation π ∈ Π_N(S). For every 1 ≤ t ≤ N, there exists an integer h = 0, 1, 2, ···, n − 1 such that

$$\sum_{i=1}^{h} s_i < t \le \sum_{i=1}^{h+1} s_i$$

(with the convention Σ_{i=1}^{0} s_i = 0). Further, for every t, we define

$$\begin{aligned} s_{h+1}(t) &= t - \sum_{i=1}^{h} s_i,\\ l(t) &= t - \sum_{i=1}^{h} s_i - 1 = s_{h+1}(t) - 1,\\ s^*(t) &= \max(s_1, \cdots, s_h, s_{h+1}(t)),\\ N_t^* &= t - s^*(t) + 1, \qquad N^* = N - s^*(N) + 1,\\ [s]_t &= [s_1, \cdots, s_h, s_{h+1}(t)], \end{aligned}$$
and π_t as the permutation generated by the insertion procedure for the first t integers. Consider a sequence of transformations {Y_t}_{t=1}^{N} : Π_N(S) → Ω = {1, 2, ···, N*} defined by

$$Y_t(\pi) = \text{the total number of rises in } \pi_t, \quad t = 1, \cdots, N.$$

Lemma 6.2 For every t, 1 ≤ t ≤ N,

(i) N_t^* is the maximum number of rises in a random permutation π_t generated by integers with specification [s]_t, and
(ii) l(t) equals the total number of falls and levels containing the integer "h+1" in a random permutation π_{t−1} generated by the integers with specification [s_1, ···, s_h, s_{h+1}(t) − 1].

Proof. Result (i) is due to Dillon and Roselle (1969). The variable l(t) = s_{h+1}(t) − 1 ≥ 0 is the number of integers "h+1" in a random permutation π_{t−1} generated by the specification [s_1, ···, s_h, s_{h+1}(t) − 1]. If l(t) = 0, Result (ii) is obviously true. Now consider l(t) ≥ 1. Since "h+1" is the largest integer in the permutation π_{t−1}, what follows immediately after each integer "h+1" must be either a fall or a level. Hence the total number of falls and levels involving the integer "h+1" equals l(t) = s_{h+1}(t) − 1. ∎
$$P(R_N(\pi) = k) = \xi_0 \left( \prod_{t=1}^{N} M_t([s]) \right) e_k', \tag{6.24}$$
where $\xi_0 = (1, 0, \cdots, 0)$, and the transition probability matrices $M_t([s])$, $t = 1, \cdots, N$, are given by
$$M_t([s]) = \begin{pmatrix}
\frac{1+l(t)}{t} & \frac{t-l(t)-1}{t} & & & & & \\
 & \frac{2+l(t)}{t} & \frac{t-l(t)-2}{t} & & & & \\
 & & \ddots & \ddots & & & \\
 & & & \frac{N_t^*-1+l(t)}{t} & \frac{t-l(t)-N_t^*+1}{t} & & \\
 & & & & 1 & & \\
 & & & & & \ddots & \\
 & & & & & & 1
\end{pmatrix}, \tag{6.25}$$
an $N^* \times N^*$ matrix in which all unmarked entries are zero; it can be partitioned as $(A_t \,\vdots\, B_t)$, where $A_t$ consists of the first $N_t^* - 1$ columns and $B_t$ of the remaining $N^* - N_t^* + 1$ columns. The states $x \ge N_t^*$ cannot yet be reached at step $t$ and are held fixed by the identity block in the lower right corner.
Proof. If $Y_{t-1} = x$ and the $(l(t)+1)$-th copy of the integer "$h+1$" is inserted into one of the $t$ gaps of the permutation $\pi_{t-1}$, then, following from Lemma 6.2, we have that (i) the number of rises remains $x$ ($Y_t = x$) when the integer "$h+1$" is inserted into either the $x$ gaps of rises or the $l(t)$ gaps of falls or levels associated with the integers "$h+1$" in $\pi_{t-1}$, and (ii) the number of rises becomes $x+1$ ($Y_t = x+1$) when the integer "$h+1$" is inserted into any of the $(t - x - l(t))$ gaps of falls or levels not associated with the integers "$h+1$" in $\pi_{t-1}$. Hence the transition probabilities associated with $Y_t$, $t = 1, \cdots, N$, are defined as follows:
$$P(Y_t = y \mid Y_{t-1} = x) = \begin{cases} \dfrac{x + l(t)}{t} & \text{if } y = x,\ 1 \le x \le N_t^* - 1, \\[4pt] \dfrac{t - x - l(t)}{t} & \text{if } y = x + 1,\ 1 \le x \le N_t^* - 1, \\[4pt] 1 & \text{if } y = x,\ x \ge N_t^*, \\ 0 & \text{otherwise.} \end{cases} \tag{6.26}$$
The above equation yields the transition probability matrices given in Eq. (6.25), where the state space is $\Omega = \{1, \cdots, N^*\}$. It follows from the definition of $Y_t$ that $P(R_N(\pi) = k) = P(Y_N = k)$, for all $k \in \Omega$. Hence $R_N(\pi)$ is finite Markov chain imbeddable, and Eq. (6.24) follows immediately from Theorem 2.1. □

Corollary 6.1
For $k = 1, 2, \cdots, N^*$, the Simon Newcomb numbers equal
$$A([s], k) = \frac{N!}{\prod_{i=1}^{n} s_i!} \, \xi_0 \left( \prod_{t=1}^{N} M_t([s]) \right) e_k'. \tag{6.27}$$
With $A([s], 1) = 1$, the recursive equation
$$A([s], k) = (N - l(N) - k + 1)\, A([s'], k-1) + (k + l(N))\, A([s'], k) \tag{6.28}$$
holds for $k = 2, \cdots, N^*$, where the $[s']$-specification is the $[s]$-specification with deletion of one of the $s_n$ integers "$n$".

Proof. Since there are $N!/(\prod_{i=1}^{n} s_i!)$ random permutations in $\Pi_N(S)$ and they are equally likely, Eq. (6.27) follows from the fact that the Simon Newcomb numbers $A([s], k)$ equal the product of $N!/(\prod_{i=1}^{n} s_i!)$ and $P(R_N(\pi) = k)$. By definition, the first gap is a rise, and hence if there is only one rise in the permutation $\pi$, then the permutation has to have the form $\pi = (n \cdots n\, (n-1) \cdots (n-1) \cdots 1 \cdots 1)$ and $A([s], 1) = 1$. The recursive Eq. (6.28) follows directly from Eqs. (6.24) and (6.25) of Theorem 6.6, since, for $k = 2, \cdots, N^*$,
$$P(R_N(\pi) = k) = \xi_0 \left( \prod_{t=1}^{N-1} M_t([s]) \right) \left( M_N([s])\, e_k' \right) = \frac{N - l(N) - k + 1}{N}\, P(R_{N-1}(\pi) = k-1) + \frac{k + l(N)}{N}\, P(R_{N-1}(\pi) = k). \tag{6.29}$$
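Theorem 6.6 and Corollary 6.1 translate directly into a short computation. The following sketch (ours, not the authors') builds the matrices $M_t([s])$ from the transition probabilities of Eq. (6.26) and propagates $\xi_0$ through them, using exact rational arithmetic; all function names are our own:

```python
from fractions import Fraction

def rise_distribution(s):
    """P(R_N = k), k = 1..N^*, for a random permutation with specification
    s = [s_1,...,s_n], via the imbedding of Theorem 6.6 (Eqs. (6.24)-(6.26))."""
    N = sum(s)

    def params(t):
        # locate h with sum(s[:h]) < t <= sum(s[:h+1]); return N_t^*, l(t)
        acc = 0
        for h, sh in enumerate(s):
            if acc < t <= acc + sh:
                s_h1_t = t - acc                     # s_{h+1}(t)
                s_star_t = max(s[:h] + [s_h1_t])     # s^*(t)
                return t - s_star_t + 1, s_h1_t - 1  # N_t^*, l(t)
            acc += sh

    Nstar = params(N)[0]
    xi = [Fraction(1)] + [Fraction(0)] * (Nstar - 1)  # xi_0 = (1,0,...,0)
    for t in range(2, N + 1):                         # M_1([s]) is the identity
        Nt, lt = params(t)
        new = [Fraction(0)] * Nstar
        for x in range(1, Nstar + 1):                 # states x = 1,...,N^*
            px = xi[x - 1]
            if px == 0:
                continue
            if x <= Nt - 1:                           # Eq. (6.26)
                new[x - 1] += px * Fraction(x + lt, t)
                new[x] += px * Fraction(t - x - lt, t)
            else:                                     # x >= N_t^*: held fixed
                new[x - 1] += px
        xi = new
    return xi

print(rise_distribution([1, 3, 1]))   # [1/20, 1/2, 9/20]
```

Running it with $[s] = [1, 3, 1]$ reproduces the distribution worked out in Example 6.1 below, and multiplying by $N!/\prod_i s_i!$ gives the Simon Newcomb numbers of Eq. (6.27).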
To make the insertion procedure and the above theorem more transparent, a simple example is given below.

Example 6.1 Let $S = \{1, 2, 2, 2, 3\}$. Then $S$ has the specification $[s] = [1, 3, 1]$ with $N = 5$. The maximum number of rises in a random permutation $\pi \in \Pi_5(S)$ is $N^* = 3$. For $t = 2, 3, 4$, and $5$, we have $N_2^* = 2$, $l(2) = 0$; $N_3^* = 2$, $l(3) = 1$; $N_4^* = 2$, $l(4) = 2$; and $N_5^* = N^* = 3$, $l(5) = 0$. It follows from Eq. (6.25) that the transition probability matrices associated with the finite Markov chain $\{Y_t\}$, $t = 2, \cdots, 5$, are
$$M_2([s]) = \begin{pmatrix} 1/2 & 1/2 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad M_3([s]) = \begin{pmatrix} 2/3 & 1/3 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix},$$
$$M_4([s]) = \begin{pmatrix} 3/4 & 1/4 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad M_5([s]) = \begin{pmatrix} 1/5 & 4/5 & 0 \\ 0 & 2/5 & 3/5 \\ 0 & 0 & 1 \end{pmatrix}.$$
With $M_1([s]) = I_{3 \times 3}$ and $\xi_0 = (1, 0, 0)$, therefore, for $1 \le k \le N^* = 3$,
$$P(R_5(\pi) = k) = \xi_0 \left( \prod_{t=1}^{5} M_t([s]) \right) e_k' = \left( \frac{1}{20}, \frac{10}{20}, \frac{9}{20} \right) e_k'.$$
Hence $P(R_5(\pi) = 1) = 1/20$, $P(R_5(\pi) = 2) = 10/20$, and $P(R_5(\pi) = 3) = 9/20$. Since there are $20\,(= 5!/3!)$ possible permutations in $\Pi_5(S)$, the Simon Newcomb numbers are $A([s], 1) = 1$, $A([s], 2) = 10$, and $A([s], 3) = 9$. This result can be checked by writing out the 20 permutations in $\Pi_5(S)$ and classifying them into three subsets of permutations, according to the number of rises (1, 2, or 3): (i) {(32221)}, (ii) {(31222), (32122), (32212), (21322), (22132), (22213), (23221), (22321), (22231), (13222)}, and (iii) {(12223), (12232), (12322), (21223), (21232), (22123), (23122), (23212), (22312)}. ◊

For the case $s_1 = \cdots = s_n = 1$, it is easy to check that (i) $N_t^* = t$ and $l(t) = 0$, for all $t = 1, 2, \cdots, n$, (ii) $\Omega = \{1, 2, \cdots, n\}$, and (iii) Eq. (6.26) reduces to
$$P(Y_t = y \mid Y_{t-1} = x) = \begin{cases} x/t & \text{if } y = x,\ 1 \le x \le t-1, \\ (t-x)/t & \text{if } y = x+1,\ 1 \le x \le t-1, \\ 1 & \text{if } y = x,\ x \ge t, \\ 0 & \text{otherwise.} \end{cases} \tag{6.30}$$
The following corollary gives the exact distribution of $R_n(\pi)$, and hence the Eulerian numbers.

Corollary 6.2 Given a random permutation $\pi \in \Pi(Z_n)$:

(i) The distribution of the number of rises $R_n(\pi)$ is given by
$$P(R_n(\pi) = k) = \xi_0 \left( \prod_{t=1}^{n} M_t \right) e_k', \qquad k = 1, 2, \cdots, n, \tag{6.31}$$
where $\xi_0 = (1, 0, \cdots, 0)$, and the transition probability matrices $M_t$, $t = 1, 2, \cdots, n$, equal
$$M_t = \begin{pmatrix}
\frac{1}{t} & \frac{t-1}{t} & & & & & \\
 & \frac{2}{t} & \frac{t-2}{t} & & & & \\
 & & \ddots & \ddots & & & \\
 & & & \frac{t-1}{t} & \frac{1}{t} & & \\
 & & & & 1 & & \\
 & & & & & \ddots & \\
 & & & & & & 1
\end{pmatrix}_{n \times n}, \tag{6.32}$$
with all unmarked entries zero.

(ii) The Eulerian number $A(n, k)$, the number of random permutations in $\Pi(Z_n)$ having exactly $k$ rises, is given by
$$A(n, k) = n! \, \xi_0 \left( \prod_{t=1}^{n} M_t \right) e_k', \tag{6.33}$$
and satisfies the recursive equation
$$A(n, k) = k\, A(n-1, k) + (n - k + 1)\, A(n-1, k-1). \tag{6.34}$$

Proof. Since $Z_n$ has specification $[s] = [1, 1, \cdots, 1]$, the first part of the corollary follows immediately from Theorem 6.6 and Eq. (6.30). The size of $\Pi(Z_n)$ is $n!$, and the result (6.33) follows directly from the fact that all permutations are equally likely with respect to the random insertion procedure. Note that, by (i), the probability $P(R_n(\pi) = k)$ satisfies the recursive equation
$$P(R_n(\pi) = k) = \begin{cases} \dfrac{1}{n}\, P(R_{n-1}(\pi) = 1) & \text{if } k = 1, \\[4pt] \dfrac{k}{n}\, P(R_{n-1}(\pi) = k) + \dfrac{n-k+1}{n}\, P(R_{n-1}(\pi) = k-1) & \text{if } k = 2, \cdots, n-1, \\[4pt] \dfrac{1}{n}\, P(R_{n-1}(\pi) = n-1) & \text{if } k = n. \end{cases} \tag{6.35}$$
The recursive equation (6.34) is a direct consequence of $A(n, k) = n!\, P(R_n(\pi) = k)$ and Eq. (6.35). □
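As a quick check on the recursion (6.34), a short sketch (ours, not from the book):

```python
def eulerian(n, k):
    """Eulerian number A(n,k) computed from the recursion in Eq. (6.34),
    with A(1,1) = 1 and A(n,k) = 0 outside 1 <= k <= n."""
    if k < 1 or k > n:
        return 0
    if n == 1:
        return 1
    return k * eulerian(n - 1, k) + (n - k + 1) * eulerian(n - 1, k - 1)

print([eulerian(4, k) for k in range(1, 5)])   # [1, 11, 11, 1]
```

The row sums recover $n!$, as they must, since $A(n, k) = n!\, P(R_n(\pi) = k)$.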
The well-known Worpitzky (1883) identity for the Eulerian numbers, as given by Eq. (6.22), follows immediately from Eq. (6.33) and the identity
$$\sum_{k=1}^{n} \binom{x+k-1}{n} A(n, k) = x^n, \tag{6.36}$$
as shown below:
$$\sum_{k=1}^{n} \binom{x+k-1}{n} A(n, k) = n! \, \xi_0 \left( \prod_{t=1}^{n} M_t \right) \begin{pmatrix} \binom{x}{n} \\ \vdots \\ \binom{x+n-1}{n} \end{pmatrix} = x\, (n-1)! \, \xi_0 \left( \prod_{t=1}^{n-1} M_t \right) \begin{pmatrix} \binom{x}{n-1} \\ \vdots \\ \binom{x+n-2}{n-1} \end{pmatrix} = \cdots = x^{n-1} \cdot x = x^n,$$
where each step absorbs the last matrix of the product into the column vector of binomial coefficients and extracts a factor of $x$.
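Identity (6.36) is also easy to verify numerically; a small sketch (ours, not from the book):

```python
from math import comb

def eulerian(n, k):
    # A(n,k) via the recursion (6.34)
    if k < 1 or k > n:
        return 0
    if n == 1:
        return 1
    return k * eulerian(n - 1, k) + (n - k + 1) * eulerian(n - 1, k - 1)

# Worpitzky's identity (6.36): sum_k C(x+k-1, n) A(n,k) = x^n
n = 4
for x in range(1, 8):
    assert sum(comb(x + k - 1, n) * eulerian(n, k) for k in range(1, n + 1)) == x**n
print("identity holds for n = 4, x = 1..7")
```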
Further, Euler's original form of the definition of $A(n, k)$ given in Eq. (6.19) can also be derived from the method of finite Markov chain imbedding. Note that the Eulerian polynomial satisfies the equation
$$A_n(x) = \sum_{k=1}^{n} A(n, k)\, x^k = n! \, \xi_0 \left( \prod_{t=1}^{n} M_t \right) \begin{pmatrix} x \\ x^2 \\ \vdots \\ x^n \end{pmatrix} = n x A_{n-1}(x) + x(1-x) D A_{n-1}(x), \tag{6.37}$$
where $D = d/dx$ and $A_0(x) = 1$ for all $x$, and its exponential generating function,
$$\varphi(x, \lambda) = \sum_{n=0}^{\infty} A_n(x) \frac{\lambda^n}{n!}, \tag{6.38}$$
satisfies the differential equation $(1 - x\lambda)\, \varphi_\lambda(x, \lambda) = x\, \varphi(x, \lambda) + x(1-x)\, \varphi_x(x, \lambda)$, whose solution is
$$\varphi(x, \lambda) = \frac{1 - x}{1 - x \exp\{\lambda(1-x)\}}. \tag{6.39}$$
Identity (6.19) follows naturally from Eqs. (6.37)-(6.39). Similarly, we could find the distribution for the number of falls (decreases), $D_N(\pi)$, in the same way in which we found the distribution for $R_N(\pi)$. For the special case when $[s] = [1, 1, \cdots, 1]$, we have $R_n(\pi) + D_n(\pi) = n + 1$. This yields that the distribution of $D_n(\pi)$ is given by
$$P(D_n(\pi) = k) = P(R_n(\pi) = n - k + 1), \qquad k = 1, \cdots, n,$$
and that their joint distribution is given by
$$P(D_n(\pi) = k,\ R_n(\pi) = n - k + 1) = P(R_n(\pi) = n - k + 1),$$
for $k = 1, \cdots, n$, and 0 otherwise. For general cases, with some $s_i \ge 2$, the joint distribution of $D_N(\pi)$ and $R_N(\pi)$ is rather complex (see Fu and Lou 2000b).
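As a numerical check, the polynomial recursion in Eq. (6.37) can be verified directly from the Eulerian numbers (a sketch of ours, not from the book):

```python
def eulerian(n, k):
    # A(n,k) via the recursion (6.34)
    if k < 1 or k > n:
        return 0
    if n == 1:
        return 1
    return k * eulerian(n - 1, k) + (n - k + 1) * eulerian(n - 1, k - 1)

def A_poly(n, x):
    """Eulerian polynomial A_n(x) = sum_{k=1}^{n} A(n,k) x^k."""
    return sum(eulerian(n, k) * x**k for k in range(1, n + 1))

# check A_n(x) = n x A_{n-1}(x) + x (1 - x) D A_{n-1}(x)   (Eq. (6.37))
n, x = 5, 2.0
dA = sum(k * eulerian(n - 1, k) * x**(k - 1) for k in range(1, n))   # D A_{n-1}(x)
assert abs(A_poly(n, x) - (n * x * A_poly(n - 1, x) + x * (1 - x) * dA)) < 1e-9
print("recursion (6.37) checks out at n = 5, x = 2")
```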
Chapter 7

Applications

7.1 Introduction

The application of the finite Markov chain imbedding technique started in the early 1980s for the purpose of evaluating the reliabilities of various engineering systems, especially the consecutive-k-out-of-n:F system. Since then, it has been used, for example, in hypothesis testing by Lou (1996, 1997), Koutras and Alexandrou (1997a), and Johnson (2001), in quality control by Chao (1999) and Fu, Spiring and Xie (2002), in continuity of care by Lou (2000, 2001) and Fu and Lou (2000a), and in DNA sequence matching by Fu, Lou and Chen (1999), Cheung (2002), and Lou (2003). In this chapter, our main goal is to present some of these works in detail, and to provide a scope for further applications of this approach to other areas.
7.2 System Reliability

Engineering systems such as atomic power plants, aircraft, automobiles, computers, and software programs are required to be highly reliable. Reliability evaluation has become an important and integral part of the planning, design, and operation of all types of engineering systems. There are various reliability structures, including series systems, parallel systems, k-out-of-n:F systems, consecutive-k-out-of-n:F systems, linear systems, and deteriorating and repairable systems. It is widely known that the reliability of series systems is low, especially when the number of components in the system is large; on the other hand, parallel systems have high reliability but are expensive to build. Recently, consecutive-k-out-of-n:F systems
have become very popular for their high reliability and relatively low cost; the analysis of these systems will be the focus of this section.

Let $\{X_i\}_{i=1}^{n}$ be a sequence of two-state indicator random variables for the proper functioning of $n$ components: $X_i = 1$ if and only if the $i$-th component works, and $X_i = 0$ otherwise. The function $\phi(X_1, \cdots, X_n)$ is commonly referred to as the structure function of the system. For example, a series system can be represented by $\phi(X_1, \cdots, X_n) = \min\{X_1, \cdots, X_n\}$, while $\phi^*(X_1, \cdots, X_n) = \max\{X_1, \cdots, X_n\}$ denotes a parallel system. It is well known that for any system, the structure function $\phi$ can be written in the form
$$\phi(X_1, \cdots, X_n) = \min_j \{\phi_j(X_1, \cdots, X_n)\}, \tag{7.1}$$
where $\phi_j$ is the structure function of a parallel system (see, for example, Ross 2000). This means that any complex system can be decomposed into a series system with each component equivalent to a parallel subsystem. The reliability of the system, $R_s$, is the probability that the system is functioning, or, equivalently,
$$R_s = E(\phi(X_1, \cdots, X_n)). \tag{7.2}$$
Although the structure function in Eq. (7.1) and the reliability function in Eq. (7.2) are presented in simple mathematical form, they are unworkable for most cases, especially when the structure functions $\phi_j$ are complex, as in, for example, consecutive-k-out-of-n:F systems (denoted C(k,n:F)), deteriorating and repairable systems, or when the components $X_i$ are multi-state Markov-dependent trials. The reliability of the C(k,n:F) system was studied extensively during the 1980s via combinatorial analysis by, for example, Kontoleon (1980), Chiang and Niu (1981), Hwang (1982, 1986), Papastavridis (1988), Chrysaphinou and Papastavridis (1988), and Kossow and Preuss (1989). From the point of view of the theory of runs and patterns, the C(k,n:F) system is a linear system containing $n$ components, and the system works if and only if no pattern (or failure run) $\Lambda = F \cdots F$ of size $k$ has occurred in the sequence of $n$ two-state trials. Mathematically, the reliability of the C(k,n:F) system is
$$R(k, n\!:\!q) = P(W(\Lambda) \ge n + 1). \tag{7.3}$$
More generally, the reliability of any engineering system with $n$ components equals the tail probability of a waiting-time random variable $W(\Lambda)$ of a specific pattern $\Lambda$ (simple, compound, or series). This fundamental connection paves a simple and effective way for evaluating the reliability using the finite Markov chain imbedding technique, without the usual strong restrictions that the components have to be independent and identically distributed.

7.2.1 Consecutive-k-out-of-n:F System

The following two examples provide descriptions of C(k,n:F) systems (Chiang and Niu 1981, and Chao, Fu and Koutras 1995).

Example 7.1 A sequence of $n$ microwave stations transmits information from place A to B. The microwave stations are equally spaced between A and B. Each microwave station is able to transmit information a distance of up to $k$ microwave stations away. This system fails if and only if $k$ or more consecutive microwave stations fail. ◊

Example 7.2 A system for transporting oil by pipes from point A to B has $n$ pump stations. The pump stations are equally spaced between A and B, and each pump station can transport the oil a distance of $k$ pump stations away. If one pump station is down, the flow of oil is not interrupted, because the next station can carry the load. However, when $k$ or more consecutive pump stations fail, the oil flow stops and the system fails. ◊

Kontoleon (1980) studied the reliability of the C(k,n:F) system where all the components are stochastically independent and have the same failure probability (i.e. Bernoulli trials). For such a system, Chiang and Niu (1981) provide the recursive equation
$$R(k, n\!:\!q) = p^{n-k+1} + \sum_{r=1}^{n-k+1} \sum_{m=r+1}^{r+k-1} R(k, n-m\!:\!q)\, p^r q^{m-r}, \tag{7.4}$$
with $R(k, j\!:\!q) = 1$ if $0 \le j < k$. Derman, Lieberman, and Ross (1982) expressed $R(k, n\!:\!q)$ in the form
$$R(k, n\!:\!q) = \sum_{j=0}^{n} N(n, j, k)\, p^{n-j} q^{j}, \tag{7.5}$$
where the coefficients $N(n, j, k)$ are referred to as Fibonacci numbers of order $k$. For i.i.d. cases, Philippou and Makri (1986) and Hirano (1986) independently developed the exact formula for the distribution of the random variable $N_{n,k}$ (see Eq. (3.1) of Section 3.2), and the reliability of a C(k,n:F) system equals the probability that $N_{n,k} = 0$. In a series of papers by Fu (1985, 1986), Fu and Hu (1987), and Chao and Fu (1989, 1991), the finite Markov chain imbedding technique, with forward and backward principle, was introduced to study the reliability of the C(k,n:F) system, its bounds, and its limiting distributions. We summarize below the main results of these papers.

Theorem 7.1 Assuming all the components operate independently and have the same failure probability $q$, then

(i)
$$R(k, n\!:\!q) = \xi N^n(\Lambda) \mathbf{1}', \tag{7.6}$$
where $\Lambda = F \cdots F$ is a simple pattern of length $k$, $\xi = (1, 0, \cdots, 0)$, and the matrix $N(\Lambda)$, with rows and columns indexed by the states $0, 1, 2, \cdots, k-1$, has the form
$$N(\Lambda) = \begin{pmatrix}
p & q & & & \\
p & & q & & \\
\vdots & & & \ddots & \\
p & & & & q \\
p & & & &
\end{pmatrix}_{k \times k}, \tag{7.7}$$
all unmarked entries being zero;

(ii)
$$(1 - q^k)^{n-k+1} \le R(k, n\!:\!q) \le (1 - p q^k)^{n-k+1}; \tag{7.8}$$
and

(iii)
$$R(k, n\!:\!q) \sim \exp\{n \log \lambda_{[1]}\}, \tag{7.9}$$
where $\lambda_{[1]}$ ($0 < \lambda_{[1]} < 1$) is the largest eigenvalue of $N(\Lambda)$.

With some simple modifications to the transition probability matrices, the results (i) and (iii) of Theorem 7.1 also hold for Markov-dependent sequences of components (see Chapter 3 for simple patterns). Inequality (7.8) has been studied by Fu (1985, 1986), Cai (1994), Papastavridis and Koutras (1993), and recently by Muselli (2000). We would like to point out an important connection between the lower bound of the inequality (7.8) and Poisson convergence, when $q$ is very small. Assuming the failure probability of each component within the period $[0, s]$ is $q = (\lambda s / n)^{1/k}$, it follows from the inequality (7.8) that
$$\lim_{n \to \infty} R(k, n\!:\!q) = \exp\{-\lambda s\}, \qquad \lambda, s > 0. \tag{7.10}$$
This implies that, if $q$ is sufficiently small for a given large $n$, the reliability of the system in the time period $[0, s]$ is approximately
$$R(k, n\!:\!q) \sim \exp\{-n q^k\}. \tag{7.11}$$
In other words, if, once the system breaks down, it is replaced immediately by a new system, then the number of failures of the system $N_s(\Lambda)$ during the period $[0, s]$ has a Poisson distribution in the sense that
$$P(N_s(\Lambda) \le x) \sim \sum_{j=0}^{x} \frac{(\lambda s)^j}{j!} \exp\{-\lambda s\}, \tag{7.12}$$
as $q = (\lambda s / n)^{1/k} \to 0$. Results (7.10) and (7.12) have been studied by, for example, Chao and Lin (1984), Fu (1986), Papastavridis (1988), Godbole (1991), and Koutras and Papastavridis (1993).

For fixed failure probability $q$, we would also like to point out an interesting connection between the upper bound $(1 - p q^k)^{n-k+1}$ of the inequality (7.8) and the famous Goncharov (1944) approximation for the longest failure run: for large $n$,
$$R(k, n\!:\!q) = P(L_n(F) < k) \sim \exp\{-\lambda_n [1 + o(1)]\}, \tag{7.13}$$
where $\lambda_n = n p q^{[\log_{1/q} n] + x}$ and $x = k - [\log_{1/q} n]$. To see this connection, let us take $k = [\log_{1/q} n] + x$; then the upper bound of the inequality (7.8) yields
$$(1 - p q^k)^{n-k+1} = \left( 1 - \frac{n p q^{[\log_{1/q} n] + x}}{n} \right)^{n - [\log_{1/q} n] - x + 1} = (1 - \lambda_n / n)^{n(1 + o(1))} \sim \exp\{-\lambda_n (1 + o(1))\}.$$
It is well known that $L_n(F) - [\log_{1/q} n]$ does not have a limiting distribution (see Goncharov 1944 and Vaggelatou 2003), meaning that Eq. (7.13) is only a local approximation along the sequence $\{[\log_{1/q} n] + x,\ x = 0, \pm 1, \pm 2, \cdots\}$.
Hence we do not expect this approximation to perform well for fixed $k$ and large $n$. Further, since the Poisson approximation (7.11) for $R(k, n\!:\!q)$ requires $q = (\lambda s / n)^{1/k}$ tending to zero as $n \to \infty$, we also do not expect it to perform well numerically for fixed $q$ and large $n$. If $q$ is very small (and hence $p$ is close to 1), then the lower and upper bounds of the inequality (7.8) are almost equal, hence providing a good approximation for the exact reliability $R(k, n\!:\!q)$. In other words, for small $q$, the lower bound of Eq. (7.8) performs better than the Poisson approximation given by Eq. (7.11).

We would first like to provide some numerical results for Theorem 7.1 before providing its proof. Let us introduce the following notation:

E: The exact reliability $R(k, n\!:\!q)$ in Eq. (7.6),
U: The upper bound of $R(k, n\!:\!q)$ in Eq. (7.8),
L: The lower bound of $R(k, n\!:\!q)$ in Eq. (7.8),
A_L: The large deviation approximation of $R(k, n\!:\!q)$ in Eq. (7.9),
A_G: The Goncharov approximation of $R(k, n\!:\!q)$ in Eq. (7.13).

Table 7.1 gives the exact probability $R(k, n\!:\!q)$ and its various bounds and approximations for i.i.d. two-state trials with $q = 0.1$ and $k = 4$, and Table 7.2 shows the effect of increasing the failure probability to $q = 0.3$.

Table 7.1 Comparisons of the exact reliability $R(k, n\!:\!q)$ to bounds and approximations for i.i.d. two-state trials with $q = 0.1$ and $k = 4$.

n        L        A_G      E        A_L      U
50       0.9950   0.9955   0.9958   0.9955   0.9958
100      0.9901   0.9910   0.9913   0.9910   0.9913
1000     0.9048   0.9139   0.9141   0.9139   0.9142
10000    0.3679   0.4066   0.4065   0.4064   0.4067

In Table 7.3, we provide some numerical results for the reliability of the C(k,n:F) system under a Markov-dependent structure of components having the transition probability matrix
$$A^* = \begin{pmatrix} 0.75 & 0.25 \\ 0.25 & 0.75 \end{pmatrix}.$$

Table 7.2 Comparisons of the exact reliability $R(k, n\!:\!q)$ to bounds and approximations for i.i.d. two-state trials with $q = 0.3$ and $k = 4$.

n        L         A_G       E         A_L       U
50       0.6659    0.7531    0.7590    0.7475    0.7654
100      0.4434    0.5672    0.5674    0.5588    0.5761
1000     0.0003    0.0035    0.0030    0.0030    0.0035
10000    4.77e-36  2.37e-25  5.35e-26  5.27e-26  2.06e-25

Table 7.3 Comparisons of the exact reliability $R(k, n\!:\!q)$ to the large deviation approximation for Markov-dependent two-state trials with transition probability matrix $A^*$ and initial probabilities $p_0$.

              k=4                  k=8
n        E         A_L        E         A_L
20       0.2094    0.1841     0.7375    0.6673
50       0.0165    0.0145     0.4020    0.3637
100      2.40e-04  2.11e-04   0.1462    0.1323
1000     2.01e-37  1.77e-37   1.81e-09  1.64e-09
10000                         1.55e-88  1.40e-88

Summarizing the numerical results, for $q$ small ($q = 0.1$), all the bounds and approximations perform reasonably well, and for a more moderate value
of $q$ ($q = 0.3$), the upper bound U and the approximations A_G and A_L still perform very well for small and moderate $n$. Not surprisingly, for large $n$, A_L outperforms the other approximations and bounds, regardless of the values of $q$ and $k$; as the number of components $n$ increases, the reliability of the system $R(k, n\!:\!q)$ tends to zero exponentially at the same rate as A_L (see Theorems 5.9 and 5.10). Since the computation of $R(k, n\!:\!q)$ based on Eq. (7.6) is very simple and also very efficient, we believe that the reliability $R(k, n\!:\!q)$ should generally be evaluated by the exact formula in Eq. (7.6), and the large deviation approximation should be used only if the number of components $n$ is very large.

The result of Theorem 7.1(i) follows immediately from the fact that the C(k,n:F) system works if and only if $N_{n,k} = 0$ or $L_n(F) < k$. Theorem 7.1(iii) can be proved in exactly the same manner as Theorem 5.9. Hence to prove Theorem 7.1 we need only prove Part (ii), the upper and lower bounds of the system reliability.
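Both the exact formula (7.6) and the approximation A_L are cheap to compute. A minimal sketch of ours (not the book's code; `numpy` is assumed for the eigenvalue computation, and the function names are our own):

```python
import numpy as np

def exact_reliability(k, n, q):
    """E: R(k,n:q) = xi N^n 1' (Eq. (7.6)), computed via the recursions
    a_0(i) = p * sum_j a_j(i-1), a_j(i) = q * a_{j-1}(i-1) of Eq. (7.16)."""
    p = 1.0 - q
    a = [1.0] + [0.0] * (k - 1)           # a(0) = xi = (1, 0, ..., 0)
    for _ in range(n):
        a = [p * sum(a)] + [q * x for x in a[:-1]]
    return sum(a)

def large_deviation_approx(k, n, q):
    """A_L: exp{n log lambda_[1]} of Eq. (7.9), with lambda_[1] the largest
    eigenvalue of the k x k substochastic matrix N(Lambda) of Eq. (7.7)."""
    p = 1.0 - q
    N = np.zeros((k, k))
    N[:, 0] = p                            # a working component resets the failure count
    for j in range(k - 1):
        N[j, j + 1] = q                    # one more consecutive failure
    lam = max(np.linalg.eigvals(N).real)
    return float(np.exp(n * np.log(lam)))

k, n, q = 4, 1000, 0.1
p = 1.0 - q
E = exact_reliability(k, n, q)
L = (1 - q**k) ** (n - k + 1)              # lower bound of Eq. (7.8)
U = (1 - p * q**k) ** (n - k + 1)          # upper bound of Eq. (7.8)
print(L <= E <= U)                         # True
print(round(E, 4), round(large_deviation_approx(k, n, q), 4))
```

For $k = 4$, $n = 1000$, $q = 0.1$ this reproduces the E and A_L columns of Table 7.1 and confirms the bounds of Eq. (7.8).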
Let $a(0) = \xi = (1, 0, \cdots, 0)_{1 \times k}$, and for $i = 1, 2, \cdots, n$, let
$$a(i) = \xi N^i = (a_0(i), a_1(i), \cdots, a_{k-1}(i)), \tag{7.14}$$
where $a_j(i)$ is the probability of the system residing in state $j$ at time index $t = i$ (after the first $i$ components are counted). In component form, we write
$$a_j(i) = \xi N^i e_{j+1}', \qquad j = 0, 1, \cdots, k-1. \tag{7.15}$$
Note that in the above Eq. (7.15), we have $e_{j+1}$ rather than $e_j$. This is because, while $j$ is the number of failures and starts from 0, here we wish to emphasize that $j$ corresponds to the $(j+1)$-th state in the state space $\Omega$. It follows from the structure of the transition probability matrix in Eq. (7.7) and from Eq. (7.15) that the following recursive equations hold: for $1 \le i \le n$,
$$a_0(i) = p[a_0(i-1) + a_1(i-1) + \cdots + a_{k-1}(i-1)], \quad \text{and} \quad a_j(i) = q\, a_{j-1}(i-1), \quad j = 1, \cdots, k-1. \tag{7.16}$$
To prove Theorem 7.1(ii), we introduce the following three lemmas.

Lemma 7.1 (i) $N \mathbf{1}' = \mathbf{1}' - q e_k'$, (ii) $N e_1' = p \mathbf{1}'$, (iii) $N e_l' = q e_{l-1}'$, $l = 2, \cdots, k$.

Proof. The above results follow directly from the structure of the matrix $N$ given by Eq. (7.7). □

Lemma 7.2 (i) $a(i)\mathbf{1}' = a(i-1)\mathbf{1}' - q\, a(i-1) e_k'$, (ii) $a(i)\mathbf{1}' \le a(j)\mathbf{1}'$, for all $0 \le j \le i \le n$, and (iii) $a(i)\mathbf{1}' \ge a(i-k+1) e_1'$, for $i \ge k$.

Proof. Result (i) of this lemma is a direct consequence of Lemma 7.1(i) and the definition of $a(i)$ given by Eq. (7.14). Since $q\, a(i-1) e_k' \ge 0$, Result (ii) of this lemma follows directly from (i). Finally, Result (iii) follows from
$$\begin{aligned}
a(i)\mathbf{1}' &= a(i-1)(\mathbf{1}' - e_k') + p\, a(i-1) e_k' \\
&\ge a(i-1)(\mathbf{1}' - e_k') \\
&\ge a(i-2)(\mathbf{1}' - e_k' - e_{k-1}') + p\, a(i-2) e_{k-1}' \\
&\ \ \vdots \\
&\ge a(i-k+1)(\mathbf{1}' - e_k' - \cdots - e_2') + p\, a(i-k+1) e_2' \\
&\ge a(i-k+1) e_1'. \qquad \Box
\end{aligned}$$

The above inequalities (ii) and (iii) are critical to proving the inequality (7.8). We would like to give the intuitive implications of these two inequalities: (A) The inequality of Lemma 7.2(ii) implies that $R(k, n\!:\!q)$ is a decreasing function of the number of components. (B) The inequality of Lemma 7.2(iii) shows that the reliability of the C(k,n:F) system with the first $i$ ($i \ge k$) components is greater than or equal to the probability that the system with the first $(i - k + 1)$ components is at the state "0" (no failures).

Lemma 7.3 For $n \ge i \ge k$, we have
$$p q^{k-1}\, a(i)\mathbf{1}' \le a(i) e_k' \le q^{k-1}\, a(i)\mathbf{1}'. \tag{7.17}$$

Proof. Since
$$a(i) e_k' = q^{k-1}\, a(i-k+1) e_1' = p q^{k-1}\, a(i-k)\mathbf{1}',$$
Eq. (7.17) follows immediately from the inequalities of Lemma 7.2(ii) and (iii). □
= £Nnl
= a(n)l
= a(n - l ) l ' - qa(n - l)ek
^_qa(n-l)e\
,
o(n-l)l J
0 1
U'-t^h '
Applications
126
Note that a(0) = £ = (1, 0, • • •, 0) and a(0)l' = 1. Hence
R{k,n:q)=n(l-«¥^). fj[\
(7.18)
a(n-i)l
)
Applying Lemma 7.3 to Eq. (7.18) yields the inequality (1 - 9 *)"-*+i < R(k, n:q)<(l7.2.2
Linearly
Connected
•
pqk)n~k-
System
An engineering system having n components is referred to as a linearly connected system if it can be imbedded into a finite Markov chain {Yt : t = 0,1, • • •, n} defined on a finite state space Q = {0,1, • • •, k — I, a} with transition probability matrices 0 M(t;n)=
/
Pofi{t;n) :
'•
k-1 a
\
••• Po,a(t;n) •••
pk-i,o(t;n) 0
••• •••
\
:
Pk-i,a{t;n) 1 /
N(t; n)
6
C{t; n) 1
(7.19) State a is an absorbing state at which the system breaks down and cannot be used anymore. Let £ 0 = (£ : 0) be the initial distribution of YQ. Since {Yt} is a Markov chain and a is an absorbing state, it follows that the reliability of the linearly connected system can be represented as Ri{k,n:q)
= P(Y0 < k - 1, YY < k - 1, • • •, Yn < k - 1) =
P(r0
= t(f[N(t;n))l'.
(7.20)
The structure of the linearly connected system is very general; it covers many well known systems such as series, standby, k-out-oi-n:F, consecutivefc-out-of-n:F, m-within-consecutive-/c-out-of-n:F, and deteriorating and repairable systems. The paper by Chao, Fu and Koutras (1995) provides a more detailed review of the reliabilities for such systems.
Hypothesis
7.3
Testing
127
Hypothesis Testing
The formal application of the distribution theory of runs and patterns to hypothesis testing started with Wald and Wolfowitz (1940). They proposed a test based on the conditional distribution of the total number of runs given the number of successes in Bernoulli trials. The null distribution of the test was studied by Swed and Eisenhart (1943), and the power functions were studied by David (1947), Bateman (1948), Barton and David (1958), and Goodman (1958). Since then, many tests based on runs and patterns have been proposed; for example, Rubin, McCulloch, and Shapiro (1990) proposed a test, based on the number of runs in multinomial data, of randomness versus clustering. The main purpose of this section is to show how the finite Markov chain imbedding technique could be used in computing critical regions and powers in a runs-related test. One of the most commonly used runs-related statistics for testing randomness is the conditional test based on the number of success runs given the number of successes JVi = ri\ in a sequence of two-state trials. The unconditional distribution for the number of success runs Rn, and the distribution for the number of successes N±, can be derived in a manner similar to what is described in the following, and hence we leave this to the reader. Here we focus on the conditional distribution of the number of success runs given the number of successes, Rn\Ni — n\, as it is commonly used in the literature. The conditional distribution of Rn given N\ = rii can be obtained directly through the joint distribution of Rn and Ni. This method can be applied not only to i.i.d. cases, but also to one-step Markov-dependent cases, and hence can be used to calculate both the critical regions and powers of the success runs test (Lou 1996, 1997). 
For a sequence of two-state trials u> = (z\, • • •, zn) with n\ successes, define (a) a Markov chain {Yt} operating on LU: Yt(ui) = (B,R,E), where B is the number of successes in the first t trials, R is the number of success runs in the first t trials, and E is the outcome of the i-th trial (used to help identify the transition probabilities), (b) the state space ft = (0,0,F) U {(&,r, e) : b = 1, • • • ,ni; r = 1, • • •, m; e = 5, F} U a, where m = min(ni, n2 + 1), r < b, a is an absorbing state (standing for states with b > m or r > m), and
$k = \text{Card}(\Omega) = 2 + m(m+1)$ if $n_1 = m$, or $k = 2 + m(2n_1 - m + 1)$ if $n_1 > m$, and

(c) the partitions $C_0 = \{(0, 0, F)\}$, $C_\alpha = \{\alpha\}$, $C_{n_1,r} = \{(n_1, r, S), (n_1, r, F)\}$, and $C_{b,r} = \{(b, r, S), (b, r, F)\}$, where $b = 1, \cdots, n_1 - 1$, and $r = 1, \cdots, m$, with $r \le b$.

Theorem 7.2 Consider a sequence of $n$ homogeneous Markov-dependent two-state trials. For given $n$ and $n_1$, the random vector $(N_1, R_n)$ can be imbedded into the Markov chain $\{Y_t\}$ defined above with transition probability matrix, with rows and columns indexed by the ordered states $(0,0,F), (1,1,S), (1,1,F), (2,1,S), \cdots, (n_1,m,F), \alpha$,
$$M = \begin{pmatrix}
p_{FF} & p_{FS} & 0 & 0 & \cdots & 0 & (*) \\
0 & 0 & p_{SF} & p_{SS} & \cdots & 0 & (*) \\
0 & 0 & p_{FF} & 0 & \cdots & 0 & (*) \\
\vdots & & & & \ddots & & \vdots \\
0 & 0 & 0 & 0 & \cdots & p_{FF} & (*) \\
0 & 0 & 0 & 0 & \cdots & 0 & 1
\end{pmatrix}, \tag{7.21}$$
where $(*)$ stands for the probabilities $P(Y_t = \alpha \mid Y_{t-1} = i)$, $t = 1, \cdots, n$, $i \in \Omega - \{\alpha\}$. Given the initial condition $P(Y_0 = (0, 0, F)) = 1$, the following results hold:

(1) For $m = 0$ and $1$, $P(N_1 = 0, R_n = 0) = 1$ and $P(N_1 = 1, R_n = 1) = 1$, respectively, and for $n_1 \ge 2$, $r = 1, \cdots, m$,
$$P(N_1 = n_1, R_n = r) = \xi_0 M^n U'(C_{n_1,r}), \tag{7.22}$$
where $\xi_0 = (1, 0, \cdots, 0)$ is a $1 \times k$ vector, $k$ being the size of the state space.

(2) For $m \ge 2$, $r = 1, \cdots, m$,
$$P(R_n = r \mid N_1 = n_1) = \frac{\xi_0 M^n U'(C_{n_1,r})}{\xi_0 M^n U'(C_{n_1})}, \tag{7.23}$$
where $C_{n_1} = \bigcup_{r=1}^{m} C_{n_1,r}$ and $U'(C_{n_1})$ is the sum of all vectors $U'(C_{n_1,r})$, $r = 1, \cdots, m$.
Proof. To prove the above results, it is sufficient to prove that the transition probability matrix M has the form given in Eq. (7.21).
For $n_1 = 0$ or $1$, the result is obvious. For $n_1 \ge 2$, it follows from the constructions and definitions (a), (b), and (c) that the transition probabilities of the Markov chain $\{Y_t\}$, $t = 1, \cdots, n$, are given by

(i) for $1 \le b \le n_1$, $1 \le r \le m = \min(n_1, n_2 + 1)$,
$$P(Y_t = (b, r, F) \mid Y_{t-1} = (b, r, F)) = p_{FF}, \qquad P(Y_t = (b, r, F) \mid Y_{t-1} = (b, r, S)) = p_{SF},$$

(ii) for $1 \le b \le n_1 - 1$, $1 \le r \le m - 1$,
$$P(Y_t = (b+1, r+1, S) \mid Y_{t-1} = (b, r, F)) = p_{FS},$$
$$P(Y_t = (b+1, r, S) \mid Y_{t-1} = (b, r, S)) = p_{SS},$$
$$P(Y_t = (b+1, m, S) \mid Y_{t-1} = (b, m, S)) = p_{SS},$$

(iii) for $b = n_1$, $1 \le r \le m$,
$$P(Y_t = \alpha \mid Y_{t-1} = (b, r, F)) = p_{FS}, \qquad P(Y_t = \alpha \mid Y_{t-1} = (b, r, S)) = p_{SS},$$

(iv) $P(Y_t = \alpha \mid Y_{t-1} = \alpha) = 1$, and zero otherwise.

Hence the transition probability matrix $M$ is an immediate consequence of conditions (i) to (iv). Equations (7.22) and (7.23) follow directly from Theorem 2.1 and the definition of conditional probability. □

To illustrate the theorem, we give the following example.

Example 7.3 For $n = 5$, $n_1 = 2$, $m = \min(2, 4) = 2$, the state space can be defined by $\Omega = \{(0,0,F), (1,1,S), (1,1,F), (2,1,S), (2,1,F), (2,2,S), (2,2,F), \alpha\}$, where $(0,0,F)$ is the initial state and stands for no success, and $\alpha$ represents states with more than two successes ($b > 2$). A Markov chain $\{Y_t\}$ is then defined. For instance, for a given sequence $\omega = (SFSFF)$, the realization of the Markov chain $Y_t(\omega)$, $t = 0, 1, \cdots, 5$, is $\{Y_0(\omega) = (0,0,F),\ Y_1(\omega) = (1,1,S),\ Y_2(\omega) = (1,1,F),\ Y_3(\omega) = (2,2,S),\ Y_4(\omega) = Y_5(\omega) = (2,2,F)\}$.

If the two-state trials are homogeneous Markov-dependent, then the transition probability matrices of the imbedded Markov chain $\{Y_t\}$ are the same for all $t$, and, with rows and columns ordered as $(0,0,F), (1,1,S), (1,1,F), (2,1,S), (2,1,F), (2,2,S), (2,2,F), \alpha$, can be expressed as
$$M = \begin{pmatrix}
p_{FF} & p_{FS} & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & p_{SF} & p_{SS} & 0 & 0 & 0 & 0 \\
0 & 0 & p_{FF} & 0 & 0 & p_{FS} & 0 & 0 \\
0 & 0 & 0 & 0 & p_{SF} & 0 & 0 & p_{SS} \\
0 & 0 & 0 & 0 & p_{FF} & 0 & 0 & p_{FS} \\
0 & 0 & 0 & 0 & 0 & 0 & p_{SF} & p_{SS} \\
0 & 0 & 0 & 0 & 0 & 0 & p_{FF} & p_{FS} \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}.$$
The partition on the state space $\Omega$ for $n_1 = 2$ is generated by $C_{2,r} = \{(2, r, S), (2, r, F)\}$, $r = 1, 2$. Then the exact distribution of $R_{5,2}$ ($R_{n,n_1}$ denotes the conditional random variable of $R_n$ given $N_1 = n_1$) can be obtained via Eq. (7.23). Given $p_{SS} = 1/2$ and $p_{FS} = 1/4$, for example, then $P(R_{5,2} = 1) = 0.643$. Further, if $p_{SS} = p_{FS} = p$, $0 < p < 1$ (i.e. under the null hypothesis of i.i.d. trials), then $P(R_{5,2} = 1) = 0.4$ for all $p$, and this result can be easily confirmed by writing out all 32 possible sequences of 5 two-state trials. ◊

Remark 7.1: The Markov chain $\{Y_t\}$ defined above contains three components: (i) the number of successes in the first $t$ trials, (ii) the number of success runs in the first $t$ trials, and (iii) the outcome of the $t$-th trial (serving as ending block). As $t$ increases, components (i) and (ii) are nondecreasing. Therefore, when the number of successes $n_1$ is given, the state space can be reduced by collapsing the states where $N_1 > n_1$ into an absorbing state $\alpha$. This reduction not only reduces the size of the state space, but also the dimension of the transition probability matrix, and hence the computational time. The conditional distribution of $R_n$ given $N_1 = n_1$ is not affected by this reduction.

Remark 7.2: If the two-state trials are non-homogeneous Markov-dependent, the joint distribution of $(N_1, R_n)$ can be obtained in the same way by simply replacing the transition probabilities $p_{SS}$, $p_{SF}$, $p_{FS}$, and $p_{FF}$ in the transition probability matrix of Eq. (7.21) with time-dependent transition probabilities $p_{SS}(t)$, $p_{SF}(t)$, $p_{FS}(t)$, and $p_{FF}(t)$. Furthermore, if the two-state trials are i.i.d. with $p_{SS} = p_{FS} = p$, the conditional distribution of $R_n$ given $N_1$ under the null hypothesis can be obtained easily. This implies that Theorem 7.2 can be used for computing both critical regions and
powers of the runs test.

Critical regions for the test based on success runs, $R_{n,n_1} = (R_n \mid N_1 = n_1)$, can be constructed using the method of finite Markov chain imbedding. To demonstrate the capabilities of this method, the critical regions of the two-sided success runs test, at the 5% significance level, for $p = 1/2$ and a large sample of $n = 100$, were obtained and are presented in a graphic, as opposed to tabular, format in Figure 7.1. Given any number of successes $n_1$, the hypothesis $H_0$ can be tested by whether or not the observed value of $R_{n,n_1}$ is located in the rejection region ($C_r$). As for all tests based on discrete probability distributions, the tail probabilities may not be exactly equal to the assigned significance level. Therefore, Figure 7.1 was constructed based on the closest critical values for both sides.

Fig. 7.1 Critical region curves for the test based on $R_{100}$ given the number of successes $n_1$, with $p = 1/2$ and a significance level of 5%.
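The conditional probabilities of Theorem 7.2 can be reproduced directly from Eq. (7.23); the following sketch (ours, not the authors') hard-codes the 8-state chain of Example 7.3, with the states and transitions as reconstructed above:

```python
from fractions import Fraction as Fr

def p_r52_equals_1(pss, pfs):
    """P(R_5 = 1 | N_1 = 2) for homogeneous Markov-dependent trials, via
    Eq. (7.23) and the 8-state chain of Example 7.3 (states ordered as
    (0,0,F),(1,1,S),(1,1,F),(2,1,S),(2,1,F),(2,2,S),(2,2,F),alpha)."""
    psf, pff = 1 - pss, 1 - pfs
    M = [
        [pff, pfs, 0,   0,   0,   0,   0,   0  ],
        [0,   0,   psf, pss, 0,   0,   0,   0  ],
        [0,   0,   pff, 0,   0,   pfs, 0,   0  ],
        [0,   0,   0,   0,   psf, 0,   0,   pss],
        [0,   0,   0,   0,   pff, 0,   0,   pfs],
        [0,   0,   0,   0,   0,   0,   psf, pss],
        [0,   0,   0,   0,   0,   0,   pff, pfs],
        [0,   0,   0,   0,   0,   0,   0,   1  ],
    ]
    v = [1, 0, 0, 0, 0, 0, 0, 0]                   # Y_0 = (0,0,F)
    for _ in range(5):                              # n = 5 trials
        v = [sum(v[i] * M[i][j] for i in range(8)) for j in range(8)]
    one_run = v[3] + v[4]                           # mass on C_{2,1} = {(2,1,S),(2,1,F)}
    two_runs = v[5] + v[6]                          # mass on C_{2,2} = {(2,2,S),(2,2,F)}
    return one_run / (one_run + two_runs)           # Eq. (7.23)

print(p_r52_equals_1(Fr(1, 2), Fr(1, 4)))           # 9/14, i.e. about 0.643
print(p_r52_equals_1(Fr(1, 2), Fr(1, 2)))           # 2/5 under i.i.d.
```

Exact rational arithmetic makes the i.i.d. invariance $P(R_{5,2} = 1) = 2/5$ visible directly, without floating-point noise.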
Some power functions are plotted in Figure 7.2 for a moderate sample size of n = 30, and various sizes of the test. Power comparisons for twosided tests based on i?„, ni and Lnni (the length of the longest success run
Fig. 7.2 Powers of two-sided tests based on R_{n,n_1} with n = 30: (a) p_SS = 0.5, n_1 = n_2; (b) p_SS = 0.8, n_1 = n_2; (c) p_SS = 0.5, n_1 = 2n_2/3; and (d) p_SS = 0.8, n_1 = 2n_2/3 (alpha = size of the critical region; curves shown for alpha = 0.0182, 0.0301, 0.0374, and 0.0496).
given N_1 = n_1) are given in Lou (1996). To illustrate the practical use of the runs test for binary longitudinal data, one subset of data from a multi-center health risk study on asthmatics is analyzed here.

Example 7.4 In Canada, over half a million people suffer from asthma (Wigle 1982), a cause of illness and disability. Despite advances in treatment modalities, the morbidity and mortality associated with asthma have continued to increase in Canada over the past decade. Hence it is vital to have a strategy for the prevention and control of the disease. A multi-center study was conducted by the Gage Research Institute, Toronto, across Canada from November 1991 to May 1993 in order to identify important environmental factors which may increase disease severity or cause acute exacerbation, and to provide insights on regional differences. With the necessary information, a prevention strategy can then be planned, such as better enforcement of environmental controls against certain air pollutants, the administration of vaccines against viral infections, and so on. During the preliminary data analysis, the randomness of daily symptoms was tested. As an example, the occurrence of wheeze, a common discomfort symptom of asthmatics, will be considered here as the outcome variable: a day is recorded as 'Failure' if there were no symptoms of wheeze, and as 'Success' otherwise. Of the 59 asthmatics from Vancouver, BC, there were 29 subjects who had both more than two days with no symptoms and some days with symptoms during a 200-day period, and it is these subjects which will be used in the following analysis. The remaining subjects, who had symptoms nearly every day or no symptoms at all, are not used, as the results are trivial. As shown in Figure 7.3, there are 22 sequences whose numbers of success runs fall inside the critical region curves, and seven sequences whose numbers do not. The approximate probabilities for large samples given by Wald and Wolfowitz (1940) were also computed and compared to the exact probabilities, which were obtained easily using the finite Markov chain imbedding approach, and most approximations were found to be significantly inaccurate. Based on the exact probabilities and Figure 7.3, the null hypothesis of randomness at the 5% significance level is rejected for 22 of the 29 sequences. Therefore, for about three quarters of the subjects, the occurrence or non-occurrence of wheeze on a given day is somehow related to their condition on previous days. The type of tendency (positive or negative, first- or second-order Markov dependence, etc.) can then be modeled. ◊
7.4 Sequential Continuity
Continuity of care, and in particular provider continuity, is increasingly becoming a central aim of health care policy in a majority of clinical settings. The benefits of broader information exchange between doctor and patient attributed to provider continuity include earlier recognition of health problems and psycho-social effects such as greater patient satisfaction, leading to an overall improvement in the quality of care as well as a reduction in cost. Continuity of care describes the extent to which information about the diagnosis and management of health problems is conveyed from one visit to the next. This definition, in its broadest sense, includes not only provider
Fig. 7.3 Critical region curves for n = 200, and observed numbers of success runs (•) for the 29 sequences of Example 7.4 (horizontal axis: number of successes, 0 to 200).
(e.g. physician, health team, HMO) continuity, but also other dimensions such as continuity of medical records and of the geographical location of the treatment site. Although existing measures for the quantification of continuity can be applied to each of these dimensions in a similar manner, the corresponding literature has focused mainly on provider continuity, which is expected to have the greatest impact upon treatment outcome. Over the past two decades, more than a dozen measurement indices of continuity have been proposed, and, in general, they can be categorized into visit-based and individual-based measures. Steinwachs (1979) proposed an individual-based statistic, the sequential continuity measure SECON, which is the fraction of sequential visit pairs at which the same provider is seen; it is currently one of the most commonly used individual-based statistics. A brief introduction to the formulation of the imbedded SECON statistic was given in Example 2.4. A more complete and detailed mathematical discussion is provided in the following.
Let {X_t, t = 0, 1, ...} be a stationary sequence of random variables forming a homogeneous Markov chain defined on a finite state space S = {1, 2, ..., m}. Assume the chain is irreducible and aperiodic with transition probability matrix A = (p_ij)_{m×m}. Let π = (π_1, ..., π_m) be the ergodic distribution associated with the Markov chain; i.e. for i = 1, ..., m, π_i = lim_{n→∞} P(X_n = i). Define a parameter

  θ = Σ_{i=1}^m π_i p_ii,

which is the sequential continuity measure associated with the Markov chain, and define a sequence of indicator random variables I_t such that I_t = 1 if X_t = X_{t-1}, t = 1, 2, ..., and I_t = 0 otherwise. Thus I_t = 1 if the same health provider is seen at consecutive visits, and zero otherwise, where the random variable X_0 corresponds to the outcome of the initial visit of the patient at time t = 0, with initial probability π_0 = (P(X_0 = 1), ..., P(X_0 = m)). Unless specified otherwise, we assume that the initial probability π_0 is the ergodic distribution π of the Markov chain (i.e. π_0 = π). Let v(T), a positive-integer-valued random variable defined on J+ = {1, 2, ...} with probability mass function μ(·), be the number of visits, excluding the initial visit, that have occurred up to time T. Steinwachs (1979) introduced the statistic SECON as the outcome average of the sequence {I_t}_{t=1}^{v(T)} for a given v(T) = n, or in a practical sense, as the fraction of sequential visit pairs at which the same health provider is seen. In our context, SECON estimates θ, with

  SECON = S_{v(T)} / v(T),   (7.24)

and S_{v(T)} = I_1 + I_2 + ... + I_{v(T)}. The exact distribution of SECON under the assumption of random assignment (i.i.d. with equal probabilities) at each visit was given by Steinwachs (1979). Here we focus on the exact distribution of SECON under the more realistic model of dependence between visits, obtained originally by Fu and Lou (2000a). If the integer random variable v(T) is independent of the Markov chain {X_t}, and the initial distribution π_0 is the ergodic distribution π of the Markov chain, then SECON is an unbiased estimator of the sequential continuity parameter θ. This follows from the fact that, for every given v(T) = n with n ∈ J+, E(SECON | v(T) = n) = E(S_n/n) = θ. It is easy to see that the sequence of random variables {I_t} induced by the Markov chain {X_t} is stationary but not always a Markov chain. Hence the usual Markov chain techniques cannot be applied directly to study the exact
distributions of S_{v(T)} and SECON. The goal here is to find the exact distributions of S_{v(T)} and SECON under one-step Markov dependence between two consecutive events/visits. Decompose the transition probability matrix A associated with {X_t} into two matrices G and D:

  A_{m×m} = (p_ij)_{m×m} = G_{m×m} + D_{m×m},   (7.25)

where G carries the off-diagonal entries p_ij (i ≠ j) of A and has zeros on the diagonal, and D = diag(p_11, ..., p_mm) carries the diagonal entries of A.
Let Ω = {(u, v) : u = 0, 1, ..., n, and v = 1, 2, ..., m} be the state space containing a total of (n+1)m states. Define a homogeneous Markov chain {Y_t} on Ω as

  Y_t = (Σ_{h=1}^t I_h, X_t), for t = 1, 2, ..., n,
  Y_0 = (0, X_0),   (7.26)
with transition probability matrices, for t = 1, ..., n, given by

  M_{(n+1)m × (n+1)m} =
  [ G  D  O  ...  O  O ]
  [ O  G  D  ...  O  O ]
  [ :  :  :       :  : ]   (7.27)
  [ O  O  O  ...  G  D ]
  [ O  O  O  ...  O  I ],

where each block is m × m: G and D are as in Eq. (7.25), O is the zero matrix, and I is the identity matrix (reflecting the boundary convention at u = n).

Theorem 7.3 The distribution of S_n is given by

  P(S_n = s) = ξ_0 M^n U'(C_s), 0 ≤ s ≤ n,   (7.28)

where ξ_0 = (π_0, 0, ..., 0)_{1×(n+1)m} is the initial probability vector of Y_0, 0 = (0, ..., 0)_{1×m}, U(C_s) = (0, ..., 0, 1, ..., 1, 0, ..., 0) is a 1 × (n+1)m row vector with 1 at the coordinates associated with states in C_s and 0 elsewhere, and the C_s = [(s, 1), ..., (s, m)], s = 0, 1, ..., n, form a partition of Ω. Further, the probability of S_n = s satisfies the following recursive equation:

  P(S_n = s) = P(S_{n-1} = s) + Σ_{j=1}^m p_jj ξ_0 M^{n-1} (e'(s-1, j) - e'(s, j)),   (7.29)

where e(s, j) = (0, ..., 0, 1, 0, ..., 0)_{1×(n+1)m} is the unit vector associated with the state (s, j).
Proof. Since {X_t} is a Markov chain with initial probability π_0 and transition probability matrix A, it follows from Eq. (7.26) that {Y_t} is also a Markov chain with initial probability ξ_0 and transition probabilities determined by the following: for 0 ≤ s ≤ n-1 and i = 1, ..., m,

  P(Y_t = (u, v) | Y_{t-1} = (s, i)) =
    p_ii if u = s + 1 and v = i,
    p_ij if u = s and v = j, j ≠ i,   (7.30)
    0 otherwise.

For states with s = n, by convention, P(Y_t = (u, v) | Y_{t-1} = (n, i)) = 1 if u = n and v = i, and 0 otherwise. The transition probability matrix M given in Eq. (7.27) is hence defined. From the definition of the Markov chain {Y_t}, P(S_n = s) = P(Y_n ∈ C_s), so Eq. (7.28) is a direct consequence of the Chapman-Kolmogorov equation. The recursive Eq. (7.29) follows immediately from Eq. (7.28) and

  M U'(C_s) = U'(C_s) + Σ_{j=1}^m p_jj (e'(s-1, j) - e'(s, j)). □
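Theorem 7.3 can be checked numerically. The sketch below (an illustration added here; the two-provider chain A, the visit count n = 4, and the stationary initial distribution are arbitrary choices) builds M from the blocks G and D of Eqs. (7.25)-(7.27), evaluates P(S_n = s) = ξ_0 M^n U'(C_s), and compares the result with a brute-force sum over all provider paths:

```python
from itertools import product

m, n = 2, 4                    # providers and number of visits after the initial one
A = [[0.7, 0.3],               # an arbitrary illustrative 2-provider chain
     [0.6, 0.4]]
pi0 = [2/3, 1/3]               # its stationary distribution (0.3*pi_1 = 0.6*pi_2)

# Split A into G (off-diagonal) and D (diagonal) as in Eq. (7.25).
G = [[A[i][j] if i != j else 0.0 for j in range(m)] for i in range(m)]
D = [[A[i][j] if i == j else 0.0 for j in range(m)] for i in range(m)]

# Build M of Eq. (7.27) on the (n+1)m states (u, v), u = run count, v = provider.
N = (n + 1) * m
M = [[0.0] * N for _ in range(N)]
for u in range(n + 1):
    for i in range(m):
        row = u * m + i
        if u < n:
            for j in range(m):
                M[row][u * m + j] += G[i][j]      # provider changes: count unchanged
            M[row][(u + 1) * m + i] += D[i][i]    # provider repeats: count + 1
        else:
            M[row][row] = 1.0                     # boundary convention at u = n

# P(S_n = s) = xi_0 M^n U'(C_s), Eq. (7.28).
xi = pi0 + [0.0] * (N - m)
for _ in range(n):
    xi = [sum(xi[k] * M[k][c] for k in range(N)) for c in range(N)]
chain = [sum(xi[s * m + j] for j in range(m)) for s in range(n + 1)]

# Brute force over all provider paths (X_0, ..., X_n).
brute = [0.0] * (n + 1)
for path in product(range(m), repeat=n + 1):
    pr = pi0[path[0]]
    for a, b in zip(path, path[1:]):
        pr *= A[a][b]
    brute[sum(a == b for a, b in zip(path, path[1:]))] += pr

print(max(abs(c - b) for c, b in zip(chain, brute)))
```

The two computations agree to floating-point accuracy, while the imbedding approach scales linearly in n rather than exponentially.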
Note that GD = DG if and only if p_11 = p_22 = ... = p_mm = θ > 0. Given v(T) = n, if GD = DG, it follows from Eq. (7.28) and

  ξ_0 M^n = [ (n choose 0) π_0 G^n, (n choose 1) π_0 G^{n-1} D, ..., (n choose n) π_0 D^n ]   (7.31)

that S_n has a binomial distribution:

  P(S_n = s) = (n choose s) π_0 G^{n-s} D^s 1' = (n choose s) θ^s (1 - θ)^{n-s}.   (7.32)

Thus in this special case, the result is independent of the number of available providers m and of the initial probability π_0. The Markov chain {Y_t} defined in Eq. (7.26) can be regarded as a Markov random walk (Pyke 1961). Our construction of {Y_t} shows that S_n is finite Markov chain imbeddable. In addition, if GD = DG, S_n is finite Markov chain imbeddable of binomial type in the sense of Definition 2.7 (Koutras and Alexandrou 1995). If v(T) (with probability measure μ(·) on J+) is independent of {X_t} and has, for example, a truncated Poisson distribution with parameter λ such that μ(n) = λ^n e^{-λ} / (n!(1 - e^{-λ})), n ∈ J+, then for the cases where p_11 = ... = p_mm = θ, the SECON statistic has the following distribution: for a ∈ [0, 1],

  P(S_{v(T)}/v(T) ≤ a) = (e^{-λ}/(1 - e^{-λ})) Σ_{n=1}^∞ (λ^n/n!) Σ_{s=0}^{[na]} (n choose s) θ^s (1 - θ)^{n-s}.   (7.33)

Further, under the above assumptions, the SECON statistic has mean θ and variance

  Var(S_{v(T)}/v(T)) = (θ(1 - θ) e^{-λ}/(1 - e^{-λ})) Σ_{n=1}^∞ λ^n/(n · n!).   (7.34)
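The moments above can be verified numerically. In the sketch below (our illustration; θ = 0.6 and λ = 8 are arbitrary values), S_n given v(T) = n is Binomial(n, θ) by Eq. (7.32), so the mean of SECON reduces to θ and its variance to θ(1-θ)E[1/v(T)], which is compared against the series of Eq. (7.34):

```python
import math

theta, lam = 0.6, 8.0   # illustrative: common diagonal p_ii and Poisson parameter
N_MAX = 60              # truncation point; the Poisson(8) tail beyond 60 is negligible

# Truncated Poisson pmf on J+ = {1, 2, ...}: mu(n) = lam^n e^-lam / (n! (1 - e^-lam)).
log_lam = math.log(lam)
mu = {n: math.exp(n * log_lam - lam - math.lgamma(n + 1)) / (1.0 - math.exp(-lam))
      for n in range(1, N_MAX + 1)}

# Mean: E(S_n/n) = theta for every n by Eq. (7.32), hence E(SECON) = theta.
mean = sum(mu[n] * theta for n in mu)

# Variance two ways: directly as theta(1-theta) E[1/v(T)], and via the
# series of Eq. (7.34), using lam^n e^-lam / n! = exp(n log lam - lam - lgamma(n+1)).
var_direct = sum(mu[n] * theta * (1 - theta) / n for n in mu)
var_734 = (theta * (1 - theta) / (1.0 - math.exp(-lam))
           * sum(math.exp(n * log_lam - lam - math.lgamma(n + 1)) / n for n in mu))

print(mean, var_direct, var_734)
```

Both variance expressions coincide, and the mean equals θ, as the unbiasedness argument above requires.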
The assumption of independence between the number of visits v(T) and the visit sequence {X_t} is crucial for the above results.

Example 7.5 As part of a study of HIV/AIDS patients, we investigated the relationship between continuity of care and overall outcomes, such as total charges and frequency of hospitalization, for patients from the Mount Sinai AIDS Center. This patient population is composed mainly of East and Central Harlem residents. These low-income residential communities with high illicit drug use have among the highest AIDS and tuberculosis prevalences of any community in the country, and it is important to quantify the patterns of care received by patients in such areas. As an illustration here, we select one set of 45 patients whose primary provider is Medicaid, whose ages are between 25 and 55, and who had regular visits since the initial date of receiving case-management care during the period July 1995 to June 1996. Among the 45 observed sequences, 24 had SECON values equal to one, and the minimum value was 0.3. Hence these patients received much better sequential continuity of care than if they had been randomly assigned to any one of the m = 13 providers at each visit, in which case θ would be 1/13. To gain further insight into the measure θ for these 45 patients, the empirical distribution of SECON is plotted in Figure 7.4, along with one theoretical distribution of SECON where θ = 0.8 and the number of visits v(T) is assumed to follow a Poisson distribution with λ = 8. Here, as a reasonable starting point, we naively assumed a symmetric transition probability relationship for the 13-provider Markov chain, resulting in the small discrepancies apparent in Figure 7.4. Note that both distributions are heavily skewed toward the left. A chi-square test with four intervals, each containing at least five observations, was carried out for
Fig. 7.4 The empirical distribution of S_{v(T)}/v(T) for the 45 patients of Example 7.5, along with the exact distribution of S_{v(T)}/v(T) where θ = 0.8 and v(T) follows a truncated Poisson distribution with λ = 8.
goodness-of-fit of the theoretical distribution of SECON with parameters θ = 0.8 and λ = 8, yielding a p-value of 0.09. It is worth mentioning that when the restriction of having at least five observations in each interval is removed, as in using the interval a ∈ [0, 0.34], which contains two observations, the p-value decreases to less than 0.05. The requirement of at least five observations in each interval was chosen to eliminate potential problems in the chi-square approximation due to extremely small expected values. Based on Figure 7.4 and the chi-square goodness-of-fit test, it appears that the theoretical distribution approximates the empirical distribution well in the center, and less well in the two tails. With proper estimation of the transition probabilities, we can then model the relationship between the distribution of S_{v(T)}/v(T) and the characteristics of the patient population (health care provider, race, etc.), and further with overall outcomes such as total costs and quality of care. ◊
7.5 Quality Control Schemes
Basic control schemes such as Cumulative Sum (CUSUM), Exponentially Weighted Moving Average (EWMA), and the Shewhart chart have found widespread application in improving the quality of manufactured goods and services. It is known that there is no single control scheme that is optimal in detecting all types of changes and shifts. Recently, multiple control schemes (compound sets of control rules) have been used for monitoring various manufacturing process parameters, such as the mean, variance, and proportion. One example of a multiple control scheme is the Western Electric Control Scheme (see Montgomery 2001, p. 176): (1) One or more points outside of the control limit, (2) Two of three consecutive points outside the two-sigma warning limits but still inside the control limits, (3) Four of five consecutive points beyond the one-sigma limits, and (4) A run of eight consecutive points on one side of the center line. A number of methods have been used to investigate the run-length distributions (waiting-time distributions for an out-of-control signal) and the average run lengths (ARLs) for various control schemes; a basic overview of the area may be gained from the following references: Barnard (1959), Ewan and Kemp (1960), Brook and Evans (1972), Lucas and Crosier (1982), and Chao (1999). In this section, we adopt a general method based on discretization and the finite Markov chain imbedding technique to study the distributions and ARLs of various control schemes, as given by Fu, Spiring and Xie (2002). The reasons for imposing discretization are two-fold: (i) to allow for the efficient application of the method of finite Markov chain imbedding in computing the run-length distribution when the control scheme {Sn} is a sequence of continuous random variables, and (ii) to avoid the mathematical complexity of other methods in computing the ARL for multiple control schemes (see Montgomery 2001). 
Technically speaking, we discretize the range space into m states plus control limits (absorbing states); the quality control scheme can then be viewed as the limiting control scheme of a sequence of schemes based on multi-state trials, {S_n(m)}, as m → ∞. In plain words, for each control scheme {S_n}, we construct a sequence of control schemes {S_n(m)} defined on a finite state space. Under very mild assumptions, we expect the discretized control schemes to perform like {S_n} in the limit m → ∞.
7.5.1 Simple Control Schemes
Let {Z_n} be a sequence of i.i.d. continuous random variables. The simple control schemes known as CUSUM, EWMA and the Shewhart chart are perhaps the three most commonly used in practice. Mathematically, these schemes can be described as follows.

(1) CUSUM control scheme: Given some upper control limit (UCL), h > 0, the upper-sided CUSUM is defined as

  S_0 = 0, S_n = max{0, S_{n-1} + Z_n}, for n = 1, 2, ....   (7.35)

The process being monitored is out-of-control at stage n (after n trials) if S_i < h for all i = 0, 1, 2, ..., n-1, and S_n ≥ h. The main use of the upper-sided CUSUM control scheme is to detect a small upward shift. Similarly, we can define the lower-sided CUSUM as

  S_0 = 0, S_n = min{0, S_{n-1} + Z_n}, for n = 1, 2, ....   (7.36)

The process is out-of-control at stage n if S_i > -h for all i = 0, 1, ..., n-1, and S_n ≤ -h, where -h is referred to as the lower control limit (LCL).

(2) EWMA control scheme: For given 0 < λ ≤ 1 and h > 0, the EWMA scheme can be defined as

  S_0 = 0, S_n = (1 - λ)S_{n-1} + λZ_n, for n = 1, 2, ....   (7.37)

The process is out-of-control at stage n if -h < S_i < h for all i = 0, 1, ..., n-1, and S_n ≤ -h or S_n ≥ h.

(3) Shewhart control scheme: The Shewhart control scheme is the special case of the EWMA control scheme with λ = 1:

  S_0 = 0, S_n = Z_n, for n = 1, 2, ....

In general, the upper and lower control limits are functions of n. For example, in practice, for the two-sided CUSUM control scheme a V-mask is often used, for the EWMA control scheme linear control limits are sometimes applied, but for the Shewhart control scheme the upper and lower limits are typically constant. For simplicity in presenting the application of the finite Markov chain imbedding technique to the analysis of quality control schemes, we now focus our discussion on the upper-sided CUSUM control scheme (denoted hereafter simply by S_n), and choose a constant control limit (h > 0). At the end of this subsection, we briefly indicate how to extend the methodology to non-constant control limits. However, the
theoretical results developed in this section for the upper-sided CUSUM control scheme also hold for the EWMA and Shewhart schemes. Given a constant control limit h > 0 and S_0 = 0 with probability one, define

  W = inf{n ≥ 1 : S_n ≥ h | S_0 = 0}

as the run-length (waiting-time or hitting-time) random variable induced by the upper-sided CUSUM control chart. Let f(z) be the continuous common density function of the sequence of i.i.d. random variables {Z_n}. Given the upper control limit h > 0 and a positive integer m, let {S_n(m)} be the sequence of discretized random variables of the sequence {S_n}, where S_n(m) takes the values a_0 = 0, a_i = id for i = 1, ..., m, and a_{m+1} = h, with d = h/(m+1). We define a discretized upper-sided CUSUM as follows: S_0(m) = a_0 = 0 and

  S_n(m) =
    a_0 if 0 ≤ max{0, S_{n-1}(m) + Z_n} < a_0 + d/2,
    a_j if a_{j-1} + d/2 ≤ max{0, S_{n-1}(m) + Z_n} < a_j + d/2, for j = 1, ..., m,   (7.38)
    a_{m+1} if a_m + d/2 ≤ max{0, S_{n-1}(m) + Z_n},

for n = 1, 2, .... It is easy to see from our construction that {S_n(m)} forms a homogeneous Markov chain defined on the state space Ω = {a_0, a_1, ..., a_m, a_{m+1}} with transition probability matrix

  M = [ p_00  p_01  ...  p_0m  p_0(m+1) ]
      [ p_10  p_11  ...  p_1m  p_1(m+1) ]
      [  :     :          :       :     ]  =  [ A(m)  B(m) ]   (7.39)
      [ p_m0  p_m1  ...  p_mm  p_m(m+1) ]     [  0      1  ],
      [  0     0    ...   0       1     ]

where the state a_{m+1} is an absorbing state, A(m) is the (m+1) × (m+1) essential submatrix, and B(m) is an (m+1) × 1 vector,
  p_ij = ∫_{(a_{j-1}-a_i)+d/2}^{(a_j-a_i)+d/2} f(z) dz,  i = 0, 1, ..., m,  j = 1, 2, ..., m,

  p_i0 = ∫_{-∞}^{(a_0-a_i)+d/2} f(z) dz,  i = 0, 1, ..., m,

  p_i(m+1) = ∫_{(a_m-a_i)+d/2}^{∞} f(z) dz,  i = 0, 1, ..., m,   (7.40)

  p_(m+1)(m+1) = 1 (stay "out of control"), and

  p_(m+1)j = 0,  j = 0, 1, ..., m (cannot return once beyond the UCL).
P(m+i)j = 0, j = 0,1,2, • • •, m (cannot return once beyond UCL). Note that, for every i — 0,1, • • •, m, m 1
+
r(a0-ai)+%
E^
= /
j=0
m
r(aj-ai)+&
f(z)dz+Y,
J — oo
f(z)dz
j-.-y J(a,j-i
— ai) + f
/•OO
+ / J(am-a.i)
/(z)dz = 1. +%
Hence, the transition probability matrix M associated with the upper-sided CUSUM control scheme consists of the elements p_ij, i, j = 0, 1, ..., m+1, as defined in Eq. (7.40), where each p_ij represents the probability of the control statistic moving from state i to state j. Let the run-length random variable induced by the finite Markov chain {S_n(m)} be W(m), such that

  W(m) = inf{n ≥ 1 : S_n(m) = a_{m+1} | S_0(m) = a_0},   (7.41)

where a_0 = 0. Throughout the following, we assume the initial distribution P(S_0(m) = a_0) = 1; i.e. ξ(m) = (1, 0, ..., 0) and ξ_0 = (ξ(m) : 0). The run-length distribution and its mean and variance are given by the following results.

Theorem 7.4 Given m, h, and ξ(m), we have

  (i) P(W(m) = n | ξ(m)) = ξ(m) A^{n-1}(m) (I - A(m)) 1',
  (ii) E[exp(sW(m))] = 1 + (e^s - 1) ξ(m) (I - e^s A(m))^{-1} 1',
  (iii) E[W(m)] = ξ(m) (I - A(m))^{-1} 1', and
  (iv) E[W^2(m)] = ξ(m) (I + A(m)) (I - A(m))^{-2} 1'.

Proof. The result (i) follows immediately from Eqs. (7.39) and (7.41), and from Theorem 5.1. With straightforward manipulation, the result (ii) follows from the definition of the moment generating function. The results (iii) and (iv) are a direct consequence of

  E[W^i(m)] = (d^i/ds^i) E[exp(sW(m))] |_{s=0}, i = 1, 2. □
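Theorem 7.4(iii) reduces the ARL computation to one linear solve. The sketch below (ours; it uses the exponential density and the parameters h = 3, m = 20 of Example 7.6 below, for which the integrals of Eq. (7.40) have closed form) builds A(m) and solves (I - A(m))x = 1' by Gauss-Jordan elimination:

```python
import math

h, m = 3.0, 20
d = h / (m + 1)
a = [i * d for i in range(m + 1)]   # transient states a_0, ..., a_m; a_{m+1} = h absorbs

def F(x):
    """P(Z <= x) for the exponential density f(z) = e^{-z}, z >= 0."""
    return 0.0 if x <= 0.0 else 1.0 - math.exp(-x)

# Essential submatrix A(m) from Eq. (7.40).
A = [[0.0] * (m + 1) for _ in range(m + 1)]
for i in range(m + 1):
    A[i][0] = F((a[0] - a[i]) + d / 2)              # fall to (or stay at) a_0
    for j in range(1, m + 1):
        A[i][j] = F((a[j] - a[i]) + d / 2) - F((a[j - 1] - a[i]) + d / 2)

# ARL = xi(m)(I - A(m))^{-1} 1' (Theorem 7.4(iii)): solve (I - A)x = 1 by
# Gauss-Jordan elimination with partial pivoting; the chain starts in a_0.
n = m + 1
B = [[(1.0 if r == c else 0.0) - A[r][c] for c in range(n)] + [1.0] for r in range(n)]
for col in range(n):
    piv = max(range(col, n), key=lambda r: abs(B[r][col]))
    B[col], B[piv] = B[piv], B[col]
    for r in range(n):
        if r != col:
            f = B[r][col] / B[col][col]
            B[r] = [x - f * y for x, y in zip(B[r], B[col])]
arl = B[0][n] / B[0][0]
print(arl)   # close to the value reported for m = 20 in Table 7.4
```

The same routine handles any density f: only the function F needs to change.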
Theorem 7.5 For every given n,

  (i) lim_{m→∞} P(W(m) > n | S_0(m) = a_0) = P(W > n | S_0 = a_0), and
  (ii) lim_{m→∞} P(W(m) > n | S_0(m) = a_0) = p_00 P(W > n-1 | S_1 = a_0) + ∫_0^h P(W > n-1 | S_1 = z) f(z) dz.

Proof. Given m, it follows from the definitions of W and W(m) that for every n = 1, 2, ..., we have

  P(W > n | S_0 = a_0) = P(S_n < h | S_0 = a_0),   (7.42)

and

  P(W(m) > n | S_0(m) = a_0) = P(S_n(m) < h | S_0(m) = a_0).

It follows from the construction of p_ij and the definition of S_n(m) that, for all n, S_n(m) converges to S_n as m → ∞; i.e.

  lim_{m→∞} P(S_n(m) < h | S_0(m) = a_0) = P(S_n < h | S_0 = a_0).   (7.43)

The result (i) follows immediately from Eq. (7.43). Further, since [0, h] is a compact set, the convergence (for fixed n) is uniform. It follows from Theorem 5.1, the definition of integration, and Eq. (7.41) that, as m → ∞ (d → 0),

  P(W(m) > n | S_0(m) = a_0) = Σ_{i=0}^m p_0i e_i A^{n-1}(m) 1'(m)
  = p_00 P(W(m) > n-1 | S_1(m) = a_0) + Σ_{i=1}^m p_0i P(W(m) > n-1 | S_1(m) = a_i)
  → p_00 P(W > n-1 | S_1 = a_0) + ∫_0^h P(W > n-1 | S_1 = z) f(z) dz. □
To illustrate the methodology developed above, two examples with different probability densities are given in the following to provide some numerical results on the ARLs and standard deviations of the run length for the upper-sided CUSUM, the EWMA, and the Shewhart control schemes.

Example 7.6 (Upper-sided CUSUM). Consider that (i) UCL = h = 3, (ii) m = 20, 40, ..., 1000, and (iii) the sequence of random variables {Z_n} has a common exponential density function f(z) = exp{-z}, 0 ≤ z < ∞. The transition probabilities of A(m) are obtained from Eq. (7.40), and Table 7.4 gives the expected mean and standard deviation of W(m) using Theorem 7.4.

Table 7.4 Expected mean and standard deviation of run length for the upper-sided CUSUM control scheme in Example 7.6.

   m    | ARL = E[W(m)] | sqrt(Var[W(m)])
   20   |    3.78584    |    1.84553
   40   |    3.89026    |    1.79263
   80   |    3.94445    |    1.76339
   200  |    3.97761    |    1.74486
   500  |    3.99101    |    1.73723
   1000 |    3.99550    |    1.73466
In view of Table 7.4, on average the process is "out of control" after four trials. This demonstrates that the CUSUM detects an upward shift very quickly, due to the fact that the process S_n = max{0, S_{n-1} + Z_n} is monotonically increasing here, with an average upshift in this example of EZ_n = 1 at every step. This property can be viewed as an advantage, and sometimes as a disadvantage, of using the CUSUM control scheme. ◊

Example 7.7 (EWMA control scheme). We assume (i) the UCL and LCL (upper and lower control limits) are h and -h, respectively, (ii) each side of a_0 = 0 has m states and an absorbing state a_{m+1} = h (or -h), resulting in 2(m+1)+1 states in total, (iii) the states are equally spaced with d = h/(m+1), and (iv) a standard normal density function

  f(z) = (1/√(2π)) e^{-z²/2}, -∞ < z < ∞.
Given S_{n-1}(m) = id and S_n(m) = jd, it follows from the definition of the EWMA scheme given in Eq. (7.37) that z has to satisfy the following two inequalities:

  (1 - λ)id + λz ≥ (j - 1/2)d, and
  (1 - λ)id + λz < (j + 1/2)d.
Hence the transition probabilities of A(m) are, for i, j = -m, ..., -1, 0, 1, ..., m:

  p_ij = Φ( [(j + 1/2)d - (1 - λ)id] / λ ) - Φ( [(j - 1/2)d - (1 - λ)id] / λ ),
  p_i(m+1) = 1 - Φ( [(m + 1/2)d - (1 - λ)id] / λ ),
  p_i(-m-1) = Φ( [(-m - 1/2)d - (1 - λ)id] / λ ),   (7.44)
  p_(m+1)j = p_(-m-1)j = 0, and
  p_(m+1)(m+1) = p_(-m-1)(-m-1) = 1,

where each p_ij represents the probability of the control statistic moving from state i to state j, and Φ(·) is the standard normal cumulative distribution function. Consider h = 3 and λ = 0.2. For the EWMA scheme based on the standard normal distribution, Table 7.5 gives the run-length mean and standard deviation holding all but the number of divisions (states) constant, using the results of Theorem 7.4(iii).

Table 7.5 Expected mean and standard deviation of run length for the EWMA control scheme in Example 7.7.
   m   | ARL = E[W(m)] | sqrt(Var[W(m)])
   12  |    514.266    |    509.678
   22  |    545.073    |    540.540
   32  |    552.691    |    548.172
   52  |    557.075    |    552.565
   102 |    559.101    |    554.594
   502 |    559.636    |    555.132
When λ is set to 1, the EWMA control scheme reduces to the well-known Shewhart scheme. For a Shewhart chart, the ARL and the variance of the run length are 1/p and 1/p², respectively, where p is the probability of an out-of-control signal being detected by the Shewhart control scheme:

  p = 2 ∫_3^∞ (1/√(2π)) e^{-z²/2} dz ≈ 0.00269980.
For m = 502 in our imbedding framework, it can be seen from Table 7.6 for the Shewhart chart that the ARL and the standard deviation of W(m) are equal to 370.377 and 369.877, respectively. The numerical results converge to the theoretical results (ARL = standard deviation = 1/p = 370.398) as m increases (m → ∞). ◊

Table 7.6 Expected mean and standard deviation of run length for the Shewhart control scheme in Example 7.7.
   m   | ARL = E[W(m)] | sqrt(Var[W(m)])
   12  |    345.711    |    345.210
   22  |    362.443    |    361.943
   32  |    366.551    |    366.051
   52  |    368.918    |    368.418
   102 |    370.004    |    369.504
   502 |    370.377    |    369.877
Note that the procedure developed here is independent of the distribution associated with the characteristic being monitored, and requires only the density function f(z) to be specified in order to determine the transition probability matrix. This allows more flexibility in the selection of density functions for proper modeling of the sequence {Z_n} under realistic conditions. With some modifications, the method can also be extended to the case where the control limit h is a function of n. The imbedded chain is then a non-homogeneous, tree-structured Markov chain (cf. Section 2.3), and the state spaces and transition probability matrices need to be constructed for every n = 1, 2, ....
7.5.2 Compound Control Schemes
A control scheme composed of two or more simple control schemes φ_1, φ_2, ..., φ_r is referred to as a compound control scheme φ = ∪_{j=1}^r φ_j, in the sense that an "out-of-control" signal is issued if and only if at least one of the simple control schemes has issued an "out-of-control" signal. This is equivalent to saying that the run length (waiting time) of a compound control scheme φ can be
defined as

  W(φ) = min(W(φ_1), ..., W(φ_r)),   (7.45)

where W(φ_j) is the out-of-control run length of the simple control scheme φ_j. The technique for obtaining the distribution of the waiting time of a compound pattern developed in Chapter 5 has been extended recently by Fu, Shmueli and Chang (2002) to study compound control schemes. To illustrate this extension, we give the following example.

Example 7.8 Let us consider an upper-sided CUSUM control scheme which, in addition to the upper control limit h, also has a "warning limit" h*. The compound control scheme φ = φ_1 ∪ φ_2 is to signal at time t if the scheme φ_1, the CUSUM control scheme S_t = max{0, S_{t-1} + X_t}, exceeds the upper control limit at time t, or if the scheme φ_2 signals that two of three consecutive CUSUM statistics fall in the interval between the warning and control limits, [h*, h). We assume that the process outcomes X_t are i.i.d. N(0, 1). Let D(X) be the discretized standard normal variable X in the following sense:

  D(X) = iΔ, i = 0, ±1, ±2, ..., ±(m+1),

where Δ is a small positive constant (Δ > 0). We define p_i = P(D(X) = iΔ) and F(i) = P(D(X) ≤ iΔ) as follows:

  p_i = ∫_{(i-0.5)Δ}^{(i+0.5)Δ} (1/√(2π)) e^{-x²/2} dx, i = 0, ±1, ..., ±m,   (7.46)

  p_{m+1} = ∫_{[(m+1)-0.5]Δ}^{∞} (1/√(2π)) e^{-x²/2} dx,

  p_{-(m+1)} = ∫_{-∞}^{[-(m+1)+0.5]Δ} (1/√(2π)) e^{-x²/2} dx, and   (7.47)

  F(i) = Σ_{j=-(m+1)}^{i} p_j.

We denote the upper control limit by h = (m+1)Δ and the warning limit by h* = m*Δ, m* < m. Then S_t takes on the values iΔ, i = 0, 1, ..., (m+1), in the interval [0, h].
In addition, the chart in this example is naturally divided by the second rule φ_2 into three regions: for t ≥ 1,

  R(S_t) = r_1 if 0 ≤ S_t < h*,
           r_2 if h* ≤ S_t < h,   (7.48)
           r_3 if S_t ≥ h,

and we define an initial state R(S_0) = 0. For simplicity, we write S_t = i to denote S_t = iΔ. Define a state space

  Ω = {(∅, 0), (0, 0), ..., (0, m), (r_1, 0), ..., (r_1, m), (r_2, 0), ..., (r_2, m*-1), α},   (7.49)

with 2 + 2(m+1) + m* states, where (∅, 0) denotes the initial state of the chain. States that include r_3 or (m+1) as one of the coordinates can be combined into the absorbing state α. Using the forward and backward principle, we define a homogeneous Markov chain {Y_t} on the state space Ω as, for t ≥ 1,

  Y_t = α if either W(φ_1) ≤ t or W(φ_2) ≤ t,
        [R(S_{t-1}), S_t] otherwise.   (7.50)

For t ≥ 1, note that if Y_t is not in the absorbing state, it will not move into the absorbing state α at time t+1 if (a) {S_{t+1} < h*}, or if (b) {h* ≤ S_{t+1} < h, R(S_{t-1}) = 0 and R(S_t) = r_1}, or if (c) {h* ≤ S_{t+1} < h, R(S_{t-1}) = R(S_t) = r_1}. It follows from Eqs. (7.47) and (7.50) that the transition probabilities of the imbedded Markov chain can be specified by the following equation: for t ≥ 1,

  P(Y_{t+1} = [R(S_t), S_{t+1}] | Y_t = [R(S_{t-1}), S_t])
    = F(-S_t) if S_{t+1} = 0,
      p_{S_{t+1} - S_t} if 0 < S_{t+1} < h*,   (7.51)
      p_{S_{t+1} - S_t} if condition (b) or (c) holds,
      0 otherwise.

Since the transition probabilities of each row add to 1, the probability of moving into the absorbing state α can be obtained by subtracting all the
other probabilities in the row from 1; i.e.,

  P(Y_{t+1} = α | Y_t = [R(S_{t-1}), S_t])
    = 1 - Σ_{[R(S_t), S_{t+1}] ∈ Ω - α} P(Y_{t+1} = [R(S_t), S_{t+1}] | Y_t = [R(S_{t-1}), S_t]).   (7.52)

To illustrate such a transition probability matrix, we choose m = 3, h = 4 (Δ = h/(m+1) = 1) and h* = 2 (m* = h*/Δ = 2). Then D(X_t) = iΔ = i, i = 0, ±1, ±2, ±3, ±4, and S_t can take on the values i = 0, 1, 2, 3, 4 (with 4 denoting the absorbing state α). The transition probability matrix M is given below, with the rows and columns ordered (∅,0), (0,0), (0,1), (0,2), (0,3), (r_1,0), (r_1,1), (r_1,2), (r_1,3), (r_2,0), (r_2,1), α, and only the nonzero entries of each row shown:

  (∅,0):   → (0,0): F(0), (0,1): p_1, (0,2): p_2, (0,3): p_3, α: 1-F(3)
  (0,0):   → (r_1,0): F(0), (r_1,1): p_1, (r_1,2): p_2, (r_1,3): p_3, α: 1-F(3)
  (0,1):   → (r_1,0): F(-1), (r_1,1): p_0, (r_1,2): p_1, (r_1,3): p_2, α: 1-F(2)
  (0,2):   → (r_2,0): F(-2), (r_2,1): p_{-1}, α: 1-F(-1)
  (0,3):   → (r_2,0): F(-3), (r_2,1): p_{-2}, α: 1-F(-2)
  (r_1,0): → (r_1,0): F(0), (r_1,1): p_1, (r_1,2): p_2, (r_1,3): p_3, α: 1-F(3)
  (r_1,1): → (r_1,0): F(-1), (r_1,1): p_0, (r_1,2): p_1, (r_1,3): p_2, α: 1-F(2)
  (r_1,2): → (r_2,0): F(-2), (r_2,1): p_{-1}, α: 1-F(-1)
  (r_1,3): → (r_2,0): F(-3), (r_2,1): p_{-2}, α: 1-F(-2)
  (r_2,0): → (r_1,0): F(0), (r_1,1): p_1, α: 1-F(1)
  (r_2,1): → (r_1,0): F(-1), (r_1,1): p_0, α: 1-F(0)
  α:       → α: 1,

and it is again written in the block form

  M = [ N  C ]
      [ 0  1 ],

where N is the essential submatrix and C is the column of absorption probabilities. Note that here we assume that S_0 is at the initial state 0 with probability one, i.e. P(S_0 = 0) = 1. Hence the states (∅,0), (0,0), ..., (0,m) are needed so that Y_0 and Y_1 of the imbedded Markov chain {Y_t} can be properly defined with the initial distribution P(Y_0 = (∅,0)) = 1. Traditionally, the mean and standard deviation of the run-length distribution are used to compare the performance of control schemes. Since our numerical results show that the run-length distribution of a compound control scheme is rather skewed toward the right, we feel that displaying the quartiles along with the mean and standard deviation of the distribution is more appropriate, as in Table 7.7, for example. In general, the run-length distribution of a compound control scheme becomes more skewed as the number of simple control schemes involved increases. It can be seen from
Table 7.7  Mean, standard deviation and quartiles of run length in Example 7.8.

      m    ARL = E[W(m)]   sqrt(Var[W(m)])     Q1       Q2       Q3
      5       11.739            9.386         4.626    8.444   14.904
     14       12.749           10.187         5.066    9.213   16.232
     29       13.103           10.473         5.223    9.486   16.701
     74       13.319           10.649         5.316    9.649   16.976
    149       13.392           10.709         5.348    9.703   17.075
    299       13.428           10.738         5.364    9.730   17.124
    749       13.450           10.756         5.373    9.747   17.154
   1499       13.457           10.762         5.376    9.753   17.164
   1874       13.459           10.764         5.376    9.753   17.164
Table 7.7 that the numerical results are already rather accurate for the case of m = 299, and that they converge quickly. Generally speaking, there are so far no analytical formulae for the ARLs of compound control schemes (see Montgomery 2001), not even in the i.i.d. case. The finite Markov chain imbedding technique, however, always provides numerical solutions for the run-length distribution (including the ARL, quantiles, variance, etc.) for both simple and compound control schemes.
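All of the quantities reported in Table 7.7 (ARL, standard deviation, quartiles) can be read off the imbedded chain once M = [[N, C], [0, 1]] is in hand, since P(W > t) = xi N^t 1. The following is a minimal sketch under that assumption; the two-state toy values of N and xi below are hypothetical, not the scheme of Example 7.8.

```python
import numpy as np

def run_length_summary(N, xi, probs=(0.25, 0.50, 0.75), tol=1e-12):
    """Mean, standard deviation and quantiles of the run length W,
    where P(W > t) = xi N^t 1 for a substochastic block N and
    initial distribution xi over the non-absorbing states."""
    N = np.asarray(N, dtype=float)
    xi = np.asarray(xi, dtype=float)
    one = np.ones(N.shape[0])
    I = np.eye(N.shape[0])
    inv = np.linalg.inv(I - N)
    # ARL = E[W] = xi (I - N)^{-1} 1;  E[W(W+1)] = 2 xi (I - N)^{-2} 1
    arl = xi @ inv @ one
    ew2 = 2.0 * (xi @ inv @ inv @ one) - arl      # E[W^2]
    sd = np.sqrt(ew2 - arl ** 2)
    # Quantiles by accumulating P(W = t) = xi N^{t-1} C with C = (I - N) 1.
    quantiles, cdf, t, v = [], 0.0, 0, xi.copy()
    targets = list(probs)
    while targets and cdf < 1.0 - tol:
        t += 1
        cdf += v @ (I - N) @ one                  # add P(W = t)
        v = v @ N                                 # advance to xi N^t
        while targets and cdf >= targets[0]:
            quantiles.append(t)
            targets.pop(0)
    return arl, sd, quantiles

# Toy two-state chain: for xi = (1, 0) this W is geometric with p = 0.1,
# so arl = 10, sd = sqrt(90), and the quartiles are 3, 7, 14.
N = np.array([[0.8, 0.1],
              [0.0, 0.9]])
xi = np.array([1.0, 0.0])
arl, sd, q = run_length_summary(N, xi)
```

The same routine applies unchanged to a compound scheme: only the size of N grows with the number of simple rules combined, which is precisely why the imbedding approach yields ARLs and quantiles numerically where no closed-form expressions are available.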
Bibliography
Abramson, M. and Moser, W. O. J. (1967). Permutations without rising or falling w-sequences. Annals of Mathematical Statistics 38, 1245-1254.
Aki, S. (1985). Discrete distributions of order k on a binary sequence. Annals of the Institute of Statistical Mathematics 37, 205-224.
Aki, S. (1997). On sooner and later problems between success and failure runs. Advances in Combinatorial Methods and Applications to Probability and Statistics (ed. N. Balakrishnan), Birkhäuser, Boston, 385-400.
Aki, S. (1999). Distributions of runs and consecutive systems on directed trees. Annals of the Institute of Statistical Mathematics 51, 1-15.
Aki, S., Balakrishnan, N. and Mohanty, S. G. (1996). Sooner and later waiting time problems for success and failure runs in higher order Markov dependent trials. Annals of the Institute of Statistical Mathematics 48, 773-787.
Aki, S. and Hirano, K. (1988). Some characteristics of the binomial distribution of order k and related distributions. Statistical Theory and Data Analysis II (ed. K. Matusita), North-Holland, Amsterdam, 211-222.
Aki, S. and Hirano, K. (1999). Sooner and later waiting time problems for runs in Markov dependent bivariate trials. Annals of the Institute of Statistical Mathematics 51, 17-29.
Aki, S. and Hirano, K. (2000). Numbers of success-runs of specified length until certain stopping time rules and generalized binomial distributions of order k. Annals of the Institute of Statistical Mathematics 52, 767-777.
Aki, S., Kuboki, H. and Hirano, K. (1984). On discrete distributions of order k. Annals of the Institute of Statistical Mathematics 36, 431-440.
Antzoulakos, D. L. (1999). On waiting time problems associated with runs in Markov dependent trials. Annals of the Institute of Statistical Mathematics 51, 323-330.
Antzoulakos, D. L. (2001). Waiting times for patterns in a sequence of multistate trials. Journal of Applied Probability 38, 508-518.
Balakrishnan, N. and Koutras, M. V. (2002). Runs and Scans with Applications, Wiley, New York.
Balasubramanian, K., Viveros, R. and Balakrishnan, N. (1993). Sooner and later waiting time problems for Markovian Bernoulli trials. Statistics and Probability Letters 18, 153-161.
Barnard, G. A. (1959). Control charts and stochastic processes. Journal of the Royal Statistical Society, Series B 21, 239-271.
Barton, D. E. and David, F. N. (1958). Non-randomness in a sequence of two alternatives: II. Runs test. Biometrika 45, 253-256.
Bateman, G. (1948). On the power function of the longest run as a test for randomness in a sequence of alternatives. Biometrika 35, 97-112.
Boutsikas, M. V. and Koutras, M. V. (2000a). Generalized reliability bounds for coherent structures. Journal of Applied Probability 37, 778-794.
Boutsikas, M. V. and Koutras, M. V. (2000b). Reliability approximation for Markov chain imbeddable systems. Methodology and Computing in Applied Probability 2, 393-411.
Brook, D. and Evans, D. A. (1972). An approach to the probability distribution of cusum run length. Biometrika 59, 539-549.
Cai, J. (1994). Reliability of a large consecutive-k-out-of-r-from-n:F system with unequal component-reliability. IEEE Transactions on Reliability 43, 107-111.
Carlitz, L. (1964). Extended Bernoulli and Eulerian numbers. Duke Mathematical Journal 31, 667-689.
Chao, M. T. (1999). Applications of Markov chains in quality-related matters. Statistical Process Monitoring and Optimization (eds. S. H. Park and G. G. Vining), Marcel Dekker, New York, 175-188.
Chao, M. T. and Fu, J. C. (1989). A limit theorem of certain repairable systems. Annals of the Institute of Statistical Mathematics 41, 809-818.
Chao, M. T. and Fu, J. C. (1991). The reliability of large series system under a Markovian structure. Advances in Applied Probability 23, 894-908.
Chao, M. T., Fu, J. C. and Koutras, M. V. (1995). Survey of reliability studies of consecutive-k-out-of-n:F and related systems. IEEE Transactions on Reliability 44, 120-127.
Chao, M. T. and Lin, G. D. (1984). Economical design of large consecutive-k-out-of-n:F systems. IEEE Transactions on Reliability 33, 411-413.
Chen, J. and Glaz, J. (1997). Approximations and inequalities for the distribution of a scan statistic for 0-1 Bernoulli trials. Advances in the Theory and Practice of Statistics (eds. N. L. Johnson and N. Balakrishnan), Wiley, New York, 285-298.
Chen, J. and Glaz, J. (1999). Approximations for the distribution and the moments of discrete scan statistics. Scan Statistics and Applications (eds. J. Glaz and N. Balakrishnan), Birkhäuser, Boston, 27-66.
Cheung, L. K. W. (2002). Statistical Pattern Recognition in Genomic DNA Sequences. Ph.D. Dissertation, Department of Statistics, University of Manitoba, Canada.
Chiang, D. T. and Niu, S. C. (1981). Reliability of consecutive-k-out-of-n:F systems. IEEE Transactions on Reliability 30, 87-89.
Chrysaphinou, O. and Papastavridis, S. (1988). A limit theorem on the number of overlapping appearances of a pattern in a sequence of independent trials. Probability Theory and Related Fields 79, 129-143.
Cochran, W. G. (1938). An extension of Gold's method for examining the apparent persistence of one type of weather. Quarterly Journal of the Royal Meteorological Society 64, 631-634.
Csörgő, S. (1979). Erdős-Rényi laws. Annals of Statistics 7, 772-787.
David, F. N. (1947). A power function for tests of randomness in a sequence of alternatives. Biometrika 34, 335-339.
David, F. N. and Barton, D. E. (1962). Combinatorial Chance, Hafner, New York.
Derman, C., Lieberman, G. J. and Ross, S. M. (1982). On the consecutive-k-out-of-n:F system. IEEE Transactions on Reliability 31, 57-63.
Dillon, J. F. and Roselle, D. P. (1969). Simon Newcomb's problem. SIAM Journal on Applied Mathematics 17, 1086-1093.
Doi, M. and Yamamoto, E. (1998). On the joint distribution of runs in a sequence of multi-state trials. Statistics and Probability Letters 39, 133-141.
Dwass, M. (1973). The number of increases in a random permutation. Journal of Combinatorial Theory, Series A 15, 192-199.
Ebneshahrashoob, M. and Sobel, M. (1990). Sooner and later problems for Bernoulli trials: frequency and run quotas. Statistics and Probability Letters 9, 5-11.
Erdős, P. and Rényi, A. (1970). On a new law of large numbers. Journal d'Analyse Mathématique 23, 103-111.
Erdős, P. and Révész, P. (1975). On the length of the longest head-run. Topics in Information Theory, Colloquia Mathematica Societatis János Bolyai, 16 (eds. I. Csiszár and P. Elias; Keszthely, Hungary), North-Holland, Amsterdam, 219-228.
Ewan, W. D. and Kemp, K. W. (1960). Sampling inspection of continuous processes with no autocorrelation between successive results. Biometrika 47, 363-380.
Feller, W. (1968). An Introduction to Probability Theory and Its Applications (Vol. I, 3rd ed.), Wiley, New York.
Fu, J. C. (1985). Reliability of consecutive-k-out-of-n:F system. IEEE Transactions on Reliability 34, 127-130.
Fu, J. C. (1986). Reliability of consecutive-k-out-of-n:F systems with (k-1) step Markov dependence. IEEE Transactions on Reliability 35, 602-606.
Fu, J. C. (1995). Exact and limiting distributions of the number of successions in a random permutation. Annals of the Institute of Statistical Mathematics 47, 435-446.
Fu, J. C. (1996). Distribution theory of runs and patterns associated with a sequence of multi-state trials. Statistica Sinica 6, 957-974.
Fu, J. C. (2001). Distribution of scan statistics for a sequence of bi-state trials. Journal of Applied Probability 38, 1-9.
Fu, J. C. and Chang, Y. M. (2002). On probability generating functions for waiting time distributions of compound patterns in a sequence of multistate trials. Journal of Applied Probability 39, 70-80.
Fu, J. C. and Hu, B. (1987). On reliability of a large consecutive-k-out-of-n:F system with k-1 step Markov dependence. IEEE Transactions on Reliability 36, 75-77.
Fu, J. C. and Koutras, M. V. (1994). Distribution theory of runs: a Markov chain approach. Journal of the American Statistical Association 89, 1050-1058.
Fu, J. C. and Lou, W. Y. W. (1991). On reliabilities of certain large linearly connected engineering systems. Statistics and Probability Letters 12, 291-296.
Fu, J. C. and Lou, W. Y. W. (2000a). On the exact distribution of SECON and its application. Statistica Sinica 10, 999-1010.
Fu, J. C. and Lou, W. Y. W. (2000b). Joint distribution of rises and falls. Annals of the Institute of Statistical Mathematics 52, 415-425.
Fu, J. C., Lou, W. Y. W., Bai, Z. D. and Li, G. (2002). The exact and limiting distributions for the number of successes in success runs within a sequence of Markov-dependent two-state trials. Annals of the Institute of Statistical Mathematics 54, 719-730.
Fu, J. C., Lou, W. Y. W. and Chen, S. C. (1999). On the probability of pattern matching in nonaligned DNA sequences: a finite Markov chain imbedding approach. Scan Statistics and Applications (eds. J. Glaz and N. Balakrishnan), Birkhäuser, Boston, 287-302.
Fu, J. C., Lou, W. Y. W. and Wang, Y. J. (1999). On the exact distributions of Eulerian and Simon Newcomb numbers associated with random permutations. Statistics and Probability Letters 42, 115-125.
Fu, J. C., Shmueli, G. and Chang, Y. M. (2002). A unified Markov chain approach for computing the run length distribution for control charts with simple or compound rules. Technical Report, Department of Statistics, University of Manitoba.
Fu, J. C., Spiring, F. A. and Xie, H. (2002). On the average run lengths of quality control schemes using a Markov chain approach. Statistics and Probability Letters 56, 369-380.
Glaz, J. (1989). Approximations and bounds for the distribution of the scan statistic. Journal of the American Statistical Association 84, 560-566.
Glaz, J. (1992). Approximations for tail probabilities and moments of the scan statistic. Computational Statistics and Data Analysis 14, 213-227.
Glaz, J., Naus, J. I. and Wallenstein, S. (2001). Scan Statistics, Springer-Verlag, New York.
Godbole, A. P. (1990). Specific formulae for some success run distributions. Statistics and Probability Letters 10, 119-124.
Godbole, A. P. (1991). Poisson approximations for runs and patterns of rare events. Advances in Applied Probability 23, 851-865.
Goncharov, V. L. (1944). On the field of combinatory analysis. Isvestija Akad. Nauk. SSSR. Ser. Math. 8, 3-48 (in Russian); English translation: Translations of the AMS Ser. Math. 19 (1962), 1-46.
Goodman, L. A. (1958). Simplified runs tests and likelihood ratio tests for Markoff chains. Biometrika 45, 181-197.
Han, Q. and Aki, S. (1998). Formulae and recursions for the joint distributions of success runs of several lengths in a two-state Markov chain. Statistics and Probability Letters 40, 203-214.
Han, Q. and Aki, S. (2000a). Sooner and later waiting time problems based on a dependent sequence. Annals of the Institute of Statistical Mathematics 52, 407-414.
Han, Q. and Aki, S. (2000b). Waiting time problems in a two-state Markov chain. Annals of the Institute of Statistical Mathematics 52, 778-789.
Hirano, K. (1986). Some properties of the distributions of order k. Fibonacci Numbers and Their Applications (eds. A. N. Philippou, G. E. Bergum, and A. F. Horadam), Reidel, Dordrecht, 43-53.
Hirano, K. and Aki, S. (1987). Properties of the extended distributions of order k. Statistics and Probability Letters 6, 67-69.
Hirano, K. and Aki, S. (1993). On the number of occurrences of success runs of specified length in a two-state Markov chain. Statistica Sinica 3, 313-320.
Huntington, R. J. and Naus, J. I. (1975). A simpler expression for kth nearest neighbor coincidence probabilities. Annals of Probability 3, 894-896.
Hwang, F. K. (1982). Fast solutions for consecutive-k-out-of-n:F system. IEEE Transactions on Reliability 31, 447-448.
Hwang, F. K. (1986). Simplified reliabilities for consecutive-k-out-of-n systems. SIAM Journal on Algebraic and Discrete Methods 7, 258-264.
Jackson, D. M. and Reilly, J. W. (1976). Permutations with a prescribed number of p-runs. Ars Combinatoria 1, 297-305.
Johnson, B. C. (2001). Distribution of increasing l-sequences in a random permutation. Methodology and Computing in Applied Probability 3, 35-49.
Johnson, B. C. (2002). The distribution of increasing 2-sequences in random permutations of arbitrary multi-sets. Statistics and Probability Letters 59, 67-74.
Johnson, B. C. and Fu, J. C. (2000). The distribution of increasing l-sequences in random permutations: A Markov chain approach. Statistics and Probability Letters 49, 337-344.
Kaplansky, I. (1944). Symbolic solution of certain problems in permutations. Bulletin of the American Mathematical Society 50, 906-914.
Karlin, S. and McGregor, J. (1959). Coincident probabilities. Pacific Journal of Mathematics 9, 1141-1164.
Kontoleon, J. M. (1980). Reliability determination of a r-successive-out-of-n:F system. IEEE Transactions on Reliability 29, 437.
Kossow, A. and Preuss, W. (1989). Reliability of consecutive-k-out-of-n:F systems with nonidentical component reliabilities. IEEE Transactions on Reliability 38, 229-233.
Koutras, M. V. (1996a). On a Markov chain approach for the study of reliability structures. Journal of Applied Probability 33, 357-367.
Koutras, M. V. (1996b). On a waiting time distribution in a sequence of Bernoulli trials. Annals of the Institute of Statistical Mathematics 48, 789-806.
Koutras, M. V. (1997a). Waiting time distributions associated with runs of fixed length in two-state Markov chains. Annals of the Institute of Statistical Mathematics 49, 123-139.
Koutras, M. V. (1997b). Waiting times and number of appearances of events in a sequence of discrete random variables. Advances in Combinatorial Methods and Applications to Probability and Statistics (ed. N. Balakrishnan), Birkhäuser, Boston, 363-384.
Koutras, M. V. (2003). Applications of Markov chains to the distribution theory of runs and patterns. Handbook of Statistics 21: Stochastic Processes, Modeling and Simulation (eds. D. N. Shanbhag and C. R. Rao), Elsevier, Amsterdam, in press.
Koutras, M. V. and Alexandrou, V. (1995). Runs, scans and urn model distributions: a unified Markov chain approach. Annals of the Institute of Statistical Mathematics 47, 743-766.
Koutras, M. V. and Alexandrou, V. (1997a). Non-parametric randomness tests based on success runs of fixed length. Statistics and Probability Letters 32, 393-404.
Koutras, M. V. and Alexandrou, V. (1997b). Sooner waiting time problems in a sequence of trinary trials. Journal of Applied Probability 34, 593-609.
Koutras, M. V. and Papastavridis, S. G. (1993). Application of the Stein-Chen method for bounds and limit theorems in the reliability of coherent structures. Naval Research Logistics 40, 617-631.
Ling, K. D. (1992). A generalization of the sooner and later waiting time problems for Bernoulli trials: frequency quota. Statistics and Probability Letters 14, 401-405.
Ling, K. D. and Low, T. Y. (1993). On the soonest and the latest waiting time distributions: succession quotas. Communications in Statistics - Theory and Methods 22, 2207-2221.
Lou, W. Y. W. (1996). On runs and longest run tests: method of finite Markov chain imbedding. Journal of the American Statistical Association 91, 1595-1601.
Lou, W. Y. W. (1997). An application of the method of finite Markov chain imbedding to runs tests. Statistics and Probability Letters 31, 155-161.
Lou, W. Y. W. (2000). The exact distribution of the continuity of care measure NOP. Statistics and Probability Letters 48, 361-368.
Lou, W. Y. W. (2001). The distribution of the usual provider continuity index under Markov dependence. Statistics and Probability Letters 54, 269-276.
Lou, W. Y. W. (2003). The exact distribution of the K-tuple statistic for sequence homology. Statistics and Probability Letters 1, 51-59.
Lucas, J. M. and Crosier, R. B. (1982). Fast initial response for CUSUM quality control schemes: Give your CUSUM a head start. Technometrics 24, 199-205.
MacMahon, P. A. (1915). Combinatory Analysis, Cambridge University Press, London.
Mohanty, S. G. (1994). Success runs of length k in Markov dependent trials. Annals of the Institute of Statistical Mathematics 46, 777-796.
Montgomery, D. C. (2001). Introduction to Statistical Quality Control (4th ed.), Wiley, New York.
Mood, A. M. (1940). The distribution theory of runs. Annals of Mathematical Statistics 11, 367-392.
Mosteller, F. (1941). Note on an application of runs to quality control charts. Annals of Mathematical Statistics 12, 228-232.
Muselli, M. (2000). Useful inequalities for the longest run distribution. Statistics and Probability Letters 46, 239-249.
Nagaev, S. V. (1957). Some limit theorems for stationary Markov chains. Theory of Probability and its Applications 2, 378-406.
Naus, J. I. (1965). The distribution of the size of the maximum cluster of points on a line. Journal of the American Statistical Association 60, 532-538.
Naus, J. I. (1974). Probabilities for a generalized birthday problem. Journal of the American Statistical Association 69, 810-815.
Naus, J. I. (1982). Approximations for distributions of scan statistics. Journal of the American Statistical Association 77, 177-183.
Nishimura, K. and Sibuya, M. (1997). Extended Stirling family of discrete probability distributions. Communications in Statistics - Theory and Methods 26, 1727-1744.
Papastavridis, S. G. (1988). A Weibull limit for the reliability of a consecutive k-within-m-out-of-n system. Advances in Applied Probability 20, 690-692.
Papastavridis, S. G. and Koutras, M. V. (1993). Bounds for reliability of consecutive k-within-m-out-of-n:F systems. IEEE Transactions on Reliability 42, 156-160.
Philippou, A. N. (1986). Distributions and Fibonacci polynomials of order k, longest runs, and reliability of consecutive-k-out-of-n:F systems. Fibonacci Numbers and Their Applications (eds. A. N. Philippou, G. E. Bergum and A. F. Horadam), Reidel, Dordrecht, 203-227.
Philippou, A. N., Georghiou, C. and Philippou, G. N. (1983). A generalized geometric distribution and some of its properties. Statistics and Probability Letters 1, 171-175.
Philippou, A. N. and Makri, F. S. (1986). Success runs and longest runs. Statistics and Probability Letters 4, 211-215.
Pyke, R. (1961). Markov renewal processes: definitions and preliminary properties. Annals of Mathematical Statistics 32, 1231-1242.
Reilly, J. W. and Tanny, S. M. (1979). Counting successions in permutations. Studies in Applied Mathematics 61, 73-81.
Rényi, A. (1970). Probability Theory, American Elsevier Publishing Company Inc., New York.
Riordan, J. (1958). An Introduction to Combinatorial Analysis, Wiley, New York.
Roselle, D. P. (1968). Permutations by number of rises and successions. Proceedings of the American Mathematical Society 19, 8-16.
Ross, S. M. (2000). Introduction to Probability Models (7th ed.), Academic Press, San Diego.
Rubin, G., McCulloch, C. E. and Shapiro, M. A. (1990). Multinomial runs tests to detect clustering in constrained free recall. Journal of the American Statistical Association 85, 315-320.
Saperstein, B. (1972). The generalized birthday problem. Journal of the American Statistical Association 67, 425-428.
Schilling, M. F. (1990). The longest run of heads. The College Mathematics Journal 21, 196-207.
Seneta, E. (1981). Non-negative Matrices and Markov Chains (2nd ed.), Springer-Verlag, New York.
Sheng, K. N. and Naus, J. I. (1994). Pattern matching between two non-aligned random sequences. Bulletin of Mathematical Biology 56, 1143-1162.
Steinwachs, D. M. (1979). Measuring provider continuity in ambulatory care. Medical Care 17, 551-565.
Swed, F. S. and Eisenhart, C. (1943). Tables for testing randomness of grouping in a sequence of alternatives. Annals of Mathematical Statistics 14, 66-87.
Tanny, S. (1973). A probabilistic interpretation of Eulerian numbers. Duke Mathematical Journal 40, 717-722.
Tanny, S. M. (1976). Permutations and successions. Journal of Combinatorial Theory, Series A 21, 196-202.
Uchida, M. and Aki, S. (1995). Sooner and later waiting time problems in a two-state Markov chain. Annals of the Institute of Statistical Mathematics 47, 415-433.
Vaggelatou, E. (2003). On the length of the longest run in a multi-state Markov chain. Statistics and Probability Letters 62, 211-221.
Wald, A. and Wolfowitz, J. (1940). On a test whether two samples are from the same population. Annals of Mathematical Statistics 11, 147-162.
Wigle, D. T. (1982). Prevalence of selected chronic diseases in Canada, 1978-1979. Chronic Disease in Canada 3, 9.
Wishart, J. and Hirshfeld, H. O. (1936). A theorem concerning the distribution of joins between line segments. Journal of the London Mathematical Society 11, 227-235.
Wolfowitz, J. (1943). On the theory of runs with some applications to quality control. Annals of Mathematical Statistics 14, 280-288.
Worpitzky, J. (1883). Studien über die Bernoullischen und Eulerschen Zahlen. Journal für die reine und angewandte Mathematik 94, 203-232.
Index
Absorbing state, 18, 22
Average run length (ARL), 140-147
Bernoulli trials, 2, 25, 64
Binomial distribution of order k, 26
Chapman-Kolmogorov equation, 8, 9, 19
Compound control scheme, 140, 147
Compound pattern, 11, 52-56, 64, 66, 91-93
Cumulative sum (CUSUM), 140-145
Ending block, 27, 31, 34, 50-52
Eulerian number, 12, 97, 107, 113
Exponentially weighted moving average (EWMA), 140, 141, 145
Finite Markov chain, 5
    imbedding, 13
    imbeddable variables of binomial type, 16
First-entry probability, 21-24
Forward and backward principle, 49-53
General geometric distribution, 63
Geometric distribution of order k, 40, 63, 64
Hypothesis testing, 127
Joint distribution, 60, 127, 128
Large deviation approximation, 39, 84-87, 120-122
Longest success run, 25, 37-39, 72
Markov chain imbeddable variables of binomial type (MVBs), 16
Mean of waiting time, 73, 76-80
Moments, 14, 143
Negative binomial, 63, 64
    of order k, 64
Non-overlap counting, 10, 49
    consecutive k successes, 26
Overlap counting, 10, 57
    consecutive k successes, 33
Probability generating function, 2, 14, 40, 72-75, 80
Quality control schemes, 140-151
Random permutations, 12, 97
Ruin problem, 6
Runs of exactly k successes, 10, 25, 35
Runs test, 127
Scan statistics, 89-96
Sequential continuity (SECON), 15, 133-139
Series pattern, 11, 59, 60
Simple pattern, 11, 50, 65, 66
Simon Newcomb number, 97, 107-112
Sooner waiting time, 64, 72
Spectrum analysis, 84-87
Success runs
    of size greater than or equal to k, 10, 25, 31
    number of successes in, 44-48
System reliability, 117-119
    consecutive-k-out-of-n:F system, 119-126
    linearly connected system, 126
Tree-structured Markov chain, 9
Waiting-time distribution, 40, 63
    simple pattern, 40, 65
    compound pattern, 66-72
    sooner, 72