COMBINATORIAL GROUP TESTING AND ITS APPLICATIONS
SERIES ON APPLIED MATHEMATICS Editor-in-Chief: Frank Hwang Associate Editors-in-Chief: Zhong-ci Shi and Kunio Tanabe
Vol. 1
International Conference on Scientific Computation eds. T. Chan and Z.-C. Shi
Vol. 2
Network Optimization Problems — Algorithms, Applications and Complexity eds. D.-Z. Du and P. M. Pardalos
Vol. 3
Combinatorial Group Testing and Its Applications by D.-Z. Du and F. K. Hwang
Vol. 4
Computation of Differential Equations and Dynamical Systems eds. K. Feng and Z.-C. Shi
Vol. 5
Numerical Mathematics eds. Z.-C. Shi and T. Ushijima
Series on Applied Mathematics Volume 3
Ding-Zhu Du Department of Computer Science University of Minnesota and Institute of Applied Mathematics Academia Sinica, Beijing
Frank K. Hwang AT&T Bell Laboratories Murray Hill
World Scientific
Singapore • New Jersey • London • Hong Kong
Published by World Scientific Publishing Co. Pte. Ltd. P O Box 128, Farrer Road, Singapore 9128 USA office: Suite 1B, 1060 Main Street, River Edge, NJ 07661 UK office: 73 Lynton Mead, Totteridge, London N20 8DH
Library of Congress Cataloging-in-Publication Data Du, Dingzhu. Combinatorial group testing and its applications / Ding-Zhu Du, Frank K. Hwang. p. cm. — (Series on applied mathematics; vol. 3) Includes bibliographical references and index. ISBN 9810212933 1. Combinatorial group theory. I. Hwang, Frank. II. Title. III. Series: Series on applied mathematics v. 3. QA182.5.D8 1993 512'.2-dc20 93-26812 CIP
Copyright © 1993 by World Scientific Publishing Co. Pte. Ltd. All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher. For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 27 Congress Street, Salem, MA 01970, USA.
Printed in Singapore by JBW Printers & Binders Pte. Ltd.
Preface

Group testing has been around for fifty years. It started as an idea for doing large-scale blood testing economically. When such needs subsided, group testing stayed dormant for many years until it was revived by the needs of new industrial testing. Later, group testing also emerged from many nontesting situations, such as experimental designs, multiaccess communication, coding theory, clone library screening, nonlinear optimization, computational complexity, etc. With a potential worldwide outbreak of AIDS, group testing just might come full circle and become an effective tool in blood testing again. Another fertile area for application is testing zonal environmental pollution.

The group testing literature can be generally divided into two types, probabilistic and combinatorial. In the former, a probability model is used to describe the distribution of defectives, and the goal is to minimize the expected number of tests. In the latter, a deterministic model is used and the goal is usually to minimize the number of tests under a worst-case scenario. While both types are important, we focus on the second type in this book because of the different flavors of the two types of results.

To find optimal algorithms for combinatorial group testing is difficult, and there are not many optimal results in the existing literature. In fact, the computational complexity of combinatorial group testing has not been determined. We suspect that the general problem is hard in some complexity class, but do not know which class. (It is known that the problem belongs to the class PSPACE, but it seems not to be PSPACE-complete.) The difficulty is that the input consists of two or more integers, which is too simple for complexity analysis. However, even if a proof of hardness is eventually given, this does not spell the end of the subject, since the subject has many, many branches, each posing a different set of challenging problems.
This book is not only the first attempt to collect the theory and applications of combinatorial group testing in one place; it also carries the personal perspective of the authors, who have worked on this subject for a quarter of a century. We hope that this book will provide a forum and focus for further research on this subject, and also be a source for references and publications. Finally, we thank E. Barillot, A.T. Borchers, R.V. Book, G.J. Chang, F.R.K. Chung, A.G. Dyachkov, D. Kelley, K.-I Ko, M. Parnes, D. Raghavarao, M. Ruszinko, V.V. Rykov, J. Spencer, M. Sobel, U. Vaccaro, and A.C. Yao for their encouragement and helpful discussions at various stages of the formation of this book. Of course, the oversights and errors are our sole responsibility.
Contents

Preface

Chapter 1 Introduction
1.1 The History of Group Testing
1.2 The Binary Tree Representation of a Group Testing Algorithm and the Information Lower Bound
1.3 The Structure of Group Testing
1.4 Number of Group Testing Algorithms
1.5 A Prototype Problem and Some Basic Inequalities
1.6 Variations of the Prototype Problem
References

Chapter 2 General Algorithms
2.1 Li's s-Stage Algorithm
2.2 Hwang's Generalized Binary Splitting Algorithm
2.3 The Nested Class
2.4 (d, n) Algorithms and Merging Algorithms
2.5 Some Practical Considerations
2.6 An Application to Clone Screenings
References

Chapter 3 Algorithms for Special Cases
3.1 Two Disjoint Sets Each Containing Exactly One Defective
3.2 An Application to Locating Electrical Shorts
3.3 The 2-Defective Case
3.4 The 3-Defective Case
3.5 When is Individual Testing Minimax?
3.6 Identifying a Single Defective with Parallel Tests
References

Chapter 4 Nonadaptive Algorithms and Binary Superimposed Codes
4.1 The Matrix Representation
4.2 Basic Relations and Bounds
4.3 Constant Weight Matrices and Random Codes
4.4 General Constructions
4.5 Special Constructions
References

Chapter 5 Multiaccess Channels and Extensions
5.1 Multiaccess Channels
5.2 Nonadaptive Algorithms
5.3 Two Variations
5.4 The k-Channel
5.5 Quantitative Channels
References

Chapter 6 Some Other Group Testing Models
6.1 Symmetric Group Testing
6.2 Some Additive Models
6.3 A Maximum Model
6.4 Some Models for d = 2
References

Chapter 7 Competitive Group Testing
7.1 The First Competitiveness
7.2 Bisecting
7.3 Doubling
7.4 Jumping
7.5 The Second Competitiveness
7.6 Digging
7.7 Tight Bound
References

Chapter 8 Unreliable Tests
8.1 Ulam's Problem
8.2 General Lower and Upper Bounds
8.3 Linearly Bounded Lies (1)
8.4 The Chip Game
8.5 Linearly Bounded Lies (2)
8.6 Other Restrictions on Lies
References

Chapter 9 Optimal Search in One Variable
9.1 Midpoint Strategy
9.2 Fibonacci Search
9.3 Minimum Root Identification
References

Chapter 10 Unbounded Search
10.1 Introduction
10.2 Bentley-Yao Algorithms
10.3 Search with Lies
10.4 Unbounded Fibonacci Search
References

Chapter 11 Group Testing on Graphs
11.1 On Bipartite Graphs
11.2 On Graphs
11.3 On Hypergraphs
11.4 On Trees
11.5 Other Constraints
References

Chapter 12 Membership Problems
12.1 Examples
12.2 Polyhedral Membership
12.3 Boolean Formulas and Decision Trees
12.4 Recognition of Graph Properties
References

Chapter 13 Complexity Issues
13.1 General Notions
13.2 The Prototype Problem is in PSPACE
13.3 Consistency
13.4 Determinacy
13.5 On Sample Space S(n)
13.6 Learning by Examples
References

Index
1 Introduction
Group testing has been around for fifty years. While the group testing literature traditionally employs probabilistic models, the combinatorial model has carved out its own share and become an important part of the literature. Furthermore, combinatorial group testing has become closely tied to many computer science subjects: complexity theory, computational geometry and learning models, among others. It has also been used in multiaccess communication and coding.
1.1
The History of Group Testing
Unlike many other mathematical problems, which can be traced back to earlier centuries and divergent sources, the origin of group testing is pretty much pinned down to a fairly recent event, World War II, and is usually credited to a single person, Robert Dorfman. The following is his recollection after 50 years (quoted from a November 17, 1992 letter in response to our inquiry about the role of Rosenblatt):

"The date was 1942 or early '43. The place was Washington, DC, in the offices of the Price Statistics Branch of the Research Division of the Office of Price Administration, where David Rosenblatt and I were both working. The offices were located in a temporary building that consisted of long wings, chock-full of desks without partitions. The drabness of life in those wings was relieved by occasional bull sessions. Group testing was first conceived in one of them, in which David Rosenblatt and I participated. Being economists, we were all struck by the wastefulness of subjecting blood samples from millions of draftees to identical analyses in order to detect a few thousand cases of syphilis. Someone (who?) suggested that it might be economical to pool the blood samples, and the idea was batted back and forth. There was lively give-and-take and some persiflage. I don't recall how explicitly the problem was formulated there. What is clear is that I took the idea seriously enough so that in the next few days I formulated the underlying probability problem
and worked through the algebra (which is pretty elementary). Shortly after, I wrote it up, presented it at a meeting of the Washington Statistical Association, and submitted the four-page note that was published in the Annals of Mathematical Statistics. By the time the note was published, Rosenblatt and I were both overseas and out of contact."

We also quote from an October 18, 1992 letter from David Rosenblatt to Milton Sobel which provides a different perspective.

"It is now (Fall of 1992) over fifty years ago when I first invented and propounded the concepts and procedure for what is now called "group testing" in statistics. I expounded it in the Spring of 1942 the day after I reported for induction during World War II and underwent blood sampling for the Wasserman test. I expounded it before a group of fellow statisticians in the Division of Research of the Office of Price Administration in Washington, D.C. Among my auditors that morning was my then colleague Robert Dorfman."

Considering that fifty years have elapsed between the event and the recollections, we find the two accounts reasonably close to each other. Whatever discrepancies there are, they are certainly within the normal boundary of differences associated with human memory. Thus that "someone" (in Dorfman's letter) who first suggested pooling blood samples could very well be David Rosenblatt. It is also undisputed that Dorfman alone wrote the seminal report [1], published in the Notes Section of the journal Annals of Mathematical Statistics, which gave a method intended to be used by the United States Public Health Service and the Selective Service System to weed out all syphilitic men called up for induction. We quote from [1]:

"Under this program each prospective inductee is subjected to a 'Wasserman-type' blood test. The test may be divided conveniently into two parts: 1. A sample of blood is drawn from the man. 2. The blood sample is subjected to a laboratory analysis which reveals the presence or absence of "syphilitic antigen." The presence of syphilitic antigen is a good indication of infection. When this procedure is used, n chemical analyses are required in order to detect all infected members of a population of size n. The germ of the proposed technique is revealed by the following possibility. Suppose that after the individual blood sera are drawn they are pooled in groups of, say, five and that the groups rather than the individual sera are subjected to chemical analysis. If none of the five sera contributing to
the pool contains syphilitic antigen, the pool will not contain it either and will test negative. If, however, one or more of the sera contain syphilitic antigen, the pool will also contain it and the group test will reveal its presence (the author inserted a note here saying that diagnostic tests for syphilis are extremely sensitive and will show positive results for even great dilutions of antigen). The individuals making up the pool must then be retested to determine which of the members are infected. It is not necessary to draw a new blood sample for this purpose since sufficient blood for both the test and the retest can be taken at once. The chemical analysis requires only small quantities of blood."
[Figure 1.1 contrasts two pools: if one or more of the sera in the pool contain syphilitic antigen, then a test may be wasted; but if no one in the pool contains syphilitic antigen, then many tests are saved.]

Figure 1.1: The idea of group testing.

Unfortunately, this very promising idea of grouping blood samples for syphilis screening was not actually put to use. The main reason, communicated to us by C. Eisenhart, was that the test was no longer accurate when as few as eight or nine samples were pooled. Nevertheless, test accuracy could have improved over the years, or dilution might not be a serious problem in screening for another disease. Therefore we quoted the above from Dorfman at length not only because of its historical significance, but also because in this age of a potential AIDS epidemic, Dorfman's clear account of applying group testing to screen syphilitic individuals may have new impact on the medical world and the health service sector.
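Dorfman's two-stage procedure is easy to simulate. The following sketch is our own illustration (the population size, defective positions and pool size are invented for the example): pool the samples in groups of k, test each pool, and retest individually only the members of contaminated pools.

```python
def dorfman_tests(samples, k):
    """Count the tests used by Dorfman's two-stage pooling.

    samples: list of booleans, True meaning defective (e.g. a syphilitic serum).
    k: pool size.  Returns the total number of tests performed.
    """
    tests = 0
    for i in range(0, len(samples), k):
        pool = samples[i:i + k]
        tests += 1                    # one test on the pooled sera
        if any(pool):                 # contaminated pool: retest each member
            tests += len(pool)
    return tests

# 100 samples, defectives at positions 3 and 57, pools of five:
samples = [i in (3, 57) for i in range(100)]
print(dorfman_tests(samples, 5))      # 20 pool tests + 2 pools * 5 retests = 30
```

With no defectives at all, the 100 samples cost only 20 tests instead of 100; each contaminated pool adds at most one pool's worth of retests.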
Dorfman's blood testing problem found its way into a very popular textbook on probability as an exercise, Feller's 1950 book "An Introduction to Probability Theory and Its Applications, Vol. I" [2], and thus might have survived as a coffee-break talk-piece in academic circles in those days. But by and large, with the conclusion of the Second World War and the release of millions of inductees, the need for group testing disappeared from the Selective Service, and the academic world was ready to bury it as a war episode. The only exception was a short note published by Sterrett [9] in 1951, based on his Ph.D. dissertation at the University of Pittsburgh.

Then Sobel and Groll [8], two Bell Laboratories scientists, gave the phrase "group testing" new meaning by giving the subject a very thorough treatment and establishing many new grounds for future studies in their 74-page paper. Again, they were motivated by a practical need, this time from the industrial sector, to remove all leakers from a set of n devices. We quote from Sobel and Groll:

"One chemical apparatus is available and the devices are tested by putting x of them (where 1 ≤ x ≤ n) in a bell jar and testing whether any of the gas used in constructing the devices has leaked out into the bell jar. It is assumed that the presence of gas in the bell jar indicates only that there is at least one leaker and that the amount of gas gives no indication of the number of leakers."

Sobel and Groll also mentioned other industrial applications such as testing condensers and resistors; the main idea is very well demonstrated by the Christmas tree lighting problem. A batch of light bulbs is electrically arranged in series and tested by applying a voltage across the whole batch or any subset thereof. If the lights are on, then the whole tested subset of bulbs must be good; if the lights are off, then at least one bulb in the subset is defective. Call the set of defectives among the n items the defective set.
Dorfman, as well as Sobel and Groll, studied group testing under probabilistic models; namely, a probability distribution is attached to the defective set and the goal is to minimize the expected number of tests required to identify the defective set. Katona [5] first emphasized the combinatorial aspects of group testing. However, his coverage was predominantly for the case of a single defective, and he still considered probability distributions on defectives. In this volume a more restrictive viewpoint on combinatorial group testing (CGT) is taken by completely eliminating probability distributions on defectives. The presumed knowledge on the defective set is that it must be a member, called a sample, of a given family called a sample space. For example, the sample space can consist of all d-subsets of the n items, when the presumed knowledge is that there are exactly d defectives among the n items. This sample space is denoted by S(d, n). An item is said to be defective (or good) in a sample if it is in (or not in) the sample. The goal in CGT is to minimize the number of tests under the worst scenario. A best algorithm under this goal is called a minimax algorithm. The reason that probabilistic models are excluded is not because they are less important or less interesting, but simply
because there is so much to tell about group testing and this is a natural way to divide the material.

Li [6] was the first to study CGT. He was concerned with the situation where industrial and scientific experiments are conducted only to determine which of the variables are important. Usually, only a relatively small number of critical variables exists among a large group of candidates. These critical variables are assumed to have effects too large to be masked by the experimental error or the combined effect of the unimportant variables. Interpreting each variable as an item, each critical variable as a defective, and each experiment as a group test, a large effect from an experiment indicates the existence of a critical variable among the variables covered by the experiment. Li assumed that there are exactly d critical variables to start with, and set out to minimize the worst-case number of tests.

Since Li, CGT has been studied alongside PGT (probabilistic group testing) for the classical applications in the medical, industrial and statistical fields. Recently, CGT has also been studied in complexity theory, graph theory, learning models, communication channels and fault-tolerant computing. While it is very encouraging to see such wide interest in group testing, one unfortunate consequence is that the results obtained are fragmented and submerged in the jargon of the particular fields. This book is an attempt to give a unified and coherent account of up-to-date results in combinatorial group testing.
1.2
The Binary Tree Representation of a Group Testing Algorithm and the Information Lower Bound
A binary tree can be inductively defined as a node, called the root, with its two disjoint binary trees, called the left and right subtrees of the root, either both empty or both nonempty. Nodes occurring in the two subtrees are called descendants of the root (the two immediate descendants are called children), and all nodes having a given node as a descendant are its ancestors (the immediate ancestor is called the parent). Two children of the same parent are siblings. Nodes which have no descendants are called leaves, and all other nodes are called internal nodes. The path length of a node is the number of that node's ancestors. A node is said to be at level l if its path length is l − 1. The depth of a binary tree is the maximal level over all leaves.

Let S denote the sample space. Then a group testing algorithm T for S can be represented by a binary tree, also denoted by T, by the following rules: (i) Each internal node u is associated with a test t(u), and its two links with the two outcomes of t(u) (we will always designate the negative outcome by the left link). The test history H(u) of a node u is the set of tests and outcomes associated with the nodes and links on the path of u. (ii) Each node u is also associated with an event S(u) which consists of all the members of S consistent with H(u). |S(v)| ≤ 1 for each leaf v.
[Figure 1.2 shows a binary testing tree: internal nodes (circles) are tests, leaves (squares) are samples, a link labeled 0 carries the negative outcome and a link labeled 1 the positive outcome.]
Figure 1.2: The binary tree representation.

Since the test t(u) simply splits S(u) into two disjoint subsets, every member of S must appear in one and only one S(v) for some leaf v. An algorithm is called reasonable if no test whose outcome is predictable is allowed. For a reasonable algorithm each S(u) is split into two nonempty subsets. Thus |S(v)| = 1 for every leaf v, i.e., there exists a one-to-one mapping between S and the leaf-set. Let p(s) denote the path of the leaf v associated with the sample s and let |p(s)| denote its length. Then M_T(S) = max_{s∈S} |p(s)| is the depth of T. Since the same subset will not be tested more than once in a reasonable algorithm, the number of tests, and consequently the number of reasonable algorithms, is finite. Therefore we can define

M(S) = min_T M_T(S).

An algorithm which achieves M(S) is called a minimax algorithm for S. The goal of CGT is to find a minimax algorithm for a given S; usually, we settle for a good heuristic. The determination of M(S), and obtaining good bounds for it, are challenging problems, unsettled for most S. Let ⌈x⌉ (⌊x⌋) denote the smallest (largest) integer not less (not greater) than x. Also let log x mean log_2 x throughout unless otherwise specified.

Theorem 1.2.1 M(S) ≥ ⌈log |S|⌉.
Proof: Consider a group testing algorithm T for the problem S. At each internal node u of T, the test t(u) splits S(u) into two disjoint subsets according to whether the outcome is negative or positive. The cardinality of one of the two subsets must be at least half of |S(u)|. Therefore M(S) ≥ ⌈log |S|⌉. ∎

We will refer to the bound in Theorem 1.2.1 as the information lower bound.

Lemma 1.2.2 ∑_{s∈S} 2^{−|p(s)|} = 1 for every reasonable algorithm.

Proof. True for |S| = 1. A straightforward induction on |S| proves the general case. ∎

Theorem 1.2.1 and Lemma 1.2.2 are well known in the binary tree literature. The proofs are included here for completeness. A group testing algorithm T for the sample space S is called admissible if there does not exist another algorithm T' for S such that

|p(s)| ≥ |p'(s)| for all s ∈ S

with at least one strict inequality true, where p'(s) is the path of s in T'.

Theorem 1.2.3 A group testing algorithm is admissible if and only if it is reasonable.

Proof. That "reasonable" is necessary is obvious. To show sufficiency, suppose to the contrary that T is an inadmissible reasonable algorithm, i.e., there exists T' such that

|p(s)| ≥ |p'(s)| for all s ∈ S

with at least one strict inequality true. Then ∑_{s∈S} 2^{−|p'(s)|} > ∑_{s∈S} 2^{−|p(s)|} = 1, a contradiction to Lemma 1.2.2. ∎
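Lemma 1.2.2 is the Kraft equality for binary trees, and it is easy to check mechanically. The sketch below is our own illustration: it builds the even-splitting tree for the single-defective sample space S(1, n) and verifies that the leaf depths |p(s)| satisfy the Kraft equality and meet the information lower bound of Theorem 1.2.1.

```python
import math

def leaf_depths(m, depth=0):
    """Leaf depths of the even-splitting binary tree on m samples.

    Each test splits the current candidate set as evenly as possible,
    which realizes the information lower bound for S(1, n).
    """
    if m == 1:
        return [depth]
    half = m // 2
    return leaf_depths(half, depth + 1) + leaf_depths(m - half, depth + 1)

for n in (2, 5, 16, 33):
    depths = leaf_depths(n)                 # one leaf per sample in S(1, n)
    kraft = sum(2.0 ** -d for d in depths)
    assert kraft == 1.0                     # Lemma 1.2.2 (exact in binary floats)
    assert max(depths) == math.ceil(math.log2(n))   # Theorem 1.2.1 achieved
print("Kraft equality and information bound verified")
```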
From now on only admissible, or reasonable, algorithms are considered in this volume. The word "algorithm" implies an admissible algorithm.
1.3
The Structure of Group Testing
The information lower bound is usually not achievable. For example, consider a set of six items containing exactly two defectives. Then |S| = 15, the number of 2-subsets of six items, and ⌈log |S|⌉ = ⌈log 15⌉ = 4. If a subset of one item is tested, the split is 10 (negative) and 5 (positive); it is 6 and 9 for a subset of two, 3 and 12 for a subset of three, and 1 and 14 for a subset of four. In every case the larger branch contains at least 9 samples, so by Theorem 1.2.1 at least four more tests are required after the first; hence five tests are necessary while the information bound is four.
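The split sizes above are just binomial counts: testing a k-subset sends the samples avoiding the subset, of which there are (6−k choose 2), to the negative branch and the rest to the positive branch. A small check of our own:

```python
from math import comb, ceil, log2

n, d = 6, 2
total = comb(n, d)                    # |S| = 15, so the information bound is 4
assert ceil(log2(total)) == 4

for k in range(1, 5):                 # test a subset of k of the six items
    negative = comb(n - k, d)         # both defectives avoid the tested subset
    positive = total - negative
    print(k, negative, positive)

# the larger branch always holds at least 9 > 2**3 samples, so at least
# ceil(log2(9)) = 4 further tests are needed after the first one
assert all(max(comb(n - k, d), total - comb(n - k, d)) >= 9 for k in range(1, 5))
```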
The reason that the information lower bound cannot be achieved in general for group testing is that the split of S(u) at an internal node u is not arbitrary, but must be realizable by a group test. Therefore it is of importance to study which types of splitting are permissible in group testing. The rest of this section reports work done by Hwang, Lin and Mallows [4].

While a group testing algorithm certainly performs the tests in order, starting from the root of the tree and proceeding to the leaves, the analysis is often more convenient if started from the leaves (that is the way the Huffman tree, a minimum weighted binary tree, is constructed). Thus instead of asking what splits are permissible, the question becomes: for two children x, y of a node u, what types of S(x) and S(y) are permitted to merge into S(u)?

Let N denote a set of n items, D the defective set and S_0 = {D_1, ..., D_k} the initial sample space. Without loss of generality assume ∪_{i=1}^k D_i = N, for any item not in ∪_{i=1}^k D_i can immediately be identified as good and deleted from N. A subset S_1 of S_0 is said to be realizable if there exist a group testing tree T for S_0 and a node u of T such that S(u) = S_1. A partition π = (S_1, ..., S_m) of S_0 is said to be realizable if there exist a group testing tree T for S_0 and a set of m nodes (u_1, ..., u_m) of T such that S(u_i) = S_i for 1 ≤ i ≤ m.

Define ||S|| = ∪_{A∈S} A, its complement N \ ||S||, and the closure S̄ = {A : A ⊆ ||S||, A ⊇ some D ∈ S}. Furthermore, let π = (S_1, S_2, ..., S_m) be a partition of S_0, i.e., S_i ∩ S_j = ∅ for all i ≠ j and ∪_{S_i∈π} S_i = S_0. Then S_i and S_j are said to be separable if there exists an I ⊆ N such that I ∩ D = ∅ for all D ∈ S_i and I ∩ D ≠ ∅ for all D ∈ S_j (or vice versa).

Given π, define a directed graph G_π by taking each S_i as a node, with a directed edge from S_i to S_j (written S_i → S_j), i ≠ j, if and only if there exist A_i ∈ S̄_i and A_j ∈ S̄_j such that A_i ⊆ A_j.
Theorem 1.3.1 The following statements are equivalent:
(i) S_i and S_j are not separable.
(ii) S_i → S_j → S_i in G_π.
(iii) S̄_i ∩ S̄_j ≠ ∅.
Proof: By showing (i) ⇒ (ii) ⇒ (iii) ⇒ (i).

(i) ⇒ (ii): It is first shown that if S_i ↛ S_j, then I = (N \ ||S_j||) ∩ ||S_i|| ≠ ∅ and I separates S_i and S_j. Clearly, if I = ∅, then ||S_i|| ⊆ ||S_j||; since ||S_i|| ∈ S̄_i and ||S_j|| ∈ S̄_j, it follows that S_i → S_j, a contradiction. Thus I ≠ ∅. Also, I ∩ D = ∅ for all D ∈ S_j since I ⊆ N \ ||S_j||. Furthermore, if there is a D ∈ S_i such that I ∩ D = ∅, then from the definition of I, (N \ ||S_j||) ∩ D = ∅ and hence D ⊆ ||S_j||, so that D ∈ S̄_j. This implies S_i → S_j, again a contradiction. Therefore I ∩ D ≠ ∅ for all D ∈ S_i, and I separates S_i and S_j. The proof is similar if S_j ↛ S_i.

(ii) ⇒ (iii): Suppose S_i → S_j → S_i. Let A_i, A'_i ∈ S̄_i and A_j, A'_j ∈ S̄_j be such that A_i ⊆ A_j and A'_j ⊆ A'_i. Furthermore, let w = ||S_i|| ∩ ||S_j||. Then w ≠ ∅. Since A_i ⊆ A_j ⊆ ||S_j||, A_i ⊆ ||S_i|| ∩ ||S_j|| = w ⊆ ||S_i||, and thus w ∈ S̄_i. A similar argument shows w ∈ S̄_j. Therefore S̄_i ∩ S̄_j ≠ ∅.

(iii) ⇒ (i): Suppose w ∈ S̄_i ∩ S̄_j (w ≠ ∅) but S_i and S_j are separable. Let I ⊆ N be a subset such that I ∩ D ≠ ∅ for all D ∈ S_i and I ∩ D = ∅ for all D ∈ S_j. Now w ∈ S̄_i implies there exists a D ∈ S_i such that D ⊆ w, hence I ∩ w ⊇ I ∩ D ≠ ∅. On the other hand, w ∈ S̄_j implies w ⊆ ||S_j||, so I ∩ w = ∅, a contradiction. Therefore S_i and S_j are not separable. ∎
Theorem 1.3.2 π is realizable if and only if G_π does not contain a directed cycle.

Proof. Suppose G_π has a cycle C. Let π' be a partition of S_0 obtained by merging two (separable) sets S_i and S_j in π. From Theorem 1.3.1, C cannot be the cycle S_i → S_j → S_i, for otherwise S_i and S_j would not be separable, and therefore not mergeable. Since every edge in G_π except those between S_i and S_j is preserved in G_π', G_π' must contain a cycle C'. Repeating this argument, one eventually obtains a partition in which no two parts are separable. Therefore π is not realizable.

Next assume that G_π contains no cycle. Then G_π induces a partial ordering on the S_i's with S_i < S_j if and only if S_i → S_j. Let S_j be a minimal element in this ordering. Then I = ||S_j|| separates S_j from all other S_i's. Let G_π' be obtained from G_π by deleting the node S_j and its edges. Clearly, G_π' contains no cycle. Therefore the above argument can be repeated to find a set of tests which separate one S_i at a time. ∎

Unfortunately, local conditions, i.e., conditions on S_i and S_j alone, are not sufficient to tell whether a pair is validly mergeable or not. (S_i and S_j are validly mergeable if merging S_i and S_j preserves the realizability of the partition.) Theorem 1.3.1 provides some local necessary conditions, while the following corollaries to Theorem 1.3.2 provide more such conditions as well as some sufficient conditions.
Corollary 1.3.3 Let M be the set of S_i's which are maximal under the partial ordering induced by G_π. (If |M| = 1, add to M = {S_i} an element S_j such that S_j < S_i and S_j ↛ S_k for all other S_k.) Then every pair in M is validly mergeable.

Corollary 1.3.4 Let π be realizable. Then the pair S_i, S_j is validly mergeable if there does not exist some other S_k such that either S_i → S_k or S_j → S_k.

Corollary 1.3.5 S_i and S_j are not validly mergeable if there exists another S_k such that either S_i → S_k → S_j or vice versa.

Corollary 1.3.6 Let π be the partition of S_0 in which every S_i consists of just a single element. Then π is realizable.
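Theorems 1.3.1 and 1.3.2 give a mechanical realizability test: form the closures, build G_π, and look for a directed cycle. The sketch below is our own toy illustration (the item set and sample space are invented for the example), with N = {1, 2} and S_0 = {{1}, {2}, {1, 2}}.

```python
from itertools import chain, combinations

def closure(S):
    """Closure of S: all nonempty subsets A of ||S|| containing some D in S."""
    cover = frozenset().union(*S)                       # ||S||
    subsets = chain.from_iterable(
        combinations(sorted(cover), r) for r in range(1, len(cover) + 1))
    return {frozenset(A) for A in subsets
            if any(D <= frozenset(A) for D in S)}

def realizable(partition):
    """Theorem 1.3.2: a partition is realizable iff G_pi is acyclic."""
    cl = [closure(S) for S in partition]
    succ = {i: [j for j in range(len(partition)) if j != i
                and any(a <= b for a in cl[i] for b in cl[j])]
            for i in range(len(partition))}
    state = {}                                          # DFS cycle detection
    def acyclic(i):
        if state.get(i) == "done":
            return True
        if state.get(i) == "open":                      # back edge: cycle found
            return False
        state[i] = "open"
        ok = all(acyclic(j) for j in succ[i])
        state[i] = "done"
        return ok
    return all(acyclic(i) for i in succ)

D1, D2, D12 = frozenset({1}), frozenset({2}), frozenset({1, 2})
print(realizable([[D1], [D2], [D12]]))   # singleton partition: True (Cor. 1.3.6)
print(realizable([[D1, D2], [D12]]))     # {D1, D2} vs {D12}: False
```

The singleton partition is realizable, as Corollary 1.3.6 promises, while merging {1} and {2} into one part makes that part and {{1, 2}} inseparable, creating the cycle of Theorem 1.3.1 and destroying realizability.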
Let S ⊆ S_0 (S ≠ ∅) and let π = (S, S_1, S_2, ..., S_m) be the partition of S_0 in which each S_i consists of a single element D_i. The following theorem shows a connection between the realization of S and the realization of π.

Theorem 1.3.7 The following statements are equivalent:
(i) S is realizable.
(ii) S̄ ∩ S_0 = S.
(iii) π is realizable.

Proof. We show (i) ⇒ (ii) ⇒ (iii) ⇒ (i).

(i) ⇒ (ii): Since S ⊆ S_0, clearly S̄ ∩ S_0 ⊇ S, so only S̄ ∩ S_0 ⊆ S needs to be shown. Suppose instead that there exists a D ∈ S̄ ∩ S_0 with D ∉ S. Then D ∈ S̄, and hence there is some D' ∈ S such that D' ⊆ D; also {D} = S_i for some i. But D ∈ S̄ implies S_i → S, and D' ⊆ D implies S → S_i. Hence S is not realizable by Theorem 1.3.2.

(ii) ⇒ (iii): If π is not realizable, then G_π contains a directed cycle. Since each S_i is a single element, S̄_i = S_i, and hence any directed cycle must contain S, say S_1 → S_2 → ... → S_k → S → S_1. But S_1 → S_2 → ... → S_k → S implies S_1 → S, since the S_i's are single elements. By Theorem 1.3.1, S → S_1 → S implies S̄ ∩ S̄_1 = S̄ ∩ {D_1} ≠ ∅. This implies D_1 ∈ S̄ ∩ S_0. Since D_1 ∉ S, S̄ ∩ S_0 ≠ S.

(iii) ⇒ (i): Trivially true by the definition of realizability. ∎
1.4
Number of Group Testing Algorithms
O n e interesting p r o b l e m is t o count t h e n u m b e r of group testing .algorithms for S. This is t h e t o t a l n u m b e r of binary trees with \S\ leaves (labeled by m e m b e r s of S) which satisfy t h e group testing s t r u c t u r e . While this problem r e m a i n s open, Moon and Sobel [7] counted for a class of algorithms when t h e sample space S is t h e power set of n i t e m s . Call a group pure if it contains no defective, and contaminated otherwise. A group testing algorithm is nested if whenever a contaminated group is known, t h e next group to be tested m u s t be a proper subset of t h e c o n t a m i n a t e d group. (A more detailed definition is given in Section 2.3.) Sobel and Groll [8] proved L e m m a 1 . 4 . 1 Let U be the set of unclassified items and suppose that C C U is tested to be contaminated. Furthermore, suppose C" C C is then tested to be contaminated. Then items in C\C can be mixed with items inU\C without losing any information. Proof. Since C" being c o n t a m i n a t e d implies C being c o n t a m i n a t e d , t h e sample space given b o t h C and C" being c o n t a m i n a t e d is t h e same as only C" being c o n t a m i n a t e d . B u t u n d e r t h e l a t t e r case, items in C \ C and U\C are indistinguishable. • T h u s u n d e r a nested algorithm, at any stage t h e set of unclassified i t e m s is characterized by two p a r a m e t e r s m and n, where m > 0 is t h e n u m b e r of items in a
contaminated group and n is the total number of unclassified items. Let f(m, n) denote the number of nested algorithms when the sample space is characterized by such m and n. By using the "nested" property, Moon and Sobel [7] obtained

f(0, 0) = 1,
f(0, n) = sum_{k=1}^{n} f(0, n−k) f(k, n)   for n ≥ 1,
f(1, n) = f(0, n−1)   for n ≥ 1,
f(m, n) = sum_{k=1}^{m−1} f(m−k, n−k) f(k, n)   for n ≥ m > 1,
where k is the size of the group to be tested. Recall that the Catalan numbers

C_k = (1/k) C(2k−2, k−1)

satisfy the recurrence relation
C_k = sum_{i=1}^{k−1} C_i C_{k−i}

for k ≥ 2. Moon and Sobel gave

Theorem 1.4.2

f(0, n) = C_{n+1} prod_{i=1}^{n−1} f(0, i)   for n ≥ 2,
f(m, n) = C_m prod_{i=1}^{m} f(0, n−i)   for 1 ≤ m ≤ n.
Proof. Theorem 1.4.2 is easily verified for f(0, 2) and f(1, n). The general case is proved by induction:

f(0, n) = sum_{k=1}^{n} f(0, n−k) f(k, n)
        = sum_{k=1}^{n} (C_{n−k+1} prod_{i=1}^{n−k−1} f(0, i)) (C_k prod_{i=1}^{k} f(0, n−i))
        = (sum_{k=1}^{n} C_k C_{n−k+1}) prod_{i=1}^{n−1} f(0, i)
        = C_{n+1} prod_{i=1}^{n−1} f(0, i),

f(m, n) = sum_{k=1}^{m−1} f(m−k, n−k) f(k, n)
        = sum_{k=1}^{m−1} (C_{m−k} prod_{i=1}^{m−k} f(0, n−k−i)) (C_k prod_{i=1}^{k} f(0, n−i))
        = (sum_{k=1}^{m−1} C_k C_{m−k}) prod_{i=1}^{m} f(0, n−i)
        = C_m prod_{i=1}^{m} f(0, n−i). •
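The recurrence and the closed form of Theorem 1.4.2 can be cross-checked numerically. The following Python sketch (ours; the names f, catalan and f0_closed are not from the text) evaluates both:

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def f(m, n):
    """Moon-Sobel recurrence for the number of nested algorithms."""
    if m == 0:
        if n == 0:
            return 1
        return sum(f(0, n - k) * f(k, n) for k in range(1, n + 1))
    if m == 1:
        return f(0, n - 1)
    return sum(f(m - k, n - k) * f(k, n) for k in range(1, m))

def catalan(k):
    """C_k = (1/k) * binom(2k-2, k-1)."""
    return comb(2 * k - 2, k - 1) // k

def f0_closed(n):
    """Theorem 1.4.2: f(0,n) = C_{n+1} * prod_{i=1}^{n-1} f(0,i)."""
    if n <= 1:
        return 1
    prod = 1
    for i in range(1, n):
        prod *= f0_closed(i)
    return catalan(n + 1) * prod

assert all(f(0, n) == f0_closed(n) for n in range(2, 7))
print([f(0, n) for n in range(1, 7)])
# -> [1, 2, 10, 280, 235200, 173859840000]
```

The printed values are the F(n) tabulated after Corollary 1.4.5.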
Define F(n) = f(0, n). The following recurrence relations follow readily from Theorem 1.4.2 and the definition of the Catalan numbers.

Corollary 1.4.3 If n > 1, then

F(n) = (C_{n+1}/C_n) F(n−1)² = (2(2n−1)/(n+1)) F(n−1)².
Corollary 1.4.4 If n ≥ 2, then

F(n) = C_{n+1} C_n C_{n−1}² C_{n−2}⁴ ⋯ C_2^{2^{n−2}}.
Corollary 1.4.5 If n ≥ 1, then

F(n) = 4^{2^n − 1} prod_{i=1}^{n} (1 − 3/(2(i+1)))^{2^{n−i}}.

The first few values of F(n) are

n    : 1   2   3    4     5         6
F(n) : 1   2   10   280   235,200   173,859,840,000
The limiting behavior of F(n) can be derived from the formula in Corollary 1.4.5.

Corollary 1.4.6

lim_{n→∞} {F(n)}^{2^{−n}} = 4 prod_{i=1}^{∞} (1 − 3/(2(i+1)))^{2^{−i}} = 1.526753⋯ .

More generally, it can be shown that

F(n) = (1/4) α^{2^n} (1 + O(2^{−n}))

as n → ∞, where α = 1.526753⋯; in particular, F(n) > (1/4) α^{2^n} for n ≥ 1. The proof of this is omitted.
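The infinite product converges rapidly, so the constant can be checked numerically; a Python sketch (the function name is ours):

```python
def alpha(terms=60):
    """Evaluate 4 * prod_{i>=1} (1 - 3/(2*(i+1)))**(2**-i) numerically."""
    p = 4.0
    for i in range(1, terms + 1):
        p *= (1.0 - 3.0 / (2.0 * (i + 1))) ** (2.0 ** -i)
    return p

print(round(alpha(), 6))   # -> 1.526753
```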
1.5
A Prototype Problem and Some Basic Inequalities
We first describe a prototype CGT problem which will be the focus of the whole book. Then we discuss generalizations and special cases. Consider a set I of n items known to contain exactly d defectives. The defectives look exactly like the good items and the only way to identify them is through testing, which is error-free. A test can be applied to an arbitrary subset of the n items with two possible outcomes: a negative outcome indicates that all items in the subset are good; a positive outcome indicates the opposite, i.e., at least one item in the subset is defective (but without knowing which ones or how many). The goal is to find an algorithm A to identify all defectives with a small M_A(S) = M_A(d, n). This is the prototype problem. It is also called the (d, n)-problem to highlight the two parameters n and d. In the literature the (d, n) problem has sometimes been called the hypergeometric group testing problem. We now reserve the latter term for the case
that a uniform distribution is imposed on S, since then the probability that a subset of k items contains x defectives follows the hypergeometric distribution

C(d, x) C(n−d, k−x) / C(n, k).

Note that the hypergeometric group testing problem under this new definition belongs to PGT, not CGT. Since M(n, n) = M(0, n) = 0, whenever the (d, n) problem is studied, it is understood that 0 < d < n. Hu, Hwang and Wang [3] proved some basic inequalities about the M function which are reported in this section.

Lemma 1.5.1 S ⊆ S' implies M(S) ≤ M(S').
Proof. Any algorithm for S' is also an algorithm for S where all paths except those of s ∈ S' \ S are preserved. Therefore, the maximum path length of the algorithm on S cannot exceed that on S'. •

Corollary 1.5.2 M(d, n) ≤ M(d, n + 1).
Proof. Add an imaginary new good item I_{n+1} to the item set. Then each sample in S(d, n) is augmented to a sample in S(d, n+1) since I_{n+1} of the former can only be good. Note that adding a good item to any group does not affect the test outcome. Therefore, no additional piece of good item is actually needed; an imaginary piece will do. •

Let M(m; d, n) denote the minimum number of tests necessary to identify the d defectives among n items when a particular subset of m items is known to be contaminated.

Theorem 1.5.3 M(m; d, n) ≥ 1 + M(d − 1, n − 1) for m ≥ 2 and 0 < d < n.

Proof. By Lemma 1.5.1,

M(m; d, n) ≥ M(2; d, n)   for m ≥ 2.

Let T be an algorithm for the (2; d, n) problem, and let M_T(2; d, n) = k, i.e., k is the maximum path length of a leaf in T. Let I_1 and I_2 denote the two items in the contaminated group.

Claim. Every path of length k in T includes at least one test that contains either I_1 or I_2.
Proof of claim. Suppose to the contrary that for some leaf v the path p(v) has length k and involves no test containing I_1 or I_2. Since no test on p(v) can distinguish I_1 from I_2, and since {I_1, I_2} is a contaminated group, I_1 and I_2 must both be defective in the sample s(v). Let u be the sibling node of v. Then u is also a leaf since v has maximum path length. Since p(u) and p(v) have the same set of tests, I_1 and I_2 must also be both defective in the sample s(u). Since s(u) ≠ s(v), there exist indices i and j such that I_i is defective and I_j is good in the sample s(u), while I_i is good and I_j is defective in s(v). Let w denote the parent node of u and v; thus S(w) = {s(u), s(v)}. Then no test on p(w) can be of the form G ∪ {I_i} or G ∪ {I_j}, where G, possibly empty, contains only items classified as good in both s(u) and s(v), since such a test must yield a positive outcome for one of s(u), s(v) and a negative outcome for the other, and hence would have separated s(u) from s(v). Define s to be the sample identical to s(u) except that I_2 is good and both I_i and I_j are defective. Then s can be distinguished from s(u) only by a test containing I_2, which by assumption does not exist on p(w), or by a test of the form G ∪ {I_j}, whose existence has also been ruled out. Thus s ∈ S(w), and u and v cannot both be leaves, a contradiction that completes the proof of the claim.

By renaming the items if necessary, one may assume that every path of length k in T involves a test that contains I_1. Add an imaginary defective to the (d − 1, n − 1) problem and label it I_1, and map the n − 1 items of the (d − 1, n − 1) problem one-to-one to the items of the (d, n) problem except I_1. Then the modified T can be used to solve the (d − 1, n − 1) problem except that every test containing I_1 is skipped since the positive outcome is predictable. But each path of length k in T contains such a test. Hence the maximum path length in applying T to the (d − 1, n − 1) problem is k − 1, i.e.,

M_T(2; d, n) ≥ 1 + M_T(d − 1, n − 1).
Since T is arbitrary, the proof is complete.
•
Corollary 1.5.4 M(n − 1, n) = n − 1.

Proof. Trivially true for n = 1. For n ≥ 2,

M(n − 1, n) = M(n; n − 1, n) ≥ 1 + M(n − 2, n − 1) ≥ ⋯ ≥ n − 1 + M(0, 1) = n − 1. •

Theorem 1.5.5 M(d, n) ≥ 1 + M(d − 1, n − 1) ≥ M(d − 1, n) for 0 < d < n.
Proof. By noting M(d, n) = M(n; d, n), the first inequality follows from Theorem 1.5.3. The second inequality is trivially true for d = 1. The general case is proved by using the induction assumption

M(d, n) ≥ M(d − 1, n).
Let T be the algorithm which first tests a single item and then uses a minimax algorithm for the remaining problem. Then

M(d − 1, n) ≤ M_T(d − 1, n) = 1 + max{M(d − 1, n − 1), M(d − 2, n − 1)} = 1 + M(d − 1, n − 1). •

Lemma 1.5.6 M(d, n) ≤ n − 1.
Proof. The individual testing algorithm needs only n − 1 tests since the state of the last item can be deduced by knowing the states of the other items and knowing d. •

Lemma 1.5.7 Suppose that n − d > 1. Then M(d, n) = n − 1 implies M(d, n − 1) = n − 2.

Proof. Suppose to the contrary that M(d, n − 1) ≠ n − 2. By Lemma 1.5.6, this is equivalent to assuming M(d, n − 1) < n − 2. Let T denote an algorithm for the (d, n) problem which first tests a single item and then uses a minimax algorithm for the remaining problem. Then

M(d, n) ≤ M_T(d, n) = 1 + max{M(d, n − 1), M(d − 1, n − 1)} = 1 + M(d, n − 1)   (by Theorem 1.5.5)
        < n − 1,

a contradiction to the assumption of the lemma. •

Theorem 1.5.8 M(d, n) = M(d − 1, n) implies M(d, n) = n − 1.
Proof. Suppose n − d = 1. Then Theorem 1.5.8 follows from Corollary 1.5.4. The general case is proved by induction on n − d. Note that M(d, n) = M(d − 1, n) implies M(d, n) = 1 + M(d − 1, n − 1) by Theorem 1.5.5. Let T be a minimax algorithm for the (d, n) problem. Suppose that T first tests a group of m items. If m > 1, then

M_T(d, n) ≥ 1 + M(m; d, n) ≥ 2 + M(d − 1, n − 1)   by Theorem 1.5.3,

a contradiction to what has just been shown. Therefore m = 1 and

M(d, n) = 1 + max{M(d, n − 1), M(d − 1, n − 1)} = 1 + M(d, n − 1)
by Theorem 1.5.5 and the fact d ≤ n − 1. It follows that

M(d − 1, n − 1) = M(d, n − 1).

Hence M(d − 1, n − 1) = n − 2 by induction. Therefore

M(d, n) = 1 + M(d − 1, n − 1) = n − 1. •

Lemma 1.5.9 Suppose M(d, n) < n − 1. Then

M(d, n) ≥ 2l + M(d − l, n − l)   for 0 ≤ l ≤ d.

Proof. First consider the case l = 1. Let T be a minimax algorithm for the (d, n) problem which first tests a set of m items. If m > 1, then Lemma 1.5.9 is an immediate consequence of Theorem 1.5.3. Therefore assume m = 1. Suppose to the contrary that

M_T(d, n) < 2 + M(d − 1, n − 1).

Then

1 + M(d − 1, n − 1) ≥ M_T(d, n) = 1 + max{M(d, n − 1), M(d − 1, n − 1)} = 1 + M(d, n − 1)

by Theorem 1.5.5. Therefore M(d − 1, n − 1) = M(d, n − 1) = n − 2 by Theorem 1.5.8. Consequently,

M(d, n) = 1 + M(d, n − 1) = n − 1,

a contradiction to the assumptions of the lemma. Thus Lemma 1.5.9 is true for l = 1. The general case is proved by a straightforward induction argument (on l). •

Corollary 1.5.10 M(d, n) ≥ min{n − 1, 2l + ⌈log C(n−l, d−l)⌉} for 0 ≤ l ≤ d < n.
1.6
Variations of the Prototype Problem
While we define a group testing algorithm to be sequential in nature, one can also study nonadaptive algorithms in which all tests must be specified simultaneously without the knowledge of any test outcomes. In general, one can consider multistage algorithms in which tests are divided into several stages and only test outcomes of previous stages are assumed known. In practice the choice of a sequential algorithm, a nonadaptive algorithm, or a multistage algorithm is a trade-off between time and the number of tests required. However, the binary tree representation is valid only for sequential procedures. When d is not exactly known, we will keep the notation M_A(d, n) except for replacing d by some other parameter characterizing the sample space. In particular, d̄ denotes that d is an upper bound of the number of defectives, and a blank for d denotes that nothing is known about the defective set. Sometimes the sample space is induced from another space through a set of tests and their outcomes. Such a space may be easier to characterize as the original space tagged with a test history. Let i denote the outcome that the tested subset contains i defectives, and let i+ denote at least i defectives. The binary outcome (0, 1+) of group testing can be extended to k-nary outcomes. For example, multiaccess channels in computer communication have the ternary outcome (0, 1, 2+). This has been further generalized to the (k+1)-nary outcome (0, 1, ⋯, k−1, k+). In particular, when k = n (the number of items), each test reveals the exact number of defectives contained therein. Other types of k-nary outcomes are also possible. For example, a check bit in coding often provides the binary outcome (even, odd). Various kinds of restrictions have been considered. In the line group testing problem the n items are arranged on a line and only subsets of consecutive items from the top of the line can be tested. This situation may be appropriate when one inspects items on an assembly line.
Similarly, one can study circular group testing and, in general, graphical group testing in which only certain subgroups can be tested. In the last problem, the goal can also change from identifying defective nodes to identifying a prespecified subgraph. Other restrictions may concern the size of testable subsets, or the memory space in implementing an algorithm. The searched items and the searching space can also be geometrical objects. One big assumption in the prototype problem which may not hold in reality is that tests are error-free. Recently, this assumption has been dropped in the study of learning models which can be considered as an extension of combinatorial group testing. The prototype problem as well as all of its variations will be the subject matter of subsequent chapters.
References

[1] R. Dorfman, The detection of defective members of large populations, Ann. Math. Statist. 14 (1943) 436-440.

[2] W. Feller, An Introduction to Probability Theory and Its Applications, Vol. 1 (John Wiley, New York, 1950).

[3] M. C. Hu, F. K. Hwang and J. K. Wang, A boundary problem for group testing, SIAM J. Alg. Disc. Methods 2 (1981) 81-87.

[4] F. K. Hwang, S. Lin and C. L. Mallows, Some realizability theorems in group testing, SIAM J. Appl. Math. 17 (1979) 396-400.

[5] G. O. H. Katona, Combinatorial search problems, in A Survey of Combinatorial Theory, ed. J. N. Srivastava et al. (North-Holland, Amsterdam, 1973).

[6] C. H. Li, A sequential method for screening experimental variables, J. Amer. Statist. Assoc. 57 (1962) 455-477.

[7] J. W. Moon and M. Sobel, Enumerating a class of nested group testing procedures, J. Combin. Theory, Series B 23 (1977) 184-188.

[8] M. Sobel and P. A. Groll, Group testing to eliminate efficiently all defectives in a binomial sample, Bell System Tech. J. 38 (1959) 1179-1252.

[9] A. Sterrett, On the detection of defective members of large populations, Ann. Math. Statist. 28 (1957) 1033-1036.
2 General Algorithms
In most practical cases, d is a much smaller number than n. Group testing takes advantage of that by identifying groups containing no defective, thus identifying all items in such a group in one stroke. However, the determination of how large a group should be is a delicate question. On one hand, one would like to identify a large pure group such that many items are identified in one test; this argues for testing a large group. But on the other hand, if the outcome is positive, then a smaller group contains more information; this argues for testing a small group. Keeping a balance between these two conflicting goals is what most algorithms strive for.
2.1
Li's s-Stage Algorithm
Li [14] extended a 2-stage algorithm of Dorfman [4] (for PGT) to s stages. At stage 1 the n items are arbitrarily divided into g_1 groups of k_1 (some possibly k_1 − 1) items. Each of these groups is tested and items in pure groups are identified as good and removed. Items in contaminated groups are pooled together and arbitrarily redivided into g_2 groups of k_2 (some possibly k_2 − 1) items; thus entering stage 2. In general, at stage i, 2 ≤ i ≤ s, items from the contaminated groups of stage i − 1 are pooled and arbitrarily divided into g_i groups of k_i (some possibly k_i − 1) items, and a test is performed on each such group. k_s is set to be 1; thus every item is identified at stage s. Let t_s denote the number of tests required by Li's s-stage algorithm. Note that s = 1 corresponds to the individual testing algorithm, i.e., testing the items one by one. Thus t_1 = n. Next consider s = 2. For easier analysis, assume that n is divisible by k_1. Then

t_2 = g_1 + g_2 ≤ n/k_1 + d k_1.

Ignoring the constraint that k_1 is an integer, the upper bound is minimized by setting k_1 = √(n/d) (using straightforward calculus). This gives g_1 = √(nd) and t_2 ≤ 2√(nd). Now consider the general s case:

t_s = sum_{i=1}^{s} g_i ≤ n/k_1 + d k_1/k_2 + ⋯ + d k_{s−2}/k_{s−1} + d k_{s−1}.
Again, ignoring the integral constraints, the upper bound is minimized by

k_i = (n/d)^{(s−i)/s},   1 ≤ i ≤ s − 1.

This gives

g_i ≤ d (n/d)^{1/s} for each i,

and

t_s ≤ s d (n/d)^{1/s}.

The first derivative of the above upper bound with respect to a continuous s is

d (n/d)^{1/s} (1 − (1/s) ln(n/d)),

which has a unique root s = ln(n/d). It is easily verified that s = ln(n/d) is the unique minimum of the upper bound. Hence

t_s ≤ e d ln(n/d) = (e/log e) d log(n/d),

where e = 2.718⋯. Since s d (n/d)^{1/s} is not convex in s, one cannot conclude that the integer s which minimizes the function is either ⌊ln(n/d)⌋ or ⌈ln(n/d)⌉. Li gave numerical solutions for such s for given values of n/d. To execute the algorithm, one needs to compute the optimal s and k_i for i = 1, ⋯, s. Each k_i can be computed in constant time. Approximating the optimal s by the ceiling or floor of log(n/d), Li's s-stage algorithm runs in O(log(n/d)) time.
2.2
Hwang's Generalized Binary Splitting Algorithm
It is well known that one can identify a defective from a contaminated group of n items in ⌈log n⌉ tests through binary splitting. Namely, partition the n items into two disjoint groups such that neither group has size exceeding 2^{⌈log n⌉−1}. Test one such group; the outcome indicates that either the tested group or the other one is contaminated. Apply binary splitting to the new contaminated group. A recursive argument shows that in ⌈log n⌉ tests a contaminated group of size 1 can be obtained, i.e., a defective is identified. A special binary splitting method is the halving method, which partitions the two groups as evenly as possible. By applying binary splitting d times, one can identify the d defectives in the (d, n) problem in at most d⌈log n⌉ tests. Hwang [10] suggested a way to coordinate the d applications of binary splitting such that the total number of tests can be reduced. The
idea is, roughly, that there is on average one defective in every n/d items. Instead of catching a contaminated group of size about half of the original group, which is the spirit of binary splitting, one could expect to catch a much smaller contaminated group and thus to identify a defective therein in a fewer number of tests. The following is his generalized binary splitting algorithm G:

Algorithm G

Step 1. If n ≤ 2d − 2, test the n items individually. If n ≥ 2d − 1, set l = n − d + 1 and define a = ⌊log(l/d)⌋.

Step 2. Test a group of size 2^a. If the outcome is negative, the 2^a items in the group are identified as good. Set n := n − 2^a and go to Step 1. If the outcome is positive, use binary splitting to identify one defective and an unspecified number, say x, of good items. Set n := n − 1 − x and d := d − 1. Go to Step 1.

Theorem 2.2.1

M_G(d, n) = n                  for n ≤ 2d − 2,
M_G(d, n) = (a + 2)d + p − 1   for n ≥ 2d − 1,

where p < d is a nonnegative integer uniquely defined by l = 2^a d + 2^a p + θ, 0 ≤ θ < 2^a.
Proof. The case n ≤ 2d − 2 is true due to Step 1. For 2d − 1 ≤ n ≤ 3d − 2, a must be 0 and G is reduced to individual testing. Furthermore, θ = 0 and p = l − d = n − 2d + 1. Thus

(a + 2)d + p − 1 = 2d + n − 2d + 1 − 1 = n = M_G(d, n).

For d = 1, l = n − d + 1 = n. It is easily verified that except for n = 2^a, G is reduced to binary splitting and

(a + 2)d + p − 1 = a + 1 = ⌊log n⌋ + 1 = ⌈log n⌉ = M_G(1, n).

For n = 2^a, G spends one more test than binary splitting (by testing the whole set first) and

(a + 2)d + p − 1 = 1 + ⌈log n⌉ = M_G(1, n).

For the general case d ≥ 2 and n ≥ 3d − 1, Theorem 2.2.1 is proved by induction on d + n. From Step 2,

M_G(d, n) = max{1 + M_G(d, n − 2^a), 1 + a + M_G(d − 1, n − 1)}.

For n' = n − 2^a and d' = d,

l' = n' − d' + 1 = n − 2^a − d + 1 = l − 2^a
   = 2^a d + 2^a (p − 1) + θ                      for p ≥ 1,
   = 2^{a−1} d + 2^{a−1}(d − 2) + θ               for p = 0, θ < 2^{a−1},
   = 2^{a−1} d + 2^{a−1}(d − 1) + (θ − 2^{a−1})   for p = 0, θ ≥ 2^{a−1}.

Hence by induction

M_G(d, n − 2^a) = (a + 2)d + (p − 1) − 1   for p ≥ 1,
                = (a + 1)d + (d − 2) − 1   for p = 0, θ < 2^{a−1},
                = (a + 1)d + (d − 1) − 1   for p = 0, θ ≥ 2^{a−1}.

Consequently,

1 + M_G(d, n − 2^a) = (a + 2)d + p − 2   for p = 0, θ < 2^{a−1},
                    = (a + 2)d + p − 1   otherwise.

For d' = d − 1 and n' = n − 1,

l' = n' − d' + 1 = l
   = 2^a (d − 1) + 2^a (p + 1) + θ   for p ≤ d − 3,
   = 2^{a+1}(d − 1) + θ              for p = d − 2,
   = 2^{a+1}(d − 1) + 2^a + θ        for p = d − 1.

Hence by induction

M_G(d − 1, n − 1) = (a + 2)(d − 1) + (p + 1) − 1   for p ≤ d − 3,
                  = (a + 3)(d − 1) − 1             for p = d − 2 or p = d − 1.

Consequently,

1 + a + M_G(d − 1, n − 1) = (a + 2)d + p − 2   for p = d − 1,
                          = (a + 2)d + p − 1   otherwise.

Since for d ≥ 2, p = 0 and p = d − 1 are mutually exclusive,

M_G(d, n) = max{1 + M_G(d, n − 2^a), 1 + a + M_G(d − 1, n − 1)} = (a + 2)d + p − 1. •
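The binary-splitting subroutine invoked in Step 2 can be sketched as follows; `test` is an assumed oracle that returns True iff the given group is contaminated:

```python
def find_defective(items, test):
    """Identify one defective in a contaminated list `items`
    using at most ceil(log2(len(items))) tests."""
    while len(items) > 1:
        half = items[: len(items) // 2]
        if test(half):
            items = half                        # tested half is contaminated
        else:
            items = items[len(items) // 2:]     # defective is in the rest
    return items[0]

assert find_defective(list(range(16)), lambda g: 13 in g) == 13
```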
For n/d large, M_G(d, n) → d log(n/d), which compares favorably with either the d log n tests of binary splitting or the upper bound (e/log e) d log(n/d) of Li's s-stage algorithm. In fact, M_G(d, n) is not too far away from the information lower bound ⌈log C(n, d)⌉.

Corollary 2.2.2 M_G(d, n) − ⌈log C(n, d)⌉ ≤ d − 1 for d ≥ 2.
Proof. Since n = l + d − 1,

C(n, d) = C(l + d − 1, d) > l^d/d! = (2^a d + 2^a p + θ)^d/d! ≥ [2^a d (1 + p/d)]^d/d! ≥ 2^{(a+1)d+p−1},

where the last inequality uses d^d/d! ≥ 2^{d−1} and (1 + p/d)^d ≥ 2^p.
The rest of the proof is similar to the proof of Theorem 2.2.1.
•
To execute G, one needs to compute a for each set of updated n and d. Since it takes constant time to compute a, the generalized binary splitting algorithm can be solved in O(d log(n/d)) time.
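Theorem 2.2.1 and Corollary 2.2.2 can be checked numerically. A Python sketch (ours; `mg` evaluates the closed form rather than simulating G):

```python
from math import comb

def mg(d, n):
    """M_G(d, n) by the closed form of Theorem 2.2.1."""
    if n <= 2 * d - 2:
        return n
    l = n - d + 1
    a = (l // d).bit_length() - 1       # a = floor(log2(l / d))
    p = (l - (1 << a) * d) >> a         # from l = 2^a d + 2^a p + theta
    return (a + 2) * d + p - 1

assert mg(1, 7) == 3                    # = ceil(log2 7)
assert mg(1, 8) == 4                    # n = 2^a costs one extra test
# Corollary 2.2.2: M_G(d, n) - ceil(log2 C(n, d)) <= d - 1
for d in range(2, 6):
    for n in range(2 * d - 1, 120):
        assert mg(d, n) - (comb(n, d) - 1).bit_length() <= d - 1
```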
2.3
The Nested Class
Sobel and Groll [16] introduced a class of simple and efficient algorithms for PGT, called the nested class. A nested algorithm can be described by the following rules:

1. There is no restriction on the test group until a group is tested to be contaminated. Mark this group the current contaminated group and denote it by C.

2. The next test must be on a group, say G, which is a proper subset of C. If G is contaminated, then G replaces C as the current contaminated group. Otherwise, items in G are classified as good and C \ G replaces C as the current contaminated group.

3. If the current contaminated group is of size one, identify the item in the group as defective. Test any group of unidentified items, if any.

Note that the generalized binary splitting algorithm is in the nested class. Due to Lemma 1.4.1, a simple set of recursive equations can now describe the number of tests required by a minimax nested algorithm. Let H(d, n) denote that number and let F(m; d, n) denote the same except for the existence of a current contaminated group of size m. Then

H(d, n) = 1 + min_{1≤m≤n−d} max{H(d, n − m), F(m; d, n)},
F(m; d, n) = 1 + min_{1≤k<m} max{F(m − k; d, n − k), F(k; d, n)},

with the boundary conditions

H(d, d) = H(0, n) = 0,   F(1; d, n) = H(d − 1, n − 1).
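These equations can be evaluated directly by memoized recursion — the brute-force approach discussed in the text. A Python sketch (ours):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def H(d, n):
    """Minimax number of tests over nested algorithms."""
    if d == 0 or d == n:
        return 0
    return 1 + min(max(H(d, n - m), F(m, d, n)) for m in range(1, n - d + 1))

@lru_cache(maxsize=None)
def F(m, d, n):
    """Same, given a current contaminated group of size m."""
    if m == 1:
        return H(d - 1, n - 1)
    return 1 + min(max(F(m - k, d, n - k), F(k, d, n)) for k in range(1, m))

assert H(1, 8) == 3                      # nested = binary splitting for d = 1
assert [H(2, n) for n in range(3, 8)] == [2, 3, 4, 5, 5]
```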
Since the recursive equations have three parameters d, n, m, and each equation compares O(m) values where the range of m is n, a brute-force solution requires O(n³d) time. However, a careful analysis can significantly cut down the time complexity. Define a line algorithm as one which orders the unclassified items linearly and always tests a group at the top of the order. It is easily verified that a line algorithm identifies the items in order except
1. A good item may be identified together with a sequence of items up to the first defective after it.

2. When only one unidentified defective is left, then the order of identification is from both ends towards the defective (this is because once a contaminated group is identified, all other items can be deduced to be good).

Lemma 2.3.1 Every nested algorithm can be implemented as a line algorithm.

Proof. By Lemma 1.4.1 all unidentified items belong to two equivalence classes: those in the current contaminated group and those not. Since items in an equivalence class are indistinguishable, they can be arbitrarily ordered and any test involving items in this equivalence class can be assumed to be applied to a group at the top of the order. At the beginning, items in a (d, n) problem are in one equivalence class. Thus they can be linearly ordered. The first test is then on a group at the top of the order. If this group is pure, then items in this group are deleted from the linear order. Except for updating n, the situation is unchanged and the next test is still taken from the top of the order. If the first group is contaminated, then items in the contaminated group again constitute an equivalence class. Without loss of generality, assume that the linear order of this class is the same as the original order. Then the next test, taken from the top of the smaller order, is also from the top of the original order. Lemma 2.3.1 is proved by a repeated use of this argument. •

The following lemma demonstrates some monotonicity properties of H(d, n) and F(m; d, n).

Lemma 2.3.2 H(d, n) is nondecreasing in n and d. F(m; d, n) is nondecreasing in m, n and d.

Proof. That H(d, n) and F(m; d, n) are nondecreasing in n follows from Lemma 1.5.1 and the trick of adding an imaginary good item. That F(m; d, n) is nondecreasing in m also follows from Lemma 1.5.1. H(d, n) ≥ H(d − 1, n) holds obviously for d = 1. For d ≥ 2, consider testing one item in the (d − 1, n) problem. Then

H(d − 1, n) ≤ 1 + max{H(d − 1, n − 1), H(d − 2, n − 1)} = 1 + H(d − 1, n − 1)   by induction.

Suppose now that a minimax algorithm for the (d, n) problem first tests m items. Then

H(d, n) = 1 + max{H(d, n − m), F(m; d, n)}
        ≥ 1 + F(m; d, n)
        ≥ 1 + F(1; d, n)
        = 1 + H(d − 1, n − 1)
        ≥ H(d − 1, n).
Finally, for m = 1 and d ≥ 2,

F(1; d, n) = H(d − 1, n − 1) ≥ H(d − 2, n − 1) = F(1; d − 1, n).

For general m > 1,

F(m; d, n) = 1 + min_{1≤k<m} max{F(m − k; d, n − k), F(k; d, n)}
           ≥ 1 + min_{1≤k<m} max{F(m − k; d − 1, n − k), F(k; d − 1, n)}   (by induction)
           = F(m; d − 1, n). •
Define f_d(t) as the maximum n such that H(d, n) ≤ t. By Lemma 2.3.2, H(d, n) is nondecreasing in n. Thus the specification of f_d(t) is equivalent to the specification of H(d, n). While binary splitting can identify a defective among n items in ⌈log n⌉ tests, one would also like to identify as many good items as possible in the process when identifying one defective is a subroutine in a bigger problem. Chang, Hwang and Weng [2] gave the following result.

Lemma 2.3.3 There exists an algorithm which can identify a defective among n items in at most k = ⌈log n⌉ tests. Furthermore, if ⌈log n⌉ tests are actually used, then at least 2^k − n good items are also identified.

Proof. Lemma 2.3.3 is trivially true for k = 1. The general case is proved by induction on k. Consider two cases:

1. n − 2^{k−1} ≥ 2^{k−2}. Test a group of n − 2^{k−1} items. If the outcome is negative, then n − 2^{k−1} ≥ 2^k − n good items are identified, while a defective can be identified from the remaining 2^{k−1} items in k − 1 tests by binary splitting. If the outcome is positive, then by induction, at least 2^{k−1} − (n − 2^{k−1}) = 2^k − n good items are identified along with a defective if k − 1 more tests are used.

2. n − 2^{k−1} < 2^{k−2}. Test a group of 2^{k−2} items. If the outcome is negative, then 2^{k−2} good items are already identified and by induction another 2^{k−1} − (n − 2^{k−2}) good items will be identified in the remaining n − 2^{k−2} items along with a defective if k − 1 more tests are used. If the outcome is positive, then a defective can be identified in the 2^{k−2} items in k − 2 tests and a total of k − 1 tests are used. •
From the proof it is also clear that the guaranteed identification of 2^k − n good items is the best possible. Let l(i, t) denote the maximum number of items in a contaminated group such that a line algorithm can identify a defective in at most t tests, but if the first defective is not in the last i items, then it can be identified in at most t − 1 tests.

Corollary 2.3.4

l(i, t) = ⌊(2^t + i)/2⌋   for 0 ≤ i ≤ 2^t,
l(i, t) = 2^t             for i ≥ 2^t.
Theorem 2.3.5 For d ≥ 2,

f_d(t) = f_d(t − 1) + l(f_{d−1}(t') − f_d(t − 1), t − 1 − t'),

where t' is defined by f_{d−1}(t') ≥ f_d(t − 1) > f_{d−1}(t' − 1).
Proof. Suppose that f_d(t − 1) < n ≤ f_d(t). The first test must be on a group of not fewer than n − f_d(t − 1) items, for otherwise a negative outcome would leave too many items for the remaining t − 1 tests. On the other hand, there is no need to test more than n − f_d(t − 1) items since only the case of a positive outcome is of concern and, by Lemma 2.3.2, the fewer items in the contaminated group the better. For the time being, assume that f_{d−1}(t' + 1) ≥ f_d(t). Under this assumption, if the first defective lies among the first n − f_{d−1}(t') items, then after its identification the remaining items need t' + 1 further tests. Therefore only t − 2 − t' tests are available to identify the first defective. Otherwise, the remaining items can be identified by t' tests and one more test is available to identify the first defective. Therefore, the maximum value of doable n is given by

n − f_d(t − 1) = l(f_{d−1}(t') − f_d(t − 1), t − 1 − t').

The proof of f_{d−1}(t' + 1) ≥ f_d(t) is very involved. The reader is referred to [11] which proved the same for the corresponding R*-minimax merging algorithm (see Section 2.4). •
n - Mt - 1) = /(/«_!(*') - h{t - l),t - 1 - *') • The proof of fd-i(t'+ 1) > fd(t) is very involved. The reader is referred to [11] which proved the same for the corresponding i?*-minimax merging algorithm (see Section 2.4). a By noting fi(t) = 2', f2(t) has a closed-form solution. Corollary 2.3.6 f 22 - 2 + [ 4 ^ J /2(t)
=
\
«-l 2
for t even and > 4, 1 2
{ 2 V - 2 + I "*"' ' '] for t odd and > 3. The recursive equation for fd(t) can be solved in 0(td) time. Since the generalized binary splitting algorithm is in the nested class, H(d,n) cannot exceed Ma{d,n), which is of the order d \og(n/d). Therefore, fx(y) for x < d and y < d log(n/d) can be computed in 0(d2 \og(n/d)) time, a reduction by a factor of n3/d \og(n/d) from the brute force method.
For given n and d, compute f_x(y) for all x ≤ d and y ≤ d log(n/d). A minimax nested algorithm is defined by the following procedure (assume that unidentified items form a line):

Step 1. Find the t satisfying f_d(t − 1) < n ≤ f_d(t).
2.4
(d, n) Algorithms and Merging Algorithms
A problem seemingly unrelated to the group testing problem is the merging problem, which has been studied extensively in the computer science literature (see [13], for example). Consider two linearly ordered sets

A_d = {a_1 < a_2 < ⋯ < a_d},   B_g = {b_1 < b_2 < ⋯ < b_g}.

Assuming the a_i and b_j are all distinct, the problem is to merge A_d with B_g into a single linearly ordered set

U_{d+g} = {u_1 < u_2 < ⋯ < u_{d+g}}

by means of a sequence of pairwise comparisons between elements of A_d and elements of B_g. Hwang [8] compared the two problems and established some relationships between them whereby algorithms for solving one problem may be converted to similar algorithms for solving the other problem. In particular, he showed that a class of merging algorithms well studied in the merging literature can be converted to a class of corresponding (d, n) algorithms. A merging algorithm can also be represented by a rooted binary tree where:
(i) A sequence of comparisons is represented by a directed path from the root to a node. Each internal node is associated with a comparison while the two outgoing links of the node denote the two possible outcomes.

(ii) Each node is also associated with the event whose sample points are consistent with the outcomes of the sequence of comparisons made along the path preceding the node.

A merging algorithm and a (d, n) algorithm are said to be mutually convertible if they can be represented by the same rooted binary tree. Though a comparison and a test serve the same function of partitioning the sample space into two smaller subspaces, the sets of realizable partitions induced by each of them are quite different. In general, a comparison of a_i versus b_j answers the question: are there at least i of the i + j − 1 smallest elements of U_{d+g} elements of A_d? On the other hand, a group test on X ⊆ I answers the question: is there at least one defective in X? However, if i = 1 or d, then the comparison a_i versus b_j can be seen to correspond to a group test on X = (I_1, I_2, ⋯, I_j) or X = (I_{j+d}, I_{j+1+d}, ⋯, I_{g+d}), respectively, in the sense that there is a one-to-one correspondence between each of the two possible outcomes in the two problems such that the resulting situations again have isomorphic sample spaces. This is made explicit in the following table:

Merging problem              (d, n) problem
Comparison a_1 vs. b_j       Group test on X = (I_1, I_2, ⋯, I_j)
a_1 > b_j                    X pure
a_1 < b_j                    X contaminated
Comparison a_d vs. b_j       Group test on X = (I_{j+d}, ⋯, I_{g+d})
a_d < b_j                    X pure
a_d > b_j                    X contaminated

It is easily seen that if i ≠ 1 or d, then a comparison of a_i versus b_j does not correspond to a group test. Thus
algorithms
for d = 2 are convertible
to (d, n)
algo-
In [11], a procedure r to merge A_d with B_g is defined to be an R-class algorithm, i.e., r ∈ R(d, g), if it satisfies the following requirements:

(i) The first comparison made by r must involve at least one of the four extreme elements a_1, a_d, b_1, b_g.

(ii) Suppose the first comparison is a_d versus b_j for some j. If a_d < b_j, then an algorithm r' ∈ R(d, j - 1) is used to merge A_d with B_{j-1}. If a_d > b_j, then a sequence of comparisons (if needed) involving a_d is used to merge a_d into B_g, i.e., to establish b_i < a_d < b_{i+1} for some i, j ≤ i ≤ g.

For each r ∈ R(d, g), a corresponding algorithm r*(d, g) can be defined. If d ≥ g to start with, then the corresponding r*(d, g) will be the tape merge, which in turn yields the individual testing (d, n) algorithm. But individual testing is indeed minimax [7] for this case. If d < g, then r*(d, g) will mimic r(d, g) until at some stage the starting shorter set A becomes the current longer set. At this time, r(d, g) will merge an extreme element of the B-set while r*(d, g) will shift to tape merge for the remaining problem. Many R-class algorithms, like tape merge, generalized binary, and R-minimax, have the property that the starting shorter set cannot become the longer set without passing a stage where the two sets are of equal length. Hence, they can be modified to become R* algorithms without increasing their maximum numbers of tests, since tape merge is indeed minimax in the required situation. Let R*-minimax be the R* algorithm modified from the R-minimax algorithm. From the above discussion, we have proved

Theorem 2.4.3 M_{R*-minimax}(d, n) = M_{R-minimax}(d, n) for all d and n.
Theorem 2.4.4 The R*-class merging algorithms and the nested algorithms are mutually convertible.

Proof. It has been shown that every R*-class merging algorithm is convertible to a (d, n) algorithm. Clearly, that resultant (d, n) algorithm is a nested algorithm. Therefore, it suffices to prove that every nested algorithm is convertible to a merging algorithm of R*-class. Let I = {I_1, I_2, ..., I_n} be the set of items. Without loss of generality, assume that the first test is on the group G = {I_1, ..., I_k} for some k ≤ n. If G is pure, relabel the unclassified items so that the current I has n - k items and repeat the procedure. If G is contaminated, then the next group to be tested must be a subset of G. Again without loss of generality, assume that the group to be tested is {I_1, ..., I_{k'}} for some k' < k. Proceeding like this, every group being tested consists of the first i items in the sequence of unclassified items for some i. Clearly each such test corresponds to a comparison of a_1 vs. b_i. The proof is complete. □

Corollary 2.4.5 The (d, n) algorithm which is obtained by converting the R*-minimax merging algorithm is a minimax nested algorithm (whose required number of tests is given in Theorem 2.3.5).
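The nested structure underlying this theorem — every test is on a prefix of the current sequence of unclassified items, and after a contaminated outcome only subsets of the contaminated group are tested — can be illustrated with a small simulation. This is only a sketch: the group-size rule `first_group` below is a hypothetical placeholder (any such rule gives a nested algorithm; the minimax choice is the one of Theorem 2.3.5), and the oracle stands in for a physical test.

```python
def nested_algorithm(n, defectives, first_group=lambda m: max(1, m // 4)):
    """Sketch of a generic nested (d, n) algorithm; group sizes are illustrative."""
    tests = 0
    def group_test(group):                    # one group test: contaminated or pure?
        nonlocal tests
        tests += 1
        return any(x in defectives for x in group)
    unclassified, good, found = list(range(n)), [], []
    while unclassified:
        g = unclassified[:first_group(len(unclassified))]   # test a prefix group
        if not group_test(g):                 # pure: the whole group is good
            good += g
            unclassified = unclassified[len(g):]
        else:                                 # contaminated: test only subsets of g
            while len(g) > 1:                 # binary splitting within g
                half = g[:len(g) // 2]
                if group_test(half):
                    g = half
                else:
                    good += half              # pure half classified good
                    g = g[len(half):]
            found.append(g[0])                # one defective identified
            unclassified = [x for x in unclassified if x not in good and x != g[0]]
    return sorted(found), tests
```

For example, `nested_algorithm(16, {3, 11})` classifies every item and returns the two defectives; every test along the way is a prefix of the unclassified sequence, which is exactly what makes the algorithm convertible to an R*-class merging algorithm.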
2.5
Some Practical Considerations
In practice, one hardly knows d exactly. Thus d is often either an estimate or an upper bound. Note that the algorithms discussed in this chapter all have the property that they will either identify all defectives, if the actual number of defectives is up to d, or they will identify d defectives, if the actual number exceeds d. When d is an overestimate, a (d, n) algorithm still identifies all defectives and solves the problem, although not as efficiently as if the correct value of d were assessed. On the other hand, one could guard against the case that d is an underestimate by applying a test on all remaining items upon the identification of the d-th defective. If the outcome is positive, then d must be an underestimate. One would then assign a value d' to represent the number of defectives in the remaining n' items, apply a (d', n') algorithm, and proceed similarly. When d is known to be an upper bound on the number of defectives, the (d, n) problem will be denoted by (d̄, n). Hwang, Song and Du [12] proved that M(d̄, n) is at most one more than M(d, n) by first proving the following crucial lemma. Let T denote a procedure for the (d, n) problem. Let v be a terminal node of T and s ∈ S(v). Partition the d defectives in s into two categories, the fixed items and the free items: x ∈ s is a fixed item if there exists a group tested in H(v) such that the group does not contain any other element of s; otherwise, x is a free item. A free item is identified as defective through the identification of n - d good items.
Lemma 2.5.1 Suppose that v is a terminal node with f ≥ 1 free items. Let u be the sibling node of v. Then u is the root of a subtree whose maximum path length is at least f - 1.

Proof. Let G denote the last group tested before v. Since whenever n - d good items are identified the (d, n) problem is necessarily solved, free items can be identified only at the last test. Therefore f ≥ 1 implies that v corresponds to the negative outcome of G. Hence G cannot contain any item of s. Let y denote a free item of s and x ∈ G. Then (s \ {y}) ∪ {x} ∈ S(u). To see this, note that any test group containing y must contain another element of s, or y would be a fixed item. Therefore changing y from defective to good does not change the outcome of any test in H(v). Furthermore, (s \ {y}) ∪ {x} is consistent with the contaminated outcome of G, hence it is in S(u). Let S_x denote the set of samples {(s \ {y}) ∪ {x} : y ∈ s and y is free}. Then M(S(u)) ≥ M(S_x) = |S_x| - 1 = f - 1 by Lemma 1.5.1 and Corollary 1.5.4. □

Theorem 2.5.2 For each procedure T for the (d, n) problem, there exists a procedure T' for the (d̄, n) problem such that
M_T(d, n) + 1 ≥ M_{T'}(d̄, n).

Proof. Let T' be obtained from T by adding a subtree T_v to each terminal node v having a positive number of free items (see Figure 2.1). T_v is the tree obtained by testing the free items one by one. Since free items are the only items at v whose states are uncertain when (d, n) is changed to (d̄, n), T' is a procedure for the (d̄, n) problem. From Lemma 2.5.1, the sibling node of v is the root of a subtree with maximum path length at least f - 1, where f is the number of free items of s_v. The theorem follows immediately. □

Corollary 2.5.3 M(d, n) + 1 ≥ M(d̄, n) ≥ M(d, n + 1).
Proof. The first inequality follows from the theorem. The second inequality follows from the observation that the (d, n + 1) problem can be solved by any procedure for the (d̄, n) problem, provided one of the n + 1 items is put aside. But the nature of the item put aside can be deduced with certainty once the natures of the other n items are known. □

For nested algorithms Hwang [9] gave a stronger result which is stated here without proof.

Theorem 2.5.4 F(m; d, n + 1) = F(m; d̄, n).

Corollary 2.5.5 H(d, n + 1) = H(d̄, n).
Figure 2.1: From tree T to tree T'.

When the tests are destructive or consummate, as in the blood test application, then the number of tests an item can go through (or the number of duplicates) becomes an important issue. If an item can only go through one test, then individual testing is the only feasible algorithm. If an item can go through at most s tests, then Li's s-stage algorithm is a good one to use. In some other applications, the size of a testable group is restricted. Usually, the restriction is of the type that no more than k items can be tested as a group. An algorithm can be modified to fit the restriction by cutting the group size to k whenever the algorithm calls for a larger group. There are more subtle restrictions. Call a storage unit a bin and assume that all items in a bin are indistinguishable to the tester even though they have different test histories. A b-bin algorithm can use at most b bins to store items. A small number of bins not only saves storage units, but also implies easier implementation. Since at most stages of the testing process one bin is needed to store good items, one to store defectives and one to store unidentified items, any sensible algorithm will need at least three bins. An obvious 3-bin algorithm is individual testing; and at first glance this seems to be the only sensible 3-bin algorithm. We now show the surprising result that Li's s-stage algorithm can be implemented as a 3-bin algorithm. The three bins are labeled "queue," "good item" and "new queue." At the beginning of stage i, items which have been identified as good are in the good-item bin, and all other items
are in the queue bin. Items in the queue bin are tested in groups of size k_i (some possibly k_i - 1) according to Li's s-stage algorithm. Items in groups tested negative are thrown into the good-item bin, and items in groups tested positive are thrown into the new-queue bin. At the end of stage i, the queue bin is emptied and changes labels with the new-queue bin to start the next stage. Of course, at stage s, each group is of size one and the items thrown into the new-queue bin are all defectives. All nested algorithms are 4-bin algorithms. Intuitively, one expects the minimax nested algorithm to be a minimax 4-bin algorithm. But this involves proving that the following type of tests can be excluded from a minimax algorithm: Suppose a contaminated group C exists. Test a group G which is not a subset of C. When G is also contaminated, throw G into the bin containing C and throw C into the other bin containing unidentified items. Note that the last move loses the information that C was found contaminated. But it is hard to relate this to nonminimaxity. Yet another possible restriction is on the number of recursive equations defining a minimax algorithm in a given class. A small number implies a faster solution for the recursive equations. For example, individual testing needs one equation (in a trivial way) and the nested class needs two. Suppose that there are p processors or p persons to administer the tests parallelly. Then p disjoint groups can be tested in one "round." In some circumstances when the cost of time dominates the cost of tests, the number of rounds is a more relevant criterion to evaluate an algorithm than the number of tests. Li's s-stage algorithm can be easily adapted to be a parallel algorithm. Define n' = ⌈n/p⌉. Apply Li's algorithm to the (d, n') problem except that the g_i groups at stage i, i = 1, ..., s, are partitioned into ⌈g_i/p⌉ classes and groups in the same class are tested in the same round.
At the last stage, instead of d defectives, at most d contaminated groups are identified. Since each group contains at most p items, at most d more parallel tests identify all defectives in these groups. Recall that the number of tests for Li's algorithm is upper bounded by

(e / log₂ e) · d · log₂(n/d), where s = ln(n/d).

The total number of rounds with p processors is upper bounded by

(e / log₂ e) · (d/p) · log₂(n/(dp)) + d,

which tends to

(e / log₂ e) · (d/p) · log₂ n

when n is much larger than d and p.
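The three-bin implementation of Li's s-stage algorithm described above can be simulated directly: only the queue, good-item, and new-queue bins are ever used, and the queue and new-queue bins swap labels between stages. This is a sketch; the stage group sizes passed in below are illustrative placeholders, not Li's optimal choices of k_i.

```python
def three_bin_li(n, defectives, stage_sizes):
    """Simulate Li's s-stage algorithm with three bins; stage_sizes[-1] must be 1."""
    queue, good = list(range(n)), []           # the "queue" and "good item" bins
    tests = 0
    for k in stage_sizes:                      # stage i tests groups of size k
        new_queue = []                         # the "new queue" bin
        for start in range(0, len(queue), k):
            group = queue[start:start + k]
            tests += 1
            if any(x in defectives for x in group):
                new_queue += group             # positive: into the new-queue bin
            else:
                good += group                  # negative: into the good-item bin
        queue = new_queue                      # empty queue swaps labels with new queue
    return queue, good, tests                  # after stage s the queue holds defectives
```

With n = 64, defectives {5, 40} and stage sizes (16, 4, 1), the final queue contains exactly the two defectives after 4 + 8 + 8 = 20 tests, and at no point is a fourth bin needed.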
2.6
An Application to Clone Screenings
Screening large collections of clones for those that contain specific DNA sequences is a preliminary indispensable to numerous genetic studies and is usually performed with a clone-by-clone probe or something equivalent. For a yeast artificial-chromosome (YAC) library, the high sensitivity and specificity of the polymerase chain reaction (PCR) allow the detection of target sequences in DNA prepared from pools of thousands of YAC clones. Several PCR-based screening protocols have been suggested to reduce the number of tests. They will be introduced here and their relations to group testing expounded. A YAC library typically contains from 10,000 to 100,000 clones, where each DNA sequence will appear on average in r clones. The value of r is specified by the library and the ratio r/n, though depending on the particular library, is on the order of 10^{-4}. A PCR can identify the existence of a specified DNA sequence in a pool of clones, and the identification is considered error-free if the pool size stays within a certain limit. The objective is to identify all clones containing the specified DNA sequence with a minimum amount of effort; this includes the number of PCRs, the ease of preparing the pools and whether the pooling can be done parallelly. Thus it is clear that clone screening can be cast as a group testing problem where the clones are the items and those containing a specified DNA sequence the defectives. While the number of defectives is usually given by its expected value, an upper bound can be obtained by assuming that the number of defectives follows a Poisson distribution. A PCR is a group test with some restriction on its maximum size. If the DNA of a pool contains the appropriate PCR product, then the test outcome is positive (such a pool is identified as "positive"). While one wants to minimize the number of tests, how easily the test group can be assembled is definitely a concern.
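The Poisson upper bound mentioned above can be made concrete: given the expected number r of clones containing the target sequence, take the smallest d whose Poisson tail probability falls below a chosen tolerance. This is only a sketch; the tolerance α = 0.05 is an arbitrary illustrative choice, not one from the text.

```python
import math

def poisson_upper_bound(r, alpha=0.05):
    """Smallest d with P(X > d) <= alpha when X ~ Poisson with mean r."""
    term = math.exp(-r)         # P(X = 0)
    cdf = term
    d = 0
    while 1.0 - cdf > alpha:    # tail probability P(X > d) still exceeds alpha
        d += 1
        term *= r / d           # P(X = d) computed from P(X = d - 1)
        cdf += term
    return d
```

For a library with r = 2, this yields d = 5 at the 5% level, so a (d̄, n) algorithm with d̄ = 5 would fail to account for all positives with probability below α.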
In general, nonadaptive algorithms are preferred over sequential algorithms, with multistage algorithms with a small number of stages a possible compromise. Green and Olson [6] considered a human YAC library with n = 23,040 clones and r = 2. YAC clones were grown on nylon filters in rectangular arrays of 384 colonies following inoculation from four 96-well microtiter plates. The yeast cells from each filter are pooled and the DNA is purified, yielding single-filter pools of DNA. Equal aliquots from single-filter pools are mixed together in groups of five to yield multi-filter pools, each representing the DNA from 1920 clones. The multi-filter pools of DNA are then analyzed individually for the presence of a specified DNA segment by using the PCR. If a multi-filter pool is found to be positive, then each constituent single-filter pool is analyzed individually by the same PCR assay. Upon generation of a positive single-filter pool, locations of positive clones within the 384-clone array are established by colony hybridization using the radio-labeled PCR product as the probe. Green and Olson also considered a modification which applies when a positive single-filter pool is found. Instead of colony hybridization, the modification uses the binary representation matrix (see Section 4.5) to identify the positive clone in
⌈log 1920⌉ = 11 pools. Note that this method would fail if there exists more than one positive clone in the pool, and more pools have to be taken as a remedy. But for the given parameters of the library, the probability that this will happen is only 3%. The modified Green and Olson method is a 3-stage group testing algorithm except that in the last stage the individual testing is replaced by some technology-based procedures. Note that the size of the single-filter grouping is constrained to be a multiple of 384, and that of the multi-filter grouping to be within the limit of an effective PCR. De Jong, Aslanidis, Alleman and Chen [3], and also Evans and Lewis [5], proposed a different approach which was elaborated and analyzed by Barillot, Lacroix and Cohen [1]. They assumed that the clones are arranged in a 2-dimensional matrix, and each row or column yields a pool. A positive clone renders its row and its column positive. Thus all positive clones are located at the intersections of positive rows and positive columns. Unfortunately, the reverse is not true. In fact, r positive clones may cause r² such positive intersections. Thus one or several other similar matrices are
Figure 2.2: True and false positive intersection.
needed to differentiate clones at these intersections, using the fact that positive clones will be located at positive intersections of every matrix. For better discrimination, it is desirable that any two clones appear in the same row or column in at most one matrix. Such matrices have been studied in the statistical literature and are known as lattice designs. In particular, they are called lattice squares if the matrices are square matrices. The construction of a set of k, k > 2, lattice squares corresponds to the construction of k - 2 latin squares (see [15]). Since the effort of collecting a pool is not negligible, after a pool is collected and analyzed, a decision needs to be made among three choices: use another matrix for more pools, test the ambiguous clones individually, or accept a small probability of including a false positive clone. The 2-dimensional pooling strategy can be extended to d-dimensional spaces where each hyperplane yields a pool. For d-dimensional cubes, Barillot et al. pointed out that the size of a pool is n^{1-1/d} and would be too large for d ≥ 4. They provided figures which show the optimal d as a function of n and r.
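The row-and-column pooling scheme, and the false intersections it can produce, are easy to simulate. The layout below (a 4 × 4 matrix with two positive clones) is an illustrative toy case, not a configuration from the cited studies.

```python
def candidate_intersections(rows, cols, positives):
    """Pool each row and each column; return all intersections of a positive
    row with a positive column (true positives plus possible false ones)."""
    pos_rows = {i for i in range(rows)
                if any((i, j) in positives for j in range(cols))}
    pos_cols = {j for j in range(cols)
                if any((i, j) in positives for i in range(rows))}
    return {(i, j) for i in pos_rows for j in pos_cols}
```

With positive clones at (0, 0) and (1, 1), the r = 2 positives produce r² = 4 candidate intersections; the false candidates (0, 1) and (1, 0) must then be resolved by another matrix, by individual tests, or be tolerated as a small error probability.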
References

[1] E. Barillot, B. Lacroix and D. Cohen, Theoretical analysis of library screening using a N-dimensional pooling strategy, Nucleic Acids Res. 19 (1991) 6241-6247.

[2] X. M. Chang, F. K. Hwang and J. F. Weng, Group testing with two and three defectives, Ann. N.Y. Acad. Sci. Vol. 576, Ed. M. F. Capobianco, M. Guan, D. F. Hsu and F. Tian, (New York, 1989) 86-96.

[3] P. J. De Jong, C. Aslanidis, J. Alleman and C. Chen, Genome mapping and sequencing, Cold Spring Harbor Conf., New York, 1990, 48.

[4] R. Dorfman, The detection of defective members of large populations, Ann. Math. Statist. 14 (1943) 436-440.

[5] G. A. Evans and K. A. Lewis, Physical mapping of complex genomes by cosmid multiplex analysis, Proc. Nat. Acad. Sci. USA 86 (1989) 5030-5034.

[6] E. D. Green and M. V. Olson, Systematic screening of yeast artificial-chromosome libraries by use of the polymerase chain reaction, Proc. Nat. Acad. Sci. USA 87 (1990) 1213-1217.

[7] F. K. Hwang, A minimax procedure on group testing problems, Tamkang J. Math. 2 (1971) 39-44.

[8] F. K. Hwang, Hypergeometric group testing procedures and merging procedures, Bull. Inst. Math. Acad. Sinica 5 (1977) 335-343.

[9] F. K. Hwang, A note on hypergeometric group testing procedures, SIAM J. Appl. Math. 34 (1978) 371-375.

[10] F. K. Hwang, A method for detecting all defective members in a population by group testing, J. Amer. Statist. Assoc. 67 (1972) 605-608.

[11] F. K. Hwang and D. N. Deutsch, A class of merging algorithms, J. Assoc. Comput. Mach. 20 (1973) 148-159.

[12] F. K. Hwang, T. T. Song and D. Z. Du, Hypergeometric and generalized hypergeometric group testing, SIAM J. Alg. Disc. Methods 2 (1981) 426-428.

[13] D. E. Knuth, The Art of Computer Programming, Vol. 3, (Addison-Wesley, Reading, Mass., 1972).

[14] C. H. Li, A sequential method for screening experimental variables, J. Amer. Statist. Assoc. 57 (1962) 455-477.

[15] D. Raghavarao, Constructions and Combinatorial Problems in Design of Experiments, (Wiley, New York, 1971).

[16] M. Sobel and P. A. Groll, Group testing to eliminate efficiently all defectives in a binomial sample, Bell System Tech. J. 38 (1959) 1179-1252.
3 Algorithms for Special Cases
When d is very small or very large, more is known about M(d, n). M(1, n) = ⌈log n⌉ by Theorem 1.2.1 and by using binary splitting. Surprisingly, M(2, n) and M(3, n) are still open problems, although "almost" minimax algorithms are known. On the other hand, one expects individual testing to be minimax when n/d is small. It is known that the threshold value for this ratio lies between 21/8 and 3, and was conjectured to be 3.
3.1
Two Disjoint Sets Each Containing Exactly One Defective
Chang and Hwang [1], [2] studied the CGT problem of identifying two defectives in A = {A_1, ..., A_m} and B = {B_1, ..., B_n}, where A and B are disjoint and each contains exactly one defective. At first, it seems that one cannot do better than work on the two disjoint sets separately. The following example shows that intuition is not always reliable for this problem.

Example 3.1. Let A = {A_1, A_2, A_3} and B = {B_1, B_2, B_3, B_4, B_5}. If one identifies the defectives in A and B separately, then it takes ⌈log 3⌉ + ⌈log 5⌉ = 2 + 3 = 5 tests. However, the following algorithm shows that the two defectives can be identified in 4 tests.

Step 1. Test {A_1, B_1}. If the outcome is negative, then A has two items and B has four items left. Binary splitting will identify the two defectives in log 2 + log 4 = 3 more tests. Therefore, it suffices to consider the positive outcome.

Step 2. Test B_1. If the outcome is negative, then A_1 must be defective. The defective in the four remaining items of B can be identified in 2 more tests. If the outcome is positive, then the defective in the three items of A can be identified in 2 more tests.

Note that there are 3 × 5 = 15 samples (A_i, B_j). Since ⌈log 15⌉ = 4, one certainly cannot do better than 4 tests.
In general, the sample space is A × B, which will also be denoted by m × n if |A| = m and |B| = n. Does there always exist an algorithm to identify the two defectives in A × B in ⌈log mn⌉ tests? Chang and Hwang [2] answered in the affirmative. A sample space is said to be A-distinct if no two samples in it share the same A-item A_i. Suppose S is a sample space with

|S| = 2^r + 2^{r-1} + ... + 2^{r-p} + q, where 2^{r-p-1} > q > 0.

An algorithm T for S is called A-sharp if it satisfies the following conditions:

(i) T solves S in r + 1 tests.

(ii) Let v(i) be the i-th node on the all-positive path of T, the path where every outcome is positive. Let v'(i) be the child node of v(i) with the negative outcome. Then |S(v'(i))| = 2^{r-i} for i = 0, 1, ..., p.

(iii) |S(v(p + 1))| = q and S(v(p + 1)) is A-distinct.
If |S| = 2^r, then the above conditions are replaced by the single condition:

(i') T solves S in r tests.

Lemma 3.1.1 There exists an A-sharp algorithm for any A-distinct sample space.

Proof. Ignore the B-items in the A-distinct sample space. Since the A-items are all distinct, there is no restriction on the partitions. It is easily verified that there exists a binary splitting algorithm which is A-sharp. □

For m fixed, define n_k to be the largest integer such that m·n_k ≤ 2^k. Clearly, there exists a k for which n_k = 1.

Theorem 3.1.2 M(m × n_k) = k for all n_k ≥ 1. Furthermore, if n_k is odd, then there exists an A-sharp algorithm for the sample space m × n_k.

Proof. By the definition of n_k and Theorem 1.2.1, M(m × n_k) ≥ k. Therefore it suffices to prove M(m × n_k) ≤ k. If m is even, test half of A and use induction. So, it suffices to consider odd m. The sample space m × 1 is A-distinct. Therefore there exists an A-sharp algorithm by Lemma 3.1.1. For general n_k > 1, Theorem 3.1.2 is proved by induction on n_k. Note that

2^{k-2} < m·n_{k-1} ≤ 2^{k-1} < m(n_{k-1} + 1)

implies

2^{k-1} < m(2n_{k-1}) ≤ 2^k < m(2n_{k-1} + 2).
Therefore n_k is either 2n_{k-1} or 2n_{k-1} + 1. In the former case, test half of the set B and use induction on the remaining (m × n_{k-1})-problem. In the latter case, let r be the largest integer such that n_k = 2^r·n_{k-r} + 1. Then r ≥ 1 and n_{k-r} is necessarily odd. Let

m·n_{k-r} = 2^{k-r-1} + 2^{k-r-2} + ... + 2^{k-r-p} + q, where 0 < q < 2^{k-r-p-1}.

Then

m·n_k = m(2^r·n_{k-r} + 1) = 2^{k-1} + 2^{k-2} + ... + 2^{k-p} + 2^r·q + m.
Let T be an A-sharp algorithm for the (m × n_{k-r})-problem. The existence of T is assured by the induction hypothesis. Let v be the node on the all-positive path of T associated with q samples. Let J be the set of j such that (A_i, B_j) ∈ S(v) for some A_i. For j ∈ J, let L_j denote the set consisting of those A_i's such that (A_i, B_j) ∈ S(v). Since S(v) is A-distinct, the L_j's are disjoint. An A-sharp algorithm is now given for the (m × n_k)-problem. For notational convenience, write n for n_{k-r} and n' for n_k. Then B will refer to the set of n items and B' to the set of n' items. Partition B' − {B_{n'}} into n groups of 2^r items, G_1, ..., G_n. Consider T truncated at the node v, i.e., delete the subtree rooted at v from T. Let T' be an algorithm for the (m × n')-problem where T' is obtained from T by replacing each item B_j in a test by the group G_j and adding B_{n'} to every group tested on the all-positive path. Then each terminal node of T', except the node v' corresponding to v, will be associated with a set of solutions (A_i × G_j) for some i and j. Since the only uncertainty is on G_j and |G_j| = 2^r, r more tests suffice. Therefore, it suffices to give an A-sharp algorithm for the sample space

S(v') = {∪_{j∈J}(L_j × G_j)} ∪ (A × B_{n'}),

with |S(v')| = 2^r·q + m. Let G_{j1}, ..., G_{j2^r} denote the 2^r items in G_j. Define

T_1 = {∪_{j∈J} G_{j1}} ∪ R,

where R is a subset of A-items not in any of the L_j, j ∈ J, with |R| = 2^r·q + m − 2^{k-p-1} − q. Note that there are a total of m − q A-items not in any of the L_j. It is now proved that m − q ≥ |R| ≥ 0. The former inequality follows immediately from the fact that

2^r·q < 2^r·2^{k-r-p-1} = 2^{k-p-1}.
Furthermore, since

2^{k-1} < m(n_{k-1} + 1) = m(2^{r-1}·n_{k-r} + 1) = 2^{k-2} + 2^{k-3} + ... + 2^{k-p-1} + 2^{r-1}·q + m,

it follows that

2^{r-1}·q + m > 2^{k-p-1},

or

|R| > 2^{r-1}·q − q ≥ 0.
Test T_1 at S(v'). Let S_g and S_d denote the parts of the partition of S(v') according as the outcome of T_1 is negative or positive. Then

S_d = {∪_{j∈J}(L_j × G_{j1})} ∪ (R × B_{n'}),

with

|S_d| = q + 2^r·q + m − 2^{k-p-1} − q = 2^r·q + m − 2^{k-p-1},

and

S_g = {∪_{w=2}^{2^r} ∪_{j∈J}(L_j × G_{jw})} ∪ ({A \ R} × B_{n'}),

with |S_g| = |S(v')| − |S_d| = 2^{k-p-1}. Since S_d is A-distinct, there exists an A-sharp algorithm for S_d by Lemma 3.1.1. It remains to be shown that S_g can be done in k − p − 1 tests. Note that S_g can also be represented as

S_g = {∪_{j∈J}(L_j × ({G_j \ {G_{j1}}} ∪ {B_{n'}}))} ∪ ({(A \ R) \ ∪_{j∈J}L_j} × B_{n'}).
Since |{G_j \ {G_{j1}}} ∪ {B_{n'}}| = 2^r, |(A \ R) \ ∪_{j∈J}L_j| must also be a multiple of 2^r. Partition (A \ R) \ {∪_{j∈J}L_j} into 2^r subsets of equal size, H_1, ..., H_{2^r}. Define

T_w = {∪_{j∈J} G_{jw}} ∪ H_w for w = 2, 3, ..., 2^r.

Then the T_w's are disjoint. By testing a sequence of proper combinations of the T_w's, it is easily seen that in r tests S_g can be partitioned into 2^r subsets consisting of

S_w = {∪_{j∈J}(L_j × G_{jw})} ∪ (H_w × B_{n'}), w = 2, ..., 2^r,

and

S_1 = S_g \ {∪_{w=2}^{2^r} S_w}.

Furthermore, S_w is A-distinct and |S_w| = 2^{k-p-r-1} for each w = 1, ..., 2^r. Therefore each S_w can be solved in k − p − r − 1 more tests by Lemma 3.1.1. This shows that the algorithm just described for S(v') is A-sharp. Therefore, the algorithm T' plus the extension on v' as described is A-sharp. □
Figure 3.1: An A-sharp algorithm for A_3 × B_21.

Corollary 3.1.3 M(m × n) = ⌈log mn⌉ for all m and n.

The corollary follows from Theorem 3.1.2 by way of the easily verifiable fact that M(m × n) is monotone nondecreasing in n. It can be easily verified that Corollary 3.1.3 remains true if the m × n model is changed so that A and B each contain "at least" one defective and the problem is to identify one defective from each of A and B. Denote the new problem by m̄ × n̄. Then

Corollary 3.1.4 M(m̄ × n̄) = ⌈log mn⌉ for all m and n.

Ruszinko [11] studied the line version, i.e., A and B are ordered sets and a test group must consist of items from the top of the two orders. He showed that for m = 11 and n = (2^{10k} − 1)/11 (which is an integer by Fermat's theorem), the first test group must consist of 8 items from A and 2^{10k-1} items from B. But no second test can split the remaining samples into two parts both of sizes not exceeding 2^{10k-2}. He also proved

Theorem 3.1.5 If log m + log n − ⌊log m⌋ − ⌊log n⌋ > 0.8, then ⌈log mn⌉ tests suffice.

Weng and Hwang [13] considered the case of k disjoint sets.
Theorem 3.1.6 Suppose that set S_i has 2^{2^i} + 1 items containing exactly one defective, for i = 0, 1, ..., m. Then

M(S_0 × S_1 × ... × S_m) = 2^{m+1}.

Proof. Since

∏_{i=0}^{m} (2^{2^i} + 1) = 2^{2^{m+1}-1} + 2^{2^{m+1}-2} + ... + 2^1 + 2^0 = 2^{2^{m+1}} − 1,

M(S_0 × S_1 × ... × S_m) ≥ 2^{m+1} by Theorem 1.2.1. The reverse inequality is proved by giving an algorithm which requires only 2^{m+1} tests. Let I_i be an arbitrary item from S_i, i = 0, 1, ..., m. Consider the sequence of subsets J_1, ..., J_{2^{m+1}-1}, where J_{2^k} = {I_k} for k = 0, 1, ..., m and J_{2^k+j} = {I_k} ∪ J_j for 1 ≤ j < 2^k. It is easily proved by induction that Σ_{I_i∈J_j} 2^i = j for 1 ≤ j ≤ 2^{m+1} − 1. The actual testing of groups is done in the reverse order of the J_j until either a negative outcome is obtained or J_1 is tested positive. Note that a subset is tested only when all subsets containing it have been tested. Therefore J_{2^{m+1}-1}, ..., J_{j+1} all tested positive and J_j negative imply that items in J_j are good and items in {I_0, I_1, ..., I_m} \ J_j are defective. Thus the (S_0 × S_1 × ... × S_m) problem is reduced to the ∏_{I_i∈J_j}(S_i \ {I_i}) problem (each S_i \ {I_i} has 2^{2^i} items and its defective can be found by binary splitting in 2^i tests). The total number of tests is

(2^{m+1} − j) + Σ_{I_i∈J_j} 2^i = 2^{m+1}.

If J_{2^{m+1}-1}, ..., J_1 all tested positive, then each I_i, 0 ≤ i ≤ m, has been identified as a defective through individual testing. Therefore all defectives are identified while the total number of tests is simply 2^{m+1} − 1. □
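The subsets J_j in this proof are exactly the binary representations: J_j contains I_i precisely when bit i of j is 1, which makes the identity Σ_{I_i∈J_j} 2^i = j immediate. The sketch below builds the J_j this way, simulates the descending test sequence, and checks the inference drawn at the first negative outcome.

```python
def J(j):
    """J_j contains index i exactly when bit i of j is set."""
    return {i for i in range(j.bit_length()) if (j >> i) & 1}

def descending_tests(m, defective_indices):
    """Test J_(2^(m+1)-1), J_(2^(m+1)-2), ... until a negative outcome.
    defective_indices = the set of i for which item I_i is defective."""
    tests = 0
    for j in range(2 ** (m + 1) - 1, 0, -1):
        tests += 1
        if not (J(j) & defective_indices):   # negative: every I_i in J_j is good
            good = J(j)
            return tests, set(range(m + 1)) - good, good
    return tests, set(range(m + 1)), set()   # all positive: every I_i is defective
```

With m = 2 and only I_1 defective, the first negative outcome occurs at J_5 = {I_0, I_2} after 3 tests, correctly marking I_1 defective; the Σ 2^i = 5 remaining binary-splitting tests on S_0 and S_2 then bring the total to exactly 2^{m+1} = 8, as in the proof.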
3.2
An Application to Locating Electrical Shorts
A prevalent type of fault in the manufacture of electrical circuits is the presence of a short circuit ("short") between two nets of a circuit. Short testing constitutes a significant part of the manufacturing process. Several fault patterns can be described as variations or combinations of these types of faults. Applications of short testing range from printed circuit board testing and circuit testing to functional testing. Short testing procedures can have two different objectives: a detecting procedure aims simply to detect the presence of a short, while a locating procedure identifies the shorted pairs of nets. A short detector in common use is an apparatus involving two connecting leads which, when connected respectively to two nets or groups of nets, can detect, but not locate, the presence of a short between these nets or groups of nets. An obvious way to use this device for short location is the so-called n-square testing, which checks each pair of nets separately. If the circuit has n nets to start with, then this location procedure requires n(n − 1)/2 tests.
Figure 3.2: A short detector.

Garey, Johnson, and So [7] proposed a procedure for detecting shorts between nets in printed circuit boards. They showed that no more than 5, 8, or 12 tests (depending on the particular assumptions) will ever be required, independent of the number of nets. However, it is not a locating procedure, and their assumptions allow only shorts occurring vertically or horizontally, not diagonally. Skilling [12] proposed a clever locating procedure much more efficient than the n-square testing. In his method each net is tested individually against all other nets, collectively. Thus after n tests one obtains all the shorted nets, though not knowing to which other nets they are shorted. If there are d shorted nets, then d(d − 1)/2 more tests are required to determine which of these are shorted to which others. Thus the total number of tests is n + d(d − 1)/2, which can be significantly less than n(n − 1)/2 for n much larger than d. Chen and Hwang [1] proposed a locating procedure, based on Theorem 3.1.2, which requires approximately 2(d + 1) log n tests. Consider a circuit board with n nets. Without loss of generality assume n is a power of 2. This testing process can be mathematically described as follows: Let [Σ_{i=1}^{k}(a_i × b_i)] denote the configuration where there exist 2k disjoint sets with a_1, b_1, ..., a_k, b_k nets such that there exists a shorted pair between the nets in a_i and the nets in b_i for at least one i. Let [Σ_{i=1}^{k} a_i] denote the configuration where there exist k disjoint sets with
$a_1, \ldots, a_k$ nets such that there exists a shorted pair between two nets in $a_i$ for at least one i. Lemma 3.2.1 now follows from Theorem 3.1.2.
□
Define the halving procedure H for the short testing problem as follows. For the configuration $[\sum_{i=1}^{k} a_i]$ split each set of $a_i$ nets into halves (as equally as possible); connect one half to one lead and the other half to the other lead. When no short is present, move into the configuration $[\sum_{i=1}^{2k} a'_i]$, where each $a_i$ splits into $a'_i$ and $a'_{i+k}$. When a short is present, move into the configuration $[\sum_{i=1}^{k}(\lfloor a_i/2 \rfloor \times \lceil a_i/2 \rceil)]$. Use binary splitting on the k pairs to locate one pair $\lfloor a_i/2 \rfloor \times \lceil a_i/2 \rceil$ that contains a short in $\lceil \log k \rceil$ tests; by Lemma 3.2.1 a shorted pair can then be located from that pair.

Theorem 3.2.2 On n nets, procedure H either locates a shorted pair in at most $2\lceil \log n \rceil - 1$ tests, or determines in $\lceil \log n \rceil$ tests that no shorted pair exists.

Proof. Note that all configurations $[\sum_{i=1}^{k} a_i \times b_i]$ that H encounters have k a power of 2 and each $a_i b_i \le 2^{2(a - 1 - \log k)}$. Suppose that the first time a configuration of the form $[\sum_{i=1}^{k} a_i \times b_i]$ is obtained is when $k = 2^m$. Then $m + 1$ tests have been used. In another m tests a pair $a \times b$ which contains a shorted pair is obtained. Finally, the shorted pair is located in $2\lceil \log n \rceil - 2m - 2$ more tests. Adding up the tests yields $2\lceil \log n \rceil - 1$. If after $\lceil \log n \rceil$ tests the configuration $[\sum_i a_i]$ with each $a_i \le 1$ is obtained, then clearly no shorted pair exists. □

Now consider two modes of operation, R and R̄, depending on whether a located short is immediately repaired or not. An important difference is that a newly detected short cannot be a previously located short under mode R.

Theorem 3.2.3 If the n nets contain d shorted pairs, then $d + 1$ runs of H will locate all shorted pairs in at most $(2d + 1)\lceil \log n \rceil - d$ tests under mode R.

In practice, if it is known that no shorted pair exists between a set of $\sum_{i=1}^{k} a_i$ nets and a set of $\sum_{i=1}^{k} b_i$ nets in a run, then the test can certainly be skipped in all later runs.

Corollary 3.2.4 At most $(2d + 1)\lceil \log n \rceil$ tests are required if mode R is replaced by
mode R̄.

Proof. After H locates a shorted pair (x, y), remove y and test x against the other $n - 2$ nets. If a short is detected, then locate another shorted pair in $1 + \lceil \log(n-2) \rceil \le 2\lceil \log n \rceil$ tests by Corollary 3.1.4. If no short is detected, then (x, y) is the only shorted pair involving x. Remove x and apply H to the other $n - 1$ nets (y is put back). Thus the detection of each shorted pair may consume one extra test. □
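The binary-splitting step that underlies Lemma 3.2.1 can be sketched in code. The sketch below is illustrative only (the function names and the oracle `has_short` are ours, not from [5]): it assumes that some short exists between groups A and B and narrows each side by halving.

```python
def locate_short(A, B, has_short, counter):
    """Locate one shorted pair (u, v) with u in A and v in B, assuming a
    short exists between A and B.  has_short(X, Y) models the two-lead
    detector of Figure 3.2; counter[0] accumulates the number of tests."""
    while len(A) > 1:                     # narrow A against all of B
        half = A[:len(A) // 2]
        counter[0] += 1
        A = half if has_short(half, B) else A[len(A) // 2:]
    u = A[0]
    while len(B) > 1:                     # narrow B against the single net u
        half = B[:len(B) // 2]
        counter[0] += 1
        B = half if has_short([u], half) else B[len(B) // 2:]
    return u, B[0]
```

With $|A| = |B| = 2^c$ this uses exactly 2c detector tests, matching the $\lceil \log ab \rceil$ count of Lemma 3.2.1 for powers of 2.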
TABLE 3.1
A COMPARISON OF THREE METHODS

          n-square      Skilling method                    Procedure H (mode R)
    n       method    d=0    d=2    d=4    d=6    d=8    d=0   d=2   d=4   d=6   d=8
   64         2016     64     65     70     79     92      6    28    50    72    94
  128         8128    128    129    134    143    156      7    33    59    85   111
  256        32640    256    257    262    271    284      8    38    68    98   128
  512       130816    512    513    518    527    540      9    43    77   111   145
 1024       523776   1024   1025   1030   1039   1052     10    48    86   124   162
 2048      2096128   2048   2049   2054   2063   2076     11    53    95   137   179
Skilling [12] compared the number of tests required by his method with the n-square method for $n = 64, 128, \ldots, 2048$ and $d = 0, 2, \ldots, 8$. Table 3.1 adds the corresponding numbers for H into the comparison. Thus procedure H can save significant numbers of tests over Skilling's method for practical values of n and d. Note that none of the three compared procedures assumes knowledge of d.

The previous discussion assumed no restriction on the number of nets that can be included in a test. In practice, however, this is sometimes not the case. Therefore, in order to have a practical implementation of the proposed algorithm, the algorithm needs to be modified to accommodate hardware restrictions. The original analysis given in [5] contains some errors. The following is a revised account.

Theorem 3.2.5 Suppose that at most l nets can be included in one of the two groups. Then there exists a procedure for mode R which requires at most $\lceil n/l \rceil + \lceil n/2l \rceil \lceil \log l \rceil + d\lceil 2\log l \rceil + d$ tests.

Proof. Partition the n nets into $\lceil n/l \rceil$ units, where each unit has l nets except possibly one unit with fewer. Suppose that there exist k between-unit shorted pairs and $d - k$ within-unit shorted pairs. First, use the Skilling procedure on the units to locate the (at most k) pairs of units containing between-unit shorted pairs in $\lceil n/l \rceil + \binom{k}{2}$ tests. Note that in the Skilling procedure one group always contains a single unit, hence no more than l nets. For each of the k unit-pairs, use the procedure in Lemma 3.2.1 to locate a shorted pair in $\lceil \log l^2 \rceil$ tests. Repair the pair and test the same unit-pair again to see whether another shorted pair exists. If it does, repeat the same procedure. Thus each between-unit shorted pair can be located in $\lceil 2\log l \rceil + 1$ tests. To locate the within-unit shorted pairs, merge any two units.
The choice of merging units is not unique, but all possible choices have the same testing complexity. Apply the procedure H to the merged unit, with the first test (pitting one unit against the other) skipped. Note that the procedure H on 2l nets puts at most l nets in a group. Let $k_i$ denote the number of shorted pairs in the ith merged unit, $i = 1, \ldots, \lceil n/2l \rceil$. By Theorem 3.2.3 all $\sum_{i=1}^{\lceil n/2l \rceil} k_i = d - k$ shorted pairs can be located in

$\sum_{i=1}^{\lceil n/2l \rceil} \{(2k_i + 1)\lceil \log 2l \rceil - k_i - 1\} = (2d - 2k + \lceil n/2l \rceil)\lceil \log 2l \rceil - d + k - \lceil n/2l \rceil$

tests. Thus the total number of tests is

$\lceil n/l \rceil + \binom{k}{2} + k(\lceil 2\log l \rceil + 1) + (2d - 2k + \lceil n/2l \rceil)\lceil \log 2l \rceil - d + k - \lceil n/2l \rceil$
$= \lceil n/l \rceil + \binom{k}{2} + k(\lceil 2\log l \rceil + 1) + (2d - 2k + \lceil n/2l \rceil)\lceil \log l \rceil + d - k ,$

which achieves the maximum $\lceil n/l \rceil + d\lceil 2\log l \rceil + \lceil n/2l \rceil \lceil \log l \rceil + d$ at $k = d$. Theorem 3.2.5 follows immediately. □

Corollary 3.2.6 At most d more tests are needed if mode R is replaced by mode R̄.

Let n = 1024 in the following table of numbers of tests for various values of d and l. Note that procedure H requires a slightly greater number of tests than Skilling's method for l = 1. Also note that for d large the number of tests is not necessarily decreasing in l. This suggests two things. First, the benefit of having larger l is decreasing. Second, for l large one may not want to use the full capacity. It is clear that $l > n/2$ is never needed, since one can always apply the capacity constraint to the smaller group of nets. Therefore l = 512 corresponds to the case of no capacity constraint. A discrepancy of one test between l = 512 in Table 3.2 and n = 1024 for procedure H in Table 3.1 is due to a slight difference in analysis.

TABLE 3.2
PROCEDURE H UNDER CONSTRAINT (n = 1024)

 d\l      1     2     4     8    16    32    64   128   256   512
  0    1024   768   512   320   192   112    64    36    20    11
  2    1046   774   522   334   210   134    90    66    54    49
  4    1068   780   532   348   228   156   116    96    88    87
  6    1090   786   542   362   246   178   142   126   122   125
  8    1112   792   552   376   264   200   168   156   156   163
3.3 The 2-Defective Case
Let $n_t(d)$ denote the largest n such that the (d, n) problem can be solved in t tests. Since M(d, n) is increasing in n for $d < n$ (Theorem 1.5.5), a complete solution of $n_t(d)$ is equivalent to a complete solution of M(d, n). While $n_t(1) = 2^t$ is easily obtained by binary splitting, the solution of $n_t(2)$ is surprisingly hard and remains open. In this section bounds on $n_t(2)$ are studied.

For $t > 1$ let $i_t$ denote the integer such that

$\binom{i_t}{2} < 2^t < \binom{i_t + 1}{2} .$

Since no integer i is a solution of $\binom{i}{2} = 2^t$ for $t > 1$, no ambiguity arises from the definition of $i_t$. By the information lower bound (Theorem 1.2.1), $i_t$ is clearly an upper bound of $n_t(2)$. Chang, Hwang and Lin [3] showed that $i_t - 1$ is also an upper bound of $n_t(2)$. First some lemmas.

Lemma 3.3.1 $i_t = \lfloor 2^{(t+1)/2} - 1/2 \rfloor + 1$.

Proof. It suffices to prove

$\binom{\lfloor 2^{(t+1)/2} - 1/2 \rfloor + 1}{2} < 2^t < \binom{\lfloor 2^{(t+1)/2} - 1/2 \rfloor + 2}{2} .$

Note that $\binom{i}{2} < 2^t$ if and only if $(i - 1/2)^2 < 2^{t+1} + 1/4$. Since

$\lfloor 2^{(t+1)/2} - 1/2 \rfloor + 1 \le 2^{(t+1)/2} + 1/2 < \lfloor 2^{(t+1)/2} - 1/2 \rfloor + 2 ,$

the first inequality follows from $(2^{(t+1)/2})^2 = 2^{t+1} < 2^{t+1} + 1/4$, and the second follows by noting that no integer i is a solution of $\binom{i}{2} = 2^t$. □
Lemma 3.3.2 $\binom{i_{t+1}}{2} - \binom{i_t - 1}{2} > 2^t$.

Proof. For t even, $i_{t+1} = 2^{(t+2)/2}$. Therefore

$\binom{i_{t+1}}{2} - \binom{i_t - 1}{2} = \frac{1}{2}\{2^{(t+2)/2}(2^{(t+2)/2} - 1) - (i_t - 1)(i_t - 2)\}$
$= \frac{1}{2}\{2^{t+2} - 2^{(t+2)/2} - (i_t - 1)i_t + 2(i_t - 1)\}$
$= 2^{t+1} - \binom{i_t}{2} + [(i_t - 1) - 2^{t/2}] > 2^t ,$

since $2^{t+1} - \binom{i_t}{2} > 2^{t+1} - 2^t = 2^t$ and $\binom{i_t + 1}{2} > 2^t$ implies $i_t - 1 \ge 2^{t/2}$.

For t odd, $i_t = 2^{(t+1)/2}$. Therefore

$\binom{i_{t+1}}{2} - \binom{i_t - 1}{2} = \frac{1}{2}\{(i_{t+1} - 1)i_{t+1} - (2^{(t+1)/2} - 1)(2^{(t+1)/2} - 2)\}$
$= \frac{1}{2}\{(i_{t+1} + 1)i_{t+1} - 2(i_{t+1} - 1) - 2^{t+1} + 3 \cdot 2^{(t+1)/2} - 4\}$
$= \left\{\binom{i_{t+1} + 1}{2} - 2^t\right\} + \left\{3 \cdot 2^{(t-1)/2} - (i_{t+1} + 1)\right\} > 2^{t+1} - 2^t = 2^t ,$

since $\binom{i_{t+1} + 1}{2} > 2^{t+1}$, and $\binom{i_{t+1}}{2} < 2^{t+1}$ implies $3 \cdot 2^{(t-1)/2} \ge i_{t+1} + 1$. □
T h e o r e m 3.3.3 nt(2) < it - 1 for t > 4. Proof. It is easily verified that n 4 = 6 = J4 —1. The general case is proved by induction on 2. Consider an arbitrary algorithm T. It will be shown that Mr{2,it) > t. Suppose that the first test of T is on a set of m items. If m < it — (it-i — l), consider the negative outcome. Then the problem is reduced to the (2, it — m) problem. Since it — m > it-\ — 1, at least t more tests are needed by the induction hypothesis. If m > it — (it-i — 1), consider the positive outcome. The set of samples after the first test has cardinality it\
fit - rn\ ^ fit\
2)
V 2 J* (2) - ( 2
(it-\ - l\ _
Thus again, at least t more tests are needed.
nt-1
J>2
byLemmaU2
'
•
Chang, Hwang and Lin also gave a lower bound for $n_t(2)$, which was later improved by Chang, Hwang and Weng [4]. The algorithm of the latter is reported here.
Theorem 3.3.4 There exists an algorithm c such that the $(2, c_t)$ problem can be solved in t tests, where $c_t = n_t(2)$ for $t \le 11$ and

$c_t = 89 \cdot 2^{k-6}$ for $t = 2k \ge 12$ ,
$c_t = 63 \cdot 2^{k-5}$ for $t = 2k + 1 \ge 13$ .
Proof. Constructions for $c_t$, $t \le 11$, were given in [4] and will not be repeated here. These $c_t$ equal $i_t - 1$ for $4 \le t \le 11$ and $i_t$ for $t \le 3$, hence equal $n_t(2)$ by Theorems 3.3.3 and 1.2.1. For $t \ge 12$, there exists a generic algorithm as follows:

1. Test a group of $c_t - c_{t-1}$ items. If the outcome is negative, follow the $c_{t-1}$ construction.

2. If the outcome of the first test is positive, next test a group of m items from the untested group (with $c_{t-1}$ items), where

$m = 39 \cdot 2^{k-6}$ for $t = 2k$ ,
$m = 55 \cdot 2^{k-6}$ for $t = 2k + 1$ .

If the outcome is positive, the problem is reduced to the $(c_t - c_{t-1}) \times m$ problem. Since

$(c_t - c_{t-1})m = (26 \cdot 2^{k-6})(39 \cdot 2^{k-6}) < 2^{2k-2}$ for $t = 2k$ ,
$(c_t - c_{t-1})m = (37 \cdot 2^{k-6})(55 \cdot 2^{k-6}) < 2^{2k-1}$ for $t = 2k + 1$ ,

by Corollary 3.1.3, $t - 2$ more tests suffice.

3. If the outcome of the second test is negative, next test a group of $\lfloor m/2 \rfloor$ items from the untested group (with $c_{t-1} - m$ items). If the outcome is positive, the problem is reduced to the $(c_t - c_{t-1}) \times \lfloor m/2 \rfloor$ problem. Since

$(c_t - c_{t-1})\lfloor m/2 \rfloor \le \frac{1}{2}(c_t - c_{t-1})m < 2^{t-3} ,$

$t - 3$ more tests suffice. If the outcome is negative, a total of

$c_t - m - \lfloor m/2 \rfloor \le 89 \cdot 2^{k-6} - 39 \cdot 2^{k-6} - 19 \cdot 2^{k-6} = 31 \cdot 2^{k-6} < c_{t-3}$ for $t = 2k$ ,
$c_t - m - \lfloor m/2 \rfloor \le 63 \cdot 2^{k-5} - 55 \cdot 2^{k-6} - 27 \cdot 2^{k-6} = 44 \cdot 2^{k-6} < c_{t-3}$ for $t = 2k + 1$ ,

items is left; hence $t - 3$ more tests suffice. □

Corollary 3.3.5 $c_t/n_t(2)$
$\ge 0.983$.

Proof. For $t \le 11$, $c_t/n_t(2) = 1$. For $t \ge 12$, substituting $i_t - 1$ for $n_t(2)$,

$\frac{c_t}{n_t(2)} \ge \frac{89 \cdot 2^{k-6}}{2^{(2k+1)/2}} = \frac{89}{64\sqrt{2}} > 0.983$ for $t = 2k$ ,

$\frac{c_t}{n_t(2)} \ge \frac{63 \cdot 2^{k-5}}{2^{k+1}} = \frac{63}{64} > 0.984$ for $t = 2k + 1$ . □
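Corollary 3.3.5 can be verified numerically from the explicit $c_t$; a sketch (we recompute $i_t$ from its definition, and for $4 \le t \le 11$ the proof gives $c_t = i_t - 1$):

```python
from math import comb

def i_t(t):                     # C(i,2) < 2**t < C(i+1,2), as in Lemma 3.3.1
    i = 1
    while comb(i + 1, 2) < 2 ** t:
        i += 1
    return i

def c_t(t):
    """Item counts of Theorem 3.3.4 (valid for t >= 4)."""
    if t <= 11:
        return i_t(t) - 1       # c_t = n_t(2) for these t
    k, r = divmod(t, 2)
    return 89 * 2 ** (k - 6) if r == 0 else 63 * 2 ** (k - 5)
```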
Chang, Hwang and Weng also gave an improvement of the algorithm c. Let w be the algorithm resulting from the selection of the largest m satisfying

$(w_t - w_{t-1})m \le 2^{t-2} ,$
$(w_t - w_{t-1})\lfloor m/2 \rfloor \le 2^{t-3} ,$
$w_t - m - \lfloor m/2 \rfloor \le w_{t-3} .$

Clearly, the largest m is

$m = \left\lfloor \frac{2^{t-2}}{w_t - w_{t-1}} \right\rfloor ,$

and $w_t$ can be recursively computed as the largest integer satisfying

$w_{t-3} \ge w_t - \left\lfloor \frac{2^{t-2}}{w_t - w_{t-1}} \right\rfloor - \left\lfloor \frac{2^{t-3}}{w_t - w_{t-1}} \right\rfloor .$

Let $u_t$ denote the bound $i_t - 1 = \lfloor 2^{(t+1)/2} - 1/2 \rfloor$.

Theorem 3.3.6 $w_t/u_t \to p^*$ as $t \to \infty$, where

$p^* = \left(\frac{3}{4(5 - 3\sqrt{2})}\right)^{1/2} > 0.995 .$
Proof. It suffices to show that for any positive $p < p^*$ there exist $t_p$ and q, where q depends on p and $t_p$ but not on t, such that

$\frac{w_t}{2^{(t+1)/2}} \ge p - \frac{q}{2^{(t+1)/2}}$ for all $t \ge t_p .$

Relax the inequality defining $w_t$ to

$w_{t-3} \ge w_t - \left(\frac{2^{t-2}}{w_t - w_{t-1}} - 1\right) - \left(\frac{2^{t-3}}{w_t - w_{t-1}} - 1\right) ,$

or equivalently,

$(w_t - w_{t-1})(w_t - w_{t-3} + 2) - 3 \cdot 2^{t-3} \le 0 .$
Though this relaxation may decrease $w_t$, it does not affect the asymptotic result. Let $q = 2^{(t_p+1)/2} p$; then $p - q/2^{(t+1)/2} \le 0$ for $t \le t_p$, so the claimed inequality holds trivially there. Note that $t_p$ will be determined independent of q. The proof for $t > t_p$ is by induction on t. It will be shown that

$y = p \cdot 2^{(t+1)/2} - q$

satisfies the inequality

$(y - w_{t-1})(y - w_{t-3} + 2) - 3 \cdot 2^{t-3} \le 0 ;$

since $w_t$ is the largest integral solution of the relaxed inequality, this yields $w_t \ge y$, which is the claim for t. For easier writing define $r = 2^{t/2}$. By the induction hypothesis $w_{t-1} \ge p \cdot 2^{t/2} - q$ and $w_{t-3} \ge p \cdot 2^{(t-2)/2} - q$, so that (the q's cancel)

$y - w_{t-1} \le p(\sqrt{2} - 1)r$ and $y - w_{t-3} + 2 \le p(\sqrt{2} - 1/2)r + 2 .$

Hence, using $(\sqrt{2} - 1)(\sqrt{2} - 1/2) = (5 - 3\sqrt{2})/2$,

$(y - w_{t-1})(y - w_{t-3} + 2) \le \frac{5 - 3\sqrt{2}}{2}\, p^2 r^2 + 2(\sqrt{2} - 1)p r .$

Since $p < p^*$ and $p^{*2} = 3/(4(5 - 3\sqrt{2}))$,

$\frac{5 - 3\sqrt{2}}{2}\, p^2 - \frac{3}{8} = \frac{5 - 3\sqrt{2}}{2}\,(p^2 - p^{*2}) < 0$

is a negative constant, while $2(\sqrt{2} - 1)p/r \to 0$ as $t \to \infty$. Therefore $t_p$ can be chosen, independent of q, such that

$\frac{5 - 3\sqrt{2}}{2}\, p^2 r^2 + 2(\sqrt{2} - 1)p r \le \frac{3}{8} r^2 = 3 \cdot 2^{t-3}$ for all $t \ge t_p ,$

which is the required inequality. Since $u_t/2^{(t+1)/2} \to 1$, it follows that $\liminf_t w_t/u_t \ge p$ for every $p < p^*$; a matching upper bound is obtained from the defining inequality in the same way, and the theorem follows. □
3.4 The 3-Defective Case
Chang, Hwang and Weng [4] gave the following result.

Theorem 3.4.1 There exists an algorithm h such that the $(3, h_t)$ problem can be solved in t tests, where

$h_t = t + 1$ for $t \le 7$ ,
$h_{3k+2} = h_{3k+1} + 2^{k-1}$ for $t = 3k + 2 \ge 8$ ,
$h_{3k} = c_{2k} + 2^{k-2} + 1$ for $t = 3k \ge 9$ ,
$h_{3k+1} = \lfloor (c_{2k+1} + c_{2k})/2 \rfloor + 3 \cdot 2^{k-3} + 1$ for $t = 3k + 1 \ge 10$ ,

except $h_{12} = 26$, $h_{13} = 32$, $h_{14} = 40$ and $h_{15} = 52$, where the $c_t$ are given in Theorem 3.3.4.

Proof. For $t \le 7$ let h be the individual testing algorithm. For $t = 3k + 2 \ge 8$ let h be the following algorithm: First test a group of $2^{k-1}$ items. If the outcome is negative, the number of remaining items is $h_{3k+2} - 2^{k-1} = h_{3k+1}$. Hence $3k + 1$ more tests suffice. If the outcome is positive, identify a defective in the contaminated group in $k - 1$ more tests. The number of remaining items is at most

$h_{3k+2} - 1 \le 108 \cdot 2^{k-6} + 3 \cdot 2^{k-3} + 2^{k-1} = 41 \cdot 2^{k-4} < c_{2k+2} .$
The total number of tests is at most $1 + (k - 1) + (2k + 2) = 3k + 2$.

For $t = 3k$ let h be the following algorithm:

1. First test a group of $3 \cdot 2^{k-3}$ items. If the outcome is negative, the number of remaining items is

$h_{3k} - 3 \cdot 2^{k-3} = 89 \cdot 2^{k-6} + 2^{k-2} + 1 - 3 \cdot 2^{k-3} = 81 \cdot 2^{k-6} + 1 \le \lfloor 107 \cdot 2^{k-7} \rfloor + 3 \cdot 2^{k-4} + 1 + 2^{k-2} \le h_{3k-1} .$

Hence $3k - 1$ more tests suffice.

2. If the outcome of the first test is positive, next test a group of $2^{k-2}$ items consisting of $2^{k-3}$ items from each of the contaminated (with $3 \cdot 2^{k-3}$ items) and the untested (with $h_{3k} - 3 \cdot 2^{k-3}$ items) groups. If the outcome is negative, the size of the contaminated group is reduced to $2^{k-2}$. Identify a defective therein in $k - 2$ tests; the number of remaining items is at most

$h_{3k} - 2^{k-2} - 1 \le c_{2k}$ for $k \le 5$ , $= c_{2k}$ for $k \ge 6$ .

The total number of tests is at most $2 + (k - 2) + 2k = 3k$.

3. If the outcome of the second test is positive, then the unidentified items are divided into four sets A, B, C, D of sizes $2^{k-2}$, $2^{k-3}$, $2^{k-3}$ and $h_{3k} - 2^{k-1}$, such that $A \cup B$ and $B \cup C$ each contains a defective. Next test the set A. If the outcome is negative, then B must contain a defective. Identify it in $k - 3$ tests. The number of remaining items is at most

$2^{k-3} - 1 + 2^{k-3} + h_{3k} - 2^{k-1} = h_{3k} - 2^{k-2} - 1 ,$

which has been shown to be at most $c_{2k}$. The total number of tests is at most $3 + (k - 3) + 2k = 3k$. If the outcome is positive, then identify a defective in each of A and $B \cup C$ in $k - 2$ tests each. The number of remaining items is at most

$h_{3k} - 2 = c_{2k} + 2^{k-2} - 1 = 89 \cdot 2^{k-6} + 2^{k-2} - 1 < 2^{k+1} .$

Hence $k + 1$ more tests identify the last defective. The total number of tests is at most $3 + (k - 2) + (k - 2) + (k + 1) = 3k$.
Finally, for $t = 3k + 1 \ge 10$, let h be the following algorithm: First test a group of $h_{3k+1} - h_{3k}$ items. If the outcome is negative, there are $h_{3k}$ items left and the problem can be solved in 3k more tests. If the outcome is positive, the contaminated group contains

$h_{3k+1} - h_{3k} = \lfloor 37 \cdot 2^{k-6}/2 \rfloor + 2^{k-3}$ for $k \ge 6$

items (and $h_{16} - h_{15} = 66 - 52 = 14$ for $k = 5$). It is easily verified that $2^{k-2} < h_{3k+1} - h_{3k} < 2^{k-1}$. By Lemma 2.3.3, a defective can be identified from the contaminated group either in $k - 2$ tests, or in $k - 1$ tests with at least $2^{k-1} - (h_{3k+1} - h_{3k})$ good items also identified. In the first case, the number of remaining items is at most $h_{3k+1} - 1$, which can be verified to be at most $c_{2k+2}$ for all $k \ge 3$. In the second case, the number of remaining items is at most

$h_{3k+1} - 1 - [2^{k-1} - (h_{3k+1} - h_{3k})] = 2h_{3k+1} - h_{3k} - 2^{k-1} - 1 \le (c_{2k+1} + c_{2k} + 3 \cdot 2^{k-2} + 2) - (c_{2k} + 2^{k-2} + 1) - 2^{k-1} - 1 = c_{2k+1} .$

In either case the total number of tests is easily verified to be $3k + 1$. □

Let $v_t = \lfloor (6 \cdot 2^t)^{1/3} \rfloor + 1$. Since
$\binom{v_t + 1}{3} > 2^t$, $v_t$ is an upper bound for $n_t(3)$.

Corollary 3.4.2 There exists a $t^*$ large enough such that $h_t/v_t > 0.885$ for all $t > t^*$.

Proof. As $t \to \infty$,

$\frac{h_{3k}}{v_{3k}} \to \frac{105 \cdot 2^{k-6}}{6^{1/3}\, 2^k} = \frac{105}{64 \cdot 6^{1/3}} > 0.902 ,$

$\frac{h_{3k+1}}{v_{3k+1}} \to \frac{263 \cdot 2^{k-7}}{12^{1/3}\, 2^k} = \frac{263}{128 \cdot 12^{1/3}} > 0.896 ,$

$\frac{h_{3k+2}}{v_{3k+2}} \to \frac{327 \cdot 2^{k-7}}{24^{1/3}\, 2^k} = \frac{327}{128 \cdot 24^{1/3}} > 0.885 . \qquad \Box$
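Theorem 3.4.1 and Corollary 3.4.2 can likewise be checked numerically. The sketch below recomputes $c_t$ and $i_t$ as in Section 3.3, takes $v_t$ directly from the information bound $\binom{v_t}{3} \le 2^t < \binom{v_t+1}{3}$, and confirms the ratio bound over a range of t (helper names ours):

```python
from math import comb

def i_t(t):                     # C(i,2) < 2**t < C(i+1,2)
    i = 1
    while comb(i + 1, 2) < 2 ** t:
        i += 1
    return i

def c_t(t):                     # Theorem 3.3.4 (t >= 4)
    if t <= 11:
        return i_t(t) - 1
    k, r = divmod(t, 2)
    return 89 * 2 ** (k - 6) if r == 0 else 63 * 2 ** (k - 5)

SPECIAL = {12: 26, 13: 32, 14: 40, 15: 52}

def h_t(t):                     # item counts of Theorem 3.4.1
    if t <= 7:
        return t + 1            # individual testing
    if t in SPECIAL:
        return SPECIAL[t]
    k, r = divmod(t, 3)
    if r == 0:
        return c_t(2 * k) + 2 ** (k - 2) + 1
    if r == 1:
        return (c_t(2 * k + 1) + c_t(2 * k)) // 2 + 3 * 2 ** (k - 3) + 1
    return h_t(t - 1) + 2 ** (k - 1)

def v_t(t):                     # largest v with C(v,3) <= 2**t
    v = 3
    while comb(v + 1, 3) <= 2 ** t:
        v += 1
    return v
```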
3.5 When is Individual Testing Minimax?
The individual testing algorithm for the (d, n) problem requires $n - 1$ tests, since the nature of the last item can be deduced. D. Newman conjectured, and Hwang [9] proved, that individual testing is minimax if $d \le n \le 2d$. Later, Hu, Hwang and Wang [8] strengthened that result to $d \le n \le \lfloor (5d + 1)/2 \rfloor$. They also conjectured that individual testing is minimax if $d \le n \le 3d$. If the conjecture is true, the bound 3d is sharp, since Hu, Hwang and Wang proved
Theorem 3.5.1 $M(d, n) < n - 1$ for $n > 3d$.

Proof. It suffices to prove Theorem 3.5.1 for $M(d, 3d + 1)$, since M(d, n) is nondecreasing in n. For d = 1, $M(1, 4) = 2 < 4 - 1$ by binary splitting. The general case is proved by induction on d. Let T be a nested (line) algorithm which always tests a group of two items (unless only one is left) if no contaminated group exists. When a contaminated group of two items exists, T identifies a defective in it by another test. In the case that the item in the contaminated group being tested is good, a good item is identified at no additional cost. All other good items are identified in pairs, except that those good items appearing after the last defective are identified by deduction without testing. Note that T takes at most 3d tests, since the d defectives require 2d tests and another d tests identify 2d good items. As the number of good items is odd, one of them will either be identified for free or is the last item, for which no test is needed (since it appears after the last defective).

Suppose that T takes 3d tests. Then before the last defective is identified, at least 2d good items must have been identified. If the last defective is identified with a free good item, then no more good items appear after the last defective; otherwise, at most one good item appears after. In either case, before the last two tests which identify the last defective, at most two items, one good and one defective, are unidentified. By testing one of them the nature of the other item is deduced. Thus only $3d - 1$ tests are needed. □

The latest progress towards proving the $n \le 3d$ conjecture is a result by Du and Hwang [6] which shows that individual testing is minimax for $n \le \lfloor 21d/8 \rfloor$.

Consider the following minimization problem. Let m and n be relatively prime positive integers, l an integer satisfying $0 \le l \le m + n - 2$, and $\lambda$ a positive number. Define $l_1 = \lfloor m(l+1)/(m+n) \rfloor$. The problem is to locate the minimum of

$F(k) = \lambda^k \binom{(m+n)k + l}{mk + l_1}$

over the nonnegative integers $k = 0, 1, 2, \ldots$
Let $l_2 = \lfloor n(l+1)/(m+n) \rfloor$. Since m and n are relatively prime and $l + 1 < m + n$, neither $m(l+1)/(m+n)$ nor $n(l+1)/(m+n)$ can be an integer. Therefore

$\frac{m(l+1)}{m+n} - 1 < l_1 < \frac{m(l+1)}{m+n} , \qquad \frac{n(l+1)}{m+n} - 1 < l_2 < \frac{n(l+1)}{m+n} .$

Adding up, one obtains $l - 1 < l_1 + l_2 < l + 1$, hence $l_1 + l_2 = l$. Define

$f(x) = \frac{\lambda \prod_{i=1}^{m+n}[(m+n)(x+1) + l + 1 - i]}{\prod_{i=1}^{m}[m(x+1) + l_1 + 1 - i]\ \prod_{i=1}^{n}[n(x+1) + l_2 + 1 - i]}$

for real $x \ge 0$. Then

$f(k) = \frac{F(k+1)}{F(k)}$ for $k = 0, 1, 2, \ldots$

If $f(x) = 1$ has no nonnegative solution, then F(k) is either monotone increasing or monotone decreasing. If $f(x) = 1$ has a unique nonnegative solution $x^0$, then F(k) has a minimum at a $k^0$ which is either $\lfloor x^0 \rfloor$ or $\lceil x^0 \rceil$. Of course, since $f(x) = 1$ is a polynomial equation of degree $m + n - 1$, one would not expect in general that it would have so few solutions. However, Du and Hwang gave an interesting and novel method to show that indeed $f(x) = 1$ has at most one nonnegative solution. Define

$M = \max\{(m + l_1)/m,\ (n + l_2)/n\}$ and $c = \frac{m^m n^n}{(m+n)^{m+n}\,\lambda} .$

Theorem 3.5.2 If $c \ge 1$ or $c \le 2(l+1)/(m + n + 2(l+1))$, then $f(x) = 1$ has no nonnegative solution. If $1 > c \ge 1 - 1/2M$, then $f(x) = 1$ has a unique nonnegative solution, lying in the interval $\left(\frac{1}{2(1-c)} - M,\ \frac{c}{2(1-c)} - \frac{l+1}{m+n} + 1\right)$. If $1 - 1/2M > c > 2(l+1)/(m + n + 2(l+1))$, then $f(x) = 1$ either has no nonnegative solution or has a unique one, lying in the interval $\left[0,\ \frac{c}{2(1-c)} - \frac{l+1}{m+n} + 1\right]$.

Proof. See [6]. □

Theorem 3.5.3 $M(d, n) = n - 1$ for $n \le \frac{21}{8}d$.
Proof. Since M(d, n) is nondecreasing in n (Corollary 1.5.2), it suffices to prove Theorem 3.5.3 for $n = \lfloor \frac{21}{8}d \rfloor$. The proof is decomposed into eight cases.

Case (i). $d = 8k$. Then $n = \lfloor \frac{21}{8}d \rfloor = 21k$. Set $t = 4k$. Then $n - t = 17k$, $d - t = 4k$, $n - 2t - 2 = 13k - 2$. The inequality $\binom{17k}{4k} > 2^{13k-2}$ is proved by showing that

$\min_k \binom{17k}{4k} \Big/ 2^{13k-2} > 1 .$

Theorem 3.5.3 then follows from Corollary 1.5.10 by setting $l = t$. Define

$F(k) = \binom{17k}{4k} \Big/ 2^{13k-2} .$

Then $m = 4$, $n = 13$, $l = l_1 = 0$, $\lambda = 2^{-13}$. Compute

$M = \max\left\{\frac{m + l_1}{m}, \frac{n + l_2}{n}\right\} = 1 , \qquad c = \frac{m^m n^n}{(m+n)^{m+n}\lambda} = 0.7677 > 1 - \frac{1}{2M} = 0.5 .$

Therefore, from Theorem 3.5.2, $f(k) = 1$ has a unique nonnegative solution in the interval (1.15, 2.59). Namely, F(k), and hence $\binom{17k}{4k}/2^{13k-2}$, attains a minimum at $k = 2$. Thus we have

$\min_k \binom{17k}{4k} \Big/ 2^{13k-2} = \binom{34}{8} \Big/ 2^{24} = 1.08 > 1 .$

As the proofs for the other seven cases are analogous to case (i) but with different parameter values, only the values of the parameters are given in each case without further details. $K^0$ will always denote the value of k that minimizes $\binom{n-t}{d-t}/2^{n-2t-2}$.

Case (ii). $d = 8k + 1$, $n = 21k + 2$, $t = 4k + 1$, $1.08 < K^0 < 2.54$.

$\min_k \binom{17k+1}{4k} \Big/ 2^{13k-2} = \binom{35}{8} \Big/ 2^{24} = 1.40 > 1 .$

Case (iii). $d = 8k + 2$, $n = 21k + 5$, $t = 4k + 2$, $0.92 < K^0 < 2.42$.

$\min_k \binom{17k+3}{4k} \Big/ 2^{13k-1} = \min\left\{\binom{20}{4} \Big/ 2^{12},\ \binom{37}{8} \Big/ 2^{25}\right\} = \min\{1.18, 1.15\} = 1.15 > 1 .$
Case (iv). $d = 8k + 3$, $n = 21k + 7$, $t = 4k + 2$, $0.84 < K^0 < 2.30$.

Case (v). $d = 8k + 4$, $n = 21k + 10$, $t = 4k + 3$, $0.69 < K^0 < 2.18$.

Case (vi). $d = 8k + 5$, $n = 21k + 13$, $t = 4k + 3$, $0.54 < K^0 < 2.01$.

Case (vii). $d = 8k + 6$, $n = 21k + 15$, $t = 4k + 3$, $0.40 < K^0 < 1.88$.

Case (viii). $d = 8k + 7$, $n = 21k + 18$, $t = 4k + 4$, $0.31 < K^0 < 1.77$. □
3.6 Identifying a Single Defective with Parallel Tests
Karp, Upfal and Wigderson [10] studied the problem of identifying a single defective from the sample space $S(\bar{n}, n) \setminus S(0, n)$ (i.e., at least one item is defective), with p processors. Let t(n, p) denote the minimax number of rounds of tests required. They proved

Theorem 3.6.1 For all n and p,

$t(n, p) = \lceil \log_{p+1} n \rceil .$

Proof. The inequality $t(n, p) \le \lceil \log_{p+1} n \rceil$ is obtained by construction. Partition the n items into $p + 1$ groups as evenly as possible and use the p processors to test the first p groups. Either a group is tested to be contaminated, or one concludes that the untested group is. Apply the same to the singled-out contaminated group, and repeat the partition and testing until a contaminated group of size one is identified. Clearly, at most $\lceil \log_{p+1} n \rceil$ rounds of tests are required.

The inequality $t(n, p) \ge \log_{p+1} n$ is obtained by an oracle argument. Note that a complete description of the information available to the algorithm at any point of its execution is a set consisting of known contaminated groups and a known pure group. Without loss of generality, assume that items in the pure group are not involved in any future tests. Let $m_i$ denote the size of a smallest known contaminated group before the round-i tests. Then $m_0 = n$, since the set of all n items constitutes a contaminated group. Whenever $m_i = 1$, a defective is identified and the algorithm ends. The role of the oracle is to choose outcomes of the tests so that the reduction of $m_i$ at each round is controlled.

Suppose that $Q_1, \ldots, Q_p$ are the p tests at round i (some could be vacuous) and $m_i > 1$. It suffices to prove that $m_{i+1} \ge m_i/(p+1)$. The oracle imposes a negative outcome for each $Q_j$ with $|Q_j| < m_i/(p+1)$, updating all $Q_k$ to $Q_k \setminus Q_j$. Note that a negative outcome on $Q_j$ is consistent with the test history, since otherwise $Q_j$ would be a known contaminated group with a size smaller than

$m_i - x \cdot \frac{m_i}{p+1} , \qquad 0 \le x \le p - 1 ,$

which is a lower bound on the size of the smallest such group after x updates of the $Q_k$. The oracle also imposes a positive outcome on each $Q_j$ with $|Q_j| \ge m_i/(p+1)$. Again such an
outcome is consistent, since it agrees with the sample in which all items not in the known pure group are defective. This particular sample also confirms that at any given time the smallest known contaminated group is among the $Q_k$'s, the updated ones and the newly added ones. This outcome description assumes that the negative outcomes are given before the positive outcomes. It is easily verified that the size of an existing $Q_j$ after the round-i tests is at least

$m_i - \frac{p \cdot m_i}{p+1} = \frac{m_i}{p+1} ,$

and the size of a newly added $Q_j$ is at least $m_i/(p+1)$. Since t(n, p) is an integer, Theorem 3.6.1 follows. □

Karp, Upfal and Wigderson motivated their study of t(n, p) by showing that it is a lower bound for the time complexities of the two problems of finding maximal and maximum independent sets in a graph.
References

[1] G. J. Chang and F. K. Hwang, A group testing problem, SIAM J. Alg. Disc. Methods 1 (1980) 21-24.
[2] G. J. Chang and F. K. Hwang, A group testing problem on two disjoint sets, SIAM J. Alg. Disc. Methods 2 (1981) 35-38.
[3] G. J. Chang, F. K. Hwang and S. Lin, Group testing with two defectives, Disc. Appl. Math. 4 (1982) 97-102.
[4] X. M. Chang, F. K. Hwang and J. F. Weng, Group testing with two and three defectives, in Graph Theory and Its Applications: East and West, ed. M. F. Capobianco, M. Guan, D. F. Hsu and T. Tian (The New York Academy of Sciences, New York, 1989) 86-96.
[5] C. C. Chen and F. K. Hwang, Detecting and locating electrical shorts using group testing, IEEE Trans. Circuits Syst. 36 (1989) 1113-1116.
[6] D. Z. Du and F. K. Hwang, Minimizing a combinatorial function, SIAM J. Alg. Disc. Methods 3 (1982) 523-528.
[7] M. R. Garey, D. S. Johnson and H. C. So, An application of graph coloring to printed circuit testing, IEEE Trans. Circuits Syst. 23 (1976) 591-599.
[8] M. C. Hu, F. K. Hwang and J. K. Wang, A boundary problem for group testing, SIAM J. Alg. Disc. Methods 2 (1981) 81-87.
[9] F. K. Hwang, A minimax procedure on group testing problems, Tamkang J. Math. 2 (1971) 39-44.
[10] R. M. Karp, E. Upfal and A. Wigderson, The complexity of parallel search, J. Comput. Syst. Sci. 36 (1988) 225-253.
[11] M. Ruszinkó, On a 2-dimensional search problem, J. Statist. Plan. and Infer., to appear.
[12] J. K. Skilling, Method of electrical short testing and the like, US Patent 4342959, Aug. 3, 1982.
[13] J. F. Weng and F. K. Hwang, An optimal group testing algorithm for k disjoint sets, Oper. Res. Lett., to appear.
4 Nonadaptive Algorithms and Binary Superimposed Codes
A group testing algorithm is nonadaptive if all tests must be specified without knowing the outcomes of other tests. The need for nonadaptive algorithms can arise in two different scenarios. The first comes from a time constraint: all tests must be conducted simultaneously. The second comes from a cost constraint: the cost of obtaining information on other tests could be prohibitive. A mathematical study of nonadaptive CGT algorithms does not distinguish these two causes. A seemingly unrelated problem is the construction of superimposed codes, first studied by Kautz and Singleton for file retrieval. What unifies nonadaptive CGT algorithms and superimposed codes is that both use the same matrix representation, where the constraints are imposed on the unions of any d, or of up to d, columns.
4.1 The Matrix Representation
Consider a $t \times n$ 0-1 matrix M, where $R_i$ and $C_j$ denote row i and column j. $R_i$ ($C_j$) will be viewed as the set of column (row) indices corresponding to its 1-entries. M will be called d-separable (d̄-separable) if the unions, or Boolean sums, of d columns (of up to d columns) are all distinct. M will be called d-disjunct if the union of any d columns does not contain any other column. Note that d-disjunct also implies that the union of any up to d columns does not contain any other column. These definitions are now explained in terms of nonadaptive CGT algorithms and superimposed codes, respectively. A $t \times n$ d-separable matrix generates a nonadaptive (d, n) algorithm with t tests by associating the columns with items and the rows with tests, and interpreting a 1-entry in cell (i, j) as the containment, and a 0-entry as the noncontainment, of item j in test i. A set of d columns will be referred to as a sample if the associated items are the d defectives. The union of the d columns in a sample corresponds to the set of tests (those rows with a 1-entry in the union) which give positive outcomes under that sample. Thus the d-separable property implies that each sample in S(d, n) induces a different set of tests with positive outcomes. By
matching the sets of positive tests with the samples in S(d, n), the d defectives can be identified. Similarly, the d̄-separable property implies that samples in S(d̄, n) are distinguishable.

A d-disjunct matrix also corresponds to a nonadaptive (d, n) algorithm, but with an additional property which allows the defectives to be identified easily. To see this, let s denote the set of columns constituting a sample. Define

$P(s) = \bigcup_{j \in s} C_j .$

P(s) can be interpreted as the set of tests with positive outcomes under the sample s. An item contained in a test group with a negative outcome can certainly be identified as good. In a nonadaptive algorithm represented by a d-disjunct matrix, all other items can be identified as defectives. This is because the columns associated with these items are contained in P(s), and thus must be defectives, or the matrix is not d-disjunct. Consequently one does not have to look up a table mapping sets of contaminated groups to samples in S(d, n) or S(d̄, n). This represents a reduction of time complexity from $O(n^d)$ to O(d). Another desirable property, first observed by Schultz [28], is that if, after the deletion of all items in groups with negative outcomes, the number of remaining items is more than d, then the false assumption of $s \in S(\bar{d}, n)$ is automatically detected.

A $t \times n$ d̄-separable matrix also generates a binary superimposed code with n codewords of length t. The d̄-separable property allows the transmitter to superimpose up to d codewords into one supercode word (of the same length) and transmit; the receiver is still able to uniquely decode the supercode word back to the original codewords if the channel is errorless. A d-separable matrix achieves the same, except that a supercode word always consists of exactly d original codewords. The disjunct property again allows easier decoding, by noting that the components of a supercode word are exactly those codewords which are contained in it.

Kautz and Singleton [19] were the first to study superimposed codes. They indicated applications of superimposed codes in file retrieval, data communication and the design of magnetic memory. Although nonadaptive CGT algorithms and superimposed codes have the same mathematical representation, the respective problems have different focuses. For the former, one wants to minimize the number t of tests for a given number n of items. For the latter, one wants to maximize the number n of codewords for a given codeword length t. Of course these two focuses are really two sides of the same problem. For d-separable or d-disjunct matrices, both applications want to maximize d.
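A tiny worked example of the decoding shortcut (our construction, not from the text): the six lines of a 3×3 grid pairwise share at most one point, so the resulting 9×6 matrix is 2-disjunct, and the defectives are exactly the items whose columns lie inside the set of positive tests.

```python
# Columns: item -> set of tests (rows) containing it.  Any two of the six
# lines of a 3x3 grid intersect in at most one point, so no column is
# covered by the union of two others: the matrix is 2-disjunct.
COLUMNS = [
    {0, 1, 2}, {3, 4, 5}, {6, 7, 8},   # rows of the grid
    {0, 3, 6}, {1, 4, 7}, {2, 5, 8},   # columns of the grid
]

def decode(defectives):
    """Nonadaptive identification with a d-disjunct design (here d = 2):
    the positive tests form P(s); an item is defective iff its column
    is contained in P(s)."""
    positive = set().union(*(COLUMNS[j] for j in defectives))
    return {j for j, col in enumerate(COLUMNS) if col <= positive}
```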
4.2
Basic Relations and Bounds
Kautz and Singleton proved the following two lemmas.
Nonadaptive Algorithms and Binary Superimposed Codes
Lemma 4.2.1 If a matrix is d-separable, then it is k-separable for every 1 ≤ k ≤ d ≤ n.

Proof. Suppose that M is d-separable but not k-separable for some 1 ≤ k < d ≤ n. Namely, there exist two distinct samples s and s′, each consisting of k columns, such that P(s) = P(s′). Let C_x be a column in neither s nor s′. Then

C_x ∪ P(s) = C_x ∪ P(s′).

Adding a total of d − k such columns C_x to both s and s′ yields two distinct samples s_d and s′_d, each consisting of d columns, such that P(s_d) = P(s′_d). Hence M is not d-separable. If there are only i < d − k such columns C_x, then select d − k − i pairs of columns (C_y, C_z) such that C_y is in s but not in s′ and C_z is in s′ but not in s. Then

C_z ∪ P(s) = P(s) = P(s′) = C_y ∪ P(s′).

Therefore these pairs can substitute for the missing C_x. Since M is d-separable, a total of d − k columns C_x and pairs (C_y, C_z) can always be found to yield two distinct samples s_d and s′_d, each consisting of d columns. □

The following result shows that d-disjunct is stronger than d̄-separable.

Lemma 4.2.2 d-disjunct implies d̄-separable.

Proof. Suppose that M is not d̄-separable, i.e., there exist a set K of k columns and another set K′ of k′ columns, 1 ≤ k, k′ ≤ d, such that P(K) = P(K′). Let C_j be a column in K′ \ K. Then C_j ⊆ P(K) and M is not k-disjunct, hence not d-disjunct. □

It is clear that d̄-separable implies d-separable, hence d-disjunct implies d-separable. Saha and Sinha [27] proved a slightly stronger result.

Lemma 4.2.3 Deleting any row R_i from a d-disjunct matrix M yields a d̄-separable matrix M_i.

Proof. If a sample s intersects R_i, then the set of negative tests is preserved by the deletion of R_i, so the identification of s is not affected. Therefore it suffices to show that two samples s and s′, both not intersecting R_i, can still be distinguished with R_i deleted. Note that P(s) and P(s′) are preserved by the deletion of R_i. Since they are not identical in M, they remain non-identical in M_i. □
Kautz and Singleton also proved the following two lemmas.
Lemma 4.2.4 (d + 1)̄-separable implies d-disjunct.
Proof. Suppose that M is (d + 1)̄-separable but not d-disjunct, i.e., there exists a sample s of d columns such that P(s) contains another column C_j ∉ s. Then P(s) = C_j ∪ P(s), so the d-sample s and the (d+1)-sample s ∪ {C_j} have identical unions, a contradiction to the assumption that M is (d + 1)̄-separable. □
From Lemmas 4.2.2 and 4.2.4, any property held by a d̄-separable matrix also holds for a d-disjunct matrix, and any property held by a d-disjunct matrix also holds for a (d + 1)̄-separable matrix. In the rest of the chapter we will state each result either for d-disjunct or for d̄-separable, whichever is the stronger statement, but not for both. Figure 4.1 summarizes the relations between these properties.
(d + 1)̄-separable ⟹ d-disjunct (= d̄-disjunct) ⟹* d̄-separable ⟹ d-separable, k-separable (k < d)

* : delete any row

Figure 4.1: An ordering among properties.
Lemma 4.2.5 If M is a d̄-separable matrix, then |P(s)| ≥ d for s ∈ S(d,n).

Proof. Suppose to the contrary that |P(s)| < d for some s ∈ S(d,n). For each j ∈ s let t_j denote the number of rows intersecting C_j but not any other column of s. Note that every j ∈ s must intersect some rows of P(s), or j cannot be identified as a member of s. Furthermore, |P(s)| < d implies that there exist at least two indices i(1) and i(2) such that t_{i(1)} = t_{i(2)} = 0. Let s_1 = s \ {i(1)} and s_2 = s \ {i(2)}. Then P(s_1) = P(s_2), so M is not a (d−1)-separable matrix, a contradiction to Lemma 4.2.1. □

Corollary 4.2.6
For a t × n d̄-separable matrix, Σ_{i=1}^{d} C(n, i) ≤ 2^t − 1.

Proof. The RHS of the inequality gives the total number of ways of selecting distinct P(s), since each P(s) is a nonempty subset of the t rows. □

Kautz and Singleton gave a stronger inequality
for a d̄-separable matrix. Unfortunately, their result is incorrect: an n × n identity matrix is certainly n̄-separable, but does not satisfy the inequality. Let w_j, called the weight, denote the number of 1-entries in C_j. For a given 0-1 matrix let r(w) denote the number of columns with weight w. A column is called isolated if there exists a row intersecting only this column. Note that there exist at most t isolated columns. Dyachkov and Rykov [6] proved

Lemma 4.2.7 Any column of a d-disjunct matrix with weight ≤ d is isolated, and consequently
Σ_{w ≤ d} r(w) ≤ t.

Proof. Suppose to the contrary that C_j is not isolated and w_j ≤ d. Then C_j is contained in a union of at most d columns and M cannot be d-disjunct. □

For a given set P of parameters let t(P) denote the minimum number of rows in a matrix satisfying P, and let n(P) denote the maximum number of columns in a matrix satisfying P. For example, t(d,n) denotes the minimum number of rows for a d-disjunct matrix with n columns. Bassalygo (see [7]) proved

Theorem 4.2.8 t(d,n) ≥ min{ C(d+2, 2), n } for a d-disjunct matrix.
Proof. Suppose that the d-disjunct matrix M has a column C with weight w. Delete C and the w rows which intersect C, and let the resultant matrix be M′. Then M′ is a (d−1)-disjunct matrix. If it were not, say there exists a column C′ contained in the union P(s′) of d − 1 columns in M′; then C′ must be contained in P(s), where s consists of C and the same d − 1 columns in M, contradicting the assumption that M is d-disjunct. Therefore

t(d,n) ≥ w + t(d−1, n−1).

Theorem 4.2.8 is trivially true for n = 1. The general case is proved by induction on n. If M has a column of weight w ≥ d + 1, then

t(d,n) ≥ d + 1 + min{ C(d+1, 2), n−1 } ≥ min{ C(d+2, 2), n }.

If M does not have such a column, then from Lemma 4.2.7,

t(d,n) ≥ Σ_{w ≤ d} r(w) = n ≥ min{ C(d+2, 2), n }. □
Corollary 4.2.9 For C(d+2, 2) ≥ n, t(d,n) = n.

Proof. t(d,n) ≥ n from Theorem 4.2.8, and t(d,n) ≤ n since the n × n identity matrix is d-disjunct. □

Define v = ⌈w/d⌉. Erdős, Frankl, and Füredi [9] proved

Lemma 4.2.10 Let M be a t × n d-disjunct matrix. Then

r(w) ≤ C(t, v) / C(w−1, v−1).

Proof. Let C be a column of M of weight w, and let F(C) be the family of v-subsets u of C such that u ∈ F(C) implies the existence of a column C′ of M, C′ ≠ C, with u ⊆ C′. Let u_1, …, u_d ∈ F(C). Then clearly ∪_{j=1}^{d} u_j ≠ C, or M would not be d-disjunct. By a result of Frankl (Lemma 1 in [12]), any family of v-subsets satisfying the above property has at most C(w−1, v) members. Therefore C must contain at least

C(w, v) − C(w−1, v) = C(w−1, v−1)
v-subsets not contained in any other column of M. Since there are only C(t, v) distinct v-subsets, Lemma 4.2.10 follows immediately. □

By summing r(w) over w and using the Stirling formula for factorials, they obtained

Theorem 4.2.11 t(d,n) ≥ d(1 + o(1)) ln n.

Dyachkov and Rykov [6] obtained a better lower bound by a more elaborate analysis. Let h(u) = −u log u − (1 − u) log(1 − u) be the binary entropy function and define
f_d(v) = h(v/d) − v·h(1/d).

Also define K_1 = 1 and, for d ≥ 2, K_d uniquely by

K_d = [max_v f_d(v)]^{−1},

where the maximum is taken over all v satisfying 0 < v ≤ 1 − K_{d−1}/K_d.
Corollary 4.2.12 For d fixed and n —* co t{d,n) > Kd (l + o(l))logn . Proof, (sketch) For d = 1, Corollary 4.2.12 follows from Corollary 4.2.6. For d > 2 Corollary 4.2.12 follows from verifying that w Kd-i Kd
f rf 1 hm v ( - '") < —;
n-><x>
<
l o g Tl
•implies r
,. t(d,n) lim -^—L . O n-»oo [ 0 g
n
(4.1) Dyachkov and Rykov computed K2 = 3.106, K3 = 5.018, K4 = 7.120, K5 = 9.466 as and A'6 = 12.048. They also showed Kd —> 2\ogd\ ^ ~~* °°- (^ e e R- u s z m ko 2\0 j U + °(1)) [25] for a simpler proof.) Dyachkov, Rykov and Rashad [8] obtained an asymptotic upper bound of t(d, n) Theorem 4.2.13 For d constant and n —> oo t(d,n)<
- ^ - ( 1 + o(l))logn , Ad
where Ad = max
max < —(1 — Q) log(l — q ) + dQlogl
0< 9 <1 0
'
X
+
(\-Q)log±-
'
They also showed that Ad —> — (1 + o(l)) as d —> co . aloge For d-separable, the asymptotic Ad is two times the above and kd > d for d > 11 (private communication from Dyachkov and Rykov).
4.3
Constant Weight Matrices and Random Codes
Kautz and Singleton introduced some intermediate parameters to get better bounds. Let \{j denote the dot product of C,- with Cj, i.e., the number of rows that both C, and Cj have a 1-entry (We also say C, and Cj intersect A;J times). Define w = min Wj(w = max Wj) i
j
m a x \,j .
4.3 Constant Weight Matrices and Random Codes
69
Let M be a t x n matrix with the parameters w and A. Then any (A + l)-subset of the underlying t-set can be contained in at most one column. Thus
§V* + IJ-(A + I]The case Wj = w for all j , was first obtained by Johnson [17] as follows: Lemma 4.3.1 For a matrix with constant weight w n <
(&)' Lemma 4.3.2 Let M be a t x n matrix with parameters w_ and A. Then M is ddisjunct with d = \{w_ — 1)/AJ . Proof. Cj has at most dX < w < Wj intersections with the union of any d columns. Hence Cj cannot be contained in that union. • Define t'(w,\,n) as the minimum t for a t x n matrix with constant column weight w and maximum number of intersections A (between two columns). Johnson also showed that t(w-X) > n(w2 -tX) . Nguyen and Zeisel [21] used this Johnson bound to prove Theorem 4.3.3 A t x n constant weight d-disjunct matrix with d = [(w — 1)/AJ exists only if dw < It. Proof. Define a = dw/t. By Lemma 4.3.2, X < [(w - l)/d\. 2
- tx > — -t dJ
fod\2
at2
Thus
at2,
Id
The Johnson bound can be written as w2-tX'
~
which is increasing in A. Therefore setting A = w/d, it follows that n — <
2
^at( a - l )
Since n > d — 1, necessarily, a < 2.
•
a-1
Nonadaptive Algorithms and Binary Superimposed
70
Codes
Corollary 4.3.4 Suppose that d is fixed and n —• oo in a constant weight d-disjunct matrix. Then t > dw. Nguyen and Zeisel used Lemma 4.3.1 to obtain a better asymptotic lower bound of t than the one given in Corollary 4.2.12. Theorem 4.3.5 For d ^> 1 and n - t o o , ., . d? log n , . tld.n) > — : , where a = dw t < 1 . a log d Proof. From the Stirling formula it is easily verified that for any sequence kn where lim^oo kn/n = q < 1, lim - log n—oo
I = h(q) . ^knj
n
Using this fact and Lemma 4.3.1, t(d,n) > d2 log n a log f - (d2 - a)log(l - s ) + a(d - l)log(l - i ) Theorem 4.3.5 follows by noting d S> 1.
n
Setting A = 1 in the Johnson bound, then t>
nw2 ^ n(d + l)2 > V / > n+w — 1 n+a
. . ,,. mm{n,d2}
since if n < d2, then n(d+l)2 n+d
2
>n(d+l) 2
~
>
n
d +d
and if n > d2, then n(^+l)2 n+d
=
(<*+l) 2 > (d+1)2 1 + d/n " 1 + 1
>
d2
Define mo = max < I
J, min{n, d }
Dyachkov and Rykov [7] used the above results and known bounds of the binomial coefficients in a variation of Lemma 4.2.10 to prove Theorem 4.3.6 Let d,n,mo and
be given integers such that 3 < d < n, l / d + l / n < 1/2 1/d+l/n — d
1
1 1 < . m0 ed + 1
Then t > F(d,n,m0)
=
h(l/d + l/n)/d
logn + llog(7r/4) + l / m 0 ) - (1/d +
l/n)h(l/d)
4.3 Constant Weight Matrices and Random Codes
71
Theorem 4.3.6 can be used to improve m 0 . For k > 1 define rrik = max{mi_i,
F(d,n,m,k-i)}.
Let ko be the smallest k satisfying mj = mk-\. Then m/b0 > m 0 is a lower bound of t'(w, \,n). Corollary 4.3.7 For d constant and n —» oo, the asymptotic inequality t > (h{l/d2)
- h{l/d)/d)(l
+ o(l)) log n
holds. Applying the inequality h(u) < u\og(e/u), J2.
h(i/d)
h{iid?)-^-^>d
one can verify ^
d* log(rfe)
Dyachkov and Rykov recently claimed (private communication) a stronger result
'-/^)(1+0(1))l0gn' where qd = \ v g f 1 . They showed that 1 h(qd)
8d2 (1 + o(l)) a s i - » o o log d
Dyachkov and Rykov also used a random coding method to obtain an upper bound for t(d,n). Busschbach [1] improved this construction to obtain Theorem 4.3.9. It is straightforward to verify L e m m a 4.3.8 The property of d-disjunct is equivalent to the following statement: For any d + 1 columns with one of them designated, there always exists a row with 1 in the designated column and Os in the other d columns. T h e o r e m 4.3.9 t(d,n) < 3 ( d + l ) l n [ ( d + 1 ) ^ ) ] . Proof. Let (Mfj) be a t x n random 0-1 matrix where M;J = 1 with probability p. For a row i and columns ji,. . ., j d + 1 , the probability that M{n = 1 and M,J2 = . . . = M,Jd+1 = 0 is p ( l — p) . The probability that there does not exist such a row i is
[I-PU-P)"]'
72
Nonadaptive Algorithms and Binary Superimposed
Codes
Note that this term is maximized by choosing p = l/(c?-f-1). Finally, the probability P that this will happen for at least one choice of ji,... ,jd+i is less than (d+l)
1 d + l('-TTl)'
d+l
From
1
1 \
/
i- > I 1 - -r— 2 - V d+lJ
1
> - for d > 1 3
and — ln(l — i ) > x for 0 < x < 1 one obtains In 1 -
d+l
d+l
1 3(d + l)
Therefore P < 1 for t > 3(d + 1) In ( d + l )
d+l
which implies the existence of an (M,j) which is d-disjunct.
•
Corollary 4.3.10 For n > d, t(d,n) < 3 ( d + l ) 2 l n n . Kautz and Singleton gave a method (see next section) to convert a q-nary code of length t/q to a binary superimposed code of length t with the same number of code words. Nguyen and Zeisel showed that with a proper choice of q, there must exist a random q-nary code whose conversion will yield a d-disjunct matrix with a favorable ratio f/logra. Theorem 4.3.11 t(d,n) < K(d,n)(d K(d, n) = min i<1
d+l
+ l) 2 logn, where
1 l log _ ( l _ l ) ' l
+
q (d+l)2logrc •
Proof. Let M denote the t x n binary matrix converted from a random q-nary code of length t/q. The probability that M is not d-disjunct is less than t/q
(n-d)
< exp \ (d + 1) In n + (
1 ] In 1 - 1
which is less than 1 if t satisfies the inequality given in the theorem. This implies the existence of a t x n d-disjunct matrix. •
4.4 General
73
Constructions
C o r o l l a r y 4 . 3 . 1 2 As d is fixed and n —• oo,
K(d,n)
< T^T
= 1.5112 .
e-l
As d also —• oo but at a much slower rate than n, lim sup K(d,n) d-^oo
4.4
= - i - S* 1.4427 . In 2
General Constructions
An equireplicated pairwise balanced design ( E P B D ) is a family of t subsets, called blocks, of an underlying set N — { l , - - - , n } such t h a t every element of N occurs in exactly r subsets and every pair of elements of A" occurs together in exactly A subsets. W h e n all blocks have t h e same cardinality, an E P B D becomes a balanced incomplete block design (BIBD) which is a well-studied subject in experimental designs and combinatorial designs. Bush, Federer, Pesotan and Raghavarao [2] proved L e m m a 4 . 4 . 1 An EPBD
yields a t x n d-disjunct
matrix if r > dX.
Proof T h e incidence m a t r i x of an E P B D , i.e., where the rows are the blocks and the columns are t h e elements of the underlying set, yields a d-disjunct m a t r i x by L e m m a 4.3.2. • K a u t z and Singleton pointed out t h a t E P B D s do not yield useful d-disjunct matrices since t > n by t h e Fisher's inequality. In b o t h nonadaptive C G T and superimposed code applications, t h e goal is to have small t and large n, hence the d-disjunct m a t r i x generated by an E P B D is inferior to the n x n identity m a t r i x . K a u t z and Singleton also noted t h a t t h e only BIBD which can lead to an interesting d-disjunct m a t r i x M is when A = 1, because MT (transverse) has the property AT = 0 or 1. From L e m m a 4.3.2, where the constant weight is k, MT is a (k — l)-disjunct m a t r i x . T h u s it seems t h a t the combinatorial designs which yield interesting d-disjunct matrices are those with nonconstant A. T h e following result is a generalization of L e m m a 4.3.2 towards t h a t direction. An (n, b) design T is simply a family B of b blocks of the set A^ = {1, • • •, 72}. For K a subset of N, let rx denote t h e n u m b e r of blocks in T intersecting all columns of K. Let r t ( / , k) (r t ( — /, k)) denote an upper (lower) bound for the sum of the / largest (smallest) r^uK over all fc-subset K. T h e o r e m 4 . 4 . 2 If for every i £ N and every d-subset
H.} + then T is a d-disjunct
D not containing
E(-1)J'-.((-1)J+1(J).J)>O,
matrix.
i,
Nonadaptive Algorithms and Binary Superimposed
74
Codes
Proof. Let D denote a given d-subset of N and let Dj, 0 < j < d, denote an arbitrary j-subset of D. Using the inclusion-exclusion principle,
E(-i)J
E^OUD,-
j=0
Dj
> r{0 + E(-iyv,((-iy' +1 ( , : . J ) > o i=l
V
is the number of blocks having an 1-entry in P({i}) but not in P{D). Therefore D does not contain i. Since the above inequality holds for any D, T is a d-disjunct matrix by Lemma 4.3.8. • A t-design is a (n, 6) design where every block is of a constant size k and every f-tuple occurs in a constant number Aj of blocks. It is well known that a i-design is also a j-design for 1 < j < t — 1. In fact \j can be computed as n-j t-j ' k- j t-3 Note that the inclusion-exclusion formula yields a smaller number if truncated to end at a negative term. Saha, Pesotan and Raktoc [26] obtained Corollary 4.4.3 A t-design B yields a d-disjunct matrix if
t-i-j where d
if
t-l>d
t*= < t — 1 if t — 1 is odd and < d , t — 2 if t — 1 is even and < d . Unfortunately, a t-design is a 2-design with an EPBD. So again t > n and the constructed tZ-disjunct matrix is not interesting. An m-associate partially balanced incomplete block design (PBIBD) is otherwise a 2-design except that the number of subsets containing two elements i and j belongs to a set {A21 > A22 > • • • > A2m}. A PBIBD is group divisible if the elements can be partitioned into groups of constant size such that an intergroup pair gets A2i and an intragroup pair gets A22- By truncation at j = 1, Saha, Pesotan and Raktoc proved
4.4 General
Constructions
75
C o r o l l a r y 4 . 4 . 4 A 2-associate
PBIBD
B yields a d-disjunct
Ai - d\2\
matrix
if
> 0 .
T h e following is an example of using Corollary 4.4.4 t o obtain a 2-disjunct m a t r i x with fewer tests t h a n t h e n u m b e r of i t e m s . E x a m p l e 4 . 1 Let B be t h e 2-associated P B I B D consisting of t h e subsets:
# = {1,2,3,4},
# = {5,6,7,8},
# , = {9,10,11,12}
# = {13,14,15,16},
#
B6 = { 2 , 6 , 1 0 , 1 4 } ,
B7 = { 3 , 7 , 1 1 , 1 5 } ,
5 8 = {4,8,12,16},
5 1 0 = {2, 7 , 1 2 , 1 3 } ,
Bu
#
= {1,5,9,13},
# , = {1,6,11,16},
= {3,8,9,14},
2
= {4,5,10,15}.
It is easily verified t h a t Aj = 3, A2i = 1 and A22 = 0. Since A : - d\2X
= 3 - 2 •1 = 1 > 0 ,
B is a 12 x 16 2-disjunct m a t r i x . In a different direction K a u t z and Singleton searched a m o n g known families of conventional error-correcting codes for those which have desirable superimposition properties. T h e y c o m m e n t e d t h a t binary group codes do not lead to interesting superimposed codes (d-disjunct matrices). First of all, these codes include t h e zero-vector as a code word, so t h e corresponding m a t r i x cannot be d-disjunct. T h e removal of the zero-vector does not solve t h e problem since t h e code usually contains a code word of large weight which contains at least one code word of smaller weight. F u r t h e r m o r e , if an error-correcting code of constant weight is extracted from an arbitrary errorcorrecting code, t h e n K a u t z and Singleton showed t h a t t h e corresponding (i-disjunct m a t r i x will have very small values of d. So instead, they looked for codes based on g-nary error-correcting codes. A (jr-nary error-correcting code is a code whose alphabets are t h e set { 0 , 1 , • • • , q — 1}. K a u t z and Singleton constructed a binary superimposed code by replacing each §-nary a l p h a b e t by a unique binary p a t t e r n . For example, such binary p a t t e r n s can be t h e (/-digit binary vectors with unit weight, i.e., t h e replacement is 0 —• 10 • • • 0,1 —> 010 • • • 0, • • •, q — 1 —> 0 • • • 01. T h e distance I of a code is t h e m i n i m u m n u m b e r of nonidentical digits between two code words where t h e m i n i m u m is taken over all pairs of code words. Note t h a t t h e distance / of t h e binary code is twice t h e distance /, of t h e -nary code it replaces, and t h e length t is q times t h e length tq. Since the binary code has constant weight w — tq, t h e corresponding t x n m a t r i x is d-disjunct with
76
Nonadaptive Algorithms and Binary Superimposed
Codes
from Lemma 4.3.2. To maximize d with given tq and nq, one seeks q-nary codes whose distance lq is as large as possible. Kautz and Singleton suggested the class of maximal-distance separable (MDS) §-nary codes where the tq digits can be separated into kq information digits and tq — kq check digits. Singleton [30] showed that for an MDS code lq — tq
fcq
I 1 •
Thus for those MDS codes achieving this upper bound distance,
-bwJ' (a deeper analysis proves the equality). Also the kq information digits imply a total of nq = qkq code words. Kautz and Singleton also commented that the most useful MSD g-nary codes for present purposes has q being an odd prime power satisfying q+l>tq
>kq +
l>3.
Therefore for given d q > tq — 1 ~ (kq — l)d ; q is certainly too large for practical use unless kq is very small, like two or three. One may replace the q alphabets by binary patterns more general than the aforementioned unit weight type, provided only that the q binary patterns for replacement form a (^-disjunct matrix themselves. Clearly, the length of such binary patterns can be much shorter than the unit weight patterns. Such a replacement can be regarded as a method of composition in which a small to x «o do-disjunct matrix is converted into a larger t\ x n\ d r disjunct matrix on the basis of an t,-digit g-nary code having kq independent digits where (q is a prime power satisfying tq — 1 < q < nq) ti
= totq ,
m
=
di
= min< d0,
qk- , to -I
For the unit weight code to = UQ = do = q. Starting with n unit weight code and keeping d fixed, repeated compositions can be carried out to build up arbitrarily large d-disjunct matrices. By setting t\ = qtq, Nguyen and Zeisel used a result of Zinoviev [38] and Theorem 4.3.3 to prove
4.4 General
Constructions
77
Theorem 4.4.5 If d < an1/* for some integer k > 2 and constant a, then as d —> oo, there exists a constant weight d-disjunct matrix such that ., . d2log n t = (k — 11— — as n —> oo . logd Different -nary codes can be used at each stage of the composition. If the same type of q-nnry code is used except that q is replaced by q — n\, then a second composition yields t2
= t0[l + d{kq - l)] 2
n2
=
q^2
where q{Kq
-\)
Kautz and Singleton also discussed the optimal number of compositions. Hwang and Sos [16] gave the following construction. Let T be a ^-element set and Tk consist of all &-sets of T. Define r = \t/16d2~\, k = Adr and m = 4r. Choose C\ arbitrarily from Tk. Delete from Tk all members which intersect C\ in at least m rows. Choose Ci arbitrarily from the updated Tk. Again delete from the updated Tk all members which intersect C2 in at least m rows. Repeat this procedure until the updated Tk is empty. Suppose Ci, • • •, C„ have been chosen this way.
./C2" c3...
Figure 4.2: Choosing C; from Tk.
Theorem 4.4.6 C\, • • •, Cn constitute atxn
d-disjunct matrix with n > (2/3)3 t/,16(i .
Proof. By construction, any two columns can intersect in at most m — 1 rows. By Lemma 4.3.2, C\, • • •, C„ constitute a d-disjunct matrix.
78
Nonadaptive Algorithms and Binary Superimposed
Codes
At the j t h step the number of members in the updated Tk set which intersect C, in at least m rows is at most
.taxTherefore, t k n > Z-/t=TJ
k \ t-k i I \ k—i
•"-•{
l:-
Set
For 3r < i < k = 4dr bj _ (k-i)2 (4
<
1 3
Hence k
k i—m
, i , i-3r 03r
\ .M
-3r
&3r
( fc J = £ ^ > *3r »
n >
\
7
>
2 • 3r
>
2-3^
_ 1
.
Q
Corollary 4.4.7 t{d,n) < 16eP(l + log 3 2 + (log 3 2) logra). This bound is not as good as the nonconstructive bound given in Theorem 4.2.13.
4.5 Special
4.5
79
Constructions
Special Constructions
T h e definition of a 1-separable m a t r i x is reduced to "no two columns are t h e same." This was also called a separating system by Renyi [24]. For given n t h e n t > [logrc] is necessary from T h e o r e m 1.2.1 for a i x n separating system to exist. A well-known construction which achieves t h e lower bound is to set column i, i = 0 , 1 , . . . , n — 1, to be t h e binary representation vector of t h e n u m b e r i (a row corresponds to a digit). Since two binary n u m b e r s m u s t differ in at least one digit, t h e m a t r i x , called a binary representation matrix, is a separating system. By a similar a r g u m e n t , if column weights are upper bounded by w, t h e n n < Y^iLo (•) f ° r a ' x n separating system to exist. Renyi asked t h e question if each row can contain at most k I s , t h e n what is t(l,n,k), i.e., t h e m i n i m u m t such t h a t a t x n separating system exists. K a t o n a [18] gave t h e following answer ( L e m m a 4.5.1 and T h e o r e m 4.5.2). L e m m a 4 . 5 . 1 t(l,n,k) so, • • -,st satisfying
is the minimum
t such that there exist nonnegative
integers
t tk
= ^2Jsi
,
3=0 t
n
=
]Ts, , j=0
Sj
<
( .)
for
0 <j
.
Proof, (sketch). Interpreting Sj as t h e n u m b e r of columns with weight j in a separating system, t h e n t h e above conditions are clearly necessary if each row has exactly k Is. T h e proof of sufficiency is involved and o m i t t e d here. It can then be proved t h a t t h e existence of a separating m a t r i x where each row has at most k Is implies the existence of a separating m a t r i x of t h e same size where each row has exactly k Is. • However, it is still difficult to e s t i m a t e t h e m i n i m u m t from L e m m a 4.5.1. K a t o n a gave lower and upper bounds. His upper b o u n d was slightly improved by Wegener [35] as s t a t e d in t h e following t h e o r e m . T h e o r e m 4 . 5 . 2 Let t° denote exists. Then "l0g" Hog(en/fc)
the minimum
t such that a t x n separating
matrix
logn < t °
Proof. An information a r g u m e n t proves t h e first inequality. T h e total information bits needed t o choose one of t h e n given columns is log n. Each row with k Is provides k n n — k n -n log -k + log -n — k n
80
Nonadaptive Algorithms and Binary Superimposed
Codes
bits of information. Thus ,o ^
lo n S * log 2 + ==* log - ^
n
&
fc
n
to
n-k
l o g 72
using ln(l + x) < x
nlogn Hogf The upper bound is obtained by a construction which is a generalized version of the binary representation matrix given at the beginning of this section. To demonstrate the idea, assume n/k = g and k = gx. At each stage the n elements are partitioned into g groups and a test applies to each group except the last. Thus each stage consumes g — 1 = n/k — 1 tests and identifies a contaminated group. The first partition is arbitrary. The second partition is such that each part consists of gx~l elements from each part of the first partition. Thus, the first two partitions divide the n items into g2 equivalent classes of size gx~l each where the defective is known to lie in one such class. The third partition is then such that each part consists of gx~2 elements from each of these classes and so on. After x + 1 = logfc n stages, each equivalent class is of size 1 and the defective is identified. By considering the possible nonintegrality of n/k, the upper bound is obtained. • The definition of a 1-disjunct matrix is reduced to "no column is contained in another column." This was also called a completely separating system by Dickson [5] who proved that the minimum t (such that atxn complete separating system exists) tends to logra, the same as the separating system. Spencer [31] proved Theorem 4.5.3 For a given t the maximum n such that atxn system exists is (u/jij-
complete separating
Proof. M is such a matrix if and only if for all i ^ j , Ci 2 Cj . By the Sperner's theorem [32], the largest family of subsets with no subset containing another one consists of the f | ( L|J [i/2J-tuples. • Schultz, Parnes and Srinivasan [29] generalized Theorem 4.5.3 to Theorem 4.5.4 Forgivent n such that atxn
and upper bound w < (u/ 2 i) of the weight, the maximum
complete separating matrix exists is (M.
4.5 Special
Constructions
81
Proof. Implied by Sperner's proof of his theorem.
•
When n is large with respect to the maximum number of Is in a row, Cai [3] gave a more specific result. Theorem 4.5.5 If n > k2/2 where k is the maximum number of Is in a row, then t = |"2n/fc] is the minimum t such that a t x n complete separating system exists. Proof. Let M be such a matrix. To avoid trivial discussion assume that M does not contain isolated columns. Then each column has weight at least two. Counting the number of Is by rows and by columns separately, one obtains tk > 2n . The reverse inequality is proved by constructing a graph with [2n/fc] > k +1 vertices and n edges such that the maximum degree is k or less (this can easily be done) and let M be the incidence matrix of the graph. D For d = 2 and w = 2k — 1, Lemma 4.2.10 yields
G)
K2*-l)<7iEy
f°r*>l.
However, Erdos, Frankl and Fiiredi [9] noted that since any two partitions of a umn into a fc-subset fc-su column and a (fc-l)-subset must be disjoint, there exist ( , 1J disjoint partitions. Hence r(2k-l)
< - ^
(V) • They proved that this bound is tight and also solved the even w case in the following theorem. Let S(t, r, n) denote a i-design on n elements where the block size is r and every i-tuple appears in one block. Theorem 4.5.6 Let M be at x n constant weight 2-disjunct matrix. (') n < T^TT ( * )
Then
for w = 2k - 1
and
" ^ 75^iy
^ «> = 2* .
Furthermore, equality holds in the former case if and only if there exists an S(k,2k — l , t ) ; and equality holds in the latter case if and only if there exists an S{k,2k — \,t — 1).
82
Nonadaptive Algorithms and Binary Superimposed
Codes
Proof. In the odd w case, it is clear that a t-design as specified achieves the bound and any constant weight 2-disjunct matrix achieving the bound defines such a i-design. Note that adding a row of Is to the incidence matrix of this t-design preserves the constant weight 2-disjunct property with w = 2k. However, the proof for the bound with even w and the proof that the existence of an S(k, 2k — 1, t — 1) is necessary to achieve the bound are much difficult than the odd w case. The reader is referred to [9] for details. • Corollary 4.5.7 r ( l ) = t, r{2) = t - l , r(3) = t2/Q + 0{t) = r(4), r(5) = i 3 /60 + o(i 3 ) = r(6). Proof. It is well known [17] that 5(2,3, i), which is known as a Steiner triple system, exists if and only if t > 7 and t = 1 or 3 (mod 6). Hanani [14] proved the existence of 5(3,5,4" + 1). Erdos and Hanani [11] proved the existence of matrices with constant weight w where two columns intersect at most twice and n = (* J / (^ J — o(t3). • Erdos, Frankl and Fiiredi also proved Theorem 4.5.8 Let M be at x n constant weight 2-disjunct matrix. j.
t
n>
\
/
\
\ ,/
Then
2
w
{\w/2])\n2\
Proof. The number of u>-sets which can intersect a given w-set at least [u>/2] times is less than w \ ( t - rw/2"P Jw/2])[ LW2J Using a construction similar to the one discussed in Theorem 4.4.6.
„..
U >
U) ( «, \(t-lw/2]\ \\wli\)\ [w/2j )
U/q) j
w \2 • \{WI2\)
D
Finally, Erdős, Frankl and Füredi proved

Theorem 4.5.9 log_{1.25} n ≤ t(2, n) ≤ log_{1.134} n.

Proof. The first inequality follows from Theorem 4.5.6, using the Stirling formula and the fact n ≤ Σ_{w=1}^{t} r(w). The second inequality is proved by a random construction. Select each of the \binom{t}{w} w-subsets independently with probability 2m/\binom{t}{w}, the values of m and w to be fixed later. For a given w-set A the number of ordered pairs of w-sets B, C such that A ⊂ B ∪ C is

R(t, w) = Σ_{i=0}^{w} g(i),  where g(i) = \binom{w}{i} \binom{t−w}{w−i} \binom{t−w+i}{i} ,

where the i-th term corresponds to the case that B contains exactly i members of A, and C contains the other w − i members of A. Since

R(t, w) ≤ (w + 1) max_i g(i) ,

the probability that A is covered by two selected w-sets is at most R(t,w)·4m²/\binom{t}{w}², which is less than 1/2 for

m < \binom{t}{w} / \sqrt{8 R(t, w)} .

After deleting those covered w-sets, the expected number of remaining w-sets is at least 2m − (1/2)·2m = m, and these remaining w-sets form a constant weight 2-disjunct matrix. To estimate m, note that g(i)/g(i−1) is decreasing in i. Thus the maximum is achieved when this ratio is about one, yielding

i_max ≈ (1/2)(3w − 2t + \sqrt{5w² − 8wt + 4t²}) .

Setting w = 0.26t, one obtains i_max = 0.1413···t and m = (1.1348···)^t. □
The transpose of the parity check matrix of a conventional binary d-error-correcting code is known (p. 33 [23]) to have the property that the modulo-2 sums of any up to d columns are distinct. This property is exactly what is desired for d-separable matrices, except that the sum is the Boolean sum for the latter. Kautz and Singleton suggested the following transformations for small d. For d = 2, transform 0 → 01 and 1 → 10. The modulo-2 addition table before transformation and the Boolean addition table after transformation are:

  ⊕ | 0   1          ∨ | 0    1
  0 | 0   1          0 | 01   11
  1 | 1   0          1 | 11   10

Thus if two pairs of columns have different modulo-2 sums, they also have different Boolean sums under this transformation. Namely, the 2-error-correcting property is translated to the 2-separable property, while the price paid is that the number of rows is doubled. The family of Bose–Chaudhuri codes (p. 123 [23]) for d = 2 has 2k rows and no more than 2^k − 1 columns for every k ≥ 2. Hence they yield 2-separable matrices with 4k rows and 2^k − 1 columns; or n = 2^{t/4} − 1. Lindström [20] obtained the following stronger result.

Theorem 4.5.10 n(2, t) ≤ 1 + 2^{(t+1)/2}; for t even, n(2, t) ≥ 2^{t/2}.
Proof. The upper bound is derived from the fact that there are \binom{n}{2} pairs of columns whose unions must all be distinct t-vectors (there are 2^t of them). To prove the lower bound, consider the set V of all vectors (x, x³) with x ∈ GF(2^{t/2}). Then V has cardinality 2^{t/2}. If (x, x³) + (y, y³) = (u, v) for two elements x ≠ y in GF(2^{t/2}), then

x + y = u ≠ 0 ,

and

xy = v/u − u² .

Since an equation of the second degree cannot have more than two roots in the field, {x, y} is uniquely determined by (u, v). Therefore V induces a t × 2^{t/2} 2-separable matrix. □

Corollary 4.5.11 n(2, t) → 2^{t/2} as t → ∞.
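The field computation in the proof can be checked by brute force for t = 6, i.e. GF(8). The sketch below (assuming the representation of GF(8) by the irreducible polynomial x³ + x + 1; any irreducible cubic works) verifies that the modulo-2 pair sums (x + y, x³ + y³) are distinct for all 28 pairs {x, y}, which is what makes {x, y} recoverable from (u, v).

```python
from itertools import combinations

def gf_mul(a, b, poly=0b1011, m=3):
    """Multiply in GF(2^3), elements as 3-bit ints, modulo x^3 + x + 1."""
    r = 0
    for i in range(m):
        if (b >> i) & 1:
            r ^= a << i
    for i in range(2 * m - 2, m - 1, -1):   # reduce degree down below m
        if (r >> i) & 1:
            r ^= poly << (i - m)
    return r

def cube(x):
    return gf_mul(x, gf_mul(x, x))

# Modulo-2 pair sums (x + y, x^3 + y^3); addition in GF(2^m) is XOR.
sums = {(x ^ y, cube(x) ^ cube(y)) for x, y in combinations(range(8), 2)}
print(len(sums))   # 28 distinct values, one per pair
```

Distinctness follows exactly as in the proof: u ≠ 0 and xy = v/u + u² (minus equals plus in characteristic 2) pin down {x, y} as the roots of a quadratic.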
Kautz and Singleton also studied the construction of 2-separable matrices with constant weight two. Let M be such a matrix. Construct a simple graph G(M) where the vertices are the row indices and the edges are the columns. Then G(M) contains no 3-cycle or 4-cycle, since such a cycle corresponds to two column-pairs with the same Boolean sum. Regular graphs (each vertex has the same degree r) of this type were studied by Hoffman and Singleton [15] with only four solutions:

r = 2,   t = 5,     n = 5
r = 3,   t = 10,    n = 15
r = 7,   t = 50,    n = 175
r = 57,  t = 3250,  n = 92625

which satisfy the conditions t = 1 + r² and n = r(1 + r²)/2. Thus n = t√(t − 1)/2 → t^{3/2}/2 asymptotically. This result was also obtained by Vakil, Parnes and Raghavarao [34]. Relaxing the regularity requirement, one can use a (v, k, b, r, 1) BIBD to construct a (v + b) × bk 2-separable matrix (the condition λ = 1 guarantees the 2-separable property). For fixed t, n is maximized by using a symmetric BIBD where v = b and k = r. It can then be shown that n → (t/2)^{3/2} asymptotically.
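The r = 3 solution is the Petersen graph. A quick check (using the standard outer-cycle/inner-pentagram labeling, my choice) confirms that its 10 × 15 edge-incidence matrix is 2-separable: all 105 column-pair Boolean sums are distinct, which is equivalent to the graph having no 3- or 4-cycles.

```python
from itertools import combinations

# Petersen graph: outer 5-cycle 0-4, spokes i -- i+5, inner pentagram 5-9.
edges = ([(i, (i + 1) % 5) for i in range(5)] +
         [(i, i + 5) for i in range(5)] +
         [(5 + i, 5 + (i + 2) % 5) for i in range(5)])

cols = [frozenset(e) for e in edges]     # weight-2 columns of a 10 x 15 matrix

# 2-separability: the Boolean sums (unions) of all column-pairs are distinct;
# a repeated union would force a 3-cycle or 4-cycle in the graph.
unions = [a | b for a, b in combinations(cols, 2)]
print(len(cols), len(set(unions)))   # 15 columns, 105 distinct pair-sums
```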
Frankl and Füredi [13] stated the following results without proofs:

n(2, t, 3) = ⌊t(t − 1)/6⌋ ,
n(2, t, 4) = [1 + o(1)] t³/24 .
Weideman and Raghavarao [37] proved a relation between w and λ for 2-separable matrices.

Lemma 4.5.12 Any 2-separable matrix with λ = 1 can be reduced to one with w = 3.

Proof. Replacing all columns of weight greater than three by arbitrary 3-subsets of themselves preserves the 2-separability, since under the condition λ = 1 no 3-subset can be contained in the union of any two columns. The reduced matrix also preserves λ. □

Theorem 4.5.13 Suppose that λ = 1 in a 2-separable matrix. Then

n ≤ t(t + 1)/6 .
Proof. From Lemma 4.5.12 assume w = 3. A column of weight 2 can be represented by the pair of intersecting row indices. Partition the columns of weight 2 into m disjoint classes where the columns in class i have a common row index R_i. Let class i consist of p_i columns. R_i is called the feature index of class i and the other p_i row indices the supporting indices. The 2-separable property imposes the condition that if columns {x, y} and {x, z} are in the design, then y, z cannot be collinear (i.e., appear together in a column). For if C were the collinear column, then

{x, y} ∪ C = {x, z} ∪ C .

Thus two supporting indices in the same class cannot be collinear; this rules out Σ_{i=1}^{m} \binom{p_i}{2} pairs. Furthermore, if

m > t − Σ_{i=1}^{m} p_i ,

then at least m − t + Σ_{i=1}^{m} p_i feature indices are supporting indices of some other classes; so at least that many pairs are also ruled out. Therefore, regardless of whether m > t − Σ_{i=1}^{m} p_i, the number of pairs available to be collinear in columns of weight exceeding 2 is at most

\binom{t}{2} − Σ_i p_i − Σ_i \binom{p_i}{2} − max{m − t + Σ_i p_i, 0}
≤ \binom{t}{2} − Σ_i p_i − Σ_i \binom{p_i}{2} − (m − t + Σ_i p_i)

(the Σ_i p_i pairs that are themselves weight-2 columns are likewise unavailable). Since each column of weight exceeding 2 generates at least three collinear pairs,

n ≤ Σ_i p_i + (1/3)[ \binom{t}{2} − Σ_i p_i − Σ_i \binom{p_i}{2} − (m − t + Σ_i p_i) ]
  = [ t(t + 1) − {2m + Σ_i p_i(p_i − 3)} ]/6
  ≤ t(t + 1)/6 ,

noting that the term in { } is nonnegative for all integral p_i's (each p_i(p_i − 3) ≥ −2). □
Weideman and Raghavarao showed that the upper bound can be achieved for t ≡ 0 or 2 (mod 6). They also showed in a subsequent paper [36] that even for other values of t the upper bound can still often be achieved. Vakil and Parnes [33], by crucially using the constructions of group divisible triple designs of Colbourn, Hoffman and Rees [4], gave the following theorem.

Theorem 4.5.14 The maximum n in a 2-separable matrix with λ = 1 is n = ⌊t(t + 1)/6⌋.

Since the constructions consist of many subcases, the reader is referred to [33] for details. An example is given here.

Example 4.2. For t = 8, the group divisible triple design with the four groups (1,2), (3,4), (5,6), (7,8) consists of the blocks (1,3,5), (2,4,6), (2,3,7), (1,4,8), (1,6,7), (2,5,8), (3,6,8) and (4,5,7). The 2-separable matrix consists of twelve columns: the four groups and the eight blocks. The eight tests (rows) are: (1,5,8,9), (1,6,7,10), (2,5,7,11), (2,6,8,12), (3,5,10,12), (3,6,9,11), (4,7,9,12), (4,8,10,11), where each row lists the columns containing it.

Note that the n value in Theorem 4.5.14 is slightly bigger than n(2, t, 3). A variation of replacing the condition λ = 1 by the weaker one w = 3 was studied in [29, 33].

Kautz and Singleton tailored the composition method discussed in the last section to the d = 2 case. From a t × n 2-separable matrix M, construct a (2t + 2p − 1) × n² 2-separable matrix M′. Column C_i of M′ consists of three sections a_i, b_i, c_i, where a_i × b_i enumerates all column pairs of M, and c_i, depending on a_i and b_i, has length 2p − 1. The condition on c_i can be more easily seen by constructing an n × n matrix C whose rows (columns) are indexed by the columns of M chosen for a_i (b_i) and whose entries are the c_i (each an r × 1 vector). Denote the entry in cell (x, y) by C_xy. Then the condition, called the minor diagonal condition, is

C_xy ∨ C_uv ≠ C_xv ∨ C_uy ,
since otherwise the column-pair with first two sections a_x b_y, a_u b_v and the column-pair with first two sections a_x b_v, a_u b_y would have the same Boolean sum. Starting with a 3 × 3 weight-one code and using the matrix

C_1 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}

to construct the third segment (r = 1 here), one obtains a 7 × 9 2-separable matrix. So the next third-section matrix C_2 is of size 9 × 9. In general at the p-th iteration the matrix C_p is of size 3^p × 3^p. Kautz and Singleton gave a method to construct C_p from C_{p−1} which preserves the minor diagonal condition, where each entry in C_p is a (2p − 1) × 1 vector. Thus C_p yields a t_p × n_p 2-separable matrix with

t_p = 2t_{p−1} + 2p − 1  (or t_p = 6·2^p − 2p − 3)  and  n_p = n_{p−1}²  (or n_p = 3^{2^p}).
It follows that n → 3^{t/6} asymptotically. Note that this bound is inferior to the one given in Corollary 4.5.11. Similar composition can be used to grow a 2-disjunct matrix with parameters
t_p = 3t_{p−1} (or t_p = 3^{p+1}) and n_p = n_{p−1}² (or n_p = 3^{2^p}). Then n → 3^{t^{log_3 2}/2}. Note that this construction yields a result inferior to the one given in Theorem 4.5.9.

For d = 3 let H denote the parity check matrix of a 3-error-correcting code. Construct M′ such that every row-pair of H yields four rows of M′ by the transformation

00 → 1000,  01 → 0100,  10 → 0010,  11 → 0001.

It can then be verified that M′ is 3-separable. The Bose–Chaudhuri code for d = 3 has 3k parity check rows and no more than 2^k − 1 columns for every k ≥ 3. Thus M′ has t = 4\binom{3k}{2} rows and n = 2^k − 1 columns; or n → 2^{√(t/18)} asymptotically.
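The parameter growth of the 2-separable composition can be tabulated quickly. The sketch below verifies the closed form t_p = 6·2^p − 2p − 3 against the recurrence t_p = 2t_{p−1} + 2p − 1, and tracks log₃ n under the squaring rule n_p = n_{p−1}² (my reading of the garbled recurrence), confirming the rate n^{1/t} → 3^{1/6}.

```python
from math import isclose

# 2-separable composition: t_p = 2 t_{p-1} + 2p - 1, starting from t_1 = 7.
t = 7
for p in range(2, 11):
    t = 2 * t + 2 * p - 1
    assert t == 6 * 2 ** p - 2 * p - 3     # closed form quoted in the text

# Column counts square at each step (assumed rule), so log3(n_p) = 2^p.
log3_n = 2                                  # log3(n_1) = log3(9)
for p in range(2, 11):
    log3_n *= 2

# Rate check: log3(n_p)/t_p -> 1/6, i.e. n ~ 3^(t/6).
rate = 2 ** 10 / (6 * 2 ** 10 - 2 * 10 - 3)
print(round(rate, 4))                       # approaches 1/6 ~ 0.1667
assert isclose(rate, 1 / 6, abs_tol=0.01)
```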
References

[1] P. Busschbach, Constructive methods to solve the problems of: s-surjectivity, conflict resolution, coding in defective memories, unpublished manuscript, 1984.

[2] K. A. Bush, W. T. Federer, H. Pesotan and D. Raghavarao, New combinatorial designs and their applications to group testing, J. Statist. Plan. Infer. 10 (1984) 335-343.

[3] M. C. Cai, On the problem of Katona on minimal completely separating systems with restrictions, Disc. Math. 48 (1984) 121-123.

[4] C. Colbourn, D. Hoffman and R. Rees, A new class of group divisible designs with block size three, J. Combin. Thy. A59 (1992) 73-89.

[5] T. J. Dickson, On a problem concerning separating systems of a finite set, J. Combin. Thy. 7 (1969) 191-196.

[6] A. G. Dyachkov and V. V. Rykov, Bounds of the length of disjunct codes, Problems Control Inform. Thy. 11 (1982) 7-13.

[7] A. G. Dyachkov and V. V. Rykov, A survey of superimposed code theory, Problems Control Inform. Thy. 12 (1983) 1-13.

[8] A. G. Dyachkov, V. V. Rykov and A. M. Rashad, Superimposed distance codes, Problems Control Inform. Thy. 18 (1989) 237-250.

[9] P. Erdős, P. Frankl and Z. Füredi, Families of finite sets in which no set is covered by the union of two others, J. Combin. Thy. A33 (1982) 158-166.

[10] P. Erdős, P. Frankl and Z. Füredi, Families of finite sets in which no set is covered by the union of r others, Israel J. Math. 51 (1985) 79-89.

[11] P. Erdős and H. Hanani, On a limit theorem in combinatorial analysis, Publ. Math. Debrecen 10 (1963) 10-13.

[12] P. Frankl, On Sperner families satisfying an additional condition, J. Combin. Thy. A24 (1978) 308-311.

[13] P. Frankl and Z. Füredi, Union-free hypergraphs and probability theory, Euro. J. Combinatorics 5 (1984) 127-131.

[14] H. Hanani, On some tactical configurations, Canad. J. Math. 15 (1963) 702-722.

[15] A. J. Hoffman and R. R. Singleton, On Moore graphs with diameters 2 and 3, IBM J. Res. Develop. 4 (1960) 497-504.
[16] F. K. Hwang and V. T. Sós, Non-adaptive hypergeometric group testing, Studia Scient. Math. Hungarica 22 (1987) 257-263.

[17] S. M. Johnson, A new upper bound for error correcting codes, IEEE Trans. Inform. Thy. 8 (1962) 203-207.

[18] G. Katona, On separating systems of a finite set, J. Combin. Thy. 1 (1966) 174-194.

[19] W. H. Kautz and R. R. Singleton, Nonrandom binary superimposed codes, IEEE Trans. Inform. Thy. 10 (1964) 363-377.

[20] B. Lindström, Determination of two vectors from the sum, J. Combin. Thy. A6 (1969) 402-407.

[21] Q. A. Nguyen and T. Zeisel, Bounds on constant weight binary superimposed codes, Probl. Control & Inform. Thy. 17 (1988) 223-230.

[22] W. W. Peterson, Error Correcting Codes (MIT Press, Cambridge, Mass., 1961).

[23] D. Raghavarao, Constructions and Combinatorial Problems in Design of Experiments (Wiley, New York, 1971).

[24] A. Rényi, On random generating elements of a finite Boolean algebra, Acta Sci. Math. (Szeged) 22 (1961) 75-81.

[25] M. Ruszinkó, On the upper bound of the size of the r-cover-free families, preprint.

[26] G. M. Saha, H. Pesotan and B. L. Raktoc, Some results on d-complete designs, Ars Combin. 13 (1982) 195-201.

[27] G. M. Saha and B. K. Sinha, Some combinatorial aspects of designs useful in group testing experiments, unpublished manuscript.

[28] D. J. Schultz, Topics in Nonadaptive Group Testing, Ph.D. Dissertation, Temple University, 1992.

[29] D. J. Schultz, M. Parnes and R. Srinivasan, Further applications of d-complete designs to group testing, J. Combin. Inform. & System Sci., to appear.

[30] R. R. Singleton, Maximum distance q-nary codes, IEEE Trans. Inform. Thy. 10 (1964) 116-118.

[31] J. Spencer, Minimal completely separating systems, J. Combin. Thy. 8 (1970) 446-447.

[32] E. Sperner, Ein Satz über Untermengen einer endlichen Menge, Math. Zeit. 27 (1928) 544-548.
[33] F. Vakil and M. Parnes, On the structure of a class of sets useful in non-adaptive group-testing, J. Statist. Plan. & Infer., to appear.

[34] F. Vakil, M. Parnes and D. Raghavarao, Group testing with at most two defectives when every item is included in exactly two group tests, Utilitas Math. 38 (1990) 161-164.

[35] I. Wegener, On separating systems whose elements are sets of at most k elements, Disc. Math. 28 (1979) 219-222.

[36] C. A. Weideman and D. Raghavarao, Some optimum non-adaptive hypergeometric group testing designs for identifying defectives, J. Statist. Plan. Infer. 16 (1987) 55-61.

[37] C. A. Weideman and D. Raghavarao, Nonadaptive hypergeometric group testing designs for identifying at most two defectives, Commun. Statist. 16A (1987) 2991-3006.

[38] V. Zinoviev, Cascade equal-weight codes and maximal packing, Probl. Control & Inform. Thy. 12 (1983) 3-10.
Multiaccess Channels and Extensions
Consider a communication network where many users share a single multiaccess channel. A user with a message to be transmitted is called an active user. For convenience, assume that each message is of unit length and can be transmitted in one time slot. The transmission is done by broadcasting over the channel, which every user, active or not, can receive. However, if at any given time slot more than one active user broadcasts, then the messages conflict with each other and are reduced to noise. The problem is to devise an algorithm which schedules the transmissions of active users into different time slots so that the transmissions can be successful.

Figure 5.1: A multiaccess channel.

The algorithms to be considered in this chapter do the scheduling in epochs. At the start of an epoch, the users are classified either as active or inactive according to whether they have messages at that particular moment. An inactive user remains labeled as inactive even if a message is generated during the epoch. An epoch ends when all active users have transmitted successfully in different time slots, and then the next epoch starts. When the set of active users is known, they can be scheduled to transmit in successive time slots in some order and no conflict can occur. But active users are usually unknown at the beginning of an epoch. One simple protocol, called TDM (time-division multiplexing), is to assign each user a time slot in which the user can transmit if active. When there are n users, n time slots will be needed. This is very inefficient, since the number of active users is typically small (or the sharing of a multiaccess channel would not be practical). Hayes [9] proposed a bit reservation algorithm which first identifies all active users using group queries, and then schedules the known active users to transmit in order. In a bit reservation algorithm a set of
users is queried and the active users among them are requested to transmit a bit. When the channel is quiet after the query, the queried set cannot contain an active user. If the channel is not quiet, then the queried set is known to contain active users, but not how many or which ones. The feedback is identical to the feedback of group testing. Usually the number of active users in an epoch, or an upper bound on it, can be estimated from the length of the previous epoch (since the active users were generated during that epoch). Then a (d, n) or (d̄, n) algorithm can serve as a bit reservation protocol. Capetanakis [4], and also Tsybakov and Mikhailov [13], proposed a direct transmission algorithm which also queries sets of users, but requests the active users among the queried set to transmit. If the channel is quiet in the next time slot, the queried set contains no active user. If the channel broadcasts a message, the queried set contains exactly one active user, who has transmitted successfully. If the channel is noisy, then the queried set contains at least two active users who have transmitted, but their messages collide and thus fail to get through. Further queries are needed to separate the transmissions of these active users. Note that the type of feedback of a query is ternary: 0, 1, 2+, where 2+ means at least 2. The reader should be warned that the bulk of the literature on multiaccess channels employs probabilistic models and will not be covered in this volume.
5.1 Multiaccess Channels
Capetanakis [4, 5] (also see Tsybakov and Mikhailov [13]), gave the following tree algorithm C for n = 2k users: Let Tn denote a balanced binary tree with n leaves. Label each leaf by a distinct user, and label each internal node u by the set of users which label the descendants of u. Thus the root of Tn is labeled by the whole set of n users. Starting from the root, using either a breadth-first or a depth-first approach to visit the nodes and query the set of users labeling them, except that if the feedback of a query is 0 or 1, then all descendants of the node will not be visited.
Figure 5.2: Ts with three active users: 2, 7, 8.
93
5.1 Multiaccess Channels
Clearly, after all the nodes are visited, all active users have successfully transmitted. The number of queries is just the number of nodes visited. Let C(d \ n) denote the number of queries required by algorithm C when d of the n users are active (d is unknown). Then C ( l | 2fc) C(d\2k)
= =
1, 1+
max {C(i\2k-1)
+ C(d-i
\2k~1)}
for d > 2 .
i<> - 1) + 2D - I, where D = [log d]. Namely, the worst case occurs when active users label the leaves in pairs, but are otherwise scattered evenly over the tree. Corollary 5.1.2 C(d | n) —> dlogn for every fixed d. n—*oo
Capetanakis also proposed a varient tree algorithm in which the visit starts at nodes at level / (the root is at level 1). Let Ci(d \ n) denote the corresponding number of queries. Since the worst-case distribution of active users is the same as before i
C,{d | n)
= C(d | n) - (2'" 1 - 1) +
£
(2-1 - 2
i=[\ogd]+l
=
C(d\n)-2^°^
-2
(/-flogrfD + ^
+ l,
where 2' J — 1 is the number of nodes skipped in Cj, and 2' a — 2 I ~\ is the number of nodes at level i skipped in C. If d is known, then / can be optimally chosen to be approximately 1 + log(d/ In 2) with Corollary 5.1.3 C{d,n) = C{d\ n) - d(2 - log In 2 - l / l n 2 ) + l. Several improvements over the tree algorithm have been suggested. Massey [11] pointed out that if node u has two child-nodes x and y such that the feedback of querying u and x are 2 + and 0 respectively, then one can skip the visit to y and go directly to the child-nodes of y, since the feedback of querying y can be predicted to be2+. There is also no reason to restrict the size of queried sets to powers of two. Suppose that the n users are divided into g groups of n/g users each. Apply the tree algorithm
94
Multiaccess
Channels
and
Extensions
on each group. Assume t h a t group i contains d{ active users. T h e total n u m b e r of queries is at most 3
g + J2di
log(n/,9) = g + dlog(n/g)
.
t=i
If g is selected to be approximately l o g n , t h e n t h e n u m b e r of queries is about (d + l ) l o g n — log l o g n . If d is known, g can be optimally selected to be approximately d i n 2. T h e n t h e n u m b e r of queries is about dlogn — dlog e[ln(
=
1 + mm {H{d,n
- k), H(d - l,n - k),G(k
: d,n)}
,
l
G(m\ d,n)
=
1+
m i n {G(m — k;d,n
— fc), Fim — k\d — l,n — k),G(k;dyn)}
,
l
F(m;d,n)
—
1+
m i n {F(m
— k\ d,n - fc), Hid - 1, n - k), G(k; d, n)} ,
l
where H{0,n) = 0, H(l,n) = 1, G{2;d,n) = 2 + H{d-2,n - 2 ) and F(l;d,n) = 1 + H(dl,n-l). Let Q(d,n) denote t h e n u m b e r of queries required given a (d,n) sample space. Greenberg and Winograd [8] proved a lower bound of Q(d, n) which is asymptotically close to t h e upper bounds discussed earlier. T h e y first proved two looser lower bounds whose combination yields t h e tighter lower b o u n d desired. L e m m a 5 . 1 . 4 Q(d, n) > d + log(n/d)
for d > 2.
Proof. An oracle a r g u m e n t is used. Let A$ denote t h e original set of active users and let At denote t h e u p d a t e d set of active users after t h e tih query Qt. T h e n
C At - Qt
if | At n Qt | = 1 ,
( At
otherwise .
At+i = I
Given any sequence of queries <3i, • • •, Qt, define S\ = { 1 , ••• ,n} ( StOQt
if|S,nO«|>|5,|/2
( St — Qt
otherwise .
st+l =
and
5.1 Multiaccess Channels
95
Then | 5 t + 1 | > | St | /2 > n / 2 ' . It follows that | 5 i + i | > d as long as t < hg(n/d). The oracle selects AQ as a subset of S<+i; thus A i + i = AQ. Since | J4O | = d, at least rf queries are needed to allow the active users in Ao to transmit. Hence Q(d, n)>d + \og{n/d) . Lemma 5.1.5 Q(d,n) > (L/logL) (log \n/L\) [d/2\ (hence 4 < L < n/L).
•
- L for 8 < d < V^n, where L =
Proof. Consider a new problem of having L sets NQ(1), • • • ,No(L), of [n/L\ users each, where each set contains exactly two active users. Let Q(L) denote the number of queries required for this new problem. Then Q(L)
The following oracle scheme is used such that the entropy gained at each query is upper bounded. (i) Suppose that for some j , \ Nt(j) | > L and | Nt{j) D Qt \ > | Nt(j) \ jL. Then select one such j and set Nt+1(j) Nt+1{i)
= =
Nt(j)nQt, Nt(i) f o r a l H / j i - .
(ii) Otherwise, for every i with | Nt(i) | > L, set Nt+1(i) = Nt(i) - Qt . For every i with | Nt(i) \ < L, set Nt+i(i) = any two users of Nt(i) . Let Lt denote the number of Nt{i) such that | Nt(i) \ < L. The entropy gained at query t is
( 31ogJVt(j) - 31ogiV t+l (i) < 31ogL 3(1-i()log-
+Lt(2\ogL
in case (i)
- 1) in case (ii) ,
96
Multiaccess Channels and Extensions NtG)
N t CD
NtG)
Qt (a) large intersection
(c) small Nt(j)
(b) large N t (j) but small intersection Figure 5.3: Nt(j + 1) (shaded area)
where the coefficient 3 is chosen as the smallest integer satisfying the inequalities. The total entropy gained in q queries is i-1
r 1
r
£)£(<)
< 3? max j log L, L l o g -
i=o
=
\ + L(2\ogL - 1)
L — 1)
*•
3glogZ, + Z ( 2 1 o g L - l )
for L > 4 .
The problem is unsolved whenever this number does not exceed the initial entropy,
gH(l"fl)=i.o6(WiJ)>^Ml»W-i)-x. Hence mn)>H2loS(HH-l)-2logL
+
l}
Q
31ogi Theorem 5.1.6 Q(d,n)
=
il(dlogn/logd).
Proof. For d < 8 or d > \/2n, use Lemma 5.1.4. For 8 < d < \/2~n, use Lemma 5.1.5. •
5.2
Nonadaptive Algorithms
A nonadaptive algorithm for the multiaccess channel problem can also be represented by a 0 — 1 matrix M with columns as users, rows as queries and M tJ = 1 implying
5.2 Nonadaptive
97
Algorithms
t h a t query i contains user j . Since an active user can t r a n s m i t successfully in a query if and only if no other active user is also t r a n s m i t t i n g , M m u s t have t h e property t h a t for any sample from t h e given sample space and a given column in t h a t sample, t h e r e exists a row with 1 in t h e given column and 0 in all other columns in t h a t sample. W h e n t h e sample space is S(d,n) or S(d,n), by L e m m a 4.3.8, this property is satisfied if and only if M is (d — l)-disjunct. Therefore all results on ^-disjunct matrices apply here. However, there is a weaker sense of nonadaptive algorithms for the multiaccess channel problem with less stringent requirements. This occurs when all queries are specified simultaneously b u t carried out one by one in the order of t h e row index. Any active user who has already t r a n s m i t t e d successfully will not t r a n s m i t in any subsequent query even though its column entry is 1. Translating this p r o p e r t y to the m a t r i x M , then for every sample s there exists a p e r m u t a t i o n of t h e columns in s such t h a t t h e r e exists | s | rows which form an | s | x | 5 | lower triangular s u b m a t r i x with t h e columns in s, i.e., t h e s u b m a t r i x has l ' s in t h e diagonal b u t only O's above t h e diagonal. Busschbach [3] gave an upper bound using a r a n d o m code construction similar to t h e one given in T h e o r e m 4.3.9 except t h e coefficient 3 is replaced by log 3. Komlos and Greenberg [10] used a probability a r g u m e n t to derive a b e t t e r upper b o u n d for n o n a d a p t i v e algorithms. For convenience, assume d divides n. Let Sk denote t h e set of all fc-subsets of t h e set of n users. Let Q i , • • •, Qq denote a list of queries which are distinct but r a n d o m m e m b e r s of Sn/d- Let Aj denote an arbitrary m e m b e r of Sd- A query Qi is said to isolate a user in Aj if | Q»- D >4j | = 1. L e m m a 5 . 2 . 
1 For every i, 1 < i < d, with probability at least ( 4 e ) _ 1 , there is a user x in Aj isolated by Qi, but not by any Qk for 1 < k < i.
Proof. P r o b (Qi isolates x and Qk not isolates x for 1 < k < i) =
P r o b (Qi isolates x) P r o b (Qk not isolates x for 1 < k < i)
Since t h e expected value of | Qi C\ Aj \ is one, Prob (| QiDAj
\>z)
< l/z
.
F u r t h e r m o r e , it is easily verified t h a t P r o b (| Qi nAj\
= l)>
P r o b (| Qt n A3 | = 0).
Hence P r o b (| Qi n A,- | = 1)
>
[1 - P r o b (| Q{ n Aj | > 2)]/2
>
(1 - l / 2 ) / 2
= 1/4 .
98
Multiaccess
Channels
and
Extensions
P r o b (Qj not isolates z for 1 < j < i)
> Prob (xt'\jQk^
= (l-J)'
<2-l
>
(1-^1
> e-1 .
D
L e m m a 5 . 2 . 2 A random query list Qi, • • • ,Qd isolates at least d / ( 4 e ) 2 users in Aj with probability at least 1 — e~hd for any Aj e Sd, where b = [l + ( 4 e ) _ 1 ] 2 [1 — ( 4 e ) _ 1 ] / 2 . Proof. Let B(m,M,p) denote t h e probability of m or more successes in M independent trials, each of which succeeds with probability p. Angluin and Valiant [1] proved that B({l + 0)Mp,M,p) <e~s2Mp'2 . Call Qi a failure if it isolates a user in Aj not isolated by any Qk, k < i; call Qi a success otherwise. T h e n t h e probability of a success is less t h a n 1 — ( 4 e ) _ 1 for every Qi regardless t h e Q,'s are independent or not. T h u s t h e probability of getting [1 — (4e) - 2 ] successes or more in t h e r a n d o m list of queries Q i , . . •, Qd is less t h a n B([l + ( 4 e ) - 1 ] 4 l - ^ e ) - 1 ] , d, 1 - ^ e ) " 1 ) < e -bd L e m m a 5 . 2 . 3 Ford > 2 there exists a list of queries of length O(dlog(n/d)) that it isolates at least d/(ie)2 users in Aj for any Aj G Sd-
such
Proof. Consider a r a n d o m list Q of t queries Qi,- • • ,Qt where t = md. From L e m m a 5.2.2 each disjoint sublist of d queries fails to isolate d/(4e) users in a given Aj with probability at most e~bd. Hence t h e probability for t h e t queries t o fail is at most e " m M . Setting
+i,
bd
thene-mM < l/(^). For a given Aj € Sd, define r a n d o m variables
Let
1
0
if Qi, • • •, Qt isolates at least rf/(4e)2 users in Aj
1
otherwise .
X{Q)
=
Y.
X
i(Q)
•
Aj<ESd
Treating Q\, • • •, Qt as a r a n d o m list of t queries,
E(X) = £ E(Xj) < £ X = 1 . AjSSd
A,£Sd
[d)
99
5.3 Two Variations
Therefore, there must exist a list Q' such that X(Q') = 0, or equivalently, Q' isolates at least d/(4e)2 users in every Aj G Sd- It is easily verified that t = md = 0{d\og{n/d)). D Theorem 5.2.4 Ford > 2, there exists a list of queries of length O(d\og(n/d)) isolates all users in Aj for every Aj £ Sd-
which
Proof. Define c = d/(4e) 2 . Lemma 5.2.3 guarantees the existence of lists Li, i = 0 , 1 , . . . , p — 1, each of length 0(d{ log(n/
J2 c{i-cyd = (i-c)pd t=0
users when p is set to be
such that (1 — cfd < 1; hence all users are identified. The length of the concatenated list is easily verified to be 0{d\og(n/d)). •
5.3
Two Variations
Berger, Mehravari, Towsley and Wolf [2] introduced two variations to the multiaccess channel problem by considering two ways to combine the ternary feedbacks of the latter to binary feedbacks (a third way leads to the group testing model). Although their discussion was in the context of PGT, the essence of their results can be easily transplanted to CGT. Since an active user needs more than identification, i.e., the message must be transmitted successfully, the term satisfied is introduced to denote the latter state. For an inactive user, satisfaction simply means identification. In the conflict/no conflict model, the feedbacks of 0 and 1 are combined into a "no conflict" feedback. Note that all users in a queried set with a no-conflict feedback are satisfied; so they are not affected by the suppression of information. There are two possible consequences for unsatisfied users: 1. If a subset G' of a conflict set G is queried with a no-conflict feedback, then it is not known whether G\G' contains at least 1 or 2 active user. 2. The number of satisfied active users is no longer known. So one cannot deduce that all unsatisfied users are inactive from the knowledge that d active users have been satisfied.
Multiaccess Channels and Extensions
100
Berger et al. suggested ways to test G\G' in the situation described in the first case. Then one can simulate an algorithm for the multiaccess channel for the current model. The second case can be taken care of by testing the set of unidentified items whenever the situation is unclear. In the success/failure model, the feedbacks 0 and 2 + are combined into a "failure" feedback. This may occur when the channel has another source of noise indistinguishable from the conflict noise. Berger et al. proposed an algorithm B for PGT which, whenever a failure set is identified, queries all users in that set individually. Adapting it to S{d,n) and let QB(d, n) denote the maximum number of queries required by B. Then QB{0,n) = 0, QB(l,n) = 1, QB{n,n) = n and for n > d > 2 QB(d,n)
=
1+
min
rnax{QB(d — l , n — k),
l
k — 1 + QB(d,n — k),
k + max QB(d — i,n — k)}.
Note that if the failure set contains no active user, then one can deduce this fact after querying the first k — 1 users individually. This explains the term k — 1 added to QB{d, n — k). Since it is easily verified by induction that QB{d,n)>QB(d-l,n)
,
the above recursive equation can be reduced to QB(d,n)
= 1+
min
max{A: — 1 + QB(d,n — k),k + QB(d — 2,n — k)} .
l
For d = 2 it follows QB(2,n)=
min {k + QB(2,n - k)} . l
Using the initial condition QB(2,2) = 2, it is easily verified by induction that QB(2,n) Since QB(d,n)
=n .
is clearly nondecreasing in d, QB(d, n) = n for d > 2 ,
i.e., Algorithm B is simply the individual testing algorithm. Are there better algorithms than individual testing for d > 2? The answer is yes, and surprisingly we already have it. Namely, all the nonadaptive algorithms reported in Section 5.2, be they regular or weak-sense, apply to the success/failure model. This is because the matrices in these algorithms have the property that for any sample of
5.4 The
k-channel
101
d columns including a designated one, there exists a row with 1 in t h e designated column and 0s in t h e other columns of t h e sample. So each active user in t h e sample has a query t o its own t o t r a n s m i t . Therefore all t h e (d — l)-disjunct matrices can be used in t h e current model for t h e sample space S(d,n). In particular, T h e o r e m 4.5.3 says t h a t t h e ( 2 , n ) problem can b e done in t queries where t is t h e largest integer satisfying
,L*/2j)-"Since t h e feedbacks for t h e success/failure model is a subset of t h e multiaccess channel model, t h e above t is also t h e m i n i m a x n u m b e r of queries for n o n a d a p t i v e algorithms for t h e former model as it is for t h e latter. O n e can do b e t t e r for sequential algorithms. For example, for d = 2, one can use t h e binary representation m a t r i x given in Sec. 4.5 (sequentially). Since 1-separable implies t h a t for every pair of columns t h e r e exists a row with 1 in one column and 0 in t h e other, one active user will t r a n s m i t through t h a t row a n d t h e other through a c o m p l e m e n t a r y subset (not in t h e m a t r i x ) . T h u s it requires 1 + [logra] queries. For d = 3, a modified scheme works. Query t h e users labeled by all 0s and all Is (if t h e y exist) individually and delete these two columns from t h e binary representation m a t r i x . Query t h e set of Is in each row and their c o m p l e m e n t a r y sets. It is easily verified t h a t every active user has a query to its own to t r a n s m i t . T h e total n u m b e r of queries is 1 + [log n ] + [log(n + 1 ) ] . An e x a m p l e for n = 8 and d = 2 is given in t h e following table:
query\user   0  1  2  3  4  5  6  7
    1        0  0  0  0  1  1  1  1
    2        0  0  1  1  0  0  1  1
    3        0  1  0  1  0  1  0  1
If 6 and 7 are the two active users, then 7 transmits in query 3, and 6 in query 4, which consists of the complementary set of query 3.
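The scheme just illustrated is easy to check mechanically. The sketch below (function name ours) builds the ⌈log n⌉ binary-representation rows and, for a given pair of active users, plays out the row queries followed by one complementary query:

```python
import math

def serve_pair(n, i, j):
    """Binary representation scheme for d = 2: query each row of the
    ceil(log n) x n binary matrix, then the complement of a separating row.
    Returns (set of satisfied users, number of queries used)."""
    t = math.ceil(math.log2(n))
    rows = [{u for u in range(n) if (u >> b) & 1} for b in range(t)]
    satisfied, used = set(), 0
    for row in rows:                      # the t row queries
        used += 1
        alone = {i, j} & row
        if len(alone) == 1:               # exactly one active user transmits: success
            satisfied |= alone
    for row in rows:                      # one complementary query suffices
        if len({i, j} & row) == 1:
            used += 1
            satisfied |= {i, j} - row     # the partner is alone in the complement
            break
    return satisfied, used
```

Any two distinct users differ in some bit, so a separating row always exists and 1 + ⌈log n⌉ queries suffice, matching the table for n = 8.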
5.4
The k-Channel
The k-channel is a generalization of the multiaccess channel such that any fewer than k users can transmit successfully in the same time slot (k = 2 is the multiaccess case). The simultaneous transmission is possible in several scenarios. The channel may be able to transmit the Boolean sum of messages received. If the messages are coded into a superimposed code (see Chapter 4) where each supercode word consists of up to k − 1 code words, then the superimposed message, which is a concatenation of supercode words, can also be uniquely deciphered. A typical application is to use a superimposed code on users' identifications. A reservation algorithm requests active
Multiaccess Channels and Extensions
users to transmit their identifications. If the number of active users is less than k, then all active users are identified in one time slot. Another scenario is when a time slot has capacity k − 1 (k − 1 subslots), or the channel is a pool of k − 1 subchannels. The k-channel was first studied by Tsybakov, Mikhailov and Likhanov [14] for probabilistic models. Let Q_k(d,n) denote the number of queries required for a minimax algorithm for the k-channel with a (d,n) set of users. Chen and Hwang [7] used an argument similar to the proof of Theorem 5.1.6 to prove

Theorem 5.4.1 Q_k(d,n) = Ω((d/k) log n / log d).
They also proposed an algorithm which is an extension of the generalized binary splitting algorithm (see Section 2.2). Partition the set of unsatisfied users into two different groups L and U, called loaded and unloaded. Namely, a loaded group is known to contain some active users, and this knowledge comes from querying, not from the nature of the sample space. Originally, all users are in the unloaded group. Let a(u) be a parameter to be determined later, where u is the size of the updated U.

Step 1. Query any set S of (k − 1)2^{a(u)} users in U (set S := U if (k − 1)2^{a(u)} ≥ u).
Step 2. If the feedback on S is at most k − 1, set U := U \ S and go back to Step 1.
Step 3. If the feedback on S is k+, set L := S and c := k.
Step 4. Let ℓ be the size of L. If ℓ ≤ k − 1, query L and go back to Step 1. If ℓ ≥ k, query any set S′ of ⌊ℓ/2⌋ users from L.
Step 5. Suppose the feedback on S′ is k′. If k′ < c, set L := L \ S′, c := c − k′ and go back to Step 4. Otherwise, move L \ S′ to U. If c ≤ k′ < k, go back to Step 1. If k′ = k+, set L := S′ and go back to Step 4.

Let Q_k(d, n : a) denote the maximum number of queries required by the above algorithm. Note that every time Step 3 is taken, at least k active users will be satisfied before going back to Step 1. Therefore Step 3 can be entered at most ⌊d/k⌋ times. Once Step 3 is entered, the size of L is reduced by half by every new query until the size is at most k − 1. Hence a total of a(u) + 1 queries (counting the query entering Step 3) is needed. Making the conservative assumption that no inactive user is identified in the process of satisfying active users, the n − d inactive users are identified in Step 1 in a total of at most ⌈(n − d)/((k − 1)2^{a(u)})⌉ queries. So
Q_k(d, n : a) ≤ ⌊d/k⌋ (a(u) + 2) + ⌈(n − d)/((k − 1)2^{a(u)})⌉ .
Ideally, a(u) should be a function of u. For easier analysis, set a(u) = log(n/d).

Theorem 5.4.2 Q_k(d,n) = O((d/k) log(n/d)).
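Steps 1–5 above can be played out in a short simulation. The sketch below is one arbitrary instantiation ("any set" is taken as the smallest-numbered users, and all names are ours); the feedback function returns the exact count below k and "k+" otherwise:

```python
def feedback(S, actives, k):
    """k-channel feedback: exact count if fewer than k active users transmit."""
    a = len(S & actives)
    return a if a < k else "k+"

def chen_hwang(n, actives, k, a):
    """Sketch of the Chen-Hwang k-channel algorithm; 'a' stands in for a(u)."""
    U, satisfied = set(range(n)), set()
    queries = 0
    while U:
        S = set(sorted(U)[: (k - 1) * 2 ** a])     # Step 1
        queries += 1
        fb = feedback(S, actives, k)
        if fb != "k+":                             # Step 2: all actives in S succeed
            satisfied |= S & actives
            U -= S
            continue
        L, c = S, k                                # Step 3: S is loaded
        U -= S
        while True:                                # Steps 4 and 5
            if len(L) <= k - 1:                    # feedback must be below k
                queries += 1
                satisfied |= L & actives
                break
            Sp = set(sorted(L)[: len(L) // 2])
            queries += 1
            fb = feedback(Sp, actives, k)
            if fb == "k+":                         # S' becomes the new loaded set
                U |= L - Sp
                L, c = Sp, k
            elif fb < c:                           # L \ S' still holds >= c - fb actives
                satisfied |= Sp & actives
                L -= Sp
                c -= fb
            else:                                  # c <= fb < k: back to Step 1
                satisfied |= Sp & actives
                U |= L - Sp
                break
    return satisfied, queries
```

On any instance the simulation ends with exactly the active users satisfied.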
For d small or large, more accurate information on Q_k(d,n) can be derived. A TDM algorithm for the k-channel will query k − 1 users at a time. Thus it requires ⌈n/(k − 1)⌉ queries. Clearly, a TDM algorithm will be minimax if d is sufficiently large. Chang and Hwang [6] gave a lower bound on such d.

Theorem 5.4.3 If ⌈n/(k−1)⌉ ≤ ⌈(d−k+1)/k⌉ + ⌈(d−k+1)/(k−1)⌉ + 1, then

Q_k(d,n) = ⌈n/(k−1)⌉ .
Proof. It suffices to prove that

Q_k(d,n) ≥ ⌈n/(k−1)⌉ .

An oracle argument is used. If a query is of size at most k − 1, then the oracle always orders the feedback 0 (if 0 is impossible, then an integer as small as possible). Otherwise, the feedback is k+, whenever possible. Under this oracle, an algorithm will not query a set larger than k, unless it is known that the number of active users contained therein is less than k. An active user can be satisfied only in the following two situations:

(i) The user is in a queried set of size at most k − 1.
(ii) The user is in a queried set of size at least k, but this set can contain at most k − 1 active users since at least d − k + 1 active users are known to be elsewhere.

Let S be the first queried set to arise in situation (ii). Since S contains inactive users, all sets of size at most k queried before S must contain no active user (since otherwise the feedback to such a set could be reduced by switching active users with the inactive users in S). So the x ≥ d − k + 1 active users known to lie elsewhere must lie in sets of size at least k. By the definition of the oracle, the feedbacks of these sets are k+; hence these sets must be of size exactly k, or the users contained therein cannot be deduced to be active. Thus there are at least ⌈(d − k + 1)/k⌉ such sets. Furthermore, the x active users have to be satisfied in at least another ⌈(d − k + 1)/(k − 1)⌉ queries. Counting the query on S, the minimum total number of queries is

⌈(d − k + 1)/k⌉ + ⌈(d − k + 1)/(k − 1)⌉ + 1 .

If situation (ii) never arises, then the total number of queries is at least ⌈n/(k − 1)⌉. □

Let n_q(k,d) denote the largest n such that q queries suffice given d and k. Since Q_k(d,n) is clearly increasing in n, a complete specification of n_q(k,d) is equivalent to one of Q_k(d,n). If d < k, then Q_k(d,n) = 1. For d ≥ k but d small, Chang and Hwang [6] proved
Theorem 5.4.4 For 2 ≤ k ≤ d < 2k and q ≥ 2,

n_q(k,d) = 2^{q−2}(3k − 2 − d) + d − k .
Proof. A construction is first given to show

n_q(k,d) ≥ 2^{q−2}(3k − 2 − d) + d − k .
For q = 2 the RHS of the above inequality is 2k − 2, which can certainly be handled by two queries of k − 1 users each. The general q ≥ 3 case is proved by induction on q. Let S be the first query set, with 2^{q−3}(3k − 2 − d) users. If the feedback on S is less than k, then by induction q − 1 more queries suffice for the remaining users. So assume the feedback is k+ and S is a loaded set. For the next q − 3 queries, use the halving technique as described in the Chen-Hwang algorithm until a loaded set of size 3k − 2 − d is left. This set can either be a set which has been queried with feedback k+ or the only unqueried subset of such a set. Next query a subset of k − 1 users from this loaded set. Then the remaining 2k − 1 − d users are the only unsatisfied users from a set with feedback k+. The number of satisfied active users during this process is at least k − (2k − 1 − d) = d + 1 − k. Hence there are at most k − 1 active users among all unsatisfied users. One more query takes care of them.

Next the reverse inequality on n_q(k,d) is proved. For q = 2, 2k − 1 users require three queries by Theorem 5.4.3. The general q ≥ 3 case is again proved by induction on q. Let n = 2^{q−2}(3k − 2 − d) + d − k + 1. Suppose that the first query is on x users. If x ≤ 2^{q−3}(3k − 2 − d), consider the feedback 0. Then the remaining n − x users cannot be satisfied by q − 1 more queries, by induction. If x ≥ 2^{q−3}(3k − 2 − d) + 1, consider the feedback k+. Clearly, if x ≥ 2^{q−3}(3k − 2 − d) + d − k + 1, then the x users cannot be satisfied by q − 1 more queries, by induction. So assume

1 ≤ x − 2^{q−3}(3k − 2 − d) ≤ d − k .

Now an oracle reveals 2^{q−3}(3k − 2 − d) inactive users among the unqueried users; then these users are satisfied. So the number of unqueried, unsatisfied users is

n − x − 2^{q−3}(3k − 2 − d) = d − k + 1 − [x − 2^{q−3}(3k − 2 − d)] ≤ d − k .

These up to d − k users can still all be active, since the queried x users are only guaranteed to contain k active users. Therefore the set of n′ = 2^{q−3}(3k − 2 − d) + d − k + 1 unsatisfied users forms an S(d,n′). By induction they cannot be satisfied in q − 1 queries. □
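The closed form of Theorem 5.4.4 is easy to sanity-check: at q = 2 it reduces to the 2k − 2 users handled by two queries of k − 1 each, and each additional query doubles the 2^{q−2}(3k − 2 − d) term while the additive d − k stays fixed. A small helper (our naming):

```python
def n_q(k, d, q):
    """Largest n solvable in q queries on the k-channel (Theorem 5.4.4),
    valid for 2 <= k <= d < 2k and q >= 2."""
    assert 2 <= k <= d < 2 * k and q >= 2
    return 2 ** (q - 2) * (3 * k - 2 - d) + d - k
```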
5.5
Quantitative Channels
In a bit reservation channel each active user transmits a bit. Although bits transmitted by active users collide, the channel may have the capability to sense the "loudness" of this collision and hence deduce the number of active users, though still not knowing who they are. Such a channel will be called a quantitative channel since it provides quantitative information about the number of active users. It will be shown that this quantitative information leads to more efficient algorithms than ordinary group testing, which provides only qualitative information. Long before Tsybakov [12] introduced the quantitative channel to multiaccess communication (for probabilistic models), the combinatorial version had been studied as a problem of detecting counterfeit coins by using a spring scale. Therefore the mathematical results will be discussed in the next chapter, Sec. 6.2.
References

[1] D. Angluin and L. Valiant, Fast probabilistic algorithms for Hamiltonian paths and matchings, J. Comput. Syst. Sci. 18 (1979) 155-193.
[2] T. Berger, N. Mehravari, D. Towsley and J. Wolf, Random multiple-access communications and group testing, IEEE Trans. Commun. 32 (1984) 769-778.
[3] P. Busschbach, Constructive methods to solve the problems of s-surjectivity, conflict resolution, coding in defective memories, unpublished manuscript, 1984.
[4] J. I. Capetanakis, Tree algorithms for packet broadcast channels, IEEE Trans. Inform. Theory 25 (1979) 505-515.
[5] J. I. Capetanakis, Generalized TDMA: The multi-accessing tree protocol, IEEE Trans. Commun. 27 (1979) 1479-1485.
[6] X. M. Chang and F. K. Hwang, The minimax number of calls for finite population multi-access channels, in Computer Networking and Performance Evaluation, eds. T. Hasegawa, H. Takagi and Y. Takahashi (Elsevier, Amsterdam, 1986) 381-388.
[7] R. W. Chen and F. K. Hwang, K-definite group testing and its application to polling in computer networks, Congressus Numerantium 47 (1985) 145-159.
[8] A. G. Greenberg and S. Winograd, A lower bound on the time needed in the worst case to resolve conflicts deterministically in multiple access channels, J. Assoc. Comput. Mach. 32 (1985) 589-596.
[9] J. F. Hayes, An adaptive technique for local distribution, IEEE Trans. Commun. 26 (1978) 1178-1186.
[10] J. Komlos and A. G. Greenberg, An asymptotically fast nonadaptive algorithm for conflict resolution in multiple access channels, IEEE Trans. Inform. Theory 31 (1985) 302-306.
[11] J. L. Massey, Collision-resolution algorithms and random-access communications, Tech. Rep. UCLA-ENG-8016, School of Engineering and Applied Science, Univ. Calif. Los Angeles, 1980.
[12] B. S. Tsybakov, Resolution of a conflict with known multiplicity, Probl. Inform. Transm. 16 (1980) 65-79.
[13] B. S. Tsybakov and V. A. Mikhailov, Free synchronous packet access in a broadcast channel with feedback, Probl. Inform. Transm. 14 (1978) 259-280.
[14] B. S. Tsybakov, V. A. Mikhailov and N. B. Likhanov, Bounds for packet transmission rate in a random-multiple-access system, Probl. Inform. Transm. 19 (1983) 61-81.
6 Some Other Group Testing Models
We introduce other group testing models with different types of outcomes. Some of these models can be classified as parametric group testing since there is a parameter to represent the degree of defectiveness and a test outcome is a function of the parameters of items in that test group. In particular, all possible patterns of outcomes for two (possibly different) defectives are examined.
6.1
Symmetric Group Testing
Sobel, Kumar and Blumenthal [23] studied a symmetrized version of group testing for PGT, which is asymmetric with respect to defectives and good items. In the symmetric model there are three outcomes: all good, all defective, and mixed (which means containing at least one defective and one good item each). Hwang [13] studied the symmetric model for CGT. He proved

Theorem 6.1.1 M(n − 1, n) = n.

Proof. That M(n − 1, n) ≤ n is obvious since individual testing needs only n tests. The reverse inequality is proved by an oracle argument. Let G denote the graph whose vertices are the unidentified items and an edge exists between two vertices if they have been identified as a "mixed pair," one good and one defective. At the beginning the graph has n vertices and no edges. After each test, the graph is updated by either adding one edge or removing a component (a connected subgraph). Note that if the state of a vertex is known, then the states of all vertices in the same component can be deduced. Therefore one can assume without loss of generality that a test group does not contain two vertices from the same component. The oracle will dictate the "mixed" outcome to each test of size two or more and add an edge between two arbitrary vertices in the test group. It will also dictate the "all good" outcome to each test of size one and remove the component containing that vertex from G. In either case the number of components is reduced
by one for each test. Since G starts with n components, n tests are required.
•
Next we prove a complementary result.

Theorem 6.1.2 M(n − 2, n) ≤ n except for n = 3, 4.

Proof. Test the groups (1,2), (1,3), ..., until a group (1,i), 2 ≤ i ≤ n − 1, yields a nonmixed outcome. Then the states of items 1, 2, ..., i are all known in i − 1 tests and the states of the remaining n − i items can be learned through individual testing in n − i tests. If all outcomes of (1,2), ..., (1, n − 1) are mixed and n ≥ 5, then items 1 and n must be the only two good items. □

For small d, the ordinary group testing model is not too different from the symmetric model since the "all defective" outcome is unlikely to occur except for small test groups. Therefore good group testing algorithms can be simulated here without giving up too much. The same goes for small n − d by reversing the definition of good items and defectives. For general d, Hwang proposed the following two algorithms.

The chain algorithm
Step 1. Test items 1 and 2 as a group.
Step 2. Suppose the current test group is {i, i + 1}. If the outcome is mixed, test the group {i + 1, i + 2}; if not, test the group {i + 2, i + 3}.

The star algorithm
Step 1. Test items 1 and 2 as a group.
Step 2. Suppose the current test group is {i, j}, i < j. If the outcome is mixed, test the group {i, j + 1} next; if not, test the group {j + 1, j + 2}.
Figure 6.1: The chain algorithm and the star algorithm.

It should be understood that for both procedures, any specified item having a label beyond n should be excluded. Furthermore, an empty group should not be tested.
Note that for both algorithms, the number of tests equals n minus the number of nonmixed outcomes. The maximum size of a chain or a star is 1 + min{d, n − d}. It seems that the star algorithm is the better one overall, except for some special cases like min{d, n − d} = 1.
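The ternary outcome and the chain algorithm's bookkeeping can be sketched in a few lines (names ours; the star algorithm differs only in how the next group is chosen):

```python
def outcome(group, defectives):
    """Symmetric-model outcome of testing a group."""
    g = set(group)
    d = g & set(defectives)
    if not d:
        return "all good"
    if d == g:
        return "all defective"
    return "mixed"

def chain_tests(n, defectives):
    """Number of tests used by the chain algorithm on items 1..n (sketch)."""
    tests, i = 0, 1
    while i <= n:
        group = [x for x in (i, i + 1) if x <= n]   # items beyond n are excluded
        tests += 1
        if outcome(group, defectives) == "mixed":
            i += 1                                   # next group {i+1, i+2}
        else:
            i += 2                                   # next group {i+2, i+3}
    return tests
```

Each nonmixed outcome disposes of two fresh items with one test, which is how the count n minus the number of nonmixed outcomes arises (up to boundary effects at item n).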
6.2
Some Additive Models
Consider the (d,n) sample space and assume that the i-th defective x_i has a defectiveness measurable by θ_i. For an additive model, the outcome of a test on group G is Σ_{x_i∈G} θ_i.

In the power-set model it is assumed that the sums Σ_{x_i∈s′} θ_i are distinct over all s′ ⊆ s. Therefore there exists a one-to-one mapping from the outcomes to the power set of s, i.e., there are 2^d possible outcomes. Since there are n(n − 1) ··· (n − d + 1) samples in the (d,n) sample space, the information lower bound yields

M(d,n) ≥ ⌈log_{2^d} n(n − 1) ··· (n − d + 1)⌉ .

Here we give a stronger lower bound and a construction to achieve it (the special case d = 2 was given in [14]).

Theorem 6.2.1 M(d,n) = ⌈log n⌉.
Proof. To identify the d defectives, one certainly has to identify a particular defective, say, x. For the purpose of identifying x, the only relevant information in an outcome is whether it contains x. Therefore

M(d,n) ≥ ⌈log n⌉ ,

since the right-hand side is the lower bound for identifying a single defective with the binary outcome "contained" or "not contained." On the other hand, the binary representation matrix given in the first paragraph of Sec. 4.5 to identify a single defective also identifies all d defectives. Namely, defective x_i is I_k where k is the number corresponding to the binary vector with 1s in all rows whose outcomes contain x_i, and 0s elsewhere. □

Example 6.1. In the following table, x_1 = I_k where k = (0,0,1,1) = 3 and x_2 = I_j where j = (1,0,1,0) = 10.
       I_0  I_1  I_2  I_3  I_4  I_5  I_6  I_7  I_8  I_9  I_10  I_11   outcome
        0    0    0    0    0    0    0    0    1    1    1     1     x_2
        0    0    0    0    1    1    1    1    0    0    0     0     0
        0    0    1    1    0    0    1    1    0    0    1     1     x_1 x_2
        0    1    0    1    0    1    0    1    0    1    0     1     x_1
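The decoding step of Theorem 6.2.1 is mechanical: under the power-set model each row's outcome is the subset of defectives it contains, and each defective's index is read off bitwise from the rows whose outcomes contain it. A sketch (our naming, most significant bit in the first row):

```python
import math

def identify_all(n, defectives):
    """Power-set model on the binary representation matrix: returns the set
    of defectives recovered from the t = ceil(log n) row outcomes."""
    t = max(1, math.ceil(math.log2(n)))
    rows = [{j for j in range(n) if (j >> (t - 1 - r)) & 1} for r in range(t)]
    outcomes = [set(defectives) & row for row in rows]   # what each test reports
    found = set()
    for x in set(defectives):
        k = 0
        for r in range(t):
            k = (k << 1) | (x in outcomes[r])            # bit r of x's index
        found.add(k)
    return found
```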
In the residual model it is assumed that nothing is known about the θ_i. So the only inference one can draw from the outcomes is the "containment" relation. For example, if group G induces an outcome θ and a subgroup G′ induces an outcome θ′ < θ, then without testing one can infer that the outcome on the group G \ G′ is θ − θ′, hence G \ G′ is a contaminated group. The residual model was first studied by Pfeifer and Enis [21] for PGT. For d = 1 the residual model is reduced to the ordinary group testing model; thus M(1,n) = ⌈log n⌉. For d = 2, the residual model is identical to the quantitative model (see below). Not much is known for d ≥ 3.

In the quantitative model it is assumed that θ_i = θ, a constant. Thus the set of outcomes is mapped to the number of defectives in the test group. The quantitative model has a long history in the combinatorial search literature. Shapiro [22] and Fine [9] raised the question: if counterfeit coins weigh nine grams and genuine coins weigh ten grams, given a scale which weighs precisely, what is the minimum number of weighings required to extract all counterfeit coins from a set of n coins?

A t × n 0-1 matrix M is called a detecting matrix if the vectors Mx_i, i = 1, ..., 2^n, are all distinct, where x_i ranges over all binary n-tuples. An immediate extension of Theorem 1.2.1 to the case of n + 1 outcomes yields

(n + 1)^t ≥ 2^n .

Let g(n) denote the minimum t such that a t × n detecting matrix exists. Cantor [3] proved that g(n) = O(n/log log n), while Soderberg and Shapiro [24] proved g(n) = O(n/log n). Erdös and Rényi [8], and many others independently, proved

Theorem 6.2.2 lim inf_{n→∞} g(n) log n / n ≥ 2 .
Proof. Let M denote a t × n detecting matrix. Partition the rows r_1, ..., r_t of M into two classes according as the weight of a row is less than h = ⌊√(n log n)⌋ or not. Let s denote a sample (a subset of columns) and let M_s denote the submatrix of M consisting of s, whose rows are denoted by r′_1, ..., r′_t. Let v_s denote the t × 1 vector whose component v_s(j) is the weight of r′_j. If r_j belongs to class 1, then v_s(j) can take on at most h different values. If r_j belongs to class 2 and has weight w ≥ h, then the number of samples s such that v_s(j) does not lie between w/2 ± λ√(w log w) (λ is a positive constant to be chosen later) is

2^{n−w} Σ_{|i − w/2| > λ√(w log w)} C(w, i) .

A sample is called "bad" if it contains such a v_s(j) and "good" otherwise. According to the de Moivre-Laplace theorem,

Σ_{|i − w/2| > λ√(w log w)} C(w, i) = O(2^w w^{−2λ²}) .

Let b denote the number of bad samples. Then

b = O(t 2^n h^{−2λ²}) = O(2^n / log n) for λ² > 1 .

On the other hand, let v denote the number of different vectors v_s ranging over the good samples. Then

v ≤ (2λ√(n log n))^t .

Since M is a detecting matrix, necessarily

v ≥ 2^n − b ,

which implies

t ≥ 2n / (log n + O(log log n)) .

Theorem 6.2.2 follows immediately. □
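For very small n the value of g(n) can also be found by exhaustive search over 0-1 columns (a zero column or a repeated column can never occur in a detecting matrix, so distinct nonzero columns suffice). This already shows detecting matrices beating individual weighings at n = 4, matching the k = 2 case of the construction discussed next. A brute-force sketch (our naming; exponential, tiny n only):

```python
from itertools import combinations, product

def is_detecting(cols, t):
    """True if all subset sums of the given 0-1 columns are distinct."""
    seen = set()
    for mask in range(1 << len(cols)):
        s = tuple(sum(c[r] for idx, c in enumerate(cols) if (mask >> idx) & 1)
                  for r in range(t))
        if s in seen:
            return False
        seen.add(s)
    return True

def g(n):
    """Smallest t admitting a t x n detecting matrix (brute force)."""
    t = 1
    while True:
        nonzero = [c for c in product((0, 1), repeat=t) if any(c)]
        if any(is_detecting(cs, t) for cs in combinations(nonzero, n)):
            return t
        t += 1
```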
Lindström [19, 20] (also see Cantor and Mills [4]) gave a construction of (2^k − 1) × k2^{k−1} detecting matrices M_k to show that the asymptotic lower bound of Theorem 6.2.2 can be achieved. The rows of M_k are labeled by all nonempty subsets Y of K = {1, ..., k}. The columns of M_k are divided into groups which are also labeled by nonempty subsets of K. The group labeled by X ⊆ K contains |X| = x columns v_1, ..., v_x. Let c(Y, X, v_i) denote the entry at the intersection of row Y and column (X, v_i). Furthermore, let R(X) be the set of 2^{k−1} rows whose intersections with X have odd cardinality, and let R_i(X) be an arbitrary 2^{k−i}-subset of R(X). Define

c(Y, X, v_i) = 1 if Y ∩ X ∈ R_i(X), and 0 otherwise.

M_3 is given in the following table:
            {1}  {2}  {3}  {1,2}  {1,3}  {2,3}  {1,2,3}
{1}          1    0    0    11     11     00     111
{2}          0    1    0    10     00     11     110
{3}          0    0    1    00     10     10     100
{1,2}        1    1    0    00     11     11     000
{1,3}        1    0    1    11     00     10     000
{2,3}        0    1    1    10     10     00     000
{1,2,3}      1    1    1    00     00     00     100
To prove that M_k is a detecting matrix, the following lemma is crucial.

Lemma 6.2.3 Let X and Z be two column-groups such that Z ⊄ X. Then

Σ_{Y ∈ R(Z)} c(Y, X, v_i) = Σ_{Y ∉ R(Z)} c(Y, X, v_i) .

Proof. Since Z ⊄ X, there exists an element j ∈ Z − X. Clearly,

(Y ∪ {j}) ∩ X = Y ∩ X .

By construction,

c(Y ∪ {j}, X, v_i) = c(Y, X, v_i)

for every row Y not containing j. Since Y ∩ Z and (Y ∪ {j}) ∩ Z have different parities, one of them is in R(Z) and the other not. Furthermore, c({j}, X, v_i) = 0. Hence the terms in the two sums of Lemma 6.2.3 are pairwise equal. □

Theorem 6.2.4 lim_{n→∞} g(n) log n / n = 2 .
Proof. It is shown first that M_k is a detecting matrix. Let s be a sample and let I_i(s) be the indicator function such that

I_i(s) = 1 if column C_i ∈ s, and 0 otherwise.

Let I(s) be the column vector consisting of I_i(s), i = 1, ..., k2^{k−1}. Then M_k is a detecting matrix if the I_i(s) can be solved from the equation

M_k I(s) = m(s) ,
where m(s) = (m_1, ..., m_{2^k−1}) is a column vector and m_j is the outcome yielded by row j. The I_i(s) are solved in the partial order of the subsets labeling the column groups. Columns which have been solved are deleted from the partial order, and columns in a group which is a maximal element in the current partial order are next to be solved. Suppose that Z is such a group with columns v_1, ..., v_z. For each i = 1, ..., z, add up the rows in R_i(Z) and subtract from it the rows not in R_i(Z). Due to Lemma 6.2.3, the coefficients of I_i(s) for all unsolved columns C_i not in Z are zero. Thus the z equations contain only z unknowns corresponding to the z columns in group Z, and I_i(s) for these columns can be uniquely solved. Note that

n = k2^{k−1} = (t + 1) log(t + 1) / 2 ,

where t = 2^k − 1 is the number of rows; hence t = (2n/log n)(1 + o(1)). Thus

lim sup_{n→∞} g(n) log n / n ≤ 2 .

Theorem 6.2.4 now follows from Theorem 6.2.2. □
One might also consider the quantitative model for the (d,n) sample space, or when the row weight is restricted to be k. Koubek and Rajlich [18] showed that lim g(n) exists for the latter case. Aigner and Schughart [2] gave a minimax (sequential) line algorithm for the former case. Their model actually allows testing the items from both ends; but since a set which contains x defectives implies its complementary set contains d − x defectives, it can be assumed that all tests are from the top order. Furthermore, it does not matter whether identified items are removed from the line or not, since if a test set contains identified items, then the number of unidentified defectives contained in the subset can be deduced. Consider a line algorithm whose first test is on the first y items and the second on the first z items. The first test partitions the n items into two disjoint sets: G of the first y items, and G′ of the remaining n − y items. If z ≤ y, then the second test yields no information on G′; while if z > y, then no information on G. Let A(d,n) denote the minimum number of tests required by a minimax line algorithm. The above argument yields the recursive equation

A(d,n) = 1 + min_{1 ≤ i ≤ ⌈n/2⌉} max_{0 ≤ j ≤ d} {A(d − j, i) + A(j, n − i)} .
Aigner and Schughart showed that this recursive equation has the following solution.

Theorem 6.2.5 Define k = ⌊log(n/d)⌋ + 1 and h = ⌈n/2^{k−1}⌉ − d. Then

A(d,n) = kd + h − 1 .

Sequential algorithms for the (2,n) sample space have been extensively studied. Let g(2,n) denote the number of tests required by a minimax algorithm. Christen [6]
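The recursion for A(d,n) is directly computable by dynamic programming, which lets one spot-check the closed form on small instances (the integer computation of k and h below is our rendering of the theorem's definitions; the checks are restricted to d ≤ n/2, the remaining cases following from the symmetry d ↔ n − d):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def A(d, n):
    """Minimax number of tests for a line algorithm: n items, d defectives."""
    if d == 0 or d == n:
        return 0
    best = n                                   # loose upper bound to start from
    for i in range(1, n):                      # first test on the first i items
        lo = max(0, d - (n - i))               # feasible defective counts in prefix
        hi = min(d, i)
        worst = max(A(j, i) + A(d - j, n - i) for j in range(lo, hi + 1))
        best = min(best, 1 + worst)
    return best

def closed_form(d, n):
    """kd + h - 1 with k = floor(log2(n/d)) + 1 and h = ceil(n / 2^(k-1)) - d."""
    k = 1
    while d * 2 ** k <= n:
        k += 1
    h = -(-n // 2 ** (k - 1)) - d
    return k * d + h - 1
```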
(also see Aigner [1]) showed that g(2,n) ≤ log_φ n = 2.28 log_3 n, where φ = (1 + √5)/2 is the golden ratio. Hao [12] considered the problem of identifying two defectives from two disjoint sets of sizes m and n, each containing exactly one defective. Let H(m,n) denote the number of tests required by a minimax algorithm.

Lemma 6.2.6 H(ab, cd) ≤ H(a,c) + H(b,d).

Proof. Partition the ab (or cd) items into a (or c) classes of b (or d) items each. Use H(a,c) tests to identify the two classes containing defectives. Then use H(b,d) tests to identify the two defectives from the two contaminated classes. □

Lemma 6.2.7 There exist infinitely many n such that H(n,n) ≥ g(2,n).

Proof. If not, then there exists an N such that for all n ≥ N, H(n,n) < g(2,n). Now consider a set of n = 2^r N items and the algorithm which first tests a set of 2^{r−1}N items and then uses the minimax subalgorithm. Then

g(2, 2^r N) ≤ 1 + max{H(2^{r−1}N, 2^{r−1}N), g(2, 2^{r−1}N)} = 1 + g(2, 2^{r−1}N) .

Repeat this to obtain

g(2, 2^r N) ≤ r + g(2, N) ,

which implies

lim sup_{n→∞} g(2,n)/log_3 n ≤ log 3 .

But an application of Theorem 1.2.1 yields

g(2,n) ≥ log_3 C(n,2) ≈ 2 log_3 n ,

which implies

lim inf_{n→∞} g(2,n)/log_3 n ≥ 2 > log 3 ,

a contradiction. □
Using the above two lemmas, Hao proved that H(n,n)/log_3 n, hence g(2,n)/log_3 n, converges to a constant ≤ 12/log_3 330 = 2.27. Recently, Gargano, Montuori, Setaro and Vaccaro [11] obtained a better upper bound on g(2,n) by strengthening Lemma 6.2.7.

Lemma 6.2.8 For any positive integers m and n, m ≤ n,

g(2,n) ≤ max_k {k + H(m^{K(m)}, m^{K(m)})} ,

where K(m) = ⌈log_m n − k log_m 2⌉ .
Proof. Consider the algorithm which tests half of the current set containing two defectives until this set is split into two subsets each containing one defective. Then invoke H(x,x). Suppose that k halving tests are used. Then

x ≤ ⌈n/2^k⌉ ≤ m^{K(m)} . □

By displaying the binary tree demonstrating H(32,32) = 7, and using Lemma 6.2.8, the following is proved.
Theorem 6.2.9 g(2,n) ≤ 2.18... log_3 n + O(1).

Proof.

g(2,n) = max_k {k + H(32^{K(32)}, 32^{K(32)})} ≤ max_k {k + 7K(32)} = 2.18... log_3 n + O(1) . □
6.3
A Maximum Model
Let θ_i denote the defectiveness measure as in the last section. In the maximum model, the outcome of testing a group G is max_{x_i∈G} θ_i (0 if G contains no defective). This model applies where there exist different degrees of defectiveness and what is observed in a test is the consequence caused by the most severe defectiveness. Clearly,

log_{d+1} n(n − 1) ··· (n − d + 1) ~ d log_{d+1} n

is an information lower bound. On the other hand, the d defectives can be identified one by one, in the order of severeness, by treating the currently most severe one as the only defective (and removing it after identification). The i-th severest defective then requires at most ⌈log(n − i + 1)⌉ tests, yielding the upper bound

Σ_{i=0}^{d−1} ⌈log(n − i)⌉ ~ d log n .

Obviously, this upper bound has not made full use of the information about those defectives which are not most severe. For example, if θ_i is the currently largest θ and the outcome of a test of group G is θ_j, j ≠ i, then the information that x_j is the most severe defective in G is ignored. For d = 2 Hwang and Xu [16] gave a significantly better algorithm than the above naive one. Let x denote the less severe defective and y the more severe one. They considered two configurations [a × b] and [n, b]. The former denotes the case that x is known to lie in a set A of a items and y in a set B of b items, where A and B are disjoint. The latter denotes the case that x is known to lie in a set N of n items of which a subset B of b items contains y. Note that [n,n] denotes the original sample space. Let M[a × b] and M[n, b] denote the minimax numbers of tests for these two configurations. Let b_m(a) denote the largest b such that M[a × b] ≤ m.
Lemma 6.3.1 Suppose 2^α ≥ a > 2^{α−1}. Then b_m(a) is undefined for m < α, and

b_m(a) = Σ_{t=α}^{m} C(m, t) for m ≥ α .
Proof. Clearly, M[a × b] is nondecreasing in both a and b. Suppose α > m. Then M[a × b] ≥ M[a × 1] = ⌈log a⌉ = α > m, hence b_m(a) is undefined. It is also clear that b_m(a) = 1 = Σ_{t=α}^{α} C(α, t) for m = α. The general m case is proved by induction on both m and a. Let 0, x, y denote the three outcomes in a natural way. Assume that the first test group G consists of i items from A and j items from B. Then the new configuration after the testing is

[(a − i) × (b − j)] if the outcome is 0,
[i × (b − j)] if the outcome is x,
[a × j] if the outcome is y.

Since M[a × j] ≤ m − 1 is required, j ≤ b_{m−1}(a). Similarly,

b − j ≤ min{b_{m−1}(a − i), b_{m−1}(i)} .

By the induction hypothesis, i = ⌊a/2⌋ maximizes the above minimum. Therefore

b_m(a) = b_{m−1}(a) + b_{m−1}(⌊a/2⌋) = Σ_{t=α}^{m−1} C(m−1, t) + Σ_{t=α−1}^{m−1} C(m−1, t) = Σ_{t=α}^{m} C(m, t) . □
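The binomial-tail closed form of Lemma 6.3.1 meshes with the recursion b_m(a) = b_{m−1}(a) + b_{m−1}(⌊a/2⌋) derived in its proof: halving a lowers α = ⌈log a⌉ by one, and the two tails then combine by Pascal's rule. A quick numerical confirmation (our naming):

```python
from math import comb

def b_closed(m, alpha):
    """Binomial-tail form of Lemma 6.3.1, indexed by alpha = ceil(log2 a)."""
    return sum(comb(m, t) for t in range(alpha, m + 1))

# Pascal's rule makes the tail satisfy the proof's recursion:
#   b_closed(m, alpha) = b_closed(m-1, alpha) + b_closed(m-1, alpha-1),
# with boundary value b_closed(alpha, alpha) = 1.
```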
Define

f(0,0) = f(0,1) = 1 ,
f(4t, 0) = 2^{3t−1} + 1 for t ≥ 1 ,
f(4t, k) = f(4t − 1, k − 1) for t ≥ 1, k = 1, ..., t + 1 ,
f(4t + i, k) = min{ Σ_{j=k}^{t+1} f(4t + i − 1, j), 2^{3t+i−1} + e(k) } for t ≥ 0, i = 1, 2, 3, k = 0, 1, ..., t + 1 ,

where e(k) = 1 if k = 0 and e(k) = 0 otherwise. f(4t + i, k) will later be used as the b value in [n, b]. Hwang and Xu proved
6.3 A Maximum
Model
117
Lemma 6.3.2

(i) f(4t + i, k) ≤ Σ_{s=3t+i−1}^{4t+i−k} C(4t + i − k, s).
(ii) Σ_{k=0}^{t+1} f(4t + 3, k) > 2^{3t}/5 for t ≥ 4.
(iii) f(4t + 2, 1) > 2^{3t−1} for t ≥ 0.
(iv) f(4t + 3, 0) > 2^{3t+1} for t ≥ 0.
The proofs are skipped as they are quite involved.

Theorem 6.3.3 M[2^{3t−1+i} + 1, f(4t + i, k)] ≤ 4t + i − k for 4t + i ≥ 1.

Proof. The proof is by induction on (4t + i, k) in the lexicographical order. It is easily verified that f(4t + i, t + 1) = 1 for all t and i, and

M[2^{3t−1+i} + 1, f(4t + i, t + 1)] = M[2^{3t−1+i} + 1, 1] = 3t − 1 + i = 4t + i − (t + 1) .

For the pair (4t + i, k), k ≠ t + 1, i ≠ 0, first test a group of 2^{3t−2+i} + 1 items, with f(4t + i, k + 1) items from the set of f(4t + i, k) items. The new configuration after the testing is

[2^{3t−2+i} + 1, f(4t + i − 1, k)] if the outcome is 0,
[2^{3t−2+i} × f(4t + i − 1, k)] if the outcome is x,
[2^{3t−1+i} + 1, f(4t + i, k + 1)] if the outcome is y.

By induction, Lemma 6.3.1 and Lemma 6.3.2(i), 4t + i − 1 − k more tests suffice. For the pair (4t, k), t ≥ k ≥ 1,

M[2^{3t−1} + 1, f(4t, k)] = M[2^{3(t−1)−1+3} + 1, f(4(t−1) + 3, k − 1)] ≤ 4(t−1) + 3 − (k − 1) = 4t − k .

Finally, for the pair (4t, 0) first test a group of f(4t − 1, 0) items. The new configuration after the testing is

[2^{3t−1} + 1 − f(4t − 1, 0), 2^{3t−1} + 1 − f(4t − 1, 0)] if the outcome is 0,
[f(4t − 1, 0) × (2^{3t−1} + 1 − f(4t − 1, 0))] if the outcome is x,
[2^{3t−1} + 1, f(4t − 1, 0)] if the outcome is y.

By Lemma 6.3.2(i),

2^{3t−1} + 1 − f(4t − 1, 0) ≤ f(4t − 1, 0) ,

hence M[2^{3t−1} + 1 − f(4t − 1, 0), 2^{3t−1} + 1 − f(4t − 1, 0)] ≤ M[2^{3t−1} + 1, f(4t − 1, 0)] = 4t − 1 by induction. Furthermore,

M[f(4t − 1, 0) × (2^{3t−1} + 1 − f(4t − 1, 0))] ≤ M[2^{3t−1} × f(4t − 1, 0)] ≤ 4t − 1

by Lemmas 6.3.1 and 6.3.2(i). □
Corollary 6.3.4 M(2, f(4t + i, k)) ≤ 4t + i − k for 4t + i ≥ 1.

Since f(4t + i, k) is roughly 2^{3t}, M(2,n) is approximately (4/3) log n, which is an improvement over the naive upper bound 2 log n. How to extend this improvement to general d is still an open problem.
6.4
Some Models for d = 2
Let x and y be the two defectives. In the candy factory model there are three possible outcomes: x, y and 0/xy, meaning that the test group contains x only, y only, or else none or both. Christen [6] gave a picturesque setting for the model. Workers of a candy factory pack boxes containing a fixed number of equally heavy candy pieces, but one mischievous worker shifts a piece of candy from one box to another (Figure 6.2). So
Figure 6.2: One mischievous worker shifts a piece of candy from one box to another.

one box becomes too heavy and the other too light, but their total weight remains constant. If a test group contains both the heavy and the light box, a weighing cannot differentiate the group from a group containing neither. Christen noted that the division of the sample space induced by the three outcomes is not balanced (in favor of the 0/xy outcome). Therefore the usual information lower bound, which depends only on counting but not on the structure of the problem, is an underestimate. For example, a set of eight items admits 8 × 7 = 56 possible pairs (x,y). If a group of four items is tested, then the distribution of these 56 pairs into the three subspaces induced by the three outcomes x, y, 0/xy is 16 : 16 : 24. In general, after j tests all with the 0/xy outcome, the n items are partitioned into up to 2^j subsets where both x and y lie in one of these subsets. Christen proved

Lemma 6.4.1 (i) The minimum of 2 Σ_{i=1}^m C(r_i, 2) under the constraint Σ_{i=1}^m r_i = n occurs exactly when m − n + ⌊n/m⌋m of the r_i are equal to ⌊n/m⌋ and the others to ⌈n/m⌉. (ii) The above minimum is equal to (⌈n/m⌉ − 1)(2n − ⌈n/m⌉m). (iii) For fixed n > m the minimum is decreasing in m.
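The minimum value in Lemma 6.4.1(ii) is easy to confirm by brute force for small n and m, writing 2 Σ C(r_i, 2) as Σ r_i(r_i − 1) and allowing empty parts (a sketch, our naming):

```python
from itertools import product

def min_same_subset_pairs(n, m):
    """Brute-force minimum of sum r_i (r_i - 1) over r_1 + ... + r_m = n, r_i >= 0."""
    return min(sum(r * (r - 1) for r in rs)
               for rs in product(range(n + 1), repeat=m) if sum(rs) == n)

def balanced_formula(n, m):
    """(ceil(n/m) - 1) * (2n - ceil(n/m) * m), as in Lemma 6.4.1(ii)."""
    q = -(-n // m)          # ceil(n/m)
    return (q - 1) * (2 * n - q * m)
```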
The proof is straightforward and omitted. The oversize of the subspace induced by a sequence of 0/xy outcomes forces a better lower bound which, amazingly, was shown by Christen to be achievable. Thus

Theorem 6.4.2 The maximum number f(t) of items such that the two defectives in the candy factory model can be identified in t tests is

f(t) = 7                           if t = 4,
f(t) = (3^{t−h} − 1 + 2^{h+1})/2   if 1 + 2^h ≤ 3^{t−h},
f(t) = ⌊3(3^{t−h} + 2^h)/4⌋        otherwise,

where h = t − ⌊(1 + t)/log 6⌋.

Proof. First it is shown that n = f(t) + 1 items need at least t + 1 tests. The case t = 4 can be directly verified. For t ≥ 5, it is easily shown that

3^{t−h−1} + 1 ≤ 2^h ≤ 3^{t−h+1} − 1.
Set n = f(t) + 1 in Lemma 6.4.1. By Lemma 6.4.1(iii) it suffices to consider the largest possible m. Two cases have to be distinguished.
(i) 2^h ≤ 3^{t−h} − 1. Consider the subspace after h 0/xy outcomes. Then m = 2^h and

2m > n = f(t) + 1 = (3^{t−h} + 1 + 2^{h+1})/2 > m.

Hence ⌈n/m⌉ = 2. From Lemma 6.4.1(ii), the size of the subspace is 2(n − m) = 3^{t−h} + 1. Thus at least t − h + 1 more tests are required.
(ii) 3^{t−h} + 1 ≤ 2^h. Consider the subspace after h − 1 0/xy outcomes. Then m = 2^{h−1} and

4m > n = f(t) + 1 = ⌊3(3^{t−h} + 2^h)/4⌋ + 1 > 2m.

Hence 4 ≥ ⌈n/m⌉ ≥ 3. From Lemma 6.4.1(ii), the size of the subspace is

(⌈n/m⌉ − 1)(2n − ⌈n/m⌉m) = 4n − 6m + (⌈n/m⌉ − 3)(2n − ⌈n/m⌉m − 2m)
  ≥ 4n − 6m
  = 4⌊3(3^{t−h} + 2^h)/4⌋ + 4 − 3·2^h
  ≥ 3^{t−h+1} + 1.
Thus at least t − h + 2 more tests are required.
That f(t) items can always be done in t tests is proved by construction. Let [Σ_{i=1}^m r_i] denote the configuration in which there are m disjoint sets of sizes r_1, …, r_m, and the two defectives lie in one of the m sets. Let [Σ_{i=1}^m (p_i × q_i)] denote the configuration in which there are m disjoint sets, each of which is further partitioned into two subsets of p_i and q_i items. The two defectives always lie in the same set i,
with x among the p_i items and y among the q_i items. It is easily seen that once an x or y outcome is obtained, the n items are partitioned in this manner. In this configuration the prescribed algorithm takes ⌊p_i/2⌋ and ⌊q_i/2⌋ items from each set i, as long as p_i ≥ 4 and q_i ≥ 4 for each i, to form a test group. In the [Σ_{i=1}^m r_i] configuration, the algorithm takes ⌊r_i/2⌋ items from each set, as long as r_i ≥ 4 for each i, to form a test group. Whenever the p_i ≥ 4, q_i ≥ 4, r_i ≥ 4 condition is violated, Christen gave some ad hoc procedures to take care of the situation. He then showed that this algorithm does f(t) items in t tests. The detailed verification is omitted here. □

Suppose that in the candy factory example a weighing can only identify an underweight box. Then the possible outcomes are x and 0/y/xy, the former indicating that the group contains x but not y. This model was called the underweight model by Hwang [14], who gave an algorithm requiring 3 log n − 1 tests. Ko [17] gave a recursive algorithm with better asymptotic results. Let [m(p × q)] denote the configuration [Σ_{i=1}^m (p_i × q_i)] if p_i = p and q_i = q for all i, and let [m(r)] denote the configuration [Σ_{i=1}^m r_i] if r_i = r for all i.

Lemma 6.4.3 Let 0 < k < l and p = C(l, k). Given the configuration [m(pq)], there exist l tests such that
(i) if one test outcome is x, then the new configuration is [m(kpq/l × (l − k)pq/l)];
(ii) if none of the test outcomes is x, then the new configuration is [mp(q)].

Proof. First assume m = q = 1. Construct an l × p matrix M by having the k-subsets of the set {1, …, l} as columns. Then each row has kp/l 1s. Associate each column with an item and treat rows as tests. Suppose row j yields the outcome x. Then x must be one of the columns having a 1-entry in row j, and y must not be. So the new configuration is [(kp/l × (l − k)p/l)], and (i) is verified. Clearly, for any two columns C, C′ of M, there must exist a row with a 1 in C and a 0 in C′. So (ii) holds trivially.
In the general case partition the m·pq items into p sets of mq items each, with q items coming from each of the m groups. Associate the columns of M with the p sets. Then (i) and (ii) are easily verified. □

Ko's algorithm K(l): For given l, set k = ⌊l/2⌋ and p = C(l, k). Suppose p^{a−1} < n ≤ p^a. Add dummy items to make the initial configuration [(p^a)]. There are 2a steps, starting with Step 0.

Step m (0 ≤ m < a). Assume that the current configuration is [p^m(p^{a−m})]. Apply Lemma 6.4.3 to find l tests satisfying (i) and (ii) of Lemma 6.4.3. If one of the tests has outcome x, go to Step m′. If not, go to Step m + 1.
Step m′ (0 ≤ m < a). Assume that the current configuration is [p^m(kp^{a−m}/l × (l − k)p^{a−m}/l)]. Use binary splitting to identify x among the kp^{a−m}/l items. Then, by inserting x into every test, y can be identified from the (l − k)p^{a−m}/l items, again by binary splitting.
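The matrix M underlying Lemma 6.4.3 and the algorithm K(l) is easy to construct and check for small parameters. A sketch in Python, with the illustrative values l = 6 and k = 3 (so k = ⌊l/2⌋, as in K(l)):

```python
from itertools import combinations

# The l x p matrix M of Lemma 6.4.3: its columns are the k-subsets
# of {1, ..., l}, and its rows are the l tests.
l, k = 6, 3
cols = list(combinations(range(1, l + 1), k))
p = len(cols)                        # p = C(l, k) = 20

# Row j of M has a 1 in exactly the columns whose k-subset contains j.
rows = [[1 if j in c else 0 for c in cols] for j in range(1, l + 1)]

# Every row has exactly kp/l 1s.
assert all(sum(row) == k * p // l for row in rows)

# For any two distinct columns C, C' there is a row with a 1 in C and
# a 0 in C' (no k-subset contains another distinct k-subset).
for a, b in combinations(range(p), 2):
    assert any(row[a] == 1 and row[b] == 0 for row in rows)
    assert any(row[b] == 1 and row[a] == 0 for row in rows)
```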
Lemma 6.4.3 assures that K(l) is a correct algorithm.

Theorem 6.4.4 The number of tests required by K(l) satisfies

M_{K(l)}(n) ≤ (2 + ε_l) log n + 3l, where ε_l = (l − log p)/log p.

By setting l = ⌈√(log n)⌉, one obtains

Corollary 6.4.5 M_{K(l)}(n) = 2 log n + O(log log n · √(log n)).

Proof. Since l − log l − 1 < log p ≤ l − (log l)/2, it follows that

M_{K(l)}(n) ≤ 2 log n + ((l − log p)/log p) log n + 3l.

Substituting log p > l − log l − 1 and l = ⌈√(log n)⌉, Corollary 6.4.5 is shown. □

Gargano, Körner and Vaccaro [10] considered the subproblem of identifying only x in the underweight model. Let M({x}, n) denote the minimax number of tests for this subproblem.

Theorem 6.4.6 If n = C(k, ⌊k/2⌋) for some integer k, then

M({x}, n) = k − 1 + ⌈log(n − C(k−1, ⌈(k−1)/2⌉))⌉.
Proof. For any algorithm A, represent its first k − 1 tests by a binary matrix. By Sperner's theorem, there exist at most C(k−1, ⌈(k−1)/2⌉) columns not containing each other. Since n is greater than this number, there exist columns a, b such that b contains a. Let a and b also denote the corresponding items. Then for each such pair (a, b), x = a (with y = b) is a solution consistent with the test outcomes. Let X denote the set of such items a. Then the set of columns labeled by items not in X constitutes a Sperner family on k − 1 elements. Thus

|X| ≥ n − C(k−1, ⌈(k−1)/2⌉),
which means that A needs at least ⌈log |X|⌉ more tests. On the other hand, let the n items be represented by the ⌊k/2⌋-subsets of the set {1, …, k}, and treat the rows of the resulting matrix as tests. Since no column is contained in another column, one of the tests must yield the x outcome. But the last test can be skipped, since its x outcome can be deduced if the first k − 1 tests yield no x outcome. Since each row has n⌊k/2⌋/k = C(k−1, ⌊k/2⌋−1) 1s, once a row with the x outcome is found, it takes at most ⌈log C(k−1, ⌊k/2⌋−1)⌉ more tests. □

Corollary 6.4.7 M({x}, n) ≥ 2 log n + ½ log log n − 4.
Proof. Assume that C(k−1, ⌈(k−1)/2⌉) ≤ n < C(k, ⌈k/2⌉). Using Stirling's formula, it is easily verified that

k ≥ log n + (log log n)/2.

Since M({x}, n) is nondecreasing in n,

M({x}, n) ≥ M({x}, C(k−1, ⌈(k−1)/2⌉))
  = k − 2 + ⌈log C(k−2, ⌊(k−1)/2⌋ − 1)⌉
  ≥ 2 log n + ½ log log n − 4. □
Clearly, a lower bound of M({x}, n) is also a lower bound of M(n) for the underweight model. From Corollaries 6.4.5 and 6.4.7, M(n) − 2 log n lies between ½ log log n and log log n·√(log n), up to additive constants.
Hwang also studied the parity model, whose outcomes are odd and even, the parity of the number of defectives in the test group. He observed that, by using the 1-separable property of the binary representation matrix, there always exists a row containing one defective but not the other. Suppose that row j is the first such row. Since each row has at most n/2 1s, by binary splitting one defective can be identified in ⌈log n⌉ − 1 more tests. The other defective must lie in the set of 0s in row j, but in the same set as the identified defective in row i for i = 1, …, j − 1. There are at most n/2^j items satisfying this requirement. Hence the other defective can be identified in ⌈log n⌉ − j more tests. The total number of tests is

j + (⌈log n⌉ − 1) + (⌈log n⌉ − j) = 2⌈log n⌉ − 1,

which is at most one more than the information lower bound. Chang, Hwang and Weng [5] proposed a better algorithm H which achieves the information lower bound except on a set of n with measure zero. Their algorithm encounters only the two configurations [Σ_{i=1}^m r_i] and [Σ_{i=1}^m (p_i × q_i)] defined as before.
In the former configuration, the next test group consists of ⌊r_i/2⌋ items from each r_i. In the latter configuration, the next test group takes items from each p_i × q_i group in such a way that the test splits Σ_{i=1}^m (p_i × q_i) evenly (it is easily verified that this can always be done). Represent the algorithm H by a binary tree with the left branch always denoting the "odd" outcome. Then it can be shown that the sizes of the configurations corresponding to the nodes at a given level are ordered from left to right. Therefore only the leftmost node at each level needs to be considered. Furthermore, since the split of the leftmost node is always even, the number of tests is simply

j + ⌈log(⌊n/2^j⌋⌈n/2^j⌉)⌉,

where ⌊n/2^j⌋ × ⌈n/2^j⌉ is the configuration after the first test yielding an "odd" outcome. Thus H achieves the information lower bound except when n belongs to an exceptional set N of values for which

⌈log(⌊n/2⌋⌈n/2⌉)⌉ > ⌈log C(n, 2)⌉ − 1.

They also showed that although N is an infinite set, its density is (log n)/(2n). Hwang [13, 14] surveyed all nonequivalent models for two possibly different defectives.
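The 1-separable property underlying the parity-model observation, namely that any two distinct items differ in some bit of their binary representations, can be checked directly. A minimal sketch in Python (the function name and the choice n = 16 are illustrative; this is not the algorithm H itself):

```python
def separating_row(n, x, y):
    """Return a bit position j whose test group (the items with bit j
    set) contains exactly one of the two defectives x and y, or None.
    Illustrative check of the 1-separable property only."""
    t = max(1, (n - 1).bit_length())        # number of rows of the matrix
    for j in range(t):
        group = [i for i in range(n) if (i >> j) & 1]
        if ((x in group) + (y in group)) == 1:   # odd parity observed
            return j
    return None

# Any two distinct items differ in some bit, so a separating row exists;
# here each row contains at most n/2 of the n items.
n = 16
for j in range(4):
    assert sum((i >> j) & 1 for i in range(n)) <= n // 2
for x in range(n):
    for y in range(x + 1, n):
        assert separating_row(n, x, y) is not None
```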
References

[1] M. Aigner, Search problems on graphs, Disc. Appl. Math. 14 (1986) 215-230.
[2] M. Aigner and M. Schughart, Determining defectives in a linear order, J. Statist. Plan. Inform. 12 (1985) 359-368.
[3] D. G. Cantor, Determining a set from the cardinalities of its intersections with other sets, Canad. J. Math. 16 (1964) 94-97.
[4] D. G. Cantor and W. H. Mills, Determination of a subset from certain combinatorial properties, Canad. J. Math. 18 (1966) 42-48.
[5] X. M. Chang, F. K. Hwang and J. F. Weng, Optimal detection of two defectives with a parity check device, SIAM J. Disc. Math. (1988) 38-44.
[6] C. Christen, A Fibonaccian algorithm for the detection of two elements, Publ. 341, Dept. d'IRO, Univ. Montréal, 1980.
[7] C. Christen, Optimal detection of two complementary defectives, SIAM J. Alg. Disc. Methods 4 (1983) 101-110.
[8] P. Erdős and A. Rényi, On two problems of information theory, Publ. Math. Inst. Hung. Acad. Sci. 8 (1963) 241-254.
[9] N. J. Fine, Solution E1399, Amer. Math. Monthly 67 (1960) 697-698.
[10] L. Gargano, J. Körner and U. Vaccaro, Search problems for two irregular coins with incomplete feedback, Disc. Appl. Math. 36 (1992) 191-197.
[11] L. Gargano, V. Montuori, G. Setaro and U. Vaccaro, An improved algorithm for quantitative group testing, Disc. Appl. Math. 36 (1992) 299-306.
[12] F. H. Hao, The optimal procedures for quantitative group testing, Disc. Appl. Math. 26 (1990) 79-86.
[13] F. K. Hwang, Three versions of a group testing game, SIAM J. Alg. Disc. Method 5 (1984) 145-153.
[14] F. K. Hwang, A tale of two coins, Amer. Math. Monthly 94 (1987) 121-129.
[15] F. K. Hwang, Updating a tale of two coins, Graph Theory and Its Applications: East and West, ed. M. F. Capobianco, M. Guan, D. F. Hsu and F. Tian, (New York Acad. Sci., New York) 1989, 259-265.
[16] F. K. Hwang and Y. H. Xu, Group testing to identify one defective and one mediocre item, J. Statist. Plan. Infer. 17 (1987) 367-373.
[17] K. I. Ko, Searching for two objects by underweight feedback, SIAM J. Disc. Math. 1 (1988) 65-70.
[18] V. Koubek and J. Rajlich, Combinatorics of separation by binary matrices, Disc. Math. 57 (1985) 203-208.
[19] B. Lindström, On a combinatory detection problem I, Publ. Math. Inst. Hung. Acad. Sci. 9 (1964) 195-207.
[20] B. Lindström, Determining subsets by unramified experiments, A Survey of Statistical Designs and Linear Models, ed. J. N. Srivastava, (North Holland, Amsterdam, 1975) 407-418.
[21] C. G. Pfeifer and P. Enis, Dorfman type group testing for a modified binomial model, J. Amer. Statist. Assoc. 73 (1978) 588-592.
[22] H. S. Shapiro, Problem E1399, Amer. Math. Monthly 67 (1960) 82.
[23] M. Sobel, S. Kumar and S. Blumenthal, Symmetric binomial group-testing with three outcomes, Statist. Decision Theory and Related Topics, ed. S. S. Gupta and J. Yackel, (Academic, 1971) 119-160.
[24] S. Soderberg and H. S. Shapiro, A combinatory detection problem, Amer. Math. Monthly 70 (1963) 1066-1070.
7 Competitive Group Testing
In the previous chapters, the number of defectives, or an upper bound on it, was generally assumed to be known (except in Sections 3.2 and 5.1). However, in practice one may have no such information; perhaps one knows only that defectives exist, and nothing else. How effective can group testing still be in this situation? We discuss this problem in this chapter.
7.1
The First Competitiveness
For convenience, assume that no information about the defectives is known. In fact, knowing the existence of defectives saves at most one test, and it does not affect the results in this chapter. Let us start our discussion with an example. Consider a tree T as shown in Figure 7.1.

Figure 7.1: Tree T.

An ordering of the nodes of T is given as follows: (1) The root is first; it is followed by the nodes at the second level, then the nodes at the third level, and so on. (2) Nodes at the same level are ordered from left to right. The search along this ordering is called breadth-first search. Based on breadth-first search, an algorithm for twelve items is designed as follows:
input items 1, 2, …, 12;
repeat
  find an untested node X of T by breadth-first search;
  test X;
  if X is pure then prune all descendants of X from T;
until T has no untested node.

If the input sample contains only one defective, say 7, then the algorithm tests 9 nodes of T, as shown in Figure 7.2. If the input sample contains the six defectives 1, 3, 5, 7, 9 and 11, then the algorithm tests all nodes of T in Figure 7.1. This example
Figure 7.2: Only 7 is defective.

tells us that although no information about the defectives is known, group testing can still save tests if the input contains a small number of defectives. But if the number of defectives in the input is too large, then group testing may take more tests than individual testing does. So the behavior of an algorithm depends on the number of defectives in the input. Motivated by the study of on-line algorithms [8, 10] and the above situation, Du and Hwang [3] proposed the following criterion for group testing algorithms with unknown d, the number of defectives. For an algorithm α, let N_α(s) denote the number of tests used by α on the sample s. Define

M_α(d | n) = max_{s ∈ S(d,n)} N_α(s),

where S(d, n) is the set of samples of n items containing d defectives. An algorithm α is called a c-competitive algorithm if there exists a constant a such that for 0 ≤ d < n, M_α(d | n) ≤ c·M(d, n) + a. Note that the case d = n is excluded in the definition because M(n, n) = 0 and for any algorithm α, M_α(n | n) ≥ n. A c-competitive algorithm for a constant c is simply called a competitive algorithm, and c is called the competitive ratio of the algorithm.
To establish competitiveness of an algorithm, a lower bound for M(d, n) is usually needed. The following lemma is quite useful.

Lemma 7.1.1 For 0 < d ≤ pn,

M(d, n) ≥ d(log(n/d) + log(e√(1 − p))) − 0.5 log d − 0.5 log(1 − p) − 1.567.
Proof. Note that the information lower bound for M(d, n) is ⌈log C(n, d)⌉. Since n/d ≤ (n − i)/(d − i) for 0 ≤ i < d, M(d, n) ≥ d log(n/d). A sharper estimate is obtained by using Stirling's formula n! = √(2πn)(n/e)^n e^{θ/(12n)} (0 < θ < 1) [9]:

C(n, d) ≥ √(n/(2πd(n − d))) (n/d)^d (n/(n − d))^{n−d} exp(1/(12n) − 1/(12d) − 1/(12(n − d)))
  ≥ (n/d)^d (e√(1 − p))^d (1 − p)^{−0.5} 2^{−1.567}/√d.

Thus M(d, n) ≥ d(log(n/d) + log(e√(1 − p))) − 0.5 log d − 0.5 log(1 − p) − 1.567. □
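Since the chain of inequalities in the proof shows that the right-hand side of Lemma 7.1.1 never exceeds the information lower bound log C(n, d), the bound can be spot-checked numerically. A sketch in Python with p = 8/21 (the range of instances checked is an arbitrary choice):

```python
from math import comb, log2, floor, sqrt, e

def lemma_7_1_1_bound(d, n, p):
    """The lower bound of Lemma 7.1.1 (logarithms base 2)."""
    return (d * (log2(n / d) + log2(e * sqrt(1 - p)))
            - 0.5 * log2(d) - 0.5 * log2(1 - p) - 1.567)

# The bound should never exceed the information lower bound log C(n, d).
p = 8 / 21
for n in range(3, 200):
    for d in range(1, floor(p * n) + 1):
        assert lemma_7_1_1_bound(d, n, p) <= log2(comb(n, d)) + 1e-9
```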
For convenience, assume that the value of the function d log(n/d) at d = 0 is 0, because lim_{d→0} d log(n/d) = 0. Clearly, with this assumption, the lower bound in Lemma 7.1.1 also holds for d = 0. Du and Hwang [2] showed that for d/n ≥ 8/21, M(d, n) = n − 1 (Theorem 3.5.3). Applying the above result with p = 8/21, the following is obtained.

Corollary 7.1.2 For 0 < d ≤ (8/21)n,

M(d, n) ≥ d(log(n/d) + 1.096) − 0.5 log d − 1.222.

7.2 Bisecting
Let S be a set of n items. The principle of the bisecting algorithm is that at each step, if a contaminated subset X of S is discovered, then X is bisected and the two resulting subsets X′ and X″ are tested. How one bisects affects the competitive ratio. A good way is to choose X′ to contain 2^{⌈log |X|⌉−1} items and X″ = X \ X′, where |X| denotes the number of elements in X. Let G be a bin for good items and D a bin for defectives. Let TEST(X) denote the event that a test on the set X is performed. The following is a bisecting algorithm.
Algorithm A1:
input S;
G ← ∅; D ← ∅;
TEST(S);
if S is pure then G ← S and Q ← ∅ else Q ← {S};
repeat
  pop the frontier element X of queue Q;
  bisect X into X′ and X″;
  TEST(X′);
  if X′ is contaminated then TEST(X″);
  {if X′ is pure, then it is known that X″ is contaminated.}
  for Y ← X′ and X″ do
  begin
    if Y is pure then G ← G ∪ Y;
    if Y is a contaminated singleton then D ← D ∪ Y;
    if Y is contaminated but not a singleton then push Y into the queue Q
  end-for;
until Q = ∅
end-algorithm.

Algorithm A1 is a variation of the algorithm demonstrated in the example in the last section. Instead of pruning a tree, algorithm A1 builds up a binary tree T during the computation. However, the two algorithms end with the same tree T provided that they bisect every set in the same way. Let T* be the binary tree T at the end of the computation. Then all tested sets are nodes of T*. For the algorithm in the last section, every node of T* is a tested set. However, for algorithm A1, some nodes, such as the X″'s, may not be tested sets. In the following, an upper bound on the number of nodes of T* is established. This means that the analysis below does not take advantage of the possible savings of A1; the analysis holds for both algorithms.

Lemma 7.2.1 M_{A1}(d | n) ≤ 2n − 1 for any d.

Proof. A binary tree is a rooted tree with the property that each internal node has exactly two sons. A node is said to be on the kth level of the tree if the path from the root to the node has length k − 1; so the root is on the first level. Let i be the number of nodes in a binary tree and j the number of internal nodes in the tree. It is well known
that i = 2j + 1. Consider the binary tree T*. Note that each leaf of T* must identify at least one distinct item. So T* has at most n leaves. It follows that T* has at most 2n − 1 nodes. Therefore M_{A1}(d | n) ≤ 2n − 1. □
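Algorithm A1 can be sketched directly from the pseudocode above. In the sketch below, the set of defectives plays the role of the testing oracle (it is not part of the original pseudocode), and the instance sizes are illustrative:

```python
from math import ceil, log2

def bisect_algorithm(items, defectives):
    """Sketch of the bisecting algorithm A1.  `defectives` is an
    assumed membership oracle.  Returns the identified defectives
    and the number of tests used."""
    tests = 0

    def test(group):                 # TEST(X): is the group contaminated?
        nonlocal tests
        tests += 1
        return any(i in defectives for i in group)

    D = set()
    queue = []
    if test(items):
        queue.append(list(items))
    while queue:
        x = queue.pop(0)             # pop the frontier element of Q
        half = 2 ** (ceil(log2(len(x))) - 1)
        x1, x2 = x[:half], x[half:]  # X' gets 2^(ceil(log|X|)-1) items
        c1 = test(x1)
        c2 = test(x2) if c1 else True  # if X' is pure, X'' is contaminated
        for y, contaminated in ((x1, c1), (x2, c2)):
            if contaminated:
                if len(y) == 1:
                    D.add(y[0])      # contaminated singleton: a defective
                else:
                    queue.append(y)
    return D, tests

found, t = bisect_algorithm(list(range(100)), {3, 41, 59})
assert found == {3, 41, 59}
assert t <= 2 * 100 - 1              # Lemma 7.2.1
```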
Note that if d/n ≥ 8/21, then M(d, n) = n − 1. Hence M_{A1}(d | n) ≤ 2M(d, n) + 1 for d/n ≥ 8/21. In the next three lemmas, another upper bound for M_{A1}(d | n) is shown, which is useful in the case d/n < 8/21. The following lemma is an important tool for the analysis.

Lemma 7.2.2 Let d = d′ + d″ and n = n′ + n″, where d′ ≥ 0, d″ ≥ 0, n′ > 0 and n″ > 0. Then

d′ log(n′/d′) + d″ log(n″/d″) ≤ d log(n/d).

Proof. Note that (d²/dx²)(−x log x) = −1/(x ln 2) < 0 for x > 0. So −x log x is a concave function. Thus

d′ log(n′/d′) + d″ log(n″/d″) = n[(n′/n)(−(d′/n′) log(d′/n′)) + (n″/n)(−(d″/n″) log(d″/n″))]
  ≤ n(−(d/n) log(d/n)) = d log(n/d). □
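The lemma can be spot-checked by brute force over all splits of a small instance. A sketch in Python (the instance n = 24, d = 7 is arbitrary; `g` adopts the convention that d log(n/d) is 0 at d = 0):

```python
from math import log2

def g(d, n):
    """d * log2(n/d), with the convention that the value at d = 0 is 0."""
    return 0.0 if d == 0 else d * log2(n / d)

# Check the concavity inequality of Lemma 7.2.2 over all splits.
n, d = 24, 7
for n1 in range(1, n):
    for d1 in range(0, d + 1):
        n2, d2 = n - n1, d - d1
        if d1 <= n1 and d2 <= n2:
            assert g(d1, n1) + g(d2, n2) <= g(d, n) + 1e-9
```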
Clearly, when n is a power of 2, the analysis is relatively easy, so this case is analyzed first.

Lemma 7.2.3 Let n be a power of 2. Then for 1 ≤ d ≤ n,

M_{A1}(d | n) ≤ 2d(log(n/d) + 1) − 1.
Proof. Consider the binary tree T*. Clearly, every internal node must be contaminated and there exist exactly d contaminated leaves. Next, count how many contaminated nodes the tree can have. Denote u = log n, v = ⌈log d⌉ and v′ = v − log d. The tree T* has u + 1 levels, and the ith level contains 2^{i−1} nodes. Note that each level has at most d contaminated nodes and the (v + 1)st level is the first one which has at least d nodes. Thus the number of contaminated nodes is at most

Σ_{i=1}^{v} 2^{i−1} + (u − v + 1)d = 2^v − 1 + d(log(n/d) + 1 − v′)
  = −1 + d(log(n/d) + 1 − v′ + 2^{v′})
  ≤ −1 + d(log(n/d) + 2).

The last inequality holds since f(v′) = −v′ + 2^{v′} is a convex function of v′ and v′ is between 0 and 1, so f(v′) ≤ max(f(0), f(1)) = 1. Thus T* has at most −1 + d(log(n/d) + 1) internal nodes and hence at most 2d(log(n/d) + 1) − 1 nodes. □

According to the method of bisecting, each level of the tree T* contains at most one node whose size is not a power of 2. This property plays an important role in the following.

Lemma 7.2.4 For 0 ≤ d < n,

M_{A1}(d | n) ≤ 2d(log(n/d) + 1) + 1.
Proof. It is proved by induction on n. For n = 1 it is trivial. For n > 1, let S be the set of n items. Note that for d = 0 only one test is needed, so the lemma holds obviously. Next assume d > 0 and consider two cases, according to the sets S′ and S″ obtained by bisecting S.
Case 1. S′ is contaminated. Since the number of items in S′ is a power of 2, algorithm A1 spends at most 2d′(log(|S′|/d′) + 1) − 1 tests on S′, where d′ is the number of defectives in S′. Let d″ be the number of defectives in S″. By the induction hypothesis, algorithm A1 spends at most 2d″(log(|S″|/d″) + 1) + 1 tests on S″. Adding the test on S, the total number of tests is at most

2d′(log(|S′|/d′) + 1) + 2d″(log(|S″|/d″) + 1) + 1 ≤ 2d(log(n/d) + 1) + 1,

by Lemma 7.2.2.
Case 2. S′ is pure. In this case algorithm A1 spends a test on S′ and at most 2d(log(|S″|/d) + 1) + 1 tests on S″. So, adding the test on S, the total number of tests is at most

2 + 2d(log(|S″|/d) + 1) + 1 ≤ 2 + 2d(log(n/(2d)) + 1) + 1 ≤ 2d(log(n/d) + 1) + 1,

since |S″| ≤ n/2 and d ≥ 1. □
Based on the above estimation, the following is obtained.

Theorem 7.2.5 M_{A1}(d | n) ≤ 2M(d, n) + 5 for 0 ≤ d ≤ n − 1.

Proof. By the remark made after the proof of Lemma 7.2.1, it suffices to consider the case d/n < 8/21. In this case, by Corollary 7.1.2,

M(d, n) ≥ d(log(n/d) + 1.096) − 0.5 log d − 1.222.

Thus, by Lemma 7.2.4,

M_{A1}(d | n) ≤ 2M(d, n) + 2(0.5 log d + 1.722 − 0.096d).

Denote h(d) = 0.5 log d − 0.096d. Note that h′(d) = 0.5/(d ln 2) − 0.096. So h(d) is decreasing for d ≥ 8 and increasing for d ≤ 7. Since d is an integer, h(d) ≤ max(h(7), h(8)) < 0.74. Therefore M_{A1}(d | n) ≤ 2M(d, n) + 5. □

The competitiveness of the bisecting algorithm was first proved by Du and Hwang [3], who presented a bisecting algorithm with competitive ratio 2.75. The above improvement was made by Du, Xue, Sun and Cheng [6].
7.3
Doubling
Bar-Noy, Hwang, Kessler and Kutten [1] proposed another idea for designing a competitive group testing algorithm. Their basic idea is as follows. Because d, the number of defectives, is unknown, the algorithm tries to estimate the value of d. If d is small, the algorithm would like to find large pure sets, while if d is large the algorithm would like to find small contaminated sets. To obtain this behavior, the algorithm uses a doubling strategy. It tests disjoint sets of sizes 1, 2, …, 2^i until a contaminated set is found; namely, the first i sets are pure and the last set is contaminated. So the algorithm identifies 1 + 2 + ⋯ + 2^{i−1} = 2^i − 1 good items and a contaminated set of size 2^i with i + 1 tests. Next, the algorithm identifies a defective from the contaminated set by a binary search, which takes i tests. Thus, in total, the algorithm spends 2i + 1 tests and identifies 2^i items. Note that if d = 0, the doubling process would take ⌈log |S|⌉ tests instead of one. Thus a test on S is inserted at the beginning to take care of this case. The following is a formal description of the doubling algorithm A2. First, introduce a function DIG which identifies a defective from a contaminated set X with ⌈log |X|⌉ tests.

function DIG(X);
Y ← X;
repeat
  Y′ ← ⌈|Y|/2⌉ items from Y;
  TEST(Y′);
  if Y′ is contaminated then Y ← Y′ else Y ← Y \ Y′;
until Y is a singleton;
DIG ← Y;
end-function;

The following is the main body of the algorithm.
Algorithm A2:
input S;
G ← ∅; D ← ∅;
while S ≠ ∅ do
  TEST(S);
  if S is pure then G ← G ∪ S; S ← ∅
  else
    k ← 1; X ← ∅;
    repeat
      X ← min(k, |S|) items from S;
      TEST(X);
      if X is pure then G ← G ∪ X; k ← 2k; S ← S \ X;
    until X is contaminated;
    D ← D ∪ DIG(X); S ← S \ DIG(X);
end-while;
end-algorithm.
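A sketch of algorithm A2 in Python, with DIG realized by binary splitting as above. The set of defectives serves as the testing oracle, and n and the defective labels are illustrative choices:

```python
def doubling_algorithm(n, defectives):
    """Sketch of the doubling algorithm A2.  `defectives` is an
    assumed membership oracle; returns the identified defectives
    and the number of tests used."""
    tests = 0

    def test(group):
        nonlocal tests
        tests += 1
        return any(i in defectives for i in group)

    def dig(x):
        # identify one defective in a contaminated set X (binary splitting)
        y = list(x)
        while len(y) > 1:
            y1 = y[: (len(y) + 1) // 2]
            y = y1 if test(y1) else y[len(y1):]
        return y[0]

    s = list(range(n))
    G, D = set(), set()
    while s:
        if not test(s):                 # TEST(S): handles d = 0 cheaply
            G.update(s)
            break
        k = 1
        while True:                     # doubling: sets of size 1, 2, 4, ...
            m = min(k, len(s))
            x, s = s[:m], s[m:]
            if test(x):
                break
            G.update(x)
            k *= 2
        d0 = dig(x)                     # binary search inside X
        D.add(d0)
        s = [i for i in x if i != d0] + s   # rest of X returns to the pool
    return D, tests

D, t = doubling_algorithm(64, {5, 17})
assert D == {5, 17}
assert t <= 2 * 64 - 1                  # Lemma 7.3.1
```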
Bar-Noy, Hwang, Kessler and Kutten [1] commented that A2 is a 2.16-competitive algorithm. We now show that it is 2-competitive by a more careful analysis.

Lemma 7.3.1 For 0 ≤ d < n, M_{A2}(d | n) ≤ 2n − 1.

Proof. We prove it by induction on d. For d = 0, it is trivial that M_{A2}(0 | n) = 1 ≤ 2n − 1. For d > 0, algorithm A2 first identifies 2^i − 1 good items and one defective with at most 2i + 2 tests for some i ≥ 0. By the induction hypothesis, the rest of the items are identified using at most 2(n − 2^i) − 1 tests. Thus we have

M_{A2}(d | n) ≤ 2(n − 2^i) − 1 + 2i + 2 ≤ 2n − 1. □
Lemma 7.3.2 For 1 ≤ d < n, M_{A2}(d | n) ≤ 2d(log(n/d) + 1).

Proof. We prove it by induction on d. For d = 1, algorithm A2 first identifies 2^i − 1 good items and one defective with 2i + 1 tests and then identifies the remaining n − 2^i good items with one test. Clearly, i ≤ log n. Therefore the lemma holds. For d > 1, algorithm A2 first identifies 2^i − 1 good items and one defective with at most 2i + 2 tests for some i ≥ 0. By the induction hypothesis, the remaining n − 2^i items are identified with at most 2(d − 1)(log((n − 2^i)/(d − 1)) + 1) tests. Thus the total number of tests is at most

2(log 2^i + 1) + 2(d − 1)(log((n − 2^i)/(d − 1)) + 1) ≤ 2d(log(n/d) + 1)

by Lemma 7.2.2. □

Theorem 7.3.3 For 1 ≤ d ≤ n − 1, M_{A2}(d | n) ≤ 2M(d, n) + 4.

Proof. Similar to the proof of Theorem 7.2.5. □
7.4
Jumping
As we mentioned in the last section, Bar-Noy, Hwang, Kessler and Kutten employed a trick on testing three items in their doubling algorithm. Du, Xue, Sun and Cheng [6] extended this technique and obtained a 1.65-competitive algorithm. This section is devoted to this algorithm. The basic idea is explained in Figure 7.3.

Figure 7.3: Doubling and jumping.

Instead of climbing stairs one by one in the doubling process, the jumping algorithm skips every other stair; that is, instead of testing disjoint sets of sizes 1, 2, …, 2^i, the algorithm tests disjoint sets of sizes 1+2, 4+8, …, 2^i + 2^{i+1} for even i until a contaminated set is found. In this way the algorithm identifies 2^i − 1 good items with i/2 tests instead of i tests. However, it finds a contaminated set of size 3·2^i instead of 2^i, which requires one more test on a subset of size 2^i in order to reduce the contaminated set either to size 2^i, or to size 2^{i+1} with 2^i newly identified good items. Let us first describe a procedure for three items, which is a modification of a procedure in [1]. The input for this procedure is a contaminated set of three items. With two tests, the procedure identifies either two defectives or at least one good item and one defective.

Procedure 3-TEST({x, y, z});
TEST(x);
TEST(y);
if x is defective then D ← D ∪ {x} else G ← G ∪ {x};
if y is defective then D ← D ∪ {y} else G ← G ∪ {y};
if x and y are both good then D ← D ∪ {z}; S ← S \ {x, y, z}
else S ← S \ {x, y};
end-procedure;

An extension of the above procedure is as follows. The input is a contaminated set of 3·2^k items (k > 0). The procedure first identifies either a contaminated set of size 2^k, or a pure set of size 2^k and a contaminated set of size 2^{k+1}, and then identifies a defective from the resulting contaminated set.

Procedure BIG-3-TEST(X);
X′ ← min(2^k, |X|) items from X;
TEST(X′);
if X′ is contaminated then X ← X′
else X ← X \ X′; G ← G ∪ X′; S ← S \ X′;
D ← D ∪ DIG(X);
S ← S \ DIG(X);
end-procedure;

The following is the main body of the algorithm.

Algorithm A3;
input S;
G ← ∅; D ← ∅;
while |S| > 3 do
  k ← 0;
  repeat {jumping process}
    X ← min(2^k + 2^{k+1}, |S|) items from S;
    TEST(X);
    if X is pure then G ← G ∪ X;
      S ← S \ X;
    k ← k + 2;
    if k = 10 then TEST(S); if S is pure then G ← G ∪ S; S ← ∅;
  until S = ∅ or X is contaminated;
  if X is contaminated then
    if k = 0 then 3-TEST(X);
    if k > 0 then BIG-3-TEST(X);
end-while;
while S ≠ ∅ do
  x ← an item from S;
  TEST(x);
  if x is good then G ← G ∪ {x};
  if x is defective then D ← D ∪ {x};
  S ← S \ {x};
end-while;
end-algorithm.
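Procedure 3-TEST can be checked by enumerating all seven contaminated patterns of {x, y, z}. A sketch in Python (an assumed dictionary, not in the original, encodes which of the three items are defective):

```python
from itertools import product

def three_test(is_defective):
    """Sketch of procedure 3-TEST on a contaminated set {x, y, z}.
    `is_defective` is an assumed oracle mapping item name -> bool."""
    tests = 2                            # TEST(x); TEST(y)
    G, D = set(), set()
    for item in ('x', 'y'):
        (D if is_defective[item] else G).add(item)
    if not is_defective['x'] and not is_defective['y']:
        D.add('z')                       # both good: z must be defective
    return G, D, tests

# Enumerate all seven contaminated patterns of {x, y, z}.
for pattern in product([False, True], repeat=3):
    if not any(pattern):
        continue                         # the input set is contaminated
    G, D, tests = three_test(dict(zip('xyz', pattern)))
    assert tests == 2
    # either two defectives, or at least one good item and one defective
    assert len(D) >= 2 or (len(G) >= 1 and len(D) >= 1)
```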
Next, algorithm A3 is analyzed in a way similar to that for algorithm A2 in the last section.

Lemma 7.4.1 For 0 ≤ d < n, M_{A3}(d | n) ≤ 1.65d(log(n/d) + 1.031) + 6.

Proof. It is proved by induction on d. For d = 0, since the algorithm tests S when k = 10, it takes at most six tests to find out that S is pure. Thus M_{A3}(0 | n) ≤ 6. For d > 0, suppose that the first time the computation exits the jumping process is with k = i. So a contaminated set X of size at most 2^i + 2^{i+1} (i is even) and 2^i − 1 good items are found with i/2 + 1 tests. Next, consider three cases.
Case 1. i = 0. Procedure 3-TEST identifies either two defectives in two tests or at least one good item and one defective in two tests. Applying the induction hypothesis to the remaining n − 2 items, in the former subcase the total number of tests is at most

3 + 1.65(d − 2)(log((n − 2)/(d − 2)) + 1.031) + 6
  = 1.5·2(log(2/2) + 1) + 1.65(d − 2)(log((n − 2)/(d − 2)) + 1.031) + 6
  ≤ 1.65d(log(n/d) + 1.031) + 6.
In the latter subcase, the total number of tests is at most

3 + 1.65(d − 1)(log((n − 2)/(d − 1)) + 1.031) + 6
  = 1.5(log 2 + 1) + 1.65(d − 1)(log((n − 2)/(d − 1)) + 1.031) + 6
  ≤ 1.65d(log(n/d) + 1.031) + 6.
Case 2. 2 ≤ i ≤ 8. Procedure BIG-3-TEST identifies either one defective with at most i + 1 tests, or one defective and 2^i good items with at most i + 2 tests. In the former subcase, the total number of identified items is 2^i and the total number of tests for identifying them is at most (i/2 + 1) + (i + 1) ≤ 1.65(log 2^i + 1.031). In the latter subcase, the total number of identified items is 2^{i+1} and the total number of tests for identifying them is at most (i/2 + 1) + (i + 2) ≤ 1.50(log 2^{i+1} + 1). Applying the induction hypothesis to the remaining unidentified items and using Lemma 7.2.2, the upper bound 1.65d(log(n/d) + 1.031) + 6 is obtained for the total number of tests.
Case 3. i ≥ 10. This case is similar to Case 2. The difference is that the algorithm spends one more test on S when k = 10. So there are two subcases corresponding to the two subcases in Case 2. In the former subcase, the total number of tests for identifying 2^i − 1 good items and one defective is at most (i/2 + 1) + (i + 2) ≤ 1.65(log 2^i + 1.031); in the latter subcase, the total number of tests for identifying 2^{i+1} good items and one defective is at most (i/2 + 1) + (i + 3) ≤ 1.65(log 2^{i+1} + 1.031). The proof is completed by applying the induction hypothesis to the remaining unidentified items and using Lemma 7.2.2. □

Lemma 7.4.2 For 0 ≤ d < n, M_{A3}(d | n) ≤ 1.5n.

Proof. It is proved by induction on d. For d = 0, the algorithm needs one test when n ≤ 3, two tests when 4 ≤ n ≤ 15 and at most five tests when n ≥ 16, so that M_{A3}(0 | n) ≤ 1.5n. For d > 0, suppose that the first time the algorithm leaves the jumping process is when k = i. So a contaminated set X of size at most 2^i + 2^{i+1} (i is even) and 2^i − 1 good items are found with i/2 + 1 tests. Next, the proof follows
the idea of the proof of the last lemma to verify that in each case the number of tests is at most one and a half times the number of identified items.
Case 1. i = 0. Two items are identified with three tests.
Case 2. 2 ≤ i ≤ 8. Either 2^i items are identified with 1.5i + 2 tests or 2^{i+1} items are identified with 1.5i + 3 tests. Since i ≥ 2, 1.5i + 2 ≤ 1.5·2^i and 1.5i + 3 ≤ 1.5·2^{i+1}.
Case 3. i ≥ 10. Either 2^i items are identified with 1.5i + 3 tests or 2^{i+1} items are identified with 1.5i + 4 tests. Since i ≥ 10, 1.5i + 3 ≤ 1.5·2^i and 1.5i + 4 ≤ 1.5·2^{i+1}.
The proof is completed by applying the induction hypothesis to the remaining unidentified items and adding the bound to the inequalities above. □

Theorem 7.4.3 For 1 ≤ d ≤ n − 1, M_{A3}(d | n) ≤ 1.65M(d, n) + 10.

Proof. If d/n ≥ 8/21, then M(d, n) = n − 1, and the theorem follows by Lemma 7.4.2. If d/n < 8/21, then by Lemma 7.4.1 and Corollary 7.1.2,

M_{A3}(d | n) ≤ 1.65M(d, n) + 6 + 1.65(0.5 log d − 0.065d + 1.222).

Denote h(d) = 0.5 log d − 0.065d. Then h(d) increases for d ≤ 11 and decreases for d ≥ 12. Thus h(d) + 1.222 ≤ max(h(11), h(12)) + 1.222 < 2.237. Hence M_{A3}(d | n) ≤ 1.65M(d, n) + 10.
• By modifying algorithm A3, the competitive ratio could be further improved to approach 1.5. The modification can be done through studying the competitive group testing on small number of items. For example, instead of procedure 3-TEST, using a procedure for testing 12 items, the competitive ratio can be decreased to be less than 1.6. However, it certainly requires a new technique in order to push the competitive ratio down under 1.5. Du and Kelley [4] introduced the technique of Bentley and Yao for unbounded search (see Chapter 10) to the competitive group testing. With such a technique, they were able to reduce further the competitive ratio.
7.5
The Second Competitiveness
A number of papers (see Chapter 3) have appeared in studying group testing algorithms for 1 < d < 3. In those papers, the following number was studied. n(d,k) = ma,x{n\M(d,n)
< k}.
Motivated by this situation, Du and Park [5] defined the second competitiveness as follows. Consider an algorithm a. Define na(d | k) = max{n | Ma(d | n) < k}. An algorithm a is called a strongly c-competitive algorithm if there exists a constant a such that for every d > 1 and k > 1, n(d, k) < c • na(d \ k) + a. Note that d = 0 is excluded because n(0,k) = oo. A strongly competitive algorithm is a strongly c-competitive algorithm for some constant c.
7.5 The Second
Competitiveness
139
Theorem 7.5.1 Every strongly competitive algorithm a satisfies the following: (1) There exists a constant d such that for 1 < d < n — 1, Ma{d \n) < (2) For
c'M(d,n).
d>\, Ma(d | n) hm ——-—r= 1. n-.oo M(d,n)
Before proving this theorem, a lemma is given. Lemma 7.5.2 For 0 < d < n, dlog-r < M(d,n) d Proof. bound. M(l,n) for d = 2.2.2,
Since rjj To prove = \logn~\, 0,1 and d
< d l o g ^ + (1 +loge)d. d
> ij) , the first inequality follows from the information lower the second inequality, first note that M(0,n) = M(n,n) = 0, and for n < ^: M(d,n) = n — 1. So, the inequality holds trivially > jrn. Next, consider 2 < d < jrn. From the proof of Corollary M(d,n)
<Ma(d,n)
Moreover, for 2 < d < ^-n,
Thus, M(d,n)
< d l o g ^ + (1 + log e)d. d
D
Now, Theorem 7.5.1 is ready to be proved. Proof of Theorem 7.5.1. Let n(d, k) < c • na(d | k) + a. Without loss of generality, assume that c and a are positive integers (otherwise, one may use two positive integers bigger than c and a instead of c and a). From the above inequality, it follows that Ma(d | n) = min{fc | na{d \ k) > n) <
min{fc | {n(d, k) — a)/c > n} = M(d, en + a).
Competitive Group Testing
140 Thus, for d > 1,
~
Ma(d \ n) M(d,cn + a) d\og^ M(d,n) ~ M(d,n) ~
+ {1+\oge)d d\og*
Clearly, the right hand side is bounded by a constant. So, a is a competitive algorithm. Moreover, the right hand side approaches one as n goes to infinity. Therefore lim,,^^ Ma(d | n)/M(d,n) = 1. D The inverse of the above theorem is not true. The next theorem gives a sufficient condition for strong competitiveness. Theorem 7.5.3 If for algorithm a there exists a constant c such that for 1 < d < n-\, Tl
Ma(d | n) < dlog — + cd, then a is strongly competitive. Proof. Note that na(d | k)
= max{n | Ma(d \ n) < k] > >
Tl
max{n | dlog — + cd < k) d d2^- c - 1
and n(d, k)
= max{n | M(d, n) < k} Tl
< max{n I dlog — < k} d < d2$. Thus, n{d,k) <2c{na(d\
7.6
n) + 1).
•
Digging
Note that if both X and X' are contaminated sets with X' C X, then the information from the test on X' renders the information from the test on X useless (Lemma 1.4.1). Thus, in the bisecting algorithm, a lot of tests produce no information for the final result. This observation suggests the following improvement: Once a contaminated set is found, a defective is identified from the set. This will be called digging, which results in a strongly competitive algorithm as follows. (The procedure DIG for digging a defective has been introduced in Section 7.3.)
141
7.6 Digging Algorithm A 4 : input S; G «— ty;{the bin of good items] D <— 0;{i/ie bin of defectives} Q - {S}; repeat pop the frontier element X from queue Q; TEST{X); if X is pure then G «- G U X else begin Y <- DIG(X); X^X\Y; D*-DUY; bisect X into X' and X"; if X' ^ 0 then push X' into queue Q; if X" ^ 0 then pusA X" into queue Q; end-else until Q = 0. end-algorithm. The following is an analysis for algorithm A4. L e m m a 7.6.1 Let n be a power of 2. Then for 0 < d < n. MA4{d | n) < (/log - J i - + 4J - log(d + 1) + 1. a+ 1
Proof. Let n = 2U, t7 = [l°g(^ + 1)J a n d u ' = l°g(^ + 1) — w- Note that a defective detected from a set of size k by function DIG needs [log k~\ tests. By the definition of algorithm A4, it is not hard to see that function DIG applies to at most one set of size n, at most two sets of sizes between 1 + n/4 and n/2, ..., and in general, at most 2' sets of sizes between 1 + n/2' and n / 2 ' - 1 . Thus, the number of tests consumed by function DIG is at most u + 2(w - 1) + • • • + 2v~1(u - v + 1) + (d - 2" + l)(u - v) = u(T - 1) - (v2v - 2" + 1 + 2) + (d - 2" + l)(u - v) = ud- v(d + 1) + 2" +1 - 2 = d{u-vv') + v'd - v + 2" + 1 - 2 = d(u-vv') + (v' + 2 w ) ( < f + 1) - 2 - log(
<
d\ogj?—+2d-log(d+l).
The last inequality holds because v' + 2 1 ""' < 2 for 0 < v' < 1. In fact, i/ + 2 1 ""' is a convex function with minimum value 2 at v' = 0 and u' = 1.
Competitive Group Testing
142
Now, consider tree T* which is built up by a bisecting process, that is, the node set of T* consists of S and all sets X' and X" appearing in the computation, and node X is the father of the two sons X' and X" iff X' and X" are obtained by bisecting X after identifying a defective. Clearly, every internal node is a contaminated set from which a defective is identified. Thus, T* has at most d internal nodes. It follows that the total number of nodes of T* is at most 2d + 1. Therefore, MA4{d\n)
< < i l o g 3 + 4 d - l o g ( J + l ) + l. a
•
L e m m a 7.6.2 For 1 < d < n, MA4(d\n) 1, the algorithm identifies the first defective with [log n\ + 1 tests, and bisects the remaining n — 1 items into two sets S' and S" of sizes n' and n" where n' = 2 U _ 1 and u = [log(rc — 1)]. Suppose that S' and S" contain d' and d" defectives, respectively. So, d! + d" + 1 = d and n' + n" + 1 = n. Then by Lemma 7.6.1, the number of tests for identifying items in S' is at most rf'(log^T
+ 4)-log(rf' + l) + l.
Next, consider two cases. Case 1. n =^ 2" + 1- Then u = [logn]. If d" = 0, then the algorithm uses one test to detect S". If d" > 0, then use the induction hypothesis on S". In either situation, the number of tests for identifying items in S" is at most d » ( l o g ^ + 4) + l. Thus, the total number of tests is at most u + 3 + d'(\og ^
+ 4) - \og(d' + 1) + d"(\og J
< (rf' + l ) ( l o g ^ T + 4 ) + d"(logJ+4) < <
n —1 d(\og — — + 4) (by Lemma 7.2.2) d
+ 4)
143
7.7 Tight Bound
Case 2. n = 2U + 1. Then u + 1 = [log n] and n" = n' = 2"" 1 . If d" = 0, then the total number of tests is at most u
<
+ 4 +
l+d(log^-+4) a d(log^ + 4).
<
If d" > 0, then apply the induction hypothesis to 5". It is easy to see that the total number of tests has the same upper bound as that in Case 1. • T h e o r e m 7.6.3 Algorithm A4 is strongly competitive. Proof. It follows immediately from Lemma 7.6.2 and Theorem 7.5.3. •
7.7
Tight Bound
Du and Hwang [3] conjectured that there exists a method A of bisecting such that MA(d\n) <2M(d,n) + l for 0 < d < n—1. Since for any method A of bisecting, M^(n—1 | n) = 2M(n — l,n)+l and MA(1 \ n) = 2 M ( l , n ) + 1, this bound is tight. Du, Xue, Sun, and Cheng [6] showed that the bisecting method Al in Section 7.2 actually meets the requirement. T h e o r e m 7.7.1
For0
<2M(<J,n) + l .
The proof of this result involves a more accurate analysis on the upper bound of Mjii(d | n). Such techniques can also be applied to other algorithms to yield tighter bounds. In this section, the main idea of their proof is introduced. By the remark that we made after the proof of Lemma 7.2.1, it suffices to consider the case that d/n < 8/21. In this case, by Corollary 7.1.2, M(d, n) > 8. To obtain the tight bound, one needs h(d) < —1.222. This yields d > 41. Therefore, for d > 41, MA1(d \ n) < 2M{d,n) + 1.
Competitive
144
Group Testing
Next, consider 1 < d < 41. Define f(n,d)
=
(?) 2" If f(n, d) > 1, then by Lemma 7.2.4 and the information lower bound for M(d, n), it is easy to see that MA1{d \n) <2M(d,n) + 2. Since both sides of the inequality are integers, MA1(d\ n) <2M(d,n) In order to find when f(n,d)
+ l.
> \/2, consider the following ratio f(n,d+l)
_n-d
f{n,d)
2n
1 {
d' '
It is easy to see that n —d 2n
f(n,d-\-l) f(n,d)
n — d, d . , / , 2n Kd+l>
Thus, L e m m a 7.7.2 For d/n > 1 — 2/e, f(n, d) is decreasing with respect to d. For d/n < 1 — (2/e)w(rf+ l)/d, f(n,d) is increasing with respect to d. This lemma indicates the behavior of the function f(n, d) with respect to d. Next, study the function with respect to n. Consider
Note that g(n, d-\-1) g(n,d)
n(n — d-\-1) > 1 (n + l)(n — d)
because n(n-d+l)-(n
+ l)(n - d) = d.
Moreover, g(n, 1) = 1. Thus, for d > 1, g(n, d) > 1. Therefore, the following theorem holds. Lemma 7.7.3 For d > 1, f(n,d)
is increasing in n.
7.7 Tight Bound
145 d
21d>8n
^•^^^ 0
^
s
^
--'"""""""" X
-***•'*'
^"Figure 7.4: Pairs of n and d. From Lemmas 7.7.2 and 7.7.3, it can be seen that if f(n*,d") > 1, then for every n > n* and (1 - f i / 5 > + 1 > d > d*, f(n,d)
> 1. Note that /(157,5) > 1
and (1 - f-y/lf) -157 + 1 > 41. By Lemma 7.7.2, /(157,d) > 1 for 5 < d < 40. Furthermore, by Lemma 7.7.3, f(n, d) > 1 for n > 157 and 5 < d < 40. (See Figure 7.4.) Unfortunately, the above argument does not help in the case of d < 4. In fact, it is easy to prove that in this case f(n,d) < 1. Actually, one needs a more accurate upper bound of MA1(d | n) for 1 < d < 4. By the definition of algorithm Al, it is easy to find the following recursive formulas for computing MAi(d \ n). MA1(d | n) = max {1 + MM{d' \ n') + MA1{d" | n")}{d > 1) 0<.d'Kd
MA1(0 | n) = 1. where n' = 2n°« n l- 1 ,n" = n- • n' and d" = d— d'. From these formulas, the following lemmas can be obtained. Lemma 7.7.4 MA1{\ | 2") = 2u + l
foru>0,
MA1(2 | 2") = 4 u - l
/orw>l,
M,u(3 | 2") = 6 u - 5
foru>2,
Af / u(4|2") = 8 u - 9
foru>2.
Proof. The proof is by induction on u. It is easy to check each equation for the initial value of u. For the induction step, the recursive formula is employed to yield the following. MA1(1\2U)
= 1 + MA1(1 | 2""1) + M ^ O | 2 - 1 ) = U
MA1{2\2 )
l + 2 ( u - l ) + l + l = 2 u + l.
= max(l + M j 4 1 (2|2"- 1 ),l+2M / 1 1 (l I2"- 1 )) =
max(4u — 4,4u — 1) = 4« — 1.
Competitive Group Testing
146 MM{3\2U)
max(l + M A 1 ( 3 | 2 u - 1 ) , l + M ^ 1 ( l | 2 u - 1 ) + M A 1 ( 2 | 2 " - 1 ) ) max(6u — 10,6u — 5) = 6u — 5. 1 1 o maX 4 (M,u(d' | 2"" ) + MA1(4 - d! | 2"" ))
= = MM(A | 2") = =
m a x ( 8 u - 1 7 , 8 u - l l , 8 u - 9 ) = 8u - 9.
L e m m a 7.7.5 Letu + l= Then
\\ogn\,v
•
= [log(n - 2")], and w = [log(rc - T - 2 " - 1 ) ] .
MAi(l|n) MA1(2\n) MA1(3\n)
< 2(« + l) + l, < max(4u + l,2(u + v + 1) + 1), < m a x ( 6 u - 3 , 4 u + 2u + l ) ,
MAi(4\n)
< max(8u - 7 , 6 u + 2v - 3 , 4 u + 4t> - 3,4u+ 2v + 2w + 1).
Proof. Use the recursive formula and note that M,u(d | n — 2") < MAi(d \ 2V). • Now, Theorem 7.7.1 for 1 < d < 4 is proved as follows. Note that u and v are defined the same as those in Lemma 7.7.5. For d = 1, M ^ ( l | n) < 2(u + 1) + 1 = 2M(l,rc) + l. For d = 2, if u < u, then MA1(2 \ n) < 4u + 1 and
M(2,n) > pog Q i > rio g ( 2 " + 2 1 ) 2 "i =2«. So, M 4 1 (2 | n) < 2M(2,n) + 1. If u = u, then M,u(2 | n) < 2(2u + 1) + 1 and
> =
[log(22" + 2 2 u - 3 )l 2u + l.
Thus, Theorem 7.7.1 holds. For d = 3 and M = 1, it is trivial. Next, consider d = 3 and u > 2. If u < u, then A M 3 | n) < 2(3u - 2) + 1 and
>
flog((2- + 1)2 2 - 3 )1
>
3u - 2.
Thus, Theorem 7.7.1 holds. If u = v, then MA\(3 \ n) < 6u + 1 and n\.
.
„
(2 u + 2 " - 1 + l )A ( 2 " + 2''- 1 )(2 u + 2 u - 1 - 1)
fiogyi > r i ° g — — > >
6
— ^
riog((22u+1+22u-2-l)2"-2)l 3«.
'-]
147
7.7 Tight Bound
For d = 4, it is trivial to verify Theorem 7.7.1 in the cases of u = 1 and u = 2. Next, consider u > 3. If u < u and u; < u — 1, then MA1(i | n) < 2(4u - 3) + 1 and
iw(;)i > r.o 8 (2 ' +1)2 ' ( r-' )(2 '- 2) i ( 2 3 " - 2 2 u - 2 " - 1 + l)2"- 2 n y 1
=
[log
>
4w - 3.
Thus, Theorem 7.7.1 holds. If u = v and w < u - 1, then MA1{4 \ n) < 2(4u - 2) + 1 and
iw(;)i (2" + 2"" 1 + 1)(2" + 2 U - 1 )(2" + 2"- 1 - 1)(2U + 2"" 1 - 2), _ ] 1 2 1 2 ((2" + 2"- ) - 1)((2" + 2"" - l) - 1) I og I 3 g
>
riog
_
92u+l
I n2u-2 __ i
c\2u+\ I o2u-2 _ ou __ 9 « - l
> riog 5 > Rog((2 2 " -1 + 1)2 2 "- 2 )1 >
-8
1
4u - 2.
So, Theorem 7.7.1 holds. If w = u — 1, then it must be the case u = v and M>n(4 | n) < 2(4u - 1) + 1. Note that n > T + 2"" 1 + 2"" 2 + 1. Thus,
>
-
> > >
I °g
((2" + 2"" 1 + 2 " ' 2 ) 2 - 1)((2" + 2"- 1 + 2"- 2 - l) 2 - 1) o2u+l j _ n2u _|_ o2u-4 _ 1
3 .g Q 2 ^ 1 J_ 02w-4
_
_
|l 0 g 2
2
I
]
2
riog((2 " + 1)2 "- )1 4u - 1.
Therefore, Theorem 7.7.1 holds. Finally, for finitely many pairs of (n, d) located in polygon oxyz as shown in Figure 7.4, compute MAi{d \ n) by the recursive formula. Also compute a lower bound of M(d, n) by the following formula £(d,n) = mm{n - 1, max^log r
~ *Jl + 2k]}.
(This lower bound is in Lemma 1.5.9.) Comparing the two computation results, one finds that MA1(d \ n) < 2£(d, n) +1. (The details can be found in [6].) This completes the proof of Theorem 7.7.1.
148
Competitive Group Testing
References [1] A. Bar-Noy, F.K. Hwang, I. Kessler, and S. Kutten, Competitive group testing in high speed networks, to appear in Discrete Applied Mathematics. [2] D.Z. Du and F.K. Hwang, Minimizing a combinatorial function, SIAM J. Alg. Disc. Method 3 (1982) 523-528. [3] D.Z. Du and F.K. Hwang, Competitive group testing, in L.A. McGeoch and D.D. Sleator (ed.) On-Line Algorithm, DIMACS Series in Discrete Mathematics and Theoretical Computer Science, Vol. 7 (AMS & ACM, 1992) 125-134. (Also to appear in Discrete Applied Mathematics.) [4] D.Z. Du and D. Kelley, An improvement on competitive group testing, in preparation. [5] D.Z. Du and H. Park, On competitive algorithms for group testing, Technical Report 92-39, Computer Science Department, University of Minnesota, 1992. [6] D.Z. Du, G.-L. Xue, S.-Z. Sun, and S.-W. Cheng, Modifications of competitive group testing, to appear in SIAM J. Computing. [7] M.C. Hu, F.K. Hwang and J.K. Wang, A boundary problem for group testing, SIAM J. Alg. Disc. Method 2 (1981) 81-87. [8] M.S. Manasse, L.A. McGeoch, and D.D. Sleator, Competitive algorithms for online problems, Proceedings of 20th STOC, (1988) 322-333. [9] E. M. Palmer, Graphical Evolution: An Introduction to the Theory of Random Graphs, (John Willy fc Sons, New York, 1985). [10] D.D. Sleator and R.E. Tarjan, Amortized efficiency of list update and paging rules, Communications of ACM, 28 (1985) 202-208.
8 Unreliable Tests
In the previous chapters, all tests are reliable, i.e., the test outcomes are error-free. In this chapter, we will shift our attention to unreliable tests, i.e., each test may make an error with a certain possibility.
8.1
Ulam's Problem
Stanislaw M. Ulam (1909-1984) is one of the great mathematicians in the twentieth century. In his autobiography "Adventures of a Mathematician" [23], he wrote the following. " Someone thinks of a number between one and one million (which is just less than 2 20 ). Another person is allowed to ask up to twenty questions, to each of which the first person is supposed to answer only yes or no. Obviously the number can be guessed by asking first: Is the number in the first half-million? and then again reduce the reservoir of numbers in the next question by one-half, and so on. Finally the number is obtained in less than log 2 (l, 000, 000) questions. Now suppose one were allowed to lie once or twice, then how many questions would one need to get the right answer? One clearly needs more than n questions for guessing one of the 2 n objects because one does not know when the lie was told. This problem is not solved in general." Ulam's problem is a group testing problem with one defective and at most one or two erroneous tests. In general, more defectives and more errors may be considered. Suppose there are n items with d defectives and at most r errors are allowed. For any algorithm a identifying all defectives with such unreliable tests, let N^{a \ d,n) denote the number of unreliable tests performed by a on sample a and let MTa(d,n)=
max
Nra{a\d,n)
cxeS(d,n)
MT(d,n)
= min Mra{d,n). 149
Unreliable Tests
150
Ulam's problem is equivalent to finding M ^ l , 106) and M 2 ( l , 106). Rivest, Meyer, Kleitman, Winklmann, and Spencer [19] and Spencer [21] showed that 25 < M1(1,W6) < 26. Pelc [12] determined M ^ ^ I O 6 ) = 25 by proving the following result. T h e o r e m 8.1.1 Mi(,
^n>~
\ _ / min{fc | n(k + 1) < 2*} \ min{fc|n(fc+l) + ( f c - l ) < 2k}
if n is even, if n is odd.
Pelc's proof uses a weight function introduced by Berlekamp [2]. A history of group testing is a sequence of tests together with outcomes. Associated with a history, a state is a couple (a, 6) of natural numbers. The first number a is the number of items which, if defective, satisfy all test-outcomes. (The set of such items is called the truth-set.) The second number 6 is the number of items which, if defective, satisfy all but one test-outcomes. (The set of all items of this type is called the lie-set.) For example, consider items {1,2,3,4,5} and a history T ( l , 2 ) = 0,T(2,3,4) = 1,T(2,4,5) = 1. Then the state (a, b) associated with this history is determined as follows: a = |{3,4,5} n {2,3,4} n {2,4,5}| = |{4}| = 1 b = |({3,4,5}n{2,3,4}n{l,3})U({3,4,5}n{l,5}n{2,4,5}) U ( { l , 2 } n {2,3,4} n{2,4,5})| = |{3,5,2}| = 3. For a certain k, to look at whether k tests are enough or not, one introduces the weight of a state (a, 6) obtained from a history of j tests as follows. wk-j(a, b) = a(k - j + 1) + 6. This definition is interpreted by Spencer [21] as the number of possibilities that the remaining k — j tests lie or not. In fact, if an item in the truth-set is defective, then the remaining k — j tests can have exactly one lie or none at all so that k — j -\- 1 possibilities exist; if an item in the lie-set is defective, then the remaining k — j tests cannot lie so that only one possibility exists. When a new test is performed at a state (a,b), it yields two new states (ai,&i) and (02,62) corresponding to the two outcomes of the test. An important property of the weight is that Wj(a,b) = w^i(ai,bi)
+ w J _ 1 (a 2 ,6 2 ),
i.e. the total weight is invariant. The proof of this equation is quite easy. In fact, suppose that the test is on a set containing x items in the truth-set and y items in the lie-set. Then the state corresponding to the positive outcome is (x, y + a — x) and
8.1 Ulam's Problem
151
the state corresponding to the negative outcome is (a — x, b — y + x). Thus, the total weight of the two states is ii)j_i(x, y + a — x) + Wj-i(a — x, b — y + x) = xj + y + a — x-{-(a — x)j + 6 — y + x
= a{j + l) + b =
Wj(a,b).
Using this weight function, Pelc first proved the following three lemmas before proving Theorem 8.1.1. L e m m a 8.1.2 For even n, if n(k + 1) > 2k, then Mx(l,n) n(k + 1) + (k - 1) > 2k, then 71^(1, n) > k.
> k. For odd n, if
Proof. First, consider even n. Suppose to the contrary that there exists an algorithm a identifying the defective within k tests. Let a be represented by a binary tree. Consider a path from the root to a leaf; each node has weight not smaller than that of its sibling. Suppose that the length of the path is k — j (> 0). Since the root is in the state (n, 0) with weight n(k + 1) > 2k, the leaf has weight more than 2J. Note that the state of a leaf must be (1,0) with weight j + 1 or (0,1) with weight 1. Thus, either j + 1 > 2j or 1 > 2j. Both inequalities are impossible. Next, consider odd n. Suppose to the contrary that there exists an algorithm a identifying the defective within k tests. Note that the first test must yield states (a,i,bi) and (a 2 , 62) with weights satisfying max(u;i_i(ai,61),to*-!(02,62)) >
„
k+ —— > 2h~1.
Then an argument similar to the case of even n results in a contradiction. • Clearly, Lemma 8.1.2 means that for even n M1(l,n)>min{k\n(k+l)
<2k)
and for odd n M\\,n)
>min{fc|n(fc + l) + (ifc-l)
<2k}.
To prove the inequality in the other direction of inequalities, the following two lemmas are needed. Define ch(a: b) =
Unreliable
152 L e m m a 8 . 1 . 3 Let b be a natural algorithm identifying the defective,
Tests
number and k = ch(l,b). Then there exists an starting from the state (1,6), in k more tests.
L e m m a 8 . 1 . 4 Let (a,b) be a state such that b > a — 1 > 1. Then there exists a test which yields states (a1:bi) and (02,62) such that a
1
i
i-
y
i
^
s
i
a
+ * 1 iai
^
^ i
^ a i ^ L—^—J, LgJ ^ ° 2 ^
2.
&! > ax — 1,6 2 > a 2 — 1;
3.
ch(ai, bi), ch(a2, 62) < ch(a, b) — 1.
a
+
1
i
\r-2-J;
T h e following explains how to prove T h e o r e m 8.1.1 by using these two lemmas and also gives t h e proofs of two l e m m a s . Proof of Theorem 8.1.1. It suffices to prove t h a t for even n, if n(k + l) < 2k, t h e n there exists an algorithm identifying t h e defective in k tests; for odd n, if n(k+l) + (k — 1) < 2k, t h e n t h e r e exists an algorithm identifying t h e defective in k tests. First, consider even n. Let n = 2a. Select t h e first test on a set of a items, which yields s t a t e (a, a) with ch(a, a) < k — 1. If 0 = 1, t h e n by L e m m a 8.1.3, t h e defective can be identified by k — 1 more tests. If a > 1, t h e n t h e s t a t e (a, a) satisfies t h e a s s u m p t i o n of L e m m a 8.1.4. Note t h a t Conditions 1 and 2 g u a r a n t e e t h a t L e m m a 8.1.4 can be applied repeatedly until a s t a t e (1,6) is reached after t < [a\ + 1 tests. Condition 3 guarantees ch(l, 6) < k — 1 — t for every s t a t e ( 1 , 6) t h a t is reached. By L e m m a 8.1.3, t h e defective can be identified by k — 1 — t more tests. So t h e total n u m b e r of tests for identifying t h e defective is at most 1 + t + (k — 1 — t) = k. For odd n, let n = 2a + 1. Select t h e first test on a set of a + 1 items, which yields states (a + l , a ) and ( a , a + l ) . Note t h a t W f c _ i ( a , a + l ) = afc + a + 1 < Wk-i(a, a + 1) = (a + l)fc + a = (n(k + 1) + (k - l ) ) / 2 < 2*" 1 . So, ch(a,a-{-
1) < k — l,ch(a
+ l , a ) < k — 1.
T h e rest is similar t o t h a t in t h e case of even n.
•
Proof of Lemma 8.1.3. It is proved by induction on 6. For 6 = 0, ch(l,0) = 1, so t h e l e m m a holds trivially. In t h e induction step, consider two cases. Case 1. b < k. T h e test on t h e t r u t h - s e t yields states (1,0) and (0,1 + 6) with W*_1(l,0)>™fc_1(0,l
+ &).
T h e s t a t e (1,0) is always a leaf and «> A _ 1 (0,l + 6 ) < 2 * - 1 . It follows t h a t k — 1 tests are enough to identify t h e defective starting from t h e s t a t e ( 0 , 1 + 6), because t h e t r u t h - s e t being e m p t y implies t h a t t h e remaining k — 1 tests will not m a k e any error.
8.1 Ulam's Problem
153
Case 2. b > k. Let x = [(n — k + 1)/2J. Then the test on the set consisting of the unique element in the truth-set and x elements in the lie-set yields states (l,x) and (0, 6 + 1 - x). Note that K _ i ( l , x) - u)i_x(0, b + 1 - x)\ = \k - b - 1 + 2x\ < 1 to fc _i(l,x) + wfc_i(0, 6 + 1 - i ) = u;fc(l, 6) < 2*. It follows that u;/t-i(l,x)<2'!-1,i(;t_i(0,6+l-x)<2i-1. So, cfc(l,x) < fc — 1 and c/«(0, b + 1 — x) < k — 1. By the induction hypothesis, A; — 1 tests are enough to identify the defective from the state ( l , x ) . Moreover, it is clear that k — 1 tests are also enough to identify the defective, starting from the state (0, b + 1 — x). Therefore, a total of k tests are enough to identify the defective, starting from the state (1,6). • Proof of Lemma 8.1.4- First, consider the case of b = a — 1 > 1. Let £ = ch(a, b). If a is even, then a = 2c and b = 2c — 1. The test on the set consisting of c elements in the truth-set and c elements in the lie-set yields states (c, 2c) and (c,2c— 1). Clearly, Conditions 1 and 2 are satisfied. Condition 3 follows from the fact that W(-i(c, 2c) — «;<_1(c, 2c— 1) = 1. If a is odd, then a = 2c + I and b = 2c. First, assume a > 5. The test on the set consisting of c + 1 elements from the truth-set and c — \i/2\ elements from the lie-set yields states (c + 1,2c— |//2J) and (c, 2 c + [l/2\ + 1 ) . Condition 1 is clearly satisfied. To prove Condition 2, note that for a > 6, a(a + 1) + (a - 1) < 2 a . Hence, £ = ch(a,a — 1) < a, which implies that 2c - [£/2\ > c; that is, Condition 2 is satisfied. For Condition 3, it suffices to notice that
=
I t u ^ c + l , 2c - 1//2J )-«;/_,(<:, 2 c + | * / 2 J + 1)| \£-2[£/2\ - 1 | < 1.
Next, consider a < 5. Since 6 = o — 1 > 1, there are only two states (3,2) and (5,4). For the state (3,2), the test on two elements in the truth-set yields states (2,1) and (1,4) satisfying Conditions 1, 2, and 3. For the state (5,4), the test on three elements of the truth-set yields states (3,2) and (2,7) satisfying Conditions 1, 2, and 3. Now, consider 6 > a — 1 > 1. Denote x = b — a + I, m = ch(a,b), and £ = ch(a,a — 1). By the above proved case, there exists a test on a set of s elements from the truth-set and t elements from the lie-set which, starting from the state
154
Unreliable Tests
(a,a — 1), yields states (a.i,bi) and (a2,b2) satisfying Conditions 1, 2, and 3. Let Vi = u>m_i(a,', 6;). Then vx, v2 < 2 m , u>m(a,a —1) = ^i+t>2, and wm(a, b) = Dj-t-^ + z. Denote y = min(a;, 2 m _ 1 — i>i). For the state (a, 6), the test on a set of s elements from the truth-set and t + y elements from the lie-set yields states (
=
">m-i(<22, b2 + x-y) = v2 + x-y = max(u 2 , «i + v2 + x - 2 m _ 1 ) max(i;2,iym(a,6)-2 m _ 1 ) < 2 m " 1 .
This completes the proof of Lemma 8.1.4.
•
When r errors are allowed, each state is described by a (r + l)-tuple (t0,
Their result is that M 2 ( l , 2 m ) = mm{k \ k2 + k + 1 <
2h~m+i}.
From this, they were able to show that M 2 (l,10 6 ) = 29. Guzicki [9] determined M 2 ( l , n) completely. The result that he obtained is as follows. Theorem 8.1.5 Let ch(a, b, c) = minjfc | Wk(a, b, c) < 2 }. Denote k = ch(n,0,0).
Then k<M2(l,n)
<
k+l.
Moreover, if k > 14, then M 2 ( l , n ) = k if and only if the following conditions hold.
8.2 General Lower and Upper Bounds
155
Case 1. n = Am. Then n • F{k) < 2k. Case 2. n = Am + 1. Then if k = At then (m + l)F(k -2) + (2m-t+l) + m + t - l < 2k~2, ifk = A£+l then mF{k - 2) + (2m + £ + l){k - 1) + m - £ < 2 * " \ ifk = A£ + 2 then mF{k - 2) + (2m + £ + l){k - 1) + m - £ < 2k'2, ifk = At + 3 then (2m + l)F(ifc - 1) + 2mfc < 2*- 1 . Case 3. n = Am + 2. Then if k = At then mF(k - 2) + (2m + £ + l)(ifc - l ) + n - ^ + l < 2*- 2 , i/Jfc = At + 1 tfien nF(fc) < 2k, ifk = A£ + 2 then (m + l)F(k -2) + (2m-£ + 2)(k - 1) + m + £ < 2k~2, ifk = At + 3 then mF(k - 2) + (2m + t + 2)(k - 1) + m - £ < 2k~2. Case 4- n = An + 3. TAen ifk = 2£ then (2m + 2)F(fc - 1) + (2m + l)k < 2k~1, ifk = 2£ + l then (m + l)F(k - 2) + (2m + 2){k - 1) + m < 2 fc " 2 . Negro and Sereno [10] proved that Af 3 (l,10 6 ) = 33. In general, it seems pretty hard to give an explicit formula for M r ( l , n ) when r > 3. From the work of Pelc [13] and Guzicki [10], it can be seen that the formula for M 1 ( l , n ) depends on the outcome of the first test and the formula for M 2 ( l , n ) depends on the outcomes of the first and the second tests. We conjecture that the formula for M r ( l , n ) depends on the outcomes of the first r tests.
8.2
General Lower and Upper Bounds
For r > 3, determining M r ( l , n ) is not easy. So, it is worth giving lower bounds and upper bounds. For a state (to,ti,- • • ,tr), testing a set consisting of s; elements in A{ for all i will yield two states (SQ, s\ + to — so, S2 + ii — S\, • • •, sr + i r _! — sr_i) and (t0 — so, t\ — s \ + •So, t2 — «2 + si> • • •, tr — sr + ST—i)- Consider the problem of whether the unique defective can be identified in k tests or not. Define the weight of a state (t0, ti: • • • , tr)
156
Unreliable
Tests
as follows:
where m is t h e n u m b e r of remaining ning tests and
((T))-SC: 3=0
Note that
::.')h rr1 - I
It is easy t o verify t h a t u>m(*o,*i, ••-,*.•) =
tW m _i(s 0 ,Sl + *0 - 50,52 + <1 - S i , - - - , S r + < r _ l - S r _i) +Wm-l(tQ
— S0,ti
— Si + 5 0 , <2 — S2 + Si,- • • ,tr — Sr + -S r _i).
T h e following result is a generalization of L e m m a 8.1.2. T h e o r e m 8 . 2 . 1 For even n, M r ( l , n ) > min{yfc | n ( T ] J <
2k)
and for odd n,
Proof. First, consider even n. Suppose t o t h e contrary t h a t t h e r e exists an algorithm a identifying t h e defective in k tests with
Representing a by a binary tree, select a p a t h from t h e root to a leaf such t h a t each node has a weight not smaller t h a n t h a t of its sibling. Suppose t h a t t h e length of t h e p a t h is k — j ( > 0). Since t h e root is in t h e s t a t e ( n , 0 , • • • , 0 ) with weight n(()) > 2k, t h e leaf has weight m o r e t h a n 2?. Note t h a t every leaf m u s t be in a s t a t e of t y p e ( 0 , - - - , 0 , l , 0 , - - - , 0 ) . Thus, r —; for some i, 0 < i < r, which is impossible. In t h e case of odd n, t h e proof is similar.
•
8.2 General Lower and Upper
Bounds
157
To o b t a i n a b e t t e r u p p e r b o u n d , one needs to divide t h e weight function evenly. However, as r increases, t h e difficulty of doing t h a t increases rapidly. Rivest et al.[19] found an interesting way to get around this trouble. T h e y p u t n items at - , - , • • •, 1 in t h e interval (0,1] and use t h e question u Is x > c?" corresponding to a test on t h e set of items located in (c, 1] where x is t h e location of t h e unique defective item. T h e n they proved t h a t there exists a c to divide a continuous version of t h e weight function evenly. In this way, they proved t h e following upper b o u n d . T h e o r e m 8.2.2 M r ( l , n ) <min{fc + r l ™ ( (
J ) < 2*}.
A detailed proof of this result will be given after t h e following l e m m a . L e m m a 8 . 2 . 3 For any two natural numbers k and r, let e(A;,r) denote the smallest e such that k "Yes-No" questions about an unknown x E (0,1], up to r of which may receive erroneous answers, are sufficient in the worst case to determine a subset A of (0,1] with x £ A and \A\ < e. Then
Moreover, x < cV\
this smallest
e can be achieved
by a strategy
using only comparisons
"Is
Proof At t h e stage t h a t q questions r e m a i n , t h e s t a t e of knowledge of t h e questioner can be represented by an (r + l ) - t u p l e (A 0 , Au • • • , Ar) where Ai is t h e set of points satisfying all b u t i answers. Consider t h e weight function
wk(Ao,Au---,Ar)
= ^2\Ai\ f ( _ J
where \A{\ is t h e total length of A{. Note t h a t any "Yes-No" question is equivalent to a question "Is x G T ? " for some set T C (0,1]. So, it suffices to consider the question of t h e latter t y p e . For t h e question "Is x G T ? " , a "Yes"-answer results in a s t a t e K , A ; , . - - , A ' r ) with A'0 = A0 fl T and Ai = (Ai fl T) U (i4t-_i \ T ) for 1 < i < r and a "No"-answer results in a s t a t e (AQ, A", • • •, A") with
K = A0\T
Unreliable Tests
158 and A'l = {A{ \ T) u (A,-_! n r). Clearly,
wk(A0, A l 5 • • •, A r ) = wfc_i(Aj,, Ai, • • •, A'r) + t n M ( ) 4 j , A?, • • •, A'r'). Thus, each test can reduce the weight by at most a half. Note that «*((O,l],0,---,0)=
((*))
and w0(A0,Al,---,Ar)
= \A0\ + \At\ + • • • + \Ar\ = e{k,r).
Thus,
<(*.')> ((J)) 2~\ The above analysis also shows that the best questioning strategy is to choose the next question "Is x € T?" such that the two resultant weights are equal. Any such strategy would make e(k,r) achieve (I ))2~k. This can be done with a comparison "Is x < c?". In fact, Wk-i(A'0, A[, • • •, A'r) and wk^i(A'g,A",- • -,A") are continuous functions of c. Moreover, for c = 0,
=
u>k-i(®,AQ,---,Ar-i)
- E-™((t:!)) =
wk-i(AQ,Ai,---,AT)
and similarly, for c = 1,
t i ^ K , A;, • • •, A;) >«;»_!«, A';, • • •, A'/). Thus, there exists c such that the two weights are equal. • Proof of Theorem 8.2.2. Place n items at - , - , • • •, 1, respectively. Consider the optimal comparison strategy for the continuous problem as described in the proof of Lemma 8.2.3. Let (Ao, Ai, • • •, A r ) be the final state. We will show that r additional comparison questions suffice to reduce the set U;_0Ar to a single interval. Since this
interval has length at most ε(k,r), it contains at most one item if ε(k,r) < 1/n. (Note that the interval is always of the form (a, b].) This means that
$$M_r(1, n) \le \min\left\{ k + r \;\middle|\; n \left(\binom{k}{r}\right) < 2^k \right\}.$$
Now we show how to reduce ∪_{i=0}^r A_i to a single interval. Suppose that the questions "Is x ≤ c_u'?" for u = 1, ..., k', where c_1' ≤ c_2' ≤ ... ≤ c_{k'}', received the "Yes"-answer, and the questions "Is x ≤ c_v''?" for v = 1, ..., k'', where c_1'' ≥ c_2'' ≥ ... ≥ c_{k''}'', received the "No"-answer. For convenience, define c_u' = 1 for u > k' and c_v'' = 0 for v > k''. Then ∪_{i=0}^r A_i ⊆ (c_{r+1}'', c_{r+1}']. For y ∈ (c_{r+1}'', c_{r+1}'], y ∉ ∪_{i=0}^r A_i if and only if there exists u such that c_u' < y ≤ c_{r+1-u}''. It follows that
$$\bigcup_{i=0}^{r} A_i = (c_{r+1}'', c_{r+1}'] \setminus \bigcup_{u=1}^{r} (c_u', c_{r+1-u}'']$$
(see Figure 8.1).

[Figure 8.1: The set ∪_{i=0}^r A_i, showing the "Yes"-cutpoints c_u' and "No"-cutpoints c_v'' on (0,1]; figure not reproduced.]

Clearly, each interval of ∪_{i=0}^r A_i contains at least one of the intervals
$$(c_u'', c_{r+2-u}'] \quad \text{for } c_u'' < c_{r+2-u}' \text{ and } 1 \le u \le r+1.$$
There are at most r + 1 intervals in this list. Choose the next question "Is x ≤ c?" such that c is located between two intervals of ∪_{i=0}^r A_i. Note that either answer to this question reduces the number of intervals in the above list by at least one. Thus, at most r questions suffice to reduce ∪_{i=0}^r A_i to a single interval. □
Consider an algorithm α for identifying the unique defective from n items with at most r lies. When α is represented by a binary tree, every leaf is associated with the identified defective and each edge is labeled by a test outcome 0 or 1, representing "pure" and "contaminated" respectively. Then each path from the root to a leaf assigns a binary string to the item labeling the leaf. This code has the property that for any two different items, their binary strings differ in at least 2r + 1 bits. Thus, α corresponds to an r-error-correcting prefix code for n objects, and vice versa. Error-correcting codes are an important subject in coding theory. Pelc [14] compared Ulam's problem with coding theory and raised the following problems: What is the minimum number of "Yes-No" questions sufficient to find an unknown x ∈ {1, ..., n} or to detect errors, if up to k errors are possible? What is the minimum number of "Yes-No" questions sufficient to find an unknown x ∈ {1, ..., n} or to determine the number of lies, if up to k lies are possible? For the first problem, he gave a complete answer: the number is ⌈log n⌉ + k. For the second problem, he obtained the answer for k = 2 and left the problem open for k ≥ 3. Pelc [13] studied a constrained group testing problem with a lie. We will discuss it in Chapter 11.
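The ⌈log n⌉ + k answer for the search-or-detect problem can be illustrated by a simple scheme: run an ordinary binary search and then ask k confirmation questions. With at most k lies in total, k "Yes" confirmations certify the candidate (a wrong candidate would require at least k + 1 lies), while any "No" proves that some error occurred. The sketch below is our own illustration of this idea; the oracle interface and names are assumptions, not from the text:

```python
import math

def search_or_detect(n, oracle, k):
    """Find x in {0, ..., n-1} or report that errors occurred, using
    ceil(log2 n) + k questions when at most k of the answers are lies.
    oracle(T) answers the question "Is x in T?" (possibly lying)."""
    lo, hi = 0, n - 1
    for _ in range(math.ceil(math.log2(n))):
        mid = (lo + hi) // 2
        if oracle(set(range(lo, mid + 1))):
            hi = mid
        else:
            lo = mid + 1
    candidate = lo
    # k confirmations: with at most k lies in total, k "Yes"es certify the
    # candidate, and any "No" proves that an error occurred somewhere.
    if all(oracle({candidate}) for _ in range(k)):
        return ("found", candidate)
    return ("error detected",)

x = 37
assert search_or_detect(100, lambda T: x in T, 1) == ("found", 37)

state = [1]                      # an oracle that lies exactly once, at the start
def one_lie(T):
    if state[0]:
        state[0] = 0
        return x not in T
    return x in T
assert search_or_detect(100, one_lie, 1) == ("error detected",)
print("search-or-detect sketch ok")
```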
8.3
Linearly Bounded Lies (1)
Lies are said to be linearly bounded if each initial sequence of m tests contains at most qm lies for a constant q (0 ≤ q < 1). So, an item can be identified as good (or defective) if and only if in some initial sequence of m tests, the item is identified as good (or defective) more than qm times. From the upper bound in Theorem 8.2.2, Pelc [15] was able to derive the following.

Theorem 8.3.1 If 0 ≤ q < 1/3, then there exists a constant c such that for k ≥ c log n,
$$M_{qk}(1, n) \le k.$$

The proof of this theorem is based on a well-known inequality on $\left(\binom{m}{j}\right)$.

Lemma 8.3.2 For j ≤ m/2,
$$\left(\binom{m}{j}\right) \le 2^{mH(j/m)},$$
where H(x) = −x log x − (1−x) log(1−x).

Proof. It is proved by induction on m. The initial case m = 1 is trivial. In the induction step, if j = m/2, then H(j/m) = 1 and hence $\left(\binom{m}{j}\right) \le 2^m = 2^{mH(j/m)}$. If j ≤ (m−1)/2, then by the identity $\binom{m}{i} = \binom{m-1}{i} + \binom{m-1}{i-1}$,
$$\left(\binom{m}{j}\right) = \left(\binom{m-1}{j}\right) + \left(\binom{m-1}{j-1}\right) \le 2^{(m-1)H(j/(m-1))} + 2^{(m-1)H((j-1)/(m-1))}$$
by the induction hypothesis, and a direct computation shows that
$$2^{(m-1)H(j/(m-1))} + 2^{(m-1)H((j-1)/(m-1))} \le 2^{mH(j/m)}. \qquad \Box$$
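The inequality of Lemma 8.3.2 can also be checked numerically; a quick sketch (our own illustration):

```python
from math import comb, log2

def H(x):
    # binary entropy, with H(0) = H(1) = 0 by convention
    return 0.0 if x in (0.0, 1.0) else -x * log2(x) - (1 - x) * log2(1 - x)

# ((m j)) = sum_{i<=j} C(m, i) <= 2^(m H(j/m)) for all j <= m/2
for m in range(1, 200):
    for j in range(m // 2 + 1):
        assert sum(comb(m, i) for i in range(j + 1)) <= 2 ** (m * H(j / m)) + 1e-9
print("Lemma 8.3.2 holds for all m < 200")
```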
Proof of Theorem 8.3.1. Apply Theorem 8.2.2 with r = qk and (1−q)k weight-halving questions. By Lemma 8.3.2,
$$n \left(\binom{(1-q)k}{qk}\right) \le n \cdot 2^{(1-q)k\,H(q/(1-q))} < 2^{(1-q)k}$$
whenever k ≥ c log n, where
$$c = \frac{1}{(1-q)\left(1 - H\!\left(\frac{q}{1-q}\right)\right)};$$
note that q < 1/3 guarantees q/(1−q) < 1/2, so H(q/(1−q)) < 1 and c > 0. Hence M_{qk}(1,n) ≤ k. □

Theorem 8.3.1 means that for q < 1/3, the defective can be identified through O(log n) tests. It was a major open problem in [15] whether for 1/3 ≤ q < 1/2 the unique defective can be identified through O(log n) tests or not. Aslam and Dhagat [1] solved this problem by providing a positive answer. We will exhibit their solution in the next two sections. Before doing so, we introduce in this section some results of Spencer and Winkler [22] which are helpful for understanding the nature of the problem.

Spencer and Winkler [22] considered three versions of linearly bounded lies in a game with two players, "Paul" and "Carole". Paul is the questioner and Carole is the answerer. The game proceeds as follows: Carole thinks of an item x from a set of n items; Paul tries to determine x. For fixed k and q, a winning strategy for Paul is an algorithm by which he can identify x with at most k questions if Carole follows a certain rule. The following three rules give three versions of the game.

(A) For each question of Paul, Carole has to answer immediately. In addition, every initial sequence of k' of Carole's answers contains at most k'q lies.

(B) For each question of Paul, Carole has to answer immediately. In addition, out of the k answers, Carole can lie at most kq times.

(C) Carole can wait until Paul submits all k questions and then she gives k answers which contain at most kq lies.
Note that any "Yes-No" question is equivalent to a question "Is x ∈ T?" for some subset T of items. It is easy to see that the first version is equivalent to the linearly-bounded-lie model defined at the beginning of this section. They proved the following for it.

Theorem 8.3.3 If q ≥ 1/2 and n ≥ 3, then there does not exist a strategy to identify an unknown x from {1, ..., n} under the condition that the possible lies are linearly bounded with ratio q, i.e., there does not exist a winning strategy for Paul in the first version of the game.

Proof. Carole chooses two items, say 1 and 2, one of which is the unknown x. Then she uses the following strategy.

(1) If Paul asks a question "Is x ∈ T?" where T contains exactly one of items 1 and 2, then Carole gives an answer by alternately assuming
(a) 1 is the unknown x,
(b) 2 is the unknown x.

(2) If (1) does not occur, then Carole always tells the truth.

A crucial fact is that if n ≥ 3, then Paul must ask questions for which (2) occurs in order to find that the unknown x is not an item other than 1 and 2. Thus, both conclusions (a) and (b) are compatible with at least one half of the answers. This means that Paul cannot distinguish (a) from (b). □

Theorem 8.3.3 was also discovered by Frazier [8]. For the second version, there exists a winning strategy for Paul if and only if there exists a natural number k such that M_{qk}(1,n) ≤ k. Theorem 8.3.1 says that if 0 ≤ q < 1/3, then such a k exists. Spencer and Winkler also proved the following negative result.

Theorem 8.3.4 Let q ≥ 1/3 and n ≥ 5. Then there does not exist k such that M_{qk}(1,n) ≤ k, i.e., there does not exist a winning strategy for Paul in the second version of the game.

Proof. Let A_i(t) be the set of items satisfying all but i answers at time t (i.e., t − 1 questions have been answered and Paul is asking the t-th question). Place A_0(t), A_1(t), ... on a line. Then naturally, an ordering is assigned to all items. Each answer of Carole moves some items from A_i to A_{i+1}. All items in A_i for i > kq are considered removed items, while the others are remaining items. Consider the following strategy of Carole.

(1) As long as |A_0(t)| ≥ 3, Carole makes sure that no more than ⌊|A_0(t)|/2⌋ items move from A_0.

(2) As long as at least three items remain, Carole never moves the first three items at the same time.
(3) If only two items remain, Carole never moves the two items at the same time.

Let i_1, i_2, and i_3 be the three indices such that the first three items at time t are in A_{i_1}, A_{i_2}, and A_{i_3}, respectively. Define
$$f(t) = i_1 + i_2 + \min(i_3, \lfloor qk \rfloor + 1).$$
Then f(t+1) ≤ f(t) + 1 for 1 ≤ t ≤ k. Since n ≥ 5, |A_0(2)| ≥ 3; thus f(1) = f(2) = 0. Suppose to the contrary that Paul has a winning strategy. Then at the end of the game, the first item must be in A_{⌊qk⌋} and the second item has just been removed, i.e., it has just moved to A_{⌊qk⌋+1}. So, for some k' ≤ k, f(k'+1) = 3⌊qk⌋ + 2. Thus,
$$3\lfloor qk \rfloor + 2 = f(k'+1) \le 1 + f(k') \le \cdots \le (k'-1) + f(2) \le k - 1,$$
that is, k ≥ 3(⌊qk⌋ + 1) > 3qk, contradicting q ≥ 1/3. □
The third version of the game is equivalent to nonadaptive CGT with one defective and linearly bounded lies. (In this case, since all answers are given at the same time, the concept of an initial sequence is useless.) For this version, Spencer and Winkler proved the following.

Theorem 8.3.5 In the third version of the game, Paul wins with O(log n) questions if q < 1/4; Paul has no winning strategy if q > 1/4; when q = 1/4, Paul wins with O(n) questions.

From Theorems 8.3.3 and 8.3.4, one can understand that for 1/3 ≤ q < 1/2, a winning strategy of Paul for the first version of the game has to use the on-line boundary (at most qt lies at time t) to obtain true outcomes. How many times does Paul have to use it? Aslam and Dhagat [1] gave a three-stage strategy. In the first stage, Paul uses it only once to remove all but O(log n) items. In the second stage, Paul again needs to use it only once to remove all but O(1) items. In the last stage, Paul needs to use it O(1) times. Thus, in total, Paul needs to use the on-line boundary only a constant number of times. What is the minimum number? It is an interesting open question.

When there is no deterministic winning strategy for Paul, probabilistic algorithms may be useful. Pelc [15] proved that for 1/3 ≤ q < 1/2, Paul can still win the second version of the game with any fixed reliability p < 1 by using O(log² n) questions.
8.4
The Chip Game
Aslam and Dhagat [1] proposed a different approach to establish the upper bound for M_r(1,n). The advantage of their approach is in the implementation: no weight function needs to be computed in their algorithm, whereas in the approach of Rivest et al. and others, the weight function is involved and a lot of computation is required.

How can Aslam and Dhagat eliminate the weight function? Here is a little explanation before introducing their method. From the use of the weight function, it is easy to see that each new test should be chosen to divide the current weight as evenly as possible. One way to do this is to choose the new test on a set consisting of nearly half of the elements of A_i for every i. This strategy is obtained by keeping the weight in mind, but without depending on the weight.

For a simple description of their strategy, Aslam and Dhagat [1] used a chip game formulation. The chip game is played on a one-dimensional board with levels from 0 upward (see Figure 8.2). There are two players, a chooser Paul and a pusher Carole.

[Figure 8.2: The chip game: levels 0, 1, 2, 3, ..., with the boundary line at level b; figure not reproduced.]

At the beginning all chips are on level 0. Each chip represents an item. (Chips and items will not be distinguished in the rest of this chapter.) So, they are labeled by 1, 2, ..., n. At each step, the chooser Paul divides {1, 2, ..., n} into two disjoint subsets. Then the pusher Carole picks one of these two subsets and pushes every chip in this subset to the next level. There is a boundary line at some level. Every chip that passes over the boundary line can be removed by Paul. To win, Paul must remove all but one chip from the board.

The location of the boundary line is based on how many errors are allowed. For up to r errors, the boundary line is set at level r. When r is a function of the number of tests, the boundary line has to be updated at each step. For example, if lies are linearly bounded, i.e., r = qk at step k, then the boundary line should be placed at level ⌊qk⌋ at step k; it is updated by moving it forward one level approximately every 1/q steps.

Group testing for one defective can be explained as a chip game in the following way. A test corresponds to a step of the chip game. When Paul divides {1, 2, ..., n} into two subsets, it means that the test is on one of these two subsets. (It does not matter which subset is tested, because the test outcome is the same: one subset is pure and the other is contaminated.) The test outcome corresponds to the choice of Carole: the subset pushed by Carole is the pure subset. From this correspondence, it is easy to see that each chip at level i is contained in pure subsets i times. If it is defective, then i lies have been made. So, the chips at level i form exactly A_i, the set of items each of which, if defective, satisfies all but exactly i test outcomes in the test history. The following table summarizes the above description.
outcomes in the test history. The following table summarizes the above description. group testing with one defective the test subset and its complement the pure subset the items in the set Ai the state
the chip game two subsets obtained by Paul's division the subset pushed by Carole the chips at level i the sequence of the number of chips at each level
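The d = 1 chip game with Paul's halving strategy can be simulated directly from this correspondence. The following sketch is our own illustration; Carole is played by a simple stalling heuristic (always pushing the smaller group), not by her optimal strategy:

```python
def chip_game(n, r):
    """d = 1 chip game, boundary at level r.  Paul labels chips level by level
    and splits them by label parity (the halving strategy described above);
    Carole (here a simple stalling adversary, not her optimal play) always
    pushes the smaller group.  Returns the number of steps until all but one
    chip has been pushed past the boundary and removed."""
    h = [n] + [0] * r                    # h[i] = number of chips at level i
    steps = 0
    while sum(h) > 1:
        a, b = [0] * (r + 1), [0] * (r + 1)
        label = 0
        for i in range(r + 1):           # label chips one level after another,
            for _ in range(h[i]):        # alternating them between two groups
                label += 1
                (a if label % 2 else b)[i] += 1
        push = min(a, b, key=sum)        # Carole pushes the smaller group
        for i in range(r, -1, -1):       # pushed chips move up one level;
            h[i] -= push[i]              # past level r they leave the board
            if i < r:
                h[i + 1] += push[i]
        steps += 1
    return steps

print(chip_game(1000, 0))   # 10 = ceil(log2 1000): plain halving, no lies
print(chip_game(1000, 2))   # somewhat more steps pay for the two lie levels
```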
Aslam and Dhagat's approach can be extended to more defectives. In the generalized chip game, the chip board and players are the same as before. At the beginning all chips are again at level 0. However, at each step the chooser Paul divides {1, 2, ..., n} into d + 1 disjoint subsets instead of two. The pusher Carole picks one of the d + 1 subsets and pushes every chip in the chosen subset to the next level. To win, Paul must eliminate all but d chips from the board.

The generalized chip game corresponds to a certain type of group testing algorithm. At each step, Carole's choice corresponds to the testing in the following way: sequentially test d of the d + 1 sets determined by Paul. If a test outcome is negative, then push the tested set and move to the next step; if all d test outcomes are positive, then push the (d+1)-st set.

Let h_m(i) denote the number of chips at level i after m steps. Consider the following strategy for Paul: at each step, first label all chips on the board, one level after another, with natural numbers starting from 1; then choose the i-th group to consist of the chips with labels congruent to i modulo d + 1. According to this strategy, at step m + 1 each group contains between ⌊h_m(i)/(d+1)⌋ and ⌈h_m(i)/(d+1)⌉ chips at the i-th level. Denote
$$b_m(i) = \frac{n\, d^{\,m-i}}{(d+1)^m} \binom{m}{i}$$
and Δ_m(i) = h_m(i) − b_m(i). The normalized binomial coefficient b_m(i) is used to approximate h_m(i). In fact, to find out how many chips are left on the board, one needs to estimate Σ_{i=0}^r h_m(i). However, this sum is hard to determine exactly. As a replacement, an upper bound will be given for Σ_{i≤j} Δ_m(i). The following results are generalizations of those in [1].

Lemma 8.4.1 For all m ≥ 0, Δ_m(0) ≤ d.

Proof. This is proved by induction on m. For m = 0, h_0(0) = n = b_0(0) and hence Δ_0(0) = 0. Assume Δ_{m-1}(0) ≤ d. Then
$$h_m(0) \le h_{m-1}(0) - \left\lfloor \frac{h_{m-1}(0)}{d+1} \right\rfloor \le \frac{h_{m-1}(0)\,d}{d+1} + \frac{d}{d+1} = \frac{(b_{m-1}(0) + \Delta_{m-1}(0))\,d}{d+1} + \frac{d}{d+1} \le b_m(0) + d.$$
Thus, Δ_m(0) = h_m(0) − b_m(0) ≤ d. □

Lemma 8.4.2 For all m ≥ 0, Σ_{i=0}^m Δ_m(i) = 0.

Proof. It follows immediately from the fact that Σ_{i=0}^m h_m(i) = n = Σ_{i=0}^m b_m(i). □

Lemma 8.4.3 $\displaystyle \sum_{j=0}^{i-1} b_{m-1}(j) + \frac{b_{m-1}(i)\,d}{d+1} = \sum_{j=0}^{i} b_m(j).$

Proof. Note that $\binom{m-1}{j} + \binom{m-1}{j-1} = \binom{m}{j}$. Hence
$$\sum_{j=0}^{i-1} b_{m-1}(j) + \frac{b_{m-1}(i)\,d}{d+1}
= \frac{n}{(d+1)^m} \left[ \sum_{j=0}^{i-1} (d+1)\, d^{\,m-1-j} \binom{m-1}{j} + d^{\,m-i} \binom{m-1}{i} \right]
= \frac{n}{(d+1)^m} \sum_{j=0}^{i} d^{\,m-j} \left[ \binom{m-1}{j} + \binom{m-1}{j-1} \right]
= \sum_{j=0}^{i} b_m(j). \qquad \Box$$
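The definition of b_m(i) and the identity of Lemma 8.4.3 are easy to verify numerically; a small sketch (our own illustration):

```python
from math import comb

def b(m, i, n, d):
    # normalized binomial count b_m(i) = n * d**(m-i) * C(m, i) / (d+1)**m
    return n * d ** (m - i) * comb(m, i) / (d + 1) ** m

# Lemma 8.4.3: sum_{j<i} b_{m-1}(j) + b_{m-1}(i)*d/(d+1) = sum_{j<=i} b_m(j)
n = 1000.0
for d in (1, 2, 3):
    for m in range(1, 12):
        for i in range(m + 1):
            lhs = sum(b(m - 1, j, n, d) for j in range(i)) \
                  + b(m - 1, i, n, d) * d / (d + 1)
            rhs = sum(b(m, j, n, d) for j in range(i + 1))
            assert abs(lhs - rhs) < 1e-6
print("Lemma 8.4.3 checked for d = 1, 2, 3 and m < 12")
```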
Lemma 8.4.4 For all m ≥ 0 and j ≤ m, Σ_{i=0}^j Δ_m(i) ≤ (j+1)d.

Proof. This is proved by induction on m. The case m = 0 is trivial because Δ_0(0) = 0. In the induction step, for j = 0 and j = m the lemma follows from Lemmas 8.4.1 and 8.4.2. For 0 < j < m, note that
$$\sum_{i=0}^{j} h_m(i) \le \sum_{i=0}^{j} h_{m-1}(i) - \left\lfloor \frac{h_{m-1}(j)}{d+1} \right\rfloor
\le \sum_{i=0}^{j-1} h_{m-1}(i) + \frac{h_{m-1}(j)\,d}{d+1} + \frac{d}{d+1}
= \sum_{i=0}^{j} b_m(i) + \sum_{i=0}^{j-1} \Delta_{m-1}(i) + \frac{\Delta_{m-1}(j)\,d}{d+1} + \frac{d}{d+1},$$
where the last equality uses Lemma 8.4.3. If Δ_{m-1}(j) ≤ d, then
$$\sum_{i=0}^{j-1} \Delta_{m-1}(i) + \frac{\Delta_{m-1}(j)\,d}{d+1} + \frac{d}{d+1} \le \sum_{i=0}^{j-1} \Delta_{m-1}(i) + d \le (j+1)d.$$
If Δ_{m-1}(j) > d, then
$$\sum_{i=0}^{j-1} \Delta_{m-1}(i) + \frac{\Delta_{m-1}(j)\,d}{d+1} + \frac{d}{d+1} \le \sum_{i=0}^{j} \Delta_{m-1}(i) \le (j+1)d.$$
Therefore,
$$\sum_{i=0}^{j} \Delta_m(i) = \sum_{i=0}^{j} h_m(i) - \sum_{i=0}^{j} b_m(i) \le (j+1)d. \qquad \Box$$
Theorem 8.4.5 After m steps in the chip game, there are at most
$$\sum_{i=0}^{j} b_m(i) + (j+1)d$$
chips at levels 0 through j.
Proof. It follows immediately from Lemma 8.4.4. □

If m satisfies
$$n \sum_{i=0}^{r} \frac{d^{\,m-i}}{(d+1)^m} \binom{m}{i} < 1,$$
then after m steps there are at most d(r+1) chips left at levels 0 through r. Suppose that there are at most r errors. Then all chips not at levels 0 through r can be removed. By Lemma 8.3.2,
$$\sum_{i=0}^{r} d^{\,m-i} \binom{m}{i} \le \left( d \cdot 2^{H(r/m)} \right)^m.$$
Choose m = O(log n) so large that $n \left( d \cdot 2^{H(r/m)} \right)^m < (d+1)^m$; this is possible because log((d+1)/d) > H(r/m) for m sufficiently large. Then, for this chosen m, after m steps there are at most d(r+1) chips left on the board. Since at each step at least one chip is pushed up unless only d chips are left on the board, dr(r+1) further steps are enough to remove the other dr good chips. Therefore, it follows that

Corollary 8.4.6 Let r, d, and n be natural numbers with d ≤ n. Then
$$M_r(d, n) \le dr(r+1) + \min\left\{ m \;\middle|\; n \sum_{i=0}^{r} d^{\,m-i} \binom{m}{i} < (d+1)^m \right\}.$$
Furthermore, if r is a constant, then M_r(d,n) = O(log n).

When d = 1, the upper bound in this corollary is weaker than that in Theorem 8.2.2. We conjecture that
$$M_r(d, n) \le dr + \min\left\{ m \;\middle|\; n \sum_{i=0}^{r} d^{\,m-i} \binom{m}{i} < (d+1)^m \right\}$$
and that for any r and d, there exists an n for which equality holds.

Finally, a remark should be made that in the above results, the boundary is used only at the end of the game, i.e., all identified chips are removed according to the last boundary line.
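The number m of chip-game steps in the above analysis, i.e., the smallest m with n Σ_{i≤r} d^{m−i} C(m,i) < (d+1)^m, is straightforward to compute. A sketch (our own illustration, assuming this is the intended quantity):

```python
from math import comb

def steps_needed(n, d, r):
    # smallest m with n * sum_{i<=r} d**(m-i) * C(m, i) < (d+1)**m; after that
    # many chip-game steps, at most d*(r+1) chips remain at levels 0..r
    m = 1
    while n * sum(d ** (m - i) * comb(m, i) for i in range(r + 1)) >= (d + 1) ** m:
        m += 1
    return m

print(steps_needed(10 ** 6, 1, 0))   # 20: one defective, no lies
print(steps_needed(10 ** 6, 2, 3))   # two defectives, up to three lies
```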
8.5
Linearly Bounded Lies (2)
In Theorem 8.4.5, set r = qdm and
$$m = \left\lceil \frac{\log n}{\log\frac{d+1}{d} - H(qd)} \right\rceil \quad \left( = O(\log n) \text{ when } \log\tfrac{d+1}{d} > H(qd) \right).$$
Then the following is obtained.
Theorem 8.5.1 Suppose that log((d+1)/d) > H(qd) and qd ≤ 1/2. Then with the strategy provided in the last section, the chooser can remove all but O(log n) chips from the board within O(log n) steps.

Based on this result, the following theorem will be proved in this section.

Theorem 8.5.2 Suppose H(qd) < log((d+1)/d). Then there exists an algorithm identifying all d defectives from n items in O(log n) tests with linearly bounded errors in proportion q.

The algorithm consists of three stages. The first stage corresponds to the chip game described in the last section, which achieves the result stated in Theorem 8.5.1. In order to remove the remaining O(log n) good chips, two more stages are used; they are similar to those introduced in [1]. In the second stage, all but O(1) chips on the board will be removed in O(log n) tests. In the third stage, all but d chips on the board will be removed in O(log n) tests. The details are as follows.

Suppose that there are c_1 log n chips left on the board with the boundary line at level c_2 log n. Let us still use the chip-game language to describe the second stage of the algorithm. At the beginning of the second stage, Paul moves some chips to the left so that each level contains at most d + 1 chips, and then continues to form groups in the same way as in the first stage. In order to obtain such an initial state for the second stage, Paul may need to add ⌈(c_1 log n)/(d+1)⌉ levels below level 0. An equivalent treatment is to relabel the levels so that the first level is again level 0 and the boundary line is at level c_2 log n + ⌈(c_1 log n)/(d+1)⌉ (= c_3 log n). In the following, the latter way is used. The next lemma states an invariant property of the second stage.

Lemma 8.5.3 The property that each level on the board contains at most d + 1 chips is invariant during the second stage.

Proof. At each step, a level receives at most one chip from the previous level.
If this level contains exactly d + 1 chips at the beginning of the step, then it must have a chip going to the next level. Thus, at the end of the step, each level still contains at most d + 1 chips. □

Choose k = ⌈2/(1 − (d+1)dq)⌉. It will be shown that Paul removes all but (d+1)k chips on the board in O(log n) steps. To do so, define the level weight of the board to be the sum of the level numbers at which its (d+1)k leftmost chips are located.

Lemma 8.5.4 After each step, the level weight of the board increases by at least k − 1.

Proof. Among the (d+1)k leftmost chips, at least k are pushed up at each step. Of those k chips, at least k − 1 remain in the set of (d+1)k leftmost chips after the step. Thus, the level weight of the board increases by at least k − 1. □
Theorem 8.5.5 If q(d+1)d < 1, then at the second stage, Paul removes all but (d+1)k chips from the board in O(log n) steps.
Proof. Let S be the number of steps taken during the second stage and let W be the level weight of the board at the end of the second stage. Note that initially the level weight of the board is nonnegative. By Lemma 8.5.4, W ≥ S(k − 1). Moreover, the boundary line at the end of the second stage is at level c_3 log n + ⌊qdS⌋. Thus, W ≤ (d+1)k(c_3 log n + qdS). Combining the two inequalities, one obtains
$$S \le \frac{(d+1)k\,c_3 \log n}{k - 1 - q(d+1)dk}.$$
So, S = O(log n). □
Note that the generalized chip game corresponds to a certain type of testing method. Since a testing method of a different type is used at the third stage, it is convenient to return to the original terminology of group testing, except that the chip board stays. Suppose that there are c chips left on the board after the second stage and that meanwhile the boundary line is at level b = O(log n). First, choose an algorithm α for identifying d defectives from c items. Then perform α with brute force in the following way: for each test of α, repeat the test until either the number of negative outcomes or the number of positive outcomes exceeds the boundary line. Let x_i denote the number of repetitions for the i-th test, and suppose that the boundary line has moved to level b_i after the brute force on the i-th test. Then (1 − q)x_i ≤ 1 + b_{i-1} + ⌊q x_i⌋. Thus,
$$x_i \le \frac{1 + b_{i-1}}{1 - 2q}$$
and
$$1 + b_i \le 1 + b_{i-1} + \lfloor q x_i \rfloor \le \frac{(1 + b_{i-1})(1 - q)}{1 - 2q}.$$
Consequently,
$$1 + b_i \le (1 + b) \left( \frac{1-q}{1-2q} \right)^i,$$
$$x_i \le \frac{1 + b}{1 - 2q} \left( \frac{1-q}{1-2q} \right)^{i-1},$$
and
$$\sum_{i=1}^{m} x_i \le \frac{1 + b}{q} \left[ \left( \frac{1-q}{1-2q} \right)^m - 1 \right] = O(\log n),$$
where m is the number of reliable tests needed for identifying d defectives from c items, which is a constant. Finally, the proof of Theorem 8.5.2 is completed by an analysis of the three stages and by noting that H(qd) < log((d+1)/d) implies q(d+1)d < 1. In fact, if qd ≥ 1/(d+1), then
$$H(qd) \ge H\!\left(\frac{1}{d+1}\right) = \frac{1}{d+1}\log(d+1) + \frac{d}{d+1}\log\frac{d+1}{d} \ge \log\frac{d+1}{d},$$
since H(x) is increasing for x ≤ 1/2. □
and H(x) is increasing when x < A. • Note that the number of reliable tests for identifying d defectives from n items is at most dlog j + d by Lemma 7.5.2. Thus, the number of linearly bounded unreliable tests for identifying d defective ones from n items is at most |dlogf+dj
<
hi
1
~g^(i°g^+i)_1\
for q < 1/2. For H(qd) > log - J i and q < | , it is still an open problem whether or not there exists an algorithm identifying all d defective ones from n items by using at most O(log n) tests with linearly bounded errors in proportion q. Theorem 8.5.2 for d = 1 is exactly the result obtained by Aslam and Dhagat [1]. Corollary 8.5.6 Suppose q < 1/2. Then there exists an algorithm identifying the unique defective from n items within 0(log n) tests with linearly bounded errors in proportion q. Pelc [15] also studied the following model: Each test makes an error with probability q. This model is called the known error probability model which is different from the linearly bounded error model. In fact, in the former one, n tests may have more than qn errors. (Exactly k errors exist with probability ( ^ ( ^ ( l — q)n~k.) But, in the latter one, n tests cannot have more than qn errors. Since in the known error probability model, the number of errors has no upper bound, no item can be identified definitely. So, the reliability of the identification has to be introduced. An algorithm has reliability p if each item can be identified to be a good (or defective) item with probability p. Pelc [15] also proposed the following open question. With error probability q <\, can the unique defective be identified with O(logn) questions for every reliability p? This question can be answered positively by using the result of Aslam and Dhagat and an argument of Pelc.
In fact, let r_k be the number of errors in a sequence of k answers. By Chebyshev's inequality,
$$\Pr\left( \left| \frac{r_k}{k} - q \right| \ge \varepsilon \right) \le \frac{q(1-q)}{k\varepsilon^2}$$
for any ε > 0. Let ε = (1/2 − q)/2. Then q + ε < 1/2 and
$$\Pr(r_k < (q+\varepsilon)k) \ge \Pr\left( \left| \frac{r_k}{k} - q \right| < \varepsilon \right) \ge 1 - \frac{c}{k},$$
where c = 16q(1−q)/(1−2q)². This means that for any fixed reliability p, with probability at least p the number of errors in a sequence of k ≥ c/(1−p) answers is bounded by (q+ε)k < k/2. Note that in the proof of Theorem 8.5.2, the first stage has O(log n) steps and needs to identify items only at the end of the stage. Thus, when n is sufficiently large, the number of steps in the first stage exceeds c/(1−p), so every identification takes place after step ⌈c/(1−p)⌉. This means that every item is identified with reliability p. Thus, Paul's O(log n) winning strategy for the linearly bounded error model also gives an algorithm with reliability p for the known error probability model. Thus, we have

Theorem 8.5.7 With error probability q < 1/2, the unique defective can be identified in O(log n) tests for any reliability p.

For nonadaptive testing, does a similar result exist? It is an open problem. Finally, it is worth mentioning that Brylawski [4] and Schalkwijk [20] pointed out that search with linearly bounded lies is equivalent to the problem of optimal block coding for a noisy channel with a noiseless and delayless feedback channel.
8.6
Other Restrictions on Lies
In Section 8.2, to obtain an upper bound for M_r(1,n), we studied the problem of identifying an unknown point from the continuous domain (0,1]. In this section, we discuss this kind of problem with a general restriction on lies. When the tested objects form a continuous domain (such as a segment, a polygon, etc.), it is impossible to identify an unknown point exactly. Hence, the task is to identify the unknown point with a given accuracy ε, i.e., to determine a set of measure at most ε that contains the unknown point. Let μ(A) denote the measure of a set A. What is the smallest ε such that k tests with at most r lies can in the worst case determine a set A with μ(A) ≤ ε? Lemma 8.2.3 has given the answer. However, there is a very interesting generalization of Lemma 8.2.3, which was given by Ravikumar and Lakshmanan [18].

Consider a test history H of k tests. Denote an erroneous test by 1 and a correct test by 0. Then the lie pattern of H can be described by a binary string. The
restriction on lies can be represented by a set R of binary strings. For example, if R consists of all strings having at most r 1's, then the restriction is that at most r lies are possible. They proved the following.

Theorem 8.6.1 Let k be a natural number and R a restriction on lies. Let ε(k,R) denote the smallest ε such that k "Yes-No" questions about an unknown x ∈ (0,1], with restriction R on lies, are sufficient in the worst case to determine a subset A of (0,1] with x ∈ A and μ(A) ≤ ε. Then
$$\varepsilon(k, R) = |R \cap \{0,1\}^k| \cdot 2^{-k},$$
where {0,1}^k is the set of all binary strings of length k. Moreover, this smallest ε can be achieved by a strategy using only comparisons "Is x ≤ c?".

Actually, the lower bound can be stated more generally.

Theorem 8.6.2 Let k be a natural number and R a restriction on lies. Let ε_D(k,R) denote the smallest ε such that k "Yes-No" questions about an unknown x ∈ D, with restriction R on lies, are sufficient in the worst case to determine a subset A of D with x ∈ A and μ(A) ≤ ε. Then
$$\varepsilon_D(k, R) \ge |R \cap \{0,1\}^k| \cdot 2^{-k} \mu(D).$$
Proof. Consider an algorithm with k tests, which can be represented by a binary tree of depth k. Suppose that the lie patterns in R ∩ {0,1}^k are linearly ordered. Let S_{ij} denote the subset of D such that x ∈ S_{ij} if and only if the choice of x and the j-th lie pattern lead to the i-th leaf of the binary tree. For any fixed lie pattern j ∈ R ∩ {0,1}^k, the sets S_{ij}, i = 1, 2, ..., 2^k, form a partition of D; that is, S_{ij} ∩ S_{i'j} = ∅ for i ≠ i' and ∪_i S_{ij} = D. In addition, S_{ij} ∩ S_{ij'} = ∅ for j ≠ j'. Since one does not know which lie pattern appears, the set of uncertainty after k tests is D_k(i) = ∪_{j ∈ R ∩ {0,1}^k} S_{ij} if the testing process leads to the i-th leaf. Therefore, the worst-case value of the measure of the set of uncertainty is max_i μ(D_k(i)). Note that
$$\sum_i \mu(D_k(i)) = \sum_i \sum_j \mu(S_{ij}) = \sum_j \sum_i \mu(S_{ij}) = \sum_j \mu(D) = \mu(D) \cdot |R \cap \{0,1\}^k|.$$
Therefore,
$$\max_i \mu(D_k(i)) \ge 2^{-k} |R \cap \{0,1\}^k| \, \mu(D). \qquad \Box$$
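The formula ε(k,R) = |R ∩ {0,1}^k|·2^{−k} of Theorem 8.6.1 can be evaluated by enumerating lie patterns. A sketch (the `allowed` predicate interface is our own illustration):

```python
from itertools import product
from math import comb

def epsilon(k, allowed):
    # e(k, R) = |R ∩ {0,1}^k| * 2^(-k); allowed(s) decides whether the
    # binary lie pattern s (a 0/1 tuple) belongs to R
    count = sum(1 for s in product((0, 1), repeat=k) if allowed(s))
    return count / 2 ** k

# R = "at most r lies" recovers e(k, r) = ((k r)) * 2^(-k)
k, r = 10, 2
assert epsilon(k, lambda s: sum(s) <= r) == sum(comb(k, i) for i in range(r + 1)) / 2 ** k

# R_p for p = 10: once a lie occurs, every later answer is also a lie;
# exactly the k + 1 monotone patterns 0...01...1 avoid the substring (1, 0)
def avoids_10(s):
    return all(s[i:i + 2] != (1, 0) for i in range(len(s) - 1))

assert epsilon(10, avoids_10) == 11 / 1024
print("e(10, R_10) =", epsilon(10, avoids_10))
```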
De Bonis, Gargano, and Vaccaro [3] considered more than one unknown point. Suppose that there are d unknown points in (0,1]. Then the search space becomes D = {(x_1, ..., x_d) | x_1, ..., x_d ∈ (0,1]}. Each test consists of a question "Is {x_1, ..., x_d} ∩ T = ∅?" for a subset T of (0,1]. Let ε_d(k,r) denote the smallest ε such that k "Yes-No" questions about an unknown x ∈ D, with at most r lies, are sufficient in the worst case to determine a subset A of D with x ∈ A and μ(A) ≤ ε. By Theorem 8.6.2,
$$\varepsilon_d(k, r) \ge \left(\binom{k}{r}\right) \cdot 2^{-k} \mu(D).$$
They further proved the following.

Theorem 8.6.3 $\varepsilon_2(k, r) = \left(\binom{k}{r}\right) \cdot 2^{-(k+1)}.$
T h e y also o b t a i n e d a similar result for parity testing, i.e., each test consists of a question "Does T contain exactly one of xi: ..., XAV on a subset T of (0,1]. In Section 8.2, L e m m a 8.2.3 was used to obtain an u p p e r b o u n d for t h e discrete version of t h e problem. Could T h e o r e m 8.6.1 be used to obtain a similar result t o T h e o r e m 8.2.2? Czyzowicz, L a k s h m a n a n , and Pelc [5] gave a kind of negative answer. For each string p, let Rp be t h e set of binary strings which do not contain p as a substring. T h e n t h e y proved t h a t T h e o r e m 8 . 6 . 4 Identifying an unknown x from { 1 , . . . , n} by "Yes-No"-queries with restriction Rp on lies is possible if and only if p is one of the following 0, 1, 0 1 , and 10. C o m p a r e d with T h e o r e m 8.6.1, how should one u n d e r s t a n d this result? Applying T h e o r e m 8.6.1 to t h e interval ( 0 , n ] , with a certain n u m b e r of tests, t h e set of uncert a i n t y can have very small measure. However, this set m a y have too m a n y connected c o m p o n e n t s d i s t r i b u t e d "evenly" in (0,n] so t h a t none of 1, . . . , n can be excluded. In this case, identifying x is impossible. Czyzowicz, L a k s h m a n a n , and Pelc [5] also obtained o p t i m a l strategies for t h e four possible cases. In fact, for p = 0 , 1 , it is trivial and for p = 01,10, t h e result is stated in t h e following. T h e o r e m 8 . 6 . 5 The depth of the optimal binary search tree for finding x in { 1 , . . . , n } with restriction R\o on lies is *• '
;
~ \ min{< \(t + l)n + (t-l)
The same result holds for restriction
RQI .
< 2'},
if n is odd.
Rivest et al. [19] introduced another model of erroneous responses. It is supposed that the answer "Yes" is always true and lies occur only in "No"-answers. This is called the half-lie model. In the half-lie model, not every lie pattern leads to a well-defined testing problem. In fact, if the forbidden lie pattern p ends with a 0, then a conflict may result when the testing gives different outcomes according to the half-lie model and to the forbidden lie pattern. To see this, consider p = 0, so that every test outcome must be erroneous. Now, if the query is "Is x ∈ ∅?", the erroneous response is "Yes", which is always supposed to be true in the half-lie model, a contradiction. For forbidden lie patterns p with a 1 at the end, Czyzowicz, Lakshmanan, and Pelc [5] proved the following.

Theorem 8.6.6 In the half-lie model, search with restriction R_p is feasible for every p ending with a 1.
References

[1] J.A. Aslam and A. Dhagat, Searching in the presence of linearly bounded errors, Proceedings of 23rd STOC, 1991, pp. 486-493.
[2] E.R. Berlekamp, Block coding for the binary symmetric channel with noiseless, delayless feedback, in Error-Correcting Codes, (Wiley, New York, 1968) 61-85.
[3] A. De Bonis, L. Gargano, and U. Vaccaro, Group testing with unreliable tests, manuscript.
[4] T.H. Brylawski, The mathematics of Watergate, unpublished manuscript.
[5] J. Czyzowicz, K.B. Lakshmanan, and A. Pelc, Searching with a forbidden lie pattern in responses, Information Processing Letters 37 (1991) 127-132.
[6] J. Czyzowicz, D. Mundici, and A. Pelc, Solution of Ulam's problem on binary search with two lies, J. Combin. Theory A49 (1988) 384-388.
[7] J. Czyzowicz, D. Mundici, and A. Pelc, Ulam's searching game with lies, J. Combin. Theory A52 (1989) 62-76.
[8] M. Frazier, Searching with a nonconstant number of lies, manuscript.
[9] W. Guzicki, Ulam's searching game with two lies, J. Combin. Theory A54 (1990) 1-19.
[10] A. Negro and M. Sereno, Solution of Ulam's problem on binary search with three lies, J. Combin. Theory A59 (1992) 149-154.
Unreliable
176
Tests
[11] A. Negro and M. Sereno, An U l a m ' s searching g a m e with three lies, Advances Mathematics [12] A. Pelc, Solution of U l a m ' s problem on searching with a lie, J. Combin. A44 (1987) 129-140. [13] A. Pelc, Prefix search with a lie, J. Combin.
in
Theory
Theory A48 (1988) 165-173.
[14] A. Pelc, Detecting errors in searching games, J. Combin. 43-54. [15] A. Pelc, Searching with known error probability, Theoretical 63 (1989) 185-202.
Theory
A51 (1989)
Computer
Science
[16] A. Pelc, Detecting a counterfeit coin with unreliable weighings, Ars Combin. (1989) 181-192.
27
[17] B . R a v i k u m a r , K. Ganesan, and K . B . L a k s h m a n a n , On selecting t h e largest element in spite of erroneous information, Proceedings of ICALP '87, 99-89. [18] B . R a v i k u m a r and K . B . L a k s h m a n a n , Coping with known p a t t e r n s of lies in a search g a m e , Theoretical Computer Science 33 (1984) 85-94. [19] R.L. Rivest, A.R. Meyer, D.J. K l e i t m a n , K. W i n k l m a n n , and J. Spencer, Coping with errors in binary search procedures, J. Computer and System Sciences 20 (1980) 396-404. [20] J . P . Schalkwijk, A class of simple and o p t i m a l strategies for block coding on t h e binary s y m m e t r i c channel with noiseless feedback, in IEEE Trans. Information Theory 17:3 (1971) 283-287. [21] J . Spencer, Guess a n u m b e r - w i t h lying, Math. Mag. 57 (1984) 105-108. [22] J. Spencer and P. Winkler, T h r e e thresholds for a liar, P r e p r i n t , 1990. [23] S.M. U l a m , Adventures
of a Mathematician,
(Scribner's New York 1976).
9 Optimal Search in One Variable
Many optimal search problems in one variable have the same flavor as CGT. We study them in this chapter.
9.1 Midpoint Strategy
When a gas pipeline has a hole, one may find it by testing the pressure at some points on the line. If the pressures at two points are different, then a hole must lie between them.

(a) Suppose there exists exactly one hole between points A and B. How does one choose the test points on segment AB to optimize the accuracy in the worst case?

(b) Suppose there exists at most one hole between A and B. How does one solve the same question as in (a)?

Problems of this type provide continuous versions of combinatorial group testing. If the pressures at A and B are already known, then the best choice for a single test is the midpoint, in both problems (a) and (b). In fact, each test breaks the segment into two parts, one of which may contain a hole. In the worst case, the hole always falls in the longer part. So, the midpoint minimizes the longer part. If the pressures at A and B are unknown, then the optimal strategy for (a) is as follows: initially, test one of A and B, say A, and then for each of the remaining tests, choose the midpoint of the segment that contains the hole. In fact, suppose that the initial two tests are at points C and D in [A, B]. Then the test outcome is either that [C, D] contains a hole or that [A, B] \ [C, D] contains a hole. That is, the tests break the segment into two parts; one contains the hole, the other does not. Moreover, any new test reduces the part with the hole by at most half. So, the worst-case optimal strategy is to let the two parts have equal length. The above midpoint strategy meets this requirement.

For (b), if the pressures at A and B are unknown, the midpoint strategy is also a good one in some sense, but not necessarily optimal. To see this, let us compare the midpoint strategy with another strategy. Suppose one uses three tests. Initially, test A.
By the midpoint strategy, the second test is performed at the midpoint C. In the worst case, the two pressures at A and C are the same. So, in order to know whether segment CB contains a hole or not, the third point has to be chosen at B. If C and B have different pressures, then one finds a segment of length (1/2)|AB| containing a hole.
Figure 9.1: Test points

However, if the second test is performed at a point D such that |AD| = (2/3)|AB| (see Figure 9.1), then one gets a better result. In fact, if A and D have the same pressure, then test B; if A and D have different pressures, then test the midpoint E of AD. In this way, one can determine either the nonexistence of a hole or the position of a hole with accuracy (1/3)|AB|. In general, the following holds.

Theorem 9.1.1 Suppose there exists at most one hole between A and B and the pressures at A and B are unknown. For a natural number k, let e(k) denote the smallest e such that k tests are sufficient in the worst case to determine either a hole with accuracy e|AB| or the nonexistence of a hole in [A, B]. Then e(k) = 1/(2^{k-1} - 1).

Proof. First consider a similar problem under the slightly different assumption that the pressure at A is known and the pressure at B is unknown. Let e*(k) denote the smallest e such that k tests are sufficient in the worst case to determine either a hole with accuracy e|AB| or the nonexistence of a hole in [A, B]. The identity e*(k) = 1/(2^k - 1) is proved by induction on k. For k = 1, the unique test must be made at point B (otherwise, it cannot be known whether a hole exists at B or not). In the induction step, suppose the first test is made at point C with |AC| = x|AB|. There are two cases.

Case 1. A and C have the same pressure. So [A, C] has no hole. The remaining k - 1 tests are sufficient in the worst case to determine either a hole with accuracy e*(k-1)|CB| or that [C, B] has no hole.

Case 2. A and C have different pressures. So [A, C] contains a hole and [C, B] does not. The remaining k - 1 tests would be performed in [A, C]. Since both pressures at A and C are known, the optimal strategy is to choose the midpoint. In this way, one can determine a hole with accuracy 2^{1-k}|AC|.
Summarizing the two cases, one sees that

e*(k) = min_{0<x<1} max( x/2^{k-1}, e*(k-1)(1 - x) ).

Since x/2^{k-1} is increasing and e*(k-1)(1 - x) is decreasing as x increases, max( x/2^{k-1}, e*(k-1)(1 - x) ) achieves its minimum when x/2^{k-1} = e*(k-1)(1 - x), i.e., x = 2^{k-1}/(2^k - 1). Therefore, e*(k) = 1/(2^k - 1).
To prove e(k) = 1/(2^{k-1} - 1), it suffices to prove that, when both pressures at A and B are unknown, an optimal strategy must first test one of A and B. Suppose to the contrary that the first test takes place at a point C other than A and B. Without loss of generality, assume |AC| ≥ (1/2)|AB|. Since e*(k-2)|AC| > e*(k-1)|AB|, the remaining k - 1 tests must all be applied to [A, C]. This leaves [C, B] uncertain. So the first test must be made at either A or B. □

Note that if both pressures at A and B are unknown, then the first test in an optimal strategy must be made at either A or B. So the problem can be reduced to the situation in which one of the pressures at A and B is known and the other is unknown. The following corollary establishes an advantage of the midpoint strategy.

Corollary 9.1.2 Suppose that there exists at most one hole between points A and B, the pressure at A is known, and the pressure at B is unknown. Assume that the first test is made at C. Let d(C, k) denote the smallest d such that k tests are sufficient to determine either a hole with accuracy d or the nonexistence of a hole in [A, B]. Let C* denote the midpoint of [A, B]. If C ≠ C*, then there exists k_0 such that for k ≥ k_0, d(C, k) > d(C*, k).

Proof. Denote x = |AC|/|AB|. From the proof of Theorem 9.1.1, it is easy to see that

d(C, k) = max( x/2^{k-1}, e*(k-1)(1 - x) ) |AB|.

So d(C, k) is decreasing for 0 < x < 2^{k-1}/(2^k - 1) and increasing for 2^{k-1}/(2^k - 1) < x < 1. Thus, if |AC| < |CB|, then for all k ≥ 2, d(C, k) > d(C*, k). If |AC| > |CB|, then there exists k_0 such that for k ≥ k_0, |AC| > (2^{k-1}/(2^k - 2))|AB|. For those k, d(C, k) > d(C*, k). □
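The recursion in the proof of Theorem 9.1.1 can be checked mechanically. The following sketch (Python with exact rational arithmetic; the helper name `e_star` is ours, not the book's) balances the two branches of the recursion and confirms the closed forms e*(k) = 1/(2^k - 1) and e(k) = e*(k-1):

```python
from fractions import Fraction

def e_star(k):
    # e*(k): best worst-case accuracy when the pressure at A is known,
    # from the recursion e*(k) = min_x max(x / 2^(k-1), e*(k-1) (1 - x)).
    if k == 1:
        return Fraction(1)          # the single test must be made at B
    prev = e_star(k - 1)
    # The max is minimized where the two branches balance:
    #   x / 2^(k-1) = e*(k-1) (1 - x).
    x = prev / (prev + Fraction(1, 2 ** (k - 1)))
    return x / 2 ** (k - 1)

for k in range(1, 12):
    assert e_star(k) == Fraction(1, 2 ** k - 1)             # e*(k) = 1/(2^k - 1)
for k in range(2, 12):
    assert e_star(k - 1) == Fraction(1, 2 ** (k - 1) - 1)   # e(k) = e*(k-1)
```

Exact fractions avoid any floating-point doubt about the identity; for k = 3 the computation reproduces the accuracy 1/3 of the three-test strategy discussed above.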
9.2 Fibonacci Search
Consider a unimodal function f : [a, b] → R (see Figure 9.2). Choose two points x_0 and x_1 from [a, b] with x_0 < x_1, and compare f(x_0) with f(x_1). If f(x_0) > f(x_1), then the maximum point of f falls into [a, x_1]; otherwise, the maximum point falls into [x_0, b]. This fact provides an iterative method to search for the maximum point in [a, b]. Initially, choose a point x_0 as above. At the kth step, choose one new point x_k which, together with a previously chosen point x_i (i < k), forms a pair in the interval of uncertainty in which the maximum must lie; then delete a piece from the uncertain interval by the above fact. What is the best choice for the points x_k? The situation is similar to problem (b) in the last section. Let F_0 = F_1 = 1 and F_k = F_{k-1} + F_{k-2} for k ≥ 2. Then the characteristic equation for this recursive formula is

x² - x - 1 = 0
Figure 9.2: Unimodal function

which has two roots (1 + √5)/2 and (1 - √5)/2. So, in general,

F_k = (1/√5) [ ((1 + √5)/2)^{k+1} - ((1 - √5)/2)^{k+1} ].
The sequence {F_k} is the well-known Fibonacci sequence.

Theorem 9.2.1 For a natural number k, let e be a number such that k tests are sufficient in the worst case to determine the maximum point with accuracy e|b - a|. Then e ≥ 1/F_k.

Proof. The theorem is proved by induction on k. For k = 1 and k = 2, it is trivial. In the induction step, suppose x_0 and x_1 are chosen in the initial step and the first step. Consider two cases.

Case 1. |a - x_1| ≥ (F_{k-1}/F_k)|a - b|. In the worst case, it may happen that f(x_0) > f(x_1). So [a, x_1] is left as the interval of uncertainty. By the induction hypothesis,

e|a - b| / |a - x_1| ≥ 1/F_{k-1}.

So

e ≥ |a - x_1| / (F_{k-1}|a - b|) ≥ (F_{k-1}/F_k)(1/F_{k-1}) = 1/F_k.

Case 2. |a - x_1| < (F_{k-1}/F_k)|a - b|. Then |x_1 - b| > (F_{k-2}/F_k)|a - b|. In the worst case, the interval [x_0, b] may be left in uncertainty at the first step and the interval [x_1, b] is then left in uncertainty at the second step. Applying the induction hypothesis to [x_1, b], one obtains

e|a - b| / |x_1 - b| ≥ 1/F_{k-2}.
So,

e ≥ |x_1 - b| / (F_{k-2}|a - b|) ≥ (F_{k-2}/F_k)(1/F_{k-2}) = 1/F_k. □

To reach the lower bound 1/F_k, it is easy to see from the above proof that x_0 and x_1 must be chosen such that

|x_0 - b| / |a - b| = |x_1 - a| / |a - b| = F_{k-1}/F_k.
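The placement rule above is the classical Fibonacci search. A minimal sketch follows (Python; the function name and the tiny nudge used at the degenerate final step are our choices): with k function evaluations it returns a bracket of length about 2(b - a)/F_k around the maximum, while Theorem 9.2.1 says no method can do better than (b - a)/F_k.

```python
def fibonacci_max(f, a, b, k):
    # Fibonacci search for the maximum of a unimodal f on [a, b], k >= 3 tests.
    F = [1, 1]
    while len(F) <= k:
        F.append(F[-1] + F[-2])
    # Initial pair placed at relative positions F_{k-2}/F_k and F_{k-1}/F_k.
    x0 = a + (b - a) * F[k - 2] / F[k]
    x1 = a + (b - a) * F[k - 1] / F[k]
    f0, f1 = f(x0), f(x1)
    for i in range(k - 2, 0, -1):
        if f0 > f1:                          # maximum lies in [a, x1]
            b, x1, f1 = x1, x0, f0
            x0 = a + (b - a) * F[i - 1] / F[i + 1]
            if x0 >= x1:                     # last step degenerates: nudge apart
                x0 = x1 - 1e-9 * (b - a)
            f0 = f(x0)
        else:                                # maximum lies in [x0, b]
            a, x0, f0 = x0, x1, f1
            x1 = a + (b - a) * F[i] / F[i + 1]
            if x1 <= x0:
                x1 = x0 + 1e-9 * (b - a)
            f1 = f(x1)
    return a, b

a, b = fibonacci_max(lambda t: -(t - 0.3) ** 2, 0.0, 1.0, 20)
```

Each loop iteration reuses one of the two interior points, so only one new evaluation is needed per comparison; letting the nudge tend to zero recovers the exact 1/F_k bound.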
As k goes to infinity, this ratio goes to (√5 - 1)/2, the well-known golden ratio. Using the golden ratio to determine the points x_k is a well-known line search procedure in nonlinear programming [19]. The method drew special attention from L.-G. Hua [10] [11]. He promoted the golden section search¹ as the most important method in optimum seeking. By using continued fractions, he established the optimality of the golden section search, which is similar to that of the midpoint strategy. In the following, his theorem is given with a proof provided by Tao [26].

For any algorithm α and any unimodal function f on the interval [0, 1] of length one, let L_f(α, n) denote the total length of the intervals of uncertainty after n tests on the function f. Let

L(α, n) = sup_{unimodal f} L_f(α, n).

Theorem 9.2.2 For any algorithm α, there exists n(α) > 0 such that for n ≥ n(α), L(α, n) ≥ q^n, where q = (√5 - 1)/2. Moreover, the equality sign holds if and only if α is the golden section method.

To prove this theorem, Tao first established two lemmas.

Lemma 9.2.3 For any algorithm α, one of the following holds:

L(α, 1) ≥ q,    (9.1)
L(α, 2) ≥ q².   (9.2)
Proof. Suppose that the first and the second test points are x_1 and x_2. Without loss of generality, assume x_1 ≤ x_2. If x_1 ≤ 1 - q, then

L(α, 1) = max(1 - x_1, x_2) ≥ 1 - x_1 ≥ q.

If x_1 > 1 - q, then

L(α, 2) ≥ L(α, 1) - (x_2 - x_1) ≥ x_2 - (x_2 - x_1) = x_1 > 1 - q = q². □
¹The golden section search is as follows: at each step with an interval [a, b] of uncertainty, choose x_0 and x_1 such that |x_0 - a|/|x_1 - a| = |x_1 - a|/|b - a| = (√5 - 1)/2, and then compare f(x_0) and f(x_1).
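The footnote's rule can be sketched directly (Python; the function name is ours). Each comparison after the setup reuses one interior point and shrinks the interval of uncertainty by the factor q = (√5 - 1)/2:

```python
import math

def golden_section_max(f, a, b, steps):
    # Golden section search for the maximum of a unimodal f on [a, b]:
    # interior points split the interval in the ratio q = (sqrt(5) - 1) / 2,
    # and one interior point is reused at every step (q^2 = 1 - q).
    q = (math.sqrt(5) - 1) / 2
    x0, x1 = b - q * (b - a), a + q * (b - a)
    f0, f1 = f(x0), f(x1)
    for _ in range(steps):
        if f0 > f1:                  # maximum lies in [a, x1]
            b, x1, f1 = x1, x0, f0
            x0 = b - q * (b - a)
            f0 = f(x0)
        else:                        # maximum lies in [x0, b]
            a, x0, f0 = x0, x1, f1
            x1 = a + q * (b - a)
            f1 = f(x1)
    return (a + b) / 2, b - a        # estimate and final uncertainty length

x, err = golden_section_max(lambda t: -(t - 0.3) ** 2, 0.0, 1.0, 20)
```

After 20 comparisons the uncertainty is q^20 ≈ 6.6 × 10⁻⁵, consistent with the q^n rate in Theorem 9.2.2.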
Lemma 9.2.4 For any algorithm α, there exists a sequence of natural numbers {n_k} such that
(a) for n ∈ {n_k},

L(α, n) = ∏_{i=1}^{m} (q + ε_i) · ∏_{j=1}^{r} (q² + ε'_j),

where m + 2r = n, ε_i ≥ 0, ε'_j ≥ 0, i = 1, 2, …, m, j = 1, …, r;
(b) 1 ≤ n_{k+1} - n_k ≤ 2.

Proof. By Lemma 9.2.3, either L(α, 1) = q + ε_1 or L(α, 2) = q² + ε'_1 for some ε_1 ≥ 0 and ε'_1 ≥ 0. If the former case occurs, then set n_1 = 1; if the latter case occurs, then set n_1 = 2. Similarly, consider the remaining interval and, in general, define

n_{k+1} = n_k + 1  if  L(α, n_k + 1) ≥ q L(α, n_k),
n_{k+1} = n_k + 2  if  L(α, n_k + 2) ≥ q² L(α, n_k).
This sequence meets the requirement. □

Now, Theorem 9.2.2 can be readily proved.

Proof of Theorem 9.2.2. By Lemma 9.2.4, for any natural number n, either n ∈ {n_k} or n = n_k + 1 for some k. In the latter case,

L(α, n) ≥ ∏_{i=1}^{m} (q + ε_i) · ∏_{j=1}^{r+1} (q² + ε'_j),

where m + 2r = n_k = n - 1, ε_i ≥ 0 and 0.5 > ε'_j ≥ 0 for i = 1, 2, …, m and j = 1, 2, …, r + 1. Thus,

L(α, n)/q^n ≥ q ∏_{i=1}^{m} (1 + ε_i/q) · ∏_{j=1}^{r+1} (1 + ε'_j/q²).

Next, it is shown that for sufficiently large n, L(α, n)/q^n ≥ 1.

Case 1. {ε'_j} is a finite sequence, i.e., for sufficiently large n, n ∈ {n_k}. In this case, it is clear that for sufficiently large n, L(α, n)/q^n ≥ 1.

Case 2. {ε'_j} is an infinite sequence, but it does not converge to 0. In this case, there exists ε_0 > 0 such that ε'_j ≥ ε_0 for infinitely many j. Thus,

L(α, n)/q^n ≥ K (1 + ε_0/q²)^{s(n)} → ∞ as n → ∞,

where K > 0 is a constant and s(n) is the number of indices j ≤ r with ε'_j ≥ ε_0.
Case 3. lim_{j→∞} ε'_j = 0. Note that α is different from the golden section search. Therefore, either m ≥ 1 or (m = 0 and ε'_j > 0 for some j, say ε'_1 > 0, without loss of generality). Thus, either

L(α, n)/q^n ≥ (1 + ε_1/q)(1 - ε'_{r+1}/q) → 1 + ε_1/q > 1
or

L(α, n)/q^n ≥ (1 + ε'_1/q²)(1 - ε'_{r+1}/q) → 1 + ε'_1/q² > 1. □
Theorem 9.2.1 was proved by Kiefer [13]. Karp and Miranker [12], Avriel and Wilde [2], and Beamer and Wilde [3] parallelized the testing process. They considered a sequence of stages; in the ith stage, exactly k_i tests are performed in parallel. When the total number of tests is predetermined, they generalized Kiefer's result to determine the test points. Hong [9] considered the case that the number of tests is not predetermined. He established a generalization of Theorem 9.2.2. Let {k_i} be an infinite sequence of natural numbers. For any algorithm α and any unimodal function f on the interval [0, 1] of length one, let L_f(α, k_1, …, k_n) denote the total length of the intervals of uncertainty after n stages on the function f. Let

L(α, k_1, …, k_n) = sup_{unimodal f} L_f(α, k_1, …, k_n).

Theorem 9.2.5 (a) If {k_i} contains an infinite subsequence of odd numbers, or k_1 is odd, then there exists a strategy α* such that for any strategy α,

L(α*, k_1, …, k_n) ≤ L(α, k_1, …, k_n)

for sufficiently large n.
(b) In general, for any 0 < β < 1, there exists a strategy α_β such that for any strategy α,

β · L(α_β, k_1, …, k_n) ≤ L(α, k_1, …, k_n)

for sufficiently large n.
Beamer and Wilde [3] raised the following problem: given N, determine the stage sizes {k_i}_{1≤i≤n} with k_1 + ⋯ + k_n = N so as to minimize the length of the final interval of uncertainty.
9.3 Minimum Root Identification
Many problems in optimal search can be reduced to the following form.

Minimum Root Identification: Given m continuous functions h_i on [0, 1] with the property that

(*) h_i(0) < 0 for all i = 1, …, m, and each h_i has at most one root in [0, 1],
identify min{z ∈ [0, 1] | h_i(z) = 0 for some i}.

For example, consider the following optimization problem:

maximize f(x)
subject to g_1(x) ≤ 0, …, g_m(x) ≤ 0,
x ∈ R^n,

where f is continuously differentiable in the feasible region Ω, the set of all points satisfying the constraints, and all g_i are continuously differentiable convex functions in the n-dimensional Euclidean space R^n. There is a family of classical iterative methods for solving it, called feasible direction methods, which work as follows. Initially, find a feasible point x_1 (i.e., a point in Ω). At the kth iteration with a feasible point x_k, first find a direction d_k satisfying
(1) ∇f(x_k)^T d_k > 0, and
(2) there exists α_0 > 0 such that for α ∈ [0, α_0], x_k + α d_k is a feasible point.
Such a direction is called a feasible ascent direction. Once a feasible ascent direction d_k is found, one then obtains a new feasible point x_{k+1} with f(x_{k+1}) ≥ f(x_k) by a line search procedure, e.g., the golden section search described in the last section. Since all g_i are convex, the feasible region Ω is a convex set. Define

α* = max{α | x_k + α d_k is a feasible point}.

Then for all α ∈ [0, α*], x_k + α d_k is feasible, and for any α > α*, x_k + α d_k is not feasible. In general, finding α* involves unbounded search, which will be studied in the next chapter. However, when the feasible region Ω is bounded, or when α_0 is defined by α_0 = min{1, α*}, the problem of finding α* or α_0 can be reduced to the minimum root identification problem. To do so, define h_i(α) = g_i(x_k + α d_k). Then h_i has the property (*) described in the problem and

α_0 = min{z ∈ [0, 1] | h_i(z) = 0 for some i}.

A naive bisecting method can compute α_0 with uncertainty at most 2^{-n} in nm evaluations of the h_i. (At each step, one tests the feasibility of the midpoint of the interval of uncertainty, and each feasibility test of a point requires the values of all h_i at that point.) However, there are several obvious ways to save a significant number of evaluations; these methods can be found in [7, 4, 6, 8, 14, 21, 25]. In particular, Rivest et al. [25] discovered an equivalence between the minimum root identification problem and a problem with unreliable tests, as follows.
Half-Lie Problem: Suppose there is an unknown x ∈ (0, 1]. Given a natural number r, the problem is to identify x by using only questions of the form "Is x < c?" with c ∈ (0, 1], where up to r of the "No"-answers, but none of the "Yes"-answers, may be erroneous.

The equivalence of the two problems is in the sense that an optimal strategy for one problem can be trivially transformed into an optimal strategy for the other, and vice versa. This equivalence is now shown. Consider each evaluation of a function h_i at α as an answer to the question "Is h_i(α) > 0?". Associated with a questioning history (together with its answers), the state S is given through a sequence of m + 1 numbers

0 ≤ L_1 ≤ ⋯ ≤ L_m ≤ 1,  0 ≤ R ≤ 1,

and a permutation π of {1, …, m} chosen such that L_i is the largest α for which the question "Is h_{π(i)}(α) > 0?" has received a "No"-answer (L_i = 0 if no such answer has been received yet) and R is the smallest α for which there exists a question "Is h_i(α) > 0?" that has received a "Yes"-answer (R = 1 if no "Yes"-answer has been received yet). Clearly, the interval of uncertainty at this state is (L_1, R].

Lemma 9.3.1 The optimality of any strategy is unchanged if a question "Is h_{π(i)}(c) > 0?" is replaced by the question "Is h_{π(1)}(c) > 0?".
Proof. Let α be an optimal strategy in which the question "Is h_{π(i)}(c) > 0?" is asked at the state (L_1, …, L_m, R). Let Δ_k(L_1, …, L_m, R) denote the minimum length of the interval of uncertainty after k more questions starting from the state (L_1, …, L_m, R). Clearly, L_1 < c < R. If the answer is "Yes", then the next state is (L_1, …, L_m, c). If the answer is "No", then the next state is (L_1, …, L_{i-1}, L_{i+1}, …, L_k, c, L_{k+1}, …, L_m, R) for some k with L_k < c ≤ L_{k+1}. Thus,

Δ_m(L_1, …, L_m, R) = max( Δ_{m-1}(L_1, …, L_m, c), Δ_{m-1}(L_1, …, L_{i-1}, L_{i+1}, …, L_k, c, L_{k+1}, …, L_m, R) ).

Note that Δ_m(L_1, …, L_m, R) is a nonincreasing function with respect to L_1, …, L_m. Thus,

Δ_{m-1}(L_2, …, L_k, c, L_{k+1}, …, L_m, R) ≤ Δ_{m-1}(L_1, …, L_{i-1}, L_{i+1}, …, L_k, c, L_{k+1}, …, L_m, R).

It follows that

Δ_m(L_1, …, L_m, R) ≥ max( Δ_{m-1}(L_1, …, L_m, c), Δ_{m-1}(L_2, …, L_k, c, L_{k+1}, …, L_m, R) ).
This means that if the question "Is h_{π(i)}(c) > 0?" is replaced by the question "Is h_{π(1)}(c) > 0?", then the optimality of the strategy is unchanged. □

Now, consider the half-lie problem. The state of this problem can be summarized by a sequence of numbers

0 ≤ L_1 ≤ ⋯ ≤ L_r ≤ 1,  0 ≤ R ≤ 1,

where the L_i are the r largest numbers c such that the question "Is x < c?" has received a "No"-answer (L_1 = ⋯ = L_k = 0 if only r - k "No"-answers have been received) and R is the smallest c such that the question "Is x < c?" has received a "Yes"-answer (R = 1 if no "Yes"-answer has been received). Now, let m = r and map the question "Is x < c?" to the question "Is h_{π(1)}(c) > 0?". Then in both problems, from the state (L_1, …, L_m, R) the answer "Yes" yields the state (L_1, …, L_m, c) and the answer "No" yields the state (L_2, …, L_k, c, L_{k+1}, …, L_m, R). Thus, this mapping gives a one-to-one correspondence between strategies for the half-lie problem and those strategies for the minimum root identification problem which use only questions of the form "Is h_{π(1)}(c) > 0?". Moreover, by Lemma 9.3.1, the optimality of any strategy is unchanged if a question "Is h_{π(i)}(c) > 0?" is replaced by a question "Is h_{π(1)}(c) > 0?". Thus, the images of optimal strategies for the half-lie problem under the above mapping are actually optimal strategies for the minimum root identification problem. Conversely, an optimal strategy for the minimum root identification problem can first be transformed into an optimal strategy using only questions of the form "Is h_{π(1)}(c) > 0?" and then transformed into an optimal strategy for the half-lie problem. Thus, Rivest et al. [25] concluded:
root identification
problem
and the half-lie
problem
Clearly, the optimal strategy in Section 8.2 for the problem of identifying an unknown x ∈ (0, 1] with up to m lies is also a strategy for the half-lie problem. However, it may not be an optimal strategy. Let us restate that strategy here directly for the minimum root identification problem, for the convenience of the reader. Associated with a natural number j (the number of remaining tests) and a state S = (L_1, …, L_m, R), a weight is given as follows:

w(j, S) = Σ_{i=1}^{t-1} C(j, ≤i) (L_{i+1} - L_i) + C(j, ≤t) (R - L_t),

where L_t ≤ R < L_{t+1} (denote L_{m+1} = 1) and C(j, ≤i) denotes the partial binomial sum Σ_{l=0}^{i} C(j, l). Suppose that a new question "Is h_{π(1)}(α) > 0?" is asked at the state S, where L_τ < α < L_{τ+1} ≤ R. Let S_y(α) and S_n(α) denote the two states obtained from the answers "Yes" and "No", respectively. Then the strategy results in the following algorithm.
Algorithm: For a given accuracy ε, find k such that C(k, ≤m) 2^{-k} < ε, where C(k, ≤m) = Σ_{l=0}^{m} C(k, l). (Here k is the total number of tests needed for the given accuracy.) Then carry out the following:

for j = k, k - 1, …, 1 do
begin
  compute w(j - 1, S_y(R));
  find α ∈ (L_1, R] such that w(j - 1, S_y(α)) ≤ 0.5 · w(j, S) and w(j - 1, S_n(α)) ≤ 0.5 · w(j, S);
  ask the question "Is h_{π(1)}(α) > 0?";
end-for

Note that at the initial state, the weight is C(k, ≤m). After each question "Is h_{π(1)}(α) > 0?" is answered, the weight is reduced by at least half. Thus, k questions reduce the weight to a number less than C(k, ≤m) 2^{-k}. Moreover, w(0, S) = |R - L_1|; that is, after k questions, the weight is exactly the length of the interval of uncertainty, which contains α_0. So, the following can be concluded.

Theorem 9.3.3 Using k questions "Is h_i(α) > 0?", one can determine the minimum root with accuracy C(k, ≤m) 2^{-k}.

The minimum root identification problem can also be solved by using the chip game. To use the chip game, the domain has to be transformed from a continuous one to a discrete one. To do so, given an accuracy ε, choose n = ⌈1/ε⌉ and consider the domain {j/n | j = 1, 2, …, n}. Now, the chip game can be performed on this domain by Aslam-Dhagat's method. Namely, a j* can be obtained that satisfies

j* = max{j | h_i(j/n) ≤ 0 for every i = 1, …, m}.

Clearly, j*/n is within distance ε of the minimum root. By Theorem 9.3.3, to identify the minimum root with accuracy 2^{-n}, k questions are enough if C(k, ≤m) 2^{-k} ≤ 2^{-n}. By Lemma 8.3.2, it is sufficient if k satisfies

2^{kH(m/k) - k} ≤ 2^{-n},

that is,

k(1 - H(m/k)) ≥ n.

Note that k(1 - H(m/k)) is an increasing function of k. Thus, ⌈k_0⌉ questions are enough if k_0 satisfies

k_0(1 - H(m/k_0)) = n.
Note that

k_0 - (n + m log n - m log m)
= k_0 - k_0(1 - H(m/k_0)) - m log{k_0(1 - H(m/k_0))} + m log m
= k_0 H(m/k_0) + m log(m/k_0) - m log(1 - H(m/k_0))
= -k_0(1 - m/k_0) log(1 - m/k_0) - m log(1 - H(m/k_0)).
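The defining equation for k_0 can also be checked numerically. The sketch below (Python; the function names are ours) solves k_0(1 - H(m/k_0)) = n by bisection on k ≥ 2m, where the left-hand side is increasing, and compares the result with n + m log n - m log m:

```python
import math

def H(p):
    # Binary entropy (base 2), with H(0) = H(1) = 0 by convention.
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def k0(n, m):
    # Solve k (1 - H(m/k)) = n by bisection; for k >= 2m the left side
    # increases from 0, so the root is unique there.
    lo = 2.0 * m
    hi = 2.0 * n + 4.0 * m * math.log2(n + 4) + 16.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if mid * (1 - H(m / mid)) < n:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

n, m = 10_000, 8
approx = n + m * math.log2(n) - m * math.log2(m)
# The gap to the asymptotic formula is O(m) (it tends to m log e).
assert abs(k0(n, m) - approx) < 3 * m
```

For n = 10000 and m = 8 the numerical root exceeds the simple approximation by roughly m log₂e ≈ 11.5, matching the O(m) term derived next.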
Since

lim_{k_0 → ∞} [ -k_0(1 - m/k_0) log(1 - m/k_0) - m log(1 - H(m/k_0)) ] = m log e

and -k_0(1 - m/k_0) log(1 - m/k_0) - m log(1 - H(m/k_0)) is increasing as k_0 increases, one has that

k_0 = n + m log n - m log m + O(m).

Rivest et al. [25] also established a matching lower bound, as follows.

Theorem 9.3.4 Suppose k questions "Is x < c?" are enough to determine an unknown x ∈ (0, 1] with accuracy 2^{-n} when up to r of the "No"-answers, but none of the "Yes"-answers, may be erroneous. Then k ≥ n + r log n - r log r + O(r).

Since the set of uncertainty is always an interval, Theorem 9.3.4 is equivalent to the following.

Theorem 9.3.5 Suppose k questions "Is x < c?" are enough to determine an unknown x ∈ {1, …, 2^n} when up to r of the "No"-answers, but none of the "Yes"-answers, may be erroneous. Then k ≥ n + r log n - r log r + O(r).

Proof. Consider an optimal strategy S, represented by a binary "decision tree" T_S. Each internal node of T_S is a question "Is x < c?"; the two edges from the internal node to its right and left sons correspond to the "Yes"-answer and the "No"-answer, respectively. Each leaf ℓ is associated with a value value(ℓ), which is the determined value of the unknown x. Since optimality is considered in the worst case, without loss of generality T_S can be assumed to be complete, that is, the strategy S always asks exactly k questions. (If x is identified in fewer than k questions, then some dummy questions, like "Is x < 2^n + 1?", are asked for the remaining times.) Let lies(ℓ) and yes(ℓ) be the two subsets of {1, …, k} which indicate, on the path from the root to the leaf ℓ, which questions were answered incorrectly and which questions were answered "Yes". That is, if i ∈ lies(ℓ), then the ith question along the path to ℓ was answered "No" and this answer was a lie; if i ∈ yes(ℓ), then the ith question on the path was answered "Yes", which cannot be a lie. Suppose value(ℓ) = x. If |lies(ℓ)| < r, then for each i ∈ yes(ℓ), a "No"-answer to the ith question would give one more lie without providing any information.
Therefore, the "No" branch of the ith question must have a leaf ℓ' such that value(ℓ') = x and |lies(ℓ')| = |lies(ℓ)| + 1. Now, consider an x which satisfies the following property: on every path from the root to a leaf ℓ with value(ℓ) = x, and for every i ∈ {0, 1, …, r - 1}, at least one fourth of the k/r answers between the (1 + i(k/r))th and the ((i + 1)(k/r))th answers are "Yes"-answers.
Such an x is called a regular value.

Claim. For every regular x, there are at least (k/(4r))^r leaves with value x.

Proof. Define

N_x(i) = {t | t is a node in T_S on level i(k/r) + 1 which has a leaf ℓ with value(ℓ) = x and |lies(ℓ)| = i, but no leaf ℓ' with value(ℓ') = x and |lies(ℓ')| < i}.

Clearly, N_x(0) consists of the root, so |N_x(0)| = 1. For i < r, note that each "Yes"-answer between level 1 + i(k/r) and level (i + 1)(k/r), on the path from a node t ∈ N_x(i) to a leaf ℓ with value(ℓ) = x and |lies(ℓ)| = i, provides a node in N_x(i + 1). Since x is regular, there exist at least k/(4r) such "Yes"-answers for each node t in N_x(i). Therefore,

|N_x(i + 1)| ≥ (k/(4r)) |N_x(i)|.

It follows that

|N_x(r)| ≥ (k/(4r))^r.

To prove the theorem, it suffices to prove that for sufficiently large n there exist at least 2^{n-1} regular values among {1, …, 2^n}. In fact, if this is done, then for sufficiently large n, by the Claim, the total number of leaves in T_S is at least

2^{n-1} (k/(4r))^r.

Thus, the depth of T_S is

k ≥ log( 2^{n-1} (k/(4r))^r ) = n - 1 + r log k - r log r - 2r.

By the information lower bound, k ≥ n. Thus, for sufficiently large n,

k ≥ n + r log n - r log r - 2r - 1,

that is, k ≥ n + r log n - r log r + O(r).

Now, it will be proved that there are at least 2^{n-1} regular values among {1, …, 2^n}. First, note that the number of "Yes"-"No" sequences of length k/r with fewer than k/(4r) "Yes"-answers is

C(k/r, ≤ k/(4r)) ≤ 2^{(k/r) H(1/4)}

(a partial binomial sum).
Thus,

the number of irregular values among {1, …, 2^n}
≤ the number of paths each of which does not satisfy the condition in the definition of a regular value
≤ r · 2^{(k/r) H(1/4)} · 2^{k - k/r} = r · 2^{kd},

where d = 1 - (1 - H(1/4))/r. Since k ≤ n + r log n - r log r + O(r), one has that for sufficiently large n,

kd + log r ≤ (n + r log n - r log r + O(r)) d + log r < n - 1.

This completes the proof of the theorem. □
References

[1] M. Aigner, Combinatorial Search, (John Wiley & Sons, New York, 1988).
[2] M. Avriel and D.J. Wilde, Optimal search for a maximum with sequences of simultaneous function evaluations, Management Science 12 (1966) 722-731.
[3] J.H. Beamer and D.J. Wilde, Minimax optimization of unimodal functions by variable block search, Management Science 16 (1970) 529-541.
[4] J.H. Beamer and D.J. Wilde, A minimax search plan for constrained optimization problems, J. Optimization Theory Appl. 12 (1973) 439-446.
[5] E.R. Berlekamp, Block coding for the binary symmetric channel with noiseless delayless feedback, in Error-Correcting Codes, (Wiley, New York, 1968) 61-85.
[6] S. Gal, Multidimensional minimax search for a maximum, SIAM J. Appl. Math. 23 (1972) 513-526.
[7] S. Gal, B. Bachelis, and A. Ben-Tal, On finding the maximum range of validity of a constrained system, SIAM J. Control and Optimization 16 (1978) 473-503.
[8] S. Gal and W.L. Miranker, Sequential and parallel search for finding a root, Tech. Rep. 30, IBM Israel Scientific Center, Haifa, Israel, 1975.
[9] J.-W. Hong, Optimal strategy for optimum seeking when the number of tests is uncertain, Scientia Sinica No.2 (1974) 131-147 (in Chinese).
[10] L.-G. Hua and Y. Wang, Popularizing Mathematical Methods in the People's Republic of China, (Birkhäuser, Boston, 1989).
[11] L.-G. Hua, Optimum Seeking Methods, (Science Press, Beijing, 1981) (in Chinese).
[12] R.M. Karp and W.L. Miranker, Parallel minimax search for a maximum, Journal of Combinatorial Theory 4 (1968) 19-35.
[13] J. Kiefer, Sequential minimax search for a maximum, Proceedings of Amer. Math. Soc. 4 (1953) 502-506.
[14] J. Kiefer, Optimum sequential search and approximation methods under minimum regularity assumptions, SIAM J. Appl. Math. 5 (1957) 105-136.
[15] W. Li, The solution of the optimal block search problem for N > 3n, Acta Mathematicae Sinica No.4 (1974) 259-269 (in Chinese).
[16] W. Li, Solution of the optimal block search problem for N > 3n (continuation), Acta Mathematicae Sinica No.1 (1975) 54-64 (in Chinese).
[17] W. Li, Optimal Sequential Block Search, (Heldermann, Berlin, 1984).
[18] W. Li and Z. Weng, All solutions of the optimal block search problem, Acta Mathematicae Sinica No.1 (1979) 45-53 (in Chinese).
[19] D.G. Luenberger, Linear and Nonlinear Programming, (Addison-Wesley, Reading, 1984).
[20] S. Luo, All solutions of the optimal block search problem for N < 3n, Acta Mathematicae Sinica No.3 (1977) 225-228 (in Chinese).
[21] Y. Milman, Search problems in optimization theory, Ph.D. thesis, Technion - Israel Institute of Technology, Haifa, 1972.
[22] A. Pelc, Searching with known error probability, Theoretical Computer Science 63 (1989) 185-202.
[23] J. Qi, Y. Yuan, and F. Wu, An optimization problem (II), Acta Mathematicae Sinica No.2 (1974) 110-130 (in Chinese).
[24] J. Qi, Y. Yuan, and F. Wu, An optimization problem (III), Acta Mathematicae Sinica No.1 (1975) 65-74 (in Chinese).
[25] R.L. Rivest, A.R. Meyer, D.J. Kleitman, K. Winklmann, and J. Spencer, Coping with errors in binary search procedures, J. Computer and System Sciences 20 (1980) 396-404.
[26] X. Tao, Proving the optimality of the optimum seeking method, Acta Mathematicae Sinica 24:5 (1981) 729-732 (in Chinese).
[27] F. Wu, An optimization problem (I), Scientia Sinica No.1 (1974) 1-14 (in Chinese).
10 Unbounded Search
In this chapter, we deal with unbounded domains; that is, the tested objects come from an unbounded area, e.g., a half line, the set of natural numbers, etc.
10.1 Introduction
Many problems involve searching in an unbounded domain. The following are some examples.

(1) In Section 9.3, we discussed the minimum root identification problem, which searches for a boundary point a_0 satisfying

    a_0 = 1,                                     if h_i(1) < 0 for i = 1, ..., m,
    a_0 = max{a | h_i(a) ≤ 0 for i = 1, ..., m}, otherwise,

where every h_i(a) (= g_i(x + ad)) is a convex function in a. The problem arose from solving the following optimization problem:

    maximize    f(x)
    subject to  g_i(x) ≤ 0,  i = 1, ..., m,

where f is continuously differentiable in the constrained area and all g_i are continuously differentiable convex functions. A more natural version of the minimum root identification problem is to find a* in [0, ∞) such that

    a* = max{a | h_i(a) ≤ 0 for i = 1, ..., m}.

This version is a search problem with an unbounded domain.

(2) In unconstrained optimization, one considers the problem max f(x) and uses a line search procedure to solve the subproblem

    max{f(x + ad) | 0 ≤ a}.

This subproblem gives another example of unbounded search.

(3) In recursive function theory, there is an important operation, called the μ-operation:

    h(x) = min{y | g(x, y) = 0},  if there exists y with g(x, y) = 0,
    h(x) = undefined,             otherwise,

where g(x, y) is a total recursive function. Suppose that one uses the question "Is there y ≤ a such that g(x, y) = 0?" as a primitive operation. Then this problem can be formulated as follows: identify the unknown h(x) from the natural numbers by questions of the type "Is h(x) ≤ a?". (Actually, it is well known in recursive function theory that, unless the domain of h(x) is a recursive set, there is in general no way to tell whether h(x) is defined at x or not. Thus, one studies only the algorithm for computing h(x) in the case that h(x) is defined. When h(x) is undefined, the algorithm runs forever and outputs nothing.)

The most straightforward algorithm for unbounded search is the unary search. For example (3), it asks the questions "Is h(x) ≤ 0?", "Is h(x) ≤ 1?", ..., until a "Yes"-answer is obtained. But the unary search is clearly not a good one. Let n denote the value of h(x) (n is considered to be the input size). The cost of the unary search is n + 1. The following binary search gives a significant improvement.
Figure 10.1: The decision tree of the unbounded binary search

The binary search has two stages. The first stage is to find a bounded area containing the unknown by successively asking "Is n ≤ 2^i − 1?" for i = 0, 1, ..., until a "Yes"-answer is obtained. Then the unknown n is identified in the second stage by a standard bounded binary search. Suppose that the first stage stops at a "Yes"-answer to the question "Is n ≤ 2^m − 1?". Then 2^{m−1} − 1 < n ≤ 2^m − 1, so n has 2^{m−1} possibilities. Thus, the second stage takes m − 1 tests and the first stage takes m + 1 tests. The total number of tests is 2m = 2⌊log n⌋ + 2.

The decision tree of the unbounded binary search is shown in Figure 10.1. Each internal node of the tree is labeled by a number i corresponding to a test "Is n ≤ i?". Each leaf is labeled by the identified value of the unknown.
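As a concrete illustration (our sketch, not the book's), the two-stage unbounded binary search can be written in Python; the oracle le(a) answers the question "Is n ≤ a?":

```python
def unbounded_binary_search(le):
    """Identify an unknown natural number n using only queries le(a) = (n <= a).

    Stage 1: unary search along 2^i - 1 to bracket n.
    Stage 2: standard bounded binary search inside the bracket.
    """
    i = 0
    while not le(2**i - 1):          # "Is n <= 2^i - 1?"
        i += 1
    # Now 2^(i-1) - 1 < n <= 2^i - 1 (for i = 0, simply n = 0).
    lo, hi = (0 if i == 0 else 2**(i - 1)), 2**i - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if le(mid):
            hi = mid
        else:
            lo = mid + 1
    return lo
```

For n ≥ 1 the two stages together use exactly 2⌊log n⌋ + 2 queries, matching the count above.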
10.2 Bentley-Yao Algorithms
Based on the binary search, Bentley and Yao [2] discovered a family of interesting algorithms. The basic idea is as follows. Note that the binary search is an improvement of the unary search. However, the first stage of the binary search is still a unary search along the sequence {2^i − 1 | i = 0, 1, ...}. Thus, this stage can also be improved by the binary search. Let ℓ(n) = ⌊log n⌋ + 1 and ℓ^(i)(n) = ℓ(··· ℓ(n) ···), with ℓ iterated i times. Then the second improved algorithm runs as follows:

Stage 1. Find ℓ^(2)(n) by the unary search, i.e., ask questions "Is ℓ^(2)(n) ≤ 0?", "Is ℓ^(2)(n) ≤ 1?", ..., until a "Yes"-answer is obtained. (Note that ℓ(x) ≤ i if and only if x ≤ 2^i − 1. Thus, the question "Is ℓ^(k)(n) ≤ i?" is equivalent to the question "Is n ≤ f^(k)(i)?" where f(y) = 2^y − 1.)

Stage 2. Find ℓ(n) by the bounded binary search with questions of the type "Is ℓ(n) ≤ i?". (Note that 2^{ℓ^(2)(n)−1} ≤ ℓ(n) ≤ 2^{ℓ^(2)(n)} − 1. Thus, this stage takes at most ℓ^(2)(n) − 1 tests.)

Stage 3. Find n by the bounded binary search.

Note that the first stage of this algorithm still uses the unary search, so it can be improved again. This kind of improvement can be carried on forever, and a family of algorithms is obtained. Let B_0 denote the unary search and B_1 the binary search. In general, the algorithm B_k is obtained by improving the algorithm B_{k−1} and runs as follows.

Stage 1. Find ℓ^(k)(n) by the unary search, i.e., ask questions "Is ℓ^(k)(n) ≤ 0?", "Is ℓ^(k)(n) ≤ 1?", ..., until a "Yes"-answer is obtained.

Stage i (2 ≤ i ≤ k + 1). Find ℓ^(k−i+1)(n) by the bounded binary search with questions of the type "Is ℓ^(k−i+1)(n) ≤ j?" (note: 2^{ℓ^(k−i+2)(n)−1} ≤ ℓ^(k−i+1)(n) ≤ 2^{ℓ^(k−i+2)(n)} − 1). Here ℓ^(0)(n) = n.

To analyze the cost of the algorithm B_k, notice that the first stage needs ℓ^(k)(n) + 1 tests and the i-th stage needs at most ℓ^(k−i+2)(n) − 1 tests. Thus, the total number of tests is at most

    ℓ(n) + ℓ^(2)(n) + ··· + 2ℓ^(k)(n) − k + 1.
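The family {B_k} can be sketched recursively in Python (an illustrative reconstruction of ours, not the book's pseudocode): B_0 is the unary search, and B_k first recovers ℓ(n) by running B_{k−1} against the derived oracle "Is ℓ(n) ≤ i?" = "Is n ≤ 2^i − 1?", then finishes with a bounded binary search.

```python
def unary_search(le):
    """B_0: ask le(0), le(1), ... until the first "Yes"."""
    i = 0
    while not le(i):
        i += 1
    return i

def bounded_binary_search(le, lo, hi):
    """Standard binary search for n in [lo, hi] using le(a) = (n <= a)."""
    while lo < hi:
        mid = (lo + hi) // 2
        if le(mid):
            hi = mid
        else:
            lo = mid + 1
    return lo

def B(k, le):
    """B_k: find the unknown natural number n from queries le(a) = (n <= a)."""
    if k == 0:
        return unary_search(le)
    if le(0):                       # n = 0
        return 0
    # For n >= 1, l(n) <= i iff n <= 2^i - 1, so B_{k-1} on the derived
    # oracle recovers m = l(n) = floor(log n) + 1.
    m = B(k - 1, lambda i: le(2**i - 1))
    # Now 2^(m-1) <= n <= 2^m - 1.
    return bounded_binary_search(le, 2**(m - 1), 2**m - 1)
```

With this formulation, each level of recursion is exactly one round of "improve the unary first stage by a binary search".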
From the theoretical point of view, the algorithm B_k is better than the algorithm B_{k−1} for any k. However, for a particular instance this may not be the case. In fact, ℓ^(k−1)(n) − 2ℓ^(k)(n) + 1 decreases monotonically as k increases and becomes negative when k gets sufficiently large. Clearly, the best choice for k is the largest j such that ℓ^(j−1)(n) > 2ℓ^(j)(n) − 1, which is also the smallest j such that ℓ^(j)(n) ≤ 2ℓ^(j+1)(n) − 1. Since the largest positive integer x satisfying x ≤ 2ℓ(x) − 1 = 2(⌊log x⌋ + 1) − 1 is 5, the best choice for k is the smallest integer j satisfying ℓ^(j)(n) ≤ 5. This j is denoted by ℓ*(n).

The following algorithm, denoted by B*, was also suggested by Bentley and Yao [2].

Step 1. Find ℓ*(n) by the unary search, i.e., ask questions "Is ℓ^(i)(n) ≤ 5?" for i = 0, 1, ..., until a "Yes"-answer is obtained.

Step 2. Find n by the algorithm B_{ℓ*(n)}.

The reader may immediately find that the algorithm B* is still improvable, because the first step is still a unary search. Actually, the technique of Bentley and Yao can be applied in this situation forever, and a family of uncountably many algorithms is obtained.

Note that in the algorithm B*, the first step takes 1 + ℓ*(n) tests. The number of tests in the second step is at most

    Σ_{i=1}^{ℓ*(n)} ℓ^(i)(n) + ℓ^(ℓ*(n))(n) − ℓ*(n) + 1 ≤ Σ_{i=1}^{ℓ*(n)} ℓ^(i)(n) + 6 − ℓ*(n).

So, the total cost of the algorithm B* is at most

    Σ_{i=1}^{ℓ*(n)} ℓ^(i)(n) + 7.

Bentley-Yao algorithms are nearly optimal. To show this, a lower bound result was also presented in [2].
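To make these quantities concrete, here is a small Python sketch (ours, not the book's) computing ℓ(n) = ⌊log n⌋ + 1, the iterates ℓ^(i)(n), and ℓ*(n), the smallest j with ℓ^(j)(n) ≤ 5:

```python
def ell(n):
    """l(n) = floor(log2 n) + 1 for n >= 1, i.e., the binary length of n."""
    return n.bit_length()

def ell_star(n):
    """l*(n): the smallest j such that the j-th iterate of l drops to <= 5."""
    j, v = 0, n
    while v > 5:
        v = ell(v)
        j += 1
    return j
```

For instance, ℓ*(2^100) = 3: the iterates are 2^100 → 101 → 7 → 3.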
Theorem 10.2.1 Let f(n) be the cost function of an unbounded searching algorithm. Then for infinitely many n,

    f(n) > log n + log^(2) n + ··· + log^(log* n) n − 2 log* n,

where log* n is the least number j such that log^(j) n ≤ 1.

The following lemma is an infinite version of Lemma 1.2.2.

Lemma 10.2.2 (Kraft's Inequality) Let f(n) be the cost function of a correct unbounded searching algorithm. Then

    Σ_{n=0}^{∞} 2^{−f(n)} ≤ 1.

Proof. The decision tree T of each correct unbounded searching algorithm is an infinite binary tree. A node is said to be at the j-th level if the path from the root to the node has length j. For each node v at the j-th level, define g(v) = 2^{−j}. Then g(v) = g(v') + g(v'') when v' and v'' are the two children of v. From this relation, it is easy to see that for any finite subtree T' obtained from T by cutting off at a certain level, the sum of g(v) over all leaves v of T' is one. Now, for any natural number n, let j = max{f(i) | i = 0, ..., n}. Cut off all nodes at levels lower than j + 1 from T. Then all leaves corresponding to the results of some i in {0, ..., n} being the unknown are still left in the remainder, denoted by T'. Clearly,

    Σ_{i=0}^{n} 2^{−f(i)} ≤ Σ_{v over leaves of T'} g(v) = 1

for every n. Thus,

    Σ_{i=0}^{∞} 2^{−f(i)} ≤ 1.  □
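As a quick sanity check (ours, not the book's), the unbounded binary search has cost f(n) = 2⌊log n⌋ + 2 for n ≥ 1, and its Kraft sum indeed stays below 1: each block n ∈ [2^{m−1}, 2^m − 1] contributes 2^{m−1} · 2^{−2m} = 2^{−m−1}, so the partial sums approach 1/2.

```python
def kraft_partial_sum(N):
    """Sum of 2^(-f(n)) for n = 1..N with f(n) = 2*floor(log2 n) + 2."""
    total = 0.0
    for n in range(1, N + 1):
        f = 2 * (n.bit_length() - 1) + 2
        total += 2.0 ** (-f)
    return total
```

The slack of 1/2 reflects that the binary search's decision tree is far from the information-theoretic optimum the theorem below quantifies.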
Proof of Theorem 10.2.1. Suppose to the contrary that Theorem 10.2.1 does not hold. Then there exists an algorithm such that the inequality holds for only finitely many n. It means that for sufficiently large n,

    2^{−f(n)} > 4^{log* n} / (n (log n)(log^(2) n) ··· (log^(log* n − 1) n)).

By Kraft's inequality, to obtain a contradiction, it is sufficient to prove that the series

    Σ_n 4^{log* n} / (n (log n)(log^(2) n) ··· (log^(log* n − 1) n))

is divergent. Define k(n) = log* n − log*(log* n) − 2 and, for positive integers i,

    k_i = (the tower 2^{2^{···^2}} of i 2's) − i − 2,
    n_i = the tower of 2's of height k_i + i + 1,
    n'_i = the tower of 2's of height k_i + i + 2.

Then for n_i < n ≤ n'_i, k(n) = k_i and log^(k_i+2) n ≤ log* n. Hence, for n_i < n ≤ n'_i,

    log^(k_i+2) n + ··· + log^(log* n) n ≤ log* n + log(log* n) + ··· + log^(log*(log* n))(log* n) ≤ 2 log* n,

so the product (log^(k_i+1) n)(log^(k_i+2) n) ··· (log^(log* n − 1) n) is at most 2^{2 log* n} = 4^{log* n}. Thus,

    Σ_{n=n_i+1}^{n'_i} 4^{log* n} / (n (log n)(log^(2) n) ··· (log^(log* n − 1) n))
      ≥ Σ_{n=n_i+1}^{n'_i} 1 / (n (log n)(log^(2) n) ··· (log^(k(n)) n))
      ≥ ∫_{n_i}^{n'_i} dx / (x (log x)(log^(2) x) ··· (log^(k_i) x))
      = (ln 2)^{k_i+1} [log^(k_i+1) x]_{n_i}^{n'_i}
      ≥ (ln 2)^{k_i+1} (2^{k_i} − k_i − i − 3)
      = (2 ln 2)^{k_i+1} (0.5 + o(1)),

where the following facts have been used: log^(k_i+1) n'_i ≥ 2^{k_i} and log^(k_i+1) n_i ≤ k_i + i + 3. Note that 2 ln 2 > 1. Therefore, as k_i goes to infinity, (2 ln 2)^{k_i+1}(0.5 + o(1)) also goes to infinity, so the series diverges. This completes the proof. □

Remark. It is worth mentioning that the above lower bound cannot be improved by the same technique. In fact, the series

    Σ_n 1 / (n (log n)(log^(2) n) ··· (log^(log* n − 1) n))

is convergent. To see this, let I_k = {x | log* x = k}. Then I_k = (a, b] where log^(k) a = 0 and log^(k) b = 1. Thus,

    ∫_{I_k} dx / (x (log x)(log^(2) x) ··· (log^(k−1) x)) = (ln 2)^k [log^(k) x]_a^b = (ln 2)^k.

Therefore,

    ∫_1^∞ dx / (x (log x)(log^(2) x) ··· (log^(log* x − 1) x)) = Σ_{k=1}^{∞} (ln 2)^k,

which is convergent. It follows that

    Σ_n 1 / (n (log n)(log^(2) n) ··· (log^(log* n − 1) n))

is convergent. Thus, there exists a positive integer n_0 such that

    Σ_{n=n_0}^{∞} 1 / (n (log n)(log^(2) n) ··· (log^(log* n − 1) n)) < 1/2.

From this fact, one can prove that there exists an algorithm whose cost function f satisfies, for n ≥ n_0,

    f(n) ≤ log n + log^(2) n + ··· + log^(log* n) n

and

    Σ_{n=0}^{n_0−1} 2^{−f(n)} + Σ_{n=n_0}^{∞} 1 / (n (log n)(log^(2) n) ··· (log^(log* n − 1) n)) ≤ 1.

In fact, the converse of Lemma 10.2.2 holds; its proof can be obtained by reversing the proof of Lemma 10.2.2. The reader may work out the details as an exercise.
10.3 Search with Lies

Similar to the bounded minimum root identification problem, the unbounded minimum root identification problem can also be transformed into an unbounded search problem with lies. In fact, the unbounded minimum root identification is equivalent to the unbounded half-lie problem. In Section 9.3, the bounded half-lie problem was dealt with. In this section, the results are extended from bounded erroneous searching to unbounded erroneous searching (i.e., searching for an unknown natural number n by "Yes"-"No" tests with errors under certain restrictions).
For unbounded erroneous searching, the search usually occurs in two stages. In the first stage, a bound on the unknown number n is determined. In the second stage, an algorithm for bounded erroneous searching is applied.

The first stage can be performed by a brute force search in the following way: ask questions "Is n ≤ 2^{2^i}?" for i = 0, 1, ..., until a "Yes"-answer is confirmed. If the number of lies is bounded by a constant r, then each question needs to be repeated at most 2r + 1 times in order to obtain a confirmed answer (by taking the majority of the replies). Thus, in total, this stage takes at most (2r + 1)⌈log log n⌉ tests. If the number of lies is linearly bounded with ratio q, then the total number of tests in this stage is bounded by

    O((1/(1 − q)) ⌈log log n⌉) = o(log n).

For the second stage, if the number of lies is bounded by a constant, then the bounded searching algorithm can be used to find n from [1, 2^{2^{⌈log log n⌉}}]. Note that

    2^{2^{⌈log log n⌉}} ≤ 2^{2^{log log n + 1}} = n^2.

From Section 8.3, there exists an algorithm executing this stage with at most O(log n^2) = O(log n) tests. If the number of lies is linearly bounded, then in the second stage, when the chip game is employed, the boundary line has to be set to start at level q · o(log n) instead of level 0. But this does not affect the number of tests very much. Aslam and Dhagat [1] indicated that it still takes O(log n) tests for 0 < q < 1/2. Summarizing the above discussion, the following has been obtained.

Theorem 10.3.1 Any unknown natural number n can be identified with O(log n) "Yes"-"No" tests with a constant number of lies. For 0 < q < 1/2, any unknown natural number n can be identified with O(log n) "Yes"-"No" tests with the number of lies linearly bounded with ratio q.
For the known error probability model, Pelc [3] proved the following.

Theorem 10.3.2 If q < 1/2, then for any positive p < 1, the unknown natural number n can be identified with reliability p in O(log^2 n) queries. If q < 1/3, then for any positive p < 1, the unknown natural number n can be identified with reliability p in O(log n) queries.
10.4 Unbounded Fibonacci Search
Bentley and Yao [2] proposed the following question: "Is the corresponding unbounded Fibonacci search interesting?" In Section 10.1, it was made clear that in nonlinear optimization theory, unbounded searching for a maximum is certainly important.
Suppose that one wants to find the maximum of a unimodal function f(x) on [0, ∞). Given an accuracy ε > 0, define x_{−1} = 0, x_0 = F'_0 ε, and x_i = F_i ε for i ≥ 1, where F_0 = F_1 = 1 and F_{i−1} + F_i = F_{i+1} are the Fibonacci numbers. (F'_0 is a number very close to F_0 but smaller than F_0.) The unbounded Fibonacci search can be performed in the following way.

Stage 1. Compute the function values f(x_0), f(x_1), ..., until f(x_i) > f(x_{i+1}) is found.

Stage 2. Apply the bounded Fibonacci search to the interval [x_{i−1}, x_{i+1}].

Suppose that the maximum point is x. Then the first stage needs to compute 1 + min{i | x ≤ F_i ε} function values, and the second stage needs to compute at most min{i | x ≤ F_i ε} function values. Thus, the total number of function values computed is 1 + 2 min{i | x ≤ F_i ε}.

Similarly, the unbounded golden section search can be described as follows, where ρ = (1 + √5)/2 is the golden ratio.

Stage 1. Compute the function values f(ε), f(ρε), ..., until f(ρ^i ε) > f(ρ^{i+1} ε) is found.

Stage 2. Apply the bounded Fibonacci search to the interval [ρ^{i−1} ε, ρ^{i+1} ε].

Both the unbounded Fibonacci search and the unbounded golden section search can be improved infinitely many times by refining the first stage. The improvements are similar to those in Section 10.2. The details are left to the reader as exercises.
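The two stages can be sketched in Python (our illustration; stage 2 here uses a golden-section refinement in place of the exact bounded Fibonacci search, and F'_0 is taken as 0.999·F_0):

```python
import math

def golden_section_max(f, lo, hi, tol):
    """Refine a bracket [lo, hi] around the maximum of a unimodal f."""
    g = (math.sqrt(5) - 1) / 2
    c, d = hi - g * (hi - lo), lo + g * (hi - lo)
    while hi - lo > tol:
        if f(c) < f(d):
            lo, c = c, d
            d = lo + g * (hi - lo)
        else:
            hi, d = d, c
            c = hi - g * (hi - lo)
    return (lo + hi) / 2

def unbounded_fibonacci_max(f, eps, tol=1e-4):
    """Stage 1: probe x_i = F_i * eps until f decreases; then refine."""
    def points():
        yield 0.0                # x_{-1}
        yield 0.999 * eps        # x_0 = F'_0 * eps, just below F_0 = 1
        a, b = 1, 2              # F_1, F_2
        while True:
            yield a * eps
            a, b = b, a + b
    pts = points()
    x_prev2 = next(pts)          # x_{i-1}
    x_prev1 = next(pts)          # x_i
    f_prev1 = f(x_prev1)
    for x in pts:                # x = x_{i+1}
        fx = f(x)
        if f_prev1 > fx:         # maximum bracketed in [x_{i-1}, x_{i+1}]
            return golden_section_max(f, x_prev2, x, tol)
        x_prev2, x_prev1, f_prev1 = x_prev1, x, fx
```

Stage 1 terminates as soon as a decrease is observed, which for a unimodal f guarantees the maximum lies inside the returned bracket.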
Figure 10.2: The simplex method

There also exist problems of unbounded searching in high-dimensional spaces. For example, consider an unconstrained maximization problem max f(x) for x ∈ R^n.
The simplex method for this problem can be described as follows. Initially, compute n + 1 function values f(x_1), ..., f(x_{n+1}). At each step, there are n + 1 function values, for simplicity still denoted by f(x_1), ..., f(x_{n+1}), stored by the algorithm. The algorithm first finds the minimal function value, say f(x_{n+1}), among the n + 1 function values, and then computes f(x̄_{n+1}), where x̄_{n+1} is the symmetric point of x_{n+1} with respect to the hyperplane determined by x_1, ..., x_n (see Figure 10.2). Note that in the above algorithm the two simplices x_1 x_2 ··· x_{n+1} and x_1 x_2 ··· x_n x̄_{n+1} are symmetric, so they have the same size. The algorithm can be improved to enlarge or reduce the size of the second simplex by using the technique for unbounded searching.
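One common way to realize the reflection step is to reflect the worst vertex through the centroid of the remaining vertices (a sketch of ours; for a regular simplex this coincides with the hyperplane reflection described above):

```python
def reflect_worst(f, simplex):
    """One step of the simplex method for maximization: replace the vertex
    with the smallest f-value by its reflection x_bar = 2c - x, where c is
    the centroid of the remaining n vertices."""
    values = [f(p) for p in simplex]
    w = values.index(min(values))            # index of the worst vertex
    rest = [p for i, p in enumerate(simplex) if i != w]
    dim = len(simplex[0])
    c = [sum(p[j] for p in rest) / len(rest) for j in range(dim)]
    reflected = tuple(2 * c[j] - simplex[w][j] for j in range(dim))
    new_simplex = list(simplex)
    new_simplex[w] = reflected
    return new_simplex
```

Because the reflected simplex has the same size as the old one, a practical variant would also expand or shrink it, exactly the refinement hinted at in the text.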
References

[1] J.A. Aslam and A. Dhagat, Searching in the presence of linearly bounded errors, Proceedings of the 23rd STOC, 1991, pp. 486-493.

[2] J.L. Bentley and A.C. Yao, An almost optimal algorithm for unbounded searching, Information Processing Letters 5 (1976) 82-87.

[3] A. Pelc, Searching with known error probability, Theoretical Computer Science 63 (1989) 185-202.

[4] M. Avriel, Nonlinear Programming: Analysis and Methods, (Prentice-Hall, Englewood Cliffs, New Jersey, 1976).
11 Group Testing on Graphs
Consider the group testing problem studied in Section 3.1, i.e., there exist two disjoint sets each containing exactly one defective. By associating each item with a vertex, the sample space can be represented by a bipartite graph where each edge represents a sample of the space S(2, n). This observation was first made by Spencer as reported in [7]. Chang and Hwang [7] conjectured that a bipartite graph with 2^k (k > 0) edges always has a subgraph, induced by a subset of vertices, with 2^{k−1} edges. While the conjecture remains open, it has stimulated subsequent research on group testing on graphs.
11.1 On Bipartite Graphs
A graph is an ordered pair of disjoint sets (V, E) such that E is a set of pairs of elements of V and V ≠ ∅. The set V is called the vertex set and the set E is called the edge set. When a graph is denoted by G, its vertex set and edge set are denoted by V(G) and E(G), respectively. Two vertices are adjacent if there is an edge between them. Two graphs are isomorphic if there exists a one-to-one correspondence between their vertex sets which preserves adjacency. A graph H is a subgraph of G if V(H) ⊆ V(G) and E(H) ⊆ E(G). Furthermore, the subgraph H is said to be induced by W if W = V(H) and E(H) = E(G) ∩ {(x, y) | x, y ∈ W}. In this case, denote H = G[W].

A graph is bipartite if its vertices can be put into two disjoint sets such that every edge is between these two sets. A bipartite graph is denoted by (A, B, E) where A and B are the two vertex sets and E is the edge set.

Aigner [1] proposed the following problem: Given a graph G, determine the minimum number M[1,G] such that M[1,G] tests are sufficient in the worst case to identify an unknown edge e, where each test specifies a subset W ⊆ V and tests whether the unknown edge e is in G[W] or not. An equivalent form of the Chang-Hwang conjecture is that for any bipartite graph G,

    M[1,G] = ⌈log |E(G)|⌉,

where |E(G)| is the number of edges in G. In fact, a bipartite graph G with m edges, 2^{k−1} < m ≤ 2^k, can always be embedded into a larger bipartite graph with 2^k edges.
If the conjecture of Chang and Hwang is true, then every test can identify exactly half of the edges; thus, k tests are enough. Clearly, k is a lower bound for M[1,G], so M[1,G] = k = ⌈log m⌉. Conversely, if m = 2^k and M[1,G] = k, then the first test has to identify 2^{k−1} edges; thus, the conjecture of Chang and Hwang is true.

Althofer and Triesch [5] proved the following.

Theorem 11.1.1 For any bipartite graph G,

    M[1,G] ≤ ⌈log |E(G)|⌉ + 1.

The proof depends on the following crucial lemma.

Lemma 11.1.2 Let a_1, a_2, ..., a_r be r nonnegative integers with r ≥ 2. Suppose that

    2^t = Σ_{i=1}^{r} 2^{a_i}

for some positive integer t. Then there exists a subset I of {1, ..., r} such that

    2^{t−1} = Σ_{i∈I} 2^{a_i}.

Proof. Without loss of generality, assume a_1 ≤ a_2 ≤ ··· ≤ a_r. Then a_1 = a_2, since otherwise every term in the equation 2^t = Σ 2^{a_i} is divisible by 2^{a_2} except 2^{a_1}, which is impossible. For r = 2, a_1 = a_2 implies that I = {1} meets the requirement. For r > 2, since 2^{a_1} + 2^{a_2} = 2^{a_2+1}, the claim follows by applying the induction hypothesis to the r − 1 integers a_2 + 1, a_3, ..., a_r (if the chosen subset uses the merged exponent a_2 + 1, replace it by the two indices 1 and 2). □
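A constructive version of the lemma (our sketch): since all terms are powers of two and the largest is at most 2^{t−1}, a descending greedy scan always reaches the target 2^{t−1} exactly, because the remaining target stays a multiple of the next (smaller) power of two.

```python
def half_subset(exponents):
    """Given a_1, ..., a_r (r >= 2) with sum of 2^(a_i) equal to 2^t,
    return a sub-multiset whose powers of two sum to 2^(t-1)."""
    total = sum(1 << a for a in exponents)
    target = total // 2
    chosen = []
    for a in sorted(exponents, reverse=True):
        if (1 << a) <= target:
            target -= 1 << a
            chosen.append(a)
        if target == 0:
            break
    return chosen
```

In the proof of Theorem 11.1.1 below, the exponents play the role of the (power-of-two) degrees of the A-side vertices.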
Now, the proof of Theorem 11.1.1 is given as follows.

Proof of Theorem 11.1.1. Note that adding good edges to G can only increase the number of tests. For any bipartite graph G = (A, B, F), add good edges (and possibly vertices) to G such that the resultant graph G' satisfies

    d_{G'}(u) = 2^{⌈log d_G(u)⌉} for every u ∈ A,

where d_G(u) is the degree of u in G. Clearly, |E(G')| ≤ 2|E(G)|. Now, add isolated good edges to G' in order to obtain a bipartite graph G'' such that

    |E(G'')| = 2^{⌈log |E(G')|⌉}.

Clearly,

    log |E(G'')| ≤ ⌈log |E(G)|⌉ + 1.

Since all degrees on the A-side of G'' are powers of 2 and |E(G'')| is a power of 2, Lemma 11.1.2 yields a subset of A-vertices whose degrees sum to exactly half of |E(G'')|; the subgraph induced by this subset together with B contains exactly half of the edges, and the same argument applies recursively to both possible outcomes of the test. Hence M[1,G''] = log |E(G'')|, and therefore

    M[1,G] ≤ M[1,G''] ≤ ⌈log |E(G)|⌉ + 1.  □

Let K_{x,y} be the complete bipartite graph with two vertex sets containing x and y vertices, respectively. Then in any induced subgraph of K_{x,y}, the number of edges is of the form x'y' where 0 ≤ x' ≤ x and 0 ≤ y' ≤ y. For x = 5, y = 9, neither ⌈45/2⌉ = 23 nor ⌊45/2⌋ = 22 is of this form. This means that K_{5,9} has no induced subgraph with nearly half the number of edges. Foregger (as reported in [7]) gave the above example to show that the conjecture of Chang and Hwang cannot be extended in this direction.
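Foregger's observation is easy to verify by brute force (our sketch):

```python
def is_product_form(m, x, y):
    """Can m be written as x' * y' with 0 <= x' <= x and 0 <= y' <= y?"""
    return any(a * b == m for a in range(x + 1) for b in range(y + 1))

# K_{5,9} has 45 edges; no induced subgraph has 22 or 23 edges.
```

Since every induced subgraph of K_{x,y} keeps x' vertices on one side and y' on the other, its edge count is exactly x'y', which is what the check enumerates.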
11.2 On Graphs
From Theorem 1.2.1,

    M[1,G] ≥ ⌈log |E(G)|⌉.

Aigner [2] conjectured that for all graphs G,

    M[1,G] ≤ log |E(G)| + c,

where c is a constant. Recently, Althofer and Triesch [5] settled this conjecture by proving the following.

Theorem 11.2.1 For all graphs G,

    M[1,G] ≤ ⌈log |E(G)|⌉ + 3.

Proof. Choose a bipartite subgraph H = (A, B, F) of G such that |F| is maximal and V(G) = A ∪ B (see Figure 11.1). For u ∈ A, denote by d_A(u) the degree of u in the induced subgraph G[A] and by d_H(u) the degree of u in H. Then

    d_A(u) ≤ d_H(u),
since otherwise the bipartite subgraph of G with two vertex sets A \ {u} and B ∪ {u} has more edges than H has. Hence

    |E(G[A])| = (1/2) Σ_{u∈A} d_A(u) ≤ (1/2) Σ_{u∈A} d_H(u) = (1/2)|E(H)|.

Similarly,

    |E(G[B])| ≤ (1/2)|E(H)|.

Without loss of generality, assume |E(G[A])| ≤ |E(G[B])|. Then

    |E(G[A])| ≤ (1/4)|E(G)|,    |E(G[B])| ≤ (1/2)|E(G)|.

Now, the theorem is proved by induction on |E(G)|. It is trivial for |E(G)| = 1. In the induction step, first test G[B]. If the unknown edge e is in G[B], then by the induction hypothesis,

    M[1,G] ≤ 1 + M[1,G[B]] ≤ 1 + ⌈log (1/2)|E(G)|⌉ + 3 = ⌈log |E(G)|⌉ + 3.

If e is not in G[B], then test G[A]. If the unknown edge e is in G[A], then by the induction hypothesis,

    M[1,G] ≤ 2 + M[1,G[A]] ≤ 2 + ⌈log (1/4)|E(G)|⌉ + 3 = ⌈log |E(G)|⌉ + 3.

If the unknown edge e is neither in G[B] nor in G[A], then it is in H, and by Theorem 11.1.1,

    M[1,G] ≤ 2 + M[1,H] ≤ 3 + ⌈log |E(G)|⌉.  □
So far, no graph G has been found to have M[1,G] ≥ ⌈log |E(G)|⌉ + 2. But there exist infinitely many complete graphs K_n such that M[1,K_n] = ⌈log |E(K_n)|⌉ + 1 (see Section 11.3). One believes that for all graphs G, M[1,G] ≤ ⌈log |E(G)|⌉ + 1.

Here, we raise another problem on searching for more unknown edges. Let M[d,G] be the minimum number k such that k tests are enough in the worst case to identify d defective (or unknown) edges in the graph G. Then by the lower bound on M(d,n), it is easily seen that for all graphs G and d ≥ 1,

    M[d,G] ≥ d log (|E(G)|/d).

We conjecture that there exists a constant c such that for all graphs G and d ≥ 1,

    M[d,G] ≤ d (log (|E(G)|/d) + c).
Figure 11.1: Graph G

It is interesting to notice that the proof technique above may not work well for this conjecture. In fact, for d ≥ 2, a new phenomenon is introduced in the testing. To explain this, consider the proof of Theorem 11.2.1. Assume that all edges in G[A] and G[B] have been identified, and look at the problem of identifying the edges in H. Note that each test on H corresponds to a test on G. If all edges in G[A] and G[B] are good, then the problem is reduced to testing H, since every test on H has the same outcome as that of the corresponding test on G. However, if G[A] or G[B] contains a defective edge, then a test on H may have a different test-outcome from that of the corresponding test on G when the latter contains the defective edge in G[A] or G[B]. Thus, in this case, the problem cannot simply be reduced to testing H.
11.3 On Hypergraphs
A hypergraph H is an ordered pair of disjoint sets (V, E) such that E is a collection of subsets of V and V ≠ ∅. The set V is the vertex set of H and the set E is the hyperedge set; they are also denoted by V(H) and E(H), respectively. The rank of H is defined by

    rank(H) = max_{e∈E} |e|.

For any u ∈ V, the degree of u in H is defined by

    deg_H(u) = |{e ∈ E | u ∈ e}|.
A hypergraph H' is a subhypergraph of H if V(H') ⊆ V(H) and E(H') ⊆ E(H). Furthermore, the subhypergraph H' is said to be induced by W if W = V(H') and E(H') = {e ∈ E(H) | e ⊆ W}. In this case, write H' = H[W].

Aigner's problem on graphs can be generalized to hypergraphs as follows: Given a hypergraph H, what is the minimum number M[1,H] such that M[1,H] tests are sufficient in the worst case to identify an unknown hyperedge e, if each test specifies an induced subhypergraph H[W] and tells whether the unknown hyperedge e is in H[W] or not? If all hyperedges of H are considered as items, then this problem is the CGT with constraint I on the testable groups, where

    I = {E(H[W]) | W ⊆ V(H)}.

From the lower bound of the CGT, it is easily seen that for any hypergraph H,

    M[1,H] ≥ ⌈log |E(H)|⌉.

Recently, Althofer and Triesch [5] proved the following.

Theorem 11.3.1 There exists a constant c_r such that for all hypergraphs H of rank r,

    M[1,H] ≤ log |E(H)| + c_r.
The proof of this theorem relies on the following lemmas.

Lemma 11.3.2 Let m = |E(H)|. Then there exists a subset W of V such that

    m/2 ≤ |E(H[W])| ≤ m/2 + r · m^{r/(r+1)}.

Proof. Choose W ⊆ V such that

(a) |E(H[W])| ≥ m/2,
(b) |E(H[W])| = min{|E(H[W'])| : |E(H[W'])| ≥ m/2, W' ⊆ V}, and
(c) |W| is minimal among all W satisfying (a) and (b).

By contradiction, assume that

    |E(H[W])| > m/2 + √r · m^{r/(r+1)}.

For any w ∈ W, if |{e ∈ E(H) | w ∈ e ⊆ W}| < √r · m^{r/(r+1)}, then deleting w from W would leave an induced subhypergraph with more than m/2 hyperedges, contradicting (c). Thus, for every w ∈ W,

    |{e ∈ E(H) | w ∈ e ⊆ W}| ≥ √r · m^{r/(r+1)}.

Therefore,

    √r · m^{r/(r+1)} |W| ≤ Σ_{w∈W} deg_H(w) ≤ rm.

Hence, |W| ≤ √r · m^{1/(r+1)}. Note that rank(H) = r, so H[W] can have at most |W|^r + 1 nonempty hyperedges. It follows that

    |E(H[W])| ≤ |W|^r + 1 ≤ √r · m^{r/(r+1)} + m/2,

a contradiction. □

Lemma 11.3.3 Suppose f(u) = √r · u^{r/(r+1)} and {x_t}_{t=0}^{∞} is a sequence of natural numbers satisfying x_0 = 1, x_{t+1} ≥ x_t, and

    |x_t − x_{t+1}/2| ≤ f(x_{t+1}).

Then there exists a constant c_f > 0 such that for all t,

    x_t ≥ c_f · 2^t.
X i+1
10'
Thus, 2 5
< _
xt x/+1
< _
3 5
Furthermore, for t > to,
fi±i > > ~
2(i2(i
/(x +l)
'
)
_ /(*'+*) . 5 } *<+i 2;
Thus,
Xt0+k>2*.Xto.i[(l-5-.l±^ j=l
Z
x
(o+j
—> 0 as t —> 00. Choose
Group Testing on Graphs
210
The infinite product P=fid
_ | . /(Es±i))
converges if and only if the series ]C?ii f(xto+i)/xto+j
converges. Note that
^ ^ = v~r- *;Vi r+1) < ((|) J ^o)- 1 / ( r + 1 ) Thus, the series converges, so the infinite product converges to, say, c'. Then for t>t0, xt > (c'2- t °a :(0 )2 ( . Choose cs = min(2- i o ,c'2-'°x ! o ). Then for all t, xt > c/2'.
Proof which which 11.3.2
D
of Theorem 11.3.1. Denote by Hi the subhypergraph of H, consisting of edges are consistent with the first i tests (Ho = H) and by W,- the vertex subset induces the i-th test subhypergraph where Wi is chosen according to Lemma such that \E(Hi^{Wi})\
- \\E(H^)\
<
f(\E(H^)\)
and ^(if,--!)! > \E(Hi)\. Suppose that the unknown edge is identified after s tests. Define Ct
)|
~ { 2'-*.
for t = 0 , l , - - - , s , for t > s.
Then {^t}^o satisfies the hypothesis of Lemma 11.3.3. Thus, xs>
cj • 2s,
hence, M[1,H] <s < log xs - log Cj <\og\E(H)\-\ogcs.
D
The prototype group testing with more than one defective can also be transformed into a special case of the problem of group testing on graphs. In fact, for the prototype group testing problem with n items and d defectives, consider a hypergraph H = (V, E) where V is the set of n items and E is the family of all subsets of d items. One can obtain the following correspondence between the two problems:
11.3 On Hypergraphs
211
the prototype CGT test T T pure (contaminated) h, 12,'' • ,U a r e defectives
CGT on hypergraph H test V\T V \T contaminated (pure) {n,»2,••-.*<*} is the defective hyperedge
According to this correspondence, it is easy to transform algorithms from one problem to the other. So, the two problems are equivalent and the following holds. Theorem 11.3.4 Let H = (V,E) where \V\ = n and E is the family of all subsets of d elements in V. Then M[1,H] = M(d,n). For d = 2, H is a graph. So, for a complete graph Kn of n vertices, M[l,tf n ] = M ( 2 , n ) < r i o g 2 Q l + l (see section 3.3). In Section 3.3, it is proved that if it satisfies
then nt(2) < it — 1 for t > 4 (see Lemma 3.3.3). It follows that for t > 4, M(2,it)
>t + l.
Moreover, for t > 4, it > 3, so
Thus,
Hence, M(2,i()>flogaQl+l. Thus, the following can be concluded. Theorem 11.3.5 There are infinitely many n such that M[1,G„]> [ l o g 2 Q l + l Note that M(d, n) < log (j) + d. We conjecture that M[1,H] < log \E\ + cr for any hypergraph H = (V, E) of rank r and a constant c.
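The correspondence in the table above is mechanical; a tiny sketch of ours checks that the hyperedge e lies in H[W] exactly when the complementary CGT test V \ W is pure:

```python
def cgt_outcome(test_group, defectives):
    """Prototype CGT: a test is contaminated iff it meets a defective item."""
    return bool(test_group & defectives)       # True = contaminated

def hyperedge_in_induced(W, e):
    """Hypergraph test: is the (defective) hyperedge e inside H[W]?"""
    return e <= W
```

The equivalence e ⊆ W ⟺ e ∩ (V \ W) = ∅ is exactly the outcome flip recorded in the table.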
11.4 On Trees
The following is a classic problem. Consider a rooted tree T, and suppose that each leaf is associated with an item and that there exists exactly one defective. Identify it by a sequence of tests {T_i}, where each T_i is on the set of all leaves which are descendants of the node i. (Namely, each test is on a subtree T_i rooted at i.) The test outcome of T_i indicates whether T_i contains the defective leaf or not.

This problem has a large number of applications. When a hierarchical organization of concepts forms the framework of an identification process, tests usually exist to determine whether the unknown concept can be identified with a given category in the hierarchy. For instance, the problems of identifying an unknown disease that a patient has, identifying an unknown chemical compound, and finding a faulty gate in a logic circuit all fit the above description.

Let |T| denote the number of leaves in a tree T. Pratt [11] and Spira [14] proved the following.

Lemma 11.4.1 Every nonempty finite binary tree T contains a subtree T' such that

    1/3 ≤ |T'|/|T| ≤ 2/3.

Proof. Let r be the root of T. For each node v, denote by T_v the subtree rooted at v, and for each internal node v denote by v_0 and v_1, respectively, its left and right sons. Without loss of generality, assume that |T_{v_0}| ≤ |T_{v_1}| for every v. First, one looks at the two sons r_0 and r_1 of the root r. If |T_{r_1}| ≤ (2/3)|T|, then the subtree T_{r_1} meets the requirement, since |T_{r_1}| ≥ (1/2)|T|. If |T_{r_1}| > (2/3)|T|, then one considers the son r_{11} of r_1. If |T_{r_{11}}| ≤ (2/3)|T|, then the subtree T_{r_{11}} meets the requirement, since

    |T_{r_{11}}| ≥ (1/2)|T_{r_1}| > (1/3)|T|.

If |T_{r_{11}}| > (2/3)|T|, then consider the son r_{111} of r_{11}. This process cannot go on forever, because one must have |T_{r_{1···1}}| ≤ (2/3)|T| when r_{1···1} is a leaf. Thus, a subtree with the required property exists. □

From this lemma, it is easy to see that the minimax number of tests required to determine a defective leaf in the binary tree T is at most ⌈log_{3/2} |T|⌉. However, this upper bound is not tight. In fact, if a test reduces the number of leaves by only one third, then the next test would be able to reduce it by a larger fraction. From this observation, Rivest [12] showed that k tests are sufficient to reduce the number of possibilities by a factor of 2/F_{k+3}, where F_i is the i-th Fibonacci number. Rivest's bound is tight. To see this, consider the Fibonacci trees {𝓕_i} constructed as follows (see Figure 11.2):
ℱ_1 = ℱ_2 = the trivial tree (i.e., the tree has only one node and no edge), ℱ_i = the binary tree whose root has the two sons ℱ_{i−2} and ℱ_{i−1}.
Figure 11.2: Fibonacci trees ℱ_1, ..., ℱ_5.
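The Fibonacci trees are easy to generate, and the count |ℱ_i| = F_i (the ith Fibonacci number, F_1 = F_2 = 1) can be checked directly. A small sketch (the encoding is ours):

```python
# Fibonacci trees: F_1 = F_2 = a single node; the root of F_i has the two
# sons F_{i-2} and F_{i-1}.  The tree F_i then has exactly F_i leaves.

def fib_tree(i):
    return 'leaf' if i <= 2 else (fib_tree(i - 2), fib_tree(i - 1))

def leaf_count(t):
    return 1 if t == 'leaf' else leaf_count(t[0]) + leaf_count(t[1])
```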
Let R_k(T) denote the minimax number of leaves left after k tests.

Theorem 11.4.2 R_k(ℱ_{k+3}) ≥ 2.

Proof. One proves this by induction on k. For k = 0, R_0(ℱ_3) is the number of leaves in ℱ_3, which equals 2. For the inductive step, one first notices that if a tree T contains two disjoint subtrees isomorphic to ℱ_k and ℱ_{k+1}, then R_{k−1}(T) ≥ R_{k−1}(ℱ_{k+2}). Now, suppose that the first test is on a subtree T' of ℱ_{k+3}. Then

R_k(ℱ_{k+3}) ≥ max{R_{k−1}(T'), R_{k−1}(ℱ_{k+3} \ T')}.

Note that ℱ_{k+3} has two disjoint subtrees ℱ_{k+1} and ℱ_{k+2} rooted at the second level. If T' = ℱ_{k+2}, then R_{k−1}(T') ≥ 2 by the induction hypothesis. If T' ≠ ℱ_{k+2}, then T' must be contained in one of three disjoint subtrees isomorphic to ℱ_{k+1}, ℱ_k, and ℱ_{k+1}, respectively. Thus, the complement of T' must contain disjoint subtrees isomorphic to ℱ_k and ℱ_{k+1}, and hence R_{k−1}(ℱ_{k+3} \ T') ≥ R_{k−1}(ℱ_{k+2}) ≥ 2. □

Rivest [12] proposed the following greedy algorithm.

Greedy Algorithm. Given a tree T with a defective leaf, if T has at least two nodes, then carry out the following stages in each iteration.

Stage 1. Find the lowest node a such that |T_a|/|T| ≥ 1/2.
Stage 2. If a is a leaf, then test a. Otherwise, let b be the son of a such that T_b has at least as many leaves as T_a \ T_b. If |T| − |T_a| ≥ |T_b|, then test T_a; otherwise, test T_b.

Stage 3. Suppose that the test on T' is performed in Stage 2. If T' is contaminated, then set T := T'; else, set T := T \ T'. If T has only one node, then stop; else, go back to Stage 1.

Lemma 11.4.3 Suppose that a test on subtree T' is chosen in Stage 2. Then

max{|T'|, |T| − |T'|} = min_{v ∈ V(T)} max{|T_v|, |T| − |T_v|}.

Proof. Clearly, if v is an ancestor of a, then

max{|T_a|, |T| − |T_a|} = |T_a| ≤ |T_v| = max{|T_v|, |T| − |T_v|}.

If v is not comparable with a, then |T_v| ≤ |T|/2. Let u be the lowest common ancestor of v and a. Then

|T_a| ≤ |T_u| − |T_v| ≤ |T| − |T_v| = max{|T_v|, |T| − |T_v|}.

If v is a descendant of a other than b, then

max{|T_b|, |T| − |T_b|} = |T| − |T_b| ≤ |T| − |T_v| = max{|T_v|, |T| − |T_v|}.

Finally, the proof is completed by noting that

max{|T'|, |T| − |T'|} ≤ min{|T_a|, |T| − |T_b|}. □
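The three stages of the greedy algorithm can be sketched directly (a straightforward, unoptimized Python rendering; the oracle standing in for the actual group test and all helper names are ours):

```python
# Rivest's greedy algorithm: Stage 1 finds the lowest node a with
# |T_a| >= |T|/2, Stage 2 tests T_a or T_b, Stage 3 keeps the contaminated part.
# Trees are leaves (strings) or (left, right) pairs; the oracle knows the defective.

def leaves(t):
    return [t] if not isinstance(t, tuple) else leaves(t[0]) + leaves(t[1])

def size(t):
    return len(leaves(t))

def lowest_heavy(t, n):
    """Stage 1: the lowest node a of t with size(a) >= n/2."""
    node = t
    while isinstance(node, tuple):
        big = max(node, key=size)          # the heavier son
        if 2 * size(big) < n:
            return node
        node = big
    return node

def contains(t, sub):
    if t is sub:
        return True
    return isinstance(t, tuple) and (contains(t[0], sub) or contains(t[1], sub))

def remove(t, sub):
    """t with the subtree `sub` deleted; the sibling replaces the parent."""
    l, r = t
    if l is sub:
        return r
    if r is sub:
        return l
    return (remove(l, sub), r) if contains(l, sub) else (l, remove(r, sub))

def greedy_search(t, defective):
    tests = 0
    while isinstance(t, tuple):
        n = size(t)
        a = lowest_heavy(t, n)
        if isinstance(a, tuple):           # Stage 2
            b = max(a, key=size)
            probe = a if n - size(a) >= size(b) else b
        else:
            probe = a
        tests += 1
        if defective in leaves(probe):     # the group test (oracle answer)
            t = probe                      # Stage 3: probe is contaminated
        else:
            t = remove(t, probe)           # probe is pure
    return t, tests
```

On a 7-leaf tree, |T| = 7 ≤ F_6 = 8, so Corollary 11.4.6 promises at most 4 tests; the sketch never exceeds that bound.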
Let G_k(T) be the maximum number of leaves left in the tree after k tests chosen by the greedy algorithm.

Theorem 11.4.4 For any binary tree T with |T| ≥ F_{k+3},

G_k(T)/|T| ≤ (2/F_{k+2}) · (G_1(T)/|T|)   for 1/2 ≤ G_1(T)/|T| ≤ F_{k+2}/F_{k+3},

G_k(T)/|T| ≤ (2/F_{k+1}) · (1 − G_1(T)/|T|)   for F_{k+2}/F_{k+3} ≤ G_1(T)/|T| ≤ 2/3.

First, note that if 1/2 ≤ G_1(T)/|T| ≤ F_{k+2}/F_{k+3}, then

(2/F_{k+2}) · (G_1(T)/|T|) ≤ (2/F_{k+2}) · (F_{k+2}/F_{k+3}) = 2/F_{k+3};

if F_{k+2}/F_{k+3} ≤ G_1(T)/|T| ≤ 2/3, then

(2/F_{k+1}) · (1 − G_1(T)/|T|) ≤ (2/F_{k+1}) · (1 − F_{k+2}/F_{k+3}) = (2/F_{k+1}) · (F_{k+1}/F_{k+3}) = 2/F_{k+3}.

Thus, this theorem implies the following.
Corollary 11.4.5 For any binary tree T with |T| ≥ F_{k+3},

R_k(T) ≤ G_k(T) ≤ (2/F_{k+3}) |T|.
Now, go back to the proof of Theorem 11.4.4.

Proof of Theorem 11.4.4. One proves it by induction on k. For k = 1, it follows from Lemma 11.4.1. For k ≥ 2, let U denote the subtree of T (either the first tested subtree or its complement) such that |U| = G_1(T), and let V be the complement of U with respect to T. Clearly, |V| ≤ |T|/2.

Case 1. V contains the defective leaf. In this case, after the first test the tree V is left. Thus,

G_k(T)/|T| = (G_{k−1}(V)/|V|) · (|V|/|T|) ≤ (2/F_{k+2}) · (|V|/|T|) = (2/F_{k+2}) · (1 − G_1(T)/|T|),

which is at most (2/F_{k+2}) · (G_1(T)/|T|), since 1 − G_1(T)/|T| ≤ G_1(T)/|T|, and also at most (2/F_{k+1}) · (1 − G_1(T)/|T|), since F_{k+2} ≥ F_{k+1}.

Case 2. U contains the defective leaf. In this case, one claims that G_1(U) ≤ |V|. In fact, if U = T_a, then

G_1(U) ≤ |T_b| ≤ |T| − |T_a| = |V|.

If U = T \ T_b, then |T_a| > |T| − |T_b|, so

G_1(U) ≤ max{|T_a| − |T_b|, |T| − |T_a|} ≤ max{|T_b|, |T| − |T_a|} = |T_b| = |V|.

Thus, it is always true that G_1(U) ≤ |V|. For 1/2 ≤ G_1(T)/|T| ≤ F_{k+2}/F_{k+3}, one has

G_k(T)/|T| = (|U|/|T|) · (G_{k−1}(U)/|U|) ≤ (G_1(T)/|T|) · (2/F_{k+2}).

For F_{k+2}/F_{k+3} ≤ G_1(T)/|T| ≤ 2/3, one has

1/2 ≤ G_1(U)/|U| ≤ |V|/|U| = (|T| − G_1(T))/G_1(T) ≤ F_{k+1}/F_{k+2}.

Therefore,

G_k(T)/|T| = (|U|/|T|) · (G_{k−1}(U)/|U|) ≤ (G_1(T)/|T|) · (|V|/|U|) · (2/F_{k+1}) = (1 − G_1(T)/|T|) · (2/F_{k+1}). □
Let M(T) be the minimax number of tests for identifying a defective leaf from T. Then by Corollary 11.4.5, the following holds.

Corollary 11.4.6 M(T) ≤ min{k + 1 | |T| ≤ F_{k+3}}.
Pelc [10] studied the same problem allowing a lie. Let M_r(T) denote the minimax number of tests for identifying a defective leaf from T when at most r of the test outcomes may be lies.

Theorem 11.4.7 Let T(n) be the complete balanced binary tree of n levels. Then

M_1(T(n)) = 2n + 1 − max{k | M_1(T(k)) < n}.

The problem of determining M_r(T(n)) for r ≥ 2 is still open. The tree search problem studied in this section was also investigated by Garey and Graham [9] and Garey [8], who analyzed its average-case complexity.
11.5
Other Constraints
Other constraints have been studied in the literature. We give a brief survey in this section.

Aigner [2] considered the additional constraint that each test set has at most k items. For searching an edge in a graph with this additional constraint, he argued that the problem is not so simple even for k = 1. For k ≥ 2, no nontrivial result has been derived.

Aigner and Triesch [4] studied the problem of searching for subgraphs as follows: Let 𝒢 be a collection of graphs with the same vertex set V. Determine the minimum number L(𝒢) such that L(𝒢) tests are sufficient in the worst case to identify G ∈ 𝒢 if each test is on an edge e and tells whether e is in G or not. This problem is also a constrained group testing problem if the graphs in 𝒢 are regarded as items, with the constraint that only subsets in I can be tested, where

I = {{G ∈ 𝒢 | e ∈ E(G)} | e ⊆ V, |e| = 2}.

Two interesting results obtained in [4] are as follows.

Theorem 11.5.1 Let ℳ be the collection of matchings on n vertices, n ≥ 2 even. Then L(ℳ) = (n − 2)²/2.

Theorem 11.5.2 Let 𝒯 be the collection of trees on n vertices. Then L(𝒯) = C(n, 2) − 1, where C(n, 2) = n(n − 1)/2.

A potential area of research about constrained testing is software testing. Software testing is an important and largely unexplored area. It can be formulated as a testing problem on graphs with possible failure edges and failure vertices (see [13]). It is a complicated problem and worth studying.
References

[1] M. Aigner, Search problems on graphs, Discrete Appl. Math. 14 (1986) 215-230.

[2] M. Aigner, Combinatorial Search (Wiley-Teubner, 1988).

[3] M. Aigner and E. Triesch, Searching for an edge in a graph, J. Graph Theory, to appear.

[4] M. Aigner and E. Triesch, Searching for subgraphs, Combinatorica, to appear.

[5] I. Althöfer and E. Triesch, Edge search in graphs and hypergraphs of bounded rank, unpublished manuscript.

[6] T. Andreae, A search problem on graphs which generalizes some group testing problems with two defectives, Discrete Math., to appear.

[7] G.J. Chang and F.K. Hwang, A group testing problem, SIAM J. Alg. Disc. Meth. 1 (1980) 21-24.

[8] M.R. Garey, Optimal binary identification procedures, SIAM J. Appl. Math. 23 (1972) 173-186.

[9] M.R. Garey and R.L. Graham, Performance bounds on the splitting algorithm for binary testing, Acta Informatica 3 (1974) 347-355.

[10] A. Pelc, Prefix search with a lie, Journal of Combinatorial Theory 48 (1988) 165-173.

[11] V.R. Pratt, The effect of basis on size of boolean expressions, Proc. 16th FOCS (1975) 119-121.

[12] R.L. Rivest, The game of "N questions" on a tree, Discrete Mathematics 17 (1977) 181-186.

[13] S. Sahni, Software Development in Pascal (The Camelot Publishing Co., Fridley, Minnesota, 1985) pp. 325-368.

[14] P.M. Spira, On time hardware complexity tradeoffs for boolean functions, Proc. 4th Hawaiian Inter. Symp. System Science (1971) 525-527.
12 Membership Problems
Consider a set of n items having d defective ones. Given a family of subsets of items, what is the minimum number t such that there exists an algorithm, using at most t tests on subsets in the family, for identifying whether a given item is defective or not? This problem is called the membership problem; it is a variation of group testing.
12.1
Examples
Clearly, for the membership problem, if individual testing is allowed, then the solution is trivial, i.e., individual testing is optimal. So, usually, the given family of subsets does not contain singletons.

Suppose the given family consists of all subsets of two items. Clearly, if d + 1 = n, then there exists no solution for the membership problem, because all tests receive the same answer. If d + 2 ≤ n, then d + 1 tests are sufficient. In fact, for any item x, choose d + 1 other items to form d + 1 pairs with x, and test these d + 1 pairs. If one of these pairs is pure, then x is good; otherwise, x is defective. Actually, d + 1 is the minimum number of tests in the worst case to identify whether an arbitrarily given item is good or not. The proof is as follows. Given a family of subsets of two items, a subset of items is called a vertex-cover of the family if it intersects each subset in the family in at least one item. To do the worst-case analysis, consider a test history in which every test obtains the positive outcome. (If for some test the positive outcome is impossible, then the test group must consist of already-known good items and this test should be removed.) Suppose that the history contains t tests and x is identified to be defective in the history. Then any vertex-cover of the t contaminated sets, not containing x, must contain at least d + 1 elements. (Otherwise, one cannot know that x must be defective.) Thus, t ≥ d + 1.

Suppose that the given family consists of all subsets of three items. Similarly, consider a test history consisting of t tests, every test producing the positive outcome. Suppose that x is identified to be defective in this history. Then, every subset of items, not containing x, which has nonempty intersection with every test group in
the history must have size at least d + 1. Delete x and take away an item from each test set not containing x. Then the resulting graph has at most n − 1 vertices and a minimum vertex-cover of size at least d + 1. Thus, t is not smaller than the minimum number of edges for a graph with at most n − 1 vertices whose minimum vertex-cover has size at least d + 1. Conversely, suppose G is a graph with at most n − 1 vertices whose minimum vertex-cover has size at least d + 1. Associate every vertex with an item other than x, and for each edge give a test set consisting of x and the two items associated with the two endpoints of the edge. Then x can be identified by these tests. In fact, if one test has the negative outcome, then x is good; if all tests have the positive outcome, then x is defective. Thus, the minimum number of tests for identifying an item with 3-element test sets is exactly the minimum number of edges in a graph with at most n − 1 vertices whose minimum vertex-cover has size at least d + 1. Since a complete graph on n − 1 vertices has a minimum vertex-cover of size n − 2, the membership problem has solutions if and only if n ≥ d + 3. In general, if the given family consists of all subsets of k (≥ 2) items, then the problem has solutions if and only if n ≥ d + k. However, computing the optimal solution seems not so easy for k ≥ 3.

The membership problem can also be represented as a game of two players. The first player keeps an item x in his mind; the second one asks questions in the form "Is x ∈ S?" for some S in the given family. The first player then answers the question according to the item x he has in mind. The game ends when the second player identifies the membership of x based on the questions and answers.
The problem is to find the optimal strategy for the second player. The game-theoretic formulation is used to study the following examples.

A polyhedron is a convex set with a linear boundary. A polyhedral set is a union of finitely many disjoint open polyhedra. Given a polyhedral set P and a point x, decide whether x is in P or not by using the minimum number of linear comparisons, i.e., by asking questions "Is ℓ(x) < 0, ℓ(x) = 0, or ℓ(x) > 0?" for linear functions ℓ(·). This problem is called the polyhedral membership problem. Several natural problems can be transformed to this problem. The following is one of them, which was initially studied by Dobkin and Lipton [5]: Given n real numbers x_1, x_2, ..., x_n, decide whether they are distinct by using pairwise comparisons with answers "bigger", "equal", or "less". Represent the n real numbers by an n-dimensional vector x = (x_1, ..., x_n). Let P = {(y_1, ..., y_n) | y_i ≠ y_j, 1 ≤ i < j ≤ n}. Then the problem is to determine whether x is in P or not, a polyhedral membership problem. It is worth mentioning that P does not have to be a polyhedron.

A binary string is a finite sequence of symbols 0 and 1. Consider 2^n items encoded by binary strings of length n. Let ℱ denote the family of subsets of the form {x_1x_2···x_n | x_i = 0}. Suppose the set of defectives consists of all strings having an odd number of 1's. Then for any item x chosen by the first player, the second player has to ask at least n questions to identify whether the item is defective or not. In fact, from each question the second player can learn one symbol of x. But, to identify whether x is defective
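The parity example can be checked mechanically: flipping any single symbol of x toggles defectiveness, which is exactly why no strategy can stop before all n symbols are known. A small sketch:

```python
# Every symbol matters for parity: changing one bit always changes the answer,
# so the second player needs all n questions.

from itertools import product

def defective(x):                 # defective = odd number of 1's
    return sum(x) % 2 == 1

n = 4
for x in product((0, 1), repeat=n):
    for i in range(n):
        y = list(x)
        y[i] ^= 1                 # flip one symbol
        assert defective(x) != defective(tuple(y))
```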
or not, the second player has to know all symbols of x. This is due to the fact that x can change from defective to good or from good to defective as a symbol of x changes between 0 and 1. Since n questions are clearly sufficient, the minimum number of questions that the second player needs to ask is n.

A large class of membership problems is about graphs. Consider the set of all graphs with a fixed vertex set V of n vertices. Suppose a graph is defective if and only if it is connected. In the game, the first player chooses a graph on V and the second player determines whether the graph is connected by asking whether selected edges exist in the chosen graph. Then in the worst case, the second player has to ask n(n − 1)/2 questions. To see this, note that the worst-case scenario means that the first player may change his choice as long as it does not contradict his previous answers during the game. Now, assume that the first player always wants to keep the game going as long as possible. Note that from each question the second player learns only whether one edge exists or not. Consider the following strategy of the first player: Give the "Yes"-answer only in the case that giving the "No"-answer would let the second player conclude immediately that the graph is disconnected. In this way, the second player has to ask about every possible edge. This fact is proved by contradiction. Note that according to the strategy of the first player, the final graph he chooses is connected. Suppose, on the contrary, that this strategy does not work, i.e., there is an edge (s, t) which has not been asked about, but the second player finds that the edges (i, j) which already received the "Yes"-answer form a connected graph G. Adding the edge (s, t) to G results in a cycle containing (s, t). From this cycle, choose an edge (k, h) other than (s, t). Then (G ∪ {(s, t)}) \ {(k, h)} is still a connected graph, contradicting the strategy for giving the "Yes"-answer to the question about edge (k, h).
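The first player's strategy can be simulated: answer "No" to an edge query unless the "Yes"-edges together with all still-unasked edges would then fail to be connected. A sketch (the query order and all names are ours):

```python
# The adversary for connectivity: a "No" is given whenever the graph could
# still turn out connected without that edge; otherwise the edge is forced in.

from itertools import combinations

def connected(n, edges):
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, stack = {0}, [0]
    while stack:
        u = stack.pop()
        for w in adj[u] - seen:
            seen.add(w)
            stack.append(w)
    return len(seen) == n

def adversary_game(n, order):
    yes, unasked = set(), set(order)
    for e in order:
        unasked.discard(e)
        if not connected(n, yes | unasked):
            yes.add(e)          # a "No" here would settle the game early
    return yes

n = 5
order = list(combinations(range(n), 2))
yes = adversary_game(n, order)
```

The invariant is that the "Yes"-edges plus the unasked edges always form a connected graph, so after all n(n − 1)/2 questions the committed graph is connected and the second player could never have stopped early.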
12.2
Polyhedral Membership
In the last section, we mentioned the polyhedral membership problem. Suppose that a polyhedral set P and a point x are given. If the boundary of P can be described by m linear functions g_i, 1 ≤ i ≤ m, a naive way to determine whether a point x is in P or not is to compute all g_i(x). Thus, m tests are required. However, some preliminary work may reduce the number of tests. For example, consider a polygon P in the plane (see Figure 12.1). P has m vertices and m edges determined by the g_i's. Choose a point o in the interior of P and let the vertices v_1, ..., v_m of P be arranged counterclockwise. Through each pair of o and a vertex v_i, a line is determined. With such preparation, the wedge o v_i v_{i+1} containing the point x can be determined with at most ⌈log(m + 1)⌉ tests. In fact, the first test uses a line through some o v_i such that the line cuts the plane into two parts, each of which contains at most ⌈(m + 1)/2⌉ wedges. (Note: one of the m wedges may be cut into two pieces.) Such a line exists because, rotating the line around o, the number of wedges on each side of the line changes by at most one at a time; and when the line has been rotated 180°, the two sides of the line are interchanged.

Figure 12.1: A polygon in the plane.

Clearly, each of the remaining tests can also remove half of the remaining wedges. Thus, with at most ⌈log(m + 1)⌉ tests, only one wedge is left. Now, one more test with the line v_i v_{i+1} is enough to determine whether the point x is in P or not. So, the total number of tests is ⌈log(m + 1)⌉ + 1. Yao and Rivest [8] proved an elegant lower bound as follows.
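For a convex polygon, the wedge search is the classic O(log m) point-in-polygon test. A sketch using the first vertex (rather than an interior point o) as the apex of the fan, a standard simplification rather than the book's exact construction:

```python
# Binary search over the fan of wedges v0-v_i-v_{i+1} of a convex polygon
# (vertices in counterclockwise order), then one final edge test: about
# ceil(log m) + 1 sign tests in total.

def cross(o, a, b):
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def in_convex_polygon(x, verts):
    v0, m = verts[0], len(verts)
    if cross(v0, verts[1], x) < 0 or cross(v0, verts[m - 1], x) > 0:
        return False                    # outside the whole fan of wedges
    lo, hi = 1, m - 1
    while hi - lo > 1:                  # each sign test halves the wedge range
        mid = (lo + hi) // 2
        if cross(v0, verts[mid], x) >= 0:
            lo = mid
        else:
            hi = mid
    return cross(verts[lo], verts[lo + 1], x) >= 0   # the one last line test
```

For the square `[(0,0), (2,0), (2,2), (0,2)]`, the point (1, 1) is inside and (3, 1) is outside.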
Theorem 12.2.1 Let P be a polyhedral set in R^n, and let C(P) be the minimum number of tests sufficient in the worst case to determine the membership of a point in P. Then

C(P) ≥ ⌈(1/2) log_2 f_d(P)⌉ for any d,

where f_d(P) is the number of d-dimensional facets of P. (A facet of P = {x ∈ R^n | h_i(x) = a_i^T x − b_i ≤ 0, i = 1, ..., m} is a subset of the form {x ∈ R^n | h_i(x) ≤ 0 for i ∈ I, h_i(x) = 0 for i ∉ I}, where I ⊆ {1, ..., m}.)

Proof. Let h(x) = a^T x − b be a linear function specifying a test. For any polyhedral set P, denote P^− = {x ∈ P | h(x) ≤ 0} and P^+ = {x ∈ P | h(x) ≥ 0}. Then for each d-dimensional facet F of P, at least one of F^− and F^+ is a d-dimensional facet of P^− or P^+. So,

f_d(P^−) + f_d(P^+) ≥ f_d(P).

Suppose t tests are sufficient in the worst case to determine the membership of a point x* in P. The point x* may always be asked to lie in the part with more d-dimensional facets, so some outcome sequence leaves a part with at least f_d(P)/2^t d-dimensional facets. When the testing ends, the t linear inequalities given by the t tests must form a polyhedron falling inside P (otherwise, x* cannot be identified to be in P), so each remaining facet can be represented through these t linear functions. Clearly, a polyhedron represented by t linear inequalities can have at most 2^t d-dimensional facets. Thus,

f_d(P)/2^t ≤ 2^t,

that is, 4^t ≥ f_d(P). It follows that C(P) ≥ ⌈(1/2) log_2 f_d(P)⌉. □
This theorem was also discovered by Kalinova [7], Moravek [10] (for d = 0), and Moravek and Pudlak [11]. Dobkin and Lipton [5] showed

C(P) = Ω(log_2 β_0(P)),

where β_0(P) is the number of connected components of P. Steele and Yao [16] and Ben-Or [1] investigated this lower bound further. Recently, Bjorner, Lovasz, and Yao [3] proved a new bound

C(P) ≥ log_3 |χ(P)|,

where χ(P) is the Euler characteristic of P, that is,

χ(P) = f_0(P) − f_1(P) + f_2(P) − ··· + (−1)^n f_n(P).

Yao [22] generalized this bound from polyhedral sets to semi-algebraic sets.
12.3
Boolean Formulas and Decision Trees
A boolean function is a function whose variables' values and function value all are in {true, false}. Usually, one denotes "true" by 1 and "false" by 0. The following table gives three boolean functions: conjunction ∧, disjunction ∨, and negation ¬. The first two have two variables and the last one has only one variable.

x  y  x ∧ y  x ∨ y  ¬x
0  0    0      0     1
0  1    0      1     1
1  0    0      1     0
1  1    1      1     0

Exclusive-or ⊕ is also a boolean function of two variables, which is given by x ⊕ y = ((¬x) ∧ y) ∨ (x ∧ (¬y)). For simplicity, one also writes x ∧ y = xy, x ∨ y = x + y, and ¬x = x̄. Conjunction, disjunction, and exclusive-or all follow the commutative law and the associative law. The distributive law holds for conjunction to disjunction,
disjunction to conjunction, and conjunction to exclusive-or, i.e.,

(x + y)z = xz + yz, xy + z = (x + z)(y + z), and (x ⊕ y)z = xz ⊕ yz.

An interesting and important law about negation is De Morgan's law, i.e.,

¬(xy) = (¬x) + (¬y) and ¬(x + y) = (¬x)(¬y).

An assignment for a boolean function of n variables is a binary string of n symbols; each symbol gives a value to a variable. A partial assignment is an assignment to a subset of variables. A truth-assignment for a function f is an assignment x with f(x) = 1. Suppose that x_1 = e_1, ..., x_k = e_k form a partial assignment for the function f. Then f|_{x_1=e_1,...,x_k=e_k} denotes the function obtained by substituting x_1 = e_1, ..., x_k = e_k into f, which is a function of the variables other than x_1, ..., x_k.

The membership problem with binary strings as items and ℱ as the given family can be interpreted as the problem of evaluating a boolean function. Suppose that a binary string is defective if and only if it is a truth-assignment. Note that each question "Is the string in {x_1···x_n | x_i = 1}?" is equivalent to the question "Is x_i = 1?". So, one can also describe the game-theoretic formulation in the following way: The second player picks a variable and then the first player gives a value to the variable.

A tool for studying this game is the decision tree, defined as follows. A decision tree of a boolean function f is a binary tree whose internal nodes are labeled by variables and whose leaves are labeled by 0 and 1. Each internal node reaches its two children along two edges labeled by 0 and 1, corresponding to the two values that its variable may take. Given an assignment to a boolean function represented by a decision tree T, T computes the function value in the following way: Find a path from the root to a leaf such that all variables on the path take their values in the assignment. Then the value of the leaf on the path is the value of the boolean function at the assignment. For example, a decision tree is given in Figure 12.2 which computes the function

f(x_1, x_2, x_3) = (x_1 + x_2)(x_2 + x_3).

A decision tree for the function involved in the game is a strategy of the second player. As the first player always plays the adversary to the second player, the second player has to go along the longest path of the tree. So, the optimal strategy for the second player is to choose a decision tree with the minimum longest path. (The length of a path is the number of edges on the path.) For a boolean function f, the minimum length of the longest path of a decision tree computing f is denoted by D(f). D(f) is the minimum number of questions that the second player has to ask in order to identify the membership of an item in the worst case. Clearly, D(f) ≤ n when f has n variables. For convenience, assume that for the constant functions f ≡ 0 and f ≡ 1, D(f) = 0. Note that for every nonconstant function f, D(f) ≥ 1. So, D(f) = 0 if and only if f is a constant. A boolean function f of n variables is elusive if D(f) = n. The following are some examples on the elusiveness of boolean functions.
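D(f) can be computed for small functions straight from its minimax definition: D(f) = 0 for constants, and otherwise D(f) = 1 + min over variables x_i of max(D(f|_{x_i=0}), D(f|_{x_i=1})). A brute-force sketch over truth tables (the encoding is ours):

```python
from functools import lru_cache

# A function of k variables is a tuple of 2^k output bits, row j listing
# f at the assignment whose bits (most significant first) are x_1 ... x_k.

def restrict(table, k, i, val):
    """Truth table of f with variable i (0-indexed) fixed to val."""
    bit = k - 1 - i
    return tuple(table[j] for j in range(len(table)) if (j >> bit) & 1 == val)

@lru_cache(maxsize=None)
def D(table):
    k = (len(table) - 1).bit_length()      # len(table) == 2^k
    if len(set(table)) == 1:
        return 0                           # constant function
    return 1 + min(max(D(restrict(table, k, i, 0)),
                       D(restrict(table, k, i, 1)))
                   for i in range(k))
```

For the function f(x_1, x_2, x_3) = (x_1 + x_2)(x_2 + x_3) of Figure 12.2 the table is (0, 0, 1, 1, 0, 1, 1, 1) and D(f) = 3, so this function is elusive; two-variable parity (0, 1, 1, 0) likewise gives D = 2 = n.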
Figure 12.2: A decision tree.

Example 1. In a tournament, there are n players 1, ..., n. Let x_{ij} be the result of the match between players i and j, i.e.,

x_{ij} = 1 if i beats j, and x_{ij} = 0 if j beats i.

(Note that this is not necessarily a transitive relation.) Consider the following function:

t(x_{12}, x_{13}, ..., x_{n−1,n}) = 1 if there is a player who beats all other players, and 0 otherwise.

Then D(t) ≤ 2(n − 1) − ⌊log_2 n⌋. To show this, one needs to design a tournament such that within 2(n − 1) − ⌊log_2 n⌋ matches, the value of the function t can be determined. This tournament has two stages. The first stage consists of a balanced knockout tournament. Let i be the winner of the knockout tournament. In the second stage, the player i plays against everyone whom he did not meet in the knockout tournament. If i wins all his matches, then t equals 1; otherwise, t equals 0. A knockout tournament for n players contains n − 1 matches, in which the winner i plays at least ⌊log_2 n⌋ times. So, the total number of matches is at most 2(n − 1) − ⌊log_2 n⌋, and hence t is not elusive.
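The two-stage tournament can be simulated to confirm the match count (the pairing scheme and the `beats` oracle are our stand-ins for the variables x_ij):

```python
# Stage 1: a balanced knockout (n - 1 matches, the winner plays at least
# floor(log2 n) of them).  Stage 2: the knockout winner meets everyone he has
# not met.  Total: at most 2(n - 1) - floor(log2 n) matches to evaluate t.

def has_champion(n, beats):
    matches, met = 0, {i: set() for i in range(n)}
    alive = list(range(n))
    while len(alive) > 1:
        nxt = []
        for a, b in zip(alive[::2], alive[1::2]):
            matches += 1
            met[a].add(b)
            met[b].add(a)
            nxt.append(a if beats(a, b) else b)
        if len(alive) % 2:
            nxt.append(alive[-1])          # bye for an odd player out
        alive = nxt
    w = alive[0]
    for j in range(n):                     # stage 2
        if j != w and j not in met[w]:
            matches += 1
            if not beats(w, j):
                return False, matches
    return True, matches
```

With n = 8 and the oracle beats(a, b) = (a < b), player 0 wins everything in 7 + 4 = 11 = 2·7 − 3 matches, exactly the bound.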
Example 2. Consider the following function:

m(x_{11}, ..., x_{nn}) = ∧_{i=1}^{n} ∨_{j=1}^{n} x_{ij}.

Then m is elusive. To see this, consider a decision tree computing m. The first player looks for a path in the following way. Starting from the root, suppose that
currently he faces a question on the variable x_{ij}. If all other variables in row i have been assigned the value 0, then the first player assigns 1 to x_{ij}; otherwise, he assigns 0 to x_{ij}. In this way, it is easy to see that before all variables are assigned, the second player cannot know the value of m. This means that the second player has to encounter all variables. Thus, m is elusive.

A boolean function is monotone if it can be expressed with only AND and OR operations. A boolean function is called a tree function if it is monotone and each variable appears exactly once in its expression. The above function m is an example of a tree function. By a similar argument, one can show that every tree function is elusive. To find the longest path, consider the following strategy: If the variable x that the first player meets appears in a sum in the expression, then he sets x = 0; otherwise, x must appear in a product in the expression, and he sets x = 1. After assigning x a value, simplify the expression. This strategy only makes assigned variables disappear from the expression. Therefore, the second player has to encounter all variables under this strategy.

The following are some general results on elusiveness.

Theorem 12.3.1 A boolean function with an odd number of truth-assignments is elusive.

Proof. The constant functions f ≡ 0 and f ≡ 1 have 0 and 2^n truth-assignments, respectively. Hence, a boolean function with an odd number of truth-assignments must be a nonconstant function. If f has at least two variables and x_i is one of them, then the number of truth-assignments of f is the sum of those of f|_{x_i=0} and f|_{x_i=1}. Therefore, either f|_{x_i=0} or f|_{x_i=1} has an odd number of truth-assignments. Thus, tracing the odd number of truth-assignments, one will encounter all variables along a path of any decision tree computing f. □

Define p_f(t) = Σ_{x∈{0,1}^n} f(x) t^{||x||}, where ||x|| is the number of 1's in the string x. It is easy to see that p_f(1) is the number of truth-assignments for f.
The following theorem is an extension of the above.

Theorem 12.3.2 For a boolean function f of n variables, (t + 1)^{n−D(f)} | p_f(t).

Proof. First, note that if f ≡ 0 then p_f(t) = 0, and if f ≡ 1 then p_f(t) = (t + 1)^n. This means that the theorem holds for D(f) = 0. Now, consider f with D(f) > 0 and a decision tree of depth D(f) computing f. Without loss of generality, assume that the root is labeled by x_1. Denote f_0 = f|_{x_1=0} and f_1 = f|_{x_1=1}. Then

p_f(t) = Σ_{x∈{0,1}^{n−1}} f(0x) t^{||x||} + Σ_{x∈{0,1}^{n−1}} f(1x) t^{1+||x||} = p_{f_0}(t) + t·p_{f_1}(t).
Note that D(f_0) ≤ D(f) − 1 and D(f_1) ≤ D(f) − 1. Moreover, by the induction hypothesis, (t + 1)^{n−1−D(f_0)} | p_{f_0}(t) and (t + 1)^{n−1−D(f_1)} | p_{f_1}(t). Thus, (t + 1)^{n−D(f)} | p_f(t). □
An important corollary is as follows. Denote μ(f) = p_f(−1).

Corollary 12.3.3 If μ(f) ≠ 0, then f is elusive.

Next, we present an application of the above criterion. Let H be a subgroup of the permutation group S_n on {1, ..., n}. H is transitive if for any i, j ∈ {1, ..., n}, there exists σ ∈ H such that σ(i) = j.
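p_f(t) and μ(f) = p_f(−1) are easy to evaluate by brute force for small n, which makes the criterion concrete (the encoding of functions is ours):

```python
from itertools import product

# p_f(t) = sum of t^{||x||} over the truth-assignments x of f; mu(f) = p_f(-1).
# Corollary 12.3.3: mu(f) != 0 implies that f is elusive.

def p_f(f, n, t):
    return sum(t ** sum(x) for x in product((0, 1), repeat=n) if f(x))

def mu(f, n):
    return p_f(f, n, -1)
```

For OR on two variables p_f(t) = 2t + t², so μ = −1 ≠ 0 and OR is elusive, while f(x) = x_1 (which ignores x_2 and has D(f) = 1) gives p_f(t) = t + t² = t(t + 1) and μ = 0, consistent with Theorem 12.3.2 since (t + 1)^{2−1} divides it.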
12.4
Recognition of Graph Properties
Let {1, ..., n} be the vertex set of a graph G = (V, E). Its adjacency matrix is (x_{ij}) defined by

x_{ij} = 1 if {i, j} ∈ E, and x_{ij} = 0 otherwise.

Note that x_{ij} = x_{ji} and x_{ii} = 0. So, there are only n(n − 1)/2 independent variables, e.g., x_{ij}, 1 ≤ i < j ≤ n. If one uses the string x_{12}x_{13}···x_{1n}x_{23}···x_{n−1,n} to represent the graph G, then all graphs with n vertices having a certain property form a set of strings which are the truth-assignments of a boolean function. This boolean function characterizes the property. For example, connectivity corresponds to a boolean function f_{con} such that f_{con}(x_{12}, ..., x_{n−1,n}) = 1 if and only if the graph G with adjacency matrix (x_{ij}) is connected. For simplicity of notation, a graph itself will be viewed as a string or a boolean assignment to n(n − 1)/2 variables.

Not every boolean function of n(n − 1)/2 variables is a graph property. Because a graph property should be invariant under graph isomorphism, a boolean function
f of n(n − 1)/2 variables is a graph property if and only if for every permutation σ on {1, ..., n},

f(x_{12}, ..., x_{n−1,n}) = f(x_{σ(1)σ(2)}, ..., x_{σ(n−1)σ(n)}).

There is a lot of research that has been done on decision trees of graph properties [4]. In 1973, Aanderaa and Rosenberg (as reported in [13]) conjectured that there exists a positive constant ε such that for every nontrivial monotone graph property P, D(P) ≥ εn². Rivest and Vuillemin [12] proved the conjecture in 1975. In this section, their result is presented.

First, monotone nontrivial bipartite graph properties are studied. A bipartite graph property is a boolean function f(x_{11}, x_{12}, ..., x_{1n}, x_{21}, ..., x_{mn}) such that for any permutation σ of 1, 2, ..., m and any permutation τ of 1, 2, ..., n,

f(x_{11}, ..., x_{mn}) = f(x_{σ(1)τ(1)}, ..., x_{σ(m)τ(n)}).
Lemma 12.4.1 Let P be a monotone nontrivial property of bipartite graphs between vertex sets A and B with |A| · |B| a prime power. Then P is elusive.

Proof. This is a corollary of Theorem 2.3. In fact, in order to transform an edge (i, j) to another edge (i', j'), where i, i' ∈ A and j, j' ∈ B, one needs only to choose a permutation σ on A and a permutation τ on B such that σ(i) = i' and τ(j) = j'. Thus, the bipartite graph property is weakly symmetric. Since the number of edges of a complete bipartite graph between A and B is the prime power |A| · |B|, all conditions of Theorem 2.3 hold. □
Lemma 12.4.2 Let P be a nontrivial monotone property of graphs of order n. If 2^m ≤ n, then D(P) ≥ min{D(P'), 2^{2m−2}} for some nontrivial monotone property P' of graphs of order n − 1.

Proof. The idea is to reduce the computation of P on n vertices to the computation of a nontrivial monotone graph property on n − 1 vertices (in Cases 1 and 2) or of a nontrivial monotone bipartite graph property (in Case 3). Let K_{n−1} be the complete graph on the n − 1 vertices 2, ..., n and K_{1,n−1} the complete bipartite graph between 1 and {2, ..., n}.

Case 1. {1} ∪ K_{n−1} has property P. In this case, let P' be the property that a graph G on the vertices 2, ..., n has property P' if and only if {1} ∪ G has property P. Then the empty graph does not have property P' and K_{n−1} does. So, P' is nontrivial. Clearly, P' is monotone since P is. Now, in a decision tree computing P, assigning 0 to all edges of K_{1,n−1} yields a decision tree computing P', so D(P) ≥ D(P').

Case 2. K_{1,n−1} does not have property P. In this case, let P' be the property that a graph G on the vertices 2, ..., n has property P' if and only if K_{1,n−1} ∪ G has
property P. Then P' is a nontrivial monotone property of graphs of order n − 1 and D(P) ≥ D(P').

Case 3. K_{1,n−1} has property P and {1} ∪ K_{n−1} does not have property P. Let A = {1, ..., 2^{m−1}}, B = {n − 2^{m−1} + 1, ..., n}, and C = {2^{m−1} + 1, ..., n − 2^{m−1}}. Let K_{B∪C} denote the complete graph on the vertex set B ∪ C. Then A ∪ K_{B∪C} is a subgraph of {1} ∪ K_{n−1}. Since {1} ∪ K_{n−1} does not have property P and P is monotone, A ∪ K_{B∪C} does not have property P. Let K_{A,B} be the complete bipartite graph between A and B. Then K_{A,B} ∪ K_{B∪C} contains a subgraph isomorphic to K_{1,n−1} (a star centered at a vertex of B). Since K_{1,n−1} has property P, so has K_{A,B} ∪ K_{B∪C}. Now, let P' be the property such that a bipartite graph G between A and B has property P' if and only if G ∪ K_{B∪C} has property P. Then P' is a nontrivial monotone property of bipartite graphs between A and B with |A| = |B| = 2^{m−1}, and D(P) ≥ D(P'). By Lemma 12.4.1, D(P') ≥ 2^{2m−2}. □

Lemma 12.4.3 If P is a nontrivial monotone property of graphs of order n = 2^m, then D(P) ≥ n²/4.

Proof. Let H_i be the disjoint union of 2^{m−i} copies of the complete graph of order 2^i. Then H_0 ⊂ H_1 ⊂ ··· ⊂ H_m = K_n. Since P is nontrivial, H_0 does not have property P and H_m does. Thus, there exists an index j such that H_j does not have property P and H_{j+1} does. Partition H_j into two parts with vertex sets A and B, respectively, each containing exactly 2^{m−j−1} disjoint copies of the complete graph of order 2^j, in such a way that each component of H_{j+1} meets both parts. Let K_{A,B} be the complete bipartite graph between A and B. Then H_{j+1} is a subgraph of H_j ∪ K_{A,B}. So, H_j ∪ K_{A,B} has property P. Now, let P' be a property of bipartite graphs between A and B such that a bipartite graph G between A and B has property P' if and only if H_j ∪ G has property P. Then P' is a nontrivial monotone property of bipartite graphs between A and B with |A| · |B| = 2^{2m−2}. By Lemma 12.4.1, D(P) ≥ D(P') = 2^{2m−2} = n²/4.
Theorem 12.4.4 If P is a nontrivial monotone property of graphs of order n, then D(P) ≥ n^2/16.

Proof. It follows immediately from Lemmas 12.4.2 and 12.4.3. □

The lower bound in Theorem 12.4.4 has subsequently been improved to n^2/9 + o(n^2) by Kleitman and Kwiatkowski [9] and to n^2/4 + o(n^2) by Kahn, Saks, and Sturtevant [6]. Kahn, Saks, and Sturtevant discovered a new criterion using some concepts from algebraic topology. With such a criterion and some results on fixed point theory, they established the elusiveness of nontrivial monotone properties of graphs whose order is a prime power. In general, Karp conjectured that every nontrivial monotone graph property is elusive. This conjecture is still open. With the approach of Kahn et al. [6], Yao [20] proved that every nontrivial monotone bipartite graph property is elusive. With Yao's result, the recursive formula in
Lemma 12.4.2 has been improved to
D(P) ≥ min{D(P'), n^2/4}

(see [8] for details). There are also many results on probabilistic decision trees in the literature. The interested reader may refer to [4, 14, 18, 19].
References

[1] M. Ben-Or, Lower bounds for algebraic computation trees, Proceedings of 15th STOC (1983) 80-86.
[2] M.R. Best, P. van Emde Boas, and H.W. Lenstra, Jr., A sharpened version of the Aanderaa-Rosenberg conjecture, Report ZW 30/74, Mathematisch Centrum Amsterdam (1974).
[3] A. Björner, L. Lovász, and A. Yao, Linear decision trees: volume estimates and topological bounds, Proceedings of 24th STOC (1992) 170-177.
[4] B. Bollobás, Extremal Graph Theory, Academic Press (1978).
[5] D. Dobkin and R.J. Lipton, On the complexity of computations under varying sets of primitives, J. Computer Systems Sci. 18 (1979) 86-91.
[6] J. Kahn, M. Saks, and D. Sturtevant, A topological approach to evasiveness, Combinatorica 4 (1984) 297-306.
[7] E. Kalinova, The localization problem in geometry and Rabin-Spira linear proofs (Czech), M.Sc. thesis, Universitas Carolina, Prague, 1978.
[8] V. King, Ph.D. Thesis, Computer Science Division, University of California at Berkeley, 1989.
[9] D.J. Kleitman and D.J. Kwiatkowski, Further results on the Aanderaa-Rosenberg conjecture, J. Combinatorial Theory (Ser. B) 28 (1980) 85-95.
[10] J. Moravek, A localization problem in geometry and complexity of discrete programming, Kybernetika (Prague) 8 (1972) 498-516.
[11] J. Moravek and P. Pudlak, New lower bound for polyhedral membership problem with an application to linear programming, in Mathematical Foundations of Computer Science, eds. M.P. Chytil and V. Koubek (Springer-Verlag, 1984) 416-424.
[12] R. Rivest and S. Vuillemin, On recognizing graph properties from adjacency matrices, Theor. Comp. Sci. 3 (1976) 371-384.
[13] A.L. Rosenberg, On the time required to recognize properties of graphs: a problem, SIGACT News 5:4 (1973) 15-16.
[14] M. Snir, Lower bounds for probabilistic linear decision trees, Theoretical Computer Science 38 (1985) 69-82.
[15] P.M. Spira, Complete linear proofs of systems of linear inequalities, J. Computer Systems Sci. 6 (1972) 205-216.
[16] M. Steele and A. Yao, Lower bounds for algebraic decision trees, J. Algorithms 3 (1982) 1-8.
[17] A.C. Yao, On the complexity of comparison problems using linear functions, Proc. 16th IEEE Symposium on Switching and Automata Theory (1975) 85-99.
[18] A.C. Yao, Probabilistic computations: towards a unified measure of complexity, Proc. 18th FOCS (1977) 222-227.
[19] A.C. Yao, Lower bounds to randomized algorithms for graph properties, Proc. 28th FOCS (1987) 393-400.
[20] A.C. Yao, Monotone bipartite graph properties are evasive, manuscript, 1986.
[21] A.C. Yao and R.L. Rivest, On the polyhedral decision problem, SIAM J. Computing 9 (1980) 343-347.
[22] A. Yao, Algebraic decision trees and Euler characteristics, Proceedings of 33rd FOCS (1992) 268-277.
13 Complexity Issues
From previous chapters, one probably gets the impression that it is very hard to find the optimal solution of CGT when d, the number of defectives, is more than one. In this chapter, this impression will be formalized through studying the complexity
of CGT.
13.1 General Notions
In this section, we give a very brief introduction to some important concepts in computational complexity theory. The computation model for the study of computational complexity is the Turing machine, shown in Figure 13.1, which consists of three parts: a tape, a head, and a finite control. There are two important complexity measures, time and space. The time is the number of moves of the Turing machine. The space is the number of cells on the tape which have been visited by the head during the computation.
Figure 13.1: A Turing machine. A problem is called a decision problem if its answer is "Yes" or "No". In the study of computational complexity, an optimization problem is usually formulated as a decision problem. For example, the prototype problem of CGT is equivalent to the following decision problem.
The Decision Version of the Prototype Problem: Given n items and two integers d and k (0 < d, k < n), determine whether M(d, n) ≤ k or not.

When a Turing machine computes a decision problem, both the time and the space are functions of the input size. If they are bounded by a polynomial, we say that the machine runs in polynomial time and polynomial space, respectively. There exist two types of Turing machines, deterministic TMs and nondeterministic TMs. They differ with respect to the function of the finite control. The interested reader may refer to [6] for details. The following are three of the most important complexity classes.

P: A decision problem belongs to P if it can be computed by a polynomial-time deterministic TM.

NP: A decision problem belongs to NP if it can be computed by a polynomial-time nondeterministic TM.

PSPACE: A decision problem belongs to PSPACE if it can be computed by a polynomial-space deterministic (or nondeterministic) TM.

It is well known that P ⊆ NP ⊆ PSPACE. However, no one knows so far whether the above inclusions are proper or not. A decision problem A is polynomial-time many-one reducible to a decision problem B, denoted by A ≤_m^P B, if there is a polynomial-time computable function f such that x ∈ A if and only if f(x) ∈ B. A decision problem B is NP-complete (respectively, PSPACE-complete) if B belongs to NP (respectively, PSPACE) and every problem in NP (respectively, PSPACE) is polynomial-time many-one reducible to B. The complement of a decision problem is obtained by exchanging its "Yes" and "No" answers; a problem is co-NP-complete if its complement is NP-complete. A problem is said to be intractable if it cannot be solved in polynomial time. A well-known NP-complete problem is the following.
Vertex-Cover: Given a graph G = (V, E) and an integer k ≤ |V|, determine whether there is a set V' ⊆ V of size k such that each edge e ∈ E is incident to at least one v ∈ V'.

Some completeness results exhibited in the later sections will be based on a reduction from this problem.
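The two sides of this problem's complexity are easy to exhibit: verifying a proposed cover against every edge takes polynomial time (which places Vertex-Cover in NP), while the obvious decision procedure tries all size-k vertex subsets. A small illustrative sketch; the function names are ours, not from the text:

```python
from itertools import combinations

def is_vertex_cover(edges, cover):
    """Polynomial-time certificate check: every edge meets the cover."""
    return all(u in cover or v in cover for (u, v) in edges)

def has_vertex_cover(vertices, edges, k):
    """Brute-force decision version (exponential; for illustration only)."""
    return any(is_vertex_cover(edges, set(c))
               for c in combinations(vertices, k))

# The 4-cycle 1-2-3-4 has a cover of size 2, e.g. {1, 3}, but none of size 1.
cycle = [(1, 2), (2, 3), (3, 4), (4, 1)]
```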
13.2 The Prototype Problem is in PSPACE
Although one believes that the prototype problem is intractable, no formal proof has appeared at this moment. The computational complexity of the prototype problem is a long-standing open problem. Du and Ko [2] considered the general combinatorial search problem as follows.

CSP: Given a domain D and an integer k, determine whether there is a decision tree of height ≤ k of which each path uniquely determines an object in D.

Here, each internal node of the decision tree corresponds to a Yes-No query, and the left and the right sons of the internal node are the queries following the two answers, respectively (unless they are leaves). Leaves are samples consistent with the answers to the previous queries. They proved the following.

Theorem 13.2.1 CSP belongs to PSPACE.

Proof. For given D and k, one may guess nondeterministically a decision tree of height k in the depth-first ordering and verify that, for each of its paths, there is only one sample in D consistent with the queries and answers of this path. Note that at any step of the computation, this algorithm needs only O(k) space to store one path of the decision tree, although the complete tree contains about 2^k nodes. □

Clearly, the prototype problem is a special case of the CSP. It is worth mentioning that the input size of the prototype problem is considered to be n log n + log d + log k. In fact, the names of the n items need at least n log n bits. For the prototype problem, the domain D has (n choose d) samples. However, they can be easily enumerated using a small amount of space.

Because the instance of the prototype problem has too simple a structure, it is hard to reduce other complete problems to it. This is the main difficulty in proving the intractability of the prototype problem. It was proved in [4] that if the input to a problem is defined by two integers, then the problem cannot be PSPACE-complete unless P = PSPACE.
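The nondeterministic search in the proof of Theorem 13.2.1 can also be carried out deterministically, at an exponential cost in time, by a depth-first search whose recursion stack holds a single root-to-leaf path, mirroring the O(k)-space argument. The sketch below (representation and names are ours) instantiates the domain as S(d, n) with contaminated/pure outcomes, so it also decides the prototype question "M(d, n) ≤ k?" by brute force:

```python
from itertools import combinations

def tree_exists(samples, tests, k):
    """Is there a decision tree of height <= k whose paths isolate every
    remaining sample?  The recursion stack stores only one path."""
    if len(samples) <= 1:
        return True
    if k == 0:
        return False
    for t in tests:
        contaminated = [s for s in samples if s & t]
        pure = [s for s in samples if not (s & t)]
        if contaminated and pure \
                and tree_exists(contaminated, tests, k - 1) \
                and tree_exists(pure, tests, k - 1):
            return True
    return False

def m_at_most(n, d, k):
    """Decision version of the prototype problem, by exhaustive search."""
    items = range(1, n + 1)
    samples = [frozenset(c) for c in combinations(items, d)]
    tests = [frozenset(c) for r in range(1, n + 1)
             for c in combinations(items, r)]
    return tree_exists(samples, tests, k)
```

Only tiny instances are feasible this way; the point is the shape of the search, not its speed.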
Thus, to obtain any completeness results on the prototype problem, it is necessary to reformulate the problem by adding a more complex structure to the problem instance. A typical
approach to this is to treat the problem as a special case of a more general problem with instances in more general forms. For example, Even and Tarjan [3] extended the game Hex to general graphs and showed that the latter is PSPACE-complete, while the complexity of the original game Hex remains open. Du and Ko [2] added a test history to the input of the prototype problem. A history of the prototype problem is a finite sequence of tests together with their outcomes.

GPP (Generalized Prototype Problem): Given n items, a test history, and two integers d and k (0 < d, k < n), determine whether or not there exists a decision tree of height ≤ k of which each path uniquely determines a sample in S(d, n), where S(d, n) consists of all samples of n items having exactly d defectives.

Du and Ko [2] conjectured that the GPP is PSPACE-complete. This is still open at this moment. They also considered the following related problems.

Consistency: Given a history H, determine whether there exists a sample consistent with the history.

Determinacy: Given a finite sequence of queries, determine whether for any two samples there exists a query in the sequence which receives different answers from the two samples.

From the proof of Theorem 13.2.1, it is clear that the above two problems are closely related to CSP. We will exhibit some results on these two problems in the next three sections.
13.3 Consistency
Let GT_k denote CGT with (k+1)-ary test outcomes, i.e., the outcomes are 0, 1, ..., k−1, and k+. To give a more precise description of the consistency problem, let us introduce some notation. For a sample S, let R_S(T) denote the outcome of a test T. Thus, in the problem GT_k,

R_S(T) = i if |S ∩ T| = i < k,  and  R_S(T) = k if |S ∩ T| ≥ k.
Consistency-GT_k: Given two integers n and d and a history H = {(T_i, a_i) | i = 1, ..., m, with T_i ⊆ N and a_i ∈ {0, ..., k} for i = 1, ..., m}, determine whether the set C = {S ∈ S(d, n) | R_S(T_i) = a_i for i = 1, ..., m} is nonempty.

Du and Ko [2] proved the following.

Theorem 13.3.1 For all k ≥ 1, Consistency-GT_k is NP-complete.
Proof. It is trivial that Consistency-GT_k is in NP. Thus, it is sufficient to reduce an NP-complete problem to Consistency-GT_k. The reduction consists of two parts. In the first part, it is proved that a well-known NP-complete problem, Vertex-Cover, is polynomial-time reducible to Consistency-GT_1. In the second part, Consistency-GT_1 is polynomially reduced to Consistency-GT_k.

Let (G, k) be a given instance of Vertex-Cover, where G = (V, E) is a graph with vertex set V = {v_1, ..., v_p} and edge set E = {e_1, ..., e_q}, and k is an integer less than or equal to p. Define an instance (n, d, H = {(T_j, a_j) | j = 1, ..., m}) of Consistency-GT_1 as follows:

n = p;  m = q;  d = k;  for each j = 1, ..., m, let T_j = {i | v_i ∈ e_j} and a_j = 1.

For each V' ⊆ V, define a set S_{V'} ∈ S(d, n) by S_{V'} = {i | v_i ∈ V'}. Then this is a one-to-one correspondence between subsets of V of size k and sets in S(d, n). Furthermore, V' is a vertex cover of E if and only if S_{V'} ∩ T_j ≠ ∅ for all j = 1, ..., m. This shows that the mapping from (G, k) to (n, d, H) is a reduction from Vertex-Cover to Consistency-GT_1.

For the second part, let (n, d, H = {(T_j, a_j) | j = 1, ..., m}) be an instance of Consistency-GT_1. Define an instance (n', d', H' = {(T'_j, a'_j) | j = 1, ..., m'}) of Consistency-GT_k (k > 1) as follows:

n' = n + k − 1;  m' = m + k − 1;  d' = d + k − 1;
for each j = 1, ..., m, if a_j = 0 then let T'_j = T_j and a'_j = 0, and if a_j = 1 then let T'_j = T_j ∪ {n + 1, ..., n + k − 1} and a'_j = k;
for each j = m + 1, ..., m + k − 1, let T'_j = {n + j − m} and a'_j = 1.

Assume that (n, d, H) is consistent for Model GT_1 and S ∈ S(d, n) satisfies the condition that for all j = 1, ..., m, S ∩ T_j ≠ ∅ if and only if a_j = 1. Define S' = S ∪ {n + 1, ..., n + k − 1}. Then S' ∈ S(d', n'). Moreover, for all j = 1, ..., m,

if a_j = 0, then |S' ∩ T'_j| = |S ∩ T_j| = 0 = a'_j, and
if a_j = 1, then |S' ∩ T'_j| = |S ∩ T_j| + (k − 1) ≥ k, so R_{S'}(T'_j) = k = a'_j,

and for all j = m + 1, ..., m + k − 1,

|S' ∩ T'_j| = 1 = a'_j.

Thus, (n', d', H') is consistent for Model GT_k.
Conversely, if (n', d', H') is consistent for Model GT_k, then there is a set S' ⊆ {1, ..., n + k − 1} such that R_{S'}(T'_j) = a'_j for j = 1, ..., m + k − 1. Let S = S' ∩ {1, ..., n}. It is claimed that S ∩ T_j = ∅ if and only if a_j = 0 for all j = 1, ..., m. First, if a_j = 0, then T'_j = T_j and a'_j = 0. So, S ∩ T_j ⊆ S' ∩ T'_j = ∅. Next, if a_j = 1, then R_{S'}(T'_j) = a'_j = k implies |S' ∩ T'_j| ≥ k. Since

|S' ∩ {n + 1, ..., n + k − 1}| ≤ k − 1,

it follows that

|S ∩ T_j| = |S' ∩ T'_j ∩ {1, ..., n}| ≥ 1.

This completes the proof. □
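The first half of the reduction can be exercised end to end on small graphs: build the Consistency-GT_1 history from a Vertex-Cover instance and compare the answers by brute force. A sketch under our own naming:

```python
from itertools import combinations

def outcome(sample, test, k):
    """R_S(T) in model GT_k: the exact count below k, truncated at k."""
    return min(len(sample & test), k)

def consistent_gt(n, d, history, k):
    """Brute-force Consistency-GT_k over all samples in S(d, n)."""
    return any(all(outcome(set(s), t, k) == a for (t, a) in history)
               for s in combinations(range(1, n + 1), d))

def vc_to_gt1(edges):
    """Vertex-Cover -> Consistency-GT_1 (first part of the proof):
    one test per edge, each with required outcome 1."""
    return [({u, v}, 1) for (u, v) in edges]

# The path 1-2-3 is covered by the single vertex 2; the triangle is not
# coverable by any single vertex but is by any two of its vertices.
path = [(1, 2), (2, 3)]
triangle = [(1, 2), (2, 3), (1, 3)]
```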
13.4 Determinacy
For Model GT_k, the precise description of the determinacy problem is as follows.

Determinacy-GT_k: Given two integers n and d and a set Q = {T_i | i = 1, ..., m, with T_i ⊆ N for i = 1, ..., m}, determine whether, for any two sets S_1, S_2 in S(d, n), S_1 ≠ S_2 implies R_{S_1}(T_j) ≠ R_{S_2}(T_j) for some j = 1, ..., m.

The following result was proved by Yang and Du [7].

Theorem 13.4.1 For all k ≥ 1, Determinacy-GT_k is co-NP-complete.
Proof. It is easy to see that Determinacy-GT_k belongs to co-NP. One shows that Vertex-Cover is polynomial-time reducible to the complement of Determinacy-GT_k. Let G = (V, E) and an integer h (0 < h < |V| − 1) form an instance of Vertex-Cover. Assume V = {1, 2, ..., m}. Every edge e is represented by a subset of two elements of V. For convenience, assume that G has no isolated vertices. Define an instance (n, d, Q) of Determinacy-GT_k as follows:

n = m + k + 1,  d = k + h,  Q = {X_i | i = 1, 2, ..., m + k − 1} ∪ {T_e | e ∈ E},

where X_i = {i} for i = 1, 2, ..., m + k − 1 and T_e = e ∪ {m + 1, m + 2, ..., m + k} for e ∈ E.

First, assume that G has a vertex cover Y with |Y| = h. Define two sets

S_1 = Y ∪ {m + 1, m + 2, ..., m + k},
S_2 = Y ∪ {m + 1, m + 2, ..., m + k − 1, m + k + 1}.
Obviously, |S_1| = |S_2| = d, S_1 ≠ S_2, and

R_{S_1}(X_i) = R_{S_2}(X_i) for i = 1, 2, ..., m + k − 1,
R_{S_1}(T_e) = R_{S_2}(T_e) for e ∈ E.
Hence, (n, d, Q) is not determinant.

Conversely, assume that (n, d, Q) is not determinant. Then there exist S_1, S_2 ∈ S(d, n), S_1 ≠ S_2, such that for all T ∈ Q, R_{S_1}(T) = R_{S_2}(T). From R_{S_1}(X_i) = R_{S_2}(X_i), it follows that i ∉ S_1 \ S_2 and i ∉ S_2 \ S_1 for all i = 1, 2, ..., m + k − 1. Hence, without loss of generality, S_1 \ S_2 = {m + k} and S_2 \ S_1 = {m + k + 1}. This implies that for any e ∈ E,

|S_1 ∩ T_e| = |S_2 ∩ T_e| + 1.

Furthermore, since R_{S_1}(T_e) = R_{S_2}(T_e),

|S_2 ∩ T_e| ≥ k for all e ∈ E;

since m + k ∉ S_2, this gives S_2 ∩ e ≠ ∅.

Next, it is shown that {m + 1, m + 2, ..., m + k − 1} ⊆ S_2. Assume that {m + 1, m + 2, ..., m + k − 1} ⊄ S_2. Then |S_2 ∩ {m + 1, m + 2, ..., m + k − 1}| ≤ k − 2 and e ⊆ S_2 for all e ∈ E. Because G has no isolated vertices, V ⊆ S_2. Thus,

m + k − 2 ≤ |S_2| − 1 = d − 1 = k + h − 1,

which implies that m − 1 ≤ h. Since h < |V| − 1 = m − 1, there is a contradiction. Hence, {m + 1, m + 2, ..., m + k − 1} ⊆ S_2. Define Y = S_2 \ {m + 1, m + 2, ..., m + k − 1, m + k + 1}; then |Y| = d − k = h. Since S_2 ∩ e ≠ ∅ and e ⊆ V, Y ∩ e ≠ ∅ for all e ∈ E. So Y is a vertex cover of G with |Y| ≤ h.

If G has i isolated vertices, then let G' be the graph obtained from G by deleting these i isolated vertices and let h' = h − i. Applying the above argument to G' and h', Vertex-Cover is polynomial-time reducible to Determinacy-GT_k. □
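The co-NP membership used at the start of the proof is concrete: a witness of non-determinacy is simply a pair of distinct samples agreeing on every test, which is checkable in polynomial time. A brute-force sketch with our own names:

```python
from itertools import combinations

def outcome(sample, test, k):
    """R_S(T) in model GT_k."""
    return min(len(sample & test), k)

def is_determinant(n, d, tests, k):
    """Brute-force Determinacy-GT_k: every pair of distinct samples in
    S(d, n) must disagree on some test."""
    samples = [set(c) for c in combinations(range(1, n + 1), d)]
    for s1, s2 in combinations(samples, 2):
        if all(outcome(s1, t, k) == outcome(s2, t, k) for t in tests):
            return False   # (s1, s2) is a succinct witness of non-determinacy
    return True

singletons = [{i} for i in range(1, 4)]
```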
13.5 On Sample Space S(n)
In Chapter 7, we considered a model in which the number of defectives is unknown. In this case, the sample space is S(n), consisting of all subsets of {1, 2, ..., n}, the set of n items. For this sample space, there are some interesting positive results on the corresponding consistency and determinacy problems. Let GT'_k denote the problem with the sample space S(n) and (k+1)-ary test outcomes. The precise descriptions
for the two problems are as follows.

Consistency-GT'_k: Given an integer n and a history H = {(T_i, a_i) | i = 1, ..., m, with T_i ⊆ N and a_i ∈ {0, ..., k} for i = 1, ..., m}, determine whether the set C = {S ∈ S(n) | R_S(T_i) = a_i for i = 1, ..., m} is nonempty.

Determinacy-GT'_k: Given an integer n and a set Q = {T_i | i = 1, ..., m, with T_i ⊆ N for i = 1, ..., m}, determine whether, for any two sets S_1, S_2 in S(n), S_1 ≠ S_2 implies R_{S_1}(T_j) ≠ R_{S_2}(T_j) for some j = 1, ..., m.

In the following, some positive results obtained in [2] and [7] are reported.

Theorem 13.5.1 Consistency-GT'_1 is polynomial-time solvable.

Proof. Let an instance (n, H = {(T_j, a_j) | j = 1, ..., m}) of Consistency-GT'_1 be given, where for each j = 1, ..., m, T_j ∈ S(n) and a_j ∈ {0, 1}. Define

I = {j | 1 ≤ j ≤ m, a_j = 0} and J = {j | 1 ≤ j ≤ m, a_j = 1}.

Also let X = ∪_{j∈I} T_j and Y = {1, ..., n} \ X. Then it is easy to check that H is consistent if and only if for each j ∈ J, T_j ∩ Y ≠ ∅. This characterization provides a polynomial-time algorithm for Consistency-GT'_1. □

Theorem 13.5.2 For k = 1, 2, Determinacy-GT'_k is polynomial-time solvable.

The following is a characterization which provides a polynomial-time algorithm for Determinacy-GT'_1.

Lemma 13.5.3 Let (n, Q) be an instance of Determinacy-GT'_1. Then (n, Q) is determinant if and only if for every i = 1, ..., n, the singleton set {i} is in Q.

Proof. The backward direction is obvious, because the set {i} distinguishes between two sets S_1 and S_2 whenever i ∈ S_1 \ S_2. For the forward direction, consider the two sets S_1 = {1, ..., n} and S_2 = S_1 \ {i}. Then the only set T that can distinguish between S_1 and S_2 is T = {i}, so that R_{S_1}(T) = 1 and R_{S_2}(T) = 0. □

The next three lemmas give characterizations of Determinacy-GT'_2.

Lemma 13.5.4 Let (n, Q) be an instance of Determinacy-GT'_k. If (n, Q) is not determinant, then there exist S_1 and S_2 in S(n) such that
(1) S_1 ≠ S_2,
(2) S_1 ∪ S_2 = {1, ..., n}, and
(3) for all T ∈ Q, R_{S_1}(T) = R_{S_2}(T).
Proof. Since (n, Q) is not determinant, there exist S'_1 and S'_2 in S(n) such that for all T ∈ Q, R_{S'_1}(T) = R_{S'_2}(T). Let S_i = S'_i ∪ (N \ (S'_1 ∪ S'_2)) for i = 1, 2. Then

R_{S_i}(T) = min{k, R_{S'_i}(T) + |T ∩ (N \ (S'_1 ∪ S'_2))|}.

Thus, R_{S'_1}(T) = R_{S'_2}(T) implies R_{S_1}(T) = R_{S_2}(T). □
IFinri = |5' 1 nr|-|rns 1 n5 2 | = TCSl(r)-|rn5!n52| = ftS2(r)-|rnSins2| = |5 2 nr|-|Tn5 2 n5 2 | =
\Y2nT\.
If |T| > 3, then |Si (~)T\ > 2 or |S 2 l~lT| > 2. Assume, without loss of generality, that \Si 0 T\ > 2. Then ^ ( T ) = ^ , ( 2 " ) = 2. Thus, \S2 C\ T\ > 2. Therefore, | y ; n T | < | T | - 2 f o r i = l,2. Conversely, assume that there exist Y\, Y2 in S(n) such that VinF 2 = 0, ViLlY^ ^ 0, and (1) and (2) hold. Define S; = N \ Y{ for i = 1,2. Then Si ^ S 2 and nSl(T) = Ks2(T) for all TeQ. a Let (n, Q) be an instance of Determinacy-GX^. Define a graph G(Q) = (N, E) by setting E = {T £ Q \ \T\ = 2}. A graph G = (V, £ ) is bicolorable if its vertices can be marked by two colors such that no edge has both its endpoints colored the same. A set of vertices with the same color is called a monochromatic set. For a connected bicolorable graph, there is only one way to divide its vertices into two monochromatic sets. Lemma 13.5.6 Let (n,Q) be an instance of Determinacy-GT2. Then (n,Q) is not determinant if and only if G(Q) has a connected component that is bicolorable and its monochromatic vertex subsets Y\ and Y2 satisfy the following conditions. (1) IfT € Q with \T\ = 1, then T C\ (Y,. UY2) = 0. (2) IfT&Q with \T\ > 3, then \T n Y{\ < \T\ - 2 for i = 1,2.
Proof. Assume that G(Q) has a connected component that is bicolorable and that its monochromatic vertex subsets Y_1 and Y_2 satisfy the conditions (1) and (2). Then, by Lemma 13.5.5, (n, Q) is not determinant.

Conversely, assume that (n, Q) is not determinant. Then there exist Y_1 and Y_2 satisfying the conditions in Lemma 13.5.5. For any T ∈ E, since |Y_i ∩ T| = 0 or 1 for i = 1, 2, either T ∩ (Y_1 ∪ Y_2) = ∅ or T ⊆ Y_1 ∪ Y_2. Hence, the subgraph G(Q)|_{Y_1∪Y_2} induced by Y_1 ∪ Y_2 is a union of some connected components of G(Q). Moreover, for each edge T of G(Q)|_{Y_1∪Y_2}, one must have |T ∩ Y_1| = |T ∩ Y_2| = 1. Thus, G(Q)|_{Y_1∪Y_2} is bicolorable. Consider a connected component of G(Q)|_{Y_1∪Y_2}. Its two monochromatic vertex subsets must be subsets of Y_1 and Y_2, respectively, and hence satisfy the conditions (1) and (2). □

Proof of Theorem 13.5.2. By Lemma 13.5.3, it is easy to see that Determinacy-GT'_1 is polynomial-time solvable. A graph is bicolorable if and only if it contains no odd cycle; the latter holds if and only if there exists no odd cycle in a basis of cycles. Hence, the bicolorability of a graph can be determined in polynomial time. If a connected graph is bicolorable, then its vertex set can be uniquely partitioned into two disjoint monochromatic subsets. By Lemma 13.5.6, Determinacy-GT'_2 is polynomial-time solvable. □

The following negative result and its proof can also be found in [2] and [7].

Theorem 13.5.7 For k ≥ 2, Consistency-GT'_k is NP-complete, and for k ≥ 3, Determinacy-GT'_k is co-NP-complete.

A polynomial-time solvable special case of Determinacy-GT'_3 was discovered by Yang and Du [7].

Theorem 13.5.8 Let (n, Q) be an instance of Determinacy-GT'_3. If Q does not contain a test of three items, then it can be determined in polynomial time whether (n, Q) is determinant or not.

To prove this result, the following lemma is needed.

Lemma 13.5.9 Let (n, Q) be an instance of Determinacy-GT'_3. Then (n, Q) is not determinant if and only if there exist Y_1, Y_2 in S(n) with Y_1 ∩ Y_2 = ∅ and Y_1 ∪ Y_2 ≠ ∅ such that for any T ∈ Q, the following conditions hold:
(1) If |T| ≤ 3, then |Y_1 ∩ T| = |Y_2 ∩ T|.
(2) If |T| = 4, then |Y_1 ∩ T| = |Y_2 ∩ T| or |T ∩ (Y_1 ∪ Y_2)| = 1.
(3) If |T| ≥ 5, then |Y_i ∩ T| ≤ |T| − 3 for i = 1, 2.

Proof. Assume that (n, Q) is not determinant. By Lemma 13.5.4, there exist S_1 and S_2 in S(n) such that S_1 ≠ S_2, S_1 ∪ S_2 = N, and R_{S_1}(T) = R_{S_2}(T) for T ∈ Q. Define
Y_1 = S_1 \ S_2 = N \ S_2 and Y_2 = S_2 \ S_1 = N \ S_1. If |T| ≤ 3, then

|Y_1 ∩ T| = |S_1 ∩ T| − |T ∩ S_1 ∩ S_2|
          = R_{S_1}(T) − |T ∩ S_1 ∩ S_2|
          = R_{S_2}(T) − |T ∩ S_1 ∩ S_2|
          = |S_2 ∩ T| − |T ∩ S_1 ∩ S_2|
          = |Y_2 ∩ T|.

If |T| = 4 and |Y_1 ∩ T| ≠ |Y_2 ∩ T|, then |S_1 ∩ T| ≠ |S_2 ∩ T|. Thus, R_{S_1}(T) = R_{S_2}(T) implies |S_i ∩ T| ≥ 3 for i = 1, 2. Hence, |Y_i ∩ T| ≤ 1. Furthermore, |Y_1 ∩ T| = 0 (or 1) if and only if |Y_2 ∩ T| = 1 (or 0). Therefore, |T ∩ (Y_1 ∪ Y_2)| = 1.

If |T| ≥ 5, then the proof is similar to the case |T| ≥ 3 in the proof of Lemma 13.5.5.

Conversely, assume that there exist Y_1 and Y_2 satisfying the conditions in the lemma. Define S_1 = N \ Y_2 and S_2 = N \ Y_1. Then it is easy to verify that S_1 ≠ S_2 and R_{S_1}(T) = R_{S_2}(T) for T ∈ Q. □

Let (n, Q) be an instance of Determinacy-GT'_3 such that Q contains no set of size 3. Define the graph G(Q) as above. Assume that the bicolorable connected components of G(Q) are G_1, G_2, ..., G_m and that the monochromatic vertex subsets of G_i are X_i and Z_i. The following is an algorithm for testing the determinacy property of (n, Q).

Algorithm: Initially, let
R := N \ ∪_{i=1}^{m} (X_i ∪ Z_i).

For i = 1, 2, ..., m, carry out the following steps in the i-th iteration. If the algorithm does not stop within the first m iterations, then (n, Q) is determinant.
Step 1. Let Y := X_i and Y' := Z_i.

Step 2. If Y and Y' satisfy the conditions that
(1) for T ∈ Q with |T| = 1, T ∩ (Y ∪ Y') = ∅,
(2) for T ∈ Q with |T| = 4, |Y ∩ T| = |Y' ∩ T| or |(Y ∪ Y') ∩ T| = 1, and
(3) for T ∈ Q with |T| ≥ 5, |Y ∩ T| ≤ |T| − 3 and |Y' ∩ T| ≤ |T| − 3,
then the algorithm stops and concludes that (n, Q) is not determinant; else, it goes to the next step.

Step 3. If Y and Y' do not satisfy (1) or (3), then let R := R ∪ X_i ∪ Z_i and go to the (i+1)-th iteration. If Y and Y' satisfy (1) and (3) but do not satisfy (2), then there exists T ∈ Q with |T| = 4 such that either

|T ∩ Y| ≥ 2 and |T ∩ Y'| ≤ 1,    (13.4)

or

|T ∩ Y| ≤ 1 and |T ∩ Y'| ≥ 2.    (13.5)

If T ⊆ Y ∪ Y', then let R := R ∪ X_i ∪ Z_i and go to the (i+1)-th iteration; else, choose x ∈ T \ (Y ∪ Y'). If x ∈ R, then let R := R ∪ X_i ∪ Z_i and go to the (i+1)-th iteration; else, x must be a vertex of G_j for some j = 1, 2, ..., m. If (13.4) holds, then let

Y := the union of Y and the monochromatic vertex subset of G_j that does not contain x;
Y' := the union of Y' and the monochromatic vertex subset of G_j that contains x.

If (13.5) holds, then let

Y := the union of Y and the monochromatic vertex subset of G_j that contains x;
Y' := the union of Y' and the monochromatic vertex subset of G_j that does not contain x.

Go to Step 2.
243
13.6 Learning by Examples
Lemma 13.5.11 The algorithm cannot go to the (i* + l)th iteration from the i'th iteration. Proof. For contradiction, assume that the algorithm goes to the (i* + l)th iteration from the «*th iteration, then one of the following occurs. (a) Y and Y' do not satisfy the conditions (1) and (3) in Step 3. (b) Y and Y' do not satisfy the condition (2) with T and T C Y U Y'. (c) (T \ {Y U Y')) n R ± 0 holds. If (a) occurs, then by Lemma 13.5.10, Y\ and Yj do not satisfy the condition (1) or (3) in Lemma 13.5.9. If (b) occurs, then Y and Y' do not satisfy the condition (2) in Lemma 13.5.9. Therefore, (a) and (b) cannot occur. Next, suppose (c) occurs. Note that \T\ = 4. Since Yj and Yz satisfy the conditions in Lemma 13.5.9, |Tfl Yj| = \T n Y - 2| = 2. Thus, T C ^ U Y2. (c) implies R n (Y~i U Y~2) / 0. However, during the computation of the i'th iteration, R =
N\UZsi.(Xk\JZll);
and by the assumption on G,.,
Viuy,cuE=i.(Jffcuzk), contradicting .R n (Ya U Y"2) ^ 0. D Proof of Theorem 13.5.8. If (rc,<5) is not determinant, then by Lemma 13.5.11 the algorithm must stop before or at the i*th iteration. Note that the loop at each iteration must be finite since each time the computation goes from Step 3 to Step 2, the number of vertices in F U Y' increases. Therefore, the algorithm must stop at the place where it concludes that (n, Q) is not determinant. If (n,Q) is determinant, then it suffices to prove that the computation must pass the m iterations. For contradiction, suppose that the computation stops in the ith iteration. Then there exist Y and Y' which satisfy the conditions (1), (2), and (3) in Step 3. It follows that Y and Y' satisfy the conditions in Lemma 13.5.9. Hence, (n, Q) is not determinant, a contradiction. •
13.6 Learning by Examples
Computational learning theory is an important area in computer science [1]. "Learning by examples" has been a rapidly developing direction in this area during the past 10 years. The problem that has been studied in this direction can be described as follows: Suppose there is an unknown boolean function which belongs to a certain family. When values are assigned to the variables, the function value is returned. The problem is how to identify the unknown function by using a small number of assignments.
The prototype problem can also be seen as a special case of this problem. To see this, let F = {x_{i_1} + ··· + x_{i_d} | 1 ≤ i_1 < ··· < i_d ≤ n}. Suppose that the unknown function f is chosen from F. For each assignment a, let T(a) be the set of indices i such that the variable x_i is assigned 1 under the assignment a. Then the following correspondence holds:

f(x) = x_{i_1} + ··· + x_{i_d}    ↔    i_1, ..., i_d are defectives.
f(a) = 1                          ↔    T(a) is contaminated.
f(a) = 0                          ↔    T(a) is pure.
Thus, learning the unknown function from F by examples is exactly the (d, n) prototype problem. However, in learning theory, if the unknown function can be learned with polynomially many assignments with respect to n, then the problem is considered to be easy. The (d, n) prototype problem can be solved with at most d(log_2 n + 1) tests. Thus, from the viewpoint of learning, it is an easy problem. However, the prototype problem is indeed a hard problem if finding the optimal strategy is the goal.
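For d = 1 the correspondence is easy to make concrete: treating the unknown OR-function as a membership oracle, repeated halving of the candidate set identifies the defective with about log_2 n queries — polynomially many, hence "easy" from the learning viewpoint. A minimal sketch (names ours):

```python
def find_defective(n, contaminated):
    """Identify the unique defective among items 1..n by halving;
    `contaminated(group)` plays the role of an assignment a with
    f(a) = 1 iff T(a) contains the defective."""
    lo, hi, queries = 1, n, 0
    while lo < hi:
        mid = (lo + hi) // 2
        queries += 1
        if contaminated(set(range(lo, mid + 1))):  # f(a) = 1: keep lower half
            hi = mid
        else:                                      # f(a) = 0: keep upper half
            lo = mid + 1
    return lo, queries

# With n = 8, the defective is found in exactly 3 = log2(8) tests.
item, used = find_defective(8, lambda group: 5 in group)
```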
References

[1] D. Angluin, Computational learning theory: survey and selected bibliography, Proceedings of 24th STOC (1992) 351-369.
[2] D.-Z. Du and K.-I Ko, Some completeness results on decision trees and group testing, SIAM J. Algebraic and Discrete Methods 8 (1987) 762-777.
[3] S. Even and R.E. Tarjan, A combinatorial problem which is complete in polynomial space, J. Assoc. Comput. Mach. 23 (1976) 710-719.
[4] S. Fortune, A note on sparse complete sets, SIAM J. Comput. 8 (1979) 431-433.
[5] M.R. Garey and D.S. Johnson, Computers and Intractability (W.H. Freeman, San Francisco, 1979).
[6] J.E. Hopcroft and J.D. Ullman, Introduction to Automata Theory, Languages, and Computation (Addison-Wesley, Reading, Mass., 1979).
[7] F. Yang and D.-Z. Du, The complexity of determinacy problem on group testing, Discrete Applied Mathematics 28 (1990) 71-81.
Index

μ-operation, 194
λ-distinct, 39
λ-sharp, 39
Aanderaa, 227
active user, 91
adjacent, 203
admissible algorithm, 7
Aigner, M., 113, 114, 205, 216
all-positive path, 39
Alleman, J., 35
Althöfer, I., 204, 205, 208
Angluin, D., 98
Aslanidis, C., 35
Aslam, J.A., 161, 163, 164, 171, 200
Avriel, M., 183
b-bin algorithm, 32
Bar-Noy, A., 132
Barillot, E., 35
Bassalygo, 66
Beamer, J.H., 183
Ben-Or, M., 222
Bentley, J.L., 195, 196, 200
Berger, T., 99
Berlekamp, E.R., 150
BIBD, 73
bicolorable, 239
bin, 32
binary representation matrix, 79
binary splitting, 20
binary tree, 5
bipartite graph, 203
Björner, A., 222
Blumenthal, S., 107
breadth-first search, 126
brute force, 170
Brylawski, T.H., 172
Bush, K.A., 73
Busschbach, P., 71, 97
Cai, M.C., 80
candy factory model, 118
Cantor, D.G., 110, 111
Capetanakis, J.I., 92
Chang, X.M., 25, 50, 53, 122
Chang, G.J., 39, 48, 50, 103, 203
Chen, C., 35
Chen, C.C., 44
Chen, R.W., 102
Cheng, S.-W., 132, 134, 143
Christen, C., 114, 118
co-NP-complete, 232
Cohen, D., 35
Colbourn, C., 86
combinatorial group testing, 1
competitive algorithm, 127
competitive ratio, 127
complement, 232
complete, 232
completely separating system, 80
conflict model, 99
contaminated, 10
convertible, 28
Czyzowicz, J., 154, 174, 175
d-disjunct, 62
d-separable, 62
De Bonis, A., 174
De Jong, P.J., 35
decision problem, 232
decision tree, 223
defective set, 4
depth, 5
descendants, 5
detecting matrix, 110
deterministic TM, 232
Dhagat, A., 161, 163, 164, 171, 200
Dickson, T.J., 80
digging, 140
distance, 74
Dobkin, D., 219, 222
Dorfman, R., 1, 19
Du, D.-Z., 30, 57, 127, 128, 132, 134, 138, 143, 233-236, 240
Dyachkov, A.G., 66-68, 70
edge set, 203
elusive, 224
Enis, P., 110
EPBD, 73
Erdős, P., 67, 81, 110
Evans, G.A., 35
Even, S., 234
failure model, 100
feasible ascent direction, 184
feasible direction method, 184
feasible region, 184
Federer, W.T., 73
Fine, N.J., 110
fixed items, 30
Frankl, P., 67, 81, 85
Frazier, M., 162
free items, 30
Fregger, 205
Füredi, Z., 67, 81, 85
Garey, M.R., 44, 216
Gargano, L., 114, 121, 174
generalized binary splitting, 21
GPP, 234
Graham, R.L., 216
graph isomorphism, 203
Green, E.D., 34
Greenberg, A.G., 94, 97
Groll, P.A., 4, 10, 23
group divisible, 74
group testing, 1
Guzicki, W., 154, 155
half-lie model, 175
halving method, 20
Hanani, H., 82
Hao, F.H., 114
Hayes, J.F., 91
Hoffman, A.J., 84, 86
Hong, J.-W., 183
Hu, M.C., 13, 56
Hua, Lou-Geng, 181
Hwang, F.K., 8, 13, 20, 25, 27, 30, 32, 38, 39, 42, 44, 48, 50, 53, 56, 57, 77, 102, 103, 107, 120, 122, 123, 128, 132, 143, 203
hypergeometric group testing, 12
individual testing algorithm, 19
induced sub-hypergraph, 208
induced subgraph, 203
information lower bound, 7
internal node, 5
internal nodes, 129
intractable, 232
isolate query, 97
Johnson, D.S., 44, 69
Kahn, J., 228
Kalinova, E., 232
Karp, R.M., 59, 183
Katona, G.O.H., 4, 79
Kautz, W.H., 63, 86
Kelley, D., 138
Kessler, I., 132
Kiefer, J., 183
Kleitman, D.J., 150, 228
Komlós, J., 97
Körner, J., 121
Koubek, V., 113
Kraft's inequality, 197
Kumar, S., 107
Kutten, S., 132
Kwiatkowski, D.J., 228
Lacroix, B., 35
Lakshmanan, K.B., 172, 174, 175
lattice designs, 35
leaf, 5
level, 5
Lewis, K.A., 35
Li, C.H., 5, 19
Li, W., 183
lie-set, 150
Likhanov, N.B., 102
Lin, S., 8, 48
Lindström, B., 83, 111
line algorithm, 23
linearly bounded lies, 160
Lipton, R.J., 219, 222
loaded group, 102
Lovász, L., 222
Luo, S., 183
Mallows, C.L., 8
Massey, J.L., 93
maximal-distance separable, 75
maximum model, 115
Mehravari, N., 99
membership problem, 218
merging, 27
Meyer, A.R., 150
Mikhailov, V.A., 92, 102
Mill, W.H., 111
minimax algorithm, 4, 6
minor diagonal condition, 86
Miranker, W.L., 183
monochromatic, 239
monotone boolean function, 225
Montuori, 114
Moon, J.W., 10, 11
Morávek, J., 222
multistage algorithm, 17
Mundici, D., 154
negative outcome, 12
Negro, A., 155
nested algorithm, 10
nested class, 23
Newman, D., 56
Nguyen, Q.A., 68
nonadaptive algorithm, 17
nonconflict model, 99
nondeterministic TM, 232
Olson, V., 34
parent, 5
Park, H., 138
Parnes, M., 80, 84, 86
PBIBD, 74
Pelc, A., 150, 154, 155, 160, 163, 171, 174, 175, 200, 216
Pesotan, H., 73, 74
Pfeifer, C.G., 110
polyhedral membership problem, 219
polyhedral set, 219
polyhedron, 219
polynomial-time many-one reducible, 232
positive outcome, 12
power-set model, 110
Pratt, V.R., 212
prototype, 12
Pudlák, P., 222
pure, 10
Qi, J., 183
quantitative channel, 105
quantitative model, 110
Raghavarao, D., 73, 84
Rajlich, J., 113
Raktoc, B.L., 74
rank of hypergraph, 207
Rashad, A.M., 68
Ravikumar, B., 172
realizable partition, 8
realizable subset, 8
reasonable algorithm, 6
Reed, R., 86
regular value, 189
reliability, 171
Rényi, A., 79, 110
residue model, 110
Rivest, R., 150, 157, 175, 184, 188, 213, 221, 227
root, 5
Rosenberg, A.L., 227
Rosenblatt, D., 1
Ruszinkó, M., 67
Rykov, V.V., 65-67, 69
Saha, G.M., 64, 74
Saks, M., 228
sample, 4
sample space, 4
satisfied term, 99
Schalkwijk, J.P., 172
Schughart, M., 113
Schultz, D.J., 63, 80
separable, 8
Sereno, M., 155
Setaro, G., 114
Shapiro, H.S., 110
Singleton, R.R., 63, 76, 84, 86
Sinha, B.K., 64
Skilling, J.K., 44, 46
So, H.C., 44
Sobel, M., 4, 10, 11, 23, 107
Söderberg, S., 110
Sós, V.T., 77
space, 231
Spencer, J., 80, 150, 161, 203
Sperner, E., 80
Spira, P.M., 212
Srinivasan, R., 80
state, 150
Steele, M., 222
Steiner triple system, 82
Sterrett, A., 4
strongly competitive, 138
Sturtevant, D., 228
success model, 100
successful transmission, 91
Sun, S.-Z., 132, 134, 143
Tao, X., 181
Tarjan, R., 234
TDM, 91
test history, 5, 150
time, 231
Towsley, D., 99
transitive group, 226
tree function, 225
Triesch, E., 204, 205, 208, 216
truth-assignment, 223
truth-set, 150
Tsybakov, B.S., 92, 102, 105
Ulam, S.M., 145
unloaded group, 102
Upfal, E., 59
Vaccaro, U., 114, 121, 174
Vakil, F., 84, 86
Valiant, L., 98
validly mergeable, 9
vertex set, 203
vertex-covering, 218
Vuillemin, S., 227
Feller, W., 4
Wang, J.K., 13, 56
weakly symmetric, 226
wedge, 220
Wegener, I., 79
Weideman, C.A., 85
weight, 66
Weng, J.F., 25, 42, 50, 53, 122, 183
Wigderson, A., 59
Wilder, D.J., 183
Winkler, P., 161
Winklman, K., 150
Winograd, S., 94
Wolf, J., 99
Wu, F., 183
Xu, Y.H., 115
Xue, G.-L., 132, 134, 143
Yang, F., 236, 240
Yao, A.C., 195, 196, 200, 221, 222, 228
Yuan, Y., 183
Zeisel, T., 69
Zinoviev, V., 76