Theory and Applications of Computability
In cooperation with the association Computability in Europe
Series Editors

Prof. P. Bonizzoni
Università degli Studi di Milano-Bicocca
Dipartimento di Informatica Sistemistica e Comunicazione (DISCo)
20126 Milan, Italy
[email protected]

Prof. V. Brattka
University of Cape Town
Department of Mathematics and Applied Mathematics
Rondebosch 7701, South Africa
[email protected]

Prof. S.B. Cooper
University of Leeds
Department of Pure Mathematics
Leeds LS2 9JT, UK
[email protected]

Prof. E. Mayordomo
Universidad de Zaragoza
Departamento de Informática e Ingeniería de Sistemas
E-50018 Zaragoza, Spain
[email protected]
For further volumes, go to www.springer.com/series/8819
Books published in this series will be of interest to the research community and graduate students, with a unique focus on issues of computability. The perspective of the series is multidisciplinary, recapturing the spirit of Turing by linking theoretical and real-world concerns from computer science, mathematics, biology, physics, and the philosophy of science. The series includes research monographs, advanced and graduate texts, and books that offer an original and informative view of computability and computational paradigms.

Series Advisory Board

Samson Abramsky, University of Oxford
Eric Allender, Rutgers, The State University of New Jersey
Klaus Ambos-Spies, Universität Heidelberg
Giorgio Ausiello, Università di Roma, "La Sapienza"
Jeremy Avigad, Carnegie Mellon University
Samuel R. Buss, University of California, San Diego
Rodney G. Downey, Victoria University of Wellington
Sergei S. Goncharov, Novosibirsk State University
Peter Jeavons, University of Oxford
Nataša Jonoska, University of South Florida, Tampa
Ulrich Kohlenbach, Technische Universität Darmstadt
Ming Li, University of Waterloo
Wolfgang Maass, Technische Universität Graz
Prakash Panangaden, McGill University
Grzegorz Rozenberg, Leiden University and University of Colorado, Boulder
Alan Selman, University at Buffalo, The State University of New York
Wilfried Sieg, Carnegie Mellon University
Jan van Leeuwen, Universiteit Utrecht
Klaus Weihrauch, FernUniversität Hagen
Philip Welch, University of Bristol
Rodney G. Downey • Denis R. Hirschfeldt
Algorithmic Randomness and Complexity
Rodney G. Downey Victoria University of Wellington School of Mathematics, Statistics and Operations Research Wellington, New Zealand
[email protected]
Denis R. Hirschfeldt University of Chicago Department of Mathematics Chicago, IL 60637 USA
[email protected]
ISSN 2190-619X    e-ISSN 2190-6203
ISBN 978-0-387-95567-4    e-ISBN 978-0-387-68441-3
DOI 10.1007/978-0-387-68441-3
Springer New York Dordrecht Heidelberg London

Mathematics Subject Classification (2010): Primary: 03D32; Secondary: 03D25, 03D27, 03D30, 03D80, 68Q30
ACM Computing Classification (1998): F.1, F.2, F.4, G.3, H.1

© Springer Science+Business Media, LLC 2010

All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)
Rod dedicates this book to his wife Kristin, and Denis to Larisa.
Contents

Preface
Acknowledgments
Introduction

Part I: Background

1 Preliminaries
  1.1 Notation and conventions
  1.2 Basics from measure theory

2 Computability Theory
  2.1 Computable functions, coding, and the halting problem
  2.2 Computable enumerability and Rice's Theorem
  2.3 The Recursion Theorem
  2.4 Reductions
    2.4.1 Oracle machines and Turing reducibility
    2.4.2 The jump operator and jump classes
    2.4.3 Strong reducibilities
    2.4.4 Myhill's Theorem
  2.5 The arithmetic hierarchy
  2.6 The Limit Lemma and Post's Theorem
  2.7 The difference hierarchy
  2.8 Primitive recursive functions
  2.9 A note on reductions
  2.10 The finite extension method
  2.11 Post's Problem and the finite injury priority method
  2.12 Finite injury arguments of unbounded type
    2.12.1 The Sacks Splitting Theorem
    2.12.2 The Pseudo-Jump Theorem
  2.13 Coding and permitting
  2.14 The infinite injury priority method
    2.14.1 Priority trees and guessing
    2.14.2 The minimal pair method
    2.14.3 High computably enumerable degrees
    2.14.4 The Thickness Lemma
  2.15 The Density Theorem
  2.16 Jump theorems
  2.17 Hyperimmune-free degrees
  2.18 Minimal degrees
  2.19 Π^0_1 and Σ^0_1 classes
    2.19.1 Basics
    2.19.2 Π^0_n and Σ^0_n classes
    2.19.3 Basis theorems
    2.19.4 Generalizing the low basis theorem
  2.20 Strong reducibilities and Post's Program
  2.21 PA degrees
  2.22 Fixed-point free and diagonally noncomputable functions
  2.23 Array noncomputability and traceability
  2.24 Genericity and weak genericity

3 Kolmogorov Complexity of Finite Strings
  3.1 Plain Kolmogorov complexity
  3.2 Conditional complexity
  3.3 Symmetry of information
  3.4 Information-theoretic characterizations of computability
  3.5 Prefix-free machines and complexity
  3.6 The KC Theorem
  3.7 Basic properties of prefix-free complexity
  3.8 Prefix-free randomness of strings
  3.9 The Coding Theorem and discrete semimeasures
  3.10 Prefix-free symmetry of information
  3.11 Initial segment complexity of sets
  3.12 Computable bounds for prefix-free complexity
  3.13 Universal machines and halting probabilities
  3.14 The conditional complexity of σ* given σ
  3.15 Monotone and process complexity
  3.16 Continuous semimeasures and KM-complexity

4 Relating Complexities
  4.1 Levin's Theorem relating C and K
  4.2 Solovay's Theorems relating C and K
  4.3 Strong K-randomness and C-randomness
    4.3.1 Positive results
    4.3.2 Counterexamples
  4.4 Muchnik's Theorem on C and K
  4.5 Monotone complexity and KM-complexity

5 Effective Reals
  5.1 Computable and left-c.e. reals
  5.2 Real-valued functions
  5.3 Representing left-c.e. reals
    5.3.1 Degrees of representations
    5.3.2 Presentations of left-c.e. reals
    5.3.3 Presentations and ideals
    5.3.4 Promptly simple sets and presentations
  5.4 Left-d.c.e. reals
    5.4.1 Basics
    5.4.2 The field of left-d.c.e. reals

Part II: Notions of Randomness

6 Martin-Löf Randomness
  6.1 The computational paradigm
  6.2 The measure-theoretic paradigm
  6.3 The unpredictability paradigm
    6.3.1 Martingales and supermartingales
    6.3.2 Supermartingales and continuous semimeasures
    6.3.3 Martingales and optimality
    6.3.4 Martingale processes
  6.4 Relativizing randomness
  6.5 Ville's Theorem
  6.6 The Ample Excess Lemma
  6.7 Plain complexity and randomness
  6.8 n-randomness
  6.9 Van Lambalgen's Theorem
  6.10 Effective 0-1 laws
  6.11 Infinitely often maximally complex sets
  6.12 Randomness for other measures and degree-invariance
    6.12.1 Generalizing randomness to other measures
    6.12.2 Computable measures and representing reals

7 Other Notions of Algorithmic Randomness
  7.1 Computable randomness and Schnorr randomness
    7.1.1 Basics
    7.1.2 Limitations
    7.1.3 Computable measure machines
    7.1.4 Computable randomness and tests
    7.1.5 Bounded machines and computable randomness
    7.1.6 Process complexity and computable randomness
  7.2 Weak n-randomness
    7.2.1 Basics
    7.2.2 Characterizations of weak 1-randomness
    7.2.3 Schnorr randomness via Kurtz null tests
    7.2.4 Weakly 1-random left-c.e. reals
    7.2.5 Solovay genericity and randomness
  7.3 Decidable machines
  7.4 Selection revisited
    7.4.1 Stochasticity
    7.4.2 Partial computable martingales and stochasticity
    7.4.3 A martingale characterization of stochasticity
  7.5 Nonmonotonic randomness
    7.5.1 Nonmonotonic betting strategies
    7.5.2 Van Lambalgen's Theorem revisited
  7.6 Demuth randomness
  7.7 Difference randomness
  7.8 Finite randomness
  7.9 Injective and permutation randomness

8 Algorithmic Randomness and Turing Reducibility
  8.1 Π^0_1 classes of 1-random sets
  8.2 Computably enumerable degrees
  8.3 The Kučera-Gács Theorem
  8.4 A "no gap" theorem for 1-randomness
  8.5 Kučera coding
  8.6 Demuth's Theorem
  8.7 Randomness relative to other measures
  8.8 Randomness and PA degrees
  8.9 Mass problems
  8.10 DNC degrees and subsets of random sets
  8.11 High degrees and separating notions of randomness
    8.11.1 High degrees and computable randomness
    8.11.2 Separating notions of randomness
    8.11.3 When van Lambalgen's Theorem fails
  8.12 Measure theory and Turing reducibility
  8.13 n-randomness and weak n-randomness
  8.14 Every 2-random set is GL1
  8.15 Stillwell's Theorem
  8.16 DNC degrees and autocomplexity
  8.17 Randomness and n-fixed-point freeness
  8.18 Jump inversion
  8.19 Pseudo-jump inversion
  8.20 Randomness and genericity
    8.20.1 Similarities between randomness and genericity
    8.20.2 n-genericity versus n-randomness
  8.21 Properties of almost all degrees
    8.21.1 Hyperimmunity
    8.21.2 Bounding 1-generics
    8.21.3 Every 2-random set is CEA
    8.21.4 Where 1-generic degrees are downward dense

Part III: Relative Randomness

9 Measures of Relative Randomness
  9.1 Solovay reducibility
  9.2 The Kučera-Slaman Theorem
  9.3 Presentations of left-c.e. reals and complexity
  9.4 Solovay functions and 1-randomness
  9.5 Solovay degrees of left-c.e. reals
  9.6 cl-reducibility and rK-reducibility
  9.7 K-reducibility and C-reducibility
  9.8 Density and splittings
  9.9 Monotone degrees and density
  9.10 Further relationships between reducibilities
  9.11 A minimal rK-degree
  9.12 Complexity and completeness for left-c.e. reals
  9.13 cl-reducibility and the Kučera-Gács Theorem
  9.14 Further properties of cl-reducibility
    9.14.1 cl-reducibility and joins
    9.14.2 Array noncomputability and joins
    9.14.3 Left-c.e. reals cl-reducible to versions of Ω
    9.14.4 cl-degrees of versions of Ω
  9.15 K-degrees, C-degrees, and Turing degrees
  9.16 The structure of the monotone degrees
  9.17 Schnorr reducibility

10 Complexity and Relative Randomness for 1-Random Sets
  10.1 Uncountable lower cones in K and C
  10.2 The K-complexity of Ω and other 1-random sets
    10.2.1 K(A ↾ n) versus K(n) for 1-random sets A
    10.2.2 The rate of convergence of Ω and the α function
    10.2.3 Comparing complexities of 1-random sets
    10.2.4 Limit complexities and relativized complexities
  10.3 Van Lambalgen reducibility
    10.3.1 Basic properties of the van Lambalgen degrees
    10.3.2 Relativized randomness and Turing reducibility
    10.3.3 vL-reducibility, K-reducibility, and joins
    10.3.4 vL-reducibility and C-reducibility
    10.3.5 Contrasting vL-reducibility and K-reducibility
  10.4 Upward oscillations and K-comparable 1-random sets
  10.5 LR-reducibility
  10.6 Almost everywhere domination

11 Randomness-Theoretic Weakness
  11.1 K-triviality
    11.1.1 The basic K-triviality construction
    11.1.2 The requirement-free version
    11.1.3 Solovay functions and K-triviality
    11.1.4 Counting the K-trivial sets
  11.2 Lowness
  11.3 Degrees of K-trivial sets
    11.3.1 A first approximation: wtt-incompleteness
    11.3.2 A second approximation: impossible constants
    11.3.3 The less impossible case
  11.4 K-triviality and lowness
  11.5 Cost functions
  11.6 The ideal of K-trivial degrees
  11.7 Bases for 1-randomness
  11.8 ML-covering, ML-cupping, and related notions
  11.9 Lowness for weak 2-randomness
  11.10 Listing the K-trivial sets
  11.11 Upper bounds for the K-trivial sets
  11.12 A gap phenomenon for K-triviality

12 Lowness and Triviality for Other Randomness Notions
  12.1 Schnorr lowness
    12.1.1 Lowness for Schnorr tests
    12.1.2 Lowness for Schnorr randomness
    12.1.3 Lowness for computable measure machines
  12.2 Schnorr triviality
    12.2.1 Degrees of Schnorr trivial sets
    12.2.2 Schnorr triviality and strong reducibilities
    12.2.3 Characterizing Schnorr triviality
  12.3 Tracing weak truth table degrees
    12.3.1 Basics
    12.3.2 Reducibilities with tiny uses
    12.3.3 Anti-complex sets and tiny uses
    12.3.4 Anti-complex sets and Schnorr triviality
  12.4 Lowness for weak genericity and randomness
  12.5 Lowness for computable randomness
  12.6 Lowness for pairs of randomness notions

13 Algorithmic Dimension
  13.1 Classical Hausdorff dimension
  13.2 Hausdorff dimension via gales
  13.3 Effective Hausdorff dimension
  13.4 Shift complex sets
  13.5 Partial randomness
  13.6 A correspondence principle for effective dimension
  13.7 Hausdorff dimension and complexity extraction
  13.8 A Turing degree of nonintegral Hausdorff dimension
  13.9 DNC functions and effective Hausdorff dimension
    13.9.1 Dimension in h-spaces
    13.9.2 Slow-growing DNC functions and dimension
  13.10 C-independence and Zimand's Theorem
  13.11 Other notions of dimension
    13.11.1 Box counting dimension
    13.11.2 Effective box counting dimension
    13.11.3 Packing dimension
    13.11.4 Effective packing dimension
  13.12 Packing dimension and complexity extraction
  13.13 Clumpy trees and minimal degrees
  13.14 Building sets of high packing dimension
  13.15 Computable dimension and Schnorr dimension
    13.15.1 Basics
    13.15.2 Examples of Schnorr dimension
    13.15.3 A machine characterization of Schnorr dimension
    13.15.4 Schnorr dimension and computable enumerability
  13.16 The dimensions of individual strings

Part IV: Further Topics

14 Strong Jump Traceability
  14.1 Basics
  14.2 The ideal of strongly jump traceable c.e. sets
  14.3 Strong jump traceability and K-triviality: the c.e. case
  14.4 Strong jump traceability and diamond classes
  14.5 Strong jump traceability and K-triviality: the general case

15 Ω as an Operator
  15.1 Introduction
  15.2 Omega operators
  15.3 A-1-random A-left-c.e. reals
  15.4 Reals in the range of some Omega operator
  15.5 Lowness for Ω
  15.6 Weak lowness for K
    15.6.1 Weak lowness for K and lowness for Ω
    15.6.2 Infinitely often strongly K-random sets
  15.7 When Ω^A is a left-c.e. real
  15.8 Ω^A for K-trivial A
  15.9 K-triviality and left-d.c.e. reals
  15.10 Analytic behavior of Omega operators

16 Complexity of Computably Enumerable Sets
  16.1 Barzdins' Lemma and Kummer complex sets
  16.2 The entropy of computably enumerable sets
  16.3 The collection of nonrandom strings
    16.3.1 The plain and conditional cases
    16.3.2 The prefix-free case
    16.3.3 The overgraphs of universal monotone machines
    16.3.4 The strict process complexity case

References

Index
Preface
Though we did not know it at the time, this book's genesis began with the arrival of Cris Calude in New Zealand. Cris has always had an intense interest in algorithmic information theory. The event that led to much of the recent research presented here was the articulation by Cris of a seemingly innocuous question. This question goes back to Solovay's legendary manuscript [371], and Downey learned of it during a visit made to Victoria University in early 2000 by Richard Coles, who was then a postdoctoral fellow with Calude at Auckland University. In effect, the question was whether the Solovay degrees of left-computably enumerable reals are dense. At the time, neither of us knew much about Kolmogorov complexity, but we had a distinct interest in it after Lance Fortnow's illuminating lectures [148] at Kaikoura[1] in January 2000. After thinking about Calude's question for a while, and eventually solving it together with André Nies [116], we began to realize that there was a huge and remarkably fascinating area of research, whose potential was largely untapped, lying at the intersection of computability theory and the theory of algorithmic randomness. We also found that, while there is a truly classic text on Kolmogorov complexity, namely Li and Vitányi [248], most of the questions we were interested in either were open, were exercises in Li and Vitányi with difficulty ratings of about 40-something (out of 50), or necessitated an archaeological dig into the depths of a literature with few standards in notation[2] and terminology, marked by relentless rediscovery of theorems and a significant amount of unpublished material.

Particularly noteworthy among the unpublished material was the aforementioned set of notes by Solovay [371], which contained absolutely fundamental results about Kolmogorov complexity in general, and about initial segment complexity of sets in particular. As our interests broadened, we also became aware of important results from Stuart Kurtz' PhD dissertation [228], which, like most of Solovay's results, seemed unlikely ever to be published in a journal. Meanwhile, a large number of other authors started to make great strides in our understanding of algorithmic randomness. Thus, we decided to try to organize results on the relationship between algorithmic randomness and computability theory into a coherent book. We were especially thankful for Solovay's permission to present, in most cases for the first time, the details from his unpublished notes.[3] We were encouraged by the support of Springer in this enterprise. Naturally, this project has conformed to Hofstadter's Law: It always takes longer than you expect, even when you take into account Hofstadter's Law. Part of the reason for this delay is that a large contingent of gifted researchers continued to relentlessly prove theorems that made it necessary to rewrite large sections of the book.[4] We think it is safe to say that the study of algorithmic randomness and dimension is now one of the most active areas of research in mathematical logic. Even in a book this size, much has necessarily been left out.

To those who feel slighted by these omissions, or by inaccuracies in attribution caused by our necessarily imperfect historical knowledge, we apologize in advance, and issue a heartfelt invitation to write their own books. Any who might feel inclined to thank us will find a suggestion for an appropriate gift on page 517.

This is not a basic text on Kolmogorov complexity. We concentrate on the Kolmogorov complexity of sets (i.e., infinite sequences) and cover only as much as we need on the complexity of finite strings. There is quite a lot of background material in computability theory needed for some of the more sophisticated proofs we present, so we do give a full but, by necessity, rapid refresher course in basic "advanced" computability theory. This material should not be read from beginning to end. Rather, the reader should dip into Chapter 2 as the need arises. For a fuller introduction, see for instance Rogers [334], Soare [366], Odifreddi [310, 311], or Cooper [79].

We will mostly avoid historical comments, particularly about events predating our entry into this area of research. The history of the evolution of Kolmogorov complexity and related topics can make certain people rather agitated, and we feel neither competent nor masochistic enough to enter the fray. What seems clear is that, at some stage, time was ripe for the evolution of the ideas needed for Kolmogorov complexity. There is no doubt that many of the basic ideas were implicit in Solomonoff [369], and that many of the fundamental results are due to Kolmogorov [211]. The measure-theoretic approach was pioneered by Martin-Löf [259]. Many key results were established by Levin in works such as [241, 242, 243, 425] and by Schnorr [348, 349, 350], particularly those using the measure of domains to avoid the problems of plain complexity in addressing the initial segment complexity of sets. It is but a short step from there to prefix-free complexity (and discrete semimeasures), first articulated by Levin [243] and Chaitin [58]. Schnorr's penetrating ideas, only some of which are available in their original form in English, are behind much modern work in computational complexity, as well as Lutz' approach to effective Hausdorff dimension in [252, 254], which is based on martingales and orders. As has often been the case in this area, however, Lutz developed his material without being too aware of Schnorr's work, and was apparently the first to explicitly connect orders and Hausdorff dimension. From yet another perspective, martingales, or rather supermartingales, are essentially the same as continuous semimeasures, and again we see the penetrating insight of Levin (see [425]).

We are particularly pleased to present the results of Kurtz and Solovay mentioned above, as well as hitherto unpublished material from Steve Kautz' dissertation [200] and the fundamental work of Antonín Kučera. Kučera was a real pioneer in connecting computability and randomness, and we believe that it is only recently that the community has really appreciated his deep intuition. Algorithmic randomness is a highly active field, and still has many fascinating open questions and unexplored directions of research. Recent lists of open questions include Miller and Nies [278] and the problem list [2] arising from a workshop organized by Hirschfeldt and Miller at the American Institute of Mathematics in 2006. Several of the questions on these lists have already been solved, however, with many of the solutions appearing in this book. We will mention a number of open questions below, some specific, some more open ended. The pages on which these occur are listed in the index under the heading open question.

[1] Kaikoura was the setting for a wonderful meeting on computational complexity. There is a set of lecture notes [112] resulting from this meeting, aimed at graduate students. Kaikoura is on the east coast of the South Island of New Zealand, and is famous for its beauty and for tourist activities such as whale watching and dolphin, seal, and shark swimming. The name "Kaikoura" is a Maori word meaning "eat crayfish", which is a fine piece of advice.
[2] We hope to help standardize notation. In particular, we have fixed upon the notation for Kolmogorov complexity used by Li and Vitányi: C for plain Kolmogorov complexity and K for prefix-free Kolmogorov complexity.
[3] Of course, Li and Vitányi used Solovay's notes extensively, mostly in the exercises and for quoting results.
[4] It is an unfortunate consequence of working on a book that attempts to cover a significant portion of a rapidly expanding area of research that one begins to hate one's most productive colleagues a little.
Acknowledgments
Our foremost acknowledgment is to the Marsden Fund of New Zealand. There is no question that this work would not have happened without its generous support. Hirschfeldt was Downey’s Marsden Postdoctoral Fellow when the work that eventually led to this book began. George Barmpalias, Laurent Bienvenu, Richard Coles, Noam Greenberg (now a faculty member at VUW), Evan Griffiths, Geoff LaForte, Joe Miller, and Antonio Montalb´an have also been Downey’s Marsden Postdoctoral Fellows. Keng Meng Ng, Adam Day, and Andrew Fitzgerald were supported by the Marsden Fund as Downey’s PhD students. Downey has also received direct Marsden support for the period of the writing of this book, as has Andr´e Nies, whose work appears throughout the text. As mentioned in the preface, our early interest in Kolmogorov complexity was stimulated by a talk given by Lance Fortnow at a conference in Kaikoura almost completely supported by the Marsden Fund, via the New Zealand Mathematical Research Institute.5 Downey and Liang Yu have also been supported by the New Zealand 5 This institute is a virtual one, and was the brainchild of Vaughan Jones. After receiving the Fields Medal, among other accolades, Vaughan has devoted his substantial influence to bettering New Zealand mathematics. The visionary NZMRI was founded with this worthy goal in mind. It runs annual workshops at picturesque locations, each devoted to a specific area of mathematics. These involve lecture series by overseas experts aimed at graduate students, and are fully funded for New Zealand attendees. The 2009 workshop, held in Napier, was devoted to algorithmic information theory, computability, and complexity, and Hirschfeldt was one of the speakers. The NZMRI is chaired by Vaughan Jones, and has as its other directors Downey and the uniformly excellent Marston Conder, David Gauld, and Gaven Martin.
xviii
Acknowledgments
Institute for Mathematics and its Applications, a recent CoRE (Centre of Research Excellence) that grew from the NZMRI, with Yu being Downey’s postdoctoral fellow supported by the logic and computability programme of this CoRE. As a postdoctoral fellow, Guohua Wu was supported by the New Zealand Foundation for Research Science and Technology, having previously been supported by the Marsden Fund as Downey’s PhD student. Stephanie Reid received similar support from the Marsden Fund for her MSc thesis. Finally, many visitors and temporary fellows at VUW have been supported by the Marsden Fund, and sometimes the ISAT Linkages programme, including Eric Allender, Veronica Becher, Peter Cholak, Barbara Csima, Carl Jockusch, Steffen Lempp, Andy Lewis, Jan Reimann, Ted Slaman, Sebastiaan Terwijn, and Rebecca Weber, among others. Downey received a Maclaurin Fellowship from the NZIMA as part of the Logic and Computation programme, and a James Cook Fellowship from the New Zealand government for 2008–2010, both of which were important in supporting his work on this book. We also owe substantial thanks to the National Science Foundation of the United States. Hirschfeldt’s research has been supported by NSF grants throughout the period of the writing of this book, including a Focused Research Group grant on algorithmic randomness involving researchers from around a dozen institutions in the United States, who together have produced a significant portion of the research presented here. Among many activities, this FRG grant has funded important workshops in the area, in Chicago in 2008, Madison in 2009, and South Bend in 2010. The group of researchers involved in this project first came together at a workshop organized by Hirschfeldt and Joe Miller at the American Institute of Mathematics. The NSF has also funded many visitors to the University of Chicago, including Downey, who has also been funded by the NSF for visits to other institutions in the United States. 
Many colleagues have helped us by sending in corrections and suggestions, by helping us with various aspects of the production of this book, and by generously sharing their results and ideas. Although errors and infelicities no doubt remain, and are entirely our responsibility, the text has benefited enormously from their assistance. With apologies to those we may have forgotten, we thank Eric Allender, Klaus Ambos-Spies, Bernie Anderson, Uri Andrews, Asat Arslanov, George Barmpalias, Veronica Becher, Laurent Bienvenu, Paul Brodhead, Cris Calude, Douglas Cenzer, Gregory Chaitin, Peter Cholak, Chi Tat Chong, Richard Coles, Chris Conidis, Barry Cooper, Barbara Csima, Adam Day, Ding Decheng, Jean-Paul Delahaye, David Diamondstone, Damir Dzhafarov, Santiago Figueira, Andrew Fitzgerald, Lance Fortnow, Johanna Franklin, Peter Gács, Noam Greenberg, Evan Griffiths, John Hitchcock, Carl Jockusch, Asher Kach, Bakh Khoussainov, Bjørn Kjos-Hanssen, Julia Knight, Antonin Kučera, Geoff LaForte, Steffen Lempp, Leonid Levin, Jack Lutz, Elvira Mayordomo, Wolfgang Merkle, Joe Mileti, Joe Miller, Antonio Montalbán, André Nies,
Keng Meng Ng, Chris Porter, Alex Raichev, Jan Reimann, Jeffrey Remmel, Alexander Shen, Richard Shore, Ted Slaman, Bob Soare, Robert Solovay, Frank Stephan, Jonathan Stephenson, Dan Turetsky, Nikolai Vereshchagin, Paul Vitányi, Rebecca Weber, Guohua Wu, Cathryn Young, Yang Yue, Liang Yu, Xizhong Zheng, and Marius Zimand. We also thank the staff at Springer for all their help with the preparation of this book, and for their patience during this rather lengthy process. Particular thanks go to Ronan Nugent and Vaishali Damle. We are proud to inaugurate the CiE-Springer book series Theory and Applications of Computability, and thank the Computability in Europe Association, and particularly series editors Elvira Mayordomo and Barry Cooper. Finally, we thank the anonymous reviewers for their many valuable suggestions.
Introduction
What does it mean to say that an individual mathematical object such as an infinite binary sequence is random? Or to say that one sequence is more random than another? These are the most basic questions motivating the work we describe in this book. Once we have reasonable tools for measuring the randomness of infinite sequences, however, other questions present themselves: If we divide our sequences into equivalence classes of sequences of the same “degree of randomness”, what does the resulting structure look like? How do various possible notions of randomness relate to each other and to the measures of complexity used in computability theory and algorithmic information theory, and to what uses can they be put? Should it be the case that high levels of randomness mean high levels of complexity or computational power, or low ones? Should the structures of computability theory such as Turing degrees and computably enumerable sets have anything to do with randomness? The material in this book arises from questions such as these. Much of it can be thought of as exploring the relationships between three fundamental concepts: relative computability, as measured by notions such as Turing reducibility; information content, as measured by notions such as Kolmogorov complexity; and randomness of individual objects, as first successfully defined by Martin-Löf (but prefigured by others, dating back at least to the work of von Mises). While some fundamental questions remain open, we now have a reasonable insight into many of the above questions, and the resulting body of work contains a number of beautiful and rather deep theorems.
When considering sequences such as 101010101010101010101010101010101010 . . . and 101101011101010111100001010100010111 . . ., none but the most contrarian among us would deny that the second (obtained by the first author by tossing a coin) is more random than the first. However, in measure-theoretic terms, they are both equally likely. Furthermore, what are we to make in this context of, say, the sequence obtained by taking our first sequence, tossing a coin for each bit in the sequence, and, if the coin comes up heads, replacing that bit by the corresponding one in the second sequence? There are deep and fundamental questions involved in trying to understand why some sequences should count as “random” and others as “lawful”, and how we can transform our intuitions about these concepts into meaningful mathematical notions. The roots of the study of algorithmic randomness go back to the work of Richard von Mises in the early 20th century. In his remarkable paper [402], he argued that a sequence should count as random if all “reasonable” infinite subsequences satisfy the law of large numbers (i.e., have the same proportion of 0’s as 1’s in the limit). This behavior is certainly to be expected of any intuitively random sequence. A sequence such as 1010101010 . . . should not count as random because, although it itself satisfies the law of large numbers, it contains easily described subsequences that do not. Von Mises wrote of “acceptable selection rules” for subsequences. Wald [403, 404] later showed that for any countable collection of selection rules, there are sequences that are random in the sense of von Mises, but at the time it was unclear exactly what types of selection rules should be acceptable. There seemed to von Mises to be no canonical choice. 
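To make the selection-rule idea concrete, here is a minimal Python sketch (the sequence and the particular rule are our own illustrative choices, not anything from the text): a rule examines only the prefix read so far and decides, before seeing it, whether to select the next bit. Applied to the periodic sequence 101010..., the computable rule "select the bits in even positions" produces a subsequence that is constantly 1, so the sequence fails the law of large numbers along a trivially computable subsequence:

```python
def select(seq, rule):
    # Apply a selection rule: the rule sees only the prefix read so far
    # and decides whether to select the upcoming bit before seeing it.
    return [seq[i] for i in range(len(seq)) if rule(seq[:i])]

periodic = [1, 0] * 500  # the sequence 101010...

# Computable rule: select the bits in even positions (0, 2, 4, ...).
even_positions = lambda prefix: len(prefix) % 2 == 0

sub = select(periodic, even_positions)
# The full sequence has limiting frequency 1/2, but every selected bit
# is 1, so the subsequence violates the law of large numbers.
```

This is exactly the sense in which 1010... is disqualified: it satisfies the law of large numbers itself, but an easily described subsequence does not.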
Later, with the development of computability theory and the introduction of generally accepted precise mathematical definitions of the notions of algorithm and computable function, Church [71] made the first explicit connection between computability theory and randomness by suggesting that a selection rule be considered acceptable iff it is computable. In a sense, this definition of what we now call Church stochasticity can be seen as the birth of the theory of algorithmic randomness. A blow to the von Mises program was dealt by Ville [401], who showed that for any countable collection of selection rules, there is a sequence that is random in the sense of von Mises but has properties that make it clearly nonrandom. (In Ville’s example, the ratio of 0’s to 1’s in the first n bits of the sequence is at least 1 for all n. If we flip a fair coin, we certainly expect the ratio of heads to tails not only to tend to 1, but also to be sometimes slightly larger and sometimes slightly smaller than 1.) One might try to get around this problem by adding further specific statistical laws to the law of large numbers in the definition of randomness, but there then seems to be no reason not to expect Ville-like results to reappear in this modified context. Just because a sequence respects laws A, B, and C, why should we expect it to respect law D? And law D may be one we have overlooked, perhaps one that is complicated to state, but clearly should be respected by any intuitively random sequence. Thus Ville’s Theorem caused the situation to revert basically to what it had been before Church’s work: intuitively, a sequence should be random if it passes all “reasonable” statistical tests, but how do we make this notion precise? Once again, the answer involved computability theory. In a sweeping generalization, Martin-Löf [259] noted that the particular statistical tests that had been considered (the law of large numbers, the law of the iterated logarithm, etc.) were special cases of a general abstract notion of statistical test based on the notion of an “effectively null” set. He then defined a notion of randomness based on passing all such tests (or equivalently, not being in any effectively null set). Martin-Löf’s definition turned out to be not only foundationally well motivated but also mathematically robust and productive. Now known as 1-randomness or Martin-Löf randomness, it will be the central notion of randomness in this book, but not the only one. There are certainly other reasonable choices for what counts as “effectively null” than the one taken by Martin-Löf, and many notions of randomness resulting from these choices will be featured here. Furthermore, one of the most attractive features of the notion of 1-randomness is that it can be arrived at from several other approaches, such as the idea that random sequences should be incompressible, and the idea that random sequences should be unpredictable (which was already present in the original motivation behind von Mises’ definition).
These approaches lead to equivalent definitions of 1-randomness in terms of, for example, prefix-free Kolmogorov complexity and computably enumerable martingales (concepts we will define and discuss in this book). Like Martin-Löf’s original definition, these alternative definitions admit variations, again leading to other reasonable notions of algorithmic randomness that we will discuss. The evolution and clarification of many of these notions of randomness is carefully discussed in Michiel van Lambalgen’s dissertation [397]. The first five chapters of this book introduce background material that we use throughout the rest of the text, but also present some important related results that are not quite as central to our main topics. Chapter 1 briefly covers basic notation, conventions, and terminology, and introduces a small amount of measure theory. Chapter 2 is a whirlwind tour of computability theory. It assumes nothing but a basic familiarity with a formalism such as Turing machines, at about the level of a first course on “theory of computation”, but is certainly not designed to replace dedicated texts such as Soare [366] or Odifreddi [310, 311]. Nevertheless, quite sophisticated computability-theoretic methods have found their way into
the study of algorithmic randomness, so there is a fair amount of material in that chapter. It is probably best thought of as a reference for the rest of the text, possibly only to be scanned at a first reading. Chapter 3 is an introduction to Kolmogorov complexity, focused on those parts of the theory that will be most useful in the rest of the book. We include proofs of basic results such as counting theorems, symmetry of information, and the Coding Theorem, among others. (A much more general reference is Li and Vitányi [248].) As mentioned above, Martin-Löf’s measure-theoretic approach to randomness is not the only one. It can be thought of as arising from the idea that random objects should be “typical”. As already touched upon above, two other major approaches we will discuss are through “unpredictability” and “incompressibility”. The latter is perhaps the least obvious of the three, and also perhaps the most modern. Nowadays, with file compression a concept well known to many users of computers and other devices involving electronic storage or transmission, it is perhaps not so strange to characterize randomness via incompressibility, but it seems clear that typicality and unpredictability are even more intuitive properties of randomness. Nevertheless, the incompressibility approach had its foundations laid at roughly the same time as Martin-Löf’s work, by Kolmogorov [211], and in a sense even earlier by Solomonoff [369], although its application to infinite sequences had to wait a while. Roughly speaking, the Kolmogorov complexity of a finite string is the length of its shortest description. To formalize this notion, we use universal machines, thought of as “optimal description systems”. We then get a good notion of randomness for finite strings: a string σ is random iff the Kolmogorov complexity of σ is no shorter than the length of σ (which we can think of as saying that σ is its own best description).
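The counting argument behind the existence of such random (incompressible) strings can be sketched numerically; this is a standard observation, rendered here in Python of our own devising. There are 2^n binary strings of length n, but only 2^n − 1 binary strings of length strictly less than n available as descriptions, so for every n at least one string of length n has no shorter description:

```python
# There are 2**n binary strings of length n, but only 2**n - 1 binary
# strings (potential programs) of length strictly less than n.  So no
# matter what the description system is, for every n some string of
# length n has Kolmogorov complexity at least n: it is incompressible.
def num_strings(n):
    return 2 ** n

def num_shorter_descriptions(n):
    return sum(2 ** i for i in range(n))  # 2**0 + ... + 2**(n-1) = 2**n - 1

for n in range(1, 30):
    assert num_shorter_descriptions(n) == num_strings(n) - 1
```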
Turning to infinite sequences, however, we have a problem. As we will see in Theorem 3.1.4, there is no infinite sequence all of whose initial segments are incompressible. We can get around this problem by introducing a different notion of Kolmogorov complexity, which is based on machines whose domains are antichains, and corresponds to the idea that if a string τ describes a string σ, then this description should be encoded entirely in the bits of τ , not in its length. In Section 3.5, we further discuss how this notion of prefix-free complexity can be seen as capturing the intuitive meaning of Kolmogorov complexity, arguably better than the original definition, and will briefly discuss its history. In Chapter 6 we use it to give a definition of randomness for infinite sequences equivalent to Martin-Löf’s. In Chapter 4, we present for the first time in published form the details of Solovay’s remarkable results relating plain and prefix-free Kolmogorov complexity, and related results by Muchnik and Miller. We also present Gács’ separation of two other notions of complexity introduced in Chapter 3, a surprisingly difficult result, and a significant extension of that result by Day. Most of the material in this chapter, although quite interesting in itself, will not be used in the rest of the book.
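The antichain condition on domains is easy to state operationally. The following sketch (our own illustration, not code from the text) checks that a finite set of strings is prefix-free, i.e., that no string in it is a proper prefix of another, and computes the total measure of the corresponding cylinders, which for a prefix-free set is at most 1 (Kraft's inequality), since the cylinders are pairwise disjoint:

```python
def is_prefix_free(strings):
    # An antichain under the prefix ordering: no string in the set is a
    # proper prefix of another.
    return not any(a != b and b.startswith(a) for a in strings for b in strings)

def kraft_sum(strings):
    # Total measure of the cylinders determined by the strings.  For a
    # prefix-free set the cylinders are pairwise disjoint, so the sum is
    # at most 1 (Kraft's inequality).
    return sum(2.0 ** -len(s) for s in strings)

domain = {"0", "10", "110"}             # a prefix-free set
assert is_prefix_free(domain)
assert not is_prefix_free({"0", "01"})  # "0" is a proper prefix of "01"
assert kraft_sum(domain) <= 1           # here: 1/2 + 1/4 + 1/8 = 0.875
```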
In Chapter 5 we discuss effective real numbers, in particular the left computably enumerable reals, which are those reals that can be computably approximated from below. These reals play a role in the theory of algorithmic randomness similar to that of the computably enumerable sets in classical computability theory. Many of the central objects in this book (Martin-Löf tests, Kolmogorov complexity, martingales, etc.) have naturally associated left-c.e. reals. A classic example is Chaitin’s Ω, which is the measure of the domain of a universal prefix-free machine, and is the canonical example of a specific 1-random real (though, as we will see, it is in many ways not a “typical” 1-random real). The next three chapters introduce most of the notions of algorithmic randomness we will study and examine their connections to computability theory. Chapter 6 is dedicated to 1-randomness (and its natural generalization, n-randomness). We introduce the concepts of Martin-Löf test and of martingale, using them, as well as Kolmogorov complexity, to give definitions of randomness in the spirit of the three approaches mentioned above, and prove Schnorr’s fundamental theorems that these definitions are equivalent. We also include some fascinating theorems by Miller, Yu, Nies, Stephan, and Terwijn on the relationship between plain Kolmogorov complexity and randomness. These include plain complexity characterizations of both 1-randomness and 2-randomness. We prove Ville’s Theorem mentioned above, and introduce some of the most important tools in the study of 1-randomness and its higher level versions, including van Lambalgen’s Theorem, effective 0-1 laws, and the Ample Excess Lemma of Miller and Yu. In the last section of this chapter, we briefly examine randomness relative to measures other than the uniform measure, a topic we return to, again briefly, in Chapter 8.
In Chapter 7 we introduce other notions of randomness, mostly based on variations on Martin-Löf’s approach and on considering martingales with different levels of effectiveness. Several of these notions were originally motivated by what is now known as Schnorr’s critique. Schnorr argued that 1-randomness is essentially a computably enumerable, rather than computable, notion and therefore too strong to capture the intuitive notion of randomness relative to “computable tests”. We study various notions, including Schnorr randomness and computable randomness, introduced by Schnorr, and weak n-randomness, introduced by Kurtz. We discuss test set, martingale, and machine characterizations of these notions. We also return to the roots of the subject to discuss stochasticity, and study nonmonotonic randomness, which leads us to one of the major open questions in the area: whether nonmonotonic randomness is strictly weaker than 1-randomness. Chapter 8 is devoted to the interactions between randomness and computability. Highlights include the Kučera-Gács Theorem that any set can
be coded into a 1-random set, and hence all degrees above 0′ contain 1-random sets; Demuth’s Theorem relating 1-randomness and truth table reducibility; Stephan’s dichotomy theorem relating 1-randomness and PA degrees; the result by Barmpalias, Lewis, and Ng that each PA degree is the join of two 1-random degrees; and Stillwell’s Theorem that the “almost all” theory of the Turing degrees is decidable. We also examine how the ability to compute a fixed-point free function relates to initial segment complexity, discuss jump inversion for 1-random sets, and study the relationship between n-randomness, weak n-randomness, and genericity, among other topics. In addition, we examine the relationship between computational power and separating notions of randomness. For example, we prove the remarkable result of Nies, Stephan, and Terwijn that a degree contains a set that is Schnorr random but not computably random, or one that is computably random but not 1-random, iff it is high. We finish this chapter with Kurtz’ results, hitherto available only in his dissertation, on “almost all” properties of the degrees, such as the fact that almost every set is computably enumerable in and above some other set, and versions of some of these results by Kautz (again previously unpublished) converting “for almost all sets” to “for all 2-random sets”. The next five chapters examine notions of relative randomness: What does it mean to say that one sequence is more random than another? Can we make precise the intuition that if we, say, replace the even bits of a 1-random sequence by 0’s, the resulting sequence is “1/2-random”? In Chapter 9 we study reducibilities that act as measures of relative randomness, focusing in particular, though not exclusively, on left-c.e. reals. For instance, we prove the result due to Kučera and Slaman that the 1-random left-c.e. reals are exactly the ones that are complete for a strong notion known as Solovay reducibility.
These reals are also exactly the ones equal to Ω for some choice of universal prefix-free machine, so this result can be seen as an analog to the basic computability-theoretic result that all versions of the halting problem are essentially the same. We also prove that for a large class of reducibilities, the resulting degree structure on left-c.e. reals is dense and has other interesting properties, and discuss a natural but flawed strengthening of weak truth table reducibility known as cl-reducibility, and a better-behaved variation on it known as rK-reducibility. In Chapter 10 we focus on reducibilities appropriate for studying the relative randomness of sets already known to be 1-random. We study K- and C-reducibility, basic measures of relative initial segment complexity introduced in the previous chapter, including results by Solovay on the initial segment complexity of 1-random sets, and later echoes of this work, as well as theorems on the structure of the K- and C-degrees. We introduce van Lambalgen reducibility and the closely related notion of LR-reducibility, and study their basic properties, established by Miller and Yu. We see that
vL-reducibility is an excellent tool for studying the relative randomness of 1-random sets. It can be used to prove results about K- and C-reducibilities for which direct proofs seem difficult, and to establish theorems that help make precise the intuition that randomness should be antithetical to computational power. For instance, we show that if A ≤T B are both 1-random, then if B is n-random, so is A. Turning to LR-reducibility (which agrees with vL-reducibility on the 1-random sets but not elsewhere), we introduce an important characterization due to Kjos-Hanssen, discuss structural results due to Barmpalias and others, and prove the equivalence between LR-reducibility and LK-reducibility, a result by Kjos-Hanssen, Miller, and Solomon related to the lowness notions mentioned in the following paragraph. We finish the chapter with a discussion of the quite interesting concept of almost everywhere domination, which arose in the context of the reverse mathematics of measure theory and turned out to have deep connections with algorithmic randomness. Chapter 11 is devoted to one of the most important developments in recent work on algorithmic randomness: the realization that there is a class of “randomness-theoretically weak” sets that is as robust and mathematically interesting as the class of 1-random sets. A set A is K-trivial if its initial segments have the lowest possible prefix-free complexity (that is, the first n bits of A are no more difficult to describe than the number n itself). It is low for 1-randomness if every 1-random set is 1-random relative to A. It is low for K if the prefix-free Kolmogorov complexity of any string relative to A is the same as its unrelativized complexity, up to an additive constant. We show that there are noncomputable sets with these properties, and prove Nies’ wonderful result that these three notions coincide.
In other words, a set has lowest possible information content iff it has no derandomization power iff it has no compression power. We examine several other properties of the K-trivial sets, including the fact that they are very close to being computable, and provide further characterizations of them, in terms of other notions of randomness-theoretic weakness and the important concept of a cost function. In Chapter 12 we study lowness and triviality for other notions of randomness, such as Schnorr and computable randomness. For instance, we prove results of Terwijn and Zambella, and Kjos-Hanssen, Nies, and Stephan, characterizing lowness for Schnorr randomness in terms of traceability, and Nies’ result that there are no noncomputable sets that are low for computable randomness. We also study the analog of K-triviality for Schnorr randomness, including a characterization of Schnorr triviality by Franklin and Stephan. Chapter 13 deals with algorithmic dimension. Lutz realized that Hausdorff dimension can be characterized using martingales, and used that insight to define a notion of effective Hausdorff dimension. This notion turns out to be closely related to notions of partial randomness that allow us to say, for example, that certain sets are 1/2-random, and also has a pleasing and useful characterization in terms of Kolmogorov complexity. We also study effective packing dimension, which can also be characterized using Kolmogorov complexity, and can be seen as a dual notion to effective Hausdorff dimension. Algorithmic dimension is a large area of research in its own right, but in this chapter we focus on the connections with computability theory. For instance, we can formulate a computability-theoretic version of the quite natural question of whether randomness can be extracted from a partially random source. We prove Miller’s result that there is a set of positive effective Hausdorff dimension that does not compute any set of higher effective Hausdorff dimension, the result by Greenberg and Miller that there is a degree of effective Hausdorff dimension 1 that is minimal (and therefore cannot compute a 1-random set), and the contrasting result by Zimand that randomness extraction is possible from two sufficiently independent sources of positive effective Hausdorff dimension. We also study the relationship between building sets of high packing dimension and array computability, and study the concept of Schnorr dimension. In the last section of this chapter, we look at Lutz’ definition of dimension for finite strings and its relationship to Kolmogorov complexity. The final three chapters cover further results relating randomness, complexity, and computability. One of the byproducts of the theory of K-triviality has been an increased interest in notions of lowness in computability theory. Chapter 14 discusses the class of strongly jump traceable sets, a proper subclass of the K-trivials with deep connections to randomness. In the computably enumerable case, we show that the strongly jump traceable c.e. sets form a proper subideal of the K-trivial c.e. sets, and can be characterized as those c.e. sets that are computable from every ω-c.e. (or every superlow) 1-random set. We also discuss the general (non-c.e.)
case, showing that, in fact, every strongly jump traceable set is K-trivial. In Chapter 15 we look at Ω as an operator on Cantor space. In general, our understanding of operators that take each set A to a set that is c.e. relative to A but does not necessarily compute A (as opposed to the more usual “computably enumerable in and above” operators of computability theory) is rather limited. There had been a hope at some point that the Omega operator might turn out to be degree invariant, hence providing a counterexample to a long-standing conjecture of Martin that (roughly speaking) the only degree invariant operators on the degrees are iterates of the jump. However, among other results, we show that there are sets A and B that are equal up to finite differences, but such that Ω^A and Ω^B are relatively 1-random (and hence Turing incomparable). We also establish several other properties of Omega operators due to Downey, Hirschfeldt, Miller, and Nies, including the fact that almost every real is Ω^A for some A, and prove Miller’s result that a set is 2-random iff it has infinitely many initial segments of maximal prefix-free Kolmogorov complexity.
Chapter 16 is devoted to the relationship between Kolmogorov complexity and c.e. sets. We prove Kummer’s theorem characterizing the Turing degrees that contain c.e. sets of highest possible Kolmogorov complexity, Solovay’s theorem relating the complexity of describing a c.e. set to its enumeration probability, and several results on the complexity of c.e. sets naturally associated with notions of Kolmogorov complexity, such as the set of nonrandom strings (in various senses of randomness for strings). As mentioned in the preface, we have included several open questions and unexplored research directions in the text, which are referenced in the index under the heading open question. Unlike the ideal machines computability theorists consider, authors are limited in both time and space. We have had to leave out many interesting results and research directions (and, of course, we are sure there are several others of which we are simply unaware). There are also entire areas of research that would have fit in with the themes of this book but had to be omitted. One of these is the use of algorithmic randomness in reverse mathematics. Another, related one, is the growing body of results on converting classical “almost everywhere” results in areas such as probability and dynamical systems into “for all sufficiently random” results (often precise ones, saying, for instance, that a certain statement holds for all 2-random but not all 1-random real numbers). Others come from varying one of three ingredients of algorithmic randomness: the spaces we consider, the measures on those spaces, and the level of effectiveness of our notions. There has been a growing body of research in extending algorithmic randomness to spaces other than Cantor space, defining for example notions of random continuous functions and random closed sets.
Even in the context of Cantor space, we only briefly discuss randomness for measures other than the uniform (or Lebesgue) measure (although, for computable continuous measures at least, much of the theory remains unaltered). The interaction between randomness and complexity theory is a topic that could easily fill a book this size by itself, but there are parts of it that are particularly close to the material we cover. Randomness in the context of effective descriptive set theory has also begun to be investigated. Nevertheless, we hope to give a useful and rich account of the ways computability theorists have found to calibrate randomness for individual elements of Cantor space, and how these relate to traditional measures of complexity, including both computability-theoretic measures of relative computational power such as Turing reducibility and notions from algorithmic information theory such as Kolmogorov complexity. Most of the material we cover is from the last few years, when we have witnessed an explosion of wonderful ideas in the area. This book is our account of what we see as some of the highlights. It naturally reflects our own views of what is important and attractive, but we hope there is enough here to make it useful to a wide range of readers.
Part I
Background
1 Preliminaries
1.1 Notation and conventions Most of our notation, terminology, and conventions will be introduced as we go along, but we set down here a few basic ones that are used throughout the book. Strings. Unless specified otherwise, by a string we mean a finite binary string. We mostly use lowercase Greek letters toward the middle and end of the alphabet for strings. We denote the set of finite binary strings by 2^{<ω}, the set of binary strings of length n by 2^n, and the set of binary strings of length less than or equal to n by 2^{≤n}. We denote the empty string by λ, the length of a string σ by |σ|, and the concatenation of strings σ and τ by either στ or σ⌢τ. The length-lexicographic ordering on 2^{<ω} is defined by saying that σ is less than τ if either |σ| < |τ|, or |σ| = |τ| and σ precedes τ in the usual lexicographic ordering.
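As a small illustration (our own, not from the text), the length-lexicographic ordering is exactly what one gets by comparing (length, string) pairs:

```python
from itertools import product

def length_lex_key(sigma):
    # Compare by length first, breaking ties lexicographically.
    return (len(sigma), sigma)

# All binary strings of length < 3, in length-lexicographic order:
strings = ["".join(bits) for n in range(3) for bits in product("01", repeat=n)]
strings.sort(key=length_lex_key)
# ['', '0', '1', '00', '01', '10', '11']
```

The empty string λ (here '') comes first, followed by the strings of length 1, then length 2, and so on.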
The fact that there are reals with more than one binary expansion will for the most part not matter to us, but in any case we make the following convention, which saves some work in proofs where we do not want to treat the rational case in some nonuniform manner, and do not wish to fuss about nonunique representations. In all cases, dealing with the rational case is completely straightforward, and omitting it is just a matter of convenience. Convention 1.1.1. Unless mentioned otherwise or clear from context, we assume that all reals are irrational, and hence have unique binary representations. We denote the natural numbers by N or ω interchangeably, and the Cantor space of infinite binary sequences by 2^ω. We will discuss this space further in the next section. We use standard set-theoretic notation, but mention here some notation that may be lesser-known. We write A =* B to mean that A(n) = B(n) for all but finitely many n. We write A ↾ n for the restriction of A to its first n bits (that is, the string A(0) . . . A(n − 1)), which we often identify with the finite set {m < n : m ∈ A}. More generally, A ↾ S is the restriction of A to the bits in S. We write Ā for the complement N \ A of A. For a set A of pairs, the domain of A, denoted by dom A, is the set of all x such that (x, y) ∈ A for some y. We fix a 1-1 function taking a sequence of natural numbers (n0, . . . , nk) to a natural number ⟨n0, . . . , nk⟩, and a canonical listing D0, D1, . . . of the finite sets. (The only thing that matters here is that we can effectively recover (n0, . . . , nk) from ⟨n0, . . . , nk⟩ and Dn from n. That is, these functions are computable in the sense of the next chapter.) When considering max S for a set S, we adopt the convention that the max of an empty set is 0. We use the notation [x, y] for a closed interval (in N or R), (x, y) for an open interval, and (x, y] and [x, y) for half-open intervals. Let A^[e] = {⟨e, n⟩ : ⟨e, n⟩ ∈ A}. We call A^[e] the eth column of A.
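A standard concrete choice for the coding function (one of many; the text deliberately does not fix a particular one) is the Cantor pairing function, sketched here together with the column operation it induces:

```python
def pair(x, y):
    # Cantor pairing: a computable bijection between N x N and N.
    return (x + y) * (x + y + 1) // 2 + y

def unpair(n):
    # Invert by finding the largest w with w*(w+1)/2 <= n.
    w = 0
    while (w + 1) * (w + 2) // 2 <= n:
        w += 1
    y = n - w * (w + 1) // 2
    return w - y, y

def column(A, e):
    # The e-th column of A: the codes in A whose first coordinate is e.
    return {m for m in A if unpair(m)[0] == e}

# Round trip: coding and decoding are mutually inverse.
assert all(unpair(pair(x, y)) == (x, y) for x in range(20) for y in range(20))
```

Tuples of arbitrary length can then be coded by iterating the pairing function; all that matters, as the text notes, is effective recoverability.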
Functions. For a function f, we write f^(k) to mean f composed with itself k many times. Thus, for example, log^(2) n means log log n. By convention, f^(0) is the identity function. When we write log n, we mean the base 2 logarithm of n, rounded up to the nearest integer. By convention, log 0 = 0. The graph of a function f is the set {(x, y) : f(x) = y}. For functions f, g, h : N → N, we write:

(i) f(n) ≤ g(n) + O(h(n)) to mean that there is a constant c such that f(n) ≤ g(n) + c · h(n) for all n,

(ii) f(n) = g(n) ± O(h(n)) to mean that f(n) ≤ g(n) + O(h(n)) and g(n) ≤ f(n) + O(h(n)), and

(iii) f(n) ≤ g(n) + o(h(n)) to mean that lim_n (f(n) − g(n))/h(n) = 0.
We say that f and g are asymptotically equal if lim_n f(n)/g(n) = 1, and write f ∼ g to mean that f(n) = O(g(n)) and g(n) = O(f(n)). Note that, in particular, f(n) = O(h(n)) means that there is a constant c such that f(n) ≤ c · h(n) for all n. Similarly, f(n) = o(h(n)) means that lim_n f(n)/h(n) = 0. Sometimes, when we write an expression such as f(n) ≤ g(n) + O(h(n)), there is an extra additive constant on the right-hand side. Such a constant is of course absorbed into the O(h(n)) term, but only if h(n) > 0. So we make it a convention that, even if h(n) = 0, the O(h(n)) term is at least O(1), and hence still absorbs the constant. We also use this notation for other kinds of functions, such as functions from 2^{<ω} to N, in an analogous way.

Logical notation. We will use standard logical notation, including the quantifiers ∃^∞ for "there exist infinitely many", ∀^∞ for "for all but finitely many", and ∃^{≥k} for "there exist at least k many". We also use the usual λ and μ notation: λx f(x, y_0, …, y_n) is the function x ↦ f(x, y_0, …, y_n) for fixed values y_0, …, y_n, and μn R(n) is the least n such that R(n) holds.

Lattices. Let (P, ≤) be a partial order. We say that z ∈ P is the join of x, y ∈ P if x, y ≤ z and x, y ≤ w ⇒ z ≤ w. The join of x and y is denoted by x ∨ y. We say that z ∈ P is the meet of x, y ∈ P if x, y ≥ z and x, y ≥ w ⇒ z ≥ w. The meet of x and y is denoted by x ∧ y. If every pair of elements of P has a join, then (P, ≤) is an uppersemilattice. If every pair of elements of P has a meet, then (P, ≤) is a lowersemilattice. If both conditions obtain, then (P, ≤) is a lattice. An ideal in an uppersemilattice (P, ≤) is a subset of P that is closed downward and under joins.
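The μ-operator has a direct programming counterpart: unbounded search. A minimal sketch (our illustration, not notation from the text):

```python
def mu(R):
    """mu n R(n): the least n such that R(n) holds.
    Like the mu-operator, the search diverges (loops forever)
    if no such n exists -- which is exactly why partial
    functions arise naturally in computability theory."""
    n = 0
    while not R(n):
        n += 1
    return n
```

For example, `mu(lambda n: n * n > 10)` returns 4, the least n whose square exceeds 10.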
1.2 Basics from measure theory

We assume familiarity with the basic concepts of measure theory as given in initial segments of texts such as Oxtoby [313], but review a few important facts in this section. For simplicity of presentation, we study the Cantor space 2^ω of infinite binary sequences. For σ ∈ 2^{<ω} we denote by [σ] the set of all extensions of σ in 2^ω. That is, [σ] = {σX : X ∈ 2^ω}. For A ⊆ 2^{<ω} we let [A] = ⋃_{σ∈A} [σ]. We say that A generates (or is a set of generators of) [A]. The (sub-)basic open sets of Cantor space are the [σ] for σ ∈ 2^{<ω}. In this topology these sets are clopen, and the clopen sets are finite unions of such basic open sets. Although the real unit interval is not homeomorphic to 2^ω with this topology, it is well known that the two spaces are isomorphic in the measure-theoretic sense. (See also Convention 1.1.1.)
The (uniform or Lebesgue) measure of a basic open set [σ] is μ([σ]) = 2^{−|σ|}. A sequence of basic open sets {[σ] : σ ∈ A} is said to cover a set C ⊆ 2^ω if C ⊆ [A]. The outer measure of C is

μ*(C) = inf { Σ_n 2^{−|σ_n|} : {[σ_n] : n ∈ N} covers C }.
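For clopen sets these measures can be computed exactly. A small sketch (our code, using `Fraction` for exact dyadic arithmetic), under the assumption that the generating set is prefix-free, so the cylinders are pairwise disjoint and measures simply add:

```python
from fractions import Fraction

def measure(generators):
    """mu([A]) for a prefix-free set A of binary strings: the
    cylinders [sigma] are pairwise disjoint, so the measures add."""
    return sum((Fraction(1, 2 ** len(s)) for s in generators), Fraction(0))

def relative_measure(generators, sigma):
    """mu([A] intersect [sigma]) / mu([sigma]), A again prefix-free."""
    total = Fraction(0)
    for s in generators:
        if s.startswith(sigma) or sigma.startswith(s):
            # a cylinder [s] meeting [sigma] covers 2^(|sigma|-|s|)
            # of [sigma], capped at all of it
            total += Fraction(1, 2 ** max(len(s) - len(sigma), 0))
    return total
```

For instance, `measure(["0", "10", "11"])` is 1, since those three cylinders partition 2^ω, and `relative_measure(["0", "10"], "1")` is 1/2.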
The inner measure of C is μ_*(C) = 1 − μ*(C̄), where C̄ is the complement 2^ω \ C of C. If C is measurable, then μ*(C) = μ_*(C), and we refer to this quantity as the measure μ(C) of C. Null sets are those that have measure 0. A set C has measure 0 iff there is a sequence of open covers of C whose measure goes to 0, that is, a collection {V_n}_{n∈ω} of open sets such that lim_n μ(V_n) = 0 and C ⊆ ⋂_n V_n. We will later effectivize this notion when we look at Martin-Löf randomness in Chapter 6. We will also effectivize the following fact, known as the (First) Borel-Cantelli Lemma.

Theorem 1.2.1 (Borel-Cantelli Lemma). Let {C_n}_{n∈ω} be a sequence of measurable sets such that Σ_n μ(C_n) < ∞. Then μ({X : ∃^∞ n (X ∈ C_n)}) = 0.

Proof. Let D = {X : ∃^∞ n (X ∈ C_n)}. Choose a sequence n_0, n_1, … such that Σ_{j ≥ n_i} μ(C_j) ≤ 2^{−i}. For each i, we have D ⊆ ⋃_{j ≥ n_i} C_j, so D is a null set.

The following (simplified version of a) classical fact from measure theory, known as the Lebesgue Density Theorem, will be used often in this book. It implies that for every set A of positive measure there are basic open sets within which the relative measure of A is arbitrarily close to 1. As we will see, this is the critical fact allowing for the existence of 0-1 laws in degree theory. Our proof follows the one in Terwijn [387].

Definition 1.2.2. A measurable set A has density d at X ∈ 2^ω if lim_n 2^n μ(A ∩ [X ↾ n]) = d.

Let D(A) = {X : A has density 1 at X}.

Theorem 1.2.3 (Lebesgue Density Theorem). If A is measurable then so is D(A). Furthermore, the measure of the symmetric difference of A and D(A) is 0, so μ(D(A)) = μ(A). Thus, if μ(A) > 0 then for any q < 1 there is a σ such that the relative measure μ(A ∩ [σ])/μ([σ]) is greater than q.

Proof. It suffices to show that A − D(A) is a null set, since D(A) − A ⊆ Ā − D(Ā) and Ā is measurable. For ε ∈ Q^+, let

B_ε = {X ∈ A : liminf_n 2^n μ(A ∩ [X ↾ n]) < 1 − ε}.

Then A − D(A) = ⋃_ε B_ε, so it suffices to show that each B_ε is null.
Suppose for a contradiction that there is an ε such that B_ε is not null, that is, μ*(B_ε) > 0. It is easy to see that μ*(B_ε) = inf{μ(U) : B_ε ⊆ U ∧ U open}, so there is an open set U ⊇ B_ε such that (1 − ε)μ(U) < μ*(B_ε). Let

I = {σ : [σ] ⊆ U ∧ μ(A ∩ [σ]) < (1 − ε)2^{−|σ|}}.

Then the following facts hold.

(i) If X ∈ B_ε then I contains X ↾ n for infinitely many n.

(ii) If σ_0, σ_1, … is a sequence of elements of I such that [σ_i] ∩ [σ_j] = ∅ for i ≠ j, then μ*(B_ε − ⋃_i [σ_i]) > 0.

Fact (i) holds because U is open and contains B_ε. Fact (ii) holds because

μ*(B_ε ∩ ⋃_i [σ_i]) ≤ μ(A ∩ ⋃_i [σ_i]) = Σ_i μ(A ∩ [σ_i]) < Σ_i (1 − ε)2^{−|σ_i|} ≤ (1 − ε)μ(U) < μ*(B_ε).
Construct a sequence σ_0, σ_1, … as follows. Let σ_0 be any element of I. Given σ_i for i ≤ n, let I_n = {σ ∈ I : [σ] ∩ [σ_i] = ∅ for all i ≤ n}. By (i) and (ii), I_n is infinite. Let d_n = sup{2^{−|σ|} : σ ∈ I_n} and let σ_{n+1} be an element of I_n such that 2^{−|σ_{n+1}|} > d_n/2.

Let X ∈ B_ε − ⋃_i [σ_i], which exists by (ii). By (i), there is a τ ∈ I with X ∈ [τ]. If [τ] were disjoint from every [σ_n], and hence τ contained in every I_n, then we would have 2^{−|τ|} ≤ d_n < 2^{−|σ_{n+1}|+1} for every n, contradicting the fact that μ(⋃_n [σ_n]) ≤ 1. So there is a least n such that [τ] ∩ [σ_n] ≠ ∅. Then either τ ⪯ σ_n or σ_n ⪯ τ. The minimality of n implies that 2^{−|τ|} ≤ d_{n−1} < 2^{−|σ_n|+1}, so |τ| ≥ |σ_n|. Thus σ_n ⪯ τ, and hence [τ] ⊆ [σ_n]. But then X ∈ [σ_n], contradicting the choice of X.

A tailset is a set A ⊆ 2^ω such that for all σ ∈ 2^{<ω} and X ∈ 2^ω, if σX ∈ A then τX ∈ A for all τ with |τ| = |σ|. An important corollary to the Lebesgue Density Theorem is the following theorem of Kolmogorov.

Theorem 1.2.4 (Kolmogorov's 0-1 Law). If A is a measurable tailset, then either μ(A) = 0 or μ(A) = 1.

Proof. Suppose that μ(A) > 0. By Theorem 1.2.3, choose X ∈ A such that A has density 1 at X. Let ε > 0. Choose n sufficiently large so that 2^n μ(A ∩ [X ↾ n]) > 1 − ε. Since A is a tailset, we know that 2^n μ(A ∩ [σ]) > 1 − ε for all σ of length n. Hence μ(A) > 1 − ε. Since ε is arbitrary, μ(A) = 1.
2 Computability Theory
In this chapter we will develop a significant amount of computability theory. Much of this technical material will not be needed until much later in the book, and perhaps in only a small section of the book. We have chosen to gather it in one place for ease of reference. However, as a result this chapter is quite uneven in difficulty, and we strongly recommend that one use most of it as a reference for later chapters, rather than reading through all of it in detail before proceeding. This is especially so for those unfamiliar with more advanced techniques such as priority arguments. In a few instances we will mention, or sometimes use, methods or results beyond what we introduce in the present chapter. For instance, when we mention the work by Reimann and Slaman [327] on “never continuously random” sets in Section 6.12, we will talk about Borel Determinacy, and in a few places in Chapter 13, some knowledge of forcing in the context of computability theory could be helpful, although our presentations will be self-contained. Such instances will be few and isolated, however, and choices had to be made to keep the length of the book somewhat within reason.
[R.G. Downey and D. Hirschfeldt, Algorithmic Randomness and Complexity, Theory and Applications of Computability, DOI 10.1007/978-0-387-68441-3_2, © Springer Science+Business Media, LLC 2010]

2.1 Computable functions, coding, and the halting problem

At the heart of our understanding of algorithmic randomness is the notion of an algorithm. Thus the tools we use are based on classical computability
theory. While we expect the reader to have had at least one course in the rudiments of computability theory, such as a typical course on "theory of computation", the goal of this chapter is to give a reasonably self-contained account of the basics, as well as some of the tools we will need. We do, however, assume familiarity with the technical definition of computability via Turing machines (or some equivalent formalism). For more details see, for example, Rogers [334], Salomaa [347], Soare [366], Odifreddi [310, 311], or Cooper [79].

Our initial concern is with functions from A into N where A ⊆ N, i.e., partial functions on N. If A = N then the function is called total. Looking only at N may seem rather restrictive. For example, later we will be concerned with functions that take the set of finite binary strings or subsets of the rationals as their domains and/or ranges. However, from the point of view of classical computability theory (that is, where resources such as time and memory do not matter), our definitions naturally extend to such functions by coding; that is, the domains and ranges of such functions can be coded as subsets of N. For example, the rationals Q can be coded in N as follows.

Definition 2.1.1. Let r ∈ Q \ {0} and write r = (−1)^δ p/q with p, q ∈ N in lowest terms and δ = 0 or 1. Then define the Gödel number of r, denoted by #(r), as 2^δ 3^p 5^q. Let the Gödel number of 0 be 0.

The function # is an injection from Q into N, and given n ∈ N we can decide exactly which r ∈ Q, if any, has #(r) = n. Similarly, if σ is a finite binary string, say σ = a_1 a_2 … a_n, then we can define #(σ) = 2^{a_1+1} 3^{a_2+1} ⋯ p_n^{a_n+1}, where p_n denotes the nth prime. There are myriad other codings possible, of course. For instance, one could code the string σ as the binary number 1σ, so that, for example, the string 01001 would correspond to 101001 in binary.
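A runnable version of the coding of Definition 2.1.1 and its decoding (our sketch; `Fraction` conveniently keeps p/q in lowest terms):

```python
import math
from fractions import Fraction

def godel_number(r):
    """#(r) = 2^delta * 3^p * 5^q for r = (-1)^delta * p/q in lowest terms; #(0) = 0."""
    f = Fraction(r)
    if f == 0:
        return 0
    delta = 1 if f < 0 else 0
    p, q = abs(f.numerator), f.denominator   # already in lowest terms
    return 2 ** delta * 3 ** p * 5 ** q

def decode(n):
    """Recover the r with #(r) = n, or None if n codes no rational."""
    if n == 0:
        return Fraction(0)
    exps = []
    for prime in (2, 3, 5):
        e = 0
        while n % prime == 0:
            n //= prime
            e += 1
        exps.append(e)
    delta, p, q = exps
    if n != 1 or delta > 1 or p == 0 or q == 0 or math.gcd(p, q) != 1:
        return None   # not of the form 2^delta * 3^p * 5^q with p/q in lowest terms
    return Fraction((-1) ** delta * p, q)
```

Both directions are algorithms, which is exactly the sense in which # is a decidable injection: given n we can decide which rational, if any, it codes.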
Coding methods such as these are called "effective codings", since they include algorithms for deciding the resulting injections, in the sense discussed above for the Gödel numbering of the rationals. Henceforth, unless otherwise indicated, when we discuss computability issues relating to a class of objects, we will always regard these objects as (implicitly) effectively coded in some way.

Part of the philosophy underlying computability theory is the celebrated Church-Turing Thesis, which states that the algorithmic (i.e., intuitively computable) partial functions are exactly those that can be computed by Turing machines on the natural numbers. Thus, we formally adopt the definition of algorithmic, or computable, functions as being those that are computable by Turing machines, but argue informally, appealing to the intuitive notion of computability as is usual. Excellent discussions of the subtleties of the Church-Turing Thesis can be found in Odifreddi [310] and Soare [367].
There are certain important basic properties of the algorithmic partial functions that we will use throughout the book, often implicitly. Following the usual computability theoretic terminology, we refer to such functions as partial computable functions. Proposition 2.1.2 (Enumeration Theorem – Universal Turing Machine). There is an algorithmic way of enumerating all the partial computable functions. That is, there is a list Φ0 , Φ1 , . . . of all such functions such that we have an algorithmic procedure for passing from an index i to a Turing machine computing Φi , and vice versa. Using such a list, we can define a partial computable function f (x, y) of two variables such that f (x, y) = Φx (y) for all x, y. Such a function, and any Turing machine that computes it, are called universal. To a modern computer scientist, this result is obvious. That is, given a program in some computer language, we can convert it into ASCII code, and treat it as a number. Given such a binary number, we can decode it and decide whether it corresponds to the code of a program, and if so execute this program. Thus a compiler for the given language can be used to produce a universal program. Henceforth, we fix an effective listing Φ0 , Φ1 , . . . of the partial computable functions as above. We have an algorithmic procedure for passing from an index i to a Turing machine Mi computing Φi , and we identify each Φi with Mi . For any partial computable function f , there are infinitely many ways to compute f . If Φy is one such algorithm for computing f , we say that y is an index for f . The point of Proposition 2.1.2 is that we can pretend that we have all the machines Φ1 , Φ2 , . . . in front of us. For instance, to compute 10 steps in the computation of the 3rd machine on input 20, we can pretend to walk to the 3rd machine, put 20 on the tape, and run it for 10 steps (we write the result as Φ3 (20)[10]). Thus we can computably simulate the action of computable functions. 
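The stage notation Φ_3(20)[10] — run a machine for a fixed number of steps and see whether it has halted yet — can be mimicked with Python generators standing in for Turing machines. This is a toy model of ours, not the book's formalism: each `yield` counts as one step, and a `return` is a halt.

```python
def run_steps(program, y, s):
    """Run `program` on input y for s steps.
    Returns ('halt', value) if it halts within s steps, else ('running', None)."""
    gen = program(y)
    try:
        for _ in range(s):
            next(gen)          # one step of the simulated machine
    except StopIteration as stop:
        return ('halt', stop.value)
    return ('running', None)

def slow_double(y):
    """A 'machine' that idles for y steps and then outputs 2*y."""
    for _ in range(y):
        yield
    return 2 * y
```

For example, `run_steps(slow_double, 3, 2)` reports `('running', None)`, while `run_steps(slow_double, 3, 10)` reports `('halt', 6)`: exactly the kind of step-bounded simulation a universal machine performs.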
In many ways, Proposition 2.1.2 is the platform that makes undecidability proofs work, since it allows us to diagonalize over the class of partial computable functions without leaving this class. For instance, we have the following result, where we write Φx (y) ↓ to mean that Φx is defined on y, or equivalently, that the corresponding Turing machine halts on input y, and Φx (y) ↑ to mean that Φx is not defined on y. Proposition 2.1.3 (Unsolvability of the halting problem). There is no algorithm that, given x, y, decides whether Φx (y) ↓. Indeed, there is no algorithm to decide whether Φx (x) ↓.
10
2. Computability Theory
Proof. Suppose such an algorithm exists. Then by Proposition 2.1.2, it follows that the following function g is (total) computable:

g(x) = 1 if Φ_x(x) ↑, and g(x) = Φ_x(x) + 1 if Φ_x(x) ↓.

Again using Proposition 2.1.2, there is a y with g = Φ_y. Since g is total, g(y) ↓, so Φ_y(y) ↓, and hence g(y) = Φ_y(y) + 1 = g(y) + 1, which is a contradiction.

Note that we can define a partial computable function g via g(x) = Φ_x(x) + 1 and avoid contradiction, as it will follow that, for any index y for g, we have Φ_y(y) ↑ = g(y) ↑. Also, the reason for the use of partial computable functions in Proposition 2.1.2 is now clear: The argument above shows that there is no computable procedure to enumerate all (and only) the total computable functions.

Proposition 2.1.3 can be used to show that many problems are algorithmically unsolvable by "coding" the halting problem into these problems. For example, we have the following result.

Proposition 2.1.4. There is no algorithm to decide whether the domain of Φ_x is empty.

To prove this proposition, we need a lemma, known as the s-m-n theorem. We state it for unary functions, but it holds for n-ary ones as well. Strictly speaking, the lemma below is the s-1-1 theorem. For the full statement and proof of the s-m-n theorem, see [366].

Lemma 2.1.5 (The s-m-n Theorem). Let g(x, y) be a partial computable function of two variables. Then there is a computable function s of one variable such that, for all x, y, Φ_{s(x)}(y) = g(x, y).

Proof. Given a Turing machine M computing g and a number x, we can build a Turing machine N that on input y simulates the action of writing the pair (x, y) on M's input tape and running M. We can then find an index s(x) for the function computed by N.

Proof of Proposition 2.1.4. We code the halting problem into the problem of deciding whether dom(Φ_x) = ∅. That is, we show that if we could decide whether dom(Φ_x) = ∅ then we could solve the halting problem.
Define a partial computable function of two variables by

g(x, y) = 1 if Φ_x(x) ↓, and g(x, y) ↑ if Φ_x(x) ↑.

Notice that g ignores its second input.
Via the s-m-n theorem, we can consider g(x, y) as a computable collection of partial computable functions. That is, there is a computable s such that, for all x, y, Φ_{s(x)}(y) = g(x, y). Now

dom(Φ_{s(x)}) = N if Φ_x(x) ↓, and dom(Φ_{s(x)}) = ∅ if Φ_x(x) ↑,
so if we could decide for a given x whether Φs(x) has empty domain, then we could solve the halting problem. We denote the result of running the Turing machine corresponding to Φe for s many steps on input x by Φe (x)[s]. This value can of course be either defined, in which case we write Φe (x)[s] ↓, or undefined, in which case we write Φe (x)[s] ↑. We will often be interested in the computability of families of functions. We say that f0 , f1 , . . . are uniformly (partial) computable if there is a (partial) computable function f of two variables such that f (n, x) = fn (x) for all n and x. It is not hard to see that f0 , f1 , . . . are uniformly partial computable iff there is a computable g such that fn = Φg(n) for all n.
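In programming terms, the s-m-n theorem used above is currying: from a program for g(x, y) and a fixed x we can effectively manufacture a program for the one-variable function y ↦ g(x, y). A sketch with Python callables standing in for indices (our illustration, not the book's formalism):

```python
from functools import partial

def g(x, y):
    """Any two-variable (partial) computable function."""
    return x * 10 + y

# "s(4)": an effectively obtained program for the section y -> g(4, y),
# playing the role of Phi_{s(4)}
phi_s4 = partial(g, 4)
```

Then `phi_s4(7)` returns 47, just as Φ_{s(4)}(7) = g(4, 7). The content of the theorem is that this passage from (a program for) g and x to an *index* s(x) is itself computable.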
2.2 Computable enumerability and Rice's Theorem

We now show that the reasoning used in the proof of Proposition 2.1.4 can be pushed much further. First we wish to regard all problems as coded by subsets of N. For example, the halting problem can be coded by ∅′ = {x : Φ_x(x) ↓} (or if we insist on the two-variable formulation, by {⟨x, y⟩ : Φ_x(y) ↓}). Next we need some terminology.

Definition 2.2.1. A set A ⊆ N is called

(i) computably enumerable (c.e.) if A = dom(Φ_e) for some e, and

(ii) computable if A and Ā = N \ A are both computably enumerable.

A set is co-c.e. if its complement is c.e. Thus a set is computable iff it is both c.e. and co-c.e. Of course, it also makes sense to say that A is computable if its characteristic function χ_A is computable, particularly since, as mentioned in Chapter 1, we identify sets with their characteristic functions. It is straightforward to check that A is computable in the sense of Definition 2.2.1 if and only if χ_A is computable. We let W_e denote the eth computably enumerable set, that is, dom(Φ_e), and let W_e[s] = {x ≤ s : Φ_e(x)[s] ↓}. We sometimes write W_e[s] as W_{e,s}.
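The stage sets W_e[s] can be made concrete. In this sketch (ours), `halting_time(x)` is a hypothetical stand-in for "the number of steps Φ_e takes on input x", returning `None` when Φ_e(x) ↑:

```python
def W_by_stages(halting_time, s):
    """W_e[s] = {x <= s : Phi_e(x) halts within s steps}."""
    return {x for x in range(s + 1)
            if halting_time(x) is not None and halting_time(x) <= s}

def halts_on_evens(x):
    """A 'machine' that halts on even x after x steps and diverges on odd x."""
    return x if x % 2 == 0 else None
```

Here `W_by_stages(halts_on_evens, 4)` is {0, 2, 4}. Note that the stages are nested, W_e[s] ⊆ W_e[s+1], and their union is the set of even numbers, the domain of the machine.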
We think of W_e[s] as the result of performing s steps in the enumeration of W_e. An index for a c.e. set A is an e such that W_e = A. We say that a family of sets A_0, A_1, … is uniformly computably enumerable if A_n = dom(f_n) for a family f_0, f_1, … of uniformly partial computable functions. It is easy to see that this condition is equivalent to saying that there is a computable g such that A_n = W_{g(n)} for all n, or that there is a c.e. set A such that A_n = {x : ⟨n, x⟩ ∈ A} for all n. A family of sets A_0, A_1, … is uniformly computable if the functions χ_{A_0}, χ_{A_1}, … are uniformly computable, which is equivalent to saying that both A_0, A_1, … and Ā_0, Ā_1, … are uniformly c.e.

Definition 2.2.1 suggests that one way to make a set A noncomputable is by ensuring that A is coinfinite and for all e, if W_e is infinite then A ∩ W_e ≠ ∅. A c.e. set A with these properties is called simple. An infinite set that contains no infinite c.e. set is called immune. (So a simple set is a c.e. set whose complement is immune.) Not all noncomputable c.e. sets are simple, since given any noncomputable c.e. set A, the set {2n : n ∈ A} is also c.e. and noncomputable, but is not simple.

The name computably enumerable comes from a notion of "effectively countable", via the following characterization, whose proof is straightforward.

Proposition 2.2.2. A set A is computably enumerable iff either A = ∅ or there is a total computable function f from N onto A. (If A is infinite then f can be chosen to be injective.)

Thus we can think of an infinite computably enumerable set as an effectively infinite list (but not necessarily in increasing numerical order). Note that computable sets correspond to decidable questions, since if A is computable, then either A ∈ {∅, N} or we can decide whether x ∈ A as follows. Let f and g be computable functions such that f(N) = A and g(N) = Ā. Now enumerate f(0), g(0), f(1), g(1), … until x occurs (as it must).
If x occurs in the range of f, then x ∈ A; if it occurs in the range of g, then x ∉ A.

It is straightforward to show that ∅′ is computably enumerable. Thus, by Proposition 2.1.3, it is an example of a computably enumerable set that is not computable. As we will show in Proposition 2.4.5, ∅′ is a complete computably enumerable set, in the sense that for any c.e. set A, there is an algorithm for computing A using ∅′. We will introduce another "highly knowledgeable" real, closely related to ∅′ and denoted by Ω, in Definition 3.13.6. Calude and Chaitin [48] pointed out that, in 1927, Émile Borel prefigured the idea of such knowledgeable reals by "defining" a real B such that the nth bit of B answers the nth question in an enumeration of all yes/no questions one can write down in French.

If A is c.e., then it clearly has a computable approximation, that is, a uniformly computable family {A_s}_{s∈ω} of sets such that A(n) = lim_s A_s(n) for
all n. (For example, {W_e[s]}_{s∈ω} is a computable approximation of W_e.) In Section 2.6, we will give an exact characterization of the sets that have computable approximations. In the particular case of c.e. sets, we can choose the A_s so that A_0 ⊆ A_1 ⊆ A_2 ⊆ ⋯. In the constructions we discuss, whenever we are given a c.e. set, we assume we have such an approximation, and think of A_s as the set of numbers put into A by stage s of the construction. In general, whenever we have an object X that is being approximated during a construction, we denote the stage s approximation to X by X[s].

An index set is a set A such that if x ∈ A and Φ_x = Φ_y then y ∈ A. For example, {x : dom(Φ_x) = ∅} is an index set. An index set can be thought of as coding a problem about computable functions (like the emptiness of domain problem) whose answer does not depend on the particular algorithm used to compute a function. Generalizing Proposition 2.1.4, we have the following result, which shows that nontrivial index sets are never computable. Its proof is very similar to that of Proposition 2.1.4.

Theorem 2.2.3 (Rice's Theorem [332]). An index set A is computable (and so the problem it codes is decidable) iff A = N or A = ∅.

Proof. Let A ∉ {∅, N} be an index set. Let e be such that dom(Φ_e) = ∅. We may assume without loss of generality that e ∉ A (the case e ∈ A being symmetric). Fix i ∈ A. By the s-m-n theorem, there is a computable s(x) such that, for all y ∈ N,

Φ_{s(x)}(y) = Φ_i(y) if Φ_x(x) ↓, and Φ_{s(x)}(y) ↑ if Φ_x(x) ↑.

If Φ_x(x) ↓ then Φ_{s(x)} = Φ_i and so s(x) ∈ A, while if Φ_x(x) ↑ then Φ_{s(x)} = Φ_e and so s(x) ∉ A. Thus, if A were computable, ∅′ would also be computable.

Of course, many nontrivial decision problems (such as the problem of deciding whether a natural number is prime, say) are not coded by index sets, and so can have decidable solutions.
2.3 The Recursion Theorem Kleene’s Recursion Theorem (also known as the Fixed Point Theorem) is a fundamental result in classical computability theory. It allows us to use an index for a computable function or c.e. set that we are building in a construction as part of that very construction. Thus it forms the theoretical underpinning of the common programming practice of having a routine make recursive calls to itself. Theorem 2.3.1 (Recursion Theorem, Kleene [209]). Let f be a total computable function. Then there is a number n, called a fixed point of f , such
14
2. Computability Theory
that Φ_n = Φ_{f(n)}, and hence W_n = W_{f(n)}. Furthermore, such an n can be computed from an index for f.

Proof. First define a total computable function d via the s-m-n Theorem so that

Φ_{d(e)}(k) = Φ_{Φ_e(e)}(k) if Φ_e(e) ↓, and Φ_{d(e)}(k) ↑ if Φ_e(e) ↑.

Let i be such that Φ_i = f ∘ d and let n = d(i). Notice that Φ_i is total. The following calculation shows that n is a fixed point of f:

Φ_n = Φ_{d(i)} = Φ_{Φ_i(i)} = Φ_{f∘d(i)} = Φ_{f(n)}.

The explicit definition of n given above can clearly be carried out computably given an index for f.

A longer but more perspicuous proof of the recursion theorem was given by Owings [312]; see also Soare [366, pp. 36–37]. There are many variations on the theme of the recursion theorem, such as the following one, which we will use several times below.

Theorem 2.3.2 (Recursion Theorem with Parameters, Kleene [209]). Let f be a total computable function of two variables. Then there is a total computable function h such that Φ_{h(y)} = Φ_{f(h(y),y)}, and hence W_{h(y)} = W_{f(h(y),y)}, for all y. Furthermore, an index for h can be obtained effectively from an index for f.

Proof. The proof is similar to that of the recursion theorem. Let d be a total computable function such that

Φ_{d(x,y)}(k) = Φ_{Φ_x(x,y)}(k) if Φ_x(x, y) ↓, and Φ_{d(x,y)}(k) ↑ if Φ_x(x, y) ↑.

Let i be such that Φ_i(x, y) = f(d(x, y), y) for all x and y, and let h(y) = d(i, y). Then

Φ_{h(y)} = Φ_{d(i,y)} = Φ_{Φ_i(i,y)} = Φ_{f(d(i,y),y)} = Φ_{f(h(y),y)}

for all y. The explicit definition of h given above can clearly be carried out computably given an index for f.
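The recursion theorem is what licenses programs that refer to their own index. Its most familiar finite shadow is a quine, a program that outputs its own source. The classic Python example (programming folklore, not from the book) works by the same trick as the proof above: apply a description to itself.

```python
# printing (src % src) reproduces the two lines of this program exactly:
# %r substitutes the repr of src into src itself
src = 'src = %r\nprint(src %% src)'
print(src % src)
```

Running this prints the program's own two lines verbatim. The recursion theorem guarantees that Turing machines can always achieve such self-reference, with no syntactic trickery required.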
It is straightforward to modify the above proof to show that if f is a partial computable function of two variables, then there is a total computable function h such that Φ_{h(y)} = Φ_{f(h(y),y)} for all y such that f(h(y), y) ↓.

In Section 3.5, we will see that there is a version of the recursion theorem for functions computed by prefix-free machines, a class of machines that will play a key role in this book.

Here is a very simple application of the recursion theorem. We show that ∅′ is not an index set. Let f be a computable function such that Φ_{f(n)}(n) ↓ and Φ_{f(n)}(m) ↑ for all m ≠ n. Let n be a fixed point for f, so that Φ_n = Φ_{f(n)}. Let m ≠ n be another index for Φ_n. Then Φ_n(n) ↓ and hence n ∈ ∅′, but Φ_m(m) ↑ and hence m ∉ ∅′. So ∅′ is not an index set. Note that this example also shows that there is a Turing machine that halts only on its own index.

The following is another useful application of the recursion theorem.

Theorem 2.3.3 (Slowdown Lemma, Ambos-Spies, Jockusch, Shore, and Soare [7]). Let {U_{e,s}}_{e,s∈ω} be a computable sequence of finite sets such that U_{e,s} ⊆ U_{e,s+1} for all e and s. Let U_e = ⋃_s U_{e,s}. There is a computable function g such that for all e, s, n, we have W_{g(e)} = U_e and if n ∉ U_{e,s} then n ∉ W_{g(e),s+1}.

Proof. Let f be a computable function such that W_{f(i,e)} behaves as follows. Given n, look for a least s such that n ∈ U_{e,s}. If such an s is found, ask whether n ∈ W_{i,s}. If not, then enumerate n into W_{f(i,e)}. By the recursion theorem with parameters, there is a computable function g such that W_{g(e)} = W_{f(g(e),e)} for every e. If n ∉ U_{e,s} for all s, then n ∉ W_{g(e)}. If n ∈ U_{e,s} then for the least such s it cannot be the case that n ∈ W_{g(e),s}, since in that case we would have n ∉ W_{f(g(e),e)} = W_{g(e)}. So n ∈ W_{f(g(e),e)} = W_{g(e)} but n ∉ W_{g(e),s}.

We will provide the details of applications of versions of the recursion theorem for several results in this chapter, but will assume familiarity with their use elsewhere in the book.
2.4 Reductions

The key concept used in the proof of Rice's Theorem is that of reduction, that is, the idea that "if we can do B then this ability also allows us to do A". In other words, questions about problem A are reducible to ones about problem B. We want to use this idea to define partial orderings, known as reducibilities, that calibrate problems according to computational difficulty. The idea is to have A ≤ B if the ability to solve B allows us also to solve A, meaning that B is "at least as hard as" A. In this section, we introduce several ways to formalize this notion, beginning with the best-known one, Turing reducibility. For any reducibility ≤_R, we write A ≡_R B, and say
that A and B are R-equivalent, if A ≤_R B and B ≤_R A. We write A <_R B if A ≤_R B but B ≰_R A.
2.4.1 Oracle machines and Turing reducibility

An oracle (Turing) machine is a Turing machine with an extra infinite read-only oracle tape, which it can access one bit at a time while performing its computation. If there is an oracle machine M that computes the set A when its oracle tape codes the set B, then we say that A is Turing reducible to B, or B-computable, or computable in B, and write A ≤_T B.¹ Note that, in computing A(n) for any given n, the machine M can make only finitely many queries to the oracle tape; in other words, it can access the value of B(m) for at most finitely many m. The definition of relative computability can be extended to functions in the obvious way.

We will also consider situations in which the oracle tape codes a finite string. In that case, if the machine attempts to make any queries beyond the length of the string, the computation automatically diverges. All the notation we introduce below for oracle tapes coding sets applies to strings as well.

For example, let E = {x : dom(Φ_x) = ∅}. In the proof of Proposition 2.1.4, we showed that ∅′ ≤_T E. Indeed the proof of Rice's Theorem demonstrates that, for a nontrivial index set I, we always have ∅′ ≤_T I. On the other hand, the unsolvability of the halting problem implies that ∅′ ≰_T ∅. (Note that X ≤_T ∅ iff X is computable. Indeed, if Y is computable, then X ≤_T Y iff X is computable.)

It is not hard to check that Turing reducibility is transitive and reflexive, and thus is a preordering on the subsets of N. The equivalence classes of the form deg(A) = {B : B ≡_T A} code a notion of equicomputability and are called Turing degrees (of unsolvability), though we often refer to them simply as degrees. We always use boldface lowercase letters such as a for Turing degrees. A Turing degree is computably enumerable if it contains a computably enumerable set (which does not imply that all the sets in the degree are c.e.).
The Turing degrees inherit a natural ordering from Turing reducibility: a ≤ b iff A ≤_T B for some (or equivalently all) A ∈ a and B ∈ b. We will relentlessly mix notation by writing, for example, A ≤_T b to mean that A ≤_T B for some (or equivalently all) B ∈ b. For sets A and B, let A ⊕ B = {2n : n ∈ A} ∪ {2n + 1 : n ∈ B}. If A ≡_T Â
¹We can also put resource bounds on our procedures. For example, if we count steps and ask that computations halt in a polynomial (in the length of the input) number of steps, then we arrive at the polynomial time computable functions and the notion of polynomial time (Turing) reducibility. We will not consider such reducibilities here; see Ambos-Spies and Mayordomo [10].
and B ≡_T B̂, then A ⊕ B ≡_T Â ⊕ B̂. Thus it makes sense to define the join a ∨ b of the degrees a and b to be the degree of A ⊕ B for some (or equivalently all) A ∈ a and B ∈ b. Kleene and Post [210] showed that the Turing degrees are not a lattice. That is, not every pair of degrees has a greatest lower bound.

For sets A_0, A_1, …, let ⨁_i A_i = {⟨i, n⟩ : n ∈ A_i}. Note that, while A_j ≤_T ⨁_i A_i for all j, the degree of ⨁_i A_i may be much greater than the degrees of the A_i's. For instance, for any function f, we can let A_i = {f(i)}, in which case each A_i is a singleton, and hence computable, but ⨁_i A_i ≡_T f.

We let 0 denote the degree of the computable sets. Note that each degree is countable and has only countably many predecessors (since there are only countably many oracle machines), so there are continuum many degrees.

For an oracle machine Φ, we write Φ^A for the function computed by Φ with oracle A (i.e., with A coded into its oracle tape). The analog of Proposition 2.1.2 holds for oracle machines. That is, there is an effective enumeration Φ_0, Φ_1, … of all oracle machines, and a universal oracle (Turing) machine Φ such that Φ^A(x, y) = Φ_x^A(y) for all x, y and all oracles A. We had previously defined Φ_e to be the eth partial computable function. However, we have already identified partial computable functions with Turing machines, and we can regard normal Turing machines as oracle machines with empty oracle tape; in fact it is convenient to identify the eth partial computable function Φ_e with Φ_e^∅. Thus there is no real conflict between the two notations, and we will not worry about the double meaning of Φ_e. When a set A has a computable approximation {A_s}_{s∈ω}, we write Φ_e^A(n)[s] to mean the result of running Φ_e with oracle A_s on input n for s many steps.

We also think of oracle machines as determining (Turing) functionals, that is, partial functions from 2^ω to 2^ω (or ω^ω). The value of the functional Φ on A is Φ^A. The use of a converging oracle computation Φ^A(n) is x + 1 for the largest number x such that the value of A(x) is queried during the computation. (If no such value is queried, then the use of the computation is 0.) We denote this use by φ^A(n). In general, when we have an oracle computation represented by an uppercase Greek letter, its use function is represented by the corresponding lowercase Greek letter.
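The use of an oracle computation can be made concrete by wrapping the oracle in a query-recording function. This is a toy model of ours (the functional `phi` is an arbitrary example, not notation from the book):

```python
def with_use(oracle):
    """Wrap an oracle so that every queried bit position is recorded."""
    queried = []
    def wrapped(n):
        queried.append(n)
        return oracle(n)
    return wrapped, queried

def phi(oracle, n):
    """A toy Turing functional: the parity of the oracle's first n+1 bits."""
    return sum(oracle(i) for i in range(n + 1)) % 2

def odds(n):
    """An example oracle: the characteristic function of the odd numbers."""
    return n % 2

wrapped, queried = with_use(odds)
value = phi(wrapped, 4)                   # only finitely many queries are made
use = max(queried) + 1 if queried else 0  # largest queried position + 1
```

Here `value` is 0 and `use` is 5: the computation queried bits 0 through 4, so by the use principle any oracle agreeing with `odds` on its first 5 bits yields the same output.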
Normally, we do not care about the exact position of the largest bit queried during a computation, and can replace the exact use function by any function that is at least as large. Furthermore, we may assume that an oracle machine cannot query its oracle’s nth bit before stage n. So we typically adopt the following useful conventions on a use function ϕ^A . 1. The use function is strictly increasing where defined, that is, ϕ^A (m) < ϕ^A (n) for all m < n such that both these values are defined, and
2. Computability Theory
similarly, when A is being approximated, ϕ^A (m)[s] < ϕ^A (n)[s] for all m < n and s such that both these values are defined. 2. When A is being approximated, ϕ^A (n)[s] ≤ ϕ^A (n)[t] for all n and s < t such that both these values are defined. 3. ϕ^A (n)[s] ≤ s for all n and s such that this value is defined. Although in a sense trivial, the following principle is quite important. Proposition 2.4.1 (Use Principle). Let Φ^A (n) be a converging oracle computation, and let B be a set such that B ↾ ϕ^A (n) = A ↾ ϕ^A (n). Then Φ^B (n) = Φ^A (n). Proof. The sets A and B give the same answers to all questions asked during the relevant computations, so the results must be the same. One important consequence of the use principle is that if A is c.e. and Φ^A is total, then A can compute the function f defined by letting f (n) be the least s by which the computation of Φ^A (n) has settled, i.e., Φ^A (n)[t] ↓ = Φ^A (n)[s] ↓ for all t > s. The reason is that f (n) is the least s such that Φ^A (n)[s] ↓ and A ↾ ϕ^A (n)[s] = As ↾ ϕ^A (n)[s]. For a set A, let A′ = {e : Φ^A_e (e) ↓}. The set A′ represents the halting problem relativized to A. The general process of extending a definition or result in the non-oracle case to the oracle case is known as relativization. For instance, Proposition 2.1.3 (the unsolvability of the halting problem) can be relativized with a completely analogous proof to show that A′ ≰T A for all A. Two good (but not hard) exercises are to show that A′ is c.e. in A, and that A ≤T B iff A′ ≤1 B′.
2.4.2 The jump operator and jump classes We often refer to A′ as the (Turing) jump of A. The jump operator is the function A ↦ A′. The nth jump of A, written as A^(n) , is the result of applying the jump operator n times to A. So, for example, A^(2) = A′′ and A^(3) = A′′′. If a = deg(A) then we write a′ for deg(A′), and similarly for the nth jump notation. This definition makes sense because A ≡T B implies A′ ≡T B′. Note that we have a hierarchy of degrees 0 < 0′ < 0′′ < · · · .
We also define the ω-jump of A as A^(ω) = ⊕n A^(n) . (We could continue to iterate the jump to define α-jumps for higher ordinals α, but we will not need these in this book.)
The halting problem, and hence the jump operator, play a fundamental role in much of computability theory. Closely connected to the jump operator, and also of great importance in computability theory, are the following jump classes. Definition 2.4.2. A set A is lown if A^(n) ≡T ∅^(n) . The low1 sets are called low . A set A ≤T ∅′ is highn if A^(n) ≡T ∅^(n+1) . The high1 sets are called high. More generally, we call an arbitrary set A high if A′ ≥T ∅′′. These classes are particularly well suited to studying ∅′-computable sets. The following classes are sometimes better suited to the general case. Definition 2.4.3. A set A is generalized lown (GLn ) if A^(n) ≡T (A ⊕ ∅′)^(n−1) . A set A is generalized highn (GHn ) if A^(n) ≡T (A ⊕ ∅′)^(n) . Note that if A ≤T ∅′, then A is GLn iff it is lown , and GHn iff it is highn . A degree is low if the sets it contains are low, and similarly for other jump classes. While jump classes are defined in terms of the jump operator, they often have more “combinatorial” characterizations. For example, we will see in Theorem 2.23.7 that a set A is high iff there is an A-computable function that dominates all computable functions, where f dominates g if f (n) ≥ g(n) for all sufficiently large n.
2.4.3 Strong reducibilities The reduction used in the proof of Rice’s Theorem is of a particularly strong type, since to decide whether x ∈ ∅′, we simply compute s(x) and ask whether s(x) ∈ A. Considering this kind of reduction leads to the following definition. Definition 2.4.4. We say that A is many-one reducible (m-reducible) to B, and write A ≤m B, if there is a total computable function f such that for all x, we have x ∈ A iff f (x) ∈ B.2 If B is ∅ or N then the only set m-reducible to B is B itself, but we will ignore these cases of the above definition. If the function f in the definition of m-reduction is injective, then we say that A is 1-reducible to B, and 2 In the context of complexity theory, resource-bounded versions of m-reducibility are at the basis of virtually all modern NP-completeness proofs. Although Cook’s original definition of NP-completeness was in terms of polynomial time Turing reducibility, the Karp version in terms of polynomial time m-reducibility is most often used. It is still an open question of structural complexity theory whether there is a set A such that the polynomial time Turing degree of A collapses to a single polynomial time m-degree.
write A ≤1 B. Note that if B is c.e. and A ≤m B then A is also c.e. Also note that ∅′ is m-complete, and even 1-complete, in the following sense. Proposition 2.4.5. If A is c.e. then A ≤1 ∅′. Proof. It is easy to define an injective computable function f such that for each n, the machine Φf (n) ignores its input and halts iff n enters A at some point. Then f witnesses the fact that A ≤1 ∅′. It is not difficult to construct sets A and B such that A ≤T B but A ≰m B. For example, N \ ∅′ ≤T ∅′, but N \ ∅′ ≰m ∅′, since N \ ∅′ is not c.e. It is also possible to construct such examples in which A and B are both c.e. Thus m-reducibility strictly refines Turing reducibility, and hence we say that m-reducibility is an example of a strong reducibility (which should not be confused with the notion of strong reducibility of mass problems introduced in Section 8.9). There are many other strong reducibilities. Their definitions depend on the types of oracle access used in the corresponding reductions. We mention two that will be important in this book. One of the key aspects of Turing reducibility is that a Turing reduction may be adaptive, in the sense that the number and type of queries made of the oracle depend upon the oracle itself. For instance, imagine a reduction that works as follows: on input x, the oracle is queried as to whether it contains some power of x. That is, the reduction asks whether 1 is in the oracle, then whether x is in the oracle, then whether x^2 is in the oracle, and so on. If the answer is yes for some x^n, then the reduction checks whether the least such n is even, in which case it outputs 0, or odd, in which case it outputs 1. Note that there is no limit to the number of questions asked of the oracle on a given input. This number depends on the oracle. Indeed, if the oracle happens to contain no power of x, then the computation on input x will not halt at all, and infinitely many questions will be asked.
Many naturally arising reductions do not have this adaptive property. One class of examples gives rise to the notion of truth table reducibility. A truth table on the variables v1 , v2 , . . . is a (finite) boolean combination σ of these variables. We write A ⊨ σ if σ holds with vn interpreted as n ∈ A. For example, σ might be ((v1 ∧ v2 ) ∨ (v3 → v4 )) ∧ v5 , in which case A ⊨ σ iff 5 ∈ A and either (1 ∈ A and 2 ∈ A) or (3 ∉ A or 4 ∈ A) (or both). Let σ0 , σ1 , . . . be an effective list of all truth tables. Definition 2.4.6. We say that A is truth table reducible to B, and write A ≤tt B, if there is a computable function f such that for all x, x ∈ A iff B ⊨ σf (x) . Notice that an m-reduction is a simple example of a tt-reduction. The relevant truth table for input n has a single entry vf (n) , where f is the function witnessing the given m-reduction. The following characterization follows easily from the compactness of 2^ω.
Proposition 2.4.7 (Nerode [293]). A Turing reduction Φ is a truth table reduction iff Φ^X is total for all oracles X. This characterization is particularly useful in the context of effective measure theory, as it means that truth table reductions can be used to transfer measures from one space to another. Examples can be found in the work of Reimann and Slaman [327, 328, 329] and the proof of Demuth’s Theorem 8.6.1 below. Concepts defined using Turing reducibility often have productive analogs for strong reducibilities. The following is an example we will use below. Definition 2.4.8. A set A is superlow if A′ ≡tt ∅′. One way to look at a tt-reduction is that it is one in which the oracle queries to be performed on a given input are predetermined, independently of the oracle, and the computation halts for every oracle. Removing the last restriction yields the notion of weak truth table reduction, which can also be thought of as bounded Turing reduction, in the sense that there is a computable bound, independent of the oracle, on the amount of the oracle to be queried on a given input. Definition 2.4.9. We say that A is weak truth table reducible (wtt-reducible) to B, and write A ≤wtt B, if there are a computable function f and an oracle Turing machine Φ such that Φ^B = A and ϕ^B (n) ≤ f (n) for all n. Definitions and notations that we introduced for Turing reducibility, such as the notion of degree, also apply to these strong reducibilities. In particular, we have the notions of truth table functional and weak truth table functional. For a Turing functional Φ to be a wtt-functional, we require that there be a single computable function f such that for all oracles B, if Φ^B (n) ↓ then ϕ^B (n) ≤ f (n). The best way to think of a tt-functional is as a total Turing functional, that is, a functional Φ such that Φ^B is total for all oracles B. The reducibilities we have seen so far calibrate sets into degrees of greater and greater fineness, in the order ≤T , ≤wtt , ≤tt , ≤m , ≤1 .
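The non-adaptive character of a tt-reduction can be modeled in a few lines of Python. This is our illustrative sketch, not from the text: we attach to each input a fixed finite list of query positions together with a total boolean predicate on the answers, so the reduction halts for every oracle.

```python
def tt_reduce(table_for, oracle):
    """table_for(x) returns (positions, predicate); decide membership of x
    by querying oracle only at those predetermined positions."""
    def member(x):
        positions, predicate = table_for(x)
        bits = [oracle(v) for v in positions]  # non-adaptive, fixed queries
        return predicate(bits)
    return member

# The text's example sigma = ((v1 and v2) or (v3 -> v4)) and v5:
sigma = ([1, 2, 3, 4, 5],
         lambda b: ((b[0] and b[1]) or ((not b[2]) or b[3])) and b[4])

B = lambda n: n in {2, 4, 5}          # a sample oracle
in_A = tt_reduce(lambda x: sigma, B)  # every input uses the same table here
```

Unlike the power-of-x reduction above, the queries here are computed before the oracle is ever consulted, and the predicate is total, so the computation converges on every oracle.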
2.4.4 Myhill’s Theorem The definition of ∅′ depends on the choice of an effective enumeration of the partial computable functions. By Proposition 2.4.5, however, any two versions of ∅′ are 1-equivalent. The following result shows that they are in fact equivalent up to a computable permutation of N. Theorem 2.4.10 (Myhill Isomorphism Theorem [291]). A ≡1 B iff there is a computable permutation h of N such that h(A) = B. Proof. The “if” direction is immediate, so assume that A ≤1 B via f and B ≤1 A via g. We define h in stages. We will ensure that the finite partial
function hs defined at each stage is injective and such that m ∈ A iff hs (m) ∈ B for all m ∈ dom hs . Let h0 = ∅. Suppose we have defined hs for s even. Let n be least such that hs (n) is not defined. List f (n), f ∘ hs^{-1} ∘ f (n), f ∘ hs^{-1} ∘ f ∘ hs^{-1} ∘ f (n), . . . until an element k ∉ rng hs is found. Note that each element of this list, and in particular k, is in B iff n ∈ A. Extend hs to hs+1 by letting hs+1 (n) = k. Now define hs+2 in the same way, with f , hs , dom, and rng replaced by g, h_{s+1}^{-1}, rng, and dom, respectively. Let h = ⋃s hs . It is easy to check that h is a computable permutation of N and that h(A) = B. Myhill [291] characterized the 1-complete sets using the following notions. Definition 2.4.11 (Post [316]). A set B is productive if there is a partial computable function h such that if We ⊆ B then h(e) ↓ ∈ B \ We . A c.e. set A is creative if its complement Ā is productive. The halting problem is an example of a creative set, since N \ ∅′ is productive via the identity function. Theorem 2.4.12 (Myhill’s Theorem [291]). The following are equivalent for a set A. (i) A is creative. (ii) A is m-complete (i.e., m-equivalent to ∅′). (iii) A is 1-complete (i.e., 1-equivalent to ∅′). (iv) A is equivalent to ∅′ up to a computable permutation of N. Proof. The Myhill Isomorphism Theorem shows that (iii) and (iv) are equivalent, and (iii) obviously implies (ii). To show that (ii) implies (i), suppose that ∅′ ≤m A via f . Let g be a computable function such that Wg(e) = {n : f (n) ∈ We } for all e, and let h(e) = f (g(e)). If We ⊆ Ā then Wg(e) ⊆ N \ ∅′, so g(e) ∈ (N \ ∅′) \ Wg(e) , and hence h(e) ∈ Ā \ We . Thus Ā is productive via h, that is, A is creative. To show that (i) implies (iii), we use the recursion theorem with parameters. Since A is c.e., we have A ≤1 ∅′, so it is enough to show that ∅′ ≤1 A. Let h be a function witnessing that Ā is productive. Let f be a computable function such that Wf (n,k) = {h(n)} if k ∈ ∅′ and Wf (n,k) = ∅ otherwise.
By the recursion theorem with parameters, there is an injective computable function g such that Wg(k) = Wf (g(k),k) for all k. Then k ∈ ∅′ ⇒ Wg(k) = {h(g(k))} ⇒ Wg(k) ⊄ Ā ⇒ h(g(k)) ∈ A and k ∉ ∅′ ⇒ Wg(k) = ∅ ⇒ Wg(k) ⊆ Ā ⇒ h(g(k)) ∈ Ā. Thus ∅′ ≤1 A via h ∘ g.
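One even stage of the back-and-forth construction in the Myhill Isomorphism Theorem can be sketched in Python. This is our illustration, not the book's: `f` stands for the given 1-reduction and `h` for the finite partial map built so far, represented as a dictionary.

```python
def extend(h, n, f):
    """Extend the finite injective map h (a dict) to a fresh input n by
    chasing the chain f(n), f(h^{-1}(f(n))), f(h^{-1}(f(h^{-1}(f(n))))), ...
    until a value outside rng(h) appears; every element of the chain is in
    B iff n is in A, so the extension preserves the invariant."""
    rng = set(h.values())
    k = f(n)
    while k in rng:
        m = next(m for m, v in h.items() if v == k)  # m = h^{-1}(k)
        k = f(m)  # next element of the chain
    h[n] = k
    return h
```

The odd stages are symmetric, running the same chase with `g` and the inverse of `h`, which is how the finite maps converge to a permutation.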
We will see a randomness-theoretic version of this result in Section 9.2.
2.5 The arithmetic hierarchy We define the notions of Σ0n , Π0n , and Δ0n sets as follows. A set A is Σ0n if there is a computable relation R(x1 , . . . , xn , y) such that y ∈ A iff ∃x1 ∀x2 ∃x3 ∀x4 · · · Qn xn R(x1 , . . . , xn , y), where the n quantifiers alternate, so that Qn is ∃ if n is odd and ∀ if n is even. In this definition, we could have had n alternating quantifier blocks, instead of single quantifiers, but we can always collapse two successive existential or universal quantifiers into a single one by using pairing functions, so that would not make a difference. The definition of A being Π0n is the same, except that the leading quantifier is a ∀ (but there still are n alternating quantifiers in total). It is easy to see that A is Π0n iff its complement Ā is Σ0n . Finally, we say a set is Δ0n if it is both Σ0n and Π0n (or equivalently, if both it and its complement are Σ0n ). Note that the Δ00 , Π00 , and Σ00 sets are all exactly the computable sets. The same is true of the Δ01 sets, as shown by Proposition 2.5.1 below. These notions give rise to Kleene’s arithmetic hierarchy, which can be pictured as follows.
Δ01 ⊂ Σ01 , Π01 ⊂ Δ02 ⊂ Σ02 , Π02 ⊂ Δ03 ⊂ · · ·
A set is arithmetic if it is in one of the levels of the arithmetic hierarchy. As we will see in the next section, there is a strong relationship between the arithmetic hierarchy and enumeration. The following is a simple example at the lowest level of the hierarchy. Proposition 2.5.1 (Kleene [208]). A set A is computably enumerable iff A is Σ01 . Proof. Suppose A is c.e. Then A = dom(Φe ) for some e, so n ∈ A iff ∃s Φe (n)[s] ↓. Thus A is Σ01 . Conversely, if A is Σ01 then for some computable R we have n ∈ A iff ∃x R(x, n). Define a partial computable function g by letting g(n) ↓ = 1 at stage s iff s ≥ n and there is an x < s such that R(x, n) holds. Then n ∈ A iff n ∈ dom(g), so A is c.e.
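The "if" direction of this proof is a dovetailed search for witnesses; a minimal Python sketch (ours, with the computable relation R given as a predicate) makes the stagewise enumeration explicit.

```python
def enumerate_sigma1(R, steps):
    """Dovetail the witness search: at stage s, check R(x, n) for all
    x, n < s. Returns the finite approximation to {n : exists x R(x, n)}
    seen after `steps` stages; the union over all stages is the c.e. set."""
    found = set()
    for s in range(steps):
        for n in range(s):
            for x in range(s):
                if R(x, n):
                    found.add(n)
    return found

# Example: the squares, via R(x, n) iff x*x == n
squares_so_far = enumerate_sigma1(lambda x, n: x * x == n, 6)
```

The approximations grow monotonically with the number of stages, which is exactly the sense in which a Σ01 set is enumerable rather than decidable.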
2.6 The Limit Lemma and Post’s Theorem There is an important generalization, due to Post, of Proposition 2.5.1. It ties in the arithmetic hierarchy with the degrees of unsolvability, and gives completeness properties of degrees of the form 0^(n) , highlighting their importance. In this section we will look at this and related characterizations, beginning with Shoenfield’s Limit Lemma. Saying that a set A is c.e. can be thought of as saying that A has a computable approximation that, for each n, starts out by saying that n ∉ A, and then changes its mind at most once. More precisely, there is a computable binary function f such that for all n we have A(n) = lims f (n, s), with f (n, 0) = 0 and f (n, s + 1) ≠ f (n, s) for at most one s. Generalizing this idea, the limit lemma characterizes the sets computable from the halting problem as those that have computable approximations with finitely many mind changes, and hence are “effectively approximable”. (In other words, the sets computable from ∅′ are exactly those that have computable approximations, as defined in Section 2.2.) Theorem 2.6.1 (Limit Lemma, Shoenfield [354]). For a set A, we have A ≤T ∅′ iff there is a computable binary function g such that, for all n, (i) lims g(n, s) exists (i.e., |{s : g(n, s) ≠ g(n, s + 1)}| < ∞), and (ii) A(n) = lims g(n, s).
Proof. (⇒) Suppose A = Φ^{∅′}_e . Define g by letting g(n, s) = 0 if either Φ^{∅′}_e (n)[s] ↑ or Φ^{∅′}_e (n)[s] ↓ = 0, and letting g(n, s) = 1 otherwise. Fix n, and let s be a stage such that ∅′_t ↾ ϕ^{∅′}_e (n) = ∅′ ↾ ϕ^{∅′}_e (n) for all t ≥ s. By the use principle (Proposition 2.4.1), g(n, t) = Φ^{∅′}_e (n)[t] = Φ^{∅′}_e (n) = A(n) for all t ≥ s. Thus g has the required properties. (⇐) Suppose such a function g exists. Without loss of generality, we may assume that g(n, 0) = 0 for all n. To show that A ≤T ∅′, it is enough to build a c.e. set B such that A ≤T B, since by Proposition 2.4.5, every c.e. set is computable in ∅′. We put ⟨n, k⟩ into B whenever we find that |{s : g(n, s) ≠ g(n, s + 1)}| > k. Now define a Turing reduction Γ as follows. Given an oracle X, on input n, search for the least k such that ⟨n, k⟩ ∉ X, and if one is found, then output 0 if k is even and 1 if k is odd. Clearly, Γ^B = A. As in the case of c.e. sets, whenever we are given a set A computable in ∅′, we assume that we have a fixed computable approximation A0 , A1 , . . . to A; that is, we assume we have a computable function g as in the limit lemma, and write As for the set of all n such that g(n, s) = 1. We may always assume without loss of generality that A0 = ∅ and As ⊆ [0, s]. Intuitively, the proof of the “if” direction of the limit lemma boils down to saying that, since (by Propositions 2.4.5 and 2.5.1) the set ∅′ can decide
whether ∃s > t (g(n, s) ≠ g(n, s + 1)) for any n and t, it can also compute lims g(n, s). Corollary 2.6.2. For a set A, the following are equivalent. (i) A ≤tt ∅′. (ii) A ≤wtt ∅′. (iii) There are a computable binary function g and a computable function h such that, for all n, (a) |{s : g(n, s) ≠ g(n, s + 1)}| < h(n), and (b) A(n) = lims g(n, s). Proof. We know that (i) implies (ii). The proof that (ii) implies (iii) is essentially the same as that of the “if” direction of Theorem 2.6.1, together with the remark that if Φ is a wtt-reduction then we can computably bound the number of times the value of Φ^{∅′}(n)[s] can change as s increases. The proof that (iii) implies (i) is much the same as that of the “only if” direction of Theorem 2.6.1, except that we can now make Γ into a tt-reduction because we have to check whether ⟨n, k⟩ ∉ X only for k < h(n). Sets with the properties given in Corollary 2.6.2 are called ω-c.e., and will be further discussed in Section 2.7. As we have seen, we often want to relativize results, definitions, and proofs in computability theory. The limit lemma relativizes to show that A ≤T B′ iff there is a B-computable binary function f such that A(n) = lims f (n, s) for all n. Combining this fact with induction, we have the following generalization of the limit lemma. Corollary 2.6.3 (Limit Lemma, Strong Form, Shoenfield [354]). Let k ≥ 1. For a set A, we have A ≤T ∅^(k) iff there is a computable function g of k + 1 variables such that A(n) = lims1 lims2 . . . limsk g(n, s1 , s2 , . . . , sk ) for all n. We now turn to Post’s characterization of the levels of the arithmetic hierarchy. Let C be a class of sets (such as a level of the arithmetic hierarchy). A set A is C-complete if A ∈ C and B ≤T A for all B ∈ C. If in fact B ≤m A for all B ∈ C, then we say that A is C m-complete, and similarly for other strong reducibilities. Theorem 2.6.4 (Post’s Theorem [317]). Let n ≥ 0. (i) A set B is Σ0n+1 ⇔ B is c.e. in some Σ0n set ⇔ B is c.e. in some Π0n set.
(ii) The set ∅^(n) is Σ0n m-complete. (iii) A set B is Σ0n+1 iff B is c.e. in ∅^(n) . (iv) A set B is Δ0n+1 iff B ≤T ∅^(n) .
Proof. (i) First note that if B is c.e. in A then B is also c.e. in Ā. Thus, being c.e. in a Σ0n set is the same as being c.e. in a Π0n set, so all we need to show is that B is Σ0n+1 iff B is c.e. in some Π0n set. The “only if” direction has the same proof as the corresponding part of Proposition 2.5.1, except that the computable relation R in that proof is now replaced by a Π0n relation R. For the “if” direction, let B be c.e. in some Π0n set A. Then, by Proposition 2.5.1 relativized to A, there is an e such that n ∈ B iff ∃s ∃σ ≺ A (Φ^σ_e (n)[s] ↓). (2.1)
The property in parentheses is computable, while the property σ ≺ A is a combination of a Π0n statement (asserting that certain elements are in A) and a Σ0n statement (asserting that certain elements are not in A), and hence is Δ0n+1 . So (2.1) is a Σ0n+1 statement. (ii) We proceed by induction. By Propositions 2.4.5 and 2.5.1, ∅′ is Σ01 m-complete. Now assume by induction that ∅^(n) is Σ0n m-complete. Since ∅^(n+1) is c.e. in ∅^(n) , it is Σ0n+1 . Let C be Σ0n+1 . By part (i), C is c.e. in some Σ0n set, and hence it is c.e. in ∅^(n) . As in the unrelativized case, it is now easy to define a computable function f such that n ∈ C iff f (n) ∈ ∅^(n+1) . (In more detail, let e be such that C = W^{∅^(n)}_e , and define f so that for all oracles X and all n and x, we have Φ^X_{f (n)} (x) ↓ iff n ∈ W^X_e .) (iii) By (i) and (ii), and the fact that if X is c.e. in Y and Y ≤T Z, then X is also c.e. in Z. (iv) The set B is Δ0n+1 iff B and B̄ are both Σ0n+1 , and hence both c.e. in ∅^(n) by (iii). But a set and its complement are both c.e. in X iff the set is computable in X. Thus B is Δ0n+1 iff B ≤T ∅^(n) . Note in particular that the Δ02 sets are exactly the ∅′-computable sets, that is, the sets that have computable approximations. There are many “natural” sets, such as certain index sets, that are complete for various levels of the arithmetic hierarchy. The following result gives a few examples. Theorem 2.6.5.
(i) Fin = {e : We is finite} is Σ02 m-complete.
(ii) Tot = {e : Φe is total} and Inf = {e : We is infinite} are both Π02 m-complete. (iii) Cof = {e : We is cofinite} is Σ03 m-complete. Proof sketch. None of these are terribly difficult. We do (i) as an example. We know that ∅′′ is Σ02 m-complete by Post’s Theorem, and it is easy to check that Fin is itself Σ02 , so it is enough to m-reduce ∅′′ to Fin. Using the s-m-n theorem, we can define a computable function f such that for all s and e, we have s ∈ Wf (e) iff there is a t ≥ s such that either Φ^{∅′}_e (e)[t] ↑ or ∅′_{t+1} ↾ ϕ^{∅′}_e (e)[t] ≠ ∅′_t ↾ ϕ^{∅′}_e (e)[t]. Then f (e) ∈ Fin iff Φ^{∅′}_e (e) ↓ iff e ∈ ∅′′.
Part (ii) is similar, and (iii) is also similar but more intricate. See Soare [366] for more details.
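The computable approximations of Theorem 2.6.1 and Corollary 2.6.2 can be simulated in miniature. Here is a toy Python sketch (ours) of a limit approximation with finitely many mind changes; the point the comment makes is the essential one: the limit exists, but no algorithm can certify when it has been reached.

```python
def approx_history(g, n, stages):
    """The sequence g(n, 0), ..., g(n, stages-1). If g(n, .) has settled
    by `stages`, the last entry equals lim_s g(n, s); in general no
    algorithm can certify that it has settled, which is exactly why such
    a set is only Delta_2 rather than computable."""
    return [g(n, s) for s in range(stages)]

# A toy approximation: g(n, s) flips at each stage s < n, then settles
# on the value n % 2 -- finitely many mind changes, as the lemma requires.
g = lambda n, s: min(s, n) % 2
limit_value = lambda n: approx_history(g, n, n + 10)[-1]
```

Since the number of mind changes of g(n, ·) here is computably bounded (by n), this toy approximation is in fact of the ω-c.e. kind described in Corollary 2.6.2.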
2.7 The difference hierarchy The arithmetic hierarchy gives us one way to extend the concept of computable enumerability. Another way to do so is via the difference hierarchy, which is defined as follows. Definition 2.7.1. Let n ≥ 1. A set A is n-computably enumerable (n-c.e.) if there is a computable binary function f such that for all x, (i) f (x, 0) = 0, (ii) A(x) = lims f (x, s), and (iii) |{s : f (x, s + 1) ≠ f (x, s)}| ≤ n. Thus the 1-c.e. sets are simply the c.e. sets. The 2-c.e. sets are often called d.c.e., which stands for “difference of c.e.”, because of the easily proved fact that A is 2-c.e. iff there are c.e. sets B and C such that A = B \ C. We have seen the following definition in Section 2.6. Definition 2.7.2. A set A is ω-c.e. if there are a computable binary function f and a computable unary function g such that for all x, (i) f (x, 0) = 0, (ii) A(x) = lims f (x, s), and (iii) |{s : f (x, s + 1) ≠ f (x, s)}| ≤ g(x). In Corollary 2.6.2, we saw that the ω-c.e. sets are exactly those that are (w)tt-reducible to ∅′. The following fact was probably known before it was explicitly stated by Arslanov [14]. Proposition 2.7.3 (Arslanov [14]). If A is ω-c.e. then there are a B ≡m A and a computable binary function h such that for all x, (i) h(x, 0) = 0, (ii) B(x) = lims h(x, s), and (iii) |{s : h(x, s + 1) ≠ h(x, s)}| ≤ x. Proof. Let g be as in Definition 2.7.2. Without loss of generality, we may assume that g is increasing. Let B = {g(x) : x ∈ A}. Then B clearly has the desired properties. It is possible to define the concept of α-c.e. set for all computable ordinals α, forming what is known as the Ershov hierarchy. The details of the definition depend on ordinal notations. We will not use this concept in any
significant way, so for simplicity we assume familiarity with the definition of a computable ordinal and Kleene’s ordinal notations. For those unfamiliar with these concepts, we refer to Rogers [334] and Odifreddi [310, 311]. See Epstein, Haas, and Kramer [140] for more on α-c.e. sets and degrees. Definition 2.7.4. Let α be a computable ordinal. A set A is α-c.e. relative to a computable system S of notations for α if there is a partial computable function f such that for all x, we have A(x) = f (x, b) for the S-least notation b such that f (x, b) converges. It is not hard to check that this definition agrees with the previous ones for α ≤ ω, independently of the system of notations chosen. As in the c.e. case, we say that a degree is n-c.e. if it contains an n-c.e. set, and similarly for ω-c.e. and α-c.e. degrees.
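The fact that a 2-c.e. set is a difference B \ C of c.e. sets is easy to picture stagewise. The following Python sketch is ours; `B_enum` and `C_enum` model the stagewise enumerations of the two c.e. sets.

```python
def dce_approx(B_enum, C_enum, x, s):
    """Approximation at stage s to membership of x in B \\ C, where
    B_enum(x, s) / C_enum(x, s) report whether x has been enumerated into
    B / C by stage s. The value starts at 0, can change to 1 (when x
    enters B), and back to 0 (when x enters C): at most two mind changes,
    which is exactly the 2-c.e. condition."""
    return 1 if B_enum(x, s) and not C_enum(x, s) else 0

# Example: x = 0 enters B at stage 3 and enters C at stage 7.
B_enum = lambda x, s: s >= 3
C_enum = lambda x, s: s >= 7
history = [dce_approx(B_enum, C_enum, 0, s) for s in range(10)]
```

Iterating this idea, with n alternations instead of two, gives the general n-c.e. levels of the difference hierarchy.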
2.8 Primitive recursive functions The class of primitive recursive functions is the smallest class of functions satisfying the following properties (where a 0-ary function is just a natural number). (i) The successor function n ↦ n + 1 is primitive recursive. (ii) For each k and m, the constant function (n0 , . . . , nk−1 ) ↦ m is primitive recursive. (iii) For each k and i < k, the projection (n0 , . . . , nk−1 ) ↦ ni is primitive recursive. (iv) If f and g0 , . . . , gk−1 are primitive recursive, f is k-ary, and the gi are all j-ary, then (n0 , . . . , nj−1 ) ↦ f (g0 (n0 , . . . , nj−1 ), . . . , gk−1 (n0 , . . . , nj−1 )) is primitive recursive. (v) If the k-ary function g and the (k + 2)-ary function h are primitive recursive, then so is the function f defined by f (0, n0 , . . . , nk−1 ) = g(n0 , . . . , nk−1 ) and f (i + 1, n0 , . . . , nk−1 ) = h(i, f (i, n0 , . . . , nk−1 ), n0 , . . . , nk−1 ). It is easy to see that every primitive recursive function is computable. However, it is also easy to see that the primitive recursive functions are total and can be effectively listed, so there are computable functions that
are not primitive recursive.3 A famous example is Ackermann’s function
A(m, n) = n + 1 if m = 0,
A(m, n) = A(m − 1, 1) if m > 0 and n = 0,
A(m, n) = A(m − 1, A(m, n − 1)) otherwise.
Many natural computable functions are primitive recursive, though, and it is sometimes useful to work with an effectively listable class of total computable functions, so we will use primitive recursive functions in a few places below.
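The five schemes, and Ackermann's function, can be rendered directly in Python. This is our illustrative sketch; the combinator names (`succ`, `const`, `proj`, `comp`, `primrec`) are ours, not the book's.

```python
from functools import lru_cache

succ = lambda n: n + 1                    # scheme (i): successor
const = lambda m: (lambda *args: m)       # scheme (ii): constant functions
proj = lambda i: (lambda *args: args[i])  # scheme (iii): projections

def comp(f, *gs):                         # scheme (iv): composition
    return lambda *args: f(*(g(*args) for g in gs))

def primrec(g, h):                        # scheme (v): primitive recursion
    def f(i, *args):
        acc = g(*args)
        for j in range(i):
            acc = h(j, acc, *args)
        return acc
    return f

# Addition and multiplication built purely from the schemes:
add = primrec(proj(0), comp(succ, proj(1)))           # add(i, n) = i + n
mul = primrec(const(0), comp(add, proj(2), proj(1)))  # mul(i, n) = i * n

# Ackermann's function: clearly computable, but it eventually dominates
# every function obtainable from the five schemes, so it is not
# primitive recursive.
@lru_cache(maxsize=None)
def ackermann(m, n):
    if m == 0:
        return n + 1
    if n == 0:
        return ackermann(m - 1, 1)
    return ackermann(m - 1, ackermann(m, n - 1))
```

Note that every function built from the combinators terminates by construction (the recursion in `primrec` is a bounded loop), whereas the nested recursion in `ackermann` has no such a priori bound.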
3 We can obtain the partial computable functions by adding to the five items above an unbounded search scheme. See Soare [366] for more details and further discussion of primitive recursive functions.
2.9 A note on reductions There are several ways to describe a reduction procedure. Formally, A ≤T B means that there is an e such that Φ^B_e = A. In practice, though, we never actually build an oracle Turing machine to witness the fact that A ≤T B, but avail ourselves of the Church-Turing Thesis to informally describe a reduction Γ such that Γ^B = A. One such description is given in the proof of the ⇐ direction of Shoenfield’s Limit Lemma (Theorem 2.6.1). This proof gives an example of a static definition of a reduction procedure, in that the action of Γ is specified by a rule, rather than being defined during a construction. As an example of a dynamic definition of a reduction procedure, we reprove the ⇐ direction of the limit lemma. Recall that we are given a computable binary function g such that, for all n, (i) lims g(n, s) exists and (ii) A(n) = lims g(n, s). We wish to show that A ≤T ∅′, by building a c.e. set B and a reduction Γ with Γ^B = A. We simultaneously construct B and Γ in stages. We begin with B0 = ∅. For each n, we leave the value of Γ^B (n) undefined until stage n. At stage n, we let Γ^B (n)[n] = g(n, n) with use γ^B (n)[n] = ⟨n, 0⟩ + 1. Furthermore, at stage s > 0 we proceed as follows for each n < s. If g(n, s) = g(n, s − 1), then we change nothing. That is, we let Γ^B (n)[s] = Γ^B (n)[s − 1] with the same use γ^B (n)[s] = γ^B (n)[s − 1]. Otherwise, we enumerate γ^B (n)[s − 1] − 1 into B, which allows us to redefine Γ^B (n)[s] = g(n, s), with use γ^B (n)[s] = ⟨n, k⟩ + 1 for the least ⟨n, k⟩ ∉ B. It is not hard to check that Γ^B = A, and that in fact this reduction is basically the same as that in the original proof of the limit lemma (if we
assume without loss of generality that g(n, n) = 0 for all n), at least as far as its action on oracle B goes. More generally, the rules for a reduction Δ^C to a c.e. set C are as follows, for each input n. 1. Initially Δ^C (n)[0] ↑. 2. At some stage s we must define Δ^C (n)[s] ↓ = i for some value i, with some use δ^C (n)[s]. By this action, we are promising that Δ^C (n) = i unless C ↾ δ^C (n)[s] ≠ Cs ↾ δ^C (n)[s]. 3. The convention now is that δ^C (n)[t] = δ^C (n)[s] for t > s unless Ct ↾ δ^C (n)[s] ≠ Cs ↾ δ^C (n)[s]. Should we find a stage t > s such that Ct ↾ δ^C (n)[s] ≠ Cs ↾ δ^C (n)[s], we then again have Δ^C (n)[t] ↑. 4. We now again must have a stage u ≥ t at which we define Δ^C (n)[u] ↓ = j for some value j, with some use δ^C (n)[u]. We then return to step 3, with u in place of s. 5. If Δ^C is to be total, we have to ensure that we stay at step 3 permanently from some point on. That is, there must be a stage u at which we define Δ^C (n)[u] and δ^C (n)[u], such that C ↾ δ^C (n)[u] = Cu ↾ δ^C (n)[u]. One way to achieve this is to ensure that, from some point on, whenever we redefine δ^C (n)[u], we set it to the same value. In some constructions, C will be given to us, but in others we will build it along with Δ. In this case, when we want to redefine the value of the computation Δ^C (n) at stage s, we will often be able to do so by putting a number less than δ^C (n)[s] into C (as we did in the limit lemma example above). There is a similar method of building a reduction Δ^C when C is not c.e., but merely Δ02 . The difference is that now we must promise that if Δ^C (n)[s] is defined and there is a t > s such that Ct ↾ δ^C (n)[s] = Cs ↾ δ^C (n)[s], then Δ^C (n)[t] = Δ^C (n)[s] and δ^C (n)[t] = δ^C (n)[s]. A more formal view of a reduction is as a partial computable map from strings to strings obeying certain continuity conditions. In this view, a reduction Γ^B = A is specified by a partial computable function f : 2^{<ω} → 2^{<ω} such that 1. if σ ≺ B, then f (σ) ≺ A; 2. for all σ ≺ τ , if both f (σ) ↓ and f (τ ) ↓, then f (σ) ⪯ f (τ ); and 3. for all τ ≺ A there is a σ ≺ B such that τ ≺ f (σ). The reduction in the proof of the ⇐ direction of the limit lemma can be viewed in this way by letting f (σ) be the longest string τ such that, for all n < |τ |, there is a k with σ(⟨n, k⟩) = 0, and τ (n) = 0 iff the first such k to be found is even.
Notice that this last method implies the following interesting observation: A function from 2^ω to 2^ω is continuous iff it is computable relative to some oracle.
2.10 The finite extension method In this section, we introduce one of the main techniques used in classical degree theory. We will refine this technique in the next section to what is known as the finite injury priority method. The dynamic construction of a reduction from B to A in the limit lemma, discussed in the previous section, has much in common with many proofs in classical computability theory. We perform a construction where some object is built in stages. Typically, we have some overall goal that we break down into smaller subgoals that we argue are all met in the limit. In this case, the goal is to construct the reduction Γ^B = A. We break this goal into the subgoals of defining Γ^B (n) for each n, and we accomplish these subgoals by using the information supplied by our “opponent”, who is feeding us information about the universe, in this case the values g(n, s). As an archetype for such proofs, think of Cantor’s proof that the collection of all infinite binary sequences is uncountable. One can conceive of this proof as follows. Suppose we could list the infinite binary sequences as S = {S0 , S1 , . . .}, with Se = se,0 se,1 . . . . It is our goal to construct a binary sequence U = u0 u1 . . . that is not on the list S. We think of the construction as a game against our opponent who must supply us with S. We construct U in stages, at stage t specifying only u0 . . . ut , the initial segment of U of length t + 1. Our list of requirements is the decomposition of the overall goal into subgoals of the form Re : U ≠ Se . There is one such requirement for each e ∈ N. Of course, we know how to satisfy these requirements. At stage e, we simply ensure that ue ≠ se,e by setting ue = 1 − se,e . This action ensures that U ≠ Se for all e; in other words, all the requirements are met. This fact contradicts the assumption that S lists all infinite binary sequences, as U is itself an infinite binary sequence.
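The diagonal strategy just described is short enough to write out. In the following Python sketch (ours), an infinite binary sequence is represented as a function from positions to bits.

```python
def diagonal(S):
    """Given a listing S (mapping each index e to the e-th sequence),
    return U with U(e) = 1 - S(e)(e): U differs from the e-th listed
    sequence at position e, so U is not on the list."""
    return lambda e: 1 - S(e)(e)

# A toy "list": S_e is the sequence that is 1 exactly at position e.
S = lambda e: (lambda n: 1 if n == e else 0)
U = diagonal(S)
# Here U is the all-zero sequence, which indeed differs from each S_e
# (at position e), so it cannot appear anywhere on the list.
```

Requirement R_e is met at stage e and never revisited; the finite injury constructions mentioned below complicate exactly this point, since later action may injure earlier requirements.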
Notice that if we define a real number to be computable if it has a computable binary expansion, then the above proof can be used to show that there is no computable listing of all the computable reals (modulo some unimportant technicalities involving nonunique representations of reals). We will return to the topic of effective real numbers in Chapter 5. Clearly, the proof of the unsolvability of the halting problem can also be similarly recast, where this time the eth requirement asks us to invalidate the eth member of some supposed list of all algorithms deciding whether Φ_e(e)↓.
2. Computability Theory
While later results will be more complicated than these easy examples, the overall structure of the above should be kept in mind: Our constructions will be in finite steps, where one or more objects are constructed stage by stage in finite pieces. These objects will be constructed to satisfy a list of requirements. The strategy we use will be dictated by how our opponent reveals the universe to us. Our overall goal is to satisfy all requirements in the limit.

To finish this section, we look at a slightly more involved version of this technique. While we know that there are uncountably many Turing degrees, the only ones we have seen so far are the iterates of the halting problem. Rice’s Theorem 2.2.3 shows that all nontrivial index sets have degree at least 0′. In 1944, Post [316] observed that all computably enumerable problems known at the time were either computable or of Turing degree 0′. He asked the following question.

Question 2.10.1 (Post’s Problem). Does there exist a computably enumerable degree a with 0 < a < 0′?

As we will see in the next section, Post’s Problem was finally given a positive answer by Friedberg [161] and Muchnik [284], using a new and ingenious method called the priority method. This method was an effectivization of an earlier method discovered by Kleene and Post [210]. The latter is called the finite extension method, and was used to prove the following result.

Theorem 2.10.2 (Kleene and Post [210]). There are degrees a and b, both below 0′, such that a | b. In other words, there are ∅′-computable sets that are incomparable under Turing reducibility.⁴

Proof. We construct A = lim_s A_s and B = lim_s B_s in stages, to meet the following requirements for all e ∈ N.

R_{2e} : Φ_e^A ≠ B.

R_{2e+1} : Φ_e^B ≠ A.

Note that if A ≤_T B then there must be some procedure Φ_e with Φ_e^B = A. Hence, if we meet all our requirements then A ≰_T B, and similarly B ≰_T A, so that A and B have incomparable Turing degrees.
The fact that A, B ≤_T ∅′ will come from the construction and will be observed at the end. The argument is by finite extensions, in the sense that at each stage s we specify a finite portion A_s of A and a finite portion B_s of B. These finite portions A_s and B_s will be specified as binary strings. The key invariant that we need to maintain throughout the construction is that A_s ⪯ A_u and

⁴ The difference between the Kleene-Post Theorem and the solution to Post’s Problem is that the degrees constructed in the proof of Theorem 2.10.2 are not necessarily computably enumerable, but merely Δ⁰₂.
B_s ⪯ B_u for all stages u ≥ s. Thus, after stage s we can only extend the portions of A and B that we have specified by stage s, which is a hallmark of the finite extension method.

Construction.

Stage 0. Let A_0 = B_0 = λ (the empty string).

Stage 2e + 1. (Attend to R_{2e}.) We will have specified A_{2e} and B_{2e} at stage 2e. Pick some number x, called a witness, with x ≥ |B_{2e}|, and ask whether there is a string σ properly extending A_{2e} such that Φ_e^σ(x)↓. If such a σ exists, then let A_{2e+1} be the length-lexicographically least such σ. Let B_{2e+1} be the string of length x + 1 extending B_{2e} such that B_{2e+1}(n) = 0 for all n with |B_{2e}| ≤ n < x and B_{2e+1}(x) = 1 − Φ_e^σ(x). If no such σ exists, then let A_{2e+1} = A_{2e}0 and B_{2e+1} = B_{2e}0.

Stage 2e + 2. (Attend to R_{2e+1}.) Define A_{2e+2} and B_{2e+2} by proceeding in the same way as at stage 2e + 1, but with the roles of A and B reversed.

End of Construction.

Verification. First note that we have A_0 ≺ A_1 ≺ · · · and B_0 ≺ B_1 ≺ · · · , so A and B are well-defined. We now prove that we meet the requirement R_n for each n; in fact, we show that we meet R_n at stage n + 1. Suppose that n = 2e (the case where n is odd being completely analogous). At stage n + 1, there are two cases to consider. Let x be as defined at that stage.

If there is a σ properly extending A_n with Φ_e^σ(x)↓, then our action is to adopt such a σ as A_{n+1} and define B_{n+1} so that Φ_e^{A_{n+1}}(x) ≠ B_{n+1}(x). Since A extends A_{n+1} and Φ_e^{A_{n+1}}(x)↓, it follows that A and A_{n+1} agree on the use of this computation, and hence Φ_e^A(x) = Φ_e^{A_{n+1}}(x). Since B extends B_{n+1}, we also have B(x) = B_{n+1}(x). Thus Φ_e^A(x) ≠ B(x), and R_n is met.

If there is no σ extending A_n with Φ_e^σ(x)↓, then since A is an extension of A_n, it must be the case that Φ_e^A(x)↑, and hence R_n is again met.

Finally we argue that A, B ≤_T ∅′. Notice that the construction is in fact fully computable except for the decision as to which case we are in at a given stage.
There we must decide whether there is a convergent computation of a particular kind. For instance, at stage 2e + 1 we must decide whether the following holds:

∃τ ∃s [τ ≻ A_{2e} ∧ Φ_e^τ(x)[s]↓].
(2.2)
This is a Σ⁰₁ question, uniformly in x, and hence can be decided by ∅′.⁵ The reasoning at the end of the above proof is quite common: we often make use of the fact that ∅′ can answer any Δ⁰₂ question, and hence any Σ⁰₁ or Π⁰₁ question.

⁵ More precisely, we use the s-m-n theorem to construct a computable ternary function f such that for all e, σ, x, and z, we have Φ_{f(e,σ,x)}(z)↓ iff (2.2) holds. Then (2.2) holds iff f(e, σ, x) ∈ ∅′.
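To make the shape of a stage concrete, here is a toy Python sketch of the action for R_{2e}. Everything here is an illustrative assumption: the functional `phi` is a made-up example, and the unbounded ∃σ question, answered in the real construction by the ∅′ oracle, is replaced by a bounded search.

```python
from itertools import product

def attend_R2e(phi, e, A, B, max_len):
    """One Kleene-Post stage for R_2e: try to force Phi_e^A(x) != B(x).
    A, B are the current finite binary strings. The unbounded
    exists-sigma question is approximated by searching proper
    extensions of A up to length max_len (a toy stand-in for the
    emptyset' oracle)."""
    x = len(B)                      # witness with x >= |B|
    for n in range(len(A) + 1, max_len + 1):
        # length-lexicographic search over proper extensions of A
        for tail in product('01', repeat=n - len(A)):
            sigma = A + ''.join(tail)
            v = phi(e, sigma, x)
            if v is not None:       # Phi_e^sigma(x) converges
                # pad B with 0s up to x, then diagonalize at x
                newB = B + '0' * (x - len(B)) + str(1 - v)
                return sigma, newB
    # no convergent computation found: pad both strings
    return A + '0', B + '0'

# Made-up functional: Phi_e^sigma(x) converges iff sigma is long
# enough, and outputs sigma's bit at position x.
phi = lambda e, sigma, x: int(sigma[x]) if len(sigma) > x else None

A1, B1 = attend_R2e(phi, 0, '', '', max_len=4)
assert phi(0, A1, 0) is not None and phi(0, A1, 0) != int(B1[0])
```

Since `phi` reads only the bits of `sigma`, any further extension of A1 leaves the computation, and hence the disagreement at x, intact; this is exactly the use principle at work.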
A key ingredient of the proof of Theorem 2.10.2 is the use principle (Proposition 2.4.1). In constructions of this sort, where we build objects to defeat certain oracle computations, a typical requirement will say something like “the reduction Γ is not a witness to A ≤_T B.” If we have a converging computation Γ^B(n)[s] ≠ A(n)[s] and we “preserve the use” of this computation by not changing B after stage s on the use γ^B(n)[s] (and similarly preserve A(n)), then we will preserve this disagreement. But this use corresponds to only a finite portion of B, so we still have all the numbers bigger than it to meet other requirements. In the finite extension method, this use preservation is automatic, since once we define B(x) we never redefine it, but in other constructions we will introduce below, this may not be the case, because we may have occasion to redefine certain values of B. In that case, to ensure that Γ^B ≠ A, we will have to structure the construction so that, if Γ^B is total, then there are n and s such that Γ^B(n)[s] ≠ A(n)[s] and, from stage s on, we preserve both A(n) and B ↾ γ^B(n)[s].
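The use principle underlying this preservation can be made concrete: an oracle computation consults only finitely many oracle bits, so any oracle agreeing on those bits yields the same computation. A minimal Python sketch (the functional `phi` and the oracles are made-up examples):

```python
def run_with_use(phi, oracle, x):
    """Run phi with the given oracle, recording queried positions.
    Returns (output, use), where the use bounds all queried positions."""
    queried = []
    def wrapped(n):
        queried.append(n)
        return oracle(n)
    out = phi(wrapped, x)
    return out, max(queried, default=-1) + 1

# Made-up functional: queries oracle bits 0..x and sums them mod 2.
phi = lambda oracle, x: sum(oracle(n) for n in range(x + 1)) % 2

B1 = lambda n: n % 3 == 0
B2 = lambda n: n % 3 == 0 if n < 10 else True   # agrees with B1 below 10

out1, use = run_with_use(phi, B1, 5)
out2, _ = run_with_use(phi, B2, 5)
assert use <= 10 and out1 == out2   # same computation on agreeing oracles
```

Preserving B below the use, as in the constructions above, is precisely what keeps `out` from ever changing.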
2.11 Post’s Problem and the finite injury priority method

A more subtle technique than the finite extension method is the priority method. We begin by looking at the simplest incarnation of this elegant technique, the finite injury priority method. This method is somewhat like the finite extension method, but with backtracking. The idea behind it is the following. Suppose we must again satisfy requirements R_0, R_1, . . ., but this time we are constrained to some sort of effective construction, so we are not allowed to ask questions of a noncomputable oracle during the construction.

As an illustration, let us reconsider Post’s Problem (Question 2.10.1). Post’s Problem asks us to find a c.e. degree strictly between 0 and 0′. It is clearly enough to construct c.e. sets A and B with incomparable Turing degrees. The Kleene-Post method does allow us to construct sets with incomparable degrees below 0′, using a ∅′ oracle question at each stage, but there is no reason to expect these sets to be computably enumerable. To make A and B c.e., we must have a computable (rather than merely ∅′-computable) construction where elements go into the sets A and B but never leave them. As we will see, doing so requires giving up on satisfying our requirements in order.

The key idea, discovered independently by Friedberg [161] and Muchnik [284], is to pursue multiple strategies for each requirement, in the following sense. In the proof of the Kleene-Post Theorem, it appears that, in satisfying the requirement R_{2e}, we need to know whether or not there is a σ extending A_{2e} such that Φ_e^σ(x)↓, where x is our chosen witness. Now our idea is to first guess that no such σ exists, which means that we do nothing for R_{2e}
other than keep x out of B. If at some point we find an appropriate σ, we then make A extend σ and put x into B if necessary, as in the Kleene-Post construction. The only problem is that putting x into B may well upset the action of other requirements of the form R_{2i+1}, because such a requirement might need B to extend some string τ (for the same reason that R_{2e} needs A to extend σ), which may no longer be possible. If we nevertheless put x into B, we say that we have injured R_{2i+1}. Of course, R_{2i+1} can now choose a new witness and start over from scratch, but perhaps another requirement may injure it again later. So we need to somehow ensure that, for each requirement, there is a stage after which it is never injured.

To make sure that this is the case, we put a priority ordering on our requirements, by stating that R_j has stronger priority than R_i if j < i, and allow R_j to injure R_i only if R_j has stronger priority than R_i. Thus R_0 is never injured. The requirement R_1 may be injured by the action of R_0. However, once this happens R_0 will never act again, so if R_1 is allowed to start over at this point, it will succeed. This process of starting over is called initialization. Initializing R_1 means that we restart its action with a new witness, chosen to be larger than any number previously seen in the construction, and hence larger than any number R_0 cares about. This new incarnation of R_1 is guaranteed never to be injured. It should now be clear that, by induction, each requirement will eventually reach a point, following a finite number of initializations, after which it will never be injured and hence will succeed in reaching its goal.

We may think of this kind of construction as a game between a team of industrialists (each possibly trying to erect a factory) and a team of environmentalists (each possibly trying to build a park). In the end we want the world to be happy. In other words, we want all desired factories and parks to be built.
However, some of the players may get distracted by other activities and never decide to build anything, so we cannot simply let one player build, then the next, and so on, because we might then get permanently stuck waiting for a player who never decides to build. Members of the two teams have their own places in the pecking order. For instance, industrialist 6 has stronger priority than all environmentalists except the first six, and therefore can build anywhere except on parks built by the first six environmentalists. So industrialist 6 may choose to build on land already demarcated by environmentalist 10, say, who would then need to find another place to build a park. Of course, even if this event happens, a higher ranked environmentalist, such as number 3, for instance, could later lay claim to that same land, forcing industrialist 6 to find another place to build a factory. Whether the highest ranked industrialist has priority over the highest ranked environmentalist or vice versa is irrelevant to the construction, so we leave that detail to each reader’s political leanings. For each player, there are only finitely many other players with stronger priority, and once all of these have finished building what they desire, the
given player has free pick of the remaining land (which is infinite), and can build on it without later being forced out. In general, in a finite injury priority argument, we have a list of requirements in some priority ordering. There are several different ways to meet each individual requirement. Exactly which way will be possible to implement depends upon information that is not initially available to us but is “revealed” to us during the construction. The problem is that a requirement cannot wait for others to act, and hence must risk having its work destroyed by the actions of other requirements. We must arrange things so that only requirements of stronger priority can injure ones of weaker priority, and we can always restart the ones of weaker priority once they are injured. In a finite injury argument, any requirement requires attention only finitely often, and we argue by induction that each requirement eventually gets an environment wherein it can be met. As we will later see, there are much more complex infinite injury arguments where one requirement might injure another infinitely often, but the key there is that the injury is somehow controlled so that it is still the case that each requirement eventually gets an environment wherein it can be met. Of course, imposing this coherence criterion on our constructions means that each requirement must ensure that its action does not prevent weaker requirements from finding appropriate environments (a principle known as Harrington’s “golden rule”). For a more thorough account of these beautiful techniques and their uses in modern computability theory, see Soare [366]. We now turn to the formal description of the solution to Post’s Problem by Friedberg and Muchnik, which was the first use of the priority method. In Chapter 11, we will return to Post’s Problem and explore its connections with the notion of Kolmogorov complexity. Theorem 2.11.1 (Friedberg [161], Muchnik [284]). 
There exist computably enumerable sets A and B such that A and B have incomparable Turing degrees.

Proof. We build A = ⋃_s A_s and B = ⋃_s B_s in stages to satisfy the same requirements as in the proof of the Kleene-Post Theorem. That is, we make A and B c.e. while meeting the following requirements for all e ∈ N.

R_{2e} : Φ_e^A ≠ B.

R_{2e+1} : Φ_e^B ≠ A.

The strategy for a single requirement. We begin by looking at the strategy for a single requirement R_{2e}. We first pick a witness x to follow R_{2e}. This follower is targeted for B, and, of course, we initially keep it out of B. We then wait for a stage s such that Φ_e^A(x)[s]↓ = 0. If such a stage does not occur, then either Φ_e^A(x)↑ or Φ_e^A(x)↓ ≠ 0. In either case, since we keep x out of B, we have Φ_e^A(x) ≠ 0 = B(x), and hence R_{2e} is satisfied.
If a stage s as above occurs, then we put x into B and protect A_s. That is, we try to ensure that any number entering A from now on is greater than any number seen in the construction thus far, and hence in particular greater than φ_e^A(x)[s]. If we succeed then, by the use principle,

Φ_e^A(x) = Φ_e^A(x)[s] = 0 ≠ B(x),

and hence again R_{2e} is satisfied. We refer to this action of protecting A_s as imposing restraint on weaker priority requirements. Note that when we take this action, we might injure a requirement R_{2i+1} that is trying to preserve the use of a computation Φ_i^B(x′), since x may be below this use. As explained above, the priority mechanism will ensure that this can happen only if 2i + 1 > 2e.

We now proceed with the full construction. We will denote by A_s and B_s the sets of elements enumerated into A and B, respectively, by the end of stage s.

Construction.

Stage 0. Declare that no requirement currently has a follower.

Stage s + 1. Say that R_j requires attention at this stage if one of the following holds.

(i) R_j currently has no follower.

(ii) R_j has a follower x and, for some e, either

(a) j = 2e and Φ_e^A(x)[s]↓ = 0 = B_s(x) or

(b) j = 2e + 1 and Φ_e^B(x)[s]↓ = 0 = A_s(x).

Find the least j ≤ s with R_j requiring attention. (If there is none, then proceed to the next stage.) We suppose that j = 2e, the odd case being symmetric. If R_{2e} has no follower, then let x be a fresh large number (that is, one larger than all numbers seen in the construction so far) and appoint x as R_{2e}’s follower. If R_{2e} has a follower x, then it must be the case that Φ_e^A(x)[s]↓ = 0 = B_s(x). In this case, enumerate x into B and initialize all R_k with k > 2e by canceling all their followers. In either case, we say that R_{2e} receives attention at stage s.

End of Construction.

Verification. We prove by induction that, for each j, (i) R_j receives attention only finitely often, and (ii) R_j is met. Suppose that (i) holds for each k < j in place of j.
Suppose that j = 2e for some e, the odd case being symmetric. Let s be the least stage such that for all k < j, the requirement R_k does not require attention after stage s. By the minimality of s, some requirement R_k with k < j received attention at stage s (or s = 0), and hence R_j does not have a follower at the beginning of stage s + 1. Thus, R_j requires attention at stage s + 1, and
is appointed a follower x. Since R_j cannot have its follower canceled unless some R_k with k < j receives attention, x is R_j’s permanent follower. It is clear by the way followers are chosen that x is never any other requirement’s follower, so x will not enter B unless R_j acts to put it into B.

So if R_j never requires attention after stage s + 1, then x ∉ B, and we never have Φ_e^A(x)[t]↓ = 0 for t > s, which implies that either Φ_e^A(x)↑ or Φ_e^A(x)↓ ≠ 0. In either case, R_j is met.

On the other hand, if R_j requires attention at a stage t + 1 > s + 1, then x ∈ B and Φ_e^A(x)[t]↓ = 0. The only requirements that put numbers into A after stage t + 1 are ones weaker than R_j (i.e., requirements R_k for k > j). Each such strategy is initialized at stage t + 1, which means that, when it is later appointed a follower, that follower will be bigger than φ_e^A(x)[t]. Thus no number less than φ_e^A(x)[t] will ever enter A after stage t + 1, which implies, by the use principle, that Φ_e^A(x)↓ = Φ_e^A(x)[t] = 0 ≠ B(x). So in this case also, R_j is met. Since x ∈ B_{t+2} and x is R_j’s permanent follower, R_j never requires attention after stage t + 1.

The above proof is an example of the simplest kind of finite injury argument, what is called a bounded injury construction. That is, we can put a computable bound in advance on the number of times that a given requirement R_j will be injured. In this case, the bound is 2^j − 1.

We give another example of this kind of construction, connected with the important concept of lowness. It is natural to ask what can be said about the jump operator beyond the basic facts we have seen so far. The next theorem proves that the jump operator on degrees is not injective. Indeed, injectivity fails in the first place it can, in the sense that there are noncomputable sets that the jump operator cannot distinguish from ∅.

Theorem 2.11.2 (Friedberg). There is a noncomputable c.e. low set.

Proof. We construct our set A in stages.
To make A noncomputable we need to meet the requirements

P_e : Ā ≠ W_e.

To make A low we meet the requirements

N_e : (∃^∞ s) Φ_e^A(e)[s]↓ ⇒ Φ_e^A(e)↓.
To see that such requirements suffice, suppose they are met and define the computable binary function g by letting g(e, s) = 1 if Φ_e^A(e)[s]↓ and g(e, s) = 0 otherwise. Then g(e) = lim_s g(e, s) is well-defined, and by the limit lemma, A′ = {e : g(e) = 1} ≤_T ∅′.

The strategy for P_e is simple. We pick a fresh large follower x, and keep it out of A. If x enters W_e, then we put x into A. We meet N_e by an equally simple conservation strategy. If we see Φ_e^A(e)[s]↓ then we simply try to ensure that A ↾ φ_e^A(e)[s] = A_s ↾ φ_e^A(e)[s] by initializing all weaker priority requirements, which forces them to choose fresh large numbers as
followers. These numbers will be too big to injure the Φ_e^A(e)[s] computation after stage s.

The priority method sorts out the actions of the various strategies. Since P_e picks a fresh large follower each time it is initialized, it cannot injure any N_j for j < e. It is easy to see that any N_e can be injured at most e many times, and that each P_e is met, since it is initialized at most 2e many times.

Actually, the above proof constructs a noncomputable c.e. set that is superlow. (Recall that a set A is superlow if A′ ≡_tt ∅′.)
2.12 Finite injury arguments of unbounded type

2.12.1 The Sacks Splitting Theorem

There are priority arguments in which the number of injuries to each requirement, while finite, is not bounded by any computable function. One example is the following proof of the Sacks Splitting Theorem [342]. We write A = A_0 ⊔ A_1 to mean that A = A_0 ∪ A_1 and A_0 ∩ A_1 = ∅. Turing incomparable c.e. sets A_0 and A_1 such that A = A_0 ⊔ A_1 are said to form a c.e. splitting of A.

Theorem 2.12.1 (Sacks Splitting Theorem [342]). Every noncomputable c.e. set has a c.e. splitting.

Proof. Let A be a noncomputable c.e. set. We build A_i = ⋃_s A_{i,s} in stages by a priority argument to meet the following requirements for all e ∈ N and i = 0, 1, while ensuring that A = A_0 ⊔ A_1.

R_{e,i} : Φ_e^{A_i} ≠ A.
These requirements suffice because if A_{1−i} ≤_T A_i then A ≤_T A_i. Without loss of generality, we assume that we are given an enumeration of A so that exactly one number enters A at each stage. We must put this number x ∈ A_{s+1} \ A_s into exactly one of A_0 or A_1, to ensure that A = A_0 ⊔ A_1. To meet R_{e,i}, we define the length of agreement function

l(e, i, s) = max{n : ∀k < n (Φ_e^{A_i}(k)[s] = A(k)[s])}
and an associated use function⁶

u(e, i, s) = φ_e^{A_i}(l(e, i, s) − 1)[s],
using the convention that use functions are monotone increasing (with respect to the position variable) where defined.

⁶ Of course, in defining this set we ignore s’s such that l(e, i, s) = 0. We will do the same without further comment below. Here and below, we take the maximum of the empty set to be 0.
The main idea of the proof is perhaps initially counterintuitive. Let us consider a single requirement R_{e,i} in isolation. At each stage s, although we want Φ_e^{A_i} ≠ A, instead of trying to destroy the agreement between Φ_e^{A_i}[s] and A[s] represented by l(e, i, s), we try to preserve it (a method sometimes called the Sacks preservation strategy). The way we implement this preservation is to put numbers less than u(e, i, s) entering A after stage s into A_{1−i} and not into A_i. By the use principle, since this action freezes the A_i side of the computations involved in the definition of l(e, i, s), it ensures that Φ_e^{A_i}(k) = Φ_e^{A_i}(k)[s] for all k < l(e, i, s).

Now suppose that lim sup_s l(e, i, s) = ∞, so for each k we can find infinitely many stages s at which k < l(e, i, s). For each such stage, Φ_e^{A_i}(k) = Φ_e^{A_i}(k)[s] = A(k)[s]. Thus A(k) = A(k)[s] for any such s. So we can compute A(k) simply by finding such an s, which contradicts the noncomputability of A. Thus lim sup_s l(e, i, s) < ∞, which clearly implies that R_{e,i} is met.

In the full construction, of course, we have competing requirements, which we sort out by using priorities. That is, we establish a priority list of our requirements (for instance, saying that R_{e,i} is stronger than R_{e′,i′} iff ⟨e, i⟩ < ⟨e′, i′⟩). At stage s, for the single element x_s entering A at stage s, we find the strongest priority R_{e,i} with ⟨e, i⟩ < s such that x_s < u(e, i, s) and put x_s into A_{1−i}. We say that R_{e,i} acts at stage s. (If there is no such requirement, then we put x_s into A_0.)

To verify that this construction works, we argue by induction that each requirement eventually stops acting and is met. Suppose that all requirements stronger than R_{e,i} eventually stop acting, say by a stage s > ⟨e, i⟩. At any stage t > s, if x_t < u(e, i, t), then x_t is put into A_{1−i}. The same argument as in the one requirement case now shows that if lim sup_s l(e, i, s) = ∞ then A is computable, so lim sup_s l(e, i, s) < ∞.
Our preservation strategy then ensures that lim sup_s u(e, i, s) < ∞. Thus R_{e,i} eventually stops acting and is met.

In the above construction, injury to a requirement R_{e,i} happens whenever x_s < u(e, i, s) but x_s is nonetheless put into A_i, at the behest of a stronger priority requirement. How often R_{e,i} is injured depends on the lengths of agreement attached to stronger priority requirements, and thus cannot be computably bounded.

Note that, for any noncomputable c.e. set C, we can easily add requirements of the form Φ_e^{A_i} ≠ C to the above construction, satisfying them in the same way that we did for the R_{e,i}. Thus, as shown by Sacks [342], in addition to making A_0 |_T A_1, we can also ensure that C ≰_T A_i for i = 0, 1.

Note also that the computable enumerability of A is not crucial in the above argument. Indeed, a similar argument works for any set A that has a computable approximation, that is, any Δ⁰₂ set A. Such an argument shows that if A and C are noncomputable Δ⁰₂ sets, then there exist Turing incomparable Δ⁰₂ sets A_0 and A_1 such that A = A_0 ⊔ A_1 and C ≰_T A_i
for i = 0, 1. Checking that the details of the proof still work in this case is a good exercise for those unfamiliar with priority arguments involving Δ⁰₂ sets.
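The length of agreement function is itself computable stage by stage. A small Python sketch, with made-up stage data standing in for Φ_e^{A_i}[s] and A[s] (the helper names are hypothetical, not from the text):

```python
def length_of_agreement(phi_s, A_s, bound):
    """l(e,i,s): the largest n <= bound such that Phi(k)[s] = A(k)[s]
    for all k < n. phi_s(k) gives the stage-s value of Phi_e^{A_i}(k),
    or None if that computation has not converged by stage s."""
    n = 0
    while n < bound and phi_s(n) is not None and phi_s(n) == A_s(n):
        n += 1
    return n

# Toy stage: the functional currently agrees with A on arguments 0..3
# and has not converged from argument 4 on (made-up data).
A_s = lambda k: k % 2
phi_s = lambda k: k % 2 if k < 4 else None
assert length_of_agreement(phi_s, A_s, 10) == 4
```

The Sacks strategy then diverts any number below the corresponding use that enters A after stage s into A_{1−i}, freezing every computation counted by this value.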
2.12.2 The Pseudo-Jump Theorem

Another basic construction using the finite injury method was discovered by Jockusch and Shore [193]. It involves what are called pseudo-jump operators.

Definition 2.12.2 (Jockusch and Shore [193]). For an index e, let the pseudo-jump operator V_e be defined by V_e^A = A ⊕ W_e^A for all oracles A. We say that V_e is nontrivial if A <_T V_e^A for all A.
Thus the full construction proceeds as follows at stage s: For each e such that W_e[s] ∩ A[s] = ∅ and γ(2e, s) ∈ W_e[s], put γ(2e, s) into A. For each n entering ∅′ at stage s, put γ(2n + 1, s) into A.

We put at most one number into A for the sake of a given R- or P-requirement. By the definition of the γ function, for each n there is an s such that, for all t ≥ s, no number less than r(n, t) is put into A at stage t. It follows that lim_t r(n, t) exists, and N_n is satisfied. Thus γ(n) = lim_s γ(n, s) exists for all n, and the construction ensures that if W_e ∩ A = ∅ then γ(2e) ∈ A iff γ(2e) ∈ W_e, so that R_e is satisfied. It is also clear that each P-requirement is satisfied by the construction, so we are left with showing that for each n, we can V^A-computably determine a stage u such that γ(n, s) = γ(n, u) for all s ≥ u.

Assume by induction that we have determined a stage t such that γ(m, s) = γ(m, t) for all m < n and s ≥ t. Let u > t be a stage such that for all m ≤ n, if m ∈ W_k^A then m ∈ W_k^A[u]. Clearly, such a u can be found V^A-computably. Let m ≤ n. If m ∈ W_k^A then r(m, s) = r(m, u) for all s ≥ u, since no number less than r(m, u) enters A during or after stage u. Now suppose that m ∉ W_k^A. For each s ≥ u, no number less than r(m, s) enters A during or after stage s, so if m were in W_k^A[s] then it would be in W_k^A, and hence m ∉ W_k^A[s]. Thus in this case r(m, s) = 0 for all s ≥ u. So we see that r(m, s) = r(m, u) for all m ≤ n and s ≥ u, and hence γ(n, s) = γ(n, u) for all s ≥ u.

Jockusch and Shore [193, 194] used the pseudo-jump machinery to establish a number of results. One application is a finite injury proof that there is a high incomplete c.e. Turing degree, a result first proved using the infinite injury method, as we will see in Section 2.14.3. The Jockusch-Shore proof of the existence of a high incomplete c.e. degree in fact produces a superhigh degree (where a set X is superhigh if ∅″ ≤_tt X′), and runs as follows.
Relativize the original Friedberg theorem that there is a noncomputable set W_e of low (or even superlow) Turing degree, to obtain an operator V_e such that for all Y we have that Y
2.13 Coding and permitting

Coding and permitting are two basic methods for controlling the degrees of sets we build. Coding is a way of ensuring that a set A we build has degree at least that of a given set B. As the name implies, it consists of encoding the bits of B into A in a recoverable way. One simple way to do this is to build A to equal B ⊕ C for some C, but in some cases we need to employ
a more elaborate coding method. We will see an example in Proposition 2.13.1 below.

Permitting is a way of ensuring that a set A we build has degree at most that of a given set B. We will describe the basic form of c.e. permitting. For more elaborate forms of c.e. permitting, see Soare [366]. For a thorough exposition of the Δ⁰₂ permitting method, see Miller [281].

Let B be a noncomputable c.e. set, and suppose we are building a c.e. set A to satisfy certain requirements while also ensuring that A ≤_T B. At a stage s in the construction, we may want to put a number n into A. But instead of putting n into A immediately, we wait for a stage t ≥ s such that B_{t+1} ↾ n ≠ B_t ↾ n. At that stage, we say that B permits us to put n into A. If we always wait for permissions of this kind before putting numbers into A, then we can compute A from B as follows. Given n, look for an s such that B_s ↾ n = B ↾ n. Then n ∈ A iff n ∈ A_s.

Of course, one has to worry whether a particular requirement can live with this permitting strategy. But suppose that we have a requirement R that provides us with infinitely many followers n, and will be satisfied as long as one such n enters A. Suppose that we never get a permission to put such an n into A. Then we can compute B as follows. Given m, wait for a stage s such that R provides us with a follower n > m. Since we are assuming we are never permitted to put n into A, we have B ↾ n = B_s ↾ n. In particular, m ∈ B iff m ∈ B_s. Since we are assuming that B is noncomputable, some follower must eventually be permitted to enter A, and hence R is satisfied.

The following proof illustrates both coding and permitting. Recall that a coinfinite c.e. set A is simple if its complement does not contain any infinite c.e. set.

Proposition 2.13.1. Every nonzero c.e. degree contains a simple set.

Proof. Let B be a noncomputable c.e. set. We may assume that exactly one element x_s enters B at each stage s. We build a c.e.
set A ≡_T B to satisfy the simplicity requirements

R_e : |W_e| = ∞ ⇒ W_e ∩ A ≠ ∅.

At stage s, let a^s_0 < a^s_1 < · · · be the elements of the complement of A_s. For each e < s, if W_{e,s} ∩ A_s = ∅ and there is an n ∈ W_{e,s} with n > 3e and n > x_s, then put n into A. Also, put a^s_{3x_s} into A.

The set A is clearly c.e. Each R_e puts at most one number into A, and this number must be greater than 3e, while for each x ∈ B, the coding of B puts only one number into A, and this number is at least 3x. Thus A is coinfinite. For each n, if B_s ↾ (n + 1) = B ↾ (n + 1), then n < x_t for all t > s, so n cannot enter A after stage s. As explained above, it follows that A ≤_T B. Given x, the set A can compute a stage s such that a^s_i ∉ A for all i ≤ 3x. Then a^t_{3x} = a^s_{3x} for all t ≥ s, so x ∈ B iff either x ∈ B_s or a^s_{3x} ∈ A. Thus B ≤_T A.
Finally, we need to show that each R_e is met. Suppose that W_e is infinite but W_e ∩ A = ∅. Then we can computably find 3e < n_0 < n_1 < · · · and s_0 < s_1 < · · · such that n_i ∈ W_{e,s_i} for all i. None of the n_i are ever permitted by B, that is, n_i ≤ x_t for all i and t ≥ s_i. So B ↾ n_i = B_{s_i} ↾ n_i for all i, and hence B is computable, contrary to hypothesis. Thus each requirement is met, and hence A is simple.
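The permitting discipline can be simulated concretely. Here is a toy Python sketch with entirely made-up enumerations: A only enumerates a number n at a stage where B changes below n, so the decision procedure for A ≤_T B just needs a stage where B has settled below n.

```python
# Toy permitting construction: made-up enumeration of B (one element
# per stage) and a made-up list of numbers we want to put into A.
B_elements = [5, 2, 7, 1, 9]        # x_s = element entering B at stage s
wants = {0: 3, 2: 8, 3: 6}          # stage -> number targeted for A

B, A, pending = set(), set(), set()
A_stages = []                        # A_s = contents of A at end of stage s
for s, x_s in enumerate(B_elements):
    B.add(x_s)
    if s in wants:
        pending.add(wants[s])
    # B permits n at stage s iff B changes below n, i.e. x_s < n
    permitted = {n for n in pending if x_s < n}
    A |= permitted
    pending -= permitted
    A_stages.append(set(A))

def in_A(n):
    """Decide n in A using only B: find a stage s where B_s agrees with
    B below n; no permission for n can occur after such a stage."""
    B_s = set()
    for s, x_s in enumerate(B_elements):
        B_s.add(x_s)
        if {b for b in B_s if b < n} == {b for b in B if b < n}:
            return n in A_stages[s]
    return n in A

assert sorted(A) == [3, 6, 8]
assert all(in_A(n) == (n in A) for n in range(12))
```

Note that the number 3, wanted at stage 0, must wait until stage 1 for a change of B below it; a number that B never permits simply stays out of A, which is why the construction above has to argue that B, being noncomputable, permits some follower of each requirement.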
2.14 The infinite injury priority method

In this section, we introduce the infinite injury priority method. We begin by discussing the concept of priority trees, then give a few examples of constructions involving such trees. See Soare [366] for further examples.
2.14.1 Priority trees and guessing

The Friedberg-Muchnik construction used to prove Theorem 2.11.1 can be viewed another way. Instead of having a single strategy for each requirement, which is restarted every time a stronger priority requirement acts, we can attach multiple versions of each requirement to a priority tree, in this case the full binary tree 2^{<ω}. Recall that the relevant requirements are the following.

R_{2e} : Φ_e^A ≠ B.

R_{2e+1} : Φ_e^B ≠ A.

Attached to each node σ is a “version” R_σ of R_{|σ|}. We call such a version a strategy for R_{|σ|}, and refer to σ itself as a guess, for reasons that will soon become clear. Thus there are 2^e separate strategies for R_e. Each strategy R_σ has two outcomes, 0 and 1. The outcome 1 indicates that R_σ appoints a follower but never enumerates this follower into its target set. The outcome 0 indicates that R_σ actually enumerates its follower into its target set. At a stage s, we have a guess as to which outcome is correct, depending on whether or not R_σ has enumerated its follower into its target set. If this guess is ever 0, then we immediately know that 0 is indeed the correct outcome, so we regard the 0 outcome as being stronger than the 1 outcome. We order our guesses lexicographically, and hence if σ is to the left of τ, then we think of σ as being stronger than τ. Figure 2.1 shows the above setup. For comparison, it also shows the setup for the Minimal Pair Theorem, which will be discussed in the next subsection.

Here is how this mechanism is used. Consider R_1. We have two strategies for this requirement. One, which in this paragraph we write as R^0 rather than R_0 to avoid confusion with the requirement R_0, is below the 0 outcome of R_λ, and the other, R^1, is below the 1 outcome of R_λ. We think of these versions
[Figure 2.1 shows two priority trees: one with outcomes 0 and 1 at each node R_0, R_1, R_2, ..., with 0 to the left of 1, for the Friedberg-Muchnik Theorem, and one with outcomes ∞ and f at each node N_0, N_1, N_2, ..., with ∞ to the left of f, for the Minimal Pair Theorem.]
Figure 2.1. The assignment of priorities and outcomes
as guessing that R_λ's outcome is 0 and 1, respectively. The strategy R^1 believes that R_λ will appoint a follower, but will never enumerate it into its target set. Thus its action is simply to act immediately after its guess appears correct, that is, once R_λ appoints its follower. At this point, R^1 can appoint its own follower, and proceed to act as R_1 would have acted in the original proof of the Friedberg-Muchnik Theorem. Its belief may turn out to be false (i.e., R_λ may enumerate its follower into its target set), in which case its action may be negated, but in that case we do not care about R^1. Indeed, in general, we care only about strategies whose guesses about the outcomes of strategies above them in the tree are correct. So we have the "back-up" strategy R^0, which believes that R_λ will enumerate its follower into its target set. This strategy will act only once its guess appears correct, i.e., after R_λ enumerates its follower into its target set. Then and only then does this strategy wake up and appoint its follower. The above is extended inductively on the tree of strategies in a natural way. For instance, for σ = 0110, there is a strategy R_σ for R_4, which guesses that R_0 and R_{01} do not enumerate their followers, but R_λ and R_{011} do.
2. Computability Theory
This strategy waits for R_0 and R_{01} to appoint followers, and for R_λ and R_{011} to both appoint followers and then enumerate them. Once all of these conditions are met, it appoints its own follower and proceeds to act as R_4 would have acted in the original proof of the Friedberg-Muchnik Theorem. More precisely, at stage s, we have a guess σ_s of length s, which is defined inductively. Only those strategies R_τ with τ ⪯ σ_s get to act at stage s. We say that such τ are visited at stage s. Suppose we have defined σ_s ↾ n. Let τ = σ_s ↾ n. Then we allow R_τ to act as follows. Let us suppose that n = 2e, the odd case being symmetric. If R_τ has previously put a number into B, then it does nothing. In this case, if n < s then σ_s(n) = 0. Otherwise, R_τ acts in one of three ways.

1. If R_τ does not currently have a follower, then it appoints a follower x_τ greater than any number seen in the construction so far. In this case, if n < s then σ_s(n) = 1.

2. Otherwise, if Φ_e^A(x_τ)[s]↓ = 0 = B_s(x_τ), then R_τ enumerates x_τ into B. In this case, if n < s then σ_s(n) = 0.

3. Otherwise, R_τ does nothing. In this case, if n < s then σ_s(n) = 1.

The strategy R_λ has a true outcome i, which is 0 if it eventually enumerates its witness into B, and 1 otherwise. Similarly, R_i has a true outcome. The true path TP of the construction is the unique path such that TP(n) is the true outcome of R_{TP↾n}. Note that, in the above construction, TP(n) = lim_s σ_s(n), but with an eye to more complicated tree constructions, we in general define the true path of such a construction as the leftmost path visited infinitely often (that is, the path extending those guesses τ such that τ is the leftmost guess of length |τ| that is visited infinitely often). The idea behind this definition is that the strategies on the true path are the ones that actually succeed in meeting their respective requirements. In the construction described above, it is easy to verify that this is indeed the case. Let τ ∈ TP.
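The stage-by-stage definition of σ_s above can be animated directly. The following Python sketch implements the three rules for the even strategies only; the convergence test "Φ_e^A(x_τ)[s]↓ = 0 = B_s(x_τ)" is mocked by a hypothetical predicate `realized(tau, x, s)`, and the names `run` and `realized` are illustrative choices, not from the text.

```python
def run(stages, realized):
    """Toy sketch of the guess sequence sigma_s, even strategies only.
    `realized(tau, x, s)` mocks "Phi_e^A(x_tau)[s] converges to 0 = B_s(x_tau)"."""
    followers = {}       # node (tuple of outcomes) -> its follower x_tau
    enumerated = set()   # nodes that have enumerated their follower into B
    fresh = 0            # source of fresh large followers
    history = []
    for s in range(stages):
        sigma = ()
        for n in range(s):
            tau = sigma
            if tau in enumerated:                   # permanent outcome 0
                sigma = tau + (0,)
            elif tau not in followers:              # rule 1: appoint a follower
                fresh += 1
                followers[tau] = fresh
                sigma = tau + (1,)
            elif realized(tau, followers[tau], s):  # rule 2: enumerate it
                enumerated.add(tau)
                sigma = tau + (0,)
            else:                                   # rule 3: keep waiting
                sigma = tau + (1,)
        history.append(sigma)
    return history
```

As the text observes, a guessed outcome can change only from 1 to 0, so in a log produced by this sketch the approximation to the true path moves only to the left.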
Again, let us assume that |τ| = 2e, the odd case being symmetric. At the first stage s at which τ is visited, a follower x_τ is appointed. Since followers are always chosen to be fresh large numbers, x_τ will not be enumerated into B by any strategy other than R_τ. So if Φ_e^A(x_τ)↑ or Φ_e^A(x_τ)↓ = 1, then R_{|τ|} is met. Otherwise, there is a stage t > s such that Φ_e^A(x_τ)[t]↓ = 0 = B_t(x_τ) and τ is visited at stage t. Then R_τ enumerates x_τ into B. Nodes to the right of τ⌢0 will never again be visited, so the corresponding strategies will never again act. Nodes to the left of τ are never visited at all, so the corresponding strategies never act. Nodes extending τ⌢0 will not have been visited before stage t, so the corresponding strategies will pick followers greater than φ_e^A(x_τ)[t]. If n < |τ| then there are two possibilities. If τ(n) = 1 then 1 is the true outcome of R_{τ↾n}, and hence that strategy never enumerates its witness. If τ(n) = 0 then R_{τ↾n} must have already
enumerated its witness by stage s, since otherwise τ would not have been visited at that stage. In either case, we see that strategies corresponding to nodes extended by τ do not enumerate numbers during or after stage t. The upshot of the analysis in the previous paragraph is that no strategy enumerates a number less than φ_e^A(x_τ)[t] into A during or after stage t, so by the use principle, Φ_e^A(x_τ)↓ = 0 ≠ 1 = B(x_τ), and hence R_τ is met. It is natural to wonder why we should introduce all this machinery. For the Friedberg-Muchnik Theorem itself, as well as for most finite injury arguments, the payoff in new insight provided by this reorganization of the construction does not really balance the additional notational and conceptual burden. However, the above concepts were developed in the past few decades to make infinite injury arguments comprehensible. The key difference between finite injury and infinite injury arguments is the following. In an infinite injury argument, the action of a given requirement R may be infinitary. Obviously, we cannot simply restart weaker priority requirements every time R acts. Instead, we can have multiple strategies for weaker priority requirements, depending on whether or not R acts infinitely often. Thus, in a basic setup of this sort, the left outcome of R, representing the guess that R acts infinitely often, is visited each time R acts. In the Friedberg-Muchnik argument, the approximation to the true path moves only to the left as time goes on, since the guessed outcome of a strategy can change from 1 to 0, but never the other way. Thus, the true path is computable in ∅′. In infinite injury arguments, the approximation to the true path can move both left and right. Assuming that the tree is finitely branching, this possibility means that, in general, the true path is computable only in ∅″.
It clearly is computable in ∅″ because, letting TP_s be the unique string of length s visited at stage s, we have σ ≺ TP iff ∃^∞ s (σ ≺ TP_s) ∧ ∃^{<∞} s (TP_s is to the left of σ), and both conjuncts can be decided by ∅″.
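Over a finite log of the guesses TP_s, the "leftmost path visited infinitely often" can only be approximated. The sketch below (the names `leftmost_limit` and `tail_from` are illustrative, not from the text) takes, at a given length, the leftmost prefix still occurring from some stage onward, with 0 ordered to the left of 1:

```python
def leftmost_limit(history, n, tail_from):
    # Among the length-n prefixes of the guesses TP_s for s >= tail_from,
    # return the lexicographically leftmost (0 is to the left of 1).
    prefixes = {tuple(h[:n]) for h in history[tail_from:] if len(h) >= n}
    return min(prefixes)

# In an infinite injury argument the approximation can move both left
# and right before settling, as in this artificial log:
log = [(1, 1), (0, 1), (1, 0), (0, 1), (0, 1), (0, 1)]
```

In the construction itself the relevant "tail" cannot be found effectively, which is exactly why ∅″ is needed in general.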
2.14.2 The minimal pair method

Our first example of an infinite injury argument is the construction of a minimal pair of c.e. degrees. A pair of noncomputable degrees a and b is a minimal pair if the only degree below both a and b is 0.

Theorem 2.14.1 (Lachlan [231], Yates [408]). There is a minimal pair of c.e. degrees.

Proof. We build noncomputable c.e. sets A and B such that every set computable in both A and B is computable. We construct A and B in
stages to satisfy the following requirements for each e.

R_e: A̅ ≠ W_e.

Q_e: B̅ ≠ W_e.

N_e: Φ_e^A = Φ_e^B total ⇒ Φ_e^A computable.⁷
We arrange these requirements in a priority list as in previous constructions. We meet the R- and Q-requirements by a Friedberg-Muchnik type strategy. For instance, to satisfy R_e, we pick a follower x, targeted for A, and wait until x enters W_e. If this event never happens, then x ∈ A̅ ∖ W_e, so R_e is met. If x does enter W_e, then we put x into A, thus again meeting R_e. The N-requirements are trickier to meet. We first discuss how to meet a single N_e in isolation, and then look at the coherence problems between the various requirements and the solution to these provided by the use of a tree of strategies. We follow standard conventions, in particular assuming that all uses at stage s are bounded by s. As in the proof of the Sacks Splitting Theorem 2.12.1, we have a length of agreement function

l(e, s) = max{n : ∀k < n (Φ_e^A(k)[s]↓ = Φ_e^B(k)[s]↓)}
and an associated use function

u(e, s) = max{φ_e^A(l(e, t) − 1)[t], φ_e^B(l(e, t) − 1)[t] : t ≤ s},
using the convention that use functions are monotone increasing where defined. We now also have a maximum length of agreement function m(e, s) = max{l(e, t) : t ≤ s}, which can be thought of as a high water mark for the lengths of agreement seen so far. We say that a stage s is e-expansionary if l(e, s) > m(e, s − 1), that is, if the current length of agreement surpasses the previous high water mark. The key idea for meeting N_e is the following. Suppose that n < l(e, s), so that Φ_e^A(n)[s]↓ = Φ_e^B(n)[s]↓. Now suppose that we allow numbers to enter A at will, but "freeze" the computation Φ_e^B(n)[s] by not allowing

⁷ Clearly, meeting the R- and Q-requirements is enough to ensure that A and B are noncomputable. As for the N-requirements, it might seem at first glance that we need stronger requirements, of the form Φ_i^A = Φ_j^B total ⇒ Φ_i^A computable. However, suppose that we are able to meet the N-requirements. Then we cannot have A = B, since the N-requirements would then force A and B to be computable. So there is an n such that A(n) ≠ B(n). Let us assume without loss of generality that n ∈ B. If f ≤_T A, B then there are i and j such that Φ_i^A = Φ_j^B = f. But also, there is an e such that, for all oracles X, we have Φ_e^X = Φ_i^X if n ∉ X and Φ_e^X = Φ_j^X if n ∈ X. For this e, we have Φ_e^A = Φ_e^B = f. So if such an f is total, then N_e ensures that it is computable. This argument is known as Posner's trick. It is, of course, merely a notational convenience.
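The definitions of l(e, s), m(e, s), and e-expansionary stage are easy to animate. In the sketch below (all names are illustrative choices), the stage-s approximations Φ_e^A[s] and Φ_e^B[s] are mocked as finite dictionaries from arguments to values, with an absent key standing for divergence:

```python
def length_of_agreement(phiA, phiB):
    # l: the largest n such that both sides converge and agree below n
    n = 0
    while n in phiA and n in phiB and phiA[n] == phiB[n]:
        n += 1
    return n

def expansionary_stages(approxA, approxB):
    """approxX[s] mocks the stage-s finite approximation of Phi_e^X.
    A stage is e-expansionary when the current length of agreement
    exceeds the previous high water mark m(e, s-1)."""
    expansionary, m = [], 0
    for s, (pa, pb) in enumerate(zip(approxA, approxB)):
        l = length_of_agreement(pa, pb)
        if l > m:
            expansionary.append(s)
            m = l                    # new high water mark
    return expansionary
```

If Φ_e^A = Φ_e^B is total, the length of agreement is unbounded and there are infinitely many e-expansionary stages, which is the hypothesis under which the N_e strategy must succeed.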
[Figure 2.2 depicts the alternation: stage s is e-expansionary and the two computations Φ_e^A[s] and Φ_e^B[s] agree; at a stage t > s a number x enters A, destroying the A-side computation; at stages in (t, u), before recovery, the B-side is restrained; at the next e-expansionary stage u, Φ_e^A has recovered.]
Figure 2.2. The Ne strategy
any number to enter B below φ_e^B(n)[s] until the next e-expansionary stage t > s. At this stage t, we again have Φ_e^A(n)[t]↓ = Φ_e^B(n)[t]↓. But we also have Φ_e^B(n)[t] = Φ_e^B(n)[s], by the use principle. Now suppose that we start allowing numbers to enter B at will, but freeze the computation Φ_e^A(n)[t] by not allowing any number to enter A below φ_e^A(n)[t] until the next e-expansionary stage u > t. Then again Φ_e^A(n)[u]↓ = Φ_e^B(n)[u]↓, but also Φ_e^A(n)[u] = Φ_e^A(n)[t] = Φ_e^B(n)[t] = Φ_e^B(n)[s] = Φ_e^A(n)[s]. (Note that we are not saying that these computations are the same, only that they have the same value. It may well be that φ_e^A(n)[u] > φ_e^A(n)[s], for example.) So if we keep to this strategy, alternately freezing the A-side and the B-side of our agreeing computations, then we ensure that, if Φ_e^A = Φ_e^B is total (which implies that there are infinitely many e-expansionary stages), then Φ_e^A(n) = Φ_e^A(n)[s]. Thus, if we follow this strategy for all n, then we can compute Φ_e^A, and hence N_e is met. Figure 2.2 illustrates this idea.
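The little argument that the common value is preserved can be checked mechanically: if the two sides agree at every e-expansionary stage and at most one side changes between consecutive ones, the value never moves. A minimal sketch (the function name and the input encoding are illustrative assumptions):

```python
def stable_value(pairs):
    # pairs[i] = (value of Phi_e^A(n), value of Phi_e^B(n)) at the i-th
    # e-expansionary stage.  The assertions encode the two hypotheses:
    # agreement at every expansionary stage, and only one side unfrozen
    # between consecutive ones.  Together they force the value to be
    # constant, so the value at the first stage is the final value.
    for (a0, b0), (a1, b1) in zip(pairs, pairs[1:]):
        assert a0 == b0 and a1 == b1   # agreement at expansionary stages
        assert a1 == a0 or b1 == b0    # at most one side changed
        # from these equalities, a1 == a0 follows in either case
    return pairs[0][0]
```

A list such as [(0, 0), (1, 1)] is rejected by the second assertion: both sides would have had to change between the two expansionary stages, which the alternating freezing forbids.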
In summary, the strategy for meeting N_e is to wait until an e-expansionary stage s, then impose a restraint u(e, s) on A, then wait until the next e-expansionary stage t, lift the restraint on A, and impose a restraint u(e, t) on B, and then continue in this way, alternating which set is restrained at each new e-expansionary stage. Note that there is an important difference between this strategy and the one employed in the proof of the Sacks Splitting Theorem. There we argued that the length of agreement associated with a given requirement could not go to infinity. Here, however, it may well be the case that lim_s l(e, s) = ∞, and hence lim_s u(e, s) = ∞. Thus, the restraints imposed by the strategy for N_e may tend to infinity. How do the R- and Q-requirements of weaker priority deal with this possibility? Consider a requirement R_i weaker than N_e. We have two strategies for R_i. One strategy guesses that there are only finitely many e-expansionary stages. This strategy picks a fresh large follower. Each time a new e-expansionary stage occurs, the strategy is initialized, which means that it must pick a new fresh large follower. Otherwise, the strategy acts exactly as described above. That is, it waits until its current follower x enters W_i, if ever, then puts x into A. The other strategy for R_i guesses that there are infinitely many e-expansionary stages. It picks a follower x. If x enters W_i at stage s, then it wants to put x into A, but it may be restrained from doing so by the strategy for N_e. However, if its guess is correct then there will be an e-expansionary stage t ≥ s at which the restraint on A is dropped. At that stage, x can be put into A. Coherence. When we consider multiple N-requirements, we run into a problem. Consider two requirements N_e and N_i, with the first having stronger priority. Let s_0 < s_1 < ··· be the e-expansionary stages and t_0 < t_1 < ··· be the i-expansionary stages.
During the interval [s_0, s_1), weaker strategies are prevented from putting certain numbers into A by N_e. At stage s_1, this restraint is lifted, but if t_0 ≤ s_1 < t_1, then at stage s_1 it will be N_i that prevents weaker strategies from putting certain numbers into A. When N_i drops that restraint at stage t_1, we may have a new restraint on A imposed by N_e (if say s_2 ≤ t_1 < s_3). Thus, although individually N_e and N_i each provide infinitely many stages at which they allow weaker strategies to put numbers into A, the two strategies together may conspire to block weaker strategies from ever putting numbers into A. This problem is overcome by having two strategies for N_i. The one guessing that there are only finitely many e-expansionary stages has no problems. The one guessing that there are infinitely many e-expansionary stages acts only at such stages, and in particular calculates its expansionary stages based only on such stages, which forces its expansionary stages to be nested within the e-expansionary stages. Thus the actions of N_e and N_i are forced to cohere. We now turn to the formal details of the construction. These details may at first appear somewhat mysterious, in that it may be unclear how they actually implement the strategies described above. However, the verification section should clarify things.

The Priority Tree. We use the tree T = {∞, f}^{<ω}. (Of course, this tree is the same as 2^{<ω}, but renaming the nodes should help in understanding the construction.) To each σ ∈ T, we assign strategies N_σ, R_σ, and Q_σ for N_{|σ|}, R_{|σ|}, and Q_{|σ|}, respectively. (It is only the N-requirements that really need to be on the tree, since only they have outcomes we care about, but this assignment is notationally convenient.) Again we use lexicographic ordering, with ∞ to the left of f.

Definition 2.14.2. We define the notions of σ-stage, m(σ, s), and σ-expansionary stage by induction on |σ|.

(i) Every stage is a λ-stage.

(ii) Suppose that s is a τ-stage. Let e = |τ|. Let m(τ, s) = max{l(e, t) : t < s is a τ-stage}. If l(e, s) > m(τ, s) then we say that s is τ-expansionary and declare s to be a τ⌢∞-stage. Otherwise, we declare s to be a τ⌢f-stage.

Let TP_s be the unique σ of length s such that s is a σ-stage.

Definition 2.14.3. We say that R_σ requires attention at a σ-stage s > |σ| if W_{|σ|,s} ∩ A_s = ∅ and one of the following holds.

(i) R_σ currently has no follower.

(ii) R_σ has a follower x ∈ W_{|σ|,s}.

The definition of Q_σ requiring attention is analogous.

Construction. At stage s, proceed as follows.

Step 1. Compute TP_s. Initialize all strategies attached to nodes to the right of TP_s. For the R- and Q-strategies, this initialization means that their followers are canceled.

Step 2. Find the strongest priority R- or Q-requirement that has a strategy requiring attention at stage s. Let us suppose that this requirement is R_e (the Q-requirement case being analogous), and that R_σ is the corresponding strategy requiring attention at stage s. (Note that s must be a σ-stage.) We say that R_σ acts at stage s. Initialize all strategies attached to nodes properly extending σ. If R_σ does not currently have a follower, appoint a fresh large follower for R_σ.
Otherwise, enumerate R_σ's follower into A. End of Construction.

Verification. Let the true path TP be the leftmost path of T visited infinitely often. In other words, let TP be the unique path of T such that σ ≺ TP iff ∃^∞ s (σ ≺ TP_s) ∧ ∃^{<∞} s (TP_s is to the left of σ).
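Definition 2.14.2 is effective given the length of agreement function, and can be rendered directly in code. The toy sketch below is an illustration, not the construction itself: the name `stage_guesses`, the outcome labels 'inf' and 'f', and the convention that an empty maximum is 0 are all choices made here.

```python
def stage_guesses(stages, l):
    """Compute TP_s for s < stages, given a length-of-agreement
    function l(e, s).  'inf' stands for the outcome ∞ and lies to
    the left of 'f'."""
    visits = {(): []}            # node -> stages at which it is visited
    history = []
    for s in range(stages):
        tau = ()
        for e in range(s):
            # m(tau, s): max length of agreement at earlier tau-stages
            prev = [l(e, t) for t in visits.get(tau, []) if t < s]
            m = max(prev, default=0)
            out = 'inf' if l(e, s) > m else 'f'   # tau-expansionary?
            tau += (out,)
            visits.setdefault(tau, []).append(s)  # s is a tau-stage
        visits[()].append(s)                      # every stage is a λ-stage
        history.append(tau)
    return history
```

For example, with a mock length of agreement that grows for e = 0 and stays at 0 for all other e, every stage is expansionary at the root and nowhere else, so the guesses settle on ('inf', 'f', 'f', ...).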
Lemma 2.14.4. Each R- and Q-requirement is met.

Proof. Consider R_e (the Q-requirement case being analogous). Let σ = TP↾e. Assume by induction that the strategies attached to proper prefixes of σ act only finitely often, and let s be the least stage such that

1. no strategy attached to a proper prefix of σ acts after stage s and

2. TP_t is not to the left of σ for all t > s.

By the minimality of s, it must be the case that R_σ is initialized at stage s. Thus, at the next σ-stage t > s, it will be assigned a follower x. Since R_σ cannot be initialized after stage t, this follower is permanent. By the way followers are chosen, x will not be put into A unless x enters W_e. If x enters W_e at stage u > t, then at the first σ-stage v ≥ u, the strategy R_σ will act, and x will enter A. In any case, R_σ succeeds in meeting R_e. Note that R_σ acts at most twice after stage s, and hence the induction can continue.

Lemma 2.14.5. Each N-requirement is met.

Proof. Consider N_e. Let σ = TP↾e. If σ⌢f ≺ TP then there are only finitely many e-expansionary stages, so we cannot have Φ_e^A = Φ_e^B total, and hence N_e is met. So suppose that σ⌢∞ ≺ TP and that Φ_e^A is total. We will show that Φ_e^A is computable. Let s be the least stage such that

1. no strategy attached to a prefix of σ (including σ itself) acts after stage s and

2. TP_t is not to the left of σ for all t > s.

To compute Φ_e^A(n), find the least σ⌢∞-stage t_0 > s such that l(e, t_0) > n. We claim that Φ_e^A(n) = Φ_e^A(n)[t_0]. Let t_0 < t_1 < ··· be the σ⌢∞-stages greater than or equal to t_0. Each such stage is e-expansionary, so we have Φ_e^A(n)[t_i] = Φ_e^B(n)[t_i] for all i. We claim that, for each i, we have either Φ_e^A(n)[t_{i+1}] = Φ_e^A(n)[t_i] or Φ_e^B(n)[t_{i+1}] = Φ_e^B(n)[t_i]. Assuming this claim, it follows easily that Φ_e^A(n)[t_0] = Φ_e^A(n)[t_1] = ···, which implies that Φ_e^A(n) = Φ_e^A(n)[t_0]. To establish the claim, fix i. At stage t_i, we initialize all strategies to the right of σ⌢∞.
Thus, any follower of such a strategy appointed after stage t_i must be larger than any number seen in the construction by stage t_i, and in particular larger than φ_e^A(n)[t_i] and φ_e^B(n)[t_i]. By the choice of s, strategies above or to the left of σ do not act after stage s. Thus, the only strategies that can put numbers less than φ_e^A(n)[t_i] into A or numbers less than φ_e^B(n)[t_i] into B between stages t_i and t_{i+1} are the ones associated with extensions of σ⌢∞. But such a strategy cannot act except at a σ⌢∞-stage, and at each stage at most one strategy gets to act. Thus, at most one strategy associated with an extension of σ⌢∞ can act between stages t_i
and t_{i+1}, and hence only one of A↾φ_e^A(n)[t_i] and B↾φ_e^B(n)[t_i] can change between stages t_i and t_{i+1}. So either A_{t_{i+1}}↾φ_e^A(n)[t_i] = A_{t_i}↾φ_e^A(n)[t_i] or B_{t_{i+1}}↾φ_e^B(n)[t_i] = B_{t_i}↾φ_e^B(n)[t_i], which implies that either Φ_e^A(n)[t_{i+1}] = Φ_e^A(n)[t_i] or Φ_e^B(n)[t_{i+1}] = Φ_e^B(n)[t_i]. As mentioned above, it follows that Φ_e^A(n) = Φ_e^A(n)[t_0].
These two lemmas show that all requirements are met, which concludes the proof of the theorem. Giving a full proof of the following result is a good exercise for those unfamiliar with the methodology above.

Theorem 2.14.6 (Ambos-Spies [5], Downey and Welch [135]). There is a noncomputable c.e. set A such that if C ⊔ D = A is a c.e. splitting of A, then the degrees of C and D form a minimal pair.

Proof sketch. The proof is similar to that of Theorem 2.14.1, using the length of agreement function

l(i, j, k, m, s) = max{x : ∀y < x [Φ_i^{W_k}(y)[s]↓ = Φ_j^{W_m}(y)[s]↓ ∧ ∀z ≤ max{φ_i^{W_k}(y)[s], φ_j^{W_m}(y)[s]} (W_k[s]↾z ∪ W_m[s]↾z = A_s↾z)]}.
2.14.3 High computably enumerable degrees

Another example of an infinite injury argument is the construction of an incomplete high c.e. degree. Recall that a c.e. degree a is high if a′ = 0″. We begin with a few definitions. Recall that A^[e] denotes the eth column {⟨e, n⟩ : ⟨e, n⟩ ∈ A} of A.

Definition 2.14.7. A set A is piecewise trivial if for all e, the set A^[e] is either finite or equal to N^[e]. A subset B of A is thick if for all e we have A^[e] =* B^[e] (i.e., the symmetric difference of A^[e] and B^[e] is finite).

Lemma 2.14.8. (i) There is a piecewise trivial c.e. set A such that A^[e] is infinite iff Φ_e is total.

(ii) If B is a thick subset of such a set A, then B is high.

Proof. (i) Let A = {⟨e, n⟩ : e ∈ N ∧ ∀k ≤ n Φ_e(k)↓}. Then A is c.e., and clearly A^[e] is infinite iff A^[e] = N^[e] iff Φ_e is total.

(ii) Define a reduction Γ by letting Γ^X(e, s) = 1 if ⟨e, s⟩ ∈ X and Γ^X(e, s) = 0 otherwise. If Φ_e is total then A^[e] = N^[e], so B^[e] is cofinite in N^[e], which implies that lim_s Γ^B(e, s) = 1. On the other hand, if Φ_e is not total then A^[e] is finite, so B^[e] is finite, which implies that lim_s Γ^B(e, s) = 0.
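The functional Γ of part (ii) is simple enough to sketch concretely. Here the pairing ⟨e, s⟩ is mocked by Python tuples and the c.e. set B by a finite set of pairs, so the limit can only be inspected up to a finite bound; this is an illustration of the definition, not an implementation of the reduction.

```python
def gamma(X, e, s):
    # Γ^X(e, s) = 1 if <e, s> is in X, and 0 otherwise
    return 1 if (e, s) in X else 0

# A finite mock-up of B: column 0 "cofinite" (full up to the bound),
# column 1 finite.
B = {(0, s) for s in range(100)} | {(1, 0), (1, 1)}

col0 = [gamma(B, 0, s) for s in range(100)]   # eventually (here: always) 1
col1 = [gamma(B, 1, s) for s in range(100)]   # eventually 0
```

The function e ↦ lim_s Γ^B(e, s) is then exactly the totality indicator that the limit lemma converts into a B′-computable function.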
Thus, the function f defined by f(e) = lim_s Γ^B(e, s) is total. By the relativized form of the limit lemma, f ≤_T B′. So B′ can decide whether a given Φ_e is total or not. By Theorem 2.6.5, ∅″ ≤_T B′. Thus, to construct an incomplete high c.e. degree, it suffices to prove the following result, which is a weak form of the Thickness Lemma discussed in the next section.

Theorem 2.14.9 (Shoenfield [355]). Let C be a noncomputable c.e. set, and let A be a piecewise trivial c.e. set. Then there is a c.e. thick subset B of A such that C ≰_T B.

Proof. We construct B ⊆ A to meet the following requirements for all e.

R_e: |A^[e] \ B^[e]| < ∞.

N_e: Φ_e^B ≠ C.

We give the intuition and formal details of the construction, and sketch out its verification, leaving some details to the reader. To meet R_e, we must make sure that almost all of the eth column of A gets into B. To meet N_e, we use the strategy already employed in the proof of the Sacks Splitting Theorem 2.12.1. That is, we measure the length of agreement

l(e, s) = max{n : ∀k < n (Φ_e^B(k)[s] = C_s(k))},

with the idea of preserving B_s on the use φ_e^B(n)[s] for all n < l(e, s). The problem comes from the interaction of this preservation strategy with the strategies for stronger priority R-requirements, since these may be infinitary. It might be the case that we infinitely often try to preserve an agreeing computation Φ_e^B(n)[s] = C_s(n), only to have some ⟨i, x⟩ enter A with i < e and ⟨i, x⟩ < φ_e^B(n)[s]. Since we must put almost every such pair into B to meet R_i, our strategy for meeting N_e might be injured infinitely often. On the other hand, we do know that R_i can do only one of two things. Either A^[i] is finite, and hence R_i stops injuring N_e after some time, or almost all of A^[i] is put into B. In the latter case, the numbers put into B for the sake of R_i form a computable set (since A is piecewise trivial). It is easy to adapt the strategy for N_e to deal with this case.
Let S be the computable set of numbers put into B for the sake of R_i. We can then proceed with the Sacks preservation strategy, except that we do not believe a computation Φ_e^B(k)[s] unless B_s already contains every element of S↾φ_e^B(k)[s]. This modification prevents the action taken for R_i from ever injuring the strategy for meeting N_e. Of course, we do not know which of the two possibilities for the action of R_i actually happens, so, as before, we will have multiple strategies for N_e, representing guesses as to whether or not A^[i] is finite. There are several
ways to organize this argument as a priority tree construction.⁸ We give the details of one such construction. The N-strategies do not have interesting outcomes, so we use the same tree T = {∞, f}^{<ω} as in the proof of the Minimal Pair Theorem 2.14.1, with the same lexicographic ordering as before (i.e., the ordering induced by regarding ∞ as being to the left of f). To each σ ∈ T of length e we associate strategies R_σ and N_σ for meeting R_e and N_e, respectively.

Definition 2.14.10. We define the notions of σ-stage, σ-believable computation, σ-restraint r(σ, s), σ-expansionary stage, and σ-length of agreement l(σ, s) by induction on |σ| and s as follows. Every stage is a λ-stage, and r(σ, 0) = 0 for all σ. As usual, if we do not explicitly define r(σ, s) at stage s, then r(σ, s) = r(σ, s − 1). Suppose that s > 0 is a σ-stage, and let e = |σ|. A computation Φ_e^B(k)[s] = C_s(k) is σ-believable if for all τ⌢∞ ⪯ σ and all x, if r(τ, s) < ⟨|τ|, x⟩ < φ_e^B(k)[s], then ⟨|τ|, x⟩ ∈ B_s. Let

l(σ, s) = max{n : ∀k < n (Φ_e^B(k)[s] = C_s(k) via a σ-believable computation)}.

Say that s is σ-expansionary if l(σ, s) > max{l(σ, t) : t < s}. Let r(σ, s) = max({r(τ, s) : τ ⪯ σ} ∪ {φ_e^B(k)[s] : k < l(σ, s)}). If |A_s^[e]| > |A_t^[e]| for every σ-stage t < s, then s is a σ⌢∞-stage. Otherwise, s is a σ⌢f-stage.

Construction. The construction is now rather simple. At stage s, let σ_s be the unique string of length s such that s is a σ_s-stage. For each e < s and each ⟨e, x⟩ ∈ A_s that is not yet in B_s, if ⟨e, x⟩ > r(σ_s↾e, s) then put ⟨e, x⟩ into B. End of Construction.

We now sketch the verification that this construction succeeds in meeting all requirements. As usual, the true path TP of the construction is the leftmost path visited infinitely often. Let σ ∈ TP, let e = |σ|, and assume by induction that lim_t r(τ, t) is well-defined for all τ ≺ σ. Let s be a stage such that

1. r(τ, t) has reached a limit r(τ) by stage s for every τ ≺ σ,

2. the construction never moves to the left of σ after stage s,

3. B↾r(τ) = B_s↾r(τ) for every τ ≺ σ, and

⁸ It is often the case in an infinite injury construction that we have substantive choices in defining the priority tree. In particular, some authors prefer to encode as much information as possible about the behavior of a strategy into its outcomes, while others prefer to encode only the information that weaker strategies need to have. It is worth keeping this fact in mind when reading such constructions.
4. B_s^[i] = B^[i] for all i < e such that σ(i) = f.

Note that item 4 makes sense because if σ ∈ TP and σ(i) = f, then there is a τ of length i such that τ⌢f ∈ TP, which means that A^[i] is finite, and hence so is B^[i]. It is now not hard to argue, as in the proof of the Sacks Splitting Theorem, that if t > s is a σ-stage and n < l(σ, t), then the computation Φ_e^B(n)[t] is preserved forever. (The key fact is that this computation is σ-believable, and hence cannot be injured by stronger priority strategies.) Again as in the proof of the Sacks Splitting Theorem, it must be the case that lim_t l(σ, t) exists, and hence N_e is met. It is also not hard to argue now that r(σ) = lim_t r(σ, t) is well-defined. Let u be a stage by which this limit has been reached. Then every ⟨e, x⟩ > r(σ) that enters A after stage u is eventually put into B, and hence R_e is met. A good exercise for those new to the techniques presented in this section is to combine the constructions of this subsection and the previous one to build a minimal pair of high c.e. degrees.
2.14.4 The Thickness Lemma

Theorem 2.14.9 is only a weak form of the real Thickness Lemma of Shoenfield [355]. To state the full version, we need a new definition.

Definition 2.14.11. A set A is piecewise computable if A^[e] is computable for each e.

Theorem 2.14.12 (Thickness Lemma, Shoenfield [355]). Let C be a noncomputable c.e. set, and let A be a piecewise computable c.e. set. Then there is a c.e. thick subset B of A such that C ≰_T B.

Proof Sketch. We briefly sketch how to modify the proof of Theorem 2.14.9. Recall that in that result, we assumed that A was piecewise trivial. Now we have the weaker assumption that A is piecewise computable, that is, every column of A is computable. Thus, for each e, there is a c.e. set W_{g(e)} ⊆ N^[e] such that W_{g(e)} is the complement of A^[e] in N^[e] (meaning that W_{g(e)} ⊔ A^[e] = N^[e]). The key to how we used the piecewise triviality of A in Theorem 2.14.9 was in the definition of σ-believable computation. Recall that for σ with e = |σ|, we said that a computation Φ_e^B(k)[s] = C_s(k) at a σ-stage s was σ-believable if for all τ⌢∞ ⪯ σ and all x, if r(τ, s) < ⟨|τ|, x⟩ < φ_e^B(k)[s], then ⟨|τ|, x⟩ ∈ B_s. This definition relied on the fact that, if τ⌢∞ ∈ TP, then every ⟨|τ|, x⟩ > r(τ, s) was eventually put into B. The corresponding fact here is that every ⟨|τ|, x⟩ > r(τ, s) that is not in W_{g(|τ|)} is eventually put into B.
So in order to adjust the definition of σ-believable computation, for each τ ≺ σ, we need to know an index j such that W_j = W_{g(|τ|)}. Since we cannot compute such a j from |τ|, we must guess it along the tree of strategies. That is, for each τ, instead of the outcomes ∞ and f, we now have an outcome j for each j ∈ N, representing a guess that W_j = W_{g(|τ|)}. The outcome j is taken to be correct at a stage s if j is the least number for which we see the length of agreement between N^[|τ|] and A^[|τ|] ∪ W_j increase at stage s. Now a computation Φ_e^B(k)[s] = C_s(k) at a σ-stage s is σ-believable if for all τ⌢j ⪯ σ and all x, if r(τ, s) < ⟨|τ|, x⟩ < φ_e^B(k)[s], then ⟨|τ|, x⟩ ∈ B_s ∪ W_j. The rest of the construction is essentially the same as before. One important point is that, because the tree of strategies is now infinitely branching, the existence of the true path is no longer automatically guaranteed. However, if τ is visited infinitely often then τ⌢j is visited infinitely often for the least j such that W_j = W_{g(|τ|)}, from which it follows that TP is well-defined. There are infinite injury constructions in which showing that TP is well-defined requires a careful proof. We make some remarks for those who, like the senior author, were brought up with the "old" techniques using the "hat trick" and the "window lemma" of Soare [366]. It is rather ironic that the very first result that used the infinite injury method in its proof was the Thickness Lemma, since, like the Density Theorem of the next section, it has a proof not using priority trees that is combinatorially much easier to present. In a sense, this fact shows an inherent shortcoming of the tree technique in that often more information needs to be represented on the tree than is absolutely necessary for the proof of the theorem. To demonstrate this point, and since it is somewhat instructive, we now sketch the original proof of the Thickness Lemma.
We have the same requirements, but we define Φ̂_e^B(n)[s] to be Φ_e^B(n)[s] unless some number less than φ_e^B(n)[s] enters B at stage s, in which case we declare that Φ̂_e^B(n)[s]↑. This is called the hat convention. Using this convention, we can generate the hatted length of agreement l̂(e, s) with Φ̂ in place of Φ, and so forth. Finally, we define the restraint function

r̂(e, s) = max{φ̂_e^B(n)[s] : n < l̂(e, s)}.

The construction is to put ⟨e, x⟩ ∈ A_s into B at stage s if it is not yet there and ⟨e, x⟩ > max{r̂(j, s) : j ≤ e}. To verify that this construction succeeds in meeting all our requirements, it is enough to show that, for each e, the injury set

I_{e,s} = {x : ∃v ≤ s (x ≤ r̂(e, v) ∧ x ∈ B_{s+1} \ B_v)}

is computable, B^[e] =* A^[e], and N_e is met, all by simultaneous induction, the key idea being the "window lemma", which states that the liminf of the restraint R̂(e, s) = max{r̂(j, s) : j ≤ e} is finite. The key point is
that to verify these facts we do not actually need to know during the construction what the complement of A^[e] is. This knowledge is used only in the verification. See Soare [366, Theorem VIII.1.1] for more details. There is a strong form of the Thickness Lemma that is implicit in the work of Lachlan, Robinson, Shoenfield, Sacks, Soare, and others. We write A^[<e] for ⋃_{i<e} A^[i].

Theorem 2.14.13 (Strong Thickness Lemma, see Soare [366]). Let C be such that ∅ <_T C
2.15 The Density Theorem
One of the classic applications of the infinite injury method is to extend the Friedberg-Muchnik Theorem to show that not only are there intermediate c.e. degrees, but in fact the c.e. degrees form a dense partial ordering. It is possible to give a short and elegant proof of this result without using priority trees (see [366]), but it is quite hard to understand how that proof works. Therefore, in this section we give a proof using trees.
Theorem 2.15.1 (Sacks Density Theorem [344]). For any c.e. sets B <_T C, there is a c.e. set A such that B <_T A <_T C.
Additionally, we must meet the global permitting requirement A_0, A_1 ≤_T C and the global coding requirement B ≤_T A_0, A_1. Together with the incomparability of the A_i, these global requirements imply that B <_T A_i <_T C for i = 0, 1.
The solution to this dilemma is to notice that this ugly situation can happen only if the values of φ_e^{A_0}(n(i))[s] are unbounded for some i. If we knew this to be the case, then we could do the construction ignoring R_{2e}, as it would automatically be satisfied. Obviously, though, we cannot know whether this situation happens ahead of time. However, we can have an infinitary outcome for R_{2e}, which guesses that this use is unbounded. The trick is to make the effect on the construction of this outcome computable, hence allowing strategies below it to live with the action of R_{2e} if it truly has this infinitary outcome. To make this idea more precise, we begin by presenting the basic module for the R_{2e} strategy described above, in the absence of stronger priority requirements. (Thus the single strategy for R_0, the strongest requirement, acts exactly as follows.) The basic module for R_{2e+1} strategies is of course symmetric.
1. If all currently defined followers are realized, then choose a large fresh odd follower n(i).
2. If Φ_e^{A_0}(n(i))[s]↓ = 0, then declare n(i) to be realized and initialize weaker strategies.
3. If n(i) is realized and i enters C, then put n(i) into A_1. In this case, stop choosing new followers unless n(i) later becomes unrealized.
4. If n(i) is realized and the coding of B causes A_0 to change below φ_e^{A_0}(n(i))[t] at stage t, then, for the least i for which this is the case at this stage, cancel all n(j) for j > i (which will have different values if defined again later) and declare n(i) to be unrealized.
Note that at each stage s, the strategy for R_{2e} has realized followers n(0), . . . , n(i − 1) and an unrealized follower n(i). For each realized follower n(j), we have Φ_e^{A_0}(n(j))[s]↓. To see that the above module succeeds in satisfying a single requirement R_{2e}, suppose not, so that Φ_e^{A_0} = A_1. Then each n(i) is eventually defined and reaches a final value, since all uses of the form φ_e^{A_0}(n(i)) eventually settle. Moreover, we can compute these final values using B, since it is only the coding of B into A_0 that can change these uses. Now we claim that B can compute C, contrary to hypothesis. Given i, we can use B to compute a stage s at which n(i) reaches its final value and becomes permanently realized. If i were to enter C after stage s, then n(i) would enter A_1, ensuring that Φ_e^{A_0} ≠ A_1. Thus i ∈ C iff i ∈ C_s. The above also shows that this module has two possible kinds of outcomes. Either φ_e^{A_0}(n(i)) is unbounded for some i, or Φ_e^{A_0}(n(i)) ≠ A_1(n(i)) for some i. We denote the former outcome by (i, u) and the latter by (i, f). We use these outcomes to generate the priority tree T = {(i, u), (i, f) : i ∈ N}^{<ω}, with a lexicographic ordering inherited from the ordering of outcomes (i, u) < (i, f) < (i + 1, u) < (i + 1, f) < · · · .
We now discuss how weaker priority requirements can live in the environments provided by R_{2e}, by considering the case e = 0. The markers n(i) for the R_0 strategy will now be denoted by n(λ, i) (since that strategy is associated with the empty sequence in T). There are infinitely many strategies for R_1, one for each outcome. We denote the strategy below outcome σ by R_σ and its markers by n(σ, i). A strategy living below (i, f) guesses that the strategy for R_0 has finite action. Hence it can be initialized each time we realize n(λ, j) for j ≥ i, and simply implements the basic module. A strategy R_σ living below σ = (i, u) guesses that n(λ, i) comes to a final value but no n(λ, j) for j > i does, because we keep changing A_0 below φ_0^{A_0}(n(λ, i))[s] for the sake of coding B. As in the Thickness Lemma, this strategy will believe a Φ_0^{A_1}-computation only if the current value of n(λ, i + 1) (and hence every n(λ, j) for j > i) is greater than the use of that computation. However, there is a problem here. The strategy R_σ may want to put numbers into A_0 smaller than uses that the R_0 strategy is trying to preserve. For instance, say a follower n(σ, j) is appointed, then followers n(λ, k) with k > i are appointed and become realized with corresponding uses larger than n(σ, j), then n(σ, j) becomes realized and permitted by C. If the guess that (i, u) is the true outcome of the R_0 strategy is correct, then there is no problem. We can put n(σ, j) into A_0 and still satisfy R_0. But otherwise, we cannot afford to allow the enumeration of n(σ, j) to injure R_0. We now come to the main idea of this proof, Sacks’ notion of delayed permitting: Since R_σ is guessing that (i, u) is the true outcome of the R_0 strategy, it believes that any use larger than n(σ, j) that the R_0 strategy is trying to preserve will eventually be violated by the coding of B into A_0.
So once n(σ, j) is realized and C-permitted, instead of immediately putting n(σ, j) into A_0, we merely promise to put it there if these uses do go away; that is, if R_σ becomes accessible again before n(σ, j) is initialized. This delayed permitting still allows C to compute A_0, because C can compute B, and hence know whether the relevant uses will ever be violated. That is, given a realized follower n(σ, j), the set C knows whether j ∈ C. If not, then n(σ, j) never enters A_0. Otherwise, C can check what the B-conditions are for R_σ to be accessible again, find out whether these conditions will ever obtain, and if so, go to that stage of the construction to check whether n(σ, j) has not been initialized in the meantime, which is the only case in which n(σ, j) enters A_0. It is a subtle fact that, while the true path of the construction remains computable only from 0′, the fate of a particular follower is always computable from C. Furthermore, we can still argue that, if σ = (i, u) is in fact the true outcome of the R_0 strategy and R_σ fails, then B can compute C, in much the same way as before, since every C-permitted follower eventually has no restraint placed on it.
Construction. We code B by enumerating 2n into A_0 and A_1 whenever n enters B. At a stage s, the construction proceeds in substages t ≤ s. At each substage t, some node α(s, t) of T will be accessible, beginning with α(s, 0) = λ. At the end of the stage (which might come before substage s), we will have defined a node α_s. At substage t, the strategy R_{α(s,t)} for R_t will act. Let us assume that t = 2e, the odd case being symmetric, and let α = α(s, t). We say that s is an α-stage. If n(α, 0) is currently undefined, then proceed as follows. Define n(α, 0) to be a fresh large odd number. Let α_s = α(s, t + 1) = α(s, t)⌢(0, f), initialize all τ >_L α_s, and end the stage. We say that the computation Φ_e^{A_0}(x)[s] is α-correct if for all β⌢(j, u) ⪯ α, if n(β, j) is currently defined, then it is greater than φ_e^{A_0}(x)[s]. Let
ℓ(α, s) be the α-correct length of agreement at stage s, that is, the largest m such that for all x < m, we have Φ_e^{A_0}(x)[s] = A_1(x)[s] via an α-correct computation. Adopt the first among the following cases that obtains.
1. For some i with n(α, i) realized (defined below), i has entered C since the last α-stage, and the α-correct length of agreement has exceeded n(α, i) since the last α-stage. Fix the least such i. Let α_s = α(s, t + 1) = α(s, t)⌢(i, f), and initialize all τ >_L α_s. Put n(α, i) into A_0 and end the stage.
2. For some i with n(α, i) realized, the computation Φ_e^{A_0}(n(α, i)) has been injured since the previous α-stage. Fix the least such i. Declare n(α, i) to be unrealized. Cancel all n(α, j) for j > i. Let α(s, t + 1) = α⌢(i, u). Initialize all τ with τ >_L α(s, t + 1) and α(s, t + 1) ⊀ τ. If t < s then proceed to the next substage. Otherwise let α_s = α(s, t + 1) and end the stage.
3. For some unrealized n(α, i), we have ℓ(α, s) > n(α, i). Fix the least such i. Declare n(α, i) to be realized, define n(α, i + 1) to be a fresh large odd number, and cancel all n(α, j) for j > i + 1. Let α_s = α(s, t + 1) = α⌢(i + 1, f). Initialize all τ >_L α_s and end the stage.
4. Otherwise. Find the least (necessarily unrealized) n(α, i) that is defined. Let α(s, t + 1) = α⌢(i, f). Initialize all τ with τ >_L α(s, t + 1) and α(s, t + 1) ⊀ τ. If t < s then proceed to the next substage. Otherwise let α_s = α(s, t + 1) and end the stage.
End of Construction.
Verification. We first show that the true path of the construction is well-defined, and all requirements are met. We say that α is on the true path if there are infinitely many α-stages but only finitely many s with α_s <_L α.
Lemma 2.15.2. Let α be on the true path, and let γ be its true outcome. Then R_α succeeds in meeting R_{|α|}, there are infinitely many α-stages where the stage does not end at substage |α|, and R_α initializes strategies below α⌢γ only finitely often.
Proof. Assume by induction that the lemma holds for strategies above α. Then, since α can be initialized only when α_s <_L α, there are infinitely many α-stages s at which α(s, |α| + 1) = α⌢γ, that is, at which the stage does not end at substage |α|. If γ = (i, f), then the R_α-module acts only finitely often. Thus, by induction, for each e there is an α of length e such that there are infinitely many α-stages but only finitely many s with α_s <_L α. Now suppose that β⌢(k, o) ⪯ α for some k and o ∈ {f, u}. First suppose that o = f, and suppose there is a least β⌢(k, f)-stage t > s. If the construction moves to the left of β⌢(k, f) between stages s and t then α is initialized, so n ∉ A_0. Otherwise, either case 1 or case 3 above holds for β at stage t, so again α is initialized and n ∉ A_0. Of course, if there is no β⌢(k, f)-stage after s, then n ∉ A_0. Thus, in this case, we know that n ∉ A_0 no matter what happens after stage s. Now suppose that β⌢(k, u) ⪯ α for some k. After stage s, there can be an α-stage only if there is a β⌢(k, u)-stage. Now C can use B to figure out whether the use of the relevant computation Φ_e^{A_i}(n(β, k)) is B-correct. If it is B-correct, then n ∉ A_0, since the only way for there to be a further β⌢(k, u)-stage is for a strategy above or to the left of β⌢(k, u) to enumerate something into A_i, which causes α to be initialized. If not, then go to a stage t where B changes below this use. If t is not a β⌢(k, u)-stage and n(α, i) is not canceled at stage t, then α_t lies to the left of β. Let β′ be the longest common initial
segment of β and α_t. We can now repeat the above argument with β′ in place of β. It is a straightforward exercise to show that, proceeding in this manner, we eventually reach a stage at which we either can be sure that n ∉ A_0, or we have an α-stage, at which point n enters A_0. Since we do so using B and B ≤_T C, we have A_0 ≤_T C. The two lemmas above complete the proof of the theorem.
2.16 Jump theorems
We will see that the jump operator plays an important role in the study of algorithmic randomness. In this section, we look at several classic results about the range of the jump operator, whose proofs are combinations of techniques we have already seen. Of course, if d = a′ for some a, then d ≥ 0′. The following result is a converse to this fact.
Theorem 2.16.1 (Friedberg Completeness Criterion [160]). If d ≥ 0′ then there is a degree a such that a′ = a ∨ 0′ = d.
Proof. This is a finite extension argument. Let D ∈ d. We construct a set A in stages, and let a = deg(A). At stage 0, let A_0 = λ. At odd stages s = 2e + 1, we force the jump (i.e., decide whether e ∈ A′). We ask ∅′ whether there is a σ ⪰ A_{s−1} such that Φ_e^σ(e)↓. If so, we search for such a σ and let A_s = σ. If not, we let A_s = A_{s−1}. Note that, in this case, we have ensured that Φ_e^A(e)↑. At even stages s = 2e + 2, we code D(e) into A by letting A_s = A_{s−1}⌢D(e). This concludes the construction. We have A ⊕ ∅′ ≤_T A′ (which is always the case for any set A), so we can complete the proof by showing that A′ ≤_T D and D ≤_T A ⊕ ∅′. Since ∅′ ≤_T D, we can carry out the construction computably in D. To decide whether e ∈ A′, we simply run the construction until stage 2e + 1. At this stage, we decide whether e ∈ A′. Thus A′ ≤_T D. A ∅′ oracle can tell us how to obtain A_{2e+1} given A_{2e}, while an A oracle can tell us how to obtain A_{2e+2} given A_{2e+1}, so using A ⊕ ∅′, we can compute A_{2n+2} for any given n. Since D(n) is the last element of A_{2n+2}, we can compute D(n) using A ⊕ ∅′. Thus D ≤_T A ⊕ ∅′.
The above proof should be viewed as a combination of a coding argument and the construction of a low set, done simultaneously. This kind of combination is a recurrent theme in the proofs of results such as the ones in this section. The following extension of the above result is due to Posner and Robinson [315]. We give a proof due to Jockusch and Shore [194].
Theorem 2.16.2 (Posner and Robinson [315]). If c > 0 and d ≥ 0′ ∨ c then there is a degree a such that a′ = a ∨ c = d.
Proof. By Proposition 2.13.1, there is an immune C ∈ c. (Recall that the complement of a simple set is immune.) Let D ∈ d. We build a set A by finite extensions. Let A_0 = λ. Given A_n, proceed as follows. Let S_n = {x : ∃σ ⪰ A_n0^x1 (Φ_n^σ(n)↓)}. If S_n is finite then there is an x ∈ C \ S_n. In this case, let A_{n+1} = A_n0^x1D(n). If S_n is infinite then, since S_n is c.e. and C is immune, there is an x ∈ S_n \ C. In this case, let s be least such that ∃σ ⪰ A_n0^x1 (Φ_n^σ(n)[s]↓), let σ be the length-lexicographically least string witnessing this existential, and let A_{n+1} = σD(n). It is easy to see that from the sequence A_0, A_1, . . . we can compute the set A′, the set A ⊕ C, and the set D. Thus, it is enough to show that this sequence can be computed from each of these three sets. Suppose we have computed A_n. Since ∅′ ⊕ C ≤_T D, using D we can compute S_n and C and hence determine A_{n+1}. If we have A′, then we have A, so we can determine the x such that A_n0^x1 ≺ A. Then using ∅′ we can determine whether x ∈ S_n, and hence determine A_{n+1}. If we have A ⊕ C, then again we can determine the x such that A_n0^x1 ≺ A. By construction, x ∈ S_n iff x ∉ C, so we can then use C to determine A_{n+1}.
It is not hard to show that if A ≤_T ∅′ then A′ is c.e. in ∅′. Since also ∅′ ≤_T A′, we say that A′ is computably enumerable in and above ∅′, abbreviated by CEA(∅′). The following theorem is an elaboration of Theorem 2.16.1.
Theorem 2.16.3 (Shoenfield Jump Inversion Theorem [354]). If D is CEA(∅′) then there is an A ≤_T ∅′ such that A′ ≡_T D.
We will briefly sketch a proof of the following stronger result. (We restrict ourselves to a sketch because, from a modern viewpoint, the proof contains no new ideas beyond what we have seen so far. See Soare [366] for a full proof.)
Theorem 2.16.4 (Sacks Jump Inversion Theorem [341]). If D is CEA(∅′) then there is a c.e. set A such that A′ ≡_T D.
Proof sketch. Let D be CEA(∅′). Then D is Σ⁰₂, so there is an approximation {D_s}_{s∈ω} such that n ∈ D iff there is an s such that n ∈ D_t for all t > s. Arguing as in the proof of Lemma 2.14.8, we have a c.e. set B such that B^{[e]} is an initial segment of N^{[e]} and equals N^{[e]} iff e ∉ D. We need to make the jump of A high enough to compute D, which we do by meeting for each e the requirement P_e : A^{[e]} =* B^{[e]}.
As in Lemma 2.14.8, this action ensures that A′ can compute D. We also need to keep the jump of A computable in D, which we do by controlling the jump as in Theorem 2.11.2. For P_e, we enumerate all elements of B^{[e]} into A as long as we are not restrained from doing so. This requirement has nodes on the priority tree T with outcomes {∞, f}. The first outcome corresponds to the case B^{[e]} = N^{[e]}, and the second to the case B^{[e]} =* ∅. Naturally, in the end we need to argue that if the outcome on the true path is ν⌢∞, then almost all of B^{[e]} is enumerated into A. We refer to P_e nodes as coding nodes. A strategy N corresponding to a lowness requirement has a finite number of coding nodes above it with which it needs to contend. The strategy N does not have to worry about coding nodes with outcome f above it. But suppose there is a coding node P corresponding to P_e with outcome ∞ above N. Then N assumes that all of N^{[e]} enters A, except for numbers below the maximum r of the restraints imposed by strategies above P. Thus N does not believe any computation until it has seen all numbers in N^{[e]} between r and the use of the computation enter A^{[e]}. The full construction works in the standard inductive way. As with the density theorem, while D cannot figure out the true path of the construction, it can sort out the fate of a given coding marker, by deciding whether a particular restraint is B-correct. The above result can be extended to higher jumps as follows.
Theorem 2.16.5 (Sacks [341]). If D is CEA(∅^{(n)}) then there is a c.e. set A such that A^{(n)} ≡_T D.
Proof. This result is proved by induction. Suppose that D is CEA(∅^{(n+1)}). Then we know that there is a set B that is CEA(∅^{(n)}) with B′ ≡_T D, by the relativized form of Theorem 2.16.4. By induction, there is a c.e. set A with A^{(n)} ≡_T B. Then A^{(n+1)} ≡_T D. There are even extensions of the above result to transfinite ordinals in place of n.
We will later have occasion to use the fact that the proof of the Sacks Jump Inversion Theorem is uniform. That is, an index for the set A in Theorem 2.16.5 can be found effectively from an index for the c.e. operator enumerating D from ∅^{(n)}. The Sacks Jump Inversion and Density Theorems have been extraordinarily influential. They have been extended and generalized in many ways. Roughly speaking, any “reasonable” set of requirements on the ordering of a collection of c.e. degrees and their jumps that does not explicitly contradict obvious partial ordering considerations (such as a ≤ b ⇒ a′ ≤ b′) is realizable. Here is one example.
Theorem 2.16.6 (Jump Interpolation Theorem, Robinson [333]). Let C and D be c.e. sets with C
2.17 Hyperimmune-free degrees
The notions of hyperimmune and hyperimmune-free degree have many applications in the study of computability theory and its interaction with algorithmic randomness. There are several ways to define these notions. We will adopt what is probably the simplest definition, using the concept of domination. Recall that a function f dominates a function g if f(n) ≥ g(n) for almost all n. It is sometimes technically useful to work with the following closely related concept. A function f majorizes a function g if f(n) ≥ g(n) for all n.
Definition 2.17.1 (Miller and Martin [282]). A degree a is hyperimmune if there is a function f ≤_T a that is not dominated by any computable function (or, equivalently, not majorized by any computable function). Otherwise, a is hyperimmune-free.
While 0 is clearly hyperimmune-free, all other Δ⁰₂ degrees are hyperimmune, as shown by the following result.
Proposition 2.17.2 (Miller and Martin [282]). If a < b ≤ a′ for some a, then b is hyperimmune. In particular, every nonzero degree below 0′, and hence every nonzero c.e. degree, is hyperimmune.
Proof. We do the proof for the case a = 0. The general result follows by a straightforward relativization. Let B ∈ b, so that B is a noncomputable Δ⁰₂ set, and let {B_s}_{s∈ω} be a computable approximation to B. Define g ≤_T B by letting g(n) be the least s ≥ n such that B_s ↾ n = B ↾ n. Suppose for a contradiction that some computable function h majorizes g. To compute B ↾ m, find an n > m such that B_t ↾ m = B_n ↾ m for all t ∈ [n, h(n)]. Such an n must exist because there is a stage at which the approximation to B ↾ m stabilizes. By the definition of g and the choice of h, we have g(n) ∈ [n, h(n)], so B ↾ m = B_{g(n)} ↾ m = B_n ↾ m. Thus B is computable, which is a contradiction.
On the other hand, there do exist nonzero hyperimmune-free degrees.
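The recovery argument in the proof of Proposition 2.17.2 can be simulated on finite toy data. Everything below (the set B_TRUE, the approximation B_approx, the majorizer h) is a hypothetical stand-in chosen so that the simulation terminates; the real objects are only ∅′-computable.

```python
# Toy simulation of the Miller-Martin argument: if a computable h
# majorizes the settling function g of an approximable set B, then B
# is computable. B_TRUE, B_approx, and h are toy stand-ins.

B_TRUE = {1, 4, 6}

def B_approx(s, n):
    """Computable approximation B_s(n); membership of n settles at stage 2n."""
    return n in B_TRUE if s >= 2 * n else False

def g(n):
    """Settling function: least s >= n with B_s agreeing with B below n."""
    s = n
    while any(B_approx(s, k) != (k in B_TRUE) for k in range(n)):
        s += 1
    return s

def h(n):
    """Computable majorizer of g (valid here, since g(n) <= 2n + 1)."""
    return 2 * n + 1

def compute_B_below(m):
    """Compute B below m using only B_approx and h, as in the proof:
    find n > m with B_t and B_n agreeing below m for all t in [n, h(n)];
    then B agrees with B_n below m, since g(n) lands in [n, h(n)]."""
    n = m + 1
    while not all(B_approx(t, k) == B_approx(n, k)
                  for t in range(n, h(n) + 1) for k in range(m)):
        n += 1
    return {k for k in range(m) if B_approx(n, k)}

print(compute_B_below(8))  # recovers B_TRUE below 8: {1, 4, 6}
```

The point of the sketch is that compute_B_below never consults B_TRUE directly; it only samples the approximation on the window [n, h(n)] that h provides.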
Theorem 2.17.3 (Miller and Martin [282]). There is a nonzero hyperimmune-free degree.
Proof. We define a noncomputable set A of hyperimmune-free degree, using a technique known as forcing with computable perfect trees. A function tree is a function T : 2^{<ω} → 2^{<ω} such that T(σ0) and T(σ1) are incompatible extensions of T(σ). For a function tree T, let [T] be the collection of all X for which there is a B ∈ 2^ω such that T(σ) ≺ X for all σ ≺ B. We build a sequence {T_i}_{i∈ω} of computable function trees such that [T_0] ⊇ [T_1] ⊇ [T_2] ⊇ · · ·. Since each [T_i] is closed and nonempty, ⋂_n [T_n] ≠ ∅. We will take A to be any element of this intersection. At stage 2e + 1, we ensure that A ≠ W_e, while at stage 2e + 2, we ensure that Φ_e^A is majorized by a computable function. At stage 0, we let T_0 be the identity map. At stage 2e + 1, we are given T_{2e}. Let i ∈ {0, 1} be such that T_{2e}(i) ≠ W_e ↾ |T_{2e}(i)|. Since T_{2e}(0) and T_{2e}(1) are incompatible, such an i must exist. Now define T_{2e+1} by letting T_{2e+1}(σ) = T_{2e}(iσ). Notice that [T_{2e+1}] consists exactly of those elements of [T_{2e}] that extend T_{2e}(i), and hence, if A ∈ [T_{2e+1}] then A ≠ W_e. At stage 2e + 2, we are given a computable function tree T_{2e+1}. It suffices to build T_{2e+2} to ensure that either (i) if A ∈ [T_{2e+2}] then Φ_e^A is not total, or (ii) there is a computable function f such that if A ∈ [T_{2e+2}] then Φ_e^A(n) ≤ f(n) for all n.
First suppose there are n and σ such that Φ_e^{T_{2e+1}(τ)}(n)↑ for all τ ⪰ σ. Then define T_{2e+2} by letting T_{2e+2}(ν) = T_{2e+1}(σν). This definition clearly ensures that if A ∈ [T_{2e+2}] then Φ_e^A(n)↑. Now suppose there are no such n and σ. Then define T_{2e+2} as follows. For the empty string λ, search for a σ_λ such that Φ_e^{T_{2e+1}(σ_λ)}(0)↓ and define T_{2e+2}(λ) = T_{2e+1}(σ_λ). Having defined σ_τ, search for incompatible extensions σ_{τ0} and σ_{τ1} of σ_τ such that Φ_e^{T_{2e+1}(σ_{τi})}(|τi|)↓ for i = 0, 1, and define T_{2e+2}(τi) = T_{2e+1}(σ_{τi}). This process ensures that for all n and all τ of length n, we have Φ_e^{T_{2e+2}(τ)}(n)↓. Furthermore, it ensures that T_{2e+2} is computable, so that we can define a computable function f by letting f(n) = max{Φ_e^{T_{2e+2}(τ)}(n) : |τ| = n}. Thus if A ∈ [T_{2e+2}] then Φ_e^A(n) ≤ f(n). Note that the above construction can be carried out effectively using ∅′′ as an oracle, so there are nonzero Δ⁰₃ hyperimmune-free degrees. It is also possible to adapt the construction to prove the following result.
Theorem 2.17.4 (Miller and Martin [282]). There are continuum many hyperimmune-free degrees.
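The function-tree bookkeeping and the odd-stage diagonalization step of the proof of Theorem 2.17.3 can be sketched with finite stand-ins. W_char below is a hypothetical initial segment of the characteristic string of W_e, chosen only for illustration:

```python
# Sketch of function trees and the odd-stage step of Theorem 2.17.3.
# A function tree T maps binary strings to binary strings, with
# T(s+'0') and T(s+'1') incompatible extensions of T(s).

def identity_tree(s):
    """T_0: the identity map on binary strings."""
    return s

def restrict(T, prefix):
    """T'(s) = T(prefix + s): [T'] is the part of [T] extending T(prefix)."""
    return lambda s: T(prefix + s)

def diagonalize(T, W_char):
    """Stage 2e+1: pick i with T(i) differing from W_e on its length,
    so every element of the new class differs from W_e."""
    for i in ('0', '1'):
        if T(i) != W_char[:len(T(i))]:
            return restrict(T, i)
    # unreachable: T('0') and T('1') are incompatible, so one must differ
    raise AssertionError

W_char = '10110'                 # toy initial segment of W_e
T1 = diagonalize(identity_tree, W_char)
print(T1(''))                    # root of T1 already disagrees with W_e
```

The even-stage step would be handled analogously by searching the image of the given tree for convergent computations, which is where the ∅′′ oracle is needed in the real construction.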
On the other hand, we will see in Section 8.21.1 that the class of sets of hyperimmune-free degree has measure 0. The name “hyperimmune degree” comes from the notion of a hyperimmune set, which we now define.⁹ A strong array is a computable collection of pairwise disjoint finite sets {F_i}_{i∈N} (which means not only that the F_i are uniformly computable, but also that the function i → max F_i is computable). An infinite set A is hyperimmune if for all strong arrays {F_i}_{i∈N}, there is an i such that F_i ∩ A = ∅. A c.e. set is hypersimple if its complement is hyperimmune. Given the terminology, one would expect that a hyperimmune degree is one that contains a hyperimmune set. We will show that this is indeed the case by first introducing another equivalent characterization of the hyperimmune degrees. The principal function of a set A = {a_0 < a_1 < · · · } is the function p_A defined by p_A(n) = a_n.
Lemma 2.17.5 (Miller and Martin [282]). A degree a is hyperimmune iff a contains a set A such that p_A is not majorized by any computable function.
Proof. Since p_A ≤_T A, the “if” direction is obvious. For the other direction, suppose that there is a function f ≤_T a that is not majorized by any computable function. We may assume that f is increasing. Let B ∈ a and let b_0 < b_1 < · · · be the elements of B. Let A = {f(b_n) : n ∈ N}. Using the fact that f is increasing, it is easy to show that A ≡_T B. Thus A ∈ a, and p_A(n) ≥ f(n) for all n, so p_A is not majorized by any computable function.
Theorem 2.17.6 (Kuznecov [230], Medvedev [264], Uspensky [393]). A degree is hyperimmune iff it contains a hyperimmune set.
Proof. Suppose that A is not hyperimmune. Then there is a strong array {F_i}_{i∈N} such that A ∩ F_i ≠ ∅ for all i. Let f(n) = max ⋃_{i≤n} F_i. Then f is computable, and p_A(n) ≤ f(n) for all n. So if no set in a is hyperimmune, then it follows from Lemma 2.17.5 that a is not hyperimmune. Now suppose that a is not hyperimmune and let A ∈ a.
By Lemma 2.17.5, there is a computable function f such that p_A(n) ≤ f(n) for all n. Let F_0 = [0, f(0)]. Given F_i, let k_i = max F_i + 1 and let F_{i+1} = [k_i, f(k_i)]. Then p_A(0) ∈ A ∩ F_0 and p_A(k_i) ∈ A ∩ F_{i+1} for all i, so A is not hyperimmune. The following is a useful characterization of the hyperimmune-free degrees.
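The strong array built in the second half of the proof can be checked on toy data. The set A and the majorizer f below are hypothetical stand-ins with f majorizing the principal function p_A:

```python
# Toy check of the strong-array construction in Theorem 2.17.6:
# F_0 = [0, f(0)], F_{i+1} = [k_i, f(k_i)] with k_i = max F_i + 1.
# Since f majorizes the principal function p_A, each F_i meets A.

A = [2 * n for n in range(200)]   # toy set: even numbers, so p_A(n) = 2n
f = lambda n: 4 * n + 10          # computable majorizer: p_A(n) <= f(n)

def strong_array(count):
    arrays, lo = [], 0
    for _ in range(count):
        F = list(range(lo, f(lo) + 1))   # next interval [k_i, f(k_i)]
        arrays.append(F)
        lo = F[-1] + 1                   # k_{i+1} = max F_i + 1
    return arrays

for F in strong_array(4):
    assert any(a in F for a in A)        # p_A(k_i) lands in F_{i+1}
print("A meets every component of the strong array")
```

Note that the components are pairwise disjoint and i → max F_i is computable, so this really is a strong array in the sense defined above.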
⁹Some authors have argued that the adoption of the term “hyperimmune-free degree” is a historical accident, and that a more descriptive one, based on domination properties, should be used. Nies, Barmpalias, and Soare, for example, have all used the term computably dominated. Earlier, authors such as Gasarch and Simpson used the term almost recursive.
Proposition 2.17.7 (Jockusch [187], Martin [unpublished]). The following are equivalent for a set A.
(i) A has hyperimmune-free degree.
(ii) For all functions f, if f ≤_T A then f ≤_tt A.
(iii) For all sets B, if B ≤_T A then B ≤_tt A.
Proof. To show that (i) implies (ii), suppose that A has hyperimmune-free degree and f = Φ_e^A, and let g(n) be the number of steps taken by the computation Φ_e^A(n). Then g ≤_T A, so there is a computable function h that majorizes g. Let Ψ be the functional defined as follows. On oracle X and input n, run the computation Φ_e^X(n) for h(n) many steps. If this computation converges, then output its value, and otherwise output 0. Then Ψ is a truth table functional, and Ψ^A = Φ_e^A = f.
Clearly (ii) implies (iii). To show that (iii) implies (i), assume that A does not have hyperimmune-free degree and let g ≤_T A be a function not dominated by any computable function. We build a set B ≤_T A such that B ≰_tt A. Let σ_0, σ_1, . . . be an effective list of all truth tables. For each e and n, proceed as follows. If Φ_e(⟨e, n⟩)[g(n)]↓ then let ⟨e, n⟩ ∈ B iff A does not satisfy σ_{Φ_e(⟨e,n⟩)}. Note that this definition ensures that Φ_e does not witness a truth table reduction from B to A. If Φ_e(⟨e, n⟩)[g(n)]↑ then let ⟨e, n⟩ ∈ B. Clearly B ≤_T A. Assume for a contradiction that B ≤_tt A. Then there is an e such that Φ_e witnesses this fact. This Φ_e is total, so there is a computable function h such that Φ_e(⟨e, n⟩)[h(n)]↓ for all n. Since h does not dominate g, there is an n such that Φ_e(⟨e, n⟩)[g(n)]↓. For this n, we have ⟨e, n⟩ ∈ B iff A does not satisfy σ_{Φ_e(⟨e,n⟩)}, contradicting the choice of e.
The above result cannot be extended to wtt-reducibility. Downey [99] constructed a noncomputable c.e. set A such that for all sets B, if B ≤_T A then B ≤_wtt A. Since A is noncomputable and c.e., it has hyperimmune degree.
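The step-bounded functional Ψ from the proof that (i) implies (ii) can be sketched as follows. The functional phi, its self-reported step count, and the bound h are all hypothetical toy stand-ins; the point is only that the step bound makes Ψ total on every oracle, with use and output bounded computably in n, which is what a truth-table functional requires:

```python
# Toy sketch of the truth-table functional Psi from Proposition 2.17.7:
# simulate Phi_e for h(n) steps; if it has not converged, output 0.

def phi(oracle, n):
    """Toy Phi_e: queries the oracle at n and n+1, 'using' n+3 steps.
    Returns (value, steps_used)."""
    return oracle(n) + oracle(n + 1), n + 3

def h(n):
    """Computable majorizer of the running time of Phi_e on the real oracle."""
    return n + 5

def psi(oracle, n):
    """Psi: total on every oracle; on the real oracle A it equals Phi_e^A,
    because h bounds the true running time there."""
    value, steps = phi(oracle, n)
    return value if steps <= h(n) else 0

A = lambda k: k % 2  # toy oracle
print([psi(A, n) for n in range(4)])  # agrees with Phi_e^A everywhere
```

On oracles where the simulated computation would run past h(n) steps, Ψ falls back to the default output 0, which is what keeps it total without changing its value on A.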
2.18 Minimal degrees
The proof of Theorem 2.17.3 uses ideas that go back to a fundamental theorem of Spector [374].
Definition 2.18.1. A degree a > 0 is minimal if there is no degree b with 0 < b < a.
Theorem 2.18.2 (Spector [374]). There is a minimal degree below 0′′.
Proof. As with Theorem 2.17.3, the proof uses forcing with computable perfect trees. As before, we build a sequence {T_i}_{i∈ω} of computable function trees such that [T_0] ⊇ [T_1] ⊇ [T_2] ⊇ · · ·, and take our set A of minimal degree to be any element of ⋂_n [T_n].
At stage 0, we let T_0 be the identity map. The action at stage 2e + 1 is exactly the same as in the proof of Theorem 2.17.3. That is, we let i ∈ {0, 1} be such that T_{2e}(i) ≠ W_e ↾ |T_{2e}(i)| and define T_{2e+1} by letting T_{2e+1}(σ) = T_{2e}(iσ). As before, this action ensures that A ≠ W_e. At stage 2e + 2, we are given T_{2e+1} and want to meet the requirement
R_e : Φ_e^A total ⇒ Φ_e^A ≡_T ∅ ∨ A ≤_T Φ_e^A.
The first thing we do is follow the method used in the proof of Theorem 2.17.3 to try to force the nontotality of Φ_e^A. We ask ∅′′ whether there are n and σ such that Φ_e^{T_{2e+1}(τ)}(n)↑ for all τ ⪰ σ. If so, we define T_{2e+2} by letting T_{2e+2}(ν) = T_{2e+1}(σν). This definition clearly ensures that Φ_e^A(n)↑, and hence that R_e is met. If there are no such n and σ, then we try to ensure that if Φ_e^A is total then it is computable. The critical question to ask ∅′′ is whether there is a σ such that for all n and all τ_0, τ_1 ⪰ σ, if Φ_e^{T_{2e+1}(τ_i)}(n)↓ for i = 0, 1, then Φ_e^{T_{2e+1}(τ_0)}(n) = Φ_e^{T_{2e+1}(τ_1)}(n). If this question has a positive answer, then we define T_{2e+2} by letting T_{2e+2}(ν) = T_{2e+1}(σν). Then, to compute Φ_e^A(n) (assuming Φ_e^A(n)↓), we simply need to find a μ such that Φ_e^{T_{2e+2}(μ)}(n)↓, and we are guaranteed that Φ_e^{T_{2e+2}(μ)}(n) = Φ_e^A(n). Thus, in this case, Φ_e^A is computable if total, and hence R_e is met. If the answer to the above question is negative, then we need to ensure that A ≤_T Φ_e^A. The crucial new idea is Spector’s notion of an e-splitting tree.¹⁰
Definition 2.18.3. We say that a function tree T is e-splitting if for each σ, there is an n such that Φ_e^{T(σ0)}(n)↓ ≠ Φ_e^{T(σ1)}(n)↓.
The crucial fact about e-splitting trees is the following.
Lemma 2.18.4. Let T be e-splitting and let B ∈ [T]. If Φ_e^B is total then B ≤_T Φ_e^B.
Proof. Suppose we are given Φ_e^B, and that Φ_e^B is total. Search for an n_0 such that Φ_e^{T(0)}(n_0)↓ ≠ Φ_e^{T(1)}(n_0)↓. Let i be such that Φ_e^B(n_0) = Φ_e^{T(i)}(n_0). Then we know that T(i) ≺ B. Now find an n_1 such that Φ_e^{T(i0)}(n_1)↓ ≠ Φ_e^{T(i1)}(n_1)↓. Let j be such that Φ_e^B(n_1) = Φ_e^{T(ij)}(n_1). Then T(ij) ≺ B. Continuing in this way, we can compute more and more of B. Thus B ≤_T Φ_e^B.
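The recovery procedure of Lemma 2.18.4 can be simulated with finite stand-ins. The tree T and functional phi_e below are hypothetical toy choices, arranged so that T is e-splitting at the two levels shown:

```python
# Toy illustration of Lemma 2.18.4: on an e-splitting tree T, the
# values of Phi_e^B alone determine the branch of T that B follows.

# T maps binary strings (length <= 2) to binary strings.
T = {'': '', '0': '00', '1': '01',
     '00': '0000', '01': '0010', '10': '0100', '11': '0110'}

def phi_e(oracle_string, n):
    """Toy functional: Phi_e^X(n) = (n+1)-st bit of X, so T(s0) and
    T(s1) e-split at position |s|."""
    return int(oracle_string[n + 1])

B = '0010'  # a path through T: B extends T('01')

# Recover B's branch from the values Phi_e^B(n), as in the lemma:
# at each level find an n where the two successors e-split, then take
# the side whose value agrees with Phi_e^B(n).
branch = ''
for _ in range(2):
    left, right = T[branch + '0'], T[branch + '1']
    n = next(n for n in range(min(len(left), len(right)) - 1)
             if phi_e(left, n) != phi_e(right, n))
    branch += '0' if phi_e(B, n) == phi_e(left, n) else '1'

print(branch, T[branch])  # the recovered initial segment of B
```

The search for a splitting level is what makes the reduction work: because the two candidate oracles disagree on some input, the value of Φ_e^B at that input singles out the side below B.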
So to meet Re , it is enough to ensure that T2e+2 is e-splitting. But recall that we are now in the case in which for each σ there are n and 10 We
are being a bit free and easy with history. The notion of being e-splitting was first used by Spector, but the influential use of trees in the construction of a minimal degree actually first occurred in Shoenfield’s book [356].
72
2. Computability Theory T
(τ )
T
(τ )
τ0 , τ1 σ such that Φe 2e+1 0 (n) ↓= Φe 2e+1 1 (n) ↓. So we can define T2e+2 as follows. Let T2e+2 (λ) = T2e+1 (λ). Suppose we have defined T2e+2 (μ) to equal T2e+1 (σ) for some σ. Search for τ0 and τ1 as above and define T2e+2 (μi) = T2e+1 (τi ) for i = 0, 1. Then T2e+2 is computable and e-splitting. As with hyperimmune-free degrees, there are in fact continuum many minimal degrees. A crucial difference between hyperimmune-free degrees and minimal degrees is given by the following result. (We say that a degree b bounds a degree a if a b.) Theorem 2.18.5 (Yates [410], Cooper [76]). Every nonzero c.e. degree bounds a minimal degree. In particular, there are minimal degrees below 0 (which was first shown by Sacks [342]), and hence hyperimmune minimal degrees. The method used to prove Theorem 2.18.5, known as the full approximation method , is quite involved and would take us a little too far afield. See Lerman [240] or Odifreddi [311] for more details. Minimal degrees form a very interesting class. We finish by quoting two important results about their possible Turing degrees. Theorem 2.18.6 (Jockusch and Posner [192]). All minimal degrees are GL2 . That is, if a is a minimal degree then a (a ∨ 0 ) . This result improved an earlier one by Cooper [77], who showed that no minimal degree below 0 can be high. (There are low minimal degrees, as well as minimal degrees below 0 that are not low; see [192] for a discussion of these and related results.) Cooper [77] also proved the following definitive result. Theorem 2.18.7 (Cooper [77]). If b 0 then there is a minimal degree a with a = b. Thus there is an analog of the Friedberg Completeness Criterion (Theorem 2.16.1) for minimal degrees. It is not possible to prove an analog of the Shoenfield Jump Inversion Theorem since Downey, Lempp, and Shore [124] showed that there are degrees CEA(∅ ) and low over 0 that are not jumps of minimal degrees below 0 . 
Cooper [78] claimed a characterization of the degrees CEA(∅′) that are jumps of minimal degrees below 0′, but no details were given.
2.19 Π01 and Σ01 classes

2.19.1 Basics

A tree is a subset of 2<ω closed under initial segments. A path through a tree T is an infinite sequence P ∈ 2ω such that if σ ≺ P then σ ∈ T. The
collection of paths through T is denoted by [T]. A subset of 2ω is a Π01 class if it is equal to [T] for some computable tree T.¹¹ An equivalent formulation is that C is a Π01 class if there is a computable relation R such that C = {A ∈ 2ω : ∀n R(A ↾ n)}. It is not hard to show that this definition is equivalent to the one via computable trees.

There is a host of natural examples of important Π01 classes in several branches of logic. For example, let A and B be disjoint c.e. sets. The collection of separating sets {X : X ⊇ A ∧ X ∩ B = ∅} is a Π01 class. An important special case is an effectively inseparable pair, that is, a pair of disjoint c.e. sets A and B for which there is a computable function f such that for all disjoint c.e. We ⊇ A and Wj ⊇ B, we have f(e, j) ∉ We ∪ Wj. This is the two-variable version of a creative set. One example of an effectively inseparable pair is the sets {e : Φe(e) ↓ = 0} and {e : Φe(e) ↓ = 1}. The function f witnessing the effective inseparability of this pair is not hard to build: given e and j, using the recursion theorem we can find an i such that Φi(i) ↓ = 1 if i ∈ We and Φi(i) ↓ = 0 if i ∈ Wj.¹² We then let f(e, j) = i.

Another example of a Π01 class is the class of (codes of) extensions of a complete consistent theory. By the proof of Gödel's Incompleteness Theorem (see e.g. [139]), for a theory such as Peano Arithmetic (PA), the set A of sentences provable from PA and the set B of sentences refutable from PA form an effectively inseparable pair.

It is easy to check that the intersection of two Π01 classes is again a Π01 class, as is their union. The complement of a Π01 class is a Σ01 class. Thus C is a Σ01 class iff there is a computable relation R such that C = {A : ∃n R(A ↾ n)}. We can think of Σ01 classes as the analogs for infinite sequences of c.e. sets of finite strings. Indeed, an important fact about Σ01 classes is that they are exactly the subsets of 2ω generated by c.e. sets of finite strings, in the following sense. (Here we use the notation of Section 1.2.)

¹¹ Sometimes a Π01 class is defined to be the set of paths through a computable subtree T of ω<ω. Such a class is computably bounded if there is a computable function f such that for each σ ∈ T, if σn ∈ T then n ≤ f(σ). The study of computably bounded Π01 classes reduces to the study of the special case of Π01 subclasses of 2ω, since every computably bounded Π01 class is computably equivalent to a Π01 subclass of 2ω. We will restrict our attention to such classes, so for us a Π01 class will mean a Π01 subclass of 2ω.

¹² In greater detail: Let g be a computable function such that for each k, the machine Φg(k) ignores its input and waits until k enters either We or Wj, at which point the machine returns 0 if k ∈ We and 1 if k ∈ Wj (while if k ∉ We ∪ Wj then the machine does not return). By the recursion theorem, there is an i such that Φi = Φg(i). For this i, we have Φi(i) ↓ = 1 if i ∈ We and Φi(i) ↓ = 0 if i ∈ Wj.
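The separating-class example above can be made concrete at finite depth. The following sketch (Python; the finite sets A and B are hypothetical stand-ins for disjoint c.e. sets, and all helper names are ours, not the text's) decides membership in the tree whose paths are exactly the separating sets:

```python
# Toy illustration of the Pi^0_1 class of separating sets
# {X : X ⊇ A and X ∩ B = ∅} as the set of paths through a tree.
# A and B are stand-ins for disjoint c.e. sets (here just finite sets).

A = {0, 3, 5}   # hypothetical "enumerated so far" part of A
B = {1, 4}      # hypothetical "enumerated so far" part of B

def in_tree(sigma):
    """sigma is a binary string (tuple of 0/1 bits). It stays on the
    tree iff it has not yet violated 'X ⊇ A and X ∩ B = ∅'."""
    for n, bit in enumerate(sigma):
        if n in A and bit == 0:   # X must contain every element of A
            return False
        if n in B and bit == 1:   # X must avoid every element of B
            return False
    return True

sigma = (1, 0, 1, 1, 0, 1)        # consistent with A and B so far
assert in_tree(sigma)
# The tree is closed under initial segments:
assert all(in_tree(sigma[:k]) for k in range(len(sigma)))
assert not in_tree((0,))          # 0 ∈ A, so X(0) must be 1
```

Note that refuting membership in the tree needs only finitely much enumeration of A and B, which is exactly why the class of separating sets is Π01.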
Definition 2.19.1. A set A ⊆ 2<ω is prefix-free if it is an antichain with respect to the natural partial ordering of 2<ω; that is, for all σ ∈ A and all τ properly extending σ, we have τ ∉ A.

Prefix-free sets will play an important role in this book, for example in the definition of prefix-free Kolmogorov complexity given in Section 3.5. Clearly, a set C ⊆ 2ω is open iff it is of the form ⟦A⟧ for some prefix-free A ⊆ 2<ω. The following result is the effective analog of this fact.

Proposition 2.19.2. A set C ⊆ 2ω is a Σ01 class iff there is a c.e. set W ⊆ 2<ω such that C = ⟦W⟧. This fact remains true if we require W to be prefix-free.

Proof. Given a Σ01 class C, let T be a computable tree such that 2ω \ C = [T]. Let W be the set of minimal strings not in T, that is, strings σ ∉ T such that every proper initial segment of σ is in T. Then W is prefix-free and C = ⟦W⟧.

Conversely, let W ⊆ 2<ω be a c.e. set. Then ⟦W⟧ = {A : ∃n, s (A ↾ n ∈ Ws)}, so ⟦W⟧ is a Σ01 class.

Thus we can think of Σ01 classes as effectively open sets, and of Π01 classes as effectively closed sets. So when we have a Σ01 class C, we assume that we have fixed a c.e. set W ⊆ 2<ω such that C = ⟦W⟧ and let C[s] = ⟦W[s]⟧. Similarly, for the Π01 class P = 2ω \ C, we let P[s] = 2ω \ C[s].

It is a potentially confusing fact that Proposition 2.19.2 remains true if we require W to be computable, rather than merely c.e. The reason for this fact is that if W ⊂ 2<ω is prefix-free and c.e., then V = {σ : ∃τ ∈ W|σ|+1 \ W|σ| (τ ⪯ σ)} is computable and prefix-free, and ⟦V⟧ = ⟦W⟧.

We say that the Σ01 classes C0, C1, . . . are uniformly Σ01, and their complements are uniformly Π01, if there are uniformly c.e. sets V0, V1, . . . ⊆ 2<ω such that Ci = ⟦Vi⟧ for all i. It is not hard to see that P0, P1, . . . are uniformly Π01 classes iff there are uniformly computable trees T0, T1, . . . such that Pi = [Ti] for all i. As one might imagine, a Δ01 class is one such that both it and its complement are Σ01.
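The first half of the proof of Proposition 2.19.2 is concrete enough to simulate at finite depth. In this sketch (Python; the particular tree and the depth bound are illustrative assumptions of ours), W is the set of minimal strings not in a computable tree T, and we check that W is prefix-free and that every string off T extends a member of W:

```python
from itertools import product

# Toy version of Proposition 2.19.2: T is a computable tree (here:
# binary strings with no two consecutive 1s), and W is the set of
# minimal strings off T. Then W is prefix-free and generates the
# complement of [T] as an open set. Depth is truncated to stay finite.

def in_T(sigma):
    return all(not (sigma[i] == 1 and sigma[i + 1] == 1)
               for i in range(len(sigma) - 1))

def minimal_off_T(max_len):
    """Strings not in T all of whose proper initial segments are in T."""
    W = []
    for n in range(1, max_len + 1):
        for sigma in product((0, 1), repeat=n):
            if not in_T(sigma) and all(in_T(sigma[:k]) for k in range(n)):
                W.append(sigma)
    return W

W = minimal_off_T(4)

# W is prefix-free: no member is a proper initial segment of another.
for s in W:
    for t in W:
        assert s == t or t[:len(s)] != s

# Every string off T extends some member of W, so [[W]] covers the
# complement of [T] (up to the depth bound):
for n in range(1, 5):
    for sigma in product((0, 1), repeat=n):
        if not in_T(sigma):
            assert any(sigma[:len(w)] == w for w in W)
```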
It is not hard to see that C is a Δ01 class iff C = ⟦F⟧ for some finite F ⊂ 2<ω, so a Δ01 class is just a clopen subset of 2ω. The classes C0, C1, . . . are uniformly Δ01 if there are uniformly computable finite sets F0, F1, . . . ⊂ 2<ω such that Ci = ⟦Fi⟧ for all i.

We will not need all that much of the extensive theory of Π01 and Σ01 classes, but a few facts will be important, such as the following bits of folklore. For more on the subject, see for instance Cenzer [55] or Cenzer and Remmel [57].

A path P through a tree T is isolated if there is a σ ≺ P such that no other path through T extends σ.
Proposition 2.19.3. Let T be a computable tree and let P be an isolated path through T. Then P is computable.

Proof. Let σ ≺ P be such that no path through T other than P extends σ. For any n > |σ|, there is exactly one τ ⪰ σ of length n such that the portion of T above τ is infinite (by König's Lemma, which says that if a finitely branching tree is infinite, then it has a path). So to compute P ↾ n for n > |σ|, we look for an m ≥ n such that exactly one τ ⪰ σ of length n has an extension of length m in T. Then P ↾ n = τ.

Corollary 2.19.4. Let C be a Π01 class with only finitely many members. Then every member of C is computable.

Proof. Let T be a computable tree such that C = [T]. Then T has only finitely many paths, so every path through T is isolated.

Proposition 2.19.5. Let C be a nonempty Π01 class with no computable members. Then |C| = 2ℵ0.

Proof. Let T be a computable tree such that C = [T]. Let S be the set of extendible elements of T, that is, those σ such that there is an extension of σ in [T]. If every σ ∈ S has two incompatible extensions in S, then it is easy to show that there are continuum many paths through T. Otherwise, there is a σ ∈ S with a unique extension P ∈ [T]. As in the previous proof, P is computable.

Note also that the question of whether a given computable tree T is finite is an existential one, since it amounts to asking whether there is an n such that no string of length n is in T, and hence it can be decided by ∅′.

The Cantor–Bendixson derivative of a Π01 class C is the class C′ obtained from C by removing the isolated points of C (i.e., those elements X ∈ C such that for some n, we have ⟦X ↾ n⟧ ∩ C = {X}). Let C(0) = C. For an ordinal δ, let C(δ+1) = (C(δ))′, and for a limit ordinal δ, let C(δ) = ∩γ<δ C(γ). The rank of X ∈ C is the least δ such that X ∈ C(δ) \ C(δ+1), if such a δ exists. Note that, by Proposition 2.19.3, if X has rank 0, then it is computable.
If all the elements of C have ranks, then the rank of C is the supremum of the ranks of its elements (or, equivalently, the least δ such that C(δ+1) = ∅).
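The search in the proof of Proposition 2.19.3 can be run literally on a toy tree. Below (Python; the tree, its dead branches, and the search bound are hypothetical choices of ours), the only path is 000 · · · , and we recover its first bits exactly as in the proof:

```python
from itertools import product

# Toy run of the proof of Proposition 2.19.3: T is a computable tree
# whose only path is 000..., padded with finite dead branches. We
# recover the path level by level, as in the proof: find m >= n such
# that exactly one string of length n extends to length m in T.

def in_T(sigma):
    ones = [i for i, b in enumerate(sigma) if b == 1]
    if len(ones) > 1:
        return False            # at most one 1 allowed
    if ones and len(sigma) > ones[0] + 3:
        return False            # branches through a 1 die out quickly
    return True

def level(n):
    return [s for s in product((0, 1), repeat=n) if in_T(s)]

def has_ext(sigma, m):
    return any(t[:len(sigma)] == sigma for t in level(m))

def compute_path(n, search_bound=20):
    """P restricted to n, by the search from the proof."""
    for m in range(n, search_bound):
        survivors = [s for s in level(n) if has_ext(s, m)]
        if len(survivors) == 1:
            return survivors[0]
    raise RuntimeError("search bound too small")

assert compute_path(5) == (0, 0, 0, 0, 0)
```

The search is guaranteed to terminate only because the path is isolated; on a tree with two paths above σ, no m would ever leave a unique survivor.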
2.19.2 Π0n and Σ0n classes

There is a hierarchy of classes, akin to the arithmetic hierarchy of sets.

Definition 2.19.6. A set C ⊆ 2ω is a Π0n class if there is a computable relation R such that

C = {A : ∀k1 ∃k2 · · · Qkn R(A ↾ k1, A ↾ k2, . . . , A ↾ kn)},
where the quantifiers alternate and hence Q = ∀ if n is odd and Q = ∃ if n is even. The complement of a Π0n class is a Σ0n class. These definitions can be relativized to a given set X (by letting R be X-computable) to obtain the notions of Π^{0,X}_n class and Σ^{0,X}_n class.

A Π^0_{n+1} class is also a Π^{0,∅(n)}_1 class, since every predicate of the form ∃k2 · · · Qk_{n+1} R(A ↾ k1, A ↾ k2, . . . , A ↾ k_{n+1}) with R computable is computable in ∅(n). Similarly, every Σ^0_{n+1} class is also a Σ^{0,∅(n)}_1 class. The converse, however, is not always true, because every Π^{0,∅′}_1 class is closed (indeed, every Π^{0,X}_1 class is closed for any X), but there are Π02 classes that are not closed, such as the class C of all A with A(m) = 1 for infinitely many m, which can be written as {A : ∀k ∃m (m > k ∧ A(m) = 1)}.

Indeed, even a closed Π02 class can fail to be a Π^{0,∅′}_1 class. Doug Cenzer [personal communication] provided us the following example. It is not hard to build a computable subtree T of N<ω with a single infinite path f ≥T ∅′. Let T̂ = {1^{n0+1} 0 1^{n1+1} 0 · · · 1^{nk+1} 0^m : (n0, . . . , nk) ∈ T, m ∈ N}. It is easy to check that T̂ is a computable subtree of 2<ω. Let P = [T̂] ∩ C, where C is as in the previous paragraph. Then P is a Π02 class (because, as with Π01 classes, the intersection of two Π0n classes is again a Π0n class, as is their union). Furthermore, P has a single member A = 1^{f(0)+1} 0 1^{f(1)+1} 0 · · · . But A ≡T f, so A ≥T ∅′. Thus P cannot be a Π^{0,∅′}_1 class by the relativized form of Corollary 2.19.4.

Since Σ^{0,∅(n)}_1 sets are the same as Σ^0_{n+1} sets, the relativized form of Proposition 2.19.2 says that a class is Σ^{0,∅(n)}_1 iff it is of the form ⟦W⟧ for some Σ^0_{n+1} set W ⊆ 2<ω.

We fix effective listings S^n_0, S^n_1, . . . of the Σ0n classes by letting S^n_e = {A : ∃k1 ∀k2 · · · Qkn R(A ↾ k1, A ↾ k2, . . . , A ↾ kn)}, where e is a code for the n-ary computable relation R. By an index for a Σ0n class S we mean an i such that S = S^n_i. Similarly, by an index for a Π0n class P we mean an i such that 2ω \ P = S^n_i. We do the same for relativized classes in the natural way. We then say that the classes C0, C1, . . . are uniformly Σ0n if there is a computable function f such that Ci = S^n_{f(i)} for all i, and uniformly Π0n if their complements are uniformly Σ0n. This definition agrees with the definition of uniformly Σ01 classes given in the previous section because the proof of Proposition 2.19.2 is effective, as we now note.

Proposition 2.19.7. From an index for a Σ01 class C, we can computably find an index for a (prefix-free) c.e. set W ⊆ 2<ω such that C = ⟦W⟧, and vice versa.

Another way to look at this hierarchy of classes is in terms of effective unions and intersections. For example, in the same way that Π01 and Σ01 classes are the effective analogs of closed and open sets, respectively, Π02
and Σ02 classes are the effective analogs of Gδ and Fσ sets, respectively (where a set is Gδ if it is a countable intersection of open sets and Fσ if it is a countable union of closed sets). More precisely, C is a Π02 class iff there are uniformly Σ01 classes U0, U1, . . . such that C = ∩n Un. Similarly, C is a Σ02 class iff there are uniformly Π01 classes P0, P1, . . . such that C = ∪n Pn. The same is true for any n in place of 1 and n + 1 in place of 2. It is also easy to check that we can take U0 ⊇ U1 ⊇ · · · and P0 ⊆ P1 ⊆ · · · if we wish.
2.19.3 Basis theorems

A basis for the Π01 classes is a collection of sets C such that every nonempty Π01 class has an element in C. One of the most important classes of results on Π01 classes is that of basis theorems. A basis theorem for Π01 classes states that every nonempty Π01 class has a member of a certain type. (Henceforth, all Π01 classes will be taken to be nonempty without further comment.) For instance, it is not hard to check that the lexicographically least element of a Π01 class C (i.e., the leftmost path of a computable tree T such that [T] = C) has c.e. degree.¹³ This fact establishes the Computably Enumerable Basis Theorem of Jockusch and Soare [195], which states that every Π01 class has a member of c.e. degree, and is a refinement of the earlier Kreisel Basis Theorem, which states that every Π01 class has a Δ02 member. The following is the most famous and widely applicable basis theorem.

Theorem 2.19.8 (Low Basis Theorem, Jockusch and Soare [196]). Every Π01 class has a low member.

Proof. Let C = [T] with T a computable tree. We define a sequence T = T0 ⊇ T1 ⊇ · · · of infinite computable trees such that if P ∈ ∩e [Te] then P is low. (Note that there must be such a P, since each [Te] is closed.) We begin with T0 = T.

Suppose that we have defined Te, and let Ue = {σ : Φσe(e)[|σ|] ↑}. Then Ue is a computable tree. If Ue ∩ Te is infinite, then let Te+1 = Te ∩ Ue. Otherwise, let Te+1 = Te. Note that either ΦPe(e) ↑ for all P ∈ [Te+1] or ΦPe(e) ↓ for all P ∈ [Te+1].

Now suppose that P ∈ ∩e [Te]. We can perform the above construction computably in ∅′, since ∅′ can decide the question of whether Ue ∩ Te is finite. Thus ∅′ can decide whether ΦPe(e) ↓ for any given e, and hence P is low.

The above proof has a consequence that will be useful below. Recall that a set A is superlow if A′ ≡tt ∅′. Marcus Schaefer observed that the proof of the low basis theorem actually produces a superlow set.

¹³ Such an element is in fact a left-c.e. real, a concept we will define in Chapter 5.
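The skeleton of the low basis argument can be caricatured at bounded depth. In the sketch below (Python; the depth bound and the decidable stand-in for the halting predicate are our assumptions — in the real proof, deciding whether Ue ∩ Te is infinite requires the oracle ∅′), we thin a finite tree through three requirements and check that it still reaches maximum depth:

```python
from itertools import product

# A bounded-depth caricature of the low basis theorem forcing: thin a
# tree through finitely many "requirements", keeping T_{e+1} = T_e ∩ U_e
# whenever the intersection still looks infinite (here: still reaches
# the maximum depth). The predicate halts() is a decidable toy stand-in.

DEPTH = 8

def full_tree():
    nodes = {()}
    for n in range(1, DEPTH + 1):
        nodes |= set(product((0, 1), repeat=n))
    return nodes

def reaches_bottom(tree):
    return any(len(s) == DEPTH for s in tree)

def prune(nodes):
    """Keep only strings all of whose initial segments survive."""
    return {s for s in nodes
            if all(s[:k] in nodes for k in range(len(s)))}

def halts(e, sigma):
    # Hypothetical stand-in for "Phi_e^sigma(e) converges"; purely
    # illustrative, not a real functional.
    return sum(sigma) > e

def force(tree, requirements):
    for e in range(requirements):
        U_e = {s for s in tree if not halts(e, s)}
        candidate = prune(U_e)
        if reaches_bottom(candidate):
            tree = candidate     # force non-halting on every branch
        # else: halting is already forced on all sufficiently deep nodes
    return tree

T = force(full_tree(), requirements=3)
assert reaches_bottom(T)          # the thinned tree still has deep nodes
```

With this toy halting predicate the surviving tree is exactly the all-zero branch; the point being illustrated is only the two-case structure of the forcing, not the oracle computation that makes the real proof work.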
Corollary 2.19.9 (Jockusch and Soare [196], Schaefer). Every Π01 class has a superlow member.

The following variation on the low basis theorem will be useful below.

Theorem 2.19.10 (Jockusch and Soare [196]). Let C be a Π01 class and let X be a noncomputable set. Then there is an A ∈ C such that X ≰T A and A ≤T X ⊕ ∅′.

Proof. Let C = [T] with T a computable tree. As in the proof of the low basis theorem, we define a sequence T = T0 ⊇ T1 ⊇ · · · of infinite computable trees. We begin with T0 = T. Suppose that we have defined T2e to be a computable tree. We first proceed as in the proof of the low basis theorem. Let Ue = {σ : Φσe(e)[|σ|] ↑}. If Ue ∩ T2e is infinite, then let T2e+1 = T2e ∩ Ue. Otherwise, let T2e+1 = T2e. As before, T2e+1 is a computable tree.

We now handle cone avoidance as follows. We X ⊕ ∅′-computably search until we find that one of the following holds.

1. There are an n and an extendible σ ∈ T2e+1 such that Φσe(n) ↓ ≠ X(n).

2. There is an n such that T2e+1 ∩ {σ : Φσe(n) ↑} is infinite.

We claim one of the above must hold. Suppose not, and fix n. The fact that 2 does not hold means that we can find a k such that Φσe(n) ↓ for all σ ∈ T2e+1 ∩ 2^k. The fact that 1 does not hold means that, for each such σ, if Φσe(n) ≠ X(n), then T2e+1 is finite above σ. Thus we can compute X(n) by searching for a j such that for all σ, τ ∈ T2e+1 ∩ 2^j, we have Φσe(n) ↓ = Φτe(n) ↓, which by the above must exist. Since X is noncomputable, either 1 or 2 must hold.

If we find that 1 holds, then let T2e+2 = T2e+1 ∩ {τ : τ ⪯ σ ∨ σ ⪯ τ}. Otherwise, let T2e+2 = T2e+1 ∩ {σ : Φσe(n) ↑}.

Now let A ∈ ∩e [Te]. It is easy to check that we can perform the above construction computably in X ⊕ ∅′, so as in the proof of the low basis theorem, we have A ≤T X ⊕ ∅′. Furthermore, the definition of the trees T2e+2 ensures that X ≰T A.

Corollary 2.19.9 and Theorem 2.19.10 cannot be combined.
That is, there are a Π01 class C and a noncomputable set X such that X is computable from every superlow member of C. We will see examples in Section 14.4.

Another important basis theorem is provided by the hyperimmune-free degrees.

Theorem 2.19.11 (Hyperimmune-Free Basis Theorem, Jockusch and Soare [196]). Every Π01 class has a member of hyperimmune-free degree.

Proof. Let C be a Π01 class, and let T be a computable tree such that C = [T]. We can carry out a construction like that in the proof of Miller and Martin's Theorem 2.17.3 within T. We begin with T0 = T. We do not need the odd stages of that construction since we do not need to force
noncomputability (as 0 is hyperimmune-free). Thus we deal with Φe at stage e + 1. We are given Te and we want to build Te+1 to ensure that either (i) if A ∈ [Te+1] then ΦAe is not total, or (ii) there is a computable function f such that if A ∈ [Te+1] then ΦAe(n) ≤ f(n) for all n.

Let U^n_e = {σ : Φσe(n) ↑}. There are two cases. If there is an n such that U^n_e ∩ Te is infinite, then let Te+1 = U^n_e ∩ Te. In this case, if A ∈ [Te+1] then ΦAe(n) ↑. Otherwise, let Te+1 = Te. In this case, for each n we can compute a number l(n) such that no string σ ∈ Te of length l(n) is in U^n_e. For any such σ, we have Φσe(n) ↓. Define the computable function f by f(n) = max{Φσe(n) : σ ∈ Te ∧ |σ| = l(n)}. Then A ∈ [Te+1] implies that ΦAe(n) ≤ f(n) for all n.

So if we let A ∈ ∩e [Te], then A is a member of C and has hyperimmune-free degree.

In the above construction, if [T] has no computable members, then Proposition 2.19.5 implies that each Te must have size 2ℵ0. It is not hard to adjust the construction in this case to obtain continuum many members of C of hyperimmune-free degree. Thus, a Π01 class with no computable members has continuum many members of hyperimmune-free degree.

The following result is an immediate consequence of the hyperimmune-free basis theorem and Theorem 2.17.2, but it was first explicitly articulated by Kautz [200], who gave an interesting direct proof. Recall that a set A is computably enumerable in and above a set X (CEA(X)) if A is c.e. in X and X ≤T A. A set A is CEA if it is CEA(X) for some X
We end this section by looking at examples of nonbasis theorems.

Proposition 2.19.14. The intersection of all bases for the Π01 classes is the collection of computable sets, and hence there is no minimal basis.

Proof. If A is a computable set then {A} is a Π01 class, so every basis for the Π01 classes must include A. On the other hand, if B is noncomputable then, by Corollary 2.19.4, {B} is not a Π01 class, so every Π01 class must contain a member other than B, and hence the collection of all sets other than B is a basis for the Π01 classes.¹⁴

For our next result, we will need the following definition and lemma.

Definition 2.19.15. An infinite set A is effectively immune if there is a computable function f such that for all e, if We ⊆ A then |We| ≤ f(e).

Post gave the following construction of an effectively immune co-c.e. set A. At stage s, for each e < s, if We[s] ⊆ As and there is an x ∈ We[s] such that x > 2e, then put the least such x into the complement of A. Then A is infinite, and is effectively immune via the function e → 2e.

Lemma 2.19.16 (Martin [258]). If a c.e. set B computes an effectively immune set, then B is Turing complete.

Proof. Let B be c.e., and let A ≤T B be effectively immune, as witnessed by the computable function f. Let Γ be a reduction such that ΓB = A. For each k we build a c.e. set Wh(k), where the computable function h is given by the recursion theorem with parameters. Initially, Wh(k) is empty. If k enters ∅′ at stage t then we wait until a stage s ≥ t at which there is a q such that ΓB(n)[s] ↓ for all n < q and there are more than f(h(k)) many numbers in ΓB[s] ↾ q. We then let Wh(k) = ΓB[s] ↾ q.¹⁵ This action ensures that |Wh(k)| > f(h(k)), so we must have Wh(k) ⊈ A. Thus ΓB[s] ↾ q = Wh(k) ≠ A ↾ q = ΓB ↾ q.

So to compute ∅′(k) from B, we simply look for a stage s at which there is a q such that Bs ↾ γB(n)[s] = B ↾ γB(n)[s] for all n < q and there are more than f(h(k)) many numbers in ΓB[s] ↾ q. If k is not in ∅′ by stage s, it cannot later enter ∅′.
¹⁴ Another way to prove this fact is to note that, by Theorem 2.17.2, the only sets that are both low and hyperimmune-free are the computable ones.

¹⁵ In greater detail: For each i and k, we define a c.e. set Ci,k as follows. Initially, Ci,k is empty. If k enters ∅′ at stage t then we wait until a stage s ≥ t at which there is a q such that ΓB(n)[s] ↓ for all n < q and there are more than f(i) many numbers in ΓB[s] ↾ q. We then let Ci,k = ΓB[s] ↾ q. It is easy to see that there is a computable function g such that g(i, k) is an index for Ci,k for all i and k. By the recursion theorem with parameters, there is a computable function h such that Wh(k) = Wg(h(k),k) for all k. This h has the required properties.
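Post's construction of an effectively immune co-c.e. set runs in stages and is easy to simulate. The following sketch (Python; the stage bound and the toy enumerations standing in for genuine c.e. sets We are hypothetical) carries out the extraction rule and checks the bound |We| ≤ 2e for those We that end up inside A:

```python
# A finite simulation of Post's construction of an effectively immune
# co-c.e. set A. The stage bound and the sets W_final (stand-ins for
# genuine c.e. sets W_e) are toy assumptions, purely for illustration.

STAGES = 12
W_final = [{2, 7, 9}, {1, 3}, {0, 4}, set(range(10))]
# W_enum[e][s] = the finite part of W_e enumerated by stage s (monotone).
W_enum = [[{x for x in W if x <= s} for s in range(STAGES)]
          for W in W_final]

removed = set()            # the complement of A, enumerated in stages
for s in range(STAGES):
    for e in range(min(s, len(W_final))):
        We_s = W_enum[e][s]
        if We_s and We_s.isdisjoint(removed):     # i.e. W_e[s] ⊆ A_s
            big = [x for x in We_s if x > 2 * e]
            if big:
                removed.add(min(big))             # extract from A

A = set(range(STAGES)) - removed   # A restricted to the toy universe

# The bound from the text: if W_e ⊆ A then |W_e| ≤ 2e, since otherwise
# some x > 2e would have been extracted from A.
for e, W in enumerate(W_final):
    if W <= A:
        assert len(W) <= 2 * e
```

Note how each e extracts at most one number greater than 2e, which is also why A keeps at least e members below 2e, the fact used in the proof of Theorem 2.19.17 below.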
Theorem 2.19.17 (Jockusch and Soare [195]).

(i) The incomplete c.e. degrees do not form a basis for the Π01 classes.

(ii) Let a be an incomplete c.e. degree. Then the degrees less than or equal to a do not form a basis for the Π01 classes.

Proof. By Lemma 2.19.16, to prove both parts of the theorem, it is enough to find a Π01 class whose members are all effectively immune. Let A be Post's effectively immune set described after Definition 2.19.15. An infinite subset of an effectively immune set is clearly also effectively immune, so it is enough to find a Π01 class all of whose members are infinite subsets of A.

It is clear from the construction of A that |A ↾ 2e| ≥ e for all e. Thus A ∩ [2^k − 1, 2^{k+1} − 2] ≠ ∅ for all k.¹⁶ Let

P = {B : B ⊆ A ∧ ∀k (B ∩ [2^k − 1, 2^{k+1} − 2] ≠ ∅)}.

Since A is co-c.e., P is a Π01 class, and it is nonempty because A ∈ P. Furthermore, every element of P is an infinite subset of A. As explained in the previous paragraph, these facts suffice to establish the theorem.
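A small sanity check on the intervals used in this proof: the intervals [2^k − 1, 2^{k+1} − 2] partition the natural numbers, which is why every member of P, meeting each of them, is infinite. A two-assertion verification (Python, purely illustrative):

```python
# The intervals I_k = [2^k - 1, 2^(k+1) - 2] from the proof of
# Theorem 2.19.17 partition N: every number lies in exactly one I_k.

def interval(k):
    return range(2**k - 1, 2**(k + 1) - 1)   # inclusive of 2^(k+1) - 2

covered = [n for k in range(6) for n in interval(k)]
assert covered == list(range(2**6 - 1))      # disjoint and exhaustive
```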
2.19.4 Generalizing the low basis theorem

We would like to extend the low basis theorem to degrees above 0′. Of course, we cannot hope to prove that for every degree a ≥ 0′, every Π01 class has a member whose jump has degree a, since there are Π01 classes whose members are all computable. We can prove this result, however, if we restrict our attention to special Π01 classes, which are those with no computable members.

Theorem 2.19.18 (Jockusch and Soare [196]). If C is a Π01 class with no computable members and a ≥ 0′, then there is a P ∈ C such that deg(P′) = deg(P ⊕ ∅′) = a.

Proof. Let C = [T] with T a computable tree and let A ∈ a. By Proposition 2.19.5, T has 2ℵ0 many paths. We A-computably define an infinite sequence of computable trees T0 ⊇ T1 ⊇ · · · , each with 2ℵ0 many paths. We begin with T0 = T. Given T2e, let U2e = {σ ∈ T2e : Φσe(e)[|σ|] ↑}. If U2e is finite then let T2e+1 = T2e. Otherwise, since U2e is a subtree of T, it has no computable paths, and hence has 2ℵ0 many paths. In this case, let T2e+1 = U2e. Notice that we can A-computably decide which case we are in because A ≥T ∅′.

¹⁶ Notice that this fact shows that A is not hyperimmune. In fact, it is not hard to show that a co-c.e. set X is not hyperimmune iff there is a Π01 class all of whose members are infinite subsets of X.
Now let σ0
2.20 Strong reducibilities and Post's Program

The concept of hypersimplicity, which we defined in Section 2.17, was introduced by Post [316] as part of an attempt to solve Post's Problem. Although hypersimple sets can be Turing complete, Post [316] showed that no hypersimple set can be tt-complete. Later, Friedberg and Rogers [162] proved that no hypersimple set can be wtt-complete. This result has been extended as follows. A set A is wtt-cuppable if there is a c.e. set W <wtt ∅′ such that A ⊕ W ≥wtt ∅′.

Theorem 2.20.1 (Downey and Jockusch [119]). No hypersimple set is wtt-cuppable.

Proof. Suppose that A is hypersimple, W is c.e., W <wtt ∅′, and A ⊕ W ≥wtt ∅′. We show that W ≥wtt ∅′, which is a contradiction.

We will build a c.e. set B. By the recursion theorem, we may assume that we have a wtt-reduction ΓA⊕W = B with computable use g.¹⁷ Without loss of generality, we may assume that g(n) is increasing and even for all n, and we write h(n) for g(n)/2.

We define a strong array {Fi}i∈N as follows, writing m(i) for max Fi. Let F0 = [0, h(0)]. Given Fi, pick m(i) + 1 many fresh large numbers b^{i+1}_0 < · · · < b^{i+1}_{m(i)} and let Fi+1 = (max Fi, h(b^{i+1}_{m(i)})]. Since the Fi form a strong array, there are infinitely many i with Fi ⊆ A. We can therefore

¹⁷ In greater detail: We can think of the ensuing construction as building a c.e. set Wf(e) given a parameter e, taking as Γ a wtt-functional such that ΓA⊕W = We. (It is not hard to see that indices for Γ and for a computable function bounding its use can be found effectively from e.) By the recursion theorem, there is an e such that We = Wf(e), and we let B = We.
enumerate an increasing computable sequence i0 < i1 < · · · such that F_{ik+1} ⊆ A for all k.

Now for each k, we act as follows. Let b = b^{ik+1}_{m(ik)}. We wait until a stage sk such that F_{ik+1} ⊆ A_{sk}. If k enters ∅′ at a stage tk ≥ sk, we wait for a stage s such that ΓA⊕W[s] ↾ (b + 1) = Bs ↾ (b + 1), then put b^{ik+1}_0 into B. If later A changes below h(b), we again wait for a stage s such that ΓA⊕W[s] ↾ (b + 1) = Bs ↾ (b + 1), then put b^{ik+1}_1 into B. We continue in this manner, enumerating one more b^{ik+1}_j into B each time A changes below h(b) and the computation ΓA⊕W ↾ (b + 1) recovers. Since F_{ik+1} ⊆ A_{sk}, we cannot have more than m(ik) + 1 many changes in A below h(b) after stage sk, so we never run out of b^{ik+1}_j's.

Since we change B ↾ (b + 1) after the last time A ↾ h(b) changes, W ↾ h(b) must change at that point to ensure that ΓA⊕W ↾ (b + 1) = B ↾ (b + 1). Thus W ↾ h(b) changes at least once after stage tk, so we can wtt-compute ∅′ using W as follows. Given k, wait for a stage s ≥ sk such that Ws ↾ h(b) = W ↾ h(b). Then k ∈ ∅′ iff k ∈ ∅′s.

Post's original program for solving Post's Problem was to find a "thinness" property of the lattice of supersets of a c.e. set (under inclusion) guaranteeing Turing incompleteness of the given set. It was eventually shown that this approach could not succeed. Although Sacks [343] constructed a maximal set (i.e., a coinfinite c.e. set M such that if W ⊇ M is c.e. then either W is cofinite or W \ M is finite) that is Turing incomplete, it is also possible to construct Turing complete maximal sets. Indeed, Soare [365] showed that the maximal sets form an orbit of the automorphisms of the lattice of c.e. sets under inclusion, and hence there is no definable property that can be added to maximality to guarantee incompleteness. Eventually, Cholak, Downey, and Stob [66] showed that no property of the lattice of c.e. supersets of a c.e. set can by itself guarantee incompleteness. Finally, Harrington and Soare [176] did find an elementary property of c.e.
sets (that is, one that is first-order definable in the language of the lattice of c.e. sets) that guarantees incompleteness.

There is a large amount of fascinating material related to this topic. For instance, Cholak and Harrington [68] have shown that one can define all "double jump" classes in the lattice of c.e. sets using infinitary formulas. The methods are intricate and rely on analyzing the failure of the "automorphism machinery" first developed by Soare and later refined by him and others, particularly Cholak and Harrington.

Properties like simplicity (defined in Section 2.2) and hypersimplicity do have implications for the degrees of sets related to a given c.e. set. For instance, Stob [383] showed that a c.e. set is simple iff it does not have c.e. supersets of all c.e. degrees. Downey [100] showed that if A is hypersimple then there is a c.e. degree b ≤ deg(A) such that if A0 ⊔ A1 is a c.e. splitting of A, then neither of the Ai has degree b.
2.21 PA degrees

In this section, we assume familiarity with basic concepts of mathematical logic, such as the notion of a first-order theory, the formal system of Peano Arithmetic, and Gödel's Incompleteness Theorem, as may be found in an introductory text such as Enderton [139]. Those unfamiliar with this material may take Corollary 2.21.4 below as a definition of the notion of a PA degree (which makes Theorem 2.21.3 immediate).

The class of complete extensions of a given computably axiomatizable first-order theory is a Π01 class. (We assume here that all theories are consistent.) Jockusch and Soare [195, 196] showed that, in fact, for every Π01 class P there is a computably axiomatizable first-order theory T such that the class of degrees of members of P coincides with the class of degrees of complete extensions of T. (This result was later extended by Hanf [174] to finitely axiomatizable theories.)

As we will see, there are Π01 classes P that are universal, in the sense that any set that can compute an element of P can compute an element of any nonempty Π01 class. A natural example of such a class is the class of complete extensions of Peano Arithmetic. The degrees of elements of this class are known as PA degrees.

Definition 2.21.1. A degree a is a PA degree if it is the degree of a complete extension of Peano Arithmetic.

Theorem 2.21.2 (Scott Basis Theorem [352]). If a is a PA degree then the sets computable from a form a basis for the Π01 classes.

Proof. Let S be a complete extension of PA of degree a.¹⁸ Let T be an infinite computable tree. We compute a path through T using S by defining a sequence σ0 ≺ σ1 ≺ · · · of elements of T. Let σ0 = λ. Suppose that σn has been defined in such a way that there is a path through T extending σn. If there is a unique i such that σn i ∈ T, then let σn+1 = σn i. Otherwise, both σn 0 and σn 1 are in T. For i = 0, 1, let θi be the sentence

∃m (σn i has an extension of length m in T but σn (1 − i) does not).

These sentences can be expressed in the language of first-order arithmetic, by the kind of coding used in the proof of Gödel's Incompleteness Theorem.

Within PA, we can argue as follows. Suppose that θ0 and θ1 both hold. Then for each i = 0, 1, there is a least mi such that σn i has an extension of length mi in T but σn (1 − i) does not. Let i be such that mi ≤ m1−i. Then σn (1 − i) has an extension of length m1−i, and hence one of length mi, which contradicts the choice of mi.

¹⁸ We do not actually need S to be complete, but only consistent and deductively closed.
Thus PA ⊢ ¬(θ0 ∧ θ1), so there is a least i such that θ1−i ∉ S. Let σn+1 = σn i. There must be a path through T extending σn+1, since otherwise we would have PA ⊢ θ1−i, and hence θ1−i ∈ S.

In unpublished work, Solovay showed that, in fact, the property in Theorem 2.21.2 exactly characterizes the PA degrees. This fact, which we will state as Corollary 2.21.4 below, follows immediately from the following result, which is important in its own right.

Theorem 2.21.3 (Solovay [unpublished]). The class of PA degrees is closed upwards.

Proof. Let A be the set of (Gödel numbers of) sentences provable in PA and B the set of (Gödel numbers of) sentences refutable from PA. We follow the proof in Odifreddi [310], which uses the fact that A and B form an effectively inseparable pair of c.e. sets (which recall means that there is a computable function f such that for all disjoint c.e. We ⊇ A and Wi ⊇ B, we have f(e, i) ∉ We ∪ Wi).

Let T be a complete extension of PA computable in a set X. Using T, we will build finite extensions Fσ of PA for σ ∈ 2<ω so that ∪σ≺X Fσ is a complete extension of PA with the same degree as X.

Let ϕ0, ϕ1, . . . be an effective listing of the sentences of arithmetic. Let Fλ = PA. Suppose that we have defined Fσ to be a finite consistent extension of PA, and let n = |σ|. Let ψ0 be an existential sentence of arithmetic expressing the statement that

∃m (m codes a proof of ϕn in Fσ and no k < m codes a proof of ¬ϕn in Fσ).

Let ψ1 be the same with the roles of ϕn and ¬ϕn interchanged. We have ψi ∈ T iff ψi is true, and at most one ψi is true. If ψ0 ∈ T then let F′σ = Fσ ∪ {ϕn}, and otherwise let F′σ = Fσ ∪ {¬ϕn}.

Let C be the sentences provable in F′σ and D the sentences refutable from F′σ. Then C and D are c.e. sets containing A and B, respectively, and we can effectively find indices for C and D. Since A and B form an effectively inseparable pair, we can compute a sentence ν ∉ C ∪ D. Let Fσ0 = F′σ ∪ {ν} and Fσ1 = F′σ ∪ {¬ν}.

Let S = ∪σ≺X Fσ.
Then S is a complete extension of PA, and is X-computable, since T is X-computable. Given S, we can compute X as follows. Assume by induction that we have computed X ↾ n and FX↾n. We know which of ϕn and ¬ϕn is in S, so we know F′X↾n. Thus we can compute ν. We know which of ν and ¬ν is in S, so we know X ↾ (n + 1) and FX↾(n+1). Thus S ≡T X.

Since the complete extensions of PA form a Π01 class, if the sets computable from a degree a form a basis for the Π01 classes, then in particular a computes a complete extension of PA, and hence, by the above theorem, is itself a PA degree. Thus we have the following result.

Corollary 2.21.4 (Solovay). A degree is PA iff the sets computable from it form a basis for the Π01 classes.

A useful way to restate this corollary is that a degree a is PA iff every computable infinite subtree of 2<ω has an a-computable path. Another way to characterize PA degrees is via effectively inseparable pairs of c.e. sets. Let A and B form such a pair. As previously mentioned, the class of all sets that separate A and B is a Π01 class, so every PA degree computes such a set. Conversely, given a set C separating A and B, we can use the following useful lemma to obtain a C-computable complete extension of PA.

Lemma 2.21.5 (Muchnik [285], Smullyan [362]). Let C be a separating set for an effectively inseparable pair of c.e. sets A and B, and let X and Y be disjoint c.e. sets. Then there is a C-computable set separating X and Y.

Proof. Let f be a total function witnessing the effective inseparability of A and B. It is not hard to see that we may assume that f is 1-1. Using the recursion theorem with parameters, we can define computable 1-1 functions g and h such that for all n we have

Wg(n) = A ∪ {f(g(n), h(n))} if n ∈ Y, and Wg(n) = A otherwise,

and

Wh(n) = B ∪ {f(g(n), h(n))} if n ∈ X, and Wh(n) = B otherwise.19
Let D = {n : f(g(n), h(n)) ∈ C}. If n ∈ X then f(g(n), h(n)) ∈ Wg(n) ∪ Wh(n), so Wg(n) and Wh(n) cannot be disjoint. Since n ∉ Y, we have Wg(n) = A, and since A and B are disjoint, the only element that A and Wh(n) = B ∪ {f(g(n), h(n))} can have in common is f(g(n), h(n)) itself. It follows that f(g(n), h(n)) ∈ A ⊆ C, so n ∈ D. The same argument shows that if n ∈ Y then f(g(n), h(n)) ∈ B, so n ∉ D. Thus D is a separating set for X and Y.

Theorem 2.21.6 (Jockusch and Soare [196]). If C is a separating set for an effectively inseparable pair then C has PA degree. Thus a degree is PA iff it computes a separating set for an effectively inseparable pair.

Proof. Let X be the set of sentences provable in PA and Y the set of sentences refutable from PA. By the above lemma, there is a C-computable set D separating X and Y. Let ϕ0, ϕ1, . . . be an effective listing of the

19 It is a useful exercise to work out the full details of the application of the recursion theorem in this case.
sentences of arithmetic. Let T0 = ∅. Suppose we have defined the finite set of sentences Tn. Let ψ = (⋀ϕ∈Tn ϕ) → ϕn. If ψ ∈ D then we know ψ is not refutable from PA, so (⋀ϕ∈Tn ϕ) → ¬ϕn is not provable from PA, and we can let Tn+1 = Tn ∪ {ϕn}. Otherwise, ψ is not provable from PA, so we can let Tn+1 = Tn ∪ {¬ϕn}. Let T = ⋃n Tn. It is easy to check that T is a D-computable (and hence C-computable) complete extension of PA.

The following result follows immediately from the Low Basis Theorem 2.19.8. (Indeed, this result was Jockusch and Soare's first application of their theorem.)

Theorem 2.21.7 (Jockusch and Soare [196]). There is a low PA degree.

By upward closure, 0′ is a PA degree. However, combining Theorems 2.19.17 and 2.21.2, we see that it is the only c.e. PA degree.

Theorem 2.21.8 (Jockusch and Soare [195]). No incomplete c.e. degree is PA.

The following is another useful fact about the distribution of PA degrees.

Theorem 2.21.9 (Scott and Tennenbaum [351]). A PA degree cannot be minimal.

Proof. Let P = {A ⊕ B : ∀e (A(e) ≠ Φe(e) ∧ B(e) ≠ ΦAe(e))}. If a set S has PA degree then there is an A ⊕ B ∈ P such that A ⊕ B ≤T S. It follows from the definition of P that A is noncomputable (since A(e) ≠ Φe(e) for all e) and that B ≰T A (since B(e) ≠ ΦAe(e) for all e). Thus 0 < deg(A) < deg(A ⊕ B) ≤ deg(S), so the degree of S is not minimal.
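The step-by-step completion in the proof of Theorem 2.21.6 (add ϕn when the relevant implication lies in the separating set D, and ¬ϕn otherwise) can be run in full in the decidable propositional setting, where "not refutable" is simply "satisfiable". The following Python sketch is only a toy analog of that procedure; the atom names and helper functions are illustrative, not from the text.

```python
from itertools import product

ATOMS = ["p0", "p1", "p2", "p3"]

def satisfiable(constraints):
    """Truth-table check: is there an assignment making every constraint true?"""
    for values in product([False, True], repeat=len(ATOMS)):
        env = dict(zip(ATOMS, values))
        if all(c(env) for c in constraints):
            return True
    return False

def complete_extension(axioms):
    """Lindenbaum-style completion: decide each atom in turn, mirroring
    T_{n+1} = T_n ∪ {φ_n} or T_n ∪ {¬φ_n} in the proof sketched above."""
    theory = list(axioms)
    decisions = {}
    for atom in ATOMS:
        # "ψ is not refutable" becomes: the theory plus the atom is satisfiable.
        if satisfiable(theory + [lambda env, a=atom: env[a]]):
            theory.append(lambda env, a=atom: env[a])
            decisions[atom] = True
        else:
            theory.append(lambda env, a=atom: not env[a])
            decisions[atom] = False
    return decisions

# Axiom: p0 → p1 (so deciding p0 true forces p1 true).
axioms = [lambda env: (not env["p0"]) or env["p1"]]
print(complete_extension(axioms))
```

Because each atom is added only when it is jointly satisfiable with what came before, the resulting assignment is a complete consistent extension of the axioms, mirroring how T extends PA.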
2.22 Fixed-point free and diagonally noncomputable functions

A total function f is fixed-point free if Wf(e) ≠ We for all e. By the recursion theorem, no computable function is fixed-point free, and indeed the fixed-point free functions can be thought of as those that avoid having fixed points in the sense of the recursion theorem. As we will see (in Section 8.17, for example), fixed-point free functions have interesting ramifications in both classical computability theory and the theory of algorithmic randomness. A related concept is that of a diagonally noncomputable (DNC) function, where the total function g is DNC if g(e) ≠ Φe(e) for all e. A degree is diagonally noncomputable if it computes a DNC function.

Theorem 2.22.1 (Jockusch, Lerman, Soare, and Solovay [191]). The following are equivalent.

(i) The set A computes a fixed-point free function.

(ii) The set A computes a total function h such that Φh(e) ≠ Φe for all e.
(iii) The set A computes a DNC function.

(iv) For each e there is a total function h ≤T A such that h(n) ≠ Φe(n) for all n.

(v) For each total computable function f there is an index i with ΦAi total and ΦAi(n) ≠ Φf(i)(n) for all n.

Proof. We begin by proving the equivalence of (i)–(iii). Clearly (i) implies (ii), since if Wh(e) ≠ We then Φh(e) ≠ Φe. To show that (ii) implies (iii), suppose that h is as in (ii). Define Φd(u) as in the proof of the recursion theorem. That is, Φd(u)(z) is ΦΦu(u)(z) if Φu(u) ↓, and Φd(u)(z) ↑ otherwise. As we have seen, we can choose d to be a total computable function. Let g = h ∘ d. Note that g ≤T A. Suppose that g(e) = Φe(e). Then by (ii) we have Φd(e) ≠ Φh(d(e)) = Φg(e) = ΦΦe(e) = Φd(e), which is a contradiction. So g is DNC.

To show that (iii) implies (i), fix a partial computable function ψ such that, for all e, if We ≠ ∅ then ψ(e) ∈ We, and let q be a computable function with Φq(e)(q(e)) = ψ(e) for all e. Let g ≤T A be DNC, and define f ≤T A so that Wf(e) = {g(q(e))}. Suppose that We = Wf(e). Then We ≠ ∅, so ψ(e) ∈ We, which implies that ψ(e) = g(q(e)). But g is DNC, so g(q(e)) ≠ Φq(e)(q(e)) = ψ(e), which is a contradiction. So f is fixed-point free.

We now prove the equivalence of (iv) and (v). To show that (iv) implies (v), let f be a computable function, and let e be such that Φe(i, j) = Φf(i)(j). Fix h ≤T A satisfying (iv) for this e. For all i and n, let hi(n) = h(i, n). Then there is a total computable function p with ΦAp(i) = hi for all i. By the relativized form of the recursion theorem, there is an i such that ΦAp(i) = ΦAi. Then ΦAi is total, and

ΦAi(n) = ΦAp(i)(n) = h(i, n) ≠ Φe(i, n) = Φf(i)(n).
To show that (v) implies (iv), assume that (iv) fails for some e. Let f(i) = e for all i. Then for all i such that ΦAi is total, there is an n such that ΦAi(n) = Φe(n) = Φf(i)(n), so (v) fails.

Finally, we show that (iii) and (iv) are equivalent. To show that (iii) implies (iv), given e, let f be a total computable function such that Φf(n)(x) = Φe(n) for all n and x. Let g ≤T A be DNC and let h = g ∘ f. Then Φe(n) = Φf(n)(f(n)) ≠ g(f(n)) = h(n). To show that (iv) implies (iii), let e be such that Φe(n) = Φn(n) for all n, and let h be as in (iv) for this e. Then h(n) ≠ Φe(n) = Φn(n) for all n, so h is DNC.
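Computing a DNC function requires exactly the diagonal information Φe(e); given that information, the diagonalization itself is trivial. A minimal Python sketch, assuming a hypothetical finite table `diag` standing in for the (noncomputable) halting data:

```python
def dnc_from_halting_info(diag):
    """Return a function g with g(e) != Phi_e(e) for every e in the table.

    `diag` is a hypothetical finite stand-in for halting information:
    it maps e to the value of Phi_e(e), or to None when Phi_e(e) diverges.
    """
    def g(e):
        value = diag.get(e)
        # On divergence any value is diagonal; on convergence, shift by one.
        return 0 if value is None else value + 1
    return g

def dnc01_from_halting_info(diag):
    """{0,1}-valued variant (the PA-degree case of Theorem 2.22.2):
    only the bits among the values Phi_e(e) need to be avoided, so flip them."""
    return lambda e: 1 - diag[e] if diag.get(e) in (0, 1) else 0

# Hypothetical diagonal values Phi_e(e) for e = 0..4 (None = divergent).
diag = {0: 7, 1: None, 2: 0, 3: 1, 4: 41}
g = dnc_from_halting_info(diag)
assert all(g(e) != diag[e] for e in diag if diag[e] is not None)
```

The point of the theory, of course, is that no such table is computable, which is why DNC functions calibrate noncomputability.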
Further results about DNC functions, including ones relating them to Kolmogorov complexity, will be discussed in Chapter 8, particularly in Theorem 8.16.8. For DNC functions with range {0, 1}, we have the following result.

Theorem 2.22.2 (Jockusch and Soare [195], Solovay [unpublished]). A degree is PA iff it computes a {0, 1}-valued DNC function.

Proof. It is straightforward to define a Π01 class P such that the characteristic function of any element of P is a {0, 1}-valued DNC function. So by the Scott Basis Theorem 2.21.2, every PA degree computes a {0, 1}-valued DNC function. Now let A compute a {0, 1}-valued DNC function g. Then g is the characteristic function of a set separating the effectively inseparable pair {e : Φe(e) = 0} and {e : Φe(e) = 1}. So by Theorem 2.21.6, g has PA degree. Since the class of PA degrees is closed upwards, A has PA degree as well.

The following is a classic result on the interaction between fixed-point free functions and computable enumerability.

Theorem 2.22.3 (Arslanov's Completeness Criterion [13]). A c.e. set is Turing complete iff it computes a fixed-point free function.

Proof. It is easy to define a total ∅′-computable fixed-point free function. For the nontrivial implication, let A be a c.e. set that computes a fixed-point free function f. By speeding up the enumeration of A, we may assume we have a reduction ΓA = f such that ΓA(n)[s] ↓ for all s and n ≤ s. For each n we build a set Wh(n), with the total computable function h given by the recursion theorem with parameters. Initially, Wh(n) = ∅. If n ∈ ∅′s then for every x ∈ WΓA(h(n))[s][s], we put x into Wh(n) if it is not already there. (The full details of the application of the recursion theorem are much as in the footnote to the proof of Theorem 2.19.16.)

We claim we can compute ∅′ from A using h. To prove this claim, fix n. If n ∈ ∅′, then consider the stage s at which n enters ∅′.
If the computation ΓA(h(n)) has settled by stage s, then every x ∈ WΓA(h(n)) is eventually put into Wh(n), while no other x ever enters Wh(n). So Wf(h(n)) = WΓA(h(n)) = Wh(n), contradicting the choice of f. Thus it must be the case that the computation ΓA(h(n)) has not settled by stage s. So, to decide whether n ∈ ∅′, we simply look for a stage s by which the computation ΓA(h(n)) has settled (which we can do computably in A because A is c.e.). Then n ∈ ∅′ iff n ∈ ∅′s.

The above argument clearly works for wtt-reducibility as well, so a c.e. set is wtt-complete iff it wtt-computes a fixed-point free function. Arslanov [14] has investigated similar completeness criteria for other reducibilities such as tt-reducibility.

Antonín Kučera realized that fixed-point free functions have much to say about c.e. degrees, even beyond Arslanov's Completeness Criterion, and, as we will later see, about algorithmic randomness. In particular, in [216] he showed that if f ≤T ∅′ is fixed-point free, then there is a noncomputable c.e. set B ≤T f. We will prove this result in a slightly stronger form after introducing the following notion.

Definition 2.22.4 (Maass [256]). A coinfinite c.e. set A is promptly simple if there are a computable function f and an enumeration {As}s∈ω of A such that for all e,

|We| = ∞ ⇒ ∃∞x ∃s (x ∈ We[s] \ We[s − 1] ∧ x ∈ Af(s)).

The idea of this definition is that x needs to enter A "promptly" after it enters We. We will say that a degree is promptly simple if it contains a promptly simple set. The usual simple set constructions (such as that of Post's hypersimple set) yield promptly simple sets. We can now state the stronger form of Kučera's result mentioned above.

Theorem 2.22.5 (Kučera [216]). If f ≤T ∅′ is fixed-point free then there is a promptly simple c.e. set B ≤T f.

Proof. Let f ≤T ∅′ be fixed-point free. We build B ≤T f to satisfy the following requirements for all e:

Re : |We| = ∞ ⇒ ∃x ∃s (x ∈ We[s] \ We[s − 1] ∧ x ∈ Bs).

To see that these requirements suffice to ensure the prompt simplicity of B (assuming that we also make B c.e. and coinfinite), notice that it is not hard to define a computable function g such that for all e and sufficiently large n, there is an i such that Wi = We ∩ {n, n + 1, . . .} and, for x ≥ n, if x ∈ We[s] \ We[s − 1] then x ∈ Wi[g(s)] \ Wi[g(s) − 1]. So if the requirements are satisfied, then for each e and n,

|We| = ∞ ⇒ ∃x > n ∃s (x ∈ We[s] \ We[s − 1] ∧ x ∈ Bg(s)).

During the construction, we will define an auxiliary computable function h using the recursion theorem with parameters. (We will give the details of this somewhat subtle application of the recursion theorem after outlining the construction.) At stage s, act as follows for each e ≤ s such that

1. Re is not yet met,

2. some x > max{2e, h(e)} is in We[s] \ We[s − 1], and

3. fs(h(e)) = ft(h(e)) for all t with x ≤ t ≤ s.

Enumerate x into B (which causes Re to be met). Using the recursion theorem with parameters, let Wh(e) = Wfs(h(e)).

In greater detail, we think of the construction as having a parameter i and enumerating a set Bi. We define an auxiliary partial computable function g, which we then use to define h. At stage s, act as follows for each e ≤ s such that
1. Re is not yet met (with Bi in place of B),

2. some x > max{2e, i} is in We[s] \ We[s − 1], and

3. fs(i) = ft(i) for all t with x ≤ t ≤ s.

Enumerate x into Bi (which causes Re to be met) and let g(i, e) = fs(i). Using the recursion theorem with parameters (in the version for partial computable functions mentioned following the proof of Theorem 2.3.2), let h be a total computable function such that Wh(e) = Wg(h(e),e) whenever g(h(e), e) ↓. Let B = Bh(e). It is not hard to check that h is as above.

Having concluded the construction, we first verify that B is promptly simple. Clearly, B is c.e., and it is coinfinite because at most one number is put into B for the sake of each Re, and that number must be greater than 2e. As argued above, it is now enough to show that each Re is met. So suppose that |We| = ∞. Let u be such that ft(h(e)) = f(h(e)) for all t ≥ u. There must be some x > max{2e, h(e), u} and some s such that x ∈ We[s] \ We[s − 1]. If Re is not yet met at stage s, then x will enter B at stage s, thus meeting Re.

We now show that B ≤T f. Let q(x) = μu > x ∀y ≤ x (fu(y) = f(y)). Then q ≤T f. We claim that x ∈ B iff x ∈ Bq(x). Suppose for a contradiction that x ∈ B \ Bq(x). Then x must have entered B for the sake of some Re at some stage s > q(x). Thus fs(h(e)) = ft(h(e)) for all t with x ≤ t ≤ s. However, x < q(x) < s, so Wh(e) = Wfs(h(e)) = Wfq(x)(h(e)) = Wf(h(e)), contradicting the fact that f is fixed-point free.

As Kučera [216] noted, Theorem 2.22.5 can be used to give a priority-free solution to Post's Problem: Let a be a low PA degree (which exists by the Low Basis Theorem). By Theorem 2.22.2, a computes a DNC function, so by Theorem 2.22.1, a computes a fixed-point free function. Thus a computes a promptly simple c.e. set, which is noncomputable and low, and hence a solution to Post's Problem.

Prompt simplicity also has some striking structural consequences. A c.e. degree a is cappable if there is a c.e. degree b > 0 such that a ∩ b = 0.
Otherwise, a is noncappable.

Theorem 2.22.6 (Ambos-Spies, Jockusch, Shore, and Soare [7]). Every promptly simple degree is noncappable.

Proof. Let A be promptly simple, with witness function f. Let B be a noncomputable c.e. set. We must build a c.e. set C ≤T A, B meeting, for all e, the requirements

Re : We ≠ C̄

(so that C̄ is not c.e., and hence C is noncomputable). To do so we build auxiliary c.e. sets Ve = Wh(e) with indices h(e) given by the recursion theorem with parameters. Analyzing the proof of the recursion theorem, we see that there is a computable function g such that for almost all n, if we decide to put n into Ve at a stage s, then n enters Wh(e) before stage g(s).

Our strategy for meeting Re is straightforward. We pick a follower n targeted for C. If we ever see a stage s such that n ∈ We,s, we declare n to be active. If We ∩ Cs = ∅, then we pick a new follower n′ > n. For an active n, if some number less than n enters B at a stage t, we enumerate n into Ve. If n ∈ Af(g(t)) then we put n into C. In any case, we declare n to be inactive.

Assume for a contradiction that We = C̄. Then infinitely many followers become active during the construction. Since B is noncomputable, infinitely many of these enter Wh(e), so by the prompt simplicity of A, some active follower enters C. But then We ∩ C ≠ ∅, so We ≠ C̄, a contradiction.

The strategies for different requirements act independently, choosing followers in such a way that no two strategies ever have the same follower. Thus all requirements are met. Whenever we enumerate n into C at a stage t, this action is permitted by a change in B below n at stage t and by n itself entering A by stage f(g(t)). Since f is computable, C ≤T A, B.

Ambos-Spies, Jockusch, Shore, and Soare [7] extended the above result by showing that the promptly simple degrees and the cappable degrees form an algebraic decomposition of the c.e. degrees into a strong filter and an ideal. (See [7] for definitions.) Furthermore, they showed that the promptly simple degrees coincide with the low cuppable degrees, where a is low cuppable if there is a low c.e. degree b such that a ∪ b = 0′. The proofs of these results are technical, and would take us too far afield.

Corollary 2.22.7 (Kučera [216]). Let a and b be Δ02 degrees both of which compute fixed-point free functions. Then a and b do not form a minimal pair.

Proof. By Theorem 2.22.5, each of a and b bounds a promptly simple degree. By Theorem 2.22.6, these promptly simple degrees are noncappable, and hence do not form a minimal pair.
Thus neither do a and b.20

Kučera [217] introduced the following variation of the notion of DNC function.

Definition 2.22.8 (Kučera [217]). A total function f is generally noncomputable (GNC) if f(e, n) ≠ Φe(n) for all e and n.

We can argue as before to show that the degrees computing {0, 1}-valued GNC functions are exactly the degrees computing {0, 1}-valued DNC functions, that is, the PA degrees.

20 Kučera's original proof of this result was direct, in the style of the proof of Theorem 2.22.5, and used the (double) recursion theorem.
Theorem 2.22.9 (Kučera [217]). Let A be a nonempty Π01 class whose elements are all {0, 1}-valued GNC functions, and let C be a set. There is a function g ∈ A and an index e such that g(e, n) = C(n) for all n. Furthermore, e can be found effectively from an index for A.

Proof. Using the recursion theorem, choose e such that Φe(n) is defined as follows. Suppose there is a τ ∈ 2<ω such that A ∩ {f : ∀n < |τ| (f(e, n) = τ(n))} = ∅. This condition is c.e., so we can choose the first τ we see satisfying it and let Φe(n) = 1 − τ(n) for n < |τ| and Φe(n) = 0 for n ≥ |τ|. If there is no such τ, then let Φe(n) ↑ for all n.

If there is such a τ, then for each {0, 1}-valued GNC function f we have f(e, n) ≠ Φe(n) = 1 − τ(n), and hence f(e, n) = τ(n), for all n < |τ|. Thus A ∩ {f : ∀n < |τ| (f(e, n) = τ(n))} = A ≠ ∅, contradicting the definition of τ. Thus there is no such τ, and hence for every σ there is a g ∈ A such that g(e, n) = σ(n) for all n < |σ|. By compactness, there is a g ∈ A such that g(e, n) = C(n) for all n.

Kučera [217] remarked that the use of an e with Φe(n) ↑ for all n is an analog of the use of what are known as flexible formulas in axiomatizable theories, first introduced and studied by Mostowski [283] and Kripke [214] in connection with Gödel's Incompleteness Theorem. In unpublished work,21 Kučera also noted that the above proof can be combined with the proof of the Low Basis Theorem to show that if c is a low degree then there is a low PA degree a ≥ c. Combining this result with Kučera's priority-free solution to Post's Problem, we see that for any low degree c there is a nonzero c.e. degree b such that b ∪ c is low. This result is particularly interesting in light of the fact that Lewis [245] has constructed a minimal (and hence low2) degree c such that b ∪ c = 0′ for every nonzero c.e. degree b. Kučera has also used the above method to construct PA degrees in various jump classes.
2.23 Array noncomputability and traceability

In this section, we discuss the array noncomputable degrees introduced by Downey, Jockusch, and Stob [120, 121]. The original definition of this class was in terms of very strong arrays. Recall that a strong array is a computable collection of pairwise disjoint finite sets {Fn : n ∈ N} (which recall means not only that the Fn are uniformly computable, but also that the function i → max Fi is computable).

21 This work is now mentioned as example 2.1 in Kučera and Slaman [222]. In unpublished work, Kučera and Slaman have also shown that there is a low2 degree that joins all Δ02 fixed-point free degrees to 0′.
Definition 2.23.1 (Downey, Jockusch, and Stob [120]).

(i) A strong array {Fn : n ∈ N} is a very strong array if |Fn| > |Fm| for all n > m.

(ii) For a very strong array F = {Fn : n ∈ N}, we say that a c.e. set A is F-array noncomputable (F-a.n.c.) if for each c.e. set W there is a k such that W ∩ Fk = A ∩ Fk.

Notice that for any c.e. set W and any k there is a c.e. set Ŵ such that Ŵ ∩ Fk ≠ W ∩ Fk but Ŵ(n) = W(n) for n ∉ Fk. From this observation it follows that if A is F-a.n.c. then for each c.e. set W there are infinitely many k such that W ∩ Fk = A ∩ Fk.

This definition was designed to capture a certain kind of multiple permitting construction. The intuition is that for A to be F-a.n.c., A needs |Fk| many permissions to agree with W on Fk. As we will see below, up to degree, the choice of very strong array does not matter.

Downey, Jockusch, and Stob [120] used multiple permitting to show that several properties are equivalent to array noncomputability. For example, the a.n.c. c.e. degrees are precisely those that bound c.e. sets A1, A2, B1, B2 such that A1 ∩ A2 = B1 ∩ B2 = ∅ and every separating set for A1, A2 is Turing incomparable with every separating set for B1, B2. (There are a number of other characterizations of the array noncomputable c.e. degrees in [120, 121].) We will see two examples of this technique connected with algorithmic randomness in Theorems 9.14.4 and 9.14.7 below.

In Downey, Jockusch, and Stob [121], a new definition of array noncomputability was introduced, based on domination properties of functions and not restricted to c.e. sets.

Definition 2.23.2 (Downey, Jockusch, and Stob [121]). A degree a is array noncomputable (a.n.c.) if for each f ≤wtt ∅′ there is a function g computable in a such that g(n) ≥ f(n) for infinitely many n. Otherwise, a is array computable.

Note that, using this definition, the a.n.c. degrees are clearly closed upwards. The following results give further characterizations of array noncomputability.
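For concreteness, one very strong array consists of the blocks of consecutive integers of sizes 1, 2, 3, . . . . The following Python sketch is purely illustrative (the names are not from the text); it checks the defining properties: pairwise disjointness, strictly increasing sizes, and computability of i → max Fi.

```python
def F(n):
    """F_n = the n-th consecutive block of size n + 1, so |F_n| = n + 1 and
    the blocks partition the natural numbers: F_0 = {0}, F_1 = {1, 2}, ..."""
    start = n * (n + 1) // 2
    return set(range(start, start + n + 1))

def max_F(n):
    # i -> max F_i is computable, as the definition of a strong array requires.
    return n * (n + 1) // 2 + n

assert F(0) == {0} and F(1) == {1, 2}
assert all(len(F(n)) == n + 1 for n in range(20))          # |F_n| strictly increasing
assert all(F(n).isdisjoint(F(m))
           for n in range(10) for m in range(10) if n != m)  # pairwise disjoint
assert all(max(F(n)) == max_F(n) for n in range(20))
```

In the multiple-permitting intuition above, agreeing with W on this F_k requires up to k + 1 separate permissions, one per element of the block.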
They also show that, for c.e. sets, the two definitions of array noncomputability coincide and the first definition is independent of the choice of very strong array up to degree. Let m∅′(n) be the least s such that ∅′ ↾ n = ∅′s ↾ n.

Theorem 2.23.3 (Downey, Jockusch, and Stob [121]). Let a be a degree, and let {Fn}n∈ω be a very strong array. The following are equivalent.

(i) The degree a is a.n.c.

(ii) There is a function h computable in a such that h(n) ≥ m∅′(n) for infinitely many n.
(iii) There is a function g computable in a such that for each e there is an n for which We ∩ Fn = We,g(n) ∩ Fn.

Proof. To prove that (i) → (iii), let f(n) = μs ∀e ≤ n (We ∩ Fn = We,s ∩ Fn). Then f ≤wtt ∅′, so there is an a-computable function g such that g(n) ≥ f(n) for infinitely many n.

To prove that (ii) → (i), let f ≤wtt ∅′, and let h satisfy (ii). Fix e and a computable function b such that f(n) = Φ∅′e(n) with use at most b(n) for all n. We may assume without loss of generality that h and b are increasing. We define a function g such that g(n) = f(n) for infinitely many n as follows. Given n, let s be minimal such that s > h(b(n + 1)) and Φ∅′e(n)[s] ↓ with use at most b(n), and let g(n) = Φ∅′e(n)[s]. Clearly g is computable in a. Let n and k be such that b(n) ≤ k ≤ b(n + 1) and h(k) ≥ m∅′(k), and let s be as in the definition of g(n). We have s ≥ h(b(n + 1)) ≥ h(k) ≥ m∅′(k) ≥ m∅′(b(n)), so ∅′s ↾ b(n) = ∅′ ↾ b(n), and hence g(n) = f(n). Since there are infinitely many j with h(j) ≥ m∅′(j), there are infinitely many n for which there is a k as above, and hence g(n) = f(n) for infinitely many n.

It remains to show that (iii) → (ii). Let g be as in (iii). We claim that for each e there are infinitely many n with We ∩ Fn = We,g(n) ∩ Fn. Suppose otherwise. Let i be such that Φi is defined as follows. If x ∈ Fn for one of the finitely many n such that We ∩ Fn = We,g(n) ∩ Fn, then let Φi(x) converge in strictly more than g(n) steps. Otherwise, if Φe(x) ↓ then let Φi(x) converge in at least the same number of steps as Φe(x), and if Φe(x) ↑ then let Φi(x) ↑. Then there is no n such that Wi ∩ Fn = Wi,g(n) ∩ Fn, contradicting the choice of g. Thus the claim holds, so it suffices to show that there is an e such that μs (We,s ∩ Fn+1 = We ∩ Fn+1) ≥ m∅′(n) for all n, since it then follows that (ii) holds with h(n) = g(n + 1).

Define a c.e. set V as follows. At stage s, let cn,s be the least element (if any) of Fn+1 \ Vs. For each n < s, if ∅′s+1 ↾ n ≠ ∅′s ↾ n then enumerate cn,s into V. Also enumerate cs,s into V. Note that

|Fn+1 ∩ V| ≤ |{s : ∃i < n (i ∈ ∅′s+1 \ ∅′s)}| + 1 ≤ n + 1 < |Fn+1|,

so cn,s is defined for all n and s. It follows that μs (Vs ∩ Fn+1 = V ∩ Fn+1) ≥ m∅′(n). By Theorem 2.3.3, there is an e such that We = V and We,s ⊆ Vs for all s. Then μs (We,s ∩ Fn+1 = We ∩ Fn+1) ≥ μs (Vs ∩ Fn+1 = V ∩ Fn+1) ≥ m∅′(n) for all n, as required.

Since the truth of item (i) in Theorem 2.23.3 does not depend on a choice of very strong array, neither does the truth of item (iii).

Theorem 2.23.4 (Downey, Jockusch, and Stob [120, 121]). Let a be a c.e. degree and let F = {Fn}n∈N be a very strong array. Then a is a.n.c. in the sense of Definition 2.23.2 iff there is an F-a.n.c. c.e. set A ∈ a.

Proof. (⇐) Let A ∈ a be an F-a.n.c. c.e. set. We show that item (iii) in Theorem 2.23.3 holds with g(n) = μs (As ∩ Fn = A ∩ Fn). For each e, let Ve = {x : ∃s (x ∈ We,s \ As)}. Each Ve is c.e., so there is an n such that A ∩ Fn = Ve ∩ Fn. If x enters We ∩ Fn after stage g(n), then we have two cases: if x ∈ A then x ∈ Ag(n), so x ∉ Ve, while if x ∉ A then x ∈ Ve. Both these possibilities contradict the fact that A ∩ Fn = Ve ∩ Fn, so We ∩ Fn = We,g(n) ∩ Fn.

(⇒) Let a be an a.n.c. c.e. degree and let {Fn}n∈N be a very strong array. Let f(n) = μs ∀e ≤ n (We,s ∩ Fe+1,n = We ∩ Fe+1,n). Clearly f ≤wtt ∅′, so there is an a-computable g such that g(n) ≥ f(n) for infinitely many n. Since g is a-computable and a is c.e., there is a computable function h(n, s) and an a-computable function p such that g(n) = h(n, s) for all s ≥ p(n).

We now define the c.e. set A. First, we let B ∈ a be c.e. and put ⋃n∈B F0,n into A. We will not put any other elements of any F0,n into A, so this action ensures that A has degree at least a. Now it suffices to ensure that if n ≥ e and g(n) ≥ f(n), then A ∩ Fe+1,n = We ∩ Fe+1,n. Whenever h(n, s) ≠ h(n, s + 1) and e ≤ n ≤ s, we put all elements of We,h(n,s+1) ∩ Fe+1,n into A at stage s. The definition of A ensures that if x ∈ A ∩ Fe+1,n then x ∈ Ap(n), so A is computable in p and hence A ∈ a.

Suppose now that n ≥ e and g(n) ≥ f(n). It follows from the definition of f that We,g(n) ∩ Fe+1,n = We ∩ Fe+1,n. Choose s as large as possible so that h(n, s) ≠ h(n, s + 1). (There is no loss of generality in assuming there is at least one such s ≥ n.) Then h(n, s + 1) = g(n), and so As+1 ∩ Fe+1,n = We,h(n,s+1) ∩ Fe+1,n = We,g(n) ∩ Fe+1,n = We ∩ Fe+1,n. Furthermore, by the maximality of s, no elements of Fe+1,n enter A after stage s + 1, so A ∩ Fe+1,n = We ∩ Fe+1,n.

Corollary 2.23.5 (Downey, Jockusch, and Stob [120, 121]). Let F and G be very strong arrays. Then a degree contains an F-a.n.c. c.e. set iff it contains a G-a.n.c. c.e. set.

We do not need to consider all wtt-reductions in the definition of array noncomputability, but can restrict ourselves to reductions with use bounded by the identity function.

Proposition 2.23.6 (Downey and Hirschfeldt, generalizing [120]).
A degree a is a.n.c. iff for every f that is computable from ∅′ via a reduction with use bounded by the identity function, there is an a-computable function g such that g(n) ≥ f(n) for infinitely many n.

Proof. The "only if" direction is obvious. We prove the "if" direction. Suppose that Γ∅′ = f with use bounded by the increasing computable function p. Let Φ∅′(n) = max{Γ∅′(m) : p(m) ≤ n}. Then Φ∅′ has use bounded by the identity, so there is an a-computable increasing function g such that g(n) ≥ Φ∅′(n) for infinitely many n. Let g′(m) = g(p(m + 1)). Since p is computable, g′ is computable in a. Furthermore, if g(n) ≥ Φ∅′(n) then for the largest m such that p(m) ≤ n, we have g′(m) = g(p(m + 1)) ≥ g(n) ≥ Φ∅′(n) ≥ Γ∅′(m) = f(m), so g′(m) ≥ f(m) for infinitely many m.

Martin [258] gave a characterization of the GL2 sets quite close to the definition of array computability, as a consequence of the following result. Recall that a function f dominates a function g if f(n) ≥ g(n) for all sufficiently large n. A function is dominant if it dominates all computable functions.

Theorem 2.23.7 (Martin [258]). A set A is high iff there is a dominant function f ≤T A.

Proof. Recall that Tot = {e : Φe is total}. This set has degree ∅′′, so A is high iff Tot ≤T A′ iff there is an A-computable function h such that lims h(e, s) = Tot(e) for all e.

Suppose that A is high, and let h be as above. Define an A-computable f as follows. Given n, for each e ≤ n, look for an s ≥ n such that either

1. Φe(n)[s] ↓ or

2. h(e, s) = 0.

For each e ≤ n, one of the two must happen. Define f(n) to be larger than Φe(n) for all e ≤ n such that 1 holds. If Φe is total then h(e, s) = 1 for all sufficiently large s, so f(n) > Φe(n) for all sufficiently large n. Thus f dominates all computable functions.

Now suppose that there is a function f ≤T A that dominates all computable functions. Define h ≤T A as follows. Given e and s, check whether Φe(n)[f(s)] ↓ for all n ≤ s. If so, let h(e, s) = 1, and otherwise let h(e, s) = 0. If e ∉ Tot then clearly h(e, s) = 0 for all sufficiently large s. If e ∈ Tot and s is sufficiently large, then f(s) is greater than the stage by which Φe(n) converges for all n ≤ s, since the latter is a computable function of s. So h(e, s) = 1 for all sufficiently large s. Thus lims h(e, s) = Tot(e) for all e.

Corollary 2.23.8 (Martin [258]). A set A is GL2 iff there is a function f ≤T A ⊕ ∅′ that dominates all A-computable functions.

Proof. A set A is GL2 iff (A ⊕ ∅′)′ ≡T A′′ iff A ⊕ ∅′ is high relative to A. Thus the corollary follows by relativizing Theorem 2.23.7 to A.
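The diagonal step in Martin's proof makes f(n) exceed Φe(n) for every e ≤ n that still looks total. The same diagonal can be watched in action over a concrete finite family of total functions; the family below is a hypothetical stand-in, not from the text.

```python
# A hypothetical finite stand-in for the total functions among Φ_0, Φ_1, ...
family = [lambda n: n, lambda n: 2 * n + 5, lambda n: n * n]

def diagonal(n):
    """f(n) exceeds every family member with index e <= n — the shape of
    the dominant function built in the proof of Theorem 2.23.7."""
    return 1 + max(phi(n) for phi in family[: n + 1])

# Each member is dominated: diagonal(n) > family[e](n) once n >= e.
assert all(diagonal(n) > family[e](n)
           for e in range(len(family)) for n in range(e, 50))
```

In the actual proof the family is infinite and not uniformly computable, which is why the highness of A (via the limit function h) is needed to decide, at each n, which machines to diagonalize against.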
By Corollary 2.23.8, if a set A is not GL2 then for each function f ≤T A ⊕ ∅′ there is a function g ≤T A such that g(n) ≥ f(n) for infinitely many n. Thus if a degree is not GL2 then it is a.n.c. In particular, every array computable Δ02 degree is low2. In [121], Downey, Jockusch, and Stob demonstrated that many results previously proved for non-GL2 degrees extend to a.n.c. degrees. However, the a.n.c. degrees do not coincide with the non-GL2 degrees, nor do the a.n.c. c.e. degrees coincide with the nonlow2 c.e. degrees.

Theorem 2.23.9 (Downey, Jockusch, and Stob [120]). There is a low a.n.c. c.e. degree.

Proof. The proof is a straightforward combination of lowness and array noncomputability requirements. Let F = {Fn : n ∈ N} be a very strong array with |Fn| = n + 1. We build a c.e. set A to satisfy the requirements

Re : ∃n (We ∩ Fn = A ∩ Fn)

and

Ne : ∃∞s ΦAe(e)[s] ↓ ⇒ ΦAe(e) ↓.
For the sake of Re we simply pick a fresh n to devote to making We ∩ Fn = A ∩ Fn. We do so in the obvious way: whenever a new element enters We ∩ Fn, we put it into A. Such enumerations will happen at most n + 1 many times. Each time one happens, we initialize the weaker priority N-requirements. Similarly, each time we see a computation ΦAe(e)[s] ↓, we initialize the weaker priority R-requirements, which now must pick their n's so that Fn lies above the use of this computation. The result follows by the usual finite injury argument.

On the other hand, an easy modification of Martin's proofs above shows that if A is superlow then there is a function wtt-computable in ∅′ that dominates every A-computable function, and hence A is array computable. Array computability is also related to hyperimmunity.

Proposition 2.23.10 (Folklore). If a is hyperimmune-free then a is array computable.

Proof. If A is of hyperimmune-free degree then every f ≤T A is dominated by some total computable function. It is easy to construct a function g ≤wtt ∅′ that is a uniform bound for all partial computable functions. (That is, for each e, we have Φe(n) ≤ g(n) for almost all n such that Φe(n) ↓.) Then every function f ≤T A is dominated by g, and hence A is array computable.

The following concept, closely related to both hyperimmune-freeness and array computability, is quite useful in the study of algorithmic randomness.

Definition 2.23.11 (Zambella [416], see [386], also Ishmukhametov [184]). A set A, and the degree of A, are c.e. traceable if there is a computable function p (called a bound) such that, for each function f ≤T A, there is a computable function h (called a trace for f) satisfying, for all n,

(i) |Wh(n)| ≤ p(n) and

(ii) f(n) ∈ Wh(n).
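Definition 2.23.11 in miniature: a sample function f together with a trace assigning each n a finite set of at most p(n) candidates containing f(n). The names and the particular f below are hypothetical. (For a computable f a singleton trace would of course suffice; the force of the definition is that one computable bound p must work for every f ≤T A.)

```python
def p(n):
    # The computable bound on the size of the n-th traced set.
    return n + 1

def f(n):
    # A sample (computable) function to be traced.
    return (3 * n) % 7

def trace(n):
    """A stand-in for W_h(n): a finite set of at most p(n) candidates that
    contains f(n) (the k = 0 candidate is exactly f(n))."""
    return {(3 * n + k) % 7 for k in range(min(p(n), 7))}

# The two clauses of the definition: small sets, each capturing f(n).
assert all(f(n) in trace(n) and len(trace(n)) <= p(n) for n in range(50))
```

In the c.e.-traceable setting each Wh(n) is merely enumerable, so membership of f(n) is only ever confirmed, never refuted, by running the enumeration.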
2.23. Array noncomputability and traceability
Since one can uniformly enumerate all c.e. traces for a fixed bound p, there is a universal trace with bound p^2, say. By a universal trace we mean one that traces each function f ≤_T A on almost all inputs. As the following result shows, the above definition does not change if we replace "there is a computable function p" by "for every unbounded nondecreasing computable function p such that p(0) > 0". Proposition 2.23.12 (Terwijn and Zambella [389]). Let A be c.e. traceable and let q be an unbounded nondecreasing computable function such that q(0) > 0. Then A is c.e. traceable with bound q. Proof. Let A be c.e. traceable with some bound p. Let g ≤_T A. For each m, let r_m be such that q(r_m) ≥ p(m + 1), and let f(m) = ⟨g(0), . . . , g(r_m)⟩. Let h be as in Definition 2.23.11. Let ĥ be a computable function such that W_ĥ(n) is defined as follows. For each m, if r_m ≤ n < r_{m+1}, then enumerate into W_ĥ(n) all the nth coordinates of elements of W_h(m+1). Then |W_ĥ(n)| ≤ |W_h(m+1)| ≤ p(m + 1) ≤ q(r_m) ≤ q(n). Furthermore, g(n) is the nth coordinate of f(m + 1), and f(m + 1) ∈ W_h(m+1), so g(n) ∈ W_ĥ(n). We have not yet defined W_ĥ(n) for n < r_0, but since there are only finitely many such n, we can define W_ĥ(n) = {g(n)} for such n. Then for all n we have |W_ĥ(n)| ≤ q(n) and g(n) ∈ W_ĥ(n), as required. Ishmukhametov [184] gave the following characterization of the array computable c.e. degrees. Theorem 2.23.13 (Ishmukhametov [184]). A c.e. degree is array computable iff it is c.e. traceable. Proof. Let A be a c.e. traceable c.e. set. Let f ≤_T A. Let p and h be as in Definition 2.23.11. Let g(n) = max W_h(n). Then g ≤_wtt ∅′, since we can use p(n) to bound the ∅′-oracle queries needed to compute g(n), and g dominates f. Since f is arbitrary, A is array computable. Now let A be an array computable c.e. set. By Proposition 2.23.6, there is a function g computable from ∅′ with use bounded by the identity that dominates every A-computable function. Let f = Φ_e^A.
Let F(n) be the least s such that Φ_e^A(n)[t] = Φ_e^A(n)[s] for all t > s. Then F ≤_T A, so by modifying g at finitely many places, we obtain a function G that is computable in ∅′ with use bounded by the identity function and such that G(n) > F(n) for all n. Since ∅′ ↾ n can change at most n many times during the enumeration of ∅′, there is a computable approximation G_0, G_1, . . . to G that changes value at n at most n times. Let h be a computable function such that W_h(n) = {Φ_e^A(n)[G_s(n)] : s ∈ N ∧ Φ_e^A(n)[G_s(n)] ↓}. Then |W_h(n)| ≤ n + 1 and f(n) ∈ W_h(n). Since f is arbitrary, A is c.e. traceable. Using this characterization, Ishmukhametov proved the following remarkable theorem, which shows that if the c.e. degrees are definable in the global structure of the degrees, then so are the a.n.c. c.e. degrees. A
2. Computability Theory
degree m is a strong minimal cover of a degree a < m if for all degrees d < m, we have d ≤ a. Theorem 2.23.14 (Ishmukhametov [184]). A c.e. degree is array computable iff it has a strong minimal cover. Definition 2.23.11 can be strengthened as follows. Recall that D_0, D_1, . . . is a canonical listing of finite sets, as defined in Chapter 1. Definition 2.23.15 (Terwijn and Zambella [389]). A set A, and the degree of A, are computably traceable if there is a computable function p such that, for each function f ≤_T A, there is a computable function h satisfying, for all n, (i) |D_h(n)| ≤ p(n) and (ii) f(n) ∈ D_h(n). As with c.e. traceability, the above definition does not change if we replace "there is a computable function p" by "for every unbounded nondecreasing computable function p such that p(0) > 0". If A is computably traceable then each function g ≤_T A is dominated by the function f(n) = max D_h(n), where h is a trace for g. Thus, every computably traceable degree is hyperimmune-free. One may think of computable traceability as a uniform version of hyperimmune-freeness. Terwijn and Zambella [389] showed that a simple variation of the standard construction of hyperimmune-free sets by Miller and Martin [282] (Theorem 2.17.3) produces continuum many computably traceable sets. Indeed, the proof we gave of Theorem 2.17.3 produces a computably traceable set.
2.24 Genericity and weak genericity

The notion of genericity arose in the context of set-theoretic forcing. It was subsequently "miniaturized" to obtain a notion of n-genericity appropriate to computability theory. Although the original definition of n-genericity was in terms of forcing, the following more modern definition is the one that is now generally used.22 Let S ⊆ 2^<ω. A set A meets S if there is an

22 The original definition of n-genericity, dating back to Feferman [143], is as follows. Let L be the usual first-order language of number theory together with a set constant X and the membership relation ∈. Let σ ∈ 2^<ω. For sentences ψ in L, we define the notion "σ forces ψ", written σ ⊩ ψ, by recursion on the structure of ψ as follows.
(i) If ψ is atomic and does not contain X, then σ ⊩ ψ iff ψ is true in arithmetic. (ii) σ ⊩ n ∈ X iff σ(n) = 1. (iii) σ ⊩ ψ_0 ∨ ψ_1 iff σ ⊩ ψ_0 or σ ⊩ ψ_1. (iv) σ ⊩ ¬ψ iff ∀τ ⊇ σ (τ ⊮ ψ).
n such that A ↾ n ∈ S. A set A avoids S if there is an n such that for all τ ⊇ A ↾ n, we have τ ∉ S. Definition 2.24.1 (Jockusch and Posner, see Jockusch [188]). A set is n-generic if it meets or avoids each Σ^0_n set of strings. A set is arithmetically generic if it is n-generic for all n. It is easy to construct n-generic Δ^0_{n+1} sets, and arithmetically generic sets computable from ∅^(ω). Indeed, it does not require much computational power to build a 1-generic set. Proposition 2.24.2. Every noncomputable c.e. set computes a 1-generic set. Proof. The proof is a standard permitting argument. Let S_0, S_1, . . . be an effective listing of the c.e. sets of strings. We build G ≤_T A to satisfy the requirements R_e : G meets or avoids S_e. Associated with each R_e is a restraint r_e, initially set to 0. We say that R_e requires attention through n at stage s if n > r_e and G_s ↾ n has an extension in S_e[s]. For each e, we also have a set M_e of numbers through which R_e has required attention. Initially, we have G_0 = ∅. At stage s, let e ≤ s be least such that R_e is not currently satisfied and either 1. there is an n > r_e such that G_s ↾ n has an extension σ in S_e[s] and A_{s+1} ↾ n ≠ A_s ↾ n, or 2. R_e requires attention through some n not currently in M_e. (If there is no such e then let G_{s+1} = G_s and proceed to the next stage.) In the first case, let G_{s+1} extend σ and declare R_e to be satisfied. In the second, put n into M_e and let G_{s+1} = G_s. In either case, for all i > e, declare R_i to be unsatisfied, empty M_i, and let r_i be a fresh large number. We say we have acted for e. Let G = lim_s G_s. Once numbers less than n have stopped being enumerated into A, we cannot change G below n, so G is well defined and A-computable. Assume by induction that there is a stage s after which we never act for any i < e. If we ever act for e after stage s and are in case 1

(v) σ ⊩ ∃z ψ(z) iff there is an n ∈ N such that σ ⊩ ψ(n). For a set A, we have A ⊩ ψ iff there is some σ ≺ A such that σ ⊩ ψ.
A set A is n-generic if for each Σ^0_n sentence ψ of L, either A ⊩ ψ or A ⊩ ¬ψ. As Jockusch and Posner showed (see Jockusch [188]), it is not hard to see that this definition is equivalent to the one in Definition 2.24.1, using the result due to Matiyasevich [261] that each Σ^0_n subset of N is defined by a Σ^0_n formula in L.
of the construction, we permanently satisfy R_e by ensuring that G meets S_e. In that case, R_e never again acts, so the induction can proceed. So suppose we never act for e in case 1 of the construction after stage s. Assume for a contradiction that we act infinitely often for e. At each stage t > s at which we act for e, we find a new n such that G_t ↾ n has an extension in S_e[t]. Since we then increase all restraints for weaker priority requirements, G ↾ n = G_t ↾ n. Since we never reach case 1 of the construction for e, we must have A ↾ n = A_t ↾ n. From this fact it is easy to see that A is computable, which is a contradiction. Thus we act only finitely often for e. But if G ↾ n has an extension in S_e, then R_e requires attention through n at all large enough stages t. Thus no sufficiently long initial segment of G has an extension in S_e, so G avoids S_e. A degree is n-generic if it contains an n-generic set, and arithmetically generic if it contains an arithmetically generic set. An interesting fact about arithmetically generic degrees, shown by Jockusch [188], is that if a and b are arithmetically generic, then the structures of the degrees below a and of the degrees below b are elementarily equivalent. There is a wealth of results on n-generic sets and degrees, many of which are discussed in the survey Jockusch [188]. Kumabe [224] also contains several results on n-genericity. The following is an important property of n-generic sets. Theorem 2.24.3 (Jockusch [188]). If A is n-generic then A^(n) ≡_T A ⊕ ∅^(n). In particular, every 1-generic set is GL_1. Proof. Assume by induction that the theorem is true for n − 1. (The base case n = 0 is trivial, as it says that A ≡_T A ⊕ ∅ for all A.) Let A be n-generic. By the inductive hypothesis, it is enough to show that (A ⊕ ∅^(n−1))′ ≡_T A ⊕ ∅^(n). One direction of this equivalence is immediate, so we are left with showing that (A ⊕ ∅^(n−1))′ ≤_T A ⊕ ∅^(n). Let R_e = {σ : Φ_e^{σ ⊕ (∅^(n−1) ↾ |σ|)}(e) ↓} and S_e = {σ : ∀τ ⊇ σ (τ ∉ R_e)}. The R_e are c.e. in ∅^(n−1) and hence are Σ^0_n. So for each e, we have e ∈ (A ⊕ ∅^(n−1))′ iff A meets R_e iff A does not meet S_e, the last equivalence following by n-genericity. But it is easy to check that the R_e and S_e are uniformly ∅^(n)-computable, so using A ⊕ ∅^(n), we can compute (A ⊕ ∅^(n−1))′. The above result is an example of the phenomenon that, in many cases, a construction used to show that a set with a certain property (such as being GL_1) exists actually shows that the given property holds of all sufficiently generic sets. For example, the requirements of a finite extension argument are usually ones that ask that the set being constructed meet or avoid some open set of conditions. A related phenomenon is the fact that genericity requirements can often be added to finite extension constructions. For instance, it is straightfor-
ward to modify the construction in the proof of Theorem 2.16.2 to obtain the following result. Theorem 2.24.4. If C >_T ∅ and D ≥_T ∅′ ⊕ C then there is a 1-generic set A such that A′ ≡_T A ⊕ C ≡_T D. The following result exemplifies the fact that generic sets are quite computationally weak. Theorem 2.24.5 (Demuth and Kučera [96]). If a set is 1-generic then it does not compute a DNC function. Proof. Let G be 1-generic and let Φ^G = f. Let S = {σ ∈ ω^<ω : ∀n < |σ| (σ(n) ≠ Φ_n(n))}. Note that S is co-c.e. Let X = {τ : ∃σ (Φ^τ ↾ |σ| ↓ = σ ∧ σ ∉ S)}. Then X is c.e., so G either meets or avoids X. If G meets X then there is a τ ≺ G and a σ ∈ ω^<ω such that Φ^τ ↾ |σ| ↓ = σ and σ ∉ S. Since Φ^G = f, we have σ ≺ f, and hence f is not DNC. Thus it is enough to assume that G avoids X and derive a contradiction. Let ν ≺ G be such that if τ ∈ X then ν ⊀ τ. Let Y = {σ : ∃ρ (ν ≺ ρ ∧ Φ^ρ ↾ |σ| ↓ = σ)}. Then Y is a c.e. subset of S, and it is infinite because every initial segment of f is in Y. So we can computably find σ_0, σ_1, . . . ∈ S such that |σ_i| > i. Let g(i) = σ_i(i). Then g is computable and, by the definition of S, it is DNC, which is a contradiction. By Theorem 2.22.2, we have the following result. Corollary 2.24.6. No PA degree is computable from a 1-generic degree. Since the hyperimmune-free basis theorem implies that there is a DNC function of hyperimmune-free degree, we have the following. Corollary 2.24.7 (Downey and Yu [137]). There is a hyperimmune-free degree not computable from any 1-generic degree. Corollary 2.24.7 should be contrasted with the following result, whose proof uses a special notion of perfect set forcing. Theorem 2.24.8 (Downey and Yu [137]). There is a hyperimmune-free degree computable from a 1-generic degree. Recall that a set A is CEA(X) if A is c.e. in X and X ≤_T A, and that A is CEA if it is CEA(X) for some X <_T A.
The following result will be of interest in Section 8.21.3, where we show that being CEA is in fact a property of almost all sets, in the measure-theoretic sense. Theorem 2.24.9 (Jockusch [188]). Every 1-generic set is CEA. Proof.23 Let A be 1-generic. Let X = {⟨2i, j⟩ : i ∈ A ∧ ⟨2i, j⟩ ∉ A}. Clearly, X ≤_T A. For each i, the set {σ : ∃j (σ(⟨2i, j⟩) = 0)} is computable and dense, so for each i there is a j such that ⟨2i, j⟩ ∉ A. Thus i ∈ A iff ∃j (⟨2i, j⟩ ∈ X), whence A is c.e. in X. So A is CEA(X), and thus we are left with showing that A ≰_T X. For a string σ, let σ̂ be the unique string of length |σ| such that σ̂(n) = 1 iff n = ⟨2i, j⟩ for i and j such that σ(i) = 1 and σ(⟨2i, j⟩) = 0. We claim that if σ ⊇ ν and n ≥ |ν| is odd, then there is a string τ ⊇ ν such that τ(n) = 1 and σ̂ ⪯ τ̂. If σ(n) = 1 then we can take τ = σ, and if n ≥ |σ| then we can take any τ ⊇ σ such that τ(n) = 1. So suppose that σ(n) = 0. Let S be the smallest set under inclusion such that n ∈ S, and if i ∈ S, then ⟨2i, j⟩ ∈ S whenever σ(⟨2i, j⟩) = 0 and ⟨2i, j⟩ < |σ|. Let τ be the string of length |σ| such that τ(k) = 1 iff either σ(k) = 1 or k ∈ S. Then ν ⪯ τ. We need to show that σ̂ ⪯ τ̂. Let k < |σ|. If k is odd then σ̂(k) = τ̂(k) = 0, so assume that k = ⟨2i, j⟩ for some i and j. If σ̂(k) = 0, then either σ(i) = 0 or σ(⟨2i, j⟩) = 1. In the latter case, τ(⟨2i, j⟩) = 1, so τ̂(⟨2i, j⟩) = 0. Now suppose that σ(i) = 0. If i ∈ S, then ⟨2i, j⟩ ∈ S, so τ(⟨2i, j⟩) = 1, and hence τ̂(⟨2i, j⟩) = 0. If i ∉ S, then τ(i) = 0 and hence τ̂(⟨2i, j⟩) = 0. If σ̂(k) = 1, then σ(i) = 1 and σ(⟨2i, j⟩) = 0. Since σ(i) = 1, we have τ(i) = 1. Furthermore, it follows easily by induction that S contains no numbers of the form ⟨2d, q⟩ with σ(d) = 1. Since σ(⟨2i, j⟩) = 0 and ⟨2i, j⟩ ∉ S, we have τ(⟨2i, j⟩) = 0, and hence τ̂(k) = 1. We have established our claim. Now assume for a contradiction that Φ^X = A for some Turing functional Φ. Let V = {σ : σ | Φ^σ̂}. Then V is c.e. and A extends no string in V, so, since A is 1-generic, there is a ν ≺ A such that no extension of ν is in V.
Let n ≥ |ν| and σ ⊇ ν be such that n is odd and Φ^σ̂(n) = 0. (Such a σ must exist because {ρ : ∃n ≥ |ν| (n odd ∧ ρ(n) = 0)} is computable and dense, so there is an odd n ≥ |ν| such that Φ^X(n) = A(n) = 0.) Then there is a τ ⊇ ν such that τ(n) = 1 and σ̂ ⪯ τ̂. We have Φ^τ̂(n) = 0 and τ(n) = 1, so τ ∈ V, contradicting the choice of ν. We now consider a weaker notion of genericity. A set of strings D is dense if every string has an extension in D. Recall that a set A meets D if there is a σ ∈ D such that σ ≺ A. Similarly, a string τ meets D if there is a

23 Jockusch
[188] attributes the idea behind this proof to Martin.
σ ∈ D such that σ ⪯ τ. Given the set-theoretic definition of genericity in terms of meeting dense sets, the following definition is natural. Definition 2.24.10 (Kurtz [228, 229]). A set is weakly n-generic if it meets all dense Σ^0_n sets of strings. A degree is weakly n-generic if it contains a weakly n-generic set. As noted by Jockusch (see [228, 229]), it is not hard to show that A is weakly (n + 1)-generic iff A meets all dense Δ^0_{n+1} sets of strings iff A meets all dense Π^0_n sets of strings. Clearly, if a set is n-generic then it is weakly n-generic. The following result is slightly less obvious. Theorem 2.24.11 (Kurtz [228, 229]). Every weakly (n + 1)-generic set is n-generic. Proof. Let A be weakly (n + 1)-generic. Let S be a Σ^0_n set of strings. Let R = {σ : σ ∈ S ∨ ∀τ ⊇ σ (τ ∉ S)}. Then R is a dense Σ^0_{n+1} set of strings, and hence A meets R, which clearly implies that A meets or avoids S. Thus A is n-generic. Thus we have the following hierarchy: weakly 1-generic ⇐ 1-generic ⇐ weakly 2-generic ⇐ 2-generic ⇐ weakly 3-generic ⇐ · · · . We now show that none of these implications can be reversed, even for degrees, by exploring the connection between weak genericity and hyperimmunity. Recall that a degree a is hyperimmune relative to a degree b if there is an a-computable function that is not majorized by any b-computable function. Theorem 2.24.12 (Kurtz [228, 229]). If A is weakly (n + 1)-generic, then the degree of A is hyperimmune relative to 0^(n). Proof. Let p_A be the principal function of A. (That is, p_A(n) is the nth element of A in the natural order.) We show that no 0^(n)-computable function majorizes p_A. Let f be a 0^(n)-computable function. For a string σ, let p_σ be the principal function of {n < |σ| : σ(n) = 1} (which is a partial function, of course). Let S_f = {σ : ∃n (p_σ(n) ↓ > f(n))}. Then S_f is a dense Δ^0_{n+1} set, so A meets S_f, and hence f does not majorize p_A. Corollary 2.24.13 (Kurtz [228, 229]).
There is an n-generic degree that is not weakly (n + 1)-generic. Proof. There is an n-generic degree that is below 0^(n), and hence not hyperimmune relative to 0^(n). Theorem 2.24.14 (Kurtz [228, 229]). A degree b is the nth jump of a weakly (n + 1)-generic degree iff b > 0^(n) and b is hyperimmune relative to 0^(n). Proof. Suppose that b = a^(n) with a weakly (n + 1)-generic. By Theorem 2.24.12, a is hyperimmune relative to 0^(n), and the class of degrees
that are hyperimmune relative to 0^(n) is clearly closed upwards. Thus b is hyperimmune relative to 0^(n). Clearly, b ≥ 0^(n), so in fact b > 0^(n). The proof of the converse is a variant of the proof of the Friedberg Completeness Criterion (Theorem 2.16.1). Let b > 0^(n) be hyperimmune with respect to 0^(n). Let h be a b-computable function that is not majorized by any 0^(n)-computable function. We may assume that h is increasing and has degree exactly b, and let B = rng h, so that h = p_B and deg(B) = b. Let f_0, f_1, . . . be uniformly 0^(n)-computable functions such that {rng f_i : i ∈ N} is the collection of all nonempty Σ^0_{n+1} sets of strings. Let S_i = rng f_i and let S_i^s = {σ : ∃k ≤ p_B(s) (f_i(k) = σ)}. We build a set A by finite extensions σ_0 ⪯ σ_1 ⪯ · · · so that A^(n) ≡_T B and A is weakly (n + 1)-generic. We meet the requirements R_e : If S_e is dense then A meets S_e. We say that R_e requires attention at stage s if σ_s does not meet S_e^s, but there is a string τ ⊇ σ_s 01^{e+1} 0 that meets S_e^s. Our construction is done relative to B, which can compute ∅^(n), and hence we can test whether a given requirement requires attention at a given stage. The peculiar form that τ must have comes from the coding of B into A^(n). The number e encodes which requirement is being addressed, which will allow us to "unwind" the construction to recover the coding. We will say that a stage s is a coding stage if we act to code information about B into A at stage s. Let c(s) denote the number of coding stages before s. At stage 0, declare that 0 is a coding stage and let σ_0 = B(0). At stage s + 1, if no requirement R_e with e ≤ s + 1 requires attention then let σ_{s+1} = σ_s and declare that s + 1 is not a coding stage. Otherwise, let R_e be the strongest priority requirement requiring attention. Let m_0 be the least m such that f_e(m) ⊇ σ_s 01^{e+1} 0. If |f_e(m_0)| > s + 1, then let σ_{s+1} = σ_s and declare that s + 1 is not a coding stage.
Otherwise, let σ_{s+1} = f_e(m_0) B(c(s + 1)) and declare that s + 1 is a coding stage. Let A = ⋃_s σ_s. Lemma 2.24.15. R_e requires attention only finitely often. Proof. Assume by induction that there is a stage s after which no requirement stronger than R_e requires attention. Suppose R_e requires attention at a stage t + 1 > s, and let m_0 be the least m such that f_e(m) ⊇ σ_t 01^{e+1} 0. If |f_e(m_0)| ≤ t + 1 then R_e is met at stage t + 1 and hence never again requires attention. Otherwise, R_e continues to be the strongest priority requirement requiring attention until stage |f_e(m_0)| − 1, at which point it is met, and never again requires attention. Lemma 2.24.16. A is weakly (n + 1)-generic. Proof. Assume that S_e is dense. Let g(s) be the least k such that for each σ ∈ 2^{s+1} there is a j ≤ k such that f_e(j) extends σ 01^{e+1} 0. Then g ≤_T ∅^(n). Let s be a stage after which no requirement stronger than R_e ever requires
attention. Since p_B is not majorized by any ∅^(n)-computable function, there is a t > s such that p_B(t) > g(t). At stage t, the requirement R_e requires attention unless it is already met, and continues to be the strongest priority requirement requiring attention from that stage on until it is met. Lemma 2.24.17. A^(n) ≤_T B. Proof. Since ∅^(n) ≤_T B, the construction of A is computable from B. By the previous lemma, A is weakly (n + 1)-generic and hence n-generic, so by Theorem 2.24.3, A^(n) ≡_T A ⊕ ∅^(n) ≤_T A ⊕ B ≡_T B. Lemma 2.24.18. B ≤_T A^(n). Proof. Note that A is infinite, as it is weakly (n + 1)-generic. Elements are added to A only during coding stages, and there are thus infinitely many such stages s_0 < s_1 < · · · . During stage s_k we define σ_{s_k} so that σ_{s_k}(|σ_{s_k}| − 1) = B(k). Thus it is enough to show that the function g defined by g(k) = |σ_{s_k}| is computable from A^(n). We have g(0) = 1. Assume that we have computed g(k) using A^(n). We know that σ_{s_{k+1}} ⊇ σ_{s_k} 01^{e+1} 0 for some e, and we can determine this e from A, since (A ↾ g(k)) 01^{e+1} 0 ≺ A. Since ∅^(n) ≤_T A^(n), we can compute from A^(n) the least m such that f_e(m) ⊇ σ_{s_k} 01^{e+1} 0, which must exist. By construction, g(k + 1) = |f_e(m)| + 1. Together, the three previous lemmas imply the theorem. Taking n = 0 in the theorem above, we have the following special case, which will be useful below. Corollary 2.24.19 (Kurtz [228, 229]). A degree is weakly 1-generic iff it is hyperimmune. Corollary 2.24.20 (Kurtz [228, 229]). There is a weakly (n + 1)-generic degree that is not (n + 1)-generic. Proof. By the theorem, there is a weakly (n + 1)-generic degree a such that a^(n) = 0^(n+1), so that a^(n+1) = 0^(n+2). Since a < 0^(n+1), we have a^(n+1) > a ∨ 0^(n+1), so by Theorem 2.24.3, a is not (n + 1)-generic. Downey, Jockusch, and Stob [121] introduced a new notion of genericity related to array computability. Let f ≤_pb C mean that f can be computed from oracle C by a reduction procedure with a primitive recursive bound on the use function.
It is easy to show that f ≤_pb ∅′ iff there is a computable function f(n, s) and a primitive recursive function p such that lim_s f(n, s) = f(n) and |{s : f(n, s) ≠ f(n, s + 1)}| ≤ p(n). Most proofs yielding results about array noncomputable degrees that use functions f ≤_wtt ∅′ actually use functions f ≤_pb ∅′.
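The characterization just stated can be illustrated with a toy approximation in Python. Everything here is an invented example; the only point is the primitive recursive bound on the number of mind changes.

```python
# A limit approximation approx(n, s) -> f(n) whose number of mind changes
# at each n is bounded by a (primitive recursive) function p(n), as in the
# characterization of f <=_pb 0'.

def mind_changes(approx, n, stages):
    """Count the mind changes of the approximation at n over the given stages."""
    return sum(1 for s in range(stages - 1)
               if approx(n, s) != approx(n, s + 1))

def approx(n, s):
    """Toy approximation: climbs toward its limit min(n, 3), changing its
    mind exactly min(n, 3) times."""
    return min(s, min(n, 3))

p = lambda n: n + 1   # an explicit primitive recursive change bound

assert all(mind_changes(approx, n, 50) <= p(n) for n in range(10))
assert [approx(n, 49) for n in range(6)] == [0, 1, 2, 3, 3, 3]  # the limits
```

A wtt-reduction to ∅′ gives only a computable bound on the mind changes; the pb notion demands that the bound be primitive recursive.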
Definition 2.24.21 (Downey, Jockusch, and Stob [121]). (i) A set of strings S is pb-dense if there is a function f ≤_pb ∅′ such that f(σ) ⊇ σ and f(σ) ∈ S for all σ. (ii) A set is pb-generic if it meets every pb-dense set of strings. A degree is pb-generic if it contains a pb-generic set. For every e, the set {σ : σ ∈ W_e ∨ ∀τ ⊇ σ (τ ∉ W_e)} is pb-dense, so every pb-generic set is 1-generic. If f ≤_pb ∅′, then the range of f is Σ^0_2, so every 2-generic set is pb-generic. The main result about pb-genericity of relevance to us is the following. Theorem 2.24.22 (Downey, Jockusch, and Stob [121]). (i) If a is a.n.c., then there is a pb-generic set A ≤_T a. (ii) If A is pb-generic, then deg(A) is a.n.c. To prove this theorem, we need a lemma. A sequence {S_n}_{n∈ω} of sets of strings is uniformly wtt-dense if there is a function d ≤_wtt ∅′ such that d(n, σ) ⊇ σ and d(n, σ) ∈ S_n for all n and σ. Lemma 2.24.23 (Downey, Jockusch, and Stob [121]). If a is a.n.c. and {S_n}_{n∈ω} is uniformly wtt-dense, then there is an A ≤_T a that meets each S_n. Proof. Let d(n, σ) witness the uniform wtt-density of the S_n. Let d̂ and b be computable functions such that lim_s d̂(n, σ, s) = d(n, σ) and |{s : d̂(n, σ, s) ≠ d̂(n, σ, s + 1)}| ≤ b(n, σ). We may assume that d̂(n, σ, s) ⊇ σ for all σ, n, and s. Let h(n, σ) = μs ∀t ≥ s (d̂(n, σ, t) = d̂(n, σ, s)). Let f(n) = max{h(e, σ) : e, |σ| ≤ n}. We have f ≤_wtt ∅′, since d ≤_wtt ∅′. Fix a function g computable in a such that g(n) ≥ f(n) for infinitely many n. We obtain A as ⋃_s σ_s, where |σ_s| = s. The basic strategy for n at a stage s > n is to let σ_t = d̂(n, σ_s, g(s)) ↾ t for all t such that s < t ≤ |d̂(n, σ_s, g(s))|, at which point we declare S_n to be satisfied. If at any stage u > s we find that for some t ∈ (g(s), g(u)], we have d̂(n, σ_s, t) ≠ d̂(n, σ_s, g(s)), then we declare S_n to be unsatisfied and repeat the strategy. This strategy can be interrupted at any point by the strategy for an S_m with m < n. If this situation happens, we wait until S_m is satisfied and repeat the strategy. Clearly, A ≤_T a, and if S_n is eventually permanently satisfied, then A meets S_n. To see that every S_n is indeed eventually permanently satisfied, assume by induction that there is a least stage s_0 > n by which all S_m with m < n are permanently satisfied. Let s > s_0 be such that g(s) ≥ f(s). If an iteration of the strategy for S_n begins at stage s, then it will clearly lead to S_n becoming permanently satisfied. Otherwise, there must be a stage v ∈ [s_0, s) such that a strategy for S_n is started at stage v and for all t ∈
(g(v), g(s)], we have d̂(n, σ_v, t) = d̂(n, σ_v, g(v)). But then g(v) ≥ h(n, σ_v), so S_n is again permanently satisfied. Proof of Theorem 2.24.22. (i) Let p_0, p_1, . . . be a uniformly computable listing of the primitive recursive functions. If Φ_e^{∅′}(m) ↓ with use at most p_i(m) for all m ≤ n, then let g(e, i, n) = Φ_e^{∅′}(n). Otherwise, let g(e, i, n) = 0. Clearly, g ≤_wtt ∅′ and each function f ≤_pb ∅′ has the form n → g(e, i, n) for some e and i. Let σ_0, σ_1, . . . be an effective listing of 2^<ω. Let h(e, i, n) = g(e, i, n) if σ_{g(e,i,n)} ⊇ σ_n, and otherwise let h(e, i, n) = n. Let S_{e,i} = {σ_{h(e,i,n)} : n ∈ N}. Then the sets S_{e,i} are uniformly wtt-dense, and every pb-dense set is equal to S_{e,i} for some e and i. By Lemma 2.24.23, there is a set A computable in a that meets each S_{e,i}, which makes A pb-generic. (ii) Let A = {a_0 < a_1 < · · · } be pb-generic. Recall that the principal function p_A of A is defined by p_A(n) = a_n, and that m_{∅′}(n) is the least s such that ∅′ ↾ n = ∅′_s ↾ n. We show that p_A(n) ≥ m_{∅′}(n) for infinitely many n, which by Theorem 2.23.3 suffices to conclude that deg(A) is a.n.c. For a string σ, let i_0 < i_1 < · · · < i_{j_σ − 1} be all the numbers such that σ(i_n) = 1. Let p_σ(n) = i_n if n < j_σ and let p_σ(n) ↑ otherwise. Fix k. We show that there is an n ≥ k such that p_A(n) ≥ m_{∅′}(n). Let S = {σ : ∃n ≥ k (p_σ(n) ≥ m_{∅′}(n))}. Then S is pb-dense, as witnessed by the function f defined by f(σ) = σ 1^k 0^{m_{∅′}(j_σ + k)} 1. Thus A meets S, which means that there is a σ such that f(σ) ≺ A. Then p_A(j_σ + k) ≥ m_{∅′}(j_σ + k). Since each nonzero c.e. degree bounds a 1-generic degree, but no 2-generic degree is below 0′, pb-genericity is an intermediate notion between 1- and 2-genericity. Corollary 2.24.24 (Downey, Jockusch, and Stob [121]). There is a 1-generic degree that is not pb-generic, and a pb-generic degree that is not 2-generic.
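The finite-extension pattern underlying the genericity constructions of this section can be sketched in Python. This is a toy: the dense sets below are decidable and explicitly dense, whereas the constructions above must handle Σ^0_n or pb-dense sets relative to oracles, with priority and injury machinery on top of the same skeleton.

```python
# Skeleton of a finite-extension construction: repeatedly extend an initial
# segment so that it meets each of finitely many dense sets of strings.
# The dense sets here are invented examples ("ends in i+1 ones").

from itertools import product

def find_extension(sigma, dense):
    """Search extensions of sigma, length by length, for one in `dense`.
    Terminates because `dense` is dense: every string has an extension in it."""
    k = 0
    while True:
        for bits in product("01", repeat=k):
            tau = sigma + "".join(bits)
            if dense(tau):
                return tau
        k += 1

dense_sets = [lambda t, i=i: t.endswith("1" * (i + 1)) for i in range(4)]

g = ""
for d in dense_sets:
    g = find_extension(g, d)   # after this step, some prefix of g meets d

# Every dense set in the list is met by some initial segment of g.
assert all(any(d(g[:j]) for j in range(len(g) + 1)) for d in dense_sets)
```

With these particular dense sets the construction produces g = "1111"; a genuinely 1-generic set is built the same way against an effective listing of all c.e. sets of strings, meeting each when possible and otherwise avoiding it.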
3 Kolmogorov Complexity of Finite Strings
This chapter is a brief introduction to Kolmogorov complexity and the theory of algorithmic randomness for finite strings. We will concentrate on a few fundamental aspects, in particular ones that will be useful in dealing with our main topic, the theory of algorithmic randomness for infinite sequences. A much fuller treatment of Kolmogorov complexity can be found in Li and Vitányi [248]. We will not discuss the philosophical roots of Kolmogorov complexity. See Li and Vitányi [248] and van Lambalgen [397, 398] for a thorough discussion of the foundations of the subject. We will mainly deal with two kinds of Kolmogorov complexity: plain and prefix-free (both defined below). There are two notational traditions in algorithmic information theory. One uses C for plain Kolmogorov complexity and K for prefix-free Kolmogorov complexity (sometimes referred to as prefix Kolmogorov complexity). The other uses K for plain Kolmogorov complexity and H for prefix-free Kolmogorov complexity. In line with [248], we adopt the former convention.
3.1 Plain Kolmogorov complexity

The main idea behind the theory of algorithmic randomness for finite strings is that a string σ is random if and only if it is "incompressible", that is, the only way to generate σ by an algorithm is to essentially hardwire σ into the algorithm, so that the minimal length of a program to generate σ is essentially the same as that of σ itself. For instance, 0^1000000 can be

R.G. Downey and D. Hirschfeldt, Algorithmic Randomness and Complexity, Theory and Applications of Computability, DOI 10.1007/978-0-387-68441-3_3, © Springer Science+Business Media, LLC 2010
described by saying that we should repeat 0 a total of 1000000 times, an algorithm that can easily be described in less than 1000000 bits. On the other hand, if we were to toss a coin 1000000 times and record the outcome as a binary string, we would not expect this string to have a very short description. We formalize this notion, in a sense first due to Solomonoff [369], but independently to Kolmogorov [211], as follows. Let f : 2^<ω → 2^<ω be a partial computable function. The Kolmogorov complexity of a string σ with respect to f is C_f(σ) = min{|τ| : f(τ) = σ}, where this minimum is taken to be ∞ if the set is empty. We say that σ is random relative to f if C_f(σ) ≥ |σ|. Here we think of f as a "description system". Relative to this system, the string described (or generated) by a given string τ is f(τ) (if this value is defined). Thus σ is random relative to f if and only if it has no descriptions shorter than itself, relative to the description system f. We would like to get rid of the dependence on f by choosing a universal description system. Such a system should be able to simulate any other description system with at most a small increase in the length of descriptions. Thus we fix a partial computable function U : 2^<ω → 2^<ω that is universal in the sense that, for each partial computable function f : 2^<ω → 2^<ω, there is a string ρ_f such that ∀σ [U(ρ_f σ) = f(σ)].1 We call ρ_f the coding string and |ρ_f| the coding constant of f in U. As usual, we identify partial computable functions and Turing machines, and hence often refer to U as a universal (Turing) machine. (These definitions of universal partial computable function and universal Turing machine are not exactly the same as the ones given in Proposition 2.1.2, but are essentially equivalent to those, so we do not bother to make a distinction.)
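The definition of C_f can be computed outright for a toy description system. The system f below is an invented, total example; the universal machine U is of course partial, and for a partial f the search below need not terminate.

```python
# Brute-force C_f directly from the definition
# C_f(sigma) = min{|tau| : f(tau) = sigma}, for a toy description system.

from itertools import product

def f(tau):
    """Toy description system: '1'+rho describes rho itself (verbatim mode),
    '0'+rho describes rho repeated twice."""
    if not tau:
        return None
    return tau[1:] if tau[0] == "1" else tau[1:] * 2

def C_f(sigma, max_len=12):
    """Smallest |tau| with f(tau) = sigma, searching descriptions by length."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            if f("".join(bits)) == sigma:
                return n
    return None   # stands in for "infinity" within this search bound

assert C_f("0101") == 3   # "001" describes "0101"; verbatim would need 5 bits
assert C_f("010") == 4    # no doubling structure, so only "1010" works
```

Relative to this f, the string "0101" is compressible (C_f < length), while "010" is not; the universal U plays the same role with the minimum taken over all description systems at once, up to the coding constants.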
It is easy to check that for any partial computable function f : 2^<ω → 2^<ω, we have C_U(x) ≤ C_f(x) + |ρ_f|, so U is a universal description system in the sense discussed above. Thus, it makes sense to define the (plain) Kolmogorov complexity (or C-complexity) of a binary string σ to be C(σ) = C_U(σ).

1 Such a function, and the corresponding Turing machine, are sometimes called universal by adjunction, to distinguish them from a more general notion of universality, where we say that U is universal if for each partial computable f : 2^<ω → 2^<ω there is a c such that for each σ ∈ dom(f) there is a τ with |τ| ≤ |σ| + c and U(τ) = f(σ). The distinction between these two notions is irrelevant in most cases, but not always, and it should be kept in mind that it is the stronger one that we adopt.
Of course, a different choice of U would yield a different definition of C, but the two definitions would agree up to an additive constant. The following results are not hard to prove. Proposition 3.1.1 (Kolmogorov [211]). (i) C(σ) ≤ |σ| + O(1). (ii) C(σσ) ≤ C(σ) + O(1). (iii) If h : 2^<ω → 2^<ω is computable then C(h(σ)) ≤ C(σ) + O(1). To prove (i), let f : 2^<ω → 2^<ω be the identity function. Then C(σ) ≤ C_f(σ) + O(1) = |σ| + O(1). The other two parts of the lemma are equally straightforward to prove. We can think of n ∈ N as a binary string σ (namely, its representation in binary), and define C(n) = C(σ). By part (iii) of Proposition 3.1.1, it would not really matter if we chose to define C(n) to be C(0^n), say. Note that C(n) ≤ log n + O(1), since the length of the binary representation of n is log n. We will denote C(⟨σ, τ⟩) by C(σ, τ). By part (iii) of Proposition 3.1.1, the particular choice of pairing function does not matter, up to an additive constant. Suppose that C(σ) = k. Then there is a first string τ of length k such that the computation of U(τ) converges with value σ. More precisely, there is a least stage s such that U(τ)[s] ↓ = σ for one or more strings τ of length k. We fix the lexicographically least such τ and denote it by σ*_C. The following fact is easy to check, since from σ*_C we can compute both σ and C(σ) = |σ*_C|, and from C(σ) and σ we can compute σ*_C. Proposition 3.1.2. C(⟨σ, C(σ)⟩) = C(σ*_C) ± O(1).
We wish to say that a binary string σ is C-random if it is random relative to U, that is, C(σ) ≥ |σ|. However, it is more convenient (and reasonable, given the arbitrary choice of U) to relax this definition a little by fixing a constant d and defining σ to be C-random if C(σ) ≥ |σ| − d. The following result says that C-random strings exist.

Proposition 3.1.3 (Solomonoff [369], Kolmogorov [211]). For each n there exists a σ with |σ| = n and C(σ) ≥ n.

Proof. There are only 2^n − 1 binary strings of length less than n, so there are at most 2^n − 1 binary strings with Kolmogorov complexity less than n.

Notice that the same kind of counting argument shows that, for any k and n,

|{σ ∈ 2^n : C(σ) ≥ |σ| − k}| > 2^n(1 − 2^{−k}).
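The counting in the proof of Proposition 3.1.3 can be checked mechanically for small parameters (a sketch; the values of n and k are arbitrary samples):

```python
# There are only 2^n - 1 binary strings of length < n, so at most 2^n - 1
# strings can have a description of length < n; likewise, fewer than
# 2^(n-k) strings can have a description of length < n - k, so the
# fraction of length-n strings with C(sigma) < n - k is below 2^(-k).

def num_strings_shorter_than(n):
    return sum(2 ** i for i in range(n))  # lengths 0, 1, ..., n-1

n, k = 12, 4
assert num_strings_shorter_than(n) == 2 ** n - 1
compressible_fraction_bound = num_strings_shorter_than(n - k) / 2 ** n
assert compressible_fraction_bound < 2 ** (-k)
```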
3.1. Plain Kolmogorov complexity
For instance, the number of strings of length 1000 that are "half-random" (i.e., those with Kolmogorov complexity at least 500) is greater than 2^{1000}(1 − 2^{−500}). Thus, heuristically, almost every string is half-random.

We would like to have C(στ) ≤ C(σ) + C(τ) + O(1), but this is not the case. The problem is that we cannot simply concatenate descriptions of σ and τ, since we would have no way of knowing where one description ends and the other begins. In fact, sufficiently long strings always have fairly compressible initial segments, as we now see.

Theorem 3.1.4 (Martin-Löf, see Kolmogorov [212] and Li and Vitányi [248]). Fix k. If μ is sufficiently long then there is an initial segment σ of μ such that C(σ) < |σ| − k. Thus, for any fixed d, we have μ = στ such that C(μ) > C(σ) + C(τ) + d.

Proof. Let ν be an initial segment of μ, and let n be such that ν is the nth string in the length-lexicographic ordering of 2^{<ω}. Let ρ be the string consisting of the next n bits of μ following ν, and let σ = νρ. To generate σ we need only know ρ, since the length n of ρ will then give us ν, so there is a constant c such that C(σ) ≤ |ρ| + c, where c does not depend on the choice of μ or ν.² On the other hand, |σ| = |ν| + |ρ|, so if we choose ν so that |ν| > c + k, then C(σ) < |σ| − k.

For the second part of the theorem, let c be such that C(τ) ≤ |τ| + c for all τ, and let k = c + d. Let μ be a sufficiently large string such that C(μ) ≥ |μ|. By the first part of the theorem, we can write μ = στ so that C(σ) < |σ| − k. Then C(μ) ≥ |μ| = |σ| + |τ| > C(σ) + k + C(τ) − c = C(σ) + C(τ) + d.

On the other hand, we do have the following bound, which has been shown to be fairly sharp, using techniques like that in the proof of Theorem 3.1.4.

Proposition 3.1.5. C(στ) ≤ C(σ, τ) ≤ C(σ) + C(τ) + 2 log C(σ) + O(1).

Proof. That C(στ) ≤ C(σ, τ) is clear, since if we know both σ and τ then we know στ. For a string ν = a_0 a_1 . . . a_m, let ν̄ = a_0 a_0 a_1 a_1 . . . a_m a_m 01.
That is, ν̄ is the result of doubling each bit of ν and adding 01 to the end. Intuitively, the second inequality in the proposition follows because we can describe σ and τ by giving σ*_C and τ*_C together with the length of σ*_C, that is, C(σ). If we code this length as a binary string ρ, then we can give this length in an unambiguous way as ρ̄, because each string μ has at most one initial segment of the form ν̄ for some ν, so ρ̄ σ*_C τ*_C unambiguously describes ⟨σ, τ⟩.

² It may be useful to those unfamiliar with this kind of argument to give a more detailed justification here: For a string τ, let m = |τ| and let f(τ) = ζτ, where ζ is the mth string in the length-lexicographic ordering of 2^{<ω}. Then f is a computable function and f(ρ) = νρ = σ. So C_f(σ) ≤ |ρ|, and hence C(σ) ≤ |ρ| + c, where c is the coding constant of f in our fixed universal function U.
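The self-delimiting code ν ↦ ν̄ is easy to implement; the following sketch (with hypothetical helper names `bar` and `unbar`) shows why a bar-coded block can be stripped unambiguously from the front of any bit string:

```python
# bar(v) doubles each bit of v and appends "01". Reading the result two
# bits at a time, doubled pairs "00"/"11" spell out v, and the first
# aligned "01" marks the end, so decoding never needs to look ahead.

def bar(v: str) -> str:
    return "".join(b + b for b in v) + "01"

def unbar(s: str):
    """Split s = bar(v) + rest; return (v, rest)."""
    v = []
    i = 0
    while True:
        pair = s[i:i + 2]
        if pair == "01":
            return "".join(v), s[i + 2:]
        if pair not in ("00", "11"):
            raise ValueError("not of the form bar(v) + rest")
        v.append(pair[0])
        i += 2

rho = "101"                       # e.g. a binary representation of C(sigma)
v, rest = unbar(bar(rho) + "0110")
assert (v, rest) == ("101", "0110")
assert len(bar(rho)) == 2 * len(rho) + 2   # the 2 log C(sigma) + O(1) overhead
```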
More formally, let f : 2^{<ω} → 2^{<ω} be defined as follows. On input μ, search for μ_0, μ_1, μ_2, ρ such that

1. μ = μ_0 μ_1 μ_2,
2. μ_0 = ρ̄,
3. ρ is the binary representation of |μ_1|,
4. U(μ_1) ↓, and U(μ_2) ↓.

If these strings are found then let f(μ) = ⟨U(μ_1), U(μ_2)⟩. Letting ρ be the binary representation of C(σ), we have f(ρ̄ σ*_C τ*_C) = ⟨σ, τ⟩, so C(σ, τ) ≤ |ρ̄ σ*_C τ*_C| + O(1) = C(σ) + C(τ) + 2 log C(σ) + O(1).

We will use Kolmogorov complexity to study issues related to the relative algorithmic randomness of infinite sequences, but it has also found many other applications. For instance, there is a well-developed theory applying incompressibility to avoid counting in combinatorial arguments. This method is extensively used for lower bound arguments in complexity theory, and is one of the principal reasons for much of the interest in C-complexity. We will not dwell on such applications of Kolmogorov complexity here, since they are not directly related to our central theme, but will give a short but interesting example: the construction of an immune set, that is, a set with no infinite c.e. subset. For more on applications of Kolmogorov complexity, see Chapter 6 of [248].

Let A = {σ : C(σ) ≥ |σ|/2}. Then A is immune. Indeed, suppose that A has an infinite c.e. subset B. Let h(n) be the first string of length greater than n to enter B. Then C(h(n)) ≥ |h(n)|/2 > n/2, since h(n) ∈ A, but we can generate h(n) given n simply by running the enumeration of B until a string of length greater than n appears, so C(h(n)) ≤ C(n) + O(1) ≤ log n + O(1). For large enough n, this is a contradiction.
3.2 Conditional complexity

It is often useful to measure the compressibility of a string given another string. To do so, we fix a universal oracle machine, that is, an oracle Turing machine U such that for each oracle Turing machine M there is a coding string ρ_M such that ∀X ∀σ [U^X(ρ_M σ) = M^X(σ)], where X ranges over all oracles, both finite and infinite. For strings σ and τ, the (plain) Kolmogorov complexity of σ given τ is

C(σ | τ) = min{|μ| : U^τ̄(μ) = σ},
where τ̄ is as defined in the proof of Proposition 3.1.5. (The reason we use τ̄ instead of τ is the following. By the way we have defined the action of oracle machines on finite oracles in Section 2.4.1, for any such machine M and any μ, the set of oracles ρ such that M^ρ(μ) ↓ is prefix-free, as defined in Definition 2.19.1. It is not hard to use this fact to show that min{|μ| : U^τ(μ) = σ} does not behave as we would expect conditional complexity to behave. For example, min{|μ| : U^σ(μ) = σ} is not bounded by a constant, but we of course want to have C(σ | σ) ≤ O(1). With the definition we have given, we do have this property, since it is easy to define an oracle Turing machine M such that M^σ̄(λ) = σ for all σ.)

Notice that this notion of conditional Kolmogorov complexity reduces to unconditional Kolmogorov complexity by letting τ = λ, the empty string. That is, C(σ) = C(σ | λ) ± O(1).

In Proposition 3.1.2, we saw that C(σ*_C) = C(σ, C(σ)) ± O(1). We can say a little more using conditional complexity.

Proposition 3.2.1. C(σ*_C | σ) = C(C(σ) | σ) ± O(1).
The proof is essentially the same as that of Proposition 3.1.2.

Easy counting arguments give us the following basic result relating the size of a finite set to the complexity of its elements.

Theorem 3.2.2 (Kolmogorov [211]).
(i) Let A ⊂ 2^{<ω} be finite. Then for each τ there is a σ ∈ A such that C(σ | τ) ≥ log |A|.
(ii) Let B ⊆ 2^{<ω} × 2^{<ω} be an infinite computably enumerable set such that each set of the form B_τ = {σ : ⟨σ, τ⟩ ∈ B} is finite. Then for every τ and every σ ∈ B_τ, we have C(σ | τ) ≤ log |B_τ| + O(1), where the constant term is independent of both σ and τ.

Proof. Part (i) is a counting argument, as in the proof of Proposition 3.1.3. For part (ii), given τ, we can describe σ ∈ B_τ by giving its position in the enumeration of B_τ derived from a fixed enumeration of B. This position is less than or equal to |B_τ|, and hence can be coded by a string of length less than or equal to log |B_τ|.
Universal oracle machines also allow us to relativize the notion of plain Kolmogorov complexity to any oracle A, by letting U be such a machine and defining C^A(σ) = min{|τ| : U^A(τ) = σ}. All the basic properties of plain complexity continue to hold in the relativized case, with the same proofs. It is also easy to show that if B ≤_T A then C^B(σ) ≤ C^A(σ) + O(1).
3.3 Symmetry of information

The (plain) information content of a string τ in a string σ is defined as I_C(σ : τ) = C(τ) − C(τ | σ). This notion is meant to capture the difference between the intrinsic difficulty of producing τ given σ and that of producing τ without σ. The following famous result expresses concisely the relationship between I_C(σ : τ) and I_C(τ : σ).

Theorem 3.3.1 (Symmetry of Information, Levin and Kolmogorov, see [425]). I_C(σ : τ) = I_C(τ : σ) ± O(log C(σ, τ)).

Note that it follows immediately from this result that I_C(σ : τ) = I_C(τ : σ) ± O(log n), where n = max{|τ|, |σ|}.

Theorem 3.3.1 follows easily from the following restated version.

Theorem 3.3.2 (Symmetry of Information, Restated). C(σ, τ) = C(τ | σ) + C(σ) ± O(log C(σ, τ)).

To obtain Theorem 3.3.1 from Theorem 3.3.2, we argue as follows. We have C(σ, τ) = C(τ | σ) + C(σ) ± O(log C(σ, τ)) and C(τ, σ) = C(σ | τ) + C(τ) ± O(log C(τ, σ)). But C(σ, τ) = C(τ, σ) ± O(1), so

I_C(σ : τ) = C(τ) − C(τ | σ) = C(σ) − C(σ | τ) ± O(log C(σ, τ)) = I_C(τ : σ) ± O(log C(σ, τ)).

We now proceed with the proof of Theorem 3.3.2, which follows that given by Li and Vitányi [248].

Proof of Theorem 3.3.2. It is easy to see that C(σ, τ) ≤ C(σ) + C(τ | σ) + O(log C(σ, τ)), since we can describe ⟨σ, τ⟩ via a description of σ, a description of τ given σ, and an indication of where to delimit the two descriptions, as in Proposition 3.1.5.

For the other direction, let A = {(μ, ν) : C(μ, ν) ≤ C(σ, τ)} and A_μ = {ν : (μ, ν) ∈ A}.
The set A is finite, and there is a uniform procedure (over all choices of σ and τ) for enumerating it given C(σ, τ). Similarly, the sets A_μ are finite and there is a uniform procedure (again over all choices of σ and τ) for enumerating them given C(σ, τ) and μ. Hence, one can describe τ given σ and C(σ, τ), along with τ's place in the enumeration order of A_σ. Thus C(τ | σ) ≤ log |A_σ| + 2 log C(σ, τ) + O(1).

Let N = 2^{log |A_σ| − 1}. (Recall our convention that when we write log n, we mean the base 2 logarithm of n, rounded up to the nearest integer. Thus we do not necessarily have N = |A_σ|/2. However, it is the case that N ≤ |A_σ|.) Since log N = log |A_σ| − 1,

C(τ | σ) ≤ log N + O(log C(σ, τ)).   (3.1)

Now consider the set B = {μ : |A_μ| ≥ N}. We have σ ∈ B. Furthermore, there is a uniform procedure (over all choices of σ and τ) for enumerating B given C(σ, τ) and N, and |B| ≤ |A|/N ≤ 2^{C(σ,τ)}/N. Thus, to describe σ, we need only give C(σ, τ) and N, along with σ's place in the enumeration order of B. Now, N can be specified with log log |A_σ| many bits, so

C(σ) ≤ log (2^{C(σ,τ)}/N) + 2 log log |A_σ| + 2 log C(σ, τ) + O(1).

But |A_σ| ≤ 2^{C(σ,τ)}, so

C(σ) ≤ C(σ, τ) − log N + O(log C(σ, τ)).   (3.2)

Combining (3.1) and (3.2), we have C(σ) + C(τ | σ) ≤ C(σ, τ) + O(log C(σ, τ)), as required.
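As a sanity check on the shape of these identities, the Shannon-entropy analogue of symmetry of information holds exactly. The following is an illustration from classical information theory, not the algorithmic statement itself: for random variables, the chain rule H(X, Y) = H(X) + H(Y | X) has no error term, so the mutual information H(Y) − H(Y | X) is exactly symmetric. The sample of pairs is arbitrary.

```python
# Empirical check of the Shannon analogue: compute joint, marginal, and
# conditional entropies from a small sample of pairs and verify the chain
# rule and the symmetry of mutual information.
from collections import Counter
from math import log2

pairs = [(0, 0), (0, 0), (0, 1), (1, 1), (1, 1), (1, 0), (0, 0), (1, 1)]
n = len(pairs)

def H(events):
    return -sum((k / n) * log2(k / n) for k in Counter(events).values())

def H_cond(ps):  # H(second coordinate | first coordinate)
    joint, marg = Counter(ps), Counter(x for x, _ in ps)
    return -sum((k / n) * log2(k / marg[x]) for (x, _), k in joint.items())

Hxy = H(pairs)
Hx, Hy = H(x for x, _ in pairs), H(y for _, y in pairs)
assert abs(Hxy - (Hx + H_cond(pairs))) < 1e-12   # chain rule, exact

flipped = [(y, x) for x, y in pairs]
I_xy = Hy - H_cond(pairs)        # analogue of I(sigma : tau)
I_yx = Hx - H_cond(flipped)      # analogue of I(tau : sigma)
assert abs(I_xy - I_yx) < 1e-12  # symmetry, with no error term
```

In the algorithmic setting the chain rule (Theorem 3.3.2) only holds up to O(log C(σ, τ)), which is exactly where the error term in Theorem 3.3.1 comes from.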
3.4 Information-theoretic characterizations of computability

In this section, we establish some combinatorial facts about the number of strings of a specified complexity, and show that one can use Kolmogorov complexity to provide information-theoretic characterizations of computability. We begin with a result of Loveland [250].

If A is a computable set then we can generate A ↾ n simply by running the computation of A(i) for all i < n. Thus C(A ↾ n | n) = O(1), where the constant depends on A. Loveland's result is a converse to this fact.

Theorem 3.4.1 (Loveland [250]). Let X be an infinite computable set. For each e, there are only finitely many sets A such that C(A ↾ n | n) ≤ e for all n ∈ X, and each such A is computable.
Proof. Let U be our universal Turing machine. Fix e. For each n, let k_n = |{τ ∈ 2^{≤e} : U^n(τ) ↓}|. Since k_n < 2^{e+1} for all n, there is a largest m such that k_n = m for infinitely many n ∈ X, and there is an l such that k_n ≤ m for all n ∈ X with n ≥ l. Furthermore, there is a computable sequence n_0 < n_1 < · · · ∈ X such that n_0 ≥ l and k_{n_i} = m for all i.

Note that the sets S_i = {μ : C(μ | n_i) ≤ e} are uniformly computable, since for each i we can wait until we see m strings τ ∈ 2^{≤e} with U^{n_i}(τ) ↓, at which point we know that S_i is exactly the collection of values of U^{n_i}(τ) for such τ.

Let T be the tree consisting of all σ such that ∀n_i ≤ |σ| (σ ↾ n_i ∈ S_i). If C(A ↾ n | n) ≤ e for all n ∈ X then in particular A ↾ n_i ∈ S_i for all i, so A is a path of T. But the tree T is computable, and has width at most m, since for each n_i there are at most m strings of length n_i on T, so T has only finitely many paths, which means that each path of T is computable.

It is interesting to note that Loveland [250] also showed that one cannot weaken the hypothesis of the above theorem to simply say that C(A ↾ n | n) ≤ e for infinitely many n. The key fact here is the following. Let #(σ) denote the position of σ in the length-lexicographic ordering of 2^{<ω}, and let σ̂ = σ0^{#(σ)−|σ|}. Then for any σ, given |σ̂| = #(σ) we know σ, and hence σ̂. Thus there is an e such that C(σ̂ | |σ̂|) ≤ e for all σ. Now for any set B we can form a sequence σ_0 ≺ σ_1 ≺ · · · by taking σ_0 = λ and σ_{i+1} = ν̂, where ν = σ_i B(i). Letting A = ⋃_i σ_i, we have C(A ↾ n | n) ≤ e for infinitely many n. But because the choice of B was arbitrary, there are continuum many such A.

One of the keys to Theorem 3.4.1 is the uniformity implicit in having C(A ↾ n | n) bounded by a constant. We now turn to a more interesting theorem, due to Chaitin [60], which also gives an information-theoretic characterization of computability, but at the same time indicates a hidden uniformity in plain complexity. We begin with two lemmas of independent interest.
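The padding trick σ ↦ σ̂ in Loveland's observation is concrete enough to implement (a sketch; the 1-based length-lexicographic position convention for `index` is an assumption):

```python
# hat(s) pads s with zeros up to length #(s), the length-lex position of
# s, so the *length* of hat(s) alone determines s: invert the position.

def index(s: str) -> int:
    # position in the order: lambda, 0, 1, 00, 01, 10, 11, 000, ...
    return 2 ** len(s) + (int(s, 2) if s else 0)

def from_index(n: int) -> str:
    L = n.bit_length() - 1          # n lies in [2^L, 2^(L+1))
    return format(n - 2 ** L, "b").zfill(L) if L else ""

def hat(s: str) -> str:
    return s + "0" * (index(s) - len(s))

assert [from_index(i) for i in range(1, 8)] == ["", "0", "1", "00", "01", "10", "11"]
assert from_index(len(hat("0110"))) == "0110"
```

Since the padded string is determined by its length, a machine that is told the length of its condition can output σ̂ with O(1) bits of advice, which is exactly how the continuum-many counterexamples above arise.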
Let D : 2^{<ω} → 2^{<ω} be partial computable. Then τ is a D-description of σ, or a D-program for σ, if D(τ) = σ.

Lemma 3.4.2 (Chaitin [60]). Let D : 2^{<ω} → 2^{<ω} be partial computable. There is a constant c such that, letting h(d) = 2^{d+c}, for each σ ∈ 2^{<ω},

|{τ : D(τ) = σ ∧ |τ| ≤ C(σ) + d}| < h(d).

That is, the number of D-descriptions of σ of length at most C(σ) + d is O(2^d).
Proof. The intuition behind this proof is that if there are too many short D-descriptions of σ, then using a listing of these we can give a description of σ shorter than C(σ), which is impossible.

For each m and k, let L^k_m be the set of all τ that have at least 2^m many D-descriptions of length at most k. Then |L^k_m| < 2^{k−m}, and the L^k_m are uniformly c.e. Thus we can describe any element of L^k_m by giving m and a string of length k − m indicating its position in the enumeration of L^k_m. Such a description can be given in at most 2 log m + k − m + O(1) many bits, where the constant term depends only on D. So there is a c such that if σ ∈ L^k_m then C(σ) ≤ 2 log m + k − m + c. We may assume that c > 3.

Let h(d) = 2^{d+c}. Let m = h(d) and k = C(σ) + d. If σ ∈ L^k_m then C(σ) ≤ 2 log h(d) + C(σ) + d − h(d) + c, and hence h(d) ≤ 2 log h(d) + d + c = 3(d + c), which is not the case. Thus σ ∉ L^k_m. In other words, the number of D-descriptions of σ of length at most C(σ) + d is less than h(d).

Lemma 3.4.3 (Chaitin [60]). There is a computable function h with h(d) = O(2^d) such that, for all n and d, |{σ ∈ 2^n : C(σ) ≤ C(n) + d}| ≤ h(d).

Proof. Let D be the partial computable function defined by D(τ) = 1^{|U(τ)|}, where U is our universal Turing machine, and let h be as in Lemma 3.4.2. If C(σ) ≤ C(n) + d then σ*_C is a D-description of 1^n of length at most C(n) + d. There are at most h(d) many such descriptions, and hence at most h(d) many σ with C(σ) ≤ C(n) + d.

Thus, although one might expect that, as the size of σ increases, the number of descriptions of σ of length close to C(σ) would grow, we see that this is not the case.

Theorem 3.4.4 (Chaitin [60]). A set A is computable iff C(A ↾ n) ≤ C(n) + O(1). Furthermore, for each d there are only O(2^d) many A such that C(A ↾ n) ≤ C(n) + d for all n.

Proof. If A is computable, then clearly C(A ↾ n) ≤ C(n) + O(1). Now let e be such that C(n) ≤ log n + e for all n. Fix d and let T = {σ : ∀τ ⪯ σ (C(τ) ≤ log |τ| + d + e)}. The tree T is c.e.
We want to transform it into a computable tree. We assume we have an enumeration of T such that each T_s is closed under substrings. Let I_k be the interval (2^k, 2^{k+1}], and let m_k = min{|2^n ∩ T| : n ∈ I_k}. Each I_k contains some n such that C(n) ≥ log n. For this n, we have |2^n ∩ T| ≤ h(d + e), so m_k ≤ h(d + e) for all k. Let c ≤ h(d + e) be the largest number such that m_k = c for infinitely many k, and let l be such that m_k ≤ c for all k > l. Let l < k_0 < k_1 < · · · and s_0, s_1, . . . be computable sequences such that for each i we have |2^n ∩ T_{s_i}| ≥ c for all
n ∈ I_{k_i}. Let a_i be the least element of I_{k_i} such that |2^{a_i} ∩ T_{s_i}| = c, which must exist since m_{k_i} = c. Let S = {σ : ∀a_i < |σ| (σ ↾ a_i ∈ T_{s_i})}. Then S is a computable tree.

If P ∈ [S] then P ↾ a_i ∈ T for all i, so P ∈ [T]. Conversely, suppose that a string σ of length a_i is not in T_{s_i}. Let n ∈ I_{k_i} be such that |2^n ∩ T| = c. Then n ≥ a_i, and there is no extension of σ of length n in T. So if P ∈ [T] then P ↾ a_i ∈ T_{s_i} for all i, and hence P ∈ [S]. Thus [S] = [T].

If C(A ↾ n) ≤ C(n) + d for all n then A ∈ [T], so A ∈ [S]. But S has width c ≤ h(d + e) = O(2^d), so there are only O(2^d) many such A, and they are all computable.

Note that the above theorem remains true (with a similar proof) if we weaken the latter condition by requiring that C(A ↾ n) ≤ C(n) + O(1) only for an infinite computable set of n, or if we replace it by C(A ↾ n) ≤ log n + O(1) (for all n or for an infinite computable set of n), since if C(A ↾ n) ≤ log n + d for all n then A ∈ [T] for the tree T defined in the above proof. We record the former variation for later use.

Theorem 3.4.5 (Chaitin [60]). A set A is computable iff there is an infinite computable set S such that C(A ↾ n) ≤ C(n) + O(1) for all n ∈ S.

Merkle and Stephan [270] observed that this theorem relativizes to show that, for any infinite S, if C(A ↾ n) ≤ C(n) + O(1) for all n ∈ S then A ≤_T S.

There are several other counting results along the lines of Lemma 3.4.3, many due to the Moscow school of algorithmic information theory. We prove one that will be useful below. (It is possible this result was known earlier than the given reference.)

Theorem 3.4.6 (Figueira, Nies, and Stephan [147]). Let σ ∈ 2^{<ω}. Then |{τ : C(σ, τ) ≤ C(σ) + d}| ≤ O(d^4 2^d), where the constant does not depend on σ.

Proof. Let n_σ be the position of σ in a fixed effective enumeration of 2^{<ω}. Let c be such that C(σ) ≤ n_σ + c for all σ. Let f(σ, τ, d) be a partial computable function defined as follows. Given σ, τ, and d, enumerate all the strings μ_0, μ_1, . . . with C(μ_i) ≤ n_σ + d + c.
If there is an i such that μ_i = τ, let ν be a binary representation of i with enough initial zeroes so that |ν| = n_σ + d + c + 1. Such a ν must exist because there are at most 2^{n_σ+d+c+1} many μ_j. Let f(σ, τ, d) = ν. Of course, if C(τ) > n_σ + d + c then f(σ, τ, d) ↑.
Now suppose that C(σ, τ) ≤ C(σ) + d. Then C(σ, τ) ≤ n_σ + d + c, so f(σ, τ, d) ↓. We have

C(f(σ, τ, d)) ≤ C(σ, τ) + 2 log d + O(1) ≤ C(σ) + d + 2 log d + O(1) ≤ C(n_σ + d + c + 1) + d + 4 log d + O(1).

(The last inequality follows since we can compute σ from n_σ + d + c + 1 and d.) For fixed σ and d, the map τ → f(σ, τ, d) is injective, so

|{τ : C(σ, τ) ≤ C(σ) + d}| ≤ |{ν ∈ 2^{n_σ+d+c+1} : C(ν) ≤ C(n_σ + d + c + 1) + d + 4 log d + O(1)}| ≤ O(d^4 2^d),

the last inequality following by Lemma 3.4.3.
3.5 Prefix-free machines and complexity

In this section, we introduce the notion of prefix-free Kolmogorov complexity. Our main reason for working with this notion is to use it in studying algorithmic randomness of infinite sequences. However, some have argued that prefix-free complexity is the correct notion of descriptive complexity even for finite strings.

One argument for the inadequacy of plain Kolmogorov complexity is the following. The intended meaning in saying that τ is a description of σ is that the bits of τ contain all the information necessary to obtain σ. But a Turing machine M might produce σ by using not only the bits of τ but also its length. In this way, τ actually represents |τ| + log |τ| many bits of information. Indeed, this fact was exploited in the proof of Theorem 3.1.4. Another way to make this argument is that, if M is allowed to use the length of τ in computing σ, then there must be some kind of termination symbol T (such as a blank space) on M's input tape following the bits of τ. Thus, to output a word in the alphabet {0, 1}, our machine uses an input in the alphabet {0, 1, T}, which we may view as cheating.

If one accepts this argument then one ought to try to circumvent this shortcoming of plain Kolmogorov complexity. First Levin [241, 243] and later Chaitin [58] suggested using prefix-free machines to do so. Recall from Definition 2.19.1 that a set A ⊆ 2^{<ω} is prefix-free if it is an antichain with respect to the natural partial ordering of 2^{<ω}, that is, if for all σ ∈ A and all τ properly extending σ, we have τ ∉ A. Joe Miller has pointed out that the set of valid telephone numbers is a real-world example of a prefix-free set. Note that this prefix-freeness allows us to give a phone number using only the alphabet {0, . . . , 9}. Anil Nerode has remarked that
there have been phone numbering systems in the past that were not prefix-free. In such a system, to give a phone number, we also need a termination symbol, such as a blank space.

A prefix-free function is one whose domain is prefix-free. Similarly, a prefix-free (Turing) machine is one whose domain is prefix-free. It is usual to consider such a machine as being self-delimiting, which means that it has a one-way read head that halts when the machine accepts the string described by the bits read so far. The point is that such a machine is forced to accept strings without knowing whether there are any more bits written on the input tape. This is a purely technical device that forces the machine to have a prefix-free domain, but it also highlights how using prefix-free machines circumvents the use of the length of a string to gain more information than is present in the bits of the string.³

To highlight the role of machines in the definition of Kolmogorov complexity, we will sometimes refer to general Turing machines as plain machines. A partial computable prefix-free function U is universal if for each partial computable prefix-free function f : 2^{<ω} → 2^{<ω}, there is a string ρ_f such that ∀σ [U(ρ_f σ) = f(σ)]. As in the non-prefix-free case, we call ρ_f the coding string and |ρ_f| the coding constant of f in U. As the following result shows, we can identify partial computable prefix-free functions and prefix-free machines, and hence often refer to U as a universal prefix-free machine.⁴

Proposition 3.5.1.
(i) If Φ is a prefix-free partial computable function, then there is a prefix-free (self-delimiting) machine M such that M computes Φ.
(ii) There is a universal (self-delimiting) prefix-free machine.

³ Another way to achieve this goal is to require the complexity measure to be "continuous". This requirement gives rise to the notions of monotone complexity and process complexity, which we will discuss in Section 3.15. Another related notion, which we will not discuss in this book, is uniform complexity; see Li and Vitányi [248] and Barzdins [29] for more details. Another example of a notion of complexity that we will not discuss is Loveland's decision complexity Kd. It is defined by considering a machine as a c.e. collection of pairs (σ, τ), thinking of σ as a description of τ, but now insisting that if (σ, τ) occurs in our collection, then so does (σ, ρ) for all ρ ≺ τ. As with most complexities, Loveland's variant gives further insight into descriptional complexity. See [250] for more details. Notice that Kd(σ) ≤ C(σ) + O(1), and in fact it is possible to have Kd(σ) be much less than C(σ), so C is not the minimal complexity measure to have been studied. For more on this topic see Uspensky and Shen [396]. Notions such as decision complexity and uniform complexity have much less developed theories than prefix-free complexity and plain complexity.

⁴ As in the non-prefix-free case, we sometimes refer to these functions and machines as universal by adjunction. See footnote 1 on page 111.
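Prefix-freeness of a finite set of strings is easy to test directly; the following sketch also computes the Kraft sum Σ 2^{−|σ|}, which Section 3.6 shows is at most 1 for any prefix-free set:

```python
# After sorting lexicographically, if one string is a proper prefix of
# another, then some *adjacent* pair already witnesses this, so a single
# pass over neighbors suffices.

def is_prefix_free(strings):
    ss = sorted(set(strings))
    return not any(b.startswith(a) for a, b in zip(ss, ss[1:]))

def kraft_sum(strings):
    return sum(2.0 ** -len(s) for s in set(strings))

assert is_prefix_free(["00", "01", "1"])      # like a phone numbering plan
assert not is_prefix_free(["0", "01"])        # "0" is a prefix of "01"
assert kraft_sum(["00", "01", "1"]) <= 1
```

The adjacency trick works because any string lying lexicographically between a string and one of its extensions must itself extend the shorter string.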
Proof. (i) Let Φ be partial computable and prefix-free. We build a self-delimiting machine M in stages. Initially, M works on the empty string. At a given point in the action of M, let σ be the string consisting of the bits read by M so far. The read head does not scan the next symbol unless it sees some extension τ of σ such that Φ(τ) ↓. At this point, M scans the next symbol i on the read tape. If the string σ′ = σi remains an initial segment of τ, then M reads the next symbol and repeats the above process with σ′ in place of σ. Otherwise, M goes back to waiting for a string τ extending σ′ upon which Φ halts. Of course, if the string σ consisting of the bits read by M so far is ever such that Φ(σ) ↓, then M simply outputs Φ(σ).

(ii) Let Ψ_e be the partial computable function defined by letting Ψ_e[s] = Φ_e[s] if dom Φ_e[s] is prefix-free and Ψ_e[s] = Ψ_e[s − 1] otherwise. Then {Ψ_e}_{e∈ω} is an enumeration of all (and only) the prefix-free partial computable functions. We can then define Ψ(1^e 0σ) = Ψ_e(σ), which is evidently prefix-free and universal. By part (i), there is a self-delimiting machine computing Ψ.

By the same argument as above, there is a universal prefix-free oracle machine U, that is, one such that for each prefix-free oracle machine M, there is a coding string ρ_M such that for every oracle X, we have ∀σ [U^X(ρ_M σ) = M^X(σ)] (where X is either a finite or an infinite oracle).

Definition 3.5.2. Henceforth in this book, we fix a universal prefix-free oracle machine U. We write U(σ) for U^∅(σ), and thus also think of U as a universal prefix-free machine in the unrelativized case. All our results will be independent of the choice of U.

Fix the enumeration Ψ_0, Ψ_1, . . . of prefix-free partial computable functions given in the proof of Proposition 3.5.1. We will use the following version of the recursion theorem below.

Theorem 3.5.3 (Recursion Theorem for Prefix-Free Machines).
Let h : 2^{<ω} × N → 2^{<ω} be a partial computable function such that for each e, the function λσ.h(σ, e) is prefix-free. From an index for h, we can compute an index e such that Ψ_e = λσ.h(σ, e).

Proof. Let f be a total computable function such that Φ_{f(e)} = λσ.h(σ, e) for all e. By the recursion theorem, there is an e (which can be computed from an index for f, and hence from an index for h) such that Φ_{f(e)} = Φ_e. Since Φ_{f(e)} is prefix-free, so is Φ_e, and hence Ψ_e = Φ_e. Thus we have Ψ_e = Φ_{f(e)} = λσ.h(σ, e).

The machine U is minimal, in the sense that, if M is any prefix-free machine, then C_U(σ) ≤ C_M(σ) + O(1). Thus we can define the prefix-free
Kolmogorov complexity of a string σ to be K(σ) = C_U(σ). Similarly, we can define the prefix-free Kolmogorov complexity K(σ | τ) of σ given τ and the prefix-free Kolmogorov complexity K^A(σ) of σ relative to A as we did for plain complexity. As in the case of plain complexity, the choice of universal prefix-free machine does not affect the definition of K, up to an additive constant. When M is a prefix-free machine, we will sometimes write K_M instead of C_M.

As we did for plain complexity, we associate a natural number n with its binary representation σ and let K(n) = K(σ), and similarly for other notation introduced in the case of plain complexity. As was the case for plain complexity, we have the following simple but important result.

Proposition 3.5.4. If h : 2^{<ω} → 2^{<ω} is computable then K(h(σ)) ≤ K(σ) + O(1).

Suppose that K(σ) = n. Then there is a first string τ of length n such that the computation of U(τ) converges with value σ. More precisely, there is a least stage s such that U(τ)[s] ↓ = σ for one or more strings τ of length n. We fix the lexicographically least such τ and denote it by σ*.

One of the most important facts about K is that it can be approximated from above. Specifically, let K_s(σ) = min{|τ| : U(τ)[s] ↓ = σ}. In other words, K_s(σ) is the length of the shortest description of σ provided by U in at most s many steps. Of course, there may be no such description. However, it will be technically useful to assume that K_s(σ) is always defined. It is easy to show that there is a c such that K(σ) ≤ 2|σ| + c for all σ. (We will give much better upper bounds below.) Thus, if there is no τ with U(τ)[s] ↓ = σ, then let K_s(σ) = 2|σ| + c. The important properties of this approximation are that

1. the function (s, σ) → K_s(σ) is computable,
2. K_s(σ) ≥ K_{s+1}(σ) for all s and σ, and
3. K(σ) = lim_s K_s(σ) for all σ.

On the other hand, we have the following fact (which is also true for plain complexity, with the same proof).

Proposition 3.5.5.
There is no partial computable f with unbounded range such that f(σ) ≤ K(σ) whenever f(σ) is defined.

Proof. Define a prefix-free machine M as follows. For each k, look for a σ_k such that f(σ_k) ≥ 2k and let M(0^k 1) = σ_k. Then K(σ_k) ≤ C_M(σ_k) + O(1) ≤ k + O(1) < 2k ≤ f(σ_k) ≤ K(σ_k) for all sufficiently large k, which is a contradiction.
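The approximation of K from above can be mimicked with a finite stand-in machine. The table, its halting stages, and the constant below are all hypothetical; only the monotone-from-above pattern of K_s matters.

```python
# K_s(sigma) relative to a toy machine (program -> (output, halting
# stage)): the shortest program for sigma that has halted by stage s,
# defaulting to 2|sigma| + c while none has.

C_DEFAULT = 2  # stand-in for the constant c with K(sigma) <= 2|sigma| + c

machine = {  # hypothetical prefix-free machine
    "00":  ("111", 5),
    "010": ("111", 3),
    "11":  ("0", 1),
}

def K_s(sigma, s):
    lengths = [len(p) for p, (out, t) in machine.items()
               if out == sigma and t <= s]
    return min(lengths, default=2 * len(sigma) + C_DEFAULT)

assert K_s("111", 0) == 8   # nothing has halted yet: 2*3 + 2
assert K_s("111", 3) == 3   # "010" has halted
assert K_s("111", 5) == 2   # "00" has halted; this is the final value
assert all(K_s("111", s) >= K_s("111", s + 1) for s in range(10))
```

The three displayed properties hold here by construction: K_s is computable from the table, never increases with s, and settles at the final value.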
We will develop some further basic properties of prefix-free Kolmogorov complexity after discussing an important indirect way to build prefix-free machines in the next section.
3.6 The KC Theorem

A central tool in building prefix-free machines is an effective interpretation of an inequality of Kraft [213].⁵ If S_0, S_1, . . . ⊆ 2^ω are disjoint measurable sets then μ(⋃_n S_n) = Σ_n μ(S_n), so if A ⊂ 2^{<ω} is prefix-free then Σ_{σ∈A} 2^{−|σ|}, which is the measure of the set of sequences extending strings in A, is at most 1. Conversely, suppose that we have a sequence {d_i}_{i∈ω} of natural numbers such that Σ_i 2^{−d_i} ≤ 1. Since we are not arguing effectively, by rearranging the d_i if necessary, we may assume that d_0 ≤ d_1 ≤ d_2 ≤ · · · . Let σ_i be the leftmost string of length d_i that does not extend any σ_j with j < i. An easy induction shows that such a string always exists. Thus we see that, for any sequence {d_i}_{i∈ω} of natural numbers such that Σ_i 2^{−d_i} ≤ 1, there is a prefix-free sequence {σ_i}_{i∈ω} of strings such that |σ_i| = d_i for all i. The KC Theorem is an effectivization of this fact.

Theorem 3.6.1 (KC Theorem, Levin [241], Schnorr [350], Chaitin [58]). Let (d_i, τ_i)_{i∈ω} be a computable sequence of pairs (which we call requests), with d_i ∈ N and τ_i ∈ 2^{<ω}, such that Σ_i 2^{−d_i} ≤ 1. Then there is a prefix-free machine M and strings σ_i of length d_i such that M(σ_i) = τ_i for all i and dom M = {σ_i : i ∈ ω}. Furthermore, an index for M can be obtained effectively from an index for our sequence of requests.

Proof. It is enough to define effectively a prefix-free sequence of strings σ_0, σ_1, . . . with |σ_n| = d_n. The following organizational device is due to Joe Miller. For each n, let x^n = x^n_1 . . . x^n_m be a binary string such that 0.x^n_1 . . . x^n_m = 1 − Σ_{j≤n} 2^{−d_j}. We will define the σ_n so that the following holds for each n: for each m with x^n_m = 1 there is a string μ^n_m of length m so that S_n = {σ_i : i ≤ n} ∪ {μ^n_m : x^n_m = 1} is prefix-free.

⁵ This result is usually known as the Kraft-Chaitin Theorem, as it appears in Chaitin [58], but it appeared earlier in Levin's dissertation [241], as stated in Levin [243], where it is proved using Shannon-Fano codes (giving slightly weaker constants).
There is also a version of it in Schnorr [350, Lemma 1, p. 380]. In Chaitin [58], where the first proof explicitly done for prefix-free complexity seems to appear, the key idea of that proof is attributed to Nick Pippenger. Thus perhaps we should refer to the theorem by the rather unwieldy name of Kraft-Levin-Schnorr-Pippenger-Chaitin Theorem. Instead, we will refer to it as the KC Theorem. Since it is an effectivization of Kraft's inequality, one should feel free if one wishes to regard the initials as coming from "Kraft's inequality (Computable version)". Recently, some authors such as Nies have referred to this result as the Bounded Request Theorem, and to what we call a KC set as a bounded request set.
3. Kolmogorov Complexity of Finite Strings
We begin by letting $\sigma_0 = 0^{d_0}$. Notice that $x^0_m = 1$ iff $0 < m \le d_0$, so if we define $\mu^0_m = 0^{m-1}1$, then $\{\sigma_0\} \cup \{\mu^0_m : x^0_m = 1\}$ is prefix-free.

Now assume we have defined $\sigma_0, \ldots, \sigma_n$ and $\mu^n_m$ for $x^n_m = 1$ so that $S_n = \{\sigma_i : i \le n\} \cup \{\mu^n_m : x^n_m = 1\}$ is prefix-free. If $x^n_{d_{n+1}} = 1$ then $x^{n+1}$ is the same as $x^n$ except that $x^{n+1}_{d_{n+1}} = 0$. So we can let $\sigma_{n+1} = \mu^n_{d_{n+1}}$ and $\mu^{n+1}_m = \mu^n_m$ for all $m \ne d_{n+1}$, and then $S_{n+1} = \{\sigma_i : i \le n+1\} \cup \{\mu^{n+1}_m : x^{n+1}_m = 1\}$ is equal to $S_n$, and hence is prefix-free. Otherwise, find the largest $j < d_{n+1}$ such that $x^n_j = 1$. Such a $j$ must exist since otherwise $1 - \sum_{j \le n} 2^{-d_j} < 2^{-d_{n+1}}$, which would mean that $\sum_{j \le n+1} 2^{-d_j} > 1$. In this case $x^{n+1}$ is the same as $x^n$ except for positions $j, \ldots, d_{n+1}$, where we have $x^{n+1}_j = 0$ and $x^{n+1}_m = 1$ for $j < m \le d_{n+1}$. Let $\sigma_{n+1} = \mu^n_j 0^{d_{n+1}-j}$. For $m < j$ or $m > d_{n+1}$, let $\mu^{n+1}_m = \mu^n_m$, and for $j < m \le d_{n+1}$, let $\mu^{n+1}_m = \mu^n_j 0^{m-j-1}1$. Then $S_{n+1} = \{\sigma_i : i \le n+1\} \cup \{\mu^{n+1}_m : x^{n+1}_m = 1\}$ is the same as $S_n$ except that $\mu^n_j$ is replaced by a pairwise incomparable set of extensions of $\mu^n_j$. This fact clearly ensures that $S_{n+1}$ is prefix-free.

This completes the definition of the $\sigma_i$. Each finite subset of $\{\sigma_0, \sigma_1, \ldots\}$ is contained in some $S_n$, and hence is prefix-free. Thus the whole set is prefix-free. Since the $\sigma_i$ are chosen effectively, we can define a prefix-free machine $M$ by letting $M(\sigma_i) = \tau_i$ for each $i$. ∎

The weight of a request $(d, \tau)$ is $2^{-d}$. The weight of a computable sequence of requests $(d_i, \tau_i)_{i\in\omega}$ is the sum of the weights of the requests, i.e., $\sum_i 2^{-d_i}$. If this weight is less than or equal to $1$, then we say that this sequence is a KC set. The beauty of using the KC Theorem to define a prefix-free machine is that we need only build a KC set, issuing requests and ensuring that the weight is kept less than or equal to $1$; the prefix-free machine itself is built implicitly for us. The following is a very useful immediate consequence of the KC Theorem.
(Recall that we denote the measure of $A \subseteq 2^\omega$ by $\mu(A)$.)

Corollary 3.6.2. Let $(d_i, \tau_i)_{i\in\omega}$ be a KC set. Then $K(\tau_i) \le d_i + O(1)$.

Proof. Let $M$ be as in the KC Theorem. Then $K(\tau_i) \le K_M(\tau_i) + O(1) \le d_i + O(1)$. ∎

If a computable sequence of requests $(d_i, \tau_i)_{i\in\omega}$ has finite weight, then there is a $c$ such that $(d_i + c, \tau_i)_{i\in\omega}$ has weight less than or equal to $1$, and hence is a KC set. Thus Corollary 3.6.2 still holds for such a sequence. We will sometimes abuse terminology and refer to such a sequence as a KC set.

A simple application of the KC Theorem is the following. Let $\tau_0, \tau_1, \ldots$ be an effective enumeration of $2^{<\omega}$, with $\tau_0$ being the empty string. Let
$d_0 = 2$ and for $i > 0$ let $d_i = |\tau_i| + 2\log|\tau_i| + 3$. Then
$$\sum_i 2^{-d_i} = \frac{1}{4} + \sum_{i>0} \frac{2^{-|\tau_i|}}{8|\tau_i|^2} = \frac{1}{4} + \sum_{n>0} 2^n\,\frac{2^{-n}}{8n^2} = \frac{1}{4} + \frac{1}{8}\sum_{n>0}\frac{1}{n^2} < 1.$$
Thus $\{(d_i, \tau_i)\}_{i\in\omega}$ is a KC set, so we have the following upper bound on $K(\tau)$, which we will improve on in Theorem 3.7.4 below.

Proposition 3.6.3. $K(\tau) \le |\tau| + 2\log|\tau| + O(1)$.

The following is another useful application of the KC Theorem.

Proposition 3.6.4. Let $A_0, A_1, \ldots \subset 2^{<\omega}$ be nonempty, pairwise disjoint, and uniformly c.e. Then $K(n) \le -\log \sum_{\sigma\in A_n} 2^{-|\sigma|} + O(1)$.

Proof. Let $f(n) = \sum_{\sigma\in A_n} 2^{-|\sigma|}$ and $f(n,s) = \sum_{\sigma\in A_{n,s}} 2^{-|\sigma|}$, where $A_{n,s}$ is the stage $s$ approximation to $A_n$. Note that because the $A_n$ are pairwise disjoint, $\sum_n f(n) \le 1$. Build a KC set $L$ as follows. For each $n$ and $k$, if there is an $s$ such that $f(n,s) \ge 2^{-k}$, then enumerate a request $(k+1, n)$ into $L$. The weight of $L$ is bounded by
$$\sum_n \sum_{k \ge -\log f(n)} 2^{-(k+1)} \le \sum_n f(n) \le 1,$$
so $L$ is indeed a KC set. For each $n$ there is an $s$ such that $f(n,s) \ge \frac{f(n)}{2}$, so $L$ contains a request $(k+1, n)$ with $k \le -\log f(n) + 1$, which implies, by Corollary 3.6.2, that $K(n) \le -\log f(n) + O(1)$. ∎

Corollary 3.6.5. Let $C_0, C_1, \ldots$ be nonempty, pairwise disjoint uniformly $\Sigma^0_1$ classes. Then $K(n) \le -\log \mu(C_n) + O(1)$.

Corollary 3.6.6. Let $B_0, B_1, \ldots \subseteq 2^{<\omega}$ be nonempty, pairwise disjoint, and uniformly c.e. Then $K(n) \le -\log \sum_{\tau\in B_n} 2^{-K(\tau)} + O(1)$.

Proof. Let $A_n = \{\sigma : U(\sigma) \in B_n\}$. The $A_n$ are nonempty, pairwise disjoint, and uniformly c.e. Since $\tau^* \in A_n$ for all $\tau \in B_n$,
$$K(n) \le -\log \sum_{\sigma\in A_n} 2^{-|\sigma|} + O(1) \le -\log \sum_{\tau\in B_n} 2^{-|\tau^*|} + O(1) = -\log \sum_{\tau\in B_n} 2^{-K(\tau)} + O(1). \qquad ∎$$
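Recall the remark after Corollary 3.6.2: a computable request sequence of merely finite weight becomes a KC set after a constant shift of its lengths. The shift is easy to compute numerically. A small sketch (the name is ours, and we truncate to finitely many requests for illustration):

```python
import math

def kc_shift(lengths):
    """Given request lengths d_i with w = sum 2^{-d_i} finite, return a
    constant c such that the shifted requests (d_i + c, tau_i) have
    weight w * 2^{-c} <= 1, hence form a KC set."""
    w = sum(2.0 ** -d for d in lengths)
    c = max(0, math.ceil(math.log2(w))) if w > 0 else 0
    assert sum(2.0 ** -(d + c) for d in lengths) <= 1
    return c
```

For example, three requests of length $0$ have weight $3$, and shifting every length by $c = 2$ brings the weight down to $3/4 \le 1$.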
One technique that we will employ several times is to build a KC set and assume, by the recursion theorem, that we have an index for the corresponding prefix-free machine. The legitimacy of this assumption requires a brief comment. It is certainly the case that, when we enumerate a c.e. set of requests L, we can use the recursion theorem to assume that we have an
index for L. If L is a KC set, then the KC Theorem tells us that from this index we can effectively pass to an index for the corresponding prefix-free machine. But if L is not a KC set, then there is no corresponding machine, so we have a problem unless we can ensure that our construction always builds a KC set, regardless of the parameter it is provided. Fortunately, this issue is easily bypassed. Whenever we say that we are building a KC set L using the recursion theorem, we have an additional unstated rule that if we ever wish to add a request to L that would drive its weight above 1, we do not do so, and immediately halt the construction. A construction with this rule always produces a KC set, no matter what the parameter, and in the constructions below, we will always implicitly show that the construction never halts if provided with the correct parameter (i.e., with an index for the machine corresponding to the KC set it builds).
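The bookkeeping rule just described (refuse any request that would drive the weight above $1$, and halt the construction) is easy to make concrete. A minimal sketch, with names of our own choosing and exact dyadic arithmetic via `fractions`:

```python
from fractions import Fraction

class KCSet:
    """Collect KC requests (d, tau), enforcing total weight <= 1.

    Following the convention in the text, a request that would drive
    the weight above 1 is refused and the construction halts, so the
    object always represents a genuine KC set."""
    def __init__(self):
        self.weight = Fraction(0)
        self.requests = []
        self.halted = False

    def request(self, d, tau):
        if self.halted:
            return False
        w = Fraction(1, 2 ** d)
        if self.weight + w > 1:
            self.halted = True  # never exceed weight 1
            return False
        self.weight += w
        self.requests.append((d, tau))
        return True
```

In a construction using the recursion theorem, one would simply route every enumerated request through such a guard; with the correct parameter the guard is never triggered.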
3.7 Basic properties of prefix-free complexity

We begin with the following simple but quite fundamental result.

Proposition 3.7.1. $\sum_{\sigma\in 2^{<\omega}} 2^{-K(\sigma)} \le 1$.

Proof. Since $U$ is prefix-free, $\sum_{\sigma\in 2^{<\omega}} 2^{-K(\sigma)} \le \sum_{\tau\in\operatorname{dom} U} 2^{-|\tau|} \le 1$. ∎

We now show that the upper bound in Proposition 3.6.3 cannot be improved too much.

Proposition 3.7.2. For any $d$, there are $\sigma$ with $K(\sigma) > |\sigma| + \log|\sigma| + d$.

Proof. Suppose that $K(\sigma) \le |\sigma| + \log|\sigma| + d$ for all $\sigma$. Then
$$\sum_\sigma 2^{-K(\sigma)} \ge \sum_{|\sigma|>0} \frac{2^{-|\sigma|-d}}{|\sigma|} = \sum_{n>0} 2^n\,\frac{2^{-n-d}}{n} = 2^{-d}\sum_{n>0}\frac{1}{n} = \infty,$$
contradicting Proposition 3.7.1. ∎

In fact, the argument in the above proof shows the following.

Corollary 3.7.3. Let $f : \mathbb{N} \to \mathbb{N}$ be such that $\sum_{\sigma\in 2^{<\omega}} 2^{-f(|\sigma|)} = \infty$. Then $K(\sigma) > |\sigma| + f(|\sigma|)$ for infinitely many $\sigma$.

The most precise bound on $K(\sigma)$ is given by the following result.

Theorem 3.7.4 (Chaitin [58]). $K(\sigma) \le |\sigma| + K(|\sigma|) + O(1)$.

Proof. Consider the prefix-free machine $M$ that, on input $\mu$, searches for strings $\sigma$ and $\tau$ such that $\mu = \tau\sigma$ and $U(\tau)\downarrow = |\sigma|$, and, if such a pair is found, outputs $\sigma$. Note that, since $U$ is prefix-free, there can be at most one such pair $\sigma, \tau$ for any given $\mu$. Furthermore, it is easy to check that $M$
is prefix-free. Thus, letting $\nu_\sigma$ be a minimal-length $U$-program for $|\sigma|$, we have
$$K(\sigma) \le K_M(\sigma) + O(1) = |\nu_\sigma| + |\sigma| + O(1) = K(|\sigma|) + |\sigma| + O(1). \qquad ∎$$
This result immediately gives us the bound of Proposition 3.6.3, but we can do better by proceeding iteratively. That is, letting $\|\sigma\|$ be the length of the binary representation of $|\sigma|$, we have $K(\sigma) \le |\sigma| + \|\sigma\| + K(\|\sigma\|) + O(1) \le |\sigma| + \log|\sigma| + 2\log\log|\sigma| + O(1)$. Continuing in this way, we have the following.

Corollary 3.7.5. For any $k$ and $\varepsilon > 0$, we have $K(\sigma) \le |\sigma| + \log|\sigma| + \log\log|\sigma| + \cdots + (1+\varepsilon)\log^{(k)}|\sigma| + O(1)$.

Chaitin [58] showed that the bound given in Theorem 3.7.4 is tight, as part of the following result.

Theorem 3.7.6 (Counting Theorem, Chaitin [58]).
(i) $\max\{K(\sigma) : |\sigma| = n\} = n + K(n) \pm O(1)$.
(ii) $|\{\sigma : |\sigma| = n \wedge K(\sigma) \ge n + K(n) - r\}| \le 2^{n-r+O(1)}$, where the constant does not depend on $n$ and $r$.

Of course, part (i) of the Counting Theorem follows from part (ii). The most elegant way to prove part (ii) was given by Chaitin [63], and works by way of the minimality of $K$ as an information content measure, as we now explain. The concept of information content measure is more or less the same as the earlier one of computably enumerable discrete semimeasure, introduced by Levin [243], as we will see in Section 3.9.

Definition 3.7.7 (Chaitin [63], after Levin [243]). An information content measure (i.c.m.) is a partial function $F : 2^{<\omega} \to \mathbb{N}$ such that
1. $\sum_{F(\sigma)\downarrow} 2^{-F(\sigma)} \le 1$ and
2. $\{(\sigma, k) : F(\sigma) \le k\}$ is c.e.

Notice that $K$ is an information content measure. As with prefix-free machines, we can enumerate the information content measures as $F_0, F_1, \ldots$ in such a way that the corresponding sets given in item 2 of Definition 3.7.7 are uniformly c.e. We can then define a minimal information content measure
$$\widehat{K}(\sigma) = \min_{F_k(\sigma)\downarrow} \{F_k(\sigma) + k + 1\}.$$
It is easy to check that $\widehat{K}$ is indeed an information content measure, and that it is minimal in the sense that, for any information content measure $F$, we have $\widehat{K}(\sigma) \le F(\sigma) + O(1)$ for $\sigma \in \operatorname{dom} F$. In particular, $\widehat{K}(\sigma) \le K(\sigma) + O(1)$.
On the other hand, we can easily translate information content measures to prefix-free machines, using the KC Theorem. That is, given an information content measure $F$, the set $\{(k+1, \sigma) : F(\sigma) \le k\}$ is a KC set, so there is a prefix-free machine $M$ such that for each $\sigma$ there is a $\tau$ of length $F(\sigma)+1$ with $M(\tau) = \sigma$. In particular, this fact holds for $F = \widehat{K}$, so $K(\sigma) \le \widehat{K}(\sigma) + O(1)$. Thus we see that $K$ and $\widehat{K}$ are the same up to an additive constant, and so we will henceforth identify $K$ and $\widehat{K}$ without further comment. We note the result we have just proved for future reference.

Theorem 3.7.8 (Chaitin [63]). Let $F$ be an information content measure. Then $K(\sigma) \le F(\sigma) + O(1)$ for $\sigma \in \operatorname{dom} F$.

Chaitin found some clever proofs exploiting the minimality of $K$. Before turning to the proof of the Counting Theorem, we give an example, Chaitin's proof of Theorem 3.7.4. Since
$$\sum_\sigma 2^{-(|\sigma|+K(|\sigma|))} = \sum_n 2^n\, 2^{-(n+K(n))} = \sum_n 2^{-K(n)} \le 1,$$
the map $\sigma \mapsto |\sigma| + K(|\sigma|)$ is an information content measure, so $K(\sigma) \le |\sigma| + K(|\sigma|) + O(1)$ by the minimality of $K$.

Proof of Theorem 3.7.6. As mentioned above, it is enough to prove part (ii). Let $F(n) = \lceil -\log(\sum_{\sigma\in 2^n} 2^{-K(\sigma)}) \rceil$. Then
$$\sum_n 2^{-F(n)} \le \sum_n \sum_{\sigma\in 2^n} 2^{-K(\sigma)} = \sum_{\sigma\in 2^{<\omega}} 2^{-K(\sigma)} \le 1,$$
and $\{(n,k) : F(n) \le k\}$ is c.e. Thus $F$ is an information content measure. (Here we are thinking of $n$ as a binary string by identifying it with its binary representation.) So, by the minimality of $K$, there are $c$ and $\varepsilon > 0$ such that
$$2^{-K(n)} \ge 2^{-F(n)-c} \ge \varepsilon \sum_{\sigma\in 2^n} 2^{-K(\sigma)}.$$
Now suppose that there are more than $2^{n-r+O(1)}$ strings of length $n$ with $K(\sigma) \ge n + K(n) - r$. In other words, suppose that there exists an unbounded binary function $f$ such that, for $S_{n,r} = \{\sigma \in 2^n : K(\sigma) \ge n + K(n) - r\}$, we have $|S_{n,r}| \ge f(n,r)\,2^{n-r}$ for all $n$ and $r$. Then for each $n$ and $r$,
$$2^{-K(n)} \ge \varepsilon \sum_{\sigma\in 2^n} 2^{-K(\sigma)} \ge \varepsilon \sum_{\sigma\in S_{n,r}} 2^{-K(\sigma)} \ge \varepsilon f(n,r)\, 2^{n-r}\, 2^{-n-K(n)+r} = \varepsilon f(n,r)\, 2^{-K(n)}.$$
Since $f(n,r)$ is unbounded, we have a contradiction.⁶ ∎

Notice that the proof above also establishes the following.

Corollary 3.7.9 (Chaitin [63]). There is a constant $c$ such that $2^{-K(n)+c} \ge \sum_{\sigma\in 2^n} 2^{-K(\sigma)}$ for all $n$.

Miller and Yu [279] gave an extension of the Counting Theorem, though for all applications the authors are aware of, the original Counting Theorem suffices.

Theorem 3.7.10 (Miller and Yu [279]). $|\{\sigma \in 2^n : K(\sigma) < n + K(n) - m\}| \le 2^{n-m-K(m\mid n^*)+O(1)}$.

Proof. Let $c$ be as in Corollary 3.7.9 and define
$$\widehat{K}(m \mid \tau) = \begin{cases} n + c - m - \log|\{\sigma \in 2^n : K(\sigma) < n + |\tau| - m\}| & \text{if } U(\tau)\downarrow = n \\ \infty & \text{otherwise.} \end{cases}$$
Note that $\{(k,m) : \widehat{K}(m \mid \tau) < k\}$ is c.e., uniformly in $\tau$. Furthermore,
$$\sum_m 2^{-\widehat{K}(m\mid n^*)} = \sum_m 2^{-n-c}\, 2^m\, |\{\sigma\in 2^n : K(\sigma) < n + K(n) - m\}| \le 2^{-n-c} \sum_{\sigma\in 2^n}\; \sum_{m < n+K(n)-K(\sigma)} 2^m \le 2^{K(n)-c} \sum_{\sigma\in 2^n} 2^{-K(\sigma)} \le 1,$$
where the last inequality follows from Corollary 3.7.9. Therefore, there is a $d$ such that $\forall n\, \forall m\; (K(m \mid n^*) \le \widehat{K}(m \mid n^*) + d)$. So for all $n$ and $m$,
$$2^{-K(m\mid n^*)} \ge 2^{-\widehat{K}(m\mid n^*)-d} = 2^{-n+m-c-d}\, |\{\sigma\in 2^n : K(\sigma) < n + K(n) - m\}|.$$
Multiplying both sides by $2^{n-m+c+d}$ completes the proof. ∎

Miller and Yu [279] showed that their improved Counting Theorem is tight up to a multiplicative constant.

Theorem 3.7.11 (Miller and Yu [279]). $|\{\sigma \in 2^n : K(\sigma) < n + K(n) - m\}| \ge 2^{n-m-K(m\mid n^*)-O(1)}$.

To prove this theorem, we need the following lemma.

⁶It would also be possible to prove this result using the KC Theorem. The key step in the above proof is showing that $\sum_{\sigma\in 2^n} 2^{-K(\sigma)} = O(2^{-K(n)})$. We can do this by monitoring $\lceil -\log(\sum_{\sigma\in 2^n} 2^{-K(\sigma)}) \rceil$, and each time this value drops to a new low $p$, enumerating a KC request $(p, n)$. The details are straightforward.
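The monitoring strategy of footnote 6 can be sketched concretely. Here `stage_sums` is a toy stand-in (our name) for successive nondecreasing approximations to $\sum_{\sigma\in 2^n} 2^{-K(\sigma)}$ for one fixed $n$; a request is emitted exactly when the rounded negative logarithm drops to a new low:

```python
import math

def monitor_requests(stage_sums):
    """Emit the KC request lengths p from footnote 6: each time
    ceil(-log2(current sum)) drops to a new low p, request (p, n)."""
    low = math.inf
    requests = []
    for s in stage_sums:
        p = math.ceil(-math.log2(s))
        if p < low:
            low = p
            requests.append(p)
    return requests
```

Since consecutive emitted lengths strictly decrease, the total weight of the requests is at most twice the final weight $2^{-p}$, which is why the resulting set (summed over all $n$) can be scaled into a KC set.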
Lemma 3.7.12. There is a $c$ such that if $\delta \in 2^n$ ends in at least $m + K(m \mid n^*) + c$ many zeroes, then $K(\delta) < n + K(n) - m$.

Proof. We define a prefix-free machine $M$. By the recursion theorem, we may assume that we know in advance the prefix $\rho$ by which our universal prefix-free machine $U$ simulates $M$. Let $c = |\rho| + 1$. The domain of $M$ consists of strings $\sigma\tau\nu$ for which there are $n, m \in \omega$ with $|\nu| = n - m - |\tau| - c$ such that $U(\sigma)\downarrow = n$ and $U(\tau \mid \sigma)\downarrow = m$. Note that the set of all such strings is prefix-free. For $\sigma\tau\nu$ and $n, m$ as above, define $M(\sigma\tau\nu) = \nu 0^{n-|\nu|}$.

Now fix $n, m \in \omega$ and let $\delta = \nu 0^{m+K(m\mid n^*)+c}$ be a string of length $n$. Let $\sigma = n^*$ and let $\tau$ be a minimal program for $m$ given $n^*$. Note that $|\nu| = n - m - |\tau| - c$, so we have $M(\sigma\tau\nu) = \nu 0^{n-|\nu|} = \delta$. Therefore,
$$K(\delta) \le |\sigma\tau\nu| + c - 1 = K(n) + |\tau| + (n - m - |\tau| - c) + c - 1 = n + K(n) - m - 1,$$
as required. ∎

Proof of Theorem 3.7.11. Let $c$ be the constant from Lemma 3.7.12. That lemma guarantees that there are at least $2^{n-m-K(m\mid n^*)-c}$ many distinct strings of length $n$ with complexity less than $n + K(n) - m$ (assuming that $n - m - K(m \mid n^*) - c \ge 0$). ∎

Thus we see that in certain aspects, such as upper bounds, prefix-free complexity is more difficult to deal with than plain complexity. There are trade-offs, however, in that prefix-free complexity simplifies certain results, such as the following (cf. the discussion beginning in the paragraph before Theorem 3.1.4).

Proposition 3.7.13. $K(\sigma\tau) \le K(\sigma, \tau) \le K(\sigma) + K(\tau) + O(1)$.

This property of prefix-free complexity is known as subadditivity. Proving that it holds is not difficult, but can be a useful exercise. For this and similar facts about prefix-free complexity, see Li and Vitányi [248] or Fortnow [148]. We will sharpen the estimate of Proposition 3.7.13 when we consider symmetry of information in Theorem 3.10.2.
3.8 Prefix-free randomness of strings

There are two possibilities for defining prefix-free randomness of strings. One is to regard $\sigma$ as random if it is random relative to $U$ in the sense of Section 3.1, that is, if $K(\sigma) \ge |\sigma|$. As with $C$-randomness, we relax this condition a bit by choosing a constant $d$ and calling $\sigma$ weakly $K$-random if $K(\sigma) \ge |\sigma| - d$. The other possibility is to regard $\sigma$ as random if its shortest description is essentially as large as possible, that is, if for a fixed $d$, we have $K(\sigma) \ge |\sigma| + K(|\sigma|) - d$. We call such $\sigma$ strongly $K$-random. We will see below that there is a real difference between these two notions of prefix-free randomness. Since every prefix-free machine is a plain
machine, every $C$-random string is weakly $K$-random for some choice of constant. The converse is definitely not true. In fact, in Corollary 6.6.5, we will see that there are infinitely many strings $\sigma$ such that $K(\sigma) \ge |\sigma|$ but $C(\sigma) \le |\sigma| - \log|\sigma|$. On the other hand, in Corollary 4.3.3 we will see that every strongly $K$-random string is $C$-random for some choice of constant.
3.9 The Coding Theorem and discrete semimeasures

Chaitin's information content measures are more or less the same as the computably enumerable discrete semimeasures introduced by Levin [243] (see also Gács [164]), and in a sense Solomonoff [369].

Definition 3.9.1 (Levin [243] (see also Gács [164]), Solomonoff [369]). A discrete semimeasure is a function $m : 2^{<\omega} \to \mathbb{R}^{\ge 0}$ such that $\sum_{\sigma\in 2^{<\omega}} m(\sigma) \le 1$. A discrete semimeasure $m$ is computably enumerable if it is approximable from below; that is, there are uniformly computable functions $m_s : 2^{<\omega} \to \mathbb{Q}$ such that for all $s$ and $\sigma$ we have $m_{s+1}(\sigma) \ge m_s(\sigma)$ and $\lim_s m_s(\sigma) = m(\sigma)$.⁷ A c.e. discrete semimeasure $m$ is maximal if $m'(\sigma) = O(m(\sigma))$ for each c.e. discrete semimeasure $m'$.

Theorem 3.9.2 (Levin [243]). There is a maximal c.e. discrete semimeasure.

Proof. It is not hard to see that there is an effective enumeration $m_0, m_1, \ldots$ of the c.e. discrete semimeasures, and that $m(\sigma) = \sum_n 2^{-n-1} m_n(\sigma)$ is a maximal c.e. discrete semimeasure. ∎

The Coding Theorem, due to Levin [241, 243] and Chaitin [58] (and in a sense Solomonoff [369]), relates prefix-free Kolmogorov complexity and maximal c.e. discrete semimeasures. We present it in a stronger form, which also involves the probability that $U$ outputs a given string. Fix a maximal c.e. discrete semimeasure $m$.

Definition 3.9.3. For a prefix-free machine $M$, let $Q_M(\sigma) = \mu(\{\tau : M(\tau)\downarrow = \sigma\})$. That is, $Q_M(\sigma)$ is the probability that $M$ outputs $\sigma$. We write $Q(\sigma)$ for $Q_U(\sigma)$.

Theorem 3.9.4 (Coding Theorem, Solomonoff [369], Levin [241, 243], Chaitin [58], see also Gács [164]). $K(\sigma) = -\log m(\sigma) \pm O(1) = -\log Q(\sigma) \pm O(1)$.

⁷In the terminology of Definition 5.2.1 below, a discrete semimeasure is c.e. if it is c.e. as a function.
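The weighted mixture in the proof of Theorem 3.9.2 can be stated concretely. A minimal sketch (names are ours), with discrete semimeasures modeled as Python functions; the weights $2^{-(n+1)}$ keep the total mass at most $1$ while the mixture multiplicatively dominates each summand:

```python
def mixture(semimeasures):
    """m(sigma) = sum_n 2^{-(n+1)} m_n(sigma): a discrete semimeasure
    (total mass <= 1) dominating each m_n up to a multiplicative
    constant, as in the proof of Theorem 3.9.2."""
    def m(sigma):
        return sum(mn(sigma) / 2 ** (n + 1)
                   for n, mn in enumerate(semimeasures))
    return m
```

Of course, the real proof works with an effective enumeration of all c.e. discrete semimeasures and their stage approximations; this sketch only exhibits the domination.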
Proof. The function $Q$ is a c.e. discrete semimeasure, so $Q(\sigma) = O(m(\sigma))$, whence $-\log m(\sigma) \le -\log Q(\sigma) + O(1)$. Since $U(\sigma^*) = \sigma$, we have $Q(\sigma) \ge 2^{-|\sigma^*|} = 2^{-K(\sigma)}$, so $-\log Q(\sigma) \le K(\sigma)$. Finally, $\lceil -\log m(\sigma) \rceil$ is an information content measure, so $K(\sigma) \le \lceil -\log m(\sigma) \rceil + O(1) \le -\log m(\sigma) + O(1)$.⁸ ∎

One informal interpretation of the Coding Theorem is that if a string has many long descriptions then it also has a short description.
3.10 Prefix-free symmetry of information

As with plain complexity, we can explore the notion of symmetry of information for prefix-free Kolmogorov complexity. In the same spirit as for $C$, we may define the prefix-free information content of $\tau$ in $\sigma$ as $I_K(\sigma : \tau) = K(\tau) - K(\tau \mid \sigma)$. For the rest of the section, we will drop the subscript $K$ from $I_K$. First note that the results of Section 3.3 hold for $K$ in place of $C$, since $K$ and $C$ agree up to a log factor. In addition, we have the following.

Theorem 3.10.1 (Symmetry of Information, Levin and Gács [164], Chaitin [58]). $I(\sigma, K(\sigma) : \tau) = I(\tau, K(\tau) : \sigma) \pm O(1)$.

Note that given the relationship $K(\rho, K(\rho)) = K(\rho^*) \pm O(1)$, the Symmetry of Information Theorem for prefix-free complexity may be neatly rewritten as $I(\sigma^* : \tau) = I(\tau^* : \sigma) \pm O(1)$. As with the $C$ case, the Symmetry of Information Theorem follows from a reformulated version.

Theorem 3.10.2 (Symmetry of Information, Restated). $K(\sigma, \tau) = K(\sigma) + K(\tau \mid \sigma, K(\sigma)) \pm O(1) = K(\sigma) + K(\tau \mid \sigma^*) \pm O(1)$.

Theorem 3.10.2 gives us Theorem 3.10.1 as follows:
$$K(\sigma) + K(\tau \mid \sigma, K(\sigma)) = K(\sigma, \tau) \pm O(1) = K(\tau, \sigma) \pm O(1) = K(\tau) + K(\sigma \mid \tau, K(\tau)) \pm O(1),$$
and hence
$$K(\tau) - K(\tau \mid \sigma, K(\sigma)) = K(\sigma) - K(\sigma \mid \tau, K(\tau)) \pm O(1).$$
We now turn to the proof of Theorem 3.10.2.

⁸Technically,
$\lceil -\log m(\sigma) \rceil$ may not quite be an i.c.m., since we might not have $\lim_s \lceil -\log m_s(\sigma) \rceil = \lceil -\log m(\sigma) \rceil$ if $-\log m(\sigma)$ is an integer. But the function that takes $\sigma$ to $1 + \lim_s \lceil -\log m_s(\sigma) \rceil$ is an i.c.m., and is within $O(1)$ of $-\log m(\sigma)$.
Proof of Theorem 3.10.2. The second equality is clear, so it is enough to show that $K(\sigma, \tau) = K(\sigma) + K(\tau \mid \sigma^*) \pm O(1)$.

We first prove that $K(\sigma, \tau) \le K(\sigma) + K(\tau \mid \sigma^*) + O(1)$. Let $\rho$ be a minimal-length description of $\tau$ given $\sigma^*$. Then we can construct a prefix-free machine $M$ that, on input $\sigma^*\rho$, first computes $\sigma$ from $\sigma^*$, then computes $\tau$ from $\sigma^*$ and $\rho$, and finally outputs $\sigma, \tau$.

We now prove that $K(\sigma, \tau) \ge K(\sigma) + K(\tau \mid \sigma^*) - O(1)$. To do this, we prove that $K(\tau \mid \sigma^*) \le K(\sigma, \tau) - K(\sigma) + O(1)$. Let $\nu_0, \nu_1, \ldots$ be a computable ordering of the domain of $U$ and let $\sigma_s$ and $\tau_s$ be such that $U(\nu_s) = \sigma_s, \tau_s$. Let $W_\sigma = \{s : \sigma_s = \sigma\}$. Given $n$ and $\sigma$, build a KC set by enumerating requests $(|\nu_s| - n, \tau_s)$ for $s \in W_\sigma$, as long as the weight of these requests does not exceed $1$. Call the resulting prefix-free machine $M_{n,\sigma}$. By the Coding Theorem 3.9.4, there is a constant $c$ such that
$$2^{K(\sigma)-c} \sum_\tau Q(\sigma, \tau) \le 1$$
for all $\sigma$, where $Q$ is as in Definition 3.9.3. (To see this, consider the machine $V$ that has $V(\nu) = \sigma$ whenever $U(\nu) = \sigma, \tau$ for some $\tau$. Then $Q_V(\sigma) = \sum_\tau Q(\sigma, \tau)$, so $\sum_\tau Q(\sigma, \tau) \le O(Q(\sigma)) \le O(2^{-K(\sigma)})$.) Since
$$\sum_{s\in W_\sigma} 2^{-(|\nu_s| - |\sigma^*| + c)} = 2^{K(\sigma)-c} \sum_\tau Q(\sigma, \tau) \le 1,$$
all relevant requests will be enumerated for the machine $M_{K(\sigma)-c,\sigma}$.

Now define the oracle prefix-free machine $M$ as follows. With $\rho$ on the oracle tape, $M$ first looks for a $\sigma$ such that $U(\rho) = \sigma$. Then it simulates $M_{|\rho|-c,\sigma}$. If $M$ has $\sigma^*$ on its oracle tape, then it will simulate $M_{K(\sigma)-c,\sigma}$. Since there is an $s \in W_\sigma$ such that $\nu_s = (\sigma, \tau)^*$, the KC set defining this machine has as one of its requests $(K(\sigma,\tau) - K(\sigma) + c, \tau)$, so
$$K(\tau \mid \sigma^*) \le K_M(\tau \mid \sigma^*) + O(1) \le K(\sigma, \tau) - K(\sigma) + O(1). \qquad ∎$$

It is sometimes useful to apply symmetry of information to $\sigma\tau$ rather than $\sigma, \tau$. By Theorem 3.10.2, $K(\sigma, \tau) = K(\sigma\tau, \sigma) \pm O(1) = K(\sigma\tau) + K(\sigma \mid (\sigma\tau)^*) \pm O(1)$, so we have the following.

Corollary 3.10.3. $K(\sigma\tau) = K(\sigma, \tau) - K(\sigma \mid (\sigma\tau)^*) \pm O(1) = K(\sigma) + K(\tau \mid \sigma^*) - K(\sigma \mid (\sigma\tau)^*) \pm O(1)$.
A similar result holds for $C$ in place of $K$. We finish this section with a useful "mixed" theorem for the complexity of $\sigma\tau$, relating $C$ and $K$ in the spirit of Chapter 4.

Proposition 3.10.4 (Folklore, see [308]). $C(\sigma\tau) \le K(\sigma) + C(\tau) + O(1)$.

Proof. Let $V$ be a universal plain machine. Define a plain machine $M$ as follows. On input $\rho$, look for $\rho_0$ and $\rho_1$ such that $\rho_0\rho_1 = \rho$ and both $U(\rho_0)\downarrow$ and $V(\rho_1)\downarrow$. If found, then output $U(\rho_0)V(\rho_1)$. Then $C(\sigma\tau) \le C_M(\sigma\tau) + O(1) \le K(\sigma) + C(\tau) + O(1)$. ∎
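The machine $M$ in the proof of Proposition 3.10.4 can be modeled directly. In this toy sketch (our model: machines are partial functions returning `None` on divergence), prefix-freeness of $U$ is what guarantees that at most one split of the input can succeed, so the output is well defined:

```python
def concat_machine(U, V, rho):
    """Plain machine M of Proposition 3.10.4: search for a split
    rho = rho0 rho1 with U(rho0) and V(rho1) both defined, and output
    their concatenation. Since U is prefix-free, at most one prefix
    of rho lies in dom U."""
    for i in range(len(rho) + 1):
        a, b = U(rho[:i]), V(rho[i:])
        if a is not None and b is not None:
            return a + b
    return None  # M diverges on rho
```

Running $M$ on $\sigma^*$ followed by a minimal $V$-program for $\tau$ produces $\sigma\tau$, which is exactly the inequality $C(\sigma\tau) \le K(\sigma) + C(\tau) + O(1)$.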
3.11 Initial segment complexity of sets

It follows from Theorem 3.1.4 that there is no set $A$ such that $C(A \upharpoonright n) \ge n - O(1)$. Recall that the idea of the proof of Theorem 3.1.4 was to take a string $\sigma\tau$ such that the length of $\tau$ codes $\sigma$, and use $\tau$ alone to describe $\sigma\tau$. This reasoning is refined in the following results of Martin-Löf [260], the proofs here being drawn from Li and Vitányi [248].

Lemma 3.11.1 (Martin-Löf [260], see Li and Vitányi [248] and Staiger [376]). Let $f$ be a computable function such that $\sum_n 2^{-f(n)} = \infty$. Then for any set $A$ there are infinitely many $n$ such that $C(A \upharpoonright n \mid n) \le n - f(n)$.

Proof. Let $g$ be a computable function such that $\sum_n 2^{-g(n)} = \infty$ and $\lim_n g(n) - f(n) = \infty$. It is easy to see that such a function must exist. Let $G(n) = \sum_{i < n} 2^{-g(i)}$. Think of $[0,1]$ as a circle with $0$ and $1$ identified and let $I_n$ be the interval $[G(n), G(n+1)) \bmod 1$ on this circle. Let $C_n = \{\sigma \in 2^n : [\sigma] \cap I_n \ne \emptyset\}$, where we think of $[\sigma]$ as a subinterval of $[0,1]$. Note that the $C_n$ are uniformly computable and $|C_n| \le 2^n(G(n+1) - G(n)) + 2$. Thinking of $A$ as a real, there are infinitely many $n$ such that $A \in I_n$, since $\lim_n G(n) = \infty$. Thus there are infinitely many $n$ such that $A \upharpoonright n \in C_n$. Given $n$, we can describe any element of $C_n$ by its position in $C_n$, ordered lexicographically. Thus there are infinitely many $n$ such that $C(A \upharpoonright n \mid n) \le \log|C_n| + O(1) \le \log(2^n(G(n+1) - G(n))) + O(1) = n - g(n) + O(1)$. For all such $n$ that are sufficiently large, $C(A \upharpoonright n \mid n) \le n - f(n)$. ∎

Theorem 3.11.2 (Li and Vitányi [248], after Martin-Löf [260]). Let $f$ be a computable function such that $\sum_n 2^{-f(n)} = \infty$ and $C(n \mid n - f(n)) = O(1)$. Then for any set $A$ there are infinitely many $n$ such that $C(A \upharpoonright n) \le n - f(n)$.

Proof. It is enough to show that there are infinitely many $n$ such that $C(A \upharpoonright n) \le n - f(n) + O(1)$, where the constant term does not depend
on $f$, because if $f$ satisfies the hypotheses of the theorem, then so does $n \mapsto f(n) - c$ for any constant $c$. Let $\sigma_n$ be a minimal-length program for $A \upharpoonright n$ given $n$. Let $\tau_n$ be a minimal-length program for $n$ given $n - f(n)$, and let $t_n$ be the position of $\tau_n$ in a fixed effective list of finite strings. Let $\mu_n = 1^{t_n} 0 \sigma_n$. By Lemma 3.11.1, there are infinitely many $n$ such that $|\sigma_n| \le n - f(n)$. Since $t_n$ is bounded, there is a constant $c$ and infinitely many $n$ such that $|\mu_n| \le n - f(n) + c$. For such $n$, let $\nu_n = 0^{n-f(n)+c-|\mu_n|} \mu_n$. Then $|\nu_n| = n - f(n) + c$. We can now easily define a Turing machine, which does not depend on $f$, that on input $\nu_n$ first obtains $n - f(n)$ from $|\nu_n|$, then decodes $\tau_n$ and obtains $n$, then decodes $\sigma_n$ and obtains $A \upharpoonright n$. Thus there are infinitely many $n$ such that $C(A \upharpoonright n) \le |\nu_n| + O(1) = n - f(n) + O(1)$, where the constant term does not depend on $f$. ∎

Corollary 3.11.3. For any set $A$ there are infinitely many $n$ such that $C(A \upharpoonright n) \le n - \log n$.

Proof. The function $n - \log n \mapsto n$ is computable, so $C(n \mid n - \log n) = O(1)$. Furthermore, $\sum_n 2^{-\log n} = \infty$, so we can apply Theorem 3.11.2. ∎

There has been much work on $C$-complexity oscillations related to the above results. See Li and Vitányi [248, page 138] for more on this topic. The following is a prefix-free analog of the above results.

Theorem 3.11.4 (after Solovay [371]). If $g$ is computable and $\sum_n 2^{-g(n)} = \infty$, then for every set $X$ there are infinitely many $n$ such that $K(X \upharpoonright n) \le n + K(n) - g(n) + O(1)$.

Proof. Let $I_0, I_1, \ldots \subset \mathbb{N}$ be a uniformly computable sequence of pairwise disjoint finite sets such that $\sum_{n \in I_j} 2^{-g(n)} \ge 1$ for all $j$. Let $m_j = \max\{g(n) : n \in I_j\}$. Effectively split $2^{m_j}$ into (not necessarily pairwise disjoint) sets $S^j_n$ for $n \in I_j$ so that $|S^j_n| = 2^{m_j - g(n)}$ and every element of $2^{m_j}$ is in at least one $S^j_n$. For a given $X$ and $j$, let $n \in I_j$ be such that $X$ extends a string in $S^j_n$. Then we can describe $X \upharpoonright n$ given $n$ by specifying which of the $2^{m_j - g(n)}$ many strings in $S^j_n$ is extended by $X$, and then giving $X \upharpoonright [m_j, n)$. Thus $K(X \upharpoonright n \mid n) \le m_j - g(n) + n - m_j + O(1) = n - g(n) + O(1)$, and hence $K(X \upharpoonright n) \le n + K(n) - g(n) + O(1)$. ∎
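The splitting step in the proof above can be made concrete. A sketch (names are ours) that allocates the sets $S^j_n$ by walking around $2^{m_j}$ with wraparound; full coverage relies on the hypothesis $\sum_{n \in I_j} 2^{-g(n)} \ge 1$:

```python
def split_covers(I_j, g):
    """Split the strings of length m_j = max g(n) into sets S[n] (n in I_j)
    with |S[n]| = 2^{m_j - g(n)}, every string of length m_j landing in at
    least one S[n], as in the proof of Theorem 3.11.4."""
    m = max(g(n) for n in I_j)
    assert sum(2.0 ** -g(n) for n in I_j) >= 1  # needed for full coverage
    total = 2 ** m
    S, pos = {}, 0
    for n in I_j:
        size = 2 ** (m - g(n))
        # wrap around so the blocks jointly cover all of 2^{m_j}
        S[n] = [format((pos + i) % total, f"0{m}b") for i in range(size)]
        pos += size
    return S
```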
3.12 Computable bounds for prefix-free complexity

Ignoring additive constants, we have seen that for plain complexity, there is a computable upper bound on $C(\sigma)$, namely $|\sigma|$, that is achieved infinitely often. The analogous bound on $K(\sigma)$, however, is $|\sigma| + K(|\sigma|)$, which is not computable, and approximations to this bound, such as $|\sigma| + 2\log|\sigma|$ or $|\sigma| + \log|\sigma| + 2\log^{(2)}|\sigma|$, are not achieved infinitely often. It is tempting
to conjecture that there is no computable upper bound on $K(\sigma)$ that is achieved infinitely often. This conjecture is false, however.

Theorem 3.12.1 (Solovay [371], see also Gács [163]). There is a computable function $g : 2^{<\omega} \to \mathbb{N}$ such that
(i) $K(\sigma) \le g(\sigma)$ for all $\sigma$, and
(ii) $K(\sigma) = g(\sigma)$ for infinitely many $\sigma$.

Proof. It will be notationally convenient to build $g : \mathbb{N} \to \mathbb{N}$ so that $K(n) \le g(n)$ for almost all $n$ and $K(n) = g(n)$ for infinitely many $n$. Since we can identify binary strings with natural numbers via Gödel numbers, this construction will be enough to establish the theorem. We will in fact first build an auxiliary function $\widehat{g}$, then modify it slightly to obtain $g$.

We can construct a prefix-free machine $M$ such that, for each $\tau$ and $n$, if $s$ is least such that $U(\tau)[s] \downarrow = n$, then $M(\tau) \downarrow = \langle n, s\rangle$. Since $U$ is universal, there is a constant $c$ such that, for each such $\tau, n, s$, we have $U(\mu) \downarrow = \langle n, s\rangle$ for some $\mu$ with $|\mu| \le |\tau| + c$. We can now define a computable function $f$ such that for all $\tau, n, s$, if $s$ is least such that $U(\tau)[s] \downarrow = n$, then $U(\mu)[f(s)] \downarrow = \langle n, s\rangle$ for some $\mu$ with $|\mu| \le |\tau| + c$. Thus, if $s$ is least such that $K(n) = K_s(n)$, then $K_{f(s)}(\langle n, s\rangle) \le K(n) + c$.

Now define $\widehat{g}$ as follows. Given $x$, find the unique $n$ and $s$ such that $x = \langle n, s\rangle$, and let $\widehat{g}(x) = K_{f(s)}(x)$. Clearly, $K(x) \le \widehat{g}(x)$ for all $x$. On the other hand, suppose that $x$ is one of the infinitely many integers of the form $\langle n, s\rangle$ where $s$ is least such that $K(n) = K_s(n)$. Then $\widehat{g}(x) = K_{f(s)}(x) \le K(n) + O(1) \le K(x) + O(1)$, the latter inequality coming from the fact that the function $\langle n, s\rangle \mapsto n$ is computable. Thus there is a constant $d \in \mathbb{N}$ such that $K(x) = \widehat{g}(x) - d$ for infinitely many $x$ and $K(x) \le \widehat{g}(x) - d$ for all but finitely many $x$. Define $g(m) = \widehat{g}(m) - d$, then modify finitely many values of $g$ to ensure that $K(m) \le g(m)$ for all $m$. ∎

Note that there is a critical difference between the computable bound $|\sigma| + O(1)$ for $C(\sigma)$ and the computable bound $g(\sigma)$ of Theorem 3.12.1. For the appropriate choice of constant $c$, we know that for each $n$ there is some string $\sigma$ of length $n$ with $C(\sigma) = |\sigma| + c$. For $g$, however, we know only that there are infinitely many $\sigma$ with $K(\sigma) = g(\sigma)$, and apparently only $\emptyset'$ can figure out where these $\sigma$ are. This difference will be important, especially in Chapter 11, where we discuss a class of sets (the $K$-trivial sets) that show that Chaitin's Theorem 3.4.4 fails for $K$ in place of $C$.

Bienvenu, Downey, and Merkle [39, 40] began a systematic study of computable upper bounds for $K$. These bounds admit a simple characterization.

Lemma 3.12.2 (Bienvenu and Downey [39]). Let $f : \mathbb{N} \to \mathbb{N}$ be computable. The following are equivalent:
(i) $K(n) \le f(n) + O(1)$.
(ii) $\sum_n 2^{-f(n)} < \infty$.
Proof. (i) ⇒ (ii) is trivial, as $\sum_n 2^{-K(n)} \le 1$. The other direction follows by the minimality of $K$ as an information content measure, Theorem 3.7.8. ∎

Definition 3.12.3 (Bienvenu and Merkle [40]). A computable function $f$ is a Solovay function if $\sum_n 2^{-f(n)} < \infty$ and $\liminf_n f(n) - K(n) < \infty$ (in other words, there is a $c$ such that $f(n) \le K(n) + c$ for infinitely many $n$).

As we will see in Theorem 9.4.1, Solovay functions are exactly those computable functions $f$ for which $\sum_n 2^{-f(n)}$ is a 1-random real, as defined in Chapter 6. They will play a part in several chapters below. We will refer to the Solovay function built in the proof of Theorem 3.12.1 as "Solovay's Solovay function".

For a computable function $t$, the prefix-free Kolmogorov complexity of $\sigma$ with time bound $t$, denoted by $K^t(\sigma)$, is $\min\{|\tau| : U(\tau)[t(|\sigma|)] \downarrow = \sigma\}$, where $U$ is a universal prefix-free function that efficiently simulates every partial computable prefix-free function $f$, in the sense that there are a string $\rho_f$ and a constant $c_f$ such that for all $\sigma$ and $s$, if $f(\sigma)[s] \downarrow$ then $U(\rho_f \sigma)[c_f s] \downarrow = f(\sigma)$, while if $f(\sigma) \uparrow$ then $U(\rho_f \sigma) \uparrow$. Letting $\widetilde{K}$ be the function $K_{f(s)}$ from the proof of Theorem 3.12.1, we see that there is a quadratic-time version⁹ $\widetilde{K}$ of $K$ such that $\widetilde{K}(n) = K(n)$ for infinitely many $n$. Hölzl, Kräling, and Merkle [183] showed that this fact has the following consequence, whose proof is along the lines of that of Theorem 3.12.1.

Theorem 3.12.4 (Hölzl, Kräling, and Merkle [183]). If $t$ is any superlinear time bound, then the time-bounded version $K^t$ of $K$ is always a Solovay function.
3.13 Universal machines and halting probabilities

In this section, we prove results of Solovay [371] giving estimates of the sizes of some basic sets associated with $U$. Recall that $\sigma^*$ is the first string $\tau$ of length $K(\sigma)$ such that $U(\tau)\downarrow = \sigma$. (A more precise definition was given in Section 3.5.) Recall also that we write $f \sim g$ to mean that $f(n) = O(g(n))$ and $g(n) = O(f(n))$. We slightly abuse notation by writing, for instance, $f(n) \sim 2^n$ to mean that $f \sim n \mapsto 2^n$. We begin with the following fixed-point result for $U$.

Lemma 3.13.1 (Solovay [371]). For each $n$ there are at least $2^{n-K(n)-O(1)}$ many strings $\sigma$ of length $n$ such that $U(\sigma)\downarrow = \sigma$.

Proof. We define a prefix-free machine $M$ with coding string $\rho$ given by the recursion theorem. On input $\sigma$, the machine $M$ looks for $\tau \in \operatorname{dom} U$

⁹Superlinear would work as well.
and $\mu$ such that $\sigma = \tau\mu$. Then $M$ interprets $U(\tau)$ as a natural number $n$. If $|\mu| < n - |\rho| - |\tau|$ then $M$ does not halt. Otherwise, $M$ outputs $\rho\tau(\mu \upharpoonright (n - |\rho| - |\tau|))$.

Given $n$, let $S_n = \{\rho n^* \nu : |\nu| = n - |\rho| - K(n)\}$. If $\rho n^* \nu \in S_n$ then $U(\rho n^* \nu) = M(n^* \nu) = \rho n^* \nu$, and $|S_n| = 2^{n - |\rho| - K(n)} = 2^{n - K(n) - O(1)}$. ∎

Theorem 3.13.2 (Solovay [371]). Let
(i) $p(n) = |\{\sigma \in 2^n : U(\sigma)\downarrow\}|$,
(ii) $P(n) = |\{\sigma \in 2^{\le n} : U(\sigma)\downarrow\}|$,
(iii) $p'(n) = |\{\sigma : K(\sigma) = n\}|$, and
(iv) $d(n) = |\{\sigma : K(\sigma) \le n\}|$.
Then $p(n) \sim P(n) \sim d(n) \sim 2^{n-K(n)}$. Furthermore, there is a number $k$ such that $\sum_{n \le j \le n+k} p'(j) \sim 2^{n-K(n)}$.¹⁰

Proof. Let $A_n = \{\sigma \in 2^n : U(\sigma)\downarrow\}$. The $A_n$ are pairwise disjoint, uniformly c.e. subsets of the prefix-free set $\operatorname{dom} U$. By Proposition 3.6.4,
$$K(n) \le -\log \sum_{\sigma \in A_n} 2^{-|\sigma|} + O(1) = -\log(p(n)\, 2^{-n}) + O(1),$$
so $p(n) = O(2^{n-K(n)})$. By Lemma 3.13.1, in fact $p(n) \sim 2^{n-K(n)}$.

Since $p(n) \le P(n)$, we have the lower bound $P(n) \ge 2^{n-K(n)-O(1)}$. For the upper bound, first notice that there is a $c$ such that
$$P(n) = \sum_{j \le n} p(n-j) \le \sum_{j \le n} c\, 2^{n-j-K(n-j)}.$$
Now, $K(n) \le K(n-j) + K(j) + O(1)$, so $-K(n-j) \le -K(n) + 2\log j + O(1)$. Thus there is a $c'$ such that
$$P(n) \le c' \sum_{j \le n} 2^{n-j-K(n)+2\log j} = c'\, 2^{n-K(n)} \sum_j j^2\, 2^{-j} = O(2^{n-K(n)}),$$
as $\sum_j j^2\, 2^{-j}$ converges. So $P(n) = O(2^{n-K(n)})$.

Clearly, $d(n) \le P(n) = O(2^{n-K(n)})$, and it follows from Lemma 3.13.1 that $d(n) \ge 2^{n-K(n)-O(1)}$, so $d(n) \sim 2^{n-K(n)}$.

Finally, we consider $p'(n)$. For a fixed $k$, we have the upper bound $\sum_{n \le j \le n+k} p'(j) \le P(n+k) = O(2^{n+k-K(n+k)}) = O(2^{n-K(n)})$. We now establish the necessary lower bound.
10 Solovay
remarked that the machine-dependence of the definition of p(n) means that we cannot have an estimate such as p(n) ∼ 2n−K(n) in general. For instance, it is easy to define a universal prefix-free machine relative to which K(σ) is always odd. Solovay also remarked that he did not know of a natural universal prefix-free machine relative to which p(n) ∼ 2n−K(n) .
Since d(n + k) ≥ Ω(2^{n+k−K(n+k)}), there is a c such that, for any n and k, we have d(n + k) ≥ c2^{n+k−K(n)−O(log k)}. Also, there is a c′ such that d(n) ≤ c′2^{n−K(n)} for all n. Let k be large enough so that c2^{k−O(log k)} ≥ 2c′. Then d(n + k) ≥ d(n) + c′2^{n−K(n)} for all n. Thus, there are at least c′2^{n−K(n)} many strings σ for which n < K(σ) ≤ n + k, and hence ∑_{n≤j≤n+k} p′(j) ≥ c′2^{n−K(n)}.

As pointed out by Solovay, there is another way to look at the last part of Theorem 3.13.2. First we need a lemma.

Lemma 3.13.3. There is a c such that if m ≥ c, then for each σ there is a τ ∈ 2^{K(σ)+m} with U(τ) = σ.

Proof. We define a prefix-free machine M with coding string ρ given by the recursion theorem. On input σ, the machine M looks for τ, μ ∈ dom U and ν such that σ = τμν. Then M interprets U(μ) as a natural number m. If |ν| ≠ m − |μ| − |ρ|, then M does not halt. Otherwise, M outputs U(τ).

Let c be large enough so that m ≥ K(m) + |ρ| for all m ≥ c. Then if m ≥ c and ν is any string of length m − K(m) − |ρ|, we have |ρσ*m*ν| = |ρ| + K(σ) + K(m) + (m − K(m) − |ρ|) = K(σ) + m and U(ρσ*m*ν) = M(σ*m*ν) = σ.

Say that τ is a d-minimal program for σ if U(τ) = σ and |τ| ≤ K(σ) + d, and that τ is a d-minimal program if it is a d-minimal program for some σ. We claim that there is a d such that the number of d-minimal programs of length n is ∼ 2^{n−K(n)}. This number is bounded above by the function p(n) from Theorem 3.13.2, and hence is no greater than O(2^{n−K(n)}). Now let c be as in Lemma 3.13.3 and let k be such that ∑_{n≤j≤n+k} p′(j) ∼ 2^{n−K(n)}, as in Theorem 3.13.2. If n ≤ K(σ) ≤ n + k, then by Lemma 3.13.3 there is a U-program for σ of length exactly n + k + c. Each one of these programs is (k + c)-minimal. So the claim holds with d = k + c.

We now turn to further results of Solovay [371] concerning the sets D_n = {σ ∈ 2^{≤n} : U(σ) ↓}. We have P(n) = |D_n| as in Theorem 3.13.2. Since the D_n are uniformly c.e., we can uniformly compute D_n given n and P(n).
Thus K(D_n) ≤ K(n, P(n)) + O(1) ≤ K(n) + K(P(n) | n*) + O(1).¹¹ By Theorem 3.13.2, there is a c such that P(n) ≤ 2^{n−K(n)+c}, so if we know n − K(n) then we can describe P(n) with a string of length n − K(n) + c. Furthermore, K(n − K(n) | n*) = O(1), so K(P(n) | n*) ≤ n − K(n) + K(n − K(n) | n*) + O(1) = n − K(n) + O(1). Thus K(D_n) ≤ K(n) + n − K(n) + O(1) = n + O(1). We now show that this estimate is sharp.

¹¹For a finite set F, we can take K(F) to be the minimum of K(n) over all n such that F is equal to the canonical finite set D_n. We can similarly define F*.
Theorem 3.13.4 (Solovay [371]). K(D_n) = n ± O(1).

Proof. By the above argument, it is enough to show that K(D_n) ≥ n − O(1). We define a prefix-free machine M with coding string ρ given by the recursion theorem. On input σ, the machine M looks for τ ∈ dom U and μ such that σ = τμ. Then it interprets U(τ) as a finite set of strings E (via a fixed encoding of finite sets of strings into strings). Let m be the length of the longest string in E. If m − |ρ| − |τ| < 0 or |μ| < m − |ρ| − |τ|, then M does not halt. Otherwise, M checks whether ρτ(μ ↾ (m − |ρ| − |τ|)) ∈ E. If not, then M outputs 0, and otherwise M does not halt.

Suppose that K(D_n) < n − |ρ|, let τ = (D_n)*, and let μ be a string of length n − |ρ| − K(D_n). Then ρτμ ∈ D_n iff τμ ∈ dom M iff ρτμ ∉ E iff ρτμ ∉ D_n, which is a contradiction. Thus K(D_n) ≥ n − |ρ| for all n.

Corollary 3.13.5 (Solovay [371]). K((D_n)* | D_n) = O(1).

Proof. We can determine (D_n)* given K(D_n) and n, so K((D_n)* | D_n) ≤ K(K(D_n) | D_n) + O(1). But K(D_n) = n ± O(1), so K(K(D_n) | D_n) = K(n | D_n) ± O(1) = O(1), since n is the length of the longest string in D_n, and hence we can determine n given D_n.

The sets D_n are closely related to Chaitin's famous number Ω, whose properties we will further explore in Section 6.1. (See Calude and Chaitin [48] for an expository discussion of this number.)

Definition 3.13.6 (Chaitin [58]). The halting probability Ω of U is μ(dom U) = ∑_{σ∈dom U} 2^{−|σ|}.

Of course, the value of Ω depends on the choice of U, much as ∅′ depends on the choice of enumeration of the partial computable functions. We have seen in Myhill's Theorem 2.4.12 that all versions of ∅′ are the same up to computable permutations of ℕ. In Section 9.2 we will see that, as with ∅′, all versions of Ω are quite closely related, and it does make sense to ignore the machine-dependence of Ω in most circumstances.

Let Ω_s = ∑_{σ∈dom U[s]} 2^{−|σ|}. Note that Ω = lim_s Ω_s.

Theorem 3.13.7 (Solovay [371]). (i) K(D_n | Ω ↾ n) = O(1).
(Indeed, D_n can be computed from Ω ↾ n.)

(ii) K(Ω ↾ n | D_{n+K(n)}) = O(1).

Proof. (i) It is easy to see that Ω is not computable (since otherwise we would be able to compute the domain of U). So in particular Ω ∉ ℚ, which implies that, for each n, we must have Ω_s ↾ n = Ω ↾ n for some s. Given Ω ↾ n, look for such an s. Then σ ∈ D_n iff σ ∈ dom U[s].

(ii) Given n, let D = D_{n+K(n)}. We may assume that n is large enough so that D ≠ ∅. Using D, we can find the least j such that j + K(j) = n + K(n). We have |n − j| = |K(n) − K(j)| ≤ K(|n − j|) + O(1) ≤ 2 log |n − j| + O(1), so |n − j| = O(1). Thus K(n | D) = O(1).
We define a prefix-free machine M with coding string ρ given by the recursion theorem. On input σ, the machine M looks for τ ∈ dom U and μ such that σ = τμ. Then M interprets U(τ) as a number n. If |μ| ≠ n then M does not halt. Otherwise, M interprets μ as a number m ∈ [0, 2^n) and looks for an s such that Ω_s ≥ m2^{−n}. If such an s is found, then M outputs 0, and otherwise M does not halt.

Let c be such that |ρ| + K(n − c) + n − c ≤ n + K(n) for all sufficiently large n, which must exist because K(n − d) ≤ K(n) + O(log d). Then Ω ≥ m2^{−(n−c)} iff (n − c)*μ ∈ dom M, where μ ∈ 2^{n−c} codes m as a number in [0, 2^{n−c}). Since |ρ(n − c)*μ| = |ρ| + K(n − c) + n − c ≤ n + K(n), it follows that Ω ≥ m2^{−(n−c)} iff ρ(n − c)*μ ∈ D. Since K(n | D) = O(1), we have K(Ω ↾ (n − c) | D) = O(1), so K(Ω ↾ n | D) = O(1).
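The mechanism behind part (i) can be mimicked on a finite example. In the sketch below, the staged domain dom_by_stage is a made-up stand-in for an enumeration of dom U, and Ω is simply the final value of the toy approximation; the point is only to illustrate how a good enough approximation Ω_s determines D_n.

```python
from fractions import Fraction

# Stages of a toy enumeration of the domain of a prefix-free machine;
# the programs and stages are invented for illustration.
dom_by_stage = [set(), {"11"}, {"11", "000"}, {"11", "000", "001"},
                {"11", "000", "001", "01"}]

def omega_s(s):
    """Omega_s: the measure 2^(-|p|) summed over programs halted by stage s."""
    return sum(Fraction(1, 2 ** len(p)) for p in dom_by_stage[s])

vals = [omega_s(s) for s in range(len(dom_by_stage))]
# Omega_s is nondecreasing and converges to Omega = mu(dom U);
# here Omega is just the final value of the toy enumeration.
assert all(a <= b for a, b in zip(vals, vals[1:]))
omega = vals[-1]

# Once Omega - Omega_s < 2^(-n), no program of length <= n can still halt
# (it would add at least 2^(-n) to the sum), so D_n is dom U[s] restricted
# to lengths <= n. That is the idea behind Theorem 3.13.7(i).
n = 2
s = next(s for s in range(len(vals)) if omega - vals[s] < Fraction(1, 2 ** n))
D_n = {p for p in dom_by_stage[s] if len(p) <= n}
assert D_n == {p for p in dom_by_stage[-1] if len(p) <= n}
```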
3.14 The conditional complexity of σ* given σ

In this section, we investigate how difficult it is to produce σ* given σ. This question is equivalent to asking how difficult it is to produce K(σ) given σ, as shown by the following lemma, which has the same proof as the analogous Proposition 3.2.1.

Proposition 3.14.1. K(σ* | σ) = K(K(σ) | σ) ± O(1).

It follows from this result that K(σ* | σ) ≤ log |σ| + O(log^{(2)} |σ|). The following result shows that this bound is not too far from optimal.

Theorem 3.14.2 (Gács [164], Solovay [371]). For all sufficiently large n, there is a σ ∈ 2^n such that K(σ* | σ) ≥ log n − log^{(2)} n − 3.

Before proving this theorem, we need a combinatorial lemma. To simplify notation, for the remainder of this section, we let k(n) = log n − log^{(2)} n − 3, and assume that n is sufficiently large for k(n) to be positive.

Lemma 3.14.3. ∑_{i<2^{k(n)}} n^{4i} < 2^n.

Proof. Let M = ∑_{i<2^{k(n)}} n^{4i}. Then M < ∑_{j<2^{k(n)+2}} n^j < n^{2^{k(n)+2}}, so log M ≤ 2^{k(n)+2} log n, and hence log^{(2)} M ≤ k(n) + 2 + log^{(2)} n = log n − 1. Thus M < 2^n.

Corollary 3.14.4. Suppose that 2^n is divided into 2^{k(n)} disjoint pieces A_0, . . . , A_{2^{k(n)}−1}. Then there is an i < 2^{k(n)} such that |A_i| > n^{4i}. For the least such i, we have ∑_{j<i} |A_j| ≤ ∑_{j<i} n^{4j} = O(n^{4(i−1)}).

Proof. If |A_i| ≤ n^{4i} for every i < 2^{k(n)}, then 2^n = ∑_{i<2^{k(n)}} |A_i| ≤ ∑_{i<2^{k(n)}} n^{4i} < 2^n by Lemma 3.14.3, a contradiction. For the least i with |A_i| > n^{4i}, minimality gives |A_j| ≤ n^{4j} for all j < i, and ∑_{j<i} n^{4j} ≤ 2n^{4(i−1)} for sufficiently large n.
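Lemma 3.14.3 is, for each fixed n, a finite inequality, so it can be checked directly with exact integer arithmetic. In the sketch below, k(n) is taken with a floor, a rounding assumption not made explicit in the text.

```python
import math

def k(n):
    """k(n) = log n - log log n - 3 in base 2; the floor is an assumption."""
    return math.floor(math.log2(n) - math.log2(math.log2(n)) - 3)

def lemma_holds(n):
    """Exact check of sum over i < 2^k(n) of n^(4i) < 2^n using big integers."""
    total = sum(n ** (4 * i) for i in range(2 ** k(n)))
    return total < 2 ** n

# The lemma assumes n large enough that k(n) is positive; these values qualify.
assert all(lemma_holds(n) for n in (64, 256, 1024, 4096))
```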
Proof of Theorem 3.14.2. Assume for a contradiction that K(σ* | σ) < k(n) for all σ ∈ 2^n. We say that τ ∈ 2^{<k(n)} is active for σ if U^σ(τ) ↓; by our assumption, for each σ ∈ 2^n at least one active τ satisfies U^σ(τ) = σ*. For i < 2^{k(n)}, let A_i be the set of σ ∈ 2^n with exactly 2^{k(n)} − i many active strings, and let S_i = ⋃_{j<i} A_j, so that S_i is the set of σ with more than 2^{k(n)} − i many active strings. The S_i are uniformly c.e.; let S_{i,s} be the stage-s approximation to S_i. By Corollary 3.14.4, there is a least i such that |A_i| > n^{4i}, and for this i we have |S_i| = O(n^{4(i−1)}). Since |A_i| > n^{4i}, there must be a σ ∈ A_i such that K(σ) ≥ 4i log n.

Now suppose that we know the parameters i, |S_i|, and n. Then we can wait until a stage s such that S_{i,s} = S_i, and hence compute S_i. Every string entering S_{i+1} that is not in S_i is in A_i, so we can computably enumerate A_i. But once we know that σ ∈ A_i, we can find the 2^{k(n)} − i many strings τ active for σ, and compute U^σ(τ) for all such τ. One of these computations will produce σ*, and hence we can determine K(σ). Let σ be the first string to enter our enumeration of A_i with K(σ) ≥ 4i log n. We have just argued that we can compute σ given i, |S_i|, and n. Thus K(σ) ≤ K(i) + K(|S_i|) + K(n) + O(1), so to arrive at a contradiction, it is enough to show that 4i log n > K(i) + K(|S_i|) + K(n) + O(1) if n is sufficiently large.

Now, K(n) ≤ log n + O(log^{(2)} n), and i < 2^{k(n)}, so K(i) ≤ k(n) + O(log k(n)) = log n − log^{(2)} n − 3 + O(log(log n − log^{(2)} n − 3)) ≤ log n + O(log^{(2)} n). Finally, |S_i| = O(n^{4(i−1)}), so K(|S_i|) ≤ log(|S_i|) + O(log^{(2)} |S_i|) ≤ 4(i − 1) log n + O(log^{(2)} n). Thus K(i) + K(|S_i|) + K(n) ≤ (4i − 2) log n + O(log^{(2)} n), and hence 4i log n > K(i) + K(|S_i|) + K(n) + O(1) for sufficiently large n, as desired.

The same proof also gives the analogous result for plain complexity.

Theorem 3.14.5 (Gács [164], Solovay [371]). For all sufficiently large n, there is a σ ∈ 2^n such that C(C(σ) | σ) ≥ log n − log^{(2)} n − 3.

Note that this lower bound is close to the obvious upper bound C(C(σ) | σ) ≤ C(C(σ)) + O(1) ≤ log |σ| + O(1).
Chaitin conjectured the following, which implies the above and was also conjectured correct by Solovay.

Conjecture 3.14.6. There is an infinite sequence σ_0, σ_1, . . . such that (i) |σ_n| = n, (ii) K(σ_n) ∼ n, and (iii) K(σ_n* | σ_n) ∼ log n.

Solovay pointed out that these σ_n might well satisfy K(σ_n) ≥ n/log^{(2)} n, say.
3.15 Monotone and process complexity

The original machine characterization of randomness was due to Levin [241, 242], and involved what are called monotone machines, and a resulting notion of monotone complexity. A related notion is the process complexity of Schnorr [350], which uses a different kind of monotone machine.¹² We will discuss both of these concepts in this section. Here, we are viewing Cantor space as a continuous sample space, and thinking of a sequence as a limit of computable sequences, rather than a limit of strings.

Levin [241, 242] considered machines with possibly infinite outputs. On a given input σ, such a machine M either halts with a finite output or computes forever, producing a potentially infinite sequence. (So, given a fixed input, if at some stage M outputs some τ, and then at a later stage it outputs τ′, then τ ⪯ τ′.) We write M(σ) ↓ if M has either of these behaviors after reading exactly σ from its input tape. For a standard Turing machine M (which we will call a discrete machine in this section), we may always assume that if M halts on input σ, then M reads all of σ from its input tape, and then ceases activity, so this notation agrees with our usual notation in the discrete case.

Following Levin [241, 242], we call a machine monotone if its action is continuous, that is, for all σ ≺ τ, if M(σ) ↓ and M(τ) ↓ then M(σ) ⪯ M(τ).¹³ Another way of thinking of such machines is to define them as a computably enumerable collection W of pairs of strings

¹²A concept similar to monotone machines was also introduced by Solomonoff [369].

¹³It is perhaps helpful to have a particular machine model for discrete machines in mind. We can think of them as machines with a one-way read-only input tape, a one-way write-only output tape, and work tapes. The input is read one bit at a time, and the output written one bit at a time, with the one-way nature of the output tape allowing no revisions. For such a machine M, we have M(σ) ↓ = α, where σ ∈ 2^{<ω} and α ∈ 2^{<ω} ∪ 2^ω, if either M halts after reading the bits of σ from the input tape, and α is what is on the output tape at that point, or M computes forever after reading exactly the bits of σ, and α is what is on the output tape in the limit.
(σ, τ) such that for all pairs (σ, τ) and (σ′, τ′) in W, if σ ⪯ σ′, then τ is compatible with τ′. Note that this allows for the possibility that σ = σ′ yet τ ≺ τ′ for infinitely many pairs (σ, τ′) in W. Note that (discrete) prefix-free machines are monotone; their definition simply asks that for each σ there be at most one τ with (σ, τ) ∈ W. Discrete monotone machines are also known as process machines.

A simple but important fact is that we can enumerate all monotone machines, and hence use the standard method of constructing a universal one. We fix a universal monotone machine M and a universal discrete monotone machine D. Using these machines we can define two varieties of monotone Kolmogorov complexity.

Definition 3.15.1 (Levin [241, 242], Schnorr [349, 350]). (i) The monotone complexity of σ is Km(σ) = min{|τ| : σ ⪯ M(τ) ↓}.
(ii) The process complexity of σ is Km_D(σ) = min{|τ| : D(τ) ↓ = σ}.

The latter notion was introduced by Schnorr in [349] and further developed by him in [350], where process complexity was used to obtain a characterization of 1-randomness. Schnorr called it process complexity because he regarded information as being given with a direction. As he put it in [350], "he who wants to understand a book will not read it backwards, since the comments or facts which are given in its first part will help him to understand subsequent chapters (this means they help him to find regularities in the rest of the book). Hence anyone who tries to detect regularities in a process (for example an infinite sequence or an extremely long finite sequence) proceeds in the direction of the process. Regularities that have ever been found in an initial segment of the process are regularities for ever. Our main argument is that the interpretation of a process (for example to measure the complexity) is a process itself that proceeds in the same direction."

It is interesting to note that, even earlier and echoing work of Solomonoff [369], Levin had also developed a notion of process complexity, as can be seen in Zvonkin and Levin [425]. To make things clear, we will use the term strict process machine for Levin's definition.

Definition 3.15.2 (Levin, see [425], Solomonoff [369]). A strict process machine is a partial computable function M : 2^{<ω} → 2^{<ω} such that if τ ∈ dom(M) and τ′ ⪯ τ, then τ′ ∈ dom(M) and M(τ′) ⪯ M(τ). Let U be a universal strict process machine. The strict process complexity Km_S(σ) of σ is C_U(σ).

As pointed out by Adam Day, a natural model corresponding to Levin's definition is almost identical to one described in the first paper on algorithmic randomness by Solomonoff [369]. Take a three-tape Turing machine M with a read-only one-way input tape, a one-way write-once output tape, and a work tape. The first square of the input tape is blank and the input
head starts on that square. Let the machine run. If at any stage M wants to move the input head of the tape, we first define M(τ) = σ, where τ is the input string read so far and σ is the current output on the output tape.

The distinction between a strict process machine and a process machine is almost completely unimportant in this book save for a theorem of Day [88] on the complexity of collections of non-random strings, Theorem 16.3.31. Until we reach that section, all theorems we state hold for both measures, and we will stick to Schnorr's definition. Little is known about the relationship between Schnorr's and Levin's complexities.

We now give some results relating Km and Km_D to C and K. The precise nature of the interaction between these notions is not fully known. Some results can be found in Uspensky and Shen [396], but there are no sharp estimates akin to those of Solovay's relating C to K, which we will discuss in Chapter 4.

Clearly, Km(σ) ≤ Km_D(σ) + O(1) ≤ K(σ) + O(1) and C(σ) ≤ Km_D(σ) + O(1). Furthermore, the identity machine is monotone, so Km_D(σ) ≤ |σ| + O(1). The following easy result was noted by several authors, including Gács (implicit in [166]) and Calhoun [46].

Proposition 3.15.3. Km(0^n 1) = Km_D(0^n 1) ± O(1) = K(1^n) ± O(1).

Proof. We have observed that Km(σ) ≤ Km_D(σ) + O(1) ≤ K(σ) + O(1) for all σ, and clearly K(0^n 1) = K(1^n) ± O(1), so it is enough to show that K(1^n) ≤ Km(0^n 1) + O(1). The minimal-length M-descriptions of 0^n 1 over all n form a prefix-free set, so if we let L = {(k + 1, 1^n) : ∃τ ∈ 2^k (0^n 1 ⪯ M(τ))}, then L is c.e. and

∑_{(k+1,1^n)∈L} 2^{−(k+1)} ≤ ∑_n ∑_{k≥Km(0^n 1)} 2^{−(k+1)} = ∑_n 2^{−Km(0^n 1)} ≤ 1,

so L is a KC set. Furthermore, (Km(0^n 1) + 1, 1^n) ∈ L for all n, so K(1^n) ≤ Km(0^n 1) + O(1).

Since C(σ) ≤ Km_D(σ) + O(1) ≤ K(σ) + O(1), some results come for free. For instance, the estimate in Lemma 4.2.1 below shows that for any σ and ε > 0, we have K(σ) ≤ Km_D(σ) + log |σ| + log log(|σ|) + · · · + (1 + ε) log^{(k)}(|σ|). Actually, the strengthening of this result to Km is also true.

Theorem 3.15.4 (Uspensky and Shen [396]). For any σ and ε > 0, we have K(σ) ≤ Km(σ) + log |σ| + log log(|σ|) + · · · + (1 + ε) log^{(k)}(|σ|).

Proof. The proof uses a device introduced in the proof of Proposition 3.1.5. As in that proposition, for a string ν = a_0 a_1 . . . a_m, let ν̄ = a_0 a_0 a_1 a_1 . . . a_m a_m 01. This encoding is a prefix-free representation of ν of length 2|ν| + O(1). To improve on it, we can represent ν as the doubled code of (the binary representation of) |ν| followed by ν itself, which has length |ν| + 2 log |ν| + O(1), or as the doubled code of ||ν||, followed by the binary representation of |ν|, followed by ν (where ||ν|| is the length of the binary representation of the length of ν), which has length
|ν| + log |ν| + 2 log log |ν| + O(1) ≤ |ν| + (1 + ε) log |ν| for any ε > 0. So for any k, we have a prefix-free representation r(|σ|) of |σ| of length log |σ| + log log |σ| + · · · + (1 + ε) log^{(k)} |σ|.

Now let N be a prefix-free machine defined as follows. On input τ, look for τ_0 and τ_1 with τ = τ_0 τ_1 and such that τ_0 = r(|σ|) for some σ ⪯ M(τ_1). If these are found, then output σ. If ρ is a minimal-length program such that σ ⪯ M(ρ), then N(r(|σ|)ρ) = σ.

In some ways, the theory of monotone complexity is smoother than those of plain complexity and prefix-free complexity. For example, we have the following simpler version of Chaitin's Theorem 3.4.4.

Proposition 3.15.5. A set A is computable iff Km(A ↾ n) = O(1).

Proof. If A is computable then there is a monotone machine that on input λ outputs A, so Km(A ↾ n) = O(1). Conversely, if Km(A ↾ n) = O(1) then there are computable sets B_0, . . . , B_k such that for each n there is an i ≤ k for which A ↾ n ≺ B_i. But that clearly implies that A = B_i for some i ≤ k, so A is computable.

For process complexity, we have the following analogous result.

Theorem 3.15.6. A set A is computable iff Km_D(A ↾ n) ≤ Km_D(1^n) + O(1) for all n.

Proof. Let M be a machine such that for each σ we have M(σ) ↓ = 1^n, where n is the position of σ in the length-lexicographic ordering of 2^{<ω}. Then M is a discrete monotone machine, so Km_D(1^n) ≤ log n + O(1). Since monotone machines are Turing machines, at C-random n we have Km_D(1^n) = log n ± O(1). Now we simply follow the proof of Theorem 3.4.4.

We have Km(1^n) = O(1) and Km(0^n) = O(1) for all n, but for each c we can find an n such that Km(0^n 1^n) > c, so Km is not subadditive. However, it is possible to recover a mild form of subadditivity by mixing complexities.

Theorem 3.15.7. Km(στ) ≤ K(σ) + Km(τ) + O(1).

Proof. Let N be a machine defined as follows. On input ρ, the machine N looks for μ and ν such that ρ = μν and U(μ) ↓. Then N computes U(μ)M(ν). (In other words, N writes U(μ) on its output tape, then proceeds to emulate M on input ν.) The machine N is monotone. If μ is a minimal-length U-program for σ and ν is a minimal-length M-program such that τ ⪯ M(ν) ↓, then στ ⪯ N(μν) ↓, so Km(στ) ≤ |μν| + O(1) = K(σ) + Km(τ) + O(1).

Process complexity is also not subadditive. The usual argument for plain complexity uses complexity oscillations applied to a long random string.
This argument is not available for process complexity, but it is nevertheless possible to show that subadditivity fails.

Theorem 3.15.8 (Day [90]). For any d there are σ and τ such that Km_D(στ) > Km_D(σ) + Km_D(τ) + d.

We will prove this theorem by giving short descriptions to many strings, and arguing combinatorially that one of these strings σ must have an extension στ with the desired property. The following lemma expresses a basic combinatorial fact about process machines.

Lemma 3.15.9 (Day [90]). If A ⊆ 2^{<ω} is prefix-free then ∑_{σ∈A} 2^{−Km_D(σ)} ≤ 1.

Proof. Let B = {τ_σ : σ ∈ A}, where τ_σ is a minimal-length description of σ with respect to the fixed universal discrete monotone machine D. If τ_1 and τ_2 are distinct elements of B, then D(τ_1) and D(τ_2) are incompatible, so B is also prefix-free. Thus ∑_{σ∈A} 2^{−Km_D(σ)} = ∑_{τ∈B} 2^{−|τ|} = μ([B]) ≤ 1.

For σ = a_0 . . . a_{n−1}, let M(σ) = a_0 a_0 a_1 a_1 . . . a_{n−1} a_{n−1}. Let A_n = {M(σ) : σ ∈ 2^n}. Since M is a discrete monotone machine, there is a c such that for all τ ∈ A_n, we have Km_D(τ) ≤ n + c. For all n and m ≥ 2n + 2, let B_n^m = {σ ∈ 2^m : ∃ρ ∈ A_n (ρ01 ⪯ σ ∨ ρ10 ⪯ σ)}. Note that if n ≠ n′ then B_n^m ∩ B_{n′}^m = ∅, and |B_n^m| = 2 · 2^{m−2n−2}|A_n| = 2^{m−n−1}. We will use the following lemma to find the extension we want.

Lemma 3.15.10 (Day [90]). For each e there are m, n, and σ ∈ B_n^m such that Km_D(σ) > |σ| − n + e.

Proof. Let m = 2^{e+3} and assume that no such n and σ exist. Then

∑_{σ∈2^m} 2^{−Km_D(σ)} ≥ ∑_{n<m/2} ∑_{σ∈B_n^m} 2^{−Km_D(σ)} ≥ ∑_{n<m/2} ∑_{σ∈B_n^m} 2^{−|σ|+n−e}
= ∑_{n<m/2} |B_n^m| 2^{−m+n−e} = ∑_{n<m/2} 2^{m−n−1} 2^{−m+n−e} = (m/2) 2^{−e−1} > 1,

which contradicts Lemma 3.15.9.

Proof of Theorem 3.15.8. Let c be as above, and let d be such that Km_D(τ) ≤ |τ| + d for all τ. Fix k and let e = k + c + d. By the previous lemma, there exist m, n, and υ ∈ B_n^m such that Km_D(υ) > |υ| − n + e. Let σ = υ ↾ 2n. Then σ ∈ A_n, so Km_D(σ) ≤ |σ| − n + c. Let τ be such that στ = υ. Then Km_D(σ) + Km_D(τ) + k ≤ |σ| − n + c + |τ| + d + k = |στ| − n + e < Km_D(στ).
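The counting facts used above, |B_n^m| = 2^{m−n−1} and the pairwise disjointness of the B_n^m, can be verified by brute force for small parameters; the code below is only an illustration of that count, with the doubling machine written out explicitly.

```python
from itertools import product

def double(s):
    """The coding machine M from the proof: each bit of s is doubled."""
    return "".join(b + b for b in s)

def B(n, m):
    """B_n^m: strings of length m extending rho01 or rho10 for rho in A_n."""
    A_n = {double("".join(bits)) for bits in product("01", repeat=n)}
    out = set()
    for bits in product("01", repeat=m):
        s = "".join(bits)
        if any(s.startswith(r + "01") or s.startswith(r + "10") for r in A_n):
            out.add(s)
    return out

# |B_n^m| = 2 * 2^(m-2n-2) * |A_n| = 2^(m-n-1), and distinct n give
# disjoint B_n^m, exactly as claimed before Lemma 3.15.10.
for n, m in [(1, 5), (2, 7), (3, 9)]:
    assert len(B(n, m)) == 2 ** (m - n - 1)
assert B(1, 7).isdisjoint(B(2, 7))
```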
There is comparatively little known about process and monotone complexity; their study seems to be a fertile area of research, particularly in view of the basic natural intuition behind them.
3.16 Continuous semimeasures and KM-complexity

In Section 3.9, we defined the notion of a discrete semimeasure. This notion is not compatible with the usual measure on Cantor space. Its continuous analog is the following.

Definition 3.16.1. A continuous semimeasure is a function δ : {[σ] : σ ∈ 2^{<ω}} → ℝ^{≥0} such that
(i) δ([λ]) ≤ 1 and
(ii) δ([σ]) ≥ δ([σ0]) + δ([σ1]) for all σ.

To simplify notation, we often think of δ as a function from 2^{<ω} to ℝ^{≥0} and write δ(σ) for δ([σ]).

Of course, there is a big difference between continuous semimeasures and the discrete semimeasures of Section 3.9. For example, if we let δ(0^k) = 1 for all k and δ(σ) = 0 for all other σ ∈ 2^{<ω}, then δ is a continuous semimeasure, but ∑_σ δ(σ) = ∞. Continuous semimeasures are closely connected to the important notion of supermartingale, which we discuss in Chapter 6 (see in particular Section 6.3.2).

In the words of Li and Vitányi [248], a continuous semimeasure is a "defective" measure, as it is only subadditive. Of course, we can define effective continuous semimeasures (and hence "effective defective measures". . . ), such as c.e. continuous semimeasures, as we did for discrete semimeasures. A continuous semimeasure δ is computably enumerable if it is approximable from below; that is, there are uniformly computable functions δ_s : 2^{<ω} → ℚ such that for all s and σ we have δ_{s+1}(σ) ≥ δ_s(σ) and lim_s δ_s(σ) = δ(σ).

A c.e. continuous semimeasure δ is optimal if for every c.e. continuous semimeasure δ′, we have δ′(σ) = O(δ(σ)). Levin, see [425], showed that there is an optimal c.e. continuous semimeasure, as we will see in Theorem 3.16.2. Given this fact, we can associate a version of Kolmogorov complexity to a continuous semimeasure. Fix an optimal c.e. continuous semimeasure δ and let KM(σ) = − log δ(σ). This notion is called the a priori entropy of σ; we will also refer to it as the KM-complexity of σ.
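Definition 3.16.1 is easy to check mechanically for a finitely supported candidate. The sketch below tests the two conditions up to a fixed depth, using the δ(0^k) = 1 example from the text (truncated at k = 7, since the check is finite).

```python
from itertools import product

def is_continuous_semimeasure(delta, depth):
    """Check delta(lambda) <= 1 and delta(s) >= delta(s0) + delta(s1)
    for all strings s of length < depth; strings absent from the dict get 0."""
    d = lambda s: delta.get(s, 0)
    strings = [""] + ["".join(b) for k in range(1, depth)
                      for b in product("01", repeat=k)]
    return d("") <= 1 and all(d(s) >= d(s + "0") + d(s + "1") for s in strings)

# The example from the text: delta(0^k) = 1 for all k, delta(sigma) = 0
# otherwise. Its values sum to infinity, so it is not a discrete semimeasure.
delta = {"0" * k: 1 for k in range(8)}
assert is_continuous_semimeasure(delta, 7)
```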
Along the lines of the Coding Theorem 3.9.4, but for continuous semimeasures, we can show that KM corresponds to the probability that a string is output by a given universal monotone machine. The following result is due
to Levin (see [425]), but, while the theorem is referred to in the literature, only a sketch of its proof appeared in [425]. We give a proof due to Day [87]. If L is a monotone machine, then let the function M_L : 2^{<ω} → ℝ^{≥0} be defined by M_L(σ) = μ([{τ : ∃σ′ ⪰ σ ((τ, σ′) ∈ L)}]).

Theorem 3.16.2 (Levin, see [425]). (i) If L is a monotone machine, then M_L is a c.e. continuous semimeasure.
(ii) If m is a c.e. continuous semimeasure, then there exists a monotone machine L such that M_L = m.
(iii) If M is a universal monotone machine, then M_M is an optimal c.e. continuous semimeasure, so KM(σ) = − log(M_M(σ)) ± O(1).

Proof. As discussed in Section 3.15, we think of a monotone machine L as given by axioms of the form (σ, τ), where (σ, τ) ∈ L means that on input σ, the machine writes at least τ on its output tape. We write σ ≈ τ to mean that σ ⪯ τ or τ ⪯ σ. We prove (i) and (ii). Then (iii) follows immediately by the definition of M_L.

(i) Clearly M_L(λ) ≤ 1. Furthermore,

[{τ : ∃σ′ ⪰ σ ((τ, σ′) ∈ L)}] ⊇ [{τ : ∃σ′ ⪰ σ0 ((τ, σ′) ∈ L)}] ∪ [{τ : ∃σ′ ⪰ σ1 ((τ, σ′) ∈ L)}],

and

[{τ : ∃σ′ ⪰ σ0 ((τ, σ′) ∈ L)}] ∩ [{τ : ∃σ′ ⪰ σ1 ((τ, σ′) ∈ L)}] = ∅,

because L is a monotone machine. It follows that M_L(σ) ≥ M_L(σ0) + M_L(σ1) for all σ. Finally, M_L is approximable from below, since M_L(σ) = lim_t M_{L_t}(σ), and M_{L_t}(σ) is rational and nondecreasing in t.

(ii) We will use the following lemma.

Lemma 3.16.3. If T, S ⊆ 2^{≤l} then there exists R ⊆ 2^l such that [R] = [T] \ [S].

Proof. Let R = {σ ∈ 2^l : [σ] ⊆ [T] ∧ [σ] ∩ [S] = ∅}. Then [R] ⊆ [T] and [R] ∩ [S] = ∅, so [R] ⊆ [T] \ [S]. If α ∈ [T] \ [S] then [α ↾ l] ⊆ [T], because there must be some string of length at most l in T extended by α. No string in S is extended by α, and as all strings in S have length less than or equal to l, we have [α ↾ l] ∩ [S] = ∅. Hence α ↾ l ∈ R, and so [T] \ [S] ⊆ [R].

Let m be a c.e. continuous semimeasure. We can choose a monotone approximation m(σ, s) of m whose values are dyadic rationals and such that the following hold for all t.

(i) If |σ| ≥ t then m(σ, t) = 0.
(ii) There is at most one σ such that m(σ, t + 1) ≠ m(σ, t).

(iii) If m(σ, t + 1) ≠ m(σ, t), then m(σ, t + 1) = m(σ, t) + n2^{−(t+1)} for some n.

(iv) For all σ, we have m(σ, t) ≥ m(σ0, t) + m(σ1, t).

We build a monotone machine L via an enumeration with certain properties, which we now describe. Letting D_t(σ) = {τ : (τ, σ) ∈ L_t}, we will ensure that the following hold for all σ and t.

(i) L_t is a monotone machine.
(ii) D_t(σ) is a prefix-free set.
(iii) μ([D_t(σ)]) = m(σ, t).
(iv) If τ ∈ D_t(σ) then |τ| ≤ t.
(v) If σ ≺ σ′ then [D_t(σ)] ⊇ [D_t(σ′)].

The construction of L proceeds as follows. Let L_0 = ∅. Notice that L_0 has the desired properties trivially. Assume that L_t has the desired properties. At stage t + 1, if m(σ, t + 1) = m(σ, t) for all σ such that |σ| < t, then let L_{t+1} = L_t. That is, if the semimeasure does not change then do not change the machine. Otherwise, let σ be the unique string such that m(σ, t + 1) ≠ m(σ, t). Let n be such that m(σ, t + 1) = m(σ, t) + n2^{−(t+1)}.

If σ ≠ λ then let ρ = σ ↾ (|σ| − 1). By the assumption on L_t,

μ([D_t(ρ)] \ ([D_t(ρ0)] ∪ [D_t(ρ1)])) = μ([D_t(ρ)]) − μ([D_t(ρ0)]) − μ([D_t(ρ1)])
= m(ρ, t) − m(ρ0, t) − m(ρ1, t) ≥ n2^{−(t+1)}.

The first equality holds because [D_t(ρ0)] ∪ [D_t(ρ1)] ⊆ [D_t(ρ)], and, as L_t is a monotone machine, [D_t(ρ0)] ∩ [D_t(ρ1)] = ∅. By Lemma 3.16.3, there is a set R of strings of length t + 1 such that [R] = [D_t(ρ)] \ ([D_t(ρ0)] ∪ [D_t(ρ1)]). So μ([R]) ≥ n2^{−(t+1)}. Choose n strings of length t + 1 from R and let this set be T_t. If σ = λ then find a set R ⊆ 2^{t+1} such that [R] = [λ] \ [D_t(λ)], and let T_t consist of n strings from this R. Let L_{t+1} = L_t ∪ {(τ, σ) : τ ∈ T_t}.

We show that L_{t+1} has the desired properties. First we show that L_{t+1} is a monotone machine. If σ = λ then there is nothing to check, as λ is comparable with any string. If σ ≠ λ, then take any τ ∈ T_t and consider any υ ≺ τ such that (υ, π) ∈ L_t. We will show that π ⪯ σ. First note that, by the choice of τ, there is some τ′ ≺ τ with (τ′, ρ) ∈ L_t. Then τ′ ≈ υ, so π ≈ ρ. If π ⪰ ρ0 or π ⪰ ρ1, then [υ] ⊆ [D_t(ρ0)] ∪ [D_t(ρ1)], and so τ ∉ [R], a contradiction. Hence π ⪯ ρ ≺ σ.
The set D_{t+1}(σ) is prefix-free because D_t(σ) and T_t are prefix-free and none of the elements of T_t extend elements of D_t(σ). We have

μ([D_{t+1}(σ)]) = μ([D_t(σ)]) + μ([T_t]) = m(σ, t) + n2^{−(t+1)} = m(σ, t + 1).

The final two properties hold by construction. Since each L_t is a monotone machine, it follows that L is a monotone machine. By property (v), M_{L_t}(σ) = μ([D_t(σ)]), so for all σ,

M_L(σ) = lim_t M_{L_t}(σ) = lim_t μ([D_t(σ)]) = lim_t m(σ, t) = m(σ).

It was a long-standing open question whether KM and Km are the same, that is, whether a natural analog of the Coding Theorem 3.9.4 holds for continuous semimeasures. In one direction, we saw in Theorem 3.16.2 that KM(σ) = − log(M_M(σ)) ± O(1). But

M_M(σ) = μ([{τ : ∃σ′ ⪰ σ ((τ, σ′) ∈ M)}]) ≥ 2^{−Km(σ)}.

Thus KM(σ) ≤ Km(σ) + O(1). In Section 4.5, we will see results by Gács [166] and Day [87] showing that the converse does not hold, although proving that fact is remarkably difficult.
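For a finite set of axioms, M_L can be computed directly: the open set generated by the relevant inputs has measure given by its minimal (hence prefix-free) elements. The four-axiom machine L below is an invented example used only to illustrate Theorem 3.16.2(i).

```python
from fractions import Fraction

# An invented four-axiom monotone machine: (tau, sigma) means that on
# input tau the machine outputs at least sigma. Comparable inputs have
# compatible outputs, as the definition of a monotone machine requires.
L = {("0", "11"), ("00", "110"), ("10", "0"), ("11", "11")}

def mu(prefixes):
    """Measure of the open set [prefixes]: keep only the minimal strings
    (a prefix-free set) and sum 2^(-|tau|) over them."""
    minimal = [t for t in prefixes
               if not any(u != t and t.startswith(u) for u in prefixes)]
    return sum(Fraction(1, 2 ** len(t)) for t in minimal)

def M_L(sigma):
    """M_L(sigma): measure of the inputs whose output extends sigma."""
    return mu({tau for tau, out in L if out.startswith(sigma)})

# M_L is a continuous semimeasure, as in Theorem 3.16.2(i).
assert M_L("") <= 1
for s in ["", "0", "1", "11"]:
    assert M_L(s) >= M_L(s + "0") + M_L(s + "1")
```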
4 Relating Complexities
In this chapter, we look at some of the fundamental relationships between flavors of Kolmogorov complexity. The first four sections explore relationships between plain and prefix-free complexity, while the last deals with the relationship between Km and KM . Most of this material, although of independent interest, will not be used directly in the rest of this book. Exceptions are Corollaries 4.3.3 and 4.3.8, which will be relevant in the discussion of sets with infinitely many initial segments of highest possible complexity in Chapter 6. Throughout this chapter, in addition to our fixed universal prefix-free machine U, we fix a universal Turing machine V and define C using this machine.
4.1 Levin's Theorem relating C and K

Our first results relating K and C are due to Levin. They appeared in his dissertation [241]; the first published proofs were in Gács [165].

Theorem 4.1.1 (Levin [241], see Gács [165]). (i) C(σ) = min{n : K(σ | n) ≤ n} ± O(1).
(ii) C(σ) = K(σ | C(σ)) ± O(1).

Proof. Let m_σ = min{n : K(σ | n) ≤ n}.

Part (i). Consider a prefix-free oracle machine M, with coding constant c given by the recursion theorem, such that M^n(τ) = μ iff |τ| = n − c and

R.G. Downey and D. Hirschfeldt, Algorithmic Randomness and Complexity, Theory and Applications of Computability, DOI 10.1007/978-0-387-68441-3_4, © Springer Science+Business Media, LLC 2010
V(τ) = μ. Given σ, let τ be a minimal-length V-program for σ. On input τ and oracle C(σ) + c, this machine will output σ. Thus K(σ | C(σ) + c) ≤ |τ| + c = C(σ) + c, and hence C(σ) ≥ m_σ − O(1).

Now consider a plain machine N that, on input τ, searches for μ ⪯ τ such that U^{|τ|}(μ) ↓, and if such a μ is found, outputs U^{|τ|}(μ). Note that, by the prefix-freeness of U, such a μ, if it exists, is unique. Suppose that K(σ | n) ≤ n, and let μ be a minimal-length U-program for σ given n. Note that |μ| ≤ n. Let τ = μ0^{n−|μ|}. Then N(τ) = U^n(μ) = σ, so C(σ) ≤ |τ| + O(1) = n + O(1). Thus C(σ) ≤ m_σ + O(1).

Part (ii). We have K(σ | m_σ − 1) > m_σ − 1, so K(σ | m_σ) ≥ K(σ | m_σ − 1) − O(1) ≥ m_σ − O(1). But also K(σ | m_σ) ≤ m_σ, so in fact K(σ | m_σ) = m_σ ± O(1). Now part (i) gives us K(σ | C(σ)) = K(σ | m_σ) ± O(1) = m_σ ± O(1) = C(σ) ± O(1).
4.2 Solovay’s Theorems relating C and K

In this section and the next, we look at the beautiful unpublished material of Solovay [371] relating C and K. The positive results involve simulations of Turing machines by prefix-free machines and vice versa. The negative ones involve the construction of an infinite sequence of strings whose plain and prefix-free complexities behave very differently in the limit. These results also have a bearing on the relationship between C-randomness and strong K-randomness (as defined in Sections 3.1 and 3.8, respectively). In Section 4.4, we give applications of Solovay’s results due to Miller [272] to obtain a result of An. A. Muchnik, see [288], and an improvement thereof.

As the following result shows, it is not hard to give upper bounds on K(σ) in terms of C(σ). Recall that, for a function f, we write f^{(n)} for the result of composing f with itself n many times.

Proposition 4.2.1 (Solovay [371]). K(σ) ≤ C(σ) + C^{(2)}(σ) + ··· + C^{(n)}(σ) + O(C^{(n+1)}(σ)) for any n.

Proof. This lemma follows from Lemma 4.2.3 below by an easy induction, but we give a direct proof here. Recall the following idea from Proposition 3.1.5 and Lemma 3.15.4. For a string σ = a_0a_1 ... a_m, let σ̄ = a_0a_0a_1a_1 ... a_ma_m01. That is, σ̄ is the result of doubling each bit of σ and adding 01 to the end. The key fact about this operation is that if σ ≠ τ then σ̄ and τ̄ are incompatible. Thus there is a prefix-free machine M such that M(τ̄) = V(τ) for each τ. Taking τ to be a minimal-length V-program for σ, this fact shows that K(σ) ≤ |τ̄| + O(1) = O(C(σ)), thus proving the lemma for n = 0.

But we can also build a prefix-free machine M so that for each τ we have M(τ̄′τ) = V(τ), where τ′ is a minimal-length V-program for |τ| and τ̄′ is the doubling code of τ′. On input μ, the machine M simply looks for a splitting μ = μ1μ2 such that μ1 = ν̄ for some ν, then computes V(ν), and if that halts and equals |μ2|, computes V(μ2) and outputs the result if any. Taking τ to be a minimal-length V-program for σ, this construction shows that K(σ) ≤ |τ̄′τ| + O(1) = C(σ) + O(C^{(2)}(σ)), thus proving the lemma for n = 1. It should now be clear how to build a prefix-free machine M so that for each τ we have M(τ̄″τ′τ) = V(τ), where τ′ is a minimal-length V-program for |τ| and τ″ is a minimal-length V-program for |τ′|, which proves the lemma for n = 2, and how to iterate this process to prove the lemma in general.

The more precise relationships between C and K are as follows.

Theorem 4.2.2 (Solovay [371]).

K(σ) = C(σ) + C^{(2)}(σ) ± O(C^{(3)}(σ))    (4.1)

and

C(σ) = K(σ) − K^{(2)}(σ) ± O(K^{(3)}(σ)).    (4.2)
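Before turning to the proof, the bit-doubling code σ̄ from the proof of Proposition 4.2.1 above can also be made concrete. A minimal Python sketch (ours; the function names are hypothetical), showing both that distinct strings receive incomparable codes and how a machine splits an input of the form σ̄ρ:

```python
def bar(sigma):
    """The code sigma-bar: double each bit of sigma, then append 01."""
    return "".join(2 * b for b in sigma) + "01"

def split_bar(mu):
    """Split mu = bar(nu) + rest at the unique 01 terminator, mirroring
    how the machine in Proposition 4.2.1 parses its two-part inputs."""
    nu, i = "", 0
    while i + 1 < len(mu):
        pair = mu[i:i + 2]
        if pair == "01":               # terminator: bar(nu) ends here
            return nu, mu[i + 2:]
        if pair not in ("00", "11"):   # not a doubled bit: no bar prefix
            return None
        nu += pair[0]
        i += 2
    return None

# Distinct strings receive incomparable codes:
x, y = bar("0110"), bar("011")
assert not x.startswith(y) and not y.startswith(x)
# Splitting off the self-delimiting part works for any tail:
assert split_bar(bar("101") + "0011") == ("101", "0011")
```

Because the terminator 01 can never occur inside the doubled region, the split point is unique, which is exactly the prefix-freeness used in the proof.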
Proof. It is useful to recall what the O-notation means here. For example, (4.1) means that there is a d such that C(σ) + C^{(2)}(σ) − dC^{(3)}(σ) ≤ K(σ) ≤ C(σ) + C^{(2)}(σ) + dC^{(3)}(σ) for all σ. We have already seen that the second inequality holds for some d. The difficulty comes in proving the first inequality.

One might expect (4.1) and (4.2) to be part of an infinite sequence of approximations involving increasing numbers of iterations of C and K, respectively, as in Proposition 4.2.1. In the next section we will see that, remarkably, this is not the case.

As Solovay observed, (4.1) and (4.2) are in fact equivalent. Indeed, we will also prove that

K^{(2)}(σ) − C^{(2)}(σ) = O(K^{(3)}(σ))    (4.3)

and

K^{(3)}(σ) is asymptotically equal to C^{(3)}(σ).    (4.4)
Granted these two facts, (4.1) and (4.2) are clearly equivalent. The proof of the above equations proceeds in two stages. First we prove that

K(σ) ≤ C(σ) + K(C(σ)) + O(1),    (4.5)

which is not too difficult. Then we prove the more difficult inequality

C(σ) ≤ K(σ) − K^{(2)}(σ) + K^{(3)}(σ) + O(1).    (4.6)
Note that (4.5) and (4.6) are close to (4.2), the problem being that (4.5) has the term K(C(σ)) rather than K^{(2)}(σ). It turns out that we can use the estimate on K − C to get one on K^{(2)} − K(C(·)) to establish (4.2). After doing so, it will remain to prove (4.3) and (4.4) to get (4.1).

We begin by establishing (4.5).

Lemma 4.2.3. K(σ) ≤ C(σ) + K(C(σ)) + O(1).

Proof. Define a prefix-free machine M as follows. On input μ, first attempt to simulate U by searching for μ1 ⪯ μ such that U(μ1)↓. If such a string is found then let μ2 be such that μ = μ1μ2 and check whether |μ2| = U(μ1). If so, output V(μ2) (if this value is defined). Notice that M is prefix-free, since, firstly, U is prefix-free, and, secondly, if M halts on μ then μ = μ1μ2 with U(μ1)↓ and |μ| = |μ1| + U(μ1), which means that all extensions of μ1 on which M halts have the same length, and hence are pairwise incompatible.

Given σ, let τ2 be a minimal-length V-program for σ and let τ1 be a minimal-length U-program for C(σ) = |τ2|. Then M(τ1τ2) = V(τ2) = σ, and hence K(σ) ≤ |τ2| + |τ1| + O(1) = C(σ) + K(C(σ)) + O(1).

We now establish (4.6). The idea of the proof is the following. Fix a computable enumeration of dom U and let L_n be the list of strings τ ∈ dom U such that |τ| = n, ordered according to this enumeration. We will show that there is a c such that |L_{K(σ)}| ≤ 2^{K(σ)−K^{(2)}(σ)+c}. We will then argue as follows (in the proof of Lemma 4.2.6). Given σ, let τ be a minimal-length U-program for σ. Let μ1 be a minimal-length U-program for K^{(2)}(σ) and let μ2 be a string of length K(σ) − K^{(2)}(σ) + c encoding the position of τ on the list L_{K(σ)}. The strings μ1 and μ2 together allow us to compute both K(σ) = U(μ1) + |μ2| − c and the position of τ on the list L_{K(σ)}, and hence allow us to compute σ. Since U is prefix-free, the string μ1μ2 is enough to allow us to compute σ, whence C(σ) ≤ |μ1μ2| + O(1) = K(σ) − K^{(2)}(σ) + K^{(3)}(σ) + O(1). We now give the details of this argument.

Lemma 4.2.4. |L_n| ≤ 2^{n−K(n)+O(1)}.

Proof. We wish to apply the KC Theorem 3.6.1 to build a prefix-free machine M as follows. We enumerate the L_n simultaneously.
Whenever we first see that |L_n| ≥ 2^k, we enumerate a request (n − k + 1, n). We claim that these requests form a KC set. Assume this claim for now, and let k_n be the largest k such that |L_n| ≥ 2^k. Then K(n) ≤ n − k_n + O(1), so 2^{K(n)} ≤ 2^{n−k_n+O(1)}. Thus |L_n| < 2^{k_n+1} ≤ 2^{n−K(n)+O(1)}.

So we are left with establishing the claim. Defining k_n as above, this task amounts to showing that

∑_n ∑_{i≤k_n} 2^{−(n−i+1)} ≤ 1.

To show that the above inequality holds, first note that

∑_{i≤k_n} 2^{−(n−i+1)} = ∑_{j=n−k_n+1}^{n+1} 2^{−j} ≤ ∑_{j≥n−k_n+1} 2^{−j} = 2^{k_n−n}.

Since 2^{k_n} ≤ |L_n|, we now have

∑_{i≤k_n} 2^{−(n−i+1)} ≤ 2^{−n}|L_n|,

and hence

∑_n ∑_{i≤k_n} 2^{−(n−i+1)} ≤ ∑_n 2^{−n}|L_n| = ∑_n ∑{2^{−|σ|} : σ ∈ dom U ∧ |σ| = n} = ∑{2^{−|σ|} : σ ∈ dom U} ≤ 1.

This fact establishes the claim and completes the proof.

Corollary 4.2.5. |L_{K(σ)}| ≤ 2^{K(σ)−K^{(2)}(σ)+O(1)}.
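The geometric sums in the proof of Lemma 4.2.4 can be checked mechanically. The following Python snippet (ours) verifies with exact rational arithmetic that the requests contributed for a single n have total weight at most 2^{k_n−n}, and hence at most 2^{−n}|L_n|, as the proof claims:

```python
from fractions import Fraction

def request_mass(n, k_n):
    """Total KC-request weight contributed for one n: the sum over
    0 <= i <= k_n of 2^-(n - i + 1), as in the proof of Lemma 4.2.4."""
    return sum(Fraction(1, 2 ** (n - i + 1)) for i in range(k_n + 1))

for n in range(1, 12):
    for k_n in range(n + 1):
        # the bound derived in the proof: the sum is at most 2^(k_n - n);
        # since 2^(k_n) <= |L_n|, it is then at most 2^-n * |L_n|
        assert request_mass(n, k_n) <= Fraction(2 ** k_n, 2 ** n)
```

Exact `Fraction` arithmetic avoids any floating-point slack in the comparison; the check is a finite sanity test of the claimed bound, not a proof.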
We now establish (4.6).

Lemma 4.2.6. C(σ) ≤ K(σ) − K^{(2)}(σ) + K^{(3)}(σ) + O(1).

Proof. Let c be such that |L_{K(σ)}| ≤ 2^{K(σ)−K^{(2)}(σ)+c} for all σ. We define a machine M as follows. On input μ, first attempt to simulate U by searching for μ1 ⪯ μ such that U(μ1)↓. If such a string is found then let μ2 be such that μ = μ1μ2. Let n = |μ2| + U(μ1) − c and interpret μ2 as a number j in the interval [1, 2^{|μ2|}] in the natural way. Enumerate L_n until its jth element appears, if ever. If such an element τ appears, output U(τ) (if this value is defined).

Given σ, let τ be a minimal-length U-program for σ and let ν2 be a string of length K(σ) − K^{(2)}(σ) + c encoding the position j of τ on the list L_{K(σ)}. Let ν1 be a minimal-length U-program for K^{(2)}(σ). If we run M on input ν1ν2 then M will set n = |ν2| + U(ν1) − c = K(σ) − K^{(2)}(σ) + c + K^{(2)}(σ) − c = K(σ) and will proceed to search the list L_n for its jth element, namely τ. It will then output U(τ) = σ. Since |ν1ν2| = K^{(3)}(σ) + K(σ) − K^{(2)}(σ) + c, the result follows.

We are now ready to show that (4.2) holds. Let m and n range over the integers. It is easy to check that K(|m + n|) ≤ K(|m|) + K(|n|) + O(1). (Here |·| denotes absolute value.) Also, K(|m|) ≤ K(|m − n|) + K(|n|) + O(1), and similarly with m and n interchanged, from which it follows that |K(|m|) − K(|n|)| ≤ K(|m − n|) + O(1).
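The machine M of Lemma 4.2.6 is a two-part decoder: a prefix-free header encodes a number, and the remaining bits encode an index into an enumeration. The Python sketch below (ours; the toy header table stands in for the uncomputable U, and the constant c is arbitrary) shows just the arithmetic of recovering n = |μ2| + U(μ1) − c and the index j ∈ [1, 2^{|μ2|}]:

```python
# Toy prefix-free header machine: program -> the number it encodes.
# It stands in for U restricted to dom U; the values are arbitrary.
HEADERS = {"00": 2, "01": 3, "10": 5}
C = 1  # stands in for the constant c of Lemma 4.2.6

def split_and_decode(mu):
    """Mimic M: strip the unique header mu1, then read mu2 as an index.
    Returns (n, j) with n = |mu2| + U(mu1) - c and j in [1, 2^|mu2|]."""
    for i in range(1, len(mu) + 1):
        if mu[:i] in HEADERS:  # prefix-freeness: at most one hit
            mu2 = mu[i:]
            n = len(mu2) + HEADERS[mu[:i]] - C
            j = 1 + int(mu2, 2) if mu2 else 1
            return n, j
    return None

# header "01" encodes 3; mu2 = "101" has length 3, so n = 3 + 3 - 1 = 5,
# and "101" is read as the index j = 1 + 5 = 6 in [1, 2^3]
assert split_and_decode("01101") == (5, 6)
```

The real machine would then enumerate L_n until its jth element τ appears and output U(τ); here only the header-splitting arithmetic is modeled.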
Lemma 4.2.7. C(σ) = K(σ) − K^{(2)}(σ) ± O(K^{(3)}(σ)).

Proof. Let D(σ) = K(σ) − C(σ) − K^{(2)}(σ). We need to show that |D(σ)| = O(K^{(3)}(σ)). By Lemma 4.2.6, D(σ) ≥ −K^{(3)}(σ) − O(1).

By Lemma 4.2.3, D(σ) ≤ K(C(σ)) − K^{(2)}(σ) + O(1). By the facts about the relationship between K-complexity and addition and subtraction of integers mentioned above,

|K(C(σ)) − K^{(2)}(σ)| ≤ K(|C(σ) − K(σ)|) + O(1) = K(|D(σ) + K^{(2)}(σ)|) + O(1) ≤ K(|D(σ)|) + K^{(3)}(σ) + O(1).

Thus D(σ) ≤ K(|D(σ)|) + K^{(3)}(σ) + O(1).

Putting the two previous paragraphs together, we see that |D(σ)| ≤ K^{(3)}(σ) + K(|D(σ)|) + O(1). For k ranging over the natural numbers, K(k) = o(k), so |D(σ)| − o(|D(σ)|) ≤ K^{(3)}(σ), whence |D(σ)| = O(K^{(3)}(σ)).

We now turn to (4.3), from which (4.4) and (4.1) will follow easily.

Lemma 4.2.8. |K^{(2)}(σ) − C^{(2)}(σ)| = O(K^{(3)}(σ)).

Proof. By Lemma 4.2.7, C(σ) and K(σ) differ by O(K^{(2)}(σ)), and hence their K-complexities differ by O(K^{(3)}(σ)). That is,

K(C(σ)) = K^{(2)}(σ) ± O(K^{(3)}(σ)).    (4.7)
Likewise, (4.7) shows that K^{(2)}(C(σ)) and K^{(3)}(σ) differ by o(K^{(3)}(σ)), whence

K^{(2)}(C(σ)) = O(K^{(3)}(σ)).    (4.8)
Replacing σ by C(σ) in (4.2), we have C^{(2)}(σ) = K(C(σ)) − K^{(2)}(C(σ)) ± O(K^{(3)}(C(σ))). By (4.8), this equation gives us

C^{(2)}(σ) = K(C(σ)) ± O(K^{(3)}(σ)).    (4.9)
Combining this equation with (4.7) establishes the lemma.

Corollary 4.2.9. K^{(3)}(σ) is asymptotically equal to C^{(3)}(σ).

Proof. By Lemma 4.2.8, K^{(2)}(σ) and C^{(2)}(σ) differ by O(K^{(3)}(σ)), so |K^{(3)}(σ) − K(C^{(2)}(σ))| = o(K^{(3)}(σ)), and hence K^{(3)} and KC^{(2)} are asymptotically equal. But K and C are asymptotically equal, so KC^{(2)} and C^{(3)} are asymptotically equal.
Corollary 4.2.10. K(σ) = C(σ) + C^{(2)}(σ) ± O(C^{(3)}(σ)).

Proof. From Lemma 4.2.7 we have K(σ) = C(σ) + K^{(2)}(σ) ± O(K^{(3)}(σ)). Using Lemma 4.2.8 we then have K(σ) = C(σ) + C^{(2)}(σ) ± O(K^{(3)}(σ)). By Corollary 4.2.9, it follows that K(σ) = C(σ) + C^{(2)}(σ) ± O(C^{(3)}(σ)).

This result completes the proof of Theorem 4.2.2. The following corollary of the previous results will be useful below.

Corollary 4.2.11 (Solovay [371]). K(σ) = C(σ) + K(C(σ)) ± O(C^{(3)}(σ)).

Proof. Combine Corollaries 4.2.9 and 4.2.10 with (4.9).

The results above all have relativized forms. For instance, we have the following.

Corollary 4.2.12 (Solovay [371]). K(σ | τ) = C(σ | τ) + C^{(2)}(σ | τ) ± O(C^{(3)}(σ | τ)), C(σ | τ) = K(σ | τ) − K^{(2)}(σ | τ) ± O(K^{(3)}(σ | τ)), and C(σ | τ) ≤ K(σ | τ) + O(1) ≤ C(σ | τ) + K(C(σ | τ)) + O(1).

Another point worth noticing is that the upper bound on the size of L_{K(σ)} given in Corollary 4.2.5 is strict, as shown by the following result.

Proposition 4.2.13 (Solovay [371]). |L_{K(σ)}| ≥ 2^{K(σ)−K^{(2)}(σ)−O(1)}.

Proof. Suppose otherwise. Define the prefix-free machine M as follows. By the recursion theorem, we may assume we know the coding constant c of M. On input μ, search for a splitting μ = μ1μ2 such that U(μ1)↓ and |μ2| = U(μ1) − |μ1| − (c + 1). If such a splitting is found then let k be the element of [1, 2^{|μ2|}] coded by μ2, and search for the kth element enumerated into L_{U(μ1)}. If such an element τ is found, compute U(τ) and output the result, if any.

By assumption, there is a σ such that |L_{K(σ)}| ≤ 2^{K(σ)−K^{(2)}(σ)−(c+1)}. Let τ be a minimal-length U-program for σ and let μ1 be a minimal-length U-program for K(σ) = |τ|. Let μ2 be a string of length K(σ) − K^{(2)}(σ) − (c + 1) coding the position of τ in L_{K(σ)}. Then M(μ1μ2)↓ = σ. But |μ1μ2| = K^{(2)}(σ) + K(σ) − K^{(2)}(σ) − (c + 1) = K(σ) − (c + 1), so K_M(σ) ≤ K(σ) − (c + 1). But c is the coding constant of M, so K(σ) ≤ K(σ) − 1, which is a contradiction.
4.3 Strong K-randomness, C-randomness, and limitations on the results of the previous section

In this section, we will present Solovay’s theorem from [371] that, roughly speaking, every strongly K-random string is C-random, but the converse is not true. One intuitive explanation for this difference between the two notions of randomness is the following. Suppose we know a string σ. If σ is C-random then also knowing C(σ) gives us no additional information, but if σ is strongly K-random then also knowing K(σ) gives us K(|σ|). Thus there is more information in the prefix-free complexity of a strongly K-random string than in the plain complexity of a C-random string. Indeed, this extra amount of information is exploited in the proof of Theorem 4.3.7 below, which has as an immediate corollary that not every C-random string is strongly K-random.

The existence of C-random but not strongly K-random strings will also allow us to show that the main results of the previous section cannot be improved. For instance, in the previous section we established identity (4.1): K(σ) = C(σ) + C^{(2)}(σ) ± O(C^{(3)}(σ)). At first glance, one would expect this identity to be part of an infinite sequence of identities with decreasing O-terms (as in Proposition 4.2.1). However, the methods of this section will show that the identity K(σ) = C(σ) + C^{(2)}(σ) + C^{(3)}(σ) ± O(C^{(4)}(σ)) is not true, and hence (4.1) is as sharp as possible.
4.3.1 Positive results

The following proposition combines some information about plain and prefix-free complexity from Chapter 3.

Proposition 4.3.1. There are constants c_C and c_K such that
(i) C(σ) ≤ |σ| + c_C,
(ii) |{σ ∈ 2^n : C(σ) ≤ n + c_C − j}| = O(2^{n−j}),
(iii) K(σ) ≤ |σ| + K(|σ|) + c_K, and
(iv) |{σ ∈ 2^n : K(σ) ≤ n + K(n) + c_K − j}| = O(2^{n−j}).

Let m_C(σ) = |σ| + c_C − C(σ) and m_K(σ) = |σ| + K(|σ|) + c_K − K(σ). In light of Proposition 4.3.1, we see that these values measure how far the complexity of σ falls from its potential maximum. For either version of randomness, the random strings are those strings σ for which the corresponding m(σ) is small. Recall that, for fixed constants c and d, we say that σ is C-random if C(σ) ≥ |σ| − c and strongly K-random if K(σ) ≥ |σ| + K(|σ|) − d. As we will see, the following theorem implies that every strongly K-random string is C-random for some constant.

Theorem 4.3.2 (Solovay [371]). m_K(σ) ≥ m_C(σ) − O(log m_C(σ)).

Proof. Since C(σ) = |σ| − m_C(σ) + O(1),

K(C(σ)) = K(|σ| − m_C(σ) + O(1)) ≤ K(|σ|) + K(m_C(σ) + O(1)) ≤ K(|σ|) + O(log m_C(σ)).

By Lemma 4.2.3, K(σ) ≤ C(σ) + K(C(σ)) + O(1). Consequently, K(σ) ≤ |σ| − m_C(σ) + K(|σ|) + O(log m_C(σ)). Rearranging this inequality, we get |σ| + K(|σ|) − K(σ) ≥ m_C(σ) − O(log m_C(σ)). Since the left-hand side of this inequality is m_K(σ) (up to the constant c_K), it follows that m_K(σ) ≥ m_C(σ) − O(log m_C(σ)).

Corollary 4.3.3 (Solovay [371]). Every strongly K-random string is C-random for some constant.

Proof. If σ is strongly K-random then m_K(σ) ≤ c for some fixed c (independent of σ). By Theorem 4.3.2, m_C(σ) − d log m_C(σ) ≤ c for some fixed d, which clearly implies that m_C(σ) ≤ c′ for some fixed c′.
4.3.2 Counterexamples

We now turn to counterexamples. We will present Solovay’s construction of an infinite sequence of strings whose plain complexities behave “as differently as possible” from their prefix-free complexities, and examine the consequences of the existence of such sequences.

We begin with a few lemmas that will be useful below. The first says that if τ is C-random given n, then there are many μ’s of length n such that τμ is C-random (though for a different constant).

Lemma 4.3.4 (Solovay [371]). For each c there is a d such that, for all n and τ, if

C(τ | n) ≥ |τ| − c    (4.10)

then

|{μ ∈ 2^n : C(τμ) ≤ |τ| + n − d}| ≤ 2^{n−c}.    (4.11)
Proof. Fix c. We build a machine that, for each d, τ, and n not satisfying (4.11), provides a short C-program for τ given n. The length of this program will depend on d, and we will show that, for large enough d, this program is shorter than |τ| − c, so that any τ and n not satisfying (4.11) for any d also fail to satisfy (4.10).

For each d, m, and n, let S_{d,m,n} be a list of all σ ∈ 2^m such that |{μ ∈ 2^n : C(σμ) ≤ m + n − d}| > 2^{n−c}. Notice that the S_{d,m,n} are uniformly c.e. Furthermore, there are at most 2^{m+n−d+1} many pairs (σ, μ) such that |σ| = m and C(σμ) ≤ m + n − d, whence |S_{d,m,n}| ≤ 2^{m+c−d+1}.

Define the (plain) machine M as follows. On input (σ, n), search for σ1, σ2, σ3 such that σ1, σ2 ∈ dom U and σ = σ1σ2σ3. If such σ_i exist, then let d = U(σ1) and m = U(σ2) + |σ3|. Interpret σ3 as a number j in the interval [1, 2^{|σ3|}] and output the jth element of S_{d,m,n}, if any.

Consider d, n, and τ not satisfying (4.11), and let m = |τ|. Then τ ∈ S_{d,m,n}. Let σ1, σ2, σ3 be strings of minimal length such that U(σ1) = d and U(σ2) = d − c − 1, and σ3 has length m + c − d + 1 and codes the position of τ in S_{d,m,n}. Then M(σ1σ2σ3, n) = τ, so

C(τ | n) ≤ K(d) + K(d − c − 1) + m + c − d + 1 + O(1) ≤ m − d + O(log d) = |τ| − d + O(log d),

where the O constant does not depend on τ or n. Choosing d large enough, we see that if n and τ do not satisfy (4.11) then C(τ | n) < |τ| − c, so n and τ do not satisfy (4.10).

Combining Lemma 4.3.4 with part (iv) of Proposition 4.3.1, we have the following corollary, which says that if τ is C-random given n, then, since there are many μ’s of length n such that τμ is C-random, there is such a μ that is strongly K-random.

Corollary 4.3.5 (Solovay [371]). For each c there is a d such that, for all n and τ, if C(τ | n) ≥ |τ| − c then there is a μ ∈ 2^n with C(τμ) ≥ |τ| + n − d and K(μ) ≥ n + K(n) − d.
The next lemma says that, given any n, we can find an m such that the prefix-free complexity of the strongly K-random strings of length m is close to n.

Lemma 4.3.6 (Solovay [371]). There is a c such that for each n there is an m with |m + K(m) − n| ≤ c.

Proof. If d is large enough then |K(m + d) − K(m)| ≤ K(d) + O(1) < d. Let f(m) = m + K(m). Then f(m + d) − f(m) = m + d + K(m + d) − m − K(m) = d + K(m + d) − K(m), so 0 < f(m + d) − f(m) < 2d. Let c = 2d and, given n, choose m such that |f(m) − n| is minimal. Then |f(m) − n| ≤ c, since otherwise one of f(m + d) or f(m − d) would be closer to n than f(m).

We are now ready to prove the main theorem of this section.
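The proof of Lemma 4.3.6 uses only that f(m) = m + K(m) is increasing with gaps bounded by a constant. Since K itself is uncomputable, the snippet below (ours) illustrates the counting with a computable, slowly varying stand-in g for K; every target n then lands within a small constant of some value f(m):

```python
from math import ceil, log2

def g(m):
    """A computable, slowly varying stand-in for K(m): for large d,
    |g(m + d) - g(m)| < d, which is all the lemma's argument uses."""
    return ceil(log2(m + 2))

def f(m):
    return m + g(m)

def best_gap(n, search=200):
    """Distance from n to the nearest value of f."""
    return min(abs(f(m) - n) for m in range(search))

# f increases in steps of 1 or 2, so no target is ever far from its range.
assert all(best_gap(n) <= 2 for n in range(5, 100))
```

The real lemma is the same argument with K in place of g, except that the step bound comes from |K(m + d) − K(m)| ≤ K(d) + O(1) < d.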
Theorem 4.3.7 (Solovay [371]). There is an infinite sequence of strings ν_i such that
(i) lim_i |ν_i| = ∞,
(ii) C(ν_i) = |ν_i| ± O(1), and
(iii) lim_i m_K(ν_i)/log^{(2)}|ν_i| = 1.
Before proceeding with the proof, we note the following consequence. By item (ii) in Theorem 4.3.7, each ν_i is C-random. By item (iii), ν_i is not strongly K-random for large enough i. Hence the converse to Corollary 4.3.3 does not hold.

Corollary 4.3.8 (Solovay [371]). There is a c such that, for all d, there are infinitely many strings that are C-random with constant c but not strongly K-random with constant d.

Proof of Theorem 4.3.7. The idea of the proof is to find τ_i and n_i such that τ_i is hard to describe given n_i, but easy given K(n_i). Then Corollary 4.3.5 implies that there is a μ_i of length n_i that is random enough so that from K(μ_i) we can compute K(n_i) (so τ_i is easy to describe given K(μ_i)), but at the same time τ_iμ_i is C-random. Letting ν_i = τ_iμ_i, we have that K(ν_i) is not much larger than K(μ_i), but C(ν_i) is as large as possible. By carefully choosing τ_i and n_i so that ν_i is sufficiently longer than μ_i, we will be able to satisfy the requirements of the theorem.

We now proceed with the details of the proof. Recall that σ* is the first U-program of length K(σ) to converge with output σ. Recall also that by log n we mean the base 2 logarithm of n, rounded up to the nearest integer.

By Theorem 3.14.2, we can select n_0, n_1, ... such that log n_i = 2^{2^i} and K(n_i* | n_i) ≥ 2^i − O(i). By Proposition 3.14.1,

K(n_i* | n_i) = K(K(n_i) | n_i) ± O(1) ≤ K^{(2)}(n_i) + O(1) ≤ log 2^{2^i} + O(log^{(2)} 2^{2^i}) = 2^i + O(i),

so K(n_i* | n_i) = 2^i ± O(i). Thus K^{(2)}(n_i* | n_i) = O(i) (and hence, of course, K^{(3)}(n_i* | n_i) = o(i)), so by Corollary 4.2.12,

C(n_i* | n_i) = K(n_i* | n_i) − K^{(2)}(n_i* | n_i) ± O(K^{(3)}(n_i* | n_i)) = 2^i ± O(i).

Let τ_i be the first to halt among the minimal-length V-programs for n_i* given n_i. (That is, τ_i is (n_i*)*_C, where the second star is defined with respect to V^{n_i}.) Then |τ_i| = C(n_i* | n_i) = 2^i ± O(i). Also, by the minimality of τ_i, C(τ_i | n_i) = |τ_i| ± O(1). (That is, |τ_i| = C(n_i* | n_i) ≤ C(τ_i | n_i) + O(1) ≤ |τ_i| + O(1). Thus it follows that τ_i is hard to describe given n_i, but as we will see below, τ_i is relatively easy to describe given n_i*, and hence given n_i and K(n_i).)

By Corollary 4.3.5, there is a μ_i ∈ 2^{n_i} such that C(τ_iμ_i) = |τ_i| + n_i ± O(1) and K(μ_i) = n_i + K(n_i) ± O(1). Let ν_i = τ_iμ_i. Then lim_i |ν_i| = ∞, and
each ν_i is C-random, since C(ν_i) = |τ_i| + n_i ± O(1) = |ν_i| ± O(1). So we are left with showing that lim_i m_K(ν_i)/log^{(2)}|ν_i| = 1.

We begin by computing K(|ν_i|). First, K(|ν_i|) = K(|τ_i| + n_i ± O(1)) = K(n_i) ± O(K(|τ_i|)). Now, K(|τ_i|) = K(2^i ± O(i)) = K(2^i) ± O(log i) = O(log i), since K(2^i) = K(i) ± O(1) = O(log i). Therefore, K(|ν_i|) = K(n_i) ± O(log i).

We now need an upper bound on K(ν_i). If we have a U-program σ for μ_i and a U-program σ′ for τ_i given σ, then we can easily obtain a U-program for ν_i = τ_iμ_i of length roughly |σσ′|. Thus, K(ν_i) ≤ K(μ_i) + K(τ_i | μ_i*) + O(1) = n_i + K(n_i) + K(τ_i | μ_i*) ± O(1). So we will obtain our bound on K(ν_i) by bounding K(τ_i | μ_i*).

Now, n_i = |μ_i|, so K(n_i | μ_i*) = O(1). Since |μ_i*| = K(μ_i) = n_i + K(n_i) ± O(1), we have K(K(n_i) | μ_i*) = O(1). By Proposition 3.14.1, K(n_i* | n_i, K(n_i)) = O(1). Thus, K(n_i* | μ_i*) ≤ K(K(n_i) | μ_i*) + K(n_i | μ_i*) + K(n_i* | n_i, K(n_i)) + O(1) = O(1). So K(τ_i | μ_i*) ≤ K(τ_i | n_i*) + K(n_i* | μ_i*) + O(1) = K(τ_i | n_i*) + O(1).

Recall that τ_i is the first to halt among the minimal-length V-programs for n_i* given n_i. By the relativized version of Proposition 3.2.1, K(τ_i | n_i*) ≤ K(|τ_i| | n_i*) + O(1) = O(log |τ_i|) = O(i), so K(τ_i | μ_i*) = O(i).

Putting the last three paragraphs together, we have K(ν_i) ≤ n_i + K(n_i) + O(i). Together with the computation of K(|ν_i|) above, this bound implies that

m_K(ν_i) = |ν_i| + K(|ν_i|) − K(ν_i) ≥ 2^i + n_i + K(n_i) − O(i) − [n_i + K(n_i) + O(i)] = 2^i − O(i) = log^{(2)}|ν_i| − O(log^{(3)}|ν_i|).

(The last equality holds because |ν_i| = |μ_i| + |τ_i| = n_i + 2^i ± O(i), and log n_i = 2^{2^i}.) Thus

lim inf_i m_K(ν_i)/log^{(2)}|ν_i| ≥ 1.    (4.12)
On the other hand, it is easy to get an upper bound on m_K(ν_i) using the results of Section 4.2. Recall that C(ν_i) = |ν_i| ± O(1). Also, by Corollary 4.2.11, K(ν_i) = C(ν_i) + K(C(ν_i)) ± O(C^{(3)}(ν_i)), so

m_K(ν_i) = |ν_i| + K(|ν_i|) − K(ν_i) = C(ν_i) + K(C(ν_i)) ± O(1) − [C(ν_i) + K(C(ν_i)) ± O(C^{(3)}(ν_i))] = O(C^{(3)}(ν_i)) = O(log^{(2)}|ν_i|).

Thus m_K(ν_i) = O(log^{(2)}|ν_i|), but we are not quite done because of the multiplicative constant that might be hidden in the O-term. The following method of finishing the proof was suggested by Joe Miller.

Lemma 4.3.9. K(|ν_i|) ≤ K^{(2)}(ν_i) + K(m_K(ν_i)) + O(1).

Proof. Assume that c_i = K(|ν_i|) − K^{(2)}(ν_i) − K(m_K(ν_i)) is positive, since otherwise we are done. From minimal-length U-programs for K(ν_i), for m_K(ν_i), and for c_i, we can compute K(|ν_i|), and hence can compute |ν_i| = m_K(ν_i) + K(ν_i) − K(|ν_i|). Thus, K(|ν_i|) ≤ K^{(2)}(ν_i) + K(m_K(ν_i)) + K(c_i) + O(1) = K(|ν_i|) − c_i + K(c_i) + O(1). So c_i ≤ K(c_i) + O(1), whence c_i = O(1).

To finish the proof of the theorem given this lemma, we have the following inequality.

m_K(ν_i) = |ν_i| + K(|ν_i|) − K(ν_i)
= C(ν_i) + K(|ν_i|) − K(ν_i) ± O(1)
≤ K(|ν_i|) − K^{(2)}(ν_i) + K^{(3)}(ν_i) + O(1)    by Lemma 4.2.6
≤ K(m_K(ν_i)) + K^{(3)}(ν_i) + O(1)    by Lemma 4.3.9
≤ K(m_K(ν_i)) + log^{(2)}|ν_i| + o(log^{(2)}|ν_i|).
By the crude bound m_K(ν_i) = O(log^{(2)}|ν_i|) established above, we have K(m_K(ν_i)) = o(log^{(2)}|ν_i|), so m_K(ν_i) ≤ log^{(2)}|ν_i| + o(log^{(2)}|ν_i|), whence

lim sup_i m_K(ν_i)/log^{(2)}|ν_i| ≤ 1.

Together with (4.12), this inequality implies that lim_i m_K(ν_i)/log^{(2)}|ν_i| = 1, completing the proof of the theorem.
The following theorem, whose proof is based on Theorem 4.3.7, has as a corollary that the identities (4.1) and (4.2) cannot be improved, and the inequality (4.5) cannot be reversed.

Theorem 4.3.10 (Solovay [371]). There are infinite sequences of strings ν_i, τ_i, and μ_i such that
(i) lim_i |ν_i| = ∞,
(ii) C(τ_i) = C(ν_i) + O(1),
(iii) lim_i (K(τ_i) − K(ν_i))/log^{(2)}|ν_i| = 1,
(iv) K(μ_i) = K(ν_i) + O(1), and
(v) lim_i (C(ν_i) − C(μ_i))/log^{(2)}|ν_i| = 1.
Proof. Let ν_i be as in Theorem 4.3.7. For each i, let τ_i be a strongly K-random string of length |ν_i|. By Corollary 4.3.3, C(τ_i) = |ν_i| ± O(1) = C(ν_i) ± O(1). The choice of τ_i implies that K(τ_i) = |ν_i| + K(|ν_i|) ± O(1), so

K(τ_i) − K(ν_i) = |ν_i| + K(|ν_i|) − [|ν_i| + K(|ν_i|) − m_K(ν_i)] ± O(1) = m_K(ν_i) ± O(1).

By Theorem 4.3.7, lim_i (K(τ_i) − K(ν_i))/log^{(2)}|ν_i| = lim_i m_K(ν_i)/log^{(2)}|ν_i| = 1.

By Lemma 4.3.6, there exist m_i such that |m_i + K(m_i) − K(ν_i)| = O(1). For each i, let μ_i be a strongly K-random string of length m_i. Then K(μ_i) = m_i + K(m_i) ± O(1) = K(ν_i) ± O(1). Furthermore, by Corollary 4.3.3, C(μ_i) = m_i ± O(1), and by Theorem 4.3.7, C(ν_i) = |ν_i| ± O(1). Thus, to establish item (v), it is enough to show that lim_i (|ν_i| − m_i)/log^{(2)}|ν_i| = 1.

Let r_i = |ν_i| − m_i. Then K(m_i) = K(|ν_i|) ± O(log r_i), so

r_i ± O(log r_i) = |ν_i| + K(|ν_i|) − (m_i + K(m_i)) = |ν_i| + K(|ν_i|) − K(ν_i) ± O(1) = m_K(ν_i) ± O(1).

By Theorem 4.3.7, lim_i r_i/log^{(2)}|ν_i| = lim_i m_K(ν_i)/log^{(2)}|ν_i| = 1.
Corollary 4.3.11 (Solovay [371]). None of the following relations hold in general.

K(σ) = C(σ) + C^{(2)}(σ) + C^{(3)}(σ) ± O(C^{(4)}(σ)).
C(σ) = K(σ) − K^{(2)}(σ) + K^{(3)}(σ) ± O(K^{(4)}(σ)).
K(σ) = C(σ) + K(C(σ)) ± O(1).

Proof. Let ν_i, τ_i, and μ_i be as in Theorem 4.3.10. If the first equation were true then, since C(ν_i) = C(τ_i) ± O(1), we would have K(ν_i) − K(τ_i) = O(C^{(4)}(ν_i)) = O(log^{(3)}|ν_i|). But then lim_i (K(ν_i) − K(τ_i))/log^{(2)}|ν_i| = 0, contradicting Theorem 4.3.10.

The argument for the second equation is the same, with μ_i in place of τ_i and K and C interchanged. Similarly, if the third equation were true then we would have K(ν_i) − K(τ_i) = O(1), which would lead to the same contradiction as before.
4.4 Muchnik’s Theorem on C and K

In [288], Muchnik proved the following interesting result on the relationship between prefix-free complexity and plain complexity. The proof we give, which is based on Solovay’s results discussed above, is due to Miller [272].

Theorem 4.4.1 (Muchnik, see [288]). For each d there are strings σ and τ such that K(σ) > K(τ) + d and C(τ) > C(σ) + d.

Proof. Let {ν_i}_{i∈ω} be the sequence constructed in Theorem 4.3.7. For each i, take ρ_i to be a strongly K-random string of length |ν_i| − log^{(2)}|ν_i|/2. (Here we slightly abuse notation by writing log^{(2)}|ν_i|/2 to mean ⌈log^{(2)}|ν_i|/2⌉.) First note that

K(ρ_i) − K(ν_i) = |ρ_i| + K(|ρ_i|) − K(ν_i) ± O(1)
= |ν_i| − log^{(2)}|ν_i|/2 + K(|ν_i| − log^{(2)}|ν_i|/2) − K(ν_i) ± O(1)
= |ν_i| − log^{(2)}|ν_i|/2 + K(|ν_i|) − K(ν_i) ± O(log^{(3)}|ν_i|)
= m_K(ν_i) − log^{(2)}|ν_i|/2 ± O(log^{(3)}|ν_i|).

Therefore,

lim_i (K(ρ_i) − K(ν_i))/log^{(2)}|ν_i| = lim_i (m_K(ν_i) − log^{(2)}|ν_i|/2)/log^{(2)}|ν_i| = 1 − 1/2 = 1/2.

By Corollary 4.3.3, ρ_i is C-random, so C(ν_i) − C(ρ_i) = |ν_i| − |ρ_i| ± O(1) = log^{(2)}|ν_i|/2 ± O(1), and hence

lim_i (C(ν_i) − C(ρ_i))/log^{(2)}|ν_i| = lim_i (log^{(2)}|ν_i|/2 ± O(1))/log^{(2)}|ν_i| = 1/2.

So for any d we can take i large enough so that K(ρ_i) − K(ν_i) ≥ d and C(ν_i) − C(ρ_i) ≥ d.

Miller’s methods allow an additional improvement to the previous result, namely that we can choose σ and τ to have the same length.

Theorem 4.4.2 (Miller [272]). For each d there are strings σ and τ of the same length such that K(σ) ≥ K(τ) + d and C(τ) ≥ C(σ) + d.

Proof. We extend the proof above. For each i, define τ_i = ρ_i0^{log^{(2)}|ν_i|/2}. Then |τ_i| = |ν_i|. Furthermore, K(τ_i) = K(ρ_i) ± O(log^{(3)}|ν_i|) and C(τ_i) = C(ρ_i) ± O(log^{(3)}|ν_i|). Thus

lim_i (K(τ_i) − K(ν_i))/log^{(2)}|ν_i| = lim_i (C(ν_i) − C(τ_i))/log^{(2)}|ν_i| = 1/2.
As above, for any d we can take i large enough so that K(τ_i) − K(ν_i) ≥ d and C(ν_i) − C(τ_i) ≥ d.

Miller observed that the estimates used in both proofs can be improved. It is not hard to show that K(|ν_i| − log^{(2)}|ν_i|/2) = K(|ν_i|) ± O(1), and from this fact, that K(τ_i) = K(ρ_i) ± O(1) and C(τ_i) = C(ρ_i) ± O(1). The weaker estimates above were used because they are sufficient for our purposes and require no explanation.
4.5 Monotone complexity and KM-complexity

As mentioned in Section 3.16, it was a long-standing open question whether KM and Km are the same. We saw in that section that KM(σ) ≤ Km(σ) + O(1). Levin [242] conjectured that in fact Km(σ) = KM(σ) ± O(1). This conjecture was finally refuted by Gács [166].

Theorem 4.5.1 (Gács [166]). There exists a function f with lim_n f(n) = ∞ such that Km(σ) − KM(σ) ≥ f(|σ|) for infinitely many σ. Indeed, we may choose f to be the inverse of Ackermann’s function.

Although we will not discuss reducibilities arising from measures of complexity until later, the following corollary to Gács’ Theorem is worth noting here. A natural question given Gács’ Theorem is whether the corresponding reducibilities are different. That is, let A ≤_KM B if KM(A↾n) ≤ KM(B↾n) + O(1), and similarly for ≤_Km (Km-reducibility is also known as monotone reducibility). Then is ≤_KM different from ≤_Km? Miller observed that this question is easily settled using Gács’ Theorem.

Corollary 4.5.2 (Miller [personal communication]). The reducibilities ≤_KM and ≤_Km are different.

Proof. We define A = ⋃_s α_s in stages, with each α_s a finite string. Let α_0 be such that Km(α_0) − KM(α_0) > 1. Now suppose we have defined α_s. For a string τ of length at least |α_s| + 1, let τ̃ be obtained by replacing the initial segment of τ of length |α_s| by α_s. Then it is easy to check that Km(τ̃) = Km(τ) ± O(1) and KM(τ̃) = KM(τ) ± O(1), where the constants depend only on α_s. So we can choose a τ such that Km(τ̃) − KM(τ̃) > s + 1 and let α_{s+1} = τ̃.

Returning to Theorem 4.5.1, we see that, while Gács proved that KM and Km do not agree within an additive constant, it seemed possible that the difference might be very slight. As we will discuss in this section, Adam Day [87] established a strong extension of Gács’ Theorem with a proof significantly extending and clarifying the original. How different can these two complexities be?
The following theorem gives an upper bound on their difference.
Theorem 4.5.3 (Gács [166], Uspensky and Shen [396]). K(σ) ≤ KM(σ) + K(|σ|) + O(1). Thus, for any k ≥ 1 and ε > 0,

Km(σ) ≤ KM(σ) + K(|σ|) + O(1) ≤ KM(σ) + log|σ| + log log|σ| + ··· + (1 + ε)log^{(k)}|σ| + O(1).

Proof. For each n, we have ∑_{σ∈2^n} 2^{−KM(σ)} ≤ 1, by the definition of KM. Thus

∑_σ 2^{−KM(σ)−K(|σ|)} = ∑_n 2^{−K(n)} ∑_{σ∈2^n} 2^{−KM(σ)} ≤ ∑_n 2^{−K(n)} < 1.

So {(k + 1, σ) : σ ∈ 2^{<ω} ∧ k ≥ KM(σ) + K(|σ|)} is a KC set.

Given the upper bound provided by Theorem 4.5.3, the following result, which brings Gács’ inverse Ackermann function up to two logs, is not far from optimal. (Indeed, since the upper bound in Theorem 4.5.3 holds for K as well as Km, it would be surprising if it were tight, as that would imply that, in a sense, a prefix-free machine is just as good at approximating KM as a monotone machine.)

Theorem 4.5.4 (Day [87]). Let c < 1/2. Then there exist infinitely many σ such that Km(σ) > KM(σ) + c log log|σ|.

We follow Day’s presentation of his complex proof, which is included with his permission. Let M be a fixed universal monotone machine. We think of this machine as given by axioms of the form (σ, τ), and write (σ, τ) ∈ M to mean that on input σ, the machine M writes at least τ on its output tape. In this section, we write σ ≈ τ to mean that σ ⪯ τ or τ ⪯ σ. In this case, we say that σ and τ are comparable, and otherwise that they are incomparable. (We have been using the term “incompatible” for this concept, but want to avoid confusion with the concept of “M-incompatible” strings introduced below.)

To show that the above theorem holds, we will develop a proof that monotonic descriptional complexity and monotonic algorithmic probability do not agree to within an additive constant. The improvement in the lower bound follows from the construction used in the proof. This proof builds on the original work of Gács but is more intricate. A main point of difference is that the algorithm at the heart of the new proof works on sets of strings as opposed to individual strings. It is fair to say that Gács’ Theorem is not well understood. We hope that the new proof, as well as improving the bound, clarifies the main reasons why Km and KM are different.

From now on we will refer to a c.e. continuous semimeasure as just a c.e. semimeasure. We will also drop the word monotonic and just talk about descriptional complexity and algorithmic probability.
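For concreteness, the comparability relation σ ≈ τ used in Day's proof is just the prefix relation taken in both directions; a one-line Python version (ours):

```python
def comparable(s, t):
    """sigma ~ tau iff one string is an initial segment of the other."""
    return s.startswith(t) or t.startswith(s)

assert comparable("010", "01011")    # 010 is an initial segment of 01011
assert not comparable("010", "011")  # the strings branch at the third bit
```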
We know that MM (as defined in Theorem 3.16.2) majorizes all c.e. semimeasures. So, if it were true that Km(σ) = KM (σ) + O(1), then for any c.e. semimeasure m that we could construct, there would be some constant c such that
4.5. Monotone complexity and KM -complexity
171
Km(σ) ≤ − log(m(σ)) + c.

The general approach of the proof will be to build a c.e. semimeasure for which this is false, which means constructing a c.e. semimeasure m with the property that for all c there is a σ such that Km(σ) > − log(m(σ)) + c. Of course, we want to do more, and prove Theorem 4.5.4. This result will come out of the construction used in the proof. For now we will focus just on showing that Km and KM are different. We will leave the analysis of the size of the difference until later.

Another way to look at this problem is to consider whether for any n, we can construct a semimeasure m such that m(λ) ≤ 2^{−n} and there exists a σ such that Km(σ) > − log(m(σ)). Viewing the problem this way helps simplify the proof a little, because we can build a c.e. semimeasure a(σ) such that for all i,
1. a(0^i 1) ≤ 2^{−2i−1},
2. a(0^i) = ∑_j a(0^{i+j} 1), and
3. there exists a σ such that Km(0^i 1σ) > − log(a(0^i 1σ)).
Then we can scale the semimeasure a to construct a new semimeasure m with the desired properties.

Proposition 4.5.5. If a semimeasure a with the above properties exists, then Theorem 4.5.1 holds.

Proof. From a we can define a c.e. semimeasure m by letting m(0^i 1σ) = 2^i a(0^i 1σ) and m(0^i) = ∑_j m(0^{i+j} 1). Then m(σ) ≥ m(σ0) + m(σ1) for all σ, and

m(λ) = ∑_i m(0^i 1) = ∑_i 2^i a(0^i 1) ≤ ∑_i 2^i 2^{−2i−1} ≤ 1,

so m is a c.e. semimeasure. As MM majorizes all c.e. semimeasures, there is some d such that MM(σ) ≥ 2^{−d} m(σ). Thus KM(σ) ≤ − log(2^{−d} m(σ)) = − log(m(σ)) + d. Now given any c, let i ≥ c + d. By our hypothesis on a, there is some σ such that

Km(0^i 1σ) > − log(a(0^i 1σ)) = − log(2^{−i} m(0^i 1σ)) = − log(m(0^i 1σ)) + i ≥ − log(m(0^i 1σ)) + c + d ≥ KM(0^i 1σ) + c.
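The arithmetic behind this scaling can be checked mechanically: the allowance a(0^i 1) ≤ 2^{−2i−1} leaves exactly enough room for the factors 2^i. A minimal sketch (the helper name is ours, not from the text), using exact rationals:

```python
from fractions import Fraction

def mass_bound(n):
    """Partial sum of sum_i 2^i * 2^(-2i-1): the total mass m(lambda)
    when every branch uses its full allowance a(0^i 1) = 2^(-2i-1)."""
    return sum(Fraction(2**i, 2**(2*i + 1)) for i in range(n))

# Each term equals 2^(-i-1), so the partial sums approach 1 from below:
# the scaled semimeasure m never exceeds the available unit of measure.
print(mass_bound(10))        # 1023/1024
print(mass_bound(10) <= 1)   # True
```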
Note that we could use the recursion theorem to determine the constant d in the above proof. However, doing so is not necessary for our construction. Since we want a to be c.e., we can construct it in stages depending on some computable enumeration of the universal monotone machine M. We will start with a(σ, 0) = 0 for all σ. At each stage t+1, we have the option of
choosing a single string ρ and a positive integer x such that a(λ, t) + 2^{−x} ≤ 1, and defining a(σ, t + 1) as follows:

a(σ, t + 1) = a(σ, t) + 2^{−x} if σ ⪯ ρ, and a(σ, t + 1) = a(σ, t) otherwise.

In the rest of the proof we will write simply a(ρ, t + 1) = a(ρ, t) + 2^{−x} to refer to this process. If we select ρ at stage t + 1 and a(ρ, t) = 0, we will simplify the notation further and write a(ρ, t + 1) = 2^{−x}. If no such ρ is chosen for a stage t + 1 then this simply means that a(σ, t + 1) = a(σ, t) for all σ. Our objective is to construct a c.e. semimeasure a that meets the following requirements for all i.

R_i : a(0^i 1) ≤ 2^{−2i−1} and there is a σ s.t. Km(0^i 1σ) > − log(a(0^i 1σ)).

For most of this proof we will focus on meeting a single requirement R_i for some arbitrary i. Once we can achieve a single requirement, it will be relatively easy to meet all requirements. A natural way to view this construction is as a game between us constructing the semimeasure and an opponent constructing the universal monotone machine. For example, to meet the requirement that there is some string 1σ such that Km(1σ) > − log(a(1σ)), we could choose a string σ and set a(1σ, 1) = 2^{−r}. Then we could wait and see whether a pair (τ, 1σ′) is enumerated into M where |τ| ≤ r and σ ⪯ σ′. If such a pair is not enumerated into M then the requirement is met because Km(1σ) > r = − log(a(1σ)). If such a pair is enumerated into M then we can make another change to a at some future stage. We need to develop a construction of a such that no matter how the opponent enumerates M, the requirements on a are met. We have only a finite amount of measure that we can use, and similarly, the opponent has only a finite number of descriptions of any given length. Our goal is to make the opponent run out of suitable descriptions before we run out of measure.

There are some fundamental concepts used by Gács in his proof of Theorem 4.5.1 that we will make use of. We will introduce one of these with the following example.

Example 4.5.6.
We choose a string σ and set a(σ, 1) = 2^{−4}. Now the opponent must respond by ensuring that Km(σ) ≤ 4. To do so, the opponent must enumerate some pair (τ, σ′) into M, where |τ| ≤ 4 and σ ⪯ σ′. Let us suppose that the opponent enumerates (0000, σ) into M. Then either it is possible for the opponent to add (000, σ) into M or not. If not, we can set a(σ, 2) = a(σ, 1) + 2^{−4} = 2^{−3}. Now, as it is not possible for the opponent to add (000, σ) to M, the opponent must find a new string, say 001, and add (001, σ) to M. However, in this case, the opponent has used both 0000 and 001 to describe σ, a waste of resources. Alternatively, assume that the opponent can enumerate (000, σ) into M. We can think of the opponent as
holding 000 as a description in reserve for σ. Now we make this reservation useless, by
1. never again adding measure to a string comparable with σ, and
2. using only units of measure of size at least 2^{−3} from now on.
As we are using increments of measure of size 2^{−3} or more, the opponent must always respond with strings of length at most 3. However, the opponent cannot use 000 because 000 can be used only to describe a string comparable with σ. Hence, if we never increase the measure on a string comparable with σ, the opponent can never use 000. In this case, 0001 is a gap in the domain of M, as it cannot be used to respond to any of our increments in measure, which again is a waste of the opponent's resources.

A proof of Theorem 4.5.1 can be developed by generalizing the approach of this example. A first step toward this generalization is the following. We take a string σ. We have two integer parameters r, g with 0 < r < g. We want to increase the measure on σ in increments of 2^{−g}. Each time we increase the measure we want the opponent to respond. To do so, we do not increase the measure on σ directly but on some extension of σ. Let Ξ = {στ : τ ∈ 2^{g−r}}. We will take some ξ ∈ Ξ and increase the measure on ξ instead of on σ. A first attempt at the algorithm we need is as follows.

Algorithm 1. Let t be the current stage in the enumeration of M. Choose some ξ ∈ Ξ that has not been used before (so that a(ξ, t) = 0). If there is some description of a string σ′ with σ ⪯ σ′ in the domain of M_t that can be shortened to a description of length r, then terminate the algorithm. Otherwise, increase the measure on ξ by 2^{−g}, and wait until a stage t′ at which the opponent describes ξ with a string of length at most g. If we have not used all the elements of Ξ, then let t = t′ and repeat. If we have used all the elements of Ξ, then the measure on σ must be |Ξ|2^{−g} = 2^{−r}, so wait until the opponent uses a string of length r to describe σ.
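The measure bookkeeping in Algorithm 1 can be verified directly: the 2^{g−r} strings in Ξ, each given measure 2^{−g}, push exactly 2^{−r} onto σ once the loop runs to completion. A small sketch (the helper names are ours, not from the text):

```python
from fractions import Fraction
from itertools import product

def xi_set(sigma, r, g):
    """Xi = {sigma*tau : tau in 2^(g-r)}: the extensions of sigma on which
    Algorithm 1 places its increments of measure."""
    return [sigma + ''.join(bits) for bits in product('01', repeat=g - r)]

r, g = 2, 5
xi = xi_set('01', r, g)
# |Xi| = 2^(g-r) extensions, each eventually assigned measure 2^-g, so
# the measure accumulated on sigma is |Xi| * 2^-g = 2^-r.
total = len(xi) * Fraction(1, 2**g)
print(len(xi), total)   # 8 1/4
```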
If we apply this algorithm, then the opponent must either use a string of length r to describe σ, or have some string that describes a string comparable with σ that can be shortened to a string of length r. The crucial fact is that, at some point, we increased the measure on σ by just 2^{−g}, and the opponent had to respond by finding a new string of length r. The string is new in the sense that it is not just some initial segment of a string already used. As it stands, this algorithm can make only a small gap between Km and KM. The hard part is amplifying this gap. Amplification can be done by using the algorithm recursively. In the algorithm above, we act by setting
a(ξ, t) = 2^{−g} for some ξ at some stage t. However, rather than just directly increasing the measure on ξ, we could instead run the algorithm again on some extension of ξ. This action would have the effect of increasing the measure on ξ while forcing the opponent to waste even more resources. However, to implement this strategy, a number of issues need to be dealt with:
1. We want to make sure our algorithm increases the measure enough. Algorithm 1 will increase the measure on σ somewhere between 2^{−g} and 2^{−r}, which is too great a range.
2. We need to ensure that any gaps the algorithm makes in the domain of M are never used again. The algorithm above does not address this.
3. Each time we use an extension of a string, we go deeper into the tree of binary strings. We would like to minimize the depth that we go to in order to get the best possible lower bound on the difference between Km and KM.
The concepts in Example 4.5.6 and Algorithm 1 were used by Gács in his original proof of Theorem 4.5.1.¹ The approach of this proof is still based on these ideas but differs from that of Gács in how the problems listed above are addressed. The algorithm that we will present will make decisions based on the domain of M. The question of whether there is a description of σ that can be shortened will be crucial (e.g., whether or not 0000 could be shortened to 000 in Example 4.5.6). First we will define certain subsets of the domain of M. We will prove some technical lemmas about how these sets change as the parameters defining them change. These lemmas will be crucial to the verification of the proof. Then we will present the algorithm that establishes a gap between Km and KM. After that, we will verify that the algorithm works. Finally, we will analyze the algorithm to show how it improves the lower bound.

At any stage t, the approximation M_t tells us the options the opponent has left. In particular, M_t tells us what the opponent can and cannot use certain strings for.
We need a more exact notion for the idea of the opponent shortening an existing description. In his original proof, Gács came up with the notion of a string τ being reserved as a description for σ. The idea is not just that the opponent can use τ to describe σ, but additionally that the opponent cannot use τ to describe a string incomparable with σ. For τ to be reserved for σ, the opponent must have already used a string comparable with τ to describe an extension of σ.

¹ At the time of writing, an extended version of the original proof is available from Gács' homepage.
There are really two ideas in this definition that are worth separating out and making more explicit. First we will say that a string τ is fragmented by σ if some string comparable to τ describes an extension of σ, which means the opponent cannot use τ to describe a string incomparable with σ (as otherwise M would not be a monotone machine). Second, a string τ is M-incompatible with σ if some string comparable with τ describes a string incomparable with σ. This means that τ cannot be used to describe σ. Hence another way of saying a string τ is reserved for σ is that τ is fragmented by σ but τ is not M-incompatible with σ. As we are not so much interested in particular descriptions, but descriptions of a certain length, we will make the following definitions.

Definition 4.5.7. (i) The strings of length r fragmented by σ at stage t are those in the set

F_r(σ, t) = {τ ∈ 2^r : ∃(τ′, σ′) ∈ M_t (τ′ ≈ τ ∧ σ ⪯ σ′)}.

(ii) The strings of length r M-incompatible with σ at stage t are those in the set

I_r(σ, t) = {τ ∈ 2^r : ∃(τ′, σ′) ∈ M_t (τ′ ≈ τ ∧ σ | σ′)}.

(iii) The strings of length r reserved for σ at stage t are those in the set

R_r(σ, t) = F_r(σ, t) \ I_r(σ, t).

Example 4.5.8. Suppose that M_t = {(00, 11), (000, 1), (000, 11), (001, 11), (011, 01), (100, 01), (110, 011), (111, 1)}. If we consider the strings of length 2, working through the above definitions establishes that F_2(01, t) = {01, 10, 11}, that I_2(01, t) = {00, 11}, and that R_2(01, t) = {01, 10}.

The definitions given for F_r and I_r can be extended to sets of strings.

Definition 4.5.9. If Σ is a set of finite strings, then
(i) F_r(Σ, t) = ⋃_{σ∈Σ} F_r(σ, t) and
(ii) I_r(Σ, t) = ⋂_{σ∈Σ} I_r(σ, t).

We will explain later why we take the intersection in the definition of I_r(Σ, t). The following lemmas show how the sets F_r(σ, t), I_r(σ, t), F_r(Σ, t), and I_r(Σ, t) change as the parameters that define them vary.

Lemma 4.5.10. (i) If r_0 ≤ r_1 then F_{r_0}(σ, t) ⊇ F_{r_1}(σ, t) and I_{r_0}(σ, t) ⊇ I_{r_1}(σ, t).
(ii) If σ_0 ⪯ σ_1 then F_r(σ_0, t) ⊇ F_r(σ_1, t) and I_r(σ_0, t) ⊆ I_r(σ_1, t).
(iii) If t_0 ≤ t_1 then F_r(σ, t_0) ⊆ F_r(σ, t_1) and I_r(σ, t_0) ⊆ I_r(σ, t_1).
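The sets in Definition 4.5.7 are finite, so Example 4.5.8 can be checked mechanically from the axiom set. A minimal sketch (the function names are ours, not from the text), representing a machine as a finite set of axiom pairs:

```python
def comparable(a, b):
    # a is comparable with b (a ≈ b) if one is an initial segment of the other
    return a.startswith(b) or b.startswith(a)

def strings(r):
    # all binary strings of length r
    return [format(i, '0%db' % r) for i in range(2**r)] if r else ['']

def F(M, sigma, r):
    """F_r(sigma, t): strings of length r fragmented by sigma, i.e. some
    axiom (t2, s2) in M_t has t2 comparable with tau and s2 extending sigma."""
    return {tau for tau in strings(r)
            if any(comparable(t2, tau) and s2.startswith(sigma) for t2, s2 in M)}

def I(M, sigma, r):
    """I_r(sigma, t): strings of length r M-incompatible with sigma, i.e. some
    axiom (t2, s2) in M_t has t2 comparable with tau, s2 incomparable with sigma."""
    return {tau for tau in strings(r)
            if any(comparable(t2, tau) and not comparable(s2, sigma) for t2, s2 in M)}

Mt = {('00', '11'), ('000', '1'), ('000', '11'), ('001', '11'),
      ('011', '01'), ('100', '01'), ('110', '011'), ('111', '1')}
print(sorted(F(Mt, '01', 2)))                     # ['01', '10', '11']
print(sorted(I(Mt, '01', 2)))                     # ['00', '11']
print(sorted(F(Mt, '01', 2) - I(Mt, '01', 2)))    # ['01', '10']  (reserved)
```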
Proof. (i) If α ∈ F_{r_1}(σ, t) then α↾r_1 ∈ F_{r_1}(σ, t), so there exist τ ≈ α↾r_1 and σ′ ⪰ σ such that (τ, σ′) ∈ M_t. Since r_0 ≤ r_1, we have τ ≈ α↾r_0, so α↾r_0 ∈ F_{r_0}(σ, t), and hence α ∈ F_{r_0}(σ, t). Similarly, I_{r_0}(σ, t) ⊇ I_{r_1}(σ, t).
(ii) If τ ∈ F_r(σ_1, t) then there exist τ′ ≈ τ and σ′ ⪰ σ_1 such that (τ′, σ′) ∈ M_t. As σ′ ⪰ σ_1 ⪰ σ_0, we have τ ∈ F_r(σ_0, t). If τ ∈ I_r(σ_0, t) then there are τ′ ≈ τ and ρ | σ_0 such that (τ′, ρ) ∈ M_t. As σ_1 ⪰ σ_0, we have σ_1 | ρ, and thus τ ∈ I_r(σ_1, t).
(iii) follows because M_{t_0} ⊆ M_{t_1}.

In summary, F_r(σ, t) and I_r(σ, t) are both increasing in t and decreasing in r. In the length of σ, we have that F_r(σ, t) is decreasing and I_r(σ, t) is increasing. To establish similar results for F_r(Σ, t) and I_r(Σ, t), we need to define a relationship between sets of strings.

Definition 4.5.11. If Σ_0, Σ_1 ⊆ 2^{<ω}, then Σ_0 ⪯ Σ_1 if for all σ_1 ∈ Σ_1, there exists a σ_0 ∈ Σ_0 such that σ_0 ⪯ σ_1.

Lemma 4.5.12. (i) If r_0 ≤ r_1 then F_{r_0}(Σ, t) ⊇ F_{r_1}(Σ, t) and I_{r_0}(Σ, t) ⊇ I_{r_1}(Σ, t).
(ii) If Σ_0 ⪯ Σ_1 then F_r(Σ_0, t) ⊇ F_r(Σ_1, t) and I_r(Σ_0, t) ⊆ I_r(Σ_1, t).
(iii) If t_0 ≤ t_1 then F_r(Σ, t_0) ⊆ F_r(Σ, t_1) and I_r(Σ, t_0) ⊆ I_r(Σ, t_1).

Proof. (i) If τ ∈ F_{r_1}(Σ, t), then for some σ ∈ Σ, we have τ ∈ F_{r_1}(σ, t), so by the previous lemma, τ ∈ F_{r_0}(σ, t) ⊆ F_{r_0}(Σ, t). If τ ∈ I_{r_1}(Σ, t), then for all σ ∈ Σ, we have τ ∈ I_{r_1}(σ, t), so by the previous lemma, τ ∈ ⋂_{σ∈Σ} I_{r_0}(σ, t) = I_{r_0}(Σ, t).
(ii) If α ∈ F_r(Σ_1, t) then for some σ_1 ∈ Σ_1, we have α ∈ F_r(σ_1, t). There exists σ_0 ∈ Σ_0 such that σ_0 ⪯ σ_1, so by the previous lemma, α ∈ F_r(σ_0, t) ⊆ F_r(Σ_0, t). If α ∈ I_r(Σ_0, t) then for all σ_0 ∈ Σ_0, we have α ∈ I_r(σ_0, t). Now, if σ_1 ∈ Σ_1 then for some σ_0 ∈ Σ_0, we have σ_0 ⪯ σ_1. Since α ∈ I_r(σ_0, t), by the previous lemma, α ∈ I_r(σ_1, t). Thus α ∈ ⋂_{σ_1∈Σ_1} I_r(σ_1, t) = I_r(Σ_1, t).
(iii) If τ ∈ F_r(Σ, t_0), then for some σ ∈ Σ, we have τ ∈ F_r(σ, t_0), so by the previous lemma, τ ∈ F_r(σ, t_1) ⊆ F_r(Σ, t_1). If τ ∈ I_r(Σ, t_0), then for all σ ∈ Σ, we have τ ∈ I_r(σ, t_0), so by the previous lemma, τ ∈ ⋂_{σ∈Σ} I_r(σ, t_1) = I_r(Σ, t_1).

We are going to describe an algorithm that can be used to prove Theorems 4.5.1 and 4.5.4. The algorithm will be allocated a certain amount of measure to spend on the construction of the c.e. semimeasure a. It will also be given a set of strings Σ, and it will be able to increase the measure only on extensions of these strings. Ultimately we want the algorithm to establish some difference between Km(ρ) and − log(a(ρ)) for some string ρ that extends some element of Σ. We will argue that if this does not happen, then
some contradiction must ensue. The basic idea is to make the opponent run out of short descriptions. Formalizing this idea is a little difficult. We want to be able to say that if we spend 2^{−r} much measure on extensions of Σ, and we do not establish a difference between Km and − log(a), then we can cause x amount of something to happen to M (where M is the universal machine controlled by the opponent). We will refer to this x, whatever it is, as the gain made by the algorithm.

Now, the most obvious measurement to use would be μ(π_1(M_t)) (where π_1(M_t) is the projection of M_t onto the first coordinate), because this projection is analogous to the domain of M_t. However, remember that part of the idea is to leave gaps in M that the opponent cannot do anything useful with. The problem with this measurement is that it does not account for gaps. An alternative is the size of F_r(Σ, t), the set of strings of length r that are fragmented by elements of Σ. This measurement will count gaps, and is almost what we need. The size of F_r(Σ, t) is at most 2^r, so if the algorithm ends at stage t_1, and we could ensure that |F_r(Σ, t_1)| ≥ k for an arbitrary k, we would be done. However, the actual situation is still more complicated. We will end up using the algorithm we define recursively. When verifying the algorithm, we do not want to double-count any gain made. To make the verification possible, we will subtract I_{r′}(Σ, t_0) from F_r(Σ, t_1), where r ≤ r′ and t_0 is the time the algorithm starts. This measurement is what we will use to determine the gain made by the algorithm. The following definition, where G is for gain, codifies this idea.

Definition 4.5.13. If Σ ⊆ 2^{<ω}, and r, r′, t_0, t_1 ∈ N are such that r ≤ r′ and t_0 ≤ t_1, then

G^{r′}_r(Σ; t_0, t_1) = F_r(Σ, t_1) \ I_{r′}(Σ, t_0).
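Definition 4.5.13 is likewise computable for finite approximations. A compact sketch (the names are ours), taking r′ = r so that the difference is a plain set difference, and with M empty at stage t_0, so that nothing is yet M-incompatible and the gain is just the fragmented set at t_1; here μ assigns each length-r string measure 2^{−r}:

```python
from fractions import Fraction

comparable = lambda a, b: a.startswith(b) or b.startswith(a)
strings = lambda r: [format(i, '0%db' % r) for i in range(2**r)]

def frag(M, Sigma, r):
    # F_r(Sigma, t): union over sigma in Sigma of F_r(sigma, t)
    return {tau for tau in strings(r) if any(
        comparable(t2, tau) and any(s2.startswith(s) for s in Sigma) for t2, s2 in M)}

def incomp(M, Sigma, r):
    # I_r(Sigma, t): intersection over sigma in Sigma of I_r(sigma, t)
    return {tau for tau in strings(r) if all(
        any(comparable(t2, tau) and not comparable(s2, s) for t2, s2 in M) for s in Sigma)}

def gain(M0, M1, Sigma, r):
    # G^r_r(Sigma; t0, t1) = F_r(Sigma, t1) \ I_r(Sigma, t0), with r' = r
    return frag(M1, Sigma, r) - incomp(M0, Sigma, r)

def mu(A, r):
    # each string of length r carries measure 2^-r
    return len(A) * Fraction(1, 2**r)

M1 = {('011', '01'), ('100', '01'), ('110', '011')}
print(mu(gain(set(), M1, {'01'}, 2), 2))   # 3/4
```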
The value G^{r′}_r(Σ; t_0, t_1) can be thought of as the gain the algorithm obtains using the strings in Σ between stages t_0 and t_1, and using units of measure between r and r′. With this definition in hand, we can define what sort of algorithm is needed. The following definition is complicated, so we will take some time to explain it after giving it. Note that we describe an algorithm that works on a set of strings, which is a new feature of this proof.

Definition 4.5.14. If r, b ∈ N, and Σ is a finite prefix-free subset of 2^{<ω}, then an (r, b, Σ)-strategy is an algorithm that, if implemented at stage t_0, ensures that one of the following holds.
(i) (a) There is a σ ∈ Σ such that for some σ′ extending σ, we have Km(σ′) > − log(a(σ′)), and (b) for all stages t and all σ ∈ Σ, we have a(σ, t) < (5/4)·2^{−r}; or
(ii) at some stage t_1, for some r′ ≥ r computable from (r, b),
(a) for all σ ∈ Σ, we have 2^{−r} ≤ a(σ, t_1) < (5/4)·2^{−r},
(b) μ(G^{r′}_r(Σ; t_0, t_1)) ≥ (1 + b/2)·a(Σ, t_1), and
(c) for all Σ̂ ⊆ Σ, we have

μ(G^{r′}_r(Σ̂; t_0, t_1)) ≥ (1 + b/2)·a(Σ̂, t_1) − 2^{−r′} D(b)|Σ \ Σ̂|,

where D(b) = 5(5/4)^b − 4 and a(Σ, t) = ∑_{σ∈Σ} a(σ, t).

The input parameters for the strategy are r, b, and Σ. The set Σ is a finite prefix-free set of binary strings. These are the strings that the algorithm will work on. The r parameter determines the measure that the algorithm should spend on every string in Σ. The b parameter determines the amount of gain that the algorithm should make. First we will establish the existence of an (r, 0, Σ)-strategy for any appropriate r, then we will inductively establish the existence of (r, b + 1, Σ)-strategies assuming the existence of (r, b, Σ)-strategies.

The definition gives two possible outcomes to a strategy. Outcome (i) is the preferred outcome. If outcome (i) occurs, then we have established a difference between Km and KM. However, if we do not achieve this outcome, then outcome (ii) is at least a step in the right direction. Provided b > 0, outcome (ii) ensures that some difference between μ(G^{r′}_r(Σ; t_0, t_1)) and a(Σ, t_1) is created. In outcome (ii), condition (b) says that the gain on the set of strings Σ is bounded below. We do not want the gain to be too concentrated on some particular subset of Σ. The gain achieved need not be evenly distributed among the elements of Σ, but it is important that this distribution not be too skewed. This is the reason for condition (c). Condition (c) ensures that no individual string in Σ contributes too much to the overall gain made. We need condition (c) because when we use the algorithm recursively, we will not be able to count the gain that occurs on certain individual elements of Σ. Hence we do not want an individual element of Σ providing too much of the overall gain. Note that (c) implies (b) by taking Σ̂ = Σ.

It is also important to observe that for all σ ∈ Σ, if the strategy is running at stage t, then a(σ, t) is bounded above in outcomes (i) and (ii). In outcome (ii) there is also a lower bound for a(σ, t_1). This lower bound is essential for using strategies recursively because we want these strategies to increase the measure on strings, as well as establish some gain. The function D(b) is chosen because it has the properties that D(0) = 1 and D(b + 1) = (5/4)·D(b) + 1. If we can implement an (r, b, Σ)-strategy with an arbitrarily high b then (ii) is not a possible outcome, as the measure would be greater than 1. This situation would force outcome (i) to occur. The other value in the above definition to discuss is r′. The value r′ dictates where the strategy should make its gain. In practice, what it will mean is that 2^{−r′} will be the minimum-sized increment in measure that the strategy will make to a. In other words, each time the strategy increases
the measure on some string, the amount increased will be at least 2^{−r′}. The idea is that if any previous strategy has created a gap of measure less than 2^{−r′}, then the current strategy will not interfere with that gap.

The idea behind the basic (r, 0, Σ)-strategy is simple. For every σ ∈ Σ we set a(σ) = 2^{−r}. The opponent is then forced to respond by setting Km(σ) ≤ r. To do this, the opponent must find some string τ of length at most r and enumerate (τ, σ) into M. The only difficulty is in showing that this basic idea meets our rather elaborate definition of a strategy.

Proposition 4.5.15. Let r ∈ N and let Σ be a finite prefix-free subset of 2^{<ω} such that |Σ| ≤ 2^r. If t_0 is the current stage in the construction of a and a(Σ, t_0) = 0, then we can implement an (r, 0, Σ)-strategy with r′ = r.

Proof. Let Σ = {σ_1, . . . , σ_n}. The desired strategy can be implemented as follows. First, let a(σ_i, t_0 + i) = 2^{−r} for all i such that 1 ≤ i ≤ n. (We know that a(σ_i, t_0 + i − 1) = 0 because Σ is prefix-free.) This definition ensures that a(Σ, t_0 + n) = |Σ|2^{−r}. The second step is to wait until a stage t_1 when the opponent responds by setting Km_{t_1}(σ) ≤ r for all σ ∈ Σ. If this response never happens, then for some σ ∈ Σ, we have Km(σ) > r = − log(2^{−r}) = − log(lim_t a(σ, t)) = − log(a(σ)), so outcome (i) occurs. If at some stage t_1, we have Km_{t_1}(σ) ≤ r for all σ ∈ Σ, then for each i such that 1 ≤ i ≤ n, the opponent must have enumerated some (τ_i, σ_i′) into M_{t_1}, where s_i = |τ_i| ≤ r and σ_i ⪯ σ_i′. In this case we will show that outcome (ii) occurs. First, (a) holds as a(Σ, t_1) = a(Σ, t_0 + n). For any i, let τ_r be any extension of τ_i of length r. By definition, τ_r ∈ F_r(σ_i, t_1), so τ_r ∈ ⋃_{σ∈Σ} F_r(σ, t_1) = F_r(Σ, t_1). If τ_r ∈ I_r(σ_i, t_0), then for some τ′ comparable with τ_r and some ρ incomparable with σ_i, we have (τ′, ρ) ∈ M_{t_0} ⊆ M_{t_1}, which contradicts the definition of a monotone machine, as it implies that both τ′ ≈ τ_i and ρ | σ_i′. So τ_r ∉ I_r(σ_i, t_0) and thus τ_r ∉ ⋂_{σ_i∈Σ} I_r(σ_i, t_0) = I_r(Σ, t_0). Hence G^r_r(Σ; t_0, t_1), which by definition is F_r(Σ, t_1) \ I_r(Σ, t_0), contains all infinite extensions of τ_r, and hence all infinite extensions of τ_i. So [τ_i] ⊆ G^r_r(Σ; t_0, t_1). Now if i ≠ j then [τ_i] ∩ [τ_j] = ∅, because otherwise σ_i and σ_j must be comparable, but Σ is a prefix-free set. Thus
μ(G^r_r(Σ; t_0, t_1)) ≥ ∑_{i=1}^{n} μ([τ_i]) = ∑_{i=1}^{n} 2^{−s_i} ≥ ∑_{i=1}^{n} 2^{−r} = |Σ|2^{−r} = a(Σ, t_1) = (1 + 0/2)·a(Σ, t_1).

Hence (b) holds.
Let Σ̂ ⊆ Σ. Let I = {1, . . . , n} and J = {i ∈ I : σ_i ∈ Σ̂}. By the same logic as above,

μ(G^r_r(Σ̂; t_0, t_1)) ≥ ∑_{j∈J} 2^{−r} = |I|2^{−r} − |I \ J|·2^{−r} = (1 + 0/2)·a(Σ, t_1) − |Σ \ Σ̂|·2^{−r} ≥ (1 + 0/2)·a(Σ̂, t_1) − 2^{−r}·D(0)·|Σ \ Σ̂|.

So (c) holds, and outcome (ii) occurs.

We are going to work up to developing an (r, b + 1, Σ)-strategy, using (r, b, Σ)-strategies. Before we present the main algorithm that does so, there is some more preparatory work to do. Ideally, when we implement a strategy, we want to achieve outcome (i). However, if this outcome does not occur then at some stage the strategy will terminate and we can start a new strategy. We want to ensure that if the second strategy also achieves outcome (ii), then the gain of running the strategies sequentially is at least the sum of the minimum gain expected from the strategies individually. The definition of the (r, b, Σ)-strategy is designed to allow this to be the case, as Proposition 4.5.17 will show. Before proving this proposition, we need the following lemma.

Lemma 4.5.16. If σ_0 | σ_1 then F_p(σ_0, t) ⊆ I_p(σ_1, t).

Proof. If τ ∈ F_p(σ_0, t) then there exists (τ′, σ_0′) ∈ M_t such that τ′ ≈ τ and σ_0 ⪯ σ_0′, which implies that σ_0′ | σ_1. Thus τ ∈ I_p(σ_1, t).

Proposition 4.5.17. If r_0 ≤ r_1 ≤ · · · ≤ r_n and t_0 ≤ t_1 ≤ · · · ≤ t_n are sequences of natural numbers, and Σ_0, Σ_1, . . . , Σ_{n−1} are pairwise disjoint prefix-free subsets of 2^{<ω} such that Σ = Σ_0 ∪ Σ_1 ∪ · · · ∪ Σ_{n−1} is also prefix-free, then

μ(G^{r_n}_{r_0}(Σ; t_0, t_n)) ≥ ∑_{i<n} μ(G^{r_{n−i}}_{r_{n−i−1}}(Σ_i; t_i, t_{i+1})).
Proof. First, if i < n then Σ ⪯ Σ_i, so by chaining together applications of Lemma 4.5.12, we have

F_{r_{n−i−1}}(Σ_i, t_{i+1}) ⊆ F_{r_0}(Σ_i, t_{i+1}) ⊆ F_{r_0}(Σ, t_{i+1}) ⊆ F_{r_0}(Σ, t_n)

and

I_{r_n}(Σ, t_0) ⊆ I_{r_n}(Σ, t_i) ⊆ I_{r_n}(Σ_i, t_i) ⊆ I_{r_{n−i}}(Σ_i, t_i).

Hence

F_{r_0}(Σ, t_n) \ I_{r_n}(Σ, t_0) ⊇ F_{r_{n−i−1}}(Σ_i, t_{i+1}) \ I_{r_{n−i}}(Σ_i, t_i),

which by definition means that G^{r_n}_{r_0}(Σ; t_0, t_n) ⊇ G^{r_{n−i}}_{r_{n−i−1}}(Σ_i; t_i, t_{i+1}).

Now let i < j < n. Then F_{r_{n−i−1}}(Σ_i, t_{i+1}) ⊆ F_{r_{n−j}}(Σ_i, t_j) ⊆ I_{r_{n−j}}(Σ_j, t_j). The first inclusion is by Lemma 4.5.12. The second inclusion follows because if τ ∈ F_{r_{n−j}}(Σ_i, t_j), then τ ∈ F_{r_{n−j}}(σ_i, t_j) for some σ_i ∈ Σ_i. But if σ_j ∈ Σ_j
then σ_j is incomparable with σ_i. So τ ∈ I_{r_{n−j}}(σ_j, t_j) by Lemma 4.5.16. Thus τ ∈ ⋂_{σ_j∈Σ_j} I_{r_{n−j}}(σ_j, t_j) = I_{r_{n−j}}(Σ_j, t_j). So we can conclude that

(F_{r_{n−i−1}}(Σ_i, t_{i+1}) \ I_{r_{n−i}}(Σ_i, t_i)) ∩ (F_{r_{n−j−1}}(Σ_j, t_{j+1}) \ I_{r_{n−j}}(Σ_j, t_j)) = ∅,

which again by definition means that

G^{r_{n−i}}_{r_{n−i−1}}(Σ_i; t_i, t_{i+1}) ∩ G^{r_{n−j}}_{r_{n−j−1}}(Σ_j; t_j, t_{j+1}) = ∅.
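Lemma 4.5.16, the key to this disjointness, can be spot-checked on the machine from Example 4.5.8: every string fragmented by 01 is M-incompatible with the incomparable string 11. A small sketch (the names are ours, not from the text):

```python
comparable = lambda a, b: a.startswith(b) or b.startswith(a)
strings = lambda r: [format(i, '0%db' % r) for i in range(2**r)]

def F(M, sigma, r):
    # strings of length r fragmented by sigma
    return {t for t in strings(r)
            if any(comparable(t2, t) and s2.startswith(sigma) for t2, s2 in M)}

def I(M, sigma, r):
    # strings of length r M-incompatible with sigma
    return {t for t in strings(r)
            if any(comparable(t2, t) and not comparable(s2, sigma) for t2, s2 in M)}

Mt = {('00', '11'), ('000', '1'), ('000', '11'), ('001', '11'),
      ('011', '01'), ('100', '01'), ('110', '011'), ('111', '1')}
# 01 and 11 are incomparable, so F_2(01, t) must sit inside I_2(11, t).
print(F(Mt, '01', 2) <= I(Mt, '11', 2))   # True
```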
Buried in the above proof is the reason we define I_r(Σ, t) to be the intersection of I_r(σ, t) for all σ ∈ Σ: Take Σ_0 and Σ_1 to be disjoint sets whose union is prefix-free. If τ ∈ F_r(Σ_0, t) then τ is fragmented by some string in Σ_0. As all strings in Σ_0 are incomparable with all strings in Σ_1, we have τ ∈ I_r(Σ_1, t). This is the reason the above proposition works, as it avoids double-counting any gains.

Example 4.5.18. To illustrate why this proposition is useful, we will show how we can use it to gradually increase the measure on a string without losing any gain made along the way. Consider the following implementation of three strategies in sequence and assume that they all have outcome (ii). Take some r_0 ∈ N. Fix a b ∈ N and assume we have an (r, b, Σ)-strategy for any appropriate r and Σ. Let r_1 be the r′ computable from (r_0, b), as in Definition 4.5.14. Similarly, let r_2 be the r′ computable from (r_1, b), and let r_3 be the r′ computable from (r_2, b). Take any string σ and let Σ_2 = {σ00}, let Σ_1 = {σ01τ : τ ∈ 2^{r_1−r_0}}, let Σ_0 = {σ1τ : τ ∈ 2^{r_2−r_0}}, and let Σ = Σ_0 ∪ Σ_1 ∪ Σ_2. Starting at stage t_0, we first implement an (r_2, b, Σ_0)-strategy. When this strategy finishes at stage t_1, we implement an (r_1, b, Σ_1)-strategy. Finally, when this strategy finishes at stage t_2, we implement an (r_0, b, Σ_2)-strategy. Let t_3 be the stage at which this final strategy finishes. By Proposition 4.5.17,

μ(G^{r_3}_{r_0}(Σ; t_0, t_3)) ≥ ∑_{i≤2} μ(G^{r_{3−i}}_{r_{3−i−1}}(Σ_i; t_i, t_{i+1})) ≥ ∑_{i≤2} (1 + b/2)·a(Σ_i, t_{i+1}) = (1 + b/2)·a(Σ, t_3).

The last step follows provided we do not increase the measure on any other string comparable with a string in Σ except as required by the strategies. Now

a(σ) ≥ a(Σ) ≥ |Σ_0|2^{−r_2} + |Σ_1|2^{−r_1} + |Σ_2|2^{−r_0} = 3 · 2^{−r_0}.

This approach has allowed us to use strategies to increase the measure on σ in three increments of 2^{−r_0} without losing any of the gain made by the three strategies individually. This ability to increase the measure on a string in small increments will be exploited in the main proof.
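The measure count at the end of Example 4.5.18 can be checked with exact arithmetic. A sketch (the names are ours), with sample values r_0 < r_1 < r_2 standing in for the computed sequence:

```python
from fractions import Fraction

def measure_on_sigma(r0, r1, r2):
    """Total measure the three strategies of Example 4.5.18 place on sigma:
    each set Sigma_i holds |Sigma_i| strings, each with at least 2^-(its r)."""
    sizes = [
        (2**(r2 - r0), Fraction(1, 2**r2)),   # Sigma_0: 2^(r2-r0) strings at 2^-r2
        (2**(r1 - r0), Fraction(1, 2**r1)),   # Sigma_1: 2^(r1-r0) strings at 2^-r1
        (1, Fraction(1, 2**r0)),              # Sigma_2: one string at 2^-r0
    ]
    return sum(count * unit for count, unit in sizes)

# Each of the three sets contributes exactly 2^-r0, for 3 * 2^-r0 in total.
print(measure_on_sigma(3, 5, 9) == 3 * Fraction(1, 2**3))   # True
```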
We have seen that we can use strategies sequentially without losing any of their individual gains. However, we need to do better: we need to be able to combine strategies in such a way as to increase our gain. To do so, we will make use of reservations. Our goal is to implement an (r, b + 1, Σ)-strategy using a finite sequence of (r_i, b, Σ_i)-strategies. We will improve Algorithm 1. The basic idea remains the same. When we assign measure to a string, say when at stage t_0 we set a(σ, t_0) = 2^{−g}, the opponent has to respond by allocating a string of length at most g to describe σ. Say that in response to our setting a(σ, t_0) = 2^{−g}, the opponent enumerates (τ, σ) into M_{t_1}, where |τ| = g. Take p < g and let τ_p be the first p bits of τ. If τ_p ∉ R_p(σ, t_1), then τ_p ∈ I_p(σ, t_1), so the opponent can never enumerate (τ_p, σ) into M. So if we set a(σ, t_1) = 2^{−p}, the opponent must find some new string υ of length at most p to describe σ. The original τ description of σ is effectively a waste of resources. On the other hand, while τ_p ∈ R_p(σ, t), we also have a gain because the opponent can use τ_p only to describe a string comparable with σ. If we increase measure only on strings incomparable with σ, then the opponent cannot use τ_p in response. If we use only increments of measure of size at least 2^{−p}, then the opponent cannot make use of any extension of τ_p (as it would be too long to be useful). Note that this is why we need an r′ in the definition of an (r, b, Σ)-strategy. The value 2^{−r′} is the smallest unit of measure this strategy needs. Being able to compute r′ means we know how much space strategies need in terms of units of measure. With this knowledge, we can ensure that the smallest measure needed by a strategy will not interfere with any reservations established by previous strategies.

We noted earlier that part of the problem with Algorithm 1 was that the amount of measure that it assigned to a string ranged between 2^{−g} and 2^{−r}. This range was too great to allow the algorithm to be used recursively. Hence our new algorithm has two objectives: to create a certain amount of gain, and to ensure a certain amount of measure is allocated. These two objectives will be achieved by splitting the algorithm into two phases. In the first, the advantage phase, the objective will be to force the opponent to reserve a number of strings of length p. We do not know how much measure we will need to use before the opponent makes this many reservations. If we do not use enough measure, then we proceed with the second phase, the spend phase. The spend phase is where we make sure that we have placed enough measure on each string so that the strategies can be used recursively. The use of these two phases is another new feature of this proof. The spend phase must occur after the advantage phase, because only then will we know how much more measure we need to allocate. Accordingly, the spend phase will use larger units of measure than the advantage phase. From r we will compute p, q, and r′ such that r < p < q < r′. The advantage phase will use units of measure between q and r′ while the spend phase will use units of measure between r and p (see Figure 4.2). This fact gives us a bit of a problem. During the advantage phase we can require
only reservations of length at most p. If p is much larger than r, then we will need to make many reservations to achieve anything. To get around this problem, we define Υ = {σ0τ : σ ∈ Σ ∧ τ ∈ 2p−r } (where Σ is the set of input strings for the algorithm). Note that the purpose of the 0 in the definition of Υ is just to separate the strings used by the advantage phase from those used by the spend phase. We will run the algorithm on the strings in Υ until enough of these have a reservation of length p. As in Algorithm 1, we will not increase the measure on the elements of Υ directly, but rather will make a series of small increments of measure on some extensions of them. We will let Ξ = {υτ : υ ∈ Υ ∧ τ ∈ 2e } (where e depends on b and will be defined shortly). The idea is the following. If υ ∈ Υ and ξ0 , . . . ξ2e −1 are the elements of Ξ that extend υ, then we will sequentially set the measure of each ξi to 2−p−e until a reservation of size p occurs for υ. If we do so for all such ξi , then the measure on υ will be 2−p , and the opponent must allocate a string of length p to describe υ. However, it is not quite that simple. We want to increase the measure on ξi by running some strategy on it. Every time we run a strategy we need to use larger units of measure. So instead, what we will do is take many extensions of ξ0 . Then we can run our first strategy on all these extensions, spending a little bit of measure on each. Then we will take slightly fewer extensions of ξ1 and run a strategy that spends a little more measure on each of these extensions, and so on. This procedure is just like what we did in Example 4.5.18. It gives us sets Ψ0 , Ψ1 , . . ., where Ψi is the ith set of extensions of elements of Ξ on which we want to increase measure. The sets used by the advantage phase are shown in Figure 4.1. In this figure arrows represent extensions of strings. The algorithm would ideally achieve a reservation of length p for each element of Υ, but there is a problem. 
Once a reservation is achieved for some υ ∈ Υ, we do not want to increase the measure on that υ. However, reservations are not monotonic; they can appear at one stage and then disappear at the next. Consider υ_1, υ_2 ∈ Υ. Say we increase the measure on both υ_1 and υ_2 by 2^{−p−e}, and then the opponent responds by giving a p reservation to υ_1, but not to υ_2. Then we could increase the measure on υ_2, and the opponent might respond by giving a p reservation to υ_2 but taking away the p reservation from υ_1. By adopting this sort of tactic, the opponent could force us to make many separate increases in measure to achieve reservations for all elements of Υ. Every time we increase the measure on some subset of Υ, we need a new Ψ_i. The more Ψ_i's we need, the deeper we need to go into the binary tree. To improve the lower bound, we want to limit the number of Ψ_i's we need. To do so, we will wait until we have a reservation for three-quarters of the elements of Υ. Thus, any time we increase the measure, we must be increasing the measure on at least one-quarter of the elements of Υ. The spend phase is similar but simpler. In this phase we will identify those elements σ ∈ Σ that do not have enough measure. We will increase
4. Relating Complexities
[Figure 4.1. Algorithm strings: arrows represent extension between the sets Σ, Σ̂, Υ, Υ̂, Υ^R, Ξ, Ψ_0, . . . , Ψ_{n−1} and Ψ̂_0, . . . , Ψ̂_{n−1}.]
the measure on such σ in increments of 2^{−r−3}, until we have exceeded 2^{−r}. Again as in Example 4.5.18, we will not increase the measure directly on σ but rather on some set of extensions of it. These extensions will form the sets Φ_0, . . . , Φ_7. The length of the extensions that we take to form the sets Ψ_i and Φ_i is governed by the space that a strategy needs to run, in terms of quantities of measure. The function S(b) is used to determine this space. We will show that there exist (r, b, Σ)-strategies such that r′ = r + S(b). For our base strategy we can set S(0) = 0, because in this case r′ = r. In the process of determining S(b + 1) we will also compute all the other variables we need for the main algorithm. These variables are illustrated in Figure 4.2.

To determine S(b + 1), first we compute an increasing sequence r_0 ≤ r_1 ≤ · · · ≤ r_8 by r_0 = r + 3 and r_{i+1} = r_i + S(b). This sequence gives us room for the spend phase. Let p = r_8. Let e = b + 6. The number e is chosen so that D(b)2^{−e} = (5(5/4)^b − 4)2^{−b−6} < 4(5/4)^{b+1}2^{−b−6} < 4 · 2^{b+1}2^{−b−6} = 1/8. Let q = p + e. Now compute an increasing sequence r*_0 ≤ r*_1 ≤ · · · ≤ r*_n, where r*_0 = q, n = 2^{e+2}, and r*_{i+1} = r*_i + S(b). This sequence gives us room for the advantage phase. Finally let r′ = r*_n, and S(b + 1) = r′ − r.
4.5. Monotone complexity and KM -complexity
[Figure 4.2. Algorithm parameters: the spend phase uses r ≤ r_0 ≤ r_1 ≤ · · · ≤ r_8 = p, then e more bits give q = p + e = r*_0, and the advantage phase uses q = r*_0 ≤ r*_1 ≤ · · · ≤ r*_n = r′.]
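The parameter bookkeeping above can be translated into code. This is our own illustrative sketch (the function names are not from the text); it also confirms that S(b + 1) = r′ − r comes out independent of the initial choice of r:

```python
def S(b):
    # Space function: S(0) = 0, S(b+1) = (8 + 2^(b+8)) * S(b) + b + 9,
    # the closed recurrence stated in the text.
    s = 0
    for i in range(b):
        s = (8 + 2 ** (i + 8)) * s + i + 9
    return s

def parameters(r, b):
    """Compute p, e, q, n, and r' for an (r, b+1, Sigma)-strategy."""
    rs = [r + 3]                   # r_0 = r + 3
    for _ in range(8):             # r_{i+1} = r_i + S(b)
        rs.append(rs[-1] + S(b))
    p = rs[8]                      # p = r_8
    e = b + 6                      # chosen so that D(b) * 2^-e < 1/8
    q = p + e
    n = 2 ** (e + 2)
    rstar = [q]                    # r*_0 = q
    for _ in range(n):             # r*_{i+1} = r*_i + S(b)
        rstar.append(rstar[-1] + S(b))
    return p, e, q, n, rstar[n]    # rstar[n] is r'
```

For example, parameters(5, 0) yields p = 8, e = 6, q = 14, n = 256, and r′ − r = 9 = S(1); similarly parameters(7, 1) gives r′ − r = S(2), matching S(b + 1) = r′ − r.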
This value is well defined, as S(b + 1) does not depend on the initial choice of r. It is possible to unravel the above definition to determine that S can be defined as follows: S(0) = 0 and S(b + 1) = (8 + 2^{b+8})S(b) + b + 9.

We are finally ready to present the main algorithm. We are given r, b ∈ N and a prefix-free set of strings Σ such that |Σ|(5/4)2^{−r} ≤ 1. We let t_0 be the stage at which the algorithm starts, and we require that a(Σ, t_0) = 0. To implement an (r, b + 1, Σ)-strategy, we first determine all the values n, p, q, e, r_i, r*_i, r′, as shown in Figure 4.2. In the algorithm, Υ′ represents those elements of Υ whose measure we will increase in the current pass through the advantage phase of the algorithm. Now, with all the values that we need computed, the algorithm that we will use to implement an (r, b + 1, Σ)-strategy is as follows.

Advantage Phase. Let t be the current stage. Let Υ = {σ0τ : σ ∈ Σ ∧ τ ∈ 2^{p−r}}. Let Ξ = {υρ : υ ∈ Υ ∧ ρ ∈ 2^e}. Let Υ′ = {υ ∈ Υ : R_p(υ, t) = ∅}. If |Υ′| < (1/4)|Υ|, then move to the spend phase. For each υ ∈ Υ′, choose a ξ_υ ∈ Ξ such that ξ_υ ⪰ υ and a(ξ_υ, t) = 0. Let Ξ′ = {ξ_υ : υ ∈ Υ′}. Choose the least i < n that has not been used previously. Let Ψ_i = {ξτ : ξ ∈ Ξ′ ∧ τ ∈ 2^{r*_{n−i−1}−q}} and implement an (r*_{n−i−1}, b, Ψ_i)-strategy. If this strategy finishes at stage t′, then wait until a stage t′′ such that Km_{t′′}(υ) ≤ p for all υ ∈ Υ′ such that a(υ, t′) ≥ 2^{−p}. Repeat the advantage phase.

Spend Phase. Let t be the current stage. Let Σ′ = {σ ∈ Σ : a(σ, t) < 2^{−r}}. If Σ′ = ∅ then terminate the algorithm. Otherwise choose the least i < 8 such that i has not been used previously in the spend phase. Let Φ_i = {σ1τ : σ ∈ Σ′ ∧ τ ∈ 2^{r_{7−i}−r−3}} and run an (r_{7−i}, b, Φ_i)-strategy. Repeat the spend phase.

We now verify the correctness of this algorithm. The algorithm starts at stage t_0. We will define the following parameters, assuming that the algorithm terminates at some stage t_1. Let n_a be the number of times the advantage phase is run.
Let n_s be the number of times the spend phase is run. For all i < n_a, let t*_i be the stage at which the ith strategy in the advantage
phase begins (starting with i = 0). Let t_mid be the stage at which the algorithm completes the advantage phase, and let t*_{n_a} = t_mid. For all 0 ≤ i < n_s, let t_i be the stage at which the ith strategy in the spend phase begins (starting with i = 0). Let t_{n_s} = t_1. Let Φ = Φ_0 ∪ · · · ∪ Φ_{n_s−1} and Ψ = Ψ_0 ∪ · · · ∪ Ψ_{n_a−1}. Let Σ̂ ⊆ Σ.

The first step in the verification is to prove that the algorithm runs without error. To prove this, we need to show that each time we "choose" something in the algorithm, there is something valid to choose. First we note that the opponent cannot add a description of σ of length p without making a reservation of length p.

Lemma 4.5.19. If t, p, and σ are such that Km_t(σ) ≤ p, then R_p(σ, t) ≠ ∅.

Proof. If Km_t(σ) ≤ p, then there exists (τ′, σ′) ∈ M_t such that s = |τ′| ≤ p and σ′ ⪰ σ. Consider τ = τ′0^{p−s}. As |τ| = p, τ ⪰ τ′, and σ′ ⪰ σ, we have τ ∈ F_p(σ, t). If τ ∈ I_p(σ, t) then there would exist τ′′ ≈ τ and ρ | σ such that (τ′′, ρ) ∈ M_t. However, then τ′′ ≈ τ′ and ρ | σ′, so this fact would contradict the definition of a monotone machine. Thus τ ∉ I_p(σ, t), and so τ ∈ R_p(σ, t).

Proposition 4.5.20. The advantage phase of the algorithm never runs out of choices for ξ_υ or i, and the spend phase never runs out of choices for i.

Proof. First we see that there is always a choice for ξ_υ. Take any υ ∈ Υ and any j < n_a. Assume that at stage t*_j, for every ξ ∈ Ξ such that ξ ⪰ υ, we have a(ξ, t*_j) ≠ 0. Then, given any ξ ∈ Ξ such that ξ ⪰ υ, there is some i < j such that an (r*_{n−i−1}, b, Ψ_i)-strategy has been implemented with {ξσ : σ ∈ 2^{r*_{n−i−1}−q}} ⊆ Ψ_i. There are 2^{r*_{n−i−1}−q} many extensions of ξ in Ψ_i, so
a(ξ, t*_j) ≥ 2^{r*_{n−i−1}−q} · 2^{−r*_{n−i−1}} = 2^{−q}.
Hence a(υ, t*_j) ≥ 2^e · 2^{−q} = 2^{−p}, because there are 2^e many extensions of υ in Ξ. Thus, at the end of the previous iteration of the advantage phase, the algorithm waits until a stage t at which the opponent uses a string of length at most p to describe υ before running the advantage phase again. As this fact implies that R_p(υ, t*_j) ≠ ∅, by Lemma 4.5.19, it follows that no choice of ξ_υ will be needed.

The advantage phase stops if |Υ′| < (1/4)|Υ|. So each time the phase is run, at least (1/4)|Υ| many elements of Ξ are selected, which can happen only 4 · 2^e = 2^{e+2} many times (as otherwise the algorithm would run out of choices for some ξ_υ), so there is always a choice for i.

The spend phase will terminate if a(σ, t_i) ≥ 2^{−r} for all σ ∈ Σ. During each iteration through the phase, a(σ, t_{i+1}) ≥ a(σ, t_i) + 2^{−r−3} (if a(σ, t_i) is not at least 2^{−r} already), so after a maximum of 8 steps, a(σ, t) ≥ 2^{−r} for all σ ∈ Σ, and hence the algorithm will not run out of choices for i.
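The spend-phase accounting in this proof can be illustrated with a toy computation. This is only a sketch under an idealized assumption: each pass adds exactly 2^{−r−3} to a(σ), whereas the actual (r_{7−i}, b, Φ_i)-strategies add at least 2^{−r−3} and less than (5/4)2^{−r−3}:

```python
from fractions import Fraction

def spend_passes(r, start=Fraction(0)):
    """Count passes needed to push a(sigma) from `start` up to 2^-r
    when each pass adds exactly 2^-(r+3)."""
    a, passes = start, 0
    while a < Fraction(1, 2 ** r):
        a += Fraction(1, 2 ** (r + 3))
        passes += 1
    return passes, a
```

Starting from measure 0, exactly 8 passes are needed, matching the bound in the proof; a σ that already carries some measure needs correspondingly fewer.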
The next step in the verification is to show that the required bounds on the measure that the algorithm uses are met.

Proposition 4.5.21. For all σ ∈ Σ, if the algorithm terminates at stage t_1, then 2^{−r} ≤ a(σ, t_1) < (5/4)2^{−r}.

Proof. As a(σ, t_0) = 0, any measure placed on σ by stage t_mid must be due to an increase of measure on some ψ ∈ Ψ where ψ ⪰ σ. Hence, for any σ ∈ Σ, we have a(σ, t_mid) = Σ_{ψ∈Ψ, ψ⪰σ} a(ψ, t_mid). If ψ ∈ Ψ_i, then we increase the measure on ψ by less than (5/4)2^{−r*_{n−i−1}}. Now if ψ ⪰ σ then |ψ| = |σ| + 1 + (p − r) + e + (r*_{n−i−1} − q) = |σ| + 1 − r + r*_{n−i−1}, so r*_{n−i−1} = |ψ| − |σ| − 1 + r. Thus a(ψ, t_mid) < (5/4)2^{−(|ψ|−|σ|−1+r)}. All such ψ form a prefix-free set of extensions of σ0, so

a(σ, t_mid) = a(σ0, t_mid) = Σ_{ψ∈Ψ, ψ⪰σ0} a(ψ, t_mid) < Σ_{ψ∈Ψ, ψ⪰σ0} (5/4)2^{−|ψ|+|σ0|−r} ≤ (5/4)2^{−r}.

The spend phase will run on σ only while a(σ) < 2^{−r}. Each iteration of the spend phase increases the measure on σ by less than (5/4)2^{−r−3} < 2^{−r−2}, so at the end of the spend phase, 2^{−r} ≤ a(σ, t_1) < 2^{−r} + 2^{−r−2} = (5/4)2^{−r}. Note that as a(Σ, t_1) < |Σ|(5/4)2^{−r} ≤ 1, we do not run out of measure.

There are two possible outcomes to an (r, b + 1, Σ)-strategy, related to whether or not the algorithm terminates.

Proposition 4.5.22. If the algorithm does not terminate then outcome (i) is achieved.

Proof. The number of times any phase is repeated is bounded, so if the algorithm does not terminate, then this must be caused by waiting for a response from the opponent or by waiting for another strategy to finish. Using the basic strategy as an inductive base case, this situation can occur only if for some σ ∈ Σ, there is a σ′ ⪰ σ with Km(σ′) > −log(a(σ′)). Hence (i)(a) holds. Now (i)(b) holds by Proposition 4.5.21, because if the algorithm does not exceed the measure upper bound when it terminates, then it will not exceed the upper bound if it takes no action from some point onwards.

We now proceed to the more difficult stage of the verification. Let us examine what happens if the algorithm does terminate. We are going to look at the gains made during the advantage phase and the spend phase separately. We do so because we know from Proposition 4.5.17 that the overall gain is at least the sum of the gains that occur during these phases. That is, letting Σ{i} = {σi : σ ∈ Σ},

μ(G^{r′}_r(Σ; t_0, t_1)) ≥ μ(G^p_r(Σ{0}; t_0, t_mid)) + μ(G^p_r(Σ{1}; t_mid, t_1)).
We know what we can expect from the spend phase of the algorithm because it is just a series of (r, b, Φ_i)-strategies. However, in the advantage phase we have to show that the overall gain is increased by the fact that some reservations of length p are made; i.e., for some υ ∈ Υ, we have R_p(υ, t_mid) ≠ ∅.

First let us look at the spend phase. For all i < n_s, let Φ̂_i = {ϕ ∈ Φ_i : ∃σ ∈ Σ̂ (ϕ ⪰ σ)}. Let Φ̂ = Φ̂_0 ∪ · · · ∪ Φ̂_{n_s−1}. Note that the elements of Φ̂ are exactly the elements of Φ that extend elements of Σ̂{1}.

Proposition 4.5.23. If the algorithm terminates then
μ(G^p_r(Σ̂{1}; t_mid, t_1)) ≥ (1 + b/2)a(Σ̂{1}, t_1) − a(Σ{1} \ Σ̂{1}, t_1)D(b).

Proof. From Proposition 4.5.17 and the inductive hypothesis of the existence of (r, b, Σ)-strategies, we know that

μ(G^p_r(Σ̂{1}; t_mid, t_1)) ≥ Σ_{i<n_s} μ(G^{r_{8−i}}_{r_{7−i}}(Φ̂_i; t_i, t_{i+1})) ≥ Σ_{i<n_s} ((1 + b/2)a(Φ̂_i, t_{i+1}) − 2^{−r_{7−i}}D(b)|Φ_i \ Φ̂_i|).

The second inequality is a consequence of condition (c) of outcome (ii). Now, because this is the only way we increase measure on extensions of Σ{1}, it follows that

Σ_{i<n_s} (1 + b/2)a(Φ̂_i, t_{i+1}) = Σ_{i<n_s} (1 + b/2)a(Φ̂_i, t_1) = (1 + b/2)a(Σ̂{1}, t_1).

The (r_{7−i}, b, Φ_i)-strategy run guarantees that
a(Φ_i \ Φ̂_i, t_1) ≥ 2^{−r_{7−i}}|Φ_i \ Φ̂_i|,
so
Σ_{i<n_s} 2^{−r_{7−i}}D(b)|Φ_i \ Φ̂_i| ≤ Σ_{i<n_s} a(Φ_i \ Φ̂_i, t_1)D(b) = a(Σ{1} \ Σ̂{1}, t_1)D(b).
The result follows by combining these inequalities.

Before we examine the advantage phase, we need some more definitions. We need to define those subsets of Υ that have p reservations, or extend elements of Σ̂, or both.
1. Let Υ^R = {υ ∈ Υ : R_p(υ, t_mid) ≠ ∅}.
2. Let Υ̂ = {υ ∈ Υ : ∃σ ∈ Σ̂ (υ ⪰ σ)}.
3. Let Υ̂^R = Υ^R ∩ Υ̂.
Note that |Υ^R| ≥ (3/4)|Υ| = (3/4)|Σ|2^{p−r}, as this is the condition for the algorithm to move to the spend phase. Additionally, |Υ \ Υ̂| = |Σ \ Σ̂|2^{p−r}. Hence,

|Υ̂^R| = |Υ^R| + |Υ̂| − |Υ^R ∪ Υ̂| ≥ |Υ^R| + |Υ̂| − |Υ| = |Υ^R| − |Υ \ Υ̂| ≥ (3/4)|Σ|2^{p−r} − |Σ \ Σ̂|2^{p−r} = ((3/4)|Σ| − |Σ \ Σ̂|)2^{p−r}.

For each υ ∈ Υ^R, we choose a specific τ_υ in R_p(υ, t_mid). The following lemma establishes that these τ_υ's must be distinct.

Lemma 4.5.24. If υ_0, υ_1 ∈ Υ^R and υ_0 ≠ υ_1 then τ_{υ_0} ≠ τ_{υ_1}.

Proof. If υ_0 ≠ υ_1 then υ_0 | υ_1, because Υ^R is a prefix-free set. If τ_{υ_0} ∈ R_p(υ_0, t_mid), then τ_{υ_0} ∈ F_p(υ_0, t_mid), so τ_{υ_0} ∈ I_p(υ_1, t_mid) by Lemma 4.5.16. Hence τ_{υ_0} ∉ R_p(υ_1, t_mid), and so τ_{υ_0} ≠ τ_{υ_1}.

The gain made during the advantage phase can be broken down into two parts. Firstly, for every υ ∈ Υ̂^R, we have ⟦τ_υ⟧ ⊆ G^p_r(Σ̂{0}; t_0, t_mid). This is the case because τ_υ ∈ R_p(υ, t_mid) = F_p(υ, t_mid) \ I_p(υ, t_mid). In the following argument we will make use of Lemmas 4.5.10 and 4.5.12. As υ ⪰ σ0 for some σ ∈ Σ̂, we have τ_υ ∈ F_p(σ0, t_mid) ⊆ F_p(Σ̂{0}, t_mid). Additionally, τ_υ ∉ I_p(σ0, t_mid), so τ_υ ∉ I_p(Σ̂{0}, t_mid) ⊇ I_p(Σ̂{0}, t_0). Now, I_p(Σ̂{0}, t_0) ⊇ I_r(Σ̂{0}, t_0), so ⟦τ_υ⟧ ∩ I_r(Σ̂{0}, t_0) = ∅. Thus,

⟦τ_υ⟧ ⊆ F_p(Σ̂{0}, t_mid) \ I_r(Σ̂{0}, t_0) = G^p_r(Σ̂{0}; t_0, t_mid).

Secondly, we can count the gains made by the recursive use of the strategies in the advantage phase: letting Ψ̂_i = {ψ ∈ Ψ_i : ∃σ ∈ Σ̂ (ψ ⪰ σ)},

G^{r*_{n−i}}_{r*_{n−i−1}}(Ψ̂_i; t*_i, t*_{i+1}) ⊆ G^p_r(Σ̂{0}; t_0, t_mid).

We would like to combine the above to have

μ(G^p_r(Σ̂{0}; t_0, t_mid)) ≥ μ(⋃_{υ∈Υ̂^R} ⟦τ_υ⟧) + μ(⋃_{i=0}^{n_a−1} G^{r*_{n−i}}_{r*_{n−i−1}}(Ψ̂_i; t*_i, t*_{i+1})).

However, we cannot do so, as

⋃_{υ∈Υ̂^R} ⟦τ_υ⟧ and ⋃_{i=0}^{n_a−1} G^{r*_{n−i}}_{r*_{n−i−1}}(Ψ̂_i; t*_i, t*_{i+1})

are not necessarily disjoint. Our goal is to find some subset Ψ^R ⊆ Ψ that we can subtract from any Ψ̂_i so that ⋃_{υ∈Υ̂^R} ⟦τ_υ⟧ and ⋃_{i=0}^{n_a−1} G^{r*_{n−i}}_{r*_{n−i−1}}(Ψ̂_i \ Ψ^R; t*_i, t*_{i+1}) are disjoint. We use the superscript R in Ψ^R because we will construct the set so that every element of Ψ^R extends an element of Υ^R. How do we find Ψ^R? For each υ ∈ Υ^R, we show that there is at most one i such that τ_υ ∈ F_p(Ψ_i, t*_{i+1}). Call this i, if it exists, i_υ. We will show that all elements of Ψ_{i_υ} that extend υ actually extend some unique ξ ∈ Ξ. We denote this ξ by ξ_υ. We can form Ψ^R by taking all extensions of ξ_υ in Ψ for all υ.
To show how to implement the above in detail, first note that, while R_p(σ, t) is not monotonic in t, if some τ is in R_p(σ, t_i) but is not in R_p(σ, t_j), with t_j > t_i, then it must be that τ ∈ I_p(σ, t_j), so τ ∉ R_p(σ, t) for all t ≥ t_j.

Proposition 4.5.25. If υ ∈ Υ^R then there is at most one i < n_a such that τ_υ ∈ F_p(Ψ_i, t*_{i+1}).

Proof. If no such i exists then the proposition holds trivially. Otherwise, let i be the least element of {0, . . . , n_a − 1} such that τ_υ ∈ F_p(Ψ_i, t*_{i+1}). For some ψ ∈ Ψ_i, we have τ_υ ∈ F_p(ψ, t*_{i+1}). It follows that υ ≼ ψ, because if not, then ψ | υ, and so by Lemma 4.5.16, τ_υ ∈ I_p(υ, t*_{i+1}), whence τ_υ ∉ R_p(υ, t_mid), which is a contradiction. Hence, as υ ≼ ψ, by Lemma 4.5.10, τ_υ ∈ F_p(υ, t*_{i+1}). Thus τ_υ ∈ R_p(υ, t*_{i+1}), because τ_υ ∉ I_p(υ, t) for all t ≤ t_mid. Now for all t such that t*_{i+1} ≤ t ≤ t_mid, we have that τ_υ must remain in R_p(υ, t). Thus R_p(υ, t) ≠ ∅, and so if j > i, then there is no ψ′ ⪰ υ in Ψ_j, because the algorithm chooses only ψ′ that extend elements υ′ ∈ Υ for which R_p(υ′, t*_j) is empty. Hence, for all ψ′ ∈ Ψ_j, we have ψ′ | υ. If τ_υ ∈ F_p(ψ′, t*_{j+1}) then τ_υ ∈ I_p(υ, t_mid) (again by Lemma 4.5.16), which is false as τ_υ ∈ R_p(υ, t_mid). Hence, τ_υ ∉ F_p(Ψ_j, t*_{j+1}).

Now, by construction, for any υ ∈ Υ and any i < n_a, there is a single ξ ∈ Ξ such that any ψ ∈ Ψ_i that extends υ also extends ξ; i.e., υ ≼ ξ ≼ ψ. Hence we can define ξ_υ for all υ ∈ Υ^R as follows. Let i be the unique element of {0, . . . , n_a − 1} such that there exists a ψ ∈ Ψ_i with τ_υ ∈ F_p(ψ, t*_{i+1}) (assuming such an i exists), and define ξ_υ to be that element of Ξ such that all ψ ∈ Ψ_i that extend υ also extend ξ_υ. If no such i exists then let ξ_υ = υ0^e. Let Ψ^R = {ψ ∈ Ψ : ψ ⪰ ξ_υ for some υ ∈ Υ^R}. The following lemma shows us that this Ψ^R is really what we want.

Lemma 4.5.26. For any i < n_a and any υ ∈ Υ^R,

⟦τ_υ⟧ ∩ G^{r*_{n−i}}_{r*_{n−i−1}}(Ψ_i \ Ψ^R; t*_i, t*_{i+1}) = ∅.

Proof. Take any such i and υ. We have shown that if ψ ∈ Ψ_i and τ_υ ∈ F_p(ψ, t_mid), then ψ ⪰ ξ_υ and hence ψ ∈ Ψ^R. Thus τ_υ ∉ F_p(Ψ_i \ Ψ^R, t_mid), and so τ_υ ∉ F_p(Ψ_i \ Ψ^R, t*_{i+1}). Hence, ⟦τ_υ⟧ ∩ F_p(Ψ_i \ Ψ^R, t*_{i+1}) = ∅. The lemma now follows, as

G^{r*_{n−i}}_{r*_{n−i−1}}(Ψ_i \ Ψ^R; t*_i, t*_{i+1}) ⊆ F_{r*_{n−i−1}}(Ψ_i \ Ψ^R, t*_{i+1}) ⊆ F_p(Ψ_i \ Ψ^R, t*_{i+1}).
The remainder of the proof is essentially bookkeeping. We need to show that breaking the gain into the above two components gives us what we need.

Proposition 4.5.27. If the algorithm terminates and Σ̂ ⊆ Σ, then

μ(G^p_r(Σ̂{0}; t_0, t_mid)) ≥ (5/8)|Σ|2^{−r} − |Σ \ Σ̂|2^{−r} + (1 + b/2)a(Σ̂{0}, t_1) − a(Σ{0} \ Σ̂{0}, t_mid)D(b).

Proof. The previous lemma shows that the sets

⋃_{υ∈Υ̂^R} ⟦τ_υ⟧ and ⋃_{i=0}^{n_a−1} G^{r*_{n−i}}_{r*_{n−i−1}}(Ψ̂_i \ Ψ^R; t*_i, t*_{i+1})

are disjoint, so

μ(G^p_r(Σ̂{0}; t_0, t_mid)) ≥ Σ_{υ∈Υ̂^R} μ(⟦τ_υ⟧) + Σ_i μ(G^{r*_{n−i}}_{r*_{n−i−1}}(Ψ̂_i \ Ψ^R; t*_i, t*_{i+1})).

At this point, we use part (c) of outcome (ii). We know that the amount of gain that occurs on Ψ^R is bounded. So if we subtract it from Ψ̂_i, we can determine the maximum amount we can lose. Note that as Ψ̂_i ⊆ Ψ_i, it follows that

|Ψ_i \ (Ψ̂_i \ Ψ^R)| = |Ψ_i| − |Ψ̂_i \ Ψ^R| = |Ψ_i| − (|Ψ̂_i| − |Ψ̂_i ∩ Ψ^R|) = |Ψ_i \ Ψ̂_i| + |Ψ̂_i ∩ Ψ^R|.

With this identity we can determine that

μ(G^{r*_{n−i}}_{r*_{n−i−1}}(Ψ̂_i \ Ψ^R; t*_i, t*_{i+1})) ≥ (1 + b/2)a(Ψ̂_i, t*_{i+1}) − |Ψ_i \ (Ψ̂_i \ Ψ^R)|D(b)2^{−r*_{n−i−1}} = (1 + b/2)a(Ψ̂_i, t*_{i+1}) − |Ψ_i \ Ψ̂_i|D(b)2^{−r*_{n−i−1}} − |Ψ^R ∩ Ψ̂_i|D(b)2^{−r*_{n−i−1}}.

The first inequality comes from part (c) of outcome (ii) applied to our (r*_{n−i−1}, b, Ψ_i)-strategies.

Now we can combine these results along with the facts that a(Ψ̂_i, t*_{i+1}) = a(Ψ̂_i, t_1) and Σ_{υ∈Υ̂^R} μ(⟦τ_υ⟧) = |Υ̂^R|2^{−p} to get that

μ(G^p_r(Σ̂{0}; t_0, t_mid)) ≥ |Υ̂^R|2^{−p} + Σ_i (1 + b/2)a(Ψ̂_i, t_1) − Σ_i |Ψ_i \ Ψ̂_i|D(b)2^{−r*_{n−i−1}} − Σ_i |Ψ^R ∩ Ψ̂_i|D(b)2^{−r*_{n−i−1}}.

These terms can be simplified. First,

Σ_i (1 + b/2)a(Ψ̂_i, t_1) = (1 + b/2)a(Σ̂{0}, t_1).

Now define Υ̂^R_i = {υ ∈ Υ̂^R : ∃ψ ⪰ υ (ψ ∈ Ψ^R ∩ Ψ̂_i)}. Because any ψ ∈ Ψ^R ∩ Ψ̂_i extends some υ ∈ Υ̂^R, by the construction of Ψ^R, we have ψ ⪰ ξ_υ. It follows that |Ψ^R ∩ Ψ̂_i| ≤ |Υ̂^R_i|2^{r*_{n−i−1}−q}. Hence

Σ_i |Ψ^R ∩ Ψ̂_i|D(b)2^{−r*_{n−i−1}} ≤ Σ_i |Υ̂^R_i|D(b)2^{−q} ≤ |Υ̂^R|D(b)2^{−q}.

The second inequality is a consequence of the following lemma.

Lemma 4.5.28. The sets Υ̂^R_0, . . . , Υ̂^R_{n_a−1} are pairwise disjoint.

Proof. Assume υ ∈ Υ̂^R_i ∩ Υ̂^R_j. By definition, there exist ψ_0 and ψ_1, both extending υ, such that ψ_0 ∈ Ψ^R ∩ Ψ̂_i ⊆ Ψ_i and ψ_1 ∈ Ψ^R ∩ Ψ̂_j ⊆ Ψ_j. As both ψ_0 and ψ_1 are in Ψ^R, they both must extend ξ_υ. But the algorithm construction then ensures that i = j.

The (r*_{n−i−1}, b, Ψ_i)-strategy run guarantees that

a(Ψ_i \ Ψ̂_i, t_mid) ≥ 2^{−r*_{n−i−1}}|Ψ_i \ Ψ̂_i|,

which shows that

Σ_{i=0}^{n_a−1} |Ψ_i \ Ψ̂_i|D(b)2^{−r*_{n−i−1}} ≤ Σ_{i=0}^{n_a−1} a(Ψ_i \ Ψ̂_i, t_mid)D(b) = a(Σ{0} \ Σ̂{0}, t_mid)D(b).

Putting these inequalities together shows that

μ(G^p_r(Σ̂{0}; t_0, t_mid)) ≥ |Υ̂^R|2^{−p} − |Υ̂^R|D(b)2^{−q} + (1 + b/2)a(Σ̂{0}, t_1) − a(Σ{0} \ Σ̂{0}, t_mid)D(b).

Now because |Υ̂^R| ≥ ((3/4)|Σ| − |Σ \ Σ̂|)2^{p−r} and D(b)2^{−e} < 1/8 by choice of e, it follows that

|Υ̂^R|2^{−p} − |Υ̂^R|D(b)2^{−q} = |Υ̂^R|2^{−p}(1 − D(b)2^{−e}) ≥ (7/8)|Υ̂^R|2^{−p} ≥ (7/8)((3/4)|Σ| − |Σ \ Σ̂|)2^{p−r}2^{−p} ≥ (5/8)|Σ|2^{−r} − |Σ \ Σ̂|2^{−r}.

Finally,

μ(G^p_r(Σ̂{0}; t_0, t_mid)) ≥ (5/8)|Σ|2^{−r} − |Σ \ Σ̂|2^{−r} + (1 + b/2)a(Σ̂{0}, t_1) − a(Σ{0} \ Σ̂{0}, t_mid)D(b).
Proposition 4.5.29. If the algorithm terminates then, for any Σ̂ ⊆ Σ,

μ(G^{r′}_r(Σ̂; t_0, t_1)) ≥ (1 + (b+1)/2)a(Σ̂, t_1) − 2^{−r}D(b + 1)|Σ \ Σ̂|.
Proof. Proposition 4.5.17 tells us that

μ(G^{r′}_r(Σ̂; t_0, t_1)) ≥ μ(G^p_r(Σ̂{0}; t_0, t_mid)) + μ(G^p_r(Σ̂{1}; t_mid, t_1)).

All we need to do now is to combine and simplify Propositions 4.5.23 and 4.5.27. We have six terms to deal with:
1. (1 + b/2)a(Σ̂{1}, t_1),
2. −a(Σ{1} \ Σ̂{1}, t_1)D(b),
3. (5/8)|Σ|2^{−r},
4. −|Σ \ Σ̂|2^{−r},
5. (1 + b/2)a(Σ̂{0}, t_1), and
6. −a(Σ{0} \ Σ̂{0}, t_mid)D(b).

Notice that

a(Σ{1} \ Σ̂{1}, t_1)D(b) + a(Σ{0} \ Σ̂{0}, t_mid)D(b) ≤ a(Σ \ Σ̂, t_1)D(b) < |Σ \ Σ̂|(5/4)2^{−r}D(b),

where the last inequality follows from Proposition 4.5.21. Also,

(5/8)|Σ|2^{−r} + (1 + b/2)a(Σ̂{1}, t_1) + (1 + b/2)a(Σ̂{0}, t_1) = (1/2)(5/4)|Σ|2^{−r} + (1 + b/2)a(Σ̂, t_1) > (1/2)a(Σ̂, t_1) + (1 + b/2)a(Σ̂, t_1) = (1 + (b+1)/2)a(Σ̂, t_1).

And so, as D(b + 1) = (5/4)D(b) + 1,

μ(G^{r′}_r(Σ̂; t_0, t_1)) ≥ μ(G^p_r(Σ̂{0}; t_0, t_mid)) + μ(G^p_r(Σ̂{1}; t_mid, t_1)) > (1 + (b+1)/2)a(Σ̂, t_1) − |Σ \ Σ̂|(5/4)2^{−r}D(b) − |Σ \ Σ̂|2^{−r} = (1 + (b+1)/2)a(Σ̂, t_1) − 2^{−r}((5/4)D(b) + 1)|Σ \ Σ̂| = (1 + (b+1)/2)a(Σ̂, t_1) − 2^{−r}D(b + 1)|Σ \ Σ̂|.
Proposition 4.5.30. If the algorithm terminates then outcome (ii) occurs.

Proof. By Proposition 4.5.21, condition (ii)(a) holds. By Proposition 4.5.29, condition (ii)(c) holds. Thus condition (ii)(b) holds by taking Σ̂ to be Σ.

Hence we have established the existence of an (r, b + 1, Σ)-strategy. By induction we can assert the existence of an (r, b, Σ)-strategy for any r, b ∈ N such that |Σ|(5/4)2^{−r} ≤ 1.
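The function D(b) used in these bounds is easy to check numerically. The following is our own sketch: D(0) = 1 is inferred from the closed form 5(5/4)^b − 4 given in the choice of e, and we verify both the recurrence D(b + 1) = (5/4)D(b) + 1 used in Proposition 4.5.29 and the bound D(b)2^{−(b+6)} < 1/8 behind the choice e = b + 6:

```python
from fractions import Fraction

def D(b):
    # D(0) = 1 and D(b+1) = (5/4) * D(b) + 1,
    # which matches the closed form 5 * (5/4)^b - 4.
    d = Fraction(1)
    for _ in range(b):
        d = Fraction(5, 4) * d + 1
    return d
```

Since (5/4)^b grows much more slowly than 2^b, the quantity D(b)2^{−(b+6)} in fact tends to 0, so the 1/8 bound holds with plenty of room.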
Proposition 4.5.31. If r ≥ 1 and b2^{−r−1} ≥ 1, then for any σ such that a(σ, t_0) = 0, an (r, b, {σ})-strategy achieves outcome (i).

Proof. Because |{σ}|(5/4)2^{−r} ≤ 1, we have sufficient measure to implement such a strategy. However, if the algorithm terminates at some stage t_1, then

μ(G^{r′}_r({σ}; t_0, t_1)) ≥ (1 + b/2)a(σ, t_1) ≥ (1 + b/2)2^{−r} = 2^{−r} + b2^{−r−1} > 1,

which is impossible. Thus outcome (i) must occur.

We can run a countable number of winning strategies in our construction of a by dovetailing them.

Proposition 4.5.32. There is a c.e. semimeasure a such that for all i, we have a(0^i 1) ≤ 2^{−2i−1} and there exists a σ_i such that Km(0^i 1σ_i) > −log(a(0^i 1σ_i)).

Proof. The semimeasure a can be constructed by running a (2i + 2, 2^{2i+3}, {0^i 1})-strategy for each i. Then for all i, we have a(0^i 1) < (5/4)2^{−2i−2} < 2^{−2i−1}. Also, by Proposition 4.5.31, because 2^{2i+3} · 2^{−(2i+2)−1} = 1, the strategy achieves outcome (i), and hence there is some σ_i such that Km(0^i 1σ_i) > −log(a(0^i 1σ_i)).

Now Theorem 4.5.1 holds by Propositions 4.5.5 and 4.5.32.

To determine a lower bound on the difference between Km and KM, we need to bound the maximum string length used by an (r, b, Σ)-strategy. We will do so by determining an upper bound for the number of bits appended to a string in Σ by the strategy. If b = 0, then the strategy does not use any extensions of strings in Σ, so the number of extra bits appended is 0. If b > 0, an upper bound is the number of bits needed to extend a string in Σ to a string in Ψ_0. Such an extension requires 1 + r*_{n−1} − r = 1 + r′ − r − S(b − 1) = 1 + S(b) − S(b − 1) many bits (the extra 1 corresponds to the use of a 0 or 1 to distinguish between the advantage and spend phases). Now an upper bound on the maximum string length used can be obtained by taking the maximum length l of any string in Σ and adding the upper bound for the number of bits added by an (r, i, Σ)-strategy for all i with 1 ≤ i ≤ b. Hence, we have that the maximum possible string length is

l + Σ_{i=1}^{b} (1 + S(i) − S(i − 1)) = l + b + S(b) − S(0) = l + b + S(b).
Recall that S(b) can be defined inductively by S(0) = 0 and S(b + 1) = (8 + 2^{b+8})S(b) + b + 9. Now we will show that for b sufficiently large, S(b) < 2^{b²}. First note that if b ≥ 1 then S(b) ≥ 9, so

S(b + 1) = (8 + 2^{b+8})S(b) + b + 9 ≤ S(b) · 2^{b+9}.

Now we will show by induction that if n ≥ 2, then S(n) < 2^4 ∏_{i=1}^{n−1} 2^{i+9}. As our base case, we have S(2) ≤ S(1) · 2^{1+9} < 2^4 · 2^{1+9}. Assuming that our inequality holds for n, we have

S(n + 1) ≤ S(n) · 2^{n+9} < 2^4 ∏_{i=1}^{n−1} 2^{i+9} · 2^{n+9} = 2^4 ∏_{i=1}^{n} 2^{i+9}.

So if n ≥ 2, then

S(n) < 2^4 ∏_{i=1}^{n−1} 2^{i+9} = 2^{4+9(n−1)} ∏_{i=1}^{n−1} 2^i = 2^{4+9(n−1)} · 2^{(n−1)n/2} < 2^{n²/2 + 9n}.
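The recurrence and the growth bound just derived are easy to sanity-check numerically (our own sketch; for instance S(2) = 4690):

```python
def S(b):
    # S(0) = 0 and S(b+1) = (8 + 2^(b+8)) * S(b) + b + 9.
    s = 0
    for i in range(b):
        s = (8 + 2 ** (i + 8)) * s + i + 9
    return s

# S(n) stays below 2^(n^2/2 + 9n) for n >= 2,
# and hence below 2^(n^2) once n is large enough.
```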
Thus, if n is sufficiently large then S(n) < 2^{n²}.

We will now prove the main result of this section.

Proof of Theorem 4.5.4. We need to generalize Proposition 4.5.5 a little. Fix n. We have Σ_i 2^{−i/n} < ∞, so let k_n be such that 2^{k_n} is larger than this sum. Now we can construct a semimeasure a such that a(0^i 1) ≤ 2^{−(1+1/n)i−k_n} for all i. We construct a by running a (j(n + 1) + k_n + 1, 2^{j(n+1)+k_n+2}, {0^{jn} 1})-strategy for each j. We do not exceed the bound on the measure for a because

a(0^{jn} 1) < (5/4)2^{−j(n+1)−k_n−1} < 2^{−j(n+1)−k_n} = 2^{−(1+1/n)jn−k_n}.

Because b2^{−r−1} = 2^{j(n+1)+k_n+2} · 2^{−j(n+1)−k_n−2} = 1, each strategy achieves outcome (i). Thus, for each j there is a σ_j ⪰ 0^{jn} 1 such that Km(σ_j) > −log(a(σ_j)). We can apply the same scaling as before to a to construct a semimeasure m; i.e., for all σ, we let m(0^i σ) = 2^i a(0^i σ). Again, we have

m(λ) = Σ_i m(0^i 1) = Σ_i 2^i a(0^i 1) ≤ Σ_i 2^{−i/n−k_n} = 2^{−k_n} Σ_i 2^{−i/n} ≤ 1.

We now have that for each j there is a σ_j ⪰ 0^{jn} 1 such that Km(σ_j) > −log(m(σ_j)) + jn. By the discussion immediately preceding this proof, the maximum length of a string used by the strategy corresponding to j is at most |0^{jn} 1| + S(b) + b, where in this case b = 2^{j(n+1)+k_n+2}, and if j is large enough then this number is less than jn + 1 + 2^{(2^{j(n+1)+k_n+2})²} + 2^{j(n+1)+k_n+2}. Again if j is large enough, this number is less than 2^{2^{2j(n+2)}}. Hence, for infinitely many j, we can take σ_j so that |σ_j| < 2^{2^{2j(n+2)}}, that is, log log |σ_j| < 2j(n + 2). As M majorizes all c.e. continuous semimeasures, there must be some d_n such that Km(σ_j) − KM(σ_j) > jn − d_n. Thus we have

Km(σ_j) − KM(σ_j) > jn − d_n = 2j(n + 2) · (n/(2(n + 2))) − d_n > (n/(2(n + 2))) log log |σ_j| − d_n.

For any c < 1/2, there is an n such that n/(2(n + 2)) > c + ε for some ε > 0. Now, for sufficiently large j it must be the case that ε log log |σ_j| > d_n. Hence,
for such j, Km(σj ) − KM (σj ) > (c + ε) log log |σj | − dn > c log log |σj |.
5 Effective Reals
In this chapter we study the left computably enumerable reals, and discuss some other classes of effectively approximable reals. Left computably enumerable reals occupy a similar place in the study of relative randomness to that of computably enumerable sets in the study of relative computational complexity. Most of the material in this chapter used in the rest of the book is contained in the first two sections. Recall that all the reals we consider are in [0, 1], and are identified with elements of 2ω and with subsets of N. Recall also our convention that all reals will be irrational and hence have unique binary expansions.
R.G. Downey and D. Hirschfeldt, Algorithmic Randomness and Complexity, Theory and Applications of Computability, DOI 10.1007/978-0-387-68441-3_5, © Springer Science+Business Media, LLC 2010

5.1 Computable and left-c.e. reals

Let α be a real. By L(α) we mean the left cut of α, that is, {q ∈ Q : q < α}. It is well known that such cuts can be used to define the reals from the rationals, as can Cauchy sequences. As we will see, both of these approaches can be effectivized.

Recall that we identify a real α with the set of natural numbers A such that α = 0.A, that is, such that the nth bit of the binary expansion of α is 1 iff n ∈ A. Thus, it is natural to define α to be a computable real if A is a computable set. Equivalently, α is computable if there is an algorithm that, on input n, returns the nth bit of α. (Notice that we could have used a base other than base 2 in this definition; it is easy to check that the computability of the binary expansion of α is equivalent to the computability
of the n-ary expansion of α for any n > 1.) Another natural definition, based on the Dedekind cut approach, is to say that α is computable if L(α) is computable. Fortunately, these two definitions agree.

Proposition 5.1.1. A real α is computable iff L(α) is computable.

Proof. If L(α) is computable then we can compute α ↾ n, since it is the lexicographically largest string σ ∈ 2^n such that 0.σ ∈ L(α). (By our convention mentioned above, we assume that α is irrational, so α ≠ 0.σ.)

If α is computable, then the following is an algorithm for computing L(α). Let q_n = 0.(A ↾ (n + 1)) (i.e., q_n is the rational whose binary expansion is given by the first n + 1 bits of the characteristic function of A). Given q ∈ Q, wait until either q ≤ q_n for some n, in which case q ∈ L(α), or q − q_n ≥ 2^{−n} for some n, in which case q ∉ L(α), since α − q_n < 2^{−n}. It is clear that one of the two cases must occur.

Relativizing this result, we see that a real α has the same degree as L(α). Soare [364] observed that (by the above proof) α ≤_tt L(α) for all α, but he also showed that there is a c.e. set A such that for α = 0.A, we have L(α) ≰_tt α.

Turning to the Cauchy sequence approach, a natural effectivization is to consider those reals α that are limits of computable sequences of rationals. As we will see, though, this class is much larger than that of the computable reals, even if we insist that the sequences of rationals be monotonic. The reason is that, to fully effectivize the notion of Cauchy sequence, we should require not only that the sequence be computable, but also that it have a computable rate of convergence. The following result is implicit in Turing [392].

Theorem 5.1.2 (Turing [392]). The following are equivalent.
(i) The real α is computable.
(ii) There is a computable sequence of rationals q_0, q_1, . . . → α such that |α − q_n| < 2^{−n} for all n.
(iii) There is a computable sequence of rationals q_0, q_1, . . . → α and a computable function f such that |α − q_{f(n)}| < 2^{−n} for all n.

Proof. (i) ⇒ (ii) If α = 0.A for a computable set A then let q_n = 0.(A ↾ (n + 1)). Then q_0, q_1, . . . → α and |α − q_n| < 2^{−n} for all n.

(ii) ⇒ (iii) Obvious.

(iii) ⇒ (i) Let r_n = q_{f(n)}. By our convention mentioned above, we assume α is irrational. For each k there must be an n such that the first k + 1 bits of the binary expansions of r_n − 2^{−n} and r_n + 2^{−n} agree, since any convergent sequence of rationals not having this property must converge to a rational. Given k, search for an n with the above property and define A(k) to be the kth bit of r_n. Since α ∈ (r_n − 2^{−n}, r_n + 2^{−n}), the kth bit of α must be the same as that of the binary expansion of r_n, whence α = 0.A.
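The (iii) ⇒ (i) direction is fully effective and can be sketched in Python. The helper names and the choice of √2 − 1 as the example irrational are ours; sqrt2_minus_1 supplies a computable Cauchy sequence with modulus 2^{−n} via Newton iteration, and nth_bit searches, as in the proof, for a stage at which the real is trapped inside a single dyadic interval:

```python
import math
from fractions import Fraction

def sqrt2_minus_1(n):
    """Rational q with |(sqrt(2) - 1) - q| < 2^-n, via Newton iteration."""
    q = Fraction(3, 2)
    while abs(q * q - 2) >= Fraction(1, 2 ** (2 * n)):
        q = (q + 2 / q) / 2          # Newton step for x^2 = 2
    return q - 1

def nth_bit(k, approx):
    """kth binary digit of the irrational real in [0,1) approximated by approx:
    search for an n at which the whole interval around approx(n) lies in one
    dyadic interval of length 2^-(k+1)."""
    n = k + 1
    while True:
        q = approx(n)
        lo, hi = q - Fraction(1, 2 ** n), q + Fraction(1, 2 ** n)
        if 0 <= lo and hi < 1:
            cell = math.floor(lo * 2 ** (k + 1))
            if cell == math.floor(hi * 2 ** (k + 1)):
                return cell % 2      # the real shares its first k+1 bits with lo
        n += 1
```

For instance, [nth_bit(k, sqrt2_minus_1) for k in range(8)] returns [0, 1, 1, 0, 1, 0, 1, 0], the first eight bits of √2 − 1. Irrationality of the target real guarantees that the search terminates, exactly as in the proof.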
(iii) There is a computable sequence of rationals q0 < q1 < · · · → α such that {qi : i ∈ ω} is computable.1 (iv) There is a computable sequence of positive rationals r0 , r1 , . . . such that α = n rn . Proof. (i) ⇒ (iii) Suppose α is left-c.e. For q ∈ Q, let #(q) be the G¨odel number of q, as in Definition 2.1.1. Let q0 ∈ L(α). Given qn ∈ L(α), wait for 1 Note that the difference between items (ii) and (iii) is that in (iii) we require not only that the sequence be computable, but also that its range be computable.
5. Effective Reals
a stage s such that for q = max L(α)[s], we have qn < q and #(qn) < #(q). Such a stage must exist because L(α) has no largest element. Let qn+1 = q. It is easy to see that q0 < q1 < · · · → α. Since the qi are increasing in Gödel number, {qi : i ∈ ω} is computable. (iii) ⇒ (ii) Obvious. (ii) ⇒ (iv) Let r0 = q0 and rn+1 = qn+1 − qn. (iv) ⇒ (i) L(α) = {q ∈ Q : ∃k (q < ∑n≤k rn)}, which is clearly c.e. A real of the form 0.A for a c.e. set A is called strongly c.e. More generally, we say that a set A is almost c.e. if 0.A is a left-c.e. real. It is not hard to show that A is almost c.e. iff it has a computable approximation A0, A1, . . . such that if As+1 ↾ n ≠ As ↾ n, then there is no t > s with At ↾ n = As ↾ n. That is, once the approximation to an initial segment of A has changed, it can never return to its old value. This property is enough to carry out many constructions
involving c.e. sets. Of course, c.e. sets and almost c.e. sets have the same degrees, since if A is almost c.e. then L(0.A) is c.e. and has the same degree as A. Proposition 5.1.8. There is an almost c.e. set that is not c.e. Proof. We build an approximation to an almost c.e. set A. We begin with A0(2e) = 0 and A0(2e + 1) = 1 for all e. At stage s, if 2e + 1 enters We then we let As+1(2e + 1) = 0 and As+1(2e) = 1. (Otherwise, As+1(2e) = As(2e) and As+1(2e + 1) = As(2e + 1).) Clearly, A is almost c.e. and A ≠ We for all e. Corollary 5.1.9 (Soare [363]). There is a left-c.e. real that is not strongly c.e. It is easy to adapt the above proof to show that there is an almost c.e. set that is not n-c.e. for any n. On the other hand, every almost c.e. set A is ω-c.e., since the approximation to A ↾ k can change at most 2k many times. Of course, not every real of the form 0.A for an ω-c.e. set A is left-c.e. For example, if α is right-c.e. and not computable, then it cannot be left-c.e. The following is a less obvious example. Theorem 5.1.10 (Ambos-Spies, Weihrauch, and Zheng [12]). There is a d.c.e. set D such that 0.D is neither left-c.e. nor right-c.e. Proof. Let A |T B be c.e. sets. Let D = Ā ⊕ B, where Ā is the complement of A; note that D is d.c.e. Assume for a contradiction that 0.D is left-c.e. (the right-c.e. case being analogous). Then D is almost c.e. We claim that we can compute A using B. Suppose we know A ↾ n. Let s be such that Ds ↾ 2n = D ↾ 2n. We can find such an s using A ↾ n and B. Now let t ≥ s be such that either n ∈ At or Dt(2n) = 1, which must exist by the definition of D. If n ∈ At then we know that n ∈ A. If Dt(2n) = 1, then we must have D(2n) = 1, as the approximation to D(2n) cannot change from 1 to 0 unless the approximation to D ↾ 2n changes. So in this case we know that n ∉ A. Thus A ≤T B, which is a contradiction. Four important examples of left-c.e. reals that figure prominently in the study of algorithmic randomness are the halting probabilities of prefix-free machines, the weights of c.e.
prefix-free sets (where the weight of a prefix-free set of strings S is μ(S) = ∑σ∈S 2−|σ|), the measures of Σ01 classes, and the least members of Π01 classes. Indeed, all of these examples exactly characterize the left-c.e. reals. The following theorem assembles the work of several researchers in the area. Theorem 5.1.11. The following are equivalent. (i) The real α is left-c.e. (ii) There is a prefix-free machine M such that α = μ(dom M ).
(iii) There is a c.e. prefix-free set A ⊂ 2<ω such that α = ∑σ∈A 2−|σ|. (iv) There is a computable prefix-free set B ⊂ 2<ω such that α = ∑σ∈B 2−|σ|. (v) There is a Σ01 class C such that α = μ(C). (vi) There is a Π01 class P such that α is the least element of P. (vii) There is a computable tree T such that α is the leftmost path of T. Proof. (i) ⇒ (ii) Let α be left-c.e. Then there is a computable sequence of positive rationals whose sum is α. Each rational is a sum of finitely many rationals of the form 2−n, so there is a computable sequence n0, n1, . . . such that ∑i 2−ni = α. Then {(ni, λ)}i∈ω is a KC set, so by the KC Theorem there is a prefix-free machine M such that μ(dom M ) = α. (ii) ⇒ (iii) Let A = dom M. (iii) ⇒ (iv) Let B = {τ : ∃σ ⪯ τ (σ ∈ A|τ| \ A|τ|−1)}. (iv) ⇒ (v) Let C = ⟦B⟧, the class of all reals extending some string in B. (v) ⇒ (vi) Let R be a c.e. prefix-free set such that C = ⟦R⟧, and let P be the class of reals β such that β ≥ μ(C). It is easy to check that β ∈ P iff 0.(β ↾ n + 1) + 2−n ≥ ∑σ∈Rn 2−|σ| for all n, so P is a Π01 class. (vi) ⇒ (vii) Let T be such that P = [T]. (vii) ⇒ (i) Let σs be the leftmost node at level s of T and let qs = 0.σs. It is easy to check that q0 ≤ q1 ≤ · · · → α. So we see that left-c.e. reals correspond to (the measures of) domains of prefix-free machines, much as c.e. sets correspond to the domains of Turing machines. This fact helps explain the importance of left-c.e. reals to the theory of algorithmic randomness. One easy but useful fact about the class of left-c.e. reals is that it is closed under addition and multiplication. Recall that we always work modulo 1, so by α + β we mean the fractional part of α + β. Proposition 5.1.12. If α and β are left-c.e. reals, then so are α + β and αβ.
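The equivalence between items (ii) and (iv) of Theorem 5.1.5 and the closure claimed in Proposition 5.1.12 can be simulated concretely. The following is a minimal Python sketch, not part of the original text: the particular series for α and β are invented for illustration, and exact rational arithmetic stands in for a genuine computable approximation.

```python
from fractions import Fraction

def partial_sums(terms):
    """Nondecreasing rational approximation q0 <= q1 <= ... obtained as the
    partial sums of a computable sequence of positive rationals
    (Theorem 5.1.5(iv) -> (ii))."""
    out, total = [], Fraction(0)
    for t in terms:
        total += t
        out.append(total)
    return out

def sum_of_reals(a_approx, b_approx):
    """Termwise sum of two nondecreasing approximations: a nondecreasing
    approximation to alpha + beta, as in Proposition 5.1.12."""
    return [a + b for a, b in zip(a_approx, b_approx)]

# Toy left-c.e. reals (illustrative choices): alpha = sum 2^-(n+1) -> 1,
# beta = sum 3^-(n+1) -> 1/2.
alpha = partial_sums(Fraction(1, 2 ** (n + 1)) for n in range(20))
beta = partial_sums(Fraction(1, 3 ** (n + 1)) for n in range(20))
gamma = sum_of_reals(alpha, beta)
assert all(x <= y for x, y in zip(gamma, gamma[1:]))  # nondecreasing
```

The point of the sketch is only that the termwise sum of two increasing rational approximations is again an increasing rational approximation, to α + β.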
5.2 Real-valued functions In this section we consider effective functions from a countable domain D (such as N, 2<ω , or Q) to R. A sequence of reals is uniformly computable if the corresponding sequence of left cuts is uniformly computable, and uniformly left-c.e. if the corresponding sequence of left cuts is uniformly c.e. Definition 5.2.1. Let f : D → R.
(i) The function f is computable if its values are uniformly computable reals. (ii) The function f is computably enumerable or Σ01 if its values are uniformly left-c.e. reals. Both of these definitions can be relativized in the obvious way. It is easy to check that f is c.e. iff there are uniformly computable functions fs : D → Q such that for all s and x ∈ D, we have fs+1(x) ≥ fs(x) and lims fs(x) = f(x); in other words, f can be approximated from below. Another characterization is that f is c.e. iff the set {(x, q) ∈ D × Q : q < f(x)} is c.e., and f is computable iff this set is computable. It is important to note that when we define a computable function f : D → Q, we always mean one for which there is an algorithm to determine f(x) given x, not a computable function f : D → R that happens to take values in Q. We also define a function f : D → R to be co-c.e. if its values are uniformly right-c.e. reals (i.e., their left cuts are uniformly co-c.e.). Note that a function is co-c.e. iff it can be approximated from above.
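The characterization of c.e. functions via the set {(x, q) : q < f(x)} can be illustrated directly: membership is verified by searching for a witnessing stage, and no finite search ever certifies a negative answer. A toy Python sketch (the function f and its stage approximations fs are invented for illustration):

```python
from fractions import Fraction

def fs(x, s):
    """Stage-s lower approximation to a c.e. function f. Toy example
    (invented): f(x) = x + 1, approximated from below, so fs(x) is
    nondecreasing in s with limit x + 1."""
    return Fraction(x) + 1 - Fraction(1, s + 1)

def semidecide_less(x, q, max_stage):
    """Enumerate the c.e. set {(x, q) : q < f(x)}: search for a stage s
    witnessing q < fs(x). Returns the witnessing stage, or None if none is
    found by max_stage -- a 'no' is never certified, since the set is c.e.
    but in general not computable."""
    for s in range(max_stage):
        if q < fs(x, s):
            return s
    return None

assert semidecide_less(3, Fraction(7, 2), 100) is not None  # 7/2 < f(3) = 4
assert semidecide_less(3, Fraction(4), 100) is None         # 4 < 4 never witnessed
```

Because fs(x) increases with s, a witness found at one stage remains valid forever, which is exactly what makes the set {(x, q) : q < f(x)} c.e.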
5.3 Representing left-c.e. reals As we have seen, two ways to represent a left-c.e. real α are by a computable nondecreasing sequence of rationals with limit α and by the measure of a Σ01 class. In this section, we explore computability-theoretic aspects of these two ways of approximating left-c.e. reals.
5.3.1 Degrees of representations A representation of a left-c.e. real α is the range of a computable sequence of rationals q0 ≤ q1 ≤ · · · → α. Representations were first analyzed by Calude, Coles, Hertling, and Khoussainov [47]. By the relativized version of Proposition 5.1.6, every representation of a left-c.e. real α is computable in α, and by Theorem 5.1.5, α has a computable representation. The following result gives some more information about the possible degrees of representations of α. Recall that a c.e. splitting of a c.e. set A is a pair of disjoint c.e. sets B and C such that A = B ∪ C. Proposition 5.3.1 (Calude, Coles, Hertling, and Khoussainov [47]). Let α be a left-c.e. real. Every representation of α is half of a c.e. splitting of L(α). Proof. Let q0 ≤ q1 ≤ · · · → α be a computable sequence of rationals, and let A be its range. Clearly, A is an infinite c.e. subset of L(α). We claim that so is L(α) \ A. By our convention that α is irrational, for each n there
is an m > n such that qm > qn. Thus q ∈ L(α) \ A iff q ∈ L(α) and there is an n such that q < qn and q ≠ qm for all m < n. Thus L(α) \ A is c.e. If A is half of a c.e. splitting of a c.e. set B, then A ≤wtt B, because to tell whether n ∈ A, we first check whether n ∈ B, and if so then we enumerate A and B \ A until n appears in one of these sets. Thus we have the following result. Corollary 5.3.2 (Soare [364]). If A is a representation of α then A ≤wtt L(α). It is not difficult to show, using a diagonalization argument, that the converse to Proposition 5.3.1 is not true. However, we do have the following partial converse. Proposition 5.3.3 (Calude, Coles, Hertling, and Khoussainov [47]). Let A be a representation of α. If B ⊆ A is infinite, then B represents α iff B is half of a c.e. splitting of A. Proof. The “only if” direction has the same proof as Proposition 5.3.1. For the other direction, let q0 ≤ q1 ≤ · · · → α be a computable sequence of rationals with range A. Since B and A \ B are both c.e., given n we can compute whether qn ∈ B, and hence effectively produce an infinite subsequence of the qn with range B. Calude, Coles, Hertling, and Khoussainov [47] also showed that every left-c.e. real α has a representation of the same degree as α. Their result follows immediately from the following exact characterization of the m-degrees of representations of a left-c.e. real. Theorem 5.3.4 (Downey [102]). Let α be a left-c.e. real. For every c.e. splitting B ⊔ C of L(α), there is a representation A of α such that A ≡m B. Thus, the m-degrees of halves of c.e. splittings of L(α) and the m-degrees of representations of α coincide. Proof. Let B ⊔ C = L(α). We assume without loss of generality that B is not computable. We give the rationals Gödel numbers as in Section 2.1. We take approximations to B, C, and L(α) with the following properties. First, Bs ⊔ Cs = L(α)s for all s. Next, each element of L(α)s has Gödel number less than s.
Finally, if q ∈ L(α)s and r < q has Gödel number less than s, then r ∈ L(α)s. We build a computable sequence of rationals a0 < a1 < · · · and let A be its range. In our construction we will use parameters ms, with m0 = 0. At stage s, proceed as follows. If there is no rational greater than ms in Bs, then let ms+1 = ms and proceed to the next stage. Otherwise, add all elements of Bs greater than ms to our sequence, in increasing order. (Here we mean the usual order of the rationals.) Let ms+1 be the largest rational (again in the usual order) in L(α)s and proceed to the next stage.
We claim that lims ms = ∞. Suppose otherwise, and let t be a stage such that mt = lims ms. Let q be a rational. If q ∉ Bt and q > mt then q ∉ B, since otherwise q would be added to our sequence at the stage u ≥ t at which it enters B, and hence we would have mu+1 > mt. If q ≤ mt then q ∈ L(α), so by enumerating B and C until q enters one of them, we can determine whether q ∈ B. Thus B is computable, contrary to hypothesis. It now follows from our claim that a0 < a1 < · · · is an infinite sequence whose limit is the same as the supremum of L(α), that is, α. Thus A is a representation of α. Let S be the set of all rationals q such that q ≤ ms, where s is one more than the Gödel number of q. Note that S is computable. If q ∈ S then q ∈ A iff q ∈ As+1, and, as above, q ∈ L(α), so by enumerating B and C until q enters one of them, we can determine whether q ∈ B. Thus A ∩ S and B ∩ S are both computable. Now suppose that q ∉ S. Then either q > mt for all t, in which case q ∉ A and q ∉ B, or there is a least t such that q ≤ mt. In the latter case, by our assumptions on the enumerations of B, C, and L(α), we must have q ∈ L(α)t−1, and hence q ∈ A iff q ∈ Bt−1 iff q ∈ B. So if q ∉ S then q ∈ A iff q ∈ B. Thus A ≡m B. This characterization, together with known results on c.e. splittings and wtt-degrees, yields several facts about representations, including many of the results in [47]. For instance, by Sacks' Splitting Theorem 2.12.1, every noncomputable left-c.e. real α has representations in infinitely many degrees (because Theorem 2.12.1 implies that we can split L(α) into Turing incomparable A0 and A1; but then we can also split A0 into Turing incomparable B0 and B1 and have a representation of α of the same degree as B0, and continue in this manner to obtain representations in infinitely many degrees). The following are further examples. For proofs of the corresponding theorems on splittings, see Downey and Stob [133]. Corollary 5.3.5. (i) There is a left-c.e.
real α with representations of every c.e. m-degree. (ii) There is a left-c.e. real β such that the set of Turing degrees of representations of β forms an atomless Boolean algebra that is nowhere dense in the c.e. degrees. We can also obtain negative results on the existence of representations. For instance, Downey [101] showed that a c.e. degree a contains a set with splittings in each c.e. degree below a iff a is either (Turing) complete or low2. Corollary 5.3.6. If a left-c.e. real α has representations in each degree below that of α, then α is either complete or low2. It is not known whether every nonzero c.e. degree contains a left-c.e. real α that does not have representations in every c.e. degree below that of α.
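The reduction behind Corollary 5.3.2 — if A is half of a c.e. splitting of B, then A ≤wtt B — is simply a race between the enumerations of the two halves. A toy Python sketch of that race, with finite lists standing in for c.e. enumerations (all data invented for illustration):

```python
def wtt_decide(n, enum_B, enum_A, enum_C):
    """Decide n in A, where A and C form a c.e. splitting of B.
    First query B (this single query is what bounds the use); if n is in B,
    race the enumerations of the two halves until n shows up in one of them.
    enum_* are finite lists standing in for c.e. enumerations (toy data)."""
    if n not in enum_B:
        return False  # n not in B = A ∪ C, hence not in A
    # n is in B, so it must eventually appear in exactly one of the halves
    for stage in range(max(len(enum_A), len(enum_C))):
        if n in enum_A[:stage + 1]:
            return True
        if n in enum_C[:stage + 1]:
            return False
    raise RuntimeError("n is in B but in neither half: not a splitting")

# Toy splitting: B = {1, 2, 3, 5} split into A = {1, 5} and C = {2, 3},
# listed in (arbitrary) enumeration order.
B, A, C = [3, 1, 5, 2], [5, 1], [2, 3]
assert wtt_decide(5, B, A, C) and not wtt_decide(2, B, A, C)
assert not wtt_decide(4, B, A, C)
```

The search through the halves is unbounded in time but the oracle use is just the single query "n ∈ B?", which is why the reduction is a wtt-reduction rather than merely a Turing reduction.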
5.3.2 Presentations of left-c.e. reals Another natural way to represent left-c.e. reals is via measures. Definition 5.3.7 (Downey and LaForte [122]). A presentation of a left-c.e. real α is a prefix-free c.e. set A ⊂ 2<ω such that α = μ(A) = ∑σ∈A 2−|σ|. We have seen that every left-c.e. real α has a representation of degree α. The same is not true of presentations. Downey and LaForte [122] showed that there is a noncomputable left-c.e. real all of whose presentations are computable. We will obtain this result as a corollary to Theorem 5.3.13 below. Downey and LaForte [122] also showed that such “only computably presentable” reals can be high, but that, as we will see in Theorem 5.3.16 below, they cannot be of promptly simple degree. Using a 0′′′ argument (an elaboration of the infinite injury method discussed in Chapter 2), Wu [407] constructed a c.e. set A such that any A-computable, noncomputable left-c.e. real has a noncomputable presentation.2 Strong reducibilities often play a large role in effective mathematics, since naturally occurring reductions tend to be stronger than Turing reductions. For instance, the word problem for a finitely presented group is tt-reducible to the conjugacy problem (see [189]), algebraic closure is related to a reducibility known as Q-reducibility (see [35, 123, 257, 421]), and wtt-degrees characterize the degrees of bases of a c.e. vector space (see [131]). Here too, the classification of the degrees realized as presentations of left-c.e. reals appears to depend on a stronger reducibility than Turing reducibility. In this case, the relevant reducibility seems to be weak truth table reducibility. The following result is easy to prove. Proposition 5.3.8 (Downey and LaForte [122]). Let α be a left-c.e. real and let A be a presentation of α. Then A ≤wtt α with use function the identity. What is interesting is that there is a sort of converse to this result. Theorem 5.3.9 (Downey and LaForte [122]). If A is a presentation of a left-c.e.
real α and C ≤wtt A is c.e., then there is a presentation B of α with B ≡wtt C. Proof. We assume that A and C are infinite, since there is nothing to prove if A is computable, and if C is finite then we can replace it by any infinite computable set. Let Γ be a reduction with computable use γ such that ΓA = C. As usual, we may assume that γ is increasing. We also may assume that at each stage s, exactly one element ns enters C and ΓA(m)[s] = Cs(m) for all m < s. We choose a pairing function so that
2 As we will see in Theorem 9.3.1, this property of always having noncomputable presentations is also true of reals whose initial segments have fairly low Kolmogorov complexity.
⟨m, n⟩ > m, n for all m and n, in addition to the usual requirement that it be increasing in both arguments. We build B by enumerating strings of certain lengths. We will ensure that ∑σ∈Bs 2−|σ| = ∑σ∈As 2−|σ| ≤ 1 for all s, which, by the same argument as in the proof of the KC Theorem, means that we will never run out of strings. It will, of course, also ensure that B is a presentation of α. To build B, at stage s we proceed as follows for each σ entering A at stage s. If |σ| < γ(ns) then we enumerate 2⟨|σ|,ns⟩−|σ| many strings of length ⟨|σ|, ns⟩ into B. If |σ| ≥ γ(ns) then we enumerate 2⟨|σ|,|σ|+s⟩−|σ| many strings of length ⟨|σ|, |σ| + s⟩ into B. We clearly have ∑σ∈Bs 2−|σ| = ∑σ∈As 2−|σ| for all s, which means that, as mentioned above, we can always find available strings of the appropriate lengths, and also that ∑σ∈B 2−|σ| = ∑σ∈A 2−|σ| = α, so B is a presentation of α. Given n, let s > n be least such that Bs agrees with B on all strings of length less than ⟨γ(n), n⟩. We claim that if n ∉ Cs then n ∉ C. Assume for a contradiction that n ∈ C \ Cs. Then n = nt for some t > s. Since ΓA(n)[t − 1] = Ct−1(n) = 0 and ΓA(n)[t] = Ct(n) = 1, there must be some σ with |σ| < γ(n) entering A at stage t. Thus strings of length ⟨|σ|, n⟩ enter B at stage t, which is a contradiction because ⟨|σ|, n⟩ < ⟨γ(n), n⟩. So we can compute whether n ∈ C if we know s, which requires knowing the value of B only on strings of length less than ⟨γ(n), n⟩. Since γ is computable, C ≤wtt B. Now let τ ∈ 2<ω. If there are no i and n such that |τ| = ⟨i, n⟩ then τ ∉ B. Otherwise, let i and n be such that |τ| = ⟨i, n⟩. If i ≥ γ(n), then τ can enter B at stage s only if s = n − i, so we can computably determine whether τ ∈ B. Now suppose that i < γ(n) and τ enters B at stage s. Then n = ns and i = |σ| for some σ entering A at stage s. Thus, if Ct ↾ n + 1 = C ↾ n + 1 then B(τ) = Bt(τ). Since n is computable from |τ|, we have B ≤wtt C. The Turing degree of a c.e. set A is wtt-topped if every c.e.
set that is Turing reducible to A is wtt-reducible to A. Clearly the halting problem has wtt-topped degree, but there are incomplete c.e. Turing degrees that are wtt-topped, as shown by Ladner and Sasso [234, 235] and Downey and Jockusch [119]. One consequence of the above result is that if α is a strongly c.e. real whose degree is wtt-topped, then there are presentations of α in all c.e. Turing degrees less than or equal to that of α. It is an interesting fact that if a c.e. degree is wtt-topped then it is either complete or low2 (see Downey and Jockusch [119]). Proposition 5.3.8 does not hold for tt-reducibility. Proposition 5.3.10 (Downey and Hirschfeldt [unpublished]). There exist a strongly c.e. real α and a presentation B of α such that B ≰tt α. Proof. Let T0, T1, . . . be an effective list of all truth tables. We build c.e. sets A and B and let α = 0.A. We need to satisfy the following requirement
for each e (writing A ⊨ T to mean that A satisfies the truth table T). Re : Φe total ⇒ ∃σe (σe ∈ B ⇔ A ⊭ TΦe(σe)(σe)). Let D = {0n1 : n ∈ N}. At stage s, begin by defining σs to be a fresh large string in D. Let e ≤ s be least such that Re has not yet been satisfied and Φe(σe)[s] ↓. (If there is no such e then proceed to the next stage.) Put |σe| into A to obtain As+1. If As+1 ⊭ TΦe(σe)(σe) then put σe into B. Otherwise, put σe0 and σe1 into B. In either case, initialize all Ri with i > e by redefining σi to be fresh large strings in D. All such Ri are now unsatisfied, while Re is satisfied. Clearly A and B are c.e., and B is prefix-free because D is prefix-free. Furthermore, μ(B) = μ(A) = α because we add the same measure to A and B at each stage. Finally, assume by induction that there is a least s such that all Ri for i < e have stopped acting by stage s. Then σe reaches its final value at stage s. If Φe(σe) ↑ then Re is satisfied and never acts after stage s. Otherwise, at the least stage t ≥ s such that Φe(σe)[t] ↓, we ensure that σe ∈ B ⇔ At+1 ⊭ TΦe(σe)(σe). Since all Ri with i > e are initialized at that stage, A ⊨ TΦe(σe)(σe) iff At+1 ⊨ TΦe(σe)(σe), so Re is permanently satisfied and never again acts.
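The measure bookkeeping used in the proofs above (and in Theorem 5.3.9 in particular) relies on the identity that replacing a string σ by 2^(L−|σ|) fresh strings of length L ≥ |σ| preserves total weight. A quick Python check of this identity (the helper names are ours, not from the text):

```python
from itertools import product
from fractions import Fraction

def weight(strings):
    """Weight of a set of binary strings: the sum of 2^-|sigma|."""
    return sum(Fraction(1, 2 ** len(s)) for s in strings)

def transfer(sigma_len, target_len, pool):
    """Mimic the bookkeeping of Theorem 5.3.9: when a string of length
    sigma_len enters A, enumerate 2^(target_len - sigma_len) fresh strings of
    length target_len into B, adding exactly the same weight 2^-sigma_len.
    `pool` is an iterator of unused strings of the target length (a toy
    stand-in for 'fresh strings')."""
    count = 2 ** (target_len - sigma_len)
    return [next(pool) for _ in range(count)]

pool = (''.join(bits) for bits in product('01', repeat=5))
new_B_strings = transfer(sigma_len=2, target_len=5, pool=pool)
assert len(new_B_strings) == 8
assert weight(new_B_strings) == Fraction(1, 4)  # same weight as one length-2 string
```

This is why the construction "never runs out of strings": as long as the total weight stays at most 1, a KC-Theorem-style argument guarantees that enough fresh strings of each requested length remain available.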
5.3.3 Presentations and ideals A class I ⊆ 2ω is a wtt-ideal if it is closed under join and downward closed under wtt-reducibility. For a left-c.e. real α, let I(α) be the collection of all c.e. sets C such that there is a presentation A of α with C ≡wtt A. Theorem 5.3.11 (Downey and LaForte [122]). If α is a left-c.e. real then I(α) is a Σ03 wtt-ideal. Proof. Let A and B be presentations of α. Then C = {0σ : σ ∈ A} ∪ {1σ : σ ∈ B} is a presentation of α, and C ≡wtt A ⊕ B. Together with Theorem 5.3.9, this fact shows that I(α) is a wtt-ideal. The statement “μ(We) = α” is Π02 (“for all ε there is a stage s such that μ(We)[s] and α[s] are closer than ε”). Saying that We is prefix-free is Π01. For a given c.e. set A, the set {We : We ≡wtt A} is Σ03 (see Odifreddi [311, p. 627]; roughly, we have to say “there exists a wtt-reduction such that ∀x ∀y ≤ x ∃s > x such that at stage s the reduction gives the right answers on y”). Thus We ∈ I(α) if and only if there exists a d such that a certain Σ03 statement holds true of Wd, so I(α) is a Σ03 wtt-ideal. To see that this result is optimal, note that if α is computable, then by Theorem 5.3.9, I(α) = {We : We computable}, which is Σ03-complete. I(α) is not always Σ03-complete, however. We have seen that there are α such that I(α) = {We : e ∈ N}, which is trivial (as an index set). The following
result, in the spirit of Rice's Theorem, says that this is in fact the only case in which I(α) is not Σ03-complete. Theorem 5.3.12 (Downey and Terwijn [134]). I(α) is either {We : e ∈ N} or Σ03-complete. Proof sketch. Let α = 0.A be a left-c.e. real. By Proposition 5.3.8 and Theorem 5.3.9, I(α) = {We : e ∈ N} iff A is wtt-complete. Now suppose that A is not wtt-complete. We can show that I(α) is Σ03-complete using methods of Rogers and Kallibekov (see Odifreddi [311, pp. 625–627]). Let Inf = {e : |We| = ∞}. We use the fact that the weak jump {i : Wi ∩ Inf ≠ ∅} of Inf is Σ03-complete. It suffices to build sets Bi such that if Wi ∩ Inf ≠ ∅ then Bi is computable, and otherwise Bi ≰wtt A. Then by Proposition 5.3.8 and Theorem 5.3.9, Wi ∩ Inf ≠ ∅ iff the wtt-degree of Bi contains a presentation of α. Let Γ0, Γ1, . . . be an effective listing of the wtt-functionals. For a fixed i, we have requirements Pe : e ∈ Wi ∧ |We| = ∞ ⇒ ∃ne ∀n ≥ ne (ω[n] ⊆ Bi) and Re : (Wi ∩ Inf = ∅ ∧ ΓeA total) ⇒ ∃n (Bi(n) ≠ ΓeA(n)).
The constructions for different i's run independently. The requirement Pe is handled by waiting until e enters Wi (if ever), and then enumerating ω[n] into Bi for n ≥ ne whenever an element greater than n is found to be in We (where ne is a number determined by the stronger priority requirements). The requirement Re is handled by Sacks' coding strategy, as in Theorem 2.15.1. We maintain a length of agreement function l(e, s) monitoring agreement between Bi and ΓeA, and code ∅′ ↾ l(e, s) into ω[n] for the least n such that ω[n] has not been enumerated into Bi. We ensure that ni > n for weaker priority strategies Pi by redefining these ni if necessary. Then, provided that the stronger priority requirements are finitary, Re is also finitary (and hence satisfied), since otherwise all of ∅′ would be coded into Bi and we would have ∅′ ≤wtt A, contradicting the wtt-incompleteness of A. If Wi contains no code for an infinite c.e. set then all Pe are finitary, and hence every Re succeeds and Bi ≰wtt A. If, on the other hand, e ∈ Wi is least such that We is infinite, then all requirements stronger than Pe are finitary, so for the final value of ne, we have ω[n] ⊆ Bi for n ≥ ne, while for each n < ne, either ω[n] ⊆ Bi or ω[n] ∩ Bi is finite. Thus Bi is computable. We will prove the following converse to the fact that every I(α) is a Σ03 wtt-ideal. Theorem 5.3.13 (Downey and Terwijn [134]). Let I be a Σ03 wtt-ideal of c.e. sets. There is a noncomputable left-c.e. real α such that I(α) = I.
The computable sets form a Σ03 wtt-ideal, so we have the following result. Corollary 5.3.14 (Downey and LaForte [122]). There is a noncomputable left-c.e. real all of whose presentations are computable. The proof of Theorem 5.3.13 will make use of the following lemma, which implies that every Σ03 wtt-ideal of c.e. sets can be written as a collection of uniformly c.e. sets. Lemma 5.3.15 (Yates [409]). Let I ∈ Σ03 and let C = {Wi : i ∈ I} contain all the finite sets. There are uniformly c.e. sets V0, V1, . . . such that C = {Ve : e ∈ N}. Proof. Let R be a computable predicate such that i ∈ I iff ∃e ∀n ∃m R(i, e, n, m). Whenever we find that ∀n < s ∃m R(i, e, n, m) for some i, e, and s, enumerate all elements of Wi[s] into Vi,e. If i ∈ I then there is an e such that Vi,e = Wi. Conversely, for each i and e, either Vi,e = Wi, in which case i must be in I, and so Vi,e ∈ C; or Vi,e is finite, in which case Vi,e ∈ C by hypothesis. Proof of Theorem 5.3.13. We begin by outlining the main ideas of the proof. By Lemma 5.3.15, there are uniformly c.e. sets U0, U1, . . . such that I = {U0, U1, . . .}. Let V0, V1, . . . be an effective listing of the prefix-free c.e. sets of strings. We build α and prefix-free sets of strings A0, A1, . . . to satisfy the requirements Ce : Ae ≡wtt Ue ∧ μ(Ae) = α and Ne : μ(Ve) = α ⇒ Ve ≤wtt ⊕i≤e Ai.
The C-requirements are positive requirements, which will cause us to add certain amounts to α. To make α noncomputable, we can add requirements ensuring that α ≠ 0.We. These requirements can be satisfied in the same way as the C-requirements, so for simplicity we omit them from our presentation. Adding them to the construction is straightforward. We first describe the strategies for meeting the above requirements in isolation, and then explain how to combine them using a tree of strategies. We attempt to ensure that Ue ≤wtt Ae by permitting. Along with Ae we define a function ψe : N → 2<ω. Whenever a number x enters Ue we put (or at least try to put) ψe(x) into Ae. We attempt to ensure that Ae ≤wtt Ue by allowing small strings to enter Ae only for the sake of coding Ue. Assuming that ψe(x) ≥ x (viewing strings as numbers), we will then have Ae ≤wtt Ue with use the identity function.
We build α by adding rational amounts to it, thus determining a sequence of rationals α[0] ≤ α[1] ≤ · · · → α. We satisfy the second part of Ce by ensuring that there are infinitely many stages s with α[s] = μ(Ae)[s]. For Ne, if α[s] and μ(Ve)[s] are close, we try to make Ve computable by restraining α. We monitor how close these two values get by defining a nondecreasing, unbounded sequence of numbers me[s]. Every time we see that |α[s] − μ(Ve)[s]| < 2−me[s], we attempt to keep α from changing on short strings by ensuring that α − α[s] ≤ 2−me[s]. If μ(Ve) = α and we were to succeed completely in these attempts, then Ve would be computable, since to determine whether γ ∈ Ve, we could run the construction until |α[s] − μ(Ve)[s]| < 2−me[s] with me[s] sufficiently larger than |γ|, and then we would have γ ∈ Ve iff γ ∈ Ve[s]. The coding strategy for a requirement Ce can easily live with the action of stronger priority coding strategies by picking different coding locations from theirs, and, of course, the strategies for the N-requirements do not interfere with each other. We describe how the C- and N-strategies can be combined. We first look at how Ne can deal with the strategy for a stronger priority requirement Ci. As described above, when |α[s] − μ(Ve)[s]| < 2−me[s], the strategy for Ne attempts to restrain α and ensure that α − α[s] ≤ 2−me[s]. However, the coding action of Ci may spoil this attempt. Suppose that μ(Ve) = α. Although we can no longer argue that Ve is computable, we can still argue that it is computable in Ai. (We will later show that this computation is in fact a wtt-computation.) To compute whether γ ∈ Ve using Ai, find an s such that μ(Ai) − μ(Ai)[s] ≤ 2−|γ|−1 and |α[s] − μ(Ve)[s]| < 2−me[s] with 2−me[s] < 2−|γ|−2. After this stage, Ne does not change α by more than 2−me[s], and Ci does not change α by more than 2−|γ|−1, so μ(Ve) cannot change by more than 2−me[s]+1 + 2−|γ|−1 < 2−|γ|, and hence γ ∈ Ve iff γ ∈ Ve[s].
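The computation of Ve just described rests on a simple observation about measure: once the remaining growth of μ(Ve) is below 2−|γ|, the string γ can no longer enter Ve, since its entry would add a full 2−|γ| to the weight. A toy Python illustration of that final step (the enumeration stages are invented, and we cheat by passing the limit measure explicitly, whereas in the construction the bound comes from the restraints on α):

```python
from fractions import Fraction

def weight(strings):
    """Weight of a finite prefix-free set of binary strings."""
    return sum(Fraction(1, 2 ** len(s)) for s in strings)

def decide_membership(gamma, stages, final_measure):
    """stages is a list of finite prefix-free sets Ve[0] ⊆ Ve[1] ⊆ ..., with
    weights converging to final_measure. Once
    final_measure - weight(Ve[s]) < 2^-|gamma|, we have gamma in Ve iff
    gamma in Ve[s]: a later entry of gamma would push the weight past the limit."""
    bound = Fraction(1, 2 ** len(gamma))
    for stage in stages:
        if final_measure - weight(stage) < bound:
            return gamma in stage
    raise RuntimeError("approximation never got close enough")

# Toy enumeration of a prefix-free set in three stages.
stages = [['0'], ['0', '10'], ['0', '10', '110']]
mu = weight(stages[-1])  # pretend the enumeration has converged
assert decide_membership('10', stages, mu) is True
assert decide_membership('111', stages, mu) is False
```

Note that longer strings γ need a later (closer) stage before their membership is settled, which is the source of the use bound in the wtt-reduction claimed above.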
We now look at how Ci can deal with a stronger priority Ne. The strategy for Ne can have two outcomes. The infinitary outcome happens when there are infinitely many stages at which μ(Ve) grows closer to α. The finitary outcome happens when, from a certain stage onwards, μ(Ve)[s] is bounded away from α[s]. Suppose that a number enters Ui at stage s, and Ci wants to code this event by enumerating a string δ into Ai, but |α[s] − μ(Ve)[s]| < 2−me[s] < 2−|δ|. Then Ci is not allowed to enumerate a string as short as δ into Ai, since this enumeration would cause α to change by 2−|δ|, which is more than Ne is prepared to allow. To get around this problem, we let Ci announce that it wishes to enumerate δ, without actually doing so. We then increase α slightly, though not enough to injure Ne. If Ve does not respond with a corresponding increase, then Ne is satisfied and has finitary outcome. Otherwise, μ(Ve) eventually grows closer to α, and we can repeat this procedure. If Ve keeps responding to the small changes we make in α, then by repeating this procedure enough times, we are eventually able to create enough space between μ(Ai) and α for δ to enter Ai. Note
that it is important that we do not allow me to grow during this procedure. We will refer to this strategy as the “drip-feed strategy”, since we think of Ci as succeeding by feeding α changes small enough to be allowed by Ne, and doing this often enough to be able to finally make its move. (In Downey [103], the analogy is drawn with a stock market trader who wants to dump a large number of shares without disrupting the market. The trader sells a small number of shares, then waits until the price recovers before doing so again.) The strategy for Ce becomes a little more complicated when it has to deal with the outcomes of more than one N-strategy. Suppose that Ce is below Ni, which is below Nj. Suppose that at some stage s, the strategy for Ce wants to put δ into Ae using the drip-feed strategy described above. Then Ce tries to increase α by 2−mi[s] a total of 2mi[s]−|δ| many times. While we are waiting for Ni to respond to the first such increase, mj might grow, since Nj neither knows nor cares whether Ni is going to respond. When Ni finally does respond, mj may have become so big that Nj does not allow a change of 2−mi[s] to α, thus frustrating Ce's drip-feed strategy. The solution to this problem is to let Ni use a drip-feed strategy of its own to get Nj to allow a change of 2−mi[s] to α. If both Ni and Nj are infinitary, in the end all the changes in α requested by Ce will be allowed. After every successfully completed drip-feed strategy, the N-strategies are allowed to let their m-values grow. This idea works in general for any finite number of N-strategies above Ce, by recursively nesting the drip-feed strategies. To handle the interactions between strategies in general, we use a tree of strategies as in Section 2.14. We use 2<ω as our tree, assigning both Ce and Ne to each string of length e (because only the N-strategies have interesting outcomes).
The outcome 0 corresponds to Ne having infinitary outcome, while the outcome 1 corresponds to its having finitary outcome. Each σ builds its own set Aσ and tries to satisfy C|σ| by building a wtt-reduction ψσ from U|σ| to Aσ. To coordinate the drip-feed strategies, we equip each node σ with a counter cσ and let σ act only at stages at which cσ = 0. The counter cσ indicates how many steps a drip-feed strategy corresponding to some τ ⊇ σ still needs to perform to be successfully completed. At the start of such a strategy aiming to put a string δ into Aτ, the counter cσ is set to 2n−|δ| for some number n determined by the longest ν with ν0 ⪯ τ that will not allow 2−|δ| to be added to α immediately. Every time α is allowed to increase by 2−n, the counter cσ is decreased by 1. Every time the counter reaches 0 we allow σ to act. To enable the actions of the various coding requirements to coexist, we equip each C-strategy at σ with a list Lσ of strings. After a drip-feed strategy is successfully completed, the top element of Lσ is removed. Each string in Lσ has to wait until it is at the top of the list before a drip-feed strategy on its behalf can be started. We also have a list Rσ of lengths of strings that strategies below σ want to put into their corresponding sets.
5.3. Representing left-c.e. reals
In addition to cσ and the lists Lσ and Rσ, each σ has associated with it a length of agreement parameter lσ, a parameter mσ (corresponding to the parameter m_e discussed above), and a restraint parameter rσ. The construction proceeds in stages. When we initialize Nσ, we empty Rσ and undefine all the parameters associated with σ. When we initialize Cσ, we empty Lσ. When we add an element to a list Lσ or Rσ, we place it at the bottom of the list.

Construction.

Stage 0. Let α[0] = 0 and initialize all σ.

Stage s > 0. We define the approximation TPs to the true path at stage s recursively. Initially, allow the strategies associated with σ = λ to act. If the strategies associated with some σ ∈ 2^s act at stage s, then proceed to the next stage at the end of their actions.

Action for Cσ. Let e = |σ|. We first pick suitable coding locations for the elements of Ue in Aσ. For each n ≤ s, if ψσ(n) is not currently defined then define it to be a fresh long string not extending any string currently in Aσ. For each n entering Ue at stage s, proceed as follows. Let ρ be the longest string such that ρ0 ⪯ σ and rρ[s] > |ψσ(n)[s]|. (That is, Nρ is the weakest strategy requiring Cσ to use a drip-feed strategy to put ψσ(n)[s] into Aσ.) Add ψσ(n)[s] to Lσ and add |ψσ(n)[s]| to Rρ. If there is no such ρ then simply put ψσ(n)[s] into Aσ.

If Lσ is currently nonempty, let δ be its top element. Let ρ be the longest string such that ρ0 ⪯ σ and rρ[s] > |δ|. (The existence of such a ρ is guaranteed by the fact that δ is on Lσ.) If |δ| is currently on Rρ then do nothing. Otherwise, δ has been successfully processed by Nρ, and hence, by recursion, by all relevant N-strategies above σ. Put δ into Aσ and remove it from Lσ. In any case, ensure that the approximations to α and μ(Aσ) agree by putting fresh long strings into Aσ (while keeping Aσ prefix-free) or increasing the value of α, if necessary.

Action for Nσ. Let e = |σ|. Let lσ[s] = s if μ(Ve)[s] = α[s], and let

lσ[s] = min{n : |α[s] − μ(Ve)[s]| > 2^{−n}} − 1

otherwise, and let mσ[s] = max{lσ[t] : t < s}. Note that |α[s] − μ(Ve)[s]| ≤ 2^{−lσ[s]}. The stage s is σ-expansionary if lσ[s] > mσ[s]. If s is not σ-expansionary, then initialize all τ-strategies with σ1 <L τ and let the strategies associated with σ1 act.
5. Effective Reals
If s is σ-expansionary and Rσ is nonempty, then proceed as follows. Let n be the number at the top of Rσ. If cσ[s−1] ↑, then let cσ[s] = 2^{−n+rσ[s]}, initialize all τ-strategies with σ0 <L τ, and proceed to the next stage. Now suppose that cσ[s−1] > 0. Then at some previous stage t this counter was set to some number 2^{−n+rσ[t]}, and rσ has not changed since. Let ρ be the longest string such that ρ0 ⪯ σ and rρ[s] > rσ[s]. If there is no such ρ then add 2^{−rσ[s]} to α. If rσ[s] ∈ Rρ[s] then do nothing. (In this case rσ[s] is still waiting for its turn to start a drip-feed strategy.) Otherwise, rσ[s] was removed from Rρ when its counter cρ became 0 at some stage u ≤ s. Let cσ[s] = cσ[s−1] − 1, and put rσ[s] on Rρ. In any case, proceed to the next stage immediately.

End of Construction.

Because the construction is computable, α is a left-c.e. real, and each Aσ is c.e. and prefix-free. Let TP = lim inf_s TPs be the true path of the construction. It is easy to see that TP is infinite, because whenever the construction stops at some σ ∈ 2^{<s} at stage s, the value of cσ decreases. We prove by induction along TP that if σ ≺ TP then, letting e = |σ|, we
have Aσ ≡wtt Ue and μ(Aσ) = α, and if μ(Ve) = α then Ve ≤wtt ⊕_{τ⪯σ} Aτ.

First note that Aσ ≤wtt Ue: A string δ can enter Aσ only after some ψe(n) has been defined with δ = ψe(n) or |δ| > |ψe(n)|. Since |ψe(n)| > n, if Ue does not change below n then no string of length less than n can enter Aσ. Thus Aσ ≤wtt Ue with use the identity function.

Next note that if ρ0 ≺ TP, then every number put on Rρ is eventually removed from Rρ: Since there are infinitely many stages at which ρ0 acts, and ρ0 can act only when Rρ is empty or cρ = 0, infinitely often either Rρ is empty or its top element is removed.

Let s be a stage after which Cσ is never initialized, which must exist since σ ≺ TP. Then ψσ(n) has a final value for all n. Suppose n enters Ue at a stage t > s. Let ρ be the longest string such that ρ0 ⪯ σ and rρ[t] > |ψσ(n)|. If no such ρ exists then ψσ(n) enters Aσ immediately. Otherwise, ψσ(n) is added to Lσ and |ψσ(n)| is added to Rρ. As noted above, |ψσ(n)| is eventually removed from Rρ, so ψσ(n) is eventually removed from Lσ and put into Aσ. Since no other strategy can put ψσ(n) into Aσ for any n, and Cσ will not put ψσ(n) into Aσ if n ∉ Ue, we have n ∈ Ue iff ψσ(n) ∈ Aσ. We also have μ(Aσ) = α, because μ(Aσ)[s] = α[s] at the end of Cσ's action at every stage s at which Cσ acts. Thus Cσ is satisfied.

We finish by verifying that Nσ is satisfied. Suppose that μ(Ve) = α. Let s be a stage after which Nσ is never initialized. Let γ ∈ 2^{<ω}. We compute whether γ ∈ Ve as follows. Let t ≥ s, |γ|+1 be a stage at which σ acts such that Aτ[t] agrees with Aτ on all strings of length less than or equal to
|γ|+1 for all τ ⪯ σ, and rσ[t] = lσ[t] > |γ|+1. There is such a stage because lims lσ[s] = ∞ and σ0 acts infinitely often. By the definition of l we have |α[t] − μ(Ve)[t]| ≤ 2^{−lσ[t]}. Assume for a contradiction that γ enters Ve at a stage u > t. By the choice of t, and the definition of the functions ψτ, the strategies Cτ with τ ⪯ σ cannot add more than 2^{−|γ|−2} to α after stage t, and neither can the strategies Cτ with σ1 ≤L τ, as they are initialized at stage t. The coding strategies Cτ with σ0 ⪯ τ may wish to change α below rσ[t] after stage t, but they have to use the drip-feed strategy to do so, and hence cannot ever increase the difference between α and μ(Ve) by more than 2^{−rσ[t]}, and must wait for μ(Ve) to grow closer to α than 2^{−rσ[t]} before each change to α they make. Thus the enumeration of γ into Ve at stage u ensures that

|α[v] − μ(Ve)[v]| ≥ 2^{−|γ|} − 2^{−|γ|−1} − 2^{−rσ[t]} > 2^{−rσ[t]}

for all v ≥ u, contradicting the assumption that μ(Ve) = α. So γ ∈ Ve iff γ ∈ Ve[t], and hence we can wtt-compute Ve using Aτ for τ ⪯ σ.

The drip-feed strategy of the above proof makes it somewhat difficult to get elements into α, but it is possible to make α high via standard highness requirements. In the next subsection, and later in Section 9.3, we will see that it is possible to say something about the degrees of left-c.e. reals realizing certain ideals.
5.3.4 Promptly simple sets and presentations of left-c.e. reals

Recall from Definition 2.22.4 that a coinfinite c.e. set A is promptly simple if there are a computable function f and an enumeration {As}s∈ω of A such that for every infinite c.e. set We, there are an n and an s with n ∈ (We[s] \ We[s−1]) ∩ A_{f(s)}. (The statement of Definition 2.22.4 is slightly different, but easily shown to be equivalent.) This notion was introduced by Maass [256], and general technical methods for working with it were developed by Ambos-Spies, Jockusch, Shore, and Soare [7], as we have seen in, for example, Theorem 2.22.6.

In this subsection, we prove the following result.

Theorem 5.3.16 (Downey and LaForte [122]). If a left-c.e. real computes a promptly simple set then it has a noncomputable presentation.

Proof. Let D = Γα, where α is left-c.e. and D is promptly simple via the computable function p. By Theorem 5.1.7, there is a computable function α(i, s) such that α(i) = lims α(i, s) for all i, and for all i and s, if α(i, s) = 1 and α(i, s+1) = 0, then there is a j < i such that α(j, s) = 0 and α(j, s+1) = 1. We build a prefix-free set A such that μ(A) = α to satisfy the following requirements, where V0, V1, . . . is an effective listing of all c.e. sets of strings.

Re: |Ve| = ∞ ⇒ A ≠ Ve,
where the Ve are thought of as sets of strings. We build A using the KC Theorem, by enumerating a KC set L and taking A to be the domain of the corresponding prefix-free machine. By the effectiveness of the KC Theorem, we may assume that at each stage s, the approximation As corresponds exactly to Ls . To satisfy our requirements, we need to add strings of relatively short lengths to A at various stages. The strategy for satisfying Re will involve a finite sequence of attempts to enumerate a short string from Ve into A, at least one of which will work if Ve is infinite. Because α computes the promptly simple set D, we can enumerate numbers into c.e. sets we build and use the function p witnessing the prompt simplicity of D to search for a stage at which α changes on some relatively small value (prompted by a change in D), enabling us to enumerate a string from Ve into A. The key fact is that each search can be computably bounded by p, and if Ve is infinite then there must be some search that succeeds, by the definition of prompt simplicity. Each requirement Re has associated with it a c.e. set Ue we build. Let g be as in the Slowdown Lemma 2.3.3 (so Wg(e) = Ue ). Each Re will also have followers that we can enumerate into Ue to attempt to provoke D-changes and consequent α-changes. At each stage s, for the least j such that Rj is unsatisfied and there is no current follower defined for Rj at s, choose a new follower for Rj . At stage s, let e < s be least such that Re has a current follower x for which D(x)[s] = Γα (x)[s] = 0 with use m0 , and A[s] and Ve [s] are equal on all strings of length less than or equal to m0 . (If there is no such e, then simply add requests to L to ensure that the weight of Ls is equal to α[s] and proceed to the next stage.) If α changes below this use, we would like to enumerate a diagonalizing witness of length m0 into A. 
However, a change in α below 2^{−m0} may not allow the enumeration of a string of length m0, because the m0th position of α may be followed by a sequence of 1's. Therefore, wait for a stage t0 > s and an m > m0 to appear such that A[t0] and Ve[t0] are equal on all strings of length less than or equal to m, and there is some j with m0 < j < m and α(j)[t0] = 0. There must be such t0 and m, since otherwise α would be rational, and hence computable. Then enumerate x into Ue = Wg(e). Let t > t0 be least such that x ∈ Wg(e)[t]. Now freeze all action in the construction until stage p(t), and then check whether x ∈ D[p(t)]. If so, then α − α[t] > 2^{−m}, so wait until a stage u ≥ p(t) such that α[u] − α[t] > 2^{−m} and add a string of length less than or equal to m to A (by enumerating an appropriate request into L), thus satisfying Re permanently. Otherwise, let u = p(t), release A from the current attack, and declare the current attempt on Re to have failed. In either case, add requests to L to ensure that the weight of Lu is equal to α[u].

Since D is coinfinite, if Ve is infinite and all our attempts at satisfying Re end in failure, then Ue = Wg(e) is infinite. Since D is promptly simple, for
some attack on Re as described above, there is an x ∈ (Wg(e) [t] \ Wg(e) [t − 1]) ∩ D[p(t)]. By the definition of Wg(e) , the set D must change value on x between stages t0 and p(t). But then some element of Ve [t0 ] is enumerated into A at stage u by the construction, satisfying Re . Since each Re can be satisfied after a finite number of attempts, it is a straightforward induction to show that each associated strategy freezes A only finitely often. Thus all requirements are satisfied, and clearly μ(A) = α as required. As we will see in Theorem 9.3.1, there are other conditions on left-c.e. reals, including ones related to Kolmogorov complexity, that ensure the existence of noncomputable presentations.
5.4 Left-d.c.e. reals

5.4.1 Basics

There are other natural classes of effective reals between the left-c.e. reals and the Δ02 reals. Several researchers have investigated hierarchies in this region. See for example Zheng [418, 419], Rettinger, Zheng, Gengler, and von Braunmühl [331], or Rettinger and Zheng [420]. The following is a particularly interesting example of such a class of reals.

Definition 5.4.1. A real α is left-d.c.e. if there exist left-c.e. reals β and γ such that α = β − γ.

This terminology is a bit misleading, since left-d.c.e. reals are the differences of left-c.e. reals, not the ones whose left cuts are d.c.e. sets. However, a term such as "d.l.c.e. real" seems a little too cumbersome.

If A is a d.c.e. set, then 0.A is a left-d.c.e. real, so by Theorem 5.1.10, there are left-d.c.e. reals that are neither left-c.e. nor right-c.e. The following is an alternate characterization of the left-d.c.e. reals.

Theorem 5.4.2 (Ambos-Spies, Weihrauch, and Zheng [12]). A real α is left-d.c.e. iff there is a computable sequence of rationals q0, q1, . . . → α such that Σn |qn+1 − qn| < ∞.

Proof. If such a sequence exists, then let

β = q0 + Σ{qn+1 − qn : qn+1 − qn ≥ 0} and γ = Σ{qn − qn+1 : qn+1 − qn < 0}.

Since Σn |qn+1 − qn| < ∞, both β and γ are finite, and they are both left-c.e. Clearly, α = β − γ.

For the converse, let β and γ be left-c.e. and let α = β − γ. Let r0 < r1 < · · · → β and s0 < s1 < · · · → γ be computable sequences of rationals.
Let qn = rn − sn. Then q0, q1, . . . → α, and

Σn |qn+1 − qn| = Σn |rn+1 − sn+1 − (rn − sn)| ≤ Σn (rn+1 − rn) + Σn (sn+1 − sn) = β − r0 + γ − s0 < ∞.
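The splitting of a bounded-variation sequence into the two left-c.e. reals β and γ in the proof of Theorem 5.4.2 is effective, and easy to illustrate on a finite initial segment. The following sketch (illustrative only; the helper name is ours) accumulates the positive and negative jumps separately and checks that β_n − γ_n = q_n at every step:

```python
from fractions import Fraction

def split_into_lce_pair(q):
    """Given rationals q0, q1, ..., return nondecreasing partial sums
    beta_n = q0 + (positive jumps so far), gamma_n = (|negative jumps| so far),
    so that beta_n - gamma_n = q_n for every n."""
    beta, gamma = [q[0]], [Fraction(0)]
    for prev, cur in zip(q, q[1:]):
        jump = cur - prev
        beta.append(beta[-1] + max(jump, Fraction(0)))
        gamma.append(gamma[-1] + max(-jump, Fraction(0)))
    return beta, gamma

q = [Fraction(x) for x in ("0", "1/2", "1/4", "3/8", "5/16")]
beta, gamma = split_into_lce_pair(q)
assert all(b - g == x for b, g, x in zip(beta, gamma, q))
assert beta == sorted(beta) and gamma == sorted(gamma)  # both approximate from below
```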
It is easy to see that if A is an n-c.e. set for some n, then 0.A is a left-d.c.e. real. We can push this fact one level further using Theorem 5.4.2. We begin with a lemma.

Lemma 5.4.3 (Downey, Wu, and Zheng [136]). Let A be ω-c.e. Let f and g be computable functions such that A(n) = lims f(n, s) and |{s : f(n, s+1) ≠ f(n, s)}| < g(n) for all n. If Σn g(n)2^{−n} < ∞, then 0.A is left-d.c.e.

Proof. We may assume that exactly one element ns enters or leaves A at each stage s. Then {0.As}s∈ω is a computable sequence of rationals converging to 0.A, and

Σs |0.As − 0.As+1| = Σs 2^{−ns} ≤ Σn g(n)2^{−n} < ∞.
By Theorem 5.4.2, 0.A is left-d.c.e.

Theorem 5.4.4 (Downey, Wu, and Zheng [136]). Every ω-c.e. degree contains a left-d.c.e. real.

Proof. Let A be ω-c.e. By Proposition 2.7.3, there are an ω-c.e. set B ≡T A and a computable function f satisfying conditions (i)–(iii) in that proposition. Since Σn n·2^{−n} < ∞, it follows from Lemma 5.4.3 that 0.B is a left-d.c.e. real.

The converse to this result does not hold. Indeed, Zheng [417] showed that there is a left-d.c.e. real whose degree is not ω-c.e.3 We have seen in Theorem 5.1.3 that every Δ02 real is the limit of a computable sequence of rationals. It might seem reasonable to conjecture that perhaps every Δ02 degree contains a left-d.c.e. real. However, we have the following result, whose proof we omit due to space considerations.

Theorem 5.4.5 (Downey, Wu, and Zheng [136]). There are Δ02 degrees containing no left-d.c.e. reals.

3 Zheng's proof can easily be modified to show that if α < ω² then there is a left-d.c.e. real whose degree is not α-c.e.
5.4.2 The field of left-d.c.e. reals

One of the reasons to consider left-d.c.e. reals is that they form the smallest field containing the left-c.e. reals.

Theorem 5.4.6 (Ambos-Spies, Weihrauch, and Zheng [12]). The left-d.c.e. reals form a field.

Proof. Rearranging terms shows closure under addition, subtraction, and multiplication (e.g., (α − β)(γ − δ) = (αγ + βδ) − (βγ + αδ), and we have already mentioned that the class of left-c.e. reals is closed under addition and multiplication). We show closure under division. Let α and β ≠ 0 be left-d.c.e. reals. Let q0, q1, . . . → α and r0, r1, . . . → β be computable sequences of rationals such that Σn |qn+1 − qn| < ∞ and Σn |rn+1 − rn| < ∞. We may assume that the rn are bounded away from 0. Let M be larger than both the above sums, and larger than |qn| and 1/|rn| for all n. Let sn = qn/rn. Then s0, s1, . . . → α/β is a computable sequence of rationals, so to show that α/β is left-d.c.e., it is enough to show that Σn |sn+1 − sn| < ∞. But this sum is equal to

Σn |qn+1/rn+1 − qn/rn| = Σn |(rn qn+1 − rn+1 qn)/(rn rn+1)| = Σn |(rn qn+1 − rn qn + rn qn − rn+1 qn)/(rn rn+1)| ≤ Σn |(qn+1 − qn)/rn+1| + Σn |qn (rn − rn+1)/(rn rn+1)| < M Σn |qn+1 − qn| + M³ Σn |rn − rn+1| < M² + M⁴.
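The middle step of this computation — inserting and removing rn qn in the numerator — is a purely algebraic identity, which can be sanity-checked in exact rational arithmetic. A minimal sketch (not part of the proof; the sample values are arbitrary):

```python
from fractions import Fraction
import random

random.seed(0)
for _ in range(100):
    # Random rationals, with the r's kept nonzero (bounded away from 0).
    q0, q1 = (Fraction(random.randint(-9, 9), random.randint(1, 9)) for _ in range(2))
    r0, r1 = (Fraction(random.randint(1, 9), random.randint(1, 9)) for _ in range(2))
    lhs = q1 / r1 - q0 / r0
    # The telescoped form used to bound |s_{n+1} - s_n| in the proof:
    rhs = (q1 - q0) / r1 + q0 * (r0 - r1) / (r0 * r1)
    assert lhs == rhs
```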
Ng [295] and Raichev [319, 320] have shown that this field behaves well analytically. Recall that a field F is real closed if every polynomial of odd degree with coefficients in F has a root in F, and for each x ∈ F, there is a y ∈ F such that x = y² or x = −y².

Theorem 5.4.7 (Ng [295], Raichev [319, 320]). The field of left-d.c.e. reals is real closed.

Proof. We give the proof of this result from Ng [295]. Let R2 denote the left-d.c.e. reals. We actually show the following stronger result (see for instance [336] for more on basic concepts of real analysis).

Proposition 5.4.8. Let f be a real function that is analytic at some u0 ∈ R2. Suppose that its Taylor series converges in some open interval E centered at u0, and that there are uniformly computable sequences of
rationals a0,n, a1,n, . . . → f^(n)(u0) such that supn Σk |ak+1,n − ak,n| < ∞. Then every root of f(x) in E is also in R2.
The theorem follows from this proposition: Any polynomial f(x) ∈ R2[x] satisfies the hypotheses of the proposition, since f^(m)(0) = 0 for all sufficiently large m, so every such odd-degree polynomial has a root in R2, and √γ ∈ R2 for any nonnegative γ ∈ R2.

From now on, we work with an f satisfying the hypotheses of Proposition 5.4.8. We assume without loss of generality that u0 = 0, since we can work with the function f(x + u0) instead of f. We also fix a root r ∈ E of f(x). Let k be the multiplicity of r; that is, f(x) = (x − r)^k g(x) where g(r) ≠ 0. If k > 1 then instead of working with f(x), we can work with f^(k−1)(x), since f^(k−1)(r) = 0 as well, and f^(k−1)(x) satisfies the hypotheses of Proposition 5.4.8. So we assume that k = 1, or, in other words, that r is a simple root.

Define fk ∈ Q[x] by fk(x) = ak,0 + ak,1(x − u0) + · · · + ak,k (x − u0)^k/k!.

Lemma 5.4.9. There are rationals α, β, M, m, m′ such that (i) [α, β] is an interval in E containing r and (ii) for all x ∈ [α, β], we have |f″(x)| < M and 0 < m < |f′(x)| < m′.

Proof. By our assumption that r is a simple root of f(x), we have f′(r) ≠ 0, and hence there is some interval [r − δ, r + δ] with δ ∈ Q such that f′(x) ≠ 0 for all x ∈ [r − δ, r + δ]. Let α = r − δ and β = r + δ. Then we can clearly choose m, m′, and M to satisfy the conditions of the lemma.

From now on, we fix α, β, M, m, m′ as in the lemma, and we also fix δ such that [r − δ, r + δ] ⊆ [α, β].

Lemma 5.4.10. Let x0, x1, . . . be a sequence of points in a closed bounded interval I ⊆ E. Then Σk |fk+1(xk) − fk(xk)| < ∞.

Proof. Let T = supn Σk |ak+1,n − ak,n| and U = supx∈I |x|. Then

Σk |fk+1(xk) − fk(xk)| ≤ Σk (|ak+1,0 − ak,0| + · · · + |ak+1,k − ak,k| |xk|^k/k! + |ak+1,k+1| |xk|^{k+1}/(k+1)!) ≤ Σk Σn |ak+1,n − ak,n| U^n/n! + Σk |ak,k| U^k/k!.

By Fubini's Theorem, the latter is less than or equal to

T Σn U^n/n! + Σk |ak,k| U^k/k!.
Since |ak,k − f^(k)(0)| ≤ Σ_{j≥k} |aj+1,k − aj,k| ≤ T for all k, the latter is less than or equal to

2T e^U + Σk |f^(k)(0)| U^k/k!.

Since I is closed, it must contain U, which establishes the lemma because the power series converges absolutely on I.

It follows that Σk |f^(n)_{k+1}(xk) − f^(n)_k(xk)| < ∞ for all n.

Lemma 5.4.11. limk fk = f (and, similarly, limk f′k = f′ and limk f″k = f″) uniformly on [α, β].

Proof. Let I ⊆ E be a closed bounded interval containing [α, β] and 0. Since supx∈I |fk+1(x) − fk(x)| is attainable for each k, we can let x0, x1, . . . ∈ I be such that |fk+1(xk) − fk(xk)| = maxx∈I |fk+1(x) − fk(x)|. Then it follows from Lemma 5.4.10 that given any ε > 0, there is an Nε such that whenever u > v ≥ Nε, we have |fu(x) − fv(x)| ≤ Σ_{k≥Nε} |fk+1(xk) − fk(xk)| < ε for all x ∈ I.

It remains to show that fk(x) → f(x) for every x ∈ E. Let Tn = Σj |aj+1,n − aj,n|. Then for all k,

|Σ_{n≤k+1} f^(n)(0) x^n/n! − fk+1(x)| ≤ Σ_{n≤k+1} |f^(n)(0) − ak+1,n| |x|^n/n! ≤ Σ_{n≤k+1} (Tn − Σ_{j≤k} |aj+1,n − aj,n|) |x|^n/n! = Σ_{n≤k+1} Tn |x|^n/n! − Σ_{n≤k+1} Σ_{j≤k} |aj+1,n − aj,n| |x|^n/n!.

Now, since Σn Tn |x|^n/n! ≤ supm Tm · Σn |x|^n/n! < ∞, it follows by Fubini's Theorem that limk Σ_{n≤k+1} Σ_{j≤k} |aj+1,n − aj,n| |x|^n/n! = Σn Tn |x|^n/n!. Since f is represented by its Taylor series in E, we have limk fk(x) = f(x).
Lemma 5.4.12. There is an s0 such that for all s ≥ s0,

1. [α, β] contains a simple root rs of fs,
2. |f″s(x)| < M, and 0 < m < |f′s(x)| < m′ for all x ∈ [α, β].

Proof. We may assume that f(α) < 0 < f(β). Since lims fs(α) = f(α) and lims fs(β) = f(β), there is an s0 such that for all s ≥ s0, we have fs(α) < 0 < fs(β), and hence there is a (simple) root rs of fs in [α, β]. Let ε > 0 be such that minx∈[α,β] |f′(x)| − ε > m. Since lims f′s = f′ uniformly on [α, β], we may assume that s0 is large enough so that |f′s(x) − f′(x)| < ε for all s ≥ s0 and x ∈ [α, β]. Then |f′(x)| − |f′s(x)| < ε. The cases for m′ and M are similar.
Fix such an s0 and rs for s ≥ s0. Note that for each s ≥ s0, we have f′s(x) > 0 inside [α, β], and hence rs is the only root of fs in that interval.

Lemma 5.4.13. lims rs = r.

Proof. Let ε be such that 0 < ε < δ. By pointwise convergence, we can choose Nε such that for all s ≥ Nε, we have fs(r − ε)fs(r + ε) < 0, and hence there is a root of fs in [r − ε, r + ε] ⊆ [α, β]. Since rs is the only root of fs in [α, β], we have |rs − r| < ε.

Let K = M/(2m) and η < min{1/(2K), δ/2}. For all s ≥ s0, we also want |rs − r| < η/2, so adjust s0 to be large enough to satisfy this requirement. For simplicity, we will assume from now on that the sequence {fs}s∈ω starts with the index s0. Let y0 ∈ Q be such that |y0 − r| < η/2. For each s, define a computable sequence of rationals by xs,0 = y0 and xs,n+1 = xs,n − fs(xs,n)/f′s(xs,n).

Lemma 5.4.14 (Newton's Method of Locating Roots). For each s, if x′ ∈ [α, β] and x″ = x′ − fs(x′)/f′s(x′), then |x″ − rs| ≤ K|x′ − rs|².

Proof. By Taylor expansion of fs(x) about the point x′, we have

−fs(x′) = f′s(x′)(rs − x′) + f″s(c)(rs − x′)²/2

for some c between rs and x′. Hence,

x″ = x′ − fs(x′)/f′s(x′) = rs + f″s(c)(rs − x′)²/(2f′s(x′)).

Since c ∈ [α, β],

|x″ − rs| = |f″s(c)(rs − x′)²/(2f′s(x′))| ≤ K|x′ − rs|².
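Lemma 5.4.14 is the usual quadratic-convergence estimate for Newton's method, and the point of using it here is that the iteration x ↦ x − fs(x)/f′s(x) can be carried out in exact rational arithmetic, so the xs,n form computable sequences of rationals. A brief sketch (with f(x) = x² − 2 standing in for fs, a choice of ours for illustration, so the root is √2):

```python
from fractions import Fraction

def newton_steps(y0: Fraction, n: int):
    """Iterate x -> x - f(x)/f'(x) for f(x) = x^2 - 2, in exact rationals."""
    f = lambda x: x * x - 2
    df = lambda x: 2 * x
    xs = [y0]
    for _ in range(n):
        x = xs[-1]
        xs.append(x - f(x) / df(x))
    return xs

xs = newton_steps(Fraction(3, 2), 4)
# Quadratic convergence: the residual |x^2 - 2| shrinks faster than squaring.
errors = [abs(x * x - 2) for x in xs]
assert all(errors[i + 1] < errors[i] ** 2 for i in range(len(errors) - 1))
```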
Lemma 5.4.15. For all s and n, we have |xs,n+1 − xs,n| ≤ ((Kη)^{n+1} + (Kη)^n)|y0 − rs|.

Proof. Fix an s. We first prove by induction that |xs,n − rs| < η for all n. First note that |xs,0 − rs| ≤ |xs,0 − r| + |rs − r| < η. Suppose that |xs,n − rs| < η. Then |xs,n − r| ≤ |xs,n − rs| + |rs − r| < 2η < δ, so xs,n ∈ [α, β]. Thus, by Lemma 5.4.14, |xs,n+1 − rs| ≤ Kη² < η.

Next, we prove that |xs,n − rs| ≤ (Kη)^n |y0 − rs|, again by induction. We have |xs,0 − rs| = |y0 − rs|. Now assume that the case for n holds. By Lemma 5.4.14, |xs,n+1 − rs| ≤ K|xs,n − rs|² < (Kη)|xs,n − rs| ≤ (Kη)^{n+1} |y0 − rs|. Now the lemma follows by the triangle inequality.

From now on we will assume that f and the fs are all increasing on [α, β]. For each s, we can construct a partition I0, I1, . . . of [α, β] each element of which has width 1/(2^{s+1} m′). Then apply the sign test to fs using I0, I1, . . . and
let r′s be the right endpoint of the partition element that contains the root rs.

Lemma 5.4.16. For each s, we can find an Ns such that |xs+1,Ns − xs,Ns| < (1/m)|fs(r′s+1)|.

Proof. From the proof of Lemma 5.4.15, it is clear that limn xs,n = rs. Thus, for each s, we have limn |xs+1,n − xs,n| = |rs+1 − rs|. However, |rs+1 − rs| = |fs(rs+1)|/|f′s(c)| for some rs ≤ c ≤ rs+1, so |rs+1 − rs| ≤ |fs(rs+1)|/m < |fs(r′s+1)|/m.

Define a computable sequence y0, y1, . . . as follows.

1. Let y0 be as above, and set s = 0 and n = 1.
2. If |xs+1,n−s − xs,n−s| ≥ (1/m)|fs(r′s+1)| then let yn = xs,n−s. In this case, increment n by 1 and repeat this step.
3. If |xs+1,n−s − xs,n−s| < (1/m)|fs(r′s+1)| (which must eventually happen, by Lemma 5.4.16), then let yn = xs,n−s, let yn+1 = xs+1,n−s, and let yn+2 = xs+1,n−s+1. Increment n by 3, increment s by 1, and return to step 2.
The idea of this definition is that we wait until |xs+1,n−s − xs,n−s| is small enough before we jump to the next level.

Lemma 5.4.17. limn yn = r.

Proof. For any s and n, there is a c such that yc = xs′,n′ with s′ > s and n′ > n. Let ε > 0. Then there is a t0 such that |ra − r| < ε/4 for all a ≥ t0. Let u0 = log(ε/η)/log(Kη). Let c be obtained from t0 and u0 as described above. Then, for all n ≥ c, we have yn = xa,b for some a > t0 and b > u0. So

|yn − r| ≤ |xa,b − ra| + |ra − r| < (Kη)^b |y0 − ra| + ε/4 ≤ (Kη)^b |y0 − r| + (Kη)^b (ε/4) + ε/4 < ε/2 + ε/2 = ε.

Lemma 5.4.18. r ∈ R2.
Proof. It remains to show that Σi |yi+1 − yi| < ∞. By Lemma 5.4.15 and the construction of the yi, we have

Σi |yi+1 − yi| ≤ Σi ((Kη)^{i+1} + (Kη)^i)|y0 − rd(i)| + Σs (1/m)|fs(r′s+1)|

for a nondecreasing sequence d0, d1, . . . . Since |rs+1 − r′s+1| < 1/(2^{s+1} m′), by the Mean Value Theorem, (1/m)|fs+1(r′s+1)| ≤ (1/m) · m′ · 1/(2^{s+1} m′) < 1/2^s. By Lemma 5.4.10,

Σi |yi+1 − yi| ≤ Σi ((Kη)^{i+1} + (Kη)^i) · 2δ + (1/m) Σs |fs+1(r′s+1) − fs(r′s+1)| + Σs 1/2^s < ∞.
This lemma establishes Proposition 5.4.8, and hence proves the theorem.

Similar methods can also be used to show that the computable reals form a real closed field (see for instance [318]), and hence, by relativization, that so do the Δ02 reals, a fact first explicitly noted and proved directly by Raichev [319, 320].
Part II
Notions of Randomness
6 Martin-Löf Randomness
In this chapter, we will introduce three cornerstone approaches to the definition of algorithmic randomness for infinite sequences.

(i) The computational paradigm: Random sequences are those whose initial segments are all hard to describe, or, equivalently, hard to compress. This approach is probably the easiest one to understand in terms of the previous sections.

(ii) The measure-theoretic paradigm: Random sequences are those with no "effectively rare" properties. If the class of sequences satisfying a given property is an effectively null set, then a random sequence should not have this property. This approach is the same as the stochastic paradigm: a random sequence should pass all effective statistical tests.

(iii) The unpredictability paradigm: This approach stems from what is probably the most intuitive conception of randomness, namely that one should not be able to predict the next bit of a random sequence, even if one knows all preceding bits, in the same way that a coin toss is unpredictable even given the results of previous coin tosses.

We will see that all three approaches can be used to define the same notion of randomness, which is called Martin-Löf randomness or 1-randomness. This notion was the first successful attempt to capture the idea of a random infinite sequence, and is still the best known and most studied of the various definitions proposed to date, in great part because it enables us to develop a rich and appealing mathematical theory.

R.G. Downey and D. Hirschfeldt, Algorithmic Randomness and Complexity, Theory and Applications of Computability, DOI 10.1007/978-0-387-68441-3_6, © Springer Science+Business Media, LLC 2010
the development of these approaches to randomness, and the philosophical insights behind them, see van Lambalgen [397].
6.1 The computational paradigm

In a way, the definition of algorithmic randomness is more satisfying for infinite sequences than it is for strings. One of the main reasons for this phenomenon, as we will see, is that algorithmic randomness of sequences is absolute, in that no matter which universal machine one chooses, the class of random sequences remains the same.

A natural first attempt at defining a notion of randomness for a set A would be to say that A is random if C(A ↾ n) ≥ n − O(1). Unfortunately, no set satisfies this condition. This is a fundamental observation of Martin-Löf that we have already seen in Theorem 3.1.4. As we will see in Section 6.7, it is possible to give a satisfactory definition of randomness for infinite sequences using plain complexity, but this is a recent development. We can vindicate the intuition leading to the incorrect attempt at a definition of randomness for infinite sequences via plain complexity by using prefix-free complexity instead.

Definition 6.1.1 (Levin [243], Chaitin [58]). A set A is 1-random if K(A ↾ n) ≥ n − O(1).1

We will call a degree (Turing or otherwise) 1-random if it contains a 1-random set.2 The reason for the term "1-random" will become clear once we introduce the notion of n-randomness for arbitrary n in Section 6.8. Notice that, in the nomenclature of Chapter 3, a sequence is 1-random iff all its initial segments are weakly K-random. One cannot replace weak K-randomness by strong K-randomness, as we saw in Theorem 3.11.1.

Recall Chaitin's halting probability Ω, which we introduced in Definition 3.13.6:

Ω = μ(dom U) = Σ_{σ∈dom U} 2^{−|σ|}.
As we will see, Ω is 1-random. It is in fact the best known example of a 1-random real. As we discussed following Definition 3.13.6, while the

1 As we will see in Section 6.3.2, Levin [242] and Schnorr [350] used monotone complexity and process complexity, respectively, to characterize the class of 1-random sets in much the same way, and the essential idea of prefix-free complexity is implicit in that work. The history of this subject is quite involved; see Li and Vitányi [248] and van Lambalgen [397] for detailed historical remarks.
2 We will adopt this terminology for other notions of randomness without further comment. That is, when we define a notion of randomness R for sets, we will say that a degree is R-random if it contains an R-random set.
value of Ω depends on the choice of U, we will see in Section 9.2 that all versions of Ω are quite closely related, and it makes sense to ignore the machine-dependence of Ω in most circumstances. Recall also that Ωs = Σ_{U(σ)[s]↓} 2^{−|σ|}. The rationals Ωs approximate Ω from below, so Ω is a left-c.e. real. One way to look at Ω is as a highly compressed version of ∅′, as the following results show.

Proposition 6.1.2 (Calude and Nies [51]). Ω ≡wtt ∅′.

Proof. Since Ω is a left-c.e. real, it is computable in ∅′. For the other direction, let M be the prefix-free machine defined by letting M(0^e 1) ↓ = λ whenever Φe(e) ↓. Since U is universal, there is a ρ such that U(ρ0^e 1) ↓ iff e ∈ ∅′. To determine whether e ∈ ∅′ given Ω, wait for a stage s such that Ω − Ωs < 2^{−(|ρ|+e+1)}. If U(ρ0^e 1)[s] ↓ then e ∈ ∅′. Otherwise, U(ρ0^e 1) ↑, so e ∉ ∅′.

Theorem 6.1.3 (Chaitin [58]). Ω is 1-random.

Proof. By Proposition 6.1.2, Ω is not rational, so for each n there is an s with Ωs ↾ n = Ω ↾ n. We build a prefix-free machine M. By the recursion theorem, we may assume we know its coding constant c in U. Whenever at a stage s we have U(τ)[s] = Ωs ↾ n for some τ such that |τ| < n − c (which means that K(Ωs ↾ n) < n − c), we choose a string μ ∉ rng U[s] and declare M(τ) = μ. Since M is coded in U with coding constant c, there must be a ν such that |ν| ≤ |τ| + c < n and U(ν) = M(τ) = μ. Since μ ∉ rng U[s], it follows that ν ∉ dom U[s], so Ω − Ωs ≥ 2^{−|ν|} > 2^{−n}, and hence Ω ↾ n ≠ Ωs ↾ n. This procedure ensures that if |τ| < n − c then U(τ) ≠ Ω ↾ n, whence K(Ω ↾ n) ≥ n − c for all n.

It is important to note that Ω is a somewhat misleading example of 1-randomness, as it is rather computationally powerful. We will see in Section 9.2 that all 1-random left-c.e. reals behave like Ω (indeed they are all versions of Ω); in particular, they are all Turing (and even weak truth table) complete.
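The way Ω is used in the proof of Proposition 6.1.2 — wait until the approximation is within a fixed tolerance of Ω, then read off whether a program has halted — can be illustrated with a toy finite "machine". Everything below is hypothetical illustration data (a handful of prefix-free programs with made-up halting stages), but the decision rule is the one from the proof: once Ω − Ωs < 2^{−|τ|}, the program τ cannot halt after stage s, since its halting would add 2^{−|τ|} to Ω.

```python
from fractions import Fraction

# Toy prefix-free machine: program -> stage at which it halts (invented data).
HALT_STAGE = {"0": 3, "10": 1, "1101": 5}
OMEGA = sum(Fraction(1, 2 ** len(p)) for p in HALT_STAGE)

def omega_at(s: int) -> Fraction:
    """Left-c.e. approximation: mass of the programs seen halting by stage s."""
    return sum(Fraction(1, 2 ** len(p)) for p, t in HALT_STAGE.items() if t <= s)

def halts(tau: str) -> bool:
    """Decide halting from OMEGA: find s with OMEGA - omega_at(s) < 2^-|tau|;
    then tau halts iff it has already halted by stage s."""
    s = 0
    while OMEGA - omega_at(s) >= Fraction(1, 2 ** len(tau)):
        s += 1
    return tau in HALT_STAGE and HALT_STAGE[tau] <= s

assert halts("10") and halts("1101")
assert not halts("111")   # a length-3 program that never halts
```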
We will also see, in Section 8.3, that 1-random sets can be arbitrarily powerful, in the sense that for every set A there is a 1-random set R ≥T A. However, heuristically speaking, randomness should be antithetical to computational power, inasmuch as classes of sets with a certain given amount of computational power (such as computing ∅′, or being a PA degree) tend to have measure 0. We will see below that as we move from 1-randomness to more powerful notions of algorithmic randomness, we do indeed tend to lose computational power. But even within the class of 1-random sets, there is a qualitative distinction between those 1-random sets that compute ∅′ and those that do not.3 As we will see below, the latter exhibit behavior much closer to what we should expect of random sets

3 We will see in Section 7.7 that there is a test set characterization of the 1-random sets that do not compute ∅′, in the spirit of the notion of Martin-Löf test discussed in
(for instance, they cannot have PA degree; see Theorem 8.8.4).⁴ Thus one should keep in mind that, while Ω is certainly the most celebrated example of 1-randomness, it is not "typically 1-random".

Chaitin [63] also defined another left-c.e. real, Ω̂ = ∑_{σ∈2^{<ω}} 2^{-K(σ)}. It is not hard to modify the above proof to show that Ω̂ is 1-random. Another example of a 1-random left-c.e. real is the output probability Q(σ) from Definition 3.9.3. The easiest way to see that Q(σ) is 1-random for any σ is using the methods of Chapter 9.⁵ Thus we have concrete examples of 1-random sets. As we will see in the following section, almost every set is 1-random (as we would expect).

It was an interesting question of Solovay whether lim_n K(Ω ↾ n) − n = ∞. This question was eventually solved in the affirmative by Chaitin [63]. More recently, Miller and Yu [280] gave a powerful characterization of randomness that has this result as a corollary. Namely, they proved that A is 1-random iff ∑_{n∈N} 2^{n−K(A ↾ n)} < ∞. We will prove this result in Theorem 6.6.1 below. The following result is also worth noting.

Proposition 6.1.4 (Fortnow [unpublished], Nies, Stephan, and Terwijn [308]). If there is an infinite computable set S such that K(A ↾ n) ≥ n − O(1) for all n ∈ S, then A is 1-random.

Proof. Suppose that A is not 1-random and let S = {s_0 < s_1 < ···} be computable. If σ = τρ and |σ| = s_{|τ|}, then we can give a prefix-free description of σ by giving a prefix-free description of τ and then giving ρ itself, so K(σ) ≤ K(τ) + |ρ| + O(1). So if K(A ↾ n) ≤ n − d, then K(A ↾ s_n) ≤ n − d + (s_n − n) + O(1) = s_n − d + O(1). Since for each d there is an n with K(A ↾ n) ≤ n − d, the same is true if we restrict ourselves to n ∈ S.
6.2 The measure-theoretic paradigm

The main idea behind the measure-theoretic, or stochastic, paradigm is that a random sequence should have no rare properties. In a remarkable early paper, von Mises [402] attempted to formalize this notion. He pointed

[footnote 3, continued] the next section, and hence we can think of this condition as a notion of randomness in its own right, rather than just a mix of randomness and computability conditions.
⁴ A heuristic explanation for this phenomenon is that there are two ways to pass an ignorance test. One is to be genuinely ignorant, and the other is to be knowledgeable enough to know what answers an ignorant person would give. 1-random sets above ∅′ have this kind of knowledge, in that they can compute K, but other 1-random sets do not, and thus must rely on their own lack of wits.
⁵ It is also interesting to consider output probabilities for sets of strings, where the output probability of A ⊆ 2^{<ω} is the probability that U outputs an element of A, which is equal to ∑_{σ∈A} Q(σ). This notion has been investigated by Becher and Grigorieff [32] and Becher, Figueira, Grigorieff, and Miller [31].
6. Martin-Löf Randomness
out that if we flip a fair coin, then we should surely expect to see as many heads as tails in the limit. Thus a random binary sequence a_0 a_1 ... should obey the law of large numbers by having the property that lim_n (1/n) ∑_{i<n} a_i = 1/2.
⁶ The selection functions considered in these notions are adaptive, in the sense that, before deciding whether to bet on the nth bit of a sequence A, they are allowed to know A ↾ n, which is of course reasonable for a notion that attempts to model gambling on a sequence of observable events. Thus these functions are best thought of as functions from strings to {yes, no}. See Sections 6.5 and 7.4.1 for more details.
changes with time, if we can avoid it.

Martin-Löf's fundamental idea in [259] was to define an abstract notion of a performable statistical test for randomness, and require that a random sequence pass all such tests. He did so by effectivizing the notion of a set of measure 0. To motivate Martin-Löf's definition via a simple example, consider the class C of sequences A such that A(2^k) = 0 for all k. Clearly, such sequences are not random. If we are presented with a sequence A that might be in C, we can perform a test for membership of A in C to within a confidence level of 2^{-n} by considering all k < n. If for all such k we have A(2^k) = 0, then we have some reason to believe that A ∈ C. (Of course, if not, then we can be certain that A ∉ C.) We might be wrong about our belief that A ∈ C, but the set of all A that our test indicates are in C has measure 2^{-n}, so we can have greater and greater confidence that A ∈ C as we perform finer and finer levels of this test.

Martin-Löf's idea was to abstract away from the particulars of such a test, or more interesting ones such as tests for the law of large numbers, and think of a test as a collection of reasonably simple subsets of Cantor space of smaller and smaller measure, zeroing in on a null class of sequences that have some rare property, and hence should not be considered random.

Definition 6.2.1 (Martin-Löf [259]). (i) A Martin-Löf test is a sequence {U_n}_{n∈ω} of uniformly Σ⁰₁ classes such that μ(U_n) ≤ 2^{-n} for all n.

(ii) A class C ⊆ 2^ω is Martin-Löf null if there is a Martin-Löf test {U_n}_{n∈ω} such that C ⊆ ∩_n U_n.

(iii) A set A ∈ 2^ω is Martin-Löf random if {A} is not Martin-Löf null.⁷

In the above definition, we did not require that U_0 ⊇ U_1 ⊇ ···, but we could have: Given a Martin-Löf test {U_n}_{n∈ω}, let V_n = ∪_{m>n} U_m. Then {V_n}_{n∈ω} is a Martin-Löf test, V_0 ⊇ V_1 ⊇ ···, and ∩_n U_n = ∩_n V_n.
We could also have defined a Martin-Löf test to be a sequence {U_n}_{n∈ω} of uniformly Σ⁰₁ classes for which there is a computable function f : N → Q such that lim_n f(n) = 0 and μ(U_n) ≤ f(n) for all n: Given such a test, let V_n = U_m for the least m such that f(m) ≤ 2^{-n}. Then {V_n}_{n∈ω} is a Martin-Löf test and ∩_n U_n = ∩_n V_n.⁸

⁷ Demuth [93] independently developed a test-based definition of randomness equivalent to Martin-Löf's. He was working in the style of the Russian school of constructive mathematics, and saw the potential importance of random reals in computable analysis. A brief but highly informative discussion of this work may be found in Kučera and Slaman [221, Remark 3.5].
⁸ A natural variation on this definition is to require only that lim_n μ(U_n) = 0, without requiring a computable rate of convergence. Such generalized Martin-Löf tests are perhaps too weak to correspond to a reasonable notion of performable statistical test. That is, with a Martin-Löf test, we can compute what level of the test we need to look at to approximate the corresponding null set to within a given ε, while with a general-
We can now restate the above example: The class C of all A such that A(2^k) = 0 for all k is Martin-Löf null, and hence no such set is Martin-Löf random. Much more interestingly, the class of sets that do not satisfy the law of large numbers is Martin-Löf null, and similarly, for any computable selection function f, the class of sets that do not satisfy the law of large numbers for the subsequence of bits picked out by f is also Martin-Löf null. Thus, if A is Martin-Löf random then it is Church stochastic. The converse is not true, however, because we can build a Martin-Löf test containing Ville's sequence mentioned above. Further discussion and proofs (such as the proof that Martin-Löf random sets obey the law of large numbers) can be found in Section 7.4, where we examine effective notions of stochasticity in detail.

Martin-Löf randomness is equivalent to 1-randomness, which helps justify the notion of 1-randomness as a reasonable notion of algorithmic randomness. To show this fact, we begin with a lemma. We will use only its first part here, but the second part will be useful below.

Lemma 6.2.2. Let M be a prefix-free machine, let k ∈ N, and let S = {σ : K_M(σ) ≤ |σ| − k}. Then μ(S) ≤ 2^{-k} μ(dom M). Furthermore, μ(S) is computable in μ(dom M) via a procedure that is uniform in k.

Proof. For each σ ∈ S there is a string τ_σ such that |τ_σ| ≤ |σ| − k and M(τ_σ)↓ = σ. Thus

μ(S) ≤ ∑_{σ∈S} 2^{-|σ|} ≤ ∑_{σ∈S} 2^{-(|τ_σ|+k)} = 2^{-k} ∑_{σ∈S} 2^{-|τ_σ|} ≤ 2^{-k} ∑_{τ∈dom M} 2^{-|τ|} = 2^{-k} μ(dom M).

To compute μ(S) to within 2^{-c} given μ(dom M), find a finite set F ⊆ dom M such that μ(dom M) − μ(F) < 2^{-c+k}. Let N_0 be the restriction of M to F and N_1 be the restriction of M to dom M \ F. Let T_i = {σ : K_{N_i}(σ) ≤ |σ| − k}. Note that T_0 is finite, so we can compute its measure. Now μ(dom N_1) < 2^{-c+k}, so μ(T_1) < 2^{-c}. Since S = T_0 ∪ T_1, we have μ(S) − μ(T_0) ≤ μ(T_1) < 2^{-c}.

Theorem 6.2.3 (Schnorr, see [58]⁹). A set is Martin-Löf random iff it is 1-random.

Proof. (⇒) Let U_k = {X : ∃n K(X ↾ n) ≤ n − k}. The U_k are uniformly Σ⁰₁, and by Lemma 6.2.2, μ(U_k) ≤ 2^{-k}, so {U_k}_{k∈ω} is a Martin-Löf test.

[footnote 8, continued] ized Martin-Löf test we might have no idea. However, generalized Martin-Löf tests are quite interesting from the mathematical viewpoint, as they correspond to a fruitful notion of randomness strictly stronger than Martin-Löf randomness known as weak 2-randomness, as we will discuss in Section 7.2.
⁹ As we will see in Theorem 6.3.10, Levin [242] and Schnorr [350] proved essentially equivalent results for monotone complexity and process complexity, respectively.
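The counting in Lemma 6.2.2 can be checked numerically on a small invented prefix-free machine (the table M below is hypothetical). The function mu_S computes the bound ∑_{σ∈S} 2^{-|σ|} used in the proof, which dominates the measure of the open set generated by S:

```python
from fractions import Fraction

# Toy prefix-free machine: description tau -> output sigma (hypothetical values).
M = {"0": "0110", "10": "01", "110": "0110", "111": "00000"}

def mu_dom(m):
    """Measure of the (prefix-free) domain of m."""
    return sum(Fraction(1, 2 ** len(t)) for t in m)

def mu_S(m, k):
    """The proof's bound sum_{sigma in S} 2^-|sigma| over the set S of strings
    compressible by k bits, where K_m(sigma) = min |tau| with m(tau) = sigma."""
    best = {}
    for t, s in m.items():
        best[s] = min(best.get(s, len(t)), len(t))
    return sum(Fraction(1, 2 ** len(s))
               for s, km in best.items() if km <= len(s) - k)

# The lemma's inequality mu(S) <= 2^-k * mu(dom M), for several k.
for k in range(4):
    assert mu_S(M, k) <= Fraction(1, 2 ** k) * mu_dom(M)
```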
If A is Martin-Löf random then A ∉ ∩_k U_k, so there is a k such that K(A ↾ n) > n − k for all n, and thus A is 1-random.

(⇐) Let A be a set that is not Martin-Löf random. Let {U_n}_{n∈ω} be a Martin-Löf test such that A ∈ ∩_n U_n. Let R_0, R_1, ... be a uniformly c.e. sequence of prefix-free sets such that U_n = ⟦R_n⟧ for all n. We have

∑_{n≥2} ∑_{σ∈R_{n²}} 2^{-(|σ|−n)} = ∑_{n≥2} 2^n μ(U_{n²}) ≤ ∑_{n≥2} 2^{-n²+n} ≤ ∑_{m≥2} 2^{-m} < 1.

So by the minimality of K among information content measures (Theorem 3.7.8), there is a constant c such that if σ ∈ R_{n²} for some n ≥ 2 then K(σ) ≤ |σ| − n + c. Since A ∈ U_{n²} for all n, for each n there is a k such that K(A ↾ k) ≤ k − n + c, and hence A is not 1-random.

The ⇐ direction of the above proof is taken from Chaitin [63]. It is also possible to prove this direction of the theorem using the KC Theorem; see for instance [118].

One quite important property of Martin-Löf's definition is the existence of a universal Martin-Löf test, in the following sense.

Definition 6.2.4 (Martin-Löf [259]). A Martin-Löf test {U_n}_{n∈ω} is universal if ∩_n U_n contains every Martin-Löf null set; in other words, for any Martin-Löf test {V_n}_{n∈ω}, we have ∩_n V_n ⊆ ∩_n U_n.

In other words, there is a single statistical test leading to a good definition of randomness for individual sequences. (However, it is certainly not a natural one in the sense that the law of large numbers and the law of the iterated logarithm are natural.) The Martin-Löf test built in the first part of the proof of Theorem 6.2.3 is universal. We give another construction of such a test in the following proof.

Theorem 6.2.5 (Martin-Löf [259]). There exists a universal Martin-Löf test.

Proof. Let {R⁰_n}_{n∈ω}, {R¹_n}_{n∈ω}, ... be an effective listing of all uniformly c.e. subsets of 2^{<ω}. Let S^i_n be the result of enumerating R^i_n and stopping the enumeration if the measure of ⟦R^i_n⟧ ever threatens to exceed 2^{-n}. Then {S⁰_n}_{n∈ω}, {S¹_n}_{n∈ω}, ... is an effective listing of all Martin-Löf tests. Let U_n = ∪_i ⟦S^i_{n+i+1}⟧. The U_n are uniformly Σ⁰₁ and

μ(U_n) ≤ ∑_i μ(⟦S^i_{n+i+1}⟧) ≤ ∑_i 2^{-(n+i+1)} = 2^{-n},

so {U_n}_{n∈ω} is a Martin-Löf test. For any Martin-Löf test {V_n}_{n∈ω}, there is an i such that V_n = ⟦S^i_n⟧ for all n. Then V_{n+i+1} ⊆ U_n for all n, so ∩_n V_n ⊆ ∩_n U_n.
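Both moves in this proof, clipping an arbitrary c.e. enumeration so that its measure stays below 2^{-n}, and the diagonal sum ∑_i 2^{-(n+i+1)} = 2^{-n}, can be checked concretely. The sketch below is illustrative only (an actual effective listing of all uniformly c.e. sets is of course not a finite object):

```python
from fractions import Fraction

def truncate_level(cylinders, n):
    """S_n: take cylinders (binary strings; a string c has measure 2^-len(c))
    in enumeration order, stopping before the total would exceed 2^-n."""
    kept, total = [], Fraction(0)
    for c in cylinders:
        w = Fraction(1, 2 ** len(c))
        if total + w > Fraction(1, 2 ** n):
            break
        kept.append(c)
        total += w
    return kept

# Clipping guarantees the measure bound even for a "bad" enumeration.
level = truncate_level(["00", "01", "10", "11"], 1)
assert sum(Fraction(1, 2 ** len(c)) for c in level) <= Fraction(1, 2)

# The diagonal sum over the first 100 component tests stays below 2^-n.
n = 3
assert sum(Fraction(1, 2 ** (n + i + 1)) for i in range(100)) < Fraction(1, 2 ** n)
```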
We will see another construction of a universal Martin-Löf test due to Kučera in Section 8.5.

Note that if {U_n}_{n∈ω} is a universal Martin-Löf test then a set A is 1-random iff A ∉ ∩_n U_n. In particular, each complement U̅_n is a Π⁰₁ class consisting entirely of 1-random sets, a fact that will often be quite useful. (For instance, it implies that there are low 1-random sets, by the low basis theorem.) That there is no Π⁰₁ class consisting exactly of the 1-random sets follows from the following result, since the only Π⁰₁ class of measure 1 is 2^ω. (However, we will see in Lemma 6.10.1 that there are Π⁰₁ classes consisting only of 1-random sets that contain all 1-random sets up to finite shifts.)

Corollary 6.2.6 (Martin-Löf [259]). The class of 1-random sets has measure 1.

Proof. Let {U_n}_{n∈ω} be a universal Martin-Löf test. The class of 1-random sets is the complement of ∩_n U_n, and μ(∩_n U_n) = 0.

Solovay gave the following alternate definition of 1-randomness.

Definition 6.2.7 (Solovay [371]). A Solovay test is a sequence {S_n}_{n∈ω} of uniformly Σ⁰₁ classes such that ∑_n μ(S_n) < ∞. A set A is Solovay random if for every such test, A is in only finitely many S_n.

Equivalently, A is Solovay random if for all computable sequences {I_n}_{n∈ω} of intervals with rational endpoints such that ∑_n |I_n| < ∞, the set A is in only finitely many I_n. Thus, we also refer to such a sequence of intervals as a Solovay test.

Theorem 6.2.8 (Solovay [371]). A set is Solovay random iff it is 1-random.

Proof. A Martin-Löf test is a Solovay test, so if a set is Solovay random then it is Martin-Löf random. Now suppose that A is Martin-Löf random, and let {S_n}_{n∈ω} be a Solovay test. There is an m such that ∑_{n>m} μ(S_n) < 1. Let U_k = {X : X ∈ S_n for at least 2^k many n > m}. The U_k are uniformly Σ⁰₁, and it is not hard to see that μ(U_k) ≤ 2^{-k} ∑_{n>m} μ(S_n) < 2^{-k}. Thus {U_k}_{k∈ω} is a Martin-Löf test, and hence there is a k such that A ∉ U_k. So A is in at most 2^k + m many S_n. Since {S_n}_{n∈ω} is an arbitrary Solovay test, A is Solovay random.
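The bound μ(U_k) ≤ 2^{-k} ∑_{n>m} μ(S_n) in this proof is a Markov-style counting inequality, which can be checked on a toy Solovay test made of dyadic cells (the sets S below are invented):

```python
from fractions import Fraction

# Toy Solovay test: each S_n is a set of depth-4 dyadic cells (labeled 0..15).
S = [{0, 1, 2}, {0, 1}, {0, 5}, {0}, {0}, {0}]   # hypothetical
DEPTH = 16
total = sum(Fraction(len(s), DEPTH) for s in S)  # sum_n mu(S_n)

def mu_U(k):
    """Measure of the cells lying in at least 2^k of the S_n."""
    counts = [sum(1 for s in S if c in s) for c in range(DEPTH)]
    return Fraction(sum(1 for c in counts if c >= 2 ** k), DEPTH)

# Markov-style bound: mu(U_k) <= 2^-k * sum_n mu(S_n).
for k in range(4):
    assert mu_U(k) <= Fraction(1, 2 ** k) * total
```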
6.3 The unpredictability paradigm

6.3.1 Martingales and supermartingales

It seems reasonable to say that most people would intuitively identify randomness with unpredictability. In a random process such as repeatedly tossing a fair coin, knowledge of the first n tosses should be of no help in predicting the result of the (n + 1)st toss. Similarly, if we are betting on
the successive bits of a random sequence, we should not expect to be able to make much money, no matter what betting strategy we apply. We can formalize the notion of a betting strategy as follows.

Definition 6.3.1 (Levy [244]). A function d : 2^{<ω} → R^{≥0} is a martingale¹⁰ if for all σ,

d(σ) = (d(σ0) + d(σ1)) / 2.

It is a supermartingale if for all σ,

d(σ) ≥ (d(σ0) + d(σ1)) / 2.

A (super)martingale d succeeds on a set A if lim sup_n d(A ↾ n) = ∞. The collection of all sets on which d succeeds is called the success set of d, and is denoted by S[d].

The idea is that a martingale d(σ) represents the capital that we have after betting on the bits of σ while following a particular betting strategy (d(λ) being our starting capital). The martingale condition d(σ) = (d(σ0) + d(σ1))/2 is a fairness condition, ensuring that the expected value of our capital after a bet is equal to our capital before the bet. (In the case of supermartingales, we are allowed to discard part of our capital, such as by buying drinks or tipping the dealer.)

Ville [401] proved that the success sets of (super)martingales correspond precisely to the sets of measure 0. (This fact can be proved in the same way as Theorem 6.3.4 below, ignoring the effectivity of the notions involved in that result.)

We note a few basic properties of (super)martingales. The two in the following proposition are easily proved directly from the definitions, and we use them below without further comment.

Proposition 6.3.2. (i) If d is a (super)martingale and r ∈ R^{≥0}, then rd is a (super)martingale.

(ii) If d_0, d_1, ... are (super)martingales such that ∑_n d_n(λ) < ∞, then ∑_n d_n is a (super)martingale.

The following result is known as Kolmogorov's Inequality, but is due to Ville [401].

Theorem 6.3.3 (Kolmogorov's Inequality, Ville [401]). Let d be a (super)martingale.

(i) For any string σ and any prefix-free set S of extensions of σ, we have ∑_{τ∈S} 2^{-|τ|} d(τ) ≤ 2^{-|σ|} d(σ).

¹⁰ A more complex notion of martingale is used in probability theory. We will discuss this notion, and the connection between it and ours, in Section 6.3.4.
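The fairness condition can be seen concretely by generating a martingale from a betting strategy. In the sketch below (illustrative, not from the text), bet(σ) is the signed fraction of current capital wagered on the next bit being 1; whatever strategy is plugged in, the resulting d automatically satisfies d(σ) = (d(σ0) + d(σ1))/2.

```python
from fractions import Fraction

def martingale_from_strategy(bet, depth, start=Fraction(1)):
    """Build d(sigma) for all sigma up to length `depth` from a strategy:
    bet(sigma) in [-1, 1] is the fraction of current capital wagered on the
    next bit being 1 (negative = wager on 0)."""
    d = {"": start}
    for n in range(depth):
        for sigma in [s for s in d if len(s) == n]:
            b = bet(sigma) * d[sigma]
            d[sigma + "1"] = d[sigma] + b
            d[sigma + "0"] = d[sigma] - b
    return d

# Example strategy: bet half our capital on the next bit equaling the last one.
d = martingale_from_strategy(
    lambda s: Fraction(1, 2) if s.endswith("1") else Fraction(-1, 2), 5)

# Fairness: d(sigma) is the average of d(sigma0) and d(sigma1).
for s in d:
    if len(s) < 5:
        assert d[s] == (d[s + "0"] + d[s + "1"]) / 2
```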
(ii) Let R_k = {σ : d(σ) ≥ k}. Then μ(⟦R_k⟧) ≤ d(λ)/k.
Proof. To prove (i), it is enough to consider finite S. We proceed by induction. For the n = 1 case, if τ ⪰ σ then 2^{-|τ|} d(τ) ≤ 2^{-|σ|} d(σ) follows easily from the definition of a (super)martingale. Suppose (i) holds for sets of at most n elements, and let S have n + 1 many elements. Let ν be the longest string such that every element of S extends ν. For i < 2, let S_i = {τ ∈ S : νi ⪯ τ}. Then for i < 2 we have |S_i| ≤ n, so by the inductive hypothesis, ∑_{τ∈S_i} 2^{-|τ|} d(τ) ≤ 2^{-|νi|} d(νi). Thus

∑_{τ∈S} 2^{-|τ|} d(τ) ≤ 2^{-(|ν|+1)} (d(ν0) + d(ν1)) ≤ 2^{-|ν|} d(ν) ≤ 2^{-|σ|} d(σ),

the last inequality following by the n = 1 case above.

To prove (ii), let P ⊆ R_k be a prefix-free set such that ⟦P⟧ = ⟦R_k⟧. By part (i) with σ = λ,

μ(⟦P⟧) = ∑_{τ∈P} 2^{-|τ|} ≤ ∑_{τ∈P} 2^{-|τ|} d(τ)/k ≤ d(λ)/k.
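Part (ii) can also be checked empirically: for a concrete martingale and the prefix-free set of all length-m strings on which d ≥ k, the measure never exceeds d(λ)/k. The martingale below (which bets half its capital on 1 at every step) is a made-up example:

```python
from fractions import Fraction
from itertools import product

def d(sigma):
    """A martingale that bets half its capital on the next bit being 1."""
    val = Fraction(1)
    for b in sigma:
        val *= Fraction(3, 2) if b == "1" else Fraction(1, 2)
    return val

# Kolmogorov's Inequality at a fixed length m: the length-m strings with
# d >= k form a prefix-free set, so their total measure is at most d(lambda)/k.
m = 10
for k in (2, 4, 8):
    measure = sum(Fraction(1, 2 ** m)
                  for bits in product("01", repeat=m)
                  if d("".join(bits)) >= k)
    assert measure <= Fraction(1, k)
```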
To define a notion of randomness using martingales, we have to restrict the class of martingales we consider. Schnorr [348, 349] did so by effectivizing the notion of martingale. We say that a (super)martingale d is computably enumerable (or Σ⁰₁) if it is a c.e. function in the sense of Definition 5.2.1.¹¹

Theorem 6.3.4 (Schnorr [348, 349]). A set is 1-random iff no c.e. (super)martingale succeeds on it.

Proof. We show that Martin-Löf tests and c.e. (super)martingales are essentially the same, thus effectivizing Ville's work mentioned above.

Let d be a c.e. (super)martingale; we may assume d(λ) ≤ 1. Let U_n = ⟦{σ : d(σ) ≥ 2^n}⟧. The U_n are uniformly Σ⁰₁ classes, and μ(U_n) ≤ 2^{-n} by Kolmogorov's Inequality. So {U_n}_{n∈ω} is a Martin-Löf test. Moreover, A ∈ ∩_n U_n iff A ∈ S[d].

Conversely, let {U_n}_{n∈ω} be a Martin-Löf test. Let R_0, R_1, ... be a uniformly c.e. sequence of prefix-free generators for this test. For each n, define d_n as follows. Whenever a string σ enters R_n, add 1 to the value of d_n(τ) for every τ ⪰ σ and add 2^{k−|σ|} to d_n(σ ↾ k) for each k < |σ|. It is easy to check that the d_n are uniformly c.e. martingales and d_n(λ) ≤ 2^{-n}. Thus d = ∑_n d_n is a c.e. martingale. Moreover, A ∈ S[d] iff A ∈ ∩_n U_n.

¹¹ One might have expected to see computable martingales rather than c.e. ones as the natural effectivization of the notion of martingale. Indeed, Schnorr [348, 349] did consider computable martingales, and used them to define notions of randomness that are strictly weaker than 1-randomness. We will return to this topic in Chapter 7.
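The test-to-martingale direction of this proof is completely finitary and can be run on a single generator set R (the set below is an invented prefix-free example). Each σ ∈ R contributes 1 along its extensions and 2^{k−|σ|} at its length-k proper prefixes, and the fairness condition then holds exactly at every node:

```python
from fractions import Fraction
from itertools import product

def d_from_generators(R):
    """The martingale built in the proof from a prefix-free generator set R:
    each sigma in R contributes 1 at every extension of sigma (including
    sigma itself) and 2^(k - |sigma|) at its length-k proper prefix."""
    def d(tau):
        total = Fraction(0)
        for sigma in R:
            if tau.startswith(sigma):                 # tau extends sigma
                total += 1
            elif sigma.startswith(tau):               # tau a proper prefix
                total += Fraction(1, 2 ** (len(sigma) - len(tau)))
        return total
    return d

R = ["00", "010", "111"]        # hypothetical prefix-free generators
d = d_from_generators(R)

# d(lambda) equals the total weight 1/4 + 1/8 + 1/8 of the generators.
assert d("") == Fraction(1, 2)

# Martingale fairness at every node up to depth 4.
for n in range(4):
    for bits in product("01", repeat=n):
        tau = "".join(bits)
        assert d(tau) == (d(tau + "0") + d(tau + "1")) / 2
```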
By applying the second part of the above proof to a universal Martin-Löf test, we have the following result.

Corollary 6.3.5 (Schnorr [348, 349]). There is a universal c.e. martingale, that is, a c.e. martingale d such that for any c.e. martingale f, we have S[f] ⊆ S[d].

As we have seen, not only is there a universal prefix-free machine, but in fact prefix-free complexity is minimal among information content measures. Given a martingale d, if f is a constant multiple of d then S[f] = S[d], and in fact d and f behave essentially the same on all sequences. Thus martingales are in a sense really specified only up to constant multiples. This observation leads to the following definition.

Definition 6.3.6. A c.e. (super)martingale d is optimal if for each c.e. (super)martingale f there is a constant c such that cd(σ) ≥ f(σ) for all σ.

We can strengthen Corollary 6.3.5 for supermartingales as follows.

Theorem 6.3.7 (Schnorr [348, 349], also essentially Levin, see [425]). There is an optimal c.e. supermartingale.

Proof. We can construct a computable enumeration d_0, d_1, ... of all c.e. supermartingales up to multiplication by a constant as follows. First notice that we can list all c.e. functions d such that d(σ) ≤ 2^{|σ|} for all σ. Then we can modify each such function d by not allowing the approximation to d(σi) to increase while that increase threatens to violate the condition d(σ) ≥ (d(σ0) + d(σ1))/2. Now the supermartingale d(σ) = ∑_n 2^{-n} d_n(σ) is optimal.

We will see in Section 6.3.3 that this result does not hold for martingales in place of supermartingales.

We finish this section by noting that the use of limsups in the definition of success for (super)martingales is not essential, as witnessed by the following result, sometimes known as the "savings trick". The name comes from the idea of waiting until we win, say, two dollars, then saving one dollar and betting the second dollar until we have three dollars, at which point we save two dollars and bet one, and so on. This idea can be turned into a formal construction, but we also give a different proof that is easier in the case of c.e. martingales.

Proposition 6.3.8 (Folklore). Let d be a (super)martingale. From d we can effectively define a (super)martingale d̂ such that S[d̂] = S[d] and for all A we have lim sup_n d(A ↾ n) = ∞ iff lim_n d̂(A ↾ n) = ∞. If d is computable, we can make d̂ also be computable, and if d is c.e., we can make d̂ also be c.e.

Proof. We first give a direct construction of d̂. We may assume that d(σ) > 0 for all σ. We define d̂ using an auxiliary "savings function" s : 2^{<ω} → Q,
whose value at any string σ will be less than d̂(σ). Let d̂(λ) = d(λ) and s(λ) = 0. Given d̂(σ) and s(σ), for i = 0, 1, let

d̂(σi) = s(σ) + (d̂(σ) − s(σ)) · d(σi)/d(σ).

Intuitively, we bet s(σ) much of our capital evenly (so that it stays constant), and copy d's bet on the rest. Now, for each i = 0, 1, approximate d̂(σi) − s(σ) to within 1/4, and if this approximation r is greater than 3/4, then let s(σi) = s(σ) + r − 1/4 (which ensures that 0 < d̂(σi) − s(σi) < 1). Otherwise, let s(σi) = s(σ). (In this case, 0 < d̂(σi) − s(σi) ≤ 1.)

It is now easy to check that d̂ is a d-computable (super)martingale, and that A ∈ S[d] iff lim sup_n d(A ↾ n) = ∞ iff lim_n s(A ↾ n) = ∞ iff lim_n d̂(A ↾ n) = ∞ iff A ∈ S[d̂].

For the c.e. martingale case, we have the following easier proof. From d we can construct a Martin-Löf test as in the first part of the proof of Theorem 6.3.4. From that test we can construct a martingale d̂ as in the second part of that proof. Then S[d̂] = S[d]. But it is easy to check that if A ∈ S[d̂] then in fact lim_n d̂(A ↾ n) = ∞.

Note that the d̂ built in our first construction has the property that for all σ and τ, we have d̂(στ) ≥ d̂(σ) − 2, since the unsaved part of our capital never reaches 2. We will refer to this feature of d̂ as the savings property.
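The savings idea is easy to simulate along a single path. The sketch below uses an exact variant (save a dollar whenever the unsaved capital reaches 2) rather than the approximation-based bookkeeping in the proof, and the capital path for d is invented so that lim sup_n d(A ↾ n) = ∞ while d itself keeps crashing:

```python
from fractions import Fraction

def saved_version(d_path):
    """Apply the savings idea along one path: follow d's bets with the
    unsaved capital, moving 1 dollar into savings whenever the unsaved
    part reaches 2. Returns the path of d-hat = savings + unsaved."""
    out = []
    s, live = Fraction(0), Fraction(d_path[0])
    out.append(s + live)
    for prev, cur in zip(d_path, d_path[1:]):
        live *= Fraction(cur, prev)     # copy d's bet on the unsaved part
        while live >= 2:
            s += 1
            live -= 1
        out.append(s + live)
    return out

# d's capital along one path: repeatedly doubles, then crashes to 1.
path = [1, 2, 4, 8, 1, 2, 4, 8, 16, 1, 2, 4, 8, 16, 32, 1]
hat = saved_version(path)

# d oscillates wildly, but d-hat never falls more than 2 below its running
# maximum: the savings are kept.
for i in range(1, len(hat)):
    assert hat[i] > max(hat[:i]) - 2
```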
6.3.2 Supermartingales and continuous semimeasures

Recall the notions of a continuous semimeasure and of the complexity measure KM from Section 3.16. As discussed in that section, Levin (see [425]) directly constructed an optimal c.e. continuous semimeasure, but there is a natural correspondence between continuous semimeasures and supermartingales, so his result can be interpreted as a result about supermartingales. (Figuring out the history of concepts in this area is often difficult, as several people had similar but not always exactly congruent ideas.)

More precisely, let d be a supermartingale. Then δ(σ) = 2^{-|σ|} d(σ) is a continuous semimeasure, and this process can be reversed. Thus Schnorr's optimal c.e. supermartingale is equivalent to Levin's optimal c.e. continuous semimeasure δ. The quantity δ(σ) is sometimes called the a priori probability of σ. See Li and Vitányi [248] for a discussion of the connection between optimal c.e. continuous semimeasures and Bayes' rule [30].

In particular, we see that, since KM(σ) is defined as − log δ(σ) for a fixed optimal c.e. continuous semimeasure δ, we have KM(σ) = − log d(σ) + |σ|, where d is the optimal c.e. supermartingale corresponding to δ.

Another way of obtaining KM is the following.

Proposition 6.3.9 (Uspensky [394]). The a priori entropy KM is (up to an additive constant) the smallest function f that is approximable from above and such that ∑_{σ∈M} 2^{-f(σ)} ≤ 1 for any prefix-free set M.
Proof. Clearly KM is approximable from above. Let d be an optimal c.e. supermartingale with d(λ) = 1. If M is prefix-free then ∑_{σ∈M} 2^{-KM(σ)} = ∑_{σ∈M} 2^{-|σ|} d(σ) ≤ 1 by Kolmogorov's Inequality (Theorem 6.3.3).

Now let δ be our fixed optimal c.e. continuous semimeasure. If f satisfies the conditions of the proposition, then let m(σ) be the maximum of ∑_{τ∈M} 2^{-f(στ)} over all prefix-free sets M. Note that m(σ) ≥ 2^{-f(σ)}. Then m is a c.e. continuous semimeasure, so m(σ) ≤ O(δ(σ)), and hence KM(σ) = − log δ(σ) ≤ − log m(σ) + O(1) ≤ f(σ) + O(1).

One easy consequence of the supermartingale characterization of KM is that A is 1-random iff KM(A ↾ n) = n ± O(1). The same is true for the related notions of monotone and process complexity, which we introduced in Section 3.15.

Theorem 6.3.10 (Levin [242], Schnorr [350]). A set A is 1-random iff Km(A ↾ n) = Km_D(A ↾ n) ± O(1) = n ± O(1).

Proof. If Km(A ↾ n) ≥ n − O(1) then K(A ↾ n) ≥ n − O(1), so A is 1-random. We now prove the converse. Since Km(A ↾ n) ≤ Km_D(A ↾ n) + O(1) ≤ n + O(1), it is enough to assume that A is 1-random and show that Km(A ↾ n) ≥ n − O(1).

Write M(σ)[s] = τ to mean that at stage s, the machine M has read σ and output τ so far. Let U_k = {ρ : ∃s ∃σ ((|σ| ≤ |ρ| − k) ∧ ρ ⪯ M(σ)[s])}. Note that if Km(ρ) ≤ |ρ| − k, then ρ ∈ U_k. The U_k are uniformly Σ⁰₁ classes. Let R be a prefix-free set of generators for U_k. For each ρ ∈ R, let σ_ρ be such that |σ_ρ| ≤ |ρ| − k and ρ ⪯ M(σ_ρ)[s] for some s. By monotonicity, the set of σ_ρ for ρ ∈ R is prefix-free, so

μ(U_k) = ∑_{ρ∈R} 2^{-|ρ|} ≤ ∑_{ρ∈R} 2^{-(|σ_ρ|+k)} = 2^{-k} ∑_{ρ∈R} 2^{-|σ_ρ|} ≤ 2^{-k}.

Thus {U_k}_{k∈ω} is a Martin-Löf test, so if A is 1-random then there is a k such that A ∉ U_k, which implies that Km(A ↾ n) > n − k for all n.

One consequence of the above result is that all 1-random sets have the same initial segment complexity with respect to both monotone complexity and process complexity. Thus, if we are interested in relating initial segment complexity to levels of randomness, these complexities, like KM, are too coarse.
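The correspondence δ(σ) = 2^{-|σ|} d(σ) between supermartingales and continuous semimeasures discussed above is easy to verify on a concrete example. The supermartingale below, which discards some capital at every step, is invented for illustration:

```python
from fractions import Fraction
from itertools import product

def d(sigma):
    """A supermartingale sketch: bets on 1 but discards some capital at each
    step, so d(sigma0) + d(sigma1) <= 2 d(sigma)."""
    val = Fraction(1)
    for b in sigma:
        val *= Fraction(5, 4) if b == "1" else Fraction(1, 2)
    return val

def delta(sigma):
    """The corresponding continuous semimeasure delta(sigma) = 2^-|sigma| d(sigma)."""
    return Fraction(1, 2 ** len(sigma)) * d(sigma)

# Semimeasure condition: delta(sigma) >= delta(sigma0) + delta(sigma1),
# which is exactly the supermartingale condition on d, rescaled.
for n in range(5):
    for bits in product("01", repeat=n):
        s = "".join(bits)
        assert delta(s) >= delta(s + "0") + delta(s + "1")
```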
6.3.3 Martingales and optimality

In this section we show that Theorem 6.3.7 does not hold for martingales; that is, there is no optimal c.e. martingale. We begin by showing that the particular proof of Theorem 6.3.7 cannot work, as there is no computable enumeration of all c.e. martingales. (This result follows from Theorem 6.3.12 below, but is more easily proved directly.)
Theorem 6.3.11 (Downey, Griffiths, and LaForte [110]¹²). There is no computable enumeration of all c.e. martingales.

Proof. Suppose that d_0, d_1, ... is a computable enumeration of all c.e. martingales. (We do not require that all approximations to each d_i appear on the list, just that for every c.e. martingale d there is an i such that d = d_i.) From this enumeration we can produce an enumeration f_0, f_1, ... of all c.e. martingales that are not the constant zero function as follows. Let S = {i : ∃s d_i(λ)[s] > 0}. Since S is c.e., we can enumerate its elements s_0, s_1, ..., and let f_i = d_{s_i}. Note that for each i and n, since f_i is a martingale (rather than just a supermartingale), there must be a σ ∈ 2^n such that f_i(σ) > 0.

We now derive a contradiction by defining a strictly positive c.e. martingale d not equal to any f_i. In fact, d will be a computable rational-valued martingale. For a string σ ≠ λ, let σ⁻ = σ ↾ (|σ| − 1), that is, the string obtained by removing the last bit of σ, and let σᶜ = σ⁻(1 − σ(|σ| − 1)), that is, the string obtained by flipping the last bit of σ.

Let s be such that f_0(λ)[s] > 0 and let d(λ) = f_0(λ)[s]/2. Having defined d on strings of length n, search for an s and a σ ∈ 2^{n+1} such that f_{n+1}(σ)[s] > 0. Let d(σ) = min(d(σ⁻), f_{n+1}(σ)[s]/2), let d(σᶜ) = 2d(σ⁻) − d(σ), and for all τ ∈ 2^{n+1} other than σ and σᶜ, let d(τ) = d(τ⁻). It is straightforward to check that d is a strictly positive c.e. martingale, but for each n there is a σ ∈ 2^n such that d(σ) < f_n(σ).

The following result may well have been known to Schnorr, Levin, and others, but it seems that the first proof in the literature was given by Downey, Griffiths, and LaForte [110].

Theorem 6.3.12 (Downey, Griffiths, and LaForte [110]). There is no optimal c.e. martingale.

Proof. Let d be a c.e. martingale. We build a c.e. martingale f such that for each c there is a σ with f(σ) ≥ cd(σ), which implies that d is not optimal. It is enough to build uniformly c.e.
martingales f_0, f_1, ... so that for each n we have f_n(λ) = 1 and f_n(σ) ≥ 2^{2n} d(σ) for some σ, since we can then take f = ∑_n 2^{-n} f_n.

Given n, we begin by letting f_n(λ) = 1. Let σ_0 = λ. As long as d(λ)[s] < 2^{-2n}, we define f_n(0^{s+1}) = 2^{s+1} and f_n(τ) = 0 for all other strings of length s + 1. If we find an s_0 such that d(λ)[s_0] ≥ 2^{-2n}, then we wait until a stage t_0 such that ∑_{τ∈2^{s_0+1}} d(τ)[t_0] ≥ 2^{s_0+1−2n}, which must occur because d is a martingale. Let i < 2 be such that d(0^{s_0} i)[t_0] ≤ d(0^{s_0}(1 − i))[t_0], and let σ_1 = 0^{s_0} i. Let f_n(σ_1) = 2^{s_0+1}, and let f_n(τ) = 0 for all other τ ∈ 2^{s_0+1}.

¹² There are statements to the contrary in the literature, but it seems likely that this fact was known before [110]; in particular, it is implicit in Levin's work such as [242].
Note that

∑_{τ∈2^{s_0+1}} d(τ)[t_0] − d(σ_1)[t_0] ≥ 2^{s_0−2n},

since d(0^{s_0}(1 − i))[t_0] is at least as big as d(σ_1)[t_0]. Thus, for any m, if d(σ_1) ≥ m 2^{-2n} f_n(σ_1) = m 2^{s_0+1−2n}, then

d(λ) = 2^{-(s_0+1)} ∑_{τ∈2^{s_0+1}} d(τ) ≥ 2^{-(s_0+1)} (2^{s_0−2n} + m 2^{s_0+1−2n}) = (m + 1/2) 2^{-2n}.

We now repeat the above procedure with σ_1 in place of λ. That is, as long as d(σ_1)[s] < 2^{-2n} f_n(σ_1), we define f_n(σ_1 0^{s+1}) = 2^{|σ_1|+s+1} and f_n(τ) = 0 for all other strings of length |σ_1| + s + 1. If we find an s_1 such that d(σ_1)[s_1] ≥ 2^{-2n} f_n(σ_1) = 2^{|σ_1|−2n}, then we wait until a stage t_1 such that ∑_{τ∈2^{s_1+1}} d(σ_1 τ)[t_1] ≥ 2^{|σ_1|+s_1+1−2n}, which must occur because d is a martingale. Let i < 2 be such that d(σ_1 0^{s_1} i)[t_1] ≤ d(σ_1 0^{s_1}(1 − i))[t_1], and let σ_2 = σ_1 0^{s_1} i. Let f_n(σ_2) = 2^{|σ_1|+s_1+1}, and let f_n(τ) = 0 for all other τ ∈ 2^{|σ_1|+s_1+1}. Now, as above,

∑_{τ∈2^{s_1+1}} d(σ_1 τ)[t_1] − d(σ_2)[t_1] ≥ 2^{|σ_1|+s_1−2n},

so if d(σ_2) ≥ m 2^{-2n} f_n(σ_2) = m 2^{|σ_1|+s_1+1−2n}, then

d(σ_1) = 2^{-(s_1+1)} ∑_{τ∈2^{s_1+1}} d(σ_1 τ) ≥ 2^{-(s_1+1)} (2^{|σ_1|+s_1−2n} + m 2^{|σ_1|+s_1+1−2n}) = (m + 1/2) 2^{|σ_1|−2n} = (m + 1/2) 2^{s_0+1−2n},

and hence d(λ) ≥ (m + 1/2) 2^{-2n}.

We now repeat this procedure with σ_2 in place of σ_1, and so on. Each time we define σ_i, it is because d(σ_{i−1}) ≥ 2^{-2n} f_n(σ_{i−1}). The latter implies that d(λ) ≥ ((i + 1)/2) 2^{-2n}, which cannot be the case for all i. So there is an i such that σ_i is never defined, which means that d(σ_{i−1}) < 2^{-2n} f_n(σ_{i−1}).
6.3.4 Martingale processes

As mentioned above, our concept of martingale is a simple special case of the original notion from probability theory, which we now discuss. Let us begin by recalling some basic notions. (We will define these in the context of the uniform measure μ on 2^ω, but they can be defined for any probability space. See for example [144] for details.) A σ-algebra on 2^ω is a collection of subsets of 2^ω closed under complements and countable unions. For any collection C of subsets of 2^ω, there is a smallest σ-algebra containing C,
which we say is the σ-algebra generated by C. The σ-algebra B generated by the collection of basic clopen sets of 2^ω is called the Borel σ-algebra on 2^ω. For a σ-algebra F, a function f from 2^ω to R is F-measurable if f^{-1}((−∞, r)) ∈ F for all r ∈ R. A random variable on 2^ω is a function X : 2^ω → R that is B-measurable. The expectation E[X] of X is its integral over 2^ω with respect to μ. For A ⊆ 2^ω such that μ(A) > 0, the conditional expectation of X given A, denoted by E[X | A], is E[X · χ_A]/μ(A), where χ_A is the characteristic function of A.

A function d : 2^{<ω} → R can be thought of as a sequence of random variables X_0^d, X_1^d, ..., where X_n^d(α) = d(α ↾ n). It is then easy to check that d is a martingale iff it is nonnegative and for each σ ∈ 2^{<ω},

E[X_{|σ|+1}^d | ⟦σ⟧] = d(σ).    (6.1)
However, in probability theory martingales are usually defined using a weaker condition.

Definition 6.3.13. A sequence of random variables X_0, X_1, ... is a martingale if for every n we have E[X_n] < ∞ and for every r_0, ..., r_n ∈ R,

E[X_{n+1} | {α : X_0(α) = r_0, ..., X_n(α) = r_n}] = r_n.    (6.2)
In other words, a martingale is a sequence of random variables such that the expectation of each random variable conditional on the observed values of the previous ones is equal to the observed value of the immediately preceding one. Hitchcock and Lutz [181] defined a function $d : 2^{<\omega} \to \mathbb R$ to be a martingale process if the corresponding random variables $X^d_0, X^d_1, \ldots$ form a martingale. It is easy to see that condition (6.1) implies condition (6.2), so every martingale is a martingale process. The following is an alternative definition of martingale process, also given in [181] and easily seen to be equivalent to the one above when we restrict attention to nonnegative functions.

Definition 6.3.14 (Hitchcock and Lutz [181]). Let $d : 2^{<\omega} \to \mathbb R^{\ge 0}$ be a function. We write $\sigma \sim_d \tau$ if $|\sigma| = |\tau|$ and $d(\sigma \restriction n) = d(\tau \restriction n)$ for all $n \le |\sigma|$. We write $s_d(\sigma)$ for the size of the $\sim_d$-equivalence class of σ. A function $d : 2^{<\omega} \to \mathbb R^{\ge 0}$ is a martingale process if for all σ,
\[ s_d(\sigma)\, d(\sigma) = \sum_{\tau \sim_d \sigma} \frac{d(\tau 0) + d(\tau 1)}{2}. \]
As Hitchcock and Lutz [181] remarked, in the fairness condition for martingales given in Definition 6.3.1, the average is taken over all sequences with the same bit history, while in the fairness condition for martingale processes, the average is taken over all sequences with the same capital history.
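The distinction can be made concrete in code. The following sketch uses a hand-built table of values (our own toy example, not from the text) that satisfies the averaging condition of Definition 6.3.14 over capital-history classes, but fails the plain martingale condition at σ = 0:

```python
from fractions import Fraction
from itertools import product

# Values of a toy function d on all strings of length <= 3.  It is a
# martingale process but not a martingale: the children of '0' average
# to 2, not to d('0') = 1; fairness holds only over the class {'0','1'}.
F = Fraction
d = {'': F(1), '0': F(1), '1': F(1),
     '00': F(2), '01': F(2), '10': F(0), '11': F(0),
     '000': F(4), '001': F(0), '010': F(4), '011': F(0),
     '100': F(0), '101': F(0), '110': F(0), '111': F(0)}

def history(sigma):
    # capital history d(sigma|0), ..., d(sigma)
    return tuple(d[sigma[:n]] for n in range(len(sigma) + 1))

def cls(sigma):
    # the ~d-equivalence class of sigma: same length, same capital history
    return [t for t in (''.join(p) for p in product('01', repeat=len(sigma)))
            if history(t) == history(sigma)]

for k in range(3):                      # check the fairness condition
    for sigma in (''.join(p) for p in product('01', repeat=k)):
        C = cls(sigma)
        lhs = len(C) * d[sigma]
        rhs = sum((d[t + '0'] + d[t + '1']) / 2 for t in C)
        assert lhs == rhs

assert d['00'] + d['01'] != 2 * d['0']  # the plain martingale condition fails
```

The final assertion shows why this d is not a martingale in the sense of Definition 6.3.1, even though it is a martingale process.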
It should be noted that there is a more general notion of martingale in probability theory that has as subcases both martingale processes and martingales in our sense. See [181] or [268] for a discussion of this fact. As with martingales, we say that a martingale process $d$ succeeds on a set $A$ if $\limsup_n d(A \restriction n) = \infty$, and the collection of all sets on which $d$ succeeds is denoted by $S[d]$. In addition to having its source in a probability-theoretic notion of martingale, the notion of martingale process is interesting to us because it provides a computable, rather than computably enumerable, characterization of 1-randomness. (See the discussion of "Schnorr's critique" in the introduction to Chapter 7.) Putting together results of Hitchcock and Lutz [181] and Merkle, Mihailović, and Slaman [268], we now show that a set is 1-random iff no computable martingale process succeeds on it. We begin with an analog of Lemma 6.3.3, adapted from Hitchcock and Lutz [181].

Lemma 6.3.15. Let $d$ be a martingale process.

(i) Let $B$ be a set of $\sim_d$-equivalent strings, and let $S$ be a prefix-free set of extensions of elements of $B$ such that $S$ is closed under $\sim_d$. Let $\sigma \in B$. Then
\[ \sum_{\tau \in S} 2^{-|\tau|} d(\tau) \le s_d(\sigma)\, 2^{-|\sigma|} d(\sigma), \tag{6.3} \]
where, as above, $s_d(\sigma)$ is the size of the $\sim_d$-equivalence class of σ.

(ii) Let $R_k = \{\sigma : d(\sigma) \ge k\}$. Then $\mu(R_k) \le \frac{d(\lambda)}{k}$.
Proof. Since (6.3) holds iff it holds for all finite subsets of $S$, to prove (i), it is enough to consider finite $S$. We proceed by induction on the number $n$ of $\sim_d$-equivalence classes in $S$. If $n = 1$ then all elements of $S$ have the same length $k$, so if we let $C$ be the set of all strings of length $k$ that extend elements of $B$, then
\[ \sum_{\tau \in S} 2^{-|\tau|} d(\tau) \le \sum_{\tau \in C} 2^{-k} d(\tau) \le s_d(\sigma)\, 2^{-|\sigma|} d(\sigma), \]
the last inequality following easily by induction on $k$ from the definition of martingale process. Now suppose (i) holds for sets with at most $n \ge 1$ many $\sim_d$-equivalence classes, and let $S$ have $n + 1$ many $\sim_d$-equivalence classes. For each $k$, let $S_k$ be the set of all $\tau \in 2^k$ that have a (not necessarily proper) extension in $S$. Let $m$ be the length of the shortest element of $S$. If $S_m$ has only one $\sim_d$-equivalence class then $S_m \subseteq S$, since $S$ is closed under $\sim_d$, whence $S = S_m$, since $S$ is prefix-free. But that conclusion contradicts our choice of $S$. Thus there is a least $k$ such that $S_k$ has at least two $\sim_d$-equivalence classes, and $|\sigma| < k \le m$. Partition $S_k$ into its $\sim_d$-equivalence classes $C_0, \ldots, C_p$. For each $i$, let $\sigma_i \in C_i$ and let $T_i$ be the elements of $S$ that extend elements of $C_i$. Then
each $T_i$ has at most $n$ many $\sim_d$-equivalence classes, so by induction
\[ \sum_{\tau \in T_i} 2^{-|\tau|} d(\tau) \le s_d(\sigma_i)\, 2^{-k} d(\sigma_i). \]
But since the elements of $S_{k-1}$ are all $\sim_d$-equivalent, for $\rho \in S_{k-1}$ the definition of martingale process implies that
\[ \sum_{i \le p} s_d(\sigma_i)\, 2^{-k} d(\sigma_i) \le s_d(\rho)\, 2^{-(k-1)} d(\rho). \]
By the $n = 1$ case above with the $\sim_d$-equivalence class of ρ in place of $S$, this quantity is less than or equal to $s_d(\sigma)\, 2^{-|\sigma|} d(\sigma)$. Thus
\[ \sum_{\tau \in S} 2^{-|\tau|} d(\tau) = \sum_i \sum_{\tau \in T_i} 2^{-|\tau|} d(\tau) \le s_d(\sigma)\, 2^{-|\sigma|} d(\sigma). \]
To prove (ii), let $S = \{\tau : d(\tau) \ge k \wedge \forall \rho \prec \tau\, (d(\rho) < k)\}$. Then $S$ is prefix-free and closed under $\sim_d$, and $S$ generates the same open class as $R_k$. By part (i) with $B = \{\lambda\}$,
\[ \mu(S) = \sum_{\tau \in S} 2^{-|\tau|} \le \sum_{\tau \in S} 2^{-|\tau|} \frac{d(\tau)}{k} \le \frac{d(\lambda)}{k}. \]
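Part (ii) is a Markov-style inequality, and for a small example it can be verified exhaustively. The sketch below uses the all-in martingale (a toy strategy of ours; every martingale is in particular a martingale process) and checks $\mu(R_k) \le d(\lambda)/k$ over all strings of a fixed length:

```python
from fractions import Fraction
from itertools import product

# The all-in martingale: double on 0, lose everything on 1.
def d(sigma):
    c = Fraction(1)
    for bit in sigma:
        c = 2 * c if bit == '0' else Fraction(0)
    return c

def mu_Rk(k, depth):
    # measure of the strings of length `depth` along which the capital
    # ever reaches k, i.e. of R_k restricted to that depth
    hit = [s for s in (''.join(p) for p in product('01', repeat=depth))
           if any(d(s[:n]) >= k for n in range(depth + 1))]
    return Fraction(len(hit), 2 ** depth)

for k in [2, 4, 8]:
    assert mu_Rk(k, 8) <= d('') / k   # Lemma 6.3.15(ii)
```

For this strategy the bound is attained with equality: the capital reaches $2^m$ exactly on the cylinder of $0^m$, which has measure $2^{-m}$.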
Theorem 6.3.16 (Hitchcock and Lutz [181]). If $A$ is 1-random then no computable martingale process succeeds on $A$.

Proof. The proof is basically the same as that of the corresponding direction of Theorem 6.3.4: Let $d$ be a computable martingale process. Let $U_n = \{\sigma : d(\sigma) > 2^n\}$. The $U_n$ are uniformly $\Sigma^0_1$ classes, and $\mu(U_n) \le 2^{-n}$ by Lemma 6.3.15. So $\{U_n\}_{n\in\omega}$ is a Martin-Löf test. Moreover, $A \in \bigcap_n U_n$ iff $A \in S[d]$. So if $A$ is 1-random then $A \notin S[d]$.

We now turn to the converse of this result, beginning with an auxiliary lemma.

Lemma 6.3.17 (Merkle, Mihailović, and Slaman [268]). Let $C \subset 2^{<\omega}$ be a computable prefix-free set such that $\mu(C) \le \frac12$. We can effectively define a computable martingale process $d$ such that $d(\lambda) = 1$ and $d(\sigma) = 2$ for all $\sigma \in C$.

Proof. Let $D$ be the set of all extensions of elements of $C$. For $\sigma \in D$, let $d(\sigma) = 2$. For $\sigma \notin D$, let $b = |2^{|\sigma|} \setminus D|$, let $c = |D \cap 2^{|\sigma|}|$ (the number of strings of length $|\sigma|$ that extend elements of $C$), and let $d(\sigma) = 1 - \frac cb$, which must be nonnegative by our assumption that $\mu(C) \le \frac12$. It is easy to check by induction that $d$ is a martingale process.

Theorem 6.3.18 (Merkle, Mihailović, and Slaman [268]). There is a computable martingale process that succeeds on every non-1-random set.

Proof. Let $U_0, U_1, \ldots$ be a universal Martin-Löf test. Let $V_\sigma = U_{|\sigma|+1} \cap [\sigma]$. Let $D_\sigma \subset 2^{<\omega}$ be uniformly computable prefix-free sets such that $V_\sigma =$
$\sigma D_\sigma$, the open class generated by $\{\sigma\tau : \tau \in D_\sigma\}$. Since $\mu(U_{|\sigma|+1}) \le 2^{-|\sigma|-1}$, we have $\mu(D_\sigma) \le \frac12$, so there are uniformly computable martingale processes $d_\sigma$ such that $d_\sigma(\lambda) = 1$ and $d_\sigma(\tau) = 2$ for all $\tau \in D_\sigma$. Let $\hat d_\sigma$ be the martingale process defined by letting $\hat d_\sigma(\sigma\rho) = d_\sigma(\rho)$ and $\hat d_\sigma(\nu) = 1$ if $\sigma \not\preceq \nu$. We now use the $\hat d_\sigma$ to define a computable martingale process $d$. It is easiest to describe the action of $d$ along a particular sequence $A$. At first $d$ copies $\hat d_\lambda$. If there is an $n_0$ such that $\hat d_\lambda(A \restriction n_0) = 2$ (and hence $d(A \restriction n_0) = 2$), then $d$ begins to copy $2\hat d_{A\restriction n_0}$. If there is an $n_1$ such that $\hat d_{A\restriction n_0}(A \restriction n_1) = 2$ (and hence $d(A \restriction n_1) = 4$), then $d$ begins to copy $4\hat d_{A \restriction n_1}$. In general, with $n_k$ having been defined, $d$ copies $2^{k+1} \hat d_{A \restriction n_k}$ until, if ever, there is an $n_{k+1}$ such that $\hat d_{A \restriction n_k}(A \restriction n_{k+1}) = 2$ (and hence $d(A \restriction n_{k+1}) = 2^{k+2}$). Then $d$ begins to copy $2^{k+2} \hat d_{A \restriction n_{k+1}}$. For each $\tau \in 2^{<\omega}$, let $\sigma_\tau$ be such that $d(\tau)$ is defined as copying $\hat d_{\sigma_\tau}$. It is easy to see by induction on the length of strings that $\tau \mapsto \sigma_\tau$ is a computable function, and that if $\tau \sim_d \tau'$ then $\sigma_\tau = \sigma_{\tau'}$. Since the $\hat d_\sigma$ are themselves computable martingale processes, we see that $d$ is a computable martingale process. Clearly, if $A \in \bigcap_n U_n$, then $d$ succeeds on $A$.

As a final note on computable martingale processes, we mention the following useful result. For a proof, see [181].

Theorem 6.3.19 (Hitchcock and Lutz [181]). For every computable martingale process $f$ and every $\varepsilon > 0$, there is a martingale process $d : 2^{<\omega} \to \mathbb Q^{\ge 0}$ that is computable as a function from strings to rationals, such that $|f(\sigma) - d(\sigma)| < \varepsilon$ for all σ.
6.4 Relativizing randomness

Like most computability-theoretic notions, 1-randomness can be relativized in a straightforward way. Given a set $X$, we say that $A$ is 1-random relative to $X$ if $K^X(A \restriction n) \ge n - O(1)$. We can also define a Martin-Löf test relative to $X$ in the same way as we defined an unrelativized Martin-Löf test, but with classes that are $\Sigma^0_1$ relative to $X$. The same proof as above shows that $A$ is 1-random relative to $X$ iff it passes every Martin-Löf test relative to $X$ iff no $X$-c.e. martingale succeeds on $A$. Most other notions in the theory of algorithmic randomness can be similarly relativized, and we often do so below without further comment. There are a few situations, however, in which relativization is more problematic. See Chapter 15 for an example. It is useful to note that the construction of a universal Martin-Löf test above can be easily adapted to produce a universal oracle Martin-Löf test, that is, a test $U^X_0, U^X_1, \ldots$ that is a universal Martin-Löf test relative to $X$ for all $X$.
6.5 Ville’s Theorem In this section, we prove the theorem of Ville mentioned above. Our proof will be taken from Lieb, Osherson, and Weinstein [249], which is a simplified version of the proof of Uspensky, Semenov, and Shen [395]. A selection function, or selection rule, is a partial function f : 2<ω → {yes, no}. We think of f as deciding whether or not to select the (k + 1)st bit of a sequence based on the observed values of the first k bits. For a sequence α and a selection function f , let sf (α, n) be the nth number k such that f (α k) = yes, if such a number exists. If sf (α, n − 1) is defined, let Sf (α, n) = i
S(α,n) n
= 12 .
(ii) For every f ∈ E that selects infinitely many bits of α, we have S (α,n) = 12 . limn f n (iii) For all n, we have
S(α,n) n
12 .
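The counting functions $s_f$ and $S_f$ are easy to realize in code. A small sketch (the alternating sequence and the even-position rule are our own illustrative choices, not from the text):

```python
# s_f(alpha, n) enumerates selected positions; S_f(alpha, n) counts the
# 1's among the first n selected bits.
def selected_positions(alpha, f):
    return [k for k in range(len(alpha)) if f(alpha[:k]) == 'yes']

def S_f(alpha, f, n):
    pos = selected_positions(alpha, f)
    return sum(int(alpha[k]) for k in pos[:n])

alpha = '01' * 8                                   # 0101...: S(alpha,n)/n -> 1/2
h = lambda prefix: 'yes'                           # selects every bit
even = lambda prefix: 'yes' if len(prefix) % 2 == 0 else 'no'

assert S_f(alpha, h, 16) == 8                      # half of all bits are 1
assert S_f(alpha, even, 8) == 0                    # even positions all hold 0
```

The second rule illustrates the point of the theorem: a bad selection rule can ruin the limiting frequency of a sequence whose overall frequency is fair.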
Before proving the full theorem, we prove a finite version. Strictly speaking, the following result is not needed for the proof of Ville's Theorem we give, but we believe that it is helpful to the comprehension of the ideas of the proof.

Proposition 6.5.2 (Finite version of Ville's Theorem). Let $E$ be any finite collection of selection functions. Then there is a sequence α satisfying the conclusion of Theorem 6.5.1.

Proof. Without loss of generality, we will assume that the function sending every string to yes is in $E$, and hence we need to meet only conditions (ii) and (iii) of Theorem 6.5.1. We build α in stages. Given $\alpha \restriction n$, let $C(n) = \{f \in E : f(\alpha \restriction n) = \text{yes}\}$ and $\alpha_n = |\{j < n : C(j) = C(n)\}| \bmod 2$. That is, we let $\alpha_n$ be 1 iff the subset $C(n)$ of $E$ consisting of those functions that select the $n$th bit of α appears an even number of times among the $C(j)$ with $j \le n$. Evidently, α satisfies (iii), since each 1 appearing in α is preceded by an occurrence of 0 that can be uniquely chosen to match it. (In other words, if $\alpha_n = 1$ then there must be an $m < n$ such that $C(m) = C(n)$, and for the largest such $m$ we have $\alpha_m = 0$.) As an illustration, let $E = \{h, f_1, f_2\}$, with $h(\sigma) = \text{yes}$ for all σ; and $f_1$ and $f_2$ defined so that $f_1(\sigma) = \text{yes}$ for all σ of length 0, 2, 4, or 6, but
$f_1(\sigma) = \text{no}$ for all other σ of length less than 8; while $f_2(\sigma) = \text{yes}$ for all σ of length 0, 3, or 6, but $f_2(\sigma) = \text{no}$ for all other σ of length less than 8. Then we have the following:

C(0) = {h, f1, f2},   α0 = 0
C(1) = {h},           α1 = 0
C(2) = {h, f1},       α2 = 0
C(3) = {h, f2},       α3 = 0
C(4) = {h, f1},       α4 = 1   (because C(4) = C(2))
C(5) = {h},           α5 = 1   (because C(5) = C(1))
C(6) = {h, f1, f2},   α6 = 1   (because C(6) = C(0))
C(7) = {h},           α7 = 0   (because C(7) = C(1) = C(5))
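The construction is fully mechanical, so the table can be recomputed by a short sketch (function names match the example above; `f1` and `f2` select on prefix length only):

```python
# Rebuild the illustration: h always selects; f1 selects at lengths
# 0, 2, 4, 6; f2 at lengths 0, 3, 6.
def h(prefix):  return True
def f1(prefix): return len(prefix) in (0, 2, 4, 6)
def f2(prefix): return len(prefix) in (0, 3, 6)

E = {'h': h, 'f1': f1, 'f2': f2}

alpha, history = '', []
for n in range(8):
    C = frozenset(name for name, f in E.items() if f(alpha))   # C(n)
    bit = len([D for D in history if D == C]) % 2              # alpha_n
    history.append(C)
    alpha += str(bit)

assert alpha == '00001110'
```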
Thus $\alpha \restriction 8 = 00001110$. Now suppose that $|\{n : f(\alpha \restriction n) = \text{yes}\}| = \infty$ and let $n_0 < n_1 < \cdots$ be a list of all $n$ such that $f(\alpha \restriction n) = \text{yes}$. For any $C \subseteq E$, if we take the (possibly finite) sequence $n_{i_0} < n_{i_1} < \cdots$ of all $n_i$ such that $C(n_i) = C$, we have $\alpha(n_{i_j}) = 0$ for $j$ even and $\alpha(n_{i_j}) = 1$ for $j$ odd, so for all $n$, the number of 1's among the first $n$ many bits of α selected by $f$ differs from the number of 0's among these bits by at most 1 for each $C \subseteq E$. In other words, $|\frac n2 - S_f(\alpha, n)| \le 2^{|E|}$. Thus $\lim_n \frac{S_f(\alpha,n)}{n} = \frac12$. Notice that the proof above actually shows that for every $f \in E$, we have $0 \le \frac n2 - S_f(\alpha, n) \le 2^{|E|}$.

The method above clearly fails when we try to use it in the case that $E$ is infinite. The problem is that (for example) each subset of $E$ might occur only once among the $C(n)$, making $\alpha = 0^\omega$. The proof of Ville's Theorem in the general case uses the same idea of a finite subset $C(n)$ determining $\alpha_n$, together with a combinatorial trick to decide which of the $f \in E$ are allowed to be in $C(n)$. Roughly, $f_k$ is allowed to occur with $f_{k+m+1}$ only if it has occurred sufficiently often by itself, or with some of $f_1, \ldots, f_{k+m}$, as made precise below. The key to the proof of Theorem 6.5.1 is the following lemma. Let $\mathcal A$ denote the class of infinite sequences of subsets of $\mathbb N$. For $A \in \mathcal A$, let $I^A_\ell = \{n : \ell \in A(n)\}$. For $\alpha \in 2^\omega$ and $I = \{n_0 < n_1 < \cdots\} \subseteq \mathbb N$, recall that $\alpha \restriction I = \alpha(n_0)\alpha(n_1)\ldots$.

Lemma 6.5.3. To each $A \in \mathcal A$ we can assign a sequence α such that the value of $\alpha(n)$ depends only on $A(0), \ldots, A(n)$; for any ℓ such that $I^A_\ell$ is infinite, $\lim_n \frac{S(\alpha \restriction I^A_\ell,\, n)}{n} = \frac12$; and $\frac{S(\alpha,n)}{n} \le \frac12$ for all $n$.
Assuming the lemma for now, we can prove Theorem 6.5.1 as follows. Proof of Theorem 6.5.1 from Lemma 6.5.3. Let f0 , f1 , . . . list the elements of E. We may assume that f0 is the function sending every string to yes, so we need to meet only conditions (ii) and (iii) of Theorem 6.5.1. Let
$A(0) = \{i : f_i(\lambda) = \text{yes}\}$ and let $\alpha(0)$ be as in the lemma. Given $\alpha \restriction n$, let $A(n) = \{i : f_i(\alpha \restriction n) = \text{yes}\}$ and let $\alpha(n)$ be as in the lemma. Then condition (iii) is satisfied by the lemma, as is condition (ii), since for any $f_i$ such that $f_i(\alpha \restriction n) = \text{yes}$ for infinitely many $n$,
\[ \lim_n \frac{S_{f_i}(\alpha, n)}{n} = \lim_n \frac{S(\alpha \restriction I^A_i,\, n)}{n} = \frac12. \]
Proof of Lemma 6.5.3. We first build a map from $\mathcal A$ into itself, denoting the image of $A$ by $A^*$. Each coordinate of $A^*$ will be a nonempty finite subset of the corresponding coordinate of $A$. In building $A^*$, we also define an auxiliary family of numbers $I(n)$. Suppose that we have already defined $A^*(m)$ and $I(m)$ for all $m < n$ so that $A^*(m) = \{j \in A(m) : j \le I(m)\}$. Let $I(n)$ be the least $i$ such that there is a $j \in A(n)$ with $j \le i$ and
\[ |\{m < n : j \in A^*(m) \wedge I(m) = i\}| < 3^i, \]
and let $A^*(n) = \{j \in A(n) : j \le I(n)\}$. The construction of $A^*(n)$ depends only on $\{A(i) : i \le n\}$. Furthermore, for each $i$, we have $I(n) = i$ for at most $i3^i$ many $n$. From $A^*$ we get α by letting $\alpha(n) = |\{j < n : A^*(j) = A^*(n)\}| \bmod 2$. Clearly, $\alpha(n)$ depends only on $A(0), \ldots, A(n)$. Furthermore, as with the finite construction, we can pair each $i$ such that $\alpha(i) = 1$ with a $j < i$ such that $\alpha(j) = 0$, so $\frac{S(\alpha,n)}{n} \le \frac12$ for all $n$. To verify the remaining property in the lemma, fix an ℓ that occurs in infinitely many $A(n)$. We need to show
that $\lim_n \frac{S(\alpha \restriction I^A_\ell,\, n)}{n} = \frac12$. Let $n_0 < n_1 < \cdots$ be a listing of $I^A_\ell$. Since for each $i$ we have $I(n) = i$ for at most finitely many $n$, it follows that $\ell \in A^*(n_m)$ for cofinitely many $m$. For each $i \ge \ell$, there are at least $3^i$ many occurrences of $i$ in the list $I(n_0), I(n_1), \ldots$ prior to the first occurrence of $i+1$ in that list. Let $k$ be least such that for the least $m$ with $I(n_m) = k$, we have $\ell < I(n_i)$ for all $i \ge m$. Let
\[ \gamma = A^*(n_m) A^*(n_{m+1}) \ldots. \]
For $j \ge k$, let $\sigma(j) = A^*(n_i) A^*(n_{i+1}) A^*(n_{i+2}) \ldots A^*(n_{i+r})$, where $i$ is least such that $I(n_i) = j$ and $r$ is least such that $I(n_{i+r+1}) = j+1$. Note that $\gamma = \sigma(k)\sigma(k+1)\ldots$. Also, $A^*(i)$ is part of γ for cofinitely many $i$ such that $\ell \in A(i)$. For $i \ge k$, each of the sets appearing in $\sigma(i)$ is a subset of $\{0, \ldots, i\}$, so there are at most $2^i$ many such sets. Therefore, since there are at least $3^i$ many occurrences of $i$ in $I(n_0), I(n_1), \ldots$ prior to the first occurrence of
$i + 1$, we see that γ has the form $\sigma(k)\sigma(k+1)\sigma(k+2)\ldots$, where for all $m \ge 0$, the block $\sigma(k+m)$ has length at least $3^{k+m}$ and contains at most $2^{k+m}$ many distinct sets. Let $m$ be such that $\gamma(0) = A^*(n_m)$, and let $\beta = \alpha \restriction (I^A_\ell \cap [n_m, \infty))$. Since β is a tail of $\alpha \restriction I^A_\ell$, it is enough to show that $\lim_n \frac{S(\beta,n)}{n} = \frac12$. Let $j$ be sufficiently large so that there is an $m(j)$ such that $\gamma(j)$ is within $\sigma(k + m(j) + 1)$. Let $N_0(j)$ be the number of 0's in $\beta \restriction j$ and $N_1(j)$ the number of 1's in $\beta \restriction j$. Since the block $\sigma(k + m(j))$ has at least $3^{k+m(j)}$ many coordinates, $N_0(j) + N_1(j) \ge 3^{k+m(j)}$. Let $p$ be the length of the initial segment of $\alpha \restriction I^A_\ell$ missing from β. Then $N_1(j) \le N_0(j) + p$, whence
\[ N_1(j) \le \tfrac12 (N_0(j) + N_1(j) + p). \tag{6.4} \]
There are at most $2^{k+i}$ distinct sets in $\sigma(k+i)$, and this number bounds the number of unmatched 0's in that block. Therefore,
\[ N_0(j) \le N_1(j) + \sum_{i \le m(j)+1} 2^{k+i} \le N_1(j) + 2^{k+m(j)+2}. \]
Thus,
\[ N_1(j) \ge \tfrac12 (N_0(j) + N_1(j) - 2^{k+m(j)+2}). \tag{6.5} \]
We finish by evaluating $R(j) = \frac{N_1(j)}{N_0(j)+N_1(j)}$. Clearly, if $\lim_j R(j) = \frac12$ then $\lim_n \frac{S(\beta,n)}{n} = \frac12$. By (6.4),
\[ R(j) \le \frac{N_0(j) + N_1(j) + p}{2(N_0(j) + N_1(j))}, \]
which has limit $\frac12$. For the lower bound, we use (6.5), which gives us
\[ R(j) \ge \frac{N_0(j) + N_1(j) - 2^{k+m(j)+2}}{2(N_0(j) + N_1(j))} = \frac12 - \frac{2^{k+m(j)+2}}{2(N_0(j) + N_1(j))}, \]
which converges to $\frac12$, as $N_0(j) + N_1(j) \ge 3^{k+m(j)}$.

Lieb, Osherson, and Weinstein [249] made the following observations about their proof above. Let $h$ be the function sending every string to yes. Choose a selection function $f_\ell \in E$ that selects infinitely many bits of α, and define the fluctuation about the mean to be
\[ \delta_\ell(n) = S_{f_\ell}(\alpha, n) - \frac n2. \]
By the reasoning above we have seen that $\delta_\ell$ is bounded above by an ℓ-dependent constant. This property mimics the behavior of $h$, whose fluctuation is never positive. Furthermore, using the reasoning above, there is a number $C_\ell \ge 0$ such that $\delta_\ell(n) \ge -C_\ell\, n^{\log 2/\log 3}$. The 3 occurs because of the use of $3^i$ in the proof, and an arbitrary number $r$ could have been used instead of 3, yielding the conclusion that for each $\varepsilon > 0$, there is a constant $C_\ell(\varepsilon) \ge 0$ such that for every $n$,
\[ \delta_\ell(n) \ge -C_\ell(\varepsilon)\, n^\varepsilon. \tag{6.6} \]
Lieb, Osherson, and Weinstein pointed out that this bound is remarkable, because for a random coin toss, the law of the iterated logarithm states that the fluctuations exceed $(1-\varepsilon)\sqrt{\frac{n \log\log n}{2}}$ for any $\varepsilon > 0$ infinitely often almost surely (see Feller [144]). They observed that for any slow growing function $g$, there is a suitably fast growing function $p$ such that using $p(i)$ in place of $3^i$ will enforce a bound analogous to (6.6) with $g(n)$ in place of $n^\varepsilon$, and a suitable constant $C_\ell(g)$, reducing the fluctuations below the mean even further.
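The $A^*$ construction in the proof of Lemma 6.5.3 is effective and can be simulated directly. The sketch below is our reading of that construction (in particular, the witness $j$ is required to satisfy $j \le i$, and the input sequence of sets is an illustrative choice of ours); the final check verifies condition (iii) of the theorem:

```python
from collections import Counter

def build_alpha(A, length):
    # used[(j, i)] = |{m : j in A*(m) and I(m) = i}|
    used = Counter()
    stars, alpha = [], []
    for n in range(length):
        i = 0                      # I(n): least level with a fresh witness
        while not any(j <= i and used[(j, i)] < 3 ** i for j in A[n]):
            i += 1
        star = frozenset(j for j in A[n] if j <= i)     # A*(n)
        for j in star:
            used[(j, i)] += 1
        alpha.append(len([s for s in stars if s == star]) % 2)
        stars.append(star)
    return alpha

# A(n) = indices of selection rules firing at stage n; 0 (the always-yes
# rule) is always present.
A = [{0} if n % 2 else {0, 1} for n in range(64)]
alpha = build_alpha(A, 64)
assert all(sum(alpha[:n]) * 2 <= n for n in range(1, 65))   # condition (iii)
```

The pairing argument from the proof guarantees the final assertion: every 1 in α is matched by an earlier 0 with the same $A^*$ value.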
6.6 The Ample Excess Lemma

While the definition of a set $A$ being 1-random requires only that $K(A \restriction n) \ge n - O(1)$, there is a "gap phenomenon" that ensures that, in fact, the initial segment complexity of $A$ must be quite a bit higher than this lower bound.

Theorem 6.6.1 (Ample Excess Lemma, Miller and Yu [280]). A set $A$ is 1-random iff $\sum_n 2^{n-K(A \restriction n)} < \infty$.

Proof. If $A$ is not 1-random then there are infinitely many $n$ such that $K(A \restriction n) < n$, so $\sum_n 2^{n-K(A\restriction n)} = \infty$. For the other direction, note that for any $m$,
\[ \sum_{\sigma \in 2^m} \sum_{n \le m} 2^{n - K(\sigma \restriction n)} = \sum_{\sigma \in 2^m} \sum_{\tau \preceq \sigma} 2^{|\tau| - K(\tau)} = \sum_{\tau \in 2^{\le m}} 2^{m-|\tau|}\, 2^{|\tau| - K(\tau)} = 2^m \sum_{\tau \in 2^{\le m}} 2^{-K(\tau)} < 2^m. \]
Therefore, for each $c$ there are at most $2^{m-c}$ strings $\sigma \in 2^m$ for which $\sum_{n \le m} 2^{n-K(\sigma \restriction n)} \ge 2^c$. Thus $\mu(\{A : \sum_{n \le m} 2^{n-K(A\restriction n)} \ge 2^c\}) \le 2^{-c}$, and hence $U_c = \{A : \sum_n 2^{n-K(A \restriction n)} > 2^c\}$ has measure at most $2^{-c}$. Furthermore, the $U_c$ are uniformly $\Sigma^0_1$, and hence form a Martin-Löf test. If $A$ is 1-random then there is a $c$ such that $A \notin U_c$. Then $\sum_n 2^{n-K(A\restriction n)} \le 2^c$.
Corollary 6.6.2 (Chaitin [63]). A set $A$ is 1-random iff $\lim_n K(A \restriction n) - n = \infty$.
Corollary 6.6.3 (Miller and Yu [280]). Let $A$ be 1-random and let $f$ be a function such that $\sum_n 2^{-f(n)} = \infty$. Then there are infinitely many $n$ such that $K(A \restriction n) > n + f(n) - O(1)$.¹³

Proof. If there is an $m$ such that $K(A \restriction n) \le n + f(n) - O(1)$ for all $n > m$, then
\[ \sum_n 2^{n-K(A\restriction n)} \ge \sum_{n>m} 2^{n-K(A\restriction n)} \ge \sum_{n>m} 2^{-(f(n)-O(1))} = \infty, \]
so $A$ is not 1-random.

Corollary 6.6.4 (Miller and Yu [280]). If $A$ is 1-random then $K^A(n) \le K(A \restriction n) - n + O(1)$.

Proof. Let $d$ be such that $\sum_n 2^{n-K(A\restriction n)} < 2^d$. Let $S = \{(c+d, n) : \exists s\, (K_s(A \restriction n) - n < c)\}$. The set $S$ is c.e. relative to $A$ and
\[ \sum_{(c+d,n)\in S} 2^{-(c+d)} \le \sum_n 2^{n-K(A\restriction n)-d} < 1. \]
Furthermore, for each $n$, we have $(K(A \restriction n) - n + d + 1, n) \in S$. Thus, by the relativized form of the KC Theorem, $K^A(n) \le K(A \restriction n) - n + O(1)$.

Using Corollaries 3.11.3 and 6.6.2, we can prove the result mentioned in Section 3.8 that there are many strings that are weakly K-random yet highly non-C-random.

Corollary 6.6.5. There are infinitely many strings σ such that $K(\sigma) \ge |\sigma|$ but $C(\sigma) \le |\sigma| - \log|\sigma|$.

Proof. Let $A$ be 1-random. By Corollary 3.11.3, there are infinitely many $n$ such that $C(A \restriction n) \le n - \log n$. By Corollary 6.6.2, for all but finitely many such $n$, we have $K(A \restriction n) \ge n$.

We finish this section with a result about the complexity of blocks, rather than just initial segments, of a 1-random set.

Theorem 6.6.6 (Merkle, see [306]). If $A = \sigma_0 \sigma_1 \sigma_2 \ldots$ and $K(\sigma_i) < |\sigma_i|$ for all $i$, then $A$ is not 1-random.

Proof. Let $M$ be the prefix-free machine that, on input τ, searches for $n$ and a partition $\tau = \rho\nu_0 \ldots \nu_{n-1}$ such that $U(\rho)\downarrow = n$ and $U(\nu_i)\downarrow$ for all $i < n$, and if these are found, outputs $U(\nu_0) \ldots U(\nu_{n-1})$. Then for each $n$ we have $M(n^* \sigma_0^* \cdots \sigma_{n-1}^*) = \sigma_0 \cdots \sigma_{n-1}$, so
\[ K(\sigma_0 \cdots \sigma_{n-1}) \le |n^* \sigma_0^* \cdots \sigma_{n-1}^*| + O(1) = K(n) + \sum_{i<n} K(\sigma_i) + O(1) \le 2\log n + |\sigma_0 \cdots \sigma_{n-1}| - n + O(1). \]
¹³Solovay [371] proved this result for computable $f$.
Thus for each $k$ there is an $m$ such that $K(A \restriction m) \le m - k$, and hence $A$ is not 1-random.
6.7 Plain complexity and randomness

Though the characterizations of 1-random sets in terms of initial segment complexities such as monotone and prefix-free complexity are satisfying, they raise the natural question, which remained open for nearly forty years, of whether there is a plain complexity characterization of 1-randomness. (We will see in Section 6.11 that having infinitely often maximal plain complexity suffices but is not necessary.) Miller and Yu [280] have given such a characterization, which we now present, along with characterizations that mix plain and prefix-free complexity.

Definition 6.7.1 (Miller and Yu [280]). Define a computable function $G : \mathbb N \to \mathbb N$ by
\[ G(n) = \begin{cases} K_{s+1}(t) & \text{if } n = 2^{\langle s,t \rangle} \text{ and } K_{s+1}(t) \ne K_s(t) \\ n & \text{otherwise.} \end{cases} \]

The function $G$ is a variation on the Solovay function from the proof of Theorem 3.12.1. Note that
\[ \sum_n 2^{-G(n)} \le \sum_n 2^{-n} + \sum_t \sum_{m \ge K(t)} 2^{-m} = 2 + 2\sum_t 2^{-K(t)} < \infty. \]
Furthermore, if $n = 2^{\langle s,t \rangle}$ for the least $s$ such that $K_{s+1}(t) = K(t)$, then $G(n) = K(t) \le K(n) + O(1)$. So we can think of $G$ as an infinitely often correct computable simulation of $K$ that grows fairly fast (cf. Theorem 3.12.1).

Theorem 6.7.2 (Miller and Yu [280], Gács [165] for (i) ⇔ (ii)). The following are equivalent.

(i) $A$ is 1-random.

(ii) $C(A \restriction n \mid n) \ge n - K(n) - O(1)$.

(iii) $C(A \restriction n) \ge n - K(n) - O(1)$.

(iv) $C(A \restriction n) \ge n - g(n) - O(1)$ for every computable function $g$ such that $\sum_n 2^{-g(n)} < \infty$.

(v) $C(A \restriction n) \ge n - G(n) - O(1)$.

Proof. (i) ⇒ (ii): Let $U_k = \{A \mid \exists n\, C(A \restriction n \mid n) < n - K(n) - k\}$. Then $A \in U_k$ iff $\exists n\, \exists s\, C_s(A \restriction n \mid n) + K_s(n) < n - k$, so the $U_k$ are uniformly
$\Sigma^0_1$. Furthermore,
\[ \mu(U_k) \le \sum_n \mu(\{A \mid C(A \restriction n \mid n) < n - K(n) - k\}) \le \sum_n 2^{-n} |\{\sigma \in 2^n \mid C(\sigma \mid n) < n - K(n) - k\}| \le \sum_n 2^{-n}\, 2^{n-K(n)-k} = 2^{-k} \sum_n 2^{-K(n)} \le 2^{-k}. \]
Thus $\{U_k\}_{k\in\omega}$ is a Martin-Löf test. If $A$ is 1-random, then there is a $k$ such that $A \notin U_k$, whence $\forall n\, C(A \restriction n \mid n) \ge n - K(n) - k$.

(ii) ⇒ (iii): Immediate.

(iii) ⇒ (iv): Let $g$ be a computable function such that $\sum_n 2^{-g(n)} < \infty$. By the minimality of $K$ among information content measures (Theorem 3.7.8), we have $K(n) \le g(n) + O(1)$, so if $C(A \restriction n) \ge n - K(n) - O(1)$ then $C(A \restriction n) \ge n - g(n) - O(1)$.

(iv) ⇒ (v): Immediate since $G$ is computable and $\sum_n 2^{-G(n)} < \infty$.

(v) ⇒ (i): Assume that $A$ is not 1-random. We show that (v) does not hold. By Theorem 3.7.6, there is a $c$ such that for all $t$ and $k$, $|\{\sigma \in 2^t \mid K(\sigma) \le t - k\}| \le 2^{t-K(t)-k+c}$. We construct a (non-prefix-free) machine $M$ that gives relatively short descriptions of $A \restriction n$ when $n = 2^{\langle s,t \rangle}$ and $s$ is least such that $K_{s+1}(t) = K(t)$. For $s, t \in \omega$, let $n = 2^{\langle s,t \rangle}$. To $\langle s,t \rangle$ we devote the $M$-programs with lengths from $\frac n2 + c + 1$ to $n + c$. Note that distinct pairs do not compete for elements in the domain of $M$. Fix $k \in \omega$ and let $m = n - K_{s+1}(t) - k + c$. Note that $m \le n + c$. If $m \ge \frac n2 + c + 1$, then for each $\sigma \in 2^n$ such that $K(\sigma \restriction t) \le t - k$, we try to pick a $\tau \in 2^m$ and define $M(\tau) = \sigma$. Different $k$ do not compete for programs, but it is still possible that there are not enough strings of length $m$ for all such σ, so we assign programs until we run out. If $K_{s+1}(t) = K(t)$, though, then we cannot run out, because the number of $\sigma \in 2^n$ for which $K(\sigma \restriction t) \le t - k$ is equal to
\[ 2^{n-t}\, |\{\rho \in 2^t \mid K(\rho) \le t - k\}| \le 2^{n-t}\, 2^{t-K(t)-k+c} = 2^{n-K(t)-k+c} = 2^m. \]
The point of the definition of $M$ is that if $K_{s+1}(t) = K(t)$ and $n, k, m$ are as above, then as long as $m \ge \frac n2 + c + 1$, for every $\sigma \in 2^n$ such that $K(\sigma \restriction t) \le t - k$, we have $C_M(\sigma) \le m$. Fix $k$. Since $A$ is not 1-random, there is a $t$ such that $K(A \restriction t) \le t - k$. We may assume that $t$ is large enough so that $K(t) \le 2^{t-1} - k - 1$. Let $s$ be least such that $K_{s+1}(t) = K(t)$, let $n = 2^{\langle s,t \rangle}$, and let $m = n - K(t) - k + c$. Then $m \ge n - 2^{t-1} + k + 1 - k + c \ge \frac n2 + c + 1$, because $n \ge 2^t$. Thus
$C_M(A \restriction n) \le m$. Furthermore, $G(n) = K_{s+1}(t) = K(t)$, so
\[ C(A \restriction n) \le C_M(A \restriction n) + O(1) \le m + O(1) = n - K(t) - k + c + O(1) \le n - G(n) - k + O(1), \]
where the constant is independent of $n$ and $k$. Because $k$ is arbitrary, (v) does not hold.

Too recently for a proof to be included in this book, Bienvenu, Merkle, and Nies [unpublished] have shown that the function $G$ in the above result can be replaced by any Solovay function, or indeed any co-c.e. function with the same properties as a Solovay function.

Theorem 6.7.3 (Bienvenu, Merkle, and Nies [unpublished]). The following are equivalent for a co-c.e. function $g$.

(i) $\sum_n 2^{-g(n)} < \infty$ and $\liminf_n g(n) - K(n) < \infty$.

(ii) A set $A$ is 1-random iff $C(A \restriction n) \ge n - g(n) - O(1)$.
6.8 n-randomness

There appear to be three natural ways to extend Martin-Löf's definition to obtain a notion of n-randomness for $n > 1$. One is simply to replace the $\Sigma^0_1$ classes in his definition by $\Sigma^0_n$ classes. Another is to notice the importance of the fact that the components of a Martin-Löf test are open, and hence replace them by open $\Sigma^0_n$ classes. The third is to view $\Sigma^0_1$ classes as generated by c.e. subsets of $2^{<\omega}$, and hence replace them by classes generated by $\Sigma^0_n$ subsets of $2^{<\omega}$, that is, by $\Sigma^{0,\emptyset^{(n-1)}}_1$ classes, hence equating n-randomness with 1-randomness relative to $\emptyset^{(n-1)}$. Fortunately, while these approaches may not yield the same tests, they do yield the same notion of randomness. The key to proving this fact is part (i) of Theorem 6.8.3 below. Before turning to that result, we need two lemmas. Recall the effective listings $S^n_0, S^n_1, \ldots$ of the $\Sigma^0_n$ classes given in Section 2.19.2.

Lemma 6.8.1 (Kurtz [228]). For each $n > 0$, the set $C_n = \{(i,q) : q \in \mathbb Q \wedge \mu(S^n_i) > q\}$ is $\Sigma^0_n$. Thus, the measures $\mu(S^n_i)$ are uniformly $\emptyset^{(n)}$-computable.

Proof. For $n = 1$ we use Proposition 2.19.7, which says that if $S$ is a $\Sigma^0_1$ class then we can effectively obtain a c.e. set $W \subseteq 2^{<\omega}$ such that $S = \llbracket W \rrbracket$. Thus $\mu(S) > q$ iff there is an $s$ such that $\sum_{\sigma \in W_s} 2^{-|\sigma|} > q$, which clearly implies that $C_1$ is c.e. If $S$ is a $\Sigma^0_{n+1}$ class then we can find $\Sigma^0_n$ classes $S^n_{i_0} \supseteq S^n_{i_1} \supseteq \cdots$ uniformly such that $S$ is the complement of $\bigcap_k S^n_{i_k}$. Then $\mu(S) = 1 - \lim_k \mu(S^n_{i_k})$, so $\mu(S) > q$ iff there is a $k$ and a rational $r > 0$ such that $\mu(S^n_{i_k}) \le 1 - q - r$,
that is, such that $(i_k, 1-q-r) \notin C_n$. By induction, $C_n$ is $\Sigma^0_n$, so $C_{n+1}$ is $\Sigma^0_{n+1}$.

Lemma 6.8.2. Let $f = \Phi^{\emptyset'}_e$ be a total function. Then $\bigcup_i S^1_{f(i)}$ is a $\Sigma^{0,\emptyset'}_1$ class, and an index for this class can be obtained effectively from $e$.

Proof. We can define a $\emptyset'$-c.e. set $W \subseteq 2^{<\omega}$ (whose index we can clearly obtain from $e$) as follows. First find c.e. sets $R_i \subseteq 2^{<\omega}$ such that $S^1_{f(i)} = \llbracket R_i \rrbracket$ for all $i$. Then let $W = \bigcup_i R_i$. From $W$ we can effectively pass to an index for $\bigcup_i S^1_{f(i)} = \llbracket W \rrbracket$ as a $\Sigma^{0,\emptyset'}_1$ class.

Theorem 6.8.3 (Kurtz [228], Kautz [200]).

(i) From an index of a $\Sigma^0_n$ class $S$ and $q \in \mathbb Q$, we can compute an index of a $\Sigma^{0,\emptyset^{(n-1)}}_1$ class $U \supseteq S$ such that $\mu(U) - \mu(S) < q$.

(ii) From an index of a $\Pi^0_n$ class $P$ and $q \in \mathbb Q$, we can compute an index of a $\Pi^{0,\emptyset^{(n-1)}}_1$ class $V \subseteq P$ such that $\mu(P) - \mu(V) < q$.
(iii) From an index of a $\Sigma^0_n$ class $S$ and $q \in \mathbb Q$, we can $\emptyset^{(n)}$-compute an index of a closed $\Pi^0_{n-1}$ class $V \subseteq S$ such that $\mu(S) - \mu(V) < q$. Moreover, if $\mu(S)$ is $\emptyset^{(n-1)}$-computable then an index of $V$ can be found $\emptyset^{(n-1)}$-computably.

(iv) From an index of a $\Pi^0_n$ class $P$ and $q \in \mathbb Q$, we can $\emptyset^{(n)}$-compute an index of an open $\Sigma^0_{n-1}$ class $U \supseteq P$ such that $\mu(U) - \mu(P) < q$. Moreover, if $\mu(P)$ is $\emptyset^{(n-1)}$-computable then an index of $U$ can be found $\emptyset^{(n-1)}$-computably.

Proof. For $n = 1$, (i) and (ii) are obvious. For (iii), let $W \subseteq 2^{<\omega}$ be a c.e. set such that $S = \llbracket W \rrbracket$, obtained effectively from $S$ as in Proposition 2.19.7, and let $S_s = \llbracket W_s \rrbracket$. Note that each $S_s$ is a union of finitely many basic clopen sets, and hence is a $\Pi^0_0$ class. Each $S \setminus S_s$ is a $\Sigma^0_1$ class, so by Lemma 6.8.1, we can $\emptyset'$-computably find an $s$ such that $\mu(S) - \mu(S_s) < q$. Let $V = S_s$. If $\mu(S)$ is computable, then $s$ can be found computably. Now (iv) follows by taking complements. We now turn to proving (i) and (iii) for the $n+1$ case. Items (ii) and (iv) then follow by taking complements. Let $S$ be a $\Sigma^0_{n+1}$ class and let $P_0 \subseteq P_1 \subseteq \cdots$ be uniformly $\Pi^0_n$ classes such that $S = \bigcup_i P_i$.
classes Ui ⊇ Pi (i) By induction, we can ∅(n) -compute indicesfor Σ0,∅ 1 such that μ(Ui ) − μ(Pi ) < 2−(i+1) q. Let U = i Ui . Since U is a union of Σ0,∅ 1
(n−1)
classes with ∅(n) -computable indices, by the relativized form of
Lemma 6.8.2, we can computably determine an index for U as a Σ0,∅ 1
(n)
class. Since $S \subseteq U$, we have $U \setminus S \subseteq \bigcup_i (U_i \setminus P_i)$, so
\[ \mu(U) - \mu(S) = \mu(U \setminus S) \le \sum_i \mu(U_i \setminus P_i) < \sum_i 2^{-(i+1)} q = q. \]
(iii) By Lemma 6.8.1, we can $\emptyset^{(n+1)}$-computably find an $i$ such that $\mu(P_i) \ge \mu(S) - \frac q2$. (While $\emptyset^{(n)}$ is enough to compute $\mu(P_i)$, we need $\emptyset^{(n+1)}$ to compute $\mu(S)$.) By induction, we can obtain an index of a $\Pi^{0,\emptyset^{(n-1)}}_1$ class $V \subseteq P_i$ such that $\mu(P_i) - \mu(V) < \frac q2$. Then $V \subseteq S$ and $\mu(S) - \mu(V) < q$. The only place we used $\emptyset^{(n+1)}$ as an oracle was to compute $\mu(S)$. Thus, if $\mu(S)$ is $\emptyset^{(n)}$-computable, then an index of $V$ can be computed from $\emptyset^{(n)}$.

We can now define n-randomness using the first approach mentioned above, and it will follow easily from the above result that this definition is equivalent to the ones arising from the other two approaches.

Definition 6.8.4 (Kurtz [228]¹⁴).

(i) A $\Sigma^0_n$ (Martin-Löf) test is a sequence $\{U_n\}_{n\in\omega}$ of uniformly $\Sigma^0_n$ classes such that $\mu(U_n) \le 2^{-n}$ for all $n$.

(ii) A set $A$ is n-random (or $\Sigma^0_n$-random) if for every $\Sigma^0_n$ test $\{U_n\}_{n\in\omega}$, we have $A \notin \bigcap_n U_n$.

(iii) We can similarly define $\Pi^0_n$ tests, $\Delta^0_n$ tests, and so on, and associated notions of randomness.

(iv) A set $A$ is arithmetically random if it is n-random for all $n$.

Notice that a $\Sigma^0_1$ test is a Martin-Löf test. As usual, we can relativize the notion of n-randomness to any oracle $X$ by using $\Sigma^{0,X}_n$ classes in place of $\Sigma^0_n$ classes. An open $\Sigma^0_n$ test is a $\Sigma^0_n$ test whose components are open.

Corollary 6.8.5 (Kautz [200]). The following are equivalent.

(i) The set $A$ is n-random.

(ii) For every open $\Sigma^0_n$ test $\{U_n\}_{n\in\omega}$, we have $A \notin \bigcap_n U_n$.

(iii) The set $A$ is 1-random relative to $\emptyset^{(n-1)}$.
class is an open Σ0n class. So it is enough to show that ¬(i) Σ0,∅ 1 implies ¬(iii). Suppose there is a Σ0n test {Un }n∈ω such that A ∈ n Un . By Theorem 6.8.3, for each n, we can effectively obtain a Σ0,∅ 1
(n−1)
class
14 The original definition in [228] used higher-level analogs of Solovay tests. It was proved equivalent to this definition by Kautz [200].
$V_n \supseteq U_{n+1}$ such that $\mu(V_n) \le \mu(U_{n+1}) + 2^{-(n+1)} \le 2^{-n}$. Then $\{V_n\}_{n\in\omega}$ is a Martin-Löf test relative to $\emptyset^{(n-1)}$, and $A \in \bigcap_n V_n$.
It is typically most useful to think of n-randomness as 1-randomness relative to $\emptyset^{(n-1)}$, as we can then apply the relativized form of results obtained for 1-randomness. For example, for each $n$, there are $\Delta^0_{n+1}$ sets that are n-random, while no $\Delta^0_{n+1}$ set can be (n+1)-random. Thus, for each $n$ there are n-random sets that are not (n+1)-random. As another example, we have the following version of Schnorr's Theorems 6.2.3 and 6.3.4, where a $\Sigma^0_n$ (super)martingale is one that is c.e. relative to $\emptyset^{(n-1)}$.

Corollary 6.8.6 (Kautz [200], after Schnorr [348, 349]). The following are equivalent.

(i) The set $A$ is n-random.

(ii) $K^{\emptyset^{(n-1)}}(A \restriction n) \ge n - O(1)$.

(iii) No $\Sigma^0_n$ (super)martingale succeeds on $A$.
We can define a relativized version of Ω by letting $\Omega^X$ be the halting probability of $U^X$. Then $\Omega^{\emptyset^{(n-1)}}$ is a somewhat natural example of an n-random real. Becher and Grigorieff [33] gave a number of examples of n-random reals based on output probabilities of (unrelativized) universal monotone machines.
6.9 Van Lambalgen's Theorem

In this section, we introduce a fundamental tool in the study of n-randomness. Suppose we have a random sequence. Intuitively, we would expect that if we pick out two subsequences that do not share elements, then these subsequences should be highly independent. Conversely, if we put together two sequences that are random relative to each other, we would expect the result to remain random. Van Lambalgen's Theorem confirms these intuitions for n-randomness. As we will see in Section 8.11.3, the analog of van Lambalgen's Theorem fails for other notions of effective randomness such as Schnorr, computable, and weak 1-randomness (notions of randomness introduced in Chapter 7), although, perhaps surprisingly, the "hard" direction (Theorem 6.9.2) does hold for these weaker notions of randomness.

For any relativizable notion of randomness R, we say that A and B are relatively R-random if A is R-random relative to B and B is R-random relative to A.

Theorem 6.9.1 (van Lambalgen [398]). If A ⊕ B is n-random then A and B are relatively n-random.
Proof. We do the n = 1 case. The general case follows by relativizing the proof to ∅^{(n−1)}. Suppose that A ⊕ B is 1-random. We show that B is 1-random relative to A, the other direction being symmetric. Suppose that B is not 1-random relative to A. Let U_0, U_1, ... be a universal oracle Martin-Löf test. Then B ∈ ⋂_i U^A_i. Let V_i = {σ ⊕ τ : τ ∈ U^σ_{2i} ∧ |σ| = i}. The V_i are uniformly Σ^0_1 classes and μ(V_i) ≤ 2^{-i} because μ(U^X_{2i}) ≤ 2^{-2i} for all X. Thus V_0, V_1, ... is a Solovay test. But A ⊕ B is in infinitely many V_i, since there are infinitely many pairs of initial segments σ ≺ A and τ ≺ B with τ ∈ U^σ_{2i} for some i. So A ⊕ B is not 1-random.

Theorem 6.9.2 (van Lambalgen [398]). If A is n-random and B is n-random relative to A, then A ⊕ B is n-random.

Proof. This proof is due to Nies (see [118]). Suppose that A ⊕ B is not n-random. We show that either A is not n-random or B is not n-random relative to A. Let V_0 ⊇ V_1 ⊇ ··· be uniformly Σ^0_n open sets such that A ⊕ B ∈ ⋂_i V_i and μ(V_i) ≤ 2^{-2i}, which exist by Corollary 6.8.5. We write σ ⊕ τ for the class of sets X = X_0 ⊕ X_1 such that σ ≺ X_0 and τ ≺ X_1. Let S_i be the open set generated by {σ : μ(V_i ∩ (σ ⊕ λ)) ≥ 2^{-i-|σ|}}. Clearly, S_{i+1} ⊆ S_i, and the S_i are uniformly Σ^0_n open sets. Let σ_0, σ_1, ... be a prefix-free set of generators for S_i. Since the sets V_i ∩ (σ_j ⊕ λ) are pairwise disjoint and μ(V_i) ≤ 2^{-2i}, we have Σ_j 2^{-i-|σ_j|} ≤ 2^{-2i}, and hence μ(S_i) ≤ Σ_j 2^{-|σ_j|} ≤ 2^{-i}. Thus, if A ∈ ⋂_i S_i then A is not n-random. Otherwise, there is a j such that A ∉ S_i for all i > j. For such i, let R^k_i be the open set generated by {τ ∈ 2^k : (A ↾ k) ⊕ τ ⊆ V_i}. Then μ(R^k_i) ≤ 2^{-i}, since A ∉ S_i. Moreover, since V_i is open, R^k_i ⊆ R^{k+1}_i. Let R_i = ⋃_k R^k_i. The R_i are open and uniformly Σ^0_n relative to A, and μ(R_i) = sup_k μ(R^k_i) ≤ 2^{-i} for i > j. Furthermore, B ∈ R_i for each i > j, so B is not n-random relative to A.
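The join A ⊕ B used throughout this section interleaves the two sequences on even and odd positions. A minimal sketch on finite strings (the function names are ours, purely for illustration):

```python
def join(a, b):
    """Interleave: (A ⊕ B)(2n) = A(n) and (A ⊕ B)(2n+1) = B(n)."""
    return "".join(x + y for x, y in zip(a, b))

def split(c):
    """Recover the two halves of a join: even positions, odd positions."""
    return c[0::2], c[1::2]

assert join("0110", "1001") == "01101001"
assert split("01101001") == ("0110", "1001")
```

Splitting recovers both halves exactly, which is why facts about A ⊕ B transfer cleanly to facts about A and B separately.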
(iii) B is n-random and A is n-random relative to B.
(iv) A and B are relatively n-random.

In particular, if A and B are both n-random, then A is n-random relative to B iff B is n-random relative to A.

The last part of van Lambalgen's Theorem may seem counterintuitive, as it says that if A is n-random, then there is no way to build a set B that is n-random relative to A without at the same time making A be n-random relative to B. It implies for instance that if A is Δ^0_2 and 1-random, and
B is 2-random, then A is 1-random relative to B, and also that every n-random set is n-random relative to measure 1 many sets. Nevertheless, it is one of the most useful basic properties of relative randomness. We note a few consequences of van Lambalgen's Theorem here, and will use it several times below.

Corollary 6.9.4. If A ⊕ B is 1-random, then A |_T B, so A, B <_T A ⊕ B.
6.10 Effective 0-1 laws

In this section, we prove effective versions of measure-theoretic 0-1 laws such as Kolmogorov's 0-1 Law (Theorem 1.2.4).

Lemma 6.10.1 (Kučera [215]). Let P be a Π^0_1 class of positive measure. If A is 1-random then some translate of A lies in P. That is, there are σ ∈ 2^{<ω} and B ∈ P such that A = σB.

Proof. Suppose that for every σ and B, if A = σB then B ∉ P. Let W ⊂ 2^{<ω} be a prefix-free c.e. set such that P is the complement of the open set generated by W. Let S_0 = W and S_{n+1} = {στ : σ ∈ S_n ∧ τ ∈ W}. Let U_n be the open set generated by S_n. The U_n are uniformly Σ^0_1. Furthermore, μ(U_0) < 1 and μ(U_{n+1}) = μ(U_n)μ(U_0) = μ(U_0)^{n+2}. Thus we can pick out a Martin-Löf test U_{n_0}, U_{n_1}, .... Since we are assuming that if A = σB then B ∉ P, we have A ∈ ⋂_n U_n. Thus A is not 1-random.

Theorem 6.10.2 (Kautz [200, 201]). Let n > 0 and let P be a Π^0_n class of positive measure. If A is n-random then there are σ ∈ 2^{<ω} and B ∈ P such that A = σB. Thus, P contains a member of each n-random degree.

Proof. By Theorem 6.8.3, there is a Π^{0,∅^{(n−1)}}_1 class V ⊆ P of positive measure. By the relativized form of Lemma 6.10.1, there are σ ∈ 2^{<ω} and B ∈ V ⊆ P such that A = σB.

Theorem 8.21.2 below will show that this result cannot be extended to the notion of weak n-randomness introduced in the next chapter.

Corollary 6.10.3 (Kautz [200, 201]). Let n > 0 and let C be a Σ^0_{n+1} or Π^0_{n+1} class that is degree invariant, or even just closed under translations. Then C contains either all n-random sets or no n-random sets.
Proof. It is enough to prove the result for a Σ^0_{n+1} class C, since the result for a Π^0_{n+1} class then follows by taking its complement. Let A be n-random. By Kolmogorov's 0-1 Law, C has measure 0 or 1. If C has measure 1, then it is a union of Π^0_n classes, at least one of which has positive measure. By Theorem 6.10.2, this class must contain a translation of A, and hence C must contain A. If C has measure 0 then it is a union of Π^0_n classes of measure 0. For any such class P, using Theorem 6.8.3, we can ∅^{(n−1)}-computably find Σ^{0,∅^{(n−1)}}_1 classes U_0, U_1, ... such that μ(U_i) ≤ 2^{-i} and P ⊆ ⋂_i U_i. Then U_0, U_1, ... is a Martin-Löf test relative to ∅^{(n−1)}. Thus A ∉ ⋂_i U_i, and hence A ∉ P. So A ∉ C.

Note that this result implies that a Σ^0_{n+1} or Π^0_{n+1} class of measure 0 cannot contain any n-random sets. We will see several important applications of 0-1 laws below, such as Theorem 8.21.8 that every 2-random set is c.e. in some set of strictly smaller degree.
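The measure computation at the heart of the proof of Lemma 6.10.1 — concatenating a prefix-free set W with itself multiplies the measures, so μ(U_n) = μ(U_0)^{n+1} → 0 — can be checked concretely. The particular W below is our own toy example:

```python
from fractions import Fraction

def measure(S):
    """μ of the open set generated by a prefix-free set of strings."""
    return sum(Fraction(1, 2 ** len(s)) for s in S)

W = {"00", "01", "10"}       # prefix-free, with measure 3/4 < 1
S = set(W)                   # S_0 = W
for n in range(4):
    # μ(U_n) = μ(U_0)^{n+1}: prefix-free concatenations decompose uniquely
    assert measure(S) == measure(W) ** (n + 1)
    S = {s + t for s in S for t in W}   # S_{n+1} = {στ : σ ∈ S_n, τ ∈ W}
```

Since the measures tend to 0, a subsequence of the U_n forms a Martin-Löf test, which is exactly how the lemma derives its contradiction.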
6.11 Infinitely often maximally complex sets

One might wonder whether there is a plain complexity characterization of 1-randomness simpler than the ones in Theorem 6.7.2. We have seen that we cannot have C(A ↾ n) ≥ n − O(1) for all n, so the following weakening of this condition is a natural candidate.

Definition 6.11.1. A set A is infinitely often C-random if there are infinitely many n such that C(A ↾ n) ≥ n − O(1).

Martin-Löf [260] noted that being infinitely often C-random implies being 1-random.^{15} An easy proof of this fact was found by Nies, Stephan, and Terwijn [308]: By Proposition 3.10.4, we know that C(στ) ≤ K(σ) + C(τ) + O(1), and hence C(στ) ≤ K(σ) + |τ| + O(1). If A is not 1-random, then for each d there is an n with K(A ↾ n) < n − d. If m > n then, by the above, C(A ↾ m) ≤ K(A ↾ n) + (m − n) + O(1) < m − d + O(1). Since d is arbitrary, A is not infinitely often C-random.

As we will see, the notion of being infinitely often C-random turns out to be too strong to capture 1-randomness. In fact, it is a plain complexity characterization of 2-randomness. Before proving this remarkable fact, we note the following result.

Theorem 6.11.2 (after Martin-Löf [260], see Li and Vitányi [248]). The following are equivalent.
(i) The set A is infinitely often C-random.

^{15} Though Martin-Löf [260] looked at sets A such that C(A ↾ n | n) ≥ n − O(1), which by Theorem 6.11.2 is equivalent to being infinitely often C-random.
(ii) There are infinitely many n such that C(A ↾ n | n) ≥ n − O(1).

Proof. Obviously (ii) implies (i). Now suppose that there are infinitely many n such that C(A ↾ n) ≥ n − O(1). We have C(A ↾ n) ≤ C(A ↾ n | n) + 2 log(|n − C(A ↾ n | n)|) + O(1), since the 2 log(|n − C(A ↾ n | n)|) term allows room for a description of n given C(A ↾ n | n). So there are infinitely many n such that n − C(A ↾ n | n) ≤ 2 log(|n − C(A ↾ n | n)|) + O(1). For such n, we have n − C(A ↾ n | n) ≤ O(1).

To show that being infinitely often C-random is equivalent to being 2-random, Nies, Stephan, and Terwijn [308] began with the following result.

Theorem 6.11.3 (Nies, Stephan, and Terwijn [308]). If a set is infinitely often C-random then it is 2-random.

Proof. Suppose that A is not 2-random. Then there are n_0, n_1, ... such that K^{∅'}(A ↾ n_i) < n_i − i for all i. Let σ_i be a minimal length U^{∅'}-program for A ↾ n_i. Let s_i > n_i be such that U^{∅'}(σ_i)[s_i] ↓ with ∅'-correct use. (That is, for the use u of this computation, ∅'_{s_i} ↾ u = ∅' ↾ u.)

Let M be a Turing machine defined as follows. On input τ, first M searches for σ and μ such that σμ = τ and U^{∅'}(σ)[|τ|] ↓. If there is such a σ (which must then be unique), let ν = U^{∅'}(σ)[|τ|]. Then M outputs νμ.

Fix i and let t ≥ s_i. Let μ = A(n_i) ... A(t − 1). Then M(σ_i μ) ↓ = (A ↾ n_i)μ = A ↾ t, so C(A ↾ t) ≤ |σ_i μ| + O(1) < t − i + O(1). Since for each i this fact holds for all sufficiently large t, we conclude that A is not infinitely often C-random.

Let V be a universal Turing machine. For a function g, let C^g(σ) = min{|τ| : V(τ)[g(σ)] ↓ = σ}. (We consider only functions that are sufficiently fast growing so that this set is not empty for any σ.) Analyzing the above proof, we see that M has quite low running time. Thus, there is a computable function g such that if there are infinitely many n with C^g(A ↾ n) ≥ n − O(1), then A is 2-random. Indeed, this fact is true of all sufficiently fast growing g. By an appropriate choice of universal Turing machine, there is a constant c such that we can take g(n) = n^2 + c.

Miller [276] proved the converse to Theorem 6.11.3, thus establishing the equivalence between being infinitely often C-random and being 2-random. Nies, Stephan, and Terwijn [308] obtained the same result independently, with a somewhat simpler proof that we now present.

Definition 6.11.4 (Nies, Stephan, and Terwijn [308]). A compression function is a one-to-one function F : 2^{<ω} → 2^{<ω} such that |F(σ)| ≤ C(σ) for all σ.

Lemma 6.11.5 (Nies, Stephan, and Terwijn [308]). There is a low compression function.
Proof. Let V be the universal Turing machine relative to which we define C. Let P be the Π^0_1 class of graphs of partial functions extending V. Applying the low basis theorem, let A be a low element of P and let f be the partial function determined by A. Let F(σ) be the length-lexicographically least τ such that f(τ) = σ. Note that such a τ always exists because f extends V, so F is total, and F is clearly one-to-one. Furthermore, F ≤_T A, so F is low, and F(σ) is at least as short as the shortest V-program for σ, so |F(σ)| ≤ C(σ) for all σ.

Theorem 6.11.6 (Miller [276], Nies, Stephan, and Terwijn [308]). A set is 2-random iff it is infinitely often C-random.

Proof. By Theorem 6.11.3, it is enough to assume that A is not infinitely often C-random and show that A is not 2-random. Let F be a low compression function. By the definition of compression function, lim_n (n − |F(A ↾ n)|) ≥ lim_n (n − C(A ↾ n)) = ∞. Let P_{i,m} = {X : ∀n ≥ m (|F(X ↾ n)| < n − i)}, and let U_i = ⋃_m P_{i,m}. The U_i are uniformly Σ^0_2 relative to F. Furthermore,

μ(P_{i,m}) ≤ μ({X : |F(X ↾ m)| < m − i}) = 2^{-m} |{σ ∈ 2^m : |F(σ)| < m − i}| < 2^{-m} 2^{m−i} = 2^{-i}.

Since P_{i,0} ⊆ P_{i,1} ⊆ ···, it follows that μ(U_i) ≤ 2^{-i}. Thus U_0, U_1, ... is a Σ^0_2 test relative to F, and A ∈ ⋂_n U_n. Thus A is not 2-random relative to F, and hence is not 1-random relative to F'. But F is low, so A is not 1-random relative to ∅', and hence is not 2-random.
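The counting step in this proof uses only that F is one-to-one: there are just 2^{m−i} − 1 strings of length less than m − i available as images, so fewer than a 2^{-i} fraction of the strings in 2^m can be mapped below m − i. A quick check of the arithmetic:

```python
from fractions import Fraction

# |{τ : |τ| < m - i}| = 1 + 2 + ··· + 2^{m-i-1} = 2^{m-i} - 1, so an
# injective F sends fewer than 2^{m-i} strings of length m below m - i.
def compressible_fraction_bound(m, i):
    short_images = 2 ** (m - i) - 1
    return Fraction(short_images, 2 ** m)

for m in range(2, 12):
    for i in range(1, m):
        assert compressible_fraction_bound(m, i) < Fraction(1, 2 ** i)
```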
By the comment following the proof of Theorem 6.11.3, we also have the following result, which can be a useful tool in studying 2-randomness. (We will give an example of its application in Section 8.21.1.)

Theorem 6.11.7 (Nies, Stephan, and Terwijn [308]). For every sufficiently fast growing computable function g, a set A is 2-random iff there are infinitely many n such that C^g(A ↾ n) ≥ n − O(1).

By an appropriate choice of universal Turing machine, there is a c such that we can take "sufficiently fast growing" to mean that g(n) ≥ n^2 + c.

Turning to prefix-free complexity, we have the following analog of Definition 6.11.1.

Definition 6.11.8. A set A is infinitely often strongly K-random if there are infinitely many n such that K(A ↾ n) ≥ n + K(n) − O(1).

By Corollary 4.3.3, if a set is infinitely often strongly K-random then it is infinitely often C-random, and hence 2-random. The following result is a partial converse.

Lemma 6.11.9 (Yu, Ding, and Downey [415], after Solovay [371]). If a set is 3-random then it is infinitely often strongly K-random.
Proof. Let P_{i,m} = {A : ∀n ≥ m (K(A ↾ n) < n + K(n) − i)} and let U_i = ⋃_m P_{i,m}. Since ∅' can compute K, the U_i are uniformly Σ^0_3 classes. Furthermore, by Theorem 3.7.6,

μ(P_{i,m}) ≤ 2^{-m} |{σ ∈ 2^m : K(σ) < m + K(m) − i}| = O(2^{-i}).

Since P_{i,0} ⊆ P_{i,1} ⊆ ···, it follows that μ(U_i) = O(2^{-i}). Thus there is a c such that U_c, U_{c+1}, ... is a Σ^0_3 test. If A is not infinitely often strongly K-random, then A ∈ ⋂_{i≥c} U_i, so A is not 3-random.
It had been a very interesting open question whether being infinitely often strongly K-random is equivalent to being 2-random, to being 3-random, or to neither. This problem was solved by Miller [277], who showed that a set is 2-random iff it is infinitely often strongly K-random. It follows that a set is infinitely often C-random iff it is infinitely often strongly K-random. We will prove Miller's result as Theorem 15.6.5. We delay the proof because it relies on notions of lowness not introduced until later in the book.

It is natural to ask whether there are unrelativized complexity characterizations (using either plain or prefix-free complexity) of 3-randomness, or even of n-randomness for larger n. Such characterizations can be obtained using the results in Section 10.2.4 below. Theorem 10.2.10, which is due to Vereshchagin [399], states that C^{∅'}(σ) = lim sup_n C(σ | n) ± O(1). Using this limit complexity result, we can iterate the plain complexity characterization of 1-randomness given in Theorem 6.7.2, in relativized form, to characterize n-randomness in terms of unrelativized plain complexity. Theorem 10.2.11, which is due to Bienvenu, Muchnik, Shen, and Vereshchagin [42] and states that K^{∅'}(σ) = lim sup_n K(σ | n) ± O(1), can be similarly used to give an unrelativized prefix-free complexity characterization of n-randomness. Whether there are simpler unrelativized complexity characterizations of n-randomness is open.
6.12 Randomness for other measures and degree-invariance

6.12.1 Generalizing randomness to other measures

We have restricted our development so far to the context of the uniform measure μ, but it is straightforward to adapt our definitions and many results to any other probability measure ν on Cantor space. When doing so, however, we need to take into account the complexity of ν. In this context, it is best to think of probability measures as generated by premeasures.
Definition 6.12.1. A (probability) premeasure is a function ρ : 2^{<ω} → ℝ^{≥0} such that ρ(λ) = 1 and ρ(σ0) + ρ(σ1) = ρ(σ) for all σ.^{16}

For a premeasure ρ, define μ*_ρ by letting μ*_ρ(A) be the infimum of Σ_{σ∈U} ρ(σ) over all U ⊆ 2^{<ω} such that A is contained in the open set generated by U. Then μ*_ρ is an outer measure. Let μ_ρ be the corresponding measure. We have μ_ρ(σ) = ρ(σ) for all σ. For instance, if ρ(σ) = 2^{-|σ|} for all σ, then μ_ρ = μ. See for instance Royden [335] for more on basic measure theory.

Definition 6.12.2. The rational representation of a premeasure ρ is the set {(σ, r, q) : ρ(σ) ∈ (r, q)} ⊂ 2^{<ω} × ℚ × ℚ. We identify ρ with this representation, which allows us to talk about the degree of ρ and relativize concepts to ρ. A measure ν on 2^ω is computable if ν = μ_ρ for some computable ρ.

We can now make the following definition. Many other definitions and results can be adapted in the same way (for instance, to define n-μ_ρ-randomness).

Definition 6.12.3. Let ρ be a premeasure.
(i) A Martin-Löf μ_ρ-test is a sequence {U_n}_{n∈ω} of uniformly Σ^{0,ρ}_1 classes such that μ_ρ(U_n) ≤ 2^{-n} for all n.
(ii) A class C ⊂ 2^ω is Martin-Löf μ_ρ-null if there is a Martin-Löf μ_ρ-test {U_n}_{n∈ω} such that C ⊆ ⋂_n U_n.
(iii) A set A ∈ 2^ω is Martin-Löf μ_ρ-random, or 1-μ_ρ-random, if {A} is not Martin-Löf μ_ρ-null.^{17}

When we are given a measure ν, we assume we have a particular premeasure ρ such that ν = μ_ρ, and hence write simply "1-ν-random". This notation makes particular sense when ν is computable, in which case we always assume that the corresponding ρ is computable, making the classes in the definition of Martin-Löf ν-test actual Σ^0_1 classes.

Clearly, randomness is not invariant under change of measure. For example, let ν be defined by letting ν(σ0) = ν(σ)/3 and ν(σ1) = 2ν(σ)/3. Then no 1-random set is 1-ν-random.
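A sketch of this ν as a computable premeasure, checking the additivity condition ρ(σ0) + ρ(σ1) = ρ(σ) exactly with rational arithmetic (the function name is ours):

```python
from fractions import Fraction

def nu(sigma):
    """Premeasure with ν(σ0) = ν(σ)/3 and ν(σ1) = 2ν(σ)/3."""
    r = Fraction(1)                              # ν(λ) = 1
    for bit in sigma:
        r *= Fraction(1, 3) if bit == "0" else Fraction(2, 3)
    return r

for s in ["", "0", "01", "110", "1011"]:
    assert nu(s + "0") + nu(s + "1") == nu(s)    # additivity
assert nu("1") == Fraction(2, 3)
```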
But in this book we will often be more interested in degrees than in individual sets, which leads to the natural question of whether the 1-random and 1-ν-random degrees are the same.

Even among computable measures, there are some that will lead to different 1-random degrees than μ. For example, we can simply concentrate

^{16} These are of course related to the continuous semimeasures of Section 3.16.
^{17} It seems reasonable to incorporate the complexity of ρ into the definition of 1-μ_ρ-randomness, but this approach is not without its drawbacks for noncomputable ρ. See Reimann [326] for further discussion of this issue. We will not deal with noncomputable measures except in Section 8.7.
all our measure on a computable set, for instance defining ν(0^n) = 1 for all n and ν(σ) = 0 for all other σ. Then 0^ω is a computable 1-ν-random set. To isolate this kind of example, define a measure ν to be atomic if ν({A}) > 0 for some A ∈ 2^ω. We call such an A an atom of ν. A nonatomic measure is called a continuous measure. Our example above is an extreme example of an atomic measure. In particular, it is a trivial measure, which is a measure ν such that Σ_{A∈𝒜} ν({A}) = 1, where 𝒜 is the collection of atoms of ν.

As it turns out, computability and continuity are enough to ensure invariance. In the following subsection, we will prove that if ν is computable and continuous, then a degree is 1-ν-random iff it is 1-random. As this material is not central to this book, we will do no more than touch on it. Considerations arising from working with measures other than the uniform measure are more fully discussed in van Lambalgen [397], Li and Vitányi [248], and Reimann [326].

One further use we will make of this material, however, is in Section 8.6, where we will prove Demuth's remarkable theorem that if A is tt-below a 1-random set, then the wtt-degree of A contains a 1-random set, which can be seen as a kind of downwards closure of 1-randomness. We will also revisit it briefly in Section 8.7, where we will prove the result due to Reimann and Slaman [327] that if A is noncomputable then there is a ν such that A is 1-ν-random and A is not an atom of ν, and will discuss related results.
6.12.2 Computable measures and representing reals

In this subsection, we will show that if ν is computable and continuous, then a degree is 1-ν-random iff it is 1-random. This result was first proved by Levin (see [425]), but also appears in Kautz' dissertation [200], whose account we follow.

The key intuition we will exploit is that the identification of 2^ω with [0, 1] actually depends on the measure. That is, which subsets of 2^ω we should identify with what intervals depends on the choice of measure. In the usual representation, if A ∈ 2^ω begins with 10, say, then the 1 corresponds to the fact that the real α identified with A is in [1/2, 1] and the 0 to the fact that it is in [1/2, 3/4]. That is, looking at the first n bits of A corresponds to knowing α to within 2^{-n}, by computing an interval of length 2^{-n} within which α must lie. Now suppose that ν is the measure mentioned above corresponding to the distribution where 1's are twice as likely as 0's (that is, the measure defined by ν(σ0) = ν(σ)/3 and ν(σ1) = 2ν(σ)/3). Then the first bit of A being 1 indicates that the real identified with A is in [1/3, 1], while the second bit being 0 indicates that it is in [1/3, 5/9].

Thus we have the following definition, where from now on we fix a computable probability measure ν on 2^ω and a corresponding computable premeasure ρ.
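A sketch of this correspondence: under ν, each successive bit splits the current interval in proportion 1 : 2 rather than 1 : 1. (The function name is ours, for illustration.)

```python
from fractions import Fraction

def interval_nu(sigma):
    """The interval of reals whose ν-representation begins with σ,
    for ν(σ0) = ν(σ)/3 and ν(σ1) = 2ν(σ)/3."""
    a, b = Fraction(0), Fraction(1)
    for bit in sigma:
        cut = a + (b - a) / 3          # the 0-side gets a third of the mass
        a, b = (a, cut) if bit == "0" else (cut, b)
    return a, b

assert interval_nu("1") == (Fraction(1, 3), Fraction(1))
assert interval_nu("10") == (Fraction(1, 3), Fraction(5, 9))
```

The interval for σ has length ν(σ), so knowing n bits of the ν-representation locates the real to within ν-mass rather than to within 2^{-n}.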
Definition 6.12.4. For a string σ, let L_σ = {τ : |τ| = |σ| ∧ τ <_lex σ} [...] Suppose ν({A}) = ε > 0. Let T = {σ : ρ_{|σ|}(σ) > ε − 2^{-|σ|}}. Then T is a computable tree and A ∈ [T]. But if B ∈ [T] then ν({B}) ≥ ε, so [T] is finite. Hence A is an isolated path in a computable tree, and hence is computable.
If α is computable, then so is seq_ν(α), but if ν is atomic and α is noncomputable, it is still possible for seq_ν(α) to be an atom of ν and hence computable. If ν is continuous, however, then that cannot be the case, so we have the following.

Corollary 6.12.8 (Kautz [200], Levin, see [425]). If ν is continuous and seq_ν(α) is defined then seq_ν(α) ≡_T α.

The following is the main result of this subsection.

Theorem 6.12.9 (Levin, see [425], see also Kautz [200]).
(i) If α is 1-random, then seq_ν(α) is 1-ν-random.
(ii) If seq_ν(α) is 1-ν-random and noncomputable, then α is 1-random.
(iii) If ν is continuous and a > 0 then a contains a 1-random set iff a contains a 1-ν-random set.

Proof. (i) Suppose that seq_ν(α) is not 1-ν-random. Let {U_i}_{i∈ω} be a Martin-Löf ν-test covering seq_ν(α). Let V_0, V_1, ... be uniformly c.e. sets of generators for the U_i. We will define a Martin-Löf (μ-)test {Û_i}_{i∈ω} covering α, thus showing that α is not 1-random. For each i and k, proceed as follows. Let σ be the kth string in the enumeration of V_{i+1}. Compute τ_0, ..., τ_n such that (σ)_{ν,k+i+3} = ⋃_{j≤n} (τ_j)_μ (which is possible because the endpoints of (σ)_{ν,k+i+3} are dyadic rationals). Put each τ_j into Û_i. Clearly the Û_i are uniformly Σ^0_1, and

μ(Û_i) ≤ μ(U_{i+1}) + Σ_k 2^{-(k+i+2)} ≤ 2^{-(i+1)} + 2^{-(i+1)} = 2^{-i}.

Thus the Û_i form a Martin-Löf test. For each i, there is a σ ∈ V_{i+1} such that σ ≺ seq_ν(α). Say that σ is the kth string in the enumeration of V_{i+1}. Then α ∈ (σ)_ν ⊆ (σ)_{ν,k+i+3}, so α ∈ Û_i. Thus α ∈ ⋂_i Û_i.

(ii) Now suppose that α is not 1-random and seq_ν(α) is not computable. Let {U_i}_{i∈ω} be a Martin-Löf test covering α. Let V_0, V_1, ... be uniformly c.e. sets of generators for the U_i. For each i and k, proceed as follows. Let σ be the kth string in the enumeration of V_{i+1}. Let p and q be such that [p, q] = (σ)_μ, and let I_k = [p − 2^{-(i+k+3)}, q + 2^{-(i+k+3)}]. Let

Û_i = ⋃_k {τ : ∃s ((τ)_{ν,s} ⊆ I_k)}.

Clearly the Û_i are uniformly Σ^0_1. Since ν(τ) = μ((τ)_ν),

ν(Û_i) ≤ Σ_k μ(I_k) ≤ μ(U_{i+1}) + Σ_k 2^{-(i+k+2)} ≤ 2^{-(i+1)} + 2^{-(i+1)} = 2^{-i}.

Thus the Û_i form a Martin-Löf ν-test, and we are left with showing that seq_ν(α) ∈ Û_i for all i.
The only difficulty here is that ν may be atomic, but we can use the hypothesis that seq_ν(α) is not computable. Fix i and let σ ∈ V_{i+1} be such that σ ≺ α. Suppose σ is the kth string enumerated into V_{i+1}. Since seq_ν(α) is not computable, it is not an atom of ν, so there is a τ ≺ seq_ν(α) such that μ((τ)_ν) = ν(τ) < 2^{-(i+k+4)}. Then there is an s such that μ((τ)_{ν,s}) < 2^{-(i+k+3)}. Since α ∈ (τ)_ν ⊆ (τ)_{ν,s}, letting I_k be as above, we have (τ)_{ν,s} ⊂ [α − 2^{-(i+k+3)}, α + 2^{-(i+k+3)}] ⊂ I_k, whence τ is enumerated into Û_i, and so seq_ν(α) ∈ Û_i.

(iii) now follows from Lemmas 6.12.6 and 6.12.7.

For atomic measures ν, the behavior of 1-ν-random sets can be strange. For instance, Kautz [200] showed that there is a nontrivial computable measure ν such that no noncomputable Δ^0_2 set is 1-ν-random. (Recall that ν is trivial if Σ_{A∈𝒜} ν({A}) = 1, where 𝒜 is the collection of atoms of ν.) However, once we get to the level of 2-randomness, such problems disappear.

Theorem 6.12.10 (Kautz [200]). Let ν and ν' be nontrivial computable measures. Then every noncomputable degree containing a 2-ν'-random set contains a 2-ν-random set.

Proof. Let A >_T ∅ be 2-ν'-random. Let B ∈ ⋂_n (A ↾ n)_{ν'}. Then A = seq_{ν'}(B), since A is not computable. Thus, by the relativized form of Theorem 6.12.9, B is 2-random. It is enough to show that there is a 2-ν-random set C ≡_T B.

Let S = {α : seq_ν(α) ↓ ∧ ∀ε > 0 ∃n (ν(seq_ν(α) ↾ n) < ε)}. It is easy to see that μ(S) > 0, since ν is nontrivial. Let {U_i}_{i∈ω} be a universal Martin-Löf test. Let P_i be the complement of U_i, and let k be large enough so that μ(P_k) > 1 − μ(S). Now, P_k is a Π^0_1 class containing only 1-random sets, and if α is 1-random then α is noncomputable, and hence seq_ν(α) is defined. Thus

P_k ∩ S = P_k ∩ {α : ∀ε > 0 ∃n (ν(seq_ν(α) ↾ n) < ε)},

which is clearly a Π^0_2 class. Since this class has positive measure, by Theorem 6.10.2 it contains an α ≡_T B. Let C = seq_ν(α). By Theorem 6.12.9, C is 2-ν-random. Furthermore, by Lemma 6.12.6, C ≡_T α ≡_T B, since the fact that α ∈ S implies that C is not an atom of ν.
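The decomposition step in the proof of Theorem 6.12.9(i) — writing an interval with dyadic rational endpoints as a finite disjoint union of basic intervals (τ)_μ — can be sketched greedily. This is our own rendering of that step, not the book's algorithm (function name ours):

```python
from fractions import Fraction

def dyadic_cover(p, q):
    """Split [p, q), with p and q dyadic rationals, into maximal basic
    intervals [a, a + 2^{-k}) aligned to the dyadic grid."""
    pieces = []
    while p < q:
        k = 0
        while True:
            w = Fraction(1, 2 ** k)
            if p % w == 0 and p + w <= q:   # grid-aligned and fits
                pieces.append((p, p + w))
                p += w
                break
            k += 1
    return pieces

cover = dyadic_cover(Fraction(1, 4), Fraction(7, 8))
assert sum(b - a for a, b in cover) == Fraction(5, 8)
```

Termination is guaranteed because both endpoints are dyadic: once 2^{-k} divides the common denominator, the piece fits.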
7 Other Notions of Algorithmic Randomness
In [348, 349], Schnorr analyzed his characterization of 1-randomness in terms of c.e. martingales (Theorem 6.3.4). He argued that this result demonstrates a failure of 1-randomness to capture the intuition behind Martin-Löf's original definition, because effective randomness should be concerned with defeating computable betting strategies rather than computably enumerable ones, as the latter are fundamentally asymmetric, in the same way that a c.e. set is semi-decidable rather than decidable.^1 One can make a similar argument about Martin-Löf tests being effectively null (in the sense that we know how fast their components converge to zero), but not effectively given, in the sense that the test sets themselves are not computable, but only Σ^0_1.^2 We will begin this chapter by looking at two natural notions of effective randomness introduced by Schnorr [348, 349] with this critique in mind. We will then discuss several further notions, including weak n-randomness, introduced by Kurtz [228], and nonmonotonic randomness, which modifies the concept of betting strategies to allow for betting on the bits of a sequence out of order.
^1 In Section 6.3.4, we saw how to replace c.e. martingales by computable martingale processes. Schnorr did not have this result available, of course, and one might use it to argue against his critique, particularly in light of Theorem 6.3.19.
^2 Some confusion may have been created by the fact that every Σ^0_1 class has a computable set of generators, but there is a distinction to be made between general Σ^0_1 classes and those with computable measures. The former behave much more like c.e. objects than like computable ones.
R.G. Downey and D. Hirschfeldt, Algorithmic Randomness and Complexity, Theory and Applications of Computability, DOI 10.1007/978-0-387-68441-3_7, © Springer Science+Business Media, LLC 2010
7.1 Computable randomness and Schnorr randomness

7.1.1 Basics
Since lim supn d(A n) < ∞, there are rationals r < s such that there are 3 Miniaturizations
of the notion of computable randomness form the basis of the theory of resource-bounded measure; see for instance Ambos-Spies and Mayordomo [10] and Lutz [251].
infinitely many n with d(A ↾ n) < r and infinitely many n with d(A ↾ n) > s. We define a new computable martingale d̂, beginning with d̂(λ) = 1. Along a sequence B, we bet as follows. We bet evenly until (if ever) we find an n_0 such that d(B ↾ n_0) < r, then follow d's bets until (if ever) we find an n_1 such that d(B ↾ n_1) > s, then bet evenly until we find an n_2 such that d(B ↾ n_2) < r, and so on. If B = A then we will find such n_0, n_1, .... Between A ↾ n_{2i} and A ↾ n_{2i+1}, we increase our capital by a factor greater than s/r, while between A ↾ n_{2i+1} and A ↾ n_{2i+2}, we do not change our capital. Thus d̂ succeeds on A, contradicting the computable randomness of A.

According to Schnorr, however, computable randomness is not quite effective enough, because given a set that is not computably random, we do have a computable strategy for making arbitrarily much money while betting on its bits, but we may not be able to know ahead of time how long it will take us to increase our capital to a given amount. In a way, this situation is similar to having a computable approximation to a real without necessarily having a computable modulus of convergence, which we know is a characterization of the real being Δ^0_2, rather than computable. Thus we have the following weaker notion of randomness.

Definition 7.1.4 (Schnorr [348, 349]).
(i) An order is an unbounded nondecreasing function from ℕ to ℕ.^4 For a (super)martingale d and an order h, let S_h[d] = {X : lim sup_n d(X ↾ n)/h(n) = ∞}. We say that this (super)martingale h-succeeds on the elements of S_h[d], and call S_h[d] the h-success set of d.
(ii) A set A is Schnorr random if A ∉ S_h[d] for every computable martingale d and every computable order h.

Note that if lim sup_n d(X ↾ n)/g(n) > 0 for an order g, then X ∈ S_h[d] for any sufficiently slower growing order h. One might wonder whether the limsup in the above definition should not be a liminf, which would mean we really know how fast we can increase our capital.
(Indeed, in that case, for each X we could replace h by a suitable finite modification g such that if lim inf_n d(X ↾ n)/h(n) = ∞ then d(X ↾ n) ≥ g(n) for all n.) As it turns out, making this modification leads to a characterization of the notion of weak 1-randomness, as we will see in Theorem 7.2.13. This notion is strictly weaker than Schnorr randomness, and is in fact weak enough that it is questionable whether it is really a notion of randomness at all (see Section 8.20.2).

Schnorr randomness also has a pleasing characterization in terms of tests.

^4 An "Ordnungsfunktion" in Schnorr's terminology is always computable, but we prefer to leave the complexity of orders unspecified in general.
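Looking back at the proof of Proposition 7.1.2, the step of choosing rationals r_0, r_1 with (r_0 + r_1)/2 = f(σ) and d(σi) < r_i < d(σi) + 1 can be made fully explicit: the two constraints cut out a nonempty open interval for r_0 whenever d(σ) < f(σ) < d(σ) + 1. A sketch of that step (the interval arithmetic and the helper name `split` are ours):

```python
from fractions import Fraction

def split(f_sigma, d0, d1):
    """Given f(σ) with d(σ) < f(σ) < d(σ) + 1, where 2 d(σ) = d0 + d1,
    return rationals r0, r1 with (r0 + r1)/2 = f(σ) and di < ri < di + 1."""
    two_f = 2 * f_sigma
    lo = max(d0, two_f - d1 - 1)     # r0 > d0 and r1 = two_f - r0 < d1 + 1
    hi = min(d0 + 1, two_f - d1)     # r0 < d0 + 1 and r1 > d1
    assert lo < hi                   # nonempty since d0+d1 < two_f < d0+d1+2
    r0 = (lo + hi) / 2               # take the midpoint, a rational
    return r0, two_f - r0

r0, r1 = split(Fraction(3, 2), Fraction(1), Fraction(1, 2))
assert (r0 + r1) / 2 == Fraction(3, 2)
assert Fraction(1) < r0 < Fraction(2) and Fraction(1, 2) < r1 < Fraction(3, 2)
```

Applying this at every node, starting from a rational f(λ) ∈ (d(λ), d(λ)+1), produces the rational-valued martingale f of the proposition.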
7. Other Notions of Algorithmic Randomness
Definition 7.1.5 (Schnorr [349]). (i) A Martin-Löf test {U_n}_{n∈ω} is a Schnorr test if the measures μ(U_n) are uniformly computable.
(ii) A class C ⊆ 2^ω is Schnorr null if there is a Schnorr test {U_n}_{n∈ω} such that C ⊆ ∩_n U_n.
It is sometimes convenient to take an alternate definition of a Schnorr test as a Martin-Löf test {U_n}_{n∈ω} such that μ(U_n) = 2^{−n} for all n. As we now show, these definitions yield the same null sets, so we use them interchangeably.
Proposition 7.1.6 (Schnorr [349]). Let {U_n}_{n∈ω} be a Schnorr test. There is a Schnorr test {V_n}_{n∈ω} such that μ(V_n) = 2^{−n} and ∩_n U_n ⊆ ∩_n V_n.
Proof. For each n, proceed as follows. Let r_0 < r_1 < ⋯ < μ(U_n) be a computable sequence of rationals such that μ(U_n) − r_i ≤ 2^{−i}. At stage s, first ensure that V_n[s] ⊇ U_n[s]. If there is an i such that μ(V_n[s]) < 2^{−n} − 2^{−i} and μ(U_n[s]) ≥ r_i, then add basic open sets to V_n[s] to ensure that its measure is equal to 2^{−n} − 2^{−i}. It is easy to check that this action ensures that V_n ⊇ U_n and μ(V_n) = 2^{−n}.
Theorem 7.1.7 (Schnorr [349]). A class C ⊆ 2^ω is Schnorr null iff there are a computable martingale d and a computable order h such that C ⊆ S_h[d]. Thus, a set A is Schnorr random iff {A} is not Schnorr null.
Proof. (⇐) Let d be a computable martingale and h a computable order. We build a Schnorr null set containing S_h[d]. By Proposition 7.1.2, we may assume that d is rational-valued. We may also assume that d(λ) ∈ (0, 1). Let U_i = ⟦{σ : d(σ) > h(|σ|) > 2^i}⟧. The U_i are uniformly Σ⁰₁ classes, and μ(U_i) ≤ 2^{−i} by Kolmogorov’s Inequality (Theorem 6.3.3). To compute μ(U_i) to within 2^{−c}, let l be such that h(l) > 2^c. Let V_i = ⟦{σ : d(σ) > h(|σ|) > 2^i ∧ |σ| ≤ l}⟧. Then U_i \ V_i ⊆ U_c, so μ(U_i) − μ(V_i) ≤ μ(U_c) ≤ 2^{−c}. Thus {U_i}_{i∈ω} is a Schnorr test. Suppose A ∈ S_h[d]. Then d(A↾n) > h(n) for infinitely many n, so for each i there is an n such that d(A↾n) > h(n) > 2^i, whence A ∈ U_i. Thus S_h[d] ⊆ ∩_i U_i.
(⇒) Let {U_n}_{n∈ω} be a Schnorr test.
We build a computable martingale d and a computable order h such that ∩_n U_n ⊆ S_h[d]. Let {R_n}_{n∈ω} be uniformly computable prefix-free sets of strings such that U_n = ⟦R_n⟧. Let R_n^k = {σ ∈ R_n : |σ| ≥ k}. Since the measures μ(U_n) are uniformly computable and tend to 0 effectively, there is a computable order g such that Σ_n μ(⟦R_n^k⟧) ≤ 2^{−2g(k)} for all k. Then

  Σ_n Σ_{σ∈R_n^k} 2^{g(k)−|σ|} = 2^{g(k)} Σ_n Σ_{σ∈R_n^k} 2^{−|σ|} = 2^{g(k)} Σ_n μ(⟦R_n^k⟧) ≤ 2^{−g(k)},
7.1. Computable randomness and Schnorr randomness
so there is a computable order f such that

  Σ_n Σ_{σ∈R_n^{f(k)}} 2^{g(k)−|σ|} < 2^{−k}

for all k. For each n and k, define d_{n,k} as follows. For each σ in R_n^{f(k)}, add 2^{g(k)} to the value of d_{n,k}(τ) for each τ ⪰ σ, and add 2^{g(k)−|σ|+i} to d_{n,k}(σ↾i) for each i < |σ|. This definition ensures that each d_{n,k} is a martingale. Furthermore, d_{n,k}(λ) = Σ_{σ∈R_n^{f(k)}} 2^{g(k)−|σ|}, so Σ_{n,k} d_{n,k}(λ) ≤ Σ_k 2^{−k} < ∞, whence d = Σ_{n,k} d_{n,k} is a martingale.
If A ∈ ∩_n U_n then for each n there is an i such that A↾i ∈ R_n. Let n be large enough so that i ≥ f(0). Then A↾i ∈ R_n^{f(k)} for the largest k such that f(k) ≤ i, so d(A↾i) ≥ d_{n,k}(A↾i) ≥ 2^{g(k)}. Let ĥ(i) = 2^{g(k)} for the largest k such that f(k) ≤ i. Then lim sup_i d(A↾i)/ĥ(i) ≥ 1, so for any computable order h that grows sufficiently more slowly than ĥ, we have A ∈ S_h[d].
So we are left with showing that d is computable. To compute d(σ) to within 2^{−c}, proceed as follows. There are only finitely many n and k such that R_n^{f(k)} contains a prefix of σ, and we can find all of them and compute the amount q added to d(σ) when prefixes of σ enter the R_n^{f(k)}’s. The rest of the value of each d_{n,k}(σ) is equal to r_{n,k} = Σ_{τ∈R_n^{f(k)} ∧ τ≻σ} 2^{g(k)−|τ|+|σ|}. We can compute an s_{n,k} such that r_{n,k} − Σ_{τ∈R_n^{f(k)}[s_{n,k}] ∧ τ≻σ} 2^{g(k)−|τ|+|σ|} < 2^{−(c+n+k+2)}. Then

  d(σ) − ( q + Σ_{n,k} Σ_{τ∈R_n^{f(k)}[s_{n,k}] ∧ τ≻σ} 2^{g(k)−|τ|+|σ|} ) < Σ_{n,k} 2^{−(c+n+k+2)} = 2^{−c}.
Instead of looking at the rate of success of a martingale, we can look at the dual notion of how long it takes for that martingale to accrue a given amount of capital. Franklin and Stephan [158] observed this fact as part of a more technical result that will be useful in Section 12.2. Recall that a martingale d is said to have the savings property if d(στ) + 2 ≥ d(σ) for all σ and τ.
Theorem 7.1.8 (Franklin and Stephan [158], after Schnorr [348, 349]). The following are equivalent.
(i) A is not Schnorr random.
(ii) There is a computable martingale d and a computable function f such that ∃^∞ n (d(A↾f(n)) ≥ n).
(iii) There is a computable martingale d′ with the savings property and a computable function f such that ∃^∞ n (d′(A↾f(n)) ≥ n).
Proof. Clearly (iii) implies (ii). We begin by showing that (ii) implies (iii), with a version of the savings trick. By Proposition 7.1.2, we may assume that d is rational-valued.
Define computable martingales d_0, d_1, ... as follows. Let d_k(λ) = d(λ). If d(σ′) < 2^k for all σ′ ⪯ σ, then let d_k(σi) = d(σi) for i = 0, 1; otherwise, let d_k(σi) = d_k(σ) for i = 0, 1. That is, d_k copies d until d’s capital first reaches 2^k, and is constant from then on. Let d′ = Σ_k 2^{−k} d_k. Then d′ is a computable martingale (note that d_k(σ) = d(σ) for all sufficiently large k, so the tail of the sum is geometric).
Let f̂(n) = max{f(m) : m < 2^{n+1}}. By (ii), there are infinitely many n such that d(A↾f(m)) ≥ m for some m ∈ [2^n, 2^{n+1}). For any such n and m, and all k ≤ n, we have d_k(A↾f(m)) ≥ 2^k (either d_k has frozen at a value of at least 2^k, or it still agrees with d, whose value is at least m ≥ 2^n ≥ 2^k), whence d_k(A↾f̂(n)) ≥ 2^k, by the definition of d_k and the fact that f̂(n) ≥ f(m). Thus, for any such n,

  d′(A↾f̂(n)) ≥ Σ_{k≤n} 2^{−k} d_k(A↾f̂(n)) ≥ Σ_{k≤n} 2^{−k} 2^k ≥ n.

To complete the proof that (ii) implies (iii), we need to show that d′ has the savings property. Fix σ and τ, and let n be least such that d(σ′) < 2^n for all σ′ ⪯ σ. For k ≥ n, the martingale d_k has not frozen by σ, so d_k(σ) = d(σ), whereas for k < n, the martingale d_k has frozen by σ and is constant above σ. Therefore,

  d′(σ) = Σ_{k<n} 2^{−k} d_k(σ) + Σ_{k≥n} 2^{−k} d(σ)

and

  d′(στ) ≥ Σ_{k<n} 2^{−k} d_k(στ) = Σ_{k<n} 2^{−k} d_k(σ).

Now, Σ_{k≥n} 2^{−k} d(σ) ≤ Σ_{k≥n} 2^{−k} 2^n = 2, so d′(στ) + 2 ≥ d′(σ).
We now show that (iii) implies (i). First note that the function f in (iii) can be chosen to be strictly increasing. To see this fact, let f̃(0) = f(0) + f(1) + f(2) and f̃(n+1) = f̃(n) + f(n+3) + 1. There exist infinitely many n such that d′(A↾f(n+2)) ≥ n + 2, so there exist infinitely many n such that d′(A↾f̃(n)) ≥ n, since f̃(n) ≥ f(n+2) and d′ has the savings property. Now let g(n) be the largest m such that f̃(m) < n (or 0 if there is none). There are infinitely many m such that d′(A↾f̃(m)) ≥ m, and for each such m, letting n = f̃(m), we have d′(A↾n) ≥ g(n). So d′ and g witness the fact that A is not Schnorr random.
To complete the proof, we show that (i) implies (ii). Suppose that A is not Schnorr random, so there is a computable martingale c and a computable order g such that c(A↾n) ≥ g(n) for infinitely many n. We can use the same construction we used to obtain d′ from d above to build a new martingale c′ such that c′(στ) ≥ m for all τ whenever c(σ) > 2^{m+4}. Let f(m) = min{n : g(n) > 2^{m+4}}. There are infinitely many m such that, for some n ∈ (f(m), f(m+1)], we have c(A↾n) ≥ g(n) > 2^{m+4}. By the properties of c′, we have c′((A↾n)τ) ≥ m for all τ and, in particular, c′(A↾f(m+1)) ≥ m.
Many notions connected with 1-randomness can be adapted to Schnorr randomness if the right way to “increase the effectivity” is found. For example, the following is a characterization of Schnorr randomness analogous to
Solovay’s Definition 6.2.7. (It is equivalent to a definition in terms of martingales mentioned in Wang [405].) We will see another example in Section 7.1.3.
Definition 7.1.9 (Downey and Griffiths [109]). A total Solovay test is a sequence {S_n}_{n∈ω} of uniformly Σ⁰₁ classes such that Σ_n μ(S_n) is finite and a computable real. A set A passes this test if it is in only finitely many of the S_n.
Theorem 7.1.10 (Downey and Griffiths [109]). A set is Schnorr random iff it passes all total Solovay tests.
Proof. A Schnorr test is a total Solovay test, so if a set passes all total Solovay tests then it is Schnorr random.
Now let A be Schnorr random and let {S_n}_{n∈ω} be a total Solovay test. The measures μ(S_n) are uniformly computable, since they are uniformly left-c.e. and their sum is finite and computable. There is an m such that Σ_{n>m} μ(S_n) < 1. Let

  U_k = {X : X ∈ S_n for at least 2^k many n > m}.

The U_k are uniformly Σ⁰₁, and it is not hard to see that μ(U_k) ≤ 2^{−k} Σ_{n>m} μ(S_n) < 2^{−k}. To compute μ(U_k) to within 2^{−c}, let l be such that Σ_{n≥l} μ(S_n) < 2^{−(c+1)}. For each n ∈ (m, l), find a finite set F_n ⊂ 2^{<ω} such that ⟦F_n⟧ ⊆ S_n and μ(S_n \ ⟦F_n⟧) < 2^{−(c+n+1)}. Let V_k = {X : X ∈ ⟦F_n⟧ for at least 2^k many n ∈ (m, l)}. Then V_k ⊆ U_k, and

  μ(U_k) − μ(V_k) ≤ Σ_{n≥l} μ(S_n) + Σ_{m<n<l} μ(S_n \ ⟦F_n⟧) < 2^{−(c+1)} + Σ_n 2^{−(c+n+1)} < 2^{−c}.

Thus {U_k}_{k∈ω} is a Schnorr test, and hence there is a k such that A ∉ U_k. So A is in at most 2^k + m many S_n.
Clearly, 1-randomness implies computable randomness, which in turn implies Schnorr randomness. We will show in Section 8.11.2 that neither of these implications can be reversed.
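The savings trick used in the proof of Theorem 7.1.8 above can be made concrete. The sketch below assumes a permanent-freezing reading of the d_k (each d_k copies d until d’s capital first reaches 2^k and is constant from then on); the martingale d, which wagers everything on 1 at every step, is a toy choice.

```python
from fractions import Fraction

def d(sigma):
    """Toy rational martingale: wagers everything on 1 at every step."""
    return Fraction(0) if "0" in sigma else Fraction(2) ** len(sigma)

def d_k(k, sigma):
    """d_k copies d until d's capital first reaches 2^k, then freezes
    permanently at that value."""
    val, frozen = d(""), d("") >= 2 ** k
    for i in range(1, len(sigma) + 1):
        if not frozen:
            val = d(sigma[:i])
        frozen = frozen or val >= 2 ** k
    return val

def d_prime(sigma, cap):
    """Exact value of d' = sum_k 2^{-k} d_k, assuming 2^cap strictly exceeds
    d's capital on every proper prefix of sigma, so that d_k(sigma) = d(sigma)
    for all k >= cap and the tail is a geometric series."""
    head = sum(Fraction(1, 2 ** k) * d_k(k, sigma) for k in range(cap))
    return head + Fraction(2) ** (1 - cap) * d(sigma)
```

Along 1111 the capital of d grows to 16 while d′ reaches only 6, but d′ never gives back more than 2: extending to 11110 wipes out d completely, yet d′(11110) = 5 ≥ d′(1111) − 2.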
7.1.2 Limitations
Despite Schnorr’s critique, 1-randomness has remained the paradigmatic notion of algorithmic randomness, and has received considerably more attention than Schnorr randomness. One reason may simply be that Martin-Löf’s definition came first, and is perfectly adequate for many results. Another important reason, however, is that the mathematical theory of Schnorr randomness is not as well behaved as that of 1-randomness. For
example, the existence of universal Martin-Löf tests (and corresponding universal objects such as universal c.e. martingales and prefix-free complexity) is a powerful tool in the study of 1-randomness that is not available in the case of Schnorr randomness.
Proposition 7.1.11 (Schnorr [349]). There is no universal Schnorr test.
Proof. Let {U_i}_{i∈ω} be a Schnorr test. We define a computable set A ∉ U_1 as follows. Suppose we have defined A↾n so that 2^n μ(U_1 ∩ ⟦A↾n⟧) ≤ Σ_{1≤k≤n} 2^{−k}. Then there is an i < 2 such that 2^{n+1} μ(U_1 ∩ ⟦(A↾n)i⟧) ≤ Σ_{1≤k≤n} 2^{−k}, so by approximating μ(U_1) to within 2^{−(n+1)}, we can determine an i < 2 such that 2^{n+1} μ(U_1 ∩ ⟦(A↾n)i⟧) ≤ Σ_{1≤k≤n+1} 2^{−k}. Let A(n) = i. The set A is computable, and ⟦A↾n⟧ ⊄ U_1 for all n, so, since U_1 is open, A ∉ U_1. Thus there is a computable set not contained in ∩_i U_i, and hence {U_i}_{i∈ω} is not universal.
In a similar vein, the proof of Theorem 6.3.11 also establishes the following fact.
Proposition 7.1.12 (Schnorr [348, 349]). There is no effective enumeration of all computable (super)martingales, nor is there an effective enumeration of all computable rational-valued (super)martingales.
Another basic tool in the study of 1-randomness that is not available for either Schnorr or computable randomness is van Lambalgen’s Theorem. As observed by several people, the more difficult direction of van Lambalgen’s Theorem, namely Theorem 6.9.2, does hold for both Schnorr and computable randomness, with essentially the same proof. Furthermore, if A ⊕ B is either Schnorr or computably random, then so are A and B.
Theorem 7.1.13. (i) If A is Schnorr random and B is Schnorr random relative to A, then A ⊕ B is Schnorr random.
(ii) If A ⊕ B is Schnorr random then so are A and B.
(iii) Items (i) and (ii) also hold for computable randomness.
The proof is more or less the same as that of Theorem 6.9.2. The analog of Theorem 6.9.1 fails, however. That is, there are sets A and B such that A ⊕ B is Schnorr random but A and B are not relatively Schnorr random, and the same is true for computable randomness. We will prove this result of Merkle, Miller, Nies, Reimann, and Stephan [269], and in a stronger form due to Yu [412], in Section 8.11.3.
Another reason the theory of Schnorr randomness had remained relatively undeveloped was that many of the basic questions about this notion
were open. For instance, a cornerstone of Martin-Löf’s version of randomness is that it can be characterized via all three basic approaches (initial segment complexity, tests, and martingales), giving us a mathematically robust notion. Some of these characterizations had been missing in the cases of Schnorr and computable randomness. However, in the following sections we will see that such characterizations do exist. Even so, in many ways the mathematical theory of 1-randomness remains considerably smoother.
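The diagonalization in the proof of Proposition 7.1.11 can be run concretely when the measures involved are exactly computable. Here U1 is a toy open class (generated by the strings 01 and 10); in the actual proof one only has approximations to μ(U1), which is what costs the extra slack terms.

```python
from fractions import Fraction

GEN = ["01", "10"]   # toy generators of U1; mu(U1) = 1/2 <= 2^{-1}

def mu_inside(sigma):
    """mu(U1 ∩ [sigma]), computed exactly for the toy U1 above."""
    total = Fraction(0)
    for tau in GEN:
        if tau.startswith(sigma):          # [tau] is contained in [sigma]
            total += Fraction(1, 2 ** len(tau))
        elif sigma.startswith(tau):        # [sigma] is contained in U1
            return Fraction(1, 2 ** len(sigma))
    return total

def diagonalize(length):
    """Pick bits of A so that 2^n mu(U1 ∩ [A|n]) never increases; it then
    stays below 1, so no cylinder [A|n] is contained in the open set U1."""
    A = ""
    for n in range(length):
        for i in "01":
            # the averaging argument guarantees some i keeps the invariant
            if 2 ** (n + 1) * mu_inside(A + i) <= 2 ** n * mu_inside(A):
                A += i
                break
    return A

A = diagonalize(6)
```

The first bit already steers away from U1; with exact measures the construction is just "always descend into the lighter half."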
7.1.3 Computable measure machines
One of the problems with the mathematical theory of Schnorr randomness had been the lack of a machine characterization analogous to the Kolmogorov complexity definition of 1-randomness. Whether such a characterization could be given was a long-standing open question (see e.g. Ambos-Spies and Kučera [9] and Ambos-Spies and Mayordomo [10]), which was solved by Downey and Griffiths [109], who gave a characterization in terms of a new class of machines.
Definition 7.1.14 (Downey and Griffiths [109]). A computable measure machine is a prefix-free machine M such that μ(dom M) is computable.
Theorem 7.1.15 (Downey and Griffiths [109]). A set A is Schnorr random iff K_M(A↾n) ≥ n − O(1) for all computable measure machines M.
Proof. (⇒) Suppose that M is a computable measure machine and lim sup_n n − K_M(A↾n) = ∞. Let U_k = {X : ∃n (K_M(X↾n) ≤ n − k)}. The U_k are uniformly Σ⁰₁ classes, and by Lemma 6.2.2, μ(U_k) ≤ 2^{−k} and the measures μ(U_k) are uniformly computable. Thus {U_k}_{k∈ω} is a Schnorr test. Since A ∈ ∩_k U_k, it follows that A is not Schnorr random.
(⇐) Suppose that A is not Schnorr random and let {U_k}_{k∈ω} be a Schnorr test (with μ(U_k) = 2^{−k}) such that A ∈ ∩_k U_k. Let {R_k}_{k∈ω} be uniformly c.e. prefix-free sets of strings such that U_k = ⟦R_k⟧ for all k. Let L = {(|σ| − k, σ) : k ≥ 1 ∧ σ ∈ R_{2k}}. Then L is a KC set, since it is clearly c.e. and its weight is

  Σ_{k≥1} Σ_{σ∈R_{2k}} 2^{−|σ|+k} = Σ_{k≥1} 2^k μ(U_{2k}) = Σ_{k≥1} 2^k 2^{−2k} = Σ_{k≥1} 2^{−k} = 1.

So by the KC Theorem there is a prefix-free machine M such that if σ ∈ R_{2k} for k ≥ 1 then K_M(σ) ≤ |σ| − k. Furthermore, μ(dom M) is equal to the weight of L, which is 1, so M is a computable measure machine. For each k there is an n such that A↾n ∈ R_{2k}, so lim sup_n n − K_M(A↾n) = ∞.
We can obtain analogs of some basic facts about general prefix-free machines for computable measure machines. An example is the subadditivity of prefix-free complexity.
Proposition 7.1.16. For all computable measure machines M_0 and M_1 there is a computable measure machine N such that K_N(στ) ≤ K_{M_0}(σ) + K_{M_1}(τ) for all σ and τ.
Proof. Define N as follows. On input ρ, look for ν_0 and ν_1 such that ρ = ν_0ν_1 and M_i(ν_i)↓ for i = 0, 1. If these exist then they must be unique, since the domains of M_0 and M_1 are prefix-free. In this case, output M_0(ν_0)M_1(ν_1). Then clearly K_N(στ) ≤ K_{M_0}(σ) + K_{M_1}(τ) for all σ and τ, and N is a computable measure machine because μ(dom N) = μ(dom M_0)μ(dom M_1).
We also have a complexity upper bound similar to the one for prefix-free complexity.
Proposition 7.1.17 (Downey, Griffiths, and LaForte [110]). There is a computable measure machine M such that K_M(σ) ≤ |σ| + 2 log |σ| + O(1).
Proof. For a string τ = a_0a_1⋯a_m, let τ̄ = a_0a_0a_1a_1⋯a_ma_m01. That is, τ̄ is the result of doubling each bit of τ and adding 01 to the end. For each n, let ρ_n be the binary representation of n. For each σ, let M(ρ̄_{|σ|}σ)↓ = σ. It is easy to see that M is prefix-free and K_M(σ) ≤ |σ| + 2 log |σ| + 2. Let L = {ρ̄_n : n ∈ ℕ}. For τ = ρ̄_n ∈ L, the domain of M contains all extensions of τ of length |τ| + n (and every element of dom M is such an extension for some τ ∈ L). The combined measure of these extensions is 2^n 2^{−(|τ|+n)} = 2^{−|τ|}. Thus μ(dom M) = Σ_{τ∈L} 2^{−|τ|}. There are two strings of length 4 in L (0001 and 1101). There are two strings of length 6 in L (110001 and 111101) and four strings of length 8. Generally, it is easy to check that there are 2^i many strings of length 2i + 4 in L for each i ≥ 1. Thus μ(dom M) = 2 · 2^{−4} + Σ_{i≥1} 2^i 2^{−(2i+4)} = 2^{−3} + Σ_{i≥1} 2^{−(i+4)} = 2^{−3} + 2^{−4} = 3/16, and hence M is a computable measure machine.
For the purposes of defining measures of complexity, we can replace computable measure machines by ones whose domains have measure 1.
Proposition 7.1.18. For each computable measure machine M there is a prefix-free machine N such that μ(dom N) = 1 and K_N(σ) ≤ K_M(σ) for all σ.
Proof.
Since μ(dom M) is computable, we can computably determine k_0, k_1, ... such that μ(dom M) + Σ_n 2^{−k_n} = 1. Let L = {(|σ|, M(σ)) : M(σ)↓} ∪ {(k_n, 0^n) : n ∈ ℕ}. Then L is a KC set, and the corresponding prefix-free machine N given by the KC Theorem has the desired properties.
No universal prefix-free machine can be a computable measure machine, since the measure of the domain of such a machine is a version of Ω, and hence 1-random. However, we have the following result.
Proposition 7.1.19. For each computable measure machine M there is a computable measure machine N such that rng N = 2^{<ω} and K_N(σ) ≤ min{|σ| + 2 log |σ|, K_M(σ)} + O(1). Furthermore, N can be chosen such that μ(dom N) = 1.
Proof. Let N_0 be a computable measure machine such that K_{N_0}(σ) ≤ |σ| + 2 log |σ| + O(1), as in Proposition 7.1.17. Define a computable measure machine N_1 by N_1(0σ) = M(σ) and N_1(1σ) = N_0(σ). Let N be a computable measure machine such that μ(dom N) = 1 and K_N(σ) ≤ K_{N_1}(σ) for all σ, as in Proposition 7.1.18. It is easy to check that N has the desired properties.
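The geometric series for μ(dom M) in the proof of Proposition 7.1.17 can be checked numerically, for the doubling-plus-01 coding exactly as described there. Truncating the enumeration at n < 2¹² leaves a tail of exactly 2⁻¹⁵; the helper name below is an illustrative choice.

```python
from fractions import Fraction

def coded_len(n):
    """Length of the self-delimiting header: the binary representation of n
    with every bit doubled and 01 appended."""
    return 2 * len(format(n, "b")) + 2

# Each rho_n contributes 2^{-|header|} to mu(dom M): the inputs of the form
# header followed by n more bits fill out exactly that much measure.
partial = sum(Fraction(1, 2 ** coded_len(n)) for n in range(2 ** 12))
```

The partial sums increase to 3/16 from below, confirming that μ(dom M) is a computable real.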
7.1.4 Computable randomness and tests
In this section we present a test characterization of computable randomness.
Definition 7.1.20 (Downey, Griffiths, and LaForte [110]). A computably graded test is a pair consisting of a Martin-Löf test {V_n}_{n∈ω} and a computable function f : 2^{<ω} × ω → ℝ such that, for any n ∈ ω, any σ ∈ 2^{<ω}, and any finite prefix-free set of strings {σ_i}_{i≤I} with ∪_{i≤I} ⟦σ_i⟧ ⊆ ⟦σ⟧, the following conditions are satisfied:
(i) μ(V_n ∩ ⟦σ⟧) ≤ f(σ, n),
(ii) Σ_{i≤I} f(σ_i, n) ≤ 2^{−n}, and
(iii) Σ_{i≤I} f(σ_i, n) ≤ f(σ, n).
A set passes the test ({V_n}_{n∈ω}, f) if it passes {V_n}_{n∈ω}.
If condition (ii) holds for any finite prefix-free set of strings then it also holds for any infinite prefix-free set of strings: the infinite sum is just the supremum of the associated finite sums, and so is also no greater than 2^{−n}. Similarly, if (iii) holds for finite prefix-free sets then it also holds for infinite prefix-free sets. If ∪_{i≤I} ⟦σ_i⟧ = ⟦σ⟧ then we can summarize conditions (i)–(iii) as follows: μ(V_n ∩ ⟦σ⟧) ≤ Σ_{i≤I} f(σ_i, n) ≤ f(σ, n) ≤ 2^{−n}.
An equivalent notion was defined by Merkle, Mihailović, and Slaman [268].
Definition 7.1.21 (Merkle, Mihailović, and Slaman [268]). (i) A computable rational probability distribution (or measure) is a computable function ν : 2^{<ω} → ℚ such that ν(λ) = 1 and ν(σ) = ν(σ0) + ν(σ1).
(ii) A bounded Martin-Löf test is a Martin-Löf test {V_n}_{n∈ω} for which there exists a computable rational probability distribution ν such that μ(V_n ∩ ⟦σ⟧) ≤ 2^{−n} ν(σ) for all n and σ.
It is not difficult to check that {V_n}_{n∈ω} is a bounded Martin-Löf test iff there is an f such that ({V_n}_{n∈ω}, f) is a computably graded test. We will show that computably graded tests characterize computable randomness. We first need a lemma. Recall that a real-valued function is co-c.e. if it can be effectively approximated from above.
Lemma 7.1.22 (Schnorr [348, 349]). From a co-c.e. martingale c : 2^{<ω} → ℝ we can effectively find a computable martingale d : 2^{<ω} → ℚ such that d(σ) ≥ c(σ) for all σ.
Proof. Let {c_s}_{s∈ω} be an effective decreasing approximation to c. That is, the c_s are uniformly computable functions from 2^{<ω} to ℚ, and for all σ and s we have c_{s+1}(σ) < c_s(σ) and c(σ) = lim_s c_s(σ). We inductively define a computable martingale d such that d(σ) > c(σ) + 2^{−|σ|} for all σ. Let d(λ) = c_0(λ) + 1. Now assume that d(σ) has been defined so that d(σ) > c(σ) + 2^{−|σ|}. Let n be such that 2d(σ) > c_n(σ0) + c_n(σ1) + 2^{−|σ|+1}. Let d(σ1) = c_n(σ1) + 2^{−|σ|} and d(σ0) = 2d(σ) − d(σ1). By definition d(σ1) > c(σ1) + 2^{−|σ1|}, and d(σ0) is defined explicitly to satisfy the martingale property. Furthermore, d(σ0) = 2d(σ) − d(σ1) = 2d(σ) − c_n(σ1) − 2^{−|σ|} > c_n(σ0) + 2^{−|σ|} > c(σ0) + 2^{−|σ0|}.
Theorem 7.1.23 (Downey, Griffiths, and LaForte [110], implicit in Merkle, Mihailović, and Slaman [268]). (i) From a computable martingale d : 2^{<ω} → ℚ we can effectively define a computably graded test ({V_n}_{n∈ω}, f) such that if lim sup_n d(A↾n) = ∞ then A ∈ ∩_n V_n.
(ii) From a computably graded test ({V_n}_{n∈ω}, f) we can effectively define a computable martingale d : 2^{<ω} → ℚ such that if A ∈ ∩_n V_n then lim sup_n d(A↾n) = ∞.
Therefore, a set is computably random iff it passes all computably graded tests iff it passes all bounded Martin-Löf tests.
Proof. (i) We may assume without loss of generality that d(λ) = 1. Let V_n = ⟦{σ : d(σ) ≥ 2^n}⟧ and let f(σ, n) = d(σ)2^{−(|σ|+n)}. By Kolmogorov’s Inequality (Theorem 6.3.3),

  μ(V_n ∩ ⟦σ⟧) ≤ d(σ)μ(⟦σ⟧)/2^n,

so μ(V_n ∩ ⟦σ⟧) ≤ f(σ, n). Furthermore, again using Kolmogorov’s Inequality, if {σ_i}_{i≤I} is a prefix-free set and ∪_{i≤I} ⟦σ_i⟧ ⊆ ⟦σ⟧, then

  Σ_{i≤I} f(σ_i, n) = 2^{−n} Σ_{i≤I} d(σ_i)μ(⟦σ_i⟧) ≤ 2^{−n} d(σ)μ(⟦σ⟧) = f(σ, n) ≤ 2^{−n}.
So ({V_n}_{n∈ω}, f) is a computably graded test. Finally, if lim sup_n d(A↾n) = ∞ then A ∈ ∩_n V_n by the definition of the V_n.
(ii) Without loss of generality we may assume that V_{n+1} ⊆ V_n for all n. We define d via two auxiliary functions: a co-c.e. function h : 2^{<ω} × ω → ℝ and a co-c.e. martingale c : 2^{<ω} → ℝ. Let P_σ be the collection of all finite partitions of ⟦σ⟧ into a finite prefix-free set, that is, all finite prefix-free sets F such that ⟦F⟧ = ⟦σ⟧ (for example, {σ0, σ1} ∈ P_σ). Let

  h(σ, n) = 2^{|σ|} inf { Σ_{i≤I} f(σ_i, n) : {σ_i}_{i≤I} ∈ P_σ }.

Since f satisfies conditions (i) and (ii) of Definition 7.1.20, we have 2^{|σ|} μ(V_n ∩ ⟦σ⟧) ≤ h(σ, n) ≤ 2^{|σ|−n}. Thus Σ_n h(σ, n) ≤ 2^{|σ|} Σ_n 2^{−n} = 2^{|σ|+1}, and we can define c(σ) = Σ_n h(σ, n).
Fix n. By condition (iii) of Definition 7.1.20, 2^{|σ|} f(σ, n) ≥ 2^{|σ|}(f(σ0, n) + f(σ1, n)) ≥ ½(h(σ0, n) + h(σ1, n)). Each partition of ⟦σ⟧ into a finite prefix-free set, other than the singleton {σ}, is a union of a partition of ⟦σ0⟧ and a partition of ⟦σ1⟧, so

  h(σ, n) = min( ½(h(σ0, n) + h(σ1, n)), 2^{|σ|} f(σ, n) ) = ½(h(σ0, n) + h(σ1, n)).

In other words, λσ.h(σ, n) is a martingale. Thus c is a martingale.
It is easy to see that h is co-c.e. Let {h_s}_{s∈ω} be an effective approximation of h from above. To see that c is co-c.e., let c_s(σ) = Σ_{n≤s} h_s(σ, n) + Σ_{n>s} 2^{|σ|−n}. This sum is computable, since the first summand is a finite sum of computable rational numbers, while the second is simply 2^{|σ|−s}. In passing from c_s(σ) to c_{s+1}(σ), we replace h_s values by h_{s+1} values, which are no larger, and replace 2^{|σ|−(s+1)} by h_{s+1}(σ, s + 1), which is also no larger. So c_{s+1}(σ) ≤ c_s(σ) for all σ. Also, c(σ) = lim_s c_s(σ) for all σ. Thus c is a co-c.e. martingale. Let d be as in Lemma 7.1.22.
Suppose that A ∈ ∩_n V_n. Given i, let σ ≺ A be such that ⟦σ⟧ ⊆ V_i. Then

  d(σ) ≥ c(σ) ≥ Σ_{j≤i} h(σ, j) ≥ Σ_{j≤i} 2^{|σ|} μ(V_j ∩ ⟦σ⟧) = Σ_{j≤i} 2^{|σ|} 2^{−|σ|} = i + 1.

So lim sup_n d(A↾n) = ∞.
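The grading function f(σ, n) = d(σ)2^{−(|σ|+n)} built in part (i) of the proof can be checked directly against conditions (ii) and (iii) of Definition 7.1.20. The martingale below, which always wagers a third of its capital on 1, is an illustrative choice.

```python
from fractions import Fraction

def d(sigma):
    """Toy computable martingale: always wagers 1/3 of its capital on bit 1."""
    cap = Fraction(1)
    for b in sigma:
        cap *= Fraction(4, 3) if b == "1" else Fraction(2, 3)
    return cap

def f(sigma, n):
    """Grading function from the proof of Theorem 7.1.23(i)."""
    return d(sigma) * Fraction(1, 2 ** (len(sigma) + n))

# A prefix-free partition of the whole space and a prefix-free subfamily.
partition = ["00", "01", "1"]
subfamily = ["00", "1"]

total = sum(f(p, 3) for p in partition)
sub = sum(f(p, 3) for p in subfamily)
```

Fairness of d makes the sum over a full partition come out exactly f(λ, n), and dropping pieces only decreases it; these are conditions (iii) and (ii) in miniature.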
7.1.5 Bounded machines and computable randomness
It is also possible to obtain a machine characterization for computable randomness. If M is a prefix-free machine, let

  S^n_M = {σ : K_M(σ) ≤ |σ| − n}.
Definition 7.1.24 (Mihailović [personal communication]). A bounded machine is a prefix-free machine M such that there is a computable rational probability distribution ν with μ(⟦S^n_M⟧ ∩ ⟦σ⟧) ≤ 2^{−n} ν(σ) for all σ and n.
Theorem 7.1.25 (Mihailović [personal communication]). A set X is computably random iff K_M(X↾n) ≥ n − O(1) for every bounded machine M.
Proof. Suppose there is a bounded machine M such that ∀c ∃n (K_M(X↾n) < n − c). By definition μ(⟦S^n_M⟧ ∩ ⟦σ⟧) ≤ 2^{−n} ν(σ), so ⟦S⁰_M⟧, ⟦S¹_M⟧, ... is a bounded Martin-Löf test covering X. Thus X is not computably random.
Now suppose that X is not computably random. Let A_0, A_1, ... be uniformly c.e. prefix-free sets such that the ⟦A_n⟧ form a bounded Martin-Löf test with associated computable distribution ν, and X ∈ ∩_n ⟦A_n⟧. It is straightforward to use the KC Theorem to define a prefix-free machine M such that if σ ∈ A_{2c+2} then M(τ) = σ for some τ such that |τ| = |σ| − c, as in the KC Theorem version of the proof of Theorem 6.2.3. For this machine, ∀c ∃n (K_M(X↾n) < n − c).
To complete this direction of the proof we need to show that M is bounded. Let τ_{n,0}, τ_{n,1}, ... list the elements of A_n. Then S^n_M = {τ_{2(n+k+1),i} : k, i ∈ ℕ}, so for each σ and n,

  ⟦S^n_M⟧ ∩ ⟦σ⟧ = ∪_{k≥1} ∪_i ( ⟦τ_{2(n+k),i}⟧ ∩ ⟦σ⟧ ).

Consequently,

  μ(⟦S^n_M⟧ ∩ ⟦σ⟧) ≤ Σ_{k≥1} μ(⟦A_{2(n+k)}⟧ ∩ ⟦σ⟧) ≤ Σ_{k≥1} 2^{−2(n+k)} ν(σ) < 2^{−n} ν(σ).
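The KC Theorem invoked in the proofs of Theorems 7.1.15 and 7.1.25 has a concrete greedy construction behind it: process requests (d, σ) whose weights 2^{−d} sum to at most 1, always assigning the leftmost length-d code word that keeps the assigned set prefix-free. A minimal sketch (the request tuples below are illustrative):

```python
from fractions import Fraction

def kc_assign(requests):
    """Greedy KC construction: map each request (d, sigma) to a code word
    of length d so that the code words form a prefix-free set."""
    assert sum(Fraction(1, 2 ** d) for d, _ in requests) <= 1
    assigned, table = [], {}
    for d, sigma in requests:
        for i in range(2 ** d):
            tau = format(i, "b").zfill(d)   # leftmost candidate first
            if all(not a.startswith(tau) and not tau.startswith(a)
                   for a in assigned):
                assigned.append(tau)
                table[tau] = sigma
                break
    return table

table = kc_assign([(1, "x"), (2, "y"), (2, "z")])
```

Requests may arrive online in any order; as long as the weight bound holds, the leftmost-fit choice never gets stuck.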
7.1.6 Process complexity and computable randomness
In Section 6.3.2, we saw that process complexity can be used to characterize 1-randomness. In this section, we show that a variation on process complexity can be used to give an analogous characterization of computable randomness.
Definition 7.1.26 (Levin, see [425], Day [89]). A process machine P is a quick process machine if it is total and there is a computable order h such that |P(τ)| ≥ h(|τ|) for all τ.
For a quick process machine P, any P-description of a string σ must be among the strings τ such that h(|τ|) ≤ |σ|, so the complexity C_P(σ) of σ relative to P can be determined by computing the values of P(τ) for all such τ (which are defined, since P is total). Thus C_P is computable.
Definition 7.1.27 (Day [89]). A is quick process random if C_P(A↾n) ≥ n − O(1) for all quick process machines P.
Theorem 7.1.28 (Day [89]). A set is quick process random iff it is computably random.
Proof. First suppose that A is not quick process random. Then there is a quick process machine P such that for all c there is an n with C_P(A↾n) < n − c. Let h be the order associated with P and let g(n) = min{x : h(x) ≥ n}, so that if τ ∈ 2^{g(n)} then |P(τ)| ≥ n. As we may assume without loss of generality that h(n) ≤ n, we may also assume that g(n) ≥ n.
We define a computable martingale d as follows. First, for each σ let E_σ = {τ ∈ 2^{g(|σ|)} : P(τ) ⪰ σ}. Then let

  d(σ) = |E_σ| / 2^{g(|σ|)−|σ|}.

Since P is total, E_σ can be computed from σ, and hence d is computable. We now show that d is a martingale. For any σ, the sets E_{σ0} and E_{σ1} are disjoint, as P is a process machine. If τ ∈ E_{σ0} then τ↾g(|σ|) ∈ E_σ, because P(τ↾g(|σ|)) ⪯ P(τ), σ ⪯ P(τ), and |P(τ↾g(|σ|))| ≥ |σ|. Hence ⟦E_{σ0}⟧ ⊆ ⟦E_σ⟧, and similarly ⟦E_{σ1}⟧ ⊆ ⟦E_σ⟧. Now, if τ ∈ E_σ and τ′ ⪰ τ with |τ′| = g(|σ| + 1), then τ′ ∈ E_{σ0} ∪ E_{σ1}, since P is total, P(τ′) ⪰ P(τ) ⪰ σ, and |P(τ′)| ≥ |σ| + 1. Thus

  (|E_{σ0}| + |E_{σ1}|) 2^{g(|σ|)−g(|σ|+1)} = |E_σ|,

and hence

  d(σ) = |E_σ| / 2^{g(|σ|)−|σ|} = (|E_{σ0}| + |E_{σ1}|) / 2^{g(|σ|+1)−|σ|} = ½ (d(σ0) + d(σ1)).

Given c, let n and τ be such that |τ| < n − c and P(τ) = A↾n. Since g(n) ≥ n > |τ| + c, if τ′ ⪰ τ and |τ′| = g(n), then P(τ′) ⪰ A↾n, and so τ′ ∈ E_{A↾n}. Hence |E_{A↾n}| ≥ 2^{c+g(n)−n}, so d(A↾n) = |E_{A↾n}| / 2^{g(n)−n} ≥ 2^c. Thus lim sup_n d(A↾n) = ∞, so A is not computably random.
Now suppose that A is not computably random. By Proposition 7.1.2, there is a computable rational-valued martingale d that succeeds on A. It is clear from the proof of that proposition that we may in fact assume that all values of d are dyadic rationals. We may also assume that d(λ) = 1. By the savings trick (Proposition 6.3.8), we may further assume that lim_n d(A↾n) = ∞. We will construct a quick process machine P such that for each c there is an n with C_P(A↾n) ≤ n − c. Let f be a strictly increasing computable function such that for each σ there is a natural number a with d(σ)2^{−|σ|} = a2^{−f(|σ|)}.
We will define a function l : ℕ → ℕ. For strings with lengths in the range of l, we will give short P-descriptions to those strings with large d-values. To describe strings of length l(s + 1), we will use descriptions of length between f(l(s)) + 1 and f(l(s + 1)). To make our strings nonrandom, we
need to make the gap between f(l(s)) and l(s + 1) increase as s increases. Thus we define l inductively by l(0) = 0 and l(s + 1) = f(l(s)) + s + 1. We will construct P in stages while ensuring that the following conditions hold.
1. dom P_s = 2^{≤f(l(s))}.
2. |P_s(τ)| = l(s) for all τ ∈ 2^{f(l(s))}.
3. 2^{−|σ|} d(σ) = 2^{−f(l(s))} |{τ ∈ 2^{f(l(s))} : P_s(τ) = σ}| for all σ ∈ 2^{l(s)}.
At stage 0, we define P(σ) = λ for all strings of length less than or equal to f(0). The conditions above are clearly met.
At stage s + 1, for every σ ∈ 2^{l(s)}, we proceed as follows. For all υ ⪰ σ such that |υ| = l(s + 1), we know that d(υ)2^{−|υ|} = a2^{−f(|υ|)} for some natural number a. So there are a_0, ..., a_n such that

  d(υ)2^{−|υ|} = a_0 2^{−f(l(s))−1} + a_1 2^{−f(l(s))−2} + ⋯ + a_n 2^{−f(l(s+1))},

where n = f(l(s + 1)) − f(l(s)) − 1 and a_i ∈ {0, 1} for 1 ≤ i ≤ n. Note that a_0 may be greater than 1. Let U = {υ ∈ 2^{l(s+1)} : υ ⪰ σ}. Then Σ_{υ∈U} Σ_{i≤n} a_i 2^{−(f(l(s))+i+1)} = Σ_{υ∈U} 2^{−|υ|} d(υ) = 2^{−|σ|} d(σ) ≤ 1, so by the KC Theorem we can effectively find pairwise disjoint sets of strings T_υ for υ ∈ U such that T_υ has a_i many strings of length f(l(s)) + i + 1 for each i ≤ n, and T_σ = ∪_{υ∈U} T_υ is prefix-free. Furthermore, we may assume that we define this set as in the proof of the KC Theorem, always taking the leftmost string of the required length that preserves prefix-freeness.
Let S_σ = {τ ∈ 2^{f(l(s))} : P_s(τ) = σ}. By condition 3, μ(⟦T_σ⟧) = 2^{−|σ|} d(σ) = |S_σ|2^{−f(l(s))} = μ(⟦S_σ⟧). Let R_σ = {ρ↾f(l(s)) : ρ ∈ T_σ}. Since μ(⟦S_σ⟧) = μ(⟦T_σ⟧), all the elements of T_σ have length at least f(l(s)) + 1, and T_σ was defined as in the proof of the KC Theorem, we must have |R_σ| = |S_σ|. Hence there is a bijection h : R_σ → S_σ. For ρ ∈ T_σ, let g(ρ) be the string formed by replacing the first f(l(s)) many bits of ρ with the image of these bits under h. Let T̄_υ be the image of T_υ under g. Now we define P_{s+1} as follows.
For |ρ| ≤ f(l(s + 1)), let

  P_{s+1}(ρ) = P_s(ρ)               if |ρ| ≤ f(l(s)),
  P_{s+1}(ρ) = P_s(ρ↾f(l(s)))      if |ρ| > f(l(s)) and ρ ≺ τ for some τ ∈ T̄_υ,
  P_{s+1}(ρ) = υ                    if |ρ| > f(l(s)) and ρ ⪰ τ for some τ ∈ T̄_υ.

We now verify that conditions 1–3 are satisfied and that, assuming by induction that P_s is a process machine, so is P_{s+1}. The nonempty sets ⟦S_σ⟧ for σ ∈ 2^{l(s)} partition 2^ω, by conditions 1 and 2. Furthermore, by the above construction, the nonempty sets ⟦T̄_υ⟧ for υ ⪰ σ and |υ| = l(s + 1) partition ⟦S_σ⟧. Thus the nonempty sets ⟦T̄_υ⟧ for υ of
length l(s + 1) partition 2^ω. So P_{s+1} is defined for all strings of length less than or equal to f(l(s + 1)), meeting condition 1. If τ has length f(l(s + 1)), then τ extends some element of some T̄_υ, which implies that |P_{s+1}(τ)| = |υ| = l(s + 1), and hence condition 2 is met. If υ ∈ 2^{l(s+1)}, then for all τ of length f(l(s + 1)) we have τ ∈ ⟦T̄_υ⟧ iff P_{s+1}(τ) = υ. Thus, as μ(⟦T̄_υ⟧) = μ(⟦T_υ⟧) = 2^{−|υ|} d(υ), we have

  d(υ)2^{−|υ|} = μ(⟦T̄_υ⟧) = |{τ ∈ 2^{f(l(s+1))} : P_{s+1}(τ) = υ}| 2^{−f(l(s+1))},

so condition 3 is met.
Assuming P_s is a process machine, the first two parts of the definition of P_{s+1} preserve the process machine condition. To see that the third part also does so, it is enough to consider the case where τ ∈ T̄_υ for some υ. By construction, τ ⪰ ξ for some ξ such that |ξ| = f(l(s)) and ξ ∈ S_σ for some σ ⪯ υ. Then for any ξ′ with ξ ⪯ ξ′ ≺ τ, we have P_{s+1}(ξ′) = P_s(ξ) = σ ⪯ υ = P_{s+1}(τ).
As P_s is a process machine for all s, we have that P is a process machine. Furthermore, it is a quick process machine: Let k(n) be the largest s such that f(l(s)) ≤ n. Given any σ, let σ′ = σ↾f(l(k(|σ|))). Then |P(σ)| ≥ |P(σ′)| = l(k(|σ|)) ≥ k(|σ|).
To show that A is not quick process random, take any c > 0. As lim_n d(A↾n) = ∞, there must be a stage s + 1 > c such that d(A↾l(s + 1)) ≥ 2^c. Let υ = A↾l(s + 1). Then d(υ)2^{−|υ|} ≥ 2^{−|υ|+c} = 2^{−f(l(s))−s−1+c}, since by definition |υ| = l(s + 1) = f(l(s)) + s + 1. Also, d(υ)2^{−|υ|} = a_0 2^{−f(l(s))−1} + a_1 2^{−f(l(s))−2} + ⋯ + a_n 2^{−f(l(s+1))} with a_i ∈ {0, 1} for i > 0. If every a_i with i ≤ s − c were 0 then, since a_i ≤ 1 for i > 0, we would have d(υ)2^{−|υ|} < 2^{−f(l(s))−s−1+c}, a contradiction. Hence there must be some i ≤ s − c such that a_i ≠ 0. For this i, there is some string of length f(l(s)) + i + 1 in T_υ. But f(l(s)) + i + 1 ≤ f(l(s)) + s + 1 − c = l(s + 1) − c. It follows that there is some string ρ of length at most l(s + 1) − c in T̄_υ, and that P_{s+1}(ρ) = υ = A↾l(s + 1). Thus A is not quick process random.
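The counting martingale d(σ) = |E_σ| / 2^{g(|σ|)−|σ|} from the first half of the proof can be computed outright for a small total process machine. The machine below, which outputs the even-indexed bits of its input, is an illustrative choice: it is monotone and total with |P(τ)| ≥ ⌊|τ|/2⌋, so g(n) = 2n works.

```python
from fractions import Fraction

def P(tau):
    """Toy total process machine: output the even-indexed bits of the input.
    Extending tau extends P(tau), so the process condition holds."""
    return tau[0::2]

def g(n):
    return 2 * n      # inputs of length 2n guarantee n output bits

def strings(length):
    return [format(i, "b").zfill(length)
            for i in range(2 ** length)] if length else [""]

def E(sigma):
    """E_sigma = inputs of length g(|sigma|) whose output extends sigma."""
    return [t for t in strings(g(len(sigma))) if P(t).startswith(sigma)]

def d(sigma):
    """Martingale from the proof of Theorem 7.1.28."""
    return Fraction(len(E(sigma)), 2 ** (g(len(sigma)) - len(sigma)))
```

For this machine every σ has d(σ) = 1 (the odd-indexed input bits are free), and the fairness condition d(σ0) + d(σ1) = 2d(σ) can be checked mechanically.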
7.2 Weak n-randomness
7.2.1 Basics
Kurtz [228] introduced a notion of randomness based on the idea of being contained in effective sets of measure 1 rather than avoiding effective sets of measure 0.
Definition 7.2.1 (Kurtz [228]). A set is weakly 1-random, or Kurtz random, if it is contained in every Σ⁰₁ class of measure 1.
We will later see that the relativized notion of weak (n + 1)-randomness is actually a strong form of n-randomness.
286
7. Other Notions of Algorithmic Randomness
There is an equivalent formulation of this notion in terms of tests, established by Wang [405] but implicit in some of the proofs in Kurtz [228].

Definition 7.2.2 (Wang [405]). A Kurtz null test is a Martin-Löf test {Vn}n∈ω for which there is a computable function f : ω → (2^{<ω})^{<ω} such that Vn is the open class generated by the finite sequence of strings f(n), for all n. Another way to state this definition is that there are finite, uniformly computable S0, S1, ... ⊂ 2^{<ω} such that Vn is the open class generated by Sn. Note that if {Vn}n∈ω is a Kurtz null test, then the Vn and their complements are uniformly Δ^0_1 classes.

Theorem 7.2.3 (Wang [405], after Kurtz [228]). A set is weakly 1-random iff it passes all Kurtz null tests.

Proof. Suppose that A is not weakly 1-random and let U be a Σ^0_1 class of measure 1 such that A ∉ U. To define Vn, let s be least such that μ(Us) > 1 − 2^{−n}, and let Vn be the complement of Us. It is easy to check that {Vn}n∈ω is a Kurtz null test, and that A ∈ ⋂n Vn.

Conversely, suppose there is a Kurtz null test {Vn}n∈ω such that A ∈ ⋂n Vn. Let U be the complement of ⋂n Vn. Then U is a Σ^0_1 class of measure 1, and A ∉ U, so A is not weakly 1-random. ∎

A Kurtz null test is clearly a Schnorr test, so we see that every Schnorr random set is weakly 1-random.

Of course, we can generalize Kurtz randomness to higher levels. Thus a set is weakly n-random, or Kurtz n-random,^5 if it is a member of all Σ^0_n classes of measure 1. Here the distinction between classes and open classes is important. It is not true that being a member of every Σ^0_n class of measure 1 is the same as being a member of every Σ^{0,∅^{(n−1)}}_1 class of measure 1. For instance, let A be 2-generic. Then it is not hard to see that A is in every Σ^{0,∅′}_1 class of measure 1, and hence is weakly 1-random relative to ∅′. (This is the relativized form of Theorem 8.11.7 below.) But as we will see in Proposition 8.11.9, A cannot be weakly 2-random (or even Schnorr random, which, as we will see in Theorem 7.2.7, is a weaker notion than weak 2-randomness).
Thus weak n-randomness is not the same as weak 1-randomness relative to ∅^{(n−1)}. This observation is important since many results on n-randomness rely on Theorem 6.8.5, that n-randomness is the same as 1-randomness relative to ∅^{(n−1)}. The following result says that we can recover a little of the spirit of that fact, though, if we allow for an extra quantifier.

^5 Some authors, however, have used "Kurtz n-random" to mean Kurtz random relative to ∅^{(n−1)}.
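The stage search in the proof of Theorem 7.2.3 is finitary, so it can be illustrated with a toy computation. This is only a sketch under simplifying assumptions: the measure-1 class is given by an explicit finite list of clopen stage approximations, and all names (`stages`, `kurtz_component`, and so on) are hypothetical illustrations, not notation from the text.

```python
from fractions import Fraction

def mu(strings):
    # Measure of the open class generated by a prefix-free set of strings.
    return sum(Fraction(1, 2 ** len(s)) for s in strings)

def complement_clopen(strings, length):
    # All strings of the given length extending no element of `strings`:
    # a clopen representation of the complement.
    cylinders = [format(i, "0{}b".format(length)) for i in range(2 ** length)]
    return [t for t in cylinders if not any(t.startswith(s) for s in strings)]

def kurtz_component(stages, n):
    # V_n as in the proof of Theorem 7.2.3: find the least stage s with
    # mu(U_s) > 1 - 2^{-n} and return the complement of U_s.
    for Us in stages:
        if mu(Us) > 1 - Fraction(1, 2 ** n):
            return complement_clopen(Us, max(len(s) for s in Us))
    raise ValueError("no such stage in the given finite approximation")

# Toy stagewise enumeration of a measure-1 class: U_s covers every string
# of length s except 0^s, so mu(U_s) = 1 - 2^{-s}.
stages = [[format(i, "0{}b".format(s)) for i in range(1, 2 ** s)]
          for s in range(1, 10)]
V2 = kurtz_component(stages, 2)   # complement of U_3, namely {000}
```

Here the sets Vn shrink toward the single point 0^ω, the unique member of the complement of the class, exactly as the test construction intends.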
Lemma 7.2.4 (Kautz [200]). Let n ≥ 2.

(i) From the index of a Σ^0_n class C we can computably obtain the index of a Σ^{0,∅^{(n−2)}}_2 class Ĉ ⊆ C such that μ(Ĉ) = μ(C).

(ii) From the index of a Π^0_n class V we can uniformly and computably obtain the index of a Π^{0,∅^{(n−2)}}_2 class V̂ ⊇ V such that μ(V̂) = μ(V).

Proof. Let C be a Σ^0_n class. Then C is the union of uniformly Π^0_{n−1} classes T0, T1, .... By Theorem 6.8.3, we can uniformly find, for each i and j, a Π^{0,∅^{(n−2)}}_1 class T̂i,j ⊆ Ti with μ(Ti) − μ(T̂i,j) ≤ 2^{−j}. Let Ĉ = ⋃i,j T̂i,j. Then Ĉ ⊆ C and μ(Ĉ) = μ(C). The proof of (ii) is symmetric. ∎

Corollary 7.2.5. Let n ≥ 2. A set is weakly n-random iff it is in every Σ^{0,∅^{(n−2)}}_2 class of measure 1.

We can also have a "one jump" characterization of weak n-randomness via the analog of a Kurtz null test. We will refer to the tests in part (iii) of the following theorem as Σ^0_n Kurtz null tests.

Theorem 7.2.6 (Kautz [200], Wang [405], after Kurtz [228]). The following are equivalent for n ≥ 2.

(i) A is weakly n-random.

(ii) For every ∅^{(n−1)}-computable sequence of Σ^0_{n−1} classes {Si}i∈ω with μ(Si) ≤ 2^{−i}, we have A ∉ ⋂i Si.

(iii) For every ∅^{(n−1)}-computable sequence of Σ^{0,∅^{(n−2)}}_1 classes {Si}i∈ω with μ(Si) ≤ 2^{−i}, we have A ∉ ⋂i Si.
Proof. The equivalence between (ii) and (iii) follows from Theorem 6.8.3.

If A ∈ ⋂i Si, where the Si are as in (ii), then A is not weakly n-random, since the complement of ⋂i Si is a Σ^0_n class of measure 1.

Finally, suppose that A is not weakly n-random. Then it is contained in some Π^0_n null class T. Now, T is of the form ⋂i Ui, where the Ui are uniformly Σ^0_{n−1} classes. Using ∅^{(n−1)} as an oracle, for each i we can find a k such that μ(⋂j≤k Uj) < 2^{−i}, and let Si = ⋂j≤k Uj. Then the Si are as in (ii), and A ∈ ⋂i Si. ∎

Weak 2-randomness is quite a natural notion. When we introduced the notion of Martin-Löf randomness, we mentioned the possibility of relaxing the definition of a Martin-Löf test {Un}n∈ω to require only that limn μ(Un) = 0, that is, eliminating the requirement that there be a computable bound on the rate of convergence of these measures. We called such tests generalized Martin-Löf tests. For such a test, ⋂n Un is a Π^0_2 class of measure 0, and every Π^0_2 class of measure 0 can be expressed in this way, so being weakly 2-random is the same as passing all generalized Martin-Löf tests. Weak 2-randomness was first studied by Gaifman and Snir [168], in
the sense that their paper is the first occurrence of this notion in print (to our knowledge). However, the notion is already present in Solovay's notes [371], which include a proof due to Martin that weakly 2-random sets are not Δ^0_2 (Corollary 7.2.9 below).

Weak n-randomness and n-randomness are related as follows.

Theorem 7.2.7 (Kurtz [228]). Every n-random set is weakly n-random and every weakly (n+1)-random set is n-random.

Proof. An n-random set passes every Σ^0_n Kurtz null test, and hence is weakly n-random.

If A is not n-random then there is a Σ^0_n Martin-Löf test U0, U1, ... such that A ∈ ⋂n Un. Let V be the complement of ⋂n Un. Then V is a Σ^0_{n+1} class of measure 1 not containing A, so A is not weakly (n+1)-random. ∎

This result, and characterizations such as the one above in terms of generalized Martin-Löf tests, have led some to argue that weak n-randomness for n > 1 should be called strong (n−1)-randomness. The traditional terminology is well-established, however. We will see in Section 8.13 that neither of the implications in Theorem 7.2.7 can be reversed.

In some ways, weak 2-randomness is the weakest level of randomness at which "typical" random behavior happens. A good example of this phenomenon is the following result, which says that the behavior of weakly 2-random sets is very unlike that of computationally powerful 1-random sets such as Ω. It will be generalized below in Theorem 8.13.1.

Theorem 7.2.8 (Downey, Nies, Weber, and Yu [130]). Every weakly 2-random degree forms a minimal pair with 0′.

Proof. Let A be a Δ^0_2 set and R a weakly 2-random set such that A = Φ^R for some reduction Φ. We show that A is computable. The class

S = {X : ∀n ∀s ∃t > s (Φ^X(n)[t]↓ = At(n))}

is a Π^0_2 class containing R, and hence has positive measure. (Here, the oracle in the computation Φ^X(n)[t] is always X, of course, since X is not being approximated in any sense.) We apply a majority vote argument. Let r be a rational such that μ(S)/2 < r < μ(S).
Then there is a finite set F ⊂ 2^{<ω} such that μ(F) < μ(S) and μ(F ∩ S) > r. To compute A(n), we search for a finite set G ⊂ 2^{<ω} with μ(G) > r and G ⊆ F such that for any σ, τ ∈ G, we have Φ^σ(n)↓ = Φ^τ(n)↓. Such a set exists because μ(F ∩ S) > r. Then A(n) = Φ^σ(n) for any σ ∈ G, since otherwise G ∩ S = ∅, which is impossible because then μ(F ∩ S) ≤ μ(F \ G) < μ(S) − r < r. Thus A is computable. ∎

Corollary 7.2.9 (Martin, see Solovay [371]). There is no Δ^0_2 weakly 2-random set.
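The majority-vote search in the proof of Theorem 7.2.8 can be made concrete on a finite example. This is a toy sketch only: the table `phi_n` of converged oracle computations, the set `F`, and the threshold `r` are hypothetical stand-ins, not data from the actual proof.

```python
from fractions import Fraction
from itertools import combinations

def mu(strings):
    # Measure of the clopen set generated by a prefix-free set of strings.
    return sum(Fraction(1, 2 ** len(s)) for s in strings)

def majority_vote(F, phi_n, r):
    # Search for a finite subset G of F with mu(G) > r on which the
    # (assumed convergent) values phi_n[sigma] all agree, and return
    # that common value, mirroring the search in the proof.
    for size in range(1, len(F) + 1):
        for G in combinations(F, size):
            values = {phi_n[sigma] for sigma in G}
            if len(values) == 1 and mu(G) > r:
                return values.pop()
    return None  # no suitable G found inside F

# Hypothetical converged computations Phi^sigma(n) for sigma in F:
phi_n = {"000": 1, "001": 1, "010": 1, "011": 1, "100": 1, "101": 0}
F = sorted(phi_n)
r = Fraction(3, 8)  # a rational strictly between mu(S)/2 and mu(S)
value = majority_vote(F, phi_n, r)
```

Since the strings voting for the value 1 carry measure 5/8 > r, any agreeing set of measure more than r must vote 1, which is the point of choosing r above half of μ(S).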
Corollary 7.2.10 (Downey, Nies, Weber, and Yu [130]). There is no universal generalized Martin-Löf test.

Proof. If there were such a test then, taking the complement of any of its components of measure less than 1, we would have a nonempty Π^0_1 class containing only weakly 2-random sets. But then there would be a Δ^0_2 weakly 2-random set. ∎

We can combine Theorem 7.2.8 with the following result to obtain a characterization of the weakly 2-random sets within the 1-random sets.

Theorem 7.2.11 (Hirschfeldt and Miller, see [130]). For any Σ^0_3 class S of measure 0, there is a noncomputable c.e. set A such that A ≤T X for all 1-random X ∈ S.^6

Proof. Let {U^i_n : i, n ∈ N} be uniformly Σ^0_1 sets such that S = ⋃i ⋂n U^i_n and limn μ(U^i_n) = 0 for all i. For each e, choose a fresh large ne. If we ever reach a stage s such that μ(U^i_{ne,s}) > 2^{−e} for some i ≤ e, then redefine ne to be a fresh large number and restart. If ne ever enters We, then put ne into A and stop the action of e. Clearly, A is c.e., and each ne reaches a final value, whence We ≠ A for all e, so A is noncomputable.

Now suppose that X ∈ S is 1-random. Let i be such that X ∈ ⋂n U^i_n. Let T = {U^i_{ne,s} : e ≥ i ∧ ne enters A at stage s}. Then the sum of the measures of the elements of T is bounded by Σe≥i 2^{−e}, so T is a Solovay test. Thus X is in only finitely many elements of T. So for all but finitely many n, if n enters A at stage s then X ∉ U^i_{n,s}. Let B be the set of all n such that n ∈ As for the least s such that X ∈ U^i_{n,s}. Then B is X-computable and A =∗ B, so A is X-computable. ∎

It is worth noting that this result cannot be substantially improved. By Corollary 8.12.2 below, no noncomputable set can be computed by positive measure many sets. By Theorem 2.19.10 applied to a Π^0_1 class of 1-random sets, for any c.e. set A, there is a Δ^0_2 (and even a low) 1-random set that does not compute A, and the Δ^0_2 sets are a Σ^0_4 class of measure 0.

Corollary 7.2.12 (Downey, Nies, Weber, and Yu [130]).
For a 1-random set A, the following are equivalent.

(i) A is weakly 2-random.

(ii) The degree of A forms a minimal pair with 0′.

(iii) A does not compute any noncomputable c.e. set.

In Corollaries 8.11.8 and 8.11.10, we will see that each hyperimmune degree contains a weakly 1-random set, and indeed one that is not Schnorr random. On the hyperimmune-free degrees, however, weak 1-randomness implies 1-randomness, as we will see in Theorem 8.11.11. The precise classification of the weakly 1-random degrees remains open.

^6 We will revisit this result in Section 11.8, where we will see that it leads to a notion of randomness theoretic weakness, that of diamond classes, which will be of particular interest in Chapter 14.
7.2.2 Characterizations of weak 1-randomness

There is also a martingale characterization of weak 1-randomness.

Theorem 7.2.13 (Wang [405]). A is not weakly 1-random iff there is a computable martingale d and a computable order h such that d(A ↾ n) > h(n) for all n.^7

^7 Note that this condition is equivalent to saying that there is a computable martingale d and a computable order g such that d(A ↾ n) > g(n) for almost all n, since we can replace the first few values of g by 0 to obtain a new order h such that d(A ↾ n) > h(n) for all n.

Proof. We follow the proof in Downey, Griffiths, and Reid [111]. Suppose that A is not weakly 1-random, as witnessed by the Kurtz null test {Si}i∈ω, where the Si are uniformly computable, finite, and prefix-free. Let g(i) be the length of the longest string in Si. Note that g is computable, and g(i) ≥ i. Let

wσ(τ) = 1 if σ ⪯ τ, wσ(τ) = 2^{−(|σ|−|τ|)} if τ ≺ σ, and wσ(τ) = 0 otherwise,

and let

d(τ) = Σn Σσ∈Sn wσ(τ).

It is straightforward to check that each wσ is a martingale, and Σσ∈Sn wσ(λ) = μ(Sn) ≤ 2^{−n}, so d is a martingale. Furthermore, we claim that d is computable. To see that this is the case, notice that, for any k, strings in Sk increase d(λ) by at most μ(Sk) ≤ 2^{−k}, so these strings increase d(τ) by at most 2^{|τ|−k}. Thus, for any s, if we let k = |τ| + s, then Σn≤k Σσ∈Sn wσ(τ) is within 2^{−s} of d(τ).

Let h(n) = |{j : g(j) ≤ n}|. Since g(j) ≥ j for all j and g is computable, h is computable. Furthermore, h is non-decreasing and unbounded, so it is a computable order. Now fix n. Since A is in all Si, for any j with g(j) ≤ n there is a σj ∈ Sj such that σj ⪯ A ↾ n. Thus

d(A ↾ n) ≥ Σj:g(j)≤n wσj(A ↾ n) = Σj:g(j)≤n 1 = h(n).

Now suppose that there are a computable martingale d and a computable order h such that d(A ↾ n) ≥ h(n) for all n. By scaling d and h, we may
assume that d(λ) ≤ 1. By Proposition 7.1.2, we can take d to be rational-valued. For each k, let g(k) be the least n such that h(n) ≥ 2^{k+1}. Note that g is total and computable. Let Sk = {σ ∈ 2^{g(k)} : d(σ) > 2^k}. The Sk are finite and uniformly computable, and μ(Sk) ≤ 2^{−k}, by Kolmogorov's Inequality (Theorem 6.3.3). Thus {Sk}k∈ω is a Kurtz null test. For each k, we have d(A ↾ g(k)) ≥ h(g(k)) ≥ 2^{k+1}, so A ∈ Sk for all k, and hence A is not weakly 1-random. ∎

Downey, Griffiths, and Reid [111] gave a machine characterization of weak 1-randomness, along the lines of the one for Schnorr randomness in Theorem 7.1.15. That is, it uses initial segment complexity but measured with respect to a restricted class of machines.

Definition 7.2.14 (Downey, Griffiths, and Reid [111]). A computably layered machine is a prefix-free machine M for which there is a computable function f : ω → (2^{<ω})^{<ω} such that

(i) ⋃i f(i) = dom M.

(ii) If σ ∈ f(i+1), then there is a τ ∈ f(i) such that M(τ) ⪯ M(σ).

(iii) If σ ∈ f(i), then |M(σ)| = |σ| + i + 1.

The idea of a computably layered machine is that each layer f(i) of the domain provides a layer of the range, and the elements of the range become more refined as i increases.

For the next theorem, recall that if σ is not in the range of a machine M, then KM(σ) = ∞.

Theorem 7.2.15 (Downey, Griffiths, and Reid [111]). A is weakly 1-random iff for all computably layered machines M, we have KM(A ↾ n) ≥ n − O(1).

Proof. Suppose that A is not weakly 1-random, as witnessed by a Kurtz null test {Si}i∈ω, where the Si are finite, prefix-free, and uniformly computable, and Si ⊇ Si+1 for all i. By replacing strings by sufficiently long extensions, we may assume that the Si are pairwise disjoint and that for each τ ∈ Si+1 there is a σ ∈ Si such that σ ≺ τ. We have

Σn Σσ∈S2n+2 2^{−(|σ|−(n+1))} = Σn 2^{n+1} μ(S2n+2) ≤ Σn 2^{n+1} 2^{−2n−2} = 1,
so by the KC Theorem, there is a prefix-free machine M such that, for each n and σ ∈ S2n+2, there is a τ of length |σ| − (n+1) such that M(τ) = σ. Let f(n) = {τ : ∃σ ∈ S2n+2 (M(τ) = σ)}. We claim that f witnesses the fact that M is computably layered. First, f is computable, since S2n+2 is finite and the KC Theorem is effective, and clearly dom(M) = ⋃n f(n). By our assumptions on the Sn, for all τ ∈ f(n+1), there is a σ ∈ f(n) such that M(σ) ≺ M(τ). Finally, let τ ∈ f(n). Then it follows directly from the definition of M that |M(τ)| = |τ| + n + 1.
Thus M is a computably layered machine. Since A is in every Si, for each i there is an n such that A ↾ n ∈ S2i+2, whence KM(A ↾ n) ≤ n − (i+1).

Now suppose there is a computably layered machine M such that for each d there is an n with KM(A ↾ n) < n − d. Let f be a function witnessing that M is computably layered. Let Sk be the set of all σ such that KM(σ) < |σ| − k but KM(τ) ≥ |τ| − k for all τ ≺ σ. Then

μ(Sk) = Σσ∈Sk 2^{−|σ|} ≤ Σσ∈Sk 2^{−KM(σ)−k} ≤ 2^{−k}.
Suppose that τ ∈ f(k) and let σ = M(τ). Then |σ| = |M(τ)| = |τ| + k + 1, so KM(σ) ≤ |τ| = |σ| − k − 1, so σ ∈ Sk. Conversely, if σ ∈ Sk then there is a τ such that M(τ) = σ and |τ| < |σ| − k. For such a τ, we have |M(τ)| = |σ| > |τ| + k, so τ ∉ f(j) for j < k. Thus τ ∈ f(j) for some j ≥ k. By the properties of f, there is a ρ ∈ f(k) such that M(ρ) ⪯ M(τ) = σ. But since no proper initial segment of σ is in Sk, in fact M(ρ) = σ. So Sk = {M(τ) : τ ∈ f(k)}, and hence the Sk are finite and uniformly computable. Thus {Sk}k∈ω is a Kurtz null test. Furthermore, by the choice of M, the set A is in every element of this test, and hence is not weakly 1-random. ∎

In Theorem 7.1.15, we saw that Schnorr randomness can be characterized in terms of computable measure machines. It is an indication of the close relationship between Schnorr randomness and weak 1-randomness that this class of machines can also be used to characterize weak 1-randomness.

Theorem 7.2.16 (Downey, Griffiths, and Reid [111]). A is not weakly 1-random iff there is a computable measure machine M and a computable function f such that for all n, we have KM(A ↾ f(n)) < f(n) − n.

Proof. Suppose that A is not weakly 1-random, so there is a Kurtz null test {Sn}n∈ω such that A ∈ ⋂n Sn, where the Sn are finite and uniformly computable. We may assume that for each Sn, all the strings in Sn have the same length g(n), where g is computable. As shown in the proof of Theorem 7.2.15,

Σn Σσ∈S2n+2 2^{−(|σ|−(n+1))} ≤ 1,

so by the KC Theorem, there is a prefix-free machine M such that, for each n and σ ∈ S2n+2, there is a τ of length |σ| − (n+1) such that M(τ) = σ. Let α = Στ∈dom M 2^{−|τ|}. Then

α = Σn Σσ∈S2n+2 2^{−(|σ|−(n+1))} = Σn 2^{n+1} μ(S2n+2).

But for any s,

Σn>s 2^{n+1} μ(S2n+2) ≤ Σn>s 2^{n+1−2n−2} = 2^{−s−1},

so Σn≤s 2^{n+1} μ(S2n+2) is an approximation to α to within 2^{−s}, and hence α is computable. Thus M is a computable measure machine.
Let f(n) = g(2n+2). Since A ∈ ⋂n Sn, for each n we have A ↾ g(2n+2) ∈ S2n+2, whence
KM(A ↾ f(n)) = KM(A ↾ g(2n+2)) ≤ g(2n+2) − (n+1) < f(n) − n.

Now suppose we have a computable measure machine M and a computable function f such that KM(A ↾ f(n)) < f(n) − n for all n. Let Sn = {σ ∈ 2^{f(n)} : KM(σ) < f(n) − n}. As in the proof of Theorem 7.2.15, μ(Sn) ≤ 2^{−n}. Furthermore, the Sn are finite and uniformly computable, since the domain of a computable measure machine is computable. Thus {Sn}n∈ω is a Kurtz null test, and by choice of M, we have A ∈ ⋂n Sn, so A is not weakly 1-random. ∎

We finish this section with a Solovay-type characterization of weak 1-randomness.

Definition 7.2.17 (Downey, Griffiths, and Reid [111]). A Kurtz-Solovay test is a pair (f, V) consisting of a computable function f and a computable collection V = {Vi}i∈ω of finite, uniformly computable sets of strings such that Σn μ(Vn) is finite and computable. We say that A fails a Kurtz-Solovay test if for all n, the set A is in at least n many of V0, ..., Vf(n).

Note that given any Kurtz-Solovay test (f, V), a new Kurtz-Solovay test (n ↦ n, V̂) can be defined via V̂k = Vf(k) ∪ ··· ∪ Vf(k+1)−1.

Theorem 7.2.18 (Downey, Griffiths, and Reid [111]). A is weakly 1-random iff A does not fail any Kurtz-Solovay test.

Proof. Suppose that A is not weakly 1-random, so there is a Kurtz null test {Vn}n∈ω such that A ∈ ⋂n Vn. It is easy to check that (n ↦ n, {Vn}n∈ω) is a Kurtz-Solovay test, and since A is in every Vn, it fails this test.

Now suppose that A fails a Kurtz-Solovay test (f, {Vn}n∈ω). Assume without loss of generality that Σn μ(Vn) ≤ 1. Let Tk be the set of all σ for which there are at least 2^k many Vi with i ≤ f(2^k) such that σ extends an element of Vi. Let Sk be the set of all σ ∈ Tk such that τ ∉ Tk for all τ ≺ σ. Then Σn μ(Vn) ≥ 2^k μ(Sk), so μ(Sk) ≤ 2^{−k}. Furthermore, the Sk are finite and uniformly computable, since the Vn are, so {Sk}k∈ω is a Kurtz null test.
Since A fails the Kurtz-Solovay test (f, {Vn}n∈ω), for each k, the set A is in at least 2^k many of V0, V1, ..., Vf(2^k), so there is an m for which there are at least 2^k many i ≤ f(2^k) such that A ↾ m extends a string in Vi. Thus A ∈ ⋂k Sk, and hence A is not weakly 1-random. ∎
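The regrouping described in the note after Definition 7.2.17 is a purely finite operation, so it can be sketched directly. The components `V` and bound `f` below are hypothetical examples, chosen only to make the regrouping visible.

```python
def regroup(f, V, how_many):
    # From a Kurtz-Solovay test (f, V), build one with the identity
    # bound via hatV_k = V_{f(k)} | V_{f(k)+1} | ... | V_{f(k+1)-1}.
    # V maps an index to a finite set of strings; f is strictly increasing.
    return [set().union(*(V(i) for i in range(f(k), f(k + 1))))
            for k in range(how_many)]

# Hypothetical test components V_i and bound f(n) = 2n:
V = lambda i: {format(i, "b")}
f = lambda n: 2 * n
hatV = regroup(f, V, 3)
```

Being in at least n of the original components up to index f(n) then translates into being in at least k of the regrouped components up to index k, which is why the identity bound suffices.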
7.2.3 Schnorr randomness via Kurtz null tests

Schnorr randomness can also be characterized in terms of Kurtz null tests.
Definition 7.2.19 (Downey, Griffiths, and Reid [111]). A Kurtz array is a uniform family of Kurtz null tests {V^n_j}n,j∈ω such that μ(V^n_j) ≤ 2^{−(j+n+1)} for all n and j. The uniformity in this definition means that the V^n_j are uniformly Σ^0_1 classes and that the functions witnessing that the tests {V^n_j}j∈ω are Kurtz null tests are uniformly computable.

Theorem 7.2.20 (Downey, Griffiths, and Reid [111]). A set A is Schnorr random iff for all Kurtz arrays {V^n_j}n,j∈ω, there is an n such that A ∉ V^n_j for all j.

Proof. Suppose that A is not Schnorr random, and let {Un}n∈ω be a Schnorr test with μ(Un) = 2^{−n} such that A ∈ ⋂n Un. Let sj be the least stage s such that μ(Un[s]) ≥ 2^{−n} − 2^{−(n+j+1)} and let V^n_j = Un[sj+1] \ Un[sj] for all n and j. Then

μ(V^n_j) = μ(Un[sj+1]) − μ(Un[sj]) ≤ 2^{−n} − (2^{−n} − 2^{−(n+j+1)}) = 2^{−(n+j+1)}.

It is clear that the other conditions for being a Kurtz array hold of {V^n_j}n,j∈ω, so this family is a Kurtz array. Since A ∈ ⋂n Un and ⋃j V^n_j = Un, for each n there is a j such that A ∈ V^n_j.

Conversely, suppose we have a Kurtz array {V^n_j}n,j∈ω such that for each n there is a j with A ∈ V^n_j. Let Un = ⋃j V^n_j. Then the Un are uniformly Σ^0_1 and the measures μ(Un) are uniformly computable, since μ(Un) − μ(⋃j≤c V^n_j) ≤ 2^{−c}. Furthermore, μ(Un) ≤ Σj μ(V^n_j) ≤ Σj 2^{−(n+j+1)} = 2^{−n}. Thus {Un}n∈ω is a Schnorr test, and clearly A ∈ ⋂n Un. ∎

Nicholas Rupprecht noticed that the characterization above allows for another elegant characterization in terms of a variation on Kurtz null tests.

Definition 7.2.21 (Rupprecht [personal communication]). A finite total Solovay test is a sequence {Sn}n∈ω of uniformly computable clopen sets (i.e., Sn is the clopen set generated by Dg(n), where g is computable) such that Σn μ(Sn) is finite and computable. A set A passes this test if it is in only finitely many of the Sn.

Notice that this is the Kurtz-style analog of Downey and Griffiths' notion of total Solovay test from [109] (see Definition 7.1.9).
Indeed, we have avoided the term total Kurtz-Solovay test only to avoid confusion with the Kurtz-Solovay tests from Definition 7.2.17. The next result should be compared with Theorem 7.1.10. Corollary 7.2.22 (Rupprecht [personal communication]). A set is Schnorr random iff it passes all finite total Solovay tests.
Proof. One direction is clear from Theorem 7.1.10, since every finite total Solovay test is a total Solovay test.

For the other direction, suppose that A passes all finite total Solovay tests. Let {V^n_j}n,j∈ω be a Kurtz array. This array is clearly a finite total Solovay test, so A is in only finitely many of its elements, and hence there is an n such that A ∉ V^n_j for all j. ∎
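The uniform computability of the measures μ(Un) in the proof of Theorem 7.2.20 rests on a finite tail bound, which a toy computation can illustrate. The Kurtz-array column `Vn` below is a hypothetical example satisfying the measure bound of Definition 7.2.19; the function names are not notation from the text.

```python
from fractions import Fraction

def mu(strings):
    # Measure of the generated open set; the layers used below are
    # pairwise disjoint sets of pairwise incomparable strings.
    return sum(Fraction(1, 2 ** len(s)) for s in strings)

def approx_measure(Vn, c):
    # Approximate mu(U_n) = mu(union over j of V^n_j) to within 2^{-c}
    # by summing finitely many layers, as in the proof of Theorem 7.2.20.
    return sum(mu(Vn(j)) for j in range(c + 1))

# A hypothetical Kurtz-array column for n = 1: V^1_j holds one string of
# length j + 2, so mu(V^1_j) = 2^{-(j+2)} and mu(U_1) = 1/2 exactly.
Vn = lambda j: {"1" * (j + 1) + "0"}
approx = approx_measure(Vn, 5)
error = Fraction(1, 2) - approx
```

The omitted layers have total measure at most the geometric tail, so the finite sum is already an approximation of the required quality.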
7.2.4 Weakly 1-random left-c.e. reals

There are no weakly 1-random c.e. sets. In fact, we have the following, where a set is bi-immune if neither it nor its complement contains an infinite c.e. set.

Theorem 7.2.23 (Jockusch, see Kurtz [228]). If A is weakly 1-random then A is bi-immune.

Proof. Let W be an infinite c.e. set. Let U = {X : ∃n (n ∈ W ∧ X(n) = 0)}. Then U is a Σ^0_1 class of measure 1, so every weakly 1-random set must be in U. The same argument applies to the complement of a weakly 1-random set. ∎

Jockusch [186] showed that there are continuum many degrees that contain no bi-immune sets, so we have the following.

Corollary 7.2.24 (Jockusch, see Kurtz [228]). There are continuum many degrees that contain no weakly 1-random sets.

While no c.e. set can be weakly 1-random, the same is not true of c.e. degrees. Of course we know that the complete c.e. degree is 1-random, and hence weakly 1-random, but in fact, as shown by Kurtz [228], every nonzero c.e. degree contains a weakly 1-random set. The following is an extension of that result. (See Theorem 8.13.3 for a relativized form of this theorem.)

Theorem 7.2.25 (Downey, Griffiths, and Reid [111]). Every nonzero c.e. degree contains a weakly 1-random left-c.e. real.

Proof sketch. Let B be a noncomputable c.e. set. We build a weakly 1-random left-c.e. real α ≡T B. The construction combines a technique for avoiding test sets with permitting and coding. We use an enumeration of partial Kurtz null tests U^0, U^1, ... and ensure that, for each U^i that really is a Kurtz null test, α is not in the null set determined by that test. A permitting condition ensures that if Bs ↾ n = B ↾ n then αs ↾ n = α ↾ n. We build α by defining sets [0, 1] ⊇ X0 ⊇ X1 ⊇ ··· and letting αs = inf Xs. We have two kinds of requirements:

Ri : If U^i is a Kurtz null test then ∃k, s (Xs ∩ U^i_k = ∅)

and

Pj : ∃n ∀i < j (α(n − j + i) = B(i − 1)).
The latter requirements code B into α, although it is still necessary to argue that the coding locations can be recovered from α. Each requirement Ri has infinitely many strategies Ri,j. The strategy Re with e = ⟨i, j⟩ deals with a test set U^i_k, where k is determined during the construction to make sure the maximum possible measure of the test set is sufficiently small.

The strategy for a single strategy Re is as follows. If τe is not defined, let τe = τe−1 0 (the string τe−1 defined by previous requirements, followed by a 0), and choose k large enough so that μ(U^i_k) < 2^{−|τe|−2}. If Re never acts and is never initialized then the entire construction takes place within the cone of extensions of τe, producing a real α extending τe. We have the cone of τe[s] contained in Xs at each stage s, but the τe are not "permanent" like the Xs are, in the sense that while Xs+1 ⊆ Xs for all s, we sometimes have τe[s+1] not extending τe[s]. The strategy Re requires attention if U^i_k becomes nonempty. Once Re requires attention, it acts if Bs ↾ |τe| changes, at which point Re defines Xs+1 by removing from Xs both U^i_k and the cone of extensions of τe, causing the construction to move out of τe. For a nonempty string τ, let τ+ be the string obtained by changing the last bit of τ from 0 to 1 or vice versa. Since μ(U^i_k) < 2^{−|τe|−2}, there is a σe ⪰ τe+ whose entire cone of extensions is contained in Xs. The construction then continues within such a cone.

The strategy for a single requirement Pe is as follows. Given τe, extend it by e many bits to code B ↾ e. This action extends the string τe defined by Re. When defining τe+1, the strategy Re+1 uses this extended version of τe.

In the construction we use an additional notion of satisfaction: If Re acts, where e = ⟨i, j⟩, then we declare Re′ to be satisfied for all e′ = ⟨i, j′⟩ with j′ ∈ N. A strategy Re remains satisfied for the rest of the construction, and hence never requires attention or acts, unless it is initialized by the action of a stronger priority strategy. When a strategy Re is initialized, its string τe is canceled, as is its choice of k.
When a strategy for a requirement Pe is initialized it simply ceases to have any impact in updating strings in the construction. The measures of the U^i_k are chosen so that at no time can we run out of space to meet the requirements. The construction is now a reasonably straightforward application of the finite injury priority method with permitting and coding. See [111] for details. ∎
7.2.5 Solovay genericity and randomness

Solovay [370] used forcing with closed sets of positive measure in his construction of a model of set theory (without the full axiom of choice but with dependent choice) in which every set of reals is Lebesgue measurable. For those familiar with the basics of set-theoretic forcing, we note that
the generic extension of a ground model M obtained by Solovay forcing is of the form M[x] for a random real x, which has the property that it is not contained in any null Borel set coded in M. (See Jech [185] for more details.) The connection between this notion and our notions of algorithmic randomness should be clear. Kautz [200] made it explicit by giving a miniaturization (i.e., an effectivization) of Solovay's notion that yields a characterization of weak n-randomness in terms of a forcing relation, in the same way that Cohen forcing is miniaturized to yield the notion of n-genericity. Thus, while there is no correlation between Cohen forcing and randomness,^8 we do have a notion of forcing related to algorithmic randomness.

Let Pn denote the partial ordering of Π^0_n classes of positive measure under inclusion. We work in the language of arithmetic with an additional set constant X and a membership symbol ∈. For a sentence ϕ in this language and a set A, we write A ⊨ ϕ to mean that ϕ holds when X is interpreted as A and all other symbols are interpreted in the usual way.

Definition 7.2.26 (Kautz [200]).

(i) For T ∈ Pn, we write T ⊩ ϕ if A ⊨ ϕ for all A ∈ T.

(ii) For a set A, we write A ⊩ ϕ if there is a T ∈ Pn such that A ∈ T and T ⊩ ϕ.

(iii) A set A is Solovay n-generic if for every Σ^0_n formula ϕ, either A ⊩ ϕ or A ⊩ ¬ϕ.

Kautz [200] showed that this notion is well-behaved.

Proposition 7.2.27 (Kautz [200]).

(i) (Monotonicity) If T ⊩ ϕ then for all T′ ⊂ T in Pn, we have T′ ⊩ ϕ.

(ii) (Consistency) It is not the case that T ⊩ ϕ and T ⊩ ¬ϕ.

(iii) (Quasi-completeness) For every Σ^0_n sentence ϕ and each T ∈ Pn, there is an S ⊆ T in Pn such that either S ⊩ ϕ or S ⊩ ¬ϕ.

(iv) (Forcing = Truth) A is Solovay n-generic iff for each Σ^0_n or Π^0_n sentence ϕ, we have A ⊩ ϕ iff A ⊨ ϕ.

Quasi-completeness is interesting to us in that it can be proved using algorithmic randomness: Let U0, U1, ... be a universal Martin-Löf test relative to ∅^{(n−1)}. Let Pi be the complement of Ui. Since the measures of the Pi go to 1, there must be an i such that T ∩ Pi ∈ Pn. Let S = T ∩ Pi. If A ⊨ ϕ for all A ∈ S we are done, since then S ⊩ ϕ. Otherwise the Π^0_n class S′ = S ∩ {A : A ⊨ ¬ϕ} is nonempty. Since S′ contains (only) n-randoms, it must be in Pn, and S′ ⊩ ¬ϕ.

^8 We will explore the relationship between algorithmic randomness and n-genericity in Chapter 8.
Kautz [200] also showed that Solovay n-genericity exactly characterizes weak n-randomness.

Theorem 7.2.28 (Kautz [200]). A set is Solovay n-generic iff it is weakly n-random.

Proof. Suppose that A is not weakly n-random. Then A is in some Π^0_n class S of measure 0. Let ϕ be a Π^0_n formula such that S = {B : B ⊨ ϕ}. Since A ⊨ ϕ, we cannot have A ⊩ ¬ϕ. But since S has measure 0, there is no T ∈ Pn such that T ⊩ ϕ. Thus A is not Solovay n-generic.

Now suppose that A is weakly n-random and let ϕ be a Σ^0_n sentence. Let S = {B : B ⊨ ϕ}. The class S is a union of Π^0_{n−1} classes C0, C1, .... If A ∈ S, then A is in some Ci. This Ci must have positive measure, so Ci ∈ Pn and Ci ⊩ ϕ. Thus A ⊩ ϕ. If A ∉ S then A is in the Π^0_n class S̄ = {B : B ⊨ ¬ϕ}. Again, S̄ ∈ Pn, and S̄ ⊩ ¬ϕ. Thus A ⊩ ¬ϕ. ∎
7.3 Decidable machines

Bienvenu and Merkle [40] introduced a class of machines that can be used to classify most of the randomness notions we have met so far, in particular, weak 1-randomness, Schnorr randomness, and 1-randomness.

Definition 7.3.1 (Bienvenu and Merkle [40]). A machine M is decidable if dom(M) is computable.

Any computable measure machine is decidable, as is any bounded machine. As remarked following the proof of Theorem 2.19.2, any Σ^0_1 class can be generated by a computable set of strings, so Martin-Löf tests can be thought of as determined by uniformly computable sets of strings. Then if we follow the KC Theorem version of the proof of Schnorr's Theorem 6.2.3, which converts tests to machines, it is not hard to show the following.

Theorem 7.3.2 (Bienvenu and Merkle [40]). A set X is 1-random iff KM(X ↾ n) ≥ n − O(1) for all decidable prefix-free machines M. Moreover, there is a fixed decidable prefix-free machine N such that X is 1-random iff KN(X ↾ n) ≥ n − O(1).^9

Note that KM for a decidable M is a particular example of a computable function f with Σσ 2^{−f(σ)} ≤ 1, so the above result can be restated as follows.

^9 Another way to prove this result is to notice that, given a prefix-free machine M, we can define a decidable prefix-free machine N by waiting until σ enters dom(M) at stage s and then letting N(στ) = M(σ)τ for all τ ∈ 2^s. For any X, if KM(X ↾ n) ≤ n − c then there is an s such that KN(X ↾ (n+s)) ≤ n + s − c.
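The padding trick in footnote 9 can be sketched on finite tables. The enumeration data below (`entry_stage`, the dictionary `M`) are hypothetical stand-ins for a stagewise enumeration of dom(M); the point is only that membership in dom(N) is decided by checking finitely many splits.

```python
# From a prefix-free machine M whose domain is enumerated in stages,
# define N(sigma tau) = M(sigma) tau whenever sigma entered dom(M)
# at stage s = |tau|.  Both tables below are hypothetical.

entry_stage = {"00": 1, "01": 2, "11": 1}   # sigma -> stage it entered dom(M)
M = {"00": "1", "01": "101", "11": ""}      # sigma -> M(sigma)

def N(rho):
    # rho is in dom(N) iff rho splits as sigma + tau with sigma entering
    # dom(M) at stage exactly len(tau); this is decidable from the tables.
    for cut in range(len(rho) + 1):
        sigma, tau = rho[:cut], rho[cut:]
        if entry_stage.get(sigma) == len(tau):
            return M[sigma] + tau
    return None  # rho is not in dom(N)
```

Padding by the entry stage costs nothing in the complexity comparison, since both the input and the output grow by the same amount s.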
Corollary 7.3.3. A set X is 1-random iff for all computable functions f with $\sum_n 2^{-f(n)} < \infty$, we have $f(X \upharpoonright n) \geq n - O(1)$.

For Schnorr randomness, we have the following.

Theorem 7.3.4 (Bienvenu and Merkle [40]). A set X is Schnorr random iff for all decidable prefix-free machines M and all computable orders g, we have $K_M(X \upharpoonright n) \geq n - g(n) - O(1)$.

Proof. Suppose that X is not Schnorr random. Then by Definition 7.1.4, there is a computable martingale d and a computable order h such that $d(X \upharpoonright n) \geq h(n)$ for infinitely many n. Let g be a computable order with g(n) = o(h(n)). We may assume that $d(\lambda) \leq 1$. Let $A_k$ be the set of strings σ such that $d(\sigma) \geq 2^{k+1} g(|\sigma|)$ but $d(\tau) < 2^{k+1} g(|\tau|)$ for all $\tau \prec \sigma$. Then every $A_k$ contains a prefix of X. By Kolmogorov's Inequality (Theorem 6.3.3),

$2^{k+1} \sum_{\sigma \in A_k} 2^{-|\sigma| + \log(g(|\sigma|))} = \sum_{\sigma \in A_k} 2^{-|\sigma| + k + 1 + \log(g(|\sigma|))} \leq \sum_{\sigma \in A_k} 2^{-|\sigma|} d(\sigma) \leq 1.$

Thus

$\sum_k \sum_{\sigma \in A_k} 2^{-|\sigma| + \log(g(|\sigma|))} \leq 1.$
Therefore, we can apply the KC Theorem to get a prefix-free machine M corresponding to the requests $(|\sigma| - \log(g(|\sigma|)), \sigma)$ for all k and $\sigma \in A_k$. Since $d(\lambda) \leq 1$, for all σ we have $d(\sigma) \leq 2^{|\sigma|}$, so if $\sigma \in A_k$ then $|\sigma| - \log(g(|\sigma|)) \geq k$. Thus, to check whether $\tau \in \operatorname{dom} M$, it suffices to consider $A_k$ for $k \leq |\tau|$. For each such k, there are only finitely many σ with $|\sigma| - \log(g(|\sigma|)) = |\tau|$. Once we have made requests for all such σ, we know τ can no longer be added to the domain of M. Hence M is a decidable machine, and since $K_M(\sigma) \leq |\sigma| - \log(g(|\sigma|))$ for all $\sigma \in A_k$, there are infinitely many n such that $K_M(X \upharpoonright n) \leq n - \log(g(n))$. Taking a computable order $g'$ with $g'(n) = o(\log g(n))$, for each c there is an n such that $K_M(X \upharpoonright n) < n - g'(n) - c$.

For the converse, suppose we have a computable order g and a prefix-free decidable machine M such that for each k we have $K_M(X \upharpoonright n) < n - g(n) - k$ for infinitely many n. We may assume that g is $o(\log n)$ without loss of generality. For each k let $A_k$ be the set of σ such that $K_M(\sigma) < |\sigma| - g(|\sigma|) - k$ but $K_M(\tau) \geq |\tau| - g(|\tau|) - k$ for all $\tau \prec \sigma$. The $A_k$ form a uniformly computable family of prefix-free subsets of $2^{<\omega}$ such that, for all k,

$\sum_{\sigma \in A_k} 2^{-|\sigma| + g(|\sigma|)} \leq \sum_{\sigma \in A_k} 2^{-K_M(\sigma) - k} \leq 2^{-k}.$

We can then define a martingale d as follows.
7. Other Notions of Algorithmic Randomness
Let $d^p_\sigma$ be the martingale that doubles its capital along σ except for the first p many bits. That is, let $d^p_\sigma(\tau) = 1$ if $|\tau| \leq p$; let $d^p_\sigma(\tau) = 2^{\min(|\sigma|,|\tau|) - p}$ if $p < |\tau|$ and $\tau(p) \ldots \tau(\min(|\sigma|,|\tau|) - 1) = \sigma(p) \ldots \sigma(\min(|\sigma|,|\tau|) - 1)$; and let $d^p_\sigma(\tau) = 0$ for all other τ. Let

$d(\tau) = \sum_k \sum_{\sigma \in A_k} 2^{-|\sigma| + g(|\sigma|)} d^{g(|\sigma|)/2}_\sigma(\tau).$

Clearly, d is a martingale. Furthermore,

$d(\lambda) = \sum_k \sum_{\sigma \in A_k} 2^{-|\sigma| + g(|\sigma|)},$

and since $\sum_{\sigma \in A_k} 2^{-|\sigma| + g(|\sigma|)} \leq 2^{-k}$, we can computably approximate $d(\lambda)$, which is thus a computable real. Now we show that if we can compute $d(\tau)$ then we can compute $d(\tau i)$ for i = 0, 1. We have

$d(\tau i) - d(\tau) = \sum_k \sum_{\sigma \in A_k} 2^{-|\sigma| + g(|\sigma|)} \bigl(d^{g(|\sigma|)/2}_\sigma(\tau i) - d^{g(|\sigma|)/2}_\sigma(\tau)\bigr).$

This sum is actually a finite sum, as

$d^{g(|\sigma|)/2}_\sigma(\tau) = d^{g(|\sigma|)/2}_\sigma(\tau i) = 1$

for all σ with $g(|\sigma|) \geq 2|\tau| + 2$. Thus, by induction, d is computable. To finish the proof, note that if $\sigma \in \bigcup_k A_k$, then

$d(\sigma) \geq 2^{-|\sigma| + g(|\sigma|)} d^{g(|\sigma|)/2}_\sigma(\sigma) \geq 2^{g(|\sigma|)/2}.$

But $2^{g(n)/2}$ is a computable order, and there are infinitely many prefixes of X in $\bigcup_k A_k$, so X is not Schnorr random by Definition 7.1.4.

For weak 1-randomness, we have the following.

Theorem 7.3.5 (Bienvenu and Merkle [40]). The following are equivalent.

(i) X is not weakly 1-random.

(ii) There exist a decidable prefix-free machine M and a computable order f such that $C_M(X \upharpoonright f(n)) \leq f(n) - n$ for all n.

(iii) There exist a decidable (plain) machine M and a computable order f such that $C_M(X \upharpoonright f(n)) \leq f(n) - n$ for all n.

Proof. If X is not weakly 1-random, then by Theorem 7.2.16, there is a computable measure machine M and a computable increasing function f such that $C_M(X \upharpoonright f(n)) < f(n) - n$ for all n. This machine is a decidable prefix-free machine, so we have (ii) and (iii). Since (ii) clearly implies (iii), we are left with showing that (iii) implies (i). Assume (iii) and let $B_n = \{\sigma : |\sigma| = f(2n) \wedge C_M(\sigma) \leq |\sigma| - 2n\}$.
Let $d_\sigma$ be the martingale that doubles its capital along σ; that is, $d_\sigma(\tau) = 2^{|\tau|}$ for $\tau \preceq \sigma$ and $d_\sigma(\tau) = 2^{|\sigma|}$ for $\tau \succeq \sigma$, with $d_\sigma(\tau) = 0$ for all other τ. Define the martingale d by

$d(\tau) = \sum_n 2^{-f(2n) + n} \sum_{\sigma \in B_n} d_\sigma(\tau).$

For all n,

$\sum_{\sigma \in B_n} d_\sigma(\tau) \leq |B_n| 2^{|\tau|} \leq 2^{f(2n) - 2n + |\tau| + 1},$

so

$2^{-f(2n) + n} \sum_{\sigma \in B_n} d_\sigma(\tau) \leq 2^{-n + |\tau| + 1},$

and hence d is computable. Furthermore, $X \upharpoonright f(2n) \in B_n$ for all n, and therefore for all $m \geq f(2n)$, we have $d_{X \upharpoonright f(2n)}(X \upharpoonright m) = 2^{f(2n)}$ and hence

$d(X \upharpoonright m) \geq 2^{-f(2n) + n} d_{X \upharpoonright f(2n)}(X \upharpoonright m) \geq 2^n.$

Thus if we let $g(m) = 2^n$ for the largest n such that $f(2n) \leq m$ (or g(m) = 0 if there is no such n), then, by Theorem 7.2.13, g is a computable order showing, together with d, that X is not weakly 1-random.
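The martingale $d_\sigma$ used in this proof is simple enough to implement directly. Here is a minimal Python sketch (the function names are ours), which also checks the fairness condition $d(\tau 0) + d(\tau 1) = 2 d(\tau)$:

```python
def d_sigma(sigma, tau):
    # Martingale that doubles its capital along sigma:
    # 2^|tau| while tau is a prefix of sigma, frozen at 2^|sigma| once tau
    # extends sigma, and 0 as soon as tau leaves sigma.
    if sigma.startswith(tau):      # tau is a (not necessarily proper) prefix of sigma
        return 2 ** len(tau)
    if tau.startswith(sigma):      # tau extends sigma
        return 2 ** len(sigma)
    return 0

def is_martingale_at(sigma, tau):
    # Fairness (martingale) condition at tau: d(tau0) + d(tau1) = 2 d(tau).
    return d_sigma(sigma, tau + "0") + d_sigma(sigma, tau + "1") == 2 * d_sigma(sigma, tau)
```

For example, with σ = 011 the capital grows 1, 2, 4, 8 along the prefixes of σ, stays at 8 on extensions of σ, and drops to 0 on any string that disagrees with σ.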
Using similar methods, Bienvenu and Merkle [40] also obtained the following characterization of weak 1-randomness.

Theorem 7.3.6 (Bienvenu and Merkle [40]). The following are equivalent.

(i) X is not weakly 1-random.

(ii) There exist a decidable prefix-free machine M and a computable order h such that $C_M(X \upharpoonright n) \leq n - h(n)$ for all n.

(iii) There exist a decidable (plain) machine M and a computable order h such that $C_M(X \upharpoonright n) \leq n - h(n)$ for all n.
7.4 Selection revisited

7.4.1 Stochasticity

In this section we return to the roots of the subject, re-examining von Mises' idea of selection. As we have seen in Section 6.5, a selection function is a partial function $f : 2^{<\omega} \to \{\text{yes}, \text{no}\}$. Recall that we think of f as a rule that, given some bits of a sequence, selects the next bit to bet on. Recall also the following notation from Section 6.5. For a sequence A and a selection function f, let $s_f(A, n)$ be the nth number k such that
$f(A \upharpoonright k) = \text{yes}$, if such a number exists. If $s_f(A, n-1)$ is defined, let $S_f(A, n) = \sum_{i < n} A(s_f(A, i))$, the number of 1s among the first n bits of A selected by f. For i = 0, 1 and $k \in \mathbb{N}$, let $d^i_k$ be the martingale that, on each bit selected by f, stakes a $2^{-k}$ fraction of its capital on that bit being i, and bets evenly elsewhere, and let d be a computable martingale with $S[d^i_k] \subseteq S[d]$ for all i and k (for instance, an appropriately weighted sum of the $d^i_k$). Suppose that ε > 0 is such that there are infinitely many n with $n_0 > (\frac{1}{2} + \varepsilon)n$, where $n_0$ and $n_1$ are the numbers of 0s and 1s, respectively, among the first n bits of A selected by f. (The case in which there are infinitely many n with $n_1 > (\frac{1}{2} + \varepsilon)n$ is of course symmetric.) For any such n, let $m = s_f(A, n) + 1$, where recall that $s_f(A, n)$ is the nth place of A selected by f. Then for any k we have

$d^0_k(A \upharpoonright m) = (1 + 2^{-k})^{n_0} (1 - 2^{-k})^{n_1} \geq (1 + 2^{-k})^{(\frac{1}{2} + \varepsilon)n} (1 - 2^{-k})^{(\frac{1}{2} - \varepsilon)n}.$

Thus $\log d^0_k(A \upharpoonright m) \geq n\bigl((\tfrac{1}{2} + \varepsilon) \log(1 + 2^{-k}) + (\tfrac{1}{2} - \varepsilon) \log(1 - 2^{-k})\bigr)$. Now consider the function $h : \mathbb{R} \to \mathbb{R}$ defined by $h(x) = (\tfrac{1}{2} + \varepsilon) \log(1 + x) + (\tfrac{1}{2} - \varepsilon) \log(1 - x)$. We have h(0) = 0, and the derivative of h is $h'(x) = \frac{1/2 + \varepsilon}{1 + x} - \frac{1/2 - \varepsilon}{1 - x}$, so $h'(0) = 2\varepsilon > 0$. Thus, if x > 0 is sufficiently small, h(x) > 0. So if we choose k large enough, then $\log d^0_k(A \upharpoonright m) \geq \delta n$ for some δ > 0 independent of n. Since n can be chosen to be arbitrarily large, $A \in S[d^0_k]$, and hence $A \in S[d]$.
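The growth calculation above is easy to check numerically. Here is a small, purely illustrative Python sketch of the capital of $d^0_k$ along a sequence of selected bits:

```python
def capital_d0k(bits, k):
    # d^0_k stakes a 2^-k fraction of its capital that each (selected) bit is 0:
    # the capital is multiplied by (1 + 2^-k) on a 0 and by (1 - 2^-k) on a 1.
    c = 1.0
    for b in bits:
        c *= (1 + 2 ** -k) if b == 0 else (1 - 2 ** -k)
    return c
```

On a sequence whose selected bits are 60% zeros, the capital grows exponentially for k = 2 (per five bits it is multiplied by $(1.25)^3 (0.75)^2 \approx 1.1$), while on a perfectly balanced sequence it shrinks, exactly as the function h in the proof predicts.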
Thus we have the following facts, which are implicit in the work of Schnorr [349]. A version of the first for a stronger notion of stochasticity appears in Shen [353]. The second is proved in Wang [405].

Theorem 7.4.2 (Schnorr [349], Shen [353], Wang [405]). If A is 1-random then A is von Mises-Wald-Church stochastic. If A is computably random then A is Church stochastic.

So we see that computably random sets satisfy the law of large numbers with respect to computable selection functions. We will show in Corollary 7.4.7 that computable randomness does not imply von Mises-Wald-Church stochasticity. Later, at the end of Section 8.4, we will show that Schnorr randomness does not imply Church stochasticity, a result due to Wang [405].

The converse to Theorem 7.4.2 fails to hold. Indeed, for any countable collection E of selection functions, being stochastic for E is not enough to imply even weak 1-randomness. To see that this is the case, fix E and let α be as in Ville's Theorem 6.5.1. That is, α is stochastic for E, but $\frac{S(\alpha, n)}{n} \leq \frac{1}{2}$ for all n. Then α is not weakly 1-random, because it is contained in the $\Pi^0_1$ class $\{X : \forall n \, (\frac{S(X, n)}{n} \leq \frac{1}{2})\}$, which has measure 0.
7.4.2 Partial computable martingales and a threshold phenomenon for stochasticity

In defining a martingale d based on a partial computable selection function f as in the previous subsection, instead of having $d(\sigma j) = 0$ if $f(\sigma) \uparrow$, we can let $d(\sigma j)$ be undefined in that case. We can also apply the proof of Proposition 7.1.2 to make d rational-valued. This definition makes d into a partial computable martingale.

Definition 7.4.3. A partial computable (super)martingale is a partial computable function $d : 2^{<\omega} \to \mathbb{Q}^{\geq 0}$ such that, if $d(\sigma i)$ is defined, then so are $d(\sigma)$ and $d(\sigma(1-i))$, and d, where defined, satisfies the (super)martingale property. A set is partial computably random if no partial computable martingale succeeds on it.10

This strengthening of the notion of computable randomness was first considered by Ambos-Spies and Wang (see Ambos-Spies [6], just before the bibliography). Partial computable martingales are also discussed in Terwijn [386], where they are attributed to Fortnow, Freivalds, Gasarch, Kummer, Kurtz, Smith, and Stephan [149]. By the comments above, if a set is partial computably random then it is von Mises-Wald-Church stochastic. It is also

10 Note that for a partial computable martingale d to succeed on A, we must have $d(A \upharpoonright n) \downarrow$ for all n.
clear that 1-random sets are partial computably random, since a partial computable martingale d can be turned into a c.e. martingale $\tilde{d}$ by defining $\tilde{d}(\sigma)$ to be 0 while $d(\sigma)$ is undefined and then redefining $\tilde{d}(\sigma)$ to be $d(\sigma)$ once that value converges.

Using this notion, we can show that there are von Mises-Wald-Church stochastic sets whose initial segment complexities are quite far from those of 1-random sets. The following theorem follows a line of work including Daley [86] and Li and Vitányi [248].

Theorem 7.4.4 (Merkle [266], Lathrop and Lutz [236], Muchnik, see [289]11).

(i) There is a computably random A such that for all computable orders f, we have $C(A \upharpoonright n \mid n) \leq f(n)$ for almost all n.

(ii) There is a partial computably random B such that for all computable orders f, we have $C(B \upharpoonright n \mid n) \leq f(n) \log n$ for almost all n.

Proof. We begin by proving part (i). Let $F_s = \{i \leq s : \Phi_i \text{ is an order}\}$. Let $d_0, d_1, \ldots$ be an effective enumeration of all partial computable martingales with initial capital 1, and let $D_s = \{i \leq s : d_i \text{ is total}\}$. Let $n_0 = 0$ and $n_{s+1} = \min\{n > n_s : \Phi_i(n) > 3(s+1) \text{ for all } i \in F_s\}$. Let $I_s = [n_s, n_{s+1})$.

We build A in stages. Let $\hat{d}_s = \sum_{i \in D_s} 2^{-(i + n_i + 1)} d_i$. Note that $\hat{d}_s$ is a martingale. Assume we have defined $A \upharpoonright n$. Let $\sigma = A \upharpoonright n$ and let s be such that $n \in I_s$. Let A(n) = 0 if $\hat{d}_s(\sigma 0) \leq \hat{d}_s(\sigma)$ and A(n) = 1 otherwise.

We claim that for all s and $n \in I_s$, we have $\hat{d}_s(A \upharpoonright n) < 2 - 2^{-s}$. We prove this claim by induction on n. The case n = 0 is clear. For the induction step, since the construction of A ensures that $\hat{d}_s(A \upharpoonright n)$ is nonincreasing for $n \in I_s$, it suffices to consider $n = n_s$. Let $\sigma = A \upharpoonright (n - 1)$. Then

$\hat{d}_s(\sigma A(n)) \leq \hat{d}_s(\sigma) = \hat{d}_{s-1}(\sigma) + 2^{-(s + n_s + 1)} d_s(\sigma) < 2 - 2^{-(s-1)} + 2^{-s} = 2 - 2^{-s}.$

This fact establishes the claim. Let i be such that $d_i$ is total. Then for $s \geq i$ and $n \in I_s$, we have $d_i(A \upharpoonright n) \leq 2^{i + n_i + 1} \hat{d}_s(A \upharpoonright n) < 2^{i + n_i + 2}$, so A is computably random.

To conclude the proof, we need to show that for all computable orders f, we have $C(A \upharpoonright n \mid n) \leq f(n)$ for almost all n. The key observation is that the sets $F_s$ and $D_s$ can be coded by a string $\tau_s$ of length 2s and that, given these two sets, we can simulate the construction of $A \upharpoonright n_{s+1}$. Thus, for any $n \in I_s$, we have $C(A \upharpoonright n \mid n) \leq 2s + O(1)$. So for any computable

11 This result was proved by Merkle in the form stated here, but he points out in [266] that the slightly weaker results of Lathrop and Lutz and of Muchnik discussed below are close to parts (i) and (ii), respectively, and were proved by essentially the same methods.
order $\Phi_i$, if s is sufficiently large and $n \in I_s$, then $C(A \upharpoonright n \mid n) \leq 2s + O(1) < 3s < \Phi_i(n_s) \leq \Phi_i(n)$.

The proof of part (ii) is a modification of that of part (i). We now let $D_s$ consist of all $i \leq s$, but in defining the sum $\hat{d}_s(\sigma)$, we omit any $d_i$ such that $d_i(\sigma) \uparrow$. In this case, in order to simulate the construction of $A \upharpoonright n_{s+1}$, we still need s bits to code $F_s$, but to decode $A \upharpoonright n$ for $n < n_{s+1}$, we now also need to know, for each $d_i$ with $i \leq s$, whether $d_i(A \upharpoonright (n-1))$ is defined, and if not, what is the largest $m < n$ such that $d_i(A \upharpoonright m)$ is defined. This information can be given with s many strings of $\log n$ many bits each. The result now follows as in part (i).

Theorem 7.4.4 can be viewed as the culmination of a long sequence of results on stochastic and random sequences. Daley [86] looked at replacing some results on randomness due to Loveland by results on stochasticity. Lathrop and Lutz [236] introduced what they called ultracompressible sets, defined as those sets X such that for every computable order g, we have $K(X \upharpoonright n) \leq K(n) + g(n)$ for almost all n. They proved the following result, which follows from Theorem 7.4.4 (i) by taking, say, $f(n) = \log g(n)$, since $K(X \upharpoonright n) \leq K(n) + O(C(X \upharpoonright n \mid n))$.

Theorem 7.4.5 (Lathrop and Lutz [236]). There are computably random ultracompressible sets.12

Similarly, Theorem 7.4.4 (ii) shows that there are partial computably random sets X such that $K(X \upharpoonright n) \leq O((\log n)^2)$, say, thus establishing a strong failure of the converse to the first part of Theorem 7.4.2. A weaker form of Theorem 7.4.4 (ii) was also established by An. A. Muchnik in [289, Theorem 9.5]. In that version, the set being constructed depends on the computable order f, rather than there being a single set that works for all computable orders.

Merkle [266] also showed that being stochastic does have some consequences for the initial segment complexity of a set. The following result generalizes the fact that no c.e. set can be von Mises-Wald-Church stochastic (which is obvious because every infinite c.e. set contains an infinite computable subset). Together with Theorem 7.4.4 (ii), this result establishes a kind of threshold phenomenon in the initial segment complexity behavior of von Mises-Wald-Church stochastic sequences.

12 In Chapter 11, we will discuss the important notion of a K-trivial set, which is a set X such that $K(X \upharpoonright n) \leq K(n) + O(1)$. As we will see in that chapter, the K-trivial sets are far from random in several senses. In particular, K-trivial sets cannot be computably random, or even Schnorr random, because, as we will see, every K-trivial set is low (Theorem 11.4.1), and every low Schnorr random set is 1-random (Theorem 8.11.2). Theorem 7.4.5 shows that computably random sets can nevertheless be "nearly K-trivial".
Theorem 7.4.6 (Merkle [266]). If there is a c such that $C(X \upharpoonright n) \leq c \log n$ for almost all n, then X is not von Mises-Wald-Church stochastic.

Proof. Let k = 2c + 2. Choose a computable sequence $m_0, m_1, \ldots$ with each $m_i$ a multiple of k + 1, such that for all i,

$10(m_0 + \cdots + m_{i-1}) < \frac{m_i}{k+1}.$

Now partition $\mathbb{N}$ into blocks of consecutive integers $I_0, I_1, \ldots$ with $|I_i| = m_i$, and then divide each $I_i$ into k + 1 many consecutive disjoint intervals $J^0_i, \ldots, J^k_i$, each of length $\frac{m_i}{k+1}$. Let $\sigma_i = X \upharpoonright I_i$ and $\sigma^j_i = X \upharpoonright J^j_i$.

We begin by showing that there is an effective procedure that, given $X \upharpoonright \min J^t_i$, enumerates a set $T^t_i$ of strings such that for some $t \leq k$,

(i) $\sigma^t_i \in T^t_i$ for almost all i and

(ii) $|T^t_i| \leq \frac{m_i}{5(k+1)}$ for infinitely many i.

Let $A_i = \{\sigma \in 2^{m_i} : C(\sigma) < k \log m_i\}$. Note that $|A_i| < 2^{k \log m_i} = m_i^k$. Since $m_0 + \cdots + m_{i-1} < m_i$, for almost all i,

$C(\sigma_i) = C(X \upharpoonright I_i) \leq C\bigl(X \upharpoonright \sum_{j \leq i} m_j\bigr) + C\bigl(\sum_{j < i} m_j\bigr) \leq c \log \sum_{j \leq i} m_j + \log \sum_{j < i} m_j \leq c \log(2 m_i) + \log(m_i) \leq (2c + 1) \log m_i = (k - 1) \log m_i,$

so $\sigma_i \in A_i$. For each i, let $T^k_i = \{\tau : \sigma^0_i \ldots \sigma^{k-1}_i \tau \in A_i\}$ and for j < k, let

$T^j_i = \{\tau : |\tau| = |J^j_i| \text{ and there are at least } (\tfrac{m_i}{5(k+1)})^{k-j} \text{ many strings } \rho \text{ such that } \sigma^0_i \ldots \sigma^{j-1}_i \tau \rho \in A_i\}.$

There is an effective procedure that, on input i, enumerates $A_i$, and hence there is an effective procedure that, given i, j, and $\sigma^0_i, \ldots, \sigma^{j-1}_i$, enumerates $T^j_i$. Of course, given $X \upharpoonright \min J^t_i$, we can find i, t, and $\sigma^0_i, \ldots, \sigma^{t-1}_i$, so we are left with showing that (i) and (ii) are satisfied for some t.

If t > 0 does not satisfy (ii) then, for almost all i, there are more than $\frac{m_i}{5(k+1)}$ many strings in $T^t_i$; that is, there are more than $\frac{m_i}{5(k+1)}$ many strings τ such that each τ can be extended by at least $(\frac{m_i}{5(k+1)})^{k-t}$ many strings ρ to a string $\sigma^0_i \ldots \sigma^{t-1}_i \tau \rho \in A_i$. Thus, for each such i, there are more than $(\frac{m_i}{5(k+1)})^{k-(t-1)}$ many strings τρ that extend $\sigma^0_i \ldots \sigma^{t-1}_i$ to a string in
$A_i$, which implies that $\sigma^{t-1}_i \in T^{t-1}_i$. So in this case (i) is satisfied with t replaced by t − 1.

Condition (i) is satisfied for t = k, so if (ii) is satisfied for t = k as well, then we are done. Otherwise, by the previous argument, we know that (i) is satisfied for t = k − 1 and we can iterate the argument. Proceeding this way, it suffices to argue that (ii) cannot fail for t = 0. Assuming that it does, for almost all i, there are more than $\frac{m_i}{5(k+1)}$ many strings τ of length $|J^0_i|$ such that each such τ can be extended in at least $(\frac{m_i}{5(k+1)})^k$ many ways to strings in $A_i$. Thus, for almost all i, we have $|A_i| > (\frac{m_i}{5(k+1)})^{k+1} > m_i^k$, which is a contradiction, since, as noted above, $|A_i| < m_i^k$.

Now that we have a t satisfying (i) and (ii), we can define two partial computable selection functions $r_0$ and $r_1$ that demonstrate that X is not von Mises-Wald-Church stochastic. The idea is that $r_j$ tries to select places in $J^t_i$ such that the corresponding bit of X is j. For each i, let $\tau^0_i, \tau^1_i, \ldots, \tau^{|T^t_i|-1}_i$ be a computable enumeration of $T^t_i$ given $X \upharpoonright \min J^t_i$. Pick $i_0$ so that $\sigma^t_i \in T^t_i$ for all $i \geq i_0$.

Both selection functions select numbers in intervals of the form $J^t_i$ for $i \geq i_0$. On entering such an interval, $r_j$ lets e = 0. It then starts scanning bits of X in the interval. It assumes that $\tau^e_i = \sigma^t_i$ and selects n iff the corresponding bit of $\tau^e_i$ is j. It proceeds in this way until either the end of the interval is reached or one of the scanned bits differs from the corresponding one of $\tau^e_i$, and thus $r_j$ realizes that $\tau^e_i$ is the wrong guess for $\sigma^t_i$. In the second case, $r_j$ increments the counter e and continues the procedure. By the choice of $i_0$, the counter e always settles on a value such that $\tau^e_i = \sigma^t_i$.

For all $i \geq i_0$, every number in the interval $J^t_i$ is selected by either $r_0$ or $r_1$. We say that a number n is selected correctly if it is selected by $r_j$ and X(n) = j. In each $J^t_i$ there are at most $|T^t_i| - 1$ many numbers that are selected incorrectly, since each incorrect selection causes each $r_j$ to increment its counter. So by (ii), there are infinitely many i for which at least $\frac{4 m_i}{5(k+1)}$ many numbers in $J^t_i$ are selected correctly. Hence for some j and infinitely many of these i, the selection function $r_j$ selects at least $\frac{2 m_i}{5(k+1)}$ many numbers in $J^t_i$ correctly, and at most $\frac{m_i}{5(k+1)}$ many incorrectly. By the choice of the $m_i$, for each such i there are at most $\frac{m_i}{10(k+1)}$ many numbers that $r_j$ could have selected before it entered the interval $J^t_i$. Hence, up to and including each such interval $J^t_i$, the selection function $r_j$ selects at least $\frac{2 m_i}{5(k+1)}$ many numbers correctly and at most $\frac{3 m_i}{10(k+1)}$ many incorrectly. Thus $r_j$ witnesses that X is not von Mises-Wald-Church stochastic.

By Theorem 7.4.4, there is a computably random X such that $C(X \upharpoonright n) \leq 2 \log n + O(1)$, so we have the following result.

Corollary 7.4.7 (Ambos-Spies [6]). There is a computably random set that is not von Mises-Wald-Church stochastic.
7.4.3 A martingale characterization of stochasticity

Ambos-Spies, Mayordomo, Wang, and Zheng [11] have shown that stochasticity can be viewed as a notion of randomness for a restricted kind of martingale. These authors stated the following for a notion of time-bounded stochasticity, but their proof works for Church stochasticity as well.

Definition 7.4.8 (Ambos-Spies, Mayordomo, Wang, and Zheng [11]).

(i) A martingale d is simple if there is a number $q \in \mathbb{Q} \cap (0, 1)$ such that for all σ and i = 0, 1,
$d(\sigma i) \in \{d(\sigma), (1+q)d(\sigma), (1-q)d(\sigma)\}.$

(ii) A martingale d is almost simple if there is a finite set $\{q_0, \ldots, q_m\} \subset \mathbb{Q} \cap (0, 1)$ such that for all σ and i = 0, 1, there is a $k \leq m$ such that
$d(\sigma i) \in \{d(\sigma), (1+q_k)d(\sigma), (1-q_k)d(\sigma)\}.$

We say that a set is (almost) simply random13 if no (almost) simple computable martingale succeeds on it. Actually, almost simple randomness and simple randomness coincide.

Lemma 7.4.9 (Ambos-Spies, Mayordomo, Wang, and Zheng [11]). If d is an almost simple martingale then there are finitely many simple martingales $d_0, \ldots, d_m$ such that $S[d] \subseteq \bigcup_{k \leq m} S[d_k]$.

Proof. Suppose that d is almost simple via the rationals $\{q_0, \ldots, q_m\}$. For each $k \leq m$, define a simple martingale $d_k$ that copies d's bets on all σ such that $d(\sigma i) \in \{d(\sigma), (1+q_k)d(\sigma), (1-q_k)d(\sigma)\}$ and is defined by $d_k(\sigma i) = d_k(\sigma)$ for i = 0, 1 otherwise. If d succeeds on A then clearly one of the $d_k$ must also succeed on A.

Corollary 7.4.10. A set is almost simply random iff it is simply random.

We can now state the following characterizations of stochasticity.

Theorem 7.4.11 (Ambos-Spies, Mayordomo, Wang, and Zheng [11]).

(i) A set is Church stochastic iff it is simply random.

(ii) A set is von Mises-Wald-Church stochastic iff no partial computable (almost) simple martingale succeeds on it.

Proof. We prove (i); the proof of (ii) is essentially the same. Suppose there is a computable simple martingale d with rational q that succeeds on A. We define a computable selection function f by letting

$f(X \upharpoonright n) = \begin{cases} \text{no} & \text{if } d((X \upharpoonright n)0) = d(X \upharpoonright n) \\ \text{yes} & \text{otherwise.} \end{cases}$

13 Ambos-Spies, Mayordomo, Wang, and Zheng [11] used the terminology "weakly random", which is already used in this book in relation to Kurtz randomness.
Then f is computable and succeeds on A:

$\limsup_n \frac{|\{y < n : d((A \upharpoonright y)A(y)) = (1+q) d(A \upharpoonright y)\}|}{|\{y < n : d((A \upharpoonright y)A(y)) = (1-q) d(A \upharpoonright y)\}|} > 1,$

so the limsup of the proportion of the bits of A selected by f that are 1 is bigger than $\frac{1}{2}$, and hence A is not Church stochastic.

For the other direction, let the martingales $d^i_k$ be as in Section 7.4.1. Then each $d^i_k$ is a simple martingale, and, as shown in that subsection, if A is not Church stochastic then some $d^i_k$ succeeds on A.
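The translation in this proof, from a simple martingale to a selection function that selects exactly the positions where the martingale places a real bet, can be sketched in Python. The particular martingale below, which stakes q = 1/2 on 0 at even positions and bets evenly at odd ones, is our own toy example, not from the text:

```python
def d(sigma):
    # Toy simple martingale with q = 1/2: at even positions it stakes half
    # its capital on the next bit being 0; at odd positions it bets evenly.
    c, q = 1.0, 0.5
    for i, b in enumerate(sigma):
        if i % 2 == 0:
            c *= (1 + q) if b == "0" else (1 - q)
        # at odd positions the capital is unchanged (an even bet)
    return c

def selection_from_martingale(d, prefix):
    # Select the next position iff the martingale actually bets there,
    # i.e. iff d(prefix + "0") differs from d(prefix).
    return "yes" if d(prefix + "0") != d(prefix) else "no"
```

As expected, the derived selection function says "yes" exactly at the even positions, where the toy martingale genuinely bets.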
7.5 Nonmonotonic randomness

7.5.1 Nonmonotonic betting strategies

In this section we consider the effect of allowing betting strategies that can bet on the bits of a sequence out of order. This concept was introduced by Muchnik, Semenov, and Uspensky [289]. We use the notation of Merkle, Miller, Nies, Reimann, and Stephan [269].

Definition 7.5.1. A finite assignment (f.a.) is a sequence
$x = (r_0, a_0), \ldots, (r_n, a_n) \in (\mathbb{N} \times \{0,1\})^{<\omega}$
such that the $r_i$ are pairwise distinct. We call the set of the $r_i$ the (selection) domain of x and denote it by dom(x). A scan rule is a partial function s from the set of finite assignments to $\mathbb{N}$ such that for all finite assignments x, we have $s(x) \notin \operatorname{dom}(x)$.

The idea is that the $r_i$ are the places selected by our nonmonotonic betting strategy, and the scan rule is the function that determines the next place to bet on based on the observed values of the previously selected places, i.e., the $a_i$.

Definition 7.5.2 (Merkle, Miller, Nies, Reimann, and Stephan [269]). A stake function is a partial function from the collection of finite assignments to [0, 2]. A nonmonotonic betting strategy is a pair consisting of a scan rule and a stake function.

We can now define the analog of a martingale for a nonmonotonic betting strategy (s, q), called in [269] a capital function d. The idea is that given a set X, we begin with d(λ), and given a finite assignment x, the strategy picks s(x) as the next place to bet on. If q(x) < 1, it is betting that X(s(x)) = 1, and staking $1 - q(x)$ much of its capital on that bet; if q(x) > 1, it is betting that X(s(x)) = 0, and staking $q(x) - 1$ much of its capital on that bet; and if q(x) = 1, it is betting evenly. Then, as in the case of a martingale, if X(s(x)) = 0, the current capital is multiplied by q(x),
and otherwise it is multiplied by $2 - q(x)$. (That is, if the strategy makes the correct guess then the amount it staked on that guess is doubled, and otherwise that amount is lost.)

We can consider nonmonotonic betting strategies as a game played on sequences. Let b = (s, q) be a nonmonotonic betting strategy. We define the partial functional p by $p^X(0) = \lambda$ and

$p^X(n+1) = p^X(n) {}^\frown \bigl(s(p^X(n)), X(s(p^X(n)))\bigr).$

We have $p^X(n+1) \uparrow$ if $s(p^X(n))$ is undefined. Using this functional we can formally define the payoff of our strategy as

$c^X(n+1) = \begin{cases} q(p^X(n)) & \text{if } X(s(p^X(n))) = 0 \\ 2 - q(p^X(n)) & \text{if } X(s(p^X(n))) = 1. \end{cases}$

Finally, we can define the payoff function of b, which is our notion of a nonmonotonic martingale, as

$d^X_b(n) = d^X_b(\lambda) \prod_{i=1}^{n} c^X(i).$

(This definition depends on a choice of $d^X_b(\lambda)$, but that choice does not affect any of the relevant properties of $d_b$.)

Definition 7.5.3 (Muchnik, Semenov, and Uspensky [289]). A nonmonotonic betting strategy b succeeds on X if

$\limsup_{n \to \infty} d^X_b(n) = \infty.$

We say that X is nonmonotonically random (or Kolmogorov-Loveland random) if no partial computable nonmonotonic betting strategy succeeds on it. We say that $C \subseteq 2^\omega$ is Kolmogorov-Loveland null if there is a partial computable nonmonotonic betting strategy succeeding on all $X \in C$.

The use of partial computable nonmonotonic strategies in Definition 7.5.3 is not essential.

Theorem 7.5.4 (Merkle [265]). A sequence X is nonmonotonically random iff no total computable nonmonotonic betting strategy succeeds on it.

Proof sketch. It is enough to show that if a partial computable nonmonotonic strategy succeeds on X then there is a total computable nonmonotonic strategy that also succeeds on X. Suppose the partial computable strategy (s, q) succeeds on X. This strategy selects a sequence of places $s_0, s_1, \ldots$ so that unbounded capital is gained on the sequence $X(s_0), X(s_1), \ldots$. If we decompose $s_0, s_1, \ldots$ into even and odd places, then the unbounded capital gain must be true of either the subsequence of even
places or that of odd places. Without loss of generality, suppose it is the former. Now the idea is to build a total computable betting strategy that emulates (s, q) on the even places, but, while it is waiting for a convergence, plays on fresh odd places, betting evenly. If the convergence ever happens, the strategy returns to emulating (s, q).

The fundamental properties of nonmonotonic randomness were first established by Muchnik, Semenov, and Uspensky [289]. Many of these were improved in the later paper of Merkle, Miller, Nies, Reimann, and Stephan [269], which, by and large, used techniques that were elaborations of those of [289].

Theorem 7.5.5 (Muchnik, Semenov, and Uspensky [289]). If A is 1-random then A is nonmonotonically random.

Proof. Suppose that A is not nonmonotonically random and that b is a partial computable betting strategy that succeeds on A. We define a Martin-Löf test $\{V_n : n \in \mathbb{N}\}$ as follows. We put σ into $V_n$ if there is a j such that $d^\sigma_b(j) \geq 2^n$. (Note that this notation implies that the first j places selected by b are all less than $|\sigma|$.) It is straightforward to check that Kolmogorov's Inequality (Theorem 6.3.3) holds for nonmonotonic martingales as well, and that this fact implies that $\{V_n : n \in \mathbb{N}\}$ is a Martin-Löf test. Since clearly $A \in \bigcap_n V_n$, we see that A is not 1-random.

At the time of writing, it remains a fundamental open question whether this theorem can be reversed. It seems that most researchers in the area believe that the answer is no.

Open Question 7.5.6 (Muchnik, Semenov, and Uspensky [289]). Is every nonmonotonically random sequence 1-random?
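Definitions 7.5.1-7.5.3 can be animated with a small simulation. In this hedged Python sketch, the scan rule, stake function, and test sequence are all invented for illustration: the strategy inspects the odd places first, then the even ones, staking half its capital that each scanned bit is 1.

```python
def play(X, scan, stake, capital=1.0):
    # Run a nonmonotonic betting strategy b = (scan, stake) against the finite
    # 0/1 list X, returning the capital history d_b^X(0), d_b^X(1), ...
    fa = []                          # finite assignment: pairs (r_i, a_i)
    history = [capital]
    while True:
        r = scan(fa)                 # next place to bet on (never a repeat)
        if r is None or r >= len(X):
            break
        q = stake(fa)                # q in [0, 2]: q < 1 bets on 1, q > 1 bets on 0
        capital *= q if X[r] == 0 else 2 - q
        fa.append((r, X[r]))
        history.append(capital)
    return history

def scan(fa):
    # Illustrative scan rule: visit odd places 1, 3, 5, 7, then even places 0, 2, 4, 6.
    used = {r for r, _ in fa}
    for r in [1, 3, 5, 7, 0, 2, 4, 6]:
        if r not in used:
            return r
    return None

def stake(fa):
    # Illustrative stake function: always stake half the capital on the bit being 1.
    return 0.5
```

On the sequence 01010101, whose odd places are all 1, the strategy multiplies its capital by 1.5 four times before losing on the even places, illustrating how a nonmonotonic visiting order can exploit structure that a monotonic strategy would meet too late.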
Thus we know that 1-randomness implies nonmonotonic randomness, which in turn clearly implies (partial) computable randomness, since monotonic betting strategies are a special case of nonmonotonic ones. The same proof as above shows that if we looked at computably enumerable nonmonotonic betting strategies, we would have a notion equivalent to 1-randomness. We can separate nonmonotonic randomness from computable (or even partial computable) randomness by combining the remark following Theorem 7.4.5 with the following result, which shows that nonmonotonic randomness is quite close to 1-randomness in a certain sense.
Theorem 7.5.7 (Muchnik, see [289]14).

(i) Let h be a computable order such that $K(A \upharpoonright h(n)) \leq h(n) - n$ for all n. Then A is not nonmonotonically random.

(ii) Indeed, there are two partial computable nonmonotonic betting strategies such that any sequence satisfying the hypothesis of (i) is covered by one of them.

(iii) Moreover, these strategies can be converted into total computable ones.

Proof sketch. For an interval I, let $A \upharpoonright I$ denote the restriction of A to I. Using h, we can define a computable function g such that

$K\bigl(A \upharpoonright [g(n), g(n+1))\bigr) < g(n+1) - g(n) - n$

for all n. (Having defined g(n), we can choose a sufficiently large m that codes g(n) (say by having $m = \langle g(n), i \rangle$ for some i) and let g(n+1) = h(m).) Let $I_n = [g(n), g(n+1))$.

We have two betting strategies, one based upon the belief that there are infinitely many e such that the approximation to $K(A \upharpoonright I_{2e})$ settles no later than that of $K(A \upharpoonright I_{2e+1})$, and the other based on the belief that there are infinitely many e such that the approximation to $K(A \upharpoonright I_{2e+1})$ settles no later than that of $K(A \upharpoonright I_{2e})$. These strategies are symmetric, so we assume that A satisfies the first hypothesis and describe the corresponding strategy.

The idea is to scan $A \upharpoonright I_{2e+1}$, betting evenly on those bits, and then wait until a stage t such that $K_t(A \upharpoonright I_{2e+1}) < g(2e+2) - g(2e+1) - 2(e+1)$. At this stage we begin to bet on the interval $I_{2e}$, using half of our current capital. We bet only on those strings σ of length $g(2e+1) - g(2e)$ such that $K_t(\sigma) < g(2e+1) - g(2e) - 2e$. If the hypothesis about settling times is correct for this e, then one of these strings really is $A \upharpoonright I_{2e}$. The capital is divided into equal portions and bet on all such strings. If we lose on all strings then we have lost half of the remaining capital. If the hypothesis is correct for this e, though, then a straightforward calculation shows that we increase our capital enough to make up for previous losses, so that, since the hypothesis will be correct for infinitely many e, this betting strategy succeeds on A.

Notice that we can replace nonmonotonic randomness by 1-randomness and delete the word "computable" before h in the statement of Theorem 7.5.7 part (i).

14 The statements (ii) and (iii) were not present in Muchnik's original result, but these stronger statements can be extracted from the proof, as pointed out in [269].
One key problem when dealing with nonmonotonic randomness is the lack of universal tests. The extent to which the lack of universality hurts us can be witnessed by the following simple result.

Theorem 7.5.8 (Merkle, Miller, Nies, Reimann, and Stephan [269]). No partial computable nonmonotonic betting strategy succeeds on all computably enumerable sets.

Proof. Let b = (s, q) be partial computable. We build a c.e. set W. We compute the finite assignments x_n = (r_0, a_0), ..., (r_{n-1}, a_{n-1}), starting with x_0 = λ, setting r_n = s(x_n), and letting a_n = 1 if q(x_n) ≤ 1 and a_n = 0 otherwise. We enumerate r_n into W if a_n = 1. Clearly, b cannot succeed on W.

Strangely enough, it turns out that two nonmonotonic betting strategies are enough to succeed on all c.e. sets.

Theorem 7.5.9 (Muchnik, see [289], Merkle, Miller, Nies, Reimann, and Stephan [269]). There exist computable nonmonotonic betting strategies b_0 and b_1 such that for each c.e. set W, at least one of b_0 or b_1 succeeds on W.

Proof. Divide N into intervals {I_k : k ∈ ω} with |I_{k+1}| = 5|I_k| and let J_e = ⋃_j I_{⟨e,j⟩}. The first strategy is monotonic and extremely simple. It always bets 2/3 of its current capital that the next bit is 0. This strategy succeeds on all sequences X such that there are infinitely many initial segments X ↾ n in which fewer than 1/4 of the bits are 1. In particular, it succeeds on all X such that J_e ∩ X is finite for some e.

The second strategy succeeds on all W_e that have infinite intersection with J_e. Fix an enumeration {(e_i, z_i) : i ∈ ω} of all pairs (e, z) with z ∈ W_e ∩ J_e. The betting strategy b = (s, q) splits its initial capital so that it devotes 2^{-e-1} to W_e (so the initial capital is 1). Thus the capital function d^X_b is broken into parts d^X_{b,e}. Given the finite assignment x_n = (r_0, a_0), ..., (r_{n-1}, a_{n-1}), we let s(x_n) = z_n and

q(x_n) = 1 − d^X_{b,e_n}(n) / d^X_b(n).

(Here we think of d^X_{b,e}(n) as the capital accrued by beginning with 2^{-e-1} and betting all our capital on 1 at stages n such that e_n = e.) This strategy will succeed on all W_e such that W_e ∩ J_e is infinite, since for each number in this intersection, the capital d^{W_e}_{b,e} is doubled.

Corollary 7.5.10 (Merkle, Miller, Nies, Reimann, and Stephan [269]). The class of Kolmogorov-Loveland null sets is not closed under finite unions.
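The growth claim for the first strategy in the proof of Theorem 7.5.9 can be checked by direct computation. The sketch below is ours, not part of the proof; it assumes the reading that the strategy splits its capital 2/3 on 0 and 1/3 on 1, so capital is multiplied by 4/3 on a 0 and by 2/3 on a 1.

```python
from fractions import Fraction

def capital_after(bits):
    """Capital of the monotone strategy that always splits its money
    2/3 on the next bit being 0 and 1/3 on it being 1, so capital is
    multiplied by 4/3 on reading a 0 and by 2/3 on reading a 1."""
    c = Fraction(1)
    for b in bits:
        c *= Fraction(2, 3) if b == 1 else Fraction(4, 3)
    return c

# Martingale (fairness) condition: d(sigma 0) + d(sigma 1) = 2 d(sigma).
sigma = [0, 1, 1, 0]
assert capital_after(sigma + [0]) + capital_after(sigma + [1]) == 2 * capital_after(sigma)

# A sequence in which every fifth bit is 1: only 1/5 < 1/4 of the bits
# are 1, and the strategy's capital grows without bound along it.
bits = [1 if n % 5 == 4 else 0 for n in range(100)]
print(float(capital_after(bits)))  # about 2.97e6
```

After n bits of which k are 1, the capital is (4/3)^{n-k} (2/3)^k = (4/3)^n 2^{-k}, which tends to infinity along any subsequence with k < n/4, matching the claim in the proof.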
7.5.2 Van Lambalgen’s Theorem revisited We have seen that looking at splittings yields significant insight into randomness, as witnessed by van Lambalgen’s Theorem (see Section 6.9).
For an infinite and coinfinite set Z, let z_0 < z_1 < ··· be the elements of Z, let y_0 < y_1 < ··· be the elements of the complement of Z, and recall that A_0 ⊕_Z A_1 = {z_n : n ∈ A_0} ∪ {y_n : n ∈ A_1}. That is, we use Z as a guide as to how to merge A_0 and A_1. Clearly, the proof of van Lambalgen's Theorem shows that if Z is computable then A_0 ⊕_Z A_1 is 1-random iff A_i is 1-random relative to A_{1-i} for i = 0, 1. Here is an analog of van Lambalgen's Theorem for nonmonotonic randomness.

Theorem 7.5.11 (Merkle, Miller, Nies, Reimann, and Stephan [269]). Suppose that A = A_0 ⊕_Z A_1 for a computable Z. Then A is nonmonotonically random iff A_i is nonmonotonically random relative to A_{1-i} for i = 0, 1.

Proof. If A is not nonmonotonically random, then let b be a nonmonotonic betting strategy succeeding on A. Let b_0 and b_1 be strategies with the same scan rule as b, but such that b_0 copies b's bets on places in Z while betting evenly on places outside Z, and b_1 copies b's bets on places outside Z while betting evenly on places in Z. One of b_0 or b_1 must also succeed on A. Suppose it is b_0, the other case being symmetric. It is easy to transform b_0 into a strategy that is computable relative to A_1 and succeeds on A_0, whence A_0 is not nonmonotonically random relative to A_1.

For the other direction, suppose that b is an A_1-computable nonmonotonic betting strategy that succeeds on A_0, the other case being symmetric. Then we can define a computable nonmonotonic betting strategy by scanning the positions of A outside Z (those corresponding to A_1), betting evenly on those, until we get an initial segment of A_1 that allows us to compute a new value of b. Then we place a bet on the A_0 portion of A according to this new value of b, and repeat this procedure.
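The merge A_0 ⊕_Z A_1 is easy to compute in the finite. The following sketch (ours; the function names are illustrative) produces the first n bits of the join and recovers A_0 and A_1 from it, which is the sense in which Z guides the merge.

```python
def z_join(a0, a1, z, n):
    """First n bits of A0 (+)_Z A1: the bit at the m-th element of Z is
    the m-th bit of A0, and the bit at the m-th element of the
    complement of Z is the m-th bit of A1."""
    out, i0, i1 = [], 0, 0
    for m in range(n):
        if m in z:
            out.append(a0[i0]); i0 += 1
        else:
            out.append(a1[i1]); i1 += 1
    return out

def z_split(a, z):
    """Invert the merge: recover the A0 and A1 parts of a finite string."""
    a0 = [b for m, b in enumerate(a) if m in z]
    a1 = [b for m, b in enumerate(a) if m not in z]
    return a0, a1

# With Z the even numbers this is the ordinary join A0 (+) A1.
z = {0, 2, 4}
a = z_join([1, 1, 0], [0, 0, 1], z, 6)
assert a == [1, 0, 1, 0, 0, 1]
assert z_split(a, z) == ([1, 1, 0], [0, 0, 1])
```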
An important corollary of this result is the following.

Corollary 7.5.12 (Merkle, Miller, Nies, Reimann, and Stephan [269]). If Z is computable and A_0 ⊕_Z A_1 is nonmonotonically random, then at least one of A_0 or A_1 is 1-random.

Proof. Let U_0, U_1, ... be a universal Martin-Löf test. Suppose that neither of the A_i is 1-random and let f_i(n) be the least s such that A_i ∈ U_n[s]. Let i be such that there are infinitely many n with f_{1-i}(n) ≤ f_i(n). Let V_n = ⋃_{m>n} U_m[f_i(m)]. Then V_0, V_1, ... is a Schnorr test relative to A_i, and A_{1-i} ∈ ⋂_n V_n. Thus A_{1-i} is not Schnorr random relative to A_i, and hence not nonmonotonically random relative to A_i. By Theorem 7.5.11, A_0 ⊕_Z A_1 is not nonmonotonically random.
7.6 Demuth randomness

In this section we introduce a notion of randomness due to Demuth [94, 95], whose underlying thesis was that reasonable functions should behave well on random points.

Definition 7.6.1 (Demuth [94, 95]). A sequence of Σ^0_1 classes {W_{g(e)}}_{e∈ω} is called a Demuth test if (i) μ(W_{g(e)}) ≤ 2^{-e} and (ii) g is an ω-c.e. function; that is, there are a computable function f and a computable approximation g(e) = lim_s g(e, s) such that |{s : g(e, s+1) ≠ g(e, s)}| < f(e) for all e. We say that a set A passes this Demuth test if it passes it in the Solovay test sense: A ∉ W_{g(e)} for almost all e. A set is Demuth random if it passes all Demuth tests.

The idea here is that while the levels of the test are all c.e. objects, the function picking the indices of the levels is not computable but only approximable (footnote 15). If the function g used to define the test members is only Δ^0_2 then we get another notion of randomness, called limit randomness, studied by Kučera and Nies [220]. Limit randomness is implied by Schnorr randomness relative to ∅′ (a concept further studied by Barmpalias, Miller, and Nies [27]), and hence by 2-randomness. Kučera and Nies [220] also studied weak Demuth randomness, where a set is weakly Demuth random if it passes all Demuth tests {W_{g(e)}}_{e∈ω} such that W_{g(0)} ⊇ W_{g(1)} ⊇ ···. See [220] for more on these notions.

Every Martin-Löf test is a Demuth test, and every Demuth test is a Martin-Löf test relative to ∅′, so we have the following.

Proposition 7.6.2. 2-randomness implies Demuth randomness, which in turn implies 1-randomness.

Notice that if a set is ω-c.e. then it cannot be Demuth random, and hence Ω is not Demuth random. (Consider the finite Demuth test given by V_e = [Ω ↾ (e+1)] for e ∈ ω.) Thus Demuth randomness is stronger than 1-randomness. Demuth [95] constructed a Δ^0_2 Demuth random set, showing that 2-randomness is stronger than Demuth randomness.
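Clause (ii) of Definition 7.6.1 is a finitary condition on approximations, which a small sketch can make concrete. This is our illustration, with hypothetical helper names; real ω-c.e. functions are given by infinite approximations, truncated here to finitely many stages.

```python
def mind_changes(approx):
    """Number of stages s with approx[s+1] != approx[s]."""
    return sum(1 for s in range(len(approx) - 1) if approx[s + 1] != approx[s])

def is_omega_ce_within(approx_rows, f):
    """Check that each row e of a tabulated approximation g(e, s)
    settles with fewer than f(e) mind changes, as in clause (ii)
    of the definition, truncated to finitely many stages."""
    return all(mind_changes(row) < f(e) for e, row in enumerate(approx_rows))

# A toy approximation: row e is allowed up to e mind changes.
g = [[0, 0, 0, 0],      # e = 0: no changes
     [0, 1, 1, 1],      # e = 1: one change
     [0, 1, 0, 0]]      # e = 2: two changes
print(is_omega_ce_within(g, lambda e: e + 1))  # True
```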
One method to establish this result is to construct a test that serves a purpose similar to that of a universal Martin-Löf test. (There is no universal Demuth test.)

Footnote 15: Barmpalias [personal communication] has noted that a natural example of a Demuth test is the null class of all sets that compute functions not dominated by a Δ^0_2 almost everywhere dominating function f, that is, a function f such that for almost every a and almost every g ≤_T a, the function f dominates g. That these sets can be captured by a Demuth test can be seen by using the natural construction of Kurtz [228] in Theorem 8.20.6 below. We will examine almost everywhere domination further in Section 10.6.
This is done by enumerating all pairs {(ϕ_e(·,·), ψ_e) : e ∈ ω}, where ϕ_e and ψ_e are partial computable functions, and where we allow ϕ_e(x, s)[t] ↓ only if

1. ϕ_e(y, s)[t] ↓ for all y ≤ x,

2. ψ_e(x)[t] ↓,

3. |{u : u ≤ t ∧ ϕ_e(x, u+1) ≠ ϕ_e(x, u)}| ≤ ψ_e(x), and

4. μ(W_{ϕ_e(x,s)}[t]) ≤ 2^{-x}.

That is, we enumerate all partial Demuth tests, and then we can assemble the tails as usual to make a test universal for Demuth randomness. This test is of the form {W_{h(e)}}_{e∈ω} with h a Δ^0_2 function (in fact an (ω+1)-c.e. function). The leftmost set passing this test will be Demuth random and Δ^0_2. Thus we have the following.

Theorem 7.6.3 (Demuth [95]). There exists a Δ^0_2 Demuth random set.

We will show in Theorem 8.14.2 that all Demuth random sets are GL_1. It is also possible to show that no Demuth random set is of hyperimmune-free degree. (Indeed, Miller and Nies (see [306]) proved that if a set is GL_1 and DNC then it is of hyperimmune degree. Hence no 1-random set that is GL_1 can be of hyperimmune-free degree. But there are hyperimmune-free weakly 2-random sets, as we will see in the next chapter.) This result shows that weak 2-randomness and Demuth randomness are incomparable notions.
7.7 Difference randomness

Following the proof of Theorem 6.1.3, we discussed the class of 1-random sets that do not compute ∅′. Results such as Theorems 8.8.4 and 11.7.4 below will show that the sets in this class behave much more like "truly random" sets than computationally powerful 1-random sets such as Ω. In this section, we present a test set characterization of this class due to Franklin and Ng [157], which attests to its naturality as a notion of randomness.

Definition 7.7.1 (Franklin and Ng [157]). A d.c.e. test is a sequence {U_n \ V_n}_{n∈ω} such that the U_n and V_n are uniformly Σ^0_1 classes and μ(U_n \ V_n) ≤ 2^{-n} for all n. A set A passes this test if A ∉ ⋂_n (U_n \ V_n). A set is difference random if it passes all d.c.e. tests.

This definition can be generalized in a natural way to define n-c.e. tests and n-c.e. randomness for all n > 1 (where 2-c.e. randomness is what we have called difference randomness). However, Franklin and Ng [157] showed that this hierarchy of randomness notions collapses; that is, for every n > 1, a set is n-c.e. random iff it is difference random.

Difference randomness can also be characterized in terms of a special kind of Demuth test.
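At any finite stage, U_n and V_n are clopen, and the measure bound μ(U_n \ V_n) ≤ 2^{-n} can be computed exactly. A small sketch (ours; classes are represented by finite sets of generating strings, an assumption that only makes sense at a finite stage of an enumeration):

```python
from fractions import Fraction

def expand(strings, n):
    """All binary strings of length n extending some generator
    (each generator is assumed to have length at most n)."""
    out = set()
    for s in strings:
        if len(s) == n:
            out.add(s)
        else:
            for i in range(2 ** (n - len(s))):
                out.add(s + format(i, "0%db" % (n - len(s))))
    return out

def measure_diff(u, v, n):
    """Measure of U minus V for clopen classes generated by
    strings of length at most n."""
    return Fraction(len(expand(u, n) - expand(v, n)), 2 ** n)

# A toy first level of a d.c.e. test: U_0 generated by "0", V_0 by "01",
# so U_0 \ V_0 is the cylinder [00], of measure 1/4 <= 2^0.
print(measure_diff({"0"}, {"01"}, 2))  # 1/4
```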
Definition 7.7.2 (Franklin and Ng [157]). A Demuth test {W_{g(n)}}_{n∈ω} is strict if g has an ω-c.e. approximation g(n, s) such that if g(n, s+1) ≠ g(n, s) then W_{g(n,s+1)}[s+1] ∩ W_{g(n,t)}[t] = ∅ for all t ≤ s.

Lemma 7.7.3 (Franklin and Ng [157]). A set A is difference random iff for every strict Demuth test {W_{g(n)}}_{n∈ω}, we have A ∉ ⋂_n W_{g(n)}.

Proof. Suppose there is a strict Demuth test {W_{g(n)}}_{n∈ω} such that A ∈ ⋂_n W_{g(n)}. Let U_n = ⋃_s W_{g(n,s)} and V_n = ⋃{W_{g(n,s)} : g(n, s+1) ≠ g(n, s)}. It is easy to check that {U_n \ V_n}_{n∈ω} is a d.c.e. test that A does not pass.

Now suppose there is a d.c.e. test {U_n \ V_n}_{n∈ω} that A does not pass. We may assume without loss of generality that μ(U_n[s] \ V_n[s]) ≤ 2^{-n} for all n and s. We may also assume that A is 1-random, since otherwise there is a Martin-Löf test that A does not pass, and every Martin-Löf test is a strict Demuth test. Let {C_{n,i} : n ∈ ω, i ≤ 2^n} be uniformly Σ^0_1 classes defined as follows. Let s_{n,0} = 0. For each n, let C_{n,0} copy U_{n+1} until μ(C_{n,0}) exceeds 2^{-n}. If such a stage s_{n,1} is ever reached, then stop building C_{n,0} and begin building C_{n,1} by copying U_{n+2} \ C_{n,0} until μ(C_{n,1}) exceeds 2^{-n}. (Note that, in this case, C_{n,0} is a Δ^0_1 class, so U_{n+2} \ C_{n,0} is still a Σ^0_1 class.) Continue in this fashion, at step i building C_{n,i} by copying U_{n+i+1} \ (C_{n,0} ∪ ··· ∪ C_{n,i-1}) until μ(C_{n,i}) exceeds 2^{-n} at some stage s_{n,i+1}, if ever.

It is easy to see that for each n there is an i_n ≤ 2^n such that this construction enters step i_n but never leaves it. The C_{n,i} are clearly uniformly Σ^0_1, so we can define a computable function g such that for each n and s, we have W_{g(n,s)} = C_{n,i} for the largest i such that s ≥ s_{n,i}. Let g(n) = lim_s g(n, s). It is easy to check that {W_{g(n)}}_{n∈ω} is a strict Demuth test, and hence so is {W_{g(n)}}_{n≥N} for any N. We now show that there is an N such that A ∈ ⋂_{n≥N} W_{g(n)}.

Let S = {U_{n+i}[s_{n,i}] \ V_{n+i}[s_{n,i}] : n ∈ ω ∧ 0 < i ≤ i_n}.
The sum of the measures of the elements of S is bounded by Σ_n Σ_{0<i≤i_n} 2^{-(n+i)} < ∞, so S is a Solovay test. By our assumption that A is 1-random, there is an N such that A ∉ U_{n+i}[s_{n,i}] \ V_{n+i}[s_{n,i}] for all n ≥ N and 0 < i ≤ i_n. Let n ≥ N. For 0 < i ≤ i_n, we have A ∉ V_{n+i}, so that A ∉ V_{n+i}[s_{n,i}], and hence A ∉ U_{n+i}[s_{n,i}] ⊇ C_{n,i-1}. But A ∈ U_{n+i_n+1}, so A ∈ U_{n+i_n+1} \ (C_{n,0} ∪ ··· ∪ C_{n,i_n-1}) = C_{n,i_n} = W_{g(n)}. Thus A ∈ ⋂_{n≥N} W_{g(n)}.

Using this characterization, we can show that the difference random sets are exactly the "computationally weak" 1-random sets.

Theorem 7.7.4 (Franklin and Ng [157]). A set is difference random iff it is 1-random and does not compute ∅′.
Proof. Suppose that A is 1-random but not difference random. Let {W_{g(n)}}_{n∈ω} be a strict Demuth test such that A ∈ ⋂_n W_{g(n)}. Let t(n) be an A-computable function such that for each n there is a k with A ↾ k ∈ W_{g(n)}[t(n)]. Let S = {W_{g(n)}[s] : n ∈ ∅′_{s+1} \ ∅′_s}. Then S is a Solovay test, so for almost all n, if n enters ∅′ at stage s+1, then A ∉ W_{g(n)}[s], and hence s < t(n). Thus n ∈ ∅′ iff n ∈ ∅′_{t(n)} for almost all n, and hence ∅′ ≤_T t ≤_T A.

For the other direction, since every difference random set is 1-random, it is enough to assume that A ≥_T ∅′ and build a d.c.e. test that A does not pass. We build an auxiliary c.e. set C. By the recursion theorem, we may assume we have a reduction C = Γ^A (as explained in the proof of Theorem 2.20.1). Let S_{n,i} = {X : ∀j < i (Γ^X(⟨n, j⟩) ↓ = 1) ∧ Γ^X(⟨n, i⟩) ↓ = 0}. Let C = {⟨n, i⟩ : μ(S_{n,i}) > 2^{-n}}, let U_n = ⋃{S_{n,i} : ∀j < i (μ(S_{n,j}) > 2^{-n})}, and let V_n = ⋃{S_{n,i} : μ(S_{n,i}) > 2^{-n}}. The S_{n,i} are uniformly Σ^0_1 classes, and hence so are the U_n and the V_n. Fix n. The S_{n,i} are pairwise disjoint, so there is a least i such that μ(S_{n,i}) ≤ 2^{-n}. Then U_n \ V_n = S_{n,i}, and if Γ^X = C then X ∈ S_{n,i}, so in particular A ∈ S_{n,i}. It follows that {U_n \ V_n}_{n∈ω} is a d.c.e. test that A does not pass.

Thus every difference random set is 1-random, and this implication is strict. Lemma 7.7.3 shows that every Demuth random set is difference random. As we will see in Theorem 8.13.1, no weakly 2-random set can compute ∅′, so weak 2-randomness also implies difference randomness. As mentioned in the previous section, Demuth randomness and weak 2-randomness are incomparable notions, so both of these implications are strict. See [157] for a martingale characterization of difference randomness.
7.8 Finite randomness

A finite Martin-Löf test is a Martin-Löf test {V_e : e ∈ ω} where |V_e| < ∞ for all e; that is, each V_e is clopen. A set is finitely random if it passes all finite Martin-Löf tests. We say that a finite Martin-Löf test is a computably finite Martin-Löf test if, additionally, there is a computable function f such that |V_e| < f(e). A set is computably finitely random if it passes all computably finite Martin-Löf tests.

Interestingly, for Δ^0_2 sets, finite randomness and 1-randomness coincide.

Proposition 7.8.1. If A is finitely random and Δ^0_2 then A is 1-random.

Proof. Suppose that A is Δ^0_2 and not 1-random. Let U_0, U_1, ... be a universal Martin-Löf test. Let V_0, V_1, ... be the test defined by letting V_i[s] = U_i[t] for the largest t ≤ s such that A_s ∈ U_i[t], if there is one, and for t = s otherwise. Then V_0, V_1, ... is a finite test, and its intersection contains A.
Beyond the Δ^0_2 sets the notions are quite different. One could argue that notions of finite randomness are not really randomness notions (because they are not stochastic), but it might in any case be interesting to see what insight they give into computability. We do not cover these notions further here, but point out one recent theorem, which indicates that they do provide some insight. A degree is totally ω-c.e. if every function it computes is ω-c.e.

Theorem 7.8.2 (Brodhead, Downey, and Ng [44]). A computably enumerable degree contains a computably finitely random set iff it is not totally ω-c.e.

At the time of writing, little else is known about these notions.
7.9 Injective and permutation randomness

Two variations on nonmonotonic randomness were suggested by Miller and Nies [278] as possible methods of attacking Question 7.5.6. Both use the idea of restricting betting strategies to oblivious scan rules; that is, no matter what the input sequence A is, the places to be scanned are determined exactly by the previous places we have bet on, independently of the observed values of A at these places, so that the scan rule maps N^{<ω} to N. Thus, there is an f such that if we have bet on positions n_0, ..., n_k, then we will bet on position f(n_0, ..., n_k), where of course this value cannot be in {n_0, ..., n_k}. A sequence is injectively random if it is nonmonotonically random for all computable betting strategies based on such scan rules. It is permutation random if it is nonmonotonically random for all computable betting strategies based on such scan rules with the additional requirement that the strategy must bet on all bits eventually. (That is, the scan rule is also surjective.) Clearly, every nonmonotonically random set is injectively random, every injectively random set is permutation random, and every permutation random set is computably random.

The following is what is known in this area. Due to space limitations, we only sketch the proof.

Theorem 7.9.1 (Kastermans and Lempp [199]). There is an injectively random set that is not 1-random.

Proof sketch. We need to build a c.e. martingale M and a set X such that M is unbounded along X but X is injectively random. We will think of M as an infinite sum of uniformly c.e. martingales M_i, each weighted so as to make sure that M = Σ_i M_i is a martingale. Each M_i bases its procedure on a guess; in order to win, it suffices that one of the M_i be unbounded along X. This idea is familiar from Theorem 6.3.7, where we constructed an optimal supermartingale, and Corollary 7.4.7. In the following, we will restrict ourselves to dealing with just two injective computable
martingales. For the general case, dealing with all such martingales, see [199].

We first sketch a construction to make X permutation random: We need to ensure that each permutation martingale is bounded on X. For a single permutation martingale d, we have our c.e. martingale M_0 guess that d is total. Consequently, M_0 will always favor the successor of a node on which d gains less. In this way, we determine a sequence A on which M_0 succeeds, but such that d is bounded on A, so if M_0's guess is correct, we can take X = A. It is true that d is not monotonic, so d may be betting elsewhere before it bets on the next bit. However, since d is assumed to be total, we can always wait until bets have been placed on the next bit and then simply take the statistical average, thereby providing a "monotonized" version of d. We also have, for the ith node σ ∈ 2^{<ω}, a c.e. martingale M_{i+1} guessing that d is not defined on σ. This martingale can simply concentrate its capital on a sequence B extending σ, and if its guess is correct, we can take X = B.

To defeat two permutation martingales d_0 and d_1, each with initial capital 1, say, we have two cases for d_1: Under the guess that d_0 is not defined on σ, we can simply run the above strategy in the cone above σ. Under the guess that d_0 is total, the strategy for d_1 is more complicated; we cannot simply run the above strategy since d_1 may be undefined only on nodes σ where d_0 is very large. (Strictly speaking, this is not a problem for this two-strategy case, but it would be for the general construction.) So we modify the above strategy by distinguishing the following cases:

(i) d_1 is undefined on a node σ at which d_0 is bounded by 2, say. We can then safely restrict our attention to nodes above σ and proceed as for a single permutation martingale.

(ii) d_1 is total when restricted to nodes σ at which d_0 is bounded by 2.
We can then pretend that d_1 is indeed total without restrictions by modifying d_1 to a permutation martingale d_1′ that behaves like d_1 but bets evenly as soon as a node σ is reached at which d_0 reaches 2. We now use a "monotonized" version of d_0 + (1/2c)·d_1′, for some suitable c. Since our sequence X will not extend any σ at which d_0 has reached 2, it is safe to use d_1′ in place of d_1.

Finally, to deal with injective martingales, we need one more ingredient. The additional difficulty is that we can no longer "monotonize" as above, since we cannot afford to wait until all bets on the next bit have been placed. Instead, for a single injective martingale, M bets only on bits for which it knows that d bets as well (i.e., an infinite computable subset of the bits on which d bets). To deal with two injective martingales d_0 and d_1, say, we distinguish cases, dealt with by separate strategies. If there are infinitely many bits on which both d_0 and d_1 bet, then we simply restrict ourselves to an infinite computable set of such bits. If there are only finitely many bits on which both d_0 and d_1 bet, say none past bit k, then past this bit k, we can run separate strategies against d_0 and d_1 on disjoint sets of bits,
which will not interfere with each other. The details are a bit messy; see Kastermans and Lempp [199] for the full proof.

There are two views on the above theorem. The first is that injective randomness is very different from nonmonotonic randomness, and the theorem says nothing about the question of whether nonmonotonic randomness and 1-randomness coincide. The second view is that perhaps the result could be generalized to solve that question. Maybe some kind of measure-theoretic generalization of the Kastermans-Lempp argument (looking at the measure of where two nonmonotonic martingales coincide) might work, but the problem seems difficult.

There seem to be a number of other potentially interesting randomness notions intermediate between computable randomness and 1-randomness. We mention two that can be motivated by thinking about the coding of tests into supermartingales in the proof of Schnorr's Theorem 6.3.4 that a set is 1-random iff no c.e. supermartingale succeeds on it. There, we start with a uniformly c.e. sequence R_0, R_1, ... of prefix-free generators for a Martin-Löf test. We build a c.e. supermartingale d that bets evenly on σ0 and σ1 until it finds that, say, σ0 ∈ R_n, at which point it starts to favor σ0, to an extent determined by n. If later d finds that σ1 ∈ R_m, then what it does is determined by the relationship between m and n. If m < n then d still favors σ0, though to a lesser extent than before. If m = n then d again bets evenly on σ0 and σ1. If m > n then d switches allegiance and favors σ1. Of course, this switch can happen several times, as we find more R_i to which σ0 or σ1 belong.

The computable enumerability of d is essential in the above. A computable supermartingale (which we have seen we may assume is rational-valued without loss of generality) has to decide which side to favor, if any, immediately. Hitchcock has asked whether an intermediate notion, where we allow our supermartingale to be c.e.
but do not allow it to switch allegiances in the way described above, is still powerful enough to capture 1-randomness. The purest version of this question was suggested by Kastermans, in discussing Hitchcock's question with Downey and Lempp. A Kastergale (footnote 16) is a pair consisting of a partial computable function g : 2^{<ω} → {0, 1} and a c.e. supermartingale d such that g(σ) ↓ = i iff ∃s (d_s(σi) > d_s(σ(1-i))) iff d(σi) > d(σ(1-i)). A set is Kastermans random if no Kastergale succeeds on it. A Hitchgale is the same as a Kastergale, except that in addition the proportion d_s(τj)/d_s(τ) (where we regard 0/0 as being 0) is a Σ^0_1 function, so that if we ever decide to bet some percentage of our capital on τj, then we are committed to betting at least that percentage from that point on, even if our total capital on τ increases later. A set is Hitchcock random if no Hitchgale succeeds on it. It is open whether these notions differ from each other or from 1-randomness.

Footnote 16: We reflect here this (so far informally adopted) practice of portmanteau naming of martingale varieties, without necessarily endorsing it. It may be worth noting that a martingale is not a martin-gale. The Oxford English Dictionary suggests that the word comes from the name of the French town of Martigues.
8 Algorithmic Randomness and Turing Reducibility
In this chapter, we look at the distribution of 1-random and n-random degrees among the Turing (and other) degrees. Among the major results we discuss are the Kučera-Gács Theorem 8.3.2 [167, 215], that every set is computable from a 1-random set; Theorem 8.8.8, due to Barmpalias, Lewis, and Ng [24], that every PA degree is the join of two 1-random degrees; and Stillwell's Theorem 8.15.1 [382], that the "almost all" theory of degrees is decidable. The latter uses Theorem 8.12.1, due to de Leeuw, Moore, Shannon, and Shapiro [92], that if a set is c.e. relative to positive measure many sets, then it is c.e. This result has as a corollary the fundamental result, first explicitly formulated by Sacks [341], that if a set is computable relative to positive measure many sets, then it is computable. Its proof will introduce a basic technique called the majority vote technique. We also explore several other topics, such as the relationship between genericity and randomness; the computational power of n-random sets, especially with respect to n-fixed point free functions; and some important unpublished results of Kurtz [228] on properties that hold of almost all sets, in the sense of measure. These results use a technique called risking measure, which has recently found applications elsewhere, such as in Downey, Greenberg, and Miller [108].
R.G. Downey and D. Hirschfeldt, Algorithmic Randomness and Complexity, Theory and Applications of Computability, DOI 10.1007/978-0-387-68441-3_8, © Springer Science+Business Media, LLC 2010
8.1 Π^0_1 classes of 1-random sets

For each n ≥ 1, relativizing Ω to ∅^{(n-1)} shows that there are n-random sets computable in ∅^{(n)} (footnote 1). We can get a stronger result using Π^0_1 classes. For each constant c, the class {A : ∀n K(A ↾ n) ≥ n - c} is clearly a Π^0_1 class, so basic facts about Π^0_1 classes give us several interesting results about the 1-random sets, which have been noted by several authors.

Proposition 8.1.1. The collection of 1-random sets is a Σ^0_2 class.

Proposition 8.1.2. There exist low 1-random sets.

Proof. Apply the Low Basis Theorem 2.19.8.

Proposition 8.1.3. There exist 1-random sets of hyperimmune-free degree.

Proof. Apply the Hyperimmune-Free Basis Theorem 2.19.11.

Proposition 8.1.2 is obviously particular to 1-randomness, since no 2-random set can be Δ^0_2, let alone low. In Theorem 8.21.2, we will see that this is also the case for Proposition 8.1.3, since every 2-random set has hyperimmune degree. On the other hand, the above results obviously relativize, so, for instance, for all n ≥ 1, there exist n-random sets with ∅^{(n)}-computable jumps.
8.2 Computably enumerable degrees

Kučera [215] completely answered the question of which c.e. degrees contain 1-random sets, by showing that the only such degree is the complete one.

Theorem 8.2.1 (Kučera [215]). If A is 1-random, B is a c.e. set, and A ≤_T B, then B ≡_T ∅′. In particular, if A is 1-random and has c.e. degree, then A ≡_T ∅′.

Proof. Kučera's original proof used Arslanov's Completeness Criterion (Theorem 2.22.3). We give a new direct proof. Suppose that A is 1-random, B is c.e., and Ψ^B = A. Let c be such that K(A ↾ n) > n - c for all n. We will enumerate a KC set, with constant d given by the recursion theorem. Let k = c + d + 1. We want to ensure that if B ↾ ψ^B(n+k)[s] = B_s ↾ ψ^B(n+k)[s], then ∅′_s(n) = ∅′(n), which clearly ensures that ∅′ ≤_T B. We may assume that our approximations are sufficiently sped up so that Ψ^B(n+k)[s] ↓ for all s and all n ≤ s.

If n enters ∅′ at some stage s ≥ n, we wish to change B below ψ^B(n+k)[s]. We do so by forcing A to change below n + k. That is, we enumerate a KC

Footnote 1: It is also easy to construct an arithmetically random set computable from ∅^{(ω)}.
request (n+1, A_s ↾ (n+k)). This request ensures that K(A_s ↾ (n+k)) ≤ n + d + 1 = n + k - c, and hence that A ↾ (n+k) ≠ A_s ↾ (n+k). Thus B ↾ ψ^B(n+k)[s] ≠ B_s ↾ ψ^B(n+k)[s], as required. (Note that we enumerate at most one request, of weight 2^{-(n+1)}, for each n ∈ N, and hence our requests indeed form a KC set.)

In the above proof, if Ψ is a wtt-reduction, then so is the reduction from B to ∅′ that we build. Hence, we have the following result.

Corollary 8.2.2. If A is 1-random, B is a c.e. set, and A ≤_wtt B, then B ≡_wtt ∅′. In particular, if A is 1-random and has c.e. wtt-degree, then A ≡_wtt ∅′.
8.3 The Kučera-Gács Theorem

In this section we present a basic result, proved independently by Kučera [215] and Gács [167], about the distribution of 1-random sets and the relationship between 1-randomness and Turing reducibility. Specifically, we show that every set is computable from some 1-random set. We begin with an auxiliary result.

Lemma 8.3.1 (Space Lemma, see Merkle and Mihailović [267]). Given a rational δ > 1 and an integer k > 0, we can compute a length l(δ, k) such that, for any martingale d and any σ,

|{τ ∈ 2^{l(δ,k)} : d(στ) ≤ δ d(σ)}| ≥ k.

Proof. By Kolmogorov's Inequality (Theorem 6.3.3), for any l and σ, the average of d(στ) over strings τ of length l is d(σ). Thus

|{τ ∈ 2^l : d(στ) > δ d(σ)}| / 2^l < 1/δ.

Let l(δ, k) = ⌈log (k / (1 - δ^{-1}))⌉, which is well defined since δ > 1. Then

|{τ ∈ 2^{l(δ,k)} : d(στ) > δ d(σ)}| < 2^{l(δ,k)} / δ,

so

|{τ ∈ 2^{l(δ,k)} : d(στ) ≤ δ d(σ)}| > 2^{l(δ,k)} - 2^{l(δ,k)}/δ = 2^{l(δ,k)} (1 - δ^{-1}) ≥ (k / (1 - δ^{-1})) (1 - δ^{-1}) = k.
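The computation of l(δ, k) and the counting bound can be checked numerically. In the sketch below (ours; the function names are illustrative), a martingale's values at the 2^l extensions of σ are modeled by nonnegative numbers averaging d(σ) = 1, which is exactly the constraint that Kolmogorov's Inequality provides.

```python
import math
import random

def space_lemma_length(delta, k):
    """l(delta, k) = ceil(log2(k / (1 - 1/delta))), as in the Space Lemma."""
    return math.ceil(math.log2(k / (1 - 1 / delta)))

def normalized_leaves(l, rng):
    """Nonnegative values at the 2^l extensions of sigma, scaled so
    their average is 1 (i.e., d(sigma) = 1, by Kolmogorov's Inequality)."""
    leaves = [rng.random() + 0.01 for _ in range(2 ** l)]
    avg = sum(leaves) / len(leaves)
    return [x / avg for x in leaves]

delta, k = 2.0, 3
l = space_lemma_length(delta, k)   # ceil(log2(3 / 0.5)) = 3
leaves = normalized_leaves(l, random.Random(0))
# At least k extensions tau satisfy d(sigma tau) <= delta * d(sigma);
# this holds for ANY values averaging 1, which is the point of the lemma.
assert sum(1 for x in leaves if x <= delta) >= k
```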
One should think of the Space Lemma as saying that, for any σ and any martingale d, there are at least k many extensions τ of σ of length
|σ| + l(δ, k) such that d cannot increase its capital at σ by more than a factor of δ while betting along τ.

We wish to show that a given set X can be coded into a 1-random set R. Clearly, unless X itself is 1-random, there is no hope of doing such a coding if the reduction allows for the recovery of X as an identifiable subsequence of R. For instance, it would be hopeless to try to ensure that X ≤_m R. As we will show, however, we can get X ≤_wtt R. The key idea in the proof below, which we take from Merkle and Mihailović [267], is to use the intervals provided by the Space Lemma to do the coding.

Theorem 8.3.2 (Kučera [215], Gács [167]). Every set is wtt-reducible to a 1-random set.

Proof. Let d be a universal c.e. martingale. We assume that we have applied the savings trick of Proposition 6.3.8 to d, so that for any R, if lim inf_n d(R ↾ n) < ∞ then R is 1-random. Let r_0 > r_1 > ··· > 1 be rationals such that, letting β_i = Π_{j≤i} r_j, the sequence {β_i}_{i∈ω} converges to some value β. Let l_s = l(r_s, 2) be as in the Space Lemma, which means that for any σ there are at least two strings τ of length l_s with d(στ) ≤ r_s d(σ). Partition N into consecutive intervals {I_s}_{s∈ω} with |I_s| = l_s.

Fix a set X. We construct a 1-random set R that wtt-computes X. At stage s, we specify R on the elements of I_s. We denote the part of R specified before stage s by σ_s. (That is, σ_s = R ↾ Σ_{i<s} l_i.) If s > 0, then we assume by induction that d(σ_s) ≤ β_{s-1}. We say that a string τ of length l_s is s-admissible if d(σ_s τ) ≤ β_s. Since β_s = r_s β_{s-1} (when s > 0) and l_s = l(r_s, 2), there are at least two s-admissible strings. Let τ_0 and τ_1 be the lexicographically least and greatest among such strings, respectively. Let σ_{s+1} = σ_s τ_i, where i = X(s).

Now lim inf_n d(R ↾ n) ≤ β, so R is 1-random. We now show how to compute X(s) from σ_{s+1} = R ↾ Σ_{i≤s} l_i. We know that σ_{s+1} is either the leftmost or the rightmost s-admissible extension of σ_s, and being s-admissible is clearly a co-c.e.
property, so we wait until either all extensions of σ_s to the left of σ_{s+1} are seen to be not s-admissible, or all extensions of σ_s to the right of σ_{s+1} are seen to be not s-admissible. In the first case, s ∉ X, while in the second case, s ∈ X.

We will refer to the kind of coding used in the above proof as Gács coding, as it is an adaptation by Merkle and Mihailović [267] of the original method used by Gács [167]. Merkle and Mihailović [267] also adapted the methods of Gács [167] to show that the use of the wtt-reduction in the Kučera-Gács Theorem can be quite small, on the order of n + o(n). This proof is along the lines of the above, but is a bit more delicate. We will see in Theorem 9.13.2 that this bound cannot be improved to n + O(1).
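Gács coding can be simulated end to end against a fixed computable martingale. The sketch below is ours: it uses the 2/3-on-0 strategy from Section 7.5 as a stand-in for the universal c.e. martingale of the proof (which makes admissibility decidable rather than merely co-c.e.), and the block lengths and capital bounds are chosen by hand rather than via the Space Lemma.

```python
from fractions import Fraction

def d(sigma):
    """A concrete computable martingale to bet against: capital is
    multiplied by 4/3 on a 0 and by 2/3 on a 1."""
    c = Fraction(1)
    for b in sigma:
        c *= Fraction(4, 3) if b == "0" else Fraction(2, 3)
    return c

def extensions(l):
    return [format(i, "0%db" % l) for i in range(2 ** l)]  # lexicographic order

def admissible(prefix, tau, bound):
    return d(prefix + tau) <= bound

def encode(bits, lengths, bounds):
    """Code bit s by taking the leftmost s-admissible extension for 0
    and the rightmost for 1, as in the proof of Theorem 8.3.2."""
    sigma = ""
    for s, x in enumerate(bits):
        good = [t for t in extensions(lengths[s]) if admissible(sigma, t, bounds[s])]
        sigma += good[0] if x == 0 else good[-1]
    return sigma

def decode(sigma, lengths, bounds):
    """Recover the coded bits: since d is computable here, we can directly
    check whether each block is the leftmost or rightmost admissible one."""
    bits, prefix = [], ""
    for s, l in enumerate(lengths):
        tau = sigma[len(prefix):len(prefix) + l]
        good = [t for t in extensions(l) if admissible(prefix, t, bounds[s])]
        bits.append(0 if tau == good[0] else 1)
        prefix += tau
    return bits

lengths = [3, 3, 3]
bounds = [Fraction(1)] * 3   # capital stays bounded, so the coded R cannot be defeated by d
assert decode(encode([1, 0, 1], lengths, bounds), lengths, bounds) == [1, 0, 1]
```

With this particular d, every length-3 block has at least four extensions whose multiplier is at most 16/27, so the induction hypothesis d(σ_s) ≤ 1 is maintained and the leftmost and rightmost admissible extensions are always distinct.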
8.4 A "no gap" theorem for 1-randomness

The Kučera-Gács Theorem allows us to examine more deeply the characterization of 1-randomness by Miller and Yu, Theorem 6.7.2. We recall that this theorem states that a set is 1-random iff any of the following equivalent conditions holds.

(i) C(A ↾ n | n) ≥ n - K(n) - O(1).

(ii) C(A ↾ n) ≥ n - K(n) - O(1).

(iii) C(A ↾ n) ≥ n - g(n) - O(1) for every computable function g such that Σ_n 2^{-g(n)} < ∞.

(iv) C(A ↾ n) ≥ n - G(n) - O(1), where G(n) = K_{s+1}(t) if n = ⟨s, t⟩ and K_{s+1}(t) ≠ K_s(t), and G(n) = n otherwise.
This function is clearly a Solovay function in the sense of Definition 3.12.3, that is, a computable upper bound for K that agrees with K up to a constant infinitely often. Using the Kučera-Gács Theorem, Bienvenu and Downey [39] showed that this is not a coincidence, as all functions that can take the place of G in Theorem 6.7.2 are Solovay functions.

Theorem 8.4.1 (Bienvenu and Downey [39]). Let g be a function such that A is 1-random iff C(A ↾ n) ≥ n - g(n) - O(1). Then g is a Solovay function.

To prove this result, we will show that in both characterizations of 1-randomness, K(A ↾ n) ≥ n - O(1) and C(A ↾ n) ≥ n - K(n) - O(1), the lower bound on the complexity is very sharp; that is, there is no "gap phenomenon". In Corollary 6.6.2, we saw that A is 1-random iff lim_n K(A ↾ n) - n = ∞. Together with Schnorr's Theorem 6.2.3, this result gives the following dichotomy: Either A is not 1-random, in which case K(A ↾ n) - n takes arbitrarily large negative values, or A is 1-random, in which case K(A ↾ n) - n tends to infinity. So, for instance, there is no A such that K(A ↾ n) = n ± O(1).

We can ask whether this dichotomy is due to a gap phenomenon. That is, is there a function h that tends to infinity such that for every 1-random A, we have K(A ↾ n) ≥ n + h(n) - O(1)? Similarly, is there a function g that tends to infinity such that for every A, if K(A ↾ n) ≥ n - g(n) - O(1) then A is 1-random? We will give negative answers to both of these questions, as well as to their plain complexity counterparts.

Theorem 8.4.2 (Bienvenu [37], see also Bienvenu and Downey [39]). (i) There is no function h that tends to infinity such that if K(A ↾ n) ≥ n - h(n) - O(1) then A is 1-random.
328
8. Algorithmic Randomness and Turing Reducibility
In fact, there is no function h that tends to infinity such that if K(A ↾ n) ≥ n − h(n) − O(1) then A is Church stochastic (as defined in Definition 7.4.1).
(ii) There is no function h such that if C(A ↾ n) ≥ n − K(n) − h(n) − O(1) then A is 1-random (or even Church stochastic).

Proof. Since we can replace h by the function n ↦ min{h(k) : k ≥ n}, we may assume that h is nondecreasing. We build a single A that establishes all parts of the theorem simultaneously. Let h⁻¹(n) be the least m such that h(m) = n. If h is computable, then we can simply take a 1-random X and obtain A by inserting zeroes into X at a sufficiently sparse set of positions, say h⁻¹(0), h⁻¹(1), . . . .² However, we are working with an arbitrary order h, which may grow more slowly than any computable order, so this direct construction may not work, since inserting zeroes at a noncomputable set of positions may not affect the randomness of the resulting set. To overcome this problem, we invoke the Kučera–Gács Theorem and choose our 1-random set X so that h ≤_T X. Intuitively, the sequence A resulting from inserting zeroes into X at positions h⁻¹(0), h⁻¹(1), . . . should not be 1-random, or even Church stochastic, since the part of A consisting of the original bits of X should be able to compute the places at which zeroes were inserted. However, the insertion of zeroes may destroy the Turing reduction Φ from X to h⁻¹. In other words, looking at A, we may not be able to distinguish the bits of X from the inserted zeroes. We solve this problem by delaying the insertion of the zeroes to “give Φ enough time” to compute the positions of the inserted zeroes. More precisely, we insert the kth zero at position n_k = h⁻¹(k) + t(k), where t(k) is the time needed by Φ to compute h⁻¹(k) from X. This way, n_k is computable from X ↾ n_k in time at most n_k. We can then construct a computable selection rule that selects precisely the inserted zeroes, witnessing the fact that A is not Church stochastic (and hence not 1-random).
Moreover, since the “insertion delay” makes the inserted zeroes even more sparse, we still have K(A ↾ n) ≥ n − h(n) − O(1). And similarly, since X is 1-random, by Theorem 6.7.2 we have C(A ↾ n) ≥ n − K(n) − h(n) − O(1).
The details are as follows. We may assume that t is a nondecreasing function. Let A be the set obtained by inserting zeroes into X at the positions h⁻¹(n) + t(n). To show that A is not Church stochastic, we construct a (total) computable selection rule S that filters the inserted zeroes out of A. We proceed by induction to describe the action of S on a sequence Z. Let k_n be the number of bits selected by S from Z ↾ n and let σ_n be the string obtained from Z ↾ n by removing all selected bits. At stage n + 1, the rule S selects the nth bit of Z iff Φ^{σ_n}(k_n)[n] ↓ and, where s is the computation time of Φ^{σ_n}(k_n), we have Φ^{σ_n}(k_n) + s = n.
Clearly, S is a total computable selection rule. We claim that, on A, the rule S selects exactly the zeroes that were inserted into X to obtain A. Suppose that the first i many bits of A selected by S are the first i many inserted zeroes, the last of these being A(m). Let A(n) be the next inserted zero. Let l ∈ (m, n) and assume by induction that no bit of A in (m, l) is selected by S. Then σ_l ≺ X and k_l = i, so if Φ^{σ_l}(k_l) ↓ then Φ^{σ_l}(k_l) = Φ^X(i) = h⁻¹(i), and the time of this computation is t(i). But Φ^X(i) + t(i) = n ≠ l, since l is not an insertion place, and hence A(l) is not selected by S. So, by induction, no bit of A in (m, n) is selected by S. Thus σ_n ≺ X and k_n = i. Furthermore, since n is the (i + 1)st insertion place, n = h⁻¹(i) + t(i), and Φ^{σ_n}(k_n)[n] ↓ = Φ^X(i) = h⁻¹(i), so A(n) is selected by S. Thus we have established our claim, which implies that A is not Church stochastic.
Now fix n and let i be the number of inserted zeroes in A ↾ n. Then X ↾ (n − i) can be computed from A ↾ n, so K(A ↾ n) ≥ K(X ↾ (n − i)) − O(1) ≥ n − i − O(1). By the choice of insertion positions, i ≤ h(n), so K(A ↾ n) ≥ n − h(n) − O(1). A similar calculation shows that C(A ↾ n) ≥ n − K(n) − h(n) − O(1). □
[Footnote 2: This approach was refined by Merkle, Miller, Nies, Reimann, and Stephan [269], who used an insertion of zeroes at a co-c.e. set of positions to construct a left-c.e. real that is not von Mises–Wald–Church stochastic but has initial segments of high complexity.]
The above construction can also be used to get the following dual version of Theorem 8.4.2.

Corollary 8.4.3 (Bienvenu and Downey [39]). There is no function g that tends to infinity such that K(A ↾ n) ≥ n + g(n) − O(1) for every 1-random A. Similarly, there is no function g that tends to infinity such that C(A ↾ n) ≥ n − K(n) + g(n) − O(1) for every 1-random A.

Proof. We prove the first statement, the proof of the second being essentially identical. Assume for a contradiction that such a g exists. We may assume that g is nondecreasing. Let h be an order that is sufficiently slow growing that h(n) ≤ g(n − h(n)) + O(1), and perform the construction in the proof of Theorem 8.4.2.
Then, letting i be the number of inserted zeroes in A ↾ n, we have K(A ↾ n) ≥ K(X ↾ (n − i)) − O(1) ≥ n − i + g(n − i) − O(1) ≥ n − h(n) + g(n − h(n)) − O(1) ≥ n − O(1), contradicting the fact that A is not 1-random. □
Miller and Yu [279] extended the above result to any unbounded function (rather than only those that tend to infinity). We will present this result in Theorem 10.4.3. Bienvenu and Downey [39] also used the results above
to give a different proof of Theorem 6.7.2. A further application of these results is to prove Theorem 8.4.1.

Proof of Theorem 8.4.1. Let g be a computable function such that A is 1-random iff C(A ↾ n) ≥ n − g(n) − O(1). Let h(n) = g(n) − K(n). Then A is 1-random iff C(A ↾ n) ≥ n − K(n) − h(n) − O(1), so by Theorem 8.4.2, h cannot tend to infinity, and thus g is a Solovay function. □

The consequences of Theorem 8.4.2 go beyond its applications to Solovay functions. For example, it can be used to show that Schnorr randomness does not imply Church stochasticity, as mentioned in Section 7.4.1. By Theorem 7.3.4, if K(A ↾ n) ≥ n − h(n) − O(1) for some computable order h, then A is Schnorr random. On the other hand, Theorem 8.4.2 says that this condition is not sufficient for A to be Church stochastic. One can also adapt the proof of Theorem 8.4.2 to separate Church stochasticity from Schnorr randomness within the left-c.e. reals, as we now sketch. Let α be a 1-random left-c.e. real. Let t(n) be least such that t(n) > t(n − 1) and |α − α_{t(n)}| < 2^{−n}. Let β be obtained from α by inserting zeroes at positions t(0), t(1), . . . . Since t is approximable from below, β is left-c.e., and by the same reasoning as above, it is not Church stochastic. Let f(n) be the least m such that t(m) > n. The same kind of computation as above shows that K(β ↾ n) ≥ n − f(n) − O(1). It is easy to show that t grows faster than any computable function, so f grows more slowly than any computable order, whence β is Schnorr random. This result improves a theorem of Merkle, Miller, Nies, Reimann, and Stephan [269], who proved an equivalent fact for a weaker notion of stochasticity. For details on that result, see Bienvenu [37].
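The delayed-insertion construction in the proof of Theorem 8.4.2 is concrete enough to simulate. The following toy Python sketch is not from the text: the functions hinv and t are computable stand-ins for h⁻¹ and for the runtime of the reduction Φ, which in the actual proof need not be computable. It inserts zeroes at the positions n_k = h⁻¹(k) + t(k) and runs the selection rule, which selects bit n exactly when the simulated computation returns a value v in time s with v + s = n.

```python
# Toy illustration (not from the text) of the delayed zero-insertion in the
# proof of Theorem 8.4.2. hinv and t are invented computable stand-ins.

def insert_zeroes(x_bits, positions, length):
    """Build A by inserting a 0 at each position in `positions` (increasing)."""
    a, xi, pos = [], 0, set(positions)
    for n in range(length):
        if n in pos:
            a.append(0)           # inserted zero
        else:
            a.append(x_bits[xi])  # next original bit of X
            xi += 1
    return a

hinv = lambda k: 3 * k            # stand-in for h^{-1}
t = lambda k: k + 2               # stand-in for the computation time of Phi

LENGTH = 25
positions, k = [], 0
while hinv(k) + t(k) < LENGTH:    # insertion positions n_k = hinv(k) + t(k)
    positions.append(hinv(k) + t(k))
    k += 1

def select_inserted(a):
    """Selection rule S: select bit n iff the simulated Phi^{sigma_n}(k_n)
    halts with value v in time s and v + s = n. With our stand-ins,
    Phi^{sigma}(k) returns hinv(k) in time t(k), so the test becomes
    hinv(k_n) + t(k_n) == n."""
    selected, kn = [], 0          # kn: number of bits selected so far
    for n, bit in enumerate(a):
        if hinv(kn) + t(kn) == n:
            selected.append((n, bit))
            kn += 1
    return selected
```

In the real construction the rule only has access to σ_n, the non-selected bits, and simulates Φ on it; the delay t(k) is exactly what makes the arithmetic test succeed at the insertion positions and fail elsewhere.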
8.5 Kučera coding

One of the most basic questions we might ask about the distribution of 1-random sets is which Turing degrees contain such sets. Kučera [215] gave the following partial answer, which was later extended by Kautz [200] to n-random sets using the jump operator, as we will see in Theorem 8.14.3.

Theorem 8.5.1 (Kučera [215]). If a ≥ 0′ then a contains a 1-random set.

Proof. Let a ≥ 0′ and let X ∈ a. Let R be as in the proof of Theorem 8.3.2. Then X ≤_T R. Furthermore, we can compute R if we know X and can tell which sets are s-admissible for each s. The latter question can be answered by ∅′. Since X ≥_T ∅′, it follows that R ≤_T X. Thus R is a 1-random set of degree a. □

We now give a different proof of Theorem 8.5.1, which is of course also an alternative proof of the Kučera–Gács Theorem 8.3.2, along the lines of the original proof of Kučera [215]. This version is due to Reimann [324] (see
also Reimann and Slaman [327]). Let S_0, S_1, . . . be an effective list of the prefix-free c.e. subsets of 2^{<ω}, and let S_{k,s} be the stage s approximation to S_k. We begin with Kučera’s useful construction of a universal Martin-Löf test.

Kučera’s universal Martin-Löf test. Fix n ∈ N. For each e > n, enumerate all elements of S_{Φ_e(e)} (where we take S_{Φ_e(e)} to be empty if Φ_e(e) ↑) into a set R_n as long as ∑_{σ ∈ S_{Φ_e(e),s}} 2^{−|σ|} < 2^{−e}. (That is, if this sum ever gets to 2^{−e}, stop enumerating the elements of S_{Φ_e(e)} into R_n before it does.) Let U_n = ⟦R_n⟧. The R_n are uniformly c.e. and ∑_{σ ∈ R_n} 2^{−|σ|} ≤ ∑_{e > n} 2^{−e} = 2^{−n}. Thus {U_n}_{n∈ω} is a Martin-Löf test. To see that it is universal, let {V_n}_{n∈ω} be a Martin-Löf test. Let e be such that V_i = ⟦S_{Φ_e(i)}⟧ for all i. Every computable function possesses infinitely many indices, so for each n there is an i > n such that Φ_e = Φ_i. For such an i, we have S_{Φ_i(i)} = S_{Φ_e(i)}, which generates V_i, and hence every element of S_{Φ_i(i)} is enumerated into R_n. So V_i ⊆ U_n, and thus ⋂_m V_m ⊆ ⋂_n U_n.
We next need a lemma similar in spirit to the Space Lemma.

Lemma 8.5.2. Let {U_n}_{n∈N} be Kučera’s universal Martin-Löf test, and let P_n be the complement of U_n. Let C be a Π^0_1 class. Then there exists a computable function γ : 2^{<ω} × N → Q⁺ such that for any σ ∈ 2^{<ω} and n ∈ N,
P_n ∩ C ∩ ⟦σ⟧ ≠ ∅ ⇒ μ(C ∩ ⟦σ⟧) ≥ γ(σ, n).

Proof. It is easy to check that if P_n ∩ C ∩ ⟦σ⟧ ≠ ∅ then μ(C ∩ ⟦σ⟧) > 0, but we need to show that we can give an effective positive lower bound for μ(C ∩ ⟦σ⟧). Since C is a Π^0_1 class, there is an index k such that C = 2^ω ∖ ⟦S_k⟧. Fix σ and n, and define the partial computable function Φ as follows. On input j, search for an s such that μ(⟦σ⟧ ∖ ⟦S_{k,s}⟧) < 2^{−j}. If such an s is ever found, then let Φ(j) be such that S_{Φ(j)} is finite, ⟦S_{Φ(j)}⟧ covers ⟦σ⟧ ∖ ⟦S_{k,s}⟧, and μ(⟦S_{Φ(j)}⟧) ≤ 2^{−j}. Clearly, {⟦S_{Φ(j)}⟧}_{j∈ω} is a Martin-Löf test. Let e > n be an index such that Φ_e = Φ, obtained effectively from σ and n. Let γ(σ, n) = 2^{−e}. Suppose that Φ_e(e) ↓.
Then ⟦S_{Φ_e(e)}⟧ ⊆ U_n, so there is an s such that ⟦σ⟧ ∖ ⟦S_{k,s}⟧ ⊆ U_n. But C ∩ ⟦σ⟧ = ⟦σ⟧ ∖ ⟦S_k⟧ ⊆ ⟦σ⟧ ∖ ⟦S_{k,s}⟧, so C ∩ ⟦σ⟧ ⊆ U_n, and hence P_n ∩ C ∩ ⟦σ⟧ = ∅. Now suppose that Φ_e(e) ↑. Then μ(⟦σ⟧ ∖ ⟦S_{k,s}⟧) ≥ 2^{−e} for every s, so
μ(C ∩ ⟦σ⟧) = μ(⟦σ⟧ ∖ ⟦S_k⟧) = lim_s μ(⟦σ⟧ ∖ ⟦S_{k,s}⟧) ≥ 2^{−e} = γ(σ, n). □
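The weight cap used in forming R_n above is a simple bookkeeping device: elements of S_{Φ_e(e)} are admitted only while the total weight, including the candidate, stays strictly below 2^{−e}. A toy sketch (not from the text; the function name and the use of a fixed enumeration order are invented conveniences standing in for a c.e. enumeration):

```python
# Toy sketch (not from the text) of the weight cap in Kucera's universal
# Martin-Lof test: admit strings while the running weight sum of 2^{-|sigma|}
# stays below 2^{-e}, stopping before the sum ever reaches the cap.
from fractions import Fraction

def capped_enumeration(strings, e):
    """Admit strings in enumeration order while the running weight,
    including the candidate, remains below 2^(-e)."""
    admitted, weight = [], Fraction(0)
    cap = Fraction(1, 2 ** e)
    for sigma in strings:
        w = Fraction(1, 2 ** len(sigma))
        if weight + w < cap:   # never let the sum reach 2^{-e}
            admitted.append(sigma)
            weight += w
        else:
            break              # stop enumerating into R_n
    return admitted, weight
```

Summing the resulting caps over e > n gives the bound ∑_{e>n} 2^{−e} = 2^{−n} on μ(U_n) used above.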
Alternate proof of Theorem 8.5.1. Let B ≥_T ∅′. We need to show that there is a 1-random set A such that A ≡_T B. Let {U_n}_{n∈ω} be Kučera’s universal Martin-Löf test. We describe how to code B into an element A of P_0, the complement of U_0. Let T be a computable tree such that [T] = P_0 and let E be the set of extendible nodes of T (i.e., those nodes extended by paths of T). Note that E is co-c.e., and hence ∅′-computable. Let γ be as in Lemma 8.5.2 with C = P_0, and let b(σ) = ⌈− log γ(σ, 0)⌉. Then we know that, for any σ, if P_0 ∩ ⟦σ⟧ ≠ ∅, then μ(P_0 ∩ ⟦σ⟧) ≥ 2^{−b(σ)}. Thus, if σ ∈ E then σ has at least two extensions of length b(σ) + 1 in E (otherwise P_0 ∩ ⟦σ⟧ would be contained in a single cylinder of measure 2^{−(b(σ)+1)}). Suppose that, at stage n, we have defined A ↾ m_n for some m_n. Let σ = A ↾ m_n. We assume by induction that σ ∈ E. Let τ_0 and τ_1 be the leftmost and rightmost extensions of σ of length b(σ) + 1 in E, respectively. As pointed out in the previous paragraph, τ_0 ≠ τ_1. Let A ↾ (b(σ) + 1) = τ_i, where i = B(n). Since A ∈ [T] = P_0, we know that A is 1-random. We now show that A ≡_T B. The construction of A is computable in B and ∅′. Since we are assuming that B ≥_T ∅′, we have A ≤_T B. For the other direction, first note that the function n ↦ m_n is computable in A. To see that this is the case, assume that we have already computed m_n. Then m_{n+1} = b(A ↾ m_n) + 1. Now, to compute B(n) using A, let σ = A ↾ m_n and let τ = A ↾ m_{n+1}. We know that τ is either the leftmost or the rightmost extension of σ of length m_{n+1} in E. Since E is co-c.e., we can wait until either all extensions of σ of length m_{n+1} to the left of τ leave E, or all extensions of σ of length m_{n+1} to the right of τ leave E. In the first case, n ∉ B, while in the second case, n ∈ B. □

We call the coding technique in the above proof Kučera coding. The Kučera–Gács Theorem goes against the intuition that random sets should be highly disordered, and hence computationally weak. However, note that this result is quite specific to 1-randomness.
For example, if A is 2-random then it is 1-random relative to ∅′, and hence relative to Ω, so by van Lambalgen’s Theorem, Ω is 1-random relative to A, and hence Ω ≰_T A. Thus, no 2-random set can compute ∅′. (We will have more to say on this topic in the next section.) Furthermore, Theorem 8.5.1 can be combined with van Lambalgen’s Theorem to prove the following result, which shows how randomness is in a sense inversely proportional to computational strength.

Theorem 8.5.3 (Miller and Yu [280]). Let A be 1-random and B be n-random. If A ≤_T B then A is n-random.

Proof. The n = 1 case is trivial, so assume that n > 1. Let X ≡_T ∅^{(n−1)} be 1-random. Since B is n-random, it is 1-random relative to X. By van Lambalgen’s Theorem, X is 1-random relative to B, and hence relative to
A. Again by van Lambalgen’s Theorem, A is 1-random relative to X, and hence relative to ∅^{(n−1)}. Thus A is n-random. □
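Kučera coding admits a toy computable analogue in which the co-c.e. set E of extendible nodes is replaced by a decidable one. The sketch below is not from the text: the class of sequences with no two consecutive 1s stands in for [T], a fixed step of two levels stands in for the lengths b(σ) + 1, and all function names are invented. Each bit is coded by choosing the leftmost or rightmost extendible extension, and decoded by checking which one was chosen.

```python
# Toy sketch (not from the text) of Kucera coding into a perfect class.
# ext(sigma) plays the role of membership in the set E of extendible nodes
# (co-c.e. in the real proof, decidable in this toy).

def ext(sigma):
    return "11" not in sigma

def extendible_extensions(sigma, k):
    """All tau in E extending sigma with |tau| = |sigma| + k, in
    lexicographic order."""
    out = []
    for m in range(2 ** k):
        tau = sigma + format(m, f"0{k}b")
        if ext(tau):
            out.append(tau)
    return sorted(out)

def encode(bits):
    """Code bit 0 by the leftmost, bit 1 by the rightmost, extendible
    extension two levels down."""
    a = ""
    for b in bits:
        exts = extendible_extensions(a, 2)
        a = exts[0] if b == 0 else exts[-1]
    return a

def decode(a, nbits):
    """Recover the bits by checking, at each coding step, whether the
    chosen node is the leftmost or the rightmost extendible extension."""
    bits, sigma = [], ""
    for _ in range(nbits):
        exts = extendible_extensions(sigma, 2)
        tau = a[: len(sigma) + 2]
        bits.append(0 if tau == exts[0] else 1)
        sigma = tau
    return bits
```

In the actual proof the decoder cannot compute E; it instead waits for nodes to leave the co-c.e. set E, which is why the decoding is only a Turing (rather than truth-table) reduction.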
8.6 Demuth’s Theorem

By Theorem 8.2.1, the only 1-random c.e. degree is 0′, so the class of 1-random degrees is not closed downwards, even among the nonzero degrees. The following result shows that there is a sense in which it is closed downwards with respect to truth table reducibility. The proof we give is due to Kautz [200], and uses the results and notation of Section 6.12.

Theorem 8.6.1 (Demuth’s Theorem [95]). If A is 1-random and B ≤_tt A is not computable, then there is a 1-random C ≡_T B.

Proof. Let e be such that Φ_e^A = B and Φ_e^X is total for all oracles X. Let
ρ(σ) = μ({X : ∀n < |σ| (Φ_e^X(n) ↓ = σ(n))})
and let ν = μ_ρ. We claim that B is 1-ν-random. Suppose otherwise, and let {V_n}_{n∈ω} be a Martin-Löf ν-test such that B ∈ ⋂_n V_n. Let U_n = {X : Φ_e^X ∈ V_n}. It is easy to check that the U_n are uniformly Σ^0_1 classes and μ(U_n) = ν(V_n) ≤ 2^{−n}, so the U_n form a Martin-Löf test. But A ∈ ⋂_n U_n, contradicting the hypothesis that A is 1-random. Thus B is 1-ν-random. Let C ∈ ⋂_s (B ↾ s)_ν. Then B = seq_ν C, and since B is not computable, neither is C. By Theorem 6.12.9, C ≡_T B and C is 1-random. □

Too recently for inclusion in this book, Bienvenu and Porter [private communication] gave an example of a set B ≤_tt Ω such that there is no 1-random C ≡_wtt B.

Corollary 8.6.2 (Kautz [200]). There is a 1-random degree a such that every noncomputable b ≤ a is 1-random.

Proof. By Proposition 8.1.3, there is a 1-random set A of hyperimmune-free degree. By Proposition 2.17.7, if B ≤_T A then B ≤_tt A, so the corollary follows from Demuth’s Theorem. □

It is not known whether a can be made hyperimmune. We will extend this result to weak 2-randomness in Corollary 8.11.13. Another nice application of Demuth’s Theorem is an easy proof of an extension of the result of Calude and Nies [51] that Ω is not tt-complete.

Corollary 8.6.3. If A is a noncomputable incomplete left-c.e. real then A ≰_tt Ω.

Proof. If A ≤_tt Ω then, by Demuth’s Theorem, the (c.e.) degree of A contains a 1-random set. But by Theorem 8.2.1, no incomplete c.e. degree is 1-random. □

As a final corollary, pointed out by Nies, we get a short proof of a theorem of Lachlan [233] whose direct proof is considerably more difficult.
Corollary 8.6.4 (Lachlan [233]). There is a c.e. set that is wtt-complete but not tt-complete.

Proof. The previous corollary, combined with Proposition 6.1.2, shows that Ω is wtt-complete but not tt-complete, so it is enough to define a c.e. set A ≡_tt Ω. An example of such a set is the set of dyadic rationals less than Ω. □
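The set of dyadic rationals below Ω determines the bits of Ω with bounded oracle use, which is the content of the tt-equivalence invoked in the proof above. A toy sketch (not from the text; the non-dyadic rational w = 1/3 is an invented stand-in for Ω, and the function names are invented):

```python
# Toy sketch (not from the text): from the set A = {dyadic rationals q : q < w}
# one can read off the binary expansion of w, one oracle query per bit.
from fractions import Fraction

w = Fraction(1, 3)   # non-dyadic stand-in for Omega; Omega itself is left-c.e.

def in_A(q):
    """Membership oracle for A: is the dyadic rational q strictly below w?"""
    return q < w

def first_bits(n):
    """Compute the first n binary digits of w by querying A."""
    bits, lo = [], Fraction(0)
    for i in range(1, n + 1):
        mid = lo + Fraction(1, 2 ** i)
        if in_A(mid):        # mid < w: the i-th bit of w is 1
            bits.append(1)
            lo = mid
        else:
            bits.append(0)
    return bits
```

The queries here are adaptive, but the procedure is total for every oracle and has computable use, so it can be converted into a truth-table reduction by tabulating all possible answer patterns in advance.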
8.7 Randomness relative to other measures

In this section, we revisit the material of Section 6.12. It is natural to ask, given a set X, whether there is a (possibly noncomputable) measure ν such that X is 1-ν-random. (This question can be thought of as asking whether X has some randomness content that can be extracted by an appropriate choice of ν.) Of course, every set is 1-ν-random if it is an atom of ν. If X is computable then {X} is a Π^0_1 class, so being an atom is the only way that X can be 1-ν-random. The following result shows that if X is noncomputable, then there is always a less trivial way to make X be 1-ν-random.

Theorem 8.7.1 (Reimann and Slaman [327]). If A is noncomputable then there is a ν such that A is 1-ν-random and A is not an atom of ν.

The proof of this result uses the relativized form of the following basis theorem. (In Section 15.5 we will study the special case of this basis theorem in which X = Ω.)

Theorem 8.7.2 (Reimann and Slaman [327], Downey, Hirschfeldt, Miller, and Nies [115]). For every 1-random set X and every nonempty Π^0_1 class P, there is an X-left-c.e. real A ∈ P such that X is 1-random relative to A.

Proof. Let
V_i = ⟦{σ : ∃s ∀A ∈ P_s ∃τ ⪯ σ (K^A(τ) ≤ |τ| − i)}⟧.
It is easy to see that the V_i are uniformly Σ^0_1 classes, and that μ(V_i) ≤ 2^{−i}, since if A ∈ P then V_i ⊆ {X : ∃n (K^A(X ↾ n) ≤ n − i)}, and the latter class has measure at most 2^{−i}. Thus {V_i}_{i∈ω} is a Martin-Löf test. If X is not 1-random relative to A for all A ∈ P, then, by compactness, for each i there is a σ ≺ X such that ⟦σ⟧ ⊆ V_i, and hence X ∈ ⋂_i V_i. Thus, if X is 1-random, then there is an A ∈ P such that X is 1-random relative to A. We must still show that A can be taken to be an X-left-c.e. real. For each i, let S_i = {A ∈ P : ∀n (K^A(X ↾ n) > n − i)}. Each S_i is a Π^{0,X}_1 class. We have shown that S_i is nonempty for large enough i. For such an i, the lexicographically least element A of S_i is an X-left-c.e. real satisfying the statement of the theorem. □
We will also need the fact that we can represent the space of all probability measures on 2^ω as a Π^0_1 class P. It is not too difficult to do so using basic methods from effective descriptive set theory; see [325, 327].

Proof of Theorem 8.7.1. It is straightforward to relativize the proof of Theorem 8.5.1 to show that for all X and Y such that X ⊕ Y ≤_T X′, there is an R that is 1-random relative to X such that X ⊕ Y ≡_T X ⊕ R. Let A be noncomputable. By Theorem 2.16.2, there is an X such that X ⊕ A ≤_T X′. Fix such an X and let R be 1-random relative to X and such that X ⊕ A ≡_T X ⊕ R. Let Φ and Ψ be X-computable functionals such that Φ^R = A and Ψ^A = R. Let p(σ) be the shortest string τ such that σ ⪯ Φ^τ and τ ⪯ Ψ^σ. Of course, p(σ) may be undefined, but it will be defined if σ ≺ A. We wish to define a measure ν such that A is 1-ν-random and A is not an atom of ν. To make A be 1-ν-random, we ensure that ν dominates an image measure induced by Φ, thereby ensuring that every 1-random set in the domain of Φ is mapped by Φ to a 1-ν-random set. Thus, we restrict ourselves to measures ν such that μ(⟦p(σ)⟧) ≤ ν(⟦σ⟧) ≤ μ(⟦Ψ^σ⟧) for all σ such that p(σ) is defined. The first inequality ensures the required domination and the second that ν is not atomic on the domain of Ψ. It is easy to see that the subclass M of P consisting of representations of those ν that satisfy the above inequalities is a Π^{0,X}_1 class, since μ(⟦p(σ)⟧) is X-approximable from below and μ(⟦Ψ^σ⟧) is X-approximable from above. Furthermore, M is nonempty, since the inequalities involved in its definition yield compatible nonempty intervals, and hence we can take an extension Γ of Φ to a total functional and use Γ to define a measure with a representation contained in M. By the relativized form of Theorem 8.7.2, there is a P ∈ M such that R is 1-random relative to X ⊕ P. Let ν be the measure encoded by P. Since A is not an atom of ν, we are left with showing that A is 1-ν-random.
Assume for a contradiction that A is covered by the Martin-Löf ν-test {V_n}_{n∈N}. Let G_0, G_1, . . . be uniformly (X ⊕ P)-c.e. prefix-free sets of generators for the V_n. Let U_n = ⋃_{σ ∈ G_n} ⟦p(σ)⟧. The U_n are uniformly Σ^{0,X⊕P}_1 classes, and μ(U_n) ≤ ν(V_n), since for each σ we have μ(⟦p(σ)⟧) ≤ ν(⟦σ⟧). Thus the U_n form a Martin-Löf test relative to X ⊕ P. But for each n there is a σ ≺ A in G_n, and then p(σ) ≺ R, so R ∈ U_n. Thus R ∈ ⋂_n U_n, contradicting the fact that R is 1-random relative to X ⊕ P. □

A rather more subtle question is which sets are 1-ν-random for some continuous measure ν. Reimann and Slaman [327] showed that a variation on the halting problem is never continuously random.³ Subsequently, Kjos-Hanssen and Montalbán (see [327]) noted the following.

[Footnote 3: Their particular example was the set {s_n : n ∈ N}, where s_0 = 0 and s_{n+1} = max(min{s : ∅_s ↾ (n + 1) = ∅′ ↾ (n + 1)}, s_n + 1).]
Theorem 8.7.3 (Kjos-Hanssen and Montalbán, see [327]). If P is a countable Π^0_1 class and ν is continuous, then no member of P is 1-ν-random.

Proof. Since ν is continuous and P is countable, ν(P) = 0, so there is a ν-computable function t such that ν(P_{t(n)}) ≤ 2^{−n}. Thus {P_{t(n)}}_{n∈ω} is a Martin-Löf ν-test covering P.⁴ □

We now make some brief comments for those familiar with concepts and results from higher computability theory. For any computable ordinal α, there is a set X of degree 0^{(α)} that is a member of some countable Π^0_1 class, so there is no bound below Δ^1_1 on the complexity of the collection NCR of never continuously random sets. Reimann and Slaman [327, 329] have further explored this class and, more generally, the classes NCR_n of sets that are never n-ν-random for a continuous ν, obtaining some remarkable results. They showed that NCR ⊂ Δ^1_1 and, using very deep methods such as Borel Determinacy, that every class NCR_n is countable. They also showed that, quite surprisingly, there is a precise sense in which metamathematical methods such as Borel Determinacy are necessary to prove this result. It seems strange that, although randomness-theoretic definitions such as n-randomness have relatively low complexity levels, the metamathematics of proving the countability of the classes NCR_n is so complex. See [327, 328, 329] for more on this subject.
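The Reimann–Slaman example mentioned in footnote 3 is built from the settling times of ∅′. A toy sketch (not from the text; a finite table of enumeration stages is an invented stand-in for ∅′, and the function names are invented):

```python
# Toy sketch (not from the text) of the footnoted example: s_0 = 0 and
# s_{n+1} = max(min{s : the stage-s approximation agrees with the true set
# on bits 0..n}, s_n + 1), with a toy c.e. set in place of the halting problem.

# enter[k] = stage at which k enters the toy c.e. set (absent = never enters)
enter = {0: 3, 1: 1, 3: 7, 4: 2}

def approx(k, s):
    """Stage-s approximation: has k entered by stage s?"""
    return k in enter and enter[k] <= s

def true_bit(k):
    return k in enter

def settle(n):
    """min{s : the approximation agrees with the true set on bits 0..n}."""
    s = 0
    while not all(approx(k, s) == true_bit(k) for k in range(n + 1)):
        s += 1
    return s

def s_seq(length):
    seq = [0]
    for n in range(length - 1):
        seq.append(max(settle(n), seq[-1] + 1))
    return seq
```

The max with s_n + 1 forces the sequence to be strictly increasing even after all relevant bits have settled.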
[Footnote 4: In fact, this proof shows that no member of P is even weakly 1-ν-random.]

8.8 Randomness and PA degrees

In light of earlier results in this chapter, it is natural to ask whether the collection of 1-random degrees is closed upwards. In this section we give a negative answer to this question by examining the relationship between randomness and the PA degrees introduced in Section 2.21, a connection first explored by Kučera [215]. As we saw in Section 2.21, one of the characterizations of the PA degrees (which we can take as a definition) is that a is PA iff each partial computable {0, 1}-valued function has an a-computable total extension. As we saw in Section 2.22, this fact implies that if a is PA then it computes a diagonally noncomputable (DNC) function, that is, a total function g such that g(n) ≠ Φ_n(n) for all n. The same is true of 1-random degrees.

Theorem 8.8.1 (Kučera [215]). Every 1-random set computes a DNC function.

Proof. Let A be 1-random. Let f(n) be the position of A ↾ n in some effective listing of finite binary strings. Since A is 1-random, K(f(n)) =
K(A ↾ n) ± O(1) ≥ n − O(1). On the other hand, if Φ_n(n) ↓, then K(Φ_n(n)) ≤ K(n) + O(1), so there are only finitely many n such that f(n) = Φ_n(n). By altering f at these finitely many places, we obtain an A-computable DNC function. □

This result has the following interesting consequences, the first of which also has a direct proof using a version of the recursion theorem.

Corollary 8.8.2 (Kučera [216]). If a is 1-random and c.e. then a = 0′.

Proof. Apply Arslanov’s Completeness Criterion, Theorem 2.22.3 (together with Theorem 2.22.1). □

Corollary 8.8.3 (Kučera [216]). If A and B are 1-random and A, B ≤_T ∅′, then the degrees of A and B do not form a minimal pair.

Proof. By Theorem 8.8.1, each of A and B computes a DNC function, and hence computes a fixed-point free function (by Theorem 2.22.1). By Theorem 2.22.7, the degrees of A and B do not form a minimal pair. □

Corollary 8.8.3 also follows from Theorem 7.2.11, since if A, B ≤_T ∅′, then {A, B} is a Σ^0_3 class (in fact, a Π^0_2 class). Let A and B be relatively 1-random Δ^0_2 sets. Heuristically, A and B should have very little common information, and thus we might expect their degrees to form a minimal pair. Corollary 8.8.3 shows that this is not the case, however. Still, our intuition is not far off. We will see in Corollary 8.12.4 that if A and B are relatively weakly 2-random, then their degrees form a minimal pair. Furthermore, we will see in Corollary 11.7.3 that even for relatively 1-random A and B, any set computable in both A and B must be quite close to being computable, in a sense that we will discuss in Chapter 11. A further connection between PA degrees and 1-random degrees is that, since there are Π^0_1 classes consisting entirely of 1-random sets, every PA degree computes a 1-random set (by the Scott Basis Theorem 2.21.2). On the other hand, Kučera [215] showed that the class of sets of PA degree has measure 0, and hence there are 1-random degrees that are not PA.
He also showed that there are PA degrees that are not 1-random, and hence that the collection of 1-random degrees is not closed upwards. The exact relationship between PA degrees and 1-random degrees was clarified by the following result of Stephan [379], which also yields Kučera’s aforementioned results as corollaries.

Theorem 8.8.4 (Stephan [379]). If a degree a is both PA and 1-random then a ≥ 0′.

Proof. Let A ≱_T ∅′ have PA degree. We first construct a partial computable function f such that for each e, the class of all sets B such that Φ^B_e is a total extension of f has small measure. Then we use f to show that A is
not 1-random, using the fact that, since A has PA degree, it can compute a total extension of f. We proceed as follows for all e simultaneously. Let I_e = [2^{e+2}, 2^{e+3} − 1]. Within I_e, define f as follows. At stage s, let a_s be the least element of I_e such that f(a_s) has not yet been defined. (We will show below that such a number always exists.) For i = 0, 1, let P_{e,s,i} be the class of all B such that, for all n ∈ I_e,
1. Φ^B_e(n)[s] ↓,
2. if n < a_s then Φ^B_e(n) = f(n), and
3. if n = a_s then Φ^B_e(n) = i.
Let d_{e,s,i} = μ(P_{e,s,i}). By the usual use conventions, whether B ∈ P_{e,s,i} depends only on B ↾ s, so P_{e,s,i} is a Δ^0_1 class. For the same reason, d_{e,s,i} is rational and the function (e, s, i) ↦ d_{e,s,i} is computable. If d_{e,s,0} + d_{e,s,1} > 2^{−(e+1)}, then choose i < 2 such that d_{e,s,i} ≤ d_{e,s,1−i} and let f(a_s) = i. Otherwise, do nothing at this stage. Clearly, f is partial computable. When f(a_s) is defined at stage s, we ensure that Φ^B_e does not extend f for any B ∈ P_{e,s,1−f(a_s)}, and μ(P_{e,s,1−f(a_s)}) ≥ 2^{−(e+2)}. Furthermore, for two such stages s ≠ t, we have P_{e,s,1−f(a_s)} ∩ P_{e,t,1−f(a_t)} = ∅, so we cannot define f on all of I_e. Thus there is a stage t such that d_{e,s,0} + d_{e,s,1} ≤ 2^{−(e+1)} for all s > t. Since A is PA, it can compute a total {0, 1}-valued extension g of f. Let e_0 < e_1 < · · · be such that Φ^A_{e_i} = g for all i. Let k_0, k_1, . . . be a computable enumeration of ∅′ without repetitions. Let e(s) = e_{k_s} and let r(s) be the first stage t > s such that d_{e(s),t,0} + d_{e(s),t,1} ≤ 2^{−(e(s)+1)}. Let S = {P_{e(s),r(s),0} ∪ P_{e(s),r(s),1} : s ∈ N}. Then S is a collection of uniformly Σ^0_1 classes. Furthermore, ∑_s (d_{e(s),r(s),0} + d_{e(s),r(s),1}) ≤ ∑_s 2^{−(e(s)+1)} ≤ 1, so S is a Solovay test. Let h(i) be the least s such that all computations Φ^A_{e_i}(n) for n ∈ I_{e_i} have settled by stage s. Then h ≤_T A.
There must be infinitely many s such that h(k_s) < s, since otherwise, for all but finitely many n, we would have n ∈ ∅′ iff n = k_s for some s ≤ h(n), which would imply that A ≥_T ∅′, contrary to hypothesis. For each s such that h(k_s) < s, we also have h(k_s) < r(s), and hence A ∈ P_{e(s),r(s),0} ∪ P_{e(s),r(s),1}. Thus A is in infinitely many elements of S, and hence is not 1-random. □

Corollary 8.8.5 (Kučera [215]). There are PA degrees that are not 1-random.

Proof. Apply Theorem 8.8.4 to a low PA degree. □

Corollary 8.8.6 (Kučera [215]). The collection of 1-random degrees is not closed upwards.
Proof. As mentioned above, every PA degree computes a 1-random set. Now apply the previous corollary. □

Stephan [379] noted the following improvement of this result.

Corollary 8.8.7 (Stephan [379]). Let a ≱ 0′. Then there is a degree b ≥ a that is not 1-random.

Proof. Let f(n) be the least s such that ∅′_s ↾ n = ∅′ ↾ n. Then a set computes ∅′ iff it computes a function that majorizes f. By the relativized form of the hyperimmune-free basis theorem, there is a PA degree b ≥ a that is hyperimmune-free relative to a. Since a ≱ 0′, there is no a-computable function that majorizes f. Since every b-computable function is majorized by some a-computable function, there is no b-computable function that majorizes f. Thus b ≱ 0′. Since b is a PA degree, it cannot be 1-random. □

As Stephan [379] put it, Theorem 8.8.4 “says that there are two types of Martin-Löf random sets: the first type are the computationally powerful sets which permit to solve the halting problem [∅′]; the second type of random sets are computationally weak in the sense that they are not [PA]. Every set not belonging to one of these two types is not Martin-Löf random.” We have already alluded to this dichotomy between the 1-random sets that compute ∅′ and the ones that do not (in the paragraph following Theorem 6.1.3), and have seen in Section 7.7 that the latter can be characterized as the difference random sets. We will see other results that point to a real qualitative difference in terms of computability-theoretic strength between these two kinds of 1-random sets. As we have already pointed out, we should expect randomness to be antithetical to computational power. Indeed, as we move from 1-randomness to 2-randomness, or even weak 2-randomness, we do lose computational power. However, Stephan’s result exemplifies the fact that we do not need to go so far.
If we ignore the 1-random sets above ∅′ (which, as we pointed out following Theorem 6.1.3, there are reasonable heuristic reasons to do), then phenomena such as the Kučera–Gács Theorem disappear, and we find that 1-random sets are indeed computationally weak in various ways. We conclude this section with a further interesting connection between PA and 1-random degrees.

Theorem 8.8.8 (Barmpalias, Lewis, and Ng [24]). Every PA degree is the join of two 1-random degrees.

Thus, an incomplete PA degree is an example of a non-1-random degree that is the join of two 1-random degrees. Note that Theorem 8.8.8 does not hold for 1-genericity, or even weak 1-genericity, in place of 1-randomness, since there are hyperimmune-free PA degrees, but by Theorem 2.24.12, every weakly 1-generic set has hyperimmune degree.
Before we prove Theorem 8.8.8, we point out a corollary. While the 1-random degrees are not closed upwards, one might wonder about weaker notions of randomness, such as weak 1-randomness. After all, every hyperimmune degree is weakly 1-random. The following corollary was obtained by direct methods, before it was noticed that it could be obtained by an application of Theorems 8.8.4 and 8.8.8, together with Theorem 8.11.11 below, which states that on the hyperimmune-free degrees, 1-randomness and weak 1-randomness are equivalent.

Corollary 8.8.9 (Downey and Hirschfeldt [unpublished]). The weakly 1-random degrees are not closed upwards. Indeed, there is a degree that is not weakly 1-random but is above a 1-random degree.

Proof. Let a be hyperimmune-free and PA. By Theorem 8.8.4, a is not 1-random. By Theorem 8.11.11, a is not weakly 1-random. But by Theorem 8.8.8, a is the join of two 1-random degrees. □

Note that it follows from this result that the degrees corresponding to any notion intermediate between weak 1-randomness and 1-randomness, such as Schnorr or computable randomness, are also not closed upwards. The idea of the proof of Theorem 8.8.8 is the following. Let C have PA degree, and let P be a Π^0_1 class consisting entirely of 1-random sets. For a perfect tree T such that [T] ⊆ P, we will define a method for coding C into the join of two paths A and B of T, thus ensuring that C ≤_T A ⊕ B. The coding will be done in such a way that if C can compute the tree T, then we also have A ⊕ B ≤_T C. We will then build a class of such trees defined by a Π^0_1 formula (that is, there will be a Π^0_1 class whose elements effectively represent the trees in this class). Then we can use the fact that C is of PA degree to obtain a tree in this class that is computable from C (an idea of Solovay that we have met in Corollary 2.21.4). A key ingredient of the proof is the use of indifferent sets for the coding.

Definition 8.8.10 (Figueira, Miller, and Nies [146]).
For a set I, we write to mean that R(i) = R(i) for all i ∈ R =I R / I. is 1-random for all R =I R. A set I is indifferent for a 1-random set R if R Figueira, Miller, and Nies [146] showed that every 1-random set has an infinite indifferent set. Their proof uses a computability-theoretic notion due to Trakhtenbrot [390]. A set A is autoreducible if there is a Turing functional Φ such that ΦA\{n} (n) = A(n) for all n. That is, membership of n in A can be determined from A without querying A about n itself. An example of such a set is a set A coding a complete theory. There, to find out whether the code of a sentence ϕ is in A, we can simply ask whether the code of ¬ϕ is in A. Figueira, Miller, and Nies [146] observed that no 1-random set is autoreducible. Indeed, if A is autoreducible then it is easy to define a partial computable nonmonotonic betting strategy that succeeds on A, so A is not
nonmonotonically random, and hence not 1-random, by Theorem 7.5.5. (It is also not hard to build a c.e. martingale succeeding on A directly.)

For a Π01 class P and A ∈ P, say that I is indifferent for A in P if Â ∈ P for all Â =I A.

Theorem 8.8.11 (Figueira, Miller, and Nies [146]). If P is a Π01 class and A ∈ P is not autoreducible, then there is an infinite I ≤T A′ that is indifferent for A in P. Thus, since every 1-random set is contained in a Π01 class all of whose elements are 1-random, every 1-random set has an infinite indifferent set.

Proof. First we claim that if there is no n such that {n} is indifferent for A in P, then A is autoreducible. Indeed, in this case, given n, we can let B0 and B1 be the two sets that are equal to A except possibly at n, and wait until one of these Bi leaves P, which must happen. Then A(n) = B_{1−i}(n). This argument can easily be extended inductively to show that there is an infinite set I = {n0, n1, ...} such that each finite set {n0, ..., nk} is indifferent for A in P. Since P is closed, I is indifferent for A in P. We can define nk as the least n > n_{k−1} such that {n0, ..., n_{k−1}, n} is indifferent for A in P, and it is easy to see that this n can be found using A′, so we can choose I ≤T A′.

We say that a set is indifferent if it is indifferent for some 1-random set. We will not discuss indifferent sets further, but we mention that Figueira, Miller, and Nies [146] showed that an infinite indifferent set must be quite sparse and must compute ∅′, but that there are co-c.e. indifferent sets. It is an interesting open question whether there is a 1-random set X that computes a set that is indifferent for X. The concept of indifference has also been explored for genericity; see Day and Fitzgerald [91].

Proof of Theorem 8.8.8. We will be working with perfect trees, as in Section 2.17.
That is, in this proof, a tree is a partial function T from 2^{<ω} to 2^{<ω} such that for every σ ∈ 2^{<ω} and i ∈ {0, 1}, if T(σi)↓, then T(σ)↓ and T(σ(1−i))↓, and we have T(σ) ≺ T(σi) and T(σ(1−i)) | T(σi). A tree T is perfect if T(σ)↓ for all σ. A finite tree T of level n is just the restriction of a tree (as a map) to strings of length at most n. We write [T] for the class of all B for which there is an A with T(A ↾ n) ≺ B for all n.

As mentioned above, we want to code a set C of PA degree into 1-random sets A and B. A first idea is to use an infinite indifferent set I = {n0 < n1 < ⋯} for some 1-random set X to do the coding, by letting A be the set defined by A(ni) = C(i) and A(n) = X(n) for n ∉ I, and letting B be the set defined by B(ni) = 1 − C(i) and B(n) = X(n) for n ∉ I. Then A and B are 1-random, and I = {n : A(n) ≠ B(n)}, so I ≤T A ⊕ B, and hence C ≤T A ⊕ B. The problem with this approach is that the class of perfect trees cannot be expressed as a Π01 class in Cantor space, because 2^ω is compact while
the space of perfect trees, with the natural topology generated by the finite trees, is not. Thus, although we can achieve C ≤T A ⊕ B, we cannot achieve Turing equivalence through this approach. To get around this problem, we work in a compact subspace of the space of trees.

Let f be an increasing function. (For convenience we let f(−1) = 0.) For a set A, let σ_A(0), σ_A(1), ... be the unique sequence of strings such that |σ_A(i)| = f(i) − f(i−1) and A = σ_A(0)σ_A(1)⋯. Say that A and B are piecewise f-different from level n if σ_A(i) ≠ σ_B(i) for all i ≥ n. Given A and B that are piecewise f-different from level n, define the tree T^{f,n}_{A,B} as follows:

T^{f,n}_{A,B}(λ) = A ↾ f(n−1),

T^{f,n}_{A,B}(τ0) = T^{f,n}_{A,B}(τ) min{σ_A(n+|τ|), σ_B(n+|τ|)},

T^{f,n}_{A,B}(τ1) = T^{f,n}_{A,B}(τ) max{σ_A(n+|τ|), σ_B(n+|τ|)},

where the min and max are with respect to the lexicographic ordering. Let T^{f,n} = {T^{f,n}_{A,B} : A and B are piecewise f-different from level n}, thought of as a subspace of the space of perfect trees. It is not hard to see that T^{f,n} is compact and f-computably homeomorphic to 2^ω.

Let P be a Π01 class consisting entirely of 1-random sets. If f is computable, then

C^n_f = {A ⊕ B : A and B are piecewise f-different from level n and [T^{f,n}_{A,B}] ⊆ P}
is a Π01 class. Of course, we do not know that it is nonempty, but if it is, then we can finish the proof as follows. If C is of PA degree, then it computes some member A ⊕ B of C^n_f. Without loss of generality, we may assume that A ↾ f(n−1) = B ↾ f(n−1) and that for each i ≥ n the string σ_A(i) is lexicographically smaller than σ_B(i) iff C(i) = 0. Then the sets A and B are members of [T^{f,n}_{A,B}] ⊆ P, and hence are 1-random. Clearly, A ⊕ B ≤T C. Furthermore, C(i) = 0 iff σ_A(i) is lexicographically smaller than σ_B(i), so C ≤T A ⊕ B.

Thus, we are left with showing that there are a computable increasing function f and an n such that C^n_f ≠ ∅. We will need to work with finite approximations to the notion of piecewise difference, so we introduce the following terminology. Fix an increasing function f and define σ_A(i) as above. We say that A ∈ P can be f-switched within [n, m] if there is a B such that σ_A(i) ≠ σ_B(i) for all i ∈ [n, m] and for every sequence X0, X1, ... ∈ {A, B}^ω such that Xi = A for i ∉ [n, m], we have σ_{X0}(0)σ_{X1}(1)⋯ ∈ P. In this case we say that B is an f-switching partner for A within [n, m]. We say that A ∈ P can be f-switched from n if there is a B such that σ_A(i) ≠ σ_B(i) for all i ≥ n and for every sequence X0, X1, ... ∈ {A, B}^ω such that Xi = A for i < n, we have σ_{X0}(0)σ_{X1}(1)⋯ ∈ P. In this case we say that B is an f-switching partner for A from n.
Let C_{n,m}(A) be the class of f-switching partners for A within [n, m], and let C_n(A) be the class of f-switching partners for A from n. Notice that C_n(A) = ⋂_m C_{n,m}(A). Notice also that each C_{n,m}(A) is clopen and each C_n(A) is closed.

Let D_{n,m} be the class of sets that cannot be f-switched within [n, m]. Notice that D_{n,m} ⊆ D_{n,m+1}, and that the classes D_{n,m} are uniformly Σ^{0,f}_1. Let D_n be the class of sets that cannot be f-switched from n. If A ∈ D_n then C_n(A) = ⋂_m C_{n,m}(A) = ∅, so by compactness C_{n,m}(A) = ∅ for some m, and hence A ∈ ⋃_m D_{n,m}. But, clearly, if A ∈ ⋃_m D_{n,m} then A ∈ D_n, so in fact D_n = ⋃_m D_{n,m}. Let D̂_{n,m} = D_{n,m} ∩ P and D̂_n = D_n ∩ P, so that D̂_n = ⋃_m D̂_{n,m}.

To show that there are a computable increasing function f and an n such that C^n_f ≠ ∅, we will show that there is a computable increasing function f such that if X ∈ P is sufficiently random (weakly 2-random suffices), then for some n and some Y such that X and Y are piecewise f-different from level n, we have [T^{f,n}_{X,Y}] ⊆ P. It is enough to define a computable increasing function f such that μ(D̂_n) ≤ O(2^{−n}), because then ⋂_n D̂_n is a Π02 class of measure 0, and hence for every weakly 2-random X ∈ P there is some n such that X ∉ D̂_n, which means that [T^{f,n}_{X,Y}] ⊆ P for some Y such that X and Y are piecewise f-different from level n.

We will show that for any increasing function f, we have

μ(D̂_{n,n}) ≤ 2^{f(n−1)} 2^{−f(n)}

and

μ(D̂_{n,m+1} \ D̂_{n,m}) ≤ 2^{f(m)} 2^{f(m)−f(n−1)} 2^{−f(m+1)}.

Letting f(0) = 1 and f(n+1) = 2f(n) + n + 2, we then have

μ(D̂_{n,n}) ≤ 2^{−n−1}

and

μ(D̂_{n,m+1} \ D̂_{n,m}) ≤ 2^{−n−m−2},

from which it follows that μ(D̂_n) ≤ O(2^{−n}).

We first show that μ(D̂_{n,n}) ≤ 2^{f(n−1)} 2^{−f(n)}. Fix σ of length f(n−1), and for each τ of length f(n) − f(n−1), let M_{στ}(n, n) be the set of all B such that στB ∈ D̂_{n,n}. By the definition of D̂_{n,n}, we have M_{στ}(n, n) ∩ M_{σρ}(n, n) = ∅ for all τ ≠ ρ of length f(n) − f(n−1). Hence Σ_{τ ∈ 2^{f(n)−f(n−1)}} μ(M_{στ}(n, n)) ≤ 1, and so μ(D̂_{n,n} ∩ σ) ≤ 2^{−f(n)}. Since there are 2^{f(n−1)} many such σ, we have that μ(D̂_{n,n}) ≤ 2^{f(n−1)} 2^{−f(n)}.

Finally, we show that μ(D̂_{n,m+1} \ D̂_{n,m}) ≤ 2^{f(m)} 2^{f(m)−f(n−1)} 2^{−f(m+1)}. Say that a string τ of length f(m) − f(n−1) is a switching string for A ∈ P within [n, m] if for some (or equivalently, for all) η of length f(n−1) and all B, the set ητB is an f-switching partner for A within [n, m]. For any string τ of length f(m) − f(n−1), let L_τ be the set of elements of P for which τ is a switching string within [n, m]. Since every A ∈ P \ D̂_{n,m} belongs to
some L_τ, we have

D̂_{n,m+1} \ D̂_{n,m} = ⋃ {D̂_{n,m+1} ∩ L_τ ∩ σ : σ ∈ 2^{f(m)} ∧ τ ∈ 2^{f(m)−f(n−1)}}.

Fix strings σ, τ of lengths as above, and for each string ρ of length f(m+1) − f(m), let M_{σρ,τ}(n, m+1) be the set of all B such that σρB ∈ D̂_{n,m+1} ∩ L_τ. As before, we have that M_{σρ,τ}(n, m+1) ∩ M_{σρ′,τ}(n, m+1) = ∅ for any ρ ≠ ρ′ of length f(m+1) − f(m). Hence

Σ_{ρ ∈ 2^{f(m+1)−f(m)}} μ(M_{σρ,τ}(n, m+1)) ≤ 1,

and so μ(D̂_{n,m+1} ∩ L_τ ∩ σ) ≤ 2^{−f(m+1)}. Thus μ(D̂_{n,m+1} \ D̂_{n,m}) ≤ 2^{f(m)} 2^{f(m)−f(n−1)} 2^{−f(m+1)}.
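The final step of the proof rests on elementary arithmetic about the recursion f(0) = 1, f(n+1) = 2f(n) + n + 2. The following sketch (an illustrative check, not part of the proof) verifies, for small indices, the two exponent inequalities that turn the measure estimates into the bounds 2^{−n−1} and 2^{−n−m−2}:

```python
def f(n):
    # the recursion from the proof: f(-1) = 0, f(0) = 1, f(n+1) = 2 f(n) + n + 2
    if n < 0:
        return 0
    v = 1
    for i in range(n):
        v = 2 * v + i + 2
    return v

for n in range(12):
    # 2^{f(n-1)} * 2^{-f(n)} <= 2^{-n-1} amounts to f(n) - f(n-1) >= n + 1
    assert f(n) - f(n - 1) >= n + 1
    for m in range(n, 12):
        # 2^{2 f(m) - f(n-1) - f(m+1)} <= 2^{-n-m-2} amounts to
        # f(m+1) - 2 f(m) + f(n-1) >= n + m + 2
        assert f(m + 1) - 2 * f(m) + f(n - 1) >= n + m + 2
```

Since f(m+1) − 2f(m) = m + 2, the second inequality reduces to f(n−1) ≥ n, which holds by induction.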
8.9 Mass problems

A mass problem is a subset of ω^ω. We view the elements of such a class P as solutions to some problem, which we identify with P. (For instance, if P = {f : ∀n (∅′_{f(n)} ↾ n = ∅′ ↾ n)}, then we identify P with the "problem" of majorizing the settling time of ∅′.) There are two natural notions of reducing one problem to another in this context.

Definition 8.9.1 (Muchnik [286], Medvedev [263]). Let P and Q be mass problems.

(i) P is weakly reducible (or Muchnik reducible) to Q, denoted by P ≤w Q, if for each f ∈ Q, there is a g ≤T f in P.

(ii) P is strongly reducible (or Medvedev reducible) to Q, denoted by P ≤s Q, if there is a functional Ψ such that Ψ^f ∈ P for every f ∈ Q.

In other words, P is weakly reducible to Q if from each solution to Q we can effectively obtain a solution to P, and P is strongly reducible to Q if there is a single effective procedure for converting solutions to Q into solutions to P. Note that if X, Y ∈ 2^ω then {X} ≤w {Y} iff {X} ≤s {Y} iff X ≤T Y, so these notions can be seen as extensions of Turing reducibility. It is interesting to note also that if μ(P) > 0, where P ⊆ 2^ω, then P is incomparable (in terms of both weak and strong reducibility) with any of these singleton classes except the ones of the form {X} with X computable, since the Turing lower cone of a set has measure 0 because it is countable, while the Turing upper cone of a noncomputable set has measure 0 by Corollary 8.12.2 below.

We can take equivalence classes of mass problems under these reducibilities to arrive at the notions of weak and strong degrees of mass problems. The weak and strong degrees both form lattices, with bottom element the
degree of ω^ω (which consists exactly of those mass problems that have computable elements), and top element the degree of ∅. The join and meet of the degrees of mass problems P and Q (in both the strong and weak degrees) are the degrees of {f ⊕ g : f ∈ P ∧ g ∈ Q} and {0f : f ∈ P} ∪ {1g : g ∈ Q}, respectively. (In the weak degrees, the latter is the same as the weak degree of P ∪ Q.)

Particular attention has been paid to the weak and strong degrees of Π01 classes (in 2^ω). These form sublattices of the corresponding global structures, with the top degree being that of the Π01 class of (codes of) completions of PA. (That this class is complete for weak reducibility follows from the Scott Basis Theorem 2.21.2; for the strong reducibility case see [359].)

Let R be the mass problem consisting of all 1-random sets. While R is not itself a Π01 class, it is weakly equivalent to any nonempty Π01 class P all of whose elements are 1-random, such as the complement of one of the levels of a universal Martin-Löf test. Clearly R ≤w P, since P ⊂ R, and P ≤w R by Theorem 6.10.1. Indeed, this theorem shows that Q ≤w R for any Π01 class Q of positive measure, so the weak degree of R can be characterized as the largest weak degree of Π01 classes of positive measure, as noted by Simpson [359]. Simpson and Slaman [unpublished] and Terwijn [388] have shown that there is no largest, or even maximal, strong degree of Π01 classes of positive measure.

Let P_w be the structure of weak degrees of Π01 classes, with 0 and 1 its bottom and top degrees, respectively. We establish some further properties of the weak degree r1 of R, beginning with a couple of lemmas.

Lemma 8.9.2 (Simpson [359]). Let P, Q ⊆ 2^ω be nonempty Π01 classes such that μ(P) > 0 and deg_w(Q) < 1. Then deg_w(P) ∨ deg_w(Q) < 1.

Proof. Assume for a contradiction that deg_w(P) ∨ deg_w(Q) = 1. Let S be the Π01 class of separating sets for effectively inseparable c.e. sets A and B, so that S ∈ 1. Let g ∈ Q.
Then for every f ∈ P, there is an element of S computable in f ⊕ g. Since there are only countably many Turing reductions, there are a functional Ψ and a class P′ ⊆ P such that μ(P′) > 0 and Ψ^{f⊕g} ∈ S for all f ∈ P′. By Lebesgue density, there is a σ such that μ(P′ ∩ σ) > (3/4) 2^{−|σ|}. Given n, we can g-computably find an i < 2 such that Ψ^{f⊕g}(n) = i for a class of f ∈ σ of measure at least (3/8) 2^{−|σ|}, and let h(n) = i. If n ∈ A then h(n) = 1, since otherwise there would be an f ∈ P′ ∩ σ such that Ψ^{f⊕g} ∉ S. Similarly, if n ∈ B then h(n) = 0. Thus h is a g-computable element of S. Since g ∈ Q was arbitrary, we have S ≤w Q, contrary to hypothesis.

Lemma 8.9.3 (Simpson [359]). Let P, Q ⊆ 2^ω be nonempty Π01 classes such that P ≤w Q. Then there is a nonempty Π01 class Q̂ ⊆ Q such that P ≤s Q̂.
Proof. By the hyperimmune-free basis theorem, there is a hyperimmune-free Y ∈ Q. Let X ∈ P be such that X ≤T Y. By Theorem 2.17.7, there is a total functional Ψ such that Ψ^Y = X. Let Q̂ = {Z ∈ Q : Ψ^Z ∈ P}. It is easy to see that this class has the desired properties.

In particular, if Q is weakly reducible to R then it is strongly reducible to some nonempty Π01 class of 1-random sets.

Theorem 8.9.4 (Simpson [359]).

(i) 0 < r1 < 1.

(ii) If q ∈ P_w is such that q < 1, then r1 ∨ q < 1. (That is, r1 does not join to 1.)

(iii) If q0, q1 ∈ P_w and q0 ∧ q1 ≤ r1, then q0 ≤ r1 or q1 ≤ r1. (That is, r1 is meet-irreducible.)

Proof. The class R has no computable elements, and it has an element that does not compute a completion of PA, by Kučera's result mentioned above Theorem 8.8.4, so we have (i). Part (ii) follows from Lemma 8.9.2. For part (iii), suppose that Q0 ∪ Q1 ≤w R, where Q0, Q1 ⊆ 2^ω are Π01 classes. Let P be a nonempty Π01 subclass of R. Then Q0 ∪ Q1 ≤w P, so by Lemma 8.9.3, there is a nonempty Π01 class P̂ ⊆ P such that Q0 ∪ Q1 ≤s P̂. Thus there is a total functional Ψ such that Ψ^f ∈ Q0 ∪ Q1 for all f ∈ P̂. Let P̂^i = {f ∈ P̂ : Ψ^f ∈ Qi}. Then there is an i < 2 such that P̂^i is nonempty. Since P̂^i is then a nonempty Π01 class of 1-random sets, it is, as mentioned above, weakly equivalent to R, so Qi ≤s P̂^i ≡w R, and hence Qi ≤w R.

That R is weakly equivalent to some Π01 class also follows from the following general result, known as the Embedding Lemma. (See [360] for a proof.)

Theorem 8.9.5 (Simpson [360]). Let P ⊆ 2^ω be a nonempty Π01 class and let S ⊆ ω^ω be a Σ03 class. Then there is a nonempty Π01 class Q ⊆ 2^ω such that Q ≡w P ∪ S.

Thus, in particular, if S ⊆ 2^ω is a Σ03 class with a nonempty Π01 subclass, then it is weakly equivalent to a Π01 class.
If S is a Σ03 class with no nonempty Π01 subclass, for example the class of 2-random sets, then the natural Π01 class to take as P in the above theorem is the class of completions of PA, since in that case the weak degree of S ∪ P is the largest degree in P_w below that of S. Simpson [360] characterized the weak degree r∗2 of the mass problem of sets that are either 2-random or completions of PA as follows.

Theorem 8.9.6 (Simpson [360]). The degree r∗2 is the largest weak degree of a Π01 subset of 2^ω whose upward closure in the Turing degrees has positive measure.
Proof. Let P be a Π01 class in r∗2. If Q ≤w P then every 2-random set computes an element of Q, so the upward closure of Q in the Turing degrees has measure 1. Conversely, suppose that the upward closure Q̂ in the Turing degrees of the Π01 class Q ⊆ 2^ω has positive measure. It is straightforward to show that Q̂ is a Σ03 class, so it has a Π02 subclass of positive measure. By Theorem 6.10.2, this subclass contains translations of every 2-random set, so every 2-random set computes some element of Q. Since Q is a Π01 class, every completion of PA also computes some element of Q. Thus Q ≤w P.

Thus r1 ≤ r∗2. In fact, as shown by Simpson [360], we have r1 < r∗2 < 1. These inequalities can be shown in several ways, for example as consequences of Theorem 8.8.4, which implies that an incomplete Δ02 1-random set cannot compute a completion of PA (and obviously also cannot compute a 2-random set), while no 2-random set can compute a completion of PA.

In this way, many of the classes of sets and functions we discuss in this book, such as the class of DNC functions, can be studied in the context of weak degrees of Π01 classes. Indeed, the weak degree d of the class of DNC functions is in P_w, since every PA degree computes a DNC function. Results such as Theorem 8.8.1, that every 1-random set computes a DNC function, can be reinterpreted as statements about degrees of mass problems (in most cases, the weak degrees being the most applicable). For instance, as noted by Simpson [360], Theorem 8.8.1 implies that d ≤ r1, and this inequality is in fact strict, by Theorem 8.10.3 below. For more on mass problems, see Sorbi [373], Simpson [359, 360, 361], and Cole and Simpson [72].
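Looking back at the proof of Lemma 8.9.2, its key counting step is a majority-vote argument: if more than 3/4 of a cylinder computes the true bit, then any bit attained on at least 3/8 of the cylinder must be the true one, since 3/8 + 3/4 > 1. The following finite simulation is purely illustrative (the 32-point "cylinder", the set of 25 "good" points on which the reduction is correct, and the random values elsewhere are all hypothetical stand-ins):

```python
import random

rng = random.Random(1)
cylinder = list(range(32))            # stand-in for the f in the cylinder above sigma
true_bit = 1
good = set(rng.sample(cylinder, 25))  # 25/32 > 3/4 of the points compute true_bit

def Psi(f):
    # stand-in for the reduction: correct on good points, arbitrary elsewhere
    return true_bit if f in good else rng.randint(0, 1)

votes = [Psi(f) for f in cylinder]
# the first bit reaching the 3/8 threshold; the wrong bit occurs at most
# 7 < 12 = (3/8) * 32 times, so it can never reach the threshold
h = next(i for i in (0, 1) if votes.count(i) >= (3 / 8) * len(cylinder))
assert h == true_bit
```

In the actual proof the threshold is measured g-computably over the cylinder above σ, but the arithmetic is the same.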
8.10 DNC degrees and subsets of random sets

In this section, we prove a strong extension of Kučera's Theorem 8.8.1 that every 1-random set computes a DNC function, by characterizing the DNC degrees in terms of 1-random sets. The following lemma and variations on it can be quite useful. Let P0, P1, ... be an effective listing of all Π01 classes.

Lemma 8.10.1 (Kučera [215]). If P is a Π01 class of positive measure, then there are a nonempty Π01 class Q ⊆ P and a c such that for all e,

Q ∩ Pe ≠ ∅ ⇒ μ(Q ∩ Pe) ≥ 2^{−e−c}.

Proof. Let c be such that 2^{1−c} < μ(P). Let Q be the Π01 subclass of P obtained by removing Pe[s] whenever μ(Pe[s] ∩ Q[s]) < 2^{−e−c}. Then μ(Q) ≥ μ(P) − Σ_e 2^{−e−c} = μ(P) − 2^{1−c} > 0, so Q is nonempty.
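The measure bookkeeping in this proof is a geometric-series estimate. The small sketch below (with a hypothetical value for μ(P)) checks that taking the least c with 2^{1−c} < μ(P) leaves the pruned class Q with positive measure:

```python
def least_c(mu_P):
    # smallest c with 2^{1-c} < mu(P); the pruning then removes at most
    # sum_e 2^{-e-c} = 2^{1-c} of measure in total
    c = 0
    while 2.0 ** (1 - c) >= mu_P:
        c += 1
    return c

mu_P = 0.3                                          # hypothetical measure of P
c = least_c(mu_P)                                   # = 3 here
removed = sum(2.0 ** (-e - c) for e in range(100))  # ~ 2^{1-c} = 0.25
assert removed < mu_P  # so mu(Q) >= mu(P) - 2^{1-c} > 0
```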
Theorem 8.10.2 (Kjos-Hanssen [204], Greenberg and Miller [171]). The following are equivalent.

(i) A computes a DNC function.

(ii) A computes an infinite subset of a 1-random set.

(iii) There is a Π^{0,A}_1 class of measure 0 containing a 1-random set. (In other words, there is a 1-random set that is not weakly 1-random relative to A.)

The precise history of Theorem 8.10.2 is the following. Kjos-Hanssen [204] proved that (iii) implies (i), by essentially the proof given here. He also proved (ii), and hence (iii), from a strong form of (i): that A computes a sufficiently slow-growing DNC function. Independently, Greenberg and Miller [171] proved that (i) implies (iii), implicitly proving (ii) from (i) at the same time. The short elementary proof that (i) implies (ii) given here is due to Miller [personal communication to Downey]. Kjos-Hanssen's proof relies on work in probability theory and, like the proof of Greenberg and Miller, is somewhat involved.

Proof. (i) ⇒ (ii). Let f ≤T A be a DNC function. Let P be a Π01 class containing only 1-random sets and let Q and c be given by Lemma 8.10.1. For any set S, let Q_S = {X ∈ Q : S ⊆ X}. We construct an A-computable set S = {n0 < n1 < n2 < ⋯} such that Q_S is nonempty. Assume that we have defined S_i = {n_j}_{j<i} with Q_{S_i} nonempty, and let B = {b_0 < b_1 < ⋯} be the set of all n > n_{i−1} such that Q_{S_i ∪ {n}} is empty. Every X ∈ Q_{S_i} must then satisfy X(n) = 0 for all n ∈ B, so μ(Q_{S_i}) ≤ 2^{−|B|}; on the other hand, Lemma 8.10.1 gives a positive lower bound on μ(Q_{S_i}) that is computable from i, so we can compute a bound on |B|. Moreover, B is c.e., uniformly in i, so there are uniformly partial computable functions γ_n such that γ_{b_j}(j) is defined for each j < |B|, and, since f is DNC, we can use f to compute an n_i > n_{i−1} such that γ_{n_i}(j) ≠ γ_{b_j}(j) for each j < |B|. Since for each j < |B| we have ensured that γ_{n_i}(j) ≠ γ_{b_j}(j), we get n_i ∉ B. Therefore, Q_{S_i ∪ {n_i}} is nonempty.

By construction, S ≤T f ≤T A. Since Q_S = ⋂_i Q_{S_i} is a nested intersection of nonempty compact sets, it is nonempty. Therefore, S is a subset of some 1-random set.

(ii) ⇒ (iii). Let X be 1-random and let S ⊆ X be infinite and A-computable. Then the Π^{0,A}_1 class of all sets that contain S has measure 0 and contains X.

(iii) ⇒ (i). Let U0, U1, ... be an effective listing of all clopen subsets of 2^ω. Let P be a Π^{0,A}_1 class of measure 0 containing a 1-random set. There is an A-computable function f such that P ⊆ U_{f(e)} and μ(U_{f(e)}) ≤ 2^{−e} for all e. For each n, let V_n be the union of the clopen sets U_{Φ_e(e)} such that e > n and μ(U_{Φ_e(e)}) ≤ 2^{−e}. Then {V_n}_{n∈ω} is a Martin-Löf test. If f(e) = Φ_e(e) infinitely often, then P ⊆ ⋂_n V_n. Since P contains a 1-random set, this is not the case. Hence f(e) = Φ_e(e) only finitely often, so A computes a DNC function.

In the proof that (i) implies (ii), it is easy to code a set A into S by adding A(i) into the sequence used to define n_i. Thus, the degrees of infinite subsets of 1-random sets are closed upwards, and hence actually coincide with the degrees of DNC functions.

The connection between DNC degrees and 1-randomness can be clarified as follows.

Theorem 8.10.3 (Ambos-Spies, Kjos-Hanssen, Lempp, and Slaman [8], Kumabe and Lewis [225]). There is a DNC degree that does not bound any 1-random degree.

Kumabe and Lewis [225] showed that there is a minimal DNC degree, which, by Corollary 6.9.5, cannot be 1-random. (The proof in their paper had existed in manuscript form for several years prior to publication.) Ambos-Spies, Kjos-Hanssen, Lempp, and Slaman [8] showed that there is an ideal in the Turing degrees containing a DNC degree but no 1-random degree. Indeed, they built an ideal I in the Turing degrees such that for every X ∈ I, there is a G ∈ I that is DNC relative to X, and such that there is no 1-random degree in I, and used this result to derive a separation between two principles in the context of reverse mathematics. By Theorems 8.10.2 and 8.10.3, there is a subset of a 1-random set that does not compute any 1-random set.
8.11 High degrees and separating notions of randomness

8.11.1 High degrees, Schnorr randomness, and computable randomness

We have seen in Corollary 8.8.2 that all 1-random sets of c.e. degree (and in particular all 1-random left-c.e. reals) are of Turing degree 0′. This result fails for computably random (and hence for Schnorr random) sets.

Theorem 8.11.1 (Nies, Stephan, and Terwijn [308]). Every high c.e. degree is computably random.

We will prove a stronger version of this result in Theorem 8.11.6 below. An earlier proof for Schnorr random sets was given by Downey, Griffiths, and LaForte [110], effectivizing arguments of Wang [405, 406]. Conversely, we have the following result.

Theorem 8.11.2 (Nies, Stephan, and Terwijn [308]). If a nonhigh set is Schnorr random then it is 1-random.
Proof. Suppose that A is nonhigh and not 1-random. Let {U_n}_{n∈ω} be a universal Martin-Löf test. Let f(n) be the least s such that A ∈ U_n[s]. Then f ≤T A, so by Theorem 2.23.7, there is a computable function g such that g(n) ≥ f(n) for infinitely many n. Let V_n = U_n[g(n)]. Then {V_n}_{n∈ω} is a total Solovay test and A ∈ V_n for infinitely many n, so A is not Schnorr random by Theorem 7.1.10.

Corollary 8.11.3 (Nies, Stephan, and Terwijn [308]). Every Schnorr random set of c.e. degree is high.

Corollary 8.11.4 (Downey and Griffiths [109]). Every Schnorr random left-c.e. real is high.

The following is a direct proof of this corollary.

Proof of Corollary 8.11.4. Let α be a Schnorr random left-c.e. real. Let T = {⟨i, j⟩ : |W_i| > j}. Note that T is c.e. We build a functional Γ such that for each i, we have Γ^α(⟨i, k⟩) = T(⟨i, k⟩) for almost all k. Then it is easy to see that α can decide which W_i are infinite, and hence α is high. For each i, we also build a collection Q_i of open intervals with rational endpoints forming a total Solovay test.

At each stage s, proceed as follows. For each ⟨i, k⟩ < s such that Γ^{α_s}(⟨i, k⟩) is currently undefined, let Γ^{α_s}(⟨i, k⟩) = T_s(⟨i, k⟩) with use k. For each ⟨i, k⟩ entering T at stage s, add (α_s, α_s + 2^{−k}) to Q_i.

Each Q_i is either finite or contains an interval of length 2^{−k} for each k. In either case, Q_i is a total Solovay test. Since α can be in only finitely many intervals in Q_i, for almost all k, if ⟨i, k⟩ enters T at stage s, then α ↾ k = α_s ↾ k, whence Γ^α(⟨i, k⟩) = T(⟨i, k⟩).
8.11.2 Separating notions of randomness

In Section 7.1.1, we mentioned that 1-randomness, computable randomness, and Schnorr randomness are all distinct, even for left-c.e. reals. (These are results of Schnorr [348] and Wang [405, 406] for the noneffective case and Downey, Griffiths, and LaForte [110] for the left-c.e. case.) Nies, Stephan, and Terwijn [308] established a definitive result on separating these notions of randomness, by showing that the degrees within which they can be separated are exactly the high degrees. The easy direction of this result is Theorem 8.11.2. Remarkably, the converse to this theorem also holds. To prove this fact, we will need the following extension of the Space Lemma 8.3.1.

Lemma 8.11.5. Given a rational δ > 1 and an integer k > 0, we can compute a length l(δ, k) such that, for any martingale d and any σ,

|{τ ∈ 2^{l(δ,k)} : ∀ρ ⪯ τ (d(σρ) ≤ δd(σ))}| ≥ k.
Proof. As in the Space Lemma, let l(δ, k) = ⌈log(k/(1 − δ^{−1}))⌉, which is well-defined since δ > 1. Let S = {τ ∈ 2^{l(δ,k)} : ∃ρ ⪯ τ (d(σρ) > δd(σ))}. For each τ ∈ S, let ρ_τ be the shortest initial segment of τ such that d(σρ_τ) > δd(σ). The set of distinct such strings ρ_τ is prefix-free, so by Kolmogorov's Inequality (Theorem 6.3.3), Σ_ρ 2^{−|ρ|} d(σρ) ≤ d(σ), where the sum is over the distinct ρ_τ. Since d(σρ_τ) > δd(σ) for each such string, it follows that μ(S) = Σ_ρ 2^{−|ρ|} ≤ δ^{−1}. So

|2^{l(δ,k)} \ S| ≥ 2^{l(δ,k)} (1 − δ^{−1}) ≥ (k/(1 − δ^{−1})) (1 − δ^{−1}) = k.
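Lemma 8.11.5 can be sanity-checked by brute force on small randomly generated martingales. In the sketch below, `space_len` computes l(δ, k) = ⌈log(k/(1 − δ^{−1}))⌉, and we count the length-l(δ, k) strings along which a random martingale never exceeds δ times its starting capital; the martingale generator is our own illustrative construction, not part of the proof:

```python
import itertools
import math
import random

def space_len(delta, k):
    # l(delta, k) = ceil(log2(k / (1 - 1/delta))), as in the proof
    return math.ceil(math.log2(k / (1 - 1 / delta)))

def random_martingale(depth, seed):
    # a random martingale on strings of length <= depth:
    # the fairness condition d(t0) + d(t1) = 2 d(t) holds by construction
    rng = random.Random(seed)
    d = {"": 1.0}
    for n in range(depth):
        for bits in itertools.product("01", repeat=n):
            t = "".join(bits)
            eps = rng.uniform(-1.0, 1.0)
            d[t + "0"] = d[t] * (1 + eps)
            d[t + "1"] = d[t] * (1 - eps)
    return d

delta, k = 1.5, 4
L = space_len(delta, k)  # = 4 for these parameters
for seed in range(20):
    d = random_martingale(L, seed)
    strings = ("".join(b) for b in itertools.product("01", repeat=L))
    tame = [t for t in strings
            if all(d[t[:i]] <= delta * d[""] for i in range(1, L + 1))]
    assert len(tame) >= k  # the lemma guarantees at least k such strings
```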
Theorem 8.11.6 (Nies, Stephan, and Terwijn [308]). The following are equivalent.

(i) A is high.

(ii) There is a B ≡T A that is computably random but not 1-random.

(iii) There is a C ≡T A that is Schnorr random but not computably random.

Furthermore, the same equivalences hold if A, B, and C are restricted to left-c.e. reals. (As we saw in Corollary 8.11.4, all Schnorr random left-c.e. reals are high, so this theorem says that, among left-c.e. reals, the three notions of randomness can be separated in any degree in which they occur.)

Proof. (iii) ⇒ (i) and (ii) ⇒ (i) follow by Theorem 8.11.2.

(i) ⇒ (ii). Let M0, M1, ... be an effective listing of all partial computable martingales such that M_i(λ) = 1 if defined. We will first define a partial computable martingale M. There will be a perfect class P whose elements represent guesses as to which M_i converge on certain inputs. Along any such element, M will either be undefined from some point on, or its values will be bounded. Furthermore, for any B ∈ P whose guesses are correct, M will multiplicatively dominate all computable martingales on B; that is, for each i there will be an ε > 0 such that M(B ↾ n) ≥ εM_i(B ↾ n) for all n. Thus, for any such B, no computable martingale will succeed on B. Then, given A, we will build such a B in two steps. First, we will define a set F ≡T A containing information about A and partial information about the behavior of computable martingales. Second, we will use M and the information in F to build a B ≡wtt F as above. We will also ensure that there is an infinite computable set of numbers n such that B(n) is computable from B ↾ n, from which it is easy to argue that B is not 1-random (or even partial computably random).
By Lemma 8.11.5, there is a computable function f such that if σ ∈ 2^n and M is a partial martingale defined on all τ ⪰ σ with τ ∈ 2^{f(n)}, then there are at least two strings τ^M_{σ,0} <_{lex} τ^M_{σ,1} in 2^{f(n)} extending σ such that M(ρ) ≤ (1 + 2^{−n})M(σ) for all ρ with σ ⪯ ρ ⪯ τ^M_{σ,i}. Using f, define levels z_0 < z_1 < ⋯ by z_0 = 0 and z_{k+1} = f(z_k + 1). The martingale M(τ) is defined as the sum of 2^{−2z_{⟨i,0⟩+1}−1} M_i(τ) over those M_i currently included in the sum, according to guess bits a_{i,j} read off τ, plus a correction term r_τ,

where r_τ is defined recursively so that for all τ′ ≺ τ, we have M(τ′0) + M(τ′1) = 2M(τ′) and r_{τ′0} = r_{τ′1}. The r_τ are necessary because at every level z_k, some M_i might be dropped from the sum, and (at most) one new M_i might be added. This new martingale is added if k = 1 + ⟨i, 0⟩ and a_{i,0} = 1. In this case, it is added with the factor 2^{−2z_k−1}, which guarantees that 2^{−2z_k−1} M_i(η) is at most 2^{−z_k−1} for all η ∈ 2^{z_k}. (Note that we have M_i(η) ≤ 2^{z_k} since M_i(λ) = 1.) Thus this addition increases the sum by at most 2^{−z_k−1}, and therefore can be compensated for by r_τ. At every level, r_τ ≤ 2^{−|τ|}, and at most 2^{−|τ|−1} of this capital is lost to maintain the martingale property of M.

We now define F ≡T A. The idea is to have F(⟨i, j⟩) for i > 0 code whether M_i is defined on all strings of length up to z_{⟨i,j⟩+1} + 1. Since A is high, there is an A-computable g that dominates all computable functions. Let F be defined as follows.

1. Let F(⟨0, 0⟩) = 0.

2. Let F(⟨0, j+1⟩) = A(j) for all j.

3. For i > 0, let F(⟨i, j⟩) = 1 if F(⟨i, j′⟩) = 1 for all j′ < j and M_i(τ) is computed within g(i + j) many steps for all τ ∈ 2^{<ω} with |τ| ≤ z_{⟨i+j+1,i+j+1⟩} + 1. Otherwise let F(⟨i, j⟩) = 0.

Clearly F ≡T A. Finally, we define B by recursion. The idea is to code F(k) into B ↾ z_{k+1} while ensuring that B(z_k) is computable from B ↾ z_k. Suppose that σ =
B ↾ z_k has been defined. If M(σ0) ≤ M(σ1) then let B(z_k) = 0. Otherwise, let B(z_k) = 1. Let η = B ↾ (z_k + 1) and let B ↾ z_{k+1} = τ^M_{η,F(k)}. Of course, this definition depends on certain values of M being defined, so we need to check that this is indeed the case. Note that the a_{i,j} in the construction of M always exist for η ≺ B, and are just the bits F(⟨i, j⟩). Furthermore, for all i ∈ E with i > 0, we have ⟨i, j′⟩ ∈ F for all j′ ≤ j, where j is maximal such that ⟨i, j⟩ < k, so M_i is defined on all strings of length up to z_{⟨i+j+1,i+j+1⟩} + 1 ≥ z_{k+1}. Thus the computations in the definition of M all terminate, and hence M is defined on all extensions of B ↾ z_k of length up to z_{k+1}. It follows that B is defined up to z_{k+1} and F(k) is coded into B. Note that the coding ensures that F ≤wtt B. Furthermore, for each k we can compute B ↾ z_k using F ↾ z_k, so in fact B ≡wtt F. Since A ≡T F, we have B ≡T A.

To see that B is not 1-random, it suffices to observe that B(z_k) can be computed from B ↾ z_k. Then we can easily define a partial computable martingale (and hence a c.e. martingale, as noted in Section 7.4.2) that succeeds on B.

To see that B is computably random, note first that the values of M on B are bounded, since M's capital while betting on B is not increased at z_k, while between z_k + 1 and z_{k+1} the gain in this capital is by a factor of at most 1 + 2^{−z_k}, by the choice of the z_k, and Π_k (1 + 2^{−z_k}) ≤ Π_k (1 + 2^{−k}) < ∞. Now let d be a computable martingale. Since g dominates every computable function, and every computable martingale appears infinitely often in the list M0, M1, ..., there is an i such that M_i = d and, for all j, the value g(i + j) is greater than the number of steps required to compute M_i(τ) for all τ with |τ| ≤ z_{⟨i+j+1,i+j+1⟩} + 1. It follows that, for all η ≺ B, we have d(η) ≤ 2^{2z_{⟨i,0⟩+1}+1} M(η), so d does not succeed on B.

(i) ⇒ (ii), left-c.e. case. If A is a c.e. set then we can choose the function g in the above proof to be approximable from below.
This choice makes F c.e., which makes B almost c.e. (as defined in Section 5.1): B(z_k) is computed from B ↾ z_k, and, letting η = B ↾ (z_k + 1), we may assume that B ↾ z_{k+1} = τ^M_{η,0} unless k enters F, in which case we know that B ↾ z_{k+1} = τ^M_{η,1} (and recall that τ^M_{η,0} <_{lex} τ^M_{η,1}).

(i) ⇒ (iii). Fix a partial computable function ψ such that ψ(e, x) records (a suitably padded version of) the number of steps needed for Φ_e(x) to converge, arranged so that ψ is injective and ψ(e, x) > x when defined. Also, for any n, we can effectively check whether n ∈ rng ψ, and if so, find the unique e and x such that ψ(e, x) = n.
354
8. Algorithmic Randomness and Turing Reducibility
Define a function p by recursion as follows. If there is an x < n and e ≤ log p(x) such that ψ(e, x) = n, then let p(n) = p(x) + 1. Otherwise, let p(n) = n + 4. The function p is computable, unbounded, and attains each value only finitely often. Assume without loss of generality that Φ_0 is total, that g is nondecreasing, and that g(x) ≥ ψ(0, x) for all x, where g is as above. Let
h(x) = max{ψ(e, x) : ψ(e, x) ↓ ≤ g(x) ∧ e < log p(x) − 1}.
Note that h ≤T A and h is nondecreasing. Furthermore, for any computable function f, there is an e such that Φ_e(x) = 2^{2^{f(x)}}. By the usual conventions on halting times, ψ(e, x) > 2^{2^{f(x)}} for all x. Thus, since g dominates all computable functions, h(log log x) > f(x) for almost all x.
Let B and the z_k be as above. If z_k = h(x) for some x < z_k, then let C(z_k) = 0. For all other numbers, let C(n) = B(n). Since h ≤T A, we have C ≡T A.
To show that C is Schnorr random, assume for a contradiction that there are an i and a computable f such that M_i(C ↾ f(m)) > m for infinitely many m. As with B, when betting along C, the martingale M increases its capital by less than a factor of 1 + 2^{-k} between z_k and z_{k+1}. At z_k, it cannot increase its capital if z_k ∉ rng h. If z_k ∈ rng h then M can double its capital at z_k, but h(log log m) > f(m) for almost all m. Thus M(C ↾ f(m)) is O(log m), whence M_i(C ↾ f(m)) is O(log m), which is a contradiction.
Finally, we describe a computable martingale d witnessing the fact that C is not computably random. Let G_n = {ψ(e, n) : ψ(e, n) ↓ ∧ e < log p(n) − 1}. These sets are pairwise disjoint, and G_n contains at least one z_k such that C(z_k) = 0 (since by assumption Φ_0 is total and g(x) ≥ ψ(0, x) for all x). We have numbers n and m, initially set to 0. We bet on 0 at strings of lengths in G_n and increase m every time our bet is wrong. Note that |G_n| ≤ log p(n) − 1, so along C, we cannot be wrong log p(n) many times. Once we bet correctly, we move to a new n and reset m to 0.
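The gain per correct bet in this scheme can be checked with exact arithmetic. A minimal sketch, treating p(n) as a given positive integer and assuming, as in the definition of d below, that after m consecutive wrong bets the next stake is 2^m/p(n):

```python
from fractions import Fraction

def net_gain_after_win(p, losses):
    """Capital change over a run of `losses` wrong bets followed by one
    correct bet, where the stake after m consecutive losses is 2^m / p."""
    capital = Fraction(0)
    for m in range(losses):
        capital -= Fraction(2 ** m, p)   # wrong bet: lose the stake
    capital += Fraction(2 ** losses, p)  # correct bet: win the stake
    return capital

# However many times we are wrong first, one correct bet yields a net
# gain of exactly 1/p, since 2^m - (2^0 + ... + 2^(m-1)) = 1.
for p in (4, 5, 17):
    for losses in range(6):
        assert net_gain_after_win(p, losses) == Fraction(1, p)
```

Since along C there are at most log p(n) − 1 wrong bets inside G_n, the interim losses total less than (2^{log p(n) − 1} − 1)/p(n) < 1, so the capital is never exhausted before the guaranteed correct bet arrives.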
Our bets are designed to ensure that we increase our capital by 1/p(n) whenever we bet correctly. More precisely, we define d as follows. Let d(λ) = 1, let n_λ = 0, and let m_λ = 0. Given d(σ), proceed as follows. If |σ| ∉ G_{n_σ} or m_σ = log p(n_σ), then for i = 0, 1, let d(σi) = d(σ), let n_{σi} = n_σ, and let m_{σi} = m_σ. Otherwise, let d(σ0) = d(σ) + 2^{m_σ}/p(n_σ) and d(σ1) = d(σ) − 2^{m_σ}/p(n_σ). Let n_{σ0} = |σ| and m_{σ0} = 0. Let n_{σ1} = n_σ and m_{σ1} = m_σ + 1. It is easy to check by induction that d(σ) > 0. Note also that p(|σ|) = p(n_σ) + 1, by the definition of p. When betting along C, once we reach a σ0 ≺ C such that |σ| ∈ G_0, which must exist, we have d(σ0) = 1 + 1/p(0) = 1 + 1/4. Then, when we reach a τ0 ≺ C such that |τ| ∈ G_{|σ|}, which again must exist, we have
d(τ0) = 1 + 1/p(0) + 1/p(|σ|) = 1 + 1/4 + 1/5. Proceeding in this fashion, we see that lim sup_k d(C ↾ k) = lim sup_i (1 + 1/4 + 1/5 + · · · + 1/(i + 4)) = ∞.
(i) ⇒ (iii), left-c.e. case. If A is a c.e. set, then we can take g to be approximable from below, so that h is also approximable from below. Let h_0, h_1, . . . be such an approximation. We may assume we have defined g, and hence h, so that if h_{s+1}(n) > h_s(n), then h_{s+1}(m) > z_s for all m ≥ n. We now describe how to give C an almost c.e. approximation C_0, C_1, . . . . We define C_s as follows, beginning with k = 0.
1. Assume that σ = C_s ↾ z_k has been defined. If M_s(σ0) ↑ then go to step 3. If z_k = ψ(e, x) for some x < z_k and e < log p(x) − 1, and h_s(x) = z_k, then let C_s(z_k) = 0. In this case, we say that k is active for h_s. Otherwise, if M_s(σ0) ≤ M_s(σ1) then let C_s(z_k) = 0, and otherwise, let C_s(z_k) = 1.
2. Let η = C_s ↾ (z_k + 1). If M_s(ητ) ↑ for some τ ∈ 2^{z_{k+1} − z_k − 1} then go to step 3. Let C_s ↾ z_{k+1} = τ_{η,F_s(k)}. If k < s then go to step 1 for k + 1. Otherwise, go to step 3.
3. If the recursive definition above is terminated after defining an initial segment σ of C_s, then let C_s = σ0^ω. We say that C_s was terminated due to σ.
It is easy to see that C = lim_s C_s, so we are left with showing that C_s ≤L C_{s+1} for all s. Fix s, and assume that C_s ≠ C_{s+1}. There are three cases.
1. If C_s is terminated due to σ and σ ≺ C_{s+1}, then C_s ≤L C_{s+1}, since C_s = σ0^ω.
2. If the first disagreement between C_s and C_{s+1} happens in step 1 of the kth iteration of the procedure, then k ≤ s and k must be active for one of h_s or h_{s+1} but not the other. It cannot be the case that k is active for h_{s+1}, because then, for x as in step 1, we would have h_s(x) < h_{s+1}(x), which would mean that h_{s+1}(x) > z_s ≥ z_k, by our assumption on h. So k is active for h_s but not for h_{s+1}, which means that the disagreement between C_s and C_{s+1} happens because C_s(z_k) = 0 but C_{s+1}(z_k) = 1, and hence C_s <L C_{s+1}.
For separating weak 1-randomness from stronger randomness notions, the dividing line is hyperimmunity. We will show that every hyperimmune degree contains a weakly 1-random set that is not Schnorr random, but if a set of hyperimmune-free degree is weakly 1-random, then it is in fact weakly
2-random. The first result can be proved by considering the relationship between weak 1-randomness and weak 1-genericity. (We will return to the relationship between randomness and genericity in Section 8.20.)
Theorem 8.11.7 (Kurtz [228]). Every weakly 1-generic set is weakly 1-random.
Proof. Let G be weakly 1-generic. Let S be a Σ^0_1 class of measure 1. Let V be a c.e. set of strings generating S. Since S has measure 1, V is dense, so G meets V, and hence must be a member of S.
By Theorem 2.24.19, we have the following.
Corollary 8.11.8 (Kurtz [228]). Every hyperimmune degree contains a weakly 1-random set.
Neither Theorem 8.11.7 nor Corollary 8.11.8 can be reversed, as we have seen in Proposition 8.1.3 that there are 1-random sets of hyperimmune-free degree, and by Theorem 2.24.19, the weakly 1-generic degrees coincide with the hyperimmune degrees. Combining Theorem 8.11.7 with the following fact also yields the separation result mentioned above.
Proposition 8.11.9. No weakly 1-generic set is Schnorr random.
Proof. Let A be weakly 1-generic. Let S_n = {σ0^n : σ ∈ 2^n} and S = ⋃_n S_n. Then S is computable and dense, so A meets S infinitely often. In other words, A ∈ S_n for infinitely many n. But Σ_n μ(S_n) = Σ_n 2^{-n} = 2, so {S_n}_{n∈ω} is a total Solovay test, and hence, by Theorem 7.1.10, A is not Schnorr random.
Corollary 8.11.10. Every hyperimmune degree contains a weakly 1-random set that is not Schnorr random.
However, the following result shows that outside of the hyperimmune degrees, many of our notions of randomness coincide.
Theorem 8.11.11 (Nies, Stephan, and Terwijn [308]). If A has hyperimmune-free degree then A is weakly 1-random iff A is 1-random.
Proof. Suppose that A has hyperimmune-free degree and is not 1-random. Let {V_n}_{n∈ω} be a nested Martin-Löf test such that A ∈ ⋂_n V_n. Using A, we can compute a stage g(n) by which A has entered V_n. Since A has hyperimmune-free degree, there is a computable function f such that f(n) ≥ g(n) for all n. Let U_n = V_n[f(n)]. Then {U_n}_{n∈ω} is a Kurtz null test such that A ∈ ⋂_n U_n, so A is not weakly 1-random.
Actually, as observed by Yu [unpublished], we can replace the Martin-Löf test in the above proof by a generalized Martin-Löf test to show the following.
Theorem 8.11.12 (Yu [unpublished]). If A has hyperimmune-free degree then A is weakly 1-random iff A is weakly 2-random.
This result improves an earlier direct proof by Miller [personal communication] that there is a weakly 2-random set of hyperimmune-free degree. It has the following interesting corollary, which extends Corollary 8.6.2.
Corollary 8.11.13 (Downey and Hirschfeldt [unpublished]). There is a degree a such that every nonzero degree b ≤ a contains a weakly 2-random set.
Proof. By Proposition 8.1.3, there is a 1-random set A of hyperimmune-free degree. Let B ≤T A be noncomputable. By Proposition 2.17.7, B ≤tt A, so by Demuth's Theorem 8.6.1, there is a 1-random C ≡T B. By Theorem 8.11.12, C is weakly 2-random. As in the 1-randomness case, it is not known whether a can be made hyperimmune.
8.11.3 When van Lambalgen's Theorem fails
We are now in a position to prove the result, mentioned in Section 7.1.2, that van Lambalgen's Theorem fails for Schnorr and computable randomness. This result was originally proved by Merkle, Miller, Nies, Reimann, and Stephan [269], with the following extension obtained by Yu [412].
Theorem 8.11.14 (Yu [412]). Let A ⊕ B be Schnorr random and computable in an incomplete c.e. set. Then A is not Schnorr random relative to B and B is not Schnorr random relative to A. The same is true for computable randomness.
Proof. By Theorem 7.1.13, A and B are Schnorr random. By Theorem 8.2.1, A, B, and A ⊕ B are not 1-random. By Theorem 8.11.2, they are high. Thus (A ⊕ B)′ ≡T ∅′′.
Barmpalias, Downey, and Ng [22] have shown that this result holds also for weak 2-randomness.
8.12 Measure theory and Turing reducibility
The first applications of measure theory to computability theory were due to Spector [375], who showed that almost all pairs of hyperdegrees (a generalization of the notion of Turing degree to higher computability theory) are incomparable, and, even earlier, to de Leeuw, Moore, Shannon, and Shapiro [92]. Their result, which we now discuss, was not well known in the computability theory community until recently, and has as an immediate corollary another fundamental theorem first explicitly stated by Sacks [341]. It says that if a set is c.e. relative to positive measure many oracles, then it is in fact c.e. Its proof uses what is known as the majority vote technique.
Theorem 8.12.1 (de Leeuw, Moore, Shannon, and Shapiro [92]). If μ({X : A is c.e. in X}) > 0 then A is c.e.
Proof. If {X : A is c.e. in X} has positive measure then there is an e such that S = {X : W_e^X = A} has positive measure. By the Lebesgue Density Theorem 1.2.3, there is an X such that lim_n 2^n μ(S ∩ [X ↾ n]) = 1. Let n be such that 2^n μ(S ∩ [X ↾ n]) > 1/2 and let σ = X ↾ n. We can enumerate A by "taking a vote" among the sets extending σ. More precisely, k ∈ A iff 2^n μ({Y : σ ≺ Y ∧ k ∈ W_e^Y}) > 1/2, and the set of k for which this inequality holds is clearly c.e.
Theorem 8.12.1 has a very interesting corollary, which was not explicitly stated in [92], but later independently formulated by Sacks [341]. For a set A, let A^{≤T} = {B : A ≤T B}. It is natural to ask whether there is a noncomputable set A such that A^{≤T} has positive measure. Such a set would have a good claim to being "almost computable". Furthermore, the existence of a noncomputable set A and an i such that μ({B : Φ_i^B = A}) = 1 could be considered a counterexample to the Church-Turing Thesis, since a machine with access to a random source would be able to compute A. The following result lays such worries to rest.
Corollary 8.12.2 (Sacks [341]). If μ(A^{≤T}) > 0 then A is computable.
Proof.
If μ(A^{≤T}) > 0 then there is an i such that {B : Φ_i^B = A} has positive measure. It is now easy to show that there are j and k such that {B : W_j^B = A} and {B : W_k^B = Ā} both have positive measure, where Ā is the complement of A. Thus A and Ā are both c.e. by Theorem 8.12.1, so A is computable.
It is not hard to adapt the proof of Theorem 8.12.1 to show that if μ({B : A is Δ_2^{0,B}}) > 0 then A is Δ^0_2. From this result it follows that if
d > 0′, then μ({B : deg(B′) ≥ d}) = 0. In Section 8.14 we will see that, in fact, almost all sets are GL_1. The following is a useful consequence of Corollary 8.12.2.
Corollary 8.12.3. If A >T ∅ and X is weakly 2-random relative to A, then A ≰T X.
Proof. Suppose that ∅
this class has measure 1 was first shown by Stillwell [382].
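The majority vote technique of Theorem 8.12.1 can be illustrated with a finite toy model. Everything in this sketch is artificial: the "oracles" are the 2^L binary strings of a fixed length L, and W is an arbitrary function standing in for a relativized enumeration, set up so that the oracles answering correctly about A carry every vote.

```python
from itertools import product

L = 6           # toy oracle length: the "oracles" are the 2^L binary strings
A = {1, 3, 4}   # the target set, restricted to {0, ..., L-1}

def W(oracle, k):
    # Toy stand-in for a relativized enumeration: oracles whose first bit
    # is 0 answer correctly about A; the rest answer by a parity rule, so
    # their wrong answers split evenly between "yes" and "no".
    if oracle[0] == 0:
        return k in A
    return (k + sum(oracle)) % 2 == 0

def majority_vote(k):
    # Vote k into the enumerated set iff strictly more than half of all
    # oracles claim that k is in W relative to them.
    yes = sum(1 for oracle in product((0, 1), repeat=L) if W(oracle, k))
    return yes > 2 ** (L - 1)

# The correct half of the oracles always carries the vote, so taking the
# majority recovers A exactly.
recovered = {k for k in range(L) if majority_vote(k)}
assert recovered == A
```

In the actual proof the vote is taken over the oracles extending the high-density string σ, and the measure comparison 2^n μ({Y : σ ≺ Y ∧ k ∈ W_e^Y}) > 1/2 plays the role of the finite count here.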
Theorem 8.12.6 (Stillwell [382]). If μ({X : A ≤T B ⊕ X}) > 0 then A ≤T B.
An immediate corollary to this result (which we will improve in Theorem 8.15.1) is the following.
Corollary 8.12.7 (Stillwell [382]). Let a and b be degrees. Then (a ∨ b) ∧ (a ∨ c) is defined and equal to a for almost all c.
Proof. Let A ∈ a and B ∈ b. For any set D, Theorem 8.12.6 implies that μ({C : D ≤T A ⊕ C ∧ D ≰T A}) = 0. Since there are only countably many D ≤T A ⊕ B, we have μ({C : ∃D ≤T A ⊕ B (D ≤T A ⊕ C ∧ D ≰T A)}) = 0. So for almost all C, every D that is computable in both A ⊕ B and A ⊕ C is also computable in A. Since A is computable in both A ⊕ B and A ⊕ C, the corollary follows.
8.13 n-randomness and weak n-randomness
In Theorem 7.2.7, we saw that every n-random set is weakly n-random, and every weakly (n + 1)-random set is n-random. In this section, we show that neither implication can be reversed, even at the level of degrees. An important special case of the following result was given in Theorem 7.2.8.
Theorem 8.13.1 (Downey and Hirschfeldt [unpublished]). If A is weakly (n + 1)-random and ∅ <T B ≤T ∅^(n), then B ≰T A.
Proof. Let P_e = {X : Φ_e^X is total} and Q_e = {X : ∀k (Φ_e^X(k) = B(k))}, and let R_e = P_e ∩ Q_e. Then P_e is a Π^0_2 class, and Q_e is a Π_1^{0,∅^(n)} class, since for any string σ, if Φ_e^σ(k) ↓, then the condition Φ_e^σ(k) = B(k) can be checked effectively in ∅^(n). Since n ≥ 1, we see that R_e is a Π^0_{n+1} class. If X ∈ R_e then X computes B, so by Corollary 8.12.2, R_e has measure 0, and hence cannot contain a weakly (n + 1)-random set. The theorem now follows, since any set computing B must be in R_e for some e.
It follows immediately from the above theorem that no weakly (n + 1)-random set can be ∅^(n)-computable, or even compute a noncomputable ∅^(n)-computable set. Since there are ∅^(n)-computable n-random sets, we have the following result.
Corollary 8.13.2 (Kurtz [228]). For every n ≥ 1, there is an n-random set that is not of weakly (n + 1)-random degree, and indeed cannot be computed by any set of weakly (n + 1)-random degree.
Kurtz [228] showed that there is a weakly 1-random set that is not of 1-random degree. Kautz [200] extended this result (at the set level) to show that if n ≥ 1, then there is a weakly n-random set that is not n-random (though this result was stated without proof earlier by Gaifman and Snir [168]). Kautz' proof was fairly complicated, but we will get a simpler one, which works at the level of degrees as well, using the following relativized form of Theorem 7.2.25. Note that this result does not follow immediately by relativizing the proof of Theorem 7.2.25, because weak (n + 1)-randomness is not the same as weak 1-randomness relative to ∅^(n).
Theorem 8.13.3 (Downey and Hirschfeldt [unpublished]). If X >T ∅^(n−1) is c.e. in ∅^(n−1) then there is a weakly n-random set A such that A ⊕ ∅^(n−1) ≡T X.
Proof. The n = 1 case is Theorem 7.2.25, so let n ≥ 2. By Theorem 7.2.6, A is weakly n-random iff for every ∅^(n−1)-computable sequence of Σ_1^{∅^(n−2)} classes {U_i}_{i∈ω} such that μ(U_i) ≤ 2^{-i}, we have A ∉ ⋂_i U_i. We do the n = 2 case using this characterization. The general case then follows by a straightforward relativization.
Thus we have a Σ^0_2 set X >T ∅′. We build a weakly 2-random A with A ⊕ ∅′ ≡T X via a ∅′-computable construction. Let X_0, X_1, . . . be an approximation to X as a ∅′-c.e. set. Let S_0, S_1, . . . be an effective listing of the Σ^0_1 classes. We will first discuss how to build A ≤T X, and later explain how to add coding to the construction to ensure that A ⊕ ∅′ ≡T X. For simplicity of presentation, we adopt the convention that Φ_e^{∅′}(i) ↓ ⇒ μ(S_{Φ_e^{∅′}(i)}) ≤ 2^{-i} (that is, we do not allow Φ_e^{∅′}(i) to converge unless μ(S_{Φ_e^{∅′}(i)}) ≤ 2^{-i}, which we can determine using ∅′). We need to satisfy the requirements
R_e : Φ_e^{∅′} total ⇒ ∃i (A ∉ S_{Φ_e^{∅′}(i)})
while making A be X-computable. Associated with each R_e will be a number r_e, which is initially set to equal e + 1 and may be redefined during the construction. We say that R_e requires attention at stage s if R_e is not currently satisfied and for some i, we have Φ_e^{∅′}(r_e + i) ↓ and X_{s+1}(i) ≠ X_s(i). When a requirement R_e is satisfied, it will be satisfied through a particular number i_e (which indicates that we have committed ourselves to keeping A out of S_{Φ_e^{∅′}(i_e)}). Initially each i_e is undefined. We build A as the limit of strings σ_0, σ_1, . . . with |σ_s| = s. Let σ_0 = λ. At stage s, we proceed as follows.
1. Fix the least e ≤ s that requires attention. (If there is none then proceed to step 2.) Define i_e = r_e + i for the least i such that Φ_e^{∅′}(r_e + i) ↓ and X_{s+1}(i) ≠ X_s(i). Declare R_e to be satisfied. For each e′ > e, declare R_{e′} to be unsatisfied and undefine i_{e′} and r_{e′}.
2. Let S = ⋃_{j : i_j ↓} S_{Φ_j^{∅′}(i_j)}. Let σ_{s+1} ∈ 2^{s+1} be such that μ(S | σ_{s+1}) < 1 and σ_{s+1} ↾ i_e = σ_s ↾ i_e, where e is as in step 1 (if defined).
3. For all e such that r_e is undefined, define r_e to be a fresh large number, so that μ(S | σ_{s+1}) + Σ_{e′ > e} 2^{-r_{e′}} < 1.
It is easy to check by induction that the definition of σ_{s+1} at step 2 is always possible, because of how we define the r_e at step 3. Let A be the limit of the σ_s. It is also easy to check that A is well defined and X-computable (since if n < s then σ_{s+1} ↾ n = σ_s ↾ n unless some number less than or equal to n enters X at stage s). Furthermore, the usual finite injury argument shows that if Φ_e^{∅′} is total then R_e is eventually permanently satisfied, because once all R_j for j < e have stopped requiring attention at some stage s, the fact that X ≰T ∅′ implies that R_e eventually requires attention after stage s. The fact that the S_i are open then implies that R_e is satisfied.
Adding coding to this construction to ensure that X is computable in A ⊕ ∅′ is not difficult. We have coding markers m_n[s] such that m_n = lim_s m_n[s] exists. Every time we have m_n[s + 1] ≠ m_n[s], we ensure that A ↾ m_n[s] ≠ σ_s ↾ m_n[s]. We also ensure that if n enters X at stage s then A ↾ m_n[s] ≠ σ_s ↾ m_n[s]. It is straightforward to add such markers to the construction, moving all markers of weaker priority whenever R_e acts. The requirements on the markers ensure that, knowing A, we can follow the construction (which we can do using ∅′) to find an s such that A ↾ m_n[s] = σ_s ↾ m_n[s], at which point we know that n ∈ X iff n ∈ X_s.
Corollary 8.13.4 (Kurtz [228] for n = 1, Kautz [200] (at the set level)). Let n ≥ 1. There is a weakly n-random set that is not of n-random degree.
Proof. Let X be c.e. in ∅^(n−1) and such that ∅^(n−1)
8.14 Every 2-random set is GL_1
The fact that all 1-random left-c.e. reals are complete might lead to the false impression that randomness is a highness notion. In general, however, the reality is quite the opposite. We already saw results in this direction in Section 8.8, and will see further results (in Sections 15.5 and 15.6, for instance) showing that randomness is in fact antithetical to computational power, and in many ways, sufficiently random sets begin to resemble highly nonrandom (e.g., computable) ones. Thus randomness is in a sense a lowness notion. In this section, we prove a result that lends evidence to this claim. Recall that a set X is generalized low (GL_1) if X′ ≡T X ⊕ ∅′. The following result generalizes Corollary 8.12.2. We will see in Corollary 8.14.4 that the n + 1 in its statement is optimal.
Theorem 8.14.1 (Kautz [200]). Let n > 0. The class {X : X^(n) ≡T X ⊕ ∅^(n)} includes every (n + 1)-random set, and hence has measure 1. In particular, every 2-random set is GL_1.^8
^8 Kautz [200] points out that the "almost all" version of this result (that X^(n) ≡T X ⊕ ∅^(n) for almost every X) is due to Sacks [341] and Stillwell [382].
Proof. We build a functional Φ such that the classes {A : A^(n)(e) ≠ Φ^{A⊕∅^(n)}(e)} for e ∈ ω form a Σ^0_{n+1}-test. It then follows that if A is (n + 1)-random, then A^(n)(e) = Φ^{A⊕∅^(n)}(e) for almost all e, and hence A^(n) ≤T A ⊕ ∅^(n). (Of course, A^(n) ≥T A ⊕ ∅^(n) is automatic.)
Let C_e^i = {A : e ∈ A^(i)} and write C_e for C_e^n. We claim that the C_e are uniformly Σ^0_n classes. To establish this claim, assume by induction that the C_e^{i−1} are uniformly Σ^0_{i−1} classes. Then the classes D_σ^{i−1} = {A : σ ≺ A^(i−1)} are uniformly Δ^0_i, and C_e^i = ⋃_{σ : Φ_e^σ(e)↓} D_σ^{i−1}, whence the C_e^i are uniformly Σ^0_i.
Thus, by Theorem 6.8.3, uniformly in ∅^(n), we can find Σ_1^{∅^(n−1)} classes U_e with C_e ⊆ U_e and μ(U_e) − μ(C_e) ≤ 2^{-(e+1)} for all e. There is an ∅^(n)-computable function f taking natural numbers to finite sets of strings such that for each e we have f(e) ⊆ U_e and μ(U_e \ f(e)) ≤ 2^{-(e+1)}. Let Ψ be a functional such that f = Ψ^{∅^(n)}. Let
Φ^{A⊕B}(e) =
  1 if Ψ^B(e) ↓ ∧ A ∈ Ψ^B(e)
  0 if Ψ^B(e) ↓ ∧ A ∉ Ψ^B(e)
  ↑ otherwise.
Then
Φ^{A⊕∅^(n)}(e) =
  1 if A ∈ f(e)
  0 otherwise.
Let V_e = {A : A^(n)(e) ≠ Φ^{A⊕∅^(n)}(e)}. Then A ∈ V_e iff either A ∈ C_e and A ∉ f(e), or A ∉ C_e and A ∈ f(e). Since μ(C_e \ f(e)) ≤ μ(U_e \ f(e)) ≤ 2^{-(e+1)} and μ(f(e) \ C_e) ≤ μ(U_e \ C_e) ≤ 2^{-(e+1)}, we have μ(V_e) ≤ 2^{-e}. Since the sets f(e) are clopen and uniformly computable in ∅^(n), the V_e are uniformly Σ^0_{n+1} classes. Thus the V_e form a Σ^0_{n+1}-test, as required.
In the n = 1 case of the above proof, we can take U_e = C_e, and it is easy to see that we can also ensure that f ≤wtt ∅′. It then follows that the V_e form a Demuth test, so we have the following result, which was first stated by Demuth [95], with a proof sketch, a full proof later appearing in Nies [306].
Theorem 8.14.2 (Demuth [95]). All Demuth random sets are GL_1.
Using Theorem 8.14.1, Kautz [200] generalized Theorem 8.5.1 as follows.
Theorem 8.14.3 (Kautz [200]). If a ≥ 0^(n) then there is an n-random set A with A^(n−1) ∈ a.
Proof. The proof of Theorem 8.5.1 relativizes. Recall that in that proof we constructed a perfect tree T ≤T ∅′ such that [T] ⊆ U_0, where U_0 is the first element of Kučera's universal Martin-Löf test. To ensure the effectiveness of the coding of sets into paths of T used in the proof, we used Lemma 8.5.2, which allows us to compute effective lower bounds for the measures of nonempty sets of the form U_0 ∩ σ. We can relativize this construction to ∅^(n−1). We can construct T computably in ∅^(n), now letting U_0 be the first element of Kučera's universal Martin-Löf test relativized to ∅^(n−1). The necessary lower bounds given by the relativized form of Lemma 8.5.2 are now computable in ∅^(n−1). We thus get an n-random A such that A ⊕ ∅^(n−1) ∈ a. By Theorem 8.14.1, A^(n−1) ≡T A ⊕ ∅^(n−1), so A^(n−1) ∈ a.
Corollary 8.14.4 (Kautz [200]). Let n > 0. The class {X : X^(n) ≡T X ⊕ ∅^(n)} does not contain every n-random set.
Proof. Let A be n-random and such that A^(n−1) ≡T ∅^(n). Then A^(n) ≡T ∅^(n+1) while A ⊕ ∅^(n) ≤T A^(n−1) ⊕ ∅^(n) ≡T ∅^(n).
By relativizing Theorem 8.14.1 and Corollary 8.14.4, we get the following result, which will be useful below.
Theorem 8.14.5.
The class {X : (X ⊕ A)^(n) ≡T X ⊕ A^(n)} includes every set that is (n + 1)-random relative to A, and hence has measure 1, but does not include every set that is n-random relative to A.
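The jump computation in the proof of Corollary 8.14.4 can be written out in full; this is just the text's argument in display form, assuming A^(n−1) ≡T ∅^(n) as provided by Theorem 8.14.3:

```latex
% A witness X = A for Corollary 8.14.4, given A^{(n-1)} \equiv_T \emptyset^{(n)}:
\begin{align*}
A^{(n)} &= \bigl(A^{(n-1)}\bigr)' \equiv_T \bigl(\emptyset^{(n)}\bigr)'
           = \emptyset^{(n+1)}, \\
A \oplus \emptyset^{(n)} &\leq_T A^{(n-1)} \oplus \emptyset^{(n)}
  \equiv_T \emptyset^{(n)} <_T \emptyset^{(n+1)} \equiv_T A^{(n)},
\end{align*}
% so A^{(n)} \not\equiv_T A \oplus \emptyset^{(n)}.
```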
8.15 Stillwell’s Theorem It is well known that the theory of the Turing degrees is undecidable, as shown by Lachlan [232] and in fact computably isomorphic to theory of second order arithmetic, as shown by Simpson [358]. On the other hand,
8.15. Stillwell’s Theorem
365
in this section we use material from previous sections to prove Stillwell's Theorem that the "almost all" theory of the Turing degrees is decidable. Here the quantifier ∀ is interpreted to mean "for measure 1 many". By Fubini's Theorem (see [335]), A ⊆ (2^ω)^n has measure 1 iff almost all sections of A with first coordinate fixed have measure 1. This fact allows us to deal with nested quantifiers.
Theorem 8.15.1 (Stillwell [382]). The "almost all" theory of the Turing degrees is decidable.
Proof. Here variables a, b, c, . . . range over degrees. Our terms are built from these variables, 0, ′ (jump), ∨, and ∧. An atomic formula is one of the form t_0 ≤ t_1 for terms t_0, t_1, and formulas in general are built from atomic ones using the connectives & (used here for "and" to avoid confusion with the meet symbol) and ¬, and the quantifier ∀, interpreted to mean "for measure 1 many".
Corollary 8.12.7 allows us to compute the meet of any two terms of the form a_0 ∨ a_1 ∨ · · · ∨ a_k ∨ 0^(m) and b_0 ∨ b_1 ∨ · · · ∨ b_{k′} ∨ 0^(n) as c_0 ∨ c_1 ∨ · · · ∨ c_l ∨ 0^(min{m,n}), where the c_i are the variables common to both terms. For example, (a_1 ∨ a_3 ∨ 0^(4)) ∧ (a_1 ∨ a_5 ∨ a_7 ∨ 0^(6)) = a_1 ∨ 0^(4). Note that we also have a ∧ c = 0 for distinct variables a and c.
Now we can give a normal form for terms. By Theorem 8.14.5, a^(n) = a ∨ 0^(n) for almost all a. We can use this fact to compute the jump of any term of the form a_0 ∨ a_1 ∨ · · · ∨ a_m ∨ 0^(l). That is, since a_0 ∨ a_1 ∨ · · · ∨ a_m ∨ 0^(l) = (a_0 ∨ a_1 ∨ · · · ∨ a_m)^(l) for almost all instances, it follows that (a_0 ∨ a_1 ∨ · · · ∨ a_m ∨ 0^(l))′ = a_0 ∨ a_1 ∨ · · · ∨ a_m ∨ 0^(l+1). Hence, using the rule for ∧, the jump rule, and the join rule
(a_0 ∨ a_1 ∨ · · · ∨ a_m ∨ 0^(l)) ∨ (b_0 ∨ b_1 ∨ · · · ∨ b_n ∨ 0^(k)) = a_0 ∨ a_1 ∨ · · · ∨ a_m ∨ b_0 ∨ · · · ∨ b_n ∨ 0^(max{l,k}),
we can reduce any term t to one of the form a_0 ∨ a_1 ∨ · · · ∨ a_m ∨ 0^(l).
The proof is completed by giving the decision procedure. We show that every formula with free variables is effectively 0-1-valued.
That is, it is satisfied by a set of instances of measure 0 or 1, and we can effectively determine which is the case. We proceed by structural induction. To see that t_0 ≤ t_1 is effectively 0-1-valued, first put the terms t_i in normal form, then calculate t_0 ∧ t_1. If t_0 ∧ t_1 = t_0 then t_0 ≤ t_1 for almost all instances. Otherwise, we must have t_0 ∧ t_1 < t_0, either because t_0 contains a variable not in t_1, or because the 0^(l) term in t_0 is too big. In either case, t_0 ≤ t_1 is false for almost all instances. If ψ_0, ψ_1 are effectively 0-1-valued then clearly so are ψ_0 & ψ_1 and ¬ψ_i. Fubini's Theorem shows that if ψ is effectively 0-1-valued, then so is ∀a_i ψ(a_0, . . . , a_n), thus completing the proof.
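The normal-form and decision rules above are simple enough to implement directly. A sketch, representing a normal-form term a_0 ∨ · · · ∨ a_m ∨ 0^(l) as a pair of a variable set and a jump level (all function names here are mine):

```python
def term(variables, jump_level):
    # A normal-form term a0 ∨ ... ∨ am ∨ 0^(l).
    return (frozenset(variables), jump_level)

def join(t1, t2):
    # Join rule: union the variables, take the larger jump.
    (v1, l1), (v2, l2) = t1, t2
    return (v1 | v2, max(l1, l2))

def meet(t1, t2):
    # Meet rule (valid almost everywhere): keep the common variables
    # and the smaller jump.
    (v1, l1), (v2, l2) = t1, t2
    return (v1 & v2, min(l1, l2))

def jump(t):
    # Jump rule (valid almost everywhere): increment the 0^(l) term.
    v, l = t
    return (v, l + 1)

def leq(t1, t2):
    # The atomic formula t1 <= t2 holds for almost all instances iff
    # t1 ∧ t2 = t1; otherwise it fails for almost all instances.
    return meet(t1, t2) == t1

# The worked example from the text:
t1 = term({"a1", "a3"}, 4)        # a1 ∨ a3 ∨ 0^(4)
t2 = term({"a1", "a5", "a7"}, 6)  # a1 ∨ a5 ∨ a7 ∨ 0^(6)
assert meet(t1, t2) == term({"a1"}, 4)
assert jump(t1) == term({"a1", "a3"}, 5)
assert not leq(t1, t2) and leq(term({"a1"}, 4), t2)
```

Quantifiers are then handled exactly as in the proof: since every formula built from these atoms is effectively 0-1-valued, ∀ ("for measure 1 many") never changes which of the two values a formula takes.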
8.16 DNC degrees and autocomplexity
Kjos-Hanssen, Merkle, and Stephan [205] extended Theorem 8.8.1 to characterize the sets that compute diagonally noncomputable functions in terms of Kolmogorov complexity. Kanovich [197, 198] formulated the following definition of complexity and autocomplexity for computably enumerable sets, and stated the result that complex c.e. sets are wtt-complete and autocomplex c.e. sets are Turing complete, as we will see below.
Definition 8.16.1 (Kjos-Hanssen, Merkle, and Stephan [205], Kanovich [197, 198]).
(i) A is h-complex if C(A ↾ n) ≥ h(n) for all n.
(ii) A is (order) complex^9 if there is a computable order h such that A is h-complex.
(iii) A is autocomplex if there is an A-computable order h such that A is h-complex.
^9 This notion should not be confused with that of Kummer complex sets, which we will meet in Chapter 16.
Notice that we could have used K in place of C in the definition above, since the two complexities differ by only a log factor. Notice too that every 1-random set is complex. We now establish some interesting characterizations of autocomplexity and complexity.
Lemma 8.16.2 (Kjos-Hanssen, Merkle, and Stephan [205]). The following are equivalent.
(i) A is autocomplex.
(ii) There is an A-computable h such that C(A ↾ h(n)) ≥ n for all n.
(iii) There is an A-computable f such that C(f(n)) ≥ n for all n.
(iv) For every A-computable h, there is an A-computable g such that C(g(n)) ≥ h(n) for all n.
Proof. (i) ⇒ (ii). Let g be an A-computable order such that C(A ↾ n) ≥ g(n) for all n. Then we get (ii) by taking h(n) = min{k : g(k) ≥ n}.
(ii) ⇒ (iii). Let g(n) be the encoding of A ↾ h(n) as a natural number. Then C(g(n)) ≥ n − O(1), so we can take f(n) = g(n + k) for a suitable k.
(iii) ⇔ (iv). Clearly (iv) implies (iii). Now let h be A-computable and let f be as in (iii). Let g(n) = f(h(n)). Then C(g(n)) = C(f(h(n))) ≥ h(n).
(iii) ⇒ (i). Let f be as in (iii). Let Φ be such that Φ^A = f and let g be the use function of that computation, which by the usual assumptions
on the use is an A-computable order. Then for any m ≥ g(n), the value of f(n) can be computed from n and A ↾ m, so n ≤ C(f(n)) ≤ C(A ↾ m) + 2 log n + O(1). Hence, for almost all n and all m ≥ g(n), we have C(A ↾ m) ≥ n/2. Therefore, a finite variation of the A-computable order n ↦ (1/2) max{m : g(m) ≤ n} witnesses the autocomplexity of A.
Similar methods allow us to characterize the complex sets.
Theorem 8.16.3 (Kjos-Hanssen, Merkle, and Stephan [205]). The following are equivalent.
(i) A is complex.
(ii) There is a computable function h such that C(A ↾ h(n)) ≥ n for all n.
(iii) A tt-computes a function f such that C(f(n)) ≥ n for all n.
(iv) A wtt-computes a function f such that C(f(n)) ≥ n for all n.
The following result can be viewed as the ultimate generalization of Kučera's Theorem 8.8.1 that 1-randoms have DNC degree.
Theorem 8.16.4 (Kjos-Hanssen, Merkle, and Stephan [205]). A set is autocomplex iff it is of DNC degree.
Proof. Let A be autocomplex and let f be as in Lemma 8.16.2 (iii). Then for almost all n, C(Φ_n(n)) ≤ C(n) + O(1) ≤ log n + O(1) < n ≤ C(f(n)), so a finite variation of f is DNC.
Now suppose that A is not autocomplex. We show that no Φ_i^A is DNC. Fix i. Let U be the universal machine used to define C. For each σ let e(σ) be such that Φ_{e(σ)}(n) is computed as follows. If U(σ) = τ for some τ, then compute Φ_i^τ(n) and output the result if this computation halts. Let h(n) be the maximum of the uses of all computations Φ_i^A(e(σ)) with |σ| < n. Then h ≤T A, so by Lemma 8.16.2, there are infinitely many n such that C(A ↾ h(n)) < n. For any such n, let σ_n ∈ 2^{<n} be such that U(σ_n) = A ↾ h(n). Then
Φ_{e(σ_n)}(e(σ_n)) = Φ_i^{A ↾ h(n)}(e(σ_n)) = Φ_i^A(e(σ_n)),
so Φ_i^A is not DNC.
Again, similar methods yield results for complex sets.
Theorem 8.16.5 (Kjos-Hanssen, Merkle, and Stephan [205]). The following are equivalent.
(i) A is complex.
(ii) A tt-computes a DNC function.
(iii) A wtt-computes a DNC function.
In particular, the sets that wtt-compute DNC functions and those that wtt-compute computably bounded DNC functions coincide.
By Theorems 8.10.3 and 8.16.4, there is an autocomplex set that does not compute any 1-random set. Kjos-Hanssen, Merkle, and Stephan [205] showed that the proof of Theorem 8.10.3 can be used to extend this result as follows.
Theorem 8.16.6 (Kjos-Hanssen, Merkle, and Stephan [205]). There is a complex set that does not compute any 1-random set.
Kjos-Hanssen, Merkle, and Stephan [205] also showed that a set is complex iff it is not wtt-reducible to any hyperimmune set. Miller [275] defined a set A to be hyperavoidable if there is a computable order h such that if Φ_e(n) ↓ for all n < e then A ↾ h(e) ≠ Φ_e ↾ h(e). Miller [275] showed that these sets are exactly the ones that are not wtt-reducible to a hyperimmune set, so it turns out that a set is hyperavoidable iff it is complex. The fact that a set is complex iff it is not wtt-reducible to any hyperimmune set has also been used as a tool for proving results in computable model theory by Chisholm, Chubb, Harizanov, Hirschfeldt, Jockusch, McNicholl, and Pingrey [64].
Initial segment complexity and Turing complexity are strongly related in the presence of effective enumerations. For instance, in Theorem 9.14.6, we will show that if a left-c.e. real α is 1-random and B is a c.e. set, then there is a wtt-reduction Γ with identity use such that Γ^α = B. In the present context, we have the following result, which follows from the lemmas above and Arslanov's Completeness Criterion (Theorem 2.22.3; see also the comment following the proof of that theorem and Theorem 2.22.1, which says that computing DNC functions is equivalent to computing fixed-point-free functions).
Theorem 8.16.7 (Kanovich [197, 198]). Let A be a c.e. set.
(i) A is complex iff A is wtt-complete.
(ii) A is autocomplex iff A is Turing complete.
One of the points of [205] was to show how to avoid the use of Arslanov's Completeness Criterion in results like the above. Thus, a direct proof of Theorem 8.16.7 was given in that paper. (Kanovich [197, 198] stated the result without proof.) We could also give a direct proof based on the method of Theorem 8.2.1. That is, suppose that C(A ↾ n) ≥ h(n) for a computable order h. We build a Turing machine M with coding constant c in the universal machine used to define C. For each k we can effectively find an n_k such that h(n_k) > k + c. If k enters ∅′ at stage s, we let M(0^k) = A_s ↾ n_k, thus ensuring that C(A_s ↾ n_k) ≤ k + c < h(n_k), whence A ↾ n_k ≠ A_s ↾ n_k.
For the autocomplex case, we can proceed in the same way, except that we now approximate h, so while k is not in ∅′, the value of n_k can change. We can still recover ∅′ from A because the final value of n_k is A-computable, but this is no longer a wtt-reduction.
We finish this section with a result that will be important when we consider lowness for weak 1-randomness in Section 12.4. A set A is weakly computably traceable if there is a computable h such that for all f ≤T A, there is a computable g with |D_{g(n)}| ≤ h(n) for all n and f(n) ∈ D_{g(n)} for infinitely many n.
Theorem 8.16.8 (Kjos-Hanssen, Merkle, and Stephan [205]). The following are equivalent.
(i) The degree of A is DNC or high.
(ii) A is autocomplex or of high degree.
(iii) There is a g ≤T A such that either g dominates all computable functions or for every partial computable h, we have g(n) ≠ h(n) for almost all n.
(iv) There is an f ≤T A such that for every computable h, we have f(n) ≠ h(n) for almost all n.
(v) A is not weakly computably traceable.
Furthermore, if the degree of B is hyperimmune-free and not DNC, then for each g ≤T B, there are computable h and h̄ such that ∀n ∃m ∈ [n, h̄(n)] (h(m) = g(m)).
Proof. (i) ⇔ (ii) by Theorem 8.16.4.
(ii) ⇒ (v). Suppose that (v) fails but A is either autocomplex or high. Let h be a computable function witnessing the fact that A is weakly computably traceable. If A is autocomplex then, by Lemma 8.16.2, there is an A-computable function p such that C(A ↾ p(n)) > log n + h(n) for all n. Let g be a computable function such that {D_{g(n)}}_{n∈ω} weakly traces the function n ↦ A ↾ p(n), with |D_{g(n)}| ≤ h(n) for all n. We can describe A ↾ p(n) by giving n and the program for g, together with log h(n) many bits to say which of the elements of D_{g(n)} is A ↾ p(n). Thus for infinitely many n, we have C(A ↾ p(n)) ≤ log n + O(log h(n)) < log n + h(n), contradicting the choice of p.
If A is high, then let p ≤T A dominate all computable functions. Let g be a computable function such that {D_{g(n)}}_{n∈ω} weakly traces p.
Then the function n ↦ max D_{g(n)} + 1 is computable, so p(n) > max D_{g(n)} for almost all n, which is a contradiction, since p(n) ∈ D_{g(n)} for infinitely many n.
(v) ⇒ (iv). If (iv) fails then given any f ≤T A, there is a computable function h with f(n) = h(n) for infinitely many n. Then (v) fails by letting g be such that D_{g(n)} = {h(n)}.
(iv) ⇒ (iii). If A is high then, by Theorem 8.16.4, there is an A-computable function that dominates all computable functions. So suppose that A is not high. Let f ≤T A be such that for every computable function h, we have f(n) ≠ h(n) for almost all n. Suppose that ϕ is a partial computable function such that f(n) = ϕ(n) for infinitely many n. Let p ≤T A be such that for each k there is an n ≥ k with ϕ(n)[p(k)]↓ = f(n). Since A is not high, there is a computable function q such that q(k) > p(k) for infinitely many k. We may assume that q is nondecreasing. Let h(n) = ϕ(n)[q(n)] if this value is defined, and h(n) = 0 otherwise. Then h is total and computable. If q(k) ≥ p(k) then there is an n ≥ k such that ϕ(n)[q(n)] = ϕ(n)[q(k)] = ϕ(n)[p(k)] = f(n), whence h(n) = f(n). Thus there are infinitely many n such that f(n) = h(n), contradicting the choice of f.
(iii) ⇒ (i). By Theorem 2.23.7, if g dominates all computable functions then A is high. Otherwise, let h(e) = Φ_e(e). Then g(e) ≠ h(e) for almost all e, so a finite variation of g is DNC.
Finally, we verify the last part of the theorem. Let B be a set whose degree is hyperimmune-free and not DNC, and let g ≤T B. Then B is neither high nor of DNC degree, so (iv) fails with B in place of A, and there is a computable h such that h(n) = g(n) infinitely often. Let f(n) be the least m ≥ n such that h(m) = g(m). Then f ≤T B. Since B has hyperimmune-free degree, there is a computable function h̄ such that f(n) ≤ h̄(n) for all n. Then ∀n ∃m ∈ [n, h̄(n)] (h(m) = g(m)).
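The trace condition of Theorem 8.16.8 can be examined concretely over a finite window. In the sketch below (the function names are ours), trace[n] plays the role of D_{g(n)}:

```python
def weak_trace_hits(f, trace, h, upto):
    # A weak trace must obey the size bound |D_n| <= h(n) everywhere.
    for n in range(upto):
        assert len(trace[n]) <= h(n), "size bound violated"
    # It must capture f(n) for infinitely many n; over a finite window
    # we simply report the inputs at which the trace captures f.
    return [n for n in range(upto) if f(n) in trace[n]]
```

For instance, trace[n] = {f(n), 0} captures f at every input, while a trace of constant singletons {0} captures f(n) = n² only at n = 0.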
8.17 Randomness and n-fixed-point freeness
Many results in this book demonstrate that randomness is antithetical to computational strength, and is in a sense a lowness property. Nevertheless, as the results of this section will show, n-random sets do have some computational power.10
Definition 8.17.1 (Jockusch, Lerman, Soare, and Solovay [191]). We define a relation A ∼n B as follows.
(i) A ∼0 B if A = B.
10 Heuristically, we can distinguish between "hitting power" and "avoiding power". Computing the halting problem, for example, requires hitting power, while having hyperimmune degree, say, may be better characterized as an avoidance property (i.e., avoiding domination by computable functions). Of course, the real distinction is between properties that define a class of measure 0 and those that define a class of measure 1. (Most natural properties fall into one of these two cases.) The former will not hold of any sufficiently random sets, while the latter will necessarily hold of all sufficiently random sets. Obviously, a property defining a class of measure 0 is the same as the negation of a property defining a class of measure 1, but this fact does not diminish the heuristic usefulness of this observation.
(ii) A ∼1 B if A =* B.
(iii) If n ≥ 2 then A ∼n B if A^{(n−2)} ≡T B^{(n−2)}.
A total function f is n-fixed-point free (n-FPF) if W_{f(x)} ≁n W_x for all x. Note that the 0-FPF functions are just the usual fixed-point free functions. In Theorem 8.8.1 we saw that every 1-random set computes a DNC function, and hence a fixed-point free function. We now extend this result as follows.
Theorem 8.17.2 (Kučera [218]). Every (n+1)-random set computes an n-FPF function.
Proof. We begin with the case n = 1, proving that each 2-random set computes a 1-FPF function. We then indicate how to modify that proof to obtain the general case.
Let {U_n}_{n∈ω} be Kučera's universal Martin-Löf test, as constructed in Section 8.5. We need only the first element of this test, whose definition we now recall. Let S_0, S_1, ... be an effective list of the prefix-free c.e. subsets of 2^{<ω}. For each e > 0, enumerate all elements of S_{Φ_e(e)} (where we take S_{Φ_e(e)} to be empty if Φ_e(e)↑) into a set R as long as Σ_{σ ∈ S_{Φ_e(e),s}} 2^{−|σ|} < 2^{−e}. (That is, if this sum ever gets to 2^{−e}, stop enumerating the elements of S_{Φ_e(e)} into R before it does.) Let U_0 = ⟦R⟧.
This construction relativizes to a given set X in the obvious way, yielding a class U_0^X. Let D^X be the complement of U_0^X. Note that D^X has positive measure.
Now let A be 2-random. Since we are interested in A only up to degree, we may assume by Theorem 6.10.2 that A ∈ D^{∅′}. Let h be a computable function such that W^{∅′}_{h(e)} = {i : W_e^{[i]} is finite}. (Recall that W_e^{[i]} is the ith column of W_e.) For each e and j, define the Σ^{0,∅′}_1 class V_j^e as follows. If |W^{∅′}_{h(e)}| < j + 1 then let V_j^e = ∅. Otherwise, let F_j be the smallest j + 1 many elements of W^{∅′}_{h(e)} and let V_j^e = {τ : ∀i ∈ F_j (τ(i) = 1)}. The V_j^e are uniformly Σ^{0,∅′}_1 classes, and μ(V_j^e) < 2^{−j}. Using the recursion theorem with parameters, let g be a computable function such that S^{∅′}_{Φ_{g(e)}(g(e))} = V^e_{g(e)} for all e. We may assume that g(e) > 0 for all e.
By the construction of U_0^{∅′}, we have V^e_{g(e)} ∩ D^{∅′} = ∅ for all e. Let σ_e be the shortest initial segment of A containing g(e) + 1 many elements. Then, for each e, either |W^{∅′}_{h(e)}| < g(e) + 1 or there is an i < |σ_e| such that W^{∅′}_{h(e)}(i) ≠ σ_e(i), since otherwise we would have ⟦σ_e⟧ ⊆ V^e_{g(e)}, contradicting our assumption that A ∈ D^{∅′}.
We are ready to define our 1-FPF function f. Let f be an A-computable function such that W_{f(e)} = {⟨i, n⟩ : i < |σ_e| ∧ σ_e(i) = 0 ∧ n ∈ ℕ}.
To show that f is 1-FPF, fix e. If |W^{∅′}_{h(e)}| < g(e) + 1 then almost every column of W_e is infinite. Since only finitely many columns of W_{f(e)} are nonempty, in this case W_{f(e)} ≠* W_e. Otherwise, there is an i < |σ_e| such that W^{∅′}_{h(e)}(i) ≠ σ_e(i). If i ∈ W^{∅′}_{h(e)} then W_e^{[i]} is finite but W_{f(e)}^{[i]} is infinite; otherwise, W_e^{[i]} is infinite but W_{f(e)}^{[i]} is empty. In either case, W_{f(e)} ≠* W_e.
We now turn to the n ≥ 2 case. For a set B, let B^{[≠i]} = ⊕_{j≠i} B^{[j]}. As before, we assume that A is an (n+1)-random set in D^{∅^{(n)}}. It is straightforward to adapt and relativize the proof of the Friedberg-Muchnik Theorem to build a set B that is c.e. relative to ∅^{(n−2)}, low relative to ∅^{(n−2)}, and such that for all i, we have ∅^{(n−2)} ≤T B^{[i]} and B^{[i]} |T B^{[≠i]}. Since B is low relative to ∅^{(n−2)}, given e, i, and k, we can determine whether W_e^{(n−2)} = Φ_k^{B^{[≠i]}} using ∅^{(n)} as an oracle. Thus, by analogy with the n = 1 case, we can let h be a computable function such that

W^{∅^{(n)}}_{h(e)} = {i : W_e^{(n−2)} ≤T B^{[≠i]}}.

Define V_i^e, g, and σ_e as before, but with W^{∅^{(n)}}_{h(e)} in place of W^{∅′}_{h(e)}. As before, for each e, either |W^{∅^{(n)}}_{h(e)}| < g(e) + 1 or there must be an i < |σ_e| such that W^{∅^{(n)}}_{h(e)}(i) ≠ σ_e(i). Let f be an A-computable function such that

W^{∅^{(n−2)}}_{f(e)} = {⟨i, n⟩ ∈ B : i < |σ_e| ∧ σ_e(i) = 0 ∧ n ∈ ℕ}.

We may assume that A(0) = 0, so that W^{∅^{(n−2)}}_{f(e)} ≥T B^{[0]} ≥T ∅^{(n−2)}. We claim that W_e^{(n−2)} ≢T W^{∅^{(n−2)}}_{f(e)} for all e.
Fix e. If |W^{∅^{(n)}}_{h(e)}| < g(e) + 1 then there is an i ≤ |σ_e| such that i ∉ W^{∅^{(n)}}_{h(e)} and the ith column of W^{∅^{(n−2)}}_{f(e)} is empty, which means that W_e^{(n−2)} ≰T B^{[≠i]} but W^{∅^{(n−2)}}_{f(e)} ≤T B^{[≠i]}. So in this case, W_e^{(n−2)} ≢T W^{∅^{(n−2)}}_{f(e)}.
Otherwise, there is an i < |σ_e| such that W^{∅^{(n)}}_{h(e)}(i) ≠ σ_e(i). If i ∈ W^{∅^{(n)}}_{h(e)} then W_e^{(n−2)} ≤T B^{[≠i]} but the ith column of W^{∅^{(n−2)}}_{f(e)} is B^{[i]}. Since B^{[i]} ≰T B^{[≠i]}, we have W_e^{(n−2)} ≢T W^{∅^{(n−2)}}_{f(e)}. Finally, if i ∉ W^{∅^{(n)}}_{h(e)} then W_e^{(n−2)} ≰T B^{[≠i]}, but the ith column of W^{∅^{(n−2)}}_{f(e)} is empty and the other columns of W^{∅^{(n−2)}}_{f(e)} are uniformly computable from B^{[≠i]}. Thus W^{∅^{(n−2)}}_{f(e)} ≤T B^{[≠i]}, and hence in this case also W_e^{(n−2)} ≢T W^{∅^{(n−2)}}_{f(e)}.
Thus W_e^{(n−2)} ≢T W^{∅^{(n−2)}}_{f(e)} for all e. Since W^{∅^{(n−2)}}_{f(e)} is CEA(∅^{(n−2)}) for each e, and f is A-computable, the fact that the proof of the Sacks Jump Inversion Theorem 2.16.5 is uniform implies that there is an A-computable function f̂ such that for all e,

W_{f̂(e)}^{(n−2)} ≡T W^{∅^{(n−2)}}_{f(e)}.

Therefore W_{f̂(e)}^{(n−2)} ≢T W_e^{(n−2)} for all e, as required.
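The column notation W^{[i]} = {n : ⟨i, n⟩ ∈ W} used throughout this proof is easy to make concrete. A minimal sketch (the pairing function below is one standard choice, not the one fixed in the book):

```python
def pair(i, n):
    # One standard Cantor-style pairing <i, n> (an arbitrary choice here).
    return (i + n) * (i + n + 1) // 2 + i

def column(W, i, bound):
    # The ith column W^[i] = {n : <i, n> in W}, searched up to a finite bound.
    return {n for n in range(bound) if pair(i, n) in W}
```

With W = {pair(2, n) for n in range(5)}, column(W, 2, 10) recovers {0, 1, 2, 3, 4}, while column(W, 3, 10) is empty.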
8.18 Jump inversion
In this section, we show that the jump operator can be used to study the distribution of 1-random (Δ02) degrees. We do this by proving a basis theorem for Π01 classes of positive measure. It is of course not the case that a nontrivial Π01 class must have members with all possible jumps, since we could be dealing with a countable Π01 class, which might even have only one noncomputable member. However, as we recall from Theorem 2.19.18, if P is a nonempty Π01 class with no computable members and X ≥T ∅′, then there is an A ∈ P such that A′ ≡T A ⊕ ∅′ ≡T X. Applying this result to a Π01 class of 1-random sets, we have the following.
Corollary 8.18.1. If X ≥T ∅′ then there is a 1-random A such that A′ ≡T A ⊕ ∅′ ≡T X.
The situation for Δ02 1-randoms is less clear, because there is no extension of Theorem 2.19.18 in which we additionally have A ≤T ∅′. In fact, Cenzer [55] observed that there are Π01 classes with no computable members11 such that every member is GL1, and hence every Δ02 member is low. To prove jump inversion for Δ02 1-randoms, we replace Theorem 2.19.18 with a similar basis theorem for fat Π01 classes.
Theorem 8.18.2 (Kučera [217], Downey and Miller [126]12). Let P be a Π01 class of positive measure, and let X ≥T ∅′ be Σ02. Then there is a Δ02 set A ∈ P such that A′ ≡T X.
Corollary 8.18.3 (Kučera [217], Downey and Miller [126]). For every Σ02 set X ≥T ∅′, there is a 1-random Δ02 set A such that A′ ≡T X.
The proof of Theorem 8.18.2 can be viewed as a finite injury construction relative to ∅′. In that sense, it is similar to Sacks' construction in [342] of a
11 Specifically, what is known as a thin perfect class.
12 In [217], Kučera constructed a high incomplete 1-random set, and in Remark 8 stated that similar methods allow for a proof of Theorem 8.18.2 (additionally avoiding upper cones above a given C such that ∅
minimal degree below 0′. We require two additional ideas. The first is the method of forcing with Π01 classes, which was introduced by Jockusch and Soare [195] to prove the low basis theorem. This method is used to ensure that A′ ≤T X. The second is Kučera's Lemma 8.10.1, which allows us to computably bound the positions of branchings as in Kučera coding. Even though randomness is not explicit in the statement of Kučera's Lemma, it is worth noting its implicit presence. Lemma 8.10.1 is proved using concepts from randomness, and its main application has been to the study of the Turing degrees of 1-random sets.
Proof of Theorem 8.18.2. Let P be a Π01 class with positive measure and let X ≥T ∅′ be a Σ02 set. Let Q and c be as in Lemma 8.10.1. That is, Q ⊆ P is a Π01 class, and for an effective listing P_0, P_1, ... of Π01 classes, if Q ∩ P_e ≠ ∅, then μ(Q ∩ P_e) ≥ 2^{−(e+c)}. Note that μ(Q) > 0 since P_e = Q for some e. We will construct a Δ02 set A ∈ Q such that A′ ≡T X.
Before describing the construction, we give a few preliminary definitions. For each σ ∈ 2^{<ω}, define a Π01 class

F_σ = {B ∈ Q : ∀e < |σ| (σ(e) = 0 ⇒ Φ_e^B(e)↑)}.

At each stage s of the construction, we will define a string σ_s ∈ 2^s, which is intended to approximate A′ ↾ s. We will tentatively restrict A to the class F_{σ_s} in order to force its jump. It is important to note that this restriction may be injured at a later stage by the enumeration of an e < s into X.
Next we define a computable function f that grows fast enough to ensure that it eventually bounds the branchings between elements of F_σ for every σ. We will use f to code elements of X into A (or more precisely, into A′). Let h : 2^{<ω} × 2^{<ω} → ω be a computable function such that P_{h(τ,σ)} = ⟦τ⟧ ∩ F_σ for all τ and σ. Let f(0) = 0 and let

f(s + 1) = 1 + max{h(τ, σ) + c : τ ∈ 2^{f(s)} ∧ σ ∈ 2^s}.

Now take τ ∈ 2^{f(t)} and σ ∈ 2^s, for t ≥ s, such that ⟦τ⟧ ∩ F_σ ≠ ∅. We claim that τ has distinct finite extensions ρ_0, ρ_1 ∈ 2^{f(t+1)} that extend to elements of F_σ. Suppose not, and let σ̄ = σ1^{t−s}. Then F_{σ̄} = F_σ, so

μ(Q ∩ P_{h(τ,σ̄)}) = μ(⟦τ⟧ ∩ F_{σ̄}) ≤ 2^{−f(t+1)} < 2^{−(h(τ,σ̄)+c)},

which contradicts the choice of Q.
Kučera used the fact that we can bound branchings in a Π01 class of positive measure to code information into members of such a class. The most basic form of Kučera coding constructs a sequence by extensions, choosing the leftmost or rightmost permissible extension to encode the next bit. For our construction, we distinguish only between the rightmost extension and any other permissible extension. Recall that ≤L is the length-lexicographic ordering of 2^{<ω}. For a Π01 class R and τ ∈ 2^{f(s+1)} for some s, let H_f(R; τ) hold iff

∀ρ ∈ 2^{f(s+1)} (ρ ↾ f(s) = τ ↾ f(s) ∧ τ <L ρ ⇒ R ∩ ⟦ρ⟧ = ∅).
By compactness, R ∩ ⟦ρ⟧ = ∅ iff there is a stage s such that R[s] ∩ ⟦ρ⟧ = ∅, so H_f(R; τ) is a Σ01 condition, and if R ∩ ⟦τ⟧ ≠ ∅, then H_f(R; τ) is true iff τ is the rightmost extension of τ ↾ f(s) of length f(s+1) that can be extended to an element of R.
It will be useful to understand the interaction between f and H_f. Assume that we have τ ∈ 2^{f(t)} for some t ≥ s, and σ ∈ 2^s such that ⟦τ⟧ ∩ F_σ ≠ ∅. Let τ′ ∈ 2^{f(t+1)} be the leftmost extension of τ such that ⟦τ′⟧ ∩ F_σ ≠ ∅. By the definition of f, there are multiple extensions to choose from, so H_f(F_σ; τ′) is false. In fact, if ρ ⪯ σ, then F_σ ⊆ F_ρ, so H_f(F_ρ; τ′) is also false.
We are now ready to describe the construction. Let {X_s}_{s∈ω} be a ∅′-computable enumeration of the Σ02 set X. We may assume that X_0 = ∅ and |X_{s+1} \ X_s| = 1 for all s. We construct A by initial segments using a ∅′ oracle. At each stage s, we find a string τ_s ∈ 2^{f(t)} for some t ≥ s. We will let A = ⋃_s τ_s. Each stage also produces a string σ_s ∈ 2^s, which is an approximation to A′, although not necessarily an initial segment of it. For each s, we require that the following conditions hold.
1. ⟦τ_s⟧ ∩ F_{σ_s} ≠ ∅.
2. If ρ = σ_s ↾ e+1 for e < s and B ∈ ⟦τ_s⟧ ∩ F_ρ, then B′(e) = ρ(e).
Construction.
Stage 0. Let τ_0 and σ_0 both be the empty string. Then ⟦τ_0⟧ ∩ F_{σ_0} = Q ≠ ∅, so (1) is satisfied. Notice that (2) is vacuous.
Stage s + 1. Assume that we have already constructed τ_s ∈ 2^{f(t)} for some t ≥ s and σ_s ∈ 2^s satisfying (1) and (2). Let e be the unique element of X_{s+1} \ X_s.
Case 1. If e > s, then let τ_{s+1} ∈ 2^{f(t+1)} be the leftmost extension of τ_s such that ⟦τ_{s+1}⟧ ∩ F_{σ_s} ≠ ∅. Note that ∅′ can determine whether ⟦ρ⟧ ∩ F_{σ_s} = ∅, for each ρ ∈ 2^{<ω}, so ∅′ can find τ_{s+1}. If ⟦τ_{s+1}⟧ ∩ F_{σ_s 0} ≠ ∅, then let σ_{s+1} = σ_s 0. Otherwise, let σ_{s+1} = σ_s 1. Again, this choice can be made using the ∅′ oracle. Note that (1) and (2) are satisfied by our choices of τ_{s+1} and σ_{s+1}.
Case 2. If e ≤ s, then let ρ_e = σ_s ↾ e. Let m′ be least such that f(⟨e, m′⟩) ≥ |τ_s|. First let τ̄_s ∈ 2^{f(⟨e,m′⟩)} be the leftmost extension of τ_s such that ⟦τ̄_s⟧ ∩ F_{ρ_e} ≠ ∅. Next let τ_{s+1} ∈ 2^{f(⟨e,m′⟩+1)} be the rightmost extension of τ̄_s such that ⟦τ_{s+1}⟧ ∩ F_{ρ_e} ≠ ∅. Now inductively define ρ_k for each e < k ≤ s+1. If we have already defined ρ_k, then determine whether ⟦τ_{s+1}⟧ ∩ F_{ρ_k 0} ≠ ∅. If so, let ρ_{k+1} = ρ_k 0. Otherwise, let ρ_{k+1} = ρ_k 1. Finally, let σ_{s+1} = ρ_{s+1}. Again, the construction is computable relative to ∅′, and we have ensured that (1) and (2) continue to hold.
End of Construction.
We turn to the verification. The construction is ∅′-computable, so A is Δ02. Furthermore, (1) tells us that ⟦τ_s⟧ ∩ F_{σ_s} ≠ ∅, for each s. Because F_{σ_s} ⊆ Q ⊆ P, this fact implies that every τ_s can be extended to an element
of P . But P is closed, so A = s τs ∈ P . All that remains to verify is that A ≡T X. First we prove that A T X. To determine whether e ∈ A , use X and ∅ to find a stage s > e such that Xs e+1 = X e+1. Let ρ = σs e+1. We claim that σt e + 1 = ρ for all t s, because the only way that σs e + 1 can be injured during the construction is in Case 2, when an element less than or equal to e is enumerated into X, which will never happen after stage s. Therefore, for all t s, we have ρ σt and hence Fσt ⊆ Fρ . So τt ∩ Fρ = ∅ for all t s, which implies that A ∈ Fρ . By (2), we have A (e) = ρ(e). Thus we can uniformly decide whether e ∈ A using only X ⊕ ∅ ≡T X. Therefore, A T X. Now we must show that X T A . Assume by induction that we have determined X e for some e. Find the least s e such that Xs e = X e. Let ρ = σs e and note, as above, that ρ σt for all t s. Find the least m such that f (e, m) |τs |. Of course, both s and m can be found by ∅ . We claim that e ∈ X iff either e ∈ Xs or there is an n m such that Hf (Fρ ; A f (e, n + 1)). If e ∈ X \ Xs , then Case 2 ensures that Hf (Fρ ; A f (e, n + 1)) holds for some n m. So, assume that e ∈ / X. Then for every n m, the construction chooses the leftmost extension of A f (e, n) that is extendible in the appropriate Π01 class. This class is of the form Fρ , where ρ σt for some t s and | ρ| e. Then ρ ρ , so Fρ ⊆ Fρ . The definition of f ensures that there are distinct extensions of A f (e, n) of length f (e, n + 1)) that can be extended to elements of Fρ . Therefore, the leftmost choice consistent with Fρ must be left of the rightmost choice consistent with Fρ . Hence Hf (Fρ ; A f (e, n + 1)) is false. Finally, note that A can decide whether there is an n m such that Hf (Fρ ; A f (e, n + 1)), because Hf is Σ01 . Therefore A can determine whether e ∈ X, proving that X T A .
8.19 Pseudo-jump inversion
In the same way that the construction of a low c.e. set was generalized by Jockusch and Shore [193, 194] (as we have seen in Section 2.12.2), Nies and Kučera independently realized that Theorem 8.18.2 can be extended to a similar pseudo-jump theorem. Recall that an operator of the form V^Y = Y ⊕ W^Y for a c.e. set W with Y <T V^Y for all Y is called a nontrivial pseudo-jump operator.
Theorem 8.19.1 (Kučera, Nies [306]). Let V be a nontrivial pseudo-jump operator and let P be a Π01 class of positive measure. Then there is an A ∈ P such that V^A ≡wtt ∅′.
Nies' proof in [306] is a full approximation argument, and hence an effective construction, but we will use Kučera's oracle construction, as it is simpler. We thank George Barmpalias for telling us of this proof.
Proof of Theorem 8.19.1. The proof resembles that of the Friedberg Completeness Criterion, Theorem 2.16.1, combined with Kučera-Gács coding. Let P be a Π01 class of positive measure, and let Q and c be as in Lemma 8.10.1. That is, Q ⊆ P is a Π01 class, and for an effective listing P_0, P_1, ... of Π01 classes, if Q ∩ P_e ≠ ∅, then μ(Q ∩ P_e) ≥ 2^{−(e+c)}. We build an A ∈ Q with V^A ≡wtt ∅′. As in the proof of the Friedberg Completeness Criterion, at even stages we force the (pseudo-)jump and at odd ones we code.
At stage 0 we consider the Π01 class M_0 = {Y : 0 ∉ V^Y}. Let e be such that P_e = M_0. If M_0 ∩ Q = ∅, let Q_0 = Q. Otherwise, let Q_0 = Q ∩ M_0. In either case, let A_0 = λ.
At stage 1, we want to code whether 0 ∈ ∅′. Let e be such that P_e = Q_0 and let m_0 = e + c + 1. Then μ(Q_0) > 2^{−m_0}, so there is a split in Q_0 by level m_0. That is, there are at least two strings of length m_0 that can be extended to elements of Q_0. As with Kučera-Gács coding, if 0 ∈ ∅′ then let A_1 be the rightmost such string, and otherwise let A_1 be the leftmost such string. Let Q_1 = Q_0 ∩ ⟦A_1⟧.
The construction now works inductively in the obvious way. At stage 2n we look at M_n = {Y : n ∉ V^Y}. If M_n ∩ Q_{2n−1} = ∅, we let Q_{2n} = Q_{2n−1}. Otherwise, we let Q_{2n} = Q_{2n−1} ∩ M_n. In either case we let A_{2n} = A_{2n−1}. At stage 2n + 1, we let e be such that P_e = Q_{2n} and let m_n = e + c + 1. As before, there are at least two strings of length m_n that can be extended to elements of Q_{2n}. If n ∈ ∅′, we let A_{2n+1} be the rightmost such string, and otherwise we let A_{2n+1} be the leftmost such string. Finally, we let Q_{2n+1} = Q_{2n} ∩ ⟦A_{2n+1}⟧.
It is easy to see that the questions we have to ask of ∅′ at each stage can be computably bounded in advance, so this construction produces an A with V^A ≡wtt ∅′.
Other extensions along similar lines can be found in Cenzer, LaForte, and Wu [56]. For instance, using somewhat similar techniques, they showed the following.
Theorem 8.19.2 (Cenzer, LaForte, and Wu [56]). Let V be a nontrivial pseudo-jump operator, let P be a Π01 class of positive measure, and let C ≥T ∅′. Then there is an A ∈ P such that V^A ≡T C.
It is interesting to note that, in contrast to Theorem 8.19.1, there are nontrivial pseudo-jump operators V for which there is no c.e. set A with V^A ≡wtt ∅′, as was recently proved by Ng and Wu [personal communication].
8.20 Randomness and genericity
Martin-Löf's original definition of 1-randomness arises from the classical interpretation of randomness as a notion of being typical in terms of measure. We can also look at sets that are typical in terms of category. These are the (Cohen) generic sets, which are members of every (suitably definable) dense open class. In the effective setting, we have the n-generic and weakly n-generic sets introduced in Section 2.24. In this section, we look at how these two notions of being typical relate to each other. The answer, perhaps surprisingly, is that genericity and randomness are more or less orthogonal, especially when we look at higher levels of the arithmetic hierarchy of classes.
8.20.1 Similarities between randomness and genericity
Before concentrating on the differences between randomness and genericity, we note that some results we have discussed above are special cases of general results on forcing, and hence have analogs for genericity. Van Lambalgen's Theorem is an example.
Theorem 8.20.1 (Yu [411]). The following are equivalent for any n ≥ 1.
(i) A ⊕ B is n-generic.
(ii) A is n-generic and B is n-generic relative to A.
(iii) B is n-generic and A is n-generic relative to B.
(iv) A and B are relatively n-generic (i.e., A is n-generic relative to B and B is n-generic relative to A).
Proof. It is enough to show that (i) and (ii) are equivalent, since the equivalence of (i) and (iii) then follows by a symmetric argument, while (iv) is just the conjunction of (ii) and (iii).
We write σ ⊕ τ for the string ρ of length 2 min(|σ|, |τ|) such that ρ(2n) = σ(n) and ρ(2n+1) = τ(n) for all n < min(|σ|, |τ|).
Suppose that (ii) holds. Let S ⊆ 2^{<ω} be Σ0n and let Ŝ = {τ : ∃σ ≺ A (σ ⊕ τ ∈ S)}. Then Ŝ is Σ^A_n. If there is a τ ≺ B with τ ∈ Ŝ then there is a σ such that σ ⊕ τ ≺ A ⊕ B and σ ⊕ τ ∈ S. Otherwise, since B is n-generic relative to A, there is a τ ≺ B with no extension in Ŝ. Let σ = A ↾ |τ| and let T = {μ : ∃ν (σ ⊕ τ ⪯ μ ⊕ ν ∈ S)}. Then T is Σ0n and no member of T is an initial segment of A, since otherwise we would have an extension of τ in Ŝ. Since A is n-generic, there is a σ′ ≺ A with no extension in T. Then (A ⊕ B) ↾ 2|σ′| has no extension in S. Thus A ⊕ B is n-generic.
Now suppose that (i) holds. First suppose that A is not n-generic. Let T ⊆ 2^{<ω} be a Σ0n set such that no member of T is an initial segment of A and every initial segment of A has an extension in T. Let T̂ = {σ ⊕ τ : σ ∈ T ∧ τ ∈ 2^{<ω}}. Then T̂ is Σ0n, no member of T̂ is an initial segment of A ⊕ B, and every initial segment of A ⊕ B has an extension in T̂, contradicting the assumption that A ⊕ B is n-generic. Thus A is n-generic.
Now suppose that B is not n-generic relative to A. Let S be a Σ^A_n set such that no member of S is an initial segment of B and every initial segment of B has an extension in S. There is a Σ0n relation R with S = {τ : ∃n R(A ↾ n, τ)}. Let Ŝ = {σ ⊕ τ0^{(|σ|−|τ|)} : |τ| ≤ |σ| ∧ R(σ, τ)}. Then Ŝ is Σ0n, and for all σ ⊕ τ ≺ A ⊕ B, we have σ ⊕ τ ∉ Ŝ. However, for all σ ⊕ τ ≺ A ⊕ B, there exists σ̄ ⊕ τ̄ ⪰ σ ⊕ τ with σ̄ ⊕ τ̄ ∈ Ŝ, contradicting the assumption that A ⊕ B is n-generic.
Another example of a theorem about randomness that has an analog for genericity is Theorem 8.5.3.
Theorem 8.20.2 (Csima, Downey, Greenberg, Hirschfeldt, and Miller [82]). Let B be 1-generic relative to X and let A ≤T B be 2-generic. Then A is 1-generic relative to X.
Proof. Let Φ be a reduction such that Φ^B = A. Let W ⊆ 2^{<ω} be c.e. in X. Suppose that no initial segment of A is in W. Let V = {σ ∈ 2^{<ω} : ∃τ ⪯ Φ^σ (τ ∈ W)}. Then V is c.e. in X, and no initial segment of B is in V. Thus there is some τ ≺ B with no extension in V. Let U = {μ ∈ 2^{<ω} : ¬∃σ ⪰ τ (μ ⪯ Φ^σ)}. Then U is co-c.e. and no initial segment of A is in U, so there is some μ ≺ A with no extension in U. If ν ⪰ μ then there is a σ ⪰ τ such that ν ⪯ Φ^σ, so if ν ∈ W then σ ∈ V, contradicting the choice of τ. Thus μ is an initial segment of A with no extension in W.
Corollary 8.20.3 (Csima, Downey, Greenberg, Hirschfeldt, and Miller [82]). If B is n-generic and A ≤T B is 2-generic, then A is n-generic.
Proof. Let X = ∅^{(n−1)} in Theorem 8.20.2.
It is natural to ask whether the full analog of Theorem 8.5.3 holds for genericity, that is, whether a 1-generic set below an n-generic set is automatically n-generic. The following result, whose proof can be found in [82], gives a negative answer to this question.
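The string join σ ⊕ τ used in the proof of Theorem 8.20.1 can be written out directly; a minimal sketch:

```python
def join(sigma, tau):
    # sigma (+) tau: interleave the two strings, truncating to
    # length 2 * min(|sigma|, |tau|) as in the proof above.
    m = min(len(sigma), len(tau))
    return "".join(sigma[n] + tau[n] for n in range(m))
```

For example, join("010", "10") interleaves to "0110", the extra bit of the longer string being discarded.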
Theorem 8.20.4 (Csima, Downey, Greenberg, Hirschfeldt, and Miller [82]). Every 1-generic set computes a 1-generic set that is not weakly 2-generic.
8.20.2 n-genericity versus n-randomness We now discuss the interplay between n-genericity and n-randomness. In general, genericity and randomness are antithetical. We already saw in Proposition 8.11.9 that 1-genericity is enough to ensure that a set is not even Schnorr random. For 1-randomness, we have the following stronger
result. (As we have seen in Theorem 8.11.7, for weak 1-randomness, the situation is quite different.)
Theorem 8.20.5 (Demuth and Kučera [96]). If A is 1-generic and B ≤T A then B is not 1-random.
Proof. By Theorem 2.24.5, the degree of B is not DNC, and hence by Theorem 8.8.1, B is not 1-random.
Since Ω ≡T ∅′ and there are Δ02 1-generics, this theorem works only in one direction. Kurtz [228] showed that the upward closure of the weakly 2-generic degrees has measure 0. (We will see in Theorem 8.21.3 that this result cannot be extended to 1-genericity.) To prove this result, we need the following lemma, whose proof is fairly typical (no pun intended) of measure-theoretic arguments in computability theory.
Lemma 8.20.6 (Kurtz [228]). There is an f ≤T ∅′ such that for almost every A, the function f dominates every A-computable function.
Proof. It is enough to build uniformly ∅′-computable functions f_{n,k} such that for each n and k, for measure 1 − 2^{−(n+k+1)} many sets A, if Φ^A_k is total then f_{n,k} majorizes Φ^A_k. We can then let f(m) = max_{k,n ≤ m} f_{n,k}(m), and it is easy to check that f is as required, since it dominates every f_{n,k}.
We define f_{n,k}(m) as follows. Let S = {A : Φ^A_k(m)↓}. It is easy to see that μ(S) is a left-c.e. real, so using ∅′, we can find a rational q such that q < μ(S) and μ(S) − q ≤ 2^{−(n+k+m+2)}. We can then find a prefix-free finite set F of strings such that σ ∈ F ⇒ Φ^σ_k(m)↓ and μ(⟦F⟧) > q. Let f_{n,k}(m) = max{Φ^σ_k(m) : σ ∈ F}.
This definition ensures that, for each m, the class of all A such that Φ^A_k(m)↓ > f_{n,k}(m) has measure less than 2^{−(n+k+m+2)}. Thus the class of all A such that Φ^A_k is total but not dominated by f_{n,k} has measure less than Σ_m 2^{−(n+k+m+2)} = 2^{−(n+k+1)}.
We will examine this idea of "almost everywhere domination" further in Section 10.6.
Theorem 8.20.7 (Kurtz [228]). The upward closure of the weakly 2-generic degrees has measure 0.
Proof. Let f be as in Lemma 8.20.6.
Recall that the principal function p_G of an infinite set G = {a_0 < a_1 < ···} is defined by p_G(n) = a_n, and that for a string σ, we let p_σ(k) be the position of the kth 1 in σ, or undefined if there is no such position. Let S_n be the set of strings σ such that for at least n many k, we have p_σ(k) > f(k). Each S_n is a Δ02 dense set of strings, so if G is weakly 2-generic, then it meets each S_n, which implies that p_G is not dominated by f. So if a set computes a weakly 2-generic set, then it must be in the measure 0 class of sets computing functions not dominated by f.
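The function p_σ from the proof above is immediate to compute; a minimal sketch:

```python
def p_sigma(sigma, k):
    # Position of the kth 1 in sigma (k = 0, 1, ...), or None if none exists.
    positions = [i for i, b in enumerate(sigma) if b == "1"]
    return positions[k] if k < len(positions) else None
```

For σ = "0101001" the 1s sit at positions 1, 3, and 6, so p_sigma(σ, 2) returns 6.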
8.21 Properties of almost all degrees Randomness is, of course, deeply connected with “almost everywhere” behavior, since any property that holds of almost all sets holds of all sufficiently random sets. In this section, we prove several results about computability theoretic properties that hold of almost all degrees. We begin with Martin’s Theorem 8.21.1 that almost all degrees are hyperimmune. We then proceed to three theorems due to Kurtz [228], with related proofs of increasing complexity: Theorem 8.21.3 that almost every degree bounds a 1-generic degree, Theorem 8.21.8 that almost every degree is CEA, and Theorem 8.21.16 that for almost every degree a, the 1-generic degrees are downwards dense below a. We will also comment on the levels of effective randomness needed to ensure that a particular set has each of these properties. Of course, Theorem 8.21.16 implies Theorem 8.21.3 (as does Theorem 8.21.8, as we will see in Theorem 8.21.15), which in turn implies Theorem 8.21.1, since every 1-generic degree is hyperimmune, but presenting the simpler proofs of Theorems 8.21.1 and 8.21.3 will help in explaining Kurtz’ method of risking measure, which is an elaboration of the method used in the proof of Martin’s Theorem. (Similar ideas had even earlier been used by Paris [314], in proving Corollary 8.21.23 below.)
8.21.1 Hyperimmunity
Theorem 8.21.1 (Martin [unpublished]). Almost every degree is hyperimmune.
Proof. By Kolmogorov's 0-1 Law, Theorem 1.2.4, it suffices to prove that there are positive measure many sets of hyperimmune degree. To do so, we construct a functional Ξ so that for positive measure many A, we have that Ξ^A is total and not majorized by any computable function. We will meet the following requirements for positive measure many sets A.
R_e: Φ_e total ⇒ Ξ^A is not majorized by Φ_e.
The most naive attempt to meet R_e is to choose a witness n_e and define Ξ^A(n_e) = Φ_e(n_e) + 1 for all A. The problem with this approach, of course, is that Φ_e(n_e) may not be defined, and we need to make Ξ^A be total for positive measure many sets A. Thus we begin with a small class of A's, choose a witness n^0_e, and define Ξ^A(n^0_e) = Φ_e(n^0_e) + 1 for all A in our class. If Φ_e(n^0_e)↑, then Ξ^A(n^0_e)↑ for all such A, but R_e is automatically met, because Φ_e is not total. If Φ_e(n^0_e)↓, then we eventually satisfy R_e for all A in our class, and can then move to a new class of sets, using a new witness n^1_e. This process can continue until we either cover all sets or have a final class C of sets for which we have appointed a harmful witness n^k_e such that Φ_e(n^k_e)↑, whence Ξ^A(n^k_e)↑ for all A ∈ C. By ensuring that the measure of
8. Algorithmic Randomness and Turing Reducibility
C is at most 2^{-(e+2)}, we can restrict the total effect of harmful witnesses in the construction to a class of measure at most 1/2.
The R_e work independently. Let ν^0_i, . . . , ν^{2^i−1}_i list 2^i in lexicographic order. Then R_e begins by working only above ν^0_{e+2}, appointing a witness n^0_e for all extensions of ν^0_{e+2}. We ask that Ξ^A(n^0_e)↑ for all A extending ν^0_{e+2} unless we see that Φ_e(n^0_e)↓, at which point we define Ξ^A(n^0_e) = Φ_e(n^0_e) + 1 for all such A. We then begin working above ν^1_{e+2}, appointing a fresh large witness n^1_e for all extensions of ν^1_{e+2}. Again we ask that Ξ^A(n^1_e)↑ for all A extending ν^1_{e+2} unless we see that Φ_e(n^1_e)↓, at which point we define Ξ^A(n^1_e) = Φ_e(n^1_e) + 1 for all such A and begin working above ν^2_{e+2}, and so on. At each stage s, if no R_e has asked that Ξ^A(s)↑ or already defined Ξ^A(s), then we define Ξ^A(s) = 0. Since each R_e can get stuck on at most one cone, and that cone has measure 2^{-(e+2)}, we see that Ξ^A fails to be total on a class of measure at most Σ_e 2^{-(e+2)} = 1/2, and for all other A, each R_e is met.
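The measure bookkeeping in this proof is just a geometric series, and can be sanity-checked mechanically. The following sketch is our own illustration, not part of the proof: it tallies the worst-case harmful cones, one per requirement R_e, each of measure 2^{-(e+2)}.

```python
from fractions import Fraction

def harmful_budget(num_requirements: int) -> Fraction:
    """Worst-case total measure of the cones on which Xi^A may fail to be
    total: each R_e can get stuck on at most one cone of measure 2^-(e+2)."""
    return sum((Fraction(1, 2 ** (e + 2)) for e in range(num_requirements)),
               Fraction(0))

# The partial sums approach 1/2 from below, so Xi^A is total (and each
# requirement R_e is met) on a class of measure at least 1/2.
assert harmful_budget(1) == Fraction(1, 4)
assert harmful_budget(10) < Fraction(1, 2)
assert Fraction(1, 2) - harmful_budget(50) == Fraction(1, 2 ** 51)
```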
Kautz [200] noted the following feature of the above proof. The class of all A such that Ξ^A is total is a Π^0_2 class of positive measure and hence, by Theorem 6.10.2, contains elements of all 2-random degrees. Furthermore, by construction, for every A in this class, Ξ^A is not majorized by any computable function, so we have the following result.

Theorem 8.21.2 (Kautz [200]). If a is 2-random, then a is hyperimmune.

Thus, the class of sets of hyperimmune degree includes every 2-random set but not every 1-random set. We have already seen, in Theorem 8.11.12, that this result cannot be extended from 2-randomness to weak 2-randomness. Note that this fact shows that Theorem 6.10.2 cannot be extended from n-randomness to weak n-randomness. Kurtz [228] showed that there is no single total computable operator Ψ such that for almost every A, the function Ψ^A is not majorized by any computable function.

Theorem 8.21.2 can also be obtained by the following elegant argument of Nies, Stephan, and Terwijn [308]. By Theorem 6.11.7, there is a computable g such that a set A is 2-random iff C^g(A ↾ n) ≥ n − O(1) for infinitely many n. Fix a 2-random set A and let d be such that C^g(A ↾ n) ≥ n − d for infinitely many n. Let f(k) be the least n such that there are p_0, . . . , p_{k−1} ≤ n with C^g(A ↾ p_i) ≥ p_i − d for i < k. Then f is A-computable. We claim that f is not majorized by any computable function. Assume for a contradiction that h is computable and majorizes f. Define the Π^0_1 class P = {Z : ∀k ∃p_1, . . . , p_k ≤ h(k) (C^g(Z ↾ p_i) ≥ p_i − d)}. Then P ≠ ∅, and every member of P is 2-random. Since P must have a Δ^0_2 member and there are no Δ^0_2 sets that are 2-random, we have a contradiction.
8.21.2 Bounding 1-generics

The following result follows from Theorem 8.21.8 below, but discussing its proof will help us to understand the more difficult proofs of Theorems 8.21.8 and 8.21.16. It should be contrasted with Theorem 8.20.7.

Theorem 8.21.3 (Kurtz [228]). Almost every set computes a 1-generic set.

Proof Sketch. By Kolmogorov’s 0-1 Law, Theorem 1.2.4, it suffices to prove that there are positive measure many sets computing 1-generic sets. Let V_0, V_1, . . . be an effective enumeration of the c.e. sets of strings. We build a functional Ξ so that for positive measure many sets A, we have that Ξ^A is total and the following requirements are met.

R_e : Ξ^A meets or avoids V_e.

It will be convenient to think of Ξ as a partial computable function from strings to strings, but we write Ξ^σ instead of Ξ(σ). Recall from Section 2.9 that the main rule we need to observe is that for σ ≺ τ, if Ξ^σ↓ and Ξ^τ↓, then Ξ^σ ⪯ Ξ^τ. Note that it is not necessary that if Ξ^τ↓ then Ξ^σ↓ for all σ ≺ τ.

Let us first discuss how to meet the single requirement R_0. We would like to find some σ ∈ V_0 and ensure that Ξ^A ⪰ σ for all A, thereby meeting R_0. However, V_0 might be empty, and we cannot wait forever to define Ξ. Thus, as in the proof of Theorem 8.21.2, we set aside a portion of 2^ω of relatively small measure where we choose not to define Ξ while waiting for some σ to occur in V_0. For R_0, we divide 2^ω into two portions, ⟦00⟧ and its complement. We think of ⟦00⟧ as taking the role of ν^0_2 in our proof of Theorem 8.21.2. While we are waiting for some string σ to occur in V_0, we pursue a strategy based on the belief that V_0 = ∅, and hence R_0 is automatically satisfied. This strategy builds the reduction Ξ within the complement of ⟦00⟧. Thus, we will never define Ξ^A for any A ∈ ⟦00⟧ unless some σ enters V_0. Should we find such a σ at a stage s, then we define Ξ^{00} = σ; that is, we ensure that Ξ^A ⪰ σ for all A ∈ ⟦00⟧, hence meeting R_0 within ⟦00⟧.
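The consistency rule just recalled, that Ξ^σ ⪯ Ξ^τ whenever σ ≺ τ and both values are defined, can be checked mechanically for a finite partial map. The sketch below is our own illustration (binary strings represented as Python strings), not part of the proof:

```python
def is_prefix(a: str, b: str) -> bool:
    """a is an initial segment of b (non-strict)."""
    return b.startswith(a)

def consistent_functional(xi: dict) -> bool:
    """Check the rule from Section 2.9 for a finite partial map
    xi : strings -> strings: whenever sigma is a prefix of tau and both
    xi[sigma] and xi[tau] are defined, xi[sigma] must be a prefix of
    xi[tau].  Note that xi need not be defined on all prefixes of a
    string on which it is defined."""
    return all(is_prefix(xi[sigma], xi[tau])
               for sigma in xi for tau in xi if is_prefix(sigma, tau))

# Consistent fragment: 0 -> 1 and 01 -> 10, and "1" is a prefix of "10".
assert consistent_functional({"0": "1", "01": "10"})
# Inconsistent: 0 -> 1 but 01 -> 01, and "1" is not a prefix of "01".
assert not consistent_functional({"0": "1", "01": "01"})
```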
At this point, following the model of the proof of Theorem 8.21.2, we would like to switch attention to ⟦01⟧, trying to satisfy R_0 within that cone. The main problem not found in the proof of Theorem 8.21.2 is that we can no longer satisfy R_0 by picking a single new witness and defining the value of Ξ^A on that witness, independent of other values on which Ξ^A may already have been defined. While we were waiting for a σ to enter V_0, other strategies might well have already defined Ξ^ν for several ν ∈ 2^s extending 01, 10, and 11. We might have such ν_0 and ν_1 with Ξ^{ν_i} = δ_i and δ_0 | δ_1. To satisfy R_0 within ⟦ν_i⟧, we need to seek a γ_i ⪰ δ_i in V_0. It might be the case that, say, there is no such γ_0 in V_0 but there is such a γ_1 in V_0. Thus we need to pursue completely separate strategies above each ν_i. In particular, we
cannot work only above 01, since V_0 may contain no extensions of Ξ^{01} and yet contain extensions of Ξ^{10}, say. The solution to this problem is to pursue independent constructions above all strings of length s in the complement of ⟦00⟧, each of which is basically a copy of the original construction.

To organize these multiple constructions, we will color certain strings, using four basic colors: red, blue, green, and yellow. Each of these will also have a subscript denoting what requirement the color refers to, so that red_4, say, is red for R_4.¹³ In the beginning, we give the empty string λ the color blue_0, indicating that a strategy for R_0 is being carried out above it. We also give the string 00 the color red_0, indicating that we are actively working above 00. As before, we would like to find some σ ∈ V_0 and define Ξ^{00} = σ, thereby meeting R_0 within the cone ⟦00⟧. If we actually get to meet R_0 in this way within this cone, then the color of 00 changes to green_0, indicating that we have satisfied R_0 within ⟦00⟧. While we are waiting for some σ to appear in V_0, we assign the color yellow_0 to the strings of length 2 not colored red_0. The idea is that if 00 becomes green_0 then we try to deal with the yellow_0 strings (via extensions). While we are acting in this way for R_0, we assign R_1 to ⟦01⟧, giving 01 the color blue_1, giving 0100 the color red_1, and giving the other length 2 extensions of 01 the color yellow_1. As before, if no σ ever appears in V_0, we satisfy R_0 everywhere.

Now suppose that there is a stage s at which 00 becomes green_0 because we see some σ enter V_0 and get to define Ξ^{00} = σ. At this stage, we deal with the yellow_0 strings (namely 01, 10, and 11). As discussed above, it may well be the case that there are extensions ν of these strings for which Ξ^ν has already been defined, say by the action of R_1. Each such extension has length at most s, so for convenience we can work above the strings ν_0, . . . , ν_p of length s extending these yellow_0 strings, first defining Ξ^{ν_i} = Ξ^σ for the longest σ ⪯ ν_i for which Ξ has been defined. We first remove the color yellow_0 from the length 2 strings. We then give each ν_i the color blue_0 and begin a new construction at each of them that is a clone of the basic construction. That is, we make each ν_i00 a red_0 string, and give the other length 2 extensions of the ν_i the color yellow_0. Now the goal of R_0 via each red_0 string ν_i00 is to find a string σ_i ∈ V_0 with σ_i ⪰ Ξ^{ν_i}, and then make ν_i00 green_0 by defining Ξ^{ν_i00} = σ_i. Of course, once R_0 is satisfied within the cone ⟦00⟧, we can have R_1 take the role of R_0 and satisfy it within that cone precisely as we did for R_0. More generally, R_e will have a number of blue_e nodes ν, each with all its length e + 2 extensions colored yellow_e, save one such extension ρ that is
13 For those familiar with the terminology, we mention that these are really e-states, and the argument to follow is a kind of full approximation argument.
colored red_e. This string will turn green_e once we find a γ ∈ V_e extending Ξ^ν and define Ξ^ρ = γ. The key point is that each time we attempt to satisfy R_e above some string ν, either the red_e extension ρ of ν never becomes green_e, in which case R_e is satisfied within ⟦ν⟧, or it does, in which case R_e is satisfied within ⟦ρ⟧ and the construction can continue to try to satisfy R_e within the yellow_e extensions of ν. We can then argue that the R_e are satisfied on a class of positive measure. We will give more details of this construction and its analysis in the proof of Theorem 8.21.8.

Kautz [200] analyzed the proof of Theorem 8.21.3 and showed the following.

Theorem 8.21.4 (Kautz [200]). Every 2-random set computes a 1-generic set.

In Theorem 8.21.8, we will show that every 2-random set is CEA, and in Theorem 8.21.15, that every CEA set computes a 1-generic, from which Theorem 8.21.4 will follow. We will also give a direct proof at the end of Section 8.21.3. By Theorem 8.11.12, there are weakly 2-random sets of hyperimmune-free degree, and by Theorem 2.24.12, none of these can compute a 1-generic. Thus Theorem 8.21.4 cannot be extended from 2-randomness to weak 2-randomness.

Kurtz [228] pointed out the following corollary to the above results.

Corollary 8.21.5 (Kurtz [228]). For almost all degrees a, and indeed for any 2-random degree a, the degrees below a are not densely ordered.

Proof. Let a bound a 1-generic degree b. By Theorem 2.24.9, b is c.e. in some strictly lower degree. The corollary now follows by applying the relativized form of Theorem 2.18.5 that every nonzero c.e. degree bounds a minimal degree.

Although almost every degree bounds a 1-generic degree, almost no degrees are bounded by a 1-generic degree.

Theorem 8.21.6 (Kurtz [228]). Almost no set is computable in a 1-generic set.

Proof. Suppose otherwise. Then there must be a Φ and a k for which the set of A such that Φ^G = A for some 1-generic G has measure greater than 2^{-k}.
We obtain a contradiction by building a c.e. S ⊆ 2^{<ω} such that μ({A : A = Φ^B for some B meeting S}) ≤ 2^{-k}, and if G is 1-generic, then either G meets S or Φ^G is not total. Let σ_0, σ_1, . . . be an effective listing of 2^{<ω}. Let f(i) be the least n such that σ_n ⪰ σ_i and Φ^{σ_n} is defined on at least k + i + 1 many inputs, if such an n exists, and let f(i) be undefined otherwise. Let S = {σ_{f(i)} : f(i)↓}.
Clearly, S is c.e. If f(i)↓, then the class of all A such that A = Φ^B for some B ⪰ σ_{f(i)} has measure at most 2^{-(k+i+1)}, so μ({A : A = Φ^B for some B meeting S}) ≤ Σ_i 2^{-(k+i+1)} = 2^{-k}. Now let G be 1-generic. If G does not meet S then there is an i such that σ_i ≺ G and no extension of σ_i is in S, which implies that f(i)↑, and hence Φ^G can be defined on at most k + i many inputs.

There is a local characterization of the sets computable in 1-generic ones. A c.e. S ⊆ 2^{<ω} is a Σ^0_1-cover of a set A if for each σ ≺ A there is a τ ⪰ σ in S. This Σ^0_1-cover is proper if no element of S is an initial segment of A. It is tight if it is proper and for each Σ^0_1-cover S′ of A, there are σ ∈ S and τ ∈ S′ such that σ ⪯ τ.¹⁴ One might think of a tight cover as a “simple set for covers”. The following result completely classifies the sets that are computable in 1-generic ones.

Theorem 8.21.7 (Chong and Downey [69, 70]). A set A has no tight Σ^0_1-cover iff there is a 1-generic G ≥_T A.

The proof in [70] shows that there is a single procedure Φ such that if A is computable in a 1-generic set, then there is a 1-generic G ≥_T A such that A = Φ^G. Notice that, by Theorems 8.21.6 and 8.21.7, almost every set has a tight Σ^0_1-cover.
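For finite approximations, the covering conditions in these definitions can be tested directly. The sketch below is our own illustration and captures only a necessary finite fragment of the genuine Σ^0_1 notions (since A is an infinite set and S is merely c.e.):

```python
def covers_prefixes(S, a: str) -> bool:
    """Finite fragment of the Sigma^0_1-cover condition: every proper
    prefix sigma of the finite approximation a to A has an extension in S."""
    return all(any(t.startswith(a[:i]) for t in S) for i in range(len(a)))

def proper_so_far(S, a: str) -> bool:
    """Necessary condition for properness: no element of S is an
    initial segment of the approximation a."""
    return not any(a.startswith(t) for t in S)

S = {"01", "10"}
assert covers_prefixes(S, "00")        # prefixes "" and "0" both extend into S
assert not covers_prefixes(S, "001")   # the prefix "00" has no extension in S
assert not proper_so_far(S, "011")     # "01" in S is an initial segment
```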
8.21.3 Every 2-random set is CEA

In Theorem 2.24.9 we saw that every 1-generic set A is CEA, that is, computably enumerable relative to some X <_T A.

Theorem 8.21.8 (Kurtz [228]). Almost every set is CEA.
Proof. We construct a functional Ξ so that μ({A : Ξ^A total ∧ Ξ^A <_T A}) ≥ 1/4 and A is c.e. in Ξ^A whenever Ξ^A is total. By Kolmogorov’s 0-1 Law, Theorem 1.2.4, the existence of such a functional is enough to conclude that almost every set is CEA. At the end of the proof we will argue that 2-randomness suffices.

To make A c.e. in Ξ^A we will ask that n ∈ A iff ⟨n, m⟩ ∈ Ξ^A for some m. We say that a string ξ is acceptable for a string σ if ξ(⟨n, m⟩) = 1 ⇒ σ(n) = 1. We will always require that Ξ^σ[s] be acceptable for σ. Of course, restricting ourselves to acceptable strings is not enough, since it does not ensure that if n ∈ A then ⟨n, m⟩ ∈ Ξ^A for some m. We will explicitly take care of that condition during the construction (which will be easy to do since for every σ and n there is a τ ⪰ σ with τ(⟨n, m⟩) = 1 for some m). We will also meet requirements of the form

R_e : A ≠ Φ_e^{Ξ^A}.

In the proof of Theorem 8.21.3, there are blue_e strings, each of which is the base of a cone within which we are attempting to meet R_e. Above each such string, there is a tree of strings whose leaves are yellow_e, except for one red_e leaf above which we are testing the c.e. set of strings V_e in an attempt to turn the red_e node green_e. The method of this construction is in some ways similar, but more complex. As before, we attempt to meet R_e in the cone above a blue_e string ν. We give the color red_e to ν1^{e+2}, and give the other extensions of ν of the same length the color yellow_e. However, the role of the red_e node will be different in this construction, and there are now new players, the purple_e strings. These represent cones R_e forbids us to build within during the construction, because we seem to be failing to satisfy R_e within these cones, as explained below. We cannot afford to allow the measure of such cones to become too large, so the role of the red_e strings is, as Kurtz put it, to act as “safety valves” for the pressure exerted by the purple_e strings. If the measure of the cones corresponding to purple_e strings becomes too large, the “safety valve” pops, and a backup strategy allows us to succeed. The following is the key definition.

Definition 8.21.9. At a stage s, we say that a string θ ∈ 2^s threatens a requirement R_e if there are ν ≺ τ ⪯ θ such that ν is blue_e and τ is yellow_e, and a ξ ∈ 2^s that is acceptable for θ, such that

(i) Ξ^τ[s] ⪯ ξ,

(ii) Φ^ξ_e(k)[s] = τ(k) = 0 for some k such that |ν| < k ≤ |τ|, and

(iii) no initial segment of θ has color purple_j for j ≤ e.

In the construction, we will give any string that threatens R_e the color purple_e. If θ is purple_e, then its image under Ξ could be an initial segment of a set B such that Φ^B_e = A for some A extending θ. We hope to avoid
having Ξ^A = B for such A and B, so we stop building Ξ above purple_e nodes. If A ⪰ τ for a yellow_e string τ and no initial segment of A is purple_j for j ≤ e, then we cannot have A = Φ_e^{Ξ^A}, and hence A satisfies R_e. To see that this is the case, suppose otherwise. Then Φ_e^{Ξ^A}(k) = τ(k) = 0 for some k with |ν| < k ≤ |τ|, since τ ≠ ν1^{e+2}. But then some initial segment of A extending τ will eventually threaten R_e, and hence be colored purple_e. Thus, if there are not too many purple_e strings, then R_e is met on a class of sufficiently large measure. Our main concern is what to do when we do have too many purple_e strings.

For each purple_e string θ, let θ′ be the unique string of length |θ| extending the red_e string ρ ≻ ν with θ(m) = θ′(m) for all m ≥ |ρ|. Note that item (ii) above implies that there is a k such that Φ^ξ_e(k) = τ(k) = 0 ≠ 1 = ρ(k), since ρ = ν1^{e+2}. Thus, if we choose to define Ξ^A on sets A ≻ ρ to emulate Ξ^ξ, then we cannot have Φ_e^{Ξ^A} = A for such A. If the density of purple_e strings above ν grows too big, then using this fact we will be able to satisfy R_e above these θ′, giving them color green_e. (By density here we mean the measure of the union of the cones ⟦σ⟧ above purple_e strings σ divided by the measure of ⟦ν⟧.)

More precisely, when the density of purple_e strings above the blue_e string ν exceeds 2^{-(e+3)}, there must be some yellow_e string τ such that the density of purple_e strings above τ also exceeds 2^{-(e+3)}. Let θ_0, . . . , θ_n be the purple_e strings above such a τ. For each θ_i, let ξ_i be a string witnessing the threat to R_e as in Definition 8.21.9. Then (i) ξ_i is acceptable for θ_i, (ii) Ξ^τ ⪯ ξ_i, and (iii) Φ^{ξ_i}_e(k) = 0 = θ_i(k) for some k with ρ(k) = 1. The construction will be such that Ξ^ρ[s] = Ξ^τ[s]. Now we act as follows. First, any string extending ν loses its color, including ν itself. (We are considering R_e in isolation here.
In the presence of other requirements, strings colored purple_i for i < e do not lose their colors, to respect the priority order of requirements.) Then we give each θ′_i color green_e. Finally, for each i ≤ n, we define Ξ^{θ′_i} = ξ_i, which ensures that Φ_e^{Ξ^{θ′_i}}(k) ≠ θ′_i(k) for some k with |ν| < k ≤ |τ|, justifying the use of the color green_e for the strings θ′_i. Note that each time we are forced to pop our safety valve in this way, we are guaranteed to satisfy R_e on a set of relative measure at least 2^{-(2e+5)} above ν. We will be able to show that this measure is sufficient for our purposes. Notice also that we are now able to replicate this construction on strings extending those that have lost their color. We now proceed with the formal details.
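Before the formal construction, two bookkeeping notions used above can be made concrete: the acceptability condition (relative to some fixed computable pairing ⟨n, m⟩; the Cantor pairing below is one arbitrary choice) and the density of a set of strings above a cone. This is our own illustrative sketch, not part of the proof:

```python
from fractions import Fraction

def pair(n: int, m: int) -> int:
    """Cantor pairing <n, m>; the proof only needs some fixed computable
    pairing function, and this is one common choice."""
    return (n + m) * (n + m + 1) // 2 + n

def acceptable(xi: str, sigma: str) -> bool:
    """xi is acceptable for sigma: xi(<n, m>) = 1 implies sigma(n) = 1,
    checked on the finite portions represented by the two strings."""
    return all(not (k < len(xi) and xi[k] == "1" and sigma[n] != "1")
               for n in range(len(sigma))
               for m in range(len(xi))
               for k in [pair(n, m)])

def density_above(strings, nu: str) -> Fraction:
    """Measure of the union of the cones above the pairwise-incomparable
    strings in `strings` that extend nu, divided by the measure of the
    cone above nu."""
    rel = [s for s in strings if s.startswith(nu)]
    total = sum((Fraction(1, 2 ** len(s)) for s in rel), Fraction(0))
    return total / Fraction(1, 2 ** len(nu))

# <0, 0> = 0, so a 1 in position 0 of xi demands sigma(0) = 1.
assert acceptable("1", "10")
assert not acceptable("1", "01")
# Two length-4 strings above nu = "0": density (1/16 + 1/16) / (1/2) = 1/4.
assert density_above({"0110", "0011"}, "0") == Fraction(1, 4)
```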
Construction. At each stage s, we let D_s be the set of strings σ on which we have defined Ξ^σ[s]. The leaves of D_s are called the active strings at stage s + 1.

Stage 0. Assign λ the color blue_0, assign 00, 01, and 10 the color yellow_0, and assign 11 the color red_0.

Stage s + 1. This stage consists of four substages.

Substage 1. For each e ≤ s and each σ ∈ D_s, if σ threatens R_e then color σ purple_e.

Substage 2. For each e ≤ s and each blue_e string ν, if the density of purple_e strings extending ν is at least 2^{-(e+3)}, then proceed as follows. (We say that R_e acts for ν.) Let ρ be the red_e string extending ν, and τ a yellow_e string extending ν with the highest density of purple_e strings extending it. Let θ_0, . . . , θ_n be the purple_e strings extending τ, and let ξ_0, . . . , ξ_n be corresponding witnesses for the θ_i’s threat to R_e. For each i ≤ n, let θ′_i be the string of length |θ_i| extending ρ with θ′_i(m) = θ_i(m) for all m ≥ |ρ|. Declare that every σ ⪰ ν loses its color unless it is colored purple_j for some j < e. There are now two cases.

Case 1. ρ has an extension with color purple_j for some j < e. In this case, do nothing.

Case 2. Otherwise. Then for each i ≤ n, let Ξ^{θ′_i} = ξ_i. Give each θ′_i color green_e.

Substage 3. For each active string σ that is not red_e for any e and has no substring that is purple_e for any e, proceed as follows. Let e be least such that σ has no substring colored green_e or yellow_e. Let ξ be the lexicographically least string such that σ(n) = 1 iff ξ(⟨n, m⟩) = 1 for some m. (Note that ξ is acceptable for σ.) Let Ξ^{σ0} = Ξ^{σ1} = ξ, and give σ0 and σ1 color blue_e.

Substage 4. For each e and each active blue_e string ν, let Ξ^{νσ} = Ξ^ν for all σ ∈ 2^{e+2}. Give ν1^{e+2} color red_e. Give the other length e + 2 extensions of ν color yellow_e.

End of Construction.

Verification. We will need the following color classes.

B_e = {A : ∃σ ≺ A (σ has final color yellow_e or green_e)}.

S_e = {A : ∃σ ≺ A (σ has final color red_e)}.
P_e = {A : ∃σ ≺ A (σ has final color purple_e)}.

Let S = ⋃_e S_e and P = ⋃_e P_e.

Lemma 8.21.10. Let A ∈ ⋂_e B_e. Then Ξ^A is total, A is c.e. in Ξ^A, and A ≰_T Ξ^A.
Proof. Substages 3 and 4 of the construction ensure that Ξ^A is total and that n ∈ A iff ⟨n, m⟩ ∈ Ξ^A for some m, whence A is c.e. in Ξ^A. Thus we are left with showing that A ≰_T Ξ^A. Assume for a contradiction that there is a least e such that A = Φ_e^{Ξ^A}. Let σ ≺ A have final color yellow_e or green_e.

Case 1. σ has final color green_e. Then, by construction, at some stage we define Ξ^σ = ξ for some ξ with Φ^ξ_e(k) = 0 ≠ 1 = σ(k) for some k, which is a contradiction.

Case 2. σ has final color yellow_e. Let ρ be the associated red_e string. There is a k such that ρ(k) = 1 ≠ 0 = σ(k). By our assumption, σ(k) = Φ_e^{Ξ^A}(k). Let ξ be an initial segment of Ξ^A such that Φ^ξ_e(k) = 0. Then ξ is acceptable for σ, and thus σ threatens R_e via ξ at all sufficiently large stages. Thus σ is colored purple_e unless some initial segment of it is colored purple_j for some j ≤ e. In any case, there is a τ ⪯ σ that is colored purple_j for some j ≤ e at some point in the construction. The string τ can never later lose its color, for if it did, then σ would also lose its yellow_e color. But then the construction above τ is eventually permanently halted, contradicting the assumption that A ∈ ⋂_e B_e.

We will now show through a sequence of lemmas that μ(⋂_e B_e) ≥ 1/4. Let B_{−1} = 2^ω.

Lemma 8.21.11. μ(B_{e−1} \ (B_e ∪ S ∪ P)) = 0.

Proof. We do the case e > 0. The e = 0 case is similar, using σ = λ. Assume for a contradiction that μ(B_{e−1} \ (B_e ∪ S ∪ P)) > 0. The class B_{e−1} is a disjoint union of clopen sets ⟦σ⟧ where σ has final color yellow_{e−1} or green_{e−1}. For some such σ, we have μ(⟦σ⟧ \ (B_e ∪ S ∪ P)) > 0. Let s_0 be the stage at which σ receives its final color. By the Lebesgue Density Theorem 1.2.3, there is a σ̂ ⪰ σ such that μ(⟦σ̂⟧ \ (B_e ∪ S ∪ P)) > 2^{−|σ̂|}(1 − 2^{−(2e+5)}).
In particular, ⟦σ̂⟧ is not contained in B_e ∪ S ∪ P, so no substring of σ̂ has final color red_i or purple_i for any i, and no substring of σ̂ has final color yellow_e, green_e, red_e, or purple_e, which implies that no substring of σ̂ has final color blue_e. We may assume that σ̂ is sufficiently long so that some substring τ of σ̂ is active at stage s_0.

First suppose that τ has a purple_i substring γ for some i at stage s_0. We know that γ will eventually lose its color. Suppose this loss of color comes from some R_j with j < e acting for a blue_j node ν. Then it is easy to check from the construction that ν ≺ σ, so R_j’s action also causes σ to lose its color, which is a contradiction. Thus γ’s loss of color must come from some R_j with j ≥ e acting for a blue_j node ν ⪯ τ. Now it is easy to check from the construction that some predecessor of τ must have color blue_e at some stage s ≥ s_0.

Now suppose that τ has no purple_i substrings for any i at stage s_0. It cannot be the case that τ is colored red_j for some j < e, since then σ would
not have its color. If τ is colored red_j for some j ≥ e, then it has a blue_e substring. Otherwise, it becomes blue_e at stage s_0. Thus, in either case, there is some stage s ≥ s_0 at which σ̂ has a blue_e substring. Let ν be the longest such substring. At some stage t, the string ν loses its color. By the same argument as above, this loss of color cannot come from the action of some R_j with j < e, since then σ would lose its color, and it cannot come from the action of R_j for j > e, since ν cannot have any blue_j substrings. Thus it comes from the action of R_e itself. So the density of purple_e strings above ν is at least 2^{-(e+3)} at stage t.

At the end of stage t, there are pairwise incompatible strings γ_0, . . . , γ_m such that ⟦ν⟧ = ⋃_{i≤m} ⟦γ_i⟧ and each γ_i is either green_e, blue_e, or has a purple_j substring for some j < e. If γ_i is green_e then that is its final color, and if γ_i has a purple_j substring for some j < e, then purple_j is that substring’s final color (in both cases by the same argument as above, that a loss of color would have to come from the action of some R_k with k < e, which would also cause σ to lose its color). In either case, σ̂ cannot extend γ_i. But if γ_i has color blue_e at stage t, then again σ̂ cannot extend γ_i, by the choice of ν. Thus σ̂ does not extend any γ_i, and hence ⟦σ̂⟧ is a union of sets of the form ⟦γ_i⟧. Since we are assuming that μ(⟦σ̂⟧ \ (B_e ∪ S ∪ P)) > 2^{−|σ̂|}(1 − 2^{−(2e+5)}), there must be an i such that μ(⟦γ_i⟧ \ (B_e ∪ S ∪ P)) > 2^{−|γ_i|}(1 − 2^{−(2e+5)}). Such a γ_i cannot have color green_e or have a substring with color purple_j for j < e, since, as noted above, in either case these colors would be final, and we would have ⟦γ_i⟧ ⊆ B_e ∪ S ∪ P. Thus γ_i is colored blue_e at stage t. This color cannot be final, since otherwise every set extending γ_i would have an initial segment with final color red_e or yellow_e, and hence we would have ⟦γ_i⟧ ⊆ B_e ∪ S ∪ P. So there is a stage at which γ_i loses its color.
By the same argument as above, this loss can happen only through the action of R_e, so there is a stage at which the density of purple_e strings above γ_i is at least 2^{-(e+3)}. Let ρ = γ_i1^{e+2} be the red_e string above γ_i. If ρ has a substring with color purple_j for some j < e then this substring’s color is final, so ⟦ρ⟧ ⊆ P, whence μ(⟦γ_i⟧ ∩ (B_e ∪ S ∪ P)) ≥ 2^{−(|γ_i|+e+2)}, which is a contradiction. Otherwise, it follows from the construction that the density of green_e strings above ρ is at least 2^{-(e+3)}, and, as argued before, each of these strings has green_e as its final color. Thus μ(⟦γ_i⟧ ∩ (B_e ∪ S ∪ P)) ≥ 2^{−(|ρ|+e+3)} = 2^{−(|γ_i|+2e+5)}, which again is a contradiction.

Lemma 8.21.12. B_{e+1} ⊆ B_e.

Proof. Let A ∈ B_{e+1}, and let σ ≺ A have final color green_{e+1} or yellow_{e+1}, with ν ≺ σ the associated blue_{e+1} string. When ν received its blue_{e+1} color, there must have been some string τ ≺ ν with color green_e or yellow_e, as otherwise ν would have been colored blue_e. Any action removing this color from τ would necessarily remove the color from σ as well, and hence τ’s color is final.
Lemma 8.21.13. μ(S ∪ P) + μ(⋂_e B_e) = 1.

Proof. We again let B_{−1} = 2^ω. Since (B_{e−1} \ B_e) ∩ (S ∪ P)^c = B_{e−1} \ (B_e ∪ S ∪ P), it follows from Lemma 8.21.11 that μ((B_{e−1} \ B_e) ∩ (S ∪ P)^c) = 0, and hence

μ(⋃_e ((B_{e−1} \ B_e) ∩ (S ∪ P)^c)) = 0.

By Lemma 8.21.12, ⋃_e (B_{e−1} \ B_e) ∪ ⋂_e B_e = 2^ω, and it is easy to see from the construction that (S ∪ P) ∩ (⋂_e B_e) = ∅, so S ∪ P ⊆ ⋃_e (B_{e−1} \ B_e), and hence μ(S ∪ P) = μ(⋃_e (B_{e−1} \ B_e)). Thus

μ(S ∪ P) + μ(⋂_e B_e) = μ(⋃_e (B_{e−1} \ B_e)) + μ(⋂_e B_e) = 1.

Lemma 8.21.14. μ(S) ≤ 1/2 and μ(P) ≤ 1/4.

Proof. By construction, the set of strings ν with final color blue_e is prefix-free. For any such ν, there is a single red_e node τ above ν, and μ(⟦τ⟧) = 2^{-(e+2)}μ(⟦ν⟧). Thus μ(S_e) ≤ 2^{-(e+2)}, and hence μ(S) ≤ 1/2. Similarly, the density of purple_e strings in ⟦ν⟧ for any blue_e string ν is bounded by 2^{-(e+3)}, and hence μ(P_e) ≤ 2^{-(e+3)}, whence μ(P) ≤ 1/4.

Putting together the previous two lemmas, we have μ(⋂_e B_e) ≥ 1/4, which together with Lemma 8.21.10 and Kolmogorov’s 0-1 Law, shows that almost every set is CEA.

The class {A : Ξ^A total} is a Π^0_2 class of positive measure, and hence, by Theorem 6.10.2, for every 2-random set X, there are A and σ such that X = σA and Ξ^A is total. If A ∈ S ∪ P then Ξ^A is not total, and if A ∈ ⋂_e B_e then Ξ^A
there is a least e ≤ s such that we have never acted for e and there is a σ ∈ V_e[c(s + 1)] with σ ⪰ G_s, then let G_{s+1} = σ. (In this case, we say we have acted for e.) Otherwise, let G_{s+1} = G_s. It is easy to check that lim_s |G_s| = ∞. Let G = lim_s G_s.

Assume for a contradiction that G is not 1-generic. We show that c ≤_T B, which is impossible, as it implies that A ≤_T B. Let e be such that G neither meets nor avoids V_e. Let s_0 be a stage after which we never act for a number less than e. Let s ≥ s_0 and assume we know G_s. Then we can find a stage t such that some extension of G_s occurs in V_e[t]. (Such an extension must exist, since G does not avoid V_e.) We must have t > c(s + 1), since otherwise we would have acted for e at stage s. Thus we can compute c(s + 1) using B (as the least u such that A_u ↾ (s + 1) = A_t ↾ (s + 1)), from which we can compute G_{s+1}. Proceeding by induction, we can compute c from B.

As mentioned following Theorem 8.21.4, there are weakly 2-random sets that do not compute any 1-generic sets. Thus there are weakly 2-random sets that are not CEA.

Theorems 8.21.8 and 8.21.15 together give us Kautz’ Theorem 8.21.4 that every 2-random set computes a 1-generic set. The following is a direct proof, applying the analysis in the proof of Theorem 8.21.8 to the similar (but simpler) construction in the proof of Theorem 8.21.3.

Proof of Theorem 8.21.4. Let Ξ be the functional built in the proof of Theorem 8.21.3. Then {A : Ξ^A total} is a Π^0_2 class of positive measure, and hence, by Theorem 6.10.2, contains elements of all 2-random degrees. Thus it is enough to show that if Ξ^A is total and A is 2-random, then Ξ^A is 1-generic. For the construction in the proof of Theorem 8.21.3, let

B_e = {A : ∃σ ≺ A (σ has final color yellow_e or green_e)},
S_e = {A : ∃σ ≺ A (σ has final color red_e)},
and S = ⋃_e S_e. (Recall that in this construction there are no purple strings.) Arguing as in the proof of Theorem 8.21.8, we can show that μ(S) + μ(⋂_e B_e) = 1. Let Y_e be the class of all A such that, infinitely often, some initial segment of A receives color yellow_e. Then Y_e is a Π^0_2 class, and Y_e ∩ S = Y_e ∩ ⋂_i B_i = ∅, so μ(Y_e) = 0. Thus if A is 2-random, then A ∉ Y_e for all e. We will show that then A ∈ ⋂_e B_e, which implies by construction that if Ξ^A is total then it is 1-generic, thus proving the theorem.

So we are left with showing that if A ∉ ⋂_e B_e then A ∈ Y_e for some e. Let e be least such that A ∉ B_e. We do the argument for the case e > 0; the case e = 0 is similar. Let s be such that for each i < e, some initial segment of A has received final color yellow_i or green_i by stage s. Let σ ≺ A have final color yellow_{e−1} or green_{e−1}. Suppose that τ ≺ A has some color with subscript e at a stage t > s. Recall
that if τ later loses its color through the action of some R_i, then τ must have a blue_i substring ν, and in this case every extension of ν loses its color. Thus τ can lose its color only through the action of R_e, since for i > e no substring of τ can be blue_i, while for i < e, any blue_i substring of τ is also a substring of σ, whose color is permanent.

Now assume for a contradiction that the color τ has at stage t is red_e. Since Ξ^A is total, this color cannot be permanent, so at some point it is removed through the action of R_e, which means that τ then receives color green_e, which must be permanent, since R_e never acts to remove this color, and by the argument in the previous paragraph, neither does any other R_i. But we are assuming that A ∉ B_e, so we have a contradiction. Thus no initial segment of A receives color red_e after stage s. It is then easy to check from the construction that there must be a stage s′ > s at which some initial segment of A has color yellow_e. This color cannot be permanent, since A ∉ B_e, so there must be a stage s″ > s′ at which again some initial segment of A receives color yellow_e, and so on. Thus A ∈ Y_e.
8.21.4 Where 1-generic degrees are downward dense

A class C of degrees is downward dense below a degree a if for all nonzero b ≤ a there is a c ≤ b in C.

Theorem 8.21.16 (Kurtz [228]). For almost all degrees a, the class of 1-generic degrees is downward dense below a.¹⁵

Proof. The method of this proof is based on that of Theorem 8.21.3, but is considerably more complicated. Assume for a contradiction that the class of degrees below which the class of 1-generic degrees is not downward dense has positive measure. Then there is an e such that the class C of A such that Φ^A_e is total, noncomputable, and does not compute a 1-generic set has positive measure. By the Lebesgue Density Theorem 1.2.3, there is a σ such that μ(C ∩ ⟦σ⟧) > (7/8)2^{−|σ|}. Let Ψ^A = Φ_e^{σA}. Then the measure of the class of all A such that Ψ^A is total, noncomputable, and does not compute a 1-generic set is greater than 7/8. Our contradiction will be obtained by constructing a functional Φ such that μ({A : Φ^{Ψ^A} total and 1-generic}) > 1/8. To do so, we need to ensure that for measure 1/8 many sets A, we have that Φ^{Ψ^A} is total and the following requirements are met, where V_0, V_1, . . . is an effective enumeration of the c.e. sets of strings.

R_e : Φ^{Ψ^A} meets or avoids V_e.
Re : ΦΨ meets or avoids Ve . 15 The precise level of randomness necessary to ensure that a has this property is not known.
8.21. Properties of almost all degrees
395
Let p*(σ) = μ({A : Ψ^A is total, noncomputable, and extends σ}). We will use the method of Theorem 8.21.3, but with p*(σ) replacing 2^{-|σ|} as the measure of σ. Again we will have blue_e, red_e, yellow_e, and green_e strings, but we will need much more subtlety in the way that these colors are assigned. There will also be an additional color, gray_e, whose role will be discussed later. Unfortunately, we cannot actually compute p*. If we could, then we could proceed very much as in the proof of Theorem 8.21.3. Working above a blue_e string ν for the sake of R_e, we would search for a maximal finite set E_ν of pairwise incompatible extensions of ν such that for some ρ ∈ E_ν,

p*(ν)·2^{-(e+3)} ≤ p*(ρ) ≤ p*(ν)·2^{-(e+2)}.

By Corollary 8.12.2, lim_n p*(X ↾ n) = 0 for all X, and p*(σ) = p*(σ0) + p*(σ1) for all σ, so such a set exists. We would then give ρ color red_e and the other members of E_ν color yellow_e, and proceed much as before. Such a construction would give us a Φ such that
μ({A : Ψ^A total and Φ^{Ψ^A} not 1-generic}) ≤ 1/2,

which would yield our contradiction, since then

μ({A : Φ^{Ψ^A} total and 1-generic}) ≥ μ({A : Ψ^A total}) − μ({A : Ψ^A total and Φ^{Ψ^A} not 1-generic}) ≥ 7/8 − 1/2 > 1/8.

Of course, we cannot compute the actual value of p*, so we will be forced to use an approximation p to p* defined as follows. Let F(σ) be the longest initial segment of Ψ^σ[|σ|]. Let p_s(σ) = μ({τ ∈ 2^s : σ ⪯ F(τ)}) and let p(σ) = lim_s p_s(σ). Note that p(σ) = μ({A : σ ⪯ Ψ^A}) (where by σ ⪯ Ψ^A we mean that Ψ^A(n) ↓ = σ(n) for all n < |σ|). At a stage s of the construction, we will use p_s in place of p*. The new ideas in this construction will allow us to overcome the difficulties that this use of approximations causes. One obvious issue is that p_s converges to p and not to p*. Let

Y = {σ : μ({A : σ ⪯ Ψ^A ∧ Ψ^A is nontotal or computable}) > p(σ)/2}.
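The stage-s approximation p_s is just a count over oracle strings of length s. The following is a minimal illustrative sketch, assuming a toy stand-in F for the use-bounded partial computation Ψ^τ[|τ|] (the actual Ψ is an arbitrary Turing functional, which of course cannot be captured by a total Python function):

```python
from fractions import Fraction

def p_s(sigma, F, s):
    """Stage-s approximation p_s(sigma): the measure of the length-s oracle
    strings tau whose current partial output F(tau) extends sigma."""
    count = sum(1 for i in range(2 ** s)
                if F(format(i, '0{}b'.format(s))).startswith(sigma))
    return Fraction(count, 2 ** s)

# Toy stand-in for F: output the first half of the oracle string read so far.
F = lambda tau: tau[:len(tau) // 2]

print(p_s("0", F, 4))    # 1/2: half of the length-4 oracles output an extension of "0"
```

Note that p_s(σ) ≥ p_s(σ0) + p_s(σ1) can be strict, precisely because some τ currently have F(τ) = σ with no further output; this is the nontotality issue the proof must handle.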
We will later show that the class of sets A that meet Y has measure less than 1/4. We will discuss two problems arising out of the p_s not converging to p*. Both cause the apparent density of red_e strings extending certain blue_e strings to be too large. Our solution in both cases will be to argue that in
396
8. Algorithmic Randomness and Turing Reducibility
such cases the relevant red_e strings belong to Y, and hence the measure of the class of sets extending them is not unacceptably large. The first difficulty arises because Ψ^A might not be total for all A, and hence it might be that p(σ) ≠ p(σ0) + p(σ1) for some σ. Let us for now assume we are dealing directly with p, rather than with its approximations p_s. Let ν be an active blue_e string. Suppose that we use the following strategy for choosing red_e and yellow_e strings. Begin by giving ν color red_e. During later stages, if ρ is the current red_e string extending ν, compute p(ρ0) and p(ρ1). If p(ρi) ≤ p(ν)·2^{-(e+3)} for i = 0, 1 then do nothing. Otherwise, let i be least such that p(ρi) > p(ν)·2^{-(e+3)} and make ρi the new red_e extension of ν, giving ρ(1 − i) color yellow_e. The problem with this strategy is that we might well have p(ρ) > p(ν)·2^{-(e+2)} but p(ρi) ≤ p(ν)·2^{-(e+3)} for i = 0, 1. In this case, if we allow ρ to remain red_e, then we might lose an unacceptably large fraction of the potential domain in this attempt to meet R_e. On the other hand, if we make one of the ρi red_e, then we risk having insufficient density of green_e strings if V_e contains an extension of ν. The solution to this dilemma is to give one of the ρi the color red_e if p(ρi) > p(ν)·2^{-(e+4)}. Now, of course, it may be that p(ρ) > p(ν)·2^{-(e+2)} but p(ρi) ≤ p(ν)·2^{-(e+4)} for i = 0, 1, so it would appear that the same problem as before presents itself. However, in this case we have ρ ∈ Y, and hence the apparent loss of measure caused by leaving ρ with the color red_e can be blamed on Y. By controlling the measure of Y, and hence the effect of strings in Y on the construction, we will ensure that this kind of loss of measure occurs only acceptably often. The second problem we face is that Ψ^A might be computable for some A.
There might be a computable set C ≻ ν such that μ({A : Ψ^A = C}) > p(ν)·2^{-(e+4)}, and we may always choose our red_e strings ρ ⪰ ν so that ρ ≺ C, which would cause us to pick a new red_e string infinitely often. One apparent problem this situation causes is that our finite maximal set E_ν of pairwise incompatible extensions of ν grows each time we change our red_e extension of ν, and hence can be infinite in the limit. This problem turns out to be irrelevant: whether the final set of red_e and yellow_e extensions of ν is finite or infinite is immaterial; the only question is whether the measures assigned to the colors are correct. More important is the fact that, along C, we may still lose an unacceptably large amount of measure while attempting vainly to settle upon a final red_e string. However, if μ({A : Ψ^A = C}) > 0, then for all sufficiently large n,

μ({A : Ψ^A = C}) > μ({A : C ↾ n ⪯ Ψ^A})/2 = p(C ↾ n)/2,

since {A : Ψ^A = C} = ⋂_n {A : C ↾ n ⪯ Ψ^A}. Thus C ↾ n ∈ Y, and we are again saved by the fact that the measure loss due to strings in Y is controlled.
The last problem we must overcome is that we have to work with p_s instead of p. Suppose that at stage s we have a red_e string ρ extending a blue_e string ν with the desired condition p_s(ν)·2^{-(e+2)} > p_s(ρ) ≥ p_s(ν)·2^{-(e+4)}. At a later stage t, we could have p_t(ν) > p_s(ν) while p_t(ρ) = p_s(ρ), which could mean that the ratio of p_t(ρ) to p_t(ν) is unacceptably small. (Since we need to ensure only that the density of red_e and green_e strings above each blue_e string is bounded away from zero, we can declare "unacceptably small" to mean less than 2^{-(e+6)}, for technical reasons that will become clear later.) It is now that the new color gray_e comes in. The gray_e strings will act as "traffic lights", as explained below. The red_e and yellow_e strings will occur as extensions of gray_e strings, while the blue_e strings control the placement of the gray_e strings. In this construction, it will be possible for one blue_e string to extend another. With this fact in mind, we say that a blue_e string ν′ belongs to a blue_e string ν if ν is the longest blue_e proper substring of ν′. A green_e or gray_e string γ belongs to ν if ν is the longest (not necessarily proper) blue_e substring of γ. Finally, a red_e string ρ belongs to ν if ρ extends a gray_e string belonging to ν. We will ensure that if ν is a blue_e string, then either no string properly extending ν has a color, or else the green_e, blue_e, and gray_e strings that belong to ν form a finite maximal set of incomparable extensions of ν. When we compute the density of green_e and red_e strings that extend a blue_e string ν, we will use only those that belong to ν. Suppose that ν is an active blue_e string at a stage s such that p_s(ν) > 0. (No action will be taken for ν until p_s(ν) > 0.) We begin by giving ν itself the color gray_e (without removing the color blue_e from ν).
Suppose that at a stage t > s we appoint a red_e string belonging to ν, and that at a later stage u, the density of red_e and green_e strings belonging to ν becomes unacceptably small. That is, for the red_e and green_e strings δ_0, . . . , δ_n belonging to ν,

(Σ_{j≤n} p_u(δ_j)) / p_u(ν) ≤ 2^{-(e+6)}.

We then remove all colors from all strings properly extending ν, and remove the color gray_e from ν. We refer to this process as a wipe-out above ν. Let α_0, . . . , α_m be the active strings extending ν at stage u. We do nothing above ν unless there is a stage v > u such that Σ_{j≤m} p_v(α_j) > p_v(ν)/2. The key observation is that, unless ν ∈ Y, such a v must exist. In this case, we give each α_j the color gray_e, and for each j ≤ m we begin to seek red_e extensions of α_j. If at some stage w > v, the density of red_e and green_e strings belonging to ν is again unacceptably small, then we again perform a wipe-out above ν, and try to place gray_e strings above ν as before. Note that wipe-outs can occur at most finitely often, since each wipe-out requires the approximation
to p(ν) to at least double. After the final wipe-out, we can meet R_e above ν without any interference. We now turn to the formal construction. Let Ψ and p_s be defined as above.
Construction. Stage 0. The only active string is λ. Give λ the color blue_0 and define Φ^λ = λ.
Stage s + 1. At this stage, we work for each e ≤ s + 1. The stage is divided into five substages.
Substage 1. (Red action) For each red_e string ρ for which there is a ν ∈ V_e[s] with Φ^ρ ⪯ ν, proceed as follows. Let Φ^{ρi} = ν for i = 0, 1. Remove the color red_e from ρ and give it the color green_e. For the unique gray_e string γ ⪯ ρ, remove the color gray_e from γ, and remove the color of all yellow_e strings extending γ.
Substage 2. (Gray placement, or wipe-out recovery) For each blue_e and non-gray_e string ν such that no string properly extending ν has a color, proceed as follows. Let α_0, . . . , α_n be the active strings that extend ν. If Σ_{j≤n} p_s(α_j) > p_s(ν)/2, then give each α_i the colors gray_e and red_e.
Substage 3. (Red push-out) For each red_e string ρ, proceed as follows. Let γ be the unique gray_e substring of ρ. If p_{s+1}(ρi)/p_{s+1}(γ) < 2^{-(e+4)} for i = 0, 1 then do nothing. Otherwise, let i be least such that p_{s+1}(ρi)/p_{s+1}(γ) ≥ 2^{-(e+4)}. Remove the red_e color from ρ. Give the color red_e to ρi and the color yellow_e to ρ(1 − i). Let Φ^{ρj} = Φ^ρ for j = 0, 1.
Substage 4. (Wipe-outs) For each blue_e string ν, proceed as follows. Let δ_0, . . . , δ_n be the red_e and green_e strings belonging to ν. If

(Σ_{j≤n} p_{s+1}(δ_j)) / p_{s+1}(ν) > 2^{-(e+6)}

then do nothing. Otherwise, remove all colors from all proper extensions of ν; if ν has color gray_e then remove this color.
Substage 5. (Blue placement) For each nonactive string σ, proceed as follows. If there is an e such that σ has a blue_e and non-gray_e substring ν none of whose proper extensions has a color (so that we are still attempting wipe-out recovery above ν), then do nothing. Otherwise, let e be least such that σ extends neither a green_e nor a yellow_e string, and give σ the color blue_e.
End of Construction.
Verification. We will need the following classes.
B_e = {X : X has an initial segment with final color green_e or yellow_e}.
S_e = {X : X has an initial segment with final color red_e}.
Let Y be as above. Let X ∈ ⋂_e B_e. We claim that Φ^X is total and 1-generic. To establish this claim, fix e. Since X ∈ B_e there is a string σ ≺ X with final color green_e
or yellow_e. In the green_e case, there is a τ ⪯ Φ^X in V_e by the construction. In the yellow_e case, there cannot be an extension of Φ^σ in V_e, since otherwise R_e would eventually act to remove the color yellow_e from σ. In either case, Φ^X meets or avoids V_e. Since for each n there is an e such that V_e = 2^n, we see that Φ^X is also total. Thus it is enough to show that μ({A : Ψ^A total and in ⋂_e B_e}) > 1/8. Every set is in one of the following classes, where we let B_{−1} = 2^ω.
(i) {A : Ψ^A total and in ⋂_e B_e},
(ii) {A : Ψ^A total and in Y},
(iii) ⋃_e {A : Ψ^A total and in S_e \ Y},
(iv) ⋃_e {A : Ψ^A total and in B_{e−1} \ (B_e ∪ S_e ∪ Y)},
(v) {A : Ψ^A not total}.
The last class has measure less than 1/8, so it is enough to show that the sum of the measures of the classes in items (ii)–(iv) is at most 3/4. We will do so by showing that μ({A : Ψ^A ∈ Y}) < 1/4, that μ({A : Ψ^A total and in S_e \ Y}) ≤ 2^{-(e+2)}, and that μ({A : Ψ^A total and in B_{e−1} \ (B_e ∪ S_e ∪ Y)}) = 0.
Lemma 8.21.17. μ({A : Ψ^A ∈ Y}) < 1/4.
Proof. Let T be the set of elements of Y with no proper substring in Y. Then

1/4 > 2μ({A : Ψ^A is nontotal or computable})
≥ 2μ({A : ∃σ ∈ T (σ ⪯ Ψ^A) and Ψ^A is computable or nontotal})
= 2 Σ_{σ∈T} μ({A : σ ⪯ Ψ^A and Ψ^A is computable or nontotal})
> Σ_{σ∈T} p(σ) = Σ_{σ∈T} μ({A : σ ⪯ Ψ^A}) ≥ μ({A : Ψ^A ∈ Y}).
Lemma 8.21.18. μ({A : Ψ^A is total and in S_e \ Y}) ≤ 2^{-(e+2)}.
Proof. Let R be the set of strings with final color red_e and G be the set of strings with final color gray_e. Note that the elements of G are pairwise incompatible. Each element of R has a substring in G, and each element of G has at most one extension in R. Let ρ ∈ R and γ ∈ G be such that γ ⪯ ρ. If p(ρ) > 2^{-(e+2)} p(γ) then p(ρ0) + p(ρ1) < p(ρ)/2, since otherwise the color red_e would be passed from
ρ to one of the ρi at some stage. Thus in this case ρ ∈ Y. So

μ({A : Ψ^A is total and in S_e \ Y}) ≤ Σ {p(ρ) : ρ ∈ R \ Y} ≤ Σ {2^{-(e+2)} p(γ) : γ ∈ G} ≤ 2^{-(e+2)}.
We finish by showing that μ({A : Ψ^A total and in B_{e−1} \ (B_e ∪ S_e ∪ Y)}) = 0. The intuitive reason that this fact is true is that if Ψ^A is total and in B_{e−1} \ (B_e ∪ S_e ∪ Y), then we make infinitely many attempts to satisfy R_e for A, and as we will see, the probability that we do so is 0. We begin with some auxiliary lemmas.
Lemma 8.21.19. B_e ⊆ B_{e−1}.
Proof. The proof is the same as that of Lemma 8.21.12.
Lemma 8.21.20. μ({A : Ψ^A is computable and not in Y}) = 0.
Proof. Assume not. Then there is a computable set C ∉ Y such that μ({A : Ψ^A = C}) > 0. Since {A : Ψ^A = C} = ⋂_n {A : C ↾ n ≺ Ψ^A}, for all sufficiently large n we have

μ({A : Ψ^A = C}) > μ({A : C ↾ n ⪯ Ψ^A})/2 = p(C ↾ n)/2,

whence C ↾ n ∈ Y, contradicting the assumption that C ∉ Y.
Lemma 8.21.21. Suppose that μ({A : Ψ^A total and in B_{e−1} \ (B_e ∪ S_e ∪ Y)}) > 0 and let δ < 1. Then there is a σ such that μ({A : Ψ^A total and in [σ] \ (B_e ∪ S_e ∪ Y)}) > δ·2^{-|σ|}.
Proof. It will be convenient to work in 3^ω. For σ ∈ 3^{<ω}, we write [σ]_3 for the basic open set of all extensions of σ in 3^ω. To avoid confusion, for σ ∈ 2^{<ω}, we write [σ]_2 instead of [σ] for the set of extensions of σ in 2^ω. Let Θ be a functional from 2^ω to 3^ω defined by letting Θ^A(n) = Ψ^A(n) if the latter is defined, and Θ^A(n) = 2 otherwise. Clearly Θ is Borel. Let q(σ) = μ({A : σ ≺ Θ^A}) for σ ∈ 3^{<ω}. Note that if σ ∈ 2^{<ω} then q(σ) = p(σ). By a result of Carathéodory (see Royden [335, p. 257]), there is a unique extension of q to a measure q on the Borel sets of 3^ω. It is easy to check that q(U) = μ({A : Θ^A ∈ U}) for Borel U. The Lebesgue Density Theorem can be proved for q, so for every Borel U ⊆ 3^ω of positive q-measure, there is a σ ∈ 3^{<ω} such that the q-density of U in [σ]_3 exceeds δ. By the hypothesis of the lemma, q(B_{e−1} \ (B_e ∪ S_e ∪ Y_2)) > 0 (writing Y_2 for the class of elements of 2^ω with an initial segment in Y), so there is a σ ∈ 3^{<ω} such that the q-density of B_{e−1} \ (B_e ∪ S_e ∪ Y_2) in [σ]_3 exceeds δ. Clearly σ ∈ 2^{<ω}, since otherwise [σ]_3 ∩ (B_{e−1} \ (B_e ∪ S_e ∪ Y_2)) = ∅,
and by the definition of q we have that μ({A : Ψ^A total and in [σ]_2 \ (B_e ∪ S_e ∪ Y_2)}) > δ·2^{-|σ|}.
Lemma 8.21.22. μ({A : Ψ^A total and in B_{e−1} \ (B_e ∪ S_e ∪ Y)}) = 0.
Proof. Assume for a contradiction that the lemma fails. Let σ be as in the previous lemma for δ = 1 − 2^{-(e+7)}. If e > 0 then, since B_{e−1} is generated by the strings with final color green_{e−1} or yellow_{e−1}, we may assume that σ extends such a string. If a string has final color gray_e then every set extending it is either in B_e ∪ S_e or is the limit of a sequence of red_e strings, in which case it is computable. Thus, by Lemma 8.21.20, σ cannot extend such a string. It follows that we appoint a maximal set I of pairwise incompatible extensions of σ and give each of them the color blue_e. There is a ν ∈ I such that μ({A : Ψ^A total and in [ν] \ (B_e ∪ S_e ∪ Y)}) > (1 − 2^{-(e+7)})·2^{-|ν|}. After the last wipe-out above ν, at a stage s_0, we appoint a maximal set of pairwise incompatible gray_e strings γ_0, . . . , γ_m ⪰ ν. Let δ_i^s be the red_e or green_e string extending γ_i at stage s ≥ s_0, or δ_i^s = γ_i if there is no such string. Let C_s = ⋃_{i≤m} [δ_i^s] and let C = ⋂_s C_s. Let X ∈ C. If there is an i such that the δ_i^s come to a finite limit δ_i ≺ X then δ_i is either green_e or red_e, and hence X ∈ B_e ∪ S_e. Otherwise, X is the limit of the δ_i^s for some i, and hence is computable. Thus, by Lemma 8.21.20, μ({A : Ψ^A total and in C}) ≤ μ({A : Ψ^A total and in B_e ∪ S_e ∪ Y}). Therefore, to obtain a contradiction, it is enough to show that μ({A : Ψ^A total and in C}) ≥ 2^{-(e+7)}·2^{-|ν|}. Assume not. Then, since C_{s_0} ⊇ C_{s_0+1} ⊇ · · · , for all sufficiently large s, we have μ({A : Ψ^A total and in C_s}) < 2^{-(e+7)}·2^{-|ν|}. It is then easy to see that for all sufficiently large s we have Σ_{i≤m} p_s(δ_i^s) < 2^{-(e+7)}·2^{-|ν|} < 2^{-(e+6)} p_s(ν), the last inequality holding because p(ν) > 2^{-(|ν|+1)}. Therefore, ν must wipe out again after stage s_0, contradicting the choice of s_0. As argued above, this lemma completes the proof of the theorem.
The original proof of the following consequence of the above result was, to the authors' knowledge, the first use of the method of risking measure.
Corollary 8.21.23 (Paris [314]). The upward closure of the class of minimal degrees has measure 0.
Jockusch [188] showed that the initial segment of degrees below a 1-generic degree is never a lattice, so we have the following.
Corollary 8.21.24 (Kurtz [228]). The upward closure of the class of degrees whose initial segments form lattices has measure 0.
Part III
Relative Randomness
9 Measures of Relative Randomness
Measures of relative complexity such as Turing reducibility, wtt-reducibility, and so on have proven to be central tools in computability theory. The resulting degree structures have been widely studied, and there has been considerable interplay between structural results and insights informing our understanding of the concept of computability and its application to such areas as effective mathematics and reverse mathematics. In this chapter, we examine reducibilities that can act as measures of relative complexity, and hence perform a similar role in the theory of algorithmic randomness. We use the term "reducibility" loosely, to mean any transitive and reflexive preorder on 2^ω. For example, given the association between randomness and high prefix-free complexity of initial segments, it is natural to consider K-reducibility, where A ≤_K B if K(A ↾ n) ≤ K(B ↾ n) + O(1), and think of this preorder as a reducibility measuring relative randomness, although there is no corresponding concept of reduction in this case. That is, we have no way of generating a short description for A ↾ n from one for B ↾ n, just the knowledge that there is one. Before studying this important notion, we introduce some stronger reducibilities, many of which are "real" reducibilities, beginning with one defined by Solovay [371]. In Section 9.2 we will discuss Solovay's motivation for considering this notion.
R.G. Downey and D. Hirschfeldt, Algorithmic Randomness and Complexity, Theory and Applications of Computability, DOI 10.1007/978-0-387-68441-3_9, © Springer Science+Business Media, LLC 2010
9.1 Solovay reducibility
As we will see below, probably the easiest way to define Solovay reducibility on left-c.e. reals is to say that α is Solovay reducible to β if there are a constant c and a left-c.e. real γ such that cβ = α + γ. We begin, however, with Solovay's original definition, which is not restricted to left-c.e. reals (although Solovay reducibility is badly behaved outside the left-c.e. reals (see Proposition 9.6.1 below), and hence its applications to date have been restricted to the left-c.e. reals).
Definition 9.1.1 (Solovay [371]). A real α is Solovay reducible, or S-reducible, to a real β (written α ≤_S β) if there are a constant c and a partial computable function f : Q → Q such that if q ∈ Q and q < β, then f(q) ↓ < α and α − f(q) < c(β − q).
One way to look at this definition is that a sequence of rationals converging to β from below can be used to generate one that converges to α from below at the same rate or faster. This idea is made precise for left-c.e. reals in the following result.
Proposition 9.1.2 (Calude, Coles, Hertling, and Khoussainov [47]). Let α and β be left-c.e. reals, and let q_0 < q_1 < · · · → α and r_0 < r_1 < · · · → β be computable sequences of rationals. Then α ≤_S β iff there are a constant c and a computable function g such that α − q_{g(n)} < c(β − r_n) for all n.
Proof. Suppose that α ≤_S β, and let c and f be as in Definition 9.1.1. Given n, we have f(r_n) ↓ < α, so there is an m such that f(r_n) < q_m. Let g(n) = m. Then α − q_{g(n)} < α − f(r_n) < c(β − r_n). Conversely, suppose there are c and g as above. Given s ∈ Q, search for an n such that s < r_n, and if such an n is found, then let f(s) = q_{g(n)}. If s < β then n exists, and α − f(s) < c(β − r_n) < c(β − s).
The connection between Solovay reducibility and relative randomness comes from the fact that Solovay reducibility implies K-reducibility (as well as C-reducibility, where A ≤_C B if C(A ↾ n) ≤ C(B ↾ n) + O(1)). To prove this fact, we begin with a lemma.
Lemma 9.1.3 (Solovay [371]). For each k there is a constant c_k such that for all n and all σ, τ ∈ 2^n, if |0.σ − 0.τ| < 2^{k−n}, then C(σ) = C(τ) ± c_k and K(σ) = K(τ) ± c_k.
Proof. Fix k. Given σ of some length n, we can compute the set S of τ ∈ 2^n such that |0.σ − 0.τ| < 2^{k−n}. The size of S depends only on k, and we can describe any τ ∈ S by giving σ and τ's position in the lexicographic ordering of S. Thus C(τ) ≤ C(σ) + O(1) and K(τ) ≤ K(σ) + O(1), where the constants depend only on k. The lemma follows by symmetry.
Theorem 9.1.4 (Solovay [371]). If α ≤_S β then C(α ↾ n) ≤ C(β ↾ n) + O(1) and K(α ↾ n) ≤ K(β ↾ n) + O(1).
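The counting step in this proof is elementary: viewing length-n strings as integers, the condition |0.σ − 0.τ| < 2^{k−n} becomes |m_σ − m_τ| < 2^k, so S has at most 2^{k+1} − 1 elements regardless of n. A quick illustrative sketch:

```python
def close_strings(sigma, k):
    """The set S of Lemma 9.1.3: all tau of the same length n as sigma with
    |0.sigma - 0.tau| < 2^(k-n), i.e. integers within distance 2^k - 1 of sigma."""
    n, m = len(sigma), int(sigma, 2)
    lo = max(0, m - (2 ** k - 1))
    hi = min(2 ** n - 1, m + (2 ** k - 1))
    return [format(t, '0{}b'.format(n)) for t in range(lo, hi + 1)]

S = close_strings('10110100', 2)
assert len(S) <= 2 ** 3 - 1     # |S| is bounded in terms of k alone, not n
assert '10110100' in S
```

Since |S| is bounded in terms of k alone, any τ ∈ S is described by σ together with an index of constantly many bits, which is exactly the O(1) of the lemma.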
Proof. Let c and f be as in Definition 9.1.1. Let k be such that c + 2 < 2^k. Let β_n = 0.(β ↾ n) and α_n = 0.(α ↾ n). Since β_n is rational and β − β_n ≤ 2^{-n}, we have α − f(β_n) < c·2^{-n}. Let τ_n ∈ 2^n be such that |0.τ_n − f(β_n)| < 2^{-n}. Then

|α_n − 0.τ_n| ≤ |α − α_n| + (α − f(β_n)) + |0.τ_n − f(β_n)| < (c + 2)·2^{-n} < 2^{k−n}.

Thus, by Lemma 9.1.3, K(α ↾ n) ≤ K(τ_n) + O(1). But τ_n can be obtained computably from β ↾ n, so K(α ↾ n) ≤ K(β ↾ n) + O(1). The proof for plain complexity is the same.
One useful property of Solovay reducibility is that it also implies Turing reducibility (which we will see later is not the case for K-reducibility).
Proposition 9.1.5. For any reals α and β, if α ≤_S β then α ≤_T β.
Proof. Let c and f be as in Definition 9.1.1. For each n, we can β-computably find a rational q_n < β such that β − q_n < 2^{-n}. Then f(q_0), f(q_1), . . . is a β-computable sequence of rationals converging to α at a computably bounded rate. Thus α ≤_T β.
On the other hand, S-reducibility does not imply wtt-reducibility. Recall that a real is strongly c.e. if it is a c.e. set when thought of as a subset of N.
Theorem 9.1.6 (Downey, Hirschfeldt, and LaForte [113]). There exist left-c.e. reals α ≤_S β such that α ≰_wtt β. Moreover, β can be chosen to be strongly c.e.
Proof. It will be convenient to build an almost c.e. set A and a c.e. set B and think of α and β as 0.A and 0.B, respectively. We ensure that α ≤_S β while meeting the following requirement for each e and i.

R_{e,i} : ∀n (φ_e^B(n) ↓ < Φ_i(n) ↓) ⇒ Φ_e^B ≠ A.

We arrange these requirements into a priority list in the usual way. The idea for satisfying a single requirement R_{e,i} is simple. We choose a number n and wait until Φ_i(n)[s] ↓. We then put n into B and put the interval (n, s] into A. This action adds 2^{-n} to β and less than 2^{-n} to α. We then wait until Φ_e^B(n) ↓ = 0 with use less than Φ_i(n) (and hence less than s, by the usual use conventions). If this happens, we put n into A, remove the interval (n, s] from A, and put s into B. This action adds 2^{-s} to both α and β. It also preserves φ_e^B(n), and hence ensures that Φ_e^B(n) = 0 ≠ 1 = A(n).
We now proceed to the general construction. Each requirement R_{e,i} begins by picking a fresh large n_{e,i}. Initially, we say that all requirements are in the first phase. At stage s, we say that R_{e,i} requires attention if either R_{e,i} is in the first phase and Φ_i(n_{e,i})[s] ↓, or R_{e,i} is in the second phase (defined below) and Φ_e^B(n_{e,i})[s] ↓ = 0 with φ_e^B(n_{e,i})[s] < Φ_i(n_{e,i}). At stage s, let R_{e,i} be the strongest priority requirement that requires attention. (If there is no such requirement then proceed to the next stage.)
Initialize all weaker priority requirements, which means they have to pick their n's to be fresh large numbers and are all now in the first phase. If R_{e,i} is in its first phase, then put n_{e,i} into B, put the interval (n_{e,i}, s] into A, let s_{e,i} = s, and declare R_{e,i} to be in its second phase. Note that n_{e,i} was not already in B because requirements weaker than R_{e,i} work beyond n_{e,i}, while whenever a requirement stronger than R_{e,i} requires attention, n_{e,i} is picked to be a fresh large number. Thus, this action adds 2^{-n_{e,i}} to β and less than 2^{-n_{e,i}} to α. If R_{e,i} is in its second phase, then put n_{e,i} into A, remove the interval (n_{e,i}, s_{e,i}] from A, and put s_{e,i} into B. This action adds 2^{-s_{e,i}} to both α and β. This completes the construction.
Clearly, A is almost c.e. and B is c.e. Furthermore, α ≤_S β, since at each stage we add at least as much to β as to α. Assume by induction that we are at a point in the construction such that no requirement stronger than R_{e,i} will ever require attention, so n_{e,i} has reached its final value. If Φ_i(n_{e,i}) ↓ then eventually R_{e,i} will require first phase attention, at a stage s_{e,i}. Note that Φ_i(n_{e,i}) ≤ s_{e,i}. At this point, (n_{e,i}, s_{e,i}] enters A, but n_{e,i} is not in A. If we never have Φ_e^B(n_{e,i})[t] ↓ = 0 with φ_e^B(n_{e,i})[t] < Φ_i(n_{e,i}) for t > s_{e,i}, then R_{e,i} is satisfied, since we never put n_{e,i} into A. Otherwise, at the first such stage t, we put n_{e,i} into A but do not change B below φ_e^B(n_{e,i})[t]. From that point on, we never change B below φ_e^B(n_{e,i})[t], and never remove n_{e,i} from A. Thus R_{e,i} is satisfied. In any case, R_{e,i} eventually stops requiring attention.
We finish this section by giving another characterization of Solovay reducibility on left-c.e. reals already mentioned above, namely that α is Solovay reducible to β iff there are a constant c and a left-c.e. real γ such that cβ = α + γ. We begin with a lemma that will also be useful below. Here and below, when we have a sequence r_0, r_1, . . . , we take the expression r_n − r_{n−1} to have value r_0 when n = 0.
Lemma 9.1.7 (Downey, Hirschfeldt, and Nies [116]). Let α and β be left-c.e. reals and let r_0 < r_1 < · · · → β be a computable sequence of rationals. Then α ≤_S β iff there is a computable sequence of rationals p_0 < p_1 < · · · → α such that for some constant d we have p_s − p_{s−1} < d(r_s − r_{s−1}) for all s.
Proof. If there is a sequence p_0 < p_1 < · · · as in the lemma, then α − p_n = Σ_{s>n} (p_s − p_{s−1}) < d Σ_{s>n} (r_s − r_{s−1}) = d(β − r_n), so by Proposition 9.1.2, α ≤_S β.
We now prove the converse. Let q_0 < q_1 < · · · → α be a computable sequence of rationals, let c and g be as in Proposition 9.1.2, and let d > c be such that q_{g(0)} < d·r_0. We may assume without loss of generality that g is increasing. Let p_0 = q_{g(0)}. There must be an s_0 > 0 such that q_{g(s_0)} − q_{g(0)} < c(r_{s_0} − r_0), since otherwise we would have α − q_{g(0)} = lim_s (q_{g(s)} − q_{g(0)}) ≥ lim_s c(r_s − r_0) = c(β − r_0), contradicting our choice of c and g. We can now define p_1, . . . , p_{s_0} so that p_0 < · · · < p_{s_0} = q_{g(s_0)} and p_s − p_{s−1} ≤ c(r_s − r_{s−1}) < d(r_s − r_{s−1})
for all s ≤ s_0. For example, if we let μ be the minimum value of c(r_s − r_{s−1}) for s ≤ s_0 and let t be least such that p_0 + c(r_t − r_0) > q_{g(s_0)} − 2^{-t}μ, then we can define

p_{s+1} = p_s + c(r_{s+1} − r_s)   if s + 1 < t,
p_{s+1} = q_{g(s_0)} − 2^{-(s+1)}μ   if t ≤ s + 1 < s_0,
p_{s+1} = q_{g(s_0)}   if s + 1 = s_0.

We can repeat the procedure in the previous paragraph with s_0 in place of 0 to obtain an s_1 > s_0 and p_{s_0+1}, . . . , p_{s_1} such that p_{s_0} < · · · < p_{s_1} = q_{g(s_1)} and p_s − p_{s−1} < d(r_s − r_{s−1}) for all s_0 < s ≤ s_1. Proceeding by recursion in this way, we can define a computable sequence of rationals p_0 < p_1 < · · · with the desired properties.
Theorem 9.1.8 (Downey, Hirschfeldt, and Nies [116]). Let α and β be left-c.e. reals. The following are equivalent.
(i) α ≤_S β.
(ii) For each computable sequence of nonnegative rationals b_0, b_1, . . . such that β = Σ_n b_n, there are a constant d and a computable sequence of rationals ε_0, ε_1, . . . ∈ [0, d] such that α = Σ_n ε_n b_n.
(iii) There are a constant d and a left-c.e. real γ such that dβ = α + γ.
Proof. (i) ⇒ (ii) Let b_0, b_1, . . . be a computable sequence of nonnegative rationals such that β = Σ_i b_i, and let r_n = Σ_{i≤n} b_i. Apply Lemma 9.1.7 to obtain d and p_0, p_1, . . . as in that lemma. Let ε_n = (p_n − p_{n−1})/b_n. Then Σ_n ε_n b_n = Σ_n (p_n − p_{n−1}) = α, and ε_n = (p_n − p_{n−1})/(r_n − r_{n−1}) ∈ [0, d] for all n.
(ii) ⇒ (iii) Let b_0, b_1, . . . be a computable sequence of nonnegative rationals such that β = Σ_n b_n. Let ε_0, ε_1, . . . be as in (ii), and let γ = Σ_n (d − ε_n) b_n.
(iii) ⇒ (i) Let r_0 < r_1 < · · · → α and s_0 < s_1 < · · · → γ be computable sequences. Let p_n = (r_n + s_n)/d. Then p_0 < p_1 < · · · → β and α − r_n < d(β − p_n), so by Proposition 9.1.2, α ≤_S β.
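The identity behind (ii) ⇒ (iii) is termwise algebra, which can be checked on finite truncations with exact rational arithmetic. The sequences below are hypothetical illustrations, not canonical examples:

```python
from fractions import Fraction as F

# Hypothetical data: increments b_n of beta, and weights eps_n in [0, d].
b = [F(1, 2 ** (n + 1)) for n in range(20)]
d = 3
eps = [F(n % 4, 2) for n in range(20)]             # values in {0, 1/2, 1, 3/2} ⊆ [0, d]

beta = sum(b)
alpha = sum(e * x for e, x in zip(eps, b))         # alpha = sum eps_n * b_n  (condition (ii))
gamma = sum((d - e) * x for e, x in zip(eps, b))   # the real gamma of (iii)

assert d * beta == alpha + gamma                   # d*beta = alpha + gamma, exactly
```

In the limit the same computation shows that γ = Σ_n (d − ε_n) b_n is a sum of nonnegative computably approximable terms, hence left-c.e.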
9.2 The Kučera-Slaman Theorem
Solovay introduced Solovay reducibility to define a class of reals with many of the properties of Ω, but with a machine-independent definition. First he noted that Ω is Solovay complete, in the sense that α ≤_S Ω for all left-c.e. reals α.
Proposition 9.2.1 (Solovay [371]). If α is a left-c.e. real then α ≤_S Ω.
Proof. Let α be a left-c.e. real. Let M be a prefix-free machine such that α = μ(dom M), and let τ be such that U(τσ) = M(σ) for all σ. If we can
approximate Ω to within 2^{-(|τ|+n)}, then we can approximate α to within 2^{-n}, so α ≤_S Ω.
Furthermore, for any (not necessarily left-c.e.) real α, if Ω ≤_S α then, by Theorem 9.1.4, α is 1-random. Solovay [371] proved that the Solovay-complete left-c.e. reals (which he called Ω-like) possess many of the properties of Ω. He remarked: "It seems strange that we will be able to prove so much about the behavior of [K(Ω ↾ n)] when, a priori, the definition of Ω is thoroughly model dependent. What our discussion has shown is that our results hold for a class of reals (that include the value of the universal measures . . . ) and that the function [K(Ω ↾ n)] is model independent to within O(1)."
Several left-c.e. reals that arise naturally in the theory of algorithmic randomness are Ω-like. For example, Kučera and Slaman [221] showed that if {U_n}_{n∈ω} is a universal Martin-Löf test, then μ(U_n) is Ω-like for all n > 0.¹ Another example are the values Q(σ) from Definition 3.9.3.
The following two results establish that using Ω-like reals in place of Ω makes no difference, in the same way that using any set of the same m-degree as the halting problem gives a version of the halting problem, by Myhill's Theorem. Thus, it turns out that Solovay's observations are not so strange after all. The first result establishes that any Ω-like real is a halting probability.
Theorem 9.2.2 (Calude, Hertling, Khoussainov, and Wang [50]). If α is Ω-like then there is a universal prefix-free machine Û such that μ(dom Û) = α.
Proof. By Lemma 9.1.7, there are approximations to α and Ω, and a constant c such that Ω_{s+1} − Ω_s < 2^c(α_{s+1} − α_s) for all s. We may assume that c is large enough so that α + 2^{-c} < 1 and 2^{-c} < α. Let β = α + 2^{-c}(1 − Ω). Note that β is a left-c.e. real, since

β = α_0 + 2^{-c}(1 − Ω_0) + Σ_s ((α_{s+1} − α_s) − 2^{-c}(Ω_{s+1} − Ω_s)),

and each summand is positive. Also, β < 1, so there is a prefix-free machine M such that μ(dom M) = β. Since β > 2^{-c}, we may assume that there is a ρ ∈ 2^c such that M(ρ) ↓. Let Û(σ) = M(σ) for σ not extending ρ, and let Û(ρτ) = U(τ) for all τ. Then Û codes U and hence is universal, and μ(dom Û) = μ(dom M) − 2^{-c} + 2^{-c}Ω = β − 2^{-c}(1 − Ω) = α.
The complete characterization of the 1-random left-c.e. reals in terms of Solovay reducibility is given by the following result.
and each summand is positive. Also, β < 1, so there is a prefix-free machine M such that μ(dom M ) = β. Since β > 2−c , we may assume that there )= = M (σ) if σ ρ and let U(ρτ is a ρ ∈ 2c such that M (ρ) ↓. Let U(σ) = U(τ ) for all τ . Then U codes U and hence is universal, and μ(dom U) −c −c −c μ(dom M ) − 2 + 2 Ω = β − 2 (1 − Ω) = α. The complete characterization of the 1-random left-c.e. reals in terms of Solovay reducibility is given by the following result. 1 Kuˇ cera and Slaman [221] noted that a version of this result was proved earlier by Demuth [93].
Theorem 9.2.3 (Kučera-Slaman Theorem [221]). If α is 1-random and left-c.e. then it is Ω-like.
Proof. Let β be a left-c.e. real. We need to show that β ≤_S α. Define a Martin-Löf test {U_n}_{n∈ω} in stages. At stage s act as follows. If α_s ∈ U_n[s] then do nothing. Otherwise let t be the last stage at which we put anything into U_n (or t = 0 if there is no such stage), and put the interval (α_s, α_s + 2^{-n}(β_s − β_t)) into U_n. We have μ(U_n) ≤ 2^{-n}β, so {U_n}_{n∈ω} is a Martin-Löf test. Thus there is an n such that α ∉ U_n. Let s_0 = 0 and let s_1, s_2, . . . be the stages at which we put something into U_n. Then for i > 0 we have α_{s_{i+1}} − α_{s_i} ≥ 2^{-n}(β_{s_i} − β_{s_{i−1}}), so β ≤_S α.
Thus we see that for left-c.e. reals α, the following are equivalent.
(i) α is 1-random.
(ii) β ≤_S α for all left-c.e. reals β.
(iii) Ω ≤_S α.
(iv) K(β ↾ n) ≤ K(α ↾ n) + O(1) for all left-c.e. reals β.
(v) K(Ω ↾ n) ≤ K(α ↾ n) + O(1).
(vi) K(Ω ↾ n) = K(α ↾ n) ± O(1).
(vii) α is the halting probability of some universal machine, that is, a version of Ω.
Notice in particular that if α and β are both 1-random left-c.e. reals, then α ≡_S β, and hence K(α ↾ n) = K(β ↾ n) ± O(1). Since the 1-randomness of X requires only that K(X ↾ n) ≥ n − O(1), but K(X ↾ n) can be as large as n + K(n) ± O(1), the prefix-free complexity of the initial segments of a 1-random set can oscillate quite a bit. However, this oscillation is the same for all 1-random left-c.e. reals. That is, remarkably, all 1-random left-c.e. reals have high complexity (well above n) and relatively low complexity (close to n) at the same initial segments. These results can be seen as providing a randomness-theoretic version of Myhill's Theorem 2.4.12.
It is interesting to note that all 1-random left-d.c.e. reals are either left-c.e. or right-c.e.
Theorem 9.2.4 (Rettinger [unpublished]). Let α be a 1-random left-d.c.e. real. Then either α or 1 − α is a left-c.e. real.
Proof. By the characterization of left-d.c.e.
reals in Theorem 5.4.2, there is a computable sequence of rationals q_0, q_1, . . . such that lim_i q_i = α and Σ_i |q_{i+1} − q_i| < 1. Suppose there is an n such that q_i ≤ α for all i > n. Then we can pick out a nondecreasing subsequence of the q_i converging to α, and hence α is left-c.e. Similarly, if there is an n such that q_i ≥ α for all i > n then 1 − α is left-c.e. We claim that one of these two cases must
hold. Suppose otherwise, and let I_i be (q_i, q_{i+1}) if q_i ≤ q_{i+1} and (q_{i+1}, q_i) if q_i > q_{i+1}. Then α ∈ I_i for infinitely many i. But Σ_i |I_i| < 1, so the I_i form a Solovay test, contradicting the 1-randomness of α.
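The interval system in this proof is concrete enough to simulate with exact rational arithmetic. The following sketch (the helper name and the toy sequence are ours, not from the text) builds the intervals I_i from an oscillating rational sequence of bounded variation and checks the two properties used above: the total length stays below 1, and the limit lands in infinitely many (here, all) of the intervals.

```python
from fractions import Fraction

def solovay_intervals(q):
    """Intervals spanned by consecutive terms of the sequence q, as in the
    proof above: I_i = (min(q_i, q_{i+1}), max(q_i, q_{i+1}))."""
    return [(min(a, b), max(a, b)) for a, b in zip(q, q[1:])]

# A toy oscillating approximation to 1/3 with total variation < 1.
q = [Fraction(0)]
for i in range(1, 12):
    sign = 1 if i % 2 else -1          # alternate above/below the limit
    q.append(Fraction(1, 3) + sign * Fraction(1, 4**i))

ivals = solovay_intervals(q)
total = sum(b - a for a, b in ivals)
assert total < 1                                        # so the I_i form a Solovay test
assert all(a < Fraction(1, 3) < b for a, b in ivals)    # the limit is in every I_i
```

A 1-random real cannot lie in infinitely many intervals of such a test, which is exactly the contradiction the proof extracts.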
9.3 Presentations of left-c.e. reals and complexity

Recall from Chapter 5 that a presentation of a left-c.e. real α is a prefix-free c.e. set W ⊂ 2^{<ω} such that α = Σ_{σ∈W} 2^{-|σ|}. In Corollary 5.3.14 we saw that there are noncomputable left-c.e. reals α such that any presentation of α is computable. By Theorems 9.2.2 and 9.2.3, such reals cannot be 1-random, since the domain of a universal prefix-free machine is not computable. However, Stephan and Wu [380] showed that they must have rather high initial segment complexity, and in particular must be weakly 1-random.

Theorem 9.3.1 (Stephan and Wu [380]). Let α be a left-c.e. real for which there is a computable f such that K(α ↾ f(n)) < f(n) − n for all n. Then α has a noncomputable presentation.

Proof. We may assume that f is increasing. It is easy to see that there is a computable sequence of rationals α_0 < α_1 < · · · → α such that, thinking of the α_s as strings, for each s we have |α_s| > f(s) and K(α_s ↾ f(n)) < f(n) − n for all n ≤ s. Let r_0 = α_0 and r_{s+1} = α_{s+1} − α_s. Thinking of the r_s as strings, let L = {(n, 0^s) : s ∈ N ∧ r_s(n) = 1}. Then Σ_{(n,0^s)∈L} 2^{-n} = Σ_s r_s = α, so L is a KC set, and the domain P of the corresponding prefix-free machine is a presentation of α. Thus it is enough to show that P is not computable.

Assume for a contradiction that P is computable. Then for each n we know all strings of length n in P, and hence know all s such that (n, 0^s) ∈ L. Thus there is a computable g such that s > g(n) ⇒ r_s < 2^{-n} for all n. Since α is not computable, there is an n such that α − α_{g(f(n))} > 2^{-n+2}. Since K(α_s ↾ f(n)) < f(n) − n for all s ≥ g(f(n)), we have |{α_s ↾ f(n) : s ≥ g(f(n))}| < 2^{f(n)−n}. It follows that there is an s ≥ g(f(n)) such that r_{s+1} = α_{s+1} − α_s > 2^{-n+2} · 2^{-(f(n)−n)−2} = 2^{-f(n)}, contradicting the definition of g.
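The passage from the increments r_s to the KC set L in this proof is mechanical, and can be sketched with exact dyadic arithmetic (the function name and the toy approximation below are ours; a real KC set is infinite, so this only illustrates the bookkeeping on a finite initial segment).

```python
from fractions import Fraction

def kc_requests(alphas):
    """Turn an increasing sequence of dyadic rationals alpha_0 < alpha_1 < ...
    into KC-style requests: one request of weight 2^-n for each 1-bit of each
    increment r_s (r_0 = alpha_0, r_{s+1} = alpha_{s+1} - alpha_s)."""
    reqs = []
    prev = Fraction(0)
    for s, a in enumerate(alphas):
        r = a - prev
        prev = a
        n = 1
        while r > 0:                      # read off the binary expansion of r_s
            if r >= Fraction(1, 2**n):
                reqs.append((n, s))       # stands for the pair (n, 0^s) in the proof
                r -= Fraction(1, 2**n)
            n += 1
    return reqs

alphas = [Fraction(1, 4), Fraction(3, 8), Fraction(11, 16), Fraction(23, 32)]
reqs = kc_requests(alphas)
weight = sum(Fraction(1, 2**n) for n, _ in reqs)
assert weight == alphas[-1]   # total request weight equals the approximated value
```

The total weight of the requests equals the approximated real, which is what makes L a KC set and the resulting machine domain a presentation of α.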
In particular, by Theorem 7.2.16, we have the following.

Corollary 9.3.2 (Stephan and Wu [380]). If every presentation of a left-c.e. real α is computable then α is either computable or weakly 1-random but not 1-random.
9.4 Solovay functions and 1-randomness

Recall from Section 3.12 that Solovay proved the existence of a Solovay function, that is, a computable function f such that Σ_n 2^{-f(n)} < ∞, whence K(n) ≤ f(n) + O(1), and lim inf_n f(n) − K(n) < ∞. It turns out that, among the computable functions f such that the sum Σ_n 2^{-f(n)} is finite, the Solovay functions are precisely those for which this sum is a 1-random real. The proof of this result exploits the fact that the Kučera-Slaman Theorem provides a method for characterizing Solovay functions in terms of Ω.

Theorem 9.4.1 (Bienvenu and Downey [39]). Let f be a computable function. The following are equivalent.

(i) f is a Solovay function.
(ii) Σ_n 2^{-f(n)} is a 1-random real.

Proof. (i) ⇒ (ii). If f is a Solovay function, we already know by definition that α = Σ_n 2^{-f(n)} is finite. Let us now prove that α is 1-random. Suppose it is not. Then for each c there is a k such that K(α ↾ k) ≤ k − c. Given α ↾ k, we can effectively find an s such that Σ_{n>s} 2^{-f(n)} ≤ 2^{-k}. Hence, by a standard KC Theorem argument, we have K(n | α ↾ k) ≤ f(n) − k + O(1) for all n > s. Thus, for all n > s,

K(n) ≤ f(n) + K(α ↾ k) − k + O(1) ≤ f(n) + (k − c) − k + O(1) = f(n) − c + O(1).

Since c can be taken arbitrarily large, lim_n f(n) − K(n) = ∞; i.e., f is not a Solovay function.

(ii) ⇒ (i). Suppose for a contradiction that f is not a Solovay function but α = Σ_n 2^{-f(n)} is 1-random. By the Kučera-Slaman Theorem 9.2.3, Ω ≤_S α. Let g be as in the definition of Solovay reducibility. That is, g is a partial computable function such that, for some d, if q is a rational less than α, then g(q) ↓ < Ω and Ω − g(q) < 2^d(α − q). Fix c and suppose that α ↾ k is given for some k. Since α − 0.(α ↾ k) < 2^{-k}, we have Ω − g(α ↾ k) < 2^{-k+d}. Thus, from α ↾ k we can compute an s(k) such that Σ_{n>s(k)} 2^{-K(n)} ≤ 2^{-k+d}. If k is large enough, then n > s(k) ⇒ K(n) ≤ f(n) − c − d (because f is not a Solovay function), whence

Σ_{n>s(k)} 2^{-f(n)} ≤ 2^{-c-d} Σ_{n>s(k)} 2^{-K(n)} ≤ 2^{-c-d} · 2^{-k+d} = 2^{-k-c}.
So for large enough k, knowing α ↾ k suffices to effectively approximate α to within 2^{-k-c}. In other words, α ↾ (k + c) can be computed from α ↾ k and c. Therefore, for all large enough k,

K(α ↾ (k + c)) ≤ K(α ↾ k, c) + O(1) ≤ K(α ↾ k) + 2 log c + O(1).
The constant in that expression is independent of c, so choose c such that the expression 2 log c + O(1) in the above inequality is smaller than c/2. Then, for all large enough k,

K(α ↾ (k + c)) ≤ K(α ↾ k) + c/2.

An easy induction then shows that K(α ↾ k) ≤ k/2 + O(1), contradicting the assumption that α is 1-random.

The following easy corollary, which says that there are nondecreasing computable upper bounds for K, is not at all obvious without the previous result.

Corollary 9.4.2 (Bienvenu and Downey [39]). There exist nondecreasing Solovay functions.

Proof. It suffices to take a computable sequence {r_n}_{n∈ω} of rational numbers such that every r_n is a negative power of 2, the r_n are nonincreasing, and Σ_n r_n is a 1-random real. (It is easy to see that such sequences exist.) Then let f(n) = − log(r_n) for all n. The function f is computable, nondecreasing, and, by Theorem 9.4.1, is a Solovay function.
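One way such sequences can be obtained is a standard splitting trick, which we sketch here under our own naming (it is not spelled out in the text): enumerate the weights 2^{-|σ|} for σ in the domain of a universal prefix-free machine, and replace each weight 2^{-k} by equally many smaller pieces 2^{-j}, where j is at least the last exponent already emitted. The sum is preserved (so it remains Ω, hence 1-random) while the emitted weights become nonincreasing.

```python
from fractions import Fraction

def make_nonincreasing(exponents):
    """Given exponents k_0, k_1, ... (weights 2^-k_i in enumeration order),
    split each weight into 2^(j-k) equal pieces 2^-j, with j at least as large
    as the last exponent emitted, so the output weights are nonincreasing
    while the total sum is preserved."""
    out = []
    last = 0
    for k in exponents:
        j = max(k, last)
        out.extend([j] * 2**(j - k))   # 2^(j-k) copies of weight 2^-j
        last = j
    return out

ks = [3, 1, 4, 2]                      # weights 1/8, 1/2, 1/16, 1/4
out = make_nonincreasing(ks)
assert out == sorted(out)              # exponents nondecreasing: weights nonincreasing
assert sum(Fraction(1, 2**j) for j in out) == sum(Fraction(1, 2**k) for k in ks)
```

Taking f(n) to be the n-th emitted exponent then gives a computable nondecreasing function whose weight sum is unchanged, as the corollary requires.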
9.5 Solovay degrees of left-c.e. reals

As with any reducibility, we can take equivalence classes of reals under S-reducibility, which we call Solovay degrees or S-degrees; these are partially ordered in the obvious way. One way to express the main result of Section 9.2 is that there is a largest S-degree of left-c.e. reals, which consists exactly of the 1-random left-c.e. reals, or, equivalently, the versions of Ω. We have seen that S-reducibility implies Turing reducibility, and it is easy to show that if α is computable then α ≤_S β for any left-c.e. real β. Thus there is also a least S-degree of left-c.e. reals, which consists exactly of the computable reals. As we will see, between these two extremes there is a rich structure.

It was observed by Solovay [371] and others, such as Calude, Hertling, Khoussainov, and Wang [50], that the Solovay degrees of left-c.e. reals form an uppersemilattice, with the join operation induced by addition (or multiplication). Downey, Hirschfeldt, and Nies [116] showed that this uppersemilattice is distributive, where an uppersemilattice is distributive if

c ≤ a ∨ b ⇒ ∃c_0 ≤ a ∃c_1 ≤ b (c = c_0 ∨ c_1).

Theorem 9.5.1 (Solovay [371], Calude, Hertling, Khoussainov, and Wang [50], Downey, Hirschfeldt, and Nies [116]). The Solovay degrees of left-c.e. reals form a distributive uppersemilattice, with the join operation induced by addition.

Proof. Let α and β be left-c.e. reals. By Theorem 9.1.8, α, β ≤_S α + β. Now suppose that α, β ≤_S γ. Let q_0, q_1, . . . be a computable sequence of
nonnegative rationals such that γ = Σ_n q_n. By Theorem 9.1.8, there is a constant d and computable sequences of rationals ε_0, ε_1, . . . ∈ [0, d] and δ_0, δ_1, . . . ∈ [0, d] such that α = Σ_n ε_n q_n and β = Σ_n δ_n q_n. Then ε_n + δ_n ∈ [0, 2d] for all n, and α + β = Σ_n (ε_n + δ_n) q_n. So, again by Theorem 9.1.8, α + β ≤_S γ. Thus addition induces a join operation on the S-degrees of left-c.e. reals.

Now suppose that γ ≤_S α + β. Let a_0, a_1, . . . and b_0, b_1, . . . be computable sequences of nonnegative rationals such that α = Σ_n a_n and β = Σ_n b_n. By Theorem 9.1.8, there is a constant d and a computable sequence of rationals ε_0, ε_1, . . . ∈ [0, d] such that γ = Σ_n ε_n(a_n + b_n). Let γ_0 = Σ_n ε_n a_n and γ_1 = Σ_n ε_n b_n. Then γ = γ_0 + γ_1 and, again by Theorem 9.1.8, γ_0 ≤_S α and γ_1 ≤_S β. Thus the uppersemilattice of the S-degrees of left-c.e. reals is distributive.

It is straightforward to modify the above proof to show that the join operation on the S-degrees of left-c.e. reals is also induced by multiplication. On the other hand, the S-degrees of left-c.e. reals do not form a lattice.

Theorem 9.5.2 (Downey and Hirschfeldt [unpublished]). There exist c.e. sets A and B such that the Solovay degrees of A and B have no infimum in the Solovay degrees of left-c.e. reals.

Proof. We actually prove something stronger, namely that there exist c.e. sets A and B such that for any set X ≤_T A, B, there is a c.e. set S with S ≤_S A, B and S ≰_T X. The proof is a straightforward adaptation of the proof that the c.e. wtt-degrees are not a lattice, using a method introduced by Jockusch [190]. We build c.e. sets A and B and auxiliary c.e. sets S_{i,j} to meet the following requirement for each i, j, k.

R_{i,j,k} : Φ_i^A = Φ_j^B total ⇒ S_{i,j} ≤_S A, B ∧ S_{i,j} ≠ Φ_k^{Φ_i^A}.
The proof is a standard finite injury argument, and it will suffice to describe our action for a single R_{i,j,k}.

At stage s, let m_s be largest such that Φ_i^A(m)[s] ↓ for all m < m_s and Φ_i^A[s] ↾ m_s = Φ_j^B[s] ↾ m_s. Let σ = Φ_i^A[s] ↾ m_s. Choose a fresh large n and wait for a stage s such that Φ_k^σ(n)[s] ↓. Then initialize all weaker priority requirements (which means they have to pick new fresh large n's). If Φ_k^σ(n)[s] ≠ 0 then do nothing. Otherwise, proceed as follows. First put n into A. Suppose there is a stage t > s such that Φ_i^A[t] ↾ m_s = Φ_j^B[t] ↾ m_s (and R_{i,j,k} has not been initialized meanwhile). Note that Φ_j^B[t] ↾ m_s = Φ_j^B[s] ↾ m_s = σ, because requirements weaker than R_{i,j,k} do not put numbers into B ↾ φ_j^B(m_s − 1)[s] at stages greater than or equal to s. Now put n into B and into S_{i,j}.

If R_{i,j,k} is never initialized after stage t then no number enters A ↾ φ_i^A(m_s − 1)[t] at stages greater than or equal to t, so if R_{i,j,k}'s hypothesis holds, then σ ≺ Φ_i^A, so Φ_k^{Φ_i^A}(n) = Φ_k^σ(n)[s]. If this value is 0, we eventually
put n into S_{i,j}, while otherwise we never put n into S_{i,j}. In either case, S_{i,j}(n) ≠ Φ_k^{Φ_i^A}(n).

Suppose that Φ_i^A = Φ_j^B and fix s. Any n being controlled by some R_{i,j,k} at stage s will eventually enter S_{i,j} unless R_{i,j,k} is initialized. Thus we can find a stage g(s) such that if such an n ever enters S_{i,j}, it does so by stage g(s). All numbers entering S_{i,j} after stage g(s) must also enter A and B after stage s. Thus, thinking of our c.e. sets as reals, S_{i,j} − S_{i,j}[g(s)] ≤ A − A[s] and S_{i,j} − S_{i,j}[g(s)] ≤ B − B[s] for all s, and hence S_{i,j} ≤_S A, B.

Some results about the structure of the S-degrees of left-c.e. reals come for free from the fact that S-reducibility implies Turing reducibility. For instance, there are minimal pairs of S-degrees of left-c.e. reals because there are minimal pairs of c.e. Turing degrees. Other results require work, however. The question that led both authors into the study of algorithmic randomness was whether the partial order of S-degrees of left-c.e. reals is dense. This question led to the following results, which we will prove in a general context, applicable to many other reducibilities, in Section 9.8.

Theorem 9.5.3 (Downey, Hirschfeldt, and Nies [116]). Let γ <_S α <_S Ω be left-c.e. reals. There are left-c.e. reals β_0 and β_1 such that γ <_S β_i <_S α for i = 0, 1 and β_0 + β_1 = α. In other words, in the S-degrees of left-c.e. reals, every incomplete degree splits over every lesser degree.

Theorem 9.5.4 (Downey, Hirschfeldt, and Nies [116]). Let γ <_S Ω be a left-c.e. real. There is a left-c.e. real β such that γ <_S β <_S Ω.

These two theorems together establish the following result.

Corollary 9.5.5 (Downey, Hirschfeldt, and Nies [116]). The partial order of Solovay degrees of left-c.e. reals is dense.

The hypothesis that α <_S Ω in the statement of Theorem 9.5.3 is necessary. In fact, the S-degree of Ω does not split at all in the S-degrees of left-c.e. reals, which can be seen as a structural reflection of the qualitative difference between 1-random and non-1-random left-c.e. reals.
This fact will follow from a stronger result that shows that, despite the upwards density of the S-degrees of left-c.e. reals, there is a sense in which the S-degree of Ω is very much above all other S-degrees of left-c.e. reals. We begin with two lemmas, the second of which will not actually be necessary in the proof of Theorem 9.5.8 below, but will be helpful in motivating that proof, and will also be used in Section 9.8.

Lemma 9.5.6 (Downey, Hirschfeldt, and Nies [116]). Let α and β be left-c.e. reals and let α_0 < α_1 < · · · → α and β_0 < β_1 < · · · → β be computable sequences of rationals. Let f be an increasing computable function and let k > 0. If there are infinitely many s such that k(α − α_s) > β − β_{f(s)}, but
only finitely many s such that k(α_t − α_s) > β_{f(t)} − β_{f(s)} for all t > s, then β ≤_S α.

Proof. By taking β_{f(0)}, β_{f(1)}, . . . instead of β_0, β_1, . . . as an approximating sequence for β, we may assume that f is the identity. By hypothesis, there is an r such that for all s > r there is a t > s with k(α_t − α_s) ≤ β_t − β_s. Furthermore, there is an s_0 > r such that k(α − α_{s_0}) > β − β_{s_0}. Given s_i, let s_{i+1} be the least number greater than s_i such that k(α_{s_{i+1}} − α_{s_i}) ≤ β_{s_{i+1}} − β_{s_i}. Assuming by induction that k(α − α_{s_i}) > β − β_{s_i}, we have

k(α − α_{s_{i+1}}) = k(α − α_{s_i}) − k(α_{s_{i+1}} − α_{s_i}) > β − β_{s_i} − (β_{s_{i+1}} − β_{s_i}) = β − β_{s_{i+1}}.

Thus s_0 < s_1 < · · · is a computable sequence such that k(α − α_{s_i}) > β − β_{s_i} for all i. Now define the computable function g by letting g(n) be the least s_i that is greater than or equal to n. Then β − β_{g(n)} < k(α − α_{g(n)}) ≤ k(α − α_n) for all n, and hence β ≤_S α.

Lemma 9.5.7 (Downey, Hirschfeldt, and Nies [116]). Let β ≰_S α be left-c.e. reals. Let f be a total computable function and k ∈ N.

(i) For each n either
(a) β_t − β_{f(n)} < k(α_t − α_n) for all sufficiently large t or
(b) β_t − β_{f(n)} > k(α_t − α_n) for all sufficiently large t.

(ii) There are infinitely many n such that (b) holds.

Proof. If there are infinitely many t such that β_t − β_{f(n)} ≤ k(α_t − α_n) and infinitely many t such that β_t − β_{f(n)} ≥ k(α_t − α_n) then

β − β_{f(n)} = lim_t (β_t − β_{f(n)}) = lim_t k(α_t − α_n) = k(α − α_n),

which implies that β ≡_S α. If there are infinitely many t such that β_t − β_{f(n)} ≤ k(α_t − α_n) then

β − β_{f(n)} = lim_t (β_t − β_{f(n)}) ≤ lim_t k(α_t − α_n) = k(α − α_n).

So if this happens for all but finitely many n then β ≤_S α. (The finitely many n for which β − β_{f(n)} > k(α − α_n) can be brought into line by increasing the constant k.)

Theorem 9.5.8 (Downey, Hirschfeldt, and Nies [116]). Let α and β be left-c.e. reals and let α_0 < α_1 < · · · → α and β_0 < β_1 < · · · → β be computable sequences of rationals. Let f be an increasing computable function and let k > 0. If β is 1-random and there are infinitely many s such that k(α − α_s) > β − β_{f(s)}, then α is 1-random.

Proof. As in Lemma 9.5.6, we may assume that f is the identity. If α is rational then we can replace it with an irrational computable real α′ (with approximation α′_0 < α′_1 < · · · → α′) such that α′ − α′_s ≥ α − α_s for all s, so we may assume that α is not rational.
We assume that α is not 1-random and there are infinitely many s such that k(α − α_s) > β − β_s, and show that β is not 1-random. The idea is to take a Solovay test A = {I_i}_{i∈N} such that α ∈ I_i for infinitely many i and use it to build a Solovay test B = {J_i}_{i∈N} such that β ∈ J_i for infinitely many i.

Let U = {s : k(α − α_s) > β − β_s}. Since we are assuming β is 1-random but α is not, β ≰_S α, so Lemma 9.5.7 guarantees that U is Δ^0_2. Thus a first attempt at building B could be to run the following procedure for all i in parallel.

Look for the least t such that there is an s < t with s ∈ U[t] and α_s ∈ I_i. If there is more than one number s with this property then choose the least among such numbers. Begin to add the intervals

[β_s, β_s + k(α_{s+1} − α_s)], [β_s + k(α_{s+1} − α_s), β_s + k(α_{s+2} − α_s)], . . . (9.1)

to B, continuing to do so as long as s remains in U and the approximation of α remains in I_i. If the approximation of α leaves I_i then end the procedure. If s leaves U, say at stage u, then repeat the procedure (only considering t ≥ u, of course).

If α ∈ I_i then the variable s in the above procedure eventually assumes a value in U. For this value, k(α − α_s) > β − β_s, from which it follows that k(α_u − α_s) > β − β_s for some u > s, and hence that β ∈ [β_s, β_s + k(α_u − α_s)]. So β must be in one of the intervals (9.1) added to B by the above procedure. Since α is in infinitely many of the I_i, running the above procedure for all i guarantees that β is in infinitely many of the intervals in B.

The problem is that we also need the sum of the lengths of the intervals in B to be finite, and the above procedure gives no control over this sum, since it could easily be the case that we start working with some s, see it leave U at some stage u (at which point we have already added to B intervals whose lengths add up to k(α_{u−1} − α_s)), and then find that the next s with which we have to work is much smaller than u.
Since this could happen many times for each i, we would have no bound on the sum of the lengths of the intervals in B.

This problem would be solved if we had an infinite computable subset T of U. For each I_i, we could look for an s ∈ T such that α_s ∈ I_i, and then begin to add the intervals (9.1) to B, continuing to do so as long as the approximation of α remained in I_i. (Of course, in this easy setting, we could also simply add the single interval [β_s, β_s + k |I_i|] to B.) It is not hard to check that this would guarantee that if α ∈ I_i then β is in one of the intervals added to B, while also ensuring that the sum of the lengths of these intervals is less than or equal to k |I_i|. Following this procedure for all i would give us the desired Solovay test B.

Since β ≰_S α, though, there is no infinite computable T ⊆ U, so we use Lemma 9.5.6 to obtain the next best thing. Let

S = {s : ∀t > s (k(α_t − α_s) > β_t − β_s)}.
By Lemma 9.5.6 and the assumption that β ≰_S α, we may assume that S is infinite. Note that k(α − α_s) ≥ β − β_s for all s ∈ S. In fact, we may assume that k(α − α_s) > β − β_s for all s ∈ S, since if k(α − α_s) = β − β_s then kα and β differ by a rational amount, and hence β is not 1-random.

The set S is co-c.e. by definition, but it has an additional useful property. Let S[t] = {s : ∀u ∈ (s, t] (k(α_u − α_s) > β_u − β_s)}. If s ∈ S[t − 1] \ S[t] then no u ∈ (s, t) is in S, since for any such u we have

k(α_t − α_u) = k(α_t − α_s) − k(α_u − α_s) < β_t − β_s − (β_u − β_s) = β_t − β_u.

In other words, if s leaves S at stage t then so do all numbers in (s, t).

To construct B, we run the following procedure P_i for all i in parallel. Note that B is a multiset, so we are allowed to add more than one copy of a given interval to B.

1. Look for an s such that α_s ∈ I_i.
2. Let t = s + 1. If α_t ∉ I_i then terminate the procedure.
3. If s ∉ S[t] then let s = t and go to step 2. Otherwise, add the interval [β_s + k(α_{t−1} − α_s), β_s + k(α_t − α_s)] to B, increase t by one, and repeat step 3.

This concludes the construction of B. We now show that the sum of the lengths of the intervals in B is finite and that β is in infinitely many of the intervals in B.

For each i, let B_i be the set of intervals added to B by P_i and let l_i be the sum of the lengths of the intervals in B_i. If P_i never leaves step 1 then B_i = ∅. If P_i eventually terminates then l_i ≤ k(α_t − α_s) for some s, t such that α_s, α_t ∈ I_i, and hence l_i ≤ k |I_i|. If P_i reaches step 3 and never terminates then α ∈ I_i and l_i ≤ k(α − α_s) for some s such that α_s ∈ I_i, and hence again l_i ≤ k |I_i|. Thus the sum of the lengths of the intervals in B is less than or equal to k Σ_i |I_i| < ∞.

To show that β is in infinitely many of the intervals in B, it is enough to show that, for each i, if α ∈ I_i then β is in one of the intervals in B_i. Fix i such that α ∈ I_i.
Since α is not rational, α_u ∈ I_i for all sufficiently large u, so P_i must eventually reach step 3. By the properties of S discussed above, the variable s in the procedure P_i eventually assumes a value in S. For this value, k(α − α_s) > β − β_s, from which it follows that k(α_u − α_s) > β − β_s for some u > s, and hence that β ∈ [β_s, β_s + k(α_u − α_s)]. So β must be in one of the intervals of the form (9.1) added by P_i, all of which are in B_i.

The following corollary appears originally in Demuth [93], without proof, and was independently rediscovered by Downey, Hirschfeldt, and Nies [116]. We thank Antonin Kučera for bringing Demuth's paper to our attention. (A discussion of this paper may be found in [221, Remark 3.5].)
Corollary 9.5.9 (Demuth [93]). If α_0 and α_1 are left-c.e. reals such that α_0 + α_1 is 1-random then at least one of α_0 and α_1 is 1-random.

Proof. Let β = α_0 + α_1. For i = 0, 1, let α_{i,0} < α_{i,1} < · · · → α_i be a computable sequence of rationals, and let β_s = α_{0,s} + α_{1,s}. For each s, either 3(α_0 − α_{0,s}) > β − β_s or 3(α_1 − α_{1,s}) > β − β_s, so for some i < 2 there are infinitely many s such that 3(α_i − α_{i,s}) > β − β_s. By Theorem 9.5.8, α_i is 1-random.

Combining Theorem 9.5.3 and Corollary 9.5.9, we have the following results, the second of which also depends on Theorem 9.2.3.

Theorem 9.5.10 (Downey, Hirschfeldt, and Nies [116]). A left-c.e. real γ is 1-random if and only if it cannot be written as α + β for left-c.e. reals α, β <_S γ.

Theorem 9.5.11 (Downey, Hirschfeldt, and Nies [116]). Let d be a Solovay degree of left-c.e. reals. The following are equivalent:

1. d is incomplete.
2. d splits.
3. d splits over any lesser Solovay degree.

These results apply only to left-c.e. reals. For example, given Ω = 0.a_0 a_1 a_2 . . ., let α = 0.a_0 0 a_2 0 . . . and β = 0.0 a_1 0 a_3 . . . . Then neither α nor β is 1-random, but α + β = Ω.

We finish this section by noting that the structure of the S-degrees of left-c.e. reals is not too simple.

Theorem 9.5.12 (Downey, Hirschfeldt, and LaForte [114]). The first-order theory of the uppersemilattice of Solovay degrees of left-c.e. reals is undecidable.

The proof of this result uses Nies' [301] method of interpreting effectively dense boolean algebras.
9.6 cl-reducibility and rK-reducibility

Despite the success of S-reducibility in characterizing the 1-random left-c.e. reals, there are reasons to search for other measures of relative randomness. For one thing, S-reducibility is quite strong, and is rather uniform in a way that K-reducibility, for instance, is not. (That is, if α ≤_S β then α is quite closely tied to β in a computability-theoretic sense, while there is no reason to suppose that, whatever we mean by "A is no more random than B", this notion should imply that A is so closely tied to B.) Furthermore, S-reducibility is quite badly behaved outside the left-c.e. reals, as exemplified by the following result.
Proposition 9.6.1. There is a right-c.e. real β that is not S-above any left-c.e. real (including the computable reals).

Proof. It is enough to build β so that 1 ≰_S β. To do so, we satisfy the requirements

R_e : ∃q ∈ Q (q < β ∧ (Φ_e(q) ↓ < 1 ⇒ 1 − Φ_e(q) > e(β − q))),

where we think of Φ_0, Φ_1, . . . as functions from rationals to rationals. We denote the stage s approximation of β by β_s. We begin with β_0 = 1. We initially choose q_0 < q_1 < · · · < 1.

At stage s, we look for the least e ≤ s such that Φ_e(q_e)[s] ↓ < 1 and 1 − Φ_e(q_e) ≤ e(β_s − q_e). If there is no such e, we let β_{s+1} = β_s and proceed to the next stage. Otherwise, we let β_{s+1} ≤ β_s be such that q_e < β_{s+1} and 1 − Φ_e(q_e) > e(β_{s+1} − q_e). We then redefine all q_i for i > e so that q_e < q_{e+1} < · · · < β_{s+1}. We say that R_e acts at stage s.

Assume by induction that all R_i with i < e act only finitely often, and let t be the least stage by which all such requirements have stopped acting. Then the value of q_e at stage t is permanent. If Φ_e(q_e) ↑ or Φ_e(q_e) ↓ ≥ 1 then R_e is satisfied and never acts after stage t. Otherwise, there is a least stage s ≥ t such that Φ_e(q_e)[s] ↓. We ensure that 1 − Φ_e(q_e) > e(β_{s+1} − q_e). Furthermore, we have q_e < β_u ≤ β_{s+1} for all u > s, whence 1 − Φ_e(q_e) > e(β − q_e). Thus R_e is satisfied and never acts after stage s.

Downey, Hirschfeldt, and LaForte [113] introduced another measure of relative complexity, now often called computable Lipschitz or cl-reducibility. This reduction was originally called sw-reducibility by Downey, Hirschfeldt, and LaForte. (The latter abbreviation stands for "strong weak truth table reducibility". The authors have deservedly received a certain amount of flak over this terminology. Hence the name change, suggested by Barmpalias and Lewis [247], which reflects the fact that cl-reducibility is an effective version of a Lipschitz transformation. See [247] for more on this view of cl-reducibility.)

Definition 9.6.2 (Downey, Hirschfeldt, and LaForte [113], Csima [80] and Soare [368]). A set A is cl-reducible to a set B (written A ≤_cl B) if there is a functional Γ such that Γ^B = A and γ^B(n) ≤ n + O(1). If the constant is zero, then we will call this an ibT-reduction, for identity bounded Turing reduction, and write A ≤_ibT B.

If A ≤_cl B then we can describe A ↾ n using B ↾ n and O(1) many more bits (to represent the additional bits of B below γ^B(n)). Thus C(A ↾ n) ≤ C(B ↾ n) + O(1) and K(A ↾ n) ≤ K(B ↾ n) + O(1). In other words, like S-reducibility, cl-reducibility implies both K-reducibility and C-reducibility. Obviously, cl-reducibility also implies Turing (and even wtt-) reducibility.

A strong form of ibT-reducibility has been used by Nabutovsky and Weinberger [292] for problems in differential geometry, as described in
Soare [368] and Csima and Soare [85]. For a c.e. set W_e, the modulus function, or settling time function, m_e of W_e is defined by letting m_e(n) be the least s such that W_{e,s} ↾ n = W_{e,t} ↾ n for all t > s. Suppose that A and B are c.e. sets. We say that B is settling time dominated by A if there is an enumeration W_j of B and an enumeration W_i of A such that for all computable functions f, we have m_i(n) > f(m_j(n)) for almost all n. One can replace the existential quantifiers in this definition by universal ones, as this notion is invariant on cl-degrees (Soare [368]). We do not pursue this notion further, but clearly there seem to be interesting connections with Kolmogorov complexity; see Csima [80], Csima and Shore [84], Csima and Soare [85], and Soare [368]. For a discussion of extensions beyond the c.e. sets, see Csima [81].

We will see in Section 9.10 that S-reducibility and cl-reducibility are different on left-c.e. reals, and indeed neither implies the other, but that they agree on c.e. sets. We will also see that cl-reducibility is much less well behaved than S-reducibility. For instance, there is no largest cl-degree of left-c.e. reals, and there is no join operation on the cl-degrees of left-c.e. reals.

We would like a measure of relative randomness combining the best features of S-reducibility and cl-reducibility while being less uniform and more closely tied to initial segment complexity. The following is a natural candidate, introduced by Downey, Hirschfeldt, and LaForte [113].

Definition 9.6.3 (Downey, Hirschfeldt, and LaForte [113]). A set A is relative K-reducible or rK-reducible to a set B (written A ≤_rK B) if K(A ↾ n | B ↾ n) ≤ O(1).

Since C(σ | τ) ≤ K(σ | τ) + O(1) and K(σ | τ) ≤ 2C(σ | τ) + O(1), it would not change this definition to use plain complexity in place of prefix-free complexity (and thus there is no separate notion of "rC-reducibility").
If A ≤_rK B then the initial segments of A are easy to describe given the corresponding initial segments of B, and thus it makes sense to think of A as no more random than B. Clearly, rK-reducibility implies K-reducibility and C-reducibility.

We now give alternate characterizations of rK-reducibility that demonstrate how it can be seen as a less uniform version of both S-reducibility and cl-reducibility. Item (ii) below can be seen as a form of "computation with advice".

Theorem 9.6.4 (Downey, Hirschfeldt, and LaForte [113]). The following are equivalent.

(i) A ≤_rK B.
(ii) There are a partial computable function f : 2^{<ω} × N → 2^{<ω} and a constant k such that for each n there is a j ≤ k for which f(B ↾ n, j) ↓ = A ↾ n.
(iii) There are a partial computable function g : 2^{<ω} → Q and a constant c such that for each n there is a τ ∈ 2^{n+c} with |0.B − 0.τ| ≤ 2^{-n} for which g(τ) ↓ and |0.A − g(τ)| ≤ 2^{-n}.

Proof. (i) ⇒ (ii) Suppose that K(A ↾ n | B ↾ n) < c for all n. Let τ_0, . . . , τ_k be the strings of length less than c and define f as follows. For a string σ and j ≤ k, if U^σ(τ_j) ↓ then f(σ, j) = U^σ(τ_j), and otherwise f(σ, j) ↑. For each n there must be a j ≤ k such that U^{B↾n}(τ_j) ↓ = A ↾ n. For this j we have f(B ↾ n, j) ↓ = A ↾ n.

(ii) ⇒ (iii) Let c be such that 2^c ≥ k and define the partial computable function g as follows. Given a string σ of length n, whenever f(σ, j) ↓ for some new j ≤ k, choose a new τ ≻ σ of length n + c and define g(τ) = 0.f(σ, j). Then for each n there is a τ ≻ B ↾ n such that g(τ) ↓ = 0.(A ↾ n). We have |0.B − 0.τ| ≤ 2^{-n} and |0.A − g(τ)| ≤ 2^{-n}.

(iii) ⇒ (i) For a string σ ∈ 2^n, let S_σ be the set of all ν for which there is a τ ∈ 2^{n+c} with |0.σ − 0.τ| ≤ 2^{-n+1} and |0.ν − g(τ)| ≤ 2^{-n+1}. It is easy to check that there is a k such that |S_σ| ≤ k for all σ, and hence that K(ν | σ) ≤ O(1) for all σ and ν ∈ S_σ. Fix n and let τ ∈ 2^{n+c} be such that |0.B − 0.τ| ≤ 2^{-n} and |0.A − g(τ)| ≤ 2^{-n}. Then |0.(B ↾ n) − 0.τ| ≤ 2^{-n+1} and |0.(A ↾ n) − g(τ)| ≤ 2^{-n+1}. Thus A ↾ n ∈ S_{B↾n}, and hence K(A ↾ n | B ↾ n) ≤ O(1).

Corollary 9.6.5 (Downey, Hirschfeldt, and LaForte [113]). If A ≤_cl B then A ≤_rK B.

Proof. Suppose that A = Γ^B and γ^B(n) ≤ n + c for all n. Let σ_0, . . . , σ_{2^c−1} be the strings of length c. For j < 2^c, let f(τ, j) = Γ^{τσ_j} ↾ |τ| if the latter is defined. Then for each n there is a j < 2^c such that f(B ↾ n, j) ↓ = A ↾ n. Thus item (ii) in Theorem 9.6.4 holds.

Corollary 9.6.6 (Downey, Hirschfeldt, and LaForte [113]). If α ≤_S β then α ≤_rK β.

Proof. Let the partial computable function f and the constant k be such that if q ∈ Q and q < β, then f(q) ↓ < α and α − f(q) < k(β − q). Let c be such that 2^c ≥ k and let g(σ) = f(0.σ) if the latter is defined. Given n, we have β − 0.(β ↾ (n + c)) ≤ 2^{-(n+c)}.
Thus g(β ↾ (n + c)) ↓ < α and |α − g(β ↾ (n + c))| ≤ 2^{-n}. So item (iii) in Theorem 9.6.4 holds.

For left-c.e. reals, we have another characterization of rK-reducibility in terms of "settling times". It roughly says that if α ≤_rK β then there are approximations to α and β such that α ↾ n cannot change much after the stage at which β ↾ n settles to its true value. For a left-c.e. real α and a fixed computable approximation α_0 < α_1 < · · · → α, let the mind-change function m(α, n, s, t) be the cardinality of {u ∈ [s, t] : α_u ↾ n ≠ α_{u+1} ↾ n}, and let m(α, n, s) = lim_t m(α, n, s, t).
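The mind-change function is easy to compute from a finite table of approximations; the sketch below (the function name and the toy data are ours, not from the text) counts the stages u ∈ [s, t] at which the first n bits of the approximation change.

```python
def mind_changes(approx, n, s, t):
    """m(alpha, n, s, t) as defined above: the number of stages u in [s, t]
    at which the first n bits of the approximation to alpha change.
    `approx[u]` is the stage-u approximation, written as a bit string."""
    return sum(1 for u in range(s, t + 1)
               if approx[u][:n] != approx[u + 1][:n])

# A toy approximation table; each entry is alpha_u as a 4-bit string.
approx = ["0000", "0100", "0110", "0111", "1000", "1000"]
assert mind_changes(approx, 2, 0, 4) == 2   # first two bits change at u = 0 and u = 3
```

Theorem 9.6.7 below bounds exactly this count, for suitable approximations, over the stages at which β ↾ n has already settled.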
Theorem 9.6.7 (Downey, Hirschfeldt, and LaForte [113]). Let α and β be left-c.e. reals. The following are equivalent.

(i) α ≤_rK β.
(ii) There are a constant k and computable approximations α_0 < α_1 < · · · → α and β_0 < β_1 < · · · → β such that for all n and t > s, if β_t ↾ n = β_s ↾ n then m(α, n, s, t) ≤ k.
(iii) There are a constant k and computable approximations α_0 < α_1 < · · · → α and β_0 < β_1 < · · · → β such that for all n and s, if β ↾ n = β_s ↾ n then m(α, n, s) ≤ k.

Proof. (i) ⇒ (ii) We may assume that α and β are irrational, since otherwise (ii) is easy to show. Let c be such that K(α ↾ n | β ↾ n) < c for all n. Let β′_0 < β′_1 < · · · → β and α′_0 < α′_1 < · · · → α be computable sequences of rationals. Let α_0 = β_0 = 0. Suppose we have defined α_n and β_n. Search for i and j such that α′_i > α_n and β′_j > β_n, and for all m ≤ n + 1, we have K(α′_i ↾ m | β′_j ↾ m) < c. Such i and j must exist because the approximations to α and β eventually settle on their first n + 1 many bits. Let α_{n+1} = α′_i ↾ (n + 1) and β_{n+1} = β′_j ↾ (n + 1). It is easy to see that α_0 < α_1 < · · · → α and β_0 < β_1 < · · · → β.

Suppose that t > s and β_t ↾ n = β_s ↾ n. Then for all u ∈ [s, t], we have K(α_u ↾ n | β_s ↾ n) < c. Thus α_u ↾ n assumes fewer than 2^c many values for such u, and hence m(α, n, s, t) ≤ 2^c.

(ii) ⇒ (iii) Obvious.

(iii) ⇒ (i) Define a partial computable function f as follows. Given τ and j ≤ k, wait for a stage s such that τ = β_s ↾ |τ|. If such a stage is found, then wait for a stage t > s such that m(α, |τ|, s, t) = j. If such a stage is found, then let f(τ, j) = α_{t+1} ↾ |τ|. Then for each n there is a j ≤ k such that f(β ↾ n, j) = α ↾ n.

Despite its nonuniform nature, rK-reducibility implies Turing reducibility.

Theorem 9.6.8 (Downey, Hirschfeldt, and LaForte [113]). If A ≤_rK B then A ≤_T B.

Proof. Let k be the least number for which there exists a partial computable function f such that for each n there is a j ≤ k with f(B ↾ n, j) ↓ = A ↾ n.
There must be infinitely many n for which f(B ↾ n, j) ↓ for all j ≤ k, since otherwise we could change finitely much of f to contradict the minimality of k. Let n0 < n1 < · · · be a B-computable sequence of such n. Let T be the B-computable subtree of 2^{<ω} obtained by pruning, for each i, all the strings of length n_i except for the values of f(B ↾ n_i, j) for j ≤ k. The tree T has bounded width, so each element of [T] is B-computable. But A ∈ [T], so A ≤T B.
9. Measures of Relative Randomness
Since any computable set is rK-reducible to any other set, the above theorem shows that the computable sets form the least rK-degree. By the Kučera-Slaman Theorem, together with the fact that rK-reducibility implies K-reducibility and is implied by S-reducibility, the largest rK-degree of left-c.e. reals is the same as the largest S-degree of left-c.e. reals, namely the 1-random left-c.e. reals. Furthermore, like the case of S-reducibility but unlike that of cl-reducibility, the rK-degrees of left-c.e. reals are an uppersemilattice, with join induced by addition.
Theorem 9.6.9 (Downey, Hirschfeldt, and LaForte [113]). The rK-degrees of left-c.e. reals form an uppersemilattice with least degree that of the computable sets, highest degree that of the 1-random left-c.e. reals, and join induced by addition.
Proof. All that is left to show is that addition induces a join. Let α and β be left-c.e. reals. Since α, β ≤S α + β, we have α, β ≤rK α + β. Let α, β ≤rK γ. Then we can compute (α + β) ↾ n given α ↾ n and β ↾ n, together with a couple of bits of information to indicate whether there are carries from further bits of α and β. Since K(α ↾ n | γ ↾ n) = O(1) and K(β ↾ n | γ ↾ n) = O(1), it follows that K((α + β) ↾ n | γ ↾ n) = O(1).
Notice that in the above proof, the only use made of the fact that we were working with left-c.e. reals was in showing that α, β ≤rK α + β. That is of course not necessarily the case in general, since, for example, Ω ≰rK Ω + (1 − Ω) = 1. It follows from the proof of Theorem 9.5.2 that the rK-degrees of left-c.e. reals are not a lattice. As we will see in Section 9.8, the analogs of Theorem 9.5.3 and Corollary 9.5.5 hold for rK-reducibility. Together with Corollary 9.5.9, they give the following result.
Theorem 9.6.10 (Downey, Hirschfeldt, and LaForte [113]). The partial order of rK-degrees of left-c.e. reals is dense. Furthermore, in the rK-degrees of left-c.e. reals, every incomplete degree splits over every lesser degree, while the complete degree does not split at all.
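The "couple of bits of information" in the proof of Theorem 9.6.9 are carry bits: the first n bits of α + β are determined by α ↾ n, β ↾ n, and the carry propagating in from the two discarded tails. A toy sketch (function name ours):

```python
def add_prefixes(a_bits, b_bits, carry_in):
    """Compute the first n bits of alpha + beta (mod 1) from the first n
    bits of alpha and of beta (most significant first), together with
    the carry (0 or 1) propagating in from the two discarded tails."""
    n = len(a_bits)
    out = [0] * n
    total = carry_in
    for i in range(n - 1, -1, -1):   # add from the least significant end
        total += a_bits[i] + b_bits[i]
        out[i] = total % 2
        total //= 2                  # carry into the next position
    return out                       # the final carry is dropped (mod 1)
```

For example, 0.011 + 0.001 = 0.100, while a carry of 1 arriving from the tails yields 0.101 instead.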
Raichev [319, 320] observed that rK-lower cones are well-behaved, in the following sense.
Theorem 9.6.11 (Raichev [319, 320]). For any A, the reals rK-reducible to A form a real closed field.
The proof uses approximation methods similar to those of the proof of Theorem 5.4.7. See [319] for details.
Thus rK-reducibility shares many of the nice structural properties of S-reducibility on the left-c.e. reals, while still being a reasonable measure of relative randomness on all sets. There are still some basic open questions about rK-reducibility, however, such as whether the semilattice of rK-degrees of left-c.e. reals is distributive, or whether every set is rK-reducible to a 1-random set. We will see in Section 9.13 that there are sets that are not cl-reducible to any 1-random set.
9.7 K-reducibility and C-reducibility
Recall that A is K-reducible to B (written A ≤K B) if K(A ↾ n) ≤ K(B ↾ n) + O(1), and A is C-reducible to B (written A ≤C B) if C(A ↾ n) ≤ C(B ↾ n) + O(1). These reducibilities are the most basic measures of relative randomness based on the Kolmogorov complexity approach. If A is 1-random and A ≤K B, then B is clearly 1-random. The same holds if A ≤C B, by Theorem 6.7.2. We have seen in Theorem 3.4.4 that a set A is computable iff C(A ↾ n) ≤ C(n) + O(1). The following theorem extends that result.
Theorem 9.7.1 (Stephan [personal communication]). Let α and β be left-c.e. reals. If α ≤C β then α ≤T β.³
Proof. If β ≡T ∅′, then there is nothing to prove, so we assume that β <T ∅′. For n ∈ ∅′, let f(n) be the stage at which n enters ∅′, and let g(n) be the least s such that β_s ↾ n = β ↾ n, so that g ≤T β. There must be infinitely many n ∈ ∅′ with f(n) > g(n), since otherwise we could compute ∅′ using g and hence using β. Thus we can find a β-computable infinite set S ⊆ ∅′ such that f(n) > g(n) for all n ∈ S. For n ∈ S, we can determine β ↾ n by computing f(n), since then β ↾ n = β_{f(n)} ↾ n. Thus for all such n we have C(β ↾ n) ≤ C(n) + O(1), and hence C(α ↾ n) ≤ C(n) + O(1). Since S is β-computable, the relativized form of Chaitin's Theorem 3.4.5 implies that α ≤T β.
We will see in Chapter 11 that this result is not true if we replace plain complexity by prefix-free complexity. Indeed, even the analogue of Theorem 3.4.4 fails to hold in that case, because there are noncomputable sets A such that K(A ↾ n) ≤ K(n) + O(1). Sets with this property (including all the computable sets) are called K-trivial because they form the lowest K-degree, and are of great interest in the theory of algorithmic randomness, as we will see in Chapter 11.
For left-c.e. reals, it follows from the Kučera-Slaman Theorem that the largest K-degree and the largest C-degree both consist exactly of the 1-random left-c.e. reals. Furthermore, the K-degrees and C-degrees of left-c.e. reals have basic structural properties similar to those of the S-degrees and rK-degrees of left-c.e. reals, as we now show.
³ This result is not true in general outside the left-c.e. reals. If C(A ↾ n) ≤ n/2, say, then A ≤C Ω, but there are uncountably many such A.
Theorem 9.7.2 (Downey, Hirschfeldt, Nies, and Stephan [117]). Let α and β be left-c.e. reals. Then C((α + β) ↾ n) = max(C(α ↾ n), C(β ↾ n)) ± O(1) and K((α + β) ↾ n) = max(K(α ↾ n), K(β ↾ n)) ± O(1).
Proof. We do the proof for prefix-free complexity; the same works for plain complexity. Let γ = α + β. Let α0 < α1 < · · · → α and β0 < β1 < · · · → β be computable sequences of rationals and let γ_s = α_s + β_s. If we know γ ↾ n then we can find an s such that γ_s ↾ n = γ ↾ n. The approximation to α ↾ n can change at most once after stage s, since if there were two such changes, we would have γ − γ_s ≥ α − α_s > 2^{−n}, and hence γ_s ↾ n ≠ γ ↾ n. Thus we can describe α ↾ n given γ ↾ n and one more bit of information, whence K(α ↾ n) ≤ K(γ ↾ n) + O(1). Similarly, K(β ↾ n) ≤ K(γ ↾ n) + O(1).
For the other direction, fix n. Let s be least such that α_s ↾ n = α ↾ n and let t be least such that β_t ↾ n = β ↾ n. Suppose that s ≥ t. Then given α ↾ n, we can compute s, which also gives us β ↾ n = β_s ↾ n. From this information plus a couple of bits to represent possible carries from further bits of α and β, we obtain γ ↾ n. Thus K(γ ↾ n) ≤ K(α ↾ n) + O(1). Similarly, if t ≥ s then K(γ ↾ n) ≤ K(β ↾ n) + O(1).
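The first half of the proof can be checked empirically: once the first n bits of γ = α + β have reached their final value, the first n bits of α change at most once. A brute-force check on toy approximations (our own construction, not from the text):

```python
from fractions import Fraction
from itertools import accumulate
import random

def prefix(x, n):
    """First n binary digits of x in [0, 1), most significant first."""
    bits = []
    for _ in range(n):
        x *= 2
        b = int(x)
        bits.append(b)
        x -= b
    return tuple(bits)

random.seed(0)
# toy increasing approximations to two reals in [0, 1/4)
alphas = list(accumulate(Fraction(random.randint(1, 3), 4 ** (k + 2))
                         for k in range(12)))
betas = list(accumulate(Fraction(random.randint(1, 3), 4 ** (k + 2))
                        for k in range(12)))
gammas = [a + b for a, b in zip(alphas, betas)]

T = len(gammas) - 1
for n in range(1, 6):
    # first stage at which gamma's n-bit prefix has its stage-T value
    s = min(t for t in range(T + 1)
            if prefix(gammas[t], n) == prefix(gammas[T], n))
    changes = sum(1 for u in range(s, T)
                  if prefix(alphas[u], n) != prefix(alphas[u + 1], n))
    assert changes <= 1   # the bound established in the proof
```

The assertion holds for any increasing approximations, not just this seeded example: two changes of α ↾ n would force γ − γ_s > 2^{−n}, contradicting that γ's n-bit prefix had already settled.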
Corollary 9.7.3 (Downey, Hirschfeldt, Nies, and Stephan [117]). The K-degrees and C-degrees of left-c.e. reals both form uppersemilattices with join given by addition.
We will see in Section 9.8 that the analogs of Theorem 9.5.3 and Corollary 9.5.5 hold for K-reducibility and C-reducibility. Together with Corollary 9.5.9, they give the following result.
Theorem 9.7.4 (after Downey, Hirschfeldt, Nies, and Stephan [117]). The partial orders of K-degrees and C-degrees of left-c.e. reals are dense. Furthermore, in both the K-degrees and C-degrees of left-c.e. reals, every incomplete degree splits over every lesser degree, while the complete degree does not split at all.
As mentioned above, the fact that S-reducibility, cl-reducibility, and rK-reducibility imply Turing reducibility allows us to transfer certain results, such as the existence of minimal pairs, from the Turing case. This tool is not available for K-reducibility. We will see in Theorem 9.15.8 that minimal pairs of K-degrees do exist, but it is not known whether there are minimal pairs of K-degrees of left-c.e. (or even Δ^0_2) reals.
In Chapter 10, we will discuss the structure of the K-degrees and C-degrees of 1-random sets.
9.8 Density and splittings
In this section, we give a unified proof of the density and splitting theorems for various reducibilities considered above (Corollary 9.5.5 and Theorems 9.5.3, 9.6.10, and 9.7.4). Indeed, we will show that analogous results hold for all reducibilities with a few basic properties given in the following definition.
Definition 9.8.1. Let C0, C1, . . . be an effective list of all left-c.e. reals. A reducibility ≤r on the left-c.e. reals is Σ^0_3 if there is a total computable function Φ such that for all a, b, we have Ca ≤r Cb iff ∃k ∀m ∃n Φ(a, b, k, m, n). The reducibility ≤r is standard if ≤r is Σ^0_3, every computable real is r-reducible to any given left-c.e. real, addition is a join in the r-degrees of left-c.e. reals, and for any left-c.e. real α and any rational q > 0, we have α ≡r qα.
Standard reducibilities include Solovay reducibility, rK-reducibility, K-reducibility, and C-reducibility (but not cl-reducibility, as addition is not a join in the cl-degrees of left-c.e. reals).
We begin with the splitting theorem.
Theorem 9.8.2 (after Downey, Hirschfeldt, and Nies [116]). Let ≤r be a standard reducibility on the left-c.e. reals. Let γ <r α be left-c.e. reals such that Ω ≰S α. Then there are left-c.e. reals β0 and β1 such that γ ≤r β0, β1 <r α and β0 + β1 = α.
Proof. Fix computable approximations γ0 < γ1 < · · · → γ and α0 < α1 < · · · → α. Since α ≡r qα and γ ≡r qγ for every rational q > 0, we may assume, rescaling γ and adjusting the approximations if necessary, that 2(γ_s − γ_{s−1}) < α_s − α_{s−1} for all s. Let a be an index with Ca = α, and, by the Recursion Theorem, let b0 and b1 be indices for the reals β0 and β1 we build. For each i < 2 and e ∈ ω, we meet the requirement
Ri,e: ∃n ∀m ¬Φ(a, b_i, e, n, m),
and these requirements together ensure that α ≰r β_i. We split the increases of α between β0 and β1, and code γ into each of them by, at every stage s,
adding γ_s − γ_{s−1} to the current value of each β_j; in the limit, this action ensures that γ ≤S β_j, and hence γ ≤r β_j.
We will say that Ri,e is satisfied through n at stage s if ∀m < s ¬Φ(a, b_i, e, n, m). The strategy for Ri,e is to act whenever either it is not currently satisfied or the least number through which it is satisfied changes. Whenever this happens, Ri,e initializes R_{1−i,e}, which means that the amount of α − 2γ that R_{1−i,e} is allowed to funnel into β_i is reduced. More specifically, once R_{1−i,e} has been initialized for the mth time, the total amount that it is thenceforth allowed to put into β_i is reduced to 2^{−m}.
The above strategy guarantees that if R_{1−i,e} is initialized infinitely often then the amount put into β_i by R_{1−i,e} (which in this case is all that is put into β_i except for the coding of γ) adds up to a computable real. In other words, β_i ≡r γ <r α, and hence α ≰r β_i, so Ri,e is eventually permanently satisfied and initializes R_{1−i,e} only finitely often after all. It follows that there is a stage v after which R_{1−i,e} is allowed to put all of α − α_v into β_i.
In general, at a given stage s there will be several requirements, each with a certain amount that it wants (and is allowed) to direct into one of the β_j. We will work backwards, starting with the weakest priority requirement that we are currently considering. This requirement will be allowed to direct as much of α_s − α_{s−1} as it wants (subject to its current quota, of course). If any of α_s − α_{s−1} is left then the next weakest priority strategy will be allowed to act, and so on up the line.
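The backwards recursion just described is a quota-limited greedy distribution. Abstracting away the reals, it can be sketched as follows (names and the quotas list are ours, the quotas standing in for the terms 2^{−(j+m)} Ω_s − (τ^{i,e}_{s−1} − τ^{i,e}_t) of the construction below):

```python
def distribute(delta, quotas):
    """Split the stage increment delta among requirements, working
    backwards from the weakest priority (last entry of quotas); each
    requirement takes as much as its remaining quota allows, and any
    surplus is handed to the strongest side (index 0)."""
    taken = [0] * len(quotas)
    remaining = delta
    for j in range(len(quotas) - 1, -1, -1):   # weakest priority first
        amt = min(remaining, quotas[j])
        taken[j] = amt
        remaining -= amt
        if remaining == 0:
            break
    taken[0] += remaining   # leftover is dumped on one side, as in step 3
    return taken
```

Every unit of delta is placed somewhere, which is what guarantees β0 + β1 = α in the construction.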
We now proceed with the full construction. We give Ri,e stronger priority than R_{i′,e′} if 2e + i < 2e′ + i′. Recall that we say that Ri,e is satisfied through n at stage s if ∀m < s ¬Φ(a, b_i, e, n, m). Let n^{i,e}_s be the least n through which Ri,e is satisfied at stage s, if such an n exists, and let n^{i,e}_s = ∞ otherwise. We say that Ri,e requires attention at stage s if either n^{i,e}_s = ∞ or n^{i,e}_s ≠ n^{i,e}_{s−1}. If Ri,e requires attention at stage s then we say that each requirement of weaker priority than Ri,e is initialized at stage s. Each requirement Ri,e has associated with it a left-c.e. real τ^{i,e}, which records the amount put into β_{1−i} for the sake of Ri,e.
We decide how to distribute δ = α_s − α_{s−1} between β0 and β1 at stage s as follows.
1. Let j = s and ε = 2(γ_s − γ_{s−1}), and add γ_s − γ_{s−1} to the current value of each β_i.
2. Let i < 2 and e ∈ ℕ be such that 2e + i = j. Let m be the number of times Ri,e has been initialized and let t be the last stage at which Ri,e was initialized (or t = 0 if there has been no such stage). Let
ζ = min(δ − ε, 2^{−(j+m)} Ω_s − (τ^{i,e}_{s−1} − τ^{i,e}_t)).
Add ζ to ε and to the current values of τ^{i,e} and β_{1−i}.
3. If ε = δ or j = 0 then add δ − ε to the current value of β0 and end the stage. Otherwise, decrease j by one and go to step 2.
This completes the construction. Clearly, γ ≤r β_i ≤r α for i = 0, 1 and β0 + β1 = α. We now show by induction that each requirement initializes requirements of weaker priority only finitely often and is eventually satisfied. Assume by induction that Ri,e is initialized only finitely often. Let j = 2e + i, let m be the number of times Ri,e is initialized, and let t be the last stage at which Ri,e is initialized. The following are clearly equivalent.
1. Ri,e is satisfied.
2. lim_s n^{i,e}_s exists and is finite.
3. Ri,e eventually stops requiring attention.
Assume for a contradiction that Ri,e requires attention infinitely often. Since Ω ≰S α, part (ii) of Lemma 9.5.7 implies that there are v > u > t such that for all w > v we have 2^{−(j+m)}(Ω_w − Ω_u) > α_w − α_u. Furthermore, by the way the amount ζ added to τ^{i,e} at a given stage is defined in step 2 of the construction, τ^{i,e}_u − τ^{i,e}_t ≤ 2^{−(j+m)} Ω_u and τ^{i,e}_{w−1} − τ^{i,e}_u ≤ α_{w−1} − α_u.
Thus for all w > v,
α_w − α_{w−1} = α_w − α_u − (α_{w−1} − α_u)
 < 2^{−(j+m)}(Ω_w − Ω_u) − (α_{w−1} − α_u)
 = 2^{−(j+m)} Ω_w − (2^{−(j+m)} Ω_u + α_{w−1} − α_u)
 ≤ 2^{−(j+m)} Ω_w − (τ^{i,e}_u − τ^{i,e}_t + τ^{i,e}_{w−1} − τ^{i,e}_u)
 = 2^{−(j+m)} Ω_w − (τ^{i,e}_{w−1} − τ^{i,e}_t).
Thus, after stage v, the reverse recursion performed at each stage never gets past j, and hence everything put into β_i after stage v is put in either to code γ or for the sake of requirements of weaker priority than Ri,e. Let τ be the sum of all τ^{1−i,e′} such that R_{1−i,e′} has weaker priority than Ri,e. Let s_l > t be the lth stage at which Ri,e requires attention. If R_{1−i,e′} is the pth requirement on the priority list and p > j then τ^{1−i,e′} − τ^{1−i,e′}_{s_l} ≤ 2^{−(p+l)} Ω. Thus τ − τ_{s_l} ≤ Σ_{p≥1} 2^{−(p+l)} Ω = 2^{−l} Ω ≤ 2^{−l}, and hence τ is computable. Putting together the results of the previous two paragraphs, we see that β_i ≡r γ. Since α ≰r γ, we have α ≰r β_i. It now follows that there is an n ∈ ω such that Ri,e is eventually permanently satisfied through n, and such that Ri,e is eventually never satisfied through any n′ < n. Thus lim_s n^{i,e}_s exists and is finite, and hence Ri,e is satisfied and eventually stops requiring attention.
We now prove the upwards density theorem.
Theorem 9.8.3 (after Downey, Hirschfeldt, and Nies [116]). Let ≤r be a standard reducibility on the left-c.e. reals. Let γ be a left-c.e. real such that Ω ≰r γ. Then there is a left-c.e. real β such that γ <r β and Ω ≰r β.
Proof. Let a and c be indices with Ca = γ and Cc = Ω, and, by the Recursion Theorem, let b be an index for the real β we build. For each e, we meet the requirements
Re: ∃n ∀m ¬Φ(b, a, e, n, m) and Se: ∃n ∀m ¬Φ(c, b, e, n, m),
which together ensure that β ≰r γ and Ω ≰r β. We code γ into β by adding (γ_s − γ_{s−1})/2 to β at each stage s, and the strategy for Re is to add a fixed fraction of Ω into β. Whenever either Se is not currently satisfied or the least number through which it
is satisfied changes, Se initializes Re, which means that the fraction of Ω that Re is allowed to put into β is reduced. As in the previous proof, if Se is not eventually permanently satisfied through some n then the amount put into β by Re is computable, and hence β ≡r γ, which, as before, implies that there is a stage after which Se is permanently satisfied through some n and never again satisfied through any n′ < n. Once this stage has been reached, Re is free to code a fixed fraction of Ω into β, and hence it too succeeds.
We now proceed with the full construction. For e < e′, we give a requirement Xe stronger priority than a requirement Ye′ (for X, Y ∈ {R, S}). We also give Re stronger priority than Se. As mentioned above, we say that Re is satisfied through n at stage s if ∀m < s ¬Φ(b, a, e, n, m). Similarly, we say that Se is satisfied through n at stage s if ∀m < s ¬Φ(c, b, e, n, m). For a requirement Xe, let n^{Xe}_s be the least n through which Xe is satisfied at stage s, if such an n exists, and let n^{Xe}_s = ∞ otherwise. We say that the requirement Xe requires attention at stage s if either n^{Xe}_s = ∞ or n^{Xe}_s ≠ n^{Xe}_{s−1}.
At stage s, proceed as follows. First add (γ_s − γ_{s−1})/2 to the current value of β. If no requirement requires attention at stage s then end the stage. Otherwise, let Xe be the strongest priority requirement requiring attention at stage s. We say that Xe acts at stage s. If X = S then initialize all weaker priority requirements and end the stage. If X = R then let m be the number of times that Re has been initialized. If s is the first stage at which Re acts after the last time it was initialized (or is the very first stage at which Re acts), then let t be the last stage at which Re was initialized (or let t = 0 if there has been no such stage), and otherwise let t be the last stage at which Re acted. Add 2^{−(e+m+3)}(Ω_s − Ω_t) to the current value of β and end the stage. This completes the construction.
Since β is bounded by γ/2 + Σ_e 2^{−(e+2)} Ω = (γ + Ω)/2, it is a well-defined left-c.e. real. Furthermore, γ ≤r β. We now show by induction that each requirement initializes requirements of weaker priority only finitely often and is eventually satisfied. Assume by induction that there is a stage u such that no requirement of stronger priority than Xe requires attention after stage u. The following are clearly equivalent.
1. Xe is satisfied.
2. lim_s n^{Xe}_s exists and is finite.
3. Xe eventually stops requiring attention.
4. Xe acts only finitely often.
First suppose that X = R. Let m be the number of times that Re is initialized. (Since Re is not initialized at any stage after stage u, this number is finite.) Suppose that Re acts infinitely often. Then the total amount added to β for the sake of Re after its last initialization is 2^{−(e+m+3)}(Ω − Ω_t), where t is the last stage at which Re was initialized, and hence β ≡r Ω ≰r γ.
Thus there is an n such that Re is eventually permanently satisfied through n, and such that Re is eventually never satisfied through any n′ < n. Thus lim_s n^{Re}_s exists and is finite, and hence Re is satisfied and eventually stops requiring attention.
Now suppose that X = S and Se acts infinitely often. If v > u is the mth stage at which Se acts then the total amount added to β after stage v for purposes other than coding γ is bounded by Σ_i 2^{−(i+m+2)} Ω < 2^{−m+1}. It follows that β ≡r γ, and hence Ω ≰r β, since Ω ≰r γ. Thus there is an n such that Se is eventually permanently satisfied through n, and such that Se is eventually never satisfied through any n′ < n. So lim_s n^{Se}_s exists and is finite, and hence Se is satisfied and eventually stops requiring attention.
Combining Theorems 9.8.2 and 9.8.3, we have the following result.
Corollary 9.8.4 (after Downey, Hirschfeldt, and Nies [116]). Let ≤r be a standard reducibility on the left-c.e. reals that is at least as strong as Solovay reducibility. Then the r-degrees of left-c.e. reals are dense.
9.9 Monotone degrees and density
One notable reducibility missing from the list above is monotone reducibility, introduced in Section 4.5. Certainly this is a Σ^0_3 reducibility for which the degree of Ω is the top degree among left-c.e. reals and the computable sets form the bottom degree. However, it is not known whether addition induces a join on this degree structure.
Open Question 9.9.1. Does addition induce a join on the monotone degrees of left-c.e. reals? We conjecture that the answer is no.
In spite of this lack of knowledge, there is still a downward density theorem for monotone reducibility.
Theorem 9.9.2 (Calhoun [46]). The monotone degrees of left-c.e. reals are downward dense, meaning that if b is a nonzero Km-degree of left-c.e. reals then there is a Km-degree of left-c.e. reals a with b > a > 0.
The proof uses the following useful lemma, which says that we can use simple permitting in this case.
Lemma 9.9.3 (Calhoun [46]). Suppose that A = lim_s A_s and B = lim_s B_s are monotonic approximations to left-c.e. reals A and B and f is a computable function such that B_s ↾ n = B ↾ n ⇒ A_{f(s)} ↾ n = A ↾ n for all s and n. Then A ≤Km B.
Proof. Let U be a universal monotone machine. We build a monotone machine M by letting M(σ) = A_{f(s)} ↾ n whenever U(σ) = B_s ↾ n and defining M(σ) in this way does not violate the monotonicity of M. By the hypothesis of the lemma, for all n, if U(σ) = B ↾ n then M(σ) = A ↾ n.
Proof of Theorem 9.9.2. The argument is a finite injury one. We are given a noncomputable left-c.e. real B = lim_s B_s and build a noncomputable left-c.e. real A ≤Km B such that B ≰Km A. For each e, the requirement R2e works to defeat the constant e in a potential reduction B ≤Km A, while R2e+1 ensures that A ≠ We; the requirements control blocks of A's bits delimited by movable markers n(j, s). At stage s, the requirement R2e will have control of A ↾ [n(2e − 1, s), n(2e, s)). At stage s + 1, if we see Km_s(B_s ↾ n(2e, s)) ≤ Km_s(A_s ↾ n(2e, s)) + e, then we allow R2e to assert control of the next position by setting n(2e, s + 1) = n(2e, s) + 1. Notice that, once R2e has priority, this can happen only finitely often, lest B be computable.
To meet R2e+1, we use a simple permitting argument. Once R2e+1 has priority, if it has control of position n, when we see that A_s ↾ n(2e + 1, s) = W_{e,s} ↾ n(2e + 1, s) we set n(2e + 1, s + 1) = n(2e + 1, s) + 1, and if ever B_t permits n(2e + 1, s) then we can make a disagreement in the usual way by changing A_t(n(2e + 1, s)). By Lemma 9.9.3, it cannot be the case that R2e+1 asks for permission infinitely often but never receives it, since in that case A would be computable but we would also have B ≤Km A, which would make B computable.
Open Question 9.9.4. Are the Km-degrees of left-c.e. reals dense?
9.10 Further relationships between S-, cl-, and rK-reducibilities
It follows from Theorem 9.1.6 that S-reducibility does not imply cl-reducibility on the left-c.e. reals. The following result shows that S-reducibility and cl-reducibility are in fact incomparable on the left-c.e. reals.
Theorem 9.10.1 (Downey, Hirschfeldt, and LaForte [113]). There exist left-c.e. reals α ≤cl β such that α ≰S β. Moreover, α can be chosen to be strongly c.e.
Proof. We must build α and β so that α ≤cl β and α is strongly c.e., while satisfying the following requirement for each e.
Re: ∃q ∈ ℚ (q < β ∧ (Φe(q) ↓ < α ⇒ e(β − q) < α − Φe(q))).
These requirements suffice because any partial computable function is equal to Φe for infinitely many e.
We discuss the strategy for a single requirement Re. Let k be such that e ≤ 2^k. We must make the difference between β and some rational q < β quite small while making the difference between α and Φe(q) relatively large. At a stage t we pick a fresh large number d. For the sake of Re, we will control the first d + k + 2 many bits of β and α. We set β_t(n) = 1 for all n with d ≤ n ≤ d + k + 1, while at the same time keeping α_t(n) = 0 for all such n. We let q = β_t. Note that, since we are restraining the first d + k + 2 bits of β, we know that, unless this restraint is lifted, β − β_t ≤ 2^{−(d+k+2)}.
We now need do nothing until we come to a stage s ≥ t such that Φ_{e,s}(q) ↓ < α_s and α_s − Φ_{e,s}(q) ≤ 2^{−(d+2)}. Our action then is the following. First we add 2^{−(d+k+1)} to β_s. Then we again restrain β ↾ (d + k + 2). Assuming that this restraint is successful, e(β − q) ≤ 2^{−(d+2)} + 2^{−(d+1)} < 2^{−d}. Finally we win by our second action, which is to add 2^{−d} to α. Then α − α_s ≥ 2^{−d}, so α − Φe(q) ≥ 2^{−d} > e(β − q), as required.
The theorem now follows by a simple application of the finite injury priority method. When we add 2^{−(d+k+1)} to β at stage s, since β_s(n) = 1 for all n with d ≤ n ≤ d + k + 1, bit d − 1 of β changes from 0 to 1. On the α side, when we add 2^{−d} at stage s, the only change is that bit d − 1 of α changes from 0 to 1. Hence we keep α ≤cl β (with constant 0). It is also clear that α is strongly c.e.
On the other hand, S-reducibility and cl-reducibility do coincide on c.e. sets, as the following results show.
Theorem 9.10.2 (Downey, Hirschfeldt, and LaForte [113]). Let α be a left-c.e. real and β be a strongly c.e. real. If α ≤cl β then α ≤S β.
Proof. Let A be an almost c.e. set and B be a c.e.
set such that there is a reduction Γ^B = A with use n + c. Let α = 0.A and β = 0.B. We may assume that we have chosen approximations of A and B such that Γ^B(n)[s] = A_s(n) for all s and n ≤ s. We may also assume that if n enters A at stage s then n ≤ s. If n enters A at stage s then some number less than or equal to n + c must enter B at stage s. Since B is c.e., it follows that β_s − β_{s−1} ≥ 2^{−(n+c)}. But n entering A corresponds to a change of at most 2^{−n} in the value of α, so β_s − β_{s−1} ≥ 2^{−c}(α_s − α_{s−1}). Thus for all s we have α − α_s ≤ 2^c(β − β_s), and hence α ≤S β.
Theorem 9.10.3 (Downey, Hirschfeldt, and LaForte [113]). Let α be a strongly c.e. real and β be a left-c.e. real. If α ≤S β then α ≤cl β.
Proof. Let α and β satisfy the hypotheses of the theorem. Let the computable function f and the constant k be such that α − α_{f(s)} < 2^{k−1}(β − β_s) for all s. To compute α ↾ n using β ↾ (n + k), find the least stage s such that β_s ↾ (n + k) = β ↾ (n + k). Then β − β_s < 2^{−(n+k)}, so α − α_{f(s)} < 2^{k−1} 2^{−(n+k)} = 2^{−(n+1)}. Since α is strongly c.e., we must have α ↾ n = α_{f(s)} ↾ n, since any change in the value of α ↾ n adds at least 2^{−(n+1)} to the value of α.
Corollary 9.10.4 (Downey, Hirschfeldt, and LaForte [113]). For c.e. sets A and B, we have A ≤S B iff A ≤cl B.
Thus cl-reducibility can be a particularly useful tool in studying c.e. sets. For example, we have seen that there is a largest S-degree of left-c.e. reals, but we now show that there is no largest S-degree of c.e. sets.
Theorem 9.10.5 (Downey, Hirschfeldt, and LaForte [113]). Let A be a c.e. set. There is a c.e. set that is not cl-below A, and hence is not S-below A.
Proof. The argument is a finite injury construction, but is nonuniform, in the sense that we build two c.e. sets B and C, one of which is not cl-below A. Let Γe be Φe with use restricted to n + e on input n. We satisfy the following requirements.
Re,i: Γ^A_e ≠ B ∨ Γ^A_i ≠ C.
These requirements suffice because if B ≤cl A then there is an e such that Γ^A_e = B, in which case the requirements of the form Re,i ensure that C ≰cl A.
The idea for satisfying a single requirement Re,i is simple. Let
l(e, i, s) = max{n : ∀m ≤ n (Γ^A_e(m)[s] = B_s(m) ∧ Γ^A_i(m)[s] = C_s(m))}.
Pick a fresh large number k and let Re,i assert control over the interval [k, 3k] in both B and C, waiting until a stage s such that l(e, i, s) > 3k. First work with C. Put 3k into C, and wait for the next stage t where l(e, i, t) > 3k. Note that some number must enter A_t \ A_s below 3k + i. Now repeat with 3k − 1, then 3k − 2, and so on. In this way, 2k many numbers are made to enter A below 3k + i. Now we can win using B, by repeating the process and noticing that, since k is larger than e and i, we cannot have 2k many numbers entering A below 3k + i and 2k many other numbers entering A below 3k + e. The theorem now follows by a standard application of the finite injury method.
Since both S-reducibility and cl-reducibility imply rK-reducibility, it follows from Theorems 9.1.6 and 9.10.1 that rK-reducibility does not imply either of them on left-c.e. reals. We now show that this is the case even for c.e. sets.
Theorem 9.10.6 (Downey, Hirschfeldt, and LaForte [113]). There exist c.e. sets A and B such that A ≤rK B but A ≰cl B.
Proof. Let Γe be Φe with use restricted to n + e on input n. We build c.e. sets A ≤rK B to satisfy the following requirements.
Re: Γ^B_e ≠ A.
The construction is a standard finite injury argument. We discuss the satisfaction of a single requirement Re. For the sake of this requirement, we choose a fresh large n, restrain n from entering A, and restrain n + e from entering B. If we find a stage s such that Γ^B_e(n)[s] ↓ = 0 then we put n into A, put n + e into B, and restrain the initial segment of B of length n + e. Unless a higher priority strategy acts at a later stage, this action guarantees that Γ^B_e(n) ≠ A(n).
We ensure that the followers picked by different requirements are far enough apart that for each m there is at most one e such that Re has a follower n < m but n + e ≥ m. Thus, if B ↾ m = B_s ↾ m then A ↾ m can change at most once after stage s. By Theorem 9.6.7, A ≤rK B.
One further difference between cl-reducibility and S-reducibility is that if α is a noncomputable left-c.e. real, then there is a noncomputable c.e. set cl-below α, while there may not be any noncomputable c.e. set S-below α.
Proposition 9.10.7 (Downey, Hirschfeldt, and LaForte [113]). Let α be a left-c.e. real. Then there is a c.e. set B such that B ≤cl α and α ≤tt B.
Proof. At stage s + 1, for each n, if α_s(n) ≠ α_{s+1}(n) then put ⟨n, i⟩ in B for the least i such that ⟨n, i⟩ ∉ B_s. If α_s ↾ m = α ↾ m then B_s ↾ m = B ↾ m, since ⟨n, i⟩ ≥ n for all n and i. Thus B ≤cl α. There are at most 2^n many i such that ⟨n, i⟩ ∈ B, and α(n) = 0 iff there are an even number of such i. Thus α ≤tt B.
Theorem 9.10.8 (Downey, Hirschfeldt, and LaForte [113]). There is a noncomputable left-c.e. real α such that all c.e. sets S-below α are computable.
Proof. Recall that a prefix-free c.e. set A ⊂ 2^{<ω} is a presentation of α if Σ_{σ∈A} 2^{−|σ|} = α. By Corollary 5.3.14, there is a noncomputable left-c.e. real α such that every presentation of α is computable. Let β ≤S α be strongly c.e. Then there is a positive q ∈ ℚ and a left-c.e. real γ such that α = qβ + γ. Let k be such that 2^{−k} ≤ q and let δ = γ + (q − 2^{−k})β. Then δ is a left-c.e. real such that α = 2^{−k}β + δ. Let b0, b1, . . . and d0, d1, . . . be computable sequences of natural numbers such that 2^{−k}β = Σ_i 2^{−b_i} and δ = Σ_i 2^{−d_i}. Since β is strongly c.e., so is 2^{−k}β, and hence we can choose b0, b1, . . . to be pairwise distinct, so that the nth bit of β is 1 iff n + k = b_i for some i.
By the KC Theorem, since Σ_i 2^{−b_i} + Σ_i 2^{−d_i} = α < 1, there is a prefix-free c.e. set A with an enumeration σ0, σ1, . . . such that |σ_{2i}| = b_i and |σ_{2i+1}| = d_i for all i. Now Σ_{σ∈A} 2^{−|σ|} = α, so A is a presentation of α, and hence is computable. Now we can compute β as follows. Given n, run through all strings of length n + k in A and determine whether any of them is σ_{2i} for some i. If so, then β(n) = 1, while otherwise, β(n) = 0.
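The KC Theorem invoked here is the effective version of Kraft's inequality. Its offline content, that from lengths with Σ 2^{−l_i} ≤ 1 one can pick codewords of exactly those lengths forming a prefix-free set, can be sketched as follows (function name ours; the full KC Theorem handles requests arriving one at a time, which needs more careful interval bookkeeping):

```python
from fractions import Fraction

def kraft_code(lengths):
    """Assign binary codewords with the given lengths so that no codeword
    is a prefix of another, assuming the sum of 2^-l is at most 1.
    Handling lengths in nondecreasing order keeps the cursor aligned."""
    assert sum(Fraction(1, 2 ** l) for l in lengths) <= 1
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    cursor = Fraction(0)
    codes = [None] * len(lengths)
    for i in order:
        l = lengths[i]
        numerator = cursor * 2 ** l   # an integer, thanks to the sorting
        codes[i] = format(int(numerator), "b").zfill(l)
        cursor += Fraction(1, 2 ** l)
    return codes
```

For instance, kraft_code([2, 1, 2]) returns ['10', '0', '11'], a prefix-free set with the requested lengths.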
9.11 A minimal rK-degree
In this section, we will prove the following result.
Theorem 9.11.1 (Raichev and Stephan [321]). There is a minimal rK-degree.
This result is interesting because rK is one of the few measures of relative randomness for which we know whether or not there is a minimal degree. The proof below uses the fact that rK is reasonably well-behaved on very sparse sets. Merkle and Stephan [270] have exploited this idea of using sparse sets to establish results about measures of relative randomness quite fruitfully, as we will see in Section 9.15.
Proof of Theorem 9.11.1. We build a Π^0_1 class [T] for a computable tree T with no computable paths. The tree T will have the property that for all total functionals Φ, for all X ∈ [T], there is a string σ ≺ X such that one of the following holds:
(i) Φ^X and Φ^Y are compatible for all Y ∈ [T] extending σ, or
(ii) Φ^Z and Φ^Y are incompatible for all Z ≠ Y extending σ in [T].
We make the set S of splitting nodes of [T] computably sparse. That is, for all computable functions g,
∀^∞ σ ∈ S ∀τ ∈ S (σ ≺ τ → g(|σ|) < |τ|).
The proof uses movable markers. For each string ν, at each stage s the marker m_ν rests on a splitting node of T_s. At stage 0, we have T0 = 2^{<ω} and m_ν = ν. At each stage s we will prune T_s to make T_{s+1}. The basic action is called procedure CUT(σ, τ). This procedure can be invoked for σ ≺ τ. It prunes all paths that extend m_{σ′} for some σ′ with σ ⪯ σ′ ≺ τ but do not extend m_τ. Then all markers are moved accordingly. That is, we move m_σ to m_τ, and m_{σν} to m_{τν}.
The construction at stage s works for nodes of length at most s and has the following actions.
(a) If there is a σ, an i < 2, and an e ≤ |σ| with Φe(x)[s] = m_{σi}(x) for all x ≤ |σ|, then invoke CUT(σ, σ(1 − i)).
438
9. Measures of Relative Randomness
(b) If there are σ, δ, ν, and e ≤ |σ| such that Φ_e^{m_{σ0}}[s] and Φ_e^{m_{σ1}}[s] are compatible for all arguments y ≤ |σ|, but Φ_e^{m_{σ0δ}}[s] and Φ_e^{m_{σ1ν}}[s] are incompatible at some argument y, then invoke CUT(σ0, σ0δ) and CUT(σ1, σ1ν).
(c) Finally, if there are σ, τ, ν, and e ≤ |σ| with σ ≺ τ ≺ ν and |m_τ| ≤ Φe(|m_σ|)[s] < |m_ν|, then invoke CUT(τ, ν).
It is easy to show that all the markers stop moving at some stage. Additionally, at each stage the tree T_s is perfect, and hence T = ∩_s T_s is a perfect tree. By (a), [T] has no computable members, and by (b) and (c), for all X ∈ [T], there is a string σ ≺ X satisfying either (i) or (ii) above, and the splitting nodes of T are computably sparse.
Now let A be a path of [T] of hyperimmune-free degree, obtained using the Hyperimmune-Free Basis Theorem 2.19.11. We claim that A has minimal rK-degree. Suppose that ∅ <T B ≤rK A. Then B ≤T A also and hence, as A is of hyperimmune-free degree, B ≤tt A. Let Φ^A = B be the witnessing truth table reduction with computable use ϕ. We show that A ≤rK B. Since B is not computable, (ii) must hold for Φ. Let σ be the string mentioned in (ii). We need the following lemma.
Lemma 9.11.2. For almost all n and almost all stages t, the tree T_t has at most two extensions of σ of length n with extensions in T_t that map to B ↾ n under Φ.
Proof. For k > |σ|, let f(k) be the first stage s such that for all strings ν on T_s of length ϕ(s) with A ↾ k⌢(1 − A(k)) ≺ ν, there exists x ≤ s with Φ^ν(x) ↓ ≠ Φ^A(x). Let f(k) = 1 for k ≤ |σ|. The function f is total, since otherwise there is a k such that for each s there is a string ν_s on T_s of length ϕ(s) with A ↾ k⌢(1 − A(k)) ≺ ν_s and Φ^{ν_s}(x) = Φ^A(x) for all x ≤ s. Then Y = lim inf_s ν_s ∈ [T] is distinct from A and Φ^A = Φ^Y. However, (ii) holds as B is noncomputable, so this is a contradiction.
Notice that f is A-computable, and hence, as A is hyperimmune-free, there is a computable function g majorizing f. Now choose n larger than |σ|, the length at which T becomes g-sparse, and the length of the first splitting node of A on T. Let τ be the last splitting node on A ↾ n and let ν ≺ τ be any splitting node extending σ. Then by sparseness, we know that for s = f(|ν|), we have s ≤ g(|ν|) < |τ| ≤ n.
Thus by stage s, every ρ ∈ T_s extending A↾(|ν| − 1)⌢(1 − A(|ν|)) = ν⌢(1 − A(|ν|)) will have an argument x ≤ s with Φ^ρ(x) ≠ Φ^A(x) = B(x). Therefore ρ cannot map to B↾n under Φ. As ν is an arbitrary splitting node of T below the last splitting node of A↾n, only strings extending the last splitting node of A↾n can map to B↾n under Φ.
To complete the proof, given B↾n, run through the computable approximations T_s until a sufficiently large stage t is found at which T_t has at most two extensions of σ that can map to B↾n under Φ. The stage t exists by Lemma 9.11.2. We can find these extensions effectively from B↾n because Φ is a tt-reduction. Of the (at most) two strings found, one will be A↾n. This is an rK-reduction.

Raichev and Stephan [321] established various facts about minimal rK-degrees. For example, they showed that there are 2^{ℵ_0} many of them, as the hyperimmune-free basis theorem shows that every Π^0_1 class with no computable members has 2^{ℵ_0} many hyperimmune-free members. Additionally, they showed that minimal rK-degrees have fairly low initial segment complexity.

Theorem 9.11.3 (Raichev and Stephan [321]). If A has minimal rK-degree, then for any computable order g, and for Q either C or K, Q(A↾n) ≤ Q(n) + g(n) + O(1).

Proof. Given any computable strictly increasing function h, define the h-dilution A_h of A by A_h(h(n)) = A(n) and A_h(k) = 0 for k ∉ rng h. Notice that A_h ≤_S A ≤_T A_h, and hence A_h ≡_rK A if A has minimal rK-degree. To describe A_h↾n we need only Q(n) plus k bits of information, where k is the number of elements of rng h that are less than n. Since A ≤_rK A_h, the same is true of A, up to a constant. Thus, given any g, it is easy to choose a fast enough growing h to make Q(A↾n) ≤ Q(n) + g(n) + O(1).

This result shows that if A has minimal rK-degree then it is far from random. By the Kučera–Gács Theorem 8.3.2, we know that there is a 1-random R with A ≤_wtt R. Now choose an h growing much faster than the use of the reduction A ≤_wtt R. Then A_h ≤_S R, where A_h is as in the above proof, and hence we have the following result.

Theorem 9.11.4 (Raichev and Stephan [321]). Every real of minimal rK-degree is rK-reducible to a 1-random real.

Raichev and Stephan also showed that there are 1-random reals with no minimal rK-degrees below them.
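The counting behind the proof of Theorem 9.11.3 above can be made explicit (a sketch; the particular choice of h below is one convenient option, not dictated by the text):

```latex
\[
  Q(A_h \upharpoonright n) \;\le\; Q(n) + |\{m : h(m) < n\}| + O(1),
\]
% since every position of A_h outside rng h is 0, a description of n together
% with the bits A(m) for the positions h(m) < n suffices to recover
% A_h \upharpoonright n. Given a computable order g, taking
% h(m) = m + \min\{j : g(j) > m\} makes h strictly increasing with
% |\{m : h(m) < n\}| \le g(n); since A \le_{rK} A_h, the same bound holds for
% Q(A \upharpoonright n) up to an additive constant.
```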
9.12 Initial segment complexity and completeness for left-c.e. reals

Although K-reducibility is far from implying cl-reducibility, even on left-c.e. reals, if the prefix-free complexity of the initial segments of a left-c.e. real α is much greater than that of the corresponding initial segments of a left-c.e. real β, then we do have β ≤_cl α.
9. Measures of Relative Randomness
Theorem 9.12.1 (Downey, Hirschfeldt, and LaForte [113]). Let α and β be left-c.e. reals such that lim_n K(α↾n) − K(β↾n) = ∞. Then β ≤_cl α.
9.13 cl-reducibility and the Kučera–Gács Theorem

Recall that the Kučera–Gács Theorem 8.3.2 states that every set is wtt-reducible to a 1-random set. In this section, we show that this wtt-reduction cannot in general be improved to a cl-reduction. In other words, the use of the reduction in the Kučera–Gács Theorem cannot be made to be n + O(1). Indeed, there is a ∅′-computable set that is not cl-reducible to any complex set. (Recall that a set A is complex if there is a computable order h such that K(A↾n) ≥ h(n) for all n.)

Lemma 9.13.1 (Downey and Hirschfeldt [unpublished]). Let Γ be a cl-reduction with use bounded by x + c, let σ ∈ 2^{<ω}, and let f be a computable order. There exists a τ ≽ σ such that for every μ ∈ 2^{|τ|+c}, if Γ^μ = τ then K(μ) < f(|μ|). Furthermore, τ can be found ∅′-computably.

Proof. It is enough to prove the existence of such a τ, since then we can ∅′-computably search for one. We assume the usual convention that if Γ^μ = ν and ν′ ≺ ν then there is a μ′ ≼ μ such that Γ^{μ′} = ν′. Let h(ν) be the number of strings μ ∈ 2^{|ν|+c} such that Γ^μ = ν. Then h(ν0) + h(ν1) ≤ 2h(ν), so h is an integer-valued supermartingale. Let σ′ ≽ σ be such that h(σ′) is minimal among all extensions of σ. Then h(σ′0) = h(σ′), since h(σ′0) ≥ h(σ′) by the choice of σ′, and h(σ′0) ≤ h(σ′) since otherwise h(σ′1) < h(σ′). Similarly, h(σ′1) = h(σ′). Proceeding this way by induction, we see that h(ν) = h(σ′) for all ν ≽ σ′. Thus there is a k such that if Γ^μ = ν for ν ≽ σ′, then K(μ) ≤ K(ν) + k. Let τ ≽ σ′ be such that K(τ) < f(|τ| + c) − k. It is easy to see that such a τ exists, because K(σ′0^n) ≤ K(n) + O(1) and K(n) has no unbounded computable lower bound, so that we can take τ = σ′0^n for a suitable n. Then τ has the desired properties.
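The supermartingale inequality in the proof can be spelled out (a routine check, relying only on the convention stated there and the use bound x + c):

```latex
\[
  h(\nu 0) + h(\nu 1)
  \;\le\; 2\,|\{\mu \in 2^{|\nu|+c} : \Gamma^{\mu} = \nu\}|
  \;=\; 2\,h(\nu),
\]
% each \mu' \in 2^{|\nu|+1+c} with \Gamma^{\mu'} \in \{\nu 0, \nu 1\} has, by the
% convention, a prefix \mu computing \nu, which by the use bound may be taken of
% length |\nu| + c; and each such \mu has exactly two one-bit extensions in
% 2^{|\nu|+1+c}.
```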
Theorem 9.13.2 (Downey and Hirschfeldt [unpublished]). There is an A ≤_T ∅′ that is not cl-reducible to any complex set.

Proof. Let Γ_0, Γ_1, … be an effective listing of the cl-reductions, with the use of Γ_n bounded by x + n. Let f_0, f_1, … be a ∅′-effective listing of the computable orders. Let σ_0 = λ. Given σ_⟨n,e⟩, by Lemma 9.13.1 we can ∅′-computably find a τ ≽ σ_⟨n,e⟩ such that for every μ ∈ 2^{|τ|+n}, if Γ_n^μ = τ then K(μ) < f_e(|μ|). Let σ_{⟨n,e⟩+1} = τ. Let A = ⋃_i σ_i. Then A ≤_T ∅′. If A ≤_cl X then there are infinitely many n such that Γ_n^X = A. For any such n and any computable order f_e, let k = |σ_{⟨n,e⟩+1}|. Then K(X↾(k + n)) < f_e(k + n). Thus X is not complex.
9.14 Further properties of cl-reducibility

This section could be called "When reducibilities go bad". We will see that, in spite of a number of nice features mentioned in the previous section, cl-reducibility also has some undesirable properties.
9.14.1 cl-reducibility and joins

By Theorem 9.5.2 and the note following it, the cl-degrees of left-c.e. reals do not form a lower semilattice. That is a common feature of many reducibilities. However, the cl-degrees of left-c.e. reals also do not form an upper semilattice; that is, there is no join operation for them. This result was first proved directly by Downey, Hirschfeldt, and LaForte in [113], but follows from the proof of the following result of Yu and Ding [414].

Theorem 9.14.1 (Yu and Ding [414]). There is no cl-complete left-c.e. real.

Actually, they proved something even stronger:

Theorem 9.14.2 (Yu and Ding [414]). There are two left-c.e. reals α_0 and α_1 such that there is no left-c.e. real β with α_i ≤_cl β for i = 0, 1.

Proof. The proof of this theorem has gone through a number of simplifications, particularly in the induction. The first significant streamlining was due to Barmpalias and Lewis [246]. The proof we will follow is due to Barmpalias, Downey, and Greenberg [21].

The main idea of the construction is that if β is a left-c.e. real that cl-computes both α_0 and α_1, then alternately adding little bits to α_0 and α_1 (drip-feeding them, as it were) is sufficient to drive β to be too large. Intuitively, if a real β computes another real γ with use x for every x, then as soon as γ changes at position x, the real β must change at a position less than or equal to x. That is, if γ can be computed with oracle β and use x, then β is not less than γ. So if there were a largest c.e. cl-degree β, we could select two reals α_0 and α_1 and change them alternately to drive β to be very large. Furthermore, we can reserve parts of the reals for this purpose, to work independently for each requirement.

In more detail, suppose that Γ_0 and Γ_1 are cl-reductions and that β is a left-c.e. real. For all such triples, we need to meet the requirement

R_{Γ_0,Γ_1,β}: Either Γ_0^β ≠ α_0 or Γ_1^β ≠ α_1.
For simplicity of presentation, we will take the reductions Γ_i to be ibT-reductions, but this assumption is inessential to the proof below; in the general case, the only change is that longer intervals such as [k, k + 2^{k+c}]
instead of [k, k + 2^k] would be needed.⁴ We view left-c.e. reals both as infinite binary sequences and as the corresponding elements of the Euclidean interval [0, 1] (via binary expansions). Addition means the usual arithmetic addition.

To meet requirement R = R_{Γ_0,Γ_1,β}, we describe a family of modules, each indexed by an interval of natural numbers. Let a ≤ b. A stage s is expansionary for the [a, b)-module (for R) if for both i = 0, 1 we have Γ_i^{β_s}[s] ⊇ α_{i,s}↾b. The instructions for the module are as follows.

Repeat the following 2^{b−a} − 1 many times:
1. At the next expansionary stage, add 2^{−b} to α_0.
2. At the next expansionary stage, add 2^{−b} to α_1.
At the end, wait for the next expansionary stage and then return.

To meet the requirement R, we run the [k, k + 2^k)-module for R for some k. We will shortly argue that this module cannot return, so it must get stuck waiting for some expansionary stage, and hence R is met. It is easy to see that, assuming we start with α_i↾[a, b) = 0^{b−a}, the [a, b)-module for any requirement makes changes only in α_i↾[a, b) (for both i = 0, 1), and so if for distinct requirements we run modules on disjoint intervals, then there is no interaction between the requirements, and we can meet them all.

The verification relies on the following lemma. For this lemma, we think of a finite binary string also as a natural number (via binary expansion).

Lemma 9.14.3. Suppose that α_{0,t_0}↾[a, b) = α_{1,t_0}↾[a, b) = 0^{b−a}. Suppose that at stage t_0, an [a, b)-module for R begins and returns at a stage t_1. Then β_{t_1}↾a − β_{t_0}↾a ≥ b − a.

Proof. By induction on b − a. If b = a there is nothing to prove. Assume the lemma holds when b − a = n, and let a < b be such that b − a = n + 1. The key observation is that the [a, b)-module consists of three parts:
1. Running the [a + 1, b)-module.
2. Running one iteration of adding 2^{−b} to α_0 and then to α_1 (and waiting for expansionary stages).
3. Running the [a + 1, b)-module again.

⁴ In
fact, as observed in [21], if A and B are sets with an upper bound in the cl-degrees, then they also have one in the ibT-degrees; and similarly, if a set A is cl-reducible to a 1-random (left-c.e.) real, then it is ibT-reducible to a 1-random (left-c.e.) real.
Note also that both times we run the [a + 1, b)-module, we start with α_i↾[a + 1, b) = 0^n: the first time by the assumption of the lemma, and the second because when the first [a + 1, b)-module halts we have α_i↾[a, b) = 01^n for both i; we then add 2^{−b}, which makes α_i↾[a, b) = 10^n.

Let s_0 be the stage at which the first [a + 1, b)-module returns, and s_1 the stage at which the second one begins. By induction, β_{s_0}↾(a + 1) − β_{t_0}↾(a + 1) ≥ n and β_{t_1}↾(a + 1) − β_{s_1}↾(a + 1) ≥ n. Between s_0 and s_1 we have a change in α_0(a), which forces a change in β↾(a + 1) by the next expansionary stage, and then a change in α_1(a), which forces another change in β↾(a + 1). Each of these changes adds at least 1 to β↾(a + 1). Thus, in total, β_{t_1}↾(a + 1) − β_{t_0}↾(a + 1) ≥ 2(n + 1), which implies that β_{t_1}↾a − β_{t_0}↾a ≥ n + 1, as required.

This concludes the verification: the [k, k + 2^k)-module can never return, because that would force β↾k ≥ 2^k, which is impossible.

The Kučera–Slaman Theorem 9.2.3 says that all 1-random left-c.e. reals are the same in terms of their complexity oscillations, and have sequences of rationals converging to them at essentially the same rates; and that from any of them, we can obtain any given left-c.e. real via a rather strong reduction. One consequence of Theorem 9.14.2, however, is that, in general, there is no efficient algorithm (in terms of the number of bits used) to obtain the bits of a left-c.e. real from the bits of a 1-random left-c.e. real.
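None of this is needed for the proof, but the counting in Lemma 9.14.3 can be watched in miniature. The sketch below is a toy model, not the construction itself: it assumes the opponent β always makes the cheapest legal response, namely that when an addition of 2^{−b} to α_i flips its bit at position p, β changes within its first p + 1 bits by adding exactly 2^{−(p+1)}. Even against this optimal play, the [a, b)-module drives β↾a up by b − a.

```python
from fractions import Fraction

def module_beta_gain(a, b):
    """Total increase forced on beta by the [a, b)-module, in units of 2^-a,
    when beta answers each alpha change at bit p by adding 2^-(p+1).

    Each alpha_i receives 2^(b-a)-1 additions of 2^-b; on the j-th addition
    the highest bit of alpha_i that flips has index p = b - 1 - nu2(j),
    where nu2(j) is the 2-adic valuation of j (the carry length of a
    binary counter)."""
    gain = Fraction(0)
    for _ in (0, 1):                      # alpha_0 and alpha_1 behave identically
        for j in range(1, 2 ** (b - a)):
            nu2 = (j & -j).bit_length() - 1
            p = b - 1 - nu2               # highest flipped bit position
            gain += Fraction(1, 2 ** (p + 1))
    return gain * 2 ** a
```

For instance, `module_beta_gain(k, k + 2**k)` returns exactly 2^k, matching the conclusion that a returning [k, k + 2^k)-module would force β↾k ≥ 2^k.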
9.14.2 Array noncomputability and joins

By adding multiple permitting to the argument above, Barmpalias, Downey, and Greenberg [21] were able to classify the degrees within which constructions such as the one in the proof of Theorem 9.14.2 are possible.

Theorem 9.14.4 (Barmpalias, Downey, and Greenberg [21]). The following are equivalent for a degree d.
(i) There are left-c.e. reals α_0 and α_1 in d that do not have a common upper bound in the ibT-degrees (or cl-degrees) of left-c.e. reals.
(ii) d is c.e. and array noncomputable.
Proof. We begin by showing that we can perform the construction of the previous subsection within an array noncomputable c.e. degree. Recall from Section 2.23 that a c.e. degree d is array noncomputable iff for each (equivalently, some) very strong array (i.e., partition F_0, F_1, … of ℕ into sets of increasing size), there is a c.e. set D ∈ d such that for all c.e. sets W there are infinitely many n with W ∩ F_n = D ∩ F_n. Later, we will calculate the desired size of F_n for this proof. Assuming this has been done, fix some D ∈ d that satisfies the definition for this F_0, F_1, … .

To get sufficiently many permissions, a requirement R = R_{Γ_0,Γ_1,β} as above enumerates an auxiliary c.e. set W_R and ties its actions to permissions from F_n. Again, a module for requirement R will operate on some interval [a, b); the notion of an expansionary stage for a module for R on an interval [a, b) is defined as before. To such a module, we will assign an n such that |F_n| > 2(2^{b−a} − 1). Note that we choose distinct n's for each module for R. To request permission, the module picks some x ∈ F_n that is not yet in W_R and enumerates it into W_R. Permission is received when, at a later stage, x enters D. The standard modus operandi for multiple permitting is used: if some x ∈ F_n enters D before it is enumerated into W_R, then F_n can be made incorrect for permitting by withholding x from ever entering W_R, so we assume this never happens. The new module follows these instructions:

Repeat the following 2^{b−a} − 1 many times:
1. Wait for an expansionary stage, then request permission. When permission is received, add 2^{−b} to α_0.
2. Wait for another expansionary stage, then request permission. When permission is received, add 2^{−b} to α_1.
At the end, wait for the next expansionary stage and return.

Note that the choice of n ensures that we can always request new permissions, as the module above requests at most 2(2^{b−a} − 1) many permissions, and so we never get F_n ⊆ W_R.
The overall construction is as expected. For every R we pick an infinite set of intervals of the form [k, k + 2^k) and run modules on each interval separately (all of these modules together enumerate W_R, though). We ensure that these intervals are pairwise disjoint and that the intervals used by different requirements are also pairwise disjoint. Also, to code D in, we fix a computable set C disjoint from every interval used by any requirement. Let c_n be the nth element of C. For i = 0, 1, we declare that α_i(c_n) = 1 iff n ∈ D. Since D is a c.e. set and all modules change α_i only on their assigned intervals, both α_i are still left-c.e. reals.

To conclude the construction, we need to specify the required size of F_n so that we can assign, for every requirement R, almost every n as a permitting number for some module working for R. So after we specify the coding location set C and assign the intervals for every requirement, we
define |F_n| to be large enough so that for every m ≤ n and each of the first m many requirements R, if [a, b) is the nth interval assigned to R, then |F_n| > 2(2^{b−a} − 1), so that F_n can be assigned as a permitting set for the [a, b)-module for R.

For the verification, we note that Lemma 9.14.3 holds for the new construction, with the same proof. Thus, for any requirement R, no [k, k + 2^k)-module for R ever returns. The standard permitting argument now shows that it cannot be that every module for some requirement R gets stuck waiting for permission: for almost all n, the set F_n is assigned as a permitting set for some module for R, so there is some such n with W_R ∩ F_n = D ∩ F_n, whence the module to which F_n is assigned is never stuck waiting for permission. As a result, this module must get stuck waiting for an expansionary stage, and so the requirement R is met.

It is clear by coding that D ≤_T α_0, α_1. To conclude the verification, we show that α_0, α_1 ≤_T D. Fix i < 2 and x. We compute α_i(x) with oracle D as follows. If x ∈ C then we can calculate the n such that x = c_n and then consult D for the value of α_i(x). Otherwise, x belongs to some interval [a, b) on which a module for some requirement R is working. This requirement is assigned some permitting set F_n. Wait for a stage s at which D_s ∩ F_n = D ∩ F_n. Then, after stage s, the segment α_i↾[a, b) is fixed, and so α_i(x) = α_{i,s}(x).

We now prove the converse: if α_0 and α_1 are left-c.e. reals with array computable (Turing) degrees, then α_0 and α_1 have a common upper bound in the cl-degrees. The idea is that we can approximate α_0 and α_1 with the number of mind changes (i.e., the function taking n to |{s : α_{i,s+1}↾n ≠ α_{i,s}↾n}|) bounded by a very slowly growing function. This fact follows from the characterization of array computability as uniform total ω-c.e.-ness in Lemma 2.23.6: a c.e. degree d is array computable iff for every computable order h, every f ≤_T d has an h-c.e. approximation.
We can then use this slow bound on the number of mind changes to build a left-c.e. real β that changes somewhere on its first n digits whenever either α_0 or α_1 does so. Again we think of these reals as elements of the interval [0, 1]. A request that β↾n change is met by adding 2^{−n} to β. If the number of requests for a β↾n change does not exceed a bound g(n) such that

    Σ_{n≥1} g(n) 2^{−n} < 1,    (9.2)
then we can construct a left-c.e. real that changes appropriately whenever we ask it to. In our case, a request to change β↾n is made whenever we see a new value for α_0↾n or α_1↾n, so if the number of such new values is at most g(n)/2 for each α_i, then the plan will work. Thus, all we need to do is fix some computable order g that grows sufficiently slowly that (9.2) holds, and approximate the functions n ↦ α_i↾n in a g(n)/2-c.e. way, which is possible, as mentioned above.
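Condition (9.2) is easy to meet. As a quick sanity check (the particular choice g(n) = ⌊log₂ n⌋ is just one hypothetical candidate, not the order used in the text), the following verifies numerically that this g is slow enough:

```python
from fractions import Fraction
from math import floor, log2

def g(n):
    # one hypothetical slowly growing computable order
    return 0 if n < 2 else floor(log2(n))

# Partial sum of sum_{n>=1} g(n) * 2^-n, computed exactly.
partial = sum(Fraction(g(n), 2 ** n) for n in range(1, 65))

# For n >= 65 we have g(n) < n, and sum_{n>=65} n * 2^-n = 66 * 2^-64,
# so the full infinite sum lies below partial + 2^-56 < 1, i.e. (9.2) holds.
assert partial + Fraction(1, 2 ** 56) < 1
```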
There is not much to add to give a formal construction. Fix a computable order g that satisfies (9.2). For i = 0, 1, fix computable binary functions f_i such that (writing f_{i,s}(n) for f_i(s, n)), for all n, the value f_{i,s}(n) is a binary string of length n,

    |{s : f_{i,s+1}(n) ≠ f_{i,s}(n)}| ≤ g(n)/2,

and lim_s f_{i,s}(n) = α_i↾n.

We define a left-c.e. real β: start with β_0 = 0 and let

    β_{s+1} = β_s + Σ {2^{−n} : i < 2 ∧ f_{i,s+1}(n) ≠ f_{i,s}(n)}.

The fact that g satisfies (9.2) and the restriction on the number of mind changes for the f_i imply that β = lim_s β_s < 1, so β is well-defined.

Lemma 9.14.5. α_i ≤_ibT β.

Proof. If s is a stage such that β_s↾n = β↾n, then we never add 2^{−n} to β after stage s. This fact implies that f_{i,t}(n) = f_{i,s}(n) for all t > s. Thus f_{i,s}(n) = α_i↾n.

This concludes the proof of the theorem.

Barmpalias, Downey, and Greenberg [21] remarked that the above proof for the array computable case does not use the fact that the α_i are left-c.e. reals. This proof works for any sets that have array computable c.e. degrees, because approximations like the f_i are available for all sets in those degrees.
9.14.3 Left-c.e. reals cl-reducible to versions of Ω

Although there is no maximal c.e. cl-degree, we can say a little about the cl-degrees of 1-random left-c.e. reals.

Theorem 9.14.6 (Downey and Hirschfeldt [unpublished]). If A is a c.e. set and α is a 1-random left-c.e. real, then A ≤_ibT α.

Proof. Given A and α as above, we must construct Γ^α = A with use γ(x) = x. Since α is 1-random, there is a c such that K(α↾n) ≥ n − c for all n. We enumerate a KC set L and assume by the recursion theorem that we know an e such that if (m, σ) ∈ L then K(σ) ≤ m + e. Initially, we define Γ^{α_s}(n) = 0 for all n, and maintain this definition unless n enters A_{s+1} \ A_s. As usual, at such a stage, we would like to change Γ^α(n) from 0 to 1. To do this, we need α↾n ≠ α_s↾n. Should we see a stage t ≥ s with α_t↾n ≠ α_s↾n, then we can simply declare that Γ^{α_u}(n) = 1 for all u ≥ t. For n > e + c + 2, we can force such a t to exist.
We simply enumerate a request (n − c − e − 1, α_s↾n) into L, ensuring that K(α_s↾n) ≤ n − c − 1, so that α↾n ≠ α_s↾n. Note that L is indeed a KC set, because we make at most one request for each n > e + c + 2.

One consequence of this result is Proposition 6.1.2, that Ω is wtt-complete.

Barmpalias and Lewis [23] proved that not every left-c.e. real is cl-reducible to a 1-random left-c.e. real. (This result should be compared with Theorem 9.13.2.) As with Theorem 9.14.2, this result was also strengthened by Barmpalias, Downey, and Greenberg [21], to classify the degrees within which it holds.

Theorem 9.14.7 (Barmpalias, Downey, and Greenberg [21]). The following are equivalent for a c.e. degree d.
(i) There is a left-c.e. real α ≤_T d not cl-reducible to any 1-random left-c.e. real.
(ii) d is array noncomputable.

Proof. Suppose that d is array noncomputable. As in the previous subsection, we begin with the construction of a left-c.e. real not ibT-reducible to any 1-random left-c.e. real, and later add multiple permitting. We build a left-c.e. real α to meet the requirements

R_{Γ,β}: If Γ^β = α then β is not 1-random.

Here Γ ranges over all ibT-functionals and β ranges over all left-c.e. reals. To meet R_{Γ,β}, we run infinitely many modules, where the nth module enumerates a finite set of strings U_n with μ(U_n) ≤ 2^{−n}, such that if Γ^β = α then β ∈ U_n. Thus, together, these modules enumerate a Martin-Löf test covering β, whence R_{Γ,β} is satisfied.

In the previous subsection, we used the leeway we had in the play between two left-c.e. reals to drive a potential common upper bound to be too large. Here we have only one real to play with, and the role of the second real is taken by the element of the Martin-Löf test being enumerated. This construction is much more limited, because the restriction on the size of the enumerated set of strings is much stricter than that on the enumeration of a left-c.e. real. The modules are built by recursion from smaller and smaller building blocks.
Consider, for example, the following module, which changes α only from the ath bit on, and makes β↾(a + 1) ≥ 2. (Here, as before, we think of a string as a natural number; so, for example, to say that β↾(a + 1) ≥ 2 is to say that β↾(a + 1) is greater than or equal to 0^{a−1}10 in the usual lexicographic ordering of strings.) We assume for now that β is playing its optimal strategy, which is adding the minimal amount necessary to match α's movements. We later remove this assumption.
1. (i) Set α(a + 1) = 1; wait for β(a + 1) = 1.
   (ii) Set α(a + 2) = 1; wait for β(a + 2) = 1.
   ⋮
   (mcmlxxiv) Set α(a + 1974) = 1; wait for β(a + 1974) = 1.
2. Enumerate 0^{a+1}1^{1974} into the test element; wait for β = 0^a10^ω.
3. Add 2^{−a−1975} to α, thus setting α = 0^a10^ω. Wait for β = 0^{a−1}10^ω.

The cost, in terms of the measure of strings enumerated into the test element, is 2^{−a−1975}; as 1975 approaches infinity, we can make the cost as low as we like. Now this module can be iterated to make β↾(a + 1) ≥ 3:

1. (i) Run the 2-module from point a + 1, to get α(a + 1) = 1 and β(a) = 1.
   (ii) Run the 2-module from point a + 2, to get α(a + 2) = 1 and β(a + 1) = 1.
   ⋮
   (mcmlxxx) Run the 2-module from point a + 1979, to get α(a + 1979) = 1 and β(a + 1978) = 1.
2. Enumerate 0^a1^{1979} into the test element; wait for β = 0^{a−1}10^ω.
3. Set α = 0^a10^ω. Wait for β = 0^{a−1}110^ω.

Again the cost can be kept down by making the number 1980 large, and then keeping the cost of every recursive call of the 2-module low as well. We can now define a 4-module, a 5-module, and so on. Note that there is a growing distance between the last point of change in α and the end of the string of 1's in the version of β that goes into the test. This distance, according to our calculations below, is bounded by the level of the module (i.e., it is bounded by n for an n-module).

We turn to formally describing the modules and investigating their properties, without assuming anything about how β is being built. Let R = R_{Γ,β}. The module M_R(a, n, ε) is indexed by: a, the bit of α where it begins acting; n, the level of the module; and ε, the bound on the measure of the c.e. open set that the module enumerates. By induction on stages, define the notion of an expansionary stage for requirement R: 0 is expansionary, and if s is an expansionary stage for R, and x is the largest number mentioned at stage s or before, then the next expansionary stage is the least stage t at which Γ^{β_t} ⊇ α_t↾x.

The module M_R(a, 1, ε) is the following. Wait for an expansionary stage, then add 2^{−a−1} to α. Wait for another expansionary stage, and return ∅.
For n > 1, the module M_R(a, n, ε) is the following, where U is the set it enumerates. Let b be the least b > a such that 2^{−b} < ε/2. Let ε′ = ε/(2(b + n − a)).
1. For k = a + 1, a + 2, …, b + n, call M_R(k, n − 1, ε′), and add the returned set to U.
2. Add the current version of β↾b to U, and wait for a stage at which β↾b changes.
3. Wait for an expansionary stage, then add 2^{−b−n−1} to α. Wait for another expansionary stage, and return U.

We verify some properties of these modules, which will lead to the full construction.

Lemma 9.14.8. A module M_R(a, n, ε) does not change α↾a. Indeed, there is a computable function c(a, n, ε) such that for all R, a, n, and ε, if c = c(a, n, ε), and a module M = M_R(a, n, ε) starts at stage s with α_s↾[a, c) = 0^{c−a}, then throughout its run, M changes only α↾[a, c), and if it returns at a stage t, then α_t↾[a, c) = 10^{c−a−1}.

Proof. By induction on n. The module M_R(a, 1, ε) changes α(a) only from 0 to 1, so we can let c(a, 1, ε) = a + 1. For n > 1, we can calculate b (and ε′) as in the instructions for the module, and let c(a, n, ε) = max{b + n, c(k, n − 1, ε′) : k ∈ [a + 1, …, b + n]}. Let c = c(a, n, ε). Since c ≥ c(k, n − 1, ε′) for all k ∈ [a + 1, …, b + n], if we start with α_s↾[a, c) = 0^{c−a}, then by induction, after m iterations of part 1 of the module M_R(a, n, ε), we have α↾[a, c) = 01^m0^{c−a−m−1}, and so at the end of part 1 we have α↾[a, c) = 01^{b+n−a}0^{c−b−n−1}. At part 2 of the module, α↾[a, c) does not change, and at part 3, we set α↾[a, c) = 10^{c−a−1}.

Lemma 9.14.9. The measure of the set of strings enumerated by a module M_R(a, n, ε) is at most ε.

Proof. By a straightforward induction on n.

For the next lemma, we again think of finite binary strings as natural numbers.

Lemma 9.14.10. Suppose that a module M_R(a, n, ε) starts running at stage s (with α_s↾[a, c) = 0^{c−a}, where c is as above), and returns at stage t. Then β_t↾(a + 1) − β_s↾(a + 1) ≥ n.

Proof. By induction on n.
The base case n = 1 is easy: if M_R(a, 1, ε) starts running at stage s with α_s(a) = 0, then the module changes α_s(a) to 1, and so by the next expansionary stage we get a change in β↾(a + 1), which implies that β_t↾(a + 1) − β_s↾(a + 1) ≥ 1.
[Figure 9.1 omitted. Caption: The longest lines represent elements of Q_{a+1}, the next longest lines elements of Q_{a+2}, etc. In this example, n = 6.]

[Figure 9.2 omitted. Caption: The longer lines represent elements of Q_k, while the shorter lines represent elements of Q_{k+1} \ Q_k. Again in this example, n = 6.]
Now assume that the lemma is proved for some n ≥ 1. Suppose that we start running M_R(a, n + 1, ε) at stage s_a with α_{s_a}↾[a, c) = 0^{c−a} (where c = c(a, n + 1, ε)). Calculate the associated b and ε′. For k = a + 1, …, b + n + 1, let s_k be the stage at which the recursive call of M_R(k, n, ε′) returns. By induction, we have, for all such k, β_{s_k}↾(k + 1) − β_{s_{k−1}}↾(k + 1) ≥ n.

For each m, let Q_m = {z2^{−m} : z ∈ ℤ}. For γ ∈ (−∞, ∞), let ⌊γ⌋_m be the greatest element of Q_m that is not greater than γ. Then for γ ∈ 2^ω, again identified with an element of [0, 1], we have γ↾m = 2^m⌊γ⌋_m. Also, for γ < δ, let d_m(γ, δ) = 2^m(⌊δ⌋_m − ⌊γ⌋_m) = |Q_m ∩ (γ, δ]|. The induction hypothesis then says that

    d_{k+1}(β_{s_{k−1}}, β_{s_k}) ≥ n.    (9.3)

Let β∗ = ⌊β_{s_a}⌋_{a+1} + n2^{−a−1}, so that β∗ ∈ Q_{a+1}. (See Figure 9.1.) By induction on k = a, …, b + n + 1, we show that

    d_{k+1}(β_{s_k}, β∗) ≤ n.    (9.4)

For k = a this is immediate. If k > a and (9.4) is true for k − 1, then because β∗ is in Q_k, for every x ∈ (Q_{k+1} \ Q_k) ∩ (β_{s_{k−1}}, β∗], there is some y > x in Q_k ∩ (β_{s_{k−1}}, β∗], and so d_{k+1}(β_{s_{k−1}}, β∗) ≤ 2n. Together with (9.3), we get (9.4) for k: see Figure 9.2.
[Figure 9.3 omitted. Caption: Even though d_{b+n+2}(β_{s_{b+n+1}}, β∗) may be as large as n (in this example, n = 17), we see that β∗ is the only element of Q_{b+2} between β_{s_{b+n+1}} and β∗ itself. Again the shortest lines represent elements of Q_{b+n+2}, the next shortest ones elements of Q_{b+n+1}, etc.]
At the end, we get d_{b+n+2}(β_{s_{b+n+1}}, β∗) ≤ n, so β∗ − β_{s_{b+n+1}} ≤ (n + 1)2^{−b−n−2}. Since 2^n ≥ n + 1, we have β∗ − β_{s_{b+n+1}} ≤ 2^{−b−2} (see Figure 9.3). It follows that β∗↾b = β_{s_{b+n+1}}↾b. So at the end of step 2 of the module M_R(a, n + 1, ε), say at a stage t_0, we have β_{t_0} > β∗, so d_{a+1}(β_{s_a}, β_{t_0}) ≥ n; in other words, β_{t_0}↾(a + 1) − β_{s_a}↾(a + 1) ≥ n. At step 3 of the module, we change α(a), and by the next expansionary stage, we have a change in β↾(a + 1), which adds at least 1 to β↾(a + 1). So if the module returns at stage t_1, then β_{t_1}↾(a + 1) − β_{s_a}↾(a + 1) ≥ n + 1, as required.

The construction is now clear: we partition ℕ into disjoint intervals, all of the form [a, c(a, 2^{a+1}, 2^{−n})). A requirement R = R_{Γ,β} will be assigned infinitely many such intervals and run M_R(a, 2^{a+1}, 2^{−n}) on the nth interval assigned to it. By Lemma 9.14.10, none of these modules can return, because we cannot have β↾(a + 1) ≥ 2^{a+1}. If the hypothesis Γ^β = α of R holds, then the module cannot ever be stuck waiting for an expansionary stage, and so it must get stuck waiting for β to avoid the set of strings U that it enumerates. In other words, if the nth module for R enumerates U_n, and Γ^β = α, then β ∈ U_n, as required. If this is the case for all n, then β is not 1-random.

Finally, we can add multiple permitting to this construction to get the proof of this part of Theorem 9.14.7. This is done exactly as in the permitted version of Theorem 9.14.2, so we will only sketch the details. Partition ℕ into an infinite computable coding set C and pairwise disjoint intervals of the form [a, c(a, 2^{a+1}, 2^{−n})) (all disjoint from C). Assign to every requirement R infinitely many intervals, where the nth interval assigned to R is of the form [a, c(a, 2^{a+1}, 2^{−n})). For every module M_R(a, n, ε), calculate the total number p(a, n, ε) of stages at which the module changes α.
Now partition ℕ into a very strong array {F_n}_{n∈ℕ}, where for every requirement, |F_n| > p(a, 2^{a+1}, 2^{−n}) for almost every n (where the nth interval for R is [a, c)). Let D ∈ d be {F_n}-a.n.c. Every requirement R enumerates an auxiliary c.e. set W_R. To request permission, a module M = M_R(a, 2^{a+1}, 2^{−n}) enumerates some new x ∈
F_n into W_R. Permission is later received when x appears in D. The new instructions for the modules are as for the original ones, except that every change in α requires permission.

The new module M_R(a, 1, ε) is: Wait for an expansionary stage, then request permission. When permission is received, add 2^{−a−1} to α; wait for another expansionary stage, and return ∅.

For n > 1, the new module M_R(a, n, ε) is: Let b be the least b > a such that 2^{−b} < ε/2; let ε′ = ε/(2(b + n − a)).
1. For k = a + 1, a + 2, …, b + n, call M_R(k, n − 1, ε′), and add the returned set to U.
2. Add the current version of β↾b to U, and wait for a stage at which β↾b changes.
3. Wait for an expansionary stage, then request permission. When permission is received, add 2^{−b−n−1} to α. Wait for another expansionary stage, and return U.

For every R and n, we run M_R(a, 2^{a+1}, 2^{−n}) on the nth interval [a, c) assigned to R. We also let α(c_n) = 1 iff n ∈ D, where c_n is the nth element of C. That is the construction.

Lemmas 9.14.8, 9.14.9, and 9.14.10 hold for the new modules as well, so there is no interaction between the requirements. If Γ^β = α, then the R-modules do not get stuck waiting for expansionary stages. For such R, for every n such that W_R ∩ F_n = D ∩ F_n, every request for permission by the nth module for R is granted, so the nth module for R gets stuck waiting for β to leave U_n (the open set it enumerates). Thus for infinitely many n we have β ∈ U_n. By adjusting the test (letting U_m = ⋃_{n>m} U_n) we build a Martin-Löf test covering β, so β is not 1-random.

Now we turn to the other direction of the theorem: if α has array computable Turing degree, then there is some 1-random left-c.e. real β ≥_ibT α. Again the idea is to use an approximation to α that does not change too much, and to build β by requesting that it change by at least 2^{−n} whenever α↾n changes. On top of this, we need to ensure that β is 1-random. To accomplish this task, let U be the second element of a universal Martin-Löf test. If we ensure that β ∉ U, then we will have ensured that β is 1-random. We identify U with a prefix-free set of generators for U. Then Σ_{σ∈U} 2^{−|σ|} ≤ 1/2. To ensure that β is not in U, whenever we discover that there is some σ ∈ U that is an initial segment of the current version of β, we request that this initial segment change by adding 2^{−|σ|} to β.
If the approximation to α is sufficiently tight so that the total amount added to β on behalf of following α is less than 12 , this strategy will succeed.
454
9. Measures of Relative Randomness
So fix a computable order g such that ∑_n 2^{−n} g(n) < 1/2 (we allow the value 0 for g for finitely many inputs; we assume that if g(n) = 0 then α ↾ n is simply given to us as a parameter in the construction). Get a computable approximation f_s(n) for the function n ↦ α ↾ n such that for all n, we have |{s : f_{s+1}(n) ≠ f_s(n)}| ≤ g(n). The left-c.e. real β is defined by recursion. We start with β_0 = 0 and let

β_{s+1} = β_s + ∑{2^{−n} : f_{s+1}(n) ≠ f_s(n)} + ∑{2^{−|σ|} : σ ∈ U_s ∧ σ ≺ β_s}.

Lemma 9.14.11. β < 1 (and so is well-defined as an element of 2^ω).

Proof. There are two kinds of contributions to β: following α and avoiding U. Now, g satisfies ∑_n 2^{−n} g(n) < 1/2, and for all n we have |{s : f_{s+1}(n) ≠ f_s(n)}| ≤ g(n), so the total added to β by the clause ∑{2^{−n} : f_{s+1}(n) ≠ f_s(n)} is less than 1/2. Because β is a left-c.e. real, for every σ ∈ U, there is at most one stage s at which we have σ ∈ U_s and σ ≺ β_s, so the contribution of the clause ∑{2^{−|σ|} : σ ∈ U_s ∧ σ ≺ β_s} is at most ∑_{σ∈U} 2^{−|σ|} ≤ 1/2.

Lemma 9.14.12. β ∉ U.

Proof. If β ∈ U then there is some σ ∈ U such that σ ≺ β. Then at a late enough stage s we have σ ∈ U_s and σ ≺ β_s, so at stage s we add 2^{−|σ|} to β and force σ ⊀ β forever.

Finally, the first clause in the definition of β shows that β ≥_{ibT} α, as in Lemma 9.14.5.
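The recursion defining β is easy to simulate on finite data. The following toy sketch (function names and data layout are ours, not from the text) runs the recursion β_{s+1} = β_s + ∑{2^{−n} : the approximation to α ↾ n changed} + ∑{2^{−|σ|} : σ ∈ U_s, σ ≺ β_s} with exact arithmetic; the real construction, of course, runs forever and interleaves with the permitting.

```python
from fractions import Fraction

def bits(x, n):
    # First n binary digits of the fraction x in [0, 1).
    out = []
    for _ in range(n):
        x *= 2
        b = int(x >= 1)
        out.append(b)
        x -= b
    return out

def build_beta(changes, generators, stages, prec=12):
    # changes[s]: set of n with f_{s+1}(n) != f_s(n) at stage s.
    # generators: prefix-free generators of U, enumerated one per stage.
    beta = Fraction(0)
    seen = []
    for s in range(stages):
        add = sum((Fraction(1, 2 ** n) for n in changes.get(s, ())), Fraction(0))
        if s < len(generators):
            seen.append(generators[s])
        prefix = bits(beta, prec)
        for sigma in seen:
            if list(sigma) == prefix[:len(sigma)]:
                add += Fraction(1, 2 ** len(sigma))
        beta += add
    return beta
```

Adding 2^{−|σ|} when σ ≺ β_s pushes β past the interval [σ], so each generator is charged at most once, exactly as in the proof of Lemma 9.14.11.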
9.14.4 cl-degrees of versions of Ω

We have seen that the Kučera-Slaman Theorem fails for cl-reducibility. One possibility for rescuing some of the content of that result in the cl case is that at least different versions of Ω might have the same cl-degree. As we will see in this subsection, this possibility also fails. We begin with the following definition and a result about its connection to 1-randomness.

Definition 9.14.13 (Fenner and Schaefer [145]). A set A is k-immune if there is no computable sequence F_0, F_1, ... of pairwise disjoint sets, each of size at most k, such that F_n ∩ A ≠ ∅ for all n.

Note that every set is 0-immune, that the 1-immune sets are exactly the immune sets, and that hyperimmune sets are k-immune for all k. Fenner and Schaefer [145] showed that the classes of k-immune sets form a proper hierarchy. We now show that for 1-random sets, the situation is different.

Lemma 9.14.14. Suppose that |B| = m, and let 𝓑 be a collection of n pairwise disjoint subsets of B, each of size k. Then there are exactly 2^{m−kn}(2^k − 1)^n subsets of B that have nonempty intersection with every set in 𝓑.
Proof. Each B′ ∈ 𝓑 has 2^k − 1 many nonempty subsets, so there are (2^k − 1)^n many ways to choose a nonempty subset of each element of 𝓑. The size of B \ ⋃𝓑 is m − kn. Each subset of B with the desired property is determined by its intersections with the elements of 𝓑 and with B \ ⋃𝓑, which can be chosen independently.
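The count in Lemma 9.14.14 is easy to check by brute force on small instances; a throwaway sketch (names ours):

```python
from itertools import combinations

def count_hitting_subsets(m, blocks):
    # Count subsets of {0, ..., m-1} that meet every block in `blocks`.
    total = 0
    for r in range(m + 1):
        for sub in combinations(range(m), r):
            s = set(sub)
            if all(s & set(b) for b in blocks):
                total += 1
    return total
```

For m = 5 and the two disjoint 2-element blocks {0,1} and {2,3}, the lemma predicts 2^{5−4}(2^2 − 1)^2 = 18, matching the brute-force count.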
Theorem 9.14.15 (Folklore). Every 1-random (or even weakly 1-random) set is k-immune for all k, but no 1-random set is hyperimmune.

Proof. Let F_0, F_1, ... be a computable collection of pairwise disjoint sets, each of size at most k. Let Q be the class of sets that have nonempty intersection with every F_n. Then Q is a Π^0_1 class. Let I_n = [0, max ⋃_{i<n} F_i]. By Lemma 9.14.14, the proportion of subsets of I_n that meet every F_i with i < n is at most (1 − 2^{−k})^n, so μ(Q) = 0. Thus Q is a null Π^0_1 class, and hence contains no weakly 1-random set, so every weakly 1-random set is k-immune.

For the second statement, let G_0, G_1, ... be a computable sequence of pairwise disjoint intervals with |G_n| = n, and let V_n be the class of sets A with G_n ∩ A = ∅, so that μ(V_n) = 2^{−n}. If U_n = ⋃_{m>n} V_m then U_0, U_1, ... is a Martin-Löf test, and ⋂_n U_n is the collection of sets A for which there are infinitely many n with G_n ∩ A = ∅, and so contains every hyperimmune set.

We are now ready to show that not all 1-random left-c.e. reals have the same cl-degree.

Theorem 9.14.16 (Stephan, see [21]). There exist 1-random left-c.e. reals of different cl-degrees.

Proof. The following proof is from [21]. Let α be a 1-random left-c.e. real. Thinking of α as a subset of N, we have that N \ α is also 1-random, so by Theorem 9.14.15, there is a computable sequence t_0 < t_1 < ⋯ such that for each n, at least one bit of α ↾ (t_n, t_{n+1}] is a 0, and the size of (t_n, t_{n+1}] is increasing in n. Let β be the real such that β(x) = 1 iff x = t_n for some n > 0 (in other words, β = ∑_{n>0} 2^{−t_n−1}). Since β is computable, α + β is left-c.e. and 1-random. We claim that α does not cl-compute α + β. Suppose that it does. Then there is a c and a partial computable function ψ such that for all n, ψ(α ↾ (n + c)) = (α + β)(n). For all m, the block α ↾ (t_m, t_{m+1}] is not a string of 1's, so adding 2^{−t_m−1} to α does not create a carry before the t_m-th bit of α. Thus, for every n > 0,

α ↾ (t_n + 1) + β ↾ (t_n + 1) = (α + β) ↾ (t_n + 1).    (9.5)
By Theorem 9.14.15, N \ α is c-immune, so there are infinitely many n such that α ↾ [t_n − c, t_n) = 1^c and t_{n−1} < t_n − c. By (9.5), for each such n, ψ(α ↾ t_n) = (α + β)(t_n − c) = 1 − α(t_n). We can now define a c.e. supermartingale M that succeeds on α, which is a contradiction: Begin with M(λ) = 1. If there is an n such that |σ| = t_n and σ ↾ [t_n − c, t_n) = 1^c, where t_{n−1} < t_n − c, then wait for ψ(σ) to converge. If that never happens then M(σi) = 0 for i = 0, 1. Otherwise, let i = 1 − ψ(σ) and let M(σi) = 2M(σ) and M(σ(1 − i)) = 0. For all other σ, let M(σi) = M(σ) for i = 0, 1.
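The betting pattern in this proof — hold capital until a designated prefix is seen, then stake everything on the predicted next bit — can be sketched generically as follows (the predictor stands in for ψ; names ours):

```python
def run_martingale(bits, predictor):
    """Run a simple betting strategy along a finite bit sequence.

    predictor(prefix) returns the predicted next bit, or None to not bet.
    Betting everything doubles the capital on a correct guess and zeroes
    it on a wrong one, mirroring M(σi) = 2M(σ), M(σ(1-i)) = 0 in the proof.
    """
    capital = 1.0
    prefix = []
    for b in bits:
        guess = predictor(tuple(prefix))
        if guess is not None:
            capital = 2 * capital if guess == b else 0.0
        prefix.append(b)
    return capital
```

Along α, every bet the strategy in the proof places is correct (the predicted bit is 1 − ψ(σ) = α(t_n)), so its capital doubles infinitely often and M succeeds on α.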
9.15 K-degrees, C-degrees, and Turing degrees

Merkle and Stephan [270] have given simple and short but clever arguments about the structure of the C- and K-degrees, and their relationships with the Turing degrees, by using sparse sets. We will see other relationships between these measures of relative randomness and Turing degrees for random sets in Chapter 10. We begin with the following result, which we state in relativized form for future reference, and which will be important in Chapter 11.

Lemma 9.15.1 (Merkle and Stephan [270], after Chaitin [61]). If K^X(A ↾ n) ≤ K^X(n) + O(1) for all n in an infinite set S, then A ≤_T S ⊕ X′.

Proof. Let T = {σ : ∀τ ⪯ σ (|τ| ∈ S ⇒ K^X(τ) ≤ K^X(|τ|) + c)}. Then T is an (S ⊕ X′)-computable tree, A ∈ [T], and, by the relativized form of the Counting Theorem 3.7.6, T has bounded width. Thus A is an isolated path of T, and hence is (S ⊕ X′)-computable.

Theorem 9.15.2 (Merkle and Stephan [270]). Let Y ⊆ {2^n : n ∈ ω} and X ≤_K Y. Then X ≤_T Y ⊕ ∅′. Hence, if ∅′ ≤_T Y then X ≤_K Y implies X ≤_T Y.

Proof. The idea is to code Y by a certain sparse set of numbers. Let

g(n) = 2^n + ∑_{k<n} 2^k Y(2^{k+2}).

Then g(n) ∈ [2^n, 2^{n+1} − 1], and we can compute Y ↾ 2^{n+1} from g(n). Thus K(Y ↾ g(n)) ≤ K(g(n)) + O(1), so K(X ↾ g(n)) ≤ K(g(n)) + O(1). By Lemma 9.15.1, X ≤_T rng(g) ⊕ ∅′ ≤_T Y ⊕ ∅′.
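The coding trick in this proof — pack finitely many bits of Y below a leading bit 2^n, so that g(n) lies in [2^n, 2^{n+1}) and is decodable — can be sketched directly. The exact indexing of which bits of Y are packed follows the formula printed in the proof; treat the sketch as illustrative:

```python
def encode(Y, n):
    # g(n) = 2^n + sum_{k<n} 2^k * Y(2^{k+2}); Y is a membership function.
    return 2 ** n + sum((2 ** k) * Y(2 ** (k + 2)) for k in range(n))

def decode(gn):
    # Recover n and the packed bits Y(2^{k+2}), k < n, from g(n) alone.
    n = gn.bit_length() - 1
    return n, [(gn >> k) & 1 for k in range(n)]
```

For Y = {4, 16}, encode(Y, 4) = 2^4 + 1 + 4 = 21, and decode(21) returns n = 4 together with the bits Y(4), Y(8), Y(16), Y(32) = 1, 0, 1, 0. The point of the proof is that the value g(n) thus has complexity comparable to an initial segment of Y.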
A similar proof establishes the following.

Theorem 9.15.3 (Merkle and Stephan [270]). If Y ⊆ {2^{2^n} : n ∈ ω} and X ≤_C Y then X ≤_T Y.
Proof. Let

g(n) = 2^n + ∑_{k<n} 2^k Y(2^{2^{k+2}}).

Given a number m ∈ J_n = [2^{g(n)}, 2^{g(n)+1} − 1], we can determine Y ↾ m by knowing m and g(n). Since |J_n| = 2^{g(n)}, we can encode m by a string σ of length g(n). From σ we can also decode g(n), so σ is enough to describe Y ↾ m, whence C(X ↾ m) ≤ C(Y ↾ m) + O(1) ≤ g(n) + O(1), where the constants depend on neither m nor n. Let c be the last constant in this inequality.

Now let T be the set of strings σ such that for all τ ⪯ σ, if |τ| ∈ J_n for some n then C(τ) ≤ g(n) + c. Then T is a Y-c.e. tree, since g is Y-computable. Furthermore, for each n, since |J_n| = 2^{g(n)}, there must be some i_n ∈ J_n such that C(i_n) ≥ g(n). Then for each n we have C(σ) ≤ C(i_n) + c for all σ ∈ T such that |σ| = i_n. By Lemma 3.4.3, T has finite width. Now we can transform T into a Y-computable tree as in the proof of Theorem 3.4.4, to conclude that X is a path in a Y-computable tree of finite width, from which it follows that X ≤_T Y.

The following are immediate corollaries to the above result.

Corollary 9.15.4 (Merkle and Stephan [270]). There is a minimal pair of C-degrees of computably enumerable sets.
Proof. Choose A, B ⊆ {2^{2^n} : n ∈ ω} whose Turing degrees form a minimal pair and apply Theorem 9.15.3.

Corollary 9.15.5 (Merkle and Stephan [270]). A C-degree contains at most one Turing degree.

Proof. Let A ≡_C B. Let Â = {2^{2^n} : n ∈ A} and B̂ = {2^{2^n} : n ∈ B}. Then Â ≡_C B̂, so by Theorem 9.15.3, Â ≡_T B̂, and hence A ≡_T B.

Not all C-degrees can contain entire Turing degrees. Recall that A is complex if there is a computable order h with C(A ↾ h(n)) ≥ n for all n (or, equivalently, a computable order g with K(A ↾ g(n)) ≥ n for all n).

Theorem 9.15.6 (Merkle and Stephan [270]). If A is complex then neither the C-degree nor the K-degree of A contains an entire Turing degree.

Proof. We do the proof for C, the K case being essentially the same. Fix h as above. Let B ≡_C A. Let D = {h(4n) : n ∈ B}. Then D ≡_T B, but D ≢_C A, since C(A ↾ h(4n)) ≥ 4n, while C(D ↾ h(4n)) ≤ C(B ↾ n) + O(1) ≤ n + O(1).

For the K-degrees, we can apply similar methods to get minimal pairs. Recall that a set A is K-trivial if K(A ↾ n) ≤ K(n) + O(1).

Lemma 9.15.7 (Merkle and Stephan [270]). Suppose that Y is not K-trivial and Y ≤_K X. Then there is a function f ≤_T Y ⊕ ∅′ such that, for each n, the set X has at least one element between n and f(n).
Proof. Let c be a constant with Y ≤_K X via c. Let F_n be the collection of all sequences D such that D(k) = 0 for all k > n. Let f(n) be the least number m > n such that K(D ↾ m) + c < K(Y ↾ m) for all D ∈ F_n. Such an m exists because Y is not K-trivial, and f is computable from ∅′ and Y. As Y ≤_K X via c, for each n, the set X must differ below f(n) from all D ∈ F_n, and the result follows.

Theorem 9.15.8 (Merkle and Stephan [270]). Suppose that A and B are ∅′-hyperimmune sets forming a Turing minimal pair relative to ∅′. Then the K-degrees of A and B form a minimal pair. Hence there are minimal pairs of K-degrees of Σ^0_2 sets.

Proof. Without loss of generality we may assume that A, B ⊆ {2^n : n ∈ ω}. Let Y ≤_K A, B. By Theorem 9.15.2, Y ≤_T A ⊕ ∅′, B ⊕ ∅′. Thus Y ≤_T ∅′, as A and B form a minimal pair above ∅′. But by Lemma 9.15.7, if Y is not K-trivial, then A and B are not hyperimmune relative to ∅′, and the result follows.

Using similar techniques, Merkle and Stephan [270] have constructed nontrivial branching degrees in the C- and K-degrees, where a degree is branching if it is the infimum of two incomparable degrees. Theorem 9.15.8 improves a basic result of Csima and Montalbán [83], who were the first to construct a minimal pair of K-degrees. Their method is interesting and informative in its own right, and will be dealt with in Section 11.12.

We finish the section with the following result.

Theorem 9.15.9 (Merkle and Stephan [270]). There is a minimal C-degree.
Proof. We use perfect trees whose branches are subsets of {2^{2^n} : n ∈ ω}. Let T_0 be the computable tree of branches of this form. Having defined T_e, let T_{e+1} be a perfect computable subtree of T_e so that one of the following holds:

1. Φ^A_e is partial for all branches A of T_{e+1}; or
2. Φ^A_e is computable for all branches A of T_{e+1}; or
3. Φ^A_e is total for all branches A of T_{e+1}, and for any two distinct branches A and B of T_{e+1}, if σ is their longest common initial segment, then there is an x such that Φ^A_e(x) ≠ Φ^B_e(x), and any branch C of T_{e+1} that extends σ coincides with either A or B up to the max of the uses of Φ^A_e(x) and Φ^B_e(x).

The usual arguments show that it is always possible to construct T_{e+1}, and that if A ∈ ⋂_e [T_e] then A is noncomputable and has minimal Turing degree. Let B ≤_C A be noncomputable. Since A ⊆ {2^{2^n} : n ∈ ω}, by Theorem 9.15.3, B ≤_T A. Let e be such that Φ^A_e = B. Then case 3 above
must hold of T_{e+1}. Given n, suppose that C and D are branches of T_{e+1} such that Φ^C_e(m) = B(m) and Φ^D_e(m) = B(m) whenever the relevant uses are less than n, but C ↾ n ≠ A ↾ n and D ↾ n ≠ A ↾ n. Let σ be the longest common initial segment of A and C. We may assume without loss of generality that D also extends σ (since otherwise we can simply interchange the roles of C and D). But then there is an x such that Φ^C_e(x) ≠ Φ^A_e(x) = B(x), and D coincides with either A or C up to the use of Φ^C_e(x). By the choice of C, this use must be greater than n. Since C ↾ n ≠ A ↾ n, we conclude that D ↾ n = C ↾ n. In other words, A ↾ n is one of at most two strings of length n on T_{e+1} compatible with B ↾ n, whence C(A ↾ n) ≤ C(B ↾ n) + O(1).

Using more intricate methods, it is possible to prove that every c.e. C-degree bounds a minimal C-degree and a minimal rK-degree. It is unknown whether there is a minimal K-degree.
9.16 The structure of the monotone degrees

In this section we discuss results of Calhoun [46] on the structure of the monotone degrees. Let M be our fixed universal monotone machine. We begin with a technical lemma about the behavior of Km.

Lemma 9.16.1 (Calhoun [46]). Let s(σ, τ) be the shortest initial segment of σ that is incomparable with τ if σ | τ, and let s(σ, τ) = λ otherwise.

(i) K(s(σ, τ)) ≤ Km(σ) + Km(τ) + 2 log Km(σ) + 2 log Km(τ) + O(1).
(ii) K(|σ|) ≤ 3 max{Km(σ0), Km(σ1)} + O(1).
(iii) Km(τ) ≤ Km(στ) + K(|σ|) + O(1).
(iv) For each σ there is a c such that Km(τ) ≤ Km(στ) + c for all τ.
(v) For each computable set B there is a c such that Km(σ(B ↾ n)) ≤ K(σ) + c for all σ and n.

Proof. (i) Clearly K(s(σ, τ)) ≤ K(σ) + K(τ) + O(1), and K(σ) + K(τ) ≤ Km(σ) + Km(τ) + 2 log Km(σ) + 2 log Km(τ) + O(1), by Theorem 3.15.4.

(ii) Assume that Km(σ0) ≥ Km(σ1), the other case being symmetric. By (i),

K(|σ|) ≤ K(σ) + O(1) ≤ Km(σ0) + 2 log Km(σ0) + Km(σ1) + 2 log Km(σ1) + O(1) ≤ 2 Km(σ0) + 4 log Km(σ0) + O(1) ≤ 3 Km(σ0) + O(1).

(iii) Define a monotone machine M′ by M′(ρ_0 ρ_1) = ν if U(ρ_0) = k and M(ρ_1) = μν for some μ with |μ| = k. If ρ_0 is a U-description of |σ| and ρ_1 is an M-description of στ, then M′(ρ_0 ρ_1) = τ.

(iv) follows immediately from (iii).
(v) By Theorem 3.15.7, Km(σ(B ↾ n)) ≤ K(σ) + Km(B ↾ n) + O(1), but since B is computable, Km(B ↾ n) = O(1).

Theorem 9.16.2 (Calhoun [46]). If A is neither 1-random nor computable, then there is a B such that A |_{Km} B.

Proof. Let X be 1-random. We build B in stages by a finite extension argument. For even s, let B_{s+1} = B_s 0^k for some k such that Km(B_s 0^k) < Km(A ↾ (|B_s| + k)) − s. Such a k exists because Km(A ↾ n) is unbounded, while Km(B_s 0^k) is bounded, since B_s 0^ω is computable. For odd s, let B_{s+1} = B_s (X ↾ k) for some k such that Km(B_s (X ↾ k)) > Km(A ↾ (|B_s| + k)) + s. Such a k exists because B_s X is 1-random, and hence Km(B_s (X ↾ n)) = n ± O(1), while A is not 1-random, and hence Km(A ↾ (|B_s| + n)) is bounded away from n. The even stages ensure that Km(A ↾ n) ≰ Km(B ↾ n) + O(1) and the odd stages that Km(B ↾ n) ≰ Km(A ↾ n) + O(1).

It is straightforward to extend the proof of Theorem 9.16.2 to construct a perfect tree such that any two paths have incomparable Km-degrees, yielding the following.

Theorem 9.16.3 (Calhoun [46]). There is an antichain of Km-degrees of size 2^{ℵ0}.

Since A ≤_{KmD} B implies A ≤_{Km} B, the two previous theorems hold also of the discrete monotone degrees. Calhoun [46] proved some further structural results about the monotone degrees, such as the following.

Theorem 9.16.4 (Calhoun [46]). There is a minimal pair of Km-degrees.

Proof. We construct A and B in stages. Let A_0 = B_0 = λ and n_0 = 0. At an even stage s, proceed as follows. Let c_1 be the constant from Lemma 9.16.1(ii), and let c_2 be the constant from Lemma 9.16.1(v) applied to 0^ω. Let A*_s be an extension of A_s with Km(A*_s) > Km(A_s). A string σ is terminal if there is no proper extension τ of σ with Km(τ) = Km(σ). Let n be large enough so that the following hold.

(i) n ≥ |A*_s|.
(ii) There is no terminal σ with |σ| > n and Km(σ) < K(A*_s) + c_2 + s.
(iii) For all m ≥ n, we have K(m) ≥ 3(K(A*_s) + c_1 + c_2 + s).

Let B_{s+1} = B_s 0^{n−|B_s|} and A_{s+1} = A*_s 0^{n−|A*_s|}, and let n_{s+1} = n. At an odd stage we do the same, but switching the roles of A and B.

To see that A and B form a minimal pair, suppose that X ≤_{Km} A, B is not computable. We say that s is an increment stage if Km(X ↾ n_{s+1}) >
Km(X ↾ n_s). There must be infinitely many increment stages, so suppose there are infinitely many even increment stages, the odd case being symmetric. For each even increment stage s, let x be such that n_s < x ≤ n_{s+1} and Km(X ↾ (x − 1)) < Km(X ↾ x). As x > n_s, if X ↾ (x − 1) is terminal, then Km(X ↾ (x − 1)) ≥ K(A*_s) + c_2 + s by condition (ii). If X ↾ (x − 1) is not terminal, then

Km(X ↾ x) = max{Km((X ↾ (x − 1))0), Km((X ↾ (x − 1))1)} ≥ K(x)/3 − c_1,

by Lemma 9.16.1(ii). By condition (iii), K(x) ≥ 3(K(A*_s) + c_1 + c_2 + s). Thus, in either case, Km(X ↾ x) ≥ K(A*_s) + c_2 + s. By Lemma 9.16.1(v),

Km(A ↾ x) = Km(A*_s 0^{x−|A*_s|}) ≤ K(A*_s) + c_2.

Since s is unbounded and X ≤_{Km} A, we have a contradiction.

Calhoun [46] also showed how to construct many uncountable Km-degrees. For a set A and a strictly increasing function f, let (A ⊗ f)(n) be A(f^{−1}(n)) if n is in the range of f, and 0 otherwise. Similarly, (σ ⊗ f)(n) is σ(f^{−1}(n)) if n is in the range of f and f^{−1}(n) < |σ|, and is 0 for all other n < f(|σ|). Let f^{−1}[n] = max{k : f(k) ≤ n}.

Theorem 9.16.5 (Calhoun [46]). If f is strictly increasing and computable, then Km((A ⊗ f) ↾ n) = Km(A ↾ f^{−1}[n]) ± O(1). Hence, if Km(A ↾ n) = Km(B ↾ n) ± O(1) then Km((A ⊗ f) ↾ n) = Km((B ⊗ f) ↾ n) ± O(1).

Proof. When specifying a monotone machine M′, we issue instructions of the form M′(σ) → τ. What we mean is that, on input σ, we want M′ to write τ on its output tape. We might also issue another instruction M′(σ) → τ′. As long as τ and τ′ are compatible, M′ is still well-defined. (Of course, to ensure that M′ is monotone, we have to satisfy a stricter condition: if we issue instructions M′(σ) → τ and M′(σ′) → τ′ and σ and σ′ are compatible, then τ and τ′ must be compatible.)

To show Km((A ⊗ f) ↾ n) ≤ Km(A ↾ f^{−1}[n]) + O(1), we build a monotone machine M′. Let M′(ρ) → (τ ⊗ f)0^k if M(ρ) → τ and k < f(|τ| + 1) − f(|τ|). To see that M′ is monotone, suppose that M′(ρ) → σ_1 and M′(ν) → σ_2 with ρ and ν compatible. Then the corresponding τ_1 and τ_2 must also be compatible, as M is monotone. Assuming τ_1 ⪯ τ_2, it is easy to see that (τ_1 ⊗ f)0^k ≺ (τ_2 ⊗ f)0^ω for any k < f(|τ_1| + 1) − f(|τ_1|), and hence σ_1 and σ_2 are compatible. Thus M′ is monotone. Now, if σ = (A ⊗ f) ↾ n, then M′(ρ) → σ where ρ is a minimal-length M-description of A ↾ f^{−1}[n], and hence Km(σ) ≤ |ρ| = Km(A ↾ f^{−1}[n]).

To show Km(A ↾ f^{−1}[n]) ≤ Km((A ⊗ f) ↾ n) + O(1), we build a monotone machine N. Let N(ρ) → σ whenever M(ρ) → σ ⊗ f. If ρ and ν are compatible and we have made N(ρ) → σ and N(ν) → σ′, then, since M is monotone, σ ⊗ f and σ′ ⊗ f are compatible, and we may suppose σ ⊗ f ⪯ σ′ ⊗ f. Then σ ⪯ σ′ by definition of ⊗. Thus N is monotone. Letting σ = A ↾ f^{−1}[n], we have σ ⊗ f ⪯ (A ⊗ f) ↾ n, and hence if ρ
is a minimal-length M-description of σ ⊗ f, then Km(σ) ≤ |ρ| + O(1) = Km(σ ⊗ f) + O(1) ≤ Km((A ⊗ f) ↾ n) + O(1).

Corollary 9.16.6 (Calhoun [46]). There is an order-preserving embedding from the rationals into the monotone degrees such that each degree in the image of the embedding has cardinality 2^{ℵ0}.

Proof. For any rational number r ∈ (0, 1), let f_r(n) = ⌊n/r⌋. Let A be 1-random. By Theorem 9.16.5, Km((A ⊗ f_r) ↾ n) = Km(A ↾ f_r^{−1}[n]) ± O(1) = f_r^{−1}[n] ± O(1). By construction, f_r^{−1}[n] = max{k : ⌊k/r⌋ ≤ n} = rn ± O(1). Then the map r ↦ deg_{Km}(A ⊗ f_r) induces the desired embedding: If r < s then lim_n (sn − rn) = ∞, so A ⊗ f_r <_{Km} A ⊗ f_s. Moreover, every 1-random set X satisfies Km(X ↾ n) = n ± O(1), so the Km-degree of A ⊗ f_r contains B ⊗ f_r for every 1-random B, and hence has cardinality 2^{ℵ0}.
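The padding operation ⊗ and the inverse bound f^{−1}[n] are concrete enough to compute; a small sketch (function names ours):

```python
def otimes(sigma, f, length):
    # (sigma ⊗ f) on positions 0, ..., length-1: bit sigma(k) sits at
    # position f(k); every position outside the range of f is 0.
    pos = {f(k): k for k in range(len(sigma))}
    return [sigma[pos[n]] if n in pos else 0 for n in range(length)]

def f_inv_bracket(f, n):
    # f^{-1}[n] = max {k : f(k) <= n}, assuming f strictly increasing, f(0) <= n
    k = 0
    while f(k + 1) <= n:
        k += 1
    return k
```

With f(k) = 2k (this is f_r for r = 1/2), the string 101 is stretched onto the even positions: otimes([1,0,1], f, 6) gives [1,0,0,0,1,0], and f_inv_bracket(f, 5) = 2, matching the intuition that (A ⊗ f_r) ↾ n carries about rn bits of A.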
9.17 Schnorr reducibility

One can define measures of relative randomness tailored to notions other than 1-randomness. We give an example in this section. Given the machine characterization of Schnorr randomness from Section 7.1.3, we can define a reducibility analogous to K-reducibility for Schnorr randomness.

Definition 9.17.1 (Downey and Griffiths [109]). A set A is Schnorr reducible to a set B (written A ≤_Sch B) if for each computable measure machine M there is a computable measure machine N such that K_N(A ↾ n) ≤ K_M(B ↾ n) + O(1).⁵

Clearly, if A is Schnorr random and A ≤_Sch B, then B is Schnorr random. We prove two more facts about this reducibility. The first shows that it is not implied by K-reducibility, or even by cl-reducibility.

Theorem 9.17.2 (Downey, Griffiths, and LaForte [110]). There are c.e. sets A and B such that A ≤_cl B but A ≰_Sch B.

Proof. By Proposition 7.1.19, we can restrict our attention to machines M such that μ(dom M) = 1. Let M_0, M_1, ... be an effective list of all prefix-free machines. We build a computable measure machine M and c.e. sets A ≤_cl B to satisfy the requirements

R_e: μ(dom M_e) = 1 ⇒ ∃n (K_M(B ↾ n) ≤ K_{M_e}(A ↾ n) − e).

We build M indirectly, by enumerating a KC set L.
⁵It would be possible to define a uniform version of Schnorr reducibility, where an index for N must be obtained computably from an index for M, but this notion remains unexplored at the time of writing.
For each R_e we set aside a block [n_e, n_e + 3e + 1], with n_{e+1} > n_e + 3e + 1. We will never allow n_e + j for 0 < j ≤ 3e + 1 to enter B, but we may possibly put n_e into B. Thus there are 2^{e+1} many possible values for B ↾ (n_e + 3e + 2). For each e, we enumerate a request (2e + 2, τ) for each of these possible values τ. These are all the requests ever put into L, so the weight of L is equal to ∑_e 2^{e+1} 2^{−(2e+2)} = ∑_e 2^{−(e+1)} = 1. Thus there is a computable measure machine M such that K_M(B ↾ (n_e + 3e + 2)) ≤ 2e + 2 for all e. So it is enough to ensure that if μ(dom M_e) = 1 then K_{M_e}(A ↾ (n_e + 3e + 2)) ≥ 3e + 2.

There are fewer than 2^{3e+2} many M_e-programs of length less than 3e + 2, so there must be a string τ ∈ 2^{3e+2} such that if A(n_e) ... A(n_e + 3e + 1) = τ, then K_{M_e}(A ↾ (n_e + 3e + 2)) ≥ 3e + 2, regardless of the other values of A. So for each e we wait for a stage s such that 1 − μ(dom M_e[s]) < 2^{−(3e+1)}. At that stage, we know M_e(μ) for all M_e-programs μ of length less than 3e + 2, and hence can determine a τ as above. We put numbers into A to ensure that A(n_e) ... A(n_e + 3e + 1) = τ, and put n_e into B. The latter action ensures that A ≤_cl B.

It turns out that tt-reducibility is related to Schnorr reducibility much as wtt-reducibility is to K-reducibility. This is not a surprising fact, since the essential difference between a tt-reduction and a wtt-reduction is that the former has a computable domain, which is exactly what distinguishes a computable measure machine from an ordinary prefix-free machine. To obtain a version of tt-reducibility that implies Schnorr reducibility, we need to restrict the use in the same way that we did to obtain cl-reducibility from wtt-reducibility.

Definition 9.17.3 (Downey, Griffiths, and LaForte [110]). A set A is strongly truth table reducible to a set B (written A ≤_stt B) if A ≤_tt B via a truth table reduction Γ with use γ(n) ≤ n + O(1).

Theorem 9.17.4 (Downey, Griffiths, and LaForte [110]). If A ≤_stt B then A ≤_Sch B.

Proof.
Let A ≤_stt B via a reduction Γ with use γ(n) ≤ n + c. Let γ_0, ..., γ_{2^c−1} be the strings of length c. Let M be a computable measure machine. Build a KC set L as follows. For every σ and τ such that M(σ)↓ = τ and every j < 2^c, enumerate a request (|σ| + c, Γ^{τγ_j} ↾ |τ|). (Since Γ is a tt-reduction, Γ^{τγ_j} ↾ |τ| is always defined.) For each σ ∈ dom M, we add 2^c · 2^{−(|σ|+c)} = 2^{−|σ|} to the weight of L, so this weight is equal to μ(dom M). Thus L is a KC set with computable weight, and hence there is a corresponding computable measure machine N given by the KC Theorem. If M(σ)↓ = B ↾ n then there is a string μ of length |σ| + c such that N(μ)↓ = Γ^{B↾(n+c)} ↾ n = A ↾ n, so K_N(A ↾ n) ≤ K_M(B ↾ n) + O(1).
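The weight computation in the proof of Theorem 9.17.2 — 2^{e+1} requests of length 2e + 2 per requirement, total weight ∑_e 2^{e+1} · 2^{−(2e+2)} = 1 — can be checked with exact arithmetic (a throwaway sketch, names ours):

```python
from fractions import Fraction

def kc_weight(num_requirements):
    # Weight contributed by requirements R_0, ..., R_{e-1}: each R_e issues
    # 2^(e+1) requests, each of length 2e+2 and hence of weight 2^-(2e+2).
    return sum(Fraction(2 ** (e + 1), 2 ** (2 * e + 2))
               for e in range(num_requirements))
```

The partial sums are 1 − 2^{−N}, so the total weight of L is exactly 1, the largest value a KC set may have.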
10 Complexity and Relative Randomness for 1-Random Sets
In this chapter, we examine basic questions about the function K, as well as related questions about the orderings ≤_C and ≤_K, and other measures of relative complexity. We also introduce some other natural measures of relative randomness. These measures, which include low for 1-randomness reducibility ≤_LR and van Lambalgen reducibility ≤_vL, are at first sight not directly related to initial segment complexity, but can be used to naturally relate initial segment complexity to Turing reducibility and other measures of relative computational complexity. Most of the material in this chapter concerns relative initial segment complexity and relative randomness within the class of 1-random sets, where some of the stronger reducibilities considered in the previous chapter do not seem to give us much information.
10.1 Uncountable lower cones in K-reducibility and C-reducibility

We begin by establishing a relatively easy result and looking forward to some related ones we will prove later in the chapter. One property that typical computability-theoretic reducibilities share is that, for any given set A, there are only countably many sets reducible to A. This countable predecessor property does not hold of ≤_K and ≤_C.

Theorem 10.1.1 (Yu, Ding, and Downey [415]). If A is 1-random then there are 2^{ℵ0} many sets K-reducible to A, and indeed there are sets of every m-degree K-reducible to A.

R.G. Downey and D. Hirschfeldt, Algorithmic Randomness and Complexity, Theory and Applications of Computability, DOI 10.1007/978-0-387-68441-3_10, © Springer Science+Business Media, LLC 2010
Proof. For a set B, let B̂ = {2^m : m ∈ B}. Then K(B̂ ↾ n) ≤ K(n) + K(B ↾ log n) + O(1) ≤ O(log n) ≤ n + O(1) ≤ K(A ↾ n) + O(1), so B̂ is K-reducible to A. But the map B ↦ B̂ is 1-1 and m-degree preserving (except when B = N), so the result follows.
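The padding map B ↦ B̂ drives this proof: the first n bits of B̂ depend only on n and roughly the first log n bits of B, which is why K(B̂ ↾ n) is logarithmic. A quick sketch (names ours):

```python
def hat_prefix(B, n):
    # First n bits of B-hat = {2^m : m in B}; only B(m) for 2^m < n is consulted.
    out = []
    for x in range(n):
        is_pow2 = x > 0 and (x & (x - 1)) == 0
        out.append(1 if is_pow2 and B[x.bit_length() - 1] else 0)
    return out
```

For B with characteristic string 101 (so 0, 2 ∈ B), B̂ contains 2^0 = 1 and 2^2 = 4, and hat_prefix([1,0,1], 6) returns [0,1,0,0,1,0]; only three bits of B were consulted to produce six bits of B̂.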
Essentially the same proof shows the following.

Corollary 10.1.2 (Yu, Ding, and Downey [415]). If A is 1-random then there are 2^{ℵ0} many sets C-reducible to A, and indeed there are sets of every m-degree C-reducible to A.

Perhaps strangely, it was rather difficult to construct an uncountable K-degree. The existence of such a degree follows from work of Reimann and Stephan [330] with a little work,¹ though the original proof is an unpublished one due to Joe Miller. Space considerations preclude us from presenting this material. One consequence of this result is that it is not obvious that there are continuum many K-degrees, since the usual argument (every degree is countable but there are continuum many sets) does not apply. However, while the cardinality of the collection of sets K-reducible to a given set A might be large, the measure of the class of sets K-reducible to A is always 0, as shown by Yu, Ding, and Downey [415] and proved below as Theorem 10.3.13. This result is enough to conclude that there are uncountably many K-degrees, and in fact enough to prove the result of Yu and Ding [413] that there are continuum many K-degrees, as we will discuss in Section 10.3.3. Another route to this result is through Theorem 10.5.12 below, which shows that K-degrees of 1-random sets are countable. Since there are continuum many 1-random sets, this result implies that there are continuum many K-degrees of 1-random sets. In Section 10.3.4, we will see that the same facts hold of C-reducibility. Before looking at this material, however, we will examine the initial segment complexity of Ω, and introduce other measures of relative randomness, particularly van Lambalgen reducibility, first defined by Miller and Yu [280].
¹Referring to that paper, specifically page 9, Theorem 1 and Lemma 1, Miller points out that the construction shows (for the right choice of premeasure) that there is an X with initial segment complexity K(X ↾ n) = n/2 ± O(1). The construction is loose enough that one could code any set into X (on the bits in positions dn for some constant d), so there are continuum many sets with initial segment complexity n/2 ± O(1).
10.2 The K-complexity of Ω and other 1-random sets

10.2.1 K(A ↾ n) versus K(n) for 1-random sets A

We have seen a number of results, particularly those of Miller and Yu [280] in Section 6.6, establishing relatively precise bounds on the behavior of K(A ↾ n) for a 1-random set A. We now present results of Solovay [371] on the relationship between K(n) and K(A ↾ n) for 1-random sets A. Recall Corollary 6.6.3, which states that if A is 1-random and ∑_n 2^{−f(n)} = ∞, then there are infinitely many n such that K(A ↾ n) > n + f(n) − O(1). As mentioned in Section 6.6, Solovay [371] had earlier proved this result for computable f. Note that, since K(A ↾ m) ≤ m + K(m) + O(1) for all m, the n such that K(A ↾ n) > n + f(n) − O(1) must be fairly complex, in the sense that K(n) ≥ f(n) − O(1) for all such n. Solovay [371] showed that, for computable f, there must also be such complex n for which the complexity of A ↾ n is not too high.

Theorem 10.2.1 (Solovay [371]). Let A be 1-random. Let f be a computable function such that ∑_n 2^{−f(n)} = ∞, and let h be a computable order.² Then there exist infinitely many n such that K(n) ≥ f(n) − O(1) and K(A ↾ n) ≤ n + h(n) + O(1).³

Proof. Let f̂(n) = min{f(n), 2 log n}. Since ∑_{f(n) ≥ 2 log n} 2^{−f(n)} < ∞, we have ∑_n 2^{−f̂(n)} = ∞. It is enough to prove the theorem with f̂ in place of f, since then the fact that K(n) < 2 log n for almost all n implies that f̂(n) = f(n) for almost all of the infinitely many n mentioned in the theorem.

We claim there is a computable function g ≥ f̂ such that ∑_n 2^{−g(n)} = ∞ and K(n) ≤ g(n) + h(n) + O(1). Assuming the claim for now, we argue as follows. Since g ≥ f̂, it is enough to prove the theorem for g in place of f̂. By Theorem 3.11.4, K(A ↾ n) ≤ n + K(n) − g(n) + O(1) for infinitely many n. For each such n, K(n) ≥ K(A ↾ n) − n + g(n) − O(1) ≥ g(n) − O(1), since A is 1-random, and K(A ↾ n) ≤ n + K(n) − g(n) + O(1) ≤ n + g(n) + h(n) − g(n) + O(1) = n + h(n) + O(1). Thus it remains to establish our claim.
To show that K(n) ≤ g(n) + h(n) + O(1), it is enough to show that ∑_n 2^{−g(n)−h(n)} < ∞. We define g in stages so that for each n, we have g(n) equal to either f̂(n) or 2 log n. At the beginning of stage i, we have defined g(n) for n < m_i for some m_i. Let m_{i+1} be least such that

1 ≤ ∑_{m_i ≤ n < m_{i+1}, h(n) ≥ i} 2^{−f̂(n)} ≤ 2,

which exists because ∑_n 2^{−f̂(n)} = ∞ and lim_n h(n) = ∞. For n ∈ [m_i, m_{i+1}), let g(n) = 2 log n if h(n) < i, and let g(n) = f̂(n) otherwise. Then

∑_n 2^{−g(n)} ≥ ∑_i ∑_{m_i ≤ n < m_{i+1}, h(n) ≥ i} 2^{−f̂(n)} ≥ ∑_i 1 = ∞

and

∑_n 2^{−g(n)−h(n)} ≤ ∑_i ( ∑_{m_i ≤ n < m_{i+1}, h(n) ≥ i} 2^{−f̂(n)−h(n)} + ∑_{m_i ≤ n < m_{i+1}, h(n) < i} 2^{−2 log n − h(n)} )
≤ ∑_i 2^{−i} ∑_{m_i ≤ n < m_{i+1}, h(n) ≥ i} 2^{−f̂(n)} + ∑_n 2^{−2 log n}
≤ ∑_i 2^{−i+1} + ∑_n 2^{−2 log n} < ∞.

²The prototypes here are f(n) = log n and h(n) = log log n.
³It is not hard to see that this result would not hold in general without the computability hypothesis on f and h.
10.2.2 The rate of convergence of Ω and the α function

The following definition provides a kind of "monotone approximation" to K.

Definition 10.2.2 (Chaitin, see [371]). Let α(n) = min{K(m) : m ≥ n}.

Note that if n > m then α(n) ≥ α(m). We can similarly define α^X using K^X for any set X. The function α grows slower than any computable order.

Proposition 10.2.3 (Chaitin, see [371]). If g is a computable order then α(n) < g(n) for almost all n.

Proof. Let h(k) be the least n such that g(n) > k. Then K(h(k)) ≤ K(k) + O(1) ≤ O(log k). Since h(g(n)) > n for all n, we have α(n) ≤ K(h(g(n))) ≤ O(log g(n)), which means that α(n) < g(n) for all sufficiently large n.

We now consider the rate of convergence of Ω as a series, in the version defined as ∑_n 2^{−K(n)}. Let

s(n) = −log ∑_{j≥n} 2^{−K(j)}.
Notice that s(n) goes to infinity with n.
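For any finite table of description lengths, α and the finite truncation of s are directly computable, and the inequality s(n) ≤ α(n) of Theorem 10.2.4 below is visible immediately, since the tail sum is at least its largest term. A toy sketch (the table K is hypothetical data, not the real complexity function, and the tail sum here is truncated):

```python
import math

def alpha(K, n):
    # alpha(n) = min{K(m) : m >= n}, over the finite table K
    return min(K[m] for m in range(n, len(K)))

def s(K, n):
    # truncated version of s(n) = -log2 of the tail sum of 2^{-K(j)}
    return -math.log2(sum(2.0 ** -K[j] for j in range(n, len(K))))
```

Truncating the tail only increases s, yet s(n) ≤ α(n) still holds on the truncation, because the tail sum already contains the single term 2^{−α(n)}.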
Theorem 10.2.4 (Solovay [371]). α(n) − O(log α(n)) ≤ s(n) ≤ α(n).

Proof. By definition, there is an m ≥ n such that K(m) = α(n), so Σ_{j≥n} 2^{−K(j)} ≥ 2^{−K(m)} = 2^{−α(n)}. Taking logs, we see that s(n) ≤ α(n).

For the other inequality, we will show how to compute from Ω↾n an integer t_n so that (i) K(t_n | Ω↾n) = O(1) and (ii) s(t_n) > n. Assuming we have such integers, let k = s(n). Then s(t_k) > k = s(n), so t_k > n. Together with (i), this fact implies that

α(n) ≤ K(t_k) ≤ K(Ω↾k) + O(1) ≤ k + K(k) + O(1) = s(n) + O(log s(n)).

Since s(n) ≤ α(n), we have α(n) = s(n) ± O(log s(n)), so α(n) is asymptotically equal to s(n), and hence O(log α(n)) = O(log s(n)), giving the desired inequality.

We finish by defining integers t_n satisfying (i) and (ii). Our result uses Theorem 3.13.2, which we now recall. Let p(n) = |{σ ∈ 2^n : U(σ)↓}|. Then Theorem 3.13.2 states that p(n) ∼ 2^{n−K(n)}. As usual, let Ω_s = Σ_{U(σ)[s]↓} 2^{−|σ|}, where we may assume that at each stage s exactly one new string σ_s enters the domain of U. Given Ω↾n, compute the least k_n such that Ω_{k_n}↾n = Ω↾n. Let t_n be the least integer greater than |σ_i| for all i ≤ k_n. Then there is a c such that

2^{−n} > Ω − Ω_{k_n} ≥ Σ {2^{−|σ|} : |σ| ≥ t_n ∧ U(σ)↓} = Σ_{j≥t_n} 2^{−j} p(j) ≥ c Σ_{j≥t_n} 2^{−j} 2^{j−K(j)} = c · 2^{−s(t_n)}.

Taking logs, we see that s(t_n) ≥ n − O(1). Thus, for a suitable choice of k, independent of n, if we let t′_n = t_{n+k} then s(t′_n) > n. Furthermore, K(t′_n | Ω↾n) ≤ K(t′_n | Ω↾(n+k)) + K(Ω↾(n+k) | Ω↾n) + O(1). The first term is O(1) by the explicit description of t_{n+k} from Ω↾(n+k). The second is evidently O(1) since k is fixed independently of n.
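The easy direction of Theorem 10.2.4, s(n) ≤ α(n), holds for any nonnegative table of values, since the tail sum Σ_{j≥n} 2^{−K(j)} always dominates its largest single term 2^{−α(n)}. A numeric sketch on a toy finite table (again a stand-in for the uncomputable K; the truncation to a finite tail is ours):

```python
import math

def s_and_alpha(K):
    """For a toy finite 'complexity' table, return pairs (s(n), alpha(n)) where
    s(n) = -log2 of the tail sum of 2^{-K(j)} and alpha(n) = min of the tail."""
    out = []
    for n in range(len(K)):
        tail = sum(2.0 ** (-K[j]) for j in range(n, len(K)))  # finite stand-in for j >= n
        out.append((-math.log2(tail), min(K[n:])))
    return out
```

Since each tail contains the term 2^{−α(n)}, the computed s(n) never exceeds α(n), and both sequences are nondecreasing in n.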
10.2.3 Comparing initial segment complexities of 1-random sets

The first person to study (implicitly) the K-degrees of 1-random sets was Solovay [371]. He used the relativized version of the α function of the previous subsection to show that no 2-random set is K-reducible to Ω.⁴ Of course, we already know that this is the case. Indeed, Theorem 15.6.5 (which we discussed in Section 6.11) implies that if A is 2-random and A ≤_K B, then B is 2-random. (We will extend this result to n-randomness in Corollary 10.3.11.) However, Solovay's methods are of independent interest, so we give his original proof here. We begin with the following result.

Theorem 10.2.5 (Solovay [371]). There are infinitely many m such that K(Ω↾m) ≤ m + α^{∅′}(m) + O(log α^{∅′}(m)).⁵
Proof. We work with α^Ω, which is equal to α^{∅′} up to an additive constant. Let U be our fixed universal prefix-free oracle machine, and adopt the convention that if U^X(τ)↓ = ρ, then the use of this computation is at least |ρ|. Let M be the prefix-free machine defined as follows. The machine M tries to write its input as στν such that U(σ) = |τ| and U^ν(τ)↓ with use exactly |ν|. If that is possible then M outputs ν.

Let p_n be the least i such that α^Ω(i) ≥ n. It is easy to see that K^Ω(p_n) ≤ n + O(1). Let τ be a minimal-length U^Ω-program for p_n. Let σ be a minimal-length U-program for |τ|. Let m_n be least such that U^{Ω↾m_n}(τ)↓. Then M(στ(Ω↾m_n))↓ = Ω↾m_n. We have |τ| = n + O(1), whence |σ| ≤ O(log n). Furthermore, m_n ≥ p_n, so α^Ω(m_n) ≥ n. Thus

K(Ω↾m_n) ≤ |σ| + |τ| + |Ω↾m_n| + O(1) ≤ m_n + n + O(log n) ≤ m_n + α^Ω(m_n) + O(log α^Ω(m_n)).
As we will now see, Solovay [371] also showed that if X is 2-random then K(X↾n) ≥ n + α(n) − O(log α(n)). Since α^{∅′} grows more slowly than any ∅′-computable function, this result, together with the theorem above, shows that no 2-random set is K-reducible to Ω. We begin with a lemma.
Lemma 10.2.6 (Solovay [371]). K^{∅′}(n) ≤ K(n) − α(n) + O(log α(n)).

Proof. Let s(n) = −log Σ_{j≥n} 2^{−K(j)}. By Theorem 10.2.4, there is a c such that s(n) ≥ α(n) − c log α(n). Let m_k be the least n such that α(n) = k. (The sums below are taken over only those k such that m_k is defined.) Let

S = {(K(n) − α(n) + (c + 2) log α(n), n) : n ∈ N}.
⁴Solovay actually stated his results in terms of arithmetic randomness, but the strengthening to 2-randomness can be extracted from his proofs.
⁵We will see in the proof of Theorem 10.2.9 that in fact there are infinitely many m such that K(Ω↾m) ≤ m + α^{∅′}(m) + O(1).
Then S is ∅′-computable. Furthermore,

Σ_n 2^{−K(n)+α(n)−(c+2) log α(n)} = Σ_k Σ_{α(n)=k} 2^{−K(n)+k−(c+2) log k}
  = Σ_k 2^{k−(c+2) log k} Σ_{α(n)=k} 2^{−K(n)}
  ≤ Σ_k 2^{k−(c+2) log k} Σ_{n≥m_k} 2^{−K(n)}
  = Σ_k 2^{k−(c+2) log k} 2^{−s(m_k)}
  ≤ Σ_k 2^{k−(c+2) log k} 2^{−α(m_k)+c log α(m_k)}
  = Σ_k 2^{k−(c+2) log k} 2^{−k+c log k}
  = Σ_k 2^{−2 log k} < ∞.
Thus, by the KC Theorem relativized to ∅′, we have K^{∅′}(n) ≤ K(n) − α(n) + (c + 2) log α(n) + O(1).

Theorem 10.2.7 (Solovay [371]). If X is 2-random then K(X↾n) ≥ n + α(n) − O(log α(n)).

Proof. Let p_n be the position of X↾n in the length-lexicographic listing of 2^{<ω}. Note that p_n ≥ n. If X is 2-random then K^{∅′}(X↾n) ≥ n − O(1). By Lemma 10.2.6, we have
K(X↾n) = K(p_n) ≥ K^{∅′}(p_n) + α(p_n) − O(log α(p_n))
  ≥ K^{∅′}(X↾n) + α(p_n) − O(log α(p_n))
  ≥ n + α(p_n) − O(log α(p_n))
  ≥ n + α(n) − O(log α(n)),

the last inequality following by the monotonicity of α. We will return to the connections between 2-randomness and the α and s functions at the end of Section 10.4.

Corollary 10.2.8 (Solovay [371]). If X is 2-random then X ≰_K Ω.

There is no obvious way to extend Solovay's methods beyond 2-randomness. However, we will introduce methods below that allow us to show in Corollary 10.3.11 that indeed Ω^{∅^{(n)}} |_K Ω^{∅^{(m)}} (and, in fact, the K-degrees of Ω^{∅^{(n)}} and Ω^{∅^{(m)}} have no upper bound) for all m ≠ n. We finish this subsection with the following result, which shows that K(Ω↾n) is sometimes very close to n.

Theorem 10.2.9 (Solovay [371]). There is an infinite sequence of pairs of numbers m^0_n, m^1_n such that for i = 0, 1,
(i) α^{∅′}(m^i_n) = n ± O(1),
(ii) K(Ω↾m^i_n) ≤ m^i_n + n + O(1),
(iii) K(m^0_n) = α(m^0_n) ± O(1), and
(iv) K(m^1_n) = log m^1_n + K(log m^1_n) ± O(1).
Proof. As in the proof of Theorem 10.2.5, we work with α^Ω in place of α^{∅′}. We begin by building m_0, m_1, ... satisfying (i) and (ii). That is, α^{∅′}(m_n) = n ± O(1) and K(Ω↾m_n) ≤ m_n + n + O(1).

At a first pass, we could use the proof of Theorem 10.2.5. Let p_n and m_n be as in that proof. Then α^Ω(m_n) ≥ n. But K^Ω(p_n) ≤ n + O(1) and m_n can be obtained Ω-computably from p_n, so in fact α^Ω(m_n) = n ± O(1), as required. Furthermore, as shown in the proof of Theorem 10.2.5, K(Ω↾m_n) ≤ m_n + n + O(log n).

To remove the O(log n) term, we modify the machine M from the proof of Theorem 10.2.5 to a new prefix-free machine N. The idea is to interlace the oracle and input information to eliminate the need to provide the length of the input as additional information. Thus N attempts to write its input as x_0 y_0 x_1 y_1 ... x_n y_n σ (where the x_i and y_i are bits and σ is a string) so that, letting τ = x_0 ... x_n and ν = y_0 ... y_n σ, we have U^ν(τ)↓. (We adopt the convention that if U^ν(τ)↓ then |ν| ≥ |τ|.) If that is possible, then N outputs ν. It is easy to check that N is prefix-free and that, with m_n as above and by a similar argument as before, N witnesses the fact that K(Ω↾m_n) ≤ m_n + n + O(1).

Now we obtain m^0_n and m^1_n from m_n. Let d be such that for every m there is an m′ ∈ [m, 4m] with K(m′) ≥ log m′ + K(log m′) − d. Let m^0_n be the least number greater than or equal to m_n such that K(m^0_n) = α(m_n), and let m^1_n be the least number greater than or equal to m_n such that K(m^1_n) ≥ log m^1_n + K(log m^1_n) − d. Then (iii) and (iv) are satisfied. By monotonicity, α^{∅′}(m^i_n) ≥ α^{∅′}(m_n) = n ± O(1). But the m^i_n can be obtained ∅′-computably from m_n, so

α^{∅′}(m^i_n) ≤ K^{∅′}(m^i_n) ≤ K^{∅′}(m_n) + O(1) ≤ n + O(1),

the last inequality following because the m_n can be obtained ∅′-computably from the p_n, and we have already seen that K^{∅′}(p_n) ≤ n + O(1). Thus (i) is satisfied.

Finally, it is easy to check that each m^i_n is computable from Ω↾m_n, so we can describe Ω↾m^i_n in a prefix-free way by giving a prefix-free description of Ω↾m_n and, since that description provides us with m_n and m^i_n, a plain description of Ω↾[m_n, m^i_n). Thus

K(Ω↾m^i_n) ≤ K(Ω↾m_n) + C(Ω↾[m_n, m^i_n)) + O(1) ≤ m_n + n + m^i_n − m_n + O(1) = m^i_n + n + O(1),

so (ii) is satisfied.
10.2.4 Limit complexities and relativized complexities

There are echoes of Solovay's work in more recent results of Bienvenu, Muchnik, Shen, and Vereshchagin [42, 399].
Theorem 10.2.10 (Vereshchagin [399]). C^{∅′}(σ) = lim sup_n C(σ | n) ± O(1).
Proof. Let V be our fixed universal plain machine. Let M(τ, n) = V^{∅′}(τ)[n] (where both the oracle and the universal machine are being approximated). If n is large enough and τ is a minimal-length V^{∅′}-program for σ, then M(τ, n) = σ, whence C(σ | n) ≤ |τ| + O(1) = C^{∅′}(σ) + O(1). Thus lim sup_n C(σ | n) ≤ C^{∅′}(σ) + O(1).

For the other direction, let k = lim sup_n C(σ | n) + 1. Let V_n = {σ : C(σ | n) < k}. We have |V_n| < 2^k for all n. Let B = {τ : ∃m ∀n ≥ m (τ ∈ V_n)}. Then B is ∅′-c.e. (uniformly in k), σ ∈ B, and |B| < 2^k. Since we can describe σ from ∅′ by giving its position in the enumeration of B as a string of length k, we have C^{∅′}(σ) ≤ k + O(1) = lim sup_n C(σ | n) + O(1).

Theorem 10.2.11 (Bienvenu, Muchnik, Shen, and Vereshchagin [42]). K^{∅′}(σ) = lim sup_n K(σ | n) ± O(1).
Proof. The argument that lim sup_n K(σ | n) ≤ K^{∅′}(σ) + O(1) is the same as in the previous proof, with U in place of V, since then M is prefix-free for any fixed n.

Let (σ_0, m_0, k_0), (σ_1, m_1, k_1), ... be an effective enumeration of 2^{<ω} × N × N. Let K_0 = K. Given K_s, let J be the result of altering K_s so that J(σ_s | n) = min{k_s, K_s(σ_s | n)} for all n ≥ m_s, leaving all other values of K_s unchanged. If J is an information content measure, in the sense of Definition 3.7.7, then let K_{s+1} = J and say that k_s is good for σ_s. Otherwise let K_{s+1} = K_s. Let I(σ) be the least k that is good for σ. Checking whether a given J as above is an i.c.m. can be done ∅′-computably, so I(σ) is ∅′-approximable from above. For each s, let I_s(σ) be the least k that has been verified to be good for σ by stage s (or ∞ if there has been none). It is then easy to verify by induction that Σ_σ 2^{−I_s(σ)} ≤ 1 for all s, so that Σ_σ 2^{−I(σ)} ≤ 1. Thus I is an i.c.m. relative to ∅′, and hence K^{∅′}(σ) ≤ I(σ) + O(1).

Let k = lim sup_n K(σ | n), let m be such that K(σ | n) ≤ k for all n ≥ m, and let s be such that (σ_s, m_s, k_s) = (σ, m, k). Then the J defined at stage s of the above construction equals K_s, whence k is good for σ. Thus I(σ) ≤ k, and hence K^{∅′}(σ) ≤ k + O(1) = lim sup_n K(σ | n) + O(1).

It is straightforward to translate the above proof into the language of discrete semimeasures, and indeed that is how the proof is presented in [42]. The limit frequency of a partial function f is the function q_f(k) = lim inf_n |{i < n : f(i) = k}| / n.
Using almost the same proof, we can show the following.

Theorem 10.2.14 (Bienvenu, Muchnik, Shen, and Vereshchagin [42]). KM^{∅′}(σ) = lim sup_n KM(σ | n) ± O(1).

Bienvenu, Muchnik, Shen, and Vereshchagin [42] remarked that it is unknown whether the analogous result holds for monotone complexity Km.
10.3 Van Lambalgen reducibility

In this section we introduce a measure of relative randomness that is interesting not only for its own sake, but also as a tool for the analysis of K- and C-reducibility and their interactions with Turing reducibility. The following definition is based on van Lambalgen's Theorem, which recall states that X ⊕ Y is 1-random iff X is 1-random and Y is 1-random relative to X.

Definition 10.3.1 (Miller and Yu [280]). We say that X is van Lambalgen reducible to Y, or simply vL-reducible to Y, and write X ≤_vL Y, if for all Z, if X ⊕ Z is 1-random, then Y ⊕ Z is 1-random. As usual, we call the equivalence classes induced by this reducibility the van Lambalgen degrees (vL-degrees).

Nies [302] defined the following closely related notion.

Definition 10.3.2 (Nies [302]). We say that Y is low for 1-randomness reducible to X, or simply LR-reducible to X, and write Y ≤_LR X, if for all Z, if Z is 1-random relative to X then Z is 1-random relative to Y.⁶

By van Lambalgen's Theorem, if X and Y are both 1-random, then Y ≤_LR X iff X ≤_vL Y. Notice that if X is not 1-random, then X ≤_vL Y for all Y, so vL-reducibility is interesting only on 1-random sets.

Another related reducibility is extended monotone reducibility, defined by writing X ≤_{Km⊕} Y if X ⊕ Z ≤_{Km} Y ⊕ Z for all Z. Since all 1-random sets have the same Km-degree, by Theorem 6.3.10, ≤_{Km⊕} implies ≤_vL, but ≤_{Km⊕} also makes sense on non-1-random sets. We will concentrate on studying van Lambalgen reducibility, although, of course, since we will be looking only at 1-random sets, all of the results below could be phrased in terms of LR-reducibility.

⁶The name of this reducibility comes from the concept of lowness for 1-randomness, which will be discussed in Section 11.2. In the present notation, a set Y is low for 1-randomness if Y ≤_LR ∅. As we will see in Chapter 11, the class of such sets is a natural and highly interesting one, with several characterizations in terms of notions of randomness-theoretic weakness.
The vL-degrees turn out to be a weak measure of degree of randomness. We will see in Corollaries 10.3.10 and 10.3.17 that both ≤_K and ≤_C refine ≤_vL. This is the key fact that allowed Miller and Yu [280] to transfer facts about the vL-degrees to the K- and C-degrees. This method is useful because many basic properties of the vL-degrees are easy to prove from known results.
10.3.1 Basic properties of the van Lambalgen degrees

In establishing the basic properties of the van Lambalgen degrees, we use van Lambalgen's Theorem, Kučera's Theorem 8.5.1 that every degree that computes ∅′ is 1-random, and Corollary 11.2.7, due to Kučera and Terwijn, that for every X there is a Y >_T X such that every set that is 1-random relative to X is also 1-random relative to Y. Although we prove Corollary 11.2.7 only in Chapter 11, since the relevant techniques are more properly presented there, we use it here as a black box.

Theorem 10.3.3 (Miller and Yu [280]).
(i) The least vL-degree is 0_vL = {X : X is not 1-random}.
(ii) If X ≤_vL Y and X is n-random, then Y is n-random.
(iii) If X ⊕ Y is 1-random, then X and Y have no upper bound in the vL-degrees, so there is no join in the vL-degrees.
(iv) If Y ≤_T X and Y is 1-random, then X ≤_vL Y.
(v) There are 1-random sets X ≡_vL Y such that X <_T Y.

Proof. (i) is immediate.

(ii) Let n > 1. By Kučera's Theorem, there is a 1-random Z ≡_T ∅^{(n−1)}. Then X is 1-random relative to Z, so X ⊕ Z is 1-random. Thus Y ⊕ Z is 1-random, so Y is 1-random relative to Z, and hence Y is n-random.

(iii) Assume for a contradiction that X, Y ≤_vL Z. Because X ≤_vL Z and X ⊕ Y is 1-random, Z ⊕ Y is also 1-random. Therefore, Y ⊕ Z is 1-random. But Y ≤_vL Z, so Z ⊕ Z must also be 1-random, which is a contradiction.

(iv) Let Z be such that X ⊕ Z is 1-random. Then Z is 1-random relative to X, and hence Z is 1-random relative to Y, which implies that Y ⊕ Z is 1-random. Thus X ≤_vL Y.
(v) Let X = Ω. By Corollary 11.2.7, there is a W >_T X such that every set that is 1-random relative to X is 1-random relative to W. By Kučera's Theorem, there is a 1-random Y ≡_T W. Then X <_T Y. By (iv), Y ≤_vL X, since X = Ω ≤_T Y and X is 1-random. Conversely, if X ⊕ Z is 1-random, then Z is 1-random relative to X, and hence 1-random relative to W ≡_T Y; since Y is 1-random, Y ⊕ Z is 1-random. Thus X ≤_vL Y, and so X ≡_vL Y.

The following corollary is immediate from part (iii) of Theorem 10.3.3, since if m ≠ n then Ω^{∅^{(m)}} ⊕ Ω^{∅^{(n)}} is 1-random.

Corollary 10.3.4 (Miller and Yu [280]). If m ≠ n, then Ω^{∅^{(m)}} and Ω^{∅^{(n)}} have no upper bound in the vL-degrees.

It is also clear that Ω is the least Δ⁰₂ 1-random set in the vL-degrees, and that no 2-random set can be vL-above a Δ⁰₂ 1-random set. Notice that parts (ii) and (iv) of Theorem 10.3.3 immediately imply Theorem 8.5.3, which states that if X is n-random and Y ≤_T X is 1-random, then Y is n-random. Kučera's Theorem 8.5.1 says that every degree above 0′ is 1-random, but no such degree can be n-random for any n > 1. Thus we have the following.

Proposition 10.3.5. For every n-random degree a there is a 1-random degree b > a that is not 2-random.
10.3.2 Relativized randomness and Turing reducibility

In this subsection we prove the following strong extension of Theorem 8.5.3.

Theorem 10.3.6 (Miller and Yu [280]). Let X ≤_T Y and let X be 1-random. For every Z, if Y is 1-random relative to Z, then X is also 1-random relative to Z.

To prove this result, we use the following lemma.
Lemma 10.3.7 (Miller and Yu [280]). If X is 1-random then

∀e ∃c ∀n [μ({A : Φ^A_e ↾ n = X↾n}) ≤ 2^{−n+c}].

Proof. Let C_σ = {A : Φ^A_e ↾ |σ| = σ}. Note that if σ and τ are incomparable then C_σ ∩ C_τ = ∅. Now let F_i = {σ : μ(C_σ) > 2^{−|σ|+i}} and let Q_i = ⟦F_i⟧. The Q_i form a sequence of uniformly Σ⁰₁ classes. We claim that μ(Q_i) ≤ 2^{−i}. Suppose not. Then there is some prefix-free D ⊆ F_i with μ(⟦D⟧) > 2^{−i}. We now have

1 ≥ μ({A : Φ^A_e ↾ |σ| = σ for some σ ∈ D}) = Σ_{σ∈D} μ(C_σ) > Σ_{σ∈D} 2^{−|σ|+i} = 2^i Σ_{σ∈D} 2^{−|σ|} = 2^i μ(⟦D⟧) > 2^i 2^{−i} = 1,
which is a contradiction. So the Q_i form a Martin-Löf test. Since X is 1-random, X ∉ Q_c for some c, and the result follows.

Proof of Theorem 10.3.6. Let X, Y, and Z be as in the theorem. Let Φ be such that Φ^Y = X, and let c be the constant for this Φ and X given by Lemma 10.3.7. For each σ, enumerate a set F_σ as follows. If Φ^τ ↾ |σ| = σ with use exactly |τ|, then add τ to F_σ, provided that doing so preserves the condition Σ_{ρ∈F_σ} 2^{−|ρ|} ≤ 2^{−|σ|+c}. Since each F_σ is prefix-free, we have μ(⟦F_σ⟧) = Σ_{τ∈F_σ} 2^{−|τ|}. Moreover, ⟦F_{X↾n}⟧ = {A : Φ^A ↾ n = X↾n} for all n, since the choice of c means that we are not prevented from adding any strings to F_{X↾n}.

Now suppose that X is not 1-random relative to Z. For each i, define the Σ^{0,Z}_1 class Q_i = ⋃_{K^Z(σ) ≤ |σ|−c−i} ⟦F_σ⟧. Then

μ(Q_i) ≤ Σ_{K^Z(σ)≤|σ|−c−i} μ(⟦F_σ⟧) ≤ Σ_{K^Z(σ)≤|σ|−c−i} 2^{−|σ|+c} ≤ Σ_σ 2^{−K^Z(σ)−i} ≤ 2^{−i}.

Hence the Q_i form a Martin-Löf test relative to Z. Since X is not 1-random relative to Z, for each i there is an n with K^Z(X↾n) ≤ n − c − i, which implies that Y ∈ {A : Φ^A ↾ n = X↾n} ⊆ Q_i, contradicting the assumption that Y is 1-random relative to Z.
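Both this proof and Lemma 10.3.7 lean on the fact that a prefix-free set of strings D satisfies Kraft's inequality Σ_{σ∈D} 2^{−|σ|} ≤ 1, so that the measures of the generated cylinders simply add up. A quick sketch (function names are ours):

```python
def is_prefix_free(D):
    """True if no string in D is a proper prefix of another."""
    return not any(a != b and b.startswith(a) for a in D for b in D)

def kraft_sum(D):
    """Sum of 2^{-|sigma|} over D; at most 1 when D is prefix-free,
    since the cylinders [sigma] are then pairwise disjoint."""
    return sum(2.0 ** -len(s) for s in D)
```

For instance, {00, 01, 10, 110} is prefix-free with Kraft sum 7/8, while {0, 01} is not prefix-free and its cylinders overlap.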
10.3.3 vL-reducibility, K-reducibility, and joins

The key ingredient in connecting ≤_K to ≤_vL was the realization by Miller and Yu [280] that the initial segment complexity of X determines, for any Z, whether X ⊕ Z is 1-random. This idea then allowed them to prove the important result that X ≤_K Y implies X ≤_vL Y, which means that the results of the previous section have consequences for the K-degrees.
Recall that strings of length n can be thought of as representing the numbers between 2^n − 1 and 2^{n+1} − 2. We also need some additional notation for the proofs below.

Definition 10.3.8 (Miller and Yu [280]). Given X = x_0 x_1 ... and Z = z_0 z_1 ..., let X ⊕̂ Z be

z_0 x_0 x_1 z_1 x_2 x_3 x_4 x_5 z_2 ... x_{2^n−2} ... x_{2^{n+1}−3} z_n x_{2^{n+1}−2} ...

Clearly, X ⊕ Z is 1-random iff X ⊕̂ Z is 1-random. In fact, X ⊕ Z ≡_vL X ⊕̂ Z. We can also define σ ⊕̂ τ for strings σ, τ ∈ 2^{<ω}, provided that 2^{|τ|−1} − 2 ≤ |σ| ≤ 2^{|τ|} − 2.

Theorem 10.3.9 (Miller and Yu [280]). X ⊕ Z is 1-random iff K(X↾(Z↾n)) ≥ (Z↾n) + n − O(1).

Proof. First, assume that X ⊕ Z is 1-random. Then X ⊕̂ Z is also 1-random. Let σ = Z↾n and σ′ = Z↾(n+1). Then K(X↾σ) = K((X↾σ) ⊕̂ σ′) ± O(1). (We use σ′ here to make the ⊕̂-join defined.) But (X↾σ) ⊕̂ σ′ = (X ⊕̂ Z)↾(σ + n + 1), so

K(X↾σ) = K((X ⊕̂ Z)↾(σ + n + 1)) ± O(1) ≥ σ + n − O(1).

For the other direction, define a prefix-free machine M as follows. To compute M(τ), look for τ_1, τ_2, η_1, and η_2 such that τ = τ_1 τ_2, for the universal prefix-free machine U we have U(τ_1) = η_1 ⊕̂ η_2, and |η_1 τ_2| = η_2. If these are found, define M(τ) = η_1 τ_2.

Assume that X ⊕̂ Z is not 1-random. Then for each k, there is an m such that K((X ⊕̂ Z)↾m) ≤ m − k. Take strings η_1 and η_2 such that η_1 ⊕̂ η_2 = (X ⊕̂ Z)↾m and let τ_1 be a minimal U-program for η_1 ⊕̂ η_2. Let n = |η_2|. Then |η_1| ≤ 2^n − 2 and η_2 ≥ 2^n − 1, so there is a string τ_2 such that η_1 τ_2 = X↾η_2. Then M(τ_1 τ_2) = X↾η_2. Therefore,

K(X↾(Z↾n)) = K(X↾η_2) ≤ K_M(X↾η_2) + O(1) ≤ |τ_1 τ_2| + O(1) ≤ K(η_1 ⊕̂ η_2) + |τ_2| + O(1) ≤ |η_1 η_2| − k + |τ_2| + O(1) = |η_1 τ_2| + |η_2| − k + O(1) = η_2 + |η_2| − k + O(1) = (Z↾n) + n − k + O(1),

where the constant depends only on M. Because k was arbitrary, K(X↾(Z↾n)) − (Z↾n) − n is not bounded below.

Corollary 10.3.10 (Miller and Yu [280]). If X ≤_K Y then X ≤_vL Y.

Combined with Theorem 10.3.3, this corollary has the following important consequence for the structure of the K-degrees.

Corollary 10.3.11 (Miller and Yu [280]).
(i) If X ≤_K Y and X is n-random, then Y is n-random.
(ii) If X ⊕ Y is 1-random, then X |_K Y, and X and Y have no upper bound in the K-degrees. Therefore, there is no join in the K-degrees.

(iii) If n ≠ m then the K-degrees of Ω^{∅^{(n)}} and Ω^{∅^{(m)}} have no upper bound.

The following is another attractive corollary, which follows from Theorem 10.3.9 because X is 2-random iff X ⊕ Ω is 1-random.

Corollary 10.3.12 (Miller and Yu [280]). X is 2-random iff X is 1-random and K(X↾(Ω↾n)) ≥ (Ω↾n) + n − O(1).

Another important consequence of Corollary 10.3.10 is the following fundamental result, which shows that, while the cardinality of a K-lower cone might be big, its measure is always small. It was originally proved by direct and more complex methods.

Theorem 10.3.13 (Yu, Ding, and Downey [415]). For any Y, we have μ({X : X ≤_K Y}) = 0.

Proof. If X is 1-random relative to Y, then X ≰_vL Y, so by Corollary 10.3.10, X ≰_K Y. Since there are measure 1 many X that are 1-random relative to Y, the result follows.

Similar methods also show that the cardinality of the K-degrees of 1-random sets is as large as it can be. (Actually, since ≤_K is Borel, the following result follows from Theorem 10.3.13 and Silver's Theorem⁷ [357] that if a Borel (or even just coanalytic) equivalence relation has uncountably many classes, then it has continuum many classes.)

Theorem 10.3.14 (Yu and Ding [413]). There are 2^{ℵ₀} many K-degrees of 1-random sets.

Theorem 10.3.14 has a direct proof, but it follows from another theorem from classical measure theory, as observed by Miller and Yu.

Theorem 10.3.15 (Mycielski [290]). For any relation E ⊆ 2^ω × 2^ω such that E includes the diagonal and μ(E) = 0, there is a perfect set P ⊆ 2^ω such that ∀X, Y ∈ P (X ≠ Y ⇒ (X, Y) ∉ E).

We can get Theorem 10.3.14 from Theorem 10.3.15 by considering the relation E = {(X, Y) : X ≡_K Y ∨ X not 1-random ∨ Y not 1-random}.
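The interleaving ⊕̂ of Definition 10.3.8 is a concrete combinatorial operation and can be sketched on finite strings. In the block pattern of the definition, the oracle bit z_k is inserted after the first 2^{k+1} − 2 bits of X; the function name hat_join and the finite truncation are ours:

```python
def hat_join(sigma, tau):
    """Finite hat-join of Definition 10.3.8: bit z_k of tau is placed after the
    first 2^(k+1) - 2 bits of sigma (or at the end, if sigma is shorter)."""
    # the proviso on string joins from the definition
    assert 2 ** (len(tau) - 1) - 2 <= len(sigma) <= 2 ** len(tau) - 2
    out, i = [], 0
    for k, z in enumerate(tau):
        cut = min(2 ** (k + 1) - 2, len(sigma))  # x-bits preceding z_k
        out.extend(sigma[i:cut])
        i = cut
        out.append(z)
    out.extend(sigma[i:])
    return "".join(out)
```

For example, hat_join("110100", "010") interleaves as z_0 x_0 x_1 z_1 x_2 x_3 x_4 x_5 z_2, and the output always has length |σ| + |τ|, matching the length bookkeeping in the proof of Theorem 10.3.9.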
10.3.4 vL-reducibility and C-reducibility

In this section we prove analogs of the results of the previous section for initial segment plain complexity. Our main result is the following.

⁷The original proof of Silver's Theorem was very complex. Silver's Theorem is sometimes called the Harrington-Silver Theorem because Harrington gave a much simpler proof using Gandy forcing, as reported in Harrington, Marker, and Shelah [175].
Theorem 10.3.16 (Miller and Yu [280]). Let Z be 1-random. The following are equivalent.

(i) X is 1-random relative to Z.
(ii) C(X↾n) ≥ n − K^Z(n) − O(1).
(iii) C(X↾n) + K(Z↾n) ≥ 2n − O(1).

Proof. Suppose that Z is 1-random and X is 1-random relative to Z. Then, relativizing Theorem 6.7.2, C(X↾n) ≥ C^Z(X↾n) − O(1) ≥ n − K^Z(n) − O(1), so (i) implies (ii).

Now assume (ii). Since Z is 1-random, by the Ample Excess Lemma, Theorem 6.6.1, K^Z(n) ≤ K(Z↾n) − n + O(1), and hence C(X↾n) ≥ n − K^Z(n) − O(1) ≥ 2n − K(Z↾n) − O(1), so (ii) implies (iii).

Finally, assume (iii). It is straightforward to check that (for any X) we have C(X↾(X↾n)) ≤ (X↾n) − n + O(1). Thus

K(Z↾(X↾n)) ≥ 2(X↾n) − C(X↾(X↾n)) − O(1) ≥ 2(X↾n) − (X↾n) + n − O(1) = (X↾n) + n − O(1).

Hence, Z ⊕ X is 1-random by Theorem 10.3.9. Thus, by van Lambalgen's Theorem, X is 1-random relative to Z, so (iii) implies (i).

The following corollary is more or less immediate.

Corollary 10.3.17 (Miller and Yu [280]).

(i) X ⊕ Z is 1-random iff C(X↾n) + K(Z↾n) ≥ 2n − O(1).
(ii) X ≤_C Y implies X ≤_vL Y.
(iii) Let X ≤_C Y. If X is n-random then Y is n-random.
(iv) If X ⊕ Y is 1-random then the C-degrees of Y and X have no upper bound, and hence there is no join in the C-degrees.
(v) If n ≠ m then the C-degrees of Ω^{∅^{(n)}} and Ω^{∅^{(m)}} have no upper bound.

It follows, by the same arguments as in the previous section, that the measure of the class of sets C-reducible to a given set is always 0, and there are continuum many C-degrees of 1-random sets.
10.3.5 Contrasting vL-reducibility and K-reducibility

One might hope that K-reducibility and vL-reducibility coincide on the 1-random sets. However, the following result implies that they differ even for Δ⁰₂ 1-random sets. A slight modification of its proof shows that C-reducibility and vL-reducibility also do not coincide on the 1-random sets.

Theorem 10.3.18 (Miller and Yu [280]). For any 1-random set X, there is a 1-random set Y ≤_T X ⊕ ∅′ such that X |_K Y.

As noted above, Ω is vL-below every Δ⁰₂ 1-random set. On the other hand, Theorem 10.3.18 implies that there is a Δ⁰₂ 1-random Y |_K Ω. Therefore, ≤_vL does not, in general, imply ≤_K on the Δ⁰₂ 1-random sets.

The difficulty in proving Theorem 10.3.18 is getting the precise degree bound. Easier methods construct a Y ≤_T X′ (in fact, even with Y′ ≤_T X′) such that X |_K Y: take a Π^{0,X}_1 class P of sets that are 1-random relative to X, then use the low basis theorem to get a member of P that is low relative to X. Then, by van Lambalgen's Theorem, X ⊕ Y is 1-random, and hence X |_K Y, by Corollary 10.3.11. This method would suffice to construct an infinite K-antichain in the Δ⁰₂ degrees starting with a low 1-random set.

Proof of Theorem 10.3.18. Let R = {Z : ∀n (K(Z↾n) ≥ n)} and note that μ(R) > 0. We define two predicates:

A(τ, p) ⟺ μ({Z ⊃ τ : Z ∉ R}) > p

and

B(σ, s) ⟺ ∃n < |σ| (K(σ↾n) > K(X↾n) + s) ∧ ∃m < |σ| (K(σ↾m) < K(X↾m) − s),

where σ, τ ∈ 2^{<ω}, p ∈ Q, and s ∈ N. Clearly, we can compute whether B(σ, s) holds given X ⊕ ∅′. To see that we can compute whether A(τ, p) holds given ∅′, note that A(τ, p) is equivalent to ∃s (μ({Z ⊃ τ : (∃n ≤ s) K_s(Z↾n) < n}) > p).

We construct Y = ⋃_s σ_s by finite initial segments σ_s ∈ 2^{<ω} such that B(σ_{s+1}, s) holds, which guarantees that X |_K Y. We also require the inductive assumption that μ({Z : Z ⊃ σ_s} ∩ R) > 0, which ensures that Y ∈ R, because R is closed, and hence that Y is 1-random. Finally, the construction is done relative to the oracle X ⊕ ∅′, so Y ≤_T X ⊕ ∅′.

Stage s = 0: Let σ_0 = λ. Note that μ({Z : Z ⊃ σ_0} ∩ R) = μ(R) > 0, so the inductive assumption holds for the base case.

Stage s + 1: We have defined σ_s so that μ({Z : Z ⊃ σ_s} ∩ R) > 0.
Using the oracle X ⊕ ∅′, search for τ ⊃ σ_s and p < 2^{−|τ|} such that B(τ, s) and ¬A(τ, p). If these are found, then let σ_{s+1} = τ and note that it satisfies our requirements. In particular, μ({Z : Z ⊃ σ_{s+1}} ∩ R) ≥ 2^{−|σ_{s+1}|} − p > 0.

All that remains is to verify that the search succeeds. We know by Corollary 10.3.11 (ii) that μ({Z : Z |_K X}) = 1. Therefore, μ({Z ⊃ σ_s : Z |_K X} ∩ R) > 0. So there is a τ ⊃ σ_s such that B(τ, s) and μ({Z : Z ⊃ τ} ∩ R) > 0, which implies that there is a p ∈ Q such that p < 2^{−|τ|} and ¬A(τ, p).

A similar proof gives the following result.
Theorem 10.3.19 (Miller and Yu [280]). For any finite collection X_0, ..., X_n of 1-random sets, there is a 1-random Y ≤_T X_0 ⊕ · · · ⊕ X_n ⊕ ∅′ such that, for every i ≤ n, the sets Y and X_i have no upper bound in the K-degrees.
10.4 Upward oscillations and K-comparable 1-random sets

The main result of this section is a characterization by Miller and Yu [279] of the functions g for which K(X↾n) infinitely often exceeds n + g(n) for every 1-random X. As we will see, these are exactly the functions such that Σ_n 2^{−g(n)} diverges. We will then use this result to show that there are comparable K-degrees of 1-random sets. One direction of the Miller-Yu characterization follows from the Ample Excess Lemma (Theorem 6.6.1). For the harder direction, we prove the following.

Theorem 10.4.1 (Miller and Yu [279]). If Σ_n 2^{−f(n)} < ∞, then there is a 1-random X such that K(X↾n) ≤ n + f(n) + O(1). Furthermore, we can ensure that X ≤_T f ⊕ ∅′.

The proof of this result is broken up into two parts. The first part is essentially technical. We would like to be able to code f into a 1-random set in a compact way, but this may not be possible. Instead, we construct a function g such that g(n) ≤ f(n) for all n and g can be coded compactly, meaning that we can use Gács coding to produce a 1-random X such that g(n) is computable from the first n bits of X, for all n. Furthermore, we ensure that Σ_n 2^{−g(n)} < ∞. The second part of the proof is verifying that X is the desired 1-random set, which is the content of the following result.

Lemma 10.4.2 (Bounding Lemma, Miller and Yu [279]). If Σ_n 2^{−g(n)} < ∞ and g ≤_T X with use the identity function, then K(X↾n) ≤ n + g(n) + O(1).

Proof. Because Σ_n 2^{−g(n)} < ∞, using the KC Theorem, we can ensure that K^g(σ) ≤ |σ| + g(|σ|) + O(1) for all σ. Furthermore, it is easy to see from the proof of the KC Theorem that we can arrange things so that, for σ ∈ 2^n, the only part of the oracle that is needed is g↾(n+1). Since g↾(n+1) is computable from X↾n, we have a hope of removing the dependence on the oracle of the short description τ of X↾n given by the KC Theorem. However, it would appear that we need X↾n to tell us how to decode τ.
As Miller and Yu put it, “it is as if we have encrypted the decryption key along with our message. We would know how to read the message if only we knew what the message said. The heart of the proof is resolving this circularity.”
By the KC Theorem, there is a prefix-free oracle machine M and τ_0, τ_1, ... such that |τ_n| ≤ g(n) + O(1) and M^X(τ_n) = X↾n for all n. We may assume that M^X(τ_n) has use exactly n and that M^X(τ_n) reads exactly τ_n before halting. We think of M as having one-way, read-only input and oracle tapes, such that reading a bit moves the corresponding head one position (so that we cannot look at the same position of either tape twice). In addition, we think of M as having work tapes, which it can use to store the bits of the input and oracle that it reads, which is why our assumptions do not limit the power of M.

The key step of the proof is to transform M into a machine M° with no oracle. We do this by routing any reading requests that M makes to either its input or oracle tape to the input tape of M°. We identify M° with the partial computable function that converges on τ with output ρ iff M° halts after reading exactly τ on its input tape and writing ρ on its output tape. In other words, we treat M° as a self-delimiting machine, which ensures that M° has prefix-free domain.

Now suppose that M^X(τ) ↓= ρ with use exactly n, and that M reads exactly all of the bits of τ from the input tape. At certain stages of the computation, M asks for the next bit of the input or the next bit of the oracle, and by our assumptions, it cannot ask for the same bit of either tape more than once. Now merge the bits of τ and X↾n together in exactly the order that they are requested by M, and call the resulting string σ. Then M°(σ) ↓= ρ, because the computation of M° on σ is indistinguishable from the computation of M^X on τ. Therefore, K_{M°}(ρ) ≤ |σ| = n + |τ|.

Applying this observation to the τ_n shows that K(X↾n) ≤ K_{M°}(X↾n) + O(1) ≤ n + |τ_n| + O(1) ≤ n + g(n) + O(1).
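The merging step above can be sketched abstractly. Here a "machine" is reduced to just its sequence of read requests ('i' for input, 'o' for oracle) — a toy model of our own devising, not the actual machine M — which is enough to see why the merged string σ lets an oracle-free simulation replay the computation exactly: the requests determine how to sort the merged bits back out.

```python
def merge(requests, tau, oracle_bits):
    """Interleave input bits (tau) and oracle bits in request order,
    producing the string sigma fed to the oracle-free machine."""
    it, ot, out = iter(tau), iter(oracle_bits), []
    for r in requests:
        out.append(next(it) if r == "i" else next(ot))
    return "".join(out)

def replay(requests, sigma):
    """Oracle-free simulation: read merged bits in order and sort them
    back into the input and oracle streams."""
    inp, orc = [], []
    for r, b in zip(requests, sigma):
        (inp if r == "i" else orc).append(b)
    return "".join(inp), "".join(orc)
```

Note that |σ| is exactly the number of requests, i.e., n + |τ| when all n oracle bits and all of τ are read — mirroring the bound K_{M°}(ρ) ≤ n + |τ| in the proof.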
In particular, we require that 1. g(n) f (n) for all n, −g(n) 2. < ∞, n2 3. g(0) = 0, 4. if n ≡ 3 mod 4, then g(n) = g(n + 1), and 5. |g(n + 1) − g(n)| 1 for all n. We want to define g to be (point-wise) maximal among the functions satisfying 1–5. That is, g satisfies 1–5 and if g satisfies 1–5 then g(n) g (n) for all n.
[Figure 10.1. Here f(10) = 3 and the function b(10, n) is the upper bound on the values of g(n) imposed by that fact.]
Any function g satisfying 3, 4, and 5 will have g(n) ≤ ⌊n/4⌋. Furthermore, because g is forced to change at a slow rate, the value of f(n) not only bounds the value of g(n), but also places bounds on all values of g. For example, if f(10) = 3, then g is at most 3 on [8, 11], at most 4 on [4, 7] and [12, 15], and so on. Let b(i, n) be the upper bound placed on g(n) by the value of f(i) (see Figure 10.1). Putting together all of the restrictions on g(n), we define

g(n) = min{⌊n/4⌋, min_i b(i, n)}.   (10.1)
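The text does not give a closed form for b(i, n); the formula below is our reading of Figure 10.1 (the bound f(i) holds on i's 4-block and relaxes by 1 per block of distance), and it assumes f is nonnegative. Under that assumption, the pointwise-maximal g of (10.1) and the two-bit coding of its increments permitted by conditions 3–5 can be sketched:

```python
def b(f, i, n):
    # our reading of Figure 10.1: the bound imposed by f(i) equals f(i) on
    # i's 4-block and grows by 1 per 4-block of distance
    return f(i) + abs(n // 4 - i // 4)

def g(f, n):
    # equation (10.1), truncated to i < 2n (b(i, n) >= n // 4 for i >= 2n)
    return min([n // 4] + [b(f, i, n) for i in range(2 * n)])

def encode(f, N):
    # conditions 3-5: g changes only at n = 3 (mod 4), by -1, 0, or +1,
    # so two bits per step record g(n + 1) - g(n)
    code = {-1: "00", 0: "01", 1: "10"}
    return "".join(code[g(f, n + 1) - g(f, n)] for n in range(3, N, 4))

def decode(C, n):
    # recover g(n) from the first 2 * (n // 4) bits of the code, since g(0) = 0
    # and g is constant on each block [4k, 4k + 3]
    val, steps = 0, {"00": -1, "01": 0, "10": 1}
    for j in range(n // 4):
        val += steps[C[2 * j:2 * j + 2]]
    return val
```

This also illustrates the claim in the proof that g(n) is computable from C↾(2⌊n/4⌋): decoding g(n) reads exactly ⌊n/4⌋ two-bit blocks.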
Clearly g satisfies 1, 3, 4, and 5, and is pointwise maximal among functions that do so. To verify that Σ_n 2^{−g(n)} < ∞, note that for any i,

Σ_n 2^{−b(i,n)} ≤ 4 · 2^{−f(i)} + 8 Σ_{n>0} 2^{−f(i)−n} = 12 · 2^{−f(i)}.

Therefore,

Σ_n 2^{−g(n)} ≤ Σ_n (2^{−⌊n/4⌋} + Σ_i 2^{−b(i,n)}) = Σ_n 2^{−⌊n/4⌋} + Σ_i Σ_n 2^{−b(i,n)} ≤ 8 + Σ_i 12 · 2^{−f(i)} < ∞.
Next we prove that g ≤_T f. Clearly, b ≤_T f. Although (10.1) expresses g(n) as the minimum of an infinite f-computable sequence, it is not hard to see that we can ignore all but finitely many terms. In particular, if i ≥ 2n, then b(i, n) ≥ ⌊n/4⌋, so

g(n) = min{⌊n/4⌋, min_{i<2n} b(i, n)}.
Thus g ≤_T f. The restrictions placed on g allow us to code it compactly into a set C. It is of course necessary only to record the value of g(n + 1) − g(n) for all n ≡ 3 (mod 4). Two bits are sufficient to code g(n + 1) − g(n) because there are only three possible values. Thus we use the first two bits of C to record g(4) − g(3), the next two for g(8) − g(7), and so on. Note that g(n) can be computed from C↾⌊n/2⌋ (or more precisely, C↾(2⌊n/4⌋)). Of course, C ≤_T g ≤_T f. By the Kučera–Gács Theorem 8.3.2, there is a 1-random X such that C ≤_T X and X ≤_T C ⊕ ∅′ ≤_T f ⊕ ∅′. (The upper bound on the complexity of X is clear from the construction in the proof of Theorem 8.3.2, and made more explicit in Theorem 8.5.1.) As was mentioned following the proof of Theorem 8.3.2, the Gács version of that result produces an X that computes C with use 2n,⁸ meaning that only the first 2n bits of X are used to compute C↾n. Therefore, g(n) is computable from X↾n, for all n. Hence the Bounding Lemma 10.4.2 implies that K(X↾n) ≤ n + g(n) + O(1) ≤ n + f(n) + O(1).

The following corollary improves an earlier result by Bienvenu and Downey [39] (Corollary 8.4.3). Their result was the same, but assuming that the function h tends to infinity rather than just being unbounded.

Corollary 10.4.3 (Miller and Yu [279]). If h is unbounded, then there is a 1-random X such that K(X↾n) < n + h(n) for infinitely many n.

Proof. Choose a sequence of distinct natural numbers {n_i}_{i∈ω} such that h(n_i) ≥ 2i. Let f(n) = i if n = n_i, and f(n) = n otherwise. We can apply Theorem 10.4.1 because Σ_n 2^{−f(n)} ≤ Σ_i 2^{−i} + Σ_n 2^{−n} = 4, so there is a 1-random X such that K(X↾n) ≤ n + f(n) + O(1). Therefore, K(X↾n_i) ≤ n_i + f(n_i) + O(1) = n_i + i + O(1) ≤ n_i + h(n_i) − i + O(1). When i is sufficiently large, K(X↾n_i) < n_i + h(n_i).

To construct comparable 1-random K-degrees, we need a stronger form of Theorem 10.4.1. It is clear that we can modify the proof of that result to require that g(n) = g(n + 1) whenever n ≢ 7 (mod 8). Then g can be coded into C so that g ≤_T C with use ⌊n/4⌋. By Gács coding, there is a 1-random X ≤_T f ⊕ ∅′ such that g is computable from X with use ⌊n/2⌋. So for any Z, we have g ≤_T X ⊕ Z with use n. Applying the Bounding Lemma 10.4.2 gives the following result.

Theorem 10.4.4 (Miller and Yu [279]). If Σ_n 2^{−f(n)} < ∞ then there is a 1-random X ≤_T f ⊕ ∅′ such that K((X ⊕ Z)↾n) ≤ n + f(n) + O(1) for every Z.

8. Indeed, as mentioned there, Merkle and Mihailović [267] adapted the construction in Gács [167] to show that the use of the wtt-reduction in the Kučera–Gács Theorem can be on the order of n + o(n).
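To make the definitions above concrete, here is a small Python sketch of the function g of (10.1) and of its two-bit coding into C. The explicit formula b(i, n) = f(i) + |⌊i/4⌋ − ⌊n/4⌋| is our own assumption (a natural reading of the block bound pictured in Figure 10.1), not taken from the text.

```python
def make_g(f, N):
    """Pointwise-maximal g of (10.1), truncated to inputs 0..N-1.
    Assumes b(i, n) = f(i) + |i//4 - n//4| (a hypothetical reading
    of the block bound illustrated in Figure 10.1)."""
    def b(i, n):
        return f(i) + abs(i // 4 - n // 4)
    # By the finite-minimum observation above, the terms with i < 2n suffice.
    return [min([n // 4] + [b(i, n) for i in range(2 * n)]) for n in range(N)]

DELTA_TO_BITS = {-1: (0, 1), 0: (0, 0), 1: (1, 0)}  # two bits per change point
BITS_TO_DELTA = {v: k for k, v in DELTA_TO_BITS.items()}

def encode(g):
    """Record g(n+1) - g(n), for each n = 3 (mod 4), as two bits of C."""
    C = []
    for n in range(3, len(g) - 1, 4):
        C.extend(DELTA_TO_BITS[g[n + 1] - g[n]])
    return C

def g_from_C(C, m):
    """Recover g(m) from the first 2*(m//4) bits of C only: g is constant
    on each block [4k, 4k+3], so g(m) is the sum of the recorded changes."""
    return sum(BITS_TO_DELTA[(C[2 * k], C[2 * k + 1])] for k in range(m // 4))
```

With, say, f(10) = 1 and f(n) = n + 5 elsewhere, decoding reproduces g exactly, and g(m) indeed uses only the first 2⌊m/4⌋ bits of C.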
We can now give the desired classification of functions coding the upward oscillations of initial segment complexities of 1-random sets. First we need a relatively straightforward lemma.

Lemma 10.4.5 (Miller and Yu [279]). Assume that Σ_n 2^{−g(n)} < ∞.
(i) There is a function f ≤_T g with f(n) ≤ g(n) for all n and lim sup_n g(n) − f(n) = ∞ such that Σ_n 2^{−f(n)} < ∞.
(ii) There is a function f ≤_T g such that lim_n g(n) − f(n) = ∞ and Σ_n 2^{−f(n)} < ∞.

Proof. (i) For each m, let n_m = min{n : g(n) ≥ 2m} and let f(n_m) = ⌊g(n_m)/2⌋. Let f(n) = g(n) for all other values of n. Then

   Σ_n 2^{−f(n)} ≤ Σ_n 2^{−g(n)} + Σ_m 2^{−m} < ∞.
Clearly, f(n) ≤ g(n) for all n. Furthermore, f ≤_T g and lim sup_n g(n) − f(n) = ∞.

(ii) Let c ≥ Σ_n 2^{−g(n)}. Then g computes an increasing sequence {n_i}_{i∈ω} such that Σ_{n≥n_i} 2^{−g(n)} ≤ c/2^i for all i. We may assume that n_0 = 0. Let f(n) = g(n) − ⌊log |{i : n_i ≤ n}|⌋ (or f(n) = 0 if this value is negative). Then

   Σ_n 2^{−f(n)} ≤ Σ_n |{i : n_i ≤ n}| · 2^{−g(n)} = Σ_i Σ_{n≥n_i} 2^{−g(n)} ≤ Σ_i c/2^i = 2c < ∞.

Furthermore, f ≤_T g and lim_n g(n) − f(n) = ∞.

Theorem 10.4.6 (Miller and Yu [279]). The following are equivalent.
(i) Σ_n 2^{−g(n)} = ∞.
(ii) For every 1-random X, there are infinitely many n such that K(X↾n) > n + g(n).

Proof. (i) ⇒ (ii) We prove the contrapositive. Assume that X is 1-random and there is an m such that ∀n ≥ m (K(X↾n) ≤ n + g(n)). Then Σ_{n≥m} 2^{n−K(X↾n)} ≥ Σ_{n≥m} 2^{−g(n)}. The first sum is finite by the Ample Excess Lemma, Theorem 6.6.1, so Σ_n 2^{−g(n)} converges.

(ii) ⇒ (i) Again we prove the contrapositive. Assume that Σ_n 2^{−g(n)} converges. By Lemma 10.4.5 (ii), there is a function f such that lim_n g(n) − f(n) = ∞ and Σ_n 2^{−f(n)} < ∞. Hence, by Theorem 10.4.1, there is a 1-random X such that K(X↾n) ≤ n + f(n) + O(1). This inequality, together with the fact that lim_n g(n) − f(n) = ∞, implies that K(X↾n) ≤ n + g(n) for almost all n.
We are now ready to prove that there are comparable 1-random K-degrees. It seems surprising that this result relies on such difficult methods, whereas obtaining incomparable 1-random K-degrees (as in Corollary 10.3.11) is relatively straightforward.

Theorem 10.4.7 (Miller and Yu [279]). For every 1-random A, there is a 1-random B ≤_T A ⊕ ∅′ such that B <_K A.

Proof. Let g(n) = K(A↾n) − n. By the Ample Excess Lemma, Theorem 6.6.1, Σ_n 2^{−g(n)} < ∞. By Lemma 10.4.5 (i), there is a function f ≤_T g with f(n) ≤ g(n) for all n, lim sup_n g(n) − f(n) = ∞, and Σ_n 2^{−f(n)} < ∞. By Theorem 10.4.1, there is a 1-random B ≤_T f ⊕ ∅′ ≤_T A ⊕ ∅′ such that

   K(B↾n) ≤ n + f(n) + O(1) ≤ n + g(n) + O(1) = K(A↾n) + O(1),

so B ≤_K A. Furthermore,

   lim sup_n (K(A↾n) − K(B↾n)) ≥ lim sup_n (K(A↾n) − n − f(n)) − O(1) ≥ lim sup_n (g(n) − f(n)) − O(1) = ∞,
so A ≰_K B.

Corollary 10.4.8 (Miller and Yu [279]). For every Δ^0_2 1-random A, there is a Δ^0_2 1-random B such that B <_K A.

Note that S is computably bounded (meaning that there is a computable function that majorizes every element of S), because K(A↾n) − n ≤ K(n) + O(1) ≤ 2 log n + O(1).⁹ Therefore, by the low basis theorem relativized to A, there is a function g ∈ S such that g ≤_T A′.

By Lemma 10.4.5 (ii), there is an f such that Σ_n 2^{−f(n)} < ∞ and lim_n g(n) − f(n) = ∞, and furthermore, f ≤_T g ≤_T A′. Apply Theorem 10.4.1 to f to produce a 1-random B ≤_T f ⊕ ∅′ ≤_T A′ ⊕ ∅′ ≡_T A′ such that K(B↾n) ≤ n + f(n) + O(1). Then

   lim_n (K(A↾n) − K(B↾n)) ≥ lim_n (K(A↾n) − n − f(n)) − O(1) ≥ lim_n (g(n) − f(n)) − O(1) = ∞.
Taking A in the previous corollary to be low, we have the following.

Corollary 10.4.11 (Miller and Yu [279]). There are Δ^0_2 1-random sets A and B such that B <_K A.

We finish this section with an application of the Bounding Lemma (and other results of Miller and Yu) to extend work of Solovay described earlier in this chapter. Recall Theorem 8.4.3, which states that there is no function g that tends to infinity such that K(A↾n) ≥ n + g(n) − O(1) for every 1-random A. As we have seen in Corollary 10.3.17, Miller and Yu [280] showed that if Z is 1-random, then X and Z are relatively 1-random iff K(X↾n) + C(Z↾n) ≥ 2n − O(1). Taking Z = Ω, we see that a 1-random set X is 2-random iff K(X↾n) ≥ n + f(n) − O(1), where f(n) = n − C(Ω↾n), which tends to infinity because Ω is not infinitely often C-random (see Theorem 6.11.3). Thus we have a "single gap characterization" of 2-randomness (which can be extended to n-randomness for n > 2 by replacing Ω by a 1-random set Turing equivalent to ∅^{(n−1)}), and hence Theorem 8.4.3 does not hold for 2-randomness. Miller [unpublished] has shown that there is a characterization of 2-randomness via a single gap given by a function f that is not just unbounded, but also nondecreasing (i.e., an order). In fact, f can be taken to be Solovay's function s(n) = −log Σ_{j≥n} 2^{−K(j)} defined in Section 10.2.2.

It will be convenient to use a different characterization of s. We take Ω = Σ_m 2^{−K(m)} (which is a 1-random left-c.e. real and hence a version of Ω) and Ω_n = Σ_m 2^{−K_n(m)}, where K_n is the stage-n approximation to K, so that {Ω_n}_{n∈ω} is a nondecreasing computable sequence of rationals converging to Ω. Let γ(n) = −log(Ω − Ω_n), and let f(m) = ⌈−log(Ω_{m+1} − Ω_m)⌉ (with f(m) = m, say, when Ω_{m+1} = Ω_m), so that (Ω_{m+1} − Ω_m)/2 < 2^{−f(m)} ≤ Ω_{m+1} − Ω_m whenever Ω_{m+1} > Ω_m.

9. See footnote 11 on page 73 for a comment on computably bounded Π^0_1 classes in ω^ω.
Since f is computable, we have K(m) ≤ f(m) + O(1), so

   s(n) = −log Σ_{m≥n} 2^{−K(m)} ≤ −log Σ_{m≥n} 2^{−f(m)} + O(1) ≤ −log Σ_{m≥n} (Ω_{m+1} − Ω_m) + O(1) = −log(Ω − Ω_n) + O(1) = γ(n) + O(1).
Let γ′(n) = max{k : Ω_n↾k = Ω↾k}. Since Ω − Ω_n < 2^{−γ′(n)}, we have γ′(n) ≤ γ(n) + O(1). (Miller asked whether γ is independent up to a constant of the choice of Ω and {Ω_n}_{n∈ω}, and whether in fact γ′(n) = γ(n) ± O(1).)

Theorem 10.4.13 (Miller [unpublished]). The following are equivalent.
(i) A is 2-random.
(ii) K(A↾n) ≥ n + γ(n) − O(1).
(iii) lim_n K(A↾n) − n − γ(n) = ∞.
The same equivalences hold with γ replaced by γ′.

Proof. It is clear that (iii) implies (ii). Since γ′(n) ≤ γ(n) + O(1), each of (ii) and (iii) for γ implies the respective condition for γ′. So it is enough to show that (i) implies (iii) for γ and that (ii) for γ′ implies (i). We assume that A is 1-random, since otherwise none of the conditions hold.

We first assume that (iii) does not hold and conclude that A is not 2-random. By Corollary 6.6.4, K^A(n) ≤ K(A↾n) − n + O(1), so there is a c such that S = {n : K^A(n) ≤ γ(n) + c} is infinite. Define a prefix-free machine M^A as follows. If U^A(σ)↓ = n, then let M^A(σ0) = Ω_n↾(|σ| − c) and M^A(σ1) = (Ω_n + 2^{−|σ|+c})↾(|σ| − c). Fix n ∈ S and let σ be a minimal length U^A-program for n. Then |σ| − c = K^A(n) − c ≤ γ(n), so Ω − Ω_n ≤ 2^{−|σ|+c}. Thus one of M^A(σ0) and M^A(σ1) is equal to Ω↾(|σ| − c). Therefore, K^A(Ω↾(|σ| − c)) ≤ |σ| + O(1) for infinitely many σ. But then Ω is not 1-random relative to A, by the relativized version of Corollary 6.6.2, and hence, by van Lambalgen's Theorem, A is not 1-random relative to Ω. In other words, A is not 2-random.

We now assume that A is not 2-random and conclude that (ii) does not hold for γ′. Define a function g by recursion on n as follows. To define g(n), look for the length-lexicographically least σ not yet marked as used such that U_n^{A↾n}(σ)↓ ≺ Ω_n. If there is no such σ then let g(n) = n. Otherwise, let g(n) = |σ| and mark σ as used. Note that g(n) can be computed from A↾n, and that

   Σ_n 2^{−g(n)} ≤ Σ_{σ∈dom(U^A)} 2^{−|σ|} + Σ_n 2^{−n} < ∞.

Thus, by the Bounding Lemma 10.4.2, K(A↾n) ≤ n + g(n) + O(1). By van Lambalgen's Theorem, Ω is not 1-random relative to A.
Fix c and let m be such that K^A(Ω↾m) ≤ m − c. Let σ be the length-lexicographically least U^A-program for Ω↾m. For all sufficiently large n, we have U_n^{A↾n}(σ)↓ and Ω_n↾m = Ω↾m. Therefore, σ is eventually used
in the definition of g for some n, and we have g(n) = |σ| ≤ m − c. But Ω_n↾m = Ω↾m, so γ′(n) ≥ m. Thus K(A↾n) ≤ n + g(n) + O(1) ≤ n + m − c + O(1) ≤ n + γ′(n) − c + O(1). Since c is arbitrary, (ii) does not hold for γ′.

Miller [unpublished] also showed that there is no A such that K(A↾n) ≥ n + α(n) − O(1).
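The quantity γ′ is easy to compute once Ω and an approximation to it are fixed. The following sketch uses rational stand-ins (Ω itself is of course not computable; the particular values are ours, chosen for illustration) and checks the inequality Ω − Ω_n < 2^{−γ′(n)} used above:

```python
from fractions import Fraction

def bit(x, k):
    """k-th binary digit (1-indexed) of x in [0, 1), terminating expansion."""
    return int(x * 2 ** k) % 2

def gamma_prime(omega, omega_n, precision):
    """Length of agreement of the binary expansions of omega and omega_n,
    computed up to the given precision."""
    k = 0
    while k < precision and bit(omega, k + 1) == bit(omega_n, k + 1):
        k += 1
    return k

# Toy stand-ins: 1/3 = 0.010101... and 5/16 = 0.0101000... in binary.
omega, omega_n = Fraction(1, 3), Fraction(5, 16)
agreement = gamma_prime(omega, omega_n, 20)
print(agreement)                                   # number of agreeing leading bits
print(omega - omega_n < Fraction(1, 2 ** agreement))
```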
10.5 LR-reducibility

On the 1-random sets, vL- and LR-reducibility coincide, but LR-reducibility is also quite interesting and important on non-1-random sets. Many questions in randomness can be construed as ones about the structure of the LR-degrees. The following basic result provides a tool for investigating LR-reducibility.

Theorem 10.5.1 (Kjos-Hanssen [203]). The following are equivalent.
(i) A ≤_LR B.
(ii) For every Σ^{0,A}_1 class T^A of measure less than 1, there is a Σ^{0,B}_1 class V^B such that μ(V^B) < 1 and T^A ⊆ V^B.
(iii) For some member U^A of a universal Martin-Löf test relative to A, there is a Σ^{0,B}_1 class V^B such that μ(V^B) < 1 and U^A ⊆ V^B.
Consequently, A ≤_LR B iff every Π^{0,A}_1 class of positive measure has a Π^{0,B}_1 subclass of positive measure.

We will give a proof due to Barmpalias, Lewis, and Soskova [25], which is conceptually simpler than Kjos-Hanssen's original proof, but along similar lines.¹⁰ We begin with three lemmas. The first concerns oracle Martin-Löf tests. We can think of such a test as a uniformly c.e. sequence {U_n}_{n∈ω} of sets of axioms of the form (τ, σ), letting U_n^β = {σ : ∃τ ≺ β [(τ, σ) ∈ U_n]}, where β ∈ 2^ω or β ∈ 2^{<ω}. It is easy to see that any oracle Martin-Löf test can be represented in this way, and that the following holds.

Lemma 10.5.2 (Barmpalias, Lewis, and Soskova [25]). There is a universal oracle Martin-Löf test {U_n}_{n∈ω} with the following properties.
(i) From a c.e. index for an oracle Martin-Löf test {V_n}_{n∈ω}, we can compute a k such that for all A ∈ 2^ω and all n, we have V^A_{n+k} ⊆ U^A_n.

10. That (iii) implies (i) in the B = ∅ case was noted by Kučera and Terwijn [223] and formed an important tool in their proof that there is a noncomputable c.e. set A ≤_LR ∅, Corollary 11.2.6 below.
(ii) If (τ₁, σ₁), (τ₂, σ₂) ∈ U_n and τ₁ ≺ τ₂, then σ₁ | σ₂.
(iii) If (τ, σ) ∈ U_n then |τ| = |σ| and (τ, σ) ∈ U_{n,|τ|} \ U_{n,|τ|−1}.

For sets of strings U and V, let U·V = {στ : σ ∈ U ∧ τ ∈ V}. Let U⁰ = {λ} and U^{n+1} = U^n·U. (So U¹ = U and U² = U·U, for instance.)

Lemma 10.5.3 (Kjos-Hanssen [203]). Let U₀, U₁ be open sets and S a Σ^{0,B}_1 class of measure less than 1 such that U₀·U₁ ⊆ S. Then there is a Σ^{0,B}_1 class R of measure less than 1 such that U_i ⊆ R for at least one i = 0, 1. Hence, by iteration, if U is an open set and S is a Σ^{0,B}_1 class of measure less than 1 such that U^n ⊆ S for some n > 0, then there is a Σ^{0,B}_1 class R of measure less than 1 such that U ⊆ R.

Proof. It suffices to prove the first part of the lemma, since the second part then follows by induction. First suppose that there is a σ ∈ U₀ such that μ(σ \ S) > 0. Then we can let R = {τ : στ ⊆ S} and note that μ(R) < 1 and U₁ ⊆ R. Otherwise, let q be a rational such that μ(S) < 1 − q and let R = {σ : μ(σ \ S) < q2^{−|σ|}}. Then U₀ ⊆ R and (1 − q)μ(R) ≤ μ(S), whence μ(R) < 1.

Lemma 10.5.4 (Barmpalias, Lewis, and Soskova [25]). Let {U_n}_{n∈ω} be a universal oracle Martin-Löf test. If A ≤_LR B then there are σ and e > 0 such that σ ⊈ U^B_1 and U^A_e ∩ σ ⊆ U^B_1 ∩ σ.

Proof. Suppose otherwise. Let σ₀ = λ. Given σ_n such that σ_n ⊈ U^B_1, let σ_{n+1} extending σ_n be such that σ_{n+1} ⊆ U^A_n and σ_{n+1} ⊈ U^B_1, which must exist because U^A_n ∩ σ_n ⊈ U^B_1 ∩ σ_n. Let X = lim_n σ_n. Then X ∈ U^A_n for all n, so X is not 1-random relative to A, but X ∉ U^B_1, so X is 1-random relative to B, which is a contradiction.
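The concatenation operation U·V multiplies measures when U is prefix-free, which is what makes the powers U^n of an open set of measure less than 1 shrink, and hence makes {(T^A)^k}_{k∈ω} usable as a test below. A brute-force check on a small example (the sets U and V are our own choices, for illustration only):

```python
def open_measure(gens, depth):
    """Measure of the open class generated by gens, computed by brute
    force over all binary strings of the given length."""
    count = 0
    for x in range(2 ** depth):
        s = format(x, "0{}b".format(depth))
        if any(s.startswith(g) for g in gens):
            count += 1
    return count / 2 ** depth

U = ["00", "01"]                       # prefix-free, measure 1/2
V = ["1"]                              # prefix-free, measure 1/2
UV = [u + v for u in U for v in V]     # U.V, measure 1/2 * 1/2 = 1/4
print(open_measure(U, 5), open_measure(V, 5), open_measure(UV, 5))
```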
class T A of measure less than 1, thought of as a Now consider a Σ0,A 1 prefix-free set of strings. Then some subsequence of {(T A )k }k∈ω is a MartinL¨ of test relative to A. By the properties of {Un }n∈ω , there is a k such that (T A )k ⊆ UeA . Hence (T A )k ⊆ W B and by Lemma 10.5.3 we have (ii). Finally, assume (iii). That is, assume there exist V A ⊆ T B such that A V is a member of a universal Martin-L¨ of test relative to A, the class B T B is Σ0,B , and μ(T ) < 1. Let X be 1-random relative to B. By the 1 relativized form of Theorem 6.10.2, there are Y and σ such that X = σY
and Y ∈ 2^ω \ T^B ⊆ 2^ω \ V^A. Then Y is 1-random relative to A, and hence so is X. Thus A ≤_LR B.

Space considerations preclude us from giving too many details about the structure of the LR-degrees, but here are some known facts. Additional results may be found in, for example, [18, 19, 24, 25, 26]. We will meet LR-reducibility again in Section 11.9, in the context of studying lowness for weak 2-randomness.

Theorem 10.5.5 (Barmpalias, Lewis, and Soskova [25]). Every non-GL₂ set has uncountably many LR-predecessors.

Theorem 10.5.6 (Barmpalias, Lewis, and Stephan [26]). Let A be noncomputable. Then there are B |_T A and C >_T A in the LR-degree of A.

Theorem 10.5.7 (Barmpalias [18]). There is no minimal pair of Δ^0_2 LR-degrees.

Theorem 10.5.8 (Barmpalias [18]). Every Δ^0_2 LR-degree bounds a nonzero c.e. LR-degree.

As a representative example, we sketch the proof of Theorem 10.5.8. (The proof of Theorem 10.5.7 is similar.) We select this theorem because its proof involves a technique that is particular to the LR-degrees and relies heavily on Theorem 10.5.1.

Proof sketch of Theorem 10.5.8. Let X ≰_LR ∅ be Δ^0_2. We build a c.e. set A such that ∅ <_LR A ≤_LR X. By Theorem 10.5.1, to ensure that A ≤_LR X it suffices to cover a member U^A of a universal oracle Martin-Löf test relative to A by a Σ^{0,X}_1 class S^X with μ(S^X) < 1; to ensure that ∅ <_LR A, we build a Σ^{0,A}_1 class Q^A of measure less than 1 and meet, for each Σ^0_1 class V_e with μ(V_e) < 1 − 2^{−e}, the requirement R_e : Q^A ⊈ V_e. The covering strategy copies U^A into S^X with X-correct uses: if σ is put into S^X at a stage s with a use on which X does not change after stage s, then eventually we will have σ ∈ S^X permanently.
This method clearly ensures that U^A ⊆ S^X, but since A will not be computable, A↾a(σ, s) may well change after stage s, at a point at which we have already put σ into S^X with an X-correct use. We have then added an interval to S^X that may not be in U^A, which is a threat to our requirement that μ(S^X) < 1. Thus we estimate the cost¹¹ c(n, s) of enumerating n into A at stage s to be μ({Z : Z ∈ U^A[s] with use greater than n}).

To describe how these costs stack up, and how we can deal with the above problem using the fact that X ≰_LR ∅, we discuss how we meet the R_e, which are the sources of the A-changes. The basic plan is simple. We choose a finite set of strings Δ_e such that Δ_e is currently disjoint from V_e, and put Δ_e into Q^A with some large use n + 1. We then wait for Δ_e to enter V_e, and remove Δ_e from Q^A by enumerating n into A. We then repeat this procedure enough times to ensure that, since μ(V_e) < 1 − 2^{−e}, eventually one of our Δ_e never enters V_e, so that Q^A ⊈ V_e.

Unfortunately, every time we put a number n into A, we may incur a cost, as explained above. When we put Δ_e into Q^A, the opponent can also put Δ_e into U^A with the same use, forcing us to put Δ_e into S^X. The cost incurred when we put n into A and remove Δ_e from Q^A causes a problem unless we can ensure that X↾(n + 1) is different from all the values X[s]↾(n + 1) for which we have declared that Δ_e ⊆ S^{X[s]↾(n+1)}. Thus, we need a method of forcing enough X-changes to avoid these problems. To do so, we will have to be careful in how we select the Δ_e's, and will use our auxiliary class E_e. We will keep μ(E_e) ≤ μ(V_e) < 1 and, as long as V_e seems to be covering Q^A, we will attempt to make E_e cover U^X_{b_e} for some b_e > e, chosen dynamically during the construction. (The larger b_e is, the smaller the measure of U^X_{b_e} is, so being able to change b_e as we go along will allow us to deal with injuries to R_e.) Since X ≰_LR ∅, we cannot succeed in making E_e cover U^X_{b_e} for the final value of b_e. We will be able to arrange things so that the X-changes that prevent us from doing so are exactly the ones needed to ensure that the costs described above do not hurt us too much.

We now describe the basic module for satisfying R_e, using parameters a_e and b_e determined by the construction. Let k_e[s] be the least k such that U^{X[s]↾k}_{b_e} \ E_e[s] ≠ ∅, let C_e[s] = U^{X[s]↾k_e[s]}_{b_e} \ E_e[s], and let m_e[s] = μ(C_e[s]). The module also has a parameter r_e, initially set to 0. At a stage s₀ at which we wish to launch an attack on R_e, the module proceeds as follows.

1. Let p_e[s₀] = 2^{−r_e[s₀]−e−6} m_e[s₀]. Let Δ_e be a finite set of strings such that μ(Δ_e) = p_e[s₀] and Δ_e ∩ V_e[s₀] = ∅. Put Δ_e into Q^A with a fresh large use n_e.
2. Wait for a stage s₁ > s₀ such that Δ_e ⊆ V_e[s₁].

11. The idea of cost functions will play an important role in Chapters 11 and 14.
(a) Suppose that, while we are waiting, X↾k_e[s₀] changes before s₁ is found. Then increase r_e by one, declare that Δ_e is trash, and return to step 1.
(b) Suppose that we find s₁.
   i. If c(n_e, s₀) ≥ a_e p_e[s₀], then restrain A↾s₁ and return to step 1.
   ii. Otherwise, proceed as follows. Put n_e into A, thus removing Δ_e from Q^A. Put a subset of C_e[s₀] of measure p_e[s₀] into E_e and return to step 1.

We first analyze this module assuming that step 2(a) does not happen. Indeed, for simplicity, let us first assume that p_e[s₀] = m_e[s₀]. We clearly have μ(E_e) ≤ μ(V_e), since we add measure to E_e only after a corresponding amount is added to V_e. Thus m_e must be bounded away from 0, as otherwise E_e would cover U^X_{b_e}, which cannot happen since X ≰_LR ∅. Therefore, p_e is bounded away from 0. Since each time an attack on R_e is launched at a stage s₀ and ends, the measure of V_e increases by p_e[s₀], only finitely many such attacks can end, and eventually there is an attack that does not reach step 2, so R_e is satisfied.

If an attack reaches step 2(b)ii, then it removes all the measure it put into Q^A. If it reaches step 2(b)i, then the restraint imposed on A ensures that the a_e p_e[s₀] = a_e μ(Δ_e) much measure that must be in U^A to cause c(n_e, s₀) ≥ a_e p_e[s₀] is permanently in U^A. Since n_e is redefined to be a fresh large number each time an attack is launched, the measures permanently added to U^A each time step 2(b)i is reached come from disjoint subsets of U^A. Thus we see that each time an attack on R_e is launched, if the attack ends, then μ(Q^A) increases by at most 1/a_e as much as μ(U^A). The last attack launched on R_e may never end, so if t is the stage at which this last attack is launched, then μ(Q^A) ≤ μ(U^A)/a_e + p_e[t], which is less than 1 if a_e is at least 2.

Finally, we must argue that μ(S^X) < 1. Our claim is that μ(S^X) ≤ μ(U^A) + a_e μ(U^X_{b_e}), which suffices if b_e is chosen to be large enough.
When some elements enter U^A, the same elements are added to S^X. These elements may later leave U^A, but this event can happen only when we reach step 2(b)ii of the module and have X↾k_e = X[s₀]↾k_e. The measure of the class of elements in question is then at most c(n_e, s₀) < a_e p_e[s₀]. We then put p_e[s₀] much measure of C_e[s₀] into E_e. Since X↾k_e = X[s₀]↾k_e, all of this measure is in U^X_{b_e}. Such enumerations into E_e correspond to disjoint classes of elements of U^X_{b_e}, so μ(S^X \ U^A) is at most a_e μ(U^X_{b_e}), as claimed.

It is straightforward to check that dropping the assumption that p_e[s₀] = m_e[s₀] for the actual value p_e[s₀] = 2^{−r_e[s₀]−e−6} m_e[s₀] (while still assuming that step 2(a) does not occur, and hence r_e cannot change) causes no real problems.
We now briefly discuss removing our assumption that step 2(a) does not occur. Each time this step occurs, the current attack on R_e is canceled, and the set Δ_e that we put into Q^A becomes trash. However, the measure of this set is 2^{−r_e[s₀]−e−6} m_e[s₀], and each time step 2(a) occurs, we increase r_e by one. Thus the total measure of our trash is bounded by 2^{−e−5}. Using the fact that X is Δ^0_2, we can argue that k_e[s] must reach a limit and hence so must r_e, so that eventually step 2(a) stops occurring, and R_e can be satisfied as argued above.

The remaining details of the proof are relatively straightforward, and involve choosing the parameters of the construction appropriately and applying the finite injury priority method. Each time a requirement acts, it initializes all weaker priority ones, forcing them to increase their parameters. The formal details are in [18].

We finish this section by showing that LR-reducibility, which can be seen as a way to compare the relative derandomization power of sets, can also be characterized in terms of relative compression power. We say that A is low for K reducible to B, or simply LK-reducible to B, and write A ≤_LK B, if K^B(σ) ≤ K^A(σ) + O(1).¹² Clearly, if A ≤_LK B then A ≤_LR B. The following result shows that these reducibilities are in fact equivalent.

Theorem 10.5.9 (Kjos-Hanssen, Miller, and Solomon [206]). If A ≤_LR B then A ≤_LK B.

To prove this result, we need the following easy lemma from basic analysis.

Lemma 10.5.10. Let a₀, a₁, … ∈ [0, 1) be real numbers. Then Π_i (1 − a_i) > 0 iff Σ_i a_i < ∞.

Proof of Theorem 10.5.9. Let A ≤_LR B. Define finite sets V₀, V₁, … as follows. Suppose that V_t has been defined for all t < s. Let n and τ be such that s = ⟨n, τ⟩ (where we identify τ with a natural number as usual). Let m be the length of the longest string in ⋃_{t<s} V_t (or m = 0 if s = 0), and let V_s = {σ0^n : σ ∈ 2^m}. Note that for any I ⊆ ℕ, we have μ(⋂_{s∈I} V̄_s) = Π_{⟨n,τ⟩∈I} (1 − 2^{−n}), where V̄_s denotes the complement of the open class generated by V_s. Let

   I = {⟨|σ|, τ⟩ : U^A(σ) = τ}.

Then I is A-c.e., and hence P = ⋂_{s∈I} V̄_s is a Π^{0,A}_1 class. Furthermore,

   Σ_{⟨n,τ⟩∈I} 2^{−n} ≤ Σ_{σ∈dom(U^A)} 2^{−|σ|} < 1.

12. As with LR-reducibility, this reducibility comes from a notion of lowness that will be discussed in Section 11.2.
Thus, ⟨0, τ⟩ ∉ I for every τ, and hence μ(P) = Π_{⟨n,τ⟩∈I} (1 − 2^{−n}) > 0, by Lemma 10.5.10. Therefore, by Theorem 10.5.1, there is a Π^{0,B}_1 class Q ⊆ P with μ(Q) > 0. Let J = {⟨n, τ⟩ : V_{⟨n,τ⟩} ∩ Q = ∅}. Then J is B-c.e., and

   Π_{⟨n,τ⟩∈J} (1 − 2^{−n}) = μ(⋂_{s∈J} V̄_s) ≥ μ(Q) > 0.

Thus, by Lemma 10.5.10, there is a c such that Σ_{⟨n,τ⟩∈J} 2^{−n} < 2^c. Let S = {(n + c, τ) : ⟨n, τ⟩ ∈ J}. Then S is a KC set relative to B, so if ⟨n, τ⟩ ∈ J then K^B(τ) ≤ n + O(1). Since Q ⊆ P, we have I ⊆ J, and hence ⟨K^A(τ), τ⟩ ∈ J for each τ. Thus K^B(τ) ≤ K^A(τ) + O(1), and hence A ≤_LK B.

This result has several important consequences. One of them is that if A is 1-random, then the K-degree of A is countable. (We will see another one in Theorem 11.2.4.) Recall that, by Theorem 9.15.1, if K^X(A↾n) ≤ K^X(n) + O(1) then A ≤_T X′.

Corollary 10.5.11. If A ≡_LR X then A ≤_T X′.

Proof. If A ≡_LR X then A ≡_LK X, so K^X(A↾n) = K^A(A↾n) ± O(1) = K^A(n) ± O(1) = K^X(n) ± O(1), so Theorem 9.15.1 applies.

Corollary 10.5.12 (Miller [unpublished]¹³). If X is 1-random then the K-degree of X is countable.

Proof. Suppose that A ≡_K X. By Corollary 10.3.10, A ≡_vL X. Since X and A are both 1-random, A ≡_LR X. By Corollary 10.5.11, A ≤_T X′. Thus every element of the K-degree of X is computable in X′, and hence the K-degree of X is countable.
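Lemma 10.5.10, which was used twice in the proof above, is easy to check numerically on sample sequences. In this sketch (the particular sequences a_i = 2^{−i−1} and a_i = 1/(i + 2) are our own illustrative choices), a summable sequence leaves the product bounded away from 0, while a non-summable one drives it to 0:

```python
def partial_product(a, N):
    """Partial product of (1 - a(i)) for i = 0, ..., N-1."""
    p = 1.0
    for i in range(N):
        p *= 1.0 - a(i)
    return p

# Summable a_i: the infinite product converges to a positive limit.
print(partial_product(lambda i: 2.0 ** (-i - 1), 200))
# Non-summable a_i (harmonic-like): the product tends to 0.
print(partial_product(lambda i: 1.0 / (i + 2), 200))
```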
10.6 Almost everywhere domination

Motivated by a question in the reverse mathematics of measure theory, Dobrinen and Simpson [98] introduced the following concepts.

Definition 10.6.1 (Dobrinen and Simpson [98]). (i) A set A is almost everywhere dominating if for almost all X (in the sense of measure), for each g ≤_T X there is an f ≤_T A that dominates g.

13. Miller originally proved this result by other methods, before Theorem 10.5.9 was known.
(ii) A set A is uniformly almost everywhere dominating if there is an f ≤_T A such that for almost all X, the function f dominates all functions g ≤_T X.

As we have seen in Lemma 8.20.6, Kurtz [228] showed that ∅′ (and hence any set computing ∅′) is uniformly almost everywhere dominating. Cholak, Greenberg, and Miller [67] showed that there are uniformly almost everywhere dominating sets that do not compute ∅′. On the other hand, it follows from Martin's Theorem 2.23.7 that every uniformly almost everywhere dominating set is high. It was natural to ask for a characterization of the (uniformly) almost everywhere dominating sets. One has now been found in terms of LR-reducibility. We will show through a sequence of results that the following are equivalent:

(i) A is almost everywhere dominating.
(ii) A is uniformly almost everywhere dominating.
(iii) ∅′ ≤_LR A.

In other words, a set A is (uniformly) almost everywhere dominating iff every set that is 1-random relative to A is 2-random. In Section 11.2, we will show that there are incomplete c.e. sets with this property, but not all high c.e. sets have it, so randomness-theoretic notions are essential in answering this question.

We begin with the following characterizations of (uniform) almost everywhere domination.

Theorem 10.6.2 (Dobrinen and Simpson [98]). (i) A set A is uniformly almost everywhere dominating iff for each Π^0_2 class P there is a Σ^{0,A}_2 class Q ⊆ P such that μ(Q) = μ(P).
(ii) A set A is almost everywhere dominating iff for each Π^0_2 class P and each ε > 0, there is a Π^{0,A}_1 class Q ⊆ P such that μ(Q) ≥ μ(P) − ε.

Proof. (i) First suppose that A is uniformly almost everywhere dominating, as witnessed by the function f ≤_T A. Let P be a Π^0_2 class. It is easy to define a functional Φ such that P = {X : Φ^X total}. Let g^X(n) be the least s such that Φ^X(n)[s]↓. Then f dominates g^X for almost all X ∈ P, so we can take Q = {X : ∃c ∀n (Φ^X(n)[f(n) + c]↓)}.

Now suppose that the second half of (i) holds. Let P = {0^e1X : Φ^X_e total} and let Q ⊆ P be a Σ^{0,A}_2 class such that μ(Q) = μ(P). Write Q as an effective union of Π^{0,A}_1 classes Q₀, Q₁, … . By compactness, for each i, e, and n, the set S_{e,i,n} = {Φ^X_e(n) : 0^e1X ∈ Q_i} is finite. These sets are uniformly A-computable. Let f(n) = max_{e,i≤n} max S_{e,i,n}. Then f ≤_T A, and if 0^e1X ∈ Q_i, then f(n) ≥ Φ^X_e(n) for all n ≥ max{e, i}, so f dominates Φ^X_e. Since μ(Q) = μ(P), it follows that for almost all X, the function f dominates all functions g ≤_T X.
(ii) is just a nonuniform version of the above. First suppose that A is almost everywhere dominating. Let P be a Π^0_2 class, and let g^X be as above. For almost all X ∈ P, there is an f ≤_T A that majorizes g^X. Since there are only countably many A-computable functions, given ε > 0 there are f₀, …, f_k ≤_T A such that for all but measure ε many X ∈ P, some f_i with i ≤ k majorizes g^X. Let f(n) = max_{i≤k} f_i(n). Then we can take Q = {X : ∀n (Φ^X(n)[f(n)]↓)}.

Now suppose that the second half of (ii) holds. Fix e and ε > 0. Let P = {X : Φ^X_e total} and let Q ⊆ P be a Π^{0,A}_1 class such that μ(Q) ≥ μ(P) − ε. Arguing by compactness as above, there is an f ≤_T A such that f majorizes Φ^X_e for every X in Q. Since ε is arbitrary, we see that for almost all X, there is an A-computable function that majorizes Φ^X_e. Since there are only countably many e, it follows that A is almost everywhere dominating.

Theorem 10.6.3 (Binns, Kjos-Hanssen, Lerman, and Solomon [43]). If A is almost everywhere dominating then ∅′ ≤_LR A.

Proof. Let U₀, U₁, … be a universal Martin-Löf test relative to ∅′ and let P be the complement of U₁. Then P is a Π^0_2 class of positive measure all of whose elements are 2-random. By Theorem 10.6.2, there is a Π^{0,A}_1 class Q ⊆ P of positive measure. Let X be 1-random relative to A. By Theorem 6.10.2, there are Y ∈ Q and σ such that X = σY. Then Y is 2-random, and hence so is X.

The other direction of our equivalence relies on the following result, which will also be important in Section 11.9.

Theorem 10.6.4 (Kjos-Hanssen, Miller, and Solomon [206]). The following are equivalent.
(i) A′ ≤_T B′ and A ≤_LR B.
(ii) Every Π^{0,A}_1 class has a Σ^{0,B}_2 subclass of the same measure.
(iii) Every Σ^{0,A}_2 class has a Σ^{0,B}_2 subclass of the same measure.

Proof. (i) ⇒ (ii) This part of the proof is similar to the proof of Theorem 10.5.9. Extend the length-lexicographic ordering of strings ≤_L to pairs of strings. We define sets of strings V_{(σ,τ)} inductively.
Suppose that we have defined V_{(σ′,τ′)} for all (σ′, τ′) <_L (σ, τ) and let k be the maximum of the lengths of strings in ⋃_{(σ′,τ′)<_L(σ,τ)} V_{(σ′,τ′)}. Let V_{(σ,τ)} = {ν0^{|τ|} : ν ∈ 2^k}. It is easy to see that if I ⊆ 2^{<ω} × 2^{<ω}, then

   μ(⋂_{(σ,τ)∈I} V̄_{(σ,τ)}) = Π_{(σ,τ)∈I} (1 − 2^{−|τ|}),

where V̄_{(σ,τ)} denotes the complement of the open class generated by V_{(σ,τ)}. Now let S be a nonempty Π^{0,A}_1 class. Let W^A be an A-c.e. prefix-free set of generators for the complement of S, and let I = {(σ, τ) : τ ∈ W^A with use exactly σ}.
Let P = ⋂_{(σ,τ)∈I} V̄_{(σ,τ)}. Note that P is a Π^{0,A}_1 class. Recall the fact from Lemma 10.5.10 that if a₀, a₁, … ∈ [0, 1) then Π_i (1 − a_i) > 0 iff Σ_i a_i < ∞. Now, Σ_{(σ,τ)∈I} 2^{−|τ|} = Σ_{τ∈W^A} 2^{−|τ|} ≤ 1, since W^A is prefix-free, so μ(P) = Π_{(σ,τ)∈I} (1 − 2^{−|τ|}) > 0. Thus, by Theorem 10.5.1, there is a Π^{0,B}_1 class Q ⊆ P with μ(Q) > 0.

Let J = {(σ, τ) : V_{(σ,τ)} ∩ Q = ∅}. Note that J is B-c.e., and Π_{(σ,τ)∈J} (1 − 2^{−|τ|}) = μ(⋂_{(σ,τ)∈J} V̄_{(σ,τ)}) ≥ μ(Q) > 0, so Σ_{(σ,τ)∈J} 2^{−|τ|} < ∞.

Since A ≤_T A′ ≤_T B′, there is a B-computable approximation A₀, A₁, … to A. Let T_s = {(σ, τ) ∈ J : ∃t ≥ s (τ ∈ W^{A_t}[t] with use exactly σ)} and U_s = {τ : ∃σ ((σ, τ) ∈ T_s)}. Then {T_s}_{s∈ω} and {U_s}_{s∈ω} are sequences of uniformly B-c.e. sets. Let R = ⋃_s Ū_s, where Ū_s is the complement of the open class generated by U_s. We claim that R is the desired Σ^{0,B}_2 subclass of S.

First notice that W^A ⊆ U_s for all s, and hence R ⊆ S. Now, for each (σ, τ) ∈ T₀ \ I, there is a final stage t such that σ is a prefix of A_t, as otherwise (σ, τ) ∈ I. Then for any stage s > t, we have (σ, τ) ∉ T_s. Fix ε > 0 and choose n sufficiently large so that Σ {2^{−|τ|} : (σ, τ) ∈ J ∧ |σ|, |τ| > n} < ε. Choose s large enough so that if (σ, τ) ∈ T₀ \ I and |σ|, |τ| ≤ n then (σ, τ) ∉ T_s. Then

   μ(S \ Ū_s) ≤ Σ_{τ∈U_s\W^A} 2^{−|τ|} ≤ Σ_{(σ,τ)∈T_s\I} 2^{−|τ|} ≤ Σ {2^{−|τ|} : (σ, τ) ∈ J ∧ |σ|, |τ| > n} < ε.

Since ε > 0 is arbitrary, μ(R) = μ(S).

(ii) ⇒ (iii) Let S be a Σ^{0,A}_2 class and let P₀, P₁, … be uniformly Π^{0,A}_1 classes such that S = ⋃_i P_i. Let P = {0^i1α : ∃i (α ∈ P_i)}. Then P is a Π^{0,A}_1 class, so by (ii) it has a Σ^{0,B}_2 subclass Q of the same measure. For each i, let Q_i = {α : 0^i1α ∈ Q}. Then the Q_i are uniformly Σ^{0,B}_2 classes with Q_i ⊆ P_i for all i. If μ(Q_i) < μ(P_i) for some i, then μ(Q) = Σ_i 2^{−(i+1)} μ(Q_i) < Σ_i 2^{−(i+1)} μ(P_i) = μ(P), contradicting the choice of Q, so μ(Q_i) = μ(P_i) for all i. Now let R = ⋃_i Q_i. Then R is a Σ^{0,B}_2 subclass of S, and μ(S \ R) ≤ Σ_i μ(P_i \ Q_i) = 0, so μ(R) = μ(S).

(iii) ⇒ (i) If P is a Π^{0,A}_1 class of positive measure then by (ii) (which is clearly implied by (iii)), P has a Σ^{0,B}_2 subclass S of the same measure. Now, S is a countable union of Π^{0,B}_1 classes, so one of these classes has positive measure. Thus P has a Π^{0,B}_1 subclass of positive measure, so by Theorem 10.5.1, A ≤_LR B.
Next, we show that A ≤_T B′. Let σ_n = 0^n 1 and let U = ⋃_{n∈A} ⟦σ_n⟧. Since U is a Σ^{0,A}_1 class (and hence a Π^{0,A}_2 class), by (iii) there is a Π^{0,B}_2 class Q such that U ⊆ Q and μ(Q) = μ(U) = Σ_{n∈A} 2^{−(n+1)}. We claim that n ∈ A iff ⟦σ_n⟧ ⊆ Q. If n ∈ A, then ⟦σ_n⟧ ⊆ U ⊆ Q. On the other hand, if n ∉ A and ⟦σ_n⟧ ⊆ Q, then μ(Q) ≥ Σ_{i∈A} 2^{−(i+1)} + 2^{−(n+1)} > μ(U), which is a contradiction.
Let Q_0, Q_1, ... be uniformly Σ^{0,B}_1 classes such that Q = ⋂_k Q_k. Then n ∈ A iff ⟦σ_n⟧ ⊆ Q iff ⟦σ_n⟧ ⊆ Q_k for all k. Since "⟦σ_n⟧ ⊆ Q_k" is a Σ^{0,B}_1 relation, these equivalences show that A is Π^{0,B}_2. The same argument applied to the Σ^{0,A}_1 class ⋃_{n∉A} ⟦σ_n⟧ shows that the complement of A is Π^{0,B}_2 as well, and hence A ≤_T B′.¹⁴
Note that A ≤_LR B does not imply A ≤_T B′, because there is a B such that A ≤_LR B for uncountably many A (see Barmpalias, Lewis, and Soskova [25]).
Theorem 10.6.5 (Kjos-Hanssen, Miller, and Solomon [206]). If ∅′ ≤_LR A then A is uniformly almost everywhere dominating.
Proof. Suppose that ∅′ ≤_LR A. Let P be a Π^0_2 class. By Theorem 6.8.3, there are uniformly Π^{0,∅′}_1 classes R_0, R_1, ... ⊆ P such that μ(P) − μ(R_n) < 2^{−n} for all n. Let R = ⋃_n R_n. Then R is a Σ^{0,∅′}_2 class such that R ⊆ P and μ(R) = μ(P). Since ∅′ ≤_T A′, it follows from Theorem 10.6.4 that there is a Σ^{0,A}_2 class Q ⊆ R ⊆ P such that μ(Q) = μ(R) = μ(P). The theorem now follows from Theorem 10.6.2.
See [206] for more on almost everywhere domination and related concepts, including a discussion of the reverse mathematical questions that originally motivated the study of this notion.
¹⁴ This proof can be used to show more, namely that A′ ≤_T B′. Applying the proof to the Σ^{0,A}_1 class ⋃_{n∈A′} ⟦σ_n⟧, we see that A′ is Π^{0,B}_2. But since A ≤_T B′, we have that A′ is Σ^{0,B}_2, so in fact A′ ≤_T B′. It follows that if A ≤_T B′ and A ≤_LR B, then A′ ≤_T B′.
11 Randomness-Theoretic Weakness
In this chapter, we introduce an important class of “randomness-theoretically weak” sets, the K-trivial sets. As we will see, this class has several natural characterizations, can be used to answer several questions in the theory of algorithmic randomness, and is also of great interest to computability theory.
11.1 K-triviality
Recall Chaitin’s Theorem 3.4.4, which states that C(A↾n) ≤ C(n) + O(1) iff A is computable. Chaitin asked whether the analogous result holds for K in place of C. He was able to show the following, which is Theorem 9.15.1 for the case X = S = ∅.
Theorem 11.1.1 (Chaitin [61]). If K(A↾n) ≤ K(n) + O(1) then A is Δ^0_2.
Merkle and Stephan [270] noted that essentially the same proof establishes the following result, which we have already encountered in Section 9.15.
Theorem 11.1.2 (Merkle and Stephan [270]). If K(A↾n) ≤ K(n) + O(1) for all n in an infinite set S then A ≤_T S ⊕ ∅′.
The Counting Theorem in fact shows that the width of the tree T in the proof of Theorem 11.1.1 is O(2^c), so we have the following result.
R.G. Downey and D. Hirschfeldt, Algorithmic Randomness and Complexity, Theory and Applications of Computability, DOI 10.1007/978-0-387-68441-3_11, © Springer Science+Business Media, LLC 2010
Theorem 11.1.3 (Zambella [416]). For each c there are O(2^c) many sets A such that K(A↾n) ≤ K(n) + c for all n.
Of course, if A is computable then K(A↾n) ≤ K(n) + O(1), but as we will see in the next subsection, the converse is not true. In other words, even though A may look identical to ∅ as far as prefix-free initial segment complexity goes, it does not follow that A is computable, even for c.e. sets A. Thus we are led to the concept of K-triviality. Recall that a set A is K-trivial if K(A↾n) ≤ K(n) + O(1). Equivalently, A is K-trivial if A ≤_K ∅.
It makes no difference in the above definition if we restrict our attention to an infinite computable set of n’s.
Proposition 11.1.4. If K(A↾n) ≤ K(n) + O(1) for all n in an infinite computable set S then A is K-trivial.
Proof. Let h(n) be the nth element of S, in natural order. For any n, we have K(n) = K(h(n)) ± O(1). Furthermore, from a description of A↾h(n), we can recover n and thus obtain A↾n, so K(A↾n) ≤ K(A↾h(n)) + O(1). Thus K(A↾n) ≤ K(n) + O(1) for all n.
11.1.1 The basic K-triviality construction
Solovay [371] was the first to construct a noncomputable K-trivial set. His method was adapted by Zambella [416] to construct a noncomputable K-trivial c.e. set (see also Calude and Coles [49]). Downey, Hirschfeldt, Nies, and Stephan [117] gave a new construction of a K-trivial c.e. set, which we present below. (A similar construction had been produced independently by Kummer in unpublished work.) As we will later see, this construction gives a priority-free, and even a requirement-free, solution to Post’s Problem.
Before presenting the construction from [117], we make a remark. While the proof below is easy, it is slightly hard to see why it works. So, by way of motivation, suppose that we were asked to “prove” that ∅ has the same initial segment complexity as N. A complicated way to do so is to build our own prefix-free machine M whose only job is to compute initial segments of ∅. If U(σ)↓ = 1^n then we want M(σ)↓ = 0^n. We can build M implicitly, using the KC Theorem, by enumerating the request (|σ|, 0^n) every time U(σ)↓ = 1^n. We are guaranteed that Σ_{τ∈dom M} 2^{−|τ|} ≤ Σ_{σ∈dom U} 2^{−|σ|} ≤ 1, and hence the KC Theorem applies. Note also that we could, for convenience and as we do in the construction below, use a string of length |σ| + 1, in which case we would ensure that Σ_{τ∈dom M} 2^{−|τ|} < 1/2.
Theorem 11.1.5 (Zambella [416], after Solovay [371]). There is a noncomputable K-trivial c.e. set.
Proof. We define a noncomputable c.e. set A and a KC set L. As in the remark above, we follow U on n, in the sense that, when at a stage s we see a shorter σ with U(σ) = n than we have previously seen, we enumerate a
request (|σ| + 1, A_s↾n) into L. To make A noncomputable, we sometimes need to make A_{s+1}(n) ≠ A_s(n). Then for each j with n < j ≤ s, for the currently shortest U-program σ_j for j, we also enumerate a request (|σ_j|, A_{s+1}↾j). The construction works by making this extra measure added to the weight of L small. Let

A = {n : ∃e ∃s (W_{e,s} ∩ A_s = ∅ ∧ n > 2e ∧ n ∈ W_{e,s} ∧ Σ_{n<j≤s} 2^{−K_s(j)} ≤ 2^{−(e+2)})},

and let L be as described above. Clearly A is c.e. Since Σ_{j≥m} 2^{−K(j)} goes to zero as m increases, if W_e is infinite then A ∩ W_e ≠ ∅. Since A is also clearly coinfinite, it follows that A is noncomputable. Finally, the extra weight of L, beyond one half of the measure of dom U, is bounded by Σ_e 2^{−(e+2)} (corresponding to at most one initial segment change for each e), whence the weight of L is bounded by

Σ_{σ∈dom U} 2^{−(|σ|+1)} + Σ_e 2^{−(e+2)} ≤ 1/2 + 1/2 = 1.
So the KC Theorem applies, and hence, since (K(n) + 1, A↾n) ∈ L for all n, we have K(A↾n) ≤ K(n) + O(1).
A useful way to think of this proof is in terms of cost functions. The function c(n, s) = Σ_{n<j≤s} 2^{−K_s(j)} represents the cost of enumerating n into A at stage s. To ensure that A is K-trivial, we need to ensure that Σ{c(n, s) : n is the least element of A_{s+1} \ A_s} is finite. We will see another example of a cost function construction in the proof of Theorem 11.2.5, and will discuss this idea further below.
The above proof admits several variations. For instance, we can make A promptly simple, or below any nonzero c.e. degree. We cannot control the jump or make A Turing complete, since, as we will see below, all K-trivial sets are low.
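The cost function bookkeeping is easy to simulate. The sketch below is only a toy illustration of the paradigm, not the actual construction: `k_approx` is a made-up computable proxy for the uncomputable approximation K_s(j), and `W` is a made-up listing standing in for the c.e. sets W_e. A number n is enumerated only when its cost c(n, s) is below the quota 2^{−(e+2)} of the requirement being served, so the total cost stays below Σ_e 2^{−(e+2)} = 1/2.

```python
# Toy simulation of the cost function paradigm from the proof of
# Theorem 11.1.5.  k_approx is a HYPOTHETICAL computable proxy for the
# (uncomputable) stage-s approximation K_s(j); W(e, s) is a made-up
# stand-in for the e-th c.e. set at stage s.

def k_approx(j, s):
    # Stands in for K_s(j): about 2 log j, improving a little with s.
    return 2 * j.bit_length() + max(0, 5 - s // (j + 1))

def cost(n, s):
    # c(n, s) = sum_{n < j <= s} 2^{-K_s(j)}
    return sum(2.0 ** -k_approx(j, s) for j in range(n + 1, s + 1))

def W(e, s):
    # Made-up "e-th c.e. set at stage s": multiples of e + 1 below s.
    return {m for m in range(s) if m % (e + 1) == 0}

def simulate(stages=60):
    A, total_cost = set(), 0.0
    for s in range(1, stages):
        for e in range(min(s, 12)):      # only finitely many requirements here
            if A & W(e, s):
                continue                 # requirement e already satisfied
            for n in sorted(W(e, s)):    # look for a cheap witness n > 2e
                if n > 2 * e and cost(n, s) <= 2.0 ** -(e + 2):
                    A.add(n)             # enumerate n; requirement e acts once
                    total_cost += cost(n, s)
                    break
    return A, total_cost

A, total = simulate()
# Each requirement pays at most 2^-(e+2), so the total cost is below 1/2.
assert total < 0.5
```

Since each requirement acts at most once and respects its quota, the total cost is finite however the simulation is tuned; this is the "do what is cheap" discipline mentioned later in Section 11.2.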
11.1.2 The requirement-free version
As we will see in Section 11.3, the construction above automatically yields a Turing incomplete c.e. set. It is thus an injury-free solution to Post’s Problem. It is not, however, priority-free, in that the construction depends on an ordering of the simplicity requirements, with stronger requirements allowed to add more to the weight of L. We can do methodologically better by giving a priority-free solution to Post’s Problem, in the sense that no explicit diagonalization (such as that of W_e above) occurs in the construction of the incomplete c.e. set, and therefore the construction of this set (as
opposed to the verification that it is noncomputable) does not depend on an ordering of requirements. We now sketch this method, which is due to Downey, Hirschfeldt, Nies, and Stephan [117], and is rather more like Solovay’s original proof of the existence of a noncomputable K-trivial set.
Let us reconsider the key idea in the proof of Theorem 11.1.5. At certain stages we wish to change an initial segment of A for the sake of diagonalization. Our method is to make sure that the total measure added to the weight of our KC set L (which proves the K-triviality of A) due to such changes is bounded by 1. Suppose, on the other hand, that we were fortunate enough to have U itself “cover” the measure needed for these changes. That is, suppose we were at a stage s where we desired to put n into A_{s+1} \ A_s and at that very stage K_s(j) changed for all j ∈ (n, s]. Then in any case we would need to enumerate new requests describing A_{s+1}↾j for all j ∈ (n, s], whether or not these initial segments change. Thus, at that very stage, we could also change A_s↾j for all j ∈ (n, s] at no extra cost.
Notice that we would not need to copy U at every stage. We could enumerate a collection of stages t_0, t_1, ... and only update L at stages t_i. Thus, for the lucky situation outlined above to occur, we would need the approximation to K(j) to change for all j ∈ (n, t_s] only at some stage u with t_s ≤ u ≤ t_{s+1}. This observation would seem to allow a greater possibility for this lucky situation to occur, since many more stages can occur between t_s and t_{s+1}.
The key point in this discussion is the following. Let t_0, t_1, ... be a computable collection of stages. Suppose that we construct a set A = ⋃_s A_{t_s} so that for n ≤ t_s, if A_{t_{s+1}}↾n ≠ A_{t_s}↾n then K_{t_s}(j) > K_{t_{s+1}}(j) for all j with n ≤ j ≤ t_s. Then A is K-trivial.
We are now ready to define such an A in a priority-free way. Let t_0, t_1, ...
be a collection of stages such that t_i, as a function of i, dominates all primitive recursive functions. (Actually, we do not need i ↦ t_i to be quite this fast growing; see below for more details.) At each stage u, let {a_{i,u} : i ∈ N} list Ā_u. Let A_{t_{s+1}} = A_{t_s} ∪ [a_{n,t_s}, t_s], where n is the least number less than or equal to t_s such that K_{t_{s+1}}(j) < K_{t_s}(j) for all j ∈ (n, t_s]. (Naturally, if no such n exists, A_{t_{s+1}} = A_{t_s}.) Requiring the complexity change for all j ∈ (n, t_s], rather than just j ∈ (a_{n,t_s}, t_s], ensures that A is coinfinite, since for each n there are only finitely many s such that K_{t_{s+1}}(n) < K_{t_s}(n).
Note that there is no priority used in the definition of A. It is like the Dekker deficiency set or the so-called “dump set” (see Soare [366, Theorem V.2.5]). It remains to prove that A is noncomputable.
By the recursion theorem, we can build a prefix-free machine M and know the coding constant c of M in U. That is, if we declare M(σ) = j then we will have U(τ) = j for some τ such that |τ| ≤ |σ| + c. Note further that if we put σ into the domain of M
at stage t_s, then τ will be in the domain of U by stage t_{s+1} − 1. (This is why we required i ↦ t_i to dominate the primitive recursive functions. In fact, we only need this function to dominate the overhead of the recursion theorem; that is, we only need the property that if σ enters the domain of M at stage t_s, then there is a τ such that |τ| ≤ |σ| + c and U_{t_{s+1}−1}(τ)↓ = M(σ). The use of a fast growing sequence of stages was the key insight in Solovay’s original construction.)
Now the proof looks like that of Theorem 11.1.5. We devote 2^{−(e+1)} of the domain of our machine M to ensuring that A satisfies the eth simplicity requirement. When we see a_{n,t_s} occur in W_{e,t_s}, where Σ_{n<j≤t_s} 2^{−K_{t_s}(j)} ≤ 2^{−(e+c+2)}, we provide shorter M_{t_s}-descriptions of all j with n < j ≤ t_s, so that K_{t_{s+1}}(j) < K_{t_s}(j) for all such j. The cost of this change is bounded by 2^{−(e+1)}, and a_{n,t_s} will enter A_{t_{s+1}}, as required.
As observed by Downey, Hirschfeldt, Nies, and Stephan [117], and later formally written out by Nies [303, 306], many of the usual computability-theoretic variations work for this construction. Thus, for example, there are minimal pairs of degrees containing K-trivial sets, and the following holds (though the proof is a bit tricky; see [303] for details).
Theorem 11.1.6 (Downey, Hirschfeldt, and Nies, see [303]). If B is a low c.e. set, then there is a K-trivial c.e. set A ≰_T B. Indeed, if B_0, B_1, ... is a uniformly c.e. sequence of uniformly low sets, then there is a K-trivial c.e. set A such that A ≰_T B_i for all i.
11.1.3 Solovay functions and K-triviality
Recall from Definition 3.12.3 that a Solovay function is a computable function f such that Σ_n 2^{−f(n)} < ∞ and f(n) ≤ K(n) + O(1) for infinitely many n. In this subsection we point out an attractive connection between K-trivials and Solovay functions. We begin with an analog of Chaitin’s Theorem 3.4.4 that all C-trivials are computable (where A is C-trivial if C(A↾n) ≤ C(n) + O(1)).
Proposition 11.1.7 (Bienvenu and Downey [39]). Suppose that for each computable function f with Σ_n 2^{−f(n)} < ∞, there exists a computable function g with Σ_σ 2^{−g(σ)} < ∞ such that g(A↾n) ≤ f(n) + O(1). Then A is computable.
Proof. Let A satisfy the hypothesis of the proposition, and let f be a Solovay function. Let g be a computable function such that Σ_σ 2^{−g(σ)} < ∞ and there is a c with g(A↾n) ≤ f(n) + c for all n. Let d be such that K(σ) ≤ g(σ) + d for all σ. Since f is a Solovay function, there is an e such that f(n) ≤ K(n) + e for infinitely many n. For any such n,

|{τ ∈ 2^n : g(τ) ≤ f(n) + c}| ≤ |{τ ∈ 2^n : K(τ) ≤ K(n) + c + d + e}| ≤ 2^{c+d+e+O(1)},
where the last inequality comes from the Counting Theorem 3.7.6. So the Π^0_1 class {X ∈ 2^ω : ∀n (g(X↾n) ≤ f(n) + c)}, to which A belongs, has only finitely many elements (at most 2^{c+d+e+O(1)} many), and hence all these elements are computable.
We can use Solovay functions to characterize K-triviality.
Theorem 11.1.8 (Bienvenu and Downey [39]). A set A is K-trivial iff for all computable functions f such that Σ_n 2^{−f(n)} < ∞, we have K(A↾n) ≤ f(n) + O(1). Moreover, there exists a single such computable function g such that

A is K-trivial ⇔ K(A↾n) ≤ g(n) + O(1).    (11.1)

Proof. We give a simple proof due to Wolfgang Merkle. By Lemma 3.12.2, any K-trivial A satisfies K(A↾n) ≤ f(n) + O(1) for all computable f with Σ_n 2^{−f(n)} < ∞. Thus, it is enough to find such a function g so that the ⇐ direction of (11.1) holds. In fact, we can take g to be Solovay’s Solovay function, the function constructed in the proof of Theorem 3.12.1. Let A be a set such that K(A↾n) ≤ g(n) + O(1). Given n, let s be least such that K_s(n) = K(n), and let m = ⟨n, s⟩. Then g(m) ≤ K(n) + O(1), as explained in the proof of Theorem 3.12.1. Since n can be computed from m, we have K(A↾n) ≤ K(A↾m) + O(1) ≤ g(m) + O(1) ≤ K(n) + O(1), so A is K-trivial.
Too recently for a proof to be included in this book, Bienvenu, Merkle, and Nies [unpublished] have shown that a computable function g satisfies (11.1) iff it is a Solovay function.
Theorem 11.1.9 (Bienvenu, Merkle, and Nies [unpublished]). The following are equivalent for a computable function g.
(i) The function g is a Solovay function.
(ii) A set A is K-trivial iff K(A↾n) ≤ g(n) + O(1).
Bienvenu, Merkle, and Nies [unpublished] also showed that the above result still holds for co-c.e. functions g (where we remove the condition that g be computable from the definition of a Solovay function).
11.1.4 Counting the K-trivial sets
Zambella’s Theorem 11.1.3 leads one to ask exactly how many K-trivials there are for a given constant of K-triviality, and how complicated it is to enumerate them. This question has been investigated by Downey, Miller, and Yu [127]. Let KT(c) = {A : ∀n (K(A↾n) ≤ K(n) + c)} and let
G(c) = |KT(c)|. The function G appears to be a strangely complicated object. We calculate some bounds on its complexity. To do so, we begin with a combinatorial analysis. Let KT(c, n) = {σ ∈ 2^n : K(σ) ≤ K(n) + c} and G(c, n) = |KT(c, n)|.
Theorem 11.1.10 (Downey, Miller, and Yu [127]). lim_c G(c)/2^c = 0. Indeed, Σ_c G(c)/2^c < ∞.
Proof. Let n be large enough so that if A and B are distinct elements of KT(c) then A↾n ≠ B↾n. Then for any m ≥ n, we have G(c) ≤ G(c, m). Thus for any d, choosing m large enough to work for each c ≤ d, we have

Σ_{c≤d} G(c)/2^c ≤ Σ_{c≤d} G(c, m)/2^c ≤ Σ_{σ∈2^m} Σ_{c ≥ K(σ)−K(m)} 2^{−c} ≤ 2^{K(m)+1} Σ_{σ∈2^m} 2^{−K(σ)} = O(1),

where the last inequality follows by Corollary 3.7.9, and the constant is independent of d. Thus Σ_c G(c)/2^c < ∞.
The above result is relatively sharp, in the following sense.
Theorem 11.1.11 (Downey, Miller, and Yu [127]). There is an ε > 0 such that G(c) > ε · 2^c/c^2 for all large enough c.
Proof. For any σ and r,

K(σ0^r) ≤ K(σ) + K(0^r) + O(1) ≤ K(σ) + K(0^{|σ|+r}) + O(1) = K(σ) + K(|σ0^r|) + O(1).

So there is a d such that σ0^ω ∈ KT(K(σ) + d) for all σ. If |σ| = (c − d) − 2 log(c − d), then K(σ) ≤ c − d, so σ0^ω ∈ KT(c). But there are 2^{c−d}/(c − d)^2 many such σ, so the theorem follows.
We now consider the complexity of G.
Theorem 11.1.12 (Downey, Miller, and Yu [127]). G ≰_T ∅′.
Proof. Suppose that G ≤_T ∅′. Let k be such that 0^ω ∈ KT(k). Using the KC Theorem, we build a machine M with coding constant d, which we know in advance by the recursion theorem. Let c ≥ max(k, 2d) be least such that G(c)/2^c < 2^{−d}. Such a c exists by Theorem 11.1.10, and we can approximate c as the limit of a computable sequence c_0, c_1, ... . We may assume that c_s ≥ d for all s. We build M to ensure that KT(c) has at least 2^{c−d} many elements, yielding a contradiction.
At stage s, proceed as follows. If s = 0, then let S_0 = {λ}. If s > 0 and c_s = c_{s−1}, then let S_s be a set of strings of length s extending the strings in S_{s−1} such that |S_s| = min(2^{c_s−d}, 2|S_{s−1}|). If s > 0 and c_s ≠ c_{s−1}, then let S_s = {0^s}. In any case, promise to give each string in S_s an M-description of length K(s) + c_s − d, which is clearly possible because 2^{c_s−d} · 2^{−(K(s)+c_s−d)} = 2^{−K(s)}.
Let t be least such that c_s = c for all s ≥ t. Then S_t = {0^t}, and from stage t on, |S_s| grows until it reaches 2^{c−d}. Thus there are 2^{c−d} many sequences α ⊃ 0^t such that α↾s ∈ S_s for all s ≥ t. For any such α, we have K(α↾s) ≤ K_M(α↾s) + d ≤ K(s) + c for all s ≥ t, and K(α↾s) ≤ K(s) + k ≤ K(s) + c for all s < t. Thus every such α is in KT(c), and hence G(c) ≥ 2^{c−d}, for the desired contradiction.
A crude upper bound on the complexity of G is that G ≤_T ∅‴. It is unknown whether G ≤_T ∅″, or even whether the answer to this question is machine dependent.
11.2 Lowness
In computability theory, a set A is called low if its jump has the same Turing degree as the jump of ∅. The idea is that, in relation to the jump operator, A has no more power as an oracle than ∅. Another way to look at this concept is that A is low iff the class of sets that are Δ^0_2 relative to A is the same as the unrelativized class of Δ^0_2 sets. We can generalize this notion to any relativizable class C, and say that A is low for C if C^A = C. For example, we have the following important concept.
Definition 11.2.1. Let R be a relativizable notion of randomness. A set A is low for R-randomness if every R-random set is R-random relative to A.
So A is low for R-randomness if it has no derandomization power with respect to R-randomness. Of course, here R can be any of the notions we have previously studied, such as 1-randomness, Schnorr randomness, etc. For notions of randomness R with a test definition, we also have the following notion.
Definition 11.2.2. A set A is low for R-tests if every R-test relative to A is covered by an unrelativized R-test, that is, for every R-test {U_n}_{n∈ω} relative to A, there is an R-test {V_n}_{n∈ω} such that ⋂_n V_n ⊇ ⋂_n U_n.
If A is low for R-tests, then it is low for R-randomness, but the converse is not necessarily true. For any notion of randomness with universal tests, however, the two notions clearly coincide. Thus, for instance, a set is low for Martin-Löf tests iff it is low for 1-randomness.
We can also further generalize the idea of lowness to other notions, such as in the following concept due to An. A. Muchnik [unpublished].
Definition 11.2.3 (Muchnik [unpublished]). A set A is low for K if K(σ) ≤ K^A(σ) + O(1).
Suppose that A is low for K. Then K(A↾n) ≤ K^A(A↾n) + O(1) ≤ K(n) + O(1), so A is K-trivial. Also, if X is 1-random then K^A(X↾n) ≥ K(X↾n) − O(1) ≥ n − O(1), so A is low for 1-randomness.
A set A is low for 1-randomness iff A ≤_LR ∅, and low for K iff A ≤_LK ∅, so Theorem 10.5.9 gives us the following result, originally proved by Nies [302].
Corollary 11.2.4 (Nies [302]). A set is low for 1-randomness iff it is low for K.
Lowness for 1-randomness was first studied by Kučera and Terwijn [223], who answered a question of van Lambalgen and Zambella, first published in [416], by showing that there is a noncomputable c.e. set that is low for 1-randomness. Their result was an early example of a cost function construction (as Kučera [private communication] has put it, the strategy of such a construction is “do what is cheap”), a topic we will return to in Section 11.5. It follows from the following more recent theorem, which also gives another proof of the existence of noncomputable K-trivial c.e. sets.
Theorem 11.2.5 (Muchnik [unpublished]). There is a noncomputable c.e. set that is low for K.
Proof. We build a c.e. set A to be low for K while satisfying the usual simplicity requirements R_e : W_e ≠ Ā. We use a cost function as in the proof of Theorem 11.1.5. Let c(n, s) = Σ{2^{−|τ|} : U^A[s](τ)↓ with use greater than n}. We think of c(n, s) as the cost of enumerating n into A at stage s.
Initially, let n_{e,0} = 2e for all e. Unless explicitly defined otherwise, n_{e,s+1} = n_{e,s} for all e and s. At stage s, say that R_e requires attention if W_{e,s} ∩ A_s = ∅ and n_{e,s} ∈ W_{e,s}. If no R_e with e ≤ s requires attention, then proceed to the next stage. Otherwise, let e be least such that R_e requires attention. If c(n_{e,s}, s) < 2^{−(e+2)} then put n_{e,s} into A. Otherwise, for all i ≥ e, let n_{i,s+1} be fresh large numbers with n_{i,s+1} > 2i. In either case, we say that R_e acts at stage s.
Clearly A is c.e. and coinfinite. Assume for a contradiction that A is computable. Let e be least such that W_e = Ā.
If i < e then either W_i ∩ A ≠ ∅ or W_i is finite. In either case, R_i eventually stops requiring attention. Let s be a stage by which all such R_i have stopped requiring attention.
If there were no stage s_0 ≥ s at which R_e acts, then we would have n_{e,t} = n_{e,s} for all t > s and n_{e,s} ∉ A. But then n_{e,s} ∈ W_e, so R_e would act at the least t ≥ s such that n_{e,s} ∈ W_{e,t}. So there must be such a stage s_0. At that stage, we must have c(n_{e,s_0}, s_0) ≥ 2^{−(e+2)}, since otherwise we would have n_{e,s_0} ∈ W_e ∩ A. Now all n_{i,s_0+1} for i ≥ e get defined to be
fresh large numbers, so the U^A-computations contributing to c(n_{e,s_0}, s_0) are permanent. Thus μ(dom U^A) ≥ 2^{−(e+2)}.
But now, arguing as before, there is a stage s_1 > s_0 at which R_e acts, and c(n_{e,s_1}, s_1) ≥ 2^{−(e+2)}. Again, the U^A-computations contributing to c(n_{e,s_1}, s_1) are permanent, and furthermore, they are all different from the ones that contributed to c(n_{e,s_0}, s_0), since their use must be greater than n_{e,s_0+1}, which was defined to be greater than the use of the computations contributing to c(n_{e,s_0}, s_0). Thus μ(dom U^A) ≥ 2^{−(e+1)}. We can now repeat this argument to show that μ(dom U^A) ≥ 2^{−e}, and so on. Eventually we conclude that μ(dom U^A) ≥ 2, which is a contradiction.
It remains to show that A is low for K. Let L = {(|τ| + 1, σ) : ∃s (U^A(τ)[s]↓ = σ)}. The weight of L is bounded by

(1/2) (μ(dom U^A) + Σ_s μ(dom U^A[s] \ dom U^A[s+1])).

But if τ ∈ dom U^A[s] \ dom U^A[s+1] then there must be an n entering A below the use of U^A(τ)[s] at stage s. Thus μ(dom U^A[s] \ dom U^A[s+1]) ≤ c(n, s) for the unique n entering A at stage s. (If there is no such n, then dom U^A[s] \ dom U^A[s+1] = ∅.) So the weight of L is bounded by

(1/2) (μ(dom U^A) + Σ{c(n, s) : n ∈ A_{s+1} \ A_s}) ≤ (1/2) (μ(dom U^A) + Σ_e 2^{−(e+2)}) ≤ (1/2)(1 + 1) = 1.

Thus L is a KC set. Furthermore, for each σ, the request (K^A(σ) + 1, σ) is in L, so K(σ) ≤ K^A(σ) + O(1), as required.
Corollary 11.2.6 (Kučera and Terwijn [223]). There is a noncomputable c.e. set that is low for 1-randomness.
We have already seen in Chapter 10 that the following relativized form of this result is quite useful.
Corollary 11.2.7 (Kučera and Terwijn [223]). For every X there is a Y >_T X such that every set that is 1-random relative to X is 1-random relative to Y (i.e., Y ≤_LR X).
Here is another application of (the proof of) this result. The construction in the proof of Theorem 11.2.5 gives us a way to define a nontrivial pseudo-jump operator taking any set X to a set Y ≤_LR X. By Theorem 2.12.3, there is an incomplete c.e. set X such that ∅′ ≤_LR X. As we have seen in Section 10.6, the latter condition is equivalent to being (uniformly) almost everywhere dominating. On the other hand, it is not difficult to build a low c.e. set A that is not K-trivial, and hence not low for K, by combining
lowness requirements with requirements of the form ∃n (K(A↾n) > K(n) + c). (In Theorem 11.4.1, we will see that all K-trivial sets are superlow, and hence array computable (by the remark following the proof of Theorem 2.23.9). Thus the proof of Theorem 2.23.9 is another way to build such an A.) Again, we can use this construction to define a nontrivial pseudo-jump operator taking any set X to a low set Y ≰_LR X. By Theorem 2.12.3, there is a high c.e. set X such that ∅′ ≰_LR X. Thus the collection of (uniformly) almost everywhere dominating sets contains incomplete c.e. sets, but does not contain all high c.e. sets.
The proof of Theorem 11.2.5 is quite similar to the direct proofs of the existence of noncomputable K-trivial sets in the previous section. It is natural to wonder whether the notions of K-triviality, lowness for 1-randomness, and lowness for K might coincide. (We have already seen that the latter two do coincide, and imply K-triviality.) When this question was first considered, there was some evidence that this should not be the case. For instance, it was known that the K-trivial sets are all Δ^0_2, and that the join of two K-trivial sets is K-trivial, but it was open whether the analogous facts hold for lowness for 1-randomness. In the other direction, it was known that the sets that are low for 1-randomness are GL_1, and that, trivially, if a set A is low for 1-randomness, then so is every A-computable set, but it was open whether the analogous facts hold for K-triviality. As we will see below, however, Nies has developed a powerful method that shows, among other things, that these three notions do in fact coincide, and hence define the same natural class of randomness-theoretically weak sets.
We finish this section by proving a useful characterization of lowness for 1-randomness (and hence of K-triviality). The “if” direction of this result is similar to a lemma Kučera and Terwijn [223] used in proving Theorem 11.2.6.
Theorem 11.2.8 (Nies and Stephan, see [299]). A is low for 1-randomness iff there are a Σ^0_1 class R and a c such that μ(R) < 1 and K^A(σ) ≤ |σ| − c ⇒ ⟦σ⟧ ⊆ R for all σ.
Proof. Suppose such a class R exists. Let X be 1-random. By Theorem 6.10.2, there are σ and Y ∉ R such that X = σY. We have K^A(Y↾n) > n − c for all n, so Y is 1-random relative to A, and hence so is X. Thus A is low for 1-randomness.
Now suppose that A is low for 1-randomness. Let V be a Σ^0_1 class such that μ(V) < 1 and V contains all sets that are not 1-random. We claim that there are σ and k such that ⟦σ⟧ ⊈ V and ∀τ ⊇ σ (K^A(τ) ≤ |τ| − k ⇒ ⟦τ⟧ ⊆ V). Suppose that this is not the case. Then we can construct a set X as follows. Let τ_0 = λ, and let τ_{n+1} be a proper extension of τ_n with ⟦τ_{n+1}⟧ ⊈ V and K^A(τ_{n+1}) ≤ |τ_{n+1}| − n. Let X = lim_n τ_n. Then X is not 1-random relative to A. But X ∉ V, so X is 1-random, which contradicts the assumption that A is low for 1-randomness.
Thus we can fix σ and k as above and let R be the Σ^0_1 class generated by {τ : ⟦στ⟧ ⊆ V}. Then the complement of R is a Π^0_1 class of 1-random sets, which is nonempty (since ⟦σ⟧ ⊈ V) and hence cannot be null, so μ(R) < 1. Furthermore, there is a c such that

K^A(τ) ≤ |τ| − c ⇒ K^A(στ) ≤ |στ| − k ⇒ ⟦στ⟧ ⊆ V ⇒ ⟦τ⟧ ⊆ R.
11.3 Degrees of K-trivial sets
The technique introduced in this section, which has been called the decanter method, or, in its full-fledged form, the golden run method, is a fundamental one in the study of K-triviality. In this section we will look only at the basic method, leaving the more difficult nonuniform applications to the following section.
It is not difficult to show that K-trivial sets are wtt-incomplete. The following stronger result shows that the noncomputable c.e. K-trivial sets provide a more or less natural solution to Post’s Problem.
Theorem 11.3.1 (Downey, Hirschfeldt, Nies, and Stephan [117]). Every K-trivial set is Turing incomplete.
The decanter method evolved from attempted proofs of the negation of this result, the failures of which were eventually turned around into a proof of the theorem, in the time-honored symmetry of computability theory. Subsequently, the removal of artifacts of the original proof, and the use of a treelike structure by Nies, led to a very powerful and general technique, and, as we will see in the following section, to vast improvements of Theorem 11.3.1. The account below is similar to the one in [104].
11.3.1 A first approximation: wtt-incompleteness
The fundamental tool used in all of these proofs is what can be described as “amplification”. Suppose that A is K-trivial with constant of triviality b, and we are using the KC Theorem to build a prefix-free machine M whose coding constant within the universal prefix-free machine U is known to be d. Then if we enumerate the request (p, n), meaning that we make M describe n by a string of length p, then U must describe n by a string of length at most p + d, and hence there must eventually also be a stage s at which U provides a description of A_s↾n of length at most p + b + d.
Thinking of an opponent who provides U-descriptions, we notice that the existence of the constants b and d means that we have to spend more measure describing n than the opponent does describing A_s↾n. (That is, our description increases μ(dom M) more than the opponent’s increases μ(dom U).) If we are trying to claim that A is not K-trivial, then we want to force the opponent to waste more measure trying to make A be K-trivial than
we do in playing our strategy. The most basic idea is to force the opponent to give short descriptions of many different strings of the same length (i.e., versions of A_s↾n for different s but the same n). The easiest illustration of this method is to show that no K-trivial is wtt-complete.
Proposition 11.3.2. If A is K-trivial then A is wtt-incomplete.
Proof. Suppose that A is K-trivial and wtt-complete. We know that A is Δ^0_2, so we fix an approximation of A. Let b be a constant of K-triviality for A. We build a c.e. set B, and a prefix-free machine M via the KC Theorem. By the recursion theorem, we may assume we know a coding constant d of M in U, and also a wtt-reduction Γ^A = B with computable use γ (as explained in the proof of Theorem 2.20.1). Let ℓ(s) be the length of agreement of Γ^{A_s}[s] and B_s.
We pick k = 2^{b+d+1} many followers m_0 > · · · > m_{k−1} targeted for B and wait for a stage s such that ℓ(s) > m_0. At this stage we choose a fresh large n > γ(m_0) (and hence greater than γ(m_i) for all i < k), and enumerate a KC request (0, n). At some stage s_0 we must get a U-description of A_{s_0}↾n of length b + d or less. That is, at least 2^{−(b+d)} must be added to the measure of the domain of U. At the first such stage s_0, we put m_0 into B, ensuring that A↾n ≠ A_{s_0}↾n, since n > γ(m_0). (In this case we could use any of the m_i’s, but later it will be important that the m_i’s enter in reverse order of size.)
Now there must be a stage s_1 > s_0 with ℓ(s_1) > m_0 such that A_{s_1}↾n ≠ A_{s_0}↾n and A_{s_1}↾n also has a U-description of length at most b + d. Thus we now know that μ(dom U) is at least 2 · 2^{−(b+d)}. We now put m_1 into B, which leads once again to a stage s_2 > s_1 with
ℓ(s_2) > m_0 such that A_{s_2} ↾ n ≠ A_{s_i} ↾ n for i = 0, 1 and A_{s_2} ↾ n has a U-description of length at most b + d. We now know that μ(dom U) is at least 3 · 2^{−(b+d)}. If we repeat this process one time for each m_i, we eventually conclude that μ(dom U) ≥ k · 2^{−(b+d)} = 2^{b+d+1} · 2^{−(b+d)} = 2, which is impossible.
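The counting at the heart of this argument can be checked mechanically. The following is an illustrative sketch (not part of the original proof); b and d stand for the triviality and coding constants, and the function simply totals the measure the opponent is forced to spend:

```python
from fractions import Fraction

def total_measure(b: int, d: int) -> Fraction:
    """Measure the opponent must spend in Proposition 11.3.2: k = 2^(b+d+1)
    followers force k distinct versions of A_s restricted to n, each with a
    U-description of length at most b + d."""
    k = 2 ** (b + d + 1)                      # followers m_0 > ... > m_{k-1}
    per_version = Fraction(1, 2 ** (b + d))   # measure of one such description
    return k * per_version                    # = 2 > mu(dom U), a contradiction

assert total_measure(3, 2) == 2
```

For any values of b and d the total is exactly 2, which exceeds the measure 1 available to the domain of U.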
11.3.2 A second approximation: impossible constants

The argument above is fine for weak truth table reducibility, since the use does not change once defined, but there clearly are problems when Γ is a Turing reduction. That is, suppose that our new goal is to show that no K-trivial is Turing complete. The problem with the above construction is now the following. When we play our M-description of n, we use all the measure of dom M to describe a single number. Now it is in the opponent's power to move the use γ(m_0, s) (or even γ(m_{k−1}, s)) to some value bigger than n before matching our description of n with a description of A_t ↾ n for some t. Thus it costs the opponent very little to match our M-description of n: the opponent changes A ↾ γ(m_{k−1}, s), makes γ(m_{k−1}, s+1) larger than n, then
11.3. Degrees of K-trivial sets
513
describes A_{s+1} ↾ n, and we can no longer cause any changes of A below n, as all the Γ-uses we control are too big. It is at this point that we realize it is pretty dumb of us to try to describe n in one go. All that really matters is that we load lots of measure beyond some point where the opponent will be forced to match it many times. For instance, in the wtt case, we certainly could have used many n's beyond γ(m_0), giving each a description of length e for some large e, and attacking the opponent by putting markers into B only once we have amassed the requisite measure of descriptions of numbers beyond γ(m_0). This is the idea behind our second approximation to the proof of Theorem 11.3.1. In this case, we assume we are given a Turing reduction Γ^A = B and make the following impossible assumption: we assume that the constants b and d, defined as in the proof of Proposition 11.3.2, are both 0. Hence, when we enumerate a request (p, n) for M, the opponent will eventually enumerate a U-description of A_s ↾ n of length at most p. (Notice that, with this assumption, in the wtt case we would need only one follower.) With Γ a Turing reduction, we still have the problem outlined above, even with our impossible assumption. That is, if we use the dumb strategy of the wtt case, then the opponent can move Γ-uses before describing initial segments of A, thus describing each such segment only once. To get around this problem, we use a drip-feed strategy for loading measure, as discussed above. To make our terminology more precise, when we issue a request (p, n) for M, we say that we have loaded 2^{−p} much measure at n. We pick a follower m and attempt to load a large amount of measure, say 3/4, at n's greater than γ(m) (we write γ(m) instead of γ(m, s), thinking of γ(m) as a movable marker with value γ(m, s) at stage s).
Suppose we succeed, and the opponent provides corresponding short descriptions of initial segments of the current approximation to A, which amounts to adding 3/4 to the measure of the domain of U. Then we can force A ↾ γ(m) to change, and the opponent will not have another measure 3/4 worth of short descriptions for the initial segments of the new version of A. To deal with the possibility of A ↾ γ(m) changing before we are ready to force it to change ourselves, we establish a trash quota, say 1/8. We want to make sure that the total measure corresponding to descriptions we waste (that is, descriptions that do not force the opponent to issue two corresponding descriptions of initial segments of approximations to A) does not exceed this quota. Thus, we begin by loading measure in chunks of size 1/16. That is, we pick an n_0 > γ(m) and issue the request (4, n_0). The opponent now has two options. One is to change A ↾ γ(m) and move γ(m). In this case, we have wasted 1/16 much measure. Then we restart our loading procedure, but in chunks of size 1/32. The opponent's other option is to issue a description of length 4 of the current approximation to A ↾ n_0. If that happens then we pick a new large number n_1 and load another 1/16 much measure at n_1. Again we wait for the opponent's reaction.
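The waste bound implicit in this halving scheme is a geometric sum, which can be checked directly (an illustrative sketch, not from the text):

```python
from fractions import Fraction

# After m use changes we have wasted at most one chunk at each of the levels
# 1/16, 1/32, ..., i.e. sum_{i<m} 2^-(4+i) = (1 - 2^-m)/8, always strictly
# below the trash quota of 1/8.
def waste(moves: int) -> Fraction:
    return sum(Fraction(1, 2 ** (4 + i)) for i in range(moves))

assert all(waste(m) < Fraction(1, 8) for m in range(1, 50))
```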
If the opponent issues a description of length 4 of the current approximation to A ↾ n_1, then we pick a new large number n_2 and load yet another 1/16 much measure at n_2. However, if instead the opponent moves γ(m), we make the following crucial observation: the 1/16 much measure we loaded at n_1 is wasted, but the 1/16 much measure we loaded at n_0 is not, because the opponent has already issued a description of length 4 of a version of A ↾ n_0. By changing A ↾ γ(m), the opponent is now committed to issuing a second description of length 4, for the new approximation to A ↾ n_0 (since n_0 > γ(m), and hence A ↾ n_0 also changes). So we can move on to loading measure in chunks of 1/32 while having added only 1/16 to our total wasted measure. In general, every time the opponent moves γ(m) and forces us to go from loading measure in chunks of size 2^{−k} to loading it in chunks of size 2^{−(k+1)}, we add only 2^{−k} to our total wasted measure, since only the very last chunk of measure of size 2^{−k} we loaded is wasted. Thus our total wasted measure never exceeds our trash quota 1/8. If indeed Γ^A = B, then the opponent can move γ(m) only finitely often, so we eventually reach a final level at which we are loading measure in chunks of some fixed size 2^{−k}. Once the total measure we have loaded, not counting the wasted amount, reaches our goal of 3/4, we force A ↾ γ(m) to change ourselves, and win as before. We think of the above strategy as a procedure P(3/4, 1/8) designed to load 3/4 much measure beyond γ(m) and then change A ↾ γ(m), while wasting at most 1/8 much measure. We call the set of n such that we give n a short description and then force the opponent to give two corresponding descriptions of versions of A ↾ n a 2-set, and we call the sum of the measure we loaded at these n the weight of this set. In this simplified construction, building a 2-set of weight 3/4 (or any weight greater than 1/2) is all we need to do, so running the single procedure P(3/4, 1/8) suffices.
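The whole procedure P(3/4, 1/8) can be simulated against an arbitrary opponent schedule. This is a hypothetical sketch of the bookkeeping only (the schedule and numbers are made up; in the real construction the opponent is the universal machine):

```python
from fractions import Fraction

def drip_feed(goal=Fraction(3, 4), move_after=(3, 0, 5)):
    """Simulate P(3/4, 1/8): load measure in chunks, halving the chunk size
    at each use change.  move_after[i] is how many chunks the opponent matches
    with descriptions before the (i+1)-th move of gamma(m); after the schedule
    is exhausted the use has stabilized."""
    chunk = Fraction(1, 16)
    loaded = wasted = Fraction(0)
    for matched in move_after:
        loaded += matched * chunk   # matched chunks stay useful: the opponent
                                    # must also describe the new version of A
        wasted += chunk             # at most the last chunk at this level is lost
        chunk /= 2
    while loaded < goal:            # final level: load until the goal is met
        loaded += chunk
    return loaded, wasted

loaded, wasted = drip_feed()
assert loaded >= Fraction(3, 4) and wasted <= Fraction(1, 8)
```

Whatever finite schedule the opponent plays, the loaded measure reaches the goal and the waste stays under the quota.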
In the next subsection, we introduce the general concept of a k-set, and use it to prove that K-trivials cannot be Turing complete.
11.3.3 The less impossible case

We now remove the simplifying assumption of the previous section. A k-set is a set of numbers n such that we give n a short description and then force the opponent to give at least k many corresponding descriptions of versions of A ↾ n. As before, the weight of such a set is the sum of the measure we load at these n. In the previous section, by a "corresponding description", we meant one of the same length as the one we gave. But without the simplifying assumption of that section, when we give n a description of length m, all we know is that the opponent will eventually have to give A ↾ n a description of length at most m + b + d, where b and d are as in the proof of Proposition 11.3.2. So it is no longer enough to force the opponent to provide two descriptions to each one of ours. We now need the opponent to provide at least 2^{b+d+1} many descriptions to each one of ours. In other
words, where in the previous section we built a 2-set of weight greater than 1/2, we now need to build a 2^{b+d+1}-set of weight greater than 1/2. To do so, we take the key idea from the wtt case, in which the use is fixed but the coding constants are nontrivial, namely, that we can use multiple followers to build a k-set for any k we want. To simplify our discussion for now, suppose that b + d = 1. Emulating the wtt case, we work with k = 2^{1+1} = 4 and try to build a 4-set of sufficient weight. We break this task into the construction of a 2-set, which is turned into a 3-set, which is finally turned into the desired 4-set. We view these constructions as procedures P_j for 2 ≤ j ≤ 4, which are called in reverse order as follows. Our overall construction begins with, say, P_4(3/4, 1/8), whose task is to build a 4-set C_4 of weight 3/4 while wasting at most 1/8 much measure. To do this, P_4(3/4, 1/8) calls procedures of the form P_3(g, q), which build a 3-set C_3 by calling procedures of the form P_2(g′, q′). The latter build a 2-set C_2 and enumerate a KC set L. Each procedure P_j has rational parameters g, q ∈ [0, 1]. The goal g is the weight the procedure wants C_j to reach, and the trash quota q is how much measure it is allowed to waste. The main idea is that procedures P_j will ask that procedures P_i for i < j do the work for them, with P_2 eventually really doing the work. The goals of the P_i are determined inductively by the trash quotas of the P_j for j > i, in such a way as to ensure that if the procedures are canceled before completing their tasks, then the amount of measure wasted is acceptably small. We begin the construction by starting P_4(3/4, 1/8). Its action is first to choose a follower m_4 and wait until Γ^A(m_4)↓. Then P_4 calls P_3(2^{−4}, 2^{−5}). The idea here is that, instead of directly loading measure in chunks of size 1/16 beyond γ(m_4) as in the previous section, P_4 delegates that task to P_3.
(The precise numbers being used here are immaterial, of course, except that we need to keep the total wasted measure below P_4's trash quota of 1/8.) While we are waiting for P_3 to return, if γ(m_4) moves, then we restart, this time calling P_3(2^{−4}, 2^{−6}). Otherwise, the following occurs. The procedure P_3(2^{−4}, 2^{−5}) picks a follower m_3 > m_4, waits until Γ^A(m_3)↓, and then calls P_2(2^{−5}, 2^{−6}). That procedure picks its own follower m_2 > m_3 and waits until Γ^A(m_2)↓. At this point, P_2 begins to try to load 2^{−5} much measure beyond γ(m_2), in chunks of size 2^{−7}, just as in the previous section. (That is, P_2 uses L to give descriptions of length 7 to n's bigger than γ(m_2).) While all of this action is occurring, many things can happen. The simplest case is that nothing happens to the uses, and hence, as with the wtt case, P_2 successfully loads its goal amount 2^{−5} beyond γ(m_2, s). Should this happen then P_2 enumerates m_2 into B, forcing A ↾ γ(m_2, s) to change and hence creating a 2-set C_2 of weight 2^{−5}. Then P_2 returns.
Now P_3 sees that it has 2^{−5} much measure loaded beyond γ(m_3), but it would like another chunk to be loaded there, so it again calls P_2(2^{−5}, 2^{−6}) (which now chooses a new follower m_2). If P_2 again returns successfully, then we have a 2-set of weight 2^{−4} consisting of numbers greater than γ(m_3) (which we are assuming has not moved). Now P_3 enumerates m_3 into B, causing this 2-set to become a 3-set of weight 2^{−4}, and returns. Now, of course, P_4 needs to call P_3 again, which will cause further calls to P_2. One can think of this construction as "wheels within wheels within wheels", spinning ever faster. Of course, our difficulties all come about because uses can change, but the construction in the previous section gave us a technique to deal with that issue. For example, if only γ(m_2) moves then, as before, only the last amount of measure loaded by P_2 is wasted. So, again as before, P_2 starts loading measure in chunks that are half the size of the ones it had been previously using. Since we are assuming that Γ^A = B, and hence all uses stabilize, this problem will eventually stop occurring for any particular run of P_2. In general, the inductive procedures work similarly. For example, if γ(m_3) moves during a run of P_3, then P_3 cancels the current run of P_2(g, q) and calls P_2(g, q/2) instead. Any measure that this run of P_2(g, q) had loaded is wasted, but what previous calls to P_2 had accomplished is not. In the end, we can argue by induction that all tasks are completed. We now turn to the formal details of the construction.

Proof of Theorem 11.3.1. Suppose that A is K-trivial and Turing complete. We know that A is Δ⁰₂, so we fix an approximation of A. Let b be a constant of K-triviality for A. We build a c.e. set B, and a prefix-free machine M via the KC Theorem. By the recursion theorem, we may assume we know a coding constant d of M in U, and also a reduction Γ^A = B.
As above, we will think of the use γ^A(n) as a marker whose value might change during the construction. Let k = 2^{b+d+1}. We assume we have sped up our enumeration of stages enough so that K_s(A_s ↾ n) ≤ K_s(n) + b for all s and all n mentioned in the construction by stage s. As in the wtt case, our construction will build a k-set C_k of weight greater than 1/2 to reach a contradiction. The procedure P_j for 2 ≤ j ≤ k enumerates a j-set C_j. As explained above, the construction begins by calling P_k, which calls P_{k−1} several times, and so on down to P_2, which enumerates L (and C_2). Each call of a procedure P_j has parameters g = 2^{−x} and q = 2^{−y} with q < g, again as explained above. We now describe the procedure P_j(g, q).

1. Choose m_j large.

2. Wait until Γ^A(m_j)↓ = 0.
3. Let v be the number of times this procedure has gone through step 2.

j = 2: Pick a large number n. Put (r, n) into L, where 2^{−r} = 2^{−v}q. Wait for a stage t such that K_t(n) ≤ r + d, and put n into C_1.

j > 2: Let w be the number of P_{j−1} procedures called so far. Call P_{j−1}(2^{−v}q, 2^{j−k−w−1}q).

4. If the weight of C_{j−1} is less than g then repeat step 3.

5. Put m_j into B. This action forces A to change below γ^A(m_j) < min(C_{j−1}), and hence makes C_{j−1} into a j-set (if we assume inductively that C_{j−1} is a (j − 1)-set). So put C_{j−1} into C_j, declare C_{j−1} = ∅, and return.

If γ^A(m_j) changes during the execution of the loop at step 3, then cancel the runs of all subprocedures, and go to step 2. Despite the cancellations, C_{j−1} is now a j-set because of this very change. (This is an important point, as it ensures that the measure associated with numbers already in C_{j−1} is not wasted.) So put C_{j−1} into C_j and declare C_{j−1} = ∅. This completes the description of the procedures. The construction consists of calling P_k(3/4, 1/8), say. It is straightforward to argue that, since trash quotas are inductively halved each time they are injured by a use change, their sum is bounded by 1/4, and that therefore L is a KC set. Furthermore, C_k is a k-set of weight 3/4, which is a contradiction, since then the measure of the domain of U exceeds 1, as in the proof of Proposition 11.3.2.

The following description is taken from [118]. (We have changed some variable names in this quote to match the ones we use.) "We can visualize this construction by thinking of a machine similar to Lerman's pinball machine (see [366, Chapter VIII.5]). However, since we enumerate rational quantities instead of single objects, we replace the balls in Lerman's machine by amounts of a precious liquid, say 1955 Biondi-Santi Brunello. Our machine consists of decanters C_k, C_{k−1}, . . . , C_0. At any stage C_j is a j-set. We put C_{j−1} above C_j so that C_{j−1} can be emptied into C_j. The height of a decanter is changeable.
The procedure P_j(g, q) wants to add weight g to C_j, by filling C_{j−1} up to g and then emptying it into C_j. The emptying corresponds to adding one more A-change. "The emptying device is a hook (the γ^A(m)-marker), which besides being used on purpose may go off finitely often by itself. When C_{j−1} is emptied into C_j then C_{j−2}, . . . , C_0 are spilled on the floor, since the new hooks emptying C_{j−1}, . . . , C_0 may be much longer (the γ^A(m)-marker may move to a much bigger position), and so we cannot use them any more to empty those decanters in their old positions. "We first pour wine into the highest decanter C_0, representing the left domain of L, in portions corresponding to the weight of requests entering L. We want to ensure that at least half the wine we put into C_0 reaches C_k.
Recall that the parameter q is the amount of garbage P_j(g, q) allows. If v is the number of times the emptying device has gone off by itself, then P_j lets P_{j−1} fill C_{j−1} in portions of size 2^{−v}q. Then when C_{j−1} is emptied into C_j, at most 2^{−v}q much liquid can be lost because of being in higher decanters C_{j−2}, . . . , C_0. The procedure P_2(g, q) is special but limits the garbage in the same way: it puts requests (r_n, n) into L where 2^{−r_n} = 2^{−v}q. Once it sees the corresponding A ↾ n description, it empties C_0 into C_1 (but C_0 may be spilled on the floor before that because of a lower decanter being emptied)."
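In the idealized case described in Section 11.3.3 (b + d = 1, so k = 4, and no use changes), the nesting of the procedures can be sketched as follows. This is a toy model (followers and exact weight bookkeeping abstracted away), not the construction itself:

```python
from fractions import Fraction
from itertools import count

fresh = count()  # source of fresh numbers n (stands in for "fresh large")

def P(j, goal, descriptions):
    """Toy model of P_j: assemble a (j-1)-set from two successful subcalls,
    then change A below gamma(m_j), forcing one more U-description of every
    version of A restricted to n for n in the set, which promotes it to a
    j-set.  descriptions[n] counts descriptions the opponent has issued."""
    if j == 2:
        members = []
        for _ in range(2):              # load two requests directly
            n = next(fresh)
            descriptions[n] = 1         # the opponent matches each request
            members.append(n)
    else:
        members = []
        for _ in range(2):              # two successful runs of P_{j-1}
            members += P(j - 1, goal / 2, descriptions)
    for n in members:                   # enumerate m_j into B: A changes
        descriptions[n] += 1            # below min(members)
    return members

descriptions = {}
four_set = P(4, Fraction(3, 4), descriptions)
assert all(descriptions[n] == 4 for n in four_set)  # a genuine 4-set
```

Each number in the final set has been described four times by the opponent, exactly the promotion pattern 2-set → 3-set → 4-set described above.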
11.4 K-triviality and lowness

At first glance, it may seem unclear whether the K-trivial sets are merely an artifact of the definition of prefix-free complexity. (Recall from Theorem 3.4.4 that all C-trivial sets are computable.) However, work of several researchers has shown that the class of K-trivial sets is remarkably robust, in that it has a host of natural characterizations via other notions of randomness-theoretic weakness. This phenomenon is similar to what we saw for 1-randomness, where various approaches yield the same notion, but in this case the equivalence proofs are often rather involved. These alternate definitions also have significant degree-theoretic implications. For example, all K-trivial sets are superlow, and the class of K-trivial sets is closed downward under Turing reducibility, and in fact is an ideal (i.e., is also closed under join). In this section, we prove Nies' result from [302] that every K-trivial set is low for 1-randomness, and hence low for K. We have already seen that the latter two notions are equivalent, and lowness for K clearly implies K-triviality, so all three notions coincide. We begin with a proof that all K-trivial sets are superlow, and then describe how to modify this proof to obtain lowness for 1-randomness. The method we introduce here, called the golden run method, has proved to be one of the major tools in the study of K-triviality and related notions.

Theorem 11.4.1 (Nies [302]). Every K-trivial set is superlow.

Proof. We build on the terminology and notation of the previous section, and introduce an improved version of the decanter method. Let A be K-trivial. In approximating A′, when we see a convergent computation Φ_e^{A_s}(e), we need to decide whether to believe this computation. If we make only finitely many mistakes for each e, then we ensure that A is low, while if we can keep the number of mistakes below a computable bound, then we ensure that A is superlow.
There is no claim to uniformity in the theorem, so we can have infinitely many guessing procedures, as long as one of them is successful. As before, we exploit the fact that changes to the approximation of a K-trivial set come at a certain cost, in terms of additional short descriptions
of initial segments. By loading measure beyond the use of a convergent computation Φ_e^{A_s}(e), we ensure that any violation of this use will come at a relatively large cost. Only once we have loaded sufficient measure will we believe this computation. This procedure ensures that the number of times we believe an incorrect computation of this form for a given e is computably bounded. (If the cost to the opponent of violating the use of a computation is made to be 2^{−k}, say, then we know we will believe no more than 2^k many incorrect computations.) Thus we can approximate A′ in an ω-c.e. manner, whence A is superlow. In the proof of Theorem 11.3.1, we built a c.e. set B and assumed we had a reduction Γ^A = B. Whenever we wanted to change A at a stage s to transform a (j − 1)-set C_{j−1} into a j-set, we could do so simply by enumerating some number m_j into B, since we arranged things so that an A-change below γ^A(m_j)[s] was sufficient to make C_{j−1} into a j-set, and such a change was guaranteed to happen once m_j entered B. What caused us problems was that A could change before we wanted it to, but we had no trouble ensuring that A would change once we wanted it to. When showing that A is superlow, rather than merely incomplete, we no longer have quite so much power. If the opponent is to prevent A from being superlow, it has to change A reasonably often, but we cannot force such changes to occur every time we would like them to. We get around this problem by further distributing our measure-loading strategy among subprocedures. In the proof of Theorem 11.3.1, a procedure P_j called a P_{j−1} procedure several times, thus obtaining several (j − 1)-sets. When it had enough such sets, it then promoted them to j-sets by forcing appropriate changes in A. The analogous procedure in this proof will run several subprocedures simultaneously. Each subprocedure will be associated with some index e.
Each (j − 1)-set C produced by such a subprocedure will have its corresponding measure loaded beyond some number n, so that a change of the approximation to A ↾ n to a hitherto unseen string will cause C to become a j-set. Our level j procedure will assume that such changes do not occur and, under that assumption, will guess at values of A′. That is, when it has an m such that a change in A ↾ m would allow it to transform a (j − 1)-set C of sufficient weight given to it by a subprocedure associated with e into a j-set, and it sees that Φ_e^{A_s}(e)↓ with use below m, then it will believe that computation. If it is wrong, then C will be promoted. If it is wrong sufficiently often for subprocedures associated with some e, then it will build up a j-set with weight equal to its goal, and hence will return. If not, then it will be able to bound the number of wrong guesses it makes about the value of each A′(e), and hence ensure that A is superlow. We envision the construction as happening on an infinitely branching tree of finite depth k (where k is as in the proof of Theorem 11.3.1), with each procedure on the tree having infinitely many subprocedures below it. As in the proof of Theorem 11.3.1, the top procedure will attempt to produce a k-set of inconsistently large measure, and hence will never return. Thus, at
some point in the construction, we will start a run of some procedure that neither returns nor is canceled, but such that all the subprocedures it ever calls either return or are canceled. As outlined above, this run will succeed in guessing A′ with sufficiently few mistakes to ensure that A is superlow. We call such a run a golden run. It will be convenient to have two kinds of procedures. The procedure P_j calls subprocedures Q_{j−1,e,u}, corresponding to potential uses u of convergent computations Φ_e^{A_s}(e), which in turn call procedures P_{j−1} (except for the procedures Q_{1,e,u}, which are the ones that actually perform the measure loading by enumerating requests into a KC set L). Each call to a procedure comes with two parameters, as in the proof of Theorem 11.3.1, a goal g and a trash quota q. As in that proof, a run of a procedure may be canceled due to a premature A-change, in which case it will generate a certain amount of "wasted measure". Again as in that proof, the trash quota will bound that amount. We choose our trash quotas to ensure that the total amount of wasted measure produced over the entire construction is small, and choose the goals of calls to subprocedures made by a particular run of a procedure to be small enough that the measure wasted by their potential cancelations adds up to no more than the trash quota of that run. We now proceed with the formal details of the description of the procedures and the construction. The approximation to A′ showing that A is superlow will not happen during the construction, but will be built after the fact, in the verification. Let b be a constant of K-triviality for A. We assume we have sped up our stages sufficiently so that K_s(A_s ↾ n) ≤ K_s(n) + b for all s. Let d be a coding constant, given by the recursion theorem, for the machine corresponding to the KC set L we build. Let k = 2^{b+d+1}.

Procedure P_i(g, q), where 1 < i ≤ k and g = 2^{−l}, q = 2^{−r} for some l and r. This procedure enumerates a set C, initially empty.
At stage s, declare s to be available. (Availability is a local notion, applicable only to this particular run of this procedure.) For each e ≤ s, proceed as follows.

1. If e is available and Φ_e^A(e)[s]↓, then let u be the use of this computation and call Q_{i−1,e,u}(2^{−(e+1)}g, min(2^{−(e+1)}g, 2^{−(2i+n)})), where n is the number of runs of Q_{i−1}-procedures that have been started by this point. Declare e to be unavailable.

2. If e is unavailable due to a run of Q_{i−1,e,u}(g′, q′) and A_s ↾ u ≠ A_{s−1} ↾ u, then declare e to be available, say that this run of Q_{i−1,e,u}(g′, q′) is released, and proceed as follows, where D is the set being built by the run of Q_{i−1,e,u}(g′, q′).

(a) If the weight of C ∪ D (defined as in the proof of Theorem 11.3.1) is less than g, then put D into C and proceed to step 2(b). Otherwise, let D̂ ⊆ D be such that the weight of C ∪ D̂ is g and put D̂ into C. (It will be easy to see that D̂ exists
from the way D is defined below, because g is of the form 2^{−l} for some l smaller than all p such that we put (p, n) into L for n ∈ D.) Cancel all runs of subprocedures, and end this run of P_i, returning C.

(b) If the run of Q_{i−1,e,u}(g′, q′) has not yet returned, then cancel it and all runs of subprocedures it has called.

Procedure Q_{j,e,u}(g, q), where 1 < j < k and g = 2^{−l}, q = 2^{−r} for some l and r. This procedure enumerates a set D, initially empty. Call P_j(q, 2^{−(2j+3+n)}), where n is the number of runs of P_j-procedures started so far. If this procedure returns a set C, then put C into D. If the weight of D is less than g then repeat this process, calling a new run of P_j. Otherwise, end the procedure, returning D. (It is easy to see that in this case the weight of D is equal to g.)

Procedure Q_{1,e,u}(g, q), where g = 2^{−l}, q = 2^{−r} for some l and r. Choose a fresh large number n and put (r, n) into L. Wait for a stage t > n such that K_t(n) ≤ r + d and then put n into D. If the weight of D is less than g then repeat this process. Otherwise, end the procedure, returning D. (Again, in this case the weight of D is equal to g.)

The construction begins by calling P_k(1, 2^{−3}). At each stage we perform one more cycle of each procedure that is currently running, descending through the levels and working in some effective order within each level. We now verify that this construction works as intended. Let D_j be the union of the sets enumerated by Q_j-procedures. For j > 1, let C_j be the union of the sets enumerated by P_j-procedures, and let C_1 = {n : ∃r ((r, n) ∈ L)}.

Lemma 11.4.2. Each C_i is an i-set.

Proof. For each n ∈ D_1, let s_n be the stage at which n enters D_1, and let r_n be the unique number such that (r_n, n) ∈ L. By construction, we have K_{s_n}(A_{s_n} ↾ n) ≤ r_n + b + d. Let i ∈ [2, k] and assume by induction that for every n ∈ D_{i−1}, if n enters D_{i−1} at stage t then there are at least i − 1 many distinct strings σ of the form A_v ↾ n for v ∈ [s_n, t] such that K(σ) ≤ r_n + b + d.
We show that C_i also has this property (with i in place of i − 1), which clearly implies that it is an i-set, and that D_i also has the property if i < k, so that the induction may continue. Suppose that n enters C_i at stage t. This enumeration happens at step 2(a) in the action of a run of some P_i-procedure for some e, and corresponds to the release of some run of a subprocedure Q_{i−1,e,u}(g′, q′). By the inductive hypothesis, there are i − 1 distinct strings σ of the form A_v ↾ n for v ∈ [s_n, t) such that K_v(σ) ≤ r_n + b + d. Furthermore, the approximation to A ↾ u cannot have changed between stages s_n and t − 1, since such a change would have caused the run of Q_{i−1,e,u}(g′, q′) to have been released sooner. Since A_t ↾ u ≠ A_{t−1} ↾ u, there is a new string τ = A_t ↾ n, not equal to any of these σ, such that K_t(τ) ≤ r_n + b + d. Since n is an arbitrary
element of C_i, we see that C_i has the desired property, and in particular is an i-set. The next lemma shows that trash quotas are respected.

Lemma 11.4.3. (i) For j ∈ [1, k), the weight of the set of numbers in C_j \ D_j corresponding to a run of Q_{j,e,u}(g, q) is at most q. (ii) For i ∈ (1, k], the weight of the set of numbers in D_{i−1} \ C_i corresponding to a run of P_i(g, q) is at most q.

Proof. (i) There is at most one n corresponding to a run of Q_{1,e,u}(g, q) in C_1 \ D_1, since no number is put into C_1 by this run until the previous number it put into C_1 enters D_1. For such an n, the unique (r, n) ∈ L is such that 2^{−r} = q. Thus the weight of n is q. If j > 1, then all numbers in C_j \ D_j corresponding to a run of Q_{j,e,u}(g, q) correspond to a single run of P_j(q, q′), because each time such a run returns, its numbers enter D_j. Thus this run never returns, and hence the weight of the set of numbers corresponding to it is at most q.

(ii) Let n be one of the numbers corresponding to a run of P_i(g, q) that is in D_{i−1} by a stage t but never enters C_i. Then n is put in D_{i−1} by a run of a subprocedure Q_{i−1,e,u}(2^{−(e+1)}q, q′) called by this run of P_i(g, q). We claim that this run of P_i(g, q) never calls a subprocedure Q_{i−1,e,u} after stage t. To verify this claim, first suppose that A_s ↾ u ≠ A_{s−1} ↾ u for some s > t. Since n does not enter C_i at stage s, it must be the case that the run of P_i(g, q) returns at that stage, and n is not needed for that run to reach its goal. Now suppose that there is no such s. Then the run of Q_{i−1,e,u} that put n into D_{i−1} is never released, and e is never available after stage t. In either case, the claim holds. Thus, for each e there is at most one run of a subprocedure of the form Q_{i−1,e,u}(2^{−(e+1)}q, q′) called by our run of P_i(g, q) that leaves numbers in D_{i−1} \ C_i. Thus the sum of the weights of these numbers over all such e is at most Σ_e 2^{−(e+1)}q = q.

Lemma 11.4.4. L is a KC set.

Proof.
Since C_k is a k-set and k = 2^{b+d+1}, each n ∈ C_k with (r, n) ∈ L forces k many U-descriptions of length at most r + b + d, contributing k · 2^{−(b+d)} · 2^{−r} = 2 · 2^{−r} to μ(dom U) ≤ 1, so the weight of C_k is at most 1/2. By Lemma 11.4.3, the weight of the numbers that never reach C_k is bounded by the sum of the trash quotas, which by our choice of quotas is less than 1/2. Thus the total weight of the requests in L is at most 1, and L is a KC set.

The run of P_k(1, 2^{−3}) is never canceled, and it can never return, since then C_k would have weight 1, contradicting the bound of 1/2 above. Since the tree of procedure calls has finite depth, there must be a run of some procedure that neither returns nor is canceled, but such that all the runs of subprocedures it calls either return or are canceled. In other words, there is a golden run of some procedure. This procedure cannot be a Q-procedure, since each run of a Q-procedure calls P-subprocedures with the same goal, and never cancels any of them itself, so that if it is never canceled, then it eventually fulfills its goal and returns. Thus the golden run is of some procedure P_i. We now show how to guess at A′ in an ω-c.e. way. Fix e. We begin by guessing that e ∉ A′. Every time the golden run sees that Φ_e^A(e)[s]↓, it starts a run of a subprocedure Q_{i−1,e,u}, where u is the use of this computation. This run returns unless this use is violated before it can do so. If the subprocedure returns, we guess that e ∈ A′. If the use u is later violated, then we go back to guessing that e ∉ A′. Every time a run of a subprocedure Q_{i−1,e,u} returns, it adds 2^{−(e+1)}g to the weight of the set C associated with the golden run. Hence we cannot guess wrongly at whether e is in A′ more than 2^{e+1} many times, lest the golden run reach its goal and return. Thus A is superlow.

Corollary 11.4.5 (Nies [302]). Every K-trivial set is ω-c.e.

The above proof also shows that the K-trivial sets have the following property, which will be relevant when we discuss strong jump traceability in Chapter 14. We write J^A for the jump of A thought of as a partial function. That is, J^A(e) = Φ_e^A(e).

Definition 11.4.6 (Nies [303]). A set A is h-jump traceable for a computable order h if there are uniformly c.e. sets T_0, T_1, . . . such that |T_n| ≤ h(n) and J^A(n)↓ ⇒ J^A(n) ∈ T_n for all n. A set A is jump traceable if it is h-jump traceable for some computable order h.

It is easy to see that there is a computable function f such that J^A(f(e, n)) = Φ_e^A(n) for all e and n, so A is jump traceable iff every A-partial computable function has a trace with some computable bound. Nies [303] showed the following.

Theorem 11.4.7 (Nies [303]). A c.e. set is jump traceable iff it is superlow.
On the other hand, Nies [303] also showed that there is a perfect Π⁰₁ class of jump traceable sets (see Theorem 14.5.3 below), and that even within the ω-c.e. sets, jump traceability and superlowness are incomparable notions. For more on jump traceability, see Chapter 14, Nies [303], and Figueira, Nies, and Stephan [147]. It is easy to see that the proof of Theorem 11.4.1 also yields the following result, because instead of guessing at whether e is in A′ with at most 2^{e+1} many mistakes for each e, we could just as easily trace the value of J^A(e) for each e with at most 2^{e+1} many mistakes.
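The two conditions in Definition 11.4.6 are easy to state over finite data. The following checker is purely illustrative (the traces and jump values are hypothetical finite stand-ins for the c.e. sets T_n and the partial function J^A):

```python
def is_h_jump_trace(traces, h, jump_values):
    """Check |T_n| <= h(n), and J^A(n) in T_n whenever J^A(n) converges
    (None stands for a divergent computation)."""
    return all(len(traces[n]) <= h(n) and (v is None or v in traces[n])
               for n, v in enumerate(jump_values))

h = lambda n: n + 1                 # a computable order (hypothetical choice)
traces = [{3}, {0, 5}, {1, 2, 7}]   # finite fragments of candidate traces
assert is_h_jump_trace(traces, h, [3, None, 7])
assert not is_h_jump_trace(traces, h, [3, 4, 7])
```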
11. Randomness-Theoretic Weakness
Theorem 11.4.8 (Nies [302]). Every K-trivial set is jump traceable.¹

The proof of Theorem 11.4.1 can be modified to prove the following.

Theorem 11.4.9 (Nies and Hirschfeldt, see Nies [302]). If a set is K-trivial then it is low for K.

Proof. The proof is essentially the same as that of Theorem 11.4.1, except that the "triggering condition" for having a P-strategy launch a Q-substrategy changes. Thus we limit ourselves to giving the description of the procedures (which will be quite similar to the ones in the proof of Theorem 11.4.1) and the necessary changes to the verification.

Procedure P_i(g, q), where 1 < i ≤ k and g = 2^{−l} and q = 2^{−r} for some l and r. This procedure enumerates a set C, initially empty. At stage s, declare each σ ∈ 2^s to be available. For each σ ∈ 2^s, proceed as follows.

1. If σ is available and U^A(σ)[s]↓ = τ for some τ, then let u be the use of this computation and call Q_{i−1,σ,τ,u}(2^{−|σ|}g, min(2^{−|σ|}g, 2^{−(2i+n)})), where n is the number of runs of Q_{i−1}-procedures that have been started by this point. Declare σ to be unavailable.

2. If σ is unavailable due to a run of Q_{i−1,σ,τ,u}(g′, q′) and A_s ↾ u ≠ A_{s−1} ↾ u, then declare σ to be available, say that this run of Q_{i−1,σ,τ,u}(g′, q′) is released, and proceed as follows, where D is the set being built by the run of Q_{i−1,σ,τ,u}(g′, q′).

(a) If the weight of C ∪ D is less than g, then put D into C and proceed to step 2(b). Otherwise, let D̃ ⊆ D be such that the weight of C ∪ D̃ is g and put D̃ into C. Cancel all runs of subprocedures, and end this run of P_i, returning C.

(b) If the run of Q_{i−1,σ,τ,u}(g′, q′) has not yet returned, then cancel it and all runs of subprocedures it has called.

Procedure Q_{j,σ,τ,u}(g, q), where 1 < j < k and g = 2^{−l} and q = 2^{−r} for some l and r. This procedure enumerates a set D, initially empty. Call P_j(q, 2^{−(2j+3+n)}), where n is the number of runs of P_j-procedures started so far. If this procedure returns a set C, then put C into D.
If the weight of D is less than g, then repeat this process, calling a new run of P_j. Otherwise, end the procedure, returning D.

Procedure Q_{1,σ,τ,u}(g, q), where g = 2^{−l} and q = 2^{−r} for some l and r. Choose a fresh large number n and put (r, n) into L. Wait for a stage t > n such that K_t(n) ≤ r + d, and then put n into D. If the weight of D is less than g, then repeat this process. Otherwise, end the procedure, returning D.

¹ Zambella had earlier observed that if a set is low for 1-randomness then it is jump traceable. See Kučera and Terwijn [223].
11.4. K-triviality and lowness
The construction runs as before. The verification that there is a golden run is also essentially as before, except for the last paragraph of the proof of Lemma 11.4.3. In this case, the same argument as in that proof shows that for each σ, there is at most one run of a subprocedure of the form Q_{i−1,σ,τ,u}(2^{−|σ|}q, q′) called by our run of P_i(g, q) that leaves numbers in D_{i−1} \ C_i. Furthermore, either none of these subprocedures are ever released, in which case all of the corresponding σ's are in dom U^A, or the run of P_i(g, q) terminates at a stage s such that none of these subprocedures terminate before stage s, in which case all of the corresponding σ's are in dom U^{A_{s−1}}. Thus the sum of the weights of these numbers over all such σ is bounded by either q·Ω^A or q·Ω^{A_{s−1}}, and in either case is less than q.

Thus, all that is left to do is to take a golden run of some P_i(g, q) and use it to show that A is low for K. To do so, we build a KC set W. Let c = log(g/q). Whenever a run of Q_{i−1,σ,τ,u}(2^{−|σ|}q, q′) called by the golden run returns, put (|σ| + c + 1, τ) into W.

We first show that W is a KC set. Suppose that (|σ| + c + 1, τ) enters W at stage s because a run of Q_{i−1,σ,τ,u}(2^{−|σ|}q, q′) returns, and A_t ↾ u = A_s ↾ u for all t > s. Then σ ∈ dom U^A, so the weight contributed by all such requests is bounded by 2^{−(c+1)}Ω^A < 1/2. Now suppose that (|σ| + c + 1, τ) enters W at stage s because a run of Q_{i−1,σ,τ,u}(2^{−|σ|}q, q′) returns, and there is a t > s such that A_t ↾ u ≠ A_s ↾ u. Then the set D returned by this run has weight 2^{−|σ|}q and enters the set C built by the golden run. Thus the weight contributed by all such requests is bounded by 2^{−(c+1)}/q times the weight of C. Since 2^{−(c+1)}/q = 1/(2g) and the weight of C is bounded by g, the weight contributed by all such requests is bounded by 1/2.

Now fix τ and let s be the least stage such that for some σ of length K^A(τ), we have U^A(σ)[s]↓ = τ with use u and A_t ↾ u = A_s ↾ u for all t > s.
It follows easily from the minimality of s that σ must be available at s. Thus a run of Q_{i−1,σ,τ,u} is launched. This run is never canceled, and thus must return, by the definition of golden run. So (|σ| + c + 1, τ) ∈ W, and hence K(τ) ≤ K^A(τ) + O(1), where the constant does not depend on τ. Since τ is arbitrary, A is low for K.

The following corollary is immediate.

Corollary 11.4.10 (Nies [302]). The class of K-trivial sets is closed downward under Turing reducibility.

Theorems 11.2.4 and 11.4.9, together with the fact that lowness for K implies K-triviality, give us the following result, which speaks to the naturality of K-triviality as a notion of randomness-theoretic weakness.

Theorem 11.4.11 (Nies [302]). The following are equivalent.

(i) A is K-trivial.

(ii) A is low for 1-randomness.
(iii) A is low for K.

Thus we see that a set has trivial information content iff it has no derandomization power iff it has no compression power.
11.5 Cost functions

The idea of a cost function probably first explicitly appeared in Downey, Hirschfeldt, Nies, and Stephan [117], but is certainly implicit in earlier work such as Kučera and Terwijn [223], and arguably even in the first works on priority arguments. The following definition encompasses many important examples.

Definition 11.5.1 (Nies [306], Greenberg and Nies [172]).

(i) A cost function is a computable function c : ℕ × ℕ → ℚ^{≥0}. It satisfies the limit condition if lim_n sup_s c(n, s) = 0. It is monotone if for each n, the sequence c(n, 0), c(n, 1), ... is nondecreasing and tends to a limit (which we denote by c(n)), and for each s, the sequence c(0, s), c(1, s), ... is nonincreasing. (Note that in this case, the limit condition can be expressed as lim_n c(n) = 0.) We think of c as determining the cost c(n, s) of changing the approximation to A(n) at stage s for some Δ^0_2 set A being constructed.

(ii) Let A_0, A_1, ... be a computable approximation to a Δ^0_2 set A, and let n_s be the least n such that A_{s+1}(n) ≠ A_s(n). This approximation obeys the cost function c if Σ_s c(n_s, s) < ∞. A Δ^0_2 set A obeys c if it has some computable approximation that obeys c. When A is c.e., we assume this approximation A_0, A_1, ... is an approximation in the c.e. sense (i.e., A_0 ⊆ A_1 ⊆ · · ·).

We have already seen two examples of cost functions; one is the standard cost function for K-triviality, c(n, s) = Σ_{n<i≤s} 2^{−K_s(i)}.
One outgrowth of Nies' golden run method was the realization that not only can cost functions be used to build K-trivial sets, but they can in fact be used to characterize K-triviality.

Lemma 11.5.2 (Nies [302]). Let A be K-trivial. There is a computable sequence of stages v(0) < v(1) < · · · and a number s_0 such that, letting n_s be the least n such that A_{v(s+1)}(n) ≠ A_{v(s+2)}(n), and c(n, s) = Σ_{n<i≤v(s)} 2^{−K_{v(s+1)}(i)}, we have Σ_{s≥s_0} c(n_s, s) < 1.

Theorem 11.5.3 (Nies [302]). A set is K-trivial iff it obeys the standard cost function.

For the nontrivial direction, let v(0) < v(1) < · · · and s_0 be as in Lemma 11.5.2, let Ã_t = A_{v(s_0+t+1)}, and let m_t be the least number (if any) on which Ã_{t+1} and Ã_t differ, so that m_t = n_{s_0+t}. Writing c for the standard cost function in the first two sums below, and for the cost function of Lemma 11.5.2 in the last one, we have

Σ_t c(m_t, t) = Σ_t Σ_{m_t<i≤t} 2^{−K_t(i)} ≤ Σ_t Σ_{m_t<i≤t} 2^{−K_{v(s_0+t+1)}(i)} ≤ Σ_{s≥s_0} c(n_s, s) < 1.

Thus Ã_0, Ã_1, ... is an approximation witnessing the fact that A obeys the standard cost function. One consequence of this result is that K-triviality is basically a computably enumerable phenomenon.
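The obedience bookkeeping of Definition 11.5.1 can be sketched concretely (a toy illustration, not part of the original text: the cost function used here is a hypothetical computable stand-in, not the noncomputable standard cost function):

```python
from fractions import Fraction

# Toy sketch of Definition 11.5.1(ii): charge cost c(n_s, s) whenever the
# approximation changes, where n_s is the least changed position. Here c
# is a stand-in monotone cost function c(n, s) = 2^{-n} for s >= n.

def c(n, s):
    return Fraction(1, 2**n) if s >= n else Fraction(0)

def total_cost(approximations):
    """approximations: list of finite 0/1 lists A_0, A_1, ...; returns
    the total charged cost sum_s c(n_s, s)."""
    total = Fraction(0)
    for s in range(len(approximations) - 1):
        old, new = approximations[s], approximations[s + 1]
        changed = [n for n in range(len(new)) if old[n] != new[n]]
        if changed:
            total += c(min(changed), s)   # charge the least changed position
    return total

# One change at position 2 (stage 2) and one at position 1 (stage 3):
A = [[0,0,0,0], [0,0,0,0], [0,0,0,0], [0,0,1,0], [0,1,1,0]]
assert total_cost(A) == Fraction(3, 4)
```

The approximation obeys c exactly when this total stays finite, which is why cheap late changes are harmless while early changes are expensive.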
Corollary 11.5.4 (Nies [302]). Every K-trivial set is tt-computable from some c.e. K-trivial set.

Proof. Let A be K-trivial and let A_0, A_1, ... be an approximation to A witnessing the fact that A obeys the standard cost function. We may assume that A_0 = ∅. By Corollary 11.4.5 and the proof of Theorem 11.5.3, which shows that we can take the A_i to be a subsequence of any given approximation to A, we may also assume that we have a computable function f such that |{s : A_{s+1}(n) ≠ A_s(n)}| ≤ f(n) for all n.

Let B_0 = ∅. Let B_{s+1} be obtained from B_s as follows. First put all of B_s into B_{s+1}. Then, for each n such that A_{s+1}(n) ≠ A_s(n), let k be least such that ⟨n, k⟩ ∉ B_s and add ⟨n, k⟩ to B_{s+1}. The B_i form an enumeration of a c.e. set B. It is easy to see that the B_i obey the standard cost function, whence B is K-trivial. Furthermore, A(n) is equal to the parity of the largest m ≤ f(n) such that ⟨n, k⟩ ∈ B for all k < m, which is clearly a tt-reduction.

The following lemma, noted by Downey, Hirschfeldt, Miller, and Nies [115], will be useful in Section 15.8. Its proof is similar to that of Lemma 11.5.2.

Lemma 11.5.5 (Downey, Hirschfeldt, Miller, and Nies [115]). Let M be a prefix-free oracle machine and let A be K-trivial. There is a computable sequence of stages v(0) < v(1) < · · · such that, letting n_s be the least n such that A_{v(s+1)}(n) ≠ A_{v(s+2)}(n) and

c(n, s) = Σ {2^{−|σ|} : M^A(σ)[v(s+1)]↓ ∧ n < use(M^A(σ)[v(s+1)]) ≤ v(s)},

we have Σ_s c(n_s, s) < ∞.
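The coding in the proof of Corollary 11.5.4 above can be sketched in a few lines (toy code, not part of the original text; the pairing function is a hypothetical stand-in):

```python
# Toy sketch of the coding in Corollary 11.5.4: from a Delta^0_2
# approximation A_0, A_1, ... (A_0 empty, with a computable bound f(n)
# on the number of changes at n), enumerate pairs <n, k> into a c.e.
# set B, and recover A(n) by a truth-table reduction to B.

def pair(n, k):
    return (n, k)                       # stand-in for a pairing function

def build_B(approximations):
    """Enumerate <n, k> into B each time position n changes."""
    B, changes = set(), {}
    for s in range(len(approximations) - 1):
        old, new = approximations[s], approximations[s + 1]
        for n in range(len(new)):
            if old[n] != new[n]:
                k = changes.get(n, 0)
                B.add(pair(n, k))       # the k-th change at n puts <n, k> into B
                changes[n] = k + 1
    return B

def recover(B, n, f_n):
    """A(n) is the parity of the largest m <= f(n) with <n, k> in B for all k < m."""
    m = 0
    while m < f_n and pair(n, m) in B:
        m += 1
    return m % 2

# A(1) changes twice (0 -> 1 -> 0), A(2) changes once (0 -> 1):
A = [[0, 0, 0], [0, 1, 0], [0, 1, 1], [0, 0, 1]]
B = build_B(A)
assert [recover(B, n, 2) for n in range(3)] == A[-1]
```

Since A_0 = ∅, the number of changes at n has the same parity as A(n), which is what makes the reduction a truth-table one: the queries ⟨n, k⟩ for k ≤ f(n) are computable from n alone.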
There are of course many possible variations on the idea of a cost function. Nies [298] has begun a program he calls the calculus of cost functions, which seeks to gain insight into notions related to algorithmic randomness by abstracting the idea of a cost function. As a by-product of this work, he obtained an interesting characterization of K-triviality that does not seem to involve randomness notions directly, although it does involve left-c.e. reals, and its proof uses halting probabilities. (In Miller's Theorem 15.9.2 we will see another such characterization of K-triviality, involving left-d.c.e. reals.)

Nies began by associating a cost function to the approximation of a left-c.e. real α, namely c_α(n, s) = α_s − α_n. His characterization of K-triviality is then as follows.

Theorem 11.5.6 (Nies [298]). A set is K-trivial iff it obeys c_α for every left-c.e. real α.

Proof. Any left-c.e. real α is the halting probability of some prefix-free machine N. We may assume that α_s = μ(dom N[s]). We may also assume
that exactly one string τ_s enters dom N at each stage s. We use an idea from the proof of Theorem 3.12.1. Let M be a prefix-free machine such that M(τ_s)↓ = s for each s, and M(σ)↑ for all other strings σ. We may assume that we have sped up our stages sufficiently so that M(τ_s)[s]↓ = s for all s. Let c_M be defined in the same way as the standard cost function, but with K_M in place of K. That is, writing K_M[s] for the stage s approximation to K_M, we have c_M(n, s) = Σ_{n<i≤s} 2^{−K_M(i)[s]}. Since the only M-description of t is τ_t, for n < s we have c_M(n, s) = Σ_{n<t≤s} 2^{−|τ_t|} = α_s − α_n = c_α(n, s), so the proof of Theorem 11.5.3 shows that every K-trivial set obeys c_α, and if we take α = Ω, so that c_M is the standard cost function, then every set obeying c_α is K-trivial.
11.6 The ideal of K-trivial degrees

Recall that a collection of degrees is an ideal if it is closed downward and under join. By Theorem 9.7.2, the left-c.e. K-trivial reals are closed under addition. Let A and B be K-trivial c.e. sets. Then α = A ⊕ ∅ and β = ∅ ⊕ B are also K-trivial, so A ⊕ B = α + β is K-trivial. In Corollary 11.4.10, we have seen that the K-trivials are closed downward under Turing reducibility, so we have the following result.

Theorem 11.6.1 (Nies [302]). The K-trivial c.e. degrees form a Σ^0_3 ideal in the c.e. degrees.

Proof. It remains to show that being K-trivial is a Σ^0_3 property. But A is K-trivial iff there is a c such that for all n and m ≤ n, there is a stage s with K_s(A_s ↾ m) ≤ K_s(m) + c, which is clearly a Σ^0_3 definition.

There are not many known natural ideals in the c.e. degrees, and the K-trivials form the first nontrivial example whose complexity is this low. It turns out that closure under addition and join is a more general fact about K-triviality.

Theorem 11.6.2 (Downey, Hirschfeldt, Nies, and Stephan [117]). If α and β are K-trivial then so is α + β.

Proof. Let α and β be K-trivial. We may assume that α + β is not computable, since otherwise we are done. Let c be such that K(α ↾ n) and K(β ↾ n) are both below K(n) + c for every n. By the Counting Theorem 3.7.6, there is a d such that for each n there are at most d many strings τ ∈ 2^n satisfying K(τ) ≤ K(n) + c. Let W_n be the set of such strings, and note that the W_n are uniformly c.e.
We can describe (α + β) ↾ n by giving a description of n, the positions i and j of α ↾ n and β ↾ n in the enumeration of W_n, and the carry bit from bit n + 1 to bit n when adding α and β, which is well defined since we are assuming that α + β is not computable, and hence in particular not rational. Since i, j ≤ d for all n, we have that K((α + β) ↾ n) ≤ K(n) + O(1).

Corollary 11.6.3 (Downey, Hirschfeldt, Nies, and Stephan [117]). The class of K-trivial sets is closed under join.

Proof. As above, if A and B are K-trivial then so are α = A ⊕ ∅ and β = ∅ ⊕ B, so A ⊕ B = α + β is K-trivial.

One consequence of this result is that not every superlow set is K-trivial. A proof of the following result appears in Nies [303], and in essence in Downey, Jockusch, and Stob [120]. (The original proof was in an unpublished manuscript.)

Theorem 11.6.4 (Bickford and Mills [36]). There exist superlow c.e. sets A and B such that A ⊕ B ≡_T ∅′.

Proof sketch. We build a c.e. trace V_0, V_1, ... such that |V_e| ≤ 3e + 1, to satisfy requirements of the forms Φ_e^A(e) ∈ V_e and Φ_e^B(e) ∈ V_e. We code ∅′ into A ⊕ B using coding markers γ(n, s). If n enters ∅′ at stage s, then we put γ(n, s) into whichever set does the least overall damage (in terms of injury to requirements). We also move markers to build our trace. The idea is that if we see that Φ_e^A(e)[s]↓, say, then we can put Φ_e^A(e)[s] into V_{e,s+1} and move the markers γ(n, s) for n ≥ e to fresh large values, while also putting them into B to preserve the coding. Then the Φ_e^A(e)[s] computation can be injured only by computations corresponding to stronger priority requirements and by coding of markers corresponding to n < e.

Corollary 11.6.5 (Nies [302]). The K-trivial sets form a proper subclass of the superlow sets.

By work of Bickford and Mills [36] and Downey, Jockusch, and Stob [120], we cannot replace ≡_T by ≡_wtt in Theorem 11.6.4. Indeed, Downey, Jockusch, and Stob [120] proved that the wtt-degrees of c.e.
array computable sets form an ideal in the c.e. wtt-degrees, and as mentioned following the proof of Theorem 2.23.9, all superlow sets are array computable. From Corollary 11.6.3, Downey, Hirschfeldt, Nies, and Stephan [117] concluded that the wtt-degrees of K-trivials form an ideal. However, by Corollary 11.4.10, we have the following stronger result. (An ideal I is generated by S ⊆ I if it is the smallest ideal containing S.) Theorem 11.6.6 (Nies [302]). The K-trivial degrees form an ideal generated by the class of c.e. K-trivial degrees.
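The carry-bit bookkeeping in the proof of Theorem 11.6.2 — the first n bits of α + β are determined by α ↾ n, β ↾ n, and a single carry contributed by the discarded tails — can be checked numerically (a sketch using exact rationals, not part of the original text):

```python
from fractions import Fraction

# Numeric check of the carry-bit fact in the proof of Theorem 11.6.2:
# writing a = floor(alpha * 2^n) (the first n bits of alpha, read as an
# integer), the first n bits of alpha + beta are a + b + carry, where
# carry in {0, 1} comes from the tails discarded after position n.

def first_bits(x, n):
    return int(x * 2**n)                 # floor, since x >= 0

def carry(alpha, beta, n):
    tail_a = alpha * 2**n - first_bits(alpha, n)   # in [0, 1)
    tail_b = beta * 2**n - first_bits(beta, n)     # in [0, 1)
    return int(tail_a + tail_b)          # 0 or 1

alpha, beta = Fraction(1, 3), Fraction(1, 7)
for n in range(1, 20):
    a, b = first_bits(alpha, n), first_bits(beta, n)
    assert first_bits(alpha + beta, n) == a + b + carry(alpha, beta, n)
```

This is just the identity ⌊x + y⌋ = ⌊x⌋ + ⌊y⌋ + ⌊{x} + {y}⌋; the point of the proof is that the single bit ⌊{x} + {y}⌋ is all the extra information needed beyond the two positions i, j ≤ d.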
An ideal is principal if it has a greatest element (in other words, if it is of the form {X : X ≤_T A} for some A). By Theorems 11.1.6 and 11.4.8, there is no greatest K-trivial degree, so the ideal of K-trivial degrees is not principal.
11.7 Bases for 1-randomness

The following notion of randomness-theoretic weakness is due to Kučera [219].

Definition 11.7.1 (Kučera [219]). Let R be a relativizable notion of randomness. A set A is a base for R-randomness if there is a set X ≥_T A such that X is R-random relative to A.

In this section we focus on bases for 1-randomness. Recall Sacks' Theorem (Corollary 8.12.2), which states that if A is not computable then μ({X : X ≥_T A}) = 0. In other words, the Turing upper cone above any noncomputable set is null. However, by the Kučera-Gács Theorem 8.3.2, such upper cones are never Martin-Löf null. Thus we could say that Sacks' Theorem cannot be effectivized. On the other hand, it seems too much to expect to effectivize Sacks' Theorem for the upper cone above A without reference to A itself. Thus it is more reasonable to ask: When is the upper cone above A Martin-Löf null relative to A? It follows easily from the existence of universal Martin-Löf tests that the upper cone above A is Martin-Löf null relative to A iff A is not a base for 1-randomness.

We have seen in Corollary 8.12.3 that there are no bases for weak 2-randomness other than the computable sets. Kučera [219] showed that all bases for 1-randomness are GL_1. He also observed that if A is low for 1-randomness, then it is a base for 1-randomness, since then the 1-random X ≥_T A whose existence is guaranteed by the Kučera-Gács Theorem 8.3.2 is also 1-random relative to A. Thus every K-trivial set is a base for 1-randomness.² The following result shows that the converse is also true, and thus we have yet another characterization of the K-trivial sets in terms of a notion of randomness-theoretic weakness. Its proof is known as the "hungry sets construction".

Theorem 11.7.2 (Hirschfeldt, Nies, and Stephan [178]). A set is K-trivial iff it is a base for 1-randomness.

² Kučera's [219] original construction of a base for 1-randomness was different: take a low 1-random set A and use the relativized low basis theorem to get a low set B that is 1-random relative to A. Then use Corollary 8.8.3 to get a noncomputable set X ≤_T A, B. Such an X is a base for 1-randomness, because X ≤_T B and B is 1-random relative to X, as it is 1-random relative to A and X ≤_T A.
Proof. As remarked above, it is enough to let A be a base for 1-randomness and show that A is K-trivial. Fix a set X ≥_T A that is 1-random relative to A and a reduction Φ^X = A. We will enumerate KC sets L_d for d ∈ ℕ so that there is a d for which L_d contains (K(|τ|) + d + 2, τ) for each τ ≺ A, thus ensuring that A is K-trivial.

To define the L_d, we will build uniformly Σ^0_1 classes C_d^τ so that the following hold.

1. For each d, the C_d^τ are pairwise disjoint.

2. μ(C_d^τ) ≤ 2^{−K(|τ|)−d}.

3. If X ∉ ∪_{τ≺A} C_d^τ, then μ(C_d^τ) = 2^{−K(|τ|)−d} for all τ ≺ A.

Suppose we can build such classes, and let U_d = ∪_{τ≺A} C_d^τ. The U_d are uniformly Σ^0_1 relative to A, and μ(U_d) = Σ_{τ≺A} μ(C_d^τ) ≤ Σ_{τ≺A} 2^{−K(|τ|)−d} ≤ 2^{−d}, so {U_d}_{d∈ω} is a Martin-Löf test relative to A. We now define

L_d = {(K_s(|τ|) + d + 2, τ) : μ(C_d^τ[s]) ≥ 2^{−K_s(|τ|)−d−1}}.

For a fixed τ, the requests (r, τ) ∈ L_d contribute to the weight of L_d at most twice the weight of the one with the smallest r. Thus the weight of L_d is bounded by Σ_τ μ(C_d^τ), which is less than or equal to 1 since the C_d^τ are pairwise disjoint. So the L_d are KC sets. Since X is 1-random relative to A, we have X ∉ U_d for some d, and hence μ(C_d^τ) = 2^{−K(|τ|)−d} for all τ ≺ A, which implies that (K(|τ|) + d + 2, τ) ∈ L_d for all τ ≺ A, as desired.

So it is enough to build the C_d^τ with the above properties. To do so, as long as μ(C_d^τ) < 2^{−K_s(|τ|)−d}, we look for strings σ such that τ ⪯ Φ^σ and μ(C_d^τ) + 2^{−|σ|} ≤ 2^{−K_s(|τ|)−d}, and put σ into C_d^τ. To keep our sets pairwise disjoint, we then ensure that no σ′ such that σ′ is compatible with σ is later put into any C_d^ν. If X ∉ U_d, then no σ with σ ≺ X is ever put into any C_d^τ, so if τ ≺ A = Φ^X then μ(C_d^τ) = 2^{−K(|τ|)−d}.

We now give the details of the construction. We have a separate procedure for each d ∈ ℕ. Initially, all strings are unused. Each stage s has 2^s many substages, one for each σ ∈ 2^s. Let C_{d,σ}^τ denote the approximation to C_d^τ at the beginning of the substage corresponding to σ. For each σ ∈ 2^s in turn, if σ is not used, then look for the shortest τ ⪯ Φ^σ[s] such that μ(C_{d,σ}^τ) + 2^{−s} ≤ 2^{−K_s(|τ|)−d}. If there is such a τ, then put σ in C_d^τ and declare every extension of σ to be used.

Note that, for a fixed d, the C_d^τ are pairwise disjoint, as we only enumerate unused strings of length s at stage s. Furthermore, it is clear from the construction that μ(C_d^τ) ≤ 2^{−K(|τ|)−d} for all d and τ. Thus we are left with verifying property 3 above. Suppose that X ∉ ∪_{τ≺A} C_d^τ. Assume for a contradiction that μ(C_d^τ) < 2^{−K(|τ|)−d} for some τ ≺ A. Let s be a stage such that
1. K(|τ|) = K_s(|τ|),

2. μ(C_d^τ) + 2^{−s} ≤ 2^{−K(|τ|)−d}, and

3. Φ^X ↾ (|τ| + 1)[s]↓.

Then X ↾ (s + 1) must enter C_d^τ at the substage corresponding to X ↾ (s + 1), unless it enters some other C_d^{τ′} or is already used. In any case, there is a ν and an n such that X ↾ n is in C_d^ν. But then ν ⪯ Φ^{X↾n} ≺ Φ^X = A, so that X ∈ [X ↾ n] ⊆ ∪_{τ≺A} C_d^τ, contrary to hypothesis. Thus μ(C_d^τ) = 2^{−K(|τ|)−d} for all τ ≺ A.

It is not difficult to adapt the above proof to show directly that every base for 1-randomness is low for K. The details are in Nies [306]. (Note, however, that this fact does not give us a new proof that every K-trivial is low for K, since the fact that every K-trivial is a base for 1-randomness still comes from the golden run proof that every K-trivial is low for 1-randomness.) As mentioned above, it follows from the Kučera-Gács Theorem that every set that is low for 1-randomness is a base for 1-randomness. Thus, the above result also provides an alternate proof that every set that is low for 1-randomness is K-trivial (and even low for K, using the version in [306] mentioned above). It also helps restore the intuition that relatively 1-random sets should have little common information, which was apparently contradicted by Kučera's result that there are relatively 1-random sets whose degrees do not form a minimal pair (Corollary 8.8.3).³

Corollary 11.7.3 (Hirschfeldt, Nies, and Stephan [178]). If A and B are relatively 1-random then any X ≤_T A, B is K-trivial.

Proof. Since A is 1-random relative to B, it is 1-random relative to X. Thus X is a base for 1-randomness, and hence is K-trivial.

The following is another interesting application of Theorem 11.7.2.

Corollary 11.7.4 (Hirschfeldt, Nies, and Stephan [178]). Let A be c.e., and let Z be 1-random and such that ∅′ ≰_T A ⊕ Z. Then Z is 1-random relative to A. Consequently, if A is c.e. and A ≤_T Z for some 1-random Z ≱_T ∅′ (i.e., a difference random set Z, as defined in Section 7.7), then A is K-trivial.

Proof.
It is enough to prove the first statement, as the second one then follows by Theorem 11.7.2. Let U_0^A, U_1^A, ... be a universal Martin-Löf test relative to A. Suppose that Z ∈ ∩_n U_n^A but ∅′ ≰_T A ⊕ Z. We may assume that μ(U_n^{A_s}[s]) ≤ 2^{−n} for all n and s. Let the function f ≤_T A ⊕ Z be defined by letting f(n) be the least s for which there is a k such that Z ↾ k ∈ U_n^{A_s}[s] with use u and

³ See Kučera's original construction of a base for 1-randomness mentioned in the previous footnote.
A_s ↾ u = A ↾ u. For such s and k, we have Z ↾ k ∈ U_n^{A_t}[t] for all t ≥ s. Now let m(n) be the least s such that n ∈ ∅′_s if there is such an s, and let m(n) be undefined otherwise. Then ∃^∞ n (m(n)↓ ≤ f(n)), as otherwise we could compute ∅′ from A ⊕ Z because, for almost all n, we would have n ∈ ∅′ iff n ∈ ∅′_{f(n)}.

Let S_n = ∪_{x>n, x∈∅′} U_x^{A_{m(x)}}[m(x)]. The classes {S_n}_{n∈ω} are uniformly Σ^0_1, and μ(S_n) ≤ 2^{−n}. Moreover, Z ∈ ∩_n S_n, since f(x) ≤ m(x) for infinitely many x. Thus Z is not 1-random.

Hirschfeldt, Nies, and Stephan [178] also studied bases for computable randomness. They showed that these include every Δ^0_2 set that is not of DNC degree, but no set of PA degree. Jockusch, Lerman, Soare, and Solovay [191] showed that if an n-c.e. set is of DNC degree, then it is Turing complete. Since 0′ is a PA degree, it follows that an n-c.e. set is a base for computable randomness iff it is Turing incomplete. Franklin, Stephan, and Yu [159] showed that a set is a base for Schnorr randomness iff it does not compute ∅′.
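The key weight estimate in the proof of Theorem 11.7.2 above — for a fixed τ, the requests (r, τ) have strictly decreasing r as K_s(|τ|) settles downward, so their total weight is at most twice the weight of the smallest-r request — is just a geometric-series bound, sketched here (toy code, not part of the original text):

```python
from fractions import Fraction

# Sketch of the KC-weight estimate: for distinct request indices r, the
# total weight sum of 2^{-r} is at most twice 2^{-r_min}, the weight of
# the request with the smallest r.

def request_weight(rs):
    assert len(set(rs)) == len(rs)      # indices are pairwise distinct
    return sum(Fraction(1, 2**r) for r in rs)

rs = [9, 7, 6, 4]                       # strictly decreasing, as K_s(|tau|) drops
assert request_weight(rs) <= 2 * Fraction(1, 2**min(rs))
```

Summed over all τ, this is what keeps each L_d a KC set: twice the smallest request weight is at most μ(C_d^τ), and the C_d^τ are pairwise disjoint.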
11.8 ML-covering, ML-cupping, and related notions

Corollary 11.7.4 suggests the following definition.

Definition 11.8.1. A c.e. set A is Martin-Löf coverable if there exists a 1-random Z ≱_T ∅′ (i.e., a difference random Z) such that A ≤_T Z.

The Martin-Löf coverable sets form a subclass of the class of K-trivial c.e. sets. It is an open question whether they form a proper subclass.

Open Question 11.8.2. Is every c.e. K-trivial set Martin-Löf coverable?

The following are two other notions of randomness-theoretic weakness arising from results, such as Corollary 11.7.4, that indicate that there is a qualitative difference between those 1-random sets that compute ∅′ and those that do not.

Definition 11.8.3 (Kučera, see Nies [304]). A set A is weakly Martin-Löf cuppable if there is a 1-random Z ≱_T ∅′ (i.e., a difference random Z) such that ∅′ ≤_T A ⊕ Z. It is Martin-Löf cuppable if Z can be chosen to be strictly below ∅′.

For any A, we have A′ ≤_T A ⊕ Ω^A. By Theorem 11.7.2, if A is not K-trivial then A ≰_T Ω^A, so if A is Δ^0_2 and not K-trivial then it is weakly Martin-Löf cuppable. Similarly, if A is low and not K-trivial then it is Martin-Löf cuppable (since lowness implies that Ω^A ≤_T ∅′). In fact, this argument shows that if A computes a set that is low and not K-trivial, then it is Martin-Löf cuppable. Sacks [342] showed that if A is c.e. then it can be
written as the disjoint union of two low c.e. sets. By the same argument as in the proof of Corollary 11.6.3, if A is not K-trivial, then at least one of these sets must be non-K-trivial. Thus, if a c.e. set is not K-trivial, then it is Martin-Löf cuppable, and hence the c.e. sets that are not Martin-Löf cuppable form a subclass of the K-trivials. As we will prove below, Nies [304] showed that this subclass contains more than just the computable sets.

Theorem 11.8.4 (Nies [304]). There is a noncomputable c.e. set that is not weakly Martin-Löf cuppable.

Again, proper containment is an open question.

Open Question 11.8.5. Is every K-trivial (c.e.) set not (weakly) Martin-Löf cuppable?

Franklin and Ng [157] have shown that a c.e. set is Martin-Löf coverable iff it is a base for difference randomness, and not weakly Martin-Löf cuppable iff it is low for difference randomness.

Recall from Theorem 7.2.11 that Hirschfeldt and Miller (see [130]) showed that for any Σ^0_3 class S of measure 0, there is a noncomputable c.e. set A such that A ≤_T X for all 1-random X ∈ S. Inspired by this result, Nies [306] suggested the following definition.

Definition 11.8.6 (Nies [306], after Hirschfeldt and Miller (see [130])). For a class S ⊆ 2^ω, let S^◇ be the collection of c.e. sets computable from every 1-random element of S. We call classes of the form S^◇ diamond classes.

Of course, S^◇ always contains the computable sets, and by Corollary 8.12.2, if μ(S) > 0 then S^◇ consists exactly of the computable sets. Theorem 7.2.11 can be rephrased by saying that if S is a Σ^0_3 class of measure 0, then S^◇ contains noncomputable sets. (We have already remarked, following the proof of Theorem 7.2.11, that this fact cannot be extended much beyond Σ^0_3 classes.)

A set X is Y-trivializing if Y is K-trivial relative to X (that is, K^X(Y ↾ n) ≤ K^X(n) + O(1)). Let S be the class of ∅′-trivializing sets.⁴ It is not hard to check that S is Σ^0_3. The following also holds.

Lemma 11.8.7.
μ(S) = 0.

Proof. Relativizing the fact that every K-trivial set is low, we see that every set in S is high. By Theorem 8.14.1, for all but measure 0 many high sets X, we have X ⊕ ∅′ ≡_T X′ ≥_T ∅′′. By the relativized version of Corollary 8.12.2, the class of sets X such that X ⊕ ∅′ ≥_T ∅′′ has measure 0. Thus the class of high sets has measure 0.

⁴ Note
that if X is Y-trivializing then X ⊕ Y is K-trivial relative to X, so by the relativized form of Theorem 11.4.11, X ⊕ Y ≤_LR X, whence Y ≤_LR X. In particular, by Theorem 10.6.5, every ∅′-trivializing set is (uniformly) almost everywhere dominating.
Thus S^◇ contains noncomputable sets. We can use the basic construction of a noncomputable K-trivial c.e. set to define a nontrivial pseudo-jump operator taking any set X to a set that is K-trivial relative to X. Then Theorem 8.19.1 shows that there is a 1-random set X
11.9 Lowness for weak 2-randomness

There have been several further characterizations of the K-trivials. We now discuss one that came as something of a surprise. Recall that a generalized Martin-Löf test (relative to A) is a sequence of uniformly Σ^0_1 (Σ^{0,A}_1) classes {U_n}_{n∈ω} such that lim_n μ(U_n) = 0. Applying Definition 11.2.2 to this notion, we have the following definition: a set A is low for generalized Martin-Löf tests if for each generalized Martin-Löf test {U_n}_{n∈ω} relative to A, there is an unrelativized generalized Martin-Löf test {V_n}_{n∈ω} such that ∩_n U_n ⊆ ∩_n V_n. If A is low for generalized Martin-Löf tests, then every weakly 2-random set is weakly 2-random relative to A, so A is low for weak 2-randomness.

The null classes determined by generalized Martin-Löf tests (i.e., those of the form ∩_n U_n for a generalized ML-test {U_n}_{n∈ω}) are exactly the Π^0_2 classes of measure 0. Thus, A is low for generalized ML-tests iff every Π^{0,A}_2 class of measure 0 is contained in a Π^0_2 class of measure 0 iff every Σ^{0,A}_2 class of measure 1 has a Σ^0_2 subclass of measure 1.

Downey, Nies, Weber, and Yu [130] showed that there are noncomputable c.e. sets that are low for generalized Martin-Löf tests, and hence low for weak 2-randomness. They also proved the following result.

Theorem 11.9.1 (Downey, Nies, Weber, and Yu [130]). If A is low for weak 2-randomness then A is K-trivial.

In the proof of this result, we will use the following notation.

Definition 11.9.2 (Kjos-Hanssen, Nies, and Stephan [207]). Let C and D be classes of sets given by relativizable definitions, and let D^A be the relativization of D to A. A set A is in Low(C, D) if C ⊆ D^A.⁵ If C = D, we write simply Low(C).

For example, letting MLR be the class of 1-random sets, A ∈ Low(MLR) = Low(MLR, MLR) means exactly that A is low for 1-randomness. Relativizing a class D determined by a notion of randomness usually makes it smaller, of course, so if A is powerful enough, we would expect that C ⊈ D^A even if C ⊆ D. The sets in Low(C, D) are the ones that are not powerful enough for this situation to obtain. Let W2R be the class of weakly 2-random sets.
Theorem 11.9.1 follows immediately from the following result, since every Martin-Löf test is a generalized Martin-Löf test, and hence W2R^A ⊆ MLR^A for any A.

Theorem 11.9.3 (Downey, Nies, Weber, and Yu [130]). We have Low(W2R, MLR) = Low(MLR).

In other words, if every weakly 2-random set is 1-random relative to A, then A is in fact low for 1-randomness, and hence K-trivial.

Proof. This presentation of the proof is due to Joe Miller.

⁵ We will say a little more about this notion in Section 12.6.
Suppose that A is not low for 1-randomness. We show that W2R ⊈ MLR^A by building a set Z ∈ W2R that is not 1-random relative to A. We define a sequence of nonempty Π^0_1 classes V_0 ⊇ V_1 ⊇ V_2 ⊇ · · · and let Z be an element of ∩_n V_n (which is nonempty, by compactness). Let V_0 be a nonempty Π^0_1 class containing only 1-random sets. Let {U_{e,n}}_{e,n∈ω} be a (noneffective) enumeration of all generalized Martin-Löf tests and let {U_n^A}_{n∈ω} be a universal Martin-Löf test relative to A.

Assume that we have defined V_e. We claim that there is an X ∈ V_e that is not 1-random relative to A. To see that this is the case, let X̃ be any set that is 1-random but not 1-random relative to A. Since V_e contains a 1-random element, it must have positive measure. Hence, by Lemma 6.10.1, there is an X =* X̃ such that X ∈ V_e. This X is not 1-random relative to A. Take τ_e ≺ X such that [τ_e] ⊆ U_e^A. It will follow from the construction that τ_e ≺ Z for each e, and hence Z is not 1-random relative to A.

Now choose m large enough that μ(U_{e,m}) < μ([τ_e] ∩ V_e) and let V_{e+1} = ([τ_e] ∩ V_e) \ U_{e,m}. So V_{e+1} is a nonempty Π^0_1 subclass of V_e, and no element of V_{e+1} is in ∩_n U_{e,n}. Since we ensure that this is the case for all e, it follows that Z is weakly 2-random.

It had been strongly suspected that the class of sets that are low for weak 2-randomness is a proper subclass of the K-trivials. The reason for this suspicion was that the cost function used in the proof of the existence of a noncomputable set that is low for weak 2-randomness seems to decrease much more slowly than the one used in the proof of the existence of a noncomputable K-trivial. This conjecture was surprisingly disproved, independently by Nies [306] and Kjos-Hanssen, Miller, and Solomon [206]. The original proofs of the following result used the golden run method, but there is a more informative proof using LR-reducibility.

Theorem 11.9.4 (Nies [306], Kjos-Hanssen, Miller, and Solomon [206]).
If A is K-trivial then A is low for generalized Martin-Löf tests. Therefore, K-triviality, lowness for generalized Martin-Löf tests, and lowness for weak 2-randomness are all equivalent.

Proof. If A is K-trivial then A ≤_T ∅′ and A ≤_LR ∅, so by Theorem 10.6.4, every Σ^{0,A}_2 class of measure 1 has a Σ^0_2 subclass of measure 1, which we have already observed is equivalent to A being low for generalized Martin-Löf tests.
11.10 Listing the K-trivial sets

In this section we prove a result about the presentation of the class K of K-trivial sets. As is true for every class that contains the finite sets and has a Σ^0_3 index set, there is a uniformly c.e. listing of the c.e. sets in K.
(See the proof of Theorem 11.11.1 below.) We will use the following lemma to show that there is in fact a listing that includes the witnesses of the Σ^0_3 statement, namely the constants via which the sets are K-trivial, and that this fact is true even in the general Δ^0_2 case. We say that A is K-trivial via c if K(A ↾ n) ≤ K(n) + c for all n.

Lemma 11.10.1 (Downey, Hirschfeldt, Nies, and Stephan [117]). There is an effective list {({B_{e,s}}_{s∈ω}, b_e, Q_e)}_{e∈ω} of Δ^0_2 approximations, constants, and computable sets of stages with the following properties, where Q_e = {v_e(0) < v_e(1) < ···} (which may be a finite set).

1. For each e, the limit lim_s B_{e,s} exists and is K-trivial.

2. Every K-trivial set is lim_s B_{e,s} for some e.

3. B_{e,s}(x) ≠ B_{e,s−1}(x) ⇒ s = v_e(r) for some r ≥ 2.

4. Let n_s be the least number n (if any) such that B_{e,s}(n) ≠ B_{e,s−1}(n). If v_e(s + 1) is defined, then let c(n, s) = Σ_{n<x≤v_e(s)} 2^{−K_{v_e(s+1)}(x)}. Then Σ_{s : n_{v_e(s+2)}↓} c(n_{v_e(s+2)}, s) < 2^{b_e}.
Proof. We use Lemma 11.5.2 and its proof. By Lemma 5.3.15, there is a uniformly c.e. listing of the K-trivial c.e. sets. By Theorem 11.5.4, this listing can be used to obtain a uniformly ω-c.e. listing of all K-trivial sets A_0, A_1, .... We think of the number e as a tuple ⟨m, c, i, n, b⟩ and let b_e = b. The idea is that m is the index for an A_m that we hope is K-trivial via c, the number i is the level at which we hope to have a golden run for the construction in the proof of Lemma 11.5.2, and n stands for the nth run of P_i in that construction, which we hope is a golden run P_i(g, q) with 2^b = 2g/q.

For each e, we run the construction in Lemma 11.5.2 with the K-trivial set A_m and c as the purported constant of K-triviality. We wait until the nth run of P_i is launched, with some parameters g and q. If 2^b ≠ 2g/q then Q_e = ∅ and B_e = ∅. Otherwise, we start computing the numbers v(0) < v(1) < ··· as in the proof of Lemma 11.5.2, letting Q_e consist of these numbers. (Note that we know at stage s whether s is in Q_e, which makes Q_0, Q_1, ... uniformly computable.) Of course, if we have the wrong guess for c, i, or n, then Q_e might be finite. Let B_{e,s} = ∅ for all s < v(2), or all s if v(2) is never defined. For r ≥ 2, let B_{e,s} = A_m ↾ v(r) for all s ∈ [v(r), v(r + 1)), or all s ≥ v(r) if v(r + 1) is never defined.

It is easy to check from the proof of Lemma 11.5.2 that property 4 of the lemma holds whether or not our guesses are correct. (In that proof, we showed that the relevant sum of costs is less than g/q. The extra factor of 2 in the requirement that 2^b = 2g/q is simply to take care of the cost associated with the initial change in approximation from B_{e,v(2)−1} to B_{e,v(2)}.)
Property 1 holds because every B_e is either some A_m or finite, while property 3 holds by construction. If c is a correct constant of K-triviality for A_m, then the construction will have a golden run. If the nth run of P_i is a golden run and 2^b = 2g/q, then B_e = A_m for e = ⟨m, c, i, n, b⟩. Thus property 2 also holds.

Theorem 11.10.2 (Downey, Hirschfeldt, Nies, and Stephan [117]). There are uniformly computable approximations B_{e,s} and a computable sequence of constants d_e such that for every e, the set B_e = lim_s B_{e,s} exists and is K-trivial via d_e; and every K-trivial set is B_e for some e.

Proof. Let B_{e,s}, b_e, Q_e, and v_e be as in Lemma 11.10.1. Uniformly in e, we define KC sets V_e such that (K(n) + b_e + 3, B_e ↾ n) ∈ V_e for all n. Then we obtain d_e by adding to b_e + 3 the coding constant of a prefix-free machine uniformly obtained from V_e.

At stage s, for each e, n ≤ s, put (K_s(n) + b_e + 3, B_{e,s} ↾ n) into V_e if (i) n = s, or (ii) n < s and K_s(n) < K_{s−1}(n), or (iii) B_{e,s−1} ↾ n ≠ B_{e,s} ↾ n. Clearly, (K(n) + b_e + 3, B_e ↾ n) ∈ V_e for each e and n. It remains to show that each V_e is a KC set.

The weight contributed by requests added for reasons (i) and (ii) is at most 2^{−(b_e+3)} Σ_n 2·2^{−K(n)} < 2^{−(b_e+2)} ≤ 1/4. Now consider the requests added for reason (iii). Since B_e changes only at stages in Q_e, for each n there are at most two enumerations at a stage s = v_e(r + 2) such that n > v_e(r). The weight contributed by all n at such stages is at most Ω/4. Now suppose that (iii) occurs for s = v_e(r + 2) and n ≤ v_e(r). First we consider the case K_{v_e(r+1)}(n) > K_s(n). This case occurs at most once for each value K_s(n) for s ∈ Q_e. Since each value corresponds to a new description of n, the overall contribution to the weight of V_e of enumerations made when this case occurs, over all n, is at most Ω/8. Now we consider the case K_{v_e(r+1)}(n) = K_s(n). Since the approximation to B_e(x) changes for some least x < n at stage s, the term 2^{−K_s(n)} occurs in the sum c(x, r) from Lemma 11.10.1.
Since the sum of all such costs is bounded by 2^{b_e}, the overall contribution to the weight of V_e of enumerations made when this case occurs, over all n, is at most 1/8. Thus the total weight of V_e is at most 1/4 + Ω/4 + Ω/8 + 1/8 < 1, so V_e is indeed a KC set.

Using a c.e. version of Lemma 11.10.1, we can obtain a c.e. version of the above result with the same proof.

Theorem 11.10.3 (Downey, Hirschfeldt, Nies, and Stephan [117]). There are uniformly c.e. sets A_0, A_1, ... and a computable sequence of constants d_e such that A_e is K-trivial via d_e for each e, and every c.e. K-trivial set is A_e for some e.
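The four weight contributions in the KC-set argument above (at most 1/4, Ω/4, Ω/8, and 1/8) sum to less than 1 because Ω < 1. The following minimal arithmetic check, which is not part of the proof, verifies this tally with exact rationals; `omega` is a stand-in for any rational value of Ω below 1.

```python
from fractions import Fraction

def total_weight_bound(omega):
    """Upper bound on the weight of V_e from the four cases in the proof:
    reasons (i)+(ii): at most 1/4; reason (iii) with n > v_e(r): Omega/4;
    the case K_{v_e(r+1)}(n) > K_s(n): Omega/8; the remaining case: 1/8."""
    return Fraction(1, 4) + omega / 4 + omega / 8 + Fraction(1, 8)

# Any value of Omega strictly below 1 keeps the total weight below 1.
for omega in [Fraction(0), Fraction(1, 2), Fraction(99, 100)]:
    assert total_weight_bound(omega) < 1
```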
Let C be an index set such that if W_e = W_i then e ∈ C iff i ∈ C. We say that C is uniformly Σ^0_3 if there is a Π^0_2 relation P such that e ∈ C iff ∃n (P(e, n)), and there is an effective sequence {(e_n, b_n)}_{n∈ω} such that P(e_n, b_n) for all n and for each e ∈ C there is an n with W_e = W_{e_n}. We have proved that K is uniformly Σ^0_3. It would be interesting to know which other properly Σ^0_3 index sets have that property, for instance the class of computable sets.

In Theorem 11.4.9, we saw that every K-trivial set is low for K. However, the constant c in the definition of being low for K was not obtained in a uniform way from the constant of K-triviality. The following result shows that this nonuniformity is necessary. We say that A is low for K via c if ∀σ (K(σ) ≤ K^A(σ) + c).

Lemma 11.10.4. From an index for a c.e. set A and a c such that A is low for K via c, we can effectively compute a lowness index for A.

Proof. Let f(e) be the least s such that Φ^A_e(e)[s] ↓ with an A-correct computation, if there is such an s, and let f(e) be undefined otherwise. Then f is partial computable relative to A, and indeed there is a uniform procedure for computing f given A. Thus there is a d, independent of A, such that whenever f(e) ↓, we have K(f(e)) ≤ K^A(f(e)) + c ≤ e + c + d. To compute whether e ∈ A′ using ∅′, we find the largest s such that K(s) ≤ e + c + d. Then e ∈ A′ iff Φ^A_e(e)[s] ↓ with an A-correct computation. It is clear that an index for this procedure can be found effectively from a c.e. index for A and c.
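The ∅′-decision procedure in the proof of Lemma 11.10.4 follows a simple search pattern that can be illustrated on toy data. In the sketch below, `K_table` and `halt_time` are invented finite stand-ins for the complexity function K and for f(e) (the least stage with an A-correct halting computation); they are chosen so as to satisfy the lemma's guarantee K(f(e)) ≤ e + c + d, which is the only property the search uses.

```python
# Toy illustration of the search pattern in the proof of Lemma 11.10.4.
c, d = 1, 2  # hypothetical constants

K_table = {0: 1, 1: 2, 2: 2, 3: 4, 4: 6, 5: 3, 6: 9, 7: 5}  # stand-in for K
halt_time = {0: 2, 2: 5, 3: 7}  # e -> f(e); a missing e means e is not in A'

# Invented guarantee from the lemma: K(f(e)) <= e + c + d whenever f(e) is defined.
assert all(K_table[s] <= e + c + d for e, s in halt_time.items())

def in_A_jump(e):
    """Decide e in A' as in the proof: find the largest s with
    K(s) <= e + c + d; then e is in A' iff the A-correct halting
    computation has appeared by stage s."""
    s_max = max(s for s, k in K_table.items() if k <= e + c + d)
    return e in halt_time and halt_time[e] <= s_max

print([e for e in range(8) if in_A_jump(e)])  # prints [0, 2, 3]
```

The guarantee ensures that f(e), when defined, is among the stages s with K(s) ≤ e + c + d, so the search bound s_max is never too small.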
Corollary 11.10.5 (Downey, Hirschfeldt, Nies, and Stephan [117]). There is no effective way to obtain, from a pair (A, b) where A is a c.e. set that is K-trivial via b, a constant c such that A is low for K via c.

Proof. Suppose otherwise. Then, by Theorem 11.10.3, we can obtain a uniformly c.e. listing A_0, A_1, ... of the c.e. sets that are low for K, along with a computable sequence c_0, c_1, ... such that each A_n is low for K via c_n. By the lemma, the A_n are uniformly low, contradicting Theorem 11.1.6.
11.11 Upper bounds for the K-trivial sets

We have mentioned in Theorem 11.1.6 that given any low c.e. set B we can construct a K-trivial A ≰_T B. The following result shows that we cannot replace low by low₂. It is a good example of a purely computability-theoretic result that has significant consequences for the study of the interactions between computability and randomness.

Theorem 11.11.1 (Barmpalias and Nies [28]). Let I be a nontrivial Σ^0_3 ideal in the c.e. Turing degrees. There is a low₂ c.e. set A such that W ≤_T A for all W ∈ I.

By Corollary 11.5.4 and Theorem 11.6.1, we have the following.
Corollary 11.11.2 (Barmpalias and Nies [28]). There is a low₂ c.e. set A such that X ≤_T A for all K-trivial sets X.

Downey, Hirschfeldt, Nies, and Stephan [117] had earlier shown that there is an incomplete c.e. set A such that X ≤_T A for all K-trivial sets X. This simpler result can be proved as follows. Since the collection of K-trivial c.e. sets is Σ^0_3 and contains the finite sets, by Lemma 5.3.15, there is a c.e. set A such that every K-trivial c.e. set is a column of A, and every column of A is K-trivial. Since the K-trivials are closed under join, A^{[<n]} is K-trivial for each n, so ∅′ ≰_T A^{[<n]} for any n; an incomplete c.e. upper bound for the K-trivials can then be extracted as in the proof of the Thickness Lemma, using Corollary 11.5.4 and Theorem 11.6.1 to pass from the c.e. K-trivials to all K-trivial sets.

We now turn to the proof of Theorem 11.11.1. Fix uniformly c.e. sets U_0, U_1, ... whose degrees generate the ideal I. We first need to know that the U_i are uniformly low₂, which follows from the next lemma.

Lemma 11.11.3 (Nies, see [28]). Let U be a c.e. set. Uniformly in e and a c.e. index for U, we can build a c.e. set V_e such that V_e ≤_T U if Φ^U_e is total, while V_e =* ∅′ otherwise.

Proof. Let f(e, s) be the length of convergence max{x : ∀y ≤ x (Φ^U_e(y)[s] ↓)}. At stage s, for each n ∈ ∅′[s], if n > f(e, s) then put n into V_e. If Φ^U_e is total then, given n, we can use U to compute an s_n such that f(e, t) > n for all t ≥ s_n. If n ∉ V_e[s_n], then n ∉ V_e, so V_e ≤_T U. If Φ^U_e is not total then f(e, s) has a finite liminf, so V_e =* ∅′.

Corollary 11.11.4 (Nies, see [28]). The U_i are uniformly low₂.

Proof. By the lemma, we have a computable function f such that if Φ^{U_i}_e is total then W_{f(e,i)} ≤_T U_i, while otherwise W_{f(e,i)} =* ∅′. Then Φ^{U_i}_e is total iff W_{f(e,i)} ∈ I. Thus the question of whether Φ^{U_i}_e is total is Σ^0_3. But it is easy to see that it is also Π^0_3, since the U_i are uniformly c.e. Thus this question can be answered by ∅′′, and hence (since the totality question relative to an oracle U is Π^0_2-complete relative to U), the U_i are uniformly low₂.
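The stage-by-stage rule for the sets V_e above (put n into V_e at stage s whenever n is already enumerated into ∅′ and n exceeds the current length of convergence) is finitary enough to simulate on toy data. In the sketch below, `true_halt` and `ell` are invented finite stand-ins for ∅′ and for the length-of-convergence function f(e, s); the two runs illustrate the two cases of the argument, f(e, s) tending to infinity versus f(e, s) having a finite liminf.

```python
# Toy, finite-stage simulation of the V_e construction sketched above.
def build_V(ell, true_halt, stages):
    """At stage s, put n into V for every n already enumerated into the
    toy halting set with n > ell(s)."""
    V = set()
    for s in range(stages):
        enumerated = {n for n, t in true_halt.items() if t <= s}
        V |= {n for n in enumerated if n > ell(s)}
    return V

true_halt = {1: 3, 4: 0, 6: 10, 9: 5}   # n -> stage at which n enters the toy halting set

# "Total" behaviour: ell tends to infinity, so each n can enter V only at
# the finitely many stages s with ell(s) < n; membership settles early.
V_total = build_V(lambda s: s, true_halt, stages=50)

# "Non-total" behaviour: ell has finite liminf 2, so every element of the
# toy halting set above 2 eventually enters V.
V_partial = build_V(lambda s: 2, true_halt, stages=50)
print(sorted(V_partial))  # prints [4, 6, 9]
```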
We will satisfy each coding requirement C_e: U_e ≤_T A by building a reduction Γ^A_e = U_e. (In truth, we will have several strategies for meeting C_e. The one on the true path of the construction will build the correct reduction Γ_e.) We will define Γ_e in stages. As usual, if x enters U_e
after stage s, then we will change A below the use γ^A_e(x)[s]. As we will see, there will be other reasons we might want to change A below this use, but as long as the use does not go to infinity, we will still succeed in building Γ_e. To make A low₂, we will satisfy the following requirements.

N_i: lim sup_s ℓ(τ, s) = ∞ ⇒ Φ^A_i is total,
where ℓ(τ, s) is the length of convergence max{x : ∀y ≤ x (Φ^A_i(y)[s] ↓)}, measured at the stages at which the node τ on the true path devoted to N_i has its true outcome, in a sense that will be made clear below. As usual for an infinite injury argument, the true path will be ∅′′-computable, so these requirements ensure that ∅′′ can tell which Φ^A_i are total, thus making A low₂. Each requirement N_i has subrequirements

N_{i,j}: lim sup_s ℓ(τ, s) = ∞ ⇒ Φ^A_i(j) ↓.
The priority tree will have 3 types of nodes:

• β nodes, which work for C-requirements and have a single outcome ∞. We denote by e(β) the e such that β works for C_e.

• τ nodes, which work for N-requirements and, as a first approximation to the actual construction, have outcomes ∞ and f, with ∞ <_L f.

• α nodes, which work for the subrequirements N_{i,j}.

As
the length of convergence rises, and hence τ⌢∞ looks increasingly correct, we want to preserve more and more of A to try to force Φ^A_i to be total if infinitely often it looks total. This preservation is implemented by the tree of subnodes below the main node τ.

Now, there are two quite different issues that τ must deal with. One is the presence of β nodes above τ, whose coding action is more or less out of control as far as τ is concerned. As much as τ might wish to preserve a computation Φ^A_i(j) ↓, it cannot stop such a β from encoding U_{e(β)} into A, even if that encoding destroys the computation. The situation is similar to that in the proof of the Thickness Lemma, where a negative requirement must deal with encodings above it that it cannot control, but there is a notion of a node τ's computations being "correct", which implies that no injury from above can occur to a given computation. We will develop an analogue to this notion of correctness using the fact that here the injury from above will be uniformly low₂, because the U_e are uniformly low₂. The general idea is that above τ there will be a finite number of β nodes, coding sets U_i for a finite number of i's, and we thus have a uniformly low₂ amount of "noise" in the τ-computations.

Before we say exactly how we will deal with this noise, we discuss the other issue that τ must confront: the impact of coding nodes of weaker priority than τ. So assume that τ is able to guess the behavior of stronger priority β nodes and uses only the τ-correct computations alluded to above. Let N_i be the requirement assigned to τ. If lim sup_s ℓ(τ, s) = ∞ then we need to ensure that Φ^A_i is total. As mentioned above, for a given j, we have nodes α below τ⌢∞ dedicated to ensuring that Φ^A_i(j) ↓. For such an α, we need to deal with potential injuries to computations from β nodes between τ⌢∞ and α (of course, β nodes below α will be restrained by α, and we are working under the assumption that τ can deal with β nodes above it).
The point, of course, is that such β nodes are trying to put infinitely many elements into A while α is trying to preserve computations with oracle A. The node α deals with a β node between τ and it as follows. Suppose that α is at a stage s at which it wants to preserve a τ-correct computation Φ^A_i(j)[s] ↓. (Note that, since α is below τ⌢∞, for α to be relevant at stage s, this stage must be one at which we believe the ∞ outcome of τ is plausible. That is, it must be τ-expansionary, which means that the length of convergence ℓ(τ, s) must be larger than the longest length of convergence previously seen.) The node α cannot really stop β from putting its numbers (which may well be below the use ϕ^A_i(j)[s]) into A. Instead, α tries to lift the relevant Γ_β-uses above ϕ^A_i(j)[s]. If α almost always succeeds, then the computation it is trying to preserve eventually becomes not only τ-correct, but α-correct, and hence α can eventually preserve this computation forever. If, on the other hand, α fails infinitely often, then we must arrange things so that the ∞ outcome of τ is not in fact the true one, that is, that lim sup_s ℓ(τ, s) < ∞. The trick is to destroy certain computations ourselves rather than try to preserve them, as we now explain.
Suppose that γ^A_β(x)[s] < ϕ^A_i(j)[s] for x > j. What α does (or more accurately, what τ does, blaming α) is put some z < γ^A_β(x)[s] into A and lift γ^A_β(y) above s for all y > x. The entry of z into A will kill the computation α wanted to preserve, and hence it is now reasonable to not believe that s is τ-expansionary after all (i.e., the increase in the length of convergence ℓ(τ, s) that caused us to consider taking the ∞ outcome of τ at stage s was not real). So at stage s, we take the f outcome of τ instead of the ∞ outcome.

Let s′ be the next potential (τ⌢∞)-stage. The number z has entered A below the use of the Φ^A_i(j)[s] ↓ computation, and we have reset γ^A_β(x)[s]. At stage s′, we check whether we now have γ^A_β(x)[s′] > ϕ^A_i(j)[s′] for all x > j, or whether we are in the same bad situation as at stage s. (Note that s′ being a potential (τ⌢∞)-stage means that s′ is τ-expansionary, so that
ℓ(τ, s′) > ℓ(τ, s), and hence Φ^A_i(j)[s′] ↓.) In the former case, we no longer have to worry about β. (It is true that we might have γ^A_β(x)[s′] < ϕ^A_i(j)[s′] for some x ≤ j, but since there are only finitely many such x, this situation can cause β to injure α only finitely often, which is fine, since α needs to preserve its computation only in the limit.) Otherwise, we repeat what we did at s, enumerating some number less than γ^A_β(x)[s′] into A, once again destroying the computation Φ^A_i(j)[s′] ↓ rather than preserving it, and once again taking the finitary outcome f of τ.

Now, if this cycle repeats itself infinitely often, then there are only finitely many (τ⌢∞)-stages and, indeed, Φ^A_i is not total. Suppose on the other hand that we hit τ at some τ-correct stage t and all of the offending γ_β-uses for β between τ and α have cleared ϕ^A_i(j)[t]. At such a stage, we allow α to impose restraint, initializing weaker priority requirements. Modulo the notion of a τ-correct computation, α can now be injured only by markers γ^A_β(x)[s] where β is between τ and α and x ≤ j. As mentioned above, these produce only finite injury. Thus, still depending on the notion of τ-correctness, if τ⌢∞ is on the true path, then Φ^A_i will indeed be total. Once α asserts control at some τ-correct stage, after some finite injury noise, α's restraint will be successful and preserve the Φ^A_i(j)[t] ↓ computation forever.

We now turn to discussing the notion of τ-correctness. We have seen that β nodes above τ can also affect things, but it is not within τ's power to clear γ_β-markers associated with such stronger priority β. That is, suppose that we reach a stage t that appears to be τ-expansionary. Let M be the maximum of the uses of the computations witnessing that t is τ-expansionary. We do not know whether to believe that t is in fact τ-expansionary, because there may be γ_β-markers for β above τ that are less than or equal to M.
Action for the sake of these markers can injure the computations witnessing that t is τ -expansionary. If infinitely often we have such “fake” expansionary stages at which τ believes computations that are later injured, then τ will not succeed in satisfying its requirement. So what is τ to do?
It is now that we use the fact that the U_e are uniformly low₂. Let B be the collection of β nodes above τ and let U_τ = ⊕_{β∈B} U_{e(β)}. We write k_β for the position of U_τ corresponding to the kth position of U_{e(β)}. We build a reduction Ξ^{U_τ}_τ. By the recursion theorem, we know which questions to ask ∅′′ to find out whether this reduction is total. So we have a uniformly Δ^0_3 way to decide whether Ξ^{U_τ}_τ is total. For now, let us think that we are getting "∅′′-certifications" as to the totality of Ξ^{U_τ}_τ. (One way to think of these certifications is to pretend for now that our decision procedure is Π^0_2 rather than Δ^0_3. Then there is a computable process that infinitely often certifies that our reduction is total iff it is in fact total.)

When we are ready to believe that a stage t is τ-expansionary, we declare that Ξ^{U_τ}_τ(n)[t] ↓ for the least n for which Ξ^{U_τ}_τ(n) was undefined, with use max{k_β : β ∈ B ∧ γ^A_β(k)[t] ≤ M} + 1, where M is as above. We then do the same for the next n for which Ξ^{U_τ}_τ(n) was undefined, and keep doing it for larger and larger n until one of the following occurs.

1. For some β above τ and some n, we see that U_τ changes below the use of the computation Ξ^{U_τ}_τ(n).

2. We get a new ∅′′-certification that Ξ^{U_τ}_τ is total.
One of these two possibilities must happen, since otherwise we would make Ξ^{U_τ}_τ total but would not get this totality certified. In the first case, we know that our original belief that t is τ-expansionary is wrong, so we can take the f outcome of τ. In the second case, we have some endorsement of our original belief, and can take the ∞ outcome after all. Notice that, if the second case happens infinitely often, then Ξ^{U_τ}_τ is indeed total, so almost all stages that we believe to be τ-expansionary in fact are so, and the ∞ outcome is the correct one.

However, the above description proceeded under the simplifying assumption that our approximation to the totality of Ξ^{U_τ}_τ is Π^0_2. In truth, it is Δ^0_3, which means that we have a Σ^0_3 approximation for totality and also a Σ^0_3 approximation for nontotality. We can think of a Σ^0_3 approximation as infinitely many Π^0_2 processes, such that the condition being approximated holds iff one of these processes provides infinitely many certifications. (It is easy to see that we may assume that at most one process provides infinitely many certifications.) So we replace the two outcomes of τ by infinitely many outcomes (0, ∞), (0, f), (1, ∞), (1, f), ..., where (k, ∞) represents the guess that the kth Π^0_2 process provides infinitely many certifications of totality.
We also want to ensure that, by the time the nodes below τ act, all markers that could be moved by τ have already been moved. We do so by moving γ^A_β(x)[s] not only when it is less than ϕ^A_i(j)[s] for some j < x, but also preventively when Φ^A_i(j)[s] ↑ for such a j, since we know that, if τ has infinitary outcome, then Φ^A_i(j) will eventually converge and would then possibly cause us to move γ^A_β(x)[s].

We now turn to the formal details. Define the priority tree as follows. Let λ be on the priority tree. Suppose that ν is on the priority tree. There are three cases.

1. If |ν| ≡ 0 mod 3 then let e be least such that C_e is not assigned to any ρ ≺ ν. The node ν is a β node devoted to C_e. Put ν⌢∞ on the priority tree and let e(ν) = e.

2. If |ν| = 3i + 1 then ν is a τ node devoted to N_i. Put ν⌢(k, ∞) and ν⌢(k, f) on the priority tree for all k ∈ ℕ, with the ordering ν⌢(0, ∞) <_L ν⌢(0, f) <_L ν⌢(1, ∞) <_L ν⌢(1, f) <_L ···.
Case 1. ν is a β node. If Γ^A_ν(n)[s] ↓ ≠ U_{e(ν)}(n)[s] then put γ^A_ν(n)[s] − 1 into A. In this case, if ρ ⪰ ν is an α node such that τ(ρ) ≺ ν and j(ρ) ≥ n, then initialize ρ.
Case 2. ν is a τ node. Let i = i(ν). For each j, each β node ρ ⪰ ν⌢(j, ∞), each n, and each k ≤ n such that γ^A_ρ(n)[s] ↓ and either Φ^A_i(k)[s] ↑ or γ^A_ρ(n)[s] ≤ ϕ^A_i(k)[s], enumerate γ^A_ρ(n)[s] − 1 into A, declaring Γ^A_ρ(m) to be undefined for all m ≥ n. (Note that at this point we do not redefine Γ^A_ρ(m), and hence do not redefine the use γ^A_ρ(m). Such redefinitions will occur the next time we visit ρ.) Let ℓ(ν, s) = max{x : ∀y ≤ x (Φ^A_i(y)[s] ↓)}, where these computations take into account the values just enumerated into A (and so do not converge if a number was put into A below the old use ϕ^A_i(y)[s]).

If ℓ(ν, s) ≤ m(ν, t), then let u be the last stage before s at which ν had outcome (k, f) for some k, or 0 if there has been no such stage. Look for the least v > u such that T_ν(v) = (j, f) for some j, which must exist since we are not defining any new values of Ξ^{U_ν}_ν, and declare that s is a (ν⌢(j, f))-stage. Let m(ν, s) = m(ν, t), and call ℓ(ν, s) the ν-correct length of convergence at stage s.

If ℓ(ν, s) > m(ν, t), then proceed as follows. Let u be the last stage before s at which ν had outcome (k, ∞) for some k, or 0 if there has been no such stage. Let B_ν be the collection of β nodes above ν and start to define Ξ^{U_ν}_ν(n) for larger and larger n, all with use

max{k_ρ : ρ ∈ B_ν ∧ γ^A_ρ(k)[s] ≤ min(ϕ^A_i(n)[s], ϕ^A_i(m(ν, t) + 1)[s])} + 1,
where k_ρ is, as above, the position of U_ν corresponding to position k of U_{e(ρ)}, until one of the following occurs.

1. Some use ξ^{U_ν}_ν(n) is violated.

2. For some least v > u and some j, we have T_ν(v) = (j, ∞).

In the former case, let u be the last stage before s at which ν had outcome (k, f) for some k, or 0 if there has been no such stage. Look for the least v > u such that T_ν(v) = (j, f) for some j, which must exist since we are not defining any new values of Ξ^{U_ν}_ν, and declare that s is a (ν⌢(j, f))-stage, setting m(ν, s) = m(ν, t). Let n be the largest number such that the computation Ξ^{U_ν}_ν(n) is still valid (or 0 if there is no such number). Call max(n, m(ν, t)) the ν-correct length of convergence at stage s.

In the latter case, declare that s is a (ν⌢(j, ∞))-stage, let m(ν, s) = m(ν, t) + 1, and call this value the ν-correct length of convergence at stage s.

Case 3. ν is an α node. If Φ^A_{i(ν)}(j(ν))[s] ↓ then declare that s is a (ν⌢ϕ^A_{i(ν)}(j(ν))[s])-stage. Otherwise, declare that s is a (ν⌢0)-stage.

End of Construction.

To finish the proof, we verify the following by simultaneous induction on ν ⊂ TP (in the process showing in particular that TP is well-defined).
(i) If ν is a β node then Γ^A_ν = U_{e(ν)}, whence C_{e(ν)} is satisfied. In particular, lim_s γ_ν(n)[s] exists for all n.

(ii) If ν is a τ node, then the following hold.

(a) There is a j such that either ν⌢(j, f) ≺ TP or ν⌢(j, ∞) ≺ TP.

(b) If ν⌢(j, f) ≺ TP, then ν has only finite effect on the nodes extending ν⌢(j, f), and the limit of the ν-correct length of convergence at (ν⌢(j, f))-stages is finite.

(c) If ν⌢(j, ∞) ≺ TP then the limit of the ν-correct length of convergence at (ν⌢(j, ∞))-stages is infinite, and for each k, the node ν takes no action for k at almost all (ν⌢(j, ∞))-stages.

(d) ∃j (ν⌢(j, ∞) ≺ TP) iff Φ^A_{i(ν)} is total, whence N_{i(ν)} is satisfied.

(iii) If ν is an α node then the following hold.

(a) The node ν is initialized only finitely often.

(b) There is a k such that ν has outcome k almost always.

We assume (i)–(iii) for all σ ≺ ν. There are three cases to consider.

Case 1. ν is a β node. Then there is a stage s_0 after which ν is never again initialized. Since ν gets to act infinitely often, we will have Γ^A_ν(n) ↓ = U_{e(ν)}(n) provided that lim_s γ^A_ν(n)[s] exists. After stage s_0, there are only two ways we can have γ^A_ν(n)[s] change. One is for n to enter U_{e(ν)}, which can happen only once. The other is for some τ node above ν to take action for a number less than or equal to n, which by our inductive hypothesis can happen only finitely often.

Case 2. ν is a τ node. Let i = i(ν). First suppose that there is a least n such that Ξ^{U_ν}_ν(n) ↑. Then there is a unique j such that T_ν(u) = (j, f) for infinitely many u. Furthermore, either Ξ^{U_ν}_ν(n) is defined only finitely often, or it is defined infinitely often but its use is violated each time it is defined. In the former case, we must have ℓ(ν, s) ≤ m(ν, t) for almost all ν-stages s, where t is as in the construction. In particular, Φ^A_i is not total.
In the latter case, because of the way the use ξ^{U_ν}_ν(n)[s] is defined, there is an m ≥ n such that the use of Φ^A_i(m) is violated infinitely often, whence Φ^A_i is not total. In either case, ν⌢(j, f) ≺ TP, the limit of the ν-correct length of convergence at (ν⌢(j, f))-stages is finite, and clearly ν has only finite effect on the nodes extending ν⌢(j, f).

Now suppose that Ξ^{U_ν}_ν is total. Then there is a unique j such that T_ν(u) = (j, ∞) for infinitely many u. Furthermore, we must have lim_s m(ν, s) = ∞, since if this limit were a finite number m, then there would be infinitely many ν-stages s at which ξ^{U_ν}_ν(m + 1)[s] is violated. It follows that ν⌢(j, ∞) ≺ TP (since ν has an infinitary outcome each time m(ν, s) increases). We are left with showing that Φ^A_i is total, which implies that the limit of the ν-correct length of convergence at (ν⌢(j, ∞))-stages is infinite, and hence that for each k, the node ν takes no action for k at almost all (ν⌢(j, ∞))-stages.
Every (ν⌢(j, ∞))-stage is expansionary, so for each k, for each sufficiently large (ν⌢(j, ∞))-stage s, we have ℓ(ν, s) ≥ k. For such an s, we have Φ^A_i(k)[s] ↓. Assuming that s is large enough that ν is not initialized after stage s, there are only two ways a number can enter A below ϕ^A_i(k)[s]. One is through some n < k entering U_{e(ρ)} for some β node ρ such that ρ is between ν⌢(j, ∞) and some α node working for N_{i,k}. Since there are only finitely many such n and e(ρ), if s is large enough then no number will enter A below ϕ^A_i(k)[s] for this reason. The other way a number may enter A below ϕ^A_i(k)[s] is through the action of a β node above ν. But for any large enough n, if such a number enters A, then the use ξ^{U_ν}_ν(n)[s] is violated. Thus, since Ξ^{U_ν}_ν is total, if s is large enough then no number will enter A below ϕ^A_i(k)[s] for this reason either. Thus there is an s such that Φ^A_i(k)[s] ↓ and no number enters A below ϕ^A_i(k)[s] during or after stage s, whence Φ^A_i(k) ↓.
Case 3. ν is an α node. That ν is initialized only finitely often is clear from the description of the action of the β nodes above ν. Since τ(ν)⌢(k, ∞) ⪯ ν for some k, we know that Φ^A_{i(ν)} is total, so in particular Φ^A_{i(ν)}(j(ν)) ↓, and ν has outcome ϕ^A_{i(ν)}(j(ν)) almost always.

We have seen that there is a low₂ c.e. degree above all K-trivials, but no such low c.e. degree. What happens if the computable enumerability condition is removed? The following is a general theorem of Kučera and Slaman [222] about ideals in the Δ^0_2 degrees, which implies that the K-trivials have a low upper bound.

Theorem 11.11.5 (Kučera and Slaman [222]). The following are equivalent for an ideal I in the Δ^0_2 degrees.

(i) I has a low upper bound.

(ii) I is contained in an ideal generated by uniformly Δ^0_2 sets A_0, A_1, ..., and there is a Δ^0_2 function that dominates every partial function computable by a set in I.

The method of proof is a forcing one akin to the forcing proof of the low basis theorem, with coding steps interleaved; see [222]. A remaining open question is whether the low upper bound for the K-trivials can be made 1-random. (Indeed, it is not even known whether there is a 1-random set that computes all K-trivials but does not compute ∅′ (i.e., a difference random set that computes all K-trivials); the existence of such a set would answer Question 11.8.2.)
11.12 A gap phenomenon for K-triviality

In Theorem 9.15.8, we saw that there is a minimal pair of K-degrees, that is, a pair of non-K-trivial sets A and B such that for all X, if X ≤_K A, B then
X is K-trivial. In this section, we give the original proof of this result, which is due to Csima and Montalbán [83] and involves a lemma of significant independent interest. A reasonable strategy for building such a pair of sets would be to ensure that, for some constant c and all n,

K(A ↾ n) ≤ K(n) + c   or   K(B ↾ n) ≤ K(n) + c,   (11.2)
while preventing A and B from being K-trivial. For any string σ, we know that σ0^ω is K-trivial, so we could try to do the following. We start building A and B (thought of as sequences that we build up over time) by making A look random and adding 0's to B. Then, once we have ensured that A has an initial segment of fairly high K-complexity, we start adding 0's to A to bring down the K-complexity of its initial segments, while still adding 0's to B. Once we have reached an m such that K(A ↾ m) is sufficiently low, we start making B look random while adding 0's to A. Once we have ensured that B has an initial segment of fairly high K-complexity, we start adding 0's to B to bring down the K-complexity of its initial segments, while still adding 0's to A. Once we have reached an n such that K(B ↾ n) is sufficiently low, we go back to the beginning, making A look random while adding 0's to B.

The problem with such a construction is that, although σ0^ω is indeed K-trivial for any σ, the constant of K-triviality depends on σ. Thus, while adding 0's to a previously defined initial segment of A might bring the complexity of longer initial segments down quite a bit, it may never result in an n such that K(A ↾ n) ≤ K(n) + c for a given c fixed ahead of time. The clever solution to this problem, found by Csima and Montalbán [83], is to prove the following surprising "gap result", which shows that we can replace c in (11.2) by a sufficiently slow-growing function of n. Recall that KT(c) denotes the class of sets that are K-trivial via c.

Theorem 11.12.1 (Csima and Montalbán [83]). There is an order f such that a set X is K-trivial iff K(X ↾ n) ≤ K(n) + f(n) for almost all n.⁶

Proof. The "only if" direction is obvious. For the other direction, for each e > 0 we build an unbounded nondecreasing function f_e such that f_e(0) = e − 1 and if K(X ↾ n) ≤ K(n) + f_e(n) for all n, then X ∈ KT(e). Given such functions, let f(n) = min{f_{2e}(n) − e : e > 0}.
It is easy to check that f is nondecreasing and unbounded. Furthermore, if K(X ↾ n) ≤ K(n) + f(n) for almost all n then there is an e > 0 such that K(X ↾ n) ≤ K(n) + f(n) + e for all n, which means that K(X ↾ n) ≤ K(n) + f_{2e}(n) for all n, and hence X ∈ KT(2e).

Fix e > 0. We want to define a sequence 0 = n_0 < n_1 < ··· and let f_e(k) = e + i − 1 whenever k ∈ [n_i, n_{i+1}). First let n_0 = 0. Let n_1 be

⁶One may compare this result with Theorem 11.1.8, but note that in that result, the function g(n) − K(n) is not an order.
11. Randomness-Theoretic Weakness
such that for any set Y ∈ KT(e + 1) \ KT(e), there is an m < n_1 with K(Y ↾ m) > K(m) + e. Such a number must exist because, by Theorem 11.1.3, KT(e + 1) is finite. We similarly choose n_2 so that for any set Y ∈ KT(e + 2) \ KT(e), there is an m < n_2 such that K(Y ↾ m) > K(m) + e.

We now come to the heart of this argument. We choose n_3 so that for any set Y ∈ KT(e + 3) \ KT(e), there is an m < n_3 such that K(Y ↾ m) > K(m) + e, but we also impose an extra condition on n_3. Let Z be a set such that the least m with K(Z ↾ m) > K(m) + e is in [n_1, n_2). Then Z cannot be in KT(e + 1), by the choice of n_1, so we can require that n_3 be such that for each such Z there is an l < n_3 such that K(Z ↾ l) > K(l) + e + 1.

It is important to consider more carefully why such an n_3 exists. Suppose it did not. Then for each l there would be a string σ of length l such that the least m with K(σ ↾ m) > K(m) + e is in [n_1, n_2) and yet K(σ ↾ j) ≤ K(j) + e + 1 for all j ≤ l. Thus, by König's Lemma (that every infinite, finitely branching tree has an infinite path), there would be a set Z such that the least m with K(Z ↾ m) > K(m) + e is in [n_1, n_2) and yet K(Z ↾ l) ≤ K(l) + e + 1 for all l. Such a Z would be in KT(e + 1), contradicting the choice of n_1.

The definition of n_i for i > 3 is analogous. That is, we choose n_i so that

1. for any set Y ∈ KT(e + i) \ KT(e), there is an m < n_i such that K(Y ↾ m) > K(m) + e, and

2. for any Z such that the least m with K(Z ↾ m) > K(m) + e is in [n_{i−2}, n_{i−1}), there is an l < n_i such that K(Z ↾ l) > K(l) + e + i − 2.

The same argument as in the i = 3 case shows that such an n_i exists.

As mentioned above, we let f_e(k) = e + i − 1 whenever k ∈ [n_i, n_{i+1}). Suppose that K(X ↾ n) ≤ K(n) + f_e(n) for all n. We need to show that X ∈ KT(e). Suppose not, and let i be least such that there is an m ∈ [n_i, n_{i+1}) with K(X ↾ m) > K(m) + e. Then, by the choice of n_{i+2}, there is an l < n_{i+2} such that K(X ↾ l) > K(l) + e + i. On the other hand, K(X ↾ l) ≤ K(l) + f_e(l) ≤ K(l) + e + i, which is a contradiction.
Corollary 11.12.2 (Csima and Montalbán [83]). There is a minimal pair of K-degrees.

Proof. Let f be as in Theorem 11.12.1. We build A and B by finite extensions. Let A_0 = B_0 = λ. At stage e + 1 with e even, first define a string Ã_{e+1} extending A_e such that K(Ã_{e+1}) > K(|Ã_{e+1}|) + e. Let m_e = |Ã_{e+1}| − |A_e|. Now let c_e be such that Ã_{e+1}0^ω ∈ KT(c_e), and let n_e be such that f(n_e) > c_e. Let A_{e+1} = Ã_{e+1}0^{n_e} and B_{e+1} = B_e0^{m_e+n_e}. At stage e + 1 with e odd, do the same with the roles of A and B interchanged.

It is straightforward to check that this construction ensures that A and B are not K-trivial, and that min(K(A ↾ n), K(B ↾ n)) ≤ K(n) + f(n) for all n, which implies that if X ≤_K A, B then X is K-trivial.
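To make the shape of the functions in the proof of Theorem 11.12.1 concrete, here is a small Python sketch. The cut points n_i below are hypothetical stand-ins for the ones constructed in the proof, and the minimum defining f is truncated to finitely many e, purely for illustration.

```python
# Sketch of the step functions f_e from the proof of Theorem 11.12.1 and the
# combined function f(n) = min{f_{2e}(n) - e : e > 0}.  The cut points are
# hypothetical, and the minimum is truncated to e <= 5 for illustration.

def make_fe(e, cuts):
    """f_e(k) = e + i - 1 for k in [n_i, n_{i+1}), where cuts = [n_0, n_1, ...]."""
    def fe(k):
        i = sum(1 for n in cuts if n <= k) - 1  # index of the interval containing k
        return e + i - 1
    return fe

cuts = [0, 2, 5, 11, 23, 47, 95, 191]           # n_0 = 0 < n_1 < n_2 < ...

def f(n):
    return min(make_fe(2 * e, cuts)(n) - e for e in range(1, 6))

vals = [f(n) for n in range(100)]
assert vals[0] == 0                                  # f_2(0) - 1 = (2 - 1) - 1 = 0
assert all(a <= b for a, b in zip(vals, vals[1:]))   # f is nondecreasing
```

Each f_e is constant on the intervals [n_i, n_{i+1}) and steps up by one at each cut point, which is why the pointwise minimum of the shifted functions is again nondecreasing and unbounded.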
11.12. A gap phenomenon for K-triviality
Analysis of the above proofs shows that f can be chosen to be Δ⁰₄, and hence so can A and B. It is unknown whether there is a minimal pair of K-degrees of Δ⁰₂ sets. (By Theorem 9.15.8, there is a minimal pair of K-degrees of Σ⁰₂ sets.) More interestingly, it is also unknown whether there is a minimal pair of K-degrees of left-c.e. reals.
12 Lowness and Triviality for Other Randomness Notions
12.1 Schnorr lowness

We now turn to lowness notions for other notions of randomness. We begin with Schnorr randomness. Since there is no universal Schnorr test, it is not clear that the notions of a set A being low for Schnorr randomness (i.e., every Schnorr random set is Schnorr random relative to A) and being low for Schnorr tests (i.e., every Schnorr test relative to A can be covered by an unrelativized Schnorr test¹) should be the same. In fact, this was an open question in Ambos-Spies and Kučera [9]. As we will see, it was solved by Kjos-Hanssen, Nies, and Stephan [207].
12.1.1 Lowness for Schnorr tests

In such cases, it is usually easiest to begin with the test set version. A complete characterization of lowness for Schnorr tests was obtained by Terwijn and Zambella [389], using the notion of computable traceability introduced in Definition 2.23.15. Recall from that definition that a degree a is computably traceable if there is a computable function h such that, for each function f ≤_T a, there is a computable function g satisfying, for all n,

(i) |D_{g(n)}| ≤ h(n) and

¹Another way of thinking of this notion is as lowness for Schnorr null sets: every set that is Schnorr null relative to A is Schnorr null.
R.G. Downey and D. Hirschfeldt, Algorithmic Randomness and Complexity, Theory and Applications of Computability, DOI 10.1007/978-0-387-68441-3_12, © Springer Science+Business Media, LLC 2010
(ii) f(n) ∈ D_{g(n)}.

Computable traces are often represented by a single computable set T = {⟨y, n⟩ : y ∈ D_{g(n)}}. We will sometimes use this method, and write T^[n] for {y : ⟨y, n⟩ ∈ T}.

The following lemma is an analog of Proposition 2.23.12.

Lemma 12.1.1 (Terwijn and Zambella [389]). Let A be computably traceable and let h be a computable order such that h(0) > 0. Then A is computably traceable with bound h.

Proof. We show how to arbitrarily slow down the tracing process. Suppose that A is computably traceable with bound h′, and fix f ≤_T A. Let q be a computable function such that, if we let i_k be the least number such that q(i_k) > k, then h′(i_k) ≤ h(k) for all k. Let {T_n}_{n∈ω} be a computable trace with bound h′ for the function taking i to ⟨f(0), ..., f(q(i) − 1)⟩. Let S_k be the set of all a_k for which T_{i_k} has an element of the form ⟨a_0, ..., a_{q(i_k)−1}⟩. Clearly, {S_n}_{n∈ω} is a computable trace for f, and it is bounded by h′(i_k) ≤ h(k).

It is worth noting that sometimes the exact growth rate of orders does matter, as we will see in Chapter 14.

The notion of computable traceability was inspired by arguments from set theory, particularly Raisonnier's proof in [322] of Shelah's result that the inaccessible cardinal cannot be removed from Solovay's construction in [370] of a model of set theory (without the full axiom of choice but with dependent choice) in which every set of reals is Lebesgue measurable. As we noted in Section 2.23, Terwijn and Zambella [389] observed that the hyperimmune-free degree constructed by Miller and Martin (Theorem 2.17.3) is computably traceable, and that if a degree is computably traceable then it is hyperimmune-free, since computable traceability is actually a uniform version of hyperimmune-freeness.
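The single-set representation of a trace can be illustrated with a short Python sketch. The pairing function and the finite sets below are hypothetical toy data, not anything from the text:

```python
# Encode a trace {D_{g(n)}} as a single set T = {<y,n> : y in D_{g(n)}},
# and recover its columns T^[n] = {y : <y,n> in T}.  The columns are
# hand-picked toy data standing in for the canonical finite sets D_{g(n)}.

def pair(y, n):
    # Cantor pairing function <y, n>
    return (y + n) * (y + n + 1) // 2 + y

columns = {0: {3}, 1: {1, 4}, 2: {0, 2, 5}}     # column n plays the role of D_{g(n)}

T = {pair(y, n) for n, D in columns.items() for y in D}

def column(T, n, bound=1000):
    return {y for y in range(bound) if pair(y, n) in T}

assert all(column(T, n) == D for n, D in columns.items())
assert all(len(column(T, n)) <= n + 1 for n in columns)  # a bound h(n) = n + 1
```

Since the pairing function is a computable bijection, the single set T carries exactly the same information as the sequence of columns, which is why the two presentations of traces are interchangeable.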
The difference between being hyperimmune-free and being computably traceable for a degree a is that for the latter there is a single computable bound h that works for all f ≤_T a, whereas for the former, each f ≤_T a may require a separate computable bound. This difference can be turned into a proof that the two concepts are different. We can derive this result most easily using Theorem 12.1.2 below, though, so we will return to it after we have proved the theorem.

Since all computably traceable degrees are hyperimmune-free, no noncomputable computably traceable degree is Δ⁰₂. In particular, a noncomputable set that is low for 1-randomness is not computably traceable. This fact is particularly interesting in view of the following result.

Theorem 12.1.2 (Terwijn and Zambella [389]). A degree is low for Schnorr tests iff it is computably traceable.
Proof. For the “if” direction, let A be computably traceable. Let {U_n}_{n∈ω} be a Schnorr test relative to A, with μ(U_n) = 2^{−n}. We need to build an unrelativized Schnorr test {V_n}_{n∈ω} with ⋂_n V_n ⊇ ⋂_n U_n. It will be convenient to build the V_n so that μ(V_n) is less than or equal to 2^{−n} and computable (rather than equal to 2^{−n}).

Identifying finite sets of strings with their codes as usual, we have a computable trace T for the finite A-approximations U_{n,s}. (So U_{n,s} ∈ T^[⟨n,s⟩] for all n and s.) By Lemma 12.1.1, we may assume that T is bounded by a sufficiently slow-growing bound h, to be specified below. We will use T to enumerate the V_n. The crucial idea is to make sure that the bulk of U_n is enumerated before we trace it, so that the approximation generated by T is a good one. Thus we speed up the A-enumeration of the U_n so that μ(U_{n,s}) > 2^{−n}(1 − 2^{−s}).

Given T, we define a new trace T′ as follows. Let T′^[⟨n,s⟩] be the set of those D ∈ T^[⟨n,s⟩] such that

1. D is a finite subset of 2^{<ω} (again identifying finite sets with their codes);

2. 2^{−n}(1 − 2^{−s}) ≤ μ(D) ≤ 2^{−n}; and

3. C ⊆ D for some C ∈ T′^[⟨n,s−1⟩] (if s > 0).

Here we have pruned those members of T that are not possible values of U_{n,s}. Note that T′ is still a computable trace that captures the U_{n,s}. We are now ready to define our Schnorr test. Let V_{n,r} = ⋃_{s≤r} ⋃{D : D ∈ T′^[⟨2n,s⟩]} and V_n = ⋃_r V_{n,r}. Then U_{2n} ⊆ V_n, so ⋂_n U_n ⊆ ⋂_n V_n. Using Lemma 12.1.1, we can choose h so that |T′^[⟨2n,s⟩]| is small enough to ensure that μ(V_n) ≤ 2^{−n}. Furthermore,

μ(V_n \ V_{n,r}) ≤ Σ_{s>r} 2^{−2n−s+1} |T′^[⟨2n,s⟩]|,

so again using Lemma 12.1.1, we can choose h to be sufficiently slow growing that μ(V_{n,r}) converges computably to μ(V_n).

The other direction is more difficult. Let A be low for Schnorr tests. Let B_{k,i} = {τ1^k : τ ∈ 2^i}. Note that μ(B_{k,i}) = 2^{−k}. The idea is to use these clopen sets to code an arbitrary function g ≤_T A. To do this, fix such a function and let

U_n = ⋃_{k>n} B_{k,g(k)}.

Clearly, {U_n}_{n∈ω} is a Schnorr test relative to A, and hence is covered by an unrelativized Schnorr test. We will need only one of the levels of that
test, say the fourth one, which is a Σ⁰₁ class V of measure 1/8 containing the intersection of the U_n. For technical reasons that will become clear below, we want to ensure that μ(B_{k,i} \ V) ≠ 2^{−(i+3)} for all k and i. We can do this by adding elements to V while still keeping V a Σ⁰₁ class with computable measure less than 1/4, so we assume we have done so. (The idea is that if we see the difference between B_{k,i} and V_s getting close to 2^{−(i+3)}, then we add enough of B_{k,i} to V to ensure that equality will not happen; we can certainly do this in such a way as to keep the measure computable and below 1/4.)

We first make the simplifying assumption that μ(U_n \ V) = 0 for some n, eliminating this assumption later. Let T be defined by T^[k] = {i : μ(B_{k,i} \ V) < 2^{−(i+3)}}. Clearly g(k) ∈ T^[k] for all k > n, since μ(U_n \ V) = 0, so a finite modification of T traces g. We now must show that there is a computable function h such that T^[k] = D_{h(k)} for all k and that |T^[k]| is bounded by a computable function that does not depend on g. To achieve the first goal, it is enough to show that each T^[k] is finite and that both T and the function k ↦ |T^[k]| are computable. Note that the computability of this function does give us a computable bound on |T^[k]| (and hence is enough to show that A has hyperimmune-free degree), but this bound depends on g, so we will still have to provide a computable bound independent of g.

We begin by showing that T is computable. It is easy to see that T is c.e., so to show that it is computable, it is enough to describe how to enumerate its complement. Suppose i ∉ T^[k]. Let s_0 = 0. Having defined s_j, let ε_j = μ(B_{k,i} \ V_{s_j}) − 2^{−(i+3)}. Note that ε_j > 0, since i ∉ T^[k] and we have built V so that μ(B_{k,i} \ V) ≠ 2^{−(i+3)}. Let s_{j+1} be such that μ(V_{s_{j+1}}) > μ(V) − ε_j/2. Clearly, the ε_j converge to a limit ε, and ε > 0, again because μ(B_{k,i} \ V) ≠ 2^{−(i+3)}. So there is a j such that ε_j/2 < ε_{j+1}. If we enumerate V up to stage s_{j+1}, we know for sure that i ∉ T^[k], since then

μ(B_{k,i} \ V) ≥ μ(B_{k,i} \ V_{s_{j+1}}) − μ(V) + μ(V_{s_{j+1}}) > ε_{j+1} + 2^{−(i+3)} − ε_j/2 > 2^{−(i+3)}.

Thus we can enumerate the complement of T, and hence T is computable.

We now show that each T^[k] is finite and that we can compute |T^[k]| given k. It suffices to show that we can effectively find an i_k such that i ∉ T^[k] for all i > i_k. Find a stage s such that μ(V_s) > μ(V) − 2^{−(k+2)}. Let G be a finite set of generators for V_s and let i_k be larger than k and the length of all strings
in G. Let i > i_k. It is easy to check that μ(B_{k,i} \ V_s) = 2^{−k}(1 − μ(V_s)), so

μ(B_{k,i} \ V) ≥ μ(B_{k,i} \ V_s) + μ(V_s) − μ(V) > 2^{−k}(1 − μ(V_s)) − 2^{−(k+2)} ≥ 2^{−k}(1 − 1/4) − 2^{−(k+2)} > 2^{−(k+2)} > 2^{−(i+3)},

whence i ∉ T^[k].

It remains to show that |T^[k]| is bounded by a computable function that does not depend on g. We claim that |T^[k]| < 2^k·k. Suppose not. Then we can find i_0 < ··· < i_{2^k−1} ∈ T^[k] such that i_{j+1} − i_j ≥ k. Recall that μ(B_{k,i}) = 2^{−k} for all i. It is also easy to see that if j > l then μ(B_{k,i_j} ∩ B_{k,i_l}) = 2^{−2k}, whence μ(B_{k,i_j} \ ⋃_{l<j} B_{k,i_l}) ≥ 2^{−k} − j·2^{−2k}. So

μ(⋃_{i∈T^[k]} B_{k,i}) ≥ μ(⋃_{j<2^k} B_{k,i_j}) ≥ Σ_{j<2^k} (2^{−k} − j·2^{−2k}) > 2^k·2^{−k} − 2^{2k−1}·2^{−2k} = 1/2.

On the other hand, μ(B_{k,i} \ V) < 2^{−(i+3)} for each i ∈ T^[k], so

μ(⋃_{i∈T^[k]} B_{k,i}) ≤ μ(V) + Σ_i 2^{−(i+3)} < 1/4 + 1/4 = 1/2,

which is impossible.

We now show that we can remove the hypothesis that μ(U_n \ V) = 0 for some n, weakening it to the hypothesis that μ_σ(V) < 1/4 and μ_σ(U_n \ V) = 0 for some σ and n, where μ_σ(W) = μ(W ∩ [σ])/μ([σ]). First we show that the proof still works with this weaker hypothesis, and then we show that this hypothesis always holds.

Suppose that μ_σ(V) < 1/4 and μ_σ(U_n \ V) = 0. For a class W, let W^σ = {X : σX ∈ W}. We may assume that g(k) > k for every k, because a trace for the function k ↦ g(k) + k immediately yields a trace for g. Clearly we may also assume that n > |σ|. Let Ṽ = V^σ, and let g̃ be the translation of g defined by g̃(k) = max(g(k) − |σ|, 0). Let Ũ_n be defined as U_n, but with g̃ in place of g. We claim that μ(Ũ_n \ Ṽ) = 0. If i > |σ| then B^σ_{k,i} = B_{k,i−|σ|}. Since g(k) > k and n > |σ|, we have U^σ_n = Ũ_n, so μ(Ũ_n \ Ṽ) = μ_σ(U_n \ V) = 0. This proves the claim. Now, it is clear that μ(Ṽ) is less than 1/4 and computable. So the proof given above is valid when Ṽ and g̃ are substituted for V and g, and ensures the existence of a computable trace for g̃. But from a trace for g̃ we immediately obtain a trace for g.

Finally, suppose that there are no σ and n such that μ_σ(U_n \ V) = 0 and μ_σ(V) < 1/4. We obtain a contradiction by constructing a set in ⋂_n U_n \ V. Let σ_0 be the empty string, and assume that we have defined σ_n so that μ_{σ_n}(V) < 1/4. By hypothesis, μ_{σ_n}(U_n \ V) > 0, so there is a τ ⪰ σ_n with [τ] ⊆ U_n and μ([τ] \ V) > 0. In particular, μ_τ(V) < 1. By the Lebesgue Density Theorem 1.2.3, applied to the complement of V, there is a σ_{n+1} ⪰ τ such that μ_{σ_{n+1}}(V) < 1/4. Let α = ⋃_n σ_n. Since [σ_{n+1}] ⊆ U_n for all n, we have α ∈ ⋂_n U_n. But [σ_n] ⊈ V for every n, so, since V is open, α ∉ V. This contradiction completes the proof.
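The clopen sets B_{k,i} in the proof of Theorem 12.1.2 lend themselves to brute-force numerical checks. The following Python sketch (an illustration only, for small parameters) verifies that μ(B_{k,i}) = 2^{−k} and that, for k = 2, the union of 2^k of these sets with index gaps of size k has measure greater than 1/2, as in the counting argument bounding |T^[k]|:

```python
# Brute-force checks for the clopen sets B_{k,i}: B_{k,i} consists of the
# sequences with ones in positions i, ..., i+k-1 (the strings tau 1^k with
# |tau| = i, viewed as a clopen subset of Cantor space).

from itertools import product

def mu_B(k, i):
    # count the length-(i+k) strings in B_{k,i}; each has measure 2^-(i+k)
    hits = sum(1 for bits in product('01', repeat=i + k)
               if all(b == '1' for b in bits[i:i + k]))
    return hits / 2 ** (i + k)

assert mu_B(3, 4) == 2 ** -3
assert mu_B(1, 6) == 2 ** -1

def union_measure(k, indices):
    # exact measure of the union of the B_{k,i} for i in indices,
    # computed at string length max(indices) + k
    length = max(indices) + k
    count = sum(1 for x in range(2 ** length)
                if any(format(x, '0%db' % length)[i:i + k] == '1' * k
                       for i in indices))
    return count / 2 ** length

k = 2
m = union_measure(k, [j * k for j in range(2 ** k)])   # gaps exactly k
assert m > 0.5
assert abs(m - (1 - (1 - 2 ** -k) ** (2 ** k))) < 1e-12  # disjoint 1-blocks are independent
```

Because the blocks of required ones occupy disjoint positions when the gaps are at least k, the sets behave like independent events, which is exactly what makes the pairwise intersections have measure 2^{−2k}.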
As mentioned above, the following result has an easy proof as an application of the above theorem.

Theorem 12.1.3 (Terwijn and Zambella [389]). There are continuum many degrees that are hyperimmune-free but not computably traceable.

Proof. As noted in Proposition 8.1.3, it follows from the hyperimmune-free basis theorem that there are 1-random (and hence Schnorr random) sets of hyperimmune-free degree. As observed after the proof of Theorem 2.19.11, it is not hard to adjust the construction in that proof to show that a Π⁰₁ class with no computable members has continuum many members of hyperimmune-free degree, so there are in fact continuum many 1-random sets of hyperimmune-free degree. Let A be such a set. We can A-computably construct a Schnorr test covering A, but that test cannot be covered by an unrelativized Schnorr test. Thus A is not low for Schnorr tests, and hence is not computably traceable.
12.1.2 Lowness for Schnorr randomness

We now turn to the apparently weaker notion of lowness for Schnorr randomness. It is clear that if A is low for Schnorr tests then A is low for Schnorr randomness. Kjos-Hanssen, Nies, and Stephan [207] showed that the converse also holds, via some intermediate results of independent interest.

Theorem 12.1.4 (Kjos-Hanssen, Nies, and Stephan [207]). A set A is c.e. traceable iff every Schnorr null set relative to A is Martin-Löf null.

Proof. First suppose that every Schnorr null set relative to A is Martin-Löf null. Follow the proof of Theorem 12.1.2 until the definition of V. Now we still have a Σ⁰₁ class V such that ⋂_n U_n ⊆ V and μ(V) < 1/4, but the measure of V is no longer computable. We can now continue to follow the proof of Theorem 12.1.2, with straightforward modifications.

For the other direction, suppose that A is c.e. traceable. Let {U_n}_{n∈ω} be a Schnorr test relative to A with μ(U_n) = 2^{−n}. We may assume that U_0 ⊃ U_1 ⊃ ···. Let D_0, D_1, ... be a canonical listing of the finite sets of strings. Let f ≤_T A be such that ⋃_s D_{f(⟨n,s⟩)} = U_n for all n, and for all n and s, we have μ(D_{f(⟨n,s⟩)}) > 2^{−n}(1 − 2^{−s}) and D_{f(⟨n,s+1⟩)} ⊇ D_{f(⟨n,s⟩)}. By Lemma 2.23.12, f has a c.e. trace T_0, T_1, ... such that |T_i| ≤ i for all i > 0. We may also assume that if e ∈ T_{⟨n,s⟩} then 2^{−n}(1 − 2^{−s}) < μ(D_e) ≤ 2^{−n} and, if s > 0, then D_e ⊇ D_i for some i ∈ T_{⟨n,s−1⟩}.

Let V^n = ⋃{D_e : ∃s (e ∈ T_{⟨n,s⟩})}. Then it is easy to see that μ(V^n) ≤ 2^{−n}|T_{⟨n,0⟩}| + Σ_s 2^{−s−n}|T_{⟨n,s⟩}|. By the bound on the sizes of the T_{⟨n,s⟩}, it follows that we can compute a function
g such that μ(V^{g(n)}) ≤ 2^{−n} for all n. Let V_n = V^{g(n)}. Then the V_n form a Martin-Löf test, and clearly ⋂_i U_i ⊆ ⋂_i V_i.

Lemma 12.1.5 (Kjos-Hanssen, Nies, and Stephan [207]). If A is of hyperimmune-free degree and c.e. traceable, then A is computably traceable.

Proof. Let g ≤_T A and let T be a c.e. trace for g. Let f(n) = μs (g(n) ∈ T^[n][s]). Then f ≤_T A, so f is majorized by some computable function h. Let S^[n] = T^[n][h(n)]. Then S is a computable trace for g with the same bound as T.

So to show that lowness for Schnorr tests coincides with lowness for Schnorr randomness, it is enough to show that every set that is low for Schnorr randomness has hyperimmune-free degree. This final step was actually the first result proved about lowness for Schnorr randomness, and is due to Bedregal and Nies [34].

Theorem 12.1.6 (Bedregal and Nies [34]). If A is either low for computable randomness or low for Schnorr randomness, then A is of hyperimmune-free degree.

Proof. Suppose that A is of hyperimmune degree, so that there is a function g ≤_T A such that, for any computable function h, we have h(x) < g(x) for infinitely many x. We build a computably random set R and an A-computable martingale F that succeeds on R in the Schnorr sense.

Let M_e be the eth partial computable martingale (as defined in Definition 7.4.3). Let T = {e : M_e is total}. For certain finite sets D, we will define strings σ_D such that if D ⊂ E then σ_D ≺ σ_E. We will ensure that for each e ∈ T there is a c such that if e ∈ D ⊂ E then M_e(σ) < c for all σ with σ_D ⪯ σ ⪯ σ_E. We will also ensure that σ_D is defined for all finite D ⊂ T. These two conditions ensure that R = ⋃_{D⊂T} σ_D is computably random. We will also ensure that A can know enough about this construction to compute a martingale F that succeeds quickly on R. To help us with that part of the construction, we will impose a lower bound on the length of σ_D, the reason for which will become clear in the construction of F.
We define the σ_D by induction, along with auxiliary constants p_D and partial computable martingales M_D. We ensure that if σ_D is defined then M_D(σ_D)↓ < 2, and that this convergence takes place in at most g(|σ_D|) many steps. We begin with σ_∅ = λ. Suppose that σ_E has been defined to have the above properties, and that D = E ∪ {i}, where i > max E. Let p_D = 2^{−|σ_E|−1}(2 − M_E(σ_E)) and M_D = M_E + p_D·M_i. Since M_i is a martingale on its domain, M_i(τ) < 2^{|τ|} for all τ, so M_D(σ_E) < 2 if it is defined. Let e be the index of D in the canonical listing of finite sets. Look for a σ ⪰ σ_E with |σ| > 4e such that M_D does not increase between σ_E and σ, and M_D(σ)↓ in at most g(|σ|) many steps. If such a string is found, then let σ_D = σ.
We claim that if D ⊂ T then such a σ must exist. Let f(m) = μs [∀τ ∈ 2^m (M_D(τ)[s]↓)]. Then f is a total computable function, and hence there is an m > max(|σ_E|, 4e) such that g(m) > f(m). Clearly, there is a σ ∈ 2^m extending σ_E such that M_D does not increase between σ_E and σ, and by the choice of m, we have that M_D(σ)↓ in at most g(|σ|) many steps. Thus σ_D is defined.

So we can define R = ⋃_{D⊂T} σ_D. To see that R is computably random, let M_e be total and let D = T ∩ [0, e]. Suppose that D ⊆ E and E′ = E ∪ {i} ⊂ T, where i > max E. If σ_E ⪯ σ ⪯ σ_{E′}, then p_D·M_e(σ) ≤ M_{E′}(σ) ≤ M_{E′}(σ_E) < 2. Thus M_e(σ) < 2/p_D for all sufficiently long σ ≺ R.

Thus we are left with showing that R is not Schnorr random relative to A. We build a martingale F ≤_T A for which there are infinitely many σ ≺ R with F(σ) ≥ 2^{|σ|/4−1}. For each finite set D, define a martingale F_D as follows. Let e be the index of D in the canonical listing of finite sets. If σ_D is undefined then let F_D(τ) = 2^{−e} for all τ. Otherwise, let σ = σ_D ↾ ⌈|σ_D|/2⌉ and define

F_D(τ) = 2^{−e} if τ ⪯ σ,
F_D(τ) = 2^{−e+|τ|−|σ|} if σ ≺ τ ⪯ σ_D,
F_D(τ) = 2^{−e+|σ_D|−|σ|} if τ ⪰ σ_D,
F_D(τ) = 0 if τ ≻ σ ∧ τ | σ_D.

That is, F_D bets evenly except following σ, where it concentrates its capital along σ_D. Let F = Σ_D F_D. Clearly F(λ) is finite, and hence F is a martingale. For each D ⊆ T with canonical index e, we have F(σ_D) ≥ F_D(σ_D) = 2^{−e+|σ_D|−⌈|σ_D|/2⌉} > 2^{−|σ_D|/4+⌊|σ_D|/2⌋} ≥ 2^{|σ_D|/4−1}.

Thus we are left with showing that F ≤_T A. It is clearly enough to show that the F_D are uniformly A-computable. Given τ and D, we can run through the inductive definition of σ_D to check whether |σ_D| ≤ 2|τ|. (That is, while we cannot know whether σ_D is defined, we can check whether some σ of length at most 2|τ| is defined to be σ_D.) If not, then F_D(τ) = 2^{−e}, where e is the canonical index of D. Otherwise, we can determine the value of F_D(τ) from the definitions of σ_D and F_D.

As discussed above, we now have the following.

Corollary 12.1.7 (Kjos-Hanssen, Nies, and Stephan [207]).
A is low for Schnorr randomness iff A is computably traceable iff A is low for Schnorr tests.
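The shape of the martingales F_D used in the proof of Theorem 12.1.6 can be checked concretely. In the Python sketch below, the index e and the string σ_D are hypothetical sample values, and the martingale bets evenly on all strings off σ, as the proof intends; the code verifies the fairness condition F_D(τ) = (F_D(τ0) + F_D(τ1))/2 on all short strings:

```python
# Sketch of the martingale F_D from the proof of Theorem 12.1.6.  The index
# e = 3 and the string sigma_D = '110100' are hypothetical; sigma is sigma_D
# cut at ceil(|sigma_D|/2).  Off sigma the martingale bets evenly.

from itertools import product

def make_FD(e, sigma_D):
    sigma = sigma_D[: -(-len(sigma_D) // 2)]    # first ceil(|sigma_D|/2) bits
    def FD(tau):
        if not tau.startswith(sigma):
            return 2.0 ** -e                    # even bets off (or before) sigma
        if sigma_D.startswith(tau):
            return 2.0 ** (-e + len(tau) - len(sigma))      # doubling along sigma_D
        if tau.startswith(sigma_D):
            return 2.0 ** (-e + len(sigma_D) - len(sigma))  # capital held flat
        return 0.0                              # extends sigma but leaves sigma_D
    return FD

F = make_FD(3, '110100')
for L in range(7):
    for bits in product('01', repeat=L):
        tau = ''.join(bits)
        assert F(tau) == (F(tau + '0') + F(tau + '1')) / 2  # martingale fairness
assert F('110100') == 2.0 ** (-3 + 6 - 3)       # the capital reached at sigma_D
```

The doubling between σ and σ_D is exactly what lets the capital grow roughly like 2^{|σ_D|/2} while starting from only 2^{−e}, which drives the lower bound F(σ_D) ≥ 2^{|σ_D|/4−1} in the proof.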
12.1.3 Lowness for computable measure machines

Recall the characterization of Schnorr randomness in terms of computable measure machines presented in Section 7.1.3: a prefix-free machine is a computable measure machine if the measure of its domain is computable,
and Downey and Griffiths [109] showed that A is Schnorr random iff for every computable measure machine M we have K_M(A ↾ n) ≥ n − O(1). We can relativize this notion, and say that a prefix-free oracle machine with oracle A is an A-computable measure machine if the measure of its domain is A-computable.

There is a potential worry here, namely the distinction between prefix-free oracle machines M such that μ(dom M^A) is A-computable and those such that μ(dom M^X) is X-computable for all oracles X. However, we need not worry about this distinction, because given a prefix-free oracle machine M such that μ(dom M^A) is A-computable, we can define a prefix-free oracle machine N such that N^A = M^A and μ(dom N^X) is X-computable for all oracles X. To define N, let e be such that, thinking of Φ^A_e as a function from ℕ to ℚ, we have Φ^A_e(t) ≤ μ(dom M^A) ≤ Φ^A_e(t) + 2^{−t} for all t. On oracle X, the machine N^X is defined as follows. At stage s, let q = Φ^X_e(t)[s] for the largest t ≤ s such that Φ^X_e(t)[s]↓. (If there is no such t then proceed to the next stage.) Search for a stage u such that μ(dom M^X[u]) ≥ q, and ensure that N^X agrees with M^X on dom M^X[u]. If there is no such u, then the definition of N^X never leaves stage s. If at any later stage v we find that μ(dom M^X[v]) > q + 2^{−t}, then immediately stop defining N^X on new strings. It is easy to check that for all X, either dom N^X is finite, or Φ^X_e is total and 0 ≤ μ(dom N^X) − Φ^X_e(t) ≤ 2^{−t} for each t, whence μ(dom N^X) is X-computable. It is also easy to check that N^A = M^A.

We can define an analog of lowness for K for computable measure machines.

Definition 12.1.8 (Downey, Greenberg, Mihailović, and Nies [107]). A set A is low for computable measure machines if for each A-computable measure machine M there is a computable measure machine N such that K_N(σ) ≤ K_M(σ) + O(1).

In Chapter 11 we saw that lowness for 1-randomness and lowness for K coincide. It is natural to ask whether the same is true for Schnorr randomness.
That is, are lowness for Schnorr randomness and lowness for computable measure machines the same? Downey, Greenberg, Mihailović, and Nies [107] have given a positive answer to this question.

By the relativized version of Theorem 7.1.15, a set X is Schnorr random relative to A iff K_M(X ↾ n) ≥ n − O(1) for all A-computable measure machines M. Thus, if a set is low for computable measure machines then it is low for Schnorr randomness. The converse follows from the following result, together with Corollary 12.1.7.

Theorem 12.1.9 (Downey, Greenberg, Mihailović, and Nies [107]). A set is low for computable measure machines iff it is computably traceable.

Proof. The “only if” direction follows from the fact that lowness for computable measure machines implies lowness for Schnorr randomness, together with Corollary 12.1.7. Now let A be computably traceable and let
M be an A-computable measure machine. We need to define a computable measure machine N such that K_N(σ) ≤ K_M(σ) + O(1).

Let D_0, D_1, ... be a canonical list of the finite subsets of 2^{<ω} × 2^{<ω}. Recall that the domain of D_i is the set of all σ such that (σ, τ) ∈ D_i for some τ. Let t_n be the least t such that μ(dom M − dom M[t]) < 2^{−2(n+1)}. Let G_s be the graph of M[s], that is, the set of all (σ, τ) such that M(σ)[s] = τ. Let c_n be such that D_{c_0} = G_{t_0} and D_{c_{n+1}} = G_{t_{n+1}} \ G_{t_n}. Note that μ(dom D_{c_{n+1}}) < 2^{−2(n+1)}.

Let F_0, F_1, ... be a computable sequence of finite sets such that for each n we have |F_n| ≤ 2^n and c_n ∈ F_n. Such a sequence exists because the function n ↦ c_n is A-computable and A is computably traceable. By removing elements if necessary, we may assume that for each c ∈ F_n, the domain of D_c is prefix-free and μ(dom D_c) ≤ 2^{−2n}.

Let L = {(|σ| + 1, τ) : ∃n ∃c ∈ F_n ((σ, τ) ∈ D_c)}. This set is c.e., and its weight is

Σ_n Σ_{c∈F_n} μ(dom D_c)/2 ≤ Σ_n |F_n|·2^{−(2n+1)} ≤ Σ_n 2^n·2^{−(2n+1)} = 1.

Thus L is a KC set. Furthermore, the weight of L is computable, since we can approximate it to within 2^{−m} by Σ_{n≤m} Σ_{c∈F_n} μ(dom D_c)/2. Now the KC Theorem gives us a prefix-free machine N such that for each request (d, τ) in L, there is a ν ∈ 2^d such that N(ν) = τ. In particular, if M(σ) = τ then (σ, τ) ∈ D_{c_n} for some n, and hence there is a ν such that |ν| = |σ| + 1 and N(ν) = τ. Furthermore, μ(dom N) is equal to the weight of L, and hence is computable. So N is a computable measure machine and K_N(σ) ≤ K_M(σ) + O(1).
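The geometric bound on the weight of the KC set L can be checked numerically; the following one-liner is a pure-arithmetic illustration of the sum appearing in the proof:

```python
# Numerical check of the weight bound in the proof of Theorem 12.1.9:
# sum_n |F_n| * 2^{-(2n+1)} <= sum_n 2^n * 2^{-(2n+1)} = sum_n 2^{-(n+1)} = 1.

total = sum(2 ** n * 2 ** -(2 * n + 1) for n in range(60))
assert abs(total - 1.0) < 1e-12
assert all(2 ** n * 2 ** -(2 * n + 1) == 2 ** -(n + 1) for n in range(20))
```

The point is simply that the size bound 2^n on F_n is exactly compensated by the measure bound 2^{−2n} on the traced graph fragments, leaving a convergent geometric series.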
Thus L is a KC set. Furthermore, the weight of L is computable, since we Dc ) . Now the can approximate it to within 2−m by nm c∈Fn μ(dom 2 KC Theorem gives us a prefix-free machine N such that for each request (d, σ) in L, there is a ν ∈ 2d such that N (ν) = τ . In particular, if M (σ) = τ then (σ, τ ) ∈ Dcn for some n, and hence there is a ν such that |ν| = |σ| + 1 and N (ν) = τ . Furthermore, μ(dom N ) is equal to the weight of L, and hence is computable. So N is a computable measure machine and KN (σ) KM (σ) + O(1). Corollary 12.1.10 (Downey, Greenberg, Mihailovi´c, and Nies [107]). A set is low for computable measure machines iff it is low for Schnorr randomness. Nies [private communication to Downey] has observed that essentially the same proof yields the following. Corollary 12.1.11 (Nies [private communication]). A is c.e. traceable iff for all A-computable measure machines M and all σ, we have KM (σ) K(σ) − O(1). Proof Sketch. If A is c.e. traceable, then A-computable functions such as KM for an A-computable measure machine M can be traced arbitrarily slowly, so that we have few enough possibilities and hence can enumerate possible values of K(σ) based on apparent KM (σ). Conversely, if KM (σ) K(σ) − O(1) for all A-computable measure machines M , then given an A-computable function f , we can look at the particular A-computable measure machine M defined by M (1n 0) = f (n). Since KM (σ) K(σ) − O(1), for each n we can compute a weak index for Tn = {σ : K(σ) 2n}, and f (n) ∈ Tn for almost all n.
12.2 Schnorr triviality

In Section 9.17, we defined the notion of Schnorr reducibility: A ≤_Sch B if for each computable measure machine M there is a computable measure machine N such that K_N(A ↾ n) ≤ K_M(B ↾ n) + O(1). Naturally, this notion gives rise to a notion of triviality.

Definition 12.2.1 (Downey and Griffiths [109]). A is Schnorr trivial if A ≤_Sch ∅.

The first construction of a noncomputable Schnorr trivial set was by Downey and Griffiths [109]. As we will see, Schnorr trivial sets behave quite differently from both K-trivial sets and sets that are low for Schnorr randomness. However, one implication does hold.

Theorem 12.2.2 (Franklin [151, 153]). If A is low for Schnorr randomness then A is Schnorr trivial.

Proof. This proof is from Downey, Greenberg, Mihailović, and Nies [107]. Let A be low for Schnorr randomness. Then A is low for computable measure machines, by Corollary 12.1.10. Let M be a computable measure machine. Let L be the A-computable measure machine defined by letting L(σ) = A ↾ n whenever M(σ) = 0^n. Then there is a computable measure machine N such that K_N(A ↾ n) ≤ K_L(A ↾ n) + O(1) ≤ K_M(0^n) + O(1).
12.2.1 Degrees of Schnorr trivial sets

Downey, Griffiths, and LaForte [110] and Franklin [151, 152, 153] began systematic investigations of the concept of Schnorr triviality. Downey, Griffiths, and LaForte proved the following result, which clearly highlights the differences between Schnorr triviality on the one hand and K-triviality or lowness for Schnorr randomness on the other.

Theorem 12.2.3 (Downey, Griffiths, and LaForte [110]). There is a Turing complete Schnorr trivial c.e. set.

We do not give the proof of Theorem 12.2.3 here, as it will follow from Corollary 12.2.16 below. The following result shows that Schnorr trivial sets behave rather strangely with respect to Turing reducibility. Later we will explore the question of whether Turing reducibility is the correct notion to study in this context.

Theorem 12.2.4 (Franklin [151, 153]). Every high degree contains a Schnorr trivial set. In other words, if a′ ≥ 0′′ then a contains a Schnorr trivial set.²

²Using a forcing construction, Franklin [155] showed that, in fact, every Turing degree greater than or equal to a contains a Schnorr trivial set iff a′ ≥ 0′′.
Because both computable and Turing complete Schnorr trivial c.e. sets exist, one might wonder whether every c.e. degree contains a Schnorr trivial set. The following theorem yields a negative answer to this question.

Theorem 12.2.5 (Downey, Griffiths, and LaForte [110]). There is a c.e. degree b such that for each B ∈ b there is a computable measure machine N with ∀c ∃n (K(B ↾ n) > K_N(n) + c). Hence b contains neither Schnorr trivial nor K-trivial sets.

Proof. We build a c.e. set A and, for each pair of functionals Φ and Ψ, a computable measure machine N_{Φ,Ψ}, defined using the KC Theorem, to satisfy the following requirements:

R_{Φ,Ψ}: Ψ^{Φ^A} = A ⇒ ∀c ∃n (K(Φ^A ↾ n) > K_{N_{Φ,Ψ}}(n) + c).

Each such requirement has subrequirements

S_{Φ,Ψ,i}: Ψ^{Φ^A} = A ⇒ ∃n (K(Φ^A ↾ n) > K_{N_{Φ,Ψ}}(n) + i).

The strategy for R_{Φ,Ψ} has substrategies for its subrequirements S_{Φ,Ψ,i}, which are allowed to act only on a sequence of stages at which Ψ^{Φ^A} = A appears more and more likely to be the case. Each such substrategy has a large number 2^k associated with it and picks a sequence of witnesses x_0, ..., x_{2^k−1} such that ψ^{Φ^A}(x_i) < x_{i+1} for every i < 2^k − 1 (where, as usual, ψ is the use of Ψ). (Of course, what we really mean here is that ψ^{Φ^A}(x_i)[s] < x_{i+1} at the stage s at which we pick our witnesses, but since we control A, when thinking of a single strategy in isolation, we may assume that these uses do not change unless the strategy itself causes them to change.) Once these witnesses have been chosen, the strategy enumerates (k − i − 1, ψ^{Φ^A}(x_{2^k−1})) into our KC set for N_{Φ,Ψ}. If there are a later stage s and a σ ∈ 2

We assign requirements to the nodes of the tree 2^{<ω} of strategies so that, where f is the true path of the construction (defined below), the following conditions hold.

1. Each requirement of the form R_{Φ,Ψ} is assigned to a unique node on f.

2. If R_{Φ,Ψ} is assigned to a node β and β0 ≺ f, then for each i there is a unique α such that β0 ⪯ α ≺ f and S_{Φ,Ψ,i} is assigned to α. We say that β is α's coordinator.

3. If α is assigned some S_{Φ,Ψ,i}, then α has a coordinator. (In other words, there is a unique β such that β0 ⪯ α and β is assigned R_{Φ,Ψ}.)

It is not difficult to build an assignment satisfying these conditions. We can use a list function L, defined recursively on the nodes of 2^{<ω} and the natural numbers. Let L(λ, ⟨e, i⟩) = R_{Φ_e,Φ_i}. For each σ and n, if L(σ, 0) = R_{Φ,Ψ}, then let L(σ1, n) = L(σ, n + 1), L(σ0, 2n) = L(σ, n + 1), and L(σ0, 2n + 1) = S_{Φ,Ψ,n}. Otherwise, let L(σ0, n) = L(σ1, n) = L(σ, n + 1). Each σ has requirement L(σ, 0) assigned to it.

The S-strategies have no outcomes, so we will arbitrarily give them outcome 0 at all stages. A node is initialized by having all its associated parameters undefined and all its associated sets set to ∅. A node α with a requirement R_{Φ,Ψ} assigned to it has a machine N_α assigned to it that is built by enumerating a KC set L_α. A node α with a requirement S_{Φ,Ψ,i} assigned to it has associated with it a parameter k_α[s] and witnesses x_α(0)[s], ..., x_α(2^{k_α+i})[s].

Construction.

Stage 0. Initialize all nodes in 2^{<ω}.

Stage s + 1. We define an approximation f[s] to the true path, of length at most s, and allow each node α ⪯ f[s] to act. For such an α, we call s an α-stage. Let n = |α|. Let s⁻ be the most recent α-stage or the most recent stage at which α was initialized, whichever is greater.

Case 1. Suppose that α has R_{Φ,Ψ} assigned to it. Define the length-of-agreement function
l_α[s] = max{n : ∀m < n (Ψ^{Φ^A}(m)[s] = A(m)[s])}.

Let s_0 be the stage at which α was last initialized. A stage s is α-expansionary if l_α[s] > max{l_α[t] : s_0 < t < s and t is an α-stage}. For each γ with a requirement S_{Φ,Ψ,i} assigned to it whose coordinator is α, if x_γ(j)[s] ↓ for all j ≤ 2^{k_γ[s]+i} and l_α[s] > x_γ(2^{k_γ+i})[s] but l_α[s⁻] ≤ x_γ(2^{k_γ+i})[s], then enumerate (k_γ[s], ψ^{Φ^A}(x_γ(2^{k_γ+i}))[s]) into L_α. If s is not α-expansionary, then let f(n)[s] = 1, so that α1 is accessible at stage s + 1, and initialize all nodes β such that α1
3. If there is a σ ∈ 2
If any of these cases obtains, immediately end stage s + 1 and initialize all γ >L α. Otherwise, let f (n)[s] = 0, so that α0 is accessible at stage s + 1, and initialize all nodes β such that α0
and x_α(2^{k_α+i}) will be put into A. This action forces Φ^A ↾ n to change, without violating any of the other uses relevant to α, since x_α(2^{k_α+i}) was chosen to be a fresh large number. Now, eventually, case 3 will obtain again, but now for a different σ, and x_α(2^{k_α+i} − 1) will be put into A. This situation will continue to occur until all witnesses have entered A. At that point, there will be at least 2^{k_α+i} many U-programs of length less than k_α + i, which is a contradiction. The last two lemmas establish the theorem.
12.2.2 Schnorr triviality and strong reducibilities

In contrast to Theorem 12.2.3, we have the following result. Downey, Griffiths, and LaForte [110] stated it for left-c.e. reals, but the proof works more generally.

Theorem 12.2.9 (Downey, Griffiths, and LaForte [110]). If A is Δ^0_2 and Schnorr trivial then it is wtt-incomplete.

Proof. The proof is similar to that of Theorem 11.3.2. Let A be Δ^0_2 and wtt-complete. We construct a c.e. set B that forces the approximation to A to change too often for A to be Schnorr trivial. As in the proof of Theorem 11.3.2, by the recursion theorem, we may assume that a wtt-reduction Γ^A = B is given in advance. We also build a computable measure machine M using the KC Theorem to satisfy the requirements

R_e : ∃n (K(A ↾ n) > K_M(n) + e).

These requirements clearly suffice, since if A is Schnorr trivial and M is a computable measure machine, then K(A ↾ n) ≤ K_M(n) + O(1).

The main difference between this proof and that of Theorem 11.3.2 is that here we need to work for all e, rather than for a single one given by the recursion theorem, because even if A is Schnorr trivial, there need be no effective way to pass from a computable measure machine M to a computable measure machine N and a constant c such that K_N(A ↾ n) ≤ K_M(n) + c for all n.

The basic strategy for satisfying a single R_e is the same as in the proof of Theorem 11.3.2 (except that of course we can no longer use the request (0, n) as in that proof, but must use some more modest request, which we can compensate for by increasing the number of followers k). We put together the strategies for different R_e using a finite injury construction.

Associated with each R_e will be followers m^e_0, …, m^e_{k_e} and a number n_e greater than γ(m^e_j) for all j ≤ k_e. Initially these parameters are undefined. We assume we have sped up our stages so that at stage s, we have Γ^{A_s}(x)[s] ↓ = B_s(x) for all numbers x mentioned in the construction so far.

At stage s, proceed as follows.
First, choose the least e (if any) such that k_e is defined, K_s(A_s ↾ n_e) < log k_e, and there is a least j such that m^e_j is not yet in B. Put m^e_j into B and undefine all parameters associated with
R_i for i > e. Next, let e be least such that k_e is not defined. Let k_e be a fresh large number. Let m^e_0, …, m^e_{k_e} be fresh large numbers and let n_e be larger than γ(m^e_j) for all j ≤ k_e. Enumerate a KC request ((log k_e) − e, n_e) for M.

Notice first that M is a computable measure machine, since if a request of the form ((log k_e) − e, n_e) is made at a stage s then (log k_e) − e > s.

To show that R_e is satisfied, let s be a stage by which all R_i for i < e have stopped changing B and M, and such that k_e becomes defined at stage s. This value of k_e is the final one, and associated with it are final values of m^e_0, …, m^e_{k_e} and n_e. The construction ensures that K_M(n_e) ≤ (log k_e) − e. Assume for a contradiction that K(A ↾ n_e) < K_M(n_e) + e ≤ log k_e. Then there are stages s < s_0 < · · · < s_{k_e} such that K_{s_j}(A_{s_j} ↾ n_e) < log k_e. At stage s_j, we put m^e_j into B. By the choice of n_e, it must therefore be the case that A_{s_j} ↾ n_e ≠ A_{s_i} ↾ n_e for i ≠ j. Thus there are more than k_e many strings with K-complexity less than log k_e, which is impossible. So we have a contradiction, and hence R_e is satisfied.

In Chapter 7, we saw that Schnorr reducibility is naturally related to truth table reducibility. This relationship extends to Schnorr triviality.

Theorem 12.2.10 (Downey, Griffiths, and LaForte [110]). If A is Schnorr trivial and B ≤_tt A, then B is Schnorr trivial.

Proof. Let B = Γ^A be a tt-reduction with use bounded by the strictly increasing computable function γ. Let M be a computable measure machine. We must build a computable measure machine N such that K_N(B ↾ n) ≤ K_M(n) + O(1). It is easy to define a computable measure machine M̂ such that K_{M̂}(γ(n)) ≤ K_M(n) for all n. Since A is Schnorr trivial, there is a computable measure machine M_A such that K_{M_A}(A ↾ γ(n)) ≤ K_{M̂}(γ(n)) + O(1) ≤ K_M(n) + O(1). If M_A(σ) = τ, then let N(σ) = Γ^{τ ↾ γ(j)} ↾ j for the largest j such that γ(j) ≤ |τ|. Then, if M_A(σ) = A ↾ γ(n), we have N(σ) = Γ^{A ↾ γ(n)} ↾ n = B ↾ n, so K_N(B ↾ n) ≤ K_{M_A}(A ↾ γ(n)) ≤ K_M(n) + O(1).
Since Γ is total on all oracles, μ(dom N ) = μ(dom MA ), so N is a computable measure machine. Given the above result, to show that the Schnorr trivials form an ideal in the tt-degrees it is enough to show that the join of two Schnorr trivial sets is still Schnorr trivial. We show that this is the case in the following section, where we obtain a natural characterization of Schnorr triviality in terms of initial segment complexity.
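The composition step in the proof of Theorem 12.2.10 — feeding each output of M_A through the reduction Γ — is effective and can be sketched on finite data. Everything here (the dictionary encoding of a machine, the toy reduction, the names) is our illustration, not the book's notation:

```python
def push_through_tt(M_A, Gamma, gamma):
    """Sketch of the machine N from the proof of Theorem 12.2.10.
    M_A: a finite dict sigma -> tau, tau a candidate initial segment of A.
    Gamma(tau, k): the tt-reduction, computing B(k) from any oracle prefix
    tau of length >= gamma(k).  gamma: strictly increasing use bound.
    N(sigma) = Gamma^{tau | gamma(j)} | j for the largest j with gamma(j) <= |tau|."""
    N = {}
    for sigma, tau in M_A.items():
        j = 0
        while gamma(j) <= len(tau):
            j += 1
        j -= 1                      # largest j with gamma(j) <= |tau|
        if j >= 0:
            # bits k < j each have use gamma(k) < gamma(j) <= |tau|
            N[sigma] = "".join(str(Gamma(tau, k)) for k in range(j))
    return N
```

If M_A(σ) is exactly A ↾ γ(n), then the largest such j is n and N(σ) comes out as B ↾ n, which is the point of the construction.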
12.2.3 Characterizing Schnorr triviality As we saw in Section 12.1, a set is low for Schnorr randomness iff it is computably traceable, and hence all such sets have hyperimmune-free degree.
Thus, results such as Theorem 12.2.3 show that the beautiful coincidence between K-triviality and lowness for 1-randomness, which is one of the arguments for the naturality of the notion of K-triviality, does not hold in the case of Schnorr randomness. However, Franklin and Stephan [158] obtained a rather satisfying characterization of Schnorr triviality in terms of relativized truth tables and initial segment complexity, which argues for the naturality of this notion. Their key realization is that to relativize weak notions of randomness like Schnorr or computable randomness, Turing reducibility may not be the correct notion, and should be replaced by truth table reducibility. That is, while we have no hope of showing that Schnorr triviality coincides with lowness for Schnorr randomness, as we did for K-triviality, there is some hope for a characterization along such lines if we restrict what we mean by relativization.

We begin by recalling Theorem 7.1.8, which states that the following are equivalent.

(i) X is not Schnorr random.

(ii) There is a computable martingale d and a computable function f such that ∃^∞ n (d(X ↾ f(n)) ≥ n).

(iii) There is a computable martingale d with the savings property and a computable function f such that ∃^∞ n (d(X ↾ f(n)) ≥ n).

Armed with this characterization, Franklin and Stephan [158] suggested the following restricted notion of relativization.

Definition 12.2.11 (Franklin and Stephan [158]). A set X is truth table Schnorr random relative to A if there are no martingale d ≤_tt A and no function g ≤_tt A such that ∃^∞ n (d(X ↾ g(n)) ≥ n).⁴ A set A is truth table low for Schnorr randomness if every Schnorr random set is tt-Schnorr random relative to A.

We can simplify the definition above by considering only computable g, rather than all g ≤_tt A.

Lemma 12.2.12 (Franklin and Stephan [158]). A set X is tt-Schnorr random relative to A iff there are no martingale d ≤_tt A and no computable function g such that ∃^∞ n (d(X ↾ g(n)) ≥ n).

Proof.
Suppose there is a martingale d ≤_tt A and a function h ≤_tt A such that ∃^∞ n (d(X ↾ h(n)) ≥ n). It is easy to check from the proof of (ii) implies (iii) in Theorem 7.1.8 that we may assume that d has the savings property. We may also assume that h is nondecreasing. Since h ≤_tt A, given n, we can compute all possible values of h(n), so there is a computable g such that g(n) ≥ h(4n + 4) for all n. For infinitely many n, there is an

⁴ See [158] for an explanation of why relativizing the usual martingale definition of Schnorr randomness will not work in this case.
m ∈ [4n, 4n + 3] such that d(X ↾ h(m)) ≥ m. For all such n, we have d(X ↾ g(n)) ≥ n.

We will show that a set is Schnorr trivial iff it is tt-low for Schnorr randomness. We begin with the following lemma. We will then show that condition (ii) in this result is equivalent to Schnorr triviality.

Lemma 12.2.13 (Franklin and Stephan [158]). For each A, at least one of the following conditions holds.

(i) There is a computably random set that is not tt-Schnorr random relative to A.

(ii) There is a computable function h such that, for every computable function u, there are uniformly computable finite sets of strings T_0, T_1, … with |T_n| ≤ h(n) and A ↾ u(n) ∈ T_n for all n.

Proof. Clearly, in (ii) we can restrict attention to strictly increasing functions u. Let d_0, d_1, … be a (noneffective) listing of all computable positive-valued martingales. We build a set X and a (noneffective) martingale d.

Fix a computable strictly increasing function u. Let J_n = 2^{u(n)} and partition ℕ into intervals I_σ for σ ∈ ⋃_n J_n so that I_σ has 2n elements for each σ ∈ J_n. We define X and d in stages. Let d(λ) = 1.

At stage s, we work with the sth interval I_σ in natural number order. Let n be such that σ ∈ J_n. Let k be the largest index among the martingales mentioned before this stage. Let m = min(I_σ). Before this stage, we will have defined X ↾ m and numbers c_0, …, c_k ≤ m so that for all τ ∈ 2^m we have

d(τ) = 2^{−k} + Σ_{i ≤ k} 2^{−i} d_i(τ) / d_i(τ ↾ c_i).   (12.1)
Extend d to all τ with length between m + 1 and m + 2n, preserving (12.1). Now check which of the following cases holds.

1. σ ≠ A ↾ u(n).

2. σ = A ↾ u(n) and d((X ↾ m)1^{2n}) ≤ d(X ↾ m) + 2^{−n}.

3. Otherwise.

In cases 1 and 3, define X on I_σ so that d(X ↾ (m + 2n)) is as small as possible. In case 1, we have d(X ↾ (m + 2n)) ≤ d(X ↾ m). In case 3, the gain of 2^{−n} on the sequence 1^{2n} implies a loss of at least 8^{−n} on at least one of the 4^n − 1 other possible extensions of X on I_σ, so d(X ↾ (m + 2n)) ≤ d(X ↾ m) − 8^{−n}. In both these cases, we do not incorporate any new martingales into d at this stage.
572
12. Lowness and Triviality for Other Randomness Notions
In case 2, let X ↾ (m + 2n) = (X ↾ m)1^{2n}. Let c_{k+1} = m + 2n. Note that for τ ∈ 2^{m+2n}, (12.1) and the fact that d_{k+1}(τ)/d_{k+1}(τ ↾ c_{k+1}) = 1 imply that

d(τ) = 2^{−k−1} + Σ_{i ≤ k+1} 2^{−i} d_i(τ) / d_i(τ ↾ c_i).
In other words, (12.1) continues to hold at level m + 2n with k replaced by k + 1, as needed for the construction to proceed.

We have completed the definitions of X and d. There are now two cases. First suppose that case 2 occurs infinitely often. By the savings trick, to show that X is computably random, it is enough to show that no d_k succeeds strongly on X. Each d_k is incorporated into d from some point on, so if d_k succeeds strongly on X then so does d. But d gains no capital along X in cases 1 or 3, and gains at most 2^{−n} in case 2, which occurs only once for each n. Thus d does not succeed strongly on X, whence no d_k succeeds strongly on X, and X is computably random. On the other hand, there are infinitely many n with I_{A ↾ u(n)} ⊆ X. We can build a martingale ≤_tt A that splits its capital into pieces of size 2^{−n−1} and uses each piece to make the 2n many bets that all members of I_{A ↾ u(n)} are in X. It is easy to check that this martingale witnesses the fact that X is not tt-Schnorr random relative to A.

Now suppose that case 2 occurs only finitely often. Then for almost all n, at the stage in which σ = A ↾ u(n), case 3 obtains, and hence d loses at least 8^{−n} along X. For such n, let T_n be the set of all strings σ ∈ J_n such that case 3 obtains at the stage corresponding to σ. (For the finitely many other n, let T_n = {A ↾ u(n)}.) Then A ↾ u(n) ∈ T_n for all n. When case 1 obtains, d gains no capital along X, so the capital reached by d on X at the end of the stages of the construction has a maximum value r. Thus |T_n| ≤ r8^n for all n. Since we eventually stop incorporating d_k's into d, the martingale d and the set X are both computable, so the T_n are uniformly computable.

Theorem 12.2.14 (Franklin and Stephan [158]). The following are equivalent.

(i) A is Schnorr trivial.

(ii) There is a computable function h such that, for every computable function u, there are uniformly computable finite sets of strings T_0, T_1, … with |T_n| ≤ h(n) and A ↾ u(n) ∈ T_n for all n.
(iii) The tt-degree of A is computably traceable. That is, there is a computable function g such that for each f ≤_tt A, there are uniformly computable finite sets S_0, S_1, … with |S_n| ≤ g(n) and f(n) ∈ S_n for all n.

(iv) For each computable order g with g(0) > 0 and each f ≤_tt A, there are uniformly computable finite sets S_0, S_1, … with |S_n| ≤ g(n) and f(n) ∈ S_n for all n.
(v) A is tt-low for Schnorr randomness.

Proof. (i) implies (ii). Let A be Schnorr trivial. Let h(n) = 2^{2n}. Let u be a computable function, which we may assume is strictly increasing. Let M(1^n 0) = u(n) for all n and let M be undefined on all other inputs. Then M is a computable measure machine, so there is a computable measure machine N such that K_N(A ↾ u(n)) ≤ K_M(u(n)) + O(1) = n + O(1). For each n, we can wait until μ(dom N) − μ(dom N[s]) < 2^{−2n}, and we will then know the set T_n of all σ such that K_N(σ) < 2n. By modifying finitely many T_n if necessary, we have |T_n| ≤ h(n) and A ↾ u(n) ∈ T_n for all n.

(ii) implies (iii). Let h be as in (ii). Let f ≤_tt A via a tt-reduction Γ with use u. Let T_0, T_1, … be as in (ii). We may assume that T_n ⊆ 2^{u(n)} for all n. Let S_n = {Γ^σ(n) : σ ∈ T_n}. Since Γ is a tt-reduction, S_0, S_1, … are uniformly computable. Furthermore, |S_n| ≤ |T_n| ≤ h(n) and f(n) ∈ S_n for all n.

(iii) implies (iv). Let g₀ be the function given by (iii). We may assume that g₀(0) = 1. Let g be a computable order and let f ≤_tt A via a tt-reduction Γ with nondecreasing use function γ. Let p be a computable order such that g₀(p(n)) ≤ g(n) for all n and let m_i be the least n such that p(n) > i. Let f̄(i) = ⟨f(0), …, f(m_i)⟩. Let S_0, S_1, … be as in (iii) for f̄. Let S̄_n = {k : k is the nth element of a sequence in S_{p(n)}}. Then the S̄_n are uniformly computable, and |S̄_n| ≤ |S_{p(n)}| ≤ g₀(p(n)) ≤ g(n) and f(n) ∈ S̄_n for all n.

(iv) implies (i). Suppose (iv) holds. Let M be a computable measure machine. We may assume that n ∈ rng M for all n, whence we can compute K_M(n) for all n. Since Σ_n 2^{−K_M(n)−1} is less than 1 and computable, there is a computable order g such that Σ_n g(n)2^{−K_M(n)−1} is still less than 1 and computable. We use the KC Theorem to build a computable measure machine N with K_N(A ↾ n) ≤ K_M(n) + O(1). Let σ_0, σ_1, … be the length-lexicographic ordering of 2^{<ω}. Let f be such that σ_{f(n)} = A ↾ n. Note that f ≤_tt A. By (iv), there are uniformly computable S_0, S_1, …
such that |S_n| ≤ g(n) and f(n) ∈ S_n for all n. Let L = {(K_M(n) + 1, σ_m) : n ∈ ℕ, m ∈ S_n}. Then the weight of L is Σ_n |S_n| 2^{−K_M(n)−1} ≤ Σ_n g(n) 2^{−K_M(n)−1} < 1, so L is a KC set, determining a prefix-free machine N, and K_N(A ↾ n) ≤ K_M(n) + 1 for all n. Since Σ_n g(n)2^{−K_M(n)−1} is computable and Σ_{n>m} |S_n| 2^{−K_M(n)−1} ≤ Σ_{n>m} g(n) 2^{−K_M(n)−1} for all m, we have that Σ_n |S_n| 2^{−K_M(n)−1} is also computable. Thus N is a computable measure machine.

(v) implies (ii) follows from Lemma 12.2.13.

(iv) implies (v). Suppose (iv) holds and let X be a set for which there are a computable function q and a martingale d ≤_tt A such that d(X ↾ q(n)) ≥ n for infinitely many n. We show that X is not Schnorr random. The basic idea is to use a trace of d with small bound to define a computable martingale d̂ that succeeds fast on X.
Let m_0, m_1, … be an effective listing of all initial segments of rational-valued martingales (i.e., each m_i is a function from 2^{≤n} to ℚ^{≥0} for some n, satisfying the martingale condition). Let g and h be computable orders such that g(0) > 0 and, letting n_k be the least n with k ≤ h(n), we have Σ_{i ≤ n_k} g(i) ≤ log log k + O(1). Let r(n) = max_{i ≤ h(n)} q(i). Let f be such that m_{f(n)} = d ↾ 2^{≤r(n)} for all n. Note that f ≤_tt A. Let S_0, S_1, … be as in (iv).

Define martingales d_0, d_1, … as follows. Initially, d_j[0] = ∅ for all j. At each stage n, for each i ∈ S_n in turn, find the least j such that m_i extends the current approximation d_j[n] (one must exist because only finitely many d_j[n] will be different from ∅), and let d_j[n + 1] = m_i. For each j such that d_j[n] ≠ ∅ and d_j[n] has not been extended, extend d_j[n] to d_j[n + 1] arbitrarily, taking care only to preserve the martingale property. Note that the number of j's such that d_j[n + 1] ≠ ∅ is bounded by Σ_{i ≤ n} g(i).

Let d̂ = Σ_j 2^{−j} d_j. It is easy to see that d̂ is a computable martingale. Suppose that d(X ↾ q(k)) ≥ k. Let n be least such that k ≤ h(n). Then there is a j ≤ Σ_{i ≤ n} g(i) such that d_j(X ↾ q(k)) = d(X ↾ q(k)) ≥ k, whence

d̂(X ↾ q(k)) ≥ 2^{−j} k ≥ 2^{−Σ_{i ≤ n} g(i)} k ≥ k / (c log k)

for some constant c. Let p(n) = min_{q(k) ≥ n} k / (c log k). Then p is a computable order, and there are infinitely many k such that d̂(X ↾ q(k)) ≥ k / (c log k) ≥ p(q(k)), whence lim sup_n d̂(X ↾ n) / p(n) ≥ 1. Thus X is not Schnorr random.
The above characterization can be used to show that the Schnorr trivial sets form an ideal in the tt-degrees.

Corollary 12.2.15 (Franklin and Stephan [158]). Let A and B be Schnorr trivial. Then A ⊕ B is Schnorr trivial. Thus the Schnorr trivial sets form an ideal in the tt-degrees.

Proof. By Theorem 12.2.14, there are computable functions g and h such that, for every computable function u, there are uniformly computable finite sets of strings S_0, S_1, … and T_0, T_1, … with |S_n| ≤ g(n), |T_n| ≤ h(n), and A ↾ u(n) ∈ S_n and B ↾ u(n) ∈ T_n, for all n. Let f(2n) = g(n)h(n) and f(2n + 1) = g(n + 1)h(n). Let u be a computable function, and let S_0, S_1, … and T_0, T_1, … be as above. Let

U_{2n} = {σ ⊕ τ : σ ∈ S_n ∧ τ ∈ T_n} and U_{2n+1} = {σ ⊕ τ : σ ∈ S_{n+1} ∧ τ ∈ T_n}.

Then |U_n| ≤ f(n) and (A ⊕ B) ↾ n ∈ U_n for all n. So by Theorem 12.2.14, A ⊕ B is Schnorr trivial.

Franklin and Stephan [158] observed that their characterization cannot be extended to weak truth table reducibility. They also provided a number of relatively natural examples of Schnorr trivial sets, such as the following. A c.e. set is dense simple if the principal function of its complement is dominant (i.e., dominates all computable functions).
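The product construction in the proof of Corollary 12.2.15 is purely combinatorial, and a finite sketch may help. Here we take u(n) = n for readability; interleave stands in for the join σ ⊕ τ, and all names and encodings are ours:

```python
from itertools import product

def interleave(sigma: str, tau: str) -> str:
    """sigma (+) tau: bits of sigma at even positions, bits of tau at odd.
    Requires len(sigma) to be len(tau) or len(tau) + 1."""
    return "".join(sigma[i // 2] if i % 2 == 0 else tau[i // 2]
                   for i in range(len(sigma) + len(tau)))

def join_traces(S, T):
    """Given S_n containing A|n and T_n containing B|n (we take u(n) = n
    for simplicity), return U_k containing (A (+) B)|k, with
    |U_{2n}| <= |S_n||T_n| and |U_{2n+1}| <= |S_{n+1}||T_n|."""
    U = []
    for n in range(len(T)):
        U.append([interleave(s, t) for s, t in product(S[n], T[n])])
        if n + 1 < len(S):
            U.append([interleave(s, t) for s, t in product(S[n + 1], T[n])])
    return U
```

The even-indexed sets pair equal-length candidates, the odd-indexed ones pair a one-bit-longer candidate for A with a candidate for B, mirroring the definitions of U_{2n} and U_{2n+1}.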
Corollary 12.2.16 (Franklin and Stephan [158]). If a c.e. set A is dense simple and B ⊇ A, then B is Schnorr trivial.

Proof. Let u be computable. Then there is an m such that |[0, u(n)) \ A| < n for all n ≥ m. For n < m, let T_n = {B ↾ u(n)}. For n ≥ m, find an s such that |[0, u(n)) \ A_s| < n and let T_n be the at most 2^n many strings σ of length u(n) for which i ∈ A_s ⇒ σ(i) = 1 for all i < u(n). Then |T_n| ≤ 2^n and B ↾ u(n) ∈ T_n for all n, so B is Schnorr trivial.

We can combine this result with the following theorem due to Martin [258] to conclude that every high c.e. degree contains a Schnorr trivial c.e. set.

Theorem 12.2.17 (Martin [258]). Every high c.e. degree contains a dense simple c.e. set.

Proof. Let A be a high c.e. set. By Martin's Theorem 2.23.7, there is a dominant function f ≤_T A. We may assume that f also dominates the modulus function of A (i.e., that A_{f(n)}(n) = A(n) for all n). Let Φ be such that f = Φ^A. Let g(n) be the least s such that for all m ≤ n, we have Φ^A(m)[s] ↓ via an A-correct computation. Clearly g ≤_T A, and, by the usual convention on approximations, g(n) ≥ f(n) for all n, so g also dominates the modulus function of A, and hence in fact g ≡_T A.

Let G = {⟨n, g(n)⟩ : n ∈ ℕ}, where our pairing function is chosen so that ⟨x, y⟩ ≥ y for all y. Then it is easy to see that G is co-c.e. and that G ≡_T g ≡_T A. Furthermore, if m ≥ n then ⟨m, g(m)⟩ ≥ g(m) ≥ g(n) ≥ f(n), so there are at most n many elements of G that are less than f(n), which means that the principal function of G majorizes f, and hence is dominant. Thus the complement of G is a dense simple c.e. set of the same degree as A.

Corollary 12.2.18 (Franklin [151]). Every high c.e. degree contains a Schnorr trivial c.e. set.

We can apply the same argument to the Π⁰₁ class of separating sets of a Sacks splitting of a maximal set (which is dense simple) to show the following. (See [151, 152] for details.)

Corollary 12.2.19 (Franklin [151, 152]). There is a perfect Π⁰₁ class of Schnorr trivial sets.
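A finite snapshot of the trace built in the proof of Corollary 12.2.16 can be sketched as follows, with A_s a finite stage approximation to the dense simple set; the function name and encoding are ours:

```python
from itertools import product

def trace_for_superset(A_s, u_n, n):
    """All strings sigma of length u_n with sigma(i) = 1 for every i in A_s,
    as in the proof of Corollary 12.2.16. When |[0, u_n) \\ A_s| < n this
    yields at most 2^n strings, and B|u_n is among them for any B that
    contains A_s below u_n."""
    free = [i for i in range(u_n) if i not in A_s]
    assert len(free) < n, "pick a stage s with |[0, u_n) \\ A_s| < n first"
    T_n = []
    for bits in product("01", repeat=len(free)):
        sigma = ["1"] * u_n          # positions in A_s are forced to 1
        for i, b in zip(free, bits):
            sigma[i] = b             # only the few free positions vary
        T_n.append("".join(sigma))
    return T_n
```

The point of dense simplicity is exactly that the number of free positions below u(n) eventually drops under n, so the trace bound 2^n works for every computable u.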
Franklin and Stephan [158] also considered possible analogs to the fact that a set is K-trivial iff it is a base for 1-randomness. The obvious analog would be that A is Schnorr trivial iff there is a B ≥_tt A that is Schnorr random relative to A. Franklin and Stephan [158] observed that this result fails, but showed that one direction does hold: if A ≤_tt B and B is Schnorr random relative to A, then A is Schnorr trivial. They also considered extensions to stronger reducibilities. See [158] for more details.
12.3 Tracing weak truth table degrees

12.3.1 Basics

In Theorem 12.2.14 we saw that Schnorr triviality corresponds to a very natural tracing notion when we view the world through strong reducibilities. Specifically, we saw that A is Schnorr trivial iff the truth table degree of A is computably traceable. In this section we will look at the situation for weak truth table reducibility and establish a deep connection between the complexity of a set and tracing properties of its weak truth table degree.

Recall that a set A is complex if there is a computable order h such that C(A ↾ h(n)) ≥ n for all n. As we saw in Theorem 8.16.3, Kjos-Hanssen, Merkle, and Stephan [205] showed that A is complex iff there is a function f ≤_wtt A such that C(f(n)) ≥ n for all n. Motivated by this result, we make the following definition.

Definition 12.3.1 (Franklin, Greenberg, Stephan, and Wu [156]). A set A is anti-complex if for every computable order f, we have C(A ↾ f(n)) ≤ n for almost all n.

The intuition here is that anti-complex sets are highly compressible.

Theorem 12.3.2 (Franklin, Greenberg, Stephan, and Wu [156]). The following are equivalent for a set A.

(i) A is anti-complex.

(ii) The wtt-degree of A is c.e. traceable.

(iii) A is wtt-reducible to a Schnorr trivial set.

Here, of course, a weak truth table degree a is c.e. traceable if there is a computable order h such that for every function f ≤_wtt a, there is a computable collection {W_{g(x)} : x ∈ ω} with |W_{g(x)}| ≤ h(x) and f(x) ∈ W_{g(x)} for all x. The argument of Lemma 12.1.1 shows that we can replace h by any computable order. We devote the rest of this section to proving Theorem 12.3.2.

An interesting consequence of Theorems 12.2.14 and 12.3.2 is that a weak truth table degree a is c.e. traceable iff there is a weak truth table degree b ≥ a that contains a set whose truth table degree is computably traceable.
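To keep the two dual notions straight, the defining conditions of complexity and of Definition 12.3.1 can be displayed side by side (our transcription, with $\restriction$ denoting restriction and $C$ plain Kolmogorov complexity):

```latex
% Complexity vs. anti-complexity of a set A:
%   complex:      some computable order pushes the complexity of A up;
%   anti-complex: every computable order pushes it down, almost always.
A \text{ is complex} \iff
  \exists \text{ computable order } h \;\; \forall n \;\;
  C(A \restriction h(n)) \geq n,
\qquad
A \text{ is anti-complex} \iff
  \forall \text{ computable order } f \;\; \forall^{\infty} n \;\;
  C(A \restriction f(n)) \leq n.
```

Note the quantifier swap: failing to be complex only negates one order; anti-complexity demands compressibility along every computable order.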
12.3.2 Reducibilities with tiny uses

To prove Theorem 12.3.2, Franklin, Greenberg, Stephan, and Wu [156] introduced yet another variation on weak truth table reducibility.

Definition 12.3.3 (Franklin, Greenberg, Stephan, and Wu [156]). Let A, B ∈ 2^ω.
(i) We say that A is uniformly reducible to B with tiny use, and write A ≤_uT(tu) B, if there is a Turing reduction Φ^B = A whose use function is dominated by every computable order.

(ii) We say that A is reducible to B with tiny use, and write A ≤_T(tu) B, if for every computable order h, there is a Turing reduction of A to B whose use function is bounded by h.

Some of the basic properties of these reducibilities are easy to obtain. For instance, if A ≤_uT(tu) B then A ≤_T(tu) B, and if A ≤_T(tu) B then A ≤_wtt B. It is also easy to see that if A ≤_wtt C ≤_T(tu) B then A ≤_T(tu) B, and if A ≤_T(tu) C ≤_wtt B then A ≤_T(tu) B. Thus, the relation ≤_T(tu) is invariant on weak truth table degrees (and is preserved by increasing the degree on the range or decreasing the degree on the domain). The same holds for ≤_uT(tu). Moreover, for a fixed B, the classes {A : A ≤_T(tu) B} and {A : A ≤_uT(tu) B} are wtt-ideals.

Franklin, Greenberg, Stephan, and Wu [156] gave another formulation of these notions using not the use functions but their discrete inverses. If Φ^B = A is a Turing reduction, then let Φ(B ↾ n) be the longest initial segment of A that is computed by Φ^B querying the oracle B only on numbers smaller than n. Recall that a function is dominant if it dominates every computable function. The following is easy to show.

Lemma 12.3.4 (Franklin, Greenberg, Stephan, and Wu [156]). Let A, B ∈ 2^ω.

(i) A ≤_T(tu) B iff, for every computable order g, there is a Turing reduction Φ^B = A such that the map n ↦ |Φ(B ↾ n)| bounds g.

(ii) A ≤_uT(tu) B iff there is a Turing reduction Φ^B = A such that the map n ↦ |Φ(B ↾ n)| is dominant.

We begin our analysis of these reducibilities with some further easy observations.

Lemma 12.3.5.
(i) If A ≤_T(tu) A then A is computable.
(ii) If A is computable then A ≤_uT(tu) B for all B.

Proof. Let f(n) = n + 1. If Φ^A = A and for all n we have Φ(A ↾ n) ⊇ A ↾ (n + 1), then we can recursively compute A(n) by applying Φ to A ↾ n, which we previously computed. For the second part, use a reduction Φ^B = A whose use function is the constant function 0.

Corollary 12.3.6. If A ≤_T(tu) B and A is noncomputable, then A <_wtt B.

Proof. If B ≤_wtt A ≤_T(tu) B then B ≤_T(tu) B, so B is computable, and hence so is A.

Thus, for instance, if deg_wtt(B) is minimal, then every A ≤_T(tu) B is computable. In fact, something stronger is true.
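The proof of Lemma 12.3.5(i) is an induction that can be sketched directly: if a reduction recovers A ↾ (n + 1) from A ↾ n, iterating it from the empty string computes A. The function phi below is a hypothetical stand-in for such a reduction; all names are ours.

```python
def compute_initial_segment(phi, n):
    """Given phi with phi(prefix) a proper extension of prefix (the
    situation |Phi(A|n)| >= n + 1 in Lemma 12.3.5(i)), recover A|n
    starting from the empty string."""
    prefix = ""
    while len(prefix) < n:
        ext = phi(prefix)
        # sanity check: phi must genuinely extend its input
        assert ext.startswith(prefix) and len(ext) > len(prefix)
        prefix = ext
    return prefix[:n]

# A toy "reduction" computing the next bit of the set {n : n odd}
# from any correct prefix (purely illustrative).
def phi_example(prefix):
    return prefix + str(len(prefix) % 2)
```

For example, compute_initial_segment(phi_example, 6) returns "010101".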
Theorem 12.3.7 (Franklin, Greenberg, Stephan, and Wu [156]). If there is a noncomputable A such that A ≤_uT(tu) B, then B is high.

Proof. For any Turing reduction Φ, if Φ^B is total, then the map n ↦ |Φ(B ↾ n)| is computable in B (indeed, weak truth table reducible to B). For a Φ that witnesses that A ≤_uT(tu) B, this map dominates every computable function. By Martin's Theorem 2.23.7, B is high.

See [156] for more on the degrees of sets that bound noncomputable sets with tiny use.
12.3.3 Anti-complex sets and tiny uses

The following straightforward result shows that anti-complexity is invariant for wtt-degrees.

Lemma 12.3.8 (Franklin, Greenberg, Stephan, and Wu [156]). A set A is anti-complex iff C(f(n)) ≤ n + O(1) for every f ≤_wtt A.

We now come to our first characterization of anti-complexity.

Theorem 12.3.9 (Franklin, Greenberg, Stephan, and Wu [156]). A set is anti-complex iff its wtt-degree is c.e. traceable.

Proof. Suppose that A is anti-complex, and let f ≤_wtt A. By Lemma 12.3.8, there is a c such that C(f(n)) ≤ n + c for all n. Let T_n = {x : C(x) ≤ n + c}. Then {T_n}_{n∈ω} is a c.e. trace for f, and |T_n| ≤ 2^{n+c+1} for all n. By altering finitely many T_n, we can get a c.e. trace for f with bound 2^{2n}. Thus deg_wtt(A) is c.e. traceable.

For the other direction, suppose that deg_wtt(A) is c.e. traceable, and let f ≤_wtt A. By Lemma 12.1.1, let {T_n}_{n∈ω} be a c.e. trace for f bounded by the order h(n) = n. Let U be our universal machine. We can construct a machine M that on input σ first computes U(σ), then, if this computation converges, interprets the result as a pair (n, m), and, if m < n, outputs the mth element enumerated into T_n. If f(n) is the mth element enumerated into T_n, then C(f(n)) ≤ C_M(f(n)) + O(1) ≤ C(⟨n, m⟩) + O(1) ≤ 2C(n) + O(1) ≤ 2 log n + O(1), so the condition of Lemma 12.3.8 holds.

For the next characterization, we need a kind of monotone approximation to the function n ↦ C(A ↾ n). Let g_A(k) be the least n such that C(A ↾ m) > k for all m ≥ n. Clearly g_A ≤_T A′. Let V be our fixed universal (plain) machine. Recall that σ*_C is the first τ of length C(σ) for which we see that V(τ) = σ. Let

A∗ = {(A ↾ g_A(k))*_C : k ∈ ℕ}.

Clearly A∗ ≤_T A′.
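The first direction of Theorem 12.3.9 rests only on counting: at most 2^{m+1} − 1 strings can have plain complexity at most m, since that is the number of programs of length at most m. A toy illustration follows; the "universal" machine here is an arbitrary stand-in of our choosing, since the bound holds for any machine whatsoever.

```python
def toy_complexity(k, universal, max_len=11):
    """Length of the shortest binary program p with universal(p) = k,
    searching programs up to max_len bits; None if there is none."""
    for length in range(max_len + 1):
        for code in range(2 ** length):
            p = format(code, "b").zfill(length) if length else ""
            if universal(p) == k:
                return length
    return None

def trace_set(m, universal, domain):
    """T = {x in domain : C(x) <= m}.  Counting programs gives
    |T| <= 2^{m+1} - 1 no matter which machine is used."""
    out = set()
    for x in domain:
        c = toy_complexity(x, universal)
        if c is not None and c <= m:
            out.add(x)
    return out
```

The sets T_n = {x : C(x) ≤ n + c} in the proof are exactly of this shape, with the constant c absorbed into the index.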
Lemma 12.3.10 (Franklin, Greenberg, Stephan, and Wu [156]). For every A, the map k ↦ (A ↾ g_A(k))*_C is bounded by some computable function, where we identify a string with its position in the length-lexicographic ordering of 2^{<ω}.

Proof. Clearly there is a constant c such that C(τ0), C(τ1) ≤ C(τ) + c for all τ. For any k, let τ_k be a binary string and i < 2 be such that A ↾ g_A(k) = τ_k i. By the definition of g_A(k), we have C(τ_k) ≤ k, and so C(A ↾ g_A(k)) ≤ k + c. Hence (A ↾ g_A(k))*_C < 2^{k+c+1}.

Theorem 12.3.11 (Franklin, Greenberg, Stephan, and Wu [156]). The following are equivalent.

(i) There is some set B such that A ≤_T(tu) B.

(ii) A is anti-complex.

(iii) g_A is dominant.

(iv) A ≤_uT(tu) A∗.

Proof. (i) implies (ii): Assume that A ≤_T(tu) B. Suppose that f ≤_wtt A, so there is a functional Γ such that Γ^A = f and the use of this computation is bounded by a computable function g. We can find a Φ such that Φ^B = A and Φ(B ↾ n) is longer than A ↾ g(n) for all n. Then C(f(n)) ≤ C(Φ(B ↾ n)) + O(1) ≤ C(B ↾ n) + O(1) ≤ n + O(1). By Lemma 12.3.8, A is anti-complex.

(ii) implies (iii): Suppose that A is anti-complex. Let f be an increasing computable function. By definition, C(A ↾ f(n)) ≤ n for almost all n. Hence, g_A(n) > f(n) for almost all n. It follows that g_A dominates every computable function.

(iii) implies (iv): For every A we have A ≤_T A∗, because for our universal machine V, we have A = ⋃{V(σ) : σ ∈ A∗} (in other words, A(x) = V(σ)(x) for any σ ∈ A∗ such that x < |V(σ)|, and for every x there is indeed some σ ∈ A∗ such that |V(σ)| > x). But if g_A is dominant then this reduction witnesses that A ≤_uT(tu) A∗. To see this, let Φ be this reduction, and let f be an increasing computable function. We need to show that n ↦ |Φ(A∗ ↾ n)| dominates f.

Let g be a computable function that dominates k ↦ (A ↾ g_A(k))*_C, which exists by Lemma 12.3.10. Since g_A is dominant, g_A(k) ≥ f(g(k + 1)) for almost all k. Suppose that k is large enough and that (A ↾ g_A(k))*_C < n ≤ (A ↾ g_A(k + 1))*_C.
Then n ≤ g(k + 1) and so g_A(k) ≥ f(n). But |Φ(A∗ ↾ n)| ≥ g_A(k), since (A ↾ g_A(k))*_C < n, and from (A ↾ g_A(k))*_C we can compute A ↾ g_A(k). Thus |Φ(A∗ ↾ n)| ≥ f(n) as required.
12.3.4 Anti-complex sets and Schnorr triviality

We have seen that Schnorr triviality is not invariant in the weak truth table degrees. However, Theorem 12.3.2 concerns the downward closure of the wtt-degrees containing Schnorr trivial sets.

Proposition 12.3.12 (Franklin, Greenberg, Stephan, and Wu [156]). Every Schnorr trivial set is anti-complex.

Proof. Let A be Schnorr trivial. Fix a computable order h. Let Φ be a wtt-functional with computable bound g on its use function. The map n ↦ A ↾ g(n) is tt-reducible to A, so by Theorem 12.2.14, there is a computable trace {T_n}_{n∈ω} for this map that is bounded by h. If f = Φ^A is total then we can enumerate a trace {S_n}_{n∈ω} for f (with bound h) by putting Φ^σ(n) into S_n for those σ ∈ T_n for which Φ^σ(n) converges. Hence deg_wtt(A) is c.e. traceable. By Theorem 12.3.9, A is anti-complex.

Combining this result with Lemma 12.3.8, which shows that the class of anti-complex sets is closed downwards under wtt-reducibility, we see that if A is wtt-reducible to a Schnorr trivial set, then it is anti-complex. Combined with Theorem 12.3.11, the following result establishes the converse, and hence completes the proof of Theorem 12.3.2.

Theorem 12.3.13 (Franklin, Greenberg, Stephan, and Wu [156]). If g_A is dominant then A is wtt-reducible to some Schnorr trivial set.

Proof. Let f_0, f_1, … be a sequence of total computable functions such that

1. each f_i is strictly increasing;

2. the range of f_{i+1} is contained in the range of f_i; and

3. every computable function is bounded by some f_i.

By Lemma 12.3.10, let g be a computable function that bounds the function k ↦ (A ↾ g_A(k))*_C. For each k > 0, let q_k = ⟨(A ↾ g_A(k))*_C, f_i(k)⟩, where i is greatest such that ⟨g(k), f_i(k)⟩ ≤ g_A(k − 1). Note that q_k ≤ g_A(k − 1) for all k > 0. Let B = {q_k : k > 0}. We claim that A ≤_wtt B and B is Schnorr trivial.

For a fixed n, let k be greatest such that g_A(k) ≤ n.
Then q_{k+1} ≤ g_A(k) ≤ n, and A ↾ g_A(k + 1) can be effectively obtained from q_{k+1}. This procedure allows us to generate A ↾ n effectively from B ↾ (n + 1), so A ≤_wtt B.

To see that B is Schnorr trivial, we appeal to Theorem 12.2.14. Here is where we use the fact that g_A is dominant. The point is that, for every i, all but finitely many elements of B are pairs whose second coordinates are contained in the range of f_i. This follows from the fact that the map k ↦ ⟨g(k), f_i(k)⟩ is computable, and so dominated by g_A: for all but finitely many k, we have q_k = ⟨(A ↾ g_A(k))*_C, f_{i'}(k)⟩ for some i' ≥ i, and the range of f_{i'} is contained in the range of f_i.
So let Ψ be a truth table functional. There is some i such that f_i bounds the use function of Ψ. After specifying a fixed initial segment of B (specifying those q_k whose second coordinates are not in the range of f_i), there are at most 2^{kg(k)} many possibilities for B ↾ f_i(k): apart from the finitely many fixed numbers, there are only kg(k) many numbers below f_i(k) that can be elements of B, as they all have the form ⟨p, f_i(m)⟩ for some p < g(k) and m < k. After applying Ψ, we get a computable trace for Ψ^B whose kth element has size at most 2^{kg(k)}. So deg_tt(B) is computably traceable, and hence, by Theorem 12.2.14, B is Schnorr trivial.
12.4 Lowness for weak genericity and randomness

As in other cases, when considering weak 1-randomness, there are two reasonable notions: lowness for weak 1-randomness and the possibly stronger notion of lowness for Kurtz null tests. We begin with a result that sandwiches both notions of lowness between the computably traceable sets and those of hyperimmune-free degree.

Theorem 12.4.1 (Downey, Griffiths, and Reid [111]).
(i) If a set is computably traceable then it is low for Kurtz null tests.
(ii) If a set is low for weak 1-randomness then it has hyperimmune-free degree.

Proof. (i) Let A be computably traceable. Let {U_n}_{n∈ω} be a Kurtz null test relative to A. Then there is an A-computable function f : ω → (2^{<ω})^{<ω} such that U_n = ⟦f(n)⟧. Let T be a computable trace for f with bound n ↦ n + 1. We can think of each T^{[n]} as a set of finite sets of strings. Let g(n) = ⋃{D ∈ T^{[n+1]} : μ(⟦D⟧) ≤ 2^{−(n+1)}} and let V_n = ⟦g(n)⟧. Then μ(V_n) ≤ (n + 2)2^{−(n+1)}, which is a computable bound going to 0, so the V_n form a Kurtz null test, and clearly ⋂_n V_n ⊇ ⋂_n U_n.

(ii) In Corollary 8.11.8, we proved that every hyperimmune degree contains a weakly 1-random set, and clearly no weakly 1-random degree can be low for weak 1-randomness.

We will now work toward showing that lowness for weak 1-randomness and lowness for Kurtz null tests coincide, and that the sets that are low for weak 1-randomness can be completely characterized as those whose degrees are hyperimmune-free and not DNC. As we will see, these are also the sets that are low for weak 1-genericity, i.e., the sets A such that if a set meets every dense Σ^0_1 class, then it meets every dense Σ^{0,A}_1 class.

Recall that a degree a is DNC if there is a function f ≤_T a such that f(e) ≠ Φ_e(e) for all e. Recall also from Theorem 8.16.8 that Kjos-Hanssen, Merkle, and Stephan [205] showed that if the degree of B is hyperimmune-free and not DNC, then for each g ≤_T B, there are computable functions h and ĥ
such that ∀n ∃m ∈ [n, ĥ(n)] (h(m) = g(m)). This result can be used to establish the following theorem.

Theorem 12.4.2 (Stephan and Yu [381]). Suppose the degree of A is hyperimmune-free and not DNC.
(i) Every Σ^{0,A}_1 class of measure 1 has a Σ^0_1 subclass of measure 1.
(ii) Every dense Σ^{0,A}_1 class has a dense Σ^0_1 subclass.

Proof. Let S be a Σ^{0,A}_1 class of measure 1. Note that S is necessarily dense. We build a Σ^0_1 subclass of S that has measure 1 and is dense. This construction establishes (i), and (ii) follows by the same proof with all measure considerations removed.

Define a function f ≤_T A as follows. Given n, search for an m > n such that every σ ∈ 2^n is extended by some τ ∈ 2^m with ⟦τ⟧ ⊆ S. (Note that every m' > m will also have this property.) Then search for a k ≥ m such that μ(⟦{σ ∈ 2^k : ⟦σ⟧ ⊆ S}⟧) ≥ 1 − 2^{−n}. Let f(n) = k.

Since A has hyperimmune-free degree, there is a computable function p such that p(n) ≥ f(n) for all n. Let q(0) = 0 and q(n + 1) = p(q(n)). Let g : ℕ → (2^{<ω})^{<ω} be an A-computable function such that for all n,

1. g(n) ⊆ 2^{q(n+1)},
2. ⟦g(n)⟧ ⊆ S,
3. for each σ ∈ 2^{q(n)} there is a τ ∈ g(n) extending σ, and
4. μ(⟦g(n)⟧) ≥ 1 − 2^{−n}.

By Theorem 8.16.8, there are computable functions h : ℕ → (2^{<ω})^{<ω} and ĥ : ℕ → ℕ such that for all n,

1. h(n) ⊆ 2^{q(n+1)},
2. for each σ ∈ 2^{q(n)} there is a τ ∈ h(n) extending σ,
3. μ(⟦h(n)⟧) ≥ 1 − 2^{−n}, and
4. there is an m ∈ [n, ĥ(n)] such that h(m) = g(m).

Let T = {X : ∃n ∀m ∈ [n, ĥ(n)] (X ↾ q(m + 1) ∈ h(m))}. Then T is a Σ^0_1 class. Furthermore, for each X ∈ T, there are n and m ∈ [n, ĥ(n)] such that g(m) = h(m). Then X ↾ q(m + 1) ∈ h(m) = g(m), so X ∈ S, and therefore T ⊆ S.
The measure of T is 1 because for all n,

μ({X : ∀m ∈ [n, ĥ(n)] (X ↾ q(m + 1) ∈ h(m))}) = 1 − μ(⋃_{m∈[n, ĥ(n)]} (2^ω \ ⟦h(m)⟧)) ≥ 1 − Σ_{m∈[n, ĥ(n)]} 2^{−m} > 1 − 2^{−n+1}.
Finally, T is dense because given σ ∈ 2^n, there is an extension σ' of σ in 2^{q(n)}. But then there is an extension τ_0 of σ' in h(n). Since h(n) ⊆ 2^{q(n+1)}, there is an extension τ_1 of τ_0 in h(n + 1). Continuing in this way, we get a sequence σ ≺ σ' ≺ τ_0 ≺ τ_1 ≺ ⋯ such that τ_i ∈ h(n + i). Then ⋃_i τ_i is an element of T extending σ.
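The chaining step in the density argument — repeatedly extending a string through successive levels, each of which is guaranteed to contain an extension — can be illustrated concretely. The following Python sketch is not from the text; the toy level sets and all helper names are invented for the example. Each level contains strings one bit longer than the last, and every string of the previous length has an extension in the next level, so the greedy chain never stalls.

```python
from itertools import product

def strings(n):
    """All binary strings of length n."""
    return {"".join(bits) for bits in product("01", repeat=n)}

def chain_extensions(sigma, levels):
    """Greedily build tau_0 < tau_1 < ... : in each successive level,
    pick the leftmost string extending what we have so far."""
    tau = sigma
    out = []
    for level in levels:
        tau = min(t for t in level if t.startswith(tau))
        out.append(tau)
    return out

# Toy analogue of the sets h(n): at each length, keep only strings ending
# in "0"; every string still has an extension in the next level.
levels = [{s + "0" for s in strings(j)} for j in range(1, 5)]
path = chain_extensions("1", levels)  # ['10', '100', '1000', '10000']
```

The union of the chained strings is then an infinite sequence passing through every level, just as ⋃_i τ_i lies in T in the proof above.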
Corollary 12.4.3 (Stephan and Yu [381]). If a degree is hyperimmune-free and not DNC then it is low for Kurtz null tests and low for weak 1-genericity.

Stephan and Yu [381] also completely characterized the sets that are low for weak 1-genericity.

Theorem 12.4.4 (Stephan and Yu [381]). The following are equivalent.
(i) Every dense Σ^{0,A}_1 class has a dense Σ^0_1 subclass.
(ii) A is low for weak 1-genericity.
(iii) The degree of A is hyperimmune-free and every 1-generic is weakly 1-generic relative to A.
(iv) The degree of A is hyperimmune-free and not DNC.

Proof. Clearly (i) implies (ii), and (ii) implies (iii) by the n = 0 case of Theorem 2.24.14, which implies that every hyperimmune degree contains a weakly 1-generic set. Also, (iv) implies (i) by Theorem 12.4.2. To finish the proof, we show that if every 1-generic is weakly 1-generic relative to A then the degree of A is not DNC.

Suppose that A is of DNC degree and each 1-generic set is also weakly 1-generic relative to A. By Theorem 8.16.4, A is autocomplex, so there is an A-computable function f such that K(A ↾ m) ≥ n for all m ≥ f(n). By replacing f with the use function of its computation from A, we may assume that f(n) can be computed from A ↾ f(n). Let S = {σ⌢(A ↾ f(|σ|)) : σ ∈ 2^{<ω}}. Then S is dense. By Theorem 2.24.2, each noncomputable c.e. set computes a 1-generic, so we can choose B to be low for K and 1-generic. By hypothesis, B is weakly 1-generic relative to A, so B meets S infinitely often, and hence there are infinitely many n with (B ↾ n)⌢(A ↾ f(n)) ≺ B. For such an n, we can compute A ↾ f(n) from B ↾ (n + f(n)) and n. Furthermore,
with B as an oracle, we can compute f(n) from n. Thus, for such n,

n ≤ K(A ↾ f(n)) ≤ K(B ↾ (n + f(n)), n) + O(1) ≤ K^B(B ↾ (n + f(n)), n) + O(1) ≤ K^B(n) + O(1).

So there are infinitely many n with K^B(n) ≥ n − O(1), which is a contradiction.

It may not be surprising that lowness for weak 1-randomness and lowness for weak 1-genericity are related, since both weakly 1-random and weakly 1-generic sets occur in each hyperimmune degree (as shown in Theorems 2.24.19 and 8.11.8). The following result shows that these notions of lowness in fact coincide. The original proof of this result was rather complex and used “svelte trees”, but subsequently the simple proof we present was discovered by Miller (in giving a simple proof of Theorem 8.10.2).

Theorem 12.4.5 (Greenberg and Miller [171]). A is low for weak 1-randomness iff A is low for Kurtz null tests iff A is low for weak 1-genericity. Thus, A is low for weak 1-randomness iff A is of hyperimmune-free degree and not of DNC degree.

Proof. If the degree of A is hyperimmune-free and not DNC, then, by Corollary 12.4.3, A is low for Kurtz null tests, and hence also low for weak 1-randomness. By Theorems 12.4.1 and 12.4.4, if A has hyperimmune degree then it is neither low for weak 1-randomness nor low for weak 1-genericity. If A has DNC degree, then, by Theorem 12.4.4, A is not low for weak 1-genericity, while by Theorem 8.10.2, there is a 1-random set that is not weakly 1-random relative to A, so A is also not low for weak 1-randomness.

One consequence of this characterization (which already follows from Corollary 12.4.3) is that lowness for weak 1-randomness and lowness for Schnorr randomness are different.

Theorem 12.4.6 (Stephan and Yu [381]). There is a perfect Π^0_1 class of sets that are neither of DNC degree nor c.e. traceable.

Applying the hyperimmune-free basis theorem to this result (and keeping in mind that a c.e. traceable set of hyperimmune-free degree is computably traceable), we have the following.

Corollary 12.4.7 (Stephan and Yu [381]).
There are sets that are not computably traceable but are low for Kurtz null tests.

Proof of Theorem 12.4.6. We construct a Π^0_1 class [P] in stages. For each e, we have the following requirements for all A ∈ [P]:

R_e : ∃n (Φ^A_e(n) = Φ_n(n))
(which ensure that A does not compute a fixed-point free function, and hence does not have DNC degree, by Theorem 2.22.1) and

Q_e : Φ_e total and an order ⇒ A is not c.e. traceable with order Φ_e.

For the R-requirements, we use the recursion theorem with parameters to find a computable function g such that we can control Φ_{g(n)}(g(n)) for all n. We will define Φ_{g(n)}(g(n)) for infinitely many n to ensure that for each e and each A ∈ [P] with Φ^A_e total, there is some n such that Φ^A_e(g(n)) = Φ_{g(n)}(g(n)).

The strategy for R_e is as follows. At a stage s at which R_e begins acting, we have pairwise incompatible strings σ^s_0, ..., σ^s_m with P_s = ⋃_{i≤m} ⟦σ^s_i⟧. For each i ≤ m in parallel, we proceed as follows. Working in ⟦σ^s_i⟧, we pick a large number d = g(n) for some n, and wait until there is some τ ⪰ σ^s_i with Φ^τ_e(d) ↓. Should this never happen, then Φ^A_e is not total for any A ∈ ⟦σ^s_i⟧. If τ is found, then we let Φ_d(d) = Φ^τ_e(d), and prune P so that P ∩ ⟦σ^s_i⟧ ⊆ ⟦τ⟧, initializing all weaker priority requirements. Now R_e is met within P ∩ ⟦σ^s_i⟧.

The strategy for Q_e is as follows. We define a functional Ψ_e such that Ψ^A_e is total for all A ∈ [P] if Φ_e is total and an order. We break Q_e into subrequirements Q_{e,i}. Each such subrequirement works to ensure that W_i does not trace Ψ^A_e with order Φ_e for A ∈ [P]. Working within a cone ⟦σ^s_l⟧ as above, we pick a large number n and wait until Φ_e(n) ↓. If this convergence ever happens, we choose a τ ⪰ σ^s_l such that ⟦τ⟧ is currently a subset of P and prune P by removing all elements of ⟦σ^s_l⟧ \ ⟦τ⟧. Let m > |τ| be such that 2^m > Φ_e(n), and let τ_0, ..., τ_{2^m−1} be the extensions of τ of length m. We ensure that no weaker priority strategy makes [P] ∩ ⟦τ_j⟧ = ∅. For each j < 2^m, we let Ψ^{τ_j}_e(k) = 0 for all k < n for which Ψ^{τ_j}_e(k) is not yet defined, and let Ψ^{τ_j}_e(n) = j. We also initialize all weaker priority requirements.
From this point on, as long as |W^{[n]}_i| ≤ Φ_e(n), whenever j < 2^m enters W^{[n]}_i, we prune P to remove all of ⟦τ_j⟧ from [P] and initialize all weaker priority requirements. This action never removes all of ⟦τ⟧ from [P], and ensures that Q_{e,i} is satisfied within ⟦σ^s_l⟧, since it guarantees that if W_i is a trace with order Φ_e and A ∈ [P] ∩ ⟦σ^s_l⟧, then A ∈ ⟦τ_j⟧ for some j ∉ W^{[n]}_i, whence Ψ^A_e(n) = j ∉ W^{[n]}_i.

The remainder of the proof is a straightforward application of the finite injury method.

The above proof is rather different from the original method of Stephan and Yu [381], who proved the following.

Theorem 12.4.8 (Stephan and Yu [381]). There is a partial computable {0,1}-valued function ψ with coinfinite domain such that every set whose characteristic function extends ψ is neither autocomplex nor c.e. traceable.

We end this section by noting that there are no noncomputable sets that are low for 1-genericity. (Lowness for 1-genericity and for weak 1-genericity
were first investigated by Nitzpon [309], who showed that if X is low for weak 1-genericity then X has hyperimmune-free degree.)

Theorem 12.4.9 (Greenberg and Miller [unpublished], Yu [411]). If X is low for 1-genericity then X is computable.

Proof. Suppose that X is noncomputable and low for 1-genericity. By Theorem 2.24.4, there is a 1-generic A such that A ⊕ X ≡_T X'. Then A is 1-generic relative to X, so by the relativized form of Theorem 2.24.3, X'' ≡_T (A ⊕ X)' ≡_T A ⊕ X ⊕ X' ≡_T X', which is a contradiction.
12.5 Lowness for computable randomness

As we saw in Theorems 12.1.6 and 12.4.1, if a set is low for either Schnorr randomness or weak 1-randomness, then it is of hyperimmune-free degree. This fact may not be too surprising, since Schnorr randomness and weak 1-randomness are both related to total computable functions, as reflected in tests with uniformly computable measures. Martin-Löf randomness is concerned with c.e. objects, or partial computable functions, which may help explain why the corresponding lowness notion is more closely related to jump traceability than traceability. What of computable randomness? Here the graded tests are somewhat computable, but the overall measure is not. As it turns out, the situation for computable randomness is dramatically different from either of the above cases. We already saw in Theorem 12.1.6 that sets that are low for computable randomness are of hyperimmune-free degree. In this section, we prove the following remarkable result of Nies [302], which verifies a conjecture of Downey.

Theorem 12.5.1 (Nies [302]). If a set is low for computable randomness then it is computable.

We present a different proof from the one given by Nies in [302, 306]. This proof is also due to Nies [300], and is included here with his permission. We will assume that all martingales in this section are rational-valued, which, as we have already seen in Proposition 7.1.2, we can do with no loss of generality for computable, or more generally A-computable, martingales.

The following lemma is obvious but quite useful. Following Nies, we call it the “nonascending path trick”.

Lemma 12.5.2. Let M be a computable martingale. For each σ and each n > |σ|, we can compute a string τ ⪰ σ such that |τ| = n and M(τ ↾ (k + 1)) ≤ M(τ ↾ k) for each k with |σ| ≤ k < n.

Recall that we say that a martingale M has the savings property if τ ⪰ σ ⇒ M(τ) ≥ M(σ) − 2.
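The nonascending path trick is entirely effective, and it may help to see it as a few lines of code. The following Python sketch is not from the text; the toy martingale `M` and the helper names are invented for illustration. The point is simply that the martingale identity M(σ0) + M(σ1) = 2M(σ) guarantees that at least one child does not increase the value, and we can find it by evaluating M on both children.

```python
from fractions import Fraction

def nonascending_extension(M, sigma, n):
    """Extend sigma to a string of length n along which M never increases,
    using the fact that min(M(tau+'0'), M(tau+'1')) <= M(tau)."""
    tau = sigma
    while len(tau) < n:
        # pick a child on which M does not go up
        tau += "0" if M(tau + "0") <= M(tau) else "1"
    return tau

def M(sigma):
    """Toy computable martingale: bets everything on the next bit being 1."""
    cap = Fraction(1)
    for b in sigma:
        cap = cap * 2 if b == "1" else Fraction(0)
    return cap

tau = nonascending_extension(M, "", 5)  # "00000" for this particular M
```

Since this M doubles on 1 and goes broke on 0, the nonascending path immediately takes the 0-branch and stays there.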
We saw in the comments following Proposition 6.3.8 that we may assume without loss of generality that our martingales have the savings property. A martingale operator is a Turing functional L such that L^X is a martingale for each oracle X. To prove Theorem 12.5.1, we will define a martingale operator L. Given a set A that is low for computable randomness, we will apply the following purely combinatorial lemma to N = L^A and the (noneffectively listed) family B_0, B_1, ... of computable martingales with the savings property.

Lemma 12.5.3. Let N be a martingale with N(λ) ≤ 1. Let B_0, B_1, ... be martingales with the savings property such that S[N] ⊆ ⋃_i S[B_i]. Then there are σ, d, and a martingale M = Σ_{i≤n} q_i B_i, with each q_i ∈ ℚ^{>0}, such that M(σ) < 2 and ∀τ ⪰ σ (N(τ) ≥ 2^d ⇒ M(τ) ≥ 2).

Proof. Assume for a contradiction that no such M exists. We define a sequence of strings σ_0 ≺ σ_1 ≺ ⋯ and positive rationals q_0, q_1, ... such that for all n, we have N(σ_n) ≥ 2^n − 1 and Σ_{i≤n} q_i B_i(σ_n) < 2. Let σ_0 = λ and q_0 = 1/B_0(λ). Suppose we are given σ_0, ..., σ_{n−1} and q_0, ..., q_{n−1} with the required properties. Since Σ_{i<n} q_i B_i(σ_{n−1}) < 2, we can choose a positive rational q_n small enough that Σ_{i≤n} q_i B_i(σ_{n−1}) < 2; then, since M = Σ_{i≤n} q_i B_i is not a witness (together with σ = σ_{n−1} and d = n), there is a σ_n ⪰ σ_{n−1} with N(σ_n) ≥ 2^n − 1 and Σ_{i≤n} q_i B_i(σ_n) < 2. Let X = ⋃_n σ_n. Then X ∈ S[N]. On the other hand, for each i and all n ≥ i we have q_i B_i(σ_n) < 2, so by the savings property each B_i is bounded along X, and hence X ∉ ⋃_i S[B_i], contradicting S[N] ⊆ ⋃_i S[B_i].

The operator L will be a sum of martingale operators L_m, where L_m takes a default value 2^{−m} unless an axiom declares otherwise. Since the default values 2^{−m} for m > |σ| add up to 2^{−|σ|}, L will be a rational-valued martingale operator. We will then let N = L^A. Our goal is to use the fact that S[N] ⊆ ⋃_i S[B_i] (which follows from the fact that A is low for computable randomness) to compute A. The computation procedure for A will be based on a witness
η_m = (e, σ, d) given by Lemma 12.5.3. Since we cannot determine this witness effectively, to make L a martingale operator we need to consider all η_m, including those for which the corresponding M_e is partial.

The idea behind our procedure for computing A is the following. Once L is defined, if η_m = (e, σ, d) is a witness to the truth of Lemma 12.5.3 for N = L^A, then let M = M_e and consider the tree

T_m = {γ : ∀τ ⪰ σ (L^γ_m(τ) ↓ ≥ 2^d ⇒ M(τ) ≥ 2)}.

By the choice of η_m and the fact that L^A ≥ L^A_m, we see that A is a path of T_m. Let k = 2^{d+m}, and let E_k be the collection of k-element sets of strings of equal length. Let α range over elements of E_k. We write α_r for the rth element of α in lexicographic order, and identify α with the string α_0 α_1 ⋯ α_{k−1}. For each α, we will ensure that α ⊈ T_m in an effective way. That is, given α, we will be able to find an s < k such that α_s ∉ T_m. This fact will allow us to define a computable tree R ⊇ T_m of width less than k. Since every path of a tree of finite width is computable and A is a path of T_m ⊆ R, this fact will imply that A is computable.

Let σ_0, ..., σ_{k−1} be the strings of length d + m in lexicographic order. We describe in more detail the strategy that, given α, produces an s such that α_s ∉ T_m. Suppose that ρ ⪰ σ is a string such that M(ρ) < 2, and no value L^γ_m(ρ') has been defined for any ρ' ⪰ ρ (we call ρ an α-destroyer). In this case we may define L_m(ρ) = 2^{−m} regardless of the oracle. For each s < k, we ensure that L^{α_s}_m(ρσ_s) = 2^d, by betting all our capital along ρσ_s from ρ on. Since M(ρ) < 2, by the nonascending path trick (Lemma 12.5.2) we can compute an s such that M(ρσ_s) < 2. Then ρσ_s witnesses that α_s ∉ T_m.

We want to carry out this strategy independently for different α, so we assign to each α a string ν_α. Given η_m = (e, σ, d), let M = M_e and let M̂(ν) = max{M(ν') : σ ⪯ ν' ⪯ ν}. The assignment function G_m : E_k → 2^{<ω} mapping α to ν_α will satisfy the following conditions.

1. The range of G_m is an antichain of strings ν such that M̂(ν) < 2.
2. The functions G_m and G^{−1}_m are uniformly partial computable over all m.
3. If M_e(σ) < 2 and M is total, then G_m is total.

We cannot quite apply the strategy above with ρ = ν_α, since we cannot tell whether G_m is total, but if we find out the value of ν_α at stage s, then we can use the nonascending path trick to find a ρ ⪰ ν_α of length s such that M̂(ρ) < 2. This ρ will be our α-destroyer.

The choice of G_m is irrelevant as long as the above properties hold, so we defer defining G_m. We define L_m by declaring axioms of the form L^γ_m(ρ) = p, in such a way that
(a) |γ| ≤ |ρ|, and given γ and ρ we can determine computably whether we ever declare an axiom L^γ(ρ) = p for some p; and

(b) whenever distinct axioms L^γ(ρ) = p and L^δ(ρ) = q are declared, then γ and δ are incompatible.

If some axiom L^γ_m(ρ) = p has been declared for γ ≺ X (which can be determined computably, by (a)), then we let L^X_m(ρ) = p. Otherwise, we let L^X_m(ρ) be the “default value” 2^{−m}.

Given a string ρ, we check whether there is a ν ⪯ ρ such that G^{−1}_m(ν) ↓ in exactly |ρ| many steps. If so, then let α = G^{−1}_m(ν). (We assume the usual convention that it takes at least max(|ν|, |α|) many steps for G^{−1}_m(ν) to converge.) In this case, we declare axioms as follows (implementing the strategy outlined above): For each τ = ρυ and each s < k, let L^{α_s}_m(τ) = 0 unless υ is compatible with σ_s. In that case, let L^{α_s}_m(τ) = 2^{−m+|υ|} if υ ⪯ σ_s, and L^{α_s}_m(τ) = 2^d if σ_s ⪯ υ.

We have concluded the construction of L_m. Clearly (a) holds. Furthermore, (b) holds because the strings ν_α form an antichain, we declare axioms L^γ_m(τ) = p with ν_α ⪯ τ only if γ ∈ α, and the elements of α are pairwise incompatible. Finally, L^X is clearly a martingale for each oracle X.

Recall that B_0, B_1, ... are the (total) computable martingales with the savings property. If A is low for computable randomness, then S[L^A] ⊆ ⋃_i S[B_i]. The linear combination M obtained in Lemma 12.5.3 is computable. So the following lemma suffices to compute A, since, as explained above, the existence of the tree R implies that A is computable.

Lemma 12.5.4. Suppose η_m = (e, σ, d), where M = M_e is total and M, σ, d witness the truth of Lemma 12.5.3 with N = L^A. Then there is a computable tree R ⊇ T_m of width less than k.

Proof. We define the levels R^{(j)} of R recursively. Let R^{(0)} = {λ}. Let j > 0 and suppose we have determined R^{(j−1)}. Carry out the following procedure to determine R^{(j)}.

1. Let F be the set of strings of length j that extend strings in R^{(j−1)} (so |F| = 2|R^{(j−1)}|).

2. If |F| ≥ k, then let α consist of the k leftmost strings of F.
(a) Compute ν_α = G_m(α), and let s be the stage at which G_m(α) ↓.
(b) By the nonascending path trick, find ρ ⪰ ν_α of length s such that M̂(ρ) < 2.
(c) Search for r < k such that M̂(ρσ_r) < 2. Remove α_r from F and repeat step 2.

3. Once |F| < k, let R^{(j)} = F.
Clearly R is computable and has width less than k. By the definition of L_m, for r as above we have L^{α_r}_m(ρσ_r) = 2^d, whence α_r ∉ T_m. Thus R ⊇ T_m.

To conclude the proof it remains to define G_m. Recall that M̂(ν) = max{M(ν') : σ ⪯ ν' ⪯ ν}. We first prove a lemma using the following instance of Kolmogorov’s Inequality: If M(τ) < b then

μ_τ({ν ⪰ τ : M̂(ν) ≥ b}) ≤ M(τ)/b,

where μ_τ(X) = 2^{|τ|} μ(X ∩ ⟦τ⟧).

Lemma 12.5.5. Given η_m, suppose that M(τ) < b for b ∈ ℚ. Let

P = {ν ⪰ τ : M̂(ν) < b}

and let r be such that 2^{−r} ≤ 1 − M_e(τ)/b. Then given i we can compute ν^{(i)} of length i + r + 1 such that ν^{(i)} ∈ P and the strings ν^{(i)} form an antichain. If M is partial, then we can compute ν^{(i)} for each i such that M is defined for strings of length up to i + r + 1.

Proof. Suppose inductively that ν^{(q)} has been computed for q < i. Since Σ_{q<i} 2^{−|ν^{(q)}|} = 2^{−r}(1 − 2^{−i}) and 2^{−r} ≤ μ_τ(P), by Kolmogorov’s Inequality, the strings of length i + r + 1 in P that do not extend any ν^{(q)} with q < i have total μ_τ-measure at least 2^{−r} − 2^{−r}(1 − 2^{−i}) > 0, so we can compute such a string and take it to be ν^{(i)}.
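The instance of Kolmogorov’s Inequality used above can be checked numerically on a toy example. The following Python sketch is not from the text; the particular martingale and all names are invented for illustration. It estimates, for τ = λ, the μ_τ-measure of the strings along which the running maximum M̂ reaches b, by exhaustively enumerating extensions of a fixed length, and compares it to the bound M(τ)/b.

```python
from fractions import Fraction
from itertools import product

def martingale(sigma):
    """Toy computable martingale: bet half the capital on the next bit being 1."""
    cap = Fraction(1)
    for b in sigma:
        cap *= Fraction(3, 2) if b == "1" else Fraction(1, 2)
    return cap

def hat(sigma):
    """M-hat: the maximum of the martingale over all prefixes of sigma."""
    return max(martingale(sigma[: k]) for k in range(len(sigma) + 1))

def hit_fraction(b, n):
    """Fraction of length-n strings whose running maximum ever reaches b."""
    exts = ["".join(x) for x in product("01", repeat=n)]
    return Fraction(sum(1 for x in exts if hat(x) >= b), len(exts))

b = Fraction(2)
lhs = hit_fraction(b, 12)
bound = martingale("") / b  # Kolmogorov's Inequality bound M(tau)/b = 1/2
```

The inequality lhs ≤ bound is exactly the maximal inequality for nonnegative martingales, here verified for strings up to length 12.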
12.6 Lowness for pairs of randomness notions

Recall from Definition 11.9.2 that for classes C and D given by relativizable definitions, Low(C, D) is the class of sets A such that C ⊆ D^A. We saw in
Table 12.1. Lowness for pairs of randomness notions. The entry in the row labeled D and the column labeled C describes Low(C, D); an X marks a pair not considered here, and ? marks an open case.

                 C = weak 2-    1-           comp.           Schnorr         weak 1-
D = weak 2-      K-triv.        X            X               X               X
    1-           K-triv.        K-triv.      X               X               X
    comp.        K-triv.        K-triv.      comp.           X               X
    Schnorr      c.e. trac.     c.e. trac.   comp. trac.     comp. trac.     X
    weak 1-      ?              ¬DNC         ¬DNC, nonhigh   ¬DNC, nonhigh   ¬DNC, hyp.-free
Theorem 11.9.3 that Low(W2R, MLR) is exactly the class of K-trivial sets, where MLR is the class of 1-random sets and W2R is the class of weakly 2-random sets. We also saw in Section 10.6 that, writing ML2R for the class of 2-random sets, Low(ML2R, MLR) is the class of (uniformly) almost everywhere dominating sets. There has been considerable work on characterizing lowness for other pairs of randomness notions. In this brief section, we state some results in that direction and summarize them, together with Theorems 11.4.11, 11.9.1, 11.9.3, 12.4.5, and 12.5.1 and Corollary 12.1.7, in Table 12.1. We write CompR, SchR, and W1R for the classes of sets that are computably random, Schnorr random, and weakly 1-random, respectively.

Theorem 12.6.1 (Nies [302]). A set is in Low(MLR, CompR) iff it is K-trivial.

Theorem 12.6.2 (Kjos-Hanssen, Nies, and Stephan [207]). A set is in Low(MLR, SchR) iff it is c.e. traceable, and in Low(CompR, SchR) iff it is computably traceable.

Note that the first part of the above result is Theorem 12.1.4 (given Corollary 12.1.7).

Theorem 12.6.3 (Nies [306]). A set is in Low(W2R, CompR) iff it is K-trivial.

Theorem 12.6.4 (Greenberg and Miller [171]). A set is contained in Low(MLR, W1R) iff its degree is not DNC, and in Low(CompR, W1R) = Low(SchR, W1R) iff its degree is not DNC and not high.

Note that the first part of the above result is part of Theorem 8.10.2.

Theorem 12.6.5 (Bienvenu and Miller [41]). A set is in Low(W2R, SchR) iff it is c.e. traceable.

At the time of writing, there is no known exact characterization of Low(W2R, W1R). There has also been work on highness for randomness notions, a concept introduced by Franklin, Stephan, and Yu [159]. Space considerations preclude us from going further into this material. A survey of lowness and highness for randomness notions may be found in Franklin [154].
13 Algorithmic Dimension
Not all classes of measure 0 are created equal, and the classical theory of dimension provides a method for classifying them. Likewise, some nonrandom sets are more random than others. In this chapter, we look at effectivizations of Hausdorff dimension and other notions of dimension, and explore their relationships with calibrating randomness.
13.1 Classical Hausdorff dimension

The study of measure as a way of specifying the size of sets began with work of Borel and others in the late 19th century. In his famous thesis [237], Lebesgue introduced the measure that is now called Lebesgue measure. In 1914, Carathéodory [54] gave a more general construction that included Lebesgue measure as a special case. For ℝ, Carathéodory’s definition yields the s-dimensional measure μ^s(A) = inf{Σ_i |I_i|^s : A ⊆ ⋃_i I_i}, where each I_i is an interval. In 1919, Hausdorff [177] used s-dimensional measure to generalize the notion of dimension to non-integer values.

This idea of changing the way we measure open sets by an additional factor in the exponent is realized in Cantor space as follows. For 0 ≤ s ≤ 1, the s-measure of a clopen set ⟦σ⟧ is μ^s(⟦σ⟧) = 2^{−s|σ|}. Notice that the 1-measure is the same as the uniform measure.

Definition 13.1.1. (i) A set of strings D is an n-cover of R ⊆ 2^ω if R ⊆ ⟦D⟧ and D ⊆ 2^{≥n}.

(ii) Let H^s_n(R) = inf{Σ_{σ∈D} μ^s(⟦σ⟧) : D is an n-cover of R}.

R.G. Downey and D. Hirschfeldt, Algorithmic Randomness and Complexity, Theory and Applications of Computability, DOI 10.1007/978-0-387-68441-3_13, © Springer Science+Business Media, LLC 2010
(iii) The s-dimensional outer Hausdorff measure of R is H^s(R) = lim_n H^s_n(R).

This notion has been widely studied. The fundamental result is the following.

Theorem 13.1.2. For each R ⊆ 2^ω there is an s ∈ [0, 1] such that (i) H^t(R) = 0 for all t > s and (ii) H^u(R) = ∞ for all 0 ≤ u < s.

Proof. For all s, r ∈ [0, 1] and n ∈ ℕ,

H^{s+r}_n(R) = inf{ Σ_{σ∈D} 2^{−s|σ|} 2^{−r|σ|} : D is an n-cover of R } ≤ 2^{−rn} H^s_n(R).
So if H^s(R) = 0 then H^{s+r}(R) = 0, while if H^{s+r}(R) = ∞ then H^s(R) = ∞. To complete the proof, notice that if t > 1 then lim_n H^t_n(R) ≤ lim_n Σ_{σ∈2^n} 2^{−t|σ|} = lim_n 2^n 2^{−tn} = 0.

Theorem 13.1.2 means that the following definition makes sense.

Definition 13.1.3. For R ⊆ 2^ω, the Hausdorff dimension of R is dim_H(R) = inf{s : H^s(R) = 0}.

Hausdorff dimension has a number of important basic properties.

Proposition 13.1.4.
(i) If μ(X) > 0 then dimH (X) = 1.
(ii) (monotonicity) If X ⊆ Y then dim_H(X) ≤ dim_H(Y).

(iii) (countable stability) If P is a countable collection of subsets of 2^ω, then dim_H(⋃_{X∈P} X) = sup_{X∈P} dim_H(X). In particular, dim_H(X ∪ Y) = max{dim_H(X), dim_H(Y)}.

Proof sketch. This proposition is well known, but its proof is worth sketching. We begin with a lemma.

Lemma 13.1.5. If X is measurable then H^1(X) = μ(X).

Proof. By definition, μ(X) is the infimum of Σ_{σ∈U} 2^{−|σ|} over all covers U of X. For any cover U of X, we can replace any σ ∈ U such that |σ| < n by its extensions of length n to obtain an n-cover V of X such that Σ_{σ∈V} 2^{−|σ|} = Σ_{σ∈U} 2^{−|σ|}, so H^1_n(X) = μ(X) for all n.

Part (i) of Proposition 13.1.4 follows immediately from the lemma. To see that part (ii) holds, take any s > dim_H(Y). Then since any n-cover of Y is an n-cover of X, we have H^s_n(X) ≤ H^s_n(Y) for all n, so H^s(X) = 0. Part (iii) is proved by a similar manipulation of covers.
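The behavior of the s-measures of covers around the dimension of a class can be seen numerically on a standard example. The following Python sketch is not from the text; all names are invented for illustration. We take R = {X : every odd-position bit of X is 0}, whose Hausdorff dimension is 1/2, and compute the s-measure of the obvious 2m-cover: the even bits are free and the odd bits are forced, giving 2^m strings of length 2m, so the total s-measure is 2^m · 2^{−2ms} = 2^{m(1−2s)}, which tends to 0 for s > 1/2 and to ∞ for s < 1/2.

```python
from itertools import product

def cover(m):
    """An n-cover of R with n = 2m: freely chosen even bits, forced odd bits."""
    return ["".join(b + "0" for b in bits) for bits in product("01", repeat=m)]

def s_measure(m, s):
    """Sum of mu^s(sigma) = 2^(-s|sigma|) over the cover; |sigma| = 2m."""
    return len(cover(m)) * 2.0 ** (-s * 2 * m)  # = 2^(m(1 - 2s))
```

At s = 1/2 the s-measure of every such cover is exactly 1, while above (resp. below) 1/2 it shrinks (resp. grows) exponentially in m, matching dim_H(R) = 1/2.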
13.2 Hausdorff dimension via gales

The first person to effectivize Hausdorff dimension explicitly was Lutz [252, 253, 254]. To do so, he used a generalization of the notion of (super)martingale. This idea was, however, in a sense implicit in the work of Schnorr [348, 349], who used orders to calibrate the rates of success of martingales, as we have seen in Chapter 7, in much the same way that the s factor calibrates the growth rate of the measure of covers. Recall that, for a (super)martingale d and an order h, the h-success set of d is S_h[d] = {A : lim sup_n d(A ↾ n)/h(n) = ∞}. As we will see, the theory of Hausdorff dimension can be developed in terms of orders on martingales, but Lutz [252, 254] originally used the following notions.

Definition 13.2.1 (Lutz [252, 254]). An s-gale is a function d : 2^{<ω} → ℝ^{≥0} such that d(σ) = 2^{−s}(d(σ0) + d(σ1)) for all σ. An s-supergale is a function d : 2^{<ω} → ℝ^{≥0} such that d(σ) ≥ 2^{−s}(d(σ0) + d(σ1)) for all σ.

We define the success set S[d] of an s-(super)gale in the same way as for martingales. A 1-(super)gale is the same as a (super)martingale. For s < 1, we can think of an s-gale as capturing the idea of betting in an inflationary environment, in which not betting costs us money. In the case of martingales, if we are not prepared to favor one side or the other in our bet following σ, we just make d(σi) = d(σ) for i = 0, 1, and are assured of retaining our current capital. In the case of an s-gale with s < 1, we are no longer able to do so. If we want to have d(σ0) = d(σ1), then we will have d(σi) < d(σ) for i = 0, 1, and hence will necessarily lose money.

It is quite easy to go between s-gales and rates of success of martingales.

Lemma 13.2.2. For each (super)martingale d there is an s-(super)gale d̃ such that S[d̃] = S_{2^{(1−s)n}}[d]. Conversely, for each s-(super)gale d there is a (super)martingale d̃ such that S[d] = S_{2^{(1−s)n}}[d̃].

Proof. For a (super)martingale d, let d̃ be defined by d̃(σ) = 2^{(s−1)|σ|} d(σ). It is easy to check that d̃ is an s-(super)gale. For each k, we have d(A ↾ n)/2^{(1−s)n} > k iff d̃(A ↾ n) = 2^{(s−1)n} d(A ↾ n) > 2^{(s−1)n} 2^{(1−s)n} k = k, so S[d̃] = S_{2^{(1−s)n}}[d]. The converse is symmetric.

Note that, in the above proof, if s is rational then we can pass effectively from d to d̃ or vice versa, so, for instance, d is c.e. iff d̃ is c.e. Lutz’ key insight was that Hausdorff dimension can be quite naturally characterized using either s-gales or growth rates of martingales.

Theorem 13.2.3 (Lutz [252, 254]). For a class X ⊆ 2^ω the following are equivalent.
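The translation in Lemma 13.2.2 is a one-line rescaling, and the s-gale identity can be checked mechanically. The following Python sketch is not from the text; the toy martingale `d` and helper names are invented for illustration. Given a martingale d, it forms d̃(σ) = 2^{(s−1)|σ|} d(σ) and verifies the s-gale condition d̃(σ) = 2^{−s}(d̃(σ0) + d̃(σ1)), which follows from d(σ0) + d(σ1) = 2d(σ).

```python
def to_sgale(d, s):
    """The s-gale induced by a martingale d: d~(sigma) = 2^((s-1)|sigma|) d(sigma)."""
    return lambda sigma: 2.0 ** ((s - 1) * len(sigma)) * d(sigma)

def d(sigma):
    """Toy martingale: bet half the capital on the next bit being 1."""
    cap = 1.0
    for b in sigma:
        cap *= 1.5 if b == "1" else 0.5
    return cap

s = 0.7
g = to_sgale(d, s)
```

Checking the s-gale condition at a few strings confirms the rescaling works, and success at rate 2^{(1−s)n} for d translates into plain success for g exactly as in the lemma.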
(i) dim_H(X) = r.

(ii) r = inf{s ∈ ℚ : X ⊆ S[d] for some s-(super)gale d}.

(iii) r = inf{s ∈ ℚ : X ⊆ S_{2^{(1−s)n}}[d] for some (super)martingale d}.

Proof. By Lemma 13.2.2, (ii) and (iii) are equivalent.

We first prove that dim_H(X) ≥ inf{s ∈ ℚ : X ⊆ S_{2^{(1−s)n}}[d] for some (super)martingale d}. We first consider the supermartingale formulation. Let s > dim_H(X) and let {U_k}_{k∈ω} witness that X is H^s-null. That is, for all k we have X ⊆ ⟦U_k⟧ and Σ_{σ∈U_k} 2^{−s|σ|} ≤ 2^{−k}. We may assume that the U_k are prefix-free. For each σ and k, let U^σ_k = {τ ∈ U_k : σ ⪯ τ}. For each k, let
d_k(σ) = 2^{|σ|} Σ_{τ∈U^σ_k} 2^{−s|τ|},

and let d(σ) = Σ_k d_k(σ). If σ ∈ U_k, then U^{σi}_k = ∅ for i = 0, 1, so d_k(σ0) + d_k(σ1) = 0 ≤ 2d_k(σ). If σ ∉ U_k, then U^σ_k = U^{σ0}_k ∪ U^{σ1}_k, so

d_k(σ0) + d_k(σ1) = 2^{|σ|+1} ( Σ_{τ∈U^{σ0}_k} 2^{−s|τ|} + Σ_{τ∈U^{σ1}_k} 2^{−s|τ|} ) = 2 · 2^{|σ|} Σ_{τ∈U^σ_k} 2^{−s|τ|} = 2d_k(σ).
Thus each d_k is a supermartingale. Furthermore, d_k(λ) = Σ_{σ∈U_k} 2^{−s|σ|} ≤ 2^{−k}, so d is a supermartingale. Now suppose that A ∈ ⋂_k ⟦U_k⟧. Then for each k there is an n_k such that A ↾ n_k ∈ U_k. Let t > s. Then

d(A ↾ n_k)/2^{(1−t)n_k} ≥ d_k(A ↾ n_k)/2^{(1−t)n_k} = 2^{(1−s)n_k}/2^{(1−t)n_k} = 2^{(t−s)n_k},

so lim sup_n d(A ↾ n)/2^{(1−t)n} = ∞, and hence A ∈ S_{2^{(1−t)n}}[d]. Since t > s > dim_H(X) are arbitrary, dim_H(X) ≥ inf{s ∈ ℚ : X ⊆ S_{2^{(1−s)n}}[d] for some supermartingale d}.
To get the same result for martingales,¹ we slightly change the definition of d_k to

d_k(σ) = 2^{|σ|} Σ_{τ∈U^σ_k} 2^{−s|τ|} if U^σ_k ≠ ∅; d_k(σ) = 2^{(1−s)m} if σ ↾ m ∈ U_k for some m < |σ|; and d_k(σ) = 0 otherwise.

We now prove that dim_H(X) ≤ inf{s ∈ ℚ : X ⊆ S_{2^{(1−s)n}}[d] for some supermartingale d}. Suppose that d is a supermartingale and X ⊆ S_{2^{(1−s)n}}[d]. We may assume that d(λ) ≤ 1. Let

V_k = { σ : d(σ)/2^{(1−s)|σ|} ≥ 2^k }.

Let U_k be the set of σ ∈ V_k such that τ ∉ V_k for all τ ≺ σ. Then the U_k are prefix-free and cover X. Furthermore,

Σ_{σ∈U_k} 2^{−s|σ|} ≤ Σ_{σ∈U_k} 2^{−s|σ|} d(σ)/(2^{(1−s)|σ|} 2^k) = 2^{−k} Σ_{σ∈U_k} 2^{−|σ|} d(σ) ≤ 2^{−k},
the last inequality following by Kolmogorov’s Inequality (Theorem 6.3.3). Thus {Uk }k∈ω witnesses the fact that X is H s -null. Lutz [255] made the following remarks about the above characterization: “Informally speaking, the above theorem says that the dimension of a set is the most hostile environment (i.e., the most unfavorable payoff schedule, i.e., the infimum s) in which a single betting strategy can achieve infinite winnings on every element of the set.”
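The cover-to-supermartingale construction in the proof above is directly computable for a finite cover. The following sketch (our own illustration; the prefix-free set $U$ and the exponent $s$ are arbitrary toy choices, not from the text) checks the supermartingale inequality for $d_k$.

```python
from itertools import product

s = 0.5
U = ["00", "010", "110"]   # a finite prefix-free set of strings (toy choice)

def d_k(sigma):
    """The supermartingale built from the cover U as in the proof:
    d_k(sigma) = 2^|sigma| * sum over tau in U extending sigma of 2^(-s|tau|)."""
    return 2 ** len(sigma) * sum(
        2 ** (-s * len(tau)) for tau in U if tau.startswith(sigma)
    )

# d_k(lambda) equals the weight of the cover, and the supermartingale
# inequality d_k(sigma0) + d_k(sigma1) <= 2 d_k(sigma) holds at every string
# (with equality except where the capital drops to 0 past an element of U).
assert abs(d_k("") - sum(2 ** (-s * len(tau)) for tau in U)) < 1e-9
for n in range(5):
    for bits in product("01", repeat=n):
        sigma = "".join(bits)
        assert d_k(sigma + "0") + d_k(sigma + "1") <= 2 * d_k(sigma) + 1e-9
```

The slack in the inequality appears exactly at strings in $U$, where the proof freezes (or zeroes) the capital.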
13.3 Effective Hausdorff dimension

Using the martingale characterization of Hausdorff dimension, we can effectivize that notion as follows.

Definition 13.3.1 (Lutz [252, 253, 254]). For a complexity class $C$ of real-valued functions, the $C$ (Hausdorff) dimension of $R \subseteq 2^\omega$ is $\inf\{s \in \mathbb{Q} : R \subseteq S_{2^{(1-s)n}}[d]$ for some martingale $d \in C\}$. For $A \in 2^\omega$, the $C$ dimension of $A$ is the $C$ dimension of $\{A\}$.

Note that this infimum is equivalent to $\inf\{s \in \mathbb{Q} : R \subseteq S[d]$ for some $s$-gale $d \in C\}$.

¹In the effective setting, we will discuss the issue of passing from supergales to gales when we prove Theorem 13.3.2.
We will be particularly interested in the case $C = \Sigma^0_1$. The $\Sigma^0_1$ Hausdorff dimension of $R$ is sometimes called the constructive (Hausdorff) dimension of $R$, but we will refer to it as the effective Hausdorff dimension, or simply the effective dimension, of $R$, and denote it by $\dim(R)$.

In Lutz's original paper [252], he defined $C$ dimension using gales, but in the journal version [254], he used supergales. The issue here is one of multiplicative optimality versus universality. Let us consider the $\Sigma^0_1$ case. By Theorem 6.3.5, there is a universal c.e. martingale, but while there is an optimal c.e. supermartingale (Theorem 6.3.7), there is no optimal c.e. martingale (Corollary 6.3.12). As noted by Hitchcock, Lutz, and others, it is not known whether there is always a c.e. $r$-gale that is universal for the class of all c.e. $r$-supergales. However, we have the following result, which shows that if $r < t$ then there is a $t$-gale that is optimal for the class of all $r$-supergales. Thus gales suffice for defining effective dimension, and Lutz's original formulation of effective dimension in terms of gales is equivalent to the one in terms of supergales.

Theorem 13.3.2 (Hitchcock [179]). Let $0 \leq r < t$ be rationals. There is a c.e. $t$-gale $d^*$ such that for all c.e. $r$-supergales $d$, we have $S[d] \subseteq S[d^*]$.

Proof. Let $d$ be a multiplicatively optimal c.e. $r$-supergale with $d(\lambda) < 1$ (constructed as in the proof of Theorem 6.3.7). It is enough to build $d^*$ so that $S[d] \subseteq S[d^*]$. Let $A = \{\sigma : d(\sigma) > 1\}$ and let $A^{[n]} = A \cap 2^n$. By Kolmogorov's Inequality (Theorem 6.3.3), $|A^{[n]}| \leq 2^{rn}$ for all $n$. Let $d^*_n$ be defined by
$$d^*_n(\sigma) = \begin{cases} 2^{-t(n-|\sigma|)}\, |\{\tau \in A^{[n]} : \sigma \preceq \tau\}| & \text{if } |\sigma| \leq n \\ 2^{(t-1)(|\sigma|-n)}\, d^*_n(\sigma \upharpoonright n) & \text{if } |\sigma| > n. \end{cases}$$
Then it is easy to check that each $d^*_n$ is a $t$-gale and $d^*_n(\sigma) = 1$ for all $\sigma \in A^{[n]}$. Now let $s \in (r, t)$ be rational and let
$$d^* = \sum_n 2^{(s-r)n} d^*_n.$$
Then $d^*(\lambda) = \sum_n 2^{(s-r)n} 2^{-tn} |A^{[n]}| \leq \sum_n 2^{(s-t)n} < \infty$, so $d^*$ is well-defined, and hence is a $t$-gale. Furthermore, it is c.e., since $A$ is c.e.

Let $X \in S[d]$. Then $X \upharpoonright n \in A$ for infinitely many $n$. For such $n$,
$$d^*(X \upharpoonright n) \geq 2^{(s-r)n} d^*_n(X \upharpoonright n) = 2^{(s-r)n}.$$
Hence $X \in S[d^*]$.

Of course, this result translates into the language of martingales. Say that a (super)martingale $d$ $s$-succeeds on $A$ if $\limsup_n \frac{d(A \upharpoonright n)}{2^{(1-s)n}} = \infty$. We can build a c.e. martingale that $(s + \varepsilon)$-succeeds on every set $A$ for which there is a c.e. supermartingale that $s$-succeeds on $A$. However, we do not know
whether for each such $A$ there necessarily is a martingale that $s$-succeeds on $A$.

In any case, $\dim(R)$ is equal to $\inf\{s \in \mathbb{Q} : R \subseteq S_{2^{(1-s)n}}[d]\}$, where $d$ is an optimal c.e. supermartingale. It follows immediately that $\dim(R) = \sup\{\dim(A) : A \in R\}$ and, more generally, the effective dimension of an arbitrary union of classes is the supremum of the effective dimensions of the individual classes. (Notice that for classical Hausdorff dimension, the corresponding fact is true only for countable unions.)

We can also effectivize Definition 13.1.1 directly. For $\Sigma^0_1$ dimension, we have the following.

Definition 13.3.3.

(i) A set of strings $D$ is an effective $n$-cover if it is an $n$-cover and c.e.

(ii) Let $\widehat{\mathcal{H}}^s_n(R) = \inf\{\sum_{\sigma \in D} \mu_s(\sigma) : D$ is an effective $n$-cover of $R\}$.

(iii) The effective $s$-dimensional outer Hausdorff measure of $R$ is $\widehat{\mathcal{H}}^s(R) = \lim_n \widehat{\mathcal{H}}^s_n(R)$.

(iv) The effective Hausdorff dimension of $R$ is the unique $s \in [0,1]$ such that $\widehat{\mathcal{H}}^t(R) = 0$ for all $t > s$ and $\widehat{\mathcal{H}}^u(R) = \infty$ for all $0 \leq u < s$.

It is straightforward to effectivize our earlier work and show that the effective analog of Theorem 13.2.3 holds. That is, the above definition agrees with our earlier definition of effective Hausdorff dimension. A similar comment holds for other suitable complexity classes.

There is another fundamental characterization of effective Hausdorff dimension, in terms of initial segment complexity.

Theorem 13.3.4 (Mayordomo [262]). For $A \in 2^\omega$, we have
$$\dim(A) = \liminf_n \frac{K(A \upharpoonright n)}{n} = \liminf_n \frac{C(A \upharpoonright n)}{n}.^2$$

Proof. The second equation is immediate, since $C$ and $K$ agree up to a log factor. We prove the first equation.

First suppose that $\dim(A) < s$. Let $d$ be a universal c.e. supermartingale. Then $\limsup_n \frac{d(A \upharpoonright n)}{2^{(1-s)n}} = \infty$. Now, as noted in Section 6.3.2, $d(A \upharpoonright n) = 2^{n - KM(A \upharpoonright n) \pm O(1)}$, so $\limsup_n 2^{sn - KM(A \upharpoonright n)} = \infty$. But by Theorem 3.15.4, $K(A \upharpoonright n) \leq KM(A \upharpoonright n) + O(\log n)$, so $\limsup_n 2^{sn - K(A \upharpoonright n) + O(\log n)} = \infty$. Thus there are infinitely many $n$ such that $K(A \upharpoonright n) < sn + O(\log n)$, and hence $\liminf_n \frac{K(A \upharpoonright n)}{n} \leq s$.
²Staiger [378] showed that Theorem 13.3.4 can be obtained from results of Levin in [425]. There were also a number of earlier results indicating the deep relationship between effective Hausdorff dimension and Kolmogorov complexity, such as those of Cai and Hartmanis [45], Ryabko [338, 339, 340], and Staiger [376, 377]. These results are discussed in Lutz [254] (Section 6, in particular) and Staiger [378].
Now suppose that $\liminf_n \frac{K(A \upharpoonright n)}{n} < r < s$ for rationals $r$ and $s$. Let $D = \{\sigma : K(\sigma) < r|\sigma|\}$. Then $D$ is c.e., and by Theorem 3.7.6, $|D \cap 2^n| \leq O(2^{rn - K(n)})$. Let
$$d(\sigma) = 2^{(s-r)|\sigma|} \Bigl( 2^{r|\sigma|} \sum_{\substack{\tau \in D \\ \sigma \preceq \tau}} 2^{-r|\tau|} \;+\; \sum_{\substack{\nu \in D \\ \nu \prec \sigma}} 2^{(r-1)(|\sigma|-|\nu|)} \Bigr).$$
Then $d(\lambda) \leq \sum_{\tau \in D} 2^{-r|\tau|} \leq O(\sum_n 2^{rn - K(n)} 2^{-rn}) = O(\sum_n 2^{-K(n)}) < \infty$, and a straightforward calculation shows that $d$ is an $s$-gale. Since $D$ is c.e., so is $d$. There are infinitely many $n$ such that $A \upharpoonright n \in D$. For any such $n$, we have $d(A \upharpoonright n) \geq 2^{(s-r)n}$, so $d$ succeeds on $A$, and hence $\dim(A) \leq s$.³

One consequence of this characterization is that if $A$ has positive effective Hausdorff dimension, then $A$ is complex, since if $\dim(A) > r > 0$, then $C(A \upharpoonright n) > rn$ for almost all $n$.

The Kolmogorov complexity characterization of effective dimension also allows us to easily produce sets with fractional effective dimension. Let $A$ be 1-random. Then $B = A \oplus 0^\omega$ has effective dimension $\frac{1}{2}$, since $K(B \upharpoonright n) = K(A \upharpoonright \lceil \frac{n}{2} \rceil) \pm O(1)$, and hence $\liminf_n \frac{K(B \upharpoonright n)}{n} = \liminf_n \frac{K(A \upharpoonright \lceil \frac{n}{2} \rceil)}{n} = \frac{1}{2}$.

Recall the following definition. Given an infinite and coinfinite set $X$, let $x_0 < x_1 < \cdots$ be the elements of $X$, let $y_0 < y_1 < \cdots$ be the elements of $\overline{X}$, and define $C \oplus_X D$ to be $\{x_n : n \in C\} \cup \{y_n : n \in D\}$. For a computable real $r \in [0,1]$, we can find a computable $X$ such that $\lim_n \frac{|X \cap [0,n)|}{n} = r$. Then it is easy to check that $A \oplus_X 0^\omega$ has effective dimension $r$. Thus we have the following result.
Theorem 13.3.5 (Lutz [252, 254]). For every computable real $s$, there is a set of effective Hausdorff dimension $s$.

Clearly, every 1-random set has effective dimension 1, but if we take $r = 1$ in the above construction, then we have an example of a set of effective dimension 1 that is not 1-random.

Schnorr [349] considered exponential orders on martingales, and hence was in a sense implicitly studying dimension. We thank Sebastiaan Terwijn for the following comments on Schnorr's work. There is no explicit reference in Schnorr's book to Hausdorff dimension. After introducing orders of growth of martingales, he places special emphasis on exponential orders. It is interesting to ask why he did this, and whether he might have been inspired by the theory of dimension, but when asked at a recent meeting, he said he was not so motivated, so it is unclear why he gave such emphasis to exponential orders. Chapter 17 of [349] is titled "Die Zufallsgesetze von exponentieller Ordnung" ("The statistical laws of exponential order") and it starts as follows: "Nach unserer Vorstellung entsprechen den wichtigsten Zufallsgesetzen Nullmengen mit schnell wachsenden Ordnungsfunktionen. Die exponentiell wachsenden Ordnungsfunktionen sind hierbei von besonderer Bedeutung."⁴

Satz 17.1 then says that for any measure 0 set $A$ the following are equivalent.

1. There are a computable martingale $d$ and an $a > 1$ such that $A$ is contained in $\{X : \limsup_n \frac{d(X \upharpoonright n)}{a^n} > 0\}$.

2. There are a Schnorr test $\{U_n\}_{n\in\omega}$ and a $b > 0$ such that $A$ is contained in $\{X : \limsup_n (m_U(X \upharpoonright n) - bn) > 0\}$, where $m_U$ is a "Niveaufunktion" from strings to numbers defined by $m_U(\sigma) = \min\{n : \sigma \in U_n\}$.

This result is a test characterization saying something about the speed at which a set is covered, and resembles characterizations of effective Hausdorff dimension.

In this chapter we will focus on the $\Sigma^0_1$ notion of effective dimension, but as with notions of algorithmic randomness, there are many possible variations. For instance, we can vary the arithmetic complexity of the gales used in the definition of dimension. In Section 13.15, we will examine the effect of replacing $\Sigma^0_1$ gales by computable ones, to yield a notion of computable dimension. We can also take the test set approach and study variations on that, for instance Schnorr dimension, which turns out to be equivalent to computable dimension, as we will see in Section 13.15.

Some known relationships between notions of randomness and dimension are summarized in the following diagram from [118]. Here $\Delta^0_2$ randomness is the relativization of computable randomness to $\emptyset'$ (so not the same as 2-randomness), and similarly for Schnorr $\Delta^0_2$ randomness and $\Delta^0_2$ dimension.

³Using the characterization in Definition 13.3.3, we can give a slightly cleaner proof. Let $D_n = \{\sigma \in 2^n : K(\sigma) < r|\sigma|\}$. Then there are infinitely many $n$ such that $D_n$ is an effective $n$-cover of $A$, and for all $n$, we have $\sum_{\sigma \in D_n} 2^{-r|\sigma|} \leq \sum_{\sigma \in D_n} 2^{-K(\sigma)} < 1$. Thus $\dim(A) \leq r$.
Δ02 random
      ⇓
Schnorr Δ02 random    =⇒    Δ02 dimension 1
      ⇓                           ⇓
1-random              =⇒    effective dimension 1
      ⇓                           ⇓
computably random
      ⇓
Schnorr random        =⇒    computable dimension 1

No other implications hold than the ones indicated.
4 “In our opinion, the important statistical laws correspond to null sets with fast growing orders. Here the exponentially growing orders are of special significance.”
13.4 Shift complex sets

Not every set of high effective Hausdorff dimension is a "watered down" version of a 1-random set.

Definition 13.4.1. Let $0 < d < 1$. A set $A$ is $d$-shift complex if for every $m < n$, we have $K(A \upharpoonright [m, n]) \geq d(n - m) - O(1)$.

If $A$ is $d$-shift complex then $K(A \upharpoonright n) \geq dn - O(1)$, so the effective Hausdorff dimension of $A$ is at least $d$. On the other hand, a 1-random set cannot be $d$-shift complex for any $d > 0$, since it must contain arbitrarily long sequences of 0's. Thus Theorem 13.4.3 below, which states that $d$-shift complex sets exist for every $d \in (0,1)$, shows that sets of high effective dimension can be quite different from 1-random sets. (Note that any 1-shift complex set would be 1-random, and hence no such sets exist.) We give a proof due to Miller [274], though the original proof in Durand, Levin, and Shen [138] is also not complicated.

We say that a set $A$ avoids a string $\sigma$ if there are no $m < n$ such that $\sigma = A \upharpoonright [m, n]$. Theorem 13.4.3 also follows from a stronger result due to Rumyantsev and Ushakov [337], who showed that if $c < 1$ and for each $n$ we have a set $F_n \subseteq 2^n$ with $|F_n| \leq 2^{cn}$, then there is an $A$ such that for all sufficiently large $n$, the set $A$ avoids every element of $F_n$.

Lemma 13.4.2 (Miller [274]). Let $S \subseteq 2^{<\omega}$ and $c \in (\frac{1}{2}, 1)$ be such that $\sum_{\tau \in S} c^{|\tau|} \leq 2c - 1$. Then there is an $A$ that avoids every element of $S$.

Proof. Let $p = \sum_{\tau \in S} c^{|\tau|}$. Since $p < 1$, we have $\lambda \notin S$. Let
$$w(\sigma) = \sum\,\{c^{|\rho|} : \exists \tau \in S\, (|\rho| < |\tau| \wedge \sigma\rho \text{ ends in } \tau)\}.$$
We think of $w(\sigma)$ as measuring the threat that an extension of $\sigma$ ends in an element of $S$. Note in particular that if $\sigma$ itself ends in some $\tau \in S$ then $w(\sigma) \geq c^{|\lambda|} = 1$, so it is enough to build $A$ so that $w(A \upharpoonright n) < 1$ for all $n$. We have $w(\lambda) = 0$. Now suppose that we have defined $A \upharpoonright n = \sigma$ so that $w(\sigma) < 1$. It is easy to check that $c(w(\sigma 0) + w(\sigma 1)) = w(\sigma) + p < 2c$, so there is an $i < 2$ such that $w(\sigma i) < 1$, and we can let $A(n) = i$.

Theorem 13.4.3 (Durand, Levin, and Shen [138]). Let $0 < d < 1$. There is a $d$-shift complex set.

Proof.
Let $b = \lceil -\log(1-d) \rceil + 1$, let $c = 2^{-d}$, and let $S = \{\tau \in 2^{<\omega} : K(\tau) \leq d|\tau| - b\}$. We have
$$\sum_{\tau \in S} c^{|\tau|} = \sum_{\tau \in S} 2^{-d|\tau|} \leq \sum_{\tau \in S} 2^{-K(\tau) - b} = 2^{-b} \sum_{\tau \in S} 2^{-K(\tau)} \leq 2^{-b} \leq \frac{1-d}{2} < 2^{1-d} - 1 = 2c - 1,$$
602
13. Algorithmic Dimension
where the last inequality is easy to show using the fact that d ∈ (0, 1). Thus, by the lemma, there is an A that avoids every element of S. This A is clearly d-shift complex.
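The greedy strategy in the proof of Lemma 13.4.2 is directly implementable when $S$ is finite. The following sketch is our own toy illustration (the parameters $c$ and $S$ are arbitrary choices satisfying the lemma's hypothesis, not values from the text).

```python
# Our sketch of the greedy construction from Lemma 13.4.2: extend A one bit
# at a time, keeping the "threat" w(A|n) below 1, so that no element of S
# ever occurs as a factor of A.
c = 0.6                          # c in (1/2, 1)
S = ["00000", "11111"]           # toy forbidden words

# Hypothesis of the lemma: sum of c^|tau| over S is at most 2c - 1.
assert sum(c ** len(tau) for tau in S) <= 2 * c - 1

def w(sigma):
    """Sum of c^|rho| over strings rho such that appending rho to sigma
    would complete some forbidden word tau (with |rho| < |tau|)."""
    rhos = set()
    for tau in S:
        for j in range(1, len(tau) + 1):   # j = |tau| - |rho| >= 1
            head, rho = tau[:j], tau[j:]
            if sigma.endswith(head):
                rhos.add(rho)
    return sum(c ** len(rho) for rho in rhos)

A = ""
for n in range(40):
    # The lemma guarantees at least one extension keeps the threat below 1.
    bit = "0" if w(A + "0") < 1 else "1"
    assert w(A + bit) < 1
    A += bit

assert not any(tau in A for tau in S)   # A avoids every element of S
```

With these parameters the construction settles into the periodic pattern 0001 0001 ..., flipping a bit exactly when a run of 0s threatens to complete the forbidden word.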
13.5 Partial randomness

Results such as Theorem 13.3.4 show that effective dimension can be thought of as a notion of "partial randomness". Indeed, as we saw above, a set such as $A \oplus 0^\omega$ for a 1-random $A$, which one would intuitively think of as "$\frac12$-random", does indeed have effective dimension $\frac12$. In this section, we look at a different, though closely related, approach to defining a notion of $s$-randomness for $s \in [0,1]$.

We begin with the measure-theoretic approach. We will see that there are at least two reasonable ways to define the concept of $s$-Martin-Löf randomness. The first is a straightforward generalization of the original definition.

Definition 13.5.1 (Tadaki [385]). Let $s \in [0,1]$.⁵

(i) A test for weak $s$-Martin-Löf randomness is a sequence of uniformly c.e. sets of strings $\{V_k\}_{k\in\omega}$ such that $\sum_{\sigma \in V_k} 2^{-s|\sigma|} \leq 2^{-k}$ for all $k$.

(ii) $A$ is weakly $s$-Martin-Löf random if $A \notin \bigcap_k \llbracket V_k \rrbracket$ for all tests for weak $s$-Martin-Löf randomness $\{V_k\}_{k\in\omega}$.

We can also define $A$ to be weakly $s$-random if $K(A \upharpoonright n) \geq sn - O(1)$. The analog of Schnorr's Theorem that 1-randomness is the same as Martin-Löf randomness is straightforward.

Theorem 13.5.2 (Tadaki [385]). Let $s \in [0,1]$ be a computable real. A set $A$ is weakly $s$-Martin-Löf random iff $A$ is weakly $s$-random.

Proof. This proof directly mimics the one of Theorem 6.2.3. We give it here as an example. For other similar results whose proofs are straightforward modifications of the $s = 1$ case, we simply mention that fact.

Suppose that $A$ is weakly $s$-Martin-Löf random. Let $V_k = \{\sigma : K(\sigma) \leq s|\sigma| - k\}$. Then $\sum_{\sigma \in V_k} 2^{-s|\sigma|} \leq \sum_{\sigma \in V_k} 2^{-K(\sigma) - k} \leq 2^{-k}$, so the $V_k$ form a test for weak $s$-Martin-Löf randomness, and hence $A \notin \bigcap_k \llbracket V_k \rrbracket$. Thus there is a $k$ such that $K(A \upharpoonright n) > sn - k$ for all $n$.

⁵Below, we will define the notion of tests for strong $s$-Martin-Löf randomness. Since they will determine a stronger notion of $s$-randomness, these tests will be weaker than the ones we define here. The terminology adopted here is meant to avoid the confusion of talking about weak tests in connection with strong randomness and strong tests in connection with weak randomness, or the equally confusing possibility of having objects called "weak tests" that are actually stronger than ones called "strong tests".
Conversely, suppose that $A$ is not weakly $s$-Martin-Löf random. Let $\{V_k\}_{k\in\omega}$ be a test for weak $s$-Martin-Löf randomness such that $A \in \bigcap_k \llbracket V_k \rrbracket$. Then it is easy to check that $\{(s|\sigma| - k, \sigma) : \sigma \in V_{k^2},\, k > 2\}$ is a KC set, so for each $c$ there is an $n$ such that $K(A \upharpoonright n) < sn - c$.

Of course, weak $s$-randomness is closely related to effective dimension.

Proposition 13.5.3. For all $A$, we have $\dim(A) = \sup\{s : A$ is weakly $s$-random$\}$.

Proof. If $A$ is weakly $s$-random then $\liminf_n \frac{K(A \upharpoonright n)}{n} \geq s$, so $\dim(A) \geq s$. Conversely, if $A$ is not weakly $s$-random and $r > s$, then $\liminf_n \frac{K(A \upharpoonright n)}{n} \leq s < r$, so $\dim(A) < r$. Since $r > s$ is arbitrary, $\dim(A) \leq s$.

Whether $A$ is $\dim(A)$-random depends on $A$, of course. We have seen that there are sets of effective dimension 1 that are not 1-random, and it is easy to construct similar examples for $s < 1$. On the other hand, the examples we gave in connection with Theorem 13.3.5 have effective dimension $s$ and are weakly $s$-random, so we have the following extension of that result.

Theorem 13.5.4 (Lutz [252, 254]). For every computable real $s \in [0,1]$, there is a set of effective Hausdorff dimension $s$ that is weakly $s$-random.

The discussion above tells us that the concept of $s$-randomness does give us some information beyond what we can get from effective dimension alone. Using Theorem 13.5.2, we can emulate the proof that $\Omega$ is 1-random to show the following.

Theorem 13.5.5 (Tadaki [385]). Let $s \in (0,1]$ and let
$$\Omega_s = \sum_{U(\sigma)\downarrow} 2^{-\frac{|\sigma|}{s}}.$$
Then $\Omega_s$ is weakly $s$-random.

Similarly, we can construct a universal test for weak $s$-Martin-Löf randomness and develop other tools of the theory of 1-randomness. For example, it is useful to have an analog of Solovay tests for $s$-randomness.

Definition 13.5.6. Let $s \in [0,1]$. A Solovay $s$-test is a c.e. set $S \subseteq 2^{<\omega}$ such that $\sum_{\sigma \in S} 2^{-s|\sigma|} < \infty$. A set $A$ is covered by this test if $A \in \llbracket \sigma \rrbracket$ for infinitely many $\sigma \in S$.

The following result can be proved in much the same way as the $s = 1$ case. Although one direction is slightly weaker than in the $s = 1$ case, this result is enough to establish that for all $A$, we have $\dim(A) = \inf\{s : A$ is covered by some Solovay $s$-test$\}$.
Theorem 13.5.7 (Reimann [324]). If $A$ is weakly $s$-random then $A$ is not covered by any Solovay $s$-test. If $A$ is covered by some Solovay $s$-test then $A$ is not weakly $t$-random for any $t > s$.

The following variant will be useful in the proof of Theorem 13.8.1 below.

Definition 13.5.8. An interval Solovay $s$-test is a c.e. set $T$ of intervals in $[0,1]$ with dyadic rational endpoints such that $\sum_{I \in T} |I|^s < \infty$ (where $|I|$ is the length of $I$).

It is easy to transform an interval Solovay $s$-test $T$ into a Solovay $s$-test $S$ as follows. For each $[q, q+r] \in T$, first let $n$ be largest such that $[q, q+r] \subseteq [q, q+2^{-n}]$. Note that $2^{-n} < 2r$. Let $\sigma_0$ and $\sigma_1$ be such that $0.\sigma_0 = q$ and $0.\sigma_1 = q + 2^{-n}$. Let $\tau_i = (\sigma_i 0^\omega) \upharpoonright n$. Put $\tau_0$ and $\tau_1$ into $S$. It is easy to check that if $A \in I$ for infinitely many $I \in T$ then $A \in \llbracket \tau \rrbracket$ for infinitely many $\tau \in S$, and $\sum_{\tau \in S} 2^{-s|\tau|} \leq 2 \sum_{I \in T} (2|I|)^s < \infty$. Thus, we can use interval Solovay $s$-tests in place of Solovay $s$-tests in bounding the dimension of a real.

Turning now to the martingale approach to randomness, we run into some trouble. The following definitions are natural.

Definition 13.5.9. Let $s \in [0,1]$.

(i) $A$ is martingale $s$-random if for all c.e. martingales $d$, we have $A \notin S_{2^{(1-s)n}}[d]$.

(ii) $A$ is supermartingale $s$-random if for all c.e. supermartingales $d$, we have $A \notin S_{2^{(1-s)n}}[d]$.

It is currently unknown whether these notions are the same, for the kind of reasons we mentioned when discussing Theorem 13.3.2. We would like to show that at least one of these notions of martingale $s$-randomness coincides with weak $s$-randomness, but one direction of the equivalence does not work. Consider Schnorr's proof that if a set is Martin-Löf random then no c.e. (super)martingale can succeed on it. We are given a c.e. (super)martingale $d$, and when we see $d(\sigma) > 2^k$, we put $\sigma$ into a test set $U_k$. Now imagine we follow the same proof for the $s < 1$ case. Here our test sets $V_k$ are sets of strings. We might see that $d_s(\sigma 0) > 2^k$ at some stage $s$ and put $\sigma 0$ into $V_k$. Since $d$ is only c.e., at some later stage $t > s$ we might see that $d_t(\sigma) > 2^k$. We would like to have $\llbracket \sigma \rrbracket \subseteq \llbracket V_k \rrbracket$, but of course putting $\sigma$ into $V_k$ is a waste, since $\sigma 0$ is already there, and indeed the measure calculation in Schnorr's proof is unlikely to work if $V_k$ is not prefix-free. If we were to follow Schnorr's proof, we would simply put $\sigma 1$ into $V_k$. Unfortunately, $2(2^{-s(|\sigma|+1)}) > 2^{-s|\sigma|}$, so the simple measure calculation that allowed Schnorr's proof to succeed is no longer available. This problem was overcome by the following definition of Calude, Staiger, and Terwijn [52].

Definition 13.5.10 (Calude, Staiger, and Terwijn [52]). Let $s \in [0,1]$.
(i) A test for strong $s$-Martin-Löf randomness is a sequence of uniformly c.e. sets of strings $\{V_k\}_{k\in\omega}$ such that for every $k$ and every prefix-free $V' \subseteq V_k$, we have $\sum_{\sigma \in V'} 2^{-s|\sigma|} \leq 2^{-k}$.

(ii) $A$ is strongly $s$-Martin-Löf random if $A \notin \bigcap_k \llbracket V_k \rrbracket$ for all tests for strong $s$-Martin-Löf randomness $\{V_k\}_{k\in\omega}$.

Now the following result can be proved by a straightforward modification of Schnorr's proof mentioned above. The point is that when we notice that $d(\sigma) > 2^k$, we can put $\sigma$ into $V_k$ even if, say, $\sigma 0$ is already there, because any prefix-free subset of $V_k$ will contain at most one of $\sigma$ and $\sigma 0$.

Theorem 13.5.11 (Calude, Staiger, and Terwijn [52]). Let $s \in (0,1]$ be a computable real. A set $A$ is strongly $s$-Martin-Löf random iff it is supermartingale $s$-random.

Reimann and Stephan [330] showed that weak and strong $s$-randomness are indeed distinct. The proof is technical and basically involves careful KC calculations, and we omit it due to space considerations.

Theorem 13.5.12 (Reimann and Stephan [330]). For all computable $s \in (0,1)$, there is a set that is weakly $s$-random but not strongly $s$-Martin-Löf random.

On the other hand, we do have the following.

Theorem 13.5.13 (Reimann [324]). If $t < s$ and $A$ is weakly $s$-random, then $A$ is strongly $t$-Martin-Löf random.

Proof. Suppose that $A$ is not strongly $t$-Martin-Löf random, so there is a test for strong $t$-Martin-Löf randomness $\{U_n\}_{n\in\omega}$ such that $A \in \bigcap_n \llbracket U_n \rrbracket$. For each $n$,
$$\sum_{\sigma \in U_n} 2^{-s|\sigma|} = \sum_k \sum_{\sigma \in U_n \cap 2^k} 2^{-sk} = \sum_k 2^{(t-s)k} \sum_{\sigma \in U_n \cap 2^k} 2^{-tk} \leq \sum_k 2^{(t-s)k}\, 2^{-n},$$
where the last inequality holds because each $U_n \cap 2^k$ is prefix-free. Now, $\sum_k 2^{(t-s)k} < \infty$, so if we let $m$ be such that $\sum_k 2^{(t-s)k - m} \leq 1$, then $\{U_{n+m}\}_{n\in\omega}$ is a test for weak $s$-Martin-Löf randomness witnessing the fact that $A$ is not weakly $s$-random.
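The choice of the shift $m$ in the last step can be made concrete. The sketch below is our own numeric illustration (the rationals $t < s$ are arbitrary choices, not from the text): it finds the least $m$ with $\sum_k 2^{(t-s)k-m} \leq 1$.

```python
# Our numeric illustration of the final step in the proof of Theorem
# 13.5.13: since t < s, the geometric series sum_k 2^((t-s)k) is finite,
# and shifting the test indices by m rescales the weights below the
# weak s-test bound.
t, s = 0.5, 0.7                                       # arbitrary, with t < s

total = sum(2 ** ((t - s) * k) for k in range(1000))  # ~ value of the series
m = 0
while total * 2 ** (-m) > 1:
    m += 1

# With this m, the weight of U_{n+m} is at most total * 2^(-(n+m)) <= 2^(-n).
assert total * 2 ** (-m) <= 1
```

For these parameters the series sums to about $1/(1 - 2^{-0.2}) \approx 7.73$, so $m = 3$ already suffices.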
As pointed out by Calude, Staiger, and Terwijn [52], Theorem 13.5.11 implies that strong s-randomness can be characterized in terms of KM since, as noted in Section 6.3.2, KM (σ) = − log d(σ) + |σ| ± O(1) for an optimal c.e. supermartingale d. But we would still want to have a machine-based characterization of strong s-randomness. The following definition provides one.
Definition 13.5.14 (Downey and Reid, see [323]). Let $f : 2^{<\omega} \to 2^{<\omega}$ be a partial computable function computed by a Turing machine $M$, and let $Q \subseteq \operatorname{rng} f$. The $M$-pullback of $Q$ is the unique subset $Q^*$ of $\operatorname{dom} f$ satisfying the following conditions.⁶

(i) $f(Q^*) = Q$.

(ii) If $\sigma \in Q^*$ and $f(\tau) = f(\sigma)$, then $|\tau| \geq |\sigma|$.

(iii) If $\sigma \in Q^*$ and $f(\tau) = f(\sigma)$ with $|\tau| = |\sigma|$ then $\sigma$ is enumerated into $\operatorname{dom} M$ before $\tau$ is. (We assume that elements enter $\operatorname{dom} M$ one at a time.)

A machine $M$ is $s$-measurable if for all $Q \subseteq \operatorname{rng} M$, we have $\sum_{\sigma \in Q^*} 2^{-s|\sigma|} \leq 1$.

Theorem 13.5.15 (Downey, Reid, and Terwijn, see [323]). Let $s \in (0,1]$ be a computable real. A set $A$ is strongly $s$-Martin-Löf random iff $K_M(A \upharpoonright n) \geq n - O(1)$ for all $s$-measurable machines $M$.

We know of no characterization of strong $s$-randomness in terms of initial segment prefix-free or plain complexity. Before we prove Theorem 13.5.15, we need a lemma, which takes the role of the KC Theorem in this setting.

Lemma 13.5.16 (Downey, Reid, and Terwijn, see [323]). Let $s \in (0,1]$. Let $\{U_k\}_{k\in\omega}$ be a test for strong $s$-Martin-Löf randomness. There is an $s$-measurable machine $M$ such that for each $k$ and $\sigma \in U_{2k+2}$, there is a $\tau$ of length $|\sigma| - (k+1)$ with $M(\tau) = \sigma$.

Proof. If $\sigma \in U_{2k+2}$ then $2^{-|\sigma|} \leq 2^{-s|\sigma|} \leq 2^{-(2k+2)}$, so $|\sigma| \geq 2k+2$, whence $|\sigma| - (k+1) > 0$. So for a single such $\sigma$, requiring that there be a $\tau$ of length $|\sigma| - (k+1)$ with $M(\tau) = \sigma$ does not cause a problem.

We define $M$ in the most obvious way: whenever we see $\sigma$ enter $U_{2k+2}$, we find the leftmost unused $\tau$ of length $|\sigma| - (k+1)$ and let $M(\tau) = \sigma$. We must show that there is always such a $\tau$ available, that is, that for each $n$ there are at most $2^n$ many strings of length $n$ needed for the definition of $M$. As argued above, if $\sigma \in U_{2k+2}$ then $|\sigma| \geq 2k+2$, so if $k > n-1$ then $|\sigma| - (k+1) \geq k+1 > n$. Thus we need to worry about only those $U_{2k+2}$ with $k \leq n-1$. Note that strings $\sigma \in U_{2k+2}$ that require a domain element $\tau$ of length $n$ must have length $n+k+1$.

For a set $S$ of strings, let $\#(S, n)$ denote the number of strings of length $n$ in $S$. As the set of all strings in $U_{2k+2}$ of length $n+k+1$ is prefix-free,
$$\sum\,\{2^{-s(n+k+1)} : \sigma \in U_{2k+2} \wedge |\sigma| = n+k+1\} \leq 2^{-(2k+2)}.$$

⁶Roughly speaking, this definition is an approximation to the notion of $\sigma^*$ for Kolmogorov complexity introduced in Chapter 3.
Thus $\#(U_{2k+2}, n+k+1)\, 2^{-s(n+k+1)} \leq 2^{-(2k+2)}$, so
$$\#(U_{2k+2}, n+k+1) \leq 2^{s(n+k+1)-(2k+2)} \leq 2^{(n+k+1)-(2k+2)} = 2^{n-k-1}.$$
Therefore,
$$\sum_k \#(U_{2k+2}, n+k+1) \leq \sum_k 2^{n-k-1} = 2^{n-1} \sum_k 2^{-k} < 2^n.$$
Finally, we need to show that $M$ is an $s$-measurable machine. Let $Q \subseteq \operatorname{rng} M$ be prefix-free. Then $Q \subseteq \bigcup_k U_{2k+2}$, so let $Q_k = U_{2k+2} \cap Q$, which is a prefix-free subset of $U_{2k+2}$. Then
$$\sum_{\sigma \in Q_k} 2^{-s(|\sigma|-(k+1))} = 2^{s(k+1)} \sum_{\sigma \in Q_k} 2^{-s|\sigma|} \leq 2^{s(k+1)}\, 2^{-(2k+2)} \leq 2^{-k-1}.$$
Since $Q_k$ is prefix-free and $s \leq 1$, it follows that
$$\sum_{\sigma \in Q^*} 2^{-s|\sigma|} \leq \sum_k \sum_{\sigma \in Q_k^*} 2^{-s|\sigma|} \leq \sum_k 2^{-k-1} = 1,$$
where $Q^*$ is as in Definition 13.5.14.

Proof of Theorem 13.5.15. First suppose we have an $s$-measurable machine $M$ such that for each $c$ there is an $n$ with $K_M(A \upharpoonright n) < n - c$. Let $U_k = \{\sigma : K_M(\sigma) < |\sigma| - \frac{k}{s}\}$. The $U_k$ are uniformly c.e., and $A \in \bigcap_k \llbracket U_k \rrbracket$. Now we need a calculation to show that the $U_k$ form a test for strong $s$-Martin-Löf randomness. Let $Q \subseteq U_k$ be prefix-free. Then
$$\sum_{\sigma \in Q} 2^{-s|\sigma|} \leq \sum_{\sigma \in Q} 2^{-s(K_M(\sigma) + \frac{k}{s})} = 2^{-k} \sum_{\sigma \in Q} 2^{-s K_M(\sigma)} \leq 2^{-k} \sum_{\tau \in Q^*} 2^{-s|\tau|} \leq 2^{-k},$$
where $Q^*$ is as in Definition 13.5.14.

Now suppose that $A$ is not strongly $s$-random. Then there is a test $\{U_k\}_{k\in\omega}$ for strong $s$-Martin-Löf randomness with $A \in \bigcap_k \llbracket U_k \rrbracket$. Let $M$ be as in Lemma 13.5.16. Then for each $c$ there is an $n$ such that $K_M(A \upharpoonright n) < n - c$.
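The counting argument in Lemma 13.5.16 is easy to check numerically. The sketch below is our own (the value of $s$ is an arbitrary choice): it confirms that even when every $U_{2k+2}$ is as large as its weight bound allows, the demand for code strings of each length $n$ stays below $2^n$.

```python
# Our numeric check of the counting bound from Lemma 13.5.16: a string
# sigma in U_{2k+2} that needs a code of length n has |sigma| = n + k + 1,
# and |U_{2k+2} ∩ 2^(n+k+1)| <= 2^(s(n+k+1) - (2k+2)).
s = 0.7                            # arbitrary s in (0, 1]

for n in range(1, 15):
    demand = 0
    for k in range(n):             # only k <= n - 1 can demand length-n codes
        demand += int(2 ** (s * (n + k + 1) - (2 * k + 2)))
    assert demand < 2 ** n         # a fresh code string is always available
```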
13.6 A correspondence principle for effective dimension

Hausdorff dimension and its effective counterpart are of course quite different in general. However, there are situations in which they coincide. We have seen that many important subsets of Cantor space are $\Pi^0_1$ classes.
Hitchcock [180] showed that for such classes, and in fact even for arbitrary countable unions of $\Pi^0_1$ classes, the classical Hausdorff dimension of a class is equal to its effective Hausdorff dimension, and hence to the supremum of the effective Hausdorff dimensions of its elements. Thus, for such classes, we can think of classical Hausdorff dimension, which at a first look seems clearly to be a global property of a class, as a pointwise phenomenon.

Theorem 13.6.1 (Hitchcock [180]). Let $P$ be a countable union of $\Pi^0_1$ classes. Then $\dim_H P = \dim P$.

Proof. It is enough to prove the result for the case in which $P$ is a single $\Pi^0_1$ class, since the Hausdorff dimension of a countable union of classes is the supremum of the Hausdorff dimensions of the individual classes, and similarly for effective dimension. It is enough to show that $\dim P \leq \dim_H P$, so let $s > \dim_H P$. Then for each $n$ there is a set of strings $U_n$ such that $\sum_{\sigma \in U_n} 2^{-s|\sigma|} \leq 2^{-n}$ and $P \subseteq \llbracket U_n \rrbracket$. Because $P$ is closed and $2^\omega$ is compact, $P$ is compact, so the $U_n$ can be taken to be finite. Given $n$, we can search for a finite set of strings $V_n$ and a stage $t$ such that $\sum_{\sigma \in V_n} 2^{-s|\sigma|} \leq 2^{-n}$ and $P_t \subseteq \llbracket V_n \rrbracket$, where $P_t$ is as usual the stage $t$ approximation to $P$. We have seen that such a $V_n$ must exist, and we can recognize when we have found one. Then the $V_n$ form a test for weak $s$-Martin-Löf randomness, and $P \subseteq \bigcap_n \llbracket V_n \rrbracket$, so no element of $P$ is weakly $s$-Martin-Löf random, and hence $\dim(A) \leq s$ for all $A \in P$. Thus $\dim P \leq s$. Since $s > \dim_H P$ is arbitrary, $\dim P \leq \dim_H P$.
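Theorem 13.6.1 makes such dimensions concrete. As our own toy example (not from the text), consider the $\Pi^0_1$ class $P$ of sequences avoiding the factor 11, given by a computable tree; its Hausdorff dimension is the standard value $\log_2 \varphi$ for the golden ratio $\varphi$, and by the theorem its effective dimension is the same.

```python
# Our toy example for Theorem 13.6.1: for the Pi^0_1 class P of binary
# sequences with no two consecutive 1s, the number of length-n strings on
# its tree is a Fibonacci number, and (log2 count)/n converges to
# dim_H(P) = dim(P) = log2((1 + sqrt(5))/2) ~ 0.694.
from math import log2, sqrt

def count_no_11(n):
    """Number of binary strings of length n with no factor '11'."""
    a, b = 1, 2                    # counts for lengths 0 and 1
    for _ in range(n):
        a, b = b, a + b
    return a

estimate = log2(count_no_11(60)) / 60
assert abs(estimate - log2((1 + sqrt(5)) / 2)) < 0.02
```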
13.7 Hausdorff dimension and complexity extraction There have been several results about the dimensions of classes of sets defined in computability-theoretic terms. One important theme is the following: given a set of positive effective dimension, to what extent is it possible to extract a greater level of complexity/randomness from that set? As we will see, there are several ways to formalize this question. (Note that we have already seen in Theorem 8.16.6 that there are sets of reasonably high initial segment complexity that do not compute 1-random sets.)
We begin by pointing out a fundamental difference between the degrees of 1-random sets and those of sets of high effective Hausdorff dimension.

Lemma 13.7.1 (Folklore). Let $A \leq_m B$. There is a $C \equiv_m B$ such that $A$ and $C$ have the same effective Hausdorff dimension. The same result holds for any weaker reducibility, such as Turing reducibility.

Proof. Let $f$ be a fast-growing computable function, such as Ackermann's function (though a much slower-growing function than that will do for this argument). For each $n$, replace $A(f(n))$ by $B(n)$. Call the resulting set $C$. Then $C \equiv_m B$, and $A$ and $C$ have the same effective Hausdorff dimension because $K(A \upharpoonright m)$ and $K(C \upharpoonright m)$ are very close for all $m$. The same procedure works for any weaker reducibility.

As we saw in Corollary 8.8.6, Kučera [215] proved that the Turing degrees of 1-random sets are not closed upwards. This fact is true even in the $\Delta^0_2$ degrees, for instance by Theorem 8.8.4. Hence, we have the following.

Corollary 13.7.2 (Folklore). There are $\Delta^0_2$ degrees of effective Hausdorff dimension 1 containing no 1-random sets.

Proof. Choose $\mathbf{a} < \mathbf{b} < \mathbf{0}'$ such that $\mathbf{a}$ is 1-random but $\mathbf{b}$ is not. By Lemma 13.7.1, $\mathbf{b}$ contains a set of effective Hausdorff dimension 1.

Since $\Omega \leq_{tt} \emptyset'$, the method above also gives the following result.

Corollary 13.7.3 (Reimann [324]). The tt-degree of $\emptyset'$ has effective Hausdorff dimension 1.

The only way we have seen to make a set of effective Hausdorff dimension 1 that is not 1-random is to take a 1-random set and "mess it up" by a process such as inserting 0's at a sparse set of locations. It is natural to ask whether this is in some sense the only way we can produce such a set. One way to formalize this question is by asking whether it is always possible to extract randomness from a source of effective Hausdorff dimension 1.

Question 13.7.4 (Reimann). Let $A$ have effective Hausdorff dimension 1. Is there necessarily a 1-random $B \leq_T A$?

We will give a negative answer to this question due to Greenberg and Miller [170] in Section 13.9, where we will present their construction of a set of effective Hausdorff dimension 1 and minimal Turing degree. We saw in Corollary 6.9.5 that no minimal degree is 1-random.

Similarly, one of the only methods we have seen for making a set of fractional effective Hausdorff dimension is to take a set of higher effective Hausdorff dimension and "thin it out" (the other method being the construction of shift complex sets in Section 13.4). Again we are led to ask whether, given a set $A$ of fractional effective Hausdorff dimension, it is always possible to find a set $B \leq_T A$ of higher effective Hausdorff dimension, or even of effective Hausdorff dimension 1, or even a 1-random $B \leq_T A$. By
Lemma 13.7.1, these questions remain the same if we replace ≤_T by ≡_T, so we can think of them as questions about the relationship between the effective Hausdorff dimension of A and that of its degree. Thus we can ask the following question.

Question 13.7.5 (Reimann, Terwijn). Is there a Turing degree a of effective Hausdorff dimension strictly between 0 and 1? If so, can a be Δ^0_2?

Question 13.7.5 is known as the broken dimension question. In Section 13.8, we will give a positive answer to it due to Miller [273]. Before doing so, we mention some earlier attempts to solve this question that are interesting and insightful in their own right.

We have seen in Theorem 8.8.1 that 1-random sets can always compute DNC functions. Terwijn [387] proved that this is also the case for sets of nonzero effective Hausdorff dimension. We can easily extract this result from our results on autocomplex sets and DNC functions: as mentioned following Theorem 13.3.4, if A has positive effective Hausdorff dimension, then A is complex, so by Theorem 8.16.4, A computes a DNC function. However, Terwijn's original proof is instructive and short, so we give it below.

Theorem 13.7.6 (Terwijn [387]). If dim(A) > 0 then A can compute a DNC function.

Proof. By Theorem 2.22.1, it is enough to show that A can compute a fixed-point free function. Let s be such that dim(A) > s > 0 and 1/s ∈ ℕ. For each e, let e_0 and e_1 be indices chosen in some canonical way so that W_e = W_{e_0} ⊕ W_{e_1}. For a string σ, let σ̄ be the string of length |σ| such that σ̄(n) = 1 − σ(n). Let

B_{e,n} = {σ ∈ 2^{n/s} : ∃t (W_{e_0,t} ↾ (n/s) = σ ∧ W_{e_1,t} ↾ (n/s) = σ̄)}.
If there is a σ such that W_{e_0} ↾ (n/s) = σ and W_{e_1} ↾ (n/s) = σ̄, then σ ∈ B_{e,n}, but no other string is in B_{e,n}. Otherwise, B_{e,n} is empty. In either case, |B_{e,n}| ≤ 1, so Σ_{σ∈B_{e,n}} 2^{−s|σ|} ≤ 2^{−s(n/s)} = 2^{−n}, and hence the B_{e,n} form a test for weak s-Martin-Löf randomness for each e. We can effectively find indices g(e) for each such test. (That is, W_{Φ_{g(e)}(n)} enumerates B_{e,n} for each e and n.) The usual construction of a universal test applied to tests for weak s-Martin-Löf randomness produces a universal test for weak s-Martin-Löf randomness {U_n}_{n∈ω} such that for each index i, for the ith test for weak s-Martin-Löf randomness {V_n}_{n∈ω}, we have V_{n+i+1} ⊆ U_n for all n. Since dim(A) > s, there is an n such that A ∉ U_n, so A ∉ ⟦B_{e,n+g(e)+1}⟧ for all e. To construct the fixed-point free function h ≤_T A, given e, let h(e) be an index so that W_{h(e)_0} ↾ f(n + g(e) + 1) = A ↾ f(n + g(e) + 1) and W_{h(e)_1} ↾ f(n + g(e) + 1) = Ā ↾ f(n + g(e) + 1). If W_{h(e)} = W_e, then
A ↾ f(n + g(e) + 1) ∈ B_{e,n+g(e)+1}, which is not the case, so W_{h(e)} ≠ W_e for all e.

Corollary 13.7.7 (Terwijn [387]). If A
13.8 A Turing degree of nonintegral Hausdorff dimension In this section, we present Miller’s proof of the following result, whose significance was discussed in the previous section.
Theorem 13.8.1 (Miller [273]). Let α ∈ (0, 1) be a left-c.e. real. There is a Δ^0_2 set A of effective Hausdorff dimension α such that if B ≤_T A then the effective Hausdorff dimension of B is at most α.

Proof. To simplify the presentation, we take α = 1/2. It should be clear how to modify the proof to work for any rational α, and not too difficult to see how to handle the general case.

We will build A as the intersection of a sequence of nested Π^0_1 classes of positive measure. We will begin with a class all of whose members have effective dimension at least 1/2, thus ensuring that dim(A) ≥ 1/2. At each stage we will satisfy a requirement of the form Ψ^A_e total ⇒ ∃k > n (K(Ψ^A_e ↾ k) ≤ (1/2 + 2^{−n})k). To do so, we will want to wait until a large set of oracles that appear to be in our current Π^0_1 class P compute the same string via Ψ_e, and then compress that string. A key to showing that we can do this will be to bound the effective dimension of the measure of P. Thus we will work with Π^0_1 classes P such that dim(μ(P)) ≤ 1/2.

We begin by introducing some concepts that will be needed in the proof.

Definition 13.8.2. Let S ⊆ 2^{<ω}. The direct weight of S is DW(S) = Σ_{σ∈S} 2^{−|σ|/2}. The weight of S is W(S) = inf{DW(V) : ⟦S⟧ ⊆ ⟦V⟧}.

Note that W(S) ≤ 1 because ⟦S⟧ ⊆ ⟦{λ}⟧ and DW({λ}) = 1. The weight of S is essentially the minimum cost of compressing some initial segment of each set in ⟦S⟧ by a factor of 2 (although programs cannot have fractional length, so the cost of compressing a string of length 5 by a factor of 2 is actually 2^{−2}, not 2^{−2.5}). Of course, there is no reason to think that this compression can be realized effectively. In other words, even if S is c.e., there may not be a prefix-free machine M compressing some initial segment of each set in ⟦S⟧ by a factor of 2 so that the measure of the domain of M is close to W(S).

Suppose that S is finite and let V be such that ⟦S⟧ ⊆ ⟦V⟧. It is suboptimal for V to contain any string incomparable with every σ ∈ S, or to contain a proper extension of some σ ∈ S. In other words, it is always possible to find a V′ with ⟦S⟧ ⊆ ⟦V′⟧ and DW(V′) ≤ DW(V), such that every string in V′ has an extension in S. Therefore, there are only finitely many V that need to be considered in computing the infimum in the definition of W(S). Hence this infimum is achieved, justifying the following definition when S is finite.

Definition 13.8.3. An optimal cover of S ⊆ 2^{<ω} is a set S^oc ⊆ 2^{<ω} such that ⟦S⟧ ⊆ ⟦S^oc⟧ and DW(S^oc) = W(S). For the sake of uniqueness, we also require ⟦S^oc⟧ to have the minimum measure among all possible contenders.

It is not hard to see that optimal covers, if they exist, are unique and prefix-free. The analysis above shows that when S is finite, we can compute both the optimal cover of S and W(S).
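Since only covers made of prefixes of members of S matter, the optimal cover of a finite S can be computed by a simple recursion on the binary tree: at each node, either pay the node's own direct weight or cover the two subtrees separately. The following Python sketch (our code, not from the text; floats suffice for illustration) computes S^oc and W(S) this way, breaking ties in favor of the deeper cover, which has the smaller measure.

```python
def optimal_cover(S):
    """Compute (S^oc, W(S)) for a finite set S of binary strings.

    Recursive reading of Definitions 13.8.2/13.8.3: to cover everything
    in [S] below a node tau, either pay DW({tau}) = 2^(-|tau|/2) for tau
    itself, or cover the two subtrees separately.
    """
    def cover(tau):
        below = [s for s in S if s.startswith(tau)]
        if not below:
            return set(), 0.0
        here = 2.0 ** (-len(tau) / 2)
        if tau in below:                 # tau itself is in S: cover it at tau
            return {tau}, here
        c0, w0 = cover(tau + "0")
        c1, w1 = cover(tau + "1")
        if w0 + w1 <= here:              # tie goes to the deeper cover
            return c0 | c1, w0 + w1
        return {tau}, here
    return cover("")
```

For instance, S = {0, 1} is covered more cheaply at the root, since DW({0, 1}) = 2^{1/2} > 1 = DW({λ}), illustrating why W(S) ≤ 1 always holds.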
Now consider an infinite S. Let {S_t}_{t∈ω} be an enumeration of S, i.e., a sequence of finite sets S_0 ⊂ S_1 ⊂ ··· such that S = ∪_t S_t. If σ ∈ S_t^oc, then the only way for σ not to be in S_{t+1}^oc is for some τ ≺ σ to be in S_{t+1}^oc. This fact has some nice consequences. First, it implies a nesting property: ⟦S_t^oc⟧ ⊆ ⟦S_{t+1}^oc⟧ for all t. Second, it proves that the S_t^oc have a pointwise limit V. It is not hard to see that V = S^oc, demonstrating that the definition above is valid for any S ⊆ 2^{<ω}. If S is c.e., the above construction applied to an effective enumeration shows that the optimal cover S^oc is Δ^0_2. More importantly, the nesting property implies that ⟦S^oc⟧ is a Σ^0_1 class.

There will not generally be a c.e. set V ⊆ 2^{<ω} such that ⟦S^oc⟧ = ⟦V⟧ and DW(V) = W(S), or even such that the direct weight of V is finite. However, we can find such a V for which the direct weight of any prefix-free subset is bounded by W(S). Here and below, we write τ2^{<ω} for {τσ : σ ∈ 2^{<ω}}.

Lemma 13.8.4. For any c.e. set S ⊆ 2^{<ω}, we can effectively find a c.e. V ⊆ 2^{<ω} such that ⟦V⟧ = ⟦S^oc⟧ and if P ⊆ V is prefix-free, then DW(P) ≤ W(S).

Proof. Let {S_t}_{t∈ω} be an effective enumeration of S. Let V = ∪_t S_t^oc. Then V is c.e. and ⟦V⟧ = ⟦S^oc⟧. If there is an infinite prefix-free P ⊆ V such that DW(P) > W(S), then there is a finite P′ ⊂ P with the same property. So assume that P ⊆ V is finite and prefix-free.

We will prove that if τ ∈ V, then DW(P ∩ τ2^{<ω}) ≤ DW({τ}), by induction on the distance k from τ to its longest extension in P (the claim being trivial if P ∩ τ2^{<ω} is empty). The case k = 0 is immediate. Now take τ ∈ V \ P. There is a unique t such that τ ∈ S_{t+1}^oc \ S_t^oc. Then DW(S_t^oc ∩ τ2^{<ω}) ≤ DW({τ}), as otherwise τ ∈ S_t^oc. The nesting property implies that ⟦S_t^oc ∩ τ2^{<ω}⟧ covers ⟦P ∩ τ2^{<ω}⟧, since every element of P ∩ τ2^{<ω} must have entered V by stage t. Hence, applying the inductive hypothesis to the elements of S_t^oc ∩ τ2^{<ω}, we have DW(P ∩ τ2^{<ω}) ≤ DW(S_t^oc ∩ τ2^{<ω}) ≤ DW({τ}). Of course, ⟦S^oc⟧ covers ⟦P⟧, so DW(P) ≤ DW(S^oc) = W(S).
We will build a set A by approximations. As usual, the type of approximation—or in terminology borrowed from set theory, the type of forcing condition—determines the nature of the requirements that we can satisfy during the construction. Our conditions will be pairs (σ, S) with σ ∈ 2^{<ω} and S ⊆ σ2^{<ω} c.e. and such that σ ∉ S^oc. A condition describes a restriction on the set A that is being constructed. Specifically, the class of all sets consistent with a condition (σ, S) is the Π^0_1 class P_{(σ,S)} = ⟦σ⟧ \ ⟦S^oc⟧. Our definition of condition guarantees that P_{(σ,S)} is nonempty. We say that a condition (τ, T) extends (σ, S) and write
(τ, T) ≤ (σ, S) if P_{(τ,T)} ⊆ P_{(σ,S)}.⁷ Extending a condition corresponds, of course, to further restricting the possibilities for A.

We now establish some basic lemmas about these forcing conditions. The first shows that there is an S such that every element of P_{(λ,S)} has effective dimension at least 1/2.

Lemma 13.8.5. Let S = {σ : K(σ) ≤ |σ|/2}. Then (λ, S) is a forcing condition.

Proof. All that needs to be shown is that λ ∉ S^oc. If λ ∈ S^oc, then DW(S) ≥ W(S) = 1. But

DW(S) = Σ_{σ∈S} 2^{−|σ|/2} ≤ Σ_{σ∈S} 2^{−K(σ)} ≤ Ω < 1.

The next two lemmas show that each Π^0_1 class corresponding to a condition has positive measure, and that the effective Hausdorff dimension of its measure is at most 1/2.

Lemma 13.8.6. Let S ⊆ σ2^{<ω}. If ⟦σ⟧ \ ⟦S^oc⟧ is nonempty, then it has positive measure.

Proof. Let n = |σ|. The fact that S ⊆ σ2^{<ω} implies that W(S) ≤ 2^{−n/2}. Because ⟦σ⟧ \ ⟦S^oc⟧ is nonempty, we know that σ ∉ S^oc. Hence τ ∈ S^oc implies that |τ| > n. By these observations,

μ(⟦S^oc⟧) = Σ_{τ∈S^oc} 2^{−|τ|} < Σ_{τ∈S^oc} 2^{−(|τ|+n)/2} = 2^{−n/2} Σ_{τ∈S^oc} 2^{−|τ|/2} = 2^{−n/2} DW(S^oc) = 2^{−n/2} W(S) ≤ 2^{−n}.
Therefore, μ(⟦σ⟧ \ ⟦S^oc⟧) > 0.

Lemma 13.8.7. Let (σ, S) be a condition. Then dim(μ(P_{(σ,S)})) ≤ 1/2.

Proof. We prove that dim(μ(⟦S^oc⟧)) ≤ 1/2, which is sufficient because μ(P_{(σ,S)}) = 2^{−|σ|} − μ(⟦S^oc⟧). We may assume, without loss of generality, that S^oc is infinite, since otherwise μ(⟦S^oc⟧) is rational. Let w = W(S). Let V ⊆ 2^{<ω} be the c.e. set guaranteed by Lemma 13.8.4. That is, ⟦V⟧ = ⟦S^oc⟧ and DW(P) ≤ w whenever P ⊆ V is prefix-free. Note that V must be infinite. Let {V_t}_{t∈ω} be an effective enumeration of V such that V_0 = ∅. Let s > 1/2. We produce an interval Solovay s-test T covering μ(⟦V⟧). (See Definition 13.5.8.) It consists of two parts, T_0 and T_1.

1. If τ ∈ V_{t+1} \ V_t, then put [μ(⟦V_{t+1}⟧), μ(⟦V_{t+1}⟧) + 2^{−|τ|}] into T_0.

⁷It might seem that the symbol ≤ goes in the wrong direction, but this usage is consistent with what is often done in set theory, and derives from the identification of a condition with the class of sets consistent with it.
2. If μ(⟦V_t ∩ 2^{>n}⟧) ≤ k2^{−n} and μ(⟦V_{t+1} ∩ 2^{>n}⟧) > k2^{−n} for some n and k, then put [μ(⟦V_{t+1}⟧), μ(⟦V_{t+1}⟧) + 2^{−n}] into T_1.

Let T = T_0 ∪ T_1. Notice that T does not actually depend on s. Using the fact that V ∩ 2^n is prefix-free, we have

Σ_{I∈T_0} |I|^s = Σ_{τ∈V} 2^{−s|τ|} = Σ_n 2^{−sn} |V ∩ 2^n| = Σ_n 2^{(1/2−s)n} 2^{−n/2} |V ∩ 2^n| = Σ_n 2^{(1/2−s)n} DW(V ∩ 2^n) ≤ Σ_n 2^{(1/2−s)n} w = w / (1 − 2^{1/2−s}) < ∞.

Now fix n and let k be the number of intervals of length 2^{−n} added to T_1. By construction, 2^{−n}k < μ(⟦V ∩ 2^{>n}⟧). Let P ⊆ V ∩ 2^{>n} be a prefix-free set such that ⟦P⟧ = ⟦V ∩ 2^{>n}⟧. Then by the same argument as in the previous lemma, μ(⟦P⟧) < 2^{−n/2} DW(P). Putting these facts together, we have 2^{−n}k < 2^{−n/2} DW(P) ≤ 2^{−n/2} w, so k < 2^{n/2} w. Thus

Σ_{I∈T_1} |I|^s < Σ_n 2^{n/2} w (2^{−n})^s = w Σ_n 2^{(1/2−s)n} < ∞,
which proves that T is an interval Solovay s-test.

Next we prove that T covers μ(⟦V⟧). Call τ ∈ V_{t+1} \ V_t timely if V_{t+1} ∩ 2^{≤|τ|} = V ∩ 2^{≤|τ|}, in other words, if only strings longer than τ enter V after τ. Because V is infinite, there are infinitely many timely τ ∈ V. Fix one. Let t + 1 be the stage at which τ enters V and let n = |τ|. We claim that there is an interval of length 2^{−n} in T that contains μ(⟦V⟧). Note that if u > t, then μ(⟦V⟧) − μ(⟦V_u⟧) ≤ μ(⟦V ∩ 2^{>n}⟧) − μ(⟦V_u ∩ 2^{>n}⟧). In response to τ entering V, we put the interval [μ(⟦V_{t+1}⟧), μ(⟦V_{t+1}⟧) + 2^{−n}] into T_0 ⊆ T at stage t + 1. Let I = [μ(⟦V_u⟧), μ(⟦V_u⟧) + 2^{−n}] be the last interval of length 2^{−n} added to T. If μ(⟦V⟧) ∉ I, then μ(⟦V⟧) > μ(⟦V_u⟧) + 2^{−n}. Since u > t, we have μ(⟦V ∩ 2^{>n}⟧) > μ(⟦V_u ∩ 2^{>n}⟧) + 2^{−n}, so another interval of length 2^{−n} is added to T_1 ⊆ T after stage u, which is a contradiction. Thus μ(⟦V⟧) ∈ I. We have proved that for any n that is the length of a timely element of V, there is an interval of length 2^{−n} in T that contains μ(⟦V⟧). Since there are infinitely many timely strings, μ(⟦V⟧) is covered by T. Since T is an interval Solovay s-test for every s > 1/2, it follows that dim(μ(⟦S^oc⟧)) = dim(μ(⟦V⟧)) ≤ 1/2.

Miller [273] noted that, by applying the method of the main part of this proof (presented after the next lemma) to the identity functional, we can show that if (σ, S) extends the condition from Lemma 13.8.5, then dim(μ(P_{(σ,S)})) ≥ 1/2. Hence, Lemma 13.8.7 is tight.

Our final lemma gives a simple hypothesis on a collection of conditions that guarantees that they have a common extension.
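The convergence fact used for both parts of the test is just the geometric series Σ_n 2^{(1/2−s)n} = 1/(1 − 2^{1/2−s}), valid whenever s > 1/2. A quick numerical sanity check (Python; the sample value of s is arbitrary):

```python
def t0_bound(w, s, terms=500):
    # w * sum_n 2^((1/2 - s) n): the bound on the s-weighted sum of the
    # interval lengths; converges exactly when s > 1/2
    return w * sum(2.0 ** ((0.5 - s) * n) for n in range(terms))
```

With w = 1 and s = 3/4, the partial sums approach the closed form 1/(1 − 2^{−1/4}).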
Lemma 13.8.8. Let (σ_0, S_0), ..., (σ_n, S_n) be conditions such that P_{(σ_0,S_0)} ∩ ··· ∩ P_{(σ_n,S_n)} has positive measure. Then there is a condition (τ, T) such that (τ, T) ≤ (σ_i, S_i) for each i ≤ n.

Proof. The σ_i are comparable by hypothesis, so let σ = σ_0 ∪ ··· ∪ σ_n. Let P = P_{(σ_0,S_0)} ∩ ··· ∩ P_{(σ_n,S_n)} = ⟦σ⟧ \ ⟦S_0^oc ∪ ··· ∪ S_n^oc⟧. In particular, P ⊆ ⟦σ⟧. Let b be such that μ(P) ≥ 2^{−b}. For each m ≥ b, let D_m = {τ ⪰ σ : |τ| = m and no prefix of τ is in S_i^oc for any i ≤ n}. Now μ(P) ≤ |D_m| 2^{−m}, because if τ ∈ 2^m is not in D_m, then ⟦τ⟧ is disjoint from P. Hence |D_m| ≥ 2^{m−b}.

Let τ ∈ D_m and let T_τ = τ2^{<ω} ∩ (S_0^oc ∪ ··· ∪ S_n^oc). If τ ∉ T_τ^oc, then (τ, T_τ) is the condition required by the lemma. On the other hand, τ ∈ T_τ^oc implies that DW(T_τ) ≥ W(T_τ) = 2^{−m/2}. So assuming that the lemma fails, we have

Σ_{i≤n} W(S_i) = Σ_{i≤n} DW(S_i^oc) ≥ DW(S_0^oc ∪ ··· ∪ S_n^oc) ≥ Σ_{τ∈D_m} DW(T_τ) ≥ Σ_{τ∈D_m} 2^{−m/2} ≥ 2^{m−b} 2^{−m/2} = 2^{m/2−b}.

For large enough m, we have a contradiction.

Note that ∅′ can find the common extension guaranteed by the lemma.

We are now ready to build A. We build a sequence of conditions (σ_0, S_0) ≥ (σ_1, S_1) ≥ (σ_2, S_2) ≥ ··· and take A = ∪_t σ_t, which will be total. Equivalently, A will be the unique element of ∩_t P_{(σ_t,S_t)}. The construction will be carried out with a ∅′ oracle, so that A ≤_T ∅′. We begin by letting (σ_0, S_0) be the condition from Lemma 13.8.5, which ensures that dim(A) ≥ 1/2. For each e and n, we will meet the requirement

R_{e,n}: Ψ^A_e total ⇒ ∃k > n (K(Ψ^A_e ↾ k) ≤ (1/2 + 2^{−n})k).
These requirements guarantee that if B ≤_T A, then dim(B) ≤ 1/2, and in particular, that dim(A) = 1/2.

Suppose we have defined (σ_t, S_t). We now show how to define (σ_{t+1}, S_{t+1}) to satisfy R_{e,n}, where t = ⟨e, n⟩. Let b be such that 2^{−b} < μ(P_{(σ_t,S_t)}). Note that b exists by Lemma 13.8.6, and can be found using ∅′.

We define a prefix-free machine M. The idea is that M will wait for a large set of oracles that appear to be in P_{(σ_t,S_t)} to compute the same sufficiently long initial segment via Ψ_e, and then will compress that initial segment. For each ρ, define M(ρ) as follows. First, wait until U(ρ)↓. If that ever happens, then let σ = U(ρ) and m = |σ|. If σ is not an initial segment of the binary expansion of μ(P_{(σ_t,S_t)}), then let M(ρ)↑. Otherwise, proceed as follows. For each τ, let T_τ = {ν ⪰ σ_t : τ ⪯ Ψ^ν_e}. Search for a τ ∈ 2^{m−b} such that μ(P_{(σ_t,S_t∪T_τ)}) < 0.σ. This condition is Σ^0_1, so ∅′ can find such a τ if
there is one. If there is such a τ, then let M(ρ) = τ. Note that the domain of M is a subset of the domain of U, and hence is prefix-free.

We can effectively find a c such that ∀τ (K(τ) ≤ K_M(τ) + c). Using ∅′, we can search for a σ that is an initial segment of the binary expansion of μ(P_{(σ_t,S_t)}) of length m > n + b, and such that K(σ) + c ≤ (1/2 + 2^{−n})(m − b). Such a σ must exist by Lemma 13.8.7. Let ρ be a minimal length U-program for σ. The construction now breaks into two cases, depending on whether or not M(ρ)↓ (which ∅′ can determine, of course).

Case 1. M(ρ)↓ = τ. In this case, we know that μ(P_{(σ_t,S_t∪T_τ)}) < 0.σ and μ(P_{(σ_t,S_t)}) ≥ 0.σ. Thus P_{(σ_t,S_t)} \ P_{(σ_t,S_t∪T_τ)} = ⟦(S_t ∪ T_τ)^oc⟧ \ ⟦S_t^oc⟧ is nonempty. So there is a σ_{t+1} ∈ T_τ such that no prefix of σ_{t+1} is in S_t^oc, since otherwise S_t^oc would be the optimal cover of S_t ∪ T_τ. Note that ∅′ can find such a σ_{t+1}. By definition, σ_{t+1} ⪰ σ_t. Because T_τ is closed under extensions, we may additionally require that σ_{t+1} properly extend σ_t. Let S_{t+1} = σ_{t+1}2^{<ω} ∩ S_t. Since no prefix of σ_{t+1} is in S_t^oc, we have S_{t+1}^oc = σ_{t+1}2^{<ω} ∩ S_t^oc, which implies that P_{(σ_{t+1},S_{t+1})} = ⟦σ_{t+1}⟧ ∩ P_{(σ_t,S_t)} ≠ ∅. Thus (σ_{t+1}, S_{t+1}) is a valid condition and P_{(σ_{t+1},S_{t+1})} ⊆ P_{(σ_t,S_t)}, so (σ_{t+1}, S_{t+1}) ≤ (σ_t, S_t).

To verify that R_{e,n} has been satisfied, take A ∈ P_{(σ_{t+1},S_{t+1})}. Since σ_{t+1} ≺ A and σ_{t+1} ∈ T_τ, we see that τ ≺ Ψ^A_e. Let k = |τ| = m − b, which is larger than n by our choice of σ. Then K(Ψ^A_e ↾ k) = K(τ) ≤ K_M(τ) + c ≤ |ρ| + c = K(σ) + c ≤ (1/2 + 2^{−n})(m − b) = (1/2 + 2^{−n})k.

Case 2. M(ρ)↑. In this case, μ(P_{(σ_t,S_t∪T_τ)}) ≥ 0.σ for each τ ∈ 2^{m−b}. Thus, for each such τ, we have that (σ_t, S_t ∪ T_τ) is a valid condition extending (σ_t, S_t). Furthermore, since P_{(σ_t,S_t∪T_τ)} ⊆ P_{(σ_t,S_t)} and μ(P_{(σ_t,S_t)}) ≤ 0.σ + 2^{−m}, we have μ(P_{(σ_t,S_t)} \ P_{(σ_t,S_t∪T_τ)}) ≤ 2^{−m}. So
μ(∩_{τ∈2^{m−b}} P_{(σ_t,S_t∪T_τ)}) = μ(P_{(σ_t,S_t)} \ ∪_{τ∈2^{m−b}} (P_{(σ_t,S_t)} \ P_{(σ_t,S_t∪T_τ)})) ≥ μ(P_{(σ_t,S_t)}) − Σ_{τ∈2^{m−b}} μ(P_{(σ_t,S_t)} \ P_{(σ_t,S_t∪T_τ)}) > 2^{−b} − 2^{m−b} 2^{−m} = 0.
Thus by Lemma 13.8.8, there is a condition (σ_{t+1}, S_{t+1}) that extends (σ_t, S_t ∪ T_τ) for every τ ∈ 2^{m−b}. Then (σ_{t+1}, S_{t+1}) ≤ (σ_t, S_t). Furthermore, ∅′ can find (σ_{t+1}, S_{t+1}), and we may assume without loss of generality that σ_{t+1} properly extends σ_t.

To verify that R_{e,n} is satisfied in this case as well, assume that Ψ^A_e is total and let τ = Ψ^A_e ↾ (m − b). Since σ_t ≺ A, some ν ≺ A is in T_τ. Therefore, A ∈ ⟦(S_t ∪ T_τ)^oc⟧ and hence A ∉ P_{(σ_t,S_t∪T_τ)} ⊇ P_{(σ_{t+1},S_{t+1})}. But A ∈ P_{(σ_{t+1},S_{t+1})}, so Ψ^A_e cannot be total, and R_{e,n} holds vacuously.

We have completed the construction. Let A = ∪_t σ_t, which is total because we ensured that σ_{t+1} properly extends σ_t for every t. The construction was done relative to ∅′, so A is Δ^0_2. The remainder of the verification was given above.
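Both cases of the construction hinge on reading off a length-m initial segment σ of the binary expansion of μ(P_{(σ_t,S_t)}), so that 0.σ ≤ μ(P_{(σ_t,S_t)}) < 0.σ + 2^{−m}. As a small sketch (ours, with an exact rational standing in for the measure):

```python
from fractions import Fraction

def initial_segment(mu, m):
    # first m bits sigma of the binary expansion of mu in [0, 1),
    # so that 0.sigma <= mu < 0.sigma + 2^-m
    bits = []
    for _ in range(m):
        mu *= 2
        bit = int(mu >= 1)
        bits.append(bit)
        mu -= bit
    return bits
```

For example, 5/8 = 0.101... and 1/3 = 0.0101... in binary.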
13.9 DNC functions and effective Hausdorff dimension

In this section we present work of Greenberg and Miller [170] connecting the growth rates of DNC functions with effective Hausdorff dimension. In particular, we describe their construction of a set of effective Hausdorff dimension 1 and minimal Turing degree. (As mentioned above, by Corollary 6.9.5, such a set cannot compute a 1-random set, and hence this result answers Question 13.7.4.) This theorem extends earlier work of Downey and Greenberg [106], who showed that there is a set of minimal Turing degree and effective packing dimension 1 (a concept defined below in Section 13.11.3), as described in Section 13.13.

Greenberg and Miller [170] generalized effective Hausdorff dimension to a class of spaces similar to Cantor space. Let h : ℕ → ℕ \ {0, 1} be computable. Let h^ω be the product space Π_n {0, 1, ..., h(n) − 1}. That is, the elements of h^ω are infinite sequences of natural numbers such that the nth element of the sequence is less than h(n). Let h^n = Π_{m<n} {0, 1, ..., h(m) − 1}, and let h^{<ω} = ∪_n h^n be the corresponding set of finite strings. For σ ∈ h^{<ω}, let

μ_h(⟦σ⟧) = 1/(h(0) ··· h(|σ| − 1)) = 1/|h^{|σ|}|.

Then μ_h can be extended to a measure on all of h^ω. When h is fixed, we write simply μ for μ_h. For a computable function h : ℕ → ℕ \ {0, 1}, let DNC_h be the set of all DNC functions in h^ω. Let DNC_n be the set of all DNC functions in n^ω.

Greenberg and Miller's proof uses work of Kumabe and Lewis [225], who built a DNC function of minimal degree. We will not include the fairly long proof of this result here. An analysis of the construction in [225], which can be found in [170], shows that the following holds.

Theorem 13.9.1 (Kumabe and Lewis [225]). For any computable order h : ℕ → ℕ \ {0, 1}, there is an f ∈ DNC_h whose Turing degree is minimal.

Thus, to show that there is a set of effective Hausdorff dimension 1 and minimal degree, it will be enough to show that there is a computable order h : ℕ → ℕ \ {0, 1} such that every f ∈ DNC_h computes a set of effective Hausdorff dimension 1. We do so in the rest of this section.
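The measure of a basic open set of h^ω is just the reciprocal of a finite product, which can be computed exactly; an illustrative helper (our code, not from the text):

```python
from fractions import Fraction

def mu_h(h, sigma):
    # mu_h([sigma]) = 1 / (h(0) h(1) ... h(|sigma|-1))
    m = Fraction(1)
    for n, digit in enumerate(sigma):
        assert 0 <= digit < h(n), "digit out of range for the h-space"
        m /= h(n)
    return m
```

For the constant function h(n) = 2 this is the usual uniform measure on 2^ω.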
13.9.1 Dimension in h-spaces

Greenberg and Miller [170] generalized many notions we have seen in this chapter to h-spaces. The following is the natural generalization of the notions of weak and strong s-Martin-Löf randomness to h^ω.

Definition 13.9.2. Let s ∈ [0, 1].

(i) A test for weak s-Martin-Löf randomness is a uniformly c.e. sequence {V_k}_{k∈ω} of subsets of h^{<ω} such that Σ_{σ∈V_k} μ(⟦σ⟧)^s ≤ 2^{−k} for all k.

(ii) A test for strong s-Martin-Löf randomness is a uniformly c.e. sequence {V_k}_{k∈ω} of subsets of h^{<ω} such that for every k and every prefix-free V′_k ⊆ V_k, we have Σ_{σ∈V′_k} μ(⟦σ⟧)^s ≤ 2^{−k}.

(iii) A ∈ h^ω is weakly s-Martin-Löf random if A ∉ ∩_k ⟦V_k⟧ for all tests for weak s-Martin-Löf randomness {V_k}_{k∈ω}.

(iv) A ∈ h^ω is strongly s-Martin-Löf random if A ∉ ∩_k ⟦V_k⟧ for all tests for strong s-Martin-Löf randomness {V_k}_{k∈ω}.

Many results proved in Section 13.5 still hold for h^ω, such as Theorem 13.5.13, which says that if t < s and A is weakly s-random, then A is strongly t-random. Thus, for A ∈ h^ω, the supremum of all s for which A is weakly s-random equals the supremum of all s for which A is strongly s-random, and, by analogy with the 2^ω case, can be called the effective (Hausdorff) dimension of A, denoted by dim^h(A). We can also generalize the notion of a Solovay s-test.

Definition 13.9.3. Let s ∈ [0, 1]. A Solovay s-test is a c.e. set S ⊆ h^{<ω} such that Σ_{σ∈S} μ(⟦σ⟧)^s < ∞. A set A is covered by this test if A ∈ ⟦σ⟧ for infinitely many σ ∈ S.

Theorem 13.5.7 still holds: if A is weakly s-random then A is not covered by any Solovay s-test, and if A is covered by some Solovay s-test then A is not weakly t-random for any t > s. We can also define martingales for h-spaces.

Definition 13.9.4. A supermartingale (for h) is a function d : h^{<ω} → ℝ^{≥0} such that for all σ ∈ h^{<ω},

Σ_{i<h(|σ|)} d(σi) ≤ h(|σ|) d(σ).⁸

If s ∈ [0, 1], we say that a supermartingale d s-succeeds on A ∈ h^ω if lim sup_n d(A ↾ n) μ(⟦A ↾ n⟧)^{1−s} = ∞.

⁸Note that this condition is equivalent to Σ_τ d(τ)μ(⟦τ⟧) ≤ d(σ)μ(⟦σ⟧), where the sum is taken over all immediate successors τ of σ.
As in the 2^ω case, A ∈ h^ω is strongly s-Martin-Löf random iff there is no c.e. supermartingale that s-succeeds on A. The proofs are as before, using in particular the fact that Kolmogorov's Inequality holds in h^ω: for a supermartingale d and a prefix-free set C ⊂ h^{<ω}, we have Σ_{σ∈C} d(σ)μ(⟦σ⟧) ≤ d(λ). The easiest way to see that this is the case is to think of d(σ)μ(⟦σ⟧) as inducing a semimeasure on h^ω. It is also worth noting that there is an optimal c.e. supermartingale for h, constructed as usual.

Our goal is to use h-spaces to construct an element of 2^ω of minimal degree with effective dimension 1. To do so, we will need to be able to translate between h^ω and 2^ω in a dimension-preserving way. If h is very well-behaved then there is no problem. For example, if X ∈ 4^ω and we let Y(2n) = ⌊X(n)/2⌋ and Y(2n + 1) = X(n) mod 2, then Y ≡_T X, and it is easy to check that dim(Y) = dim^h(X). To handle more complicated h-spaces, it is convenient to work through Euclidean space rather than going directly from h^ω to 2^ω.

There is a natural measure-preserving surjection of h^ω onto the Euclidean interval [0, 1]. First map strings to closed intervals: let π_h(λ) = [0, 1]. Once π_h(σ) = I is defined, divide I into h(|σ|) many intervals I_0, I_1, ..., I_{h(|σ|)−1} of equal length and let π_h(σk) = I_k for k < h(|σ|). Note that indeed μ_h(⟦σ⟧) = μ(π_h(σ)) for all σ. Now extend π_h continuously to h^ω by letting π_h(X) be the unique element of ∩_n π_h(X ↾ n). (The fact that h(n) ≥ 2 for all n ensures that this intersection is indeed a singleton.) The mapping π_h is not quite 1-1, but it is 1-1 if we ignore the (countably many) sequences that are eventually constant. Note that for all X ∈ h^ω, we have X ≡_T π_h(X).

The theory of effective dimension can also be developed in the space [0, 1]. We do not have martingales, but we can still, for example, define Solovay tests, as we saw in Definition 13.5.8.
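The finite stages of π_h can be computed exactly with rationals. This sketch (our code) returns the left endpoint and length of π_h(σ); the length equals μ_h(⟦σ⟧), which is the measure-preservation property noted above.

```python
from fractions import Fraction

def pi_h(h, sigma):
    # the closed interval pi_h(sigma), returned as (left endpoint, length);
    # at each level, split the current interval into h(n) equal pieces and
    # descend into the sigma[n]-th one
    left, length = Fraction(0), Fraction(1)
    for n, digit in enumerate(sigma):
        length /= h(n)
        left += digit * length
    return left, length
```

For instance, with h(n) = (n + 1)·2 the string (1, 2) is sent to the interval of length 1/(2·4) = 1/8 starting at 3/4.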
Let dim^{[0,1]}(X) be the infimum of all s such that there is an interval Solovay s-test that covers X (that is, such that X is in infinitely many elements of the test).

Proposition 13.9.5 (Greenberg and Miller [170]). For all h and X ∈ h^ω, we have dim^{[0,1]}(π_h(X)) ≤ dim^h(X).

Proof. Let s > dim^h(X) and let S be a Solovay s-test for h that covers X. Since π_h is measure-preserving, the image of S under π_h is an interval Solovay s-test covering π_h(X).

Equality of these dimensions does not hold in general, but we will see that we do get equality if h does not grow too quickly or irregularly. Suppose that S is an interval Solovay s-test covering some π_h(X). We would like to cover X by something like (π_h)^{−1}(S). The problem, of course, is that the basic sets (closed intervals with rational endpoints) in [0, 1] are finer than the basic sets in h^ω; not every closed interval is in the range of π_h. We thus need to refine S by replacing every I ∈ S by finitely many intervals in the range of π_h. While we can control the Lebesgue measure of such a
collection, if the exponent s is smaller than 1 then the process of replacing large intervals by a collection of smaller ones may increase the s-weighted sum of the lengths of the intervals significantly. We show that if h does not grow too irregularly and we increase the exponent s slightly, then this sum remains finite.

Let I_n = π_h(h^n) and let I = ∪_n I_n = π_h(h^{<ω}). Let γ_n = 1/|h^n|. Then I_n consists of |h^n| many closed intervals, each of length γ_n. For a closed interval I ⊆ [0, 1], let n_I be the unique n such that γ_n ≥ |I| > γ_{n+1}. Let k_I be the largest natural number such that |I| > k_I γ_{n_I+1}. Then there is an I′ ⊆ I_{n_I+1} of size k_I + 1 with I ⊆ ∪I′. For a set S of closed subintervals of [0, 1] with rational endpoints, let Ŝ = ∪_{I∈S} I′. Thus Ŝ ⊆ I, and if x = π_h(X) and x is covered by S, then X is covered by (π_h)^{−1}(Ŝ). Furthermore, if S is c.e. then so are Ŝ and (π_h)^{−1}(Ŝ), and Σ_{I∈Ŝ} |I|^t = Σ_{σ∈(π_h)^{−1}(Ŝ)} μ(⟦σ⟧)^t.

We express the regularity of h via the following condition for h and t > s ≥ 0:

Σ_n h(n)^{1−s} / (h(0) ··· h(n))^{t−s} < ∞.    (13.1)
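Condition (13.1) is easy to probe numerically; computing each term in log space avoids floating-point overflow for rapidly growing h (the parameters below are sample values, chosen only for illustration):

```python
import math

def term_13_1(h, s, t, n):
    # n-th term of (13.1): h(n)^(1-s) / (h(0)...h(n))^(t-s), via log2
    log2_prod = sum(math.log2(h(m)) for m in range(n + 1))
    return 2.0 ** ((1 - s) * math.log2(h(n)) - (t - s) * log2_prod)
```

For h(n) = 2^{n+1}, the quadratic growth of log(h(0)···h(n)) makes the terms vanish quickly, so the partial sums stabilize.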
Lemma 13.9.6 (Greenberg and Miller [170]). Suppose that (13.1) holds for t, s, and h, and that S is a set of closed intervals in [0, 1] such that Σ_{I∈S} |I|^s is finite. Then Σ_{I∈Ŝ} |I|^t is also finite.

Proof. Let I be any closed interval in [0, 1]. Let n = n_I and k = k_I. Since |I| > kγ_{n+1} and k < h(n),

Σ_{J∈I′} |J|^t = (k + 1)γ_{n+1}^t ≤ 2kγ_{n+1}^t = 2kγ_{n+1}^{t−s}γ_{n+1}^s < 2k^{1−s}γ_{n+1}^{t−s}|I|^s ≤ 2h(n)^{1−s}(h(0) ··· h(n))^{s−t}|I|^s.

Thus if we let S_n = {I ∈ S : n_I = n}, then

Σ_{I∈Ŝ} |I|^t = Σ_n Σ_{I∈S_n} Σ_{J∈I′} |J|^t ≤ Σ_n Σ_{I∈S_n} 2h(n)^{1−s}|I|^s / (h(0) ··· h(n))^{t−s} ≤ 2 Σ_{I∈S} |I|^s Σ_n h(n)^{1−s} / (h(0) ··· h(n))^{t−s},

which by assumption is finite.

Thus for all X ∈ h^ω, if (13.1) holds for t, s, and h, and dim^{[0,1]}(π_h(X)) < s, then dim^h(X) ≤ t. So if (13.1) holds for all t > s ≥ 0, then dim^{[0,1]}(π_h(X)) = dim^h(X) for all X ∈ h^ω.
For example, (13.1) holds for all t > s ≥ 0 for the constant function h(n) = 2. Thus, as discussed above, dimension in [0, 1] is the same as dimension in 2^ω. However, this condition holds for some unbounded functions h as well (for example, h(n) = 2^n). The following is a sufficient condition for (13.1) to hold for all t > s ≥ 0.

Lemma 13.9.7 (Greenberg and Miller [170]). If

lim_n log h(n) / Σ_{m≤n} log h(m) = 0,

then (13.1) holds for all t > s ≥ 0, and so dim^h(X) = dim^{[0,1]}(π_h(X)) for all X ∈ h^ω.

Proof. Let f(n) = log h(n). Let t > s ≥ 0 and let ε < (t−s)/(1−s). For sufficiently large n, we have f(n) < ε Σ_{m≤n} f(m). Let g(n) = h(0) ··· h(n) and let δ = ε(1−s) − (t−s) < 0. For sufficiently large n, we have h(n) < g(n)^ε, and hence h(n)^{1−s}/g(n)^{t−s} < g(n)^δ. Thus there is an N such that

Σ_{n≥N} h(n)^{1−s} / (h(0) ··· h(n))^{t−s} < Σ_n g(n)^δ = Σ_n 2^{δ log g(n)},
which is finite because 2^δ < 1 and log g(n) ≥ n (since h(n) ≥ 2).

The regularity condition of Lemma 13.9.7 is not, strictly speaking, a slowness condition, because, for example, h(n) = 2^{n²} satisfies this condition, yet there is a monotone function that is dominated by h but does not satisfy the condition. However, the condition does hold for all sufficiently slow monotone functions.

Lemma 13.9.8 (Greenberg and Miller [170]). If h is nondecreasing and dominated by 2^{kn} for some k, then

lim_n log h(n) / Σ_{m≤n} log h(m) = 0,

and so dim^h(X) = dim^{[0,1]}(π_h(X)) for all X ∈ h^ω.

Proof. If h is bounded then it is eventually constant, and the condition is easily verified, so assume that h is unbounded. Fix c > 0. There is an N_c such that log h(n) > c for all n ≥ N_c. For n > N_c,

log h(n) / Σ_{m≤n} log h(m) ≤ kn / Σ_{N_c≤m≤n} log h(m) ≤ kn / (c(n − N_c)).

Since lim_n kn/(c(n − N_c)) = k/c, we have

lim sup_n log h(n) / Σ_{m≤n} log h(m) ≤ k/c.

Since c is arbitrary, this limit is 0.
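The limit in Lemmas 13.9.7 and 13.9.8 can be checked directly for concrete functions; e.g. for h(n) = 2^{n+1} the ratio is exactly (n+1)/(1 + 2 + ··· + (n+1)) = 2/(n+2), which tends to 0. An illustrative helper (our names):

```python
def regularity_ratio(log2h, n):
    # log h(n) / sum_{m <= n} log h(m): the quantity whose limit is tested
    # in Lemmas 13.9.7 and 13.9.8 (log2h(m) = log_2 h(m))
    return log2h(n) / sum(log2h(m) for m in range(n + 1))
```

Usage: `regularity_ratio(lambda n: n + 1, 98)` evaluates the ratio for h(n) = 2^{n+1} at n = 98, giving 2/100.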
Finally, we get the result we will need to translate between h^ω and 2^ω.

Corollary 13.9.9 (Greenberg and Miller [170]). If h is nondecreasing and dominated by 2^{kn} for some k, then every X ∈ h^ω computes a Y ∈ 2^ω such that dim^h(X) = dim(Y).
13.9.2 Slow-growing DNC functions and sets of high effective dimension

For natural numbers a > b > 0, let Q^b_a be the collection of functions f such that for all n,

1. f(n) ⊆ {0, ..., a − 1},

2. |f(n)| = b, and

3. if Φ_n(n)↓ then Φ_n(n) ∉ f(n).

By standard coding, Q^b_a can be seen as a computably bounded Π^0_1 subclass of ω^ω. Note that Q^1_a is essentially the same as DNC_a. To connect these sets for different values of a and b, we will use the concept of strong reducibility of mass problems from Definition 8.9.1. Recall that P ⊆ ω^ω is strongly reducible to R ⊆ ω^ω if there is a Turing functional Ψ such that Ψ^f ∈ P for all f ∈ R. If R is a Π^0_1 class, then we may assume that the Ψ in the above definition is total, and hence the reduction is a truth table reduction, since we can always define Ψ^X(n) = 0 if there is an s such that X ∉ R[s] and Ψ^X(n)[s]↑.

Lemma 13.9.10 (Greenberg and Miller [170]). If a > b > 0 then Q^{b+1}_{a+1} ≤_s Q^b_a, uniformly in a and b. (Uniformity here means that an index for the reduction functional Ψ can be obtained effectively from a and b.)

Proof. From n and y < a we can compute an m_{n,y} such that

1. for all x < a such that x ≠ y, we have Φ_{m_{n,y}}(m_{n,y})↓ = x iff Φ_n(n)↓ = x; and

2. Φ_{m_{n,y}}(m_{n,y})↓ = y iff either Φ_n(n)↓ = y or Φ_n(n)↓ = a.

Let f ∈ Q^b_a. Let f̃(n) = ∪_{y<a} f(m_{n,y}), and let Ψ^f(n) be the least b + 1 elements of f̃(n) if |f̃(n)| > b, and Ψ^f(n) = f̃(n) ∪ {a} if |f̃(n)| = b. (These cases are exhaustive, since f̃(n) ⊇ f(m_{n,0}) has at least b elements.) In either case |Ψ^f(n)| = b + 1 and Ψ^f(n) ⊆ {0, ..., a}. If Φ_n(n)↓ = x < a, then x ∉ f(m_{n,y}) for every y (by condition 1 for y ≠ x and condition 2 for y = x), so x ∉ f̃(n) and hence x ∉ Ψ^f(n). If Φ_n(n)↓ = a, then y ∉ f(m_{n,y}) for every y < a; if we also had |f̃(n)| = b, then every f(m_{n,y}) would be the same b-element set F with y ∉ F for all y < a, forcing F = ∅, which is impossible. So in that case |f̃(n)| > b and a ∉ Ψ^f(n). Thus Ψ^f ∈ Q^{b+1}_{a+1}.
Corollary 13.9.11 (Greenberg and Miller [170]). If a ≥ 2 then Q^{b+1}_{a+b} ≤_s DNC_a, uniformly in a and b.

For a ≥ 2 and c > 0, let P^c_a be the collection of functions f ∈ a^ω such that for all n and all x < c, if Φ_{⟨n,x⟩}(⟨n,x⟩)↓ then Φ_{⟨n,x⟩}(⟨n,x⟩) ≠ f(n). Note that P^1_a ≡_s DNC_a.

Lemma 13.9.12 (Greenberg and Miller [170]). For all a > b > 0 and c > 0, if c(a − b) < a then P^c_a ≤_s Q^b_a, uniformly in a, b, and c.

Proof. Fix f ∈ Q^b_a and n. For all x < c, if Φ_{⟨n,x⟩}(⟨n,x⟩)↓ then

Φ_{⟨n,x⟩}(⟨n,x⟩) ∈ [0, a) \ ∩_{y<c} f(⟨n, y⟩).

The set on the right has size at most c(a − b), so if c(a − b) < a then we can choose some v < a not in that set and define Ψ^f(n) = v.

Corollary 13.9.13 (Greenberg and Miller [170]). If a ≥ 2 and c > 0 then P^c_{ca} ≤_s DNC_a, uniformly in a and c.

Proof. Let b = c(a − 1) − a + 1. Then Q^{b+1}_{a+b} ≤_s DNC_a and c((a + b) − (b + 1)) = c(a − 1) < a + b, so P^c_{c(a−1)+1} = P^c_{a+b} ≤_s Q^{b+1}_{a+b}. Since c > 0, we have c(a − 1) + 1 ≤ ca, so P^c_{c(a−1)+1} ⊆ P^c_{ca}, and hence P^c_{ca} ≤_s P^c_{c(a−1)+1}. All these reductions are uniform.

Greenberg and Miller [170] used the classes P^c_{ca} to construct sequences of positive effective dimension.
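The counting at the heart of Lemma 13.9.12 is elementary: c sets of size b inside {0, ..., a − 1} have complements covering at most c(a − b) points, so when c(a − b) < a their intersection is nonempty. A sketch of the choice of the common value (the sample data is hypothetical):

```python
def common_value(blocks, a):
    # blocks: the sets f(<n,0>), ..., f(<n,c-1>), each a b-element
    # subset of {0, ..., a-1}; requires c(a - b) < a, which guarantees
    # a nonempty intersection
    candidates = set(range(a))
    for B in blocks:
        candidates &= set(B)
    return min(candidates)  # a safe value for Psi^f(n)
```

With a = 5, b = 4, c = 3 we have c(a − b) = 3 < 5, so any three 4-element subsets of {0, ..., 4} share a point.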
Theorem 13.9.14 (Greenberg and Miller [170]). Let a ≥ 2 and ε > 0. Every f ∈ DNC_a computes a set of effective Hausdorff dimension greater than 1 − ε, via a reduction that is uniform in a and ε.⁹

Proof. Fix c > 1. We work in the space (ca)^ω. Let d be the universal c.e. supermartingale for this space. By scaling we may assume that d(λ) < 1. For σ ∈ (ca)^{<ω}, let S_σ be the set of k < ca such that d(σk) ≥ a^{|σ|+1}. Note that these sets are uniformly c.e. From σ, we can compute an m_σ such that for each x < c, we have Φ_{⟨m_σ,x⟩}(⟨m_σ,x⟩)↓ = k if k is the xth element to be enumerated into S_σ, and Φ_{⟨m_σ,x⟩}(⟨m_σ,x⟩)↑ if |S_σ| ≤ x. The idea here is that if d(σ) ≤ a^{|σ|} then |S_σ| ≤ c, by the supermartingale condition, so all elements of S_σ are “captured” by the map e ↦ Φ_e(e). We can thus use a function g ∈ P^c_{ca} to avoid all such extensions: given such g, inductively define X ∈ (ca)^ω by letting the (n + 1)st digit of X be g(m_{X↾n}). Then, by induction, d(X ↾ n) ≤ a^n for all n.

⁹That each f ∈ DNC_a computes such a set is of course not a new fact, since the Turing degree of a bounded DNC function is PA and so computes a 1-random set. The extra information is the uniformity.
Now let s ≥ 0 and suppose that d s-succeeds on X, that is, that d(X ↾ n)μ_{ca}(⟦X ↾ n⟧)^{1−s} is unbounded. Since μ_{ca}(⟦X ↾ n⟧) = (ca)^{−n},

d(X ↾ n)μ_{ca}(⟦X ↾ n⟧)^{1−s} ≤ a^n (ca)^{−n(1−s)} = (c^{s−1}a^s)^n.

Thus, if d s-succeeds on X then c^{s−1}a^s > 1, so a^s > c^{1−s}. Taking logarithms to base a on both sides, we have s > (1 − s) log_a c, so s > 1 − 1/(1 + log_a c). Thus dim^{ca}(X) ≥ 1 − 1/(1 + log_a c). Since lim_c log_a c = ∞, given ε we can find some c such that dim^{ca}(X) > 1 − ε. By Corollary 13.9.9, X computes a Y ∈ 2^ω such that dim(Y) > 1 − ε.

Theorem 13.9.15 (Greenberg and Miller [170]). Let a ≥ 2. Every f ∈ DNC_a computes a set of effective Hausdorff dimension 1, via a reduction that is uniform in a.

Proof. We combine the constructions of sets of effective dimensions closer and closer to 1 into one construction. Let h(n) = (n + 1)a. Let d be the universal c.e. supermartingale for h^ω. Given f ∈ DNC_a, obtain g_n ∈ P^n_{na} for all n > 0 uniformly from f. For σ ∈ h^n, let S_σ be the set of k < h(n) such that d(σk) ≥ a^{|σ|+1}, and compute an m_σ such that for each x ≤ n, we have Φ_{⟨m_σ,x⟩}(⟨m_σ,x⟩)↓ = k if k is the xth element to be enumerated into S_σ, and Φ_{⟨m_σ,x⟩}(⟨m_σ,x⟩)↑ if |S_σ| ≤ x. As in the previous proof, we can define X(n) = g_{n+1}(m_{X↾n}) and inductively prove that d(X ↾ n) ≤ a^n for all n.

Now, μ_h(⟦X ↾ n⟧) = a^{−n}/n!, so for s ≥ 0,

d(X ↾ n)μ_h(⟦X ↾ n⟧)^{1−s} ≤ a^{sn}/(n!)^{1−s}.

Let s < 1. Let k be such that k^{s−1}a^{2s−1} < 1. For almost all n we have n! > (ka)^n, so for almost all n we have

a^{sn}/(n!)^{1−s} < a^{sn}/(ka)^{n(1−s)} = (k^{s−1}a^{2s−1})^n < 1,

and hence d cannot s-succeed on X. Thus dim^h(X) = 1, so by Corollary 13.9.9, X computes a Y ∈ 2^ω of effective dimension 1.

Finally, we can paste together these constructions for all a ≥ 2 to get the desired result.

Theorem 13.9.16 (Greenberg and Miller [170]). There is a computable order h̃ : ℕ → ℕ \ {0, 1} such that every f ∈ DNC_{h̃} computes a set of effective Hausdorff dimension 1.

Proof. Let h(n) = (n + 1)2^n, and let d be the universal c.e. supermartingale for h. For σ ∈ h^n, let S_σ be the set of k < h(n) such that d(σk) ≥ (n + 1)!, and compute an m_σ such that for each x < 2^n, we have Φ_{⟨m_σ,x⟩}(⟨m_σ,x⟩)↓ = k if k is the xth element to be enumerated into S_σ, and Φ_{⟨m_σ,x⟩}(⟨m_σ,x⟩)↑ if |S_σ| ≤ x.
626
13. Algorithmic Dimension
For n > 0, a member of P^{2^n}_{h(n)} can be computed uniformly from any f ∈ DNC_{n+1}, so there is an effective list of truth table functionals Ψ_n such that Ψ_n^f ∈ P^{2^n}_{h(n)} for all f ∈ DNC_{n+1}. Let ψ_n be a computable bound on the use function of Ψ_n. Let

m*_n = 1 + sup{⟨m_σ, x⟩ : σ ∈ h^n ∧ x < 2^n}

and let u_n = ψ_n(m*_n). Let u_0 = 0. For all n > 0, if ρ is a sequence of length u_n that is a DNC_{n+1}-string (that is, ρ ∈ (n + 1)^{u_n} and for all y < u_n such that Φ_y(y) ↓, we have Φ_y(y) ≠ ρ(y); or equivalently, ρ is an initial segment of a sequence in DNC_{n+1}), then Ψ_n(ρ) is a P^{2^n}_{h(n)}-string (an initial segment of a sequence in P^{2^n}_{h(n)}) of length at least m*_n. By increasing ψ_n we may assume that u_n < u_{n+1} for all n > 0. Let h̃(k) = n + 1 for k ∈ [u_{n−1}, u_n). If f ∈ DNC_h̃ then f↾u_n is a DNC_{n+1}-string for all n > 0, so combining the reductions Ψ_n, we can define a g ≤_T f such that for all n and all σ ∈ h^n,

1. g(m_σ) < h(|σ|) and

2. for all x < 2^n, if Φ_{⟨m_σ,x⟩}(⟨m_σ,x⟩) ↓ then g(m_σ) ≠ Φ_{⟨m_σ,x⟩}(⟨m_σ,x⟩).

We can now use g to define X ∈ h^ω as in the last two constructions, by letting X(n) = g(m_{X↾n}). By induction on n, we can show that d(X↾n) ≤ n!. As before, we can do so because if σ ∈ h^n and d(σ) ≤ n!, then there are at most 2^n many immediate successors τ of σ such that d(τ) ≥ (n + 1)!, and so they are all captured by the function e ↦ Φ_e(e) and avoided by g.

Finally, we show that dim^h(X) = 1, which by Corollary 13.9.9 implies that X computes a Y ∈ 2^ω of effective dimension 1. Let s < 1. For any σ ∈ h^n,

μ_h(σ) = 1/(2^0 2^1 ⋯ 2^{n−1} n!) = 1/(2^{n(n−1)/2} n!).

Thus for all n,

d(X↾n)μ_h(X↾n)^{1−s} ≤ n!/(2^{(1−s)n(n−1)/2}(n!)^{1−s}) = (n!)^s/2^{(1−s)n(n−1)/2},

which is bounded (and indeed tends to 0, since (n!)^s = 2^{O(n log n)} while the denominator is 2^{Ω(n²)}). Thus d does not s-succeed on X. Thus dim^h(X) = 1. □

Thus, by Theorem 13.9.1, we have the following.

Theorem 13.9.17 (Greenberg and Miller [170]). There is a minimal degree of effective Hausdorff dimension 1.
13.10 C-independence and Zimand's Theorem

By Miller's Theorem 13.8.1, we cannot always extract a sequence of arbitrarily high effective Hausdorff dimension from one of nonzero effective Hausdorff dimension. In this section we prove Zimand's theorem that such extraction is possible from two sufficiently independent sources of positive effective Hausdorff dimension.

We begin by briefly discussing the notion of independence of sources. This notion dates back at least to Chaitin [62], who suggested that objects x and y should be independent in terms of their information if I(x) − I(x | y) and I(y) − I(y | x) are small, where I is some kind of information content measure in the wide sense, such as C or K. For 1-random sets A and B, we could regard A and B as being independent iff they are mutually 1-random. But what about nonrandom sets? For example, suppose that A and B have positive effective Hausdorff dimension but are not 1-random. What would it mean to say that A and B are independent? The theory here is largely undeveloped, but the following notions of independence are useful for our purposes. In this section, we will regard log factors as being small. We will say a little more about this convention below.¹⁰

Definition 13.10.1 (Zimand [424], Calude and Zimand [53]).

(i) X and Y are C-independent¹¹ if C((X↾m)(Y↾n)) ≥ C(X↾m) + C(Y↾n) − O(log m + log n).

(ii) X and Y are globally independent if C^X(Y↾n) ≥ C(Y↾n) − O(log n) and C^Y(X↾n) ≥ C(X↾n) − O(log n).

Since C and K are within an O(log) factor of each other, both definitions remain the same if we replace C by K. Note also that C((X↾m)(Y↾n)) and C((Y↾n)(X↾m)) are the same up to an O(log m + log n) factor, so the first definition does not depend on the order of X and Y.

As pointed out by Calude and Zimand [53], other notions of smallness are possible. For example, in (i) we could have O(C(m) + C(n)) in place of O(log m + log n), in which case the definition might be different for C and

10 In general, working up to O(log n) instead of O(1) smooths out many issues, eliding the difference between various versions of Kolmogorov complexity, for example. The price one pays is losing the many important phenomena hiding in those log factors, such as K-triviality, the unrelativized complexity characterizations of 2-randomness, and so on.

11 Calude and Zimand [53] called this notion finitarily independent. We choose what we feel is a more descriptive name.
for K. These modified concepts remain to be explored. We will stick to log factors, as they give enough independence for the results of this section.

Clearly, if X and Y are mutually 1-random, then they are globally independent, so global independence can be seen as an extension of the notion of mutual randomness. If X is K-trivial, then X and Y are globally independent for all Y, since K(X↾n) ≤ O(log n) and X is low for K, so K^X(Y↾n) = K(Y↾n) ± O(1). Using Lemma 13.10.2 (ii) below, this observation can be extended to show that if C(X↾m) ≤ O(log m), as is the case for an n-c.e. set, for instance, then X and Y are globally independent for all Y.

The following result is an easy application of symmetry of information (see Section 3.3).

Lemma 13.10.2 (Zimand [424], Calude and Zimand [53]). The following are equivalent.

(i) X and Y are C-independent.

(ii) C(X↾m | Y↾n) ≥ C(X↾m) − O(log m + log n).

(iii) C((X↾n)(Y↾n)) ≥ C(X↾n) + C(Y↾n) − O(log n).

(iv) C(X↾n | Y↾n) ≥ C(X↾n) − O(log n).

The same holds for K in place of C.

Theorem 13.10.3 (Calude and Zimand [53]). If X and Y are globally independent, then X and Y are C-independent.

Proof. If we have Y as an oracle, we can describe X↾n by giving n and a description of X↾n from Y↾n, so C^Y(X↾n) ≤ C(X↾n | Y↾n) + O(log n). So if X and Y are globally independent, then C(X↾n | Y↾n) ≥ C^Y(X↾n) − O(log n) ≥ C(X↾n) − O(log n), and hence item (iv) of Lemma 13.10.2 holds. □

Stephan (see [53]) showed that the converse fails: there are C-independent X and Y that are not globally independent. He also showed the following.

Theorem 13.10.4 (Stephan, see [53]). If X and Y are left-c.e. reals of positive effective Hausdorff dimension, then X and Y are not C-independent.

Proof. Without loss of generality, there are infinitely many n such that, for the least s for which Y_s↾n = Y↾n, we also have X_s↾n = X↾n. For all such n, we have C(X↾n | Y↾n) = O(1). But then, since dim(X) > 0, item (iv) of Lemma 13.10.2 cannot hold. □
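Kolmogorov complexity is incomputable, but one can get a feel for Definition 13.10.1 by replacing C with the length of a lossless compression. The sketch below (our own illustration; the function names, parameters, and slack constants are invented, and zlib is only a crude stand-in for C, not the definition) contrasts an "independent" pair, whose joint description length is close to the sum, with a maximally dependent pair, whose joint description collapses.

```python
import random
import zlib

def clen(b):
    """Length of a zlib compression of b: a rough, computable stand-in
    for the (incomputable) Kolmogorov complexity C."""
    return len(zlib.compress(b, 9))

random.seed(0)
x = bytes(random.getrandbits(8) for _ in range(4000))
y = bytes(random.getrandbits(8) for _ in range(4000))  # generated independently of x

# Independent sources: joint "complexity" is close to the sum of the parts.
assert clen(x + y) >= clen(x) + clen(y) - 100

# Maximally dependent sources (y = x): the joint complexity collapses, so the
# pair is as far from C-independent as possible.
assert clen(x + x) <= clen(x) + 100
```

The slack of 100 bytes plays the role of the O(log m + log n) term: it absorbs compressor overhead, just as the definition absorbs logarithmic terms.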
Theorem 13.10.5 (Calude and Zimand [53]). If Y is 1-random relative to X, then X and Y are C-independent.
Proof. By the same argument as in the proof of Theorem 13.10.3, K^X(Y↾n) ≤ K(Y↾n | X↾n) + 2 log n + O(1). If X and Y are not C-independent then there are infinitely many n with K(Y↾n | X↾n) < K(Y↾n) − 5 log n ≤ n − 3 log n. For such n, we have K^X(Y↾n) ≤ n − log n + O(1), contradicting the fact that Y is 1-random relative to X. □

We now turn to the task of amplifying the complexity of two C-independent sources.

Theorem 13.10.6 (Zimand [424]). For any rational q > 0, there is a truth table reduction Ψ such that if X and Y are C-independent and have effective Hausdorff dimension greater than q, then dim(Ψ^{X⊕Y}) = 1. Moreover, an index for Ψ can be determined computably from q.

The idea of the proof will be to chop X and Y into suitable bits and reassemble them. A technical combinatorial lemma (Lemma 13.10.14) will be at the heart of the proof. It will use a well-known tool from probability theory, Chernoff bounds, which can be found in any standard probability textbook, such as Feller [144].

We give a brief introduction to the main ideas of the proof, following Zimand [424]. Suppose that we have two independent strings σ and τ of length n with C(σ) and C(τ) both equal to qn for a positive rational q. (The independence here means that C(στ) is approximately C(σ) + C(τ) = 2qn.) Suppose further that we can construct a function E : 2^n × 2^n → 2^m for some suitable m, such that E is regular, in the sense that for each sufficiently large rectangle B_1 × B_2 ⊆ 2^n × 2^n, the function E maps about the same number of pairs in B_1 × B_2 to each string in 2^m. Then for a sufficiently large B × B ⊆ 2^n × 2^n, any τ ∈ 2^m has about |B × B|/2^m many preimages, and any A ⊆ 2^m has about (|B × B|/2^m)|A| many preimages.

We claim that if ρ = E(σ, τ) then the C-complexity of ρ must be large. Assume for a contradiction that C(ρ) < (1 − ε)m for a fairly large ε. We have the following facts.

(i) The set B = {σ ∈ 2^n : C(σ) = qn} has size approximately 2^{qn}.

(ii) The set A = {τ ∈ 2^m : C(τ) < (1 − ε)m} has size less than 2^{(1−ε)m}.

(iii) (σ, τ) ∈ E^{−1}(A) ∩ (B × B).

Thus |E^{−1}(A) ∩ (B × B)| ≤ (2^{qn})²/2^{εm}, approximately. If we effectively enumerate E^{−1}(A) ∩ (B × B), then each pair of strings in that set can be described by its position in that enumeration, so C(σ, τ) ≤ 2qn − εm, approximately, which violates the independence hypothesis on σ and τ.

Now the above argument is not quite true, as a function E with the necessary properties may not exist, but in Lemma 13.10.14 we will construct a function that makes the above argument "true enough" to prove the theorem.
Proof of Theorem 13.10.6. Note that, by hypothesis, C(X↾n) > qn and C(Y↾n) > qn for all sufficiently large n.

Lemma 13.10.7. Let r be a rational such that 0 < r < q. For any sufficiently large n_0, we can compute an n_1 > n_0 such that C(X↾(n_0, n_1) | X↾n_0) > r(n_1 − n_0).

Proof. Let 0 < r′ < q − r and let n_1 = ((1 − r)/r′) n_0. Suppose that C(X↾(n_0, n_1) | X↾n_0) ≤ r(n_1 − n_0). The string X↾n_1 can be described by giving a description of X↾n_0 and a description of X↾(n_0, n_1) given X↾n_0. Thus, for sufficiently large n_0, we have

C(X↾n_1) ≤ n_0 + O(log n_0) + r(n_1 − n_0) ≤ rn_1 + (1 − r)n_0 + O(log n_0) = rn_1 + r′n_1 + O(log n_0) < qn_1,

contradicting the hypothesis that C(X↾n_1) > qn_1. □

We use Lemma 13.10.7 to split up X and Y into blocks X_1X_2 … and Y_1Y_2 …, respectively. Each X_i is chosen based on the conditional complexity given the previous blocks, and similarly for the Y_i. Let a be a number such that Lemma 13.10.7 holds for all n_0 ≥ a. Let b be the constant (1 − r)/r′ in the proof of Lemma 13.10.7. Let t_0 = 0, let t_1 = a, and let t_{i+1} = b(t_0 + ⋯ + t_i) for i > 0. For i ≥ 1, let X_i = X↾[t_{i−1}, t_i) and Y_i = Y↾[t_{i−1}, t_i). Let X̂_i = X_1 … X_i and Ŷ_i = Y_1 … Y_i. Note that t_i = ab(1 + b)^{i−2} for i ≥ 2 and |X_i| = |Y_i| = ab²(1 + b)^{i−3} for i ≥ 3. The following lemma follows immediately from the definitions and Lemma 13.10.7.

Lemma 13.10.8. The following hold for all i.

(i) C(X_i | X̂_{i−1}) > r|X_i|.

(ii) log |X_i| ∼ i and log |X̂_i| ∼ i.

The same holds for the Y_i.

The following inequalities follow easily by symmetry of information.

Lemma 13.10.9. For all σ and τ,

(i) C(στ) ≤ C(σ) + C(τ) + O(log C(σ) + log C(τ)),

(ii) C(τσ) ≥ C(σ) + C(τ | σ) − O(log C(σ) + log C(τ)), and

(iii) |C(σ | τ) − (C(στ) − C(τ))| ≤ O(log |σ| + log |τ|).

Now we assemble some facts about the X_i and Y_i.

Lemma 13.10.10. For all i and j,

(i) |C(Ŷ_iX̂_j) − (C(Ŷ_i) + C(X̂_j))| ≤ O(i + j),
iY i ) + C(Y j ))| O(i + j), j ) − (C(X (ii) |C(X i−1 Yj ) − C(Xi | X i−1 )| O(i + j), and (iii) |C(Xi | X jY i−1 ) − C(Yi | Yi−1 ))| O(i + j). (iv) |C(Yi | X Proof. By Lemmas 13.10.9 and 13.10.8, i ) + C(X j ) + O(log C(Y i ) + log C(X j )) j ) C(Y i X C(Y i ) + C(X j ) + O(i + j) C(Y and i ) + C(X j ) − O(log C(Y i ) + log C(X j )) j ) C(Y i X C(Y i ) + C(X j ) − O(i + j). C(Y Putting these two inequalities together gives (i), and (ii) is similar. Now for (iii), by Lemma 13.10.9 (iii), i−1 Y i−1 Yj )))| O(i + j). j ) − (C(Xi X i−1 Yj ) − (C(X |C(Xi | X i−1 Xi Yj ) ± O(i), and hence i−1 Yj ) = C(X Now, C(Xi X i−1 Y i Yj ) − (C(X i−1 Yj )))| O(i + j). j ) − (C(X |C(Xi | X
(13.2)
i Yj ) − (C(X i ) + C(Yj ))| O(i + j) By Lemma 13.10.10 (i) and (ii), |C(X and |C(Xi−1 Yj ) − (C(Xi−1 ) + C(Yj ))| O(i + j). Combining these facts with (13.2), we have i−1 Y i ) − C(X i−1 ))| O(i + j). j ) − (C(X |C(Xi | X
(13.3)
i−1 )−(C(Xi X i−1 ))| O(i+j). i−1 )−C(X By Lemma 13.10.9 (iii), |C(Xi | X Thus, since |C(Xi Xi−1 ) − C(Xi−1 Xi )| O(1), we have i−1 ) − (C(X i ) − C(X i−1 ))| O(i + j), |C(Xi | X and the result now follows by combining this inequality with (13.3). The argument for (iv) is analogous. The final preparatory lemma is the following. Lemma 13.10.11. For all i, i−1 Y i−1 Y i−1 Y i−1 ) C(Xi | X i−1 ) + C(Yi | X i−1 ) − O(i). C(Xi Yi | X Proof. Since C(Xi ) |Xi | + O(1) and similarly for Yi , if we apply Lemma 13.10.9 (ii) in conditional form, we get i−1 Y i−1 Yi−1 ) + C(Yi | Xi X i−1 ) C(Xi | X i−1 Yi−1 ) − O(i). C(Xi Yi | X iY i−1 Y i−1 ) = C(Yi | X i−1 ) ± O(1), we have Since C(Yi | Xi X i−1 Y i−1 Y iY i−1 ) C(Xi | X i−1 ) + C(Yi | X i−1 ) − O(i). C(Xi Yi | X
But C(Y_i | X̂_iŶ_{i−1}) ≥ C(Y_i | Ŷ_{i−1}) − O(i) ≥ C(Y_i | X̂_{i−1}Ŷ_{i−1}) − O(i), by Lemma 13.10.10 (iv). □

We now come to the combinatorial heart of the proof. Let n_i = |X_i| = |Y_i|. We will define a sequence of uniformly computable functions E_i and numbers m_i, with E_i : 2^{n_i} × 2^{n_i} → 2^{m_i}. We will then let Z_i = E_i(X_i, Y_i). The pair (E_i, m_i) will be chosen so that C(Z_i | X̂_{i−1}Ŷ_{i−1}) > (1 − ε)m_i, which will imply that C(Z_i | Ẑ_{i−1}) is also close to m_i. We will then be able to show that if we define Z = Z_1Z_2 …, then C(Z↾k) is sufficiently close to k for all k. The exact method of choosing E_i and m_i comes from the theory of randomness extractors and hashing. The proof below uses what is known as the probabilistic method.

Definition 13.10.12. A function E : 2^n × 2^n → 2^m is r-regular if for every B_1, B_2 ⊆ 2^n with |B_i| ≥ 2^{rn} for i = 1, 2 and every σ ∈ 2^m,

|E^{−1}(σ) ∩ (B_1 × B_2)| ≤ 2^{−m+1}|B_1 × B_2|.

The function E is weakly r-regular if it obeys the above definition when we consider only B_i with |B_1| = |B_2| = 2^{⌈rn⌉}.

Lemma 13.10.13. If E : 2^n × 2^n → 2^m is weakly r-regular then E is r-regular.

Proof. Let k = ⌈rn⌉ and suppose that for all B_1, B_2 with |B_i| = 2^k and all σ ∈ 2^m, we have |E^{−1}(σ) ∩ (B_1 × B_2)| ≤ 2^{−m+1}|B_1 × B_2| = 2^{−m+1+2k}. Let k_1, k_2 ≥ k and let B̃_1 and B̃_2 be such that |B̃_i| = 2^{k_i}. Partition each B̃_i into pairwise disjoint sets A^i_0, A^i_1, …, A^i_{2^{k_i−k}−1} of size 2^k. Then

|E^{−1}(σ) ∩ (B̃_1 × B̃_2)| = Σ_{i,j} |E^{−1}(σ) ∩ (A^1_i × A^2_j)| ≤ 2^{k_1+k_2−2k} 2^{−m+1+2k} = 2^{−m+1} 2^{k_1+k_2} = 2^{−m+1}|B̃_1 × B̃_2|. □
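The probabilistic argument in the next lemma rests on a multiplicative Chernoff bound: if a cell count is binomial with mean μ, then Prob(count ≥ 2μ) ≤ e^{−μ/3}. As a sanity check (our own, not from the text; parameter choices are arbitrary), the following Python snippet computes the exact binomial tail with rational arithmetic and compares it with the bound.

```python
from fractions import Fraction
from math import comb, exp

def binom_tail_ge(n, p, k):
    """Exact Prob(X >= k) for X ~ Binomial(n, p), with p a Fraction."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

# Chernoff, multiplicative form with delta = 1:
#   Prob(X >= 2*mu) <= exp(-mu/3),  mu = n*p.
# In the lemma, X is the number of j2-colored cells in a rectangle with
# n = N^(2r) cells, each colored j2 independently with probability p = 1/M.
for cells, colors in [(60, 3), (200, 4), (500, 8)]:
    mu = Fraction(cells, colors)
    exact = binom_tail_ge(cells, Fraction(1, colors), int(2 * mu))
    assert exact <= exp(-mu / 3)
```

Exact rational arithmetic avoids floating-point underflow in the far tail; the assertion is guaranteed by the standard Chernoff inequality Prob(X ≥ 2μ) ≤ (e/4)^μ ≤ e^{−μ/3}.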
Lemma 13.10.14. For each r > 0, if 2^m < 2^{.99rn} then with positive probability, a randomly chosen function E : 2^n × 2^n → 2^m will be r-regular. In particular, an r-regular E : 2^n × 2^n → 2^m exists (and hence can be found effectively by searching).

Proof. By the previous lemma, it is enough to show that with positive probability E is weakly r-regular. Let N = 2^n and M = 2^m (as numbers). Choose B_1, B_2 ⊆ 2^n with |B_i| = N^r (ignoring rounding for simplicity). Let j_1 ∈ B_1 × B_2 and j_2 ∈ 2^m. We think of 2^n × 2^n as a table with N rows and N columns, of 2^m as colors, and of E as specifying a coloring. Then B_1 × B_2 is a rectangle in this table, j_1 is a cell in this rectangle, and j_2 is one of M possible colors for this cell. Clearly Prob(E(j_1) = j_2) = 1/M.
Let C_{j_2} be the number of j_2-colored cells in B_1 × B_2. If there is no rectangle B_1 × B_2 and j_2 such that C_{j_2}/N^{2r} − 1/M > 1/M, then E is weakly r-regular, so it is enough to show that this event has positive probability. Applying the Chernoff bounds mentioned above, we have

Prob( C_{j_2}/N^{2r} − 1/M > 1/M ) < e^{−N^{2r}/(3M)}.

The probability that C_{j_2}/N^{2r} − 1/M > 1/M for some j_2 ∈ 2^m is less than or equal to the sum over all j_2 ∈ 2^m of the above probability, and hence less than M e^{−N^{2r}/(3M)}. The number of rectangles B_1 × B_2 is

(N choose N^r)² ≤ ((eN/N^r)^{N^r})² = e^{2N^r} e^{2N^r(1−r) ln N}.

Thus, to argue that with positive probability there is no rectangle B_1 × B_2 and j_2 such that C_{j_2}/N^{2r} − 1/M > 1/M, and hence E is weakly r-regular, it is enough to show that

M e^{−N^{2r}/(3M)} e^{2N^r} e^{2N^r(1−r) ln N} < 1.
Straightforward manipulation shows that this task is the same as showing that N^{2r}/(3M) − ln M > 2N^r + 2N^r(1 − r) ln N, which holds when M < N^{.99r}. □

We are finally ready to build Z. We proceed as follows.

1. Split X = X_1X_2 … and Y = Y_1Y_2 … as above, using the parameters r = q/2 and r′ = q/4 to define a and b. Note that for each i, we have C(X_i | X̂_{i−1}) > rn_i and C(Y_i | Ŷ_{i−1}) > rn_i.

2. For the parameters n_i = |X_i| = |Y_i| and m_i = i², find an (r/2)-regular function E_i. Let Z_i = E_i(X_i, Y_i).

3. Let Z = Z_1Z_2 ….

Lemma 13.10.15. For all ε > 0 and all sufficiently large i, we have C(Z_i | X̂_{i−1}Ŷ_{i−1}) ≥ (1 − ε)m_i.

Proof. If C(Z_i | X̂_{i−1}Ŷ_{i−1}) < (1 − ε)m_i, let

A = {σ ∈ 2^{m_i} : C(σ | X̂_{i−1}Ŷ_{i−1}) < (1 − ε)m_i}.

Then |A| < 2^{(1−ε)m_i}. Let t_1 = C(X_i | X̂_{i−1}Ŷ_{i−1}) and t_2 = C(Y_i | X̂_{i−1}Ŷ_{i−1}). For j = 1, 2, let B_j = {σ ∈ 2^{n_i} : C(σ | X̂_{i−1}Ŷ_{i−1}) ≤ t_j}. By Lemma 13.10.10 (iii) and (iv), if i is sufficiently large then t_j > rn_i − O(i) > (r/2)n_i for j = 1, 2, since C(X_i | X̂_{i−1}) > rn_i and C(Y_i | Ŷ_{i−1}) > rn_i. Now, |B_j| ≤ 2^{t_j+1}, so we can choose B′_j ⊇ B_j of size 2^{t_j+1}. The bounds on the t_j imply that the B′_j are large enough to allow us to invoke the
bounds for the regularity of E_i. Therefore, for any σ ∈ 2^{m_i},

|E_i^{−1}(σ) ∩ (B′_1 × B′_2)| ≤ 2^{−m_i+1}|B′_1 × B′_2|.

Thus

|E_i^{−1}(A) ∩ (B_1 × B_2)| ≤ |E_i^{−1}(A) ∩ (B′_1 × B′_2)| ≤ Σ_{σ∈A} |E_i^{−1}(σ) ∩ (B′_1 × B′_2)| ≤ 2^{t_1+t_2−εm_i+3}.

Since E_i^{−1}(A) ∩ (B′_1 × B′_2) can be effectively listed given the parameters X̂_{i−1}, Ŷ_{i−1}, (1 − ε)m_i, t_1, t_2, for any (σ, τ) ∈ E_i^{−1}(A) ∩ (B′_1 × B′_2), we have

C(στ | X̂_{i−1}Ŷ_{i−1}) ≤ t_1 + t_2 − εm_i + 2(log((1 − ε)m_i) + log t_1 + log t_2) + O(1).

Using the facts that m_i = i² and log t_j = O(i), we see that there is a δ > 0 such that C(στ | X̂_{i−1}Ŷ_{i−1}) ≤ t_1 + t_2 − δi² for all sufficiently large i. In particular,

C(X_iY_i | X̂_{i−1}Ŷ_{i−1}) ≤ t_1 + t_2 − δi²

for all sufficiently large i. However, by Lemma 13.10.11,

C(X_iY_i | X̂_{i−1}Ŷ_{i−1}) ≥ C(X_i | X̂_{i−1}Ŷ_{i−1}) + C(Y_i | X̂_{i−1}Ŷ_{i−1}) − O(i) = t_1 + t_2 − O(i).

For sufficiently large i, we have a contradiction. □

Lemma 13.10.16. For any δ > 0 and n, we have C(Z↾n) ≥ (1 − δ)n − O(1). Hence Z has effective Hausdorff dimension 1.

Proof. Let ε = δ/4. By Lemma 13.10.15, C(Z_i | X̂_{i−1}Ŷ_{i−1}) ≥ (1 − ε)m_i for almost all i. Thus C(Z_i | Ẑ_{i−1}) ≥ (1 − ε)m_i − O(1), since we can compute Ẑ_{i−1} from X̂_{i−1}Ŷ_{i−1}. A straightforward induction shows that C(Ẑ_i) ≥ (1 − 3ε)(m_1 + ⋯ + m_i) for sufficiently large i.

Now take σ = Z↾n between Ẑ_{i−1} and Ẑ_i, and assume for a contradiction that C(σ) ≤ (1 − δ)|σ|. Then we can describe Ẑ_{i−1} by giving a description of σ, which takes (1 − 4ε)|σ| ≤ (1 − 4ε)(m_1 + ⋯ + m_i) many bits, the string τ such that σ = Ẑ_{i−1}τ, which takes a further |σ| − |Ẑ_{i−1}| ≤ m_i many bits, and O(log m_i) many further bits to separate the descriptions of σ and τ. Therefore, for sufficiently large i,

C(Ẑ_{i−1}) ≤ (1 − 4ε)(m_1 + ⋯ + m_i) + m_i + O(log m_i) = (1 − 4ε)(m_1 + ⋯ + m_{i−1}) + (2 − 4ε)m_i + O(log m_i) < (1 − 3ε)(m_1 + ⋯ + m_{i−1}),

the last inequality holding because lim_i m_i/(m_1 + ⋯ + m_{i−1}) = 0. Thus we have a contradiction, and hence C(Z↾n) > (1 − δ)n for almost all n. □
This lemma completes the proof of the theorem. □

Note that the above result implies that Miller's degree of effective Hausdorff dimension 1/2 (Theorem 13.8.1) cannot compute two C-independent sources of positive effective Hausdorff dimension. This fact likely has interesting consequences for the computational power of such degrees.

Zimand [424] examined the extent to which the hypotheses on X and Y can be weakened. He stated that similar techniques can be used to show that for any δ > 0, there is a c such that, given sequences X and Y that are C-independent and such that C(X↾n) > c log n and C(Y↾n) > c log n for every n, a Z can be produced from X and Y with C(Z↾n) > (1 − δ)n for infinitely many n.

In a more recent paper, Zimand [422] showed that even for pairs of sets of limited dependence (as defined in that paper), it is still possible to extract a Z of high initial segment complexity. The methods are similar to those described here, but more delicate. Also, using similar and equally delicate methods, Zimand [423] showed that from two partially random but independent strings (as defined in that paper), it is possible to construct polynomially many pairwise independent random strings, and if the two original strings are themselves random, then this construction can be done in polynomial time. See [53, 424, 422, 423] for more on this fascinating topic.
13.11 Other notions of dimension

Hausdorff's extension of Carathéodory's s-dimensional measure is certainly not the only generalization of the notion of dimension to non-integer values in geometric measure theory and fractal geometry, and it is also not the only one to have been effectivized. In this section, we will briefly look at a couple of further dimension concepts and their effectivizations. Further details about the classical notions may be found in Federer [142].
13.11.1 Box counting dimension

For C ⊆ 2^ω, let C↾n = {σ ∈ 2^n : ∃α ∈ C (σ ≺ α)} and define the upper and lower box counting dimensions of C as

\overline{dim}_B(C) = lim sup_n log(|C↾n|)/n and \underline{dim}_B(C) = lim inf_n log(|C↾n|)/n,

respectively. If \overline{dim}_B and \underline{dim}_B coincide, then this value is called the box counting dimension, or Minkowski dimension, of C. The name "box counting" comes from the fact that, for each level n, one simply counts the number of boxes of size 2^{−n} needed to cover C. The following is clear.
Lemma 13.11.1. For any C ⊆ 2^ω, we have dim_H(C) ≤ \underline{dim}_B(C) ≤ \overline{dim}_B(C).

(Lower) box counting dimension gives an easy upper bound on Hausdorff dimension, although this estimate may not be very precise. For instance, let C be the class of all sequences that are 0 from some point on. Then we have dim_H(C) = 0 but \underline{dim}_B(C) = 1. (In fact, \underline{dim}_B(D) = 1 for any dense D ⊆ 2^ω.) This observation shows that, in general, box counting dimension is not a stable concept of dimension, since a countable union of classes of box counting dimension 0, such as C, can have box counting dimension 1. Staiger [377, 378] has investigated some conditions under which Hausdorff and box counting dimension coincide. Probably the most famous example of such a set is the Mandelbrot set.

One can modify box counting dimension to obtain a countably stable notion, yielding the concept of modified box counting dimension, defined as follows. The lower modified box counting dimension of C ⊆ 2^ω is

\underline{dim}_{MB}(C) = inf{ sup_i \underline{dim}_B(X_i) : C ⊆ ⋃_i X_i }.

The upper modified box counting dimension of C ⊆ 2^ω is

\overline{dim}_{MB}(C) = inf{ sup_i \overline{dim}_B(X_i) : C ⊆ ⋃_i X_i }.

If these values coincide, then they define the modified box counting dimension dim_{MB}(C) of C. (That is, we split up a set into countably many parts and look at the dimension of the "most complicated" part. Then we optimize this value by looking for the partition with the lowest such dimension.) The modified box counting dimensions behave more stably than their original counterparts; in particular, all countable classes have modified box counting dimension 0. However, these dimensions are usually hard to calculate, due to the extra inf / sup process involved in their definitions.
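For a toy worked example of the definition (our own; it does not appear in the text), take C to be the class of sequences that are 0 at every odd position. Then |C↾n| = 2^{⌈n/2⌉}, so log(|C↾n|)/n is exactly 1/2 along even n, and the box counting dimension of C is 1/2. A small Python check:

```python
from itertools import product
from math import log2

def prefixes(n):
    """Length-n prefixes of members of C = {alpha : alpha(i) = 0 for all odd i}.
    Only the ceil(n/2) even positions are free."""
    out = set()
    for bits in product("01", repeat=(n + 1) // 2):
        out.add("".join(bits[i // 2] if i % 2 == 0 else "0" for i in range(n)))
    return out

for n in (8, 16, 24):
    count = len(prefixes(n))
    assert count == 2 ** ((n + 1) // 2)   # |C restricted to length n| = 2^(ceil(n/2))
    assert log2(count) / n == 0.5         # the box counting estimate at even n
```

The same enumeration idea works for any class whose prefix sets are computable, though of course it only gives finitely many values of the limsup/liminf.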
13.11.2 Effective box counting dimension

The effectivization of box counting dimension is due to Reimann [324]. An equivalent formulation of effective upper box counting dimension was also given by Athreya, Hitchcock, Lutz, and Mayordomo [15]. This concept turns out to coincide with effective packing dimension, which we discuss below. We will be concerned only with the dimension of individual sets, and hence will not look at the modified versions of box counting dimension needed for countable stability. It is of course possible to effectivize the definition of, for example, \overline{dim}_{MB}(X) as we do for box counting dimension below.
Definition 13.11.2 (Reimann [324]). A c.e. set C ⊆ 2^{<ω} is an effective box cover of B ∈ 2^ω if B↾n ∈ C for almost all n.

Effective box counting dimension measures how efficiently the initial segments of a set can be "enwrapped" in a c.e. set of strings. For D ⊆ 2^{<ω}, let D^{[n]} = D ∩ 2^n.

Definition 13.11.3 (Reimann [324]). Let B ∈ 2^ω. The effective lower box counting dimension of B is

\underline{dim}^1_B(B) = inf{ lim inf_n log|C^{[n]}|/n : C is an effective box cover of B }.

The effective upper box counting dimension of B is

\overline{dim}^1_B(B) = inf{ lim sup_n log|C^{[n]}|/n : C is an effective box cover of B }.

Effective upper box counting dimension is the same as a concept analyzed by Athreya, Hitchcock, Lutz, and Mayordomo [15] and Hitchcock [180], building on work of Staiger [377].

Definition 13.11.4 (Staiger [377], Athreya, Hitchcock, Lutz, and Mayordomo [15]). For A ⊆ 2^{<ω}, the entropy rate of A is

H_A = lim sup_n log|A^{[n]}|/n.

For A ⊆ 2^{<ω}, let A^{i.o.} = {β ∈ 2^ω : ∃^∞ n (β↾n ∈ A)} and A^{a.e.} = {β ∈ 2^ω : ∀^∞ n (β↾n ∈ A)}. For C ⊆ 2^ω, let H_1(C) = inf{H_A : C ⊆ A^{i.o.} ∧ A ∈ Σ⁰₁} and H_1^{str}(C) = inf{H_A : C ⊆ A^{a.e.} ∧ A ∈ Σ⁰₁}. For B ∈ 2^ω, let H_1(B) = H_1({B}) and H_1^{str}(B) = H_1^{str}({B}).

It is not hard to see that for any B ∈ 2^ω, we have \overline{dim}^1_B(B) = H_1^{str}(B) and dim(B) = H_1(B). It is a trivial observation that, as in the classical case, effective box counting dimension always bounds effective Hausdorff dimension from above; in particular, dim(B) ≤ \underline{dim}^1_B(B) for any B ∈ 2^ω.

There is clearly a connection between effective box counting dimension and Kolmogorov complexity, as indicated by the following observation of Kolmogorov [211], which we met in Theorem 3.2.2 (ii).

Proposition 13.11.5 (Kolmogorov [211]). Let A ⊆ N × 2^{<ω} be computably enumerable and such that A_m = {x : (m, x) ∈ A} is finite for all m. Then C(x | m) ≤ log |A_m| + O(1) for all m and x ∈ A_m.
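Proposition 13.11.5 is just the observation that an element of A_m is determined by its position in a fixed enumeration of A_m, which takes about log|A_m| bits. A toy Python encoder/decoder (our own illustration; the sample set and names are invented):

```python
from math import ceil, log2

def describe(x, A_m):
    """Describe x in A_m by its index in the enumeration, padded to
    ceil(log2 |A_m|) bits: the content of C(x | m) <= log|A_m| + O(1)."""
    width = max(1, ceil(log2(len(A_m))))
    return format(A_m.index(x), "0%db" % width)

def recover(code, A_m):
    """Invert describe, given the same enumeration of A_m."""
    return A_m[int(code, 2)]

A_m = ["00", "01", "100", "111", "1010"]   # a finite slice of some c.e. set A
for x in A_m:
    code = describe(x, A_m)
    assert recover(code, A_m) == x
    assert len(code) == 3                   # ceil(log2 5) = 3 bits suffice
```

The decoder needs the enumeration itself, which is why the complexity bound is conditional on m.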
Indeed, it is possible to give an elegant characterization of effective upper box counting dimension in terms of Kolmogorov complexity.

Theorem 13.11.6 (Reimann [324], Athreya, Hitchcock, Lutz, and Mayordomo [15]). For any B ∈ 2^ω,

\overline{dim}^1_B(B) = lim sup_n K(B↾n)/n = lim sup_n C(B↾n)/n.

Proof. The second equation is immediate, since C and K agree up to a log factor.

(≥) If \overline{dim}^1_B(B) < s then it follows immediately from Proposition 13.11.5 that lim sup_n K(B↾n)/n ≤ s.

(≤) Suppose that lim sup_n C(B↾n)/n < s for a rational s. Let D = {σ ∈ 2^{<ω} : C(σ) < s|σ|}. Then D is an effective box cover of B. Clearly, |D^{[n]}| ≤ 2^{sn}, so \overline{dim}^1_B(B) ≤ lim sup_n log|D^{[n]}|/n ≤ s. □
13.11.3 Packing dimension

Packing dimension is another classically important fractional dimension. The effectivization of this notion was first considered by Athreya, Hitchcock, Lutz, and Mayordomo [15], but we will follow the treatment of Reimann [324], who noted the connections with the effective box counting dimension of the previous section.

Packing dimension can be seen as a dual to Hausdorff dimension. Hausdorff dimension is defined in terms of economical coverings, that is, enclosing a class from the outside; packing measures approximate from the inside, by packing a class economically with disjoint sets of small size.

A prefix-free set P ⊂ 2^{<ω} is a packing in C ⊆ 2^ω if every σ ∈ P has an extension in C. If |σ| ≥ n for all σ ∈ P, then we call P an n-packing in C. One can try to find a packing that is as "dense" as possible. Given s ≥ 0 and n, let

P^s_n(C) = sup{ Σ_{σ∈P} 2^{−s|σ|} : P is an n-packing in C }.

Clearly P^s_n(C) is nonincreasing with n, so the limit

P^s_∞(C) = lim_n P^s_n(C)

exists. However, this definition leads to the same problems we encountered with box counting dimension. Taking for instance the class C of all sequences that are 0 from some point on, we can find denser and denser packings in C, so that for every 0 ≤ s < 1, we have P^s_∞(C) = ∞. Hence P^s_∞ lacks countable additivity, and in particular is not a measure. This
problem can be overcome by applying a Carathéodory process to P^s_n. Let

P^s(C) = inf{ Σ_i P^s_∞(X_i) : C ⊆ ⋃_i X_i }.
(The infimum is taken over arbitrary countable covers of C.) Then P^s is an outer measure on 2^ω, and is Borel regular (see e.g. [335] for a definition).¹² The measure P^s is called the s-dimensional packing measure on 2^ω. Packing measures were introduced by Tricot [391] and Sullivan [384]. They can be seen as dual concepts to Hausdorff measures, and behave in many ways similarly to them. In particular, one may define packing dimension in the same way as Hausdorff dimension.

Definition 13.11.7. The packing dimension of C ⊆ 2^ω is

dim_P(C) = inf{s : P^s(C) = 0} = sup{s : P^s(C) = ∞}.

Packing dimension has stability properties similar to Hausdorff dimension, such as countable stability. With some effort, one can show that it coincides with \overline{dim}_{MB} (see Chapter 3 of Falconer [141]). Generally, the following relations between the different dimension concepts hold:

dim_H(C) ≤ \underline{dim}_{MB}(C) ≤ \overline{dim}_{MB}(C) = dim_P(C) ≤ \overline{dim}_B(C).

Although the traditional definition of packing dimension is rather complicated, due to the additional decomposition / optimization step, there is a simple martingale characterization of this notion. This characterization was discovered by Athreya, Hitchcock, Lutz, and Mayordomo [15], and demonstrates the dual nature of Hausdorff dimension and packing dimension much more clearly. It came as a surprise to workers in the area, who had thought packing dimension was too complex to have such a simple characterization.

Definition 13.11.8. For 0 < s ≤ 1, a martingale d s-succeeds strongly on A if lim inf_n d(A↾n)/2^{(1−s)n} = ∞.

Obviously, this condition is equivalent to lim_n d(A↾n)/2^{(1−s)n} = ∞. From a gambling perspective, succeeding strongly means not only accumulating arbitrarily high levels of capital, but also being able to guarantee that the capital stays above arbitrarily high levels from a certain time on.

Theorem 13.11.9 (Athreya, Hitchcock, Lutz, and Mayordomo [15]). For any C ⊆ 2^ω,

dim_P(C) = inf{s : some martingale s-succeeds strongly on all A ∈ C}.

Proof.
We actually prove this result for \overline{dim}_{MB}(C) in place of dim_P(C), and rely on the fact mentioned above that dim_P(C) = \overline{dim}_{MB}(C).

12 This need no longer be the case if the dimension function h(x) = x^s is replaced by more irregular functions not satisfying even weak continuity requirements.
() Let d be a martingale that s-succeeds strongly on all A ∈ C. We show that dimMB (C) s. Without loss of generality, assume that d(λ) 1. Let Dn = {σ ∈ 2n : d(σ) > 2(1−s)n }. Every A ∈ C is contained in all but finitely many Dn , so # Dj . C⊆ i ji
Letting Xi = ji Dj , it is enough to show that dimB (Xi ) s for all i. Recall that for X ⊆ 2ω , we write X n for the set of strings of length n that have extensions in X. We have Xi ⊆ Dn for all n i, so |Xi n| |Dn |. By Kolmogorov’s Inequality (Theorem 6.3.3), |Dn | 2ns , so dimB (Xi ) = lim sup n
log |Xi n| log |Dn | lim sup s. n n n
(≥) We may assume that dim̄MB(C) < 1. Let s, t, u be rationals such that dim̄MB(C) < s < t < u < 1. By the definition of modified box counting dimension, there are X0, X1, ... such that C ⊆ ⋃_i Xi and dim̄B(Xi) < s for all i. It is enough to show that for each i there is a martingale that t-succeeds strongly on all elements of Xi, since we can then use the additivity of martingales to combine these into a single martingale that u-succeeds strongly on all elements of C. So we assume that X ⊆ 2^ω is such that dim̄B(X) < s < s′ < t and show that there exists a martingale d that t-succeeds strongly on all A ∈ X.

Let Xn = X↾n, and for σ ∈ 2^{≤n}, let Xn^σ = {τ ∈ Xn : σ ⪯ τ}. There is an N such that (log |Xn|)/n < s for all n ≥ N. For each n ≥ N, define a martingale dn (inductively) as follows:

dn(σ) = 2^{|σ|−s′n} |Xn^σ|  if |σ| ≤ n,
dn(σ) = dn(σ↾n)            if |σ| > n.

It is easy to check that dn is a martingale, and for σ ∈ Xn, we have dn(σ) = 2^{(1−s′)n}. Let d = Σ_{n≥N} dn. The finiteness of d follows from

d(λ) = Σ_{n≥N} dn(λ) = Σ_{n≥N} 2^{−s′n} |Xn| < Σ_{n≥N} 2^{(s−s′)n} < ∞.

If A ∈ X, then A↾n ∈ Xn for all n, so for n ≥ N,

d(A↾n) / 2^{(1−t)n} ≥ dn(A↾n) / 2^{(1−t)n} = 2^{(1−s′)n} / 2^{(1−t)n} = 2^{(t−s′)n}.

As t > s′, we see that d t-succeeds strongly on all A ∈ X.
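The martingale dn built in the second half of the proof can be written out concretely. A minimal sketch (strings as Python str; Xn a finite set of length-n strings standing in for X↾n; the parameter s below plays the role of s′):

```python
def d_n(sigma, X_n, n, s):
    """The level-n martingale from the proof:
    d_n(sigma) = 2^(|sigma| - s*n) * |X_n^sigma| for |sigma| <= n,
    where X_n^sigma is the set of tau in X_n extending sigma; for
    |sigma| > n the capital is frozen at d_n(sigma restricted to n).
    On any tau in X_n the capital reaches 2^((1-s)n)."""
    sigma = sigma[:n]                       # freeze above level n
    count = sum(1 for tau in X_n if tau.startswith(sigma))
    return 2 ** (len(sigma) - s * n) * count
```

The fairness condition d_n(σ) = (d_n(σ0) + d_n(σ1))/2 holds because the counts |X_n^{σ0}| and |X_n^{σ1}| add up to |X_n^σ|.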
13.11. Other notions of dimension
641
13.11.4 Effective packing dimension

The somewhat involved definition of packing measures renders a direct Martin-Löf-style effectivization in terms of computably enumerable covers difficult. This obstacle can be overcome by using the martingale characterization provided by Theorem 13.11.9.

Definition 13.11.10 (Athreya, Hitchcock, Lutz, and Mayordomo [15]). The effective packing dimension of C ⊆ 2^ω is

Dim(C) = inf{s : there is a c.e. martingale that s-succeeds strongly on all A ∈ C}.

In Section 13.11.3 we stated the fact that packing dimension equals upper modified box counting dimension. For individual sets, the modifications to box counting dimension leading to countable stability can be disregarded. A careful effectivization of the proof of Theorem 13.11.9 yields the following.

Theorem 13.11.11 (Reimann [324]). For every A ∈ 2^ω, we have Dim(A) = dim¹B(A).

Combining this result with Theorem 13.11.6 allows us to characterize effective packing dimension in terms of initial segment complexity.

Corollary 13.11.12 (Athreya, Hitchcock, Lutz, and Mayordomo [15]). For every A ∈ 2^ω, we have

Dim(A) = lim sup_n K(A↾n)/n = lim sup_n C(A↾n)/n.

A direct proof of this statement can be obtained by a simple modification of the proof of Theorem 13.3.4 (changing liminf's to limsup's, "there are infinitely many" to "for almost all", etc.). Notice that this characterization makes it clear that a sufficiently generic set (say a 2-generic set) has high effective packing dimension, since the set of strings of high Kolmogorov complexity is dense. Thus packing dimension is a common ground between category and measure. This characterization also makes it easy to build sets of specified effective packing dimension, as we saw for effective Hausdorff dimension. Indeed, it is not hard to use it to build, say, a set A such that dim(A) = p and Dim(A) = q for any rationals p and q such that 0 ≤ p ≤ q ≤ 1. In particular, we have the following.
Corollary 13.11.13 (Athreya, Hitchcock, Lutz, and Mayordomo [15]). There is a set of effective packing dimension 1 and effective Hausdorff dimension 0.
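The lim inf / lim sup contrast behind Corollary 13.11.13 can be illustrated experimentally. The sketch below alternates incompressible-looking blocks with zero padding and measures a compression ratio along prefixes; zlib is only a crude, computable stand-in for the uncomputable complexity C, and the phase lengths are arbitrary choices of ours.

```python
import os
import zlib

def alternating_sequence(phases=8, base=3):
    """Phase k has base**k bytes: pseudorandom bytes in odd phases
    (high-complexity blocks), zero bytes in even phases (maximally
    compressible padding) -- a toy analog of building a set whose
    complexity ratio has a small lim inf but a large lim sup."""
    data = b""
    for k in range(1, phases + 1):
        n = base ** k
        data += os.urandom(n) if k % 2 else b"\x00" * n
    return data

def rate(prefix):
    # compressed size per input byte: a crude stand-in for C(A|n)/n
    return len(zlib.compress(prefix)) / len(prefix)
```

Measuring `rate` at the end of a random phase versus at the end of the following zero phase shows the oscillation: the ratio is driven up by the incompressible blocks and back down by the padding.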
13.12 Packing dimension and complexity extraction

Many of the questions we have investigated for Hausdorff dimension can also be examined for packing dimension. Some of the results for effective Hausdorff dimension immediately imply the same results for effective packing dimension, such as Greenberg and Miller's construction in [170] of a minimal Turing degree of effective Hausdorff dimension 1 (Theorem 13.9.17), which implies the theorem of Downey and Greenberg [106] that there is a minimal degree of effective packing dimension 1 (Theorem 13.13.1). In other ways, however, the two notions of effective dimension are quite different. For instance, we will see in Section 13.14 that there is no correspondence principle for effective packing dimension analogous to Theorem 13.6.1. Another example is Miller's Theorem 13.8.1 that there is a Turing degree of effective Hausdorff dimension 1/2, which has no analog for packing dimension.

Theorem 13.12.1 (Fortnow, Hitchcock, Pavan, Vinodchandran, and Wang [150]). Let ε > 0. If Dim(A) > 0 then there is a B ≡wtt A such that Dim(B) > 1 − ε.

Thus we conclude that there is a 0-1 law for the effective packing dimension of weak truth table, and hence Turing, degrees. Too recently for a proof to be included in this book, Conidis [74] has shown that there is a set of positive effective packing dimension that does not compute any set of effective packing dimension 1, and hence there is a Turing degree of effective packing dimension 1 with no element of effective packing dimension 1.

We give a new proof of Theorem 13.12.1 due to Laurent Bienvenu. It is rather simple and relies on the relationship between complexities at endpoints of intervals and their oscillations within these intervals. It is fair to say that it is along the same lines as the original, but it does not rely on the difficult result on extractors due to Barak, Impagliazzo, and Wigderson [16]13 used in the original proof.

Proof of Theorem 13.12.1. Let A be such that Dim(A) > 0.
Let t > 0 be a rational such that C(A↾n) ≥ tn for infinitely many n. Let m be a fixed large integer.

13 The original proof in [150] gave the stronger result that the set B has the same polynomial time Turing degree as A, and hence that there are 0-1 laws for low level complexity classes. However, the extractor result used in that proof (as a black box) is very complex and does not allow for a reasonable presentation in this book (its proof being over 30 pages long). The version we give here suffices for our purposes, as we do not need the result for complexity classes. Like the proof we present, the original proof is nonuniform, but in a different way, using a kind of binary search that identifies chunks of intervals of high complexity infinitely often. The closest we have in this book to this procedure is found in Section 13.10.
We claim that there exists a rational t′ > 0 such that C(A↾m^k) ≥ t′m^k for infinitely many k. To see this, choose any n such that C(A↾n) ≥ tn. Let k be such that n ∈ [m^{k−1}, m^k). Then

C(A↾m^k) ≥ C(A↾n) − O(log n) ≥ tn − O(log n) ≥ tm^{k−1} − O(log n) = (t/m) m^k − O(log m^k).

Thus, for any t′ < t/m, we have C(A↾m^k) ≥ t′m^k for infinitely many k.

Now let σk = A↾m^k and let s = lim sup_k C(σk)/|σk|. By the above claim we know that s > 0. Let s1 and s2 be rationals such that 0 < s1 < s < s2. By the definition of s, we have C(σk) ≤ s2|σk| for almost all k (for simplicity, we may restrict our attention to large enough k and hence assume that this condition is true for all k) and, for any constant d, we have C(σk) ≥ s1|σk| + d for infinitely many k.

Let V be our universal machine. Given A as an oracle, we can compute σ0, σ1, ... and, for each k, effectively find a τk such that V(τk) = σk and |τk| ≤ s2|σk|. Let D be the sequence given by τ0τ1τ2 ... . It is easy to see that D ≤wtt A. If we replace a sparse set of bits of D by the bits of A (for instance, replacing each D(2^n) by A(n)) to get B, we clearly have B ≡wtt A and Dim(B) = Dim(D). So let us evaluate Dim(D). For all k, we have C(τk) ≥ C(σk) − O(1), as σk can be computed from τk. Thus, for infinitely many k, we have C(τk) ≥ s1|σk| = s1 m^k. Now,
Dim(D) ≥ lim sup_k C(τ0 ... τk) / |τ0 ... τk|.

On the one hand, we have C(τ0 ... τk) ≥ C(τk) − O(log k), and thus C(τ0 ... τk) ≥ s1 m^k − O(log k) for infinitely many k. On the other hand, we have |τk| ≤ s2 m^k for all k. Therefore,

lim sup_k C(τ0 ... τk) / |τ0 ... τk| ≥ lim sup_k (s1 m^k − O(log k)) / (s2 (1 + m + ··· + m^k)) ≥ (s1/s2)(1 − 1/m).

Thus Dim(B) = Dim(D) ≥ (s1/s2)(1 − 1/m). Since s1 and s2 can be taken arbitrarily close to each other and m arbitrarily large, the theorem is proved.

On the other hand, packing dimension does have an effect on the complexity extraction question for Hausdorff dimension. The following result implies that if A is a set of positive effective Hausdorff dimension that does not compute any set of higher effective Hausdorff dimension, then A must have effective packing dimension 1.

Theorem 13.12.2 (Bienvenu, Doty, and Stephan [38]). If Dim(A) > 0 then for each ε > 0 there is a B ≤T A such that dim(B) ≥ dim(A)/Dim(A) − ε.
Proof. We may assume that dim(A) > 0. Let s1, s2, s3, and t be rationals such that 0 < s1 < dim(A) < s2 < s3 and Dim(A) < t, and such that

s1 / (t + s3 − s1) ≥ dim(A)/Dim(A) − ε.

Let σn ∈ 2^n be such that A = σ1σ2 ... . The idea of the proof is to split A effectively into blocks ν = σn ... σ_{n+k} such that K(ν) is relatively small and use these to find sufficiently incompressible short prefix-free descriptions τi for the σi. We then let B = τ1τ2 ... . We will make several uses of the fact that

K(στ) = K(σ) + K(τ | σ) ± O(log |σ| + log |τ|),

and more generally

K(ρ0 ... ρn) = Σ_{j≤n} K(ρj | ρ0 ... ρ_{j−1}) ± O(Σ_{j≤n} log |ρj|).

These facts follow easily from the symmetry of information results proved in Section 3.3. (These results are proved for plain complexity, but as pointed out in Section 3.10, they also hold for prefix-free complexity because K and C agree up to a log factor.)

Suppose that we have already defined η = τ1 ... τ_{n−1} for ρ = σ1 ... σ_{n−1} so that |η| ≤ s3|ρ|. By the choice of s2, there are infinitely many k such that for ν = σn ... σ_{n+k}, we have K(ν | ρ) ≤ s2|ν|. Fix such a k. For j ≤ k, let νj = σn ... σ_{n+j}. If k is large enough (so that quantities such as (s3 − s2)|ν| overwhelm the relevant log terms), then, by symmetry of information,

Σ_{i≤k} K(σ_{n+i} | ρν_{i−1}) ≤ s3|ν|,

and for all j ≤ k, by symmetry of information,

K(νj | ρ) ≤ K(ρνj) − K(ρ) + O(log |ρ| + log |νj|) ≤ t(|ρ| + |νj|) − s1|ρ|,

so that, again by symmetry of information,

Σ_{i≤j} K(σ_{n+i} | ρν_{i−1}) ≤ (t − s1)|ρ| + t|νj|.

Given the existence of such a k, we can effectively find k and τn, ..., τ_{n+k} such that, letting ν and νj be as above, each τ_{n+i} is a U-description of σ_{n+i} given ρν_{i−1}, and we have Σ_{i≤k} |τ_{n+i}| ≤ s3|ν| and Σ_{i≤j} |τ_{n+i}| ≤ (t − s1)|ρ| + t|νj| for all j ≤ k. We have |ητn ... τ_{n+k}| ≤ s3|ρν|, so the recursive definition can continue. Let B = τ1τ2 ... . Clearly B ≤T A, so we are left with showing that dim(B) ≥ dim(A)/Dim(A) − ε.

Fix m. Let n, k, etc. be as above so that n ≤ m ≤ n + k. Let j = m − n. The choice of the τi as appropriate U-descriptions ensures that K(σ1 ... σm) ≤ K(τ1 ... τm) + O(1), so K(τ1 ... τm) ≥ s1|σ1 ... σm| − O(1). Furthermore,

|τ1 ... τm| ≤ |η| + |τn ... τ_{n+j}| ≤ s3|ρ| + (t − s1)|ρ| + t|νj| ≤ (t + s3 − s1)|σ1 ... σm|.

Thus

lim inf_m K(τ1 ... τm) / |τ1 ... τm| ≥ s1 / (t + s3 − s1) ≥ dim(A)/Dim(A) − ε.

Given l, let m be least such that B↾l ⪯ τ1 ... τm. Then |τ1 ... τm| − l ≤ |τm| = O(m), so K(τ1 ... τm) ≤ K(B↾l) + O(m). Since m is sublinear in |τ1 ... τm|, it follows that

dim(B) = lim inf_l K(B↾l) / l ≥ lim inf_m K(τ1 ... τm) / |τ1 ... τm| ≥ dim(A)/Dim(A) − ε.
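Both proofs in this section re-encode A block by block. The sparse coding step used in the proof of Theorem 13.12.1 (replacing each D(2^n) by A(n)) is easy to make concrete for finite strings. A toy sketch, with function names of our own choosing:

```python
def interleave(D, A):
    """B(2^n) = A(n) for n >= 0; every other position carries the next bit
    of D in order (the sparse coding from the proof of Theorem 13.12.1,
    restricted to finite lists).  The lengths must be consistent: the number
    of powers of 2 below len(D) + len(A) must equal len(A)."""
    total = len(D) + len(A)
    powers = set()
    p = 1
    while p < total:
        powers.add(p)
        p *= 2
    d, a = iter(D), iter(A)
    return [next(a) if i in powers else next(d) for i in range(total)]

def deinterleave(B):
    """Recover (D, A) from B; position i carries a bit of A iff i is a power of 2."""
    D, A = [], []
    for i, b in enumerate(B):
        (A if i > 0 and i & (i - 1) == 0 else D).append(b)
    return D, A
```

Because the coded positions 2^n are exponentially sparse, they do not affect the lim sup of the complexity ratio, while making A and B recoverable from each other with computably bounded use — exactly what B ≡wtt A requires.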
13.13 Clumpy trees and minimal degrees

In this section we introduce the useful notion of clumpy trees, using them to show that there is a minimal degree of effective packing dimension 1. This result follows from Theorem 13.9.17, of course, but can be proved using an easier method.

Theorem 13.13.1 (Downey and Greenberg [106]). There is a set of minimal Turing degree and effective packing dimension 1.

Proof. We give a notion of forcing P such that a sufficiently generic filter G ⊂ P yields a set XG ∈ 2^ω with effective packing dimension 1 and minimal Turing degree.14 This notion is a modification of the standard notion of forcing with computable perfect trees due to Sacks [345]. We need to restrict the kind of perfect trees we use so that we can always choose strings that are sufficiently complicated (i.e., not easily compressed) to be initial segments of the set we build. The problem, of course, is that we cannot determine effectively which strings are sufficiently incompressible, but our conditions, the trees, have to be computable. The solution to this problem relies on the following lemma.

14 A notion of forcing is just a partial order P, whose elements are called conditions, and whose ordering is called extension. A filter on P is a subset F of P such that if p, q ∈ F then p, q have a common extension in F, and if p ∈ F and p extends q, then q ∈ F. A subset D of P is dense if for every p ∈ P there is a q ∈ D extending p. The filter F is generic for a collection of dense sets if it contains an element of each set in the collection. By sufficiently generic we mean that F is generic for a countable collection of dense sets D0, D1, ... that will be specified in the proof below. Such an F necessarily exists, because we can let p0 ∈ D0, let p_{i+1} ∈ D_{i+1} extend p_i, and then define F = {q ∈ P : ∃i (p_i extends q)}.
Lemma 13.13.2. For each σ ∈ 2^{<ω} and ε ∈ Q^{>0} we can computably find an n such that K(στ)/|στ| ≥ 1 − ε for some τ ∈ 2^n.

Proof. Let d = |σ| + 1 and let S = {ν : K(ν) ≤ |ν| − d}. Since μ(S) ≤ 2^{−d} < 2^{−|σ|}, for any m > |σ| the extensions of σ of length m cannot all lie in S. So, letting m > d/ε, we see that there is some ν ⪰ σ of length m such that K(ν) > m − d. Then K(ν)/m ≥ 1 − d/m > 1 − ε, so we can let n = m − |σ|.

We denote the n corresponding to σ and ε in the above lemma by nε(σ).

A perfect function tree is a map T : 2^{<ω} → 2^{<ω} that preserves extension and incompatibility. We write [T] for the class of all B for which there is an A with T(A↾n) ≺ B for all n. Let T be a perfect function tree, σ ∈ rng T, and ε ∈ Q^{>0}. Let ρ = T^{−1}(σ). We say that T contains an ε-clump above σ if στ ⪯ T(ρτ) for all τ ∈ 2^{nε(σ)}. (Note that this condition implies that T(ρτ) = στ for all τ ∈ 2^{<nε(σ)}.)
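The point of Lemma 13.13.2 is that the bound nε(σ) is explicitly computable even though K is not. A minimal sketch of the computation (ε passed as an exact rational to avoid rounding; K itself is of course not computed here):

```python
from fractions import Fraction

def n_eps(sigma_len, eps):
    """The computable witness bound from Lemma 13.13.2, for eps a positive
    rational.  With d = |sigma| + 1, at most 2^(m - d) strings of length m
    can satisfy K(nu) <= m - d (a measure counting argument), while sigma
    has 2^(m - d + 1) extensions of length m, so for any m > d/eps some
    extension nu has K(nu) > m - d, i.e. K(nu)/m > 1 - eps."""
    d = sigma_len + 1
    # least integer m strictly greater than d/eps (exact arithmetic)
    m = (d * eps.denominator) // eps.numerator + 1
    m = max(m, sigma_len + 1)
    return m - sigma_len
```

Since the map (σ, ε) ↦ nε(σ) only does this arithmetic, it is primitive recursive, which is what the proof of Theorem 13.14.2 below exploits.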
Proof. Since G is a filter, ⋂_{(T,ε)∈G} [T] is nonempty. By considering full subtrees, we see that for each n, the set {(T, ε) ∈ P : |T(λ)| > n} is dense in P, so ⋂_{(T,ε)∈G} [T] is a singleton, and hence XG is well-defined.

Let q < 1 and let (T, ε) ∈ P. There is some ρ such that on (T, ε), the string σ = T(ρ) is labeled by a δ with 1 − δ > q. Let n = nδ(σ). There is some τ of length n such that K(στ)/|στ| ≥ 1 − δ > q. Then T(ρτ) is labeled by δ/2 on (T, ε), and (Ext_{T(ρτ)}(T), δ/2) ∈ P extends (T, ε). If that condition is in G then στ ⪯ T(ρτ) ≺ XG. Thus the set of conditions that force there to be a ν ≺ XG such that K(ν)/|ν| > q is dense in P.

We finish by showing that if G is sufficiently generic then XG has minimal Turing degree. Let Φ be a Turing functional. As usual in Sacks forcing, let

DivΦ = {(T, ε) ∈ P : ∃x ∀σ ∈ rng T (Φ^σ(x)↑)}

and

TotΦ = {(T, ε) ∈ P : ∀x ∀σ ∈ rng T ∃σ′ ∈ rng T (σ′ ⪰ σ ∧ Φ^{σ′}(x)↓)}.

Let (T, ε) ∈ P. If (T, ε) ∉ TotΦ then there is a σ ∈ rng T and an x such that for all σ′ ∈ rng T, if σ′ ⪰ σ then Φ^{σ′}(x)↑. So (Extσ(T), δ) ∈ DivΦ for the label δ of σ on (T, ε). Thus TotΦ ∪ DivΦ is dense in P.

Let CompΦ be the collection of conditions (T, ε) ∈ TotΦ such that for all σ, σ′ ∈ rng T, the strings Φ^σ and Φ^{σ′} are compatible. If (T, ε) ∈ CompΦ and (T, ε) ∈ G then Φ^{XG} is computable, since to determine Φ^{XG}(x) we need only look for a σ ∈ rng T such that Φ^σ(x)↓, which must exist and satisfy Φ^σ(x) = Φ^{XG}(x).

Let SpΦ be the collection of conditions (T, ε) ∈ TotΦ such that for all incompatible labeled σ, σ′ ∈ rng T, the strings Φ^σ and Φ^{σ′} are incompatible. Let (T, ε) ∈ G ∩ SpΦ. We claim that XG ≤T Φ^{XG}. Suppose that we have determined an initial segment σ of XG. It is easy to check that the labeled nodes are dense in rng T, so we can find a labeled σ′ ⪰ σ in rng T such that Φ^{σ′} ≺ Φ^{XG}. Then we must have σ′ ≺ XG.

Thus if we can show that DivΦ ∪ CompΦ ∪ SpΦ is dense in P, we will have shown that either Φ^{XG} is not total, or Φ^{XG} is computable, or Φ^{XG} computes XG. Since Φ is arbitrary and we have already shown that XG is not computable, we can then conclude that XG has minimal degree. Since we have already seen that TotΦ ∪ DivΦ is dense in P, it is enough to show that SpΦ ∪ CompΦ is dense below TotΦ in P. That is, every condition in TotΦ is extended by one in SpΦ ∪ CompΦ.

Lemma 13.13.4. SpΦ ∪ CompΦ is dense below TotΦ in P.

Proof. Suppose that (T, ε) ∈ TotΦ has no extension in CompΦ. Then for all σ ∈ rng T, there are σ0, σ1 ∈ rng T extending σ such that Φ^{σ0} and Φ^{σ1} are incompatible.
By recursion, we define an extension (S, ε) of (T, ε) in SpΦ. We start by letting S(λ) = T(λ) and labeling this string by ε on (S, ε). Suppose that we have defined σ = S(ρ′) = T(ρ), where σ is labeled by δ on (T, ε) and by γ on (S, ε). Assume that we have carried out the construction so far so that nγ(σ) ≤ nδ(σ). For every τ of length strictly shorter than nγ(σ), we can let S(ρ′τ) = T(ρτ) = στ (and leave it unlabeled on (S, ε)). Let τ0, ..., τ_{2^n−1} be the strings of length n = nγ(σ). Enumerate the pairs of numbers i < j < 2^n as (i0, j0), ..., (im, jm). For each i < 2^n and k ≤ m, we define strings ν^i_k such that στi ⪯ ν^i_0 ⪯ ··· ⪯ ν^i_m and such that every ν^i_k is in the image of T. At step k, define ν^{ik}_k ⪰ ν^{ik}_{k−1} and ν^{jk}_k ⪰ ν^{jk}_{k−1} such that Φ^{ν^{ik}_k} and Φ^{ν^{jk}_k} are incompatible, which is possible because (T, ε) has no extension in CompΦ. Define S(ρ′τi) to be some string ηi ∈ rng T extending ν^i_m that is labeled by some δi such that nδi(ηi) ≥ n_{γ/2}(ηi). We can then label ηi by γ/2 on (S, ε). Note that the inductive hypothesis assumed above is preserved, and so the construction can continue.

This lemma completes the proof of Theorem 13.13.1.
13.14 Building sets of high packing dimension

One way in which effective packing dimension differs from effective Hausdorff dimension is in the absence of a correspondence principle analogous to Theorem 13.6.1. Conidis [75] built a countable Π01 class P of effective packing dimension 1. (In fact, he built P to have a unique element of rank 1, so that all other elements of P are computable. See Section 2.19.1 for the definition of the rank of an element of a Π01 class.) Since P is countable, its classical packing dimension is 0. Note also that, by Theorem 13.6.1, the effective Hausdorff dimension of P is 0.

Theorem 13.14.1 (Conidis [75]). There is a countable Π01 class of effective packing dimension 1.

Proof. By Lemma 13.13.2, there is a computable sequence 0 = n0 < n1 < ··· such that for every σ ∈ 2^{nk}, there is a τ ∈ 2^{n_{k+1}} with σ ≺ τ and K(τ) ≥ (1 − 2^{−k})n_{k+1}. We build finite trees Ts ⊆ 2^{≤ns} such that T0 ⊂ T1 ⊂ ··· and let our class be [⋃_s Ts]. We will have strings σk whose values will change throughout the construction, but will have limit values such that σ0 ≺ σ1 ≺ ··· and ⋃_k σk is the unique rank 1 point of [⋃_s Ts].

Construction. Stage 0. Let σ0 = λ and T0 = {λ}.

Stage s + 1. If there is a least k ≤ s such that Ks(σk) < (1 − 2^{−k})|σk| then redefine σk to be the leftmost extension τ ∈ 2^{|σk|} of σ_{k−1} on Ts such that Ks(τ) ≥ (1 − 2^{−k})|τ| (which we will argue below must exist). In this case, undefine all σj for j > k.
Let k be largest such that σk is still defined and let τ be the leftmost leaf of Ts that extends σk (so that |τ| = ns). Define T_{s+1} as follows. Begin by adding all extensions of τ of length n_{s+1}. For every leaf ρ ≠ τ of Ts, add ρ0^{n_{s+1}−ns} to T_{s+1}. Finally, close downwards to obtain a subtree of 2^{≤n_{s+1}}. Let σ_{k+1} = τ0^{n_{s+1}−ns}. End of Construction.

Every time σk is defined at a stage s to extend some τ ∈ 2^{ns}, every extension of τ of the same length as σk is added to T_{s+1}. Thus, by the choice of the ns, if we ever find that Ks(σk) < (1 − 2^{−k})|σk| then we can redefine σk as in the construction. Thus the construction never gets stuck, and it follows by induction that every σk has a final value, which we will refer to simply as σk, such that σ0 ≺ σ1 ≺ ··· and K(σk) ≥ (1 − 2^{−k})|σk| for all k.

Let P = [⋃_s Ts] and X = ⋃_k σk. Then X ∈ P and Dim(X) = 1, so Dim(P) = 1. It is easy to see from the construction that if ρ ≠ σk and |ρ| = |σk|, then every extension of ρ on P is eventually 0. Thus every element of P other than X is eventually 0, and hence P is countable.

Note that, in the above proof, the σk never move to the left, so the unique noncomputable element X of P is a left-c.e. real.

This result suggests that it is easier (from a computability-theoretic perspective) to build sets of high effective packing dimension than ones of high effective Hausdorff dimension. (Theorem 13.12.1 can also be seen as evidence for this claim.) Downey and Greenberg [106] have extended Theorem 13.14.1 as follows.

Theorem 13.14.2 (Downey and Greenberg [106]). Every set A of array noncomputable degree computes a set B of effective packing dimension 1. If A is c.e. then B can be taken to be a left-c.e. real that is the unique rank 1 element of a Π01 class.

Before proving this result, we make a few comments. The following result shows that c.e. traceable sets have packing dimension 0.

Theorem 13.14.3 (Essentially Kummer [226], see Downey and Greenberg [106]). If A is c.e.
traceable, then for every computable order h, we have C(A↾n) ≤ log n + h(n) + O(1).

Proof. Let A be c.e. traceable and fix a computable order h. (We may assume that h(0) > 0.) By Proposition 2.23.12, A is c.e. traceable with bound h. Let T0, T1, ... be a c.e. trace with bound h for the function n ↦ A↾n. We can describe A↾n by describing n and the position of A↾n in the enumeration of Tn, so C(A↾n) ≤ log n + h(n) + O(1).

Since we can choose h to be slow growing, and log n is o(n), we have the following.

Corollary 13.14.4. If A is c.e. traceable then Dim(A) = 0.
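The description used in the proof of Theorem 13.14.3 can be modeled concretely. In the sketch below the trace is a finite dictionary (a toy stand-in for a c.e. trace), description length is measured in bits, and the helper names are our own:

```python
from math import ceil, log2

def describe(prefix, trace):
    """Describe A|n by the pair (n, position of A|n in the enumeration of
    T_n), as in the proof of Theorem 13.14.3.  `trace` maps each n to the
    finite list T_n in enumeration order, with |T_n| <= h(n)."""
    n = len(prefix)
    return n, trace[n].index(prefix)

def description_bits(n, h):
    """Bits for the pair: about log n for n plus log h(n) for the index,
    comfortably within the C(A|n) <= log n + h(n) + O(1) bound."""
    return ceil(log2(n + 1)) + ceil(log2(h(n) + 1))
```

Since the index into T_n needs only about log h(n) bits, the bound in the theorem is in fact quite generous; what matters for Corollary 13.14.4 is only that the total is o(n) when h grows slowly.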
Combining this result with Theorem 13.14.2 and the fact that a c.e. degree is array computable iff it is c.e. traceable (Theorem 2.23.13), we have the following. Corollary 13.14.5 (Downey and Greenberg [106]). The following are equivalent for a c.e. set A. (i) A computes a set of positive effective packing dimension. (ii) A computes a set of effective packing dimension 1. (iii) The degree of A is array noncomputable. Corollary 13.14.4, together with the existence of minimal degrees of packing dimension 1 (Theorem 13.9.17), implies Shore’s unpublished result that there is a minimal degree that is not c.e. traceable. As discussed in Section 2.23, Ishmukhametov [184] showed that if a degree is c.e. traceable then it has a strong minimal cover in the Turing degrees. Yates had earlier asked the still open question of whether every minimal degree has a strong minimal cover. It follows that c.e. traceability is not enough to answer this question. Downey and Ng [129] have further sorted out the connections between high effective packing dimension and notions such as array computability. Theorem 13.14.6 (Downey and Ng [129]). (i) There is a degree that is not c.e. traceable but has effective packing dimension 0. (ii) There is a degree that is array computable but has effective packing dimension 1. Part (ii) follows by taking a hyperimmune-free 1-random degree and applying Proposition 2.23.10. For a proof of part (i), see [129]. The degrees in the above result can also be Δ02 , and hence there seems to be no way to relate the degrees of packing dimension 1 with any known lowness class. We now proceed with the proof of Theorem 13.14.2. Proof of Theorem 13.14.2. We begin with the non-c.e. case. 
Recall that f ≤pb C means that f can be computed from oracle C by a reduction procedure with a primitive recursive bound on the use function, and that f ≤pb ∅′ iff there are a computable function f(n, s) and a primitive recursive function p such that lims f(n, s) = f(n) and |{s : f(n, s) ≠ f(n, s+1)}| ≤ p(n). Recall further that a set of strings S is pb-dense if there is a function f ≤pb ∅′ such that f(σ) ⪰ σ and f(σ) ∈ S for all σ, and that a set is pb-generic if it meets every pb-dense set of strings. Finally, recall that Theorem 2.24.22 states that if a is array noncomputable, then there is a pb-generic set B ≤T a. Thus, it is enough to show that if B is pb-generic then Dim(B) = 1, which we now do.

It is easy to check that the map given by Lemma 13.13.2, which takes σ ∈ 2^{<ω} and ε ∈ Q^{>0} to an nε(σ) such that K(στ)/|στ| ≥ 1 − ε for some τ ∈ 2^{nε(σ)}, is primitive recursive. For k > 0, let

Sk = {ν ∈ 2^{<ω} : K(ν)/|ν| > 1 − 1/k}.
To see that Sk is pb-dense, first note that it is co-c.e. Let f(σ, s) be the leftmost extension of σ of length m = |σ0^k| + n_{1/k}(σ0^k) in Sk[s]. Then f is computable, |{s : f(σ, s) ≠ f(σ, s+1)}| < 2^m, which is a primitive recursive bound, and lims f(σ, s) ∈ Sk. So if B is pb-generic then it meets all the Sk, which implies that Dim(B) = 1.

We now turn to the c.e. case. Here, if we wish only to make B left-c.e. (without making it be the unique rank 1 element of a Π01 class), there is a fairly simple permitting argument.15 Let A be a c.e. set of a.n.c. degree. We have requirements
Rk : ∃ν ∈ 2^{nk} (ν ≺ B ∧ K(ν)/|ν| ≥ 1 − 1/k)

for k > 0. It is enough to satisfy infinitely many of these requirements. As noted in the proof of Theorem 13.14.1, there is a computable sequence 0 = n0 < n1 < ··· such that for every σ ∈ 2^{nk}, there is a τ ∈ 2^{n_{k+1}} with σ ≺ τ and K(τ) ≥ (1 − 2^{−k})n_{k+1}. Partition ℕ into consecutive intervals I0, I1, ... such that |Ik| = 2^{nk}. By Theorem 2.23.4, there is a D ≡T A such that for every c.e. set W there are infinitely many m with D ∩ Im = W ∩ Im. We define B as the limit of strings σ0 ≺ σ1 ≺ ···, which can change value during the construction. We begin with σk = 0^{nk} for all k. We also build a c.e. set W, into which we enumerate requests for D-permission. Let uk = |I0| + |I1| + ··· + |Ik|.

At stage s, say that Rk requires attention if Ds ∩ Ik ⊆ Ws ∩ Ik ≠ Ik and Ks(σk) < (1 − 2^{−k})|σk|, and either

1. this is the first stage at which this inequality obtains for the current value of σk, or

2. D_{s+1}↾uk ≠ Ds↾uk.

If there is a least k ≤ s such that Rk requires attention then proceed as follows. If 1 holds then put the least n ∈ Ik \ W into W. If 2 holds then redefine σk to be the leftmost extension τ ∈ 2^{nk} of σ_{k−1} such that Ks(τ) ≥ (1 − 2^{−k})|τ|, and redefine σj for j > k to be σk0^{nj−nk}.

15 There is also a simple proof, due to Miller [personal communication], using material from Chapter 16: In Theorem 16.1.6 we will see that every a.n.c. c.e. degree contains a set X such that C(X↾n) ≥ 2 log n − O(1) for infinitely many n. So let A be a c.e. set of a.n.c. degree, and let X ≡T A be as above. Let B = Σ_{n∈X} 1/(n log² n). This sum converges, so B is a left-c.e. real, and B ≡T A. Assume for a contradiction that B does not have packing dimension 1. Let q < 1 be such that K(B↾n) ≤ qn for all sufficiently large n. We have B − B↾(log n + 2 log log n) ≤ 1/(n log² n), so if we know B↾(log n + 2 log log n), then we can determine whether k is in X for all k < n. Thus

C(X↾n) ≤ K(B↾(log n + 2 log log n)) + K(n) + O(1) ≤ (1 + q) log n + O(log log n),

contradicting our choice of X.
Clearly, each σk has a final value. For these final values, let B = ⋃_k σk. Since the σk move only to the right, B is a left-c.e. real. Since σk cannot change unless D changes below uk, we also have B ≤wtt D ≤T A.

Now suppose that D ∩ Ik = W ∩ Ik. Let t be the least stage by which all Rj for j < k have stopped requiring attention, which must exist because each Rj can require attention at most twice for each string in 2^{nj}. It can never be the case that Ds ∩ Ik \ Ws ∩ Ik ≠ ∅, since then we would never again put a number into W ∩ Ik, ensuring that D ∩ Ik ≠ W ∩ Ik. Every time we put a number into W ∩ Ik, we move σk. Since |Ik| = 2^{nk} and σk always has length nk, we never have Ws ∩ Ik = Ik. Thus, if there is an s ≥ t such that Ks(σk) < (1 − 2^{−k})|σk|, then Rk requires attention at stage s, and so a number enters W ∩ Ik. This number will later enter D, at which point Rk will again require attention, and σk will be redefined. Thus, for the final value of σk, we must have K(σk) ≥ (1 − 2^{−k})|σk|, and hence Rk is satisfied. Since there are infinitely many k such that D ∩ Ik = W ∩ Ik, infinitely many requirements are satisfied, and hence Dim(B) = 1.

For the full result, we combine this permitting argument with the proof of Theorem 13.14.1. We have the same requirements as above. Let D, nk, and Ik be as above (except that it is convenient to have |Ik| = 2^{nk+1} now). We will have strings σk as above, but now their lengths may change during the construction, as in the proof of Theorem 13.14.1. As in that proof, we will build a tree via approximations Ts. While we work above a particular value of σk, we extend all other strings of the same length by 0's. Thus, when we redefine σk, we must move σ_{k+1} above σk0^n for some large n. This redefinition means that we now must associate some large Ij with σ_{k+1}, since we may need many permissions to ensure that K(σ_{k+1}) ≥ (1 − 2^{−(k+1)})|σ_{k+1}| for the final value of σ_{k+1}.
The main idea of the construction is that, instead of having a fixed use uk for permitting changes to σk, we will now have a varying use uk (so we will no longer have a wtt-reduction from D). When we redefine σk, we also redefine uk to be |I0| + |I1| + ··· + |I_{j−1}|, where Ij is the interval now associated with R_{k+1}. When we seek a permission to change σk, we do so by putting numbers into every Im ⊆ [u_{k−1}, uk). This action will allow us to argue that each Im such that D ∩ Im = W ∩ Im gives us enough permissions to satisfy some requirement, and hence infinitely many requirements are satisfied.

We say that Ik is active at stage s + 1 if Ds ∩ Ik ⊆ Ws ∩ Ik ≠ Ik.

Construction. Stage 0. Let σ0 = λ, let T0 = {λ}, and let u0 = 0.

Stage s + 1. Say that Rk requires attention if Ks(σk) < (1 − 2^{−k})|σk| and either

1. this is the first stage at which this inequality obtains for the current value of σk, or
2. D_{s+1}↾uk ≠ Ds↾uk.

If there is a least k ≤ s such that Rk requires attention then proceed as follows. If 1 holds then for each active Im ⊆ [u_{k−1}, uk), put the least n ∈ Im \ W into W. If 2 holds then redefine σk to be the leftmost extension τ ∈ 2^{|σk|} of σ_{k−1} on Ts such that Ks(τ) ≥ (1 − 2^{−k})|τ| (which must exist by the same reasoning as in the proof of Theorem 13.14.1), redefine uk to be |I0| + |I1| + ··· + |Is|, and undefine all σj and uj for j > k.

Now let k be largest such that σk is still defined and let τ be the leftmost leaf of Ts that extends σk (so that |τ| = ns). Define T_{s+1} as follows. Begin by adding all extensions of τ of length n_{s+1}. For every leaf ρ ≠ τ of Ts, add ρ0^{n_{s+1}−ns} to T_{s+1}. Finally, close downwards to obtain a subtree of 2^{≤n_{s+1}}. Let σ_{k+1} = τ0^{n_{s+1}−ns} and u_{k+1} = |I0| + |I1| + ··· + |I_{s+1}|. End of Construction.

Let P = [⋃_s Ts]. It is easy to check by induction that each σk has a final value (since, after σ_{k−1} has settled, the length of σk cannot change). For these final values, let B = ⋃_k σk. As in the proof of Theorem 13.14.1, B is a left-c.e. real, and is the unique rank 1 element of P. Neither σk nor uk can change unless some number less than uk enters D. Since each uk has a final value, D can compute the final value of each σk, and hence can compute B.

Now suppose that D ∩ Ik = W ∩ Ik. It can never be the case that Ds ∩ Ik \ Ws ∩ Ik ≠ ∅, since then Ik would become inactive, and we would never again put a number into W ∩ Ik, ensuring that D ∩ Ik ≠ W ∩ Ik. Furthermore, each time a number enters W ∩ Ik, it does so because some string σ_{k′} becomes redefined. It is easy to see from the construction that in that case |σ_{k′}| ≥ nk, and the strings corresponding to different numbers entering W ∩ Ik must be different. Since |Ik| = 2^{nk+1}, we never put all of Ik into W. Thus Ik is permanently active.
Let j be such that, for the final values of u_{j−1} and uj, we have Ik ⊆ [u_{j−1}, uj), and let s0 be a stage by which these final values are reached. Then all Ri for i < j must eventually stop requiring attention (since each time such a requirement redefines σi, it undefines uj), say by stage s1 > s0. If at any stage s > s1 we have Ks(σj) < (1 − 2^{−j})|σj|, then Rj will require attention. It will then put a number into W ∩ Ik. This number must later enter D, which will cause uj to change. By the choice of s0, this situation cannot happen, so in fact Ks(σj) ≥ (1 − 2^{−j})|σj| for all s > s1, and hence K(σj) ≥ (1 − 2^{−j})|σj|, which means that Rj is satisfied. Since there are infinitely many k such that D ∩ Ik = W ∩ Ik and only finitely many can have Ik ⊆ [u_{j−1}, uj) for a given j, infinitely many requirements are satisfied, and hence Dim(B) = 1.
13.15 Computable dimension and Schnorr dimension

13.15.1 Basics

It is natural to consider notions of dimension obtained by replacing c.e. gales by computable ones. Recall that a martingale d s-succeeds on A if lim sup_n d(A↾n)/2^{(1−s)n} = ∞, and s-succeeds strongly on A if lim_n d(A↾n)/2^{(1−s)n} = ∞. As discussed in Section 13.3, the following definitions were in a sense implicit in the work of Schnorr [349].

Definition 13.15.1 (Lutz [252, 254], Athreya, Hitchcock, Lutz, and Mayordomo [15]). The computable Hausdorff dimension of C ⊆ 2^ω is

dimcomp(C) = inf{s : there is a computable martingale that s-succeeds on all A ∈ C}.

The computable packing dimension of C ⊆ 2^ω is

Dimcomp(C) = inf{s : there is a computable martingale that s-succeeds strongly on all A ∈ C}.

For A ∈ 2^ω, we write dimcomp(A) for dimcomp({A}) and refer to it as the computable Hausdorff dimension of A, and similarly for packing dimension.

A different approach to passing from Σ01 objects to computable ones in the definition of algorithmic dimension is to consider test sets that are computable in the sense of Schnorr.

Definition 13.15.2. Let s ≥ 0 be rational. A Schnorr s-test is a uniformly c.e. sequence {Sn}_{n∈ω} of sets of strings such that Σ_{σ∈Sn} 2^{−s|σ|} ≤ 2^{−n} for all n and the reals Σ_{σ∈Sn} 2^{−s|σ|} are uniformly computable. A class C ⊆ 2^ω is Schnorr s-null if there exists a Schnorr s-test {Sn}_{n∈ω} such that C ⊆ ⋂_n [Sn].

The Schnorr 1-null sets are just the Schnorr null sets. As with Schnorr tests, in the above definition we could require that Σ_{σ∈Sn} 2^{−s|σ|} = 2^{−n} and have the same notion of Schnorr s-null sets. The sets Sn in a Schnorr s-test are actually uniformly computable, since to determine whether σ ∈ Sn it suffices to enumerate Sn until the accumulated sum Σ_{τ∈Sn} 2^{−s|τ|} exceeds 2^{−n} − 2^{−s|σ|} (assuming the measure of the nth level of the test is in fact 2^{−n}). If σ has not been enumerated so far, it cannot be in Sn.
But of course the converse does not hold: there are computable sets of strings generating open sets with noncomputable measures. Indeed, we have mentioned before that every Σ^0_1 class can be generated by a computable set of strings.
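The membership procedure just described can be sketched concretely. The following is a minimal illustration, not part of the original text: the function name `in_level`, the string encoding, and the toy test level are all invented for the example, and we assume s·|τ| is an integer for every enumerated τ so that all weights are exact dyadic rationals.

```python
from fractions import Fraction

def in_level(sigma, level, s_num, s_den, n):
    # Decide whether sigma is in S_n for a Schnorr s-test level S_n whose
    # total weight sum_{tau in S_n} 2^{-s|tau|} is exactly 2^{-n}.
    # `level` is any enumeration (in c.e. order) of S_n; s = s_num/s_den,
    # and we assume s*|tau| is an integer for every tau, so that all
    # weights are exact dyadic rationals.
    def w(tau):
        e = s_num * len(tau)
        assert e % s_den == 0
        return Fraction(1, 2 ** (e // s_den))

    threshold = Fraction(1, 2 ** n) - w(sigma)
    total = Fraction(0)
    for tau in level:
        if tau == sigma:
            return True
        total += w(tau)
        if total > threshold:
            # less than 2^{-s|sigma|} of the weight remains, so sigma
            # can no longer appear in the enumeration
            return False
    return False

# toy level for s = 1, n = 1: weight 1/4 + 1/4 = 2^{-1}
S1 = ['00', '01']
assert in_level('01', S1, 1, 1, 1)
assert not in_level('10', S1, 1, 1, 1)
```

Here the exactness assumption matters: if the level's weight were merely at most 2^{−n}, the loop might never cross the threshold, and the procedure would not terminate.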
We can adapt Definition 13.5.6 to get an s-analog of the concept of total Solovay test introduced in Definition 7.1.9: a Solovay s-test D is total if ∑_{σ∈D} 2^{−s|σ|} is computable. As we saw in Section 13.5, the correspondence between tests for s-Martin-Löf randomness and Solovay s-tests is close but not quite exact. In the Schnorr case, however, we do get an exact correspondence.

Theorem 13.15.3 (Downey, Merkle, and Reimann [125]). For any rational s ≥ 0, a class C ⊆ 2^ω is Schnorr s-null if and only if there is a total Solovay s-test that covers every element of C.

Proof. Let {S_n}_{n∈ω} be a Schnorr s-test. Let S = ⋃_n S_n. Clearly, S is a Solovay s-test that covers all of ⋂_n ⟦S_n⟧, so it is enough to show that ∑_{σ∈S} 2^{−s|σ|} is computable. It is easy to see that to compute this sum to within 2^{−n}, it is enough to compute ∑_{σ∈S_i} 2^{−s|σ|} to within 2^{−2n−3} for each i ≤ n + 1.

For the converse, let S be a total Solovay s-test. Given n, compute c = ∑_{σ∈S} 2^{−s|σ|} to within 2^{−n−2}. Effectively find a finite F ⊆ S such that ∑_{σ∈F} 2^{−s|σ|} ≥ c − 2^{−n−1}.
Let S_n = S \ F. Then S_n covers every set that S covers, and ∑_{σ∈S_n} 2^{−s|σ|} < 2^{−n}. Furthermore, this sum is uniformly computable over all n. Thus the S_n form a Schnorr s-test whose intersection contains all sets covered by S.

We have the following effective version of Theorem 13.1.2.

Proposition 13.15.4 (Downey, Merkle, and Reimann [125]). For any rationals t ≥ s ≥ 0, if C is Schnorr s-null then it is also Schnorr t-null.

Proof. It suffices to show that if s ≤ t, then every Schnorr s-test is also a Schnorr t-test. Obviously, the only issue is checking the computability of the relevant sum. Let {S_n}_{n∈ω} be a Schnorr s-test. Given any rational r ≥ 0 and any n and k, let

m_n(r) = ∑_{σ∈S_n} 2^{−r|σ|}   and   m_n^k(r) = ∑_{σ∈S_n ∩ 2^{≤k}} 2^{−r|σ|}.
It is easy to check that m_n^k(t) ≤ m_n(t) ≤ m_n^k(t) + 2^{(s−t)k} m_n(s). Now, m_n(s) is computable, and 2^{(s−t)k} goes to zero as k gets larger. Therefore, we can effectively approximate m_n(t) to any desired degree of precision.
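The two-sided bound on m_n(t) can be checked numerically. The sketch below is ours (names and parameters are illustrative, and floating point stands in for the exact computable reals of the proof); it verifies m_n^k(t) ≤ m_n(t) ≤ m_n^k(t) + 2^{(s−t)k} m_n(s) on a random finite set of strings.

```python
import random

def m(S, r):
    # m_n(r) = sum over sigma in S of 2^{-r|sigma|}
    return sum(2.0 ** (-r * len(s)) for s in S)

def m_k(S, r, k):
    # m_n^k(r): the same sum restricted to strings of length <= k
    return m([s for s in S if len(s) <= k], r)

random.seed(0)
# a random finite set of strings standing in for one level S_n
S = {''.join(random.choice('01') for _ in range(random.randint(1, 12)))
     for _ in range(200)}
s, t = 0.4, 0.7                      # rationals with s <= t
for k in range(1, 13):
    upper = m_k(S, t, k) + 2.0 ** ((s - t) * k) * m(S, s)
    assert m_k(S, t, k) <= m(S, t) <= upper + 1e-9
```

The upper bound works because each omitted string has length greater than k, so its t-weight is at most 2^{(s−t)k} times its s-weight.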
Thus the following definition makes sense.

Definition 13.15.5 (Downey, Merkle, and Reimann [125]). The Schnorr Hausdorff dimension of C ⊆ 2^ω is dim_S(C) = inf{s ≥ 0 : C is Schnorr s-null}. For A ∈ 2^ω, we write dim_S(A) for dim_S({A}) and refer to it as the Schnorr Hausdorff dimension of A.

Note that the Schnorr Hausdorff dimension of any class is at most 1, since for any ε > 0 we can choose a computable sequence k_0 < k_1 < ⋯ such that the sets 2^{k_n} form a Schnorr (1 + ε)-test covering all of 2^ω.

We saw in Section 8.11.2 that the concepts of computable randomness and Schnorr randomness do not coincide. That is, there are Schnorr random sets on which some computable martingale succeeds. However, the differences vanish when it comes to Hausdorff dimension.

Theorem 13.15.6 (Downey, Merkle, and Reimann [125]). For any B ∈ 2^ω, we have dim_S(B) = dim_comp(B).

Proof. (≤) Suppose that a computable martingale d s-succeeds on B. By Theorem 7.1.2, we may assume that d is rational-valued. We may also assume that s < 1, since the case s = 1 is trivial. It suffices to show that for any t ∈ (s, 1), we can find a Schnorr t-test that covers B. Fix such a t.

Let U_k = {σ : 2^{−(1−t)|σ|} d(σ) ≥ 2^k}. It is easy to see that the {U_k}_{k∈ω} are uniformly computable (since d is rational-valued and computable) and cover B, so we are left with showing that the reals ∑_{σ∈U_k} 2^{−t|σ|} are uniformly computable. To approximate ∑_{σ∈U_k} 2^{−t|σ|} to within 2^{−r}, we first effectively find an n such that 2^{(1−t)n} ≥ 2^r d(λ). Let V = U_k ∩ 2^{≤n}. If τ ∈ U_k \ V then d(τ) ≥ 2^{(1−t)n} 2^k ≥ 2^{r+k} d(λ). So by Kolmogorov's Inequality (Theorem 6.3.3), μ(⟦U_k⟧) − μ(⟦V⟧) ≤ 2^{−(r+k)}.

(≥) Suppose dim_S(B) < s < 1. (Again the case s = 1 is trivial.) We show that there is a computable martingale d that s-succeeds on B. Let {V_k}_{k∈ω} be a Schnorr s-test that covers B. Let

d_k(σ) = 2^{(1−s)|τ|}   if τ ⪯ σ for some τ ∈ V_k, and
d_k(σ) = ∑_{τ : στ∈V_k} 2^{−|τ|+(1−s)(|σ|+|τ|)}   otherwise.
We verify that d_k is a martingale. Given σ, if there is a τ ∈ V_k such that τ ⪯ σ, then clearly d_k(σ0) + d_k(σ1) = 2 d_k(σ). Otherwise,

d_k(σ0) + d_k(σ1) = ∑_{τ : σ0τ∈V_k} 2^{−|τ|+(1−s)(|σ|+|τ|+1)} + ∑_{τ : σ1τ∈V_k} 2^{−|τ|+(1−s)(|σ|+|τ|+1)} = ∑_{ρ : σρ∈V_k} 2^{(−|ρ|+1)+(1−s)(|σ|+|ρ|)} = 2 d_k(σ).

Furthermore, d_k(λ) = ∑_{τ∈V_k} 2^{−|τ|+(1−s)|τ|} = ∑_{τ∈V_k} 2^{−s|τ|} ≤ 2^{−k}, so d = ∑_k d_k is also a martingale. It is straightforward to use the fact that {V_k}_{k∈ω} is a Schnorr s-test to show that the d_k are uniformly computable, and hence d is computable. Finally, note that, for σ ∈ V_k, we have d(σ) ≥ d_k(σ) = 2^{(1−s)|σ|}, so if B ∈ ⋂_k ⟦V_k⟧, then d(B↾n) ≥ 2^{(1−s)n} infinitely often, which means that d s-succeeds on B.

Thus, in contrast to randomness, the approaches to dimension via Schnorr tests and via computable martingales yield the same concept. Because of the potential confusion between the terms "effective dimension" (which we have used to mean our primary notion of Σ^0_1 dimension) and "computable dimension", we will use the term "Schnorr Hausdorff dimension" and the notation dim_S. For uniformity, we will also refer to computable packing dimension as Schnorr packing dimension and use the notation Dim_S in place of Dim_comp. It follows from the definitions that dim_S(A) ≤ Dim_S(A) for any A. We call sets for which Schnorr Hausdorff and Schnorr packing dimension coincide Schnorr regular, following [391] and [15]. It is easy to construct a non-Schnorr-regular sequence; in Section 13.15.4, we will see that such sequences occur even among the c.e. sets.
13.15.2 Examples of Schnorr dimension

The easiest way to construct examples of sequences of non-integral Schnorr Hausdorff dimension is by inserting zeroes into a sequence of Schnorr Hausdorff dimension 1. Note that it easily follows from the definitions that every Schnorr random set has Schnorr Hausdorff dimension 1. On the other hand, it is not hard to show that not every set of Schnorr Hausdorff dimension 1 is Schnorr random.

A second class of examples is based on the fact that Schnorr random sets satisfy the law of large numbers, not only with respect to the uniform measure, but also with respect to other computable Bernoulli distributions. For a sequence p = (p_0, p_1, ...) of elements of (0, 1), the Bernoulli measure μ_p is defined by μ_p(σ) = ∏_{σ(i)=1} p_i ∏_{σ(i)=0} (1 − p_i). It is straightforward to modify the definition of Schnorr test to obtain Schnorr randomness notions
for arbitrary computable measures, as we did for Martin-Löf randomness in Section 6.12.

Theorem 13.15.7 (Downey, Merkle, and Reimann [125]).
(i) Let S be Schnorr random and let Z be a computable, infinite, coinfinite set such that δ_Z = lim_n |{0, ..., n−1} ∩ Z| / n exists. Let S_Z = S ⊕_Z ∅, where ⊕_Z is as defined in Section 6.9. Then dim_S(S_Z) = δ_Z.
(ii) Let p = (p_0, p_1, ...) be a sequence of uniformly computable reals in (0, 1) such that p = lim_n p_n exists. Then for any Schnorr μ_p-random set B, we have dim_S(B) = −(p log p + (1 − p) log(1 − p)).

Part (i) of the theorem is straightforward (using for instance the martingale characterization of Schnorr Hausdorff dimension); part (ii) is an easy adaptation of the corresponding theorem for effective dimension (as for example in Section 13.5). Lutz [254] proved part (ii) for Martin-Löf μ_p-randomness and effective Hausdorff dimension.

It is not hard to see that for the examples given in Theorem 13.15.7, the Schnorr Hausdorff dimension and the Schnorr packing dimension coincide, so these examples describe Schnorr regular sets. In Section 13.15.4, we will see that there are highly irregular c.e. sets: while all c.e. sets have Schnorr Hausdorff dimension 0, there are c.e. sets of Schnorr packing dimension 1.

Downey and Kach [unpublished] have noted that the method in the proof of Theorem 8.11.2 can be used to show that if a set is nonhigh, then its Schnorr Hausdorff dimension equals its effective Hausdorff dimension. We will see in Section 13.15.4 that this correspondence fails for packing dimension.
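The dimension value in Theorem 13.15.7(ii) is the binary entropy of p, which is easy to tabulate. A small illustrative computation (the function name is our own invention):

```python
from math import log2

def schnorr_dim_bernoulli(p):
    # Theorem 13.15.7(ii): the Schnorr Hausdorff dimension of a
    # Schnorr mu_p-random set is the binary entropy of p = lim_n p_n
    return -(p * log2(p) + (1 - p) * log2(1 - p))

assert schnorr_dim_bernoulli(0.5) == 1.0    # the uniform measure: dimension 1
assert 0.0 < schnorr_dim_bernoulli(0.1) < schnorr_dim_bernoulli(0.3) < 1.0
```

As p moves away from 1/2 the entropy, and with it the dimension, drops strictly below 1, giving the promised examples of non-integral Schnorr Hausdorff dimension.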
13.15.3 A machine characterization of Schnorr dimension

One of the fundamental aspects of the theory of 1-randomness is the characterization of that notion in terms of initial segment Kolmogorov complexity. There is an equally important correspondence between effective Hausdorff and packing dimensions and Kolmogorov complexity, as we saw in Theorem 13.3.4 and Corollary 13.11.12. A comparably elegant initial segment complexity characterization is not possible for Schnorr randomness, because such a characterization should be relativizable, and would therefore imply that lowness for K implies lowness for Schnorr randomness, which we saw was not the case in Section 12.1. As we saw in Section 7.1.3, however, it is possible to obtain a machine characterization of Schnorr randomness by restricting ourselves to computable measure machines, that is, prefix-free machines whose domains have computable measures. We now see that we can use such machines to characterize Schnorr dimension as well.

Recall from Theorem 7.1.15 that Downey and Griffiths [109] showed that A is Schnorr random iff K_M(A↾n) ≥ n − O(1) for every computable
measure machine M. Building on this characterization, we can describe Schnorr dimension as asymptotic initial segment complexity with respect to computable measure machines.

Theorem 13.15.8 (Downey, Merkle, and Reimann [125]). The Schnorr Hausdorff dimension of A is the infimum over all computable measure machines M of lim inf_n K_M(A↾n)/n.

Proof. (≤) Let s > dim_S(A) be rational. We build a computable measure machine M such that lim inf_n K_M(A↾n)/n ≤ s.

Let {U_i}_{i∈ω} be a Schnorr s-test covering A. The KC Theorem is applicable to the set of requests (⌈s|σ|⌉, σ) for σ ∈ ⋃_{i≥1} U_i, so there is a prefix-free machine M such that K_M(σ) ≤ ⌈s|σ|⌉ for all such σ. Furthermore, M is a computable measure machine because ∑_{i≥1} ∑_{σ∈U_i} 2^{−s|σ|} is computable. We know that for each i there is an n_i such that A↾n_i ∈ U_i, and clearly these n_i go to infinity. So there are infinitely many n such that K_M(A↾n) ≤ ⌈sn⌉ ≤ sn + 1, which implies that lim inf_n K_M(A↾n)/n ≤ s.

(≥) Suppose there is a computable measure machine M such that lim inf_n K_M(A↾n)/n < s, where s is rational. Let S_M = {σ : K_M(σ) < s|σ|}. We claim that S_M is a total Solovay s-test covering A. There are infinitely many initial segments of A in S_M, so it remains to show that ∑_{σ∈S_M} 2^{−s|σ|} is finite and computable. Finiteness follows from the fact that

∑_{σ∈S_M} 2^{−s|σ|} < ∑_{σ∈S_M} 2^{−K_M(σ)} ≤ 1.

To show computability, given ε, compute a stage t such that μ(dom(M) \ dom(M_t)) < ε, and let S_{M_t} = {σ : K_{M_t}(σ) < s|σ|}. Then

∑_{σ∈S_{M_t}} 2^{−s|σ|} ≤ ∑_{σ∈S_M} 2^{−s|σ|} ≤ ∑_{σ∈S_{M_t}} 2^{−s|σ|} + ε,

since for any σ ∈ S_M \ S_{M_t}, we have 2^{−s|σ|} < 2^{−K_M(σ)}, and any minimal length M-description of σ must be in dom(M) \ dom(M_t), whence the sum of 2^{−s|σ|} over all such σ is bounded by μ(dom(M) \ dom(M_t)) < ε.

An analogous argument, using the correspondence between martingales and tests shown in Theorem 13.15.6, allows us to obtain a machine characterization of Schnorr packing dimension.

Theorem 13.15.9 (Downey, Merkle, and Reimann [125]). The Schnorr packing dimension of A is the infimum over all computable measure machines M of lim sup_n K_M(A↾n)/n.
13.15.4 Schnorr dimension and computable enumerability

For left-c.e. reals, having high effective dimension has similar computability-theoretic consequences to being 1-random. For instance, as we have seen,
every left-c.e. real of positive effective dimension is Turing complete. For Schnorr dimension, a straightforward generalization of Corollary 8.11.4 shows that if A is left-c.e. and dim_S(A) > 0, then A is high.

Computably enumerable sets are usually of marginal interest in the context of algorithmic randomness, one reason being that they cannot be random relative to most notions of randomness. For instance, we have the following.

Proposition 13.15.10 (Folklore). No c.e. set is Schnorr random.

Proof. Let A be c.e. We may assume that A is infinite, and hence contains an infinite computable subset {b_0 < b_1 < ⋯}. Let G_n = {σ ∈ 2^{b_n} : ∀i < n (σ(b_i) = 1)} and let U_n = ⟦G_n⟧. Then μ(U_n) = 2^{−n}, so the U_n form a Schnorr test, and A ∈ ⋂_n U_n.
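The measure computation in this proof is a finite check for each n. The sketch below is ours (with an invented truncation of the computable subset {b_0 < b_1 < ⋯}); it confirms that μ(⟦G_n⟧) = 2^{−n} by brute force.

```python
from itertools import product

def G(b, n):
    # G_n = { sigma of length b_n : sigma(b_i) = 1 for all i < n }
    return [sig for sig in product('01', repeat=b[n])
            if all(sig[b[i]] == '1' for i in range(n))]

b = [1, 3, 4, 7]    # a truncation of the computable subset b_0 < b_1 < ...
for n in range(1, 4):
    # n positions are forced to 1, so |G_n| = 2^{b_n - n} and
    # mu([G_n]) = |G_n| * 2^{-b_n} = 2^{-n}
    assert len(G(b, n)) * 2.0 ** (-b[n]) == 2.0 ** (-n)
```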
This proof does not immediately extend to showing that a c.e. set A must have Schnorr Hausdorff dimension 0. Defining coverings from the enumeration of A directly might not work either, since the dimension factor leads longer strings to be weighted more, so, depending on the enumeration, we might not get a Schnorr s-covering. However, we can exploit the somewhat predictable nature of A to define for each s > 0 a computable martingale that s-succeeds on A.

Theorem 13.15.11 (Downey, Merkle, and Reimann [125]). Every c.e. set has Schnorr Hausdorff dimension 0.

Proof. Let s ∈ Q^{>0}. Let γ > 2^{1−s} − 1 be rational. Partition N into consecutive disjoint intervals I_0, I_1, ... so that there is an ε > 0 for which, letting i_n = |I_n| and j_n = i_0 + i_1 + ⋯ + i_n, we have

lim_n (1 + γ)^{i_n} ((1 − γ)/(1 + γ))^{ε i_n} / 2^{(1−s)j_n} = ∞.

To have this hold, it is enough to have I_n be much larger than I_{n−1}, for example by letting |I_n| = 2^{|I_0|+⋯+|I_{n−1}|}.

Let δ = lim sup_n |A ∩ I_n| / i_n. By replacing A with its complement if needed, we may assume that δ > 0. Let q < r ∈ Q be such that δ ∈ (q, r) and r − q < ε. There is a computable sequence n_0 < n_1 < ⋯ such that q i_{n_k} < |A ∩ I_{n_k}| < r i_{n_k} for all k.

Let d be defined as follows. Let d(λ) = 1. If |σ| ∉ I_{n_k} for all k, then let d(σ0) = d(σ1) = d(σ). If |σ| ∈ I_{n_k}, then wait for a stage s such that |A_s ∩ I_{n_k}| > q i_{n_k}. If |σ| ∈ A_s, then let d(σ0) = 0 and d(σ1) = 2d(σ). Otherwise, let d(σ0) = (1 + γ)d(σ) and d(σ1) = (1 − γ)d(σ).
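For concreteness, one can check numerically that intervals of size |I_n| = 2^{|I_0|+⋯+|I_{n−1}|} make the required limit infinite. This is a sketch of ours, not part of the proof: the parameter values (s, γ, ε, and the size of I_0) are illustrative choices satisfying γ > 2^{1−s} − 1, and we track only base-2 logarithms to avoid astronomically large numbers.

```python
from math import log2

s, gamma = 0.5, 0.5        # any s > 0 works; we need gamma > 2^{1-s} - 1
assert gamma > 2 ** (1 - s) - 1
eps = 0.01                 # small enough for these parameter values

# |I_0| = 8 and |I_{n+1}| = 2^{|I_0| + ... + |I_n|}; track i_n = |I_n|
# and the partial sums j_n = i_0 + ... + i_n
i_list, j_list = [8.0], [8.0]
for _ in range(2):
    nxt = 2.0 ** j_list[-1]
    i_list.append(nxt)
    j_list.append(j_list[-1] + nxt)

# log2 of (1+gamma)^{i_n} * ((1-gamma)/(1+gamma))^{eps i_n} / 2^{(1-s) j_n}
exps = [i * log2(1 + gamma)
        + eps * i * (log2(1 - gamma) - log2(1 + gamma))
        - (1 - s) * j
        for i, j in zip(i_list, j_list)]
assert exps[0] < exps[1] < exps[2]    # the quotient tends to infinity
```

The point of the calculation is that j_n is asymptotically dominated by i_n, so the exponent is roughly i_n (log₂(1 + γ) − ε(log₂(1 + γ) − log₂(1 − γ)) − (1 − s)), which is positive for small enough ε precisely because γ > 2^{1−s} − 1.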
When betting along A, the number of times we are in the last case of the definition of d and yet |σ| ∈ A is less than (r − q)i_{n_k} < ε i_{n_k}, so

d(A↾j_{n_k}) > d(A↾j_{n_{k−1}}) (1 + γ)^{i_{n_k} − ε i_{n_k}} (1 − γ)^{ε i_{n_k}} = d(A↾j_{n_{k−1}}) (1 + γ)^{i_{n_k}} ((1 − γ)/(1 + γ))^{ε i_{n_k}}.

By our choice of ε, we see that d s-succeeds on A.

We have seen that effective Hausdorff dimension is stable; that is, the effective Hausdorff dimension of a class is the supremum of the effective Hausdorff dimensions of its elements. It is not hard to see that for every Schnorr 1-test there is a c.e. (and even a computable) set that is not covered by it. Thus the class of all c.e. sets has Schnorr Hausdorff dimension 1, and hence Schnorr Hausdorff dimension is not stable, even for countable classes.

Perhaps surprisingly, there do exist c.e. sets with Schnorr packing dimension 1. This result contrasts with the case of effective packing dimension, since as we will see in Theorem 16.1.1, if A is c.e. then C(A↾n) ≤ 2 log n + O(1). It can be generalized to show that every hyperimmune degree contains a set of Schnorr packing dimension 1. Downey, Merkle, and Reimann [125] remarked that a straightforward forcing construction shows the existence of degrees that do not contain any set of high Schnorr packing dimension.

Theorem 13.15.12 (Downey, Merkle, and Reimann [125]). If B has hyperimmune degree then there is an A ≡_T B with Schnorr packing dimension 1. If B is c.e. then A can be chosen to be c.e.

Proof. We begin with the non-c.e. case. It is enough to build C ≤_T B with Schnorr packing dimension 1, since we can then let A = B ⊕_X C for a sufficiently sparse computable set X (where ⊕_X is as defined in Section 6.9), say X = {2^n : n ∈ N}. Let g ≤_T B be such that for any computable function f, we have f(n) < g(n) for infinitely many n. Effectively partition N into disjoint consecutive intervals I_0, I_1, ... such that |I_{n+1}| is much greater than |I_0| + ⋯ + |I_n|. For instance, we can choose the I_n so that |I_{n+1}| = 2^{|I_0|+⋯+|I_n|}. Let i_n = |I_n|.

Let M_0, M_1, ... be an effective enumeration of all prefix-free machines. If ∑_{M_e[g(n)](σ)↓} 2^{−|σ|} ≤ 1 − 2^{−i_{⟨e,n⟩}} then let C ∩ I_{⟨e,n⟩} = ∅. Otherwise, let σ ∈ 2^{i_{⟨e,n⟩}} be such that K_{M_e[g(n)]}(σ) ≥ i_{⟨e,n⟩}.
(If such a σ does not exist, then the domain of M_e consists exactly of the finitely many strings of length i_{⟨e,n⟩}, so we do not have to worry about M_e; in this case, let C ∩ I_{⟨e,n⟩} = ∅.) Note that K_{M_e}(σ) ≥ i_{⟨e,n⟩}. Let C↾I_{⟨e,n⟩} = σ.

We have C ≤_T g ≤_T B. Assume for a contradiction that Dim_S(C) < 1. Then there exist a computable measure machine M and an ε > 0 such that K_M(C↾n) ≤ (1 − ε)n for almost all n. By Proposition 7.1.18, we may assume that μ(dom M) = 1. We define a machine M̃ with the same domain as M. If M(x)↓ then check whether |M(x)| = i_0 + i_1 + ⋯ + i_k for some k. If so, then let M̃(x) be the last i_k many bits of M(x). Otherwise, let M̃(x) = 0. Let e be such that M_e = M̃.

Let f(n) be the least stage s such that ∑_{M_e[s](σ)↓} 2^{−|σ|} > 1 − 2^{−i_{⟨e,n⟩}}. Since μ(dom M_e) = 1, the function f is total and computable, so there are infinitely many n such that f(n) < g(n). For each such n, we have K_{M_e}(C↾I_{⟨e,n⟩}) ≥ i_{⟨e,n⟩}. On the other hand, for almost all n,

K_{M_e}(C↾I_{⟨e,n⟩}) ≤ K_M(C↾(I_{⟨e,0⟩} ∪ ⋯ ∪ I_{⟨e,n⟩})) ≤ (1 − ε) ∑_{j≤n} i_{⟨e,j⟩} ≤ (1 − ε/2) i_{⟨e,n⟩},

so we have a contradiction.

If B is c.e., then it is easy to see that we can let the function g above be defined by letting g(n) be the least stage s such that B_s(n) = B(n). If we build C as above using this g, then C is c.e., since if n ∉ B then C ∩ I_{⟨e,n⟩} = ∅, while otherwise we can wait until the stage g(n) at which n enters B, and compute C↾I_{⟨e,n⟩} from g(n), enumerating all its elements into C at stage g(n).

As mentioned in Section 13.15.2, this result, combined with Theorem 13.14.5 (or Theorem 16.1.1 below, which implies that all c.e. sets have effective packing dimension 0), shows that, as opposed to the case of Hausdorff dimension, there are nonhigh sets with Schnorr packing dimension 1 but effective packing dimension 0.
13.16 Kolmogorov complexity and the dimensions of individual strings In this section, we look at work of Lutz [254] on assigning dimensions to individual strings, and a new characterization of prefix-free complexity using such dimensions. We have seen that the effective dimension of a set can be characterized in terms of initial segment complexity and in terms of c.e. (super)gales. To discretize the latter characterization, Lutz replaced supergales by termgales, which can deal with the fact that strings terminate; this change is made by first defining s-termgales, and then defining termgales, which are uniform families of s-termgales. Next, he replaced “tending to infinity” by “exceeding a finite threshold”. Finally, he replaced an optimal s-supergale by an optimal termgale.
The basic idea is to introduce a new termination symbol □ to mark the end of a string. A terminated binary string is σ□ with σ ∈ 2^{<ω}. Let T be the collection of terminated binary strings.

Definition 13.16.1 (Lutz [254]). For s ≥ 0, an s-termgale is a function d : 2^{<ω} ∪ T → R^{≥0} such that d(λ) ≤ 1 and for all σ ∈ 2^{<ω},

d(σ) ≥ 2^{−s} (d(σ0) + d(σ1) + d(σ□)).

For s = 1, the s-termgale condition is akin to the usual supermartingale condition, but as noted by Lutz, if each of 0, 1, □ is equally likely, independently of previous bits, then the conditional expected capital after betting on the bit to follow σ is (d(σ0) + d(σ1) + d(σ□))/3 = (2/3) d(σ). However, the definition of s-termgales is based on the assumption that the termination symbol should actually be regarded as having vanishingly small probability. In other words, the 1-termgale payoff condition is the same as the corresponding supermartingale condition, except that the 1-termgale must divert some of its capital to the possibility of □ occurring. This diversion is without compensation, but it can occur at most once, and we can make its impact small.

Lutz [254] provided the following example. Let d(λ) = 1. Given d(σ), let d(σ0) = (3/2) d(σ) and d(σ1) = d(σ□) = (1/4) d(σ). Then d is a 1-termgale, and if σ is a string with n_0 many 0's and n_1 many 1's, then

d(σ□) = (3/2)^{n_0} (1/4)^{n_1 + 1} = 2^{n_0(log 3 − 1) − 2(n_1 + 1)}.

Thus if n_0 ≫ (2/(1 + log 3))(n_0 + n_1 + 1), then d(σ□) is significantly greater than d(λ), even though d loses three quarters of its capital when □ occurs.

The following facts are straightforward.
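Lutz's example can be verified mechanically. The sketch below is our own encoding (strings over '0'/'1', with a flag standing in for the terminated string σ□); it checks the 1-termgale condition, which holds with equality here, and the closed form for d(σ□), using exact rational arithmetic.

```python
from fractions import Fraction

def d(sigma, terminated=False):
    # Lutz's example: d(sigma 0) = (3/2) d(sigma),
    # d(sigma 1) = d(sigma box) = (1/4) d(sigma), d(lambda) = 1
    val = Fraction(1)
    for bit in sigma:
        val *= Fraction(3, 2) if bit == '0' else Fraction(1, 4)
    return val * Fraction(1, 4) if terminated else val

# 1-termgale condition, here with equality:
# d(sigma) = 2^{-1} (d(sigma 0) + d(sigma 1) + d(sigma box))
for sigma in ['', '0', '01', '0010']:
    assert d(sigma) == Fraction(1, 2) * (
        d(sigma + '0') + d(sigma + '1') + d(sigma, True))

# closed form: d(sigma box) = (3/2)^{n0} (1/4)^{n1 + 1}
sigma = '000100'
n0, n1 = sigma.count('0'), sigma.count('1')
assert d(sigma, True) == Fraction(3, 2) ** n0 * Fraction(1, 4) ** (n1 + 1)
```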
Lemma 13.16.2. Let d, d′ : 2^{<ω} ∪ T → R^{≥0} be such that 2^{−s|σ|} d(σ) = 2^{−s′|σ|} d′(σ) for all σ ∈ 2^{<ω} ∪ T. Then d is an s-termgale iff d′ is an s′-termgale. In particular, if d is a 0-termgale and d′(σ) = 2^{s|σ|} d(σ) for all σ ∈ 2^{<ω} ∪ T, then d′ is an s-termgale, and each s-termgale can be obtained in this way from a 0-termgale.

We need the following technical lemma.

Lemma 13.16.3 (Lutz [254]). Let d be an s-termgale and τ ∈ 2^{<ω}. Then ∑_{σ∈2^{<ω}} 2^{−s|σ|} d(τσ□) ≤ 2^s d(τ).

Proof. First suppose that s = 0. It is straightforward to show by induction that for all m,

∑_{σ∈2^m} d(τσ) + ∑_{σ∈2^{<m}} d(τσ□) ≤ d(τ).

Thus ∑_{σ∈2^{<m}} d(τσ□) ≤ d(τ) for all m, and hence ∑_{σ∈2^{<ω}} d(τσ□) ≤ d(τ).
For the general case, let d′(σ) = 2^{−s|σ|} d(σ) for all σ ∈ 2^{<ω} ∪ T. Then d′ is a 0-termgale, and hence

∑_{σ∈2^{<ω}} 2^{−s|σ|} d(τσ□) = 2^{s(|τ|+1)} ∑_{σ∈2^{<ω}} d′(τσ□) ≤ 2^{s(|τ|+1)} d′(τ) = 2^s d(τ).

We are now ready to define (optimal) termgales.

Definition 13.16.4 (Lutz [254]).
(i) A termgale is a family d = {d_s : s ≥ 0} of s-termgales such that 2^{−s|σ|} d_s(σ) = 2^{−s′|σ|} d_{s′}(σ) for all s, s′, and σ ∈ 2^{<ω} ∪ T.
(ii) A termgale is Σ^0_1, or constructive, if d_0 is a Σ^0_1 function.
(iii) A Σ^0_1 termgale d̃ is optimal if for each Σ^0_1 termgale d there is a c > 0 such that for all s and σ ∈ 2^{<ω}, we have d̃_s(σ□) ≥ c d_s(σ□).

In the same way that we connected continuous semimeasures with martingales, we can define the termgale induced by a c.e. discrete semimeasure m via d[m]_s(τ) = 2^{s|τ|} ∑_{τ⪯σ} m(σ) for τ ∈ 2^{<ω} and d[m]_s(σ□) = 2^{s|σ□|} m(σ). (See Section 3.9 for more on discrete semimeasures.)

Theorem 13.16.5 (Lutz [254]). If m is a maximal c.e. discrete semimeasure then d[m] is an optimal Σ^0_1 termgale. Thus there is an optimal Σ^0_1 termgale.

Proof. It is easy to check that d[m] is a termgale. Let d = {d_s : s ≥ 0} be a Σ^0_1 termgale. Let m′(σ) = d_0(σ□). By Lemma 13.16.3, m′ is a c.e. discrete semimeasure. Since m is multiplicatively optimal, for some c we have m(σ) ≥ c m′(σ) for all σ ∈ 2^{<ω}. Hence

d[m]_s(σ□) = 2^{s|σ□|} m(σ) ≥ 2^{s|σ□|} c m′(σ) = c d_s(σ□).

Definition 13.16.6 (Lutz [254]). If d is a termgale, n > 0, and σ ∈ 2^{<ω}, then the dimension of σ relative to d at significance level n is dim^n_d(σ) = inf{s ≥ 0 : d_s(σ□) > n}.

We can characterize this dimension in terms of an optimal termgale.

Theorem 13.16.7 (Lutz [254]). Let d̃ be an optimal Σ^0_1 termgale, and let d be a Σ^0_1 termgale. Then for each n > 0 there is a c > 0 such that, for all σ ∈ 2^{<ω},

dim^n_{d̃}(σ) ≤ dim^1_d(σ) + c/(1 + |σ|).
Hence, if d̃_1 and d̃_2 are both optimal Σ^0_1 termgales and n_1, n_2 > 0, then there is a c > 0 such that for all σ ∈ 2^{<ω},

|dim^{n_1}_{d̃_1}(σ) − dim^{n_2}_{d̃_2}(σ)| ≤ c/(1 + |σ|).

Proof. By the optimality of d̃, there is a constant b ∈ (0, 1) such that for all s, we have d̃_s(σ□) ≥ b d_s(σ□). Given n, let c = log n − log b, and notice that c > 0. Let s > dim^1_d(σ) + c/(1 + |σ|) and let s_1 = s − c/(1 + |σ|). Then s_1 > dim^1_d(σ), so

d̃_s(σ□) ≥ b d_s(σ□) = b 2^{(s−s_1)|σ□|} d_{s_1}(σ□) = b 2^c d_{s_1}(σ□) > b 2^c = n.

Theorem 13.16.7 is the key to Lutz's definition of the dimension of a string, since it says that if we base our definition on a particular optimal Σ^0_1 termgale d̃ and significance level n, then this choice will have little effect on the dimension of a given string, particularly for longer strings. Therefore we fix an optimal Σ^0_1 termgale d and make the following definition.

Definition 13.16.8 (Lutz [254]). dim(σ) = dim^1_d(σ) for every σ ∈ 2^{<ω}.

In this context, we lose some of the finer points of Hausdorff dimension. For example, it is no longer true that the effective dimension is bounded above by 1. But we do have the following analog.

Theorem 13.16.9 (Lutz [254]). There is a b > 0 such that dim(σ) ≤ b for all σ ∈ 2^{<ω}.

Proof. For each σ and s, let d_s(σ) = 2^{(s−2)|σ|} and d_s(σ□) = 2^{(s−2)|σ□|+1}. Then it is easy to check that d = {d_s : s ≥ 0} is a termgale, and d_2(σ□) = 2 for all σ. Thus, by Theorem 13.16.7, there is a c such that dim(σ) ≤ 2 + c/(1 + |σ|) ≤ 2 + c, so we can take b = 2 + c.

Finally, we have an analog of the Kolmogorov complexity characterization of effective Hausdorff dimension.

Theorem 13.16.10 (Lutz [254]). K(σ) = |σ| dim(σ) ± O(1).

Proof. Let m be a maximal c.e. discrete semimeasure, as given in Theorem 3.9.2, and let d[m] be the termgale induced by m. For all σ ∈ 2^{<ω} and s > 0, we have

d[m]_s(σ□) > 1 iff 2^{s|σ□|} m(σ) > 1 iff s > −log m(σ) / (1 + |σ|),

so (1 + |σ|) dim^1_{d[m]}(σ) = −log m(σ). Thus, by Theorem 13.16.7, (1 + |σ|) dim(σ) = −log m(σ) ± O(1), so by Theorem 13.16.9, |σ| dim(σ) = −log m(σ) ± O(1). But by the Coding Theorem 3.9.4, K(σ) = −log m(σ) ± O(1), so |σ| dim(σ) = K(σ) ± O(1).
Theorem 13.16.10 implies that for any A ∈ 2^ω, we have lim_n |K(A↾n)/n − dim(A↾n)| = 0. Thus the Kolmogorov complexity characterizations of effective Hausdorff and packing dimensions imply the following.

Corollary 13.16.11 (Lutz [254]). For A ∈ 2^ω, we have dim(A) = lim inf_n dim(A↾n) and Dim(A) = lim sup_n dim(A↾n).

In [254], Lutz also proved several other facts about the dimensions of finite strings. Space considerations preclude us from including this material.
Part IV
Further Topics
14 Strong Jump Traceability
As we have seen, the K-trivial sets form a class of "extremely low" sets, properly contained within the class of superlow sets and closed under join. The study of K-triviality has led to an increased interest in notions of computability-theoretic weakness. In particular, various forms of traceability have played important roles. In this chapter we study the notion of strong jump traceability introduced by Figueira, Nies, and Stephan [147] and later studied by Cholak, Downey, and Greenberg [65], among others, and explore its connections with K-triviality and other randomness-theoretic notions. In the computably enumerable case, we will show that the strongly jump traceable c.e. sets form a proper subideal of the K-trivial c.e. sets, and can be characterized as those c.e. sets that are computable from every ω-c.e. (or every superlow) 1-random set. We will also discuss the general case in the last section. In particular, we will show that every s.j.t. set is K-trivial, a result of Downey and Greenberg [105].
14.1 Basics

Recall that we denote Φ^A_n(n) by J^A(n), and that a set A is h-jump traceable for a computable order h if there are uniformly c.e. sets T_0, T_1, ... such that |T_n| ≤ h(n) and J^A(n)↓ ⇒ J^A(n) ∈ T_n for all n. (That is, there is a c.e. trace for J^A with bound h.) In this chapter we assume that all orders h have h(0) > 0.
R.G. Downey and D. Hirschfeldt, Algorithmic Randomness and Complexity, Theory and Applications of Computability, DOI 10.1007/978-0-387-68441-3_14, © Springer Science+Business Media, LLC 2010
Definition 14.1.1 (Figueira, Nies, and Stephan [147]). A set A is strongly jump traceable (s.j.t.) if it is h-jump traceable for every computable order h.

It follows easily from the fact that there is a computable function f such that J^A(f(e, n)) = Φ^A_e(n) for all e and n that if A is s.j.t. then for each computable order h and each partial A-computable function ψ, there is a c.e. trace for ψ with bound h. Indeed, we have the following useful fact.

Lemma 14.1.2 (essentially Cholak, Downey, and Greenberg [65], see also Nies [306]). Let h_0, h_1, ... be uniformly computable orders such that h_n(0) ≥ n for all n. From indices for the sequence h_0, h_1, ... and for a sequence of uniformly computable functionals Ψ_0, Ψ_1, ..., we can effectively determine a computable order g such that for all A, if J^A has a c.e. trace {T_i}_{i∈ω} with bound g, then each Ψ^A_n has a c.e. trace with bound h_n, and these can be obtained effectively from an index for {T_i}_{i∈ω}.

It is not obvious that noncomputable s.j.t. sets exist, but the following result shows that they in fact do. We give a sketch of its proof from [65].

Theorem 14.1.3 (Figueira, Nies, and Stephan [147]). There is a promptly simple s.j.t. c.e. set.

Proof sketch. Suppose that we wish to construct a noncomputable c.e. set A that is h-jump traceable for a single given computable order h. Let P_e be the eth noncomputability requirement, A ≠ W_e, and let N_e be the requirement stating that if J^A(e)↓ then J^A(e) ∈ T_e, for a c.e. trace T_0, T_1, ... with bound h that we build. We order the requirements in the following way:

⟨N_e : h(e) = 1⟩, P_0, ⟨N_e : h(e) = 2⟩, P_1, ⟨N_e : h(e) = 3⟩, P_2, ...

That is, ahead of P_i come exactly those N_e with h(e) ≤ i + 1. The construction is straightforward: each P_e is appointed a follower x. If x enters W_e at a stage s at which P_e is not yet satisfied, then we put x into A, satisfying P_e. If a new computation J^A(e) appears at a stage s, then N_e puts its value into T_e and initializes all weaker P-requirements, which are then appointed new, large followers (in particular larger than the use of the computation J^A(e)[s]). It is easy to see that the arrangement of strategies ensures that |T_e| ≤ h(e).

The key to the success of this construction is that each requirement P_e acts at most once, and does not need to act again even if it is later initialized. It may be instructive to think of the priority ordering as dynamic: when P_e acts, it is removed from the list of requirements and is never troubled (nor influences other requirements) again.

To make A be h-jump traceable for all computable orders h, a further dynamic element needs to be introduced to the priority ordering. The property of a partial computable function being an order is Π^0_2, and we approximate
it in this fashion. Say that a stage is e-expansionary if at this stage we have further evidence that the eth partial computable function Φ_e is an order. If s is e-expansionary then the positive requirements are pushed down the ordering so that for every n such that Φ_e(n)[s]↓, there are at most Φ_e(n) many positive requirements stronger than N_{e,n}, the requirement that traces J^A(n) with at most Φ_e(n) many values. To protect the positive requirements from being moved down infinitely often, we stipulate that a positive requirement P_{e′} cannot be moved by Φ_e if e′ < e; these positive requirements are ignored when we count the number of positive requirements that appear before some N_{e,n}. If P_{e′} acts then we initialize every weaker N_{e,n} and start a new trace.

This proof admits the usual sorts of variations. For instance, Cholak, Downey, and Greenberg [65] showed that for every low c.e. set B, there is an s.j.t. c.e. set A ≰_T B. Since A ≤_T B implies that J^A is uniformly coded into J^B, we have the following.

Theorem 14.1.4 (Figueira, Nies, and Stephan [147]). The class of strongly jump traceable sets is closed downward under Turing reducibility.

Figueira, Nies, and Stephan [147] also studied the relationship between strong jump traceability and other kinds of approximations to the jump.
Downey and Greenberg conjectured that jump well-approximability, strong jump well-approximability, and strong jump traceability are equivalent, but this is not known to be the case in general. In the c.e. case, it is not hard to show that these notions are indeed the same. Theorem 14.1.6 (essentially Figueira, Nies, and Stephan [147]). The following are equivalent for a c.e. set A. (i) A is strongly jump traceable. (ii) A is jump well-approximable. (iii) A is strongly jump well-approximable.
Relatively weak computability-theoretic lowness properties do not imply randomness-theoretic lowness. For instance, there are 1-random sets that are low, or even superlow. However, the same is not true of very strong computability-theoretic lowness notions. For instance, we have the following plain complexity characterization of strong jump traceability.

Theorem 14.1.7 (Figueira, Nies, and Stephan [147]). A set A is s.j.t. iff for every computable order h, we have C(σ) ≤ C^A(σ) + h(C^A(σ)) for almost all σ.

Proof. Let A be s.j.t. and let h be a computable order. Let g(n) = log h(n). It clearly suffices to show that C(σ) ≤ C^A(σ) + g(C^A(σ)) + O(1). Since U^A is a partial A-computable function, there is a c.e. trace {T_τ}_{τ∈2^{<ω}} for U^A with bound g(|τ|). Let τ be a minimal-length U^A-description of σ. Since σ ∈ T_τ, we can describe σ by giving τ and a number less than or equal to g(|τ|) denoting σ's position in the enumeration of T_τ. Thus C(σ) ≤ |τ| + g(|τ|) + O(1) = C^A(σ) + g(C^A(σ)) + O(1).

Conversely, suppose that for every computable order h, we have C(σ) ≤ C^A(σ) + h(C^A(σ)) for almost all σ. If J^A(e) ↓ and log n = e then C^A(n, J^A(e)) ≤ C^A(n) + O(1) ≤ e + O(1). Let c be this final O(1) constant. Let h be a computable order. Let g be a computable order such that 3^{g(e+c)} ≤ h(e) for all e. If J^A(e) ↓ and log n = e then

C(n, J^A(e)) ≤ C^A(n, J^A(e)) + g(C^A(n, J^A(e))) ≤ e + g(e + c) + c.

Let T_e = {m : ∀n (log n = e ⇒ C(n, m) ≤ e + g(e + c) + c)}. Then for almost all e, if J^A(e) ↓, then J^A(e) ∈ T_e, since for any n with log n = e, we have C(n, J^A(e)) ≤ e + g(e + c) + c. So it is enough to show that |T_e| ≤ h(e) for almost all e. Let n be such that log n = e and C(n) ≥ e. Then for all m ∈ T_e, we have C(n, m) ≤ e + g(e+c) + c ≤ C(n) + g(e+c) + c, so by Theorem 3.4.6, for almost all e, we have |T_e| ≤ 3^{g(e+c)} ≤ h(e).

One consequence of the above result is that jump well-approximability implies strong jump traceability in general.
Theorem 14.1.8 (Figueira, Nies, and Stephan [147]). If A′ is well-approximable, then for every computable order h, we have C(σ) ≤ C^A(σ) + h(C^A(σ)) for almost all σ. Thus every jump well-approximable set is s.j.t.

Proof. Let h be a computable order. Let n_σ = C(σ | σ*_A). (Recall that σ*_A is a minimal-length U^A-description of σ.) We have C(σ) ≤ C^A(σ) + 2n_σ for almost all σ, so it is enough to show that n_σ ≤ h(C^A(σ))/2 for almost all σ. In fact, we will show that n_σ ≤ 2 log h(C^A(σ)) + O(1).

Define a partial A-computable function Ψ(m, n, ρ) as follows. Wait until σ = U^A(ρ), if ever. Then search for a τ ∈ 2^n such that U^ρ(τ) = σ. If such a τ is found, m < n, and the mth bit of τ is 1, then let Ψ(m, n, ρ) ↓ (the actual value of Ψ(m, n, ρ) is irrelevant). The important part of this definition is that there is some τ_σ ∈ 2^{n_σ} such that U^{σ*_A}(τ_σ) = σ and for m < n_σ, we have Ψ(m, n_σ, σ*_A) ↓ iff τ_σ(m) = 1.

Let f be a computable function such that A′(f(m, n, ρ)) = 1 iff Ψ(m, n, ρ) ↓. Let b be a computable order such that b(f(m, n, ρ)) ≤ n h(|ρ|) for all m, n, and ρ. By definition, there is a way to approximate A′ so that A′(k) is approximated with at most b(k) many mind changes. Thus, since τ_σ(m) = 1 iff A′(f(m, n_σ, σ*_A)) = 1, we can describe τ_σ from σ*_A by giving n_σ and the exact number of times our approximation to A′(f(0, n_σ, σ*_A)), …, A′(f(n_σ − 1, n_σ, σ*_A)) changes, which is bounded by n_σ² h(|σ*_A|) = n_σ² h(C^A(σ)). But we can also compute σ from τ_σ and σ*_A, so

n_σ = C(σ | σ*_A) ≤ C(τ_σ | σ*_A) + O(1) ≤ 2 log n_σ + log(n_σ² h(C^A(σ))) + O(1) = 4 log n_σ + log h(C^A(σ)) + O(1) ≤ n_σ/2 + log h(C^A(σ)) + O(1),

so n_σ ≤ 2 log h(C^A(σ)) + O(1).

Ng [296] has shown that the index set of c.e. s.j.t. sets is Π⁰₄-complete. His proof builds on the method of separating K-triviality and strong jump traceability from Theorem 14.3.7 below. Ng has also studied relativized versions of strong jump traceability and related notions. For example, a set A is ultra jump traceable if A is strongly jump traceable relative to all c.e. sets X. That is, for every c.e. set X and every X-computable order h, there is an X-c.e. trace for J^A with bound h. Note that this definition is a "partial" relativization, in that we trace only J^A, not J^{X⊕A}. Ng [297, 294] has shown that noncomputable ultra jump traceable sets exist, but no noncomputable set is jump traceable relative to all Δ⁰₂ sets in this partial sense. He has also shown that the ultra jump traceable sets form a proper subclass of the s.j.t. sets, and that they cannot be promptly simple. They thus form the first known class defined by a cost function construction that has noncomputable elements but no promptly simple ones. This class is not yet well understood.
14.2 The ideal of strongly jump traceable c.e. sets

By Theorem 14.1.4, the class of strongly jump traceable sets is closed downward under Turing reducibility. In this section we show that the join of two s.j.t. c.e. sets is also s.j.t., and so the c.e. s.j.t. sets form an ideal in the c.e. degrees. In fact, we show that for every computable order g there is another computable order h such that if A and B are c.e. and h-jump traceable, then A ⊕ B is g-jump traceable. This result should be contrasted with Theorem 11.6.4, that there exist superlow c.e. sets A and B such that A ⊕ B ≡_T ∅′.
The following theorem improves an earlier unpublished result of Ng, who built a nontrivial class C of c.e. sets such that for all A ∈ C and all s.j.t. sets B, the join A ⊕ B is strongly jump traceable.

Theorem 14.2.1 (Cholak, Downey, and Greenberg [65]). Let g be a computable order. There is a computable order h such that if A and B are h-jump traceable c.e. sets, then A ⊕ B is g-jump traceable. In particular, if A and B are s.j.t. c.e. sets then so is A ⊕ B.

Proof. This proof is a relatively simple example of the box promotion (or box amplification) method, so we wish to describe the motivation behind it. Suppose that we wanted to prove the theorem false, that is, to construct s.j.t. c.e. sets A_0 and A_1 such that A_0 ⊕ A_1 is not s.j.t. We could attempt to combine the construction of a promptly simple s.j.t. c.e. set in the proof of Theorem 14.1.3 with a strategy to diagonalize against possible traces for J^{A_0⊕A_1} by changing the values of this function sufficiently often. Recall that in the proof of Theorem 14.1.3, after some requirement P_e acts, it is removed from the list of requirements, and the blocks of N-requirements to its left and right are merged. This action "promotes" the block of N-requirements to the right of P_e, in a sense increasing their priority, which means that the number of times they can be injured decreases by one. In our attempted construction, suppose we start with the same ordering of requirements, except that there are two kinds of negative requirements, one for A_0 and one for A_1, and the positive requirements are the ones forcing us to change A_0 or A_1 to diagonalize against possible traces for J^{A_0⊕A_1}. The difference now is that the positive requirements need to act multiple times. Each time a positive requirement P_e acts, enumerating a number into some A_i, it must be demoted down the list and placed after all the negative requirements it has just injured.
Since these requirements may later impose new restraints, a new follower for P_e may be needed each time such a requirement decides to impose restraint. Since some of the negative requirements are also promoted by positive requirements weaker than P_e, we cannot put any computable bound in advance on the last place P_e will occupy on the list, and hence on the number of followers it will need. Thus we cannot define the computable bound that we mean to defeat, and the construction fails.

This failure is turned around in the following proof. We are given strongly jump traceable c.e. sets A_0 and A_1, and a computable order g, and wish to trace J^{A_0⊕A_1} with bound g. (We will later see that we can replace strong jump traceability by h-jump traceability for an order h that depends only on g.) The strategies for tracing the various values of J^{A_0⊕A_1} act completely independently, so fix an input e. We define functionals Ψ_0 and Ψ_1. When at some stage s of the construction we discover that J^{A_0⊕A_1}(e)[s] ↓, before we trace the value of this computation,
we want to receive some confirmation that this value is genuine. Say that the computation has use σ_0 ⊕ σ_1, where σ_i ≺ A_i[s]. We define Ψ_i^{σ_i}(x) = σ_i for some x. If indeed σ_i ≺ A_i then σ_i must appear in a trace T_x^i for Ψ_i^{A_i}, which we may assume we are given, by the universality of J^{A_i} and the recursion theorem. Thus we can wait until both strings σ_i appear in the relevant "boxes" T_x^i, and only then believe the computation J^{A_0⊕A_1}(e)[s]. Of course, it is possible that both σ_i appear in T_x^i but neither σ_i is really an initial segment of A_i, in which case we will have traced the wrong values. In this case, however, both boxes T_x^i have been promoted, in the sense that they each contain an element that we know is not the real value of Ψ_i^{A_i}(x), and Ψ_i^{A_i}(x) becomes undefined (when we notice that A_i has moved to the right of σ_i) and is therefore useful to us for testing another potential value of J^{A_0⊕A_1}(e) that may appear later. If the bound on the size of T_x^i (which we prescribe in advance, but has to eventually increase with x) is k, then we originally think of T_x^i as a "k-box", a box that may contain up to k many values. After σ_i appears in T_x^i and is shown to be wrong, we can think of the promoted box as a (k − 1)-box. Eventually, if T_x^i is promoted k − 1 many times, then it becomes a 1-box. If a string σ_i appears in a 1-box then we know it must be a true initial segment of A_i. In this way we can limit the number of false J^{A_0⊕A_1}(e) computations that we trace. Since all requirements act independently, this strategy allows us to trace J^{A_0⊕A_1} to any computable degree of precision we wish.

The above is the main idea of all "box promotion" constructions. Each such construction has particular combinatorial aspects designed to counter difficulties that arise during the construction (difficulties that we may think of as possible plays of an opponent, out to foil us).
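The promotion count just described can be modeled in a few lines. This is a toy sketch of ours, not part of the construction (the class name Box is hypothetical): a k-box that has had some of its values refuted behaves like a box of smaller capacity.

```python
class Box:
    """A toy model of a k-box: a trace component T_x whose size is
    bounded by k.  Each value that is later refuted (A_i moves to the
    right of it) "promotes" the box: one fewer wrong value can appear."""

    def __init__(self, k):
        self.k = k        # prescribed bound on |T_x|
        self.refuted = 0  # values already known to be wrong

    def refute_one(self):
        self.refuted += 1

    def capacity(self):
        # how many further values the box can still absorb
        return self.k - self.refuted

# a 5-box promoted four times becomes a 1-box: the next value that
# shows up in it must be the genuine initial segment
box = Box(5)
for _ in range(4):
    box.refute_one()
print(box.capacity())
```

The point of the sketch is only the counting: a box of prescribed bound k can mislead us at most k − 1 times before it becomes fully trustworthy.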
These combinatorics determine how slowly we want the size of our traces to grow, and which boxes should be used in the tests we perform. In this construction, the difficulty is the following. In the previous scenario, it is possible that, say, σ_0 is indeed a true initial segment of A_0, but σ_1 is not an initial segment of A_1, and to make matters worse, the latter fact may be discovered even before σ_1 turns up in T_x^1. However, we have already defined Ψ_0^{σ_0}(x) = σ_0 with an A_0-correct use, which means that the input x will not be available later for a new definition. The box T_x^0 must be discarded, and we have received no compensation: no other box has been promoted. The mechanics of the construction detailed below tell us which boxes to pick so that this problem can be countered. The main idea (which again appears in all box promotion constructions) is to use clusters of boxes (or "meta-boxes") rather than individual boxes. Instead of testing σ_i on a single T_x^i, we bunch together a finite collection M_i of inputs x, and define Ψ_i^{σ_i}(x) = σ_i for all x ∈ M_i. We believe the computation J^{A_0⊕A_1}(e)[s] only if σ_i has appeared in T_x^i for all x ∈ M_i. If this computation is believed and later discovered to be false, then all the boxes corresponding to elements of M_i are promoted. We can then break M_i up into smaller meta-boxes and use each separately. Thus
we magnify the promotion, to compensate for any losses that may occur as explained above.

We now turn to the formal details of the proof. We fix a number e and show how to trace J^{A_0⊕A_1}(e), limiting our errors to a prescribed number m. Given m, our strategy will ask for an infinite collection of boxes and describe precisely how many k-boxes it requires for each k. As m grows, the least k for which k-boxes are required will grow as well. In our construction, we will have k ≥ ⌊m/2⌋. For m and k ≥ ⌊m/2⌋, we will let r(k, m) be the number of k-boxes required to limit the size of the trace for J^{A_0⊕A_1}(e) to m. Recall that what we mean here is that the strategy will define functionals Ψ_{e,i} for i < 2 and expect to get traces {T_n^{e,i}}_{n∈N} for Ψ_{e,i}^{A_i} such that for all k ≥ ⌊m/2⌋, the collection of n such that |T_n^{e,i}| ≤ k has size at least r(k, m).

Suppose that we can show that all our strategies succeed if their expectations are met. Then, given a computable order g, we can define a computable order h such that if A_0 and A_1 are h-jump traceable, then A_0 ⊕ A_1 is g-jump traceable. We do so as follows. For each c > 0, partition N into consecutive intervals {I_k^c}_{k ≥ c} such that

|I_k^c| = Σ_{e : ⌊g(e)/2⌋ ≤ k} r(k, g(e)),

and define a function h_c by letting h_c(n) = k if n ∈ I_k^c. Note that, since g is an order, for each k there are only finitely many e such that ⌊g(e)/2⌋ ≤ k, so the intervals I_k^c are well-defined. Clearly, the h_c are uniformly computable orders, and h_c(0) ≥ c. By Lemma 14.1.2, we can obtain a computable order h such that if A_0 and A_1 are h-jump traceable, then we can compute, uniformly in c, traces {S_n^{c,i}}_{n∈N} for Φ_c^{A_i} with bound h_c.

For each c and k ≥ c, let {N_{k,e}^c}_{⌊g(e)/2⌋ ≤ k} be a partition of I_k^c such that |N_{k,e}^c| = r(k, g(e)). For each c, we run our construction simultaneously for all e such that ⌊g(e)/2⌋ ≥ c, with the eth requirement defining Ψ_{e,0} and Ψ_{e,1} with domain contained in ∪_{k ≥ ⌊g(e)/2⌋} N_{k,e}^c, using the {S_n^{c,i}}_{n∈N} as traces. Using Posner's trick (see p.
48), we can effectively obtain from c an index c′ such that for i = 0, 1, the functional Φ_{c′} agrees on oracle A_i with each Ψ_{e,i} for e such that ⌊g(e)/2⌋ ≥ c (recall that these functionals have pairwise disjoint domains). By the recursion theorem, there is a c such that Φ_c = Φ_{c′}, and so if A_0 and A_1 are h-jump traceable, then the {S_n^{c,i}}_{n∈N} are indeed traces for Φ_c^{A_i}, so that for each e such that ⌊g(e)/2⌋ ≥ c, our strategy for e can succeed, and hence trace J^{A_0⊕A_1}(e) with bound g(e). (The e such that ⌊g(e)/2⌋ < c present no problem, of course, since there are only finitely many such e.)

With the above in the background, we fix e and m and show how to define r(k, m) and the functionals Ψ_{e,i}. We also describe how, given traces {T_n^{e,i}}_{n∈N} for the Ψ_{e,i}^{A_i} such that for all k ≥ ⌊m/2⌋, the collection of n with |T_n^{e,i}| ≤ k has size at least r(k, m), we can trace J^{A_0⊕A_1}(e) with fewer than m many mistakes. From now on we drop the e from all subscripts and superscripts, writing Ψ_i for Ψ_{e,i}, writing T_x^i for T_x^{e,i}, and so on.

Construction. For each n, define a meta_0^n-box to be any singleton {x} and a meta_{k+1}^n-box to be a collection of n+2 many meta_k^n-boxes. We often ignore
the distinction between a meta_k^n-box and the set of numbers appearing in its meta_0^n-sub-boxes. In this sense, the size of a meta_k^n-box is (n + 2)^k. At the start of the construction, a meta-box M is an l-box (for both A_0 and A_1) if f(x) ≤ l for all x ∈ M, for an order f that we will soon define. At a later stage s, a meta-box M is an l-box for A_i if f(x) − |T_x^{e,i}[s]| ≤ l for all x ∈ M.

At the start of the construction, for each k ≥ ⌊m/2⌋ we wish to have two meta_{k+1}^k-boxes that are k-boxes. We thus let r(k, m) = 2(k + 2)^{k+1}. Denote these two meta-boxes by N_k and N_k′. Let f be a computable order such that for all k ≥ ⌊m/2⌋, the set of all n with f(n) = k has size at least r(k, m). We write ½N for the set of all numbers of the form n/2 with n ∈ N.

The construction begins at stage ⌊m/2⌋. At the beginning of stage s of the construction, we have two numbers k_0*[s] and k_1*[s]. (We start with k_i* = ⌊m/2⌋.) For i < 2, each k ∈ [k_i*[s], s) has some priority p_i(k)[s] ∈ ½N. For such k we have finitely many meta_k^{p_i(k)[s]}-boxes M_0^i(k)[s], …, M_{d_i(k)−1}^i(k)[s], each of which is free, in the sense that Ψ_i^{A_i}(x)[s] ↑ for all x in any of these boxes.

At stage s ≥ ⌊m/2⌋, for i = 0, 1, let M_0^i(s), …, M_{s+1}^i(s) be the meta_s^s-sub-boxes of N_s. (Recall that these are all s-boxes.) Let p_i(s) = s. Suppose now that we are given a computation J^{A_0⊕A_1}(e)[s] with use σ_0 ⊕ σ_1, which we want to test. This test is done in steps of increasing priority (i.e., decreasing argument in ½N), beginning with step s. The instructions for testing σ_0 ⊕ σ_1 at step n ∈ ½N are as follows. For i = 0, 1, if there is some k such that p_i(k)[s] = n (there will be at most one such k for each i), then take the last meta-box M = M_{d_i(k)[s]−1}^i(k)[s] and test σ_i on M by defining Ψ_i^{σ_i}(x) = σ_i for all x ∈ M.
Run the enumeration of the trace {T_x^i}_{x∈N} and of A_i until either σ_i appears in T_x^i for all x ∈ M, in which case we say that the test returns, or σ_i is found not to be an initial segment of A_i, in which case we say that the test fails. One of these possibilities must occur, since {T_x^i}_{x∈N} is a trace for Ψ_i^{A_i}.

If all tests that are started (none, one test for a single σ_i, or two tests, one for each σ_i) return, then move to test step n − 1/2, unless n = 1, in which case the tests at all levels have returned, so we believe the computation J^{A_0⊕A_1}(e)[s] and trace it. In the latter case, from this point on we monitor this belief. We keep defining p_i(t) and M_j^i(t) at later stages t as above. If at such a stage t we discover that one of the σ_i is not in fact an initial segment of A_i, we update priorities as described in the following paragraph and go back to following the instructions above. Also, if some test at step n fails, then we stop the testing at stage s and update priorities as follows, with t = s.

For each n such that a test of σ_i at step n returns, but at stage t ≥ s we discover that σ_i ⊀ A_i, proceed as follows. Let k be the level such that p_i(k)[s] = n. If k = k_i*[s] then redefine k_i* to be k − 1. Redefine p_i(k − 1) to be n and d_i(k − 1) to be ⌊n⌋ + 2, and let M_0^i(k − 1)[t + 1], …, M_{⌊n⌋+1}^i(k − 1)[t + 1] be the collection of meta_{k−1}^n-sub-boxes of M_{d_i(k)[s]−1}^i(k)[s] (which was the meta_k^n-box used for the testing of σ_i at step n of stage s). If k = s then redefine p_i(k) to be s + 1/2, redefine d_i(s) to be s + 2, and let M_0^i(s)[t + 1], …, M_{s+1}^i(s)[t + 1] be the meta_s^s-sub-boxes of N_s′. (Note that these boxes are untouched so far.) On the other side, if a test of σ_{1−i} at step n of stage s started and returned, and σ_{1−i} ≺ A_{1−i}[t], then discard the meta-box M_{d_{1−i}(k′)[s]−1}^{1−i}(k′)[s], where k′ is the level with p_{1−i}(k′)[s] = n, and redefine d_{1−i}(k′) to be d_{1−i}(k′)[s] − 1. Perform these redefinitions also if t = s, the step s test of σ_{1−i} at stage s returns, and the step s test of σ_i at stage s fails. End of Construction.

Lemma 14.2.2. Let i < 2, let s ≥ ⌊m/2⌋, let k ∈ [k_i*[s], s), let j < d_i(k)[s], and let n = p_i(k)[s]. The meta_k^n-box M_j^i(k)[s] is a k-box. Indeed, for each x ∈ M_j^i(k)[s], there are at least ⌊n⌋ − k = f(x) − k many strings in T_x^i[s] lexicographically to the left of A_i[s].
Proof. Let s be the least stage at which we define p_i(k)[s] = n for some k. Then k = ⌊n⌋ and there are two possibilities.

If n ∈ N then s = n, the definition is made at the beginning of stage s, and we define M_0^i(s)[s], …, M_{s+1}^i(s)[s] to be sub-boxes of N_s, which is an s-box.

If n ∉ N then a test was conducted at stage ⌊n⌋ ≤ s − 1, and returned on the σ_i side, and now σ_i ⊀ A_i[s]. We then define p_i(⌊n⌋)[s] = n and define M_0^i(⌊n⌋)[s], …, M_{⌊n⌋+1}^i(⌊n⌋)[s] to be sub-boxes of N_{⌊n⌋}′, which is an ⌊n⌋-box.

In either case, the M_j^i(⌊n⌋)[s] are ⌊n⌋-boxes, so for each x in these meta-boxes, f(x) = ⌊n⌋, and hence, vacuously, T_x^i contains at least f(x) − ⌊n⌋ many strings. By induction, if n = p_i(k)[t] at a later stage t, then for all j, we have that M_j^i(k)[t] is a sub-box of some M_{j′}^i(⌊n⌋)[s], and so for all x ∈ M_j^i(k)[t] we have f(x) = ⌊n⌋.

Suppose that at stage t we redefine p_i(k − 1)[t + 1] = n and redefine the M_j^i(k − 1)[t + 1]. Then at some stage u ≤ t, for all x ∈ M_{d_i(k)[u]−1}^i(k)[u], we defined Ψ_i^{σ_i}(x) = σ_i, where σ_i ≺ A_i[u] but σ_i ⊀ A_i[t + 1]. By induction, at stage u there were at least ⌊n⌋ − k many strings in T_x^i[u] to the left of A_i[u]. These strings must all be distinct from σ_i. The test at stage u returned, so σ_i ∈ T_x^i[u + 1]. Thus T_x^i[t + 1] contains at least ⌊n⌋ − (k − 1) many strings to the left of A_i[t + 1].

Lemma 14.2.3. The sequence k_i*[0], k_i*[1], … is nonincreasing, and all its terms are positive.
Proof. The fact that this sequence is nonincreasing is immediate from the construction. By Lemma 14.2.2, for all j < d_i(k_i*)[s] and x ∈ M_j^i(k_i*)[s], we have |T_x^i[s]| ≥ f(x) − k_i*[s]. As |T_x^i| < f(x), we have k_i*[s] > 0.

Lemma 14.2.4. For each t, the sequence p_i(0)[t], p_i(1)[t], … is strictly increasing.

Proof. Assume the lemma holds at the beginning of stage t. We first define p_i(t) to be t at stage t, and all numbers used prior to this stage were below t, so the lemma continues to hold. Now suppose that at stage t we update priorities because a test that returned at some stage s ≤ t is found to be incorrect. The induction hypothesis for s, and the instructions for testing, ensure that the collection of levels k for which a test returned at stage s is an interval [k_0, s]. Priorities then shift one step downward to the interval [k_0 − 1, s − 1]. (That is, for k ∈ [k_0, s], we have p_i(k − 1)[t + 1] = p_i(k)[s].) Hence the sequence of priorities is still increasing for k < s. Finally, a new priority s + 1/2 is given to level s. This priority is greater than the priorities for levels k < s (which have priority at most s), but smaller than the priority k given to each level k ∈ (s, t]. Also note that we always have p_i(k)[s] ≥ k, because we start with p_i(k)[k] = k and never decrease the value of p_i(k).

The following key calculation ensures that we never run out of boxes at any level, on either side, so the construction never gets stuck. It ties losses of boxes on one side to gains on the other. For any k ∈ [k_i*[s], s], let l_i(k)[s] be the least level l such that p_{1−i}(l)[s] ≥ p_i(k)[s]. Such a level must exist because at the beginning of stage s we let p_{1−i}(s)[s] = s, which is greater than or equal to p_i(k)[s] for all k ≤ s. Note that l_i(k)[s] ≥ 1.

Lemma 14.2.5. At each stage s, for i < 2 and k ∈ [k_i*[s], s], the number d_i(k)[s] of meta-k-boxes is at least l_i(k)[s] if p_{1−i}(l_i(k))[s] > p_i(k)[s], and at least l_i(k)[s] + 1 if p_{1−i}(l_i(k))[s] = p_i(k)[s].
Proof. We proceed by induction on stages. Suppose the lemma holds at the end of stage t − 1. At stage t, we first define p_i(t) = p_{1−i}(t) = t. Thus l_i(t)[t] = t, while d_i(t)[t] = t + 2. Suppose that a test that began at stage s ≤ t is resolved at stage t, and priorities are updated. (Otherwise, there is nothing to show.)

First suppose that σ_i ⊀ A_i[t + 1], and consider a level k with d_i(k)[t + 1] ≠ d_i(k)[t]. If k < s, then a test at level k + 1 returned at stage s. Then d_i(k)[t + 1] = ⌊n⌋ + 2, where n = p_i(k)[t + 1] = p_i(k + 1)[s]. Since p_{1−i}(⌊n⌋ + 1)[t + 1] ≥ ⌊n⌋ + 1 ≥ n, we have l_i(k)[t + 1] ≤ ⌊n⌋ + 1, and hence d_i(k)[t + 1] ≥ l_i(k)[t + 1] + 1. For the k = s case, we have d_i(s)[t + 1] = s + 2 and p_i(s)[t + 1] = s + 1/2. Since p_{1−i}(s + 1)[t + 1] ≥ s + 1, we have l_i(s)[t + 1] ≤ s + 1, so d_i(s)[t + 1] ≥ l_i(s)[t + 1] + 1.
Now suppose that σ_i ≺ A_i[t + 1]. We may have lost some meta-boxes on this side, but changing priorities on the other side gives us compensation. Let k ∈ [k_i*[t], t]. If k > s then d_i(k)[t + 1] = k + 2 and l_i(k)[t + 1] = k. If k = s, then d_i(k)[t + 1] = d_i(k)[s] − 1 = s + 1 and l_i(k)[t + 1] ≤ s, so we may assume that k < s. Let n = p_i(k)[t + 1] = p_i(k)[s]. If there is no k′ such that p_{1−i}(k′)[s] = n, then there is no k′ such that p_{1−i}(k′)[t + 1] = n, because the only priority we may add at stage t (after the initial part of the stage) is s + 1/2, and n < s. Thus, if p_{1−i}(l_i(k))[s] > n then p_{1−i}(l_i(k))[t + 1] > n.

Let k′ = l_i(k)[s], and note that k′ < s. Let n′ = p_{1−i}(k′)[s]. There are three possibilities.

1. A step n′ test of σ_{1−i} at stage s returns. In this case, l_i(k)[t + 1] = k′ − 1 and p_{1−i}(l_i(k))[t + 1] = n′.

2. A step n′ test of σ_{1−i} at stage s does not return, but a step p_{1−i}(k′ + 1)[s] test of σ_{1−i} at stage s does return. In this case the priority n′ is removed on the 1 − i side at stage t. We redefine p_{1−i}(k′)[t + 1] = p_{1−i}(k′ + 1)[s] > n′. However, we still have l_i(k)[t + 1] = k′ because (if k′ > k_{1−i}*[s]) we have p_{1−i}(k′ − 1)[t + 1] = p_{1−i}(k′ − 1)[s] < n.

3. A step p_{1−i}(k′ + 1)[s] test of σ_{1−i} at stage s is not started or does not return. In this case there is no change at levels k′ and k′ − 1. We have l_i(k)[t + 1] = k′ and p_{1−i}(k′)[t + 1] = n′.

In any case, we see that l_i(k) cannot increase from stage s to stage t + 1, and we cannot have p_{1−i}(l_i(k))[s] > n but p_{1−i}(l_i(k))[t + 1] = n. Thus the required number of k-meta-boxes does not increase from stage s to stage t + 1, and we need only check what happens if d_i(k)[t + 1] = d_i(k)[s] − 1. Assume this is the case. In case 1 above, the number of required boxes has decreased by one, which exactly compensates the loss. Case 3 is not possible if a k-box is lost, because a step n test is started only after a step p_{1−i}(k′ + 1)[s] test has returned.
The same argument shows that if case 2 holds and we lost a k-box, then necessarily n′ = n. But then d_i(k)[s] ≥ k′ + 1, and the fact that now p_{1−i}(k′)[t + 1] > n implies that the number of required boxes has just decreased by one, to k′, so that again our loss is compensated.

We are now ready to finish the proof. If J^{A_0⊕A_1}(e) ↓ then at some point the correct computation appears and is tested. Of course, the relevant tests must return, so the correct value is traced. If, on the other hand, a value J^{A_0⊕A_1}(e)[s] is traced at stage s because all tests return, but at a later stage t we discover that this computation is incorrect, because σ_i ⊀ A_i[t + 1] for some i < 2, then k_i*[t + 1] < k_i*[t]. As we always have k_i*[u] ≥ 1, this event happens fewer than 2⌊m/2⌋ ≤ m many times. It follows that the total number of values traced is at most m, as required.
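As a quick sanity check on the bookkeeping above, the following sketch (ours, purely illustrative) computes meta-box sizes and the count r(k, m) = 2(k + 2)^{k+1} from their definitions:

```python
def meta_size(n, k):
    """Size of a meta^n_k-box: a meta^n_0-box is a singleton, and a
    meta^n_{k+1}-box consists of n + 2 many meta^n_k-boxes, so the
    size is (n + 2)^k."""
    return 1 if k == 0 else (n + 2) * meta_size(n, k - 1)

def r(k, m):
    """Number of k-boxes requested at level k: the two meta^k_{k+1}-boxes
    N_k and N_k' together contain 2(k + 2)^{k+1} many inputs.
    (m is carried along only to match the text's notation r(k, m).)"""
    return 2 * meta_size(k, k + 1)

print(meta_size(3, 2), r(2, 6))
```

Note that r(k, m) does not actually depend on m; m enters only through the requirement that levels k ≥ ⌊m/2⌋ be supplied.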
It is not known at the time of writing whether the join of two (not necessarily c.e.) s.j.t. sets must be s.j.t. (and hence whether the s.j.t. sets form an ideal), though this would follow from Conjecture 14.5.1 below.
14.3 Strong jump traceability and K-triviality: the c.e. case

In this section, we show that the s.j.t. c.e. sets form a proper subideal of the K-trivial c.e. sets. We begin with the following result.

Theorem 14.3.1 (Cholak, Downey, and Greenberg [65]). There is a computable order h such that if A is c.e. and h-jump traceable then A is low for K, and hence K-trivial.

In Theorem 14.5.2, we will see that in fact every s.j.t. set is K-trivial. (We will also obtain Theorem 14.3.1 as a corollary to Theorem 14.4.2.) However, we include the following proof of Theorem 14.3.1 here, as it is of technical interest (and proves that every s.j.t. c.e. set is low for K directly, without appealing to the equivalence between lowness for K and K-triviality).

Proof of Theorem 14.3.1. To show that A is low for K we use the KC Theorem. That is, we build a KC set L such that if U^A(σ) = τ then (|σ|, τ) ∈ L, so that K(τ) ≤ |σ| + O(1). We enumerate A and thus approximate U^A. When a string σ enters the domain of U^A at a stage s, we need to decide whether to believe the A_s-computation that put σ in dom U^{A_s}. We do so by testing the use ρ ≺ A_s of this computation. A first approximation to our strategy for doing so is to build a functional Ψ, and look at a trace T_0, T_1, … for Ψ^A. Given ρ, we choose an n and define Ψ^ρ(n) = ρ. We then wait to see whether ρ ever enters T_n. If so, then we believe that ρ ≺ A, and hence that σ ∈ dom U^A, so we enumerate (|σ|, U^A(σ)) into L.

The problem, of course, is ensuring that the weight of L is bounded. If we believed only correct computations (that is, if we had |T_n| = 1 for all n), then the weight of L would be μ(dom U^A) < 1. But, of course, the T_n grow with n, so we will believe some incorrect uses ρ. We need to make sure that the T_n grow slowly enough that these mistakes do not prevent L from being a KC set. To help us do so, rather than treat each string σ individually, we deal with strings in batches.
When a set of strings C ⊆ dom U^{A_s} of weight 2^{−k} (i.e., such that Σ_{σ∈C} 2^{−|σ|} = 2^{−k}) appears, we will verify A up to a use that puts them all in dom U^{A_s}. The greater 2^{−k} is, the more stringent the test will be (ideally, in the sense that the size of the corresponding T_n is smaller). We will put a bound m_k on the number of times that a set of strings of weight 2^{−k} can be believed and yet be incorrect. The argument will succeed as long as Σ_k m_k 2^{−k} is finite.
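The convergence requirement can be checked numerically for an illustrative choice of m_k. The choice m_k = k² below is ours, for illustration only, and is not the chapter's exact bound:

```python
# Illustrative only: with a polynomially growing error bound, say
# m_k = k^2, the total weight wasted on mistaken batches is finite:
# sum_{k>=1} k^2 * 2^{-k} converges (to 6).
total = sum(k ** 2 * 2.0 ** -k for k in range(1, 60))
print(total)
```

Any polynomial bound on m_k would do equally well here, since 2^{−k} decays exponentially.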
Once we use an input n to verify the A-correctness of a use, it cannot be used again for any testing, so following this strategy, we might need 2^k many inputs for testing sets of weight 2^{−k}. Even a single error on each n (and there will be more, as |T_n| goes to infinity) means that m_k ≥ 2^k, which is too large. Thus we need a combinatorial strategy to assign inputs in such a way as to keep the m_k small.

This strategy has two ingredients. First, note that two sets of weight 2^{−k} can be combined into a single set of weight 2^{−(k−1)}. So if we are testing one such set, and another one with compatible use appears, then we can let the testing machinery for 2^{−(k−1)} take over. Thus, even though we still need several testing locations for 2^{−k}, at any stage, the testing at level 2^{−k} is responsible for at most one set of weight 2^{−k}.

It might appear that it is now sufficient to let the size of each T_n used to test sets of weight 2^{−k} be something like k. However, sets of large weight can be built gradually. We might first see k many sets of weight 2^{−k} (with incompatible uses), each of which requires us to use the input n devoted to the first set of weight 2^{−k}, because at each stage we believe only one of these k many sets. Once T_n gets to its maximum size, we see a set of weight 2^{−k} with a correct use. Then the same sequence of events happens again. We make k mistakes on each n used for level 2^{−k} testing, and end up using about 2^k many such n's, which we have already seen we cannot do.

The solution is to make every mistake count in our favor in all future tests of sets of weight 2^{−k}. In other words, what we need to do is to maximize the benefit that is given by a single mistake. We make sure that a mistake on some set means one less possible mistake on every other set. Rather than testing a set on a single input n, we test it simultaneously on a large set of inputs and believe it is correct only if the relevant use shows up in the trace of every input tested.
If our set is eventually believed but is in fact a mistake, then we have a large collection of inputs n for which the number of possible errors is reduced. We can then break up this collection into smaller collections and keep working with such subcollections.

This idea can be visualized geometrically as follows. We have an m_k-dimensional cube of inputs, each side of which has length 2^k. In the beginning we test each set on one hyperplane. If the testing on some hyperplane is believed and later found to be incorrect, then from then on we work in that hyperplane, which becomes the new cube for testing pieces of size 2^{−k}, and we test on hyperplanes of this new cube. If the size of T_n for each n in the cube is at most m_k, then we never "run out of dimensions".

We now proceed with the formal details of the construction. For c > 1, we partition N into consecutive intervals M_0^c, M_1^c, … such that |M_k^c| = 2^{k(k+c)}. For n ∈ M_k^c, let h_c(n) = k + c − 1. By Lemma 14.1.2, we can determine a computable order h̃ such that, from a trace for J^A with bound h̃, we can compute, uniformly in c, a trace for Φ_c^A with bound h_c.
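The hypercube labeling used below can be sanity-checked directly. This is an illustrative sketch of ours (the function name cube_labels is not from the text): it enumerates the labels f of the cube M_k[0] and cuts out one hyperplane.

```python
from itertools import product

def cube_labels(k, c):
    """Labels f : {0, ..., k+c-1} -> R_k^+ for the cube M_k[0], where
    R_k^+ consists of the 2^k positive multiples of 2^{-k} up to 1."""
    Rk_plus = [m / 2 ** k for m in range(1, 2 ** k + 1)]
    return list(product(Rk_plus, repeat=k + c))

cube = cube_labels(2, 1)   # k = 2, c = 1: a 3-dimensional cube
print(len(cube))           # |M_k^c| = 2^{k(k+c)}

# fixing one coordinate to a value q cuts out the hyperplane N_k(q),
# which is 2^k times smaller than the cube
plane = [f for f in cube if f[0] == 0.25]
print(len(plane))
```

Each of the 2^k choices of q gives a hyperplane of equal size, which is what lets the construction trade one believed-then-refuted hyperplane for a whole new cube of one fewer dimension.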
682
14. Strong Jump Traceability
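The hyperplane testing just described can be sketched as a toy simulation (illustrative only: the function simulate and its parameters D, S, refutations are ours, not part of the construction). A D-dimensional cube of inputs with sides of length S plays the role of the cube of inputs; each refuted test fixes one more coordinate, shrinking the cube to a hyperplane and charging one trace element to every surviving input, so D refutations are survivable provided each |T_n| may reach D.

```python
from itertools import product

def simulate(D, S, refutations):
    """Toy model of cube/hyperplane testing: each refutation names the
    hyperplane that was believed and then found incorrect."""
    cube = list(product(range(S), repeat=D))   # the full cube, with S**D inputs
    trace = {x: 0 for x in cube}               # strings charged to T_x so far
    depth = 0                                  # coordinates fixed so far
    for q in refutations:
        assert depth < D, "ran out of dimensions"
        cube = [x for x in cube if x[depth] == q]   # descend into the hyperplane
        depth += 1
        for x in cube:
            trace[x] += 1                      # the refuted test cost one trace slot
    return depth, len(cube), max(trace[x] for x in cube)

# Three refutations inside a 3-dimensional cube with sides of length 4:
# one input survives, and no surviving input was charged more than 3 slots.
print(simulate(D=3, S=4, refutations=[1, 3, 0]))   # -> (3, 1, 3)
```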
We will define a functional Ψ. By the recursion theorem, we know some c such that Ψ^X = Φ_c^X for all X ∈ 2^ω. Let M_k[0] = M_k^c and let {T_n}_{n∈N} be a trace for Ψ^A with bound h = h_c. The definitions of Ψ that we make will all be of the form Ψ^ρ(n) = ρ. We will make such a definition at stage s only when ρ ≺ A[s]. Let R_k = {m2^{-k} : m ≤ 2^k} and let R_k^+ = R_k \ {0}. We can label the elements of M_k[0] so that M_k[0] = {x_f : f : (k+c) → R_k^+}. We can think of M_k[0] as a (k+c)-dimensional cube with sides of length 2^k. At each stage s, for each k we will have a number d_k[s] < k+c and a function g_k[s] : d_k[s] → R_k^+, which will determine the current value of M_k via the definition M_k[s] = {x_f ∈ M_k[0] : g_k[s] ⊂ f}. (Thus d_k[0] = 0 and g_k[0] is the empty function.) We can think of M_k[s] as a (k+c−d_k)-dimensional cube. For q ∈ R_k^+, let N_k(q)[s] = {x_f ∈ M_k[s] : f(d_k[s]) = q}, the (q2^k)th hyperplane of M_k[s]. Recall that Ω^ρ = μ(dom U^ρ), the measure of the domain of the universal machine with oracle ρ. Note that if ρ ≺ ν then Ω^ρ ≤ Ω^ν. We adopt the convention that the running time of any computation with oracle ρ is at most |ρ|, so the maps ρ ↦ U^ρ and ρ ↦ Ω^ρ are computable, and |σ| ≤ |ρ| for all σ ∈ dom U^ρ. It follows that Ω^ρ is a multiple of 2^{-|ρ|}, and hence is an element of R_{|ρ|}. Also note that Ω^λ = 0. Let q ∈ Q. For ν such that Ω^ν ≥ q, let ν(q) be the shortest ρ ≼ ν such that Ω^ρ ≥ q. Note that if q < q′ ≤ Ω^ν then ν(q) ≼ ν(q′). Let A[s] denote the approximation to A at the beginning of stage s. At the beginning of stage s, the cubes M_k will be in the following standard configuration for the stage: Let k ≤ s and q ∈ R_k^+. If q ≤ Ω^{A[s]↾s} then Ψ^{(A[s]↾s)(q)}(x)[s] ↓ = (A[s]↾s)(q) for all x ∈ N_k(q)[s]. Otherwise, Ψ^{A[s]}(x)[s] ↑ for all x ∈ N_k(q)[s]. Furthermore, for all k > s and x ∈ M_k[s], no definition of Ψ(x) for any oracle has been made before stage s. Let ρ ≼ A[s]↾s.
We say that ρ is semiconfirmed at some point during stage s if for all x such that Ψ^ρ(x) ↓ = ρ at stage s, we have ρ ∈ T_x at that given point (which may be the beginning of the stage or later). We say that ρ is confirmed if every ρ′ ≼ ρ is semiconfirmed. Note that λ is confirmed at every stage, because Ω^λ = 0, so we never have (A[s]↾s)(q) = λ for any s and q > 0, and hence never define Ψ^λ(x) for any x.
Construction. At stage s, proceed as follows. Speed up the enumeration of A and of the T_n (to get A[s+1] and T_n[s+1]) so that for all ρ ≼ A[s]↾s, either ρ is confirmed or ρ ⊀ A[s+1]. One of the two must happen because {T_n}_{n∈N} traces Ψ^A. For each k ≤ s, look for some q ∈ R_k^+ such that q ≤ Ω^{A[s]↾s} and (A[s]↾s)(q) was confirmed at the beginning of the stage but (A[s]↾s)(q) ⊀ A[s+1]. If there is such a q then extend g_k by letting g_k(d_k) = q, so that d_k[s+1] = d_k[s]+1 and M_k[s+1] = N_k(q)[s].
14.3. Strong jump traceability and K-triviality: the c.e. case
683
Finally, define Ψ as necessary so that the standard configuration holds at the beginning of stage s+1. End of Construction. Justification. We need to explain why the construction never gets stuck. There are two issues: why we can always increase d_k when asked to (since d_k must remain below k+c), and why we can always return to the standard configuration. To address the first issue, we prove the following. Lemma 14.3.2. For each s and x ∈ M_k[s+1], there are at least d_k[s] many strings in T_x, so d_k[s] ≤ |T_x| ≤ h(x) = k+c−1. Proof. Suppose that at stage s, we increase d_k by one. This action is witnessed by some q ∈ R_k^+ such that ρ = (A[s]↾s)(q) was confirmed at the beginning of the stage, and we let M_k[s+1] = N_k(q)[s]. The confirmation of ρ implies that ρ ∈ T_x for all x ∈ N_k(q)[s]. But we also have that ρ ≺ A[s] and ρ ⊀ A[s+1]. As A is c.e., A[s+1] is to the right of ρ. If we increase d_k at stages s_0 < s_1 (witnessed by strings ρ_0 and ρ_1) then ρ_0 is to the left of A[s_0+1], while ρ_1 ≺ A[s_1] (which is not to the left of A[s_1+1]). Thus ρ_0 is to the left of ρ_1, and, in particular, they are distinct. To address the second issue mentioned above, we assume that the standard configuration holds at the beginning of stage s and show that we can return to the standard configuration after stage s. Fix k ≤ s (as there is no problem with k = s+1). If M_k[s+1] ≠ M_k[s], witnessed by some q ∈ R_k^+ and ρ = (A[s]↾s)(q), then Ψ^ρ(x) ↓ = ρ for all x ∈ M_k[s+1], so if ρ′ ≺ ρ then Ψ^{ρ′}(x)[s] ↑. As ρ ⊀ A[s+1] and no definitions for oracles to the right of A[s] are made before stage s, we must have Ψ^{A[s+1]}(x)[s] ↑, so we are free to make the definitions needed for the standard configuration to hold at the beginning of stage s+1. If M_k[s+1] = M_k[s], then let q ∈ R_k^+ be such that q ≤ Ω^{A[s+1]↾(s+1)}, and let x ∈ N_k(q)[s+1] = N_k(q)[s]. We want to define Ψ^ρ(x) ↓ = ρ for ρ = (A[s+1]↾(s+1))(q). If ρ ⊀ A[s] then ρ is to the right of A[s], so Ψ^ρ(x)[s] ↑ for all x ∈ M_k[s].
If ρ ≺ A[s], then there are two possibilities. If |ρ| ≤ s then ρ = (A[s]↾s)(q), so we already have Ψ^ρ(x) ↓ = ρ for all x ∈ N_k(q)[s]. If |ρ| = s+1 then, since q > Ω^{ρ′} for all ρ′ ≺ ρ, we have q > Ω^{A[s]↾s}. Since the standard configuration holds at the beginning of stage s, we have Ψ^{A[s]}(x) ↑ for all x ∈ N_k(q) at the beginning of stage s. Thus we are free to define Ψ^ρ(x) as we wish. Verification. Let ρ∗[s] be the longest string (of length at most s) that is a common initial segment of both A[s] and A[s+1]. Thus ρ∗[s] is the longest string that is confirmed at the beginning of stage s+1. Let

L = {(σ, τ) : U^{ρ∗[s]}(σ) = τ for some s ∈ N}.
Suppose that U^A(σ) = τ, and let ρ ≺ A be such that U^ρ(σ) = τ. Let s > |ρ| be such that ρ ≺ A[s], A[s+1]. Then ρ ≼ ρ∗[s], so (σ, τ) ∈ L. Thus {(σ, τ) : U^A(σ) = τ} ⊆ L. Since L is c.e., we are left with showing that Σ_{(σ,τ)∈L} 2^{-|σ|} < ∞, from which it follows that L is a KC set witnessing the fact that A is low for K. Since μ(dom U^A) < 1, it is enough to show that Σ_{(σ,τ)∈L\U^A} 2^{-|σ|} < ∞. For k ≤ s, let q_k[s] be the greatest element of R_k not greater than Ω^{ρ∗[s]}. If k′ < k ≤ s then q_{k′}[s] ≤ q_k[s], because R_{k′} ⊆ R_k. Note that |ρ∗[s]| ≤ s, so Ω^{ρ∗[s]} is an integer multiple of 2^{-s}. It follows that q_s[s] = Ω^{ρ∗[s]}. Furthermore, since Ω^ρ < 1 for all ρ, we must have q_0[s] = 0. Let ν_k[s] = (ρ∗[s])(q_k[s]). By the monotonicity of q_k[s], if k′ < k ≤ s then ν_{k′}[s] ≼ ν_k[s], and so Ω^{ν_{k′}[s]} ≤ Ω^{ν_k[s]}. Furthermore, ν_0[s] = λ, and Ω^{ν_s[s]} = Ω^{ρ∗[s]}, whence U^{ν_s[s]} = U^{ρ∗[s]}. Since q_{k−1}[s] ≤ Ω^{ν_{k−1}[s]} ≤ Ω^{ν_k[s]} ≤ Ω^{ρ∗[s]} ≤ q_{k−1}[s] + 2^{-(k−1)}, we have the following key fact. Lemma 14.3.3. Let 1 ≤ k ≤ s. Then Ω^{ν_k[s]} − Ω^{ν_{k−1}[s]} ≤ 2^{-k+1}. We now show that we can define sets L_{k,t} of relatively small weight such that if (σ, τ) ∈ L \ U^A, then we can “charge” the mistake of adding (σ, τ) to L against some L_{k,t}. Formally, we define sets L_{k,t} with the following properties. 1. For each k and t, we have Σ_{(σ,τ)∈L_{k,t}} 2^{-|σ|} ≤ 2^{-k+1}. 2. L \ U^A ⊆ ⋃_{k,t} L_{k,t}. 3. For each k, there are at most k+c many t such that L_{k,t} is nonempty. Given these properties, we have

Σ_{(σ,τ)∈L\U^A} 2^{-|σ|} ≤ Σ_{k,t} Σ_{(σ,τ)∈L_{k,t}} 2^{-|σ|} ≤ Σ_k (k+c)2^{-k+1} < ∞,

as required. If 1 ≤ k ≤ t and ν_k[t] ⊀ A[t+2] then let L_{k,t} = {(σ, τ) : U^{ν_k[t]}(σ) = τ ∧ U^{ν_{k−1}[t]}(σ) ↑}, and otherwise let L_{k,t} = ∅. We show that properties 1–3 above hold. Property 1 follows from Lemma 14.3.3: Σ_{(σ,τ)∈L_{k,t}} 2^{-|σ|} = μ(dom(U^{ν_k[t]} \ U^{ν_{k−1}[t]})) = Ω^{ν_k[t]} − Ω^{ν_{k−1}[t]} ≤ 2^{-k+1}.
Lemma 14.3.4. L \ U A ⊆
k,t
Lk,t .
Proof. Let (σ, τ) ∈ L \ U^A. Let ρ be a minimal-length string such that (σ, τ) ∈ U^ρ and ρ ≼ ρ∗[s] for some s. Since ρ ≺ A[s], A[s+1] but ρ ⊀ A, there is a t ≥ s such that ρ ≺ A[t], A[t+1] but ρ ⊀ A[t+2].
Since ρ ≼ ρ∗[t] and U^{ρ∗[t]} = U^{ν_t[t]}, by the minimality of ρ we have ρ ≼ ν_t[t]. Since ν_0[t] = λ, there is a k ∈ [1, t] such that ν_{k−1}[t] ≺ ρ ≼ ν_k[t]. Since ρ ≼ ν_k[t], we have (σ, τ) ∈ U^{ν_k[t]}. Since ν_{k−1}[t] ≺ ρ∗[t], the minimality of ρ implies that (σ, τ) ∉ U^{ν_{k−1}[t]}. Finally, ρ ⊀ A[t+2], so ν_k[t] ⊀ A[t+2]. Thus (σ, τ) ∈ L_{k,t}. Lemma 14.3.5. If L_{k,t} ≠ ∅ then M_k[t+1] ≠ M_k[t+2]. Proof. Suppose that L_{k,t} ≠ ∅, so that ν_k[t] ⊀ A[t+2]. Let q = q_k[t]. Then ν_k[t] = (ρ∗[t])(q) = (A[t]↾t)(q). Now, ν_k[t] ≼ ρ∗[t], so ν_k[t] is confirmed at the beginning of stage t+1. Furthermore, q > 0, as otherwise ν_k[t] = λ ≺ A[t+2]. Thus all the conditions for redefining M_k at stage t+1 are fulfilled. Thus properties 1–3 above hold, and L is a KC set witnessing the fact that A is low for K. Using more intricate combinatorics it is also possible to establish the following result, where Martin-Löf cuppability is as defined in Definition 11.8.3. Theorem 14.3.6 (Cholak, Downey, and Greenberg [65]). There is a computable order h such that if A is c.e. and h-jump traceable then A is not Martin-Löf cuppable. The relationship between strong jump traceability and K-triviality in the computably enumerable case was clarified by Cholak, Downey, and Greenberg [65], who showed that the s.j.t. c.e. sets form a proper subclass of the K-trivial c.e. sets. Theorem 14.3.7 (Cholak, Downey, and Greenberg [65]). There are a K-trivial c.e. set A and a computable order h such that A is not h-jump traceable. Proof. The construction of a K-trivial set that is not strongly jump traceable came out of a construction of a K-trivial set that is not n-c.e. for any n. The existence of such sets can be shown using the fact that the class of K-trivial sets is closed downwards under Turing reducibility, but let us consider a direct cost function construction of such a set. The basic construction of a K-trivial promptly simple c.e. set can be described as follows.
The eth requirement wishes to show that the set A we construct is not equal to W_e. The requirement is given a capital of 2^{-e}. It appoints a follower x_0, and waits for its realization, that is, for x_0 to enter W_e. If, upon realization, the cost of changing A(x_0) is greater than 2^{-e}, the follower is abandoned, a new one x_1 is picked, and the process repeats itself. Suppose now that we want to ensure that A is not 2-c.e. The eth requirement wants to ensure that A is not 2-c.e. via the eth 2-c.e. approximation Y_e \ Z_e (where both Y_e and Z_e are c.e.). Again the requirement is provided
with 2^{-e} much capital to spend. A naive strategy would be the following. Appoint a follower x_0 and wait for first realization, that is, for x_0 to enter Y_e (and not Z_e, for now). Provided the price is not too high, extract x_0 from A (we begin with A = N) and wait for second realization, that is, for x_0 to enter Z_e. At this point put x_0 back into A. The problem, of course, is that we cannot put x_0 into A unless the cost of changing A(x_0) has remained small. For this strategy to succeed, the follower x_0 needs two “permissions” from the cost function, and the danger is that we spend some capital on the first action (the extraction), but the second action is too expensive and the follower has to be abandoned. The amount we spend on extraction is nonrefundable, though, so this strategy soon runs into trouble. A better strategy is the following. From the initial sum 2^{-e}, we set aside a part (say 2^{-(e+1)}), which is kept for the re-enumeration of a follower and will not be used for extraction. Of the remaining 2^{-(e+1)}, we apportion some amount (say 2^{-(e+2)}) for the sake of extraction of the first follower x_0. If the cost of extraction of x_0 is higher, we then abandon x_0 (at no cost to us) and allot the same amount 2^{-(e+2)} for the extraction of the next follower x_1. Suppose, for example, that we do indeed extract x_1, but when it is realized again and we are ready to re-enumerate it into A, the cost of doing so has risen beyond the amount 2^{-(e+1)} we set aside for this task. We have to abandon x_1, appoint a new follower x_2, and start from the beginning. We did lose an uncompensated 2^{-(e+2)}, though, so we reduce the sum that we may spend on extracting x_2 to 2^{-(e+3)}, and keep going. Between extractions, the sum we may spend on the next extraction is kept constant, so the usual argument shows that some follower eventually gets extracted (assuming that all followers are realized, of course).
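The budget bookkeeping of this better strategy can be tallied explicitly. The following sketch is our own accounting with an illustrative function name: each failed re-enumeration draws at least 2^{-(e+1)} from disjoint parts of a domain of measure less than 1, so there are fewer than 2^{e+1} of them, and while the extraction budget sits at 2^{-m}, at most 2^m followers can be abandoned at first realization before the budget is halved.

```python
def abandonment_bound(e):
    """Crude upper bound on how many followers requirement e can abandon."""
    # Failed re-enumerations: each witnesses cost >= 2^-(e+1) coming from
    # disjoint parts of a measure-<1 domain, so fewer than 2^(e+1) of them.
    second = 2 ** (e + 1)
    # The extraction budget starts at 2^-(e+2) and is halved once per failed
    # re-enumeration; at budget 2^-m, at most 2^m first-realization
    # abandonments occur before the next halving.
    first = sum(2 ** m for m in range(e + 2, e + 2 + second + 1))
    return first + second

print(abandonment_bound(0))   # -> 30
```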
Furthermore, abandoning followers upon second realization can happen only finitely often, because the cost threshold 2^{-(e+1)} for re-enumeration does not change. Each follower is appointed only after the previous one is canceled, and is chosen to be large, so the costs associated with attempts at re-enumeration of different followers are due to completely different parts of the domain of the universal machine having measure at least 2^{-(e+1)}. Thus there cannot be more than 2^{e+1} many failed attempts at re-enumeration. Note that the same reasoning may be applied to the extraction steps. When we abandon a follower at first realization, the next follower is chosen to be large. The acceptable cost 2^{-m} is not changed at that point, so this kind of abandonment cannot happen more than 2^m many times (until we have an abandonment at second realization and decrease the acceptable cost to 2^{-(m+1)}). Putting together this argument with the previous one allows us to compute a bound on the total number of possible abandonments, which will be relevant below. It is straightforward to modify this strategy to satisfy a requirement working toward ensuring that A is not n-c.e. for a given n. Instead of two levels of realization, we will have n many, but the basic idea remains the same. To make A not strongly jump traceable, rather than not n-c.e., we
need to change J^A(x) on some x more than h(x) many times, where h is some computable order we specify in advance (and x is a number for which we can control J^A(x)). To change J^A(x) we change A below the use of the computation that gives the current value of J^A(x). Each time we do so, we can use a different element of A, so A can be made c.e. However, the main feature of the argument outlined above, that the same x receives attention several times, remains, so a similar strategy to the one we discussed will work here. We will define a computable order h shortly. We may assume that |W_e^{[n]}| ≤ h(n) for all n (by stopping elements from entering W_e if their enumeration would cause this condition to be violated). We enumerate a set A and define a partial function p = Φ^A. The requirement R_e is that {W_e^{[n]}}_{n∈ω} is not a trace for p. The requirements act independently of each other. Let us discuss the action of a single requirement R_e. Let T_e be the set of all sequences (k_0, k_1, . . . , k_i) with i < e+2 and k_j < 2^{(e+2)2^j} for each j ≤ i. We think of T_e as a tree. An element σ ∈ T_e is a leaf of T_e if it has length e+2. Let ε_σ = 2^{-(e+2)2^{|σ|}}. Each leaf σ ∈ T_e will correspond to an attempt at meeting R_e. If i < e+2 then ε_{σ↾i} is the amount that we are willing to spend on the (e+2−i)th attack with the follower corresponding to σ, in the sense of the standard (K-triviality) cost function. The tree T_e and the rationals ε_σ were chosen so that

1. ε_λ = 2^{-(e+2)};

2. if σ ∈ T_e is not a leaf, then it has exactly 1/ε_σ many immediate successors on T_e; and

3. if σ ∈ T_e is not a leaf, then ε_σ is the sum of ε_τ over the immediate successors τ of σ on T_e.

Using these facts, we can show by reverse induction on |σ| that for each σ ∈ T_e that is not a leaf, the sum of ε_τ as τ ranges over all extensions of σ on T_e that are not leaves is (e+2−|σ|)ε_σ. Thus the sum of ε_τ as τ ranges over all elements of T_e that are not leaves is (e+2)2^{-(e+2)}. This quantity is the total amount we will allow R_e to spend, so the construction will obey the standard cost function, as Σ_e (e+2)2^{-(e+2)} is finite. We can now define h. Partition N into intervals I_e so that max I_e + 1 = min I_{e+1}, letting the size of I_e be the number of leaves of T_e. Index the elements of I_e as x_σ for leaves σ of T_e. Let h(n) = e+1 for all n ∈ I_e. If not yet satisfied at a stage s, the requirement R_e will have a pointer σ pointing at some leaf of T_e, and will be conducting an attack via x_σ. This attack will have a level i < e+2, which decreases with time, until the attack either is abandoned or fully succeeds when we get to the root. In the beginning, let σ = 0^{e+2}, the leftmost leaf of T_e, and begin an attack with x_σ at level e+1.
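Properties 1–3 and the total of (e+2)·2^{-(e+2)} can be verified with exact rational arithmetic. The sketch below is only a sanity check of this combinatorics (the helper names eps and check_tree are ours); since ε_σ depends only on |σ|, it walks the tree level by level.

```python
from fractions import Fraction

def eps(e, depth):
    # epsilon_sigma = 2^(-(e+2) * 2^|sigma|) depends only on |sigma|
    return Fraction(1, 2 ** ((e + 2) * 2 ** depth))

def check_tree(e):
    assert eps(e, 0) == Fraction(1, 2 ** (e + 2))       # property 1
    total_nonleaf = Fraction(0)
    count = 1                                           # strings at current depth
    for d in range(e + 2):                              # non-leaf depths 0 .. e+1
        succ = 1 / eps(e, d)                            # property 2: 1/eps successors
        assert succ * eps(e, d + 1) == eps(e, d)        # property 3
        total_nonleaf += count * eps(e, d)
        count *= int(succ)
    assert total_nonleaf == (e + 2) * Fraction(1, 2 ** (e + 2))
    return count                                        # number of leaves, i.e. |I_e|

print(check_tree(0))   # -> 64 leaves for e = 0
```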
The following are the instructions for an attack at level i < e+2, at a stage s. Let σ = σ[s]. Recall that the cost of enumerating a number x into A at stage s is c(x, s) = Σ_{x<n≤s} 2^{-K_s(n)}.
1. Define p(x_σ) = Φ^{A_s}(x_σ) = s with use s+1. Wait for s to enter W_e^{[x_σ]}. While waiting, if some other requirement puts a number y ≤ s into A, hence making p(x_σ) undefined, redefine p(x_σ), again with value s and use s+1.

2. Suppose s enters W_e^{[x_σ]} at stage t.
(a) If c(s, t) ≤ ε_{σ↾i}, then put s into A (making p(x_σ) undefined). Leave σ unchanged and attack with it at level i−1 if i > 0. If i = 0 then cease all action for R_e.

(b) If c(s, t) > ε_{σ↾i}, then abandon x_σ. Move one step to the right of σ↾(i+1). That is, if σ = (k_0, . . . , k_{e+1}) then let σ[t+1] = (k_0, . . . , k_{i−1}, k_i + 1, 0, . . . , 0). Begin an attack with this new σ at level e+1.

We must argue that when we redefine σ in step 2(b), we remain within T_e, which amounts to showing that k_i + 1 < 1/ε_{σ↾i}. Let σ∗ = (k_0, . . . , k_{i−1}). We know that for all k ≤ k_i, some attack was made with some string extending σ∗⌢k. Let τ_k be the rightmost string extending σ∗⌢k that was ever used for an attack (so that τ_{k_i} = σ). We know that we attacked with τ_k at level i and this attack was abandoned. Let s_k be the stage at which the attack with τ_k at level i began, and let t_k > s_k be the stage at which this attack was abandoned (so that t_{k_i} = t). Since t_{k−1} ≤ s_k, the intervals (s_k, t_k] are disjoint. At stage t_k, the attack with τ_k was abandoned because c(s_k, t_k) > ε_{σ∗}, so

1 > Σ_n 2^{-K(n)} ≥ Σ_{k≤k_i} Σ_{s_k<n≤t_k} 2^{-K_{t_k}(n)} = Σ_{k≤k_i} c(s_k, t_k) > (k_i + 1)ε_{σ∗},

whence k_i + 1 < 1/ε_{σ∗} as required. We now verify that the above construction works as intended. For each e and each τ ∈ T_e that is not a leaf, there is at most one s enumerated into A because of a successful attack with some σ extending τ at level |τ|. Thus the total cost due to R_e does not exceed (e+2)2^{-(e+2)}, and so the construction obeys the necessary cost function, which ensures that A is K-trivial. Fix e. It cannot be the case that an attack with some x_σ at level 0 succeeds, since then we would have |W_e^{[x_σ]}| > e+1 = h(x_σ). Thus there is some stage s at which we begin an attack with x_{σ[s]} at some level, but s never enters W_e^{[x_σ]}. Then p(x_σ) = s ∉ W_e^{[x_σ]}, so R_e is satisfied.
These results suggest a question: Can K-triviality be characterized via h-jump traceability for some class of orders h? In investigating this question, it is most natural to take the alternate definition that a set A is h-jump traceable if every partial A-computable function has a c.e. trace with bound h. One suggestion was to consider the class of computable orders h such that Σ_n 1/h(n) < ∞. Using the golden run method of Nies discussed in Chapter 11, it is not difficult to show that if a c.e. set A is K-trivial then A is h-jump traceable for all such h. However, work of Ng [296] and Barmpalias, Downey, and Greenberg [20] has shown that there are c.e. sets that are h-jump traceable for all such h but are not K-trivial. It remains open whether some other reasonably natural class of orders may capture K-triviality exactly. One possibility would be the class of all computable orders h such that Σ_n 2^{-h(n)} < ∞. Hölzl, Kräling, and Merkle [183] have shown that A is K-trivial iff A is O(g(n) − K(n))-jump traceable for all Solovay functions g, and that, moreover, there is a single Solovay function f such that A is K-trivial iff A is O(f(n) − K(n))-jump traceable. However, the presence of K in this result might strike some as unsatisfying.
14.4 Strong jump traceability and diamond classes

Recall that for a class S ⊆ 2^ω, we define S^◊ to be the collection of c.e. sets computable from every 1-random element of S. In this section, we prove theorems of Greenberg and Nies [172] and Greenberg, Hirschfeldt, and Nies [169] that show that the class of strongly jump traceable c.e. sets coincides with both (ω-c.e.)^◊ and superlow^◊. This result lends evidence to the claim that strong jump traceability is a natural notion to study in the context of algorithmic randomness, and further demonstrates the usefulness of the notion of diamond classes. As we will see, it also provides a useful tool for the study of superlowness. It is worth noting that, by Theorem 2.19.10 applied to a Π01 class of 1-random sets, for every c.e. set A, there is a low 1-random set that does not compute A, so low^◊ (and hence (Δ02)^◊) contains only the computable sets. We will use a characterization of strong jump traceability in terms of certain special cost functions. Recall the general notion of a cost function from Definition 11.5.1. Theorems 11.5.3 and 11.10.2 say that K-trivial sets have special approximations that change only a few times, and hence have low cost. Greenberg and Nies [172] began with these results as a starting point and studied the cost functions associated with s.j.t. sets. The following definition isolates an important property of the standard cost function c(n, s) = Σ_{n<i≤s} 2^{-K_s(i)}. Definition 14.4.1 (Greenberg and Nies [172]). A monotone cost function c is benign if there is a computable function g : Q^{>0} → N such that for
each positive rational q, if I is a set of pairwise disjoint intervals of natural numbers such that c(n, s) ≥ q for all [n, s) ∈ I, then |I| ≤ g(q).¹ Note that benign cost functions always satisfy the limit condition. For example, the standard cost function c is benign: If I is as in the above definition, then |I| ≤ 1/q, since

q|I| ≤ Σ_{[n,s)∈I} c(n, s) ≤ Σ_{[n,s)∈I} Σ_{n<i≤s} 2^{-K(i)} ≤ Σ_i 2^{-K(i)} ≤ 1.
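One way to see the bound |I| ≤ 1/q concretely: iterate y_0 = 0, y_{k+1} = the least s with c(y_k, s) ≥ q; the number of iterations bounds |I| for any disjoint interval family. The sketch below uses stand-in weights (the names w, cost, and iterations are ours, and w merely plays the role of 2^{-K(i)}; any sequence of total weight at most 1 supports the same counting).

```python
from fractions import Fraction

# Stand-in weights w(i) playing the role of 2^(-K(i)); total weight < 1.
w = [Fraction(1, 2 ** (i + 2)) for i in range(40)]

def cost(n, s):
    # the standard cost function shape: c(n, s) = sum of w(i) for n < i <= s
    return sum(w[i] for i in range(n + 1, s + 1))

def iterations(q):
    """y_0 = 0; y_{k+1} = least s with c(y_k, s) >= q.  The number of
    iterations bounds |I| for any disjoint interval family I with
    c(n, s) >= q on each [n, s)."""
    y, count = 0, 0
    while True:
        s = next((s for s in range(y + 1, len(w)) if cost(y, s) >= q), None)
        if s is None:
            return count
        y, count = s, count + 1

q = Fraction(1, 8)
assert iterations(q) <= 1 / q     # the benign bound g(q) = 1/q
```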
Thus the following result implies Theorem 14.3.1, that every s.j.t. c.e. set is K-trivial. Theorem 14.4.2 (Greenberg and Nies [172]). A c.e. set is strongly jump traceable iff it obeys all benign cost functions. One direction of this theorem follows from the following lemma, since if a c.e. set obeys all benign cost functions then it is K-trivial, and hence jump traceable, by Theorem 11.4.8. Lemma 14.4.3 (Greenberg and Nies [172]). Let A be a jump traceable c.e. set and let h be a computable order with h(0) > 0. There is a benign cost function c such that if A obeys c, then J^A has a c.e. trace bounded by h.² Proof. Let h̄(n) = h(n) − 1 if h(n) > 1, and let h̄(n) = 1 otherwise. Let ψ(n) be the least stage at which J^A(n) converges with an A-correct computation, if there is such a stage. Since A is c.e., tracing J^A(n) is equivalent to tracing the partial function ψ. Since A is jump traceable, there is a c.e. trace S_0, S_1, . . . for ψ bounded by some computable order g. If J^A(n)[s] ↓ with use k, then we say that this computation is certified if there is some t < s such that A_s↾k = A_t↾k and t ∈ S_n[s]. We want to ensure that the cost of all x < k at stage s is at least 1/h̄(n), so let c(x, s) be the maximum of all 1/h̄(n) such that there is some u ≤ s at
which a computation J^A(n)[u] with use greater than x is certified (where this maximum is 0 if there is no such n). Note that c is indeed a monotone cost function. We argue that c is benign. Let q > 0 be a rational and let I be a set of pairwise disjoint intervals of natural numbers such that c(x, s) ≥ q for all [x, s) ∈ I. Let n∗ be such that h̄(n∗) > 1/q. Let [x, s) ∈ I. Then there is
¹ Greenberg and Nies [172] suggested the following way to understand this concept. Let q > 0. Let y_0^q = 0, and if y_k^q is defined and there is some s such that c(y_k^q, s) ≥ q, then let y_{k+1}^q be the least such s. If c satisfies the limit condition lim_n c(n) = 0, then this process has to halt after finitely many iterations. Then c is benign iff there is a computable bound on the number of iterations of this process given q.
² Greenberg and Nies [172] remarked that this lemma can be uniformized to show that for each computable order h, there is a benign cost function c such that for all c.e. sets A obeying c, there is a trace for J^A bounded by h.
some n < n∗ and u ≤ s such that J^A(n)[u] with use k > x is certified. Let t < u witness this certification. Then t ∈ (x, s), since x < k < t < u ≤ s, and so, since the intervals in I are pairwise disjoint and t ∈ S_n, we have |I| ≤ Σ_{n<n∗} g(n). Thus c is benign. Now suppose that A obeys c. Whenever a computation J^A(n) becomes certified, we put its value into T_n; let s_0 < s_1 < · · · be the stages at which we put numbers into T_n. Say that the computation J^{A_{s_i}}(n) gets certified at stage u_i < s_i. Let k_i be the use of that computation. For each i < |T_n| − 1, there is some stage v_i ∈ [u_i, u_{i+1}) such that A_{v_i}↾k_i ≠ A_{v_i+1}↾k_i, since by stage s_{i+1}, the computation J^{A_{s_i}}(n) is injured by some number below k_i entering A. Since c(k_i − 1, v_i) ≥ 1/h̄(n), we have |T_n| − 1 ≤ h̄(n). Thus, by altering finitely many T_n if necessary, we have |T_n| ≤ h(n) for all n.
The more difficult direction of Theorem 14.4.2 is the following lemma, whose proof we only sketch, since the methodology is quite similar to that of the proof of Theorem 14.3.1 that every s.j.t. c.e. set is K-trivial. Lemma 14.4.4 (Greenberg and Nies [172]). For each benign cost function c, there is a computable order h with h(0) > 0 such that for any c.e. set A, if J^A has a c.e. trace bounded by h then A obeys c. Proof sketch. We define an A-partial computable function Ψ^A. Using the recursion theorem (or universal traces, as in [172]), we may assume we have a trace T_0, T_1, . . . with a bound h that is sufficiently slow growing to make the following box promotion construction work, such that T_0, T_1, . . . traces Ψ^A for all but finitely many values. As we have seen, the idea of a box promotion construction is to attempt to certify certain approximations to initial segments of A. For example, in the proof of Theorem 14.3.1, we used certification to decide whether to believe a computation of the form U^A(σ)[s] ↓. Here we use certification to obtain an enumeration of A that obeys c. To attempt to certify A_s↾u, we define Ψ^{A_s}(n)[s] = s with use u for various n's. Certification happens at a later stage t if A_t↾u = A_s↾u and s ∈ T_n[t] for all such n. The degree of certainty that this certification provides us depends on the bound h(n) on the size of the T_n. We know that we cannot make more than h(n) − 1 many mistakes. If h(n) = 1 and A_s↾u is certified at stage t, then we know that in fact A↾u = A_s↾u, but of course h(n) > 1 for almost all n, so we need to develop combinatorial strategies to deal with our mistakes.
To build our enumeration Â_0, Â_1, . . . of A that obeys c, we speed up our given enumeration A_0, A_1, . . . and accept only sufficiently certified approximations. If x ∈ Â_{s+1} \ Â_s and c(x, s) ≥ 2^{-k}, we want Â_s↾(x+1) to have been certified via some n with h(n) ≤ k. Since certification through such an n cannot happen more than k many times, letting x_s be the least number in Â_{s+1} \ Â_s, we will then have Σ_s c(x_s, s) < Σ_k k2^{-k+1} < ∞. The part of the construction dealing with those x's for which c(x, s) ≥ 2^{-k}, call it requirement R_k, may ignore those x's for which c(x, s) ≥ 2^{-(k−1)}, as these need to be certified by even stronger “boxes” T_n. All of these certification processes must work in concert. At a stage s, we have u_1 < u_2 < u_3 < · · · < u_m such that A_s↾u_1 has to be certified with strength 2^{-1} by R_1, while A_s↾u_2 has to be certified with strength 2^{-2} by R_2, and so on. The problem is that Ψ^A(n) may not be traced by T_n for all n; there may be finitely many exceptions. Hence, for each d ∈ N, we have a version of the construction indexed by d, which guesses that Ψ^A(n) is traced by T_n for each n such that h(n) ≥ d. Almost all versions will be successful. To keep the various versions from interacting, each version controls its own infinite collection of boxes T_n. That is, for each n, only one version of the construction may attempt to define Ψ^A(n).
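The bound Σ_k k·2^{-k+1} used for the total cost of the certified enumeration is finite; in fact the full series sums to 4. A quick exact-arithmetic check:

```python
from fractions import Fraction

# Partial sums of sum_{k>=1} k * 2^(-k+1), the bound on the total cost
# paid by the certified enumeration; the full series sums to 4.
total = Fraction(0)
for k in range(1, 60):
    total += k * Fraction(1, 2 ** (k - 1))

assert total < 4
assert Fraction(4) - total < Fraction(1, 2 ** 50)   # the tail is tiny
```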
The idea is that if a meta-box B used by R_k is promoted (by some s that is in T_n for all n ∈ B being found to correspond to an incorrect guess at an initial segment of A, which means that each T_n can now be thought of as a smaller box), then we break B up into several sub-boxes, as in previous constructions. The fact that c is benign, witnessed by a computable bound function g, allows us to set in advance the size of the necessary meta-boxes, thus making h computable. A meta-box for R_k can be broken up at most k many times, so the necessary size for an original R_k meta-box is g(2^{-k})^{k+1}. The details of the construction are quite similar to those in the proof of Theorem 14.3.1, and may be found in [172].
Establishing a relationship between diamond classes and s.j.t. c.e. sets begins with the following basic result. Theorem 14.4.5 (Greenberg and Nies [172]). Let Y be a Δ02 1-random set. Then there is a monotone cost function c satisfying the limit condition such that every c.e. set that obeys c is computable from Y . If Y is ω-c.e., then c can be chosen to be benign.
Proof. Let c(n, 0) = 2^{-n} for all n. For s > 0, let k_s be the least k such that Y_s↾k ≠ Y_{s−1}↾k (which we may assume always exists). Let c(n, s) = max(c(n, s−1), 2^{-k_s}) for all n < s and let c(n, s) = c(n, s−1) for all n ≥ s. It is easy to check that c is a monotone cost function satisfying the limit condition. If there is a computable function g such that |{s : Y_s(m) ≠ Y_{s−1}(m)}| < g(m) for all m, and I is a set of pairwise disjoint intervals of natural numbers such that c(n, s) ≥ 2^{-k} for all [n, s) ∈ I, then for each [n, s) ∈ I with n > k, there is a t ∈ (n, s] such that Y_t↾k ≠ Y_{t−1}↾k, and hence |I| ≤ (k+1) + Σ_{m<k} g(m). Now suppose that t > v, n and Y↾(n−1) = Y_t↾(n−1). Then A↾n = A_t↾n, which implies that A ≤_T Y. Assume otherwise, and let u > t be such that A_u↾n ≠ A_{u−1}↾n. Then A_u↾s_n(u) ≠ A_{u−1}↾s_n(u). Let m be the least such that A_u↾s_m(u) ≠ A_{u−1}↾s_m(u). Then m ≤ n, and we enumerate Y_u↾(m−1) into G. Since u > v, we have Y_t↾(m−1) = Y↾(m−1) = Y_u↾(m−1). But then s_{m−1}(u) > t > n, so A_u↾s_{m−1}(u) ≠ A_{u−1}↾s_{m−1}(u), contradicting the minimality of m. Combining this result with Theorem 14.4.2, we have the following result. (Note that (ω-c.e.)^◊ ⊆ superlow^◊, since every superlow set is ω-c.e.) Corollary 14.4.6 (Greenberg and Nies [172]). Every strongly jump traceable c.e. set is in (ω-c.e.)^◊, and hence in superlow^◊. This result has some powerful computability-theoretic consequences. It can be used to prove results of Ng [297] and Diamondstone [97] whose direct proofs are rather intricate. (These results answer natural questions about superlowness stemming from well-known results on lowness. See [97, 297] for further discussion.) We need the following version of Corollary 2.19.9, whose proof is similar. Lemma 14.4.7 (Greenberg and Nies [172]). Let P be a nonempty Π01 class. For each B there is a Y ∈ P such that (Y ⊕ B)′ ≤_tt B′. We also use the easily proved fact that if C ≤_T D then C′ ≤_tt D′. Theorem 14.4.8 (Ng [297]). There is a c.e. set A such that for every superlow c.e. set B, the join A ⊕ B is also superlow.
Proof. Let A be an s.j.t. c.e. set. If B is superlow, then applying Lemma 14.4.7 to a nonempty Π01 class of 1-random sets, we see that there is a 1-random set Y such that Y′ ≤_tt (Y ⊕ B)′ ≤_tt B′ ≤_tt ∅′. Then Y is superlow, so by Corollary 14.4.6, A ≤_T Y, and hence (A ⊕ B)′ ≤_tt (Y ⊕ B)′ ≤_tt ∅′. Note that the above proof in fact proves the stronger result that if A is any s.j.t. c.e. set and B is any (not necessarily c.e.) superlow set, then A ⊕ B is also superlow. Taking A to be a promptly simple s.j.t. c.e. set (as given by Theorem 14.1.3), we have the following result. Theorem 14.4.9 (Diamondstone [97]). There is a promptly simple c.e. set A such that for every superlow c.e. set B, we have A ⊕ B <_T ∅′.
14.4. Strong jump traceability and diamond classes
695
By Corollary 2.19.9, every nonempty Π01 class has a superlow member, so if A satisfies the hypothesis of Theorem 14.4.12 then it is superlow. Thus, by Theorem 11.4.7, the extra hypothesis that A be jump traceable is not needed if A is c.e. Theorem 14.4.10 follows from Theorem 14.4.12 as follows. Let A ∈ superlow◊. Let X ⊕ Y be superlow and 1-random. Then both X and Y are superlow and 1-random, so A ≤T X, Y. But Y is 1-random relative to X, and hence relative to A, so A is a base for 1-randomness, and hence K-trivial. By Theorem 11.4.8, A is jump traceable, so applying Theorem 14.4.12 to a nonempty Π01 class P all of whose elements are 1-random shows that A is s.j.t. Another corollary to Theorem 14.4.12 comes from the Scott Basis Theorem 2.21.2, which implies that every completion of PA computes a 1-random set. The reduction in the proof of that theorem is clearly a wtt-reduction, so it is also the case that every ω-c.e. completion of PA computes an ω-c.e. 1-random set. Thus we have the following result. Theorem 14.4.13 (Greenberg, Hirschfeldt, and Nies [169]). A c.e. set A is strongly jump traceable iff it is computable from every superlow (or equivalently, every ω-c.e.) completion of PA. To prove Theorem 14.4.12, we will not use the jump traceability of A directly, but instead will use the existence of certain nice approximations to A-partial computable functions. We think of a functional Ψ as a partial computable function p : 2^{<ω} × N → N such that for each n, the set of all σ such that (σ, n) ∈ dom p is prefix-free. We have Ψ^A(n) = m iff p(σ, n) = m for some σ ≺ A, in which case the use ψ^A(n) equals |σ|. An enumeration Ψ0, Ψ1, . . . of Ψ is then simply a sequence of functionals corresponding to uniformly partial computable functions p0 ⊆ p1 ⊆ · · · such that dom p_s ⊆ 2^{≤s} × [0, s] and p = ⋃_s p_s. Definition 14.4.14 (Greenberg, Hirschfeldt, and Nies [169]).
Let (A_s)_{s∈ω} be a computable approximation to a Δ02 set A, and let (Ψ_s)_{s∈ω} be an enumeration of a functional Ψ. We say that the pair ((A_s)_{s∈ω}, (Ψ_s)_{s∈ω}) is a restrained A-approximation of Ψ^A if there is a computable function g such that for each n, the number of stages s such that Ψ^{A_s}_s(n) ↓ and A_{s+1} ↾ ψ^{A_s}_s(n) ≠ A_s ↾ ψ^{A_s}_s(n) is bounded by g(n). Theorem 14.4.15 (Greenberg, Hirschfeldt, and Nies [169]). The following are equivalent. (i) A is superlow and jump traceable. (ii) Each A-partial computable function has a restrained A-approximation. Given this fact, Theorem 14.4.12 follows from the following result. Theorem 14.4.16 (Greenberg, Hirschfeldt, and Nies [169]). Let P be a nonempty Π01 class, and let A be a set such that every A-partial computable
function has a restrained A-approximation. If A is computable from every superlow element of P then A is strongly jump traceable. Rather than prove Theorem 14.4.15 here, we will show that we can obtain Theorem 14.4.10 from Theorem 14.4.16 without it. (See [169] for a proof of Theorem 14.4.16, which uses a characterization of sets that are both superlow and jump traceable due to Cole and Simpson [72].) Lemma 14.4.17. If B is a jump traceable c.e. set, then every B-partial computable function has a restrained B-approximation. Proof. Let θ = Φ^B be a B-partial computable function. Let T0, T1, . . . be a c.e. trace for n ↦ B ↾ ϕ^B(n) with a computable bound h. Define an enumeration of a functional Ψ by letting Ψ^τ_s(n) = Φ^τ_s(n) if τ ≺ B_s and τ ∈ T_n[s]. It is easy to verify that Ψ^B = θ, and there are at most h(n) many stages s such that Ψ^{B_s}_s(y) ↓ and B_{s+1} ↾ ψ^{B_s}_s(y) ≠ B_s ↾ ψ^{B_s}_s(y). Lemma 14.4.18. If A is computable from every superlow 1-random set, then every A-partial computable function has a restrained A-approximation. Proof. Let A be computable from every superlow 1-random set. As noted following Theorem 14.4.10, A is K-trivial. By Theorem 11.5.4, A is computable from some c.e. K-trivial set B. By Theorem 11.4.8, B is jump traceable, so by Lemma 14.4.17, every B-partial computable function has a restrained B-approximation, from which it follows easily that every A-partial computable function has a restrained A-approximation. Thus Theorem 14.4.10 follows from Theorem 14.4.16, which we now prove. Proof of Theorem 14.4.16. Let P be a nonempty Π01 class and let A be computable from every superlow element of P and such that every A-partial computable function has a restrained A-approximation. Fix a computable order h and an A-partial computable function θ. We show that θ has a c.e. trace bounded by n ↦ 2^{h(n)}, which is enough to establish strong jump traceability, since h is arbitrary. (Of course, it is enough to do this for the single case θ = J^A.)
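The representation of a functional Ψ as a partial function p : 2^{<ω} × N → N with prefix-free domains, described before Definition 14.4.14, can be sketched as follows. This is an illustrative sketch only; the dictionary p and its entries are hypothetical.

```python
def prefix_free_for_each_n(p):
    """Check that for each fixed n, no sigma with (sigma, n) in dom p is a
    proper prefix of another such sigma."""
    by_n = {}
    for (sigma, n) in p:
        by_n.setdefault(n, []).append(sigma)
    for sigmas in by_n.values():
        for a in sigmas:
            for b in sigmas:
                if a != b and b.startswith(a):
                    return False
    return True

def apply_functional(p, A, n):
    """Psi^A(n) = m iff p(sigma, n) = m for some sigma a prefix of A; the use
    psi^A(n) is |sigma|.  Returns (value, use), or None if divergent."""
    for (sigma, m), value in p.items():
        if m == n and A.startswith(sigma):
            return value, len(sigma)
    return None

p = {("00", 0): 5, ("01", 0): 7, ("0", 1): 9}
assert prefix_free_for_each_n(p)
assert apply_functional(p, "00110", 0) == (5, 2)
assert apply_functional(p, "0111", 1) == (9, 1)
```

Prefix-freeness per n guarantees that at most one axiom (σ, n) applies along any oracle A, so the value and use are well defined.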
Fix a restrained A-approximation ((As )s∈ω , (Ψs )s∈ω ) to θ. The key concept in this proof is that of a golden pair. Recall that in the proof of the (super)low basis theorem, we build a sequence of Π01 classes Q0 ⊇ Q1 ⊇ · · · by letting Qn+1 = Qn if n ∈ X for all X ∈ Qn and otherwise letting Qn+1 = {X ∈ Qn : n ∈ / X }. Then n Qn contains a single element Z, which is superlow, because to compute Z (n), we need only computably approximate which case of the definition of Qn+1 obtains, which we can do recursively, going through at most 2k many versions of each Qk . We can start this construction by letting Q0 = Q for any nonempty Π01 class Q. Let Qn,s be the stage s guess at Qn , which is defined as follows.
First, let Q0,s = Q. If at stage s we see that n ∈ X for all X ∈ Qn,s [s], then let Qn+1,s = Qn,s . Otherwise, let Qn+1,s = {X ∈ Qn,s : n ∈ / X }. As n mentioned above, the guess Qn,s changes at most 2 many times. Thus, if we can guess at θ(n) in such a way that we make at most one guess per version of Qn,s , we can build a trace for θ with the appropriate bound. The existence of a golden pair is what will allow us to do so. Definition 14.4.19 (Greenberg, Hirschfeldt, and Nies [169]). A pair (Q, Φ), consisting of a nonempty Π01 class Q and a functional Φ, is a golden pair (for Ψ and h) if for almost all n such that ΨA (n) ↓ and all X ∈ Qh(n) , we have ΦX A ψ A (n). Lemma 14.4.20. If there is a golden pair then θ has a c.e. trace bounded by n → 2h(n) . Proof of Lemma 14.4.20. Let Q, Φ be a golden pair. We enumerate T0 , T1 , . . . as follows. If at stage s there is a σ such that Ψσs (n) ↓= k and σ ΦX for every X ∈ Qh(n),s [s], then put k into Tn . The Tn are uniformly c.e. For each different guess Qh(n),s , at most one number k gets enumerated into Tn , so |Tn | 2h(n) . By the definition of golden pair, for almost all n ∈ dom θ, we have A ψ A (n) ΦX for all X ∈ Qh(n) , which implies that if s is sufficiently large, then A ψ A (n) A ΦX for all X ∈ Qh(n),s [s], so θ = ΨAψ (n) (n) ∈ Tn . Thus we can make finitely many changes to the Tn to obtain a c.e. trace for θ with bound n → 2h(n) . So we are left with showing that a golden pair exists. We can think of a golden pair as arising from a failed attempt at building a superlow set Z ∈ P that does not compute A. Such an attempt can be made, starting with any Q ⊆ P , by interspersing the superlowness classes Qn with ones attempting to diagonalize against computations yielding A, that is, classes of the form {X ∈ Qm : ΦX e τ } for some τ ≺ A. For each e and n we have a strategy Sne that monitors whether this class is empty for τ = A ψ A (n) and m = h(n). 
As long as the class looks nonempty, so that it appears that Sne has succeeded in ensuring that ΦZ e = A, we start a new superlow basis construction with this class, but at level e + 1 now. (We think of the procedures at this new level as being called by the procedure Sne .) It cannot be the case that strategies at all levels succeed, since otherwise Z would be a superlow element of P not computing A. The failure at some level gives us a golden pair. We will have to be careful in showing that Z is in fact superlow, since the superlowness requirements will be distributed among constructions corresponding to different levels e. Each Sne works with A ψ A (n), the value of which can change since A is not computable. When this value changes, the strategies at level e + 1 called by Sne must be canceled. This action causes the bound witnessing superlowness to be
more difficult to compute, but we will be able to show there is one by using the fact that ((As )s∈ω , (Ψs )s∈ω ) is a restrained approximation. The construction is a nonuniform argument in the spirit of the golden run method, but with a procedure calling structure of unbounded depth, as in box promotion arguments. For each e, we will have a procedure Re , provided with some Π01 subclass e P of P as input, which attempts to show that (P e , Φe ) is a golden pair. For each n such that ΨA (n) ↓, we will have a subprocedure Sne , which attempts to show that the golden pair condition holds at n, that is, that A e e ΦX e A ψ (n) for all X ∈ Ph(n) . Failing that, Sn will try to give permanent control to the next level e + 1. Construction. We describe the instructions for our procedures. A run of the procedure Re has an input P e and a parameter m. If it has control at stage s, it looks for the least n such that 1. h(n) > m, 2. ΨA s (n) ↓, and 3. it is not the case that a previous run of Sne with input As ψsA (n) has returned and not been subsequently canceled, as defined below. If there is such an n, then Re calls Sne with input As ψsA (n), transferring control to Sne . A run of the procedure Sne is provided with a string τ such that Ψτ (n) ↓. It e acts as follows. First it calls Re+1 with input P e+1 = {X ∈ Ph(n) : ΦX e τ } and parameter h(n). As long as Pe+1 appears to be nonempty, it halts all activity at level e, allowing Re+1 to act. If Pe+1 is ever found to be empty, it cancels the run of Re+1 and any subprocedures that run may have called and returns control to Re . A run of Sne with input τ believes that τ ≺ A and that the current guess e at Ph(n) is correct. So if at any stage after the beginning of this run we find that one of these conditions is not true, then this run, and the run of Re+1 it called, are canceled and, if Sne had not yet returned, control is returned to Re . The construction is started by calling R0 with input P 0 = P and parameter 0. 
End of Construction. A golden run is a run of some Re that is never canceled and such that every Sne called by that run eventually returns or is canceled. Lemma 14.4.21. If there is a golden run of Re with input Q, then (Q, Φe ) is a golden pair. Proof. Let m be the parameter with which the golden run is called. We claim that if h(n) > m then there is a final call of Sne that is never canceled and hence returns. To see that this is the case, assume by induction that
there is a stage t such that no runs of Ske for k < n are ever active after stage t, our guess at Qh(n) has stabilized by stage t, and so has the approximation to A ψ A (n). Any run of Ske for k > n that may be active at stage t will eventually return or be canceled, so eventually a run of Sne that is never canceled will start, and hence return, after which point no further calls to Sne will be made. Since this run returns, the golden pair condition for n holds of (Q, Φe ). Since h(n) > m for almost all n, we see that (Q, Φe ) is a golden pair. Thus it remains to show that there is a golden run of some Re . We first need to do some counting to establish a computable bound on the number of times procedures can be called. Let g be the function witnessing the fact that ((As )s∈ω , (Ψs )s∈ω ) is a restrained approximation. Lemma 14.4.22. For each e, each run of Re calls at most g(n) + 2h(n) + 1 many runs of Sne . Proof. Fix a run of Re with input P e . Suppose that a run of Sne with input τ called by this run is canceled at stage s, but the run of Re is not. Then e e either Ph(n),s = Ph(n),s−1 or τ ≺ As−1 but τ ⊀ As . The first possibility can h(n) many times, and the second can occur at most g(n) occur at most 2 many times, because each time it occurs, our restrained approximation to ΨA (n) changes. A new run of Sne cannot be started until the previous run is canceled, so the lemma follows. Lemma 14.4.23. There is a computable bound N (n) on the number of times procedures Sne are called (over all e). Proof. We calculate by recursion on e and n a bound M (e, n) on the number of times any Re calls a run of Sne . By Lemma 14.4.22, we can let M (e, n) be the product of g(n) + 2h(n) + 1 with a bound on the number of runs e−1 of Re that are called by some Sm with parameter h(m) < h(n). Since h is monotone, the number of runs of Re with parameter less than h(n) is bounded by m
{X ∈ P^e_{h(n)} : Φ^X_e ⊉ τ} for a τ ≺ A (since S^e_n is never canceled). This definition contradicts the fact that Z ∈ P^{e+1}.
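The case-guessing underlying the superlow basis construction, which drives the bound in Lemma 14.4.24 below, can be sketched on a toy approximation. This is a hypothetical finite model: a Π01 class is approximated by a shrinking set of strings of a fixed length L, and the guess at Q_n is determined by the case bits chosen below level n.

```python
def case_bits(Q_s, L):
    """For the stage-s approximation Q_s (a set of strings of length L), the
    case chosen at each level n: 1 if 'n in X for all X in Q_n' appears to
    hold, else 0 (in which case Q_{n+1} keeps only X with n not in X)."""
    Qn, bits = set(Q_s), []
    for n in range(L):
        if Qn and all(x[n] == "1" for x in Qn):
            bits.append(1)                      # Q_{n+1} = Q_n
        else:
            bits.append(0)                      # Q_{n+1} = {X in Q_n : n not in X}
            Qn = {x for x in Qn if x[n] == "0"}
    return tuple(bits)

L = 3
stages = [                     # a shrinking (toy) approximation to the class
    {"000", "010", "110", "111"},
    {"010", "110", "111"},
    {"110", "111"},
]
versions = [case_bits(Q, L) for Q in stages]
# The guess at Q_n is determined by the case bits below level n, so it runs
# through at most 2^n many versions over all stages.
for n in range(L + 1):
    assert len({bits[:n] for bits in versions}) <= 2 ** n
```

The 2^n bound on versions of Q_n is exactly what makes the limit point Z of the construction superlow: the jump of Z can be approximated with a computably bounded number of changes.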
Lemma 14.4.24. Z is superlow. Proof. Let n > 0, and let e be the least number such that the permanent run of Re is started with a parameter greater than n. As mentioned in the proof of Lemma 14.4.23, the parameter of any run of Re is at least e, so such an e exists. e−1 Whether n ∈ Z depends only on Pn+1 , so we can approximate an answer to the question of whether n ∈ Z by tracking, at each stage, the guess at d Pn+1 at that stage, where d is the greatest number such that the current run of Rd has parameter no greater than n. d e The guess at Pn+1 can change because of a call to Sm where h(m) n or because of the approximation inherent in the superlow basis theorem strategy. Thus the number of times this guess can change is at most 2n+1 h(m)n N (m), which is a computable bound, and hence we have an ω-c.e. approximation to Z . As argued above, this lemma completes the proof of the theorem.
14.5 Strong jump traceability and K-triviality: the general case In general, s.j.t. sets are not as well understood as the special case of c.e. s.j.t. sets. It would be quite helpful if the following analog to Corollary 11.5.4 were to be established. Conjecture 14.5.1 (Downey and Greenberg). Every s.j.t. set is computable from some c.e. s.j.t. set. Downey and Greenberg [105] showed that all s.j.t. sets are Δ02 , and have recently improved this result considerably by showing that all s.j.t. sets are K-trivial. This result extends Theorem 14.3.1, but its proof is perhaps simpler. Theorem 14.5.2 (Downey and Greenberg [105]). There is a computable order h such that every h-jump traceable set is K-trivial. Before proving this result, we note that not all orders h suffice. Theorem 14.5.3 (Nies [303]). There are continuum many 22n+1 -jump traceable sets. Proof. We build a trace T0 , T1 , . . . and a sequence of uniformly computable functions fs : 2<ω → 2<ω such that f = lims fs exists. Let f0 (σ) = σ for all σ. At stage s + 1, look for the length-lexicographically least ν such that
J^{f_s(τ)}(|ν|) ↓ ∉ T_{e,s} for some τ ≽ ν with τ ∈ 2^{s+1}. If no such ν exists then let f_{s+1}(σ) = f_s(σ) for all σ and proceed to the next stage. Otherwise, for every ρ, let f_{s+1}(νρ) = τρ. For every σ ⊉ ν, let f_{s+1}(σ) = f_s(σ). Enumerate J^{f_s(τ)}(|ν|) into T_e. It is easy to see that for every σ ∈ 2^e, the value of f_s(σ) changes at most 2^{e+1} − 1 many times, and causes at most that many elements to be put into T_e, whence |T_e| < 2^{2e+1}. Moreover, the image of 2^ω under f has size continuum, and for every member A of this image, we have J^A(e) ↓ ⇒ J^A(e) ∈ T_e for all e. It is not hard to check that, as shown in [303], the above proof can be used to define a perfect Π01 class of 2^{2n+1}-jump traceable sets. It would be interesting to improve the bounds on the levels of jump traceability that hold of continuum many sets, given by this result on the one hand, and the proof of Theorem 14.5.2 below on the other. This question is of course connected with the one mentioned at the end of Section 14.3, as well as with the question of what level of jump traceability is needed to ensure Δ02-ness. Proof of Theorem 14.5.2. We build a computable order h̃ such that if J^A is h̃-traceable then A is K-trivial. By Theorem 11.1.8, there is a computable function g such that Σ_n 2^{−g(n)} < 1 and if K(X ↾ n) ≤ g(n) + O(1), then X is K-trivial. We wish to build a KC set to ensure that K(A ↾ n) ≤ g(n) + O(1). We have no direct access to A, so for each n, we will need to include (g(n), σ) in our KC set for several σ ∈ 2^n. To keep the weight of our set finite, we need to limit the number of such σ. As in previous constructions in this chapter, we use a functional Ψ as a tester. By not believing that a given σ is a possible initial segment of A until it has been sufficiently tested, we will be able to limit the number of σ's we consider. Suppose that Σ_n 2^{−g(n)} were computable. Then there would be a computable order h such that Σ_n h(n)2^{−g(n)} < 1.
We could then let Ψ^σ(|σ|) = σ for all σ (thus testing every σ), take a trace {T_n}_{n∈N} for Ψ^A with bound h, and let our KC set be {(g(n), σ) : σ ∈ T_n}. It is easy to check that this definition would ensure that K(A ↾ n) ≤ g(n) + O(1). Since Σ_n 2^{−g(n)} is merely left-c.e., we will need to deal with approximations to this sum, and have a more complicated testing procedure. Partition N into consecutive intervals I_1, I_2, . . . such that |I_k| = 2^{2^{2k}}. For e > 1 and n ∈ I_k, let h_e(n) = k + e. By Lemma 14.1.2, we can determine a computable order h̃ such that, for any A, from a trace for J^A with bound h̃, we can compute, uniformly in e, a trace for Φ^A_e with bound h_e. Let J^A be h̃-traceable. We show that A is K-trivial. We will define a functional Ψ. By the recursion theorem, we know some e such that Ψ^X = Φ^X_e for all X. Let {T_n}_{n∈N} be a trace for Ψ^A with bound h = h_e, computed from e as described above.
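The weight calculation behind the naive KC strategy above can be checked numerically. This is a sketch with hypothetical choices of g and h, not the functions of the actual construction: it confirms that requesting each string of T_n with weight 2^{−g(n)}, where |T_n| ≤ h(n), keeps the total weight below 1.

```python
import math

def kc_weight(requests):
    """Total weight: sum of 2^{-d} over the requests (d, sigma)."""
    return sum(2.0 ** -d for (d, sigma) in requests)

# Hypothetical choices: g(n) = 2n + 3 gives sum_n 2^{-g(n)} = 1/6 < 1, and the
# order h(n) = n + 1 keeps sum_n h(n) 2^{-g(n)} = 2/9, still below 1.
g = lambda n: 2 * n + 3
h = lambda n: n + 1
bound = sum(h(n) * 2.0 ** -g(n) for n in range(200))
assert bound < 1

# Fill each T_n to its maximum allowed size h(n); the KC weight stays < 1.
requests = [(g(n), f"sigma_{n}_{i}") for n in range(200) for i in range(h(n))]
assert kc_weight(requests) < 1
assert math.isclose(kc_weight(requests), bound)
```

A bounded weight is exactly the hypothesis of the KC (Kraft–Chaitin) Theorem, which then yields a prefix-free machine realizing all the requests.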
Let b be such that |I_k| ≥ 2^{(k+e)2^k + 2k} for all k ≥ b. Our testing procedures will happen at levels k ≥ b. For each k ≥ b, let t^k_0, . . . , t^k_{2^k−1} be the first
2k many numbers in Ik , and let M k be the next 2(k+e)2 many numbers in Ik . The set M k is split into 2k+e many subsets, which are further split into 2k+e many subsets, and so on for 2k many levels. We index these subsets as follows. Let M k (λ) = M k . If ν is a sequence of subsets of [0, k + e) of length less than or equal to 2k and M k (ν) is defined, then split M k (ν) evenly into 2k+e many subsets M k (νB) for B ⊆ [0, k + e). We will use both the tkm and the elements of M k as test inputs. −k −g(l) , then let nkm be the least n such that m2−k l2 If m2−g(l) < . Note that the set of all k and m for which nkm is defined is ln 2 c.e. For each k b and m < 2k , if we find that nkm is defined, then we test every string of length nkm using the “initial test input” tkm . That is, for each k σ ∈ 2nm , we let Ψσ (tkm ) = σ. This step allows us to forget about all but at most |Ttkm | h(tkm ) = k + e many strings of length nkm . We now proceed by recursion on k b, and on m < 2k within each level k, to define which strings of length nkm are (k, m)-approved, and how our testing at level m of M k is conducted. It will be clear from the construction that the collection of (k, m)-approved strings is c.e., uniformly in k and m. k A string σ ∈ 2nm is (k, m)-preapproved if for every l ∈ [b, k] and j < 2l such that either j2−l < m2−k or both l < k and j2−l = m2−k , the string l σ 2nj is (l, j)-approved. By induction, the collection of (k, m)-preapproved strings is c.e., uniformly in k and m. Let σ0 (k, m), σ1 (k, m), . . . be an effective list of the (k, m)-preapproved strings in Ttkm . This list has length at most k + e, since |Ttkm | h(tkm ) = k + e. For a sequence ν = (C0 , . . . , Cm ) of subsets of [0, k + e), a number i is ν-appropriate if (i) σi (k, m) is defined, (ii) i ∈ Cm , and (iii) for all m < m, if σj (k, m ) σi (k, m), then j ∈ / Cm . For all sequences ν as above and all ν-appropriate i, let Ψσi (k,m) (n) = σi (k, m) for all n ∈ M k (ν). 
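The nested splitting of M^k indexed by sequences ν of subsets of [0, k + e) can be sketched for toy parameters (the real sizes 2^{(k+e)2^k} are astronomically large). This sketch is illustrative only; the values of k and e are hypothetical.

```python
from itertools import combinations

def powerset(m):
    """All subsets of [0, m), so 2^m of them, in a fixed order."""
    return [frozenset(c) for r in range(m + 1) for c in combinations(range(m), r)]

def split(block, parts):
    """Split a list evenly into `parts` consecutive pieces."""
    size = len(block) // parts
    return [block[i * size:(i + 1) * size] for i in range(parts)]

k, e = 1, 1                           # toy values: 2^{k+e} = 4 parts, depth 2^k = 2
parts = 2 ** (k + e)
M = list(range(parts ** (2 ** k)))    # stand-in for M^k, evenly divisible 2^k times

def M_of(nu):
    """M^k(nu) for a sequence nu of subsets of [0, k+e): starting from
    M^k(lambda) = M^k, each subset B selects one of the 2^{k+e} even parts."""
    index = {B: i for i, B in enumerate(powerset(k + e))}
    block = M
    for B in nu:
        block = split(block, parts)[index[B]]
    return block

assert len(M_of([])) == 16
assert len(M_of([frozenset()])) == 4
assert len(M_of([frozenset(), frozenset({0, 1})])) == 1
```

Each level of the recursion divides the block size by 2^{k+e}, so after the full depth 2^k the blocks are singletons, which is why |M^k| = 2^{(k+e)2^k} suffices.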
The idea here is that we test σi (k, m) on all inputs in M k except for those where we already tested a substring of σi (k, m). It is easy to see that the above definition does in fact ensure that for all n ∈ M k , the set of all σ for which we define Ψσ (n) = σ is prefix-free, as it needs to be. We say that σi (k, m) is (k, m)-approved if for every n for which we define Ψσi (k,m) (n) = σi (k, m), we have σi (k, m) ∈ Tn . We now define our KC set G. For k b, let Sk be the set of all (k, m)approved strings, over all m < 2k . For k > b and σ ∈ Sk , let σk− be the
longest initial segment of σ of length nk−1 for some l such that nk−1 is l l − defined. For σ ∈ Sb , let σb = λ. For k b and σ ∈ Sk , let Gk (σ) = {(g(|τ |), τ ) : σk− ≺ τ σ}. Let Gk = σ∈Sk Gk (σ) and G = kb Gk . Then G is c.e. We show that the weight of G is bounded. To do so, we redefine G in terms of certain subsets of the Gk . For k b, let Lk be the set of maximal elements of Sk (that is, elements of Sk with no proper extensions in Sk ). Let Ab = Sb . For k > b, let Ak = Lk \ Sk−1 . The following lemma shows that the Ak are small, and hence can be used in bounding the weight of G. Lemma 14.5.4. |Ak | k + e for all k b. k
Proof. For m < 2k , let Cm = {i : σi (k, m) ∈ Ak ∩ 2nm }. Let ν = (C0 , . . . , C2k −1 ). It is easy to check that, since Ak is an antichain, we test every σ ∈ Ak on the inputs in M k (ν). Thus, for every σ ∈ Ak and n ∈ M k (ν), we define Ψσ (n) = σ. But then σ ∈ Tn , since otherwise σ would not be in Sk . Since |Tn | h(n) = k + e for all such n, we have |Ak | k + e. The next two lemmas will allow us to express G in terms of the Ak . Lemma 14.5.5. Let k > b and let σ ∈ Sk \ Lk . Then there are a j ∈ [b, k) and a ρ ∈ Sj such that Gk (σ) ⊆ Gj (ρ). Proof. Let τ be a proper extension of σ in Sk . Let m be such that |σ| = nkm . −k Then |τ | nkm+1 . Let j = # m+1 < j2−(k−1) (m + 1)2−k , 2 $. Then m2 k−1 k−1 so τ nj ∈ Sk−1 and σ τ nj . Let ρ be the shortest extension of σ in Sk−1 , and let j b be least such that ρ ∈ Sj . − Then ρ− j ≺ ρ. Since ρ ∈ Sk−1 , we have ρj ∈ Sk−1 (as otherwise ρ would not have been preapproved at the k − 1 level). By the definition of ρ, we − − have ρ− j ≺ σ, so ρj σk . − Thus ρ− j σk σ ρ, from which it follows that Gk (σ) ⊆ Gj (ρ). Lemma 14.5.6. Let k b and σ ∈ Sk . Then there are a j ∈ [b, k] and a ρ ∈ Aj such that Gk (σ) ⊆ Gj (ρ). Proof. We define sequences of strings and numbers by recursion. Let σ0 = σ and k0 be least such that σ ∈ Sk0 . Suppose we have defined σi and ki so that ki is least such that σi ∈ Ski . If σi ∈ Aki then stop the recursion. Otherwise, by the definition of Aki , we have ki > b and σi ∈ / Lki . Thus,by Lemma 14.5.5, there are a j < ki and a σi+1 ∈ Sj such that Gki (σi ) ⊆ Gj (σi+1 ). Let ki+1 be least such that σi+1 ∈ Ski+1 . It is easy to see that Gj (σi+1 ) ⊆ Gki+1 (σi+1 ), so Gki (σi ) ⊆ Gki+1 (σi+1 ). Let i be largest such that σi is defined. Then σi ∈ Aki and, by induction, Gk (σ) ⊆ Gki (σi ).
Let H_k = ⋃_{σ∈A_k} G_k(σ) and H = ⋃_{k≥b} H_k. Clearly H ⊆ G, and it follows from Lemma 14.5.6 that G ⊆ H, so in fact G = H. We now use this fact to bound the weight of G. Lemma 14.5.7. If σ ∈ A_k then 2^{−g(|σ|)} < 2^{−(k−1)}. Proof. Suppose that 2^{−g(|σ|)} ≥ 2^{−(k−1)}. Then there is an m such that |σ| = n^{k−1}_m. By the definition of the approval process, σ ∈ S_{k−1}, contradicting the definition of A_k. Lemma 14.5.8. Let k > b and σ ∈ A_k. Then the weight of G_k(σ) is bounded by 2^{−(k−2)}. Proof. Let m be such that n^k_m = |σ|. Then Σ_{n<|σ|} 2^{−g(n)} < m2^{−k}. Let l = ⌈m/2⌉. Since σ ∉ S_{k−1}, we must have n^{k−1}_i > n^k_m for all i > l, so |σ^−_k| = n^{k−1}_l. Thus Σ_{n≤|σ^−_k|} 2^{−g(n)} ≥ l2^{−(k−1)}. By the previous lemma, 2^{−g(|σ|)} < 2^{−(k−1)}. So the weight of G_k(σ) is
Σ_{|σ^−_k| < n ≤ |σ|} 2^{−g(n)} = Σ_{n<|σ|} 2^{−g(n)} + 2^{−g(|σ|)} − Σ_{n≤|σ^−_k|} 2^{−g(n)} < m2^{−k} + 2^{−(k−1)} − l2^{−(k−1)} ≤ 2^{−(k−2)}.
Thus the weight of G = H is bounded by the weight of G_b plus Σ_{k>b} |A_k| 2^{−(k−2)} ≤ Σ_{k>b} (k + e) 2^{−(k−2)} < ∞.
So G is a KC set, and hence, if σk− ≺ τ σ for σ in some Sk , then K(τ ) g(|τ |) + O(1). for k b, let mk be the greatest m < 2k such that m2−k < Finally, −g(l) . Let σ(b − 1) = 0. For k b, let σ(k) = A nkmk . It is easy to l2 see that for each k b, we have σ(k) ∈ Sk and σ(k)− k = σ(k − 1). Thus K(A n) g(n) + O(1), and hence A is K-trivial.
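The convergence of the tail sum Σ_{k>b} (k + e) 2^{−(k−2)} in the weight bound above is a routine geometric-series estimate; the following numeric check, with hypothetical values of b and e, confirms it against the closed form.

```python
def tail(b, e, terms=2000):
    """Partial sum of sum_{k>b} (k+e) 2^{-(k-2)}."""
    return sum((k + e) * 2.0 ** -(k - 2) for k in range(b + 1, b + 1 + terms))

# Closed form: sum_{k>b} (k+e) 2^{-k} = (b + e + 2) 2^{-b} (using
# sum_{j>=1} 2^{-j} = 1 and sum_{j>=1} j 2^{-j} = 2), so the tail is
# 4 * (b + e + 2) 2^{-b} = (b + e + 2) 2^{-(b-2)}, finite for every b and e.
for b, e in [(3, 2), (5, 7), (10, 1)]:
    closed = (b + e + 2) * 2.0 ** -(b - 2)
    assert abs(tail(b, e) - closed) < 1e-9
```

The linear factor (k + e) is dominated by the geometric factor 2^{−k}, which is all the finiteness of the weight of G requires.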
15 Ω as an Operator
15.1 Introduction We have already seen that Chaitin’s Ω is a natural example of a 1-random real. We have also seen that, in algorithmic randomness, prefix-free machines are the analogs of partial computable functions, and the measures of the domains of prefix-free machines, that is, left computably enumerable reals, take the role of the computably enumerable sets. In more detail, we have the following: 1. The domains of partial computable functions are exactly the c.e. sets, while the measures of the domains of prefix-free machines are exactly the left-c.e. reals. 2. The canonical example of a noncomputable set is the halting problem ∅ , i.e., the domain of a universal partial computable function. The canonical example of a 1-random real is Ω, the halting probability of a universal prefix-free machine. 3. ∅ is well-defined up to computable permutation, while Ω is welldefined up to Solovay equivalence. So far in this book we have dodged the “machine-dependence bullet” by choosing a fixed standard universal machine when looking at relativized halting probabilities and studying properties that do not depend on that choice. In this chapter we will look at results of Downey, Hirschfeldt, Miller, R.G. Downey and D. Hirschfeldt, Algorithmic Randomness and Complexity, Theory and Applications of Computability, DOI 10.1007/978-0-387-68441-3_15, © Springer Science+Business Media, LLC 2010
and Nies [115] that grapple with versions of Ω as operators. Most of the results and proofs in this chapter are taken from that paper. Relativizing the definition of ∅ gives the jump operator. If A ∈ 2ω , then A is the domain of a universal machine relative to A. Myhill’s Theorem 2.4.12 relativizes, so A is well-defined up to computable permutation. Furthermore, if A ≡T B, then A and B differ by a computable permutation. In particular, the jump is well-defined on the Turing degrees. The jump operator plays an important role in computability theory; it gives a natural, uniform, and degree invariant way to produce, for each A, a c.e. set A with Turing degree strictly above that of A. What happens when the definition of Ω is relativized, by defining it relative to a particular universal prefix-free oracle machine U ? We will see that the Kuˇcera-Slaman Theorem 9.2.3 and Theorem 9.2.2 of Calude, Hertling, Khoussainov, and Wang both relativize (with some care, as we see below). However, if we wish to look at Ω as an analog of the jump, then we might hope that ΩA U is well-defined, not just up to Solovay equivalence relative to A, but even up to Turing degree. Similarly, we might hope for B ΩU to be a degree invariant operator; that is, if A ≡T B then ΩA U ≡ T ΩU . Were this the case, ΩU would provide a counterexample to a long-standing conjecture of Martin: it would induce an operator on the Turing degrees that is neither increasing nor constant on any cone. But as we show in Theorem 15.7.7, there are oracles A =∗ B (i.e., A and B agree except on a B finite set) such that ΩA U and ΩU are not only Turing incomparable, but even relatively random. In particular, by choosing these A and B appropriately, B we can ensure that ΩA U is a left-c.e. real while making ΩU as random as we A like. It follows easily that the Turing degree of ΩU generally depends on the choice of U , and in fact, that the degree of randomness of ΩA U can vary drastically with this choice. 
If U is a universal prefix-free oracle machine, then we call the map taking A to ΩA U an Omega operator . In spite of their failure to be Turing degree invariant, it turns out that Omega operators are rather interesting, and provide the first natural example of interesting c.e. operators that are not CEA. For example, in Section 15.4, we show that the range of an Omega operator has positive measure, and that every 2-random real is in the range of some Omega operator. (This fact is not true of every 1-random real.) In Section 15.5, we prove that A is mapped to a left-c.e. real by some Omega operator iff Ω is 1-random relative to A. (Such A’s are called low for Ω.) In the final section of this chapter, we consider the analytic behavior of Omega operators. We prove that Omega operators are lower semicontinuous but not continuous, and moreover, that they are continuous exactly at the 1-generic reals. We also produce an Omega operator that does not have a closed range. On the other hand, we prove that every non-2-random that is in the closure of the range of an Omega operator is actually in that range. As a consequence, for each U there is an A such that ΩA U = sup(rng ΩU ). In several proofs below, we think of a binary string σ as the rational 0.σ.
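The identification of a string σ with the rational 0.σ, and the stage-by-stage approximation to a halting probability used throughout this chapter, can be sketched as follows. This is an illustrative sketch; the sets of "halted" programs below are hypothetical data, not any actual machine.

```python
from fractions import Fraction

def as_rational(sigma):
    """The binary string sigma read as the dyadic rational 0.sigma."""
    return Fraction(int(sigma, 2), 2 ** len(sigma)) if sigma else Fraction(0)

def omega_approx(halted):
    """sum of 2^{-|sigma|} over the set of programs seen to halt so far;
    the set must be prefix-free for this to be a (partial) halting probability."""
    assert all(a == b or not b.startswith(a) for a in halted for b in halted)
    return sum(Fraction(1, 2 ** len(s)) for s in halted)

assert as_rational("11") == Fraction(3, 4)
assert as_rational("010") == Fraction(1, 4)
assert omega_approx({"0", "10"}) == Fraction(3, 4)   # 1/2 + 1/4
```

As more programs halt, omega_approx only increases, which is the sense in which Ω^A_M is an A-left-c.e. real.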
15.2 Omega operators Let us begin by recalling that a prefix-free oracle machine U is universal if for every prefix-free oracle machine M there is a prefix ρ_M ∈ 2^{<ω}, called a coding string of M in U, such that ∀A ∈ 2^ω ∀σ ∈ 2^{<ω} (U^A(ρ_M σ) = M^A(σ)). Note that this condition is stronger than the requirement that U^A be a universal A-computable prefix-free machine for all A. The fixed standard machine we have been using in this book is of course an example of a universal prefix-free oracle machine. For a prefix-free oracle machine M, let Ω^A_M be the halting probability of M^A. Formally, Ω^A_M = Σ_{M^A(σ)↓} 2^{−|σ|}. This definition gives rise to an operator Ω_M : 2^ω → [0, 1]. As mentioned above, if U is a universal prefix-free oracle machine, then we call Ω_U an Omega operator. It is of course a straightforward relativization of the fact that Ω is 1-random that if Ω_U is an Omega operator then Ω^A_U is 1-random relative to A. The following is a strengthening of this fact. Theorem 15.2.1 (Downey, Hirschfeldt, Miller, and Nies [115]). Let Ω_U be an Omega operator. There is a b such that, for each A, the real Ω^A_U is 1-random relative to A with constant b; that is, ∀n (K^A(Ω^A_U ↾ n) ≥ n − b). Proof. We define a prefix-free oracle machine M as follows. We identify a string τ with the rational 0.τ. For any A ∈ 2^ω and σ ∈ 2^{<ω}, first calculate τ = U^A(σ). Then wait for a stage s such that Ω^A_U[s] ≥ τ − 2^{−|τ|}. If such an s is found, then let M^A(σ) = s + 1. Because M^A(σ) > s, the convergence of M^A(σ) cannot already be taken into account in the calculation of Ω^A_U[s] (by the usual assumption on the stage by stage approximation to U), so, letting ρ be a coding string of M in U, in this case Ω^A_U ≥ Ω^A_U[s] + 2^{−|ρσ|}. Thus, either
Ω^A_U < τ − 2^{−|τ|}
or
Ω^A_U ≥ Ω^A_U[s] + 2^{−|ρσ|} ≥ τ − 2^{−|τ|} + 2^{−|ρσ|}.
Assume for a contradiction that there is an A and an n such that K^A(Ω^A_U ↾ n) < n − |ρ| − 1. Letting σ be a minimal length program for Ω^A_U ↾ n, we have shown that either
Ω^A_U − (Ω^A_U ↾ n) < −2^{−n}
or
Ω^A_U − (Ω^A_U ↾ n) ≥ −2^{−n} + 2^{−|ρσ|} > −2^{−n} + 2^{−n+1} = 2^{−n}.
Neither of these possibilities can happen, so we have a contradiction. Hence, for every A we have ∀n (K A (ΩA U n) n − |ρ| − 1), so we can take b = |ρ| + 1. Let U and b be as in the theorem. It is clear that if we define K A using U , then there is a c such that ∀A ∈ 2ω ∀σ ∈ 2<ω (K(σ) K A (σ) − c), so all reals in the range of ΩU are 1-random with constant b+c. In other words, the range of ΩU is contained in the closed set {X : ∀n (K(X n) n − b − c)}. In particular, every real in the closure of the range of ΩU is 1-random. We will discuss the range of ΩU and its closure in more depth in Section 15.10. Of course, ΩA U is an A-left-c.e. real, and every A-left-c.e. real is com A putable from A , hence ΩA U T A . It is not usually the case that ΩU ≡T A . Indeed, by Theorem 11.7.2, this cannot be the case unless A is K-trivial.1 On the other hand, the fact that Ω ≡T ∅ has a natural relativization in the following simple result. Proposition 15.2.2. ΩA U ⊕ A ≡T A . Proof. It is clear that ΩA U ⊕A T A . For the other direction, define a prefixfree oracle machine M such that M A (0n 1) ↓ iff n ∈ A , for all A ∈ 2ω and n ∈ ω. Assume that U simulates M by the prefix τ ∈ 2<ω . To determine A −(|τ |+n+1) whether n ∈ A , search for a stage s such that ΩA . U − ΩU [s] < 2 A This search can be done computably in ΩU ⊕ A. Note that U A cannot converge on a string of length |τ | + n + 1 after stage s, so n ∈ A iff M A (0n 1) ↓ iff U A (τ 0n 1) ↓ iff U A (τ 0n 1)[s] ↓. Therefore, A T ΩA U ⊕ A.
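The search in the proof of Proposition 15.2.2 can be sketched on a toy machine: once the remaining measure Ω − Ω[s] drops below 2^{−(|τ|+n+1)}, no program of length |τ| + n + 1 can newly halt, so halting of U^A(τ0^n1) is settled by stage s. The halting data below is hypothetical, standing in for an actual enumeration of U^A.

```python
from fractions import Fraction

# Hypothetical halting data: program -> stage at which it halts
# (the set of programs is prefix-free).
halts_at = {"0": 1, "100": 3, "101": 6, "1100": 9}
omega = sum(Fraction(1, 2 ** len(p)) for p in halts_at)

def omega_stage(s):
    """The stage-s approximation to the halting probability."""
    return sum(Fraction(1, 2 ** len(p)) for p, t in halts_at.items() if t <= s)

def settled_by(length):
    """Least stage s with omega - omega_stage(s) < 2^{-length}: after s, no
    program of length <= length halts, since each would add >= 2^{-length}."""
    s = 0
    while omega - omega_stage(s) >= Fraction(1, 2 ** length):
        s += 1
    return s

s = settled_by(3)
assert all(t <= s for p, t in halts_at.items() if len(p) <= 3)
assert halts_at["1100"] > s   # longer programs may still halt after stage s
```

This is why the search is computable in Ω^A_U ⊕ A: the oracle Ω^A_U tells us when the approximation is close enough, and only finitely much of the enumeration then needs to be inspected.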
Recall that B ∈ 2^ω is generalized low (GL_1) if B′ ≤T B ⊕ ∅′.

Theorem 15.2.3 (Nies and Stephan, see [115]). If a Δ^0_2 set A is 1-random relative to B, then B is GL_1.

Proof. Let f(n) = μs ∀t ≥ s (A_t ↾ n = A_s ↾ n), and note that f ≤T ∅′. If Φ^B_e(e) ↓ then let V_e be the basic clopen set [A_s ↾ e + 1] for the least s such that Φ^B_e(e)[s] ↓. Otherwise, let V_e = ∅. Let U_n = ∪_{e≥n} V_e. It is easy to see that {U_n}_{n∈ω} is a Martin-Löf test relative to B. Thus A ∉ ∩_n U_n, so A is in only finitely many V_e's. So for almost all e such that Φ^B_e(e) ↓, we must have f(e) ≥ μs Φ^B_e(e)[s] ↓. Hence B′ ≤T B ⊕ ∅′.

Theorem 15.2.3 implies that the class of low 1-random reals is closed under the action of any Omega operator.

Corollary 15.2.4. For each Δ^0_2 1-random A, the real Ω^A_U is GL_1. If A is also low, then Ω^A_U is low.

¹ The following is a simpler argument: Let A be 1-random. By van Lambalgen's Theorem, A is 1-random relative to Ω^A_U, so A ≰T Ω^A_U.
Proof. Let B = Ω^A_U. Clearly B is 1-random relative to A, so by van Lambalgen's Theorem, A is 1-random relative to B, and Theorem 15.2.3 applies. If A is low, then Ω^A_U is also Δ^0_2, and hence low.
15.3 A-1-random A-left-c.e. reals

We can relativize Solovay reducibility as follows. For A, X, Y ∈ 2^ω, we write Y ≤^A_S X to mean that there are a c and a partial A-computable ϕ : 2^{<ω} → 2^{<ω} such that if q < X, then ϕ(q) ↓ < Y and Y − ϕ(q) < c(X − q). We say that an A-left-c.e. real X is A-Solovay complete if Y ≤^A_S X for every A-left-c.e. real Y. Some basic facts about Solovay reducibility relativize easily, with essentially the same proofs as before. For example:

Theorem 15.3.1. If Y is 1-random relative to A and Y ≤^A_S X, then X is also 1-random relative to A.

The relativization of the Kučera-Slaman Theorem is equally straightforward.

Theorem 15.3.2. If an A-left-c.e. real X is 1-random relative to A then X is A-Solovay complete.

On the other hand, a satisfactory relativization of Theorem 9.2.2, which states that each 1-random left-c.e. real is a halting probability, presents some difficulty. The direct relativization states that if X is an A-left-c.e. real and is A-Solovay complete, then there is an oracle prefix-free machine M such that M^A is universal for A-computable prefix-free machines and X = Ω^A_M. It is by no means clear, though, that we should be able to relativize Theorem 9.2.2 to build a machine that is universal in our strong sense. Nevertheless, the relativized theorem is true.

Theorem 15.3.3 (Downey, Hirschfeldt, Miller, and Nies [115]). Let X be an A-left-c.e. real that is A-Solovay complete. Then there is a universal prefix-free oracle machine U such that X = Ω^A_U.

Proof. Let V be a universal prefix-free oracle machine. Because Ω^A_V is an A-left-c.e. real, we have Ω^A_V ≤^A_S X. Choose n and a functional ϕ taking strings to strings such that 2^n and ϕ^A witness this Solovay reduction. In other words, if q < X is a string, thought of as a rational by equating q with 0.q, then ϕ^A(q) ↓ < Ω^A_V and

Ω^A_V − ϕ^A(q) < 2^n (X − q).   (15.1)

We also require n to be large enough that 2^{−n} ≤ X ≤ 1 − 2^{−n}. (Clearly, no computable real can be A-Solovay complete, so X ≠ 0, 1.) We now define another universal prefix-free oracle machine U. To make U universal, let U^B(0^n σ) = V^B(σ), for all σ ∈ 2^{<ω} and oracles B ∈ 2^ω.
For convenience, we preserve the stage of convergence; i.e., U^B(0^n σ)[t] ↓ iff V^B(σ)[t] ↓. The other strings in the domain of U are used to ensure that Ω^A_U = X. Let ψ be a functional taking numbers to strings (again thought of as rationals) such that {ψ^A(s)}_{s∈ω} is a nondecreasing sequence with limit X. Fix an oracle B. We add strings not extending 0^n to the domain of U in stages. For each s:

1. Compute q_s = ψ^B(s).
2. Compute r_s = ϕ^B(q_s).
3. Search for a t_s such that Ω^B_V[t_s] ≥ r_s.
4. If possible, add enough strings (not extending 0^n) to the domain of U at stage t_s to make Ω^B_U[t_s] = q_s.

Note that (if B ≠ A) this procedure may get stuck in any of the first three steps. In this case, U^B will converge on only finitely many strings not extending 0^n. This completes the construction of U, which is clearly a universal prefix-free oracle machine.

It remains to verify that Ω^A_U = X. By the definition of ψ, we have q_s = ψ^A(s) ↓ < X for each s. Therefore, r_s = ϕ^A(q_s) ↓ < Ω^A_V. So there is a stage t_s such that Ω^A_V[t_s] ≥ r_s. Because q_s < X ≤ 1 − 2^{−n}, there are enough strings available in step (4) to ensure that Ω^A_U[t_s] = q_s. But lim_s q_s = X, so Ω^A_U ≥ X.

Now assume for a contradiction that Ω^A_U > X. Because the strings extending 0^n add at most 2^{−n} ≤ X to Ω^A_U, there must be some s that causes too many strings to be added to the domain of U in step (4). In other words, there is an s such that Ω^A_U[t_s] = q_s and

Ω^A_U[t_s] + 2^{−n} (Ω^A_V − Ω^A_V[t_s]) > X.

So Ω^A_V − Ω^A_V[t_s] > 2^n (X − q_s). But in step (3), we ensured that Ω^A_V[t_s] ≥ r_s = ϕ^A(q_s). Therefore, Ω^A_V − ϕ^A(q_s) > 2^n (X − q_s), contradicting (15.1). Thus Ω^A_U = X.
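The four numbered steps of the construction can be watched on a toy instance. In the Python sketch below, every ingredient is invented for illustration: the target X = 3/4, the monotone approximations ψ and Ω_V[t] (with true value Ω_V = 1/2), the constant n = 2, and the reduction ϕ(q) = Ω_V − 2(X − q), which satisfies the Solovay condition Ω_V − ϕ(q) < 2^n (X − q). The loop mirrors steps 1 to 4, and the measure added outside the 0^n-cone makes the total halting probability track X from below.

```python
from fractions import Fraction

X = Fraction(3, 4)                                    # target A-left-c.e. real
n = 2                                                 # 2^-n <= X <= 1 - 2^-n
psi    = lambda s: X - Fraction(1, 2 ** (s + 1))      # nondecreasing, limit X
omegaV = lambda t: Fraction(1, 2) - Fraction(1, 2 ** (t + 1))  # limit 1/2
phi    = lambda q: Fraction(1, 2) - 2 * (X - q)       # Solovay-reduction witness

added = Fraction(0)  # measure put on strings not extending 0^n
for s in range(1, 30):
    q = psi(s)                     # step 1: current target for Omega_U
    r = phi(q)                     # step 2: required progress of Omega_V
    t = 0                          # step 3: wait until Omega_V[t] >= r
    while omegaV(t) < r:
        t += 1
    # step 4: top up so that Omega_U[t] = added + 2^-n * Omega_V[t] reaches q
    added = max(added, q - omegaV(t) / 2 ** n)

# The final halting probability added + 2^-n * Omega_V converges to X.
print(added + Fraction(1, 2) / 2 ** n)
```

The key point, exactly as in the proof, is that the Solovay inequality keeps step 4 from overshooting: once Ω_V[t] ≥ ϕ(q), the future contribution of the 0^n-cone is at most 2^{−n}(Ω_V − ϕ(q)) < X − q, so the total never exceeds X.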
Combining Theorems 15.2.1, 15.3.1, 15.3.2, and 15.3.3, we have the following corollary. Corollary 15.3.4 (Downey, Hirschfeldt, Miller, and Nies [115]). For A, X ∈ 2ω , the following are equivalent. (i) X is an A-left-c.e. real and 1-random relative to A. (ii) X is an A-left-c.e. real and A-Solovay complete. (iii) X = ΩA U for some universal prefix-free oracle machine U .
15.4 Reals in the range of some Omega operator

We proved in the last section that X is in the range of some Omega operator iff there is an A such that X is both 1-random relative to A and an A-left-c.e. real. What restrictions does this place on X? The impression we have is that Ω is a very special 1-random real, and results such as Stephan's Theorem 8.8.4 that most 1-random reals are computationally feeble suggest that most 1-random reals do not resemble Ω at all. However, we will see that this impression is not true in relativized form. In this section, we show that every 2-random real is 1-random relative to A and an A-left-c.e. real for some A, but that not every 1-random real has this property. Furthermore, we prove that the range of every Omega operator has positive measure. The following theorem should be compared with Theorem 8.21.8, that every 2-random set is CEA.

Theorem 15.4.1 (Downey, Hirschfeldt, Miller, and Nies [115]). If X is 2-random, then there is an A such that X is 1-random relative to A and an A-left-c.e. real.

Proof. Let A = (1 − X + Ω)/2. Then X = 1 − 2A + Ω is an A-left-c.e. real. Because X is 2-random, X is 1-random relative to Ω, so by van Lambalgen's Theorem, Ω is 1-random relative to X. But then A is 1-random relative to X (because clearly, Ω ≡^X_S (1 − X + Ω)/2). Therefore, applying van Lambalgen's theorem again, X is 1-random relative to A.

As was mentioned above, the previous theorem cannot be proved if X is assumed only to be 1-random.

Example 15.4.2 (Downey, Hirschfeldt, Miller, and Nies [115]). 1 − Ω is not in the range of any Omega operator.

Proof. The 1-random real X = 1 − Ω is a right-c.e. real, i.e., the limit of a decreasing computable sequence of rationals. Assume that X is an A-left-c.e. real for some A. Then A computes sequences limiting to X from both sides, and hence X ≤T A. Therefore, X is not 1-random relative to A.
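The algebra behind Theorem 15.4.1 is easy to sanity-check with exact rationals; the sample values of X and Ω below are arbitrary stand-ins, chosen only to exercise the identity.

```python
from fractions import Fraction

# With A = (1 - X + Omega)/2, the identity X = 1 - 2A + Omega recovers X,
# which is why X is A-left-c.e.: 1 - 2A is A-computable and Omega is left-c.e.
X, Omega = Fraction(5, 7), Fraction(1, 3)
A = (1 - X + Omega) / 2
print(1 - 2 * A + Omega == X)  # True
```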
It would not be difficult to prove that 1 − Ω cannot even be in the closure of the range of an Omega operator, but a direct proof is unnecessary because this fact follows from Theorem 15.10.4 below.

We now consider a fixed Omega operator. Let U be an arbitrary universal prefix-free oracle machine. We will use the theorem of Lusin that analytic sets (i.e., projections of Borel sets) are measurable. See Sacks [346] for details.²

² Sacks actually proves that Σ^1_1 classes are measurable. However, every analytic subset of 2^ω is a Σ^1_1 class relative to some oracle, so Lusin's theorem follows by relativization.

We will also use the fact that the image of an analytic set under
any Borel operator, for example Ω_U, is also analytic. For a class S ⊆ 2^ω, we write Ω_U[S] for the image of S under the operator Ω_U.

Theorem 15.4.3 (Downey, Hirschfeldt, Miller, and Nies [115]). The range of Ω_U has positive measure. In fact, if S ⊆ 2^ω is any analytic set whose downward closure under ≤T is all of 2^ω, then μ(Ω_U[S]) > 0.

Proof. Let R = Ω_U[S]. Note that R is an analytic subset of 2^ω and hence is measurable. Assume for a contradiction that μ(R) = 0. In particular, the outer measure of R is zero, which means that there is a nested sequence U_0 ⊇ U_1 ⊇ U_2 ⊇ · · · of open subsets of 2^ω such that for each n we have R ⊆ U_n and μ(U_n) ≤ 2^{−n}. Take a set B ∈ S that codes {U_n}_{n∈ω} in some effective way. Then {U_n}_{n∈ω} is a Martin-Löf test relative to B, which implies that Ω^B_U ∉ ∩_n U_n. But R ⊆ ∩_n U_n, so Ω^B_U ∉ R = Ω_U[S], which is a contradiction. Thus μ(R) > 0.

This theorem implies that many null classes have Ω_U-images with positive measure, for example {A : ∀n (2n ∉ A)}. We finish this section with a simple consequence of Theorem 15.4.3.

Corollary 15.4.4 (Downey, Hirschfeldt, Miller, and Nies [115]). For almost every X, there is an A such that X =∗ Ω^A_U.

Proof. Let S = {X : ∃A (X =∗ Ω^A_U)}. Then S is a Σ^1_1 class, and hence measurable by Lusin's theorem mentioned above, and is closed under =∗. But μ(S) ≥ μ(rng Ω_U) > 0. It follows from Kolmogorov's 0-1 Law that μ(S) = 1.
15.5 Lowness for Ω

We begin this section with the following question: For which oracles A is there a universal prefix-free oracle machine U such that Ω^A_U is a left-c.e. real? We will see that this property holds for almost every A.

Definition 15.5.1 (Nies, Stephan, and Terwijn [308], Downey, Hirschfeldt, Miller, and Nies [115]). A set A is low for Ω if Ω is 1-random relative to A.

It clearly does not matter in this definition which version of Ω we take, since all versions are Solovay equivalent.

Theorem 15.5.2 (Downey, Hirschfeldt, Miller, and Nies [115]). A ∈ 2^ω is low for Ω iff there is a universal prefix-free oracle machine U such that Ω^A_U is a left-c.e. real.

Proof. First assume that there is a universal prefix-free oracle machine U such that X = Ω^A_U is a left-c.e. real. Every 1-random left-c.e. real is Solovay equivalent to Ω, so X ≤S Ω, which means that X ≤^A_S Ω. Both X and Ω are
left-c.e. reals, and hence they are A-left-c.e. reals. Applying Theorem 15.3.1, we see that since X is 1-random relative to A, so is Ω.

For the other direction, assume that A is low for Ω. Then Ω is 1-random relative to A and an A-left-c.e. real, so by Corollary 15.3.4, Ω = Ω^A_U for some universal prefix-free oracle machine U.

It follows from the above proof and Proposition 15.2.2 that if A is low for Ω, then Ω ⊕ A ≡T A′. Therefore, A′ ≡T ∅′ ⊕ A; that is, A is GL_1, a fact which also follows from Theorem 15.2.3. We now show that almost every set is low for Ω.

Theorem 15.5.3 (Nies, Stephan, and Terwijn [308]). A 1-random set A is low for Ω iff A is 2-random.

Proof. Assume that A is 1-random. Since Ω ≡T ∅′, we have that A is 2-random iff A is 1-random relative to Ω iff Ω is 1-random relative to A, where the last equivalence follows from van Lambalgen's Theorem.

More evidence for the ubiquity of low for Ω sets is the following low for Ω basis theorem, which we already met as Theorem 8.7.2, and which is an immediate corollary of Theorem 15.5.2 and Theorem 15.7.1 below.

Theorem 15.5.4 (Downey, Hirschfeldt, Miller, and Nies [115], Reimann and Slaman [327]). Every nonempty Π^0_1 class contains a ∅′-left-c.e. real that is low for Ω.

Every K-trivial set is low for 1-randomness, and hence low for Ω. However, by the previous result applied to the Π^0_1 class of completions of Peano Arithmetic, there are also sets that are low for Ω but neither 1-random nor K-trivial. (Suppose that A is low for Ω and has PA degree. By Theorem 8.8.4 and the fact that no 2-random set can compute ∅′, as argued for instance in the paragraph above Theorem 8.5.3, A cannot be 2-random, and hence, by Theorem 15.5.3, A cannot be 1-random. As mentioned above Theorem 8.8.4, A computes a 1-random set, so by Corollary 11.4.10, A cannot be K-trivial.)
15.6 Weak lowness for K

In this section we introduce the concept of weak lowness for K, and find surprising connections with the previous sections and with 2-randomness, culminating in proofs of two results of Miller: that having infinitely often maximal K-complexity and being 2-random coincide, and that 2-random sets are infinitely often quite trivial as oracles with respect to prefix-free complexity.

Definition 15.6.1 (Miller [277]). We say that A is weakly low for K if there are infinitely many n such that K(n) ≤ K^A(n) + O(1).
15.6.1 Weak lowness for K and lowness for Ω

Like their stronger analogs, lowness for K and lowness for 1-randomness, weak lowness for K and lowness for Ω coincide.

Theorem 15.6.2 (Miller [277]). A is weakly low for K iff A is low for Ω.

Proof. We first prove that if A is weakly low for K then A is low for Ω. This part of the proof follows Miller [277]. We prove the contrapositive.

First, we define families of uniformly c.e. sets {W_σ}_{σ∈2^{<ω}} and {D_σ}_{σ∈2^{<ω}}. Fix σ ∈ 2^{<ω}. Search for the least stage s such that σ ≺ Ω_s. If no such stage is found, then let W_σ and D_σ be empty. Otherwise, for any τ such that U(τ) converges for the first time after stage s, enumerate (|τ|, U(τ)) into D_σ. Also enumerate (|τ|, U(τ)) into W_σ as long as such enumeration preserves the condition that Σ_{(d,n)∈W_σ} 2^{−d} ≤ 2^{−|σ|}. Note that if K_s(n) ≠ K(n), then (K(n), n) ∈ D_σ.

We claim that if σ ≺ Ω, then W_σ = D_σ. It follows from our definition that Σ_{(d,n)∈D_σ} 2^{−d} ≤ Ω − Ω_s. But if σ ≺ Ω then Ω − Ω_s ≤ 2^{−|σ|}, so in this case Σ_{(d,n)∈D_σ} 2^{−d} ≤ 2^{−|σ|}, and hence W_σ = D_σ. The idea is that we have used an approximation of Ω to efficiently approximate all but finitely many values of K(n).

Next, consider the A-c.e. set W = {(d + |τ| − |σ|, n) : U^A(τ) = σ and (d, n) ∈ W_σ}. By the construction of {W_σ}_{σ∈2^{<ω}},

Σ_{(e,n)∈W} 2^{−e} = Σ_{U^A(τ)↓=σ} Σ_{(d,n)∈W_σ} 2^{−d−|τ|+|σ|} ≤ Σ_{U^A(τ)↓} 2^{−|τ|} ≤ 1.
Thus W is a KC set relative to A, so K^A(n) ≤ e + O(1) for all (e, n) ∈ W.

Now, assume that Ω is not 1-random relative to A. Then for any c, there are τ and σ such that U^A(τ) = σ, with |σ| − |τ| ≥ c and σ ≺ Ω. Let s be the least stage such that σ ≺ Ω_s. There is an N such that if n ≥ N, then K_s(n) ≠ K(n) (by the usual conventions on stages, N = s + 1 is sufficient). For all n ≥ N, we have (K(n), n) ∈ W_σ, and hence (K(n) + |τ| − |σ|, n) ∈ W. But this means that K^A(n) ≤ K(n) + |τ| − |σ| + O(1) ≤ K(n) − c + O(1) for all but finitely many n. But c is arbitrary, so A is not weakly low for K.

The other half of Miller's proof is much more difficult, but Bienvenu [personal communication to Downey] observed that it can be replaced with an argument based on the results of Bienvenu and Downey [39] on Solovay functions. Recall that a Solovay function is a computable function f such that Σ_n 2^{−f(n)} < ∞, whence K(n) ≤ f(n) + O(1), and lim inf_n f(n) − K(n) < ∞. Recall also the Bienvenu-Downey characterization of Solovay functions, Theorem 9.4.1: A computable function f is a Solovay function iff Σ_n 2^{−f(n)} is 1-random. We apply this result. Suppose that A is not weakly low for K.
Then lim_n K(n) − K^A(n) = ∞. Let f be computable and such that Ω = Σ_n 2^{−f(n)}. Then f(n) ≥ K(n) − O(1), and hence lim_n f(n) − K^A(n) = ∞. Therefore, f is not an A-Solovay function, and so Theorem 9.4.1 relativized to A implies that Ω = Σ_n 2^{−f(n)} is not 1-random relative to A.

As a consequence, we have the following weakly low for K basis theorem.

Corollary 15.6.3 (Miller [277]). Every nonempty Π^0_1 class has a member that is weakly low for K.

Proof. Given the above theorem, this result follows from Theorem 15.5.4.

The following corollary improves an earlier unpublished result of Miller that 3-random sets are weakly low for K. It follows from Theorems 15.5.3 and 15.6.2.

Corollary 15.6.4 (Nies, Stephan, and Terwijn [308], Miller [277]). A 1-random set is weakly low for K iff it is 2-random.

This is one of several results we have encountered that exhibit various ways in which highly random sets are relatively computationally powerless. These results point to an interesting phenomenon: in many aspects, highly random sets resemble highly nonrandom ones.³ One way to look at this phenomenon is that while highly random sets have high information, in the sense of initial segment complexity, this is "useless information". It would be of great potential interest to develop precise ways to define usefulness of information in the context of algorithmic randomness.
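The summability condition on Solovay functions can be checked numerically for a concrete computable f. The choice below (roughly 2 log₂ n) is our own example of a function with Σ_n 2^{−f(n)} < ∞; it is not claimed to be the particular f with Σ_n 2^{−f(n)} = Ω used in the proof.

```python
def f(n):
    # Roughly 2*log2(n).  For n with k bits there are 2^(k-1) terms 2^(-2k),
    # so the series sums to sum_k 2^(-k-1) = 1/2.
    return 2 * n.bit_length()

partial = 0.0
for n in range(1, 1 << 17):
    partial += 2.0 ** (-f(n))
print(partial)  # partial sums are bounded; the limit for this f is 1/2
```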
15.6.2 Infinitely often strongly K-random sets

We recall from Section 6.11 that A is infinitely often strongly K-random if K(A ↾ n) ≥ n + K(n) − O(1) for infinitely many n, and infinitely often C-random if C(A ↾ n) ≥ n − O(1) for infinitely many n. In that section, we saw that every 3-random set is infinitely often strongly K-random, that every infinitely often strongly K-random set is infinitely often C-random, and that being infinitely often C-random is the same as being 2-random. Thus it was a fundamental question whether being infinitely often C-random and being infinitely often strongly K-random are equivalent. Miller [277] answered this question in the affirmative, using results discussed above. Notice that the coincidence of these notions is in contrast with Solovay's result, Corollary 4.3.8, that there are C-random strings that are not strongly K-random.

³ This phenomenon is not limited to mathematical objects: in music, for example, it is often quite hard to distinguish pieces that make extensive use of chance methods from ones written following highly predetermined rules, as in total serialism.
Theorem 15.6.5 (Miller [277]). If A is 2-random then it is infinitely often strongly K-random.

Proof. Assume that A is 2-random. By Theorems 15.5.3 and 15.6.2, A is weakly low for K. By Corollary 6.6.4, K^A(n) ≤ K(A ↾ n) − n + O(1), so K(A ↾ n) ≥ n + K^A(n) − O(1). Because A is weakly low for K, there are infinitely many n such that K^A(n) ≥ K(n) − O(1). For such n, we have K(A ↾ n) ≥ n + K(n) − O(1), so A is infinitely often strongly K-random.
15.7 When Ω^A is a left-c.e. real

In this section, we consider sets A for which Ω^A_U is a left-c.e. real. Far from this being a rare property, we will show that μ{A : Ω^A_U is a left-c.e. real} > 0 for any fixed universal prefix-free oracle machine U. On the other hand, we will see that only a left-c.e. real can have an Ω_U-preimage with positive measure. So left-c.e. reals clearly play an important role in understanding Ω_U. Their main application here is in our proof that no Omega operator is degree invariant. Recall from the introduction to this chapter that we want to obtain reals A =∗ B such that Ω^A_U is a left-c.e. real while Ω^B_U is 1-random relative to a given (arbitrarily complex) Z. We show that each of these outcomes occurs with positive measure, in Theorems 15.7.4 and 15.7.5, respectively. Theorem 15.7.5 has no obvious connection to left-c.e. reals, but in fact, Theorem 15.7.4 (applied to a modification of the universal machine U) is used to prove it.

Theorem 15.7.1 (Downey, Hirschfeldt, Miller, and Nies [115]). Let M be a prefix-free oracle machine. If P ⊆ 2^ω is a nonempty Π^0_1 class, then there is a ∅′-left-c.e. real A ∈ P such that Ω^A_M = inf{Ω^X_M : X ∈ P}, which is a left-c.e. real.

Proof. Let P ⊆ 2^ω be a nonempty Π^0_1 class and let X = inf{Ω^A_M : A ∈ P}. Note that X is a left-c.e. real because it is the limit of the nondecreasing computable sequence X_s = inf{Ω^A_M[s] : A ∈ P_s}. We will prove that there is an A ∈ P such that Ω^A_M = X. Choose a sequence {B_n}_{n∈ω} such that B_n ∈ P and Ω^{B_n}_M − X ≤ 2^{−n} for each n. By compactness, {B_n}_{n∈ω} has a convergent subsequence {A_n}_{n∈ω}. Note that Ω^{A_n}_M − X ≤ 2^{−n}. Let A = lim_n A_n. Because P is closed, A ∈ P. Therefore, Ω^A_M ≥ X. Assume for a contradiction that Ω^A_M is strictly greater than X. Let m be such that Ω^A_M − X > 2^{−m}. For some s, we have Ω^A_M[s] − X > 2^{−m}. Let k be the use of Ω^A_M[s] (under the usual assumptions on the use of computations, we can take k = s).
In particular, if B ↾ k = A ↾ k, then Ω^B_M[s] = Ω^A_M[s]. Now take n > m large enough so that A_n ↾ k = A ↾ k. Then

2^{−n} ≥ Ω^{A_n}_M − X ≥ Ω^{A_n}_M[s] − X = Ω^A_M[s] − X > 2^{−m} ≥ 2^{−n},

which is a contradiction, proving that Ω^A_M = X.
Finally, we must prove that A can be taken to be a ∅′-left-c.e. real. Let S = {A ∈ P : Ω^A_M = X}. Note that S = {A : ∀s (A ∈ P_s ∧ Ω^A_M[s] ≤ X)}. The fact that X ≤T ∅′ makes S a Π^{0,∅′}_1 class. We proved above that S is nonempty, so A = min(S) is a ∅′-left-c.e. real satisfying the theorem.

We now consider sets X such that Ω^{−1}_U(X) has positive measure.

Lemma 15.7.2 (Downey, Hirschfeldt, Miller, and Nies [115]). Let M be a prefix-free oracle machine. If μ{A : Ω^A_M = X} > 0 then X is a left-c.e. real.

Proof. By the Lebesgue Density Theorem, there is a σ such that μ{A ≻ σ : Ω^A_M = X} > 2^{−|σ|−1}. In other words, Ω_M maps more than half of the extensions of σ to X. So X is the limit of the nondecreasing computable sequence {X_s}_{s∈ω}, where for each s, we let X_s be the largest rational such that μ{A ≻ σ : Ω^A_M[s] ≥ X_s} > 2^{−|σ|−1}.

For X ∈ 2^ω, let m_U(X) = μ({A : Ω^A_U = X}). Define the spectrum of Ω_U to be Spec(Ω_U) = {X : m_U(X) > 0}. By the lemma, the spectrum is a set of 1-random left-c.e. reals. We prove that it is nonempty. We use the fact that every 2-random set is weakly 2-random, and hence cannot be contained in a null Π^0_2 class.

Lemma 15.7.3 (Downey, Hirschfeldt, Miller, and Nies [115]). m_U(X) > 0 iff there is a 1-random A such that Ω^A_U = X.

Proof. If m_U(X) > 0, then there is clearly a 1-random A such that Ω^A_U = X, as there are measure 1 many 1-randoms. For the other direction, assume that A is 1-random and Ω^A_U = X. By van Lambalgen's Theorem, the fact that X is 1-random relative to A implies that A is 1-random relative to X. But X ≡T ∅′, because X is a 1-random left-c.e. real, so A is 2-random. But {B : Ω^B_U = X} is a Π^0_2 class containing this 2-random set, and hence cannot be null. Thus m_U(X) > 0.

Theorem 15.7.4 (Downey, Hirschfeldt, Miller, and Nies [115]). Spec(Ω_U) is nonempty.

Proof. Apply Theorem 15.7.1 to a nonempty Π^0_1 class containing only 1-random sets to obtain a 1-random A such that X = Ω^A_U is a left-c.e. real. By Lemma 15.7.3, X ∈ Spec(Ω_U).
We have proved that Ω_U maps a set of positive measure to the left-c.e. reals. One might speculate that almost every set is mapped to a left-c.e. real. We now prove that this is not the case (although we have seen that almost every set, and in particular every 2-random set, can be mapped to a left-c.e. real by some Omega operator).

Theorem 15.7.5 (Downey, Hirschfeldt, Miller, and Nies [115]). There is an ε > 0 such that μ({B : Ω^B_U is 1-random relative to Z}) ≥ ε for all Z.
Proof. It is easy to define a prefix-free oracle machine M such that Ω^B_M = B for every B. Define a universal prefix-free oracle machine V by V^B(0σ) = U^B(σ) and V^B(1σ) = M^B(σ) for all σ. Then Ω^B_V = (Ω^B_U + B)/2. Apply Theorem 15.7.4 to V to get a left-c.e. real X such that S = {B : Ω^B_V = X} has positive measure. Let ε = μ(S). Now fix Z. We may assume without loss of generality that Z ≥T ∅′. Let B ∈ S be 1-random relative to Z. Then Ω^B_U = 2Ω^B_V − B = 2X − B must also be 1-random relative to Z, because X ≤T Z. Therefore,
μ({B ∈ S : Ω^B_U is 1-random relative to Z}) ≥ μ({B ∈ S : B is 1-random relative to Z}) = μ(S) = ε,

since there are measure 1 many sets that are 1-random relative to Z.⁴

These results tell us that the Σ^0_3 class of sets A such that Ω^A_U is a left-c.e. real has intermediate measure.

Corollary 15.7.6 (Downey, Hirschfeldt, Miller, and Nies [115]). 0 < μ({A : Ω^A_U is a left-c.e. real}) < 1.

The most important consequence of the work in this section is the following resoundingly negative answer to the question of whether Ω_U is degree invariant.

Theorem 15.7.7 (Downey, Hirschfeldt, Miller, and Nies [115]).

(i) For all Z, there are A =∗ B such that Ω^A_U is a left-c.e. real and Ω^B_U is 1-random relative to Z.

(ii) There are A =∗ B such that Ω^A_U |T Ω^B_U (and, in fact, Ω^A_U and Ω^B_U are 1-random relative to each other).
Proof. (i) Let S = {A : Ω^A_U is a left-c.e. real} and R = {B : Ω^B_U is 1-random relative to Z}. By Theorems 15.7.4 and 15.7.5, respectively, both classes have positive measure. Let R̃ = {A : ∃B ∈ R (A =∗ B)}. By Kolmogorov's 0-1 Law, μ(R̃) = 1. Hence, there is an A ∈ S ∩ R̃.

(ii) By part (i), there are A =∗ B such that Ω^A_U is a left-c.e. real and Ω^B_U is 2-random. Then Ω^B_U is 1-random relative to Ω^A_U and, by van Lambalgen's Theorem, Ω^A_U is 1-random relative to Ω^B_U. Thus Ω^A_U |T Ω^B_U.

The following is another corollary of Theorem 15.7.1.

Corollary 15.7.8 (Downey, Hirschfeldt, Miller, and Nies [115]). There is a properly Σ^0_2 set A such that Ω^A_U is a left-c.e. real.

⁴ This simple construction shows more. Because Ω^B_U = 2X − B for B ∈ S, we know that μ({Ω^B_U : B ∈ S}) = μ({2X − B : B ∈ S}) = μ(S) > 0. Therefore, the range of Ω_U has a subset with positive measure. While this fact follows from the most basic case of Theorem 15.4.3, this new proof does not resort to Lusin's theorem on the measurability of analytic sets.
We close the section with two further observations on the spectrum.

Proposition 15.7.9 (Downey, Hirschfeldt, Miller, and Nies [115]). We have sup(rng Ω_U) = sup{Ω^A_U : A is 1-random} = sup(Spec(Ω_U)).

Proof. Let X = sup(rng Ω_U). Given a rational q < X, choose σ such that Ω^σ_U ≥ q. By the same proof as Theorem 15.7.4, there is a 1-random A ≻ σ such that Ω^A_U is a left-c.e. real. Then Ω^A_U ≥ Ω^σ_U ≥ q, and Ω^A_U ∈ Spec(Ω_U) by Lemma 15.7.3.

Proposition 15.7.10 (Downey, Hirschfeldt, Miller, and Nies [115]). If p < q are rationals and C = {A : Ω^A_U ∈ [p, q]} has positive measure, then Spec(Ω_U) ∩ [p, q] ≠ ∅.

Proof. Note that C is the countable union of [σ] ∩ C over all σ such that Ω^σ_U ≥ p. Because μ(C) > 0, for some such σ we have μ([σ] ∩ C) > 0. But [σ] ∩ C = {A ≻ σ : Ω^A_U ≤ q} is a Π^0_1 class. Let R ⊂ 2^ω be a Π^0_1 class containing only 1-randoms with μ(R) > 1 − μ([σ] ∩ C). Then R ∩ [σ] ∩ C is a nonempty Π^0_1 class containing only 1-randoms. Applying Theorem 15.7.1 to this class, there is a 1-random A ∈ C such that X = Ω^A_U is a left-c.e. real. Then X ∈ Spec(Ω_U) ∩ [p, q], by Lemma 15.7.3 and the definition of C.
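The way Lemma 15.7.2 extracts a computable approximation from the Lebesgue Density Theorem can be mimicked in a finite toy model. Below, the stage-s values of Ω^A_M on length-3 oracle prefixes are invented; with σ the empty string, the density threshold 2^{−|σ|−1} is 1/2, and the sketch returns the largest value q such that μ{A : Ω^A_M[s] ≥ q} exceeds it.

```python
from fractions import Fraction

# Toy model: at "stage s" the operator depends only on the first 3 oracle
# bits; these values of Omega_M^A[s] are made up for illustration.
stage_values = {
    '000': Fraction(1, 2), '001': Fraction(1, 2),
    '010': Fraction(3, 4), '011': Fraction(1, 4),
    '100': Fraction(1, 8), '101': Fraction(5, 8),
    '110': Fraction(7, 8), '111': Fraction(3, 8),
}

def measure_at_least(q):
    # mu{A : Omega^A[s] >= q}; each length-3 cylinder has measure 1/8.
    return sum(Fraction(1, 8) for v in stage_values.values() if v >= q)

def best_lower_bound(threshold):
    # Largest occurring value q with mu{A : Omega^A[s] >= q} > threshold.
    candidates = sorted(set(stage_values.values()))
    return max(q for q in candidates if measure_at_least(q) > threshold)

print(best_lower_bound(Fraction(1, 2)))  # 1/2 for this table
```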
15.8 Ω^A for K-trivial A

In previous sections, we considered sets that can be mapped to left-c.e. reals by some Omega operator. We now look at sets A such that Ω^A_U is a left-c.e. real for every universal prefix-free oracle machine U. We will see that these are exactly the K-trivials.

Recall Lemma 11.5.5, which states that if A is K-trivial and M is a prefix-free oracle machine, then there is a computable sequence of stages v(0) < v(1) < · · · such that

Σ { c(x, r) : x is minimal s.t. A_{v(r+1)}(x) ≠ A_{v(r+2)}(x) } < ∞,   (15.2)

where

c(x, r) = Σ {2^{−|σ|} : M^A(σ)[v(r + 1)] ↓ ∧ x < use(M^A(σ)[v(r + 1)]) ≤ v(r)}.

Informally, c(x, r) is the maximum amount that Ω^A_M[v(r + 1)] can decrease because of an A(x) change after stage v(r + 1), provided we count only the M^A(σ) computations with use ≤ v(r).

Theorem 15.8.1 (Downey, Hirschfeldt, Miller, and Nies [115]). Let U be a universal prefix-free oracle machine. The following are equivalent.

(i) A is K-trivial.
(ii) A is Δ^0_2 and Ω^A_U is a left-c.e. real.

(iii) A′ ≤T Ω^A_U.

(iv) A′ ≡T Ω^A_U.

Proof. (ii) ⇒ (iii) follows from the fact that each 1-random left-c.e. real is Turing complete. (iii) ⇒ (i) follows from Theorem 11.7.2, since (iii) implies that A is a base for 1-randomness. (iii) ⇔ (iv) by Proposition 15.2.2. So we are left with showing that (i) ⇒ (ii).

Let A be K-trivial. By Theorem 11.1.1, A is Δ^0_2. We show that there is an r_0 ∈ N and an effective sequence {q_r}_{r∈ω} of rationals such that Ω^A_U = sup_{r≥r_0} q_r, and hence Ω^A_U is a left-c.e. real. Applying Lemma 11.5.5 to U, we obtain a computable sequence of stages v(0) < v(1) < · · · such that (15.2) holds. The desired sequence of rationals is defined by letting

q_r = Σ {2^{−|σ|} : U^A(σ)[v(r + 1)] ↓ ∧ use(U^A(σ)[v(r + 1)]) ≤ v(r)}.

Thus q_r measures the computations existing at stage v(r + 1) whose use is at most v(r). We define r_0 below; first we verify that Ω^A_U ≤ sup_{r≥r_0} q_r for any r_0. Given σ_0, . . . , σ_m ∈ dom(U^A), choose r_1 so that each computation U^A(σ_i) has settled by stage v(r_1), with use ≤ v(r_1). If r ≥ r_1, then q_r ≥ Σ_{i≤m} 2^{−|σ_i|}. Therefore, Ω^A_U ≤ lim sup_r q_r ≤ sup_{r≥r_0} q_r.

Now define a Solovay test {I_r}_{r∈ω} as follows: if x is minimal such that A_{v(r+1)}(x) ≠ A_{v(r+2)}(x), then let

I_r = [q_r − c(x, r), q_r].

Then Σ_r |I_r| is finite by (15.2), so {I_r}_{r∈ω} is indeed a Solovay test. Also note that, by the comment after the lemma, min I_r ≤ max I_{r+1} for each r. Since Ω^A_U is 1-random, there is an r_0 such that Ω^A_U ∉ I_r for all r ≥ r_0. We show that q_r ≤ Ω^A_U for each r ≥ r_0. Fix r ≥ r_0. Let t ≥ r be the first non-deficiency stage for the enumeration t ↦ A_{v(t+1)}. That is, if x is minimal such that A_{v(t+1)}(x) ≠ A_{v(t+2)}(x), then ∀t′ ≥ t ∀y < x (A_{v(t′+1)}(y) = A_{v(t+1)}(y)). The quantity q_t − c(x, t) measures the computations U^A(σ)[v(t + 1)] with use ≤ x. These computations are stable from v(t + 1) on, so Ω^A_U ≥ min I_t. Now Ω^A_U ∉ I_u for u ≥ r_0 and min I_u ≤ max I_{u+1} for any u. Applying this fact to u = t − 1, . . . , u = r, we see that Ω^A_U ≥ max I_r = q_r. Therefore, Ω^A_U ≥ sup_{r≥r_0} q_r.

One consequence of this theorem is that all Omega operators are degree invariant on the K-trivials. The next example shows that they need not be degree invariant anywhere else.

Example 15.8.2 (Downey, Hirschfeldt, Miller, and Nies [115]). There is an Omega operator that is degree invariant only on K-trivials.
Proof. Let M be a prefix-free oracle machine such that Ω^A_M = A if A(0) = 0, and Ω^A_M = 0 if A(0) = 1. For any A, define Ā by Ā(n) = A(n) iff n ≠ 0. Define a universal prefix-free oracle machine V as follows. For all σ, let V^A(00σ) = U^A(σ), let V^A(01σ) = U^{Ā}(σ), and let V^A(1σ) = M^A(σ). Then |Ω^A_V − Ω^{Ā}_V| = A/2 for all A (with A(0) = 0; otherwise the difference is Ā/2). Assume that Ω^{Ā}_V ≤T Ω^A_V. Then A ≤T Ω^A_V, so A is a base for 1-randomness and hence K-trivial by Theorem 11.7.2. If Ω^A_V ≤T Ω^{Ā}_V, then again A is K-trivial. Therefore, if A is not K-trivial, then Ω^A_V |T Ω^{Ā}_V.

The following corollary summarizes Theorem 15.8.1 and Example 15.8.2.

Corollary 15.8.3 (Downey, Hirschfeldt, Miller, and Nies [115]). The following are equivalent.

(i) A is K-trivial.

(ii) Every Omega operator takes A to a left-c.e. real.

(iii) Every Omega operator is degree invariant on the Turing degree of A.

We have seen in Theorem 15.7.7 that no Omega operator is degree invariant. We have also seen that if A is not K-trivial, then there are Omega operators that are not invariant on the Turing degree of A. Can these two results be combined?

Open Question 15.8.4 (Downey, Hirschfeldt, Miller, and Nies [115]). For a universal prefix-free oracle machine U and an A that is not K-trivial, is there necessarily a B ≡T A such that Ω^B_U ≢T Ω^A_U?

Finally, the following is a simple but interesting consequence of Example 15.8.2.

Corollary 15.8.5 (Downey, Hirschfeldt, Miller, and Nies [115]). Every K-trivial is a left-d.c.e. real.

Proof. Let V be the machine from the example. Assume that A is K-trivial and let Ā be as in the example. Then Ω^A_V and Ω^{Ā}_V are both left-c.e. reals by Theorem 15.8.1. Therefore, A = 2|Ω^A_V − Ω^{Ā}_V| is a left-d.c.e. real.

By Theorem 5.4.7, the left-d.c.e. reals form a real closed field. The corollary gives us a nontrivial real closed subfield: the K-trivial reals. To see this, recall that in Chapter 11 we have seen that the K-trivials form an ideal in the Turing degrees.
Because a zero of an odd degree polynomial can be computed relative to its coefficients, the K-trivial reals also form a real closed field.
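Part of the field structure of the left-d.c.e. reals (closure under addition and negation) is visible directly at the level of approximations. The sketch below is our own minimal modeling, not machinery from the text: a left-d.c.e. real is presented as α − β for two nondecreasing rational approximation functions.

```python
from fractions import Fraction

class LeftDCE:
    """A left-d.c.e. real presented as alpha - beta, with alpha, beta
    nondecreasing functions from stages to rationals."""
    def __init__(self, alpha, beta):
        self.alpha, self.beta = alpha, beta
    def approx(self, s):
        return self.alpha(s) - self.beta(s)
    def __add__(self, other):
        # (a1 - b1) + (a2 - b2) = (a1 + a2) - (b1 + b2), again left-d.c.e.
        return LeftDCE(lambda s: self.alpha(s) + other.alpha(s),
                       lambda s: self.beta(s) + other.beta(s))
    def __neg__(self):
        # -(a - b) = b - a, so negation just swaps the two approximations.
        return LeftDCE(self.beta, self.alpha)

half  = LeftDCE(lambda s: Fraction(1, 2) - Fraction(1, 2 ** (s + 1)),
                lambda s: Fraction(0))
third = LeftDCE(lambda s: Fraction(1, 3) - Fraction(1, 3 ** (s + 1)),
                lambda s: Fraction(0))
x = half + (-third)            # approximates 1/2 - 1/3 = 1/6
print(x.approx(20))
```

Closure under multiplication, needed for the full field structure, is more delicate and is not modeled here.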
15.9 K-triviality and left-d.c.e. reals

Having mentioned left-d.c.e. reals in the previous section, we now show that they can be used to give a rather surprising characterization of K-triviality, one that does not mention randomness or initial segment complexity in any obvious way.

Definition 15.9.1. A is low for left-d.c.e. reals if the class of reals that are left-d.c.e. relative to A coincides with the class of left-d.c.e. reals.

Theorem 15.9.2 (Miller [unpublished]). A is K-trivial iff A is low for left-d.c.e. reals.

Proof. First suppose that A is low for left-d.c.e. reals. Then Ω^A is left-d.c.e. and 1-random, so by Rettinger's Theorem 9.2.4, either Ω^A or 1 − Ω^A is a left-c.e. real. But 1 − Ω^A cannot be a left-c.e. real, since that would imply that Ω^A is A-computable. So Ω^A is a left-c.e. real, and hence Ω^A ≡T ∅′. But A must be a left-d.c.e. real, since it is low for left-d.c.e. reals and of course is a left-d.c.e. real relative to itself. So A is Δ^0_2, and hence A ≤T Ω^A. By Theorem 15.8.1, A is K-trivial.

Now suppose that A is K-trivial. Let α be a left-c.e. real relative to A. Then Ω^A + α is a left-c.e. real relative to A and is 1-random, so it is a version of Ω relative to A. Thus, by Theorem 15.8.1, Ω^A + α is an unrelativized version of Ω, and so is a left-c.e. real. Now let δ = α − β be a left-d.c.e. real relative to A. Then δ = (Ω^A + α) − (Ω^A + β), and hence δ is a left-d.c.e. real.
15.10 Analytic behavior of Omega operators

In this section, we examine Omega operators from the perspective of analysis. Given a universal prefix-free oracle machine U, we consider two questions:

1. To what extent is Ω_U continuous?
2. How complex is the range of Ω_U?

To answer the first question, we show that Ω_U is lower semicontinuous but not continuous. Furthermore, we prove that it is continuous exactly at 1-generic reals. Together with the semicontinuity, this fact implies that Ω_U can achieve its supremum only at a 1-generic. But must Ω_U actually achieve its supremum? This question relates to the second question above. We write S^c for the closure of S. Theorem 15.10.4 states that any real in rng(Ω_U)^c \ rng(Ω_U) must be 2-random. Because X = sup(rng Ω_U) is a left-c.e. real, and hence not 2-random, there is an A such that Ω_U^A = X.

It is natural to ask whether rng(Ω_U) is closed. In other words, is Theorem 15.10.4 vacuous? Example 15.10.6 demonstrates that for some choice
of U, the range of Ω_U is not closed, and indeed, μ(rng(Ω_U)) < μ(rng(Ω_U)^c). Whether this is the case for all universal prefix-free oracle machines is open. Furthermore, we know of no nontrivial upper bound on the complexity of rng(Ω_U), but we do show below that rng(Ω_U)^c is a Π^0_3 class.

A function f : R → R is lower semicontinuous if {x : f(x) > a} is open for every a ∈ R. We show that for any prefix-free oracle machine M, the function Ω_M is lower semicontinuous. Note that for any A,

∀δ > 0 ∃m (Ω_M^A − Ω_M^{A↾m} ≤ δ),   (15.3)

and hence ∀X ≻ A↾m (Ω_M^A − Ω_M^X ≤ δ).
Proposition 15.10.1 (Downey, Hirschfeldt, Miller, and Nies [115]). Ω_M is lower semicontinuous for every prefix-free oracle machine M.

Proof. Let a ∈ R and let A be such that Ω_M^A > a. Choose a real δ > 0 such that Ω_M^A − δ > a. By the observation above, there is an m such that X ≻ A↾m implies that Ω_M^A − Ω_M^X ≤ δ. Therefore, Ω_M^X ≥ Ω_M^A − δ > a, so {X : X ≻ A↾m} is an open neighborhood of A contained in {X : Ω_M^X > a}. But A is an arbitrary element of {X : Ω_M^X > a}, proving that this set is open.

We next prove that Omega operators are not continuous, and characterize their points of continuity. An open set S ⊆ 2^ω is dense along A if each initial segment of A has an extension in S. It is easy to see that A is 1-generic iff A is in every Σ^0_1 class that is dense along A. We prove that Ω_U is continuous exactly on the 1-generics, for any universal prefix-free oracle machine U.

Theorem 15.10.2 (Downey, Hirschfeldt, Miller, and Nies [115]). The following are equivalent.
(i) A is 1-generic.
(ii) If M is a prefix-free oracle machine, then Ω_M is continuous at A.
(iii) There is a universal prefix-free oracle machine U such that Ω_U is continuous at A.

Proof. (i) ⇒ (ii). Let M be any prefix-free oracle machine. By (15.3), it suffices to show that

∀ε ∃n ∀X ≻ A↾n (Ω_M^X ≤ Ω_M^A + ε).

Suppose this condition fails for a rational ε. Take a rational r < Ω_M^A such that Ω_M^A − r < ε. The following Σ^0_1 class is dense along A:

S = {B : ∃n (Ω_M^B[n] ≥ r + ε)}.

Thus A ∈ S. But then Ω_M^A ≥ r + ε > Ω_M^A, which is a contradiction.

(ii) ⇒ (iii) is trivial.
(iii) ⇒ (i). Fix a universal prefix-free oracle machine U. We assume that A is not 1-generic and show that there is an ε > 0 such that

∀n ∃B ≻ A↾n (Ω_U^B ≥ Ω_U^A + ε).   (15.4)

Take a Σ^0_1 class S that is dense along A but such that A ∉ S. Define a prefix-free oracle machine L^X as follows. When (some initial segment of) X enters S, then L^X converges on the empty string. Thus L^A is nowhere defined. Let c be the length of a coding string for L in U. We prove that ε = 2^{−(c+1)} satisfies (15.4). Choose m as in (15.3) for the given universal machine, where δ = 2^{−(c+1)}. For each n ≥ m, choose B ≻ A↾n such that B ∈ S. Since L^B converges on the empty string, Ω_U^B ≥ Ω_U^A − 2^{−(c+1)} + 2^{−c} = Ω_U^A + ε.

Let U be a universal prefix-free oracle machine.

Corollary 15.10.3 (Downey, Hirschfeldt, Miller, and Nies [115]). If Ω_U^A = sup(rng Ω_U), then A is 1-generic.

Proof. By the previous theorem, it suffices to prove that Ω_U is continuous at A. But note that the lower semicontinuity of Ω_U implies that

{X : |Ω_U^A − Ω_U^X| < ε} = {X : Ω_U^X > Ω_U^A − ε}
is open for every ε > 0, so Ω_U is continuous at A.

The above corollary does not guarantee that the supremum is achieved. Surprisingly, it is. In fact, we can prove quite a bit more. One way to view the proof of the following theorem is that we are trying to prevent any real that is not 2-random from being in the closure of the range of Ω_U. If we fail for some X, then it will turn out that X ∈ rng(Ω_U). Note that this fact is a consequence of universality; it is easy to construct a prefix-free oracle machine M such that Ω_M does not achieve its supremum.

Theorem 15.10.4 (Downey, Hirschfeldt, Miller, and Nies [115]). If X ∈ rng(Ω_U)^c \ rng(Ω_U), then X is 2-random.

Proof. Assume that X ∈ rng(Ω_U)^c is not 2-random and let R_X = Ω_U^{−1}[X] = {A : Ω_U^A = X}. For each rational p ∈ [0, 1], define C_p = {A : Ω_U^A ≤ p}. Note that every C_p is closed (in fact, a Π^0_1 class). For every rational q ∈ [0, 1] such that q < X, we will define a closed set B_q ⊆ 2^ω such that

R_X = ⋂_{q<X} B_q ∩ ⋂_{p>X} C_p,   (15.5)

where q and p range over the rationals. Furthermore, we will prove that every finite intersection of sets from {B_q : q < X} and {C_p : p > X} is nonempty. By compactness, this property ensures that R_X is nonempty, and therefore, that X ∈ rng(Ω_U).
We would like to define B_q to be {A : Ω_U^A ≥ q}, which would obviously satisfy (15.5). The problem is that {A : Ω_U^A ≥ q} is a Σ^0_1 class; B_q must be closed if we are to use compactness. The solution is to let B_q = {A : Ω_U^A[k] ≥ q} for some k. Then B_q is closed (in fact, clopen) and, by choosing k appropriately, we will guarantee that Ω_U^A is bounded away from X for every A ∉ B_q.

For each rational q ∈ [0, 1], we build a prefix-free oracle machine M_q. For A ∈ 2^ω and σ ∈ 2^{<ω}, define M_q^A(σ) as follows (here we identify a string τ with a rational number in the natural way).

1. Wait for a stage s such that Ω_U^A[s] ≥ q.
2. Compute τ = U^{∅′_s}(σ).
3. Wait for a stage t ≥ s such that Ω_U^A[t] ≥ τ.

The computation may get stuck in any one of the three steps, in which case M_q^A(σ)↑. Otherwise, let M_q^A(σ) = t + 1. The value to which M_q^A(σ) converges is relevant only because it ensures that a U-simulation of M_q cannot converge before stage t + 1.

We are ready to define B_q ⊆ 2^ω for a rational q ∈ [0, 1] such that q < X. Assume that U simulates M_q by the prefix ρ. Choose σ such that U^{∅′}(σ) = τ ≺ X and |τ| > |ρσ|. Such a σ exists because X is not 2-random. Choose k_q large enough that U^{∅′_s}(σ) = τ for all s ≥ k_q. Let B_q = {A : Ω_U^A[k_q] ≥ q}.

We claim that the definition of B_q ensures that Ω_U^A is bounded away from X for any A ∉ B_q. Let l_q = max{q, τ} and r_q = τ + 2^{−|ρσ|}. Clearly l_q < X. To see that r_q > X, note that X − τ ≤ 2^{−|τ|} < 2^{−|ρσ|}. Now assume that A ∉ B_q and that Ω_U^A ≥ l_q. Thus Ω_U^A ≥ q but Ω_U^A[k_q] < q, which implies that the s found in step 1 of the definition of M_q is greater than k_q. Therefore, U^{∅′_s}(σ) = τ. But Ω_U^A ≥ τ, so step 3 eventually produces a t ≥ s such that Ω_U^A[t] ≥ τ. Thus M_q^A(σ)↓ = t + 1, so U^A(ρσ)↓ some time after stage t, which implies that Ω_U^A ≥ Ω_U^A[t] + 2^{−|ρσ|} ≥ τ + 2^{−|ρσ|} = r_q. We have proved that

Ω_U^A ∈ [l_q, r_q) ⇒ A ∈ B_q.   (15.6)
Next we verify (15.5). Assume that A ∈ R_X. We have just proved that A ∈ B_q for all rationals q < X. Also, it is clear that A ∈ C_p for all rationals p > X. Therefore, R_X ⊆ ⋂_{q<X} B_q ∩ ⋂_{p>X} C_p. For the other direction, assume that A ∈ ⋂_{q<X} B_q ∩ ⋂_{p>X} C_p. Then q < X implies that Ω_U^A ≥ Ω_U^A[k_q] ≥ q, so Ω_U^A ≥ X. On the other hand, if p > X, then Ω_U^A ≤ p. This fact implies that Ω_U^A ≤ X, and so Ω_U^A = X. Therefore, A ∈ R_X, which proves (15.5).

It remains to prove that R_X is nonempty. Let Q be a finite set of rationals less than X and P a finite set of rationals greater than X. Define l = max{l_q : q ∈ Q} and r = min(P ∪ {r_q : q ∈ Q}). Note that X ∈ (l, r). Because X ∈ rng(Ω_U)^c, there is an A such that Ω_U^A ∈ (l, r). From (15.6) it
follows that A ∈ B_q for all q ∈ Q. Clearly, A ∈ C_p for every p ∈ P. Hence ⋂_{q∈Q} B_q ∩ ⋂_{p∈P} C_p is nonempty. By compactness, R_X is nonempty.

If X ∈ rng(Ω_U) is not 2-random, then an examination of the above construction gives an upper bound on the complexity of Ω_U^{−1}[X]. The Π^0_1 classes C_p can be computed uniformly. The B_q are also Π^0_1 classes and can be found uniformly in X ⊕ ∅′. Therefore, Ω_U^{−1}[X] = ⋂_{q<X} B_q ∩ ⋂_{p>X} C_p is a nonempty Π^{0,X⊕∅′}_1 class.

The following corollary gives an interesting special case of Theorem 15.10.4. It is not hard to prove that there is an A such that Ω_U^A = inf(rng Ω_U) (see Theorem 15.7.1). It is much less obvious that Ω_U achieves its supremum.

Corollary 15.10.5 (Downey, Hirschfeldt, Miller, and Nies [115]). There is an A such that Ω_U^A = sup(rng Ω_U).

Proof. Since sup(rng Ω_U) is a left-c.e. real, it is not 2-random, so the corollary is immediate from Theorem 15.10.4.

By Proposition 8.11.9, no 1-generic is 1-random, so μ({A : Ω_U^A = sup(rng Ω_U)}) = 0. Therefore, sup(rng Ω_U) is an example of a left-c.e. real in the range of Ω_U that is not in Spec(Ω_U).

One might ask whether Theorem 15.10.4 is vacuous. In other words, is the range of Ω_U actually closed? We can construct a specific universal prefix-free oracle machine such that it is not. The construction is somewhat similar to the proof of Theorem 15.4.3. In that case, we avoided a measure zero set by using an oracle that codes a relativized Martin-Löf test covering that set. Now we will avoid a measure zero closed set by using a natural number to code a finite open cover with sufficiently small measure.

Example 15.10.6 (Downey, Hirschfeldt, Miller, and Nies [115]). There is a universal prefix-free oracle machine V such that μ(rng(Ω_V)) < μ(rng(Ω_V)^c).

Proof. Let U be a universal prefix-free oracle machine. Let M be a prefix-free oracle machine such that

Ω_M^A = { 1 if |A| > 1,
        { 0 otherwise.

Define a universal prefix-free oracle machine V by V^A(0σ) = U^A(σ) and V^A(1σ) = M^A(σ) for all σ. This definition ensures that Ω_V^A ≤ 1/2 iff |A| ≤ 1. Therefore, μ(rng(Ω_V) ∩ [0, 1/2]) = 0. We will prove that μ(rng(Ω_V)^c ∩ [0, 1/2]) > 0.

Let {I_i}_{i∈ω} be an effective enumeration of all finite unions of open intervals with dyadic rational endpoints. We construct a prefix-free oracle machine N. By the recursion theorem, we may assume that we know a coding string ρ of N in V. Given an oracle A, find the least n such that
A(n) = 1. Intuitively, N^A tries to prevent Ω_V^A from being in I_n. Whenever a stage s occurs such that Ω_V^A[s] ∈ I_n and ∀σ (V^A(ρσ)[s] = N^A(σ)[s]), then N^A acts as follows. Let ε be the least number such that Ω_V^A[s] + ε ∉ I_n, and note that ε is necessarily a dyadic rational. If possible, N^A converges on additional strings with total measure 2^{|ρ|}ε, which ensures that Ω_V^A ≥ Ω_V^A[s] + ε. If μ(I_n) ≤ 2^{−|ρ|}, then N^A cannot run out of room in its domain, and we have Ω_V^A ∉ I_n.

Assume for the sake of contradiction that μ(rng(Ω_V)^c ∩ [0, 1/2]) = 0. Then there is an open cover of rng(Ω_V)^c ∩ [0, 1/2] with measure less than 2^{−|ρ|}. We may assume that all intervals in this cover have dyadic rational endpoints. Because rng(Ω_V)^c ∩ [0, 1/2] is compact, there is a finite subcover I_n. But μ(I_n) < 2^{−|ρ|} implies that Ω_V^{0^n 1 0^ω} ∉ I_n, which is a contradiction. Thus μ(rng(Ω_V)^c ∩ [0, 1/2]) > 0.

Note that the proof above shows that if U is a universal prefix-free oracle machine and S = {Ω_U^{0^n 1 0^ω} : n ∈ N}, then S^c has positive measure and S^c \ S contains only 2-randoms.

Proposition 15.10.7 (Downey, Hirschfeldt, Miller, and Nies [115]). rng(Ω_U)^c is a Π^0_3 class.

Proof. It is easy to verify that a ∈ rng(Ω_U)^c iff

∀ε > 0 ∃σ ∈ 2^{<ω} (Ω_U^σ[|σ|] > a − ε ∧ ∀n ≥ |σ| ∃τ ≽ σ (|τ| = n ∧ Ω_U^τ[n] < a + ε)),

where ε ranges over rational numbers. This definition is Π^0_3 because the final existential quantifier is bounded.
16 Complexity of Computably Enumerable Sets
16.1 Barzdins’ Lemma and Kummer complex sets

In this section, we look at the initial segment complexity of c.e. sets, including a fascinating gap phenomenon uncovered by Kummer [226]. We begin with an old result of Barzdins [29].

Theorem 16.1.1 (Barzdins’ Lemma [29]). If A is a c.e. set then C(A↾n | n) ≤ log n + O(1) and C(A↾n) ≤ 2 log n + O(1).

Proof. To describe A↾n given n, it suffices to supply the number k_n of elements in A↾n and an e such that A = W_e, since we can recover A↾n from this information by running the enumeration of W_e until k_n many elements appear in W_e↾n. Such a description can be given in log n + O(1) many bits.

For the second part of the lemma, we can encode k_n and n as two strings σ and τ, respectively, each of length log n. We can recover σ and τ from στ because we know the length of each of these two strings is exactly half the length of στ. Thus we can describe A↾n in 2 log n + O(1) many bits.

Barzdins also constructed an example of a c.e. set A with C(A↾n) ≥ log n for all n. If C(A↾n) ≤ log n + O(1) for all n then, by the proof of Theorem 3.4.4, A is computable. A long-standing open question was whether the 2 log n is optimal in the second part of Theorem 16.1.1. The best we could hope for is a c.e. set A such that C(A↾n) ≥ 2 log n − O(1) infinitely often, since the following is known.

R.G. Downey and D. Hirschfeldt, Algorithmic Randomness and Complexity, Theory and Applications of Computability, DOI 10.1007/978-0-387-68441-3_16, © Springer Science+Business Media, LLC 2010
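The decoding steps in Barzdins’ Lemma can be made concrete. The sketch below is an illustration, not the book's formalism: `enum` stands in for an arbitrary enumeration of some W_e, and the toy set is hypothetical. It recovers A↾n from n and the count k_n, and splits the two-equal-length-string encoding used in the second part of the proof.

```python
def recover_prefix(n, k_n, enumerate_We):
    """Run the enumeration of W_e = A until k_n elements below n have
    appeared; at that point the approximation below n is exact."""
    seen = set()
    for x in enumerate_We():        # yields elements of W_e in some order
        if x < n:
            seen.add(x)
        if len(seen) == k_n:
            break
    return [1 if i in seen else 0 for i in range(n)]

def split_pair(st):
    """Decode two equal-length strings from their concatenation:
    each half has length len(st) // 2, so no separator is needed."""
    h = len(st) // 2
    return st[:h], st[h:]

# Toy c.e. set: A = multiples of 3, enumerated out of order.
def enum():
    yield from [9, 0, 6, 3, 12, 15]

prefix = recover_prefix(10, 4, enum)   # A ∩ [0,10) = {0,3,6,9}, so k_10 = 4
assert prefix == [1, 0, 0, 1, 0, 0, 1, 0, 0, 1]
assert split_pair("110101") == ("110", "101")
```

The point of `split_pair` is exactly the one in the proof: two strings of the same known length can be concatenated with no delimiter, costing 2 log n bits rather than the roughly 2 log n + 2 log log n that a self-delimiting pairing would need.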
Theorem 16.1.2 (Solovay [unpublished]). There is no c.e. set A such that C(A↾n | n) ≥ log n − O(1) for all n. Similarly, there is no c.e. set A such that C(A↾n) ≥ 2 log n − O(1) for all n.

More recently, a simple proof of a result stronger than Solovay's was discovered.

Theorem 16.1.3 (Hölzl, Kräling, and Merkle [183]). For any c.e. set A there exist infinitely many m such that C(A↾m | m) = O(1) and C(A↾m) ≤ C(m) + O(1).

Proof. We may assume that A is infinite. Let m be such that m enters A at stage s and no number less than m enters A after stage s. There must be infinitely many such m, and the theorem clearly holds for any of them.

Solovay [unpublished] explicitly asked whether there is a c.e. set A such that C(A↾n) ≥ 2 log n − O(1) infinitely often. As we will see, the answer is yes, and there is a precise characterization of the degrees that contain such sets.

Definition 16.1.4. A c.e. set A is Kummer complex if for each d there are infinitely many n such that C(A↾n) ≥ 2 log n − d.

Theorem 16.1.5 (Kummer [226]). There is a Kummer complex c.e. set.

Proof. Let t_0 = 0 and t_{k+1} = 2^{t_k}. Let I_k = (t_k, t_{k+1}] and

f(k) = ∑_{i=t_k+1}^{t_{k+1}} (i − t_k + 1).

Note that f(k) asymptotically approaches t_{k+1}^2/2, and hence log f(k) > 2 log t_{k+1} − 2 for sufficiently large k. So it is enough to build a c.e. set A such that for each k there is an n ∈ I_k with C(A↾n) ≥ log f(k).

Enumerate A as follows. At stage s + 1, for each k ≤ s, if C_s(A_s↾n) < log f(k) for all n ∈ I_k and I_k ⊈ A_s, then put the smallest element of I_k \ A_s into A_{s+1}.

Now suppose that C(A↾n) < log f(k) for all n ∈ I_k. Then there must be a stage s such that A_s↾t_{k+1} = A↾t_{k+1} and C_s(A_s↾n) < log f(k) for all n ∈ I_k. We must have I_k ⊆ A_s, since otherwise the smallest element of I_k \ A_s would enter A, contradicting the assumption that A_s↾t_{k+1} = A↾t_{k+1}. Thus, all of I_k is eventually put into A. So for each n ∈ I_k there are stages s_0 < s_1 < ⋯ < s_{n−t_k} such that A_{s_{i+1}}↾n ≠ A_{s_i}↾n and C_{s_i}(A_{s_i}↾n) < log f(k), and hence there are at least n − t_k + 1 many strings σ with |σ| = n and C(σ) < log f(k). Thus, there are at least f(k) many strings σ such that C(σ) < log f(k), which is a contradiction.

Kummer also gave an exact characterization of the degrees containing Kummer complex c.e. sets, using the notion of array noncomputability discussed in Section 2.23.
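The growth claims about f(k) in the proof above can be checked numerically for small k. This is a sanity check only; the tower function t_k grows too fast to go much further.

```python
import math

def t(k):
    # t_0 = 0 and t_{k+1} = 2^{t_k}
    return 0 if k == 0 else 2 ** t(k - 1)

def f(k):
    # f(k) = sum over the interval I_k = (t_k, t_{k+1}] of (i - t_k + 1)
    return sum(i - t(k) + 1 for i in range(t(k) + 1, t(k + 1) + 1))

# f(k) is close to t_{k+1}^2 / 2, so log f(k) > 2 log t_{k+1} - 2
# already for quite small k.
for k in [2, 3, 4]:
    assert math.log2(f(k)) > 2 * math.log2(t(k + 1)) - 2
```

For instance, f(2) = 5 since I_2 = (2, 4] contributes (3 − 2 + 1) + (4 − 2 + 1); the next value, t_5 = 2^16 = 65536, already makes f(4) a ten-digit number.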
Theorem 16.1.6 (Kummer's Gap Theorem [226]).
(i) A c.e. degree contains a Kummer complex set iff it is array noncomputable.
(ii) In addition, if A is c.e. and of array computable degree, then for every computable order f, C(A↾n) ≤ log n + f(n) + O(1).
(iii) Hence the c.e. degrees exhibit the following gap phenomenon: for each c.e. degree a, either
(a) there is a c.e. set A ∈ a such that C(A↾n) ≥ 2 log n − O(1) for infinitely many n, or
(b) there is no c.e. set A ∈ a and ε > 0 such that C(A↾n) ≥ (1 + ε) log n − O(1) for infinitely many n.

Proof. Part (iii) follows immediately from parts (i) and (ii), so we prove the latter.

Part (i): To make A Kummer complex, all we need is to have the construction from Theorem 16.1.5 work for infinitely many intervals. Let I_k and f(k) be as in the proof of that theorem, and let I be the very strong array {I_k}_{k∈N}. Let A be an I-a.n.c. c.e. set. Define a c.e. set W as follows. At stage s + 1, for each k ≤ s, if C_s(A_s↾n) < log f(k) for all n ∈ I_k and I_k ⊈ A_s, then put the smallest element of I_k \ A_s into W.

Since A is I-a.n.c., there are infinitely many k such that A ∩ I_k = W ∩ I_k. A similar argument to that in the proof of Theorem 16.1.5 now shows that, for any such k, if C(A↾n) < log f(k) for all n ∈ I_k then I_k ⊆ A, and hence that for each n ∈ I_k, there are at least n − t_k + 1 many strings σ with |σ| = n and C(σ) < log f(k), which leads to the same contradiction as before. Thus A is Kummer complex. By Theorem 2.23.4, each array noncomputable c.e. degree contains an I-a.n.c. set, so each such degree contains a Kummer complex set.

Part (ii): Let A be c.e. and of array computable degree, and let f be a computable order. Let m(n) = max{i : f(i) < n}. Note that m is a computable order, and is defined for almost all n. Let g(n) = A↾m(n). (If m(n) is undefined then let g(n) = 0.)

Since g is computable in A, Proposition 2.23.6 implies that there is a total computable approximation {g_s}_{s∈N} such that lim_s g_s(n) = g(n) and |{s : g_s(n) ≠ g_{s+1}(n)}| ≤ n for all n. (Recall that this cardinality is known as the number of mind changes of g at n.)

Suppose that we are given n and, for k_n = min{i : m(i) ≥ n}, we are also given the exact number p_n of mind changes of g at k_n. Then we can compute g(k_n) = A↾m(k_n), and hence also compute A↾n, since m(k_n) ≥ n. In other words, we can describe A↾n given n and p_n, so C(A↾n) ≤ log n + 2 log p_n + O(1).
By the definition of k_n, we have m(k_n − 1) < n, so by the definition of m, we have k_n − 1 ≤ f(n). Furthermore, p_n ≤ k_n, so C(A↾n) ≤ log n + 2 log f(n) + O(1) ≤ log n + f(n) + O(1), as desired.

Kummer's Gap Theorem should be compared with results such as Downey and Greenberg's Theorem 13.14.2 on packing dimension and the Barmpalias, Downey, and Greenberg result on cl-degrees, Theorem 9.14.4, which also concern array computability. It is natural to ask whether there is a classification of, say, all jump classes in terms of initial segment complexity.
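The key step in part (ii) of the proof above is that knowing the exact number of mind changes of a computable approximation lets us compute its limit. A toy version in Python (the approximation passed in is an illustrative stand-in, not one arising from an actual array computable degree):

```python
def value_after_mind_changes(approx, n, p_n):
    """Run a computable approximation s -> approx(n, s) until exactly
    p_n mind changes have occurred; the value held at that point is the
    true limit g(n), provided p_n really is the exact count."""
    changes, prev, s = 0, approx(n, 0), 0
    while changes < p_n:
        s += 1
        cur = approx(n, s)
        if cur != prev:
            changes += 1
            prev = cur
    return prev

# Toy approximation that settles on n after exactly n mind changes
# 0 -> 1 -> ... -> n.
g = lambda n, s: min(n, s)
assert value_after_mind_changes(g, 4, 4) == 4
```

Note that an upper bound on the mind changes would not suffice: the procedure must stop at the exact count p_n, which is why p_n (at a cost of about 2 log p_n bits) appears in the description of A↾n.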
16.2 The entropy of computably enumerable sets

In this section we consider Chaitin's notion from [59] of the entropy of a computably enumerable set, in the spirit of the Coding Theorem 3.9.4, and its relationship with the following notion, also taken from [59]. By a c.e. operator we mean a function W taking sets X to W_e^X for some e. We denote the value of W on X by W^X. For a c.e. operator W, the enumeration probability P_W(A) of a set A relative to W is the measure of the class of all X such that W^X = A. This probability depends on W, of course, but there are optimal c.e. operators V such that for each c.e. operator W there is an ε > 0 for which P_V(A) ≥ εP_W(A) for all A. For example, we can list the c.e. operators W_0, W_1, … and let V be a c.e. operator such that V^{0^e 1 X} = W_e^X, where 0^e 1 X is the sequence 0^e 1 X(0) X(1) …. We fix such a V and write P(A) for P_V(A). More generally, for a class C ⊆ 2^ω, we let P(C) = μ({X : V^X ∈ C}).

If A is c.e., then clearly P(A) > 0. Recall that Theorem 8.12.1 says that the converse is also true: if P(A) > 0 then A is c.e. Recall also that the proof of Theorem 8.12.1 is the classic use of the "majority vote" technique, which uses the Lebesgue Density Theorem to find some string σ such that n ∈ A iff a majority (in the sense of measure) of the strings extending σ believe that n ∈ A. Thus a c.e. index for A is determined by a nonuniform argument based on the Lebesgue Density Theorem. Since the application of the Lebesgue Density Theorem is to find a σ such that the measure of the class of X ≻ σ such that V^X = A is large, it is not unreasonable to imagine that c.e. sets with bigger enumeration probabilities should be easier to describe. Thus, the question we address in this section is the following: What effect does the enumeration probability of a c.e. set have on the difficulty of describing a c.e. index for it? This question leads us to the following definitions.
Definition 16.2.1 (Chaitin [59]). For a c.e. set A, let H(A) = − log P(A) and I(A) = min{K(j) : A = W_j}.¹

¹ We assume that our listing of c.e. sets is given by a reasonable enumeration of Turing machines, so that if we have another such listing W̃_0, W̃_1, …, then for each i we can effectively find a j such that W̃_j = W_i and vice versa. It is then easy to check that min{K(j) : A = W_j} = min{K(j) : A = W̃_j} ± O(1), so the definition of I(A) is listing-independent up to an additive constant.

Chaitin [59] established some of the basic properties of these and related notions, while Solovay [371, 372] showed that there is indeed a deep relationship between them.

Theorem 16.2.2 (Solovay [371, 372]). I(A) ≤ 3H(A) + 2 log H(A) + O(1).

It is not known to what extent Solovay's bound may be improved, but Vereshchagin [400] showed that a tighter bound can be extracted from Solovay's proof when A is finite.

Theorem 16.2.3 (Vereshchagin [400]). If A is finite then I(A) ≤ 2H(A) + 2 log H(A) + O(1).

We will prove both of these results in this section. Our account of Solovay's proof will be along the lines of the one given by Vereshchagin [400]. The proof relies on a combinatorial result of Martin (see [371, 372]), so we begin by presenting that result.

Martin's game is defined as follows. The game has parameters K and N. A configuration of the game is an N-tuple (S_0, …, S_{N−1}) of clopen subsets of 2^ω. The initial configuration is (2^ω, …, 2^ω). There are two players. On its turn, player I plays a clopen set Y ⊆ 2^ω with μ(Y) > 1/(K+1). If Y is disjoint from S_0, …, S_{N−1} then player I wins. If not, then player II's move is to pick an i < N with S_i ∩ Y ≠ ∅ and replace S_i by Y. Player II wins if it can prevent player I from ever winning.

Theorem 16.2.4 (Martin, see Solovay [371, 372]). In the game above, player II has a computable winning strategy if N = K(K+1)/2.²

² Ageev [1] has shown that N has to be at least on the order of εK² (for some ε > 0) for player II to win Martin's game.

Proof. This proof follows Solovay [372]. It uses a property of configurations called property M. This property is computably checkable, the initial configuration has it, and if a configuration has property M, then for any move by player I, player II has a move that results in a configuration that also has property M. Given such a property, the computable strategy for player II is simply to pick the least i so that S_i ∩ Y ≠ ∅ and the resulting configuration has property M.

Suppose that N = K(K+1)/2. Let T = {(j, i) : i ≤ j < K}. Then |T| = N. Given a bijection h : T → N and a configuration (Y_0, …, Y_{N−1}), we
write Z_{j,i} for Y_{h(j,i)}. We think of this bijection as arranging the Y_k into a triangular array

Z_{0,0}
Z_{1,0}  Z_{1,1}
⋮
Z_{K−1,0}  Z_{K−1,1}  …  Z_{K−1,K−1}.

We say that (Y_0, …, Y_{N−1}) has property M if for some bijection h : T → N, the resulting triangular array has the following property: for all j < K, if i_0 < i_1 < ⋯ < i_{d−1} ≤ j, then μ(⋃_{k<d} Z_{j,i_k}) > d/(K+1). Clearly, the initial configuration has property M, and we can effectively check whether a given configuration has property M. (Since a configuration is a finite sequence of clopen sets, it can be specified as a finite sequence of finite sets of generators.) To complete the proof, we show the following, which we state as a separate lemma for future reference.

Lemma 16.2.5. Let (Y_0, …, Y_{N−1}) be a configuration with property M. Let Y be a clopen subset of 2^ω with μ(Y) > 1/(K+1). Then there is an i < N with Y_i ∩ Y ≠ ∅ such that the configuration obtained by replacing Y_i by Y still has property M.

Proof. Let h : T → N be a bijection witnessing the fact that (Y_0, …, Y_{N−1}) has property M, and let Z_{j,i} for (j, i) ∈ T be the corresponding triangular array. Then μ(⋃_{i<K} Z_{K−1,i}) > K/(K+1), so some Z_{K−1,i} has nonempty intersection with Y. Let j be least such that Z_{j,i} ∩ Y ≠ ∅ for some i. Permuting the jth row if necessary, we may assume that Z_{j,j} ∩ Y ≠ ∅. We prove that if we replace Y_{h(j,j)} by Y, the resulting configuration has property M.

If j = 0 then there is nothing to prove, so suppose that j > 0. We build a new array by replacing Z_{j,j} by Y and then swapping the rest of the jth row with the (j − 1)st row. That is, let Z′_{j,j} = Y, let Z′_{j−1,i} = Z_{j,i} and Z′_{j,i} = Z_{j−1,i} for i ≤ j − 1, and for all other k and i, let Z′_{k,i} = Z_{k,i}. We claim that this new array witnesses the fact that (Y_0, …, Y_{h(j,j)−1}, Y, Y_{h(j,j)+1}, …, Y_{N−1}) has property M.

Since property M works on a row-by-row basis, we do not need to worry about any rows other than the (j − 1)st and the jth. The (j − 1)st row is the same as the jth row of the original array, so we do not need to worry about it either. So we need to show only that if i_0 < ⋯ < i_{d−1} ≤ j then μ(⋃_{k<d} Z′_{j,i_k}) > d/(K+1).

If i_{d−1} < j then all elements involved in the union are taken from the (j − 1)st row of the original array, so we are done. Thus, we may assume that i_{d−1} = j, so that Z′_{j,i_{d−1}} = Y. By our choice of j, we know that Y ∩ Z_{j−1,i} = ∅ for all i ≤ j − 1. Thus μ(⋃_{k<d} Z′_{j,i_k}) ≥ μ(⋃_{k<d−1} Z_{j−1,i_k}) + μ(Y) > (d−1)/(K+1) + 1/(K+1) = d/(K+1), as required.

As explained above, this lemma completes the proof of the theorem.
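Property M and the move guaranteed by Lemma 16.2.5 can be simulated in a toy model where clopen sets are unions of cylinders of a fixed length. This sketch is purely illustrative: it models clopen sets as subsets of 8 cylinders (so measures are multiples of 1/8), takes K = 2, and checks property M by brute force over bijections, which is feasible only for tiny K.

```python
from fractions import Fraction
from itertools import permutations, combinations

K = 2
N = K * (K + 1) // 2                     # = 3 sets in the configuration
CYL = 8                                  # clopen sets = unions of 8 cylinders
mu = lambda S: Fraction(len(S), CYL)     # normalized measure
T = [(j, i) for j in range(K) for i in range(j + 1)]

def property_M(config):
    # Does some arrangement of config into a triangular array satisfy:
    # every union of d distinct sets in row j has measure > d/(K+1)?
    for perm in permutations(range(N)):
        Z = {T[p]: config[perm[p]] for p in range(N)}
        ok = all(mu(set().union(*(Z[(j, i)] for i in idx)))
                 > Fraction(len(idx), K + 1)
                 for j in range(K)
                 for d in range(1, j + 2)
                 for idx in combinations(range(j + 1), d))
        if ok:
            return True
    return False

full = frozenset(range(CYL))
config = [full, full, full]              # initial configuration
assert property_M(config)

# Player I plays Y with mu(Y) > 1/(K+1); by Lemma 16.2.5 player II can
# swap Y in for some set it meets while preserving property M.
Y = frozenset({0, 1, 2})                 # mu(Y) = 3/8 > 1/3
replaced = False
for i in range(N):
    if config[i] & Y:
        new = config[:i] + [Y] + config[i + 1:]
        if property_M(new):
            config, replaced = new, True
            break
assert replaced
```

The loop at the end is exactly the computable strategy described in the proof of Theorem 16.2.4: pick the least i whose set meets Y such that the resulting configuration still has property M.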
We are now ready to prove Theorem 16.2.2.

Proof of Theorem 16.2.2. We will show that, for each k, we can enumerate O(2^{2k}) many c.e. sets such that every set A with P(A) ≥ 2^{−k} is among them. If this enumeration were uniform in k, then, taking k = ⌈H(A)⌉, we could describe an index for A simply by giving k and the position of A in our enumeration, which would require fewer than 2H(A) + 2 log H(A) many bits. The enumeration will not be uniform in k, however, but there will be nonuniformly given strings ρ_k ∈ 2^k such that the enumeration for k can be done on input ρ_{k+1}. The extra bits needed to describe ρ_{k+1} are what will drive our bound to 3H(A) + 2 log H(A).

Fix k. For a string σ, let F_σ be the finite set determined by σ, that is, F_σ = {n < |σ| : σ(n) = 1}. Let S_t(σ) = {X : V^X[t]↾|σ| = F_σ}. The following lemma is immediate.

Lemma 16.2.6. P(A) = lim_n P(A↾n).

We will also need the following lemma.

Lemma 16.2.7. P(σ) = lim_t μ(S_t(σ)).

Proof. Let S^1_t be the class of all X such that V^X[t] ⊇ F_σ and S^2_t be the class of all X ∈ S^1_t such that V^X[t]↾|σ| ≠ F_σ. Let S^1 = ⋃_t S^1_t and S^2 = ⋃_t S^2_t. Then P(σ) = μ(S^1 \ S^2) = μ(S^1) − μ(S^2) = lim_t (μ(S^1_t) − μ(S^2_t)) = lim_t μ(S_t(σ)).

It thus makes sense to define P_t(σ) = μ(S_t(σ)). An approximating sequence for a set A is a sequence σ_0, σ_1, … such that F_{σ_0} ⊆ F_{σ_1} ⊆ ⋯ and ⋃_i F_{σ_i} = A. We say that the pair (t, n) is good for σ if |σ| = n and P_t(σ) ≥ 2^{−k−1}. We say that a sequence of pairs (t_0, n_0), (t_1, n_1), … is good for A if there is an approximating sequence σ_0, σ_1, … for A and j_0 < j_1 < ⋯ such that (t_{j_i}, n_{j_i}) is good for σ_i for each i.

The proof now boils down to the following two lemmas.

Lemma 16.2.8. There are strings ρ_k ∈ 2^k and an algorithm that, on input ρ_{k+1}, computes a sequence of pairs (t_0, n_0), (t_1, n_1), … that is good for all A such that P(A) ≥ 2^{−k}.

Lemma 16.2.9. There is an algorithm that, when given a sequence of pairs as in Lemma 16.2.8 as an oracle, enumerates O(2^{2k}) many sets C_0, …, C_{N−1} such that every set A for which the sequence is good coincides with some C_i.

Given these lemmas, the theorem follows easily: Let k = ⌈H(A)⌉, so that P(A) ≥ 2^{−k}. Let ρ_{k+1} be as in Lemma 16.2.8, and let C_0, …, C_{N−1} be as in Lemma 16.2.9 when the oracle is the sequence computed from ρ_{k+1}. Then
there is an i < N such that A = C_i. Given ρ_{k+1} and i, we can determine an index j such that A = W_j. We can determine k from ρ_{k+1}, so I(A) ≤ K(ρ_{k+1}) + K(i | k) ≤ k + 2 log k + 2k + O(1) = 3H(A) + 2 log H(A) + O(1). So to finish the proof, we need only to establish the two lemmas. We begin with the latter.

Proof of Lemma 16.2.9. Let K = 2^{k+1} and N = K(K+1)/2. We are given a sequence of pairs (t_0, n_0), (t_1, n_1), … that is good for all A such that P(A) ≥ 2^{−k}. We will enumerate sets C_0, …, C_{N−1} using Martin's game. We begin by letting (X_0[0], …, X_{N−1}[0]) be the initial configuration and C_0[0], …, C_{N−1}[0] all be empty. At the end of stage j of the algorithm, we have a configuration (X_0[j], …, X_{N−1}[j]) and finite sets C_0[j], …, C_{N−1}[j] with the following properties.

(i) C_i[j] ⊆ [0, n_j).
(ii) For every σ for which (t_j, n_j) is good, C_i[j] = F_σ for some i < N.
(iii) C_i[j] ⊆ V^X[t_j] for all X ∈ X_i[j].

At stage j + 1, we proceed as follows. Let σ_0, …, σ_l be the strings for which (t_{j+1}, n_{j+1}) is good. Play Y = S_{t_{j+1}}(σ_0) for player I. Since μ(S_{t_{j+1}}(σ_0)) = P_{t_{j+1}}(σ_0) ≥ 2^{−k−1} > 1/(K+1), this move is legal. Let X_i[j] be the set that the winning strategy for player II replaces by Y and let X_i[j+1] = Y. Since X_i[j] ∩ Y ≠ ∅, there is some X ∈ X_i[j] with V^X[t_{j+1}]↾n_{j+1} = F_{σ_0}. By condition (iii), C_i[j] ⊆ V^X[t_j] ⊆ V^X[t_{j+1}], so by condition (i), C_i[j] ⊆ V^X[t_{j+1}] ∩ [0, n_{j+1}) = F_{σ_0}. So we can let C_i[j+1] = F_{σ_0}. Note that now C_i[j+1] ⊆ V^X[t_{j+1}] for all X ∈ X_i[j+1], and hence (iii) remains true for this i.

Now repeat this procedure for σ_1, …, σ_l in turn. By definition, the sets S_{t_{j+1}}(σ_0), …, S_{t_{j+1}}(σ_l) are pairwise disjoint, so the strategy for player II always picks a new X_i[j] to replace. For all i such that X_i[j] is not picked, let C_i[j+1] = C_i[j] and X_i[j+1] = X_i[j]. Note that (iii) is true for all such i, since V^X[t_{j+1}] ⊇ V^X[t_j] for all X. Let C_i = ⋃_j C_i[j].

To see that the algorithm works, and hence complete the proof of Lemma 16.2.9, assume that there is an approximating sequence τ_0, τ_1, … for A and j_0 < j_1 < ⋯ such that (t_{j_l}, n_{j_l}) is good for τ_l. Then for each l there is an i < N such that C_i[j_l] = F_{τ_l}. So for some i < N, there are infinitely many l such that C_i[j_l] = F_{τ_l}. Then A = C_i.

We finish by proving Lemma 16.2.8.

Proof of Lemma 16.2.8. Let A_0, …, A_r list all the sets A with P(A) ≥ 2^{−k}, let ρ = ∑_{i≤r} P(A_i), and let ρ_{k+1} = ρ↾(k+1).
We say that (t, m, n) is opportune if m 2k+1 and for the list σ0 , . . . , σs 1 of all strings in 2n such that Pt (σ) 2−k − m , we have is Pt (σ) ρk+1 − 2−k−1 . Note that the first condition implies that Pt (σ) 2−k−1 for all i s, and hence, since the σ are disjoint, s 2k+1 . First we will show that for each c there is an opportune triple (t, m, n) with t, m, n > c. Since the property of being an opportune triple is computable given ρk+1 , we can then define, computably in ρk+1 , a sequence of opportune triples (t0 , m0 , n0 ), (t1 , m1 , n1 ), . . . that is increasing in each component. Then we will show that, for such a sequence and any A such that P (A) 2−k , there is an approximating sequence τ0 , τ1 , . . . for A and j0 < j1 < · · · such that Ptji (σi ) 2−k − m1j 2−k−1 for all i. Thus (t0 , n0 ), (t1 , n1 ), . . . is the desired sequence that is computable from ρk+1 and good for all A such that P (A) 2−k . We begin by showing that for each c there is an opportune triple (t, m, n) with t, m, n > c. Fix c. Let m > c, 2k+1 . Let n > c be such that the sets Ai ∩ [0, n ) for i = 0, . . . , r are pairwise distinct. By Lemma 16.2.6, P (Ai ) = limn P (Ai 1 n), so there is an n > n with P (Ai n) > P (Ai ) − 2m(r+1) for all i r. By Lemma 16.2.7, there is a t > c such that Pt (Ai n) > P (A1i ) − 1 1 −k 2 − for all i r and ir Pt (Ai n) > ρ − 2m 2m(r+1) m ρk+1 − 2−k−1 . It now follows from the definition of opportune triple that (t, m, n) is such a triple. As mentioned above, we now have a sequence of opportune triples (t0 , m0 , n0 ), (t1 , m1 , n1 ), . . . that is increasing in each component, and we can finish the proof by taking an A such that P (A) 2−k and showing that there is an approximating sequence τ0 , τ1 , . . . for A and j0 < j1 < · · · such that Ptji (σi ) 2−k − m1j for all i.3 Without loss of generality, by passing to a subsequence if necessary, we may assume that the number s + 1 of strings σ ∈ 2nj with Ptj (σ) 2−k − m1j does not depend on j. Let σ0j , . . . 
, σ_s^j list these strings. Let 𝓑 be the class of all B for which there are j_0 < j_1 < ⋯ and i_0, i_1, … ≤ s such that σ_{i_0}^{j_0}, σ_{i_1}^{j_1}, … is an approximating sequence for B. We want to show that A ∈ 𝓑. We do so by showing that P(B) ≥ 2^{−k} for all B ∈ 𝓑 and Σ_{B∈𝓑} P(B) ≥ ρ_{k+1} − 2^{−k−1}. These facts imply that A ∈ 𝓑, as otherwise the sum of P(X) over all X with P(X) ≥ 2^{−k} would be greater than ρ. We first show that P(B) ≥ 2^{−k} for all B ∈ 𝓑. Suppose not. Let m be such that P(B) < 2^{−k} − 1/m. By Lemma 16.2.6, there is an n such
³ Note that, although we showed the existence of an opportune triple by finding (t, m, n) such that P_t(A_i ↾ n) > 2^{−k} − 1/m for all i ≤ r, there is no guarantee that the particular computable sequence of opportune triples (t_0, m_0, n_0), (t_1, m_1, n_1), … has that property.
16.2. The entropy of computably enumerable sets
that P(B ↾ n) < 2^{−k} − 1/m. By Lemma 16.2.7, there is a t′ such that P_t(B ↾ n) < 2^{−k} − 1/m for all t ≥ t′. Let j and i be such that σ_i^j ↾ n ≺ B and t_j > t′, and also m_j > m and n_j > n. Then

P_{t_j}(σ_i^j) ≤ P_{t_j}(σ_i^j ↾ n) = P_{t_j}(B ↾ n) < 2^{−k} − 1/m < 2^{−k} − 1/m_j,
contradicting the choice of σ_i^j.
We now show that Σ_{B∈𝓑} P(B) ≥ ρ_{k+1} − 2^{−k−1}. We first need an auxiliary fact. Suppose there is an n such that for all j with n_j > n, there is an i ≤ s such that σ_i^j ↾ n ⊀ B for all B ∈ 𝓑. Then we can form a sequence j_0 < j_1 < ⋯ such that for each j_l there is a σ_{i_l}^{j_l} with F_{σ_{i_l}^{j_l}} ∩ F_{σ_{i_{l′}}^{j_{l′}}} = ∅ for all l′ > l. But then P_{t_{j_l}}(σ_{i_l}^{j_l}) ≥ 2^{−k−1} for all l, and any X that contributes to this probability for l cannot contribute to it for l′ > l, which is a contradiction. So for each n there is a j with n_j > n such that for each i ≤ s, we have σ_i^j ↾ n ≺ B for some B ∈ 𝓑.
Since P(B) ≥ 2^{−k} for all B ∈ 𝓑, we know that 𝓑 is finite. Let B_0, …, B_l be its elements. By Lemma 16.2.6, and the fact that if n is large enough then the sets B_i ↾ n are pairwise distinct,

Σ_{B∈𝓑} P(B) = lim_n Σ_{i≤l} P(B_i ↾ n).

By Lemma 16.2.7,

Σ_{i≤l} P(B_i ↾ n) = lim_t Σ_{i≤l} P_t(B_i ↾ n).

Let j be such that n_j > n and for each i ≤ s we have σ_i^j ↾ n ≺ B_p for some p ≤ l. Then

Σ_{i≤l} P_{t_j}(B_i ↾ n) = P_{t_j}(B_0 ↾ n ∪ ⋯ ∪ B_l ↾ n) ≥ P_{t_j}(σ_0^j ∪ ⋯ ∪ σ_s^j) = Σ_{i≤s} P_{t_j}(σ_i^j) ≥ ρ_{k+1} − 2^{−k−1},

by the definition of opportune triple. Thus Σ_{i≤l} P(B_i ↾ n) ≥ ρ_{k+1} − 2^{−k−1} for all n, and hence Σ_{B∈𝓑} P(B) ≥ ρ_{k+1} − 2^{−k−1}.
As discussed above, the above lemma finishes the proof of Theorem 16.2.2. Finally, we prove Vereshchagin's Theorem 16.2.3.
Proof of Theorem 16.2.3. For a finite set B, let S_t(B) = {X : V^X[t] = B} and P_t(B) = μ(S_t(B)).
Lemma 16.2.10. For all finite sets B, we have P(B) = lim_t P_t(B).
16. Complexity of Computably Enumerable Sets
Proof. Let S_t^1 be the class of all X such that V^X[t] ⊇ B, and let S_t^2 be the class of all X ∈ S_t^1 such that V^X[t] ≠ B. Now the proof is as in Lemma 16.2.7.
Given k, we will enumerate O(2^{2k}) many finite sets C_0, …, C_{N−1} so that at the end of step t of the enumeration, every finite set B with P_t(B) ≥ 2^{−k−1} coincides with C_i[t] for some i < N. Suppose we can do so. If B is finite and P(B) ≥ 2^{−k}, then by Lemma 16.2.10, P_t(B) ≥ 2^{−k−1} for almost all t, so for some i and infinitely many t, we have C_i[t] = B. Then B = C_i. So to describe B we need to provide only k and i, which requires fewer than 2H(A) + 2 log H(A) many bits.
We perform our enumeration using Martin's game with K = 2^k so that at the end of each step t, the following two conditions hold, where (X_0[t], …, X_{N−1}[t]) is the configuration of the game at the end of step t.

(i) If P_t(B) ≥ 2^{−k−1} then C_i[t] = B for some i < N.
(ii) For all i < N, we have C_i[t] ⊆ V^X[t] for all X ∈ X_i[t].

As before, we begin with C_i[0] = ∅ and X_i[0] = 2^ω. At step t + 1 we let B_0, …, B_s be all sets B such that P_{t+1}(B) ≥ 2^{−k−1} and play Y = S_{t+1}(B_0) for player I in Martin's game. As before, let i be such that the computable winning strategy for player II replaces X_i[t] by Y. Let X_i[t + 1] = Y. Since X_i[t] ∩ Y ≠ ∅, there is an X ∈ X_i[t] with B_0 = V^X[t + 1]. By condition (ii), we have C_i[t] ⊆ V^X[t] ⊆ V^X[t + 1] = B_0, so we can let C_i[t + 1] = B_0. As before, we repeat this procedure for each of B_1, …, B_s in turn, and at the end, for any i such that X_i[t] was not replaced, we let X_i[t + 1] = X_i[t] and C_i[t + 1] = C_i[t]. This algorithm clearly has the necessary properties.
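The count N = O(2^{2k}) works because the classes S_t(B) for distinct finite B are pairwise disjoint, so at any single stage fewer than 2^{k+1} sets can clear the threshold 2^{−k−1} at once. A toy numeric illustration of this pigeonhole step (the measure values below are invented, not produced by any machine):

```python
# Toy illustration: measures of the pairwise disjoint classes S_t(B) sum to at
# most 1, so at most 2^(k+1) sets B can satisfy P_t(B) >= 2^-(k+1) at one stage.
k = 3
threshold = 2.0 ** -(k + 1)

# a hypothetical assignment of measures to disjoint classes (toy data)
measures = [0.3, 0.2, 0.1, 0.05, 0.05, 0.0625, 0.01]
assert sum(measures) <= 1  # disjointness forces this

big = [p for p in measures if p >= threshold]
# the pigeonhole bound used in the proof:
assert len(big) <= 2 ** (k + 1)
```

The same arithmetic is what makes it safe to reserve only O(2^{2k}) slots C_0, …, C_{N−1} across all stages of the game.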
16.3 The collection of nonrandom strings

16.3.1 The plain and conditional cases

Probably the most natural computably enumerable set associated with randomness is R_C = {σ : C(σ) < |σ|},⁴ the collection of strings that are nonrandom relative to plain complexity. This set has been extensively studied by Kummer, Muchnik, Positselsky, Allender, and others. We have already seen in Chapter 3 that R_C is a simple c.e. set, and hence is not m-complete. On the other hand, it is easily seen to be Turing complete. (This is Exercise 2.63 in Li and Vitányi [248].) In this section we will prove the remarkable fact, due to Kummer [227],

⁴ Strictly speaking, we should be studying {σ : C(σ) < |σ| + d} where d is largest such that this set is co-infinite. For ease of notation, we will suppress this constant. This simplification does not alter the results we discuss in this section.
that R_C is in fact truth table complete, which is by no means obvious. In fact, the reduction built in Kummer's proof is a conjunctive tt-reduction. (A conjunctive tt-reduction from A to B is a computable function f such that for all n, we have n ∈ A iff D_{f(n)} ⊆ B, where D_e is the finite set with canonical index e.)
The problem with proving this result, roughly speaking, is as follows. Suppose that we are attempting to define a reduction from the halting problem ∅′ to R_C to show that ∅′ ≤_tt R_C. The most natural idea is to have a collection of strings F_n for each n and attempt to ensure that F_n ⊆ R_C iff n ∈ ∅′. If we choose F_n appropriately, then if n enters ∅′ we can lower the complexity of all elements of F_n and force them into R_C. However, it can also happen that these complexities decrease without us doing anything, forcing all of F_n into R_C even though n ∉ ∅′. This possibility is particularly hard to overcome because we need to specify the sets F_n ahead of time, since we are trying to build a tt-reduction. Furthermore, we cannot make the F_n too large. For example, we cannot take F_n to be all the strings of length n, since we cannot force them all into R_C if n enters ∅′. As we will see, Kummer used nonuniformity and a clever counting method to overcome this difficulty. His idea is the basis for other arguments such as Muchnik's Theorem 16.3.1 below. In fact, we will begin with Muchnik's result, since the argument is simpler and the crucial use of nonuniformity is more readily apparent, which should help make the ideas underlying Kummer's proof more transparent.
We consider the overgraph O_Q = {(σ, τ, n) : Q(σ | τ) < n} for Q equal to K, Km, or C (or other similar complexity measures).
Theorem 16.3.1 (Muchnik, see [288]). For any choice of Q, the overgraph O_Q is m-complete.
Proof. We fix Q as C, but the proof works with easy modifications for other complexities. Let O = O_C, and let O_s = {(σ, τ, n) : C_s(σ | τ) < n}. In the course of our construction, we will build a partial computable function f.
By the recursion theorem, we may assume that we know a constant d such that if f(τ) = σ then C(σ | τ) < d. We will build partial functions g_i : ℕ → 2^{<ω} × 2^{<ω} × ℕ for i < 2^d. For the largest i such that g_i(n) is defined for infinitely many n, we will be able to argue that, for all sufficiently large n, we can find a stage s such that either n ∈ ∅′_s, or n ∈ ∅′ iff g_i(n) ∈ O. Let i_0 be this i. Let σ_0, …, σ_{2^d−1} be pairwise distinct strings. The values of g_i will be of the form (σ_i, τ, d). The basic idea is that we will be able to ensure for i_0 that, from some stage on, and for almost all n, the following occurs. First, we can find a stage s such that either n ∈ ∅′_s or g_{i_0}(n) is permanently defined at stage s as (σ_{i_0}, τ, d) for some τ. Next, if the latter case occurs and n enters ∅′ later on, then f(τ) = σ_{i_0}, so g_{i_0}(n) ∈ O by the choice of d. Finally, if n never enters ∅′, then g_{i_0}(n) ∉ O. To ensure this last condition,
we design the construction so that, if g_{i_0}(n) enters O without n entering ∅′, then we define g_j(m) = (σ_j, τ, d) for some j > i_0, which cannot happen more than finitely often by the choice of i_0. We now proceed with the construction.
Construction. Initially, all strings are active. At stage s, proceed as follows. For each active τ with |τ| ≤ s, find the least i < 2^d such that (σ_i, τ, d) ∉ O_s, which must exist since |{σ : (σ, τ, d) ∈ O}| < 2^d. For the least n ∉ ∅′_s such that g_i(n) is not defined, let g_i(n) = (σ_i, τ, d). Undefine g_j(m) for all j < i and all m such that g_j(m) = (σ_j, τ, d). We say that i acts at stage s. If n enters ∅′ at stage s, then for the largest i such that g_i(n) is defined (if any), let τ be such that g_i(n) = (σ_i, τ, d), declare τ to be inactive, and let f(τ) = σ_i. End of Construction.
Let i < 2^d be largest such that i acts at infinitely many stages. Let s_0 be a stage after which no j > i ever acts. We describe an m-reduction from ∅′ to O. Let n be large enough that g_i(n) is not defined at the end of stage s_0 and g_j(n) is never defined for j > i. Search for a stage s > s_0 such that either n ∈ ∅′_s or g_i(n) becomes defined at stage s. It is clear from the construction that one of these events must occur. If it is the former, then we know that n ∈ ∅′. Otherwise, we claim that g_i(n) ∈ O iff n ∈ ∅′. If n ∈ ∅′, then n enters ∅′ at some stage t > s. At that stage, i is largest such that g_i(n) is defined, so f(τ) = σ_i, and hence C(σ_i | τ) < d, which means that g_i(n) = (σ_i, τ, d) ∈ O. Conversely, suppose that g_i(n) = (σ_i, τ, d) ∈ O. Assume for a contradiction that n ∉ ∅′. At stage s, we cancel all g_j(m) such that j < i and g_j(m) = (σ_j, τ, d), and no g_j(m) ever gets defined as (σ_j, τ, d) after that stage, so τ is permanently active. Let t > s_0 be such that (σ_i, τ, d) ∈ O_t. We must have (σ_j, τ, d) ∈ O for j < i, since otherwise g_i(n) would not have been defined as (σ_i, τ, d).
But by the choice of s_0 and t, there cannot be a j > i such that (σ_j, τ, d) ∉ O_t, since then the least such j would act at stage t. So in fact (σ_j, τ, d) ∈ O for all j < 2^d, which is not possible.
We now prove Kummer's Theorem. The construction has a finite set of parameters (the i < 2^{d+1} below). We will build our tt-reduction based on the largest parameter selected infinitely often during the construction. The necessity of knowing this parameter to define the reduction is the source of the nonuniformity in the proof.
Theorem 16.3.2 (Kummer [227]). R_C is truth table complete.
Proof. We will define a plain machine Φ with constant d given by the recursion theorem. To do so, we will build uniformly c.e. sets of strings E_0, E_1, … such that for each n we have E_n ⊆ 2^n and |E_n| ≤ 2^{n−d−1}. Then for each n, if σ ∈ E_n we can ensure that Φ(τ) = σ for some τ ∈ 2^{n−d−1}.
By doing so, we will ensure that for each n and σ ∈ E_n, we have C(σ) ≤ C_Φ(σ) + d < n, or, in other words, that ⋃_n E_n ⊆ R_C. To obtain our tt-reduction, for each i < 2^{d+1}, we will define a possibly infinite sequence of finite sets of strings S_{i,0}, S_{i,1}, …. At the end of the proof we will argue that there is a largest i such that the corresponding sequence is infinite, and such that for almost all x, we have x ∈ ∅′ iff S_{i,x} ⊆ R_C. Hence, we will have a tt-reduction (indeed, a conjunctive tt-reduction) from ∅′ to R_C.
The definition of the S_{i,x} has stronger priority than that of the S_{j,y} for j < i. If S_{j,y} is defined then all of its elements will have the same length, denoted by m_{j,y}. At a later stage, some stronger priority S_{i,x} may assert control and occupy this length, in which case we will undefine m_{j,y}. (The idea here is that, for this situation to occur, the opponent will have had to make more strings of this length be nonrandom, and hence the new choice S_{i,x} will be more likely to succeed. See condition (iii) in the construction below.) Finally, if m_{i,x} is defined and x enters ∅′, then we will let E_{m_{i,x}} = S_{i,x}, ensuring that S_{i,x} ⊆ R_C, since ⋃_n E_n ⊆ R_C. We will then declare the value of m_{i,x} to be used, and hence unavailable for future use.
We will also have to argue that if x never enters ∅′, then S_{i,x} ⊈ R_C. The basic idea here is that we work with the largest i such that all S_{i,x} are defined. We set up our conditions so that if S_{i,x} ⊆ R_C, then this increase in the number of nonrandom strings of a particular length allows some j > i to define a new S_{j,y}. So if this situation happens infinitely often, then we contradict the maximality of i. We now proceed with the formal construction. Let O_{n,s} = {σ ∈ 2^n : C_s(σ) < n}.
Construction. Fix a parameter d.
Stage 0. Let ℓ(i) = 0 for all i.
Stage s + 1. Check whether there is an i < 2^{d+1} and an n ≤ s such that

(i) n is unused and n ≥ d + 1,
(ii) n ≠ m_{j,x} for all j ≥ i and all x, and
(iii) i·2^{n−d−1} ≤ |O_{n,s}|.

If so, then choose the largest such i, and for this i the least such n, and then perform the following actions.

1. Declare m_{j,x} to be undefined for all j and x such that m_{j,x} = n.
2. Let S_{i,ℓ(i)} be the set of the least k elements of 2^n \ O_{n,s}, where k = min(2^n − |O_{n,s}|, 2^{n−d−1}).
3. Let m_{i,ℓ(i)} = n.
4. Increment ℓ(i) by 1.
For all j and x such that x ∈ ∅′_{s+1} and m_{j,x} is defined and unused, let E_{m_{j,x}} = S_{j,x} and declare that m_{j,x} is used. End of Construction.
Clearly, the E_n are uniformly c.e. Furthermore, at any stage s + 1, if m_{j,x} is defined then |S_{j,x}| ≤ 2^{m_{j,x}−d−1}, so |E_n| ≤ 2^{n−d−1} for all n. Thus we can define a plain machine Φ such that, for each n, if σ ∈ E_n then Φ(τ) = σ for some τ ∈ 2^{n−d−1}. By the recursion theorem, we may assume that d is a coding constant of Φ, so that for each n and σ ∈ E_n, we have C(σ) ≤ C_Φ(σ) + d < n. In other words, ⋃_n E_n ⊆ R_C.
Conditions (i)–(iii) in the construction always hold for i = 0 and n = s (once s ≥ d + 1). So at every large enough stage some i is chosen. Since there are only 2^{d+1} many relevant i, there is a largest i such that i is chosen infinitely often. Let i_0 be this i. Let s_0 be such that no j > i_0 is ever chosen at a stage after s_0. Notice that S_{i_0,x} is defined for all x, and if x ≠ y then the strings in S_{i_0,x} and S_{i_0,y} are of different lengths. Furthermore, once m_{i_0,x} is defined at a stage after s_0, which must happen, then m_{i_0,x} is never later undefined. So there is an x_0 such that for all x ≥ x_0, once m_{i_0,x} is defined, it is never later undefined.
Lemma 16.3.3. Let x ≥ x_0. If x ∈ ∅′, then E_{m_{i_0,x}} = S_{i_0,x}, while if x ∉ ∅′, then E_{m_{i_0,x}} = ∅.
Proof. Let s be the stage at which m_{i_0,x} first becomes defined, with value n. Then n is unused at the beginning of stage s, whence E_n = ∅ at the beginning of that stage. Since m_{i_0,x} = n at all later stages, E_n remains empty at all later stages unless x enters ∅′, in which case it is defined to be S_{i_0,x}.
Corollary 16.3.4. For every x ≥ x_0, we have x ∈ ∅′ iff S_{i_0,x} ⊆ R_C.
Proof. As argued above, ⋃_n E_n ⊆ R_C, so by the lemma, if x ≥ x_0 and x ∈ ∅′, then S_{i_0,x} ⊆ R_C. For the other direction, assume for a contradiction that there is an x ≥ x_0 such that x ∉ ∅′ but S_{i_0,x} ⊆ R_C. Let s + 1 be the stage at which m_{i_0,x} becomes defined, with value n. By construction, we have |O_{n,s}| ≥ i_0·2^{n−d−1} and S_{i_0,x} ∩ O_{n,s} = ∅.
By hypothesis, there is a stage t > max(s_0, s) such that S_{i_0,x} ⊆ O_{n,t}. Now, |S_{i_0,x}| is either 2^n − |O_{n,s}| or 2^{n−d−1}. It cannot be the former, since then we would have O_{n,t} = 2^n, which is clearly impossible. So |S_{i_0,x}| = 2^{n−d−1}, and hence |O_{n,t}| ≥ (i_0 + 1)2^{n−d−1}. Since |O_{n,t}| < 2^n, we have i_0 + 1 < 2^{d+1}. Thus conditions (i)–(iii) in the construction are satisfied at stage t + 1 for i = i_0 + 1 and n = m_{i_0,x}, and hence the i chosen at stage t + 1 is greater than i_0, contradicting the fact that t > s_0. It follows immediately from the corollary that ∅′ ≤_tt R_C.
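To make the bookkeeping of the construction concrete, here is a toy replay of the stage loop. This is a sketch only: the sets O_{n,s} and the ∅′-enumeration are mocked inputs, d is artificially small, and "least k elements" is taken in lexicographic order; none of the real complexity machinery is simulated.

```python
from itertools import product

def kummer_stages(d, stages, O, halting):
    """Toy replay of the stage loop in Kummer's construction.
    O(n, s): set of 'nonrandom' strings in 2^n at stage s (mocked).
    halting(s): elements of the halting-set approximation at stage s (mocked)."""
    used, m, S, ell, E = set(), {}, {}, {}, {}
    for s in range(1, stages + 1):
        candidates = []
        for i in range(2 ** (d + 1)):
            for n in range(d + 1, s + 1):            # condition (i): d+1 <= n <= s
                if n in used:
                    continue
                if any(mn == n for (j, _x), mn in m.items() if j >= i):
                    continue                          # condition (ii)
                if i * 2 ** (n - d - 1) <= len(O(n, s)):
                    candidates.append((i, n))         # condition (iii)
        if candidates:
            i = max(c[0] for c in candidates)         # largest such i,
            n = min(c[1] for c in candidates if c[0] == i)  # then least such n
            for key in [key for key, mn in m.items() if mn == n]:
                del m[key]                            # action 1: steal length n
            taken = O(n, s)
            free = sorted(''.join(b) for b in product('01', repeat=n)
                          if ''.join(b) not in taken)
            k = min(2 ** n - len(taken), 2 ** (n - d - 1))
            l = ell.get(i, 0)
            S[(i, l)], m[(i, l)] = set(free[:k]), n   # actions 2 and 3
            ell[i] = l + 1                            # action 4
        for (j, x), mn in list(m.items()):            # end-of-stage E update
            if x in halting(s) and mn not in used:
                E[mn] = S[(j, x)]
                used.add(mn)
    return E, m, S
```

With an empty O and a mock enumeration in which only 0 ever appears, index i = 0 acts at every stage, and E_{m_{0,0}} is filled exactly when 0 shows up.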
With a relatively straightforward modification of the previous proof (using suitable sets of strings L_n in place of 2^n), Kummer proved the following result, which applies to sets like {σ : C(σ) ≤ log |σ|}.
Theorem 16.3.5 (Kummer [227]). Let f be computable, with f(σ) ≤ |σ| and lim_{|σ|→∞} f(σ) = ∞. Then the set {σ : C(σ) < f(σ)} is conjunctive tt-complete.
There has been some exciting work on efficient reductions to R_C, such as that of Allender, Buhrman, Koucký, van Melkebeek, and Ronneburger (see e.g. [4]), among others. These authors observed that the Kummer reduction, while having computable use, has exponential growth in the number of queries used. (It is known that, for instance, a polynomial number of queries cannot be achieved.) They asked which sets can be, say, polynomial time reduced to R_C, along with other time- and space-bounded variants of this question. Amazingly, these questions seem to have bearing on basic separation problems between complexity classes. For instance, PSPACE ⊆ P^{R_C}. The methods used in this work are highly nontrivial, and beyond the scope of this book.
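The shape of a conjunctive tt-reduction is easy to state operationally (a sketch; the function f and the oracle below are toy stand-ins, not the objects built in the proofs of this section):

```python
def conjunctive_tt(n, f, oracle):
    """n is in the reduced set iff every element of the finite query set f(n)
    lies in the oracle set: the defining property of a conjunctive tt-reduction."""
    return all(q in oracle for q in f(n))

# toy stand-ins: hypothetical query sets and a hand-picked oracle
f = lambda n: {('0' * n, j) for j in range(1, 3)}
oracle = {('0', 1), ('0', 2), ('00', 1)}
assert conjunctive_tt(1, f, oracle) is True    # both queries present
assert conjunctive_tt(2, f, oracle) is False   # ('00', 2) is missing
```

The point of the nonuniformity discussion above is that the query sets must be fixed in advance, before the oracle's behavior is known.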
16.3.2 The prefix-free case

As in defining a notion of randomness for strings using prefix-free complexity (see Section 3.8), we have two reasonable analogs of R_C in the prefix-free case. Let

R_K^{str} = {σ : K(σ) < |σ| + K(|σ|)}

(the set of non-strongly K-random strings) and

R_K = {σ : K(σ) < |σ|}

(the set of non-weakly K-random strings). Clearly R_K is c.e., but it had been a long-standing open question, going back to Solovay [371], whether R_K^{str} is c.e. This question was answered by Miller [271].
Theorem 16.3.6 (Miller [271]). Fix c ≥ 0 and let B = {σ : K(σ) < |σ| + K(|σ|) − c}. Let A ⊇ B be such that |A ∩ 2^n| ≤ 2^{n−1} for all n. Then A is not c.e.
Before proving this theorem, we point out two consequences. The first one, which follows immediately from the theorem, answers Solovay's question.
Corollary 16.3.7. For all c ≥ 0, the set {σ : K(σ) < |σ| + K(|σ|) − c} is not c.e.
Miller [271] pointed out that Theorem 16.3.6 also gives a weak form of Solovay's theorem that there are strings that are C-random but not
strongly K-random, Corollary 4.3.8. While the following result is weaker than Solovay's, its proof is much easier.
Corollary 16.3.8. Let c ≥ 0 and d > 0. There is a string that is C-random with constant d but not strongly K-random with constant c.
Proof. If this is not the case, then the set B in Theorem 16.3.6 is contained in A = {σ : C(σ) < |σ| − d}. But A is c.e. and |A ∩ 2^n| < 2^{n−d} for all n, contradicting the theorem.
Proof of Theorem 16.3.6. Assume for a contradiction that A is c.e. We build a KC set L to define a prefix-free machine M, with coding constant k given by the recursion theorem. For any s and n such that K_s(n) < K_{s−1}(n), find 2^{n−k−c−2} many strings of length n that are not in A_s, which exist by hypothesis, and for each such string τ, enumerate (n + K_s(n) − k − c − 1, τ) into L. It is easy to check that L is indeed a KC set. If K_s(n) = K(n), then our definition of M ensures that at least 2^{n−k−c−2} many strings of length n that are not in A_s are each compressed by at least c + 1 many bits. These strings must then eventually be added to B, and hence to A, so |A ∩ 2^n| ≥ |A_s ∩ 2^n| + 2^{n−k−c−2}.
Let b be greatest such that |A ∩ 2^n| ≥ b·2^{n−k−c−2} for infinitely many n. Define a partial computable function f as follows. If |A_s ∩ 2^n| ≥ b·2^{n−k−c−2}, then let f(n) = K_s(n). By the argument above and the choice of b, for almost all n, if f(n) is defined then f(n) = K(n). Furthermore, the choice of b guarantees that f has infinite domain. Thus, by changing finitely many values of f, we obtain a partial computable function with infinite domain that is equal to K(n) wherever it is defined, contradicting Proposition 3.5.5.
Let us now turn to R_K. This set is c.e., and it is also clearly Turing complete. But is it tt-complete? An. A. Muchnik (see [288]) showed that the answer may be negative, depending on the choice of universal machine. (We will see later that the answer may also be positive for other choices of universal machine.) Thus we will consider the set R_K^U for a particular universal prefix-free machine U, which is R_K when K is defined using U. Muchnik also considered the overgraph of K, that is, the set

O_K^U = {(σ, n) : K(σ) < n},

where K is defined using U. Of course, Kummer's Theorem 16.3.2 implies that the overgraph of C, that is, the set O_C = {(σ, n) : C(σ) < n}, is tt-complete independently of the choice of universal machine, but Muchnik showed that the situation is different for K. Part (ii) of the proof below has a number of interesting new ideas, especially casting complexity considerations in the context of games on finite directed graphs.
Theorem 16.3.9 (Muchnik, see [288]). There exist universal prefix-free machines⁵ V and U such that

(i) O_K^V is tt-complete, but

(ii) O_K^U (and hence R_K^U) is not tt-complete.
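The proof of part (ii) below relies on finite games in which a player may stay in the current position but may never return to a position once it is left; such games are determined, and a winner can be computed by backward induction. The following solver is a minimal sketch under those move semantics. The encoding of positions, the winning set S1, the convention that player 1 moves first, and the assumption that the move graphs are acyclic are all illustrative choices, not part of Muchnik's construction.

```python
from functools import lru_cache

def winner(moves1, moves2, S1, d0):
    """Backward induction for a finite two-player game in which each player may
    either stay put or move along edges that can never be revisited; if play
    stabilizes at position d, player 1 wins iff d is in S1 (player 2 otherwise).
    moves1/moves2 map a position to the positions that player may move to.
    Assumes the move graphs are acyclic. Returns the winning player (1 or 2)."""
    @lru_cache(maxsize=None)
    def win(d, turn):
        if turn == 1:
            # move to a position that is winning with player 2 to move, or
            if any(win(e, 2) == 1 for e in moves1.get(d, ())):
                return 1
            # sit at d forever: safe iff d favours player 1 and player 2
            # has no exit that turns the game in player 2's favour
            if d in S1 and all(win(e, 1) == 1 for e in moves2.get(d, ())):
                return 1
            return 2
        else:
            if any(win(e, 1) == 2 for e in moves2.get(d, ())):
                return 2
            if d not in S1 and all(win(e, 2) == 2 for e in moves1.get(d, ())):
                return 2
            return 1
    return win(d0, 1)
```

For instance, winner({0: (1, 2)}, {}, {1}, 0) is 1: player 1 moves to position 1 and then stays forever, while with an empty winning set every stable position favours player 2.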
Proof. (i) We define V using our fixed universal prefix-free machine U. Let g(k) = k if k is even and g(k) = k + 1 if k is odd. Let

L = {(g(k) + 3, σ) : K_U(σ) ≤ k} ∪ {(g(k) + 2, 0^n) : K_U(0^n) ≤ k ∧ n ∈ ∅′}.

Then L is clearly a KC set. Let V be the prefix-free machine corresponding to L. Then V is universal, and n ∈ ∅′ iff K_V(0^n) is even. If c is such that K_V(0^n) ≤ log n + c for all n, then K_V(0^n) can be determined by querying O_K^V on (0^n, k) for k ≤ log n + c, and hence ∅′ ≤_tt O_K^V.
(ii) This argument is much more difficult. We will follow Muchnik's proof, which uses strategies for finite games. It may well be possible to remove this feature of the proof, but it is an interesting technique in its own right. The kind of game we will consider here is determined by

1. a finite set of positions (vertices),
2. a set of allowable moves for each of the two players,
3. two complementary subsets of the positions, S_1 and S_2, and
4. an initial position d_0.

The set of moves will allow players to stay in the same position, but will not allow a position to be returned to once it is left. The game begins at position d_0, and the two players play in turns, making allowable moves from one position to another. Since positions cannot be returned to once left, the game must stabilize at some stage, and if that stable position is in S_i, then player i wins. Clearly this game is determined, and we can effectively find a winning strategy for one of the players.⁶

⁵ A minor issue here is that, as built in the following proof, V and U will not necessarily be universal by adjunction, as defined in Section 3.1. Fortunately, it is easy to get around this problem. Suppose that M is a universal prefix-free machine and c is such that K_M(σ) ≤ K(σ) + c for all σ, where K is defined using the standard prefix-free machine U. Let τ be a string incomparable with every element of dom M such that |τ| > c. Let M′ be defined by letting M′(τρ) = U(ρ) and M′(ν) = M(ν) for all other ν. Then M′ is universal by adjunction and O_K^{M′} = O_K^M.

⁶ One procedure for doing so is to consider the set of all positions d ≠ d_0 such that (d_0, d) is an allowable move by player 1, then for each such d consider a separate game defined like the original one but starting from d, and with the order of the players reversed. We can then continue to define residual games of this kind with fewer and fewer positions, and finally work backwards to construct a winning strategy for one of the players. See for instance Khoussainov and Nerode [202] for more details on finite games.

We will construct a Σ⁰₁ function F and define H(σ) = min{K(σ) + 2, F(σ)}. We will show that Σ_σ 2^{−H(σ)} ≤ 1, and hence we can use the KC Theorem to build a corresponding prefix-free machine U, which will clearly be universal. We will also construct a computably enumerable set W such that W ≰_tt O_K^U. Let Γ_n denote the nth partial truth table reduction. We will construct the diagonalizing set W as a set of triples ⟨n, i, j⟩, using those with first coordinate n to diagonalize against Γ_n. In the construction, we will have numbers i_n and j_n and a finite game G_n associated with n. Each time n is considered, we can either let one of the players play the current game G_n, or we can change both G_n and the numbers i_n and j_n.
The positions of a game G_n will be determined by a finite set of strings A_n, a set of functions A_n → ℕ, and a set of pairs of rational numbers. The idea is the following. We wait until Γ_n provides us with a truth table for the input ⟨n, i_n, j_n⟩. This truth table will query O_K^U on certain pairs (σ, k). We let A_n be the set of all such σ, except for those already in A_m for m < n. We now define the game G_n to consist of positions (h, q, p), where the h : A_n → ℕ are potential values of H on A_n, and q and p are rationals encoding how much making H equal to such an h on A_n would cost in terms of the change to Σ_σ 2^{−H(σ)}. Each h, and hence each position, can be used to provide a guess as to the values of O_K^U queried by Γ_n on input ⟨n, i_n, j_n⟩, by assuming that H equals h on A_n. (More precisely, we need to take h together with values corresponding to strings in A_m for m < n. We assume these values are fixed. If that is not the case, then we will eventually cancel our game and start a new one.) Thus we can divide such positions into sets X_0 and X_1, where X_i consists of those positions that guess that Γ_n^{O_K^U}(⟨n, i_n, j_n⟩) = i.
Either player 1 has a winning strategy to eventually stabilize the game in a position in X_0, or player 2 has a winning strategy to eventually stabilize the game in a position in X_1. But in the latter case, player 1 can stay in position d_0 in its first move, and then emulate player 2's winning strategy. So for some i = 0, 1, player 1 has a winning strategy to eventually stabilize the game in a position in X_i. Player 1 adopts that strategy for its moves in G_n. If i = 0, then we put ⟨n, i_n, j_n⟩ into W. If i = 1, then we keep ⟨n, i_n, j_n⟩ out of W. This action will ensure that, if player 1 can carry out its strategy, then Γ_n^{O_K^U}(⟨n, i_n, j_n⟩) ≠ W(⟨n, i_n, j_n⟩), because in that case F will be defined to ensure that H is actually equal to h on A_n for the value of h such that G_n stabilizes on some (h, q, p). Player 1's moves will determine the value of F (or, more precisely, the approximation F_s to F) on A_n, and hence will have to be carefully bounded to ensure that Σ_σ 2^{−H(σ)} ≤ 1. Player 2's moves will be determined by the value of K_s on A_n. The rational q will code how much has been spent on changing F, and the rational p will code how much of the measure of the standard universal prefix-free machine has been spent by player 2 in changing K_s at strings in A_n.
We place a bound on p and q. If stronger priority strategies (those working for Γ_m with m < n) do not intervene, then either player 1 will be able to complete its strategy, or the approximation to the values of K on strings in A_n will change so much that player 2 will want to violate the rules of the game (by moving to an (h, q, p) where p is above our bound). In that case we cancel the game and start a new one. This situation will occur only finitely often, since each time it does, it causes a large change in the values of K on A_n.
We begin with i_n = j_n = 0. Each time we reconsider n, we increase the value of i_n by 1 if there is a number m < n such that either the position of G_m changed or G_m itself changed since the last time we considered n. If a change in the estimate of K_s violates the rules of G_n, as explained below, we increase the value of j_n by 1. The bound on p and q decreases every time i_n changes, but not when j_n changes.
Construction. Initially, F_0(σ) = ∞ for all σ, and all games G_n and sets A_n are undefined.
Stage s = ⟨n, e⟩. If e = 0, let i_n = j_n = 0. Otherwise, check whether there is an m < n such that either the position of G_m has changed since stage ⟨n, e − 1⟩ or G_m has been redefined since that stage. If so, then undefine G_n and increase i_n by 1.
Case 1. G_n is not defined. Run s many steps in the computation of the table γ_n(⟨n, i_n, j_n⟩) for the reduction Γ_n with argument ⟨n, i_n, j_n⟩. If no table is produced, then G_n remains undefined and we end the stage, with F_{s+1}(σ) = F_s(σ) for all σ. Otherwise, we define the game G_n. The queries made by γ_n(⟨n, i_n, j_n⟩) will be of the form (σ, k). Let B_n be the set of first coordinates of such queries. Let A_n = B_n \ ⋃_{m<n} A_m.
As discussed above, for some i = 0, 1, player 1 has a winning strategy to eventually stabilize the game in a position in X_i. Player 1 adopts that strategy for its moves in G_n. If i = 0, then we put ⟨n, i_n, j_n⟩ into W. If i = 1, then we keep ⟨n, i_n, j_n⟩ out of W. We have now defined G_n, and we end the stage, with F_{s+1}(σ) = F_s(σ) for all σ.
Case 2. G_n is defined. Then first let player 1 make a move according to its strategy. This move will be from a position (h, q, p) to a position (h′, q′, p). If h′(σ) < h(σ), then let F_{s+1}(σ) = h′(σ). For all other σ, let F_{s+1}(σ) = F_s(σ). Now let player 2 move as follows. Let (h′, q′, p) be the current position. Let h″(σ) = min{K_s(σ) + 2, F_{s+1}(σ)} for σ ∈ A_n. If there is a position (h″, q′, p″) that player 2 can move to, then do so. Otherwise, player 2 has violated the rules of the game, so undefine G_n and A_n. In either case, end the stage. End of Construction.
Let H_s(σ) = min{K_s(σ) + 2, F_s(σ)} and let H(σ) = lim_s H_s(σ). Let O = {(σ, k) : H(σ) < k}. Assume by induction that there is a least stage t = ⟨n, e⟩ such that i_n is never incremented after stage t, and that for all A_m with m < n defined at stage t (which by the first hypothesis are all the A_m that will ever be defined from that stage on), we have H_s(σ) = H(σ) for all σ ∈ A_m and all s ≥ t. Suppose that a version of G_n is first defined at stage s ≥ t. If player 2 never violates the rules of that game, then the game will eventually stabilize in a position (h, q, p), say by stage u = ⟨n, e′⟩. By construction, for all σ ∈ A_n, we have H(σ) = H_u(σ) = h(σ), since any change in such an H-value at stage u or later would correspond to a move by at least one of the players in G_n, or cause the game to become undefined. So if we let X_0 and X_1 be as defined in the construction at stage u and let i be such that (h, q, p) ∈ X_i, then Γ_n^O(⟨n, i_n, j_n⟩) = i. Since in this case we ensured that W(⟨n, i_n, j_n⟩) = 1 − i, we have Γ_n^O ≠ W.
Let h_0 and q_0 be as in the construction at the stage s at which G_n was defined. Suppose that at stage v = ⟨n, k⟩ ≥ s we reach position (h, q, p) by the end of the stage, and then at stage v′ = ⟨n, k + 1⟩ we reach position (h′, q′, p′). Then it is easy to check from the construction that

p′ − p ≤ Σ_{σ∈A_n} 2^{−K_{v′}(σ)−2} − Σ_{σ∈A_n} 2^{−K_v(σ)−2}.

Thus, if player 2 ever violates the rules of G_n at stage w, then

Σ_{σ∈A_n} 2^{−K_w(σ)−2} − Σ_{σ∈A_n} 2^{−K_s(σ)−2} ≥ 2^{−(n+i_n+3)}.

So this situation can occur only finitely often. Thus, either from some point on G_n is permanently undefined, in which case Γ_n is not total and the inductive hypotheses clearly hold of n, or there is a final version of G_n which, as explained above, eventually stabilizes and ensures that Γ_n^O ≠ W. In either case, the induction can continue.
Thus, all that is left to prove is that Σ_σ 2^{−H(σ)} ≤ 1. Then, using the KC Theorem, we get a prefix-free machine U corresponding to H, which
U is clearly universal by the definition of H. For this U , we have O = OK , U whence O is not tt-complete. K Since σ 2−(K(σ)+2) < 14 , it is enough to show that r = σ 2−F (σ) 34 . Suppose that σ ∈ An at some stage. Then for the current version of Gn , one of three things happens. Either
1. Gn is eventually canceled by in being increased, or 2. Gn is eventually canceled by player 2 violating its rules, or 3. Gn is permanently defined. Let s0 be the stage at which the current version of Gn was defined. It is easy to see by construction that until Gn is canceled (if ever), 2−Fs (σ) − 2−Fs0 (σ) < 2−(n+in +3) . Thus the amount added to r in cases 1 and 2 is at most 2−(n+in +3) (and note that these cases cannot occur more than once for the same value of in ). Furthermore, if case 2 happens at stage s then we must have 2−Fs (σ) − 2−Fs0 (σ) < 2−(n+in +3) τ ∈An 2−Ks (τ )−2 − −Ks0 (τ )−2 . Since no string can be in two different Am ’s at the τ ∈Am 2 −Fs (σ) same stage, we see that the sum of 2 − 2−Fs0 (σ) over all situations in −K(τ )−2 which case 2 occurs is bounded by τ 2 . Thus we have 3 1 1 2−F (σ) 2−(n+i+3) + 2−K(τ )−2 < + = . 2 4 4 σ n τ i One of the original uses Muchnik made of the construction above was to show that for each d there are strings σ and τ such that K(σ) > K(τ ) + d and C(τ ) > C(σ) + d, Theorem 4.4.1. A particularly interesting feature of this approach is that it uses a construction that depends on the choice of a particular universal machine to prove a fact that is independent of any such choice. Allender, Buhrman, and Kouck´ y [3] built a universal machine that makes the collection of non-C-random strings tt-complete. They noted that their proof technique can also be used for prefix-free machines, and indeed it is generalizable to other kinds of machines as well. We give a proof due to Day [personal communication], though the original proof contained similar ideas. This proof covers at once the cases of prefix-free machines, process machines (normal and strict), and monotone machines (for Km and for KM ). Theorem 16.3.10 (Allender, Buhrman, and Kouck´ y [3], also Day [personal communication]). Let Q be any of the following complexity measures: C, K, Km, KM , Km D , or Km S . 
There is a universal machine $U$ for $Q$ such that the set of nonrandom strings $\overline{R^U_Q}$ is tt-complete.

Proof. We will do the proof for the case $Q = Km$, and note at the end that it works in the other cases as well.
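As a numeric aside (my own check, not part of the text), the double sum $\sum_n \sum_i 2^{-(n+i+3)}$ appearing in the bound at the end of Muchnik's argument above indeed converges to $\frac{1}{2}$:

```python
def game_sum(limit=60):
    """Partial sums of 2^-(n+i+3) over n, i >= 0; the full sum is 1/2."""
    return sum(2.0 ** -(n + i + 3) for n in range(limit) for i in range(limit))

# 1/2 from the games plus the assumed bound of 1/4 on the K-measure
# keeps the total strictly below 3/4, as the proof requires.
print(game_sum())
```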
750
16. Complexity of Computably Enumerable Sets
For a monotone machine $M$, let $Km_M(\sigma) = \min\{|\tau| : \sigma \preceq M(\tau)\downarrow\}$ and $R^M_i = \{\sigma : Km_M(\sigma) \leq |\sigma| - i\}$. We will use the following lemma.

Lemma 16.3.11. If $M$ is a monotone machine then $\mu(R^M_i) \leq 2^{-i}$ for all $i$.

Proof. Let $M$ be a monotone machine and fix $i$. Let $A \subseteq R^M_i$ be the set of minimal elements of $R^M_i$ under the $\preceq$ relation. For each $\sigma \in A$ let $\sigma'$ be a minimal-length string such that $\sigma \preceq M(\sigma')$. Since $A$ is prefix-free, so is $A' = \{\sigma' : \sigma \in A\}$. Thus
$$\mu(R^M_i) = \mu(A) = \sum_{\sigma \in A} 2^{-|\sigma|} \leq \sum_{\sigma \in A} 2^{-Km_M(\sigma)-i} = 2^{-i} \sum_{\sigma \in A} 2^{-|\sigma'|} \leq 2^{-i}.$$

The idea behind this proof is to construct a universal machine $V$ that describes strings symmetrically; that is to say, if $\sigma$ has a description of length $n$, then $\overline{\sigma}$ has a description of length $n$ as well (where $\overline{\sigma}$ is the string of length $|\sigma|$ defined by $\overline{\sigma}(n) = 1 - \sigma(n)$). Then given $V$, we will build another universal machine $U$ but break the symmetry of the set of nonrandom strings of $U$ in order to encode $\emptyset'$.

Given a universal monotone machine $M$, first construct another universal machine $V$ as follows. Each time a pair $(\tau, \sigma)$ enters $M$, add $(0\tau, \sigma)$ and $(1\tau, \overline{\sigma})$ to $V$. This symmetrical construction means that the machine $V$ has the property that $\tau \in R^V_i$ if and only if $\overline{\tau} \in R^V_i$, for all $i$.

We will construct another universal machine $U$ from $V$ and a prefix-free machine $M$ as follows. We will let $U(00\sigma) = V(\sigma)$ for all $\sigma$. We will also define $M$ and let $U(1\sigma) = M(\sigma)$ for all $\sigma$. The job of the machine $M$ is to break the symmetry of $U$ in certain places to encode $\emptyset'$ into $\overline{R^U_{Km}}$. To do so, we will make use of a computable sequence of natural numbers $D$ and the KC Theorem.

At stage 0, let $M_0 = \emptyset$, and let $D$ be the empty sequence.

At stage $2s+1$, ensure that for all $(\tau, \sigma)$ in $V_s$, we have $(00\tau, \sigma) \in U_{2s+1}$.

At stage $2s+2$, let
$$X_s = \emptyset'_s \cap \{x : |2^{x+4} \cap R^{U_{2s+1}}_1| \text{ is even}\}.$$
Let $x_0, \ldots, x_n$ be the elements of $X_s$. Let $l$ be the current length of the sequence $D$. For each $i \leq n$, extend the sequence $D$ by letting $d_{l+i} = x_i + 2$. For each $i$, use the KC Theorem to find a string $\sigma_{l+i}$ of length $d_{l+i}$ such that we can define $M(\sigma_{l+i}) = 11\sigma_{l+i}$ while keeping $M$ prefix-free. We will argue below that we do not run out of room for such strings. Enumerate $(1\sigma_{l+i}, 11\sigma_{l+i})$ into $U$, which ensures that $11\sigma_{l+i}$ is nonrandom with respect to $U$. Note that $|11\sigma_{l+i}| = 2 + |\sigma_{l+i}| = 2 + d_{l+i} = x_i + 4$. The idea here is to attempt to make $|2^{x_i+4} \cap R^{U_{2s+2}}_1|$ odd by adding a new string to $R^{U_{2s+2}}_1$ of length $x_i + 4$.

To verify that this construction succeeds, we will show in the following lemma that each KC request we make can be honored. This lemma is the essence of the proof. The following is the basic reason it holds. Given any
16.3. The collection of nonrandom strings
751
$x$, assume we request some string $\sigma$ of length $x+2$. We need to request another string of length $x+2$ only if $11\sigma$ becomes nonrandom. However, this situation must be caused by a description coming from $V$ of length no greater than $|11\sigma| - 3$ (because of how $V$ is coded into $U$), and we can bound the measure of all such descriptions.

Lemma 16.3.12. $\sum_i 2^{-d_i} \leq 1$.

Proof. Let $d_0, \ldots, d_n$ be any initial segment of the sequence $D$. Let $I$ be the set of numbers that occur in $d_0, \ldots, d_n$. Note that the elements of $I$ are all greater than or equal to 2. For each $i \in I$, let $k_i$ be the index of the last occurrence of $i$ in the sequence. Let $K = \{k_i : i \in I\}$. As $0, 1 \notin I$, we have $\sum_{i \in K} 2^{-d_i} < \frac{1}{2}$.

Let $J = \{0, \ldots, n\} \setminus K$. If $j \in J$, then $d_k = d_j$ for some $k > j$. So at some stage after $d_j$ is defined, $|2^{d_j+2} \cap R^{U_s}_1|$ is even again. Taking $\sigma = 11\sigma_j$, this situation can happen only if $\overline{\sigma}$ enters $R^U_1$. Thus, $\overline{\sigma} \in R^V_3$ and so $\sigma \in R^V_3$. Thus $\{11\sigma_j : j \in J\} \subseteq R^V_3$. Now, because $\{11\sigma_j : j \in J\}$ is a prefix-free set,
$$\sum_{j \in J} 2^{-d_j-2} = \sum_{j \in J} 2^{-|11\sigma_j|} = \mu(\{11\sigma_j : j \in J\}) \leq \mu(R^V_3) \leq \frac{1}{8}.$$
Thus $\sum_{j \in J} 2^{-d_j} \leq \frac{1}{2}$, and so $\sum_{i \leq n} 2^{-d_i} = \sum_{i \in K} 2^{-d_i} + \sum_{i \in J} 2^{-d_i} < \frac{1}{2} + \frac{1}{2} = 1$. Hence $\sum_i 2^{-d_i} = \lim_n \sum_{i \leq n} 2^{-d_i} \leq 1$.

Lemma 16.3.13. $x \in \emptyset'$ iff $|2^{x+4} \cap R^U_1|$ is odd.

Proof. If $x \notin \emptyset'$ then $x \notin X_s$ for all $s$, so if $\sigma$ is any string of length $x+4$ in $R^U_1$, then $\sigma \in R^V_3$, and hence $\overline{\sigma} \in R^V_3$ and $\overline{\sigma} \in R^U_1$. Thus $|2^{x+4} \cap R^U_1|$ is even.

If $x \in \emptyset'$, then let $I = \{s : x \in X_s\}$. The set $I$ is finite because $\sum_i 2^{-d_i}$ converges, so let $s_0 = \max(I)$. If $t > 2s_0 + 2$, then $|2^{x+4} \cap R^{U_t}_1|$ is odd, and thus $|2^{x+4} \cap R^U_1|$ is odd.

Lemma 16.3.11 will still hold if $Km$ is replaced by $KM$. It will also hold if a universal prefix-free machine, a universal strict process machine, or a universal process machine is used instead of a universal monotone machine (with $K$, $Km_S$, or $Km_D$ as the respective complexity measures), since prefix-free machines, process machines, and strict process machines are all monotone machines. As the machine $M$ built in the above proof is a prefix-free machine (and so also a (strict) process machine), the same construction can be used to prove all the cases in the statement of the theorem.
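The parity coding at the heart of the proof just given can be illustrated with a toy description table (entirely my own sketch: the table below and the compressibility slack of one bit are invented for illustration, not the actual machines $V$ and $U$):

```python
def flip(s):
    return "".join("1" if c == "0" else "0" for c in s)

def r1(descs, n):
    """Strings of length n having a description at least one bit shorter."""
    return {sigma for tau, sigma in descs if len(sigma) == n and len(tau) <= n - 1}

# A symmetric table: each (tau, sigma) is matched by a same-length
# description of flip(sigma), as in the machine V of the proof.
sym = [("000", "0011"), ("001", "1100"), ("01", "0101"), ("10", "1010")]
assert len(r1(sym, 4)) % 2 == 0  # symmetry forces even parity at each length

# One extra, unmatched description breaks the symmetry and flips the parity;
# this single bit is what the construction uses to encode x entering 0'.
asym = sym + [("110", "0110")]
assert len(r1(asym, 4)) % 2 == 1
print(len(r1(sym, 4)), len(r1(asym, 4)))  # 4 5
```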
16.3.3 The overgraphs of universal monotone machines

In this section, we present results of Day [88] on the overgraph relation for monotone and process complexity. The first result answers a question posed by Muchnik and Positselsky [288]. As above, for a monotone machine $M$, let $Km_M(\sigma) = \min\{|\tau| : \sigma \preceq M(\tau)\downarrow\}$, and define the overgraph $O^M_{Km}$ to be $\{(\sigma, k) : Km_M(\sigma) < k\}$. Recall from Theorem 3.16.2 that for a universal monotone machine $U$, the function $\mathbf{M}_U$ defined by $\mathbf{M}_U(\sigma) = \mu(\{\tau : \exists \sigma' \succeq \sigma\, ((\tau, \sigma') \in U)\})$ is an optimal c.e. continuous semimeasure.

Theorem 16.3.14 (Day [88]). For any universal monotone machine $U$, the overgraph $O^U_{Km}$ is tt-complete.

Proof. We will construct a monotone machine $M$ whose index $d$ in $U$ is known by the recursion theorem. To give this proof the widest possible applicability, we will make $M$ a strict process machine (which recall means that if $\tau \in \operatorname{dom}(M)$ and $\tau' \preceq \tau$, then $\tau' \in \operatorname{dom}(M)$ and $M(\tau') \preceq M(\tau)$). In addition to $M$, we need to build a truth table reduction $\Gamma$. For this proof, we will omit the $Km$ subscript and write $O^M$ for $O^M_{Km}$. This notation allows us to use the subscript position as follows: For a monotone machine $M$, let $O^M_k = \{\sigma : (\sigma, k) \in O^M\}$.

Our truth table reduction will work as follows. For each $x$, a set of strings $S_x$ will be specified. The reduction will determine which strings are in $O^U_{d+x}$ and make a decision as to whether or not $x \in \emptyset'$ based on this information. The simplest thing to do would be to try to encode the fact that $x$ enters $\emptyset'$ by adding all of $S_x$ to $O^U_{d+x}$. However, we run against the same problem as in proving Kummer's Theorem 16.3.2: the opponent could wait until $S_x$ is defined, then add all of it to $O^U_{d+x}$ and withhold $x$ from $\emptyset'$. In fact, given any truth table reduction $\Gamma$, the opponent can choose an $x$, wait until the truth table used by $\Gamma(x)$ is defined, and then adopt a winning strategy to ensure either that $\Gamma^{O^U}(x) = 0$ or that $\Gamma^{O^U}(x) = 1$.
By adding $x$ to $\emptyset'$ in the first case and keeping it out in the second case, the opponent could ensure $\Gamma^{O^U}(x) \neq \emptyset'(x)$. We overcome this problem as in the proof of Theorem 16.3.2 by making the reduction nonuniform. In this case, we will allow it to be wrong on some initial segment of $\emptyset'$. The reduction will be constructed in such a way that the cost to the opponent of making the reduction incorrect for any $x$ is so significant that the reduction can be incorrect only a finite number of times.

The strict process machine $M$ we use for adding pairs to the overgraph will be constructed as follows. For each $\tau \in 2^x$ we will choose some $\sigma_\tau$, so that $\{\sigma_\tau : \tau \in 2^x\}$ is a prefix-free set. The pairs $(\tau, \sigma_\tau)$ will be candidates for addition to our machine. We will make sure that if $\tau \prec \tau'$, then either $\sigma_\tau \preceq \sigma_{\tau'}$ or $(\tau', \sigma_{\tau'})$ is never added to $M$. If we decide to add $(\tau, \sigma_\tau)$ to $M$, then $Km(\sigma_\tau) \leq |\tau| + d = x + d$, and hence $\sigma_\tau \in O^U_{d+x}$.
Now, the opponent has the ability to add $\sigma_\tau$ to $O^U_{d+x}$ as well. If the opponent does this, then it must have increased the measure on $\sigma_\tau$, i.e., set $\mathbf{M}_U(\sigma_\tau)$ to at least $2^{-d-x}$. Provided we have not described any extension of $\sigma_\tau$ with $M$, we can now bypass this measure spent by the universal machine. Bypassing the measure allows us to ensure the reduction works on almost all inputs. We wait until an appropriate stage $s$ when we have some lower bound on the measure that we can bypass. When we define $S_s$, we ensure that $\sigma_\upsilon$ does not extend $\sigma_\tau$ for all $\upsilon \in 2^s$. Instead we will let such $\sigma_\upsilon$ extend some $\rho_\tau$ incompatible with $\sigma_\tau$.

However, the opponent still has one last trick up its sleeve. Before it adds some string $\sigma_\tau$ to $O^U_{d+|\tau|}$, it can try to force us to enumerate some $(\tau', \sigma_{\tau'})$ into $M$ with $\sigma_\tau \preceq \sigma_{\tau'}$. Doing so would prevent us from bypassing the measure the opponent spends on $\sigma_\tau$, because if $\tau' \prec \upsilon$, then we must have $\sigma_\tau \preceq \sigma_{\tau'} \prec \sigma_\upsilon$. The following reduction $\Gamma$ is designed to minimize the problem this situation causes.

The reduction $\Gamma$ will be defined as follows. First $\Gamma(0) = 0$. If $x \neq 0$, then at stage $x$ in the construction, a set $S_x$ will be defined. This set will have $2^x$ many elements and will be indexed by $2^x$ so that for each $\tau \in 2^x$ there is a unique string $\sigma_\tau \in S_x$. Let $\overline{\tau}$ be the string of length $|\tau|$ defined by $\overline{\tau}(k) = 1 - \tau(k)$. To determine whether $x \in \emptyset'$, the reduction $\Gamma$ runs the construction until $S_x$ is defined and then determines which elements of the set $S_x$ are in $O^U_{d+x}$. If $S_x \subseteq O^U_{d+x}$, then $\Gamma(x) = 0$. Otherwise $\Gamma$ finds the leftmost $\tau \in 2^x$ such that either

1. exactly one of $\sigma_\tau$ and $\sigma_{\overline{\tau}}$ is in $O^U_{d+x}$, in which case $\Gamma(x) = 1$; or

2. neither $\sigma_\tau$ nor $\sigma_{\overline{\tau}}$ is in $O^U_{d+x}$, in which case $\Gamma(x) = 0$.
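Concretely, the reduction examines $S_x$ in complementary pairs, ordered by the lexicographically smaller member of each pair $\{\tau, \overline{\tau}\}$. A small illustration of that order (my own sketch, not code from the text):

```python
from itertools import product

def flip(t):
    return "".join("1" if c == "0" else "0" for c in t)

def search_order(x):
    """Complementary pairs {tau, flip(tau)} of 2^x, in the order Gamma checks them."""
    reps = sorted({min(t, flip(t)) for t in
                   ("".join(b) for b in product("01", repeat=x))})
    return [(r, flip(r)) for r in reps]

print(search_order(3))
# [('000', '111'), ('001', '110'), ('010', '101'), ('011', '100')]
```

This reproduces the order described in the text's $S_3$ example below: 000/111, then 001/110, then 010/101, and finally 011/100.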
This reduction can be thought of as checking pairs of strings in a certain order. For example, consider $S_3$. First the reduction checks whether $\sigma_{000}$ and $\sigma_{111}$ are in $O^U_{d+3}$. If they are not both in, then the reduction can give an answer immediately. If they are both in, then the reduction checks whether $\sigma_{001}$ and $\sigma_{110}$ are in, and so on. This process can be described simply by looking at the indices of the $\sigma$'s involved, e.g., first 000 and 111, then 001 and 110, then 010 and 101, and finally 011 and 100.

Let us see now how we can act to ensure that this reduction is correct. Again consider $S_3$ and assume that $3 \notin \emptyset'_0$. Now suppose that at some stage $s_0$, the opponent enumerates $\sigma_{000}$ into $O^{U_{s_0}}_{d+3}$. This enumeration will cause $\Gamma^{O^{U_{s_0}}}(3) = 1$. So we add $(111, \sigma_{111})$ to $M$ at the following stage. If a corresponding description appears in the universal machine at stage $s_1$, then $\Gamma^{O^{U_{s_1}}}(3) = 0$. If at a later stage $s_3$, we have $3 \in \emptyset'_{s_3}$, then we add $(001, \sigma_{001})$ to $M$. If at any later stage the opponent adds $\sigma_{110}$ to $O^U_{d+3}$, then we respond by adding $(010, \sigma_{010})$ to $M$, and so on.

Note that for a given $x$, the value of the reduction $\Gamma(x)$ can be changed by adding a single element of $S_x$ to $O^U_{d+x}$ (provided $S_x \nsubseteq O^U_{d+x}$). If at some
stage $s$ of the construction $\Gamma^{O^{U_s}}(x) = 0$, then there are two possible choices of string for changing the reduction to 1, while if $\Gamma^{O^{U_s}}(x) = 1$ then there is only one possible string that can be enumerated into the overgraph to change $\Gamma(x)$ to 0. Also note that if $(\tau, \sigma_\tau)$ is enumerated into $M$, then the only reason to enumerate $(\overline{\tau}, \sigma_{\overline{\tau}})$ into $M$ is to keep $M$ a strict process machine.

Now, if we get to a stage $s$ where for some $x$, we have $S_x \subseteq O^{U_s}_{d+x}$ and $x \in \emptyset'_s$, then we no longer have any ability to change the reduction. At this point we give up making the reduction work on $x$, and in fact we go further and give up trying to make the reduction work on any value below $s+2$. We will have a marker that points to a value after which the reduction works. We move the marker to point to $s+1$ and call $s+1$ a marker move stage. The reason that the marker cannot be moved infinitely often is that now when we define $S_{s+1}$, we can do it in such a way as to avoid extending some of the strings that have been enumerated into the universal machine by the opponent.

In looking for strings that have measure we can bypass, we do not consider just those strings in $S_x$. We consider all strings $\sigma_\tau$, where $\tau$ can have any length, such that for any $\rho$ that occurs no later than $\tau$ in the search order of $\Gamma$, we have $\sigma_\rho \in O^{U_s}_{d+|\tau|}$ (e.g., 000, 111, 001, 110 all occur no later than 110 in the search order). Given this set of strings $S$, we let $T$ be the set of indices describing the strings in $S$. We use $T$ instead of $S$ because it is easier to deal with. As $S_x \subseteq O^{U_s}_{d+x}$, it follows that $\mu(T) = 1$. Let $T'$
be the prefix-free set formed by taking those strings in $T$ that are maximal with respect to the $\preceq$ ordering. The way we choose our $\sigma_\tau$ during the construction will allow us to find a lower bound on $\mu(T')$. Let $B$ be the set of strings in $T'$ that index strings enumerated into the overgraph by the opponent and whose corresponding measures we can bypass. We will be able to show that we can find a lower bound on $\mu(B)$ because nearly half the strings in $T'$ must be in $B$. The reason for this fact is a little complicated and will be detailed in the proof. Basically, though, it is twofold. First, we will show that if $\tau \in T'$, then $\overline{\tau} \in T'$. Second, we are unlikely to add both $(\tau, \sigma_\tau)$ and $(\overline{\tau}, \sigma_{\overline{\tau}})$ to $M$. The only reason we would do so would be to ensure that $M$ is a strict process machine. Say we add $(\tau, \sigma_\tau)$ to $M$ to keep it a strict process machine. Then there is some $\tau' \succ \tau$ with $\tau' \in \operatorname{dom}(M)$. Since $\tau' \notin T'$, as otherwise $\tau$ would not be in $T'$, it must be that we add $(\tau', \sigma_{\tau'})$ to $M$ to encode some $x$ entering $\emptyset'$. However, this scenario can affect only a certain number of elements of $T'$. This fact is what we will use to find a lower bound for $\mu(B)$.

For the verification of the proof, it is useful to formalize the order in which the reduction uses the strings in $S_x$, which is done by defining a relation $\preceq_\Gamma$ on $2^k$ as follows: $\tau_1 \preceq_\Gamma \tau_2$ if $\min(\tau_1, \overline{\tau_1}) \preceq_L \min(\tau_2, \overline{\tau_2})$, where the minimum is with respect to the lexicographic ordering $\preceq_L$. While $\preceq_\Gamma$ is reflexive and transitive, it is not an ordering, as antisymmetry fails. However, if $\tau \preceq_\Gamma \rho$ and $\rho \preceq_\Gamma \tau$, then either $\rho = \tau$ or $\rho = \overline{\tau}$. The relation $\preceq_\Gamma$ is total in the
sense that for all $\tau, \rho \in 2^k$, either $\tau \preceq_\Gamma \rho$ or $\rho \preceq_\Gamma \tau$. Note that this fact implies that if $\tau \npreceq_\Gamma \rho$ then $\rho \prec_\Gamma \tau$.

Lemma 16.3.15. Let $\tau_1, \tau_2, \upsilon_1, \upsilon_2$ with $|\tau_1| = |\tau_2| < |\upsilon_1| = |\upsilon_2|$ be such that $\tau_1 \prec \upsilon_1$ and $\tau_2 \prec \upsilon_2$. Then $\tau_1 \prec_\Gamma \tau_2$ implies $\upsilon_1 \prec_\Gamma \upsilon_2$.

Proof. If $\tau_1 \prec_\Gamma \tau_2$, then $\tau_1 \neq \lambda$, so $\tau_1 \neq \overline{\tau_1}$. Assume that $\tau_1$
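The stated properties of $\preceq_\Gamma$, and Lemma 16.3.15 itself, can be checked exhaustively for small lengths (my own verification sketch, assuming the min-of-complements definition above):

```python
from itertools import product

def flip(t):
    return "".join("1" if c == "0" else "0" for c in t)

def le_gamma(t1, t2):
    # t1 <=_Gamma t2  iff  min(t1, flip(t1)) <=_lex min(t2, flip(t2))
    return min(t1, flip(t1)) <= min(t2, flip(t2))

def strings(k):
    return ["".join(b) for b in product("01", repeat=k)]

# totality, and failure of antisymmetry only up to complementation
for a in strings(3):
    for b in strings(3):
        assert le_gamma(a, b) or le_gamma(b, a)
        if le_gamma(a, b) and le_gamma(b, a):
            assert b in (a, flip(a))

# Lemma 16.3.15 for lengths 2 < 4: strict order is preserved by extensions
for t1 in strings(2):
    for t2 in strings(2):
        if le_gamma(t1, t2) and not le_gamma(t2, t1):   # t1 strictly below t2
            for u1 in (t1 + e for e in strings(2)):
                for u2 in (t2 + e for e in strings(2)):
                    assert le_gamma(u1, u2) and not le_gamma(u2, u1)
print("checks passed")
```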
The set $T^{s+1}_k$ is defined this way because this is the order in which the reduction $\Gamma$ examines the strings in $S_k$. Let $T^{s+1} = \bigcup_{1 \leq k \leq s} T^{s+1}_k$. We want to work with a prefix-free set, so let $T'^{s+1} = \{\tau \in T^{s+1} : \forall \tau' \succ \tau\, (\tau' \notin T^{s+1})\}$. Note that this set consists of the maximal elements of $T^{s+1}$, while prefix-free sets are usually constructed using minimal elements. To gather together those strings whose descriptions we can bypass, let $B^{s+1} = \{\tau \in T'^{s+1} : \forall \tau' \succeq \tau\, (\tau' \notin \operatorname{dom}(M_s))\}$.

For all $\upsilon \in B^{s+1}$, let $\{\upsilon_0, \ldots, \upsilon_n\}$ be the set of extensions of $\upsilon$ of length $s+1$. Choose $\sigma_{\upsilon_0}, \rho_{\upsilon_0}, \ldots, \sigma_{\upsilon_n}, \rho_{\upsilon_n}$ that are pairwise incompatible, not in $O^{U_s}_{d+s+1}$, and extend $\rho_\upsilon$. For all $\tau \in 2^{s+1}$ that do not extend some $\upsilon \in B^{s+1}$, choose a $\sigma_\tau$ and a $\rho_\tau$ that are incompatible, not in $O^{U_s}_{d+s+1}$, and extend $\sigma_{\tau'}$, where $\tau' = \tau \upharpoonright (|\tau| - 1)$. Again let $S_{s+1} = \{\sigma_\tau : \tau \in 2^{s+1}\}$.
Second, we need to determine which descriptions to commit to our machine $M$. Let $X_{s+1} = \{x : c_s < x < s \wedge \Gamma^{O^{U_s}}(x) \neq \emptyset'_s(x)\}$. If $X_{s+1} = \emptyset$, then let $M_{s+1} = M_s$ and $R_{s+1} = R_s$ and proceed to the next stage. Otherwise, if for any $x \in X_s$ we have $S_x \subseteq O^{U_s}_{d+x}$, then let $M_{s+1} = M_s$ and $R_{s+1} = R_s \cup \{s+2\}$, which will cause the marker to be moved at the next stage. In this case, proceed to the next stage.

Otherwise, let $x_s$ be the least element of $X_s$, and choose $\tau \in 2^{x_s}$ such that $\sigma_\tau \notin O^{U_s}_{d+x_s}$ and $\sigma_{\tau'} \in O^{U_s}_{d+x_s}$ for all $\tau' \prec_\Gamma \tau$. We are going to add $(\tau, \sigma_\tau)$ to $M_{s+1}$. However, we want to make $M$ a strict process machine, so we need to ensure that $\operatorname{dom}(M_{s+1})$ is closed under substrings. Let $\upsilon$ be the longest initial segment of $\tau$ such that $\upsilon \in \operatorname{dom}(M_s)$. Consider any $\tau' \prec \tau$. If $\tau' \preceq \upsilon$, then $\tau' \in \operatorname{dom}(M_s)$ by our assumption that $M_s$ is a strict process machine. Now suppose that $\upsilon \prec \tau' \preceq \tau$. If $|\tau'| \leq c_s$, then we want to have $M_{s+1}(\tau') = M_s(\upsilon)$. Otherwise, we want to have $M_{s+1}(\tau') = \sigma_{\tau'}$. Thus, let
$$M_{s+1} = M_s \cup \{(\tau', M_s(\upsilon)) : \upsilon \prec \tau' \preceq \tau \wedge |\tau'| \leq c_s\} \cup \{(\tau', \sigma_{\tau'}) : \upsilon \prec \tau' \preceq \tau \wedge |\tau'| > c_s\}.$$
Finally, let $R_{s+1} = R_s$. End of Construction.

First we will show that $M$ is a strict process machine. To do so, we need the following lemma.

Lemma 16.3.16. If $\tau_1 \prec \tau_2$ and $\sigma_{\tau_1} \nprec \sigma_{\tau_2}$, then $M(\tau_1) \neq \sigma_{\tau_1}$.

Proof. If $\tau_1 \prec \tau_2$ and $\sigma_{\tau_1} \nprec \sigma_{\tau_2}$, then there must be some marker move stage $s$ with $|\tau_1| < s \leq |\tau_2|$ and $\rho_{\tau_0} \prec \sigma_{\tau_2}$ for some $\tau_0 \in B^s$ with $\tau_0 \preceq \tau_1$. Thus, $\tau_1 \notin \operatorname{dom}(M_s)$, by the definition of $B^s$. However, then for all stages $t > s$, we have $M_t(\tau_1) \neq \sigma_{\tau_1}$, because once the marker has moved, if $\tau_1$ is added to the domain of $M_t$, then $M_t(\tau_1) = \sigma_\upsilon$ for some $\upsilon \prec \tau_1$.

Lemma 16.3.17. $M$ is a strict process machine.

Proof. We proceed by induction on the stages of the construction. Clearly $M_0$ is a strict process machine.
If $M_s$ is a strict process machine, then the construction ensures that $M_{s+1}$ is at least a function whose domain is closed under initial segments, because if $M_{s+1} \neq M_s$, then $M_{s+1}$ is formed by taking some $\tau \notin \operatorname{dom}(M_s)$ and finding the longest $\upsilon \prec \tau$ such that $\upsilon \in \operatorname{dom}(M_s)$. The strings that we add to the domain of $M_{s+1}$ are exactly those strings $\tau'$ such that $\upsilon \prec \tau' \preceq \tau$.

We also need to show that $M_{s+1}$ is a process. In the construction, $M_{s+1}$ is defined to be $M_s \cup \{(\tau', M_s(\upsilon)) : \upsilon \prec \tau' \preceq \tau \wedge |\tau'| \leq c_s\} \cup \{(\tau', \sigma_{\tau'}) : \upsilon \prec \tau' \preceq \tau \wedge |\tau'| > c_s\}$. First we show that $\widehat{M}_{s+1} = M_s \cup \{(\tau', M_s(\upsilon)) : \upsilon \prec \tau' \preceq \tau \wedge |\tau'| \leq c_s\}$ is a process. Take any $\tau_1, \tau_2 \in \operatorname{dom}(\widehat{M}_{s+1})$ with $\tau_1 \prec \tau_2$. If $\tau_2 \in \operatorname{dom}(M_s)$ then
16.3. The collection of nonrandom strings
757
$\tau_1 \in \operatorname{dom}(M_s)$, so $\widehat{M}_{s+1}(\tau_1) = M_s(\tau_1) \preceq M_s(\tau_2) = \widehat{M}_{s+1}(\tau_2)$. If $\tau_2 \notin \operatorname{dom}(M_s)$ and $\tau_1 \in \operatorname{dom}(M_s)$, then $\tau_1 \preceq \upsilon$, so $\widehat{M}_{s+1}(\tau_1) = M_s(\tau_1) \preceq M_s(\upsilon) = \widehat{M}_{s+1}(\tau_2)$. Finally, if neither $\tau_1$ nor $\tau_2$ is in $\operatorname{dom}(M_s)$, then $\widehat{M}_{s+1}(\tau_1) = M_s(\upsilon) = \widehat{M}_{s+1}(\tau_2)$.

The set $\{(\tau', \sigma_{\tau'}) : \upsilon \prec \tau' \preceq \tau \wedge |\tau'| > c_s\}$ is also a process: For any $\tau_1 \prec \tau_2$ in this set, $\sigma_{\tau_1} \prec \sigma_{\tau_2}$, because the marker is not moved between the definitions of $\sigma_{\tau_1}$ and $\sigma_{\tau_2}$.

To show that the union of these two processes is a process, consider $M_{s+1}(\upsilon)$. By construction it must be that $M_{s+1}(\upsilon) = M_{s+1}(\upsilon') = \sigma_{\upsilon'}$ for some $\upsilon' \preceq \upsilon$. From the previous lemma, this fact implies that $\sigma_{\upsilon'} \preceq \sigma_{\tau'}$ for each $\tau'$ in the domain of the second process, and so $M_{s+1}(\upsilon) \preceq M_{s+1}(\tau)$. Now for any $\tau_1 \prec \tau_2$ with $\tau_1$ in the domain of the first process and $\tau_2$ in the domain of the second process, $M_{s+1}(\tau_1) \preceq M_{s+1}(\upsilon)$ and $M_{s+1}(\upsilon) \preceq M_{s+1}(\tau_2)$, so it follows that $M_{s+1}(\tau_1) \preceq M_{s+1}(\tau_2)$. Therefore $M_{s+1}$ is a strict process machine. It follows that $M = \bigcup_s M_s$ is also a strict process machine.

Lemma 16.3.18. If there are only a finite number of marker move stages, then $\Gamma^{O^U}(x) = \emptyset'(x)$ for almost all $x$.

Proof. Let $s_0$ be the last marker move stage and assume for a contradiction that there is a least $x_0 > s_0$ such that $\Gamma^{O^U}(x_0) \neq \emptyset'(x_0)$. Let $s_1 > x_0$ be a stage such that $\emptyset'_{s_1} \upharpoonright (x_0+1) = \emptyset' \upharpoonright (x_0+1)$ and $S_x \cap O^{U_{s_1}}_{d+x} = S_x \cap O^U_{d+x}$ for all $x \leq x_0$. The latter condition implies that $\Gamma^{O^{U_{s_1}}}(x) = \Gamma^{O^U}(x)$ for all $x \leq x_0$.

If $x_0 \in X_{s_1}$, then it must be the least element of $X_{s_1}$. Because there are no more marker move stages, it must be that $S_{x_0} \nsubseteq O^{U_{s_1}}_{d+x_0}$. But then there is a $(\tau, \sigma_\tau)$ with $\tau \in 2^{x_0}$ and $\sigma_\tau \notin O^{U_{s_1}}_{d+x_0}$ added to $M_{s_1+1}$. This enumeration adds $\sigma_\tau$ to $O^U_{d+x_0}$, which is not possible because $S_{x_0} \cap O^{U_{s_1}}_{d+x_0} = S_{x_0} \cap O^U_{d+x_0}$. Hence $x_0 \notin X_{s_1}$, and so $\Gamma^{O^U}(x_0) = \Gamma^{O^{U_{s_1}}}(x_0) = \emptyset'_{s_1}(x_0) = \emptyset'(x_0)$, which is a contradiction.
Now we need to show that there are only a finite number of marker move stages. We will be able to do so because each time the marker is moved, a portion of the measure that the universal machine has spent is bypassed by the construction, and can no longer be used to affect $\Gamma$. By showing that there is a lower bound on the amount of measure that is bypassed each time the marker is moved, we will show that the marker can be moved only a finite number of times, since the domain of the universal machine eventually runs out of measure. For any $x$ there is a direct relation between the indices of the strings in $S_x$ and the measure required to be allocated to these strings to add them to $O^U_{d+x}$. Hence, to determine a lower bound on the amount of measure bypassed, it is useful to find a lower bound on $\mu(B^s)$. The first step toward achieving this bound will be to find a lower bound on $\mu(T'^s)$.
For the rest of the verification, fix $s$ to be a particular marker move stage. As $s$ is fixed, $T_k$ will be used for $T^s_k$.

Lemma 16.3.19. Let $\tau, \rho \in 2^k$ with $1 \leq k < s$. If $\tau \in T_k$ and $\rho \preceq_\Gamma \tau$, then $\rho \in T_k$.

Proof. If $\rho \notin T_k$ then for some $\upsilon \in 2^k$ such that $\upsilon \preceq_\Gamma \rho$, we have $\sigma_\upsilon \notin O^{U_s}_{d+k}$. However, by the transitivity of the $\preceq_\Gamma$ relation, $\upsilon \preceq_\Gamma \tau$, and so $\tau \notin T_k$.
Note that this lemma implies that if $\tau \in T_k$, then $\overline{\tau} \in T_k$ as well.

Lemma 16.3.20. If $1 \leq k < j < s$ then either $T_k \subset T_j$ or $T_j \subset T_k$.

Proof. If $T_j \not\subset T_k$, then there is some $\upsilon \in T_j$ such that if $\tau = \upsilon \upharpoonright k$, then $\tau \notin T_k$. If $\tau' \in T_k$ then $\tau' \prec_\Gamma \tau$ (because the relation $\preceq_\Gamma$ is total, and if $\tau \preceq_\Gamma \tau'$ then Lemma 16.3.19 would give $\tau \in T_k$). Now let $\upsilon'$ be any extension of $\tau'$ such that $|\upsilon'| = j$. By Lemma 16.3.15, $\upsilon' \prec_\Gamma \upsilon$, and hence $\upsilon' \in T_j$ by Lemma 16.3.19. Thus $T_k \subset T_j$.

Let $x$ be greatest such that $S_x \subseteq O^{U_s}_{d+x}$. Since $s$ is a marker move stage, such an $x$ exists. Additionally, $x$ is greater than the previous marker move stage. By the previous lemma, there are $j(0) < j(1) < \cdots < j(n)$ with $j(0) = x$ and $j(n) = s - 1$, such that $T_{j(0)} \supsetneq \cdots \supsetneq T_{j(n)}$ and $T_l \subseteq T_{j(i+1)}$ whenever $j(i) < l \leq j(i+1)$. For $i < n$, let $\widehat{T}_{j(i)} = \{\tau \in T_{j(i)} : \forall \tau' \in T_{j(i+1)}\,(\tau \nprec \tau')\}$. Let $\widehat{T}_{j(n)} = T_{j(n)}$.
Lemma 16.3.21. For all $i < n$, we have $\mu(\widehat{T}_{j(i)}) \geq \mu(T_{j(i)} \setminus T_{j(i+1)}) - 2^{-j(i)+1}$.

Proof. We have $T_{j(i)} \supsetneq T_{j(i+1)}$, so let $\tau$ be an element of $T_{j(i)}$ for which there is some $\upsilon$ with $\tau \prec \upsilon$ such that $|\upsilon| = j(i+1)$ and $\upsilon \notin T_{j(i+1)}$, but such that for all $\tau' \prec_\Gamma \tau$ and all $\upsilon'$ with $\tau' \prec \upsilon'$ and $|\upsilon'| = j(i+1)$, we have $\upsilon' \in T_{j(i+1)}$. Now take any $\tau' \in T_{j(i)}$ such that $\tau \prec_\Gamma \tau'$. For any $\upsilon'$ of length $j(i+1)$ such that $\tau' \prec \upsilon'$, it follows by Lemma 16.3.15 that $\upsilon \prec_\Gamma \upsilon'$, and hence $\upsilon' \notin T_{j(i+1)}$, whence $\tau' \in \widehat{T}_{j(i)}$. Thus for all $\tau' \in T_{j(i)}$: if $\tau' \prec_\Gamma \tau$, then every length-$j(i+1)$ extension of $\tau'$ is in $T_{j(i+1)}$, while if $\tau \prec_\Gamma \tau'$, then $\tau' \in \widehat{T}_{j(i)}$. If neither $\tau' \prec_\Gamma \tau$ nor $\tau \prec_\Gamma \tau'$, then $\tau'$ must be one of $\tau$ or $\overline{\tau}$. Thus $\widehat{T}_{j(i)} \supseteq (T_{j(i)} \setminus T_{j(i+1)}) \setminus \{\tau, \overline{\tau}\}$. The lemma follows, as $\mu(\{\tau, \overline{\tau}\}) = 2^{-j(i)+1}$.

The following lemma shows that the $T'^s$ defined in the construction (now referred to as $T'$ because $s$ is fixed) is equal to $\bigcup_{i \leq n} \widehat{T}_{j(i)}$.

Lemma 16.3.22. $T' = \bigcup_{i \leq n} \widehat{T}_{j(i)}$.
Proof. If $\tau \in T'$, then by definition $\tau \nprec \tau'$ for all $\tau' \in T$. Hence $\tau \in T_{j(i)}$ for some $i$, and $\tau' \notin T_{j(i+1)}$ for all $\tau' \succ \tau$. Thus $\tau \in \widehat{T}_{j(i)}$, so $T' \subseteq \bigcup_{i \leq n} \widehat{T}_{j(i)}$.

For the other direction, first note that $\widehat{T}_{j(n)} = T_{s-1} \subseteq T'$, because any maximal-length element must be a maximal element under $\preceq$. If $\tau \in \widehat{T}_{j(i)}$ for some $i < n$, then $\tau \nprec \tau'$ for all $\tau' \in T_{j(i+1)}$. Now, $T_l \subseteq T_{j(i+1)}$ for all $l > j(i)$, so $\tau \nprec \tau'$ for all $\tau' \in T_l$, and hence $\tau \in T'$. Thus $\bigcup_{i \leq n} \widehat{T}_{j(i)} \subseteq T'$.

As $j(0) = x$ and $S_x \subseteq O^{U_s}_{d+x}$, we have $\mu(T_{j(0)}) = 1$. We may assume that $x \geq 4$ because $x$ is greater than any previous marker move stage, so this assumption will be true after at most 3 marker move stages. Then
$$\mu(T') = \sum_{i \leq n} \mu(\widehat{T}_{j(i)}) \geq \sum_{i < n} \left( \mu(T_{j(i)} \setminus T_{j(i+1)}) - 2^{-j(i)+1} \right) + \mu(T_{j(n)}) = \mu(T_{j(0)}) - \sum_{i < n} 2^{-j(i)+1} > \frac{3}{4}.$$
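As a check on the arithmetic of this bound (my own verification): since the $j(i)$ are distinct and $j(0) = x \geq 4$,
$$\sum_{i < n} 2^{-j(i)+1} \leq \sum_{k \geq 4} 2^{-k+1} = 2^{-2} = \frac{1}{4},$$
and the inequality is strict because the left-hand sum is finite; hence $\mu(T') > 1 - \frac{1}{4} = \frac{3}{4}$.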
Now we have achieved the first step discussed above by finding a lower bound on $\mu(T')$. The next step is to find a lower bound on $\mu(B^s)$. Recall that $B^s$ was defined in the construction to be $\{\tau \in T' : \forall \tau' \succeq \tau\,(\tau' \notin \operatorname{dom}(M_{s-1}))\}$.

Lemma 16.3.23. If $\tau \in T'$ then $\overline{\tau} \in T'$.

Proof. If $\tau \in T'$, then $\tau \in \widehat{T}_{j(i)}$ for some $i$. So $\tau \in T_{j(i)}$, and thus $\overline{\tau} \in T_{j(i)}$. If $\overline{\tau} \prec \upsilon$ and $|\upsilon| = j(i+1)$, then $\tau \prec \overline{\upsilon}$, and hence $\overline{\upsilon} \notin T_{j(i+1)}$ (as $\tau \in \widehat{T}_{j(i)}$). Thus $\upsilon \notin T_{j(i+1)}$, and so $\overline{\tau} \in \widehat{T}_{j(i)}$.

Lemma 16.3.24. If $\tau \in T'$ and $\tau, \overline{\tau} \notin B^s$, then there exists $\upsilon \in \operatorname{dom}(M_{s-1})$ such that either $\tau \prec \upsilon$ or $\overline{\tau} \prec \upsilon$.

Proof. This lemma follows from the fact that we add descriptions to $M$ for only two reasons. The first is to change the reduction, and the second is to ensure that the domain is closed under substrings. Assume that there is no $\upsilon \in \operatorname{dom}(M_{s-1})$ such that either $\tau \prec \upsilon$ or $\overline{\tau} \prec \upsilon$. In this case there is no need to add $\tau$ or $\overline{\tau}$ to the domain of $M_{s-1}$ to close it under substrings. So if $\tau \notin B^s$, then it must be that $\tau \in \operatorname{dom}(M_{s-1})$. Furthermore, we must have added $\tau$ to the domain of $M_{s-1}$ to change the reduction $\Gamma$. In this case there is no reason why we should add $\overline{\tau}$ to change the reduction as well. Hence $\overline{\tau} \notin \operatorname{dom}(M_{s-1})$, so $\overline{\tau} \in B^s$.

Lemma 16.3.25. If $\tau_1, \tau_2 \in T'$ are such that $|\tau_1| = |\tau_2|$ and $\tau_1 \prec_\Gamma \tau_2$, then at least one of $\tau_2$ and $\overline{\tau_2}$ is in $B^s$.
Proof. First we know by Lemma 16.3.23 that $\overline{\tau_1}, \overline{\tau_2} \in T'$. Assume for a contradiction that $\tau_2, \overline{\tau_2} \notin B^s$. By the previous lemma, if $\tau_2, \overline{\tau_2} \notin B^s$, then there must be some $\upsilon_2 \in \operatorname{dom}(M_{s-1})$ such that either $\tau_2 \prec \upsilon_2$ or $\overline{\tau_2} \prec \upsilon_2$. We will assume without loss of generality that $\tau_2 \prec \upsilon_2$. Let $k = |\upsilon_2|$. Note that $k < s$. As $\upsilon_2 \in \operatorname{dom}(M_{s-1})$, for all $\upsilon \prec_\Gamma \upsilon_2$ we have $\sigma_\upsilon \in O^{U_s}_{d+k}$, by the way the construction chooses pairs to add to $M$. Hence $\upsilon \in T_k$. Take any $\upsilon_1$ extending $\tau_1$ with $|\upsilon_1| = k$. As $\tau_1 \prec_\Gamma \tau_2$, by Lemma 16.3.15, $\upsilon_1 \prec_\Gamma \upsilon_2$, and thus $\upsilon_1 \in T_k$ by Lemma 16.3.19. Thus every such extension of $\tau_1$ lies in $T_k$, and so $\tau_1 \notin T'$, which contradicts our initial assumption.

We can use this last lemma to put a lower bound on the measure of $B^s$.

Lemma 16.3.26. If $s$ is a marker move stage then $\mu(B^s) \geq \frac{1}{4}$.

Proof. The previous lemma tells us that for any given length $l$, there is at most a single $\tau \in T'$ of length $l$ such that neither $\tau$ nor $\overline{\tau}$ is in $B^s$. For any string $\upsilon \in T'$ of length $l$ such that $\upsilon \neq \tau$ and $\upsilon \neq \overline{\tau}$, either $\upsilon$ or $\overline{\upsilon}$ is in $B^s$. Thus
$$\mu(B^s) \geq \frac{1}{2}\left( \mu(T') - \sum_{i \leq n} 2 \cdot 2^{-j(i)} \right) > \frac{1}{2}\left( \frac{3}{4} - \frac{1}{4} \right) = \frac{1}{4}.$$
Now for all $\tau \in B^s$, we have $\sigma_\tau \in O^{U_s}_{d+|\tau|}$ and so $Km(\sigma_\tau) \leq d + |\tau|$. Additionally, by construction, as $B^s$ is a prefix-free set, so is $\{\sigma_\tau : \tau \in B^s\}$. Hence it follows that
$$\mathbf{M}_U(\{\sigma_\tau : \tau \in B^s\}) = \sum_{\tau \in B^s} \mathbf{M}_U(\sigma_\tau) \geq \sum_{\tau \in B^s} 2^{-d-|\tau|} = 2^{-d} \sum_{\tau \in B^s} \mu(\tau) \geq 2^{-d}\mu(B^s) \geq 2^{-d-2}.$$

Lemma 16.3.27. If $s_1$ and $s_2$ are both marker move stages and $s_1 \neq s_2$, then the sets $\{\sigma_\tau : \tau \in B^{s_1}\}$ and $\{\sigma_\tau : \tau \in B^{s_2}\}$ are disjoint.

Proof. Take any $\tau \in B^{s_1}$ and $\upsilon \in B^{s_2}$, where without loss of generality $s_1 < s_2$. By construction, $|\tau| < s_1$, and the length of $\upsilon$ is larger than any previous marker move stage, so in particular $|\upsilon| > s_1 > |\tau|$. Now, if $\tau \nprec \upsilon$, then the construction ensures that $\sigma_\upsilon$ and $\sigma_\tau$ are incompatible. If $\tau \prec \upsilon$, then again by construction $\rho_\tau \prec \sigma_\upsilon$, and hence $\sigma_\upsilon$ and $\sigma_\tau$ are incompatible.

We can now finish the proof of the theorem. By Lemma 16.3.17, $M$ is a monotone machine. By Lemma 16.3.18, if there are a finite number of marker move stages then $\Gamma^{O^U}(x) = \emptyset'(x)$ for all but finitely many $x$. But there can be only a finite number of marker move stages, because if $R$ is the set of all marker move stages, then by the previous lemma
$$\mathbf{M}_U(\lambda) \geq \sum_{s \in R} \mathbf{M}_U(\{\sigma_\tau : \tau \in B^s\}) \geq |R| \cdot 2^{-d-2}.$$
Hence $|R| \leq 2^{d+2}$, and in particular $R$ is finite.

The proof above still works if $O^U_{Km}$ is replaced by $O^U_{KM} = \{(\sigma, k) : -\log(\mathbf{M}_U(\sigma)) < k\}$: During the verification of the construction, it is only the measure that the universal machine places on elements of $S_x$ that is considered, and not the length of their shortest descriptions. Additionally, if the construction enumerates some pair $(\sigma, n)$ into $M$, this enumeration adds $(\sigma, n+d)$ to $O^U_{KM}$ as well as $O^U_{Km}$, because $O^U_{Km} \subseteq O^U_{KM}$. Thus we have the following.
Corollary 16.3.28 (Day [88]). For any universal monotone machine $U$, the overgraph $O^U_{KM}$ is truth table complete.

Also, a universal process machine is a monotone machine. Hence the limitations of a universal monotone machine exploited in the proof also apply to a universal process machine. Furthermore, the machine $M$ constructed in the proof is a strict process machine, so we have the following.

Corollary 16.3.29 (Day [88]). For any universal process machine $U$, or any universal strict process machine $V$, the overgraphs $O^U_{Km_D}$ and $O^V_{Km_S}$ are truth table complete.

An interesting point about the above construction is that $\emptyset'$ can determine the correct tt-reduction, because $R$ is finite and enumerated during the construction, and hence $\emptyset'$ can determine its size. The construction used in the proof of Kummer's Theorem 16.3.2 is different. It uses a finite set of sequences $S_1, \ldots, S_n$, and the key to unraveling the construction is determining the maximum $i$ such that $S_i$ is infinite, which cannot be done using a $\emptyset'$ oracle.
16.3.4 The strict process complexity case

This section presents Day's proof from [88] that there are universal strict process machines whose set of nonrandom strings is not tt-complete. It is open whether this result holds for the case where "strict" is removed. In this section, we write $\overline{R^M}$ for $\overline{R^M_{Km_S}}$.

First we show that it is possible to construct a universal strict process machine whose set of nonrandom strings is closed under extensions.

Theorem 16.3.30 (Day [88]). There exists a universal strict process machine $V$ such that $\overline{R^V}$ is closed under extensions; i.e., if $\sigma \in \overline{R^V}$ and $\sigma' \succeq \sigma$, then $\sigma' \in \overline{R^V}$.

Proof. We build $V$ from a standard universal strict process machine $U$. Let $V(\lambda) = \lambda$ and let $V(0\tau) = U(\tau)$ for all $\tau$. This definition ensures the universality of $V$. We use strings in the domain of $V$ starting with 1 to get the desired closure property. We can effectively enumerate the set $S$ of strings $\tau$ such that $U(\tau)\downarrow = \sigma$ for some $\sigma$ such that $|\tau| \leq |\sigma| - 2$. If $\tau \in S$,
we know that $U(\rho)\downarrow$ for all $\rho \prec \tau$, so we can also effectively enumerate the prefix-free set $T$ of strings $\tau \in S$ such that $\rho \notin S$ for all $\rho \prec \tau$. For each $\tau \in T$, let $V(1\tau\nu) = U(\tau)\nu$ for all $\nu$, and let $V(1\rho) = \lambda$ for all $\rho \prec \tau$. This definition clearly makes $V$ a universal strict process machine.

We now verify that $\overline{R^V}$ is closed under extensions. Let $\sigma \in \overline{R^V}$, and let $\nu$ witness that fact. That is, $V(\nu) = \sigma$ and $|\nu| < |\sigma|$. If $\nu$ is of the form $0\tau$, then $U(\tau) = \sigma$ and $|\tau| \leq |\sigma| - 2$, so $\tau \in S$. Let $\tau' \preceq \tau$ be in $T$ and let $\sigma' = U(\tau')$. Note that $\sigma' \preceq \sigma$. If $\nu$ is of the form $1\tau$, then, since $\sigma \neq \lambda$, there must be a $\tau' \in T$ such that $\tau' \preceq \tau$. Let $\sigma' = V(1\tau')$, and again note that $\sigma' \preceq \sigma$. In either case, for every $\rho$, we have $V(1\tau'\rho) = \sigma'\rho$ and $|1\tau'\rho| < |\sigma'\rho|$, so $\sigma'\rho \in \overline{R^V}$. In particular, every extension of $\sigma$ is in $\overline{R^V}$.
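A finite-table simulation of this construction (my own sketch: the toy table U and the simulation length bound max_ext are assumptions for illustration, not the actual universal machine):

```python
from itertools import product

def build_V(U, max_ext=3):
    """Finite-table sketch of the proof: branch 0 copies U, branch 1 closes
    the nonrandom strings upward (extensions simulated up to max_ext bits)."""
    V = {"": ""}
    for tau, sigma in U.items():           # V(0 tau) = U(tau): universality
        V["0" + tau] = sigma
    S = {t for t, s in U.items() if len(t) <= len(s) - 2}
    T = {t for t in S if not any(t[:k] in S for k in range(len(t)))}
    for tau in T:                           # branch 1: V(1 tau nu) = U(tau) nu
        for k in range(len(tau)):
            V["1" + tau[:k]] = ""
        for n in range(max_ext + 1):
            for bits in product("01", repeat=n):
                nu = "".join(bits)
                V["1" + tau + nu] = U[tau] + nu
    return V

def nonrandom(V):
    """Nonempty outputs with a description shorter than the output."""
    return {s for t, s in V.items() if len(t) < len(s) and s}

V = build_V({"0": "000", "00": "00001"})
NR = nonrandom(V)
assert "000" in NR
# closure under extensions, up to the simulated length bound
assert all("000" + e in NR for e in ["", "0", "1", "00", "01", "10", "11"])
print(len(NR))
```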
The argument used in the previous proof does not generalize to process machines. Day [88] gave the following example. Let $U$ be a universal process machine. Suppose that we see that $U(00) = 0000$ and $U(10) = 0001$, and so far $U(\lambda)$ has not converged. If we try to follow the above construction, we set $V(100) = 0000$ and $V(110) = 0001$. If now $U(\lambda)$ converges to 00, then we would like to set $V(1) = 00$, and somehow use extensions of 1 to make all extensions of 00 nonrandom. However, consider 001. It is not possible to set $V(10) = 001$ or $V(11) = 001$ and keep $V$ a process machine.

The definition of $V$ in the proof of Theorem 16.3.30 can be carried out in a stage-by-stage manner. That is, whenever $U_s(\tau)\downarrow = \sigma$ and $|\tau| \leq |\sigma| - 2$, we can determine the unique $\tau' \preceq \tau$ such that $\tau' \in T$, and immediately define $V(1\tau'\nu) = U(\tau')\nu$ for all $\nu$ and $V(1\rho) = \lambda$ for all $\rho \prec \tau'$. This action ensures that $\overline{R^{V_s}}$ is closed under extensions for all $s$. We use this idea to help build a universal strict process machine whose set of nonrandom strings is not tt-complete.

This proof is an adaptation of Muchnik's proof of Theorem 16.3.9 (ii), where he constructed a universal prefix-free machine whose overgraph is not tt-complete. Recall that that proof uses the fact that the outcome of a finite game can be computably determined. In the proof that follows there will be three roles: the champion, the opponent, and the arbitrator. The champion and the opponent will be players in the game. They will move by adding strings to $\overline{R^V}$. The arbitrator will make sure that the set of all nonrandom strings is closed under extensions. The opponent represents a fixed universal strict process machine. The index of the opponent in the proof is 000, and the index of the champion is 01. We give the champion a shorter index because it will need more measure (measure replacing money in these games). The index of the arbitrator will be 1. The arbitrator follows the strategy of the proof of Theorem 16.3.30.
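The fact that the outcome of a finite game can be computably determined is just backward induction over the game tree; a minimal sketch (my own, with a made-up game tree, not the actual overgraph game):

```python
def value(node, maximizing=True):
    """Backward induction: optimal payoff for player 1 in a finite game tree.

    A leaf is an integer payoff; an internal node is a list of child nodes,
    with the two players alternating moves."""
    if not isinstance(node, list):
        return node
    vals = [value(child, not maximizing) for child in node]
    return max(vals) if maximizing else min(vals)

# Player 1 picks a branch; player 2 then picks a leaf.
assert value([[1, 0], [0, 0], [1, 1]]) == 1   # the third branch wins for player 1
assert value([[1, 0], [0, 1]]) == 0           # player 2 can always answer
print("game outcomes computed")
```

Because both players (and the arbitrator) can run this evaluation, each knows in advance which side has a winning strategy in every finite sub-game of the construction.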
Because the actions of the arbitrator can be determined, both players know that once a string σ is in R_V, all extensions of σ will also be in that set.
16.3. The collection of nonrandom strings
763
Hence, when we consider R_{V_s}, we will assume that this set is closed under extensions.

Theorem 16.3.31 (Day [88]). There exists a universal strict process machine V such that R_V is not tt-complete.

Proof. To prove this theorem we build a universal strict process machine V and a c.e. set A ≰_tt R_V. As discussed above, we assume that the arbitrator acts behind the scenes and that R_{V_s} is closed under extensions for all s. Let Γ_0, Γ_1, … be an enumeration of all truth table reductions. Let U be a universal strict process machine. Let D_s = {τ : U_s(τ)↓ ∧ |τ| < |U_s(τ)| − 3}. The construction will ensure that if τ ∈ D_s, then U_s(τ) ∈ R_V. We will use D_s to determine how much the opponent spends in the games. We will show that if the opponent plays a move in a game between stages s_0 and s_1, then we can determine a lower bound for μ(D_{s_1}) − μ(D_{s_0}). We want to satisfy the following requirements for all n.
R_n : ∃i, j (A(⟨n, i, j⟩) ≠ Γ_n^{R_V}(⟨n, i, j⟩)).

The triples ⟨n, i, j⟩ will be used as follows. The number n represents the reduction to be diagonalized against. The number i is incremented every time the requirement is injured by a stronger priority requirement. It also provides an upper bound on the measure that can be used by the players in the game. The number j provides us with a series of games for each diagonalization. It will be incremented if our opponent ever "breaks the rules" of the game by using too much measure.

Construction.

Stage 0. Let V(λ) = V(0) = V(00) = λ. Let A_0 = ∅.

Stage 2s + 1. For all n < s, if R_n does not have a witness, assign ⟨n, 0, 0⟩ to R_n. For n < s, let ⟨n, i_n, j_n⟩ be the current witness for R_n. If R_n does not have a game assigned, run Γ_n(⟨n, i_n, j_n⟩) for s many steps to see whether it returns a truth table. If it does, let X_n = {σ_0, …, σ_k} be the set of strings used as variables by this truth table. For the purpose of this game, we will assume that stronger priority requirements have stopped acting, i.e., that the associated games are finished. We do not want to interrupt any earlier games, so let Y_n be the set of those σ ∈ X_n whose addition to R_{V_{2s+1}} would not change any variable used by a higher priority game. Note that σ ∈ X_n \ Y_n iff adding σ to R_{V_{2s+1}} would change some variable used by a higher priority game (as R_{V_{2s+1}} is closed under extensions).

The game G_{n,i_n,j_n} is defined as follows. We will assume that the strings in X_n \ Y_n do not change. The vertices in the game correspond to possible truth assignments to the variables in Y_n. The vertices are labeled with the value of the corresponding line in the truth table (assuming those variables in X_n \ Y_n retain their current values). An edge exists from vertex v_1 to vertex v_2 if it is possible to go from the row associated with v_1 to the row associated with v_2 by changing some of the truth table variables from 0 to 1. If going from vertex v_1 to vertex v_2 requires changing exactly the variables in Σ ⊆ Y_n, then the cost associated with the edge is μ(Σ). The amount of measure each player has to expend on the game is 2^{−n−i_n−6}. The game G_{n,i_n,j_n}, though defined, is said to be uninitialized. We allow the opponent to move by letting V(000τ) = σ for all τ and σ such that U_s(τ) = σ.

764

16. Complexity of Computably Enumerable Sets
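As a toy illustration (the strings and sizes here are hypothetical, not part of the construction), suppose Y_n = {σ_0, σ_1} with |σ_0| = |σ_1| = 3. The vertices are the four truth assignments, edges only turn 0s into 1s, and the edge costs are the measures of the changed variables:

```latex
% Vertices: truth assignments to (sigma_0, sigma_1); edges only change 0s to 1s.
(0,0)\to(1,0),\qquad (0,0)\to(0,1),\qquad (0,0)\to(1,1),\qquad
(1,0)\to(1,1),\qquad (0,1)\to(1,1),
% e.g. the edge changing both variables costs
\mathrm{cost}\bigl((0,0)\to(1,1)\bigr)
   \;=\; \mu(\{\sigma_0,\sigma_1\})
   \;=\; 2^{-3} + 2^{-3}
   \;=\; \tfrac{1}{4}.
```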
Stage 2s + 2. We determine whether there is any game that the champion needs to attend to. We find all games assigned to requirements that are uninitialized or where the opponent has made a move. The opponent is considered to have made a move if some new strings used by the truth table reduction have been enumerated into R_{V_{2s+1}} \ R_{V_{2s}}. If such games exist, let G_{n,i_n,j_n} be the strongest priority game (i.e., the game with the smallest n) that needs attention.

First we reset all weaker priority games: For all p such that n < p ≤ s, let ⟨n_p, i_p, j_p⟩ be the current witness assigned to R_p. Remove this witness and the associated game and let ⟨n_p, i_p + 1, 0⟩ be the new witness.

If G_{n,i_n,j_n} is uninitialized, then we set the start position for the game to be the vertex that corresponds to assigning σ_i a truth value of 1 iff σ_i ∈ R_{V_{2s+2}}. The champion decides whether to take a winning strategy to ensure that Γ_n(⟨n, i_n, j_n⟩) = 0 or one to ensure that Γ_n(⟨n, i_n, j_n⟩) = 1. Of course, one of these must exist. In the first case we add ⟨n, i_n, j_n⟩ to A_s; in the second case we leave it out. Let Σ be the set of strings that the champion needs to enumerate into R_{V_{2s+2}} for the first step of this strategy. We now say that the game G_{n,i_n,j_n} has been initialized.

If G_{n,i_n,j_n} is a game in which the opponent has made a move, then let 2s_0 be the stage at which this game was initialized. If μ(D_s) − μ(D_{s_0}) ≥ 2^{−n−i_n−6}, then the opponent has exceeded the allocated measure for the game G_{n,i_n,j_n}. In this case, remove ⟨n, i_n, j_n⟩ as a witness for R_n and also remove the game. Let ⟨n, i_n, j_n + 1⟩ be the new witness. Let Σ = ∅. If the opponent has not exceeded the allocated measure, then let Σ be the set of strings that the champion needs to enumerate into R_{V_{2s+2}} for the next move in the predetermined winning strategy.

Now we need to add Σ to R_{V_{2s+2}} in order to make the champion's next move. We know that the arbitrator will ensure that R_{V_{2s+2}} is closed under extensions. So if we take Σ̂ = {σ̂_0, …, σ̂_k} to be a prefix-free set formed by taking the ⪯-minimal elements of Σ, then the champion needs only to enumerate Σ̂ into R_{V_{2s+2}}. We will use the KC Theorem to find descriptions for these strings. We use the KC Theorem to request a string τ_i of length |σ̂_i| − 3 for each i ≤ k. For each such i, let V(01τ_i) = σ̂_i and V(01τ) = λ for all τ ≺ τ_i.
Note that the champion decreases the measure available for future requests by 2^3 μ(Σ̂). However, by scaling, we can regard the champion as having 1/8 as much measure to spend, and this move as costing it μ(Σ̂).

End of Construction.

The first step in verifying this construction is to show that if the opponent makes a move then it must pay the cost of that move.
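The factor 2^3 can be checked directly: each KC request for a ⪯-minimal string σ̂_i of Σ̂ has length |σ̂_i| − 3, so the weight the champion's machine gives up is

```latex
\sum_{i \le k} 2^{-(|\hat\sigma_i| - 3)}
   \;=\; 2^{3} \sum_{i \le k} 2^{-|\hat\sigma_i|}
   \;=\; 2^{3}\,\mu(\hat\Sigma).
```

Dividing all costs by 2^3 turns the champion's total KC budget of 1 into 1/8 and makes this move cost μ(Σ̂), which is the scaled accounting used below.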
Lemma 16.3.32. If the opponent enumerates a set of strings Σ into R_{V_{2s_1}} \ R_{V_{2s_0}}, then μ(D_{s_1}) − μ(D_{s_0}) ≥ 2^4 μ(Σ̂), where Σ̂ is the set of ⪯-minimal elements of Σ.
Proof. Form a prefix-free set Σ̂ = {σ̂_0, …, σ̂_k} from the set of minimal elements of Σ under the ⪯ relation. For all i ≤ k, there exists some τ_i with U_{s_1}(τ_i) ⪯ σ̂_i such that |τ_i| ≤ |U_{s_1}(τ_i)| − 4 and τ_i ∉ dom(U_{s_0}). Let C = {U_{s_1}(τ_i) : i ≤ k}. Now every element of Σ̂ extends an element of C, so we can let Ĉ be a minimal subset of C such that every element of Σ̂ extends an element of Ĉ. Then μ(Ĉ) ≥ μ(Σ̂). For all υ ∈ Ĉ, choose an i such that U_{s_1}(τ_i) = υ and let I be the set of all such i. The set {τ_i : i ∈ I} is prefix-free, because its image under U_{s_1} is prefix-free and U is a strict process machine. It follows that

μ({τ_i : i ∈ I}) = ∑_{i∈I} 2^{−|τ_i|} ≥ ∑_{i∈I} 2^{4−|U_{s_1}(τ_i)|} = 2^4 μ(Ĉ) ≥ 2^4 μ(Σ̂).
Finally, take any i ≤ k. If τ_i ≺ τ, then τ ∉ dom(U_{s_0}), as the domain of a strict process machine is closed under substrings. So τ ∉ D_{s_0}. If τ ≺ τ_i and τ ∈ D_{s_0}, then U_{s_0}(τ) ∈ R_{V_{s_0}}. Since U_{s_0}(τ) ⪯ U_{s_1}(τ_i) ⪯ σ̂_i, it follows that σ̂_i ∈ R_{V_{s_0}}. But we assumed that σ̂_i ∉ R_{V_{s_0}}, and hence {τ_i : i ≤ k} ∩ D_{s_0} = ∅. The result follows, since D_{s_0} ⊆ D_{s_1}.

Again by scaling, we can regard the opponent as having 1/16 as much measure to spend, and the cost of the move as being μ(Σ̂).
Lemma 16.3.33. V is a strict process machine.

Proof. U is by assumption a strict process machine, so to check that V is a strict process machine, we need to check only the strings enumerated into V by the champion. As the champion is effectively a prefix-free machine, we need to show only that the champion does not run out of measure.

We divide the games into two sorts, those games G_{n,i,j} with j = 0, and all other games. We know the champion always keeps within the rules of the game. Let C be the cost to the champion of playing those games with j = 0. Then C is less than the sum of the measures allocated to each game. Hence, C ≤ ∑_n ∑_i 2^{−n−i−6} = ∑_n 2^{−n−5} = 1/16. Now, j is incremented only if the opponent exceeds the amount of measure allocated to a game. Hence the measure the champion spends on these games is always less than the measure the opponent spends overall. As the opponent has only 1/16 to spend, it follows that the champion spends less than
C + 1/16 ≤ 1/8, the amount of measure available to it. Hence the champion is a prefix-free machine and thus V is a strict process machine.
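The bound on C in the proof of Lemma 16.3.33 is a pair of geometric series:

```latex
C \;\le\; \sum_{n \ge 0} \sum_{i \ge 0} 2^{-n-i-6}
  \;=\; \sum_{n \ge 0} 2^{-n-6} \sum_{i \ge 0} 2^{-i}
  \;=\; \sum_{n \ge 0} 2^{-n-5}
  \;=\; 2^{-4}
  \;=\; \tfrac{1}{16},
```

so the champion's total expenditure stays below C + 1/16 ≤ 1/8, exactly its budget in the scaled accounting.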
Lemma 16.3.34. All requirements are met.

Proof. Take any requirement R_n. Assume that at some stage s_0 all higher priority requirements have stopped acting. Let ⟨n, i, j⟩ be the witness assigned to R_n at stage s_0. Because all higher priority requirements have stopped acting, i is never incremented again. So if the witness is changed, it must be because j is increased. This event must in turn be caused by the opponent exceeding its allocated measure of 2^{−n−i−6} in the previous game, which can happen only finitely often, since otherwise the opponent would run out of measure. So there is some final witness ⟨n, i_n, j_n⟩ assigned to R_n. If Γ_n(⟨n, i_n, j_n⟩) never halts then the requirement is met. If Γ_n(⟨n, i_n, j_n⟩) does halt then the champion will adopt a winning strategy for the game G_{n,i_n,j_n}, and so in either case Γ_n^{R_V}(⟨n, i_n, j_n⟩) ≠ A(⟨n, i_n, j_n⟩).

Together, the previous two lemmas complete the proof of the theorem.
References
[1] M. Ageev. Martin's game: a lower bound for the number of sets. Theoretical Computer Science, 289:871–876, 2002. [732] [2] AIM/ARCC Workshop in Effective Randomness, August 7–11, 2006, Open problem list. http://www.math.dartmouth.edu/~frg/aimRandQsLive.pdf. [xiii] [3] E. Allender, H. Buhrman, and M. Koucký. What can be efficiently reduced to the Kolmogorov-random strings? Annals of Pure and Applied Logic, 138:2–19, 2006. [749] [4] E. Allender, H. Buhrman, M. Koucký, D. van Melkebeek, and D. Ronneburger. Power from random strings. SIAM Journal on Computing, 35:1467–1493, 2006. [743] [5] K. Ambos-Spies. Anti-mitotic recursively enumerable sets. Mathematical Logic Quarterly, 31:461–477, 1985. [53] [6] K. Ambos-Spies. Algorithmic randomness revisited. In B. McGuinness, editor, Language, Logic and Formalization of Knowledge. Coimbra Lecture and Proceedings of a Symposium held in Siena in September 1997, pages 33–52. Bibliotheca, 1998. [303, 307] [7] K. Ambos-Spies, C. G. Jockusch, Jr., R. A. Shore, and R. I. Soare. An algebraic decomposition of the recursively enumerable degrees and the coincidence of several degree classes with the promptly simple degrees. Transactions of the American Mathematical Society, 281:1089–1104, 1984. [15, 91, 92, 215]
Numbers in brackets after each entry indicate the pages on which the entry is cited.
R.G. Downey and D. Hirschfeldt, Algorithmic Randomness and Complexity, Theory and Applications of Computability, DOI 10.1007/978-0-387-68441-3, © Springer Science+Business Media, LLC 2010
[8] K. Ambos-Spies, B. Kjos-Hanssen, S. Lempp, and T. A. Slaman. Comparing DNR and WWKL. The Journal of Symbolic Logic, 69:102–143, 2004. [349] [9] K. Ambos-Spies and A. Kuˇcera. Randomness in computability theory. In P. A. Cholak, S. Lempp, M. Lerman, and R. A. Shore, editors, Computability Theory and its Applications. Proceedings of the AMS-IMS-SIAM Joint Summer Research Conference held at the University of Colorado, Boulder, CO, June 13–17, 1999, volume 257 of Contemporary Mathematics, pages 1–14. American Mathematical Society, Providence, RI, 2000. [277, 554] [10] K. Ambos-Spies and E. Mayordomo. Resource-bounded measure and randomness. In A. Sorbi, editor, Complexity, Logic, and Recursion Theory, volume 187 of Lecture Notes in Pure and Applied Mathematics, pages 1–47. Marcel Dekker, Inc., New York, 1997. [16, 270, 277] [11] K. Ambos-Spies, E. Mayordomo, Y. Wang, and X. Zheng. Resourcebounded balanced genericity, stochasticity and weak randomness. In C. Puech and R. Reischuk, editors, STACS ’96. Proceedings of the 13th Annual Symposium on Theoretical Aspects of Computer Science held in Grenoble, February 22–24, 1996, volume 1046 of Lecture Notes in Computer Science, pages 63–74. Springer-Verlag, Berlin, 1996. [308] [12] K. Ambos-Spies, K. Weihrauch, and X. Zheng. Weakly computable real numbers. Journal of Complexity, 16:676–690, 2000. [201, 217, 219] [13] M. M. Arslanov. On some generalizations of the fixed-point theorem. Soviet Mathematics, 25:1–10, 1981. Translated from Izvestiya Vysshikh Uchebnykh Zavedeni˘ı. Matematika. [89] [14] M. M. Arslanov. Degree structures in local degree theory. In A. Sorbi, editor, Complexity, Logic, and Recursion Theory, volume 187 of Lecture Notes in Pure and Applied Mathematics, pages 49–74. Marcel Dekker Inc., New York, 1997. [27, 89] [15] K. B. Athreya, J. M. Hitchcock, J. H. Lutz, and E. Mayordomo. Effective strong dimension in algorithmic information and computational complexity. SIAM Journal on Computing, 37:671–705, 2007. 
[636, 637, 638, 639, 641, 654, 657] [16] B. Barak, R. Impagliazzo, and A. Wigderson. Extracting randomness using few independent sources. SIAM Journal on Computing, 36:1095–1118, 2006. [642] [17] G. Barmpalias. Random non-cupping revisited. Journal of Complexity, 22:850–857, 2006. [536] [18] G. Barmpalias. Elementary differences between the degrees of unsolvability and degrees of compressibility. Annals of Pure and Applied Logic, 161:923– 934, 2010. [491, 494] [19] G. Barmpalias. Relative randomness and cardinality. Notre Dame Journal of Formal Logic, 51:195–205, 2010. [491] [20] G. Barmpalias, R. G. Downey, and N. Greenberg. K-trivial degrees and the jump traceability hierarchy. Proceedings of the American Mathematical Society, 137:2099–2109, 2009. [689] [21] G. Barmpalias, R. G. Downey, and N. Greenberg. Working with strong reducibilities above totally ω-c.e. and array computable degrees. Transac-
tions of the American Mathematical Society, 362:777–813, 2010. [442, 443, 444, 447, 448, 455] [22] G. Barmpalias, R. G. Downey, and K. M. Ng. Jump inversions inside effectively closed sets and applications to randomness. To appear in the Journal of Symbolic Logic. [358, 362] [23] G. Barmpalias and A. E. M. Lewis. A c.e. real that cannot be sw-computed by any Ω-number. Notre Dame Journal of Formal Logic, 47:197–209, 2006. [448] [24] G. Barmpalias, A. E. M. Lewis, and K. M. Ng. The importance of Π01 classes in effective randomness. The Journal of Symbolic Logic, 75:387–400, 2010. [323, 339, 491, 536] [25] G. Barmpalias, A. E. M. Lewis, and M. Soskova. Randomness, lowness and degrees. The Journal of Symbolic Logic, 73:559–577, 2008. [489, 490, 491, 499] [26] G. Barmpalias, A. E. M. Lewis, and F. Stephan. Π01 classes, LR degrees and Turing degrees. Annals of Pure and Applied Logic, 156:21–38, 2008. [491] [27] G. Barmpalias, J. S. Miller, and A. Nies. Randomness notions and partial relativization. To appear. [315] [28] G. Barmpalias and A. Nies. Upper bounds on ideals in the Turing degrees. To appear. [541, 542] [29] J. Barzdins. Complexity of programs to determine whether natural numbers not greater than n belong to a recursively enumerable set. Soviet Mathematics Doklady, 9:1251–1254, 1968. [122, 728] [30] T. Bayes. An essay towards solving a problem in the doctrine of chances, by the late Rev. Mr. Bayes, F. R. S. communicated by Mr. Price, in a letter to John Canton, A. M., F. R. S. Philosophical Transactions of the Royal Society of London, 53:370–418, 1763. [238] [31] V. Becher, S. Figueira, S. Grigorieff, and J. S. Miller. Randomness and halting probabilities. The Journal of Symbolic Logic, 71:1411–1430, 2006. [229] [32] V. Becher and S. Grigorieff. Random reals and possibly infinite computations. I. Randomness in ∅ . The Journal of Symbolic Logic, 70:891–913, 2005. [229] [33] V. Becher and S. Grigorieff. 
From index sets to randomness in ∅n : random reals and possibly infinite computations. II. The Journal of Symbolic Logic, 74:124–156, 2009. [257] [34] B. R. C. Bedregal and A. Nies. Lowness properties of reals and hyperimmunity. In R. de Queiroz, E. Pimentel, and L. Figueiredo, editors, WoLLiC ’2003. 10th Workshop on Logic, Language, Information, and Computation, Ouro Preto, Minas Gerais, Brazil, 29 July–01 August 2003, volume 84 of Electronic Notes in Theoretical Computer Science, pages 73–79 (electronic). Elsevier, Amsterdam, 2003. [560] [35] O. V. Belegradek. On algebraically closed groups. Algebra and Logic, 13:135–143, 1974. [206]
[36] M. Bickford and C. Mills. Lowness properties of r.e. sets. Unpublished manuscript, 1982, University of Wisconsin–Madison. [530] [37] L. Bienvenu. Game-theoretic Characterizations of Randomness: Unpredictability and Stochasticity. Ph.D. dissertation, University of Provence, 2008. [327, 330] [38] L. Bienvenu, D. Doty, and F. Stephan. Constructive dimension and Turing degrees. Theory of Computing Systems, 45:740–755, 2009. [643] [39] L. Bienvenu and R. G. Downey. Kolmogorov complexity and Solovay functions. In S. Albers and J.-Y. Marion, editors, 26th International Symposium on Theoretical Aspects of Computer Science. STACS 2009, February 26–28, 2009, Freiburg, Germany, volume 3 of Leibniz International Proceedings in Informatics, pages 147–158 (electronic). Schloss Dagstuhl–Leibniz Center for Informatics, Dagstuhl, Germany, 2009. [138, 327, 329, 412, 413, 484, 504, 505, 714] [40] L. Bienvenu and W. Merkle. Reconciling data compression and Kolmogorov complexity. In L. Arge, C. Cachin, T. Jurdzi´ nski, and A. Tarlecki, editors, Automata, Languages and Programming. 34th International Colloquium, ICALP 2007, Wroclaw, Poland, July 9–13, 2007. Proceedings, volume 4596 of Lecture Notes in Computer Science, pages 643–654. Springer, Berlin, 2007. Extended version at http://www.math.uni-heidelberg.de/logic/ merkle/ps/icalp-2007.pdf. [138, 139, 298, 299, 300, 301] [41] L. Bienvenu and J. S. Miller. Randomness and lowness notions via open covers. To appear in the Annals of Pure and Applied Logic. [591] [42] L. Bienvenu, An. A. Muchnik, A. Shen, and N. Vereshchagin. Limit complexities revisited. In S. Albers and P. Weil, editors, 25th International Symposium on Theoretical Aspects of Computer Science. STACS 2008, February 21–23, 2008, Bordeaux, France, volume 1 of Leibniz International Proceedings in Informatics, pages 73–84 (electronic). Schloss Dagstuhl– Leibniz Center for Informatics, Dagstuhl, Germany, 2008. [263, 471, 472, 473] [43] S. Binns, B. Kjos-Hanssen, M. 
Lerman, and R. Solomon. On a conjecture of Dobrinen and Simpson concerning almost everywhere domination. The Journal of Symbolic Logic, 71:119–136, 2006. [497] [44] P. Brodhead, R. G. Downey, and K. M. Ng. Bounded variations of randomness. In preparation. [319] [45] J.-Y. Cai and J. Hartmanis. On Hausdorff and topological dimensions of the Kolmogorov complexity of the real line. Journal of Computer and System Sciences, 49:605–619, 1994. [598] [46] W. C. Calhoun. Degrees of monotone complexity. The Journal of Symbolic Logic, 71:1327–1341, 2006. [147, 432, 459, 460, 461, 462] [47] C. Calude, R. J. Coles, P. H. Hertling, and B. Khoussainov. Degreetheoretic aspects of computably enumerable reals. In S. B. Cooper and J. K. Truss, editors, Models and Computability. Invited Papers from Logic Colloquium ’97 - European Meeting of the Association for Symbolic Logic, Leeds, July 1997, volume 259 of London Mathematical Society Lecture Note Series. Cambridge University Press, Cambridge, 1999. [203, 204, 205, 405]
[48] C. S. Calude and G. J. Chaitin. What is a halting probability? Notices of the American Mathematical Society, 57:236–237, 2010. [12, 142] [49] C. S. Calude and R. J. Coles. Program-size complexity of initial segments and domination relation reducibility. In J. Karhumäki, H. Mauer, G. Păun, and G. Rozenberg, editors, Jewels are Forever, pages 225–237. Springer-Verlag, Berlin, 1999. [501] [50] C. S. Calude, P. H. Hertling, B. Khoussainov, and Y. Wang. Recursively enumerable reals and Chaitin Ω numbers. In M. Morvan, C. Meinel, and D. Krob, editors, STACS 98. 15th Annual Symposium on Theoretical Aspects of Computer Science. Paris, France, February 25–27, 1998. Proceedings, volume 1373 of Lecture Notes in Computer Science, pages 596–606. Springer, Berlin, 1998. [200, 409, 413] [51] C. S. Calude and A. Nies. Chaitin Ω numbers and strong reducibilities. Journal of Universal Computer Science, 3:1162–1166 (electronic), 1997. [228, 333] [52] C. S. Calude, L. Staiger, and S. A. Terwijn. On partial randomness. Annals of Pure and Applied Logic, 138:20–30, 2006. [604, 605] [53] C. S. Calude and M. Zimand. Algorithmically independent sequences. In M. Ito and M. Toyama, editors, Developments in Language Theory. 12th International Conference, DLT 2008. Kyoto, Japan, September 16–19, 2008. Proceedings, volume 5257 of Lecture Notes in Computer Science, pages 183–195. Springer, Berlin, 2008. [627, 628, 635] [54] C. Carathéodory. Über das lineare Mass von Punktmengen, eine Verallgemeinerung des Längenbegriffs. Nachrichten von der Gesellschaft der Wissenschaften zu Göttingen, Mathematisch-Physikalische Klasse, pages 404–426, 1914. [592] [55] D. Cenzer. Π01 classes in computability theory. In E. R. Griffor, editor, Handbook of Computability Theory, volume 140 of Studies in Logic and the Foundations of Mathematics, pages 37–85. North-Holland, Amsterdam, 1999. [74, 373] [56] D. Cenzer, G. LaForte, and G. Wu. Pseudojumps and Π01 classes.
Journal of Logic and Computation, 19:77–87, 2009. [377] [57] D. Cenzer and J. B. Remmel. Effectively Closed Sets. To be published in Lecture Notes in Logic. [74] [58] G. J. Chaitin. A theory of program size formally identical to information theory. Journal of the Association for Computing Machinery, 22:329–340, 1975. [xiii, 121, 125, 128, 129, 133, 134, 142, 227, 228, 232] [59] G. J. Chaitin. Algorithmic entropy of sets. Computers & Mathematics with Applications, 2:233–245, 1976. [731, 732] [60] G. J. Chaitin. Information-theoretical characterizations of recursive infinite strings. Theoretical Computer Science, 2:45–48, 1976. [118, 119, 120] [61] G. J. Chaitin. Nonrecursive infinite strings with simple initial segments. IBM Journal of Research and Development, 21:350–359, 496, 1977. [456, 500] [62] G. J. Chaitin. Gödel's theorem and information. International Journal of Theoretical Physics, 21:941–954, 1982. [627]
[63] G. J. Chaitin. Incompleteness theorems for random reals. Advances in Applied Mathematics, 8:119–146, 1987. [129, 130, 131, 229, 233, 250] [64] J. Chisholm, J. Chubb, V. S. Harizanov, D. R. Hirschfeldt, C. G. Jockusch, Jr., T. McNicholl, and S. Pingrey. Π01 classes and strong degree spectra of relations. The Journal of Symbolic Logic, 72:1003–1018, 2007. [368] [65] P. A. Cholak, R. G. Downey, and N. Greenberg. Strong-jump traceablilty. I. The computably enumerable case. Advances in Mathematics, 217:2045– 2074, 2008. [668, 669, 670, 673, 680, 685] [66] P. A. Cholak, R. G. Downey, and M. Stob. Automorphisms of the lattice of recursively enumerable sets: promptly simple sets. Transactions of the American Mathematical Society, 332:555–570, 1992. [83] [67] P. A. Cholak, N. Greenberg, and J. S. Miller. Uniform almost everywhere domination. The Journal of Symbolic Logic, 71:1057–1072, 2006. [496] [68] P. A. Cholak and L. A. Harrington. Definable encodings in the computably enumerable sets. The Bulletin of Symbolic Logic, 6:185–196, 2000. [83] [69] C. T. Chong and R. G. Downey. Degrees bounding minimal degrees. Mathematical Proceedings of the Cambridge Philosophical Society, 105:211–222, 1989. [386] [70] C. T. Chong and R. G. Downey. Minimal degrees recursive in 1-generic degrees. Annals of Pure and Applied Logic, 48:215–225, 1990. [386] [71] A. Church. On the concept of a random sequence. Bulletin of the American Mathematical Society, 46:130–135, 1940. [xviii, 230] [72] J. A. Cole and S. G. Simpson. Mass problems and hyperarithmeticity. Journal of Mathematical Logic, 7:125–143, 2007. [347, 696] [73] R. J. Coles, R. G. Downey, C. G. Jockusch, Jr., and G. LaForte. Completing pseudojump operators. Annals of Pure and Applied Logic, 136:297–333, 2005. [42] [74] C. J. Conidis. A real of strictly positive effective packing dimension that does not compute a real of effective packing dimension one. To appear. [642] [75] C. J. Conidis. 
Effective packing dimension of Π01 -classes. Proceedings of the American Mathematical Society, 136:3655–3662, 2008. [648] [76] S. B. Cooper. Degrees of unsolvability complementary between recursively enumerable degrees. Annals of Mathematical Logic, 4:31–73, 1972. [72] [77] S. B. Cooper. Minimal degrees and the jump operator. The Journal of Symbolic Logic, 38:249–271, 1973. [72] [78] S. B. Cooper. A characterisation of the jumps of minimal degrees below 0 . In S. B. Cooper, T. A. Slaman, and S. S. Wainer, editors, Computability, Enumerability, Unsolvability. Directions in Recursion Theory, volume 224 of London Mathematical Society Lecture Note Series, pages 81–92. Cambridge University Press, Cambridge, 1996. [72] [79] S. B. Cooper. Computability Theory. Chapman & Hall/CRC, Boca Raton, FL, 2004. [xiii, 8]
[80] B. F. Csima. Applications of Computability Theory to Prime Models and Differential Geometry. Ph.D. dissertation, The University of Chicago, 2003. [420, 421] [81] B. F. Csima. The settling time reducibility ordering and Δ02 sets. Journal of Logic and Computation, 19:145–150, 2009. [421] [82] B. F. Csima, R. G. Downey, N. Greenberg, D. R. Hirschfeldt, and J. S. Miller. Every 1-generic computes a properly 1-generic. The Journal of Symbolic Logic, 71:1385–1393, 2006. [379] [83] B. F. Csima and A. Montalb´ an. A minimal pair of K-degrees. Proceedings of the American Mathematical Society, 134:1499–1502, 2006. [458, 551, 552] [84] B. F. Csima and R. A. Shore. The settling-time reducibility ordering. The Journal of Symbolic Logic, 72:1055–1071, 2007. [421] [85] B. F. Csima and R. I. Soare. Computability results used in differential geometry. The Journal of Symbolic Logic, 71:1394–1410, 2006. [421] [86] R. P. Daley. Minimal-program complexity of pseudo-recursive and pseudorandom sequences. Mathematical Systems Theory, 9:83–94, 1975. [304, 305] [87] A. R. Day. Increasing the gap between descriptional complexity and algorithmic probability. To appear in the Transactions of the American Mathematical Society. [151, 153, 169, 170] [88] A. R. Day. On the computational power of random strings. Annals of Pure and Applied Logic, 160:214–228, 2009. [147, 752, 761, 762, 763] [89] A. R. Day. Process complexity and computable randomness. Manuscript, June 2009. [282, 283] [90] A. R. Day. On process complexity. Chicago Journal of Theoretical Computer Science, 2010. Paper number 4 (electronic). [149] [91] A. R. Day and A. Fitzgerald. Indifferent sets for 1-generics. In preparation. [341] [92] K. de Leeuw, E. F. Moore, C. E. Shannon, and N. Shapiro. Computability by probabilistic machines. In C. E. Shannon and J. McCarthy, editors, Automata Studies, volume 34 of Annals of Mathematics Studies, pages 183– 212. Princeton University Press, Princeton, NJ, 1956. [323, 358] [93] O. Demuth. 
On constructive pseudonumbers. Commentationes Mathematicae Universitatis Carolinae, 16:315–331, 1975. In Russian. [199, 231, 409, 418, 419] [94] O. Demuth. On some classes of arithmetical real numbers. Commentationes Mathematicae Universitatis Carolinae, 23:453–465, 1982. In Russian. [315] [95] O. Demuth. Remarks on the structure of tt-degrees based on constructive measure theory. Commentationes Mathematicae Universitatis Carolinae, 29:233–247, 1988. [315, 316, 333, 364] [96] O. Demuth and A. Kuˇcera. Remarks on 1-genericity, semigenericity, and related concepts. Commentationes Mathematicae Universitatis Carolinae, 28:85–94, 1987. [103, 380] [97] D. Diamondstone. Promptness does not imply superlow cuppability. The Journal of Symbolic Logic, 74:1264–1272, 2009. [693, 694]
[98] N. L. Dobrinen and S. G. Simpson. Almost everywhere domination. The Journal of Symbolic Logic, 69:914–922, 2004. [495, 496] [99] R. G. Downey. Δ02 degrees and transfer theorems. Illinois Journal of Mathematics, 31:419–427, 1987. [70] [100] R. G. Downey. Subsets of hypersimple sets. Pacific Journal of Mathematics, 127:299–319, 1987. [83] [101] R. G. Downey. On the universal splitting property. Mathematical Logic Quarterly, 43:311–320, 1997. [205] [102] R. G. Downey. Computability, definability, and algebraic structures. In R. G. Downey, D. Ding, S. P. Tung, Y. H. Qiu, and M. Yasugi, editors, Proceedings of the 7th and 8th Asian Logic Conferences. Held in Hsi-Tou, June 6–10, 1999 and Chongqing, August 29–September 2, 2002, pages 63–102. Singapore University Press and World Scientific, Singapore, 2003. [204] [103] R. G. Downey. Some computability-theoretical aspects of reals and randomness. In P. A. Cholak, editor, The Notre Dame Lectures, volume 18 of Lecture Notes in Logic, pages 97–148. Association for Symbolic Logic and A K Peters, Ltd., Urbana, IL and Wellesley, MA, 2005. [212, 440] [104] R. G. Downey. The sixth lecture on algorithmic randomness. In S. B. Cooper, H. Geuvers, A. Pillay, and J. V¨ aa ¨n¨ anen, editors, Logic Colloquium 2006. Proceedings of the Annual European Conference on Logic of the Association for Symbolic Logic held at the Radboud University, Nijmegen, July 27–August 2, 2006, volume 32 of Lecture Notes in Logic, pages 103–134. Association for Symbolic Logic and Cambridge University Press, Chicago and Cambridge, 2009. [511] [105] R. G. Downey and N. Greenberg. Strong jump traceability II: K-triviality. To appear. [668, 700] [106] R. G. Downey and N. Greenberg. Turing degrees of reals of positive effective packing dimension. Information Processing Letters, 108:298–303, 2008. [618, 642, 645, 649, 650] [107] R. G. Downey, N. Greenberg, N. Mihailovi´c, and A. Nies. Lowness for computable machines. In C. T. Chong, Q. Feng, T. A. Slaman, W. H. 
Woodin, and Y. Yang, editors, Computational Prospects of Infinity: Part II. Presented Talks, Lectures from the Workshop Held at the National University of Singapore, Singapore, June 20–August 15, 2005, volume 15 of Lecture Notes Series. Institute for Mathematical Sciences. National University of Singapore, pages 79–86. World Scientific Publishing Company, Singapore, 2008. [562, 563, 564] [108] R. G. Downey, N. Greenberg, and J. S. Miller. The upward closure of a perfect thin class. Annals of Pure and Applied Logic, 165:51–58, 2008. [323] [109] R. G. Downey and E. J. Griffiths. Schnorr randomness. The Journal of Symbolic Logic, 69:533–554, 2004. [275, 277, 294, 350, 462, 562, 564, 658] [110] R. G. Downey, E. J. Griffiths, and G. LaForte. On Schnorr and computable randomness, martingales, and machines. Mathematical Logic Quarterly, 50:613–627, 2004. [240, 278, 279, 280, 349, 350, 462, 463, 564, 565, 568, 569]
[111] R. G. Downey, E. J. Griffiths, and S. Reid. On Kurtz randomness. Theoretical Computer Science, 321:249–270, 2004. [290, 291, 292, 293, 294, 295, 296, 581] [112] R. G. Downey and D. R. Hirschfeldt, editors. Aspects of Complexity (Short Courses in Complexity from the New Zealand Mathematical Research Institute Summer 2000 Meeting, Kaikoura), volume 4 of De Gruyter Series in Logic and its Applications. Walter de Gruyter & Co., Berlin, 2001. [xi] [113] R. G. Downey, D. R. Hirschfeldt, and G. LaForte. Randomness and reducibility. Journal of Computer and System Sciences, 68:96–114, 2004. [406, 420, 421, 422, 423, 424, 433, 434, 435, 436, 440, 442] [114] R. G. Downey, D. R. Hirschfeldt, and G. LaForte. Undecidability of the structure of the Solovay degrees of c.e. reals. Journal of Computer and System Sciences, 73:769–787, 2007. [419] [115] R. G. Downey, D. R. Hirschfeldt, J. S. Miller, and A. Nies. Relativizing Chaitin’s halting probability. Journal of Mathematical Logic, 5:167–192, 2005. [334, 528, 706, 707, 708, 709, 710, 711, 712, 713, 716, 717, 718, 719, 720, 721, 723, 724, 726, 727] [116] R. G. Downey, D. R. Hirschfeldt, and A. Nies. Randomness, computability and density. SIAM Journal on Computing, 31:1169–1183, 2002. [xi, 407, 408, 413, 415, 416, 418, 419, 427, 430, 432] [117] R. G. Downey, D. R. Hirschfeldt, A. Nies, and F. Stephan. Trivial reals. In R. G. Downey, D. Ding, S. P. Tung, Y. H. Qiu, and M. Yasugi, editors, Proceedings of the 7th and 8th Asian Logic Conferences. Held in Hsi-Tou, June 6–10, 1999 and Chongqing, August 29–September 2, 2002, pages 103– 131. Singapore University Press and World Scientific, Singapore, 2003. [426, 501, 503, 504, 511, 526, 529, 530, 539, 540, 541, 542] [118] R. G. Downey, D. R. Hirschfeldt, A. Nies, and S. A. Terwijn. Calibrating randomness. The Bulletin of Symbolic Logic, 12:411–491, 2006. [233, 258, 517, 600] [119] R. G. Downey and C. G. Jockusch, Jr. T-degrees, jump classes, and strong reducibilities. 
Transactions of the American Mathematical Society, 301:103–136, 1987. [82, 207] [120] R. G. Downey, C. G. Jockusch, Jr., and M. Stob. Array nonrecursive sets and multiple permitting arguments. In K. Ambos-Spies, G. H. Müller, and G. E. Sacks, editors, Recursion Theory Week. Proceedings of the Conference Held at the Mathematisches Forschungsinstitut, Oberwolfach, March 19–25, 1989, volume 1432 of Lecture Notes in Mathematics, pages 141–174. Springer, Berlin, 1990. [93, 94, 95, 96, 98, 530] [121] R. G. Downey, C. G. Jockusch, Jr., and M. Stob. Array nonrecursive degrees and genericity. In S. B. Cooper, T. A. Slaman, and S. S. Wainer, editors, Computability, Enumerability, Unsolvability. Directions in Recursion Theory, volume 224 of London Mathematical Society Lecture Note Series, pages 93–104. Cambridge University Press, Cambridge, 1996. [93, 94, 95, 96, 97, 107, 108, 109] [122] R. G. Downey and G. LaForte. Presentations of computably enumerable reals. Theoretical Computer Science, 284:539–555, 2002. [206, 208, 210, 215]
[123] R. G. Downey, G. LaForte, and A. Nies. Computably enumerable sets and quasi-reducibility. Annals of Pure and Applied Logic, 95:1–35, 1998. [206] [124] R. G. Downey, S. Lempp, and R. A. Shore. Jumps of minimal degrees below 0′. Journal of the London Mathematical Society. Second Series, 54:417–439, 1996. [72] [125] R. G. Downey, W. Merkle, and J. Reimann. Schnorr dimension. Mathematical Structures in Computer Science, 16:789–811, 2006. [655, 656, 658, 659, 660, 661] [126] R. G. Downey and J. S. Miller. A basis theorem for Π^0_1 classes of positive measure and jump inversion for random reals. Proceedings of the American Mathematical Society, 134:283–288, 2006. [373] [127] R. G. Downey, J. S. Miller, and L. Yu. On the quantity of K-trivial reals. In preparation. [505, 506] [128] R. G. Downey and K. M. Ng. Lowness for Demuth randomness. In K. Ambos-Spies, B. Löwe, and W. Merkle, editors, Mathematical Theory and Computational Practice. 5th Conference on Computability in Europe, CiE 2009. Heidelberg, Germany, July 19–24, 2009. Proceedings, volume 5635 of Lecture Notes in Computer Science, pages 154–166. Springer, Berlin, 2009. [590] [129] R. G. Downey and K. M. Ng. Effective packing dimension and traceability. Notre Dame Journal of Formal Logic, 51:279–290, 2010. [650] [130] R. G. Downey, A. Nies, R. Weber, and L. Yu. Lowness and Π^0_2 nullsets. The Journal of Symbolic Logic, 71:1044–1052, 2006. [288, 289, 535, 537] [131] R. G. Downey and J. B. Remmel. Classification of degree classes associated with r.e. subspaces. Annals of Pure and Applied Logic, 42:105–124, 1989. [206] [132] R. G. Downey and R. A. Shore. There is no degree invariant half-jump. Proceedings of the American Mathematical Society, 125:3033–3037, 1997. [42] [133] R. G. Downey and M. Stob. Splitting theorems in recursion theory. Annals of Pure and Applied Logic, 65:1–106, 1993. [205] [134] R. G. Downey and S. A. Terwijn. Computably enumerable reals and uniformly presentable ideals.
Mathematical Logic Quarterly, 48:29–40, 2002. [209] [135] R. G. Downey and L. V. Welch. Splitting properties of r.e. sets and degrees. The Journal of Symbolic Logic, 51:88–109, 1986. [53] [136] R. G. Downey, G. Wu, and X. Zheng. Degrees of d.c.e. reals. Mathematical Logic Quarterly, 50:345–350, 2004. [218] [137] R. G. Downey and L. Yu. Arithmetical Sacks forcing. Archive for Mathematical Logic, 45:715–720, 2006. [103] [138] B. Durand, L. A. Levin, and A. Shen. Complex tilings. The Journal of Symbolic Logic, 73:593–613, 2008. [601] [139] H. B. Enderton. A Mathematical Introduction to Logic. Harcourt/Academic Press, Burlington, MA, second edition, 2001. [73, 84]
[140] R. L. Epstein, R. Haas, and R. L. Kramer. Hierarchies of sets and degrees below 0′. In M. Lerman, J. H. Schmerl, and R. I. Soare, editors, Logic Year 1979–1980, Proceedings of Seminars and of the Conference on Mathematical Logic Held at the University of Connecticut, Storrs, Conn., November 11–13, 1979, volume 859 of Lecture Notes in Mathematics, pages 32–48. Springer, Berlin, 1981. [28] [141] K. Falconer. Fractal Geometry. Mathematical Foundations and Applications. John Wiley & Sons, Ltd., Chichester, 1990. [639] [142] H. Federer. Geometric Measure Theory, volume 153 of Die Grundlehren der Mathematischen Wissenschaften. Springer-Verlag, New York, 1969. [635] [143] S. Feferman. Some applications of the notions of forcing and generic sets. Fundamenta Mathematicae, 56:325–345, 1964/1965. [100] [144] W. Feller. An Introduction to Probability Theory and its Applications, volume I. John Wiley & Sons, Inc., New York, 1950. [241, 250, 629] [145] S. Fenner and M. Schaefer. A note on a variant of btt-reducibility, immunity and minimal programs. Mathematical Logic Quarterly, 45:3–21, 1999. [454] [146] S. Figueira, J. S. Miller, and A. Nies. Indifferent sets. Journal of Logic and Computation, 19:425–443, 2009. [340, 341] [147] S. Figueira, A. Nies, and F. Stephan. Lowness properties and approximations of the jump. In R. de Queiroz, A. Macintyre, and G. Bittencourt, editors, Proceedings of the 12th Workshop of Logic, Language, Information and Computation (WoLLIC 2005). Held in Florianópolis, July 19–22, 2005, volume 143 of Electronic Notes in Theoretical Computer Science, pages 45–57. Elsevier Science B.V., Amsterdam, 2006. [120, 523, 668, 669, 670, 671] [148] L. Fortnow. Kolmogorov complexity. In R. G. Downey and D. R. Hirschfeldt, editors, Aspects of Complexity (Short Courses in Complexity from the New Zealand Mathematical Research Institute Summer 2000 Meeting, Kaikoura), volume 4 of De Gruyter Series in Logic and its Applications, pages 73–86.
Walter de Gruyter & Co., Berlin, 2001. [xi, 132] [149] L. Fortnow, R. Freivalds, W. I. Gasarch, M. Kummer, S. A. Kurtz, C. H. Smith, and F. Stephan. On the relative sizes of learnable sets. Theoretical Computer Science, 197:139–156, 1998. [303] [150] L. Fortnow, J. M. Hitchcock, A. Pavan, N. V. Vinodchandran, and F. Wang. Extracting Kolmogorov complexity with applications to dimension zero-one laws. In M. Bugliesi, B. Preneel, V. Sassone, and I. Wegener, editors, Automata, Languages and Programming. 33rd International Colloquium, ICALP 2006. Venice, Italy, July 10–14, 2006. Proceedings, Part I, volume 4051 of Lecture Notes in Computer Science, pages 335–345. Springer, Berlin, 2006. [642] [151] J. N. Y. Franklin. Aspects of Schnorr Randomness. Ph.D. dissertation, University of California, Berkeley, 2007. [564, 575] [152] J. N. Y. Franklin. Hyperimmune-free degrees and Schnorr triviality. The Journal of Symbolic Logic, 73:999–1008, 2008. [564, 575] [153] J. N. Y. Franklin. Schnorr trivial reals: a construction. Archive for Mathematical Logic, 46:665–678, 2008. [564]
[154] J. N. Y. Franklin. Lowness and highness properties for randomness notions. In T. Arai, J. Brendle, H. Kikyo, C. T. Chong, R. G. Downey, Q. Feng, and H. Ono, editors, Proceedings of the 10th Asian Logic Conference. Kobe, Japan, 1–6 September 2008, pages 124–151. World Scientific, Singapore, 2010. [591] [155] J. N. Y. Franklin. Schnorr triviality and genericity. The Journal of Symbolic Logic, 75:191–207, 2010. [564] [156] J. N. Y. Franklin, N. Greenberg, F. Stephan, and G. Wu. Anticomplexity, lowness and highness notions and reducibilities with tiny uses. In preparation. [576, 577, 578, 579, 580] [157] J. N. Y. Franklin and K. M. Ng. Difference randomness. To appear in the Proceedings of the American Mathematical Society. [316, 317, 318, 535] [158] J. N. Y. Franklin and F. Stephan. Schnorr trivial sets and truth-table reducibility. The Journal of Symbolic Logic, 75:501–521, 2010. [273, 570, 571, 572, 574, 575] [159] J. N. Y. Franklin, F. Stephan, and L. Yu. Relativizations of randomness and genericity notions. To appear. [534, 591] [160] R. M. Friedberg. A criterion for completeness of degrees of unsolvability. The Journal of Symbolic Logic, 22:159–160, 1957. [64] [161] R. M. Friedberg. Two recursively enumerable sets of incomparable degrees of unsolvability. Proceedings of the National Academy of Sciences of the United States of America, 43:236–238, 1957. [32, 34, 36] [162] R. M. Friedberg and H. Rogers, Jr. Reducibility and completeness for sets of integers. Zeitschrift für Mathematische Logik und Grundlagen der Mathematik, 5:117–125, 1959. [82] [163] P. Gács. Lecture notes on descriptional complexity and randomness. Boston University, 1993–2005, available at http://www.cs.bu.edu/faculty/gacs/recent-publ.html. [138] [164] P. Gács. On the symmetry of algorithmic information. Soviet Mathematics Doklady, 15:1477–1480, 1974. [133, 134, 143, 144] [165] P. Gács. Exact expressions for some randomness tests.
Zeitschrift für Mathematische Logik und Grundlagen der Mathematik, 26:385–394, 1980. [154, 252] [166] P. Gács. On the relation between descriptional complexity and algorithmic probability. Theoretical Computer Science, 22:71–93, 1983. [147, 153, 169, 170] [167] P. Gács. Every set is reducible to a random one. Information and Control, 70:186–192, 1986. [323, 325, 326, 484] [168] H. Gaifman and M. Snir. Probabilities over rich languages, testing and randomness. The Journal of Symbolic Logic, 47:495–548, 1982. [287, 361] [169] N. Greenberg, D. R. Hirschfeldt, and A. Nies. Characterizing the strongly jump traceable sets via randomness. To appear. [689, 694, 695, 696, 697] [170] N. Greenberg and J. S. Miller. Diagonally non-recursive functions and effective Hausdorff dimension. To appear in the Bulletin of the London Mathematical Society. [609, 618, 619, 620, 621, 622, 623, 624, 625, 626, 642]
[171] N. Greenberg and J. S. Miller. Lowness for Kurtz randomness. The Journal of Symbolic Logic, 74:665–678, 2009. [348, 584, 591] [172] N. Greenberg and A. Nies. Benign cost functions and lowness properties. To appear in the Journal of Symbolic Logic. [526, 689, 690, 691, 692, 693] [173] M. J. Groszek and T. A. Slaman. Π^0_1 classes and minimal degrees. Annals of Pure and Applied Logic, 87:117–144, 1997. [79] [174] W. P. Hanf. The Boolean algebra of logic. Bulletin of the American Mathematical Society, 81:587–589, 1975. [84] [175] L. A. Harrington, D. Marker, and S. Shelah. Borel orderings. Transactions of the American Mathematical Society, 310:293–302, 1988. [478] [176] L. A. Harrington and R. I. Soare. Post’s program and incomplete recursively enumerable sets. Proceedings of the National Academy of Sciences of the United States of America, 88:10242–10246, 1991. [83] [177] F. Hausdorff. Dimension und äußeres Maß. Mathematische Annalen, 79:157–179, 1919. [592] [178] D. R. Hirschfeldt, A. Nies, and F. Stephan. Using random sets as oracles. Journal of the London Mathematical Society. Second Series, 75:610–622, 2007. [531, 533, 534] [179] J. M. Hitchcock. Gales suffice for constructive dimension. Information Processing Letters, 86:9–12, 2003. [597] [180] J. M. Hitchcock. Correspondence principles for effective dimensions. Theory of Computing Systems, 38:559–571, 2005. [608, 637] [181] J. M. Hitchcock and J. H. Lutz. Why computational complexity requires stricter martingales. Theory of Computing Systems, 39:277–296, 2006. [242, 243, 244, 245] [182] C.-K. Ho. Relatively recursive reals and real functions. Theoretical Computer Science, 210:99–120, 1999. [199] [183] R. Hölzl, T. Kräling, and W. Merkle. Time bounded Kolmogorov complexity and Solovay functions. In R. Královič and D. Niwiński, editors, Mathematical Foundations of Computer Science 2009. 34th International Symposium, MFCS 2009. Novy Smokovec, High Tatras, Slovakia, August 24–28, 2009.
Proceedings, volume 5734 of Lecture Notes in Computer Science, pages 392–402. Springer, Berlin, 2009. [139, 689, 729] [184] S. Ishmukhametov. Weak recursive degrees and a problem of Spector. In M. M. Arslanov and S. Lempp, editors, Recursion Theory and Complexity. Proceedings of the International Workshop on Recursion Theory and Complexity Theory (WORCT’97) held at Kazan State University, Kazan, July 14–19, 1997, volume 2 of De Gruyter Series in Logic and its Applications, pages 81–88. Walter de Gruyter & Co., Berlin, 1999. [98, 99, 100, 650] [185] T. Jech. Set Theory. Springer Monographs in Mathematics. Springer-Verlag, Berlin, third millennium edition, 2003. [297] [186] C. G. Jockusch, Jr. The degrees of bi-immune sets. Zeitschrift für Mathematische Logik und Grundlagen der Mathematik, 15:135–140, 1969. [295]
[187] C. G. Jockusch, Jr. Relationships between reducibilities. Transactions of the American Mathematical Society, 142:229–237, 1969. [70] [188] C. G. Jockusch, Jr. Degrees of generic sets. In F. R. Drake and S. S. Wainer, editors, Recursion Theory: its Generalisations and Applications, volume 45 of London Mathematical Society Lecture Note Series, pages 110–139. Cambridge University Press, Cambridge, 1980. [101, 102, 104, 401] [189] C. G. Jockusch, Jr. Fine degrees of word problems in cancellation semigroups. Zeitschrift für Mathematische Logik und Grundlagen der Mathematik, 26:93–95, 1980. [206] [190] C. G. Jockusch, Jr. Three easy constructions of recursively enumerable sets. In M. Lerman, J. H. Schmerl, and R. I. Soare, editors, Logic Year 1979–1980, Proceedings of Seminars and of the Conference on Mathematical Logic Held at the University of Connecticut, Storrs, Conn., November 11–13, 1979, volume 859 of Lecture Notes in Mathematics, pages 83–91. Springer, Berlin, 1981. [414] [191] C. G. Jockusch, Jr., M. Lerman, R. I. Soare, and R. M. Solovay. Recursively enumerable sets modulo iterated jumps and extensions of Arslanov’s completeness criterion. The Journal of Symbolic Logic, 54:1288–1323, 1989. [87, 370, 534] [192] C. G. Jockusch, Jr. and D. Posner. Double jumps of minimal degrees. The Journal of Symbolic Logic, 43:715–724, 1978. [72] [193] C. G. Jockusch, Jr. and R. A. Shore. Pseudojump operators I: The r.e. case. Transactions of the American Mathematical Society, 275:599–609, 1983. [41, 42, 376] [194] C. G. Jockusch, Jr. and R. A. Shore. Pseudojump operators II: Transfinite iterations, hierarchies, and minimal covers. The Journal of Symbolic Logic, 49:1205–1236, 1984. [42, 64, 376] [195] C. G. Jockusch, Jr. and R. I. Soare. Degrees of members of Π^0_1 classes. Pacific Journal of Mathematics, 40:605–616, 1972. [77, 81, 84, 87, 89, 374] [196] C. G. Jockusch, Jr. and R. I. Soare. Π^0_1 classes and degrees of theories.
Transactions of the American Mathematical Society, 173:33–56, 1972. [77, 78, 79, 81, 82, 84, 86, 87] [197] M. I. Kanovich. On the decision complexity of algorithms. Soviet Mathematics Doklady, 10:700–701, 1969. [366, 368] [198] M. I. Kanovich. On the decision and enumeration complexity of predicates. Soviet Mathematics Doklady, 11:17–20, 1970. [366, 368] [199] B. Kastermans and S. Lempp. Comparing notions of randomness. Theoretical Computer Science, 411:602–616, 2010. [311, 319, 320, 321] [200] S. M. Kautz. Degrees of Random Sets. Ph.D. dissertation, Cornell University, 1991. [xiii, 79, 255, 256, 257, 259, 265, 266, 267, 268, 287, 297, 298, 330, 333, 359, 361, 362, 363, 364, 382, 385, 386] [201] S. M. Kautz. An improved zero-one law for algorithmically random sequences. Theoretical Computer Science, 191:185–192, 1998. [259]
[202] B. Khoussainov and A. Nerode. Automata Theory and Its Applications. Progress in Computer Science and Applied Logic, 21. Birkhäuser Boston, Inc., Boston, 2001. [746] [203] B. Kjos-Hanssen. Low for random reals and positive-measure domination. Proceedings of the American Mathematical Society, 135:3703–3709, 2007. [489, 490] [204] B. Kjos-Hanssen. Infinite subsets of random sets of integers. Mathematical Research Letters, 16:103–110, 2009. [348] [205] B. Kjos-Hanssen, W. Merkle, and F. Stephan. Kolmogorov complexity and the recursion theorem. In B. Durand and W. Thomas, editors, STACS 2006. Proceedings of the 23rd Annual Symposium on Theoretical Aspects of Computer Science, Marseille, France, February 23–25, 2006, volume 3884 of Lecture Notes in Computer Science, pages 149–161. Springer-Verlag, Berlin, 2006. [366, 367, 368, 369, 576, 581] [206] B. Kjos-Hanssen, J. S. Miller, and R. Solomon. Lowness notions, measure and domination. To appear. [494, 497, 499, 538] [207] B. Kjos-Hanssen, A. Nies, and F. Stephan. Lowness for the class of Schnorr random sets. SIAM Journal on Computing, 35:647–657, 2005. [537, 554, 559, 560, 561, 591] [208] S. C. Kleene. General recursive functions of natural numbers. Mathematische Annalen, 112:727–742, 1936. [23] [209] S. C. Kleene. On notation for ordinal numbers. The Journal of Symbolic Logic, 3:150–155, 1938. [13, 14] [210] S. C. Kleene and E. L. Post. The upper semi-lattice of degrees of recursive unsolvability. Annals of Mathematics. Second Series, 59:379–407, 1954. [17, 32] [211] A. N. Kolmogorov. Three approaches to the quantitative definition of information. Problems of Information Transmission, 1:1–7, 1965. [xiii, xx, 111, 112, 115, 637] [212] A. N. Kolmogorov. Logical basis for information theory and probability theory. IEEE Transactions on Information Theory, 14:662–664, 1968. [113] [213] L. G. Kraft. A Device for Quantizing, Grouping, and Coding Amplitude Modulated Pulses. M.Sc. thesis, MIT, 1949. [125] [214] S. A.
Kripke. “Flexible” predicates in formal number theory. Proceedings of the American Mathematical Society, 13:647–650, 1962. [93] [215] A. Kučera. Measure, Π^0_1 classes, and complete extensions of PA. In H.-D. Ebbinghaus, G. H. Müller, and G. E. Sacks, editors, Recursion Theory Week. Proceedings of the Conference Held at the Mathematisches Forschungsinstitut in Oberwolfach, April 15–21, 1984, volume 1141 of Lecture Notes in Mathematics, pages 245–259. Springer, Berlin, 1985. [259, 323, 324, 325, 326, 330, 336, 337, 338, 347, 609] [216] A. Kučera. An alternative priority-free solution to Post’s problem. In J. Gruska, B. Rovan, and J. Wiederman, editors, Mathematical Foundations of Computer Science 1986. Proceedings of the 12th Symposium. Bratislava, Czechoslovakia. August 25–29, 1986, volume 233 of Lecture Notes in Computer Science, pages 493–500. Springer, Berlin, 1986. [90, 91, 92, 337] [217] A. Kučera. On the use of diagonally nonrecursive functions. In H.-D. Ebbinghaus, J. Fernandez-Prida, M. Garrido, D. Lascar, and M. Rodríguez Artalejo, editors, Logic Colloquium ’87. Proceedings of the Colloquium Held at the University of Granada, Granada, July 20–25, 1987, volume 129 of Studies in Logic and the Foundations of Mathematics, pages 219–239. North-Holland, Amsterdam, 1989. [92, 93, 373] [218] A. Kučera. Randomness and generalizations of fixed point free functions. In K. Ambos-Spies, G. H. Müller, and G. E. Sacks, editors, Recursion Theory Week. Proceedings of the Conference Held at the Mathematisches Forschungsinstitut, Oberwolfach, March 19–25, 1989, volume 1432 of Lecture Notes in Mathematics, pages 245–254. Springer, Berlin, 1990. [371] [219] A. Kučera. On relative randomness. Annals of Pure and Applied Logic, 63:61–67, 1993. [531] [220] A. Kučera and A. Nies. Demuth randomness and computational complexity. To appear in the Annals of Pure and Applied Logic. [315] [221] A. Kučera and T. A. Slaman. Randomness and recursive enumerability. SIAM Journal on Computing, 31:199–211, 2001. [231, 409, 410, 418] [222] A. Kučera and T. A. Slaman. Low upper bounds for ideals. The Journal of Symbolic Logic, 74:517–534, 2009. [93, 550] [223] A. Kučera and S. A. Terwijn. Lowness for the class of random sets. The Journal of Symbolic Logic, 64:1396–1402, 1999. [489, 508, 509, 510, 524, 526] [224] M. Kumabe. On the Turing Degrees of Generic Sets. Ph.D. dissertation, University of Chicago, 1990. [102] [225] M. Kumabe and A. E. M. Lewis. A fixed-point-free minimal degree. Journal of the London Mathematical Society, 80:785–797, 2009. [349, 618] [226] M. Kummer. Kolmogorov complexity and instance complexity of recursively enumerable sets. SIAM Journal on Computing, 25:1123–1143, 1996. [649, 728, 729, 730] [227] M. Kummer. On the complexity of random strings. In C. Puech and R. Reischuk, editors, STACS ’96. Proceedings of the 13th Annual Symposium on Theoretical Aspects of Computer Science held in Grenoble, February 22–24, 1996, volume 1046 of Lecture Notes in Computer Science, pages 25–36. Springer-Verlag, Berlin, 1996. [738, 740, 743] [228] S. A. Kurtz. Randomness and Genericity in the Degrees of Unsolvability. Ph.D. dissertation, University of Illinois at Urbana–Champaign, 1981. [xii, 105, 107, 254, 255, 256, 259, 269, 285, 286, 287, 288, 295, 315, 323, 356, 360, 361, 362, 380, 381, 382, 383, 385, 386, 392, 394, 401, 496] [229] S. A. Kurtz. Notions of weak genericity. The Journal of Symbolic Logic, 48:764–770, 1983. [105, 107] [230] A. V. Kuznecov. On primitive recursive functions of large oscillation. Doklady Akademii Nauk SSSR (N.S.), 71:233–236, 1950. In Russian. [69] [231] A. H. Lachlan. Lower bounds for pairs of recursively enumerable degrees. Proceedings of the London Mathematical Society, 16:537–569, 1966. [47]
[232] A. H. Lachlan. Distributive initial segments of the degrees of unsolvability. Zeitschrift für Mathematische Logik und Grundlagen der Mathematik, 14:457–472, 1968. [364] [233] A. H. Lachlan. Wtt-complete sets are not necessarily tt-complete. Proceedings of the American Mathematical Society, 48:429–434, 1975. [333, 334] [234] R. E. Ladner. A completely mitotic nonrecursive r. e. degree. Transactions of the American Mathematical Society, 184:479–507, 1973. [207] [235] R. E. Ladner and L. P. Sasso, Jr. The weak truth table degrees of recursively enumerable sets. Annals of Mathematical Logic, 8:429–448, 1975. [207] [236] J. I. Lathrop and J. H. Lutz. Recursive computational depth. Information and Computation, 153:139–172, 1999. [304, 305] [237] H. Lebesgue. Leçons sur L’intégration et la Recherche des Fonctions Primitives. Gauthier-Villars, Paris, 1904. [592] [238] S. Lempp and M. Lerman. Priority arguments using iterated trees of strategies. In K. Ambos-Spies, G. H. Müller, and G. E. Sacks, editors, Recursion Theory Week. Proceedings of the Conference Held at the Mathematisches Forschungsinstitut, Oberwolfach, March 19–25, 1989, volume 1432 of Lecture Notes in Mathematics, pages 277–296. Springer, Berlin, 1990. [67] [239] S. Lempp and M. Lerman. The decidability of the existential theory of the poset of recursively enumerable degrees with jump relations. Advances in Mathematics, 120:1–142, 1996. [67] [240] M. Lerman. Degrees of Unsolvability. Local and Global Theory. Perspectives in Mathematical Logic. Springer-Verlag, Berlin, 1983. [72, 475] [241] L. A. Levin. Some Theorems on the Algorithmic Approach to Probability Theory and Information Theory. Dissertation in Mathematics, Moscow University, 1971. In Russian. [xiii, 121, 125, 133, 145, 146, 154] [242] L. A. Levin. On the notion of a random sequence. Soviet Mathematics Doklady, 14:1413–1416, 1973. [xiii, 145, 146, 169, 227, 232, 239, 240] [243] L. A. Levin.
Laws of information conservation (non-growth) and aspects of the foundation of probability theory. Problems of Information Transmission, 10:206–210, 1974. [xiii, 121, 125, 129, 133, 227] [244] P. Lévy. Théorie de l’Addition des Variables Aléatoires. Gauthier-Villars, Paris, 1937. [235] [245] A. E. M. Lewis. A single minimal complement for the c. e. degrees. Transactions of the American Mathematical Society, 359:5817–5865, 2007. [93] [246] A. E. M. Lewis and G. Barmpalias. Random reals and Lipschitz continuity. Mathematical Structures in Computer Science, 16:737–749, 2006. [442] [247] A. E. M. Lewis and G. Barmpalias. Randomness and the linear degrees of computability. Annals of Pure and Applied Logic, 145:252–257, 2007. [420] [248] M. Li and P. Vitányi. An Introduction to Kolmogorov Complexity and Its Applications. Texts and Monographs in Computer Science. Springer-Verlag,
Berlin, 1993. [xi, xx, 110, 113, 114, 116, 122, 132, 136, 137, 150, 227, 238, 260, 265, 304, 738]
[249] E. H. Lieb, D. Osherson, and S. Weinstein. Elementary proof of a theorem of Jean Ville. Unpublished manuscript: arXiv:cs/0607054v1, July 2006. [246, 249] [250] D. W. Loveland. A variant of the Kolmogorov concept of complexity. Information and Control, 15:510–526, 1969. [117, 118, 122] [251] J. H. Lutz. Category and measure in complexity classes. SIAM Journal on Computing, 19:1100–1131, 1990. [270] [252] J. H. Lutz. Gales and the constructive dimension of individual sequences. In U. Montanari, J. D. P. Rolim, and E. Welzl, editors, Automata, Languages and Programming. 27th International Colloquium, ICALP 2000. Geneva, Switzerland, July 9–15, 2000. Proceedings, volume 1853 of Lecture Notes in Computer Science, pages 902–913. Springer, Berlin, 2000. [xiii, 594, 596, 597, 599, 603, 654] [253] J. H. Lutz. Dimension in complexity classes. SIAM Journal on Computing, 32:1236–1259, 2003. [594, 596] [254] J. H. Lutz. The dimensions of individual strings and sequences. Information and Computation, 187:49–79, 2003. [xiii, 594, 596, 597, 598, 599, 603, 654, 658, 662, 663, 664, 665, 666] [255] J. H. Lutz. Effective fractal dimensions. Mathematical Logic Quarterly, 51:62–72, 2005. [596] [256] W. Maass. Characterization of the recursively enumerable sets with supersets effectively isomorphic to all recursively enumerable sets. Transactions of the American Mathematical Society, 279:311–336, 1983. [90, 215] [257] A. Macintyre. Omitting quantifier free types in generic structures. The Journal of Symbolic Logic, 37:512–520, 1972. [206] [258] D. A. Martin. Classes of recursively enumerable sets and degrees of unsolvability. Zeitschrift für Mathematische Logik und Grundlagen der Mathematik, 12:295–310, 1966. [80, 97, 575] [259] P. Martin-Löf. The definition of random sequences. Information and Control, 9:602–619, 1966. [xiii, xix, 231, 233, 234] [260] P. Martin-Löf. Complexity oscillations in infinite binary sequences.
Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete, 19:225–230, 1971. [136, 260] [261] Y. V. Matijasevic. The Diophantineness of enumerable sets. Soviet Mathematics Doklady, 11:354–358, 1970. [101] [262] E. Mayordomo. A Kolmogorov complexity characterization of constructive Hausdorff dimension. Information Processing Letters, 84:1–3, 2002. [598] [263] Y. T. Medvedev. Degrees of difficulty of the mass problem. Doklady Akademii Nauk SSSR (N.S.), 104:501–504, 1955. In Russian. [344] [264] Y. T. Medvedev. On nonisomorphic recursively enumerable sets. Doklady Akademii Nauk SSSR (N.S.), 102:211–214, 1955. In Russian. [69]
[265] W. Merkle. The Kolmogorov-Loveland stochastic sequences are not closed under selecting subsequences. The Journal of Symbolic Logic, 68:1362–1376, 2003. [310] [266] W. Merkle. The complexity of stochastic sequences. Journal of Computer and System Sciences, 74:350–357, 2008. [304, 305, 306] [267] W. Merkle and N. Mihailović. On the construction of effective random sets. In K. Diks and W. Rytter, editors, Mathematical Foundations of Computer Science 2002. 27th International Symposium, MFCS 2002, Warsaw, Poland, August 26–30, 2002. Proceedings, volume 2420 of Lecture Notes in Computer Science, pages 568–580. Springer, Berlin, 2002. [325, 326, 484] [268] W. Merkle, N. Mihailović, and T. A. Slaman. Some results on effective randomness. Theory of Computing Systems, 39:707–721, 2006. [243, 244, 279, 280] [269] W. Merkle, J. S. Miller, A. Nies, J. Reimann, and F. Stephan. Kolmogorov-Loveland randomness and stochasticity. Annals of Pure and Applied Logic, 138:183–210, 2006. [276, 309, 311, 312, 313, 314, 328, 330, 357] [270] W. Merkle and F. Stephan. On C-degrees, H-degrees and T-degrees. In Twenty-Second Annual IEEE Conference on Computational Complexity (CCC 2007), pages 60–69. IEEE Computer Society Press, San Diego, CA, 2007. [120, 437, 456, 457, 458, 500] [271] J. S. Miller. Contrasting plain and prefix-free Kolmogorov complexity. To appear. [743] [272] J. S. Miller. Deriving Muchnik’s Theorem from Solovay’s manuscript. Unpublished notes. [155, 168] [273] J. S. Miller. Extracting information is hard: a Turing degree of non-integral effective Hausdorff dimension. To appear in Advances in Mathematics. [610, 612, 615] [274] J. S. Miller. Two notes on subshifts. Unpublished notes. [601] [275] J. S. Miller. Π^0_1 Classes in Computable Analysis and Topology. Ph.D. dissertation, Cornell University, 2002. [368] [276] J. S. Miller. Kolmogorov random reals are 2-random. The Journal of Symbolic Logic, 69:907–913, 2004. [261, 262] [277] J. S. Miller.
The K-degrees, low for K-degrees, and weakly low for K sets. Notre Dame Journal of Formal Logic, 50:381–391, 2010. [263, 713, 714, 715, 716] [278] J. S. Miller and A. Nies. Randomness and computability: open questions. The Bulletin of Symbolic Logic, 12:390–410, 2006. [xiii, 319] [279] J. S. Miller and L. Yu. Oscillation in the initial segment complexity of random reals. To appear in Advances in Mathematics. [131, 329, 481, 484, 485, 486, 487] [280] J. S. Miller and L. Yu. On initial segment complexity and degrees of randomness. Transactions of the American Mathematical Society, 360:3193–3210, 2008. [229, 250, 251, 252, 332, 465, 466, 473, 474, 475, 476, 477, 478, 479, 480, 481, 487]
[281] R. Miller. The Δ^0_2-spectrum of a linear order. The Journal of Symbolic Logic, 66:470–486, 2001. [43] [282] W. Miller and D. A. Martin. The degrees of hyperimmune sets. Zeitschrift für Mathematische Logik und Grundlagen der Mathematik, 14:159–166, 1968. [67, 68, 69, 100] [283] A. Mostowski. A generalization of the incompleteness theorem. Fundamenta Mathematicae, 49:205–232, 1960/1961. [93] [284] A. A. Muchnik. On the unsolvability of the problem of reducibility in the theory of algorithms. Doklady Akademii Nauk SSSR (N.S.), 108:194–197, 1956. In Russian. [32, 34, 36] [285] A. A. Muchnik. Isomorphism of systems of recursively enumerable sets with effective properties. Trudy Moskovskogo Matematicheskogo Obshchestva, 7:407–412, 1958. In Russian. [86] [286] A. A. Muchnik. On strong and weak reducibilities of algorithmic problems. Sibirskii Matematicheskii Zhurnal, 4:1328–1341, 1963. In Russian. [344] [287] An. A. Muchnik. Lower limits of frequencies in computable sequences and relativized a priori probability. Theory of Probability and its Applications, 32:513–514, 1987. [472] [288] An. A. Muchnik and S. P. Positselsky. Kolmogorov entropy in the context of computability theory. Theoretical Computer Science, 271:15–35, 2002. [155, 168, 739, 744, 745, 752] [289] An. A. Muchnik, A. L. Semenov, and V. A. Uspensky. Mathematical metaphysics of randomness. Theoretical Computer Science, 207:263–317, 1998. [304, 305, 309, 310, 311, 312, 313] [290] J. Mycielski. Algebraic independence and measure. Fundamenta Mathematicae, 61:165–169, 1967. [478] [291] J. Myhill. Creative sets. Zeitschrift für Mathematische Logik und Grundlagen der Mathematik, 1:97–108, 1955. [21, 22] [292] A. Nabutovsky and S. Weinberger. The fractal nature of Riem/Diff. I. Geometriae Dedicata, 101:1–54, 2003. [420] [293] A. Nerode. General topology and partial recursive functionals. In Summaries of Talks Presented at the Summer Institute for Symbolic Logic, pages 247–251. Cornell University, 1957.
[21] [294] K. M. Ng. Beyond strong jump traceability. To appear in the Proceedings of the London Mathematical Society. [672] [295] K. M. Ng. Some Properties of D.C.E. Reals and their Degrees. M.Sc. thesis, National University of Singapore, 2006. [219] [296] K. M. Ng. On strongly jump traceable reals. Annals of Pure and Applied Logic, 154:51–69, 2008. [672, 689] [297] K. M. Ng. Computability, Traceability and Beyond. Ph.D. dissertation, Victoria University of Wellington, 2009. [672, 693] [298] A. Nies. Calculus of cost functions. To appear. [528] [299] A. Nies. Low for random reals: the story. Unpublished manuscript, http://www.cs.auckland.ac.nz/~nies/papers/LRpreprint.pdf. [510]
References
787
[300] A. Nies. Lowness for computable and partial computable randomness. To appear, preprint available at http://www.cs.auckland.ac.nz/CDMTCS/researchreports/363andre.pdf. [586, 590] [301] A. Nies. Intervals of the lattice of computably enumerable sets and effective Boolean algebras. The Bulletin of the London Mathematical Society, 29:683–692, 1997. [419] [302] A. Nies. Lowness properties and randomness. Advances in Mathematics, 197:274–305, 2005. [473, 508, 518, 523, 524, 525, 527, 528, 529, 530, 586, 591] [303] A. Nies. Reals which compute little. In Z. Chatzidakis, P. Koepke, and W. Pohlers, editors, Logic Colloquium ’02. Joint Proceedings of the Annual European Summer Meeting of the Association for Symbolic Logic and the Biannual Meeting of the German Association for Mathematical Logic and the Foundations of Exact Sciences (the Colloquium Logicum) Held in Münster, August 3–11, 2002., volume 27 of Lecture Notes in Logic, pages 261–275. Association for Symbolic Logic and A K Peters, Ltd., La Jolla, CA and Wellesley, MA, 2006. [504, 523, 530, 700, 701] [304] A. Nies. Non-cupping and randomness. Proceedings of the American Mathematical Society, 135:837–844, 2007. [534, 535] [305] A. Nies. Eliminating concepts. In C. T. Chong, Q. Feng, T. A. Slaman, W. H. Woodin, and Y. Yang, editors, Computational Prospects of Infinity: Part II. Presented Talks, National University of Singapore, Institute for Mathematical Sciences, volume 15 of Lecture Notes Series. Institute for Mathematical Sciences. National University of Singapore, pages 225–248. World Scientific Publishing Company, Singapore, 2008. [536] [306] A. Nies. Computability and Randomness, volume 51 of Oxford Logic Guides. Oxford University Press, Oxford, 2009. [251, 316, 364, 376, 377, 504, 526, 533, 535, 538, 586, 591, 669] [307] A. Nies and J. Reimann. A lower cone in the wtt degrees of non-integral effective dimension. In C. T. Chong, Q. Feng, T. A. Slaman, W. H. Woodin, and Y.
Yang, editors, Computational Prospects of Infinity: Part II. Presented Talks, National University of Singapore, Institute for Mathematical Sciences, volume 15 of Lecture Notes Series. Institute for Mathematical Sciences. National University of Singapore, pages 249–260. World Scientific Publishing Company, Singapore, 2008. [611] [308] A. Nies, F. Stephan, and S. A. Terwijn. Randomness, relativization, and Turing degrees. The Journal of Symbolic Logic, 70:515–535, 2005. [136, 229, 260, 261, 262, 349, 350, 351, 356, 382, 712, 713, 715] [309] D. Nitzpon. Lowness Properties and Randomness. Doctoraal examens, University of Amsterdam, 2002. [586] [310] P. Odifreddi. Classical Recursion Theory. The Theory of Functions and Sets of Natural Numbers, volume 125 of Studies in Logic and the Foundations of Mathematics. North-Holland Publishing Company, Amsterdam, 1990. [xiii, xix, 8, 28, 85]
[311] P. Odifreddi. Classical Recursion Theory. Vol. II, volume 143 of Studies in Logic and the Foundations of Mathematics. North-Holland Publishing Company, Amsterdam, 1999. [xiii, xix, 8, 28, 72, 208, 209] [312] J. C. Owings, Jr. Diagonalization and the recursion theorem. Notre Dame Journal of Formal Logic, 14:95–99, 1973. [14] [313] J. C. Oxtoby. Measure and Category. A Survey of the Analogies Between Topological and Measure Spaces, volume 2 of Graduate Texts in Mathematics. Springer-Verlag, New York, second edition, 1980. [4] [314] J. B. Paris. Measure and minimal degrees. Annals of Mathematical Logic, 11:203–216, 1977. [381, 401] [315] D. B. Posner and R. W. Robinson. Degrees joining to 0′. The Journal of Symbolic Logic, 46:714–722, 1981. [64, 65] [316] E. L. Post. Recursively enumerable sets of positive integers and their decision problems. Bulletin of the American Mathematical Society, 50:284–316, 1944. [22, 32, 82] [317] E. L. Post. Degrees of recursive unsolvability. Bulletin of the American Mathematical Society, 54:641–642, 1948. [25] [318] M. Pour-El and J. I. Richards. Computability in Analysis and Physics. Perspectives in Mathematical Logic. Springer-Verlag, Berlin, 1989. [199, 224] [319] A. Raichev. Relative randomness and real closed fields. The Journal of Symbolic Logic, 70:319–330, 2005. [219, 224, 424] [320] A. Raichev. Relative Randomness via rK-reducibility. Ph.D. dissertation, University of Wisconsin-Madison, 2005. [219, 224, 424] [321] A. Raichev and F. Stephan. A minimal rK-degree. In C. T. Chong, Q. Feng, T. A. Slaman, W. H. Woodin, and Y. Yang, editors, Computational Prospects of Infinity: Part II. Presented Talks, National University of Singapore, Institute for Mathematical Sciences, volume 15 of Lecture Notes Series. Institute for Mathematical Sciences. National University of Singapore, pages 261–270. World Scientific Publishing Company, Singapore, 2008. [437, 439] [322] J. Raisonnier. A mathematical proof of S.
Shelah’s theorem on the measure problem and related results. Israel Journal of Mathematics, 48:48–56, 1984. [555] [323] S. Reid. A Class of Algorithmically Random Reals. M.Sc. thesis, Victoria University, Wellington, 2004. [606] [324] J. Reimann. Computability and Fractal Dimension. Doctoral dissertation, University of Heidelberg, 2004. [330, 604, 605, 609, 611, 636, 637, 638, 641] [325] J. Reimann. Effectively closed classes of measures and randomness. Annals of Pure and Applied Logic, 156:170–182, 2008. [335] [326] J. Reimann. Randomness—beyond Lebesgue measure. In Logic Colloquium 2006, volume 32 of Lecture Notes in Logic, pages 247–279. Association for Symbolic Logic, Chicago, IL, 2009. [264, 265]
[327] J. Reimann and T. A. Slaman. Measures and their random reals. To appear in the Transactions of the American Mathematical Society. [7, 21, 265, 331, 334, 335, 336, 713] [328] J. Reimann and T. A. Slaman. Probability measures and effective randomness. To appear. [21, 336] [329] J. Reimann and T. A. Slaman. Randomness for continuous measures. To appear. [21, 336] [330] J. Reimann and F. Stephan. Hierarchies of randomness tests. In S. S. Goncharov, R. G. Downey, and H. Ono, editors, Mathematical Logic in Asia. Proceedings of the 9th Asian Logic Conference Held in Novosibirsk, August 16–19, 2005, pages 215–232. World Scientific Publishing Company, Singapore, 2006. [465, 605] [331] R. Rettinger, X. Zheng, R. Gengler, and B. von Braunmühl. Weakly computable real numbers and total computable real functions. In J. Wang, editor, Computing and Combinatorics. 7th Annual International Conference (COCOON 2001). Guilin, China, August 20–23, 2001. Proceedings, volume 2108 of Lecture Notes in Computer Science, pages 586–595. Springer, Berlin, 2001. [217] [332] H. G. Rice. Classes of recursively enumerable sets and their decision problems. Transactions of the American Mathematical Society, 74:358–366, 1953. [13] [333] R. W. Robinson. Jump restricted interpolation in the recursively enumerable degrees. Annals of Mathematics. Second Series, 93:586–596, 1971. [67] [334] H. Rogers, Jr. Theory of Recursive Functions and Effective Computability. McGraw–Hill Book Company, New York, 1967. [xiii, 8, 28] [335] H. L. Royden. Real Analysis. The Macmillan Company, New York, 1963. [264, 365, 400, 639] [336] W. Rudin. Principles of Mathematical Analysis. International Series in Pure and Applied Mathematics. McGraw-Hill Book Co., New York, third edition, 1976. [219] [337] A. Y. Rumyantsev and M. A. Ushakov. Forbidden substrings, Kolmogorov complexity and almost periodic sequences. In B. Durand and W. Thomas, editors, STACS 2006.
Proceedings of the 23rd Annual Symposium on Theoretical Aspects of Computer Science, Marseille, France, February 23–25, 2006, volume 3884 of Lecture Notes in Computer Science, pages 396–407. Springer-Verlag, Berlin, 2006. [601] [338] B. Y. Ryabko. Coding of combinatorial sources and Hausdorff dimension. Soviet Mathematics Doklady, 30:219–222, 1984. [598] [339] B. Y. Ryabko. Noiseless coding of combinatorial sources. Problems of Information Transmission, 22:170–179, 1986. [598] [340] B. Y. Ryabko. The complexity and effectiveness of prediction algorithms. Journal of Complexity, 10:281–295, 1994. [598] [341] G. E. Sacks. Degrees of Unsolvability. Princeton University Press, Princeton, NJ, 1963. [65, 66, 323, 358, 363]
[342] G. E. Sacks. On the degrees less than 0′. Annals of Mathematics. Second Series, 77:211–231, 1963. [39, 40, 58, 72, 373, 534] [343] G. E. Sacks. A maximal set which is not complete. Michigan Mathematical Journal, 11:193–205, 1964. [83] [344] G. E. Sacks. The recursively enumerable degrees are dense. Annals of Mathematics. Second Series, 80:300–312, 1964. [58] [345] G. E. Sacks. Forcing with perfect closed sets. In D. S. Scott, editor, Axiomatic Set Theory. Proceedings of the Symposium in Pure Mathematics of the American Mathematical Society held at the University of California, Los Angeles, Calif., July 10–August 5, 1967., volume XIII, part I of Proceedings of Symposia on Pure Mathematics, pages 331–355. American Mathematical Society, Providence, RI, 1971. [645] [346] G. E. Sacks. Higher Recursion Theory. Perspectives in Mathematical Logic. Springer-Verlag, Berlin, 1990. [711] [347] A. Salomaa. Computation and Automata, volume 25 of Encyclopedia of Mathematics and Its Applications. Cambridge University Press, Cambridge, 1985. [8] [348] C.-P. Schnorr. A unified approach to the definition of a random sequence. Mathematical Systems Theory, 5:246–258, 1971. [xiii, 236, 237, 257, 269, 270, 271, 273, 276, 280, 350, 594] [349] C.-P. Schnorr. Zufälligkeit und Wahrscheinlichkeit. Eine algorithmische Begründung der Wahrscheinlichkeitstheorie, volume 218 of Lecture Notes in Mathematics. Springer-Verlag, Berlin–New York, 1971. [xiii, 146, 236, 237, 257, 269, 270, 271, 272, 273, 276, 280, 303, 594, 599, 654] [350] C.-P. Schnorr. Process complexity and effective random tests. Journal of Computer and System Sciences, 7:376–388, 1973. [xiii, 125, 145, 146, 227, 232, 239] [351] D. Scott and S. Tennenbaum. On the degrees of complete extensions of arithmetic. Notices of the American Mathematical Society, 7:242–243, 1960. [87] [352] D. S. Scott. Algebras of sets binumerable in complete extensions of arithmetic.
In Proceedings of the Symposium on Pure and Applied Mathematics, volume V, pages 117–121. American Mathematical Society, Providence, RI, 1962. [84] [353] A. Shen. On relations between different algorithmic definitions of randomness. Soviet Mathematics Doklady, 38:316–319, 1989. [303] [354] J. R. Shoenfield. On degrees of unsolvability. Annals of Mathematics. Second Series, 69:644–653, 1959. [24, 25, 65] [355] J. R. Shoenfield. Applications of model theory to degrees of unsolvability. In J. W. Addison, L. Henkin, and A. Tarski, editors, Theory of Models (Proceedings of the 1963 International Symposium, Berkeley), pages 359– 363. North-Holland, Amsterdam, 1965. [54, 56] [356] J. R. Shoenfield. Degrees of Unsolvability, volume 2 of North-Holland Mathematics Studies. North-Holland Publishing Company, Amsterdam-London, 1971. [71]
[357] J. H. Silver. Counting the number of equivalence classes of Borel and coanalytic equivalence relations. Annals of Mathematical Logic, 18:1–28, 1980. [478] [358] S. G. Simpson. First-order theory of the degrees of recursive unsolvability. Annals of Mathematics. Second Series, 105:121–139, 1977. [364] [359] S. G. Simpson. Mass problems and randomness. The Bulletin of Symbolic Logic, 11:1–27, 2005. [345, 346, 347] [360] S. G. Simpson. An extension of the recursively enumerable Turing degrees. Journal of the London Mathematical Society. Second Series, 75:287–297, 2007. [346, 347] [361] S. G. Simpson. Mass problems and almost everywhere domination. Mathematical Logic Quarterly, 53:483–492, 2007. [347] [362] R. M. Smullyan. Theory of Formal Systems. Princeton University Press, Princeton, NJ, revised edition, 1961. [86] [363] R. I. Soare. Cohesive sets and recursively enumerable Dedekind cuts. Pacific Journal of Mathematics, 31:215–231, 1969. [199, 201] [364] R. I. Soare. Recursion theory and Dedekind cuts. Transactions of the American Mathematical Society, 140:271–294, 1969. [198, 204] [365] R. I. Soare. Automorphisms of the lattice of recursively enumerable sets. I. Maximal sets. Annals of Mathematics. Second Series, 100:80–120, 1974. [83] [366] R. I. Soare. Recursively Enumerable Sets and Degrees: A Study of Computable Functions and Computably Generated Sets. Perspectives in Mathematical Logic. Springer, Berlin, 1987. [xiii, xix, 8, 10, 14, 27, 29, 36, 43, 44, 57, 58, 65, 503, 517] [367] R. I. Soare. Computability and recursion. The Bulletin of Symbolic Logic, 2:284–321, 1996. [8] [368] R. I. Soare. Computability theory and differential geometry. The Bulletin of Symbolic Logic, 10:457–486, 2004. [420, 421] [369] R. J. Solomonoff. A formal theory of inductive inference, I and II. Information and Control, 7:1–22 and 224–254, 1964. [xiii, xx, 111, 112, 133, 145, 146] [370] R. M. Solovay. A model of set theory in which every set of reals is Lebesgue measurable. 
Annals of Mathematics, 92:1–56, 1970. [296, 555] [371] R. M. Solovay. Draft of paper (or series of papers) on Chaitin’s work. Unpublished notes, May 1975. 215 pages. [xi, xii, 137, 138, 139, 140, 141, 142, 143, 144, 155, 156, 160, 161, 162, 163, 164, 166, 167, 234, 251, 262, 288, 404, 405, 408, 409, 413, 466, 467, 468, 469, 470, 501, 732, 743] [372] R. M. Solovay. On random r.e. sets. In A. I. Arruda, N. C. A. da Costa, and R. Chuaqui, editors, Non-Classical Logics, Model Theory and Computability. Proceedings of the Third Latin-American Symposium on Mathematical Logic, Campinas, July 11–17, 1976, volume 89 of Studies in Logic and the Foundations of Mathematics, pages 283–307. North Holland, Amsterdam, 1977. [732]
[373] A. Sorbi. The Medvedev lattice of degrees of difficulty. In S. B. Cooper, T. A. Slaman, and S. S. Wainer, editors, Computability, Enumerability, Unsolvability. Directions in Recursion Theory, volume 224 of London Mathematical Society Lecture Note Series, pages 289–312. Cambridge University Press, Cambridge, 1996. [347] [374] C. Spector. On degrees of recursive unsolvability. Annals of Mathematics, 64:581–592, 1956. [70] [375] C. Spector. Measure-theoretic construction of incomparable hyperdegrees. The Journal of Symbolic Logic, 23:280–288, 1958. [358] [376] L. Staiger. Kolmogorov complexity and Hausdorff dimension. Information and Computation, 103:159–194, 1993. [136, 598] [377] L. Staiger. A tight upper bound on Kolmogorov complexity and uniformly optimal prediction. Theory of Computing Systems, 31:215–229, 1998. [598, 636, 637] [378] L. Staiger. Constructive dimension equals Kolmogorov complexity. Information Processing Letters, 93:149–153, 2005. [598, 636] [379] F. Stephan. Martin-Löf random sets and PA-complete sets. In Z. Chatzidakis, P. Koepke, and W. Pohlers, editors, Logic Colloquium ’02. Joint Proceedings of the Annual European Summer Meeting of the Association for Symbolic Logic and the Biannual Meeting of the German Association for Mathematical Logic and the Foundations of Exact Sciences (the Colloquium Logicum) Held in Münster, August 3–11, 2002., volume 27 of Lecture Notes in Logic, pages 342–348. Association for Symbolic Logic and A K Peters, Ltd., La Jolla, CA and Wellesley, MA, 2006. [337, 339] [380] F. Stephan and G. Wu. Presentations of K-trivial reals and Kolmogorov complexity. In S. B. Cooper, B. Löwe, and L. Torenvliet, editors, New Computational Paradigms. First Conference on Computability in Europe, CiE 2005. Amsterdam, The Netherlands, June 8–12, 2005. Proceedings, volume 3526 of Lecture Notes in Computer Science, pages 461–469. Springer, Berlin, 2005. [411] [381] F. Stephan and L. Yu.
Lowness for weakly 1-generic and Kurtz-random. In J.-Y. Cai, S. B. Cooper, and A. Li, editors, Theory and Applications of Models of Computation. Third International Conference, TAMC 2006. Beijing, China, May 15–20, 2006. Proceedings, volume 3959 of Lecture Notes in Computer Science, pages 756–764. Springer, Berlin, 2006. [582, 583, 584, 585] [382] J. Stillwell. Decidability of the “almost all” theory of degrees. The Journal of Symbolic Logic, 37:501–506, 1972. [323, 359, 360, 363, 365] [383] M. Stob. Index sets and degrees of unsolvability. The Journal of Symbolic Logic, 47:241–248, 1982. [83] [384] D. Sullivan. Entropy, Hausdorff measures old and new, and limit sets of geometrically finite Kleinian groups. Acta Mathematica, 153:259–277, 1984. [639] [385] K. Tadaki. A generalization of Chaitin’s halting probability Ω and halting self-similar sets. Hokkaido Mathematical Journal, 31:219–253, 2002. [602, 603]
[386] S. A. Terwijn. Computability and Measure. Ph.D. dissertation, The Institute for Logic, Language and Computation (ILLC), University of Amsterdam, 1998. [98, 303] [387] S. A. Terwijn. Complexity and randomness. Rendiconti del Seminario Matematico di Torino, 62:1–38, 2004. Notes for a course given at the University of Auckland, March 2003. [5, 610, 611] [388] S. A. Terwijn. The Medvedev lattice of computably closed sets. Archive for Mathematical Logic, 45:179–190, 2006. [345] [389] S. A. Terwijn and D. Zambella. Algorithmic randomness and lowness. The Journal of Symbolic Logic, 66:1199–1205, 2001. [99, 100, 554, 555, 559] [390] B. Trakhtenbrot. On autoreducibility. Soviet Mathematics Doklady, 11:814–817, 1970. [340] [391] C. Tricot, Jr. Two definitions of fractional dimension. Mathematical Proceedings of the Cambridge Philosophical Society, 91:57–74, 1982. [639, 657] [392] A. M. Turing. On computable numbers, with an application to the Entscheidungsproblem. Proceedings of the London Mathematical Society, 42:230–265, 1936. Correction in Proceedings of the London Mathematical Society 43:544–546, 1937. [198] [393] V. A. Uspensky. Some remarks on r.e. sets. Zeitschrift für Mathematische Logik und Grundlagen der Mathematik, 3:157–170, 1957. [69] [394] V. A. Uspensky. Complexity and entropy: an introduction to the theory of Kolmogorov complexity. In O. Watanabe, editor, Kolmogorov Complexity and Computational Complexity, EATCS Monographs on Theoretical Computer Science, pages 85–102. Springer-Verlag, New York, 1992. [238] [395] V. A. Uspensky, A. L. Semenov, and A. Kh. Shen. Can an individual sequence of zeros and ones be random? Russian Mathematical Surveys, 45:121–189, 1990. [246] [396] V. A. Uspensky and A. Shen. Relations between varieties of Kolmogorov complexities. Mathematical Systems Theory, 29:271–292, 1996. [122, 147, 170] [397] M. van Lambalgen. Random Sequences. Ph.D. dissertation, University of Amsterdam, 1987. [xix, 110, 227, 265] [398] M.
van Lambalgen. The axiomatization of randomness. The Journal of Symbolic Logic, 55:1143–1167, 1990. [110, 257, 258] [399] N. Vereshchagin. Kolmogorov complexity conditional to large integers. Theoretical Computer Science, 271:59–67, 2002. [263, 471] [400] N. Vereshchagin. Kolmogorov complexity of enumerating finite sets. Information Processing Letters, 103:34–39, 2007. [732] [401] J. Ville. Étude Critique de la Notion de Collectif. Monographies des Probabilités. Calcul des Probabilités et ses Applications. Gauthier-Villars, Paris, 1939. [xviii, 230, 235, 246] [402] R. von Mises. Grundlagen der Wahrscheinlichkeitsrechnung. Mathematische Zeitschrift, 5:52–99, 1919. [xviii, 229]
[403] A. Wald. Sur la notion de collectif dans le calcul des probabilités. Comptes Rendus des Séances de l’Académie des Sciences, 202:180–183, 1936. [xviii, 230] [404] A. Wald. Die Widerspruchsfreiheit des Kollektivbegriffes der Wahrscheinlichkeitsrechnung. Ergebnisse eines Mathematischen Kolloquiums, 8:38–72, 1937. [xviii, 230] [405] Y. Wang. Randomness and Complexity. Ph.D. dissertation, University of Heidelberg, 1996. [275, 286, 287, 290, 303, 349, 350] [406] Y. Wang. A separation of two randomness concepts. Information Processing Letters, 69:115–118, 1999. [349, 350] [407] G. Wu. Prefix-free languages and initial segments of computably enumerable degrees. In J. Wang, editor, Computing and Combinatorics. 7th Annual International Conference (COCOON 2001). Guilin, China, August 20–23, 2001. Proceedings, volume 2108 of Lecture Notes in Computer Science, pages 576–585. Springer, Berlin, 2001. [206] [408] C. E. M. Yates. A minimal pair of recursively enumerable degrees. The Journal of Symbolic Logic, 31:159–168, 1966. [47] [409] C. E. M. Yates. On the degrees of index sets II. Transactions of the American Mathematical Society, 135:249–266, 1969. [210] [410] C. E. M. Yates. Initial segments of the degrees of unsolvability, II. Minimal degrees. The Journal of Symbolic Logic, 35:243–266, 1970. [72] [411] L. Yu. Lowness for genericity. Archive for Mathematical Logic, 45:233–238, 2006. [378, 586] [412] L. Yu. When van Lambalgen’s theorem fails. Proceedings of the American Mathematical Society, 135:861–864, 2007. [276, 357] [413] L. Yu and D. Ding. There are 2^ℵ0 many H-degrees in the random reals. Proceedings of the American Mathematical Society, 132:2461–2464, 2004. [465, 478] [414] L. Yu and D. Ding. There is no sw-complete c.e. real. The Journal of Symbolic Logic, 69:1163–1170, 2004. [442] [415] L. Yu, D. Ding, and R. G. Downey. The Kolmogorov complexity of random reals. Annals of Pure and Applied Logic, 129:163–180, 2004. [262, 464, 465, 475, 478] [416] D. Zambella.
On sequences with simple initial segments. Technical Report ML-1990–05, The Institute for Logic, Language and Computation (ILLC), University of Amsterdam, 1990. [98, 501, 508] [417] X. Zheng. On the Turing degrees of weakly computable real numbers. Journal of Logic and Computation, 13:159–172, 2003. [218] [418] X. Zheng. Computability Theory of Real Numbers. Habilitationsschrift, BTU Cottbus, Germany, 2005. [217] [419] X. Zheng. Classification of computably approximable real numbers. Theory of Computing Systems, 43:603–624, 2008. [217] [420] X. Zheng and R. Rettinger. Weak computability and representation of reals. Mathematical Logic Quarterly, 50:431–442, 2004. [217]
[421] M. Ziegler. Algebraisch abgeschlossene Gruppen. In S. I. Adian, W. W. Boone, and G. Higman, editors, Word Problems II. The Oxford Book, Outgrowth of a Conference on Decision Problems in Algebra Held in Oxford, Summer 1976, pages 449–576. North Holland, Amsterdam, 1980. [206] [422] M. Zimand. Extracting the Kolmogorov complexity of strings and sequences from sources with limited independence. In S. Albers and J.-Y. Marion, editors, 26th International Symposium on Theoretical Aspects of Computer Science. STACS 2009, February 26–28, 2009, Freiburg, Germany, volume 3 of Leibniz International Proceedings in Informatics, pages 697–708 (electronic). Schloss Dagstuhl–Leibniz Center for Informatics, Dagstuhl, Germany, 2009. [635] [423] M. Zimand. On generating independent random strings. In K. Ambos-Spies, B. Löwe, and W. Merkle, editors, Mathematical Theory and Computational Practice. 5th Conference on Computability in Europe, CiE 2009. Heidelberg, Germany, July 19–24, 2009. Proceedings, volume 5635 of Lecture Notes in Computer Science, pages 499–508. Springer, Berlin, 2009. [635] [424] M. Zimand. Two sources are better than one for increasing the Kolmogorov complexity of infinite sequences. Theory of Computing Systems, 46:707–722, 2010. [627, 628, 629, 635] [425] A. K. Zvonkin and L. A. Levin. The complexity of finite objects and the development of the concepts of information and randomness by means of the theory of algorithms. Russian Mathematical Surveys, 25:83–124, 1970. [xiii, 116, 146, 150, 151, 237, 238, 265, 266, 267, 282, 598]
Index
2<ω , 2 2n , 2 2n , 2 λ, 2 |σ|, 2 στ , 2 σ τ , 2 σ
f ∼ g, 4 ∃∞ , 4 ∀∞ , 4 ∃k , 4 λx f (x, y0 , . . . , yn ), 4 μn R(n), 4 x ∨ y, 4 x ∧ y, 4 σ, 4, 618 A, 4, 618 μ(A), 5, 618 #(r), 8 Φe , 9, 17 Φ(n) ↓, 9 Φ(n) ↑, 9 ∅ , 11 Φe (x)[s], 11 Φe (x)[s] ↓, 11 Φe (x)[s] ↑, 11 We , 11 We [s], 11 We,s , 11 X[s], 13 ≡R (for a reducibility R ), 15
deg, 16 a b, 16 ⊕, 16 a
∨ b, 17 , 17 0, 17 ΦA , 17 ΦA e (n)[s], 17 ϕA (for an oracle machine Φ), 17 A , 18 A(n) , 18 a , 18 a(n) , 18 ∅(ω) , 18 m , 19 1 , 20 (for truth tables), 20 tt , 20 wtt , 21 , 39 A[<e] , 58 [T ], 73 , 100 pb , 107 Cf (σ), 111 C(σ), 111 C(n), 112 C(σ, τ ), 112 ∗ , 112 σC C(σ | τ ), 114 C A (σ), 115 IC (σ : τ ), 116 Kd , 122 U, 123 K(σ), 124 K(σ | τ ), 124 K A (σ), 124 KM (σ), 124 K(n), 124 σ ∗ , 124 Ks (σ), 124 K(σ), 129 QM (σ), 133 Q(σ), 133 IK (σ : τ ), 134 I(σ : τ ), 134 K t (σ), 139 Dn , 141
F ∗ (for a finite set F ), 141 Ω, 142 Ωs , 142 M, 146 D, 146 Km(σ), 146 Km D (σ), 146 Km S (σ), 146 KM (σ), 150 ML (for a monotone machine L), 151 V, 154 Ln , 157 mC (σ), 161 mK (σ), 161 KM , 169 Km , 169 L(α), 197 I(α), 208 R2 , 219 Ω, 229 S[d], 235, 243, 594 E[X], 242 E[X | A], 242 sf (α, n), 246 Sf (α, n), 246 S(α, n), 246 G(n), 252 ΩX , 257 ⊕X , 259 C g , 261 μ∗ρ , 264 μρ , 264 (σ)ν , 266 (σ)ν,s , 266 seqν (α), 266 Sh [d], 271 n , 281 SM Pn , 297 A ϕ (for a sentence of arithmetic), 297 (in effective Solovay forcing), 297 =I , 340 s , 344 w , 344 Pw , 345 r1 , 345 r∗2 , 346 A T , 358 K , 404
C , 405 S , 405 cl , 420 ibT , 420 rK , 421 A ⊗ f , 461 σ ⊗ f , 461 Sch , 462 stt , 463 α(n), 467 αX (n), 467 s(n), 467 qf , 472 vL , 473 LR , 473 Km ⊕ , 473 ⊕, 477 K , 486 γ(n), 487 γ (n), 488 LK , 494 KT (c), 505 G(c), 506 KT (c, n), 506 J A , 523 cα , 528 S 3 (for a class S), 535 MLR, 537 W2R, 537 K, 538 Low(C, D), 537 Low(C), 537 uT(tu) , 577 T(tu) , 577 gA , 578 A∗ , 578 ML2R, 591 CompR, 591 SchR, 591 W1R, 591 Hns , 592 H s , 593 dimH , 593 s , 598 H Hns , 598 Ωs , 603 Q∗ , 606 hω , 618 hn , 618
h<ω , 618 μh , 618 dimh , 619 dim[0,1] , 620 Qba , 623 Pac , 624 dimB , 635 dimB , 635 dimMB , 636 dimMB , 636 dimMB , 636 D[n] (for D ⊆ 2<ω ), 637 dim1B , 637 dim1B , 637 HA , 637 Ai.o. , 637 Aa.e. , 637 H1 , 637 Hstr 1 (C), 637 Pns , 638 s , 638 P∞ P s , 639 dimP , 639 Dim, 641 dim, 597 dimcomp , 654 Dimcomp , 654 dimS , 656 DimS , 657 μp , 657 , 663 d[m], 664 dimn d (σ), 664 d , 665 dim(σ) (for a string σ), 665 ΩA M , 707 ΩM , 707 A S , 709 ΩU [S], 712 Spec(ΩU ), 717 S c , 722 PW (A), 731 P (A), 731 H(A), 732 I(A), 732 RC , 738 OQ (for a complexity measure Q), 739
RK , 743 RK , 743 U RK , 744 U OK , 744 U RQ (for a complexity measure Q), 749 Km M , 750 M Ri , 750 M OKm , 752 U OKM , 761 U OKm , 761 D U OKm S , 761 a priori entropy, 150 probability, 238 a.n.c., see array noncomputability Ackermann’s function, 29 adaptive selection function, see selection, function, adaptive Ageev, M., 732 algorithm, 7 Allender, E., xv, 738, 743, 749 “almost all” theory of the Turing degrees, 365 almost computably enumerable set, 200 and computable enumerability, 201 and left-c.e. reals, 200 “almost everywhere” behavior, xxii and CEA sets, 386 and density, 385 and hyperimmunity, 381 and initial segments of the degrees, 401 and minimal degrees, 401 and minimal pairs, 359 and nth jumps, 363 and 1-genericity, 383, 385 and randomness, see randomness, and “almost everywhere” behavior and Turing degrees, 360 and 2-randomness, 394 and weak 2-genericity, 380 almost everywhere domination, see domination, almost everywhere
almost recursive degree, see hyperimmune-freeness almost simple martingale, see gale, martingale, almost simple randomness, see randomness, almost simple α function, 467 and initial segment complexity, 489 and Ω, 469–470 and relativized complexity, 469 and the s function, 468 and 2-randomness, 470 growth rate, 467 α-computably enumerable set, 28 Ambos-Spies, K., xv, 15, 16, 53, 91–92, 201, 215, 217, 219, 270, 277, 303, 307–308, 349, 554 Ample Excess Lemma, xxi, 250 analytic set, 711 Anderson, B., xv Andrews, U., xv anti-complex set, 576 and c.e. traceability, 576, 578 and domination, 579 and minimal programs, 579 and plain complexity, 579 and Schnorr triviality, 576, 580 and (uniform) reducibility with tiny use, 579 and wtt-reducibility, 576, 578, 580 approximation, 24 and the halting problem, 24–25 notation, 13 restrained, 695 to a c.e. set, 12–13 to a Δ02 set, 24–26 to a left-c.e. real, see real, left computably enumerable, approximation to a real, see real, approximation arithmetic genericity, see genericity, arithmetic hierarchy, 23 and index sets, 26 randomness, see randomness, arithmetic set, 23 array
strong, see strong, array very strong, see very strong array array computability, xxiv, 94, see also array noncomputability and c.e. traceability, 99 and cl-reducibility, 444, 447, 448 and definability, 99 and effective packing dimension, 649–650 and GL2 ness, 97 and hyperimmune-freeness, 98 and ibT-reducibility, 444, 447 and initial segment complexity, 730 and Kummer complex sets, 730 and lowness, 98 and low2 ness, 97 and pb-genericity, 107–108 and strong minimal covers, 100 and superlowness, 98 array noncomputability, 94, see also array computability and identity-bounded reductions, 96 and permitting, 94, 445 and separating sets, 94 and the halting problem, 94 and very strong arrays, 94–96 relative to a very strong array, 94 Arslanov’s Completeness Criterion, 89 Arslanov, A., xv Arslanov, M., 27, 89 asymptotically equal, 4 Athreya, K., 636–641, 654 atom of a computable measure, 266 of a measure, 265 atomic measure, see measure, atomic autocomplex set, 366 and diagonal noncomputability, 367, 369 and extending partial functions, 585 and Kolmogorov complexity, 366 and 1-randomness, 368 and Turing completeness, 368 and weak computable traceability, 369 automorphism machinery, 83
autoreducibility, 340 and indifferent sets, 341 and nonmonotonic randomness, 341 and 1-randomness, 340 avoiding a class, 101 a string, 601 power, see randomness, and computational strength, avoiding power Barak, B., 642 Barmpalias, G., xiv–xv, xxii–xxiii, 69, 315, 323, 339, 358, 362, 377, 420, 442, 444, 447–448, 489–491, 499, 536, 541–542, 689, 731 Barzdins’ Lemma, 728 Barzdins, J., 122, 728 base for a notion of randomness, 531 for computable randomness, 534 for difference randomness, 535 for 1-randomness, 531 and K-triviality, 531 and lowness for K, 533 and lowness for 1-randomness, 531 for Schnorr randomness, 534 for weak 1-randomness, 359 for weak 2-randomness, 359, 531 basic open set, see set, open, basic basis for the Π01 classes, 77 and PA degrees, 86 nonexistence of a minimal basis, 80 basis theorem, 77 computably enumerable, 77 hyperimmune-free, 78 jump inversion, 81–82 for classes of positive measure, 373 Kreisel, 77 low, 77 with cone-avoidance, 78 low for a given 1-random, 334 low for Ω, 713 non-CEA, 79 pseudo-jump inversion, 376–377
Scott, 84 superlow, 78 weakly low for K, 715 Bayes’ rule, 238 Bayes, T., 238 Becher, V., xv, 229, 257 Bedregal, B., 560 benign cost function, see cost function, benign Bernoulli measure, see measure, Bernoulli betting strategy, 235, 269 nonmonotonic, 309 and computable enumerability, 313 c.e., 311 capital function, 309 payoff function, 310 scan rule, 309 scan rule, oblivious, 319 stake function, 309 success, 310 bi-immunity, 295 Bickford, M., 530 Bienvenu, L., xiv–xv, 138–139, 254, 263, 298–301, 327, 329–330, 333, 412–413, 471–473, 484, 504–505, 591, 642, 643, 714 Binns, S., 497 boolean algebra effectively dense, 419 Borel Determinacy, 7, 336 σ-algebra, see σ-algebra, Borel Borel, E., 5, 12, 592 Borel-Cantelli Lemma, 5 bound, see traceability, bound bounded injury, see priority, method, bounded injury machine, see machine, bounded Martin-L¨ of test, see test, bounded Martin-L¨ of request set, see KC set Bounded Request Theorem, see KC Theorem bounding, 72 Bounding Lemma, 481 box, 674
box, 674 amplification, see box, promotion k-, 674 meta-, 674 promotion, 673–674 box counting dimension, see dimension, box counting branching degree, 458 Brodhead, P., xv, 319 broken dimension question, 610, 612 for m-degrees, 611 for wtt-degrees, 611 Buhrman, H., 743, 749 C-complexity, see complexity, plain Kolmogorov C-degrees, see degrees, C C-independence, see independence, C C-randomness, see randomness, C C-reducibility, see reducibility, C C-triviality, 504 Cai, J., 598 calculus of cost functions, see cost function, calculus Calhoun, W., 147, 432, 459–462 Calude, C., xi, xv, 12, 142, 200, 203–204, 228, 333, 405, 409, 413, 501, 604–605, 627–628, 706 Cantelli, F., 5 Cantor space, 3, 4 and [0, 1], 265–266 as a continuous sample space, 145 basic open set, see set, open, basic measure, see measure Cantor’s diagonalization argument, see diagonalization argument Cantor, G., 31 Cantor-Bendixson derivative, see class, Π01 , Cantor-Bendixson derivative capital function, see betting strategy, nonmonotonic, capital function cappable degree, 91 and prompt simplicity, 92 Carathéodory, C., 400, 592, 635, 639 Cauchy sequence, 198 c.e., see computably enumerable CEA set, xxii, 79
and "almost everywhere" behavior, 386 and hyperimmune-freeness, 103 and 1-genericity, 104, 392 and Π01 classes, 79 and 2-randomness, 386 and weak 2-randomness, 393 CEA(X), see computably enumerable, in and above a set Cenzer, D., xv, 74–76, 373, 377 Chaitin's Ω, see Ω Chaitin, G., xiii, xv, xxi, 12, 118–121, 125, 128–131, 133–134, 142, 145, 148, 227–229, 233, 250, 456, 467, 500, 504, 627, 705, 731–732 characteristic function, 2, 11 Chernoff bounds, 629, 633 Chisholm, J., 368 Cholak, P., xv, 83, 496, 668–670, 673, 680, 685 Chong, C., xv, 386 Chubb, J., 368 Church stochasticity, see stochasticity, Church Church, A., xviii–xix, 8, 230 Church-Turing Thesis, 8, 29, 358 cl-degrees, see degrees, computable Lipschitz cl-reducibility, see reducibility, computable Lipschitz class Δ01 , 74 uniformly, 74 diamond, see diamond class of positive measure and mass problem reducibilities, 344 Π0n , 75 and open Σ0n−1 classes, 255
and Π0,∅^(n−2)_2 classes, 287 and relativized Π01 classes, 255 closed, 255 index, see index, for a Π0n class null, and (n − 1)-randomness, 260 of positive measure, 297
Index
803
of positive measure, and n-randomness, 259 of positive measure, and weak n-randomness, 382 relativized, 76 uniformly, 76 0-1 law, see 0-1-law, for Π0n classes Π01 , 73, 201 and CEA sets, 79 and complete extensions of a theory, 84 and Hausdorff dimension, 608 and hyperimmune-freeness, 78–79 and hyperimmunity, 81 and incomplete c.e. degrees, 81 and intersections, 73 and jump inversion, 81–82, 373 and low sets, 77–78 and minimal degrees, 79 and pseudo-jump operators, 376–377 and superlow sets, 78 and unions, 73 approximation, 74 as an effectively closed set, 74 basis, see basis for the Π01 classes basis theorem, see basis theorem basis, and PA degrees, 84 Cantor-Bendixson derivative, 75 countable, and effective packing dimension, 648–649 countable, and never continuously random sets, 336 finite, 75 forcing, see forcing, with Π01 classes in ω ω , 73 in ω ω , computably bounded, 73 index, see index, for a Π0n class of complete extensions of PA, 84 of 1-random sets, 234, 324 of positive measure, 347 of positive measure, and jump inversion, 373 of positive measure, and 1-randomness, 259
of positive measure, mass problem, see mass problem, of a Π01 class of positive measure rank, 75 rank, of an element, 75 relativized, and LR-reducibility, 497 special, 81 special, and jump inversion, 81 special, cardinality, 75 strong degrees, see degrees, strong, of Π01 classes thin perfect, 373 uniformly, 74 universal, 84 weak degrees, see degrees, weak, of Π01 classes with upwards closure of positive measure, mass problem, see mass problem, of a Π01 class with upwards closure of positive measure
Π02 and almost everywhere domination, 496 and Hausdorff dimension, 608 as an intersection of uniformly Σ01 classes, 77 closed, 76 Π0,X n , see class, Π0n , relativized Σ0n , 76 and closed Π0n−1 classes, 255 and relativized Σ01 classes, 255 and Σ0,∅^(n−1)_1 classes, 286
and Σ0,∅^(n−2)_2 classes, 287 index, see index, for a Σ0n class null, and (n − 1)-randomness, 260 open, 255 relativized, 76 relativized, generators, 76 uniformly, 76 0-1 law, see 0-1-law, for Σ0n classes Σ01 , 269 Σ01 , 73, 201 and diagonal noncomputability, 582–583
and hyperimmune-freeness, 582–583 approximation, 74 as an effectively open set, 74 generators, c.e., 74, 76 generators, computable, 74 index, see index, for a Σ0n class relativization and density, 582–583 relativization and having measure 1, 582 relativized, and LR-reducibility, 489 uniformly, 74 uniformly, and prefix-free complexity, 127 Σ03 , 289 and mass problems, 346 Σ02 and Hausdorff dimension, 608 as a union of uniformly Π01 classes, 77 of 1-random sets, 324 relativized, and LR-reducibility, 497 Σ0,X n , see class, Σ0n , relativized clumpy tree, see tree, ε-clumpy co-c.e. function, see function, real-valued, co-c.e. set, 11 coding constant for plain complexity, 111 for prefix-free complexity, 122 effective, 8 Gács, 326 into the natural numbers, 8 Kučera, 332, 374 marker, see marker, coding method, 42–43 string for plain complexity, 111 for prefix-free complexity, 122 the halting problem, 10 Coding Theorem, xx, 133, 731 for continuous measures, 151 for continuous semimeasures, 150–153
Cof, see index set, Cof Cohen forcing, see forcing, Cohen Cohen, P., 297 Cole, J., 347, 696 Coles, R., xi, xiv–xv, 42, 203–204, 405, 501 collection of nonrandom strings, see nonrandom strings column of a set, 3 and n-randomness, 259 comparable strings, see string, comparability completeness, 12 computable Lipschitz, 442 criterion Arslanov's, see Arslanov's Completeness Criterion Friedberg, see Friedberg Completeness Criterion many-one, 20 and creative sets, 22 relative to a class of sets, 25 NP-, 19 1-, 20 and creative sets, 22 relative to a class of sets, 25 Solovay, xxii, 408 and halting probabilities, 409–410 and initial segment complexity, 410 completeness, and 1-randomness, 409–410 relative to an oracle, 709 truth table, 89 and wtt-completeness, 334 Turing and effective immunity, 80 and fixed-point freeness, 89 and Π01 classes, 81 weak truth table and fixed-point freeness, 89 and tt-completeness, 334 completions of PA mass problem, see mass problem, of completions of PA complex set, 366 and C-degrees, 457 and cl-reducibility, 441
and diagonal noncomputability, 367 and effective dimension, 599 and hyperavoidable sets, 368 and hyperimmunity, 368 and K-degrees, 457 and Kolmogorov complexity, 367 and 1-randomness, 368 and wtt-completeness, 368 Kummer, see Kummer complex set relative to a function, 366 shift, see shift complex set complexity bounds for computable measure machines, see machine, computable measure, complexity bounds C-, see complexity, plain Kolmogorov decision, 122 extraction, 608, 629, 635 and polynomial time Turing reducibility, 642 for effective dimension, 609 for effective packing dimension, 642 for Hausdorff dimension, 643 K-, see complexity, prefix-free Kolmogorov KM -, 150 and fast-growing functions, 238 and monotone complexity, 153, 169–170 and 1-randomness, 239 and optimal c.e. supermartingales, 238 and strong s-Martin-Löf randomness, 605 and universal monotone machines, 151 collection of nonrandom strings, see nonrandom strings, for KM -complexity overgraph, see overgraph, of KM -complexity relativized and limit, 473 Kolmogorov, xi, xx–xxi applications, 114
foundations, 110 notation, xii, 110 plain, see complexity, plain Kolmogorov prefix-free, see complexity, prefix-free Kolmogorov with respect to a function, 111 monotone, 122, 146 and computability, 148 and concatenation, 148 and KM -complexity, 153, 169–170 and 1-randomness, 227, 239, 460 and prefix-free complexity, 147–148, 459 and process complexity, 147 collection of nonrandom strings, see nonrandom strings, for monotone complexity failure of subadditivity, 148 overgraph, see overgraph, of monotone complexity relativized and limit, 473 plain Kolmogorov, xii, xx, 111 and addition, 425 and c.e. sets, 115 and cl-reducibility, 440 and complexity of minimal C-programs, 112, 115 and computability, 117–120 and computable functions, 112 and concatenation, 113, 136 and effective dimension, 598 and effective upper box counting dimension, 638 and finite sets, 115 and immunity, 114 and 1-randomness, 252–254 and packing dimension, 641 and prefix-free complexity, 136, 154–156, 160–169 and process complexity, 147 and Turing reducibility, 120 and wtt-reducibility, 440 collection of nonrandom strings, see nonrandom strings, for plain complexity conditional, 114
counting strings of low complexity, 117, 119–120 critique, 121 failure of subadditivity, 113 information content, see information content, plain lack of computable lower bounds, 124 monotone approximation, 578 of a number, 112 oscillations, 137 overgraph, see overgraph, of plain complexity relativized, 115 relativized and limit, 471 relativized, and Turing reducibility, 115 symmetry of information, see Symmetry of Information, for plain complexity time-bounded, 261 time-bounded, and 2-randomness, 261–262 upper bound, 112 upper bound for initial segments of sets, 136–137 prefix Kolmogorov, see complexity, prefix-free Kolmogorov prefix-free Kolmogorov, xii, xix–xx, 124 and addition, 425 and cl-reducibility, 440 and computable functions, 124 and concatenation, 132, 135 and dimension of strings, 665 and effective dimension, 598 and effective upper box counting dimension, 638 and KC sets, 126 and Martin-Löf tests, 232–233 and maximal c.e. discrete semimeasures, 133 and monotone complexity, 147–148, 459 and 1-randomness, 229, 250–252 and packing dimension, 641 and plain complexity, 136, 154–156, 160–169
and presentations of left-c.e. reals, 411 and process complexity, 147 and Turing reducibility, 456 and universal prefix-free machine output probabilities, 133 and wtt-reducibility, 440 approximation, 124 collection of nonrandom strings, see nonrandom strings, for prefix-free complexity computable upper bound, 138–139 conditional, 124 counting strings of a given complexity, 140 counting strings of low complexity, 129, 131, 140 Counting Theorem, see Counting Theorem information content, see information content, prefix-free lack of computable lower bounds, 124 lowness, see lowness, for K minimality as an information content measure, 129–130 monotone approximation, 467 of a finite set, 141 of a number, 124 overgraph, see overgraph, of prefix-free complexity quadratic time version, 139 relativized, 124 relativized and limit, 472 relativized, and n-randomness, 257 subadditivity, 132 symmetry of information, see Symmetry of Information, for prefix-free complexity time-bounded, 139 time-bounded, and Solovay functions, 139 upper bound, 127–129, 131 upper bound for initial segments of sets, 137 process, 122, 146 and computability, 148
and monotone complexity, 147 and 1-randomness, 227, 239 and plain complexity, 147 and prefix-free complexity, 147 collection of nonrandom strings, see nonrandom strings, for process complexity failure of subadditivity, 149 overgraph, see overgraph, of process complexity strict, 146 strict process collection of nonrandom strings, see nonrandom strings, for strict process complexity overgraph, see overgraph, of strict process complexity theory, xxv, 19, 114, 743 uniform, 122 compression function, 261 low, 261 computable analysis, 231 approximation, see approximation function, 8 and continuity, 31 and primitive recursive functions, 28–29 partial, see computable, partial function rational-valued, see function, rational-valued, computable real-valued, see function, real-valued, computable relative to an oracle, 16 uniformly, 11 Hausdorff dimension, see dimension, Hausdorff, computable index set, see index, set, computable Lipschitz degrees, see degrees, computable Lipschitz Lipschitz reducibility, see reducibility, computable Lipschitz martingale, see gale, martingale, computable measure, see measure, computable
measure machine, see machine, computable measure model theory, 368 packing dimension, see dimension, packing, computable partial function, 9 domain emptiness problem, 10 enumeration, 9 index, see index, for a partial computable function uniformly, 11 universal, 9, 111 universal by adjunction, 111 permutation, 21–22 randomness, see randomness, computable rational probability distribution, 279 real, see real, computable set, 11, 16 and fast-growing functions, 504 and monotone complexity, 148 and plain complexity, 117–120 and process complexity, 148 and the arithmetic hierarchy, 23 relative to an oracle, 16 relativization and measure, 358 uniformly, 12 stochasticity, see stochasticity, Church traceability, see traceability, computable weak, see traceability, weak computable computably bounded Π01 class, see class, Π01 , in ω ω , computably bounded computably dominated degree, see hyperimmune-freeness computably enumerable continuous semimeasure, see semimeasure, continuous, computably enumerable degree, see degrees, Turing, computably enumerable discrete semimeasure, see semimeasure, discrete, computably enumerable function
real-valued, see function, real-valued, computably enumerable in and above a set, 65, see also CEA set martingale, see gale, martingale, computably enumerable operator, 731 optimal, 731 permitting, see permitting, c.e. real, see real, left computably enumerable set, 11 almost, see almost computably enumerable set and cl-reducibility, 435–436 and effective dimension, 611 and effective immunity, 80 and ibT-reducibility, 447 and initial segment complexity, 440 and minimal degrees, 72 and Π01 classes, 81 and plain complexity, 115 and rK-reducibility, 436 and Schnorr Hausdorff dimension, 660 and Schnorr packing dimension, 661 and Schnorr randomness, 660 and settling times, 18 and Solovay reducibility, 435 approximation, see approximation, to a c.e. set as a Σ01 set, 23 entropy, 731 enumeration probability, xxv, 731 gap phenomenon, see gap phenomenon, for c.e. sets high, see high set, c.e. index, see index, for a c.e. set initial segment complexity, xxv, 728–730 lattice of c.e. sets, 83 low, see low set, c.e. minimal pair of C-degrees, 457 modulus function, 421
pair of incomparable degrees, 36 relative to an oracle, 18 relativization and measure, 358 splitting, see splitting, c.e. superhigh, see superhigh set, c.e. thick subset, see thick subset, of a c.e. set Turing completeness, 89 uniformly, 12 uniformly, and prefix-free complexity, 127 wtt-completeness, 89 splitting, see splitting, c.e. supermartingale, see gale, supermartingale, computably enumerable traceability, see traceability, computably enumerable Computably Enumerable Basis Theorem, see basis theorem, computably enumerable computably finite Martin-Löf test, see test, computably finite Martin-Löf randomness, see randomness, computably finite computably graded test, see test, computably graded computably layered machine, see machine, computably layered computational paradigm, see randomness, paradigm, computational Conder, M., xiv condition, see forcing, condition conditional expectation, see expectation, conditional plain complexity, see complexity, plain Kolmogorov, conditional prefix-free complexity, see complexity, prefix-free Kolmogorov, conditional Conidis, C., xv, 642, 648 conjunctive truth table reducibility, see reducibility, truth table, conjunctive constant of K-triviality, see K-triviality, via a constant
constructive Hausdorff dimension, see dimension, Hausdorff, effective termgale, see gale, termgale, Σ01 continuity and computability, 31 continuous measure, see measure, continuous semimeasure, see semimeasure, continuous convention hat, see hat, convention on logarithms, see logarithm conventions on reals, see real, convention on reals on uses, see use, conventions Cook, S., 19 Cooper, S., xiii, xv–xvi, 8, 72 correspondence principle for effective dimension, 608 for packing dimension, 648 for Schnorr Hausdorff dimension, 608 cost function, xxiii, 492, 502, 508, 526, 672 and Δ02 1-random sets, 692 and K-triviality, 527–528 benign, 689 and ω-c.e. 1-random sets, 692 and jump traceability, 690–691 and strong jump traceability, 690 calculus, 528 for lowness for K, 526 limit condition, 526 monotone, 526 obeying, 526 standard, 526, 689 and K-triviality, 527 countable predecessor property, 464 Counting Theorem, xx, 129 Improved, 131 tightness, 131 cover, 5 effective box, 637 n-, 592 effective, 598 Σ01 -, 386 proper, 386 tight, 386
tight, and 1-genericity, 386 coverability, Martin-Löf, see Martin-Löf, coverability creative set, 22 and completeness, 22 Csima, B., xv, 379, 420–421, 458, 551–552 cuppability Martin-Löf, see Martin-Löf, cuppability wtt-, see wtt-cuppability d-minimal program, see program, d-minimal d-shift complex set, see shift complex set d.c.e. real, see real, left-d.c.e. set, 27 and effective reals, 201 test, see test, d.c.e. Daley, R., 304, 305 Damle, V., xvi Day, A., xiv–xv, xx, 146–147, 149, 151, 153, 169–170, 282–283, 341, 749, 752, 761–763 de Leeuw, K., 323, 358 decanter method, 511, 517–518 decidable machine, see machine, decidable decision complexity, see complexity, decision Dedekind cut, see real, left cut deficiency set, 503 degrees, 16, 21, see also reducibility C-, 426 and complex sets, 457 and joins, 479 and Turing degrees, 457 branching, 458 cardinality, 465 density, 426 minimal, see minimal, degree, C- minimal pair, see minimal, pair, of C-degrees nonsplitting, 426 splitting, 426 computable Lipschitz
and array computability, 444, 447 and ibT-degrees, 443 and joins, 444 computably random and upwards closure, 340 discrete monotone, 460 identity bounded Turing and array computability, 444, 447 and cl-degrees, 443 and joins, 444 K-, 426 and complex sets, 457 and joins, 477 and Turing degrees, 457–458 branching, 458 cardinality, 465, 478, 495 density, 426 minimal, see minimal, degree, K- minimal pair, see minimal, pair, of K-degrees nonsplitting, 426 splitting, 426 LR-, 489 and c.e. sets, 491 minimal pair, see minimal, pair, of LR-degrees many-one and effective dimension, 611 and the broken dimension question, 611 monotone, 432 antichains, 460 density, 432–433 minimal pair, see minimal, pair, of monotone degrees uncountable, 461–462 1-random and upwards closure, 338–339 relative K-, 424 and infima, 424 density, 424 minimal, see minimal, degree, rK- nonsplitting, 424 splitting, 424 Schnorr random and upwards closure, 340
Solovay, xi, 413 and infima, 414 density, 415 minimal pair, see minimal, pair, of Solovay degrees nonsplitting, 415 splitting, 415, 419 theory, 419 undecidability, 419 strong, 344–345 of Π01 classes, 345 strong weak truth table, see degrees, computable Lipschitz truth table computable traceability, see traceability, computable, of a tt-degree Turing, 16 and "almost everywhere" behavior, 360 and C-degrees, 457 and effective dimension, 612 and infima, 17 and K-degrees, 457–458 and second order arithmetic, 364 and the broken dimension question, 612 cardinality, 17 computably enumerable, 16 computably enumerable, density, 58, 66 decidability of the "almost all" theory, 365 effective dimension, 610 incomparable, 32 incomparable, c.e., 36 initial segments and "almost everywhere" behavior, 401 join, 17 jump, see jump minimal, see minimal, degree minimal pair, see minimal, pair notation, 16 theory, 364–365 undecidability, 364 upper bounds, 58 van Lambalgen, 473 and decidability, 474 and joins, 474
maximal, 474 minimal, see minimal, degree, van Lambalgen theory, 474 weak, 344–345 of Π01 classes, 345 weak truth table and effective dimension, 611 and the broken dimension question, 611 c.e. traceability, see traceability, computably enumerable, of a wtt-degree weakly 1-random, 290 and upwards closure, 340 Dekker deficiency set, see deficiency set Dekker, J., 503 Delahaye, J., xv delayed permitting, see permitting, delayed Δ0n set, 23 and the (n − 1)st jump, 25 test, see test, Δ0n Δ01 class, see class, Δ01 Δ03 method, 542 Δ02 dimension, see dimension, Hausdorff, Δ02 permitting, see permitting, Δ02 randomness, see randomness, Δ02 real, see real, Δ02 set and computable approximability, 26 and hyperimmunity, 67 and the halting problem, 26 relativization and measure, 358 splitting, see splitting, Δ02 sets (Δ02)◊, see diamond class, (Δ02)◊ Demuth randomness, see randomness, Demuth weak, see randomness, weak Demuth test, see test, Demuth strict, see test, strict Demuth Demuth's Theorem, xxii, 21, 333
Demuth, O., xxii, 21, 103, 199, 231, 265, 315–316, 333, 364, 380, 409, 418–419 dense simple set, 574 and highness, 575 and Schnorr triviality, 574 density, xxii, 385 along a set, 723 and 2-randomness, 385 and “almost everywhere” behavior, 385 in a partial ordering, 645 of a set, 5 of a set of strings, 104 pb-, 108 uniform wtt-, 108 Density Theorem, 57, 58 description, see program system, xx, 111 descriptive set theory, xxv diagonal noncomputability and autocomplex sets, 367, 369 and c.e. traceability, 584 and complex sets, 367 and effective dimension, 610, 624–625 and fixed-point freeness, 87 and hyperimmune-freeness, 369, 582–584 and lowness for computable randomness / weak 1-randomness, 591 and lowness for Kurtz null tests, 583–584 and lowness for 1-randomness / weak 1-randomness, 591 and lowness for Schnorr randomness / weak 1-randomness, 591 and lowness for weak 1-genericity, 583–584 and lowness for weak 1-randomness, 584 and minimal degrees, 618 and 1-genericity, 103 and 1-randomness, 336, 347–349 and Σ01 classes, 582–583 mass problem, see mass problem, of DNC functions of a degree, 87
of a function, 87 {0, 1}-valued and PA degrees, 89 diagonalization argument, 31 diamond class, 289, 535, 689 and jump traceability, 689, 693–694 (Δ02)◊, 689 low◊, 689 (ω-c.e.)◊, 689, 693, 694 superhigh◊, 694 superlow◊, 689, 693–694 ∅′-trivializing◊, 536 Diamondstone, D., xv, 693–694 difference hierarchy, 27 randomness, see randomness, difference differential geometry, 420 dimension box counting, 635 and countable stability, 636 and Hausdorff dimension, 636 effective, and effective Hausdorff dimension, 637 lower, 635 lower, effective, 637 modified, 636 modified, and countable stability, 636 modified, lower, 636 modified, upper, 636 modified, upper, and packing dimension, 639 upper, 635 upper, effective, 637 upper, effective, and effective packing dimension, 641 upper, effective, and entropy rates, 637 upper, effective, and initial segment complexity, 638 effective, see dimension, Hausdorff, effective Hausdorff, xiii, xxiii, 593, 596 and box counting dimension, 636 and effective dimension, 608 and measure, 593 and packing dimension, 638 and Π02 classes, 608
and s-(super)gales, 594 and Schnorr Hausdorff dimension, 608 and Σ02 classes, 608 and (super)martingales, 594 and unions of Π01 classes, 608 computable, 654, see also dimension, Hausdorff, Schnorr computable, and Schnorr Hausdorff dimension, 656 computable, correspondence principle, see correspondence principle, for computable Hausdorff dimension constructive, see dimension, Hausdorff, effective countable stability, 593 Δ02 , 600 effective, xiii, xxiii, 597, 598–600 effective, and C-independence, 629 effective, and c.e. sets, 611 effective, and c.e. supermartingales, 598 effective, and complex sets, 599 effective, and complexity extraction, 643 effective, and diagonal noncomputability, 610, 624–625 effective, and dimension of strings, 666 effective, and DNCh , 625 effective, and DNCn , 624–625 effective, and effective box counting dimension, 637 effective, and effective packing dimension, 641, 643, 649 effective, and entropy rates, 637 effective, and Hausdorff dimension, 608 effective, and initial segment complexity, 598 effective, and m-reducibility, 609, 611 effective, and 1-randomness, 599, 601, 609 effective, and Π02 classes, 608
effective, and shift complexity, 601 effective, and the halting problem, 609 effective, and Turing reducibility, 612 effective, and unions of Π01 classes, 608 effective, and weak s-randomness, 603 effective, and wtt-reducibility, 611 effective, correspondence principle, see correspondence principle, for effective dimension effective, in h-spaces, 619 effective, in h-spaces / in [0,1], 620–622 effective, in h-spaces / in Cantor space, 623 effective, in [0,1], 620 effective, of Turing degrees, 610 effective, stability, 598 monotonicity, 593 relative to a complexity class, 596 Schnorr, xxiv, 600, 656, 657 Schnorr, and c.e. sets, 660 Schnorr, and computable Hausdorff dimension, 656 Schnorr, and computable measure machines, 659 Schnorr, and Hausdorff dimension, 608 Schnorr, and left-c.e. reals, 660 Schnorr, and Schnorr randomness, 657 Schnorr, and Σ02 classes, 608 Schnorr, and stability, 661 Σ01 , see dimension, Hausdorff, effective of a string, xxiv, 665 and prefix-free complexity, 665 relative to a termgale, 664 upper bound, 665 packing, 638, 639 and Hausdorff dimension, 638 and martingales, 639
and upper modified box counting dimension, 639 computable, 654, see also dimension, packing, Schnorr correspondence principle, see correspondence principle, for packing dimension countable stability, 639 effective, xxiv, 636, 641 effective, and array computability, 649–650 effective, and c.e. traceability, 649–650 effective, and complexity extraction, 642–643 effective, and countable Π01 classes, 648–649 effective, and dimension of strings, 666 effective, and effective Hausdorff dimension, 641, 643, 649 effective, and effective upper box counting dimension, 641 effective, and genericity, 641 effective, and initial segment complexity, 641 effective, and left-c.e. reals, 649 effective, and Schnorr packing dimension, 662 effective, and Turing reducibility, 642 effective, 0-1 law, see 0-1 law, for effective packing dimension Schnorr, xxiv, 657 Schnorr, and c.e. sets, 661 Schnorr, and computable measure machines, 659 Schnorr, and effective packing dimension, 662 Schnorr, and hyperimmunity, 661 Ding, D., xv, 262, 442, 464–465, 475, 478 discrete machine, see machine, discrete monotone degrees, see degrees, discrete monotone semimeasure, see semimeasure, discrete DNC, see diagonal noncomputability
DNCh , 618 and effective dimension, 625 and minimal degrees, 618 DNCn , 618 and effective dimension, 624–625 Dobrinen, N., 495–496 domain of a finite assignment, see selection, domain of a set of pairs, 3 dominant function, 97 and highness, 97 domination, 19 almost everywhere, xxiii, 315, 380, 495, 509–510 and highness, 496 and lowness for 2-randomness / 1-randomness, 496, 591 and LR-reducibility, 496–497, 499 and Π02 classes, 496 and ∅′, 496–497, 499 uniform, 496, 509–510 and GL2 sets, 97 and highness, 97 settling time, 421 Doty, D., 643 Downey, R., xi, xiv–xv, xxiv, 42, 53, 70, 72, 82–83, 93–98, 103, 107–109, 138, 204–210, 212, 215, 218, 240, 262, 275, 277–280, 288–295, 319, 321, 323, 327, 329, 334, 340, 349–350, 357–358, 360–362, 373, 379, 386, 406–408, 412–416, 418–427, 430, 432–436, 440–442, 444, 447–448, 462–465, 475, 478, 484, 501, 503–506, 511, 526, 528–530, 537, 539–542, 562–565, 568–569, 581, 586, 590, 606, 618, 642, 645, 649–650, 655–656, 658–661, 668–670, 673, 680, 685, 689, 700, 705, 707, 709–714, 716–721, 723–724, 726–727, 731 downward density, 394 drip-feed strategy, 212
dump set, 503 Durand, B., 601 dynamic definition, see reduction, dynamic definition dynamical systems, xxv Dzhafarov, D., xv e-splitting tree, see tree, e-splitting e-state, 384 effective box cover, see cover, effective box coding, see coding, effective dimension, see dimension, Hausdorff, effective Hausdorff dimension, see dimension, Hausdorff, effective immunity, see immunity, effective lower box counting dimension, see dimension, box counting, lower, effective n-cover, see cover, n-, effective packing dimension, see dimension, packing, effective s-dimensional outer Hausdorff measure, see measure, s-dimensional outer Hausdorff, effective upper box counting dimension, see dimension, box counting, upper, effective effectively dense boolean algebra, see boolean algebra, effectively dense inseparable pair, 73 and PA degrees, 86 separating set, 86 0-1-valued formula, 365 Embedding Lemma, 346 Enderton, H., 84 entropy a priori, see a priori, entropy of a c.e. set, see computably enumerable, set, entropy rate, 637 and effective Hausdorff dimension, 637 and effective upper box counting dimension, 637
enumeration probability, see computably enumerable, set, enumeration probability Enumeration Theorem, 9 for oracle machines, 17 ε-clumpy tree, see tree, ε-clumpy Epstein, R., 28 equivalence under a notion of reducibility, 16 Ershov hierarchy, 27 expectation, 242 conditional, 242 exponential order, see order, exponential extended monotone reducibility, see reducibility, extended monotone extendible element of a tree, 75 extension, see forcing, condition, extension extraction complexity, see complexity, extraction randomness, see randomness, extraction extractor, 632, 642 Fσ set, 77 f.a., see finite assignment Falconer, K., 639 Federer, H., 635 Feferman, S., 100 Feller, W., 250, 629 Fenner, S., 454 field of left-d.c.e. reals, 219 real closed, 219 and rK-reducibility, 424 of computable reals, 224 of Δ02 reals, 224 of K-trivial reals, 721 of left-d.c.e. reals, 219 Figueira, S., xv, 120, 229, 340–341, 523, 668–671 filter, 645 generic, 645 Fin, see index set, Fin finitary independence, see independence, C-
finite assignment, 309 domain, see selection, domain extension method, 32 game, 745, 746, 762 injury, see priority, method, finite injury Martin-Löf test, see test, finite Martin-Löf randomness, see randomness, finite computably, see randomness, computably finite Solovay test, see test, Solovay, finite total Solovay test, see test, Solovay, finite total Fitzgerald, A., xiv–xv, 341 fixed point, 13 Fixed Point Theorem, see Recursion Theorem fixed-point freeness, xxii, 87 and DNC functions, 87 and joining to ∅′, 93 and minimal pairs, 92 and prompt simplicity, 90 and Turing completeness, 89 and wtt-completeness, 89 n-, 371 and (n + 1)-randomness, 371 flexible formula, 93 fluctuation about the mean, 249 follower, see priority, method, follower forcing, 7, 100 Cohen, 297 condition, 645 extension, 645 Gandy, 478 notion, 645 Solovay, 296–297 with Π01 classes, 374 with computable perfect trees, 68, 645 Fortnow, L., xi, xiv–xv, 132, 229, 303, 642 FPF, see fixed-point freeness Franklin, J., xv, xxiii, 273, 316–317, 534, 535, 564, 570–572, 574–580, 591 Freivalds, R., 303
fresh large number, 37 Friedberg Completeness Criterion, 64 Friedberg, R., 32, 34–36, 38, 42, 64, 82 Friedberg-Muchnik Theorem, 36, 44 Fubini's Theorem, 365 full approximation, 384 method, 72 function computable, see computable, function partial, 8 partial computable, see computable, partial function polynomial time computable, 16 primitive recursive, see primitive recursive function rational-valued computable, 203 real-valued, 202 co-c.e., 203 computable, 203 computably enumerable, 203 Σ01 , see function, real-valued, computably enumerable total, 8 tree, see tree, function perfect, see tree, perfect function functional truth table, 21 Turing, 17 weak truth table, 21 Gδ set, 77 Gödel number, 8 Gödel's Incompleteness Theorem, 73, 84, 93 Gödel, K., 8 Gaifman, H., 287, 361 gale Hitchgale, 321 Kastergale, 321 martingale, xiii, xxi, xxiii, 235 additivity, 235 almost simple, 308 almost simple, and simple martingales, 308 and Hausdorff dimension, 594 and null sets, 235
and packing dimension, 639 and s-gales, 594 co-c.e., and computable martingales, 280 computable, 236, 270 computable, and bounded Martin-Löf tests, 280 computable, and computable randomness, 270 computable, and computably graded tests, 280 computable, and enumerations, 276 computable, and rational-valued martingales, 270 computable, and Schnorr randomness, 273 computable, and weak 1-randomness, 290 computably enumerable, xix, 236 computably enumerable, and 1-randomness, 236 computably enumerable, and Martin-Löf tests, 236 computably enumerable, and optimality, 240 computably enumerable, nonexistence of enumerations, 240 computably enumerable, universal, 237 condition, 235 etymology, 321 h-success, 271 h-success set, 271 in probability theory, 242 in probability theory, and martingales / martingale processes, 243 Kolmogorov's Inequality, see Kolmogorov's Inequality multiplication by a constant, 235 operator, 587 partial computable, 303 process, see gale, martingale process rational-valued, 270
s-randomness, see randomness, martingale s- s-success, 597 savings property, 238, 273 savings trick, 237, 273 Σ0n , 257 Σ0n , and n-randomness, 257 Σ01 , see gale, martingale, computably enumerable simple, 308 simple, and almost simple martingales, 308 simple, and von Mises-Wald-Church stochasticity, 308 Space Lemma, see Space Lemma strong s-success, 639 success, 235 success set, 235 success, relative to an order, see gale, martingale, h-success martingale process, 242 and 1-randomness, 244 and Schnorr's critique, 269 Kolmogorov's Inequality, see Kolmogorov's Inequality, for martingale processes rational-valued, 245 success, 243 nonmonotonic, see betting strategy, nonmonotonic s-gale, 594 and s-supergales, 597 and Hausdorff dimension, 594 and martingales, 594 s-supergale, 594 and s-gales, 597 and Hausdorff dimension, 594 and supermartingales, 594 supermartingale, xiii, 150, 235 additivity, 235 and continuous semimeasures, 238 and Hausdorff dimension, 594 and null sets, 235 and s-supergales, 594 computably enumerable, 236 computably enumerable, and 1-randomness, 236
computably enumerable, and effective dimension, 598 computably enumerable, and Martin-Löf tests, 236 computably enumerable, optimal, 237 computably enumerable, optimal, and KM-complexity, 238 computably enumerable, optimal, and optimal c.e. continuous semimeasures, 238 for h-spaces, 619 h-success, 271 h-success set, 271 Kolmogorov’s Inequality, see Kolmogorov’s Inequality multiplication by a constant, 235 partial computable, 303 rational-valued, computable, and enumerations, 276 s-randomness, see randomness, supermartingale ss-success, 597 savings property, 238 savings trick, 237 Σ0n , 257 Σ0n , and n-randomness, 257 Σ01 , see gale, supermartingale, computably enumerable success, 235 success set, 235 success, relative to an order, see gale, supermartingale, h-success termgale, 662, 664 constructive, see gale, termgale, Σ01 induced by a semimeasure, 664 optimal, 664 optimal, and semimeasures, 664 s-, 663 Σ01 , 664 terminology, 321 game finite, see finite game γ function, 487 and the s function, 487 and 2-randomness, 488 Gandy, R., 478
gap phenomenon for c.e. sets, 730 for K-triviality, 551 for 1-randomness, 250 lack for Church stochasticity, 327 for 1-randomness, 327–329, 484 Gasarch, W., 69, 303 Gauld, D., xiv generalized highn set, 19 generalized low2 set and array computability, 97 and domination, 97 generalized lown set, 19 generalized Martin-Löf test, see test, generalized Martin-Löf generally noncomputable function, 92 {0, 1}-valued, 92–93 generators of an open set, 4 genericity, xxii arithmetic, 101 and initial segments of the degrees, 102 of degrees, 102 Cohen, 378 for a filter, see filter, generic n-, 101 and effective packing dimension, 641 and existence results, 102 and finite extension arguments, 102–103 and jumps, 102 and Turing reducibility, 379 and van Lambalgen’s Theorem, see van Lambalgen’s Theorem, for n-genericity and weak (n + 1)-genericity, 105 and weak n-genericity, 105, 107 forcing definition, 101 of degrees, 102 1-, 101 and “almost everywhere” behavior, 383, 385 and CEA sets, 104, 392 and diagonal noncomputability, 103 and generalized lowness, 102 and hyperimmune-freeness, 103
and initial segments of the degrees, 401 and jump inversion, 103 and Omega operators, 723–724 and 1-randomness, 380 and PA degrees, 103 and pb-genericity, 108–109 and tight Σ01 -covers, 386 and 2-randomness, 385, 393 and weak 2-genericity, 379 and weak 2-randomness, 385 and weak 1-genericity, lowness, see lowness, for 1-genericity / weak 1-genericity lowness, see lowness, for 1-genericity relative, and Turing reducibility, 379 pb-, 108 and array computability, 107–108 and effective packing dimension, 650 and 1-genericity, 108–109 and 2-genericity, 108–109 Solovay n-, 297 and weak n-randomness, 298 2- and effective packing dimension, 641 and pb-genericity, 108–109 and Turing reducibility, 379 weak n-, 105 and (n − 1)-genericity, 105 and n-genericity, 105, 107 and relative hyperimmunity, 105 weak 1- and hyperimmunity, 107 and 1-genericity, lowness, see lowness, for 1-genericity / weak 1-genericity and Schnorr randomness, 356 and weak 1-randomness, 356 lowness, see lowness, for weak 1-genericity weak 2- and “almost everywhere” behavior, 380 and 1-genericity, 379 Gengler, R., 217
GHn , see generalized highn set GLn , see generalized lown set global independence, see independence, global GNC function, see generally noncomputable function golden pair, 697 run, 520, 698 method, 511, 518 graph of a function, 3 Greenberg, N., xiv–xv, xxiv, 323, 348, 379, 442, 444, 447–448, 496, 526, 562–564, 576–580, 584–586, 591, 609, 618–626, 642, 645, 649–650, 668–670, 673, 680, 685, 689–695, 700, 731 Griffiths, E., xiv–xv, 240, 275, 277–280, 290–295, 349–350, 462–463, 562, 564–565, 568–569, 581, 658 Grigorieff, S., 229, 257 Groszek, M., 79 guess, see priority method, guess Gács, P., xv, xx–xxi, 133–134, 138, 143–144, 147, 153–154, 169–170, 172, 174, 252, 323, 325–326, 484 Gács coding, see coding, Gács h-space, 618 and [0, 1], 620 h-success, see gale, (super)martingale, h-success Hölzl, R., 139, 689, 729 Haas, R., 28 halting probability, 142, 201 and initial segment complexity, 410 and Solovay completeness, 409–410 relativized, see Omega operator and Ω, relativized halting problem, 9, 11, 21 and computable approximations, 24 and Δ02 questions, 33 and Δ02 sets, 26 and effective dimension, 609 and finiteness of trees, 75
and index sets, 15 and Ω, 228 coding, see coding, the halting problem completeness, 20 computable enumerability, 12 relative to an oracle, 18 unsolvability, 9, 31 Hanf, W., 84 Harizanov, V., 368 Harrington’s golden rule, 36 Harrington, L., 83, 478 Harrington-Silver Theorem, 478 Hartmanis, J., 598 hashing, 632 hat convention, 57 trick, 57 Hausdorff dimension, see dimension, Hausdorff Hausdorff, F., 592, 635 Hertling, P., 200, 203–204, 405, 409, 413, 706 high set, xxii, 19 and domination, 97 and lowness for computable randomness / weak 1-randomness, 591 and lowness for Schnorr randomness / weak 1-randomness, 591 c.e., 42 and computable randomness, 349 incomplete, 53–54 measure of the class, 535 highn set, 19 highness for a notion of randomness, 591 Hirschfeldt, D., xiii–xv, xxiv, 96, 207, 289, 334, 340, 357, 360–361, 368, 379, 406–408, 413–416, 418–427, 430, 432–436, 440–442, 447, 501, 503–504, 511, 524, 526, 528–531, 533–536, 539–542, 689, 694–695, 705, 707, 709–713, 716–721, 723–724, 726–727 Hitchcock randomness, see randomness, Hitchcock
Hitchcock, J., xv, 242–245, 321, 597, 608, 636–642, 654 Hitchgale, see gale, Hitchgale hitting power, see randomness, and computational strength, hitting power Ho, C., 199 Hofstadter’s Law, xii hungry sets construction, 531 hyperavoidable set, 368 and complex sets, 368 hyperdegree, 358 Hyperimmune-Free Basis Theorem, see basis theorem, hyperimmune-free hyperimmune-freeness, 67, see also hyperimmunity and array computability, 98 and CEA sets, 103 and computable traceability, 100, 555, 559 and diagonal noncomputability, 369, 582–584 and lowness for Kurtz null tests, 583–584 and lowness for weak 1-genericity, 583–584 and lowness for weak 1-randomness, 584 and 1-genericity, 103 and 1-randomness, 356 and Π01 classes, 78–79 and Σ01 classes, 582–583 and tt-reducibility, 69 and Turing reducibility, 69 and weak 1-randomness, 356–357 and weak 2-randomness, 357 cardinality of the class, 68 hyperimmunity, see also hyperimmune-freeness and “almost everywhere” behavior, 381 and complex sets, 368 and Π01 classes, 81 and principal functions, 69 and 2-randomness, 382 and weak 1-genericity, 107 and weak 1-randomness, 356 and weak 2-randomness, 382
of degrees, 67 and of sets, 69 of noncomputable Δ02 degrees, 67 of sets, 69 and of degrees, 69 relative to ∅′ and weak 2-randomness, 362 relativized and weak n-genericity, 105 hypersimplicity, 69 and c.e. splittings, 83 and wtt-cuppability, 82 ideal, 4 generated by a set, 530 principal, 531 Σ03 , 541 tt-, 574 Turing low upper bound, 550 low2 upper bound, 541 wtt-, 208 Σ03 , 208–210 identity bounded Turing degrees, see degrees, identity bounded Turing reducibility, see reducibility, identity bounded Turing ignorance test, 229 immunity, 12 and plain complexity, 114 bi-, see bi-immunity effective, 80 and Turing completeness, 80 k-, 454 and (weak) 1-randomness, 455 Impagliazzo, R., 642 incomparable strings, see string, incomparability Incompleteness Theorem, see Gödel’s Incompleteness Theorem independence C-, 627 and effective Hausdorff dimension, 629 and global independence, 628 and initial segment complexity, 628 and left-c.e. reals, 628
and relative 1-randomness, 628 finitary, see independence, C- global, 627 and C-independence, 628 and initial segment complexity, 628 and K-triviality, 628 and relative 1-randomness, 628 of sources, xxiv, 627 of strings, 635 index for a c.e. set, 12 for a partial computable function, 9 for a Π0n class, 76 for a Σ0n class, 76 set, 13 and levels of the arithmetic hierarchy, 26 and ∅′, 16 Cof, 26 computable, 13 Fin, 26 Inf, 26 Tot, 26 uniformly Σ03 , 541 indifferent set, 341 and autoreducibility, 341 for a member of a Π01 class, 341 for a 1-random set, 340 for genericity, 341 Inf, see index set, Inf infinite injury, see priority, method, infinite injury infinitely often C-randomness, see randomness, infinitely often C- strong K-randomness, see randomness, infinitely often strong K- information content measure, 129, 133 minimal, 129 plain, 116 prefix-free, 134 initialization, see priority, method, initialization injective randomness, see randomness, injective
injury, see priority, method, injury bounded, see priority, method, bounded injury finite, see priority, method, finite injury infinite, see priority, method, infinite injury set, 57 unbounded, see priority, method, unbounded injury injury-free solution to Post’s Problem, see Post’s Problem, injury-free solution inner measure, see measure, inner interval determined by a string relative to a measure, 266 interval Solovay s-test, see test, Solovay s-, interval investment adviser method, 428 Ishmukhametov, S., 98–100, 650 isolated path, see path, isolated Jech, T., 297 Jockusch, C., xv, 15, 41–42, 64, 69, 72, 77–79, 81–82, 84, 86–89, 91–98, 101–102, 104–105, 107–109, 207, 215, 295, 368, 370, 374, 376, 401, 414, 530, 534 join, 4 Jones, V., xiv jump, 18 class, 19, 731 inversion, xxii, 64–66 and c.e. sets, 65 and Δ02 sets, 65 and 1-genericity, 103 and 1-randomness, 373 and Π01 classes, 81–82 basis theorem, see basis theorem, jump inversion for higher jumps, 66–67 Sacks, see Sacks, Jump Inversion Theorem Shoenfield, see Shoenfield Jump Inversion Theorem nth, 18 and “almost everywhere” behavior, 363
ω-, 18 operator, 18, 38, 64–66 and 1-genericity, 103 traceability, see traceability, jump strong, see traceability, strong jump ultra, see traceability, ultra jump well-approximability, see well-approximability, jump strong, see well-approximability, strong jump K-complexity, see complexity, prefix-free Kolmogorov K-degrees, see degrees, K- k-immune, see immune, k- K-reducibility, see reducibility, K- k-set, 514 weight, see weight, of a k-set K-triviality, xxiii–xxiv, 305, 359, 425, 501, 668 and addition, 529 and array computability, 510 and bases for 1-randomness, 531 and computable enumerability, 527, 530 and cost functions, 527–528 and Δ02 -ness, 456, 500 and difference randomness, 533, 550 and downward closure, 525 and global independence, 628 and joins, 530 and jump traceability, 523, 680, 685, 689, 700 and left-d.c.e. reals, 721–722 and lowness for generalized Martin-Löf tests, 538 and lowness for K, 508, 524, 525, 541 and lowness for left-d.c.e. reals, 722 and lowness for 1-randomness, 510, 525, 533 and lowness for 1-randomness / computable randomness, 591 and lowness for weak 2-randomness, 537–538 and lowness for weak 2-randomness / computable randomness, 591
and lowness for weak 2-randomness / 1-randomness, 537 and Omega operators, 719–721 and ω-c.e.-ness, 523 and real closed fields, 721 and Solovay functions, 505, 689 and strong jump traceability, 680, 685, 700 and superlowness, 518, 530 and the standard cost function, 527 and Turing incompleteness, 511 and wtt-incompleteness, 512 cost function, see cost function, standard counting, see K-triviality, via a constant, number of K-trivials gap phenomenon, see gap phenomenon, for K-triviality incomplete upper bound, 542 listing the K-trivials, 538–540 low upper bound, 550 low2 upper bound, 542 Σ03 ideal of K-trivial c.e. degrees, 529 Σ03 ideal of K-trivial degrees, 530 via a constant, 539 number of K-trivials, 501, 506–507 Kach, A., xv, 658 Kallibekov, S., 209 Kanovich, M., 366, 368 Karp, R., 19 Kastergale, see gale, Kastergale Kastermans randomness, see randomness, Kastermans Kastermans, B., 311, 319–321 Kautz, S., xiii, xxii, 79, 255–257, 259, 265–268, 287, 297–298, 330, 333, 359, 361–364, 382, 385–386, 393 KC request, see KC Theorem, request KC set, 126 and prefix-free complexity, 126 weight, see weight, of a set of KC requests KC Theorem, 125 and the recursion theorem, 127–128 request, 125
weight, see weight, of a KC request Khoussainov, B., xv, 200, 203–204, 405, 409, 413, 706, 746 Kjos-Hanssen, B., xv, xxiii, 335–336, 348–349, 366–369, 489–490, 494, 497, 499, 537, 554, 559–560, 576, 581, 591 Kleene, S., 13–14, 17, 23, 28, 32, 34 Kleene-Post method, see finite extension method Theorem, 32 KM-complexity, see complexity, KM KM-reducibility, see reducibility, KM Km-reducibility, see reducibility, monotone Knight, J., xv Kolmogorov complexity, see complexity, Kolmogorov Kolmogorov’s 0-1 Law, see 0-1 law, Kolmogorov’s Kolmogorov’s Inequality, 235 for martingale processes, 243 in h-spaces, 620 Kolmogorov, A., xiii, xx, 6, 111–113, 115–116, 637 Kolmogorov-Loveland null set, see null set, Kolmogorov-Loveland randomness, see randomness, nonmonotonic Koucký, M., 743, 749 Kräling, T., 139, 689, 729 Kraft’s Inequality, 125 computable version, see KC Theorem Kraft, L., 125 Kraft-Chaitin Theorem, see KC Theorem Kramer, R., 28 Kreisel Basis Theorem, see basis theorem, Kreisel Kreisel, G., 77 Kripke, S., 93 Kučera, A., xiii, xv, xxi–xxii, 89–93, 103, 231, 234, 259, 277, 315, 323–326, 330–331,
336–338, 346–347, 364, 367, 371, 373–374, 376–377, 380, 409–410, 418, 474, 489, 508–510, 524, 526, 531, 533–534, 550, 554, 609 Kučera coding, see coding, Kučera Kučera’s universal Martin-Löf test, see test, Martin-Löf, universal, Kučera’s Kučera-Gács Theorem, xxi, 326, 330, 332 bound on the use, 326, 441 Kučera-Slaman Theorem, xxii, 410, 444 and cl-reducibility, 454–455 relativized, 709 Kumabe, M., 102, 349, 618 Kummer complex set, 366, 729 and array computability, 730 Kummer’s Gap Theorem, 730 Kummer, M., xxv, 303, 501, 649, 728–730, 738–740, 743–744, 761 Kurtz array, 294 and Schnorr randomness, 294 n-randomness, see randomness, weak n- null test, see test, Kurtz null randomness, see randomness, weak 1- Kurtz, S., xii–xiii, xxi–xxii, 105–107, 254–256, 259, 269, 285–288, 295, 303, 315, 323, 356, 360–362, 380–383, 385–387, 394, 401, 496 Kurtz-Solovay test, see test, Kurtz-Solovay Kuznecov, A., 69 Lachlan, A., 47, 58, 333–334, 364 Ladner, R., 207 LaForte, G., xiv–xv, 42, 206, 208, 210, 215, 240, 278–280, 349–350, 377, 406, 419–424, 433–436, 440, 442, 462–463, 564–565, 568–569 Lathrop, J., 304–305 lattice, 4
of c.e. sets, see computably enumerable, set, lattice of c.e. sets law of iterated logarithms, 230 law of large numbers, xviii, 230, 232, 303 Lebesgue Density Theorem, 5 Lebesgue measure, see measure, uniform Lebesgue, H., 5, 592 left computable real, see real, left computably enumerable left computably enumerable real, see real, left computably enumerable left cut, see real, left cut left semicomputable real, see real, left computably enumerable left-c.e. real, see real, left computably enumerable left-d.c.e. real, see real, left-d.c.e. Lempp, S., xv, 67, 72, 311, 319–321, 349 length of agreement, 39 maximum, 48 length-lexicographic ordering, 2 Lerman, M., 67, 72, 87, 370, 475, 497, 517, 534 Levin, L., xiii, xv, 116, 121, 125, 129, 133–134, 145–147, 150–151, 154, 169, 227, 232, 237–240, 265–267, 598, 601 Levy, P., 235 Lewis, A., xv, xxii, 93, 323, 339, 349, 420, 442, 448, 489–491, 499, 618 Li, M., xi–xii, xx, 110, 113, 116, 122, 132, 136–137, 150, 227, 238, 260, 265, 304, 738 Lieb, E., 246, 249–250 limit condition, see cost function, limit condition limit frequency, 472 and discrete semimeasures, 472 Limit Lemma, 24, 29–30 strong form, 25 limit randomness, see randomness, limit limited dependence, 635
Lipschitz transformation, 420 LK-reducibility, see reducibility, LK- loading measure, 513 logarithm conventions, 3 factor, 627 Loveland, D., 117–118, 122, 305 Low Basis Theorem, see basis theorem, low low cuppable degree, 92 and prompt simplicity, 92 lown set, 19 low set, 19, 507 and Π01 classes, 77–78 and array computability, 98 and joins, 93 c.e., 38 low2 set and array computability, 97 low3 , see diamond class, low3 lower box counting dimension, see dimension, box counting, lower modified box counting dimension, see dimension, box counting, modified, lower semicomputable real, see real, left computably enumerable semicontinuity, 723 lower semilattice, 4 lowness for a class, 507 for a notion of randomness, 507 for a pair of classes, 537, 590 for a test notion, 507 for computable measure machines, 562 and computable traceability, 562 and lowness for Schnorr randomness, 563 for computable randomness, xxiii and computability, 586 and hyperimmune-freeness, 560 for computable randomness / Schnorr randomness, 591 for computable randomness / weak 1-randomness, 591 for Demuth randomness, 590 for difference randomness, 535
for generalized Martin-Löf tests, 537 and K-triviality, 538 and lowness for weak 2-randomness, 537, 538 for K, xxiii, 507 and bases for 1-randomness, 533 and jump traceability, 680 and K-triviality, 508 and K-triviality, 524, 525, 541 and LK-reducibility, 508 and lowness, 541 and lowness for 1-randomness, 508, 525, 533 and strong jump traceability, 680 cost function, see cost function, for lowness for K reducibility, see reducibility, LK- via a constant, 541 weak, 713 weak, and lowness for Ω, 714 weak, and 2-randomness, 715 weak, basis theorem, see basis theorem, weakly low for K for Kurtz null tests and computable traceability, 581, 584 and diagonal noncomputability, 583–584 and hyperimmune-freeness, 583–584 and lowness for weak 1-genericity, 584 and lowness for weak 1-randomness, 584 for left-d.c.e. reals, 722 and K-triviality, 722 for Martin-Löf tests, 507 for Ω, 712 and Omega operators, 712 and 2-randomness, 713 and weak lowness for K, 714 basis theorem, see basis theorem, low for Ω for 1-genericity, 586 for 1-genericity / weak 1-genericity, 583 for 1-randomness, xxiii, 473, 507–510
and bases for 1-randomness, 531 and computable traceability, 555 and K-triviality, 510, 525, 533 and lowness for K, 508, 525, 533 and LR-reducibility, 508 reducibility, see reducibility, LR- relativized, 509 for 1-randomness / computable randomness, 591 for 1-randomness / Schnorr randomness, 591 for 1-randomness / weak 1-randomness, 347, 591 for partial computable randomness / computable randomness, 590 for Schnorr randomness, xxiii, 554 and computable traceability, 561 and hyperimmune-freeness, 560 and lowness for computable measure machines, 563 and lowness for Schnorr tests, 561 and lowness for weak 1-randomness, 584 and Schnorr triviality, 564 truth table, 570 truth table, and computable traceability, 572 truth table, and Schnorr triviality, 572 for Schnorr randomness / weak 1-randomness, 591 for Schnorr tests, 554 and computable traceability, 555 and lowness for Schnorr randomness, 561 for 2-randomness / 1-randomness, 591 for weak 1-genericity, 581, 586 and diagonal noncomputability, 583–584 and hyperimmune-freeness, 583–584 and lowness for Kurtz null tests, 584 and lowness for weak 1-randomness, 584 for weak 1-randomness
and diagonal noncomputability, 584 and hyperimmune-freeness, 581, 584 and lowness for Kurtz null tests, 584 and lowness for Schnorr randomness, 584 and lowness for weak 1-genericity, 584 for weak 2-randomness and K-triviality, 537–538 and lowness for generalized Martin-Löf tests, 537, 538 for weak 2-randomness / computable randomness, 591 for weak 2-randomness / 1-randomness, 537 for weak 2-randomness / Schnorr randomness and c.e. traceability, 591 for weak 2-randomness / weak 1-randomness, 591 truth table for Schnorr randomness, see lowness, for Schnorr randomness, truth table LR-degrees, see degrees, LR- LR-reducibility, see reducibility, LR- Lusin, N., 711–712, 718 Lutz, J., xiii, xv, xxiii, 242–245, 270, 304–305, 594–598, 603, 636–641, 654, 658, 662–666 m-completeness, see completeness, many-one M-pullback, see pullback m-reducibility, see reducibility, many-one Maass, W., 90, 215 machine bounded, 282 and computable randomness, 282 as a decidable machine, 298 computable measure, 277, 292 and Schnorr Hausdorff dimension, 659 and Schnorr packing dimension, 659 and Schnorr randomness, 277
and weak 1-randomness, 292 as a decidable machine, 298 complexity bounds, 278 lowness, see lowness, for computable measure machines relativized, 562 subadditivity, 278, see subadditivity, for computable measure machines with domain of measure 1, 278–279 computably layered, 291 and weak 1-randomness, 291 decidable, 298 and 1-randomness, 298 and Schnorr randomness, 299 and weak 1-randomness, 300–301 dependence, 21, 112, 124, 140, 142, 228, 408–409, 507, 705, 744, 749 discrete, 145 monotone, 145 and c.e. continuous semimeasures, 151 discrete, see machine, process universal, 146 universal, and KM-complexity, 151 universal, and optimal c.e. continuous semimeasures, 151 universal, output probability, 257 oracle, see machine, Turing, oracle pinball, see pinball machine plain, see machine, Turing prefix-free, 122, 705 and measure, 232 KC Theorem, see KC Theorem universal, 122 universal by adjunction, 122 universal oracle, 123 universal oracle, U, 123 universal, and Solovay completeness, 409 universal, counting programs of a given length, 140 universal, counting short programs, 140 universal, fixed points, 139
universal, halting probability, see halting probability universal, output probability, 133, 229, 409 universal, output probability of a set, 229 universal, output probability, and maximal c.e. discrete semimeasures, 133 universal, output probability, and prefix-free Kolmogorov complexity, 133 universal, set of short halting programs, and Ω, 142 universal, set of short halting programs, complexity, 141–142 process, 146 quick, 282 universal, 146 s-measurable, 606 and strong s-Martin-Löf randomness, 606 and tests for strong s-Martin-Löf randomness, 606 self-delimiting, 122 strict process, 146 Turing, xix, 8 oracle, 16 universal, 9, 111 universal by adjunction, 111 universal oracle, 17, 114 with infinite outputs, 145 majority vote, 358 majorization, 67 Mandelbrot set, 636 many-one reducibility, see reducibility, many-one marker, 59 Marker, D., 478 Martin’s Conjecture, xxiv, 706 Martin, D., xxiv, 67–69, 80, 97–98, 100, 104, 288, 381, 496, 555, 575, 578, 706, 732, 735, 738 Martin, G., xiv Martin-Löf coverability, 534 and ∅′-trivialization, 536 cuppability, 534
and (strong) jump traceability, 685 and ∅′-trivialization, 536 weak, 534 null set, see null set, Martin-Löf randomness, see randomness, Martin-Löf test, see test, Martin-Löf bounded, see test, bounded Martin-Löf computably finite, see test, computably finite Martin-Löf finite, see test, finite Martin-Löf generalized, see test, generalized Martin-Löf
Martin-Löf, P., xiii, xvii, xix–xxi, 113, 136, 227, 231, 233–234, 254, 260, 269, 378 martingale, see gale, martingale mass problem, 344 and classes of positive measure, 344 and Turing reducibility, 344 of a Π01 class of positive measure, 345 of a Π01 class with upwards closure of positive measure, 346 of completions of PA, 345 of DNC functions, 347 of 1-random sets, 345–347 of sets that are 2-random or completions of PA, 346–347 reducibilities, see reducibility, weak and reducibility, strong Matijacevic, Y., 101 maximal computably enumerable discrete semimeasure, see semimeasure, discrete, computably enumerable, maximal set, 83 maximum length of agreement, see length of agreement, maximum Mayordomo, E., xv–xvi, 16, 270, 277, 308, 598, 636–641, 654 McNicholl, T., 368 measurable function, 242 measure and computability, 358 and computable enumerability, 358
meet, 4 meet-irreducibility, 346 meeting a class, 100 Merkle, W., xv, 120, 138–139, 243–244, 251, 254, 276, 279–280, 298–301, 304–306, 309–314, 325–326, 328, 330, 357, 366–369, 437, 456–458, 484, 500, 505, 576, 581, 655–656, 658–661, 689, 729 meta-box, see box, meta- Mihailović, N., 243–244, 279–280, 282, 325–326, 484, 562–564 Mileti, J., xv Miller, J., xiii–xv, xx–xxiv, 121, 125, 131, 155, 166, 168–169, 229, 250–252, 261–263, 276, 289, 309, 311–316, 319, 323, 327–330, 332, 334, 340–341, 348, 357, 368, 373, 379, 465–466, 473–481, 484–489, 494–497, 499, 505–506, 528, 535, 537–538, 584–586, 591, 601, 609–612, 615, 618–627, 635, 642, 651, 705, 707, 709–724, 726–727, 743 Miller, R., 43 Miller, W., 67–69, 100, 555 Mills, C., 530 mind-change function, see real, left computably enumerable, mind-change function minimal C-program, see program, minimal C- degree, 70 and Π01 classes, 79 and “almost everywhere” behavior, 401 and c.e. sets, 72 and c.e. traceability, 650 and diagonal noncomputability, 618 and DNCh , 618 and GL2 -ness, 72 and joins, 93 and jump inversion, 72 and lowness, 72 and 1-randomness, 259
C-, 458–459 K-, 459 of effective dimension 1, 626 of effective packing dimension 1, 645 rK-, 437, 459 rK-, and 1-randomness, 439 rK-, and initial segment complexity, 439 rK-, cardinality of the class, 439 van Lambalgen, 474 information content measure, see information content measure, minimal K-program, see program, minimal K- pair, 47 and “almost everywhere” behavior, 359 and fixed-point freeness, 92 of C-degrees, 457 of K-degrees, 458, 550–553 of c.e. degrees, 47 of c.e. degrees, and c.e. splittings, 53 of high c.e. degrees, 56 of K-trivial degrees, 504 of LR-degrees, 491 of monotone degrees, 460 of Solovay degrees, 415 Minimal Pair Theorem, 44 Minkowski dimension, see dimension, box counting modified box counting dimension, see dimension, box counting, modified modulus function, see computably enumerable, set, modulus function monotone complexity, see complexity, monotone cost function, see cost function, monotone degrees, see degrees, monotone machine, see machine, monotone reducibility, see reducibility, monotone
Montalbán, A., xiv–xv, 335–336, 458, 551–552 Moore, E., 323, 358 Moscow school of algorithmic information theory, 120 Mostowski, A., 93 Muchnik degrees, see degrees, weak Muchnik reducibility, see reducibility, weak Muchnik, A., 32, 34–36, 86, 344 Muchnik, An., xx, 155, 168, 263, 304, 305, 309–313, 471–473, 507–508, 738–739, 744–745, 749, 752, 762 multiple permitting, see permitting, multiple music, 715 Mycielski, J., 478 Myhill Isomorphism Theorem, 21 Myhill’s Theorem, 22 randomness-theoretic version, 410 Myhill, J., 21–22 n-c.e. randomness, see randomness, n-c.e. test, see test, n-c.e. n-computably enumerable set, 27 n-cover, see cover, n- n-fixed-point freeness, see fixed-point freeness, n- n-FPF, see fixed-point freeness, n- n-genericity, see genericity, n- n-packing, see packing, n- n-randomness, see randomness, n- Nabutovsky, A., 420 NCR, 336 NCRn , 336 Nerode, A., 21, 121, 746 never continuously random set, 7, 335–336 and countable Π01 classes, 336 Newton’s Method, 222 Ng, K., xiv–xvi, xxii, 219, 316–317, 319, 323, 339, 358, 362, 377, 535, 590, 650, 672–673, 689, 693 Nies, A., xi, xiii–xv, xxi–xxiv, 69, 120, 125, 228–229, 254, 258, 260–262, 276, 288–289,
Nitzpon, D., 586 Niveaufunktion, 600 nonascending path trick, 586 noncappable degree, 91 and prompt simplicity, 91 nonmonotonic betting strategy, see betting strategy, nonmonotonic nonmonotonic randomness, see randomness, nonmonotonic nonrandom strings, xxv for KM-complexity, 749 for monotone complexity, 749 for plain complexity, 738, 740, 743, 749 for prefix-free complexity, 743–745, 749 for process complexity, 749 for strict process complexity, 749, 761–763 nontrivial pseudo-jump operator, see pseudo-jump operator, nontrivial notion of forcing, see forcing, notion NP-completeness, see completeness, NP- nth jump, see jump, nth Nugent, R., xvi null set, xix, 5, 231 and (super)martingales, 235 Kolmogorov-Loveland, 310 and finite unions, 313 Martin-Löf, 231 relative to a measure, 264 Martin-Löf ν-, see null set, Martin-Löf, relative to a measure Π0n , see class, Π0n , null
O-notation, 3 obeying a cost function, see cost function, obeying oblivious scan rule, see scan rule, oblivious Odifreddi, P., xiii, xix, 8, 28, 72, 85, 208–209 Ω, xxi–xxii, 12, 142, 227–229, 408–410, 705 and Demuth randomness, 315 and initial segment complexity, 410 and K-reducibility, 469–470 and sets of short halting programs, 142 and Solovay reducibility, 408, 415 and the α function, 469–470 and tt-incompleteness, 333 and 2-randomness, 469–470 and vL-reducibility, 475 and ∅′, 228 approximation, 142, 228 as an operator, see Omega operator initial segment complexity, 469–470 left computable enumerability, 228 1-randomness, 228 rate of convergence, 467–468 relativized, 257, see also halting probability, relativized and C-reducibility, 479 and K-reducibility, 477 and vL-reducibility, 475 Solovay completeness, 408 wtt-completeness, 228 Omega operator, xxiv, 386, 706 and continuity, 723 and K-triviality, 719–721 and left-c.e. reals, 716–719, 721 and lower semicontinuity, 723 and lowness, 708 and lowness for Ω, 712 and 1-genericity, 723–724 and 2-randomness, 711, 724
failure of degree invariance, 718 range, 712, 724–727 spectrum, 717, 719, 726 and 1-randomness, 717 (ω-c.e.)3 , see diamond class, (ω-c.e.)3 ω-computably enumerable set, xxiv, 25, 27 and identity-bounded mind-change functions, 27 totally, see totally ω-c.e. degree ω-jump, see jump, ω- Ω-like real, 409, see also completeness, Solovay 1-completeness, see completeness, 1- 1-genericity, see genericity, 1- 1-randomness, see randomness, 1- and Solovay functions, 254 1-reducibility, see reducibility, 1- only computably presentable real, see real, left computably enumerable, only computably presentable open set, see set, open Σ0n test, see test, Σ0n , open open question, xxi, 19, 122, 140, 145, 147, 150, 205, 263, 290, 311, 322, 333, 341, 357, 362, 394, 424, 426, 432, 433, 459, 462, 473, 486, 488, 507, 534, 535, 541, 550, 553, 590, 591, 597, 604, 606, 627, 628, 650, 670, 672, 680, 689, 694, 700, 701, 715, 721, 723, 731, 732, 743, 761 operator computably enumerable, see computably enumerable, operator martingale, see gale, martingale, operator Omega, see Omega operator pseudo-jump, see pseudo-jump operator opponent, 31 optimal computably enumerable continuous semimeasure, see semimeasure, continuous,
computably enumerable, optimal operator, see computably enumerable, operator, optimal supermartingale, see gale, supermartingale, computably enumerable, optimal termgale, see gale, termgale, optimal oracle, 16 machine, see machine, Turing, oracle prefix-free machine, see machine, prefix-free, oracle Turing machine, see machine, Turing, oracle order, xiii, 271 complex set, see complex set exponential, 599–600 ordinal notation, 27 Ordnungsfunktion, 271 Osherson, D., 246, 249–250 outcome, see priority method, outcome outer measure, see measure, outer output probability, see machine, prefix-free, universal, output probability and machine, monotone, universal, output probability overgraph, 739 of KM-complexity, 761 of monotone complexity, 752 of plain complexity, 744 of prefix-free complexity, 744–745 of process complexity, 761 of strict process complexity, 761 Owings, J., 14 Oxtoby, J., 4 PA, see Peano Arithmetic PA degree, xxii, 84 above a low degree, 93 and bases for Π01 classes, 84, 86 and c.e. degrees, 87 and computable trees, 86 and effectively inseparable pairs, 86 and jump classes, 93 and lowness, 87
and minimal degrees, 87 and 1-genericity, 103 and 1-randomness, 337–340 and separating sets, 86 and {0, 1}-valued DNC functions, 89 upwards closure, 85 packing, 638 dimension, see dimension, packing n-, 638 Paris, J., 381, 401 partial function, see function, partial randomness, see randomness, weak s- and randomness, strong s- partial computable function, see computable, partial function randomness, see randomness, partial computable path, 72 isolated, 74 computability, 74 Pavan, A., 642 payoff function, see betting strategy, nonmonotonic, payoff function pb-density, see density, pb- pb-genericity, see genericity, pb- pb-reducibility, see reducibility, pb- Peano Arithmetic, 73, 84 mass problem of completions, see mass problem, of completions of PA perfect function tree, see tree, perfect function permitting and array noncomputability, 94, 445 c.e., 43 delayed, 61 Δ02, 43 multiple, 94, 445 simple, see permitting, c.e. permutation computable, see computable, permutation randomness, see randomness, permutation Π0n
class, see class, Π0n set, 23 test, see test, Π0n Π01 class, see class, Π01 Π02 null set, see null set, Π02 piecewise computable set, 56 trivial set, 53 thick subsets, 54 thick subsets, and highness, 53 pinball machine, 517 Pingrey, S., 368 Pippenger, N., 125 plain information content, see information content, plain Kolmogorov complexity, see complexity, plain Kolmogorov machine, see machine, Turing symmetry of information, see Symmetry of Information, for plain complexity polynomial time computable function, see function, polynomial time computable Turing reducibility, see reducibility, polynomial time Turing Porter, C., xvi, 333 Positselsky, S., 738, 752 Posner's trick, 48 Posner, D., 48, 64–65, 72, 101 Post's Problem, 32, 34, 36, 82–83, 501, 511 injury-free solution, 502 priority-free solution, 91, 502 Post's Program, 83 Post's Theorem, 25 Post, E., 17, 22, 24–25, 32, 34, 80, 82–83, 90 prefix Kolmogorov complexity, see complexity, prefix-free Kolmogorov prefix-free function, 122 computable, universal, 122 computable, universal by adjunction, 122 information content, see information content, prefix-free
Kolmogorov complexity, see complexity, prefix-free Kolmogorov machine, see machine, prefix-free set, 74, 121–122 weight, see weight, of a prefix-free set symmetry of information, see Symmetry of Information, for prefix-free complexity premeasure, 264 rational representation, 264 presentation, see real, left computably enumerable, presentation preservation of the use, see use, preservation strategy, see Sacks, preservation strategy primitive recursive function, 28 and computable functions, 28–29 principal function, 69 and hyperimmunity, 69 ideal, see ideal, principal priority method, 34–36 bounded injury, 38 finite injury, 34–36 follower, 36 guess, 44 infinite injury, 47 initialization, 35 injury, 35 outcome, 44 priority tree, 44, 55 protecting a set, 37 requiring attention, 36 restraint, 37 restraint function, 41 strategy, 44 true outcome, 46 true path, 46–47 unbounded injury, 39–40 visiting a node, 46 ordering, 35 tree, see priority, method, priority tree
priority-free solution to Post’s Problem, see Post’s Problem, priority-free solution probabilistic method, 632 probability a priori, see a priori, probability halting, see halting probability measure, see measure, probability output, see machine, prefix-free, universal, output probability and machine, monotone, universal, output probability theory, xxv process complexity, see complexity, process machine, see machine, process productive set, 22 program, 118 counting programs, 118 d-minimal, 141 counting, 141 minimal C-, 112, 578 complexity, 115 minimal K-, 124 complexity, 143–145 for a finite set, 141 prompt simplicity, 90, 215 and cappability, 92 and fixed-point freeness, 90 and low cuppability, 92 and noncappability, 91 and superlowness, 694 proper Σ01 -cover, see cover, Σ01 , proper protecting a set, see priority, method, protecting a set pseudo-jump inversion, see pseudo-jump operator, inversion pseudo-jump operator, 41 inversion, 41 basis theorem, see basis theorem, pseudo-jump inversion for 1-random sets, 376 for wtt-reducibility, 377 nontrivial, 41 pullback, 606 quick process
machine, see machine, process, quick randomness, see randomness, quick process Raichev, A., xvi, 219, 224, 424, 437, 439 Raisonnier, J., 555 random variable, 242 randomness almost simple, 308 and simple randomness, 308 and “almost everywhere” behavior, xxv, 359, 381 and computational power, xxiii, 228–229, 288–289, 317, 326, 332, 339, 360, 363, 370–371, 533, 713, 715 avoiding power, 370 hitting power, 370 and stochasticity, 303 arithmetic, 256 and ∅(ω) -computability, 324 C-, 112, 743 and strong K-randomness, 161–164, 715 and weak K-randomness, 133, 251 computable, xxi, 270, 600 and bounded machines, 282 and bounded Martin-L¨ of tests, 280 and Church stochasticity, 303 and computably graded tests, 280 and existence of limits for computable martingales, 270 and highness, 349–351 and initial segment complexity, 304–305 and left-c.e. reals, 351 and nonmonotonic randomness, 311 and 1-randomness, 275, 351 and 1-randomness, lowness, see lowness, for 1-randomness / computable randomness and partial computable randomness, lowness, see
randomness lowness, for partial computable randomness / computable randomness and quick process randomness, 283 and relative tt-Schnorr randomness, 571 and Schnorr randomness, 275, 351 and Schnorr randomness, lowness, see lowness, for computable randomness / Schnorr randomness and ultracompressibility, 305 and upwards closure, 340 and van Lambalgen’s Theorem, see van Lambalgen’s Theorem, for computable randomness and von Mises-Wald-Church stochasticity, 307 and weak 1-randomness, lowness, see lowness, for computable randomness / weak 1-randomness and weak 2-randomness, lowness, see lowness, for weak 2-randomness / computable randomness base, see base, for computable randomness lowness, see lowness, for computable randomness computably finite, 318 and total ω-c.e.-ness, 319 Δ02 , 600 Demuth, 315 and Δ02 -ness, 316 and difference randomness, 318 and generalized lowness, 316, 364 and hyperimmunity, 316 and Ω, 315 and ω-c.e.-ness, 315 and 1-randomness, 315 and 2-randomness, 315 and weak 2-randomness, 316 lowness, see lowness, for Demuth randomness difference, 316, 339, 534–535 and K-triviality, 533, 550
and Demuth randomness, 318 and martingales, 318 and 1-randomness, 317–318 and strict Demuth tests, 317 and the halting problem, 317 and weak 2-randomness, 318 base, see base, for difference randomness lowness, see lowness, for difference randomness extraction, xxiv, 608, 609, 618, 627 for strings, 635 extractor, see extractor finite, 318 and 1-randomness, 318 Hitchcock, 322 infinitely often C-, 260 and infinitely often strong K-randomness, 262–263, 715 and 1-randomness, 260 and 2-randomness, 261–262 infinitely often strong K-, xxiv, 262 and 2-randomness, 716 and infinitely often C-randomness, 262–263, 715 and 3-randomness, 262 and 2-randomness, 262–263 injective, 319 and 1-randomness, 319 Kastermans, 321 Kolmogorov-Loveland, see randomness, nonmonotonic Kurtz, see randomness, weak 1- Kurtz n-, see randomness, weak n- limit, 315 Martin-Löf, xix, 231, see also randomness, 1- and 1-randomness, 232 martingale s-, 604 measure, 404, 464 n-, xxi, 256 and C-reducibility, 479 and columns of a set, 259 and initial segment complexity, 257 and jumps, 324 and K-reducibility, 477
and (n − 1)st jumps, 363–364 and (n − 1)-fixed-point freeness, 371 and (n + 1)-randomness, 257 and nth jumps, 364 and 1-randomness relative to ∅(n−1), 256 and open Σ0n classes, 254 and open Σ0n tests, 256 and Π0n classes of positive measure, 259 and Π0n+1 null classes, 260 and plain complexity, 263 and prefix-free complexity, 263 and Σ0n classes, 254 and Σ0n+1 null classes, 260 and Σ0,∅(n−1)1 classes, 254 and (super)martingales, 257 and Turing reducibility, 332, 475 and vL-reducibility, 474 and weak (n + 1)-randomness, 360 and weak n-randomness, 288, 361–362 and ∅(n)-computability, 324 and 0-1 laws, 259 relative, 257–258 relative to a continuous measure, 336 relative to a measure, 264 relative, and joins, 257–258 single gap characterization, 487 n-c.e., 316 n-ν-, see randomness, n-, relative to a measure nonmonotonic, xxi, 310, 319 and autoreducibility, 341 and initial segment complexity, 312 and 1-randomness, 311, 314 and (partial) computable randomness, 311 and total betting strategies, 310 and van Lambalgen's Theorem, see van Lambalgen's Theorem, for nonmonotonic randomness of a closed set, xxv of a continuous function, xxv
randomness of a degree, 227 of a string, xx, 110–112, 132–133 collection of nonrandom strings, see nonrandom strings relative to a function, 111 1-, xix, xxi, 227, 600 and addition, 419 and approximations to left-c.e. reals, 416 and autocomplex sets, 368 and autoreducibility, 340 and C-reducibility, 425 and c.e. degrees, 337 and cardinality of K-degrees, 495 and Church stochasticity, 232 and cl-reducibility, 441, 448, 455 and complex sets, 368 and computable enumerability, 324–325 and computable functions, 299 and computable randomness, 275, 351 and computable randomness, lowness, see lowness, for 1-randomness / computable randomness and computing c.e. sets, 289 and decidable machines, 298 and degrees computing ∅ , 330 and Demuth randomness, 315 and diagonal noncomputability, 336, 347–349 and difference randomness, 317–318 and downward closure, 333 and downward closure under strong reducibilities, 333 and effective dimension, 599, 601, 609 and finite randomness, 318 and hyperimmune-freeness, 324, 356 and hyperimmunity, 455 and ibT-reducibility, 447 and indifferent sets, 341 and infinitely often C-randomness, 260
randomness and initial segment complexity, 229, 250–254, 466, 477, 479, 481, 484–485 and initial segments of the degrees, 333 and injective randomness, 319 and joins, 339 and jump inversion, 373 and K-comparability, 486 and k-immunity, 455 and K-incomparability, 480–481 and K-reducibility, 425 and Kastermans / Hitchcock randomness, 322 and KM -complexity, 239 and left-d.c.e. reals, 410 and lowness, 324 and Martin-L¨ of randomness, 232 and martingale processes, 244 and measure, 234 and minimal degrees, 259 and minimal pairs, 359 and minimal rK-degrees, 439 and monotone complexity, 227, 239, 460 and nonmonotonic randomness, 311, 314 and 1-genericity, 380 and 1-randomness relative to a measure, 267 and PA degrees, 337–340 and partial computable randomness, 304 and Π01 classes, 234, 324 and Π01 classes of positive measure, 259 and plain complexity, 252–254 and presentations of left-c.e. reals, 411 and process complexity, 227, 239 and pseudo-jump operators, 376 and rK-reducibility, 424–425 and Schnorr randomness, 349–351 and Schnorr randomness, lowness, see lowness, for 1-randomness / Schnorr randomness and shift complexity, 601
and Solovay completeness, 409–410 and Solovay functions, 412 and Solovay randomness, 234 and Solovay reducibility, 419 and strong K-belowness, 486–487 and (super)martingales, 236 and the law of large numbers, 232 and tt-reducibility, 333 and Turing reducibility, 259, 326, 332, 475–476 and 2-randomness, lowness, see lowness, for 2-randomness / 1-randomness and upwards closure, 338–339 and vL-reducibility, 474 and von Mises-Wald-Church stochasticity, 303 and weak K-randomness, 227, 251 and weak 1-randomness, 290, 356 and weak 1-randomness, lowness, see lowness, for 1-randomness / weak 1-randomness and weak 2-randomness, lowness, see lowness, for weak 2-randomness / 1-randomness and wtt-reducibility, 326 as a Σ02 class, 324 base, see base, for 1-randomness computable characterization, 243 gap phenomenon, see gap phenomenon, for 1-randomness lowness, see lowness, for 1-randomness mass problem, see mass problem, of 1-random sets no gap phenomenon, see gap phenomenon, lack, for 1-randomness plain complexity characterization, 252–254 plain complexity characterization, and Solovay functions, 327 relative to a computable measure, 264
relative to a continuous measure, 335–336 relative to a measure, 264 relative to a measure, and 1-randomness, 267 relative to a measure, and noncomputability, 334 relative to a measure, for atomic measures, 268 relative, and C-independence, 628 relative, and global independence, 628 relative, and minimal pairs, 337, 533 relativized, 245 relativized, and generalized lowness, 708 relativized, and n-randomness, 256 relativized, and Turing reducibility, 475 Schnorr’s critique, see Schnorr’s critique subsets, 347–349 1-ν-, see randomness, 1-, relative to a measure paradigm computational, xx, 226–227 measure-theoretic, xx, 226, 229–231 stochastic, 226, 229–231 unpredictability, xx, 226, 234–236 partial, see randomness, weak sand randomness, strong spartial computable, 303 and computable randomness, lowness, see lowness, for partial computable randomness / computable randomness and initial segment complexity, 304–305 and nonmonotonic randomness, 311 and 1-randomness, 304 and von Mises-Wald-Church stochasticity, 303
randomness lowness, see lowness, for partial computable randomness permutation, 319 quick process, 282 and computable randomness, 283 relative, 257 measure, see randomness, measure relative to a measure, xxi, xxv Schnorr, xxi, 271, 600 and c.e. degrees, 350 and c.e. sets, 660 and Church stochasticity, 330 and computable martingales, 273 and computable measure machines, 277 and computable randomness, 275, 351 and computable randomness, lowness, see lowness, for computable randomness / Schnorr randomness and decidable machines, 299 and finite total Solovay tests, 294 and highness, 349–351 and initial segment complexity, 658 and Kurtz arrays, 294 and Kurtz null tests, 294 and left-c.e. reals, 350–351 and 1-randomness, 349–351 and 1-randomness, lowness, see lowness, for 1-randomness / Schnorr randomness and Schnorr Hausdorff dimension, 657 and Schnorr tests, 272 and total Solovay tests, 275 and upwards closure, 340 and van Lambalgen’s Theorem, see van Lambalgen’s Theorem, for Schnorr randomness and weak 1-genericity, 356 and weak 1-randomness, 286, 290, 356 and weak 1-randomness, lowness, see lowness, for Schnorr randomness / weak 1-randomness
and weak 2-randomness, lowness, see lowness, for weak 2-randomness / Schnorr randomness base, see base, for Schnorr randomness lowness, see lowness, for Schnorr randomness relative to a computable measure, 658 relative to ∅′, 315 truth table lowness, see lowness, for Schnorr randomness, truth table Schnorr Δ02, 600 set-theoretic, 297 simple, 308 and almost simple randomness, 308 and Church stochasticity, 308 Solovay, 234 and 1-randomness, 234 strong K-, 132, 744 and C-randomness, 161–164, 715 strong s-Martin-Löf, 605 and KM-complexity, 605 and s-measurable machines, 606 and supermartingale s-randomness, 605 and weak s-randomness, 605 in h-spaces, 619 supermartingale s-, 604 and strong s-Martin-Löf randomness, 605 3- and infinitely often strong K-randomness, 262 truth table Schnorr relative, 570 and computable randomness, 571 2-, xxi, xxiv and “almost everywhere” behavior, 394 and CEA sets, 386 and Demuth randomness, 315 and density, 385 and generalized lowness, 363 and hyperimmunity, 382 and infinitely often C-randomness, 261–262
and infinitely often strong K-randomness, 262–263, 716 and initial segment complexity, 470, 478, 487–488 and K-reducibility, 469–470 and lowness for Ω, 713 and mass problems, 346–347 and Ω, 469–470, 478 and Omega operators, 711, 724 and 1-genericity, 385, 393 and 1-randomness, lowness, see lowness, for 2-randomness / 1-randomness and the α function, 470 and the γ function, 488 and the s function, 487 and time-bounded plain complexity, 261–262 and Turing reducibility, 475 and 2-randomness relative to a measure, 268 and vL-reducibility, 475 and weak lowness for K, 715 relative to a measure, 268 single gap characterization, 487–488 2-ν-, see randomness, 2-, relative to a measure weak, 308 weak Demuth, 315 weak K-, 132 and C-randomness, 133, 251 and 1-randomness, 227, 251 weak n-, xxi, 286 and (n − 1)-randomness, 360 and n-randomness, 288, 361–362 and Π0n classes of positive measure, 382 and Σ0,∅(n−2)1 classes, 287 and Σ0n Kurtz null tests, 287 and Σ0n−1 classes, 287 and Σ0,∅(n−2)2 classes, 287 and Solovay n-genericity, 298 and weak 1-randomness relative to ∅(n−1), 286 and ∅(n)-computability, 360 and ∅(n−1)-c.e. degrees, 361
as strong (n − 1)-randomness, 288 weak 1-, 271, 285 and bi-immunity, 295 and c.e. degrees, 295 and computable martingales, 290 and computable measure machines, 292 and computable randomness, lowness, see lowness, for computable randomness / weak 1-randomness and computably layered machines, 291 and decidable machines, 300–301 and hyperimmune-freeness, 356–357 and hyperimmunity, 289–290, 356 and k-immunity, 455 and Kurtz null tests, 286 and Kurtz-Solovay tests, 293 and left-c.e. reals, 295 and 1-randomness, 290, 356 and 1-randomness, lowness, see lowness, for 1-randomness / weak 1-randomness and only computably presentable left-c.e. reals, 411 and Schnorr randomness, 286, 290, 356 and Schnorr randomness, lowness, see lowness, for Schnorr randomness / weak 1-randomness and upwards closure, 340 and weak 1-genericity, 356 and weak 2-randomness, 357 and weak 2-randomness, lowness, see lowness, for weak 2-randomness / weak 1-randomness lowness, see lowness, for weak 1-randomness weak s-, 602 and effective dimension, 603 and Solovay s-tests, 603 and strong s-Martin-L¨ of randomness, 605
and (super)martingales, 604 and weak s-Martin-Löf randomness, 602 Martin-Löf, 602 Martin-Löf, and weak s-randomness, 602 Martin-Löf, in h-spaces, 619 weak 2-, 232, 287–288 and CEA sets, 393 and computable randomness, lowness, see lowness, for weak 2-randomness / computable randomness and computing c.e. sets, 289 and Δ02-ness, 288 and Demuth randomness, 316 and difference randomness, 318 and generalized Martin-Löf tests, 287 and hyperimmune-freeness, 357 and hyperimmunity, 382 and initial segments of the degrees, 357 and 1-genericity, 385 and 1-randomness, lowness, see lowness, for weak 2-randomness / 1-randomness and Schnorr randomness, lowness, see lowness, for weak 2-randomness / Schnorr randomness and the halting problem, 288–289 and van Lambalgen's Theorem, see van Lambalgen's Theorem, for weak 2-randomness and weak 1-randomness, 357 and weak 1-randomness, lowness, see lowness, for weak 2-randomness / weak 1-randomness and ∅′-hyperimmunity, 362 base, see base, for weak 2-randomness lowness, see lowness, for weak 2-randomness relative, and minimal pairs, 337, 359 rank, see class, Π01, rank
rational representation, see premeasure, rational representation rational-valued function, see function, rational-valued martingale, see gale, martingale, rational-valued r.e., see computably enumerable real, 2, 197 approximation, 198–199, 217 rate of convergence, 198 computable, 31, 197, 198 as a real closed field, 224 uniformly, 202 convention on reals, 3 Δ02, 199 as a real closed field, 224 left computably enumerable, xxi–xxii, 197, 199, 201–202, 705 and addition, 202, 419 and almost c.e. sets, 200 and C-independence, 628 and C-reducibility, 425 and cl-completeness, 442 and cl-reducibility, 448, 455 and cl-upper bounds, 442, 444 and computable randomness, 351 and effective packing dimension, 649 and ibT-reducibility, 447 and initial segment complexity, 440 and multiplication, 202 and Omega operators, 716–719, 721 and 1-randomness, 448 and Schnorr Hausdorff dimension, 660 and Schnorr randomness, 350–351 and Solovay completeness, 408–410 and strongly computably enumerable reals, 201 and von Mises-Wald-Church stochasticity, 328 and weak 1-randomness, 295
approximation, 199–200, 203, 206 approximation, and monotone reducibility, 432 approximation, and 1-randomness, 416 approximation, and rK-reducibility, 422 approximation, and Solovay reducibility, 405, 407–408, 415–416 C-degree, see degrees, Ccl-degree, see degrees, computable Lipschitz ibT-degree, see degrees, identity bounded Turing initial segment complexity and addition, 425 K-degree, see degrees, Kmind-change function, 422 minimal pair of Solovay degrees, 415 monotone degree, see degrees, monotone 1-randomness and initial segment complexity, 410 only computably presentable, 206, 210 only computably presentable, and 1-randomness, 411 only computably presentable, and weak 1-randomness, 411 presentation, 206 presentation, and initial segment complexity, 411 presentation, and prompt simplicity, 215 presentation, and wtt-ideals, 208–209 representation, 203 rK-degree, see degrees, relative K settling time, 422 Solovay degree, see degrees, Solovay standard reducibility, see reducibility, standard uniformly, 202 left cut, 197
left-c.e., see real, left computably enumerable left-d.c.e., 217 and K-triviality, 721–722 and 1-randomness, 410 as a field, 219 as a real closed field, 219 lowness, see lowness, for left-d.c.e. reals right computably enumerable, 199 and Solovay reducibility, 420 uniformly, 203 strongly computably enumerable, 200 and left computably enumerable reals, 201 real closed field, see field, real closed real-valued function, see function, real-valued Recursion Theorem, 13 and the KC Theorem, 127–128 for prefix-free machines, 123 with parameters, 14 recursive, see computable primitive, see primitive recursive function recursively enumerable, see computably enumerable reducibility, 15, 404, see also degrees C-, xxii, 405, 427 and addition, 426 and cl-reducibility, 420 and n-randomness, 479 and 1-randomness, 425 and rK-reducibility, 421 and Solovay reducibility, 405 and the countable predecessor property, 465 and Turing reducibility, 425, 456 and vL-reducibility, 479, 480 cl-, see reducibility, computable Lipschitz computable Lipschitz, xxii, 420, 427 and array computability, 444, 447 and C-reducibility, 420 and c.e. sets, 435, 436 and complex sets, 441
and ibT-reducibility, 443 and initial segment complexity, 440 and joins, 442, 444 and K-reducibility, 420 and 1-randomness, 441, 448, 455 and rK-reducibility, 422, 435–436 and Schnorr reducibility, 462 and Solovay reducibility, 433–435 and the Kučera-Slaman Theorem, 454–455 and tt-reducibility, 436 completeness, see completeness, computable Lipschitz equivalence, see equivalence under a notion of reducibility extended monotone, 473 ibT-, see reducibility, identity bounded Turing identity bounded Turing, 420 and array computability, 444, 447 and cl-reducibility, 443 and differential geometry, 420 and joins, 444 and 1-randomness, 447 K-, xxii, 404, 427 and addition, 426 and cl-reducibility, 420 and m-degrees, 464 and n-randomness, 477 and 1-randomness, 425, 480–481, 486 and rK-reducibility, 421 and Solovay reducibility, 405 and strongly K-belowness, 486 and the countable predecessor property, 464 and Turing reducibility, 456 and vL-reducibility, 477, 480 measure of lower cones, 478 Km-, see reducibility, monotone KM-, 169 LK-, xxiii, 494 and lowness for K, 508 and LR-reducibility, 494 low for K, see reducibility, LK- low for 1-randomness, see reducibility, LR-
reducibility LR-, xxii, 473, 509 and LK-reducibility, 494 and lowness for 1-randomness, 508 and non-GL2 -ness, 491 and Π01 classes, 497 and relative Δ02 -ness, 495 and relativized Σ01 classes, 489 and Σ02 classes, 497 and the countable predecessor property, 491 and Turing reducibility, 491 and (uniform) almost everywhere domination, 496–497, 499 and universal Martin-L¨ of tests, 489 and van Lambalgen reducibility, 473 m-, see reducibility, many-one many-one, 19 and effective dimension, 611 and the broken dimension question, 611 and Turing reducibility, 20 completeness, see completeness, many-one polynomial time, 19 Medvedev, see reducibility, strong monotone, 169, 432 and approximations, 432 Muchnik, see reducibility, weak 1-, 19 pb-, 107 polynomial time, see reducibility, many-one, polynomial time and reducibility, Turing, polynomial time relative K-, xxii, 421, 427 and addition, 424 and approximations, 421–422 and C-reducibility, 421 and cl-reducibility, 422, 435–436 and computing with advice, 421 and K-reducibility, 421 and 1-randomness, 424–425 and plain complexity, 421 and real closed fields, 424 and settling times, 422 and Solovay reducibility, 422, 435
and Turing reducibility, 423 rK-, see reducibility, relative K- S-, see reducibility, Solovay Schnorr, 462, 564 and cl-reducibility, 462 and strong truth table reducibility, 463 Σ03, 427 Solovay, xxii, 405, 427 and addition, 408, 413 and approximations, 405, 407–408, 415–416 and C-reducibility, 405 and c.e. sets, 435, 436 and cl-reducibility, 433–435 and K-reducibility, 405 and multiplication, 414 and 1-randomness, 419 and right-c.e. reals, 420 and rK-reducibility, 422, 435 and Turing reducibility, 406 and wtt-reducibility, 406 completeness, see completeness, Solovay relativized, 709 standard, 427 and density, 427, 430, 432 and splitting, 427 strong, 206, 344 and weak reducibility, 345 strong truth table, 463 and Schnorr reducibility, 463 strong weak truth table, see reducibility, computable Lipschitz sw-, see reducibility, computable Lipschitz truth table, 20 and cl-reducibility, 436 and hyperimmune-freeness, 69 and 1-randomness, 333 and totality, 21 and Turing reducibility, 69 and weak notions of randomness, 570 and wtt-reducibility, 334 completeness, see completeness, truth table conjunctive, 739
tt-, see reducibility, truth table Turing, 16 and C-reducibility, 425, 456 and effective dimension, 612 and hyperimmune-freeness, 69 and K-reducibility, 456 and LR-reducibility, 491 and m-reducibility, 20 and mass problem reducibilities, 344 and 1-randomness, 259 and rK-reducibility, 423 and Solovay reducibility, 406 and the broken dimension question, 612 and the countable predecessor property, 17 and tt-reducibility, 69 and vL-reducibility, 474 and wtt-reducibility, 70 completeness, see completeness, Turing measure of upper cones, 358 polynomial time, 16, 19 polynomial time, and complexity extraction, 642 relativization and measure, 360 uniform Schnorr, 462 van Lambalgen, xxii, 473 and C-reducibility, 479, 480 and joins, 474 and K-reducibility, 477, 480 and LR-reducibility, 473 and n-randomness, 474 and 1-randomness, 473, 474 and Turing reducibility, 474 vL-, see reducibility, van Lambalgen weak, 344 and strong reducibility, 345 weak truth table, 21 and effective dimension, 611 and initial segment complexity, 440 and Solovay reducibility, 406 and the broken dimension question, 611 and tt-reducibility, 334 and Turing reducibility, 70
completeness, see completeness, weak truth table with tiny use, 577 and anti-complex sets, 579 and bounding orders, 577 and computability, 577 and wtt-reducibility, 577 uniform, 577 uniform, and anti-complex sets, 579 uniform, and computability, 577 uniform, and domination, 577 uniform, and highness, 578 wtt-, see reducibility, weak truth table reduction, 15, 29–30 dynamic definition, 29 rules, 30 static definition, 29 regularity, Schnorr, 657 Reid, S., xv, 290–295, 581, 606 Reimann, J., xv–xvi, 7, 21, 265–266, 276, 309, 311–314, 328, 330–331, 334–336, 357, 465, 603–605, 609–611, 636–638, 641, 655–656, 658–661, 713 relative K- degrees, see degrees, relative K- reducibility, see reducibility, relative K- relative randomness, see randomness, relative relativization, 18, 245 relativized computable measure machine, see machine, computable measure, relativized hyperimmunity, see hyperimmunity, relativized KM-complexity, see complexity, KM, relativized Kučera-Slaman Theorem, see Kučera-Slaman Theorem, relativized lowness for 1-randomness, see lowness, for 1-randomness, relativized
monotone complexity, see complexity, monotone, relativized 1-randomness, see randomness, 1-, relativized Π0n class, see class, Π0n, relativized Π01 class, see class, Π01, relativized plain complexity, see complexity, plain Kolmogorov, relativized prefix-free complexity, see complexity, prefix-free Kolmogorov, relativized Σ0n class, see class, Σ0n, relativized Σ01 class, see class, Σ01, relativized Σ02 class, see class, Σ02, relativized Solovay reducibility, see reducibility, Solovay, relativized Remmel, J., xvi, 74 representation of a left-c.e. real, see real, left computably enumerable, representation of a real by a sequence relative to a measure, 266 and continuous measures, 267 request, see KC Theorem, request requirement, 31–32 requiring attention, see priority, method, requiring attention resource-bounded measure, see measure, resource-bounded restrained approximation, see approximation, restrained restraint, see priority, method, restraint function, see priority method, restraint function hatted, 57 Rettinger, R., 217, 410, 722 reverse mathematics, xxiii, xxv, 349, 495, 499 Rice's Theorem, 13 Rice, H., 13 right computably enumerable real, see real, right computably enumerable risking measure, 323, 381, 401 rK-degrees, see degrees, relative K-
rK-reducibility, see reducibility, relative K- Robinson, R., 58, 64–66 Rogers, H., xiii, 8, 28, 82, 209 Ronneburger, D., 743 Royden, H., 264, 400 Rumyantsev, A., 601 Rupprecht, N., 294 Ryabko, B., 598 s function, 467 and the α function, 468 and the γ function, 487 and 2-randomness, 487 S-degrees, see degrees, Solovay s-dimensional measure, see measure, s-dimensional outer Hausdorff measure, see measure, s-dimensional outer Hausdorff packing measure, see measure, s-dimensional packing s-gale, see gale, s-gale s-supergale, see gale, s-supergale s-m-n Theorem, 10 s-measurable machine, see machine, s-measurable s-measure, see measure, s- s-randomness, see randomness, weak s- and randomness, strong s- S-reducibility, see reducibility, Solovay s-success, see gale, (super)martingale, s-success s-termgale, see gale, termgale, s- s.j.t., see traceability, strong jump Sacks Density Theorem, see Density Theorem Jump Inversion Theorem, 65 uniformity, 66 preservation strategy, 40 Splitting Theorem, 39 Sacks' Theorem, 358, 531 Sacks, G., 39–40, 58, 65–66, 72, 83, 209, 323, 358, 363, 373, 534, 645, 711 Salomaa, A., 8
Sasso, R., 207 savings property, see gale, (super)martingale, savings property trick, see gale, (super)martingale, savings trick scan rule, see betting strategy, nonmonotonic, scan rule Schaefer, M., 77, 454 Schnorr Δ02 randomness, see randomness, Schnorr Δ02 Hausdorff dimension, see dimension, Hausdorff, Schnorr null set, see null set, Schnorr packing dimension, see dimension, packing, Schnorr randomness, see randomness, Schnorr reducibility, see reducibility, Schnorr uniform, see reducibility, uniform Schnorr regularity, see regularity, Schnorr s-null, see null set, Schnorr s- s-test, see test, Schnorr s- test, see test, Schnorr triviality, xxiii, 564 and anti-complex sets, 576, 580 and bases for randomness, 575 and computable enumerability, 565 and computable traceability, 572 and dense simple sets, 574 and highness, 564, 575 and joins, 574 and lowness for Schnorr randomness, 564 and plain complexity, 580 and tt-lowness for Schnorr randomness, 572 and Turing completeness, 564 and wtt-completeness, 568 and wtt-reducibility, 574 as a tt-ideal, 574 size of the class, 575 tt-downward closure, 569 Schnorr's critique, xxi, 243, 269
and martingale processes, 269 Schnorr, C., xiii, xxi, 125, 145–147, 227, 232, 236–240, 243, 257, 269–273, 276, 280, 303, 321, 350, 587, 594, 599, 604–605, 654 Scott Basis Theorem, see basis theorem, Scott Scott, D., 84, 87 selection, 302 domain, 309 function, 230, 246, 301 adaptive, 230 and martingales, 302 rule, see selection, function self-delimiting machine, see machine, self-delimiting Semenov, A., 246, 309–311 semicontinuity, lower, see lower semicontinuity semimeasure continuous, xiii, 150, 264 and discrete semimeasures, 150 and supermartingales, 238 computably enumerable, 150, 170 computably enumerable, and monotone machines, 151 computably enumerable, optimal, 150 computably enumerable, optimal, and Bayes' rule, 238 computably enumerable, optimal, and optimal c.e. supermartingales, 238 computably enumerable, optimal, and universal monotone machines, 151 subadditivity, 150 discrete, xiii, 133 and continuous semimeasures, 150 and limit frequencies, 472 computably enumerable, 129, 133 computably enumerable, maximal, 133
separating set–simple computably enumerable, maximal, and prefix-free Kolmogorov complexity, 133 computably enumerable, maximal, and universal prefix-free machine output probabilities, 133 separating set, 73 and PA degrees, 86 for an effectively inseparable pair, 86 sequence, 2 set, 2, see also computable, set; computably enumerable, set; etc. cover, see cover, of a set density, see density of a set domain, see domain, of a set of pairs eth column, see column of a set k-, see k-set of generators, see generators of an open set of nonrandom strings, see nonrandom strings open basic, 4 generators, see generators of an open set theory, 297, 555, 613 settling time domination, see domination, settling time function, see computably enumerable, set, modulus function of left-c.e. reals, see real, left computably enumerable, settling time Shannon, C., 323, 358 Shannon-Fano codes, 125 Shapiro, N., 323, 358 Shelah, S., 478, 555 Shen, A., xvi, 122, 147, 170, 246, 263, 303, 471–473, 601 shift complex set, 601 and 1-randomness, 601 and effective dimension, 601 and prefix-free complexity, 601
Index
  1-, 601
Shoenfield Jump Inversion Theorem, 65
Shoenfield’s Limit Lemma, see Limit Lemma
Shoenfield, J., 24–25, 29, 54, 56, 58, 65, 71
Shore, R., xvi, 15, 41–42, 64, 72, 91–92, 215, 376, 392, 421, 650
σ-algebra, 241
  Borel, 242
  generated by a class, 242
Σ0n
  class, see class, Σ0n
  Kurtz null test, see test, Σ0n Kurtz null
  (Martin-Löf) test, see test, Σ0n
  randomness, see randomness, n-
  set, 23
    and m-completeness, 25
    and relative computable enumerability, 25
    and the (n − 1)st jump, 25
Σ01
  class, see class, Σ01
  cover, see cover, Σ01
  function, see function, real-valued, computably enumerable
  Hausdorff dimension, see dimension, Hausdorff, effective
  martingale, see gale, martingale, computably enumerable
  set, and computable enumerability, 23
  supermartingale, see gale, supermartingale, computably enumerable
  termgale, see gale, termgale, Σ01
Σ03
  class, see class, Σ03
  ideal, see ideal, Σ03
  reducibility, see reducibility, Σ03
  wtt-ideal, see ideal, wtt-, Σ03
Silver’s Theorem, 478
Silver, J., 478
simple
  martingale, see gale, martingale, simple
  permitting, see permitting, c.e.
  randomness, see randomness, simple
simplicity, 12, 43
  and c.e. supersets, 83
  prompt, see prompt simplicity
Simpson, S., 69, 345–347, 364, 495–496, 696
Slaman, T., xv–xvi, xxii, 7, 21, 79, 93, 231, 243–244, 265, 279–280, 331, 334–336, 345, 349, 409–410, 550, 713
Slowdown Lemma, 15
Smith, C., 303
Smullyan, R., 86
Snir, M., 287, 361
Soare, R., xiii, xvi, xix, 8, 14–15, 27, 29, 36, 43, 44, 57–58, 65, 69, 77–79, 81–84, 86–89, 91–92, 198–199, 201, 204, 215, 370, 374, 420–421, 503, 534
Solomon, R., xxiii, 494, 497, 499
Solomonoff, R., xiii, xx, 111–112, 133, 145–146
Solovay
  completeness, see completeness, Solovay
  degrees, see degrees, Solovay
  forcing, see forcing, Solovay
  function, 139
    and K-triviality, 505, 689
    and Ω, 412
    and 1-randomness, 254, 412
    and plain complexity characterizations of 1-randomness, 327
    nondecreasing, 413
    Solovay’s, 139, 252, 505
  n-generic, see generic, Solovay n-
  randomness, see randomness, Solovay
  reducibility, see reducibility, Solovay
  test, see test, Solovay
    total, see test, total Solovay
Solovay, R., xi–xiii, xvi, xx, xxii, xxv, 85–89, 137–145, 147, 155–156, 160–164, 166–168, 229, 234, 251, 262, 275, 288, 296, 340, 370, 404–405, 408–409, 413, 466, 468–471, 487, 501, 503, 505, 534, 555, 715, 728–729, 732, 743–744
Sorbi, A., 347
Soskova, M., 489–491, 499
Space Lemma, 325, 331, 350
sparsity, 437, 456
special Π01 class, see class, Π01, special
Spector, C., 70–71, 358
spectrum of an Omega operator, see Omega operator, spectrum
splitting
  c.e., 39
    and cone avoidance, 40
    and minimal pairs, 53
  of the left cut of a real, 203–205
  Δ02 sets, 40
  theorem, see Sacks, Splitting Theorem
  tree, see tree, e-splitting
Staiger, L., 136, 598, 604–605, 636–637
stake function, see betting strategy, nonmonotonic, stake function
standard
  cost function, see cost function, standard
  reducibility, see reducibility, standard
static definition, see reduction, static definition
statistical test, see test, statistical
Stephan, F., xvi, xxi–xxiii, 120, 229, 260–262, 273, 276, 303, 309, 311–314, 328, 330, 337–339, 349–351, 356–357, 366–369, 382, 411, 425–426, 437, 439–440, 455–458, 465, 491, 500–501, 503–504, 510, 511, 523, 526, 529–531, 533–534, 537, 539–542, 554, 559–560, 570–572, 574–585, 591, 605, 628, 643, 668–671, 708, 711–713, 715
Stephenson, J., xvi
Stillwell’s Theorem, xxii, 365
Stillwell, J., xxii, 323, 359–360, 363, 365
Stob, M., 83, 93–98, 107–109, 205, 530
stochastic paradigm, see randomness, paradigm, stochastic
stochasticity, xxi
  and randomness, 303
  Church, xviii, 230, 302
    and computable randomness, 303
    and 1-randomness, 232
    and Schnorr randomness, 330
    and simple randomness, 308
    no gap phenomenon, see gap phenomenon, lack, for Church stochasticity
  computable, see stochasticity, Church
  for a class, 302
  time-bounded, 308
  von Mises-Wald-Church, 230, 302
    and computable enumerability, 305
    and computable randomness, 307
    and initial segment complexity, 306, 328
    and 1-randomness, 303
    and partial computable randomness, 303
    and simple martingales, 308
strategy, see priority method, strategy
strict Demuth test, see test, strict Demuth
strict process
  complexity, see complexity, process, strict
  machine, see machine, process, strict
string, 2
  as an oracle, 16
  comparability, 170
  incomparability, 170
  nonrandom, see nonrandom strings
  random, see randomness, of a string
  terminated binary, 663
strong
  array, 69
    very, see very strong array
  degrees, see degrees, strong
  jump traceability, see traceability, strong jump
  jump well-approximability, see well-approximability, strong jump
  K-randomness, see randomness, strong K-
  minimal cover, 100
    and array computability, 100
  reducibility, see reducibility, strong
  s-Martin-Löf randomness, see randomness, strong s-Martin-Löf
  s-success, see gale, martingale, strong s-success
  truth table reducibility, see reducibility, strong truth table
  weak truth table degrees, see degrees, computable Lipschitz
  weak truth table reducibility, see reducibility, computable Lipschitz
Strong Thickness Lemma, see Thickness Lemma, Strong
strongly K-below, 486
  and K-reducibility, 486
  and 1-randomness, 486–487
strongly computably enumerable real, see real, strongly computably enumerable
subadditivity, 132
success
  of a martingale process, see gale, martingale process, success
  of a nonmonotonic betting strategy, see betting strategy, nonmonotonic, success
  of a (super)martingale, see gale, (super)martingale, success
    relative to an order, see gale, (super)martingale, h-success
    s-, see gale, (super)martingale, s-success
  set of a (super)martingale, see gale, (super)martingale, success set
  set of a (super)martingale relative to an order, see gale,
    (super)martingale, h-success set
  strong s-, see gale, (super)martingale, strong s-success
Sullivan, D., 639
superhigh set, 42, 694
  c.e., 42
superhigh♦, see diamond class, superhigh♦
Superlow Basis Theorem, see basis theorem, superlow
superlow set, xxiv, 21, 693–694
  and array computability, 98
  and joins, 530
  and jump traceability, 523
  and Π01 classes, 78
  and prompt simplicity, 694
  and strong jump traceability, 694
  c.e., 39
superlow♦, see diamond class, superlow♦
supermartingale, see gale, supermartingale
svelte tree, see tree, svelte
sw-degrees, see degrees, computable Lipschitz
sw-reducibility, see reducibility, computable Lipschitz
Symmetry of Information, xx
  for plain complexity, 116
  for prefix-free complexity, 134
Tadaki, K., 602–603
tailset, 6
Tennenbaum, S., 87
termgale, see gale, termgale
terminated binary string, see string, terminated binary
termination symbol, 121, 663
Terwijn, S., xv, xxi–xxiii, 5, 99–100, 209, 229, 260–262, 303, 345, 349–351, 356, 382, 474, 489, 508–510, 524, 526, 554–555, 559, 599, 604–606, 610–611, 712, 713, 715
test
  bounded Martin-Löf, 279
    and computable martingales, 280
    and computable randomness, 280
    and computably graded tests, 280
  computably finite Martin-Löf, 318
  computably graded, 279
    and bounded Martin-Löf tests, 280
    and computable martingales, 280
    and computable randomness, 280
  d.c.e., 316
  Δ0n, 256
  Demuth, 315
    strict, see test, strict Demuth
  finite Martin-Löf, 318
  finite total Solovay, see test, Solovay, finite total
  for strong s-Martin-Löf randomness, 605
    and s-measurable machines, 606
    in h-spaces, 619
  for weak s-Martin-Löf randomness, 602
    in h-spaces, 619
  generalized Martin-Löf, 231, 287
    and weak 2-randomness, 287
    lack of a universal test, 289
    lowness, see lowness, for generalized Martin-Löf tests
  ignorance, see ignorance test
  Kurtz null, 286, 294
    and Schnorr randomness, 294
    and weak 1-randomness, 286
    lowness, see lowness, for Kurtz null tests
  Kurtz-Solovay, 293
    and weak 1-randomness, 293
  Martin-Löf, xxi, 231, 269
    and prefix-free complexity, 232–233
    and (super)martingales, 236
    bounded, see test, bounded Martin-Löf
    computably finite, see test, computably finite Martin-Löf
    finite, see test, finite Martin-Löf
    generalized, see test, generalized Martin-Löf
    lowness, see lowness, for Martin-Löf tests
    nested, 231
    ν-, see test, Martin-Löf, relative to a measure
    relative to a measure, 264
    Σ0n, see test, Σ0n
    universal, 233, 409
    universal oracle, 245, 489
    universal, and LR-reducibility, 489
    universal, Kučera’s, 331
  n-c.e., 316
  Π0n, 256
  Schnorr, 272
    and Schnorr null sets, 272
    and Schnorr randomness, 272
    lack of a universal test, 276
  Schnorr s-, 654
  Σ0n, 256
    open, 256
    open, and n-randomness, 256
  Σ0n Kurtz null, 287
    and weak n-randomness, 287
  Solovay, 234, 256
    finite total, 294
    finite total, and Schnorr randomness, 294
    total, 275, 294
    total, and Schnorr randomness, 275
  Solovay s-, 603
    and weak s-randomness, 603
    in h-spaces, 619
    interval, 604
    total, 655
    total, and Schnorr s-null sets, 655
  statistical, xix, 231, 233
  strict Demuth, 317
    and difference randomness, 317
  total Solovay, see test, Solovay, total
theory
  “almost all”, see “almost all” theory of the Turing degrees
  complete extensions, see class, Π01, and complete extensions of a theory
  of the Solovay degrees, see degrees, Solovay, theory
  of the Turing degrees, see degrees, Turing, theory
  of the vL-degrees, see degrees, van Lambalgen, theory
thick subset, 53
  of a c.e. set, 56, 58
  of a piecewise trivial set, 54
    and highness, 53
Thickness Lemma, 56, 57
  Strong, 58
  weak form, 54
tight Σ01-cover, see cover, Σ01, tight
time-bounded
  prefix-free Kolmogorov complexity, see complexity, prefix-free Kolmogorov, time-bounded
  stochasticity, see stochasticity, time-bounded
tiny use, see reducibility, with tiny use
Tot, see index set, Tot
total
  function, see function, total
  Solovay s-test, see test, Solovay s-, total
  Solovay test, see test, total Solovay
totally ω-c.e. degree, 319
  and computably finite randomness, 319
trace, see traceability, trace
traceability
  bound, 98
  computable, 100, 555
    and c.e. traceability, 560
    and diagonal noncomputability, 584
    and hyperimmune-freeness, 100, 555, 559, 560
    and lowness for computable measure machines, 562
    and lowness for Kurtz null tests, 584
    and lowness for 1-randomness / Schnorr randomness, 591
    and lowness for Schnorr randomness, 561
    and lowness for Schnorr tests, 555
    bound independence, 100, 555
    of a tt-degree, 572
    of a tt-degree, and c.e. traceability of wtt-degrees, 576
    weak, see traceability, weak computable
  computably enumerable, 98
    and array computability, 99
    and computable measure machines, 563
    and computable traceability, 560
    and effective packing dimension, 649–650
    and extending partial functions, 585
    and hyperimmune-freeness, 560
    and initial segment complexity, 649
    and lowness for 1-randomness / Schnorr randomness, 591
    and lowness for weak 2-randomness / Schnorr randomness, 591
    and minimal degrees, 650
    and null sets, 559
    bound independence, 99
    of a wtt-degree, 576
    of a wtt-degree, and anti-complex sets, 578
    of a wtt-degree, and computable traceability of tt-degrees, 576
  jump, 523, 669, 700–701
    and benign cost functions, 690–691
    and joins, 673
    and K-triviality, 523, 680, 685, 689, 700
    and lowness for K, 680
    and ML-cuppability, 685
    and Π01 classes, 694
    and superlowness, 523
    for an order, 523
  strong jump, xxiv, 669
    and benign cost functions, 690
    and completions of PA, 695
    and computable enumerability, 700
    and joins, 673
    and jump well-approximability, 670–671
    and K-triviality, 680, 685, 700
    and lowness for K, 680
    and ML-cuppability, 685
    and (ω-c.e.)♦, 689, 693, 694
    and Π01 classes, 694
    and strong jump well-approximability, 670
    and superhigh♦, 694
    and superlow♦, 689, 693–694
    and superlowness, 694
    downward closure, 670
    ideal of s.j.t. c.e. sets, 672
    index set, 672
    plain complexity characterization, 671
  trace, 98
    universal, 99, 691
  ultra jump, 672
  weak computable, 369
    and autocomplex sets, 369
Trakhtenbrot, B., 340
trash quota, 513
tree, 72
  e-splitting, 71
  ε-clumpy, 646
  function, 68
  perfect function, 646
  svelte, 584
Tricot, P., 639
trivial measure, see measure, trivial
triviality
  K-, see K-triviality
  Schnorr, see Schnorr, triviality
trivialization, 535
  ∅′-trivializing set, 535–536
    and LR-reducibility, 535
    and ML-coverability, 536
    and ML-cuppability, 536
    and (uniform) almost everywhere domination, 535
  ∅′-trivializing♦, see diamond class, ∅′-trivializing♦
true
  outcome, see priority method, true outcome
  path, see priority method, true path
truth table, 20
  functional, see functional, truth table
  ideal, see ideal, tt-
  lowness for Schnorr randomness, see lowness, for Schnorr randomness, truth table
  reducibility, see reducibility, truth table
    strong, see reducibility, strong truth table
  Schnorr relative randomness, see randomness, truth table Schnorr relative
tt-completeness, see completeness, truth table
tt-ideal, see ideal, tt-
tt-lowness for Schnorr randomness, see lowness, for Schnorr randomness, truth table
tt-reducibility, see reducibility, truth table
tt-Schnorr relative randomness, see randomness, truth table Schnorr relative
Turetsky, D., xvi
Turing
  completeness, see completeness, Turing
  degree, see degrees, Turing
  functional, see functional, Turing
  jump, see jump
  machine, see machine, Turing
  reducibility, see reducibility, Turing
Turing, A., 8, 198
Turing-Church Thesis, see Church-Turing Thesis
2-genericity, see genericity, 2-
2-randomness, see randomness, 2-
ultra jump traceability, see traceability, ultra jump
ultracompressibility, 305
  and computable randomness, 305
unbounded injury, see priority method, unbounded injury
uniform
  almost everywhere domination, see domination, almost everywhere, uniform
  complexity, see complexity, uniform
  measure, see measure, uniform
  reducibility with tiny use, see reducibility, with tiny use, uniform
  Schnorr reducibility, see reducibility, uniform Schnorr
  wtt-density, see density, uniform wtt-
uniformly
  computable
    functions, see computable, function, uniformly
    reals, see real, computable, uniformly
    sets, see computable, set, uniformly
  computably enumerable sets, see computably enumerable, set, uniformly
  Δ01 classes, see class, Δ01, uniformly
  left-c.e. reals, see real, left computably enumerable, uniformly
  partial computable functions, see computable, partial function, uniformly
  Π0n classes, see class, Π0n, uniformly
  Π01 classes, see class, Π01, uniformly
  right-c.e. reals, see real, right computably enumerable, uniformly
  Σ0n classes, see class, Σ0n, uniformly
  Σ01 classes, see class, Σ01, uniformly
  Σ03 index set, see index set, uniformly Σ03
universal
  by adjunction, see machine, prefix-free, universal by adjunction and machine, Turing, universal by adjunction
  c.e. martingale, see gale, martingale, computably enumerable, universal
  computable prefix-free function, see prefix-free, function, computable, universal
  generalized Martin-Löf test
    nonexistence, see test, generalized Martin-Löf, lack of a universal test
  Martin-Löf test, see test, Martin-Löf, universal
  monotone machine, see machine, monotone, universal
  oracle
    Martin-Löf test, see test, Martin-Löf, universal oracle
    prefix-free machine, see machine, prefix-free, universal oracle
    Turing machine, see machine, Turing, universal oracle
  partial computable function, see computable, partial function, universal
  Π01 class, see class, Π01, universal
  prefix-free machine, see machine, prefix-free, universal
  process machine, see machine, process, universal
  Schnorr test
    nonexistence, see test, Schnorr, lack of a universal test
  trace, see trace, universal
  Turing machine, see machine, Turing, universal
unpredictability paradigm, see randomness, paradigm, unpredictability
upper box counting dimension, see dimension, box counting, upper
upper modified box counting dimension, see dimension, box counting, modified, upper
uppersemilattice, 4
use, 17
  computable, see reducibility, weak truth table
  conventions, 17
  identity, see reducibility, identity bounded Turing
    up to a constant, see reducibility, computable Lipschitz
  notation, 17
  preservation, 34
  tiny, see reducibility, with tiny use
Use Principle, 18
useless information, 715
Ushakov, M., 601
Uspensky, V., 69, 122, 147, 170, 238, 246, 309–311
van Lambalgen
  degrees, see degrees, van Lambalgen
  reducibility, see reducibility, van Lambalgen
van Lambalgen’s Theorem, xxi, 258, 473
  for computable randomness, 276, 357
  for n-genericity, 378
  for nonmonotonic randomness, 314
  for Schnorr randomness, 276, 357
  for weak 2-randomness, 358
van Lambalgen, M., xix, xxi, 110, 227, 257–258, 265, 508
van Melkebeek, D., 743
Vereshchagin, N., xvi, 263, 471–473, 732, 737
very strong array, 94
Ville’s Theorem, xviii–xix, xxi, 230, 246
  and the fluctuation about the mean, 249–250
  finite version, 246
Ville, J., xviii–xix, 230, 232, 235–236, 246
Vinodchandran, N. V., 642
visiting a node, see priority method, visiting a node
Vitányi, P., xi–xii, xvi, xx, 110, 113, 116, 122, 132, 136–137, 150, 227, 238, 260, 265, 304, 738
vL-degrees, see degrees, van Lambalgen
vL-reducibility, see reducibility, van Lambalgen
von Braunmühl, B., 217
von Mises, R., xvii–xix, 229–230, 301
von Mises-Wald-Church stochasticity, see stochasticity, von Mises-Wald-Church
Wald, A., xviii, 230
Wang, F., 642
Wang, Y., 200, 275, 286–287, 290, 303, 308, 349–350, 409, 413, 706
weak
  computable traceability, see traceability, weak computable
  degrees, see degrees, weak
  Demuth randomness, see randomness, weak Demuth
  K-randomness, see randomness, weak K-
  lowness for K, see lowness, for K, weak
  Martin-Löf cuppability, see Martin-Löf, cuppability, weak
  n-genericity, see genericity, weak n-
  n-randomness, see randomness, weak n-
  1-randomness, see randomness, weak 1-
  randomness, see randomness, weak
  reducibility, see reducibility, weak
  s-Martin-Löf randomness, see randomness, weak s-Martin-Löf
  s-randomness, see randomness, weak s-
  2-randomness, see randomness, weak 2-
weak truth table
  functional, see functional, weak truth table
  ideal, see ideal, wtt-
  reducibility, see reducibility, weak truth table
Weber, R., xv–xvi, 288–289, 537
weight
  of a k-set, 514
  of a KC request, 126
  of a KC set, see weight, of a set of KC requests
  of a prefix-free set, 201
  of a set of KC requests, 126
Weihrauch, K., 201, 217, 219
Weinberger, S., 420
Weinstein, S., 246, 249–250
Welch, L., 53
well-approximability, 670
  jump, 670
    and strong jump traceability, 670–671
  strong jump, 670
    and strong jump traceability, 670
Wigderson, A., 642
window lemma, 57
witness, 33
wtt-completeness, see completeness, weak truth table
wtt-cuppability, 82
  and hypersimplicity, 82
wtt-ideal, see ideal, wtt-
wtt-reducibility, see reducibility, weak truth table
wtt-topped degree, 207
Wu, G., xv–xvi, 206, 218, 377, 411, 576–580
Yates, C., 47, 72, 210, 650
Young, C., xvi
Yu, L., xiv–xvi, xxi–xxii, 103, 131, 229, 250–252, 262, 276, 288–289, 327, 329, 332, 356–357, 378, 442, 464–466, 473–481, 484–487, 505–506, 534, 537, 582–586, 591
Yue, Y., xvi
Zambella, D., xxiii, 98–100, 501, 505, 508, 524, 554–555, 559
0-1 law, 5
  effective, xxi
    for complexity classes, 642
    for effective packing dimension, 642
    for Π0n classes, 259
    for Σ0n classes, 259
  Kolmogorov’s, 6
∅′-trivializing sets, see trivialization, ∅′-trivializing sets
∅′-trivializing♦, see diamond class, ∅′-trivializing♦
Zheng, X., xvi, 201, 217–219, 308
Zimand, M., xvi, xxiv, 627–629, 635
Zvonkin, A., 146, 266