Chapter 22

LOGIC, RANDOMNESS AND COGNITION

Michel de Rougemont
Université Paris-II
Abstract
Many natural intensional properties of artificial and natural languages are hard to compute. We show that randomized algorithms are often necessary to obtain good estimators of natural properties and to verify some specific relations. We concentrate on the reliability of queries to show the advantage of randomized algorithms in uncertain cognitive worlds.

1.

Introduction
Classical studies in complexity theory consider deterministic or non-deterministic algorithms on perfect data, and often privilege a worst-case analysis to classify problems as easy or hard. In recent years, important developments in theoretical computer science have shown the fundamental role of randomness in computing, in at least three different settings:
- randomized algorithms for search and decision problems;
- models for randomized verification, i.e. given a function f and two values a, b, decide whether f(a) = b;
- average-case analysis on the inputs.
We believe that new ideas are emerging that could turn out to be quite relevant to the cognitive sciences, when we try to estimate intensions associated with natural or artificial languages. One fundamental aspect of computation in the context of cognitive science is the ability to deal with uncertainty. We will show that randomized techniques are quite efficient in uncertain situations. We refer to intensions as properties other than the truth-value (or extension) of a formula, and concentrate in the sequel on the notion of
D. Vanderveken (ed.), Logic, Thought & Action, 497–506. © 2005 Springer. Printed in The Netherlands.
reliability. Let us fix the universe as a large finite structure Un of a class K, where n is its size, with functions, relations and higher-order objects. Let L be the vocabulary associated with the class K of such structures. If we fix a language with a denotational semantics, like standard first-order logic (FO(L)), the truth value is well defined but of limited interest in cognitive studies. Other properties, usually defined inductively on a structure, may be more relevant. These intensions are in general hard to compute (in the algorithmic sense), as we will see on some examples. For artificial languages used in computer science, two natural intensions are the complexity, and the reliability when we deal with uncertain data. In the sequel we concentrate on the reliability question and show how to use randomness to estimate a classical property: graph reliability. We also mention that this property may be easy on the average for some specific distributions (natural Gaussian distributions). For other intensional properties, one would conjecture similar results. In section 2 we introduce the reliability of a query as a basic intension. In section 3, we define random computations and describe some randomized algorithms. In section 4, we mention some classical results related to the verification of properties that are hard to compute. In section 5, we discuss the role of average-case complexity.
2.
The intension of queries: reliability as an example
A query on a class K is a function which associates with every Un ∈ K a relation of fixed arity on Un. If the arity is 0, we have boolean queries, which are true or false; such a query is also called a global relation on a class in the literature. A query is definable in a logic L if there exists a formula ψ ∈ L such that for all Un ∈ K, the relation defined by the query is precisely [ψ]^Un, i.e. the relation defined by the formula. The arity of the query is the number of free variables of the formula. For simplicity, we concentrate on the following property of queries defined by a formula ψ, the reliability ρ(ψ) introduced in [dR95]: given a structure Un and a random substructure U′n (the uncertain world), ρ(ψ) is the probability that the truth-value [ψ]^U′n coincides with the truth-value [ψ]^Un. Consider a finite relational database; for the sake of simplicity we assume the database to be a finite graph Gn = (Vn, E) with n nodes, where E ⊆ Vn² is the set of edges. Let δ : E → [0, 1] be the uncertainty function, where we interpret δ(e) as the probability that the edge e exists. The probabilistic space induced by Gn and δ is the set of all subgraphs
G′n = (V, E′) of Gn, with probability

Prob(G′) = [∏_{e∈E′} δ(e)] · [∏_{e∈E−E′} (1 − δ(e))].

Let Qδ be the random variable defining the (boolean) query Q on the probabilistic space induced by Gn and δ. We denote the mathematical expectation of this random variable by E(Qδ). A distribution µ defines a different probabilistic space: it assigns, for a given n, the probability of Gn.

Definition 1. The reliability of a boolean query Q on a graph Gn is the function: ρ(Q, Gn) = 1 − Eδ(|Qδ − Q|). The reliability of a query Q on a distribution µ is the function: ρ(Q, n) = Eµ[ρ(Q, Gn)].

This definition considers only boolean queries but generalizes to queries of arbitrary arity.

Example: Let G5 be the graph below with 5 nodes and Q the query defined by the first-order formula: ∃x, y, z (zEx ∧ xEy ∧ yEz)
The graph G5 with uncertain edges.
Assume δ(e) = 1/2 for every edge e. The value of ρ(Q, G5) is the probability that a realization, i.e. a subgraph of G5, contains a triangle. The reliability is hard to compute because we have to analyze all possible subgraphs G′n, i.e. exponentially many, and check the property on each one. For many queries Q, there seems to be no better way than this exhaustive computation. The reliability of even first-order definable queries is hard to compute, and is not known to be computable in polynomial time.
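Definition 1 can be checked by brute force on a small instance. The sketch below enumerates all 2^|E| subgraphs, exactly the exhaustive computation described above; the 5-node graph is hypothetical (the figure is not reproduced here), and the query is simplified to the existence of an undirected triangle.

```python
import itertools

def has_triangle(nodes, edges):
    """Check whether the (undirected) edge set contains a triangle."""
    es = set(edges)
    return any((a, b) in es and (b, c) in es and (a, c) in es
               for a, b, c in itertools.combinations(sorted(nodes), 3))

def reliability(nodes, edges, query, delta=0.5):
    """Exact reliability rho(Q, G): sum the probabilities of all
    subgraphs on which the query agrees with its value on the full
    graph.  Enumerates all 2^|E| subgraphs -- exponential, as noted."""
    full = query(nodes, edges)
    rho = 0.0
    for k in range(len(edges) + 1):
        for sub in itertools.combinations(edges, k):
            p = delta ** len(sub) * (1 - delta) ** (len(edges) - len(sub))
            if query(nodes, sub) == full:
                rho += p
    return rho

# A hypothetical 5-node graph containing a single triangle 1-2-3:
nodes = [1, 2, 3, 4, 5]
edges = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5)]
r = reliability(nodes, edges, has_triangle)
print(r)  # the triangle survives iff its 3 edges do: (1/2)^3 = 0.125
```

With only one triangle in the graph, the reliability is the probability that its three edges all survive; already at a few dozen edges, the enumeration becomes infeasible, which motivates the randomized estimators below.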
3.
Randomized Computations
There are many equivalent definitions of randomized computations. Consider a computing device, a Turing machine or a RAM (Random Access Machine), with two inputs: the real input x of length n and an auxiliary binary input y = y1...ym, the random sequence. The probabilistic space is the set of such y with the uniform probability 1/2^m, i.e. each yi = 0 or yi = 1 is chosen with the same probability 1/2. In the case of decision problems, the machine accepts (M(x, y) = 1) or rejects (M(x, y) = 0). We say that M accepts a language L if, for some fixed ε > 0:
- if x ∈ L then Prob_y[M(x, y) = 1] ≥ 1/2 + ε;
- if x ∉ L then Prob_y[M(x, y) = 0] ≥ 1/2 + ε.
The most classical complexity class (see [Pap94; LdR96]) is the class BPP, where M accepts or rejects deterministically in polynomial time. We can also define a probabilistic run on the input x: it first produces y and then runs M(x, y). Notice that the error can be made exponentially small (of order 1/2^k) by repeating the computation O(k) times and taking a majority vote. In particular it can be made negligible compared to the inherent reliability of hardware components.
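The majority-vote amplification can be seen empirically. The sketch below simulates a hypothetical BPP-style machine that answers correctly with probability 3/4 (the machine and its "language" are illustrative, not from the text) and shows the error of the majority vote shrinking as the number of repetitions grows.

```python
import random

def noisy_machine(x, p_correct=0.75):
    """Stand-in for a bounded-error machine: returns the right answer
    with probability p_correct > 1/2 (the 'right answer' here is
    simply whether x is even -- an illustrative language)."""
    truth = (x % 2 == 0)
    return truth if random.random() < p_correct else not truth

def amplified(x, k):
    """Run the machine k times (k odd) and take a majority vote."""
    votes = sum(noisy_machine(x) for _ in range(k))
    return votes * 2 > k

random.seed(0)
trials = 2000
err = {k: sum(amplified(6, k) != True for _ in range(trials)) / trials
       for k in (1, 11, 51)}
for k, e in err.items():
    print(f"k={k:2d}  empirical error ~ {e:.3f}")
```

The single-run error stays near 1/4 while the 51-run majority is almost never wrong, consistent with the Chernoff-type bound behind the 1/2^k claim.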
3.1
Some classical examples
One standard example showing the advantage of randomness is primality testing, i.e. deciding whether a natural number is prime or composite. This can be done in randomized polynomial time by very simple algorithms; it was long conjectured not to be possible in deterministic polynomial time, until Agrawal, Kayal and Saxena showed in 2002 that it is, with a far more involved algorithm. Another classic example is the random walk in a symmetric graph. We can decide in randomized logarithmic space¹ whether there is a path between two distinguished elements s and t, a task conjectured to be impossible for deterministic computations in logarithmic space. Consider a heap of needles and the associated graph, where each node is an extremity of a needle or an intersection of crossing needles. Edges connect nodes along the needles, and there are two distinguished nodes: s and t.
¹ Logarithmic space can be understood as constant space, in the sense of a constant number of registers, each holding log n bits. To store a node of a graph with n nodes, we need to store a value i between 1 and n, requiring log n bits in the classical binary representation.
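The flavour of randomized primality testing can be conveyed by the classical Miller–Rabin test, sketched below (this is one standard such algorithm, not one singled out by the text): a prime is always accepted, and a composite is wrongly accepted with probability at most 4^(−k) over the random choices of witnesses.

```python
import random

def is_probably_prime(n, k=20):
    """Miller-Rabin primality test: randomized polynomial time.
    Primes are always accepted; a composite slips through with
    probability at most 4^-k."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):      # dispense with tiny factors
        if n % p == 0:
            return n == p
    d, r = n - 1, 0                      # write n - 1 = 2^r * d, d odd
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(k):
        a = random.randrange(2, n - 1)   # random candidate witness
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False                 # a is a witness: n is composite
    return True

p1 = is_probably_prime(2**61 - 1)   # a Mersenne prime
p2 = is_probably_prime(2**61 + 1)   # composite (divisible by 3)
print(p1, p2)
```

Each round costs one modular exponentiation, so the whole test is polynomial in the number of digits of n.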
Needles: are s and t connected?
In the first question, we ask whether the two distinguished points s and t are connected, i.e. whether there exists a path between them. This is extremely easy for the human eye, and for a randomized algorithm that performs a random walk from s, hoping to reach t after polynomially many steps. Such an algorithm generates a sequence y of random choices, starts in s and uses the random bits of y to select an adjacent node². It proceeds for polynomially many steps, keeping only the current node. A naive deterministic algorithm needs to keep track of the paths it explores, and uses far more space. On the other hand, the task is much harder if the graph is oriented, and it is conjectured to be impossible to decide in randomized logarithmic space. Notice that it is also far more difficult for the human eye: we need to follow various paths edge by edge, and we do not have a global view of the situation, as in the previous example.
Oriented Needles: are s and t connected?
² If s has four neighbors i1, i2, i3, i4 ∈ {1, 2, ..., n} where ij < ij+1, then we select i1 if y starts with 00, i2 if y starts with 01, i3 if y starts with 10, and i4 if y starts with 11.
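The random-walk algorithm above can be sketched in a few lines; the little "needle" graph is invented for the example, and `random.choice` stands in for reading the next bits of the random sequence y as in footnote 2.

```python
import random

def random_walk_reaches(adj, s, t, steps):
    """Random walk from s, keeping only the current node: this is the
    logarithmic-space algorithm, since a single vertex name (log n
    bits) is the whole memory."""
    v = s
    for _ in range(steps):
        if v == t:
            return True
        v = random.choice(adj[v])  # next neighbour chosen by random bits
    return v == t

# A tiny 'needle heap': a path 0-1-2-3 plus a separate pair 4-5.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2], 4: [5], 5: [4]}
random.seed(1)
steps = 50 * len(adj) ** 2              # polynomially many steps
r1 = random_walk_reaches(adj, 0, 3, steps)  # connected: almost surely True
r2 = random_walk_reaches(adj, 0, 4, steps)  # disconnected: always False
print(r1, r2)
```

When s and t are connected the walk finds t with overwhelming probability within polynomially many steps; when they are not, it can never cross between components, so the one-sided error is only on the "connected" side.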
Notice that in the previous examples, the graphs are perfect, i.e. with no uncertainty on the edges. One important factor in cognitive tasks is to cope with uncertainty and to develop robust algorithms, i.e. procedures that are insensitive to erroneous data. Problems with a probabilistic uncertainty assume that data are partially correct, i.e. the given graph is only a probabilistic realization of another unknown graph. For example, the unknown graph may have extra edges that do not appear in the observed graph and some edges in the observed graph may not exist in the unknown graph. An important distinction is whether the uncertainty is static (i.e. fixed as the algorithm starts) or dynamic (i.e. changes as the algorithm computes).
3.2
Static uncertainty
The probabilistic model introduced in section 2 is static in the sense that the random data are determined before any computation starts. For a query ψ, the computation of ρ(ψ, Gn) may indeed be very hard, in fact #P-hard³. The standard example is the graph reliability problem introduced in [Val79], which is also the reliability of the query: are s and t connected? Formally the function GR is defined as follows:

GR (Graph reliability) [Val79]
Input: an undirected graph G = (V, E) with n vertices; s, t ∈ V; and for every edge e, a rational number δ(e) ∈ [0, 1] representing the probability that the edge e exists (does not fail).
Output: the probability that there is a path from s to t consisting exclusively of edges that have not failed.

Consider a fixed-point formula ψ defining the query s-t connectivity, also called GAP. The probability we are looking for is ρ(GAP, Gn). It is known that this problem is #P-hard.
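Although GR is #P-hard to compute exactly, a randomized algorithm gives a good additive estimator, in the spirit of the chapter's thesis: sample random realizations of the graph and count those in which s and t stay connected. A minimal sketch on an invented 3-vertex example (note that this additive guarantee does not contradict the inapproximability of GR mentioned in section 5, which concerns relative error on very small reliabilities):

```python
import random

def st_connected(n, edges, s, t):
    """Depth-first search on the realized edge set."""
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen, stack = {s}, [s]
    while stack:
        u = stack.pop()
        if u == t:
            return True
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return False

def estimate_gr(n, edges, delta, s, t, samples=20000):
    """Monte Carlo estimator of GR: draw random realizations and count
    those in which s and t remain connected.  The additive error
    shrinks like 1/sqrt(samples)."""
    hits = 0
    for _ in range(samples):
        realized = [e for e in edges if random.random() < delta[e]]
        hits += st_connected(n, realized, s, t)
    return hits / samples

random.seed(2)
edges = [(0, 1), (1, 2), (0, 2)]     # two routes from s = 0 to t = 2
delta = {e: 0.5 for e in edges}      # every edge survives w.p. 1/2
est = estimate_gr(3, edges, delta, 0, 2)
print(est)  # exact value: 1/2 + 1/4 - 1/8 = 0.625
```

On this triangle, s and t are connected iff the direct edge survives or both edges of the detour do, so the exact reliability is 5/8; the estimate lands within a small additive error of it.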
3.3
Dynamic uncertainty
A natural generalization of graph reliability is DGR [Pap85], the dynamic reliability problem. Let us introduce a slightly different model of uncertainty: suppose you try to traverse a colored graph, and at every step you decide on a particular edge to follow; the uncertainty then removes some of the remaining edges. We call such a model dynamic because the uncertainty acts as an adversary at every step. In this dynamic

³ A function f is in #P if there exists a non-deterministic Turing machine which accepts or rejects in polynomial time, such that for all x the value f(x) is the number of accepting branches on input x.
model of uncertainty, we can show that randomized decisions can be better than deterministic ones. This situation is typical of the following more elaborate example [BdRS96], where we try to traverse a colored graph subject to uncertain deviations. Consider a graph supplied with additional information: colours of vertices, probabilities of deviations from the chosen direction, labels of edges, and so on. An edge with tail u and head v will be denoted by uv or (u, v). Let G = (V, E) be a directed graph (digraph) with vertices V and edges E, and let OUT(v) = {e ∈ E : tail(e) = v} and IN(v) = {e ∈ E : head(e) = v}. COLOURS is a finite set of colours and clr : V → COLOURS is the colouring function. To model the uncertainty we introduce two functions: an auxiliary labeling function which gives the local names of the edges going out of a given vertex, and a function µ describing deviations from a chosen move. LABELS is a finite set of edge labels and lbl_v : OUT(v) → LABELS; without loss of generality lbl_v is injective. The uncertainty is described by µ : E × E → [0, 1]. Let e = (v, w) be an edge chosen to follow. The motion will actually be along another edge e1 = (v, u) with probability µ(e, e1), so the vertex w will be reached only with probability µ(e, e). We assume

∑_{e1 ∈ OUT(v)} µ(e, e1) = 1.    (22.1)
The input of our problem is an object of the form ((V, E, clr, lbl, µ), s, t), which we call a graph with uncertain deviations and source/target vertices, or UD-graph. A strategy is a function σ which assigns to a finite sequence of colours (the history of colours of the visited points) an edge label describing uniquely the edge to follow: σ : COLOURS* → LABELS. The semantics of a strategy σ (or the behaviour due to σ) is given by the random mapping path_σ : N → V* which, for every k ∈ N, defines the random path traversed in k steps by following σ. The motion starts from s. Then σ, on the basis of clr(s), chooses some edge e ∈ OUT(s) (i.e. e = lbl_s^{-1}(σ(clr(s)))), goes with probability µ(e, e1) along an edge e1 to head(e1), and so on. A path that can be a value of path_σ(k) is
called a realization of the strategy σ after k steps. A realization is simple if it contains at most one occurrence of the target vertex. A realization is precise w.r.t. the target iff it is simple and has t as its last vertex. We say that σ leads from s to t in k steps (with probability 1 − θ) if (with probability 1 − θ) there exists a realization of σ of length k with first vertex s and last vertex t. The general problem is to reach t from s with maximal probability, for a limited or unbounded number of steps. This motivates the following criteria of reliability:
- R(σ, k) = Prob(σ leads from s to t in at most k steps),
- R∞(σ) = R(σ) = sup_k R(σ, k).
It can be shown that computing R∞(σ) can be arbitrarily complex, and that R(σ, k) is #P-computable. However, define a randomized strategy as σ_R : COLOURS* × {0, 1}* → LABELS. It can be shown easily that for a fixed horizon k, some simple randomized strategies are better than any deterministic strategy with bounded memory [dRS97]. For the general problem, we can show that randomized strategies are better than deterministic ones for most finite horizons k [BdRS96].
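R(σ, k) can at least be estimated empirically by simulating a strategy on a UD-graph. The sketch below is a deliberately simplified model: the toy graph, the deviation probability and both strategies are invented for illustration, and strategies here look only at the current vertex rather than the full colour history; it shows how one would compare strategies by Monte Carlo, not the separation result of [dRS97].

```python
import random

# A toy UD-graph on vertices 0..3 with target t = 3.
# OUT[v] lists the heads of the outgoing edges of v, in label order.
OUT = {0: [1, 2], 1: [3, 0], 2: [0, 3], 3: [3]}

def move(v, chosen, deviation=0.2):
    """Follow the chosen outgoing edge, except that with probability
    `deviation` the motion deviates to a uniformly random outgoing
    edge -- a simple instance of the deviation function mu."""
    if len(OUT[v]) > 1 and random.random() < deviation:
        return random.choice(OUT[v])
    return OUT[v][chosen]

def estimate_R(strategy, k, s=0, t=3, samples=10000):
    """Monte Carlo estimate of R(sigma, k): the probability that the
    strategy leads from s to t in at most k steps."""
    hits = 0
    for _ in range(samples):
        v = s
        for _ in range(k):
            if v == t:
                break
            v = move(v, strategy(v))
        hits += (v == t)
    return hits / samples

random.seed(3)
deterministic = lambda v: 0                          # always the first edge
randomized = lambda v: random.randrange(len(OUT[v])) # random edge label
det_R = estimate_R(deterministic, 6)
rnd_R = estimate_R(randomized, 6)
print(det_R, rnd_R)
```

The same simulation harness, run on graphs where a deterministic bounded-memory strategy gets trapped by the deviations, is how the advantage of randomized strategies would show up empirically.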
4.
Randomized Verification
The verification problem for a function f can be stated as follows: given x and y, decide whether f(x) = y. The class of functions that can be verified in randomized polynomial time was first defined by [GMR85; Bab85], who introduced the class IP (interactive proofs). It was shown that randomness and interaction can vastly increase the domain of functions verifiable in randomized polynomial time. In [CDFdRS94], we gave an interactive protocol for the verification of GR: although GR is presumably not computable in polynomial time, it can be verified with a simple O(n) interactive protocol. Consider now the graph-traversing problem of section 3.3. Suppose two agents (programs) claim to traverse a graph with probability greater than 0.5. How do we verify their claims? How do we know that one of the programs is better than the other? It appears essential to compare strategies, or more generally cognitive tasks. An interactive proof for these problems would be extremely useful: it would allow us to answer these questions with a very simple randomized verification, and would lead to better strategies on specific inputs.
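The flavour of randomized verification, checking a claimed value much more cheaply than recomputing it, can be seen in Freivalds' classical test for matrix products (a textbook example chosen here for illustration, not the GR protocol of [CDFdRS94]): to verify the claim A·B = C, multiply both sides by random 0/1 vectors.

```python
import random

def freivalds_verify(A, B, C, rounds=20):
    """Randomized verification that A @ B == C without recomputing the
    product: check A(Bx) == Cx for random 0/1 vectors x, in O(n^2) per
    round.  A wrong C is accepted with probability at most 2^-rounds;
    a correct C is always accepted."""
    n = len(A)
    for _ in range(rounds):
        x = [random.randint(0, 1) for _ in range(n)]
        Bx = [sum(B[i][j] * x[j] for j in range(n)) for i in range(n)]
        ABx = [sum(A[i][j] * Bx[j] for j in range(n)) for i in range(n)]
        Cx = [sum(C[i][j] * x[j] for j in range(n)) for i in range(n)]
        if ABx != Cx:
            return False      # certainly wrong
    return True               # correct with high probability

random.seed(4)
A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
good = freivalds_verify(A, B, [[19, 22], [43, 50]])   # the true product
bad = freivalds_verify(A, B, [[19, 22], [43, 51]])    # one wrong entry
print(good, bad)
```

The verifier never computes the product itself; randomness is what makes the cheap check sound, which is exactly the point of moving from computation to verification.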
5.
Computing on the average
Average-case complexity [Lev73] is another interesting approach to complexity. An algorithm A whose running time is T_A(x) is computable in average polynomial time if Eµ[T_A(x)] ≤ n^k for an input distribution µ. The problem 3COL (whether a graph is 3-colorable) is NP-complete, but it was shown by Gurevich that it can be solved in constant time on the average for the uniform distribution. What can we say about GR? It has been shown [Sin93] that GR is not approximable, but it is an open problem whether it is polynomial on the average (Average(P)) for the uniform distribution. Consider however the following Gaussian distribution µ on the edges:

µ : e = (i, j) → µ[(i, j)] = exp[−(i − j)²].

A random graph for µ assumes an ordering on the vertices, and the probability of joining (i, j) decreases exponentially quickly with the distance d = j − i. For such a distribution, we showed in [BdR98] that GR is computable in average polynomial time. In simple words, the algorithm works well on most inputs, except on some bad ones which are rare. It is important to notice how a piece of statistical information (the distribution µ) changes the complexity of the problem and directly influences the search for randomized algorithms.
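One can see why this distribution makes typical instances easy by sampling from it: almost all edges join near-neighbours in the vertex ordering, so a random graph looks like a narrow band. A quick sketch (just a sampler illustrating the distribution, not the average-case algorithm of [BdR98]):

```python
import math
import random

def sample_graph(n):
    """Sample a graph where edge (i, j), i < j, appears independently
    with probability exp(-(i - j)^2): long-range edges are
    exponentially unlikely."""
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            if random.random() < math.exp(-(i - j) ** 2):
                edges.append((i, j))
    return edges

random.seed(5)
g = sample_graph(30)
spans = [j - i for i, j in g]
print(len(g), max(spans))   # nearly all edges have span 1 or 2
```

Since exp(−1) ≈ 0.37 but exp(−9) ≈ 10⁻⁴, a typical realization has small "bandwidth", and this locality is the kind of structure an average-polynomial-time algorithm can exploit; the rare bad inputs are those with unusually long edges.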
6.
Conclusion
Many intensional properties associated with natural and artificial languages are difficult to compute in the classical sense, and randomized algorithms must be used to verify or estimate them. We described the case of reliability, a property hard to compute in general but which can be approximated on specific inputs. We believe that many intensional properties can be approached with similar techniques, which could be useful to the cognitive sciences.
References

[Bab85] Babai L. (1985). "Trading Group Theory for Randomness". Symposium on the Theory of Computing, 421–429.
[BdR98] Burago D. and de Rougemont M. (1998). "On the Average Complexity of Graph Reliability". Fundamenta Informaticae 36 (4): 307–315.
[BdRS96] Burago D., de Rougemont M. and Slissenko A. (1996). "On the Complexity of Partially Observed Markov Decision Processes". Theoretical Computer Science 157: 161–183.
[CDFdRS94] Couveignes J.M., Diaz-Frias J.F., de Rougemont M. and Santha M. (1994). "On the Interactive Complexity of Graph Reliability". FSTTCS: International Symposium on Theoretical Computer Science, Madras, LNCS 880: 1–14.
[dR95] de Rougemont M. (1995). "The Reliability of Queries". ACM Principles of Database Systems, 286–291.
[dRS97] de Rougemont M. and Schlieder C. (1997). "Spatial Navigation with Uncertain Deviations". American Association for Artificial Intelligence.
[GMR85] Goldwasser S., Micali S. and Rackoff C. (1985). "The Knowledge Complexity of Interactive Proof Systems". Symposium on the Theory of Computing, 291–304.
[LdR96] Lassaigne R. and de Rougemont M. (1996). Logique et complexité. Hermès.
[Lev73] Levin L. (1973). "Universal Sorting Problems". Problems of Information Transmission 9 (3): 265–266.
[Pap85] Papadimitriou C. (1985). "Games Against Nature". Journal of Computer and System Sciences 31: 288–301.
[Pap94] Papadimitriou C. (1994). Computational Complexity. Addison-Wesley.
[Sin93] Sinclair A. (1993). Algorithms for Random Generation and Counting. Birkhäuser Verlag.
[Val79] Valiant L. (1979). "The Complexity of Enumeration and Reliability Problems". SIAM Journal on Computing 8 (3).