Chapter 22

LOGIC, RANDOMNESS AND COGNITION

Michel de Rougemont
Université Paris-II
Abstract
Many natural intensional properties of artificial and natural languages are hard to compute. We show that randomized algorithms are often necessary to obtain good estimators of natural properties and to verify some specific relations. We concentrate on the reliability of queries to show the advantage of randomized algorithms in uncertain cognitive worlds.

1.

Introduction
Classical studies in complexity theory consider deterministic or non-deterministic algorithms on perfect data, and often privilege a worst-case analysis to classify problems as easy or hard. In recent years, important developments in theoretical computer science have shown the fundamental role of randomness in computing, in at least three different settings:
- randomized algorithms for search and decision problems;
- models for randomized verification, i.e. given a function f and two values a, b, decide whether f(a) = b;
- average-case analysis on the inputs.
We believe that new ideas are emerging that could turn out to be quite relevant to the cognitive sciences, when we try to estimate intensions associated with natural or artificial languages. One fundamental aspect of computation in the context of cognitive science is the ability to deal with uncertainty. We will show that randomized techniques are quite efficient in uncertain situations. We refer to intensions as properties other than the truth-value (or extension) of a formula, and concentrate in the sequel on the notion of
D. Vanderveken (ed.), Logic, Thought & Action, 497–506. © 2005 Springer. Printed in The Netherlands.
reliability. Let us fix the universe as a large finite structure Un of a class K, where n is its size, with functions, relations and higher-order objects. Let L be the vocabulary associated with the class K of such structures. If we fix a language with a denotational semantics, like standard first-order logic (FO(L)), the truth value is well defined but of limited interest in cognitive studies. Other properties, usually defined inductively on a structure, may be more relevant. These intensions are in general hard to compute (in the algorithmic sense), as we will see on some examples. For artificial languages used in computer science, two natural intensions are the complexity, and the reliability when we deal with uncertain data. In the sequel we concentrate on the reliability question and show how to use randomness to estimate a classical property: graph reliability. We also mention that this property may be easy on the average for some specific distributions (natural Gaussian distributions). For other intensional properties, one would conjecture similar results. In section 2 we introduce the reliability of a query as a basic intension. In section 3, we define random computations and describe some randomized algorithms. In section 4, we mention some classical results related to the verification of properties that are hard to compute. In section 5, we discuss the role of average-case complexity.
2.
The intension of queries: reliability as an example
A query on a class K is a function which associates with every Un ∈ K a relation of fixed arity on Un. If the arity is 0, we have boolean queries, which are true or false; such a query is also called a global relation on a class in the literature. A query is definable in a logic L if there exists a formula ψ ∈ L such that for all Un ∈ K, the relation defined by the query is precisely [ψ]^Un, i.e. the relation defined by the formula. The arity of the query is the number of free variables of the formula. For simplicity, we concentrate on the following property of queries defined by a formula ψ, the reliability ρ(ψ) introduced in [dR95]: given a structure Un and a random substructure U′n (the uncertain world), ρ(ψ) is the probability that the truth-value [ψ]^U′n coincides with the truth-value [ψ]^Un. Consider a finite relational database; for the sake of simplicity we assume the database to be a finite graph Gn = (Vn, E) with n nodes, where E ⊆ Vn² is the set of edges. Let δ : E → [0, 1] be the uncertainty function, where we interpret δ(e) as the probability that the edge e exists. The probabilistic space induced by Gn and δ is the set of all subgraphs
G′n = (V, E′) of Gn, with probability

Prob(G′) = [∏_{e∈E′} δ(e)] · [∏_{e∈E−E′} (1 − δ(e))].

Let Qδ be the random variable defining the (boolean) query Q on the probabilistic space induced by Gn and δ. We denote the mathematical expectation of this random variable by E(Qδ). A distribution µ defines a different probabilistic space: it assigns, for a given n, the probability of Gn.

Definition 1. The reliability of a boolean query Q on a graph Gn is the function: ρ(Q, Gn) = 1 − Eδ(|Qδ − Q|). The reliability of a query Q on a distribution µ is the function: ρ(Q, n) = Eµ[ρ(Q, Gn)].

This definition considers only boolean queries but generalizes to queries of arbitrary arity.

Example: Let G5 be the graph below with 5 nodes and Q the query defined by the first-order formula: ∃x, y, z (zEx ∧ xEy ∧ yEz)
The graph G5 with uncertain edges.
Assume δ(e) = 1/2 for every edge e. The value of ρ(Q, G5) is the probability that a realization, i.e. a subgraph of G5, contains a triangle. The reliability is hard to compute because we have to analyze all possible subgraphs G′n, i.e. exponentially many, and check the property on each one. For many queries Q, there seems to be no better way than this exhaustive computation. The reliability of even first-order definable queries is hard to compute, and is not known to be computable in polynomial time.
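Definition 1 can be checked by brute force on a small instance. The sketch below enumerates all 2^|E| subgraphs, exactly the exhaustive computation described above; the 5-node graph is hypothetical (the figure is not reproduced here), and the query is simplified to the existence of an undirected triangle.

```python
import itertools

def has_triangle(nodes, edges):
    """Check whether the (undirected) edge set contains a triangle."""
    es = set(edges)
    return any((a, b) in es and (b, c) in es and (a, c) in es
               for a, b, c in itertools.combinations(sorted(nodes), 3))

def reliability(nodes, edges, query, delta=0.5):
    """Exact reliability rho(Q, G): sum the probabilities of all
    subgraphs on which the query agrees with its value on the full
    graph.  Enumerates all 2^|E| subgraphs -- exponential, as noted."""
    full = query(nodes, edges)
    rho = 0.0
    for k in range(len(edges) + 1):
        for sub in itertools.combinations(edges, k):
            p = delta ** len(sub) * (1 - delta) ** (len(edges) - len(sub))
            if query(nodes, sub) == full:
                rho += p
    return rho

# A hypothetical 5-node graph containing a single triangle 1-2-3:
nodes = [1, 2, 3, 4, 5]
edges = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5)]
r = reliability(nodes, edges, has_triangle)
print(r)  # the triangle survives iff its 3 edges do: (1/2)^3 = 0.125
```

With only one triangle in the graph, the reliability is the probability that its three edges all survive; already at a few dozen edges, the enumeration becomes infeasible, which motivates the randomized estimators below.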
3.
Randomized Computations
There are many equivalent definitions of randomized computations. Consider a computing device, a Turing machine or a RAM (Random Access Machine), with two inputs: the real input x of length n and an auxiliary binary input y = y1...ym, the random sequence. The probabilistic space is the set of such y with the uniform probability 1/2^m, i.e. each yi = 0 or yi = 1 is chosen with the same probability 1/2. In the case of decision problems, the machine accepts (M(x, y) = 1) or rejects (M(x, y) = 0). We say that M accepts a language L if, for some fixed ε > 0:
- if x ∈ L then Prob_y[M(x, y) = 1] ≥ 1/2 + ε;
- if x ∉ L then Prob_y[M(x, y) = 0] ≥ 1/2 + ε.
The most classical complexity class (see [Pap94; LdR96]) is the class BPP, where M accepts or rejects deterministically in polynomial time. We can also define a probabilistic run on the input x: it first produces y and then runs M(x, y). Notice that the error can be made exponentially small (of order 1/2^k) by repeating the computation O(k) times and taking a majority vote. In particular it can be made negligible compared to the inherent reliability of hardware components.
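The majority-vote amplification can be seen empirically. The sketch below simulates a hypothetical BPP-style machine that answers correctly with probability 3/4 (the machine and its "language" are illustrative, not from the text) and shows the error of the majority vote shrinking as the number of repetitions grows.

```python
import random

def noisy_machine(x, p_correct=0.75):
    """Stand-in for a bounded-error machine: returns the right answer
    with probability p_correct > 1/2 (the 'right answer' here is
    simply whether x is even -- an illustrative language)."""
    truth = (x % 2 == 0)
    return truth if random.random() < p_correct else not truth

def amplified(x, k):
    """Run the machine k times (k odd) and take a majority vote."""
    votes = sum(noisy_machine(x) for _ in range(k))
    return votes * 2 > k

random.seed(0)
trials = 2000
err = {k: sum(amplified(6, k) != True for _ in range(trials)) / trials
       for k in (1, 11, 51)}
for k, e in err.items():
    print(f"k={k:2d}  empirical error ~ {e:.3f}")
```

The single-run error stays near 1/4 while the 51-run majority is almost never wrong, consistent with the Chernoff-type bound behind the 1/2^k claim.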
3.1
Some classical examples
One standard example showing the advantage of randomness is primality testing, i.e. deciding whether a natural number is prime or composite. This can be done in randomized polynomial time by very simple algorithms; it was long conjectured not to be possible in deterministic polynomial time, until Agrawal, Kayal and Saxena showed in 2002 that it is, with a far more involved algorithm. Another classic example is the random walk in a symmetric graph. We can decide in randomized logarithmic space¹ whether there is a path between two distinguished elements s and t, a task conjectured to be impossible for deterministic computations in logarithmic space. Consider a heap of needles and the associated graph, where each node is an extremity of a needle or an intersection of crossing needles. Edges connect nodes along the needles, and there are two distinguished nodes: s and t.
¹ Logarithmic space can be understood as constant space, in the sense of a constant number of registers, each holding log n bits. To store a node of a graph with n nodes, we need to store a value i between 1 and n, requiring log n bits in the classical binary representation.
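The flavour of randomized primality testing can be conveyed by the classical Miller–Rabin test, sketched below (this is one standard such algorithm, not one singled out by the text): a prime is always accepted, and a composite is wrongly accepted with probability at most 4^(−k) over the random choices of witnesses.

```python
import random

def is_probably_prime(n, k=20):
    """Miller-Rabin primality test: randomized polynomial time.
    Primes are always accepted; a composite slips through with
    probability at most 4^-k."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):      # dispense with tiny factors
        if n % p == 0:
            return n == p
    d, r = n - 1, 0                      # write n - 1 = 2^r * d, d odd
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(k):
        a = random.randrange(2, n - 1)   # random candidate witness
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False                 # a is a witness: n is composite
    return True

p1 = is_probably_prime(2**61 - 1)   # a Mersenne prime
p2 = is_probably_prime(2**61 + 1)   # composite (divisible by 3)
print(p1, p2)
```

Each round costs one modular exponentiation, so the whole test is polynomial in the number of digits of n.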
Needles: are s and t connected?
In the first question, we ask whether the two distinguished points s and t are connected, i.e. whether there exists a path between them. This is extremely easy for the human eye, and for a randomized algorithm that performs a random walk from s, hoping to reach t after polynomially many steps. Such an algorithm generates a sequence y of random choices, starts in s and uses the random bits of y to select an adjacent node². It proceeds for polynomially many steps, keeping only the current node. A naive deterministic algorithm needs to keep track of the paths it explores, and uses far more space. On the other hand, the task is much harder if the graph is oriented, and it is conjectured to be impossible to decide in randomized logarithmic space. Notice that it is also far more difficult for the human eye: we need to follow various paths edge by edge, and we do not have a global view of the situation, as in the previous example.
Oriented Needles: are s and t connected?
² If s has four neighbors i1, i2, i3, i4 ∈ {1, 2, ..., n} where ij < ij+1, then we select i1 if y starts with 00, i2 if y starts with 01, i3 if y starts with 10, and i4 if y starts with 11.
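The random-walk algorithm above can be sketched in a few lines; the little "needle" graph is invented for the example, and `random.choice` stands in for reading the next bits of the random sequence y as in footnote 2.

```python
import random

def random_walk_reaches(adj, s, t, steps):
    """Random walk from s, keeping only the current node: this is the
    logarithmic-space algorithm, since a single vertex name (log n
    bits) is the whole memory."""
    v = s
    for _ in range(steps):
        if v == t:
            return True
        v = random.choice(adj[v])  # next neighbour chosen by random bits
    return v == t

# A tiny 'needle heap': a path 0-1-2-3 plus a separate pair 4-5.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2], 4: [5], 5: [4]}
random.seed(1)
steps = 50 * len(adj) ** 2              # polynomially many steps
r1 = random_walk_reaches(adj, 0, 3, steps)  # connected: almost surely True
r2 = random_walk_reaches(adj, 0, 4, steps)  # disconnected: always False
print(r1, r2)
```

When s and t are connected the walk finds t with overwhelming probability within polynomially many steps; when they are not, it can never cross between components, so the one-sided error is only on the "connected" side.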
Notice that in the previous examples, the graphs are perfect, i.e. with no uncertainty on the edges. One important factor in cognitive tasks is to cope with uncertainty and to develop robust algorithms, i.e. procedures that are insensitive to erroneous data. Problems with a probabilistic uncertainty assume that data are partially correct, i.e. the given graph is only a probabilistic realization of another unknown graph. For example, the unknown graph may have extra edges that do not appear in the observed graph and some edges in the observed graph may not exist in the unknown graph. An important distinction is whether the uncertainty is static (i.e. fixed as the algorithm starts) or dynamic (i.e. changes as the algorithm computes).
3.2
Static uncertainty
The probabilistic model introduced in section 2 is static in the sense that the random data are determined before any computation starts. For a query ψ, the computation of ρ(ψ, Gn) may indeed be very hard, in fact #P-hard³. The standard example is the graph reliability problem introduced in [Val79], which is also the reliability of the query: are s and t connected? Formally the function GR is defined as follows:

GR (Graph reliability) [Val79]
Input: an undirected graph G = (V, E) with n vertices; s, t ∈ V; and for every edge e, a rational number δ(e) ∈ [0, 1] representing the probability that the edge e exists (does not fail).
Output: the probability that there is a path from s to t consisting exclusively of edges that have not failed.

Consider a fixed-point formula ψ defining the query s-t connectivity, also called GAP. The probability we are looking for is ρ(GAP, Gn). It is known that this problem is #P-hard.
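Although GR is #P-hard to compute exactly, a randomized algorithm gives a good additive estimator, in the spirit of the chapter's thesis: sample random realizations of the graph and count those in which s and t stay connected. A minimal sketch on an invented 3-vertex example (note that this additive guarantee does not contradict the inapproximability of GR mentioned in section 5, which concerns relative error on very small reliabilities):

```python
import random

def st_connected(n, edges, s, t):
    """Depth-first search on the realized edge set."""
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen, stack = {s}, [s]
    while stack:
        u = stack.pop()
        if u == t:
            return True
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return False

def estimate_gr(n, edges, delta, s, t, samples=20000):
    """Monte Carlo estimator of GR: draw random realizations and count
    those in which s and t remain connected.  The additive error
    shrinks like 1/sqrt(samples)."""
    hits = 0
    for _ in range(samples):
        realized = [e for e in edges if random.random() < delta[e]]
        hits += st_connected(n, realized, s, t)
    return hits / samples

random.seed(2)
edges = [(0, 1), (1, 2), (0, 2)]     # two routes from s = 0 to t = 2
delta = {e: 0.5 for e in edges}      # every edge survives w.p. 1/2
est = estimate_gr(3, edges, delta, 0, 2)
print(est)  # exact value: 1/2 + 1/4 - 1/8 = 0.625
```

On this triangle, s and t are connected iff the direct edge survives or both edges of the detour do, so the exact reliability is 5/8; the estimate lands within a small additive error of it.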
3.3
Dynamic uncertainty
A natural generalization of graph reliability is DGR [Pap85], the dynamic reliability problem. Let us introduce a slightly different model of uncertainty: suppose you try to traverse a colored graph, and at every step you decide on a particular edge to follow; the uncertainty then removes some of the remaining edges. We call such a model dynamic because the uncertainty acts as an adversary at every step. In this dynamic

³ A function f is in #P if there exists a non-deterministic Turing machine which accepts or rejects in polynomial time, such that for all x the value f(x) is the number of accepting branches on input x.
model of uncertainty, we can show that randomized decisions can be better than deterministic ones. This situation is typical of the following more elaborate example [BdRS96], where we try to traverse a colored graph subject to uncertain deviations. Consider a graph supplied with additional information: colours of vertices, probabilities of deviations from the chosen direction, labels of edges, and so on. An edge with tail u and head v will be denoted by uv or (u, v). Let G = (V, E) be a directed graph (digraph) with vertices V and edges E, and let OUT(v) = {e ∈ E : tail(e) = v} and IN(v) = {e ∈ E : head(e) = v}. COLOURS is a finite set of colours and clr : V → COLOURS is the colouring function. To model the uncertainty we introduce two functions: an auxiliary labeling function which gives the local names of the edges going out of a given vertex, and a function µ describing deviations from a chosen move. LABELS is a finite set of edge labels and lbl_v : OUT(v) → LABELS; without loss of generality lbl_v is injective. The uncertainty is described by µ : E × E → [0, 1]. Let e = (v, w) be an edge chosen to follow. The motion will actually be along another edge e1 = (v, u) with probability µ(e, e1), so the vertex w will be reached only with probability µ(e, e). We assume

∑_{e1 ∈ OUT(v)} µ(e, e1) = 1.    (22.1)
The input of our problem is an object of the form ((V, E, clr, lbl, µ), s, t), which we call a graph with uncertain deviations and source/target vertices, or UD-graph. A strategy is a function σ which assigns to a finite sequence of colours (the history of colours of the visited points) an edge label describing uniquely the edge to follow: σ : COLOURS* → LABELS. The semantics of a strategy σ (or the behaviour due to σ) is given by the random mapping path_σ : N → V* which, for every k ∈ N, defines the random path traversed in k steps by following σ. The motion starts from s. Then σ, on the basis of clr(s), chooses some edge e ∈ OUT(s) (i.e. e = lbl_s^{-1}(σ(clr(s)))), goes with probability µ(e, e1) along an edge e1 to head(e1), and so on. A path that can be a value of path_σ(k) is
called a realization of the strategy σ after k steps. A realization is simple if it contains at most one occurrence of the target vertex. A realization is precise w.r.t. the target iff it is simple and has t as its last vertex. We say that σ leads from s to t in k steps (with probability 1 − θ) if (with probability 1 − θ) there exists a realization of σ of length k with first vertex s and last vertex t. The general problem is to reach t from s with maximal probability, for a limited or unbounded number of steps. This motivates the following criteria of reliability:
- R(σ, k) = Prob(σ leads from s to t in at most k steps),
- R∞(σ) = R(σ) = sup_k R(σ, k).
It can be shown that computing R∞(σ) can be arbitrarily complex, and that R(σ, k) is #P-computable. However, define a randomized strategy as σ_R : COLOURS* × {0, 1}* → LABELS. It can be shown easily that for a fixed horizon k, some simple randomized strategies are better than any deterministic strategy with bounded memory [dRS97]. For the general problem, we can show that randomized strategies are better than deterministic ones for most finite horizons k [BdRS96].
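R(σ, k) can at least be estimated empirically by simulating a strategy on a UD-graph. The sketch below is a deliberately simplified model: the toy graph, the deviation probability and both strategies are invented for illustration, and strategies here look only at the current vertex rather than the full colour history; it shows how one would compare strategies by Monte Carlo, not the separation result of [dRS97].

```python
import random

# A toy UD-graph on vertices 0..3 with target t = 3.
# OUT[v] lists the heads of the outgoing edges of v, in label order.
OUT = {0: [1, 2], 1: [3, 0], 2: [0, 3], 3: [3]}

def move(v, chosen, deviation=0.2):
    """Follow the chosen outgoing edge, except that with probability
    `deviation` the motion deviates to a uniformly random outgoing
    edge -- a simple instance of the deviation function mu."""
    if len(OUT[v]) > 1 and random.random() < deviation:
        return random.choice(OUT[v])
    return OUT[v][chosen]

def estimate_R(strategy, k, s=0, t=3, samples=10000):
    """Monte Carlo estimate of R(sigma, k): the probability that the
    strategy leads from s to t in at most k steps."""
    hits = 0
    for _ in range(samples):
        v = s
        for _ in range(k):
            if v == t:
                break
            v = move(v, strategy(v))
        hits += (v == t)
    return hits / samples

random.seed(3)
deterministic = lambda v: 0                          # always the first edge
randomized = lambda v: random.randrange(len(OUT[v])) # random edge label
det_R = estimate_R(deterministic, 6)
rnd_R = estimate_R(randomized, 6)
print(det_R, rnd_R)
```

The same simulation harness, run on graphs where a deterministic bounded-memory strategy gets trapped by the deviations, is how the advantage of randomized strategies would show up empirically.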
4.
Randomized Verification
The verification problem for a function f can be stated as follows: given x and y, decide whether f(x) = y. The class of functions that can be verified in randomized polynomial time was first defined by [GMR85; Bab85], who introduced the class IP (interactive proofs). It was shown that randomness and interaction can vastly increase the domain of functions verifiable in randomized polynomial time. In [CDFdRS94], we gave an interactive protocol for the verification of GR: although GR is presumably not computable in polynomial time, it can be verified with a simple O(n) interactive protocol. Consider now the graph-traversing problem of section 3.3. Suppose two agents (programs) claim to traverse a graph with probability greater than 0.5. How do we verify their claims? How do we know that one of the programs is better than the other? It appears essential to compare strategies, or more generally cognitive tasks. An interactive proof for these problems would be extremely useful: it would allow us to answer these questions with a very simple randomized verification, and would lead to better strategies on specific inputs.
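The flavour of randomized verification, checking a claimed value much more cheaply than recomputing it, can be seen in Freivalds' classical test for matrix products (a textbook example chosen here for illustration, not the GR protocol of [CDFdRS94]): to verify the claim A·B = C, multiply both sides by random 0/1 vectors.

```python
import random

def freivalds_verify(A, B, C, rounds=20):
    """Randomized verification that A @ B == C without recomputing the
    product: check A(Bx) == Cx for random 0/1 vectors x, in O(n^2) per
    round.  A wrong C is accepted with probability at most 2^-rounds;
    a correct C is always accepted."""
    n = len(A)
    for _ in range(rounds):
        x = [random.randint(0, 1) for _ in range(n)]
        Bx = [sum(B[i][j] * x[j] for j in range(n)) for i in range(n)]
        ABx = [sum(A[i][j] * Bx[j] for j in range(n)) for i in range(n)]
        Cx = [sum(C[i][j] * x[j] for j in range(n)) for i in range(n)]
        if ABx != Cx:
            return False      # certainly wrong
    return True               # correct with high probability

random.seed(4)
A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
good = freivalds_verify(A, B, [[19, 22], [43, 50]])   # the true product
bad = freivalds_verify(A, B, [[19, 22], [43, 51]])    # one wrong entry
print(good, bad)
```

The verifier never computes the product itself; randomness is what makes the cheap check sound, which is exactly the point of moving from computation to verification.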
5.
Computing on the average
Average-case complexity [Lev73] is another interesting approach to complexity. An algorithm A whose running time is T_A(x) is computable in average polynomial time if Eµ[T_A(x)] ≤ n^k for an input distribution µ. The problem 3COL (whether a graph is 3-colorable) is NP-complete, but it was shown by Gurevich that it can be solved in constant time on the average for the uniform distribution. What can we say about GR? It has been shown [Sin93] that GR is not approximable, but it is an open problem whether it is polynomial on the average (Average(P)) for the uniform distribution. Consider however the following Gaussian distribution µ on the edges:

µ : e = (i, j) → µ[(i, j)] = exp[−(i − j)²].

A random graph for µ assumes an ordering on the vertices, and the probability of joining (i, j) decreases exponentially quickly with the distance d = j − i. For such a distribution, we showed in [BdR98] that GR is computable in average polynomial time. In simple words, the algorithm works well on most inputs, except on some bad ones which are rare. It is important to notice how a piece of statistical information (the distribution µ) changes the complexity of the problem and directly influences the search for randomized algorithms.
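One can see why this distribution makes typical instances easy by sampling from it: almost all edges join near-neighbours in the vertex ordering, so a random graph looks like a narrow band. A quick sketch (just a sampler illustrating the distribution, not the average-case algorithm of [BdR98]):

```python
import math
import random

def sample_graph(n):
    """Sample a graph where edge (i, j), i < j, appears independently
    with probability exp(-(i - j)^2): long-range edges are
    exponentially unlikely."""
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            if random.random() < math.exp(-(i - j) ** 2):
                edges.append((i, j))
    return edges

random.seed(5)
g = sample_graph(30)
spans = [j - i for i, j in g]
print(len(g), max(spans))   # nearly all edges have span 1 or 2
```

Since exp(−1) ≈ 0.37 but exp(−9) ≈ 10⁻⁴, a typical realization has small "bandwidth", and this locality is the kind of structure an average-polynomial-time algorithm can exploit; the rare bad inputs are those with unusually long edges.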
6.
Conclusion
Many intensional properties associated with natural and artificial languages are difficult to compute in the classical sense, and randomized algorithms must be used to verify or estimate them. We described the case of reliability, a property hard to compute in general but which can be approximated on specific inputs. We believe that many intensional properties can be approached with similar techniques, which could be useful to the cognitive sciences.
References

[Bab85] Babai L. (1985). "Trading Group Theory for Randomness". Symposium on the Theory of Computing, 421–429.
[BdR98] Burago D. and de Rougemont M. (1998). "On the Average Complexity of Graph Reliability". Fundamenta Informaticae 36 (4): 307–315.
[BdRS96] Burago D., de Rougemont M. and Slissenko A. (1996). "On the Complexity of Partially Observed Markov Decision Processes". Theoretical Computer Science 157: 161–183.
[CDFdRS94] Couveignes J.M., Diaz-Frias J.F., de Rougemont M. and Santha M. (1994). "On the Interactive Complexity of Graph Reliability". FSTTCS: International Symposium on Theoretical Computer Science, Madras, LNCS 880: 1–14.
[dR95] de Rougemont M. (1995). "The Reliability of Queries". ACM Principles of Database Systems, 286–291.
[dRS97] de Rougemont M. and Schlieder C. (1997). "Spatial Navigation with Uncertain Deviations". American Association for Artificial Intelligence.
[GMR85] Goldwasser S., Micali S. and Rackoff C. (1985). "The Knowledge Complexity of Interactive Proof Systems". Symposium on the Theory of Computing, 291–304.
[LdR96] Lassaigne R. and de Rougemont M. (1996). Logique et complexité. Hermès.
[Lev73] Levin L. (1973). "Universal Sorting Problems". Problems of Information Transmission 9 (3): 265–266.
[Pap85] Papadimitriou C. (1985). "Games Against Nature". Journal of Computer and System Sciences 31: 288–301.
[Pap94] Papadimitriou C. (1994). Computational Complexity. Addison-Wesley.
[Sin93] Sinclair A. (1993). Algorithms for Random Generation and Counting. Birkhäuser Verlag.
[Val79] Valiant L. (1979). "The Complexity of Enumeration and Reliability Problems". SIAM Journal on Computing 8 (3).