Preface

One of the central tasks of computer vision and robotics is building a 3-D model of the environment from sensor and image data. An intrinsic difficulty of this task is the fact that sensor and image data are not necessarily accurate. In the past, many techniques have been proposed to cope with this problem. One of the most effective solutions is to take advantage of our prior knowledge about the structure of the problem: we make use of such geometric relationships as collinearity, coplanarity, rigidity, and the epipolar relation that should hold in the absence of noise; they are called geometric constraints. This book focuses on two such techniques that are the most fundamental. One is geometric correction: the data are optimally modified so as to satisfy the geometric constraints. The other is parametric fitting: the parameters of the geometric constraints are optimally estimated.

In order that a particular method be called "optimal," the theoretical bound on its attainable accuracy must be known; only methods which can attain it are qualified to be called optimal. In this book, we give many synthetic and real data examples to demonstrate that conventional methods are not optimal and how accuracy improves if truly optimal methods are employed. However, computing optimal estimates alone is not sufficient; at the same time, we must evaluate in quantitative terms how reliable the resulting estimates are. The knowledge that a distance is optimally estimated to be 5m is of little use if we do not know whether the value is reliable within ±10cm or ±1m. This reliability issue has not received much attention in the past. In order to compute optimal estimates and evaluate their reliability, we need an efficient numerical algorithm. Since such estimation is based on our knowledge about the structure of the problem, we also need a criterion for testing if the assumed structure, or model, is correct. This book presents rigorous mathematical techniques for these purposes.

Thus, the subject of this book is closely related to traditional statistics. However, the main goal of traditional statistics is to infer the structure of a phenomenon from a number of sample data obtained by repeated measurements. In our applications, usually only one set of data is obtained by a measurement. Also, noise in electronic devices can be assumed to be fairly small, so we can apply first order analysis. Since geometric objects are represented by vectors and tensors in two and three dimensions, linear algebra, eigenvalue analysis in particular, plays a central role in our analysis. These features make our treatment very different from traditional statistics.

This book first summarizes the mathematics that is fundamental to our analysis and then presents techniques for optimal estimation and reliability evaluation by assuming that the noise is Gaussian. We also discuss numerical computation schemes called the optimal filter and renormalization and consider computational issues such as convergence of iterations and numerical stability. Then, we derive the geometric information criterion and apply it
to such problems as stereo vision and 3-D motion analysis. Finally, the theoretical accuracy bound for a general non-Gaussian noise model is obtained in the form of the Cramer-Rao lower bound expressed in terms of the Fisher information matrix.

This book is an elaboration of the author's lecture notes for a series of seminars he gave at the Department of Mathematical Engineering and Information Physics, University of Tokyo in 1994. The author thanks Shun-ichi Amari, Kokichi Sugihara, Koichiro Deguchi, and Ken Hayami of the Department of Mathematical Engineering and Information Physics, University of Tokyo and Keisuke Kinoshita of the ATR Human Information Processing Research Laboratories for detailed discussions. He also thanks Azriel Rosenfeld of the University of Maryland, who carefully read the entire manuscript, and Naoya Ohta of Gunma University, Yasushi Kanazawa of Gunma College of Technology, and all of the students in his laboratory for helping him with this laborious project. This work was in part supported by the Ministry of Education, Science, and Culture, Japan under a Grant-in-Aid for Scientific Research B (No. 07458067) and the Okawa Institute of Information and Telecommunication, Japan.

Gunma, Japan
January 1996
Kenichi Kanatani
Chapter 1
Introduction

This chapter introduces the aims, features, background, and organization of the subsequent chapters. In particular, the differences between our treatment and existing studies are described in detail. Also, related references are given for each subject. Since the discussion here refers to many technical issues in the subsequent chapters, readers are advised to go through this chapter for a general overview in the first reading and come back here later for detailed information.
1.1 The Aims of This Book
1.1.1 Statistical optimization for image and sensor data
Intelligent robots are expected to play a vital role in manufacturing industries and various hazardous environments such as nuclear power stations, outer space, and deep water. For autonomous robotic operations in an unknown environment, robots must first acquire a 3-D model of their workspace. The most fundamental source of 3-D information is vision. Today, the study of extracting 3-D information from video images and building a 3-D model of the scene, called computer vision or image understanding, is one of the research areas that attract the most attention all over the world [5, 6, 12, 47, 57, 65, 85, 90, 130, 135, 151, 185, 211, 227, 229]. Various other sensing techniques have also been developed--tactile sensing and ultrasonic range sensing, for example. The crucial fact about image and sensor data is that they are not necessarily accurate. We discuss errors in range sensing in Section 10.3, but in the rest of the book we exclusively deal with points and lines detected from camera images. Errors in such image data originate from various sources. Camera images undergo optical distortion, called aberration, due to imperfections of the lens. They are also distorted by mechanical causes (e.g., misalignment of the lens and the array sensor) and electronic causes (e.g., difference between the vertical scanning rate and the horizontal sampling rate). Theoretically, such systematic distortions can be corrected by prior calibration. In reality, however, it is very difficult to remove them completely. Even if a perfect camera is used, images are usually inaccurate because of poor lighting, imperfect focusing, limited resolution, and various factors concerning the scene and objects in question (shades, shadows, specularity, reflection, etc.). Hence, points and lines detected by applying an image processing operation to the gray levels are not located where they should be. In this book, such deviations
of points and lines are collectively termed "image noise". Today, coping with the uncertainty of the data, whatever their origin, is one of the greatest challenges to computer vision and robotics researchers, and many techniques have been proposed for this purpose. They are basically categorized into two types:
• Traditional statistical approach: the uncertainty of the data is conquered by repeated measurements.
• Physics-oriented approach: the uncertainty of the data is compensated for by a priori knowledge about the structure of the problem.
This book concentrates on the latter approach. Namely, we take advantage of the geometric constraints that measurement values should satisfy if noise did not exist. For point and line data, such constraints include collinearity and coplanarity of points, concurrency of lines, rigidity of motion, and what is known as the epipolar constraint for stereo vision and motion analysis. We focus on two of the most fundamental methods that make use of such constraints:
• Geometric correction: the data are optimally modified so as to satisfy the geometric constraints exactly.
• Parametric fitting: the parameters of the geometric constraints are optimally estimated.
The latter is not limited to fitting curves and surfaces to point data [20, 200, 160, 228] but covers a much wider range of problems such as 3-D motion analysis, as we will show later. In the following, we primarily consider points and lines in images, but in principle our statistical theory can also apply to other types of image data (e.g., gray levels of pixels) and sensor data obtained from devices other than cameras.
1.1.2 What are the issues?
For any type of statistical estimation, we encounter the following issues:
• Accuracy bound. In order to claim that a particular method is "optimal," one must know the theoretical bound on its attainable accuracy; only those methods which attain that bound are qualified to be called "optimal." If such methods are not known, we can still evaluate the performance of available methods by comparing their accuracy with the theoretical bound.
• Reliability evaluation. Computing an optimal estimate alone is not sufficient in real applications; we must at the same time evaluate the reliability of the resulting estimate in quantitative terms. If a robot does
not know about the reliability of the 3-D model according to which it is operating, the robot is unable to judge if its next action can achieve the given task within the required level of accuracy. For example, the knowledge that the distance to an object is optimally estimated to be 5m is of little use if the robot does not know whether the value is reliable within ±10cm or ±1m.
• Efficiency of computation. An efficient numerical algorithm must be available for computing optimal estimates and evaluating their reliability. In choosing algorithms, we must also take into consideration various computational issues other than efficiency, such as numerical instability due to rounding and approximation errors involved in the computation.
• Plausibility of the model. We need a criterion to judge if the assumed structure of the problem on which the estimation is based is correct. Given a sequence of points, for example, we can optimally fit a line to them by an efficient algorithm and evaluate the reliability of the resulting fit, provided the points are assumed to be collinear in the absence of noise. How can we confirm such a presumption?

In the past, the traditional statistical approach has been the main tool in dealing with noisy data, and various statistical techniques have been employed in many forms. However, they are mostly used to make qualitative decisions and judgments [35, 36, 77, 108, 111, 139, 204]. In such problems, probabilities are usually adjusted empirically so as to reflect subjective degrees of tolerance rather than physical noise characteristics. A typical example is a technique called probabilistic (or stochastic) relaxation [12, 151]. For numerical estimation problems, on the other hand, statistical aspects have not been considered very much. Rather, much attention has been paid to algebraic aspects--writing down equations that describe the geometric relationship between the assumed 3-D structure of the scene and the 2-D description of its projected image and then solving these equations. In the presence of noise, the equations obtained by substituting the data are often inconsistent with each other, so some kind of optimization is applied. However, the optimization criterion is chosen rather heuristically or merely for convenience of computation, and its performance is evaluated empirically by simulations using real and synthetic data. A typical example is a technique called regularization [158, 159], for which the regularizing parameter is adjusted on a trial-and-error basis. If statistical techniques are employed, they are often transplanted from textbooks on statistics. In this book, we give many synthetic and real data examples to demonstrate that conventional methods are not optimal and how accuracy improves if truly optimal methods are employed. We also show that the reliability of the resulting optimal solution can be evaluated in analytical terms.
Testing the validity of the assumed model is also crucial in computer vision and robotics applications. However, it has been customary to make judgments based on a heuristic criterion. For example, a sequence of points is judged to be collinear if the residual of line fitting is smaller than an arbitrarily set threshold. In this book, we present a systematic procedure for such a judgment in rigorous statistical terms.
1.1.3 Why is a new statistical theory necessary?
The aim of this book is to give a rigorous mathematical foundation to numerical optimization problems for computing 2-D and 3-D geometric quantities from inaccurate image and sensor data. To this end, we need a new theory of statistics, because the basic premises of traditional statistics do not apply to the problems we consider in this book: what is very important in traditional statistics is not so important in our problems, while what is very important in our problems has not been much recognized by statisticians. One of the major reasons why traditional statistical theories are not suitable for our purpose is their rather narrowly defined framework of viewing statistical estimation as inferring a structure by observing multiple data. This is with a view to evaluating and comparing effects and procedures in domains that involve a large degree of uncertainty, such as medicine, biology, agriculture, manufacturing, sociology, economics, and politics [38, 42, 52, 53, 63, 115, 137, 230]. In such domains, the problem is usually translated into the mathematical language as estimating the parameters in-
volved in a probability distribution from multiple independent samples from it. Although this framework is suitable in the above-mentioned domains, statistical estimation problems in computer vision and robotics have many non-traditional elements. In traditional statistics, errors are regarded as uncontrollable; the accuracy of estimation can be improved only by repeated measurements. However, repeating measurements is costly. Hence, if the accuracy is the same, those methods which require a smaller number of data are more desirable. In other words, methods whose accuracy improves rapidly as the number of data increases are more desirable than those with slow increase of accuracy. Thus, the study of asymptotic properties in the limit of a large number of data has been one of the central subjects in traditional statistics [18, 137]. In such engineering domains as computer vision and robotics, where electronic sensing devices are used, errors are usually very small and called noise. Moreover, they are controllable: accuracy can be improved, for example, by using high-resolution devices and controlling the environment (lighting, dust, temperature, humidity, vibration, etc.). However, such control is costly. Hence, if the accuracy is the same, those methods which allow higher levels of noise are more desirable. In other words, methods whose accuracy improves rapidly as the noise level decreases are more desirable than those with slow
increase of accuracy. Thus, the study of the accuracy of estimation in the limit of small noise is very important. In this book, we assume that errors are small and apply first order approximations in various forms. In this sense, our analysis is essentially a perturbation theory. In many engineering domains, repeating measurements under the same condition (which is easy) often produces the same result because the sources of inaccuracy in the device and the environment are fixed (but unknown). In such a domain, the basic premise of traditional statistics that independent
samples from the same distribution can be observed as many times as desired does not hold. How can one do statistical estimation from only one set of data? One cannot take even the "sample average," which is the most fundamental statistic. The answer is, as we mentioned earlier, the use of a priori knowledge about the problem: we can make inferences from the degree to which the data deviate from the geometric constraints that should hold if the data were accurate. We can also make use of partial information about the accuracy of the image and sensor devices. To a first approximation, the uncertainty of the data supplied by a particular measurement can be characterized by the covariance matrix inherent to the sensor, but we need not know it exactly if we incorporate a priori knowledge about the problem: it suffices to know the covariance matrix up to scale. In other words, all we need is qualitative properties, such as isotropy and homogeneity, of the error distribution.

Another difference from traditional statistics is the geometric nature of our problems. Traditional statistics is mainly concerned with individual variables involved in a phenomenon. In computer vision and robotics applications, however, we are concerned with geometric objects such as points, lines, and surfaces in two and three dimensions and their interrelations such as incidence, joins, and intersections. Such geometric objects and interrelations are described by vectors, tensors, and manifolds. Moreover, all procedures for statistical estimation must be written in a form that is invariant to coordinate transformations: the estimate obtained by applying a coordinate transformation to the data must coincide with the value obtained by applying the same coordinate transformation to the estimate. For example, the problem of fitting a line to approximately linearly correlated data is called regression in statistics, but the independent variables (called abscissa variables, explanatory variables, controlled variables, predictor variables, covariate variables, and by many other names) and the dependent variables (called observed variables, data variables, response variables, outcome variables, ordinate variables, and by many other names) must be distinguished. In contrast, the x and y coordinates are completely equivalent when we fit a line to approximately collinear point data; the coordinate axes are chosen merely for computational convenience, and the line to be fitted must be identical whatever coordinate system is used.
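The invariance requirement can be illustrated numerically. The following sketch (hypothetical data; not an algorithm from this book) fits a line to the same noisy points twice: ordinary regression of y on x is not invariant under interchanging the coordinate axes, whereas the orthogonal (principal-axis) fit returns the same line either way.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical noisy points roughly along y = 2x + 1, with noise in both coordinates.
x = np.linspace(0.0, 10.0, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=x.size)
x = x + rng.normal(scale=0.5, size=x.size)

def regression_slope(x, y):
    """Ordinary regression of y on x: minimizes vertical residuals only."""
    a, _ = np.polyfit(x, y, 1)
    return a

def orthogonal_direction(x, y):
    """Principal-axis fit: minimizes perpendicular distances to the line."""
    pts = np.column_stack([x, y])
    pts = pts - pts.mean(axis=0)
    _, _, vt = np.linalg.svd(pts, full_matrices=False)
    return vt[0]  # unit vector along the fitted line

print("regression slope (y on x):              ", regression_slope(x, y))
print("regression slope (x on y), re-expressed:", 1.0 / regression_slope(y, x))  # a different line

d1 = orthogonal_direction(x, y)
d2 = orthogonal_direction(y, x)[::-1]  # undo the axis swap
print("orthogonal fit direction:               ", d1)
print("orthogonal fit direction (axes swapped):", d2)  # the same line, up to sign
```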
1.2 The Features of This Book
1.2.1 Theoretical accuracy bound
One of the most significant consequences of our analysis is that a theoretical bound on the attainable accuracy is given for geometric correction and parametric fitting problems in the form of the covariance matrix of the parameters to be estimated. The bound is first derived in terms of the covariance matrices of the data by assuming that the noise is Gaussian. Later, a rigorous mathematical proof is given for general non-Gaussian noise, for which the Fisher information matrix plays the role of the covariance matrices of the data. The theoretical bound we derive corresponds to what is known as the Cramer-Rao lower bound in statistics, but our treatment is very different from traditional statistics. It is well known in statistics that as the number N of data increases, the variance/covariance of an estimate generally converges to zero at a rate of O(1/N). Hence, the bound on accuracy is defined by the asymptotic limit of N times the variance/covariance as N → ∞. However, sophisticated mathematical arguments are necessary for rigorously defining various types of convergence (weak convergence, strong convergence, etc.). In this book, we fix the number N of data and analyze the first order behavior in the limit of small noise. Hence, the bound on accuracy is defined as the limit of 1/ε² times the variance/covariance as ε → 0 for an appropriately defined noise level ε. Such a perturbation analysis does not involve intricate mathematical subtleties about convergence. With our perturbation approach, it can be shown that the ubiquitous least-squares optimization is not optimal; statistical bias exists in the least-squares solution in general. We can also show that the theoretical bound can be attained by maximum likelihood estimation in the first order if the distribution of the noise belongs to the exponential family, of which the Gaussian distribution is a typical example. Just as maximum likelihood estimation is given a special status in the traditional asymptotic theory of statistics, it also plays an important role in our perturbation analysis.
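In symbols, and only as a schematic contrast (the precise definitions and regularity conditions are given in the later chapters), the two notions of accuracy bound can be written as follows.

```latex
% Traditional asymptotic viewpoint: noise fixed, number of data N grows.
V[\hat{\theta}_N] = O(1/N), \qquad
\text{bound defined via } \lim_{N \to \infty} N\, V[\hat{\theta}_N].

% Viewpoint of this book: N fixed, noise level \varepsilon shrinks.
V[\hat{\theta}_\varepsilon] = O(\varepsilon^{2}), \qquad
\text{bound defined via } \lim_{\varepsilon \to 0} \frac{1}{\varepsilon^{2}}\, V[\hat{\theta}_\varepsilon].
```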
1.2.2 Evaluation of reliability and testing of hypotheses
The fact that the reliability of maximum likelihood estimation can be evaluated quantitatively implies, for example, that when we fit a line to a sequence of point data, we can compute not only an optimal fit but also the probability that the fitted line deviates from the true position by a specified amount. This fact provides a means to visualize the most likely deviations. In this book, we present two visualization techniques--the standard confidence region and the primary deviation pair. They are multi-dimensional analogues of the "confidence interval" for point estimation in statistics. Being able to predict the probability of likely deviations means that we
can derive a technique for testing hypotheses. Namely, if a prediction based on a hypothesis is not consistent with its observation to a significant degree, the hypothesis is rejected. For example, when we apply a minimization scheme for fitting a line to a sequence of points which are supposedly deviated from their original linear configuration by image noise, we can compute the probability distribution of the residual of the minimization. If the actual residual value is very large as compared with the predicted noise level of the sensor, the hypothesis that the original configuration of the points is linear is rejected. In this book, we give a rigorous statistical criterion for testing geometric hypotheses in the form of the χ² test. However, such a χ² test is effective only when the accuracy of the sensor can be estimated a priori. If no information about the noise is available, we cannot tell whether the hypothesis is violated because of the noise or the hypothesis itself is wrong. Also, we need to set the significance level for doing a χ² test, but no theoretical basis exists for its choice (5% or 1%?). In this book, we show that the goodness of the assumed geometric constraint, or model, can be measured by the geometric information criterion obtained by modifying the AIC widely known in statistics [2]. By using the geometric information criterion, we can compare the plausibility of two models without introducing any arbitrarily set threshold.
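As a schematic illustration of such a test (the notation is generic; the exact statistic and its degrees of freedom depend on the problem and are derived in the later chapters), suppose an optimal fit yields the residual Ĵ and the sensor noise is Gaussian with known level ε. Then, to a first approximation, Ĵ/ε² obeys a χ² distribution with an appropriate number r of degrees of freedom, and the geometric hypothesis is rejected at significance level a% if the residual exceeds the corresponding upper percentile.

```latex
\frac{\hat{J}}{\varepsilon^{2}} \sim \chi^{2}_{r}
\quad\Longrightarrow\quad
\text{reject the hypothesis if } \frac{\hat{J}}{\varepsilon^{2}} > \chi^{2}_{r,a},
\quad \chi^{2}_{r,a}: \text{upper } a\% \text{ point of } \chi^{2}_{r}.
```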
1.2.3 Geometric models as manifolds
Parametric fitting may sound merely like fitting lines and surfaces to point data, but it has a much wider meaning. In this book, we primarily deal with such simple geometric primitives as points, lines, planes, conics, and quadrics in two and three dimensions, but our theory can be applied to any primitives as long as they can be specified by coordinates and parameters. In fact, any primitive can be viewed as a point in an appropriately defined parameter space; our theory holds in any number of dimensions. A relationship between primitives can be identified with a parameterized manifold in an abstract parameter space; we call such a manifold a model. The goal of parametric fitting is to find an optimal model that best fits the data points that represent individual primitives. The criterion of optimality is given in terms of the Mahalanobis distance in the parameter space. Thus, inferring a true configuration that satisfies a constraint such as incidence, coincidence, collinearity, concurrency, coplanarity, parallelism, or orthogonality can be viewed as parametric fitting. From this viewpoint, the problem of 3-D reconstruction from stereo and motion images and optical flow is also parametric fitting: a 3-D structure is reconstructed by fitting a relationship called the epipolar constraint [47, 90, 135, 227]. Once a relationship is inferred by parametric fitting, the next stage is geometric correction. For example, after a line is fitted to non-collinear points, the individual points are optimally moved onto the fitted line. If the epipolar
geometry is fitted to stereo or motion data, they are optimally corrected so as to satisfy the fitted geometry. Mathematically, this problem can be viewed as an optimal projection of points that represent primitives onto a manifold in an abstract space, and the criterion of optimality is given in terms of the Mahalanobis distance. This type of geometric correction not only increases the accuracy of 3-D reconstruction but also allows us to evaluate its reliability, since the amount of such correction indicates the degree of inaccuracy of the data. Geometric correction is also important if we adopt a computational technique called linearization for 3-D motion analysis. It has been widely believed that this technique sacrifices accuracy for the sake of computational convenience [189, 191, 207]. In this book, we point out that linearization does not reduce the accuracy if the geometric correction is optimally applied. In the past, attempts have been made to view motion and optical flow analysis as traditional statistical estimation and apply standard statistical techniques [222, 233]. However, explicit and analytically closed expressions are difficult to obtain because what are known in statistics as nuisance parameters are involved. Viewing the problem as parametric fitting and geometric correction in an abstract space, we can systematically derive an optimal estimation criterion, an efficient numerical scheme, a theoretical bound on accuracy, and statistical tests for various types of hypotheses. In this respect, our approach is very different from all existing treatments of motion and optical flow analysis.
1.2.4 Numerical schemes for optimization
Presenting a mathematical theory is not the only purpose of this book. Giving consideration to making the theory applicable in real situations is also a major theme. The use of the rank-constrained generalized inverse to prevent numerical instability and increase the robustness of the solution is one example. Numerical schemes for computing an optimal solution are also studied in detail. Since an optimal solution is usually given by nonlinear optimization, which is time-consuming if solved by numerical search, we devise two simplified schemes for computing an approximately optimal solution: the optimal filter and renormalization. The latter plays a major role in this book. The renormalization procedure requires no a priori knowledge of the noise level; it is estimated a posteriori as a result of renormalization. It consists of iterated computations of eigenvalues and eigenvectors and bias-correction procedures. The accuracy of renormalization is shown to attain the theoretical bound in the first order. Many simulations and real image examples are given to show that renormalization is superior to conventional methods.
1.3 Organization and Background
1.3.1 Fundamentals of linear algebra
One of the main characteristics of our analysis is the geometric nature of the problem: the data are vectors and tensors in two and three dimensions that represent 2-D and 3-D objects. Hence, linear algebra plays a fundamental role. Chapter 2 summarizes vector and matrix calculus, the eigenvalue problem, the singular value problem, the generalized inverse, and matrix and tensor algebra. Since the materials presented here are well-established facts or their easy derivatives, theorems and propositions are stated without proofs. Many of them are proved in Kanatani [90] in the form of the answers to the exercises. The generalized inverse that will be used in this book is only of the Moore-Penrose type (see [171] for other types). Readers who want a more advanced treatment of 3-D rotations and related subjects, such as Lie groups and Lie algebras, are advised to read Kanatani [85]. Along with fundamentals of linear algebra, topics concerning linear equations and optimization are also presented in Chapter 2: techniques for robustly solving indeterminate or ill-posed equations are summarized, and analytical solutions are derived for least-squares optimization, constrained quadratic optimization, and optimal rotation fitting. The optimal rotation fitting problem has been studied by many researchers in many different forms [11, 67, 69, 90, 93, 212]. The solution given in Chapter 2 is based on Kanatani
[90, 93].
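The optimal rotation fitting problem mentioned above has a compact closed-form solution in the simplest (isotropic, equally weighted) case, which the following sketch illustrates by the widely known SVD-based construction. This is offered only as an illustration under those assumptions, not necessarily the derivation given in Chapter 2, and the data are hypothetical.

```python
import numpy as np

def fit_rotation(a, b):
    """Rotation R minimizing sum_i ||b_i - R a_i||^2 for paired 3-D vectors.

    a, b: arrays of shape (N, 3); equal, isotropic weights are assumed.
    """
    h = a.T @ b                                   # 3x3 correlation matrix
    u, _, vt = np.linalg.svd(h)
    d = np.sign(np.linalg.det(vt.T @ u.T))        # force det(R) = +1 (no reflection)
    return vt.T @ np.diag([1.0, 1.0, d]) @ u.T

# Self-check with a known rotation about the z-axis plus small noise.
rng = np.random.default_rng(1)
a = rng.normal(size=(20, 3))
c, s = np.cos(0.3), np.sin(0.3)
r_true = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
b = a @ r_true.T + rng.normal(scale=0.01, size=a.shape)
print(np.allclose(fit_rotation(a, b), r_true, atol=0.05))   # True
```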
1.3.2 Probabilities and statistical estimation
Chapter 3 summarizes basic concepts of probability and statistics such as mean, variance, covariance, the Gaussian distribution, the χ² distribution, and the χ² test. Also, various principles of statistical estimation including maximum likelihood estimation and maximum a posteriori probability estimation (Bayesian estimation) are described by assuming that the noise is Gaussian. Then, the (discrete-time) Kalman filter is derived by applying maximum a posteriori probability estimation to a linear dynamical system with Gaussian noise. The Kalman filter is often defined as minimum mean square estimation by means of orthogonal projection, since this formalism can be applied to non-Gaussian processes as well. However, minimum mean square estimation is identical with maximum a posteriori probability estimation if all variables are Gaussian, and the derivation of the Kalman filter is much easier for Gaussian processes. The details of the Kalman filter as well as the (continuous-time) Kalman-Bucy filter are left to the literature [10, 29, 33, 54, 74, 78, 79, 138, 140]. Chapter 3 also gives a general formulation of statistical estimation that does not depend on Gaussian properties. First, the Cramer-Rao lower bound
is derived for the covariance matrix of an arbitrary unbiased estimator in terms of the Fisher information matrix. Then, asymptotic properties of the maximum likelihood estimator and the role of the exponential family of distributions are discussed without going into details. Finally, we derive the AIC (Akaike information criterion) that measures the goodness of a statistical model. The AIC was proposed by Akaike [2] and has been widely used for selecting a statistical model for a given random phenomenon without introducing an arbitrarily set threshold such as the significance level of the χ² test.

Our analysis has two non-traditional elements. Firstly, we consider probability distributions over a manifold defined by geometric constraints, so fundamentals of manifolds and tangent spaces are briefly summarized. Detailed discussions, transversality in particular, are found in books on catastrophe theory [166, 201]. Statistical treatment of random variables that are constrained to be in a manifold is very difficult even for a very simple one such as a sphere in three dimensions [82, 129]. In order to simplify the argument, we consider only local distributions, assuming that the noise is very small. This is related to the second feature of our analysis: we are mainly interested in the statistical behavior of estimates in the limit of small noise as opposed to their behavior in the limit of a large number of data, which is the main concern of traditional statistical analysis.

There exists a vast amount of introductory literature on probability and statistics. Classical textbooks include [37, 38, 49, 52, 53, 62, 63, 104, 107, 115, 170, 230]. Statistical analysis involving generalized linear models, the Fisher information matrix, and the exponential family of distributions can be found in [42, 137]. Recently, much attention of statisticians has been drawn to geometric treatments of statistical distributions: a parameterized probability density is identified with a point in the parameter space, and statistical problems are interpreted in such geometric terms as Riemannian metrics and affine connections. This approach is called statistical geometry [7, 14, 15, 143]. Although this book also takes a geometric approach, we do not need such sophisticated concepts: all we need is tangent spaces to manifolds and projection operations onto them.
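For reference, the classical statement of this bound for an unbiased estimator θ̂ of the parameter θ of a density p(x; θ) is given below, together with the standard form of the AIC; the geometric versions used in this book, with constrained parameters and singular covariance matrices, are stated in the later chapters. Here ⪰ means that the difference of the two matrices is positive semi-definite.

```latex
J(\theta) = E\!\left[
  \left(\frac{\partial \log p(x;\theta)}{\partial \theta}\right)
  \left(\frac{\partial \log p(x;\theta)}{\partial \theta}\right)^{\!\top}
\right],
\qquad
V[\hat{\theta}] \succeq J(\theta)^{-1}
\quad\text{(Cramer-Rao lower bound)},

\mathrm{AIC} = -2 \log(\text{maximum likelihood}) + 2\,(\text{number of free parameters}).
```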
1.3.3 Representation of geometric primitives
Chapter 4 discusses mathematical representations of basic geometric primitives in two and three dimensions such as points, lines, planes, conics, and quadrics. These primitives are the main ingredients of projective geometry [90, 183, 184]. However, the representations established in projective geometry are defined with a view to making the mathematical treatment consistent without allowing any anomalies. For example, ordinary points and ideal points (points at infinity) are treated in the same way in terms of homogeneous coordinates. Hence, two lines always meet at a single (ordinary or ideal) point in
two dimensions. Another characteristic of projective geometry is its duality: points and lines are treated in an identical manner in two dimensions, and so are points and planes in three dimensions. This implies that theorems and statements concerning points, lines, and planes are automatically extended to their duals. In engineering applications such as computer vision and robotics, ordinary objects and ideal objects need to be distinguished, because only ordinary objects can be measured by sensors in real environments. Also, the duality concerning points, lines, and planes does not hold in real environments, because the error behavior is different for an object and its dual. In image analysis, for example, feature lines are defined by edge detection followed by line fitting, while feature points are usually defined as the intersections of detected feature lines. Since the error behavior of lines thus defined and the error behavior of points computed from them are very different, the mathematical elegance of projective geometry is destroyed, and representations of geometric primitives should reflect the way they are computed from sensor data. In order to describe geometric primitives and their relationships in real environments, one must therefore reformulate projective geometry from a computational point of view. This was done by Kanatani [86, 90, 91, 94, 95], who called the resulting formalism computational projective geometry. The treatment in Chapter 4 is a hybrid of projective geometry and real computation, but the emphasis is shifted more toward computational aspects than toward the role of projective geometry. Moreover, the same primitive is often given multiple representations; one is useful for error analysis, while another is useful for describing geometric relationships to other primitives. Almost all the representations used in this book have inherent constraints in one way or another, and the number of parameters that specify a primitive is generally larger than its true degrees of freedom. As a result, error behavior is described by a singular covariance matrix having a null space, and dealing with null spaces is one of the main characteristics of our analysis. In Chapter 4, we list formulae to convert expressions for geometric properties and error behavior from one representation to another and from one coordinate system to another. Then, perspective projection is introduced as a physical interpretation of the homogeneous coordinates. This is the key relationship between 3-D objects and their 2-D descriptions. In order to apply 3-D analysis based on perspective projection to real camera images, the camera is assumed to be calibrated, i.e., its imaging geometry is known and modeled as perspective projection. Techniques for camera calibration are found in [13, 32, 44, 56, 59, 57, 88, 98, 110, 116, 117, 136, 156, 199, 210, 213, 214, 215, 218, 223].

Finally, we give a brief account of conics and quadrics. A conic (also referred to as a conic locus or conic section) is a plane curve defined by a quadratic equation in the coordinates. Conics are classified into ellipses (including circles), parabolas, hyperbolas, and their various degeneracies. If a
robot is to operate in an industrial environment (say, in a nuclear power station), it must recognize gauges, meters, dials, and handles, most of which are circular, and circles are perspectively projected into ellipses. Hence, ellipses, or conics in general, are widely recognized as one of the most fundamental features in the study of computer vision and robotics [50, 169, 121, 178]. Detected conics can provide not only clues to object recognition; if the observed conic in an image is known to be a perspective projection of a conic in the scene of a known shape, its 3-D geometry can be computed analytically [51, 90, 97, 128, 180, 181]. A quadric is a space surface defined by a quadratic equation in the coordinates. Quadrics are classified into ellipsoids (including spheres), paraboloids, hyperboloids, and their various degeneracies. As compared with conics, the role of quadrics appears to be minor in computer vision and robotics applications, since ellipsoidal objects are rare in real environments. However, quadrics play an important role for visualizing the reliability of computation in three dimensions. We also present a technique for visualizing the reliability of objects that have more than three degrees of freedom by means of the
primary deviation pair. In Chapter 4, we only deal with points, lines, planes, conics, and quadrics in two and three dimensions, but this is simply because they are the most frequently encountered primitives in robotics applications. Indeed, the statistical estimation theory described in this book can be applied to any primitives if they are identified with points in an abstract parameter space and their interrelationships are represented by manifolds in that space.
1.3.4 Geometric correction
Chapter 5 discusses the geometric correction problem in general. A typical problem is to move data points so that they are on a particular line, curve, plane, or surface. Such a correction is necessary if the points we are observing are known to be on a specific line, curve, plane, or surface in the absence of noise. A naive idea is to project each point to the "nearest point" on the line, etc. However, this can be justified only when the noise distribution is isotropic. Otherwise, the correction should be such that it compensates for the most likely deviation. This problem is not limited to points; it can also apply to lines, planes, and more complex objects for imposing geometric constraints on their configurations. If the noise is Gaussian, an optimal solution is obtained by minimizing the Mahalanobis distance: the solution is the "nearest point" on the constraint surface in the parameter space measured in the Mahalanobis distance. We derive an explicit expression for the optimal solution, for which the noise level need not be known: it can be estimated a posteriori by analyzing the statistical behavior of the residual of the minimization. We also show that in the course of computing an optimal solution, its a posteriori covariance
matrix is automatically evaluated. If the noise level can be predicted in advance, we can test the existence of the constraint: the hypothesis that the observed object is in a certain configuration is rejected if the residual of the minimization is very large as compared with the predicted noise level. We formalize this process as the χ² test. After giving a general theory, we apply it to the problem of imposing coincidence and incidence on points, lines, conics, and planes. We also study the problem of imposing orthogonality on orientations. Applications of this type of geometric correction in reconstructing 3-D structures from stereo images for robot navigation are found in [161, 162, 163, 165].
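Schematically, and with simplified notation (the data covariance matrix may be singular, in which case its generalized inverse is used), the geometric correction of a datum a with covariance matrix V[a] under a constraint F(ā) = 0 is the constrained minimization

```latex
\hat{a} = \arg\min_{\bar{a}}\;
(a - \bar{a})^{\top}\, V[a]^{-}\, (a - \bar{a})
\quad\text{subject to}\quad F(\bar{a}) = 0,
```

and it is the residual of this minimization, compared with the predicted noise level, that the χ² test mentioned above evaluates.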
1.3.5 3-D computation by stereo vision
In Chapter 6, we study stereo vision--3-D reconstruction by triangulation from two (or more) images. Since the principle is very simple, stereo vision has become one of the most widely used means of 3-D sensing for autonomous manipulation and navigation by robots. In the past, studies of stereo vision have been almost entirely concentrated on matching of the two stereo images. This is because establishing correspondences between the two images is a very difficult task to automate efficiently, and many matching techniques have been proposed and tested [47, 57, 227]. In contrast to matching, 3-D reconstruction appears rather trivial. However, since image data contain noise, the reconstructed 3-D structure is not necessarily accurate. Hence, it is very important to evaluate the reliability of the reconstructed 3-D structure. Otherwise, robots are unable to take appropriate actions to achieve given tasks with required precision. This issue has been discussed by only a few researchers (e.g., see [19]). We concentrate on this issue and evaluate the covariance matrices of the reconstructed points and lines by applying the theory of geometric correction given in Chapter 5. In Chapter 6, we first discuss the epipolar constraint of stereo vision by introducing the concepts of epipoles and epipolars and show that the degree to which the required epipolar equation is not satisfied provides an estimate of the noise level, from which the reliability of the reconstructed 3-D structure can be evaluated. This result is based on Kanazawa and Kanatani [101]. It turns out that this analysis is equivalent to introducing a Riemannian metric into the 3-D scene by projecting the direct product of two images onto the 3-D manifold defined by the epipolar equation. Applying the theory of geometric correction given in Chapter 5, we also present an optimal scheme for imposing the constraint that the reconstructed points and lines must be on a specified planar surface. Finally, we analyze the errors in 3-D reconstruction due to camera calibration errors (e.g., see [39, 80, 186] for stereo camera calibration).
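In one common formulation (for calibrated cameras; the notation here is generic and not necessarily that of Chapter 6), the epipolar constraint states that the two lines of sight m and m′ toward the same scene point, with m′ mapped into the first camera frame by the relative rotation R, are coplanar with the baseline t:

```latex
\langle \boldsymbol{t},\; \boldsymbol{m} \times R\boldsymbol{m}' \rangle = 0 .
```

The amount by which measured correspondences violate this equation is what provides the noise-level estimate referred to above.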
1.3.6 Parametric fitting
In Chapter 7, we study methods for fitting a geometric object to multiple instances of another geometric object in an optimal manner in the presence of noise. A typical example is fitting a line to a sequence of points. Line fitting is one of the most important steps in computer vision, and various techniques for it have been studied in the past [81, 106, 219, 235]. This is because the first step of image analysis is detecting edges, i.e., sequences of pixels constituting boundaries that separate objects from the background. They are detected by applying a filter called an edge operator. Since man-made objects in robotic workspaces usually have linear boundaries, many objects in the image can be located by fitting straight lines to detected edge pixels. Then, their shapes and locations in the scene are inferred by various techniques, e.g., by computing vanishing points and focuses of expansion [24, 90]. Since errors in line fitting propagate to the final 3-D reconstruction, the reliability of the reconstructed 3-D shape can be evaluated if the reliability of line fitting is quantitatively evaluated. In Chapter 7, we analyze the statistical behavior of a line fitted to a sequence of edge pixels. See [23, 87, 88, 91, 94, 187, 220] for various types of statistical inference based on line fitting errors.

The theory in Chapter 7 is not limited to line fitting. We generalize the theory of parametric fitting to deal with arbitrary geometric objects in abstract terms. The goal is not only obtaining an optimal fit but also evaluating its reliability in statistical terms. First, the criterion for parametric fitting is derived as maximum likelihood estimation by assuming that the noise is Gaussian. The covariance matrix of the resulting estimate is evaluated, and the statistical behavior of the residual of optimization is analyzed. This analysis leads to the χ² test for the hypothesis that the observed objects are in a special configuration. Then, we study various types of fitting problem in two and three dimensions such as finding an optimal average, estimating a common intersection, and fitting a common line or plane.
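In the abstract form developed there (schematic notation; generalized inverses replace ordinary inverses when the covariance matrices are singular), maximum likelihood fitting of a constraint F(ā; u) = 0 with unknown parameter u to data a₁, ..., a_N with Gaussian noise minimizes the sum of Mahalanobis distances

```latex
J = \sum_{\alpha=1}^{N}
(a_\alpha - \bar{a}_\alpha)^{\top}\, V[a_\alpha]^{-}\, (a_\alpha - \bar{a}_\alpha)
\quad\text{subject to}\quad
F(\bar{a}_\alpha; u) = 0, \qquad \alpha = 1, \dots, N,
```

over both the corrected positions ā_α and the parameter u; the minimum residual is the quantity whose statistical behavior leads to the χ² test mentioned above.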
1.3.7 Optimal filter and renormalization
In Chapters 8 and 9, we study numerical methods for efficiently computing the optimal solution of the parametric fitting problem. In Chapter 8, we construct a filter that optimally updates the estimate each time a new datum is read; we call it simply the optimal filter. The update rule is derived for Gaussian noise by adopting the Bayesian standpoint and applying maximum a posteriori probability estimation. Various assumptions and approximations involved in the derivation are elucidated, and the philosophical implications of the Bayesian approach are also discussed. The update rule is simplified by introducing the effective gradient approximation. The Kalman filter was originally derived for linear dynamical systems, but it can also be applied to nonlinear dynamical systems by introducing
linear approximation into the system equations. The resulting filter is known as the extended Kalman filter [10, 29, 33, 54, 140]. In Chapter 8, we show that the optimal filter is also obtained from the extended Kalman filter if the parametric fitting problem is identified with a nonlinear dynamical system with no state transition (i.e., a nonlinear "static" system). For this reason, the optimal filter described in Chapter 8 is often called the "extended Kalman filter" [61, 161, 162, 163, 165]. Porrill [164] pointed out that the extended Kalman filter yields a statistically biased solution if the standard procedure for linear approximation is applied. He also proposed a method for correcting the bias, which is a special version of the effective gradient approximation described in Chapter 8.

Chapter 9 focuses on the parametric fitting problem for linear equations. To a first approximation, the problem reduces to least-squares fitting, for which the solution can be obtained by solving an eigenvalue problem; we call the resulting solution the eigenvector fit. We first show that this least-squares approximation introduces statistical bias into the solution. After analyzing the statistical bias in quantitative terms, we present a simple correction scheme for cancelling this bias; we call the resulting solution the unbiased eigenvector fit. In order to compute the unbiased eigenvector fit, however, we need to estimate the noise level precisely, which is usually difficult. In order to avoid this difficulty, we present a scheme for computing an unbiased solution, called the generalized eigenvector fit, without assuming a priori knowledge of the noise level. The computation is further simplified into an iterative form called renormalization. In contrast to the optimal filter, no initial estimate is necessary for renormalization. The noise level is estimated a posteriori as a result of renormalization, and the covariance matrix of the computed estimate is automatically obtained. We then discuss a procedure called linearization, which allows us to apply renormalization to nonlinear equations. Finally, we define second order renormalization that removes statistical bias up to higher order terms.
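A minimal sketch of the flavor of this computation is given below for the linear constraint uᵀξ_α = 0: the plain eigenvector fit takes the smallest eigenvector of the moment matrix M, and a renormalization-style iteration subtracts an estimate cN of the noise-induced bias of M until the smallest eigenvalue vanishes. This unweighted sketch is only illustrative; the actual scheme of Chapter 9 also introduces data-dependent weights and differs in detail. Variable names and the example data are hypothetical.

```python
import numpy as np

def renormalize(xi, v0, tol=1e-10, max_iter=100):
    """Estimate the unit vector u in u^T xi_a = 0 from noisy vectors xi_a.

    xi: (N, d) array of data vectors; v0: list of their normalized covariance
    matrices (known only up to a common scale).  Returns (u, c), where c is
    an a posteriori estimate of the squared noise level.
    """
    n = xi.shape[0]
    m = sum(np.outer(x, x) for x in xi) / n        # moment matrix
    nmat = sum(v0) / n                             # mean normalized covariance
    c = 0.0
    for _ in range(max_iter):
        w, vecs = np.linalg.eigh(m - c * nmat)
        lam, u = w[0], vecs[:, 0]                  # smallest eigenpair
        if abs(lam) < tol:
            break
        c += lam / (u @ nmat @ u)                  # bias-correction update
    return u, c

# Example: points near the line x + 2y - 3 = 0, written as u^T (x, y, 1) = 0.
rng = np.random.default_rng(2)
t = np.linspace(-1.0, 1.0, 30)
pts = np.column_stack([1.0 + 2.0 * t, 1.0 - t])    # exact points on the line
pts += rng.normal(scale=0.02, size=pts.shape)       # isotropic image noise
xi = np.column_stack([pts, np.ones(len(pts))])
v0 = [np.diag([1.0, 1.0, 0.0])] * len(pts)          # noise only in x and y
u, c = renormalize(xi, v0)
print(u / np.linalg.norm(u[:2]))   # close to (1, 2, -3)/sqrt(5), up to sign
```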
1.3.8 Applications of geometric estimation
Chapter 10 illustrates the renormalization technique by solving the problems of line fitting and conic fitting in two dimensions and plane fitting in three dimensions. The computation for line fitting by renormalization is straightforward. If the image noise is uniform and isotropic, the least-squares scheme gives an optimal fit. This means that accuracy is not increased by the use of renormalization any further. However, the advantage of renormalization is that in the course of the fitting computation the reliability of the fit is automatically evaluated in the form of the covariance matrix. We will demonstrate this by showing the standard confidence region and the primary deviation pair that visualize the reliability of the solution computed from simulations and
real image data.

Next, we study the problem of conic fitting. As pointed out earlier, conics are important image features for computer vision, and if the observed conic in an image is known to be a perspective projection of a conic in the scene of a known shape, its 3-D geometry can be computed analytically [51, 90, 97, 121, 128, 178, 180, 181]. In order to do such an analysis, the conic must be given a mathematical representation in advance. This is done by applying an edge operator to a gray-level image, detecting object boundaries as edge segments, and fitting a quadratic curve to those edge segments which supposedly constitute a boundary of a conic region. Numerous conic fitting techniques have been proposed in the past [4, 20, 21, 34, 41, 45, 55, 145, 155, 167, 179, 182], but most of them are least-squares schemes with different parameterizations and criteria, and little consideration has been given to the statistical behavior of image noise. An exception is Porrill [164], who devised an iterative filtering scheme, which he called the "extended Kalman filter." Pointing out the existence of statistical bias in the solution if the filter was formulated naively, he proposed a bias correction procedure, which is equivalent to the "effective gradient approximation" introduced in Chapter 8. Kanatani [92] presented a prototype of renormalization, from which the formulation in this book has evolved. As in the case of line fitting, the reliability of the fit is evaluated in the course of the fitting computation.

Finally, we study the problem of fitting a planar surface to two types of data. Planar surface fitting is a very important process in an indoor robotic workspace, where many objects, including walls, ceilings, and floors, have planar surfaces. First, we assume that the 3-D positions of feature points that are known to be coplanar in the scene are measured by a range finder. In this case, the uncertainty of the data has a special form, according to which the covariance matrix of the data is modeled. The computation is straightforward, and the reliability of the fit is automatically evaluated in the form of the covariance matrix in the course of the fitting computation. Line and plane fitting to range data has been studied by many researchers in the past [22], but the reliability of the fit has been evaluated only in an ad hoc manner [197]. The method described here is based on Kanazawa and Kanatani [102]. Next, we consider the problem of reconstructing a planar surface from stereo images of feature points that are known to be coplanar in the scene. An indirect but straightforward method is first computing the 3-D positions of individual feature points separately and then fitting a planar surface to them. However, it is expected that the reliability of the 3-D reconstruction can be enhanced if the knowledge that the feature points are coplanar is incorporated from the beginning. With this motivation, we optimally reconstruct a planar surface directly from stereo correspondence pairs. Applying the theory of geometric correction given in Chapter 5, we optimally correct corresponding
pairs of feature points in the stereo images in such a way that they define exactly coplanar points in the scene. Then, the parameters of the plane on which they lie are computed by renormalization. Here again, the reliability of the fit is evaluated in the course of the fitting computation. This result is based on Kanazawa and Kanatani [100].
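All three fitting problems treated in this chapter reduce to the same linear form uᵀξ = 0 once a suitable vector ξ is built from the coordinates; the parameterizations below are schematic, and the normalizations used in the book may differ.

```latex
\text{line:}\quad \xi = (x,\ y,\ 1)^{\top}, \qquad
\text{plane:}\quad \xi = (x,\ y,\ z,\ 1)^{\top},

\text{conic:}\quad Ax^{2} + 2Bxy + Cy^{2} + 2Dx + 2Ey + F = 0
\;\Longleftrightarrow\;
u = (A,\, B,\, C,\, D,\, E,\, F)^{\top},\quad
\xi = (x^{2},\ 2xy,\ y^{2},\ 2x,\ 2y,\ 1)^{\top}.
```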
1.3.9 3-D motion analysis
Mathematical analysis of 3-D rigid motion estimation, known as shape (or structure) from motion, was initiated by Ullman [211], who presented a basic mathematical framework that has had a lasting influence over the subsequent computer vision research. Roach and Aggarwal [177] applied this framework to real images and obtained the solution by numerical search. Nagel [144] presented a semi-analytical formulation, reducing the problem to solving a single nonlinear equation. A complete analytical solution for eight feature points was independently given by Longuet-Higgins [122] and Tsai and Huang [207]. The solution of Longuet-Higgins was based on elementary vector calculus, while the solution of Tsai and Huang involved singular value decomposition. Zhuang et al. [238] combined them into a simplified eight-point algorithm. Zhuang [236] also discussed the uniqueness issue. All these algorithms first compute the essential matrix from the epipolar equation and then compute the motion parameters from it. This technique is called linearization, and algorithms that use it are called linearized algorithms. Huang and Faugeras [73] pointed out that a matrix can be an essential matrix for some motion if and only if it has singular values 1, 1, and 0. Linear algorithms compute the essential matrix without considering this decomposability condition. Since the essential matrix has five degrees of freedom, a 3-D interpretation can be determined, in principle, from five feature points. Using a numerical technique called the homotopy method, Netravali et al. [150] showed the existence of at most ten solutions. Arguing from the standpoint of projective geometry, Faugeras and Maybank [48] also showed that at most ten solutions can be obtained from five feature points. They reduced the problem to solving an algebraic equation of degree ten and solved it by symbolic algebra software. Using the quaternion representation of 3-D rotation, Jerian and Jain [75] reduced the problem to solving the resultant of degree 16 of a pair of polynomials of degree 4 in two variables and computed the solution by symbolic algebra software. Other proposed techniques include [113, 103]. Jerian and Jain [76] exhaustively reviewed algorithms known by that time and compared their performances for noisy data. However, all these algorithms are constructed on the assumption that all data are exact. Hence, they are all fragile in the sense that inconsistencies arise in the presence of noise (e.g., the solution becomes different, depending on which of the theoretically equivalent relationships are used). A noise-robust algorithm was presented by Weng et al. [226], who estimated the essential
matrix from the epipolar equation by least squares and then computed the motion parameters by least squares. Spetsakis and Aloimonos [188, 189, 190] applied direct optimization to the epipolar equation without computing the essential matrix. Although error analyses have been given for 3-D motion analysis by several researchers, most of the studies were empirical and qualitative, e.g., estimating approximate orders of errors and conducting simulations with noisy data [112, 157]. A notable exception is Weng et al. [226], who analyzed the perturbation of the essential matrix and the resulting motion parameters in detail. The fact that the least-squares solution based on the epipolar equation is statistically biased has also been recognized [3, 189, 190] and analyzed in detail by Kanatani [89, 90]. Spetsakis [191] conducted a statistical error analysis on a heuristic basis. Weng et al. [222] presented an optimal algorithm by identifying the problem with a traditional statistical estimation problem. Planar surface motion has also been studied by many psychologists from the viewpoint of human visual perception, and the fact that multiple 3-D interpretations are possible was pointed out by Hay [58] as early as 1966. A rigorous mathematical analysis was given by Tsai and Huang [206, 208] and Tsai et al. [209] by using singular value decomposition. A complete solution was presented by Longuet-Higgins [125], and error analysis was done by Weng et al. [221]. These results are summarized in Kanatani [47, 90, 135, 227]. Hu and Ahuja [72] extended the analysis to multiple planar surface images.

Chapter 11 presents a statistically optimal algorithm for computing the 3-D camera motion and the object shape from corresponding points over two views. We first study the case in which the feature points in the scene are in general position and then the case in which they are known to be coplanar. Our analysis is distinct from all existing studies in many respects. First of all, the reliability of the computed motion parameters and the reconstructed depths is evaluated in the form of their covariance matrices, and the theoretical bound on their attainable accuracy is obtained in an explicit form. This is made possible by viewing 3-D motion analysis as a correction and fitting problem and applying the theories established in Chapters 5 and 7. This viewpoint is in sharp contrast to that of Weng et al. [222], who treated the problem as a traditional statistical estimation problem. A germ of our approach was found in the work of Trivedi [205], but his idea has not been noticed very much: the present theory is in a sense a full-fledged version of his idea. The analysis of general motion given here is based on Kanatani [96], and the analysis of planar surface motion is based on Kanatani and Takeda [99]. Since we first compute the essential matrix and then decompose it into the motion parameters, our algorithm can be classified as a linear algorithm. In the past, the linearization technique has often been rejected as having poor accuracy as compared with direct nonlinear optimization. We point out that this is because statistical aspects have not been considered fully: we demon-
demonstrate that linearization does not reduce the accuracy if the essential matrix is optimally computed and optimally corrected. The renormalization procedure produces not only a statistically optimal estimate of the essential matrix but also its covariance tensor, according to which the computed essential matrix is optimally corrected so as to satisfy the decomposability condition. Another new viewpoint given in Chapter 11 is the introduction of two types of statistical test. One is the rotation test. Prior to 3-D analysis, we need to test if the camera motion is a pure rotation, in which case no 3-D information is obtained. This is done by first hypothesizing that the camera motion is a pure rotation and then testing whether the observed images support that hypothesis to a statistically admissible degree. Since no 3-D information is obtained if the camera motion is a pure rotation, the degree to which the observed images support the rotation hypothesis can be viewed as defining the information in the motion images. By computing the information in observed images, we can predict the degree of robustness of the resulting 3-D reconstruction. The other test is the planarity test: we need to test if the object is a planar surface, in which case a separate algorithm must be used. In the past, various ad hoc criteria have been used for this test. For example, since the algorithm for general motion breaks down if the object is exactly planar, it has been customary to switch to the planar surface algorithm only when computational difficulties are encountered (e.g., if theoretically nonzero quantities approach zero), and the judgment about this has been made arbitrarily. Our approach is the opposite: we first use the planar surface algorithm and switch to the general motion algorithm only when the assumption of planarity is questioned. We make this judgment by the χ² test, hypothesizing that the object is a planar surface and testing if the observed data support that hypothesis. 3-D interpretation of the scene becomes more reliable and more realistic if a sequence of image frames is incorporated [71, 198]. However, the discussion in this book is limited to two-view analysis, because multi-view analysis raises many mathematical and technical problems that are outside the scope of this book. One approach is the use of the extended Kalman filter with non-linear internal dynamics (the optimal filter we discuss in Chapter 8 has no internal dynamics); see [25, 26, 27, 28, 47, 114, 131, 164, 227, 231]. Thomas et al. [202] presented a related statistical analysis. Another approach includes considering the physics of the 3-D motion in question [225] and using stereo image sequences [224, 234]. Tomasi and Kanade [203] proposed a 3-D analysis method called factorization for a sequence of orthographically projected images. Chapter 11 includes a brief discussion about the critical surface that gives rise to ambiguity of 3-D interpretation. Longuet-Higgins [123] pointed out that all linear algorithms for determining the essential matrix fail to yield a unique solution if and only if all the feature points are on a special quadric surface, which is called the weak critical surface in this book. Horn [68] studied
the problem of two-camera registration, which is different from 3-D motion analysis in appearance but has essentially the same mathematical structure. He also discussed ambiguity of interpretation and showed that the critical surface must be a hyperboloid of one sheet or its degeneracy. A more direct analysis was given by Longuet-Higgins [126], who analyzed the condition that the critical surface degenerates into two planar surfaces or takes on other special forms. Further analyses were done by Negahdaripour [147, 148] and Maybank [133, 134, 135], giving detailed classifications of possible types of the critical surface. Their results are concisely summarized in Kanatani [90]. In this book, we concentrate on 3-D analysis of motion images and do not deal with image processing techniques for tracking feature points, on which research is still in progress [47, 57, 227].
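To make the linearization technique discussed above concrete, the following is a minimal sketch of the conventional linear procedure: the essential matrix is estimated from the epipolar equation by unweighted least squares and then forced to satisfy the decomposability condition (singular values 1, 1, 0) by singular value decomposition. This is the naive scheme that the text calls statistically biased, not the renormalization-based algorithm of Chapter 11; the function name and the use of normalized camera coordinates are assumptions made for illustration.

```python
import numpy as np

def estimate_essential_matrix(x1, x2):
    """Naive eight-point style estimate of the essential matrix.

    x1, x2 : (N, 3) arrays of corresponding points in normalized camera
    coordinates (third component 1).  Each correspondence gives one linear
    epipolar equation (x2, E x1) = 0 in the nine elements of E.
    """
    A = np.stack([np.outer(p2, p1).ravel() for p1, p2 in zip(x1, x2)])
    # The right singular vector of the smallest singular value minimizes ||A e||
    # subject to ||e|| = 1.
    _, _, Vt = np.linalg.svd(A)
    E = Vt[-1].reshape(3, 3)
    # Correct E so that it is decomposable: singular values must be 1, 1, 0.
    U, _, Vt = np.linalg.svd(E)
    return U @ np.diag([1.0, 1.0, 0.0]) @ Vt
```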
1.3.10 3-D interpretation of optical flow
If a sequence of images is taken by a smoothly moving camera, the difference between two consecutive frames is very small. Small displacements of points define a dense "flow," called optical flow, on the image plane. In general, the computation of 3-D interpretation based on optical flow is likely to be sensitively affected by image noise. Also, detecting accurate optical flow is very difficult. However, optical flow has the advantage that the flow can be detected densely (usually at each pixel) over the entire image by an image processing technique, while it is in general very difficult to detect point-to-point correspondences between two images, particularly so if the interframe motion is large. Hence, the use of optical flow is expected to have practical significance only if an appropriate optimization technique is available. In Chapter 12, we first summarize the principle for optimally detecting optical flow from two gray-level images. A prototype of optical flow detection techniques is due to Horn and Schunck [70]. However, it has a flaw in that it uses the smoothness constraint, which is a special case of the heuristics called regularization [158, 159]. If the detected flow is to be used for image segmentation (e.g., for separating moving objects from the stationary scene), the smoothed (or regularized) solution will often blur the motion boundaries. If the purpose is 3-D reconstruction, smoothing does not increase the accuracy of the reconstructed 3-D shape, because smoothing is essentially interpolation in the image. Rather, we should reconstruct only those 3-D points which produce reliable optical flow. We should then interpolate them in the scene when and only when some knowledge about the true shape of the object (e.g., planarity) is available. In Chapter 12, we apply the theory of parametric fitting given in Chapter 7 to optical flow detection. However, since research on optical flow detection is still in progress [47, 57, 227], we do not go into details. A comprehensive review of existing techniques is given in [16]. The discussion in this book is limited to a single optical flow image observed by one camera in motion. 3-D analysis from stereo optical flow images observed
by two or more cameras in motion is found in [105, 118, 142, 216]. We then turn to 3-D reconstruction from optical flow, which has also been studied in various forms in the past. If the object is a planar surface, the solution can be computed in an analytical form although multiple solutions exist [83, 84, 124, 192]. If the object surface is expressed as a polynomial or a collection of planar patches, the problem reduces to estimating the coefficients [1, 193, 217]. An analytical solution can be obtained if spatial derivatives of the flow velocity are used [127]. Since the flow due to camera rotation is depth independent [85], thereby globally continuous and smooth, a sudden change of the flow over a small number of pixels implies the existence of a translational motion and a depth discontinuity; this is the phenomenon called motion parallax, from which the translational velocity can be estimated [127, 172]. More systematically, the translational velocity can be determined by subtracting the effect of camera rotation in such a way that the resulting flow has a common focus of expansion [168]. A more direct approach is to perform a numerical search that minimizes the sum of the squares of the differences between the observed flow and the expected theoretical expression [30, 43, 60, 232]. Zhuang et al. [237] derived a linear algorithm similar to the corresponding finite motion algorithm. As in the case of finite motion, it has been pointed out that the solution based on least-squares optimization is likely to be systematically biased [40]. Tagawa et al. [195, 196] proposed an iterative method to remove statistical bias, which was a prototype of the renormalization procedure described in this book. Endoh et al. [46] discussed the asymptotic accuracy behavior. A theoretical bound on accuracy based on the Cramer-Rao inequality was studied by Young and Chellappa [233], who regarded optical flow analysis as a statistical estimation problem in the traditional form. Various aspects of the accuracy and robustness of the solution have been studied in many different forms [31, 141, 149, 194]. Mathematically, optical flow is simply an infinitesimal limit of a finite image motion, so all procedures for 3-D interpretation of optical flow should be obtained from those for finite motion by taking the limit as the interframe time approaches zero. In fact, the analysis of 3-D interpretation of optical flow given in Chapter 12 is exactly parallel to the finite motion analysis given in Chapter 11: we first study the case in which the object has a general shape and then the case in which the object is a planar surface. We also describe a theoretical bound on the attainable accuracy and present a computational scheme by using renormalization and linearization. The analysis given here is based on Ohta and Kanatani [154]. The rotation test and the planarity test are also discussed in the same way as in the case of finite motion. Finally, we study the error introduced by identifying the interframe displacement with a continuous flow and point out that the ratio of image noise to optical flow is a very deceptive measure for predicting the accuracy of the resulting 3-D interpretation.
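To illustrate the focus-of-expansion idea mentioned above, here is a toy least-squares sketch for the purely translational case: every flow vector then points away from (or toward) the focus of expansion, which can be located by solving a linear system. This is exactly the kind of unweighted least-squares estimate that the text describes as statistically biased; it is only meant to make the geometry concrete, and the function name is illustrative.

```python
import numpy as np

def focus_of_expansion(points, flows):
    """Least-squares focus of expansion for a purely translational flow field.

    points : (N, 2) image coordinates, flows : (N, 2) flow vectors.
    For pure translation, (p - f) is parallel to the flow at p, so the 2-D
    cross product gives one linear equation per point:
        u_y * (p_x - f_x) - u_x * (p_y - f_y) = 0.
    """
    u, v = flows[:, 0], flows[:, 1]
    A = np.column_stack([v, -u])              # coefficients of (f_x, f_y)
    b = v * points[:, 0] - u * points[:, 1]   # right-hand side
    f, *_ = np.linalg.lstsq(A, b, rcond=None)
    return f
```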
A brief description of the critical surface of optical flow is also given here. Its definition and geometric properties are the same as those of finite motion, but historically the existence of the critical surface was first pointed out for optical flow. Maybank [132] showed that such a surface should necessarily be a quadric surface passing through the viewpoint. Horn [66] introduced the term "critical surface" and showed that the critical surface must generally be a ruled quadric and hence a hyperboloid of one sheet or its degeneracy. Negahdaripour [146] exhaustively classified the possible types of critical surface. Their results are concisely summarized in Kanatani [90].
1.3.11 Information criterion for model selection
In order to apply the techniques for geometric correction and parametric fitting, one needs to know the geometric model, i.e., the constraints and hypotheses that should hold in the absence of noise. But how can one prefer one geometric model to another? The problem of model selection is very important in many engineering domains where probabilities are involved, and two approaches have attracted attention: one is the AIC (Akaike information criterion) [2]; the other is the MDL (minimum description length) principle [173, 174, 175, 176]. The AIC favors a model whose maximum likelihood estimator has a minimum residual for future data, while the MDL principle favors a model that can be described in a code of minimum length. Since a geometric model is defined as a manifold, the complexity of the model is evaluated not only by its degrees of freedom but also by such invariant quantities as the dimension and codimension of the manifold. In Chapter 13, we first summarize what we call the Mahalanobis geometry of maximum likelihood estimation and derive the geometric information criterion by modifying the AIC. This criterion in its original form requires exact knowledge of the noise level. In order to avoid this, we present a comparison criterion for evaluating the goodness of one model relative to another without using any arbitrarily set threshold such as the significance level of the χ² test. We derive explicit expressions for the comparison criterion for point data in two and three dimensions, 3-D reconstruction by stereo vision, 3-D motion analysis, and 3-D interpretation of optical flow. At the same time, we elucidate the geometric structures of these problems and their interrelationships, in particular the ambiguity and singularity of 3-D reconstruction when the object is a planar surface or the camera motion is a pure rotation.
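As a toy numerical illustration of the residual-versus-complexity trade-off behind such criteria, the sketch below compares the hypothesis "the 2-D points lie on a line" with "the points are unconstrained" using a classical AIC-type score for Gaussian noise of known variance. It is not the geometric information criterion of Chapter 13, which also accounts for the dimension and codimension of the model manifold; the score and names here are purely illustrative.

```python
import numpy as np

def aic_line_vs_general(points, sigma):
    """Classical AIC-flavoured score: (residual)/sigma^2 + 2 * (parameters)."""
    centered = points - points.mean(axis=0)
    # Sum of squared perpendicular distances to the best-fit line (total least squares).
    _, s, _ = np.linalg.svd(centered, full_matrices=False)
    residual_line = s[-1] ** 2
    aic_line = residual_line / sigma**2 + 2 * 2            # a line has 2 parameters
    aic_general = 0.0 + 2 * (2 * len(points))              # zero residual, 2N parameters
    return {"line": aic_line, "general": aic_general}
```

If the residual of the line fit is comparable to the noise level, the line model wins despite its nonzero residual, which is the basic mechanism the chapter formalizes.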
1.3.12 General theory of geometric estimation
Chapter 14 presents a rigorous mathematical foundation for the theory of geometric correction given in Chapter 5 and the theory of parametric fitting given in Chapter 7. For both, the problem is stated in general terms without assuming Gaussian noise: the role of the covariance matrix for a
Gaussian distribution is played by the Fisher information matrix. We derive a lower bound, which corresponds to the Cramer-Rao lower bound in traditional statistics, on the covariance matrix of the unbiased estimator of the parameter. Then, the maximum likelihood estimator is proved to attain it in the first order if the problem belongs to the exponential family. Finally, the maximum likelihood estimation process is expressed in a computationally convenient form, where the rank-constrained generalized inverse is used to deal with the ill-posedness of the problem and the numerical instability of the solution. The theme of Chapter 14 is essentially statistical estimation, but our treatment has non-traditional elements for the reasons pointed out in Section 1.1.3. The statistical problem closely related to the parametric fitting problem is what is known as the Neyman-Scott problem [152]: observing multiple data, each having a distribution characterized by a common parameter (called the structure parameter or the parameter of interest) and a distinct parameter (called the nuisance parameter) that depends on each observation, one has to estimate the structural parameter without knowing the nuisance parameters [18]. In the parametric fitting problem, the true values of image/sensor data can be viewed as nuisance parameters, since the goal is to estimate the fitting parameter that characterizes the 2-D/3-D structure without knowing the exact values of the sensor data. However, the probability densities of the sensor data do not involve the fitting parameter. Hence, various mathematical concepts and tools developed for the Neyman-Scott problem [8, 9, 17, 109, 119, 120] cannot be applied directly in the form given in the literature of statistics.
1.4 The Analytical Mind: Strength and Weakness

1.4.1 Criticisms of statistical approaches
In concluding this chapter, let us note the fact that there exist strong criticisms of statistical (or in general analytical) theories of robotics and computer vision. Since any theory is essentially an idealization of reality, there is always something in the real world that the theory does not cover, and statistical theories of robotics and computer vision have often been criticized for this very reason. Strong arguments against statistical theories include the following:

• In order to apply a statistical theory, one must model the noise in a mathematically tractable form, say, assuming that the noise is independent, uniform, homogeneous, isotropic, Gaussian, etc. However, these are mathematical artifacts: noise in reality is more or less correlated, non-uniform, non-homogeneous, non-isotropic, and non-Gaussian. Hence, one does not know how reliable the analytical results based on these assumptions are.
• A statistically optimal solution is usually obtained by nonlinear optimization in a very complicated form, requiring a lot of computations. However, a human can easily perceive 3-D structures using vision and can take appropriate actions with sufficient precision. It is difficult to imagine that the brain conducts complicated optimization computations, such as renormalization, each time a human sees or does something. Rather, human perception seems to be based on ingenious combinations of experiences (data) and heuristics (rules). Hence, it should be possible to build an intelligent robot that has a human's versatility without doing rigorous mathematical analysis. We should learn from humans rather than relying on mathematics.

• Uncertainty in the real world is not merely quantitative. There is always a possibility that an unexpected circumstance happens, for example the sudden appearance or disappearance of an object. Since it is difficult to model the uncertainty in the real world completely, it should be more realistic to adopt heuristics and avoid rigorous mathematical assumptions. Experience has shown that empirically adjusted systems often outperform those designed by precise mathematical analysis; the latter do well only when the assumptions are exactly met.

Those who criticize the statistical approach advocate such non-analytical paradigms as artificial intelligence, fuzzy inference, and neuro-computing. In the domain of computer vision, such compromises as qualitative vision and purposive vision have been proposed as alternatives to rigorous mathematical analysis. However, just as the importance of such paradigms cannot be denied, the importance of analytical methods cannot be denied either, because advantages and disadvantages are both sides of the same coin. Rigid and inflexible as it is, an analytically designed system has the advantage that it can faithfully reveal the insufficiency of the model. Namely, its poor behavior implies the existence of new factors in the environment that are not properly modeled in the analysis, and the study of these becomes the next goal, an analysis guiding us to a new analysis. In contrast, such insufficiency is often concealed by adroit behavior of a heuristically designed system, and its long-term performance is unpredictable; it may suddenly break down when its capacity for versatility reaches a limit. This is an inescapable destiny of a system that learns from humans. By a careful mathematical analysis, humans can learn much about the mechanisms that underlie seemingly simple phenomena.
Gaussian noise assumption and outlier detection

In our statistical analysis, the covariance matrix plays a central role as a measure of uncertainty. This is practically equivalent to assuming that the distribution of the noise is either Gaussian or approximately Gaussian. We develop
a general non-Gaussian theory in Chapter 14, but the noise distribution is required to share many properties with the Gaussian distribution, smoothness and unimodality, for instance. Although assuming Gaussian or Gaussian-like noise makes mathematical analysis very easy, this certainly sets a limitation on potential applications of our theory. A typical situation in which this Gaussian noise assumption is violated is when the noise is due to systematic bias in the data acquisition procedure. For example, lens aberration causes a systematic distortion of the image, e.g., symmetric elongation or contraction around its center. The use of wrong camera parameters estimated by poor calibration is another source of systematic errors. In processing images, error characteristics are different from operation to operation. For example, if the epipolar constraint is used for finding point correspondences between stereo images, the error characteristics are very different along the epipolar line and in the direction orthogonal to it. In this book, deviations of points and lines from their supposed positions are collectively regarded as "noise" irrespective of their sources. The Gaussian noise assumption is a good approximation if a lot of independent noise sources are involved, each having a small effect. This is a consequence of the central limit theorem. In contrast, the Gaussian noise assumption poses a difficulty if a small number of very different noise sources coexist. If only one noise source exists, the noise characteristics can be estimated by repeating measurements. If multiple sources exist and one source is dominant, data from the dominant source are called inliers; those from other sources are called outliers. In order to do statistical estimation, one must judge which data are inliers and which are outliers. This can be done, for example, by hypothesizing that particular data are inliers and testing that hypothesis by various statistical tests. Such a procedure is called outlier detection. This book does not deal with outlier detection at all. This is because it requires treatments that are very difficult to generalize. However, it should be emphasized that the theory of statistical estimation and model selection presented in this book provides a theoretical foundation for developing such techniques. See [139, 204, 219] for outlier detection in computer vision problems.
Remaining problems

In this book, we concentrate only on numerical computation by assuming that necessary data are provided by image processing and other sensing operations. We do not go into the details of image processing techniques involving gray levels and colors such as edge detection, stereo matching, and feature point tracking (see [47, 57, 185, 227] for these topics). We also do not consider 3-D analysis based on the gray levels of images such as shape from shading and photometric stereo (see [65]). Since the focus is on numerical computation, we do not discuss paradigms of image understanding and computer vision and
relationships with human perception. See [5, 6, 12, 130, 151, 211, 229] for such arguments.
Chapter 2 Fundamentals of Linear Algebra

This chapter presents fundamentals of linear algebra that will be necessary in subsequent chapters. Also, the symbols and terminologies that will be used throughout this book are defined here. Since the materials presented here are well established facts or their easy derivatives, theorems and propositions are listed without proofs; readers should refer to standard textbooks on mathematics for the details.
2.1 Vector and Matrix Calculus

2.1.1 Vectors and matrices

Throughout this book, geometric quantities such as vectors and tensors are described with respect to a Cartesian coordinate system, the coordinate axes being mutually orthogonal and having the same unit of length (Footnote 1). We also assume that the coordinate system is right-handed (Footnote 2). By a vector, we mean a column of real numbers (Footnote 3). Vectors are denoted by lowercase boldface letters such as a, b, u, and v; their components are written in the corresponding lowercase italic letters. A vector whose components are a_1, a_2, ..., a_n is also denoted by (a_i), i = 1, ..., n; the number n of the components is called the dimension of this vector. If the dimension is understood, notations such as (a_i) are used. In the following, an n-dimensional vector is referred to as an n-vector. The vector whose components are all 0 is called the zero vector and denoted by 0 (the dimension is usually implied by the context).

Footnote 1: This is only an intuitive definition, since "orthogonality" and "length" are later defined in terms of coordinates. To be strict, we need to start with axioms of one kind or another (we do not go into the details).
Footnote 2: In three dimensions, a Cartesian coordinate system is right-handed if the x-, y-, and z-axes have the same orientations as the thumb, the forefinger, and the middle finger, respectively, of a right hand. Otherwise, the coordinate system is left-handed. In other dimensions, the handedness, or the parity, can be defined arbitrarily: if a coordinate system is right-handed, its mirror image is left-handed (we do not go into the details).
Footnote 3: We do not deal with complex numbers in this book.

A matrix is an array of real numbers. Matrices are denoted by uppercase boldface letters such as A, B, S, and T; their elements are written in the corresponding uppercase italic letters. A matrix is also defined by its elements as (A_ij), i = 1, ..., m, j = 1, ..., n; such a matrix is said to be of type mn. In the following, a matrix of type mn is referred to as an mn-matrix; if m = n, it is also called a square matrix or simply an n-dimensional matrix. If the type is understood, notations such as (A_ij) are used. The matrix whose elements are all 0 is called the zero matrix and denoted by O (the type is usually implied by the context). If not explicitly stated, the type is understood to be nn in this chapter but 33 in the rest of this book. The unit matrix is denoted by I; its elements are written as δ_ij (not I_ij); the dimension is usually implied by the context. The symbol δ_ij, which takes the value 1 for i = j and 0 otherwise, is called the Kronecker delta. Addition and subtraction of matrices and multiplication of a matrix by a scalar, vector, or matrix are defined in the standard way. The trace of an nn-matrix A = (A_ij) is the sum \(\sum_{i=1}^{n} A_{ii}\) of its diagonal elements and is denoted by tr A. Evidently, tr I = n. The transpose of a vector or matrix is denoted by superscript T. A matrix A is symmetric if A = A^T. We say that a matrix is of type (nn) or an (nn)-matrix if it is an n-dimensional symmetric matrix. A matrix A is antisymmetric (or skew-symmetric) if A = -A^T. We say that a matrix is of type [nn] or an [nn]-matrix if it is an n-dimensional antisymmetric matrix. Note the following expression, which is sometimes called the outer product of vectors a and b:

\[ ab^{T} = (a_i b_j) = \begin{pmatrix} a_1 b_1 & a_1 b_2 & \cdots & a_1 b_n \\ a_2 b_1 & a_2 b_2 & \cdots & a_2 b_n \\ \vdots & \vdots & \ddots & \vdots \\ a_n b_1 & a_n b_2 & \cdots & a_n b_n \end{pmatrix}. \tag{2.1} \]
The following identities are very familiar:

\[ (A^{T})^{T} = A, \qquad (AB)^{T} = B^{T} A^{T}, \qquad \operatorname{tr}(A^{T}) = \operatorname{tr} A, \qquad \operatorname{tr}(AB) = \operatorname{tr}(BA). \tag{2.2} \]

The inner product of vectors a = (a_i) and b = (b_i) is defined by

\[ (a, b) = a^{T} b = \sum_{i=1}^{n} a_i b_i. \tag{2.3} \]

Evidently, (a, b) = (b, a). Vectors a and b are said to be orthogonal if (a, b) = 0. The following identities are easily confirmed:

\[ (a, Tb) = (T^{T} a, b), \qquad \operatorname{tr}(ab^{T}) = (a, b). \tag{2.4} \]
The matrix consisting of vectors a_1, a_2, ..., a_n as its columns in that order is denoted by (a_1, a_2, ..., a_n). If A = (a_1, a_2, ..., a_n) and B = (b_1, b_2, ..., b_n), the following identities hold:

\[ AB^{T} = \sum_{i=1}^{n} a_i b_i^{T}, \tag{2.5} \]

\[ A^{T} B = \begin{pmatrix} (a_1, b_1) & (a_1, b_2) & \cdots & (a_1, b_n) \\ (a_2, b_1) & (a_2, b_2) & \cdots & (a_2, b_n) \\ \vdots & \vdots & \ddots & \vdots \\ (a_n, b_1) & (a_n, b_2) & \cdots & (a_n, b_n) \end{pmatrix}. \tag{2.6} \]
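The following few lines of NumPy check the identities (2.1)-(2.6) numerically; they are purely illustrative and assume nothing beyond the definitions above.

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = rng.standard_normal(3), rng.standard_normal(3)
A, B = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))

assert np.allclose(np.outer(a, b), a[:, None] * b[None, :])   # outer product, eq. (2.1)
assert np.isclose(np.trace(np.outer(a, b)), a @ b)            # tr(ab^T) = (a, b), eq. (2.4)
assert np.allclose((A @ B).T, B.T @ A.T)                      # transpose rule, eq. (2.2)
assert np.isclose(np.trace(A @ B), np.trace(B @ A))           # trace rule, eq. (2.2)
```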
The norm (Footnote 4) and the normalization operator N[.] are defined as follows:

\[ \|a\| = \sqrt{(a, a)} = \sqrt{\sum_{i=1}^{n} a_i^{2}}, \qquad N[a] = \frac{a}{\|a\|}. \tag{2.7} \]

A unit vector is a vector of unit norm. A set of vectors {u_1, ..., u_r} is said to be orthonormal if its members are all unit vectors and orthogonal to each other: (u_i, u_j) = δ_ij. The following Schwarz inequality holds:

\[ -\|a\|\,\|b\| \le (a, b) \le \|a\|\,\|b\|. \tag{2.8} \]

Equality holds if vectors a and b are parallel, meaning that there exists a real number t such that a = tb or b = 0. The Schwarz inequality implies the following triangle inequality with the same equality condition:

\[ \|a + b\| \le \|a\| + \|b\|. \tag{2.9} \]

2.1.2 Determinant and inverse
The determinant of a square matrix A = (A_ij), denoted by det A or |A|, is defined by

\[ \det A = \sum_{i_1, \ldots, i_n = 1}^{n} \varepsilon_{i_1 \cdots i_n} A_{1 i_1} \cdots A_{n i_n}, \tag{2.10} \]

where ε_{i_1...i_n} is the signature symbol defined by

\[ \varepsilon_{i_1 i_2 \cdots i_n} = \begin{cases} 1 & \text{if } (i_1 i_2 \cdots i_n) \text{ is an even permutation of } (1\,2 \cdots n), \\ -1 & \text{if } (i_1 i_2 \cdots i_n) \text{ is an odd permutation of } (1\,2 \cdots n), \\ 0 & \text{otherwise.} \end{cases} \tag{2.11} \]

Evidently, det I = 1. The following identity holds:

\[ \det(AB) = \det A \, \det B. \tag{2.12} \]

Footnote 4: This norm is called the Euclidean norm (or the 2-norm). In general, the norm ||a|| can be defined arbitrarily as long as (i) ||a|| ≥ 0, equality holding if and only if a = 0, (ii) ||ca|| = |c| ||a|| for any scalar c, and (iii) the triangle inequality (2.9) holds. There exist other definitions that satisfy these, for instance the 1-norm \(\|a\|_1 = \sum_{i=1}^{n} |a_i|\) and the ∞-norm \(\|a\|_\infty = \max_i |a_i|\). They can be generalized into the Minkowski norm (or the p-norm) \(\|a\|_p = (\sum_{i=1}^{n} |a_i|^{p})^{1/p}\) for 1 ≤ p ≤ ∞; the 1-norm, the 2-norm, and the ∞-norm are special cases of the Minkowski norm for p = 1, 2, ∞, respectively.
Replacing A_ij by δ_ij + εA_ij in eq. (2.10) and expanding it in ε, we obtain

\[ \det(I + \varepsilon A) = 1 + \varepsilon \operatorname{tr} A + O(\varepsilon^{2}), \tag{2.13} \]
where the order symbol O(...) denotes terms having order the same as or higher than the argument. Let A^(ij) be the matrix obtained from a square matrix A = (A_ij) by removing the ith row and the jth column. The determinant det A is expanded in the form

\[ \det A = \sum_{i=1}^{n} (-1)^{i+j} A_{ij} \det A^{(ij)} = \sum_{j=1}^{n} (-1)^{i+j} A_{ij} \det A^{(ij)}. \tag{2.14} \]

This is called the cofactor expansion formula. The cofactor (or adjugate) matrix A† = (A†_ij) of A is defined by

\[ A^{\dagger}_{ij} = (-1)^{i+j} \det A^{(ji)}. \tag{2.15} \]

Eq. (2.14) can be rewritten as

\[ A A^{\dagger} = A^{\dagger} A = (\det A) I. \tag{2.16} \]

The following identity holds:

\[ \det(A + \varepsilon B) = \det A + \varepsilon \operatorname{tr}(A^{\dagger} B) + O(\varepsilon^{2}). \tag{2.17} \]
The elements of the cofactor matrix A† of an nn-matrix A are all polynomials of degree n - 1 in the elements of A. In three dimensions, the cofactor matrix of A = (A_ij) has the following form:

\[ A^{\dagger} = \begin{pmatrix} A_{22}A_{33} - A_{32}A_{23} & A_{32}A_{13} - A_{12}A_{33} & A_{12}A_{23} - A_{22}A_{13} \\ A_{23}A_{31} - A_{33}A_{21} & A_{33}A_{11} - A_{13}A_{31} & A_{13}A_{21} - A_{23}A_{11} \\ A_{21}A_{32} - A_{31}A_{22} & A_{31}A_{12} - A_{11}A_{32} & A_{11}A_{22} - A_{21}A_{12} \end{pmatrix}. \tag{2.18} \]

The inverse A^{-1} of a square matrix A is defined by

\[ A A^{-1} = A^{-1} A = I, \tag{2.19} \]

if such an A^{-1} exists. A square matrix is singular if its inverse does not exist, and nonsingular (or of full rank) otherwise. Eq. (2.16) implies that if A is nonsingular, its inverse A^{-1} is given by

\[ A^{-1} = \frac{A^{\dagger}}{\det A}. \tag{2.20} \]
( A B ) -1 - B - i A -1,
(A-1)k _ ( A k ) - i
2.1.
Vector and Matrix Calculus (A-l-)_ 1 _ ( A _ I ) T
31 detA_ 1 _ 1 - det----A"
'
(2.21)
The third identity implies t h a t m a t r i x (A -1)k can be unambiguously denoted by A -k. Note t h a t the d e t e r m i n a n t and the inverse are defined only for square matrices. Let A be a nonsingular n n - m a t r i x , and B a nonsingular ram-matrix. Let S and T be rim-matrices. The following matrix inversion formula holds, provided that the inverses involved all exist" (A -~- S B T T ) -1 - A -1 - A - 1 S ( B -1 -!- T T A - 1 S ) - I T T A
-1.
(2.22)
If m - 1, the n m - m a t r i c e s S and T are n-vectors, and the r a m - m a t r i x B is a scalar. If we let B - 1 and write S and T as s and t, respectively, the above formula reduces to
A - l s t T A -1 (A + stn-) -1 - A -1 - 1 + ( t , A - l s ) "
(2.23)
For A - I , we obtain st T
(I + stT) -1 - I 2.1.3
1 + (s, t)"
(2.24)
Vector p r o d u c t in three d i m e n s i o n s
In three dimensions, the signature symbol defined by eq. (2.11) is often referred to as the Eddington epsilon 5. It satisfies the following identity:
3
E
eijmeklm - 5ikSjl- 5ilSjk.
(2.25)
m=l
The vector (or exterior) product of 3-vectors a - (ai) and b - (bi) is defined by
--s
a • b-
( a2b3-a3b2)
(j~l,= e i j k a j b k ) -
ala3blb2--a2blalb3
9
(2.26)
Evidently, axb--bxa,
(b, a • b ) -
a•
(a, a • b ) - O.
(2.27)
The following identities, known as the Lagrange formulae, are direct consequences of eq. (2.25)" a x (b • c) - (a, c)b - (a, b)c, 5Some a u t h o r s use different terminologies such as the
Levi-Civitasymbol.
32
Chapter 2.
Fundamentals of Linear Algebra
O a
a
(~)
(b)
Fig. 2.1. (a) Vector product. (b) Scalar triple product.
(a x b) x c = (a, c ) b -
(b, c)a.
(2.28)
The expressions a × (b × c) and (a × b) × c are called vector triple products. The following identities also hold:

\[ (a \times b, c \times d) = (a, c)(b, d) - (a, d)(b, c), \tag{2.29} \]

\[ \|a \times b\|^{2} = \|a\|^{2} \|b\|^{2} - (a, b)^{2}. \tag{2.30} \]

If 3-vectors a and b make angle θ, we have

\[ (a, b) = \|a\|\,\|b\| \cos\theta, \qquad \|a \times b\| = \|a\|\,\|b\| \sin\theta. \tag{2.31} \]

Eq. (2.30) states the well-known trigonometric identity cos²θ + sin²θ = 1. From eq. (2.26), the third of eqs. (2.27), and the second of eqs. (2.31), we can visualize a × b as a vector normal to the plane defined by a and b; the length of a × b equals the area of the parallelogram made by a and b (Fig. 2.1a). The scalar triple product |a, b, c| of 3-vectors a, b, and c is the determinant of the matrix (a, b, c) having a, b, c as its columns in that order. We say that three 3-vectors {a, b, c} are a right-handed system if |a, b, c| > 0 and a left-handed system if |a, b, c| < 0. The scalar triple product |a, b, c| equals the signed volume of the parallelepiped defined by a, b, and c (Fig. 2.1b); the volume is positive if the three vectors are a right-handed system in that order and negative if they are a left-handed system. The equality |a, b, c| = 0 holds if and only if a, b, and c are coplanar, i.e., if they all lie on a common plane. We can also write

\[ |a, b, c| = (a \times b, c) = (b \times c, a) = (c \times a, b). \tag{2.32} \]

Since |a, b, a × b| = ||a × b||², the vector product a × b is oriented, if it is not 0, in such a way that {a, b, a × b} form a right-handed system (Fig. 2.1a). The following identity also holds:

\[ (a \times b) \times (c \times d) = |a, b, d|\, c - |a, b, c|\, d = |a, c, d|\, b - |b, c, d|\, a. \tag{2.33} \]
Taking the determinant of (a, b, c)(a, b, c)^T (see eq. (2.12)), we obtain

\[ |a, b, c|^{2} = \begin{vmatrix} \|a\|^{2} & (a, b) & (a, c) \\ (b, a) & \|b\|^{2} & (b, c) \\ (c, a) & (c, b) & \|c\|^{2} \end{vmatrix}. \tag{2.34} \]
The vector (or exterior) product of 3-vector a and 33-matrix T = (t_1, t_2, t_3) is defined by

\[ a \times T = (a \times t_1, \; a \times t_2, \; a \times t_3). \tag{2.35} \]

From this definition, the following identities are obtained:

\[ a \times I = \begin{pmatrix} 0 & -a_3 & a_2 \\ a_3 & 0 & -a_1 \\ -a_2 & a_1 & 0 \end{pmatrix}, \qquad a \times (Tb) = (a \times T)\, b, \qquad (a \times I)^{T} = -a \times I. \tag{2.36} \]
The matrix a × I is called the antisymmetric matrix associated with the 3-vector a. The following identity is an alternative expression of the Lagrange formulae (2.28):

\[ (a \times I)(b \times I)^{T} = (a, b)\, I - b a^{T}. \tag{2.37} \]

The vector (or exterior) product of 33-matrix T and 3-vector b is defined by

\[ T \times b = T (b \times I)^{T}. \tag{2.38} \]

This definition implies the following identities:

\[ (T \times b)^{T} = b \times T^{T}, \qquad (a \times T)^{T} = T^{T} \times a, \qquad (T \times b)\, c = T (c \times b). \tag{2.39} \]

It is easy to confirm that

\[ (a \times T) \times b = a \times (T \times b), \tag{2.40} \]

which can be written unambiguously as a × T × b. We also have

\[ (a \times T \times b)^{T} = b \times T^{T} \times a. \tag{2.41} \]

Eq. (2.37) now reads

\[ a \times I \times b = (a, b)\, I - b a^{T}. \tag{2.42} \]

The following identities are also important:

\[ (a \times b)(c \times d)^{T} = a \times (b d^{T}) \times c = b \times (a c^{T}) \times d, \tag{2.43} \]

\[ (a \times b, T(c \times d)) = (a, (b \times T \times d)\, c) = (b, (a \times T \times c)\, d). \tag{2.44} \]
[Fig. 2.2. (a) Projection onto a line. (b) Projection onto a plane.]

The exterior product [A × B] of 33-matrices A = (A_ij) and B = (B_ij) is a 33-matrix defined as follows (Footnote 6):

\[ [A \times B]_{ij} = \sum_{k,l,m,n=1}^{3} \varepsilon_{ikl} \varepsilon_{jmn} A_{km} B_{ln}. \tag{2.45} \]

If A and B are both symmetric, their exterior product [A × B] is also symmetric.
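A few NumPy assertions make the definitions of this section concrete: the antisymmetric matrix a × I of eq. (2.36), the Lagrange formula (2.28), and the scalar triple product (2.32). This is only an illustrative check; the helper name is not from the book.

```python
import numpy as np

def skew(a):
    """The antisymmetric matrix a x I of eq. (2.36), so that (a x I) b = a x b."""
    return np.array([[0.0, -a[2], a[1]],
                     [a[2], 0.0, -a[0]],
                     [-a[1], a[0], 0.0]])

rng = np.random.default_rng(2)
a, b, c = rng.standard_normal(3), rng.standard_normal(3), rng.standard_normal(3)

assert np.allclose(skew(a) @ b, np.cross(a, b))
# Lagrange formula (2.28): a x (b x c) = (a, c) b - (a, b) c.
assert np.allclose(np.cross(a, np.cross(b, c)), (a @ c) * b - (a @ b) * c)
# Scalar triple product |a, b, c|, eq. (2.32).
assert np.isclose(np.linalg.det(np.column_stack([a, b, c])), np.cross(a, b) @ c)
```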
2.1.4 Projection matrices
If a vector a is projected orthogonally onto a line l that extends along a unit vector n, it defines on l a segment of signed length (n, a) (Fig. 2.2a); it is positive in the direction n and negative in the direction -n. The vector a is decomposed into the component (n, a)n parallel to l and the component a - (n, a)n (= (I - nn^T)a) orthogonal to it. Let {n}_L be the one-dimensional subspace defined by unit vector n, and {n}_L^⊥ its orthogonal complement, the set of all vectors orthogonal to n. The projection of a vector a onto {n}_L^⊥ is written as P_n a (Fig. 2.2b). The matrix P_n is defined by

\[ P_{n} = I - n n^{T}, \tag{2.46} \]

and called the projection matrix onto the plane orthogonal to n, or the projection matrix along n. The following identities are easily confirmed:

\[ P_{n} = P_{n}^{T}, \qquad P_{n}^{2} = P_{n}, \qquad \det P_{n} = 0, \qquad \operatorname{tr} P_{n} = n - 1, \qquad \|P_{n}\| = \sqrt{n-1}. \tag{2.47} \]

Here, the matrix norm ||.|| is defined by \(\|A\| = \sqrt{\sum_{i=1}^{m} \sum_{j=1}^{n} A_{ij}^{2}}\) for an mn-matrix A = (A_ij). In three dimensions, eq. (2.42) implies the following identity for a unit vector n:

\[ n \times I \times n = (n \times I)(n \times I)^{T} = P_{n}. \tag{2.48} \]

Footnote 6: For example, [A × B]_11 = A_22 B_33 - A_32 B_23 - A_23 B_32 + A_33 B_22.
The projection matrix can be generalized as follows. Let the symbol R^n denote the n-dimensional space of all n-vectors. Let S be an m-dimensional subspace of R^n, and N (= S^⊥) its orthogonal complement, the set of all vectors that are orthogonal to every vector in S. The orthogonal projection (Footnote 7) P_N onto S is a linear mapping such that for an arbitrary vector v ∈ R^n

\[ P_{\mathcal{N}}\, v \in \mathcal{S}, \qquad v - P_{\mathcal{N}}\, v \in \mathcal{N}. \tag{2.49} \]

In other words, P_N is the operator that removes the component in N. We also use an alternative notation P^S when we want to indicate the space to be projected onto explicitly. Let {n_1, ..., n_m} be an orthonormal basis of N. The orthogonal projection P_N has the following matrix expression:

\[ P_{\mathcal{N}} = I - \sum_{i=1}^{m} n_i n_i^{T}. \tag{2.50} \]

Eqs. (2.47) can be generalized as follows:

\[ P_{\mathcal{N}} = P_{\mathcal{N}}^{T}, \qquad P_{\mathcal{N}}^{2} = P_{\mathcal{N}}, \qquad \det P_{\mathcal{N}} = 0, \qquad \operatorname{tr} P_{\mathcal{N}} = n - m, \qquad \|P_{\mathcal{N}}\| = \sqrt{n-m}. \tag{2.51} \]
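As a small numerical illustration of eqs. (2.46) and (2.47), the sketch below builds P_n for a three-dimensional example and checks symmetry, idempotence, trace, determinant, and the removal of the n-component; the helper name is illustrative.

```python
import numpy as np

def projection_along(n):
    """P_n = I - n n^T of eq. (2.46): projection onto the plane orthogonal to n."""
    n = n / np.linalg.norm(n)
    return np.eye(len(n)) - np.outer(n, n)

n = np.array([1.0, 2.0, 2.0])
P = projection_along(n)
assert np.allclose(P, P.T) and np.allclose(P @ P, P)              # eq. (2.47)
assert np.isclose(np.trace(P), 2.0) and np.isclose(np.linalg.det(P), 0.0)
assert np.allclose(P @ n, np.zeros(3))                            # the n-component is removed
```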
2.1.5 Orthogonal matrices and rotations
Matrix R is orthogonal if one of the following conditions holds (all are equivalent to each other):

\[ R R^{T} = I, \qquad R^{T} R = I, \qquad R^{-1} = R^{T}. \tag{2.52} \]

Equivalently, matrix R = (r_1, ..., r_n) is orthogonal if and only if its columns form an orthonormal set of vectors: (r_i, r_j) = δ_ij. For an orthogonal matrix R and vectors a and b, we have

\[ (Ra, Rb) = (a, b), \qquad \|Ra\| = \|a\|. \tag{2.53} \]

The second equation implies that the length of a vector is unchanged after multiplication by an orthogonal matrix. The first one together with eqs. (2.31) implies that in three dimensions the angle that two vectors make is also unchanged. Applying eq. (2.12) to eqs. (2.52), we see that det R = ±1 for an orthogonal matrix R. If det R = 1, the orthogonal matrix R is said to be a rotation matrix (Footnote 8).

Footnote 7: The notation given here is non-traditional: the projection onto subspace S is usually denoted by P_S. Our definition is in conformity to the notation P_n given by eq. (2.46).
Fig. 2.3. (a) Axis and angle of rotation. (b) Instantaneous rotation.
In three dimensions, the orthonormal Cartesian coordinate basis vectors are

\[ i = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \qquad j = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \qquad k = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}. \tag{2.54} \]
The columns of a three-dimensional rotation matrix R = (r_1, r_2, r_3) define a right-handed orthonormal system {r_1, r_2, r_3}. The matrix R maps the coordinate basis {i, j, k} to {r_1, r_2, r_3}. Such a map is realized as a rotation about an axis l by an angle Ω of rotation (Euler's theorem; Fig. 2.3a). The axis l (unit vector) and the angle Ω (measured in the screw sense) of rotation R are computed as follows:
\[ l = N\!\left[ \begin{pmatrix} R_{32} - R_{23} \\ R_{13} - R_{31} \\ R_{21} - R_{12} \end{pmatrix} \right], \qquad \Omega = \cos^{-1} \frac{\operatorname{tr} R - 1}{2}. \tag{2.55} \]
Conversely, an axis l and an angle Ω define a rotation R in the following form:

\[ R = \begin{pmatrix} \cos\Omega + l_1^{2}(1-\cos\Omega) & l_1 l_2 (1-\cos\Omega) - l_3 \sin\Omega & l_1 l_3 (1-\cos\Omega) + l_2 \sin\Omega \\ l_2 l_1 (1-\cos\Omega) + l_3 \sin\Omega & \cos\Omega + l_2^{2}(1-\cos\Omega) & l_2 l_3 (1-\cos\Omega) - l_1 \sin\Omega \\ l_3 l_1 (1-\cos\Omega) - l_2 \sin\Omega & l_3 l_2 (1-\cos\Omega) + l_1 \sin\Omega & \cos\Omega + l_3^{2}(1-\cos\Omega) \end{pmatrix}. \tag{2.56} \]

From this equation, we see that a rotation around unit vector l by a small angle ΔΩ is expressed in the form

\[ R = I + \Delta\Omega\, l \times I + O(\Delta\Omega^{2}), \tag{2.57} \]

Footnote 8: The set of all n-dimensional rotation matrices forms a group, denoted by SO(n), under matrix multiplication. It is a subgroup of O(n), the group consisting of all n-dimensional orthogonal matrices. The group consisting of all nonsingular nn-matrices is denoted by GL(n), and the group consisting of all nn-matrices of determinant 1 is denoted by SL(n).
which implies that for a continuous rotation there exists a vector ω such that ΔΩ l = ω Δt + O(Δt²) for a short lapse of time Δt. The vector ω is called the rotation velocity: its orientation N[ω] defines the instantaneous axis of rotation; its norm ||ω|| defines the angular velocity. Eq. (2.57) reads

\[ R = I + \omega \times I\, \Delta t + O(\Delta t^{2}). \tag{2.58} \]

Hence, the velocity \(\dot{r} = \lim_{\Delta t \to 0} (R r - r)/\Delta t\) of vector Rr at Δt = 0 has the form

\[ \dot{r} = \omega \times r, \tag{2.59} \]

and is orthogonal to both r and ω (Fig. 2.3b).
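The axis-angle formulas (2.55) and (2.56) translate directly into code. The sketch below builds a rotation matrix from an axis and an angle and recovers them again; function names are illustrative, and the angle is kept away from 0 and π where eq. (2.55) becomes ill-conditioned.

```python
import numpy as np

def rotation_matrix(l, omega):
    """Rotation about unit axis l by angle omega, eq. (2.56) in Rodrigues form."""
    l = l / np.linalg.norm(l)
    K = np.array([[0.0, -l[2], l[1]], [l[2], 0.0, -l[0]], [-l[1], l[0], 0.0]])
    return np.cos(omega) * np.eye(3) + np.sin(omega) * K + (1 - np.cos(omega)) * np.outer(l, l)

def axis_angle(R):
    """Recover the axis and angle of a rotation matrix, eq. (2.55)."""
    v = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    omega = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    return v / np.linalg.norm(v), omega

R = rotation_matrix(np.array([1.0, 1.0, 0.0]), 0.3)
l, omega = axis_angle(R)
assert np.allclose(R @ R.T, np.eye(3)) and np.isclose(np.linalg.det(R), 1.0)
assert np.isclose(omega, 0.3)
```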
2.2 Eigenvalue Problem

2.2.1 Spectral decomposition
An eigenvector of an (nn)-matrix (Footnote 9) A for eigenvalue λ is a nonzero vector u such that

\[ A u = \lambda u. \tag{2.60} \]

This equation can be rewritten as (λI - A)u = 0, which has a non-trivial solution if and only if the function

\[ \phi_{A}(\lambda) = |\lambda I - A| \tag{2.61} \]

has a zero: φ_A(λ) = 0. The function φ_A(λ) is an nth degree polynomial in λ and called the characteristic polynomial of A. The equation φ_A(λ) = 0 is called the characteristic equation and has n roots {λ_i} (with multiplicities counted). The number of nonzero eigenvalues (with multiplicities counted) is called the rank of the (nn)-matrix A. The eigenvalues {λ_i} of an (nn)-matrix A are all real. The corresponding set {u_i} of unit eigenvectors can be chosen to be an orthonormal set. Let us call the set {u_i} so defined the eigensystem of the (nn)-matrix A. An (nn)-matrix A is expressed in terms of its eigenvalues {λ_i} and eigensystem {u_i} in the form

\[ A = \sum_{i=1}^{n} \lambda_i u_i u_i^{T}, \tag{2.62} \]

which is called the spectral (or eigenvalue) decomposition of A. In particular, the identity

\[ \sum_{i=1}^{n} u_i u_i^{T} = I \tag{2.63} \]

Footnote 9: Recall that by an (nn)-matrix we mean an n-dimensional symmetric matrix (see Section 2.1.1). Eigenvalues and eigenvectors are defined for a general (non-symmetric) matrix in exactly the same way, but in this book we deal with eigenvalues and eigenvectors of symmetric matrices only.
38
Fundamentals of Linear Algebra
holds for an arbitrary orthonormal system {ui}. From (x, ~in.=.l Uilt~T,) --" (x, Ix), we obtain the following identity for an arbitrary vector and an arbitrary orthonormal system {ui}: n
~ ( ~ , ~ ) ~ - I1~11~.
(2.64)
i=1
Let {Ai} be the eigenvalues of (nn)-matrix A, and {ui} the corresponding eigensystem. Since {ui} is an orthonormal system, the matrix U = (Ul, u 2 , . . . , u . ) i s orthogonal. Eq. (2.62)is equivalent to
A-
(2.65)
U A U ~,
where A is the diagonal matrix with diagonal elements A1, A2, ..., A~ in that order; we write A = diag(A1, A2, ..., An). (2.66) From eq. (2.65), we obtain
U T A U - A,
(2.67)
which is called the diagonalization of A. Applying the fourth of eqs. (2.2) and eq. (2.12) to eq. (2.65), we obtain the following identities: rt
trA = ~
n
~,
det A = I-[ ~"
i=1
(2.6S)
i=1
From the spectral decomposition (2.62), the kth power A k for an arbitrary integer k > 0 is given by n
Ak - E Aikuiu~"
(2.69)
i=1
This can be extended to an arbitrary polynomial p(x):
p(A)- ~p(~)~uT.
(2.70)
i=1
If A is of full rank, its inverse A -1 is given by
A -1 -- ~1"= -~iltt i u T .
(2.71)
This can be extended to an arbitrary negative power of A (see the third of eqs. (2.21)):
A- ~ _ ~ i=1
1 7~~'~,
(2.72)
2.2.
Eigenvalue Problem
2.2.2
39
Generalized inverse
An ( n n ) - m a t r i x A is positive definite if its eigenvalues are all positive, and is positive semi-definite if its eigenvalues are all nonnegative; it is negative definite if its eigenvalues are all negative, and is negative semi-definite if its eigenvalues are all nonpositive. For a positive semi-definite (nn)-matrix A, eq. (2.69) can be extended to arbitrary non-integer powers A q, q > 0. In particular, the "square root" v ~ of A is defined by n
7-
(2.73)
i=1
It is easy to see t h a t ( v f A ) 2 - A . If A is positive definite, eq. (2.69) can be extended to arbitrary negative non-integer powers such as A -2/3. Let {rl, ..., rl}L denote the linear subspace spanned (or generated) by r l , ..., rl, i.e., the set of all vectors that can be expressed as a linear combination l ~-,i=1 ciri for some real numbers cl, ..., ct. A positive semi-definite (nn)matrix of rank r (_< n) has the following spectral decomposition:
A - ~ Aiuiu-~,
Ai > 0,
i-
1, ..., r.
(2.74)
i=1
Let the symbol T~n denote the n-dimensional space of all n-vectors. r-dimensional subspace 7~A = {Ul,..., U,}L C T~n
The
(2.75)
is called the range (or image space) of A, for which the set {Ul, ..., u~} is an orthonormal basis. The ( n - r)-dimensional subspace A/"A = {u~+l, ..., u,~}n C 7~n
(2.76)
is called the null space of A, for which the set {u~+ 1, ..., un } is an orthonormal basis. The n-dimensional space is the direct sum of T~A and HA, each being the orthogonal complement of the other: T~n - 7~A 9 AfA,
7~A _L N"A .
(2.77)
This definition implies
PXA A = APxA = A.
(2.78)
The (Moore-Penrose) generalized (or pseudo) inverse 1~ A - of A is defined 1~ Moore-Penrose generalized inverse is often denoted by A + in order to distinguish it from the generalized inverse in general, which is defined as the matrix X that satisfies A X A = A and denoted by A - . The generalized inverse we use throughout this book is always the Moore-Penrose type, so we adopt the generic symbol A - . The symbol A + will be given another meaning (see Section 2.2.6).
40
Chapter 2.
Fundamentals of Linear Algebra
by
A-
-
~1 i=1
--~--uiu Ai
7-.
(279)
i
Evidently, the generalized inverse A - coincides with the inverse A -1 if A is of full rank. From this definition, the following relationships are obtained (see eqs. (2.50)and (2.63))"
(A-)--A,
PXA A - - A - P x A - A - , A-A-
(2.so)
A A - - PXA.
From eqs. (2.78)and (2.80), we obtain
AA-A-
A,
A-AA-
- A-.
(2.81)
The rank and the generalized inverse of a matrix are well defined concepts in a mathematical sense only; it rarely occurs in finite precision numerical computation that some eigenvalues are precisely zero. In computing the generalized inverse numerically, the rank of the matrix should be predicted by a theoretical analysis first. Then, the matrix should be modified so that it has the desired rank. Let A be a positive semi-definite (nn)-matrix of rank 7" r; let A - ~i=1 Aiuiu-{, A1 > "" _ A,. > 0, be its spectral decomposition. Its rank-constrained generalized inverse (A)~ of rank r' (_< r ) i s defined by
(A)~, - ~ - " u1i u Ai i=1
i7-.
(2.82)
From this definition, the following identities are obtained:
(A)~A-
A(A)~, - Px(a _~,,
(A)~,A(A)~, - ( A ) , - , .
(2.83)
Let A be an (nn)-matrix, and B an (mm)-matrix. Let S and T be nmmatrices. Even if A and B are not of full rank, the matrix inversion formula (2.22) holds in the form (A + PA;ASBTTPxA) - - A - - A - S ( B - + P x B T T A - S P x B ) - T T A-,
(2.84) provided that matrix A + P x A S B T T P x A has the same rank as A and matrix B - + P x B T T A - S P x 8 has the same rank as B - . We call eq. (2.84) the generalized matrix inversion formula.
2.2.3
R a y l e i g h q u o t i e n t and quadratic f o r m
For an (nn)-matrix A, the expression (u, Au)/[]u[I 2 is called the Rayleigh quotient of vector u for A. Let )~min and Amax be, respectively, the largest
2.2.
Eigenvalue Problem
41
and the smallest eigenvalues of A. arbitrary nonzero vector u:
/~min ~
The following inequality holds for an
(u, Au) ]]U]]2 ~ )kmax.
(2.85)
The left equality holds if u is an eigenvector of A for eigenvalue )~inin; the right equality holds if u is an eigenvector for eigenvalue Am~x. The Rayleigh quotient (u, Au)/llull 2 is invariant to multiplication of u by a constant and hence is a function of the orientation of u" if we put n N[u], then (u, Au)/llull ~ = (n, An), which is called the quadratic form in n for A. Eq. (2.85) implies min (n, Ilnll-1
A n ) - )~min,
max (n, Iln]l=l
A n ) - )~max.
(2.86)
The minimum is attained by any unit eigenvector n of A for eigenvalue/~min; the m a x i n m m is attained by any unit eigenvector n for eigenvalue "~max. It follows that an (nn)-matrix A is positive definite if and only if (r, A r ) > 0 for an arbitrary nonzero vector r; it is positive semi-definite if and only if (r, At) > 0 for an arbitrary n-vector r. For an arbitrary ran-matrix B, the matrix B TB is symmetric (see the second of eq. (2.2)). It is also positive semi-definite since (r, B T B r ) - [ [ B r [ [ 2 > 0 for an arbitrary n-vector r. If B is an n n - m a t r i x of full rank, equality holds if and only if r - 0. For an (nn)-matrix A, its square root ~ is also symmetric (see eq. (2.73)). We can also write A observations, we conclude the following:
v/-A Tv/A.
From these
9 Matrix A is positive semi-definite if and only if there exists a matrix B such that A - B T B . 9 Matrix A is positive definite if and only if there exists a nonsingular matrix B such that A - B TB. 9 If A is a positive semi-definite (nn)-matrix, matrix B
T A B is a positive
semi-definite ( m m ) - m a t r i x for any n m - m a t r i x B.
2.2.~
Nonsingulargeneralized eigenvalue problem
Let A be an (nn)-matrix, and G a positive semi-definite (nn)-matrix. If there exists a nonzero vector u and a scalar/~ such that
Au-
)~Gu,
(2.87)
the scalar/~ is called the generalized eigenvalue of A with respect to G; the vector u is called the corresponding generalized eigenvector. The problem of
42
Chapter 2.
Fundamentals of Linear Algebra
computing such u and )~ is said to be nonsingular if G is of full rank, and singular otherwise. Consider the nonsingular generalized eigenvalue problem. Eq. (2.87) can be rewritten as ()~G - A ) u = 0, which has a nonzero solution u if and only if function CA,G ( , ~ ) - I , ~ G - A I (2.88) has a zero: CA G ('~) - 0. The function CA G("~) is an nth degree polynomial in ,~ and is cali'ed the generalized characteristic polynomial of A with respect to G. The equation CA,G()~) - 0 is called the generalized characteristic equation of A with respect to G and has n roots {,~i} (with multiplicities counted). The generalized eigenvalue problem with respect to I reduces to the usual eigenvalue problem. The generalized eigenvalues {)~i} of A with respect to G are all real. The corresponding generalized eigenvectors {ui} can be chosen so that
(ui, a u j ) = 6ij,
(2.89)
(Ui, A u j ) - )~j~ij.
(2.90)
which implies Let us call the set {ui} so defined the generalized eigensystem of the (nn)matrix with respect to the positive definite (nn)-matrix G. Let U = (Ul,...,un) and A = diag(A1, ..., An), respectively. Eqs. (2.89) and (2.90) can be rewritten as
U - c G U - I,
U - c A U - A.
(2.91)
By multiplying the first equation by G U from the left and U-rG from the right, the following generalized spectral decomposition is obtained: n
A - GUAU-VG - E
)~i(Gui)(Gui)-r"
(2.92)
i=1
The number of nonzero generalized eigenvalues is equal to the rank of A. If A is positive definite, {hi } are all positive; if A is positive semi-definite, {hi} are all nonnegative. The generalized eigenvalue problem A u - )~Gu reduces to an ordinary eigenvalue problem as follows. Let C - G -1/2 and it - C - l u (see eqs. (2.71) and (2.73)). It is easy to see that eq. (2.87) can be written as -
A
=
CAC.
(2.93)
If an eigenvector it of A is computed, the corresponding generalized eigenvector is given by u - C~. (2.94)
2.2.
Eigenvalue Problem
43
The expression (u, A u ) / ( u , Gu) for an (nn)-matrix A and a positive definite (nn)-matrix G is called the generalized Rayleigh quotient of u. It satisfies
(u, Au)
)~min <~_ -(~:eu)
<- )~max,
(2.95)
where Amin and )~max are, respectively, the largest and the smallest generalized eigenvalues of A with respect to G. The left equality holds if u is a generalized eigenvector of A for the generalized eigenvalue )~min; the right equality holds if u is a generalized eigenvector for the generalized eigenvalue Area•
2.2.5 Singulargeneralized eigenvalue problem Consider the singular generalized eigenvalue problem with respect to a positive semi-definite (nn)-matrix G {Vl, ..., Vm} be an orthonormal basis of the range T ~ vn } an orthonormal basis of its null space A/"G. Define an n ( n - m)-matrix P0 by
of an (nn)-matrix A of rank m (< n). Let of G, and {vm+l, ..., an nm-matrix P1 and
Po - (Vm+l,..., vn).
P1 - (vl, ..., vm),
(2.96)
Then,
P ~ P1 = I,
P~ Po - O ,
P f Po - I.
(2.97)
Here, we only consider the case where P f A P o is nonsingular 11. Since T~~ = T ~ | A/G, an arbitrary n-vector can be uniquely written in the form
u-
P l x + PoY,
(2.98)
where x is an m-vector and y is an ( n - m)-vector. Eqs. (2.97)imply that x and y are respectively given by
x-
P~u,
y-
PJu.
(2.99)
Substituting eq. (2.98) into eq. (2.87) and noting the identities GPo - 0 and Po-rG - O, we can split eq. (2.87) into the following two equations:
A*x-
AG*x,
y-
B*x.
(2.100)
Here, A* and G* are (mm)-matrices; B* is an ( n - re)m-matrix. They are defined by
A* - P ? A P 1 - P ? A P o C * - I p [ A P 1 , G* - P~GP1,
B* - - C * - I p [ A P I ,
(2.101)
where C* is an ( n - m)(n - m)-matrix defined by
C*
-
P[APo.
1JThis is always true if A is positive definite or negative definite.
(2.102)
44
Chapter 2.
Fundamentals of Linear Algebra
The definition of the matrix P0 implies that the matrix G* is positive definite. Hence, the first of eqs. (2.100) is a nonsingular generalized eigenvalue problem. The generalized Rayleigh quotient of A with respect to G for u ~ AfG (i.e., x # 0) can be written as follows:
(u, A u ) = ( x , A * x ) + ( y - B ' x , C * ( y - B ' x ) )
(2.103)
If C* is positive definite 12, we observe that
(u, A u ) (x,A*x) (~, Gu) ~- (x, Ggx-)) --~ )~mi.,
(2.104)
where )~min is the smallest generalized eigenvalue of A with respect to G (see eqs. (2.100)). Equality holds if u is the corresponding generalized eigenvector. If C* is negative definite 13, we observe that (U, A u ) -< (x, A ' x )
< )~max,
-
(2.105)
)~max is the smallest generalized eigenvalue of A with respect to G. Equality holds if u is the corresponding generalized eigenvector. where
2.2.6
P e r t u r b a t i o n theorem
Let A and D be (nn)-matrices. Let {)~i} be the eigenvalues of A, and {ui} the corresponding eigensystem:
Aui = )~iui,
(ui, u j ) -- (~ij.
(2.106)
Consider a perturbed matrix
A' = A + eD
(2.107)
for a small e. Let {s and {u~} be, respectively, the eigenvalues and the eigensystem of A' corresponding to {)~i} and {ui}. The following relations hold (the perturbation theorem):
)~i' = )~i + c(ui, D u i ) + O(e2),
, u i
-- ui
+e ~
(uj,Dui)uj
i"i---~j
+ O(e2).
(2.108)
(2.109)
j:A i
Let un be the unit eigenvector of A for the smallest eigenvalue )~n, which is assumed to be a simple root. Let {ui} be the eigensystem of A defined so lUThis is always true if A is positive definite. ]3This is always true if A is negative definite.
2.3.
Linear Systems and Optimization
that the corresponding eigenvalues A + by
45
are A1 ~>... n-1
~__ A n _ 1 > A n.
Define matrix
T
A+ _ ~
uiu______~i
._
Ai-
(2 110)
An"
"
This is a positive semi-definite matrix having eigenvalues { 1 / ( A / - An)} for the same eigensystem { u i } . If A~ - 0, the matrix A + coincides with the generalized inverse A - . Eq. (2.109) can be rewritten as u~ - u,~ - c A + D u n + O(e2).
(2.111)
Let A and D be (nn)-matrices, and G a positive definite (nn)-matrix. Let {A/} be the generalized eigenvalues of A with respect to G, and { u i } the corresponding generalized eigensystem: Aui - AiGui,
(ui, V u j ) - (~ij.
(2.112)
If A is perturbed in the form of eq. (2.107), the perturbation theorem holds in the same form. Eq. (2.111) also holds if {Ai} in eq. (2.110) are interpreted as generalized eigenvalues of A with respect to G. Linear
2.3 2.3.1
Systems
Singular
and
Optimization
value decomposition
and generalized
inverse
If A is an ran-matrix, A T A is a positive semi-definite (nn)-matrix, and A A T is a positive semi-definite (mm)-matrix. They share the same nonzero eigenvalues or1 >_ cr2 > ... >_ a~ (> 0), r < rain(m, n). The number r is called the rank of A. Let Ai - v/-~, i - 1, ..., r, and ~i - 0, i - r + 1, ..., max(re, n). It can be shown that orthono~r~i~l systems { u i } , i - 1, ..., n, and {v i}, i 1, ..., m, exist such that 9 A u i - Aivi, i - 1, ..., rain(m, n). 9 { u i } , i - 1, ..., n, is the eigensystem of A T A for eigenvalues {A2}, i 1, ..., n. 9 { v i } , i - 1, ..., m, is the eigensystem of A A T for eigenvalues {A2}, i -
Matrix A is expressed in terms of { u i } , {vi}, and {~i} in the form
A
Aiviu i .
-
(2.113)
i=1
This is called the singular value decomposition of A; the values {Ai}, i - 1, ..., min(m, n), are called the singular values of A. Let us call { u i } , i - 1, ..., n,
Chapter 2.
46
Fundamentals of Linear Algebra
A
~Rm
Fig. 2.4. The range TrA and the null space AfA of linear mapping A. and {vi}, i - 1, ..., m, the right orthonormal system and the left orthonormal system of A, respectively. If we define orthogonal matrices U - ( u l , u 2 , . . . , u n ) and V = (Vl, v2, ..., vm), eq. (2.113)can be rewritten in the form
A-
V A U T,
(2.114)
where A is an mn matrix whose first r diagonal elements a r e )~1, ..., )~r in that order and whose other elements are all zero. If m -- n, matrix A is diagonal. The r-dimensional linear subspace (2.115)
nA = { ~ , "",-~}L C ~e~
is called the range (or image space) of A: for any m-vector y E T~A, there exists an n-vector x such that y = A x (Fig. 2.4). The ( n - r )- dimensional linear subspace
AfA
-
(2.116)
{ U r + I , ..., U n } L C T~ n
is called the null space (or kernel) of A: A x = 0 for any n-vector x E A fA (Fig. 2.4). If A is symmetric, its right and left orthonormal systems coincide with its eigensystem, and its singular value decomposition coincides with its spectral decomposition (see eq. (2.62)). Since {ui} is an orthonormal system, eq. (2.64) holds for an arbitrary nvector x. Let )kma x be the maximum singular value. Since {v i} is also an orthonormal system, we see from eq. (2.113) that
IIAxll = - II
A~(u~, x)v~ll = i:1
A 2 (ui, x) 2 _ <
i:1
Amax
-- Amax Ilxl
9
i=1
(2.117) Hence, if we define the spectral norm (or the natural norm) of A by
IIAII~-
)kmax,
(2.118)
2.3.
Linear Systems and Optimization
47
eq. (2.117) implies the following inequality: [[Ax[[-< IIAIIsll~ll.
(2.119)
Equality holds for X ~
'~'~lTlaX +
H A.
(2.120)
The right-hand side means the first term plus any element of AfA (and such a form only), and Umax is the vector ui corresponding to the singular value Amax 9
Let eq. (2.113) be the singular value decomposition of matrix A. (Moore-Penrose) generalized inverse is defined by
A- - s i=1
uiv---Z-~ Ai "
Its
(2.121)
Evidently, the generalized inverse A - coincides with the inverse A -1 if A is nonsingular. In correspondence with eq. (2.78) and eqs. (2.80), the following relationships hold:
( A - ) - - A, PnAA-
A-A-
A P x A -- A,
PxA,
AA-
- p nA,
P x A A - - A - P nA - A - .
(2.122)
Here, P nA (_ p n ~ ) and PXA (= p X~) are the projection matrices onto 7~A and Af/~, respectively. From the above equations, we obtain
AA-A
=
A,
A-AA-
-
A-.
(2.123)
The rank-constrained generalized inverse (a)~-, of rank r' (<_ r) is defined by
(A)~, - E i=1
uiv__~ Ai '
(2.124)
and the following relations hold:
(A)~-,A- P
R (A)r;
,
A(A)-~ --Px(AI_~, ,
(A),.,A(A)~ -(A)<. 2.3.2
(2. 25)
Linear equations
Let A be an ran-matrix, and b an m-vector. Consider the following linear equation for n-vector x: Ax = b. (2.126) The following is the fundamental theorem for linear equations:
48
Chapter 2.
Fundamentals of Linear Algebra
9 The solution exists if and only if b E 7~A (or PnA b - 0). 9 If the solution exists, it is unique if and only if AfA - {0}. The problem (2.126)is said to be c o n s i s t e n t (or solvable) when b e ~ A , and i n c o n s i s t e n t (or unsolvable) otherwise; if it is consistent, it is said to be d e t e r m i n a t e when H A - {0}, and i n d e t e r m i n a t e otherwise. If eq. (2.126) is solvable, the solution can be explicitly written in the following form: x - A-b + H A. (2.127) If A is nonsingular, the solution is given by Atb x - A-lb
-
det A '
(2.128)
where A t is the cofactor matrix of A (see eq. (2.20)). Let A - (al,..., an). From the cofactor expansion formula (2.14), the following C r a m e r f o r m u l a is obtained: (i) lal,..., b,..., a~[ xi = detA " (2.129) The numerator on the right-hand side is the determinant of the matrix obtained by replacing the ith column of A by b. If det A is very close to 0, a small perturbation of b can causes a large perturbation to the solution x. If this occurs, the linear equation (2.126) is said to be i l l - c o n d i t i o n e d ; otherwise, it is w e l l - c o n d i t i o n e d . If b is perturbed into b + Ab, the solution x - A - l b is perturbed by Ax = A - l A b . Applying eq. (2.119), we obtain I[Ax[[ _< [[A-I[I~[IAb[[. From eq. (2.126), we have [[b[] _< [IAll~llxll. Combining these, we obtain
II/X ll < cond(A)IIAbll II ll IIbll ' where cond(A)-
Amax
[[AI]~[IA-I[I~ = )~mi--~"
(2.13o)
(2.131)
Here, Am~x and Amin are the largest and the smallest singular values of A, respectively (see eq. (2.118)). The number c o n d ( A ) i s called the c o n d i t i o n n u m b e r 14 and measures the ill-posedness of the linear equation (2.126)--the equation becomes more ill-conditioned as cond(A) becomes larger. Suppose eq. (2.126) is consistent but only 7" (_< m) of the m component equations are independent, i.e., the matrix A has rank r. Theoretically, the 14The condition number can also be defined for a singular matrix A in the form cond(A) - Ilhlls [A-[[s = )~max/)kmin, where )~max and )~min are, respectively, the largest and the smallest of the nonnegative singular values of A.
2.3.
Linear Systems and Optimization
49
solution is given in the form of eq. (2.127). However, if the elements of the matrix A and the components of the vector b are supplied by a physical measurement, all the m equations may be independent because of noise. As a result, eq. (2.126) may become ill-conditioned or inconsistent. In such a case, a well-conditioned equation that gives a good approximation to x is obtained by "projecting" both sides of eq. (2.126) onto the eigenspace of A defined by the largest r singular values. The solution of the projected equation is given in terms of the rank-constrained generalized inverse in the form
Jc - (A)-~b + Af(A)-.
(2.132)
The rank r is estimated either by an a priori theoretical analysis or by appropriately thresholding the singular values of A a posteriori.
2.3.3
Quadratic optimization
A. Least-squares optimization Let A be an ran-matrix, and b an m-vector. Consider the least-squares optimization for n-vector x in the form
J[x] = I I A x - bll 2 --+ min.
(2.133)
Application of the singular value decomposition to A yields the general solution in the following form:
Jc = A - b + H A.
(2.134)
If x is constrained to be in A/~, the solution is uniquely given by 5~ - A - b . The residual J[5~] is given by J[5~] = IlPZaAbll 2.
(2.135)
Evidently, the residual is 0 if and only if A x = b is solvable.
B. Unconstrained quadratic optimization Let C be a positive semi-definite (nn)-matrix, and d an n-vector. Consider the quadratic optimization for n-vector x in the form J[x]-
~l(x, C x ) +
(d, x) --+ rain .
(2.136)
If x is constrained to be in A ~ , the solution is uniquely given in the following form: Jc = - C - d . (2.137) The residual is
-I
J[~] - - 2 ( d , C - d ) .
(2.138)
50
Chapter 2.
Fundamentals of Linear Algebra
C. Constrained quadratic optimization
Let S be a positive semi-definite (nn)-matrix. Consider the quadratic optimization for n-vector x in the form 1 J[x I - ~(x, S x ) - + min.
(2.139)
Evidently, x - 0 is a solution (but not necessarily unique) if no constraint is imposed on x. The following three types of constraint are important: If x is constrained to be a unit vector (llxll = 1), the solution is given by any unit eigenvector 5~ of S for the smallest eigenvalue ) ~ m i n (see eqs. (2.86)); the residual is J[5~] = )~min (see eq. (2.95)). If x is constrained by (x, G x ) = 1 for a positive definite (nn)-matrix G, the solution is given by any unit generalized eigenvector 5~ E N:,-Lq of S with respect to G for the smallest generalized eigenvalue Amin~'the residual is J[5~] = ) i m i n . If S is of full rank, the same conclusion is obtained even though G is not of full rank (see eq. (2.104)). 9 Suppose x is constrained by a linear equation A x = b, where A is an mn-matrix and b is an m-vector. If 1. x is constrained to be in Aft, and 2. the constraint A x - b is satisfiable for x E Aft, i.e., at least one x0 E A@ exists such that A x o - b, then the solution is uniquely given in the following form:
5~- S - A T ( A S - A T ) - b . The residual is
(2.140)
1
J[5~] - -~(b, ( A S - A T)-b).
2.3.~
(2.141)
Matrix inner product and matrix norm
The matrix inner product of mn-matrices A = (Aij) and B = (Bij) is defined by m
(A; B ) -
tr(ATB)-
t r ( A B T) - ~
n
~
AijBij.
(2.142)
i=l j--1
Evidently, (A; B) = (B; A). If (A; B) = 0, matrices A and B are said to be orthogonal. An (nn)-matrix is orthogonal to any [nn]-matrix; an [nn]-matrix is orthogonal to any (nn)-matrix. The following identities are easy to prove: (A; B C ) - (B T A; C) - ( A C T ; B),
2.3.
51
Linear Systems and Optimization
(a, Ab) - (abT;A),
(abT;cd T) - ( a , c ) ( b , d ) .
(2.143)
The (Euclidean) matrix norm 15 of an ran-matrix is defined by
IIAII-
v/(A; A) -
Z
AiJ 2"
(2.144)
i : 1 '=
We define the normalization N [ . ] of an n n - m a t r i x A as follows (see the second of eqs. (2.7)): A N [ A ] - IIAIl" (2.145) The Schwarz inequality and the triangle inequality hold in the same way as in the case of vectors:
-IIAII IIBIJ _< (A; B) _< IIAII-IIBII,
(2.146)
IIA + BII_ IIAII § IIBII.
(2.147)
In both inequalities, equality holds if and only if there exists a real number t such that A = tB or B - O. Let U be an n-dimensional orthogonal matrix. From eqs. (2.52) and the first of eqs. (2.143), it is immediately seen that for arbitrary nn-matrices A and B (UA; U B ) = (AU; B U ) = (A; B ) . (2.148) Letting A -- B , we obtain
IIUAII- IIAUII = IIAII.
(2.149)
Further letting A - I, we see that
IIUII = v~.
(2.150)
A nonsingular n n - m a t r i x T defines a mapping from an n n - m a t r i x A to an n n - m a t r i x in the form A'=T-1AT. (2.151) 15Some authors use different terminologies such as the F r o b e n i u s n o r m , the S c h u r n o m n , and the S c h m i d t n o r m . In general, the norm ]]A]] can be defined arbitrarily as long as (i) ]]A]] > 0, equality holding if and only if A = O, (ii) []cA]] = ]c].]]A]] for any scalar c, and (iii) the triangle inequality (2.147) holds. There exist other definitions that satisfy t h e s e - - t h e 1 - n o r m []A]]I - E'i=l maxj [Aij], the o c - n o r m ]]A]]o~ = Ej~__I maxi ]mij], and the s p e c t r a l n o r m ]]A]]s defined by eq. (2.118), for instance. If [lAx]] < ]]A]]. ]]x]] holds, the matrix norm ]IA]] is said to be c o n s i s t e n t with the vector norm ]]x[]. The spectral norm ]]A]ls is consistent with the Euclidean norm ]]x]], and the 1-norm ]]A]]I and the co-norm ]]A[]oo are consistent with the 1-norm ]Ix]]1 and the cx~-norm ]]x]]o~, respectively (see Footnote 4 in Section 2.1).
52
Chapter 2.
Fundamentals of Linear Algebra
This is a one-to-one and onto mapping and is called the similarity transformation 16. A function f ( . ) of a matrix is called an invariant with respect to similarity transformations if f (A') = f (A) for an arbitrary nonsingular matrix T. The trace and the determinant are typical invariants" tr(T - 1 A T ) - trA,
det(T -1AT) - det A.
(2.152)
Eq. (2.67) implies that any symmetric matrix is mapped to a diagonal matrix by an appropriate similarity transformation; the transformation is defined by an orthogonal matrix. Hence, if A is a symmetric matrix with eigenvalues {)~i}, any invariant with respect to similarity transformations is a function of {Ai}. Eqs. (2.67) and (2.149)imply that n
IIAll -
Z A/2.
(2.153)
i--1
Hence, IIAII is also an invariant with respect to similarity transformation. In three dimensions, trA, det A, and IIAII can uniquely determine the three eigenvalues {)~1, )~2, A3} of a (33)-matrix A (see eqs. (2.68)). Hence, the three invariants {trA, det A, IIAII} are an invariant basis in the sense that any invariant can be expressed in terms of them. A nonsingular nn-matrix T defines a mapping from an (nn)-matrix A to an (nn)-matrix in the form A' - TTAT.
(2.154)
This is a one-to-one and onto mapping and called the congruence transformation 17. The pair (p, q) consisting of the number p of positive eigenvalues and the number q of negative eigenvalues of an (nn)-matrix A is called the signature of A. Under a congruence transformation, the signature does not change (Sylvester's law of inertia). Hence, the rank is also preserved. It follows that a positive definite symmetric matrix is always transformed to a positive definite symmetric matrix; a positive semi-definite symmetric matrix is always transformed to a positive semi-definite matrix of the same rank. The congruence transformation defined by an orthogonal matrix U coincides with the similarity transformation defined by U, and the matrix inner product and the matrix norm are also preserved: (U TAU; U T B U )
-
(A; B),
IIU TAUII - IIAII.
(2.155)
16Similarity transformations define a group of transformations isomorphic to GL(n), the group of nonsingular matrices under multiplication. 17Congruence transformations define a group of transformations isomorphic to GL(n), the group of nonsingular matrices under multiplication.
2.3.
53
Linear Systems and Optimization UN u
"'"
t,
"~ v ,
2 U 1
0
vN
Fig. 2.5. Finding a rotation that maps one set of vectors to another.
2.3.5
O p t i m a l r o t a t i o n fitting
Let {us} and {v~}, a - 1, ..., N, be two sets of n-vectors. problem of finding a rotation R such that
Consider the
N
Z W llu -
Rv~[I 2 --+ min,
(2.156)
c~--1
where Wa are nonnegative weights (Fig. 2.5). Since IiR ll- II~ll, the right-hand side cazl be rewritten as ~ Nc~=1 w iiuoll = - 2 E , :NI W , ( u , , R v ~ ) + E ~=1 W~[Iv~ll 2 Hence, if we define the correlation matrix N
A -
Z
W~u~v~,
(2 157)
c~--1
the problem can be rewritten as follows (see the second of eqs. (2.143)): (A; R) -~ max.
(2.158)
This problem can also be viewed as finding a rotation matrix R that is the closest to a given matrix A in the matrix norm:
iIR- A l l - +
min.
(2.159)
In fact, eqs. (2.144)and (2.150)imply that [IR-All 2 - [ I R [ [ 2 - 2 ( R ; A)+[IA[[ 2 = n - 2(A; R ) + IIA[I2, so minimizing [IR - All is equivalent to maximizing (A;R) Let A - V A U T be the singular value decomposition of A. The solution of the optimization (2.159) is given by R - Vdiag(1, ..., 1, d e t ( V U T ) ) U T.
(2.160)
If the optimization is conducted over orthogonal matrices (i.e., if det R - 1 is not required), the solution is given by
R-
V U T.
(2.161)
54
C h a p t e r 2.
2.4
F u n d a m e n t a l s of Linear Algebra
Matrix and Tensor Algebra
2.4.1
Direct sum and tensor product
For an m - v e c t o r a = (ai) a n d an n-vector b = (bi), the (m + n ) - v e c t o r (al, ..., am, b l , . . . , b , ) T is called the direct s u m of a and b and d e n o t e d by a | b. For an m m - m a t r i x A a n d an n n - m a t r i x B , the ( m + n ) ( m + n ) - m a t r i x t h a t has A a n d B as diagonal blocks in t h a t order a n d zero elements elsewhere is called t h e direct s u m of A a n d B and d e n o t e d by A | B . Direct sums of m o r e t h a n two vectors or m o r e t h a n two matrices are defined similarly:
a|174
/o/ "
,
A|174
=
/a / "..
.
(2.162)
B Let A b e an r a m - m a t r i x , a n d B an n n - m a t r i x . Let u and a be m-vectors, a n d v a n d b n-vectors. T h e following relations are obvious:
( A | B ) ( u @ v) - ( A u ) @ ( B y ) , ( a | b, u | v) - (a, u ) + (b, v).
(2.163)
A set of real n u m b e r s 7- = (Tili2...ir), i l , i 2 , . . . , i , = 1, ..., n, with r indices r u n n i n g over n-dimensional coordinates is called a tensor of d i m e n s i o n n a n d degree r. If each index corresponds to coordinates of a different dimensionality, T is called a tensor of mixed d i m e n s i o n s or a mixed tensor. If index i k runs over 1, ..., nk for k = 1, ..., r, the tensor is said to be of type n l n 2 . . . n r . A tensor of t y p e n l n 2 " . . n r is also referred to as an n l n 2 " " a t - t e n s o r . If Tili2...ir is s y m m e t r i c with respect to indices ik and ik+l, t h e t y p e is w r i t t e n as il " " ( i k i k + l ) ' " i,; If Ti~i~...i. is a n t i s y m m e t r i c with respect to indices ik a n d ik+l, t h e t y p e is w r i t t e n as il . . . [ i k i k + l ] " "it; Scalars, vectors, a n d m a t r i c e s are tensors of degrees 0, 1, and 2, respectively. T h e tensor product of tensor A = (Ail...i.) of degree r and t e n s o r / 3 = (Bil...i~) of degree s is a tensor C = (Cil...i.+~) of degree r + s defined by
Cil...i,,+~
-- Ail...irBil...is.
(2.164)
This is symbolically w r i t t e n as C = A @ B.
(2.165)
T h e following identities hold for scalar c a n d vectors a a n d b:
c @ u -- cu,
a @ b - ab T.
(2.166)
2.4.
M a t r i x and Tensor Algebra
2.~.2
55
C a s t in three d i m e n s i o n s
A. 3 3 - m a t r i c e s
T h e elements of a 33-matrix A - (Aij) are r e a r r a n g e d into a 9-vector
All A12 a -
.
,
(2.167)
which can be written as a - (a~) with a~
- A(~-l)div3+l,(~-l)mod3+l.
(2.168)
T h e symbols 'div' and ' m o d ' denote integer division and integer remainder, respectively. Conversely, a 9-vector a - (a~) is r e a r r a n g e d into a 33-matrix
A-
al
a2 a3 )
a4
as as
a7
a6 a9
,
(2.169)
which can be written as A - (Aij) with Aij
-
(2.17o)
a3(i-1)+j.
T h e above type transformation or cast is denoted by a -- type9[A],
A-
type33[a ].
(2.171)
T h e n o r m is preserved by cast:
I1 11- IIAII.
(2.172)
T h e left-hand side designates the vector norm, whereas the right-hand side designates the m a t r i x norm. T h e cast can be extended to tensors: 9 A 3333-tensor T - ( T i j k l ) is cast, by rearranging the elements with respect to the indices i and j, into a mixed tensor * 7 - - (*T~kt) of type 933, which is denoted by typ%33 [7-]; the inverse cast is 7- - type3333 [ ' 7 ]. 9 A 3333-tensor 7- - (Tijkt) is cast into a tensor 7-* -- (Ti)~) of type 339, which is denoted by type339[7-]; the inverse cast is 7- - type3333[T* ]. 9 If b o t h operations are applied, T - (Tijkl) is cast into a 99-matrix T = ( T ~ ) , which is denoted by type99 [7-]; the inverse cast is 7- - typeaaaa [T ].
56
Chapter 2.
Fundamentals of Linear Algebra
B. (33)-matrices The elements of a (33)-matrix S = (Sij) are rearranged into a 6-vector
S11 $22 $33 V/2S23
8 ~-~
(2.173)
9
v S3, V/2S12 Conversely, a 6-vector s = (s~) is rearranged into a (33)-matrix
l( Sl 86 ss)
S- ~
86 V/'282 84 85 84 V/'283
9
(2.174)
This cast is denoted by s = type6[S],
S -- type(33)[s].
(2.175)
The norm is preserved by cast:
Ilsll = IlSll.
(2.176)
The cast can be extended to tensors: 9 A (33)33-tensor s = (Lijkl) is cast, by rearranging the elements with respect to the indices i and j, into a mixed tensor *s = (*L,~kl) of type 633, which is denoted by typ%33[s the inverse cast is/2 = type(33)3a[*/: ]. 9 A 33(33)-tensor 3 / - (Sijkl) is cast to a mixed tensor N'* = (S~.~) of type 336, which is denoted by type336[Af]; the inverse cast is N = type33(33) [N'*]. 9 If both operations are applied, a (33)(33)-tensor 3// = (Mijkl) is cast to a 66-matrix M = (M~x), which is denoted by typ%6[A4 ]. In elements,
M-
Ml111 M l 1 2 2 Ml133 v~M1123 v~M1131 v~M1112 M2211 M2222 M2233v/2M2223 v~M2231 v~M2212 M3311 M3322 M3333v/2M3323 v/2M3331 ~/r2M3312 v/2M2311V/'2M2322 V/2M2333 2M2323 2M2331 2M2312 v~M3111 V/2~43122 V/'2M3133 2M3123 2M3131 2M3112 V/2~'1211 V/'2~/1222 V/'2M1233 2M1223 2M1231 2M1212 (2.177)
The inverse cast is 3// = type(33) (33) [ M ].
2.4.
M a t r i x and Tensor Algebra
57
C. [33]-matrices
(Wij) are rearranged
T h e elements of a [33J-matrix W -
W--
W13 W21
into a 3-vector
,
(2.178)
which can be written as w = (w~) with
3
1
(2.179) i,j=l
Conversely, a 3-vector w = ( w ~ ) i s rearranged into a [33]-matrix
W
-
w3
0
--W2
Wl
which can be written as W =
(Wij)
-wl
-
w
x I,
(2.180)
0
with
3 VI/~j - - E
eij~w~.
(2.181)
W-
(2.182)
k=l This cast is denoted by w - type3[W],
type[aa][W I.
T h e following identities hold, where r is an a r b i t r a r y 3-vector:
IIwII- v@l ll,
x
(2.183)
T h e cast can be extended to tensors: 9 A [33133-tensor 7) - (Pijkl) is cast, by rearranging the elements with respect to the indices i and j, into a mixed tensor *7) = (* P~k~) of type 333, which is denoted by type333[7)]; the inverse cast is 7) - type[33]33[*7) 1. 9
A 33[33J-tensor Q - (Qi~k~)is cast to a mixed tensor Q* = of type
(QiS,~)
333, which is denoted by type333[Q]; the inverse cast is Q -
type33[33] [~*]. 9 If b o t h operations are applied, a [331133]- tensor T~ - (Rijkl) is cast to a 33-matrix R - ( R ~ ) , which is denoted by type3317~]. In elements,
R -
R3232 R3213 R3221 ) R1332 R1313 R1321 9 R2132 R2113 R2121
T h e inverse cast is 7~ - type[33] [33] [R ].
(2.184)
58
Chapter 2.
2.~.3
Fundamentals of Linear Algebra
Linear mapping of matrices in three dimensions
A. 3 3 - m a t r i c e s
A 3333-tensor T - (Tijkl) defines a linear mapping from a 33-matrix to a 33-matrix: matrix A = (Aij) is mapped to matrix A' = (A~j) in the form
A'ij- ~ TijktAkl.
(2.185)
k,/=l
This mapping is denoted by A'The identity mapping Z
-
TA.
(2.186)
(Iijkl) is given by Iijkl
-- ~ik~jl.
(2.187)
The similarity transformation A ' = T -1 A T defined by a nonsingular matrix T = (Tij) maps a 33-matrix A to a 33-matrix (see eq. (2.151)). This mapping can be written as A ' = T A , where the tensor T = (Tijkl) is defined by rijkl -- r ~ l T l j .
(2.188)
Here, T~ 1 denotes the (ik) element of T -1. If a 3333-tensor 7" is cast into a 99-matrix T and if 33-matrices A and A' are cast into 9-vectors a and a', respectively, the mapping A' = T A is identified with a' = T a , (2.189) which is a linear mapping from a 9-vector a to a 9-vector a'. Hence, the mapping T is nonsingular if and only if the 99-matrix T obtained by cast is nonsingular. The inverse T -1 of a nonsingular mapping T is given through the cast: ,-f-- 1 __ type3333 [type99 [7]-1]. (2.190) If mapping T is singular, its generalized inverse is also defined through the same cast: 7"- = type3333[type9917"]-]. (2.191) A 33-matrix A is an eigenmatrix of a 3333-tensor T for eigenvalue A if TA-
AA.
(2.192)
Eigenvalues and eigenmatrices are computed by solving the eigenvalue problem of the (99)-matrix obtained by cast: if T = type99[T ] and a = typeg[A ], eq. (2.192) reads T a = )~a. (2.193)
2.4.
Matrix and Tensor Algebra
59
B. (33)-matrices A (33)(33)-tensor f14 = (Mijkl) defines a linear mapping from a ( 33 )- matrix to a (33)-matrix: matrix S is m a p p e d to matrix S ' = M S in the form eq. (2.185). The identity mapping 2: = (Iijkl) is given by 1
Iijkl -- -~(~ik(~jl "Jr"(~jk(~il).
(2.194)
The congruence transformation S ~ = T -1 S T defined by a nonsingular 33matrix T = (Tij) maps a (33)-matrix S to a (33)-matrix (see eq. (2.154)). This mapping can be written as S I = M S , where the tensor M = (Mijkl) is defined by
M~jk~ - -~1(Tk~Ttj + TkjT~i) .
(2.195)
If a (33)(33)-tensor 34 is cast into a 66-matrix M and if (33)-matrices S and S ~ are cast into 6-vectors s and s ~, respectively, the mapping S ~ = AdS is identified with s'= Ms, (2.196) which is a linear mapping from 6-vector s to 6-vector s ~. Hence, the mapping 3/[ is nonsingular if and only if the 66-matrix M obtained by cast is nonsingular. The inverse 3//-1 and the generalized inverse 3 d - are defined through the cast: M -1 - type(33)(33)[type66[M]-l], (2.197) M-
= type(33)(33)[typ%6[M]- ].
(2.198)
Eigenvalues and eigenmatrices are also defined and computed through the cast.
C. [33]-matrices If a [33][33]-tensor T4 is cast into a 33-matrix R and if [33J-matrices W and W' are cast into 3-vectors w and w/, respectively, the mapping W ! = ~ W is identified with w'= 2Rw, (2.199) which is a linear mapping from 3-vector w to 3-vector w'. Hence, the mapping is nonsingular if and only if the 33-matrix R obtained by cast is nonsingular. The inverse T~-1 and the generalized inverse T~- are defined through the cast: lp-1_
I
type[33][33] [type33 [7~]-1],
1 7"4.-- -- ~type[33][33][type33[T~]-].
(2.200) (2.201)
Eigenvalues and eigenmatrices are also defined and computed through the cast.
60
Chapter 2.
L33
L33
Fundamentals of Linear Algebra
L33
L33
(~)
(b)
Fig. 2.6. (a) Linear mapping defined by (33)(33)-tensor S. (a) Linear mapping defined by [33][33J-tensor 7~. D. Matrix
spaces
The nine-dimensional linear space 1233 of all 33-matrices is the direct sum of the six-dimensional subspace s of all (33)-matrices and the threedimensional subspace s of all [33]-matrices (Fig. 2.6). The two subspaces are orthogonal complements of each other (see Section 2.3.4):
~33 : ~(33) O ~[33],
~(33) _L ~[33].
(2.202)
This is because any 33-matrix A is uniquely decomposed into a (33)-matrix As and a [33]-matrix Aa: A = As + Aa, As = SIAl,
(As; Aa) = 0,
(2.203)
As = A[A].
(2.204)
Here, the symmetrization operator S[. ] and the antisymmetrization operator A[-] are defined as follows:
S[A] -
( A + AT),
I(A_AT A[A] - -~
).
(2.205)
We observe the following: (33)(33)-tensor S is viewed as a 3333-tensor, the linear mapping it defines is singular: its null space includes ~[33], and its range is a subspace of ~(33) (Fig. 2.6a). Hence, it always has eigenvalue 0, whose multiplicity is at least 3.
9 If a
[33][33]-tensor T~ is viewed as a 3333-tensor, the linear mapping it defines is also singulm': its null space includes E(33), and its range is a subspace of/2[33] (Fig. 2.6b). Hence, it always has eigenvalue 0, whose multiplicity is at least 6.
9 If a
Chapter 3 Probabilities and Statistical Estimation This chapter summarizes mathematical fundamentals of probabilities and statistical estimation. Since the facts established here are directly connected with the analysis in the subsequent chapters, brief derivations are given to most propositions, but those which require lengthy and subtle mathematical arguments are stated without proofs.
3.1 3.1.1
Probability Distributions Mean, variance, and covariance
Let x be a scalar random variable, and p(x) its probability density defined for real x. The expectation E[x] and the variance V[x] of x are defined by
E[x] --
f
F
V[x] =
xp(x)dx,
( x - E[x])2p(x)dx.
(3.1)
(:x)
By definition, the variance V[x] is nonnegative. Two random variables x and y are independent of each other if their joint probability density p(x, y) has the form p(x, y) = px(x)py(y). The covariance (or correlation) of x and y is defined by
V[x, Yl = E [ ( x - E[xl)(y- E[y])].
(3.2)
R a n d o m variables x and g are said to be uncorrelated if V[x,y] = 0. Independent random variables are always uncorrelated, but the converse does not necessarily hold. Let x be an n-vector random variable, and p(x) its probability density defined in the entire n-dimensional space ~ n . The expectation E[x] and the variance-covariance matrix (or simply covariance matrix) V[x] of x m'e defined by s
E[x] - Jn[,~xp(x)dx, -
The covariance matrix
-
-
(3.3)
V[x] is always positive semi-definite, since
(a, V[x]a) = E[(x - E[x], a) 2] > 0 61
(3.4)
62
Chapter 3.
Probabilities and Statistical Estimation
for an arbitrary n-vector a. The covariance matrix V[x] is diagonal if and only if the components of x are uncorrelated to each other. The variancecovariance tensor (or simply covariance tensor) 12IX] of an ran-matrix random variable X is an mnmn-tensor defined by V[X l - E[(X - E [ X ] ) |
( X - E[X])].
(3.5)
(A, 1 2 [ X ] A ) - E [ ( X - E[X]; A) 2] > 0
(3.6)
This is also positive semi-definite, since
for an arbitrary mn-matrix A. Quantities E[x2], E[[[x[[2], and E[[IXII 2] are called the mean squares of x, x, and X , respectively. Their square roots are called the root mean squares of the respective random variables. Let X1, ..., XN be independent samples (they may be scalars, vectors, or matrices) from a distribution that has mean E[X] and variance or covariance matrix/tensor V[X]. Their average N -
x
1
-
Z
(3.7)
<:x-- 1
has the following expectation and variance or covariance matrix/tensor: E[X]-
E[X],
V[X]-
NV[X].
(3.8)
This implies that for a large number N 1
X ~ E[X] + O ( ~ ) ,
(3.9)
and hence X ,,., E[X] in the asymptotic limit N --+ oc, where the symbol "~" indicates that the statistical behavior is similar. This fact is known as the law
of large numbers 1. If x and y are random variables, their direct sum x | y is also a random variable. Its expectation is
E[x | y] - E[x] | E[y].
(3.10)
The covariance matrix of x | y has the form V[x + y ] -
(
V[x]
V[y, x]
V[x,y] ) V[y]
'
(3.11)
where y] -
-
-
1The law of large numbers can be stated in many different ways; the precise meaning of the symbol ",,~" differs in each case (we omit the details).
3.1.
Probability Distributions
63
v[y, . ] - E[(y - E [ y ] ) ( . - E [ . ] ) ~1 = Y [ . , y]~.
(3.12)
If x a n d y are i n d e p e n d e n t , t h e n V[x, y] -- 0 a n d V[y, x] - O, a n d h e n c e t h e covariance m a t r i x of x | y has t h e f o r m
v [ . 9 y] - v[~] 9 v[y].
(3.13)
If x is an n - v e c t o r r a n d o m variable a n d A is an m n - m a t r i x , t h e n y - A x is an m - v e c t o r r a n d o m variable; its e x p e c t a t i o n a n d c o v a r i a n c e m a t r i x are
E[y] - AE[x],
V[y] - A V [ x l A T .
(3.14)
Let y - y ( x ) be an m - v e c t o r f u n c t i o n of n - v e c t o r x. If x is a r a n d o m variable, t h e n y is also a r a n d o m variable. If we write ~ - E[x] and x = 5c + A x , t h e d e v i a t i o n A x is a r a n d o m variable of m e a n 0. If we write ~) - E[y] a n d y ~) + A y , we o b t a i n to a first a p p r o x i m a t i o n
OY I A x ,
(3.15)
w h e r e Oy/Oxlx is an r a n - m a t r i x w h o s e (ij) e l e m e n t is Oy~/axj e v a l u a t e d at 2. To a first a p p r o x i m a t i o n , t h e covariance m a t r i x of y can be w r i t t e n as
v[y]-
Oy
v[~] oy
(3.16)
Let x be an n - v e c t o r r a n d o m variable, a n d let n = N[x]. If we write 5; -- E[x], x - 5; + A x , ~t - E[n], a n d n - ft + A n , we o b t a i n to a first approximation 1
A n - ,~_,, P r t A x ,
-- X[5~],
Ilxll
w h e r e Pr~ is t h e p r o j e c t i o n m a t r i x along ft. covariance m a t r i x of n can be w r i t t e n as
(3.17)
To a first a p p r o x i m a t i o n , t h e
1
V[n] - ilxll 2 Prt V[x]Prt. 3.1.2
Geometry
(3.18)
of probability distributions
Let x be an n - v e c t o r r a n d o m variable, a n d let 5~ - E[x]. T h e d e v i a t i o n A x = x - 5;, often called "error", is an n - v e c t o r r a n d o m variable of m e a n 0. We can w r i t e
x -- ~, + Ax,
E[II~II~I of
The mean square covariance m a t r i x V[x]"
V[x]-
E[AxAxT].
(3.19)
th~ error A x is given by t h e t r a c e of t h e
E[IIAxl] 2] - trV[x].
(3.20)
64
Chapter 3.
Probabilities and Statistical Estimation
The spectral decomposition of the covariance matrix V[x] has the following form (see eq. (2.62))" n
V.x[1-
E
~
'2uiu-~ ,
o'1 > _ ""
> _ an > _ 09
(3.21)
i=1
The vector ul indicates the orientation in which the error in x is most likely to occur, and a 2 is the variance in that orientation. In fact, for an arbitrary unit vector u
v[(~. ~)l - E[(~. ~.)~] - (~. E [ A . A . ~ ] u ) - (~. V[.]~).
(3.22)
which is maximized by u l ; the m a x i m u m value is a 2 (see eqs. (2.86)). We can also see that for each i the eigenvalue a i2 indicates the variance of the error A x in orientation ui. Since {ui} is an orthonormal system, the error A x can be expressed in the form Tt
Ax - Z
Axiui,
A x i - (Ax, ui).
(3.23)
i=1
It follows from eq. (3.21) that the distribution in each orientation is uncorre-
fated" E [ A x i A x j ] - (ui, V[xluj) - a~Sij.
(3.24)
If the distribution of the error A x is isotropic, i.e., its occurrence is equally 2 Eq. (2.63) implies that the likely in every orientation, we have a 2 = ... = a n. covariance matrix V[x] has the form
v[~]
-
-~- z ,
~2 _
E[IIAxlI2].
(3.25)
n
Let {ul, ..., un} be an orthonormal basis of 7~n. If the distribution of A x is restricted to the r-dimensional subspace S - {ul, ..., U.}L C 7~n, the covariance matrix V[x] is singular:
V[x] - ~ cr2uiui r.
(3.26)
i=1
The null space of V[x] is the orthogonal complement Af - {U,+l, ..., Un}L C TC~ of S. If the distribution is isotropic in S, the covariance matrix V[x] has the following form (see eqs. ( 2 . 5 0 ) a n d (2.51))"
v[~]
-
-~- P x , r
~2 _
E[llAxl[2].
(3.27)
3.2.
Manifolds a n d Local D i s t r i b u t i o n s
65
%(s)
0 Fig. 3.1. The tangent space Tx(S) and the normal space Tx(,5') to S at x in three-dimensions.
3.2 3.2.1
M a n i f o l d s and Local D i s t r i b u t i o n s M a n i f o l d s and tangent spaces
A (differential) manifold S in an n - d i m e n s i o n a l space T~n is a s u b s e t of 7~ n such t h a t for each p o i n t x C S t h e r e exists a diffeomorphism (a s m o o t h differential m a p p i n g ) b e t w e e n a n e i g h b o r h o o d of x in S a n d an o p e n s u b s e t of 7~ m for some m, which is called t h e dimension of t h e m a n i f o l d S. A m a n i f o l d is a generalization of a s m o o t h c u r v e in two d i m e n s i o n s a n d a s m o o t h surface in t h r e e d i m e n s i o n s (the exact definition of a m a n i f o l d is o m i t t e d ) . A n n - d i m e n s i o n a l space 7~ ~ is itself an n - d i m e n s i o n a l manifold. If an m~-dimensional m a n i f o l d S ' is a s u b s e t of an m - d i m e n s i o n a l m a n i f o l d S, m a n i f o l d S ' is said to be a submanifold of S of codimension m - m'. In p a r t i c u l a r , an m - d i m e n s i o n a l m a n i f o l d in 7~m is a s u b m a n i f o l d of 7~ ~ of c o d i m e n s i o n n - m. Manifolds are often defined by e q u a t i o n s 2. For e x a m p l e , an ( n - 1)d i m e n s i o n a l u n i t sphere in n d i m e n s i o n s c e n t e r e d at t h e c o o r d i n a t e origin, d e n o t e d by S ~-1, is defined by Ilxl] 2 = 1. A n e q u a t i o n f ( x ) = 0 is nonsingular if it defines a m a n i f o l d of c o d i m e n s i o n 1, a n d is singular otherwise. For e x a m p l e , e q u a t i o n Ilxll 2 = 0 is singular a, b e c a u s e t h e m a n i f o l d it defines is t h e c o o r d i n a t e origin O, w h i c h is z e r o - d i m e n s i o n a l (i.e., of c o d i m e n s i o n n). Let x be a p o i n t in an m - d i m e n s i o n a l m a n i f o l d S C g ~ . T h e set of all infinitesimally small vectors Am E T~n such t h a t x + A x E S forms an md i m e n s i o n a l linear space to a first a p p r o x i m a t i o n . T h i s linear space is d e n o t e d by Tx(S) a n d called t h e tangent space 4 to S at x (Fig. 3.1). A t a n g e n t space is a g e n e r a l i z a t i o n of a t a n g e n t line to a c u r v e in two a n d t h r e e d i m e n s i o n s a n d a t a n g e n t p l a n e to a surface in t h r e e d i m e n s i o n s . T h e o r t h o g o n a l c o m p l e m e n t 2A manifold is also called an algebraic variety if it can be defined by polynomial equations F(k)(x) = 0, k = 1, ..., L. 3In this book, we always consider real spaces; in a complex space, equation I]xll2 = 0 defines an (n - 1)-dimensional imaginary surface. 4The collection of all Tx(S), x C S, is called the tangent bundle of S.
66
Chapter 3.
Probabilities and Statistical Estimation
,,,-" ./i
(~)
(b)
(c)
(d)
Fig. 3.2. (a) Two surfaces intersecting transversally. (b)-(d) Two surfaces meeting non-transversally. of T x ( S ) C ~ n , denoted by N x ( S ) , is an ( n - m)-dimensional linear space and is called the normal space to 8. If f ( x ) = 0 is a nonsingular equation that defines a manifold of codimension 1, it has an ( n - 1)-dimensional tangent space T x ( S ) - {Vf}Ll and a one-dimensional normal space N x ( S ) - { V f } L , where V f = (Of/Oxl,...,Of/Oxn) T. If an m-dimensional manifold S and an m~-dimensional manifold 8 ~ meet in 7~" and if m + m ' > n, their intersection 8 f 3 8 ~ is in general an ( m + m ' - n ) dimensional manifold. Equivalently put, if a manifold S of codimension I and a manifold $~ of codimension 1~ meet in T~n and if 1-4-1~ < n, their intersection S gl S~ is in general a manifold of codimension 1 + 1~. If the following condition is satisfied in addition, manifolds 8 and S~ are said to intersect transversally:
T~ (S n S') = T~ (S) n T~ (S').
(3.28)
For example, two surfaces in three dimensions intersect transversally if they cut each other along a curve (Fig. 3.2a), but they do not if they overlap (Fig. 3.2b) or touch each other along a curve (Fig. 3.2b) or at a single point (Fig. 3.2c). If two manifolds intersect transversally, their intersection is structurally stable in the sense that its dimension (or codimension) is preserved if the two manifolds are infinitesimally perturbed in an arbitrary manner.
3.2.
Manifolds a n d Local Distributions
T
67
(S)
x)
0 Fig. 3.3. A local distribution.
3.2.2
Local distributions
Let x be an n-vector r a n d o m variable whose distribution is r e s t r i c t e d to an m - d i m e n s i o n a l manifold $ C 7~n. In general, m a t h e m a t i c a l t r e a t m e n t is very difficult if a r a n d o m variable is constrained to be in a manifold. For example, if x is a unit 3-vector, its probability density p(x) is defined over a unit sphere S 2, but its e x p e c t a t i o n E[x] - fs2 xp(x)dx is generally inside the sphere. In this book, w h e n e v e r we consider a probability distribution of a r a n d o m variable x c o n s t r a i n e d to be in a manifold $, we a s s u m e t h a t x has a local distribution in the sense t h a t the distribution is sufficiently localized a r o u n d some point 5~ E $ and hence to a first a p p r o x i m a t i o n the d o m a i n of the distribution can be identified with the t a n g e n t space T x ( $ ) at 5~ (Fig. 3.3). We choose the point ~ in such a way t h a t
p S E [ x - 5~] - O,
(3.29)
and identify the covariance m a t r i x of x with v[.]-
-
-
(3.30)
where p S is the projection m a t r i x onto the t a n g e n t space T x ( $ ) at 5~. We often call 5~ simply "the t r u e value of x". Eq. (3.30) implies t h a t the range and the null space of V[x] coincide with the t a n g e n t space T x ( S ) and the n o r m a l space N x ( $ ) , respectively. For brevity, we call the null space of the covariance m a t r i x V[x] of x simply "the null space of x".
3.2.3
Covariance matrix of a 3-D rotation
Consider a local distribution of three-dimensional r o t a t i o n s 5 a r o u n d R. Namely, we r e g a r d a t h r e e - d i m e n s i o n a l r o t a t i o n m a t r i x R as a r a n d o m variable p e r t u r b e d from 1~ by a small a m o u n t A R in the form R - R + A R . Since R and R are b o t h rotations, the t r a n s f o r m a t i o n from 1~ to R is also 5A three-dimensional rotation can also be represented by a 4-vector q, called quaternion, constrained to be on a 3-dimensional unit sphere S 3 in four dimensions. Hence, the distribution of three-dimensional rotations can also be thought of as defined over S 3.
68
Chapter 3.
Probabilities and Statistical Estimation
a rotation around some axis by a small angle. Since a small rotation has the form given by eq. (2.57), we can write R - ( I + A ~ l x I + O ( h ~ 2 ) ) R , or
R - / ~ +/',fit x/~ + o(A~2).
(3.31)
Hence, to a first approximation AR-
AFtl •
(3.32)
We define the covariance matrix of rotation R by (3.33) The unit eigenvector of V[R] for the largest eigenvalue indicates the axis around which perturbation is most likely to occur. The corresponding eigenvalue indicates the mean square of the angle of rotation around that axis. The mean square of the angle of perturbed rotation in total is E [ A ~ 2] -- trV[R].
(3.34)
In particular, if the perturbation is equally likely to occur for every axis orientation, the covariance matrix V[R] has the form
v [ n ] - ~-~, 3.3 3.3.1
~ = E[Z~n~].
(3.35)
G a u s s i a n D i s t r i b u t i o n s and )~2 D i s t r i b u t i o n s Gaussian distributions
The most fundamental probability distribution of an n-vector random variable is the multidimensional Gaussian distribution (or normal distribution). We say that n-vector x is a Gaussian random variable if it has a multidimensional Gaussian distribution. If it has mean m and covariance matrix ~ of full rank, the probability density has the form
p(x) -
1
e-(m-m'E-~(m-m))/z
(3.36)
v/(2~.)n i~, I which defines a distribution over the entire n-dimensional space 7~n. It is easy to confirm that
E[,] - ]~o .p(.)d. - m, v[.] - / ~ ( .
- m)(. - m)~p(.)d.
- ~.
(3.37)
3.3.
69
Gaussian Distributions and ~2 Distributions
////2 ....
1
X
0
ml
Fig. 3.4. Contours along which probability density is constant. The probability density is constant on the surface in 7~~ defined by (x-
m , 27-1(x - m ) ) =
c
(3.38)
for a positive constant c (Fig. 3.4). Suppose x is decomposed into the form x = x l | x2, and let m = m l | m2 be the corresponding decomposition of the mean m . If X l and x2 are uncorrelated to each other, the covariance matrix E is decomposed into the form ,~ = •1 | E:2, where -~1 and E2 are the covariance matrices of xl and x2, respectively. Then, eq. (3.36) has the form
1 p(x)-
C
- ( x , - m ~ E~-~(x~-m~))/2
V/(27r)nlEil X
1
e
- ( x 2 - m 2 ' ~,2-~(x2-m2))/2
.
(3.39)
This means that uncorrelated Gaussian random variables are always indepen-
dent of each other. In one dimension, the probability density reduces to
p(x)-
1
e--(x--m)2/2cr2
where cr2 (> 0) is the variance of x (Fig. 3.5). standard deviation. Let us call the pair {m + cr, m - or}
(3.40) The value a is called the
(3.41)
the standard deviation pair, and the interval [m - ~r, m + cr]
(3.42)
the standard confidence interval. The probability that x falls into the standard confidence interval is about 68.27%. If m = 0 and a = 1, the distribution is called the standard Gaussian (or normal) distribution.
70
C h a p t e r 3.
Probabilities and Statistical Estimation
m--o
m
m+o
"------
X
Fig. 3.5. One-dimensional Gaussian distribution. Gaussian distribution of mean m and covariance matrix E of rank r (< n) is defined as follows. Consider the case m - 0 first. Let Tt
-- Z
6ri2Uilti'
0"1 ~ ' ' "
-~ ~7r > O'r-t-1 :
"''-"
f i n ( = 0),
(3.43)
i--1
be the spectral decomposition of 2~. From the discussion in Section 3.1.2, we see t h a t x cannot deviate in the orientations u , + l , ..., Un. Since {ui} is an orthonormal basis of T~n, vector x is expressed in the form
- ~ x,~,,
x,- (,, ~,).
(3.44)
i----1
It follows t h a t the distribution is limited to the range 7"4,E = {Ul, ..., U r } L of the covariance matrix ~ . The components Xl, ..., x . have an r-dimensional Gaussian distribution with density
p(Xl,...,Xr.) __
1 p
C--Z'i=I xi2/2~
(3.45)
v/(2~) ~ 1-i~=l ~, which can be rewritten in the form
p(x) -
1
e-(X'~-x)/2 ,
(3.46)
X/(2~)~l~l+ where I ~ l + - 1-ii=lr ~ is the product of all positive eigenvalues of 2~ Eq. (3.46) defines a probability distribution only in the r-dimensional subspace ~ E . Hence,
v[x]-/~
xxTp(x)dx- ~.
(3.47)
23
T h e Gaussian distribution for m :fi 0 is defined by replacing x by x - m .
3.3.
Gaussian Distributions and )~2 Distributions
71
If the covariance m a t r i x 27 has the spectral decomposition in the form of eq. (3.43), Ul indicates the orientation of the most likely deviation (see Fig. 3.4); or1 is the standard deviation in that orientation. Hence, the probability t h a t - a l < (Ul, x - m ) < al is about 68.27%. Let us call { m + o l U l , m -- crlUl }
(3.48)
the primary deviation pair, which indicates in which orientation the deviation is most likely to occur. The Gaussian distribution plays a special role in statistics for m a n y reasons, of which the most important is the fact t h a t if X1, ..., XN are independent samples from a distribution of mean zero and variance/covariance matrix V[X], the average .'~ - Ec~N__I X ~ / N is asymptotically a Gaussian r a n d o m variable of mean zero and variance/covariance m a t r i x V[X]/N under a mild regularity condition. This fact is known as the central limit theorem 6. Other important properties of the Gaussian distribution include the following: 9 If x is an n-vector Gaussian random variable of mean m and covariance matrix 27, m-vector y - A x for an arbitrary r a n - m a t r i x A is also a Gaussian r a n d o m variable of mean A m and covariance m a t r i x A ~ A T (see eqs. (3.14)). 9 Each component of a vector Gaussian r a n d o m variable x is independent and has the s t a n d a r d Gaussian distribution if and only if m - 0 and ,~ mY.
9 If each component of x is independent and has the s t a n d a r d Gaussian distribution, each component of vector y = A x is independent and has the s t a n d a r d Gaussian distribution if and only if AA-r = I. Since the Gaussian distribution is defined over the entire n-dimensional space 7~n or its linear subspace, the probability tails away infinitely. However, we can define a Gaussian distribution over an arbitrary manifold $ C 7~n if the distribution is sufficiently localized around one point 5~ E ,5 and hence the domain of the distribution can be identified with the tangent space Tx ($) (see Section 3.2.2). Namely, the distribution can be regarded as locally Gaussian if it has a probability density in the form
p(x) - Ce -(x-Jc'E-(x-'2"))/2,
(3.49)
where C is the normalization constant. The mean 5~ and the covariance matrix E are assumed to satisfy the following relations:
~ p(x)dx - l,
P~ fs(x-
5~)p(x)dx - O,
6We omit the exact statement of the theorem and the proof.
72
Chapter 3.
P~ L(.
Probabilities and Statistical Estimation
- ~)(~ - ~ ) T p ( ~ ) d ~ P ~ - E.
(3.50)
Here, P ~ is the projection matrix onto the tangent space T x ( $ ) at 5~.
3.3.2
Moment generating functions and moments
The moment generating function of a scalar random variable x is defined by 7
E[~~~ - Eoo E[xk k! ] Ok.
r
(3.51)
k=0 If x is a Gaussian random variable of mean 0 and variance a2, its moment generating function has the following form:
9 (0)- ~
20~/~= ~0r ~•2k 0 2k .
(3.52)
k=0 Comparing this with eq. (3.51) term by term, we obtain the kth moment E[x k] in the following form:
E[xkl -
k!ak
k - 0, 2, 4,6,..., k - 1,3,5,7, ....
2k/2(k/2)! , 0,
(3.53)
The moment generating function of a vector random variable x is defined by 8 O0
~ o ( 0 ) - E[e (x'O)] - ~
~ E [ ( x , 0)k],
(3.54)
k:0 where the argument 0 is also an n-vector. If x is a Gaussian random variable of mean 0 and covariance matrix E , its characteristic function has the following form: OO
9 (o)- ~(o.~o),= ~ 2_~(o, ~o)~
(3.55)
k=O Comparing eqs. (3.54) and (3.55), we obtain the expressions for the (multidimensional) moments E[xi~xi2"" xi.]. For example,
E[xi] - 0,
E[xixj] - ~ij,
E[xixjxkxl] -- ~ij~kl + ~ik~jl + ~il~jk,
E[xixjxk] -- O, E[xixjxkXlXm] = 0.
(3.56)
7The function r is called the characteristic function (i is the imaginary unit). It is simply the Fourier transform of the probability density p(x) of x. SFunction (~(iO) is also called the characteristic function. It is the multidimensional Fourier transform of the probability density p(x) of x.
3.3.
Gaussian Distributions and X2 Distributions
73
O Fig. 3.6. )~2 distribution with r degrees of freedom.
3.3.3
X 2 distributions
If xl, ..., x~ are i n d e p e n d e n t samples from the s t a n d a r d Gaussian distribution, the distribution of
R-xa
+...+x~
(3.57)
is called the X2 distribution with r degrees of freedom. We call a r a n d o m variable which has the X 2 distribution simply a X2 variable. Its probability density is defined over [0, oc) in the form
p~(R)-
1 Rr/2_le_R/2 2~/2r(r/2)
(3.58)
where F(n) - f o t ~ - l e - t d t is the Gamma function 9 (Fig. 3.6). T h e m e a n and the variance of this distribution are
E[R] = r,
V[R] = 2r.
(3.59)
T h e density p~(R) takes its m a x i m u m at R = r - 2. T h e i m p o r t a n t facts concerning the X~z distribution include the following: 9 If R1, ..., RN are i n d e p e n d e n t ~2 variables with rl, ..., rN degrees of freedom, respectively, the sum R = Ra + " " + RN
(3.60)
is a )i2 variable with rl + ' . . + rn degrees of freedom. 9 If x is a Gaussian r a n d o m variable of m e a n 0 and covariance m a t r i x E of rank r, the q u a d r a t i c form R = (x,1ff-x)
9F(n + 1) - n! and F(n + 1/2) = (2n)Iv/~/22nnI for nonnegative integers n.
(3.61)
74
C h a p t e r 3.
Probabilities and Statistical E s t i m a t i o n
is a X2 variable with r degrees of freedom. 9 T h e probability t h a t a Gaussian r a n d o m variable of m e a n 0 and covariance m a t r i x E of rank r satisfies
(x, ~ - x )
_< 1
(3.62)
to 0 f0 9 If x~, c~ = 1, ..., N , are i n d e p e n d e n t Gaussian r a n d o m variables, each having m e a n 0 and covariance m a t r i x E a of rank ra, the s u m N
R - E
(x~, E ~ x~)
(3.63)
or--1
is a X2 variable w i t h Ec~N__Ir~ degrees of freedom. 9 Let n - v e c t o r x and m - v e c t o r y be Gaussian r a n d o m variables of m e a n 0, and let ~ x and E y be their respective covariance matrices. Let n and r ( ~ n) be the ranks of ~ x and ,!Ty, respectively. If t h e r e exists an r a n - m a t r i x A such t h a t y = A x , t h e difference R - (x, ~7~1x) - (y, E y y ) is a X2 variable with n -
3.3.~
(3.64)
r degrees of freedom (Cochran's theorem 11).
Mahalanobis distance and X 2 test
Let n-vector x be a Gaussian r a n d o m variable of m e a n 0 and covariance m a t r i x E . If E is of full rank, we can define a n o r m 12 of x by
ll ll -
(3.65)
E q u i d i s t a n t points from the origin in this n o r m have equal probability densities, and the probability density at x becomes smaller as IlxllE becomes larger. T h e value IlxllE is called the Mahalanobis distance of x f r o m the origin. If x is r a n d o m l y chosen, Ilxll~ is a ~(2 variable with n degrees of freedom. If E has rank r ( < n), we can define a p s e u d o - n o r m 13
I1 11 -
(3.66)
1~ four decimal digits, this equals 0.6827, 0.3935, 0.1987, 0.0902, 0.0374 for r = 1, 2, 3, 4, 5, respectively. alTo be exact, this is a special case of Cochran's theorem. 12For any positive definite symmetric matrix ~, eq. (3.65) defines a norm in the strict mathematical sense described in Footnote 4 in Section 2.1.1. 13This is not a norm in the strict mathematical sense because the triangle inequality (2.9) does not hold; see eq. (3.67).
3.3.
G a u s s i a n Distributions a n d X2 Distributions
75
a
100
Fig. 3.7. ~:2 test with significance level a%. which is also called the M a h a l a n o b i s distance. Since I1 11 -- 0 for x e N'2~, eq. (3.66) defines a distance in the usual sense only in the range ~ of ~ ; for x l E 7~2~ a n d x2 E N'2~, we have
+ If x is r a n d o m l y chosen,
II ll
= ll lll .
is
(3.67)
variable with r degrees of freedom.
T h e X2 d i s t r i b u t i o n provides a simple m e a n s to test hypotheses. In m a n y problems, we can define a r a n d o m variable R in the form of eq. (3.57), where each x / m a y not have zero mean. T h e e x p e c t a t i o n of R b e c o m e the smallest w h e n all x i have zero means. Suppose all x i have zero m e a n s if and only if some condition is satisfied. This condition is r e g a r d e d as a hypothesis a n d can be t e s t e d by observing a s a m p l e d value of R: the hypothesis is rejected if it is very large to an inadmissible degree. An exact p r o c e d u r e is as follows. Let R be a sample from a X2 d i s t r i b u t i o n with r degrees of f r e e d o m u n d e r t h e hypothesis. T h e hypothesis is rejected with significance level a% (or with confidence level ( 1 0 0 - a ) % ) i f it falls into the rejection region (X2,a, e~) a n d is r e g a r d e d as acceptable 14 otherwise (Fig. 3.7). T h e t h r e s h o l d value Xr,a 2 is called t h e a ~o significance value of X2 with r degrees of f r e e d o m a n d defined in such a way t h a t 15
f
~ p~(R)dR2r,a
a 100
(3.68)
Thus, the hypothesis is rejected with significance level a% if R>
2 X~,a"
(3.69)
This p r o c e d u r e is called the X2 test a n d frequently a p p e a r s in m a n y practical p r o b l e m s - - i n p a r t i c u l a r w h e n least-squares o p t i m i z a t i o n based on the Mahalanobis distance is used, since the residual of o p t i m i z a t i o n is usually a X 2 variable if the noise is G a u s s i a n (see the next section). 14Note that we do not say that the hypothesis is accepted. Being acceptable means that there exists no evidence strong enough to reject it. 15If r is large, say r > 30, the approximation Xr, 2 a "~ (Na +v/2r - 1)2/2 holds, where the number Na is defined in such a way that a standard Gaussian random variable falls in the interval (Na, ~ ) with probability a/100.
76
Chapter 3.
Probabilities and Statistical Estimation
O F i g . 3.8. M o d i f i e d X 2 d i s t r i b u t i o n w i t h r degrees of f r e e d o m .
If R is a X2 variable with r degrees of freedom, the distribution of 8 _
R
/
J-I
p..rr~
\
_
\
----/
r
is sometimes called the modified ~12 distribution16 with r degrees of freedom. The probability density of s is given by rp~(rs), where p r ( R ) is the X2 probability density given by eq. (3.58) (Fig. 3.8). Eq. (3.59) implies that its expectation and variance are
E[s]- 1,
V[s] =
2
-.
(3.71)
r
In terms of the modified k,2 variable s, the X2 test given by eq. (3.69) can be rewritten as >
(3.72)
r
The X2 test usually takes this form when the magnitude of the noise is estimated mid compared with its presumed value, as will be shown in later chapters.
3.4
3.~.1
S t a t i s t i c a l E s t i m a t i o n for G a u s s i a n M o d e l s
Maximum likelihood estimation
Let x be an n-vector, and A an ran-matrix. Let ~ be an m-vector Gaussian r a n d o m variable of mean 0 and covariance matrix E of full rank. Then y -
Ax
+ e
(3.73)
16This terminology is not widely used because the only difference from the X2 distribution is scaling. However, this distribution plays an essential role in the problems we study in this book, as we will see later.
3.4.
Statistical Estimation for Gaussian Models
77
is an m-vector Gaussian random variable with mean A x and covariance matrix ~7. Hence, the probability density of y is given by p(v) =
1
_(y_Ax,,F,-l(y_Ax))/2
(3.74)
Consider the problem of estimating the parameter x from a sampled value y. Namely, we want to find a function ~ ( y ) that gives an estimate of x for a given y. Such a function is called an estimator. Evidently, any value x ~ such that A x ~ = 0 can be added to x. In order to remove this indeterminacy, we assume that x is constrained to be in A f t , the null space of matrix A. Maximum likelihood estimation seeks the value x that maximizes the probability density p(y), which is called the likelihood when viewed as a function of the observed value y. The problem reduces to minimizing the Mahalanobis distance I l Y - AxlI,F,, i.e.,
J[x] = (y - A x , E - l (y - A x ) ) --+ min
(3.75)
under the constraint x E A f t . The solution, which is called the maximum likelihood estimator, is obtained in the following form (see eqs. (2.136) and (2.137)): -(A-r~-IA)-Aq-,F,-ly. (3.76) Its expectation and covariance matrix are E[5~] = x,
V[~] = ( A q - ~ - I A ) -.
(3.77)
An estimator is unbiased if its expectation coincides with the true value. The first of eqs. (3.77) implies that the maximum likelihood estimator 5~ is unbiased. The residual J[~] of the function J[x] given by eq. (3.75) can be written as follows (see eq. 2.138)): J[~] = (y, ~ 7 - 1 y ) - ( ~ , a q - ~ - l A ~ ) .
(3.78)
This is a ~2 variable with n - m ~ degrees of freedom, where m ~ = r a n k A (see eq. (3.64)). If each component of e distributes independently and isotropically with the same root mean square e, the covariance matrix of E has the form V[e] = e2I (see eqs. (3.25)). Hence, eq. (3.75) reduces to the least-squares optimization I l Y - Axll 2 -4 min,
(3.79)
and the maximum likelihood estimator 5: is given as follows (see eq. (2.134)):
= A-y.
(3.S0)
78
Chapter 3.
Probabilities and Statistical Estimation
(.i:-x, 2:- ~(x-x)) = constant
Rn
L
Fig. 3.9. The point ~ E S that minimizes J[R] is the tangent point of the equilikelihood surface to S.
3.4.2
Optimization
with linear constraints
Let x be an n-vector Gaussian random variable with an unknown mean 5~ and a known covariance matrix 22 of full rank. Suppose the mean 2 satisfies a linear constraint A x - b. Consider the problem of estimating 9 from a sampled value x. Since the probability density of x has the form 1
e -(x-x,
~ -1
(x - x))/2
(3.81)
I the m a x i m u m likelihood estimator for 2 is obtained by the minimization J[5~] = (x - ~, 27 -1 (x - ~)) --+ min
(3.82)
under the constraint A ~ = b. In geometric terms, this problem can be interpreted as follows. The constraint A x = b defines an a ~ n e subspace 17 S in 7~~. The minimization (3.82) means that 5~ is a point in $ that has the shortest Mahalanobis distance 115;- 5c]122 from the sampled position x. Since the equilikelihood surface (the set of all 5~ for which J[5~] = constant) is an "ellipsoid" in 7~~, the point 5~ that minimizes J[2] in S is the "tangent point" of this ellipsoid to 8; all other points in S should be outside that ellipsoid (Fig. 3.9). If we let A x = x - ~, eq. (3.82) reduces to the minimization (Ax, 27 - l A x ) --+ min
(3.83)
under the constraint AAx
= A x - b.
(3.84)
If this constraint is satisfiable, the m a x i m u m likelihood estimator ~ (= x - A x ) is obtained as follows (see eq. (2.140)): 5; - x - • A 17An in ~r~.
affine subspace of
q- ( A E A
T-)- ( A x - b).
(3.85)
7~ n is a s u b s e t of 7~ n o b t a i n e d by t r a n s l a t i n g a linear s u b s p a c e
3.4.
Statistical Estimation for Gaussian Models
79
Its expectation and covariance matrix are E[5~]- 3,
V [ : b ] - 2 ~ - IF,A T ( A ~ A - C ) - A E .
(3.86)
Hence, the m a x i m u m likelihood estimator 5: is unbiased. The residual J[~] of the function J[x] given by eq. (3.82) can be written as J[~] - ( A x - b, ( A ~ , A T ) - ( A x
- b)).
(3.87)
This is a X2 variable with m' degrees of freedom, where m' - r a n k ( A E A ]-) (= rankA) (see eq. (3.61)).
3.4.3
Maximum
a posteriori probability e s t i m a t i o n
Let x be an n-vector, and A an ran-matrix. Let y be an m-vector Gaussian random variable of mean 9 and covariance matrix 2~y of full rank. Consider the following linear model: z = A x + y. (3.88) We want to estimate the parameter x from a sampled value of m-vector z. Suppose the parameter x has an a priori distribution with mean 5~ and covariance matrix 2~x of full rank. The a priori probability density (or prior) of x is p(x) 1 - ( x - x , 2 ~ ~(x-x))/2 -
v/(2~)~l~l
~
.
(3.89)
.
(3.90)
The probability density of y is p(u)
=
~
v/(2~)ml~ui
~-r
~r
For a particular value of x, the m-vector z defined by eq. (3.88) is a Gaussian random variable of mean A x + ~) and covariance matrix E y . Hence, the conditional probability density p(zlx ) of z conditioned on x is
p(z]x) -
1 ~/(2~)~l~yl
e - ( z - A x - ~ t ' E ~ t ~(z-Ax-~l))/2 .
(3.91)
The marginal probability density p(z) of z is defined by
p(z) - f p(z[x)p(x)dx,
(3.92)
which is computed indirectly as follows. From eq. (3.88), the expectation and the covariance matrix of z are given by
E[z] - ASc + 9,
V[z] - A ~ x A -r + E y .
(3.93)
80
Chapter 3.
Probabilities and Statistical Estimation
Hence, the marginal probability density p(z) should have the form
p(z)
1
_(z_A~c_ 9 (A,V, x A T+Ey)-I(z_A~c_y))/2
V/ ( 2rr) m lA ,U,x A r + .,Uy l (3.94) The a posteriori probability density (or posterior)p(xlz) of x conditioned on z is defined by
p(z)
'
(3.95)
which is known as the Bayes formula. Maximum a posteriori probability estimation (often called Bayes estimation) seeks the value of x that maximizes the a posteriori probability density p(xlz ). Maximum likelihood estimation is a special case obtained by setting p(x) - constant. If eqs. (3.89), (3.91), and (3.94) are substituted into eq. (3.95), the a posteriori probability density is obtained in the form
P(XlZ) where
~ l ~ x I +A-rE~IIAI -(x-x
(2~)-
e
(E[r~+ATE~/A)(x-x))/2 '
(3.96)
the vector ~ is defined as follows (see the matrix inversion formula
(2.22)): 5; -- 9 + (,E~, x + AT-,~yl A ) - I A T z~yl (Z -- (A~ + 9)) = ~ + r, x A r ( A r , x A T +
~ ) - 1 ( ~ _ (A~ + 9)).
(3.97)
Thus, the a posteriori probability density p(xlz ) defines a Gaussian distribution of x. If we write its mean and covariance matrix as E[x, lz ] and V[xlz ], respectively, we have W[xl2:] _ ( ~ 1
E[xlz]_ ~ +
.~_ A T . ~ I A ) - I ,
V[xlz]Ar.~yl ( Z -
(A~ -~- 9)).
(3.98) (3.99)
Evidently, p(xlz ) is maximized by x = ~ (= E[zlz]) given by eq. (3.97). We also see that E[~] - 2, i.e., the maximum a posteriori probability estimator 5~ is unbiased. The uncertainty of x that still remains after z is observed is described by the a posteriori covariance matrix V[xlz ] given by eq. (3.98). The marginal probability density p(z) in the denominator in the Bayes formula (3.95) does not involve x. Hence, maximizing p(xlz ) is equivalent to maximizing p(zlx)p(x), which in turn is equivalent to maximizing l o g p ( x ) + logp(zlx ). If eqs. (3.89) and (3.91) are substituted into this, the problem can be written in the following form:
J[x] - (x-hc, z ~ l ( x - 5 ; ) ) + ( z - A x - ~ l ,
Eyl(z-Ax-~l))
--+ min. (3.100)
3.4.
Statistical Estimation for Gaussian Models
81
Va_l
state transitiOnr~B ~ ....
Xct_l - - ~ z q k a _ l ~ - ~ Xct----~ ...
....... o bs;;ation
...................
......................
Y~ Fig. 3.10. State transition of a linear dynamical system. Hence, the problem is viewed as minimizing the square sum of the Mahalanobis distances:
d [ x ] - [ I x - 5:[]2E x + I1 -
~ min.
(3.101)
J[&] - (z - AS~ - ~), (A,V, x A T + , ~ y ) - l ( z - AS - ~)).
(3.102)
The residual J[&] can be written as
This is a X2 variable with m degrees of freedom. The marginal probability density p(z) given by eq. (3.94) has the form
p ( z ) - constant • e -J[2]/2. 3.4.4
(3.103)
K a l m a n filter
The Kalman filter is an iterative linear update procedure for maximum a posteriori probability estimation when the parameters to be estimated change as time progresses in the form of a linear dynamical system. Let n-vector x~ be the state vector at time c~, and L-vector y~ the observation vector at time c~. The process of state transition and observation is described by x~ - A ~ - I x ~ - i + B ~ - l V ~ - I ,
(3.104)
y,~ = C~x~ + w,~,
(3.105)
where A ~ - I , B ~ - I , and C a are constant matrices (Fig. 3.10). Vectors v~ and w~ are assumed to be independent Gaussian random variables, and their expectations E[vo~] and E[w(~] and covariance matrices V[vo~] and V[wo~] are assumed to be known. Furthermore, the covariance matrix V[w,~] is assumed to be of full rank. With this setting, the Kalman filter computes the estimator &~ of the state vector x~ at time c~ and its covariance matrix V[&~] by iterating maximum a posteriori probability estimation. The update rule is derived as follows.
Chapter 3.
82
Probabilities and Statistical Estimation
E[Va_l]
--"
~-,-'~-~,~~
i
"~~ ~
"'"
Fig. 3.11. Kalman filter. Assume that x,~-i is a Gaussian random variable of mean X~--I and covariance matrix V[5~_1]. Since eq. (3.104) is linear in x~-1 and V~-l, the state vector xa at time a is also a Gaussian random variable. Let ~ and V[xa] be, respectively, the mean and the covariance matrix of that distribution. They are computed from eqs. (3.88) and (3.93) as follows (read z, A, x, and y in eqs. (3.93) as x~, A ~ - I , x~-1, and B~-lV~-l, respectively):
Xo, -- Ac~-i ~,o~-1 Jr So~-i E[vo~-I ],
(3.106)
T~r[xal-Aa-lV[ ~ o~-1]Ac~_l 7- + B a - 1 V[vot-]1 B aT- l .
(3.107)
If the value y~ determined by eq. (3.105) is observed, the a posteriori probability distribution of x~ is also Gaussian. Let ~ and V [ ~ ] be, respectively, the mean and the covariance matrix of that distribution. They are computed as follows (read z, A, x, and y in eqs. (3.98), and (3.99) as y~, C a , x~, and w~, respectively):
5~ = 5c~ + V[Sc~lC~V[w~]-l(y~ - ( C ~ 5 ~ + E[w~])),
( -
-1 +
T V [ w a ] - l e a ) -1 .
Eqs. (3.106)-(3.109) define the Kalman filter for computing ~ XO~--I V[~c~-l]. Eq. (3.108) can also be written in the following form:
from
and
5~,~ = 5~ + K,~ (y,~ - :~,~),
K~ = V[~,o~lC:V[wcv] -1,
~lc~ --C~5~ + E[w~].
(3.108) (3.109) and V[&~]
(3.110) (3.111)
Since ~ and 9~ are maximum likelihood estimators of x~ and ya before the actual value y~ is observed, eq. (3.110) can be viewed as correcting the predicted value 5~ by feeding back the difference between the actual observation y~ and its estimator 9~ (Fig. 3.11). In this sense, the matrix K ~ is often referred to as the Kalman gain. The difference y~ - 9~ is independent of Y~-I, Y~-2, " " , and has mean 0; it is called the innovation of y~.
3.5.
General Statistical Estimation
83
In the above formulation, the Kalman filter is derived as m a x i m u m a posteriori probability estimation on the assumption that all the variables are Gaussian. However, the same Kalman filter can be obtained without assuming Gaussian distributions: if we adopt the criterion of minimum mean square estimation, we can obtain eqs. (3.106)-(3.109) by orthogonal projection is of the state vector onto the affine subspace defined by the observation vector (we omit the details).
3.5
3.5.1
General Statistical Estimation
Score and Fisher information matrix
We now study the problem of statistical estimation in the general case where the distribution of the data or noise is not necessarily Gaussian and the statistical model is not necessarily linear. In abstract terms, the problem is stated as follows. Let x be a random variable that has a probability density p(x; O) parameterized by 0. The problem is estimating the parameter 0 by observing random samples from that distribution. We assume that x is an n-vector constrained to be in an hi-dimensional manifold X C 7~~ and the parameter 0 is an m-vector constrained to be in an m'-dimensional manifold $ C 7~m. The probability density p(x; 8) is assumed to be continuous and continuously differentiable with respect to both x and 0 an arbitrary number of times. We also assume that p(x; 8) > 0 for all x e X. Furthermore, differentiation V0(. ) (= ( 0 ( . ) / 0 0 1 , . . . , 0 ( 9)lOOn)T) with respect to 0 and integration f dx with respect to x are assumed to be interchangeable for any expression of p(x; 8) (as long as the integration exists). The probability density p(x; 8) is not defined for 0 g S. For the convenience of analysis, however, we extend it to 0 g S in such a way that p(x; 0 + AO) = p(x; 8) + 0 ( 5 8 ) 2 for all 0 e S and A0 e TO(,S)• Here, TO(S ) is the tangent space to manifold $ at 0 (see Section 3.2.1); O(A0) 2 denotes terms of degree 2 or higher in the components of A0. In intuitive terms, the probability density is "constant in the normal direction" to S in ~r~. This assumption implies the identity
v0p(.; 0) e TO(S).
(3.112)
Define an m-vector random variable I by I = V 0 logp(x; 8).
(3.113)
This vector is called the score of x with respect to the parameter 8. Since lSOrthogonality is defined in the statistical sense as having no correlation. We omit the details.
Chapter 3.
84
Probabilities and Statistical Estimation
p(x; 0) is a probability density, the normalization condition
xp(x; O)dx
- 1
(3.114)
holds for any 8 E $. It follows that if 8 is perturbed into 8 + A0 in such a way that 0 + A0 E 8, the first variation of the left-hand side of eq. (3.114) must be 0. The constraint 0 + A0 E 8 requires A0 E TO(S ) to a first approximation. If we use the logarithmic differentiation formula V0P(X; 0) =
p(x;
0)V 0 logp(x; 0),
(3.115)
the first variation of the left-hand side of eq. (3.114) is
fxp(x;
0)(V0 logp(x; 0),
AO)dx -
(El/f, A0),
(3.116)
where we have invoked the assumption that differentiation and integration are interchangeable. Since eq. (3.116) must vanish for an arbitrary A0 E TO(S ), we conclude that E[l] E TO(S) • However, eq. (3.112)implies that E[l] E TO(S ). It follows that the score l is a random variable of mean 0:
E[l] : The
Fisher information matrix
0.
(3.117)
is defined by
j-
(3.11S)
Taking the expectation of the identity 02 log p
OOiOOj
0 log p 0 log p
OOi
OOj
1
02p
I P O0~OOj'
(3.119)
and noting that differentiation and integration are interchangeable, we obtain
/0,.o,. /,lo,.o.o, / o, pdx OOiooPpdx = Ppdx + OOi
OOj
OOiOOj
02 - - fx liljpdx -~ OOTOOj/xPdX - -E[lilj].
(3.120)
Hence, if an (mm)-matrix L is defined by L - - V ~ logp(x; 0),
(3.121)
where V~(. ) denotes a matrix whose (ij) element is 02( 9)/OOiOOj, the Fisher information matrix is expressed in the following form:
j = E[L].
(3.12z)
3.5.
General Statistical Estimation Since
85
I E TO(S ), we have P~J-
J P ~ - J,
(3.123)
where P 0s is the projection matrix onto TO(N ). Hence, the rank of J is at most rn'. We say that the distribution p(x; O) is regular if J has rank m, which means that I can take all orientations in TO(S ) if x is appropriately chosen. In this book, we consider only regular distributions. Since the range of J coincides with TO(S ) for a regular distribution, the following identity holds: J JJ - J - P~. (3.124) -
If the distribution is locally Gaussian and has the probability density given by eq. (3.49}, the score 1 and the Fisher information m a t r i x J have the following form: l = E-(x~), /., = J = 2Y-. (3.125)
3.5.2
Unbiased estimator and Cramer-Rao lower bound
Let x E A" be a sample from a distribution which has a probability density
p(x; O) parameterized by 0 E $. Let ~}(x) be an estimator of 0, i.e., a function of x that returns an estimate of 0. The estimator ~}(x) is assumed to satisfy the constraint on 0: 0(x) E S for any x E X. The estimator {}(x) is unbiased if 19
P~ / ( O ( x ) - O)p(x; O)dx - O, ax
(3.126)
where P0s is the projection matrix onto TO(S) (see eq. (3.29)). The covariance matrix of the estimator 0(x) is defined as follows (see eq. (3.30))"
ViOl = Po~/;(0(~)
- o)(o(x) - o)Tp(x;
o)~P~.
(3.127)
Since eq. (3.126) holds identically for any 0 E $, the first variation of the left-hand side of eq. (3.126) must be 0 if 0 is perturbed into 0 + A0 for AO e TO(S ). If we use the logarithmic differentiation formula (3.115) and interchange differentiation and integration, the first variation of the left-hand side of eq. (3.126) can be written as
= --AO + E[P~O(x)I-C]AO, 19It seems that unbiasedness can be defined by El0] = 0, but is "curved" (see Section 3.2.2).
(3.128)
E[O]may be outside S if S
86
Chapter 3.
Probabilities and Statistical Estimation
where we have employed the approximation P~+AoE[O
-
01 ~ 0. The exact
expression for PO+AoE[O s - 0] involves the second fundamental form of the manifold $ at 0. Roughly speaking, it has the order of a2AO/R 2, where a is the "standard deviation" of 0 in $ and R is the "radius of curvature" of $ (we omit the exact analysis). Here, we simply assume that the manifold $ is sufficiently "flat" in the domain within which the distribution of 0 is localized. Throughout this book, this is always assumed whenever we talk about local distributions. Eq. (3.128) must vanish for an arbitrary A0 e T0($ ). Since E[P~OI T] P~OE[I] T - O, this condition is written as -
A0 e $.
-
(3.129)
If we let A0 - P~Ae, then A0 E S for an arbitrary Ar E 7~m. Hence, the identity E[P~(O - O)(P~I)TIAE - P~Ae (3.130) must hold identically for e E ~m. eq. (3.130) implies
Since I E T0($ ) and hence P~l - l,
E[P~(O - 0)1 T] - P~.
(3.131)
Combining this with eqs. (3.127) and (3.118), we obtain the following relationship:
T ( ]_
V[0] P0s
p S J
)
9
(3 132) .
Since the left-hand side is positive semi-definite, the following matrix is also positive semi-definite (see Section 2.2.3)" (pS
,-
_j-)(
,-)-(
V[0]
V[O]-J-
,-). (3. 33)
Here, we have assumed that the distribution is regular and hence eqs. (3.123) and (3.124) hold. Since J - is positive semi-definite, the positive semidefiniteness of eq. (3.133) implies V[0]_ J - ,
(3.134)
where A __ B means A - B is positive semi-definite. Eq. (3.134) is called the Cramer-Rao inequality and gives a lower bound, called the Cramer-Rao lower bound, on the covariance matrix V[0] of an arbitrary unbiased estimator 0. If this bound is attained, the estimator 0 is said to be efficient.
3.6.
Maxinmm Likelihood Estimation
87
Suppose N independent samples Xl, -.-, X N are observed from a distribution whose probability density is p(x; 0). Let 0(ix, ...,iN) be an estimator of 0. If we consider the direct sum -- Xl @''"
the independence of x l,
..., X N
(3.135)
O iN,
implies that 5: has the probability density
/5(5:; 0) -- p(xl; 0 ) . .
"p(xN; 0).
(3.136)
Since the estimator ~}(Xl, . . . , i x ) can be viewed as an estimator 0(5:) of 5:, the argument described earlier can be applied. The score of 5: is N
l-
V 0 log/5(5:; 0 ) -
N
V0 E
logp(x~; 0 ) -
ct=l
E
l~,
(3.137)
a=l
where l~ is the score of x~. Since {x~} are independent, the Fisher information matrix of 5: is T
[ I - E[
l~
l#
a=l
]- E
/3=1
E[l~l~] - N J,
(3.138)
c~,f~=l
where J is the Fisher information matrix for p(x; 0). Cramer-Rao lower bound is given in the following form:
Consequently, the
1 v[0] ~_NJ-.
(3.139)
In particular, if 0 is a scalar, we have V[0] >
3.6
3.6.1
1
1
NE[(O logp/O0) 2] = -NE[O z logp/O02]"
(3.140)
Maximum Likelihood Estimation
Maximum likelihood estimator and the exponential family
Given a sample x E ,Y from a distribution which has a probability density
p(x; O) parameterized by 0 E 8, the maximum likelihood estimator 0 is the value of 0 that maximizes the likelihood p(x; 0), i.e., the probability density viewed as a function of 0 by substituting the sampled value. Hence, the maximum likelihood estimator 0 is the solution of the minimization JThe probability density in preceding section.
-21ogp(x; 0) --+ min.
(3.141)
p(x; O) is assumed to have the properties described
88
Chapter 3.
Probabilities and Statistical Estimation
In order to distinguish variables fi'om their true values, we regard 0 as a variable and write its true value as 0. With the expectation that the maximum likelihood estimator 0 is close to 0, we write 0-0+A0.
(3.142)
The constraint 0 E S requires A0 E To(S ) to a first approximation. Substituting eq. (3.142) into the fimction J in eq. (3.141) and expanding it in the neighborhood of 0, we obtain _
J = - 2 logp(x; 0) - 2(1, A0) + (A0, LA0) + O(A0) 3,
(3.143)
where I is the score defined by eq. (3.113) and L is the matrix defined by eq. (3.121). The bar indicates that the value is evaluated at 0. The assumption (3.112) implies that P ~ ] , = LPoS - L,
(3.144)
where P ~ is the projection matrix onto T0(S ). It follows that the rank of L is at most the dimension m ~ of the tangent space T0(S ). Here, we assume that the rank of L is exactly rn ~ so as to guarantee the unique existence of the value of 0 E $ that minimizes J. Then, the range of L coincides with T0(S ). If the term O(A0) 3 in eq. (3.143) is ignored, the function J is minimized under the constraint A0 E T0(S ) by the following value (see eq. (2.137)): A0 - - L - i .
(3.145)
It follows that the maximum likelihood estimator is given by 0 - 0 + A0. From eqs. (3.117) and (3.144), we see that P x o E [ O - 0 ] - 0, which means that the maximum likelihood estimator is unbiased in the first order 2~ A probability distribution which has a probability density p(x; O) parameterized by 0 is said to belong to the exponential family if p(x; O) can be expressed in terms of a vector function f ( x ) and scalar functions C(O) and g(x) in the form 21
p(x; O) -- C(O) exp[(f(x), O) + 9(x)].
(3.146)
Many probability distributions that appear in practical applications have probability densities in this form, and the Gaussian distribution is a typical example 22. For a distribution of the exponential family, the matrix L 2~ proviso "in the first order" means t h a t the result is obtained by ignoring high order terms in A0. 21If the exponent on the right-hand side of eq. (3.146) has the form (f(x), h(0)) + g(x) for some vector function h ( . ), the distribution is said to belong to the curved exponential family. If h(0) is taken as a new parameter, i.e., if rI = h(0) can be solved for 0 in the form 0 = t(rl) , the distribution belongs to the exponential family with p a r a m e t e r rI. 22Beside the Gaussian distribution, the exponential family includes such distributions as the Poisson distribution, the binomial distribution, the gamma distribution, the beta distribution, and the Poisson-gamma distribution.
3.6.
Maximum Likelihood Estimation
89
defined by eq. (3.121) does not depend on x and hence is equal to the Fisher information matrix J (see eq. (3.122)). From eq. (3.145), we see that the covariance matrix V[0] of the maximum likelihood estimator 0 is given as follows (see eqs. (3.127)and (3.144)): V [ 0 ] - L-E[~q-]L - - J - J J - - J - .
(3.147)
This means that the Cramer-Rao lower bound is attained (see eq. (3.134)). Thus, the maximum likelihood estimator is efficient in the first order if the distribution belongs to the exponential family.
3.6.2
Asymptotic behavior
If N independent samples X l , . . . , X N are observed from a distribution whose probability density is p(x; 0), the maximum likelihood estimator ~) is the value of 0 that maximizes the likelihood p(xl; O) .. . p ( x s ; O). Namely, 0 is the solution of N
J-
-2 E
logp(x~)--+ min.
(3.148)
c~'-i
In this case, eq. (3.145) is replaced by
(s ) ~=1
(~=1
where l~ and L~ are, respectively, the score l and the matrix L for x~ evaluated at 0. Hence, the covariance matrix V[0] of the maximum likelihood estimator 0 - 0 + A0 is written in the following form:
E
ViOl = E[
c~=l
~=1
3,=1
5=1
-l. (3. 50)
Matrices [,1, ..., I,N are random variables that belong to the same distribution, and their common expectation is the Fisher information matrix J . Hence, the following law of large numbers holds for a sufficiently large N (see eq. (3.9))" N
1 ~ E
t ~ ,,~ J .
(3.151)
c~=l
From this and the independence of each x~, we can write eq. (3.150) as follows (see eqs. (2.81 ) )" N
1
N
[lj~-lJc~,/~=l
N2Z c~--1
J - J J - - -N J - "
(3.152)
90
Chapter 3.
Probabilities and Statistical Estimation
This means that the Cramer-Rao lower bound is attained in the asymptotic limit (see eq. (3.139)). Thus, the maximum likelihood estimator is asymptotically efficient 23. We also observe the following (we omit the details)" 9 An estimator 0 of 8 is said to be consistent if 0 ~ 8 as N -+ co. Since eq. (3.152)implies that V[0] ,.~ O(1/N), the maximum likelihood
estimator consistent. 9 Since 11, ..., IN are independent random variables of mean 0 (see eq. (3.117)) and have the same distribution, the central limit theorem N (see Section 3.3.1) states that ~ = 1 l ~ / N is asymptotically Gaussian. It follows from eq. (3.149) that the maximum likelihood estimator is
asymptotically a Gaussian random variable of mean 0 and covariance matrix J - IN. 3.7
3. 7.1
Akaike
Information
Criterion
Model selection
Suppose N data x l, ..., x N are observed. A statistical test is a procedure for judging if they can be regarded as independent random samples from a particular distribution with probability density p(x); the X2 test is a typical example (see Section 3.3.4). If the data are known to be independent random samples from a distribution with probability density p(x; O) parameterized by O, the procedure for determining the parameter 8 that best explains the data is called statistical estimation; maximum likelihood estimation is a typical example (see Section 3.6.1). But how can we guess a parameterized probability density p(x; 8)? In other words, how can we judge if the data can be regarded as independent random samples from a distribution with probability density pl(x; 8), or with probability density p2(x; 8), or with other probability densities? A parameterized probability density p(x; 8) is called a (statistical) model of the distribution. In order to select a best model, we need a criterion that measures the "goodness" of a particular model. If we adopt a particular model p(x; 8) and apply maximum likelihood estimation, the parameter 8 is determined by maximizing l I N (~=1 p ( x ~ ; 8) o r N 1 log p(x a; 0). Let ~ be the resulting maxiequivalently minimizing - 2 ~ 4= mum likelihood estimator of 0. Let us call n
J[{x~ }, 0] - - 2 ~
logp(x~; ~)
(3.153)
o~--1
23The effect of the neglected high order terms in A0 converges to 0 as N --+ c~ under mild regularity conditions. Hence, the proviso "in the first order" can be dropped. We omit the details.
3.7.
Akaike Information Criterion
91
simply the residual. A good model is expected to have a large likelihood l-I~'~=l p(xa; 0), thereby a small residual J[{x~},0]. Hence, the residual appears to be a good criterion for model selection. However, since 0 is determined so as to minimize the residual for the current data {x a}, the residual can be made arbitrarily small, say, by assuming that x can take only N values x l , .... , x u . Such an artificial model may explain the current data very well but may be unable to predict the data to be observed in the future. This * ... , x N * be independent ranobservation leads to the following idea. Let Xl, dom samples to be observed in the future; they are assumed to have the same distribution as the current data x l, ..., X g. For a good model, the residual n
J[{x,~}, 0] - - 2
logp(x,~, 0)
(3.154)
o~1
for the future data { x ; } should be small. Since the future data { x ; } and the maximum likelihood estimator 0, which is a function of the current data {x~}, are both random variables, the above residual is also a random variable. In order to define a definitive value for the model, we take expectation and consider I - E*[E[J[{x~}, 0]]]
(3.155)
where E*[.] and E[.] denote expectation with respect to the future data {x~} and the current data {xa }, respectively. We call I simply the expected residual and regard a model as better if the expected residual I is smaller.
3. 7.2
Asymptotic expression for the expected residual
As in Sections 3.5 and 3.6, we assume that the data x l, ..., x N are n-vectors sampled from an hi-dimensional manifold X C ~ n . The model parameter 0 is assumed to be an m-vector constrained to be in an ml-dimensional manifold S C 7~m. Hence, the model p(x; O) has rn' degrees of freedom. We also assume that the model p(x; 0) contains the true probability density. Suppose the true model is p(x; 0), and let 0 be the maximum likelihood estimator of 0. Writing - 0 + A0 and expanding logp(x~; 0) in the neighborhood of 0, we obtain -, A 0 ) l o g p ( x ; ; 0) - logp(x~; 0- ) + (1,~,
~1 ( A 0 , L : A 0 )
+ O(A0) 3
(3.156)
where the score l~ and the m a t r i x / ~ are defined as follows (see eqs. (3.113) and (3.121)):
i: -
log p(. ;; 0),
/,~ = - V ~ logp(x*; 0).
(3.157)
92
Chapter 3.
Probabilities and Statistical Estimation
Ignoring higher order terms and noting that A0 does not depend on {x~}, we obtain from eqs. (3.155) and (3.156) N
I -- Z (-2E*tlogp(x*~; 0)1- 2(E*[l*], ElAn}I)-t- El(A0, E*[L*IA~})]). o~--1
(3.158) Since {x;} and {x~} have the same distribution, we have
E*[logp(x*~;0)1 - E[logp(x,~; 0)1.
(3.159)
Recall that E * [ l : ] - 0,
E * [ L : ] - J,
(3.160)
where J is the Fisher information matrix (see eqs. (3.117), (3.118), and (3.122)). It follows that eq. (3.158) can be rewritten as N
I - -2E[Z
logp(x~; 0)] +
E[N(AO, JA0)].
(3.161)
ct=l
Expanding log p(x~; 0) in the neighborhood of 0, we obtain -
1
logp(x~; 0) - logp(x~; 0) - (l~, A0) - ~(A0, s
+ O(A0) 3, (3.162)
where l~ and I,a are defined by 1,~ = V 0 logp(x,~; 0),
L,~ - - V ~ logp(x,:,; 0).
(3.163)
If we put .[,,~ -- - V ~ logp(xo~; 0),
(3.164)
I,~ - L~ + O(A0).
(3.165)
we have Substituting this into eq. (3.162) and summing it over c~ - 1, ..., N, we obtain N
Z
N
N
logp(xa; 0) - Z logp(x~; 0) - ( Z 1~' A0)
c~=l
~=I
2
4=1
~ ~ L~
AO) + O(AO)3.
(3.166)
c~=l
Since the maximum likelihood estimator {} maximizes ~ =N1 log p(x o~; 0 ) , w e see that ( Z lo,, A0) -- (V 0 ot=l
logp(xo,; 0) ce=l
, A0) -- 0
(3.167)
3.7.
Akaike Information Criterion
93
for any admissible variation A~} e T~)($). If we recall that eq. (3.151) holds for N ,,~ oo (the law of large numbers) and ignore higher order terms in A~} in eq. (3.166), we have N
E
N
logp(xa; 0) -~ E
c~=l
logp(x~;/)) - N N ( A / ) , JA/})
(3.168)
c~=l
for N ,,~ oe. Substituting this into eq. (3.161), we obtain N
I ,,~ - 2 E [ E
logp(x~;/})1 + 2NE[(A/},
JAb)l.
(3.169)
ce--1
As shown in Section 3.6.2, the deviation A0 of the maximum likelihood estimator is asymptotically a Gaussian random variable of mean 0 and covariance matrix J - / N (the central limit theorem). We assume that the distribution is regular, so the Fisher information matrix J has rank m ~ (see Section 3.5.1). It follows that (A~), ( J - / N ) - A O ) - N ( A 0 , J A 0 ) is asymptotically a X2 variable of m' degrees of freedom (see eq. (3.61)). Hence, we have E[N(A/}, J A 0 ) ] ,,~ m'
(3.170)
for N ,,~ c~. Thus, eq. (3.169) can be expressed in the form
I ,,, E[J[{x~}, t}]] + 2m'.
(3.171)
This means that if we define
A I C - J[ { x ~ }, 0] + 2m',
(3.172)
this is an unbiased estimator of the expected residual I for N ~- oe. This estimator is called the Akaike information criterion, or A I C for short, and can be used as a measure of the goodness of the model; the predictive capacity of the model is expected to be large if AIC is small. According to this criterion, a good model should not only have a small residual but at the same time have a small degree of freedom; otherwise, one can define a model that fits the current data arbitrarily well by increasing the number of the model parameters.
Chapter 4 Representation of Geometric Objects This chapter discusses representations of geometric objects in two and three dimensions. First, representations of points and lines in two dimensions are described, and their error behavior is characterized in terms of their covariance matrices. Similar analysis is done for points, lines, and planes in three dimensions. Then, incidence relations are described in terms of the representations of the objects involved, and the error behavior of intersections and joins is analyzed. Geometric properties of conics and quadrics are also discussed. Finally, the coordinate transformation of three dimensional objects and their perspective projection onto the image plane are summarized. 4.1
4.1.1
Image
Points
and
Image
Lines
Representation of image points
We call points and lines defined in two-dimensional images image points and image lines, respectively, to distinguish them from points and lines in threedimensional scenes, which we call space points and space lines, respectively. An image point is represented by its image coordinates (x, y) with respect to a fixed Cartesian coordinate system. An alternative representation is the use of the 3-vector x -
(x) y
.
(4.1)
If the image plane is viewed as a two-dimensional projective space 1, the three components of the 3-vector x can be interpreted as the homogeneous coordinates of the image point it represents. By definition, a 3-vector x represents an image point if and only if (k, x) = 1,
(4.2)
where k - (0, 0, 1)T. The distance of the image point x from the image origin o is d - V/x 2 + y2 _ V/llxll2 _ 1. (4.3) Consider an X Y Z Cartesian coordinate system with origin O in the scene. We call the plane placed parallel to the X and Y axes in distance f from the 1A two-dimensional Euclidean space can be identified with a two-dimensional projective space if ideal image points ("image points at infinity") are added. Ideal image points are represented by homogeneous coordinates whose third components are 0. 95
96
Chapter 4.
x
Representation of Geometric Objects
.P(X,Y,Z)
~f-~
(a)
~
P
(b)
Fig. 4.1. (a) Perspective projection. (b) Idealized camera model.
X Y plane the image plane. Define an xy coordinate system on the image plane in such a way that the x- and y-axes are parallel to the X- and Y-axes, respectively, and the origin o is on the Z-axis. If viewed from the origin O, a space point P with coordinates (X, Y, Z) is seen at p on the image plane Z = f with image coordinates (x, y) given by X
Y
(4.4)
We call the origin O and the constant f the viewpoint and the focal length, respectively. Eqs. (4.4) define a mapping, called perspective projection, from the three-dimensional X Y Z scene to the two-dimensional xy image plane (Fig. 4.1a). This is an idealized model of camera imaging geometry (Fig. 4.15)" the viewpoint O corresponds to the center of the lens; the Z-axis corresponds to the optical axis of the lens; the focal length f corresponds to the distance from the center of the lens to the surface of the film 2 (or the photo-cells for video cameras). In the following, we use f as the unit of length, so the image plane can be written as Z - 1. If the above geometry of perspective projection is assumed, the vector representation (4.1) can be thought of as identifying an image point p with its position x in three dimensions (Fig. 4.2a). Let us call the space line that starts from the viewpoint O and passes through image point p the line of sight of p. The vector x that represents image point p indicates the orientation of the line of sight of p. Suppose observation of an image point is susceptible to image noise. Let (x, y) be the observed position. If image noise (Ax, Ay) is randomly added, this position is perturbed into (x + Ax, y + Ay). The image noise is assumed to be very small3: IAxl KK 1, IAyl K< 1. In the vector representation, the observed value x is randomly perturbed into x + A x by image noise Ax. If Ax 2The focal length f thus defined depends on the position of the object on which the camera is focused. Hence, it is generally different from the optical focal length of the lens; the two values coincide only when the object in focus is infinitely far away. 3Since we take f as the unit of length, this means that image noise is very small as
4.1.
Image Points and Image Lines
X
97
x
X
x
n
O Y
(b)
(~)
Fig. 4.2. (a) Representation of an image point. (b) Representation of an image line. is regarded as a random variable of mean 0, the uncertainty of the value x is characterized by the covariance matrix V[x] = E [ A x A x T ] . Since the image noise A x is orthogonal to k, it is constrained to be in the two-dimensional subspace {k}L~, the set of all vectors orthogonal to k. Consequently, the covariance matrix V[x] is singular; its null space is the one-dimensional subspace { k } L generated by k. The fact that rankV[x] - 2 simply states that an image point is represented by a 3-vector but has only two degrees of freedom. In other words, the rank of the covariance matrix indicates the degrees of freedom of the representation. E x a m p l e 4.1 If Ax and Ay are independent Gaussian r a n d o m variables of mean 0 and variance e2, the covariance matrix of x is given as follows (see eqs. (3.27)): V[x] -
Here, P k = I eq. (2.46)).
4.1.2
0 0
e2 0
0 0
- e2Pk .
k k T is the projection matrix onto the X Y
(4.5)
plane (see
Representation of image lines
An image line l is represented by its equation
Ax + By + C = 0.
(4.6)
Since the coefficients A, B, and C are determined only up to scale, we impose the normalization A 2 + B 2 + C 2 = 1. Then, an image line I is represented by compared with the focal length f. The focal length f is usually comparable to or much larger than the physical size of the image. So, this assumption is satisfied if the image noise is much smaller than the size of the image. For an image of 512 • 512 pixels, for example, this assumption is well satisfied if the image noise is less than, say, five pixels.
Chapter 4.
98
Representation of Geometric Objects
a unit 3-vector .
-
B
.
(4.7)
C This representation is not unique: n and - n represent the same image line. If the image plane is viewed as a two-dimensional projective space, the three components of the 3-vector n can be interpreted as the homogeneous coordinates of the image line it represents. Since A and B in eq. (4.6) cannot be zero at the same time 4, an arbitrary 3-vector n represents an image line if and only if I I n ] l - 1,
n ~: + k .
(4.8)
The distance of the image line represented by n from the image origin o is d -
ICl v/A2 + B 2
=
I(k,-)l . X/1 - (k, n) 2
(4.9)
Recall the geometry of perspective projection described in Fig. 4.1a. The lines of sight of all the image points on an image line l define a space plane. The vector representation (4.7) can be thought of as identifying the image line l with the unit surface normal n to that space plane. In fact, the space plane defined by an image line A x + B y + C -- 0 is A X + B Y + C Z = O, which has surface normal n = (A, B, C)7- (Fig. 4.2b). As in the case of an image point, observation of an image line also has uncertainty. Suppose an image line represented by n is randomly perturbed into a position represented by n - n + A n . If the error A n is regarded as a r a n d o m variable of mean 0, the uncertainty of the value n is characterized by the covariance matrix y[n] = E[AnAnT]. Since n is normalized into a unit vector, the error A n is orthogonal to n to a first approximation. Hence, the covariance matrix Y[n] is singular; its null space is {n}L. Thus, although an image line is represented by a 3-vector n, it has only two degrees of freedom: rankV[n] = 2. This description of error behavior implies that we are viewing n as a r a n d o m variable which has a local distribution (see Section 3.2.2). Since n is a unit vector, it has a distribution over S 2, a unit sphere centered at the origin. We are assuming that the distribution is sufficiently localized around n and hence its domain can be identified with the tangent plane T n ( S 2) at n (see Fig. 3.3). 4The set of all ideal image points in a two-dimensional projective space is called the ideal image line ("image line at infinity"). In projective geometry, the ideal image line can be treated just like an ordinary image line and represented by homogeneous coordinates whose first and second components are both 0.
4.1.
Image Points and Image Lines
99
X,Y
n
nl
l, Z
(~)
(b)
Fig. 4.3. (a) Incidence of image point p and image line I. (b) Parallel image lines 11 and 12.
4.1.3
Incidence, intersections, and joins
A. I n c i d e n c e
Image point p and image line 1 are incident to each other if p is on l, or 1 passes through p (Fig. 4.3a). Let (x, y) be the image coordinates of p. If image line 1 is represented by A, B, and C, image point p and image line l are incident to each other if and only if A x + B y + C - O. In other words, an image point p represented by x and an image line 1 represented by n are incident to each other if and only if (n, x) = 0, (4.10) which is simply the equation of image line I if x is regarded as a variable. The distance D(p, l) between an image point p represented by x and an image line 1 represented by n is D(p,l) -
[(n' x)l . V/1 - (k, n) 2
(4.11)
B. I n t e r s e c t i o n s
Two image lines (nl, x) = O and (n2, x) = O are parallel to each other if and only if Ini,n~, k l - 0 .
(4.12)
(Fig. 4.3b.) If image lines (hi,X) = 0 and (n2,x) = 0 are not parallel s, they intersect at a single image point. From Fig. 4.4a, we see that the vector x that represents the intersection must be orthogonal to both n l and n2. Hence, x c< n l x n2. Since (k, x) = 1, the intersection x is obtained in the following form (see eq. (2.32)): nl xll2
-
kl
(4.13)
5parallel distinct image lines can be thought of as intersecting at an ideal image point
("image point at infinity").
100
Chapter 4.
Representation of Geometric Objects
n
P2 Pl
x2
(b)
Fig. 4.4. (a) The intersection p of image lines 11 and 12. (b) The join l of image points pl and p2. Let V[Tll] and V[n2] be the covariance matrices of ~1,1 and n2, respectively. If n l and n2 are perturbed into n l + An1 and n2 -b An2, respectively, the intersection x computed by eq. (4.13) is perturbed into x + Ax accordingly. The error is to a first approximation mx
(IAn , n~, kl § Inl,n2,kl
-- Z~nl X IF~2 -I-7"/,1X Art 2 --
In1, An2, kl)~.
(4.14)
If n l and n2 are statistically independent 6, the covariance matrix V[x] E [ A x A x -r-] of the intersection x is given by
V[m]
=
1 ( n2 Inl,n2,kl
x V[~r~l] x •2
-~- 2S[7~,2 x V [ n l ] ( l o , 2 x k ) x
+ ( n 2 • k, V[nl](n2 • k ) ) x x T + n l + 2 S [ n l x V[nu](nl X k ) x T] + (nl
X V[~o,2] x X
Tl
nl
k, V[n2](nl
x
k))xxT), (4.15)
where S[. ] denotes the symmetrization operator (see eqs. (2.205)). In deriving the above expression, the identity (Ani x n j ) ( A n k x nl) T _ nj x ( A n i A n - [ ) x nt has been used (see eq. (2.43)). C. Joins
An image line (n, x) = 0 that passes through two distinct image points X l and x2 is called the join of X 1 and X 2. Since (n, x l ) - 0 and (n, x2) - 0, the vector n must be orthogonal to both Xl and x2 (Fig. 4.4b). Also, n is normalized into a unit vector. Hence, =
•
(4.16)
6We use an informal expression like this, instead of saying that the noise that arises in the observation of nl and the noise that arises in the observation of n2 are independent random variables.
4.1.
Image Points and Image Lines
101
which defines an image line as long as X l r X2. Let V[Xl] and V[x2] be the covariance matrices of image points Xl and x2, respectively. If Xl and x2 are perturbed into Xl + / k X l and x2 + Ax2, respectively, the vector n computed by (4.16) is perturbed into n + A n accordingly. The error is to a first approximation an
=
4-
Pn(Axl
x
x2 + Xl
X
AX2)
.
(4.17)
Here, P n = I - n n T is the projection matrix along n. If X l and x2 are statistically independent, the covariance matrix V[n] - E [ A n A n T] of n is given by
Pn(xl x
V[n]-
V[x2] x x, + x2 x V[Xl] x I1~1 x ~,~11~
x2)Pn
(4.18)
E x a m p l e 4.2 As shown in Example 4.1, if each coordinate is perturbed by Gaussian noise independently of mean 0 and variance e 2, the covariance matrices of Xl and x2 are V[Xl] - V[x2] - e 2 P k . Note the following identity (see eq. (2.42))" xi
x
Pk
x
xi
-
xi
(I-
x
kk T)
X Xi
= II~,,ll~x - ~,~T-
(~, x k)(~, • k) ~.
(4.19)
Let u be the unit vector that indicates the orientation of the join (n, x) = 0, and w the separation between the two image points. Their midpoint is x c = (Xl + x2)/2, so we can write x l - X c - w u / 2 and x2 - x c + w u / 2 . If the two image points are close to the image origin o, we have ll~,ll ~ 1, ~ , • k o, i - 1, 2, and IIXI X X211 ,-~ W. Using eq. (4.19), we can approximate eq. (4.18) in the form Y[n] '~ - ~
Pn - xcx~-
---~uu T
,
(4.20)
where the identities P n u - u and P n x c - X c are used. If X c is approximately perpendicular to u, we obtain the approximation P n ~ x c x ~ + u u T. Since the distance w is usually very small as compared with the focal length (which we take as the unit of length), we obtain to the following approximation: 2e 2
v[n] ~ ~ - y ~ .
(4.21)
This means that the vector n is most likely to deviate in the direction u (Fig. 4.5a); the p r i m a r y deviation pair 7 is given as follows (see eq. (3.48))"
~+ ~ N[. +
V~ W
~],
~- ~ N [ ~ - vJA~]. W
(4.22)
7The geometric meaning of the primary deviation pair will be discussed in detail in Section 4.5.3.
102
Chapter 4.
n- n
Representation of Geometric Objects
+ i
_
l
n§ ~
n
-
l+
(b)
(~)
Fig. 4.5. (a) P r i m a r y d e v i a t i o n pair. (b) D e v i a t i o n of t h e join. The two image lines 1+ and l - defined by these two vectors intersect at the midpoint of the two points, indicating that the join is very likely to pass near the midpoint in the presence of noise (Fig. 4.5b). It is also seen that the error is approximately proportional to the reciprocal of the separation w between the two image points.
4.2 ~.2.1
Space Points and Space Lines Representation of space points
A space point (X, Y, Z) is represented by a 3-vector r - (X, Y, Z)T. Alternatively, it is represented by a 4-vector
P=
X Y z
= ~el
(4.23)
1 The distance from the coordinate origin O is d = Jx~
+ y 2 + z~ = II,'lI = JIIpll ~
- 1.
(4.24)
A 4-vector p represents a space point if and only if (~, p) = 1,
(4.25)
where ~r - (0, 0, 0, 1)x (__ 0 | 1). If the three-dimensional space is viewed as a three-dimensional projective space, the four components of the 4-vector p can be interpreted as the homogeneous coordinates of the space point 8. The 8As in the two-dimensionM case, a three-dimensional Euclidean space can be identified with a three-dimensional projective space if ideal space points ("space points at infinity") are added. Ideal space points are represented by homogeneous coordinates whose fourth components are zero.
4.2.
Space Points and Space Lines
103
p-representation is very convenient for various computations, but we must be careful about the rounding effect in fixed precision computation: if X, Y, and Z are very large, the fourth component of p may be treated as effectively 0. Hence, an appropriate scaling must be applied to keep the space coordinates in a reasonable range of magnitude. Observation of a space point is susceptible to noise. Suppose the observed value r is randomly perturbed into v + A t . If A t is regarded as a random variable of mean 0, the uncertainty of r is characterized by the covariance matrix Y[r] = E [ A r A r T ] , which we assume is positive definite: rankV[r] = 3. In the 4-vector representation p = r | 1, the covariance matrix V[p] is singular and has the form
V[p]-(
V[r]
0)-V[r]|
(4.26)
Since the error Ap is constrained to be in the three-dimensional subspace {~;}~ C 7~4, the null space of V[p] is {~}L. The fact that rankV[p] = 3 states that a space point has three degrees of freedom even if it is represented by a 4-vector p.
E x a m p l e 4.3 If each component of (X, Y, Z) is perturbed independently by Gaussian noise of mean 0 and variance e2, the covariance matrix V[r] has the following form (see eq. (3.25)): Vir]
(4.27)
= e2 I .
In the 4-vector representation, the covariance matrix
V[p] = e2I | where P ~ - I -
4.2.2
~T
0 = e2P~,
V[p] has
the form (4.28)
is the four-dimensional projection matrix along ~.
Representation of space lines
Consider a space line L. Let H be the point on L closest to the coordinate origin O, and put rH -- OH. Let m be the unit vector that indicates the orientation of L (Fig. 4.6a). The space line L is represented by two 3-vectors {m, rH}; two representations {m, rH} and { - m , rH} define the same space line. A space point r is on L if and only if vector r - rH is parallel to m , or (v
-
rH) X m = 0,
(4.29)
which is simply the equation of space line L if r is regarded as a variable. Two 3-vectors {m, rH} represent a space line if and only if
Ilmll = 1,
(m, rH)
=
0.
(4.30)
104
Chapter 4.
Representation of Geometric Objects
O O
(~)
(b)
Fig. 4.6. (a) The {m, rH}-representation. (b) The {p, n}-representation. Eq. (4.29) can also be written as v x m = rH • m . Hence, the equation of a space line can alternatively be written in the form r
x p = n.
(4.31)
Since both sides can be multiplied by an arbitrary nonzero constant, we normalize the 3-vectors p and n into lip 9 nil = 1 (i.e., ]Jp]l2 + ]]n]l 2 = 1). The signs of p and n are chosen in such a way that {p, n, rH } constitute a righthanded orthogonal system (Fig. 4.6b). With this convention, a space line L is also represented by two 3-vectors {p, n}; two representations {p, n} and { - p , - n } define the same space line. Geometrically, the vector p indicates the orientation of L, and the vector n indicates the surface normal to the space plane defined by L and the coordinate origin O. The distance of L from O is
d = Ilnll = v/1- IlPll ~ _ IlPll Ilvll - r
Ilnll Ilnll ~"
(4.32)
It follows that 1
Ilvll
d
i
v/1 + d 2'
ll~ll
= -x/1 - - - +- ~ d"
(4.33)
The 3-vectors p and n can be interpreted as the homogeneous coordinates of a space line, known as the Pliicker (or Grassmann) coordinates. Two 3-vectors {p, n} represent a space line if and only if
Ilpll = + II~L]~ - 1,
(p, n) = O.
(4.34)
The { m , rH}-representation and the {V, n}-representation are related as follows: m-
N[p]
rH = p • n
'
(4.35)
IlPll ~ '
(p) N,(o rHxmm)1
430,
4.2.
Space Points and Space Lines
105
Suppose the values m and ~'H are randomly perturbed into m + A m and rH + A rH, respectively. If the errors A m and A rH are regarded as random variables of mean 0, the uncertainty of the values {m, rH } is characterized by the six-dimensional covariance matrix V i m | rH] = E [ ( A m @ A r H ) ( A m @ ArH)T], which has the following submatrices (see eqs. (3.11) and (3.12)):
vim 9 ~.] -
(
V[m] v[~.,m]
V[m, r H ] ) v[~]
(4.37)
The constraint (4.30) implies that the errors A m and ArH are constrained by (m, A m ) = 0, ( A m , rH) + (m, ArH) = 0, (4.38) Hence, the null space of V[m | rH] is the two-
to a first approximation. dimensional subspace
AfmerH = { m | 0, rH @ m}L C 7~6.
(4.39)
It follows that rankV[m | rH] -- 4; a space line has four degrees of freedom. The six-dimensional projection matrix onto H mi ~ r H is given by PX~r
-1+11r.112 m.~
I
mm T
'
(4.40)
where P m I - m m T is the projection matrix along m. In the {p, n}-representation, the error behavior is characterized by the six-dimensional covariance matrix
Vtp|
= ( V[n,p] V[pl V[p'n] )V[n] "
(4.41)
The constraint (4.34) implies that the errors Ap and A n are constrained by
(p, z~p) + (~, ~ ) to a first approximation. dimensional subspace
= 1,
(~p, ~) + (p, z ~ ) = 0,
(4.42)
Hence, the null space of V[p @ n] is the two-
H p e n - {v 9 n, n r v} ~ c n ~.
(4.43)
Again, rank V[p | n] = 4, confirming that a space line indeed has four degrees of freedom. The six-dimensional projection matrix onto Afl~en is given by
pxpe
_ ( I - pp T - n n T --2S[pn T]
-2S[p n T] ) I -- ppT _ n n T 9
(4.44)
As in the case of image lines, we are considering a local distribution (see Section 3.2.2). Since a space line is represented by two 3-vectors {m, rH}
106
Chapter 4.
Representation of Geometric Objects
with two constraints given by eqs. (4.30), the set of all space lines defines a
four-dimensional manifold S in ~6. We are assuming that the distribution is sufficiently localized around the value m | r H in S and hence the domain of the distribution can be identified with the tangent space T m e r , ( S ) to S at m | rH. It follows that the null space A/'merH and its orthogonal complement HXmerH coincide, respectively, with the normal space N m e r , ( S ) and the tangent space T m e r , ( S ) (see Section 3.2.1). The same can be said for the {p, n }-representation. The covariance matrix Vim | rH] defined by eq. (4.37) is expressed in terms of the covariance matrix V[p | n] defined by eq. (4.41) as follows (see eqs. (3.18)): 1
V [ m ] - [ipll2PmV[p]Pm,
V[m, rH] --
1 ilPl13 (ProW[p] x n + 2PmV[plprTH - PraY[p, n] x P)
= V[rH,m] T,
WitH]-
,(
IIPlI4 n x V[p] x n -
4S[(n
x
V[p]p)ri~ll
+4(p, V[plp)rHrI~- 2S[n x V[p, n] x P] - 4 S [ ( p x V[p, n]Tp)rnTl + p x V[n I x p ) .
(4.45)
Here, P m - I - m m T is the projection matrix along m. Conversely, the covariance matrix V[p | n] is expressed in terms of the covariance matrix V[m 9 rH] in the form 1
V[p ~
nl - 1 + II~ll2 Pp~n
(
V[m] V[~ x m, m]
v[~, ~ • ~] ) V[r x m]
Ppen,
(4.46)
where v[m, ~ • m]-
V[m x r ] - r
v [ m ] • ~ - v [ m , ~] • m - v [ ~ x m , m] ~,
x V[m] x r - 2 S [ r x V[m, rl x m] + m x V[r] x m. (4.47)
Here, Ppen is the six-dimensional projection matrix along p @ n, which has the form
Ppen-(
I-ppT -np T
--pnT ) T I - nn
(4.48)
4.2.
Space Points and Space Lines
107
L
m m' r I
t
L
O
(~)
(b)
Fig. 4.7. (a) Two intersecting space lines. (b) The join of two space points.
~.2.3
Incidence, intersections, and joins
A. I n c i d e n c e
A space point P and a space line L are incident to each other if P is on L, or L passes through P. If space point P is represented by r and space line L is represented by {m, rH} (or {p, n}), they are incident to each other if and only if (r - rH) X m -- 0 (or r x p -- n). The distance D(P, L) between a space point P represented by r and a space line L represented by ( v - r H ) X m =0 (orr• is
D(P, L) - lit x p -
nil =
IIPmr-
~HII-
(4.49)
Ilpll
B. I n t e r s e c t i o n s
The distance D ( L , L ' ) between a space line L represented by {m, rH} and a ! space line L' represented by { m ' , r H} is given by
lira x m'll
D(L,L')/.
II ,,
t
-
.11.
ifm x m' r (4.50) ifmx
m ' = 0.
It follows that the space lines L and L' intersect if and only if the three vectors m , m ' , and r~ - rH are coplanar and 9 m r + m ' (Fig. 4.7a), i.e., !
Im, m ' , rH -- rHI - 0,
m x m'#
0.
(4.51)
9Parallel distinct space lines (m = =Em' or p x pl = 0) can be thought of as intersecting at an ideal space point ("space point at infinity").
108
Chapter 4.
Representation of Geometric Objects
In the {p, n}-representation, eqs. (4.50) are replaced by
(V,n')+(V',n)
V'
lip • p'll
D(L,L')
'
if p x
# o,
-
nt
pt
(p, n ' ) + (p', n ) = 0,
p x p' # 0.
Illl tl- tl'lLII, n
(4.52)
-o
Eqs. (4.51) are replaced by (4.53)
The intersection r of the space lines L and L' is given by r =
(m • ~.) • (m'• Im, m', ~-I
~h) _ n • n' - (p', n)"
(4.54)
This expression is convenient for theoretical analysis but not suitable for actual computation, since both the numerator and the denominator vanish when Ira, m ' , rHI = 0 or (p', n) = 0. A more convenient expression is
(m. rl,) + (m. m')(m', r.) r - r. +
lira • m'll ~
m
1 /' Ilpll:lp, p', n ' l - (p, p')lp, p', nl = Ilpll ~ [,
IIv • p'll:
v + p • n)
(4.55)
C. J o i n s
The join of two distinct space points r l and r2 is a space line that passes through both rl and r2 (Fig. 4.7b). In the {m, rH}-representation, the join is given by
'm -- N['rl - 'r2],
'rH -
('rrt, 'r 1)'r' 2 -- ('D't, 'r'2)'r 1 IIr~ - ,'~11
(4.56)
In the {p, n}-representation,
(P)-'N[(~t
~2'P1-'P2X,rl)].
(4.57)
If the space points Vl and v2 are statistically independent and have respective covariance matrices V[rl] and V[r2], the covariance matrix of their join is computed from eq. (4.57) in the following form (see eqs. (3.18)): v [ p 9 ~] -
1 I1~1 - ~ II= + I1~ • 'rl II2 P p G n
(
V[?'I]-~-V[/'2]
V[T1] x/'2-~-V[/'2] x/'1
,.~ • v[,-~] + ,-~ • v[,,~]
,-~ • v[,~,] • ,-~ + ,-, • v[,-~] • ,-,
Ppen.
) (4.58)
Here, Pp~n is the six-dimensional projection matrix defined by eq. (4.48).
4.3.
Space Planes
109
d
O Fig. 4.8. Representation of a space plane.
4.3
~.3.1
Space Planes
Representation of space planes
A space plane H is represented by its unit surface normal n and its (signed) distance d from the coordinate origin O (Fig. 4.8): the distance d is positive in the direction n and negative in the direction - n . Hence, {n, d} and { - n , - d } represent the same space plane. The equation of a space plane that has unit surface normal n = (A, B, C) T and distance d is A X + B Y + C Z - d,
(4.59)
(n, r) = d.
(4.60)
which can be written as If the 4-vector representation p = r | 1 is used for space point r, eq. (4.60) is expressed in the form ( v , p ) = 0, (4.61) where v ~ n | ( - 1 ) . The scale indeterminacy can be removed by imposing the normalization I l v l l - 1:
v--
v/1 + a~
-d
"
Conversely, the unit 4-vector v - (Vl, v2, u3,//4) T Can be decomposed into the unit surface normal n and the (signed) distance from the coordinate origin in the form -
1,
x[
d -
-
va
-
/23
where ~ - (0, 0, 0, 1) T. As in the case of space points, we must be careful about the rounding effect in fixed precision computation: if the distance d is very large, the first three components of v may be treated as effectively 0.
110
Chapter 4.
Representation of Geometric Objects
Hence, an appropriate scaling must be applied to keep the distance d in a reasonable range of magnitude. As we did for image lines and space lines, we assume local distributions for all variables; the error behavior of a space plane is characterized by the four-dimensional covariance matrix
V[n | d] - ( V[n] V[n,d] ) V[n,d] V[d]
'
(4.64)
where V[n, d] (- V[d, n] T) is a a-vector and V[d] is a scalar. Since n is a unit vector, the null space of V[n | d] is {n | 0}L. If the v-representation, the covariance matrix V[~,] has the form
1
V[v] - --------gPv 1+ d
(
V[n] -V[n,d]) -V[n, d] V[d] P~'
(4.65)
where Pu - I - u u -r is the four-dimensional projection matrix along u. Since u is a unit vector, the null space of V[u] is {U}L. Whichever representation is used, a space plane has three degrees of freedom: rankV[n | dl - rankV[u]
--3.
If let v -+ v + A v in eqs. (4.63), the unit surface normal n and the distance d are respectively perturbed to a first approximation by
A n - VII + d 2P v
Ab'l)
Au2 A~,3
,
Ad - - v / ( i + d2)aAv4,
(4.66)
where we have used eqs. (3.17). It follows that for a given covariance matrix V[~,] the the corresponding covariance matrix Y[n | d] can be computed as follows"
V[n]-
V[n,d]4.3.2
(1 +
d2)Pv
- ( 1 + d2)2pv
V[V]ll V[v]31 V[v]14 ) V[v]24 , V[V]34
)
V[u]21
V[d]-
(1
+ d2)3V[v]44.
(4.67)
Incidence, intersections, and joins
A. I n c i d e n c e
1. A space point P and a space plane H are incident to each other if P is on H, or H passes through P. A space point r (or p) and a space plane represented by {n, d} (or u) are incident to each other if and only if
4.3.
Space P l a n e s
111 n
L
0 Fig. 4.9. Incidence of space plane II and space line L. (n, r ) = d (or (v, p) = 0). T h e distance D ( P , H) b e t w e e n a space point P r e p r e s e n t e d by r (or p) a n d a space plane r e p r e s e n t e d by {n, d} (or
. ) is D ( P , II) - ](n, r) - d] -
](v, p)]
v/i
.
(4.68)
-
2. A space line L and a space plane H are incident to each other if L is on H, or H passes t h r o u g h L (Fig. 4.9). A space line (r - rH) • m = 0 and a space plane (n, r) = d (or (v, p) = 0) are incident to each other if a n d only if (n, m ) = 0,
(n, rH) = d,
(4.69)
(v, rH | 1) = 0.
(4.70)
or equivalently 1~ (v, m | 0) = 0,
B.
Intersections
1. T w o distinct space planes ( n l , r ) = d l and (n2, r ) = d2 intersect along a space line as long as 11 n 1 ~ -~-n 2 (Fig. 4.10a). In the { m , rH}r e p r e s e n t a t i o n , the intersection is m -- N [ n l • n2],
T'H - -
+
]]n I X n 2]]2
"
(4.71)
l~ first of eqs. (4.70) can be given the interpretation that the ideal space point ("space point at infinity") m | 0 of the space line (r - rH) • rn -- 0 is on the space plane (L,,p) = 0. 11 Parallel distinct space planes can be thought of as intersecting along an ideal space line ("space line at infinity").
112
Chapter 4.
Representation of Geometric Objects
Fig. 4.10. (a) The intersection L of space planes Π1 and Π2. (b) The intersection P of space line L and space plane Π.

In the {p, n}-representation,

p ⊕ n = ±N[ (n1 × n2) ⊕ (d2 n1 − d1 n2) ].   (4.72)

Suppose the two space planes (n1, r) = d1 and (n2, r) = d2 are statistically independent. Let V[n1 ⊕ d1] and V[n2 ⊕ d2] be their respective covariance matrices. The covariance matrix of the intersection is computed from eq. (4.72) in the form

V[p ⊕ n] = P_{p⊕n} ( Vpp   Vpn ; Vnp   Vnn ) P_{p⊕n},   (4.73)

where

Vpp = n2 × V[n1] × n2 + n1 × V[n2] × n1,

Vpn = −d2 n2 × V[n1] − d1 n1 × V[n2] + n2 × V[n1, d1]n2ᵀ + n1 × V[n2, d2]n1ᵀ = Vnpᵀ,

Vnn = d2² V[n1] + d1² V[n2] − 2d2 S[V[n1, d1]n2ᵀ] − 2d1 S[V[n2, d2]n1ᵀ] + V[d1]n2n2ᵀ + V[d2]n1n1ᵀ.   (4.74)

Here, P_{p⊕n} is the six-dimensional projection matrix defined by eq. (4.48).
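A direct transcription of eqs. (4.71) and (4.72) is given below as an illustrative NumPy sketch (not code from the book; names are ours). It returns both the {m, rH} representation and the normalized p ⊕ n representation of the intersection line:

```python
import numpy as np

def N(x):
    """Normalization operator N[.] (zero vectors are left unchanged)."""
    s = np.linalg.norm(x)
    return x if s == 0 else x / s

def plane_intersection(n1, d1, n2, d2):
    """Intersection line of planes (n1,r)=d1 and (n2,r)=d2 (n1, n2 unit vectors)."""
    c = np.cross(n1, n2)
    if np.linalg.norm(c) < 1e-12:
        raise ValueError("planes are parallel (ideal intersection line)")
    m = N(c)                                                  # eq. (4.71), direction
    g = n1 @ n2
    rH = ((d1 - g * d2) * n1 + (d2 - g * d1) * n2) / (c @ c)  # eq. (4.71), support point
    pn = N(np.concatenate([c, d2 * n1 - d1 * n2]))            # eq. (4.72), 6-vector p (+) n
    return m, rH, pn

# usage: the planes Z = 1 and Y = 2 intersect along the line {Y = 2, Z = 1}
m, rH, pn = plane_intersection(np.array([0., 0., 1.]), 1.0,
                               np.array([0., 1., 0.]), 2.0)
print(m)    # parallel to the X-axis (up to sign)
print(rH)   # [0, 2, 1]: foot of the perpendicular from the origin
```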
2. A space plane (nΠ, r) = d and a space line (r − rH) × m = 0 (or r × p = nL) intersect at a space point unless¹² (nΠ, m) = 0 (or (nΠ, p) = 0) (Fig. 4.10b). Their intersection r is given by

r = rH + ( (d − (nΠ, rH)) / (nΠ, m) ) m = (p × nL)/||p||² + ( (|nΠ, nL, p| + d||p||²) / (||p||²(nΠ, p)) ) p.   (4.75)

¹²A space plane and a space line that are parallel and not incident to each other can be thought of as intersecting at an ideal space point ("space point at infinity").

Fig. 4.11. (a) The join Π of three space points P0, P1, and P2. (b) The join Π of space line L and space point P. (c) The join Π of space lines L and L'.

C. Joins
1. Three distinct space points r0, r1, and r2 define a space plane, called their join (Fig. 4.11a), represented by

n = ±N[(r1 − r0) × (r2 − r0)],   d = (n, r0).   (4.76)

2. A space line (r − rH) × m = 0 and a space point rP define a space plane, called their join (Fig. 4.11b), if they are not incident to each other. The join is represented by

n = ±N[m × (rP − rH)],   d = (n, rP).   (4.77)

3. Two intersecting space lines (r − rH) × m = 0 and (r − r'H) × m' = 0 define a space plane, called their join (Fig. 4.11c), represented by

n = N[m × m'],   d = (n, rH).   (4.78)

4.4 Conics

4.4.1 Classification of conics
A conic (sometimes referred to as a conic locus or conic section) is a curve on a two-dimensional plane whose equation has the form
Ax² + 2Bxy + Cy² + 2(Dx + Ey) + F = 0.
(4.79)
If an image point (x, y) is represented by the 3-vector x = (x, y, 1)ᵀ, eq. (4.79) can be written in the form

(x, Qx) = 0,   (4.80)

where Q is a (33)-matrix defined by

Q = ( A  B  D
      B  C  E
      D  E  F ).   (4.81)

A conic does not necessarily define a curve. A conic is singular if it defines two (real or imaginary) lines¹³ (including one degenerate line); otherwise, the conic is nonsingular. It is easy to prove the following:

• Conic (x, Qx) = 0 is nonsingular if and only if the matrix Q is nonsingular.

• A nonsingular conic (x, Qx) = 0 defines a real conic (an ellipse, a parabola, or a hyperbola) if and only if the signature of Q is (2,1) or (1,2) (see Section 2.3.4).

If the sign of Q is chosen so that det Q ≤ 0, the type of the conic (x, Qx) = 0 is classified as follows:

1. If det Q = 0, the conic defines two (real or imaginary) lines (including one degenerate line).

2. If det Q ≠ 0, then

(a) if AC − B² > 0, then

i. if A + C > 0, the conic is an ellipse,
ii. if A + C < 0, the conic is an empty set (an imaginary ellipse),

(b) if AC − B² = 0, the conic is a parabola,

(c) if AC − B² < 0, the conic is a hyperbola.
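The classification above is easy to mechanize. The following is an illustrative Python sketch (not from the book; names are ours) that builds Q from the coefficients of eq. (4.79) and reports the type of the conic:

```python
import numpy as np

def conic_matrix(A, B, C, D, E, F):
    """The (33)-matrix Q of eq. (4.81)."""
    return np.array([[A, B, D],
                     [B, C, E],
                     [D, E, F]], dtype=float)

def classify_conic(A, B, C, D, E, F):
    """Type of the conic A x^2 + 2B xy + C y^2 + 2(Dx + Ey) + F = 0."""
    Q = conic_matrix(A, B, C, D, E, F)
    if np.linalg.det(Q) > 0:               # choose the sign so that det Q <= 0
        A, B, C, D, E, F = -A, -B, -C, -D, -E, -F
        Q = -Q
    if abs(np.linalg.det(Q)) < 1e-12:
        return "singular (two real or imaginary lines)"
    disc = A * C - B * B
    if disc > 0:
        return "ellipse" if A + C > 0 else "imaginary ellipse (empty set)"
    return "parabola" if disc == 0 else "hyperbola"

print(classify_conic(1, 0, 2, 0, 0, -1))   # x^2 + 2y^2 = 1  -> ellipse
print(classify_conic(1, 0, -1, 0, 0, -1))  # x^2 - y^2 = 1   -> hyperbola
print(classify_conic(0, 0, 1, -1, 0, 0))   # y^2 = 2x        -> parabola
```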
Given a nonsingular conic (x, Qx) = 0 and an image point xp, the image line (np, x) = 0 for

np = ±N[Qxp]   (4.82)

is called the polar of the image point xp with respect to the conic (x, Qx) = 0. Conversely, given a nonsingular conic (x, Qx) = 0 and an image line (np, x) = 0, the image point

xp = Q⁻¹np / (k, Q⁻¹np)   (4.83)

is called the pole of the image line (np, x) = 0 with respect to the conic (x, Qx) = 0. We observe the following:

¹³If two imaginary lines intersect, their intersection is an isolated real point.
Fig. 4.12. Image point p is the pole of image line l, and image line l is the polar of image point p. The tangent points p1 and p2 are the poles of the tangents l1 and l2, respectively, and the tangents l1 and l2 are the polars of the tangent points p1 and p2, respectively.

• A point xp is on conic (x, Qx) = 0 if and only if its polar (np, x) = 0 is tangent to the conic (x, Qx) = 0 at xp.

• Let xp be the pole of an image line (np, x) = 0 with respect to conic (x, Qx) = 0. If the image line (np, x) = 0 has two intersections x1 and x2 with the conic (x, Qx) = 0, the join of xp and xi is tangent to the conic (x, Qx) = 0 at xi, i = 1, 2 (Fig. 4.12).
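These pole–polar relations translate directly into code. The sketch below (illustrative NumPy, not the book's code) computes the polar of a point and the pole of a line with respect to a nonsingular conic, following eqs. (4.82) and (4.83):

```python
import numpy as np

def polar_of_point(Q, xp):
    """Polar line n_p of image point xp = (x, y, 1)^T w.r.t. conic (x, Qx) = 0 (eq. (4.82))."""
    n = Q @ xp
    return n / np.linalg.norm(n)

def pole_of_line(Q, n_line):
    """Pole of image line (n_line, x) = 0 w.r.t. conic (x, Qx) = 0 (eq. (4.83))."""
    x = np.linalg.solve(Q, n_line)
    return x / x[2]                       # scale so that the third component is 1

# unit circle x^2 + y^2 - 1 = 0
Q = np.diag([1.0, 1.0, -1.0])
p = np.array([1.0, 0.0, 1.0])             # a point on the circle
l = polar_of_point(Q, p)
print(l)                                   # ~ [1, 0, -1]/sqrt(2): the tangent line x = 1
print(pole_of_line(Q, l))                  # recovers the point (1, 0, 1)
```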
4.4.2 Canonical forms of conics
A real conic can be reduced to its canonical form by an appropriate translation and rotation of the xy coordinate system. If the coordinate system is rotated by angle θ and translated by (a, b), the transformation of coordinates can be written in the following form¹⁴:

x' = Ax,   A = (  cos θ   sin θ   −a cos θ − b sin θ
                 −sin θ   cos θ    a sin θ − b cos θ
                  0        0        1 ).   (4.84)

Hence,

x = A⁻¹x',   A⁻¹ = ( cos θ   −sin θ   a
                      sin θ    cos θ   b
                      0         0      1 ).   (4.85)

By this coordinate transformation, conic (x, Qx) = 0 is transformed to conic (x', Q'x') = 0 for

Q' = (A⁻¹)ᵀQA⁻¹.   (4.86)

¹⁴The set of matrices that have the form shown in eqs. (4.84) is closed under matrix multiplication and is called the group of two-dimensional Euclidean motions.
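As a numerical check of eqs. (4.84)–(4.86), the following illustrative NumPy sketch (not from the book; names are ours) transforms a conic matrix under a rotation by θ and a translation by (a, b) and verifies that a point on the conic stays on the transformed conic:

```python
import numpy as np

def euclidean_motion(theta, a, b):
    """The matrix A of eq. (4.84) and its inverse A^{-1} of eq. (4.85)."""
    c, s = np.cos(theta), np.sin(theta)
    A = np.array([[ c,  s, -a * c - b * s],
                  [-s,  c,  a * s - b * c],
                  [ 0,  0,  1]])
    Ainv = np.array([[c, -s, a],
                     [s,  c, b],
                     [0,  0, 1]])
    return A, Ainv

def transform_conic(Q, theta, a, b):
    """Q' = (A^{-1})^T Q A^{-1}  (eq. (4.86))."""
    _, Ainv = euclidean_motion(theta, a, b)
    return Ainv.T @ Q @ Ainv

# unit circle; new coordinate system rotated by 30 degrees and translated by (2, -1)
Q = np.diag([1.0, 1.0, -1.0])
theta, a, b = np.pi / 6, 2.0, -1.0
Qp = transform_conic(Q, theta, a, b)

A, _ = euclidean_motion(theta, a, b)
x = np.array([1.0, 0.0, 1.0])              # a point on the original conic
xp = A @ x                                  # the same point in the new coordinates
print(np.isclose(x @ Q @ x, 0.0), np.isclose(xp @ Qp @ xp, 0.0))   # True True
```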
This is the congruence transformation of Q by the matrix A⁻¹ (see eq. (2.154)). Since the signature is preserved by a congruence transformation (Sylvester's law of inertia; see Section 2.3.4), a real conic is always transformed to a real conic¹⁵.

Consider a nonsingular conic (x, Qx) = 0. If the scale and the sign of Q are chosen so that det Q = −1, the conic has the following canonical form. Let

λ1, λ2 = ( (A + C) ± √((A + C)² − 4(AC − B²)) ) / 2.   (4.87)

1. If AC − B² ≠ 0, then λ1 and λ2 are both nonzero. Let

a = 1/√(|μλ1|),   b = 1/√(|μλ2|).   (4.88)

(a) If μλ1 > 0 and μλ2 > 0, the conic is an ellipse (Fig. 4.13a) with canonical form

x²/a² + y²/b² = 1.   (4.89)

This ellipse has radii a and b in the x and y directions, respectively.

(b) If μλ1 and μλ2 have opposite signs, the conic is a hyperbola (Fig. 4.13b) with canonical form

±x²/a² ∓ y²/b² = 1,   (4.90)

where the upper signs are for μλ1 > 0 and μλ2 < 0, and the lower signs are for μλ1 < 0 and μλ2 > 0. In the former (latter) case, this hyperbola intersects the x-axis (y-axis) at x = ±a (y = ±b) and has two asymptotes y = ±(b/a)x.

(c) If μλ1 < 0 and μλ2 < 0, the conic is an imaginary ellipse with canonical form

−x²/a² − y²/b² = 1.   (4.91)

2. If AC − B² = 0, either λ1 or λ2 is zero. The conic is a parabola (Fig. 4.13c).

(a) If λ1 ≠ 0 and λ2 = 0, the canonical form is

y = ( (A + C)√(A² + B²) / (2(BD − AE)) ) x².   (4.92)
¹⁵This property holds not only for two-dimensional Euclidean motions but also for all two-dimensional projective transformations. A two-dimensional projective transformation has the form x' ∝ Ax, where A is an arbitrary nonsingular (33)-matrix. Conics undergo the congruence transformation in the form of eq. (4.86).
Fig. 4.13. (a) Canonical form of an ellipse. (b) Canonical form of a hyperbola. (c) Canonical form of a parabola.

(b) If λ1 = 0 and λ2 ≠ 0, the canonical form is

y = ( (A + C)√(B² + C²) / (2(BE − CD)) ) x².   (4.93)
The shape of a real conic is characterized by its axes of symmetry and eccentricity as follows:

1. An ellipse has two axes of symmetry, called the principal axes, that are orthogonal to each other. In the canonical form of eq. (4.89), the x-axis is called the major (minor) axis if a > b (a < b). Its area is

S = πab.   (4.94)

Its eccentricity e is defined to be

e = √(a² − b²)/a,  a ≥ b;   e = √(b² − a²)/b,  a ≤ b.   (4.95)

The eccentricity of a circle is 0.

2. For a hyperbola in the canonical form x²/a² − y²/b² = 1, the x- and y-axes are called the transverse axis and the conjugate axis, respectively. Its eccentricity is defined to be

e = √(a² + b²)/a.   (4.96)

If e = √2 (i.e., a = b), the curve is said to be a rectangular hyperbola, for which the two asymptotes are mutually orthogonal.

3. A parabola has one axis of symmetry. In the canonical form of eq. (4.92) or (4.93), the y-axis and the origin are called the axis and the vertex, respectively. The eccentricity of a parabola is defined to be e = 1.
Fig. 4.14. Eccentricity e of conics: 0 ≤ e < 1 for ellipses, e = 1 for parabolas, and e > 1 for hyperbolas.

Thus, 0 ≤ e < 1 for ellipses, e = 1 for parabolas, and e > 1 for hyperbolas (Fig. 4.14).
4.5 Space Conics and Quadrics

4.5.1 Representation in three dimensions
A. Space conics
A space conic is a conic defined on a space plane. We represent it by back projection: we regard a space conic as the intersection of a space plane with a "cone" with vertex at the viewpoint O generated by the lines of sight of image points on a conic (Fig. 4.15). It follows that a space conic is represented by the space plane (n, r) = d on which the space conic lies and its projection (x, Qx) = 0 onto the image plane; a space conic represented by {n, d, Q} is a set of space points that satisfy

(n, r) = d,   (r, Qr) = 0.   (4.97)
B. Quadrics
A quadric is a surface defined by a quadratic equation in the coordinates (X, Y, Z). Quadrics are classified into ellipsoids, paraboloids, hyperboloids, and their degeneracies (e.g., cylinders, pairs of space planes, space points, and empty sets). Ellipsoids, hyperboloids, and their degeneracies are called centered quadrics, because they have centers of symmetry. Let rc be the center of symmetry, which we simply call the center. Then, a centered quadric is represented in the form

(r − rc, S(r − rc)) = 1,   (4.98)

where S is a (33)-matrix. It is easy to see the following:

1. If S is positive definite, the quadric is an ellipsoid.
Fig. 4.15. Representation of a space conic.
Fig. 4.16. Representation of a quadric.

2. If S is negative definite, the quadric is an empty set.

3. If S has signature (2,1), the quadric is a hyperboloid of one sheet.

4. If S has signature (1,2), the quadric is a hyperboloid of two sheets.

5. If S is singular, various types of degeneracy occur.

The quadric given by eq. (4.98) is said to be singular or nonsingular depending on whether the matrix S is singular or nonsingular. The eigenvectors of S are called its principal axes. The reciprocal of the square root of each positive eigenvalue is the radius of the quadric for the corresponding principal axis (Fig. 4.16). For an ellipsoid that has an orthonormal system {u1, u2, u3} as its principal axes and σ1, σ2, and σ3 as the corresponding radii, the matrix S has the following form (see eq. (2.62)):

S = (1/σ1²)u1u1ᵀ + (1/σ2²)u2u2ᵀ + (1/σ3²)u3u3ᵀ.   (4.99)

Example 4.4 An ellipsoid centered at (a, b, c) with principal axes in the coordinate axis orientations and the corresponding radii σ1, σ2, and σ3 has the form

(X − a)²/σ1² + (Y − b)²/σ2² + (Z − c)²/σ3² = 1.   (4.100)
The center rc and the matrix S are given as follows:

rc = (a, b, c)ᵀ,   S = diag(1/σ1², 1/σ2², 1/σ3²).   (4.101)

4.5.2 Polarity and conjugate direction
A. Poles and polars
Consider a nonsingular quadric given by eq. (4.98). The polar of a space point rp with respect to this quadric is a space plane defined by

(r − rc, S(rp − rc)) = 1.   (4.102)

Its unit surface normal np and distance dp from O are

np = N[S(rp − rc)],   dp = 1/||S(rp − rc)|| + (rc, np).   (4.103)

Conversely, the space point rp is called the pole of the space plane (np, r) = dp with respect to the same quadric. From eqs. (4.103), we see that

rp = rc + S⁻¹np / (dp − (rc, np)).   (4.104)

Eq. (4.102) implies the following:

• The polar of rp passes through rp if and only if rp is on the quadric (4.98).

• If rp is on the quadric (4.98), the polar of rp is tangent to the quadric at rp (Fig. 4.17a).

• The surface normal n to the tangent plane to the quadric (4.98) at r has orientation S(r − rc):

n ∝ S(r − rc).   (4.105)
B. Conjugate direction
Consider a space plane Π that passes through the center rc of the quadric (4.98). Let n be its surface normal. The conjugate direction¹⁶ of the space plane Π with respect to the quadric (4.98) is the orientation of the vector n† that starts from the center rc and points toward a point on the quadric at which the tangent plane has surface normal ±n (Fig. 4.17b). It follows from eq. (4.105) that

n† ∝ S⁻¹n.   (4.106)

If the quadric is a sphere, the conjugate direction of a space plane coincides with its surface normal. In this sense, the conjugate direction is a generalization of the surface normal to a space plane.

¹⁶If the space undergoes a general linear transformation, vectors n and n† transform differently, as a covariant vector and a contravariant vector, respectively. In this sense, the transformations of n and n† are contragredient to each other.

Fig. 4.17. (a) The polar of rp passes through rp if and only if it is tangent to the quadric at rp. (b) The conjugate direction n† of space plane Π.

C. Generalization
Quadrics can also be defined in higher dimensions: eq. (4.98) defines a quadric in n dimensions if r and rc are regarded as n-vectors and S is regarded as an (nn)-matrix. If S is nonsingular, the polarity and the conjugate direction are defined as in three dimensions straightforwardly. Suppose S is a positive semi-definite symmetric matrix of rank r (< n), and let {ui} be its eigensystem for eigenvalues {λi}, where λ1 ≥ ... ≥ λr > λr+1 = ... = λn = 0. Then, the center rc and the vectors u1, ..., ur define an r-dimensional affine subspace¹⁷ A ⊂ Rⁿ. If the quadric is restricted to this affine subspace (Fig. 4.18), it defines a nonsingular quadric in it, so the polarity and the conjugate direction are defined in A.

¹⁷See Footnote 17 in Section 3.4.2.
4.5.3 Visualization of covariance matrices

Fig. 4.18. A singular quadric defines a nonsingular quadric if restricted to the affine subspace spanned by eigenvectors for positive eigenvalues.

Conics are important image features because many man-made objects have circular and spherical parts, and circles and spheres are projected onto ellipses on the image plane. As compared with conics, the role of quadrics appears to be minor, since ellipsoidal objects are rare in real environments. However, quadrics have an important role: a three-dimensional positive semi-definite symmetric matrix is represented by an ellipsoid. This fact allows us to visualize three-dimensional covariance matrices. The following two techniques serve this purpose.

A. Standard confidence region
Let V[r̂] be the covariance matrix of a space point r̂. If it has eigenvalues λ1, λ2, and λ3 and the corresponding eigensystem {u1, u2, u3}, it has the spectral decomposition

V[r̂] = σ1²u1u1ᵀ + σ2²u2u2ᵀ + σ3²u3u3ᵀ,   (4.107)

where σi = √λi, i = 1, 2, 3. We can regard σi as the standard deviation in direction ui (see eq. (3.21)).

1. Suppose V[r̂] is of full rank. If the distribution is Gaussian, the surface on which the probability density is constant is an ellipsoid centered at r̂ in the form (r − r̂, V[r̂]⁻¹(r − r̂)) = constant (see eq. (3.38)). If we choose the constant to be e^(−1/2)/√((2π)³|V[r̂]|), the ellipsoid is

(r − r̂, V[r̂]⁻¹(r − r̂)) = 1,   (4.108)

which has principal axes {u1, u2, u3} and the corresponding radii σ1, σ2, and σ3. Let us call the region inside this ellipsoid the standard confidence region of r̂. This is a natural extension of the standard confidence interval (3.42). The probability that the true value of r falls inside the standard confidence region is 19.87% (see eq. (3.62)).

2. If V[r̂] is of rank 2 with σ3 = 0, the standard confidence region degenerates into a space conic with principal axes {u1, u2} and the corresponding radii σ1 and σ2. It is represented by

(u3, r) = (u3, r̂),   (x, Qx) = 0,   (4.109)

where

Q = N[(u1, u2, r̂)†ᵀ diag(1/σ1², 1/σ2², −1)(u1, u2, r̂)†].   (4.110)

The symbol N[·] denotes the normalization for a matrix (see eq. (2.145)), and (u1, u2, r̂)† is the cofactor matrix of the matrix whose columns are u1, u2, and r̂ in that order (see eq. (2.18)). The probability that the true value of r falls inside this space conic is 39.35%.

3. If V[r̂] is of rank 1 with σ2 = σ3 = 0, the standard confidence region degenerates into the line segment connecting r̂ − σ1u1 and r̂ + σ1u1. The probability that the true value of r falls inside this segment is 68.27%.

The standard confidence region can be defined in higher dimensions in the same way. Let û be an n-vector that represents some object, and V[û] its covariance matrix. Suppose û is constrained to be in an r-dimensional manifold U. Then, V[û] is generally a singular matrix of rank r, so the equation

(u − û, V[û]⁻(u − û)) = 1   (4.111)

generally defines a singular quadric. However, it defines a nonsingular quadric if it is restricted to the tangent space Tû(U) to the manifold U at û (Fig. 4.19). We define the standard confidence region of V[û] to be the inside of this quadric.

Fig. 4.19. The standard confidence region is defined in the tangent space Tû(U) to the manifold U defined by the constraint.
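The construction of the standard confidence region is just a spectral decomposition. The following illustrative Python sketch (the function name is ours, not the book's) returns the principal axes and radii of the ellipsoid (4.108) for a full-rank covariance matrix and flags the degenerate rank-2 and rank-1 cases:

```python
import numpy as np

def standard_confidence_region(V, tol=1e-12):
    """Principal axes u_i and radii sigma_i of (r - rhat, V^{-1}(r - rhat)) = 1."""
    lam, U = np.linalg.eigh(V)             # eigenvalues in ascending order
    lam, U = lam[::-1], U[:, ::-1]         # reorder so that sigma_1 >= sigma_2 >= ...
    sigma = np.sqrt(np.clip(lam, 0.0, None))
    rank = int(np.sum(sigma > tol))
    kind = {3: "ellipsoid", 2: "space conic (rank 2)", 1: "line segment (rank 1)"}.get(rank, "degenerate")
    return sigma, U, kind

# a covariance matrix with standard deviations 0.03, 0.02, 0.01 along rotated axes
rng = np.random.default_rng(0)
R, _ = np.linalg.qr(rng.normal(size=(3, 3)))
V = R @ np.diag([0.03**2, 0.02**2, 0.01**2]) @ R.T
sigma, U, kind = standard_confidence_region(V)
print(kind, sigma)                          # ellipsoid, radii ~ [0.03, 0.02, 0.01]
```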
B. Primary deviation pair

Covariance matrices can be visualized by means of their standard confidence regions only in three dimensions. In higher dimensions, we use an alternative technique. Note that if σ1 is sufficiently large as compared with σ2 in eq. (4.107), the error distribution is localized along the major axis (i.e., the principal axis with the largest eigenvalue). This suggests the following visualization technique. Let û be an n-vector that represents a geometric object, and V[û] its covariance matrix. Let

V[û] = Σ_{i=1}^{n} σi²uiuiᵀ,   σ1 ≥ σ2 ≥ ... ≥ σn,   (4.112)

be its spectral decomposition. The vector u1 can be interpreted as the orientation of the most likely deviation, and σ1 as the standard deviation in that orientation. Hence, the reliability of û can be visualized by displaying the two objects represented by the primary deviation pair {u⁺, u⁻} defined as follows (see eq. (3.48)):

u⁺ = C[û + σ1u1],   u⁻ = C[û − σ1u1].   (4.113)

Here, the operation C[·] designates a higher order correction, such as normalization, to ensure that u⁺ ∈ U and u⁻ ∈ U. To a first approximation, the primary deviation pair {u⁺, u⁻} indicates the diametrically located endpoints of the standard confidence region in the direction of its major axis (Fig. 4.20).

Fig. 4.20. The primary deviation pair.
4.6 Coordinate Transformation and Projection

4.6.1 Coordinate transformation
Since objects in three dimensions are represented with respect to a fixed XYZ coordinate system, the same object has different representations if described with respect to different coordinate systems. Suppose a new X'Y'Z' coordinate system is defined in such a way that its axis orientation is obtained by rotating the original XYZ coordinate system by R and its origin O' is translated from O by h; the vector h and the rotation matrix R are defined with respect to the original XYZ coordinate system. We call {h, R} the motion parameters of the coordinate system (Fig. 4.21). This definition implies that if {h, R} are the motion parameters of the X'Y'Z' coordinate system with respect to the XYZ coordinate system, the motion parameters {h', R'} of the XYZ coordinate system with respect to the X'Y'Z' coordinate system are given by

h' = −Rᵀh,   R' = Rᵀ.   (4.114)

¹⁸Note that the standard confidence region of V[û] is defined in the tangent space Tû(U), which does not coincide with the manifold U if U is "curved" (Fig. 4.19).
Fig. 4.21. Coordinate transformation and the motion parameters {h, R}.
A. Space points

1. A space point r defined with respect to the XYZ coordinate system is represented with respect to the X'Y'Z' coordinate system in the form

r' = Rᵀ(r − h).   (4.115)

Hence,

r = h + Rr'.   (4.116)

2. In the p-representation¹⁹,

p' = N[( Rᵀ  −Rᵀh ; 0ᵀ  1 ) p],   p = N[( R  h ; 0ᵀ  1 ) p'].   (4.117)
B. Space lines

1. In the {m, rH}-representation, a space line (r − rH) × m = 0 defined with respect to the XYZ coordinate system is represented by (r' − r'H) × m' = 0 with respect to the X'Y'Z' coordinate system, where

m' = Rᵀm,   r'H = RᵀPm(rH − h).   (4.118)

Hence,

m = Rm',   rH = RPm'(r'H + Rᵀh).   (4.119)

2. In the {p, n}-representation, a space line r × p = n defined with respect to the XYZ coordinate system is represented by r' × p' = n' with respect to the X'Y'Z' coordinate system, where

p' ⊕ n' = N[(Rᵀp) ⊕ (Rᵀ(n − h × p))].   (4.120)

¹⁹Eqs. (4.117) define a transformation of R³ and its inverse. The set of all such transformations forms the group of three-dimensional Euclidean motions. If the rotation matrix R in eqs. (4.117) is replaced by an arbitrary nonsingular 33-matrix, the resulting group is the group of three-dimensional affine transformations. If the 44-matrices in eqs. (4.117) are replaced by an arbitrary nonsingular 44-matrix and its inverse, the resulting group is the group of three-dimensional projective transformations of the four-dimensional space R⁴, which can be identified with a three-dimensional projective space.
Hence,

p ⊕ n = N[(Rp') ⊕ (R(n' + (Rᵀh) × p'))].   (4.121)
C. Space planes

1. In the {n, d}-representation, a space plane (n, r) = d defined with respect to the XYZ coordinate system is represented by (n', r') = d' with respect to the X'Y'Z' coordinate system, where

n' = Rᵀn,   d' = d − (n, h).   (4.122)

Hence,

n = Rn',   d = d' + (n', Rᵀh).   (4.123)

2. In the v-representation,

v' = N[( Rᵀ  0 ; hᵀ  1 ) v],   v = N[( R  0 ; −hᵀR  1 ) v'].   (4.124)
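The transformation rules of this subsection are one-liners in code. The sketch below (illustrative NumPy; the names are ours, not the book's) maps a space point by eq. (4.115) and a space plane by eq. (4.122), and checks that incidence is preserved:

```python
import numpy as np

def rot_z(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def transform_point(r, h, R):
    """r' = R^T (r - h)   (eq. (4.115))."""
    return R.T @ (r - h)

def transform_plane(n, d, h, R):
    """n' = R^T n,  d' = d - (n, h)   (eq. (4.122))."""
    return R.T @ n, d - n @ h

# motion parameters {h, R}
h = np.array([1.0, -2.0, 0.5])
R = rot_z(np.pi / 3)

n = np.array([0.0, 0.0, 1.0]); d = 3.0      # the plane Z = 3
r = np.array([5.0, 7.0, 3.0])               # a point on that plane
rp = transform_point(r, h, R)
npr, dp = transform_plane(n, d, h, R)
print(np.isclose(n @ r, d), np.isclose(npr @ rp, dp))   # True True: incidence is preserved
```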
D. Space conics and quadrics

1. A space conic {n, d, Q} defined with respect to the XYZ coordinate system is represented by {n', d', Q'} with respect to the X'Y'Z' coordinate system, where

n' = Rᵀn,   d' = d − (n, h),   Q' = Rᵀ( I + (n hᵀ)/(d − (n, h)) ) Q ( I + (h nᵀ)/(d − (n, h)) ) R.   (4.125)

Hence²⁰,

n = Rn',   d = d' + (n', Rᵀh),   Q = ( I − (n hᵀ)/d ) RQ'Rᵀ ( I − (h nᵀ)/d ).   (4.126)
2. A quadric (r − rc, S(r − rc)) = 1 defined with respect to the XYZ coordinate system is represented by (r' − r'c, S'(r' − r'c)) = 1 with respect to the X'Y'Z' coordinate system, where

r'c = Rᵀ(rc − h),   S' = RᵀSR.   (4.127)

Hence,

rc = h + Rr'c,   S = RS'Rᵀ.   (4.128)

²⁰Eqs. (4.125) and (4.126) define a congruence transformation (see eq. (2.154)) and its inverse in the form Q = AᵀQ'A and Q' = (A⁻¹)ᵀQA⁻¹, where A is the matrix that defines the projective transformation of the image plane (see eq. (4.86) and the matrix inversion formula (2.24)).
Fig. 4.22. (a) Perspective projection of a space point. (b) Perspective projection of a space line.
4.6.2 Perspective projection
A. Space points

1. A space point P not on the XY plane represented by r is perspectively projected onto the intersection p of the image plane Z = 1 with the line of sight of P (Fig. 4.22a). It is represented by

x = r/(k, r).   (4.129)

2. Space points on the XY plane produce no images on the image plane²¹.

B. Space lines

1. A space line L that does not pass through the viewpoint O is projected onto the intersection l of the image plane Z = 1 with the space plane that passes through O and L (Fig. 4.22b). The surface normal to that space plane is n in the {p, n}-representation, and rH × m in the {m, rH}-representation (see eq. (4.36)). Hence, the projected image line is represented by (nl, x) = 0, where

nl = N[n] = N[rH × m].   (4.130)

If a space point on space line L moves in one direction indefinitely, its projection converges to an image point called the vanishing point of L (Fig. 4.23a); the same vanishing point is defined if the space point moves in the opposite direction. The vanishing point is represented by

m/(k, m) = p/(k, p).   (4.131)
128
C h a p t e r 4.
R e p r e s e n t a t i o n of Geometric Objects
x
x
O~
(b) Fig. 4.23. (a) Vanishing point of a space line. (b) Perspective projection of a space line that passes through the viewpoint O. T h e vanishing point separates the projected image line into two half lines. One side is the projection of the part of the space line for which Z > 0; the other side is the projection of the part for which Z < 0. T h e part for which Z = 0 is not projected onto the image plane 22. 2. A space line L t h a t the intersection p of (Fig. 4.235). Hence, r x p = 0 for (k, p)
passes t h r o u g h the viewpoint O is projected onto the image plane with L if L is not on the X Y plane space line v x m = 0 for (k, m ) # 0, or space line # 0, is projected onto image point
3. A space line on the X Y
m _ p (k, m ) (k,v)
(4.132)
plane is not projected onto the image plane 23.
C. S p a c e p l a n e s 1. A space plane II t h a t does not pass t h r o u g h the viewpoint O is p r o j e c t e d onto the entire image plane. If the space plane H is not parallel to the image plane, those space points which are on H and infinitely far a w a y 24 from the viewpoint O are projected onto an image line, known as the vanishing line of H (Fig. 4.24a). It is easily seen t h a t the vanishing line of space plane ( n n , r ) = d ( # 0) is ( n n , x) = O.
(4.133)
22The part of a space line for which Z = 0 can be thought of as projected onto the ideal image point ("image point at infinity") of the projected image line. 23A space line on the X Y plane can be thought of as projected onto the ideal image line ("image line at infinity") of the image plane if it does not pass through the viewpoint O; perspective projection of a space line on the X Y plane that passes through O is not defined. 24The set of all space points on a space plane that are infinitely far away from the viewpoint O is called an ideal space line ("space line at infinity"). The ideal space lines defined by mutually parallel space planes are regarded as the same ideal space line.
4.6.
Coordinate Transformation and Projection
n
129
.... .~ . . i
~..............
(b)
(~)
Fig. 4.24. (a) Vanishing line of a space plane. (b) Perspective projection of a space plane that passes through the viewpoint O.

One of the two regions on the image plane separated by the vanishing line is the projection of the part for which Z > 0; the other is the projection of the part for which Z < 0. The part for which Z = 0 is not projected²⁵. If n = ±k, no vanishing line appears²⁶.

2. A space plane Π that passes through the viewpoint O is projected onto the intersection l of the image plane with Π if it does not coincide with the XY plane itself (Fig. 4.24b). Namely, space plane (nΠ, r) = 0 for nΠ ≠ ±k is projected onto image line (nΠ, x) = 0.

3. The XY plane is not projected onto the image plane²⁷.
D. Space conics and quadrics

1. By definition, a space conic {n, d, Q} is projected onto the conic (x, Qx) = 0 (see Fig. 4.15).

2. If space point r = Zx is on quadric (r − rc, S(r − rc)) = 1, we have

(Zx − rc, S(Zx − rc)) = Z²(x, Sx) − 2Z(x, Src) + (rc, Src) = 1.   (4.134)

This equation yields two real solutions for Z if the line of sight of x intersects the quadric at two space points; no real solution exists if it does not meet the quadric. The set of those image points whose lines of sight are tangent to an object in the scene is called the (occluding) contour of the object (Fig. 4.25). It follows that an image point x is on the contour of quadric (r − rc, S(r − rc)) = 1 if and only if eq. (4.134) has one multiple root, i.e.,

(x, Src)² − (x, Sx)((rc, Src) − 1) = 0,   (4.135)
25The part of a space plane for which Z = 0 can be thought of as projected onto the ideal image line ("image line at infinity").
²⁶The vanishing line can be thought of as the ideal image line ("image line at infinity"). ²⁷Space plane (k, r) = 0 can be thought of as projected onto the ideal image line ("image line at infinity").
Fig. 4.25. Perspective projection of a quadric.

which defines a conic

(x, Qx) = 0,   (4.136)

where

Q = ((rc, Src) − 1)S − Src rcᵀS.   (4.137)

In other words, quadric (r − rc, S(r − rc)) = 1 is projected onto the conic given by eq. (4.136). This conic may not be a real conic²⁸; if it is a real conic, it may be degenerate.
28For example, an ellipsoid centered at the viewpoint O is projected onto an "imaginary conic" (an empty set in the real space).
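Eq. (4.137) gives the contour conic of a quadric in closed form. The sketch below (illustrative NumPy, not the book's code) computes Q for a sphere and verifies that a grazing line of sight satisfies eq. (4.135):

```python
import numpy as np

def contour_conic(S, rc):
    """Q = ((rc, S rc) - 1) S - S rc rc^T S   (eq. (4.137))."""
    Src = S @ rc
    return (rc @ Src - 1.0) * S - np.outer(Src, Src)

# unit sphere centered at (0, 0, 3): S = I, rc = (0, 0, 3)
S = np.eye(3)
rc = np.array([0.0, 0.0, 3.0])
Q = contour_conic(S, rc)                    # = diag(8, 8, -1): a circle on the image plane

# a grazing line of sight: half-angle alpha with sin(alpha) = 1/3, so tan(alpha) = 1/sqrt(8)
x = np.array([1.0 / np.sqrt(8.0), 0.0, 1.0])
print(np.isclose(x @ Q @ x, 0.0))           # True: this image point lies on the contour conic
```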
Chapter 5

Geometric Correction

Multiple geometric objects that are supposedly interrelated by a constraint may not satisfy it if each object is separately observed in the presence of noise. This chapter presents a statistically optimal way to correct the positions of geometric objects so that they satisfy a required constraint. The fundamental principle is the minimization of the Mahalanobis distance defined in terms of the covariance matrices of the objects. First, a general theory is formulated in abstract terms. Then, it is applied to typical geometric problems in two and three dimensions: optimally imposing coincidence and incidence on image points, image lines, conics, space points, space lines, and space planes. For each problem, explicit expressions for the correction and the a posteriori covariance matrices are derived. Optimal correction for orthogonality constraints is also studied.

5.1 General Theory

5.1.1 Basic formulation
Consider N geometric objects in two or three dimensions, the αth object being represented by a vector uα, α = 1, ..., N. Let nα be the dimension of the vector uα. The N objects are assumed to be statistically independent, but the components of each uα may be correlated. We assume that each uα is constrained to be in an n'α-dimensional manifold Uα ⊂ Rⁿᵅ, which we call the data space of uα. Let ūα be the true value we should observe in the absence of noise, and write uα = ūα + Δuα. The error Δuα is, to a first approximation, constrained to be in the tangent space Tūα(Uα) to the manifold Uα at ūα. Let V[uα] be the covariance matrix of the error Δuα. We assume that no constraint exists on Δuα other than Δuα ∈ Tūα(Uα), and hence the range of V[uα] coincides with Tūα(Uα). It follows that

V[uα] = P^{Uα}_{ūα} V[uα] P^{Uα}_{ūα},   (5.1)

where P^{Uα}_{ūα} is the nα-dimensional projection matrix onto Tūα(Uα). Suppose L smooth functions F^(k)(·, ..., ·): Rⁿ¹ × ... × Rⁿᴺ → R exist and the true values ū1, ..., ūN are known to satisfy

F^(k)(ū1, ..., ūN) = 0,   k = 1, ..., L.   (5.2)
(b)
Fig. 5.1. (a) Nonsingular constraint imposed by three equations that are not independent. (b) Singular constraint imposed by two independent equations. We call eq. (5.2) simply the constraint, whereas we refer to the constraint u~ E L/a as the inherent constraint on u a . We now consider the problem of optimally correcting u l , .... , u g so t h a t these L equations are satisfied. Since each u~ is constrained to be in the d a t a space Ha C 7~ n~ , the direct sum ~[~N__1 u~ is constrained to be in its d a t a space
U_(DN ~=1/4~
C ~=~n~.
We say t h a t the constraint imposed by eq. (5.2) is
nonsingular if each of the L equations defines a manifold S (k) of codimension i i n / 4 and the L manifolds S (k), k = 1, ..., L, meet each other transversally in b/ (Fig. 5.1a; see Section 3.2.1); otherwise, the constraint is said to be singular I (Fig. 5.1b). In this chapter, we assume t h a t the constraint (5.2) is nonsingular, although the following theory can also be applied to singular constraints if appropriately modified 2. T h e L equations (5.2) m a y not necessarily be i n d e p e n d e n t (Fig. 5.1a). We call the n u m b e r r of i n d e p e n d e n t equations the rank of the constraint. It L S(k) of codifollows t h a t the constraint (5.2) defines a manifold S - Nk=l mension r in H; we call S the (geometric) modal of th~ constraint (5.2). F r o m the above definition, we see t h a t the rank r equals the dimension of the linear subspace
N
N
-
~N
a=l
(5.a)
ot=l
where V u / ~ ( k ) is the abbreviation of ~7u,~F(k)('Y_tl,...,ftN). Note t h a t the dimension of this subspace m a y not be equal to the dimension of the linear subspace N
v-
us { G P,"oV, oF
c~=l
N
~ 9 ...j
p u s V, o F(L) Us
c
N- ~n~
(5.4)
ct=l
1For example, if the L equations (5.2) are expressed as one equation, say, in the form ~N_ 1F(k)(fil, ..., fiN) 2 = 0, the constraint is singular. 2-'~Vewill see an example of a singular constraint in the motion analysis in Chapter 11.
5.1.
General Theory
133
for u~ # ~t~, where V u F (k) is the abbreviation of VuF(k)(Ul,..., UN). We say that the constraint (5.2) is degenerate 3 if the dimension of )2 is larger that t h e dimension of the subspace 12. Substituting u~ - ~ + Au~ i n t o F(k)(Ul,...,UL) and taking a linear approximation, we can replace eq. (5.2) to a first approximation by _
N
F(a) - E (Vu~/~(a)' A u a ) ,
k - 1, ..., L,
(5.5)
or--1
where F (k) is the abbreviation of F(k)(ul,..., UL). This linearized constraint is assumed to be satisfiable, i.e., there exists at least one set of solutions A u ~ E T~t. (b/a), c~ = 1, ..., N, that satisfies eq. (5.5). If A u ~ is a solution of eq. (5.5), the correction takes the f o r m / t ~ = u~ Au~ to a first approximation. However, infinitely many solutions may exist for Au~, c~ = 1, ..., N. From among them, we choose the one which minimizes the square sum of the Mahalanobis distance ]]Au~[l~,[u~] (see eq. (3.66)), i.e., N
J-
E ( A u ~ , IT[u~l-Au~)--4 min
(5.6)
c~--1
under the inherent constraint A u ~ E T~t. (t4~), c~ = 1, ..., N. N Geometrically, we are projecting the direct sum ( ~ = 1 u~ onto the "closest point" in the model $ determined by eq. (5.2), where the "closeness" is measured in the Mahalanobis distance with respect to the total covariance N matrix (~)~=1 17[u~] (Fig. 5.2). This criterion can be justified as maximum likelihood estimation for Gaussian noise. Namely, if the errors AUl, ..., AUN have the probability density
~=1
(5.7) maximizing the likelihood is equivalent to minimizing the function J given in (5.6) (see eqs. (3.46), ( 3 . 8 1 ) a n d (3.82)).
5.1.2
Optimal solution
1,..., n ~ - n ~ , ' be an orthonormal basis of T~t. (t4~) • Let ~"-( V j(a) } , j inherent constraint A u ~ E T u . (L/a) can be written as 9v j , Au (:'(~)
~
) -
0,
j -
1,
...,
3A more rigorous argument will be given in Chapter 14.
n ~ - n' ~ .
The
(5.8)
134
Chapter 5.
Geometric Correction
N ~ t t
a
a=l
Fig. 5.2. Projecting ~[~=1 N us onto the closest point in the model ,.q measured in the Mahalanobis distance. Introducing Lagrange multipliers Ai and pl ~), differentiating L
N
N
J-2EAkE(Vu k=l
/~ ( k ) , A u ~ ) - 2 E
a=l
a=l
n~-n,~
! 9
E
(~) ,-(~)
t~j (vj
,Au~)
(5.9)
j=l
with respect to each Au~, and setting the result zero, we obtain L
n ~ - - n IoL
-
v[~ol-~uo - Z ~ v ~ o p(~) + ~ k=l
=(~)
.~(o)~
(5.10)
j=l
Multiplying this by V[u~] on both sides and noting that =(~) u~ ~ T~to(U~) • we obtain L
)~kV[u~lVu~T'(k)'
PuU~ Au~ - E
(5.11)
k=l
where eqs. (5.1) have been used. Since Au~ E T~t(b/~), the solution is given by L
~
- ~ ~ [ ~ o l V ~ o ~ (~).
(5.12)
k=l
Substitution of this into eq. (5.5) yields
(v~oF (k), ~?[u~lVuoP (~)) ~ - F(k) /=1
(5.13)
c~=1
Since eq. (5.5) is assumed to be satisfiable, this equation is solvable (see Section 2.3.2); the solution is given in the following form: L
~ - Z w(~')r('). /=1
(5.14)
5.1.
General Theory
135
Here, l~ (kl) is the (kl) element of the (LL)-matrix I ~ - (I~ (kl)) defined by 12V - V - , where V - (~(kl)) is the (LL)-matrix defined by
(~?(~')) - ( L~=1(vuo F(k)' r
P('))) "
(5.15)
In the following, we use the following abbreviation to denote the (LL)-matrix
W-
(w(k~)):
(~v(kl))--(~(~u' ) ~-'a(k)'~/r[ua]vu' = l ~(l))
(5.~6)
It can be shown 4 that the rank of the matrix V (hence of I?V) equals the rank r of the constraint (5.2). It follows that the optimal correction is given in the following form (see eqs. (2.140)and (3.85)): L
(5.17)
k,l--1
This equation has the following geometric interpretation. If the noise is Gaussian, the equiprobability surface for ua has the form
(5.~8)
(u a - u a, V[u ~ ](ua - ~ a ) ) - constant.
As discussed in Section 4.5, this equation defines a nonsingular quadric in the tangent space T~t~ (L/a). Let Sa be the restriction of the model S to L/a obtained by fixing u~ - ~tZ for/3 ~ a. We now show that the optimal correction A u a given by eq. (5.17) is in the conjugate direction of the tangent space T~t~ ($a) to the model Sa at ~a (see Section 4.5.2). In T u , (L/a), the tangent hyperplane to the quadric defined by eq. (5.18) at ua + A u a has the following surface normal (see eq. (4.105)): L
no o~ e [ ~ o ] - ~
- Z
w<~')F(~)P,~~
k,l=l
p<~).
(5.~0)
Let v be an arbitrary tangent vector to the manifold Sa at ua (Fig. 5.3). Since the orthogonal complement of T~t~ ($a) with respect to T~t~ (L/a) is generated by P~tu~ Vu/~(k), k - 1, ..., L, eq. (5.19)implies L
(',"o) ~ Z w(~')F(~)(', Pu~V~o r(*)) - 0.
k,l-=l
Thus, Au~ is in the conjugate direction of T~t~ ($a). 4The proof will be given in C h a p t e r 14 in a more general framework.
(5.20)
136
Chapter 5.
r
Geometric Correction
.2v
Fig. 5.3. Geometric interpretation of optimal correction.
5.1.3
Practical considerations
Eq. (5.17) is merely a theoretical expression, because the right-hand side involves the covariance matrices V[ua] evaluated at the true values ~ta, a 1, ..., N, which we want to compute. It appears that they can be approximated by the covariance matrices V[ua] evaluated at the observed values u a , a - 1, ..., N. However, if the matrix V - (~(k0) defined by eq. (5.15) is approximated by V - (V (kl)) in the form
(5.21)
(v(k'))--(~(Vu"F(k)'V[ualVu"F(O)) ' a 1
matrices V and V may have different ranks" the rank of V is larger than that of Y if the constraint (5.2) is degenerate (see eqs. ( 5 . 3 ) a n d (5.4)). Hence, even if V is a good approximation to V, its generalized inverse W - V may be very different from W - V . A practical solution to this difficulty is to compute the rank-constrained generalized inverse (see eq. (2.82)). Namely, if the rank of the constraint (5.2) is r, eq. (5.17) is approximated by L
(5.22)
A u a - V[ua] E W
where
W - (W (kO) is an (LL)
matrix defined by W - (V)~-, which we write
as
a=l
r
The use of V[ua] instead of V[ua] has the following geometric interpretation. The quadric defined by eq. (5.18) is centered at the true value u a in the tangent space Tu~ (bG) at ua; the correction A u given by eq. (5.17) is an element of Tuo (bG) (Fig. 5.4). Using V[ua] instead of V[ua] means replacing eq. (5.18) by (ua - Ua, V[ua](~ta - u a ) ) - constant. (5.24)
5.1.
General Theory
137
a
Fig. 5.4. Theoretical analysis is done in the tangent space Tu~ (b/~) at fi~.
"
Js~
Fig. 5.5. Actual correction is done in the tangent space Tu~ (U~) at u~. If we regard ~ts as a variable, this equation defines a quadric centered at the data value u s in the tangent space Tu~ (Sts) at u s . Hence, the correction A u given by eq. (5.22) is an element of Tu~(Ua) (Fig. 5.5). This means that the data value u~ is corrected within Tu~ (Lts) in such a way that the Mahalanobis distance IIAusllv[u~] is minimized. This observation implies that as long as A u s E Tu~ (b/s), the inherent c o n s t r a i n t / t s E L/s on the corrected v a l u e / t s - u s - A u s is satisfied to a first approximation but may be violated if higher order terms are considered (Fig. 5.5). It follows that if we want to impose the inherent c o n s t r a i n t / t s E U~ exactly, we need a higher order correction, which we denote by C[. ] (see eqs. (4.113)): = C[uo -
(5.25)
This higher order correction can be made rather arbitrarily, since the correction is optimal in the first order. Because the correction given by eq. (5.22) is based on the linear approximation (5.5), the values {/t~} corrected by eq. (5.25) may not exactly satisfy the constraint (5.2) (Fig. 5.5). In order to impose it exactly, the computation is iterated by replacing the original values { u s } by the corrected values {/ts }. This process is essentially the Newton iterations, so the convergence is quadratic; usually two or three iterations are sufficient. In these iterations, the covariance matrix V[us] must also be updated, be-
138
Chapter 5.
Geometric Correction
U(~
Ua
Fig. 5.6. The error of optimal correction. cause the range of V[ua] at u~ is Tu~ (Ha) and is generally different from the range at /ta, which should be T/t ~ (Ha) (Fig. 5.5). If the covariance matrix V[ua] is given as a function of u~, it only needs to be re-evaluated at the updated value/~a. In many practical problems, however, the covariance matrix V[ua] is given only at the initial value ua. In such a case, a practical compromise is to "project" the covariance matrix V[ua] onto the tangent space T/t ~ (Ha) a t / t a in the form V[ua] - P i ~
V[ua]PU"its'
(5.26)
where P/t~ is the projection matrix onto the the tangent space T/t = (Ha) at
Ua.
5.1.4
A posteriori covariance matrices
Even if the constraint (5.2) is exactly imposed on the corrected values {/ta }, they are random variables because they are computed from the data {ua }. Let ua - ua + Aria, a - 1, ..., N, be the observed values, where Aria is the actual error in ua. After the correction A u a given by eq. (5.17) is subtracted, the data value ua is modified into
L Ua -- ('Ua + /k~ta)- ~r[Ua] E ~Tv
= ~ +
(
~-
k,l-1
r
~ k,l=l
w(k~)~(v~.~(k),~)v~o~(~) ~=1
)
(5.27)
to a first approximation. Let A/ta = /ta - u a be the error in the corrected v a l u e / t a (Fig. 5.6). The covariance matrix V[/ta,/t~] - E[A/taA/t~-] of the corrected values {/ta } is computed as follows:
~[~,~] - z[~oa~]
5.1.
General Theory
--
E
139
l~r(mn)(~/'[us
g[A~t
-T
6=1 m,n=l N L ~,=1 k,l--1 N L 3,,6=1 k,l,m,n=l
(Vu /~(k)) r E[A ~ A ~ - ] Vu~ F(,n)(s
Vu, F(n) ) T
= ~r['l/,o~](~c~ -- ( L ~v(mn)(~'r['uj3]Vu/~F(n))(V,cto,~-~(m))T~/r['llto~]) m~n----1 L
- ~
r162
k,/=l L
+
}~
r162
(~
k,l,m,n=l N
(Vu /~(k), ~/[u~lVuf.(m))(f/[u~]Vuof.(n))T
3,=1 L
-
~
s
l;V(m')(f/[u~]Vu,F('))(f/[u~]Vu~F(m))-r
re,n--1 L
k,l=l L + E ~v(kl)~v(mn)(~'r[U~176 k,l,m,n=l L
= ~[~1~-
~
"/(~~163
~.
(5.2s)
k,l=l Here, we have invoked the assumption that each us is independent and hence E[Aft~A~Z] - f/[u~]6~/3. We have also used the identity ITVVW
= ww-w-
w (s~e eqs. (2.123)).
Letting a - fl in eq. (5.28), we obtain L
(~[~1- ~ [ ~ ] - ~ -~(~)((~[~]v~oP(~))(r~[~o]v~o~(~ ~ k,/=l
(5.29)
140
Chapter 5.
Geometric Correction
/ T
J
ft--
N
---
a
,/
Fig. 5.7. The a priori and a posteriori standard confidence regions. For a ~ fl, we obtain L =
_
(5.30)
k,/----1
Thus, the corrected values {ira } are correlated even though the original values {u~} are independent. It can be confirmed 5 that the ranges of V[/ta] and Y[/ta, $tZ] coincide with the tangent space T/t" (b/a) to the manifold U~ at ~ta.
Eqs. (5.29) and (5.30) can be given the following geometric interpretation. The a priori covariance matrix ~[~N (x--1 Vinci defines the a priori standard confidence region in the tangent space T~=I ~t. (/4) to the manifold/4 = ~aN__1 Ua (Fig. 5.7). The a posteriori covariance matrix given by eq. (5.28) defines the a posteriori standard confidence region in the tangent space TeN=I ~t. (S) to the model $ defined by the constraint (5.2). This confidence region is the "projection" of the a priori standard confidence region onto Tey=l u . ($) along the "conjugate direction". It will be shown in Chapter 14 that eq. (5.28) coincides with the Cramer-Rao lower bound on the covariance matrix of the corrected values, meaning that the correction given by eq. (5.17) is indeed theoretically optimal. Eqs. (5.29) and (5.30) are mere theoretical expressions, since they involve values evaluated at ~ , a = 1, ..., N. Furthermore, eqs. (5.29) and (5.30) define the covariance matrices evaluated at ~a, a - 1, ..., N. In actual computation, a consistent approximation is identifying ~ w i t h / t ~ , i.e., we compute L T
k,l=l
5The proof will be given in Chapter 14 in a more general framework.
5.1.
General Theory
141 L
V[/ta,/tfi] = -
l~(kO(f/'[ua]Vu.~'(k))(f/[u~]Vu~ "f'(O)-r"
E
(5.32)
k,l=l
Here, IY[ua] is the matrix defined by eq. (5.26), and Vu/#(k) is the abbreviation of Vu F(k)(/tl,...,/tN). The matrix I ~ = (l~ (kl)) is defined by
(5.aa) a:l
r
The ranges of V[/ta] and V[~ia,/till thus defined coincide with the tangent space T~t ~ (b/a).
5.1.5
Hypothesis testing and noise level estimation
So far, the constraint (5.2) has been assumed given. However, it can be byand the above correction procedure can provide a means of testing this hypothesis. Let us hypothesize that the true values ua, a - 1, ..., N, satisfy eq. (5.2). Let ua = ua + A u a , a = 1, ..., N, be the observed values, and regard each A u a as an independent Gaussian random variable with mean 0 and covariance matrix I?[ua]. Since Vu/~(k) are deterministic values, eq. (5.5) implies that to a first approximation F (k) is a Gaussian random variable of mean 0. Noting that each A u a is independent, we can compute the covariance of F (k) and F (l) in the following form:
pothetical,
N
N
a=l
fl=l
N
=
(~ a,fl=l N
= E(Vu
/~(k) ' lF[ua]Vu /~(t)) _ ~(kt).
(5.34)
a--1
The matrix I~r - (W (kt)) defined by eq. (5.16) has rank r ( - the rank of eq. (5.2)), so the quadratic form L
d -
W(kOf(k)V(O
(5.35)
k,l=l
is a X2 variable with r degrees of freedom (see eq. (3.61)). It is easily confirmed that the right-hand side of eq. (5.35) coincides with the value obtained by substituting eq. (5.17) into eq. (5.6) (see eqs. (2.141) and
142
Chapter 5.
Geometric Correction
(3.87)). In other words, j is the residual of the optimization (5.6). If this value is much larger than can be accounted for by the statistical behavior of the noise, the hypothesis (5.2) should be rejected. It follows that the hypothesis (5.2) can be tested by the standard X2 test" the hypothesis is rejected with significance level a% if J > x ,o, where X,.,a 2 is the a% significance value of X2 with r degrees of freedom (see Section 3.3.4). Intuitively, the hypothesis that eq. (5.2) holds is rejected if the Mahalanobis distance over which the data {ua} must be displaced for imposing eq. (5.2)is too large (see Fig. 5.2). Eq. (5.35) is merely a theoretical expression, since it involves 1~r = (l~ (kl)). A simple approximation is using W - (W (kl)) defined by eq. (5.23), but now that the optimal estimate/t~ has been obtained, we can alternatively use ITr = (1~ (kt)) defined by eq. (5.33). However, the use of W rather than W makes only a second order difference. In many application problems, the geometric characteristics of noise (e.g., the degree of homogeneity/inhomogeneity and isotropy//anisotropy) can be relatively easily predicted but the absolute magnitude of noise is very difficult to estimate a priori. In such a case, we can write the covariance matrix Vinci in the form = = ..., N, (5.a7) where V0[u,] has a known form while e is unknown. Let us call V0[u~] the normalized covariance matrix of u~, and e the noise level. It is easily seen from eqs. (5.22) and (5.23) that the optimal correction is not affected by multiplication of Vinci by an arbitrary positive constant. Hence, V[u,] can be replaced by the normalized covariance matrix V0[u~]. In other words, we need not know the absolute noise level .for the optimal correction. Once the optimal solution is computed, the noise level e can be estimated a posteriori as follows. If the normalized covariance matrix V0[u~] is used for V[u,] in eq. (5.35), the resulting residual J0 equals e2J. Since 2 is a X2 variable with r degrees of freedom, its expectation and variance are r and 2r, respectively (see eqs. (3.59)). Hence, an unbiased estimator ~2 of e2 is obtained in the form
J0.
(5.as)
r
Its expectation and variance are respectively given by
(5.39) r
In geometric terms, we are estimating the noise level from the Mahalanobis distance with respect to V0[u,] over which the data ( u , } must be displaced for imposing eq. (5.2) (see Fig. 5.2). It follows that the hypothesis test (5.a6)
5.1.
General Theory
143
can be interpreted as comparing the a priori value e with the a posteriori estimate ~ computed on the assumption that the hypothesis is true. In fact, eq. (5.36) is equivalently rewritten as 6 ~2
2
~r,a
-7 e > --'r
5.1.6
(5.40)
Linear constraint
In many problems, the constraint is linear in the form N
EAters=b,
(5.41)
s=l
where A s is an L n s - m a t r i x and b is an L-vector. The rank r of this constraint equals the rank of the L ( ~ N s= 1 n s)-matrix
A-
(A1P~tU11,..., ANPuU~),
(5.42)
in terms of which eq. (5.41) can be written as A (~)N s = l ~ ' s - - b. Suppose the observed values u s , c~ = i, ..., N, do not satisfy the constraint (5.41). If we write uo = ~o + A , ~ , eq. (5.41) can be written as N
N
s 88
-b.
(5.43)
s=l
The correction A u s is determined by the optimization (5.6) under the inherent constraint A u s e Tu~ (L/s). The solution given by eq. (5.17) reduces to N
Aus - V [ u s l A ~ W ( E
Azu~ - b),
(5.44)
~=1 where the (LL)-matrix W is given as follows (see eq. (5.23)):
w
=
,545, s--1
r
Since the constraint (5.41) is linear, no approximation has been made to obtain eq. (5.43). Hence, no iterations are necessary. The a posteriori covariance matrices given by eqs. (5.31) and (5.32) reduce to V [ / t ~ ] - i Y [ u ~ ] - V[u~]A~WA~/[u~], (5.46) 6 T h i s is a c o n s e q u e n c e of t h e fact t h a t ~2/e2 is a modified X 2 variable with r degrees of freedom if the h y p o t h e s i s is t r u e (see eq. (3.72)).
144
Chapter 5. V[/~,/~fl]
-
Geometric Correction
V[u,~]A~WAzV[uz].
(5.47)
The residual of the optimization (5.6) can be written in the form N J
-
(E
N
A~/G
-
b, W ( E
c~=l
AZ/~Z
-
b)),
(5.48)
fl--1
which is a X2 variable with r degrees of freedom if the noise is Gaussian.
5.2
Correction of Image Points and Image Lines
5.2.1 A. Image
Optimal correction for coincidence points
Let x1 and x2 be two image points, and V[x1] and V[x2] their respective a priori covariance matrices. Suppose x1 and x2 are two different estimates of the same image point. Consider the problem of estimating the true position. Let x̄1 and x̄2 be the true values of x1 and x2, respectively. The constraint to be imposed is

x̄1 = x̄2,   (5.49)

which has rank 2 because both sides are orthogonal to k = (0, 0, 1)ᵀ. If the two image points are statistically independent, the optimal estimate x̂ = x1 − Δx1 = x2 − Δx2 is obtained by finding Δx1 and Δx2 such that⁷

J = (Δx1, V̄[x1]⁻Δx1) + (Δx2, V̄[x2]⁻Δx2) → min   (5.50)

under the linearized constraint

Δx2 − Δx1 = x2 − x1,   Δx1 ⊥ k,   Δx2 ⊥ k.   (5.51)

The first order solution is given by⁸

Δx1 = V[x1]W(x1 − x2),   Δx2 = V[x2]W(x2 − x1),   (5.52)
(5.52)
where W is a (33)-matrix defined by (5.53) 7We adopt the convention that V[. ] denotes the value of the covariance matrix V[. ] evaluated at the true value of the variable. SWe mean by "first order solution" the approximation expressed in terms of the data and the covariance matrices evaluated at the data values (see eq. (5.22)).
5.2.
Correction of Image Points and Image Lines
145
The a posteriori covariance matrix of the estimate x̂ is

V[x̂] = V[x1] − V[x1]WV[x1] = V[x1]WV[x2] = V[x2] − V[x2]WV[x2].   (5.54)

The residual of J can be written as

J = (x2 − x1, W(x2 − x1)),   (5.55)

which is a χ² variable with two degrees of freedom⁹. This fact provides a coincidence test for image points: the hypothesis that image points x1 and x2 coincide with each other is rejected with significance level a% if

J > χ²_{2,a}.   (5.56)
Example 5.1 If each coordinate is perturbed independently by Gaussian noise of mean 0 and variance ε², the covariance matrices of Δx1 and Δx2 are V[x1] = V[x2] = ε²Pk (see Example 4.1). The optimal estimate x̂ is

x̂ = x1 − Δx1 = x2 − Δx2 = (x1 + x2)/2.   (5.57)

The a posteriori covariance matrix of x̂ is

V[x̂] = (ε²/2)Pk.   (5.58)

The residual (5.55) can be written as

J = ||x2 − x1||²/(2ε²).   (5.59)

Hence, an unbiased estimator of the variance ε² is obtained in the form

ε̂² = ||x2 − x1||²/4.   (5.60)

The value ε̂ thus estimated equals the half-distance between the two image points. If the value ε is given a priori, the coincidence test takes the form

ε̂²/ε² > χ²_{2,a}/2.   (5.61)
9We assume Gaussian noise and do first order analysis whenever we refer to X2 distributions and X2 tests.
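For concreteness, the sketch below (illustrative NumPy, not the book's code; the rank-constrained generalized inverse is implemented here as an eigenvalue truncation, which we assume matches the matrix W of eq. (5.53)) carries out the optimal coincidence correction of eqs. (5.52)–(5.55) and reproduces the simple averages of Example 5.1:

```python
import numpy as np

def pinv_rank(M, r):
    """Rank-constrained generalized inverse: keep only the r largest eigenvalues."""
    lam, U = np.linalg.eigh(M)
    idx = np.argsort(lam)[::-1][:r]
    return sum(np.outer(U[:, i], U[:, i]) / lam[i] for i in idx)

def merge_image_points(x1, V1, x2, V2):
    """Optimal coincidence correction; returns estimate, its covariance, and the residual."""
    W = pinv_rank(V1 + V2, 2)              # cf. eq. (5.53)
    dx1 = V1 @ W @ (x1 - x2)               # eq. (5.52)
    xhat = x1 - dx1
    Vhat = V1 - V1 @ W @ V1                # eq. (5.54)
    J = (x2 - x1) @ W @ (x2 - x1)          # eq. (5.55): chi-square with 2 degrees of freedom
    return xhat, Vhat, J

# Example 5.1: isotropic image noise of variance eps^2 in each coordinate
eps = 0.5
Pk = np.diag([1.0, 1.0, 0.0])
x1 = np.array([10.0, 4.0, 1.0]); x2 = np.array([11.0, 5.0, 1.0])
xhat, Vhat, J = merge_image_points(x1, eps**2 * Pk, x2, eps**2 * Pk)
print(xhat)                                 # the midpoint (10.5, 4.5, 1)
print(J, np.linalg.norm(x2 - x1)**2 / (2 * eps**2))   # the two residuals agree (eq. (5.59))
```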
B. Image lines
The same analysis can be done for two image 0. Let Y [ n l ] and V[n2] be their respective a signs of n l and n2 are chosen in such a way and (n2, x ) , = 0 be the true image lines. The
lines ( n l , x) - 0 and (n2, x) priori covariance matrices. The that n l ~ n2. Let (n l, x) - 0 constraint to be imposed is (5.62)
~1 -- n2,
which has rank 2 because both sides are unit vectors 1~ If the two image lines are statistically independent, the first order solution of the optimization g - ( A n i , ~r[l"~l]-- A1"~1) ~- (A1"~2, ~-7[n2]-A?'$2) --+ min
(5.63)
under the linearized constraint An2 - An1 - n2 - n l, A n I e {~t,1}L~,
n~O,2 e {~2}L~,
(5.64)
is given by A1"~1- V[nl]W(lo,1 - n2), A~~2 -- V[?'~2]W(n2 - 1"~1),
(5.65)
where W is a (33)-matrix defined by 11 W - (V[ni]-~- V[~o,21):.
(5.66)
A realistic form of the correction is i~, - N [ n i - A n 1 ] -
- An2].
i[n2
(5.67)
The a posteriori covariance matrix of the estimate/~ is V[~] -- ? [ r t i ] -
?[19,11WV[rg1 ] - V[l"~llWg[TI,2]
(5.68)
=
Here, i = 1, 2,
V[ni] - P f t V [ n i ] P n ,
(5.69)
where Pr is the projection matrix along/~. The matrix I ~ is obtained by replacing V[ni] by V[ni], i - 1, 2, in eq. (5.66). The residual of J can be written as J
-
-
-
(5.70)
a~ constraint is degenerate. 11The rank-constrained generalized inverse (.)2 is used because the ranges of V[nl] and V[n2] are different from the ranges of V[nl] and V[n2]. Consequently, although V[nl] + V[n2] is a singular matrix of rank 2, the matrix V[nl] + V[n2] is generally nonsingular.
5.2.
Correction of Image Points and Image Lines
which is a
X2
147
variable with two degrees of freedom.
This fact provides a
coincidence test for image lines: the hypothesis that image lines (n l, x) - 0 and (n2, x) - 0 coincide with each other is rejected with significance level a ~ if
j > ,~2,a.
5.2.2
(5.71)
Optimal correction for incidence
A. S i m u l t a n e o u s correction Suppose image point x and image line (n, x) - 0 are, respectively, estimates of an image point p and an image line 1 that should be incident to each other in the absence of noise. Consider the problem of optimally correcting them so as to make them incident. In other words, we want to find A x and A n such that 9 - x - A x and ~ - n - A n satisfy
(~, ~) - 0.
(5.72)
The rank of this constraint is 1. Let V[x] and V[n] be the a priori covariance matrices of x and n, respectively. If the image point and the image line are statistically independent, the problem can be written as the optimization J - (Ax, V [ x ] - A x ) + (An, V [ n ] - A n ) -+ min
(5.73)
under the linearized constraint
(g. ~ )
+ (~. ~ n ) - (n. ~).
(5.74)
k • The first order solution is given by A~ -
(n, ~ ) v [ ~ ] ~
v[~]n) + (~. v[.~]~) '
(~. A,~ -
(n, ~ ) v [ n ] ~
(.. v[~]~) + (~. v [ ~ ] ~ )
(5.75)
A realistic form of the correction is - ~-
A~,
a-
N[n-
An].
(5.76)
The a posteriori covariance matrices of the corrected values & and ~ are V[~] - V[x] -
(V[~']'~)(V[~I'~)T (~, v[~]~) + (~, ~ [ ~ ] ~ ) '
v[,~] - ?In]
(r (~, v[~],~) + (~, ~ [ n ] ~ ) '
-
148
Chapter 5.
( r l ~(~r 1 T ~,v~x~,,_~n~,,
v[~,h]--
(~. v[~]~) + (~. ~[.]~)
Geometric Correction
= V[fi, &]T,
(5.77)
where V [ n ] - P n V[n]Pn.
(5.78)
The residual of J can be written as J -
(5.79)
(n'x)2
(g, V[~lg) + (~, ~[.]~)' which is a X2 variable with one degree of freedom. This fact provides an incidence test for an image point and an image line: the hypothesis that image point x and image line (n, x) - 0 are incident to each other is rejected with significance level a ~ if J > X2,a 9 (5.80)
B. Image point correction If the image line (n, x) - 0 is fixed, the linearized constraint is
(-,~)
• A ~ e {k)L.
- (-,~),
(5.81)
The first-order correction of the image point x is obtained by letting V[n] O in eqs. (5.75)"
~ = (~,~)v[~]~
(~, v [ ~ ] ~ ) "
(5.82)
The a posteriori covariance matrix of the corrected value ~ is
v[~]-
v[~]-
(v[~]')(v[~]~)~
(~. v[~].)
(5.83)
Since V[hc]n = 0, the rank of V[~] is 1; its null space is {k, n } L , which is orthogonal to the orientation m - N[k x n] of the space line (n, x) = 0. The residual
j_
(n, ~)~
(n, V[x]n)
(5.84)
is a X2 variable with one degree of freedom. Hence, and the incidence test given by eq. (5.80) can be applied. E x a m p l e 5.2 If each coordinate is perturbed independently by Gaussian noise of mean 0 and variance e2, the covariance matrix of A x is V[x] e2Pk . The optimal correction (5.82) reduces to
~
- - (~' ~ ) P k ~ 1 - (k, ~ ) ~
(5.s5)
5.2.
Correction of Image Points and Image Lines
149
X
^ x
m
Fig. 5.8. Optimal incidence correction for an image point. Consequently, x is displaced onto the image line ( n , x ) - 0 perpendicularly (Fig. 5.8). The a posteriori covariance matrix of the corrected value 5~ can be written as V[~] = e2....~T. (5.86) where m = X[k x n I is the orientation of the image line (n, x) = 0. The residual (5.84) can be written as
af -
~
(n'~):
-'T E 1 -(k,
n) 2"
(5.87)
Hence, an unbiased estimator of the variance e2 is obtained in the form ~2 __--
( n , X) 2
- (k,.)~"
(5.8s)
The value @thus estimated equals the distance between the image point and the image line (see eq. (4.11)). If the value e is given a priori, the incidence test takes the form ~2 E---~-
> ~1~,o.
(5.s9)
C. Image line correction
If the image point x = 0 is fixed, the linearized constraint is ( A n , x) - (n, x),
A n 6 {fi}LI .
(5.90)
The first-order correction of the image line (n, x) = 0 is obtained by letting V[x] = O in eqs. (5.75):
A,~ - (n, ~ ) v [ n ] ~ (~, v[n]~) "
(5.91)
The a posteriori covariance matrix of the corrected value ~ is
V[n]- ~tT[n]- (~/"[fl']X)(Y[f't]x)T
(5.92)
150
Chapter 5.
Geometric Correction
The rank of V[~] is 1; its null space is {~, X}L. The residual J -- (x, l}'[n]x)
(5.93)
is a X2 variable with one degree of freedom. Hence, the incidence test given by eq. (5.80) can be applied.
5.3
5.3.1
Correction of Space Points and Space Lines
Optimal correction for coincidence
A. Space points Two space points ?'1 and ?'2 that are supposed to coincide can be optimally corrected in the same way as in the case of image points. Let V[?'l] and V[?'2] be their respective a priori covariance matrices. Let r l and r2 be the true positions of ?'1 and ?'2, respectively. The constraint to be imposed is
r, - r2,
(5.94)
which has rank 3. If the two space points are statistically independent, the problem is finding A?'I and A?'2 such that J - (A?'I, V[?'1]-1A?'1) -~-(A?'2, V[?'2]-1A?'2) --+ min
(5.95)
under the linearized constraint
A?'2 -- A?'I -- ?'2 -- ?'1.
(5.96)
The first order solution is given by
A?'I-
V[?'1]W(?'1-?'2),
A?'2 -- V[?'2]W(?'2 - ?'1),
(5.97)
where W is a (33)-matrix defined by
W-
(V[?'l]-~- V[?'2]) -1 9
(5.98)
The a posteriori covariance matrix of the estimate ~ is
V [ ' r ' ] - V [ ? ' l ] - V[?'I]WV[?'I] -- V[?'I]WV[?'2] =
(5.99)
The residual of J can be written as
J-
(?'2-?'1,W(?'2-?'1)),
(5.100)
5.3.
Correction of Space Points and Space Lines
151
which is a X2 variable with three degrees of freedom. This fact provides a coincidence test for space points: the hypothesis that space points r l and ?'2 coincide with each other is rejected with significance level a% if
J >
(5.101)
E x a m p l e 5.3 If each coordinate is perturbed independently by Gaussian noise of mean 0 and variance e 2, the covariance matrices of A r l and At2 are V[rl] = V[r2] = e2I. The optimal estimate ~ is 1
~" -- ~'1 -- A / ' I -- /'2 -- A/'2 -- ~(7"1 ~t- ~'2).
(5.102)
The a posteriori covariance matrix of ~ is s
V[e]-
~-I.
(5.103)
The residual (5.100) can be written as
1
12
J -- ~E2 11~2 -- ~'1[ 9
(5.104)
Hence, an unbiased estimator of the variance e 2 is obtained in the form ~2_ 1 ~llr2 - rill 2.
(5.105)
The value ~ thus estimated equals 1/vf3 times the half-distance between the two space points. If the value e is given a priori, the coincidence test takes the form ~:2
X 2 ,a
J > -i-"
(5.106)
B. Space lines
The same analysis can be done n2. Let V[p 1 | n l] and V[p 2 | signs of {Pl, n l } and {P2, n2} /' X P l - - ~'1 and r x P2 - n2 imposed is
for two space lines r x Pl - n l and r x P2 = n2] be their a priori covariance matrices. The are chosen so that Pl ~ P2 and ~t 1 ~,~ ~t 2. Let be the true space lines. The constraint to be
P l -- P2,
~1 -- fi2,
(5.107)
which has rank 4 because the representations {Pl, fil } and {102, fi2 } have four degrees of freedom 12 (see Section 4.2.2). If the two space lines are statistically independent, the problem is finding Apl , An1, A p 2, and An2 such that J -
( / k p l 9 ATtl, ~r[p 1 9 n l l - ( / k p
I 9 ATt,1))
_
+ ( A p 2 | An2, V[p 2 | n : ] - ( A p 2 | An2)) ~ min 12This constraint is degenerate.
(5.108)
152
Chapter 5.
Geometric Correction
under the linearized constraint Ap 2 | An2 - Ap 1 | An1 - P2 | n2 - Pl | n l , A p l ~ A n l e {Pl 0 ~/,I, ~,1 O P l } f , 2_
(5.109)
Ap2 | An2 E {P2 @ fi2,fi2 @ P2}L" The first order solution is
~XV~ 9 ~X,~ - V[p~ ~ ,~]W(p~ 9 ,~ - p, 9 nl),
(5.110)
where W is a (66)-matrix defined by 13 W-(V[p
1
@rtl] "4-V[p2 O n 2 ] )
(5.111)
4 9
A realistic form of the correction is (lb)
_N•
PI--Apl nl --/knl
h
where the operation N•
N• ( a ) b
rt2 -- Art2
'
(5.112)
] is defined by
{ N[a | PN[a] b] ]-
p2--Ap2 )]
)I-N•
N[PN[b]a | b]
if"a">--"bl"
(5.113)
otherwise.
The a posteriori covariance matrix of the estimate lb @ h is W[i~ @ ~] - l$[pl @ rtl] - V[Pl @ n l]l~rV[Pl 9 n l]
(5.114) Here,
V[Pi | nil = Px~,.~ V[pi | n~]P~r
i = 1, 2,
(5.115)
where PX~.~ is the six-dimensional projection matrix onto N ' p.1_. n
(see
eq. (4.44:)). The matrix I ~ is obtained by replacing V[p i @ hi] by l~[p i @ hi], i - 1, 2, in eq. (15.111). The residual of J can be written as
(5.116) which is a X2 variable with four degrees of freedom. This fact provides a coincidence test for space lines" the hypothesis that space lines r x Pl = n l and r x P2 - n2 coincide with each other is rejected with significance level a% if J > X~,a" (5.117) 13The ranges of V[pl • nl] and V[p2 9 n2] are different from the ranges of l/[pl (~ nl] and 17[p2 @ n2]. Consequently, although IY[pl @ nl] + l?[p2 @ n2] is a singular matrix of rank 4, the matrix V[pl | nl] + V[p2 | he] is generally nonsingular.
5.3.
5.3.2
Correction of Space Points and Space Lines
153
Optimal correction for incidence
A. Simultaneous correction As in two dimensions, a space point r and a space line r • p - n can be optimally corrected so as to make them incident. Let V[r] and V[p | n] be their a priori covariance matrices. The problem is finding A t , Ap, and A n such that ~ = r - A t , p = p - Ap, and ~ = n - A n satisfy
e xp--ff,.
(5.118)
The rank of this constraint is 2 because the three component equations are algebraically dependent 14. If the space point and the space line are statistically independent, the problem can be written as the optimization J - (At,
~r[~,]--I A/')
-~- (Ap | An, fT[p | n ] - ( A p | A n ) ) -+ min
(5.119)
under the linearized constraint Ar
x
p+f"
x
Ap--An=r
x
p_n,
Ap | A n E {p | n, fi @ lb}~.
(5.120)
The first order solution is given by
=-(vH
x r,)w(,, x p - n),
Ap = (V[p] x r -
V[p, n l ) W ( r x p -
A n = ( V t n , p] x r -
V[n])W(r x p-
n), (5.121)
n),
where W is a (33)-matrix defined by 15 W-
(p x V[v] x p + v x V[p] x r -
)-
2S[v x V[p,n]l + V[n] 2 "
(5.122)
The symbol S[. ] denotes the symmetrization operator (see eqs. (2.205)). A realistic form of the correction is f--r-At,
h
n--An
]'
(5.123)
where the operator N• ]is defined by eq. (5.113). The a posteriori covariance matrices of the corrected values ?,/~, and h are
U p ] - V[,']- (V[r] x/,)W(p x V[,']), 14This constraint is degenerate. 15The rank-constrained generalized inverse ( . ) 2 is used because ( . ) is generally nonsingular if evaluated at the data values; it should be a singular matrix of rank 2 if evaluated at the true values.
154
Chapter 5.
Geometric Correction
v[v] - ?[v]- (?Iv] • e - ?[v,,~])W(e • ?Iv]- ?In, v]), V[Ib, ~t] - V[p, n] - (V[p] • ? - lY[p, n])I~(/" • V[p, n] - I7[n]) - V[/t, lb] 7-,
v[,~]- ? [ n l - (?[n,v] x e - ?[n])Cc(e • ?[v, n ] - ?[n]), v[~,/,]- (v[r] x/,)g,(~ x
?[v]- ~[n, v ] ) -
v[/,, ~]T,
V[O, ~t] = (V[r] x p)I~(O x V[p,n]- I Y [ n ] ) - V[~t, ?]T.
(5.124)
The matrices IY[p], ~Z[p, n], and V[n] are obtained as submatrices of
V[p | n] - P~'~e~V[p @ nlPx~e~.
(s.~2s)
The matrix I&r is obtained by replacing r, p, V[p], V[p, n], and V[n] b y / ' , lb, V[p], V[p, n], and V[n], respectively, in eq. (5.122). The residual of J can be written as j - (r • p - n, W ( r x p - n)), (5.126) which is a ~:2 variable with two degrees of freedom. This fact provides an incidence test for a space point and a space line: the hypothesis that space point r and space line r • p - n are incident to each other is rejected with significance level a ~ if j > ~2,a. (5.127) B. Space point correction
If the space line r • p - n is fixed, the linearized constraint is
Arxp=rxp-n.
(5.128)
The first order correction of the space point r is
~
= -(v[~] • v)w(~
• v-
n),
(s.129)
where W is a (33)-matrix given by
The a posteriori covm'iance matrix V[~] of the corrected value ~ is given in the form shown in eqs. (5.124), where lb and ~ are replaced by p and n, respectively. Matrix V[~] has rank 1; its null space is {n, rH}L in the {m, rH}-representation. The residual J is given in the form of eq. (5.126) and is a X2 variable with two degrees of freedom. Hence, the incidence test given by eq. (5.127) can be applied.
5.3.
Correction of Space Points and Space Lines
155
Ar r
O Fig. 5.9. Optimal incidence correction for a space point. E x a m p l e 5.4 If each coordinate is p e r t u r b e d independently by Gaussian noise of mean 0 and variance e2, the covariance m a t r i x of A r is V[r] - e2I. The optimal correction (5.129) reduces to
p
At-
x
(r x p - n ) ilPl12
-- P r o m -
?"H.
(5.131)
Consequently, r is displaced onto the space line r x p - n perpendicularly (Fig. 5.9). The a posteriori covariance matrix of the corrected value ~ has the form V [ ~ ] - e 2 m m -T-, (5.132) where m - N[p] is the orientation of the space line r x p - n. The residual (5.126) can be written as
J-
1
--~
I1~ x p - n i l ~ 1 ilpl12 = -~llPmr-
r.II 2.
(5.133)
Hence, an unbiased estimator of the variance e e is obtained in the form
~2
I1',' x
p-.,112
211pll2
12
1
= ~llPmr-
rill 9
(5.134)
The value ~ thus estimated equals 1 / v ~ times the distance between the space point and the space line (see eq. 4.49)). If the value e is given a priori, the incidence test takes the form ~2 X2,~ e-Y > ---~--.
(5.135)
C. Space line correction If the space point r is fixed, the linearized constraint is r x Ap-
An - v x p-
n,
156
Chapter 5. AV+n
Geometric Correction (5.136)
~ {p+ n,n +p}~.
The optimal correction of the space line r x p - n is Ap~.
V[p, n l ) W ( r
(V[p] x r -
- ((~ • v [ v , . ] ) ~
x p-
n), (5.137)
-),
- V[.l)W(~ • v-
where W is a (33)-matrix given by w
-
(\ ~ • v [ v ] • ~ - ~ • v [ v , . ] -
(~ • v [ v , . ] ) ~
+ vM
.
(5.138)
The a posteriori covariance matrices of the corrected values/5 and fi are given in the form shown in eqs. (5.124), where ~ is replaced by r. The residual is (5.139)
J - (," • v - - , w ( , - • v - -)).
This is a X2 variable with two degrees of freedom. Hence, the incidence test given by eq. (5.127) can be applied. 5.4
5.~.1
Correction
of Space
Planes
Optimal correction for coincidence
Two space planes (v,, p) = 0 and (v2, p) = 0 that are supposed to coincide can also be optimally corrected. Let Y[vl] and Y[v2] be their respective a priori covariance matrices. The signs of the 4-vectors Vl and v2 are chosen so that vl ~-- v2. Let (Pl, P) - 0 and (P2, P) - 0 be the true space planes. The constraint to be imposed is (5.140)
Vl -- V2,
which has rank 3 because both sides are unit vectors 16. If the two space planes are statistically independent, the problem is finding AVl and Av2 such that J - (AVl, ~r[vl]-lAvl)
-[- ( A v 2 , ~ r [ / 2 2 ] - l n v 2 )
--=+m i n
(5.141)
under the linearized constraint A v 2 -- A V l -- /22 -- V l ,
Av2 e {P2}L~. The first order solution is given by ~1 16This constraint is degenerate.
- V[~l]W(~l - ~),
(5.142)
5.4.
Correction of Space Planes
157
AI/2 -- V[I22]W(v 2 - Vl) ,
(5.143)
where W is a (44)-matrix defined by a7 (5.144) A realistic form of the correction is ~' = N[L,1 - A I / 1 ] - N[L,2 - A!21].
(5.145)
The a posteriori covariance matrix of the estimate s is Villi -
?[Vl]W?[Vl]-
?[Vl]W?[v2]
(5.146)
Here, -
i - 1, 2,
(5.147)
where P~,, is the four-dimensional projection matrix along u i. The matrix IrV is obtained by replacing V[vi] by Y[~'i], i - 1, 2, in eq. (5.144). The residual of J can be written as ^
J - - (V2 - V l , W ( v 2 - V l ) ) ,
(5.148)
which is a X2 variable with three degrees of freedom. This fact provides a coincidence test for space planes: the hypothesis that two space planes (Vl, p) = 0 and (L'2, p) = 0 coincide with each other is rejected with significance level a% if J > ~2,a. (5.149)
5.~.2
Optimal incidence with space points
A. S i m u l t a n e o u s c o r r e c t i o n
A space point p and a space plane (v, p) - 0 can be optimally corrected so as to make them incident. Let V[p] and V[v] be their respective a priori covariance matrices. The problem is finding A p and A v such that ~ = p - A p and P - L , - A v satisfy (P, h) - 0. (5.150) The rank of this constraint is 1. If the space point and the space plane are statistically independent, the problem can be written as the optimization J - (Ap, V [ p ] - A p ) + (Av, V [ v ] - A v ) --+ min
(5.151)
aTThe the ranges of V[ua] and V[u2] are different from the ranges of l)'[pa] and V[v2]. Consequently, although ~'[Vl]+ V[u2] is a singular matrix of rank 3, the matrix VIal]+ V[v2] is generally nonsingular.
158
Chapter 5.
Geometric Correction
under the linearized constraint
(~,,/,) + (~, ~p) = (~,, p), tg /
(5.152)
where ~ = (0, 0, 0, 1) T. The first order solution is given by
(~,, p)V[plv ~t, - (~, v[p]v) + (p, v[~,]p)' (~,, p)v[~,]p (v, V[plv) + (p, v[~,lp)
Alp
(5.153)
A realistic form of the correction is /5 - p -
Ap,
~, = N [ ~ , -
(5.154)
A~,].
The a posteriori covariance matrices of the corrected values/~ and 1) are
v[/,]- v [ d -
(v[p]~)(v[p]~) ~ (~,, v[p]~,) + (/,, ?[~,]/,)'
(?[,,,]/,)(?[,,].a) T v[~] = ~ [ ~ ] v[/,, ~,1 -
(~,, v[p]~,) + (/,, ?[~,]/,)'
(v[p]~,) (?[~,]/,)T (~,, v[p]~,) + (h, ~'[~,]h)
-
= v[~,,/,]T
.
(5.155)
Here,
r
p~v[~]p~,
(5.156)
where P/, is the four-dimensional projection matrix along ~,. The residual of J can be written as
.]
-
(u' P)2
(5.157)
(~', V[d~') + (h, ?[~']/')' which is a X2 variable with one degree of freedom. This fact provides an incidence test for a space point and a space plane: the hypothesis that space point p and space plane (u, p) - 0 are incident to each other is rejected with significance level a~0 if j > X2,a .
(5.158)
5.4.
Correction of Space Plmles
159
O Fig. 5.10. Optimal incidence correction for a space point. B. Space point correction
If the space plane (v, p) = O is fixed, the linearized constraint is
(~, Ap)
(~, p),
-
Ap
e {~}~.
(5.159)
The optimal correction of p is
Ap
(v,p)V[p]u
(5.160)
(~, v[p]~) "
-
The a posteriori covariance matrix of the corrected value/5 is V[~b]-
V[p]- (Y[P]v)(Y[P]V)-r (•, V[plv)
"
(5.161)
Since Viably = O, the rank of V[/~] is 2; its null space is {~, V}L. The residual j
_
(v, p)2
(~, v[p]~)
(5.162)
is a X2 variable with one degree of freedom. Hence, the incidence test given by eq. (5.158) can be applied. E x a m p l e 5.5 If each coordinate is perturbed independently by Gaussia~ noise of mean 0 rand variance e2, the covariance matrix of p is V[p] = e2P~ (=
~ I ~ 0). In the (n, d}-represent~tion, the optim~ correctio~ (5.160) reduces to
zx~ = ((n, ~) - d)n.
(5.~63)
Consequently, r is displaced onto the space plane (n, r) = d perpendicularly (Fig. 5.10). The a posteriori covariance matrix of the corrected value ? is V[/'] =
e2pn.
(5.164)
The residual (5.162) can be written as 1 j - j ( ( n , r) - d)2.
(5.165)
160
Chapter 5.
Geometric Correction
Hence, an unbiased estimator of the variance e2 is obtained in the form -
-
d)
(5.166)
The value ~ thus estimated equals the distance between the space plane and the space point (see eq. (4.68)). If the value e is given a priori, the incidence test takes the form ~2 e-~ > X~,a" (5.167) C. Space plane correction
If the space point p is fixed, the linearized constraint is (Av, p ) -
(v,p),
Av e {P}~.
(5.168)
The optimal correction of space plane (v, p) - 0 is (p, V[v]p) "
(5.169)
The a posteriori covariance matrix the corrected value D is V[~]- ?[~l-
(V[~IP)(?[~]P)~ (p, ~[~]p) .
(5.~70)
The rank of V[9] is 2; its null space is {/~, P}L. The residual
J -
P):
(5. 71)
is a X2 variable with one degree of freedom. Hence, the incidence test given by eq. (5.158) can be applied.
5.4.3
Optimal incidence with space lines
A. Simultaneous correction
A space line (r - rH) • m = 0 and a space plane (n, r) = d can be optimally corrected so as to make them incident. Let V[m (DrH] and V[n @d] be their respective a priori covariance matrices, and ( r - OH) • rh -- 0 and (fi, r) d their true equations. The constraint is (~, ,~) - 0,
(~, ~ , ) - d,
(5.172)
which has rank 2. If the space line and the space plane are statistically independent, the problem is finding A m , Art/, An, and Ad such that _
J - (Am + ArH, Vim + rHI-(Am + Art/)) + ( A n | Ad, V[n 9 d]- ( A n | Ad)) -+ min _
(5.173)
5.49
Correction of Space Planes
161
under the linearized constraint
(An,~,)+(~,Am)=(n,m),
(~.. e.) + (~. ~ . ) - ~d = (....) - d. • A m | ArH e { m | 0, OH | vh }L,
An e {n}f.
(5.174)
The first order solution is given by
vimlo
Am
V[rH,m]Tn
Ad
--
(m, V[n, d])
v[...l.
(.. ...) - d
(rH, V[n, d]) - V[d]
W
'
(n, rH) - d ' (5.175)
where W is a (22)-matrix defined by
(.. vim].) + (m. v[.]m) w =
(.. vim. ~.1.) + (~.. v [ . ] m ) (n. V[~?l.. 7".]n) -}- (771..V[n]~'.) - (~11..V[n. d]) ~ -1 (n,
V[rH]n) +
(vii, V[nIrH ) -- (rH, V[n, d])
)
9 (5.176)
A realistic form of the correction is
vh = N [ m -
Am],
if, : N [ n -
An],
d = d - Ad.
(5.177)
The a posteriori covariance matrices of the corrected values rh,/'H, fi, and d
are
v[.~l
v[.~. #~]
#[m]
#[m. ~1
v[+~,m] v[+~])= (~[~,m] ~[~] ) ~[m..'~/].~ ) v[r.].~
viii
(v[d,,~]
v[~. d] vial]
~[.] #[n. d] ~[d. nl V[dl ^ ) _ ( ~[.].~ ~[.]~ - ~[.,d] (.h. e[.. dl) (~.. e[-. dl/- ~[d] ) W ( ?[-l.h e[n]~H-'~[-.dl (.h. e[.. dl) (+.. r d]) - V[~I )
)=(
162
Chapter 5.
v[.~. ,~] ( v[e..a]
v[.~. d] v[~. , d] ) - - (
?[m]a
?[rH, m] .~
i~r ( ?[n]~q~
Geometric Correction
?[m.~.]a
?[rH]/~ )
?[nigH -- ?[n, d]
(rh, ?In, d]) (/~g,?[n, d]) - ?[4 )
, (5.178)
where V[m], V[m, rH], etc. are computed as submatrices of V[m | r , ]
- Px~e~ V[m |
lY[n | d] - ( P n | 1)V[n | Here, P x ~ e ~
rg]Px~e~,
d](Pr~ |
1).
(5.179) • N,i~**. (se~ m, rH, n, V[m],
is the six-dimensional projection matrix onto
eq. (4.40)). The matrix I ~ is obtained by replacing Vim, rH], etc. by rh, rH,/~, ?[m], ? I r a , rH], etc., respectively, in eq. (5.176). The residual of J can be written as J -- ~TV "(11) (1"~,m ) 2 +212V(12) (n, m)((n,
rH)-d)+IV (22)((n, rH)--d) 2,
(5.180)
which is a X2 variable with two degrees of freedom. This fact provides an incidence test for a space line and a space plane: the hypothesis that space line ( r - r H ) • m -- 0 and space plane are incident to each other is rejected with significance level a% if J > X~,a. (5.181) B. Space line correction
If the space plane (n, r) = d is fixed, the linearized constraint is (n, A m ) = (n, m),
(n, Art/) = (n,
rH) -- d,
A m G A~'H e {?~ O 0, ~H O m } f .
(5.182)
The optimal correction of the space line (r - r H ) X m = 0 is
~.
v[~.. m] ~ .
v[~.l.
(~. ~.) - d
'
where W is a (22)-matrix defined by
w-
(n,V[m]n) ((,vim, ~.1~)
(n, V[m, rH]n) )-1 (~. v[~.].) 9
(5.~84)
The a posteriori covariance matrices of the corrected values ~h and rH are given in the form shown in eqs.^(5.178), where/~ and d are replaced by n and d, respectively. The residual J is given in the form of eq. (5.178) and is a X2 variable with two degrees of freedom. Hence, the incidence test given by eq. (5.181) can be applied.
5.5.
Orthogonality Correction
163
C. Space plane correction
If the space line (r - r H ) • m - 0 is fixed, the linearized constraint is (Art, m ) = (n, m),
(An, r g ) -- Ad = (n, rH) -- d, _k A n e {n}L.
(5.185)
The optimal correction of the space plane (n, r) = d is
Ad
=
(m, V[n, d])
(r, V[n, dl) - V[d]
W
(n, rH) - d
' (5.186)
where W is a (22)-matrix defined by
W -
(m, V[n]m) ( ~ , Via]m)
(m, v[,~]rH) - (m, v [ . , d]) ( ~ , v[n]~H) - (~, v[n, d])
-i
)
(5.187)
The a posteriori covariance matrices of the corrected values fi and d are given in the form shown in eqs. (5.178), where ~'H and ~h are replaced by r H and m , respectively. The residual J is given in the form of eq. (5.178) and is a X2 variable with two degrees of freedom. Hence, the incidence test given by eq. (5.181) can be applied.
5.5
Orthogonality Correction
5.5.1
Correction of two orientations
A. Simultaneous correction
Let m l and m2 be unit 3-vectors that indicate orientations supposedly orthogonal. Let V[ml] and Vim2] be their respective a priori covariance matrices. In the presence of noise, m l and m2 are not exactly orthogonal. Consider the problem of optimally correcting them so as to make them orthogonal (Fig. 5.11). In other words, we want to find Am1 and Am2 such that m l = m l -- A m l a n d m 2 - m 2 - A m 2 satisfy (rhl, m2) - 0.
(5.188)
The rank of this constraint is 1. If the two orientations are statistically independent, the problem can be written as the optimization
J
-
(Am1,
V [ m l ] - A m 1) +
(Am2,
V[m2]-Am2) .-+ r a i n
under the linearized constraint
(~ml, m2)+ (rex, ~m2) - (ml, m2),
(5.189)
164
Chapter 5.
Geometric Correction
Am I
ml
m I
Am2 0
n-t2
Fig. 5.11. Orthogonality correction for two orientations.
A m l E {ml}L~,
Am2 E {?~2}~.
(5.19o)
The first order solution is given by
(ml,m2)V[ml]m2 A m i -- (m2,V[mi]~T/,2) ~- (ml, V[m2]mi)'
(mi,m2)V[m2]ml Am2 = (m2, V [ m i ] m 2 ) + (m~, V[m2]mi)"
(5.191)
A realistic form of the correction is
ml
-
-
N[ml
-
-
Am:],
vh2 - N[m2 - Am2].
(5.192)
The a posteriori covariance matrices of the corrected values vhl and ~h2 are
V[mi] -- ? [ m i ] - (T~2,V[ml]m2) 4- ( m l , V [ m 2 ] m l ) ' V[zh2] - ?[m2] - (zh2, V[ml]~~2) -~- (rex, V[/2]~/~l)' ( ? [ m l ]m2 ) ( ? [ m 2 l m i ) T
V[T~I, m21 -- --(m2, V[mi]m2) + (ziz~, V[m2]ziZl)"
(5.193)
Here, V[mi] - P v h V [ m i ] P v i z , ,
i-
1, 2,
(5.194)
where Pviz~ is the projection matrix along zizi. The residual of J can be written as
,] --
(mi, m2) 2 (~TI'2,~/[ml]m2) -}- (#/,1, IY[m2]ziZl)'
(5.195)
which is a )/2 variable with one degree of freedom. This fact provides an orthogonality test for two orientations" the hypothesis that the two orientations m l and m2 are orthogonal to each other is rejected with significance level a% if
J > x i,o.
(5.196)
5.5.
Orthogonality Correction
165
Am3 in3 ~ m3 Am, Am 2 Fig. 5.12. Orthogonality correction for three orientations. B. Correction of one orientation If m2 is fixed, the linearized constraint is (Am1, m2) - ( m l , m2),
A m l e {Vhl }L I.
(5.197)
The optimal correction of m l is =
(m2, U[ml]m2) .
(5.198)
The a posteriori covariance matrix of the corrected value vhl is
V[?~I]-- Y[ml]- (?[ml]m2)(?[ml]m2)T (m2, ? [ m i ] m 2 )
"
(5.199)
Since V[Vhl]m2 -- O, the rank of V[vh]] is 1; its null space is {Vhl,m2}L. The residual
j --
(ml, m2) 2 (m2, V[ml]m2)
(5.200)
is a X2 variable with one degree of freedom. Hence, the orthogonality test given by eq. (5.196) can be applied.
5.5.2
C o r r e c t i o n of three o r i e n t a t i o n s
A. Simultaneous correction The same procedure can be applied to three orientations. Let m l, m2, and m3 be unit 3-vectors that indicate three orientations supposedly orthogonal (Fig. 5.12). The problem is finding A m i such that m i = m i - A m i satisfies
(~'~,i, ~'j) "-- (~ij,
i, j = 1, 2, 3.
(5.201)
The rank of this constraint is 3. Let V[mi] be the a priori covariance matrix of mi. If the three orientations are statistically independent, the problem can be written as the optimization
3 J-
~-~(Ami, < d [ m i ] - A m i ) ~ min
i=1
(5.202)
166
Chapter 5.
Geometric Correction
under the linearized constraint (m3, Am2) + (m2, Am3) = (m2, m3), (rhl, Am3)+ (rn3, Am1) -- ( m 3 , m l ) , (rh2, Am1)+ (ml,Am2) = ( m l , m 2 ) , mi e {rhi}Ll,
i = 1, 2, 3.
(5.203)
The first order solution is given by
( mx) ( o Vmlm3VEmxlm) 2 (,m2,m3,) Am2
--
Am3
V[m2]m3
0
V[m2]m I
V[m3lm2
W[m3lml
0
W
(m3,ml)
,
(ml,m2) (5.204)
where W is a (33)-matrix defined by W
/(o m3m2) ( viral] v[m~] m3 m2
0 ml
ml 0
) v[~]
(
0 m3 m2
m3 0 ml
m2 ml 0
))
(5.205)
A realistic form of the correction is i = 1,2,3.
~'1,i - N [ m i - Ami],
(5.206)
The a posteriori covariance matrices of the corrected values rhi are
V[ml] V['r~2, 'r~,l] V["~3, ~'Y/'I]
V[~,I, vh2] V[rhl, m3] ) V ['rll,2] V["~3, '/~2]
V['rh2, 'r?t,3] V['/~3]
o --
( V[ml]
f'[m~]ma ?[ml]~t2 )
~'[,ml)
V[m2l'rn3
0 V[m2]~'~t1 0 l}'[m3lrh2 ~Z[m3]'m 1 0
V[ml]~,3
?[m~],~ o ~[m~],~ ~[,m],~ where
f'[m~]
-
f~[m~]-Pm,V[,,~,lP,u,,
?[ml]~'h2 ) V[m2]~,l
,
(5.207)
0
i = 1,2,3.
(5.208)
5.5.
Orthogonality Correction
167
V[mi]
The matrix I ~ is obtained by replacing m~ and by ~h~ and IY[mi], respectively, in eq. (5.207). The residual of J can be written as - (
(m3, m l )
,W
(ml,m )
(m3, m l )
),
(5.209)
(ml,m )
which is a ~2 variable with three degrees of freedom. This fact provides an for three orientations: the hypothesis that the three orientations m i , i - 1, 2, 3, are orthogonal to each other is rejected with significance level a% if
orthogonalitytest
J > X ,a. B.
Correction
of one
(5.e10)
orientation
If m l and m2 are fixed in such a way that ( m l, m2) - 0, the rank of the constraint decreases to 2, and the linearized constraint is (ml,Am3)-
( m l , m3),
(m2, A m 3 ) -
(m2,m3),
(5.211)
e The optimal correction of m3 is
Am3-V[m3](ml'm2)W((me'm3))(m2,m3),
(5.212)
where W is a (22)-matrix defined by
W_ ((ml,V[m3]ml) (ml,V[m3]m2))-1 (m2, V [ m 3 ] m l )
(m2,V[m3]m2)
.
(5.213)
It is evident from the underlying geometry that ~4~3 - -~-ml x m2
(5.214)
is the exact solution if the sign is appropriately chosen. Hence, its covariance is V[,~I] = O. (5.215) The residual can be written as ,]
--
w ( l l ) ( m l , m3)2 +
2I~(12)(ml,m3)(m2,m3)+W(22)(m2,m3)2, (5.2 6)
which is a X2 variable with two degrees of freedom. Here, the matrix I~r = (l~ (kl)) is obtained by replacing m l , m2, and Vim3] by ~hl, ~h2, and Vim3], respectively, in eq. (5.213). The orthogonality test takes the form
j >
(5.217)
168
Chapter 5.
Geometric Correction
C. C o r r e c t i o n o f t w o o r i e n t a t i o n s
If m3 is fixed, the rank of the constraint is 3, and the linearized constraint is
(m3, Aml)--(m3, ml),
(m3, Am2) -- (m3, m2),
( A m l , m 2 ) + ( m l , A m 2 ) -- ( m l , m 2 ) ,
AT/%I E {Thl}t ,
Am2 6 {~2}L~.
(5.218)
The optimal correction of m l and m2 is given by
( iml ) _ ( V[~l,1]m3 ~m:
o
O u[m:].%
V[ml]m2 ) W
((m3,~D,1))
V[m~]m~
(m~,m~)
'
(5.2~9)
where W is a (33)-matrix defined by
W = ( ( m 3 , V[ml]m3)
\
(m3,V[m2]m3)
(ml,V[m2lm3) (m3,V[ml]m2) ) -1 (lYt,3,V[m2]m l ) (m2, V[ml]m2) + (ml, V[m2]ml)
(m2, V[~I]TYt3)
The a posteriori covariance matrices of the corrected values
(5.220)
m l and vh2 are
V[~rrt,1]-- Vii1] - w(ll)(i~,r[ml]m3)(~,r[ml]m3)T,
u[,~] VITal,m2]--
-
?Imp] w(~)(?[.~].~)(?[.~]m~) ~, -
-w(i2)(V[ml]m3)(?[m2]m3) T -
U[m2,~l] T,
(5.221)
where the matrix W is obtained by replacing m i and V[mi] by vhi and ? [ m i ] , respectively, in eq. (5.220). The residual can be written as J - Iiv(ll) (m3, m i ) 2 +
21~(12)(m3, mi)(m3,m2)+ Wi22)(m3,m1) 2,
(5.222) which is a ~(2 variable with two degrees of freedom. Hence, the orthogonality test given by eq. (5.217) can be applied.
5.6
Conic Incidence C o r r e c t i o n
Consider a conic (x, Qx) - 0 (see eq. (4.80)). Let x be an image point not on conic (x, Qx) = O. We consider the problem of optimally correcting x so as to make it incident to the conic (x, Qx) = O. In other words, we want to find A x such that 5~ = x - Ax satisfies (~, Q $ ) = 0.
(5.223)
5.6.
Conic Incidence Correction
169
The rank of this constraint is 1. Let V[x] be the a priori covariance matrix of x. The problem can be written as the optimization J-
(Ax, V [ x ] - A x ) ~ min
(5.224)
under the linearized constraint 1
( ~ , Q~) - 5(~, Q~),
(5.225)
The first order solution is given by ~,~
-
(x Q x ) V [ x ] Q x ' 2(x, QV[x]Qx) "
(5.226)
If we put n - N[Qx], eq. (5.226) can be written as
~_
(~,~)v[~]~ 2(., v[~]~)
(5.227)
This problem can be viewed as imposing the incidence constraint on the image point x and its polar (n, x) = 0 with respect to the conic (x, Q x ) = 0 (see eq. (4.82)). The difference in the factor 2 (see eq. (5.82)) is due to the fact that as the image point x approaches, its polar (n, x) - 0 also approaches its pole x by the same distance. The a posteriori covariance matrix of the corrected position :b is
v i i i - v[~]- (v[~]a)(v[~la)~ (a, via]a) '
(5.22s)
i-~ = N[Q~,].
(5.229)
where Eq. (5.228) has the same form as eq. (5.83). Hence, the rank of V[5~] is 1; its null space is {k, n } i , which is orthogonal to the orientation vh - N[k • nl of the polar (~, x) - 0. The residual of J can be written as
j -
(~, Q~)~ 4(~,,QV[x]Q~,)'
(5.230)
which is a X2 variable with one degree of freedom. This fact provides a conic incidence test: the hypothesis that image point x is on conic (x, Q x ) = 0 is rejected with significance level a ~ if
j > ~2,a.
(5.231)
170
Chapter 5.
Geometric Correction
X
Fig. 5.13. Optimal incidence correction for an image point. E x a m p l e 5.6 If each coordinate is perturbed independently by Gaussian noise of mean 0 and variance e 2, the covariance matrix of x is V[x] = e2Pk . The optimal correction (5.226) reduces to
A x - (x, Q X ) P k Q x . 2[[PkQx[[2
(5.232)
Consequently, x is displaced onto the conic perpendicularly (Fig. 5.13). The a posteriori covariance matrix of the corrected value 5~ is
V[5~]- E2ff~vhT,
(5.233)
where rh - N[k x r is the orientation of the polar (r x) = 0. The residual (5.230) can be written as
j-
1 (~, Q~)~ 4e 2 [[pkQS~[[2.
(5.234)
Hence, an unbiased estimator of the variance e2 is obtained in the form
~ = 1 (~, Q:~):
411PkQSc[I 2"
(5.235)
The value ~ thus estimated equals half the distance between the image point and its polar with respect to the conic. If the value e is given a priori, the conic incidence test takes the form
~:2 E--~ > ~2,a.
(5.236)
Chapter 6 3-D Computation by Stereo Vision As seen in the preceding chapter, the covariance matrix plays a fundamental role in any type of optimization. In the beginning, the a priori covariance matrices of raw data are determined by the characteristics of the sensing device. Then, the raw data define geometric objects, and the resulting geometric objects in turn define another class of geometric objects, which define geometric objects in a higher level and so on. In this process, the error characteristics can be traced bottom up: the covariance matrix of an object is computed from the covariance matrices of the objects in the lower hierarchy. In this chapter, we do such an analysis for stereo vision. We first study the epipolar constraint of a stereo system and derive explicit expressions for optimal 3-D reconstruction of points and lines. Then, the error behavior of the reconstructed space points and space lines is analyzed. We also derive expressions for optimal back projection of image points and image lines onto a space plane and analyze the behavior of the errors involved. Finally, we evaluate the effect of camera calibration errors.
6.1
6.1.1
Epipolar
Constraint
Camera imaging geometry
Stereo vision is a m e a n s of r e c o n s t r u c t i n g 3-D s t r u c t u r e s from two-dimensional images by t r i a n g u l a t i o n using two c a m e r a s 1 . To do this, t h e c a m e r a g e o m e t r y m u s t be known, since 3-D s t r u c t u r e s are r e c o n s t r u c t e d by inverting the imaging process. As discussed in Section 4.1.1, the c a m e r a g e o m e t r y is modeled as perspective projection: a space point P is p r o j e c t e d onto the intersection p of the image plane with the line of sight t h a t s t a r t s from the center of the lens and passes t h r o u g h P (see Fig. 4.1). T h e center of the lens is called the viewpoint, and the distance f from the viewpoint to the image plane is called the focal length. Define an X Y Z c o o r d i n a t e s y s t e m by identifying the origin O with the viewpoint and t a k i n g the Z-axis along the optical axis of the lens. T h e unit of length is scaled so t h a t the focal length f is unity. First-order analysis based on covariance matrices is justified if image noise is very small as c o m p a r e d with the focal length, which is usually the case. Given two cameras, define an X Y Z c o o r d i n a t e s y s t e m for the first camera, and an X I Y ~ Z ~ c o o r d i n a t e s y s t e m for the second (Fig. 6.1). T h e two cameras 1More than two cameras can be used to enhance the reliability of the reconstruction. If the number of cameras needs to be specified, such terms as binocular stereo vision and trinocular stereo vision are used. 171
172
Chapter 6.
3-D Computation by Stereo Vision P
X
,
1
""',,
~ O
X'
Z'
Y' O'
Y
Fig. 6.1. Geometry of stereo vision. are assumed to have the same focal length. Let h represent the origin O ~ of the X ~ Y t Z ~ coordinate system with respect to the X Y Z coordinate system; we call it the base-line vector. Let R be the rotation matrix that describes the orientation of the X~Y ~Z ~ coordinate system relative to the X Y Z coordinate system. The relative configuration of the two cameras is specified by the pair {h, R}, which we call the motion parameters (see Section 4.6.1). Consider a space point which has coordinates (X, Y, Z) with respect to the X Y Z coordinate system. Let (X ~, Y~, Z ~) be the coordinates of the same space point with respect to the X~Y ~Z ~ coordinate system. If we put r - (X, Y, Z) 1and r ' = ( X ' , Y ' , Z ' ) -7, the following relationship holds (see eq. (4.116)): r - h + Rr'.
6.1.2
(6.1)
Epipolar equation
Let p be the perspective projection of a space point P onto the image plane of the first camera, and pl that for the second camera. Evidently, the three vectors Op, O 0 ~, and O~p~ must be coplanar (Fig. 6.1). Hence,
(6.2)
Idp, oh,, orp, I - o,
where I', "," I den~ the scalar triple product (see eq. (2.32)). Eq. (6.2) is known as the epipolar constraint; the plane defined by alp, O~9', and Orp' is called the epipolar plane. Let (x,y) be the image coordinates of p, and (x', y') those of p'. The image points p and p~ are respectively represented by the following 3-vectors (see eq. (4.1))" x --
,
x' -
(X) y' 1
.
(6.3)
Since the X~Y~Z! coordinate system is rotated by R relative to the X Y Z coordinate system, vector O~p~ is represented by R x ~ with respect to the
6.1.
Epipolar Constraint
173
P
n = N[h •
(n,x) =0
0
x Fig. 6.2. Epipole and epipolar.
X Y Z coordinate system. Noting that Op = x and 0 0 ' - h, we can express eq. (6.3) with respect to the X Y Z coordinate system in the form Ix, h, Rx'] = O,
(6.4)
which we call the epipolar equation. The epipolar equation provides a strong clue to automatically detecting point-to-point correspondences between the two images. Consider the image plane of the first camera. For a fixed value of x', eq. (6.4) defines an image line if x is regarded as a variable. This line is called the epipolar of x I. Eq. (6.4) states that the epipolar of x' passes through x. Let pe be the image point represented by h xe = (k, h ) ' (6.5) where k = (0, 0, 1)7-. This point is called the epipole 2 of the first image plane (Fig. 6.2). Since Ix~, h, R x ' I = 0 holds irrespective of the value of x', all
epipolars pass through the epipoIe. Consider the image plane of the second camera. For a fixed value of x, eq. (6.4) defines an image line if x' is regarded as a variable. This line is called the epipolar of x. Eq. (6.4) states that the epipolar of x passes through x ~. Let Pie be the image point represented by
x~
Rr h (k, R_Ch)
(6.6)
This point is also called the epipole of the second image plane. Since Ix, h, P~'el - 0 holds irrespective of the value of x, all epipolars passes through
the epipole. The above observations are summarized as follows: 9 The image point p that corresponds to an image point p' in the other image is located on the epipolar of pl, and vice versa. 2In a real camera, the size of the image frame is finite, so the epipole pe may be located outside the image frame, or it can be an ideal image point located at infinity. In the following analysis, the image plane is assumed to be infinitely extended.
174
Chapter 6.
3-D Computation by Stereo Vision Z
X
**.-'~
P
,,""" Z
0
Y
P
x
-~ O'
\
jy'
Fig. 6.3. Parallel stereo system. 9 The epipolar in each image is determined by the location of the corresponding point in the other image and the motion parameters {h,
R}.
9 In each image, epipolars are concurrent, all passing through the epipole. If we define the essential matrix G = h x R,
(6.7)
the epipolar equation (6.4) can be written as = 0.
(6.8)
Hence, the epipolar in the first image is an image line represented by n = ax'.
(n, x) = 0,
(6.9)
The epipolar in the second image is an image line represented by (n', x') - 0 ,
6.1.3
n'--GTx.
(6.10)
Parallel stereo system
Let us call a stereo system parallel if the optical axes of the two cameras are parallel and the base-line vector is perpendicular to them (Fig. 6.3). If the Y-axis is taken in the direction of the base-line vector, the motion parameters have the form h-
(o) (o 01) h 0
,
R
= I.
(6.11)
The essential matrix (6.7) reduces to
G-h
0 0 0 -1 0 0
.
(6.12)
6.2.
Optimal Correction of Correspondence
175
The epipolar equation (6.4) is simply
x-x'
=0,
(6.13)
which defines a line parallel to the y-axis on each image plane. The epipole is an ideal image point located at infinity in the direction of the base-line vector. The epipolar constraint simply states that corresponding points must have the same x coordinate. We now show that any stereo system can be regarded as a parallel stereo system by changing the camera coordinate systems. Consider a stereo system with motion parameters {h, R}. Let j - N [ h ] , and let k be an arbitrary unit vector orthogonal to j . If we define ~ - j x k, the set {i, ), k} is an orthonormal system. Define a new X Y Z coordinate system for the first camera by taking {i, ~, k} as the axis orientations. An image point represented by x with respect to the X Y Z coordinate system is now represented by 3 /~7- x -
,
(6.14)
where
h - (i, j, k). Similarly, define a new . ~ ' Y ' Z ' coordinate system for the second camera by taking the same axis orientations. An image point represented by x' with respect to the X ' Y ' Z ' coordinate system is now represented by I~TRx' -'
-
.
(6.16)
With respect to the X Y Z and X ' Y ' Z ' coordinate systems, the stereo system is parallel with motion parameters {llhllj, I}. 6.2
Optimal
6.2.1
Correction
of Correspondence
Correspondence detection and optimal correction
In order to do error analysis based on corresponding image points, we need to consider the way they are detected. Basically, there are two possibilities:
9 Edge-based method. We first detect corresponding edge segments. Then, point-to-point correspondence is established by computing the intersection of one edge segment with the epipolars of the pixels of the other edge segment. a Eq. (6.14) defines a two-dimensional projective transformation if the image plane is identified with the two-dimensional projective space by adding ideal image points. If R runs over all rotations and k is its third column, the set of the corresponding transformations is a subgroup of the two-dimensional projective transformations.
176
Chapter 6. ."-,..
f" ,, f
3-D Computation by Stereo Vision
,.
,,.
,*
O' O Fig. 6.4. point.
Corresponding image points are corrected so that they define a space
9 Feature matching method. We match feature points directly: for a small neighborhood of each feature point in one image, the other image is searched for a portion that has a maximum correlation with that neighborhood. Many types of correlation measure and searching strategy are conceivable. The edge-based method has the advantage that a large number of corresponding pairs can be obtained easily. However, computational difficulties may arise for those edge segments which are nearly parallel to the epipolars. Moreover, the motion parameters {h, R} must be known accurately for computing the epipolars. In contrast, the feature matching method does not require any knowledge of the motion parameters. If they are known, the epipolar constraint can be used for limiting the search region--only a small neighborhood of each epipolar needs to be searched. In Chapter 11, we will show that the motion parameters {h, R} can be computed if a sufficient number of corresponding feature points are detected. The feature matching method is also suitable if natural or artificial markers are involved or humans intervene by using cursors and mice. In this section, we assume that corresponding points are detected by the feature matching method. The case in which correspondence is given between straight edge segments will be studied in Section 6.4. If corresponding image points x and x' are detected by the feature matching method in the presence of image noise, they may not strictly satisfy the epipolar equation (6.4). This means that their lines of sight do not intersect in the scene. In order that a unique space point be determined, image points x and x' must be corrected so as to satisfy the epipolar equation exactly. In geometric terms, this means modifying x and x' so that their lines of sight meet in the scene (Fig. 6.4). This problem is equivalent to finding A x and A x ' such that ~ = x - A x and 5~' = x ' - A x ' satisfy the epipolar equation
(e, a e ' ) = 0.
(6.17)
The rank of this constraint is 1. Let V[x] and V[x'] be the a priori covariance matrices of x and x', respectively. If x and x' are statistically independent,
6.2.
177
Optimal Correction of Correspondence
the problem can be written as the optimization J-
(Ax,
V[x]-Ax)+ (Ax', V[x']-Ax') ~ min
(6.18)
under the linearized constraint (Ax,
Gx') + (x, GAx') - (x, Gx'), ~, ~'
e {k}~.
(6.19)
The first order solution is given as follows (see eq. (5.17))"
Gx')V[x]Gx' (x', GTV[x]Gx ') + (x, GV[x']G Tx)' (x,
AxAx'=
(x',
(x'Gx')V[x']GTx 9 GTV[x]Gx ') + (x, GV[x']G Tx)
(6.20)
Since the constraint (6.19) is obtained by a linear approximation, corrections x +-- x - Ax and x' +-- x ' - Ax' need to be iterated until (x, Gx') = 0 is sufficiently satisfied (see Section 5.1.3). The a posteriori covariance matrices of the corrected positions ~ - x - Ax and 5:' - x ' - A x ' are computed as follows (see eqs. (5.31)and (5.32)):
(V[~]G~') (V[~]G~') ~
V[~]- V[~]-
(Jc', GTV[x]GSc ') + (5:, GV[x']GTSc) '
v[~']- v[~']-
(v [,'] a ~ ~) (v [~'] a ~ ~) 9 (5:', GTV[x]GSc ')
+ (~c,GV[x']G TS)'
(V[x]GScl)(V[x']G T Sc) T
V[~, ~'] = -(~,, GTV[x]G~,) + (~, GV[x,IGT~) 6.2.2
= V[5:', 5:] q-.
(6.21)
Correspondence test and noise level estimation
The residual of the function J given in eq. (6.18) can be written as follows (see eq. (5.34))"
j --
(~' a~')~
(~', a~v[~]ae ') + (~, av[~']a~e)
.
(6.22)
This is a X2 variable with one degree of freedom if the noise is Gaussian (see Section 5.1.5). This fact provides a correspondence test: the hypothesis that image points x and x' correspond to each other is rejected with significance level a% if
J > ~,a.
(6.23)
178
Chapter 6.
3-D Computation by Stereo Vision
As discussed in Section 5.1.5, it is usually very difficult to estimate the absolute magnitude of image noise, but often its geometric characteristics (e.g., the degree of homogeneity/inhomogeneity and isotropy/anisotropy) can be relatively easily predicted. Let us assume that the covariance matrices V[x] and V[x'] can be written in the form V [ x ] - e2Vo[x],
V [ x ' ] - e2Vo[x'],
(6.24)
where matrices V0[x] and V0[x'] are known but the constant e2 is unknown. We call V0[x] and V0[x'] the normalized covariance matrices and e the noise level. From eq. (6.20), we see that the optimal correction is not affected if the covariance matrices Y[x] and V[x'] are replaced by the normalized covariance matrices V0[x] and V0[x'], respectively. The unknown noise level e can be estimated a posteriori as follows. If Y[x] and V[x'] in eq. (6.22) are replaced by V0[x] and V0[x'], respectively, the right-hand side is multiplied by 1/e 2. Since J is a X2 variable with one degree of freedom, an unbiased estimator of e2 is obtained in the following form (see eq. (5.38)):
-- (5:', GTV0[x]GS: ') + (5:, GVo[x']GTS:) Its expectation and variance are given as follows (see eqs. (5.39)):
E[~ 2] = E2,
V[~ 2] -- 2E4.
(6.26)
If the value e is given a priori, the X2 test (6.23) takes the following form (see eq. (5.40)): ~2 "~ ~" ~2,a. (6.27) E x a m p l e 6.1 If each coordinate is perturbed independently by Gaussian noise of mean 0 and variance e2, the covariance matrices of x and x' are V[x] - Y[x'] - e2Pk . The optimal correction (6.20) reduces to
(x, G x ' ) P k G x ' AT,
--
IIPkGT~II ~ § IIPkG~'II ~' Ax' =
(x'Gx')PkGTx 9 IlekGTxll 2 + IlekGx'll 2
The a posteriori covariance matrices (6.21) become
Viii- ~2 (Pk -
(PkG~')(PkC~')T) IIPkGT~ll2+llPkG~'ll 2 '
(6.28)
3-D Reconstruction of Points
6.3.
(PkGT'x')(PkGT~)T)
v[~']- r ( e k V[5c, 5c'] -
E2
-
179
IIPkGT~II 2+IIPkG~:'II 2
(PkGhCt )(PkGT yc)T
'
= V[5~', 5~1T.
(6.29)
IIPkGT~II 2 + IlekG~'ll 2 An unbiased estimator of the variance e2 is obtained in the form
~2 =
(z, 0~') 2 IIPkGT 5=II2 + IIPkG5='II 2
9
(6.30)
E x a m p l e 6.2 Consider the parallel stereo system described in Section 6.1.3 with the noise characteristics given in Example 6.1. Let (x,y) and (x',y') be the corresponding image points on the first and the second image planes, respectively. The optimal correction (6.28) reduces to
Jc-
(x + x')/2 ) y 1
^,
(x+y,x')/2 )
(6.31)
1
The residual (6.22) is simply
1 (xJ = ~-~=
x') ~
(6.32)
which gives an unbiased estimator of e2 in the form
~2
1 x,)2 =~(x-
(6.33)
The a posteriori covariance matrices (6.29) reduce to ( 1/2
v[~] = v[~'] = ~ ( ~/2
1
, V[~, ~'] - e2 0
0
0)
(6.34)
6.3
6.3.1
3-D Reconstruction
of Points
Depth reconstruction
Let & and &' be the corrected positions of the corresponding image points on the first and the second image planes, respectively. Since the epipolar equations (&, G&') = 0 is satisfied, there exist constants Z and Z' such that
Z& = h + Z'R&'.
(6.35)
180
Chapter 6.
3-D Computation by Stereo Vision i
i
i i
x~_..~f
!
!
i
Rx'
x
~
~
h
O'
0
0 ~
h
O'
(b)
(~)
Fig. 6.5. (a) The feature point is infinitely fax away. (b) The feature point is in the direction of the base-line vector h. We call Z and Z' the depths of x and x', respectively; they are the distances of the reconstructed space point P from the X Y plane of the first camera and from the X ' Y ' plane of the second camera, respectively (see eq. (6.1)). Taking the vector product with RS~' on both sides of eq. (6.35), we obtain Z#c x
RS~'- h
x
RS~'.
(6.36)
Taking the vector product with ~ on both sides of eq. (6.35), we obtain O = #c x h + Z'5: x RS:'.
(6.37)
From these, we obtain the depths Z and Z' in the following form 4" Z-
(h x R5~', ~ x R5~') ]l~xR~'ll 2 '
Z'-
(h x 5, ~ x R ~ ' ) II~xP~'l] ~ "
(6.38)
The corresponding space point is given by r-
Z~.
(6.39)
Eqs. (6.38) have a singularity: the depths Z and Z' cannot be determined when (and only when) 5: x R ~ ' - 0. This configuration occurs in two cases" 9 The space point P is infinitely far away from the two cameras (Fig. 6.5a). 9 The space point P and the two viewpoints O and O' are collinear, i.e., the space point P is projected onto the epipoles of the two image planes (Fig. 6.5b).
4The depths Z and Z' can be equivalently written in various different forms such as Z = llh x R~'II/II~ x R~'I] and Z' = l]h • ~[I/ll~ x R~'II. Among them, eqs. (6.38) are the most convenient. For one thing, this form makes sense even if the depth is negative (i.e., the object is behind the camera).
6.3.
3-D Reconstruction of Points
181
E x a m p l e 6.3 For the parallel stereo system described in Section 6.1.3, the space point r reconstructed from corresponding image points (x, y) and (x', y') has the depths h Z- Z'= (6.40) D' -
-
where D - y - y~ is called the disparity of the two image points.
6.3.2
Error behavior of reconstructed space points
The depths Z and Z ~ are defined in such a way that eq. (6.35) holds. If 5: and ^1 x are perturbed by AS~ and A~ ~, respectively, the resulting perturbations AZ and AZ ~ of Z and Z ~ satisfy to a first approximation
AZYc
AZ'RS~'- ZA~, + ZIRA~, I.
-
(6.41)
Taking the vector product with R ~ ' on both sides, we obtain
~z(n~') • ~- -z(n~') x ~
+ z'(n~') x n ~ ' .
(6.42)
The unit surface normal to the epipolar plane, on which 5:, h, and R ~ ' lie, is given by i ~ - X[h x 5]. (6.43) Taking the inner product with ft on both sides of eq. (6.42), we obtain AZ(/~, (RS~') x 5:) - - Z ( ~ , (RS~') x ASz) + Z'(~, (R~') x R A ~ ' ) .
(6.44)
If we define rh = / ~ x R~',
(6.45)
eq. (6.44) can be written as ~z
= - ('~' z~
- Z'RLx~') (rh, &)
"
(6.46)
Using the relationship (6.47) and noting that E[ASzA~ T] _ V[Sz], E[A~ASz 'T] -- V[5~, 5~'], and E[A~'AS~ 'T] = Y[5~'] (see eqs. (6.21)), we obtain the variance V[Z] - E[(AZ) 2] of Z from eq. (6.46) in the following form:
z~(m, v[~],~) V[Z]
-
2zz'(,~, v[~, ~']R~,~) + z'~(,~, RV[~']RT,~) I1~ x R~'II 2 (6.48)
182
Chapter 6.
3-D Computation by Stereo Vision
The covariance vector V[5~, Z l - E[A&AZ] has the form
v[~, z] - - ( z v [ ~ ] -
z'v[~, ~']R~),~ (,~,~)
(6.49)
The covariance matrix of the reconstructed space point r = Z& is given by
v[,] -
z:v[~] + 2zs[v[~, z]~ ~] + v [ z ] ~ ~,
(6.50)
where S[. ]is the symmetrization operator (see eqs. (2.205)). E x a m p l e 6.4 Consider the parallel stereo system described in Section 6.1.3 with the noise characteristics given in Example 6.1. Eqs. (6.48) and (6.49) reduce to 2e2Z 4 e2Z 2 V[Z] h~ , Y[~, Z] h j' (6.51) where j - (0, 1, 0)7-. Put ^
~* - ~(~ + ~'),
(6.52)
which represents the midpoint of & and &~ if the two image planes are identifled. The covariance matrix V[r] of the reconstructed space point r given by eq. (6.50) reduces to V [ r ] = e 2 Z( 2
P k + - D-Fx4 , x , T ) ,
(6.53)
where D - y-y~ is the disparity. It is geometrically evident that the error in r is proportional to e and Z. Eq. (6.53) also implies that the error in orientation x* is very large. If x* ~ k, for instance, the error is approximately isotropic around the Z-axis but its magnitude along the Z-axis is about 2/D times as large as in the direction orthogonal to it. If the focal length of the camera is 700 pixels and the disparity is 10 pixels, for example, we have 2/D -- 140. If the disparity is 1 pixel, we have 2/D - 1400. Thus, the uncertainty of the Z coordinate is very large as compared with that of the X and Y coordinates.
E x a m p l e 6.5 Fig. 6.6 are simulated stereo images (480 x 680 pixels with focal length f = 600 (pixels)) of a cylindrical grid. Gaussian noise of mean 0 and standard deviation a - 2 (pixels) is independently added to the x- and ycoordinates of each grid point, so the noise level is e = a / f = 1/300. However, the value of e is treated as unknown in the 3-D reconstruction computation and estimated a posteriori by using eq. (6.30) for each grid point. Fig. 6.7 shows the standard confidence regions (see Section 4.5.3) of the grid points computed from eq. (6.50).
6.3.
3-D Reconstruction of Points
183
Fig. 6.6. Simulated stereo images. E x a m p l e 6.6 Fig. 6.8 are real stereo images (512 x 512 pixels with focal length f - 1870 (pixels)) of a surface of a building. Fig. 6.9a shows a grid pattern defined by feature points (corners of the windows) extracted from the left image in Fig. 6.8. Assuming that the image noise is isotropic, we estimate the noise level by eq. (6.30). Fig. 6.9b displays the reconstructed grid and the two shapes that envelop the primary deviation pairs of the grid points (see Section 4.5.3). In this example, the base-line length Ilhll is very short as compared with the distance to the surface (approximately 1/16), so a very small error in camera calibration sensitively affects the computation, thereby reducing the reliability of 3-D reconstruction. Since the noise level is estimated from "the degree to which the epipolar equation is not satisfied", the error in the motion parameters is also treated as "image noise".
6.3.3
M a h a l a n o b i s distance in the scene
If a particular stereo system with known motion parameters {h, R} is fixed and particular characteristics of the image noise are assumed, eq. (6.50) can be viewed as defining an uncertainty field over the 3-D scene. Given an arbitrary space point r, its depths Z and Z ~ are computed by Z = (k, r),
Z' = (Rk, r - h).
(6.54)
The corresponding image points are given by 1
x - 2r,
x'-
1
~-TRT(r -- h).
(6.55)
By construction, the epipolar equation Ix, h, R x ~] = 0 is satisfied. Identifying x and x ~ with & and &~, we can compute the covariance matrix V[r] from eqs. (6.21), (6.48), (6.49), and (6.50). Hence, V[r] is a function s of r, and 5As we noted earlier, singularities exist along the base line OO ~. In reality, 3-D cannot be reconstructed in the part outside either of the views of the two cameras, including the
184
Chapter 6.
3-D Computation by Stereo Vision
Fig. 6.7. 3-D reconstruction and standard confidence regions of grid points.
Fig. 6.8. Real stereo images.
:-:v--,v7~Ti_,~ 1
(~) (b) Fig. 6.9. (a) Extracted feature pattern. (b) 3-D reconstruction (solid lines) and the two shapes that envelop the primary deviation pairs of grid points (dashed lines).
6.3.
3-D R e c o n s t r u c t i o n of Points
185
R2
x .......
x
o,/
............
4
2
........~....
y,
X
/
/
........................
1t
Z
y'
Fig. 6.10. Geometry of 3-D reconstruction by stereo. it defines the Mahalanobis distance 6 t h a t m e a s u r e s the degree of u n c e r t a i n t y in the scene; a p e r t u r b a t i o n in the s a m e M a h a l a n o b i s distance has the same probability density (see Section 3.3.4). T h e following geometric i n t e r p r e t a t i o n is given to this distance. An image plane can be identified with a two-dimensional C a r t e s i a n space 7~2, a n d the direct s u m x | can be viewed as a point in t h e four-dimensional direct sum C a r t e s i a n space ~ 4 _ 7~2 | 7~2. T h e set of all x | x ' t h a t satisfy the epipolar e q u a t i o n (6.4) defines a t h r e e - d i m e n s i o n a l manifold $ in ~ 4 . T h e 3-D r e c o n s t r u c t i o n by eqs. (6.38) can be t h o u g h t of as establishing a one-to-one and onto m a p p i n g from the manifold S to the scene 7~3. Fig. 6.10 schematically illustrates these relations, where the ellipses indicate the stand a r d confidence regions defined by the covariance matrices. T h e covariance m a t r i c e s V[x] and V[x'] define the M a h a l a n o b i s distances t h a t m e a s u r e the u n c e r t a i n t y of x and x ' in the two images. T h e optimization (6.18) m e a n s projecting each direct s u m point x | x ' E 7~4 = 7~2 x ~ 2 onto the "nearest point" & | 5~' in S m e a s u r e d in the M a h a l a n o b i s distance defined by the direct s u m covariance m a t r i x V[x] | Y[x']. Eqs. (6.20) describe this projection, and eqs. (6.21) define the s t a n d a r d confidence region of & @ &' in the tangent space Ta~e~,(S ) to S at 5~ | :b'. Eq. (6.50) can be viewed as defining the M a h a l a n o b i s distance in 7~3 by m a p p i n g 7 the M a h a l a n o b i s distance in $ in accordance with the 3-D r e c o n s t r u c t i o n equation r - Zh~; part behind the cameras, but mathematically the value of V[r] is defined everywhere except at singularities. 6This distance is generally Riemannian (non-Euclidean) and defines a nonzero Riemannian curvature in the scene. Hence, the 3-D scene can be regarded as a Riemannian space. 7Mathematically, this process of projecting the distance in T~4 onto 8 and mapping the distance in 8 to T~3 is defined by a procedure called pull-back of a tensor field.
186
Chapter 6.
3-D Computation by Stereo Vision
""...-~-_,
n r
....~
O "-...~ n
O' Fig. 6.11. 3-D reconstruction of a space line. the standard confidence region of :b | &~ in Tx~x, ($) is mapped to define the standard confidence region of r in 7r 3.
6.4
6.4.1
3-D
Reconstruction
of Lines
Line reconstruction
Image lines are also very important image features. As we noted in Section 6.2.1, they are usually detected as edge segments, i.e., sequences of pixels. If straight lines are fitted to them (the fitting procedure will be discussed in subsequent chapters), point-to-point correspondence is easily established once line-to-line correspondence is obtained: for a given image point p on one image line l, the corresponding image point p~ on the other image line l' is at the intersection s of the epipolar of p with l'. By construction, the corresponding image points p and p' satisfy the epipolar equation. Hence, 3-D can be reconstructed pointwise. However, if the two image lines are represented by (n, x) = 0 and (n', x') = 0, a space line is directly reconstructed from n and n " all we need to do is compute the intersection of the space plane defined by the viewpoint O and the image line 1 with the space plane defined by the viewpoint O' and the image line l' (Fig. 6.11). In other words, the point-to-point correspondence
need not be computed. Note that a space line r × p = n is perspectively projected onto the image plane of the first camera as an image line (n, x) = 0 (see eq. (4.130)). In order to invert this projection, we must be careful about the scaling of the vector n. Recall that we adopted the scale ||n|| = 1 in representing an image line (see eqs. (4.8)), while we adopted the scale ||p||² + ||n||² = 1 in representing a space line (see eqs. (4.34)). For the convenience of computation, let us temporarily

⁸As pointed out in Section 6.2.1, computational difficulties may arise in the intersection computation if the epipolar is nearly parallel to l'.
adopt the scale ||n|| = 1 for the {p, n}-representation of space lines⁹. Let the two corresponding image lines l and l' be represented by

(n, x) = 0,    (n', x') = 0.    (6.56)
The space plane defined by the first viewpoint O and image line l is

(n, r) = 0.    (6.57)
The space plane defined by the second viewpoint O' and image line l' is

(Rn', r − h) = 0.    (6.58)
The intersection of these two space planes defines a space line L. Let its equation be r × p = n. The vector p is given as follows (see eq. (4.72)):

p = (n × Rn') / (h, Rn').    (6.59)
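As a concrete illustration of eqs. (6.56)–(6.59), the following minimal NumPy sketch (our own; the function and variable names are not from the book) reconstructs the {p, n} representation of a space line from two corresponding image lines, including the rescaling to ||p||² + ||n||² = 1 described next.

```python
import numpy as np

def reconstruct_space_line(n, n_prime, h, R):
    """Intersect the plane (n, r) = 0 through O with the plane
    (Rn', r - h) = 0 through O' (eqs. (6.57), (6.58))."""
    n = n / np.linalg.norm(n)                 # scale ||n|| = 1
    n_prime = n_prime / np.linalg.norm(n_prime)
    Rn_prime = R @ n_prime
    denom = h @ Rn_prime                      # (h, Rn'); near zero if the two planes are nearly parallel
    p = np.cross(n, Rn_prime) / denom         # eq. (6.59)
    s = np.sqrt(p @ p + n @ n)                # rescale so that ||p||^2 + ||n||^2 = 1
    return p / s, n / s
```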
After p is computed, the vectors {p, n} are normalized to ||p||² + ||n||² = 1 if necessary.

6.4.2 Error behavior of reconstructed space lines
First, assume the scale ||n|| = ||n'|| = 1. If n and n' are perturbed by Δn and Δn', respectively, the vector p computed by eq. (6.59) is perturbed to a first approximation by

$$\Delta p = \frac{\Delta n \times Rn' + n \times R\Delta n'}{(h, Rn')} - \frac{(h, R\Delta n')\,n \times Rn'}{(h, Rn')^2}. \qquad (6.60)$$
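The following short sketch (our own illustration, not the book's code) evaluates eq. (6.60) directly; comparing its output with the exact recomputation of p for small perturbations is a quick way to validate the first-order formula.

```python
import numpy as np

def dp_first_order(n, dn, n_prime, dn_prime, h, R):
    """First-order perturbation of p caused by perturbations dn, dn' (eq. (6.60))."""
    Rnp, Rdnp = R @ n_prime, R @ dn_prime
    denom = h @ Rnp                                    # (h, Rn')
    return ((np.cross(dn, Rnp) + np.cross(n, Rdnp)) / denom
            - (h @ Rdnp) * np.cross(n, Rnp) / denom**2)
```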
Let V[n] and V[n'] be the covariance matrices of n and n', respectively. Assuming that the two image lines are obtained by separately processing the two images, we regard n and n' as independent random variables. The covariance matrices V[p] = E[ΔpΔpᵀ] and V[p, n] = E[ΔpΔnᵀ] are given as follows (see eq. (2.43)):

$$V[p] = \frac{1}{(h, Rn')^2}\Bigl( (Rn') \times V[n] \times (Rn') + (h, RV[n']R^{\top}h)\,pp^{\top} - 2S[\,n \times RV[n']R^{\top}h\,p^{\top}\,] + n \times RV[n']R^{\top} \times n \Bigr),$$

$$V[p, n] = -\frac{(Rn') \times V[n]}{(h, Rn')}. \qquad (6.61)$$

⁹If n is scaled to a unit vector, space lines that pass through the coordinate origin O cannot be represented in the form r × p = n. However, such space lines are "invisible" when viewed from O, so we need not consider them for the purpose of 3-D reconstruction.
Once the covariance matrices V[p], V[p, n] (= V[n, p]ᵀ), and V[n] are obtained for the vectors {p, n}, ||n|| = 1, the corresponding covariance matrices for the rescaled vectors {p̂, n̂}, ||p̂||² + ||n̂||² = 1, are computed as follows (see eqs. (3.18)):

$$\begin{pmatrix} V[\hat p] & V[\hat p, \hat n] \\ V[\hat n, \hat p] & V[\hat n] \end{pmatrix} = \frac{1}{\|p\|^2 + \|n\|^2}\, P_{\hat p \oplus \hat n} \begin{pmatrix} V[p] & V[p, n] \\ V[n, p] & V[n] \end{pmatrix} P_{\hat p \oplus \hat n}. \qquad (6.62)$$

Here, P_{p̂⊕n̂} is the (66)-projection matrix onto {p̂ ⊕ n̂}_L^⊥ (see eq. (4.48)).
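As a sketch of this rescaling (our own illustration; we assume the projection matrix of eq. (4.48) can be written as I₆ − (p̂⊕n̂)(p̂⊕n̂)ᵀ, i.e., projection onto {p̂ ⊕ n̂}_L^⊥):

```python
import numpy as np

def rescale_line_covariance(p, n, V_p, V_pn, V_n):
    """Propagate the {p, n} covariances to the normalized representation
    with ||p_hat||^2 + ||n_hat||^2 = 1 (eq. (6.62))."""
    s2 = p @ p + n @ n
    u = np.concatenate([p, n]) / np.sqrt(s2)      # unit 6-vector p_hat (+) n_hat
    P = np.eye(6) - np.outer(u, u)                # projection onto {p_hat (+) n_hat}^perp
    V = np.block([[V_p, V_pn], [V_pn.T, V_n]])    # 6x6 covariance of p (+) n
    V_hat = P @ V @ P / s2
    return u[:3], u[3:], V_hat                    # p_hat, n_hat, and their joint covariance
```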
6.5 Optimal Back Projection onto a Space Plane

6.5.1 Back projection of a point

A. Image transformation between the two images
If a feature point is known to be on a space plane Π whose equation is (n_Π, r) = d, a single image is sufficient to compute its 3-D position: all we need to do is back project the image point onto the space plane Π along its line of sight (see Section 4.5.1). In fact, let x be an image point on the first image plane. If the reconstructed space point r = Zx is on the space plane (n_Π, r) = d, we have (n_Π, Zx) = d. Hence, the depth Z is given by

Z = d / (n_Π, x),    (6.63)

and the space point is reconstructed in the position

r = dx / (n_Π, x).    (6.64)
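A minimal sketch of eqs. (6.63) and (6.64) (our own NumPy illustration; x is assumed to be in normalized image coordinates (x, y, 1)ᵀ):

```python
import numpy as np

def back_project_to_plane(x, n_pi, d):
    """Back project image point x = (x, y, 1)^T onto the plane (n_pi, r) = d
    along its line of sight (eqs. (6.63), (6.64))."""
    Z = d / (n_pi @ x)   # depth, eq. (6.63); ill-conditioned if the ray is nearly parallel to the plane
    return Z * x         # space point r = Zx, eq. (6.64)
```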
It follows that a pair of stereo images has redundant information. We can take advantage of this fact: the reliability of 3-D reconstruction can be enhanced by optimizing this redundancy. Suppose there exists no image noise, and consider a stereo system with motion parameters {h, R}. Let x be an image point in the first image. The corresponding image point x' in the second image satisfies Zx = h + Z'Rx' (see eqs. (6.1) and (6.35)). Hence,

$$x' = \frac{Z}{Z'}R^{\top}\Bigl(x - \frac{(n_\Pi, x)}{d}h\Bigr) = -\frac{Z}{Z'd}R^{\top}(hn_\Pi^{\top} - dI)\,x. \qquad (6.65)$$
It follows that the transformation between the two image planes can be written in the form

x' = kAx,    (6.66)
where

A = Rᵀ(hn_Πᵀ − dI).    (6.67)
The constant k is chosen so that (k, x') = 1 holds. An image transformation in the form of eq. (6.66) is called a (two-dimensional) projective transformation¹⁰ or collineation (see Sections 4.4.2 and 4.6.1). Note that eq. (6.66) implies the epipolar equation (6.4) and hence is a stronger condition than eq. (6.4). In fact, we see from eqs. (6.66) and (6.67) that

|x, h, Rx'| = k(x × h, RAx) = k(x × h, (hn_Πᵀ − dI)x) = k(x × h, (n_Π, x)h − dx) = 0.    (6.68)
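A short sketch of eqs. (6.66) and (6.67) (our own illustration, assuming normalized image coordinates whose third component is 1):

```python
import numpy as np

def plane_homography(h, R, n_pi, d):
    """Matrix A of eq. (6.67): transfers first-image points to the second image
    for feature points on the plane (n_pi, r) = d."""
    return R.T @ (np.outer(h, n_pi) - d * np.eye(3))

def transfer_point(A, x):
    """Apply x' = kAx (eq. (6.66)); k is fixed by requiring (k, x') = 1,
    i.e., the third component of x' is rescaled to 1."""
    xp = A @ x
    return xp / xp[2]
```

Transferring a point with A and comparing it with the actually observed x' yields the discrepancy vector x' × Ax used in the optimal correction of part B below.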
Example 6.7 Consider the parallel stereo system described in Section 6.1.3. If we write n_Π = (n_Π(1), n_Π(2), n_Π(3))ᵀ, the transformation matrix A has the form

$$A = \begin{pmatrix} -d & 0 & 0 \\ hn_\Pi(1) & hn_\Pi(2) - d & hn_\Pi(3) \\ 0 & 0 & -d \end{pmatrix}. \qquad (6.69)$$
It follows that the constant k in eq. (6.66) should be chosen to be k = −1/d. In image coordinates, the transformation given by eq. (6.66) has the following form:
$$\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ -hn_\Pi(1)/d & 1 - hn_\Pi(2)/d \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} 0 \\ -hn_\Pi(3)/d \end{pmatrix}. \qquad (6.70)$$

This is an affine transformation. It is easily seen that although the transformation between two images of a space plane is a projective transformation in general, it reduces to an affine transformation if and only if the two optical axes are parallel and the base-line vector is orthogonal to them, i.e., (k, h) = 0 and Rk = k. As pointed out in Section 6.1.3, any stereo system can be equivalently treated as a parallel stereo system by applying the projective transformations given by eqs. (6.14) and (6.16) to the two images. It follows that the transformation between two images of a space plane reduces to an affine transformation for any stereo system if an appropriate transformation is applied to each image.

¹⁰The set of all image transformations in the form of eq. (6.66) for an arbitrary nonsingular matrix A forms a group of two-dimensional projective transformations with respect to the composition operation. Note that eq. (4.126) can be written as Q = AᵀQ'A/d² in terms of the matrix A defined in eq. (6.67).
Fig. 6.12. Two corresponding image points are back projected onto a space plane.
B. Optimal correction

In the presence of noise, corresponding image points x and x' do not necessarily satisfy eq. (6.66). Hence, we optimally correct them. In geometric terms, this means modifying x and x' so that their lines of sight meet exactly on the space plane Π (Fig. 6.12). This problem is equivalent to finding Δx and Δx' such that x̂ = x − Δx and x̂' = x' − Δx' satisfy eq. (6.66), or equivalently¹¹

x̂' × Ax̂ = 0.    (6.71)
The rank of this constraint¹² is 2, since only two of the three component equations are algebraically independent¹³. Let V[x] and V[x'] be the a priori covariance matrices of x and x', respectively. As discussed in Section 5.1.1, the optimal correction is determined by the optimization
J = (Δx, V[x]⁻Δx) + (Δx', V[x']⁻Δx') → min    (6.72)
under the linearized constraint

x' × AΔx − (Ax) × Δx' = x' × Ax,    Δx, Δx' ∈ {k}_L^⊥.    (6.73)
The first order solution is given as follows (see eq. (5.17)):

Δx = (V[x]Aᵀ × x')W(x' × Ax),    Δx' = −(V[x'] × (Ax))W(x' × Ax).    (6.74)
¹¹If eq. (6.71) is satisfied, there exists a value k such that x̂' = kAx̂; it is chosen so that (k, x̂') = 1 holds. ¹²This constraint is degenerate (see Section 5.1.3). ¹³The third one can be obtained by multiplying the first and the second ones by −x̂' and −ŷ', respectively, and adding them together.
Here, W is a (33)-matrix defined as follows (see Section 5.1.3):

W = ( x' × AV[x]Aᵀ × x' + (Ax) × V[x'] × (Ax) )⁻₂.    (6.75)
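To make the notation concrete, here is a minimal sketch (our own, not the book's code) of a single correction step, written with explicit cross-product matrices so that the "u × M × v" products above become ordinary matrix products; as explained next, the step is iterated, re-evaluating W at the corrected values, until eq. (6.71) is satisfied to sufficient accuracy. The rank-2 generalized inverse is approximated here by a pseudoinverse.

```python
import numpy as np

def skew(v):
    """Cross-product matrix: skew(v) @ u == np.cross(v, u)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def correct_onto_plane(x, xp, A, V_x, V_xp):
    """One first-order correction step for eqs. (6.71)-(6.75): move x and x'
    toward points whose lines of sight meet on the plane encoded by A."""
    B = skew(xp) @ A           # coefficient of dx  in the linearized constraint (6.73)
    C = -skew(A @ x)           # coefficient of dx' in the linearized constraint (6.73)
    g = np.cross(xp, A @ x)    # discrepancy vector x' x Ax
    W = np.linalg.pinv(B @ V_x @ B.T + C @ V_xp @ C.T)  # stands in for the rank-2 inverse of eq. (6.75)
    lam = W @ g
    dx, dxp = V_x @ B.T @ lam, V_xp @ C.T @ lam
    # with V[x] = eps^2 * P_k, the third components of dx and dxp are automatically zero
    return x - dx, xp - dxp, g @ W @ g   # corrected points and the residual of eq. (6.77)
```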
Since the constraint (6.73) is obtained by a linear approximation, corrections x ← x − Δx and x' ← x' − Δx' need to be iterated until the constraint (6.71) is sufficiently satisfied (see Section 5.1.3). The a posteriori covariance matrices of the corrected positions x̂ and x̂' are given as follows (see eqs. (5.31) and (5.32)):
V[x̂] = V[x] − (V[x]Aᵀ × x̂')Ŵ(V[x]Aᵀ × x̂')ᵀ,
V[x̂'] = V[x'] − (V[x'] × (Ax̂))Ŵ(V[x'] × (Ax̂))ᵀ,
V[x̂, x̂'] = (V[x]Aᵀ × x̂')Ŵ(V[x'] × (Ax̂))ᵀ = V[x̂', x̂]ᵀ.    (6.76)
Here, the matrix Ŵ is obtained by replacing x and x' by x̂ and x̂', respectively, in eq. (6.75). The 3-D position r of the back projected point is determined by the depth Z given by eq. (6.63). Its covariance matrix V[r] is computed from eqs. (6.48), (6.49), and (6.50). Since r is constrained to be on Π, its covariance matrix V[r] is singular; the surface normal n_Π is its eigenvector for eigenvalue 0, and the remaining eigenvectors lie on Π.

C. Incidence test and noise level estimation
The residual of J given in eq. (6.72) can be written as follows (see eq. (5.34)):

Ĵ = (x' × Ax, Ŵ(x' × Ax)).    (6.77)
If the noise is Gaussian, the residual Ĵ is a χ² variable with two degrees of freedom (see Section 5.1.5). This fact provides an incidence test: the hypothesis that the space point defined by image points x and x' is on the space plane (n_Π, r) = d is rejected with significance level a% if

Ĵ > χ²_{2,a}.    (6.78)
Suppose the covariance matrices V[x] and V[x'] are expressed in terms of the normalized covariance matrices V₀[x] and V₀[x'] and the noise level ε in the form of eqs. (6.24). From eqs. (6.74) and (6.75), we see that the optimal correction is not affected if the covariance matrices V[x] and V[x'] are replaced by the normalized covariance matrices V₀[x] and V₀[x'], respectively. The unknown noise level ε can be estimated a posteriori in the form

ε̂² = (1/2)(x' × Ax, Ŵ₀(x' × Ax)),    (6.79)
where

Ŵ₀ = ( x̂' × AV₀[x]Aᵀ × x̂' + (Ax̂) × V₀[x'] × (Ax̂) )⁻₂.    (6.80)
2
The expectation and variance of ε̂² are given as follows (see eqs. (5.39)):

E[ε̂²] = ε²,    V[ε̂²] = ε⁴.    (6.81)
If the value ε is given a priori, the χ² test (6.78) takes the following form (see eq. (5.40)):

ε̂²/ε² > χ²_{2,a}/2.    (6.82)
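As a sketch of how this test can be run in practice (our own illustration; SciPy's chi-square quantile function stands in for the percentile χ²_{2,a}):

```python
from scipy.stats import chi2

def incidence_test(J, alpha=0.05):
    """Reject the hypothesis 'the point lies on the plane' at significance
    level alpha if the residual exceeds the chi-square percentile (eq. (6.78))."""
    return J > chi2.ppf(1.0 - alpha, df=2)

def incidence_test_known_noise(eps_hat_sq, eps, alpha=0.05):
    """Equivalent form (eq. (6.82)) when the noise level eps is known a priori."""
    return eps_hat_sq / eps**2 > chi2.ppf(1.0 - alpha, df=2) / 2.0
```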
Example 6.8 Consider the parallel stereo system described in Section 6.1.3 with the noise characteristics given in Example 6.1. Suppose a feature point is known to be on the space plane Z = d. Let (x, y) and (x', y') be its image coordinates on the first and the second image planes, respectively. According to eqs. (6.74) and (6.75), the two image points are respectively corrected into

x̂ = ( (x + x')/2, (y + y' + h/d)/2, 1 )ᵀ,    x̂' = ( (x + x')/2, (y + y' − h/d)/2, 1 )ᵀ.    (6.83)
The a posteriori covariance matrices (6.76) reduce to

V[x̂] = V[x̂'] = V[x̂, x̂'] = (ε²/2) P_k.    (6.84)
The residual (6.77) is

Ĵ = (1/(2ε²)) ( (x − x')² + (y − y' − h/d)² ).    (6.85)
Hence, an unbiased estimator of ε² is obtained in the form

ε̂² = (1/4) ( (x − x')² + (y − y' − h/d)² ).    (6.86)
The depths Z and Z' computed by eqs. (6.38) are simply

Z = Z' = d.    (6.87)
Eq. (6.84) reduces eqs. (6.48) and (6.49) to

V[Z] = 0,    V[x̂, Z] = 0.    (6.88)
From eq. (6.50), we see that the covariance matrix of the reconstructed space point r is

V[r] = (ε²d²/2) P_k,    (6.89)

meaning that errors in r are constrained to be on the space plane Z = d, on which the distribution is isotropic with variance ε²d²/2 in each orientation.
Fig. 6.13. Stereo images of a planar grid.
Fig. 6.14. (a) Simple back projection. (b) Optimal back projection.

Example 6.9 Fig. 6.13 shows two simulated stereo images (512 × 512 pixels with focal length f = 600 (pixels)) of a planar grid in the scene. The x and y coordinates of each grid point are independently perturbed by Gaussian noise of standard deviation σ = 5 (pixels), so the noise level is ε = σ/f = 1/120, which is treated as unknown in the subsequent computation and estimated a posteriori by using eq. (6.79). The equation of the space plane on which the grid lies is assumed to be known. Fig. 6.14a shows the back projected pattern computed by eq. (6.64) viewed from a different angle; Fig. 6.14b is the corresponding result obtained after the feature points are optimally corrected. We can clearly see that the correction enhances the accuracy of 3-D reconstruction.
6.5.2 Back projection of a line

Suppose we observe a space line which is known to lie on a space plane (n_Π, r) = d. Let l and l' be its stereo images; let (n, x) = 0 and (n', x') = 0 be their respective equations. The image point x' that corresponds to an image point x on l must be on l'. If there is no image noise, the corresponding image points x and x' are related by eq. (6.66). Hence, every x that satisfies (n, x)
Fig. 6.15. Two corresponding image lines are back projected onto a space plane.

= 0 must satisfy (n', kAx) = 0 or, equivalently, (Aᵀn', x) = 0, which implies¹⁴
n = k'Aᵀn',    (6.90)
where the constant k' is chosen so that ||k'Aᵀn'|| = 1 holds. In the presence of noise, corresponding image lines (n, x) = 0 and (n', x') = 0 do not necessarily satisfy eq. (6.90). Hence, we optimally correct n and n' so that eq. (6.90) is strictly satisfied. In geometric terms, this means modifying the image lines l and l' so that the space plane defined by the viewpoint O and l meets the space plane defined by the viewpoint O' and l' exactly on the space plane Π (Fig. 6.15). This problem is equivalent to finding Δn and Δn' such that n̂ = n − Δn and n̂' = n' − Δn' satisfy eq. (6.90), or equivalently

n̂ × Aᵀn̂' = 0.    (6.91)
As in the case of image points, this constraint¹⁵ has rank 2. Let V[n] and V[n'] be the a priori covariance matrices of n and n', respectively. The problem can be written as the optimization
J = (Δn, V[n]⁻Δn) + (Δn', V[n']⁻Δn') → min    (6.92)
under the linearized constraint - - A T n ' X A n + n x A T A n t __ n X A T n ' , e
n st
solution is given An--(V[n]
e
onows
•
(6.93)
(5. lr))-
x ( A T n ' ) ) W ( n x ATn'),
a4If we note that ( A T ) -1 -- ( A - l ) T (see eqs. (2.21)), we see from eq. (6.90) t h a t n' -- ( A - 1 ) T n / k '. This is also a projective transformation (or collineation) of image lines. This transformation is contragradient to eq. (6.66). This duality between image points and image lines is a f u n d a m e n t a l property of projective geometry. 15This constraint is degenerate (see Section 5.1.3).
6.6.
Scenes Infinitely Far Away
195
~'-(v[~']A x ~)W(~ x A ~ ' ) .
(6.94)
Here, W is a (33)-matrix defined as follows (see Section 5.1.3): W =
( ( A T n ') x V[n] x ( A T n t) + n x A T V [ n t ] A
Since the constraint (6.93) is obtained by n t-- n + A n and n t ~ n ~ + A n t need (6.91) is sufficiently satisfied (see Section matrices of the corrected values ~ and ht and (5.32)):
v[q = ~[q-
)-2 .
(6.95)
a linear approximation, corrections to be iterated until the constraint 5.1.3). The a posteriori covariance are given as follows (see eqs. (5.31)
(C[q x (A~g'))W(~[q x (A~g')) ~,
V[g']- V[n']- (V[n']A x V[g,g']-
x n
g)W(V[n']Ax g)T,
(V[n] x ( A T g ' ) ) W ( V [ n ' ] A x g ) T = V [ g , , g l T .
(6.96)
r
(6.97)
Here, we define
~[n]- P,~V[n]Pn,
P,~v[n']P,~.
The matrix ~ r is obtained by replacing n and n ~ by ~ and h~, respectively, in eq. (6.95). The space line reconstructed by eq. (6.59) lies exactly on the space plane II. Its covariance matrix V[p | n] is computed from eqs. (6.61). The residual of J can be written as follows (see eq. (5.34)): J-
(~ x ATfi ', ~Z(~ x A T ~ ' ) ) .
(6.98)
If noise is Gaussian, the residual J is a X2 variable with two degrees of freedom (see Section 5.1.5). This fact provides an incidence test: the hypothesis that the space line defined by image lines ( h i , X ) = 0 and ( n 2 , x ) = 0 is on the space plane (nn, r) = d is rejected with significance level a% if
J > )~2,a.
(6.99)
If the noise level in the images is unknown, it can be estimated a posteriori as in the case of image point back projection.
6.6 Scenes Infinitely Far Away

6.6.1 Space points infinitely far away
If x and x' are stereo images of a feature point that belongs to an object located practically infinitely far away (e.g., a mountain, a boat on the sea,
...................... iiiiii............... ax
.......iiiii..................... .....
............
......
............ii....................
y
Ax' .........
Fig. 6.16. Two corresponding image points are corrected so that their lines of sight become parallel.

or an airplane in the sky), then lines of sight of x and x' should be parallel. This condition can be written as
x = kRx'    (6.100)
for some constant k. This equation may not necessarily be satisfied exactly in the presence of noise (Fig. 6.16). Consider the problem of optimally correcting them: we want to find Δx and Δx' such that x̂ = x − Δx and x̂' = x' − Δx' satisfy eq. (6.100), or equivalently

x̂ × Rx̂' = 0.    (6.101)
Like eq. (6.71), this constraint¹⁶ has rank 2. Let V[x] and V[x'] be the a priori covariance matrices of x and x', respectively. As discussed in Section 5.1.1, the optimal correction is determined by the optimization

J = (Δx, V[x]⁻Δx) + (Δx', V[x']⁻Δx') → min    (6.102)
under the linearized constraint

x × RΔx' − (Rx') × Δx = x × Rx',    Δx, Δx' ∈ {k}_L^⊥.    (6.103)
The first order solution is given as follows (see eq. (5.17)): ~
= -(V[~l
~'-(v[~']R
• (R~'))w(~
• R~'),
~ x ~ ) w ( ~ x R~').
(6.104)
Here, W is a (33)-matrix defined as follows: w
=
(
(R~') • v[~] • (R~')+
~ • RV[~']R ~ • ~
16This constraint is degenerate (see Section 5.1.3).
)2
.
(6.105)
6.6.
Scenes Infinitely Far Away
197
Since the constraint (6.103) is obtained by a linear approximation, corrections x +-- x - A x ' and x' +---x ' - A x need to be iterated until the constraint (6.101) is sufficiently satisfied (see Section 5.1.3). The a posteriori covariance matrices of the corrected positions 5: and 5:' are given as follows (see eqs. (5.31) and (5.32)): viii-
•
• 9
•
9
•
V[Sz, Sc'] - (V[x] • ( R ~ c ' ) ) W ( V [ x ' ] R T • 5:)7- - V[Yc',Sc] q-.
(6.106)
The matrix TV is obtained by replacing x and x ~ by 5: and 5~~, respectively, in eq. (6.105). The residual of J given in eq. (6.102) can be written as follows (see eq. (5.34)):
J-
(x • R x ' , W ( x • R x ' ) ) .
(6.107)
If the noise is Gaussian, the residual j is a X2 variable with two degrees of freedom (see Section 5.1.5). This fact provides an infinity test: the hypothesis that the space point defined by image points x and x' is infinitely far away is rejected with significance level a% if Y > ~2,a.
(6.108)
In actual applications of stereo vision, feature points reconstructed to be very far away from the cameras are often disregarded as meaningless, but the judgement as to how far is far enough is usually done ad hoc. The above procedure provides a rigorous statistical criterion. Suppose the covariance matrices V[x] and V[x'] are expressed in terms of the noise level e and the normalized covariance matrices V0[x] and V0[x'] in the form of eqs. (6.24). From eqs. (6.104) and (6.105), we see that the optimal correction is not affected if the covariance matrices V[x] and V[x'] are replaced by the normalized covariance matrices V0[x] and V0[x'], respectively. The unknown noise level e can be estimated a posteriori in the form 42 - ~(x 1 • R x ' , I2r
• Rx')),
(6.109)
where
Wo - ((R:i:') • Vo[x]
• (Rh:') + ~ • R V o [ x ' ] R T •
The expectation and variance of 42 E[~ 21 -.- 62 '
are
5:)2.
(6.110)
given as follows (see eqs. (5.39)): V[~ 2] --- E4.
(6.111)
198
Chapter 6.
3-D Computation by Stereo Vision
If the value e is given a priori, the ;~z test (6.108) takes the following form (see eq. (5.40)): ~2 X2,a e--5- > --~-. (6.112) E x a m p l e 6.10 Consider the parallel stereo system described in Section 6.1.3 with the noise characteristics given in Example 6.1. Suppose a feature point is known to be infinitely far away. Let (x, y) and (x', y') be its image coordinates on the first and the second image planes, respectively. Eqs. (6.104) and (6.105) lead to
- ~'-
(tx+xl ) (y+y')/2 1
.
(6.113)
Namely, they are corrected into their "midpoint" if the two image planes are identified. The a posteriori covariance matrices (6.106) reduce to E2
v [ e ] - v [ e ' l - v[e, e ' ] - -~Pk.
(6.114)
The residual (6.107) is 1
j = ~
((x- x') ~ + ( u - u')~) 9
(6.115)
An unbiased estimator of e2 is obtained in the form ~2
6.6.2
1 =~((x-x') ~+(u- yt)2 ).
(6.116)
Space lines infinitely f a r away
Let 1 and I t be stereo images of a space line located infinitely far away. If ( n , x ) = 0 and (n',x') = 0 are their respective equations, the space planes (n, r) = 0 and (Rn', r) = 0 should be parallel to each other. This condition can be written as n = kRn' (6.117) for some constant k. the presence of noise n and nt: we want n ' - A n ' satisfy eq.
This equation may not necessarily be satisfied exactly in (Fig. 6.17). Consider the problem of optimally correcting to find A n and A n t such that f = n - A n and fit = (6.117) or equivalently fix R~'-0.
As in the case of image points, this constraint 17 has rank 2. 17This constraint is degenerate (see Section 5.1.3).
(6.118)
6.6.
Scenes Infinitely Far Away
199 .~ . - - " "
o
iiiiii! iiiii!!!iiill. . . . . . . .
Fig. 6.17. Two corresponding image lines are corrected so as to define a space line located infinitely far away. Let V[n] and V[n'] be the a priori covariance matrices of n and n', respectively. The problem can be written as the optimization J = (An, V [ n ] - A n ) +
(An', V [ n ' ] - A n ' )
--+ min
(6.119)
under the linearized constraint -Rn'
x An + n x RAn'
~ . e {-}~.
= n x Rn',
~ . ' e { n ' }L" 9
(6.120)
The first order solution is given as follows (see eq. (5.17)):
zx~ = - ( v [ ~ ]
• (R.'))w(.
zx.'-(v[,~'lR ~ x ~)w(n
• R.'), • R.').
(6.1Zl)
Here, W is a (33)-matrix defined as follows (see Section 5.1.3): W -
(
(Rn') x Vial x (Rn')+
n x RV[n']R T x n
)-
2
.
(6.122)
Since the constraint (6.120) is obtained by a linear approximation, corrections n +-- n + A n and n' +-- n ~ + A n ' need to be iterated until the constraint (6.118) is sufficiently satisfied (see Section 5.1.3). The a posteriori covariance matrices of the corrected values/~ a n d / t ' are given as follows (see eqs. (5.31) and (5.32)):
v[,~]- r
(9"[r4 • (m~'))g~(C'[n] x (R,~')) T,
v[~'] = ~ [ n ' ] - (~[n']R ~ x .~)W(9[n']R T x ~)~. V[/~,/~']- (V[n]
x
(R/t'))I~(Ik[n']R T
x
~)T = V[/t',/~]q-.
(6.123)
Here, we define I?[n] = Pr V [ n ] P i z ,
l?[n'] = P n V[n']Po,.
(6.124)
200
Chapter 6.
3-D Computation by Stereo Vision
The matrix I ~ is obtained by replacing n and n ~ by ~ and ~1, respectively, in eq. (6.122). The residual of J can be written as follows (see eq. (5.34)):
(6. 25) If noise is Gaussian, the residual J is a X2 variable with two degrees of freedom (see Section 5.1.5). This fact provides an infinity test: the hypothesis that the space line defined by image lines (nl, x) = 0 and (n2, x) = 0 is infinitely far away is rejected with significance level a% if J > k~,,.
(6.126)
If the noise level is unknown, it can be estimated a posteriori as in the case of image points.
6.7
Camera
6. 7.1
Calibration
Errors
Errors in base-line
So far, our analysis has been based on the assumption that the motion parameters and the focal length are accurately calibrated beforehand. In reality, however, accurately estimating such parameters is very difficult. We now analyze how the accuracy of the camera parameters affects the accuracy of 3-D reconstruction. The effect of calibration errors is very different from the effect of image noise: the statistical characteristics of all space points reconstructed from perturbed camera parameters are correlated, while the effect of image noise is generally independent from point to point. For simplicity, errors in the camera parameters are assumed to be independent of image noise. This means that when we analyze the effect of errors in the camera parameters, we can assume that image noise does not exist, since the combined effect of image noise and calibration errors is, to a first approximation, a superimposition of the effect of image noise with accurate calibration and the effect of calibration errors with no image noise. First, assume that errors in the base-line vector h are independent of errors in other parameters and image noise. For the reason stated above, other parameters and image points can be assumed accurate. Let V[h] = E[AhAhT] be the covariance matrix of h. Since the image points are assumed correct, we have ~ - x and ~1 _ xl. If h is perturbed into h + Ah, the image points ~ and 5~~ are corrected into 5~ + AS~ and ~ + A~ !, respectively, so as to satisfy the perturbed epipolar equation for G ~ = (h + Ah) x R. Noting that (5~, G~') = 0, we put A~-
(~, G ' ~ ) =
(~, Ah x R ~ ) -
-(5~ x
R S ~ , A h ) - - ( d , Ah),
(6.127)
6.7.
Camera Calibration Errors
201
where a - 5 x P.,5'.
(6.128)
According to eqs. (6.20), the correction of the image points has the form
~X~V[~IGe' (5', GTV[x]G5 t) '1- (5, GV[xt]G TS~)' aev[x']GVe A&' = - ( 5 ' , GTV[xlG5 ') + (5c,av[x']G T ge)"
(6.129)
Hence, the a posteriori covariance matrices of the corrected points are given by V[~,] V[~I(VixlG~')(V[x]G~')T
(~', aTv[~]C~ ') + (~, av[~'laT~) V[~']
V[~](V[~'laT~)(VWIOT~)T
--
(~', a~-v[xla~, ') + (~, Grin']aTe) v[~,~'] -
VH(V[~]a~')(V[~']GT ~')T ((5 t, oTv[XlOx 1) + (X, aV[=tl~.Tx)) 2'
(6.130)
where v[a] = E[Ae2I = (a, V[h]&).
(6.131)
We also obtain
v[h, ~} = -
V[h,~](V[x]GS') T (5', GTV[xlG5 ') + (5, GV[xqG TS)'
V[h,~](V[x']GT5) T V[h,~'] - -(5,,GTV[x]GS,) + (x, GV[x,IGTS) ,
(6.132)
where
V[h, ~] = E[AhA~] = -V[hlfi,.
(6.133)
If h in eq. (6.35) is perturbed by Ah, eq. (6.41) is replaced by
AZ5 = AZIR51 + A h - ZA5 + ZIRA5 I,
(6.134)
from which we obtain AZ-
-(rh, ZA5-
Z'RA~,'- Ah) (,h, ~)
(6.~35)
202
Chapter 6.
3-D Computation by Stereo Vision
where rh is defined by eq. (6.45). The variance by
V[Z] =
E[(AZ) 2] of Z is given
1
y[z] =
x R 'll z (
- 2ZZ'(rh, V[~, ~']Rq-rh)
+ Z'2(rh, RV[~'lRq-rh) - 2Z(rh, V[~, hlrh)
+ 2Z'(~z, RV[~',hlrh) + (~h, V [ h ] r h ) ) .
(6.136)
The covariance vector V[5~, Z] = E[AYcAZ] has the form
V[hc Z] '
ZV[~]gn - Z'V[5c, ~']Rq-~h - V[]c, h]~Zn -
(rh, ~)
.
(6.137)
The covariance matrix of the reconstructed space point r - Zh: is given by eq. (6.50). Although the above computation involves the a priori covariance matrices V[x] and V[x'], the result is invariant to multiplication of them by any positive constant. Hence, the covariance matrices V[x] and V[x'] need to be given only up to scale for this analysis.
6. 7.2
Errors in camera orientation
Suppose the rotation matrix R that describes the relative orientation of the two cameras is not accurate. If R is perturbed into R + AR, both R and R + A R are rotation matrices, so the difference is also a small rotation. Since a small rotation is given in the form of eq. (2.57), there exists a small vector A ~ such that to a first approximation A R = A a • R.
(6.138)
This means that the second camera is further rotated around axis A ~ by angle [IA~II, where the vector A ~ is defined with respect to the first camera coordinate system (see eqs. (2.58) and (3.32)). The covariance matrix of R is defined by V [ R ] - E [ A ~ A ~ q-] (see eq. (3.33)). We assume that errors in R are independent of errors in other parameters and the image noise. It follows that other parameters and image points can be assumed accurate, so we let ~ = x and ~ = x ~. If R is perturbed into R + A R , the image points 5~ and 5~~ are corrected into 5: + A~ and 5:~+ A~ ~ respectively, so as to satisfy the perturbed epipolar equation for G ~ = h • ( R + A ~ • R). Put A~ = (5, G ' ~ ) =
(5, h • ( A a • R ~ ) )
= ( h , / ~ ' ) ( 5 : , A a ) - (5, Rhe')(h, A a ) - - ( b , A a ) ,
(6.139)
6.7.
Camera Calibration Errors
203
where
b - (5:, R ~ ' ) h - (h, RS:')']c.
(6.140)
The corrections AS: and AS:' are given in the form of eqs. (6.129). Hence, the a posteriori covariance matrices V[~], V[5:'], and V[~, ~'] are given in the form of eqs. (6.130), where eq. (6.131)is replaced by V[~] -
E[A~ 2] - (b, V[h]b).
(6.141)
Similarly, the covariance matrices V[R, 5:] - E[AaA~7-1 and V[R, 5:'] E[Ag~A~ 'q-] are given in the form
v[~,~] = -
V[n,~](V[.lG~') T
(~c', aTv[x]G~c ') + (~c,aV[x']G TS:)' V[n,~](V[.'IGT~) T
(6.142)
V [ R , ~c'] = - (~,, G T V [ x ] G S , ) + (~c, a V [ x ' ] G T ~) ' where
V[R, ~] - E[AY2A~] - -V[R]b.
(6.143)
If R in eq. (6.35) is perturbed by A R - ~ x R, eq. (6.41) is replaced by
A z ~ - AZ'R~' - Z'(R~') • A ~ -
Z A ~ + Z'RA~',
(6.144)
from which we obtain ~xz -
- ('~'' z ~ x ~
-
z'n~x~')
-
z'll~'ll~(~, A a )
(6.145)
(~h, ~) Hence, the variance v[z] -
V[Z] - E[(AZ) 2] of Z is given by
1
Jl~ • n~'ll ~ ( z~('~' v[~]~h) -
2zz'(,~,
+ Z '~ (,~, R v [ ~ ' l n T , ~ ) -
v[~, ~ ' ] n ~ h )
2z'll~'ll ~(z(m, v[~, Rla)
- Z'(,~,RVK~',R]~)) + z'Zll~'ll4(~, v [ R ] ~ ) ) .
(6.146)
The covariance vector V[5:, Z] has the form v[~,
z]
-
- zv[~]m
-
z'v[~,
~']R wm - Z'll~'llzv[~, R]~ (rh, &)
.
(6.147)
The covariance matrix of the reconstructed space point r - ZS: is given by eq. (6.50). As in the case of base-line vector errors, the covariance matrices V[x] and V[x'] need to be given only up to scale for this analysis.
204
Chapter 6.
3-D Computation by Stereo Vision
..........................................................................: ii/:::............... : ( x 'y
~~~
1
(~,~)
0
Fig. 6.18. Image coordinates (x, y) are computed with respect to a perturbed focal length f.
6.7.3
Errors in focal length
As mentioned in Section 6.1.1, the focal length f is the distance between the image plane and the viewpoint O. In 3-D analysis, we take f as the unit of length (see also Section 4.1.1). However, the focal length f is also difficult to calibrate accurately. Let f be the correct focal length. Consider an image point p whose correct image coordinates are (2, ~). The physical location of p measured in pixels is (f2, f~). If (x, y) are the image coordinates of p computed by using a perturbed value f = f + A f, we have
((/+ ~:)x, (/+ ~:)v) = (/~, /~),
(6.148)
since the physical location of the image point should be the same (Fig. 6.18). From eq. (6.148), we obtain to a first approximation = x + -ATfx ,
~ = ~ + -ATf~ .
(6.149)
In other words, the true location of an image point which appears to be at x is
5c - x + ~ - P k X .
(6.150)
A. I n d e p e n d e n t c a m e r a s
Consider the case in which the two cameras have different focal lengths f and ft, which are separately calibrated. Then, errors in f and errors in f~ are regarded as independent. Suppose f is not accurate but f~ is accurate. Other parameters and image points are also assumed accurate. Let x and x ~ be corresponding image points. Because f is not accurate, x and x ~ do not
6.7.
Camera Calibration Errors
205
exactly satisfy the epipolar equation, but their true locations 2 and 5:' (= x') do. From eq. (6.150), we see that to a first approximation
Hence,
(x, ax')
= - A--L(x,
f
Pk Gx').
(6.152)
The image points x and x' are corrected into 5: = x + Ax and &~ = x ~+ Ax so as to satisfy (~, G ~ ' ) = 0. The corrections Ax and Ax' are given by eqs. (6.20). Let A~ = 5 : - ~. Then,
AS: -
x + A x - 5: - - ~ - P k x +
Ax
.I
= f~ Pkx + (x', GTV-[x-~x--;) --(--(xi-~x']Gq-x)
"
(6.153)
Similarly,
~I
(*.Pka*')v[~']GT*
Let V[f] = E [ A f 2] be the variance of f. The covariance matrices of the corrected position & and &' are given as follows:
(*'Pka~')V[*]G~'
v[f]( v[~]= /~ Pk*+
(pe~+ v[~'] =
(X'PkGx')V[x]Gx' ) GTV[x]Gx ') + (x, GV[xqG T x)
V[f] (x, PkGX')2(V[x']GTx)(V[x']GTx) -r f~ (~,. aTV[.]G. ') + (~. aV[.']a T*)
V[f](
V[Jz, 5:'] -
(x',
)
f2
(
)'
(x, PkGx')V[x]Gx' ) Pk x + (x', GTV[x]Gx ') + -~I-~x'IGTx) (x, Pk Gx')(V[x'IGT x) T (x', GTV[x]Gx ') + (x, GV[x']G vx)" (6.155)
The subsequent analysis is the same as in Section 6.3.2. Namely, the covariance matrix of the reconstructed space point r = Z~ is computed by eqs. (6.48), (6.49), and (6.50). The covariance matrices Y[x] and Y[x'] need to be given only up to scale.
206
Chapter 6.
If f is accurate but f ' is not, leads to
V[f']
v[~] -
3-D Computation by Stereo Vision
V[f'] being
its variance, a similar analysis
(x, GPkx')2(V[x]Gx')(V[x]ax')T (~', GTV[~Ia~ ') + (~, aV[~']GT~)
V[f'] ( x, V[x'l -
(x, GPkx')V[x'] GTx ) Pk + (x ' , G T-~x ]-~x' ) + -~ :-G-~ qG Tx )
if2
(
y [ ~ , ~']
(x, GPkx,)V[x,]GT x
+
V[f']
IT
(x, GPkx')V[x]Gx'
-
f,2 (x', GTv[x]Gx ') + (x, GV[xqG Tx)
(
(x, GPkx,)V[xqGTx Pk ' +
:i
)T 9 (6.~56)
If f and f ' are both inaccurate but independent, the error behavior of 3-D reconstruction is described to a first approximation as the superimposition of t h a t for inaccurate f and accurate f ' and that for accurate f and inaccurate f'. B. Identical
cameras
Consider the case in which the two cameras are identical and have the same focal length but the value f we use is not necessarily correct. Let V[f] be its variance. In this case, eq. (6.152) is modified to
(x, Gx') -
Af s ((*Pk, ax') + (~, GPk,'))
Af
f (x, (Pk G + GPk)x' ).
(6.157)
The corrections AS~ and AS~' are given as follows: (x,
~'=
(Pk G + GPk)x')V[x]Gx'
(x, (Pk G + GPk)x')V[x']GTx ) ~'I Pk~' + f (x', GTV[x]Gx ') + (x, GV[x']GTx)
.
(6.~58)
6.7.
Camera Calibration Errors
207
Hence, viii -
V[f]f2 IRk x + ( Pk~ +
v[~']
V[f] (x'
(x, (PkG + GPk)x')V[x]Gx' ) (x, (Pk G +
GPk)x')V[x]Gx'
(x' (PkG+ GPk)x')V[x/]GTx )
-
T
Pk x' + ( x ; : G T ~ g ~ i
,E:I(
+(-x:
I-,(-~ §
v[~.~']= :.- Pk-+ (.,i~ev-~a-~u165
(
GV[x'IGTx)
)
(x, (PkG + GPk)x')V[x']GTx ) T . (6.159) V[x]
V[x']
Again, the covariance matrices and need to be given only up to scale. The subsequent analysis is the same as given in Section 6.3.2.
Chapter 7 Parametric Fitting This chapter studies methods for fitting a geometric object to multiple instances of another geometric object in an optimal manner in the presence of noise. First, the criterion for optimal fitting is derived in general terms as maximum likelihood estimation. The covariance matrix of the resulting estimate is explicitly obtained, and the statistical behavior of the residual of optimization is analyzed. This analysis provides a criterion for testing the hypothesis that the observed objects are in a special configuration. Then, we study various types of fitting problems in two and three dimensions such as finding an optimal average, estimating a common intersection, and fitting a common line or plane. In two dimensions, this theory predicts the statistical behavior of an image line fitted to an edge segment.
7.1
7.1.1
General
Theory
Parametric fitting
We consider the problem of defining a geometric object from a number of data. Let u be the vector t h a t represents the object to be determined, and let a l , . . . , a N be the data. Suppose the d a t a a l , . . . , a N are vectors t h a t represent N instances of the same type of object (e.g., image point). They are supposed to satisfy the same geometric relationship (e.g., incidence) with the object in question in the absence of noise. We want to determine the value of u by optimally imposing the relationship on the data. This type of problem frequently arises in computer vision and robotics applications. Let n be the dimension of the vector u. Its domain is assumed to be an n ~dimensional manifold b/ C 7~n, which we call the parameter space. Consider a n object represented by an m-vector a. Its domain is assumed to be an m~-dimensional manifold A C 7~m, which we call the data space. Suppose N different instances a l, ..., aN of vector a are observed in the presence of noise. Let 5~ be the true value of am, and write a, = a, + Aa,,
a = 1, ..., N .
(7.1)
Assuming t h a t each am has been obtained by a separate process, we regard the noise A a a as a statistically independent r a n d o m variable of mean 0 and covariance m a t r i x V i a l ] . Since the d a t a are constrained to be in the d a t a space ,4, the noise A a ~ is, to a first approximation, constrained to be in the tangent space Ta~ (A) to the manifold A at ~ . Hence, the domain and 209
210
Chapter 7.
Parametric Fitting
the null space of the covariance matrix ~r[aa] are Ta~ (,4) and Ta~(A) • respectively. Suppose L smooth functions F (k) ( . , 9)" ~ m • 7~" --+ 7~ exist and the true values al, ..., aN satisfy
F (k) (d~, u) - 0,
k - 1, ..., L,
(7.2)
for a - 1, ..., N for some value u. Eq. (7.2) imposes a parameterized constraint on { a s }; we call this the hypothesis, because the value of the parameter u is unknown. Our goal is to optimally estimate the parameter u from the data a l , ..., aN and their covariance matrices V[ai], ..., V[aN]. We assume that the hypothesis (7.2) is nonsingular in the sense discussed in Section 5.1. The L equations (7.2) may not be independent; we call the number of independent equations the rank of the hypothesis (7.2). To be specific, the rank is the codimension of the manifold ,S defined by the L equations F(k)( 9, u) -- 0, k - 1, ..., L, in the data space j[ (see Section 3.2.1). We call S the (geometric) model of the hypothesis (7.2). It can be shown I that the rank of the hypothesis (7.2), or the co dimension of the model S, generally coincides with the dimension of the linear subspace
9a - { P ~ VaF(1)(aa, u),
p Aa. VaF(L)(a,, U)}L
C T~ m
(7.3)
for all a, where PaA is the m-dimensional projection matrix onto the tangent space Ta. (A) at a s . If the dimension of this linear subspace is smaller than the rank of the hypothesis (7.2), we call a s a singular datum; otherwise, it is a nonsingular datum. In the following, we assume that singular data are removed from the fitting data. However, the dimension of the linear subspace defined by eq. (7.3) may not be equal to the dimension of the linear subspace
)]a - {PaA.VaF(i)(a,,u),...,PaA.VaF(L)(aa, u)}L C T~m
(7.4)
for a s # a s (see Section 5.1.1); we say that the hypothesis (7.2) is degenerate if the dimension of the subspace ];a is larger than the dimension of ];a (see Section 5.1.3).
7.1.2
Maximum likelihood estimation
Our approach consists of two stages 2" the correction stage and the estimation stage. A. Correction stage
We first assume a particular value of the parameter u. If the data { a s } do not satisfy the hypothesis for the assumed value u, we optimally corrected them 1Detailed discussions will be given in Chapter 14. 2The proof that the following approach is indeed optimal will be given in Chapter 14.
7.1.
General Theory
211
by applying the theory given in Chapter 5. Namely, we find A a s such that a s = a s - A a s satisfies eq. (7.2). As discussed in Section 5.1.1, this correction is done for each a by minimizing the Mahalanobis distance IIAasl]~[a~], i.e., by the optimization
(7.5)
Js - ( A a s , V [ a s ] - A a s ) --+ min.
This can be justified as maximum likelihood estimation for Gaussian noise (see Section 5.1.1). Eq. (7.2) imposes the following linearized constraint on A a s 6 Ta~(A)"
(VaF(k)(as, u ) , A a s ) - F(k)(as,u),
k-
1,...,L.
(7.6)
The solution of the optimization (7.5) is given as follows (see eq. (5.17)): L
A a s -~?[a~] ~
W-(kl) s (u)F (k)(as, U)VaF (l)(as, u).
(7.7)
k,l=l
Here, I?r
- (l?~(kl) ( u ) ) i s the (LL)-matrix defined by
(17V(kt) ) - ((VaF(k) (as, u), V[as]VaF(l) (as, u))) - .
(7.8)
As in Section 5.1.2, this expression is the abbreviation of I2r - Vs(u)for the (LL)-matrix V s ( u ) - (V(~kl)(u)) defined as follows (see eqs. (5.15) and (5.16))" (7.9) It can be proved 3 that the rank of matrix V s ( u ) (hence of matrix IgVs(u)) equals the rank r of the hypothesis (7.2). B. Estimation stage
If A a s is an independent Gaussian random variable of mean 0 and covariance matrix V[as], the joint probability density of all {Aas}, a - 1, ..., N, has the following form (see eq. (3.46))"
Ir
(7.10)
This probability density can be viewed as the likelihood of the observed values {Aas}, a - 1, ..., N. Let Js be the residual of Js obtained by substituting 3The proof will be given in Chapter 14.
212
Chapter 7.
Parametric Fitting
eq. (7.7) into eq. (7.5). The likelihood takes the form
~--
/3:1
a-'l
v/(2~)"~I?[a~]l+
which is a flmction of u alone. We now seek the value u that maximizes this likelihood. This is equivalent to minimizing the sum of the residuals j~, which we write as Jlul" I. J N
J[u]-
~
Jm ~ min.
(7.12)
or--1
Substituting eq. (7.7) into eq. (7.5), we see that this minimization takes the following form (see eq. (5.35)): N
J[u] - Z m=l
L
~
lrV(~al)(u)F(k)(am,u)F (l) (am, u) -4 min.
(7.13)
k,l=l
C. P r a c t i c a l c o n s i d e r a t i o n s
In practice, the function J[u] cannot be computed from the data alone, since l~(kt)(u) involves the true value a~. Hence, as we did in Section 5.1.3, the (LL)-matrix I~m(u) - (17V(~kt) (u))is approximated by the (LL)-matrix Win(u) - (lVt~kt) (u)) obtained by using the rank-constrained generalized inverse (see eq. (2.82)):
( ( V a F (k) (a~, u), V[am]VaF (l) (am, u)) )- .
(W.(k0 (u)) -
(7.14)
7-
Here, r is the rank of the hypothesis (7.2). Eq. (7.14) is the abbreviation of Win(u) - (V~(u))-~ for the (LL)-matrix Vm(u) - (V(~kz) (u)) defined by (V(kl) (u)) - ( ( V a F (k) (a~, u), V[a,~]VaF (0 (a~, u))).
(7.15)
The rank-constrained generalized inverse is used because matrix Vm(u) has a larger rank than matrix IP~(u) if the hypothesis (7.2) is degenerate (see Section 5.1.3). Thus, the actual computation is N
J[u] - ~ a=l
L
~
W (kt) (u)F (a) (am, u)F (0 (am, u) -+ min.
(7.16)
k,l=l
Since u is constrained to be in the parameter space b/C T~n, the minimization search is done in H. In principle, the solution/~ can be obtained by numerical
7.1.
General Theory
213
computation, and many kinds of numerical software are available for that purpose (e.g., the quasi-Newton method and the conjugate gradient method). In the rest of this chapter, we assume that the optimal s o l u t i o n / t has been obtained by some numerical means; actual computational schemes for this optimization will be given in subsequent chapters.
7.1.3
Covariance matrix of the optimal fit
The optimal e s t i m a t e / t is a random variable, because it is computed from the data {am}. We now study its statistical behavior. To do this, we need to distinguish u regarded as a variable from its true value. Let ~t be the true value that satisfies the hypothesis (7.2), and put
(7.17)
u-~+Au.
Since u is constrained to be in the parameter space b/ C T~~, the variation A u is constrained, to a first approximation, to be in the tangent space T~t (L/) to the manifold L / a t r If we substitute am - aa + A a a and u = ~t + A u into F(k)(aa, u) and expand it in the neighborhood of ~ and ~, we obtain F (k) (am, u) - ( V a / ~(k) , A a ~ ) + (Vu/~(k) , A u ~ ) + O ( A a , , A t ) 2,
(7.18)
where Va/)(k) and Vu/~(k) are the abbreviations of VaF(k) ( ~ , ~) and VuF(k)(da, ~t), respectively. The symbol O(...)P denotes terms of order p or higher i n . . . . Noting that 1 ~ (kl)(u) - 1~(kl)(~) + O ( A u ) , we see from eq. (7.13) that N
L
aao))
:[,,1- Z E c~=1 k , l = l
((Vu/~('), A t ) +
(Va/~(0, A a ~ ) ) +
O(Aa~,Au) a.
(7.19)
Let {vi}, i - 1, ..., n - n', be an orthonormal basis of Tu(b/)• constraint A u E T~t(/d) can be written as (oi, A t ) - 0,
i - 1, ..., n - n'.
The
(7.20)
The minimum of eq. (7.19) is obtained by introducing Lagrange multipliers ~_~/-l__ n I
~i and differentiating g[u]- z-,i=l higher order terms, we obtain -
N
L
c~----1 k , l ' - i
-
Ai(Oi,Au)
with respect to Au. Ignoring n-n
I
i--1
(7.21)
214
Chapter 7.
Parametric Fitting
Let p U be the n-dimensional projection matrix onto the tangent space Tu (H) at ft. Multiplying p u on both sides and noting that p U A u = Au and pUoi , we obtain - 0 , i = 1, .. . , n - n '
~=1
)
k,l=l
u w.(~')(~)P. v ~ G-(k) (VaL(~
+~ a=l
k,l=l
T
)
a ~ = o.
(7.22)
If we define the moment matrix M by N
L
c~=l
k,l=l
W,
(~)
)(puvulV~(O)
,
(7.23)
Aa~.
(7.24)
eq. (7.22) can be written in the following form:
l~Au = - E c~--1
- (kz)(~)(PgVul~(k) )(Va ~(1)) T W~ k,l=l
)
The moment matrix M is an (nn)-matrix, and its range is contained in Tu(H). Hence, the rank o f / ~ is at most n'. Here, we assume that the number of data {a~ } is sufficiently large and no special relationship exists among &l, ..., aN other than eq. (7.2). Then, the range of M generally coincides with Tu(H). It follows that /I5//~- = M M = pU. (7.25) Multiplying / ~ - on both sides of eq. (7.24) and noting that M p U A u = Au, we obtain
Au = - l f / I - E a=l
l~,.(kO(f~)(pUvu[-'(k))(VaF('))-r k,l=l
)
MAu =
Aa,~.
(7.26)
The covariance matrix l?[/~] - E[AuAuT-] of the optimal estimate/~ - ~ + A u is computed as follows: N
L
/ ,-,/4v~ ~(k) o',~= 1
k,l,m,n=l
(Va te~(')) n-E[Aa, Aa~](Val~('))(pU Vu F ( m ) ) r / ( / / N
-M- E c~,fl= 1
L
E k,l,m,n=l
W.
(fi)I~* (m')(~)(PgVuk (k))
7.1.
General Theory
215
(v~ f~') )9 a~, ~[~ 1(v~ L (n))(eg v~ P~m))~ MN
L
-M-E k,l,rn,n=l E s=l
(V~f~'), r = M- ~
(eg V~(m))r
E
-
)
W~') (~)e~ 'n)(~)W(m~)(~)
s=l k,rn=l l,n=l
(eg v ~ L (~))(eg v~ L(~) )9 u - .
(7.27)
(~)~,r(In)(~l,)~v(rnn)(~t)equals
Here, we have used eq. (7.9). Since E l,m=l L l~(kl) the (kn) element of matrix Iizs(~)I~r~ ( f t ) - I ~ s ( f t ) - l ~ ( f t ) we obtain N
r
(see eq. (2.81)),
L
M- Z ~ w~')(~)(egv~r~))(egv~L(')) ~ s=l k,/=l
=M
M M
- M
(7.28)
.
Thus, the covariance matrix of the optimal e s t i m a t e / t is given by
% s=l
(~)(P~Vu
)(PgV~L ('))
(7.29)
, =1
which has rank n'; its null space is Tu(b/) • Eq. (7.29) coincides with the Cramer-Rao lower bound 4 on the attainable accuracy of parametric fitting. Eq. (7.29) is purely a theoretical expression because it involves the true values of u and as. A simple approximation to compute it form the data alone is replacing the true value ft by the optimal estimate it and the true value a s by the optimally corrected value &s - a s - A a s , where A a s is given by eq. (7.7). Then, the covariance matrix V[~] takes the form
W(kt) ( i t ) ( p U Vu/5(k))(pU Vu~,(z))7
V[it] s=l , =1
)
(7.30)
where Vu/~(k) is the abbreviation of V u F (k) (&s, it). However, the data {as } themselves can be used instead of the corrected values {&s } to a first approximation. In whichever form, the resulting covariance matrix V[it] also has rank n'; its null space is T~t(b/) • 4Detailed discussions will be given in Chapter 14.
216
Chapter 7.
Parametric Fitting
Hypothesis testing and noise level estimation
Z1.4
The above estimation procedure is based on the hypothesis that the data { a s } are random deviations from the values { ~ } that satisfy eq. (7.2); minimizing the function J[u] defined by eq. (7.13) can be interpreted as choosing the value/t that make the hypothesis the most likely. If the hypothesis is correct, the residual J[/t] should be 0 for the true values { ~ }. However, the residual is generally positive for the data {as }. This suggests that if the residual J[/t] is much larger than can be accounted for by the statistical behavior of the noise in {a~}, the hypothesis should be rejected. In order to formulate this process as a statistical test, we need to derive the probability distribution of the residual j[/t]. We do this by assuming that the noise is Gaussian. A. Testing of a strong hypothesis
As a preliminary step, consider the residual J[~] for the true value ~t. Letting A u - 0 in eq. (7.19) and neglecting higher order terms, we observe that N
J[u]-
E
L
E
c~--1
ITV(kO(u)(VaF(k)'Aa~)(VaF(O'Aa~)"
(7.31)
k,l--1
If we put
(7.32) the L-vector e~ (e (1) Its covariance matrix is _
e~(L) ) T is a Gaussian random variable of mean 0
V[eo~] = ( E [ e ( k ) e ~ ) ] ) - ((VaF~k),E[Aao, Aa:]VaT'(~O)) = ((Va.F'(k),V[a~]VaF~O))- (12(k0(,R))- (l~(k0(O,)) - '
(7.33)
where eqs. (7.9) and (7.8) are used. Hence, eq. (7.31) can be written as N
=
(7.34) oe=l
The rank of V[ec~] (= W~(~t)-) equals the rank r of the hypothesis (7.2), and each e~ is an independent Gaussian random variable. Hence, the residual J[~] is a X2 variable with rN degrees of freedom (see eq. (3.63)). Its expectation and variance are given as follows (see eq. (3.59)): _
E [ J [ ~ ] ] - rN,
V [ J [ f i ] ] - 2rN.
(7.35)
It follows that if a particular value ~ is given independently of the data {as }, the stronghypothesis that the true value is ~t can be tested by the standard
7.1.
General Theory
217
the k2 test (see Section 3.3.4). Namely, the strong hypothesis is rejected with a% significance level if
J[u] > X~N. (7.36) In practice, however, the function J[u] defined by eq. (7.13) cannot be computed, since I7V~ (kt) involves true value a~. Hence, J[u] is approximated by the function J[u] given by eq. (7.16). B. T e s t i n g of a w e a k h y p o t h e s i s
Next, consider the residual J[/t] for the optimal estimate it. Substituting eq. (7.26) into eq. (7.19) and using eqs. (7.24) and (7.29), we obtain to a first approximation N
L
:t<- E E c~=1
((rata
w~ (~)
,Z~a~)(Va
,aa~)
k,l= l
+ (v~&(~). a~)(VaL('), aa~) + (Va&(~).a~.)(V~&('), a~)
k,l=l
a=l
+(At, ~
- (k0 ( ~ ) ( P U V u ~~(1) -(k) ) -r W~ g )(VaF~ =
k,/=l
E
+(Au, c~--1
)
-(kt) ( R ) ( p g Vup(k) )(va&(')) T aa~) wo
= Y[R] + ( A t , E
)
Aa~)
)
I~(kO(u)(PgVu["(k))(PgVuf-'(O) T Au)
k,l--1
= J [ a ] - (Au, g[/t]-Au).
(7.37)
Since V[/t] is the covariance matrix of Au and has rank n', the quadratic form ( A t , V [ / t ] - A u ) i s a X2 variable with n' degrees of freedom (see eq. (3.61)). From eq. (7.32), we see that eq. (7.26) is expressed in terms of e~ ) in the form N
L
c~=l
k ,l= l
(7.38) meaning that A u is obtained by a linear mapping from {e~}. Hence, the residual ][/t] is a X2 variable with r N - n' degrees of freedom (see eq. (3.64)). Its expectation and variance are given as follows (see eq. (3.59)):
E[Y[6]I- ~ N - n',
V[Y[6]]- 2 ( r N - n').
(7.39)
218
Chapter 7.
Parametric Fitting
It follows that the number N of data must be such that N > --.
(7.40)
r
From eqs. (7.39), we see that with a high probability the residual J[/t] is smaller than the residual j[~]. This is because the e s t i m a t e / t is defined as the value that minimizes J[u]; the e s t i m a t e / t is a function of the data {as} and hence is correlated with them. The above analysis can be used to test the weak hypothesis that eq. (7.2) is satisfied by some value u. Namely, the weak hypothesis is rejected with a% significance level if (7.41) J[/$] > ~rN-n',a" In practice, the function J[u] is approximated by the function J[u] given by eq. (7.16). C. N o i s e level e s t i m a t i o n
Note that although the covariance matrices Vial], c~ = 1, ..., N, are involved in the optimization (7.16), we need not know their absolute scale" it suffices to know them only up to scale. In fact, if the covariance matrices V[a~] are multiplied by a positive constant c, the matrix (w(kl)(u)) is multiplied by 1/c (see eq. (7.14)), and multiplication of J[u] by a positive constant does not affect the value that minimizes it. The covariance matrix Vial] can be decomposed into the noise level e and the normalized covariance matrix V0[a~] in the form V[a~] = e2V0[a~],
c~ = 1, ..., N.
(7.42)
As discussed in Section 5.1.5, the normalized covariance matrices V0[a~] can be relatively easily predicted in many practical problems while the noise level e is very difficult to estimate a priori. Since the optimal solution is invariant to the scale of V[a,], the normalized covariance matrices V0[a,] can be used in the computation. Once the optimal estimate/~ is obtained, the noise level e is estimated a posteriori (see Section 5.1.5). In fact, let J0[qi] be the residual computed by using the normalized covariance matrices V0[a~]. An unbiased estimator ~2 of e2 is obtained in the form ~2= [/t_______~] J0 r N - n 1"
(7.43)
Its expectation and variance are respectively given by 2E 4
E[~2]_ e2
V[~2]- -r -N- - - 7 n"
(7.44)
7.1.
General Theory
219
On the other hand, testing of hypotheses requires a priori knowledge of the noise level e, because a hypothesis is tested by comparing the a priori value e with the a posteriori estimate ~ c o m p u t e d on the assumption that the hypothesis is true. The X"2 test (7.41) can be rewritten in terms of the estimate given by eq. (7.43) in the following formS: ~:2
2
XrN-n',a -g e > -------------7" rN- n
7.1.5
(7.45)
Linear hypothesis
In m a n y c o m p u t e r vision and robotics applications, eq. (7.2) takes the following linear form: ( a ~ ) , u) - O,
k - 17 ..., L.
(7.46)
For each k, vectors ~ k ) , - - . , ~(Nk) are the true values of the d a t a a~ k) , " ' ' ~ a(Nk) which are N instances of an n-vector variable a (k). If we define the direct sum vector a = a (1) @ - - . 9 a (L), (7.47) and define k functions k = 1,...,L,
(7.48)
eq. (7.46) can be viewed as a special case of eq. (7.2). From the above definition, we see that
VaF(k) (a, u) -- 0 | 1 7 4
(ku) |
|
(7.49)
where the stacked symbol (k) means that u is the kth c o m p o n e n t of the direct sum. Write
and assume that the noise A a ~ ) is a r a n d o m variable of m e a n 0, independent for each a. However, Aa(~1), ..., Aa(~L) m a y be correlated. We write their covariance matrices as
(7.Sl) If we abbreviate V [ a ~ ), a s(k) ] as V [ a ~ )] (= E [ A a ~ ) A a ~ ) T ] ) , the covariance (1) ~L) m a t r i x of the direct sum vector a s - a s | @a has the following 5This is a consequence of the fact that ~2/e2 is a modified X2 variable with r N - n' degrees of freedom if the hypothesis is true (see eq. (3.72) and Footnote 6 in Section 5.1.5).
Chapter 7.
220
Parametric Fitting
submatrices (see eq. (3.11))"
-
:
...
Via (L),as(1)] ---
9
.
(7.52t
V[a~ )]
Eq. (7.7) now reads L
Aa~)-
VV(tm)(u)(a(m)'u)V[a~)'a~)]u"
E
(7.53)
l,m=l
Eq. (7.8) is replaced by (7.54) The optimization (7.16) reads N
L
J[u] -- E c~----1
E W(kt) (u)(a~)' u)(a~)' u) --+ min.
(7.55)
k,l--1
Eq. (7.14) is replaced by
(
(W.(kl)(u)) -- (u, V[a~),a~)]u)
)
,
(7.56)
1"
where r is the rank of the hypothesis (7.46) and V[a~ ), a~ )] is the value of the covariance matrix V[a~ ), a~ )] evaluated at a ~ ) and a~ ). From eq. (7.30), the covariance matrix of the optimal estimate/t is obtained in the following form: V[/t]-
E c~=l
W(~kO(it)(Pg&(~k))(Pgd~))T
k,l= l
)
(7.57)
Here, d~) - a~ ) - A a ~ ) is the value of a ~ ) corrected by using eq. (7.53), in which I~ (ira) and V[a (k), a(~k)] are approximated by W,(Ira) and V[a~ 1, a~)], respectively. To a first approximation, however, we may use the data {a~ ) } themselves instead of {h~)}. 7.2
7. 2.1
Optimal Fitting for Image Points Image point fitting
Given N image points {x~}, a - 1, ..., N, consider the problem of fitting an image point x to them. This means finding an optimal average x of {xa }.
7.2.
Optimal Fitting for Image Points
221
We write x~ - ~ + Axe,
(7.58)
and regard each A x a as an independent random variable of mean 0 and covariance matrix V[x~]. The hypothesis is 2~ = x,
c~ - 1, ..., N,
(7.59)
which has rank 2 because both sides are orthogonal to k - (0, 0, 1)T. For each x~, the optimal estimate of its true value 5~ is evidently x, so the optimal average & can be obtained by the optimization N
J[x] - E ( x ~ -
x , V [ x ~ l - ( x ~ - x)) ~ min
(7.60)
4=1
under the constraint x E {k}~. form: -
v[~.lc~=l
The solution is obtained in the following
~
v[~.l-~.
+ k.
(7.61)
c~--1
Its covariance matrix is
which has rank 2; its null space is {k}L. The residual J[&] is a ~2 variable with 2 ( N - 1) degrees of freedom 6. Evidently, the number N of image points must be such that N > 1.
(7.63)
The residual is 0 when N - 1. For N > 2, we obtain a coincidence test for image points: the hypothesis that image points {x~ } coincide is rejected with significance level a% if J[&] > X~(N-1),a"
(7.64)
The 2 ( N - 1) degrees of freedom can be intuitively interpreted as follows. An image point has two degrees of freedom, so N image points have 2N degrees of freedom for their deviations from the true position 3. But if we consider deviations from the average &, no deviation occurs when N - 1. Hence, the degrees of freedom decrease by two. 6As in C h a p t e r 5, we a s s u m e G a u s s i a n noise a n d do first o r d e r a n a l y s i s w h e n e v e r we refer to X2 d i s t r i b u t i o n s a n d X2 tests.
222
Chapter 7.
Parametric Fitting
P(l
Fig. 7.1. Optimally averaging image points by least squares. E x a m p l e 7.1 If each coordinate is perturbed independently by Gaussian noise of mean 0 a n d variance e2, the covariance matrix of each image point xo" is V[xo'] = e2Pk, so eq. (7.60) reduces to the usual least-squares optimization N
J014- ~
N
I1~ - ~11~ - Z D(PO"P)2 ~ min,
o'=l
(7.65)
o,=1
where D(po',p) is the distance from the a t h image point po" to the image point p to be fitted (Fig. 7.1). The optimal average given by eq. (7.61) reduces to the sample average N
1 5: -- ~ E
xo'.
(7.66)
v[~]- ~Pk-
(7.67)
o'--1
Its covariance matrix is E2
An unbiased estimator of the variance e2 is obtained in the form N
1 ~~iI 1 ~
~2_ 2 ( N - 1)
- ~11 ~.
(7.68)
If the value \epsilon is given a priori, the coincidence test takes the form

\frac{\hat\epsilon^2}{\epsilon^2} > \frac{\chi^2_{2(N-1), a}}{2(N-1)}.    (7.69)
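The following Python sketch (not part of the book; numpy and scipy are assumed to be available, and all function names and data are hypothetical) illustrates the weighted averaging of eqs. (7.61)-(7.62) and the coincidence test (7.64):

    import numpy as np
    from scipy.stats import chi2

    def average_image_points(xs, Vs, alpha=0.05):
        # xs: (N,3) image points (x, y, 1); Vs: list of 3x3 covariances with null space {k}
        k = np.array([0.0, 0.0, 1.0])
        Vinv = [np.linalg.pinv(V) for V in Vs]          # generalized inverses V[x_a]^-
        M = sum(Vinv)                                   # information matrix, cf. eq. (7.62)
        x_hat = np.linalg.pinv(M) @ sum(Vi @ x for Vi, x in zip(Vinv, xs)) + k   # eq. (7.61)
        J = sum(float((x - x_hat) @ Vi @ (x - x_hat)) for Vi, x in zip(Vinv, xs))  # residual
        dof = 2 * (len(xs) - 1)
        reject = J > chi2.ppf(1 - alpha, dof)           # coincidence test, eq. (7.64)
        return x_hat, np.linalg.pinv(M), J, reject

    # hypothetical usage with isotropic image noise as in Example 7.1
    eps = 0.01
    Pk = np.diag([1.0, 1.0, 0.0])
    pts = np.array([[0.10, 0.20, 1.0], [0.11, 0.19, 1.0], [0.09, 0.21, 1.0]])
    x_hat, V_hat, J, reject = average_image_points(pts, [eps**2 * Pk] * len(pts))

For the isotropic case the routine returns the sample average of eq. (7.66), as expected.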
7.2.2 Image line fitting
Given N image points {x_\alpha}, \alpha = 1, ..., N, consider the problem of fitting an image line (n, x) = 0 to them. We write

x_\alpha = \bar x_\alpha + \Delta x_\alpha,    (7.70)

and regard each \Delta x_\alpha as an independent random variable with mean 0 and covariance matrix V[x_\alpha]. The hypothesis is

(n, \bar x_\alpha) = 0, \quad \alpha = 1, ..., N,    (7.71)

which has rank 1. An optimal estimate of n can be obtained by the optimization

J[n] = \sum_{\alpha=1}^{N} \frac{(n, x_\alpha)^2}{(n, V[x_\alpha] n)} \to \min    (7.72)

under the constraint \|n\| = 1. The covariance matrix of the solution \hat n is

V[\hat n] = \Bigl( \sum_{\alpha=1}^{N} \frac{(P_{\hat n} \hat x_\alpha)(P_{\hat n} \hat x_\alpha)^\top}{(\hat n, V[x_\alpha] \hat n)} \Bigr)^-,    (7.73)

where P_{\hat n} and \hat x_\alpha are the projection matrix along \hat n and the optimally corrected value of x_\alpha, respectively (see eq. (5.82)). The rank of V[\hat n] is 2; its null space is {\hat n}_L. The residual J[\hat n] is a \chi^2 variable with N - 2 degrees of freedom. Evidently, the number N of image points must be such that

N \ge 2.    (7.74)

The residual is 0 when N = 2. For N \ge 3, we obtain a collinearity test for image points: the hypothesis that the image points {x_\alpha} are collinear is rejected with significance level a% if

J[\hat n] > \chi^2_{N-2, a}.    (7.75)
The N - 2 degrees of freedom can be intuitively interpreted as follows. A free image point has two degrees of freedom. An image point constrained to be on an image line keeps the incidence if it moves along that image line, so there remains one degree of freedom to break the incidence. Consequently, N image points have N degrees of freedom for their deviations from the true image line (\bar n, x) = 0. But if we consider their deviations from the fitted image line (\hat n, x) = 0, no deviations occur when N = 2. Hence, the degrees of freedom decrease by two.

Example 7.2  If each coordinate is perturbed independently by Gaussian noise of mean 0 and variance \epsilon^2, image point x_\alpha has covariance matrix V[x_\alpha] = \epsilon^2 P_k, so eq. (7.72) reduces to the least-squares optimization

J_0[n] = \sum_{\alpha=1}^{N} \frac{(n, x_\alpha)^2}{1 - (k, n)^2} = \sum_{\alpha=1}^{N} D(p_\alpha, l)^2 \to \min,    (7.76)

where D(p_\alpha, l) is the distance from the \alpha th image point p_\alpha to the image line l to be fitted (Fig. 7.2; see eq. (4.11)). The covariance matrix of the solution \hat n for image points {(x_\alpha, y_\alpha)} is given by
Fig. 7.2. Line fitting by least squares.

V[\hat n] = \frac{\epsilon^2}{1+\hat d^2} \begin{pmatrix} \sum_{\alpha=1}^{N}\hat x_\alpha^2 & \sum_{\alpha=1}^{N}\hat x_\alpha \hat y_\alpha & \sum_{\alpha=1}^{N}\hat x_\alpha \\ \sum_{\alpha=1}^{N}\hat x_\alpha \hat y_\alpha & \sum_{\alpha=1}^{N}\hat y_\alpha^2 & \sum_{\alpha=1}^{N}\hat y_\alpha \\ \sum_{\alpha=1}^{N}\hat x_\alpha & \sum_{\alpha=1}^{N}\hat y_\alpha & N \end{pmatrix}^-,    (7.77)

where \hat d is the distance of the image line (\hat n, x) = 0 from the image origin o (see eq. (4.9)). Here, (\hat x_\alpha, \hat y_\alpha) is the optimally corrected position of (x_\alpha, y_\alpha), i.e., its orthogonal projection onto the fitted image line (see eq. (5.85)). If we write \hat n = (\hat A, \hat B, \hat C)^\top, the residual of the minimization (7.76) can be written as

J_0[\hat n] = (1 + \hat d^2) \sum_{\alpha=1}^{N} (\hat A x_\alpha + \hat B y_\alpha + \hat C)^2.    (7.78)

Hence, an unbiased estimator of the variance \epsilon^2 is obtained in the form

\hat\epsilon^2 = \frac{1+\hat d^2}{N-2} \sum_{\alpha=1}^{N} (\hat A x_\alpha + \hat B y_\alpha + \hat C)^2.    (7.79)

If the value \epsilon is given a priori, the collinearity test takes the form

\frac{\hat\epsilon^2}{\epsilon^2} > \frac{\chi^2_{N-2, a}}{N-2}.    (7.80)

The solution of the least-squares optimization (7.76) can be obtained analytically. In fact, let x\cos\theta + y\sin\theta = d be the image line to be fitted. The function to minimize can be written in the following form:

J(\theta, d) = \sum_{\alpha=1}^{N} (x_\alpha\cos\theta + y_\alpha\sin\theta - d)^2.    (7.81)

Differentiating this with respect to d and setting the result 0, we obtain d in the form

d = \bar x\cos\theta + \bar y\sin\theta,    (7.82)

where

\bar x = \frac{1}{N}\sum_{\alpha=1}^{N} x_\alpha, \qquad \bar y = \frac{1}{N}\sum_{\alpha=1}^{N} y_\alpha.    (7.83)
Eq. (7.82) states that the image line should pass through the centroid (\bar x, \bar y) of the data {(x_\alpha, y_\alpha)}. Substituting eq. (7.82) into eq. (7.81), we obtain a function of \theta to minimize in the form

J(\theta) = \sum_{\alpha=1}^{N} \bigl((x_\alpha - \bar x)\cos\theta + (y_\alpha - \bar y)\sin\theta\bigr)^2.    (7.84)

If we put n = (\cos\theta, \sin\theta)^\top, this equation can be rewritten as

J[n] = (n, M n),    (7.85)

where M is the two-dimensional moment matrix

M = \begin{pmatrix} \sum_{\alpha=1}^{N}(x_\alpha - \bar x)^2 & \sum_{\alpha=1}^{N}(x_\alpha - \bar x)(y_\alpha - \bar y) \\ \sum_{\alpha=1}^{N}(x_\alpha - \bar x)(y_\alpha - \bar y) & \sum_{\alpha=1}^{N}(y_\alpha - \bar y)^2 \end{pmatrix}.    (7.86)
Since n is a unit vector, J[n] is minimized by the unit eigenvector of the moment matrix M for the smallest eigenvalue (see eqs. (2.86)). The value of d is given by eq. (7.82).
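A minimal Python sketch of this analytic solution (eqs. (7.81)-(7.86)); it is not from the book, and the function name and data are hypothetical:

    import numpy as np

    def fit_line_lsq(xs, ys):
        # Fit x*cos(t) + y*sin(t) = d by least squares, eqs. (7.82)-(7.86).
        xc, yc = xs.mean(), ys.mean()                      # centroid, eq. (7.83)
        dx, dy = xs - xc, ys - yc
        M = np.array([[dx @ dx, dx @ dy],
                      [dx @ dy, dy @ dy]])                 # moment matrix, eq. (7.86)
        w, V = np.linalg.eigh(M)
        n = V[:, 0]                                        # unit eigenvector for the smallest eigenvalue
        d = n[0] * xc + n[1] * yc                          # eq. (7.82)
        return n, d

    # hypothetical usage
    xs = np.array([0.0, 1.0, 2.0, 3.0])
    ys = np.array([0.1, 0.9, 2.1, 2.9])
    n, d = fit_line_lsq(xs, ys)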
Example 7.3  Suppose N is an odd number in Example 7.2. Let w be the distance between (x_1, y_1) and (x_N, y_N). Put

x_c = (x_{(N+1)/2}, y_{(N+1)/2}, 1)^\top,    (7.87)

which represents the midpoint. Let u be the unit vector that indicates the orientation of the image line (\hat n, x) = 0. If {(x_\alpha, y_\alpha)}, \alpha = 1, ..., N, are approximately equidistant, we have

x_\alpha \approx x_c + \frac{w}{N-1}\Bigl(\alpha - \frac{N+1}{2}\Bigr) u.    (7.88)

Then, eq. (7.77) is approximated by

V[\hat n] \approx \frac{\epsilon^2}{N(1+\hat d^2)} \Bigl( x_c x_c^\top + \frac{w^2(N+1)}{12(N-1)}\, u u^\top \Bigr)^-.    (7.89)

If the number N of the data points is very large, this expression gives the covariance matrix of the image line fitted to an edge segment having length w, orientation u, and midpoint x_c (Fig. 7.3). Define the edge density (the number of edge pixels per unit length) \rho by

\rho = \frac{N}{w}.    (7.90)
Fig. 7.3. Line fitting to an edge segment.

In the limit N \to \infty, we have the following asymptotic expression:

V[\hat n] \approx \frac{12\epsilon^2}{\rho w^3 (1+\hat d^2)} \Bigl( u u^\top + \frac{12}{w^2}\, x_c x_c^\top \Bigr)^-.    (7.91)

The distance \hat d of the fitted line from the image origin o is usually very small as compared with the focal length (which we take as the unit of length). If the edge segment is near the image origin o, we can assume that (x_c, u) \approx 0 and \|x_c\| \approx 1. The length w of the edge segment is usually very small as compared with the focal length, so 1 \ll 12/w^2. Then, eq. (7.91) has the following approximation:

V[\hat n] \approx \frac{12\epsilon^2}{\rho w^3}\, u u^\top.    (7.92)

It follows that the primary deviation pair is approximated as follows (see Section 4.5.3):

n^+ \approx N\Bigl[\hat n + \frac{2\sqrt{3}\,\epsilon}{\rho^{1/2} w^{3/2}}\, u\Bigr], \qquad n^- \approx N\Bigl[\hat n - \frac{2\sqrt{3}\,\epsilon}{\rho^{1/2} w^{3/2}}\, u\Bigr].    (7.93)

We see from this that the fitted line is very likely to pass near the midpoint of the edge segment. We also see that the error is approximately proportional to \rho^{-1/2} and w^{-3/2} (cf. Example 4.2 in Section 4.1.3).
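Under the same assumptions, eqs. (7.92)-(7.93) are easy to evaluate numerically. A small Python sketch (not from the book; names and the sample values are hypothetical, numpy assumed):

    import numpy as np

    def edge_line_uncertainty(eps, rho, w, n_hat, u):
        # Asymptotic covariance of eq. (7.92) and primary deviation pair of eq. (7.93).
        V = (12.0 * eps**2 / (rho * w**3)) * np.outer(u, u)        # eq. (7.92)
        delta = (2.0 * np.sqrt(3.0) * eps / (np.sqrt(rho) * w**1.5)) * u
        n_plus = (n_hat + delta) / np.linalg.norm(n_hat + delta)   # eq. (7.93)
        n_minus = (n_hat - delta) / np.linalg.norm(n_hat - delta)
        return V, n_plus, n_minus

    # hypothetical edge segment: length 0.2 (in focal-length units), noise eps,
    # 100 edge points, horizontal orientation
    V, n_p, n_m = edge_line_uncertainty(eps=0.001, rho=100 / 0.2, w=0.2,
                                        n_hat=np.array([0.0, 1.0, 0.0]),
                                        u=np.array([1.0, 0.0, 0.0]))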
7.3 Optimal Fitting for Image Lines

7.3.1 Image point fitting
Given N image lines {(n_\alpha, x) = 0}, \alpha = 1, ..., N, consider the problem of fitting an image point x to them. This means estimating their common intersection. We write

n_\alpha = \bar n_\alpha + \Delta n_\alpha,    (7.94)

and regard each \Delta n_\alpha as an independent random variable with mean 0 and covariance matrix V[n_\alpha]. The hypothesis is

(\bar n_\alpha, x) = 0, \quad \alpha = 1, ..., N,    (7.95)
which has rank 1. An optimal estimate of x can be obtained by the optimization

J[x] = \sum_{\alpha=1}^{N} \frac{(n_\alpha, x)^2}{(x, V[n_\alpha] x)} \to \min    (7.96)

under the constraint (k, x) = 1. The covariance matrix of the solution \hat x is given by

V[\hat x] = \Bigl( \sum_{\alpha=1}^{N} \frac{(P_k \hat n_\alpha)(P_k \hat n_\alpha)^\top}{(\hat x, V[n_\alpha] \hat x)} \Bigr)^-,    (7.97)

where \hat n_\alpha is the optimally corrected value of n_\alpha (see eq. (5.91)). The rank of V[\hat x] is 2; its null space is {k}_L. The residual J[\hat x] is a \chi^2 variable with N - 2 degrees of freedom. Evidently, the number N of image lines must be such that

N \ge 2.    (7.98)

The residual is 0 when N = 2. For N \ge 3, we obtain a concurrency test for image lines: the hypothesis that the image lines {(n_\alpha, x) = 0} are concurrent is rejected with significance level a% if

J[\hat x] > \chi^2_{N-2, a}.    (7.99)

The N - 2 degrees of freedom can be intuitively interpreted as follows. A free image line has two degrees of freedom. An image line passing through an image point keeps the incidence if it changes its orientation around that image point, so there remains one degree of freedom to break the incidence. Consequently, N image lines have N degrees of freedom for their deviations from the true intersection \bar x. But if we consider their deviations from the estimated intersection \hat x, no deviations occur when N = 2. Hence, the degrees of freedom decrease by two.

Example 7.4  Suppose each image line l_\alpha is likely to be translated by noise into a position parallel to l_\alpha, and suppose the distance of such a parallel translation is an independent random variable for each image line with mean 0 and standard deviation \epsilon, which is assumed to be very small. If image line l_\alpha is represented by (n_\alpha, x) = 0, the covariance matrix of n_\alpha for this statistical model is

V[n_\alpha] = \epsilon^2 (1 - (k, n_\alpha)^2)(P_{n_\alpha} k)(P_{n_\alpha} k)^\top.    (7.100)

If image point x is very close to each image line (n_\alpha, x) = 0, we see that

(x, V[n_\alpha] x) = \epsilon^2 (1 - (k, n_\alpha)^2)(x, P_{n_\alpha} k)^2 = \epsilon^2 (1 - (k, n_\alpha)^2)(P_{n_\alpha} x, k)^2 \approx \epsilon^2 (1 - (k, n_\alpha)^2)(x, k)^2 = \epsilon^2 (1 - (k, n_\alpha)^2).    (7.101)
Fig. 7.4. Intersection estimation by least squares.

Hence, eq. (7.96) reduces to the following least-squares optimization (Fig. 7.4; see eq. (4.11)):

J_0[x] = \sum_{\alpha=1}^{N} \frac{(n_\alpha, x)^2}{1 - (k, n_\alpha)^2} = \sum_{\alpha=1}^{N} D(p, l_\alpha)^2 \to \min.    (7.102)
The solution can be obtained analytically. In fact, let x\cos\theta_\alpha + y\sin\theta_\alpha = d_\alpha be the \alpha th image line. The function to minimize can be written in the following form:

J(x, y) = \sum_{\alpha=1}^{N} (x\cos\theta_\alpha + y\sin\theta_\alpha - d_\alpha)^2.    (7.103)

This is a quadratic polynomial in x and y, so the solution is given by solving the following linear equation:

\begin{pmatrix} \sum_{\alpha=1}^{N}\cos^2\theta_\alpha & \sum_{\alpha=1}^{N}\cos\theta_\alpha\sin\theta_\alpha \\ \sum_{\alpha=1}^{N}\cos\theta_\alpha\sin\theta_\alpha & \sum_{\alpha=1}^{N}\sin^2\theta_\alpha \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \sum_{\alpha=1}^{N} d_\alpha\cos\theta_\alpha \\ \sum_{\alpha=1}^{N} d_\alpha\sin\theta_\alpha \end{pmatrix}.    (7.104)

However, the statistical model of noise given by eq. (7.100) is very artificial and unnatural (see Examples 4.2 and 7.3). Hence, the least-squares optimization (7.102) for intersection estimation is not appropriate even though the least-squares optimization (7.76) is appropriate for line fitting.
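For completeness, eq. (7.104) can be solved in a few lines of Python; this sketch is not from the book, and the name and the sample lines are hypothetical:

    import numpy as np

    def intersect_lines_lsq(thetas, ds):
        # Solve eq. (7.104) for the least-squares intersection of the lines
        # x*cos(theta_a) + y*sin(theta_a) = d_a.
        c, s = np.cos(thetas), np.sin(thetas)
        A = np.array([[c @ c, c @ s],
                      [c @ s, s @ s]])
        b = np.array([ds @ c, ds @ s])
        return np.linalg.solve(A, b)        # (x, y)

    # hypothetical usage: the lines x = 1, y = 1, and (x + y)/sqrt(2) = sqrt(2)
    x, y = intersect_lines_lsq(np.array([0.0, np.pi/2, np.pi/4]),
                               np.array([1.0, 1.0, np.sqrt(2.0)]))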
7.3.2 Image line fitting
Given N image lines {(n_\alpha, x) = 0}, \alpha = 1, ..., N, consider the problem of fitting an image line (n, x) = 0 to them. This means finding an optimal average \hat n of {n_\alpha}. We write

n_\alpha = \bar n_\alpha + \Delta n_\alpha,    (7.105)

and regard each \Delta n_\alpha as an independent random variable of mean 0 and covariance matrix V[n_\alpha]. The hypothesis is

\bar n_\alpha = n, \quad \alpha = 1, ..., N,    (7.106)
which has rank 2 because both sides are unit vectors^7. For each n_\alpha, the optimal estimate of its true value \bar n_\alpha is evidently n, so the optimal average \hat n can be obtained by the optimization

J[n] = \sum_{\alpha=1}^{N} (n_\alpha - n, V[n_\alpha]^-(n_\alpha - n)) \to \min    (7.107)

under the constraint \|n\| = 1. The covariance matrix V[n_\alpha] has the null space {n_\alpha}_L, so V[n_\alpha]^- n_\alpha = 0. Hence, eq. (7.107) can be rewritten as

J[n] = \Bigl( n, \Bigl( \sum_{\alpha=1}^{N} V[n_\alpha]^- \Bigr) n \Bigr) \to \min.    (7.108)

The solution \hat n is given by the unit eigenvector of the matrix

M = \sum_{\alpha=1}^{N} V[n_\alpha]^-    (7.109)

for the smallest eigenvalue (see eqs. (2.86)). The covariance matrix of the solution \hat n is

V[\hat n] = \Bigl( \sum_{\alpha=1}^{N} P_{\hat n} V[n_\alpha]^- P_{\hat n} \Bigr)^-,    (7.110)

which has rank 2; its null space is {\hat n}_L. The residual J[\hat n] is a \chi^2 variable with 2(N-1) degrees of freedom. Evidently, the number N of image lines must be such that

N \ge 1.    (7.111)

The residual is 0 when N = 1. For N \ge 2, we obtain a coincidence test for image lines: the hypothesis that the image lines {(n_\alpha, x) = 0} coincide is rejected with significance level a% if

J[\hat n] > \chi^2_{2(N-1), a}.    (7.112)

The 2(N-1) degrees of freedom can be intuitively interpreted as follows. An image line has two degrees of freedom, so N image lines have 2N degrees of freedom for their deviations from the true position (\bar n, x) = 0. But if we consider deviations from the average (\hat n, x) = 0, no deviation occurs when N = 1. Hence, the degrees of freedom decrease by two.

^7 This hypothesis is degenerate.
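A Python sketch of eqs. (7.109)-(7.110) (not from the book; the function name is hypothetical, and eq. (7.110) is coded as reconstructed above):

    import numpy as np

    def average_image_lines(ns, Vs):
        # ns: (N,3) unit line vectors n_a; Vs: list of 3x3 covariances with null space {n_a}.
        M = sum(np.linalg.pinv(V) for V in Vs)        # eq. (7.109)
        w, E = np.linalg.eigh(M)
        n_hat = E[:, 0]                               # unit eigenvector for the smallest eigenvalue
        if np.mean(ns @ n_hat) < 0:
            n_hat = -n_hat                            # fix the sign so n_hat matches the data orientation
        P = np.eye(3) - np.outer(n_hat, n_hat)        # projection along n_hat
        V_hat = np.linalg.pinv(sum(P @ np.linalg.pinv(V) @ P for V in Vs))   # eq. (7.110)
        J = sum(float((n - n_hat) @ np.linalg.pinv(V) @ (n - n_hat)) for n, V in zip(ns, Vs))
        return n_hat, V_hat, J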
7.4 Optimal Fitting for Space Points

7.4.1 Space point fitting
Given N space points {r_\alpha}, \alpha = 1, ..., N, consider the problem of fitting a space point r to them. This means finding an optimal average \hat r of {r_\alpha}. We write

r_\alpha = \bar r_\alpha + \Delta r_\alpha,    (7.113)

and regard each \Delta r_\alpha as an independent random variable of mean 0 and covariance matrix V[r_\alpha]. The hypothesis is

\bar r_\alpha = r, \quad \alpha = 1, ..., N,    (7.114)

which has rank 3. For each r_\alpha, the optimal estimate of its true value \bar r_\alpha is evidently r, so the optimal average can be obtained by the optimization

J[r] = \sum_{\alpha=1}^{N} (r_\alpha - r, V[r_\alpha]^{-1}(r_\alpha - r)) \to \min.    (7.115)

The solution is given by

\hat r = \Bigl( \sum_{\alpha=1}^{N} V[r_\alpha]^{-1} \Bigr)^{-1} \sum_{\alpha=1}^{N} V[r_\alpha]^{-1} r_\alpha.    (7.116)

Its covariance matrix is

V[\hat r] = \Bigl( \sum_{\alpha=1}^{N} V[r_\alpha]^{-1} \Bigr)^{-1},    (7.117)

which has rank 3. The residual J[\hat r] is a \chi^2 variable with 3(N-1) degrees of freedom. Evidently, the number N of space points must be such that

N \ge 1.    (7.118)

The residual is 0 when N = 1. For N \ge 2, we obtain a coincidence test for space points: the hypothesis that the space points {r_\alpha} coincide is rejected with significance level a% if

J[\hat r] > \chi^2_{3(N-1), a}.    (7.119)

The 3(N-1) degrees of freedom can be intuitively interpreted as follows. A space point has three degrees of freedom, so N space points have 3N degrees of freedom for their deviations from the true position \bar r. But if we consider deviations from the average \hat r, no deviation occurs when N = 1. Hence, the degrees of freedom decrease by three.
Fig. 7.5. Optimally averaging space points by least squares.

Example 7.5  If each coordinate is perturbed independently by Gaussian noise of mean 0 and variance \epsilon^2, the covariance matrix of each space point r_\alpha is V[r_\alpha] = \epsilon^2 I, so eq. (7.115) reduces to the least-squares optimization

J_0[r] = \sum_{\alpha=1}^{N} \|r_\alpha - r\|^2 = \sum_{\alpha=1}^{N} D(P_\alpha, P)^2 \to \min,    (7.120)

where D(P_\alpha, P) is the distance from the \alpha th space point P_\alpha to the space point P to be fitted (Fig. 7.5). The optimal average given by eq. (7.116) reduces to

\hat r = \frac{1}{N} \sum_{\alpha=1}^{N} r_\alpha.    (7.121)

Its covariance matrix is

V[\hat r] = \frac{\epsilon^2}{N} I.    (7.122)

An unbiased estimator of the variance \epsilon^2 is obtained in the form

\hat\epsilon^2 = \frac{1}{3(N-1)} \sum_{\alpha=1}^{N} \|r_\alpha - \hat r\|^2.    (7.123)

If the value \epsilon is given a priori, the coincidence test takes the form

\frac{\hat\epsilon^2}{\epsilon^2} > \frac{\chi^2_{3(N-1), a}}{3(N-1)}.    (7.124)
7.4.2 Space line fitting
Given N space points {r_\alpha}, \alpha = 1, ..., N, consider the problem of fitting a space line r \times p = n to them. We write

r_\alpha = \bar r_\alpha + \Delta r_\alpha,    (7.125)

and regard each \Delta r_\alpha as an independent random variable with mean 0 and covariance matrix V[r_\alpha]. The hypothesis is

\bar r_\alpha \times p = n, \quad \alpha = 1, ..., N,    (7.126)
which has rank 2 because both sides are orthogonal to p. If we let

e^{(1)} = (1, 0, 0)^\top, \quad e^{(2)} = (0, 1, 0)^\top, \quad e^{(3)} = (0, 0, 1)^\top,    (7.127)

and

r_\alpha^{(k)} = r_\alpha \times e^{(k)}, \quad k = 1, 2, 3,    (7.128)

the hypothesis (7.126) can be equivalently written in the form

(\bar r_\alpha^{(k)}, p) + (e^{(k)}, n) = 0, \quad k = 1, 2, 3.    (7.129)

If we define 6-vectors

a_\alpha^{(k)} = r_\alpha^{(k)} \oplus e^{(k)}, \qquad u = p \oplus n,    (7.130)

eq. (7.129) can be further rewritten in the form

(\bar a_\alpha^{(k)}, u) = 0, \quad k = 1, 2, 3, \quad \alpha = 1, ..., N,    (7.131)

where the bar refers to the true value. Eq. (7.131) has the same form as eq. (7.46), so the result in Section 7.1.5 can be applied. Since e^{(k)} does not incur noise, the covariance matrix V[a_\alpha^{(k)}, a_\alpha^{(l)}] has the form

V[a_\alpha^{(k)}, a_\alpha^{(l)}] = V[r_\alpha^{(k)}, r_\alpha^{(l)}] \oplus O.    (7.132)

The covariance matrix of r_\alpha^{(k)} is given as follows (see eq. (2.43)):

V[r_\alpha^{(k)}, r_\alpha^{(l)}] = e^{(k)} \times V[r_\alpha] \times e^{(l)}.    (7.133)

From eqs. (7.128) and (7.130), we see that

(a_\alpha^{(k)}, u) = -(r_\alpha \times p - n, e^{(k)}).    (7.134)

Hence, eq. (7.55) reduces to the optimization

J[p \oplus n] = \sum_{\alpha=1}^{N} (r_\alpha \times p - n, W_\alpha(p)(r_\alpha \times p - n)) \to \min    (7.135)

under the constraints (p, n) = 0 and \|p\|^2 + \|n\|^2 = 1. The (33)-matrix W_\alpha(p) is given by

W_\alpha(p) = (p \times V[r_\alpha] \times p)^-.    (7.136)
The covariance matrix of the solution \hat p \oplus \hat n is given by eq. (7.57), which reduces to

V[\hat p \oplus \hat n] = \Bigl( P_{\hat{\mathcal N}^\perp} \begin{pmatrix} \sum_{\alpha=1}^{N} \hat r_\alpha \times W_\alpha(\hat p) \times \hat r_\alpha & \sum_{\alpha=1}^{N} \hat r_\alpha \times W_\alpha(\hat p) \\ \sum_{\alpha=1}^{N} W_\alpha(\hat p) \times \hat r_\alpha & \sum_{\alpha=1}^{N} W_\alpha(\hat p) \end{pmatrix} P_{\hat{\mathcal N}^\perp} \Bigr)^-,    (7.137)

where \hat r_\alpha is the optimally corrected value of r_\alpha (see eq. (5.129)). Here, P_{\hat{\mathcal N}^\perp} is the six-dimensional projection matrix onto \mathcal N^\perp_{\hat p \oplus \hat n} (see eqs. (4.43) and (4.44)). The rank of the covariance matrix V[\hat p \oplus \hat n] is 4; its null space is \mathcal N_{\hat p \oplus \hat n}. The residual J[\hat p \oplus \hat n] is a \chi^2 variable with 2(N-2) degrees of freedom. Evidently, the number N of space points must be such that

N \ge 2.    (7.138)

The residual is 0 when N = 2. For N \ge 3, we obtain a collinearity test for space points: the hypothesis that the space points {r_\alpha} are collinear is rejected with significance level a% if

J[\hat p \oplus \hat n] > \chi^2_{2(N-2), a}.    (7.139)

The 2(N-2) degrees of freedom can be intuitively interpreted as follows. A free space point has three degrees of freedom. A space point constrained to be on a space line keeps the incidence if it moves along that space line, so there remain two degrees of freedom to break the incidence. Consequently, N space points have 2N degrees of freedom for their deviations from the true space line r \times \bar p = \bar n. But if we consider their deviations from the fitted space line r \times \hat p = \hat n, no deviations occur when N = 2. Hence, the degrees of freedom decrease by four.

Example 7.6  If each coordinate is perturbed independently by Gaussian noise of mean 0 and variance \epsilon^2, space point r_\alpha has covariance matrix V[r_\alpha] = \epsilon^2 I, so eq. (7.136) can be rewritten as follows (see eq. (2.42)):
W_\alpha(p) = (p \times \epsilon^2 I \times p)^- = \frac{1}{\epsilon^2\|p\|^2}\Bigl( I - \frac{p p^\top}{\|p\|^2} \Bigr).    (7.140)

Then, eq. (7.135) reduces to the least-squares optimization

J_0[p \oplus n] = \sum_{\alpha=1}^{N} \frac{\|r_\alpha \times p - n\|^2}{\|p\|^2} = \sum_{\alpha=1}^{N} D(P_\alpha, L)^2 \to \min,    (7.141)
Fig. 7.6. Space line fitting by least squares.

where D(P_\alpha, L) is the distance from the \alpha th space point to the space line L to be fitted (Fig. 7.6; see eq. (4.49)). Eq. (7.137) reduces to

V[\hat p \oplus \hat n] = \epsilon^2\|\hat p\|^2 \Bigl( P_{\hat{\mathcal N}^\perp} \begin{pmatrix} \sum_{\alpha=1}^{N} \hat r_\alpha \times I \times \hat r_\alpha & \sum_{\alpha=1}^{N} \hat r_\alpha \times I \\ \sum_{\alpha=1}^{N} (\hat r_\alpha \times I)^\top & N I \end{pmatrix} P_{\hat{\mathcal N}^\perp} \Bigr)^-,    (7.142)

where \hat r_\alpha is the orthogonal projection of r_\alpha onto the fitted space line (see eq. (5.131)). An unbiased estimator of \epsilon^2 is obtained in the form

\hat\epsilon^2 = \frac{1}{2(N-2)} \sum_{\alpha=1}^{N} \frac{\|r_\alpha \times \hat p - \hat n\|^2}{\|\hat p\|^2}.    (7.143)

If the value \epsilon is given a priori, the collinearity test takes the form

\frac{\hat\epsilon^2}{\epsilon^2} > \frac{\chi^2_{2(N-2), a}}{2(N-2)}.    (7.144)
The solution of the least-squares optimization (7.141) can be obtained analytically. In fact, let (r - r_H) \times m = 0 be the {m, r_H}-representation of the space line to be fitted. The function to minimize can be written in the following form (see eq. (4.49)):

J[m, r_H] = \sum_{\alpha=1}^{N} \|P_m r_\alpha - r_H\|^2.    (7.145)

Differentiating this with respect to r_H and setting the result 0, we obtain r_H in the form

r_H = P_m \bar r, \qquad \bar r = \frac{1}{N} \sum_{\alpha=1}^{N} r_\alpha.    (7.146)

This means that the space line should pass through the centroid \bar r of the data {r_\alpha}. If we note the identity \|P_m a\|^2 = \|a\|^2 - (m, a)^2 for an arbitrary
vector a (see Fig. 2.2), eq. (7.145) reduces to a function of m in the form

J[m] = \sum_{\alpha=1}^{N} \|P_m(r_\alpha - \bar r)\|^2 = \sum_{\alpha=1}^{N} \|r_\alpha - \bar r\|^2 - \sum_{\alpha=1}^{N} (m, r_\alpha - \bar r)^2 = \sum_{\alpha=1}^{N} \|r_\alpha - \bar r\|^2 - (m, M m),    (7.147)

where M is the moment matrix

M = \sum_{\alpha=1}^{N} (r_\alpha - \bar r)(r_\alpha - \bar r)^\top.    (7.148)
The function J[m] is minimized if (m, M m) is maximized. Since m is a unit vector, the solution is obtained as the unit eigenvector of the moment matrix M for the largest eigenvalue (see eqs. (2.86)). The vector r_H is given by eq. (7.146).
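A Python sketch of this analytic space line fit (eqs. (7.145)-(7.148)); not from the book, names and data hypothetical:

    import numpy as np

    def fit_space_line_lsq(rs):
        # rs: (N,3) space points; returns unit direction m and the point r_H nearest the origin.
        r_bar = rs.mean(axis=0)                         # centroid, eq. (7.146)
        M = (rs - r_bar).T @ (rs - r_bar)               # moment matrix, eq. (7.148)
        w, E = np.linalg.eigh(M)
        m = E[:, -1]                                    # eigenvector for the largest eigenvalue
        r_H = r_bar - (m @ r_bar) * m                   # r_H = P_m r_bar, eq. (7.146)
        return m, r_H

    # hypothetical usage
    pts = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 0.1], [2.0, 2.0, -0.1], [3.0, 3.0, 0.0]])
    m, r_H = fit_space_line_lsq(pts)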
7.4.3 Space plane fitting
Given N space points {\rho_\alpha}, \alpha = 1, ..., N, consider the problem of fitting a space plane (\nu, \rho) = 0 to them. We write

\rho_\alpha = \bar\rho_\alpha + \Delta\rho_\alpha,    (7.149)

and regard each \Delta\rho_\alpha as an independent random variable with mean 0 and covariance matrix V[\rho_\alpha]. The hypothesis is

(\nu, \bar\rho_\alpha) = 0, \quad \alpha = 1, ..., N,    (7.150)

which has rank 1. An optimal estimate \hat\nu can be obtained by the optimization

J[\nu] = \sum_{\alpha=1}^{N} \frac{(\nu, \rho_\alpha)^2}{(\nu, V[\rho_\alpha]\nu)} \to \min    (7.151)

under the constraint \|\nu\| = 1. The covariance matrix of the solution \hat\nu is

V[\hat\nu] = \Bigl( \sum_{\alpha=1}^{N} \frac{(P_{\hat\nu}\hat\rho_\alpha)(P_{\hat\nu}\hat\rho_\alpha)^\top}{(\hat\nu, V[\rho_\alpha]\hat\nu)} \Bigr)^-,    (7.152)

where \hat\rho_\alpha is the optimally corrected value of \rho_\alpha (see eq. (5.160)). The rank of V[\hat\nu] is 3; its null space is {\hat\nu}_L. The residual J[\hat\nu] is a \chi^2 variable with N - 3 degrees of freedom. Evidently, the number N of space points must be such that

N \ge 3.    (7.153)
Fig. 7.7. Space plane fitting by least squares.

The residual is 0 when N = 3. For N \ge 4, we obtain a coplanarity test for space points: the hypothesis that the space points {r_\alpha} are coplanar is rejected with significance level a% if

J[\hat\nu] > \chi^2_{N-3, a}.    (7.154)

The N - 3 degrees of freedom can be intuitively interpreted as follows. A free space point has three degrees of freedom. A space point constrained to be on a space plane keeps the incidence if it moves within that space plane, so there remains one degree of freedom to break the incidence. Consequently, N space points have N degrees of freedom for their deviations from the true space plane (\bar\nu, \rho) = 0. But if we consider their deviations from the fitted space plane (\hat\nu, \rho) = 0, no deviations occur when N = 3. Hence, the degrees of freedom decrease by three.

Example 7.7  If each coordinate is perturbed independently by Gaussian noise of mean 0 and variance \epsilon^2, space point \rho_\alpha has covariance matrix V[\rho_\alpha] = \epsilon^2 I \oplus 0 = \epsilon^2(I - \kappa\kappa^\top), where \kappa = (0, 0, 0, 1)^\top. In this case, eq. (7.151) reduces to the least-squares optimization

J_0[\nu] = \sum_{\alpha=1}^{N} \frac{(\nu, \rho_\alpha)^2}{1 - (\kappa, \nu)^2} = \sum_{\alpha=1}^{N} D(P_\alpha, \Pi)^2 \to \min,    (7.155)

where D(P_\alpha, \Pi) is the distance from the \alpha th space point to the space plane \Pi to be fitted (Fig. 7.7; see eq. (4.68)). The covariance matrix of the solution \hat\nu for space points {(X_\alpha, Y_\alpha, Z_\alpha)} is given in the form

V[\hat\nu] = \frac{\epsilon^2}{1+\hat d^2} \Bigl( \sum_{\alpha=1}^{N} \hat\rho_\alpha \hat\rho_\alpha^\top \Bigr)^-, \qquad \hat\rho_\alpha = (\hat X_\alpha, \hat Y_\alpha, \hat Z_\alpha, 1)^\top,    (7.156)

where \hat d is the distance of the space plane (\hat\nu, \rho) = 0 from the origin O (see eqs. (4.63)). Here, (\hat X_\alpha, \hat Y_\alpha, \hat Z_\alpha) is the orthogonal projection of (X_\alpha, Y_\alpha, Z_\alpha)
onto the fitted space plane (see eq. (5.163)). If we write \hat\nu = (\hat A, \hat B, \hat C, \hat D)^\top, the residual of the minimization (7.155) can be written as

J_0[\hat\nu] = (1 + \hat d^2) \sum_{\alpha=1}^{N} (\hat A X_\alpha + \hat B Y_\alpha + \hat C Z_\alpha + \hat D)^2.    (7.157)

Hence, an unbiased estimator of the variance \epsilon^2 is obtained in the form

\hat\epsilon^2 = \frac{1+\hat d^2}{N-3} \sum_{\alpha=1}^{N} (\hat A X_\alpha + \hat B Y_\alpha + \hat C Z_\alpha + \hat D)^2.    (7.158)

If the value \epsilon is given a priori, the coplanarity test takes the form

\frac{\hat\epsilon^2}{\epsilon^2} > \frac{\chi^2_{N-3, a}}{N-3}.    (7.159)
The solution of the least-squares optimization (7.155) can be obtained analytically. In fact, let (n, r) = d be the {n, d}-representation of the space plane to be fitted. The function to minimize can be written in the following form (see eq. (4.68)):

J(n, d) = \sum_{\alpha=1}^{N} ((n, r_\alpha) - d)^2.    (7.160)

Differentiating this with respect to d and setting the result 0, we obtain d in the form

d = (n, \bar r), \qquad \bar r = \frac{1}{N} \sum_{\alpha=1}^{N} r_\alpha.    (7.161)

This means that the space plane should pass through the centroid \bar r of the data {r_\alpha}. Substituting eq. (7.161) into eq. (7.160), we obtain a function of n to minimize in the form

J[n] = \sum_{\alpha=1}^{N} (n, r_\alpha - \bar r)^2 = (n, M n),    (7.162)

where M is the moment matrix

M = \sum_{\alpha=1}^{N} (r_\alpha - \bar r)(r_\alpha - \bar r)^\top.    (7.163)

Since n is a unit vector, J[n] is minimized by the unit eigenvector of the moment matrix M for the smallest eigenvalue (see eqs. (2.86)). The value of d is given by eq. (7.161).
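A Python sketch of this analytic plane fit (eqs. (7.160)-(7.163)); not from the book, the name and data are hypothetical:

    import numpy as np

    def fit_space_plane_lsq(rs):
        # rs: (N,3) space points; returns unit normal n and offset d of the plane (n, r) = d.
        r_bar = rs.mean(axis=0)                         # centroid
        M = (rs - r_bar).T @ (rs - r_bar)               # moment matrix, eq. (7.163)
        w, E = np.linalg.eigh(M)
        n = E[:, 0]                                     # eigenvector for the smallest eigenvalue
        d = n @ r_bar                                   # eq. (7.161)
        return n, d

    # hypothetical usage
    pts = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 1.02], [0.0, 1.0, 0.98], [1.0, 1.0, 1.0]])
    n, d = fit_space_plane_lsq(pts)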
7.5 Optimal Fitting for Space Lines

7.5.1 Space point fitting
Given N space lines {r \times p_\alpha = n_\alpha}, \alpha = 1, ..., N, consider the problem of fitting a space point r to them. This means optimally estimating their common intersection. We write

p_\alpha = \bar p_\alpha + \Delta p_\alpha, \qquad n_\alpha = \bar n_\alpha + \Delta n_\alpha,    (7.164)

and regard \Delta p_\alpha and \Delta n_\alpha as random variables of mean 0 and covariance matrix V[p_\alpha \oplus n_\alpha], independent for each \alpha. The hypothesis^8 is

r \times \bar p_\alpha = \bar n_\alpha, \quad \alpha = 1, ..., N,    (7.165)
which has rank 2 because both sides are orthogonal to \bar p_\alpha. Let \rho = (X, Y, Z, 1)^\top, and define e^{(1)}, e^{(2)}, and e^{(3)} by eqs. (7.127). If we define 4-vectors

a_\alpha^{(k)} = (e^{(k)} \times p_\alpha) \oplus (e^{(k)}, n_\alpha),    (7.166)

the hypothesis (7.165) can be rewritten in the form

(\bar a_\alpha^{(k)}, \rho) = 0, \quad k = 1, 2, 3,    (7.167)

where the bar refers to the true value. Since e^{(k)} does not incur noise, the covariance matrix V[a_\alpha^{(k)}, a_\alpha^{(l)}] is given as follows (see eqs. (2.39) and (2.44)):

V[a_\alpha^{(k)}, a_\alpha^{(l)}] = \begin{pmatrix} e^{(k)} \times V[p_\alpha] \times e^{(l)} & e^{(k)} \times V[p_\alpha, n_\alpha] e^{(l)} \\ (e^{(l)} \times V[p_\alpha, n_\alpha] e^{(k)})^\top & (e^{(k)}, V[n_\alpha] e^{(l)}) \end{pmatrix}.    (7.168)

From eq. (7.166), we see that

(a_\alpha^{(k)}, \rho) = -(r \times p_\alpha - n_\alpha, e^{(k)}).    (7.169)

Hence, the optimization (7.55) can be rewritten in the form

J[r] = \sum_{\alpha=1}^{N} (r \times p_\alpha - n_\alpha, W_\alpha(r)(r \times p_\alpha - n_\alpha)) \to \min,    (7.170)

where the (33)-matrix W_\alpha(r) is given by

W_\alpha(r) = \bigl( r \times V[p_\alpha] \times r - r \times V[p_\alpha, n_\alpha] - V[n_\alpha, p_\alpha] \times r + V[n_\alpha] \bigr)^-_2.    (7.171)

^8 This hypothesis is degenerate.
The covariance matrix of the solution \hat r is given by eq. (7.57), which reduces to

V[\hat r] = \Bigl( \sum_{\alpha=1}^{N} \hat p_\alpha \times W_\alpha(\hat r) \times \hat p_\alpha \Bigr)^{-1},    (7.172)

where \hat p_\alpha is the optimally corrected value of p_\alpha (see eqs. (5.137)). The residual J[\hat r] is a \chi^2 variable with 2N - 3 degrees of freedom. Evidently, the number N of space lines must be such that

N \ge 2.    (7.173)

However, the residual is not 0 when N = 2. For N \ge 3, we obtain a concurrency test for space lines: the hypothesis that the space lines {r \times p_\alpha = n_\alpha} are concurrent is rejected with significance level a% if

J[\hat r] > \chi^2_{2N-3, a}.    (7.174)

The 2N - 3 degrees of freedom can be intuitively interpreted as follows. A free space line has four degrees of freedom. If it is constrained to pass through a space point, it can still change its orientation freely around that space point, so there remain two degrees of freedom to break the incidence. Consequently, N space lines have 2N degrees of freedom for their deviations from the true intersection \bar r. But we are considering their deviations from the estimated intersection \hat r. For N = 2, there remains only one degree of freedom to break the concurrency: the freedom of translation orthogonal to the two space lines. Hence, the total degrees of freedom are 2(N - 2) + 1 = 2N - 3.
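The weights W_\alpha(r) above require the data covariances. As a simple unweighted counterpart (not the optimal estimate of this section), the point minimizing the sum of squared distances to the lines has a closed form; a hypothetical Python sketch, with each line given by a point c_\alpha and a unit direction m_\alpha:

    import numpy as np

    def closest_point_to_lines(cs, ms):
        # cs: (N,3) points on the lines, ms: (N,3) unit directions.
        # Minimize sum_a || (I - m_a m_a^T)(r - c_a) ||^2 over r.
        A = np.zeros((3, 3))
        b = np.zeros(3)
        for c, m in zip(cs, ms):
            P = np.eye(3) - np.outer(m, m)     # projection orthogonal to the line direction
            A += P
            b += P @ c
        return np.linalg.solve(A, b)

    # hypothetical usage: a line through the origin along x and a line through (0,1,0) along z
    r = closest_point_to_lines(np.array([[0.0, 0.0, 0.0], [0.0, 1.0, 0.0]]),
                               np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]))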
7.5.2 Space line fitting
Given N space lines {r \times p_\alpha = n_\alpha}, \alpha = 1, ..., N, consider the problem of fitting a space line r \times p = n to them. This means finding optimal averages \hat p and \hat n of {p_\alpha} and {n_\alpha}, respectively. We write

p_\alpha = \bar p_\alpha + \Delta p_\alpha, \qquad n_\alpha = \bar n_\alpha + \Delta n_\alpha,    (7.175)

and regard \Delta p_\alpha and \Delta n_\alpha as random variables of mean 0 and covariance matrix V[p_\alpha \oplus n_\alpha], independent for each \alpha. The hypothesis is

\bar p_\alpha = p, \quad \bar n_\alpha = n, \quad \alpha = 1, ..., N,    (7.176)

which has rank 4 because {p, n} have four degrees of freedom^9. For each \alpha, the optimal estimates of the true values \bar p_\alpha and \bar n_\alpha are evidently p and n, so the optimal averages can be obtained by the optimization

J[p, n] = \sum_{\alpha=1}^{N} (p_\alpha \oplus n_\alpha - p \oplus n, V[p_\alpha \oplus n_\alpha]^-(p_\alpha \oplus n_\alpha - p \oplus n)) \to \min    (7.177)

^9 This hypothesis is degenerate.
under the constraints (p, n) = 0 and \|p\|^2 + \|n\|^2 = 1. Since the covariance matrix V[p_\alpha \oplus n_\alpha] has null space \mathcal N_{p_\alpha \oplus n_\alpha} = {p_\alpha \oplus n_\alpha,\; n_\alpha \oplus p_\alpha}_L \subset \mathcal R^6 (see eq. (4.43)), we have V[p_\alpha \oplus n_\alpha]^-(p_\alpha \oplus n_\alpha) = 0. Hence, the optimization (7.177) can be rewritten in the form

J[p, n] = \Bigl( p \oplus n, \Bigl( \sum_{\alpha=1}^{N} V[p_\alpha \oplus n_\alpha]^- \Bigr)(p \oplus n) \Bigr) \to \min.    (7.178)

The solution \hat p \oplus \hat n is given by the unit eigenvector of the matrix

M = \sum_{\alpha=1}^{N} V[p_\alpha \oplus n_\alpha]^-    (7.179)

for the smallest eigenvalue (see eqs. (2.86)). The covariance matrix of the solution \hat p \oplus \hat n is

V[\hat p \oplus \hat n] = \Bigl( \sum_{\alpha=1}^{N} P_{\hat{\mathcal N}^\perp} V[p_\alpha \oplus n_\alpha]^- P_{\hat{\mathcal N}^\perp} \Bigr)^-,    (7.180)

where P_{\hat{\mathcal N}^\perp} is the six-dimensional projection matrix onto \mathcal N^\perp_{\hat p \oplus \hat n} (see eq. (4.44)). The covariance matrix V[\hat p \oplus \hat n] has rank 4; its null space is \mathcal N_{\hat p \oplus \hat n}. The residual

J[\hat p, \hat n] = \sum_{\alpha=1}^{N} (p_\alpha \oplus n_\alpha - \hat p \oplus \hat n, V[p_\alpha \oplus n_\alpha]^-(p_\alpha \oplus n_\alpha - \hat p \oplus \hat n))    (7.181)

is a \chi^2 variable with 4(N-1) degrees of freedom. Evidently, the number N of space lines must be such that

N \ge 1.    (7.182)

The residual is 0 when N = 1. For N \ge 2, we obtain a coincidence test for space lines: the hypothesis that the space lines {r \times p_\alpha = n_\alpha} coincide is rejected with significance level a% if

J[\hat p, \hat n] > \chi^2_{4(N-1), a}.    (7.183)

The 4(N-1) degrees of freedom can be intuitively interpreted as follows. A free space line has four degrees of freedom, so N space lines have 4N degrees of freedom for their deviations from the true position r \times \bar p = \bar n. But if we consider deviations from the average r \times \hat p = \hat n, no deviation occurs when N = 1. Hence, the degrees of freedom decrease by four.
7.5.3 Space plane fitting
Given N space lines {(r - r_{H\alpha}) \times m_\alpha = 0}, \alpha = 1, ..., N, consider the problem of fitting a space plane (\nu, \rho) = 0 to them. We write

m_\alpha = \bar m_\alpha + \Delta m_\alpha, \qquad r_{H\alpha} = \bar r_{H\alpha} + \Delta r_{H\alpha},    (7.184)

and regard \Delta m_\alpha and \Delta r_{H\alpha} as random variables of mean 0 and covariance matrix V[m_\alpha \oplus r_{H\alpha}], independent for each \alpha. The hypothesis can be written as follows (see eq. (4.70)):

(\nu, \bar m_\alpha \oplus 0) = 0, \quad (\nu, \bar r_{H\alpha} \oplus 1) = 0, \quad \alpha = 1, ..., N.    (7.185)

The rank of this hypothesis is 2. If we define 4-vectors

a_\alpha^{(1)} = m_\alpha \oplus 0, \qquad a_\alpha^{(2)} = r_{H\alpha} \oplus 1,    (7.186)

the hypothesis (7.185) can be rewritten in the form

(\bar a_\alpha^{(k)}, \nu) = 0, \quad k = 1, 2,    (7.187)

where the bar refers to the true value. The covariance matrices V[a_\alpha^{(k)}, a_\alpha^{(l)}], k, l = 1, 2, are given as follows:

V[a_\alpha^{(1)}, a_\alpha^{(1)}] = V[m_\alpha] \oplus 0, \quad V[a_\alpha^{(1)}, a_\alpha^{(2)}] = V[m_\alpha, r_{H\alpha}] \oplus 0, \quad V[a_\alpha^{(2)}, a_\alpha^{(1)}] = V[r_{H\alpha}, m_\alpha] \oplus 0, \quad V[a_\alpha^{(2)}, a_\alpha^{(2)}] = V[r_{H\alpha}] \oplus 0.    (7.188)
Hence, eq. (7.55) reduces to the optimization

J[n, d] = \frac{1}{1+d^2} \sum_{\alpha=1}^{N} \Bigl( W_\alpha^{(11)}(n, d)(n, m_\alpha)^2 + 2 W_\alpha^{(12)}(n, d)(n, m_\alpha)((n, r_{H\alpha}) - d) + W_\alpha^{(22)}(n, d)((n, r_{H\alpha}) - d)^2 \Bigr) \to \min    (7.189)

under the constraint \|n\| = 1, where the (22)-matrix W_\alpha(n, d) is given by

W_\alpha(n, d) = \frac{1}{1+d^2} \begin{pmatrix} (n, V[m_\alpha] n) & (n, V[m_\alpha, r_{H\alpha}] n) \\ (n, V[r_{H\alpha}, m_\alpha] n) & (n, V[r_{H\alpha}] n) \end{pmatrix}^{-1}.    (7.190)
The covariance matrix of the solution \hat\nu is given by eq. (7.57), which reduces to

V[\hat\nu] = \Bigl( P_{\hat\nu} \Bigl( \sum_{\alpha=1}^{N} \begin{pmatrix} W_\alpha^{(11)} \hat m_\alpha \hat m_\alpha^\top + 2 W_\alpha^{(12)} S[\hat m_\alpha \hat r_{H\alpha}^\top] + W_\alpha^{(22)} \hat r_{H\alpha} \hat r_{H\alpha}^\top & W_\alpha^{(12)} \hat m_\alpha + W_\alpha^{(22)} \hat r_{H\alpha} \\ (W_\alpha^{(12)} \hat m_\alpha + W_\alpha^{(22)} \hat r_{H\alpha})^\top & W_\alpha^{(22)} \end{pmatrix} \Bigr) P_{\hat\nu} \Bigr)^-,    (7.191)

where W_\alpha^{(kl)} = W_\alpha^{(kl)}(\hat n, \hat d) and
\hat m_\alpha and \hat r_{H\alpha} are the optimally corrected values of m_\alpha and r_{H\alpha}, respectively (see eq. (5.183)). The symbol S[ ] denotes the symmetrization operator (see eqs. (2.205)). The rank of V[\hat\nu] is 3; its null space is {\hat\nu}_L. The residual J[\hat n, \hat d] is a \chi^2 variable with 2N - 3 degrees of freedom. Evidently, the number N of space lines must be such that

N \ge 2.    (7.192)

However, the residual is not 0 when N = 2. For N \ge 3, we obtain a coplanarity test for space lines: the hypothesis that the space lines {(r - r_{H\alpha}) \times m_\alpha = 0} are coplanar is rejected with significance level a% if

J[\hat n, \hat d] > \chi^2_{2N-3, a}.    (7.193)

The 2N - 3 degrees of freedom can be intuitively interpreted as follows. A free space line has four degrees of freedom. If it is constrained to be on a space plane, it can still translate and rotate freely within that space plane, so there remain two degrees of freedom to break the incidence. Consequently, N space lines have 2N degrees of freedom for their deviations from the true plane (\bar\nu, \rho) = 0. But we are considering their deviations from the fitted plane (\hat\nu, \rho) = 0. For N = 2, there remains only one degree of freedom to break the coplanarity: the freedom of translation orthogonal to the two space lines. Hence, the total degrees of freedom are 2(N - 2) + 1 = 2N - 3.
7.6 Optimal Fitting for Space Planes

7.6.1 Space point fitting
Given N space planes {(\nu_\alpha, \rho) = 0}, \alpha = 1, ..., N, consider the problem of fitting a space point \rho to them. This means optimally estimating their common intersection point. We write

\nu_\alpha = \bar\nu_\alpha + \Delta\nu_\alpha,    (7.194)

and regard each \Delta\nu_\alpha as an independent random variable of mean 0 and covariance matrix V[\nu_\alpha]. The hypothesis can be written as follows:

(\bar\nu_\alpha, \rho) = 0, \quad \alpha = 1, ..., N.    (7.195)

The rank of this hypothesis is 1. An optimal estimate \hat\rho can be obtained by the optimization

J[\rho] = \sum_{\alpha=1}^{N} \frac{(\nu_\alpha, \rho)^2}{(\rho, V[\nu_\alpha]\rho)} \to \min    (7.196)
under the constraint (\kappa, \rho) = 1, where \kappa = (0, 0, 0, 1)^\top. The covariance matrix of the solution \hat\rho is

V[\hat\rho] = \Bigl( \sum_{\alpha=1}^{N} \frac{(P_\kappa \hat\nu_\alpha)(P_\kappa \hat\nu_\alpha)^\top}{(\hat\rho, V[\nu_\alpha]\hat\rho)} \Bigr)^-,    (7.197)

where \hat\nu_\alpha is the optimally corrected value of \nu_\alpha (see eq. (5.169)). The rank of V[\hat\rho] is 3; its null space is {\kappa}_L. The residual J[\hat\rho] is a \chi^2 variable with N - 3 degrees of freedom. Evidently, the number N of space planes must be such that

N \ge 3.    (7.198)

The residual is 0 when N = 3. For N \ge 4, we can test if the space planes have a common intersection: the hypothesis that the space planes {(\nu_\alpha, \rho) = 0} have a common intersection point is rejected with significance level a% if

J[\hat\rho] > \chi^2_{N-3, a}.    (7.199)

The N - 3 degrees of freedom can be intuitively interpreted as follows. A free space plane has three degrees of freedom. If it is constrained to pass through a space point, it can still change its orientation freely around that space point, so there remains one degree of freedom to break the incidence. Consequently, N space planes have N degrees of freedom for their deviations from the true intersection point \bar\rho. But if we consider their deviations from the estimated intersection point \hat\rho, no deviations occur when N = 3. Hence, the degrees of freedom decrease by three.
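As an unweighted counterpart of this estimation (not the optimal estimate of eq. (7.196)), the point minimizing the sum of squared distances to N space planes (n_\alpha, r) = d_\alpha solves a 3×3 linear system. A hypothetical Python sketch:

    import numpy as np

    def intersect_planes_lsq(ns, ds):
        # ns: (N,3) unit plane normals, ds: (N,) offsets of the planes (n_a, r) = d_a.
        A = sum(np.outer(n, n) for n in ns)
        b = sum(d * n for n, d in zip(ns, ds))
        return np.linalg.solve(A, b)    # singular A signals fewer than 3 independent normals

    # hypothetical usage: the planes x = 1, y = 2, z = 3
    r = intersect_planes_lsq(np.eye(3), np.array([1.0, 2.0, 3.0]))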
7.6.2 Space line fitting
Given N space planes {(\nu_\alpha, \rho) = 0}, \alpha = 1, ..., N, consider the problem of fitting a space line (r - r_H) \times m = 0 to them. This means optimally estimating their common intersection line. We write

\nu_\alpha = \bar\nu_\alpha + \Delta\nu_\alpha,    (7.200)

and regard each \Delta\nu_\alpha as an independent random variable of mean 0 and covariance matrix V[\nu_\alpha]. The hypothesis can be written as follows (see eq. (4.70)):

(\bar\nu_\alpha, m \oplus 0) = 0, \quad (\bar\nu_\alpha, r_H \oplus 1) = 0, \quad \alpha = 1, ..., N.    (7.201)

The rank of this hypothesis is 2. If we define 8-vectors

u = m \oplus 0 \oplus r_H \oplus 1, \qquad a_\alpha^{(1)} = \nu_\alpha \oplus 0, \qquad a_\alpha^{(2)} = 0 \oplus \nu_\alpha,    (7.202)
the hypothesis (7.201) can be rewritten in the form

(\bar a_\alpha^{(k)}, u) = 0, \quad k = 1, 2,    (7.203)

where the bar refers to the true value. The covariance matrices V[a_\alpha^{(k)}, a_\alpha^{(l)}], k, l = 1, 2, are given as follows:

V[a_\alpha^{(1)}, a_\alpha^{(1)}] = \begin{pmatrix} V[\nu_\alpha] & O \\ O & O \end{pmatrix}, \quad V[a_\alpha^{(1)}, a_\alpha^{(2)}] = \begin{pmatrix} O & V[\nu_\alpha] \\ O & O \end{pmatrix}, \quad V[a_\alpha^{(2)}, a_\alpha^{(1)}] = \begin{pmatrix} O & O \\ V[\nu_\alpha] & O \end{pmatrix}, \quad V[a_\alpha^{(2)}, a_\alpha^{(2)}] = \begin{pmatrix} O & O \\ O & V[\nu_\alpha] \end{pmatrix}.    (7.204)
From eqs. (7.202), eq. (7.55) reduces to the optimization

J[m, r_H] = \sum_{\alpha=1}^{N} \frac{1}{1+d_\alpha^2} \Bigl( W_\alpha^{(11)}(m, r_H)(n_\alpha, m)^2 + 2 W_\alpha^{(12)}(m, r_H)(n_\alpha, m)((n_\alpha, r_H) - d_\alpha) + W_\alpha^{(22)}(m, r_H)((n_\alpha, r_H) - d_\alpha)^2 \Bigr) \to \min    (7.205)

under the constraints \|m\| = 1 and (m, r_H) = 0, where the (22)-matrix W_\alpha(m, r_H) is given by

W_\alpha(m, r_H) = \begin{pmatrix} (m \oplus 0, V[\nu_\alpha](m \oplus 0)) & (m \oplus 0, V[\nu_\alpha](r_H \oplus 1)) \\ (r_H \oplus 1, V[\nu_\alpha](m \oplus 0)) & (r_H \oplus 1, V[\nu_\alpha](r_H \oplus 1)) \end{pmatrix}^{-1}.    (7.206)

The covariance matrix of the solution \hat u is given by eq. (7.57), which reduces to

V[\hat u] = \Bigl( P_{\hat{\mathcal N}^\perp} \begin{pmatrix} \sum_{\alpha=1}^{N} W_\alpha^{(11)}(\hat m, \hat r_H)\, \hat\nu_\alpha \hat\nu_\alpha^\top & \sum_{\alpha=1}^{N} W_\alpha^{(12)}(\hat m, \hat r_H)\, \hat\nu_\alpha \hat\nu_\alpha^\top \\ \sum_{\alpha=1}^{N} W_\alpha^{(12)}(\hat m, \hat r_H)\, \hat\nu_\alpha \hat\nu_\alpha^\top & \sum_{\alpha=1}^{N} W_\alpha^{(22)}(\hat m, \hat r_H)\, \hat\nu_\alpha \hat\nu_\alpha^\top \end{pmatrix} P_{\hat{\mathcal N}^\perp} \Bigr)^-,    (7.207)
Af
~u-~
f
0 o
0
'
~'H
0
0
1
m
0
'
o
0
0 0 '
o
} ~,
(7.208)
1
which is the four-dimensional null space of/t; the covariance matrix V[/t] has rank 4.
7.6.
Optimal Fitting for Space Planes
245
The residual J[rh, rill is a )r variable with 2 ( N - 2) degrees of freedom. Evidently, the number N of space planes must be such that N _~ 2.
(7.209)
The residual is 0 when N - 2. For N _ 3, we can test if space planes have a common intersection line: the hypothesis that space planes {(n~, r) - da} have a common intersection line is rejected with significance level a% if J[rh, rH] > ~2(N-2),a.
(7.210)
The 2 ( N - 1) degrees of freedom can be intuitively interpreted as follows. A free space plane has three degrees of freedom. If it is constrained to pass through a space line, it can still rotate freely around that space line, so there remain two degrees of freedom to break the incidence. Consequently, N space planes have 2 N degrees of freedom for their deviations from the true intersection line ( r - r H) • rh -- 0. But if we consider their deviations from the estimated intersection line (r - / ' H ) • rh - 0, no deviations occur when N = 2. Hence, the degrees of freedom decrease by two.
7. 6.3
Space plane fitting
Given N space planes { ( v ~ , p ) - 0}, a - 1, ..., N, consider the problem of fitting a space plane (v, p) - 0 to them. This means finding an optimal average v of {v~}. We write v~ - O~ + Av~,
(7.211)
and regard each A v a as an independent random variable of mean 0 and covariance matrix V[va]. The hypothesis is O~ - v,
a-
1, ..., N,
(7.212)
which has rank 3 because both sides are unit 4-vectors 1~ For each v~, the optimal estimate of its true value O~ is evidently v, so the optimal average can be obtained by the optimization N
J[u] - E ( u ~ - u, V [ u ~ ] - ( u ~
- v)) --4 man
(7.213)
o~--1
under the constraint I1 11- 1, Sinc the cov~ri~nc~ matrix V[v~] has null space Af~ - {Us}L, we have V[u~]u~ - O. Hence, the optimization (7.213) can be rewritten in the form
J[u] - (u' ( L 1~
hypothesis is degenerate.
V[u~]-) u)) -+
(7.214)
The solution \hat\nu is given by the unit eigenvector of the matrix

M = \sum_{\alpha=1}^{N} V[\nu_\alpha]^-    (7.215)

for the smallest eigenvalue (see eqs. (2.86)). The covariance matrix of the solution \hat\nu is given by

V[\hat\nu] = \Bigl( \sum_{\alpha=1}^{N} P_{\hat\nu} V[\nu_\alpha]^- P_{\hat\nu} \Bigr)^-,    (7.216)

which has rank 3; its null space is {\hat\nu}_L. The residual J[\hat\nu] is a \chi^2 variable with 3(N-1) degrees of freedom. Evidently, the number N of space planes must be such that

N \ge 1.    (7.217)

The residual is 0 when N = 1. For N \ge 2, we obtain a coincidence test for space planes: the hypothesis that the space planes {(\nu_\alpha, \rho) = 0} coincide is rejected with significance level a% if

J[\hat\nu] > \chi^2_{3(N-1), a}.    (7.218)

The 3(N-1) degrees of freedom can be intuitively interpreted as follows. A space plane has three degrees of freedom, so N space planes have 3N degrees of freedom for their deviations from the true position (\bar\nu, \rho) = 0. But if we consider deviations from the average (\hat\nu, \rho) = 0, no deviation occurs when N = 1. Hence, the degrees of freedom decrease by three.
Chapter 8

Optimal Filter

In the preceding chapter, the statistical characteristics of the solution of the parametric fitting problem were examined by assuming that the solution was computed by some numerical method. This and the next chapter present such numerical methods. In this chapter, we construct a filter that starts from an initial estimate and optimally updates it each time a new datum is read; we call it simply the optimal filter. The update rule is derived from the Bayesian standpoint: we apply the principle of maximum a posteriori probability estimation by assuming that the noise is Gaussian. The assumptions and approximations introduced in the derivation are elucidated, and the philosophical implications of the Bayesian approach are discussed. Then, the update rule is simplified by introducing the effective gradient approximation. The resulting expression is compared with the Kalman filter with no internal dynamics. Finally, the update rule for linear hypotheses is derived.
8.1 General Theory

8.1.1 Bayesian approach
In the preceding chapter, it was implicitly assumed that the entire set of data {a_α}, α = 1, ..., N, was available for computing the optimal estimate of parameter u. Suppose the data are a time sequence, the αth value a_α read at time α = 1, 2, 3, .... If the sequence is very long, it is realistic to compute an estimate u_α of u at time α in such a way that it is optimal for the data a_1, ..., a_α and update it to u_{α+1} at time α + 1 so that it is optimal for a_1, ..., a_{α+1}, and so on (Fig. 8.1). If the update computation does not require rereading the past data, the estimation process can be efficiently run in real time. In this chapter, we derive a theoretically optimal update rule for Gaussian noise. The resulting filter not only suits real-time data processing but also serves as a numerical algorithm for the parametric fitting problem even when the entire set of data is simultaneously available. If we try to formulate such an update rule, however, we immediately encounter a subtle problem, which, although purely philosophical, has beset many theoreticians. A natural form of update is computing the αth estimate u_α and its covariance matrix V[u_α] at time α and correcting them into u_{α+1} and V[u_{α+1}] by using the (α + 1)st datum a_{α+1}. As the update proceeds, the estimate u_α is expected to approach the true value, and its covariance matrix V[u_α] is expected to decrease. In this process, we are viewing the estimate u_α as a random variable.
In the preceding chapter, we derived the covariance matrix V[/t] of the optimal e s t i m a t e / t . The optimal estimate /t is a random variable because it is computed from the data {a~}. The statistical characteristics of the data {aa} reflect the accuracy and reliability of the data acquisition process (e.g., image processing and 3-D sensing), and the noise distribution can be estimated, at least in principle, by repeating the data acquisition process many times. Thus, the statistical characteristics of the optimal e s t i m a t e / t are a mere (and faithful) reflection of the uncertainty of the data acquisition process. On the other hand, starting such an update procedure requires an initial value u0 and its covariance matrix V[u0]. The question is: is Uo a random variable? Since no data exist yet, it has nothing to do with the uncertainty of the data acquisition process. If it is a random variable, how can its distribution be defined? This observation necessitates that we think of u0 as our subjective belief, and its distribution as the relative strength of our belief in particular values. Once this subjective interpretation is accepted, logical consistency dictates that we view the subsequent update process in the same way: each time a new datum is read, the belief is influenced and its subjective distribution is modified. The approach that admits such a subjective interpretation is known as Bayesian 1, whereas that which rejects it is called non-Bayesian. The distinction between the Bayesian and the non-Bayesian approaches is purely philosophical and often not so clear in reality. For example, one can compute the initial estimate u0 and its covariance matrix V[u0] from a small number of data, say a l , a2, and a3, and then start the update process by regarding the subsequent data a4, as, ... as a new sequence a~, a~, .... In this case, the statistical characteristics of u0 reflect the characteristics of the errors in a l, a2, and a3. Hence, the approach is non-Bayesian. On the other hand, if one guesses Uo and V[u0] and applies the same procedure thereafter, the approach is Bayesian. In this chapter, we adopt the Bayesian approach for the sake of formal consistency, but we do not worry about its philosophical foundation any further. The distribution of an estimate before data are observed is called the a priori distribution (or simply the prior), while the distribution updated by the data is called the a posteriori distribution 2 (or simply the posterior).
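The prior-to-posterior update described here takes a particularly simple form when the measurement depends linearly on u and the noise is Gaussian. The following Python sketch is a toy illustration only (it is not from the book; the linear model a = Cu + noise and all names are assumptions made for the example) and shows the pattern that the optimal filter derived below repeats datum by datum:

    import numpy as np

    def bayes_update_linear(u0, V0, a, C, Va):
        # Combine a prior estimate (u0, V0) with a measurement a = C u + noise of covariance Va.
        Va_inv = np.linalg.inv(Va)
        S = np.linalg.inv(np.linalg.inv(V0) + C.T @ Va_inv @ C)    # posterior covariance
        u1 = u0 + S @ C.T @ Va_inv @ (a - C @ u0)                  # posterior (MAP) mean
        return u1, S

    # hypothetical usage: a 2-parameter u observed through two linear measurements
    u0 = np.zeros(2); V0 = 10.0 * np.eye(2)
    C = np.array([[1.0, 0.0], [1.0, 1.0]]); a = np.array([0.9, 2.1]); Va = 0.01 * np.eye(2)
    u1, V1 = bayes_update_linear(u0, V0, a, C, Va)

A very large prior covariance V0 makes the result essentially independent of the initial guess, while a very small V0 keeps the estimate close to it, mirroring the discussion above.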
8.1.2
Maximum a posteriori probability estimation
Given an m-vector datum a, we write

a = \bar a + \Delta a,    (8.1)
^1 This terminology derives from the Bayes formula (see eq. (3.95)), which plays an essential role in this approach.
^2 In the preceding chapters, the terms a priori and a posteriori were used in the non-Bayesian context.
8.1.
General Theory
249
a~
a 2
//0 I~ V[uol
a 3
UI_~U2 VIu,l
~
VIu~]
U3
a4
V[u3]
Fig. 8.1. The c~th estimate h~ and its covariance matrix V [ ~ ] are updated by the (c~ + 1)th d a t u m a~+l.
and regard A a as a Gaussian random variable of mean 0 and covariance matrix Via]. We assume that a and g are constrained to be in an m~-dimensional manifold .A C 7~m, which we call the data space. It follows that the noise A a is constrained, to a first approximation, to be in the tangent space Ta(A) to the manifold ,4 at g. We also assume that no constraint is imposed on A a other than A a E To,(JI), so the range of the covariance matrix IF[a] coincides with the tangent space Ta(A). Our task is to estimate the n-vector u that satisfies the hypothesis F(k) (a, u) -- 0,
k - 1, ..., L,
(8.2)
given by L smooth functions F(k) ( ", . ): 7~m x 74n ~ 74. We assume that the domain of the parameter u is an n~-dimensional manifold/4 C T/n, which we call the parameter space. As argued in Section 7.1.2, this problem is solved in two stages: the correction stage and the estimation stage. The difference from Section 7.1.2 is that the parameter u is now a random variable that has an a priori probability density. The formulation in Section 7.1.2 is modified as follows.
A. Correction stage We estimate the value ~ - a - A a that satisfies the hypothesis (8.2) for a particular value of u. As shown in Section 5.1.1, the optimal correction A a is determined by the optimization J-
(s.3)
(Aa, l ) [ a ] - A a ) --+ min
under the linearized constraint ( V a F (k) (a, u), A a ) - F (k) (a, u),
k - 1,...,L,
(8.4)
together with A a E Tet(.A). The optimization (8.3), which minimizes the Mahalanobis distance I]Aal]f~[a], can be justified as m a x i m u m likelihood estimation for Gaussian noise (see Section 5.1.1). The first order solution is given as follows (see eq. (5.17))" L
A a - l~[a] E k,l--1
l~
(s.5)
250
Chapter 8.
Here, I?r
Optimal Filter
- (l~(kl)(u)) is the (LL)-matrix defined by (lV
(8.6)
by which we mean I ~ ( u ) - V ( u ) - , where V ( u ) - (V(kl)(u)) is the (LL)matrix defined by
(8.7)
(V
l~(kl)(u)F(k)(a, u ) V a F (1) (a, u).
a -- a - Sz[a] Z
(8.8)
k,l=l
If ~ is the true value, a particular value a has the following probability density (see eq. (3.46)):
e-( a - a ( a,u) ,?[a]- (a-a(a,u)))/2 p(alu)
-
(8.9)
.
Here, m' is the rank of Via], i.e., the degrees of freedom of the m-vector a. Since eq. (8.9) gives a probability density of a for a given value u, it defines a conditional probability density conditioned on u. B. E s t i m a t i o n stage
Suppose the parameter u has an a priori probability density of the form
e-(U-Uo ,v[u0] - (U-Uo ))/e p(u) -
,
(8.10)
v"(2~-)"' IV[~o]l+ where n' is the rank of V[u0], i.e., the degrees of freedom of the n-vector u. The joint probability density of u and a is
e-( U-Uo ,V[Uo]- (U-Uo))/2-(a-a,?[a]- (a-a))/2 p(alu)p(u) -
.
(8.11)
V/(27v) n' +m' iV[u ol I+ I~[a] I+ According to the Bayes formula (3.95), the a posteriori probability density of u determined by a particular value a is -- $
P(alu)p(u)
(8.12)
8.2.
Iterative Estimation Scheme
251
maximum a posteriori probability estimator, or simply the Bayes estimator 3, is the value u that maximizes this a posteriori probability density
The
for given a. Since the denominator of the right-hand side of eq. (8.12) does not depend on u, maximizing p(u[a)is equivalent to maximizing p(alu)p(u), which in turn is equivalent to minimizing J[u] = (u - u0, V [ u 0 l - ( u - u0)) + (a - a, V[~]- (~ - a)).
(8.13)
Substitution of eq. (8.8) into eq. (8.13) yields the following expression (see eq. (7.13)): L
J[u]- (U-uo,V[uo]-(U-Uo))+ E
l~(kO(u)F(k)(a'u)F(l)(a'u)"
(8.14)
k,l=l
The above argument tacitly assumes that the distributions of a and u are Section 3.2.2). Namely, eq. (8.9) is based on the assumption that the distribution of a is sufficiently concentrated around ~ in the data space A and hence the domain of the distribution can be identified with the tangent space Ta(A) of the manifold A at 5. Similarly, eq. (8.10) is based on the assumption that the distribution of u is sufficiently concentrated around u0 in the parameter space 5/ and hence the domain of the distribution can be identified with with the tangent space Tuo (b/) of the manifold L/ at u0. It follows that J[u] has a meaning only for u - Uo E TUo (l/l).
local (see
8.2
8.2.1
Iterative
Estimation
Scheme
Optimal update rule
Eq. (8.14) is merely a theoretical expression because lZV(kO(u) involves the true value & and the covariance matrix Vial evaluated at ~,. In order to compute it from the d a t u m a and its covariance matrix Vial, we apply the approximation introduced in Section 7.1.2. Namely, the (LL)-matrix W~(u) is approximated by the rank-constrained generalized inverse
(8.15) where r is the rank of the hypothesis (8.2) (see eq. (7.14)). Expanding W(kl)(u), F(k)(a,u), and F(t)(a,u) in the neighborhood of u0, we obtain + (VuW
-
+O(u
1 ,u -
+
-
V uW0
(- -
- uo )3,
3To be precise, the Bayes estimator is defined as the value that minimizes the expectation of a cost (or loss) function with respect an a posteriori distribution. It can be identified with the maximum a posteriori probability estimator if the cost function is quadratic and the a posteriori distribution is Gaussian.
252
Chapter 8.
Optimal Filter
F < k ) ( a , u ) - F(ok) + (VuFg k) , u - s o ) + - { l ( u - Uo V ~ F (k) ( u - So))
+O(u - to)3, 1
l)
F
(8.16)
Here, the subscript 0 refers to the value evaluated at u0; the symbol O ( u - u o ) 3 denotes terms of order 3 or higher in u - u0. We use the notation V2uf to denote the matrix whose (ij) element is 02.f/OuiOuj. Substituting eqs. (8.16) into eq. (8.14) and ignoring O ( u - u0) a, we obtain
J[u] - ( u - Uo, V[uol-(u - Uo)) + (u - uo,So(u - Uo)) (8.17)
+2(to, u - Uo) + Eo, L
So : Z
+ Fo( )V Fo('))
k,l:l
+2F(ok) S[Vu'w(kl) o (VuF~I))
to-E L (
+ 51F(~ F~l) V2W(~
) '
1F(k)F(ot)VuW(okl) )
(8.18)
(8.19)
o
k,l--1 L
Eo -
E
(s20)
'),
k,l=l
where S[. ]is the symmetrization operator (see eqs. (2.205)). As pointed out in the preceding section, the function J[u] is defined only for u - Uo E Tuo (b/). Let {Vj(o)}, j = 1, ..., n - n', be an orthonormal basis of the orthogonal complement Tuo (t4) • The constraint u - Uo E Tuo (L/) can be expressed in the form
(Vj(o), u - So) - O,
j - 1, ..., n - n'.
(8.21) !
Introducing Lagrange multipliers Aj, differentiating J[u]- 2 Ej--1 ( V j ( 0 ) Uo) with respect to u, and setting the result zero, we obtain n--It
n_n
/
y[~ol-(~- - o ) + S o ( ~ - ~o)+ to - Z -~(o). j--1
' ~$ - -
(8.22)
8.2.
Iterative E s t i m a t i o n Scheme
253
(U)
Fig. 8.2. The mapping from Tu0 (5/) to Tu~ (b/). Let puUo be the n-dimensional projection m a t r i x onto
Two (L/).
eq. ( 8 . 2 2 ) b y puUo on b o t h sides and noting t h a t P u ~V [ u o ] uo) - u -
pUuovj(o) - O, we
uo, and
Multiplying
V[uol, PuUo ( u -
obtain
v [ , 0 ] - ( , - - 0 ) + PgoSoP~uo(~, - uo) + Pgoto - o.
(8.23)
Solving this for u, we obtain the Bayes estimator, which we write as u l, in the form -
+
If this is s u b s t i t u t e d for u, eq. (8.17) can be r e a r r a n g e d into the form
J[u] - (u
-
Ul, (V[u0]- --~ P ~ o S o P u U o ) ( U
-
~1)) -~-El,
(s.25)
where
E1
Eo
Since eq. (8.11) has the form can be w r i t t e n in the form
(+.2+/
+
p(alu)p(u) ~ e -a[u]/2,
the Bayes formula (8.12)
(8.27) which is a function defined for u - U l E Tuo (lg). In order to carry out the u p d a t e , we need to view p(ula) as a function defined for u - u l E Tul (L/). F r o m eq. (8.27), we see t h a t p(ula ) = p(u + via ) for an a r b i t r a r y v E Tuo(Lt) • Hence, we can regard eq. (8.27) as defined for an a r b i t r a r y u E 7~n: the value p(ula ) is extended "cylindrically" into Tuo(lg) L (Fig. 8.2). T h e function p(u]a)thus extended over u E ~ n is then restricted to the domain u - u a E Tul (b/). Let P U 1 be the projection m a t r i x onto Tul (b/). p(ula ) for u -
Tu~ (L/),
the value of
p(:l:) ~ ~-<~-~"P~' <-E~ol-+P~o SoP(,otPg~l (~-~,/)/~,
(s.2s)
Since p U ( u - U l ) - u - u l for u - U l E u l E Tut (L/) can be w r i t t e n as
254
Chapter 8.
Optimal Filter
Eq. (8.28) states that a posteriori the parameter u has a Gaussian distribution with mean u l and covariance matrix
V[~'I] --
(P'u,U1V['uo]-V~,U1 "4- PUUl P~o S~
PU'Ul)- "
(8.29)
Having thus computed ul and V[Ul], we can compute u2 and V[u2] by the same procedure when the next datum is read; the update rule has been established. Since the (generally nonlinear) constraint on u has been linearized in the above derivation, the updated value u l may not exactly satisfy the inherent constraint on it. Hence, a higher order correction C[-] need to be added (see eq. (5.25)). Writing u0 as u~, we obtain the update rule of the optimal filter in the following form:
(V[u,~]- + pU s~pU ) t~] IX So~P~LP~L+, ) (?[~c~l- q-P/dUo,+lP,"Uc,
u~+l - C[u~ V[uaq-l]--
(8.30) (8.31)
L
~c~ -- k ~l (w(kl)((~uf(ak))(~uf(l))T q-f(k)k-72 "" Ol v u "~(l) O~ ) 12---
+2F(k)S[VuW(~kt)(VuF(O)T]
t~ -
+
1F<+)G(0 <~0), 2 " VgW~
+~( w..~)F~)v,+G (~) + ~G' ~ ( + ) G (') v,~w~,(~,)) ,
(8.32)
(s.33)
1---
~ u VIu~]( p u~+ V[u~] -pU u~+ ~ )
.
(8.34)
Here, the subscript c~ refers to the value evaluated at u = u~ and a = a~+1. For instance, F (k) is the abbreviation of F(k)(a~+l,U~). The (LL)-matrix W ~ is given by
(W(kO)- ((VaF(k) V[aa+l]VaF(l)))]. 8.2.2
(8.35)
Bayesianinterpretation
From the Bayesian standpoint, eqs. (8.30)-(8.35) have the following interpretation: 9 The matrix S~ is generally positive definite, so eq. (8.31) implies that V[u~] usually decreases after each update. Hence, the uncertainty of the estimate generally decreases as the update progresses.
8.2.
Iterative Estimation Scheme
255
)
(
V[u~] ~ O, then V[u~]- +Puus s , p Uus ~ O. Hence, U~+l u~ and V[U~+l] ,~ O. Namely, if the estimate u~ is very certain, the amount of update is very small, and the resulting estimate is also very certain.
9 If
9 If V[aa+l] ~ oc, eq. (8.35) implies that W (kl) ~ 0, so Sa ~ O and ta 0. Hence, U~+l ~ u~ and V[U~+l] ~ V[u~]. Namely, if the datum is very inaccurate, it adds little information, so the estimate is updated little. 9 If F(k)(a~+l, u~) ~ 0, eq. (8.33) implies that t~ ~ 0, s o ,u(~_~_1 ~,~ 'U~, but V[u~] generally decreases. Namely, if the observed datum is compatible with the predicted estimate, the estimate is updated little, but the belief in it is enhanced. In Section 7.1.2, we showed that an optimal estimate of u was given by the optimization N
L
J[u] - E
E W(kl)(a~'u)F(k)(a~'u)F(t)(a~'u) --+min.
~=1
(8.36)
k,l=l
The following question arises: in what sense is the final Bayes estimator an approximation to the solution of this optimization? We now show that is the solution of the optimization
UN UN
J[~] - ( ~ - ~0, V [ u 0 ] - ( ~ - u0)) N
L
+E E W(kl)(a~176 c~=1
-+ min.
(8.37)
k ,l= l
In fact, rewriting eq. (8.14) in the form of eq. (8.25) is equivalent to rewriting the above J[u] in the form -- (U -- Ul, V [ u l l - ( u -
j[l/,]
N
+E o~=2
Ul))
L
E w(kl)(aa, u ) F
4- El,
(8.a8)
+ E1 + E2,
(8.39)
k,l=l
which in turn can be rewritten in the form
j[,,] = ( u -
u~, v [ u ~ ] - ( u - ,,~)) N
+ E
L
E w(kl)(ao~, c~----3 k,l--1
u ) f (k)(aa, u)F (l)(ac~, u)
256
Chapter 8.
Optimal Filter
and so forth. The final expression is N
J[u] - (u - u N , V [ u N I - ( u -
UN)) + Z
(8.40)
E+,
c~=1
whose minimum is attained by u - U N, meaning that U N is the solution of the optimization (8.37). From eqs. (8.36)and (8.37), we see that -
-
-
+
Hence, it is concluded that the Bayes estimator UN is a good approximation to the solution of the optimization (8.36) if UN ~ Uo or V[u0]- ~ O. In other words, u N is expected to be the optimal solution of (8.36) with a high probability if we start with a good initial guess u0 or assume a large covariance matrix V[u0] ~ c<~. In Bayesian terms, a good estimate is obtained if initially we know very much about the true solution or we know very little. Although the controversy over the Bayesian and the non-Bayesian philosophies is very deep, the above observation shows that the distinction is vague in analytical terms" the Bayesian approach simply adds a "starter" ( u - no, V [ u o ] - ( u - no)) to the optimization criterion so that computation becomes easy.
8.3
Effective
Gradient
Approximation
Since the function F(k)(a, u) takes value 0 if a and u are correctly chosen, we can assume that F (a) - F(k)(a~+l, u~) is very small. If F(a)(a, u) is very smooth in u, the second derivative V2 F (a) is also very small. Hence, we can assume that F ( k ) V ~ F (t) ~ 0. (8.42) Consider the (LL)-matrix W , ( u ) (W(kl)(u)) -
- ( W (kl) (u)) defined by
( ( V a F ( k ) ( a ~ + l , u ) , V [ a ~ + l ] V a F ( l ) ( a ~ , u ) ) )-
.
(8.43)
r
If W ( k l ) ( u ) i s very smooth in u, the second derivative V2uW(k') Vu w(kl) ,, ~ (u~) is very small. Hence, we can assume that F(k) F(k>V2w(kl) ~ O.
-
(8.44)
Under these approximations, eqs. (8.32) and (8.33) are respectively approximated by L
so
Z
+
,
k,l=l (8.45)
8.3.
Effective Gradient Approximation
257
L
t~
E
W(kt)F(a)VuF(t)"
(8.46)
k,/--1
Let s be the range of the (LL)-matrix W~(u) defined by eq. (8.43); it is a linear subspace of 7~L. Let p C~(u) be the L-dimensional projection matrix onto the linear subspace s Since P C~(U)W~(u) - W~(u), we have
OpC~(u) Oui W~(u) + pLy(u)OW~(U)oui __ OW~(U)oui If W~(u) is very smooth in u, its range s a first approximation. Hence, OPC~(u)/oui ,~ W~(u)W~(u)-, we obtain the approximation
.
(8.47)
does not depend on u to O. Noting that pC~(u) =
L
Z
w(kl)(w(lrn))-v7vu LTr(mn) ,, ~ ~ VuW~
(8.48)
l,rn--1
(W(~kl))- denotes the (kl) element of (LL)-matrix W~ - (W(~kl)). 7*P(~k) by Define the effective gradient x'.u.
where
the generalized inverse of the
L
V ~9 F 5(k) - Vu F(k) +
~
(W~(kin)) - F ( ~ ) w~ u ,w, ~( ~ )
9
(8.49)
re,n--1 Ignoring terms mations:
of
O(EIL1
)2 we obtain the following approxi-
L
E
w(kl)(v~tF(k))(V~tF(l))q-
k,l--1 L
L
k,l=l
m,n=l
L T ( ~uF(I)'+" E (W(Ip))-F(q)~7uW(Pq)) p,q--1 L
~ w.(~) (v,~F~(~))(vu 5~) )~ k,l=l L
L
+ E
F(')( E
l,n=l L
W(lk)(W(km))-VuW(~"~'~))(VuF(t))-v
k,m=l L
+ Z r2)VuF(~)Z (w(~'(w(2~))-VuW(:q))T k,q--1
1,p=l
258
Chapter 8. L
Optimal Filter
L
E
w~(~')(v~F~(~))(v~r~(')) ~ + E
k,l=l
F-(~)v~w~(*~))(V~F-(')) ~
l,n=l L
+ E F(q)VuF(k)(VuW(Pq))T k,q--1 L
~ (w.(~')(VuF.(~))(v~F~(~))~ +2F(2)S[V~W(2')(V~F2))~]),
(s.5o)
k,l=l L
Z
w(~")F(~')v;,F(2)
k,l=l L
L
k,l=l
m,n=l
L
:
~ w(~~')F(2,v~F(2) k,l=l L
L
+ E
W(kO(W(tm))-VuW(mn))
F(k)F(n)( E
k,n=l
l,m=l
L
L
= E w'(kOF(k)VuF(~t)+ E F(k)F(n)VuW(k")) k,l=l
k,n=l
L
(8.51)
"~ E W.(kOF(k)VuF('). k,l=l
Hence, eqs. (8.32) and (8.33) are approximated by L
9 (k) )(V,~ F ( 0 ) 7- , so~ E W(~I)(V,,F~
(s.521
k,l=l L
to ~ Z
w(2') F(2)'*~
o
.
(s.sa)
k,l=l
We call these the effective gradient approximation and use "=" instead of " ~ " in the following. From the generalized matrix inversion formula (2.84), we obtain
(V[u~]- + p UU~ , % p uU,~) L
= v[~] - Z k,l=l
w^(~') ; ( v[~olV;~F~(~) ) (v [~o]v;,F~ , (,))T. (8.54/
8.3.
Effective Gradient Approximation
259
Here, the (LL)-matrix Wc~ - (17v(kt)) is defined by
(W~
W~ -
+ PC"(U~)v~Pc~(u~)
)- ,
(8.55)
where l~c, -- (~(kt))is the (LL)-matrix defined by
(~(~))
(-u- ~ ,
(8.56)
Since the matrices W ~ and IV(, share the same range s
we see that
I;V~V~W~ - W,~PC~(U")v~PL~(U")w,~ = IV.
((W-~ + pC.
= w~w-~w~
- w.w2w.
= p ~ ( U ~ ) w , ~ _ I;V,~pC.~(u~) =W~-W~.
(8.57)
From eqs. (8.52)and (8.54), we obtain
L k,l=l L
-V[uo]
w~^<~)w;(m")F.(m)(V;~F~(~)),.~=,~*~<~)~,v[~lV;~F,("))
~ k,l,m,n=l L
= v[~,~] ~
w(2~)r(2)v;~F(2 )
k,l=l
-V[uc~] E k,m=l
)
~v(kl)?(ln)w(nm)F(m)v~tF(k) l,n=l
L
- v[u~l Z w(~)r(~ ~)~*'~,~<~)o k,l--1 L
-v[uo] Z
(w~(~m>- ~V"(~m))F"(m)v'~F~ (~)
k,m=l L
-u-.. k,l=l
()8.58
260
Chapter 8.
Optimal Filter
On the other hand, we obtain the following identity from the generalized matrix inversion formula (2.84) and eq. (8.52): ( l ? ' [ u . ] - + p~U+ 1el~,,~o~So~pI~'~'c~pl~,,~,c~+ 1 ) - L
- l/[u~] - E l~(kt)(l~[u~]Pg~ V~tF(k))(?[uo~]Pg,~ w* vu.t-(t) . ) 7-. k,l-1 Here, the (Ln)-matrix I ~
- ( l ~ (kl)) is defined by
(W ~
I~o~ -
(8.59)
)- ,
(8.60)
V~F~ ) .
(8.61)
+ PC'~(u'~)~"o~PC'~(u'~)
where V~ -- (IP~(kl)) is the (LL)-matrix defined by (~(kt))_
('U',
,Pu,
Hence, we obtain the following effective gradient approximation to the update rule of u~ and V[u~]: L
u~+,
- C[u~ - V[u~] E
I?V"(kt)F(k)V~tF(O]
(8.62)
k,l=l L
v[~o+~] =
?[~.]- ~
* V ~*F ~(~) ) ~ W~(~)(?[uolp~ov~F~(~))((~[~olpgo
k,l--1
(8.63)
8.4
R e d u c t i o n from the K a l m a n Filter
We now show that the Kalman filter described in Section 3.4.4 can be applied to solve our problem. For simplicity, let us assume that V[a~] is of full rank. Let {u~ } be a sequence of estimates of u. For the moment, we simply assume that such a sequence is somehow given and do not question how it is defined. We have 0 -- g (k) (ac~, u) - F (k) (lot - Aaot, u a - 1 -Jr-(u - lion_ 1 )) = F (k) (aa, u a - 1 ) - ( ~ a F(k) (ao~ , u a - 1 ) , A a a ) + ( V ~ F (k) ( ~
,
~o-1), ~
__
~_~)
+ O(~
,
~ - ~-1
)2
(8.64)
Assuming that u ~ - i is a good estimate of u and ignoring O(Aa~, u - u ~ _ l ) 2, we obtain
r,(k) u - - uo~--1 ) , F(k) (ac~, ua-1) -- (Va F(k) c~-~, Aloe) -- lw I, v u t ' c ~ - - l ,
(s.65)
8.4.
R e d u c t i o n from t h e K a l m a n Filter
261
VaF (k) and VuF (k) are the a b b r e v i a t i o n s VuF (k)( a ~ + l , u ~ ) , respectively. If we define an n - v e c t o r xa, an L-vector y~,
where
of
VaF(a)(ao~+l,U,~) and
an L - v e c t o r w~, a n d an
L n - m a t r i x C by
X(~ - - - - ( U
" I F(1)(aa'ua-1) I F(L)(ao~,Uo~_l)
Yo~ --
-- U ~ _ 1 ) ,
,
(8.66)
(~aF(1)_l, Aaot )
F c~-1 1)
/0Ul
F 1)
"'"
"
Co/
(8.67)
9 ..
--(L)
0t~'~_I /OU 1 ...
"~F (L)"/Oun "i o~--1
eq. (8.65) can be w r i t t e n in the form
y,~ -C,~x,~ + w~.
(8.68)
Since A a ~ is a r a n d o m variable of m e a n 0 a n d covariance m a t r i x V[a~], the L-vector w a is also a r a n d o m variable; its e x p e c t a t i o n a n d covariance m a t r i x are given as follows:
E[.,o]- (,~ E[Aao])) - o ~, v a ~ '~(~) a_l,
V[w~] -
( ( V a F (k)~_1'
E[Aa,~Aa~] wva'E~(1)c~-I )_)
--((~aF(k)__ 1 U[a 1~ F (l, )) --
(8.69)
,
aJ
a
o~-1
"
(8.7o)
Now,
9 . - -(~-- Xc~-I
~._~) - -(~-~- ( U c ~ - I
u._:)
+ ( ~ . _ 1 - ~o_~)
(8.71)
--Uc~-2).
If we p u t A~ - I ,
B a - I,
v~
- - Uc~ - - U c ~ - - I ,
(8.72)
eq. (8.71) can be w r i t t e n as x a -- Aa-1Xc~-I -[- B ~ - I vc~-l.
(8.73)
262
Chapter 8.
Optimal Filter
This can be viewed as a "linear dynamical system" with input va, although no internal dynamics exists. Strictly speaking, the input va is not a random variable, but it can be regarded as a random variable of zero c o v a r i a n c e , i.e.,
E[,,o] =
-
o-1,
(8.74)
V[va] - O.
Eqs. (8.73) and (8.68) have exactly the same form as eqs. (3.104) and (3.105). Hence, the Kalman filter defined by eqs. (3.106)-(3.109) can be applied. Substituting eqs. (8.69), (8.70), (8.72), and (8.74)into eqs. (3.106)(3.109), the estimate 5~a and its covariance matrix V[5~] are computed in the following form: ~a -- ;~o~--1 -[-Ua--1 --Ua--2 _[..V[;~a] C aT V [ w a ] - i (Ya - C a ( ; ~ a - x -t- IIa-1 - 15a-2)),
(
V[ ~ot] --
)-1
V[;~a-l] -1 -~- cWV[wa]-lcaa
(8.78) (8.76)
.
Applying the matrix inversion formula (2.22), we can rewrite eq. (8.76) in the form V[;~a] -- V [ ; ~ a - 1 ] - V [Xa-1 " ]C aT V~ a- 1_ 1 (V[;~a-1 ] C : ) T,
(8.7'7)
where
(8.78)
Y a -- V[Wa+l] -[- C a + l V [ ; ~ a l C aq-l" T From eq. (8.77), we obtain V[xa]CT~ V ~ ! l T~-I -,, _ x ] C a V-l-a _ --1 1 - - V[~g a-- l] C aT V ~1- 1 -- V[~g a - 1 ] C a V a - 1 C a V [ x a ^
T
-1
: V[xa_ 1]Cava_l : ~ [ - 1 (( -V[;~a-1]C a--1 Vc~--i "[- C a V [ ~ , a - 1 ] C : ) ,, T V a--II - V[xot-x]Ca ^
--
V [ X a - ]1 C a T V ~-1-1
"t-
- Va-1
)
V a-1 -1
V[x a-X ] C aT -V-ax_ 1
T~-l
= V [ x a - 1 ] C a V a _ 1.
(8.7.9)
Substituting this into eq. (8.75) and incrementing a by 1, we obtain
^ "~'V[xalCa+l
v21
(Ya+l - C a + l (~a -[- 1$a - "l/,a-1)).
If :Ca+l is the optimal estimate of Xa+l - - ( u of u at time a + 1 is given by /ta+l ---- Ua -- ;~a-F-1.
(8.80)
ua), the optimal estimate
(S.81)
8.4.
Reduction from the Kalman Filter
263
Substituting eq. (8.80) into this and noting that x s + u s - U s - 1 we obtain the optimal e s t i m a t e / t s + l in the following form: i t s_t_ 1 - - it s -- V[/to~ ] C s -rt - 1 V- -s- i
(Ys+l
-- C s + l ( ? - t s
-- ~ t s ) ) ,
- - i t s - lTts,
(s.s2)
Since V[/ts] - V[&s], the covariance matrix o f / t s + l is given by V[~ts+l]
-- Viers]
-- V [ I t^s ] C s + lT
~r~ l(v[/tslC
sT+ l ) T 9
(s.sa)
Recall that the sequence {us } is yet to be defined. We now define it so that it happens to coincides with {/ts}. Using eqs. (8.67) and (8.70), we can now reduce eqs. (8.82) and (8.83) to the following form: L
/ t s + l -- / i s -- V [ / t s ]
E W~-(kl)Fs(k)~uF(1)
(8.84)
k,l=l L
v[go]- Z
wg-( 0
(8.85)
k,l=l -
(kl)
W~
__
,
\ ( V a , s k) V[as+ l ]Va F(l) ) +
(ruFf( )
~( '~ - 1 V[itslVu,~k))j . (8.86)
Applying a linear approximation at assumed estimates, we can make the Kalman filter applicable to nonlinear dynamical systems as well; the resulting rule is called the extended Kalman filter. In this sense, eqs. (8.84)--(8.86) can be called the extended Kalman filter, although nonlinearity enters only the observation process (i.e., no internal dynamics exists). Comparing eqs. (8.84) and (8.85) with eqs. (8.62) and (8.63), we see that eqs. (8.62) and (8.63)reduce to eqs. (8.84) and (8.85)if 19 the effective gradient v.Tu* . ~ . (sk ) in eqs. (8.62) and (8.63)is replaced by the ordinary gradient V u F (k) , and 2. no constraint is imposed on the parameter space for u. ,
(k)
Approximating VuF~
by VuF(~ k) is equivalent to assuming that
- o.
(s.8r)
However, this approximation introduces statistical bias into the solution, as will be discussed in the next chapter. Thus, the update rule derived from the Kalman filter by restricting the underlying linear dynamical system to a linear "statical" system is cruder than the optimal filter directly derived by applying the maximum a posteriori probability principle, on which the Kalman filter is also based (see Section 3.4.4).
264
Chapter 8.
8.5
Estimation
from
Linear
Optimal Filter
Hypotheses
As seen in Chapter 7, many hypotheses involving image and space objects (e.g., their coincidence, incidence, collinearity, concurrency, and coplanarity) are linear in the parameters to be estimated and can be written in the form (a~), u) - O,
k - 1, ..., L,
(8.88)
where u and 6_,~), k = 1, ..., L, a - 1, ..., N, are all n-vectors. Let { a ~ ) } be the observed data, and let V [ a ~ ), a~ )] be the covariance matrix of a ~ ) and a~ ). As in Section 7.1.5, eq. (8.88) can be written in the form F (k) (D.o~,u) 0 if we define the direct sum vector am by ac~ -- ac~
|
...
| a,~ ) ,
(8.89)
and L functions F(k)(., 9)" T~ Ln X ~"~n ._.+ ~ by F
e...
9 a
(8.90)
= (a (k),
Hence, the general theory described in Sections 8.2.1 and 8.3 can be applied if we put F(k ) __ (a~k+)1, i r a ) ' V,I.I,F~k ) - - ~_o ~( +k1) ' VaF(a k) - 0 |
(k)
| uc~ |
(8.91)
| O.
The (LL)-matrix W , ~ ( u ) = (W~(kt) (u)) defined by eq. (8.43) can be rewritten as
Let s be the range of the (LL)-matrix W , ~ ( u ) . W a ( u ) - W o ~ ( u ) - p~=,~(u) with respect to u, we obtain +
_
Oui
Oui
-
OR Oui
Differentiating
'
(8.93)
where P L,(u) is the L-dimensional projection matrix onto the linear subspace s As in Section 8.3, we assume that the range s is smooth in u w and hence O P e " ( U ) / O u i ~ O. Abbreviating W (kt)(u~) and ~ u ~~r(kl) (u~) to W~(kl) and VuW~(kl), respectively, we obtain L
L
1=1
/=1 L
,.., -- Z w(~lm)v'u(U' Via(k) t o~+1' t~.(1) ~c~+llU)Iu--U~ l-=l L ---- --2 E w(lm) s[V[a~)+I , t%+llluc~...(l) (8.94) /=1
8.5.
Estimation from Linear Hypotheses
265
Define the effective value a*~(k) of a ~ ) by the effective gradient w* v U ~~(k) c~--1 (see eq. (8.49)): ac~9(k)
L . (k) a(ck) W ( l l_ ) (a~) ~u , -1 ) S [ V [ a ~ ) a(m)]]u,_l. ---%Fc~_1 ~ 2 E l,m=l
(8.95) The update rule given by eqs. (8.62) and (8.63) can be written in the form L Uc~+I-
N[u,~
V[u~] ~
-
W~^(at),(ac~+l (k) ~u~) a ~*(0 + l ],
(8.96)
k,l--1 V[Ua+l]-
L V[~/,a] -- E ~v(kl)(~'Z[ua]P~ a(~T1)(I~[~a] ,(k) p l,U 4 a a,(l) +l), T k,l=l
(8.97)
where
? [ff-J'C]~
u.+ 1) --" (PU~tfa+ 1V [ff'gCt]- P'/A
-
(8.98)
The (LL)-matrices l~V~ - (IV~(kl)) and I~r~ - ( I ~ (kl)) are defined by
(W,~ + PC"(U")v~Pc"(u~ ) (W-~ + PC~(U~)vo~Pf-'~(u~) )
I;V~ W,~ -
, ,
(8.99)
where V~ - (lk~(kl)) and V~ - (~(kl)) axe the (LL)-matrices defined by -
?(kl)__
~a~,+ 1, v [ u ~ , l a ~ , + ~)
( ( a a + l , F"Zt~
,
*(" ) 9
uaaa+l)
(8.100)
E x a m p l e 8.1 If the hypothesis consists of a single equation (a, u) = 0 and no constraint is imposed on u other than normalization Ilull = 1, the optimal filter takes the following form: (ao~+l, Uo~)V['uo~]a~+ 1 a* 9 ~o~+1 -- N[uo~ - (Uo~, V [ a a + l ] ~ ) -~- (c~+l,V[~c~]ac~+l) v[u,~+l]
-
~[u~]
-
]
'
(8.101)
(V[u~lPua a ~ + 1 ) ( V [ u , ~ I P u " a ~ + 1) T (u,~, V[a~+llu,~) + (a~+ 1, eu.~"[u~lPu.a*a+l)
'
(8.102) a ; + 1 -- a(~+i --
2(ac~+l' It(~) U [ a a + 1]Itc~ (~tc~, V[ac~+l]~,c~) '
(8.103)
r[uc~l-Pu~+l)- .
(8.104)
~ / ' [ U a ] - (Pu~+l
Here, Pu,~ and Pu~+l are the n-dimensional projection matrices along u~ and U~+l, respectively.
Chapter
9
Renormalization This chapter f o c u s e s o n the parametric fitting problem for a linear hypothesis with no constraints other than normalization. To a first approximation, the problem reduces to least-squares fitting, for which the solution can be obtained analytically by solving the eigenvalue problem. We first show that this least-squares approximation introduces statistical bias into the solution whatever weights are used. After analyzing the statistical bias in quantitative terms, we present an iterative procedure, called renormalization, for removing the bias by automatically adjusting to the noise. In contrast to the optimal filter we studied in the preceding chapter, renormalization requires no initial estimate and no knowledge of the noise level. We then discuss a procedure called linearization, which enables us to apply renormalization to nonlinear constraints. Finally, we define second order renormalization which removes statistical bias up to second order terms.
9.1 9.1.1
Eigenvector Fit Least-squares approximation
As shown in Chapter 7, many hypotheses involving image and space objects (e.g., coincidence, incidence, collinearity, concurrency, and coplanarity) are linear in the parameters to be estimated. If the hypothesis is linear, the problem takes the form of estimating an n-vector u such that (~),u)
k-l,...,L,
-- O,
(9.1)
from n-vector data { a ~ ) }, k - 1, ..., L, c~ - 1, ..., N. We write
(9.2) and assume that the noise Aa(~k) is a random variable of mean 0, independent for each c~. Let V [ a ~ ), a~ )] ( - E[Aa~)Aa~)-r]) be the covariance matrix of a ~ ) and a~ ). In this chapter, we further assume that each A a ~ ) is O(e), where e is an appropriately defined constant that measures the average magnitude of the noise. We write v[a
(9.3)
-
and call e and Vo[a~),a~ )] the noise level and the normalized covariance matrix, respectively (see eqs. (5.37) and (7.42)). 267
268
Chapter 9.
Renormalization
According to the general theory in Section 7.1.5, the optimal estimate of u is obtained by the minimization (7.55). Since multiplication of eq. (7.55) by a positive constant does not affect the solution, we can alternatively minimize N
1
L
(9.4) 4=1 k ,l= l
In this equation, W(~kl) (u) can be multiplied by an arbitrary positive constant, so we hereafter define the (LL)-matrix W4(u) - (W (kl) (u)) by using the normalized covariance matrix Vo[a~),a~ )] instead of Y[a~),a~ )] (see eq. (7.55))"
Vo[a(k), a~ )] is evaluated at the true values 1 a ~ ) and a~ ) . In actual computation, we replace the matrix Vo[a(k) , a~ ) ] by the values evaluated at the data values a (k) and a~ ) and approximate the matrix W . ( u ) Here, the matrix
by computing the rank-constrained generalized inverse
where r is the rank of the hypothesis (9.1) (see eq. (7.56)). From eqs. (9.4) and (9.5), we immediately see that the scale of u is indeterminate if no constraints are imposed. In order to remove this indeterminacy, we impose normalization II ll = 1. We further assume that no other constraints exist on u. It follows that the parameter space is an ( n - 1)dimensional unit sphere S n - l . If the functions W(~kl) (u) are replaced by constants W(~kl), eq. (9.4) has the form N
~
1
J[u] - -~ ~
L
~
W..~ (k') (a~),
u)(a~ ), u).
(9.7)
4=1 k,l=l
The constants W(~kl) are chosen, for example, in the form W (kl) - W (kt) (u'), where u* is an appropriate estimate of u. If the hypothesis consists of a single equation (a4, u) - 0, minimization of eq. (9.7) is equivalent to 1
N
--~ Z W~(a4, u) z --4 min,
(9.8)
4--1 1To be consistent with the notation in Chapters 7 and 8, we should write V 0 [ a ~ ) , a ~ )] and w ( k l ) ( u ) as V0[a(k),a~ )] and lTV(kl)(u), respectively. In the following, however, we omit the bars to simplify the notation.
9.1.
Eigenvector Fit
269
which is a special type of least-squares optimization weighted by W~. Since minimization of eq. (9.7) is a straightforward generalization of eq. (9.8) to multiple hypotheses, we call eq. (9.7) the least-squares approximation to eq. (9.4). The least-squares approximation has a practical significance: the solution is computed analytically. Define the moment matrix N
L
c~--1
k,l=l
i
(kl) (ak)a~) T
(9.9)
This is a positive semi-definite symmetric matrix. Eq. (9.7) can be rewritten in the form ~][u]- (u, M u ) - ~ min. (9.10) The right-hand side is a quadratic form in u. Hence, J[u] is minimized under the constraint Ilull - 1 by the unit eigenvector /t of M for the smallest eigenvalue (see eqs. (2.86) and (2.139)); the smallest eigenvalue equals the residual J[/t]. We call the solution/t thus obtained the eigenvector fit.
9.1.2
Statistical bias of eigenvector fit
In spite of the computational advantage of the eigenvector fit, it has a drawback. For computing the weights W(~kl) - W (kO (u*), we need a good estimate u*. How can we choose it? A naive strategy for this is to substitute an initial guess u0 for u* and compute the eigenvector fit u l; substituting u l for u*, we again compute the eigenvector fit u2 and so on. However, such iterations introduce statistical bias into the solution. Before going into the details, we must note the following two facts: 1. Statistical bias is a second order effect: if the noise is of mean zero and of order O(e), the bias of the output is of order O(e2), since the expectations of noise terms of odd orders vanish. 2. The fact that an estimator X of 2 is unbiased (i.e., E[X] - 2 ) makes little practical sense unless its variance V[X] is very small. If X is determined from a large number of data and if l i m g ~ E[X] - f( and l i m N ~ V[X] - O, where N is the number of the data, X is said to be a consistent estimator (see Section 3.6.2). A typical example is when X is an average of a large number of independent data of mean zero and of the same order of magnitude (the law of large numbers; see eq. (3.9)). We also need the following preliminary results: 9 Consider the identity W~(u)W~(u)-
- p~:~(u)
(9.11)
where p ~ ( u ) is the L-dimensional projection matrix onto the range s of the (LL)-matrix W ~ ( u ) (see eqs. (2.80)).
270
Chapter 9.
Renormalization
9 Taking the trace of eq. (9.11) on both sides, we obtain
tr(W~(u)W~(u)-) = r, (9.12) where r is the rank of Wa(u) (see eqs. (2.51)), which equals the rank of the hypothesis (9.1). If eq. (9.5) is substituted into eq. (9.12), we have L
E W(kt)(u)(u, Vo[a(k), a~)]u) = r.
(9.13)
k,l--1
9 Differentiating eq. (9.13) with respect to u on both sides, we obtain L
L
E Vuw(ki)(u)(u, Vo[a~ ) , a~)]u) + 2 E w(kl)(u)V~ k,l=l
a~)]u =
O.
k,l=l
(9.14) 9 Substituting eq. ( 9 . 5 ) i n t o the identity W~(u) (see eqs. (2.81)), we obtain
Wa(u)Wa(u)-W~(u) =
L
y ~ W (k0 (u)W.(m~) (u)(u, Vo[a~ ), a(m)]u) = W (k~).
(9.15)
/,m=l
9 Substituting eq. (9.2)into eq. (9.9), we obtain 1
N
M = ~ ~
L
~
w.(~*)(~(2) + A.~))(a~) + A.~))~ - M + AM.
c~--1 k,l--1
(9.16) where M is the unperturbed moment matrix defined by N -
1
M - ~ E
L
E
W(k/>~i~>a~>T"
(9.17)
c~=l k ,l= l
The deviation A M is expressed in the following form: AM-
N
1
L
~E
(at) ( ~i~) Aa~) T +Aa~>a~>T)
Z
W+
c~=l k,l= l
1
N
L
(kt)
(9.18)
c~=l k,l=l
The first term on the right-hand side is O(e/yr~) since it is an average of independent random variables of mean zero, while the second term is O(e2). Since E[Aa(~k)] = 0, we see that e2
N
L
E[ZXM]- ~ Z ~ c~=l k,l=l
w~')Vo[~),~)]
9
(9.19)
9.1.
Eigenvector Fit
271
9 Let ft be the true value of u that satisfies the hypothesis ( ~ ) , u) = 0, k - 1, ..., L, c~ - 1, ..., N. From eq. (9.17), it is immediately seen that Mft
-
O,
(9.20)
meaning that the true value ~ is the unit eigenvector of the unperturbed moment matrix M for eigenvalue 0. _
Now, consider the gradient of J[u] at the true value ~. Differentiating eq. (9.10) with respect to u, n o t i n g / ~ f t - 0 and ( ~ ) , ~) - 0, and using eq. (9.18), we obtain Vu~][~] - 2 M ~ t - 2(_M + A M ) ~ -
2
N
2AM~t
L
a(k) a~)
(kl) a=l
(9.21)
k,l= l
Let P~t be the n-dimensional projection matrix along ~. Since (g(k), ~) _ 0 implies P ~ t ~ ) - ~ ) , we see from eq. (9.21)that N
L
a=l
k,/=l
2 P~tVu3[~t]- ~ ~
~
W,,~ (kl) ( ( A a ~ ) , ~ ) , ~ k ) + Aa~)Aa~)-r~t _(~,, (Aa(k)Aa~)T)~)~)
,
(9.22)
Suppose the estimate u* for computing the weights W (kt) is chosen to be the true value ~. Using eq. (9.13), we observe that 2e 2
N
L
E[P~tVuJ[ft]]- ~ Z Z
lu
(V~
a ~ ) ] f t - (ft, Vo[a~ ), a~))]ft)ft)
or-- 1 k , l = 1
= 2r
~ Z c~--i
Z
W('kl)V~
-
k,l--1
The right-hand side is O(e 2) but not zero in general even in the limit N oc. In other words, P~tVu.][g] is not zero on average however many data we use. This implies that J[u] does not take its minimum at ~ because J[u] can be further reduced without violating the constraint I1 11 - 1 if u is infinitesimally incremented in the direction - P u V u J [ f t ] . It follows that if/t minimizes J[u], the probability that ( / t - ~ , < 0 is larger than the probability that ( / t E[Pu%.J[ ll) > 0. the eigenvector fit is biased (Fig. 9.1).
272
Chapter 9.
Renormalization
~
)[ul fi
ti
Fig. 9.1. Bias of the value fi that minimizes J[u]. On the other hand, no such bias exists if the the original function J[u] is minimized. Differentiating eq. (9.4) with respect to u, we obtain 1
N
L
(~----1 k , l - 1
+2w~(~)(~)(~), ~)~)).
(9.24)
Evaluating this at ~ and substituting eq. (9.2) into this, we observe that N
L
c~:1 k , l : l
+2W.(kt)(~l(a~) + A a (a), ~)(a~ ) + Aa~))) N
_
L
1
(Vu
(kt)(. (Aa~)
,
c r = l k,l= l
+2W. (kt)(~)((Aa(k), ~)a~) + (Aa~), ~)Aa~))) N
L
c~=1 k , l = l
(9.25) From eq. (9.14), we see that
E[VuJ[~,]] - ~
V~w(~') (a)(a, Vo[~), (x:l
~)]a)
k ,l: l
+2 Z w~(~)(~)v0[~),~)]" k,l:l
)
-0.
(9.26)
9.2.
Unbiased Eigenvector Fit
273
Consequently, E[P~tVuJ[ft]] = 0. Since P~tVuJ[ft] is an average of a large number of independent data of mean zero and of order O(e), we have P~tVuJ[ft] = O(e2/v/-N). This means that the minimum of J[u] occurs at the true value ~ in the limit N --+ oc: the optimal solution is a consistent estimator. From this argument, we also see that the statistical bias of the eigenvector fit is solely due to the least-squares approximation W(~kl) (u) ~ W(~kl) (u*), or equivalently VuW(~ kl) ~ O. (9.27) (See eq. (8.87).) Hence, the bias is unavoidable however the estimate u* is chosen--even for u* = ~,.
Unbiased Eigenvector Fit
9.2
9.2.1
Unbiasedleast-squares approximation
The analysis in the preceding section suggests that an unbiased solution could be obtained if
1
N
L
(kl)
E
J[u]--~E
W~
((a~)u)a~ ) ,
(
,u) - V[a~ )
,a~)])
(9.28)
c~=l k , / = l
is minimized instead of eq. (9.7). By the same analysis as shown earlier (see eq. (9.23)), we conclude that E[VuJ[~t]]- 0 and hence E[PuVuJ[~]]o. Since P~t Vu~][~] is an average of a large number of independent data of mean zero and of order O(e), we have P~tVuJ[ft] - O(e2/v/N). Hence, the solution is a consistent estimator. We call eq. (9.28) the unbiased least-squares approximation to eq. (9.4). The unbiased least-squares approximation is computationally convenient. In fact, if we define the unbiased moment matrix
M -
N
1
L
)
Z
a~)T
- V [ a ~ ), a~)]),
(9.29)
o = 1 k,l= l
eq. (9.28) can be rewritten as the quadratic form
J[u] - (u, 1VIu),
(9.30)
which is minimized under the constraint Ilull - 1 by the unit eigenvector/t of / ~ for the smallest eigenvalue. Let us call the s o l u t i o n / t thus obtained the
unbiased eigenvector fit. Assume that the true value ~, which is the unit eigenvector of M for eigenvalue 0 (see eq. (9.20)), is unique, i.e., the eigenvalue 0 is a simple root.
274
Chapter 9.
Renormalization
According to the perturbation theorem, the perturbation A/t - / t - fi of/t from ~ is given in the following form (see eq. (2.111))" A/t - - / f / / - A/~lrf~ + O(e2). The perturbation A / ~ -- M - / ~ N
1
o f / ~ f r o m / ~ has the following form:
L
~M - ~ Z Z w~(~) (~2) ~ (~=1
(9.31)
a~)T
+ ~a(2 )
~,~) T)
k.l=l
N
L
1
(kt)
+~ E ~ wo (~ (~=1
a~)
T
~)
,
- ~V0[a~) ~)l).
(9.32)
k.l--1
Suppose the estimate u* for computing the weights W(~kt) - W (kl) ( u * ) i s chosen to be the true value ft. Ignoring terms of O(e 2) and noting that (d~), ~t) - 0, we obtain from eq. (9.32) 1
N
L
c~=1
k.l=l
~M~ = ~ ~ Z w~,(kt) (~ a~) , f t ),~
(k)
.
(9.33)
Since a ~ ) is independent for each a, we obtain E[(/kMa)
(A/~a)
1
m]
N
L
-- N 2 E a.fl=l 62
N
L
-- N 2 Z c~.~=1
=
N2 E
]~)~
~ w'(kl)W(mn)(fl"'S[Aa~)Aa(n)T~"~'(k)~l'(m) k.l.m.n=l ~ w(~kl)w(mn) (~t. 5~t~Vo[a~) , a~)]~t)a~ ) 5 ~ ) k.l.m.n=l
E
c~=l
~2 N _-- N2 ~ c~=l
k.m=l
w(kl)w(nm)(~l,,
go[a~ ) . a~)].)
l.n=l
)
~.(k)a(m)
L ~ I/V(km) ~t(k) ~t(m) __ ~/iS/, E2 k,m=l
(9.34)
where the identity (9.15) is used. From eq. (9.31), the covariance matrix V[~i] - E[A~iA~i -r] of the unbiased eigenvector fit ~i is obtained to a first approximation in the following form: E2
V [ / t ] - _M-E[(A/I~ft)(A/~'~)T]/~- = ~ M = e2
Z c~----1 k.l= l
)
W(kt)a~)a~) -~
"
M/~-=
E2 _ _
~M
(9.35)
9.2.
Unbiased Eigenvector Fit
275
The rank of this matrix is n - 1 in general; the null space is {R}L. Comparing this with the general result in Section 7.1.5 (see eq. (7.57)), we conclude that the unbiased eigenvector fit has the same covariance matrix as the truly optimal estimate. In other words, the unbiased least-squares approximation (9.28) does not decrease the accuracy of the solution. In actual computation, eq. (9.35) is approximated by
(z N
=
EL
T
k,l=l
)
(9.36)
where P/~ is the n-dimensional projection matrix 2 along/~. Here, h ~ ) is the optimally corrected value of a ~ ) (see Section 5.1.2), but to a first approximation it can be approximated by the data value a ~ ) itself.
9.2.2
Analysis of residual
As discussed in Section 7.1.4, the residual of optimization plays an important role in testing of hypotheses. As in Section 7.1.4, we consider both the residual J[~] for the true value ~ and the residual j[/t] for the estimate/t. A. Residual for the true value Consider the residual J[~] for the true value a. Since ( ~ , / ~ ) that
- 0,
we
see
j[~] -- (~, f ~ u ) - (~, ( M + ~ f ~ ) ~ ) - (~, ~f~u) 1 = NE
N
L
(k~) , a~) W~ (Aa~) R)(A ,R)
~
c~=1 k , l = l 62
N
N ~
L
~
W(kO(~' V~
a~)]u)"
(9.37)
c~=1 k , l = l
We assume that the weights W~ kl) are so chosen that W (kl) - W (k') (~,). Then, the second term on the right-hand side of the above equation is - e 2 r (see the identity (9.13)). If we put -
(9.3s)
the L-vector e~ - (e (1),..., e~)) T is a Gaussian random variable of mean 0. Its covariance matrix is V[e~]- (E[e~)e~)])-
((ft, E[Aa~)Aa~)]~t))
2The projection matrix P u is inserted to ensure t h a t the computed covariance matrix V[fi] has null space {fi}L, which is required by the normalization constraint Ilfi[[ = 1.
276
Chapter 9.
Renormalization
Hence, V[e~]- = W ~ / e 2, so the first term on the right-hand side of eq. (9.37) is l
N
L
-
~.
- X ~(~.,
a--1 k,l--1
vW.]-~.).
(9.40)
c~--I
Since V[e~] (= e2W-~) has rank r and the vector e~ is independent for each c~, the sum ~ g~=l(e~, V[e~]-e~)is a )~2 variable with rN degrees of freedom (see eq. (3.63)). Consequently, eq. (9.37) can be written in the form ~2
J[a]-
7(~,~-
rX),
(9.41)
where X~N : stands for a X 2 variable with r N degrees of freedom. The expectation and variance of the residual J[~] are given as follows (see eqs. (3.59)):
2e4r
E[J[~]] - 0,
v[J[~]] - --~-.
(9.42)
It follows that if a particular value ~t is given independently of the data {a~) }, we can construct the X2 test for the strong hypothesis that the true value is ~: the strong hypothesis is rejected with significance level a% if 62
J[~] > -~(X2~N,, -- rX).
(9.43)
B. R e s i d u a l for t h e e s t i m a t e
Consider the residual j[/t] for the unbiased eigenvector fit/t. We see that
J[~] - ( ~ , . ~ ) - (~ + A~, M(~ + ~ ) ) = (~, M~) + 2(/x~, M ~ ) + (A~, M/x~) = j[~] + 2 ( ~ , (M + ~ M ) ~ ) + (a~, M ~ ) = j['~] + 2(A/t, A / ~ ) +
(A/t,/I/A/t).
(9.44)
Since A/t = O ( A / ~ ) = O(e/v/-N) (see eqs. (9.31) and (9.32)), the last term on the right-hand side is ~3
(A/t, f / A / t ) - (A/t, (AT//+ A / ~ ) A / t ) - (A/t, AT//A/,) + O(Nv/_~).
(9.45)
Consider the second term on the right-hand side of eq. (9.44). According to the perturbation theorem, the smallest eigenvalue A (= j[/t]) o f / ~ has the following form (see eq. (2.108)): 6
-
(/t, A / ~ / t ) + O(AA}I) 2 - O(--~:).
(9.46)
9.2.
Unbiased Eigenvector Fit
277
Since/t is a unit vector, we have (?2
(zx~, ~) - o(zx~) ~ - o ( v ) . Hence,
(9.47)
(?3
(Aa, M a ) - s
(9.48)
~) - O( N .~=).,/j~
On the other hand,
_
(?3
(9.49)
= (A/t, A ] ~ f t ) + (A/t, M A / t ) + O ( N ).~/~ .rr~. Comparing eqs. (9.48) and (9.49), we conclude that _
(A/t, A.~r
(?3
- -(A/t, MA/t) +
O(N"7;)'x/J~/
(9.50)
Substituting eqs. (9.45) and (9.50)into eq. (9.44) and using eqs. (9.37) and (9.40), we obtain (?3
j[,i] - 2[~,] - (zx,i, M ~ , i ) + O ( x v ~ ) (?3
= V
( ~ , v [ ~ o ] - ~ ) - (~x,i, v[,i]- zx,~) - r x c~=1
+ o( x4-~), (9.51)
e21~-/N
where we have used the fact that V[/t] = (see eq. (9.35)). Since we are assuming that there exists no constraint other than II,ill- 1, the rank of V[/t] is n - 1 in general. Hence, the quadratic form (A/t, V [ / t ] - A / t ) i s a X2 variable with n - 1 degrees of freedom (see eq. (3.61)). From eqs. (9.31), (9.33), and (9.38), we have N
~x,~-
X1 ~
L
Z
W~(~)~) M a~) + o(~: ),
(9.52)
a=1 k,l=l
meaning that A/t is obtained, to a first approximation, by a linear mapping from {e~}. Consequently, J[/t] can be written to a first approximation as follows (see eq. (3.64)):
6_2 J[/t]-
--~(~2vN_n+ 1 -- r N ) .
(9.53)
278
Chapter 9.
Renormalization
Its expectation and variance are E2
E[J[/tl]--~(n-
2e4r(n-l) V[j['5]]- ~ 1 rN
1),
"
(9.54)
From eq. (9.53), we can construct the X2 test for the weak hypothesis that eq. (9.1) holds for some value u: the weak hypothesis is rejected with significance level a% if E2
>
XrN-nWl,a - r N + n - 1 ) .
(9.55)
Comparing eqs. (9.42) and (9.54), we observe that the expectation and variance of J[/t] are both smaller than their respective values for J[R] because of the correlation etween/t and the data { a ~ ) }.
9.3 9.3.1
Generalized Eigenvalue Fit N o i s e level e s t i m a t i o n
Although the unbiased eigenvalue fit seems very desirable for its unbiasedness and computational convenience, a difficulty arises if we want to compute it in a real situation: the noise level c must be estimated precisely. If the noise is underestimated, statistical bias still remains, while if it is overestimated, statistical bias arises in the opposite direction. However, accurately predicting the noise level e is very difficult in practice. This problem does not occur for the eigenvector fit ~, since the leastsquares approximation does not involve the noise level e: the covariance matrices V [ a ~ ), a ~ )] need to be estimated only up to scale. Thus, the eigenvalue f i t / t is insensitive to the noise level e, while the unbiased eigenvalue f i t / t is very sensitive to it. As pointed out in Sections 5.1.5 and 7.1.4, it is often easy to predict the form of the covariance matrix from geometric considerations, but its absolute magnitude is very difficult to predict a priori. Hence, it is desirable to solve the problem without knowing the noise level e. In the following, we present such a scheme; the noise level e is estimated a posteriori. Note that eq. (9.29) can be written in the following form: /14 - M 1
N
e2N,
(9.56)
L
N - '-N E E w(kl)V~ (~=1 k,l=l Let u* be the estimate for computing the weights: W (kO
(9.57)
-
9.3.
Generalized Eigenvalue Fit
279
see from the identity (9.13) that N
L
1 E (u*,Nu*) - -~
W~(kl) (u*, Vo[a~) , a~ )] u* ) - r .
E
(9.58)
o~--1 k,l--1
Recall that we have defined the unbiased moment m a t r i x / ~ - M - e2N with the expectation t h a t / ~ should be a better estimate o f / ~ / t h a n M . This suggests that the noise level e should be estimated so that / ~ ~ M - e2 N .
(9.59)
However, exact equality m a y not hold. So, we_ introduce the following cornpromise. Since the true value ~ satisfies (~, M~t) - 0, we seek the value e that satisfies ( u , / ~ u ) - (u, M u ) - e2(u, N u ) - 0 (9.60) for some u. There may exist multiple pairs {u, e} that satisfy eq. (9.60). From among them, we choose the one for which e 2 - (u, M u ) / ( u , N u ) is the smallest. Eq. (9.58) implies that if u ~ u* then (u, N u ) 7s O. Hence, the vector u is determined by minimizing
I[u]
-
(u, (u, M N uu)) "
(9.61)
In other words, we minimize the generalized Rayleigh quotient (see eq. (2.103)). Since M is generally positive definite 3 in the presence of noise, eq. (9.61) is minimized under the constraint Ilull - 1 by the unit generalized eigenvector/~ of the generalized eigenvalue problem
Mu - cNu
(9.62)
for the smallest generalized eigenvalue (see eq. (2.104)). At the same time, the smallest generalized eigenvalue c gives an estimate of e2. We call the resulting solution/~ the generalized eigenvector fit.
9.3.2
Accuracy of generalized eigenvector fit
A. U n b i a s e d n e s s of generalized e i g e n v e c t o r fit The bias analysis in Section 9.1.2 can be applied to I[u] by assuming that the estimate u* for computing the weights W (kt) is chosen to be the true value ft. Differentiating eq. (9.61) with respect to u, we obtain
2Mu -
(u, Nu)
2(u, M u ) N u
(9.63)
-
3The m o m e n t matrix M is positive semi-definite by definition; it is singular if and only if noise does not exist. It follows that if the smallest eigenvalue of M happens to be zero, the corresponding unit eigenvector is the true value fi, so we need not consider this case.
280
Chapter 9.
Renormalization
Since (~, N~t) - r (see eq. (9.58)), we see that 2 (AT//+ A M ) ~ t 2(~t, (_37//+ A M ) ~ t ) N ~ (~t, N ~ ) (~, N ~ ) 2
VuI[~] -
2AM~t
2(~, A M ~ t ) N ~ t
r
r 2
(9.64) "
Eqs. (9.19) and (9.57)imply E[AM] - e2N, so
E[VuI[~]]- 2e2Nu r
2r --
r 2
=
0.
(9.65)
Consequently, E[P~tVuI[~t]] - O. Hence, the generalized eigenvector fit is a consistent estimator by the same argument as in the case of the unbiased eigenvector fit. B. Covariance m a t r i x of generalized eigenvector fit The covarimlce matrix V[/t] of the generalized eigenvector f i t / t is evaluated as follows. If I[u] takes its minimum a t / t , we have PuVuI[it] = 0. Eq. (9.63) implies (u, VuI[u]) - O. Consequently, PitVuI[it] - VuI[/t], and the generalized eigenvector f i t / t satisfies VuI[/t] - 0.
(9.66)
Using eq. (9.63), we can rewrite this equation as (/t, N i t ) M i t - (/t, M i t ) N i t .
(9.67)
_
S u b s t i t u t i n g / t - ft + A/t and M - M + A M into this, we obtain (~, N ~ ) ( / ( f A / ~ + A M ~ ) - (~, A M ~ ) N ~
+ O(e2).
(9.68)
Since (~, N ~ ) - r, we have
l"~Ait
-
-(AM
- ( ~ t ' A M ~ t ) N ) ~ + O(e2).
(9.69)
r
Hence, - -a3-
+ o(d),
(9.70)
where we have defined A / ~ ' - A M - (~' A M ~ t ) N . r
Let A / ~ - A ) I - / ~ .
Eq. (9.56)implies that
A/~ - M-
E2N-/~
- A M - E2N.
(9.72)
9.3.
Generalized Eigenvalue Fit
281
Hence, eq. (9.71) can be written as
AI~' - AI~ - ( (ft, AMft)Nr
- e 2) W -
A / ~ - ~,3[~t]r
(9.73)
where we have used the relation Jilt] - (ft, A/~/~) - (ft, ( A M - e 2 N ) ~ ) - (~, A M f t ) -
Sinc
(9.42) imply J[a] ~ d
V/2r/N,
e2r.
(9.74)
(9.73)is written as 52
A/I)/' - A / ~ / + O ( ~ ) .
(9.75)
This means that the statistical behavior of A / ~ ~ is the same as that of A/I)/if terms of O(e2/v/-N) are ignored. Hence, eq. (9.34) holds for A M ' as well to a first approximation, and the covariance matrix of the generalized eigenvector f i t / t is also given by ~2
L
W..(kOa(k)a~ )T
V [ / t ] - ~ / f / / - - e2 a=l
,
(9.76)
--1
which can be approximated by eq. (9.36) in actual computation.
9.3.3
Analysis of residual
A. Residual
for t h e t r u e v a l u e
Consider the residual I[~] for the true value ~. Here, too, we assume that the estimate u* for computing the weights W (kO is chosen to be the true value ~. Then, I[ft]- (f~,M~)= (~t, N ~ )
(R,/~)+e2(~,N~)= r
(R,/P/R) + e 2 r _ J[f~] ~_ ~2. r r (9.77)
From eq. (9.41), we obtain e2
2
(9.78)
Hence, the expectation and the variance of I[~] are 2(~4
E [ I [ ~ ] ] - ~2
V[I[~]]-
rN"
(9.79)
If the noise level e is given a priori, we can construct the X2 test for testing the strong hypothesis: if a particular value ~ is given independently of the
282
Chapter 9.
Renormalization
data {a~)}, the strong hypothesis that the true value is ~ is rejected with significance level a% if e2 2 (9.80) I[~] > 7 - ~ X r N , a . B. Residual for the estimate Consider the residual I[/~] for the generalized eigenvector fit/~. Substituting /L - ~ + Au -- ~ + O(e) and noting that (~, N ~ ) - r, we obtain (/z, M/~)
(/~, flr/~)+ e2 (/~, N/~)
i[~] - (,~, N a ) = (a, N a ) + O(r (~, M~) + ~((~, N~) + O(~11 r + 0(~)
(a, M a )
+ e2 + O(e3).
(9.81)
Eqs. (9.31), (9.70), and (9.75)imply that the statistical behavior of the generalized eigenvector fit is the same as the statistical behavior of the unbiased eigenvector fit if terms of O(e2/v/-N) are ignored. Hence, the quadratic form (/~,/174r/~) in eq. (9.81) can be identified with the residual j[a] of the unbiased eigenvector fit under the same approximation. It follows from eq. (9.53) that to a first approximation E2
(9.82)
1[~1- 7-~x~_~+~. Hence, its expectation and the variance of I[/~] are rg
, Y[I[/~]] = 7-N
1
rg
"
Comparing eqs. (9.79) and (9.83), we observe that the expectation and variance of I[/~] are both smaller than their respective values for I[~] due to the correlation between/~ and the data {a~)}. From eq. (9.82), an unbiased estimator ~2 of the squared noise level e2 is obtained in the following form: ~2 =
I[/~] 1 - ( n - 1)/rN"
(9.84)
Its expectation and variance are 2E 4
E[~2]_ e2,
V[~:]- rN-
n + 1"
(9.85)
If the noise level e is given a priori, we can construct the X2 test for the weak hypothesis that eq. (9.1) holds for s o m e value u" the weak hypothesis is rejected with significance level a% if E2
2
1[~] > 7-~X~N_~+~,o.
(9.86)
9.4.
Renormalization
283
In terms of the estimate ~2 computed by eq. (9.84), the above X2 test takes the form ~2 2
~rN-n+l,a
--5 e > rN-
9.4
n + 1"
(9.87)
Renormalization
9.~.1
Iterations for generalized eigenvalue problem
In order to compute the generalized eigenvector fit, one must solve the generalized eigenvalue problem for the moment matrix M with respect to the matrix N . If N is of full rank, this is a nonsingular generalized eigenvalue problem (see Section 2.2.4), and the solution is obtained by computing the inverse square root N -1/2 (see eqs. (2.93) and (2.94)). However, the domain of the data { a ~ ) } is usually constrained, i.e., the covariance matrices V [ a ~ ) , a ~ )] are generally singular. As a result, the matrix N is generally singular, so the generalized eigenvalue problem is singular (see Section 2.2.5). It follows that one must solve a generalized eigenvalue problem of a smaller size (see eq. (2.100)). However, there exists a more efficient method if we note the following two facts" 9 Not all the generalized eigenvalues and generalized eigenvectors need to be computed" only the smallest generalized eigenvalue and the corresponding generalized eigenvector are necessary. 9 Computing the smallest generalized eigenvalue c of the problem M u = c N u is equivalent to computing the value c for which the smallest eigenvalue of M - M - c N is 0. ^
The second assertion is proved as follows. If c is the smallest generalized eigenvalue of the problem M u - c N u a n d / t is the corresponding generalized eigenvector, we have f / I i t - M i t - c N i t - 0, (9.88) meaning that ./~ has eigenvalue 0 for eigenvector/t. Suppose M has another eigenvalue -)~' (< 0) for a unit eigenvector u'. Then, M u ' - M u ' - c N u ' = - ) ~ ' u ' . Hence, (u', M u ' ) - c(u', N u ' ) - -)~'. (9.89) If (u', N u ' ) - O, then (u', M u ' ) - -)~', which is a contradiction because M is semi-positive definite. Hence, (u I, N u ~) > 0, but then (u', M u ' )
= c-
)~'
(9.90)
which is also a contradiction, since c should be the minimum of the generalized Rayleigh quotient (u, M u ) / ( u , N u ) .
284
Chapter 9.
Renormalization
Thus, we only need to compute the value c for which the smallest eigenvalue of .l~ - M - c N is 0. I f / t is the corresponding unit eigenvector, we have ( / t , / ~ / t ) - 0. Let u be the unit eigenvector of M for the smallest eigenvalue )~ (~= 0). Then, (u, ( M - c N ) u ) = (u, M u ) - c(u, N u ) = )~ - c(u, N u ) .
(9.91)
It follows that if we define - M-
~----~----N,
we have ( u , / ~ u ) - 0. However, u may not be an eigenvector o f / ~ . iterate this process:
(9.92) So, we
1. Let c = 0. 2. Let u be the unit eigenvector of the matrix
I~/I-M-cN
(9.93)
for the smallest eigenvalue, and let A be that smallest eigenvalue. 3. If A ~ 0, return u and c. Else, update c as follows:
c+--c+
(9.94)
4. Go back to Step 2. The convergence of this process can be confirmed as follows. If u is the unit eigenvector of M for the smallest eigenvalue A, the m a t r i x / ~ is perturbed by - ) ~ N / ( u , N u ) at the next iteration. According to the perturbation theorem, the corresponding eigenvalue A is perturbed in the following form (see eq. (2.108)): AN )( - )~- (u, (u, N u ) u) + O()~2) - O()~2). (9.95) In other words, A converges to 0 quadratically as in the Newton iterations, meaning that three or four iterations are usually sufficient.
9.~.2
Iterations for weight update
So far, we have assumed that the weights W (kl) are approximated by using an appropriate estimate u*. Since we need not worry about the bias any longer, it can be determined by iterations: we guess the initial value u0 and compute the generalized eigenvector fit u x; using u l, we compute the generalized eigenvector fit uz and so on. Since the generalized eigenvector fit is itself computed by iterations, the combined process requires double loops of iterations. However, the two loops can be merged in the following form:
9.4.
Renormalization
285
1. Let c - 0 and W (kl) - Ski, a -- 1, ..., N. 2. Compute the following matrices M and N : N
1
L
M - -~ E E W (~kt)a(~k)a(~)T a=l
1
N
(9.96)
k,l=l L
N - -~ E E W'(kOV~ ), a~)]"
(9.97)
c~=l k , / = l
3. Compute the smallest eigenvalue A of the matrix
M-M-cN
(9.98)
and the corresponding unit eigenvector u. 4. If A ~ 0, return u, c, and 1~. Else, update c and W ~ follows: c +-- c + (u,
Nu)'
(
W~ +--- (u,
)
V0[a (k), a~)]u) ~ .
(9.99)
5. Go back to Step 2. We call this process renormalization. Although the convergence is no longer quadratic, the computation is usually very efficient. Since the returned value u gives the generalized eigenvector fit/t, the covariance matrix o f / t is given by eq. (9.76): E2
V[/t]- v(M)n_I.
(9.100)
Here, the rank-constrained generalized inverse is used b e c a u s e / ~ / t - 0 may not be strictly satisfied if the iterations are pffematurely terminated. Since the eigenspace for the smallest eigenvalue of M is {/t}L, computing ( / ~ ) ~ - 1 is equivalent to applying the projection matrix P / t as in eq. (9.76). After renormalization, the returned constant c equals the residual I[/t] of the generalized eigenvector fit /t (see eqs. (9.61), and (9.62)). Hence, an unbiased estimator ~2 of the squared noise level e2 is obtained from eq. (9.84) in the following form: ~:2 : c
1
1)/rN"
(9.101)
Its expectation and variance are given by eqs. (9.85). It follows that the number N of necessary data for uniquely estimating u must be such that n-1 N > ~ . r
(9.102)
286
Chapter 9.
Renormalization
If N = ( n - 1)/r, the value of u is determined but the noise level e cannot be estimated. The significance of renormalization is that it produces not only an unbiased estimator /t of the parameter u by a simple numerical computation but at the same time its normalized covariance matrix V0[/t], an unbiased estimator ~2 of the squared noise level, and its variance V[~2] as well. Let ~max be the largest eigenvalue of V[/t], and Vmax the corresponding unit eigenvector. The primary deviation pair {u +, u - } is given as follows (see Section 4.5.3): U+ -- N[/t + X/#maxVmax],
U - -- N [ / t -
~/~maxVmax].
(9.103)
If the value e is given a priori, the hypothesis (9.1) can be tested by comparing it with the estimate ~ (see eq. (9.87))" the hypothesis (9.1) is rejected with significance level a ~ if ~2
2
XrN-n+l,a
~e > r N 9.5
n + 1"
(9.104)
Linearization
9.5.1
Linearized algorithm
Renormalization is a numerical means to compute the generalized eigenvector fit, which is the solution of the unbiased least-squares approximation to the original optimization, but the unbiased least-squares approximation and the least-squares approximation are both designed for problems for which 1. the hypothesis is linear, and
2. no constraint is imposed other than normalization. In many application problems, the parameter u is constrained in various ways, even though the hypothesis is linear (see Chapter 5). For example, u may be a unit vector constrained to be in a parameter space b / C 7~n. Then, the procedure for renormalization as described in the preceding section can no longer be applied. We now devise a method for overcoming this restriction by noting that the following generally holds: 9 The constraint u E L/ is compatible with the hypothesis (9.1): there exists a solution fi E/4 that satisfies the hypothesis (9.1). 9 The hypothesis (9.1) is an overspecification: there exists a minimum number No (_< N) such that equations ( ~ ) , u) - 0, k - 1, ..., L, a 1, ..., No, can uniquely determine the solution ~t E/4. It follows that if the noise is small, the solution/t based on the hypothesis (9.1) alone is expected to be a good estimate of the true value ~t. This observation leads us to the following approach:
9.5.
Linearization
287
Fig. 9.2. Optimally projecting fi onto the parameter space/4 C ~n.
1. We apply renormalization without considering the constraint u EL/. 2. We then optimally project the resulting solution onto the parameter space/2 (Fig. 9.2). L e t / t be the solution obtained in Step 1, a n d / t t E L/its projection onto b/. Suppose the manifold/4 is defined by M equations F (m) (u) - 0,
rn - 1, ..., M,
(9.105)
together with the normalization condition Ilull- 1. Substituting/t / - / t - A / t into eq. (9.105) and taking a linear approximation, we obtain (Vu/~(m), A/t) -/~(m),
rn - 1, ..., M,
(9.106)
where Vu/~(rn) and/~(m) are the abbreviations of VuF(m)(it) and F(m)(/t), respectively. The correction A/t is determined by minimizing the Mahalanobis distance IIAullv0[/t] (see eq. (5.6))" J = ( A t , V0[/t]-Au)--+ min,
A/~ E {/L}f.
(9.107)
Here, we use the normalized covariance matrix V0[/~] instead of V[/L], since the solution of the optimization is not affected if the covariance matrix is multiplied by a positive constant. Theoretically, V0[/L] should be evaluated a t / I but we approximate it by the value at/~. Note that the normalized covariance matrix V0[/t] is given as a by-product of renormalization (see eq. (9.100)). Let r be the rank of the constraint (9.105). As shown in Section 5.1.2, the first order solution of the optimization (9.107) under the constraint (9.106) is given in the following form (see eqs. (5.15) and (5.17)): M
A/t--Vo[/I,] E
W(~'~)P(m)VuP("),
(9.108)
re,n--1
(9.109)
288
Chapter 9.
Renormalization
In order to impose the normalization condition {{/t'{] - 1 exactly, the actual correction o f / t and its normalized covariance matrix takes the following form (see eqs. (5.25)and (5.26)): /t'-
~/~[/t]' - Pit~Vo[/t]Pit~.
N [ / t - A/t],
(9.110)
Here, P / tu, is the n-dimensional projection matrix onto the tangent space T/t,(5/) to the manifold b/ at /t'. This correction is iterated until the cons t r a i n t / t ~ E 5/is sufficiently satisfied. The normalized a posteriori covariance m a t r i x of the final v a l u e / t ' is given as follows (see eq.(5.31)): M
Pit~ Y~
Y~
' - E w(rnn) ( gi.L~P~ m,n=l
)(git~ P~
) T"
(9.111)
This matrix has rank 4 n - r - 1. From eq. (9.111), the a posteriori covariance matrix o f / t ' is given in the form V[/t'] = ~2V0[/t'], where ~2 is an estimate of the squared noise level e2; it is given as a by-product of renormalization (see eq. (9.101)). We call the above procedure linearization, and the resulting algorithm the
linearized algorithm. 9.5.2
Decomposability condition
In some applications such as 3-D motion analysis, which we will study in Chapters 11 and 12, the hypothesis is nonlinear in the parameter u but can
be rearranged into the form ( ~ ) , v)
- -
0,
k
- -
1,
. . . ,
L,
(9.112)
where v is obtained from u by a nonlinear mapping v ( . ): 7"4n --+ 7-4m in the form v = v(u). (9.113) As before, the data { a ~ ) } , k - 1, ..., L, c~ - 1, ..., N, are m-vectors and are assumed to have the form
(9.114) where A a ~ ) is a random variable of mean 0, independent for each a. It appears that renormalization and linearization can be applied if v is regarded as a new variable instead of u. However, we must be careful about the degree of freedom of the variable: 4 T h e last " - 1 " in n -
r-
1 is due to the normalization I l f i ' l ] - 1.
9.5.
Linearization
289
R"
R"~
Fig. 9.3. Optimally projecting ~, E ~'~ onto the manifold 12 C 7~m so that the equation v(6) - fi' has a solution fi E b/. 1. If rn < n, the p a r a m e t e r u cannot be d e t e r m i n e d even if v is obtained. In other words, the original p r o b l e m is underspecified; the solution is indeterminate. 2. If m - n, eq. (9.113) d e t e r m i n e s t h e value u from the c o m p u t e d value v in general. 3. If m > n, eq. (9.113) is an general.
overspecification; no solution m a y exist in
Let us consider the case rn > n m o r e closely. If rn > n, vector v(u) is generally c o n s t r a i n e d to be in an n~-dimensional manifold 12 C ~ m when u ranges over the p a r a m e t e r space 5/ C T~~. Let F (m) (v)
-
O,
m
--
1, . . . , m
-
n',
(9.115)
be the e q u a t i o n of t h e manifold Y. In order t h a t eq. (9.113) has a solution u, the o p t i m i z a t i o n for v m u s t be done u n d e r eq. (9.115). In this sense, eq. (9.115) is called the decomposability condition. If the d e c o m p o s a b i l i t y condition (9.115) is imposed, r e n o r m a l i z a t i o n can no longer be applied in the original form. As discussed in the preceding subsection, a simple way to overcome this restriction is c o m p u t i n g an e s t i m a t e / J of v w i t h o u t considering the d e c o m p o s a b i l i t y condition (9.115) and t h e n optimally projecting i~ onto the manifold 1; (Fig. 9.3). As described earlier, this p r o j e c t i o n is done so t h a t the M a h a l a n o b i s distance I I v ( u ) - viIvo[v ] is minimized. T h e p r o c e d u r e is s u m m a r i z e d as follows: 1. C o m p u t e the optimal estimate/~, its n o r m a l i z e d covariance m a t r i x V0[/~], and the u n b i a s e d e s t i m a t o r ~ :~ the s q u a r e d noise level by applying r e n o r m a l i z a t i o n to the hypothesis (9.112). 2. C o m p u t e t h e optimal correction A 9 so t h a t / ~ - / ~ - A 9 satisfies the d e c o m p o s a b i l i t y condition (9.115); do iterations if necessary.
290
Chapter 9.
Renormalization
3. Compute the normalized a posteriori covariance matrix V0[O~] of the resulting/~' E ];. 4. Solve the equation v(/t) - 'O' for/t E ~n. 5. Compute the normalized a posteriori covariance matrix V0[/t] of the resulting solution /t e L/ (see eq. (3.16)); its a posteriori covariance matrix is given in the form Y[/t] - ~2V0[/t].
9.6
9.6.1
S e c o n d Order R e n o r m a l i z a t i o n
Effective value of nonlinear data
In many problems in computer vision and robotics, the hypothesis is nonlinear in the raw data but can be rearranged into the form (b~) , u) -- 0,
k - 1, ..., L,
(9.116)
where b~) is obtained from g~ by a nonlinear mapping b(k)( 9)" T~m --4 7~" in the form b ~ ) - b(k) (a~). (9.117) In this case, if we compute b~ ) - b (k) ( a , )
(9.118)
for given data { a~ }, we can apply renormalization to the converted data { b, }. The deviation Ab~ ) -- b~ ) - E[b~ )] (9.119) is a random variable of mean O. Hence, the covariance matrix of b~ ) is defined by r [ b ~ ) b~)] _ E[Ab~)Ab(/)T]. (9.120) However, such a conversion introduces a small error. As before, let the raw data be a~ - d~ + Aa~, the noise A a a being an independent random variable of mean 0 and covariance matrix V[aa]. In general, the expectation
E[b (k) (a,)] is not equal to b~ ) (Fig. 9.4). In fact, let blk)( 9), b(k).(i), and ~(k),(i) be the ith components of b(k)(-), b(k)(aa), and b(k)(g~), respectively. We observe that
b(~)i) = bl k) ( ~ + Aa~) 3_-- b(~k()i)4-t,rwvaua(i), ~(k) Aa~) + 51'(V2~(k) a o~(i); A a , A a ~ ) + O ( A a , ) 3, (9.121)
9.6.
Second Order Renormalization
E[b~)]
291
b(~)(.) f .
.
.
.
.
.
.
.
.
.
.
.
.
aa
Fig. 9.4. The expectation E[b(k)(a~)] is not equal to ~(k).
Vabl k)(gs). The symbol V a2 ~(k) s(i) denotes (pq) element is c92bIk)(D,`~)/Oapcoaq. Let e be an ap-
where Wva%(i)~(k)is the abbreviation of
the (mm)-matrix whose propriately defined noise level such that A a s - O(e). Taking the expectation of eq. (9.121) and noting that the expectation of the terms of order 3 in A a s vanishes, we obtain
1 (~: ~(k) v [ ~ ] ) + o(~).
(9.122)
Thus, E[b (k) (am)] is not equal to b (k) (as), although the deviation is of order
O(~2).
effective value b*~(k) =
This deviation can be canceled if we define the ( b : ~ ) by
b,(~) s(i) = b(~) s(i) - ~l~v a~b(~) s(i); v[~]),
(9.123)
(k) is the abbreviation of ~2~(k) where V2b a s(i) -avi (as). In fact, if we note that V[a] = O(e 2) and V a2 b(k) s(i) _ V2~(k) a s(i) + O(e), (9.124) we obtain from eqs. (9.119), (9.122), and (9.123) l(w2~(k)
1 v~ ~(k)
Vial])
b(k) v [ ~ ] ) + o(~ 4) (9.125)
Hence, if terms of O(e 3) are ignored, {b~(k) } can be regarded as converted data with the interpretation that the true value b (k) that satisfies the hypothesis (9.116) is perturbed by independent noise Ab~ ) of mean 0 and covariance
292 matrices
Chapter 9.
Renormalization
V[b~),b~)], k, 1 = 1, ..., L. Since E[O(e3)] - O@4), we see that E[b; (k)] - b~) + 0 ( ~ 4 ) .
(9.126)
Following the general theory given in Section 7.1.5, the optimal estimate/t is obtained as the solution of the optimization N
J[u] - E
L
E
W(kO (u)(b*(k), u)(b*(t), u) ~ min,
(9.127)
a = l k ,l= l
(9.128) where r is the rank of the hypothesis (9.116). Suppose u is constrained to be in an n'-dimensional parameter space/4 C 7~n, and l e t / t be the resulting estimate. As discussed in Section 7.1.4, the residual J[/t] is a X2 variable with r N - n' degrees of freedom, where n' is the degree of freedom of u. The covariance matrix o f / t is given in the following form (see eq. (7.57)):
V[/L]-
E
W(kt)(it)(P~b(k))(Pg b~))T
(9.129)
ot=l k,l=l
Here, p U is the n-dimensional projection matrix onto the tangent space T/t(b/) to the manifold b / a t / t .
9.6.2
Second order unbiased estimation
Suppose no constraints are imposed on the n-vector u other than normalization I l u l l - 1. The least-squares approximation to eq. (9.127) is obtained if the functions W~(kl) (u) are replaced by constants W(~kO. Since multiplication of J[u] by a positive constant does not affect the solution, the least-squares approximation to eq. (9.127) can be written in the form J[u] - (u, M * u) ~ min, where M * is the
(9.130)
effective moment matrix defined by 1
M*--~E
N
L
,(k) ,(l)n-
EW(~kOb~
b~
(9.131)
c~=l k,l=l
The solution under the constraint Ilull - 1 is obtained as the unit eigenvector of M * for the smallest eigenvalue. As discussed in Section 9.1, however, the solution of the least-squares approximation is statistically biased whatever weights W (kl) are used.
9.6.
Second Order Renormalization
293
In order to construct an unbiased least-squares approximation, we first consider the moment matrix of the converted data {b~ ) } defined by N
L
1
(kl) b(k) b(1)T
(9.132)
c~----1 k,/----1
L e t / ~ be the unperturbed moment matrix obtained by replacing b~ ) by b~). The hypothesis (9.116) implies that the true value ~ satisfies M ~ - 0, i.e., ft is the unit eigenvector o f / ~ for eigenvalue 0. In general, E[M] is not equal t o / ~ . Suppose we have found (nn)-matrices N (1) and N (2) such that the matrix /~ M - e2N (1) ~- e4N (2) ~- 0(e 6) (9.133) -
has expectation/~/r, where matrices N (1) and N (2) do not involve the noise level e. Then, the unbiased least-squares approximation is obtained in the following form: J[u] - (u, 1Vlu) -+ min. (9.134) The solution under the constraint - x is obtained as the unit eigenvector o f / ~ for the smallest eigenvalue. If the noise level e is not known, we let
f l - M - c N (1) + c 2 N (2) ,
(9.135)
and choose the constant c so that the smallest eigenvalue o f / ~ is 0. Let u be the unit eigenvector o f / ~ for the smallest eigenvalue ~ (7~ 0). If c is incremented by Ac, the m a t r i x / ~ changes into /I;/~ - M - (c + A c ) N (1) + (c + A c ) 2 N (2) -- / ~ Since ( u , / ~ u )
A c ( N (1) -- 2 c N (2)) + A c 2 N (2).
(9.136)
- )~, we see that (9.137)
It follows that we can let (u, fir~u) - 0 by choosing the increment Ac to be Ac
1
2(u,N(2)u) ((u'N(1)u) - 2c(u,N(2)u) _
~/((u,N(i)u)-
2 c ( u , N ( 2 ) u ) ) 2 - 4)~(u,N(2)u)) . (9.138)
The covariance matrix defined by eq. (9.120) generally has the form V[b~ ) , b~ )] - e2Vo(1)[b~), b~ )] + e4Vo(2)[b~),b~ )] q-O(e6),
(9.139)
294
Chapter 9.
Renormalization
where the matrices V0(1)[b (k), b(ak)] and V0(2)[b(ak),b(~k)] do not involve the noise level e. By incorporating the update of the weights W~(kl) , the renormalization procedure is described in the following form: 1. Let c - 0 and W (kl)
-
5ij , k,
l - 1, ..., L, a - 1, ..., N .
2. Compute the matrices M , N (1), and N (2). 3. Compute the smallest eigenvalue )~ of the matrix 2~ - M
-
eN
(1) +
c2N (2)
(9.140)
and the corresponding unit eigenvector u. 4. If )~ ~ 0, return u, c, a n d / ~ . Else, update c and W (kl) as follows: D
--
((u,N(1)u)
_ 2 c ( u , N ( 2 ) u ) ) 2 - 4)~(u, N(2) u),
2 (u, c+-
c+
2(u,N(2)u)
)~
,
i f D > 0, (9.142) ifD <0,
c + (u, N (1) u) W (kl) t--
(9.141)
( (u, V0(1)[b (k) " (l)
, o ~ ] u ) + c(u , V0(2) [b (a) , b ~ ) ] u )
).
(9.143)
5. Go back to Step 2. We call this process second order renormalization. As before, the covariance matrix of the optimal estimate/t is obtained in the following form:
V[/t]-
~2 -~r(/~)n_l. 2"4
(9.144)
The squared noise level e2 is estimated in the form
~2 __
C 1 -(n-
1)/rN"
Its expectation and variance are given by eqs. (9.85).
(9.145)
Chapter 10 Applications of G e o m e t r i c E s t i m a t i o n This chapter illustrates the theories in Chapters 7 and 9 by solving typical parametric fitting problems, all of which are very important in computer vision and robotics applications: fitting lines and conics to edge segments detected by an image processing operation and fitting planes to space points reconstructed by a range finder or stereo vision. In each problem, an optimal fit is computed by renormalization. At the same time, its reliability is evaluated in terms of the covariance matrix and visualized by means of the primary deviation pair. Real-data and simulation examples are also shown.
10.1 10.1.1
Image Line Fitting O p t i m a l line f i t t i n g
Line fitting is one of the most important processes in image understanding, because most objects in indoor environments have linear boundaries. In an image, object boundaries are usually detected by an edge operator as edge segments, i.e., sequences of pixels. By fitting image lines to the detected linear edge segments, we can obtain a 2-D representation of the objects. Such a 2-D representation (or "line drawing") plays a basic role in computing the 3-D interpretation of the scene. Hence, fitting an image line to pixel data is a fundamental stage for understanding the scene. Let {x~}, c~ - 1, ..., N, be a sequence of image points to which an image line is to be fitted. Let {5~ } be their true positions. We write x~-
2~+Ax~,
(10.1)
and regard the noise A x ~ as a Gaussian random variable of mean 0 and covariance matrix V[x~]. Let (n, x) - 0 be the image line to be fitted. The problem is stated as follows. Problem
10.1 Estimate a unit vector n such that (n, 5~) - 0,
a - 1,..., N,
(10.2)
from the data {x a}, c~ = 1, ..., N. Eq. (10.2) is the hypothesis from which the vector n is to be estimated. The rank of this hypothesis is 1. As shown in Section 7.2.2, an optimal estimate 295
296
Chapter 10.
Applications of Geometric Estimation
of n can be obtained by the optimization
J[n]- Z
(nlV~--~]n) -+ min
(10.3)
c~--I
under the constraint I1 11- 1. The theoretical bound on the accuracy of fitting is given by the covariance matrix of the optimal estimate h in the form
T)
-
c~--i (n, v[
ln)
(10.4)
which has rank 2; its null space is {n}L. Decompose the covariance matrix V[x~] into the noise level e and the normalized covariance matrix ~)[xa] in the form V[xc~ ] - ~2V0[x(~ 1.
(10.5)
If the denominator in eq. (10.3) is replaced by a constant, we obtain the least-squares approximation (n, M n ) - + min,
J[n]-
(10.6)
where the moment matrix M is defined by N
1 E -~
M-
T, W,~x ,~x ,~
(10.7)
o~--1
1
W~ = (n*, Vo[x~]n*)"
(10.S)
Here, n* is an appropriate estimate of n. The solution ~t of the optimization (10.6) is obtained as the unit eigenvector of the moment matrix M for the smallest eigenvalue.
10.1.2
Unbiased e s t i m a t i o n and renormalization
The solution of the least-squares approximation is in general statistically biased whatever weights W~ are used. In fact, taking the expectation of the moment matrix M , we see that N
1 ElM] - --~ E
W~E[(~c,~ + A x ~ ) ( ~ + Axe) y]
a=l _
62
N
= M + -~ E c~=l
W'~V~
(10.9)
10.1.
Image Line Fitting
297
where M is the unperturbed moment matrix obtained by replacing x~ by 5~ in eq. (10.7). The hypothesis (10.2)implies t h a t / l ~ n 0, i.e., n is the unit eigenvector of the unperturbed moment matrix M for eigenvalue 0. Since E[M] is perturbed from M by O(e2), the solution ~ of the least-squares approximation is statistically biased by O(e 2) according to the perturbation theorem. However, if we define the unbiased moment matrix _
- M - e2 N , 1
(10.10)
N
N - ~ ~
W , V0[x,],
(10.11)
o~-~1
we have E[h)/] - /I)I. Hence, we obtain the unbiased least-squares approximation J [ n ] - ( n , / ~ n ) --+ min. (10.12) The solution /t is obtained as the unit eigenvector of / ~ for the smallest eigenvalue. If the noise level e is not known, the renormalization procedure is given as follows (see Section 9.4.2): 1. L e t c = 0 a n d W ~ -
1, c ~ - 1 , . . . , N .
2. Compute the matrices M and N defined by eqs. (10.7) and (10.11), respectively. 3. Compute the smallest eigenvalue A of the matrix
1(I- M-cN,
(10.13)
and the corresponding unit eigenvector n. 4. If A ~ 0, return n, c, and M . Else, update c and W~ as follows: A
c +-c+
1
W~ +--
(10.14)
5. Go back to Step 2. After renormalization, the squared noise level ~2 is estimated in the form ~2 __
C
(lo.15)
Its expectation and variance are E[g:2] - e2 '
2E 4
N- 2
(10.16)
298
Chapter 10.
Applications of Geometric Estimation
(b) Fig. 10.1. (a) Supposedly collinear image points. (b) Optimally fitted image line (solid line) and its primary deviation pair (dashed lines). If the value e is given a priori, the collinearity test for image points takes the following form (see eq. (7.75))" ~2
X~V-2,a
> g---=7
(10.17)
The covariance matrix of the resulting estimate ~ is obtained in the following form: ~2 V[fi]- -~;(/~)2(10.18) Iv
Let ~max be the largest eigenvalue of V[fi], and irma x the corresponding unit eigenvector. The primary deviation pair {n +, n - } is given as follows (see Section 4.5.3)" n + -- N[~/, -4- ~/~maxCtmax],
n - - N [ f i - X/#m~xUmax].
(10.19)
E x a m p l e 10.1 Fig. 10.1a shows ten image points supposedly collinear and equidistant at four pixel intervals (the distance between the end points is 40 pixels). We perturbed them by adding Gaussian random noise of mean 0 and standard deviation 4 (pixels) to the x and y coordinates of each point independently. Fig. 10.1b shows the image line fitted by renormalization (solid line) and its primary deviation pair (dashed lines). Let (fi, x) = 0 and (7%,x) = 0 be the true and the fitted image lines, respectively. Since the deviation of fi from fi is orthogonal to ~ to a first approximation, the error of computation is represented by A n = Prt (n - n),
(10.20)
10.1.
Image Line Fitting
o~
9
ogl~ 9 q d ~ O ~ d I I I D q b ~ o b q ~ q g ~ w q ~
(~)
299
9 ~
(b)
Fig. 10.2. (a) Error distribution. (b) Theoretical standard confidence region. where P n is the projection matrix along ft. In Fig. 10.2a, A n is plotted in the plane perpendicular to fi for 100 trials, each time using different noise. Fig. 10.2b shows the standard confidence region computed from the theoretical expression of eq. (10.4) (see Section 4.5.3). Comparing Figs. 10.2a and 10.25, we see that the theoretical bound on accuracy is almost attained. We also see that the error arises almost in the direction of the fitted line, confirming that eq. (7.92) is indeed a good approximation. If the noise characteristics are symmetric with respect to the true line, the statistical bias does not appear, so the accuracy does not improve by renormalization. Recall that if the noise distribution is isotropic and homogeneous, the minimization (10.3) reduces to the least-squares optimization and the solution can be obtained analytically (see Example 7.2). Hence, we need not do iterations. However, the advantage of renormalization is that it can produce an optimal fit irrespective of the noise distribution. Also, it automatically estimates the noise level and gives the covariance matrix of the obtained fit. E x a m p l e 10.2 Fig. 10.3a is an edge image obtained by applying an edge operator to a portion of a real image. An image line is fitted to one edge segment by assuming that the distribution of the image noise is isotropic and identical for each pixel (see Examples 7.1 and 7.2 in Section 7.2.2); the noise level is unknown. Fig. 10.3b shows the optimally fitted image line (solid line) and its primary deviation pair (dashed lines) superimposed on the original gray-level image. This example demonstrates that by renormalization we can not only obtain an optimal fit but also visualize its reliability without knowing the noise level.
300
Chapter 10.
Applications of Geometric Estimation
(~)
(b)
Fig. 10.3. (a) Edge image. (b) Optimally fitted image line (solid line) and its primary deviation pair (dashed lines).
10.2
10.2.1
Conic Fitting
Optimal conic fitting
Conics (ellipses, parabolas, and hyperbolas) are very important image fedtures in computer vision and robotics applications. Many industrial objects have circular and spherical shapes, and circles and spheres are projected as conics on the image plane. If a conic on the image plane is known to be a projection of a circle or an ellipse of known shape, its 3-D position can be computed analytically (but not always uniquely). In order to do such an analysis, the conic detected on the image plane must be given a mathematical representation first. Like image lines, conics can be detected as edge segments by an edge operator. Hence, the first task is to fit a conic equation to a given sequence {x~}, a = 1, ..., N, of image points. In the presence of image noise, the detected image points {x~ } are not necessarily accurate. Let ~a be the true location of the a t h image point. We write x~ = ~ + A x e , (10.21) and regard the noise Ax~ as a Gaussian random variable of mean 0 and covariance matrix V[x~]. As shown in Section 4.4, a conic is represented in the form (x, Q x ) = 0 (see eq. (4.80)). Since the matrix Q can be determined only up to scale, we adopt the normalization IIQII = 1. The problem is stated as follows: P r o b l e m 10.2 Estimate a (33)-matrix Q of unit n o r m such that (2~, Q ~ )
= 0,
f r o m the data {x,~}, a = 1, ..., N .
a = 1, ..., N,
(10.22)
10.2.
Conic Fitting
301
Eq. (10.22) is the hypothesis from which the matrix Q is to be estimated. The rank of this hypothesis is 1. Define a (33)-matrix X~ - xa @ x~,
(10.23)
and let X c~ be the unperturbed value of ~ffa obtained by replacing x~ by 5~. Eq. (10.22) can be written in the form (X~; Q) - 0 .
(10.24)
Substituting eq. (10.21) into eq. (10.23), we can express X ~ in the form
x . - (~. + ~x~.) G (~o + ~xx.) _
= X ~ + Ax~ | ~
+ ~
| Ax~ + Ax~ | Ax~.
(10.25)
Since E [ A x ~ ] - 0, we have
(10.26)
E [ X o ] - X+ + v[:.l. Hence, the effective value of X ~ is given as follows (see eq. (9.123))"
(10.27)
X ~ - Xo, - V[xo,].
The deviation A X ( ~ - X*~ - . , ~ - A x ~ @ ~
+~ | Ax~ + Ax~ | Ax~ - V[x~] (10.28)
is a random variable of mean O. Let x~(i), Xo,(i), and Axo,(i) be the ith components of x~, ~ , and Axe, respectively. The covariance tensor F[X~] of X ~ has the following (ijkl) element"
v [ x . l u ~ - E[/',xo o/',xo]~5~ = E[Axa(j)Axa(k)]2a(i)2a(l)+ +E[Ax~(i)Ax~(k)]2~(j)2~(l)
E[Axa(j)Axa(1)]2a(i)2a(k) + E[Ax~(i)Ax~(o]2~(j)2~(k )
-E[Axo~(i) AXo~(j)]V[x,a]kl - E[Axo~(k) AXo~(l)]V[xo~]ij
+E[Ax.(g)AX.(;)Ax~(k)Ax~(oI+ V[xo~]ijV[xc~lkl =
v[~lj~(~)~(~)+ v[~]j~(~)~(~)+ v[~]~(j)~(~) +v[~]~.(j)~.(,)
+ v [ ~ . ] ~ v [ ~ . ] j ~ + v[~.]~v[~o]j~.
(10.29)
Here, we have used the following identity for Gaussian random variables (see eqs. (3.56))"
E[Ax a(i) Ax a(j) Axa(k) Ax ~(/)] =
v[~o]uV[~]~, + v[~l~V[x.lj,
+
v[xol,,V[~.lj~.
(lO.3O)
302
Chapter 10.
Applications of Geometric Estimation
From eq. (10.29), we obtain (Q; F[X~]Q) - 4(5~, QV[x~]Qgc~) + 2(V[x.]Q; QV[x.]). Hence, an optimal estimate of Q can be obtained as the solution of the optimization I
"
o
-
Q):
min
J[Q] - ~=1 ~ 4(x c~, QV[x~]Qx~) + 2(V[x.]Q; QV[x.])
(10.32)
under the constraint IIQII- 1. Let Q be the resulting optimal estimate. Since Q is a symmetric matrix and is normalized to IIQII - 1, it has five degrees of freedom. Hence, the residual J[Q] is a X2 variable with N - 5 degrees of freedom. It follows that the number of image points must be such that N _> 5.
(10.33)
The residual is 0 when N - 5. For N _> 6, we obtain the conic test for image points: the hypothesis that image points {x~ } are on a conic is rejected with significance level a% if 2 J[Q] > XN-5,a(10.34) The theoretical bound on the accuracy of fitting is given by the covariance tensor of Q in the form
(L 1)[Q]-
c~=l
oe )
oe )
)
4(5:c~, QV[x~]Qfc~) + 2(V[x~lQ; QV[x~])
(10.35)
where 79 = (Pijkl) is the projection tensor defined by Pijkl
-- (~ikSjl
--
QijQkl.
( o.a6)
The covariance tensor ])[Q] has rank 5; its null space is {Q}L.
10.2.2
Unbiased estimation and renormalization
Decompose the covariance matrix V[x~] into the noise level e and the normalized covariance matrix V0[x~] in the form V [ x ~ ] - e2V0[x~].
(10.37)
If the denominator in eq. (10.32) is replaced by a constant, we obtain the least-squares approximation J [ Q ] - (Q; fl4*Q)--+ min, 1 In
(10.38)
the denominator, the true values { . ~ } are approximated by the data values {xa }.
10.2.
Conic Fitting
303
where A4* is the effective moment tensor defined by N
1 M* - V ~
w.(~.
o ~ . - v[~o]) o ( ~ . o ~ . - v [ ~ . ] ) .
1
W~
4(x~, Q*Vo[x~]Q* x~) + 2e2(Vo[x~]Q*; Q*Vo[x~]))"
(10.39)
(10.40)
Here, Q* is an appropriate estimate of Q. The minimum is sought under the normalization constraint IIQll = 1. If a 6-vector q and a 66-matrix M * are defined by casting the (33)-matrix Q and the (33)(33)-tensor jVI* in the form q - type6[Q] ,
M * - type6613d*],
(10.41)
we have (Q; Ad*Q) = (q, M ' q ) and IIQ[I = IIq[I (see Section 2.4.2). Hence, the solution of the minimization (10.38) is obtained as the eigenmatrix of norm 1 of tensor A4* (i.e., the matrix Q obtained by cast from the unit eigenvector q of the matrix M ) for the smallest eigenvalue. However, the solution of the least-squares approximation is statistically biased whatever weights W~ are used. Define the moment tensor 1
N
M - -~ E W,~x~ | x~ | x~ | x~,
(10.42)
o~=1
and let M be the unperturbed moment tensor obtained by replacing x~ by 5~. Then, the true value (~ satisfies ,Q(~ - O, i.e., (~ is the eigenmatrix of norm 1 of M for eigenvalue 0. However, E[,M] is not equal to ,M. In fact, _
1
N
E[M,5~,] - -f Z woE[x.(i)xo(j)x.(~)x.(,)] ~=1
1
N
Z W.E[(~.(i) + Ax.(,))(~.(~) + Ax.(j)) c~=1
(~r
1
N
+ Ax~r
+ Ax~t*))]
(
c~--1
+3cc~(i)~(k)E[Axa(j) /kx~(l)] -k Yc~(i)5:~(l)E[Ax~(j) Axa(k)] +Yca(j)YC~(k)E[Axo~(i)Ax~(l)] + x~(j)YCa(l)E[/kx~(i)Axc~(k)] -4- ~a(k)Jca(l)E[Ax~(i)Ax~(j)] -4- E[Ax~(i)Ax~(j)Ax~(k)Ax~(l)])
304
Chapter 10. ~2
Applications of Geometric Estimation
N c~=l ~4
N ~-~1
+ v0[*~l~kV0[~]~ + v0[~l~v0[~l~k). Define (aa)(aa )-tensors N "(1) - t~,N ijkl) (1) and N "(2)
(10.43) (2) ) -(Nijkl
by
N
i(1)
1
( c~--I
+ Vo[Xa]ilXa(j)Xa(k) + Vo[Xa]jkXa(i)Xot(l)
+ Vo[~.lj~/.(~)/.(j) + Vo[~.]k~/.(o/.(~)). 1
(10.44)
N
~ - - ~ Z wo "(Vo[~l,j yo[.o]~, + Vo[~l,kVo[~]j, + yo[~l**Yo[~lj~). ot:l
(10.45) Let sV"(1) be the unperturbed value of N "(1) obtained by replacing x~ by ~ in eq. (10.44). Then, eq. (10.43) can be written in the following form: E[M]
- ./~ -[- ~2 ~ ( 1 ) ..[_ e4/~-(2).
(10.46)
From eqs. (10.44) and (10.45), we immediately see that E[.AS( 1 )] __ ./~r(1) _~_ 2E2./~(2).
(10.47)
It follows that if we define the unbiased moment tensor -- ./~ -- C2Af(1) -~- (~4N'(2) '
(10.48)
we have E[M] - AJt. Hence, we obtain the unbiased least-squares approximation J [ Q ] - (Q; 3~tQ)--+ min. (10.49) The solution under the constraint lIQll- 1 is obtained as the eigenmatrix of 324 of norm 1 for the smallest eigenvalue. If the noise level e is not known, the second order renormalization procedure is given as follows (see Section 9.6.2)" 1. L e t c - 0 a n d W ~ -
1, a -
1,...,N.
2. Compute the (33)(33)-tensors M , N "(1), and Af (2) defined by eqs. (10.42), (10.44), and (10.45), respectively.
10.2.
Conic Fitting
305
3. Compute the smallest eigenvalue )~ of the (33)(33)-tensor
(~0.50)
./~ __ ./~ __ C./~ (1) ..[_ C 2 ./~f(2),
and the corresponding eigenmatrix Q of norm 1. 4. If ~ ,~ 0, return Q, c, and 3~t. Else, update c and W~ as follows: D - ( ( Q ; A f ( 1 ) Q ) - 2c(Q;A/'(2)Q)) 2 - 4s c + (Q; Af(1)Q) - 2c(Q;N'(2)Q)- ~ )~ 2( Q; N'(2)Q)
c +--
Af(2)Q),
,
if D_> O,
(10.51)
(10.52)
ifD <0,
c + (Q;Af(1)Q), 1
Iu
+- 4(x~,
QVo[x~]Qx~) + 2c(Vo[x,~]Q; QV0[x~])"
(10.53)
5. Go back to Step 2. After renormalization, the squared noise level e: is estimated in the form ~2 __
C
1 - 5/N"
(10.54)
Its expectation and variance are
E[~:]- ~:'
2~4
v[~:]- N--2-g
(~0.55)
If the value e is given a priori, the conic test takes the following form (see eq. (10.34))" ~2
X~Y-5,a
(10.56)
e--~ > N----ZY" The covariance tensor of the resulting estimate (~ is obtained in the following form: ~:2 ^
V[Q]- N(M)[.
(10.57)
Let ~max be the largest eigenvalue of 1;[(~], and Umax the corresponding
eigenmatrix of norm 1. The primary deviation pair {Q+, Q - } is given as follows (see Section 4.5.3)" Q+ .- N[0-~- 4~tmaxUmax],
Q-- -- N [ 0 -
4~maxUmax].
(10.5s)
306
Chapter 10.
Applications of Geometric Estimation
E x a m p l e 10.3 Fig. 10.4a shows sixty equidistant image points on the ellipse x2 502
y2 +
1002
= ~
(~o.59)
in the first quadrant in a simulated image of 512 • 512 pixels (lengths are measured in pixels). The interval between consecutive points is approximately 2 pixels. We added Gaussian random noise of mean 0 and standard deviation a - 0.5 (pixel) to each of the coordinates of these points independently. Fig. 10.4b shows the conic fitted by renormalization (solid line) and its primary deviation pair (dashed lines). Fig. 10.5a shows ten fits computed by the optimal least-squares approximation (the weights are computed from the true values), each time using different noise. The true conic is drawn in a dashed line. The existence of statistical bias is evident. Fig. 10.5b shows corresponding optimal fits computed by renormalization. We see that the statistical bias is removed. E x a m p l e 10.4 Fig. 10.6a is an edge image obtained by applying an edge operator to a portion of a real image. A conic is optimally fitted to the longest edge segment that constitutes a part of a conic by assuming that the distribution of the image noise is isotropic and identical for each pixel; the noise level is unknown. Fig. 10.6b shows the fitted conic (solid line) and its the primary deviation pair (dashed lines) superimposed on the original graylevel image. Fig. 10.7 shows similarly obtained images. We can see that the reliability of the fit rapidly decreases as the length of the conic edge segment decreases.
10.3 10.3.1
Space Plane Fitting by Range Sensing Optimal space plane fitting
Suppose we are observing space points which are known to be on a planar surface. Let {r~}, a = 1, ..., N, be their locations observed in the presence of noise, and consider the problem of fitting a space plane to them. Let { ~ } be the true positions of {r~}. We write r, - e, + At,,
(10.60)
and regard the noise Ar~ as a Gaussian random variable of mean 0 and covariance matrix Viral. Let p~ be the p-representation of ra, and ~ its true value (see eq. (4.23)). The covariance matrix of p~ has the form
v[p~] = v[,.~l e o - ( VIral
o
)
9
(lo.61)
Let (v, p) - 0 be the v-representation of the space plane to be fitted (see eq. (4.61)). The problem is stated as follows.
10.3.
Space Plane Fitting by Range Sensing
307
(~) (b) Fig. 10.4. (a) Equidistant image points on an ellipse in the first quadrant. (b) Optimal fit (solid line) and its primary deviation pair (dashed lines).
"-.
s-"
(~)
(b)
Fig. 10.5. (a) Ten fits obtained by the least-squares approximation. (b) Corresponding fits obtained by renormalization.
308
Chapter 10.
Applications of Geometric Estimation
jl
(~)
(b)
Fig. 10.6. (a) Edge image. (b) Optimally fitted conic (solid line) and its primary deviation pair (dashed lines).
!
(~)
(b)
Fig. 10.7. (a) Edge image. (b) Optimally fitted conic (solid line) and its primary deviation pair (dashed lines).
10.3.
Space Plane Fitting by Range Sensing
Problem
309
10.3 Estimate a unit 4-vector v such that (v, ~ )
- O,
a - 1,..., N,
(10.62)
from the data {p~}, a - 1, ..., N. Eq. (10.62) is the hypothesis from which the 4-vector v is to be estimated: its rank is 1. As shown in Section 7.4.3, an optimal estimate of v can be obtained by the optimization
N J[v] - E
(v, p~)2
c~--1 (/2~~r[-P-~J~)
-4 min
(10.63)
under the constraint ]]v]] - 1. The theoretical bound on the accuracy of fitting is given by the covariance matrix of the optimal e s t i m a t e / , in the form
) 10.3.2
(10.64)
Noise model of range sensing
Range sensing means measuring the distances to objects in the scene from a fixed position without approaching them, and a device for this purpose is called a range finder; the data obtained by a range finder are called range data. Many types of range sensing are possible. A typical m e t h o d is to emit a sound or radio wave and observe the phase shift between the emitted and reflected waves. In general, the error behavior of range data depends on not only the mechanical characteristics of the range finder and the accuracy of the electronic data processing involved but also the shape and position of the object to be measured. Hence, it is very difficult to give a precise error model. However, the accuracy of range data generally decreases as the distance to the object increases. Here, we assume that to a first approximation the error is proportional to the distance to the object. We also assume that the orientation in which the distance is measured can be controlled accurately. According to this model, the error Arm in eq. (10.60) occurs in the direction of ~, and the covariance matrix V[ra] is modeled in the form
viii-
(10.65)
where e is a constant. We call e the noise level; the standard deviation of ]]Ar~l[ is elle~]]. Since it is very difficult to predict the accuracy of range sensing a priori, we treat the noise level e as unknown in the subsequent computation.
310
Chapter 10.
Applications of Geometric Estimation
The covariance matrix V[p~] is now decomposed into the noise level e and the normalized covariance matrix V0[pa] in the form
V[pa] - e2Vo[p~], Vo[p~]- ( f'~f':
0
)
"
(10.66)
Let (n, r) = d be the {n, d}-representation of the space plane to be fitted (see eq. (4.60)). Noting that (n, ~ ) = d, a = 1, ..., N, we see from eqs. (4.23) and (4.62) that
(~, V o [ p o ] ~ )
-
(~,, p o )
-
( 1~ '+ '*)~ d2
=
1 +d~ d 2 '
((,,, 1~+) d-2 d) ~ "
(10.67) (10.68)
Substituting these into eq. (10.63) and noting that multiplication of J[u] by a positive constant does not affect the solution, we can equivalently rephrase the optimization as finding a unit vector n and a scalar d such that N
1
d) 2
min.
(10.69)
o'--1
If the factor 1/d 2 is ignored, this reduces to the usual east-squares optimization, minimizing the sum of the squared perpendicular distances from the data positions to the space plane (see eqs. (4.68) and (7.155)). The least-squares optimization is optimal if the noise distribution is isotropic and identical for each datum; otherwise, the solution is biased. In fact, eq. (10.69) implies that the optimal solution should have a larger value of d than the least-squares solution. The solution of the minimization (10.69) can be obtained analytically. In fact, if we put ~ = n/d, the function to be minimized can be written in the following form: N
Jill
-
,o) 1) -
(10.70)
(~--1
Since this is a quadratic polynomial in ~, the solution is obtained by solving the following linear equation"
T ,ic~--1
r~.
(10.71)
c~--1
The optimal estimate of the surface parameters {n, d} is given by ~ = N[ft] and d - 1/11~11. Although the solution itself can be immediately obtained, we derive the renormalization procedure in the following, because it can automatically estimate the noise level e and give the covariance matrix of the obtained fit.
10.3.
10. 3.3
Space Plane Fitting by Range Sensing
311
Unbiased e s t i m a t i o n and r e n o r m a l i z a t i o n
If the denominator in eq. (10.63) is replaced by a constant, we obtain the least-squares approximation i f [ u ] - (u, M u ) ~
min,
(10.72)
where the moment matrix M is defined by N
1 M - -~ E
W~p~p{,
(10.73)
c~'-i
1
W~ = (v*, V0[p~]v*)"
(10.74)
Here, v* is an appropriate estimate of v. The solution s) of the optimization (10.72) is obtained as the unit eigenvector of the moment matrix M for the smallest eigenvalue. As in the case of line fitting, the solution of the leastsquares approximation is in general statistically biased whatever weights W~ are used. On the other hand, if we define the unbiased moment matrix
1~ - M - e2N,
(10.75)
N
1 N - -~ E
W~Vo[p~],
(10.76)
o~=1
we have E[/~] - / ~ . Hence, we obtain the unbiased least-squares approximation i f [ v ] - (v, ]l)/v) ~ min, (10.77) and the solution 12, is obtained as the unit eigenvector of M for the smallest eigenvalue. If the noise level e is not known, we apply renormalization. In this problem, however, the normalized covariance matrix V0[p~] is also unknown, since it involves the true values ~ (see eqs. (10.66)). If ~ is approximated by the data value r~, the error magnitude is underestimated at those points which are detected at shorter distances than their true positions and overestimated at those points detected at longer distances. Here, we approximate ~ by the projection of r~ onto the fitted space plane in the iterations. Since the projection of r~ onto space plane (n, r) - d is -
d"po~
(,i,
(lO.TS)
the renormalization procedure is modified as follows (see eqs. (6.63) and (6.64))"
312
Chapter 10.
Applications of Geometric Estimation
1. Compute the (44)-matrices v(O)[p~]_ and let c -
( r ~ r ~T 0 ) '
O, Vo[p~] - Vo(~
a-
1,...,N,
(10.79)
and W~ - 1, a - 1, ..., N.
2. Compute the (44)-matrices M and N defined by eqs. (10.73) and (10.76), respectively. 3. Compute the smallest eigenvalue )~ of the (44)-matrix 1~
-
(10.80)
M - cN,
and the corresponding unit eigenvector v. 4. If )~ ~ 0, return v, c, a n d / ~ . Else, update c, Vo[pa], and Wa as follows, where ~; - (0, 0, 0, 1)7-: ,X c ~- c + ( . , N . ) '
Vo[p.] ~-
W ~ +---
(~, ~)~ (v,p. _,r .o(O)[p.],
(..Vo[p.].)
(10.81)
5. Go back to Step 2. After renormalization, the squared noise level 42 is estimated in the form ~2 __
(lO.82)
1-3/N"
Its expectation and variance are
E[~:]- ~:'
244
v[~:]- N - - ~
(10.s3)
If the value e is given a priori, the c o p l a n a r i t y test for space points takes the following form (see eq. (7.154)): ~2 X2_3,a 4-T> N----L-~.
(10.84)
The covariance matrix of the resulting estimate D is obtained in the following form: ~2 ^ V[D]- v(M)~-. (10.85)
10.3.
Space Plane Fitting by Range Sensing
313
Fig. 10.8. Planar surface patch in the scene. Let #max be the largest eigenvalue of V[/,], and ~max the corresponding unit
eigenvector. The primary deviation pair {v +, v - } is given as follows (see Section 4.5.3): V + -- X [ v - t - 4~max~max],
v- - N[i-
V/#max~maxl .
(10.86)
As we pointed out in Section 4.2.1, we must be careful about the scaling of the coordinates when we use the p-representation (X, Y, Z, 1)7-. Namely, we must use an appropriate unit of length for space coordinates so that X, Y, and Z are not too large or too small as compared with 1. A convenient scaling is to regard the average r0 - Ec~N_I Ilr~l[/N as unit length. Once the {v}representation of the fitted space plane is obtained, its {n, d}-representation can be obtained by computing eq. (4.63). The covariance matrix V[fi], the correlation vector V[ti, d], and the variance V[d] are computed from V[/p] by eqs. (4.67). E x a m p l e 10.5 Fig. 10.8 shows a planar surface patch placed in the scene and viewed from the coordinate origin, at which we assume a range finder is fixed. The dots in the figure indicate the orientations in which the distance is measured. We assume that the rays emitted outside the patch return the value oo and are ignored in the fitting computation. We simulated the distance measurement by adding Gaussian random noise to the exact distance to the surface in each orientation independently according to the statistical model given by eq. (10.65) for e = 0.1. Then, a space plane was fitted to them. Let {fi, d} and {~, d} be the true and the computed surface parameters. Since the deviation of ti from fi is orthogonal to ~ to a first approximation, the error in ti can be represented by a 3-vector
Pn(a-
+
d-d d
(10.s7)
where Pft is the projection matrix along ft. The covariance matrix of this
314
Chapter 10.
Applications of Geometric Estimation
oo
e#
9 9
9 9
(~)
(b)
Fig. 10.9. (a) Error distribution for the least-squares solution. (b) Error distribution for renormalization. vector is given by ^
+
J] ~,T] _[..7~,~T.
(lO.8S)
In Figs. 10.9a and 10.9b, A u is plotted in three dimensions for 100 trials, each time using different noise: Fig. 10.9a is for the least-squares solution given by eq. (10.69); Fig. 10.9b is for renormalization. In each figure, the ellipse indicates the standard confidence region computed from eq. (10.88) by using the theoretical expression (10.64) (see Section 4.5.3); the cubes are drawn merely for the sake of reference. We can see that statistical bias exists in the least-squares solution and the bias is removed by renormalization. We can also see from Fig. 10.9b that the theoretical bound on accuracy is almost attained. Fig. 10.10a shows a grid reconstructed by eq. (10.78) for a typical surface fit obtained by renormalization and viewed from an angle (the true position is superimposed in dashed lines); Fig. 10.10b shows its primary deviation pair.
10.4
10.~.1
Space Plane Fitting by Stereo Vision
Optimal space plane fitting
Suppose multiple feature points in the scene are observed by stereo vision, and suppose they are known to be on a planar surface. Although the 3-D structure is uniquely reconstructed by stereo vision alone, we can expect that the reliability of 3-D reconstruction can be enhanced if the knowledge that the feature points are coplanar is incorporated. This problem has practical significance, since many objects in an indoor robotic workspace, such as walls, ceilings, and floors, have planar surfaces. An indirect but simple method is first reconstructing 3-D by stereo vision alone and computing the covariance matrices of the feature points as
10.4.
Space Plane Fitting by Stereo Vision
315
(~)
(b)
Fig. 10.10. (a) An example of a fit computed by renormalization. (b) Its primary deviation pair. described in Section 6.3. Then, a space plane is optimally fitted, as described in Section 7.4.3. Here, we consider a direct method: a space plane is optimally reconstructed directly from stereo correspondence pairs {x~, x ~C~ }, a - 1, " ' ' ~ N. As shown in Section 6.5.1, the necessary and sufficient condition for 5~ and x- a to be projections of a space point on space plane (n, r) - d viewed by a stereo system with motion parameters {h, R} is given as follows (see eqs. (6.67) and (6.71); the scale difference does not affect the result): _,
R
x~ • A h ~ - 0,
A -
T
( hrt T
u
dl)
x/1 + d 2
.
(10.89)
This is the hypothesis from which the surface parameters {n, d} are to be estimated. Define a 34-matrix B ~ and a 4-vector v by
B,~-(
t T T / x,::,,xR hx,:,, x,:,,xR
v -
x/1
1(o) + d2
-d
T x
"
,~ ), (10.90)
Let (~, p) - 0 b~ th~ ~,-repr~s~t~tio~ of the s p ~ p l ~ (n, r) - d. L~t B~ be the u n p e r t u r b e d value of B~ obtained by replacing x~ and x~ by ~ and x~,-' respectively, in the first of eqs. (10.90). Then, the first of eqs. (10.89) can be rewritten as B ~ v - 0. Hence, the problem is stated as follows: Problem
10.4 E s t i m a t e a unit ~-vector v such that /~v
f r o m the data { B ~ } ,
a -
- 0,
1, .... N .
a - i, ..., N,
(10.91)
316
Chapter 10.
Applications of Geometric Estimation
If we let the kth row of the m a t r i x / ) ~ be a ~ )T, eq. (10.91) can be written as ( ~ ) , v) - 0, k - 1, 2, 3, which has the same form as eq. (7.46). Hence, the general theory described in Section 7.1.5 can be applied. Since the true positions 5~ and xa-' satisfy the epipolar equation 15:~,h, -' - - 0 (see eq. (6.4)), vectors ~ , h, and /~x~ -' are coplanar. It n ol -, -, n follows that R 5:~, R T h , and x~ are also coplanar. Hence, vectors x~ x h and x~ - ' x R T ~:~ are collinear, meaning that the m a t r i x / ) ~ has rank 1. Consequently, only one of the three equations ( ~ ) , v) - 0, k - 1, 2, 3, is linearly independent, so the rank of the hypothesis (10.91) appears to be 1. However, the epipolar equation is also an implicit hypothesis, and from the general theory given in Section 5.1.1 it can be shown that the rank 2 of the hypothesis (10.91) is 2. The optimization (7.55) for this problem can be rewritten in the form N
J[v]-
~ ( B ~ v , W~(v)B~v) --+ min,
(10.92)
c~--1
where W ~ ( v ) i s the (33)-matrix given by eq. (7.54). In the present problem, it can be rewritten in the following form: W~(~) - (V[B~,]);. !
Since B a t , - x~ x ABe,
Axa, we have -!
- Ax: x A~
+ x~ x A A x ~ + A x " x AAx~.
-
x'o xAV[x~]A T
(10.94)
V[B~v] - E[(AB~v)(AB~v) T] has the form
Hence, the covariance matrix
Y[B~v]
(10.93)
!
[V[x']xAV[x~]A T],
'
(10.95) where the symbol [. x 9] denotes the exterior product of matrices (see eq. (2.45)). The theoretical lower bound on the covariance matrix of the optimal estim a t e / , is given in the following form (see eq. (7.57))" 12[/,1 -
P~,B~W~(~,)B~Pv
)
.
(10.96)
Here, Pt, is the four-dimensional projection matrix along t,. The rank of V[/,] is 3; its null space is {V}L. Since a space plane has three degrees of freedom, the residual N
J[/,] - ~ ( B ~ / , ,
W~(i,)B~i,)
(10.97)
ct--1 2This hypothesis is degenerate; see Section 7.1.1. We will give a rigorous m a t h e m a t i c a l a r g u m e n t a b o u t the rank of a hypothesis is C h a p t e r 14.
10.4.
Space Plane Fitting by Stereo Vision
317
is a X2 variable with 2 N - 3 degrees of freedom. Evidently, the number N of the pairs of corresponding image points must be such that N > 2.
(10.98)
The residual is not 0 when N - 2. For N >_ 3, we obtain the coplanarity test for feature points" the hypothesis that the observed feature points are projections of coplanar space points is rejected with significance level a% if J[/,]
>
(10.99)
~2N_3, a.
Once the solution t9 is obtained, all the feature points are optimally back projected onto the space plane (/,, p) - 0 by the procedure described in Section 6.5.1.
10.~.2
Unbiasedestimation and renormalization
If the covariance matrices V[x~] and V[x~] are decomposed into the noise level e and the normalized covariance matrices V0[x~] and V0[x~] in the form
V [ x ~ ] - e2Vo[x~],
V [ x ~ ] - eZVo[x~],
(10.100)
the least-squares approximation to the optimization (10.92) can be written in the form a~[v]- (v, M y ) --+ min, (10.101) where M is the (44)-matrix defined by N
i
M-
1 u
T
-N ~~"a:l (h' X'~h)xaxc~ 1
N
~" E c t = l
(Xo~, X o ~ h ) x T
)
1NE : . ot=l
,
"
(10.102)
Here, we have defined the (44)-matrix X~ by
X ~ - R ( z ' ~ x W ~ x x ~,)R
s.
(io.lo3)
The (44)-matrix W~ is defined by wo
-
/
A*
I
+ D
+ e2[Vo[x'~] x A*Vo[x~]A *s] 2 '
(lo.lo4)
where A* is the value of A obtained by replacing n and d by their appropriate estimates rt* and d*, respectively, in the second of eqs. (10.89). The solution of the optimization (10.101) under the constraint I1~11 - 1 is obtained as the unit eigenvector of M for the smallest eigenvalue.
318
Chapter 10.
Applications of Geometric Estimation
It is easy to confirm that
E[Ax~ • W~ • A x e ] - [W~ • V[x~]].
(10.105)
_
Let X~ be the unperturbed value of X~ obtained by replacing x~' b y -x~ ' in eq. (10.103). From eqs. (10.103) and (10.105), we obtain
E[X,] - -X o, + e2Y~,
(10.106)
where we have defined the (44)-matrix Y~ by
Y . - R [ W . • Vo[x~]]R T.
(~o.~o7)
Using eq. (10.106), we obtain the following identities:
El(h, Xo~h)x~x~] - (h, E[Xa]h)E[xo~x~] = (h, (X~ + Y ~ ) h ) ( 5 ~ ~ + e2Vo[x~]) -T e2(h,.~ ~ h) V0[x~]+ e2 (h,Y~h)x~x~ - -T =(h,X~h)Sc,~x~+ +e4 (h, Y~h)Vo[x~], (10.108)
-17E[(h,X~x~)x~] - E[x~x~]E[X,~]h = (~, ~x~ - T + 62 Vo[x~])(X c~ + •2 Y~)h
= (~. X h ) ~ + ~Vo[~]X~h + ~ ( ~ , Y~h)~ +e4 Vo[x ~]Y ~h,
(10.109)
E[(~o, X.~o)] - (E[~o~:]; E[Xo]) = (2~2~ + e2Vo[x~]; . ~ + e2Y~)
= (~.. x . ~ . ) + ~:(yo[~.]; X . ) + ~:(~.. Y.~.) -[-~4(Vo [xa] ; Y~).
(10.110)
Define (44)-matrices/~(1) and N (2) by
N (1) _ ( ~ E N - 1 ((h, io~h))Uo[xo~] + (h, Yo~h)xo~X~)
1
))
~ ~=1
Vo[x~]X~h+ (x~,Y~h)x~ 1 E ~~= I ( (Vo[x,]; X~) + (x~, Y~x~) )
N(2) _
V1 E t =Nl ( h Y , ~h)Vo[x~]
1 E ,N -~ = i (Vo[x~]Y,h) T
IN
; E~=l Vo[x~]Y~h 1 N
~ E~=l(Vo[~o]; Y~)
' (10.111)
) (10.112)
10.4.
Space Plane Fitting by Stereo Vision
319
Let/V(1) be the unperturbed value of N (1) obtained by replacing xa, x , and X,~ by ~,~, x~, -' and X ~ respectively, in eq. (10.111). From eqs. (10.102), (10.10S)-(10.112), we see that _
E [ M ] - M + ~2J~ (1) nt- ~4N(2).
(10.113)
Since the true value of u is the unit eigenvector of M for eigenvalue 0, eq. (10.113) implies that the least-squares solution is statistically unbiased (see Section 9.1.2). From eqs. (10.111) and (10.112), it is easily seen that E [ N (1)]
J~r(1) + 2e2N (2).
(10.114)
/ ~ - M - E2N (1) + e 4 N (2),
(10.115)
-
It follows that if we define
we have E[/I)/] - _]~. Hence, we obtain the unbiased least-squares approximation J[n] - ( v , l ~ v ) ~ min. (10.116) The solution /, is obtained as the unit eigenvector of /~/ for the smallest eigenvalue. If the noise level e is not known, the second order renormalization procedure is given as follows: 1. L e t c - 0 a n d W ~ - I , a -
1,...,N.
2. Compute the (44)-matrices M , N (1), and N (2) defined by eqs. (10.102), (10.111) and (10.112), respectively. 3. Compute the smallest eigenvalue A of the (44)-matrix - M - c N (1) + c2N (2),
(10.117)
and the corresponding unit eigenvector v. 4. If )~ ~ 0, return v - (Ul, u2, ua, u4) 7-, c, a n d / ~ . as follows: D - ((v, N (1) v) - 2c(v, N (2) v)) 2
4A(v, N(2)v),
(v, N(1)v) - 2c(v, N(2)v) - x / ~ , 2(v, N(2)v) A c + (V, N (l)v)' c+
c+-
-
Else, update c and W ~
(10.118)
i f D >_ O, (10.119) i f D < O,
320
Chapter 10. A -
W ~ +-
Applications of Geometric Estimation
RT(h(~'I,
!
x~xAVo[x~]A
T
(m.120)
v2, zJ3) -4- t ' 4 I ) , !
xx~ + (Ax~)xVo[x~]xAx~ _
2 "
+ c[V~176
(10.121)
5. Go back to Step 2. After renormalization, the squared noise level e2 is estimated in the form ~2 ._
C
(10.122)
1 - 3/2N"
Its expectation and variance are 2E 4
E[~21- e2'
V[~21 - 2 N - 3"
(10.123)
If the value e is given a priori, the coplanarity test takes the following form (see eq. (10.99): ~2 2 X2N-a,a (10.124) -g > 2N-----7-~. The covariance matrix of the resulting estimate/, is obtained in the following form: V[i]-
N(M)a.
(10.125)
Let Pmax be the largest eigenvalue of V[/,], and ~max the corresponding unit eigenvector. The primary deviation pair {v +, v - } is given as follows (see Section 4.5.3)"
v + - N[/, + ~ / ~ m a x { m a x ] ,
-
V'max m
]"
(10.126)
Once the v-representation (/,,p) - 0 of the fitted space plane is obtained, its {n,d}-representation (~, r) - a~ is computed by eq. (4.63). The covariance matrix V[/z], the correlation vector V[/z, a~], and the variance V[d] of the parameters {/z, a~} are computed from V[i] in the form of eqs. (4.67) (see Section 4.3.1). E x a m p l e 10.6 Fig. 10.11 shows simulated stereo images (512 x 512 pixels with focal length f = 600 (pixels)) of a planar grid placed in the scene. We added Gaussian noise of mean 0 and standard deviation a = 2 (pixels) to each image coordinate of the grid points independently (so the noise level is e - o / f - 1/300) and fitted a space plane (/z, r) - d. As in Example 10.5, the error in the parameters {/z, a~} is represented by the vector A u given by eq. (10.87), and its covariance matrix is given by eq. (10.88).
10.4.
Space Plane Fitting by Stereo Vision
321
In Fig. 10.12, A u is plotted in three-dimensions for 100 trials, each time using different noise: Fig. 10.12a is for the optimal least-squares approximation (the weights are computed from the true values); Fig. 10.12b is for renormalization. We can see that statistical bias exists in the least-squares solution and the bias is removed by renormalization. The ellipses in the figures indicate the standard confidence regions computed from eq. (10.88) by using the theoretical expression (10.96); the cubes are drawn merely for the sake of reference. We can also see from Fig. 10.12b that the theoretical bound on accuracy is almost attained. Fig. 10.13a shows a reconstructed grid viewed from an angle. The true position is superimposed in dashed lines. Fig. 10.13b shows the primary deviation pairs viewed from a different angle. E x a m p l e 10.7 Fig. 10.14 shows the real stereo images used in Example 6.6 (see Fig. 6.8), where the 3-D shape was computed without assuming any knowledge about the shape of the surface. Here, we incorporate the knowledge that the surface is planar and reconstruct it from the same feature points shown in Fig. 6.9a. Fig. 10.15 shows two views of the surface reconstructed by the procedure described in this section (solid lines) and its primary deviation pair (dashed lines). Comparing this with Fig. 6.9b, we can see that the reliability is indeed increased by the knowledge that the surface is planar.
322
Chapter 10.
Applications of Geometric Estimation
Fig. 10.11. Simulated stereo images of a planar grid.
(~)
(b)
Fig. 10.12. (a) Error distribution for the least-squares approximation. (b) Error distribution for renormalization.
(~)
(b)
Fig. 10.13. (a) Reconstructed grid (solid lines) and its true position (dashed fines). (b) Primary deviation pair of the reconstructed grid.
10.4.
Space Plane Fitting by Stereo Vision
323
Fig. 10.14. Real stereo image (the same as Fig. 6.8).
Fig. 10.15. Reconstructed surface (solid lines) and its primary deviation pair (dashed lines).
Chapter 11 3-D M o t i o n Analysis This chapter presents a statistically optimal algorithm for computing 3-D motion of an object from corresponding image points observed over two views. At the same time, the reliability of the computed motion parameters and the reconstructed depths are evaluated quantitatively. The analysis consists of two separate cases: the case in which the feature points axe in general position in the scene, and the case in which they are known to be coplanax. The statistical properties of the theoretically optimal estimate are described first, providing a theoretical bound on the attainable accuracy. Then, the statistical properties of the solution computed by renormalization and linearization are discussed. We also discuss the critical surface that gives rise to ambiguity of 3-D interpretation. Finally, we formulate a statistical test for testing if the camera motion is a pure rotation or if the object is a planar surface.
11.1
11.1.1
General Theory Camera and object motion
In C h a p t e r 6, we saw how 3-D s t r u c t u r e is recovered from image point correspondences observed by a stereo s y s t e m with known m o t i o n p a r a m e t e r s {h, R } . We now show t h a t the m o t i o n p a r a m e t e r s can also be c o m p u t e d if the correspondence is established for a sufficient n u m b e r of image points. In this chapter, the stereo configuration defined in C h a p t e r 6 is r e i n t e r p r e t e d as a c a m e r a m o t i o n in the scene: the first and second c a m e r a positions of a stereo s y s t e m are identified with the c a m e r a positions before and after the motion, respectively. However, we also use stereo terminologies interchangeably. - ~ their t r u e Let x a and x ~ be corresponding image points, and 5~ and x~ positions. We write x~ -
-
/
x~ + Axe,
x~ -
-/
/
x~ + Axe,
(11.1)
and regard the noise t e r m s A x and A x ~ as i n d e p e n d e n t Gaussian r a n d o m variables of m e a n 0 and covariance m a t r i c e s V[x~] and V[x~], respectively. As discussed in Section 6.1.2, the t r u e positions m u s t satisfy the epipolar equation (6.4). Hence, the p r o b l e m is s t a t e d as follows: Problem
11.1 Estimate the motion parameters {h, R } that satisfy le~, h, n e ' ~ l -
0,
t from the data x~ and x~, c~ - 1, ..., N.
325
~ - 1,..., N,
(11.2)
326
Chapter 11.
...................... ',,
3-D Motion Analysis
-. ...
Fig. 11.1. A large camera motion relative to a large object in the distance is indistinguishable from a small camera motion relative to a small object near the camera. Eq. (11.2) is the hypothesis from which the motion parameters {h, R } are to be estimated. The rank of this hypothesis is 1. It is immediately seen that translation h can be determined only up to scale. This is intuitively interpreted as follows: as long as an image sequence is the only source of information, a large camera motion relative to large objects far away from the camera is indistinguishable from a small camera motion relative to small objects near the camera (Fig. 11.1). In order to remove this indeterminacy, we adopt the scaling Ilhll - 1 if h r 0. It follows that we must first decide whether or not h = O, i.e., whether or not the camera motion is a pure rotation. The procedure for this decision, which we call the r o t a t i o n test, will be discussed in Section 11.7.1. Here, we assume that h r 0 has already been confirmed. In the following, we assume that the camera moves in a stationary scene, but the subsequent analyses can also be applied to a moving object viewed from a stationary camera (Fig. 11.2). If an object rotates around a fixed reference point rG in the object, say its centroid, b y / / o and then translates by ho, a point v in the object moves to r t -- rG
q-
ho + R o ( r -
rG).
(11.3)
Comparing this with eq. (6.1), we find that the resulting image motion is the same as when the object is stationary and the camera moves with motion parameters h - rc - Rro(ho + ra), R - tlor. (11.4) Conversely, the parameters { h o , Ro} of the object motion are computed from the parameters {h, R} of the equivalent camera motion in the following form: ho - R V ( r c -- h ) -
r~,
Ro - R w.
(11.5)
11.1.
General Theory
327
~-~R
FG
Fig. 11.2. Object motion relative to a stationary camera coordinate system.
11.1.2
Optimal estimation of motion parameters
A. Nonlinear optimization For each c~, define the 33-matrix
X ~ - x ~ x ' (~ 7-9
(11.6)
Let X ~ be the unperturbed value of X ~ obtained by replacing x~ and x~ by 5~ and x~,~ respectively. The epipolar equation (11.2) is expressed in the following form: (X~; h x R) - 0. (11.7) This equation has the same form as eq. (10.24) if h x R is identified with Q. Hence, the subsequent analysis takes almost the same form as in Section 10.2.1, except that X a and h x R are no longer symmetric matrices. Since x~ and x~ are independent, we have E [ X a ] - .X Hence, the deviation A X ~ -- X ~ - . ~ is a random variable ofmean O. From eqs. (11.1) and (11.6), we see that
A X ~ - Ax~Sc 'O/ -c + 5 ~ A x ~ 7- + A x ~ A x ~ - C
(11.8)
Let AXa(ij) , xa(i) , and x ~~(i) be the (ij) element of A X ~ and the ith components of x~ and x ~ respectively. The covariance tensor 1;[X~] of X ~ has the following (ijkl) element:
F[X~]~jkl - E[AX~(ij)AX~(kO] = E[Ax~(i)Ax~(k)]-' -t
= V[x~]ikx~(j)
,~,l
-'
'
'
t
I
~(,)+ V[x~]jt2~(i)2~(k)+ V[x~]ikV[x~]jl
.
(11.9)
According to the general theory in Section 7.1.5, the optimal estimate of the motion parameters { h, R} can be obtained by the following minimization (see eq. (7.55))"
J[h,R]-
g E c~----1
(X~; h x R) 2 (h • R; F [ X ~ ] ( h • R)) --+ min.
(11.10)
328
Chapter 11.
3-D Motion Analysis
The minimum is sought under the constraint that h is a unit vector and R is a rotation matrix. If eq. (11.9) is substituted into eq. (11.10), the optimization can be rewritten in the following form: N
J[h, n ] -
Z W(h,
h, n 'l
~ min,
(11.11)
or--1
W~(h,R) - 1 / ( ( h x I:lS:~, V[xa](h x Rfc~)) + (h x 2 ~ , R V [ x ~ ] R T ( h x ~ ) )
+ (V[x~l(h x R); (h x R)V[x~])). In actual computation, the true values ~ data values x~ and x~, ' respectively.
(11.12)
and x~-' are approximated by the
B. Rigidity test
Let {h, R} be the resulting estimate of {h, R}. The motion parameters (h, R} have five degrees of freedom--two for h (unit vector) and three for R (rotation matrix). Hence, the residual J[/z, l~] is a X2 variable with N - 5 degrees of freedom (see Section 7.1.4). It follows that the number N of the pairs of corresponding image points must be such that 1 N ~ 5.
(11.13)
The residual is 0 when N - 5. For N >_ 6, we obtain a rigidity test" the hypothesis that the camera is moving in a stationary scene (or equivalently the object in motion is rigid) is rejected with significance level a% if J[h,
.l~] >
~y-5,a"
(11.14)
C. Focus of expansion
Suppose the camera moves toward or away from the a t h feature point P~ in the scene (Fig. l l.3a). Since h x ~ - 0 and h x R 2 ~ - 0, the first two terms in the denominator on the right-hand side of eq. (11.12) become 0. Consequently, W~ (h, R) becomes very large 2. Geometrically, this is interpreted as follows. The expression (h x R; Y[X~](h x R)) measures how likely the epipolar equation (11.7) is to be violated by noise. The epipolar equation 1Algorithms for computing the motion parameters from five pairs of points are known as five-point algorithms. 2Theoretically, W~(h, R) becomes infinite. The fact that it always remains finite is due to the approximation introduced in the computation.
11.1.
General Theory
329
..
.e
Pc~
..
(~)
(b)
Fig. 11.3. (a) T h e c a m e r a moves toward or away from the a t h feature point. (b) Focus of expansion.
states that the baseline and the two lines of sight that start from the viewpoints O and O' and pass through P~ are coplanar (see eq. (6.2)). However, if the camera moves toward or away from P~, the two lines of sight and the baseline become collinear. The collinearity relation may be destroyed in the presence of noise, but the coplanarity relation still holds to a first approximation. Consequently, P~ is regarded as "robust" to noise, and thereby given a large weight. If the camera orientation does not change (i.e., R = I), the projection of P~ does not move on the image plane; all other feature points seem to move away from or toward it (Fig. ll.3b). Such an image point is known as the focus of expansion. Although this description does not hold if the camera rotates, we still call the projection of P~ the "focus of expansion" if O, O', and P~ are collinear.
11.1.3
Theoretical bound on accuracy
The theoretical bound on the attainable accuracy is obtained by analyzing the behavior of the function J[h, R] near the true values { h , / / } . The translation h is a unit vector, so we can write
h-h+Ah,
(h, Ah) -- 0,
(11.15)
to a first approximation. Since R is a rotation matrix, a small vector 12 exists such that
P,.- R + AP,.,
AP,.- AY2 x R,
(11.16)
to a first approximation (see eqs. (2.57) and (3.31)). Substituting eqs. (11.1), (11.15), and (11.16)into eq. (11.11), we obtain N
J[h,R]
E(W~(h,R)
+ O(Ah, AI2))
Ix~,h, Rx~[ + I~ , A h , - -'
o~:1
+ ] ~ , h, A R x ~ l +
O(Ax~,Ax~,Ah, AR)2) 2
330
Chapter 11.
3-D Motion Analysis
N
=
~
Rx~l ,
-
(5: x Rx~ --,, o
Ah)
oc'-- i
-
((~,~, R- x_,~ ) h
-
( h , R-x- 'o ) ~ , o , ~ x a )
)~
N
+ E W~(ft'R)O(Ax"'Ax~'Ah'At'-2)3'
(11.17)
a=l
where O(.-.)P denotes terms of order p or higher in ---. If we ignore the last term and define vectors a~ -- x~ x R x ~ ,
we obtain
b~ - (e~, Re,'. ) h -
(h, i~5~'. )5~. ,
(
N
J[h,R] - E W,~(ft, R) I~,h,/~21-(&~,Ah)-(b~,Aa)
(11 . lS)
)2
,
(11.19)
(~=1
where we have used the following relationship:
I ~ , h , ~ R ~ I-'
- ( ~ , ( h , R- -~') / ~
-
( h , A ~ ) R- -x'~ )
(11.20)
If we minimize J[h, R] by regarding A h and At? as variables, we need not consider the constraint (h, Ah) -- 0, since the epipolar equation (11.2) implies that (a~, h) - l e o , R e ~ , s = 0, (11.21) _
and hence (as, A h + ch) - (&o,,Ah) for an arbitrary constant c, i.e., the component of A h parallel to h has no effect on the minimization. Differentiating J[h, R] with respect to A h and Ag~ and setting the result 0, we obtain N
N
ct=l
ct=l
(11.22) N
N
c~=l
c~=l
(11.23) If we define the (66)-matrix
A- = (
~-~N EL
ac~
1W~(h,R)b~ -Tas
N
T)
N
-T
(11.24)
eqs. (11.22) and (11.23) are combined into one equation in the form A
,AY2
- E W~(h,/~)[x~, h,/~x~[ o~--1
( )oo ~
.
(11.25)
11.1.
General Theory
331
It is easily seen from eq. (11.21) that the matrix A has rank 5; its null space is {h | 0}L. The solution of eq. (11.25) is obtained in the following form:
= Z wo(h, tZ)l~, h , -R ~ I' , ~
,•
c~=l
--
( ) ac~
~
.
(11.26)
The optimal estimate {h,/~} of the motion parameters {h, R} is given by h - h + Ah,
1 ~ - / ~ + At2 x 1~.
(11.27)
Their covariance matrices are defined as follows (see eq. (3.33))" 17[s -
E[AhAhT],
V[R, s
E[AY2AhT],
1?[/~,/~] - E[AhAI2T], V [ l t / ] - E[AY2AY2T].
(11.28)
From eq. (11.26), we obtain
r
~[h,/~]
N
a,fl--1
=
A
AA
,
-A
(11.29)
where we have used the following relationship:
E[I~, h, R ~ : l - I ~ , h, R ~ I ] - 5~,E[I~, h, R~'I ~] = 5~zE[(X~; h x R)~] - 5~zE[(AX~; h x/~)~] 5~/~ i~)" = &,~(h x i~; v[x,,](h x i ~ ) ) - w~(h,
(11.30)
The covariance matrices f~[h], ~7[h,/~] (= ~7[/t{,h]), and f~[/~] thus obtained give a theoretical bound on the attainable accuracy of estimating {h, R}. In actual computation, eq. (11.29) is approximated by
V[hl V[h,i~] V[&h] _ --
o~=1
E
N, = , o
v[R]
]:
W.(h,i{)(Phd~)(Phh.)T " ., ,, Wo~(h,R)b~(PhD',~)T
)
N N Eo~=, Wo,(h, R)"b~b~^T
(11.31)
where {1%,h} are the estimated motion parameters and
a~-~
xR~,' ~,
go - (~,~ , R' '~ ) h
-
(h,^R x^' ~ ) ~ ~,.
(11.32)
Here, 5~ and 5~'~ are, respectively, the optimally corrected positions of x~ and x~ (see eqs. (6.20)).
332
Chapter 11.
11.2
3-D Motion Analysis
L i n e a r i z a t i o n and R e n o r m a l i z a t i o n
11.2.1
Linearization
The optimization (11.11) is nonlinear, requiring numerical search. However, the hypothesis (11.2) (or equivalently (11.7))is linear in the essential matrix G = h x R (see eq. (6.7)), and the epipolar equation (11.2) is expressed in the form ( ~ , G 2 ~ ) - 0 (see eq. (6.8)). Hence, the linearization technique described in Section 9.5.2 can be applied. To be specific, Problem 11.1 is decomposed into the following two subproblems: P r o b l e m 11.2 Estimate a matrix G such that (~, G~)
- 0,
a - 1, .., N,
(11.33)
from the data x~ and x~, a - 1, ..., N.
P r o b l e m 11.3 Decompose the matrix G into motion parameters {h, R } in such a way that G - h x R. (11.34) Consider Problem 11.2 first. Eq. (11.33) is the hypothesis from which the essential matrix G is to be estimated. The rank of this hypothesis is 1. Since the scale of G is indeterminate, we normalize it so that the resulting translation h is a unit vector. Note the following relationship (see eqs. (2.37) and (2.142))" IIGII 2 - t r
((hx R)(h•
R) T) - t r
= tr(llhll2l - h h T) - 211hll2.
((hx I)RRT(hx
I)T)
(11.35)
Hence, normalizing h to Ilhll- 1 is equivalent to normalizing G to IIGIIv~. If the essential matrix G - h x R is regarded as a variable in eqs. (11.11) and (11.12), it can be optimally estimated by the following optimization"
J[G] - oE~ - - 1 (x~' ' G T V [ x ~ ] G x 'o,, )+(x~ , GY[x~]GTx )+(V[x~]G;GV[x~])
--+ min.
( 11.36)
Let (~ be the resulting estimate. Since G has eight degrees of freedom (the nine elements are normalized), the residual J[(~] is a X2 variable with N - 8 degrees of freedom. It follows that the number N of the pairs of corresponding image points must be such that N > 8.
(11.37)
11.2.
Linearization and Renormalization
333
The residual is 0 when N = 8. In other words, the linearization technique requires three extra pairs of image points a as compared with direct optimization (11.10) (see eq. (ll.la)). If the covariance matrices V[x~] and V[x~] are decomposed into the noise level e and the normalized covariance matrices V0[x~] and V0[x~] in the form
V[xo~]- e2Volxo~],
V[x~]- e2Vo[x~],
(11.38)
the least-squares approximation to (11.36) has the form
J[ G]
-(G;
(11.39)
M G ) -~ min,
where the m o m e n t tensor M is defined by N
1 ~
!
,
(11.40)
M - -~ ~_, W ~ x ~ | x~ | x~ | x~, ~--1
,C
Vo
'
,r
a*Vo[='])).
+
(i .41)
Here, G* is an appropriate estimate of G. The minimum is sought under the normalization [ ] G l l - vr2. If a 9-vector g and a 99-matrix M are defined by casting the 33-matrix G and the 3333-tensor M , respectively, in the form 4 g-
typ%[G],
M-
typ%9[M],
(11.42)
we have (G;lt4G) - (g, M g ) and ] ] G ] [ - ]]g]] (see Section 2.4.2). Hence, the solution of the optimization (11.39) under the constraint lIGII - x/~ is obtained as the eigenmatrix of norm vf2 of tensor M (i.e., the matrix G obtained by cast from the eigenvector g of norm v ~ of the matrix M ) for the smallest eigenvalue. Let fl74 be the unperturbed moment tensor obtained by replacing x~ and x~' by ~ and x~,-' respectively, in eq. (11.40). Eq. (11.33) implies that the true value (~ satisfies M G - O, i.e., (~ is the eigenmatrix fl74 for eigenvalue 0. However, E[M] is generally not equal to M . In fact, _
+ = ~,(i)~,(k)+ E[Ax,(i)Ax,(k)]-
+ Axe(k))] ~,(i)Sc,(k)+ c2Vo[x,]ij,
(11.43)
3Algorithms for computing the motion parameters from eight pairs of points by using the linearization technique are known as eight-point algorithms. 4Here, G is a 33-matrix and M is a 3333-tensor, so they are respectively cast into a 9-vector and a 99-matrix, whereas Q in Section 10.2.2 is a (33)-matrix and M is a (33)(33)tensor, so they are respectively cast into a 6-vector and a 66-matrix; see eq. (10.41).
334
Chapter 11.
3-D Motion Analysis
and similarly -' z ' o~(1)4- 62V0 E[x~(j) x'o~(/)] -- Xo~(j)
(11.44)
'
It follows that N
1
E[Mijk~] - ~ ~ I~E[x.(~)x.(k)]E[x'.(j)x.(t)] c~=1 _
62
N
or--1 64
N
-~--N E
W~176176176
(11.45)
Hence, the solution of the least-squares approximation is statistically biased whatever weights W~ are used.
11.2.2
Unbiased estimation and renormalization
Constructing the unbiased least-squares approximation requires defining an unbiased moment tensor M such that E[3)t] - ~t. Define 3333-tensors 2((1) _ -
-
[ 7V(1)
~,'~'ijkl)and
./~'(2)
{ ~V(2)
- ~ "ijkl] by
N
Ni(1)
=
1
E
(
x'
a=l
'
+
'
) ,
(11.46)
N
Ni(2) i E Wc~Vo[xot]ikVo[x~]jl. jkl --- --N
(11.47)
c~=l
Let N̄⁽¹⁾ be the unperturbed value of N⁽¹⁾ obtained by replacing x_α and x'_α by x̄_α and x̄'_α, respectively, in eq. (11.46). Then, eq. (11.45) can be written in the following form:
E[M] = M̄ + ε²N̄⁽¹⁾ + ε⁴N⁽²⁾.   (11.48)
It is immediately seen from eqs. (11.46) and (11.47) that
E[N⁽¹⁾] = N̄⁽¹⁾ + 2ε²N⁽²⁾.   (11.49)
It follows that if we define
M̂ = M − ε²N⁽¹⁾ + ε⁴N⁽²⁾,   (11.50)
we have E[M̂] = M̄. Hence, we obtain the unbiased least-squares approximation
J[G] = (G; M̂G) → min.   (11.51)
The solution under the constraint ‖G‖ = √2 is obtained as the eigenmatrix of norm √2 of M̂ for the smallest eigenvalue. If the noise level ε is not known, the second order renormalization procedure is given as follows (see Section 9.6.2):
1. Let c = 0 and W_α = 1, α = 1, ..., N.
2. Compute the 3333-tensors M, N⁽¹⁾, and N⁽²⁾ defined by eqs. (11.40), (11.46), and (11.47), respectively.
3. Compute the smallest eigenvalue λ of the 3333-tensor
M̂ = M − cN⁽¹⁾ + c²N⁽²⁾,   (11.52)
and the corresponding eigenmatrix G of norm √2.
4. If λ ≈ 0, return G, c, and M̂. Else, update the constant c and the weights W_α as follows:
D = ((G; N⁽¹⁾G) − 2c(G; N⁽²⁾G))² − 8λ(G; N⁽²⁾G),   (11.53)
c ← c + ((G; N⁽¹⁾G) − 2c(G; N⁽²⁾G) − √D)/(2(G; N⁽²⁾G))   if D ≥ 0,
c ← c + 2λ/(G; N⁽¹⁾G)   if D < 0,   (11.54)
W_α ← 1/((x'_α, G^⊤V₀[x_α]Gx'_α) + (x_α, GV₀[x'_α]G^⊤x_α) + c(V₀[x_α]G; GV₀[x'_α])).   (11.55)
5. Go back to Step 2.
After renormalization, the squared noise level ε² is estimated in the form
ε̂² = c/(1 − 8/N).   (11.56)
Its expectation and variance are
E[ε̂²] = ε²,   V[ε̂²] = 2ε⁴/(N − 8).   (11.57)
The covariance tensor V[Ĝ] of the resulting estimate Ĝ is obtained in the following form:
V[Ĝ] = (ε̂²/N)(M̂)⁻₈.   (11.58)
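In code, the renormalization loop is mostly bookkeeping once the tensors are cast into 9×9 matrices as in eq. (11.42). The Python sketch below is a simplified variant under two stated assumptions: the normalized covariances are taken to be V₀[x_α] = V₀[x'_α] = diag(1, 1, 0), and the constant c is updated with the simpler first-order rule c ← c + 2λ/(G; N⁽¹⁾G) instead of the second-order update (11.53)–(11.54). It is meant to show the structure of the procedure, not to reproduce the book's implementation.

```python
import numpy as np

def renormalize_essential(x, xp, n_iter=100, tol=1e-10):
    """Sketch of renormalization for the essential matrix (Section 11.2.2).

    x, xp : (N, 3) arrays of corresponding image points (third component 1).
    Returns G with ||G|| = sqrt(2), the constant c, and the final 9x9 matrix M_hat.
    """
    N = len(x)
    V0 = np.diag([1.0, 1.0, 0.0])        # assumed normalized covariance of every point
    W = np.ones(N)                        # Step 1: c = 0, W_alpha = 1
    c = 0.0
    for _ in range(n_iter):
        # Step 2: 9x9 casts of M, N1, N2 (eqs. (11.40), (11.46), (11.47));
        # a tensor element T_ijkl is stored at row 3i+j, column 3k+l.
        M = np.zeros((9, 9)); N1 = np.zeros((9, 9)); N2 = np.zeros((9, 9))
        for a in range(N):
            xa, xpa, w = x[a], xp[a], W[a]
            v = np.outer(xa, xpa).reshape(9)
            M  += w * np.outer(v, v)
            N1 += w * (np.einsum('ik,jl->ijkl', np.outer(xa, xa), V0)
                       + np.einsum('ik,jl->ijkl', V0, np.outer(xpa, xpa))).reshape(9, 9)
            N2 += w * np.einsum('ik,jl->ijkl', V0, V0).reshape(9, 9)
        M /= N; N1 /= N; N2 /= N
        M_hat = M - c * N1 + c * c * N2             # Step 3, eq. (11.52)
        lam, vecs = np.linalg.eigh(M_hat)
        g = np.sqrt(2.0) * vecs[:, 0]               # eigenmatrix of norm sqrt(2)
        G = g.reshape(3, 3)
        if abs(lam[0]) < tol:                       # Step 4: smallest eigenvalue ~ 0
            break
        c += 2.0 * lam[0] / (g @ N1 @ g)            # simplified first-order update of c
        for a in range(N):                          # weight update in the spirit of eq. (11.55)
            xa, xpa = x[a], xp[a]
            W[a] = 1.0 / (xpa @ G.T @ V0 @ G @ xpa + xa @ G @ V0 @ G.T @ xa
                          + c * np.sum((V0 @ G) * (G @ V0)))
    return G, c, M_hat
```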
11.3 Optimal Correction and Decomposition

11.3.1 Correction of the essential matrix

A. Decomposability condition
We now consider Problem 11.3. Here, the crucial fact is that not every matrix G can be decomposed into the form G = h × R. A matrix G is said to be decomposable if there exist a unit vector h and a rotation matrix R such that G = h × R. It can be proved that a matrix G is decomposable if and only if its singular values are 1, 1, and 0 (see Section 2.3.1), which can equivalently be written as
det G = 0,   ‖G‖ = √2,   ‖GG^⊤‖ = √2.   (11.59)
We call these equations the decomposability condition.
B. Correction for decomposability
The optimal estimate of G computed by renormalization may not be exactly decomposable, so it must be optimally corrected into Ĝ that satisfies eqs. (11.59) (see Section 9.5.2). Since the normalization ‖G‖ = √2 is imposed by renormalization, we only need to impose
det Ĝ = 0,   ‖ĜĜ^⊤‖² = 2.   (11.60)
The rank of this constraint is 2. Let Ĝ = G − ΔG. In linear approximation, the first constraint det(G − ΔG) = 0 can be written as det G − tr(G†ΔG) = 0 (see eq. (2.17)), which can be rewritten in the form
(G†^⊤; ΔG) = det G,   (11.61)
where G† is the cofactor matrix of G (see eq. (2.18)). In linear approximation, the second of eqs. (11.60) can be written as
(GG^⊤G; ΔG) = (‖GG^⊤‖² − 2)/4.   (11.62)
In order not to violate the normalization ‖G‖ = √2, the increment ΔG must be such that ΔG ∈ {G}⊥_L to a first approximation. The increment ΔG is optimally determined by minimizing the Mahalanobis distance ‖ΔG‖_{V[Ĝ]} (see Section 9.5.2), i.e.,
(ΔG; V[Ĝ]⁻ΔG) → min,   ΔG ∈ {G}⊥_L,   (11.63)
under the linearized constraints (11.61) and (11.62). This optimization requires the covariance tensor V[Ĝ], which is given by eq. (11.58) as a by-product
of renormalization. Since multiplication of the covariance tensor V[Ĝ] by a positive constant does not affect the solution, it can be replaced by the normalized covariance tensor V₀[Ĝ] defined by setting ε̂ = 1 in eq. (11.58). The first order solution is given as follows (see eqs. (5.12), (5.14) and (5.15)):
$$\Delta G = \lambda_1\, \mathcal{V}_0[\hat G]G^{\dagger\top} + \lambda_2\, \mathcal{V}_0[\hat G](GG^\top G), \qquad (11.64)$$
$$\begin{pmatrix}\lambda_1\\ \lambda_2\end{pmatrix} = W\begin{pmatrix}\det G\\ (\|GG^\top\|^2 - 2)/4\end{pmatrix}, \qquad (11.65)$$
$$W = \begin{pmatrix}(G^{\dagger\top};\, \mathcal{V}_0[\hat G]G^{\dagger\top}) & (G^{\dagger\top};\, \mathcal{V}_0[\hat G](GG^\top G))\\ (GG^\top G;\, \mathcal{V}_0[\hat G]G^{\dagger\top}) & (GG^\top G;\, \mathcal{V}_0[\hat G](GG^\top G))\end{pmatrix}^{-1}. \qquad (11.66)$$
Since the constraint ΔG ∈ {G}⊥_L ensures ‖Ĝ‖ = √2 only to a first approximation, the actual correction takes the form
Ĝ = √2 N[G − ΔG],   (11.67)
where N[·] is the normalization operator (see eq. (5.25)). This correction is iterated until the decomposability condition (11.60) is sufficiently satisfied. In this process, the normalized covariance tensor V₀[Ĝ] also needs to be updated, since its null space should change as Ĝ changes (see eq. (5.26)). So, it is projected in the following form:
$$\hat{\mathcal{V}}_0[\hat G]_{ijkl} = \sum_{m,n,p,q=1}^{3} P_{ijmn} P_{klpq}\, \mathcal{V}_0[\hat G]_{mnpq}. \qquad (11.68)$$
Here, "l)O[G]ijkl and ~o[G]ijkl are the (ijkl)elements of 120[G] and l)0[G], respectively, and Pt) - (Pijk~)is the orthogonal projection tensor onto {G}Ll defined by 1 Pijkl- ~ikSjl- -~aijakl. (11.69) C. Singularity of the decomposability condition Since G has nine elements, the three equations (11.59) should constrain to be in a six-dimensional manifold in the nine-dimensional parameter space. However, a decomposable matrix (~ is specified by a unit vector h and a rotation matrix R, so G has five degrees of freedom. Where does this discrepancy come from? This anomaly originates from the fact that the constraint given by eqs. (11.59) is singular in the sense discussed in Section 5.1.1. In general, the three equations d e t G = Cl, IIGII = c2, and IIGG-Cll = c3 constrain G to be in a six-dimensional manifold. However, the particular values cl - 0, c2 v/-2, and c3 = v/2 happen to be critical values, at which the six-dimensional
Fig. 11.4. The manifold defined by ‖G‖ = √2 and the manifold defined by ‖GG^⊤‖ = √2 do not meet transversally.
manifold degenerates into five dimensions. To be specific, the manifold defined by ‖G‖ = √2 and the manifold defined by ‖GG^⊤‖ = √2 do not meet transversally in R⁹ (see Section 3.2.1). As an analogy, imagine two (two-dimensional) spheres in three dimensions. If they meet, their intersection is generally a (one-dimensional) circle. However, as the two spheres move away from each other, they eventually reach a critical state, at which their intersection is a (zero-dimensional) point. If they move further apart, there no longer exists an intersection⁵. The critical nature of eqs. (11.59) is easily detected from eq. (11.62). In fact, in the limit in which Ĝ is decomposable, the "surface normal" ĜĜ^⊤Ĝ coincides with Ĝ (see the next subsection). Hence, the linear subspace defined by eq. (11.62) coincides with {Ĝ}⊥_L (Fig. 11.4), just as two spheres in contact have a common normal at the contact point, sharing a common tangent plane. As a result of this singularity, the convergence of the iterations in the form of (11.67) is no longer quadratic (see Section 5.1.3).
11.3.2 Decomposition into motion parameters
If G is decomposable into motion parameters {h, R}, the decomposition is done as follows. First, note the following identities (see eqs. (2.36) and (2.48)):
G^⊤h = ((h × I)R)^⊤h = −R^⊤(h × h) = 0,   (11.70)
GG^⊤ = (h × R)(h × R)^⊤ = (h × I)RR^⊤(h × I)^⊤ = (h × I)(h × I)^⊤ = P_h.   (11.71)
Here, P_h is the projection matrix along h. Eq. (11.70) implies that h is computed up to sign as the unit eigenvector of the symmetric matrix GG^⊤ for eigenvalue 0, or the smallest eigenvalue in practice.
⁵Such a critical phenomenon is characteristic of a real space and disappears in a complex space. For example, disjoint spheres in a real space have imaginary intersections in a complex space.
Let Z̄_α and Z̄'_α be the depths of the αth feature point P_α before and after the camera motion, respectively (see eq. (6.35)). From
Z̄_α x̄_α = h + Z̄'_α R x̄'_α,   (11.72)
we obtain
Z̄_α h × x̄_α = Z̄'_α h × R x̄'_α.   (11.73)
Since Z̄_α > 0 and Z̄'_α > 0, we have
|h, x̄_α, Gx̄'_α| = (h × x̄_α, h × Rx̄'_α) = (Z̄'_α/Z̄_α)‖h × Rx̄'_α‖² > 0,   (11.74)
unless h × x̄_α = h × Rx̄'_α = 0, in which case P_α is the focus of expansion (see Fig. 11.3). It follows that the sign of h is determined by the following criterion:
Σ_{α=1}^N sgn(|h, x_α, Gx'_α|) > 0.   (11.75)
On the other hand, if we define the matrix
K = −h × G,   (11.76)
the following identity is obtained (see eq. (2.48)):
K = −h × (h × R) = −(h × I)(h × I)R = P_h R.   (11.77)
If we define unit vectors v₁ and v₂ such that V_h = (v₁, v₂, h) is an orthogonal matrix, we obtain the identity
P_h = V_h diag(1, 1, 0) V_h^⊤,   (11.78)
which is the spectral decomposition of P_h (see eq. (2.62)). Hence, eq. (11.77) can be written as
K = V_h diag(1, 1, 0) V_h^⊤ R = (v₁, v₂, h) diag(1, 1, 0) (R^⊤v₁, R^⊤v₂, R^⊤h)^⊤.   (11.79)
Let
K = VΛU^⊤   (11.80)
be the singular value decomposition of K (see eq. (2.114)). Comparing eqs. (11.79) and (11.80), we conclude that
V = (εv₁, ε'v₂, ε''h),   U = (εR^⊤v₁, ε'R^⊤v₂, ε'''R^⊤h),   (11.81)
where the signs ε, ε', ε'', and ε''' are 1 or −1, the signs ε'' and ε''' of the third columns being independent. If ε''ε''' = 1, we have the following relation (see eqs. (2.6)):
VU^⊤ = v₁v₁^⊤R + v₂v₂^⊤R + hh^⊤R = (v₁v₁^⊤ + v₂v₂^⊤ + hh^⊤)R = R.   (11.82)
If ε''ε''' = −1, we have
Vdiag(1, 1, −1)U^⊤ = v₁v₁^⊤R + v₂v₂^⊤R + hh^⊤R = R.   (11.83)
In either case,
R = Vdiag(1, 1, det(VU^⊤))U^⊤.   (11.84)
In summary, the motion parameters {h, R} are computed from the essential matrix G by the following procedure:
1. Let h be the unit eigenvector of the matrix GG^⊤ for the smallest eigenvalue.
2. Adjust the sign of h so that Σ_{α=1}^N |h, x_α, Gx'_α| > 0.
3. Compute the singular value decomposition of the matrix K = −h × G in the form K = VΛU^⊤.
4. Compute R = Vdiag(1, 1, det(VU^⊤))U^⊤.
Note that Steps 3 and 4 can also be viewed as the following least-squares optimization (see Section 2.3.5):
‖R − K‖ → min.   (11.85)
On the other hand, there exists an alternative method, which is more efficient. Let G = (g₁, g₂, g₃) and R = (r₁, r₂, r₃). Since G = h × R, we have gᵢ = h × rᵢ, i = 1, 2, 3, from which the following identity is obtained:
rᵢ = gᵢ × h + gᵢ₊₁ × gᵢ₊₂.   (11.86)
Here, the indices are computed modulo 3 (e.g., g₄ means g₁). This identity is easily confirmed by substituting gᵢ = h × rᵢ and using the relationship rᵢ₊₁ × rᵢ₊₂ = rᵢ. The reason why eq. (11.86) has not been used is that it holds if and only if G is decomposable. Hence, the computation is fragile in the sense that if G is not decomposable, the computed matrix R is not necessarily a rotation matrix. In contrast, the method of singular value decomposition produces an exact rotation matrix R even if G is not decomposable, say, as a result of a premature termination of the correction iterations described in the preceding subsection. In this sense, the computation is robust. The same applies to the computation of h: many other methods exist for computing h if G is decomposable, and some are more efficient, but the above method is robust, always producing a unit vector h very close to the true solution even if G is not decomposable.
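The four-step summary above translates directly into a few lines of linear algebra. A Python sketch (the function name is ours; if no correspondences are supplied, the sign of h is left as returned by the eigensolver):

```python
import numpy as np

def decompose_essential(G, x=None, xp=None):
    """Decompose an essential matrix G = h x R into {h, R} (Steps 1-4 above)."""
    # Step 1: h is the unit eigenvector of G G^T for the smallest eigenvalue.
    w, v = np.linalg.eigh(G @ G.T)
    h = v[:, 0]
    # Step 2: adjust the sign of h from the correspondences, if available.
    if x is not None and xp is not None:
        s = sum(np.linalg.det(np.column_stack([h, xa, G @ xpa]))
                for xa, xpa in zip(x, xp))        # sum of the triple products |h, x_a, G x'_a|
        if s < 0:
            h = -h
    # Steps 3-4: SVD of K = -h x G, then R = V diag(1, 1, det(V U^T)) U^T.
    K = -np.cross(h, G.T).T                        # columns of K are -h x (columns of G)
    V, _, Ut = np.linalg.svd(K)
    R = V @ np.diag([1.0, 1.0, np.linalg.det(V @ Ut)]) @ Ut
    return h, R
```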
11.4 Reliability of 3-D Reconstruction

11.4.1 Depth and its variance
Let {ĥ, R̂} be the motion parameters computed by the procedure described so far. From them, the 3-D positions of the feature points are reconstructed just as in the case of stereo vision. Namely, each corresponding pair {x_α, x'_α} is optimally corrected into {x̂_α, x̂'_α} so that the epipolar equation (x̂_α, Ĝx̂'_α) = 0 is satisfied, where Ĝ = ĥ × R̂. The correction is given as follows (see eqs. (6.20)):
$$\Delta\hat{x}_\alpha = \frac{(x_\alpha, \hat G x'_\alpha)\, V_0[x_\alpha]\hat G x'_\alpha}{(x'_\alpha, \hat G^\top V_0[x_\alpha]\hat G x'_\alpha) + (x_\alpha, \hat G V_0[x'_\alpha]\hat G^\top x_\alpha)}, \qquad \Delta\hat{x}'_\alpha = \frac{(x_\alpha, \hat G x'_\alpha)\, V_0[x'_\alpha]\hat G^\top x_\alpha}{(x'_\alpha, \hat G^\top V_0[x_\alpha]\hat G x'_\alpha) + (x_\alpha, \hat G V_0[x'_\alpha]\hat G^\top x_\alpha)}. \qquad (11.87)$$
As discussed in Chapter 6, this correction has the following geometric interpretations (they are equivalent to each other):
• Image points x_α and x'_α are moved to x̂_α and x̂'_α, respectively, so that x̂_α is on the epipolar of x̂'_α in the first image and x̂'_α is on the epipolar of x̂_α in the second image (see Section 6.1.2).
• Image points x_α and x'_α are moved to x̂_α and x̂'_α, respectively, so that the lines of sight of x̂_α and x̂'_α meet in the scene (see Fig. 6.4).
• The direct sum x_α ⊕ x'_α is identified with a point in the four-dimensional space R⁴ = R² ⊕ R² and moved to x̂_α ⊕ x̂'_α so that x̂_α ⊕ x̂'_α is on the three-dimensional manifold defined by (x, Gx') = 0 (see Fig. 6.10).
According to the general theory in Section 7.1.2, we obtain the following relationship for the residual (see eq. (11.11)):
$$J_0[\hat h, \hat R] = \sum_{\alpha=1}^{N}\Bigl((x_\alpha - \hat x_\alpha,\, V_0[x_\alpha]^-(x_\alpha - \hat x_\alpha)) + (x'_\alpha - \hat x'_\alpha,\, V_0[x'_\alpha]^-(x'_\alpha - \hat x'_\alpha))\Bigr) = \sum_{\alpha=1}^{N}\frac{(x_\alpha, \hat G x'_\alpha)^2}{(\hat x'_\alpha, \hat G^\top V_0[x_\alpha]\hat G\hat x'_\alpha) + (\hat x_\alpha, \hat G V_0[x'_\alpha]\hat G^\top\hat x_\alpha)}. \qquad (11.88)$$
Since J₀[ĥ, R̂]/ε² is a χ² variable with N − 5 degrees of freedom, an unbiased estimator of the squared noise level ε² is obtained in the form
ε̂² = J₀[ĥ, R̂]/(N − 5).   (11.89)
Its expectation and variance are
E[ε̂²] = ε²,   V[ε̂²] = 2ε⁴/(N − 5).   (11.90)
The rigidity test (11.14) can be rewritten as
ε̂²/ε² > χ²_{N−5,a}/(N − 5).   (11.91)
Using the corrected values x̂_α and x̂'_α, we can compute the depths as follows (see eqs. (6.38)):
$$\hat Z_\alpha = \frac{(\hat h \times \hat R\hat x'_\alpha,\; \hat x_\alpha \times \hat R\hat x'_\alpha)}{\|\hat x_\alpha \times \hat R\hat x'_\alpha\|^2}, \qquad \hat Z'_\alpha = \frac{(\hat h \times \hat x_\alpha,\; \hat x_\alpha \times \hat R\hat x'_\alpha)}{\|\hat x_\alpha \times \hat R\hat x'_\alpha\|^2}. \qquad (11.92)$$
The corresponding space point is reconstructed to be
r̂_α = Ẑ_α x̂_α.   (11.93)
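For reference, eqs. (11.92) and (11.93) amount to the following per-point computation (a Python sketch; the inputs are assumed to be the corrected points x̂_α and x̂'_α, and the overall sign ambiguity discussed next is resolved separately by eq. (11.94)):

```python
import numpy as np

def reconstruct_point(h, R, x_hat, xp_hat):
    """Depths and 3-D position from corrected points (eqs. (11.92), (11.93))."""
    m = np.cross(x_hat, R @ xp_hat)                            # x_hat x R x'_hat
    Z  = np.dot(np.cross(h, R @ xp_hat), m) / np.dot(m, m)     # depth before the motion
    Zp = np.dot(np.cross(h, x_hat), m) / np.dot(m, m)          # depth after the motion
    return Z * x_hat, Z, Zp                                    # r_hat = Z_hat * x_hat
```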
Up to now, there exist two solutions, since renormalization computes the essential matrix G up to sign. It is easily seen that if one solution is {ĥ, R̂}, Ẑ_α, and Ẑ'_α, the other solution⁶ is {−ĥ, R̂}, −Ẑ_α, and −Ẑ'_α. Hence, the correct solution is chosen by imposing the constraint
Σ_{α=1}^N (sgn(Ẑ_α) + sgn(Ẑ'_α)) > 0.   (11.94)
It appears that we could alternatively impose ~c~=1(2a + Za) > 0. However, this would be dangerous because there exists a possibility that the depth of a feature point located very far away in front of the camera (Z~ ~ oo) is estimated to be very far behind the camera (2a ~ - c ~ ) in the presence of N ^! image noise; even one such anomaly could reverse the sign of ~ = 1 (2~ + Z~). The reliability of the reconstructed space points is affected by the following two sources of errors: 1. errors in the observed image points x~ and x~, c~ - 1, ... , N ; 2. errors in the computed motion parameters { h , / ~ } . Strictly speaking, errors in {h /~} are correlated with errors in x~ and x~, ! since the motion parameters {h, /~} are computed from x~ and x~, c~ - 1, 6Eq. (11.74) only guarantees that Z~ and Z~ have the same sign.
11.4.
Reliability of 3-D Reconstruction
343
..., N. However, if we focus on a particular feature point, it can be regarded as approximately independent of errors in {h, /~} if the number of feature points is large. Hence, the effect of image noise at individual image points and the effect of errors in { h , / ~ } can be treated separately. In other words, we can assume that the motion parameters are accurate when analyzing the effect of image noise, while we can assume that image noise does not exist when analyzing the effect of errors in the motion parameters. Consider the effect of image noise, assuming that the motion parameters {h, /~} are accurate. The covariance matrices of the corrected positions ~ and 5'~ are given by eqs. (6.21), i.e.,
(v0[x
u [ e o ] - e~ (Vo[~ol-
" ^'
t ' ~1~ J[~,c~,~'*'T-v0r~ff, [ o~1a" ~ )
" ^'
'~
+ ( ~ , (~Vo[~-](~ ~ )
-'
.
(Vo[~]a ~ ) ( V o [ ~ ] O ~ )
u[e'~]- ~ ( v 0 [ ~ ' ] -
^ :1= . . . . .
- 7 - - - 7. . . . . . .
T
(~'o, a Vo[~]a~o) + ( ~ , OVob'~]O ~ ) ~ [ ~ , ~1 _ _ .,
)'
~2
(v0[~l
" ,, /
a~ (~o, OrVo[~o] ^ -,
)(v0[~'10 ~
,
~)~
)
= ~[~, ~o1~
(11.95) where g2 is the estimate of e2 given by eq. (11.89). The subscript i indicates that we are considering image noise only. The variance I~[Z~] and the covariance vector ~ [ ~ , 24] are computed by eq. (6.48), i.e.,
~[2o] -
1
., I~ Ileo, x .R~,~I
(
2X(mo,<[:~]m~)
- 22~2" (rh~ , l d [ ~ , ~^'] R ^ T , h ~ ) + &",2 (,-h~ , i~V~[~] ^' _oT , ~ ) ) , /
AT
V/leo, 2~]- _ ( 2 ~ [ ~ o 1 - 2-v/[e~, ~ ] R )m~ (r~o,e~)
(11.96)
where we put too, - X [ h x eo] x
R~'~.
(11.97)
The covariance matrix l~[~] of the reconstructed space point/'~ is given by eq. (6.50), i.e.,
v/[~l- 2~v/[~o] + 22~s[vi[~o, 2~]~r l + ~[2o]~r where S[. ]is the symmetrization operator (see eqs. (2.205)).
(11.98)
344
11.4.2
Chapter 11.
3-D Motion Analysis
Effect of errors in the motion parameters
We now consider the effect of errors in the motion parameters {h, /~}, assuming that x~ and x ~a, a - 1, ..., N, are all accurate. This analysis was already done in Section 6.7, but there the translation h and the rotation i t were assumed independent. If correlations exist between them, the computation of the covariance matrix ~/m[/~] of the reconstructed space point ?~ must undergo the following stages of computation (the subscript m indicates that we are considering errors in the motion parameters only). A. Epipolar equation Since image noise is assumed not to exist, we have ~ba = 5~ and &~ - 5~~ Hence, the epipolar equation (&~, h x R x ~ ) = 0 is satisfied. However, if the perturbed motion parameters h - h + A h and t t - / t + A R are substituted for their true values {h, tt}, the quantity Ol
--
--
C~"
^1
^
^/
(11.99)
is no longer zero. Writing A R - A / J x R, we obtain to a first approximation ~ = (&~, Ah x Xt&~ + h x AR&')a - - ( d ~ , A h ) -
(ba, A ~ ) ,
(11.100)
where 7 a. = ~ ~ •
,"
^l
,
b.
--
(~.,
"
Rx~
,,!
)h
--
(h, R^x ^' ~ ) x ~^.
(11.101)
Hence,
v~[tol = (a.. V [ h l a . ) + 2(ao. vih./~]C) + (C. v[/~]C).
vm[h. ~.1 -
-v[h]a. -
v[h. i~]C.
v~ [/~. ~o1 - -v[/~. hla. - v[/~]C.
(11.102)
the covariance matrices V[h], V[h,R], and V[R] are computed by eq. (11.31). In computing eq. (11.31), we need the value e2 because W~ (h, R) involves the covariance matrices V[x~] - e2V0[x~] and V[x~] - e2Vo[x~] (see eq. (11.12)). It is estimated by eq. (11.89).
where
B. Correction of image points Since image points &~ and x~ ~' do not satisfy the epipolar equation for the perturbed motion parameters, they are respectively corrected into & ~ - A&~ and 7Strictly speaking, (h, R} in eqs. (11.101) should be (h, R}, but the use of {l~, R} introduces only a second order difference.
11.4.
Reliability of 3-D Reconstruction
^1
^1
345
x~ - Ax~ according to eqs. (11.87). To a first approximation, the correction has the following form:
~i.
~ . v 0 [ ~ . ] G^x*,o,
=
^' & T v 0 [Za] c^~ ^! . ) + ( e . d;Vo[~']d;Te.) (,T,a, ,,-I_ ~.Vo[x'~]G &. . .(Xc~, ' (~TVo [ . o ] a"~^,. ) + (~.. GVo[*']GT &~)
zX&'. =
(11.103)
Hence, we obtain
v~[~ol(Vo[x o]Gx~)(Vo[x~ld~,:) ^ ^' 9
v~[~,~] -
,,T
Vrn[eo~](Vo[;T,~]~rT'.~,o~)(Vo[x~]a V~
^]
(,,, d;7 Vo[~ ,~]ax,) (x,~, " ^'
~,
Xa~)T
+ (i,,~ GVo[x']d;ti:,)
[~']G^ T ~,) V m [ ~ I ( V o [ ~ ] ^G~,)(Vo ^,
Vm[i:o, ~:'] -
)2,
T
,,t d;Tvo[x~] G,Tar)-[" (Xc~, l~W0[X~](~Txot) )~' ((xo,, " .., V m [ h , eo~](Vo[2s vm[h,~o]
-
-
Vm[h,e'.] -
-
vm[h. ~.](Vo[~'.]CT^ ~ . ) ~
(~,',~, G-rVo[x,~]d~:' +,)
AT
(~:,~, d~Vo[~']G
(x~^' , Cl. TVo[x . ~ ] G^ x,,!~ ) +
(i:,~, I
vm[.~, ~'] _
_
~,,)
v0[R. ~ ] ( v 0 [ ~ ~]Gxo~) ^ ^' T
Vm[.0, ~,] - -
C. D e p t h
t
(',Xot (~T Vo[~,]d;'~.) + (~. . (~Vo[~'.]a,, T ~.)
G V o [ x ~ ] G,, T ~,~)
,, T
~m[h, ~,](Vo[~,]C ~, )t (~:,,~, G-rVo[~o] G~) " .., + ( ~ , O V o [ ~ - ] O m~) o,
.
(11.104)
variance
Recall that the depths Za and Z~ have been determined from following relationship (see eq. (6.35))" 2~&~ - h + 2:l~&'~.
(11.105)
If ft and J~ are perturbed by A h and A / J x R, respectively, the depths 2~ and 2~ are accordingly perturbed by AZ~ and AZ~, respectively. To a first approximation, we have ~z~
-
~ z " n x ^' ~ ^
+
~ x h -
z~ ^'
^ ^'
^'/~A~'
(11.1o6)
346
Chapter 11. ^
3-D Motion Analysis
^1
Taking the vector product with Rx~ on both sides, we obtain
^' (ik~'~) • ( ^ ^' Z. (n~o) - 2 ~ ( n~^' ~ . ) • A~. + 2'o ( n^^' ~ . ) • /ZA~'..
~ z . ( n ~ .^')
• ~.
( n^~^'. ) • Ah
-
-
• ~a)
(~1.1or)
Taking the inner product with
~
(11.108)
N [ h x Xo~]
-
on both sides of eq. (11.107), we obtain AZ~(~
( n^~ )^'
•
~)
-
(,~
,
~ ^'
Ah) + Z~ ^' X^'c ~
- 2 ~ ( ~ . , ( / ~ : ) • a ~ o ) + 2"(ao ( n~~ )^' ,
,
• R^A x^'a )
.
(11.109)
If we define Fa~ - g ~
x R x^' ~ - N[~: x hi "
x
n^x^' ~,
(11.110)
we obtain
~xz~
_
(m~,,,",h)+2"ll
^! ~
T,~,~x,~') II~ (,~,~x~) - 2~(,~,~x~)+z~(~t ^' (m,~, ~ )
(11.111) Noting eq. (6.47), we obtain the variance Vm[Za] and the covariance vector Vm[5~a, Za] in the following form:
1
V~[2o] --
( !
V[h./~]fi.) , vm[h,~'l/ZZm~)
^T
Vm[h. ~]rh~)
+ 2 2 : I1~'~[I2 (m~.
- 22. (m..
+22:(m~
- 2 z^~ Z .^' l l ~ l l^'
2( ~ ,
v~[/~, ~],h~)
+22',,2 I1~ 112(,~,~, Vm [.#., ~ ' ] h T m . )
22~2"(m~, Vm[~, ~.]RT
-
1
v ~ [ ~ , 2~] = (m~: ~ )
(v~[~,
)),
h ] , ~ + 2"ll~'ll~Y~[~, R I ~
- Z,~Vm['&,~Ivb,,~+ Z ' V m [ ~ , ~ ' ] l ! ~ T v h ~ ) .
(11.112)
Finally, the covm'iance matrix Vm [~] of the reconstructed space point ~a = ZaS~a is given by Vm[~]
^T
-
2 ~ 1 ~ [ ~ l + 22~S[Vm[Jc~,2~15c~l + Vm[2~l~x~.
(11.113)
11.4.
Reliability of 3-D Reconstruction
347
Fig. 11.5. Simulated images of two objects.
~
....o.~..,~tk..%~,~.. 9 o ~'.. oO
(b)
(~)
Fig. 11.6. Least-squares approximation with optimal weights: (a) errors in translation; (b) errors in rotation.
Example 11.1 Fig. 11.5 shows simulated images (512 × 512 pixels with focal length f = 600 (pixels)) of two cylindrical grids in the scene viewed from different angles. We added Gaussian noise of mean 0 and standard deviation σ = 1 (pixels) to the x- and y-coordinates of each grid point independently, so the noise level is ε = σ/f = 1/600 (this value is not used in the reconstruction computation). Let {h̄, R̄} and {ĥ, R̂} be the true and the computed motion parameters, respectively. Since h is a unit vector, the error in translation is measured by
Δh = P_{N[h̄]}(ĥ − N[h̄]),   (11.114)
where P_{N[h̄]} is the projection matrix along N[h̄]. The error in rotation is measured by
ΔΩ = ΔΩ l,   (11.115)
where ΔΩ and l are the angle and axis (unit vector) of the relative rotation R̂R̄⁻¹ (= R̂R̄^⊤).
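For reproducing this kind of accuracy evaluation, the two error measures can be computed as follows (a Python sketch; the angle and axis of the relative rotation are extracted with the standard trace and skew-part formulas, which is an implementation choice of ours):

```python
import numpy as np

def motion_errors(h_true, R_true, h_est, R_est):
    """Translation error of eq. (11.114) and rotation error of eq. (11.115)."""
    n = h_true / np.linalg.norm(h_true)            # N[h_true]
    P = np.eye(3) - np.outer(n, n)                 # projection matrix along N[h_true]
    dh = P @ (h_est - n)                           # eq. (11.114)
    Q = R_est @ R_true.T                           # relative rotation
    cos_dO = (np.trace(Q) - 1.0) / 2.0
    dO = np.arccos(np.clip(cos_dO, -1.0, 1.0))     # rotation angle
    axis = np.array([Q[2, 1] - Q[1, 2], Q[0, 2] - Q[2, 0], Q[1, 0] - Q[0, 1]])
    if np.linalg.norm(axis) > 1e-12:
        axis = axis / np.linalg.norm(axis)         # rotation axis (unit vector)
    return dh, dO * axis                           # eq. (11.115)
```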
In Figs. 11.6-11.8, A h and AY2 are plotted in three-dimensions for 100 trials, each time using different noise. The ellipses in these figures indicate the standard confidence regions s defined by V[ft] and V[I~] (see eq. (11.29)); the cubes are drawn merely for the sake of reference. Fig. 11.6 is for the SSince V[la] has rank 2 and VIlli has rank 3, the standard confidence region is a space conic for translation and an ellipsoid for rotation; see Section 4.5.3.
348
Chapter 11.
3-D Motion Analysis
least-squares approximation with optimal weights (computed from the true motion parameters {h, f/}); Fig. 11.7 is for renormalization; Fig. 11.8 is for renormalization followed by optimal correction of the essential matrix. The corresponding standard confidence regions and reference cubes have the same absolute sizes. From Fig. 11.8, we can see that the theoretical bound on the accuracy of the motion parameters given by eq. (11.29) is almost attained if renormalization and optimal correction are combined. Fig. 11.9 shows a reconstructed shape viewed from a different angle. The true shape is superimposed in dashed lines. Fig. ll.10a shows the standard confidence regions computed from the total covariance matrix Vt[~] = Vi[~] + Vm[/'a] around the reconstructed grid points, where V/[/'~] is the covariance matrix of/'~ due to image noise (see eq. (11.98)) and Vm[iZ~] is the covariance matrix of ~a due to errors in the computed motion parameters (see eq. (11.113)). Fig. ll.10b shows the shapes that envelop the primary deviation pairs of the grid points. 11.5
11.5.1
Critical
Surfaces
Weak critical surfaces
The linearization technique introduced in Section 11.2.1 is based on the fact that the essential matrix G is an eigenmatrix of the unperturbed moment tensor ,s for eigenvalue 0, which we have implicitly assumed to be a simple root. If it is a multiple root, infinitely many eigenmatrices exist. Such ambiguity occurs if and only if the epipolm" equation (11.33) is satisfied by a false essential matrix (~ (r G). In other words, ambiguity occurs if and only if space points r~ - Z ~ and r~' - Z~x~ ' -' satisfy /'cv
, G~ r ~ ) /
-
0,
a -
1, ..., N ,
(11.116)
for G r G. If so, the linearization technique in Section 11.2.1 does not work, since any linear combination of G and G is an eigenmatrix of M for eigenvalue 0. Eliminating r'~ by using tile relation r~ -- h + I;tr~ (see eq. (6.35)), we can rewrite eq. (11.116) in the form
(ra,G'(ra-h))-O,
a-
1, ..., N ,
(11.117)
where G' - (~R T .
(11.118)
Eq. (11.117) implies that the 3-D interpretation is ambiguous if and only if all feature points are on a surface whose equation has the form
(r,G'(r-h)) -0
(11.119)
11.5.
Critical Surfaces
9....~.:~
349
:...
(~)
(b)
Fig. 11.7. Renormalization: (a) errors in translation; (b) errors in rotation.
(~)
(b)
Fig. 11.8. Renormalization followed by optimal correction: (a) errors in translation; (b) errors in rotation.
Fig. 11.9. Reconstructed shape. The true shape is superimposed in dashed lines.
(~)
(b)
Fig. 11.10. (a) Reconstructed shape and the standard confidence regions of grid points. (b) Two shapes corresponding to the primary deviation pairs of grid points.
350
Chapter 11.
3-D Motion Analysis
for G ~ r G R T. Since this equation is identically satisfied by r = 0 and r = h, the origins O and O ~ of the two camera coordinate systems are both on this surface. If S[G'] is nonsingular, eq. (11.119) defines a centered quadric of the following form (see eq. (4.98)): (r - r c , S ( r - r e ) ) = 1.
(11.120)
The center r c and the matrix S are given by
rc-
1S[G']-IG'h,
S -
4S[G']
(G,h,S[G,]_IG,h).
(11.121)
If a surface in the scene has the form of eq. (11.119) for some 33-matrix G ~, we call it a weak critical surface.
11.5.2
Strong critical surfaces
If the observed feature points are all on a weak critical surface, the epipolar equation (11.33) is satisfied by infinitely many false essential matrices. However, the true essential matrix G must be decomposable, so the correct essential matrix G can be chosen if the false essential matrices are not decomposable. In fact, the correct motion parameters {h, / / } are obtained by the direct optimization (11.11). On the other hand, if the epipolar equation (11.33) is satisfied by a false essential matrix that is decomposable, the problem is inherently ambiguous. This occurs if and only if all the feature points are on a surface whose equation has the form of eq. (11.119) for which the matrix (~ in eq. (11.118) can be decomposed into false motion parameters { s R}. This condition is equivalent to saying that there exist a vector h (% h) and a rotation m a t r i x / / ( ~ R) such that
G ' = h" x R.R T.
(11.122)
Let us call a surface that satisfies this condition a strong critical surface. By definition, a strong critical surface is also a weak critical surface and hence a quadric t h a t passes through the origins O and O ~ of the two camera coordinate systems. However, the converse does not necessarily hold. In fact, eq. (11.122) implies G ' T h - 0, (11.123) and hence eq. (11.119) is identically satisfied by r - ch for an arbitrary constant c. In other words, the space line r x h - 0 is entirely contained in that surface. This implies that a strong critical surface is a hyperboloid of one sheet or its degeneracy (Fig 11.11a). A critical surface is not a mere theoretical construct; it inevitably appears in practical applications. Indeed, all planar surfaces are strong critical surfaces and hence weak critical surfaces as well. This may sound contradictory,
11.6.
3-D Reconstruction from Planar Surface Motion
351
H
h
(~)
(b)
Fig. 11.11. (a) Hyperboloid of one sheet. (b) Two planar surfaces as a degenerate hyperboloid. since a strong critical surface must be a hyperboloid passing through O and 0 I. By a careful analysis, however, it can be shown that an arbitrary space plane H that does not pass though O or O ! can be coupled with another space plane H I that passes through O and O ! and intersects with H perpendicularly in such a way that II and II ! constitute a degenerate hyperboloid (Fig. l l . l l b ) . Consequently, the 3-D interpretation is inherently ambiguous if all feature points are on a planar surface. In the following, we construct an algorithm for computing all possible interpretations of a planar surface scene. In order to invoke this algorithm, however, we must first test if the feature points we are observing are on a planar surface. This planarity test is derived from a statistical analysis of the image noise.
11.6
11.6.1
3-D Reconstruction
from Planar Surface Motion
Optimal solution and planarity test
As shown in Sections 6.5.1 and 10.4.1, the necessary and sufficient condition for 2~ and x-~a to be projections of a space point on space plane (n, r) d viewed by two cameras with motion parameters {h, R } is -~ 0, where a R T ( h n T - dI) (see eqs. (6.67) mad (6.71)). Hence, the 3-D analysis can be decomposed into the following two subproblems: Problem
11.4 Estimate a matrix A such that x A~
-
0,
~ -
1, ..., N ,
(11.124)
!
from the data x~ and x~, ~ - 1, ..., N. P r o b l e m 11.5 Decompose the matrix A into surface and motion parameters {n, d} and {h, R } in such a way that
A - R-C(hn n- - d I ) .
(11.125)
352
Chapter 11.
3-D Motion Analysis
Consider Problem 11.4 first. Eq. (11.124) is the hypothesis from which {h, R} and {n, d} are to be estimated. The rank 9 of this hypothesis is 2. Since eq. (11.124) is linear in A, the scale of the matrix A is indeterminate. This is a consequence of the fact that a small camera motion near the surface is indistinguishable from a large camera motion far away from the surface, just as in the case of non-coplmlar feature points (see Fig. 11.1). In order to remove this indeterminacy, we adopt the normalization [IAI[- 1. For each a, define the 33-matrix X (k) - - e (k) X X '
~x T
(11.126)
where e (1) - - (1, 0, 0) T , e (2) -- (0, 1, 0, )T, and e (3) - (0, 0, 1) T. Let X(k)~ be the unperturbed value of X ~ ) obtained by replacing x~ and x~' by 5~ and x~,-' respectively. Eq. (11.124) can be written in the following form: (X.(k).~, A) - 0,
k - 1, 2, 3.
(11.127)
Let X ~(~j), (k) x~(0, and x~( 0 be the (ij) element of X ~ ) and the ith components ' respectively. In elements, eq. (11.126) can be written as of x~ and x~, 3 j) - -
X
E /=1
e i k l x l ot(1)Xc~(j),
(11.128)
where eijk is the Eddington epsilon (see Section 2.1.3). Since xa and xaI are assumed independent , we have E[X ~(k)] - X-(k) ~ . The deviation Ax(k)o, = X(k)~ has the following (ij) element:
X~)_
3 Ay(k) ~'~('j)
--
Eeik'(
A X a (/l ) X a ( j )
-/ (l) Ax o~(j) + Axo~ ! (l) Ax o~(j) ). + xo~
(11 9129)
l=l
Let l;[X(m), X(~")] - E [ A X L m) | AX(~")] be the covariance tensor of X(~m) and X(~~) . Since Ax~ and Ax~ are independent random variables of mean 0, the covariance tensor 1;IX (m), X ~ n)] has the following (ijkl) element" 3
v[x~),x2)l,J ~, - Z
~,.~~
(E[Ax~(j)Axo(,)]~'(~)~'(~)
p,q=l
"b E[Ax~(p) Ax~(q)]Yco,(j) 5c~(1) -b E[Ax,~(j)Ax~(o ]E[Ax~(v)AX~(q)]) 3
_-
E
x~(~) xo(~) + v [ ~ l ~'x ~ u )-~ ( , ) + v [ ~ ] ~ , v [ ~ ' ] ~ -'
) .
~,~=1
(11.130) 9This hypothesis is degenerate;
see Section
6.5.1.
11.6.
3-D Reconstruction fl'om Planar Surface Motion
353
According to the general theory in Section 7.1.5, the matrix A can be optimally estimated by the following optimization (see eq. (7.55))" N
J[A] - Z
3
Z
W" (~kO(A)(X~); A ) ( X ~ ); A) --+ min.
(11.131)
c~=l k,l= l
Here, the (33)-matrix W ~ ( A )
(
, X IA) )-
W~(A) -
,
-
(W(kO(A)) is defined by W ~ ( A )
-
which can be expressed in the form !
x'~ x AV[x~]A -r x x~ + ( A x e ) x V[x~] x ( A x e ) + [AV[
o]A T •
:
'
)-
,
(11.132)
where the symbol [. x .] denotes the exterior product of matrices (see eq. (2.45)). The minimum of J[A] is sought under the constraint I I A I I 1. Let A be the resulting estimate. Since A has eight degrees of freedom (its nine elements are normalized), the residual J[A] is a X2 variable with 2 ( N - 4) degrees of freedom. It follows that the number N of the pairs of corresponding image points must be such that N _> 4.
(11.133)
The residual is 0 when N = 4. For N ≥ 5, we obtain a planarity test: the hypothesis that the feature points are coplanar is rejected with significance level a% if
J[~i]
>
~22(N_4),a .
(11.134)
Note that although the general motion algorithm breaks down if all feature points are coplanar, the planar surface algorithm can be applied to general motion as well, always yielding an (incorrect) matrix A. It follows that if the feature points are nearly coplanar, it is safer to apply the planar surface algorithm than the general motion algorithm; it should be switched to the general motion algorithm only when assuming planarity is not compatible with the image data to a convincing degree. In practice, however, the opposite approach has often been adopted: the general motion algorithm is used first by assuming non-planarity and switched to the planar surface algorithm only when assuming non-planarity causes a computational breakdown. To be specific, it has been customary to abandon the general motion algorithm when the second smallest eigenvalue of the moment tensor 3/i given by eq. (11.40) is sufficiently close to its smallest eigenvalue, and the decision criterion has been set arbitrarily. In contrast, the use of the planarity test as described above has a solid statistical foundation. In Chapter 13, we present an alternative criterion which does not require the significance level.
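Since the threshold in eq. (11.134) is just a χ² percentile, the planarity test is a one-liner once the residual is available. A sketch (Python with SciPy; the 5% significance level is an illustrative default, not a recommendation from the text):

```python
from scipy.stats import chi2

def planarity_rejected(J_A, N, alpha=0.05):
    """Planarity test of eq. (11.134): reject coplanarity with significance level
    100*alpha % if the residual J[A] exceeds the chi-square percentile with
    2(N - 4) degrees of freedom."""
    threshold = chi2.ppf(1.0 - alpha, 2 * (N - 4))
    return J_A > threshold
```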
354
Chapter 11.
11.6.2
3-D Motion Analysis
Unbiased estimation and renormalization
The least-squares approximation to the optimization (11.131) based on the normalized covariance matrices V0[x~] and V0[x~] can be written in the following form: J [ A ] - (A; MA)--+ min. (11.135) The moment tensor f14 is defined by N
3
M _ I~r E
E
W
(11.136)
c t = l k,l=l
x ,I x A* V0[x,] A,T x x~I + (A*x,) x V0[x~] x (A*x,~)
W~ -
)-
+ e2[A*Vo[x~]A *T x V0[x'~]] 2 '
(11.137)
where A* is an appropriate estimate of A. The solution of (11.135) under the constraint IlAl[ i is obtained as the eigenmatrix of norm 1 of tensor f14 for the smallest eigenvalue. The eigenvalues and eigenmatrices of tensor M are computed by casting A and M into a 9-vector and a 99-matrix, respectively, as in the case of the essential matrix G (see eqs. (11.42)). The hypothesis (11.127) implies that the true value 2I satisfies ,MA - O, i.e., A is the eigenmatrix of ,Q for eigenvalue 0. However, E[,M] is generally not equal to 374. In fact, substituting X ~ ) - X(k)~ + A X e ) into eq. (11.136) and noting that E [ A X ~ )] - O, we see that -
E[J~I
-
N
1
3
NE E
w ( k l ) E [ X -. (k)
- )
ct-1 k,l-1
1
N
3
~)
+-f E Z W<2')E[AX |
Ax
~)
]
c r = l k,l= l N
= ' / Q + N E1
3
E
(kt) 12[X~) , X ~ )1. Wc~
(1 1. 138)
c~=1 k,l=l
Hence, the solution of the least-squares approximation is statistically biased whatever weights Wa are used. Define 3333-tensors Af (1) -(Ni(jl)kt ) and N "(2) - t ~,"~r(2) " ijklJ~ by N i(1)
3 w(mn)
1
'
eimpeknq" a a=l
'
Xa(p)Xa(q)
m,n,p,q=l \
+ V0
(j) x ~(t) ) ,
(11.139)
11.6.
3-D Reconstruction from Planar Surface Motion I
N
355
3 irnp~knqW~ rnn) Vo [Xc~]jlNo[x ~ lpq.
(11.140)
a = l rn,n,p,q= 1 !
Let 2~f(1) be the unperturbed value of N "(1) obtained by replacing x a and x a by ~ and x~,-' respectively, in eq. (11.139). Eq. (11.138) can be written in the following form: E [ . / ~ ] - ] ~ -1- E 2 . ~ (1) -}- ~4N'(2).
(11.141)
It is immediately seen from eqs. (11.139) and (11.140) that g [ . / ~ ( 1)] _ .]~/-(1) _]_ 2e2 j~f(2).
(11.142)
It follows that if we define A~t - M - c2A; (1) + e4Af(2),
(11.143)
we have E[yQ] - ,~I. Hence, we obtain the unbiased least-squares approximation J [ A ] - (A; JQA)--+ min. (11.144) The solution under the constraint IIAII- 1 is obtained as the eigenmatrix of unit norm of A)t for the smallest eigenvalue. If the noise level e is not known, the second order renormalization procedure is given as follows (see Section 9.6.2): 1. L e t c - 0 a n d W ~ -
I,c~-
1,...,N.
2. Compute the 3333-tensors ,A/t, ./~f(1) and Af (2) defined by eqs. (11.136), (11.139), and (11.140), respectively. 3. Compute the smallest eigenvalue A of the 3333-tensor
(11.145)
./~ --- ./~ -- CA/'(i) -~- C2/~f (2) '
and the corresponding eigenmatrix A of unit norm. 4. If A ~ 0, return A, c, and JQ. Else, update c and W ~ as follows:
2c(A;N'(2)A))2 - 4A(A;Af(2)A), I c+ (A;'h[(1)A)-2c(A;'/V'(2)A)-v/D
D - ((A;H(1)A)-
c+--
W~
2(A" A/(2) A) A ' c + (A;AY(1)A),
• AV0[
.IA
•
+
+c[AVo[x~]AT x V0[x~]])-2 "
,
(11.146)
If n > 0,
If D < 0, (11.147) •
•
(11.148)
356
Chapter 11.
3-D Motion Analysis
5. Go back to Step 2. After renormalization, the squared noise level e2 is estimated in the form ~2 __
C
1 -4/N"
(11.149)
Its expectation and variance are given by E4
E[~2]_ e2,
V[~2]- N---:~"
(11.150)
The covariance tensor ]2[A] of the resulting estimate h is obtained in the following form: ~2
1;[-~]-
V(./~:'t)8.
(11.151)
In terms of the estimate ~2 computed by eq. (11.149), the planarity test given by eq. (11.134) can be rewritten in the form ~2 E-"~
>
X2 2(g-4),a
(11.152)
We can interpret this as follows: we compare the estimate ~ computed under the hypothesis that the feature points are all coplanar with the noise level e expected from the accuracy of the image processing operations for locating feature points.
11.6.3
Computation of surface and motion parameters
A. Decomposition of the matrix of the image transformation We now solve Problem 11.5. The important fact is that unlike the decomposition of the essential matrix G, no decomposability condition needs to be imposed. In other words, there always exist {n, d} and {h, R} that satisfy eq. (11.125), although the decomposition may not be unique. This is easily understood by counting the degrees of freedom. The matrix A has eight degrees of freedom, since its elements are normalized. The surface parameters {n, d} have three degrees of freedom; the motion parameters {h, R} have five degrees of freedom. Thus, the matrix A has the same degrees of freedom as the surface and motion parameters. In order to solve Problem 11.5, the following two types of camera motion must be distinguished (Fig. 11.12): Case 1 The camera moves on one side of the planar surface; the two images are views from the same side of the surface.
11.6.
3-D Reconstruction from Planar Surface Motion
357
47 (~)
(b)
Fig. 11.12. (a) Case 1: the camera moves on one side of the planar surface. (b) Case 2: the camera penetrates through the planar surface. C a s e 2 The camera penetrates through the planar surface; the second image is a view from the opposite side of the surface. Although the normalization IIAII = i is imposed in the renormalization computation, we rescale A so that det A - 1 Then, the surface and motion parameters {n, d} and {h, R} are computed by the following procedure: 1. Let )~1 >_ )~2 >__A3 (> 0) be the eigenvalues of matrix ATA, and {Ul, u2, u3} the orthonormal set of the corresponding eigenvectors 1~ Let cri = x / ~ (> 0), i = 1, 2, 3. 2. The surface parameters {n, d} in Case 1 are given by
n - ]~[V/Cr2 --O'2Ul-t- V/O"2 --cr2u3],
d = or-------L-2. Or I
~
(11.153)
0- 3
In Case 2, the distance d is replaced by
(11.154)
d -- or---L-2. 0"1 + 0"3
3. The motion parameters {h, R} in Case 1 are given by
(i1.i55) R-
1 (I + o'3ph T)
A T,
(11.156)
o"2 l~ other words, {ui} is the see Section 2.3.1.
left orthonormal system of A
for
singular values {x/~};
358
Chapter 11.
3-D Motion Analysis
Fig. 11.13. A surface behind the camera cast be also observed. and in Case 2 by h--
N[0"3 V/0-12--0"2~ 1 "4-0-1V/0-22- 0"32U3],
R - 1 (-I + a~phT) A m,
(11.157) (11.158)
0"2
where the double signs 4- correspond to that in eqs. (11.153). 4. For each solution, another solution is obtained by changing the signs of n and v simultaneously.
B. G e o m e t r y of ambiguity
The above procedure yields eight solutions in general. This ambiguity is partly explained by noting the following two facts: 9 According to our camera imaging model, a surface behind the camera can be observed as well as a surface in front of the camera 11 (Fig. 11.13). 9 The surface and motion parameters are computed from the matrix A of the image transformation, not individual feature points. Suppose one solution {n, d} and {h, R} is obtained. T y p e 1 If we 1. move the first camera to the other side of the planar surface, and 2. reverse the translation (Fig. 11.14), the new surface and motion parameters are { - n , d} and { - h , R}, respectively. We can see from eq. (11.125) that the matrix A of the image transformation does not change. 11This is not a mere mathematical artifact due to the abstract perspective projection model; this also occurs to a real camera. For example, if the optical axis is oriented upward but not vertical, the horizon can be seen if the field of view is sufficiently large.
11.6.
3-D R e c o n s t r u c t i o n from P l a n a r Surface M o t i o n
,,"
d
"'.,.
359
\ \\
................
m fl
_....~'~tx,~
"
1 -n
~ R
Fig. 11.14. The first camera is moved to the other side of the planar surface, and the camera translation is reversed. Type
2 If we 1. reverse t h e three c o o r d i n a t e axis orientations of the second camera, and 2. move it to the mirror image position with respect to the p l a n a r surface (Fig. 11.15), the new m o t i o n p a r a m e t e r s 12 {h ~, R ~} are given by h' = J n h + 2 d n ,
It'=
-RJn,
(11.159)
where .In is the linear m a p p i n g of the reflection with respect to the plane with surface unit n o r m a l n (Fig. 11.16): it has the expression 9I n = I -
2 n n -r.
(11.160)
T h e m a t r i x of t h e resulting image t r a n s f o r m a t i o n is A ' - l : l ' T ( h ' n -r - d I ) - - F l - r J rn ( ( J n h + 2 d n ) n =
_R, jn (Jnnn"
= -R-r(nn
T -dl)
- -A,
-
-r - d l ) )
-dJn)
(11.161)
where we have used the identities J ~ - .In and j 2 _ I . Thus, t h e two m o t i o n s have the same m a t r i x of image t r a n s f o r m a t i o n up to sign la.
C. Resolution of ambiguity A l t h o u g h eight solutions exist in general, we can reduce the n u m b e r of solutions if we can tell w h e t h e r the m o t i o n belongs to Case 1 or Case 2. In fact, 12The camera orientation obtained after two reflections is described by a (proper) orthogonal matrix. 13Note that multiplication of the matrix A by a nonzero (positive or negative) constant does not affect the 3-D interpretation.
360
Chapter 11.
3-D Motion Analysis
R'
ol ." --....
i
,,'"'""
n l
\
......
~
Fig. 11.15. The second camera is reversed and moved to the mirror image position with respect to the planar surface.
Fig. 11.16. Vectors h and h' are the mirror images of each other with respect to the plane (n,r) - d. if we impose the condition that all the feature points have positive depths before and after the camera motion, the number of solution reduces to at most t w o - - i n most cases one. This is because spurious interpretations are in most cases such that the feature points are all behind the camera or some of them are behind the camera. In the latter case, the vanishing line separates some feature points from the rest, which is impossible because only one side of it is visible (Fig. 11.17; see Section 4.6.2). The true position e~ of the (~th feature point satisfies the following equation (see eq. (11.72)):
(11.162) If this point is on space plane (n, r) - d, the depths 2~ and 2~ can be determined by substituting eq. (11.162)into (n, e~) - d: we obtain
2o -
(~,d~.),
d-(n,h)
2" = ~ : n ~ - )
(11.163)
11.6.
3-D Reconstruction from Planar Surface Motion
361
.---",,, .....
::,:::('::::
......
Fig. 11.17. The vanishing line cannot pass through the projected feature points; only one side of it is visible. Hence, the condition to be imposed is d
(n, ~-) > o,
d - (n, h) > 0.
(11.164)
(~, n ~ - )
If the correct sm~face and motion parameters are chosen, eqs. (11.162) and (11.163) determine the 3-D position of each feature point. However, if the ! data values x~ and x~ are substituted for ~ and x~ -~ in them, eq. (11.162) does not hold exactly. Geometrically, this is equivalent to saying that the two lines of sight defined by x~ and x ~ may not meet; if they do, the intersection may not be on the space plane (n, r) = d. As described in Section 6.5.1, this problem can be resolved by the optimal back projection: the corresponding / image points x~ and x~ are optimally corrected into ~ and x~ ^ so that their lines of sight intersect exactly on the space plane (n, r) = d (see Fig. 6.12). The correction takes the following form (see eqs. (6.74)and (6.75)):
~ ,,/
- x~ -(Y0[x~]A /
v x x'~)W~(x~
x Axe),
/
x~ - x~ + (Vo[x~] x (Ax,~))W,~(x~ x Ax,~). w~
-
( ~'~ • a v o [ ~ , ] a T • ~ ' + ( A ~ )
' • Vo[~o] • (A~)
)-
(11.165) .
(11.166)
This correction is iterated until eq. (11.124) is sufficiently satisfied. Then, the 3-D position of the c~th feature point is given by
r~-
(n,~)
E x a m p l e 11.2 Fig. 11.18 shows two simulated images (512 x 512 pixels) of a planar grid in the scene viewed from a moving camera. The focal length is assumed to be f = 600 (pixels). We added Gaussian noise of mean 0 and standard deviation a = 5 (pixels) to the x and y coordinates of each grid point
362
Chapter 11.
3-D Motion Analysis
Fig. 11.18. Motion images of a planar grid. independently, so the noise level is e - o / f -- 1/120 (this value is not used in the reconstruction computation). We assume that the motion belongs to Case 1. We also assume that the correct solution can be distinguished from the spurious solution. Let {n, d} and {h, R} be the computed surface and motion parameters. Their deviation from the true values {~, d} and {h, R} is measured as follows (see eqs. (10.87), (11.114), and (11.115)): 9 The error in the surface parameters is represented by the 3-vector Au - Pr~(n-
~) + I]h[[d d - d •"
(11.168)
9 The error in translation is represented by the 3-vector A h -- P N [ h ] ( h -
N[h]).
(11.169)
9 The error in rotation is represented by the 3-vector A ~ - AFt/,
(11.170)
where ~ and I are, respectively, the axis and angle of the relative rotation - , (= In Figs. 11.19 and 11.20, A t , Ah, and AY2 are plotted in three dimensions for 100 trials, each time using different noise. Fig. 11.19 is for the least-squares approximation (the weights W(~kl) are computed from the true surface and motion parameters {fi, d} and {h, IR}); Fig. 11.20 is for renormalization. Comparing Figs. 11.19 and 11.20, we can see that the least-squares solution has statistical bias, which is removed by renormalization. Fig. 11.21a shows one example of a grid reconstructed by directly applying eq. (11.167) to x~; Fig. 11.21b shows the corresponding surface reconstructed by the optimal correction of the feature points. In both figures, the true position is superimposed in dashed lines. We can see that the correction enhances the accuracy of 3-D reconstruction.
11.6.
3-D Reconstruction from Planar Surface Motion
9
-~-~
-
(~)
.
363
9
(b)
(~)
Fig. 11.19. Least-squares approximation. (a) Errors in the surface parameters. (b) Errors in translation. (c) Errors in rotation.
/
(~)
(b)
(c)
Fig. 11.20. Renormalization. (a) Errors in the surface parameters. (b) Errors in translation. (c) Errors in rotation.
/
/
(~)
(b)
Fig. 11.21. (a) Simple back projection. (b) Optimal back projection.
364
Chapter 11.
Z'
R
Z
3-D Motion Analysis
p
9
Z'=I
Z=I
~
V
xa
Fig. 11.22. Camera rotation with no translation. 11.7 11.7.1
Camera Rotation
Rotation
and
Information
test
A. R o t a t i o n e s t i m a t i o n
So far we have assumed that h ~ O. If h = O, the camera motion is a pure rotation around the center of the lens, and the incoming rays of light are the same before and after the camera rotation (Fig. 11.22). Consequently, no 3-D information can be obtained. In fact, if no image noise exists, eq. (11.72) implies that the corresponding image points ~ and x-'a are related by Z~
-- Z ' . R ~ ' .
(11.171)
Thus, the depths Z~ and Z~ are completely indeterminate. It follows that in order to do a 3-D motion analysis, we need to test whether h = 0 or not in advance. This r o t a t i o n t e s t can be done in the form of a X2 test by first hypothesizing h = 0 and then testing if the discrepancy of the observation from the hypothesis is small enough to be accounted for by the statistical behavior of the image noise. Eq. (11.171) is equivalent to 9 ~ c< R~'a.
(11.172)
In the presence of noise, the observed image points x~ and x'~, a - 1, ..., N, may not exactly satisfy this condition. So, we consider the following problem: Problem
11.6 E s t i m a t e a r o t a t i o n m a t r i x R s u c h t h a t
-!
x~ x R S ~ -- 0, !
f r o m the d a t a x ~ a n d x ~ , ~ -
1, ..., N .
- 1, ..., N,
(11.173)
11.7.
Camera Rotation and Information
365
Evidently, eq. (11.172) is equivalent to eq. (11.173). This problem is formally identical with Problem 11.4, so the matrix R can be optimally estimated by the following optimization (see eq. (11.131))" N
3
J[R] -- E E W(kl)(R)(x(k); R ) ( X ~ ) ; R ) -+ min.
(11.174)
a=l k,l=l /
w.(R)-
• nv[
:]n T •
+
+[RV[x~]R T x V[x~
•
]])
-
•
(11.175)
2 ~
The only difference from eq. (11.131) is that the minimum is sought under the constraint that R is a rotation matrix. B. X2 test
Let R be the resulting estimate. Since R has three degrees of freedom, the residual J[/~] is a ~2 variable with 2 N - 3 degrees of freedom. It follows that the number N of the pairs of corresponding image points must be such that
(II.176)
N > 2.
However, the residual is not 0 when N = 2. The rotation test can be done in the form of the standard ~2 t e s t : the hypothesis that the camera motion is a pure rotation is rejected with significance level a% if J[R] > X2N_3,,.
(11.177)
Decompose the covariance matrices V[xa] and V[x~] into the noise level e and the normalized covariance matrices V0[xa] and V0[x~l as shown in eqs. (11.38). Let J0[lk] be the normalized residual obtained by replacing V[xa] and V[x~] by V0[xa] and V0[x~], respectively, in the expression for J[IR]. An unbiased estimator of c2 under the hypothesis h - 0 is obtained in the form ~2= [1~______~] J0 2N-3" Its expectation and variance under the hypothesis are E[~2]- e2'
264
V[e2] = 2 N ~ - 3"
(11.178)
(11.179)
In terms of the estimate ~2 the rotation test given by eq. (11.177) takes the following form: ~2 ~2 2N-3,a ~ > 2N------j~. (11.180)
366
Chapter 11.
3-D Motion Analysis
The interpretation is the same as in the case of the planarity test (see eq. (11.152)): we compare the estimate ~ computed under the hypothesis h 0 with the noise level e expected from the accuracy of the image processing operations for locating feature points. C. E s t i m a t i o n o f t h e r e s i d u a l
Since the optimization (11.174) is nonlinear, we need numerical search in the parameter space for R, which is computationally costly. However, an approximately optimal solution can be computed easily. Normalizing both sides of eq. (11.171)into unit vectors and noting that N [ R x ~ ] - R N [ x ~ ] , we obtain
N[~]- RN[x:].
(11.181)
!
If vectors x~ and x~, ~ = 1, ..., N, are normalized into m ~ -- N[x~],
m 's = N [ x ~ ] ,
(11.182)
the rotation R can be estimated by the least-squares optimization 1
J[R]
-
N
K ~
W~llm~ - Rm~ll ~ -~ min,
(11.183)
where W~ are arbitrary positive weights. If we define the correlation m a t r i x N
C -- E
W~mam~7-
(11.184)
o~--1
the optimization (11.183) is equivalent to the following maximization (see
eqs. (2.157)~nd (2 158))" (C; R)--4 max.
(11.185)
C - VAU T
(11.186)
Let
be the singular value decomposition of C (see eq. (2.114)). The solution of the optimization (11.185)is given as follows (see eq. (2.160))" /~ = Vdiag(1, 1, d e t ( V U T ) ) U T.
(11.187)
This may not be the truly optimal solution of (11.174) but can be used to compute a good approximation to the residual, because to a first approximation the function J0[R] is stationary in the neighborhood of/~.
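The approximate rotation of eqs. (11.182)–(11.187) is again a small singular value decomposition. A Python sketch with unit weights (the weights W_α are arbitrary positive numbers in the text, so taking them all equal to 1 is a legitimate special case):

```python
import numpy as np

def estimate_rotation(x, xp, weights=None):
    """Estimate R from corresponding directions by the correlation-matrix SVD
    (eqs. (11.184)-(11.187))."""
    N = len(x)
    w = np.ones(N) if weights is None else weights
    C = np.zeros((3, 3))
    for a in range(N):
        m  = x[a]  / np.linalg.norm(x[a])          # m_alpha  = N[x_alpha]
        mp = xp[a] / np.linalg.norm(xp[a])         # m'_alpha = N[x'_alpha]
        C += w[a] * np.outer(m, mp)                # correlation matrix, eq. (11.184)
    V, _, Ut = np.linalg.svd(C)                    # C = V Lambda U^T
    return V @ np.diag([1.0, 1.0, np.linalg.det(V @ Ut)]) @ Ut   # eq. (11.187)
```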
11.7.
11.7.2
C a m e r a Rotation and Information
367
Information in motion images
The rotation test described in the preceding section can also be given the following intuitive interpretation. If I l h l l - 0, no 3-D information can be obtained, and the residual J0[/~] is very small (exactly 0 if noise does not exist). As in the case of stereo vision, the reliability of 3-D reconstruction generally increases as Ilhll increases. If Ilhll increases, the residual J0[/~] also increases. This implies that we can view J0[/~] as the information in the two images. From this viewpoint, we observe the following: 9 The information is m i n i m u m when h - 0; it generally increases as Ilhll increases. 9 If h is the same, the information increases as the variation of the depth becomes larger 14. 9 The information is unchanged if an arbitrary rotation R is added to the camera motionl 5. 9 The information increases as the number N of the feature points increases. In general, the accuracy of 3-D interpretation decreases as the image noise increases. However, the above observation implies that the negative effect of image noise cannot be measured simply by its magnitude: it must be weighed against the information in the two images. In other words, 3-D interpretation can be disrupted by image noise of a very small magnitude if the information is very small, while 3-D interpretation is robust to image noise of a large magnitude if the information is sufficiently large. This viewpoint is further extended in a more general framework in Chapter 13. E x a m p l e 11.3 Fig. 11.23 shows two images of a scene with small depth variance viewed from a translating camera. It is widely known that 3-D interpretation of this type of motion images is very difficult in the presence of noise, because similar images would be observed by a camera rotating around the center of the lens (Fig. 11.24), in which case 3-D interpretation is theoretically impossible. For images such as shown in Fig. 11.23, accurate 3-D interpretation is possible only if the image noise is very small. In other words, the image noise must be small as compared with the information in the images. Thus, the image noise magnitude is a very deceptive measure of the reliability of the 3-D interpretation. 14Humans in motion can perceive 3-D structures from the retinal image motion caused by the difference in depth. Psychologists call this phenomenon motion parallax. 15We a s s u m e that all visible feature points are still visible in the image frame after the rotation.
368
C h a p t e r 11.
3-D M o t i o n Analysis
Fig. 11.23. Motion images of a scene viewed from a translating camera.
Fig. 11.24. Motion images of a scene viewed from a rotating camera.
Chapter 12
3-D Interpretation of Optical Flow
In this chapter, we study 3-D interpretation of small (theoretically infinitesimal) image motion called optical flow. We begin with the problem of how to detect optical flow from gray-level images and how to evaluate its reliability. We then turn to 3-D reconstruction from optical flow. The entire analysis is based on the epipolar equation, which is obtained from the epipolar equation for finite motion by taking the limit of infinitesimal motion. The subsequent analysis almost parallels the finite motion analysis given in Chapter 11: we derive a theoretical bound on the attainable accuracy of the motion parameters and present a numerical scheme for 3-D reconstruction by using renormalization and linearization. The critical surface of optical flow, the rotation test, and the planarity test are also discussed in the same way as in the case of finite motion.
12.1 Optical Flow Detection

12.1.1 Gradient equation
If a smoothly moving camera takes a sequence of images of a stationary object (or equivalently a stationary camera takes a sequence of images of a smoothly moving object), the difference between consecutive image frames is very small and defines interframe displacements of image points called optical flow. As compared with feature matching, which is necessary for 3-D analysis of finite motion, optical flow has the advantage that the flow can be detected densely (usually at each pixel) by a homogeneous image operation. Hence, it is widely expected that optical flow can be used as an i m p o r t a n t source of information for many practical purposes such as real time 3-D interpretation of the environment for robot navigation. The basic assumption for detecting optical flow is that corresponding image points have the same gray level. If the image motion is idealized as continuous, this assumption takes the following form:
I_x u + I_y v + I_t = 0.   (12.1)
Here, I and (u, v) are, respectively, the image intensity and the optical flow in position (x, y) at time t. The subscripts x, y, and t denote the partial derivatives ∂/∂x, ∂/∂y, and ∂/∂t, respectively. A differential equation of the form of eq. (12.1) is generally known as the conservation equation, stating that the quantity I does not change in the course of motion. Let us call I_x, I_y, and
369
370
Chapter 12.
3-D Interpretation of Optical Flow
It the (spatio-temporal) gradient values, and eq. (12.1) the (spatio-temporal) gradient equation. The gradient equation (12.1) alone is insufficient to determine the two unknowns u and v. This indeterminacy can be resolved if the flow is assumed to be constant over a small region in the image. Suppose the flow is constant in an image region that contains N pixels, to which we refer by serial numbers a -- 1, ..., N. We write the gradient values at the a t h pixel as Ixs, Iys, and Its They can be computed by applying a digital filter in the (spatio-temporal) neighborhood of the pixel in question; different computational schemes are obtained depending on what kind of filter is used. Whatever filter is used, however, the computed value is an approximation to the true value. Hence, optical flow detection can be viewed as the following parametric fitting: P r o b l e m 12.1 Estimate the flow components u and v from observed gradient
values {Ixs, Iys, Its} in such a way that their (unknown) true values {Ixs, iys, I--ts} satisfy
L ~ + t~o~ + t~ - o ,
c ~ - 1, ..., N.
(12.2)
Eq. (12.2) is the hypothesis from which the optical flow is to be estimatedi the rank of this hypothesis is 1. Problem 12.1 is formally equivalent to the problem of fitting a space plane u X + v Y + Z - 0 to space points (I~s, Iys, Its), c~ - 1, ..., N, in the X Y Z space. We write the observed gradient values as
I~s - i~s + AI~s,
Ivs - fvs + A i r s ,
Its - its + AI, s,
(12.3)
and regard the noise terms Alxs, Alys, and Airs as Gaussian random variables of means 0, independent for each a. Let their variances and covariances be E [ A I ~ I = ~2 , E [ A I ~ ] - %~2 , E[AI~] - ~,
E[,~L,,~,~Iy~] - %,y~, E[AL,~AI~,~]- ~,~, If we write ~ -
('XO) I~
,
It~
,~-
E[,~I~,~,~I~]- %,~,. (12.4)
~
,
(12.5)
1
the gradient equation (12.2) can be written in the form (~s, u ) - 0 ,
(12.6)
where a s denotes the true value of a s . Vector a s can be interpreted as the space-time gradient of the image intensity I at the a t h pixel in the xyt space time. It is normal to the surface on which the intensity I has a constant value (Fig. 12.1). Vector u can be interpreted as the space-time velocity. It is tangent to the world line in the xyt space time. It follows that the gradient
12.1.
Optical Flow Detection
371
v t
,
. (x, y, t)
stant Y
Fig. 12.1. The image intensity I is constant along the world line in the xyt space time. equation (12.6) (or equivalently eq. (12.2)) states that in the absence of noise
the image intensity I is constant along the world line in the xyt space time. The covariance matrix of vector a~ is given by
(2
O"z c~
V[ a ~ ] -
"Y~ ~/xt(~
"Yz y a 2
"}/z t a
%~
"~t~
"Yytc~
O'2tc~
)
9
(12.7)
As discussed in Section 7.1.1, the optimal estimate of u can be computed by the optimization N (a~, u) 2 J [ u ] - ~ (u~ Yf~]u) ~ min (12.8) under the constraint (k, u) - 1. L e t / t be the resulting estimate. Theoretically, the residual J[/t] is a ~(2 variable of N - 2 degrees of freedom (so, we need at least two pixels), and the hypothesis of the constancy of gray levels in the course of motion can be tested by the standard X2 test, provided the statistical model given by eqs. (12.3) and (12.4) is strictly true, which is very difficult to ascertain for real images.
12.1.2
R e l i a b i l i t y of optical f l o w
According to the general theory in Section 7.1.3, the theoretical covariance matrix of the optimal estimate/t computed by the optimization (12.8) is given by
~--i
(~, V[a~]~t)
'
(12.9)
where ~t is the true value of u. This covariance matrix gives a theoretical bound on the attainable accuracy of optical flow detection. In practice, the
372
Chapter 12. ].
.. },~ n
9
n
9. 9.
.
.i..
9 ."~
" ' : i i . . " ~ '?
5, .
3-D Interpretation of Optical Flow
~,
.
9 " '~
\
..".
9 "\ 9 .\
.
.
: ;i;?": .
,,
9
... 9 . .. 9 .....\
.....'" ...
:.
\ 9 '~\
9
.\\
9 9 ..
".. '!\
9 ?...... L_ .i..3.
Fig. 12.2. The aperture problem: the tangential component is ambiguous when a gray-level image is moving. above expression is approximated by V[/t] -
(Pka~)(Pka~)T (a, v[
(12.10)
16)
Let (~, ~) be the estimated flow, and write VI~ = can be written as 0
)-
0
(Ix~,Iy,~)T 9
Eq.
(12.io)
(12.11)
'
where V[fi, 6] is a (22)-matrix given by --1
-
E cN~ = l W~I2~ Ecr--1
Wo~Yyo~txo~
E cu~ = l
EotN=l Wo~I2o~
,
(i2.i2)
where 1
W~ = o . ~ 2 + 2 7 ~ f i ~ + o ~
2 + 2 ( 7 ~ t ~ + %t,~) + aL"
(12.13)
The inverse matrix in eq. (12.12) may not always exist. It is easily seen that the inverse exists if and only if the rank of {VI~} is two, i.e., the intensity gradient vectors {VI~} are not all parallel. If {VI~} are all parallel, the isointensity contour of the gray level consists of parallel lines, so the motion along the isointensity lines (called tangential flow) is indiscernible; only the motion in the direction of the intensity gradient (called the normal flow) is visible (Fig. 12.2). For such a flow, the covariance matrix diverges to infinity. This anomaly, known as the aperture problem, can be understood easily if we recall the space-time description: since the constraint on the space-time velocity u is its orthogonality to the space-time gradient a~, we need gradient values that have different orientations at least at two pixels (Fig. 12.3).
12.1.
Optical Flow Detection
373
t
t ~ = constant y
/
Fig. 12.3. The space-time velocity u can be determined if and only if the space component of the space-time gradient has different orientations at least at two pixels. The aperture problem always occurs in a region where the image intensity is constant or changes linearly. Hence, the accuracy of the detected flow is generally low. Since the interfi'ame camera motion relative to the object is very small, 3-D interpretation based on optical flow is sensitively affected by image noise. In the following sections, we cope with this difficulty by deriving a theoretically optimal technique for maximizing the accuracy of the 3-D reconstruction. E x a m p l e 12.1 If the noise in the gradient values is isotropic and homogeneous in space and if the time components are independent of the space components, the noise is characterized by two constants o-s and at:
o'~2 _ O.y~2 _ o.~2 a L -
"~u~ --')'~t~ - ' T y t a --O"
(12.14)
Then, the minimization (12.8) reduces to N ( z ~ + I ~ + z,~) 2 ao[u, v ] - E (u 2 + v2 ~ _ - ~ --+ min,
(12.15)
a,=l
where 7 = at/o.s. Eq. (12.12) reduces to N
v[~ ~1- ~ ( ~ + ~ + ~)
N
2
E o:, I ~
'
E~=I I~Iw
E~l~l Iy~Ix~
Ea=l IU~ N
2
)1 "
(12 16) "
An unbiased estimator of o.s2 is obtained in the form ^2 _
J o [ u , ~]
a~
N-2
(12 17) "
Its expectation and variance are given by 2
(12.18)
374
Chapter 12.
3-D Interpretation of Optical Flow
X O)
12
z
Y Fig. 12.4. Instantaneous motion parameters.
12.2
12.2.1
Theoretical Basis of 3-D Interpretation
Optical flow equation
Suppose the camera is smoothly moving in a stationary scene. An instantaneous camera motion is a composition of an instantaneous translation of the center of the lens and an instantaneous rotation around it. Let I (unit vector) be the instantaneous axis of rotation, and w the angular velocity around it. The instantaneous camera motion is specified by the t r a n s l a t i o n v e l o c i t y v and the r o t a t i o n v e l o c i t y ca = w l (see Section 2.1.5); we call {v, w} the (instantaneous) m o t i o n p a r a m e t e r s (Fig. 12.4). Viewed from the camera in motion, a space point r is rotating around the center of the lens with rotation velocity -ca and translating with translation velocity - v . Hence, the velocity of r relative to the camera has the following form (see eq. (2.59)): /~ = - v - ca • r. (12.19) Throughout this chapter, we use dots to denote time derivatives. As in the case of finite motion, an infinitesimal object motion relative to a stationary camera is equivalently treated as an infinitesimal motion of the camera relative to the object. Suppose an object is rotating around a reference point r G arbitrarily fixed in the object, say its centroid, with rotation velocity Wo and translating with translation velocity Vo. The velocity of an arbitrary point r in the object is = v0 +
• (r -
rG).
(12.20)
Comparing this with eq. (12.19), we find that the motion parameters of the equivalent camera motion relative to the object are given as follows (see eqs. (11.4)): v = - V o + Wo x t o , w = -Wo. (12.21) Conversely, the parameters { v o, w o} of the object motion are computed from the parameters {v, w } of the equivalent camera motion in the following form
12.2.
Theoretical Basis of 3-D Interpretation
375
O
Fig. 12.5. Optical flow. (see eqs. (11.5) )" Vo---V-
~o--w.
~ X rG,
Consider an image point x. written in the following form:
(12.22)
The corresponding space point v can be ?-
x-
Since
(12.23)
Z(x)"
(k,x) - 1 for k - (0,0, 1) T, the depth Z(x)is given by Z(x) = (k, r).
(12.24)
Suppose the space point r has velocity/~ relative to the camera (Fig. 12.5). Differentiating eq. (12.23) with respect to time, we obtain
2(~)~ ~ = z(~)~ z(~)
m
z(~)
(k, e)~ Q~e = z ( ~ ) ~ z(~)'
(12.25)
where we have defined Qx - I -
(12.26)
xkT.
Note the following identities:
Qxx-o, Q~ - Q x ,
Qx&-ic,
Pk Qx - Qx,
Qx Pk - Pk"
(12.27)
Substituting eq. (12.19) into eq. (12.25), we obtain the following expression for the image velocity x: -
We call this the optical flow optical flow at x.
-Qx
(v
Z(x) +w
x x
)
.
(12.28)
equation. In the following, we call 5~ simply the
376
Chapter 12.
3-D Interpretation of Optical Flow .... -5:" p
xx)A/
O(t) ~ vat
................... ~ ......... O(t+At)
Fig. 12.6. Geometric interpretation of the epipolar equation.
12.2.2
Epipolar equation for optical flow
Consider an infinitesimal image motion. moves to ~' =
After At seconds, image point x
~ + x~,t + o(At~).
(~2.29)
The motion parameters {h, R} m'e expressed in the following form (see
~q. (2.58)): h = vat + O(At~),
ft = I + ,, • IAt + O(At~).
(12.30)
From eqs. (12.29)and (12.30), we obtain Ix, h, R x ' l = - e ( x ) A t 2 + O(At) 3,
(12.31)
~(~) =
(12.32)
where
I~, x , - I + (~ • ~ , ~ • ~).
Thus, the epipolar equation Ix, h, lZtx~l = 0 reduces to
e(x) = 0.
(12.33)
The epipolar equation Ix, h, Rx'] = 0 states that the camera translation h and the two vectors x and l:tx I, which define the lines of sight of a feature point before and after the camera motion, must be coplanar (see Fig. 6.1). Since eq. (12.33) is obtained as the limit of infinitesimal camera motion, it should describe the condition that the camera translation velocity v, the vector x, which defines the line of sight of a feature point, and the velocity of x must be coplanar (Fig. 12.6). This is easily confirmed if we note that the velocity of vector x relative to a stationary scene is x + v + w • x; the vector x is moving with velocity x relative to the camera, which is translating with translation velocity v and rotating with rotation velocity w. It is immediately seen that eq. (12.32) can be rewritten in the form
e(x) = Ix, ic + v + w • x, vl,
(12.34)
12.2.
Theoretical Basis of 3-D Interpretation
377
stating that three vectors x, x + v + w • x, and v are coplanar. We call eq. (12.33) the epipolar equation for optical flow. The fact that eq. (12.33) is the necessary and sufficient condition for 3-D reconstruction is alternatively confirmed as follows. First, note that the optical flow equation (12.28) can be rewritten in the following form: •
Z(x)"
(12.35)
It follows that for given vectors fi~, w, and v, the depth Z ( x ) that satisfies eq. (12.35) exists if and only if vectors x + Qx(w • x) and Qx v are parallel, i.e.~
/
(5c + Qx(w x x ) ) x Q x v - o.
(12.36)
By definition, 5~ 6 {k}~. The defining equation (12.26) of matrix Qx implies that Qx a 6 {k}L1 for an arbitrary vector a. Hence, the left-hand side of eq. (12.36) has the form ck for some constant c. Since (x, k) = 1, the constant c can be determined by computing the inner product of the left-hand side of eq. (12.36) with x" we obtain
= Ix, ic, v] + (v • x, ,, • x) - e(x).
(12.37)
It follows that eq. (12.36) can be equivalently written as e(x)k = 0, meaning that the depth Z ( x ) exists if and only if e(x) - O.
12.2.3
3-D analysis from optical flow
In the above analysis, we have assumed that no image noise exists. In the presence of noise, the epipolar equation e(x) - 0 does not necessarily hold. Let & be the flow that should be observed in the absence of noise. We write - x + A~,
(12.38)
and regard the noise term A& as a random variable of mean 0 and covariance matrix V[&], independent at each pixel. If the optical flow is detected by the optimization (12.15), the covariance matrix V[&] can be identified with the matrix V[/t] given by eq. (12.11). As in the case of the finite motion analysis, the problem of computing 3-D structures from optical flow is stated as follows" P r o b l e m 12.2 Estimate the motion parameters {v, w} that satisfy •
from the observed flow &.
•
(12.39)
378
Chapter 12.
3-D Interpretation of Optical Flow
Eq. (12.39)is the hypothesis from which the motion parameters {v, w} are to be estimated. The rank of this hypothesis is 1. Since eq. (12.39) is homogeneous in v, the absolute magnitude of the translation velocity v is indeterminate. As in the case of finite motion, this is due to the fact that as long as the image motion is the only source of information, a small camera motion near a small object is indistinguishable from a large camera motion far away from a large object (see Fig. 11.1). The decision criterion (the rotation test) to test whether or not v - O, i.e., whether or not the camera motion is a pure rotation, will be discussed in Section 12.9.2. In the following, we normalize v into [Ivll- 1, assuming that v # 0 has already been confirmed.
12.3
Optimal Estimation of Motion Parameters
12.3.1
Optimal estimation
A. F l o w m a t r i x
Eq. (12.32) can be rewritten in the following form:
= (~,~; ~ • ,) +
(~.~; (~, ~),
-
s[~]).
(12.40)
The symbol S[. ] denotes the symmetrization operator (see eqs. (2.205)). Define 33-matrices X and F by
X - x x n- + d[bcx-r],
F = (~, ~ ) , -
(12.41)
s [ , ~ ~1 + ~ •
(12.42)
,,
where the symbol A[.] denotes the antisymmetrization operator (see eqs. (2.205)). We call F the flow matrix. Let X be the unperturbed value of X obtained by replacing ~ by ~ in eq. (12.41). It is easily seen that eq. (12.39) can be written in the form
( x ; F) - 0.
(12.43)
This equation has the same form as eq. (11.7). Hence, the subsequent analysis takes essentially the same form as in the case of finite motion analysis. Eq. (12.41) is linear in &, so E[X] - X and the deviation A X - X is a random variable of mean O. From eq. (12.41), we see that A X
= A[A~x T]
-
1
(Axx T
_
xAxT )
(12.44)
Let ]2[X] be the covariance tensor of X . It has the following (ijkl) element:
]2[X]ijkt- E[AXijAXkl]1
=
1 E [ ( i ~ g i x j __ x i A ~ j ) ( A ~ , k X -~
l __ XkA~gl) ]
-4(V[ic]ikxjxt- V[x]itxjxk - V[Sc]jkxixt + V[Se]jlxixk).
(12.45)
12.3.
Optimal Estimation of Motion Parameters
379
Since A X is a [33]-matrix, the covariance tensor P[X] is a [33][33J-tensor. According to the general theory in Section 7.1.5, an optimal estimate of the motion parameters {v, w} can be obtained by the following optimization (see eq. (7.55)): g[v, ca] -
f~ ( f ;(X;F) )?[X]F)
d x d y --+ min.
(12.46)
The minimum is sought under the constraint that Ilvll - 1. Here, the integral f s d x d y is a symbolic notation for summation over all the pixels in the region S where the optical flow is defined. From eq. (12.42), we see that (12.47)
A[F] = v x I.
Since 12[X] is a [33][33]-tensor, all (33)-matrices belong to its null space (see Section 2.4.3). Hence, eq. (12.45) implies that (F;)2[X]F) = (S[F] + A[F]; "N[X](S[F] + A[F])) = (A[F]; "P[X]A[F]) (v, (x x V[&] x x ) v ) .
= (x,A[F]TV[&]A[F]x)=
(12.48)
Consequently, the optimization (12.46) can be rewritten in the following form:
J[v,.,]-
J; w(x,v) (Ix, x, v l + ( ~
x x,~ x x
dxdy ~ rain,
1
w(~, ~ ) - (~, (~ • v[,l • ~)~)"
(12.49) (12.50)
E x a m p l e 12.2 If each flow component is perturbed independently by Gaussian noise of mean 0 and variance e2, the covariance matrix of & is V[&] = c : P k. If we define a (33)-matrix S x by 1 Sx - X X Pk X x -
1 --x
--y
x) --y
(12.51)
x 2 + y2
eq. (12.50) can be written in the following form: 1
w(~, ~) - ~(~, s ~ )
(12.52)
B. Rigidity test
Let {~, &} be the optimal estimate of {v, w} determined by the optimization (12.49). The motion parameters {v, w} have five degrees of freedom--two for v (unit vector) and three for w. Hence, the residual J[/J, &] is a X2 variable with S - 5 degrees of freedom, where S is the number of pixels at which
380
Chapter 12.
X
3-D Interpretation of Optical Flow
09
O
z
(~) (b) Fig. 12.7. (a) The image point seen in the direction of the camera translation. (b) Focus of expansion. the optical flow is defined. It follows that optical flow must be observed at least at five pixels. This is in agreement with the case of finite motion, but this is merely a theoretical result. In an actual application, the flow must be observed at a large number of pixels, since it is much more vulnerable to image noise than the point correspondence for finite motion. We can also obtain, at least in principle, a rigidity test" the hypothesis that the camera is moving in a stationary scene (or equivalently the object in motion is rigid) is rejected with significance level a% if
J[V,~.~] > ~2_5, a.
(12.53)
C. Focus of e x p a n s i o n
Let Xv be the image point whose line of sight is in the direction of the translation velocity v (Fig. 12.7a). Since x v c< v, eq. (12.50)implies that W ( x v , v) = c~. It is immediately seen from eq. (12.32) that e(xv) - 0 holds irrespective of the values of x and w. Hence, Xv is a singularity of the optimization (12.49). If the camera is not rotating (i.e., w - 0), the optical flow seems to diverge from or converge to Xv. Such a point is known as the focus of expansion (Fig. 12.7b). Although this description does not hold if the camera is rotating, we still call Xv the focus of expansion. The underlying geometry is the same as in the case of the finite motion analysis in Section 11.1.2 (see Fig. 11.3). In an actual application, a measure must be taken to avoid this singularity. For example, each time a candidate value of v is computed, a neighborhood of the corresponding focus of expansion is excluded or the function W ( x , v) is replaced by a constant in that neighborhood in the next optimization step.
12.3.2
Theoretical bound on accuracy
The theoretical bound on the attainable accuracy is obtained by analyzing the behavior of the function J[v, w] near the true values {0, ~}. Substituting
12.3.
Optimal Estimation of Motion P a r a m e t e r s
381
v - 0 + Av, w - ~ + Aw, and eq. (12.38) into eq. (12.49), we obtain
~(w(~, ~)+ o(~)) (1~, ~, ~1 + I~,~, A~I + (o • ~,~ • ~)
J[v, w]
+ (~x~, • ~,~, • ~ ) + (v • ~, ~x~o • ~ ) + o ( z x e , ~x~,, ~x~o) ~ -
ax@
w(~,o) ~(~)+ (a(~),~x~) + (b(~), ~x~) alas + f W(x, O)O(A[c,Av, Aw)3dxdy, ds
(12.54)
where we have defined
~(~) - I ~ , x, el + (~ • ~, a, • ~),
(~2.55)
a ( ~ ) - ~ • ~ + I1~11=~ - ( ~ , ~ ) ~ , b(~) -II~llZo Ignoring
fs W(x, 0 ) O ( A x ,
Av,
(~, ~)~.
Aw)3dxdy in
(12.56)
eq. (12.54), we obtain (12.57)
Since v is normalized into a unit vector, the deviation Av should be such that (0, Av) - 0 to a first approximation. However, this constraint need not be imposed, because the epipolar equation implies that
(a(~), ~) - (~ • ~ , ~ ) + I 1 ~ 1 1 = ( ~ , ~ ) - ( ~ , ~ ) ( ~ , ~ ) = I~, $, el + (e x ~, a, x ~) - o.
(~2.5s)
Hence, ( a ( x ) , Av + cO) - ( a ( x ) , Av) for an arbitrary constant c. In other words, the component of A v parallel to 0 has no effect on the minimization. Differentiating J[v, w] with respect to Av and Aw and setting the result 0, we obtain
~ w(~, ~) ((a(~), ~.)a(~)+
(~(~), ~ ) a ( ~ ) ) axas
- - f~ w(~, ~,)~(~)~(~)axa~,
(12.59)
f w(~, o) ((a(~), ~.)~(~)+ (~(~), ~)b(~)) alas - - ~ w(~, o)~(~)~(~)axa~.
(12.60)
382
Chapter 12.
3-D Interpretation of Optical Flow
If we define (66)-matrix
~4 -
( ff: W(x, o)a(x)a(x)Tdxdy fs W(x, o)a(x)b(x)Tdxdy ) W(x, o)b(x)a(x)Tdxdy fs W(x, O)b(x)b(x)Tdxdy
, (12.61)
eqs. (12.59) and (12.60) are combined into one equation in the form A(
Av~ ) - - ~ W ( m , ~ ) ~ ( x ) ( a ( x ) A
b(m)
] dxdy.
(12.62)
It is easily seen from eq. (12.58) that the matrix .4 has rank 5; its null space is {0 | 0}L. The solution of eq. (12.62) is obtained in the following form:
~
-
-
w ( ~ , o)~(~)2~-
a(~) ~,(~)
dxa~.
(12.63)
The optimal estimate {0, d~} of the motion parameters {v, w} is given by - o + A~,
d+ - ~ + Acb.
(12.64)
Their covariance matrices are obtained from eq. (12.63) in the following form:
- 2 1 - 2 1 ] - - 21-.
(12.65)
Here, we have used the relationship E [ ~ ( x ) ~ ( x ' ) ] - 5(x - x')E[~(x) ~] - 5 ( x - x ' ) E [ ( X ; t~) ~]
: 5(x - x')E[(AX', ~,)2]
_
5(x -
x')(~'; PIXIe') :
5(Xw(x,-~i) ,
(12.66)
where ~' is the unperturbed flow matrix F obtained by replacing v and w by and ~, respectively, in eq. (12.42). The function 5(x) takes value 1 if x 0 and 0 otherwise. The covariance matrices VIOl, V[/J, r (= V[d~, o]T), and V[cb] thus obtained give a theoretical bound on the attainable accuracy of estimating {v, }. In actual computation, eq. (12.65) is approximated by
v[~,~]
v[~]
fs W(x, i~)b(x)(Podz(x))-rdxdy
fs W(x, i~)b(x)b(x)Tdxdy
' (1~.67)
12.4.
Linearization and Renormalization
383
where {~, &} are the estimated motion parameters and
a(~) - ~ x ~ + II~ll:& - ( ~ ,
&)~,
b(x) - [[x[[2O - (x, 0)x.
(12.68)
Here, x is the optimally corrected flow (this correction will be discussed later).
12.4
12.~. 1
Linearization
and Renormalization
Linearization
The optimization (12.49) is nonlinear, requiring numerical search. However, the hypothesis (12.43) is linear in the flow matrix F. Hence, the linearization technique described in Section 9.5.2 can be applied. To be specific, Problem 12.2 is decomposed into the following two subproblems" P r o b l e m 12.3 Estimate a matrix F such that (X; F ) - 0
(12.69)
from the observed flow &. P r o b l e m 12.4 Decompose the matrix F into motion parameters {v, w} in
such a way that F - (v, w ) Z - S[vw T] + v x *.
(12.70)
Consider Problem 12.3 first. Eq. (12.69) is the hypothesis from which the flow matrix F is to be estimated; the rank of this hypothesis is 1. Eq. (12.47) implies that the translation velocity v is obtained by the following cast (see eqs. (2.182)): v = type3[A[F]]. (12.71) It follows that the normalization malization (see eqs. (2.183))"
Ilvll-
1 is equivalent to the following nor-
IIA[F]II- v~.
(12.72)
If we regard F as a variable in the optimization (12.46) and use eq. (12.45), we see that the flow matrix F can be optimally estimated by J[F] -
f~
(x;F)
(x,A[F]TV[JclA[Flx) dxdy ~ min
(12.73)
under the constraint IIA[F]I]- v/2. Let F be the resulting estimate of F. Since the flow matrix F has eight degrees of freedom (its nine elements are constrained by the normalization (12.72)), the residual J[~'] is a X2 variable
384
Chapter 12.
3-D Interpretation of Optical Flow
with S - 8 degrees of freedom. It follows that the flow matrix can be determined in principle if optical flow is observed at least at eight pixels. This is in agreement with the finite motion analysis, but in an actual application a large number of data are necessary for robust computation, as pointed out earlier. If the covariance matrix V[&] is decomposed into the noise level e and the normalized covariance matrix V0[&] in the form V[&] - e2Vo[&],
(12.74)
the covariance tensor ])[X] is accordingly decomposed in the form FIX]-
e2])0[X].
(12.75)
The least-squares approximation to the optimization (12.73) has the following form: J [ F ] - (f; MR) -+ min. (12.76) Here, the
moment tensor M is defined by 1 ~ W(x)X | Xdxdy,
(12.77)
1
W ( x ) - (x,A[F,]TVo[ic]A[F,]x),
(12.78)
where F* is an appropriate estimate of F. If a 9-vector f and a 99-matrix M are respectively defined by casting the 33-matrix F and the 3333-tensor M in the form f -- type9[F],
M - type99[M],
(12.79)
we have (f; M r ) = (F, MF) (see Section 2.4.2). Hence, the solution of the optimization (12.76) is obtained as the eigenmatrix of tensor M (i.e., the matrix F obtained by cast from the eigenvector f of the matrix M ) for the smallest eigenvalue. The scale of the matrix f is adjusted so that [[AIR][] =
12.4.2
Unbiased estimation and renormalization
Let M be the unperturbed moment tensor obtained by replacing X by X in eq. (12.77). Eq. (12.69) implies that the true flow matrix ~' satisfies ,~7~' = O, i.e., F is the eigenmatrix M for eigenvalue 0. However, the expectation E[uk4] of ,~[ is generally not equal to f14. In fact,
E[M] -
_
~
W(x)E[(X + AX) + (X + AX)]dzdy
12.4.
Linearization and Renormalization
385
_ I_sJfs W ( x ) (.)2 + .X + E [A X | A X ] ) d x d y
6-2 Jfs W(x)'l)o[Xldxdy. = M- + -~
(12.80)
Hence, the solution of the least-squares approximation is statistically biased whatever estimate F* is chosen. Define a [33][33]-tensor iV by
A/-
-~
W ( x ) "l,;o[X l d x d y .
(12.81)
Eq. (12.80) implies that if we define (12.82) we have E[3)t] - 2Q. Hence, we obtain the unbiased least-squares approximation J [ F ] - (F; 3)IF)--+ min. (12.83) The solution is obtained as the eigenmatrix of tensor A)[ for the smallest eigenvalue. The scale of the matrix F is adjusted so that IIA[F]II = V~. If the noise level e is not known, the renormalization procedure is given as follows 1 (see Section 9.4.2): 1. Let c = 0 and W ( x ) = 1. 2. Compute the 3333-tensor 3/[ and the [33][33]-tensor Af defined by eqs. (12.77) and (12.81), respectively.
3. Compute the smallest eigenvalue ~ of the [331133]-tensor -
(12.84)
M-cA;,
and the corresponding eigenmatrix F scaled so that
IIA[FII] = rift.
4. If ,~ ~ 0, return F, c, and 3)t. Else, update the constant c and the function W (x) as follows:
~IIFII 2 c +- c + ( F ; A f F ) '
1
W ( x ) +-- (x,A[F]TVo[x]A[F]x).
(12.85)
5. Go back to Step 2. 1in Section 9.4, r e n o r m a l i z a t i o n was i n t r o d u c e d by a s s u m i n g t h a t no c o n s t r a i n t o t h e r t h a n n o r m a l i z a t i o n existed. It is easy to confirm t h a t t h e s a m e p r o c e d u r e can be applied if t h e n o r m a l i z a t i o n IIA[F]II = v/2 is imposed instead of IIFII = 1.
386
Chapter 12.
3-D Interpretation of Optical Flow
After renormalization, the squared noise level e2 is estimated in the form ~2 _ .
C
( 2.s6)
- s/s
Its expectation and variance are given by 2E4
E[~2] - e2'
V[~2] - S - S"
(12.87)
Since the number S of the pixels at which the optical flow is defined is usually very large 2, the variance V[~ 2] is very small. Hence, ~2 is expected to be a very good estimate of e2. The constraint (12.72) is linearized as follows: (A[F]; A[AF]) - (A[F]; A F ) - 0. Hence, the null space of f onto {A[F]}~ is given by
is
{A[F]}L.
The projection tensor 7) -
Pijkl - 5ik(~jl- 1d[Fij]d[Fkl],
(12.88)
(Pijkt) (12.89)
where A[Fij] - (Fij- Fji)/2 (= the (ij) element of A[F]). Let M - (l~lijkl) be the moment tensor obtained after renormalization. Define a [33][33]-tensor A4'-- (]~[jkl) by A t
Mij a l -
E
PijmnPklPq]~/lmnpq"
(12.90)
m,n,p,q=l
The covariance tensor V[F] of the resulting estimate F is obtained in the following form (see eq. (7.30))"
V[F]12.5 12.5.1
~2
Optimal 3-D Reconstruction Optimal correction and decomposition
A. Decomposability condition We now consider Problem 12.4. Namely, we decompose the flow matrix F into motion parameters {v, w} such that eq. (12.70) holds. As in the case of the essential matrix G for finite motion (see Section 11.3.1), not every matrix 2For example, if optical flow is defined at every pixel of a 512 • 512 image frame, we haves = 512• 512~ 3• l0 s .
12.5.
Optimal 3-D Reconstruction
387
F can be decomposed in the form of eq. (12.70): a (33)-matrix F has the form of eq. (12.70) if and only if 1
K - ~trK(l-
vv
T
)+ 2S[KvvTI,
Ilvll-- x,
(12.92)
where
K = S[F],
v = typea[A[F]].
(12.93)
In fact, if eq. (12.70) holds, it is easy to confirm eqs. (12.92) by direct substitution. Conversely, if eqs. (12.92) hold, it is immediately observed that (v, K v ) = 0.
(12.94)
If we let 1 (trK)v
-
2Kv,
(12.95)
eq. (12.70)can be easily confirmed by direct substitution. We call eqs. (12.92) and (12.93) the decomposability condition for optical flow. We say that a flow matrix F is decomposable if it satisfies the decomposability condition. B. Correction of t h e f l o w matrix The flow matrix F computed by renormalization may not satisfy the decomposability condition (12.92) exactly. Hence, it must be corrected into ~" that satisfies ,/ff - ~ t r K ( l - / ~ / ~
) + 2S[~/('/j/~T]
where = S[~'],
/~ = type3[A[Fl].
(12.97)
Constrained by the normalization IIA[F]II- v~, the flow matrix F has eight degrees of freedom, while the motion parameters {v, w} have five degrees of freedom. Hence, only three of the six 3 component equations of (12.96) are independent,. In other words, the rank of the constraint given by eqs. (12.96) and (12.97) is 3. Substituting F - F - A F into eq. (12.96) and taking a linear approximation, we obtain K-
~1 t r K ( l -
v v T) - 2 S [ K v v T] - t r K S [ A v v T] + 2 S [ K A v v T
+ K v A v T] - A K + 2 t r A g ( l -
v v T) + 2 S [ A g v v T] - O, (12.98)
3Since eq. (12.96) is an equality between (33)-matrices, it gives six element-wise equalities.
388
Chapter 12.
3-D Interpretation of Optical Flow
where A K -- S[AF] and Av -- type3[A[AF]]. If a (33)3-tensor A a (33)(33)-tensor 13 -- (Bijkl), and a (33)-matrix D are defined by 3
1
Aijk -- -~trK(6ikvj + 6jkVi) -- I(ikVj -- I(jkVi - 6jk E
(Aijk), 3
Kilvt - 6ik E
/=1
I(jlvt,
/=1
(12.991 Bijkl - -~l(6ik6jl +6it6jk . 6ij6kt. +6klViVj .
6ikVlVj . . 6jlVkVi
6ilVkVj
6jkVlVi),
(12.1001 D-
K-
ltrK(I-
v v T) - 2 S [ K v v T ] ,
(12.101)
eq. (12.981 can be written in the following form:
AAv + BAK
- D.
(12.102)
If a (33)33-tensor C - (Cijkl) is defined by 1
C -- ~type33133][A ] + B,
(12.1031
or in elements 1
3
Cijkl- --~ E ~SklmAijm+ Bijkl,
(12.104)
m--1
eq. (12.1021 can be rewritten in the following form" CAF-
D.
(12.105)
Hence, the optimal correction A F is determined by minimizing the Mahalanobis distance IIAFIIv[F] (see Section 9.5.2), i.e., (AF, V [ F ] - A F )
~ min,
A F e {A[F]}Ll ,
(12.1061
under the linearized constraint (12.105). The covariance tensor V[F] of the flow matrix F is given as a by-product of the renormalization procedure (see eq. (12.9111. Since multiplication of the covariance tensor V[F] by a positive constant does not affect the solution, it can be replaced by the normalized covariance tensor V0[F] defined by setting ~ = 1 in eq. (12.911. The first order solution is obtained as follows: 3
AFij =
E
Wm"pqCm'klV~
(12.107)
k,l,m,n,p,q--1 Here, "142-- (Wijkt)is a (33)(33)-tensor defined by
W - (V)~-,
(12.108)
12.5.
Optimal 3-D Reconstruction
389
where )2 - (Fijkl) is a (33)(33)-tensor given by
Vijkl --
3 E CijrnnCklpq'12~ m,n,p,q= l
(12.109)
Since the constraint A F e {A[FI} ~ns~r~s the constraint IIA[F]Ib- r only to a first approximation, the actual correction takes the following form: v~(F-
-~ = IIA[F-
AF)
F]II"
(12.110)
This correction is iterated until the matrix D becomes sufficiently close to O. In this process, the normalized covariance tensor l;0[F] needs to be updated, since its null space should change as F changes (see eq. (5.26)). Let 79 = (Pijkl) be the orthogonal projection tensor onto {A[~']}~ (see eq. (12.89)). We let ^
~)o[lP]ijkl - 19ijmnPklpq'l~O[f ]mnpq,
(12.111)
where Vo[F]ijal and 12o[F]ijal are the (ijkl)elements of V0[F] and ))0IF], respectively. C. D e c o m p o s i t i o n into motion parameters
If the corrected flow matrix ~" satisfies the decomposability condition (12.92), the decomposition into motion parameters {v, w} can be done in principle by the second of eqs. (12.93) and eq. (12.95). However, these hold only when ~" is decomposable. As in the case of finite motion, we should use a robust expression that yields a good approximation even if ~" does not satisfy eq. (12.92), say, as a result of a premature termination of the correction iterations described above. We compute the translation velocity v by the second of eqs. (12.97) and determine the rotation velocity w by the least-squares optimization [IS[/v] - ( v , w)I + S[vw-r]ll: -+ min. (12.112) Then, we obtain v - type3[A[/~]],
1
,~ - ~ ( t r F + 3(v,/~v)) v - 2S[F]v.
(12.113)
The second equation is equivalent to eq. (12.95) if F is decomposable.
12.5.2
Optimal correction of op tical flow
A. Epipolar of optical flow
Since the epipolar equation e(x) - 0 is linear in &, it defines a plane in a three-dimensional space if & is regarded as a variable. The constraint (k,&)
390
Chapter 12.
3-D Interpretation of Optical Flow
X Fig. 12.8. Epipolar of optical flow. = 0 also defines a plane. Hence, the set of all vectors s that satisfy the epipolar equation for fixed x, v, and ca is a line: it is the intersection of the plane defined by (k, 5) - 0 with the plane defined by the epipolar equation e ( x ) - 0 (Fig. 12.8). We call this line the epipolar of the optical flow ~b at x. From eqs. (12.36) and (12.37), we see that the equation of the epipolar is x Qxv - -(v
x x,
x x)k.
(12.114)
The orientation of this line is Q x v, and its distance d from the origin O is d-
I(v • x , , , , •
(12.115)
B. C o r r e c t i o n of optical flow
Let {/~, r be the motion parameters computed from optical flow. Although they are estimated so that the flow satisfies the epipolar equation e ( x ) = 0 on the average, individual flow components may not necessary satisfy it. In order that the depth can be defined at each point, the observed flow 5~ must be corrected into the value ~ that is compatible with the epipolar equation for the computed motion parameters {~3, r This correction has the following geometric interpretations (they are equivalent to each other)" 9 The vector & is moved to ~ so that ~ is on the epipolar at x (see Fig. 12.9). 9 The vector x is moved to x so that the lines o/sight o f x ( t ) and x ( t + A t ) meet at a point in the scene to a first approximation (see Fig. 12.6). If we let
~(x) = Ix, x, 'b I + (0 x x , ~ x x),
(12.116)
this is not zero in general. It is easy to see that the right-hand side vanishes if & is replaced by ~ - & - A& in such a way that (~3 x x, A:b) = ~(x).
(12.117)
12.5.
Optimal 3-D Reconstruction
391
Ax
X
Fig. 12.9. Optical correction of optical flow. The rank of this constraint is 1. An optimal value of A5~ is determined by minimizing the Mahalanobis distance IIAS~lly0[x], i.e., (A&, t ~ [ x ] - A x ) ~ min,
•
Aic e {k}L,
(12.118)
under the constraint (12.117). The solution is given by
Ax-
~(x)Vo [&] (O x x) (6 • ~, Vo[x](6 • ~))"
(12.119)
It follows that the correction has the form .~
-z-
~(x)V0[~](~? x x)
(~, (~ x v0[x] x ~)~)"
(12.120)
Since the epipolar equation e(x) - 0 is linear in ~b, no approximation has been introduced into the constraint (12.117). Hence, the above correction need not be iterated. The residual (A~, V0[:b]-A~) of the optimization (12.118) is given by
~(~)~ e(x)2 = (~?, (x x Vo[x] x x)/J)"
(12.121)
From eqs. (12.49) and (12.50), we can see that J0[0, do] - Is ~(x)2dxdy
(12.122)
equals e2 times the residual of the optimization (12.49). Hence, J0[/~, ~b]/e 2 is a ~(2 variable with S - 5 degrees of freedom. An unbiased estimator of the squared noise level e2 is obtained in the form g2= Jo [/~, dJ]. S-5
(12.123)
392
Chapter 12.
3-D Interpretation of Optical Flow
Its expectation and variance are E[~2]_ e2
v[~]-
2E4 s- 5
(12.124)
The rigidity test (12.53) can be rewritten as
12.5.3
~2
X2_5,a
-e7 >
S-5
(12.125)
Computation of depth
A. Depth equation From eq. (12.119), we see that (Q~O V o [ ~ ] - ~ x ) - ~(~)(Q~o' Vo[x]-Vo[x](o • ~)) ' (~ • ~, Vo[x](o • ~)) ~(x)(Q~,,pk(~, • ~)) ~(~)(PkQ~,, ~, • ~) (~ • ~, Vo[~](~ • ~)) (~ • ~, Uo[~](~ • ~)) _
~ ( x ) ( Q x i ~ , i~ x x )
_ ~(x)(iJ-
(~ • ~, Vo[~](o • ~))
( i ~ , k ) x , iJ x x ) - O ,
(12.126)
(~ • ~, Vo[~](o • ~))
where we have used eqs. (12.26) and (12.27). If the corrected flow x - 5~- AS~ satisfies the epipolar equation for the computed motion parameters {0, r there exists a depth Z(x) such that
_ A~-
_Q~
0 + d~ x x ) . 2(~)
(12.127)
Noting the second of eqs. (12.27), we can rewrite the above equation in the following form: ~-Q~
(
~+~•
@ ) ( 1 2 . 1 2 8 ) "
Eq. (12.126) implies (Q~o, v0[~]-Q~
~ + ~ x ~ + 2(~)
) - 0,
(12.129)
from which the depth Z(x) is given in the following form: 2(~1 - -
(~ , Q~~ v0[~l - Q ~ )
(iJ, QTx Vo[&,]-Qx(SC + go x x ) )
We call this the depth equation. We observe the following:
(12 ~30)
12.5.
Optimal 3-D Reconstruction
393
9 Eq. (12.28) implies that Qx(& + d~ x x) = 0 if ~5 = O. Hence, the depth Z(x) is indeterminate if ~5 - O. 9 Let x 0 (<x /J) be the focus of expansion for the computed motion parameters {~, r (see Section 12.3.1). Since Qx~O ~ Qx~xi~ = 0 (see the first of eqs. (12.27)), the depth at the focus of expansion xo is indeterminate. Up to now, there exist two solutions, since renormalization computes the flow matrix F only up to sign. It is easily seen that if one solution is {v, w} and Z(x), the other solution is { - v , w} and - Z ( x ) . Hence, the correct solution is chosen by imposing the constraint
~
sgn(2(x))dxdy > O.
(12.131)
As in the case of finite motion (see eq. (11.94)), the use of a seemingly equivalent condition fs Z(x)dxdy > 0 is dangerous because a large depth (Z(x) oo) can be estimated to be very far behind the camera (Z(x) ~ - o o ) in the presence of image noise. B. Reinterpretation of optimal estimation
Eq. (12.128) implies that
(zx~, y0[~]-~x) = (~ + ~, x ~ + 2(~)' Q g V o [ ~ I - Q ~
~ + ~, x ~ +
2(~) ). (12.132)
If eq. (12.130) is substituted into the right-hand side of this equation, it should be equal to the right-hand side of eq. (12.121). On the other hand, it is easily seen that the depth Z(x) given by eq. (12.130) minimizes the right-hand side of eq. (12.132) viewed as a function of Z(x). Thus, we conclude that
~(.): (~. (.
x
Vo[x] x .)~)
= min ('~ +
z(x)
Z(x)
+ & x x, Q~Vo[x]-Qx
(
)
x + z ( x ) + a, • x ),
(12.133)
and the minimum is attained by the depth Z(x) given by eq. (12.130). It follows that the optimization (12.122) is equivalent to f
Jo[v .,]- I '
min
Js z(x)
& + Qx
(v
Z(x) + w x x
)lq2
v0[x]
dxdy --+ min.
(12.134)
394
Chapter 12.
3-D Interpretation of Optical Flow
E x a m p l e 12.3 If each flow component is perturbed independently by Gaussian noise of mean 0 and variance e2, the normalized covariance matrix of is V0[x] - Pk" From the fourth of eqs. (12.27), we see that
Q~ Pk Q~ - Q~ Q~ - S~ ,
(12.135)
where the matrix Sx is defined by eq. (12.51). Hence, eq. (12.130) can be written as
2(~)--(0,s~(~
(,5, Sx+)
(12 136)
+ ~ • ~1 ) 9
Eq. (12.133) implies the following identity:
min
z(~)
II (~ ~ +
Qx
Z(x)
+ d~ x x
)]1
=
.
(e, s ~ )
(12.137)
The minimum is attained by the depth Z(x) given by eq. (12.136). It follows that the optimization (12.134) with respect to the Mahalanobis distance reduces to the following optimization with respect to the Euclidean distance:
Jo[v' w] - fs z(~) min ll~ + Qx ( z (vx ) + w x x -- JfS e(x)2dxdy (~:~~ . m i n 12.6
12.6.1
)112
dxdy (12.138)
Reliability of 3-D Reconstruction
Effect of image noise
Errors in the 3-D shape reconstructed by the depth equation (12.130) originate from the following two sources: 1. errors in the observed flow 5; 2. errors in the computed motion parameters {/J, &}. Strictly speaking, these two sources of error are correlated, since {/~, d~} are computed from the flow 5~. However, since {/~, r are estimated by optimization over all the pixels at which the flow is defined, the correlation between {~, r and ~ at a particular pixel is expected to be very small. Hence, as in the case of finite motion, errors in 3-D reconstruction can be treated to a first approximation as the sum of the errors from these two sources. First, consider the effect of image noise, assuming that the motion parameters {/~, r are accurate. If the observed flow 5~ is corrected into
12.6.
Reliability of 3-D Reconstruction
395
by eq. (12.120), its a posteriori covariance matrix is given as follows (see eq. (5.46))"
t~[~] -
~ (Vo[x]-
\
(Vo[x](,:, x ~))(Vo[~](~, • ~))' '~ (~ x x, Vo[x](/~ x x)) / "
(12.139)
Here, ~2 is the value estimated by eq. (12.123). The subscript i indicates that we are considering the effect of image noise only. It i.s easily seen that the above matrix has rank 1; its range and null space are {(x • 6) • k } L = { Q x O } L and {x x/~, k } L , respectively (see eq. (12.26)). This is a consequence of the fact that errors in x are constrained to be along the epipolar at x (see Fig. 12.9). It follows that the covariance matrix Viii] has the following form: Vi[x] - constant x (Qxi~)(QxO) T.
(12.140)
Recall that the depth Z(x) has been computed from the following relationship (see eq. (12.127)): [r, - - Q x
Z(x) + ~ x x
.
(12.141)
Since the corrected flow ~ is computed from the observed flow, it is a random variable. If it is perturbed by A~, the depth Z(x) is accordingly perturbed by AZ(x). To a first approximation, we have
Ax - AZ(x)= QxO.
(12.142)
z(.)~
Hence, trVi[~]
-
E[IIA~II 2] - E[AZ(x)2] (~ Q~Q~e) = V~[2(~)] (~, S ~ ) , 2(x)4 ' 2(x)4
(~2.143) where the matrix Sx is defined by eq. (12.51) (see eqs. (12.135)). Consequently, the variance Y~[2(x)] of the computed depth Z(x) is given by V~[2(x)]- 2(x)4trV~[~]
12.6.2
(12.144)
Effect of errors in the motion parameters
Consider the effect of errors in the motion parameters {/~, d~}, assuming that the f l o w x i s accurate. Let / ~ - ~ + A v a n d c b - ~ - ~ + A w . Since the epipolar equation e ( x ) - 0 is satisfied by the true values {0, ~}, we have to a first approximation
~(~) - I~, x, ~ +/',vl + ((~ + i v ) x ~, (,x, + A ~ ) x ~) = I', ~, ,~"1 + ( ~ " x . , ~ x . ) + = (a(~), ,~,,) + (/,(~), ~,.,.,),
(~ x . , ~ . , x . ) (12.145)
Chapter 12.
396
3-D Interpretation of Optical Flow
where 4
a(~) - ~ • x + I1~11~, - ( ~ , ~)~, b(x) -I]x[[2~) - (x, 9)x.
(12.146)
From eq. (12.145), we obtain the following relationships (the subscript m indicates that we are considering errors in the motion parameters only)"
v~[~(.)] - ( a ( . ) . v[~]a(~)) + 2(a(~). v[~. ~ ] b ( . ) ) + (b(.). v [ ~ ] b ( . ) ) .
v~[~. ~ ( . ) ] - E[~(~)Zx~]- V[~]a(.)+ V[~. ~]b(.). Vm [~b, (~(x)]-
E[8(x)Aw]- V[~, O]8(x)+ V[~b]b(x).
(12.147)
The covariance matrices V[~], V[/J, ~b], and V[&] are computed by eq. (12.67). In computing eq. (12.67), we need the value of e2 because W(x, iJ) involves the covariance matrix V[&] = e2V0[x] (see eq. (12.50)). It is estimated by eq. (12.123). If the motion parameters are not accurate, the flow x is corrected by eq. (12.120) even though x itself is accurate. From eq. (12.119), the a posteriori covariance matrix Vm[~] is obtained to a first approximation in the following form:
v~[~]- v~[~(~)](v0[x](~ • .))(v0[~](~ • .))~ (+ •
(12.148)
v0[~](~ • ~))~
We also have
v~[~. ~] - v~[~. ~(.)](v0[~](~ • .)) 9 (~ • .. v0[~](~ • . ) ) ' v~[~. ~ ] - v~[~,. ~(.)](v0[~](~ • .))~ (~3 x x, V0[~b](~3 x x))
"
(12.149)
From eq. (12.141), we see that the perturbations Av, Aw, and A~ are related to the perturbation A Z ( x ) to a first approximation in the form
~
- -q~
2(~)
2(~)~
+ zx~ x ~
.
(12.150)
from which we have
4Strictly speaking, the motion parameters {~,, &} in eqs. (12.146) should be {9, &}, but this approximation introduces only a second order difference.
12.6.
Reliability of 3-D Reconstruction
397
Fig. 12.10. Simulated optical flow. Taking the expectation of the squared norm on both sides, we obtain
Vm[2(~ )] llQ~ ~ll ~ 2(x) 4
-E[
I1( Qx
A~+
~.
^
z(~)
)ll2
+Amxx
(12.152)
],
which can be rewritten as
Vm[2(~)l(i,, Q~Q~ i,) 2(~)~ T
=
(E[ a ~ + 2(~) + ~
• ~
~
+ 2(~)
+~
•
];Q~Q~). (12.153)
From this, the variance Vm[Z(x)] of the depth Z(x) is obtained in the following form:
2(~) ~ ( (V[&I;S~)+ (v[~]; Vm[2(~)]- (~,S~) 2(~)s~) + (~ x v[~] x ~; s~) 2(Vm[/~, x]; S x ) _ 2(x x Vm[&, x]" S x ) - 2(x x V[&,/J]; S x ) ) .
(12.154)
E x a m p l e 12.4 Fig. 12.10 shows a simulated image (512 x 512 pixels with focal length f = 600 (pixels)) of three planar grids in the scene viewed from a moving camera. We synthesized optical flow at the grid points according to eq. (12.28) and added Gaussian noise of mean 0 and standard deviation a = 1 (pixels) to the x and y flow components at each point independently. Hence, the noise level is ~ = ~ / f = 1/600 (this value is not used in the reconstruction computation). Let {~, aJ} and {/J, r be the true and the computed motion
398
Chapter 12.
3-D Interpretation of Optical Flow
....-~ 9
1~ (b)
(~)
Fig. 12.11. Least-squares approximation with optimal weights: (a) errors in translation; (b) errors in rotation.
"; "'~"
IL I ",d
I
(~)
(b)
Fig. 12.12. Renormalization: (a) errors in translation; (b) errors in rotation. parameters, respectively. Since v is a unit vector, the error in translation is measured by Av = PN[v](~) - N[0]). (12.155) The error in rotation is measured by
A~ - c, - ~.
(12.156)
In Figs. 12.11-12.13, vectors Av and Aw are plotted in three dimensions for 100 trials, each time using different noise. The ellipses in these figures indicate the standard confidence regions 5 defined by V[~] and V[d~] (see eq. (12.65)); the cubes are drawn merely for the sake of reference. Fig. 12.11 is for the least-squares approximation with optimal weights (computed from the true motion parameters {0, 3}); Fig. 12.12 is for renormalization; Fig. 12.13 is for renormalization followed by optimal correction of the flow matrix. The corresponding standard confidence regions and reference cubes have the same absolute sizes. From Fig. 12.13, we can see that the accuracy almost attains the theoretical bound given by eq. (12.65) if renormalization and the optimal correction are combined. Fig. 12.14 shows a reconstructed shape viewed from a different angle. The true shape is superimposed in dashed lines. Fig. 12.15a shows the standard confidence regions, which degenerate into line segments. They are computed from the total variance Vt[Z(x)] - Y~[2(x)] + Vm[2(x)] around the reconstructed grid points: Vi[Z(x)] is the variance due to the image noise (see 5Since l?[~,] has rank 2 and ~r[&] has rank 3, the standard confidence region is a space conic for the translation velocity and an ellipsoid for the rotation velocity; see Section 4.5.3.
12.6.
Reliability of 3-D Reconstruction
(~)
399
(b)
Fig. 12.13. Renormalization followed by optimal correction: (a) errors in translation; (b) errors in rotation.
Fig. 12.14. Reconstructed shape. The true shape is superimposed in dashed lines.
(~)
(b)
Fig. 12.15. (a) Reconstructed shape and the standard confidence regions of grid points. (b) Two shapes corresponding to the primary deviation pairs of grid points.
400
Chapter 12.
3-D Interpretation of Optical Flow
eq. (12.144)); Vm[Z(x)] is the variance due to errors in the computed motion parameters (see eq. (12.154)). Fig. 12.15b shows the shapes that envelop the primary deviation pairs of the grid points.
12.7
Critical Surfaces for Optical flow Weak critical surfaces
12.7.1
As in the case of finite motion, the linearization technique described in Section 12.4.1 is based on the fact that the flow matrix F is an eigenmatrix of the unperturbed moment tensor A74 for eigenvalue 0, which we have implicitly assumed to be a simple root. If it is a multiple root, infinitely many eigenmatrices exist. This occurs if and only if the epipolar equation (12.69) is satisfied by a false flow matrix F ( r F). In other words, ambiguity occurs if and only if the object surface r - Z ( 2 ) 2 is such that (JE; ~') - 0
(12.157)
for F r F . For such a surface, the linearization technique in Section 12.4.1 does not work, since any linear combination of F and F is an eigenmatrix of A//for eigenvalue 0. If we put ~ - type3[A[~']], we can write A[~'] - ~ • I. It follows that
(-~; ~') -- (X= T 4- A[x=T]; S[~'] 1L A[F]) - (xxT; S[~']) 4- (~=T; ~ X I ) = ( x , S [ F l x ) + (&, iJ x x).
(12.158)
Substituting eq. (12.28) into this and replacing x by r / Z ( x ) , we obtain
( x ; F) - ( . . S [ P ] . ) - ( Q .
=
(v
z(.) + ~ x ~
)
. ~ x .)
(~.s[/~]~) _ ( ~ ~, x ~ Q=~(~ x ~)1 z(~)~ z ( ~ ) + z(~----?' z(~)
(~. s[Pl~) - (~. ~ • ~1 - (~. ((~. ~), - s [ ~ ] ) z(.)~
~)
(12.159)
where we have used the identity QxT(9 x x) -- ~ • x. Hence, eq. (12.157) holds if and only if the object surface has the form
(r, K ' r ) - (v • ~, r),
(12.160)
where
K'-
S I R ] - (~, ~)* + S [ ~ ] .
(12.161)
12.7.
Critical Surfaces for Optical flow
401
Since eq. (12.160) is identically satisfied by r = 0, the origin 0 of the camera coordinate system is on this surface. If K ~ is nonsingular, eq. (12.160) defines a centered quadric of the following form (see eq. (4.98)):
(r - rc, S ( r -
r e ) ) = 1.
(12.162)
The center r c and the matrix S are respectively given by rc-
12 K ' - l (v • v)
S-
4K I (v x O , K ' - l ( v x ~))"
(12.163)
We call a surface whose equation has the form of eq. (12.160) for some (33)matrix K ~ a weak critical surface.
12. 7.2
S t r o n g critical surfaces
If the object surface is included in a weak critical surface, the epipolar equation (12.69) is satisfied by infinitely many false flow matrices. However, since the true flow matrix F must be decomposable, the correct flow matrix F can be chosen if the false flow matrices are not decomposable. In fact, the correct motion parameters {v, w} can be obtained by the direct optimization (12.49). However, if the epipolar equation (12.69) is satisfied by a false flow matrix that is decomposable, the problem is inherently ambiguous. This occurs if and only if the object surface has the form of eq. (12.160) for which the matrix F in eq. (12.161) can be decomposed into false motion parameters {~, 5~}. This condition is equivalent to saying that there exist vectors ~ (~ v) and ~ (:/: w) such that K ' = (9, r - w ) I - S[~(r - w)T]. (12.164) Let us call a surface that satisfies this condition a strong critical surface. By definition, a strong critical surface is also a weak critical surface and hence a quadric that passes through the origin O of the camera coordinate system. However, the converse does not necessarily hold. In fact, eq. (12.164) implies
= 0,
(
2.165)
and hence eq. (12.160) is identically satisfied by r = c9 for an arbitrary constant c. In other words, the space line r x ~ - 0 is entirely contained in that surface. This implies that a strong critical surface is a hyperboloid of one sheet or its degeneracy (Fig. 12.16a). As in the case of finite motion, all planar surfaces are strong critical surfaces and hence weak critical surfaces as well: an arbitrary space plane H that does not pass through the origin O can be coupled with another space plane H ~ that passes through O and intersects with H perpendicularly in such a way that II and H ~ constitute a degenerate hyperboloid (Fig. 12.16b). Consequently, the 3-D interpretation of a planar surface is inherently ambiguous. In
402
Chapter 12.
3-D Interpretation of Optical Flow
H
O
(~)
(b)
Fig. 12.16. (a) Hyperboloid of one sheet. (b) Two planar surfaces as a degenerate hyperboloid. the following, we construct an algorithm for computing all possible interpretations of a planar surface scene. In order to invoke this algorithm, however, we must first test if the object surface we are observing is a planar surface. This planarity test is derived from a statistical analysis of the image noise.
12.8 12.8.1
Analysis of Planar Surface Optical Flow Optical flow equation for a planar surface
If space plane (n, r) = d is viewed from a camera moving with motion parameters {h, R}, the projected image motion has the following form (see
~qs. (6.66)~d (6.67)): x' =
Ax (k, A x ) '
A - RT(hn T -dI).
(12.166)
If the camera is stationary, we have h - 0 and R - I, so A = - d I . It follows that for an infinitesimal camera motion the matrix A has the form
A = -~(Z + WAt) + O(~t~).
(12.167)
Substitution of this into the first of eqs. (12.166) yields
x + W x A t + O(At 2)
, X
-"
=
(k, =)+ (k, W~)At + O(~t~) +
+ O(At2).
-(k, /
\
(~2.168)
Hence, the optical flow 5: = limAt--~o(X'-- x ) / A t has the form ic =
Wx
-
(k, W x ) x .
(12.169)
12.8.
Analysis of Planar Surface Optical Flow
403
We call W the flow matrix of planar surface motion. Eq. (12.169) implies that the flow 5~ remains the same if the flow matrix W is replaced W + cI for an arbitrary constant c. This indeterminacy originates from the fact that the image motion given by the first of eqs. (12.166) is unchanged if the matrix A is replaced by cA for an arbitrary constant c. In order to remove this indeterminacy, we hereafter impose the following normalization: t r W = 0. (12.170) If the camera motion is infinitesimal, the motion parameters {h, R} have the following form (see eqs. (12.30)): R - I + w x I A t + O(At2),
h - v a t + O(At2).
(12.171)
Substituting these into the second of eqs. (12.166), we obtain x I A t + O(At2)) ( - d l +
A - (I-w =
+
§
•
vnn-At+O(At2))
(12.172)
§
Comparing this with eq. (12.167) and noting the normalization (12.170), we see that the flow matrix W has the following form:
W - - - j 1 ( vn 7- - 3 l ( v , n ) l ) - w
(12.173)
x I.
Alternatively, eq. (12.169) can be obtained from the general optical flow equation (12.28) by noting that if space point r - Z ( x ) x is on space plane (n, r) - d, the depth is given by Z(x)-
d (n,x)"
(12.174)
Substituting this into eq. (12.28) and noting that Q x x - O, we obtain
x
d = QxWx-
12.8.2
Wx-
+ ~'xx
) - - Q x ( vnd
(k, Wx)x.
+wxl)
x
(12.17 )
Estimation of the flow matrix
In the presence of noise, the observed flow ~ does not satisfy eq. (12.169) exactly. As in the case of finite motion analysis, the surface and motion parameters {n, d} and {v, w} can be estimated by solving the following two subproblems:
404
Chapter 12.
Problem
3-D Interpretation of Optical Flow
12.5 Estimate a matrix W of trace 0 such that -
Wx
-
(12.176)
(k, Wx)x
from the observed flow 5. 12.6 Decompose the matrix W into surface and motion parameters {n, d} and {v, w} in such a way that
Problem
W -
1( v,-, T - 5(,,, 1 n)I) - w•
- -~
(~2.177)
Problem 12.5 can be optimally solved by minimizing the integral of the squared Mahalanobis distance, i.e., f
J[W]- I
Js
I1~- w=
+ (k ,
w=)=ll =v 0 [ x ]
dxdy--+ min ,
(12.178)
under the constraint t r W - 0. The first variation of J[W] for an infinitesimal variation W ~ W + 5 W is
5J[W] - 2 ~ (-SWx + (k, 6Wx)x, V0[&] (& - Wx + (k, Wx)x))dxdy
- -2(/~
(Vo[~] (~ -
w=
+ (k, w=)=)=T
- ( x , V0[&] (& - Wx + (k, Wx)x)) kx T) dxdy; 5W).
(12.179)
This must vanish for an arbitrary variation 6 W such that t r 6 W - 0, so we obtain
g Vo[~] ( ( w =
-
(k. w = ) = ) = ~ -
- (=.
Vo[=]( w =
- (k. w = ) = ) ) k = T) ,x,~
where A is the Lagrange multiplier for the constraint t r W - 0. If a 3333tensor A = (Aijkl) and a 33-matrix B are defined by 3
m=l
+ (x, Vo[i~lx)kikk) xjxtdxdy,
B- fs (Void]==T_ (=, y0[=]=)k~T) dxd~,
(12.181) (12.182)
12.8.
Analysis of Planar Surface Optical Flow
405
eq. (12.180) can be written in the following form: (12.183)
. A W - B + AI.
It is easily seen from eq. (12.181) that .41 = O. This is a consequence of the fact that the flow matrix W can be determined only up to a constant multiple of I. It follows that the rank of tensor .4 is 8; its null space is { I } L . Multiplying the generalized inverse A - of tensor A on both sides of eq. (12.183) and noting that A - I = O , we obtain the solution W in the following form: W = A-B. (12.184) The generalized inverse A - is obtained by casting the 3333-tensor ,4 into a 99-matrix, computing its generalized inverse, and casting it back into a 3333-tensor (see Section 2.4.2).
12.8.3
Planarity test
Let I ~ be the optimal estimate given by eq. (12.184). Since V[5~] has rank 2 and the flow matrix W has eight degrees of freedom, the residual f
J[W]- /I1~- w~ + (k, W~)~ll=V0[x]dxdy Js
(12.185)
is a ~2 variable with 2 ( S - 4) degrees of freedom if the noise is Gaussian, where S is the number of the pixels at which the optical flow is defined. This implies that optical flow must be defined at least at four pixels, as expected from the finite motion analysis for a planar surface (see Section 11.6.1). If the covariance matrix V[5~] is decomposed into the noise level e and the normalized covariance matrix V015~] in the form of eq. (12.74), an unbiased estimator of e2 is obtained in the form
~ _ z0[w] -
-
2(S -
4)'
(12.186)
where J0[W] is the normalized residual obtained by replacing V[&] by V0[&] in the expression for J[l/~r]. The expectation and variance of ~2 are given by ~4
E[~2] _
e2,
Vie2] -
S-
4"
(12.187)
Since S is usually very large, ~2 is a very good estimate of e 2. This analysis can be used to test if the observed flow is due to a planar surface motion. Namely, the hypothesis that the object surface is planar is rejected with significance level a% if 6
J[W] >
~2(S_4),a.
(12.188)
2 6Since S is usually very large, we can use the approximation X2(s_4), a ,~ (Na +
v/4S - 17)2/2 ~ (Na + 2v/~)2/2; see Footnote 15 in Section 3.3.4.
406
Chapter 12.
3-D Interpretation of Optical Flow
In terms of the estimate ~2 computed by eq. (12.186), this planarity test takes the following form: ~2 ~22(S_4),a -~ e > 2 ( S - 4)" (12.189) This has the same interpretation as the planarity test for finite motion (see eq. (11.152)). Namely, we compare the estimate ~ computed under the hypothesis that the object surface is planar with the noise level e expected from the accuracy of the image processing operations for detecting optical flow. As in the case of finite motion analysis, the planar surface algorithm can be applied to general optical flow as well. Hence, it is safer to apply the planar surface algorithm than the general optical flow algorithm; it should be switched to the general optical flow algorithm only when assuming planarity is not compatible with the observed optical flow to a convincing degree. The opposite approach has often been adopted in practice: the general optical flow algorithm is used first and abandoned, say, when the second smallest eigenvalue of the moment tensor A// given by eq. (12.77) is sufficiently close to its smallest eigenvalue, and the decision criterion is set arbitrarily. In contrast, the use of the planarity test as described above has a solid statistical foundation. In Chapter 13, we present an alternative criterion which does not involve the significance level.
12.8.4
Computation of surface and motion parameters
We now solve Problem 12.6. Unlike the decomposition of the flow matrix F for general optical flow, no decomposability condition needs to be imposed. In other words, there always exist {n, d} and {v, ω} that satisfy eq. (12.176) exactly, although the decomposition may not be unique. This is easily understood by counting the degrees of freedom. The flow matrix W has eight degrees of freedom since its trace is 0. The surface parameters {n, d} have three degrees of freedom; the motion parameters {v, ω} have five degrees of freedom. Thus, the matrix W has the same degrees of freedom as the surface and motion parameters. The surface and motion parameters {n, d} and {v, ω} are computed by the following procedure:

1. Compute

\tilde{\omega} = -\mathrm{type3}[A[W]].        (12.190)

2. If S[W] = O, then v = 0 and ω = ω̃. The surface parameters {n, d} are indeterminate.

3. Else, let σ_1 ≥ σ_2 ≥ σ_3 be the eigenvalues of S[W], and {u_1, u_2, u_3} the orthonormal system of the corresponding eigenvectors.
Fig. 12.17. Two solutions for the same image transformation.

4. The surface parameters are given by

n = \frac{\sqrt{\sigma_1 - \sigma_2}\, u_1 \pm \sqrt{\sigma_2 - \sigma_3}\, u_3}{\sqrt{\sigma_1 - \sigma_3}},   \frac{1}{d} = \sigma_1 - \sigma_3.        (12.191)
5. The motion parameters are given by

v = \frac{\sqrt{\sigma_1 - \sigma_2}\, u_1 \mp \sqrt{\sigma_2 - \sigma_3}\, u_3}{\sqrt{\sigma_1 - \sigma_3}},   \omega = \tilde{\omega} - \frac{n \times v}{2d},        (12.192)
where the double sign ∓ corresponds to that in eqs. (12.191). 6. For each solution, another solution is obtained by changing the signs of n and v simultaneously. Thus, four solutions are obtained. As in the case of finite motion, this ambiguity is partly due to the fact that the interpretation is solely based on the flow matrix W, not the flow itself, and the fact that the scene behind the camera can be seen (Fig. 11.13). Suppose one solution {n, d} and {v, ω} is obtained. If we 1. move the camera to the other side of the planar surface, and 2. reverse the translation, the new surface and motion parameters are {−n, d} and {−v, ω}, respectively (Fig. 12.17). We can see from eq. (12.177) that the flow matrix W does not change. As in the case of finite motion, however, this ambiguity can be reduced if we also consider which part of the surface we are observing. In fact, if the condition that the visible part of the surface has positive depth is imposed, the number of solutions reduces to at most two, in most cases one. This is because spurious interpretations are in most cases such that the surface is entirely behind the camera or part of it is behind the camera. In the latter
case, the vanishing line passes through the visible part, which is impossible because only one side of it is visible (Fig. 12.18; see Section 4.6.2). From eq. (12.174), the condition to be imposed is

\frac{d}{(n, x)} > 0.        (12.193)

The 3-D position r is given by

r = \frac{d\, x}{(n, x)}.        (12.194)

Fig. 12.18. The vanishing line cannot pass through the visible part of the planar surface; only one side of it is visible.
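A minimal sketch of how eqs. (12.193) and (12.194) might be used to back-project the image points and discard spurious solutions is shown below; the function name and array conventions are illustrative assumptions.

import numpy as np

def reconstruct_plane_points(xs, n, d):
    """Back-project image points x (with (k, x) = 1) onto the plane
    (n, r) = d by eq. (12.194), r = d x / (n, x), returning None if the
    positive-depth condition (12.193) fails for some point, which is
    how spurious {n, d}, {v, omega} solutions are rejected."""
    xs = np.asarray(xs, dtype=float)      # shape (S, 3), third component 1
    denom = xs @ np.asarray(n, dtype=float)
    if np.any(d * denom <= 0.0):          # violates d / (n, x) > 0
        return None
    return (d / denom)[:, None] * xs      # eq. (12.194)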
Example 12.5  Fig. 12.19a shows a simulated optical flow image (512 × 512 pixels) of a planar grid in the scene viewed from a moving camera. The focal length is assumed to be f = 600 (pixels). We added Gaussian noise of mean 0 and standard deviation σ (pixels) to the x and y components of the flow at each point independently (the noise level ε = σ/f is not used in the reconstruction computation). We assume that the correct solution can be distinguished from spurious solutions. Fig. 12.19b shows the reconstructed grid for σ = 1, 2, 3. The true position is superimposed in dashed lines. We can see that the reconstruction error increases as the noise increases.
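The noise model used in this simulation can be sketched as follows, assuming the flow is stored as an array whose first two columns are the x and y components; the function name is hypothetical.

import numpy as np

def add_flow_noise(flow, sigma=1.0, f=600.0, rng=None):
    """Perturb a synthetic flow field as in Example 12.5: independent
    Gaussian noise of mean 0 and standard deviation sigma (pixels) is
    added to the x and y components at each point; the corresponding
    noise level is eps = sigma / f."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = np.array(flow, dtype=float)
    noisy[:, :2] += rng.normal(0.0, sigma, size=noisy[:, :2].shape)
    return noisy, sigma / f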
12.9
Camera Rotation and Information

12.9.1
Rotation estimation
So far, we have assumed that v ≠ 0. If v = 0, no 3-D information can be obtained from the optical flow. Hence, we must test if v = 0, i.e., if the camera motion is a pure rotation, before attempting 3-D reconstruction. In order to do this rotation test, we first hypothesize that v = 0, i.e., the camera motion is a pure rotation around the center of the lens. If no image noise exists, eq. (12.28) implies that the flow should have the form ẋ = Q_x(x × ω).
Fig. 12.19. (a) Optical flow of a planar grid in the scene. (b) 3-D reconstruction for σ = 1, 2, 3.

Since the observed flow may be perturbed by image noise, we consider the following problem.

Problem 12.7  Estimate a vector ω such that
\bar{\dot{x}} = Q_x(x \times \omega)        (12.195)
from the observed flow ẋ. This problem can be solved in the same way as Problem 12.6: we minimize the integral of the squared Mahalanobis distance, i.e.,

J[\omega] = \int_S \| \dot{x} - Q_x(x \times \omega) \|^2_{V[\dot{x}]}\, dx\, dy \;\to\; \min.        (12.196)
Noting the identities

\| \dot{x} - Q_x(x \times \omega) \|^2_{V[\dot{x}]} = \| \dot{x} \|^2_{V[\dot{x}]} - 2(\dot{x}, V[\dot{x}]^- Q_x(x \times \omega)) + \| Q_x(x \times \omega) \|^2_{V[\dot{x}]},        (12.197)

(\dot{x}, V[\dot{x}]^- Q_x(x \times \omega)) = (Q_x^\top V[\dot{x}]^- \dot{x},\, x \times \omega) = -(x \times Q_x^\top V[\dot{x}]^- \dot{x},\, \omega),        (12.198)

\| Q_x(x \times \omega) \|^2_{V[\dot{x}]} = (Q_x(x \times \omega), V[\dot{x}]^- Q_x(x \times \omega)) = (x \times \omega,\, Q_x^\top V[\dot{x}]^- Q_x(x \times \omega)) = (\omega,\, (x \times Q_x^\top V[\dot{x}]^- Q_x \times x)\, \omega),        (12.199)

we can rewrite the function J[ω] in the form

J[\omega] = C - 2(b, \omega) + (\omega, A\omega),

where

A = \int_S x \times Q_x^\top V[\dot{x}]^- Q_x \times x \, dx\, dy,        (12.200)
b = -\int_S x \times Q_x^\top V[\dot{x}]^- \dot{x} \, dx\, dy,   C = \int_S \| \dot{x} \|^2_{V[\dot{x}]} \, dx\, dy.        (12.201)

Differentiating J[ω] with respect to ω and setting the result to 0, we obtain

A\omega = b.        (12.202)
The matrix A is nonsingular in general 7, so eq. (12.202) gives a unique solution.
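The following sketch shows how eqs. (12.200)-(12.202) might be evaluated numerically, replacing the integrals by sums over the pixels at which flow is defined; the function and argument names, and passing the projection matrices Q_x explicitly, are assumptions of this sketch.

import numpy as np

def estimate_rotation(points, flows, V_inv, Qs):
    """Accumulate A and b of eqs. (12.200) and (12.201) over the flow
    pixels and solve A omega = b (eq. (12.202)) for the rotation
    velocity omega, assuming a pure camera rotation."""
    def cross_mat(a):                      # matrix of the product a x ( . )
        return np.array([[0.0, -a[2], a[1]],
                         [a[2], 0.0, -a[0]],
                         [-a[1], a[0], 0.0]])
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for x, xdot, Vi, Q in zip(points, flows, V_inv, Qs):
        Xc = cross_mat(x)
        M = Q.T @ Vi @ Q
        A += Xc.T @ M @ Xc                 # quadratic-form matrix of eq. (12.200)
        b += -Xc @ (Q.T @ Vi @ xdot)       # integrand of eq. (12.201)
    return np.linalg.solve(A, b)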
12.9.2
Rotation test
Let ω̂ be the resulting estimate. Since V[ẋ] generally has rank 2 and the vector ω has three degrees of freedom, the residual

J[\hat{\omega}] = \int_S \| \dot{x} - Q_x(x \times \hat{\omega}) \|^2_{V[\dot{x}]}\, dx\, dy = \int_S \| \dot{x} \|^2_{V[\dot{x}]}\, dx\, dy - (\hat{\omega}, A\hat{\omega})        (12.203)
is a χ² variable with 2S − 3 degrees of freedom. If the covariance matrix V[ẋ] is decomposed into the noise level ε and the normalized covariance matrix V_0[ẋ] in the form

V[\dot{x}] = \varepsilon^2 V_0[\dot{x}],        (12.204)

an unbiased estimator of ε² is obtained in the form

\hat{\varepsilon}^2 = \frac{J_0[\hat{\omega}]}{2S - 3},        (12.205)
where J_0[ω̂] is the normalized residual obtained by replacing V[ẋ] by V_0[ẋ] in the expression for J[ω̂]. The expectation and variance of ε̂² are given by

E[\hat{\varepsilon}^2] = \varepsilon^2,   V[\hat{\varepsilon}^2] = \frac{2\varepsilon^4}{2S - 3}.        (12.206)
Since S is usually very large, ε̂ is a very good estimate of ε. The above analysis is based on the hypothesis that v = 0, i.e., the camera motion is a pure rotation. It follows that the rotation hypothesis can be tested by the standard χ² test: the hypothesis v = 0 is rejected with significance level a% if

J[\hat{\omega}] > \chi^2_{2S-3, a}.        (12.207)

In terms of the estimate ε̂² computed by eq. (12.205), this rotation test takes the following form:

\hat{\varepsilon}^2 > \frac{\chi^2_{2S-3, a}}{2S - 3}\, \varepsilon^2.        (12.208)
This has the same interpretation as the planarity test (see eq. (12.189)). Namely, we compare the estimate ε̂ computed under the hypothesis v = 0 with the noise level ε expected from the accuracy of the image processing operations for detecting optical flow.

⁷The matrix A is singular if the rank of V[ẋ] is less than 2 or the flow is defined only along a special type of curve on the image plane. We ignore such pathological cases.
12.9.3
Information in optical flow
We can define the information of optical flow as in the case of finite motion (see Section 11.7.2). Let ω̂ be the rotation velocity estimated by the method described in Section 12.9.1. Define

\dot{x}_u = Q_x(x \times \hat{\omega}).        (12.209)

This is the rotational flow that best "mimics" the observed flow ẋ; it contains no information about the 3-D structure of the scene. Letting \dot{x}_i = \dot{x} - \dot{x}_u, we can write

\dot{x} = \dot{x}_i + \dot{x}_u.        (12.210)
This equation can be interpreted as decomposing the observed flow into the part that contains information about the 3-D scene and the part that contains no information. Let us call ẋ_i the informative part and ẋ_u the uninformative part. In general, the accuracy of 3-D interpretation decreases as the image noise increases. As in the case of finite motion, however, the negative effect of image noise cannot be measured simply by its magnitude but must be compared with the magnitude of the informative part of the optical flow. Let us call the normalized residual J_0[ω̂] the information in the optical flow. We observe the following:

• The information is minimum if v = 0; it generally increases as ||v|| increases.
• If v is the same, the information increases as the variation of the depth becomes larger.
• The information is unchanged if an arbitrary rotational velocity ω is added to the camera motion.
• The information increases as the number of pixels at which optical flow is defined increases.

Thus, 3-D interpretation can be disrupted by image noise of a very small magnitude if the information is very small, while 3-D interpretation is robust to image noise if the information is sufficiently large. If a rotational velocity is added to the camera motion, the flow magnitude increases, thereby decreasing the relative noise magnitude. However, such an apparent decrease of the relative noise magnitude does not increase the accuracy of 3-D interpretation, because camera rotation does not add any 3-D information to the optical flow. Thus, the relative noise magnitude is a very deceptive measure of the accuracy of 3-D interpretation. This viewpoint is further extended in a more general framework in Chapter 13.
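A minimal sketch of the decomposition (12.209)-(12.210) and of accumulating the information J_0[ω̂] is given below, again replacing the integral by a sum over pixels; the names are illustrative assumptions.

import numpy as np

def flow_information(points, flows, Qs, omega_hat, V0_inv):
    """Split each observed flow into the uninformative rotational part
    xdot_u = Q_x(x x omega_hat) and the informative part
    xdot_i = xdot - xdot_u, and accumulate the normalized residual
    J0[omega_hat], which serves as the information in the flow."""
    informative, uninformative, info = [], [], 0.0
    for x, xdot, Q, V0i in zip(points, flows, Qs, V0_inv):
        xdot_u = Q @ np.cross(x, omega_hat)     # eq. (12.209)
        xdot_i = xdot - xdot_u                  # eq. (12.210)
        informative.append(xdot_i)
        uninformative.append(xdot_u)
        info += xdot_i @ V0i @ xdot_i           # contribution to J0[omega_hat]
    return informative, uninformative, info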
Fig. 12.20. (a) Optical flow of a scene viewed from a translating camera. (b) Optical flow of the same scene viewed from a rotating camera.
Example 12.6  Fig. 12.20a shows an optical flow of a scene with a small depth variance viewed from a translating camera. It is widely known that 3-D interpretation of this type of optical flow is very difficult in the presence of noise, because a similar optical flow would be obtained by a camera rotating around the center of the lens (Fig. 12.20b), in which case 3-D interpretation is theoretically impossible. For flows such as shown in Fig. 12.20a, accurate 3-D interpretation is possible only if the image noise is very small. In other words, the image noise must be small as compared with the information in the flow.

Example 12.7  Fig. 12.21 shows the informative part and the uninformative part of the optical flow given in Fig. 12.19. The informative part of this flow has a sufficiently large magnitude, so we can expect a stable 3-D interpretation.
12.9.4
Midpoint flow approximation
In applying the theory described in this chapter to an optical flow detected from real images, we must note the discrepancy between the theoretical optical flow and the real optical flow. Theoretically, optical flow is defined as an instantaneous velocity: it is defined by differentiation of a continuous image motion. On the other hand, what can be detected by image processing operations is the inter-frame displacement of each pixel. These two definitions differ even if optical flow is exactly detected with subpixel accuracy. This difference becomes apparent when the trajectory of an image point is not straight. In fact, the theoretical optical flow is tangent to the trajectory, while the inter-frame displacement is a chord connecting two points on it (Fig. 12.22a). Suppose an image point x moves to x′ in the next frame. A simple and effective way of avoiding the above discrepancy is to identify the
displacement x′ − x with the flow at the midpoint (x + x′)/2. Let us call this the midpoint flow approximation (Fig. 12.22b).

Fig. 12.21. (a) Informative part of the flow shown in Fig. 12.19a. (b) Uninformative part.
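A minimal sketch of the midpoint flow approximation, assuming corresponding image points in two consecutive frames are given as arrays, is shown below; the function name is hypothetical.

import numpy as np

def midpoint_flow(x, x_next):
    """Identify the inter-frame displacement x' - x with the
    instantaneous flow assigned at the midpoint (x + x')/2 rather than
    at x (midpoint flow approximation)."""
    x = np.asarray(x, dtype=float)
    x_next = np.asarray(x_next, dtype=float)
    return 0.5 * (x + x_next), x_next - x      # (midpoints, flow vectors)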
Example 12.8  Fig. 12.23 shows superimposition of simulated consecutive images of a planar grid in the scene viewed from a moving camera (no image noise is added). The image frame has 512 × 512 pixels, and the focal length is assumed to be f = 600 (pixels). Fig. 12.24a shows the 3-D shape reconstructed by identifying the displacement field with an optical flow. The true shape is superimposed in dashed lines. Fig. 12.24b shows the corresponding result obtained by the midpoint flow approximation. We can see that the midpoint flow approximation dramatically increases the accuracy of 3-D reconstruction.
Fig. 12.22. (a) The difference between the theoretical optical flow and the real image optical flow. (b) Midpoint flow approximation.
Fig. 12.23. Two consecutive image frames superimposed.
Fig. 12.24. 3-D reconstruction from Fig. 12.23: (a) direct optical flow approximation; (b) midpoint flow approximation.
Chapter 13
Information Criterion for Model Selection

In order to apply geometric correction and parametric fitting presented in Chapters 5 and 7, we first need to know the geometric model, i.e., the constraints and hypotheses that should hold in the absence of noise. But how can we prefer one geometric model to another? In this chapter, the AIC introduced in Section 3.7 is modified so that it can be used for selecting a plausible geometric model, which is identified with a manifold. We show that the complexity of the model is evaluated by not only its degrees of freedom but also such invariant quantities as the dimension and codimension of the manifold. We also present a procedure for evaluating the relative goodness of one model to another without using any arbitrarily set threshold such as the significance level of the χ² test. This comparison criterion is applied to point data in two and three dimensions, 3-D reconstruction by stereo vision, 3-D motion analysis, and 3-D interpretation of optical flow.
13.1
Model Selection Criterion

13.1.1
Model estimation
In the preceding chapters, we have formulated many types of statistical test, all of which have the form of comparing the a posteriori noise level ε̂ estimated under a hypothesis with the a priori value ε expected from the accuracy of the data acquisition process (image processing, range sensing, etc.). As has often been pointed out, however, it is very difficult to predict the noise level ε a priori in real situations. It can be estimated a posteriori only if the hypothesis is true. Then, is it not possible to test the hypothesis without using the a priori noise level ε? In this chapter, we show that although this is impossible in general, we can compare two hypotheses for the same data and determine which hypothesis is more plausible. In order to show this, we must first generalize the problem in abstract terms. Let a_1, ..., a_N be m-vector data sampled from an m′-dimensional manifold A ⊂ R^m, which we call the data space. We write

a_\alpha = \bar{a}_\alpha + \Delta a_\alpha,        (13.1)

where ā_α is the true position of datum a_α. The noise term Δa_α is assumed to be an independent Gaussian random variable of mean 0 and covariance matrix V[a_α]. We assume that the covariance matrix V[a_α] has range T_{\bar{a}_\alpha}(A) (the tangent space to A at ā_α) and hence has rank m′.
Fig. 13.1. Estimating a manifold S ⊂ A and ā_α ∈ S, α = 1, ..., N, from the data {a_α}.
We want to find a manifold S ⊂ A such that the true values ā_α are all in it. This problem is stated as follows (Fig. 13.1):

Problem 13.1  Estimate a manifold S ⊂ A and m-vectors {ā_α} such that ā_α ∈ S, α = 1, ..., N, from the data {a_α}.

In order to solve this problem, we parameterize the manifold S by an n-vector u constrained to be in an n′-dimensional manifold U ⊂ R^n, which we call the parameter space. It follows that the manifold S has n′ degrees of freedom, and Problem 13.1 reduces to parametric fitting: we want to compute an optimal value of the parameter u ∈ U. If S has 0 degrees of freedom, it is a fixed manifold, and Problem 13.1 can be identified with geometric correction: we want to correct each a_α into â_α ∈ S in an optimal manner. Thus, Problem 13.1 generalizes both geometric correction and parametric fitting. We call a parameterized manifold a (geometric) model. If it is a d-dimensional manifold in an m′-dimensional data space and has n′ free parameters, we say that the model has dimension d, codimension r = m′ − d, and n′ degrees of freedom. If the model is specified by L equations in the form
k - 1, .... , L ,
(13.2)
the co dimension r of the manifold was called the rank of the "hypothesis" (or "constraint" if the parameter u does not exist) in the preceding chapters (see Sections 5.1.1 and 7.1.1). E x a m p l e 13.1 Suppose N image points Xl, ..., xN are observed. The d a t a space is the entire image plane X - {x E 7~3I(k, x) - 1}. Let ~ be the true position of x~. We write x~ - 2~ + A x ~ and regard the noise t e r m A x a as an independent Gaussian random variable of mean 0 and covariance m a t r i x V[x~], which we assume has range X and rank 2. The following are typical models for the image point d a t a {x~}" 1. Image point model: the true positions 5~, a - 1, ..., N , all coincide (Fig. 13.2a). This model is an image point p - {3}; it has dimension
13.1.
Model Selection Criterion
X
417
X
X X a - "~ a @
.................... o / / X ' " ' " .... "".o
O
(~)
(b)
(c)
Fig. 13.2. (a) Image point model p. (b) Image line model I. (c) Image plane model X. 0, codimension 2, and two degrees of freedom (an image point has two degrees of freedom; see Section 4.1.1). If p is constrained to be at the image origin, the constrained model p~ - {k} has zero degrees of freedom.
2. Image line model: the true positions 5~, c~ = 1, ..., N , are all collinear (Fig. 13.2b). This model is an image line 1 = {(n,m) = 0}; it has dimension 1, co dimension 1, and two degrees of freedom (an image line has two degrees of freedom; see Section 4.1.2). If 1 is constrained to pass through the image origin, the constrained model l' = {(n, m) = 0 - - (n, k) = 0} has one degree of freedom.
3. Image plane model" no constraint is imposed on the true positions 5~, - 1, ..., N (Fig. 13.2c). This model is the entire image plane A'; it has dimension 2, codimension 0, and zero degrees of freedom.
E x a m p l e 13.2 Suppose N space points ~'1, . . . , ~ ' N are observed. The data space is 7~3 itself. Let ~ be the true position of r~. We write r~ = ~ + A r ~ and regard the noise term A r ~ as an independent Gaussian random variable of mean 0 and covariance matrix V[r~], which we assume has range 7~3 and rank 3. The following are typical models for the space point data {r~}:
1. Space point model: the true positions ~ ,
a = 1, ..., N, all coincide (Fig. 13.3a). This model is a space point g = {e}; it has dimension 0, codimension 3, and three degrees of freedom (a space point has three degrees of freedom; see Section 4.2.1). If P is constrained to be at the coordinate origin, the constrained model p / _ {0} has zero degrees of freedom.
2. Space line model: the true positions ~ , a - 1, ..., N, are all collinear (Fig. 13.3b). This model is a space line L - {r x p - n}; it has dimension 1, codimension 2, and four degrees of freedom (a space line
418
Chapter 13.
Information Criterion for Model Selection
Z
r(~
Z
e
r a
Y
X I
.,
x
(~)
(b)
Z r a ~ i" a
roe
y
(c)
(d)
Fig. 13.3. (a) Space point model P. (b) Space line model L. (c) Space plane model II. (c) Space model 7~3. has four degrees of freedom; see Section 4.2.2). If L is constrained to pass through the coordinate origin, the constrained model L ~ - {r • p = O} has two degrees of freedom.
3. Space plane model: the true positions ~ , a - 1, ..., N, are all coplanar (Fig. 13.3c). This model is a space plane II - {(n, r ) - d}; it has dimension 2, codimension 1, and three degrees of freedom (a space plane has three degrees of freedom; see Section 4.3.1). If II is constrained to pass through the coordinate origin, the constrained model II' - {(n, r) = 0}; has two degree of freedom.
4. Space model: no constraint is imposed on the true positions ~ , c~ - 1, ..., N (Fig. 13.3d). This model is the entire space 7~3; it has dimension 3, codimension 0, and zero degrees of freedom.
13.1.2
Minimization of expected residual
Given a particular model $, we can compute an optimal solution as described in Chapters 5 and 7. Namely, we can obtain the m a x i m u m likelihood solution
13.1.
Model Selection Criterion
419
of Problem 13.1 by minimizing the sum of the squared Mahalanobis distances N
J - ~-~(a~ - gs, V [ a s l - ( a s - g s ) )
(13.3)
c~=l
under the constraint that a s E $, c~ - 1, ..., N (see eq. (5.6), (7.5), and (7.12)). Let {hs } and S be the resulting maximum likelihood estimators of {as} and S, respectively. Substituting them back into the function J, we write the residual in the following form: N
- Z
- a II E 1.
(13.4)
s--1
This quantity measures the minimum discrepancy between the assumed model $ and the data {as }, so this appears to be a good measure of the goodness of the model. However, because {~is } and S are determined so as to minimize the residual for the current data {as}, the residual can be made arbitrarily small. In fact, if the manifold $ has a sufficient number of parameters, we can make S pass through all the data {as }. Such an artificial model may explain the current data but may be unable to predict occurrence of the data to be observed in the future. In order to measure the predicting capacity of the model, we adopt the criterion introduced in Section 3.7.1. Namely, we consider future data { a ; } that have the same distribution as the current data {as } and require that the residual for the future data N
J[{as},* {ds},S] - E
lids* - &s 112-,,vra~l~__J
(13.5)
be small. Since this is a random variable, we take expectation to define a definitive value for the model" I ( $ ) - E*[E[J[{a~}, {&s},S]]].
(13.6)
Here, E*[. ] and E[. ] denote expectation with respect to the future data {a~ } and the current data {a~}, respectively. We call I ( $ ) the expected residual of model 8 and regard $ as good if I ( $ ) is small. E x a m p l e 13.3 Consider the three models described in Example 13.1. For whatever data {x~ }, the residual of the image line model 1 is always no more than the image point model p, and the residual of the image plane model X is always 0 since the maximum likelihood estimator 5~ of the true value 5~ is the datum x~ itself. The same can be said for Example 13.2: for whatever data {r~}, the residual of the space line model L is always no more than
420
C h a p t e r 13.
I n f o r m a t i o n Criterion for Model Selection
Fig. 13.4. The Mahalanobis projection of a~ onto manifold S is the tangent point of the equilikelihood surface to S. t h e space point model P; the residual of the space plane model H is always no m o r e t h a n the space line m o d e l L; the residual of the space m o d e l g 3 is always 0. Thus, the residual is not a g o o d m e a s u r e of the goodness of the model.
13.2
Mahalanobis Geometry Mahalanobis projection
13.2.1
First, we consider the case in which the model $ has no free p a r a m e t e r s (n' = 0). As discussed in Section 5.1.1, if the t r u e value 6.~ of d a t u m a~ is in S, the m a x i m u m likelihood e s t i m a t o r &~ of 6.~ is the point in S in the shortest distance from a~ m e a s u r e d in the Mahalanobis distance with respect to the covariance m a t r i x V[a~]. Geometrically, &~ is the tangent point of the equilikelihood surface ( a - a~, V [ a ~ ] - ( a - a~)) - constant, a E ,4, to t h e manifold $ (Fig. 13.4). Let us call &~ the Mahalanobis projection of a~ onto $ with respect to V[a~]. In t h e following, the proviso "with respect to V[a~]" is o m i t t e d as u n d e r s t o o d . W e a s s u m e t h a t the noise is sufficiently small, by which we m e a n t h a t V[a~] - O(e2), a - 1, ..., N , for an a p p r o p r i a t e l y defined noise level e, as c o m p a r e d with which the d a t a space ,4 and the manifold $ are b o t h ass u m e d to be sufficiently s m o o t h 1. T h e following p r o p o s i t i o n is easily o b t a i n e d (Fig. 13.5)"
Proposition Ila
-
13.1
The following equality holds to a first approximation:
l12[a ]
-
Ila
-
21VIal] +
-
Iv[a ]2 ,
(13.7)
1A precise statement is that the radius of curvature, which is defined as the reciprocal of the total curvature, of the manifold is sufficiently small. This is the fundamental assumption for defining a (local) Gaussian distribution over a manifold (see Section 3.2.2). This assumption also played a fundamental role in Section 3.5.2.
13.2.
Mahalanobis Geometry
421
aa #
Fig. 13.5. The relation between datum as, its true value fi~ E $, and the Mahalanobis projection g~ of as onto S.
Proof. A point ac~ ^' on t h e line t h a t connects 6,~ and ac~ can be expressed in the form a"~ - a s + t ( ~ - ao,) (13.8) for a real n u m b e r t. If t is small, point a^ s is on 3 to a first a p p r o x i m a t i o n . Hence,
II'~<~-a<~ll.[a<~] ^'~
. II(a~ . . . a~)
2t(a~, -
-- Ila.~ - a.o~ I1~[~,~] +dlla~
-
a~)l I~E,~ol
t(a~
ao~, V[a~]-
(ao~ - ao~))
a,~ll~Eao],
(13.9)
which should take a m i n i m u m at t - 0 by the definition of the M a h a l a n o b i s p r o j e c t i o n ~i~. Hence, (~
- a~, v[~o]-(a~
- a~)) -
0.
(13.10)
F r o m this a n d I1~ - a~ll~E~oj
-
I1(~o - ao) - (a
= Ilao~ - ao, ,,.~,,vrao~
- a~)ll ~vial]
-2(a~,--aoV[a~,]-(a~, (13.11)
we o b t a i n eq. (13.7). P r o p o s i t i o n 13.1 can be i n t e r p r e t e d as the Pythagoras theorem, stating t h a t t h r e e points a s , &~, a n d ~ define a "right-angled triangle," where the length is m e a s u r e d in the M a h a l a n o b i s distance a n d t h e orthogonality of vectors u and v is defined by (u, V[a~]-v) = 0. In this sense, ei~ is the "foot" of the " p e r p e n d i c u l a r line" d r a w n from a s to $.
422
Chapter 13.
Information Criterion for Model Selection a n
Fig. 13.6. The Mahalanobis projections ~, and fi~ of datum as onto the true manifold ,~ and the the optimally fitted manifold ,~, respectively. 13.2 To a first approximation, II~-a~ll~Ea~] , II~-a~ll = vial], and Ilda - a , ll~[a~] are X 2 variables with m ~, r, and d degrees of freedom, respectively.
Proposition
Proof. By definition, a s - a s is a Gaussian random variable of mean 0 and covariance matrix V[a~] of rank rn' (see eq. (3.61)). Hence, Ila~ - a~ll =VEao] is a X2 variable with rn' degrees of freedom. The fact that Ila~ - a ~ l l =v[a~] is a X2 variable with r degrees with freedom was proved in Section 5.1.5 (see eqs. (5.6) and (5.35)). Since a s - d ~ is obtained from a s - ~ by a linear mapping, &~ - 5~ - ( a s - d~) - ( a s - &~) is a ~2 variable with m ' - r ( d) degrees of freedom (see eq. (3.64)). [] C o r o l l a r y 13.1 To a first approximation,
E[lla~ E[lla~
13.2.2
- a,~ll~[ao]] 2
- a~,llv[ao]]
-
r,
-
d.
(13.12)
Residual of model fitting
Now, we consider the case in which the model S has n ~ free parameters. Let be its m a x i m u m likelihood estimators. We assume that this model contains the true manifold 3 ^(Fig. 13.6). Let h~ and h~ be the Mahalanobis projections of a~ onto S and S, respectively. Since each d a t u m a~ is independent, we obtain from Proposition 13.2 the following propositions: Proposition
13.3 The residual N 2 o<--1
13.3.
Expected Residual
423
is a X 2 variable with m ' N degrees of freedom.
P r o p o s i t i o n 13.4 The residual N
J[{as},{Ss} , S ] -
~ l l a . - a.I Iv[as] 2
(13.14)
s=l
is a ~(2 variable with r N degrees of freedom.
In other words, the degrees of freedom of the residual decrease from m t N to r N if the true value as is replaced by the Mahalanobis projection as. However, if as is projected onto the maximum likelihood estimator $, the degrees of freedom of the residual further decrease by n ~, as shown in Section 7.1.4 (Fig. 13.6). Namely, P r o p o s i t i o n 13.5 The residual N
J[{as,} {ds },,~] - E I l a s -
ds 112[a,]
(13.15)
s----1
is a X 2 variable with r N -
n' degrees of freedom.
This proposition implies that if the model $ has r N or more independent parameters, the residual given by eq. (13.15) is generally 0, confirming the fact that the residual is not a good measure of the goodness of the model. From Propositions 13.3-13.5, we obtain the following corollary: C o r o l l a r y 13.2 The following relationships hold: N
E [ ~ liar- a~llv[aol] 2
m ! -
N,
s=l N
E[~-~ Ila~ '~-II~[aol]- rN, -
s--1 N
E[ E I l a s
-
aoll~Eaol]-
rN
-
,~'.
(13.16)
s--1
13.3 13.3.1
Expected
Residual
E v a l u a t i o n of the expected residual
We now evaluate the expected residual I($) defined by eq. (13.6). Let ~s E $ be the true position of datum as, and let hs and as be the Mahalanobis projections of as onto S and S, respectively. Let a~ be the future datum corresponding to as (Fig. 13.7).
424
Chapter 13.
Information Criterion for Model Selection
aa
a a
Fig. 13.7. Evaluation of the expected residual. L e m m a 13.1 To a first approximation,
I ( $ ) -- E [ J [ { a , ), {a,},,5]] + E [ J [ { a , } , { a , } , S]] + 2 d N + n'.
(13.17)
Proof. Since the maximum likelihood estimators {&~ } are determined from the current data {as}, they are independent of the future data {a;}. Hence, eq. (13.6) reduces to N
I(S)-E*[E[~,II(a~* -
~)
-
(a~ - ~)2__ IIv[a~]]]
c~=1 N
= E*[~
N
Ilao,* - a,~ll~[ao]] + E[~
Ila,~ - a,~ll~[ao]].
(~3.~s)
~=1
c~=l
Since {a~}* and {as} have the same distribution, we have E*[II%*-a~ II~[a~]] = E[[la~ - as[[2[a~] .] Hence, eq. (13.18) can be written as N
N
z(s) - E [ ~
Ila,~ - ao~[l~[a,~]] + E [ ~
o~--1
Ilao~ - ao~ll~[a,~]].
(13.19)
o~--1
Consider the first term on the right-hand side. Applying Proposition 13.1, we can write N
N 2
E [ y ~ Ila~ - a~ll~[~o]] - E [ ~ c~--1
N
Ila~ - a~ll~[~o]] + E [ ~
(r=l
Ila~ - a~ll~[ao]].
or--1
(13.20) From Corollary 13.2, we have N
E[~lla~ a=l
N
- &~ll~[a~]]- E [ ~ o~--1
[[a~ - &~ [l~[a~] l+n'.
(13.21)
13.3.
Expected Residual
425
Corollary 13.1 implies N
e[~ I1~ -
a~ll~[a~]] - dN.
(13.22)
Ila~ - a ~ II~[a~]]+ d N + ~'
(13.23)
o,-'1
Hence, eq. (13.20) reduces to N
E[~
N
Ila~ - a~ll~[a~]] - E [ ~
ct=l
c~=l
Next, consider the second term on the right-hand side of eq. (13.19). If the noise is small, the Mahalanobis projection of g~ E ,r onto the true manifold S coincides to a first approximation with the Mahalanobis projection 6,~ of the datum a s onto S (Fig. 13.7). Hence, we obtain from Proposition 13.1 and eq. (13.22) N
N
- a~ll ~v[ao] = ~
~lla~ c~=l
N
Ila,~ - a,~ll~[ao] + ~
c~=1
Ila~ - a~ll~[ao]
c~=l
N
= ~
ila~ - a,~ilL[ao] +
(13.24)
tiN.
c~--1
Substituting eqs. (13.23) and (13.24) into eq. (13.19), we obtain eq. (13.17).D
13.3.2
Accuracy of parametric fitting
In order to evaluate the second term on the right-hand side of eq. (13.17), we need an explicit representation of the model ,S. Suppose ,S is given by L equations F (}) (a, u) = 0, k - 1, ..., L, (13.25) !
parameterized by an n-vector u constrained to be in an n-dimensional parameter space/2 C T~~. As in Sections 5.1.1 and 7.1.1, we assume that the hypothesis (13.25) is nonsingular 2 and has rank r. This means that 1. only r of the L equations are algebraically independent, 2. each equation defines a manifold of codimension 1 in the data space A, and
m
' -dimensional
3. the L manifolds intersect with each other transversally (see Section
3.2.1). 2 T h e d e t a i l e d m a t h e m a t i c a l a r g u m e n t will be given in t h e n e x t c h a p t e r .
426
Chapter 13.
Information Criterion for Model Selection
It follows that the intersection of the L manifolds is a manifold of codimension r.
Let ~ be the true value of u, i.e., the value that realizes the true manifold S. The moment matrix is defined as follows (see eq. (7.23)): N
M- - E
L
E 17v(kl)(P"L/ VaF~- ( k ) o~=1 k,l=l
) (PUVa/>('))
(lg~(k~ - ((VaF~(k), V[a,lVaF(~O))
T
- .
(13.26)
(13.27)
Here, P ~ is the n-dimensional projection matrix onto the tangent space Tu (b/) to the manifold b / a t ~. The symbol Va denotes differentiation with respect to the argument a, and Va/~(k) is the abbreviation of V a F (k) (da, ~). We also use the abbreviated notation for generalized inverse introduced in Chapters 5 and 7: eq. (13.27) has the same meaning as eq. (7.8). Let it be the maximum likelihood estimator of u, i.e., the value that realizes the maximum likelihood estimator S of $. L e m m a 13.2 To a first approximation, N
Ila~ - a ~ l l ~ [ ~ ]
- (~ - g , M ( 6
- ~))-
(13.28)
c~=l
Proof. The Mahalanobis projection ha of &~ E S onto g is given as follows (see eq. (5.17))" L
&~
-
&~ -
V[a~]E
17V(kl)F(k)(&~' U)Va/>(t)'
(13.29)
k,l=l
(W~-(kl))_ ((X7aP(k) , V[a~lVa~j)))
-.
(13.30)
Here, VaF~(k) is the abbreviation of V a F (k) (a,, ~). Since d , E S, we have
F(k)(a~,it) - O,
k - 1,...,L.
(13.31)
Letting A it - i t - ~ and taking a linear approximation, we obtain F <~) (ao. a) - F <~) ( a . . ~ - h a )
- - ( v ~ , > . (k), h a ) .
(13.32)
where Vu denotes differentiation with respect to the parameter u, and V u F (k) is the abbreviation of VuF(k)(&a, it). Since eq. (13.32) is a linear approximation in A/t, we can replace Vu[:~ k) by V u / ~(k) (= the abbreviation of
13.3.
Expected Residual
427
voW:?/ a o
----~-p.,
V__o
a a
aa
Fig. 13.8. Accuracy of the optimal fitting VuF(k)(d~, fi)) with errors of O(A/~) 2 (see Fig. 13.8). Hence, eq. (13.29) is written to a first approximation as L
ao
-
ao
.[~1 E
-
k,l=l
- (~')
L (~)
(13.33)
From this, we see that N c~--I N
L
= Z(v[~ol E #.-~')v~~)v~P2 ~ ~ , c~=1
k,l=l L
~
V[aa]-V[aa]
Wt a~( m n ) ~ T a ~ , ( m ) ~ 7 u / ~ ( n ) T A / S )
m,n=l N
--(~, ( E a=l
L
Z
k,l,m,n=l
W(~'~W~m~
~o,0~(~)~
V[a~]V[a~l-V[a~lVaP(~m)VuP(.')r) Ai~) -- l , n = l
= (A/~,
l~(t~)Vule(t)Vu/>(n)T c~=l 1
1
)
k,m=l
)
A/~),
(13.34) ~
where we have used the defining equation (la.a0) of the (LL)-matrix W ~ = ( l ~ (kl)) and the identities v[~]v[~l-V[~l - v [ ~ l and I~r~W~ W ~ =
428
Chapter 13.
Information Criterion for Model Selection
ITV~. To a first approximation, T~r,(kt) can be replaced by I?V(kt) defined by eq. (13.27). Since A/t E T~t(L/) to a first approximation and hence P U A / t A/t, we can write eq. (13.34)as ( A / t , / ~ A / t ) . [] P r o p o s i t i o n 13.6
I(S) - E[J[{a~}, {&~},Sll + 2(dN + n').
(13.35)
Proof. As shown in Section 7.1.3, the a posteriori covariance matrix of the optimal estimate/t is given by the generalized inverse o f / ~ (see eq. (7.29))" V[/t]- M
.
(13.36)
Since this matrix generally has rank d , the quadratic form (a - a,M(a
- a ) ) - ( a - a, f ' [ a ] - ( a _ a ) )
(13.37)
is a X2 variable with n' degrees of freedom (see eq. (3.61)). Hence, its expectation is d . From Lemmas 13.1 and 13.2, we obtain eq. (13.35). [] 13.4
13.4.1
Geometric
Information
Criterion
Model selection by AIC
Proposition 13.6 implies that
AIC(S) - J[{a~}, {&~}, SI + 2(dN + n')
(13.38)
is an unbiased estimator of the expected residual I(S). We call this quantity the geometric information criterion, or geometric AIC for short, and use it as a measure of the goodness of the model: if AIC(S1) < AIC(S2) for models S1 and $2, we prefer model S1 because S1 is expected to have more predicting capacity than model $2. Eq. (13.38) formally coincides with eq. (3.172) up to additive constants. This is easily seen if we note the following: 1. The residual J[{aa}, {d~},S] equals - 2 times the log-likelihood of the Gaussian probability density plus some additive constants. 2. The manifold S is specified by n ~ parameters. If a d-dimensional curvilinear coordinate system is defined in S, the maximum likelihood estimator &~ E S is specified by d coordinates; in total dN parameters are necessary for {d~ }. Thus, the number of parameters for describing the solution of Problem 13.1 is dN + d.
13.4.
Geometric Information Criterion
429
However, there is a marked difference: eq. (3.172) is obtained in the asymptotic limit N ~ oe by applying the law of large numbers and the central limit theorem, while eq. (13.38) is obtained in the limit of small noise for a fixed number N of the data 3. This leads to the following distinctive features of the geometric AIC as compared with the usual AIC: 9 The degree of freedom n ~ of the model has no significant effect for the geometric AIC if the number N of data is large, whereas it plays a dominant role in the usual AIC. 9 The dimension d of the model manifold plays a dominant role in the geometric AIC, while no such geometric concepts are involved in the usual AIC. 9 The number N of data explicitly appears in the expression for the geometric AIC, but it does not in the usual AIC. Eq. (13.38) is not a convenient form for actual applications, because computing the residual J[{aa }, { &~ }, S] requires knowledge of the covariance matrices V[a~]. We decompose it into the noise level e and the normalized covariance matrix V0[aa] in the form
V[a~] =
e2Vo[ao,],
(13.39)
and define the normalized residual by N
J0[S] - E ( a ~
- a~, Vo[a~]-(a~ - ~i~)).
(13.40)
Multiplying eq. (13.38) by c2, we define the normalized geometric AIC of model S by AICo(S) - Z0[S] + 2(dN + n')e 2. (13.41) In the following, we call AICo(S) and Jo[S] simply the AIC and the residual, respectively. Given a set of N data {as} and two models $1 and $2, we regard model $1 better than $2 if AICo(S1) < AICo(S2). If model $1 has dimension dl, codimension rl, and n~ degrees of freedom, and model $2 has dimension d2, codimension r2, and n 2/ degrees of freedom, this condition is written as g0[s1] - J0[S2] < 2 ((d2 - d l ) N + (n~2
-
~t/1)) ~2.
(13.42)
E x a m p l e 13.4 Consider Example 13.1. If each image coordinate is perturbed by independent G aussian noise of mean 0 and standard deviation e, the AICs of the models defined there are:
dICo(p) = J0~] + 4e 2,
AICo(p') = J0~'],
(13.43)
3This difference will be discussed in a more general framework in the next chapter.
430
Chapter 13.
Information Criterion for Model Selection
A I C o ( 1 ) - J0[/] + 2(N + 2)e 2,
A I C o ( l ' ) - J0[['] + 2(N + 1)e 2,
AICo(X)
(13.44) (13.45)
= 4 N e 2.
9 Comparing model 1 with model X, we can infer that the true positions are collinear if Jo[/] < 2 ( N - 2)e 2. (13.46) We can then infer that the image line on which the true positions lie passes through the image origin if g0[/']- g0[/] < 2e 2.
(13.47)
9 Comparing model p with model l, we can infer that the true positions are identical if J0[th]- J0[/] < 2 N e 2. (13.48) We can then infer that the true position is at the image origin if J0[ih']- J0[ih] < 4e 2.
(13.49)
Let (x~, y~) be the image coordinates of the data, and (&, ~)) their sample average. The above condition can be rewritten as follows: +
(13.50)
<
In other words, the true position is inferred to be at the image origin if the sample average is within distance 2 E / V ~ from the image origin. E x a m p l e 13.5 Consider Example 13.2. If each coordinate is perturbed by independent Gaussian noise of mean 0 and standard deviation e, the AICs of the models defined there are: AICo(P)AICo(L)
J0[P] + 6e 2,
- J0[]-] + 2(N + 4)e 2,
A I C o ( I I ) - J0[I~I] + 2(2N + 3)e 2,
AICo(P')-
AICo(L')
Jo[P'],
- J0[]-'] + 2(N + 2)e 2,
(13.51) (13.52)
A I C o ( I I ' ) - J0[]-'] + 4(N + 1)e 2, (13.53)
d X C o ( ~ 3) = 6 N e 2.
(13.54)
9 Comparing model H with model V, we can infer that the true positions are coplanar if Jo [l~I] < 2 ( N - 3)e 2. (13.55) We can then infer that the space plane on which the true positions lie passes through the coordinate origin if J0 [l~I'1 - go [I*I] < 2e 2.
(13.56)
13.4.
Geometric Information Criterion
431
9 Comparing model L with model II, we can infer that the true positions are collinear if g 0 [ L ] - J0[1Pl] < 2 ( N - 1)e 2. (13.57) We can then infer that the space line on which the true positions lie passes through the coordinate origin if
Jo[L'] - J0[L] < 4 d .
(13.58)
9 Comparing model P with model L, we can infer that the true positions are identical if J0[/5] - J0[L l < 2(N + 1)e 2. (13.59) We can then infer that the true position is at the coordinate origin if Jo[/5'] - Jo[/5] < 6e 2.
(13.60)
Let ? be the sample average of the data {r~}. The above condition can be rewritten as
I1'11 <
(13.61)
In other words, the true position is inferred to be at the coordinate origin if the sample average is within distance Vr6e/x/~ from the coordinate origin.
13.~.2
Model comparison by AIC
In order to apply the AIC criterion as described in the preceding subsection, we need to know the noise level e, which is very difficult to predict. Note that i f m o d e l S is correct, an unbiased estimator of e2 is obtained in the form ~2 =
J0[S]
(13.62)
rN - n ! '
as shown in Section 7.1.4. However, we need the true noise level e to judge if the model S is correct. This difficulty can be avoided if we focus on c o m p a r i n g two models such that one i m p l i e s the other. Let $1 be a model of dimension dl and codimension rl with n 1 degrees of freedom, mid $2 a model of dimension d2 and codimension r2 with n~ degrees of freedom. Suppose model $2 is obtained by adding an additional constraint to model $1. We say that model $2 is s t r o n g e r than model $1, or model $1 is w e a k e r than model $2, and write
$2 ~ S1.
(13.63)
Then, J0[S2] _> J0[Sx] for whatever data {a~}. This confirms our observation made earlier that the residual alone cannot measure the goodness of the model.
432
Chapter 13.
Information Criterion for Model Selection
Suppose $1 is a general model which is assumed to be correct. Then, the squared noise level e2 is estimated by eq. (13.62) as long as r N - n' r 0. Substituting it to e2 in the expression for the geometric AIC, we obtain 2 ( d i n ~- n~)
A I C o ( $ 1 ) - J0[S1] +
rlN-
A I C o ( $ 2 ) - J01321 +
n~ J0[Sl],
(13.64)
2(d2N + n~) r:)V-- n~ J0131].
(13.65)
If A I C o ( $ 2 ) < A I C o ( $ 1 ) , the predicting capability is expected to increase by replacing the general model $ by the strong model $2. The condition for this is J0132] < 1 + 2(dl - d 2 ) X 4- 2(nl - n~) . (13.66) J0 [31 ] r l N - r~ In terms of the estimators i~ and ~ defined in the form of eq. (13.62) for models ~1 and S2, respectively, the above condition can be written as follows: e2 (rl -~- 2(dl - d 2 ) ) X + n i - 2n~2 e-~ < .
(13.67)
r2N- n~
E x a m p l e 13.6 Consider Example 13.1. We see that p' ~- p,
l' ~- l,
~- 1 ~- X.
(13.68)
We assume that each image coordinate is perturbed by independent Gaussian noise of mean 0 and standard deviation e. 9 If the true positions are known to be collinear, we can infer that the image line on which the true positions lie passes through the image origin if J0 [/'] 2 < i d - ~ (13.69)
:0[[]
N - 2'
and infer that the true positions are identical if J0 [i5]: < 3 -~- ~ 2. J0[/] N- 2
(13.70)
9 If the true positions are known to be identical, we can infer that the true position is at the image origin if J0 [i5]
< Id-~.
2
N-
(13.71)
1
In terms of the sample average (k, ~)), this condition can be written as follows:
/2
V/X'2 + ~12< V
N(N +
+ I)
"
(13.72)
13.4.
Geometric Information Criterion
433
Note that application of the comparison criterion (13.67) is somewhat limited as compared with the direct use of the AIC criterion (13.42). For example, one cannot infer collinearity by comparing model I and model X, since the residual for X" is identically zero. Example
13.7 Consider Example 13.2. We see that P ' ~- P,
L' ~- L,
H' ~- H,
P ~- L ~- 7r 3.
(13.73)
We assume that each coordinate is perturbed by independent Gaussian noise of mean 0 and standard deviation e. 9 If the true positions are known to be coplanar, we can infer that the space plane on which the true positions lie passes through the coordinate origin when Jo [H'] 2 < 1+ (13.74) Jo[H] N - 3' and infer that the true positions are collinear when J0[L]
< 3+ ~ J0 [IAI] N-
4 3"
(13.75)
9 If the true positions are known to be collinear, we can infer that the space line on which the true positions lie passes through the coordinate origin when
J0[L'] Jo[L]
< 1+
2 N-
2'
(13.76)
and infer that the true positions are identical when
Jo[P] 3 < 2+ ~ Jo[P] N- 2
(13.77)
9 If the true positions are known to be identical, we can infer that the true position is at the coordinate origin when
go[P']
Jo[P]
2 < 1+ ~ .
N-
(13.78)
In terms of the sample average ~, this condition can be written as follows:
I111 <
NiNl+ 11
(13.79)
434
3.4.3
Chapter 13.
Information Criterion for Model Selection
Model selection vs. testing of hypotheses
The model selection criterion given by eq. (13.66) has a positive implication in contrast to the negative meaning of the statistical testing of^ hypotheses, according to which the procedure is given as follows. Since Jo[S2]/e 2 is a X2 variable with r 2 N - n~ degrees of freedom, the hypothesis that "model $2 is correct" is rejected if
Jo[$21 : ~7 > ~:~N-%,a
( 3.so)
~p,a2
with significance level a ~ , where is the upper a% point of the X2 distribution with p degrees of freedom. If the square noise level e2 is approximated by the estimator ~2 given by eq. (13.62), we can rewrite (13.80) as 4
J0[S2] > J0 IS1 ]
X22 N -
' ,a
-: r 1N - n i "
(13.81)
The interpretation of this test is that if eq. (13.81) holds, the hypothesis t h a t the model is $2 is very questionable with confidence level ( 1 0 0 - a)% because if the hypothesis is true, we are observing a very rare event that occurs only with a probability a%. Hence, we decide that there exists no reason to favor model $2 over $1. In other words, a statistical test can only reject a hypothesis when the data do not support it within a specified allowance threshold. Its ultimate purpose is to negate a hypothesis (hence called the null hypothesis) in favor of a default hypothesis (called the alternative hypothesis). After all, any hypothesis is rejected in the presence of noise if the significance level is lowered (or the confidence level is raised); the judgement is not definitive in this sense, and it does not address the issue of choosing one model in favor of another. In contrast, the criterion given by eq. (13.66) gives a positive and definitive assertion that model $2 is preferable to $1 with regard to the predicting capability; it requires no knowledge about the noise magnitude and no arbitrarily set thresholds.
13.5
13.5.1
3-D Reconstruction by Stereo Vision General m o d e l
Consider a stereo system. Let {h, R } be its motion parameters. Suppose we ! observe N pairs of corresponding image points x~ and x~, c~ - 1, ..., N. We cml regard them as six-dimensional data a s = x~ | x ~Ol E 7~6 sampled from 4The left-hand side equals the ratio of the logarithmic likelihoods of the two models, so this test belongs to a class called the logarithmic likelihood ratio t e s t .
13.5.
3-D Reconstruction by Stereo Vision
435
the f o u r - d i m e n s i o n a l data space x
y A
~'t
x'1
Ix , y , x' , y ' E T /~ C 7 ~ 6 .
().13.82
y/ 1 The condition that image points x and x ~ correspond to each other is given by the following epipolar equation (see eq. (6.8)): - o.
(13.83)
The essential m a t r i x V is defined as follows (see eq. (6.7)): G - h x R.
(13.84)
The epipolar equation (13.83) describes the condition that the lines of sight of x and x ~ meet in the scene (see Fig. 6.4), or equivalently image point x is on the epipolar of x ~ in the first image and x ' is on the epipolar of x in the second image (see Section 6.1.2). The epipolar equation (13.83) defines a three-dimensional manifold S in the four-dimensional data space ,4. It follows that 3-D reconstruction by stereo vision can be viewed as the following problem: --
--!
P r o b l e m 13.2 E s t i m a t e the true positions a s - ms | x s of the data a s = x s | x ~O~ E fit in such a way that g~ E 8
Since no free parameters are involved, the manifold S is a model of dimension 3, co dimension 1, and zero degrees of freedom; the maximum likelihood estimator ,S of 8 is 8 itself. Let V[xs] and V[x~] be the covariance matrices of x s and x ~ respectively, and decompose them into the noise level e and the normalized covariance matrices V0[xs] and V0[x~] in the form S~
V[xs]-
e2V0[x~l,
V[x:]-
e2V0[x:].
(13.85)
The noise is assumed to be Gaussian and independent for each datum. The AIC of this model is d I C o ( S ) - Jo [,-r + 6 g e 2. (13.86) As shown in Section 6.2.2, the residual J0[S] is computed as follows 5 (see eq. (6.22))"
J018]
N
(z
,Gx,)2
aTVo[ ]a ") +
(13.S7) -!
5Theoretically, x a and x a in the d e n o m i n a t o r should be their t r u e values x a and x a T h e y could be a p p r o x i m a t e d by their corrected values x a and X^o ' .~ Since Jo[,~] - - O(e2), the use of the d a t a x a and x ta results in only a higher order difference.
436
Chapter 13.
Information Criterion for Model Selection
If this model is correct, j0[$]/~ ~ is a X2 variable with N degrees of freedom. Hence, the squared noise level e2 can be estimated by
~ = -g013] -~. 13.5.2
(~3.ss)
Planar surface model
If the object is a planar surface (n, r) - d, the constraint is given by xt x A x -
0,
(13.89)
where the matrix A is defined as follows (see eqs. (6.67) and (6.71))" A -- R T ( h n - d I ) .
(13.90)
Eq. (13.89) describes the condition that the lines of sight of x and x ~ meet on the space plane (n, r) = d (see Fig. 6.12). Since only two component equations of eq. (13.89) are independent, it defines a two-dimensional manifold 8n in the four-dimensional data space .4. It follows that 3-D reconstruction of a planar surface by stereo vision can be identified with the following problem: P r o b l e m 13.3 Estimate the manifold Sn and the true positions ~ --!
x~ of the data a~
-
-
- 5c~ |
x~ O x t E A in such a way that ( ~ E $~
The unknown surface parameters {n, d} have three degrees of freedom 6, so the manifold 5'n is a model of dimension 2, codimension 2, and three degrees of freedom. Let Sn be the maximum likelihood estimator of Sn. The AIC of this model is A I C o ( S n ) - J0[Sn] + 2(2N + 3)e 2. (13.91) Let {~, d} be the maximum likelihood estimators of the surface parameters {n, d}; the computational scheme for this estimation by renormalization is described in Section 10.4. The residual J0[Sn] is computed as follows 7 (see eqs. (6.77)and (6.80))" N
Jo[&]- E(~:
x .a~,,, W,,(~: x .~,,)),
(13.92)
o,'~1
w,, =
(
~ ,'
x AVo[x,~].h v x x~' + ( A x e ) x Vo[x~] x ( A x e )
)
.
(13.93)
2
Here, A is the estimate of matrix A obtained by replacing {n, d} by their maximum likelihood estimators {~, d~} in eq. (13.90). If this model is correct, 6Vector n is normalized into a unit vector; see Section 4.3.1. 7The approximation used in the general model is also used.
13.5.
3-D Reconstruction by Stereo Vision
437
J0[Sn]/e 2 is a X2 variable with 2 N - 3 degrees of freedom. Hence, the squared noise level e 2 can be estimated by
~:~]= Jo ['Sn]
(13.94)
2N-3" 13.5.3
Infinity model
If the object we are viewing is infinitely far away, we have x x Rx'-O,
(13.95)
which describes the condition that the lines of sight of x and x ! are parallel (see eq. (6.100)). Like eq. (13.89), eq. (13.89) are independent, it defines a two-dimensional manifold $ ~ in the four-dimensional data space ,4. Consider the following problem: -!
P r o b l e m 13.4 Estimate the true positions ~t~ - 5~ | x a of the data a s = ! xc~ Q x c~ E A in such a way that a,c~ E $ ~ . Since no free parameters are involved, the manifold $c~ is a model of dimension 2, codimension 2, and zero degrees of freedom; the m a x i m u m likelihood estimator S ~ of $ ~ is $ ~ itself. The AIC of this model is
AICo($oo) -
Jo[,Soo] +
(13.96)
4 N e 2.
The residual J0[~doc] is computed as follows s (see eqs. (6.105) and (6.107))" N
a:l
W.
=
(
• Rv0[
o]R
•
+
•
•
)
.
(13.98)
If this model is correct, Jo[3~]/e 2 is a X2 variable with 2 N degrees of freedom. Hence, the squared noise level e 2 can be estimated by -,2
e~ = 13.5.~
J0 [,Soo] 2N
"
(13.99)
Model comparison
Eq. (13.95) is obtained if A is replaced by R T in eq. (13.89). Alternatively, eq. (13.95) is obtained from eq. (13.89) by taking the limit d -+ ec. SThe approximation used in the general model is also used.
438
Chapter 13.
Information Criterion for Model Selection
Eqs. (13.89) and (13.95) both imply the epipolar equation (13.83). From these observation, we have the following order in the strength of the three models:
s ~ ~- sn ~- $.
(13.100)
I It follows that for whatever data x~, x~, ~ - 1, ..., N, we have J 0 [ , ~ ] >_ J0[Sn] _> J018]. If we apply the comparison criterion (13.66), we obtain the following test procedures:
1. Planarity test: Compm'ing the planar surface model Sn with the general model $, we infer that the object is a planar surface if
Jo [SrI]
6 < 3- --. J0131 X
(13.101)
In terms of the estimators ~2 and ~ defined by eqs. (13.88) and (13.94), the above condition can be written as follows: i~ 3Ni--2- < 2 X -
6 3"
(13.102)
2. Infinity test: Comparing the infinity model $oo with the general model $, we infer that the object is infinitely far away if
J0[ l J0[ ]
< 3.
(13.103)
In terms of the estimators i 2 and e~^2defined by eqs. (13.88) and (13.99), the above condition can be written as follows: ^2 Eoo
3
~--2- < ~"
13.6
13.6.1
(13.104)
3-D Motion Analysis
General model
As discussed in Chapter 11, 3-D motion analysis from two views is essentially 3-D reconstruction by stereo vision; the only difference is that the motion parameters {h, R} are unknown. Hence, the mathematical framework is the same except for the degrees of freedom of the model. As in the case of I stereo vision, the corresponding image points x~ and x~, a = 1, .. ., Y , ~ C a l l be regarded as six-dimensional data a~ - x~ | x~t E 7"2,,6 sampled fi'om the
13.6.
3-D Motion Analysis
439
f o u r - d i m e n s i o n a l data space X
y
,4={
1 X t
Ix, y,x', g' E 7~} C 7~6.
(la.105)
gl 1
The epipolar equation Ix, h, R I = 0
(13.106)
defines a three-dimensional manifold $ C ,4. The problem is stated as follows: P r o b l e m 13.5 E s t i m a t e the m a n i f o l d $ and the true p o s i t i o n s ~
- 5~ |
-!
of the data a s - x a Q x a E A in such a way that ga E $ .
The unknown motion parameters {h, R} have five degrees of freedom ~ so the manifold $ is a model of dimension 3, codimension 1, and five degrees of freedom. As in the case of stereo vision, let V[x~] and V[x~] be the covariance matrices of x~ and x~, respectively, and decompose them into the noise level e and the normalized covariance matrices V0[x~] and V0[x~] in the form of eqs. (13.85). The noise is assumed to be Gaussian and independent for each datum. Let S be the maximum likelihood estimator of $. The AIC of this model is A I C o ( S ) - Jo[,~] + 2(3N + 5)e 2. (13.107) Let {h,, /~} be the maximum likelihood estimators of the motion parameters {h, R}; the computational scheme for this estimation by linearization and renormalization is described in Sections 11.2 and 11.3. The residual J0 [S] is computed as follows 1~ (see eqs. (11.11) and (11.12)):
/~V0[x'] /z (h x x)) (13.108) If this model is correct, Jo[S]/e 2 is a X2 variable with N - 5 degrees of freedom. Hence, the squared noise level e2 can be estimated by J0[S]- E
^ '
^ ',
J0[ ]
(13.109)
9The translation h is normalized into a unit vector; see Section 11.1.1. 1~ in the case of stereo vision, the denominator should be expressed in terms of the true values xa and xc~-' but they. are approximated by the d a t a x~ and x~' Also, the second order term r x R)" (h • R ) V o [ x ' ] ) is omitted. Since Jo[8] = O(e2), these approximations result in only a higher order difference.
440
Chapter 13.
Information Criterion for Model Selection
In Sections 11.3.2 and 11.4.1, we showed that the solution {s /~} of the motion parameters was not unique; we chose the one for which all the feature points have positive depths before and after the camera motion. The existence of multiple solutions is due to the fact that different motion parameters can define the same manifold S. Hence, although the maximum likelihood estimators {/~,/~} of the motion parameters {h, R} may not be unique, the maximum likelihood estimator S of the manifold $ is generally unique. 13.6.2
P l a n a r surface m o d e l
If the object is a planar surface (n, r) - d, the constraint is given by x ' x A x = 0,
(13.110) (13.111)
A-RT(hn-dl).
As in the case of stereo vision, eq. (13.110) defines a two-dimensional manifold Sn in the four-dimensional data space ,4. Hence, 3-D reconstruction of a planar surface can be identified with the following problem: P r o b l e m 13.6 Estimate the manifold Sn and the true positions g~c, --xa,@ -! ! xc~ of the data am - xa @ xa E ,4 in such a way that ac~ E S n . The unknown surface parameters {n, d} have three degrees of freedom, and the unknown motion parameters {h, R} have five degrees of freedom. Hence, the manifold Sn is a model of dimension 2, co dimension 2, and eight degrees of freedom. Let Sn be the maximum likelihood estimator of Sn. The AIC of this model is (13.112)
AICo($n) - J0[Sn] + 2(2N + 8)e 2.
Let {n̂, d̂} and {ĥ, R̂} be the maximum likelihood estimators of the surface and motion parameters {n, d} and {h, R}, respectively; the computational scheme for this estimation by renormalization is described in Section 11.6. The residual J₀[Ŝ_Π] is computed as follows¹¹ (see eqs. (11.131) and (11.132)):

$$J_0[\hat{\mathcal{S}}_\Pi] = \sum_{\alpha=1}^{N} (x'_\alpha \times \hat{A}x_\alpha,\, \hat{W}_\alpha(x'_\alpha \times \hat{A}x_\alpha)), \tag{13.113}$$

$$\hat{W}_\alpha = \Big( (x'_\alpha \times \hat{A})\, V_0[x_\alpha]\, (x'_\alpha \times \hat{A})^\top + (\hat{A}x_\alpha) \times V_0[x'_\alpha] \times (\hat{A}x_\alpha) \Big)^-_2. \tag{13.114}$$
Here, Â is the estimate of the matrix A obtained by replacing {n, d} and {h, R} in eq. (13.111) by their maximum likelihood estimators {n̂, d̂} and {ĥ, R̂}, respectively. If this model is correct, J₀[Ŝ_Π]/ε² is a χ² variable with 2N − 8 degrees of freedom. Hence, the squared noise level ε² can be estimated by

$$\hat{\epsilon}_\Pi^2 = \frac{J_0[\hat{\mathcal{S}}_\Pi]}{2N - 8}. \tag{13.115}$$

¹¹The approximations used in the general model are also used.
In Section 11.6.3, we showed that the solution {n̂, d̂} and {ĥ, R̂} of the surface and motion parameters was not unique and reduced the ambiguity by imposing the condition that the depths of the feature points should be all positive before and after the camera motion. As in the case of the general model S, this ambiguity is due to the fact that different surface and motion parameters can define the same manifold. Hence, although the maximum likelihood estimators {n̂, d̂} and {ĥ, R̂} of the surface and motion parameters {n, d} and {h, R} may not be unique, the maximum likelihood estimator Ŝ_Π of the manifold S_Π is generally unique.
13.6.3 Rotation model
If the camera motion is a pure rotation (i.e., h = 0), no 3-D information can be obtained (see Section 11.7.1). The camera motion is a pure rotation if and only if all corresponding image points x and x' satisfy the following equation (see eq. (11.173)):

$$x \times Rx' = 0. \tag{13.116}$$

Like eq. (13.110), eq. (13.116) defines a two-dimensional manifold S_R in the four-dimensional data space A. Hence, estimation of the camera rotation R can be identified with the following problem:

Problem 13.7 Estimate the manifold S_R and the true positions ā_α = x̄_α ⊕ x̄'_α of the data a_α = x_α ⊕ x'_α ∈ A in such a way that ā_α ∈ S_R.
The unknown rotation R has three degrees of freedom, so the manifold S_R is a model of dimension 2, codimension 2, and three degrees of freedom. Let Ŝ_R be the maximum likelihood estimator of S_R. The AIC of this model is

$$AIC_0(\hat{\mathcal{S}}_R) = J_0[\hat{\mathcal{S}}_R] + 2(2N + 3)\epsilon^2. \tag{13.117}$$
Let R̂ be the maximum likelihood estimator of the rotation R; the computational scheme for this estimation is described in Section 11.7.1. The residual J₀[Ŝ_R] is computed as follows¹² (see eqs. (11.174) and (11.175)):

$$J_0[\hat{\mathcal{S}}_R] = \sum_{\alpha=1}^{N} (x_\alpha \times \hat{R}x'_\alpha,\, \hat{W}_\alpha(x_\alpha \times \hat{R}x'_\alpha)), \tag{13.118}$$

$$\hat{W}_\alpha = \Big( x_\alpha \times \hat{R}V_0[x'_\alpha]\hat{R}^\top \times x_\alpha + (\hat{R}x'_\alpha) \times V_0[x_\alpha] \times (\hat{R}x'_\alpha) \Big)^-_2. \tag{13.119}$$

¹²The approximations used in the general model are also used.
If this model is correct, J₀[Ŝ_R]/ε² is a χ² variable with 2N − 3 degrees of freedom. Hence, the squared noise level ε² can be estimated by

$$\hat{\epsilon}_R^2 = \frac{J_0[\hat{\mathcal{S}}_R]}{2N - 3}. \tag{13.120}$$

13.6.4 Model comparison
As was pointed out in Chapter 11, the general algorithm fails if all the feature points are coplanar in the scene because a planar surface is a degenerate critical surface (see Section 11.5). Also, the general and the planar surface algorithms both assume that the translation of the camera is not zero. It follows that 3-D motion analysis must proceed in the following steps:
1. Rotation test: We test if the translation is 0. If so, output a warning message and stop.

2. Planarity test: We test if the object is a planar surface. If so, apply the planar surface algorithm.

3. Else, apply the general algorithm.

In Chapter 11, the rotation and planarity tests were formulated as χ² tests (see Sections 11.6.1 and 11.7.1). However, they have the following shortcomings:

• We need to estimate the noise level ε, but this is different from image to image.

• Even if ε is predicted, we need to set the significance level. The judgment differs if the significance level is set differently.

These difficulties can be avoided by the use of the geometric AIC. As in the case of stereo vision, eqs. (13.110) and (13.116) both imply the epipolar equation (13.106). Also, eq. (13.110) reduces to eq. (13.116) if the matrix A in eq. (13.111) is replaced by Rᵀ. These observations imply the following order of the strength of the three models:

$$\mathcal{S}_R \succ \mathcal{S}_\Pi \succ \mathcal{S}. \tag{13.121}$$

It follows that for whatever data x_α, x'_α, α = 1, ..., N, we have J₀[Ŝ_R] ≥ J₀[Ŝ_Π] ≥ J₀[Ŝ]. If we apply the comparison criterion (13.66), we obtain the following test procedures:
Fig. 13.9. (a) Multiple instances of the general model S can pass through the planar surface model S_Π. (b) Infinitely many instances of the general model S can pass through the rotation model S_R.

1. Planarity test: Comparing the planar surface model S_Π with the general model S, we infer that the object surface is planar if
$$\frac{J_0[\hat{\mathcal{S}}_\Pi]}{J_0[\hat{\mathcal{S}}]} < 3 + \frac{4}{N - 5}. \tag{13.122}$$

In terms of the estimators ε̂² and ε̂_Π² defined by eqs. (13.109) and (13.115), the above condition can be written as follows:

$$\frac{\hat{\epsilon}_\Pi^2}{\hat{\epsilon}^2} < \frac{3N - 11}{2N - 8}. \tag{13.123}$$
2. Rotation test: Comparing the rotation model S_R with the general model S, we infer that the camera motion is a pure rotation if

$$\frac{J_0[\hat{\mathcal{S}}_R]}{J_0[\hat{\mathcal{S}}]} < 3 + \frac{14}{N - 5}. \tag{13.124}$$

In terms of the estimators ε̂² and ε̂_R² defined by eqs. (13.109) and (13.120), the above condition can be written as follows:

$$\frac{\hat{\epsilon}_R^2}{\hat{\epsilon}^2} < \frac{3N - 1}{2N - 3}. \tag{13.125}$$
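Both tests compare residuals directly, with no externally supplied noise level or significance level. A minimal sketch (Python; the function names are illustrative) of the two decisions, assuming the residuals J₀[Ŝ], J₀[Ŝ_Π], and J₀[Ŝ_R] have already been computed as in eqs. (13.108), (13.113), and (13.118):

```python
def is_planar(J0_general, J0_planar, N):
    # Geometric AIC planarity test, eq. (13.122).
    return J0_planar / J0_general < 3.0 + 4.0 / (N - 5)

def is_pure_rotation(J0_general, J0_rotation, N):
    # Geometric AIC rotation test, eq. (13.124).
    return J0_rotation / J0_general < 3.0 + 14.0 / (N - 5)

def select_motion_model(J0_general, J0_planar, J0_rotation, N):
    # Decision order of Section 13.6.4: rotation first, then planarity.
    if is_pure_rotation(J0_general, J0_rotation, N):
        return "pure rotation: no 3-D information"
    if is_planar(J0_general, J0_planar, N):
        return "planar surface: use the planar surface algorithm"
    return "general: use the general algorithm"
```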
The fact that the 3-D reconstruction algorithm for the general model fails if the object shape is a planar surface or the camera motion is a pure rotation can be geometrically understood as follows. Note that in the four-dimensional data space A the intersection of two three-dimensional manifolds is in general a two-dimensional manifold.

• The general 3-D reconstruction algorithm fails for a planar surface because multiple instances of the three-dimensional general model S can pass through the planar surface model S_Π (Fig. 13.9).
• No 3-D information can be obtained if the camera motion is a pure rotation because infinitely many instances of the general model S can pass through the rotation model S_R (Fig. 13.9). In fact, we can see that

$$\{\, x \oplus x' \mid x \times Rx' = 0 \,\} = \bigcap_{h \in \mathcal{R}^3} \{\, x \oplus x' \mid |x, h, Rx'| = 0 \,\}. \tag{13.126}$$

Fig. 13.10. Images of two planar grids hinged together in the scene (θ = 50°).
Example 13.8 Two planar grids hinged together with angle π − θ were defined in the scene, and images viewed from different camera positions were generated. The image size and the focal length were assumed to be 512 × 512 (pixels) and 600 (pixels), respectively. Fig. 13.10a shows the images for θ = 50°. We added random Gaussian noise of mean 0 and standard deviation σ (pixels) to the x- and y-coordinates of each grid point independently. Using the grid points as feature points, we conducted the planarity test 100 times, each time using different noise. Fig. 13.11 shows the percentage of the instances for which the object is judged as planar. If σ = 1.0, the percentage is approximately 50% for θ = 22°. Fig. 13.12 shows one instance for which the object is judged as planar; Fig. 13.13 shows one instance for which the object is judged as non-planar. Fig. 13.14a and Fig. 13.14b show the corresponding 3-D shapes reconstructed by the general and planar surface algorithms. The true shape is also superimposed. We can see that although the images look almost the same, the shape reconstructed by the general algorithm makes little sense if the object is judged as planar, while the non-planar shape can be reconstructed fairly well if the object is judged as non-planar.
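A sketch of the simulation loop behind this example (Python with NumPy; `project`, `general_residual`, and `planar_residual` are hypothetical placeholders for the image generation and the renormalization fits of Sections 11.3 and 11.6, and the setup is illustrative rather than the book's exact configuration):

```python
import numpy as np

def planarity_rate(project, general_residual, planar_residual,
                   theta, sigma, trials=100, seed=1):
    """Percentage of trials judged planar by the geometric AIC test (13.122).

    project(theta)          -> noise-free corresponding points (x, xp), shape (N, 3).
    general_residual(x, xp) -> J0[S^]    (maximum likelihood fit, Section 11.3).
    planar_residual(x, xp)  -> J0[S^_Pi] (maximum likelihood fit, Section 11.6).
    These callables stand in for the book's procedures.
    """
    rng = np.random.default_rng(seed)
    x0, xp0 = project(theta)
    N = len(x0)
    judged_planar = 0
    for _ in range(trials):
        x, xp = x0.copy(), xp0.copy()
        x[:, :2] += rng.normal(0.0, sigma, (N, 2))    # pixel noise, as in Example 13.8
        xp[:, :2] += rng.normal(0.0, sigma, (N, 2))
        if planar_residual(x, xp) / general_residual(x, xp) < 3.0 + 4.0 / (N - 5):
            judged_planar += 1
    return 100.0 * judged_planar / trials
```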
Fig. 13.11. The percentage of the instances judged as planar, plotted against θ [deg] for σ = 0.5, 1.0, 1.5, 2.0 (pixels).
Fig. 13.12. An instance for which the object is judged as planar.
Fig. 13.13. An instance for which the object is judged as nonplanar.
Fig. 13.14. (a) 3-D shapes and the true shape reconstructed from Fig. 13.12. (b) 3-D shapes and the true shape reconstructed from Fig. 13.13.
13.7 3-D Interpretation of Optical Flow

13.7.1 General model
As argued in Chapter 12, the formalism of 3-D interpretation of optical flow can be obtained from finite motion analysis in the limit of infinitesimal camera motion. Hence, the mathematical structure is almost identical with that for finite motion analysis. However, a subtle difference exists. Given optical flow ẋ at image point x, we can view the set of 6-vectors a = ẋ ⊕ x ∈ R⁶ as data sampled from the four-dimensional data space

$$\mathcal{A} = \{\, \dot{x} \oplus x \mid \dot{x}, \dot{y}, x, y \in \mathcal{R} \,\} \subset \mathcal{R}^6. \tag{13.127}$$
However, noise occurs only in ẋ and ẏ. Hence, the covariance matrix V[a] of a has rank 2, not 4, in contrast to the case of stereo vision and finite motion analysis. Because of this singularity, we must make appropriate modifications to the general argument given in Sections 13.1-13.4. Let {v, ω} be the motion parameters. In the absence of noise, the optical flow ẋ must satisfy the following epipolar equation (see eqs. (12.32) and (12.33)):
$$|x, \dot{x}, v| + (v \times x,\, \omega \times x) = 0. \tag{13.128}$$
This equation defines a three-dimensional manifold S in the four-dimensional data space A. Hence, the problem of 3-D interpretation of optical flow is stated as follows:

Problem 13.8 Estimate the manifold S and the true positions $\bar a = \bar{\dot x} \oplus x$ of the data $a = \dot x \oplus x \in \mathcal{A}$ in such a way that $\bar a \in \mathcal{S}$.
The unknown motion parameters {v, ω} have five degrees of freedom¹³, so the model S has dimension 3, codimension 1, and five degrees of freedom. Let V[ẋ] be the covariance matrix of the optical flow ẋ, and decompose it into the noise level ε and the normalized covariance matrix V₀[ẋ] in the form of

$$V[\dot{x}] = \epsilon^2 V_0[\dot{x}]. \tag{13.129}$$

The noise is assumed to be Gaussian and independent at each point. Let Ŝ be the maximum likelihood estimator of S. The AIC of this model is

$$AIC_0(\hat{\mathcal{S}}) = J_0[\hat{\mathcal{S}}] + 2(S + 5)\epsilon^2, \tag{13.130}$$

where S is the number of pixels at which the optical flow is observed. Note that although the model S has dimension 3, the last term on the right-hand side of eq. (13.130) is not 2(3S + 5)ε². This is because x is not a random variable: for a fixed position x, the epipolar equation (13.128) defines a one-dimensional submanifold of S (see Section 12.5.2 and Fig. 12.8). Hence, if â and Ŝ are estimators of ā and S, respectively, the estimator â ∈ Ŝ has only one degree of freedom within Ŝ. Let {v̂, ω̂} be the maximum likelihood estimators of the motion parameters {v, ω}; the computational scheme for this estimation by linearization and renormalization is described in Sections 12.4 and 12.5. The residual J₀[Ŝ] is computed as follows (see eqs. (12.49) and (12.50)):
$$J_0[\hat{\mathcal{S}}] = \int_S \frac{\big(|x, \dot{x}, \hat{v}| + (\hat{v}\times x,\, \hat{\omega}\times x)\big)^2}{(\hat{v}\times x,\, V_0[\dot{x}](\hat{v}\times x))}\, dxdy. \tag{13.131}$$

As in Chapter 12, the integration ∫_S dxdy is a symbolic notation for summation over all pixels at which the optical flow is defined. If this model is correct, J₀[Ŝ]/ε² is a χ² variable with S − 5 degrees of freedom. Hence, the squared noise level ε² can be estimated by
$$\hat{\epsilon}^2 = \frac{J_0[\hat{\mathcal{S}}]}{S - 5}. \tag{13.132}$$

¹³The translation v is normalized into a unit vector; see Section 12.2.3.

In Section 12.5.3, we showed that the solution {v̂, ω̂} of the motion parameters was not unique and chose the one for which the depth is positive
everywhere. As in the case of finite motion analysis, this ambiguity is due to the fact that the same manifold S admits different parameterizations by the motion parameters {v, ω} (see Section 12.8.4). Hence, although the maximum likelihood estimators {v̂, ω̂} of the motion parameters {v, ω} may not be unique, the maximum likelihood estimator Ŝ of the manifold S is generally unique.
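A numerical sketch of the residual and noise-level estimate of eqs. (13.131) and (13.132) (Python with NumPy; the symbolic integral is written as a sum over the pixels where flow is observed, and the default normalized covariance is an illustrative assumption):

```python
import numpy as np

def flow_epipolar_residual(x, xdot, v_hat, w_hat, V0_xdot=None):
    """J0[S^] of eq. (13.131) and the noise-level estimate of eq. (13.132).

    x     : (S, 3) image points (third component 1) where flow is observed.
    xdot  : (S, 3) optical flow vectors (third component 0).
    v_hat : unit translational velocity estimate; w_hat : rotational velocity.
    V0_xdot : (S, 3, 3) normalized covariances of the flow; a default with
    zero third row/column is assumed here for illustration.
    """
    S = len(x)
    if V0_xdot is None:
        V0_xdot = np.tile(np.diag([1.0, 1.0, 0.0]), (S, 1, 1))
    J0 = 0.0
    for i in range(S):
        vx = np.cross(v_hat, x[i])
        e = np.dot(xdot[i], vx) + np.dot(vx, np.cross(w_hat, x[i]))  # eq. (13.128)
        J0 += e * e / (vx @ V0_xdot[i] @ vx)
    eps2_hat = J0 / (S - 5)                                          # eq. (13.132)
    return J0, eps2_hat
```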
13.7.2 Planar surface model
If the object is a planar surface (n, r) = d, the optical flow has the form

$$\dot{x} = Wx - (k, Wx)x, \tag{13.133}$$

where k = (0, 0, 1)ᵀ. The flow matrix W is defined as follows (see eqs. (12.176) and (12.177)):

$$W = -\frac{1}{d}\left( vn^\top - \frac{1}{3}(v, n)I \right) - \omega \times I. \tag{13.134}$$
The third components of both sides of eq. (13.133) are identically 0. Hence, eq. (13.133) defines a two-dimensional manifold S_Π in the four-dimensional data space A. Thus, 3-D reconstruction of a planar surface can be identified with the following problem:

Problem 13.9 Estimate the manifold S_Π and the true positions $\bar a = \bar{\dot x} \oplus x$ of the data $a = \dot x \oplus x \in \mathcal{A}$ in such a way that $\bar a \in \mathcal{S}_\Pi$.

The unknown surface parameters {n, d} have three degrees of freedom, and the unknown motion parameters {v, ω} have five degrees of freedom. Hence, the manifold S_Π is a model of dimension 2, codimension 2, and eight degrees of freedom. Let Ŝ_Π be the maximum likelihood estimator of S_Π. The AIC of this model is

$$AIC_0(\hat{\mathcal{S}}_\Pi) = J_0[\hat{\mathcal{S}}_\Pi] + 16\epsilon^2. \tag{13.135}$$
Note that although S_Π has dimension 2, the last term on the right-hand side of eq. (13.135) is not 2(2S + 8)ε². This is because for a fixed position x and a fixed estimator Ŝ_Π of S_Π the planar surface optical flow equation (13.133) determines the optical flow ẋ uniquely¹⁴. Let {n̂, d̂} and {v̂, ω̂} be the maximum likelihood estimators of the surface and motion parameters {n, d} and {v, ω}, respectively; the computational scheme for this estimation is described in Section 12.8. The residual J₀[Ŝ_Π] is computed as follows (see eq. (12.178)):

$$J_0[\hat{\mathcal{S}}_\Pi] = \int_S \left\| \dot{x} - \hat{W}x + (k, \hat{W}x)x \right\|^2_{V_0[\dot{x}]} dxdy. \tag{13.136}$$

¹⁴A problem of this special type corresponds to what is called regression in statistics.
Here, Ŵ is the estimate of the flow matrix W obtained by replacing {n, d} and {v, ω} in eq. (13.134) by their maximum likelihood estimators {n̂, d̂} and {v̂, ω̂}, respectively. If this model is correct, J₀[Ŝ_Π]/ε² is a χ² variable with 2S − 8 degrees of freedom. Hence, the squared noise level ε² can be estimated by

$$\hat{\epsilon}_\Pi^2 = \frac{J_0[\hat{\mathcal{S}}_\Pi]}{2S - 8}. \tag{13.137}$$

In Section 12.8.4, we showed that the solution {n̂, d̂} and {v̂, ω̂} of the surface and motion parameters was not unique and reduced the ambiguity by imposing the condition that the depth should be positive everywhere. As in the case of finite motion analysis, this ambiguity is due to the fact that the same manifold S_Π admits different parameterizations by the surface and motion parameters {n, d} and {v, ω}. Hence, although the maximum likelihood estimators {n̂, d̂} and {v̂, ω̂} of the surface and motion parameters {n, d} and {v, ω} may not be unique, the maximum likelihood estimator Ŝ_Π of the manifold S_Π is generally unique.
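The planar flow equations (13.133) and (13.134) can be checked numerically: the flow predicted by the matrix W must agree with the flow obtained by differentiating the projection of a point on the plane directly. The following sketch (Python with NumPy) assumes the instantaneous-motion convention ṙ = −v − ω × r, which is consistent with the epipolar equation (13.128); all numerical values are illustrative.

```python
import numpy as np

# Plane (n, r) = d and instantaneous camera motion {v, w}.
n = np.array([0.2, -0.1, 1.0]); n /= np.linalg.norm(n)
d = 4.0
v = np.array([0.5, 0.1, -0.3])
w = np.array([0.02, -0.01, 0.03])
k = np.array([0.0, 0.0, 1.0])

def skew(a):
    return np.array([[0, -a[2], a[1]], [a[2], 0, -a[0]], [-a[1], a[0], 0]])

# Flow matrix of eq. (13.134); the (v, n) I term only shifts W by a multiple of I
# and cancels in eq. (13.133), so it does not affect the predicted flow.
W = -(np.outer(v, n) - np.dot(v, n) * np.eye(3) / 3.0) / d - skew(w)

# A point on the plane and its image x = r / Z.
r = np.array([1.0, -0.5, (d - n[0] * 1.0 - n[1] * (-0.5)) / n[2]])
Z = r[2]
x = r / Z

# Flow from the 3-D motion: r_dot = -v - w x r, x_dot = (r_dot - (k, r_dot) x) / Z.
r_dot = -v - np.cross(w, r)
x_dot_direct = (r_dot - np.dot(k, r_dot) * x) / Z

# Flow predicted by the planar surface model, eq. (13.133).
x_dot_model = W @ x - np.dot(k, W @ x) * x

print(np.allclose(x_dot_direct, x_dot_model))   # True (up to rounding)
```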
13.7.3 Rotation model
If the camera motion is a pure rotation (i.e., v = 0), no 3-D information can be obtained (see Section 12.9). The camera motion is a pure rotation if and only if the optical flow ẋ has the following form (see eq. (12.195)):

$$\dot{x} = Q_x(x \times \omega). \tag{13.138}$$

Here, Q_x = I − xkᵀ. The third components of both sides of eq. (13.138) are identically 0. Hence, eq. (13.138) defines a two-dimensional manifold S_R in the four-dimensional data space A. Thus, estimation of the rotation velocity ω can be identified with the following problem:

Problem 13.10 Estimate the manifold S_R and the true positions $\bar a = \bar{\dot x} \oplus x$ of the data $a = \dot x \oplus x \in \mathcal{A}$ in such a way that $\bar a \in \mathcal{S}_R$.
The unknown rotation ω has three degrees of freedom, so the manifold S_R is a model of dimension 2, codimension 2, and three degrees of freedom. Let Ŝ_R be the maximum likelihood estimator of S_R. The AIC of this model is

$$AIC_0(\hat{\mathcal{S}}_R) = J_0[\hat{\mathcal{S}}_R] + 6\epsilon^2. \tag{13.139}$$

The last term on the right-hand side is not 2(2S + 3)ε² for the same reason as in the case of the planar surface model. Let ω̂ be the maximum likelihood estimator of the rotation velocity ω; the computational scheme for this estimation is described in Section 12.9.1. The residual J₀[Ŝ_R] is computed as follows (see eq. (12.196)):

$$J_0[\hat{\mathcal{S}}_R] = \int_S \left\| \dot{x} - Q_x(x \times \hat{\omega}) \right\|^2_{V_0[\dot{x}]} dxdy. \tag{13.140}$$
If this model is correct, J₀[Ŝ_R]/ε² is a χ² variable with 2S − 3 degrees of freedom. Hence, the squared noise level ε² can be estimated by

$$\hat{\epsilon}_R^2 = \frac{J_0[\hat{\mathcal{S}}_R]}{2S - 3}. \tag{13.141}$$

13.7.4 Model comparison
Eq. (13.138) is obtained if the flow matrix W in eq. (13.134) is antisymmetric. Alternatively, eq. (13.138) is obtained from eq. (13.133) by taking the limit d → ∞. Eqs. (13.133) and (13.138) both imply the epipolar equation (13.128). From these observations, we have the following order in the strength of the three models:

$$\mathcal{S}_R \succ \mathcal{S}_\Pi \succ \mathcal{S}. \tag{13.142}$$

It follows that for whatever optical flow ẋ we have J₀[Ŝ_R] ≥ J₀[Ŝ_Π] ≥ J₀[Ŝ]. If we apply the comparison criterion (13.66), we obtain the following test procedures:
1. Planarity test: Comparing the planar surface model S_Π with the general model S, we infer that the object is a planar surface if

$$\frac{J_0[\hat{\mathcal{S}}_\Pi]}{J_0[\hat{\mathcal{S}}]} < 3 + \frac{4}{S - 5}. \tag{13.143}$$

In terms of the estimators ε̂² and ε̂_Π² defined by eqs. (13.132) and (13.137), the above condition can be written as follows:

$$\frac{\hat{\epsilon}_\Pi^2}{\hat{\epsilon}^2} < \frac{3S - 11}{2S - 8}. \tag{13.144}$$
2. Rotation test: Comparing the rotation model S_R with the general model S, we infer that the camera motion is a pure rotation (equivalently, that the object may be regarded as infinitely far away) if

$$\frac{J_0[\hat{\mathcal{S}}_R]}{J_0[\hat{\mathcal{S}}]} < 3 + \frac{14}{S - 5}. \tag{13.145}$$

In terms of the estimators ε̂² and ε̂_R² defined by eqs. (13.132) and (13.141), the above condition can be written as follows:

$$\frac{\hat{\epsilon}_R^2}{\hat{\epsilon}^2} < \frac{3S - 1}{2S - 3}. \tag{13.146}$$
The above results are formally identical with the case of finite motion. The singularity of 3-D interpretation also has the same geometric interpretation (see Fig. 13.9).
Chapter 14

General Theory of Geometric Estimation

This chapter gives a mathematical extension to the theory of geometric correction in Chapter 5 and the theory of parametric fitting in Chapter 7. Here, the error distribution is no longer assumed to be Gaussian: the role of the covariance matrix for a Gaussian distribution is played by the Fisher information matrix. We derive a theoretical lower bound, called the Cramer-Rao lower bound, on the covariance matrix of an unbiased estimator of the parameter to be determined and show that the maximum likelihood estimator attains this bound in the first order if the problem belongs to the exponential family. We then express the computation of maximum likelihood estimation in a form suitable for numerical computation in practical applications.
14.1 Statistical Estimation in Engineering
The subject of this chapter is essentially statistical estimation. However, the following treatment is very different from traditional statistics, whose main objective is to infer the structure of a random phenomenon by observing multiple data with a view to evaluating and comparing effects and procedures in domains that involve a large degree of uncertainty, such as medicine, biology, agriculture, manufacturing, sociology, economics, and politics. In such a domain, the problem is usually translated into the mathematical language as estimating parameters involved in the probability distribution from multiple independent samples from it (see Section 3.5 for the classical results). Although this framework covers almost all types of relevant applications in the above-mentioned domains, geometric estimation problems in computer vision and robotics have many non-traditional elements. In traditional statistics, errors are regarded as uncontrollable; the accuracy of estimation is improved only by repeated measurements. However, repeating measurements is costly. Hence, if the accuracy is the same, those methods which require a smaller number of data are more desirable. In other words, methods whose accuracy improves rapidly as the number of data increases are more desirable than those whose accuracy increases slowly. Thus, the study of asymptotic properties of estimation in the limit of a large number of data has been one of the central subjects in traditional statistics (see Section 3.6.2 for the classical results). In such engineering domains as computer vision and robotics, where electronic sensing devices are used, errors are usually small and called noise.
Moreover, they are controllable: the accuracy of sensing can be improved by using high-resolution devices and controlling the environment (lighting, dust, temperature, humidity, vibration, etc.). However, such control is costly. Hence, if the accuracy is the same, those methods which tolerate a higher level of noise are more desirable. In other words, methods whose accuracy improves rapidly as the noise level decreases are more desirable than those whose accuracy improves slowly. Thus, the study of the accuracy of estimation in the limit of small noise is very important. In this sense, our approach of assuming local distributions and applying linear analysis, which has been the basic principle in the preceding chapters, can be justified. In many engineering domains, repeating measurements under the same conditions (which is easy) often produces the same results because the sources of inaccuracy in the device and the environment are fixed (but unknown). Also, the number of independent data is usually fixed. Hence, the basic premise of traditional statistics that independent samples from the same distribution can be observed as many times as desired does not hold. This observation also underlies the statistical treatment in the preceding chapters.
14.2 General Geometric Correction

14.2.1 Definition of the problem
Let F⁽ᵏ⁾(u), k = 1, ..., L, be continuously differentiable scalar functions of the argument u ∈ Rⁿ. The domain of u is assumed to be an n′-dimensional manifold U ⊂ Rⁿ, which we call the data space. Geometric correction is the problem of correcting a given datum u to a value û that satisfies the L equations

$$F^{(k)}(\hat{u}) = 0, \qquad k = 1, ..., L. \tag{14.1}$$
In Section 5.1, we considered N variables u₁, ..., u_N, but the problem becomes equivalent if we regard their direct sum u = u₁ ⊕ ··· ⊕ u_N, n = n₁ + ··· + n_N, as a new variable. As in Section 5.1, we call the L equations (14.1) simply the constraint, as opposed to which we call the constraint u ∈ U the inherent constraint on u. Let ū ∈ U be the true value of u, which we assume is unknown. We regard the datum u as a random variable with probability density p(u; ū) parameterized by ū. We assume that the distribution of u in the data space U is local (see Section 3.2.2) and define the "true value" ū by the relationship given by eq. (3.29). However, such an explicit characterization of ū will not be used in the subsequent analysis, so it can be thought of simply as some value in U. The geometric correction problem is formally stated as follows (see Section 5.1.1):
Problem 14.1 From a given datum u ∈ U, estimate the true value ū ∈ U that satisfies the constraint

$$F^{(k)}(\bar{u}) = 0, \qquad k = 1, ..., L. \tag{14.2}$$
As in Sections 3.5 and 3.6, the probability density p(u; ū) is assumed to satisfy the following regularity condition:

Assumption 14.1 The probability density p(u; ū) is continuously differentiable with respect to u and ū an arbitrary number of times, and

$$p(u; \bar{u}) > 0 \tag{14.3}$$

for all u ∈ U. Furthermore, the integration operation ∫_U du of any expression of p(u; ū) with respect to u is, if the integral exists, interchangeable with the differentiation operation ∇_ū with respect to ū.

The probability density p(u; ū) is not defined for ū ∉ U. For the convenience of analysis, however, we extend it to ū ∉ U in such a way that

$$p(u; \bar{u} + \Delta\bar{u}) = p(u; \bar{u}) + O(\Delta\bar{u})^2 \tag{14.4}$$
for all ū ∈ U and Δū ∈ T_ū(U)⊥, where T_ū(U) ⊂ Rⁿ is the tangent space to the manifold U at ū (see Section 3.5.1). In precise terms, this assumption is stated as follows (see eq. (3.112)):

Assumption 14.2

$$\nabla_{\bar{u}}\, p \in T_{\bar{u}}(\mathcal{U}). \tag{14.5}$$

14.2.2 The rank of the constraint
The L equations (14.1) may be algebraically dependent. If only r of them are independent, they give only r effective constraints and define an (n′ − r)-dimensional manifold S, or a manifold S of codimension r, in the data space U except at possible singular points. We assume that ū ∈ S is not a singular point and call r the rank of the constraint (14.1). A precise statement of this assumption, which also gives a precise definition of the rank r, is as follows (see Section 7.1.1):

Assumption 14.3 The L equations of the constraint define a manifold S of codimension r in the data space U in the neighborhood of ū.

As in the preceding chapter, let us call the manifold S the (geometric) model of Problem 14.1. The solution is given by the projection of the datum u ∈ U onto the model S ⊂ U (Fig. 14.1).
Fig. 14.1. Projecting u ∈ U onto the model S ⊂ U.
( __
) ~7/~(1),--',
~7/~(L)
T Au
(14.6)
F (L) (~ + Au)
to a first approximation in Au, where and hereafter V F (k) (~) is abbreviated as V/~(k). The inherent constraint u E U requires that A u E Tu(U) to a first approximation. If A u ranges over the entire tangent space Tu(/4), eq. (14.6) defines a linear mapping from Tu(b/) to a linear subspace
- {
/
9
zx.)/
IA. e Tu(U)}
c n
The dimension of this subspace equals the dimension of the subspace {P~V/~(1) , ..., P-UuV F (L ) } L C ,]p~,n,
(14.8)
where P ~ be the n-dimensional projection matrix onto the tangent space Tu(U). Consequently, the dimension of s is at most the rank r of the constraint (14.1), but we further assume the following1: A s s u m p t i o n 14.4 The dimension of the linear subspace s rank r of the constraint.
is equal to the
In mathematical terms, Assumption 14.4 states that each of the L equations (14.1) defines a manifold of codimension 1 in the data space/J such that they intersect each other transversally at ~ (see Section 3.2.1). We say that the 1As a result, the L equations (14.1) cannot be replaced by a single equation, say E L F(k)(u) 2 = O, which still has rank r but s = {0} k--1
14.2.
General Geometric Correction
455
intersection of the L manifolds is a singular value of the constraint (14.1) if the dimension of the subspace L:~t is less than r, and an ordinary value otherwise. Assumption 14.4 requires that the true value ~ should not be a singular value. We say that the constraint (14.1) is nonsingular if Assumption 14.4 is satisfied, and singular otherwise (see Section 5.1.1). The linear subspace L:~t defined by eq. (14.7) can alternatively be defined as the tangent space to the manifold
F 1)(u) / " lu~}eT~
7-{
L
(14.9)
F(L)(u) at a
E td.
However,
a
can be a singular point
of the mapping
(F(I)(u),...,F(L)(u)) T" 5t -~ 7~L. In other words, the linear subspace L:u defined by replacing ~ by u (~ ~) in eq. (14.7) may have a higher dimension than r. If that is the case, we say that the constraint is degenerate (see Section 5.1.1).
14.2.3
Cramer-Rao lower bound for geometric correction
Let l be the score of u with respect to ~ (see eq. (3.113))"
- Vu logp.
(14.10)
The Fisher information matrix J with respect to ~ is defined as follows (see eq. (3.118))" J - E[HT]. (14.11) The symbol E[.] denotes the expectation with respect to the probability density p(u; ~t). Assumption 14.2 implies that I E T~t(b/). We assume that the distribution p(u; ~t) is regular with respect to ~t in the sense that I takes all orientations in T~t (5/) as u ranges over the entire data space L/(see Section 3.5.1). In precise terms, this assumption is stated as follows: 14.5 The Fisher information matrix J is positive semi-definite and has range Tu (l~).
Assumption
Define an (LL)-matrix 12 - (12(kl)) by (12 (kt)) - ((V/~(a), J - V / ~ ( t ) ) ) ,
(14.12)
and an (LL)-matrix I?V - (I?V(kl)) by I?V - V - , or in the notation used in Chapter 5 (l~ (kl)) - ((V/~(k), J - V / ~ ( l ) ) ) - .
(14.13)
456
Chapter 14.
General Theory of Geometric Estimation
Define (nn)-matrices N and S by L
(14.14) k,l=l
S-J--N.
(14.15)
L e t / t - / t ( u ) be an unbiased estimator of the true value ~t determined as a function of the d a t u m u. Since r is constrained to be in the model $, it must s a t i s f y / t ( u ) E $ for any u E/4. Its unbiasedness is defined by
psE[it - ~ ] - 0, where T~t($) of/t is matrix
(14.16)
p S is the n-dimensional projection matrix onto the tangent space to the model $ at fi (see eq. (3.126)). Assuming that the distribution sufficiently localized around ~ in the model $, we define the covariance o f / t in the following form (see eq. (3.115))" V[/t] - psE[(it - g)(/t - ~t)T]pS.
(14.17)
We now prove the following theorem 2" Theorem
14.1
s.
(14.1s)
Since this is an analogue of the Cramer-Rao inequality (3.134) in Section 3.5.2, we call eq. (14.18) and the bound it gives the Cramer-Rao inequality and the Cramer-Rao lower bound, respectively, for geometric correction.
1~.2.~
Proof of the main theorem
L e m m a 14.1 The (LL)-matrix V is positive semi-definite and has range f-.~t.
Proof. Let rI - (r/k) be an arbitrary L-vector. We have L
L
(17, V~7) - (~-~, rlkVF(k),J - ~ qlV/~(l)) 9 k=l
(14.19)
/=1
Since J is positive semi-definite, its generalized inverse J - is also positive semi-definite. Hence, the right-hand side of eq. (14.19) is nonnegative for an arbitrary r/, meaning that V is positive semi-definite. Suppose rl belongs to the null space of V. Then, the left-hand side of eq. (14.19) is 0. It follows that ~-~L which is identical with the k--1 ~k~Tt (k) belongs to the null space of J 2Recall that the relation A _ B means that matrix A - B is positive semi-definite (see Section 3.5.2).
14.2.
General Geometric Correction
457
null space of J. Assumption 14.5 implies that the null space of J coincides with T~t(L/) • Hence, L
E
~k(V/~(k)' Au) - 0
(14.20)
k=l
for an arbitrary Au E T~t(L/). From the definition of the subspace s this means that r/E s Conversely, if r/E s we obtain (r/, V~7) - 0 by following eqs. (14.20) and (14.19) backward. Hence, the null space of the (LL)-matrix V coincides with_ the ( L - r)-dimensional subspace L:~t C 7r L. Consequently, the range of V coincides with s [::1 _
Consider the (LL)-matrix W defined by eq. (14.13). Let P ~ be the Ldimensional projection matrix onto/:~t. Lemma 14.1 implies the following: C o r o l l a r y 14.1 The ( L L ) - m a t r i x W is positive semi-definite and has range l?,u, and the following relationship holds: VW-
P~.
(14.21)
NJN-
N.
(14.22)
L e m m a 14.2
Proof. Since J - J J NJN
- J - , we see that
=
m,n=l L
-
z k,l,rn,n= l
k,n=l
l
--1
)
L
- N,
(14.23)
k,n=l
where the identity 1 7 V V W - IZVW W - I;V is used. L e m m a 14.3
SJS-
S.
(14.24)
458
Chapter 14.
General Theory of Geometric Estimation
Fig. 14.2. The tangent space Tu(S) to model S at fi and its orthogonal complement with respect to Tu(b/). Proof. Since J - J J SJS
- J - , we see that
- (J- - N)J(J= J- - P~N-
- N) - J-J J- - J-JN NP~
+ N - J-
- N J J- + NJN
(14.25)
- N - S,
where the relationship J - J - J J - - P ~ implied by Assumption 14.5 and the relationship P ~ N - N P ~ - N implied by eq. (14.14)are used. El The following lemma plays an essential role in proving Theorem 14.1. L e m m a 14.4 The ( n n ) - m a t r i x S is positive semi-definite and has range
Tu (S). Proof. By definition, S is a positive semi-definite matrix. From Lemma 14.3, we see that for an arbitrary n-vector ~ we see that (~,S~) - (~,SJS~)
(14.26)
- (S~,J(S~)).
Since J is positive semi-definite, the above expression is nonnegative for any E T~". Hence, S is positive semi-definite. By definition, the tangent space T~t(8) is the orthogonal complement of the linear subspace { P - ~ V F ( k ) } L C T~t(L/) (Fig. 14.2). From eqs. (14.14) and (14.15), we see that
= (P~VF(k),J-P~VF(k)) L
-
E
17v(Im)(P~VF(k)'J-VF(t))(J-VF(m)'P~
V~(k))
l,m=l
= (V/~(k), P ~ J - P ~ V F ( k ) ) L
-
E l,m--1
I~('m)(v/~(k)' P g J - V / ~ ( O ) ( V / ~ ( m ) ' J - P ~
V~(k))
14.2.
General Geometric Correction
459
L ---(V~"(k)'J-VP(k))- E (V~"(k)'J-VF(l))~'r(lm)(vF(m)'J-VF(k)) l,m=l L = ?(aa) _ E
?(at)Vv(lm)?(mk) -- O,
(14.27)
l,m=l where the identity V W V - V V V - ~r is used. Eq. (14.27) implies that the linear subspace {PUV/~(k)}L is included in the null space of S. This means that the range of S is included in T u ( $ ) . Hence, the rank of S is at most n ~ - r. We now prove that the range of S exactly coincides with T u ( S ) by showing that S has at least rank n ~ - r. Let J be the linear subspace of 7~n obtained by mapping the r-dimensional subspace { p u V F ( k ) } L C Tu(ld) by J - . Assumption 14.5 implies that J has range Tu(b/), so Y is an r-dimensional subspace of Tu(b/). Let {vl, ..., v n , - . } be a basis of the orthogonal complement of ,7 with respect to Tu (/1/). Since the vectors vi, i - 1 n ~ - r, are orthogonal to vectors J - P U V f i ' ( k ) k - 1, ..., L, we obtain L
-
(14.28)
k,l--1 By Assumption 14.5, matrix J - is positive semi-definite and has range Tu(/d). Since vectors {vi } E Tu (b/), i - 1, ..., n ' - r , are linearly independent, vectors { J - v i } - {Svi}, i - 1, ..., n ' - r, are also linearly independent. Thus, S has at least rank n ~ - r. [] C o r o l l a r y 14.2 The variation A ~ of r defined by A~-
SA~
(14.29)
satisfies Aft E T u ( $ ) for any ~ E T4n. L e m m a 14.5
E [ P ~ ( i ~ - ~ ) ( S l ) T] -- S.
(14.30)
Proof. Since the estimator/~ is unbiased, eq. (14.16) holds identically for any E $. It follows that the first variation of the left-hand side of eq. (14.16) should be 0 for any A~ C T u ( $ ) . Hence, it should be 0 for A~ -- SA~ for any A~ E 7~~. Taking the first variation of the left-hand side of eq. (14.16) with respect to ~, substituting eq. (14.29) into it, noting Assumption 14.1, and using the logarithmic differentiation formula (3.115), we obtain A ( P ~ E [ / t - ~]) - - P S E [ A ~ ) ]
= P~SAf 1
: - {S\
+ P ~ .fTj(/t - ~)(V~t logp, A~t)pdu
- P ~ E [ ( i ~ - ~)iT]A~ \
E [ P S ( i ~ - ~(SI~T]~ Af. /k/J]
(14.31)
460
Chapter 14.
General Theory of Geometric Estimation
Here, we have used the assumption that PS+~uE[it - ~] ~ PSE[i~- ~t] (see Section 3.5.2) and the relationship PfftS = S. Since the above expression should be 0 for any A~ E 7~~, we obtain eq. (14.30). [] The proof of Theorem 14.1 is given as follows. Lemma 14.3, we see that
E[(SI)(Sl) T] - S E [ ~ T ] s -
From eq. (14.11) and
SJS-
S.
(14.32)
From this and Lemmas 14.5, we obtain E[
{ p S (/t_- ~t)
n S ( / t - fi)
T
I/t] S
\
S
"
(14.33)
This matrix is symmetric and positive semi-definite. If matrix A is positive semi-definite and symmetric, so is B T A B for any matrix B as long as the multiplication can be defined (see Section 2.2.3). Hence, the following matrix is positive semi-definite and symmetric:
- PPdS ) ( vb] S
sS ) (
_psPS P~ ) _ ( V [ / , ] - S
S)"
(14.34) Here, we have used the fact that matrices V[/t] and S share the same range Tu(S) and hence P s S = SP~t = S and pSv[it]Pfft = V[/t]. Since S is positive semi-definite, matrix V[/t] - S is also positive semi-definite. [] 14.3
14.3.
Maximum
Likelihood
Correction
M a x i m u m likelihood estimator
Maximum likelihood estimation for this problem is to choose fi that maximize the likelihood, i.e., the probability density p(u; ft) viewed as a function of ~ for a given datum u. The resulting estimate/t is called the maximum likelihood estimator of ~t (see Section 3.6.1). In order to distinguish "variables" from their specific "values", we regard the symbol ~ as the true value and use x when it is viewed as a variable. Since maximizing the likelihood is equivalent to minimizing its logarithm with opposite sign, the problem reduces to the minimization J =-21ogp(u;x)-+ min (14.35) with respect to x E S, i.e., under the constraint
F(k)(x) -- O,
k - 1, ...,L,
x e hi.
(14.36)
Define an (nn)-matrix i, by L -- - V 2 logplx= ~.
(14.37)
14.3.
Maximum Likelihood Correction
461
With the expectation that the maximum likelihood estimator /t is located near ~, we write x - ~t + Ax. (14.38) Substituting this into the function J in eq. (14.35) and expanding it in the neighborhood of ~, we obtain J - c - 2(1, Ax) + (Ax, L A x ) + O(Ax) 3,
(14.39)
where ~ is the value of J evaluated at ~. Assumption 14.2 implies that the rank of L is no more than the dimension n' of the tangent space Tx(14). In order to guarantee unique existence of the value x E b / t h a t minimizes J, we assume the following: A s s u m p t i o n 14.6 For any u E Lt, the (nn)-matrix L is positive semidefinite and has range T~t(bt). Substituting eq. (14.38) into eq. (14.36) and taking a linear approximation, we obtain (V/~(k), Ax) -- 0. (14.40) The inherent constraint x E/d requires that Ax E Tu(b/) to a first approximation. Introducing Lagrange multipliers ~(k) to eq. (14.40), differentiating J with respect to Ax, and ignoring higher order terms, we obtain the following linear equation in Ax" L
L A x -- 1 + E
A(k)PUV/~(k)"
(14.41)
k--1
By Assumptions 14.2 and 14.6, eq. (14.41) has the following unique solution (see Section 2.3.3): L
Ax -- L - 1 + E
A(k)s
(14.42)
k=l
Substituting eq. (14.42)into eq. (14.40), we obtain -
(14.43)
where we put s - (~(k)) and define the (LL)-matrix ? ' - (iF '(k0) by (lY'(kl))
-
((V/~(k), l , - V / ~ ( t ) ) ) .
(14.44)
Since matrix I, has range Tx(g4), matrix V~ can be shown to be positive semi-definite and have range s by exactly the same argument as given in
462
Chapter 14.
General Theory of Geometric Estimation
the proof of Lemma 14.6. Consequently, if u is an ordinary value (see Section 14.2.2), the solution of eq. (14.43) is uniquely given by L
~(k) = _ E
l/V'(kl)(V/~(l)' A u ) '
(14.45)
/--1
where 17V'(kl) is the (kl) element of the (LL)-matrix I ~ ' - V ' - , i.e., (ITV l(kl)) -- ((V/~(k), L--V/7(/))) - 9
(14.46)
Substituting eq. (14.45) into eq. (14.42), we can express the maximum likelihood estimator/t in the form /t - ~ + S'l,
(14.47)
where we define the (nn)-matrix S / by L
S'-
L--
E 17v'(kl)(L-V/~(k))(L-V/~(0) T" k,l=l
(14.48)
By the argument similar to that in the proofs of Lemmas 14.2 and 14.3, we can show that S~LS ~ = S ~. Hence, the (nn)-matrix S' can be shown to be positive semi-definite and have range Tu(,5) by the same argument as given in the proof of Lemma 14.4. Consequently, the covariance matrix of the maximum likelihood estimator/t has the following form: V[/,] - E[S'H TS'].
1~.3.2
(14.49)
Geometric correction of the exponential family
If the distribution of u belongs to the exponential family (see Section 3.6.1), the probability density p(u; ~t) has the following expression (see eq. (3.146)):
p(u; ft) - C(~t)exp[(J'(u), ~) + g(u)].
(14.50)
Here, f ( u ) is an n-dimensional function of u e 7~n, and C(u) and g(u) are scalar functions of u G 7~n. We call Problem 14.1 geometric correction of the exponential family if the probability density p(u; ~t) has the above expression. For geometric correction of the exponential family, we see from eqs. (14.37) and (14.50) that the matrix L does not depend on u. Hence, the following equality holds (see eq. (3.122)): J-
E[s
= I,.
From this, we obtain the following proposition:
(14.51)
14.3.
M a x i m u m Likelihood Correction
463
14.1 The maximum likelihood estimator it for geometric correction of the exponential family is unbiased in the first order.
Proposition
Proof. If eq. (14.51) holds, the matrix S ' defined by eq. (14.48) is a constant matrix. Since El1] : 0 (see eq. (3.117)), we see from eq. (14.47) that E [
-
:
S E [
] :
o.
(14.52)
An estimator is said to be efficient if its covariance matrix attains the Cramer-Rao lower bound (see Section 3.5.2). Extending this concept, we say that an estimator for geometric correction is e]ficient if its covariance matrix attains the Cramer-Rao lower bound given by Theorem 14.1. 14.2 The maximum likelihood estimator it for geometric correction of the exponential family is efficient in the first order.
Proposition
Proof. Since L - J , we see that V ' - V (see eqs. (14.12) and (14.44)) and hence I26' - W (see eqs. (14.13) and (14.46)). It follows that S' - S (see eqs. (14.15) and (14.48)). From eq. (14.49), we see that V[it]- SE[a7-]S- SJS-
S,
where we have used L e m m a 14.3.
(14.53) [:3
The proofs of Propositions 14.1 and 14.2 are based on eq. (14.47), which has been obtained by approximating the function J in eq. (14.35) by a quadratic polynomial in the neighborhood of the true values and applying a linear approximation to the constraint (14.36). In other words, eq. (14.47) is an approximate expression in the limit of small noise. Accordingly, Propositions 14.1 and 14.2 hold in this sense, which is what the phrase "in the first order" means. This type of first order analysis is compatible with our approach of studying statistical behavior for small noise (see Section 14.1).
14.3.3 Computation of maximum likelihood correction
We now derive an analytical expression for the m a x i m u m likelihood estimat o r / t . First, we introduce a new assumption about the probability density p(u; x). We assume that viewed as a function of x it takes its m a x i m u m at u. In other words, given a single d a t u m u, the d a t u m u itself is the most likely value of the true value ~ if no other information is given. To be specific, what we need is the following" Assumption
14.7
Vxp(u; X)lx=u = 0.
(14.54)
464
Chapter 14.
General Theory of Geometric Estimation
For example, this assumption is satisfied if the probability density is expressed by a smooth scalar function f ( . ) that takes its maximum at the origin 0 in the form p(u; ~t) = f (u - ~t). The Gaussian distribution is a typical example. Assuming that the solution x of the optimization (14.35) is close to the datum u, we write x - u + Ax. Substituting this into the function J, expanding it in the neighborhood of the datum u, and using Assumption 14.7, we obtain J = c - (Ax, L A x ) + O ( A x ) 3. (14.55) The constant c is the value of J evaluated at u, and L is an (nn)-matrix defined by L
-Vx2 logplx=u.
-
(14.56)
In order to guarantee unique existence of the value x E L / t h a t minimizes J, we assume that Assumption 14.6 also applies to L" A s s u m p t i o n 14.8 For any u E 14, the (nn)-matrix L is positive semidefinite and has range Tu(N). The linear approximation of the constraint (14.36) is ( V F (k) , Ax) - - F (k) ,
(14.57)
where F (k) and V F (k) are the abbreviations of F(k)(u) and VF(k)(u), respectively. The inherent constraint x E L/requires that Ax E Tu(Lt) to a first approximation. Introducing Lagrange multipliers )~(k) to eq. (14.57), differentiating J with respect to Ax, and ignoring higher order terms, we obtain the solution Ax uniquely in the following form: L
-
(14.5s) k=l
Substituting this into eq. (14.57) and putting )~ - (A(k)), we obtain V)~ - - ( F ( k ) ) ,
(14.59)
where the (LL)-matrix V -- (V (kl)) is defined by
L-vF(')))
(14.60)
Here, a computational problem arises. If the L equations (14.36) are not algebraically independent but give only r constraints, the matrix V~ computed from the true value ~ by eq. (14.44) has rank r. However, the rank of the matrix V computed from the datum u by eq. (14.60) may be larger than r. In other words, among the L linear equations in Ax given by eq. (14.57), more than r equations may be linearly independent if evaluated at u r ft. If this
14.3.
Maximum Likelihood Correction
465
Fig. 14.3. The corrected value fi is in Tu(b/) but may not be in L/. Hence, a higher order correction is necessary. occurs, the constraint (14.36) is said to be degenerate (see Section 14.2.2). If the constraint (14.36) is degenerate, some of the positive eigenvalues of matrix V converge to 0 in the limit u --+ ~. Hence, eq. (14.59) viewed as a linear equation in ,k is ill-conditioned when the datum u is close to the true value s (see Section 2.3.2). In order to avoid this ill-posedness, we project both sides of eq. (14.59) onto the eigenspace defined by the largest r eigenvalues of V. In other words, we effectively use only r constraints from among the L linear equations (14.57). The resulting solution may be slightly different from the exact solution of eq. (14.59) but does not effectively affect the subsequent analysis because eq. (14.59) is a first order approximation. The solution of eq. (14.59) given by this projection is L
)~(k) = _ Z
W(kl)F(l)'
(14.61)
/=1
where W (kl) is the (kl) element of the rank-constrained generalized inverse (V)~- (see eq. (2.132)), which we write (14.62) Substituting eq. (14.61) into eq. (14.58), we obtain the following conclusion: P r o p o s i t i o n 14.3 The maximum likelihood estimator it is computed to a first approximation in the form L
it - u -
L- Z
W(kt)F(k)VF(l)"
(14.63)
k,l=l
This gives only a first order solution under the linearized inherent constraint i t - u E Tu(lg) (Fig. 14.3). In actual computation, we need to projected
466
Chapter 14.
General Theory of Geometric Estimation
T.(U) / ~ . . ~ ~U
u"
Fig. 14.4. Locally Gaussian distribution. thus computed value/t onto the data space L/by adding a higher order correction, as discussed in Section 5.1.3. The resulting solution still may not be exactly in the model 8 C b/, so the correction is repeated until the constraint is sufficiently satisfied.
~4.3.4
Locally Gaussian model
Suppose the distribution of u is locally Gaussian (Fig. 14.4; see Section 3.3.1): the probability density has the form p(u; r
- Ce - ( u - v t ' v [ u ] - ( u - g ) ) / 2 ,
(14.64)
where C is a normalization constant. The covariance matrix V[u] is assumed to have range T~t (L/) and have the expression
v[~] -
P~E[(~- a ) ( ~ -
(14.65)
a)~]P~.
This distribution belongs to the exponential family, and eq. (14.64) implies that L - I, - J -- V[/t]-. Hence, the minimization (14.35)is equivalent to (x - u, V [ u ] - ( x - u)) --+ min.
(14.66)
In other words, the solution /t is the Mahalanobis projection of u onto the model $, i.e., the nearest point to u in S measured in the Mahalanobis disthrice with respect to V[u] (see Section 13.2.1). From Proposition 14.3, the solution is obtained in the form L
u - V[u] ~ W(kOF(k)VF(o,
i~ -
(14.67)
k,l=l
(W (kl)) -
(( V F (k), V[/t]VF (0) )
,
(14.68)
P
which coincides with the expression derived in Section 5.1.2 (see eq. (5.17)). As shown in Section 5.1.4, the a posteriori covariance matrix of this estimator has the following expression (see eqs. (5.31) and (5.32)):
.t l-
L
E k,/--1
T
9
(14.69)
14.4.
General Parametric Fitting
467
T~(S)
Fig. 14.5. The datum u is projected onto the tangent point fi of the equilikelihood surface of u to b/, and the standard confidence region is projected onto Tu (U). Here, V[u] is the matrix obtained by projecting V[u] onto T/t(L/) (see eq. (5.26)), and l ~ (kl) is the (kl) element of the ( L L ) - m a t r i x I ~ - (W (kt)) defined by (l~ (kl)) -
((V/~(k), I}'[u]VF (0) )- ,
(14.70)
]o
where V/~(k) is an abbreviation of VF(k)(/t) (see eq. (5.33)). As discussed in Section 5.1.4, eq. (14.69) can be interpreted as the projection of the standard confidence region of u onto the tangent space T/t (S) to the model S at /t (Fig. 14.5); it gives a first order approximation to the Cramer-Rao lower bound. 14.4
1~.4.1
General Parametric Fitting
Definition of the problem
Let F(k)(a,u), k = 1, ..., L, be continuously differentiable scalar functions of arguments a E T4m and u E 7~n. The domains of a and u are assumed to be an ml-dimensional manifold A C gUn and an hi-dimensional manifold /2 C T4n, respectively; we call A the data space a n d / 2 the parameter space. Parametric fitting is the problem of computing the parameter u in such a way that the L equations F (k) (a, u) - 0,
k = 1, ..., L,
(14.71)
"fit" multiple instances {a~}, c~ = 1, ..., N, of the variable a. We call the vectors {a~} and u the data and the fitting parameter, respectively. As in Chapter 7.1, let us call the above L equations the hypothesis. Let a~ E A be the true value of a~, which we assume is unknown. We regard each d a t u m a~ as a r a n d o m variable with probability density p~(a~; gt~) parameterized by a~. As in the case of geometric correction, we assume that the distribution of a~ in the data space A is local and define the "true value"
468
Chapter 14.
General Theory of Geometric Estimation
g~ by the relationship given by eq. (3.29). However, such an explicit characterization of ds will not be used in the subsequent analysis, so it can be thought of simply as some value in A. The parametric fitting problem is formally stated as follows (see Section 7.1.1): P r o b l e m 14.2 Estimate the true value ~t of the fitting parameter u E Lt from
the data {as}, ~ = 1, ..., N, in such a way that their (unknown) true values {a.s}, a = 1, ..., N, satisfy the hypothesis F ( k ) ( a s , ~) = O,
k = 1,..., L.
(14.72)
As in the case of geometric correction, the probability density p~(as; as) is assumed to satisfy the same regularity conditions (see Section 14.2.1): 14.9 The probability density ps(as; ds) is continuously differentiable with respect to as and ds an arbitrary number of times, and
Assumption
ps(as; ds) > 0
(14.73)
for all as E A. Furthermore, the integration operation f A das with respect to a~ for any expression of ps(as; ds) is, if the integration exists, interchangeable with the differentiation operation Va, with respect to as. The probability density p~(as; a~) is not defined for ds ~ .A. For the convenience of analysis, however, we extend it to ds ~ ,4 in the same way as in the case of geometric correction. Namely, we assume that
ps(as; ds + A d s ) = ps(as; ds) + O ( A a , ) 2
(14.74)
for all ds E A and A d s E Tao(A) • where T a , ( A ) C T~m is the tangent space to the manifold jt at 5,s. In precise terms, we assume the following: Assumption
14.10
Vamp, E Ta, (A). 1~.~.2
(14.75)
The rank o f the hypothesis
The L equations (14.71) may be algebraically dependent. If only r of them are independent, they generally give only r effective constraints and define an (m ~ - r)-dimensional manifold 8, or a manifold S of codimension r, in the data space jt except at singular points. We assume that all {as } are not singular points and call r the rank of the hypothesis (14.71). A precise statement of this assumption, which also gives a precise definition of the rank r, is as follows: 14.11 The L equations of the hypothesis define a manifold $ of codimension r in the data space A in the neighborhood of as, a = 1, ..., N, for u = ~t. Assumption
14.4.
General Parametric Fitting
469
As in the preceding chapter, let us call the manifold $ the (geometric) model of Problem 14.2. If we write a s = a s + A a ~ , we have
/
§
t
9
__
Va/~(1)...,
VaF(~L)
(
)
Aa~
(14.76)
to a first approximation in A a a , where and hereafter VaF(k)(5~, ~t) is abbreviated as Va/Nk). The inherent constraint a s E ,4 requires that A a s E Tas (,4) to a first approximation. If A a s ranges over the entire tangent space T a , (A), eq. (14.76) defines a linear mapping from Tas (,4) to a linear subspace
z:a
-
{
9
I/Xa
e Tao (A)}
c
n
(14.77)
The dimension of this linem" subspace equals the dimension of the subspace
{ PaA VaF(1) , ..., PaAs VaF(L) } L C Ti m,
(14.78)
where p aA s is the m-dimensional projection matrix onto T a s ( A ) . Consequently, the dimension of s is at most the rank r of the hypothesis (14.71), but we further assume the following3: 14.12 The dimension of the linear subspace s the rank r of the hypothesis for all c~.
Assumption
is equal to
In m a t h e m a t i c a l terms, Assumption 14.12 states that each of the L equations of the hypothesis defines a manifold of codimension 1 in the d a t a space A such that they intersect each other transversally at 5,s, c~ = 1, ..., N. We say that a s is a singular datum if the dimension s is less than r, and an ordinary datum otherwise. Assumption 14.12 requires that singular data should be excluded from the fitting process. We say that the hypothesis (14.71) is nonsingularif Assumption 14.12 is satisfied, and singular otherwise (see Section 7.1.1). The linear s u b s p a c e / : a s defined by eq. (14.77) can alternatively be defined as the tangent space to the manifold
jz _ {
9 F(L) (a, ~t)
e 7"iLia e A} e Ti n
(14.79)
3As a result, the L equations (14.71) cannot be replaced by a single equation, say = {0} for all a.
L Ek--1 F(k)(a' u)2 = 0, which still has rank r but s
470
Chapter 14.
General Theory of Geometric Estimation
at a,m E A. However, am can be a singular point of the mapping (F(1)(a, fi), .... F (L)(a, ~))T: ,4 --+ 7~L. In other words, the linear subspace s defined by replacing gm by am ( r gin) in eq. (14.77) has a higher dimension that r. If this is the case, we say that the hypothesis is degenerate (see Section 7.1.1). We further assume that no effective constraints exist on ~t other than the inherent constraint fi E b/. Putting it differently, we assume that for an arbitrary variation A u of ~ there exists a variation A~m of am such that
F(k)(am + A6,m, ~ + Au) = 0,
k = 1,..., L,
(14.80)
for c~ - 1, ..., N. The above equation is equivalent to k-
1,...,L,
(14.81)
to a first approximation, where and hereafter VuF(k)(~m, ~) is abbreviated as VuF (k). Let Tu(/d) be the tangent space to the manifold/d at ~. The inherent constraint u E ?d requires that A u E T~t (b/) to a first approximation. Similarly, the inherent constraint ~m E A requires that A~m E Ta~ (A) to a first approximation. From the definition of the linear subspace s the above assumption is formally stated as follows: A s s u m p t i o n 14.13 For an arbitrary Au E T~t(Lt),
E s
9
1~.~.3
(14.82)
Cramer-Rao lower bound for parametric fitting
Let Im be the score of am with respect to am: lm - Va~ logpm.
(14.83)
The Fisher information matrix Jm with respect to am is defined by w
-T
gm - E[lmlm ].
(14.84)
The symbol El. ] denotes the expectation with respect to the joint probability density 4 I-I N m=l pro(am; am). Assumption 14.10 implies that lm E Ta~(A). We_ assume that the distribution pro(am; ~tm) is regular with respect to am, i.e., lm takes all orientations in Tam (A) as am ranges over the entire data space A. In precise terms, this assumption is stated as follows: 4Since each aa is assumed to be independent, the expectation E[.] of a quantity that depends on aa alone equals the expectation with respect to pa(aa;~a) alone.
14.4.
General Parametric Fitting
471
A s s u m p t i o n 14.14 The Fisher information matrix J s is positive semi-
definite and has range Tvt. (.A). Define an (LL)-matrix V~ - ( ~ ( k 0 )
by
(g~k,,)_ ((Var(k),j-~Va~(O)), (LL)-m
trix
-
(-Ws(kl) ) -
by
-
Y2,
(14.85)
or
((Va/>(k) , J-~Va />(0 )) - .
(14.86)
Define an (nn)-matrix M by N
- E s=l
L
E
u
-(k)
17Vak')(P~VuF~ ) ( P U V u )
/>(t) T ,
(14.87)
k,l=l
where p U is the n-dimensional projection matrix onto the tangent space Tu (b/) to the manifold gg at R. We c a l l / ~ the moment matrix (see eq. (r.2a)). We assume that the number of the data {as } is sufficiently large and their true values {R~} distribute "generally" in the model $. In other words, we exclude the possibility that { a s } have a special configuration (being identical, collinear, coplanar, etc.) such that the fitting parameter u is indeterminate. In precise terms, this assumption is stated as follows: A s s u m p t i o n 14.15 The moment matrix M is positive semi-definite and has
range Tu (gt). L e t / t - / t ( a l , . . . , aN) be an unbiased estimator of u determined from the data {as}. Since u is constrained to be in the parameter space U, it must satisfy/t(al,..., aN) E ~ for any a~ E A, a - 1, ..., N. Its unbiasedness is defined by P ~ E[/t - R] - 0. (14.88) As in the case of geometric correction, we assume that the distribution of /t is sufficiently localized around ~ in the parameter space b / a n d define the covariance matrix of/t in the following form: V [ / t ] - P-UuE[(it - R ) ( / t - R)q-]Pg.
(14.89)
We now prove the following theorem: T h e o r e m 14.2 V[/t] __>/~-.
(14.90)
Since Theorem 14.2 is an analogue of the Cramer-Rao inequality (3.134) in Section 3.5.2, we call eq. (14.90) and the bound it gives the Cramer-Rao inequality and the Cramer-Rao lower bound, respectively, for parametric fitting.
472
Chapter 14.
14.4.4
General Theory of Geometric Estimation
Proof of the main theorem
Lemma 14.6   The $(LL)$-matrix $\bar V_\alpha$ is positive semi-definite and has range $\mathcal L_{\bar a_\alpha}$.

Proof. Let $\eta = (\eta^{(k)})$ be an arbitrary $L$-vector. We have

$(\eta,\ \bar V_\alpha\eta) = \Bigl(\sum_{k=1}^L\eta^{(k)}\nabla_a\bar F_\alpha^{(k)},\ \bar J_\alpha^-\sum_{l=1}^L\eta^{(l)}\nabla_a\bar F_\alpha^{(l)}\Bigr).$    (14.91)

Since $\bar J_\alpha$ is positive semi-definite, its generalized inverse $\bar J_\alpha^-$ is also positive semi-definite. It follows that the right-hand side of eq. (14.91) is nonnegative for an arbitrary $\eta$, meaning that $\bar V_\alpha$ is positive semi-definite. Suppose $\eta$ belongs to the null space of $\bar V_\alpha$. Then, the left-hand side of eq. (14.91) is 0. Hence, $\sum_{k=1}^L\eta^{(k)}\nabla_a\bar F_\alpha^{(k)}$ belongs to the null space of $\bar J_\alpha^-$, which is identical with the null space of $\bar J_\alpha$. Assumption 14.14 implies that the null space of $\bar J_\alpha$ coincides with $T_{\bar a_\alpha}(\mathcal A)^\perp$. Hence,

$\Bigl(\sum_{k=1}^L\eta^{(k)}\nabla_a\bar F_\alpha^{(k)},\ \Delta a_\alpha\Bigr) = 0$    (14.92)

for an arbitrary $\Delta a_\alpha\in T_{\bar a_\alpha}(\mathcal A)$. From the definition of the subspace $\mathcal L_{\bar a_\alpha}$, this means that $\eta\in\mathcal L_{\bar a_\alpha}^\perp$. Conversely, if $\eta\in\mathcal L_{\bar a_\alpha}^\perp$, we obtain $(\eta,\bar V_\alpha\eta) = 0$ by following eqs. (14.92) and (14.91) backward. Hence, the null space of the $(LL)$-matrix $\bar V_\alpha$ coincides with the $(L-r)$-dimensional subspace $\mathcal L_{\bar a_\alpha}^\perp\subset\mathbb R^L$. Consequently, the range of $\bar V_\alpha$ coincides with $\mathcal L_{\bar a_\alpha}$. □

Consider the $(LL)$-matrix $\bar W_\alpha$ defined by eq. (14.86). Let $P^{\mathcal L}_{\bar a_\alpha}$ be the $L$-dimensional projection matrix onto $\mathcal L_{\bar a_\alpha}$. Lemma 14.6 implies the following:

Corollary 14.3   The $(LL)$-matrix $\bar W_\alpha$ is positive semi-definite and has range $\mathcal L_{\bar a_\alpha}$, and the following relationship holds:

$\bar V_\alpha\bar W_\alpha = P^{\mathcal L}_{\bar a_\alpha}.$    (14.93)
The following lemma plays an essential role in proving Theorem 14.2.

Lemma 14.7   The variation $\Delta\bar a_\alpha$ of $\bar a_\alpha$ defined by

$\Delta\bar a_\alpha = -\sum_{k,l=1}^L\bar W_\alpha^{(kl)}(\nabla_u\bar F_\alpha^{(l)},\,\Delta u)\,\bar J_\alpha^-\nabla_a\bar F_\alpha^{(k)}$    (14.94)

satisfies $\Delta\bar a_\alpha\in T_{\bar a_\alpha}(\mathcal A)$ for any $\Delta u\in T_{\bar u}(\mathcal U)$, and to a first approximation

$F^{(k)}(\bar a_\alpha+\Delta\bar a_\alpha,\ \bar u+\Delta u) = 0, \qquad k = 1,\dots,L.$    (14.95)
Proof. Assumption 14.10 implies $l_\alpha\in T_{\bar a_\alpha}(\mathcal A)$. Hence, we have $P^{\mathcal A}_{\bar a_\alpha}\bar J_\alpha = \bar J_\alpha$ from the definition (14.84) of the Fisher information matrix $\bar J_\alpha$. Consequently, $P^{\mathcal A}_{\bar a_\alpha}\bar J_\alpha^- = \bar J_\alpha^-$, and eq. (14.94) implies $P^{\mathcal A}_{\bar a_\alpha}\Delta\bar a_\alpha = \Delta\bar a_\alpha$, i.e., $\Delta\bar a_\alpha\in T_{\bar a_\alpha}(\mathcal A)$. Eq. (14.95) is proved by showing

$(\nabla_a\bar F_\alpha^{(k)},\,\Delta\bar a_\alpha) = -(\nabla_u\bar F_\alpha^{(k)},\,\Delta u).$    (14.96)

From eqs. (14.93) and (14.94), we see that

$(\nabla_a\bar F_\alpha^{(k)},\,\Delta\bar a_\alpha) = -\sum_{m=1}^L\sum_{l=1}^L(\nabla_a\bar F_\alpha^{(k)},\ \bar J_\alpha^-\nabla_a\bar F_\alpha^{(m)})\,\bar W_\alpha^{(ml)}(\nabla_u\bar F_\alpha^{(l)},\,\Delta u) = -\sum_{l=1}^L(P^{\mathcal L}_{\bar a_\alpha})_{kl}(\nabla_u\bar F_\alpha^{(l)},\,\Delta u),$    (14.97)

where $(P^{\mathcal L}_{\bar a_\alpha})_{kl}$ is the $(kl)$ element of the projection matrix $P^{\mathcal L}_{\bar a_\alpha}$. Assumption 14.13 implies that the $L$-vector $\bigl((\nabla_u\bar F_\alpha^{(k)},\,\Delta u)\bigr)$ belongs to the subspace $\mathcal L_{\bar a_\alpha}$. Hence, the last term of eq. (14.97) is equal to $-(\nabla_u\bar F_\alpha^{(k)},\,\Delta u)$. □
Define an $n$-vector random variable $m_\alpha$ by

$m_\alpha = \sum_{k,l=1}^L\bar W_\alpha^{(kl)}(l_\alpha,\ \bar J_\alpha^-\nabla_a\bar F_\alpha^{(k)})\,P^{\mathcal U}_{\bar u}\nabla_u\bar F_\alpha^{(l)}.$    (14.98)

Lemma 14.8

$E\Bigl[P^{\mathcal U}_{\bar u}(\hat u-\bar u)\Bigl(\sum_{\alpha=1}^N m_\alpha\Bigr)^{\!\top}\Bigr] = -P^{\mathcal U}_{\bar u}.$    (14.99)
Proof. Since the estimator $\hat u$ is unbiased, eq. (14.88) holds identically for any $\bar a_\alpha\in\mathcal S$, $\alpha = 1,\dots,N$, and $\bar u\in\mathcal U$ that satisfy eq. (14.72). This means that the first variation of the left-hand side of eq. (14.88) should be 0 for any $\Delta u\in T_{\bar u}(\mathcal U)$ if the variation $\Delta\bar a_\alpha$ defined by eq. (14.94) is added to each $\bar a_\alpha$. Taking the first variation of the left-hand side of eq. (14.88), substituting eq. (14.94) into it, noting Assumption 14.9, and using the logarithmic differentiation formula (3.115), we obtain

$\int_{\mathcal A}\!\!\cdots\!\int_{\mathcal A}P^{\mathcal U}_{\bar u}(\hat u-\bar u)\prod_{\beta=1}^N p_\beta\,\sum_{\alpha=1}^N(l_\alpha,\,\Delta\bar a_\alpha)\,da_1\cdots da_N - \Delta u$
$\qquad = -\Delta u - E\Bigl[P^{\mathcal U}_{\bar u}(\hat u-\bar u)\sum_{\alpha=1}^N\sum_{k,l=1}^L\bar W_\alpha^{(kl)}(l_\alpha,\ \bar J_\alpha^-\nabla_a\bar F_\alpha^{(k)})\,\nabla_u\bar F_\alpha^{(l)\top}\Bigr]\Delta u.$    (14.100)
Here, we have used the assumption that $P^{\mathcal U}_{\bar u+\Delta u}E[\hat u-\bar u] \approx P^{\mathcal U}_{\bar u}E[\hat u-\bar u]$ (see Section 3.5.2). The above expression should be 0 for any $\Delta u\in T_{\bar u}(\mathcal U)$. If we let $\Delta u = P^{\mathcal U}_{\bar u}\Delta\tilde u$, we have $\Delta u\in T_{\bar u}(\mathcal U)$ for any $\Delta\tilde u\in\mathbb R^n$. Hence,

$E\Bigl[P^{\mathcal U}_{\bar u}(\hat u-\bar u)\sum_{\alpha=1}^N\sum_{k,l=1}^L\bar W_\alpha^{(kl)}(l_\alpha,\ \bar J_\alpha^-\nabla_a\bar F_\alpha^{(k)})\,\nabla_u\bar F_\alpha^{(l)\top}\Bigr]P^{\mathcal U}_{\bar u}\Delta\tilde u = -P^{\mathcal U}_{\bar u}\Delta\tilde u.$    (14.101)

Since this is an identity for $\Delta\tilde u\in\mathbb R^n$, we obtain eq. (14.99). □

Lemma 14.9

$E\Bigl[\Bigl(\sum_{\alpha=1}^N m_\alpha\Bigr)\Bigl(\sum_{\beta=1}^N m_\beta\Bigr)^{\!\top}\Bigr] = \bar M.$    (14.102)
Proof. Since the expectation of the score l~ is 0 (see eq. (3.117)), eq. (14.98) implies that E[m~] - O. Noting that each a~ is independent, we see that T
E[
ms
mz
ol=l
l-
~-1
E
E[m~mT] - E E[m~mT]" (14.103)
o~,~ 1
c~=l
From eq. (14.98), we obtain L -
W~
-
.. ~
,g-~E[lo~l,~]g-~VaF(m))
(Va
k,l,m,n----1
= E l,n--1
,j~Va~(m)),.oe
-(lk) k,m--1
L -
-
E
T
I ~ ( ' n ) ( P ~ V u / ~ ( t ) ) (PuVu/~('~))
--/~'
(14.104)
l,n=l
where we have used the defining equation (14.86) of the matrix W a and the identity W ~ V ~ W ~ - W ~ W ~ W ~ - I~r [] The proof of Theorem 14.2 is as follows. From eq. (14.89) and Lemmas 14.8 and 14.9, we see that T
,
-P~M )"
(14.105)
This matrix is symmetric and positive semi-definite. If a matrix $A$ is positive semi-definite and symmetric, so is $B^\top AB$ for any matrix $B$ as long as the multiplication can be defined (see Section 2.2.3). Hence, the following matrix is positive semi-definite and symmetric:

$\begin{pmatrix}I & O\\ \bar M^- & I\end{pmatrix}^{\!\top}\begin{pmatrix}V[\hat u] & -P^{\mathcal U}_{\bar u}\\ -P^{\mathcal U}_{\bar u} & \bar M\end{pmatrix}\begin{pmatrix}I & O\\ \bar M^- & I\end{pmatrix} = \begin{pmatrix}V[\hat u]-\bar M^- & O\\ O & \bar M\end{pmatrix}.$    (14.106)

Here, we have noted that the covariance matrix $V[\hat u]$ and the moment matrix $\bar M$ share the same range $T_{\bar u}(\mathcal U)$ and hence $\bar M\bar M^- = \bar M^-\bar M = P^{\mathcal U}_{\bar u}$. Since the above matrix is positive semi-definite, the matrix $V[\hat u]-\bar M^-$ is also positive semi-definite. □

Suppose the probability densities $p_\alpha(a_\alpha;\bar a_\alpha)$ of the data $\{a_\alpha\}$ are expressed in terms of a common function $p(\cdot\,;\cdot)$ in the form $p_\alpha(a_\alpha;\bar a_\alpha) = p(a_\alpha;\bar a_\alpha)$. Then, the weight $\bar W_\alpha^{(kl)}$ defined by eq. (14.86) is also expressed in terms of a common function $\bar W^{(kl)}(\cdot)$ in the form $\bar W_\alpha^{(kl)} = \bar W^{(kl)}(\bar a_\alpha)$. Suppose the true values $\{\bar a_\alpha\}$ are chosen from a bounded subset $\mathcal S'$ of the model $\mathcal S$ with density $\rho(a)$ (the number of data is given by $N = \int_{\mathcal S'}\rho(a)\,da$). Define the following matrix (cf. eq. (14.87)):

$\int_{\mathcal S'}\sum_{k,l=1}^L\bar W^{(kl)}(a)\,\bigl(P^{\mathcal U}_{\bar u}\nabla_uF^{(k)}(a,\bar u)\bigr)\bigl(P^{\mathcal U}_{\bar u}\nabla_uF^{(l)}(a,\bar u)\bigr)^{\!\top}\rho(a)\,da.$    (14.107)

If $N$ is sufficiently large, the Cramer-Rao lower bound is approximated by the generalized inverse of this matrix, which we call the asymptotic approximation of the Cramer-Rao lower bound.
14.5  Maximum Likelihood Fit

14.5.1  Maximum likelihood estimator

Maximum likelihood estimation for this problem is to choose $\{\bar a_\alpha\}$ and the fitting parameter $u$ that maximize the likelihood, i.e., the joint probability density $\prod_{\alpha=1}^N p_\alpha(a_\alpha;\bar a_\alpha)$ viewed as a function of $\{\bar a_\alpha\}$ for the given data $\{a_\alpha\}$. The resulting estimates $\{\hat a_\alpha\}$ and $\hat u$ are called the maximum likelihood estimators of $\{\bar a_\alpha\}$ and $u$, respectively. In order to distinguish "variables" from their specific "values", we regard the symbol $\bar a_\alpha$ as the true value and use $x_\alpha$ when it is viewed as a variable. Since maximizing the likelihood is equivalent to minimizing its logarithm with opposite sign, the problem reduces to the
minimization

$J = -2\sum_{\alpha=1}^N\log p_\alpha(a_\alpha;x_\alpha) \;\rightarrow\; \min$    (14.108)

with respect to $x_\alpha\in\mathcal A$ and $u\in\mathcal U$, i.e., under the hypothesis

$F^{(k)}(x_\alpha,u) = 0, \qquad x_\alpha\in\mathcal A, \qquad k = 1,\dots,L, \qquad u\in\mathcal U.$    (14.109)
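As a small illustration (added here, not part of the original text; the helper name and arguments are hypothetical), for Gaussian densities the objective $J$ of eq. (14.108) is, up to an additive constant, a sum of squared Mahalanobis distances between the data and the candidate true values.

```python
import numpy as np

def J_gauss(data, x, covs):
    """Evaluate J of eq. (14.108) for Gaussian p_alpha(a; x) with mean x and covariance V_alpha.

    Up to an additive constant, -2 log p_alpha(a_alpha; x_alpha) equals the squared
    Mahalanobis distance (a_alpha - x_alpha, V_alpha^{-1} (a_alpha - x_alpha)).
    """
    total = 0.0
    for a, xa, V in zip(data, x, covs):
        d = a - xa
        total += d @ np.linalg.solve(V, d)
    return total
```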
Define an $(mm)$-matrix $\bar L_\alpha$ by

$\bar L_\alpha = -\nabla_x^2\log p_\alpha\big|_{x_\alpha=\bar a_\alpha}.$    (14.110)

With the expectation that the maximum likelihood estimators $\{\hat a_\alpha\}$ and $\hat u$ are located near $\{\bar a_\alpha\}$ and $\bar u$, respectively, we write

$x_\alpha = \bar a_\alpha + \Delta x_\alpha, \qquad u = \bar u + \Delta u.$    (14.111)

Substituting the first equation into the function $J$ in eq. (14.108) and expanding it in the neighborhood of $\bar a_\alpha$, we obtain

$J = \bar c - 2\sum_{\alpha=1}^N(l_\alpha,\,\Delta x_\alpha) + \sum_{\alpha=1}^N(\Delta x_\alpha,\,\bar L_\alpha\Delta x_\alpha) + \sum_{\alpha=1}^N O(\Delta x_\alpha)^3,$    (14.112)

where $\bar c$ is the value of $J$ evaluated at $\bar a_\alpha$, $\alpha = 1,\dots,N$. Assumption 14.10 implies that the rank of $\bar L_\alpha$ is no more than the dimension $m'$ of the tangent space $T_{\bar a_\alpha}(\mathcal A)$. In order to guarantee unique existence of the values $x_\alpha\in\mathcal A$, $\alpha = 1,\dots,N$, that minimize $J$, we assume the following:

Assumption 14.16   For any $\bar a_\alpha\in\mathcal A$, the matrix $\bar L_\alpha$ is positive semi-definite and has range $T_{\bar a_\alpha}(\mathcal A)$.

Substituting eqs. (14.111) into the hypothesis (14.109) and taking a linear approximation, we obtain

$(\nabla_a\bar F_\alpha^{(k)},\,\Delta x_\alpha) = -(\nabla_u\bar F_\alpha^{(k)},\,\Delta u).$    (14.113)

The inherent constraint $x_\alpha\in\mathcal A$ requires that $\Delta x_\alpha\in T_{\bar a_\alpha}(\mathcal A)$ to a first approximation. We seek the value $\Delta x_\alpha$ that minimizes $J$ for a fixed value of $\Delta u$ under the linearized hypothesis (14.113). Introducing Lagrange multipliers $\lambda_\alpha^{(k)}$ to eq. (14.113), differentiating $J$ with respect to $\Delta x_\alpha$, and ignoring higher order terms, we obtain the following linear equation in $\Delta x_\alpha$:

$\bar L_\alpha\Delta x_\alpha = l_\alpha + \sum_{k=1}^L\lambda_\alpha^{(k)}\,P^{\mathcal A}_{\bar a_\alpha}\nabla_a\bar F_\alpha^{(k)}.$    (14.114)
By Assumptions 14.10 and 14.16, eq. (14.114) has the following unique solution:

$\Delta x_\alpha = \bar L_\alpha^-\,l_\alpha + \sum_{k=1}^L\lambda_\alpha^{(k)}\,\bar L_\alpha^-\nabla_a\bar F_\alpha^{(k)}.$    (14.115)

Substituting eq. (14.115) into eq. (14.113) and putting $\lambda_\alpha = (\lambda_\alpha^{(k)})$, we obtain

$\bar V'_\alpha\lambda_\alpha = -\bigl((\nabla_u\bar F_\alpha^{(k)},\,\Delta u)\bigr) - \bigl((\nabla_a\bar F_\alpha^{(k)},\ \bar L_\alpha^-l_\alpha)\bigr),$    (14.116)

where the $(LL)$-matrix $\bar V'_\alpha = (\bar V_\alpha'^{(kl)})$ is defined by

$(\bar V_\alpha'^{(kl)}) = \bigl((\nabla_a\bar F_\alpha^{(k)},\ \bar L_\alpha^-\nabla_a\bar F_\alpha^{(l)})\bigr).$    (14.117)

By Assumption 14.16, the matrix $\bar V'_\alpha$ can be shown to be positive semi-definite and to have range $\mathcal L_{\bar a_\alpha}$ by exactly the same argument as given in the proof of Lemma 14.6. Consequently, if $\bar a_\alpha$ is an ordinary datum (see Section 14.4.2), the solution of eq. (14.116) is uniquely given by

$\lambda_\alpha^{(k)} = -\sum_{l=1}^L\bar W_\alpha'^{(kl)}\Bigl((\nabla_u\bar F_\alpha^{(l)},\,\Delta u) + (\nabla_a\bar F_\alpha^{(l)},\ \bar L_\alpha^-l_\alpha)\Bigr),$    (14.118)

where $\bar W_\alpha'^{(kl)}$ is the $(kl)$ element of the $(LL)$-matrix $\bar W'_\alpha$ defined by $\bar W'_\alpha = (\bar V'_\alpha)_r^-$, i.e.,

$(\bar W_\alpha'^{(kl)}) = \bigl((\nabla_a\bar F_\alpha^{(k)},\ \bar L_\alpha^-\nabla_a\bar F_\alpha^{(l)})\bigr)_r^-.$    (14.119)

Substituting eq. (14.118) into eq. (14.115), substituting the resulting expression for $\Delta x_\alpha$ into eq. (14.112), and ignoring higher order terms, we obtain the following form:

$J = \bar c + \sum_{\alpha=1}^N\Bigl(\sum_{k,l=1}^L\bar W_\alpha'^{(kl)}\bigl((\nabla_u\bar F_\alpha^{(k)},\,\Delta u) + (\nabla_a\bar F_\alpha^{(k)},\ \bar L_\alpha^-l_\alpha)\bigr)\bigl((\nabla_u\bar F_\alpha^{(l)},\,\Delta u) + (\nabla_a\bar F_\alpha^{(l)},\ \bar L_\alpha^-l_\alpha)\bigr) - (l_\alpha,\ \bar L_\alpha^-l_\alpha)\Bigr).$    (14.120)

We now seek the value $\Delta u$ that minimizes eq. (14.120). The inherent constraint $u\in\mathcal U$ requires that $\Delta u\in T_{\bar u}(\mathcal U)$ to a first approximation. Differentiating eq. (14.120) with respect to $\Delta u$, we obtain the following linear equation in $\Delta u$:

$M'\Delta u = -\sum_{\alpha=1}^N\sum_{k,l=1}^L\bar W_\alpha'^{(kl)}(\nabla_a\bar F_\alpha^{(k)},\ \bar L_\alpha^-l_\alpha)\,P^{\mathcal U}_{\bar u}\nabla_u\bar F_\alpha^{(l)}.$    (14.121)
Here, the matrix $M'$ is defined as follows (see eq. (14.87)):

$M' = \sum_{\alpha=1}^N\sum_{k,l=1}^L\bar W_\alpha'^{(kl)}\,(P^{\mathcal U}_{\bar u}\nabla_u\bar F_\alpha^{(k)})(P^{\mathcal U}_{\bar u}\nabla_u\bar F_\alpha^{(l)})^\top.$    (14.122)

We assume that Assumption 14.15 also applies to $M'$:

Assumption 14.17   The matrix $M'$ is positive semi-definite and has range $T_{\bar u}(\mathcal U)$.

Then, the solution of eq. (14.121) is uniquely given by

$\Delta u = -M'^-\sum_{\alpha=1}^N\sum_{k,l=1}^L\bar W_\alpha'^{(kl)}(\nabla_a\bar F_\alpha^{(k)},\ \bar L_\alpha^-l_\alpha)\,P^{\mathcal U}_{\bar u}\nabla_u\bar F_\alpha^{(l)}.$    (14.123)

Since the maximum likelihood estimator of $u$ is given by $\hat u = \bar u + \Delta u$, the covariance matrix $V[\hat u]$ (see eq. (14.89)) is evaluated as follows:

$V[\hat u] = E[M'^-M''M'^-].$    (14.124)

Here, the matrix $M''$ is defined by

$M'' = \sum_{\alpha,\beta=1}^N\sum_{k,l,m,n=1}^L\bar W_\alpha'^{(kl)}\bar W_\beta'^{(mn)}\,(\nabla_a\bar F_\alpha^{(k)},\ \bar L_\alpha^-l_\alpha)(\nabla_a\bar F_\beta^{(m)},\ \bar L_\beta^-l_\beta)\,(P^{\mathcal U}_{\bar u}\nabla_u\bar F_\alpha^{(l)})(P^{\mathcal U}_{\bar u}\nabla_u\bar F_\beta^{(n)})^\top.$    (14.125)

14.5.2  Fitting of the exponential family
If the distribution of each $a_\alpha$ belongs to the exponential family, the probability density $p_\alpha(a_\alpha;\bar a_\alpha)$ has the following expression (see eq. (3.146)):

$p_\alpha(a_\alpha;\bar a_\alpha) = \exp\bigl((f_\alpha(a_\alpha),\ \bar a_\alpha) + c_\alpha(\bar a_\alpha) + g_\alpha(a_\alpha)\bigr).$    (14.126)

Here, $f_\alpha(a)$ is an $m$-dimensional function of $a\in\mathbb R^m$, and $c_\alpha(a)$ and $g_\alpha(a)$ are scalar functions of $a\in\mathbb R^m$. We call Problem 14.2 parametric fitting of the exponential family if the probability densities $p_\alpha(a_\alpha;\bar a_\alpha)$ have the above expression. For parametric fitting of the exponential family, we see from eqs. (14.110) and (14.126) that the matrix $\bar L_\alpha$ does not depend on $a_\alpha$. Consequently, we have the following relationship:

$\bar J_\alpha = E[\bar L_\alpha] = \bar L_\alpha.$    (14.127)

From this, we obtain the following proposition:
Proposition 14.4   The maximum likelihood estimator $\hat u$ for parametric fitting of the exponential family is unbiased in the first order.

Proof. If eq. (14.127) holds, the matrix $M'$ defined by eq. (14.122) is a constant matrix. Since $E[l_\alpha] = 0$, we see from eq. (14.123) that

$E[\Delta u] = 0.$    (14.128)

□

As in the case of geometric correction, we say that an estimator for parametric fitting is efficient if its covariance matrix attains the Cramer-Rao lower bound given by Theorem 14.2.

Proposition 14.5   The maximum likelihood estimator $\hat u$ for parametric fitting of the exponential family is efficient in the first order.

Proof. Since $\bar L_\alpha = \bar J_\alpha$, we see that $\bar V'_\alpha = \bar V_\alpha$ (see eqs. (14.85) and (14.117)) and hence $\bar W'_\alpha = \bar W_\alpha$ (see eqs. (14.86) and (14.119)). It follows that $M' = \bar M$ (see eqs. (14.87) and (14.122)). Taking the expectation of eq. (14.125) and noting the independence of each $a_\alpha$, we see that

$E[M''] = \sum_{\alpha,\beta=1}^N\sum_{k,l,m,n=1}^L\bar W_\alpha^{(kl)}\bar W_\beta^{(mn)}\,(\nabla_a\bar F_\alpha^{(k)},\ \bar J_\alpha^-E[l_\alpha l_\beta^\top]\bar J_\beta^-\nabla_a\bar F_\beta^{(m)})\,(P^{\mathcal U}_{\bar u}\nabla_u\bar F_\alpha^{(l)})(P^{\mathcal U}_{\bar u}\nabla_u\bar F_\beta^{(n)})^\top$
$\qquad = \sum_{\alpha=1}^N\sum_{k,l,m,n=1}^L\bar W_\alpha^{(kl)}\bar W_\alpha^{(mn)}\,(\nabla_a\bar F_\alpha^{(k)},\ (\bar J_\alpha^-\bar J_\alpha\bar J_\alpha^-)\nabla_a\bar F_\alpha^{(m)})\,(P^{\mathcal U}_{\bar u}\nabla_u\bar F_\alpha^{(l)})(P^{\mathcal U}_{\bar u}\nabla_u\bar F_\alpha^{(n)})^\top$
$\qquad = \sum_{\alpha=1}^N\sum_{l,n=1}^L\bar W_\alpha^{(ln)}\,(P^{\mathcal U}_{\bar u}\nabla_u\bar F_\alpha^{(l)})(P^{\mathcal U}_{\bar u}\nabla_u\bar F_\alpha^{(n)})^\top = \bar M,$    (14.129)

where we have used the identities $\bar J_\alpha^-\bar J_\alpha\bar J_\alpha^- = \bar J_\alpha^-$ and $\bar W_\alpha\bar V_\alpha\bar W_\alpha = \bar W_\alpha$. Thus, eq. (14.124) is written as

$V[\hat u] = \bar M^-\bar M\bar M^- = \bar M^-.$

□
Recall that an estimator is consistent if it converges to the true value as the number of data increases (see Section 3.6.2).

Proposition 14.6   The maximum likelihood estimator for parametric fitting of the exponential family is consistent in the first order if all $\bar a_\alpha$, $\alpha = 1,\dots,N$, are in a bounded subset of the model $\mathcal S$.

Proof. If all $\bar a_\alpha$ are bounded, we have $\bar M = O(N)$ from the definition of the moment matrix $\bar M$ (see eq. (14.87)). Hence, $V[\hat u] = \bar M^- = O(1/N)$. □

The proofs of Propositions 14.4, 14.5, and 14.6 are based on eq. (14.123), which is obtained by approximating the function $J$ in eq. (14.108) by a quadratic polynomial in the neighborhood of the true values and applying a linear approximation to the hypothesis (14.109). In other words, eq. (14.123) is an expression in the limit of small noise. Accordingly, Propositions 14.4, 14.5, and 14.6 hold in this sense, which is what the phrase "in the first order" means. As in the case of geometric correction, this type of first order analysis is compatible with our approach of studying statistical behavior for small noise (see Section 14.1).
14.5.3  Computation of maximum likelihood fit
We now derive a computational scheme for the maximum likelihood estimator $\hat u$ by eliminating the estimators $\{\hat a_\alpha\}$, $\alpha = 1,\dots,N$. To this end, we first compute the solution $x_\alpha$ of the optimization (14.108) under the hypothesis (14.109) for a fixed $u$ and then substitute the resulting expression back into eq. (14.108), reducing it to a function of $u$ alone. As in the case of geometric correction, we first introduce a new assumption about the probability density $p_\alpha(a_\alpha;x_\alpha)$. We assume that, viewed as a function of $x_\alpha$, it takes its maximum at $a_\alpha$. In other words, given a single datum $a_\alpha$, the datum $a_\alpha$ itself is the most likely value of the true value $\bar a_\alpha$ if no other information is given. To be specific, what we need is the following:

Assumption 14.18

$\nabla_x p_\alpha(a_\alpha;x_\alpha)\big|_{x_\alpha=a_\alpha} = 0.$    (14.130)

For example, this assumption is satisfied if the probability density is expressed by a smooth scalar function $f(\cdot)$ that takes its maximum at the origin $0$ in the form $p_\alpha(a_\alpha;\bar a_\alpha) = f(a_\alpha-\bar a_\alpha)$. The Gaussian distribution is a typical example. Assuming that the solution $x_\alpha$ of the optimization (14.108) is close to the data value $a_\alpha$, we write $x_\alpha = a_\alpha + \Delta x_\alpha$. Substituting this into the function $J$, expanding it in the neighborhood of the datum $a_\alpha$, and using Assumption
14.18, we obtain

$J = c + \sum_{\alpha=1}^N(\Delta x_\alpha,\,L_\alpha\Delta x_\alpha) + \sum_{\alpha=1}^N O(\Delta x_\alpha)^3.$    (14.131)

The constant $c$ is the value of $J$ evaluated at $a_\alpha$, $\alpha = 1,\dots,N$, and $L_\alpha$ is an $(mm)$-matrix defined by

$L_\alpha = -\nabla_x^2\log p_\alpha\big|_{x_\alpha=a_\alpha}.$    (14.132)

In order to guarantee unique existence of the values $x_\alpha\in\mathcal A$, $\alpha = 1,\dots,N$, that minimize $J$, we assume that Assumption 14.16 also applies to $L_\alpha$:

Assumption 14.19   For any $a_\alpha\in\mathcal A$, the matrix $L_\alpha$ is positive semi-definite and has range $T_{a_\alpha}(\mathcal A)$.

The linear approximation of the hypothesis (14.109) is

$(\nabla_aF_\alpha^{(k)},\,\Delta x_\alpha) = -F_\alpha^{(k)},$    (14.133)

where $F_\alpha^{(k)}$ and $\nabla_aF_\alpha^{(k)}$ are abbreviations of $F^{(k)}(a_\alpha,u)$ and $\nabla_aF^{(k)}(a_\alpha,u)$, respectively. The inherent constraint $x_\alpha\in\mathcal A$ requires that $\Delta x_\alpha\in T_{a_\alpha}(\mathcal A)$ to a first approximation. Introducing Lagrange multipliers $\lambda_\alpha^{(k)}$ to the linearized hypothesis (14.133), differentiating $J$ with respect to $\Delta x_\alpha$, and ignoring higher order terms, we obtain the solution $\Delta x_\alpha$ uniquely in the following form:

$\Delta x_\alpha = \sum_{k=1}^L\lambda_\alpha^{(k)}\,L_\alpha^-\nabla_aF_\alpha^{(k)}.$    (14.134)

Substituting this into eq. (14.133) and putting $\lambda_\alpha = (\lambda_\alpha^{(k)})$, we obtain

$V_\alpha\lambda_\alpha = -(F_\alpha^{(k)}),$    (14.135)

where the $(LL)$-matrix $V_\alpha = (V_\alpha^{(kl)})$ is defined by

$(V_\alpha^{(kl)}) = \bigl((\nabla_aF_\alpha^{(k)},\ L_\alpha^-\nabla_aF_\alpha^{(l)})\bigr).$    (14.136)

As in the case of geometric correction, a computational problem arises. Namely, if the $L$ equations (14.109) are not algebraically independent but give only $r$ constraints, the matrix $\bar V'_\alpha$ computed from the true value $\bar a_\alpha$ by eq. (14.117) has rank $r$. However, the rank of the matrix $V_\alpha$ computed from the datum $a_\alpha$ by eq. (14.136) may be larger than $r$. In other words, among the $L$ linear equations in $\Delta x_\alpha$ given by eq. (14.133), more than $r$ equations may be linearly independent if evaluated at the datum $a_\alpha\ne\bar a_\alpha$. If this occurs,
the hypothesis (14.109) is said to be degenerate (see Section 14.4.2). If the hypothesis (14.109) is degenerate, some of the positive eigenvalues of the matrix $V_\alpha$ converge to 0 in the limit $a_\alpha\to\bar a_\alpha$. Hence, eq. (14.135) viewed as a linear equation in $\lambda_\alpha$ is ill-conditioned when the data $\{a_\alpha\}$ are close to the true values $\{\bar a_\alpha\}$ (see Section 2.3.2). This ill-posedness can be avoided in the same way as in the case of geometric correction: we project both sides of eq. (14.135) onto the eigenspace defined by the largest $r$ eigenvalues of $V_\alpha$. In other words, we effectively use only $r$ constraints from among the $L$ linear equations (14.133). The resulting solution may be slightly different from the exact solution of eq. (14.135), but this does not effectively affect the subsequent analysis because eq. (14.135) is a first order approximation. The solution of eq. (14.135) given by this projection is

$\lambda_\alpha^{(k)} = -\sum_{l=1}^L W_\alpha^{(kl)}F_\alpha^{(l)},$    (14.137)

where $W_\alpha^{(kl)}$ is the $(kl)$ element of the rank-constrained generalized inverse $(V_\alpha)_r^-$, which we write

$(W_\alpha^{(kl)}) = \bigl((\nabla_aF_\alpha^{(k)},\ L_\alpha^-\nabla_aF_\alpha^{(l)})\bigr)_r^-.$    (14.138)
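The projection onto the eigenspace of the largest $r$ eigenvalues can be implemented directly; the following sketch (added here for illustration, with a hypothetical function name) computes the rank-constrained generalized inverse used in eqs. (14.86), (14.119), and (14.138).

```python
import numpy as np

def rank_constrained_ginv(V, r):
    """Rank-constrained generalized inverse (V)^-_r of a symmetric matrix V.

    Only the eigenspace spanned by the r largest eigenvalues is inverted;
    the remaining eigenvalues are treated as zero.
    """
    w, U = np.linalg.eigh(V)           # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:r]      # indices of the r largest eigenvalues
    return (U[:, idx] / w[idx]) @ U[:, idx].T

# Example of eq. (14.137):  lam = -rank_constrained_ginv(V_alpha, r) @ F_alpha
```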
Substituting eq. (14.137) into eq. (14.134), substituting the resulting expression into eq. (14.131), and ignoring higher order terms, we obtain the following expression:

$J = c + \sum_{\alpha=1}^N\sum_{k,l=1}^L W_\alpha^{(kl)}F_\alpha^{(k)}F_\alpha^{(l)}.$    (14.139)
Since the constant $c$ is irrelevant to the optimization, we obtain the following conclusion:

Proposition 14.7   The maximum likelihood estimator $\hat u$ of $u$ is given as the solution of the nonlinear optimization

$J[u] = \sum_{\alpha=1}^N\sum_{k,l=1}^L W_\alpha^{(kl)}F_\alpha^{(k)}F_\alpha^{(l)} \;\rightarrow\; \min, \qquad u\in\mathcal U.$    (14.140)
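As a concrete illustration (a sketch added here under the stated assumptions, not the book's own program), the criterion of eq. (14.140) can be minimized numerically once the constraint functions and the matrices $L_\alpha$ are supplied. The helper `rank_constrained_ginv` is the hypothetical function sketched after eq. (14.138), and `scipy.optimize.minimize` is used only as a generic minimizer.

```python
import numpy as np
from scipy.optimize import minimize

def J_of_u(u, data, F, grad_a_F, L_inv, r):
    """Evaluate the criterion of eq. (14.140) for a candidate parameter u.

    F(a, u)        : length-L array of constraint values F^(k)(a, u)
    grad_a_F(a, u) : (L, m) array of gradients with respect to a
    L_inv(a)       : (m, m) generalized inverse of L_alpha at the datum a
    r              : number of independent constraints
    """
    total = 0.0
    for a in data:
        Fa = F(a, u)
        Ga = grad_a_F(a, u)
        V = Ga @ L_inv(a) @ Ga.T            # eq. (14.136)
        W = rank_constrained_ginv(V, r)     # eq. (14.138)
        total += Fa @ W @ Fa                # summand of eq. (14.140)
    return total

# u_hat = minimize(J_of_u, u0, args=(data, F, grad_a_F, L_inv, r)).x
```

For a single constraint ($L = r = 1$) with Gaussian noise of covariance $\Sigma_\alpha$, the summand reduces to $F_\alpha^2/(\nabla_aF_\alpha,\,\Sigma_\alpha\nabla_aF_\alpha)$, the weighted sum-of-squares form on which the applications in the earlier chapters are based.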
This proposition gives a rigorous mathematical foundation to the formulation of parametric fitting presented in Chapter 7, upon which the subsequent applications are all based. Let $\hat u$ be the computed maximum likelihood estimator of $u$. If the model $\mathcal S$ has codimension $r$ in the $m'$-dimensional data space $\mathcal A$ and is parameterized by $u\in\mathcal U$, it is a model of dimension $d = m'-r$, codimension $r$, and $n'$ degrees of freedom. The argument given in Sections
13.2 and 13.3 can be generalized to data with non-Gaussian distributions, and the geometric information criterion can be defined as follows:

$\mathrm{AIC}(\mathcal S) = J[\hat u] + 2(dN + n').$    (14.141)
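As a small worked illustration (added here, not from the original; the function name is hypothetical), eq. (14.141) can be evaluated directly once the residual $J[\hat u]$, the model dimension $d$, and the degrees of freedom $n'$ are known.

```python
def geometric_aic(J_hat, d, n_prime, N):
    """Geometric AIC of eq. (14.141): residual J[u_hat] plus 2(dN + n')."""
    return J_hat + 2.0 * (d * N + n_prime)
```

When two candidate models are fitted to the same data, the one with the smaller value of this criterion is preferred.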
Index Bayesian standpoint 14 Beta distribution 88 Binocular stereo vision 171 Binomial distribution 88
A posteriori covariance matrix 80 A posteriori distribution 248 A posteriori probability density 80, 250 A posteriori standard confidence region 140 A priori distribution 79, 248 A priori probability density 250 A priori standard confidence region 140 AIC 10, 22, 93, 429 Aberration 1 Adjugate matrix 30 Affine connection 10 Affine subspace 78, 121 Affine transformation three-dimensional 125 two-dimensional 189 Akaike information criterion s e e AIC Algebraic variety 65 Alternative hypothesis 434 Angle (of rotation) 36 Angular velocity 37 Antisymmetric matrix 28, 33 Antisymmetrization 60 Aperture problem 372 Artificial intelligence 24 Asymptote 116 Asymptotic approximation 475 Asymptotic limit 429 Average 62 Axis of rotation 36 of a parabola 117
Camera calibration 11 Canonical form 115 Cartesian coordinate system 27 Cast 55 Catastrophe theory 10 Center (of a quadric) 118 Centered quadric 118 Central limit theorem 25, 71, 93, 429 Characteristic equation 37 Characteristic function 72 Characteristic polynomial 37 X2 distribution 73 )2 test 75, 90, 142 )~2 variable 73 Cochran's theorem 74 Codimension of a manifold 65, 132,210, 453 of a model 416 Cofactor 30, 123 Cofactor expansion formula 30 Cofactor matrix 336 Coincidence test for image lines 147, 229 for image points 145, 221 for space lines 152, 240 Coincidence test for space planes 157, 246 for space points 151,230 Collinearity test for image points 223, 298 for space points 233 Collineation 189, 194 Comparison criterion 22 Computational projective geometry 11
Back projection 118, 188, 361 Base-line vector 172 Bayes estimation 80, 251 Bayes formula 80, 248, 250 Bayesian approach 248 501
502 Computer vision 1 Concurrency test for image lines 227 for space lines 239 Condition number 48 Conditional probability density 79, 250 Confidence level 75, 434 Congruence transformation 52, 116, 126 Conic 11, 113, 300 Conic fitting 300 Conic incidence test 169 Conic locus s e e Conic Conic section s e e Conic Conic test 302 Conjugate axis 117 Conjugate direction 120, 135 Conservation equation 369 Consistent equation 48 Consistent estimator 90, 269 Consistent fitting 480 Consistent norm 51 Constraint 132, 452 Contour 129 Contragradient transformation 194 Contragradient vector 120 Contravariant vector 120 Coplanar vectors 32 Coplanarity test for space lines 242 for space points 236, 312, 317 Correlation 61 Correlation matrix 53, 366 Correspondence test 177 Cost function 251 Covariance 61 Covarimlce matrix 61, 68 Covariance tensor 62 Covariant vector 120 Cramer formula 48 Cramer-Rao inequality 86, 456, 471
Index Cramer-Rao lower bound 9, 23, 86, 140, 215, 456, 471 Critical surface of finite motion 19, 350 of optical flow 22, 401 Curved exponential family 88
Data 467 Data space 131, 209, 249, 415, 452, 467 Decomposability condition 17, 289, 336, 340, 350, 387 Decomposable matrix 336, 340, 350, 387 Degenerate constraint 133, 455, 465 Degenerate hyperboloid 350, 401 Degenerate hypothesis 210, 470, 482 Degree of freedom of a χ² distribution 73 of a geometric model 416 of a statistical model 91 Depth for finite motion 342 for stereo vision 180 for optical flow 392 Depth equation 392 Determinant 29 Determinate equation 48 Diagonalization 38 Diffeomorphism 65 Differential manifold see Manifold Dimension of a geometric model 416 of a manifold 65 of a matrix 27 of a tensor 54 of a vector 27 Direct sum of matrices 54 of vectors 54 Disparity 181 Duality 11, 194
Dynamical system 1 Eccentricity 117 Eddington epsilon 31 Edge 14 Edge density 225 Edge detection 25 Edge operator 14 Edge segment 175, 186, 225, 295 Effective gradient 257 Effective gradient approximation 14, 258 Effective moment matrix 292 Effective moment tensor 303 Effective value 265, 291, 301 Efficient correction 463 Efficient estimator 86, 463 Efficient fitting 479 Eigenmatrix 58 Eigensystem 37 Eigenvalue 37 Eigenvalue decomposition see Spectral decomposition Eigenvector 37 Eigenvector fit 15, 269 Eight-point algorithm 333 Ellipse 11, 114 Ellipsoid 12, 118 Epipolar of stereo vision 13, 173, 435 of optical flow 390 Epipolar constraint 2, 7, 13, 172 Epipolar equation 17, 173, 316, 325, 377, 435 Epipolar plane 172 Epipole 13, 173, 180 Essential matrix 17, 174, 332, 435 Estimator 77, 85 Euclidean matrix norm 51 Euclidean motion two-dimensional 115 three-dimensional 125 Euclidean norm of a matrix 51
of a vector 29 Euler's theorem 36 Expectation 61 Expected residual 91, 419 Exponential family of distributions 10, 88, 462, 478 geometric correction of 23, 462 parametric fitting of 23, 478 Extended Kalman filter 15, 19, 263 Exterior product of a matrix and a vector 33 of a vector and a matrix 33 of two matrices 34 of two vectors see Vector product Factorization 19 Feature point tracking 25 Fisher information matrix 10, 23, 84, 455, 470 Fitting parameter 467 Five-point algorithm 328 Flow matrix for planar surface optical flow 403, 448 for general optical flow 378 Focal length 96, 171, 204 Focus of expansion for finite motion 329 for optical flow 21, 380, 393 Fragile algorithm 17 Fragile computation 340 Frobenius norm see Euclidean matrix norm Fuzzy inference 24 Gamma distribution 88 Gamma function 73 Gaussian distribution 68 Gaussian random variable 68
Generalized characteristic equation 42 Generalized characteristic polynomial 42 Generalized eigensystem 42 Generalized eigenvalue 41 Generalized eigenvector 41 Generalized eigenvector fit 15, 279 Generalized inverse 9, 39, 47 Generalized matrix inversion formula 40, 258 Generalized Rayleigh quotient 43, 279 Generalized spectral decomposition 42 Generated subspace 39 Geometric AIC 7, 22, 428, 483 Geometric constraint 2 Geometric correction 2, 7, 12, 452 of the exponential family 462 Geometric information criterion see Geometric AIC Geometric model 22, 132, 210, 416, 453, 469 Goodness of a geometric model 7, 419 of a statistical model 90 Gradient equation 370 Gradient value 370 Grassmann coordinate see Plücker coordinate Group of affine transformations 125 of congruence transformations 52 of Euclidean motions three-dimensional 125 two-dimensional 115 of general linear transformations 36 of orthogonal transformations 36 of rotations 36 of similarity transformations 52
of special linear transformations 36 of special orthogonal transformations 36 of projective transformations 125 three-dimensional 125 two-dimensional 189 Homogeneous coordinate of an image line 98 of an image point 10, 95 of a space line 104 of a space point 102 Homotopy method 17 Hyperbola 11, 114 Hyperboloid 12, 118, 350, 351, 401 Hypothesis 75, 141, 210, 434, 467 Ideal image line 98, 128, 129 Ideal image point 95, 99, 127, 128, 173, 175 Ideal space line 111, 128 Ideal space point 102, 107, 111, 112 Ill-conditioned equation 48, 465, 482 Image line 95 at infinity see Ideal image line Image plane 96 Image point 95 at infinity see Ideal image point Image space see Range Image understanding 1 Imaginary ellipse 114 Incidence of an image point and an image line 99 of a space line and a space plane 111 of a space point and a space line 107, 110 Incidence test
for an image point and an image line 148 for a space line and a space plane 162, 195 for a space point and a space plane 158, 191 for conics 169 Inconsistent equation 48 Independent random variables 61 Indeterminate equation 48 ∞-norm 29, 51 Infinity test for a space line 200 for a space point 197 for stereo vision 438 Information matrix see Fisher information matrix Information of finite motion 367 of optical flow 411 Informative part 411 Inherent constraint 11, 132, 452 Inlier 25 Inner product of matrices 50 of vectors 28 Innovation 82 Instantaneous axis of rotation 37 Instantaneous motion parameters 374 Invariant 52 Invariant basis 52 Inverse 30 Isotropy 64 Join of a space point and a space line 113 of three space points 113 of two image points 100 of two space lines 113 of two space points 108 Joint probability density 250
Kalman filter 9, 81, 260 Kalman gain 82 Kalman-Bucy filter 9 Kernel see Null space Kronecker delta 28 Lagrange formula 31 Law of large numbers 62, 89, 93, 269, 429 Least-squares approximation 269 Least-squares optimization 49 Left orthonormal system 46, 357 Left-handed coordinate system 27 Left-handed system 32 Levi-Civita symbol see Eddington epsilon Likelihood 77, 87, 211, 460, 475 Line fitting 295 Line of sight 96 Linear dynamical system 81 Linearization 8, 15, 17, 288 Linearized algorithm 17, 288 Local distribution 10, 67, 98, 105, 251, 452 Gaussian 71 Logarithmic differentiation formula 84 Logarithmic likelihood ratio test 434 Loss function 251 MDL principle 22 Mahalanobis distance 74, 133, 185, 420 Mahalanobis geometry 420 Mahalanobis projection 420, 466 Major axis of an ellipse 117 of a quadric 123 Manifold 10, 22, 65, 106 Marginal probability density 79 Matrix 27 Matrix inner product 50 Matrix inversion formula 31
Matrix norm consistent 51 Euclidean 51 ∞- 51 natural see spectral 1- 51 Schmidt see Euclidean Schur see Euclidean spectral 46, 51 Matrix of full rank 30 Maximum a posteriori probability estimation 14, 89, 251 Maximum likelihood estimation 77, 90, 133, 460, 475 Maximum likelihood estimator 77, 87, 460, 475 Mean square 62 Midpoint flow approximation 413 Minimum description length principle see MDL principle Minimum mean square estimation 83 Minkowski norm see p-norm Minor axis 117 Mixed tensor 54 Model geometric 7, 210, 416, 453, 469 statistical 90 Modified χ² distribution 143, 219 Modified χ² variable 143, 219 Modified χ² test 76 Moment 72 Moment generating function 72 Moment matrix 214, 225, 235, 237, 269, 303, 311, 333, 384, 426, 471 Moore-Penrose generalized inverse see Generalized inverse Motion parallax 21, 367 Motion parameter finite 17, 124, 172 instantaneous 374 Multidimensional moments 72
Natural norm 46, see Spectral norm Negative definite matrix 39 Negative semi-definite matrix 39 Neuro-computing 24 Newton iteration 284 Neyman-Scott problem 23 Noise level 142, 178, 218, 267, 296, 309, 333, 429 Non-Bayesian approach 248 Non-Euclidean distance 185 Nonsingular conic 114 Nonsingular constraint 132, 455 Nonsingular data 210 Nonsingular generalized eigenvalue problem 42 Nonsingular hypothesis 210, 425, 469 Nonsingular matrix 30 Nonsingular quadric 119 Norm of a vector 29 Euclidean 29 ∞- 29 p- 29 1- 29 of a matrix see Matrix norm Normal distribution 68 Normal flow 372 Normal space 66 Normalization 29, 51 Normalized covariance matrix 142, 178, 218, 267, 333, 429 Normalized covariance tensor 337, 388 Normalized geometric AIC 429 Normalized residual 429 Nuisance parameter 8, 23 Null hypothesis 434 Null space 11, 39, 46 Observation vector 81 Occluding contour 129 1-norm 29, 51 Optical flow 20, 369, 375
Optical flow equation 375 Optimal filter 8, 14, 254 Order symbol 30 Ordinary datum 469 Ordinary value 455 Orthogonal complement 34 Orthogonal matrix 35 Orthogonal projection 35, 83 Orthogonality of matrices 50 of vectors 28 Orthogonality test for three orientations 167 for two orientations 164 Orthonormal system 29 Outer product 28 Outlier 25 detection 25 Overspecification 286, 289 Parabola 11, 114 Paraboloid 12, 118 Parallel stereo system 174 Parallelism 29 Parameter of interest 23 Parameter space 209, 249, 416, 467 Parametric fitting 2, 7, 467 of the exponential family 478 Parity 27 Perspective projection 11, 96, 171 Perturbation theorem 44, 274 Photometric stereo 25 Plücker coordinate 104 Planarity test for finite motion 19, 351, 353, 442, 443 for optical flow 402, 450 for stereo vision 438 p-norm 29 Poisson distribution 88 Poisson-gamma distribution 88 Polar 114, 120, 169 Pole 114, 120, 169 Positive definite matrix 39
Positive semi-definite matrix 39 Posterior see A posteriori probability density Primary deviation pair 12, 71, 101, 124, 183, 286 Principal axis 117, 119 Prior see A priori probability density Probabilistic relaxation 3 Projection 10 Projection matrix 34 Projection tensor 302 Projective geometry 10, 98, 194 Projective space three-dimensional 102, 125 two-dimensional 95, 175 Projective transformation three-dimensional 125 two-dimensional 116, 126, 175, 189, 194 Pseudo inverse see Generalized inverse Pull-back 185 Purposive vision 24 Pythagoras theorem 421 Quadratic convergence 284 Quadratic form 41 Quadric 11, 118 Qualitative vision 24 Quaternion 67 Radius of curvature 420 Range 39, 46 Range data 309 Range finder 16, 309 Range sensing 309 Rank of a constraint 453 of a hypothesis 210, 425, 468 of a non-symmetric matrix 45 of a symmetric matrix 37 of equations 132 of vectors 372
Rank-constrained generalized inverse 8, 40, 47, 136 Rayleigh quotient 40 Real conic 114 Real optical flow 412 Rectangular hyperbola 117 Regression 5, 448 Regular distribution 85, 455, 470 Regularization 3, 20 Rejection region 75 Renormalization 8, 15, 285 Residual 49, 91, 142, 429 Riemannian curvature 185 Riemannian distance 185 Riemannian metric 10, 13 Riemannian space 185 Right orthonormal system 46 Right-handed coordinate system 27 Right-handed system 32 Rigidity test for finite motion 328 for optical flow 380 Robust computation 340 Root mean square 62 Rotation matrix 35, 365 Rotation test for finite motion 19, 326, 364, 442, 443 for optical flow 378, 408, 450 Rotation velocity 37, 374 Satisfiability 50, 133 Satisfiable constraint 133 Scalar triple product 32 Schmidt norm see Euclidean matrix norm Schur norm see Euclidean matrix norm Schwarz inequality of vectors 29 of matrices 51 Score 83, 455, 470 Second fundamental form 86
Second order renormalization 15, 294 Shape from motion see Structure from motion Shape from shading 25 Signature 52, 114 Signature symbol 29 Significance level 75, 434 Similarity transformation 52 Singular conic 114 Singular constraint 132, 337, 455 Singular datum 210, 469 Singular generalized eigenvalue problem 42 Singular hypothesis 469 Singular matrix 30 Singular quadric 119 Singular value 45, 336, 357, 455 Singular value decomposition 45 Skew-symmetric matrix 28 Smoothness constraint 20 Solvable equation 48 Space conic 118 Space line 95 at infinity see Ideal space line Space point 95 at infinity see Ideal space point Space-time velocity 370 Spanned subspace 39 Spatio-temporal gradient equation 370 Spatio-temporal gradient value 370 Spectral decomposition 37 Spectral norm 46, 51 Square matrix 27 Standard Gaussian distribution 69 Standard confidence interval 69 Standard confidence region 122, 140, 182 Standard deviation 69 Standard deviation pair 69 Standard normal distribution 69 State transition 15
State vector 81 Statistical bias 263, 269 Statistical estimation 90 Statistical geometry 10 Statistical model 90 Statistical test 90 Stereo matching 25 Stereo vision 13, 171 Stochastic relaxation 3 Strong critical surface for finite motion 19, 350 for optical flow 22, 401 Strong hypothesis 276, 282 Structural stability 66 Structure from motion 17 Structure parameter 23 Submanifold 65 Sylvester's law of inertia 52, 116 Symmetric matrix 28 Symmetrization 60 Tangent 115, 120 Tangent bundle 65 Tangent space 10, 65, 131, 185 Tangential flow 372 Tensor 54 of mixed dimensions see Mixed tensor Tensor product 54 Testing of hypotheses 75, 141, 275, 434 Theoretical optical flow 412 Total curvature 420 Trace 28 Translation velocity 374 Transversal intersection 10, 66, 132, 228, 425, 452, 469 Transverse axis 117 Triangle inequality of matrices 51 of vectors 29 Trinocular stereo vision 171 2-norm see Euclidean norm, Euclidean matrix norm
Type 27, 54 Type transformation 55 Unbiased eigenvector fit 15, 273 Unbiased estimator 77, 85, 456, 471 Unbiased least-squares approximation 273 Unbiased moment matrix 273, 293, 304, 311 Uncertainty field 183 Underspecification 289 Uninformative part 411 Unit matrix 28 Unit vector 29 Unsolvable equation 48 Upper α% point 434 Vanishing line 128, 360 Vanishing point 14, 127 Variance 61 Variance-covariance matrix 61 Variance-covariance tensor 62 Variety 65 Vector 27 Vector product 31, 33 Vector triple product 32 Vertex 117 Viewpoint 96, 171 Weak critical surface for finite motion 19, 350 for optical flow 401 Weak hypothesis 278, 282 Well-conditioned equation 48 World line 370 Zero matrix 28 Zero vector 27