Theoretical Computer Science 315 (2004) 307 – 308
www.elsevier.com/locate/tcs
Preface

Symbolic/algebraic and numerical algorithms are the backbone of modern computations in the sciences, engineering, and electrical engineering. These two classes of algorithms and the respective scientific communities have been separated historically and still have relatively little interaction; yet they may benefit from combining their power and techniques toward common computational goals. Such a combination is a rather recent undertaking, which became visible only in the last decade, but it is already popular in such central areas as root-finding for polynomials and systems of polynomials and computations with Toeplitz and other structured matrices (which have ties and impact to both numerical matrix methods and algebraic polynomial techniques). These areas also happen to be central to the Editors' research interests, which motivated our efforts to bring the subjects to the attention of the TCS readers. (Readers can find links to introductory and advanced bibliographies on these subjects on the Editors' web home pages.) The present issue of TCS partly reflects the state of the art. It includes papers on various algebraic and numerical algorithms and techniques, focuses on the combined application of methods from both groups, and extensively represents topics in polynomial root-finding and structured matrix computations. The issue includes recent advances in the numerical study of the root variety of a system of multivariate polynomials by homotopy methods, leading to the variety's irreducible decomposition and the approximation of all common roots, by Sommese et al. A complexity study is undertaken by Bompadre et al. for the problem of solving polynomial systems by combining different techniques such as the Newton-Hensel lemma, characteristic polynomials, and resultants, in the setting of straight-line programs. The same setting is used by Pardo and San Martín
compression of structured matrices applied in Newton's iteration for their inversion, by Pan et al.; and the study of the matrix algebras of preconditioners for multilevel Toeplitz systems, by Noutsos et al. Kaporin re-examines the known algebraic algorithms for fast matrix multiplication and modifies them for practical numerical computing, with quite surprising and promising results. The abc conjecture from number theory is employed by Croot et al. in order to reduce the number of bits required in rounding the reciprocal square root. Topics in complexity analysis and geodesic constructions from a computational geometry point of view are considered by Burago et al., and approximation algorithms are proposed. An important application of parameterizing algebraic curves is treated by Pérez-Dí
Theoretical Computer Science 315 (2004) 309 – 318
Inverse eigenproblem for centrosymmetric and centroskew matrices and their approximation Zheng-Jian Bai∗ , Raymond H. Chan1 Department of Mathematics, Chinese University of Hong Kong, Shatin, NT, Hong Kong, China
Abstract

In this paper, we first give the solvability condition for the following inverse eigenproblem (IEP): given a set of vectors {x_i}_{i=1}^m in C^n and a set of complex numbers {λ_i}_{i=1}^m, find a centrosymmetric or centroskew matrix C in R^{n×n} such that {x_i}_{i=1}^m and {λ_i}_{i=1}^m are the eigenvectors and eigenvalues of C, respectively. We then consider the best approximation problem for the IEPs that are solvable. More precisely, given an arbitrary matrix B in R^{n×n}, we find the matrix C which is the solution to the IEP and is closest to B in the Frobenius norm. We show that the best approximation is unique and derive an expression for it.
© 2004 Elsevier B.V. All rights reserved.

Keywords: Eigenproblem; Centrosymmetric matrix; Centroskew matrix
1. Introduction

Let J_n be the n×n anti-identity matrix, i.e. J_n has ones on the antidiagonal and zeros elsewhere. An n×n matrix C is said to be centrosymmetric (or persymmetric) if C = J_n C J_n, and it is called centroskew (or skew-centrosymmetric) if C = −J_n C J_n. Centrosymmetric and centroskew matrices play an important role in many areas [7,16], such as signal processing [8,11], the numerical solution of differential equations [2], and Markov processes [17]. In this paper, we consider two problems related to centrosymmetric and centroskew matrices. Both problems are on numerical and approximate computing, but here we
∗ Corresponding author.
E-mail addresses: [email protected] (Z.-J. Bai), [email protected] (R.H. Chan).
1 The research was partially supported by the Hong Kong Research Grant Council Grant CUHK4243/01P and CUHK DAG 2060220.
0304-3975/$ - see front matter © 2004 Elsevier B.V. All rights reserved.
doi:10.1016/j.tcs.2004.01.017
solve them algebraically, based on some explicit expressions for the solutions of overdetermined linear systems of equations.

The first problem is an inverse eigenproblem. There are many applications of structured inverse eigenproblems; see for instance the expository paper [5]. In particular, the inverse eigenproblem for Toeplitz matrices (a special case of centrosymmetric matrices) arises in the trigonometric moment problem [10] and in signal processing [9]. The inverse eigenproblem for centrosymmetric Jacobi matrices also comes from the inverse Sturm–Liouville problem [19, p. 70]. There are also different types of inverse eigenproblems, for instance of multiplicative type and additive type [19, Chapter 4]. Here we consider the following type of inverse eigenproblem, which appeared in the design of Hopfield neural networks [4,13].

Problem I. Given X = [x_1, x_2, …, x_m] in C^{n×m} and Λ = diag(λ_1, …, λ_m) in C^{m×m}, find a centrosymmetric or centroskew matrix C in R^{n×n} such that CX = XΛ.

The second problem we consider in this paper is the problem of best approximation.

Problem II. Let L_S be the solution set of Problem I. Given a matrix B ∈ R^{n×n}, find C^* ∈ L_S such that

\|B − C^*\| = \min_{C \in L_S} \|B − C\|,
where ‖·‖ is the Frobenius norm.

The best approximation problem occurs frequently in experimental design; see for instance [14, p. 123]. Here the matrix B may be a matrix obtained from experiments, but it may not satisfy the structural requirement (centrosymmetric or centroskew) and/or the spectral requirement (having eigenpairs X and Λ). The best estimate C^* is the matrix that satisfies both requirements and is the best approximation of B in the Frobenius norm. In addition, because there are fast algorithms for solving linear systems with various kinds of centrosymmetric and centroskew coefficient matrices [12], the best approximation C^* of B can also be used as a preconditioner in the preconditioned conjugate gradient method for solving linear systems with coefficient matrix B; see for instance [1].

Problems I and II have been solved for different classes of structured matrices; see for instance [18,20]. In this paper, we extend the results in [18,20] to the classes of centrosymmetric and centroskew matrices. We first give a solvability condition for Problem I and also the form of its general solution. Then, in the case when Problem I is solvable, we show that Problem II has a unique solution and we give a formula for the minimizer C^*.

The paper is organized as follows: In Section 2 we first characterize the class of centrosymmetric matrices and give the solvability condition of Problem I over this class of matrices. In Section 3, we derive a formula for the best approximation of Problem II, give the algorithm for finding the minimizer, and study the stability of the problem. In Section 4 we give an example to illustrate the theory. In the last section, we extend the results of Sections 2–3 to centroskew matrices.
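As a quick numerical illustration of the two matrix classes (a minimal sketch assuming NumPy; the 4×4 matrices are ad hoc examples, not taken from the paper):

```python
import numpy as np

n = 4
J = np.fliplr(np.eye(n))                  # the anti-identity J_n

# Centrosymmetric: C = J C J, i.e. entries are symmetric about the center.
C = np.array([[1., 2., 3., 4.],
              [5., 6., 7., 8.],
              [8., 7., 6., 5.],
              [4., 3., 2., 1.]])
assert np.allclose(J @ C @ J, C)

# Centroskew: S = -J S J, i.e. center-reflected entries change sign.
S = np.array([[1.,  2.,  3.,  4.],
              [5.,  6.,  7.,  8.],
              [-8., -7., -6., -5.],
              [-4., -3., -2., -1.]])
assert np.allclose(J @ S @ J, -S)
```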
2. Solvability condition for Problem I

We first characterize the set of all centrosymmetric matrices. For all positive integers k, let

K_{2k} = \frac{1}{\sqrt{2}} \begin{pmatrix} I_k & I_k \\ J_k & -J_k \end{pmatrix} \quad \text{and} \quad K_{2k+1} = \frac{1}{\sqrt{2}} \begin{pmatrix} I_k & 0 & I_k \\ 0 & \sqrt{2} & 0 \\ J_k & 0 & -J_k \end{pmatrix}.

Clearly K_n is orthogonal for all n. The matrix K_n plays an important role in analyzing the properties of centrosymmetric matrices; see for example [6]. In particular, we have the following splitting of centrosymmetric matrices into smaller submatrices using K_n.

Lemma 1 (Collar [6]). Let C_n be the set of all centrosymmetric matrices in R^{n×n}. We have

C_{2k} = \left\{ \begin{pmatrix} E & FJ_k \\ J_k F & J_k E J_k \end{pmatrix} : E, F \in R^{k \times k} \right\},

C_{2k+1} = \left\{ \begin{pmatrix} E & a & FJ_k \\ b^T & c & b^T J_k \\ J_k F & J_k a & J_k E J_k \end{pmatrix} : E, F \in R^{k \times k},\ a, b \in R^k,\ c \in R \right\}.

Moreover, for n = 2k and n = 2k + 1, we have

C_n = \left\{ K_n \begin{pmatrix} G_1 & 0 \\ 0 & G_2 \end{pmatrix} K_n^T : G_1 \in R^{(n-k) \times (n-k)},\ G_2 \in R^{k \times k} \right\}. \tag{1}
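Lemma 1 is easy to verify numerically for n = 2k; the following sketch (NumPy assumed, with arbitrary blocks E and F) checks both the centrosymmetry of the block form and the block-diagonalization by K_n:

```python
import numpy as np

k = 3
n = 2 * k
Ik, Jk = np.eye(k), np.fliplr(np.eye(k))

# K_{2k} = (1/sqrt(2)) [[I_k, I_k], [J_k, -J_k]] is orthogonal.
K = np.block([[Ik, Ik], [Jk, -Jk]]) / np.sqrt(2)
assert np.allclose(K @ K.T, np.eye(n))

# Any C = [[E, F J_k], [J_k F, J_k E J_k]] is centrosymmetric ...
rng = np.random.default_rng(0)
E, F = rng.standard_normal((k, k)), rng.standard_normal((k, k))
C = np.block([[E, F @ Jk], [Jk @ F, Jk @ E @ Jk]])
Jn = np.fliplr(np.eye(n))
assert np.allclose(Jn @ C @ Jn, C)

# ... and K_n^T C K_n is block diagonal, diag(G1, G2), as in (1).
D = K.T @ C @ K
assert np.allclose(D[:k, k:], 0) and np.allclose(D[k:, :k], 0)
```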
Before we come to Problem I, we first note that we can assume without loss of generality that X and Λ are real matrices. In fact, since C_n ⊂ R^{n×n}, the complex eigenvectors and eigenvalues of any C ∈ C_n appear in complex conjugate pairs. If α ± √−1 β and x ± √−1 y are one of its eigenpairs, then we have Cx = αx − βy and Cy = βx + αy, i.e.

C[x, y] = [x, y] \begin{pmatrix} α & β \\ −β & α \end{pmatrix}.

Hence we can assume without loss of generality that X ∈ R^{n×m} and

Λ = diag(Λ_1, Λ_2, …, Λ_l, λ_1, …, λ_{m−2l}) ∈ R^{m×m}, \tag{2}

where

Λ_i = \begin{pmatrix} α_i & β_i \\ −β_i & α_i \end{pmatrix}

with α_i, β_i, and λ_i in R.
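The reduction to real form can be checked directly; a sketch assuming NumPy, where A is an arbitrary 2×2 example with a complex conjugate eigenpair (nothing specific to centrosymmetric matrices is used here):

```python
import numpy as np

A = np.array([[1., -2.], [3., 1.]])   # eigenvalues 1 +/- i*sqrt(6)
lam, V = np.linalg.eig(A)
i = np.argmax(lam.imag)               # pick the eigenvalue with beta > 0
alpha, beta = lam[i].real, lam[i].imag
x, y = V[:, i].real, V[:, i].imag     # eigenvector x + i*y

# A [x, y] = [x, y] [[alpha, beta], [-beta, alpha]], the 2x2 block of (2)
XY = np.column_stack([x, y])
Lam1 = np.array([[alpha, beta], [-beta, alpha]])
assert np.allclose(A @ XY, XY @ Lam1)
```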
Next, we investigate the solvability of Problem I. We need the following lemma, where U^+ denotes the Moore–Penrose pseudo-inverse of U.

Lemma 2 (Sun [15, Lemma 1.3]). Let U, V ∈ R^{n×m} be given. Then YU = V is solvable if and only if VU^+U = V. In this case the general solution is Y = VU^+ + Z(I − UU^+), where Z ∈ R^{n×n} is arbitrary.

In the remaining part of the paper, we will only give the theorems and the proofs for even n. The case where n is odd can be proved similarly. Thus we let n = 2k.

Theorem 1. Given X ∈ R^{n×m} and Λ as in (2), let

K_n^T X = \begin{pmatrix} \tilde{X}_1 \\ \tilde{X}_2 \end{pmatrix}, \tag{3}

where \tilde{X}_2 ∈ R^{k×m}. Then there exists a matrix C ∈ C_n such that CX = XΛ if and only if

\tilde{X}_1 Λ \tilde{X}_1^+ \tilde{X}_1 = \tilde{X}_1 Λ \quad \text{and} \quad \tilde{X}_2 Λ \tilde{X}_2^+ \tilde{X}_2 = \tilde{X}_2 Λ. \tag{4}

In this case, the general solution to CX = XΛ is given by

C_s = C_0 + K_n \begin{pmatrix} Z_1(I_{n−k} − \tilde{X}_1 \tilde{X}_1^+) & 0 \\ 0 & Z_2(I_k − \tilde{X}_2 \tilde{X}_2^+) \end{pmatrix} K_n^T, \tag{5}

where Z_1 ∈ R^{(n−k)×(n−k)} and Z_2 ∈ R^{k×k} are both arbitrary, and

C_0 = K_n \begin{pmatrix} \tilde{X}_1 Λ \tilde{X}_1^+ & 0 \\ 0 & \tilde{X}_2 Λ \tilde{X}_2^+ \end{pmatrix} K_n^T. \tag{6}
Proof. From (1), C ∈ C_n is a solution to Problem I if and only if there exist G_1 ∈ R^{(n−k)×(n−k)} and G_2 ∈ R^{k×k} such that

C = K_n \begin{pmatrix} G_1 & 0 \\ 0 & G_2 \end{pmatrix} K_n^T \tag{7}

and

K_n \begin{pmatrix} G_1 & 0 \\ 0 & G_2 \end{pmatrix} K_n^T X = XΛ. \tag{8}

Using (3), (8) is equivalent to

G_1 \tilde{X}_1 = \tilde{X}_1 Λ \quad \text{and} \quad G_2 \tilde{X}_2 = \tilde{X}_2 Λ. \tag{9}
According to Lemma 2, Eqs. (9) have solutions if and only if Eqs. (4) hold. Moreover, in this case, the general solution of (9) is given by

G_1 = \tilde{X}_1 Λ \tilde{X}_1^+ + Z_1(I_{n−k} − \tilde{X}_1 \tilde{X}_1^+), \tag{10}

G_2 = \tilde{X}_2 Λ \tilde{X}_2^+ + Z_2(I_k − \tilde{X}_2 \tilde{X}_2^+), \tag{11}
where Z_1 ∈ R^{(n−k)×(n−k)} and Z_2 ∈ R^{k×k} are both arbitrary. Putting (10) and (11) into (7), we get (5).

3. The minimizer of Problem II

Let C_n^S be the solution set of Problem I over C_n. In this section, we solve Problem II over C_n^S when C_n^S is nonempty.

Theorem 2. Given X ∈ R^{n×m} and Λ as in (2), let the solution set C_n^S of Problem I be nonempty. Then for any B ∈ R^{n×n}, the problem \min_{C \in C_n^S} \|B − C\| has a unique solution C^* given by

C^* = C_0 + K_n \begin{pmatrix} \tilde{B}_{11}(I_{n−k} − \tilde{X}_1 \tilde{X}_1^+) & 0 \\ 0 & \tilde{B}_{22}(I_k − \tilde{X}_2 \tilde{X}_2^+) \end{pmatrix} K_n^T. \tag{12}

Here \tilde{X}_1, \tilde{X}_2, and C_0 are given in (3) and (6), and \tilde{B}_{11} and \tilde{B}_{22} are obtained by partitioning K_n^T B K_n as

K_n^T B K_n = \begin{pmatrix} \tilde{B}_{11} & \tilde{B}_{12} \\ \tilde{B}_{21} & \tilde{B}_{22} \end{pmatrix}, \tag{13}

where \tilde{B}_{22} ∈ R^{k×k}.

Proof. When C_n^S is nonempty, it is easy to verify from (5) that C_n^S is a closed convex set. Since R^{n×n} is a uniformly convex Banach space under the Frobenius norm, there exists a unique solution for Problem II [3, p. 22]. Moreover, because the Frobenius norm is unitarily invariant, Problem II is equivalent to

\min_{C \in C_n^S} \|K_n^T B K_n − K_n^T C K_n\|^2. \tag{14}

By (5), we have

\|K_n^T B K_n − K_n^T C K_n\|^2 = \left\| \begin{pmatrix} \tilde{B}_{11} − \tilde{X}_1 Λ \tilde{X}_1^+ & \tilde{B}_{12} \\ \tilde{B}_{21} & \tilde{B}_{22} − \tilde{X}_2 Λ \tilde{X}_2^+ \end{pmatrix} − \begin{pmatrix} Z_1 P & 0 \\ 0 & Z_2 Q \end{pmatrix} \right\|^2,

where

P = I_{n−k} − \tilde{X}_1 \tilde{X}_1^+ \quad \text{and} \quad Q = I_k − \tilde{X}_2 \tilde{X}_2^+. \tag{15}
Thus (14) is equivalent to

\min_{Z_1 \in R^{(n−k)×(n−k)}} \|\tilde{B}_{11} − \tilde{X}_1 Λ \tilde{X}_1^+ − Z_1 P\|^2 + \min_{Z_2 \in R^{k×k}} \|\tilde{B}_{22} − \tilde{X}_2 Λ \tilde{X}_2^+ − Z_2 Q\|^2.

Clearly, the minimum is attained by any Z_1 and Z_2 such that

Z_1 P = (\tilde{B}_{11} − \tilde{X}_1 Λ \tilde{X}_1^+)P \quad \text{and} \quad Z_2 Q = (\tilde{B}_{22} − \tilde{X}_2 Λ \tilde{X}_2^+)Q,

for instance Z_1 = \tilde{B}_{11} − \tilde{X}_1 Λ \tilde{X}_1^+ and Z_2 = \tilde{B}_{22} − \tilde{X}_2 Λ \tilde{X}_2^+. Notice that by (15), P and Q are projection matrices, i.e. P^2 = P and Q^2 = Q. Notice further that because \tilde{X}_1^+ \tilde{X}_1 \tilde{X}_1^+ = \tilde{X}_1^+, we have

(\tilde{B}_{11} − \tilde{X}_1 Λ \tilde{X}_1^+)P = \tilde{B}_{11} − \tilde{B}_{11} \tilde{X}_1 \tilde{X}_1^+ − \tilde{X}_1 Λ \tilde{X}_1^+ + \tilde{X}_1 Λ \tilde{X}_1^+ \tilde{X}_1 \tilde{X}_1^+ = \tilde{B}_{11} − \tilde{B}_{11} \tilde{X}_1 \tilde{X}_1^+ = \tilde{B}_{11} P.

Similarly, (\tilde{B}_{22} − \tilde{X}_2 Λ \tilde{X}_2^+)Q = \tilde{B}_{22} Q. Hence the unique solution of Problem II is given by (12).

Based on Theorem 2, we give the following algorithm for solving Problem II for n = 2k.

Algorithm I.
(a) Compute \tilde{X}_1 and \tilde{X}_2 by (3), and then compute \tilde{X}_1^+ and \tilde{X}_2^+.
(b) If \tilde{X}_1 Λ \tilde{X}_1^+ \tilde{X}_1 = \tilde{X}_1 Λ and \tilde{X}_2 Λ \tilde{X}_2^+ \tilde{X}_2 = \tilde{X}_2 Λ, i.e. if (4) holds, then the solution set C_n^S of Problem I is nonempty and we continue. Otherwise we stop.
(c) Partition K_n^T B K_n as in (13) to get \tilde{B}_{11} and \tilde{B}_{22}.
(d) Compute

W_1 = \tilde{X}_1 Λ \tilde{X}_1^+ + \tilde{B}_{11} − \tilde{B}_{11} \tilde{X}_1 \tilde{X}_1^+, \quad W_2 = \tilde{X}_2 Λ \tilde{X}_2^+ + \tilde{B}_{22} − \tilde{B}_{22} \tilde{X}_2 \tilde{X}_2^+.

(e) Then

C^* = K_n \begin{pmatrix} W_1 & 0 \\ 0 & W_2 \end{pmatrix} K_n^T.
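Algorithm I translates directly into NumPy for even n; in this sketch (our variable names; `np.linalg.pinv` plays the role of the SVD-based pseudo-inverse in Step (a)) the solvability test of Step (b) is implemented with a floating point tolerance:

```python
import numpy as np

def solve_problem_II(X, Lam, B):
    """Best centrosymmetric approximation of B with eigen-data (X, Lam); n = 2k."""
    n = B.shape[0]
    k = n // 2
    Ik, Jk = np.eye(k), np.fliplr(np.eye(k))
    K = np.block([[Ik, Ik], [Jk, -Jk]]) / np.sqrt(2)

    # (a) split K^T X as in (3) and form the pseudo-inverses
    Xt = K.T @ X
    X1, X2 = Xt[:k], Xt[k:]
    X1p, X2p = np.linalg.pinv(X1), np.linalg.pinv(X2)

    # (b) solvability check (4): X_i Lam X_i^+ X_i = X_i Lam
    for Xi, Xip in ((X1, X1p), (X2, X2p)):
        if not np.allclose(Xi @ Lam @ Xip @ Xi, Xi @ Lam):
            raise ValueError("Problem I has no centrosymmetric solution")

    # (c) partition K^T B K as in (13)
    Bt = K.T @ B @ K
    B11, B22 = Bt[:k, :k], Bt[k:, k:]

    # (d)-(e) assemble the minimizer C^*
    W1 = X1 @ Lam @ X1p + B11 - B11 @ X1 @ X1p
    W2 = X2 @ Lam @ X2p + B22 - B22 @ X2 @ X2p
    Z = np.zeros((k, k))
    return K @ np.block([[W1, Z], [Z, W2]]) @ K.T
```

Applied to the eigen-data of a known centrosymmetric matrix and a perturbed B, the result is centrosymmetric and has the prescribed eigenpairs.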
Next, we consider the computational complexity of our algorithm. For Step (a), since K_n has only two nonzero entries per row, it requires O(nm) operations to compute \tilde{X}_1 and \tilde{X}_2. Then using the singular value decomposition to compute \tilde{X}_1^+ and \tilde{X}_2^+ requires O(n^2 m + m^3) operations. Step (b) obviously requires O(n^2 m) operations. For Step (c), because of the sparsity of K_n, only O(n^2) operations are required. For Step (d), if we compute \tilde{B}_{ii} \tilde{X}_i \tilde{X}_i^+ as [(\tilde{B}_{ii} \tilde{X}_i) \tilde{X}_i^+], then the cost will only be O(n^2 m) operations. Finally, because of the sparsity of K_n again, Step (e) requires O(n^2) operations. Thus the total complexity of the algorithm is O(n^2 m + m^3). We remark that in practice, m ≪ n.

Before we end this section, we give a stability analysis for Problem II; that is, we study how the solution of Problem II is affected by a small perturbation of B. We have the following result.
Corollary 1. Given B^{(i)} ∈ R^{n×n}, i = 1, 2, let C^{*(i)} = \arg\min_{C \in C_n^S} \|B^{(i)} − C\| for i = 1, 2. Then there exists a constant c, independent of B^{(i)}, i = 1, 2, such that

\|C^{*(2)} − C^{*(1)}\| \le c\, \|B^{(2)} − B^{(1)}\|. \tag{16}

Proof. By Theorem 2, C^{*(i)} is given by

C^{*(i)} = C_0 + K_n \begin{pmatrix} \tilde{B}_{11}^{(i)} P & 0 \\ 0 & \tilde{B}_{22}^{(i)} Q \end{pmatrix} K_n^T, \quad i = 1, 2,

where \tilde{B}_{11}^{(i)} and \tilde{B}_{22}^{(i)} are the diagonal blocks of K_n^T B^{(i)} K_n as defined in (13), and P and Q are given in (15). Thus we have

\|C^{*(2)} − C^{*(1)}\| = \left\| K_n \begin{pmatrix} (\tilde{B}_{11}^{(2)} − \tilde{B}_{11}^{(1)})P & 0 \\ 0 & (\tilde{B}_{22}^{(2)} − \tilde{B}_{22}^{(1)})Q \end{pmatrix} K_n^T \right\|
\le \|\tilde{B}_{11}^{(2)} − \tilde{B}_{11}^{(1)}\| \|P\| + \|\tilde{B}_{22}^{(2)} − \tilde{B}_{22}^{(1)}\| \|Q\|
\le \|K_n^T (B^{(2)} − B^{(1)}) K_n\| \, (\|P\| + \|Q\|) \le c\, \|B^{(2)} − B^{(1)}\|,
where c = ‖P‖ + ‖Q‖. Thus (16) holds.

4. Demonstration by an example

Let us first compute input matrices X and Λ for which Problem I has a solution. We start by choosing a random matrix Ĉ in C_5:

Ĉ = \begin{pmatrix} 0.1749 & 0.0325 & −0.2046 & 0.0932 & 0.0315 \\ 0.0133 & −0.0794 & −0.0644 & 0.1165 & −0.0527 \\ 0.1741 & 0.0487 & 0.1049 & 0.0487 & 0.1741 \\ −0.0527 & 0.1165 & −0.0644 & −0.0794 & 0.0133 \\ 0.0315 & 0.0932 & −0.2046 & 0.0325 & 0.1749 \end{pmatrix} ∈ C_5.

Then we compute its eigenpairs. The eigenvalues of Ĉ are 0.1590 ± 0.2841√−1, −0.1836, 0.1312, and 0.0304. Let x_1 ± √−1 x_2, x_3, x_4, and x_5 be the corresponding eigenvectors. Then we take

X = [x_1, x_2, x_3, x_4, x_5] = \begin{pmatrix} 0.4815 & 0.2256 & −0.2455 & −0.7071 & −0.1313 \\ 0.0118 & 0.1700 & 0.7071 & −0.1427 & −0.7071 \\ 0.4322 & −0.5120 & 0.2235 & 0 & 0 \\ 0.0118 & 0.1700 & 0.7071 & 0.1427 & 0.7071 \\ 0.4815 & 0.2256 & −0.2455 & 0.7071 & 0.1313 \end{pmatrix}
Fig. 1. log_{10} ‖B(ε) − C^*(ε)‖ ("∗") and log_{10} ‖Ĉ − C^*(ε)‖ ("+") versus log_{10} ε.
and

Λ = \begin{pmatrix} 0.1590 & 0.2841 & 0 & 0 & 0 \\ −0.2841 & 0.1590 & 0 & 0 & 0 \\ 0 & 0 & 0.0304 & 0 & 0 \\ 0 & 0 & 0 & 0.1312 & 0 \\ 0 & 0 & 0 & 0 & −0.1836 \end{pmatrix}.
Given this X and Λ, clearly we have a solution to Problem I, namely Ĉ. Thus C_5^S is nonempty. Next, we perturb Ĉ by a random matrix to obtain a matrix B(ε) ∉ C_5:
B(ε) = Ĉ + ε · \begin{pmatrix} 1.4886 & −0.9173 & 1.2688 & −0.1869 & −1.0830 \\ 1.2705 & −1.1061 & −0.7836 & 1.0132 & 1.0354 \\ −1.8561 & 0.8106 & 0.2133 & 0.2484 & 1.5854 \\ 2.1343 & 0.6985 & 0.7879 & 0.0596 & 0.9157 \\ 1.4358 & −0.4016 & 0.8967 & 1.3766 & −0.5565 \end{pmatrix}.

Then we apply our algorithm from Section 3 to obtain the C^*(ε) corresponding to B(ε). In Fig. 1, we plot the following two quantities for ε between 10^{−10} and 10^{10}: log_{10} ‖B(ε) − C^*(ε)‖ (marked by "∗") and log_{10} ‖Ĉ − C^*(ε)‖ (marked by "+"). We can see that as ε goes to zero, C^*(ε) approaches B(ε), as expected. Also, when ε ≤ 10^{−1}, C^*(ε) = Ĉ up to machine precision (we use MATLAB, which has machine precision around 10^{−16}).
5. Extension to the set of centroskew matrices

In this section, we extend our results of Sections 2 and 3 to centroskew matrices, i.e. matrices S such that S = −J_n S J_n. The results and the proofs are similar to the centrosymmetric case; we only list the results for the case when n is even and omit the proofs. Let n = 2k and let S_n be the set of all centroskew matrices in R^{n×n}. Considering Problem I for S_n, we have the following theorem.

Theorem 3. Given X ∈ R^{n×m} and Λ as in (2), let \tilde{X}_1 and \tilde{X}_2 be as defined in (3). Then there exists a matrix S ∈ S_n such that SX = XΛ if and only if

\tilde{X}_1 Λ \tilde{X}_2^+ \tilde{X}_2 = \tilde{X}_1 Λ \quad \text{and} \quad \tilde{X}_2 Λ \tilde{X}_1^+ \tilde{X}_1 = \tilde{X}_2 Λ.

In this case, the general solution to SX = XΛ is given by

S_s = S_0 + K_n \begin{pmatrix} 0 & Z_1(I_k − \tilde{X}_2 \tilde{X}_2^+) \\ Z_2(I_k − \tilde{X}_1 \tilde{X}_1^+) & 0 \end{pmatrix} K_n^T,

where Z_1 ∈ R^{k×k} and Z_2 ∈ R^{k×k} are both arbitrary, and

S_0 = K_n \begin{pmatrix} 0 & \tilde{X}_1 Λ \tilde{X}_2^+ \\ \tilde{X}_2 Λ \tilde{X}_1^+ & 0 \end{pmatrix} K_n^T. \tag{17}
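For even n, the middle factor in the centroskew analogue of (1) is anti-block-diagonal, as the form of S_0 in (17) suggests; a quick numerical check (NumPy assumed, arbitrary blocks G_1, G_2):

```python
import numpy as np

k = 3
n = 2 * k
Ik, Jk = np.eye(k), np.fliplr(np.eye(k))
K = np.block([[Ik, Ik], [Jk, -Jk]]) / np.sqrt(2)

rng = np.random.default_rng(0)
G1, G2 = rng.standard_normal((k, k)), rng.standard_normal((k, k))
Z = np.zeros((k, k))

# S = K [[0, G1], [G2, 0]] K^T is centroskew: S = -J_n S J_n.
S = K @ np.block([[Z, G1], [G2, Z]]) @ K.T
Jn = np.fliplr(np.eye(n))
assert np.allclose(Jn @ S @ Jn, -S)
```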
For Problem II over the solution set S_n^S of Problem I for S_n, we have the following result.

Theorem 4. Given X ∈ R^{n×m} and Λ as in (2), let the solution set S_n^S of Problem I be nonempty. Then for any B ∈ R^{n×n}, the problem \min_{S \in S_n^S} \|B − S\| has a unique solution S^* given by

S^* = S_0 + K_n \begin{pmatrix} 0 & \tilde{B}_{12}(I_k − \tilde{X}_2 \tilde{X}_2^+) \\ \tilde{B}_{21}(I_{n−k} − \tilde{X}_1 \tilde{X}_1^+) & 0 \end{pmatrix} K_n^T.

Here \tilde{X}_1, \tilde{X}_2, \tilde{B}_{12}, \tilde{B}_{21}, and S_0 are given in (3), (13), and (17). Moreover, S^* is a continuous function of B.

Acknowledgements

We thank the referees for their helpful and valuable comments.

References

[1] T. Chan, An optimal circulant preconditioner for Toeplitz systems, SIAM J. Sci. Statist. Comput. 9 (1988) 766–771.
[2] W. Chen, X. Wang, T. Zhong, The structure of weighting coefficient matrices of harmonic differential quadrature and its application, Comm. Numer. Methods Eng. 12 (1996) 455–460.
[3] E. Cheney, Introduction to Approximation Theory, McGraw-Hill, New York, 1966.
[4] K. Chu, N. Li, Designing the Hopfield neural network via pole assignment, Internat. J. Systems Sci. 25 (1994) 669–681.
[5] M. Chu, G. Golub, Structured inverse eigenvalue problems, Acta Numer. 11 (2002) 1–71.
[6] A. Collar, On centrosymmetric and centroskew matrices, Quart. J. Mech. Appl. Math. 15 (1962) 265–281.
[7] L. Datta, S. Morgera, On the reducibility of centrosymmetric matrices—applications in engineering problems, Circuits Systems Signal Process. 8 (1989) 71–96.
[8] J. Delmas, On adaptive EVD asymptotic distribution of centro-symmetric covariance matrices, IEEE Trans. Signal Process. 47 (1999) 1402–1406.
[9] G. Feyh, C. Mullis, Inverse eigenvalue problem for real symmetric Toeplitz matrices, Internat. Conf. Acoustics, Speech, Signal Process. 3 (1988) 1636–1639.
[10] U. Grenander, G. Szegő, Toeplitz Forms and their Applications, Chelsea Publishing Company, New York, 1984.
[11] N. Griswold, J. Davila, Fast algorithm for least squares 2D linear-phase FIR filter design, IEEE Internat. Conf. Acoustics, Speech, Signal Process. 6 (2001) 3809–3812.
[12] T. Kailath, A. Sayed, Fast Reliable Algorithms for Matrices with Structures, SIAM, Philadelphia, PA, 1999.
[13] N. Li, A matrix inverse eigenvalue problem and its application, Linear Algebra Appl. 266 (1997) 143–152.
[14] T. Meng, Experimental design and decision support, in: C. Leondes (Ed.), Expert Systems: The Technology of Knowledge Management and Decision Making for the 21st Century, Vol. 1, Academic Press, New York, 2001.
[15] J. Sun, Backward perturbation analysis of certain characteristic subspaces, Numer. Math. 65 (1993) 357–382.
[16] D. Tao, M. Yasuda, A spectral characterization of generalized real symmetric centrosymmetric and generalized real symmetric skew-centrosymmetric matrices, SIAM J. Matrix Anal. Appl. 23 (2002) 885–895.
[17] J. Weaver, Centrosymmetric (cross-symmetric) matrices, their basic properties, eigenvalues, and eigenvectors, Amer. Math. Monthly 92 (1985) 711–717.
[18] D. Xie, X. Hu, L. Zhang, The solvability conditions for inverse eigenproblem of symmetric and anti-persymmetric matrices and its approximation, Numer. Linear Algebra Appl. 10 (2003) 223–234.
[19] S. Xu, An Introduction to Inverse Algebraic Eigenvalue Problems, Peking University Press and Vieweg Publishing, Braunschweig, 1998.
[20] L. Zhang, D. Xie, X. Hu, The inverse eigenvalue problems of bisymmetric matrices on the linear manifold, Math. Numer. Sin. 22 (2000) 129–138.
Theoretical Computer Science 315 (2004) 319 – 333
Bernstein–Bezoutian matrices

D.A. Bini, L. Gemignani∗

Dipartimento di Matematica, Università di Pisa, via Buonarroti 2, Pisa I-56127, Italy
Abstract

Several computational and structural properties of Bezoutian matrices expressed with respect to the Bernstein polynomial basis are shown. The exploitation of such properties allows the design of fast algorithms for the solution of Bernstein–Bezoutian linear systems without ever making use of potentially ill-conditioned reductions to the monomial basis. In particular, we devise an algorithm for the computation of the greatest common divisor (GCD) of two polynomials in Bernstein form. A series of numerical tests are reported and discussed, which indicate that Bernstein–Bezoutian matrices are much less sensitive to perturbations of the coefficients of the input polynomials than other commonly used resultant matrices generated after performing the explicit conversion between the Bernstein and the power basis.
© 2004 Elsevier B.V. All rights reserved.

Keywords: Bezoutian matrices; Bernstein polynomial basis; Displacement structure; Fast algorithms
1. Introduction

Approximation methods based on Bézier curves have become more and more popular in computer aided geometric design (CAGD) [10–12,14,27]. Since Bézier curves are parametrized by means of Bernstein polynomials, it follows that computational problems involving Bézier curves generally reduce to manipulating polynomials expressed with respect to the Bernstein polynomial basis. In particular, Bézier curve intersection problems are shown to be equivalent to checking the relative primality of two polynomials in the Bernstein basis. Explicit conversion between the Bernstein and the power polynomial basis is exponentially ill-conditioned as the polynomial degree n increases [13]. Therefore, for numerical computations involving polynomials in Bernstein form
∗ Corresponding author.
E-mail addresses: [email protected] (D.A. Bini), [email protected] (L. Gemignani).
URLs: www.dm.unipi.it/∼bini, www.dm.unipi.it/∼gemignan
0304-3975/$ - see front matter © 2004 Elsevier B.V. All rights reserved.
doi:10.1016/j.tcs.2004.01.016
it is essential to consider algorithms which express all intermediate results using this form only.

The purpose of this paper is to provide theoretical bases for the design of fast and accurate algorithms for computing the greatest common divisor (GCD) of two real polynomials p(z) and q(z) of degree at most n expressed in the Bernstein polynomial basis \{\varphi_0^{(n)}(z), …, \varphi_n^{(n)}(z)\}, where \varphi_i^{(n)}(z) = \binom{n}{i}(1 − z)^{n−i} z^i, 0 ≤ i ≤ n. In theory, fast O(n^2) algorithms can be obtained by first determining the power form of p(z) and q(z) and then applying some method based on subresultant theory [7,5] or its matrix counterparts [3] to evaluate their GCD. However, due to the ill-conditioning of the explicit conversion between the Bernstein and the power basis, it has been shown [24,25] that such an approach may suffer from severe numerical difficulties; in particular, a worst-case precision of O(n) bits is nearly always required in calculations to retain some significant correct bits in the output. In this paper, we circumvent these difficulties by considering a modified resultant matrix for polynomials in Bernstein form which is represented by its short displacement generator. This displacement representation is novel and quite efficient, being explicit, algebraic and available at practically no cost. That is, we have another important example where algebraic techniques come to the rescue to overcome numerical difficulty.

A solution of the GCD problem for polynomials in Bernstein form, which does not employ any basis conversion, was first provided in [24]. The approach relies upon the construction of a suitable Frobenius matrix F ∈ R^{n×n} of p(z) directly determined from the coefficients of its representation in the Bernstein basis. Given such a matrix F, one can consider the matrix q(F) obtained by evaluating the polynomial q(z) at the matrix F.
This matrix inherits several properties of the resultant matrix of p(z) and q(z); in particular, its LU factorization yields the coefficients of the GCD of p(z) and q(z). The results of numerical experiments presented in [26] show that q(F) is generally better conditioned than q(C), where C is the classical Frobenius matrix associated with p(z). This improvement in accuracy is, however, paid for by an increase in the computational cost of the resulting method. The calculation of the entries of q(F), as outlined in [24], given the entries of F and the coefficients of q(z) in the Bernstein basis, has a cost of O(n^3) arithmetic operations (ops). In addition, the factorization phase, where Gaussian elimination is applied to reduce q(F) to its row echelon form, also requires O(n^3) ops.

In this paper, we introduce the Bezout form B = (b_{i,j}) ∈ R^{n×n} of the resultant of p(z) and q(z), defined by

\frac{p(z)q(w) − p(w)q(z)}{z − w} = \sum_{i,j=1}^{n} b_{i,j}\, \varphi_{i−1}^{(n−1)}(z)\, \varphi_{j−1}^{(n−1)}(w).
Bezoutian matrices with respect to different polynomial bases have been considered previously by many authors (see [1,16,18,19,23]). Quite apart from their theoretical interest, they have proved to be a powerful tool for devising efficient numerical methods for computations with polynomials and structured matrices [4].

In Section 2, we show that the matrix B can be constructed using O(n^2) ops given the coefficients expressing p(z) and q(z) in the Bernstein basis \{\varphi_0^{(n)}(z), …, \varphi_n^{(n)}(z)\}.
In addition, we relate the properties of a block triangular factorization of B to those of a certain polynomial remainder algorithm applied to the reversed polynomials of p(z) and q(z). This result enables the computation of the GCD of p(z) and q(z) to be reduced either to computing a block LU factorization of B or to solving a homogeneous linear system whose coefficient matrix is the kth leading principal submatrix of B for a suitable k. Since we are interested in using floating point arithmetic, it is worth realizing that, independently of the numerical method we consider for solving these problems, the precision of the computations must be dynamically tuned according to the condition numbers of the leading principal submatrices of B. For input polynomials in Bernstein form we have performed extensive numerical experiments, partly reported and discussed in Section 4, comparing the conditioning profile of B with that of the classical Bezout matrix B̃ generated after explicitly evaluating the coefficients of the polynomials in the monomial basis. In almost every case the conditioning profile of B was significantly better than that of B̃, whereas in the remaining few cases they were comparable. Similar conclusions are reached in [26] for a different resultant matrix for polynomials in Bernstein form. Hence, the Bezout resultant matrix B is numerically superior to its power basis equivalent.

Among the numerical algorithms that solve underdetermined linear systems, it is known that the SVD provides the most reliable one. Methods based on SVD computations on subresultant matrices for numerically computing GCDs of polynomials in power form have been proposed in [8,9] (see also [22] for a discussion of these methods compared with some known approaches, as well as for extensions to other resultant matrices).
These methods can be generalized to polynomials in Bernstein form by simply considering a different matrix formulation relying on the use of the matrix B. An alternative approach, exploiting the reduction of the GCD computation to the block LU factorization of B, is motivated by the structural properties of the resultant matrix B. In Section 3, we describe the displacement structure of B by proving that F(B) is a small rank matrix, say F(B) = \sum_{i=1}^{r} u_i v_i^T with r ≪ n, where

F: R^{n×n} → R^{n×n}, \quad F(B) = L^T B − B L,

and L ∈ R^{n×n} is a lower bidiagonal matrix with unit diagonal entries. The vectors u_i and v_i are called generators of the displacement representation of B. The displacement structure of B can be incorporated into the calculation of its block triangular factorization. In particular, we find that a suitable variant of the block Gaussian elimination scheme, using only recursions on the generators, can be applied to B, thus leading to a fast O(n^2) algorithm for the computation of the GCD of p(z) and q(z). This algorithm can be made robust in floating point arithmetic by replacing zero-check conditions with criteria based both on backward error analysis for LU factorization and on conditioning estimates for the leading principal submatrices of B. Fast numerical schemes based on similar techniques were developed in [6,15] for the solution of Toeplitz and Hankel linear systems. The generalization of the error analysis presented there to Bernstein–Bezoutian linear systems is beyond our present scope and is part of an ongoing investigation into the numerical properties of resultants for Bernstein polynomials.
Section 5 contains a brief discussion of future work that follows on from the results described in this paper.

2. Bezoutians of polynomials in Bernstein form

In this section, we introduce the Bezout form of the resultant of two polynomials expressed in the Bernstein basis, showing that its properties can be used to compute the greatest common divisor (GCD) of polynomials in Bernstein form.

Let p(z) and q(z) be two real polynomials of degree less than or equal to n. The polynomials \varphi_i^{(k)}(z) = \binom{k}{i}(1 − z)^{k−i} z^i, 0 ≤ i ≤ k, form the Bernstein basis of the vector space of real polynomials of degree at most k. Assume that

p(z) = \sum_{i=0}^{n} p_i\, \varphi_i^{(n)}(z), \quad q(z) = \sum_{i=0}^{n} q_i\, \varphi_i^{(n)}(z) \tag{1}

defines the Bernstein form of p(z) and q(z), respectively. From

z^j = \sum_{k=j}^{n} \binom{k}{j} \binom{n}{j}^{-1} \varphi_k^{(n)}(z), \quad j = 0, …, n,

one immediately obtains that the matrix T_n = (t_{i,j}^{(n)}) ∈ R^{(n+1)×(n+1)} defining the transformation between the Bernstein and the power basis is given by

\begin{pmatrix} 1 \\ \vdots \\ z^n \end{pmatrix} = T_n \begin{pmatrix} \varphi_0^{(n)}(z) \\ \vdots \\ \varphi_n^{(n)}(z) \end{pmatrix}, \quad t_{i,j}^{(n)} = \begin{cases} 0 & \text{if } i > j, \\ \binom{j−1}{i−1}\binom{n}{i−1}^{-1} & \text{if } i ≤ j. \end{cases} \tag{2}
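The matrix T_n of (2) is easy to form explicitly, and its rapidly growing condition number illustrates the ill-conditioning of the Bernstein-to-power conversion mentioned in the introduction (a sketch assuming NumPy; `conversion_matrix` is our name):

```python
import numpy as np
from math import comb

def conversion_matrix(n):
    """T_n with [1, z, ..., z^n]^T = T_n [phi_0^(n)(z), ..., phi_n^(n)(z)]^T."""
    T = np.zeros((n + 1, n + 1))
    for i in range(1, n + 2):           # 1-based indices as in (2)
        for j in range(i, n + 2):
            T[i - 1, j - 1] = comb(j - 1, i - 1) / comb(n, i - 1)
    return T

# Sanity check of (2) at a sample point z:
n, z = 6, 0.3
phi = np.array([comb(n, k) * (1 - z) ** (n - k) * z ** k for k in range(n + 1)])
powers = np.array([z ** j for j in range(n + 1)])
assert np.allclose(conversion_matrix(n) @ phi, powers)

for n in (5, 10, 20):
    print(n, np.linalg.cond(conversion_matrix(n)))   # grows quickly with n
```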
The Bezoutian matrix B = (b_{i,j}) ∈ R^{n×n} of p(z) and q(z) in the Bernstein basis \{\varphi_0^{(n−1)}(z), …, \varphi_{n−1}^{(n−1)}(z)\} is defined by

\frac{p(z)q(w) − p(w)q(z)}{z − w} = \sum_{i,j=1}^{n} b_{i,j}\, \varphi_{i−1}^{(n−1)}(z)\, \varphi_{j−1}^{(n−1)}(w), \tag{3}

which can equivalently be rewritten as

\frac{p(z)q(w) − p(w)q(z)}{z − w} = [\varphi_0^{(n−1)}(z), …, \varphi_{n−1}^{(n−1)}(z)]\; B \begin{pmatrix} \varphi_0^{(n−1)}(w) \\ \vdots \\ \varphi_{n−1}^{(n−1)}(w) \end{pmatrix}. \tag{4}

Our first result is concerned with the construction of the matrix B given the coefficients of the Bernstein form (1) of p(z) and q(z).

Theorem 1. Given the coefficients p_i, q_i, 0 ≤ i ≤ n, as in (1), the Bernstein–Bezoutian matrix B = (b_{i,j}) ∈ R^{n×n} satisfying (3) can be constructed at the cost of O(n^2)
arithmetic operations according to the following rules: n (pi q0 − p0 qi ); i
bi;1 =
1 6 i 6 n;
bi;j+1 =
n2 j(n − i) (pi qj − pj qi ) + bi+1;j ; i(n − j) i(n − j)
bn;j+1 =
n (pn qj − pj qn ); n−j
1 6 i; j 6 n − 1;
1 6 j 6 n − 1:
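The recursions of Theorem 1 translate directly into an $O(n^2)$ procedure. The following is a sketch in exact rational arithmetic (our code, not from the paper); on the data of Example 3 below it reproduces the 4×4 matrix B reported there:

```python
from fractions import Fraction as F

def bernstein_bezoutian(p, q):
    """Bernstein-Bezoutian matrix B via the Theorem 1 recursions.
    p, q: Bernstein coefficients p_0..p_n and q_0..q_n."""
    n = len(p) - 1
    d = lambda i, j: p[i] * q[j] - p[j] * q[i]
    B = [[F(0)] * (n + 2) for _ in range(n + 2)]  # 1-based, padded
    for i in range(1, n + 1):
        B[i][1] = F(n, i) * d(i, 0)               # first column
    for j in range(1, n):
        B[n][j + 1] = F(n, n - j) * d(n, j)       # last row
        for i in range(1, n):                     # column j+1 from column j
            B[i][j + 1] = (F(n * n, i * (n - j)) * d(i, j)
                           + F(j * (n - i), i * (n - j)) * B[i + 1][j])
    return [row[1:n + 1] for row in B[1:n + 1]]

# Bernstein coefficients of the polynomials of Example 3 (gcd = z - 2)
p = [F(4), F(4), F(19, 6), F(3, 2), F(0)]
q = [F(1, 2), F(7, 16), F(1, 24), F(-7, 16), F(-3, 4)]
B = bernstein_bezoutian(p, q)
```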
Proof. From (1) and (3) we deduce that
$$\sum_{i,j=0}^{n} (p_iq_j - p_jq_i)\,\beta_i^{(n)}(z)\,\beta_j^{(n)}(w) = (z-w)\sum_{i,j=1}^{n} b_{i,j}\,\beta_{i-1}^{(n-1)}(z)\,\beta_{j-1}^{(n-1)}(w).$$
Since
$$z\sum_{i,j=1}^{n} b_{i,j}\,\beta_{i-1}^{(n-1)}(z)\,\beta_{j-1}^{(n-1)}(w) = (w + (1-w))\,z\sum_{i,j=1}^{n} b_{i,j}\,\beta_{i-1}^{(n-1)}(z)\,\beta_{j-1}^{(n-1)}(w)$$
$$= zw\sum_{i,j=1}^{n} b_{i,j}\,\beta_{i-1}^{(n-1)}(z)\,\beta_{j-1}^{(n-1)}(w) + \sum_{i,j=1}^{n} b_{i,j}\,z\beta_{i-1}^{(n-1)}(z)\,(1-w)\beta_{j-1}^{(n-1)}(w)$$
and, similarly,
$$w\sum_{i,j=1}^{n} b_{i,j}\,\beta_{i-1}^{(n-1)}(z)\,\beta_{j-1}^{(n-1)}(w) = (z + (1-z))\,w\sum_{i,j=1}^{n} b_{i,j}\,\beta_{i-1}^{(n-1)}(z)\,\beta_{j-1}^{(n-1)}(w)$$
$$= zw\sum_{i,j=1}^{n} b_{i,j}\,\beta_{i-1}^{(n-1)}(z)\,\beta_{j-1}^{(n-1)}(w) + \sum_{i,j=1}^{n} b_{i,j}\,(1-z)\beta_{i-1}^{(n-1)}(z)\,w\beta_{j-1}^{(n-1)}(w),$$
one finds that
$$\sum_{i,j=0}^{n} (p_iq_j - p_jq_i)\,\beta_i^{(n)}(z)\,\beta_j^{(n)}(w) = \sum_{i,j=1}^{n} b_{i,j}\,z\beta_{i-1}^{(n-1)}(z)\,(1-w)\beta_{j-1}^{(n-1)}(w) - \sum_{i,j=1}^{n} b_{i,j}\,(1-z)\beta_{i-1}^{(n-1)}(z)\,w\beta_{j-1}^{(n-1)}(w).$$
This can be rewritten as
$$\sum_{i,j=0}^{n} (p_iq_j - p_jq_i)\,\beta_i^{(n)}(z)\,\beta_j^{(n)}(w) = \sum_{i,j=1}^{n} b_{i,j}\,\frac{i}{n}\,\beta_i^{(n)}(z)\,\frac{n-j+1}{n}\,\beta_{j-1}^{(n)}(w) - \sum_{i,j=1}^{n} b_{i,j}\,\frac{n-i+1}{n}\,\beta_{i-1}^{(n)}(z)\,\frac{j}{n}\,\beta_j^{(n)}(w).$$
Hence, by equating the coefficients of $\beta_j^{(n)}(w)$ on both sides of the previous relation, it follows that
$$\sum_{i=0}^{n} (p_iq_0 - p_0q_i)\,\beta_i^{(n)}(z) = \sum_{i=1}^{n} b_{i,1}\,\frac{i}{n}\,\beta_i^{(n)}(z)$$
and
$$\sum_{i=0}^{n} (p_iq_j - p_jq_i)\,\beta_i^{(n)}(z) = \frac{n-j}{n}\sum_{i=1}^{n} b_{i,j+1}\,\frac{i}{n}\,\beta_i^{(n)}(z) - \frac{j}{n}\sum_{i=1}^{n} b_{i,j}\,\frac{n-i+1}{n}\,\beta_{i-1}^{(n)}(z)$$
for $j = 1,\ldots,n-1$. A comparison of the coefficients of $\beta_i^{(n)}(z)$ now concludes the proof of the theorem. □

The next result relates the block LU factorization of B to the computation of the GCD of p(z) and q(z) expressed in the Bernstein form (1). The crucial observation is that B is congruent to the classical Bezoutian matrix associated with p(z) and q(z). That is, from (2) and (4) one obtains
$$\frac{p(z)q(w) - p(w)q(z)}{z-w} = [1,\ldots,z^{n-1}]\,T_{n-1}^{-T} B\, T_{n-1}^{-1} \begin{bmatrix} 1 \\ \vdots \\ w^{n-1} \end{bmatrix} \tag{5}$$
and thus $\hat B = T_{n-1}^{-T} B\, T_{n-1}^{-1}$ is the classical Bezout matrix of order n associated with p(z) and q(z) of degree at most n.

Let $J_n \in \mathbb{R}^{n\times n}$ be the permutation (reversion) matrix having unit antidiagonal entries. Moreover, introduce the reverse polynomials $\tilde p(z) = z^n p(z^{-1})$ and $\tilde q(z) = z^n q(z^{-1})$. By multiplying both sides of
$$\frac{p(z^{-1})q(w^{-1}) - p(w^{-1})q(z^{-1})}{z^{-1} - w^{-1}} = [1,\ldots,z^{-n+1}]\,\hat B \begin{bmatrix} 1 \\ \vdots \\ w^{-n+1} \end{bmatrix}$$
by $z^{n-1}w^{n-1}$ it is readily verified that
$$\frac{\tilde p(z)\tilde q(w) - \tilde p(w)\tilde q(z)}{w - z} = [1,\ldots,z^{n-1}]\,J_n \hat B J_n \begin{bmatrix} 1 \\ \vdots \\ w^{n-1} \end{bmatrix},$$
which says that, up to the sign, $\tilde B = J_n \hat B J_n$ is the classical Bezout matrix generated by $\tilde p(z)$ and $\tilde q(z)$.

The characterization of the Euclidean algorithm applied to the polynomials $\tilde p(z)$ and $\tilde q(z)$ in terms of properties of the block LU factorization of $\tilde B = J_n \hat B J_n$ provided in [2,3,17] enables us to show that B is indeed a resultant matrix for the polynomials p(z) and q(z). Given two polynomials p(z) and q(z) of degree at most n, we say that ∞ is a common root of p(z) and q(z) if deg(p(z)) < n and deg(q(z)) < n. In the following
theorem we extend the results of [2,3,17] to the representation of polynomials in the Bernstein basis.

Theorem 2. Assume that neither 0 nor ∞ is a common root of the two real polynomials p(z) and q(z) defined by (1). Moreover, let w(z) be the GCD of p(z) and q(z). Then:
(1) The degree k of w(z) is equal to $k = n - \mathrm{rank}(B)$, where B is the Bernstein–Bezoutian matrix generated from p(z) and q(z) as in (3).
(2) We have $\det(B_{n-k}) \ne 0$ and $\det(B_j) = 0$ for $j = n-k+1,\ldots,n$, where $B_j$ denotes the $j\times j$ leading principal submatrix of B. Let $1 \le m_1 < m_2 < \cdots < m_L = n-k$ be the integers such that $\det(B_{m_i}) \ne 0$, $1 \le i \le L$, and $\det(B_j) = 0$ otherwise.
(3) Let B be partitioned into a $2\times 2$ block matrix as follows:
$$B = \begin{bmatrix} B_{m_{L-1}} & P \\ Q & R \end{bmatrix}.$$
Moreover, consider the Schur complement $S = R - QB_{m_{L-1}}^{-1}P$ of $B_{m_{L-1}}$ in B and let $[b_{m_{L-1}+1},\ldots,b_n]$ be the first row of S. There exists a nonzero scalar $\gamma$ such that
$$b_{m_{L-1}+1}\,\beta_{m_{L-1}}^{(n-1)}(z) + \cdots + b_n\,\beta_{n-1}^{(n-1)}(z) = \gamma\, z^{m_{L-1}}\, w(z). \tag{6}$$

Proof. Since
$$J_n \tilde B J_n = T_{n-1}^{-T} B\, T_{n-1}^{-1}$$
and, moreover, $T_{n-1}$ is a nonsingular triangular matrix, (1) and (2) can be easily obtained from the analogous properties of classical Bezoutians stated in [2, Corollary 3.1]. Concerning part (3), we recall that the Schur complement $\hat S$ of the leading principal submatrix of order $m_{L-1}$ of $J_n \tilde B J_n$ is such that its first row gives the suitably normalized coefficients of the greatest common divisor $z^k w(z^{-1})$ of $\tilde p(z)$ and $\tilde q(z)$ [2,3,17]. Relation (6) now follows from $\hat T^T \hat S \hat T = S$, where $\hat T$ denotes the trailing principal submatrix of $T_{n-1}$ of order $n - m_{L-1}$.
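The Schur complement in part (3) can be obtained by exact block Gaussian elimination: eliminating the first $m_{L-1}$ columns leaves $S = R - QB_{m_{L-1}}^{-1}P$ as the trailing block. A minimal sketch (our code, not from the paper), run on the matrix B of Example 3 below, where $m_{L-1} = 2$ and $w(z) = z - 2$:

```python
from fractions import Fraction as F

def schur_complement(B, m):
    """S = R - Q B_m^{-1} P for the 2x2 block partition with leading block B_m,
    computed by eliminating the first m columns in exact arithmetic."""
    n = len(B)
    A = [row[:] for row in B]
    for k in range(m):
        for i in range(k + 1, n):
            f = A[i][k] / A[k][k]     # requires nonsingular leading minors
            for j in range(k, n):
                A[i][j] -= f * A[k][j]
    return [row[m:] for row in A[m:]]

# Bernstein-Bezoutian matrix of Example 3 below
B = [[F(1), F(17, 6), F(10, 3), F(3)],
     [F(17, 6), F(157, 36), F(83, 18), F(4)],
     [F(10, 3), F(83, 18), F(187, 36), F(19, 4)],
     [F(3), F(4), F(19, 4), F(9, 2)]]
S = schur_complement(B, 2)

# sample-point check of (6): first row of S gives gamma * z^2 * (z - 2)
z = F(3)
assert (S[0][0] * 3 * (1 - z) * z ** 2 + S[0][1] * z ** 3
        == F(-15, 22) * z ** 2 * (z - 2))
```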
Example 3. Consider the polynomials
$$p(z) = 4 - 5z^2 + z^4 = 4\beta_0^{(4)}(z) + 4\beta_1^{(4)}(z) + \tfrac{19}{6}\beta_2^{(4)}(z) + \tfrac{3}{2}\beta_3^{(4)}(z)$$
and
$$q(z) = \tfrac12 - \tfrac14 z - 2z^2 + z^3 = \tfrac12\beta_0^{(4)}(z) + \tfrac{7}{16}\beta_1^{(4)}(z) + \tfrac{1}{24}\beta_2^{(4)}(z) - \tfrac{7}{16}\beta_3^{(4)}(z) - \tfrac34\beta_4^{(4)}(z),$$
whose (monic) greatest common divisor is $w(z) = z - 2$. We find that
$$B = \begin{bmatrix} 1 & 17/6 & 10/3 & 3 \\ 17/6 & 157/36 & 83/18 & 4 \\ 10/3 & 83/18 & 187/36 & 19/4 \\ 3 & 4 & 19/4 & 9/2 \end{bmatrix}.$$
Hence, the Schur complement S of $B_2$ in B is
$$S = \begin{bmatrix} 5/11 & 15/22 \\ 15/22 & 45/44 \end{bmatrix}$$
and thus it is readily verified that
$$\frac{5}{11}\,\beta_2^{(3)}(z) + \frac{15}{22}\,\beta_3^{(3)}(z) = -\frac{15}{22}\,z^2(z-2).$$
The Schur complement of $B_3$ is the zero matrix of order 1.

In view of the triangular structure of $T_{n-1}$, it can also be shown that the sequence $\{m_1,\ldots,m_L\}$ in Theorem 2, which corresponds to the sequence of jumps in the block triangular factorization process applied to B, can be determined by means of a direct inspection of the entries of the computed Schur complements. In particular, the occurrence of a jump is revealed by zero entries in the north-western corner of S.

Example 4. Let $p(z) = 1 + 4z^4 + z^5$ and $q(z) = z + z^5$, so that
$$\tilde p(z) = z^5 p(z^{-1}) = z\,\tilde q(z) + 3z + 1.$$
The Schur complement S of $B_1$ is
$$S = \begin{bmatrix} 0 & 0 & 3/16 & 1 \\ 0 & 1/12 & 2/3 & 8/3 \\ 3/16 & 2/3 & 17/8 & 6 \\ 1 & 8/3 & 6 & 10 \end{bmatrix}$$
and thus $m_2 = 4$.

Finally, we observe that the assumptions of Theorem 2 can be relaxed: similar properties remain valid in the degenerate cases where p(z) and q(z) have a common root at 0 or ∞. This situation can easily be detected by evaluating the polynomials and the reverse polynomials at the origin; Theorem 2 then applies to the possibly deflated polynomials.

3. The displacement structure of Bernstein–Bezoutian matrices

So far we have shown that the solution of the GCD problem for polynomials expressed with respect to the Bernstein polynomial basis can be reduced to the computation of a block triangular factorization of a certain matrix B generated according to Theorem 1 from the coefficients of these polynomials. In order to design a fast algorithm for this latter task, in this section we investigate the displacement structure of B. The next result provides a characterization of the Bernstein-companion matrix $T_{n-1}^{-1}ZT_{n-1}$, where $Z = (z_{i,j})$ is the down-shift matrix of order n defined by $z_{i,j} = \delta_{i-1,j}$ and $\delta_{i,j}$ is the Kronecker symbol.
Theorem 5. We have
$$T_{n-1}^{-1}ZT_{n-1} = V = \begin{bmatrix} 1+\omega_1 & \omega_2 & \cdots & \omega_n \\ \kappa_1 & 1 & & \\ & \ddots & \ddots & \\ & & \kappa_{n-1} & 1 \end{bmatrix},$$
where $\kappa_i = (n-i)/i$, $1 \le i \le n-1$, and $\omega_i = -n/i$, $1 \le i \le n$.

Proof. From (2) and from
$$Z\begin{bmatrix} 1 \\ z \\ \vdots \\ z^{n-1} \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ \vdots \\ z^{n-2} \end{bmatrix} = z^{-1}\begin{bmatrix} 1 \\ z \\ \vdots \\ z^{n-1} \end{bmatrix} - z^{-1}e_1,$$
one finds that, for any nonzero $z \in \mathbb{R}$,
$$T_{n-1}^{-1}ZT_{n-1}\begin{bmatrix} \beta_0^{(n-1)}(z) \\ \vdots \\ \beta_{n-1}^{(n-1)}(z) \end{bmatrix} = z^{-1}\begin{bmatrix} \beta_0^{(n-1)}(z) \\ \vdots \\ \beta_{n-1}^{(n-1)}(z) \end{bmatrix} - z^{-1}e_1. \tag{7}$$
From the definition of the Bernstein polynomials $\beta_i^{(n-1)}(z)$ it is easily verified that, for $1 \le i \le n-1$,
$$z^{-1}\beta_i^{(n-1)}(z) - \beta_i^{(n-1)}(z) = \binom{n-1}{i}(1-z)^{n-i}z^{i-1} = \frac{n-i}{i}\,\beta_{i-1}^{(n-1)}(z). \tag{8}$$
Moreover, since Bernstein polynomials define a partition of unity, that is, $\sum_{i=0}^{n-1}\beta_i^{(n-1)}(z) = 1$, it follows that $\sum_{i=0}^{n-1} z^{-1}\beta_i^{(n-1)}(z) = z^{-1}$, and, therefore,
$$z^{-1}\beta_0^{(n-1)}(z) = z^{-1} - (n-1)\beta_0^{(n-1)}(z) - \sum_{i=1}^{n-2}\left(1 + \frac{n-i-1}{i+1}\right)\beta_i^{(n-1)}(z) - \beta_{n-1}^{(n-1)}(z). \tag{9}$$
Hence, by combining relations (8) and (9), we deduce that (7) still holds if we replace $T_{n-1}^{-1}ZT_{n-1}$ by the matrix V. Since the value of z can be arbitrarily chosen and $\beta_0^{(n-1)}(z),\ldots,\beta_{n-1}^{(n-1)}(z)$ are linearly independent, we may conclude that $V = T_{n-1}^{-1}ZT_{n-1}$. □
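Theorem 5 is equivalent to the inversion-free identity $ZT_{n-1} = T_{n-1}V$, which can be verified exactly for small n. A sketch (our code, with $T_{n-1}$ taken upper triangular as in (2)):

```python
from fractions import Fraction as F
from math import comb

def T(n):
    """n x n matrix T_{n-1}: t_{i,j} = C(j-1,i-1)/C(n-1,i-1) for i <= j
    (0-based below: entry (i, j) is C(j, i)/C(n-1, i))."""
    return [[F(comb(j, i), comb(n - 1, i)) if i <= j else F(0)
             for j in range(n)] for i in range(n)]

def V(n):
    """The matrix V of Theorem 5."""
    M = [[F(0)] * n for _ in range(n)]
    for i in range(1, n):
        M[i][i] = F(1)
        M[i][i - 1] = F(n - i, i)      # kappa_i = (n - i)/i
    for j in range(n):
        M[0][j] = F(-n, j + 1)         # omega_{j+1} = -n/(j + 1)
    M[0][0] += 1                       # top-left entry is 1 + omega_1
    return M

def matmul(A, M):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*M)]
            for row in A]

for n in range(2, 8):
    Z = [[F(1) if i - 1 == j else F(0) for j in range(n)] for i in range(n)]
    # T_{n-1}^{-1} Z T_{n-1} = V  <=>  Z T_{n-1} = T_{n-1} V
    assert matmul(Z, T(n)) == matmul(T(n), V(n))
```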
From the previous theorem it is immediately seen that
$$T_{n-1}^{-1}ZT_{n-1} = L + e_1[\omega_1,\omega_2,\ldots,\omega_n],$$
where $L = I + Z\,\mathrm{diag}(\kappa_1,\ldots,\kappa_n) \in \mathbb{R}^{n\times n}$ denotes the lower bidiagonal matrix with unit diagonal entries and subdiagonal entries equal to $\kappa_1,\ldots,\kappa_{n-1}$. This implies the matrix equation
$$ZT_{n-1} - T_{n-1}L = e_1[\omega_1,\omega_2,\ldots,\omega_n] = e_1\omega^T, \qquad \omega := [\omega_1,\ldots,\omega_n]^T, \tag{10}$$
which can be used to derive a displacement equation for the Bernstein–Bezoutian matrix B generated from the coefficients of two polynomials p(z) and q(z) as in (3). Recall that classical Bezoutians are the inverses of Hankel matrices and, therefore, they are Hankel-like matrices [20,21]. In particular, the Bezout matrix
$$\hat B = T_{n-1}^{-T} B\, T_{n-1}^{-1} \tag{11}$$
of p(z) and q(z) with respect to the standard power basis satisfies
$$Z^T\hat B - \hat B Z = uv^T - vu^T \tag{12}$$
for two suitable n-dimensional vectors $u = \hat B e_1$ and v. Thus, from (10) and (11) one obtains
$$L^TB - BL = L^T T_{n-1}^T \hat B T_{n-1} - T_{n-1}^T \hat B T_{n-1} L = (T_{n-1}^T Z^T - \omega e_1^T)\hat B T_{n-1} - T_{n-1}^T \hat B (ZT_{n-1} - e_1\omega^T),$$
from which, in the light of (12), it follows that
$$L^TB - BL = T_{n-1}^T (uv^T - vu^T) T_{n-1} - \omega u^T T_{n-1} + T_{n-1}^T u\,\omega^T = T_{n-1}^T u\,(\omega^T + v^T T_{n-1}) - (\omega + T_{n-1}^T v)\,u^T T_{n-1}.$$
This means that the matrix B has displacement rank at most 2 with respect to the displacement operator
$$F: \mathbb{R}^{n\times n} \to \mathbb{R}^{n\times n}, \qquad F(B) = L^TB - BL,$$
or, equivalently, $\mathrm{rank}(F(B)) \le 2$. By looking back at Theorem 1, we find that, for $1 \le i, j \le n-1$,
$$(L^TB - BL)_{i,j} = \kappa_i b_{i+1,j} - \kappa_j b_{i,j+1} = \frac{n-j}{j}\left(\frac{j(n-i)}{i(n-j)}\,b_{i+1,j} - b_{i,j+1}\right) = -\frac{n^2}{ij}(p_iq_j - p_jq_i).$$
Analogously, for $j = 1,\ldots,n-1$, we obtain that
$$(L^TB - BL)_{n,j} = -\kappa_j b_{n,j+1} = -\frac{n}{j}(p_nq_j - p_jq_n).$$
Observe that $L^TB - BL = \tilde L^T B - B\tilde L$ for $\tilde L = L - I_n = DZD^{-1}$, where $D = \mathrm{diag}\left(\binom{n-1}{0}, \binom{n-1}{1}, \ldots, \binom{n-1}{n-1}\right)$. That is, the scaled Bezoutian matrix $\tilde B = DBD$ is such that
$Z^T\tilde B - \tilde BZ$ has rank at most 2. Observe also that, if the $p_i$ and $q_j$ are integers, then the scaled Bezoutian $\tilde B$ as well as the matrix $Z^T\tilde B - \tilde BZ$ have integer entries. The latter matrix can be written as $(Z^T\tilde B - \tilde BZ)_{i,j} = (n^2/ij)\,d_i d_j\,(p_iq_j - p_jq_i)$, where $d_i = \binom{n-1}{i-1}$.

In this way, we arrive at the following result, which characterizes the generators of the displacement representation of B in terms of the coefficients of the Bernstein form of p(z) and of q(z).

Theorem 6. The Bernstein–Bezoutian matrix B generated from p(z) and q(z) by means of (3) satisfies the displacement equation
$$L^TB - BL = \hat q\hat p^T - \hat p\hat q^T,$$
where $L = I + Z\,\mathrm{diag}(\kappa_1,\ldots,\kappa_n)$, $\hat p = [np_1, (n/2)p_2, \ldots, p_n]^T$, $\hat q = [nq_1, (n/2)q_2, \ldots, q_n]^T$ and $p_i, q_i$, $0 \le i \le n$, are the coefficients of the Bernstein form (1) of p(z) and q(z), respectively.

If $J_n$ denotes the reversion matrix introduced in the previous section, then Theorem 6 provides a displacement equation for $J_nBJ_n$ of the form
$$\tilde L(J_nBJ_n) - (J_nBJ_n)\tilde L^T = \tilde q\tilde p^T - \tilde p\tilde q^T,$$
where $\tilde L$ is a lower triangular matrix with unit diagonal entries. Since Bernstein polynomials are symmetric, i.e., $\beta_i^{(n)}(z) = \beta_{n-i}^{(n)}(1-z)$, we find that, up to the sign, $J_nBJ_n$ is the Bernstein–Bezoutian matrix associated with $p(1-z)$ and $q(1-z)$. Therefore, Theorem 2 allows us to reduce the computation of the GCD of p(z) and q(z) to determining a block triangular factorization of $J_nBJ_n$. To do this we can consider the generalized Schur algorithm for generalized displacement structures described in [20,21]. The derivation of this algorithm relies upon the fundamental property that the Schur complement of a nonsingular leading principal submatrix of $J_nBJ_n$ inherits the same displacement structure as $J_nBJ_n$. This enables the elimination procedure to be defined by means of a set of recursions involving only the displacement generators.
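The displacement equation of Theorem 6 can be checked directly: build B by the recursions of Theorem 1 and compare $L^TB - BL$ with $\hat q\hat p^T - \hat p\hat q^T$. A sketch in exact arithmetic (our code, with random integer Bernstein coefficients):

```python
import random
from fractions import Fraction as F

def bernstein_bezoutian(p, q):
    """O(n^2) construction of the Bernstein-Bezoutian matrix (Theorem 1)."""
    n = len(p) - 1
    d = lambda i, j: p[i] * q[j] - p[j] * q[i]
    B = [[F(0)] * (n + 2) for _ in range(n + 2)]
    for i in range(1, n + 1):
        B[i][1] = F(n, i) * d(i, 0)
    for j in range(1, n):
        B[n][j + 1] = F(n, n - j) * d(n, j)
        for i in range(1, n):
            B[i][j + 1] = (F(n * n, i * (n - j)) * d(i, j)
                           + F(j * (n - i), i * (n - j)) * B[i + 1][j])
    return [row[1:n + 1] for row in B[1:n + 1]]

def matmul(A, M):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*M)]
            for row in A]

random.seed(0)
n = 5
p = [F(random.randint(-9, 9)) for _ in range(n + 1)]
q = [F(random.randint(-9, 9)) for _ in range(n + 1)]
B = bernstein_bezoutian(p, q)

# L = I + Z diag(kappa_1, ..., kappa_n) with kappa_i = (n - i)/i
L = [[F(1) if i == j else (F(n - i, i) if j == i - 1 else F(0))
      for j in range(n)] for i in range(n)]
LT = [list(r) for r in zip(*L)]
lhs = [[x - y for x, y in zip(r1, r2)]
       for r1, r2 in zip(matmul(LT, B), matmul(B, L))]
ph = [F(n, i) * p[i] for i in range(1, n + 1)]   # p-hat: (n/i) p_i
qh = [F(n, i) * q[i] for i in range(1, n + 1)]   # q-hat: (n/i) q_i
rhs = [[qh[i] * ph[j] - ph[i] * qh[j] for j in range(n)] for i in range(n)]
assert lhs == rhs                                 # Theorem 6 holds exactly
```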
Although the algorithm presented in [20,21] only works in the strongly nonsingular case, where all the leading principal submatrices of $J_nBJ_n$ are nonsingular, its extension to cover input matrices with singular submatrices is straightforward. In fact, by a continuity argument a block elimination step can be reduced to performing a sequence of single steps. Observe that the size of the jumps occurring in the block elimination procedure can be determined by a direct inspection of the computed Schur complements, as shown in Example 4. Summarizing, we apply the generalized Schur algorithm for the block triangular factorization of $J_nBJ_n$ to obtain a fast $O(n^2)$ algorithm for the computation of the GCD of two polynomials p(z) and q(z) given in Bernstein form.
[Fig. 1: base-10 logarithm of the spectral condition numbers of the submatrices $\tilde B_i$ (dark grey) and $B_i$ (light grey), $i = 1,\ldots,m+k$, for $k = 0, 1, 2, 3$.]
4. Numerical experiments

The numerical performance of the LU factorization algorithm applied to a matrix $J_nAJ_n$ is strongly influenced by the condition numbers of the $k\times k$ trailing principal submatrices $A_k$ of A. Therefore, in order to compare the performance of computing the GCD of two polynomials in the monomial basis and in the Bernstein basis, we have compared the values of the condition numbers of the trailing principal submatrices $\tilde B_k$ and $B_k$ of $\tilde B$ and B, respectively. In fact, $\tilde B$ and B are the representations of the Bezoutian $b(z,w) = (p(z)q(w) - p(w)q(z))/(z-w)$ in the monomial basis and in the Bernstein basis, respectively. For $k = 0, 1, 2, 3$ we have generated two pseudo-random polynomials p(z) and q(z) of degree $n = m + k$, with $m = 40$, having a common factor s(z) of degree k, in the following way: let $P_i$ and $Q_i$ be random integers uniformly distributed in the range $[-100, 100]$, and set $p(z) = s(z)\sum_{i=0}^{m} P_i\beta_i^{(m)}(z)$, $q(z) = s(z)\sum_{i=0}^{m} Q_i\beta_i^{(m)}(z)$. The common factor s(z) has been chosen in the set $\{1,\ z+2,\ (z+2)(z-3),\ (z+2)(z-3)(z+1/3)\}$. From p(z) and q(z) we have constructed the matrices $\tilde B$ and B. In Fig. 1, we report the plots of the logarithm to the base 10 of the spectral condition numbers of the matrices $\tilde B_i$ (dark grey) and $B_i$ (light grey) for $i = 1,\ldots,m+k$, for the values $k = 0, 1, 2, 3$. As we can see, the growth of the spectral condition number with respect to i is much larger for the Bezoutian represented in the monomial basis than for the Bezoutian represented in the Bernstein basis. In particular, if the polynomials p(z) and q(z) are relatively prime (k = 0), then the condition numbers of $B_i$ are uniformly bounded. This shows that any numerical method for the computation of $\gcd(p(z), q(z))$ based on the LU factorization of the Bezout matrix is less prone to
numerical instability if the computation is performed in the Bernstein basis rather than in the monomial basis.

We also considered the use of the SVD to obtain a satisfactory "approximate GCD" for polynomials in Bernstein form (see [22] and the bibliography therein for a review of some major known methods for approximate GCDs of polynomials in power form). In [9] it was proved that a reasonable termination criterion for Euclid's algorithm in floating-point arithmetic is to test the ratio between the smallest singular values of two consecutive submatrices of the subresultant of the input polynomials against a prescribed tolerance. In Fig. 2 we compare the robustness of this indicator for Bernstein–Bezoutian matrices (light grey) and classical Bezoutian matrices (dark grey) obtained after performing the explicit conversion between the Bernstein and the power basis. Specifically, we plot the logarithm to the base 10 of the ratio between the smallest singular values of two consecutive leading principal submatrices. The input polynomials are pseudo-random polynomials of degree 20 in Bernstein form with a common factor of degree 1. Computations are carried out using the standard numerical precision of about 16 digits. We see that for classical Bezout matrices the ratio profile experiences dramatic and unpredictable changes at each successive stage, so that its comparison with a specified tolerance is an unreliable indicator of when to stop the Euclidean algorithm. On the contrary, the test performs quite well for Bernstein–Bezoutian matrices.

[Fig. 2: base-10 logarithm of the ratio between the smallest singular values of two consecutive leading principal submatrices, for Bernstein–Bezoutian matrices (light grey) and classical Bezoutian matrices (dark grey).]

5. Future work

The results presented in this paper provide a theoretical basis for the design of fast and accurate algorithms for the computation of the GCD of two polynomials in Bernstein form. Our future research will mainly focus on studying the numerical behavior of these algorithms. A look-ahead strategy can be incorporated into the generalized Schur algorithm in order to improve its robustness and accuracy.
An implementation of the Schur algorithm in a look-ahead fashion, using variable-precision floating-point arithmetic, would
provide a competitive method for solving the GCD problem for polynomials in the Bernstein basis.

The Schur algorithm is based on the invariance of the Bezoutian structure under Schur complementation. This property, rephrased in a polynomial setting, leads to polynomial schemes for the computation of the GCD of two given polynomials. We refer to [2,3] for a description of these schemes for polynomials expressed with respect to the standard power basis. In particular, in [3] it was noticed that the polynomial equivalence can be exploited in order to decrease the Boolean complexity of the factorization procedures. The results of this paper allow us to extend the approach of [2,3] to the case where the input polynomials are represented in the Bernstein basis. In this way, we obtain polynomial remainder algorithms for computing the GCD of two polynomials in Bernstein form which retain the Bernstein basis throughout the computation. Extensive numerical experiments comparing the Boolean cost of these polynomial schemes and of the generalized Schur algorithm would yield important insights toward a conclusive choice.

Besides GCD computation, the properties of Bezoutian matrices allow the design of efficient root localization procedures and stability tests for scalar and matrix polynomials. This transfers also to Bernstein–Bezoutian matrices, and the study of the resulting algorithms would be useful for a variety of applications in computer graphics.

Acknowledgements

We wish to thank the anonymous referees whose valuable suggestions greatly improved the presentation of the paper. This work was partially supported by MIUR, Grant number 2002014121.

References

[1] S. Barnett, A Bezoutian matrix for Chebyshev polynomials, in: Applications of Matrix Theory (Bradford, 1988), Vol. 22, Institute of Mathematics and its Applications Conference Series, New Series, Oxford University Press, Oxford, 1988, pp. 137–149.
[2] D. Bini, L. Gemignani, Fast parallel computation of the polynomial remainder sequence via Bézout and Hankel matrices, SIAM J. Comput. 24 (1) (1995) 63–77.
[3] D.A. Bini, L. Gemignani, Fast fraction-free triangularization of Bezoutians with applications to sub-resultant chain computation, Linear Algebra Appl. 284 (1998) 19–39.
[4] D.A. Bini, V. Pan, Polynomial and Matrix Computations, Vol. 1: Fundamental Algorithms, Birkhäuser, Boston, 1994.
[5] W.S. Brown, J.F. Traub, On Euclid's algorithm and the theory of subresultants, J. Assoc. Comput. Mach. 18 (1971) 505–514.
[6] T.F. Chan, P.C. Hansen, A look-ahead Levinson algorithm for indefinite Toeplitz systems, SIAM J. Matrix Anal. Appl. 13 (2) (1992) 490–506.
[7] G.E. Collins, Subresultants and reduced polynomial remainder sequences, J. Assoc. Comput. Mach. 14 (1967) 128–142.
[8] R.M. Corless, P.M. Gianni, B.M. Trager, S.M. Watt, The singular value decomposition for polynomial systems, in: Proc. ACM Internat. Symp. Symbolic and Algebraic Computation, Montreal, Quebec, Canada, 1995, pp. 195–207.
[9] I.Z. Emiris, A. Galligo, H. Lombardi, Certified approximate univariate GCDs, J. Pure Appl. Algebra 117/118 (1997) 229–251.
[10] G. Farin, Curves and Surfaces for Computer-Aided Geometric Design: A Practical Guide, 4th Edition, Computer Science and Scientific Computing, Academic Press, San Diego, CA, 1997 (Chapter 1 by P. Bézier; Chapters 11 and 22 by W. Boehm).
[11] G.E. Farin, B. Hamann, Current trends in geometric modeling and selected computational applications, J. Comput. Phys. 138 (1) (1997) 1–15.
[12] G.E. Farin, D. Hansford, The Geometry Toolbox for Graphics and Modeling, A K Peters Ltd., Natick, MA, 1998.
[13] R.T. Farouki, V.T. Rajan, On the numerical condition of polynomials in Bernstein form, Comput. Aided Geom. Design 5 (1988) 1–26.
[14] A.R. Forrest, Interactive interpolation and approximation by Bézier polynomials, Comput. J. 15 (1972) 71–79.
[15] R.W. Freund, H.Y. Zha, A look-ahead algorithm for the solution of general Hankel systems, Numer. Math. 64 (3) (1993) 295–321.
[16] L. Gemignani, Fast and stable computation of the barycentric representation of rational interpolants, Calcolo 33 (3–4) (1998) 371–388 (Toeplitz matrices: structures, algorithms and applications, Cortona, 1996).
[17] L. Gemignani, Schur complements of Bezoutians and the inversion of block Hankel and block Toeplitz matrices, Linear Algebra Appl. 253 (1997) 39–59.
[18] I. Gohberg, V. Olshevsky, Fast inversion of Chebyshev–Vandermonde matrices, Numer. Math. 67 (1) (1994) 71–92.
[19] T. Kailath, V. Olshevsky, Displacement-structure approach to polynomial Vandermonde and related matrices, Linear Algebra Appl. 261 (1997) 49–90.
[20] T. Kailath, A.H. Sayed, Displacement structure: theory and applications, SIAM Rev. 37 (3) (1995) 297–386.
[21] T. Kailath, A.H. Sayed (Eds.), Fast Reliable Algorithms for Matrices with Structure, SIAM, Philadelphia, PA, 1999.
[22] V.Y. Pan, Computation of approximate polynomial GCDs and an extension, Inform. and Comput. 167 (2) (2001) 71–85.
[23] K. Rost, Generalized companion matrices and matrix representations for generalized Bezoutians, Linear Algebra Appl. 193 (1993) 151–172.
[24] J.R. Winkler, A resultant matrix for scaled Bernstein polynomials, Linear Algebra Appl. 319 (1–3) (2000) 179–191.
[25] J.R. Winkler, A comparison of the average case numerical condition of the power and Bernstein polynomial bases, Internat. J. Comput. Math. 77 (4) (2001) 583–602.
[26] J.R. Winkler, A companion matrix resultant for Bernstein polynomials, Linear Algebra Appl. 362 (2003) 153–175.
[27] H.J. Wolters, G.E. Farin, Geometric curve approximation, Comput. Aided Geom. Design 14 (6) (1997) 499–513.
Theoretical Computer Science 315 (2004) 335 – 369
www.elsevier.com/locate/tcs
Polynomial equation solving by lifting procedures for ramified fibers

A. Bompadre^a,1, G. Matera^{b,c,*}, R. Wachenchauzer^d, A. Waissbein^{a,2}

^a Departamento de Matemáticas, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Ciudad Universitaria, Pabellón I (1428) Buenos Aires, Argentina
^b Instituto de Desarrollo Humano, Universidad Nacional de General Sarmiento, Campus Universitario, José M. Gutiérrez 1150 (1613), Los Polvorines, Pcia. de Buenos Aires, Argentina
^c Member of the CONICET, Argentina
^d Departamento de Computación, Facultad de Ingeniería, Universidad de Buenos Aires, Av. Paseo Colón 850 (1063) Buenos Aires, Argentina
Abstract

Let there be given a parametric polynomial equation system which represents a generically unramified family of zero-dimensional algebraic varieties. We exhibit an efficient algorithm which computes a complete description of the solution set of an arbitrary parameter instance from a complete description of the infinitesimal structure of a particular ramified parameter instance of our family. This generalizes, in the case of space curves, previous methods of Heintz et al. and Schost, which require the given parameter instance to be unramified. We illustrate our method by solving particular polynomial equation systems using deformation techniques.
© 2004 Elsevier B.V. All rights reserved.

Keywords: Efficient polynomial equation solving; Ramified fibers of dominant mappings; Puiseux expansions of space curves; Newton–Hensel lifting
* Corresponding author.
E-mail addresses: [email protected] (A. Bompadre), [email protected] (G. Matera), [email protected] (R. Wachenchauzer), [email protected] (A. Waissbein).
1 Present address: MIT Operations Research Center, 77 Massachusetts Avenue, Building E40-130, Cambridge, MA 02139, USA.
2 Research was partially supported by the following Argentinian and German grants: UBACyT X198, PIP CONICET 2461, BMBF-SETCIP AL/PA/01-EIII/02, UNGS 30/3005 and beca de posgrado interno CONICET. Some of the results presented here were first announced at the Workshop Argentino de Informática Teórica, WAIT'01, held in September 2001 (see [12]).
0304-3975/$ - see front matter © 2004 Elsevier B.V. All rights reserved.
doi:10.1016/j.tcs.2004.01.015
A. Bompadre et al. / Theoretical Computer Science 315 (2004) 335 – 369
1. Introduction

Algorithmic multivariate polynomial system solving is a central theme of computational algebraic geometry, which arises in connection with numerous scientific and technical problems (see e.g., [21,62]). In order to solve polynomial equation systems, several symbolic and numeric algorithms have been proposed. Unfortunately, typical symbolic elimination methods based on rewriting techniques (see e.g., [19,20]) have superexponential complexity, which makes them infeasible for realistically sized problems. On the other hand, in the case of typical numeric (iterative) techniques (see e.g., [55]), it is not easy to obtain good initial guesses for the solutions of the system under consideration. In order to circumvent these difficulties, different attempts were made, from the symbolic and the numeric points of view, to solve polynomial equation systems by means of deformation techniques, based on a perturbation of the original system and a subsequent path-following method (see e.g., [1,6,10,18,29,42]). A common drawback of these methods is the fact that they typically introduce spurious solutions which may be computationally expensive to identify and eliminate in order to obtain the actual solutions.

In the series of papers [28,30–32,52], a new symbolic elimination algorithm was introduced. This algorithm is based on a flat deformation of a certain morphism of affine varieties, which was isolated and refined in [39] (see also [58]). More precisely, let V be a Q-definable equidimensional affine variety of dimension m, and let there be given a generically unramified, finite morphism $\pi : V \to \mathbb{C}^m$. Then, given a complete description of a particular unramified fiber $\pi^{-1}(y_0)$, [39] exhibits an algorithm which computes a complete description of an arbitrary fiber $\pi^{-1}(y)$ using a global version of the Newton–Hensel lifting. This deformation technique may be used in order to solve particular polynomial equation systems.

A typical application of this method is the following (see e.g., [38,39,53]): suppose that we are given a sparse polynomial equation system which defines a zero-dimensional affine variety. Suppose further that a suitable replacement of some of the coefficients of the original polynomials by indeterminates gives a generically unramified family of zero-dimensional affine varieties, with underlying finite morphism. Then, if there exists a particular unramified fiber which is easy to solve, it is possible to solve the original system by using the algorithm of [39].

Our main objective here is to extend the "catalogue" of polynomial equation systems which may be treated using this deformation technique. For this purpose, we are going to exhibit an algorithm which, given a generically unramified family of zero-dimensional affine varieties, represented by a dominant (not necessarily finite) morphism $\pi : V \to \mathbb{C}^m$, and the infinitesimal structure of a particular (eventually ramified) fiber $\pi^{-1}(y_0)$, computes a complete description of any fiber $\pi^{-1}(y)$. In view of the main outcome of the articles [33,40], namely the conclusion that the elimination techniques of [28,30] can be efficiently reduced to the case of algebraic curves (i.e., affine equidimensional algebraic subvarieties of dimension 1 of $\mathbb{C}^{n+1}$), in this article we shall limit ourselves to this case.

Let $V \subset \mathbb{C}^{n+1}$ be a Q-definable algebraic space curve, and let us assume that the morphism $\pi : V \to \mathbb{C}$ induced by the canonical projection onto the first coordinate is
dominant and generically unramified. Let $\pi^{-1}(\lambda_0)$ be a finite and ramified fiber. Suppose further that we are given the infinitesimal structure of $\pi^{-1}(\lambda_0)$, i.e., the set of singular parts of the Puiseux expansions of the branches of V lying above $\lambda_0$ (see Section 2.2). Then we exhibit an algorithm which computes a complete description of an arbitrary fiber $\pi^{-1}(\lambda)$ (see Section 4).

Our algorithmic method is essentially based on a new variant of the global Newton–Hensel procedure of [28,30] which is described in Section 3. Its time–space complexity is roughly $O(\delta D^\rho)$, where $\delta$ is the degree of V, D is the degree of $\pi$, and $\rho = 1$ in several important cases. Thus our algorithm extends and improves the procedures in [39,58]. Furthermore, our algorithm treats all the branches of V lying above $\lambda_0$ separately, improving thus the refinements of [39, Section 3]. Finally, in Section 5 we illustrate our method on a few examples where the deformation technique of [39] cannot be applied. We solve Pham–Brieskorn systems, examples provided by discretization problems of partial differential equations, and generalized Reimer systems.

2. Preliminaries

In this section, we fix the notation and terminology used throughout this paper. In Section 2.1, we introduce the terminology about projections and the notion of geometric solution of an affine variety. In Section 2.2, we introduce terminology about space curves, extending the usual terminology of Puiseux expansions of plane curves (see e.g., [64]) and rational Puiseux expansions (see e.g., [23,65]). Finally, in Section 2.3 we fix our computational model.

2.1. Geometric solutions

We use standard notions and notations of commutative algebra and algebraic geometry, which can be found in e.g., [24,45,48,60]. For a given algebraically closed field k and $m \in \mathbb{N}$, we denote by $\mathbb{A}^m(k)$ the m-dimensional affine space $k^m$ equipped with its Zariski topology over k. In particular, we shall use the notation $\mathbb{A}^m := \mathbb{A}^m(\mathbb{C})$. Let us fix $n \in \mathbb{N}$. Points in $\mathbb{A}^{n+1}$ shall be denoted either by $(\lambda, x)$, with $\lambda \in \mathbb{C}$ and $x \in \mathbb{C}^n$, or by $(\lambda, x_1, \ldots, x_n)$ with $\lambda, x_1, \ldots, x_n \in \mathbb{C}$. Let $E, X_1, \ldots, X_n$ be indeterminates over Q, let $X := (X_1, \ldots, X_n)$, and let $\mathbb{Q}[E, X] := \mathbb{Q}[E, X_1, \ldots, X_n]$ be the ring of polynomials in the variables E, X with coefficients in Q. Let $F_1, \ldots, F_n$ be polynomials in $\mathbb{Q}[E, X]$ which form a regular sequence of $\mathbb{Q}[E, X]$ and generate a radical ideal in $\mathbb{Q}[E, X]$. Then
$$V := \{(\lambda, x) \in \mathbb{A}^{n+1} : F_1(\lambda, x) = 0, \ldots, F_n(\lambda, x) = 0\}$$
is an equidimensional affine variety of dimension $\dim V = 1$. The coordinate ring $\mathbb{Q}[V]$ and the ring of rational functions $\mathbb{Q}(V)$ of V are defined as the quotient ring $\mathbb{Q}[E, X]/(F_1, \ldots, F_n)$ and its total ring of fractions, respectively.

Let $\pi : V \to \mathbb{A}^1$ be the morphism induced by the restriction to V of the canonical projection onto the first coordinate, $\pi(\lambda, x) := \lambda$. Let $V = C_1 \cup \cdots \cup C_s$ be the decomposition
of V into irreducible components. Suppose that |Ci is dominant for 16i6s. We de(ne s the degree of as the number D := i=1 [Q(Ci ) : Q(E)], where [Q(Ci ) : Q(E)] denotes the degree of the ( (nite) (eld extension Q(E) ,→ Q(Ci ) for 16i6s. We assume that is generically unrami?ed, i.e., the (ber −1 () consists of exactly D points for a generic value ∈ A1 . This implies that the Jacobian determinant JF := det(@Fi =@Xj )16i; j6n is not a zero divisor in Q[V ]. Let U be a nonzero linear form of Q[X ] and let u be the element of Q[V ] induced by U . Let u : V → A2 be the morphism de(ned by u (; x) := (; u(x)). By a standard argument we conclude that the Zariski closure u (V ) of the image of V under u is a Q-de(nable hypersurface of A2 . Let Z be an indeterminate over Q. Then there exists a unique (up to scaling by nonzero elements of Q) minimal equation Mu ∈ Q[E; Z] de(ning u (V ). From the BQezout inequality (see e.g., [26,36]) we deduce the estimate deg Mu 6deg V . On the other hand, we have the estimate deg Z Mu 6D. Let mu ∈ Q(E)[Z] denote the (unique) monic multiple of Mu with deg Z mu = degZ Mu . We call mu the projection polynomial of u in V . We de(ne the Projection Problem as follows: given F1 ; : : : ; Fn and the linear form U ∈ Q[X ], (nd the projection polynomial mu . It is well-known that there exists a nonempty Zariski open set U ⊂ An such that for any linear form U := 1 X1 + · · · + n Xn with (1 ; : : : ; n ) ∈ U we have deg Z mu = D. Any linear form satisfying this condition is called generic. Let us observe that for any generic linear form U ∈ Q[X ], the induced coordinate function u is a primitive element of the Q-algebra extension Q(E) ,→ Q(V ), whose minimal polynomial over Q(E) equals mu . Let U ∈ Q[X ] be a generic linear form. Using a suitable variant of the so-called Shape Lemma (see e.g., [33,56]), the computation of the projection mu can be easily extended to a symbolic solution of V in the following sense (see e.g., [28,30,33]). 
A geometric solution of the affine variety V consists of:
• a generic linear form U ∈ Q[X],
• the projection polynomial m_u ∈ Q(E)[Z],
• elements v_1, ..., v_n of Q(E)[Z] such that (∂m_u/∂Z)(u) X_i = v_i(u) holds in Q(E) ⊗ Q[V] and deg_Z v_i < D holds for 1 ≤ i ≤ n.
This notion of geometric solution has a long history, going back at least to [44] (see also [47,67]). One might consider [18,27] as early references where this notion was implicitly used in modern symbolic computation.

2.2. Space curves

We maintain the notations and assumptions introduced in Section 2.1. Let T be an indeterminate over Q. A parameterization of the curve V is a nonconstant vector (Ẽ, X̃) of elements of the field of Laurent series Q̄((T)), with X̃ := (X̃_1, ..., X̃_n) ∈ Q̄((T))^n, such that F_1(Ẽ, X̃) = 0, ..., F_n(Ẽ, X̃) = 0 holds in Q̄((T)). A parameterization (Ẽ, X̃) is called irreducible if there does not exist an integer k > 1 for which (Ẽ, X̃) ∈ Q̄((T^k))^{n+1} holds. The coefficient field of a parameterization (Ẽ, X̃) of V is the field extension of Q generated by the coefficients of the series Ẽ, X̃_1, ..., X̃_n.
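As a toy illustration of these notions (our own example, not taken from the paper), consider the plane curve V = {(ε, x) : x^2 − ε = 0}. Here D = 2, the projection π is ramified exactly over ε = 0, and with U = X the geometric-solution data are m_u(Z) = Z^2 − E and v_1(Z) = 2E, since (∂m_u/∂Z)(u)·X = 2X^2 = 2E on V. A short numerical check:

```python
# Toy check (our own example, not the paper's): for V = {x^2 - eps = 0}
# and U = X, the geometric solution is m_u(Z) = Z^2 - eps, v_1(Z) = 2*eps.
import cmath

def fiber(eps, tol=1e-12):
    """Distinct points of pi^{-1}(eps) for the curve x^2 = eps."""
    r = cmath.sqrt(eps)
    out = []
    for p in (r, -r):
        if all(abs(p - q) > tol for q in out):
            out.append(p)
    return out

assert len(fiber(1.0)) == 2   # generic fiber: D = 2 points
assert len(fiber(0.0)) == 1   # the fiber over 0 is ramified

# the defining relation (dm_u/dZ)(u)*X_i = v_i(u), checked on sample points of V
for x in (0.5, 1.0, 2.0, -1.5):
    eps = x * x                       # (eps, x) lies on V
    dmu_at_u = 2 * x                  # (dm_u/dZ)(u) with u = X = x
    v1_at_u = 2 * eps                 # v_1(u) = 2*eps
    assert abs(dmu_at_u * x - v1_at_u) < 1e-12
```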
We define the order o_T(φ) of φ ∈ Q̄((T)) as the least power of T appearing with a nonzero coefficient in φ. Two parameterizations (Ẽ, X̃) and (Ẽ', X̃') are called equivalent if there exists a power series φ of order 1 such that Ẽ(T) = Ẽ'(φ(T)) and X̃(T) = X̃'(φ(T)) hold.

Let x^{(1)}, ..., x^{(D)} denote the Puiseux series solutions of F_1 = 0, ..., F_n = 0 lying above E = 0. For 1 ≤ ℓ ≤ D, let us write x^{(ℓ)} := (x_1^{(ℓ)}, ..., x_n^{(ℓ)}) and

    x_i^{(ℓ)} := Σ_{m ≥ m_ℓ} a_{i,m}^{(ℓ)} E^{m/e_ℓ}   (1 ≤ i ≤ n),        (1)

with e_ℓ ∈ N, m_ℓ ∈ Z and a_{i,m}^{(ℓ)} ∈ Q̄. Without loss of generality we may assume for 1 ≤ ℓ ≤ D that e_ℓ has no common factors with the greatest common divisor of the set of m's for which a_{i,m}^{(ℓ)} ≠ 0 holds. The number e_ℓ is called the ramification index of the series x^{(ℓ)}. Let us remark that for 1 ≤ ℓ ≤ D the coefficient field generated by all the coordinates of x^{(ℓ)} is a finite extension of Q (see e.g., [23]). Its degree f_ℓ is called the residual degree of x^{(ℓ)}. Following [23] (see also [65]), a set of non-equivalent parameterizations
    {(Ẽ^{(1)}, X̃^{(1)}), ..., (Ẽ^{(ĝ)}, X̃^{(ĝ)})} ⊂ Q̄((T))^{n+1}        (2)

containing a complete set of representatives of the branches of V lying above 0 is called a system of rational Puiseux expansions (of the branches of V lying above 0) if it is invariant under the action of the Galois group of the field extension Q̄/Q and Ẽ^{(ℓ)} = γ_ℓ T^{e_ℓ}, with e_ℓ ∈ N and γ_ℓ ∈ Q̄\{0} for 1 ≤ ℓ ≤ ĝ. Let g be the number of orbits defined on the set (2) under the action of the Galois group of Q̄/Q and suppose that we have chosen the numbering in (2) such that the first g elements represent different orbits.
Let us observe that from a given system of rational Puiseux expansions we may easily obtain the system of classical Puiseux expansions of the branches of V lying above 0, i.e., the complete set of solutions of (1). Indeed, let

    (Ẽ^{(ℓ)}, X̃^{(ℓ)}) := (γ_ℓ T^{e_ℓ}, (Σ_{m ≥ m_ℓ} a_{1,m}^{(ℓ)} T^m, ..., Σ_{m ≥ m_ℓ} a_{n,m}^{(ℓ)} T^m)),   1 ≤ ℓ ≤ g,        (3)

with γ_ℓ ∈ Q̄, be a system of rational Puiseux expansions of V, and let ζ_ℓ denote a primitive e_ℓth root of 1 and γ_ℓ^{-1/e_ℓ} an e_ℓth root of γ_ℓ^{-1} for 1 ≤ ℓ ≤ g. Then the classical Puiseux expansions of the branches of V lying above 0 are given by

    {X̃^{(ℓ)}(ζ_ℓ^j γ_ℓ^{-1/e_ℓ} E^{1/e_ℓ}) : 1 ≤ ℓ ≤ g, 1 ≤ j ≤ e_ℓ}.

Observe that the ramification index of the expansion X̃^{(ℓ)}(ζ_ℓ^j γ_ℓ^{-1/e_ℓ} E^{1/e_ℓ}) is e_ℓ. Let R denote the least integer such that the partial expansion vectors Σ_{m=m_ℓ}^R a_m^{(ℓ)} T^m := Σ_{m=m_ℓ}^R (a_{1,m}^{(ℓ)}, ..., a_{n,m}^{(ℓ)}) T^m are pairwise distinct for 1 ≤ ℓ ≤ D. Let us remark that a combination of [58, Proposition 1] and [23, Lemma 2] yields the estimate R − m_ℓ ≤ 2(e_ℓ f_ℓ)^2. The integer R is called the regularity index of system (3). For 1 ≤ ℓ ≤ g, the partial expansion Σ_{m=m_ℓ}^R a_m^{(ℓ)} T^m is called the singular part of X̃^{(ℓ)}.
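For instance (a toy computation of ours, not an example from the paper), the curve F(E, X) = X^2 − E(1+E) = 0 has a single rational Puiseux expansion above 0, namely Ẽ = T^2 and X̃ = T(1+T^2)^{1/2}, so e = 2 and f = 1. The sketch below builds the truncated expansion with exact rational coefficients and verifies that it annihilates F:

```python
# Toy sketch (our own example): truncated rational Puiseux expansion of
# the branch of F(E, X) = X^2 - E*(1+E) above E = 0, using E~ = T^2
# (ramification index e = 2) and X~ = T*sqrt(1+T^2) via the binomial series.
from fractions import Fraction

N = 12  # working precision: all computations modulo T^N

def mul(a, b):
    """Product of truncated power series given as coefficient lists mod T^N."""
    c = [Fraction(0)] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < N:
                c[i + j] += ai * bj
    return c

# sqrt(1 + T^2) = sum_k binom(1/2, k) * T^(2k)
s = [Fraction(0)] * N
coef = Fraction(1)
for k in range(N // 2):
    s[2 * k] = coef
    coef *= (Fraction(1, 2) - k) / (k + 1)

E = [Fraction(0)] * N
E[2] = Fraction(1)                                   # E~ = T^2
t = [Fraction(0), Fraction(1)] + [Fraction(0)] * (N - 2)
X = mul(t, s)                                        # X~ = T*sqrt(1+T^2)

one_plus_E = E[:]
one_plus_E[0] += 1
F = [a - b for a, b in zip(mul(X, X), mul(E, one_plus_E))]
assert all(c == 0 for c in F)                        # F(E~, X~) = O(T^12)
assert X[1] == 1 and X[3] == Fraction(1, 2)          # X~ = T + T^3/2 - ...
```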
2.3. Computational model

Our model of computation is based on the concept of arithmetic-boolean circuits (also called arithmetic networks) and computation trees (see e.g., [63] or [15]). An arithmetic-boolean circuit over Q[E, X] is a directed acyclic graph (dag for short) whose nodes are labeled either by an element of Q ∪ {E, X_1, ..., X_n}, or by an arithmetic operation, or by a selection (pointing to other nodes) subject to a previous equal-to-zero decision. On the dag associated to a given arithmetic-boolean circuit β we may play a pebble game (see [14,57]). A pebble game is a strategy of evaluation of β which converts β into a sequential algorithm (called a computation tree) and associates to β natural time and space measures. Space is defined as the maximum number of arithmetic registers used at any moment of the game, and time is defined as the total number of arithmetic operations and selections performed during the game. A computation tree without selections is called a straight-line program (see e.g., [15,37,61]). In the sequel, we shall tacitly assume that our arithmetic-boolean circuits and computation trees in Q[E, X] contain only nonessential divisions, i.e., only divisions by nonzero elements of Q.

3. Lifting procedures for ramified fibers

With notations and assumptions as in Section 2.1, let {(Ẽ^{(ℓ)}, X̃^{(ℓ)}) : 1 ≤ ℓ ≤ g} be a set of parameterizations which induces a system of rational Puiseux expansions of the branches of V lying above 0 by the action of the Galois group of Q̄/Q. For 1 ≤ ℓ ≤ g,
let e_ℓ, f_ℓ ∈ N denote the ramification index and the residual degree of the Puiseux expansions associated to the parameterization

    (Ẽ^{(ℓ)}, X̃^{(ℓ)}) := (γ_ℓ T^{e_ℓ}, Σ_{m ≥ m_ℓ} a_m^{(ℓ)} T^m),        (4)

with a_m^{(ℓ)} ∈ Q̄^n for 1 ≤ ℓ ≤ g, m ≥ m_ℓ. We have Σ_{ℓ=1}^g e_ℓ f_ℓ = D [23]. Let R ∈ Z be the regularity index of the system of rational Puiseux expansions (4). Let us recall the estimate R − m_ℓ ≤ 2(e_ℓ f_ℓ)^2 on the size of the singular parts of the parameterizations in (4) (see Section 2.2).

Let T, Y_1, ..., Y_n be indeterminates over Q̄ and write Y := (Y_1, ..., Y_n). Let K^{(ℓ)} := Q({γ_ℓ, a_{m,1}^{(ℓ)}, ..., a_{m,n}^{(ℓ)} : m ≥ m_ℓ}) be the coefficient field of the parameterization (Ẽ^{(ℓ)}, X̃^{(ℓ)}). Denote by σ_1^{(ℓ)}, ..., σ_{f_ℓ}^{(ℓ)} the morphisms of the Galois group of the field extension Q ↪ K^{(ℓ)}. For any (ℓ, j, k) ∈ N^3 with 1 ≤ ℓ ≤ g, 1 ≤ j ≤ n and 1 ≤ k ≤ f_ℓ, let us define G_j^{(ℓ,k)} ∈ Q̄[T, Y] by

    G_j^{(ℓ,k)} := T^{α_{j,ℓ}} F_j(σ_k^{(ℓ)}(γ_ℓ) T^{e_ℓ}, Σ_{m=m_ℓ}^{R−1} σ_k^{(ℓ)}(a_m^{(ℓ)}) T^m + Y T^R),        (5)

where α_{j,ℓ} ∈ Z is chosen such that the order of T in G_j^{(ℓ,k)} equals zero.

Our algorithmic methods are based on a deformation technique which allows us to compute an arbitrary fiber of the morphism π : V → A^1 by "lifting" the fiber π^{-1}(0). In order to perform this process of lifting, we would like to use a global Newton–Hensel procedure as in [28,30] (see also [39,58]). Unfortunately, this is no longer possible, because the essential hypothesis of unramifiedness of the fiber π^{-1}(0) is missing. In order to circumvent this difficulty, one might try to proceed as in the plane curve case and consider the ideal I^{(ℓ,k)} of Q̄[T, Y] generated by the polynomials G_1^{(ℓ,k)}, ..., G_n^{(ℓ,k)} for 1 ≤ ℓ ≤ g and 1 ≤ k ≤ f_ℓ. Let V^{(ℓ,k)} be the affine subvariety of A^{n+1} defined by I^{(ℓ,k)}, and let π^{(ℓ,k)} : V^{(ℓ,k)} → A^1 be the morphism defined by π^{(ℓ,k)}(t, x) := t. Unlike the plane curve case, G_1^{(ℓ,k)}, ..., G_n^{(ℓ,k)} are not necessarily smooth at T = 0, unless a suitable flatness condition is satisfied (compare [3,4,6]). In Section 3.2, we exhibit a flatness condition which assures that the points of the fiber (π^{(ℓ,k)})^{-1}(0) are (G_1^{(ℓ,k)}, ..., G_n^{(ℓ,k)})-smooth. Then in Section 3.3 we describe a variant of the global Newton–Hensel procedure of [28,30] specifically adapted to our situation. Finally, in Section 3.4 we show that this flatness condition is also necessary to assure smoothness.

Let us observe that the main results of this section, namely Theorems 5 and 7 below, depend only on the infinitesimal structure of the fiber π^{-1}(0), and hence can be (slightly) generalized to the case where F_1, ..., F_n form a regular sequence of Q[E]_{(E)}[X] and generate a radical ideal of Q[E]_{(E)}[X]. Nevertheless, for the sake of clarity we are not going to prove this generalization.

3.1. Properties of the ideal I^{(ℓ,k)}

Let us fix integers ℓ, k with 1 ≤ ℓ ≤ g and 1 ≤ k ≤ f_ℓ. In order to exhibit our flatness condition we first need to establish some properties of the ideal I^{(ℓ,k)}.
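To make the deformation concrete, here is a toy instance of ours (not the paper's running example): for F(E, X) = X^2 − E with the single branch Ẽ = T^2, X̃ = T (so m_ℓ = 1 and R = 1), the truncated sum in (5) is empty, the substitution is X ↦ Y·T, and G(T, Y) = T^{-2} F(T^2, YT) = Y^2 − 1. The ramified fiber π^{-1}(0) thereby splits into two smooth points Y = ±1:

```python
# Toy instance (ours): G(T, Y) = F(T^2, Y*T)/T^2 = Y^2 - 1 for
# F(E, X) = X^2 - E. At T = 0 the fiber consists of two points,
# both smooth since dG/dY = 2Y does not vanish there.
def G(y):
    return y * y - 1.0

def dG_dY(y):
    return 2.0 * y

# G really is the normalized deformation of F along the branch
for t in (0.5, 2.0):
    for y in (-1.0, 1.0, 0.3):
        F_val = (y * t) ** 2 - t * t        # F(T^2, Y*T) at (t, y)
        assert abs(F_val / (t * t) - G(y)) < 1e-12

fiber0 = [y for y in (-1.0, 1.0) if abs(G(y)) < 1e-12]
assert fiber0 == [-1.0, 1.0]                # two points over T = 0
assert all(abs(dG_dY(y)) > 0.0 for y in fiber0)
```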
Let I^{(ℓ,k)} Q̄((T))^*[Y] denote the (extended) ideal generated by I^{(ℓ,k)} in Q̄((T))^*[Y]. In order to describe the zero set of I^{(ℓ,k)} Q̄((T))^*[Y] in A^n(Q̄((T))^*), for any pair (ℓ, k) let L_{ℓ,k} be the set of pairs (ℓ', k') for which there exists a vector of Puiseux series associated to the (ℓ', k')th parameterization which agrees up to order R with one associated to the (ℓ, k)th parameterization, i.e.,

    L_{ℓ,k} := {(ℓ', k') : e_{ℓ'} = e_ℓ, m_{ℓ'} = m_ℓ, and there exist roots γ_ℓ^{-1/e_ℓ}, γ_{ℓ'}^{-1/e_{ℓ'}} with
        Σ_{m=m_ℓ}^{R−1} σ_k^{(ℓ)}(a_m^{(ℓ)}) σ_k^{(ℓ)}(γ_ℓ^{-1/e_ℓ})^m T^m = Σ_{m=m_{ℓ'}}^{R−1} σ_{k'}^{(ℓ')}(a_m^{(ℓ')}) σ_{k'}^{(ℓ')}(γ_{ℓ'}^{-1/e_{ℓ'}})^m T^m}.        (6)

The sets L_{ℓ,k} form a partition of the set of pairs ∪_{1 ≤ ℓ ≤ g} {ℓ} × {1, ..., f_ℓ}.
Lemma 1. The extended ideal I^{(ℓ,k)} Q̄((T))^*[Y] defines a zero-dimensional subvariety Ṽ^{(ℓ,k)} of A^n(Q̄((T))^*). Furthermore, we have

    Ṽ^{(ℓ,k)} ∩ Q̄[[T]]^n = {Σ_{m ≥ R} σ_{k'}^{(ℓ')}(a_m^{(ℓ')}) σ_{k'}^{(ℓ')}(γ_{ℓ'}^{-1/e_{ℓ'}})^m σ_k^{(ℓ)}(γ_ℓ^{1/e_ℓ})^m T^{m−R} : (ℓ', k') ∈ L_{ℓ,k}}.        (7)

Proof. From the definition of G_1^{(ℓ,k)}, ..., G_n^{(ℓ,k)} and the parameterization (Ẽ^{(ℓ)}, X̃^{(ℓ)}), it follows that the vector of power series Σ_{m ≥ R} σ_k^{(ℓ)}(a_m^{(ℓ)}) T^{m−R} is a point of Ṽ^{(ℓ,k)} ⊂ A^n(Q̄((T))^*). On the other hand, we observe that any point of Ṽ^{(ℓ,k)} induces univocally a finite set of points x̄ ∈ A^n(Q̄((E))^*) such that F_j(E, x̄) = 0 holds in Q̄((E))^* for 1 ≤ j ≤ n. Since {x ∈ A^n(Q̄((E))^*) : F_1(x) = 0, ..., F_n(x) = 0} has dimension zero (see Section 2.2), it follows that Ṽ^{(ℓ,k)} must also have dimension zero.

Now we show identity (7). Let V̂^{(ℓ,k)} be the right-hand side of identity (7):

    V̂^{(ℓ,k)} := {Σ_{m ≥ R} σ_{k'}^{(ℓ')}(a_m^{(ℓ')}) σ_{k'}^{(ℓ')}(γ_{ℓ'}^{-1/e_{ℓ'}})^m σ_k^{(ℓ)}(γ_ℓ^{1/e_ℓ})^m T^{m−R} : (ℓ', k') ∈ L_{ℓ,k}}.

It is easy to see that V̂^{(ℓ,k)} ⊂ Ṽ^{(ℓ,k)} holds. On the other hand, we observe that any point Σ_{m ≥ R} b_{m−R} T^{m−R} of Ṽ^{(ℓ,k)} ∩ Q̄[[T]]^n induces a unique parameterization

    (σ_k^{(ℓ)}(γ_ℓ) T^{e_ℓ}, Σ_{m=m_ℓ}^{R−1} σ_k^{(ℓ)}(a_m^{(ℓ)}) T^m + Σ_{m ≥ R} b_{m−R} T^m)

of a branch of V lying above 0, and hence a vector of Puiseux series

    x̄ := Σ_{m=m_ℓ}^{R−1} σ_k^{(ℓ)}(a_m^{(ℓ)}) σ_k^{(ℓ)}(γ_ℓ^{-1/e_ℓ})^m E^{m/e_ℓ} + Σ_{m ≥ R} b_{m−R} σ_k^{(ℓ)}(γ_ℓ^{-1/e_ℓ})^m E^{m/e_ℓ}

satisfying F_j(E, x̄) = 0 for j = 1, ..., n. Then there exists (ℓ_0, k_0) such that x̄ = Σ_{m ≥ m_ℓ} σ_{k_0}^{(ℓ_0)}(a_m^{(ℓ_0)}) σ_{k_0}^{(ℓ_0)}(γ_{ℓ_0}^{-1/e_{ℓ_0}})^m E^{m/e_{ℓ_0}} holds. This shows that (ℓ_0, k_0) belongs to L_{ℓ,k}
and φ = (σ_k^{(ℓ)}(γ_ℓ) T^{e_ℓ}, Σ_{m ≥ m_ℓ} σ_{k_0}^{(ℓ_0)}(a_m^{(ℓ_0)}) σ_{k_0}^{(ℓ_0)}(γ_{ℓ_0}^{-1/e_{ℓ_0}})^m σ_k^{(ℓ)}(γ_ℓ^{1/e_ℓ})^m T^m) holds. Then Σ_{m ≥ R} b_{m−R} T^{m−R} = Σ_{m ≥ R} σ_{k_0}^{(ℓ_0)}(a_m^{(ℓ_0)}) σ_{k_0}^{(ℓ_0)}(γ_{ℓ_0}^{-1/e_{ℓ_0}})^m σ_k^{(ℓ)}(γ_ℓ^{1/e_ℓ})^m T^{m−R}, which shows identity (7).

Let us observe that G_1^{(ℓ,k)}, ..., G_n^{(ℓ,k)} are obtained from F_1, ..., F_n by applying the mapping Ψ_R^{(ℓ,k)} : Q̄[E, X] → Q̄[T, Y] defined by

    Ψ_R^{(ℓ,k)}(F(E, X)) := T^{α_F} F(σ_k^{(ℓ)}(γ_ℓ) T^{e_ℓ}, Σ_{m=m_ℓ}^{R−1} σ_k^{(ℓ)}(a_m^{(ℓ)}) T^m + Y T^R),

where α_F ∈ Z is chosen such that the order of T in Ψ_R^{(ℓ,k)}(F) is zero. In order to "invert" the mapping Ψ_R^{(ℓ,k)}, up to a power of E, we introduce the following morphism Φ_R^{(ℓ,k)} : Q̄(T)[Y] → Q̄(E)[X] of Q̄-algebras:

    Φ_R^{(ℓ,k)}(F(T, Y)) := F(E, E^{−R}(X − Σ_{m=m_ℓ}^{R−1} σ_k^{(ℓ)}(a_m^{(ℓ)}) E^m)).
We have E^{α_F} Φ_R^{(ℓ,k)}(Ψ_R^{(ℓ,k)}(F)) = F(σ_k^{(ℓ)}(γ_ℓ) E^{e_ℓ}, X) for any F ∈ Q̄[E, X].

Lemma 2. G_1^{(ℓ,k)}, ..., G_n^{(ℓ,k)} form a regular sequence of Q̄[T, Y].

Proof. Arguing by contradiction, assume that G_1^{(ℓ,k)}, ..., G_n^{(ℓ,k)} do not form a regular sequence. Then there exists j ≥ 2 such that G_j^{(ℓ,k)} is a zero divisor of Q̄[T, Y]/I^{(ℓ,k)}, i.e., there exist H̃, P̃_1, ..., P̃_{j−1} ∈ Q̄[T, Y] such that

    H̃ Ψ_R^{(ℓ,k)}(F_j) = H̃ G_j^{(ℓ,k)} = Σ_{i=1}^{j−1} P̃_i G_i^{(ℓ,k)} = Σ_{i=1}^{j−1} P̃_i Ψ_R^{(ℓ,k)}(F_i)        (8)

holds in Q̄[T, Y]. Applying the morphism Φ_R^{(ℓ,k)} to the left- and right-hand side members of identity (8) and multiplying by a suitable power of E, we deduce that there exist H, P_1, ..., P_{j−1} ∈ Q̄[E, X] such that

    H F_j(σ_k^{(ℓ)}(γ_ℓ) E^{e_ℓ}, X) = Σ_{i=1}^{j−1} P_i F_i(σ_k^{(ℓ)}(γ_ℓ) E^{e_ℓ}, X)        (9)
holds in Q̄[E, X]. Identity (9) may be rewritten in the following way:

    Σ_{h=0}^{e_ℓ−1} E^h H_h F_j(σ_k^{(ℓ)}(γ_ℓ) E^{e_ℓ}, X) = Σ_{h=0}^{e_ℓ−1} E^h Σ_{i=1}^{j−1} P_{i,h} F_i(σ_k^{(ℓ)}(γ_ℓ) E^{e_ℓ}, X)        (10)

with H_h, P_{1,h}, ..., P_{j−1,h} ∈ Q̄[E^{e_ℓ}, X] for 0 ≤ h ≤ e_ℓ − 1. Then identity (10) holds if and only if the following identity:

    H_h F_j(σ_k^{(ℓ)}(γ_ℓ) E^{e_ℓ}, X) = Σ_{i=1}^{j−1} P_{i,h} F_i(σ_k^{(ℓ)}(γ_ℓ) E^{e_ℓ}, X)
holds in Q̄[E, X] for 0 ≤ h ≤ e_ℓ − 1. This implies that F_j is a zero divisor of the Q̄-algebra Q̄[E, X]/(F_1, ..., F_{j−1}), which contradicts our hypotheses.

Let us remark that Lemma 2 shows in particular that the ring Q̄[T, Y]/I^{(ℓ,k)} is Cohen–Macaulay. From now on we fix the notations J_F := det(∂F_i/∂X_j)_{1≤i,j≤n} and J_G := det(∂G_i^{(ℓ,k)}/∂Y_j)_{1≤i,j≤n}.

Lemma 3. The ideal I^{(ℓ,k)} is a radical ideal of Q̄[T, Y].

Proof. Since by hypothesis the morphism π is generically unramified, the Jacobian determinant J_F is not a zero divisor of Q̄[E, X]/(F_1, ..., F_n). We claim that the Jacobian determinant J_G is not a zero divisor of Q̄[T, Y]/I^{(ℓ,k)}. Suppose that there exist polynomials H̃, P̃_1, ..., P̃_n ∈ Q̄[T, Y] such that

    H̃ J_G = Σ_{i=1}^n P̃_i G_i^{(ℓ,k)}        (11)

holds in Q̄[T, Y]. Observe that J_F(σ_k^{(ℓ)}(γ_ℓ) E^{e_ℓ}, X) = E^μ Φ_R^{(ℓ,k)}(J_G) holds for a suitable μ ∈ Z. Arguing as in the proof of Lemma 2 we conclude that there exist polynomials H_h, P_{i,h} ∈ Q̄[E^{e_ℓ}, X] for 0 ≤ h ≤ e_ℓ − 1 and 1 ≤ i ≤ n such that identity (11) holds if and only if the identity

    H_h J_F(σ_k^{(ℓ)}(γ_ℓ) E^{e_ℓ}, X) = Σ_{i=1}^n P_{i,h} F_i(σ_k^{(ℓ)}(γ_ℓ) E^{e_ℓ}, X)
holds in Q̄[E^{e_ℓ}, X] for 0 ≤ h ≤ e_ℓ − 1. We conclude that J_F is a zero divisor of Q̄[E, X]/(F_1, ..., F_n), thus contradicting the hypothesis on the generic unramifiedness of π. We conclude that J_G is not a zero divisor of Q̄[T, Y]/I^{(ℓ,k)}. This implies that the ideal generated by the n × n minors of the Jacobian matrix of G_1^{(ℓ,k)}, ..., G_n^{(ℓ,k)} with respect to T, Y_1, ..., Y_n has codimension at least 1 in Q̄[T, Y]/I^{(ℓ,k)}. Since Q̄[T, Y]/I^{(ℓ,k)} is a Cohen–Macaulay ring, from [24, Theorem 18.15] we conclude that I^{(ℓ,k)} is radical.

3.2. The unramifiedness of the morphism π^{(ℓ,k)} at T = 0

In what follows, we shall use the following terminology: for a given polynomial G ∈ Q̄[T, Y] := Q̄[T, Y_1, ..., Y_n], let us write G(T, Y) = T^ρ g(Y) + Ĝ, where g is a nonzero polynomial of Q̄[Y] and Ĝ ∈ Q̄[T, Y] has order at least ρ + 1 in T. The polynomial g(Y) is called the initial form of G and is denoted in(G).

Let us fix ℓ, k ∈ N with 1 ≤ ℓ ≤ g and 1 ≤ k ≤ f_ℓ. We are going to show that the morphism π^{(ℓ,k)} : V^{(ℓ,k)} → A^1 defined by π^{(ℓ,k)}(t, y) := t is unramified at every point of the fiber (π^{(ℓ,k)})^{-1}(0). For this purpose, we are going to prove that for any point b ∈ (π^{(ℓ,k)})^{-1}(0) there exists a unique holomorphic branch of the curve V^{(ℓ,k)} passing through b, and b has multiplicity 1 in this branch. This is equivalent to showing that the zero-dimensional affine variety defined by the (initial) ideal in(I^{(ℓ,k)}) ⊂ Q̄[Y] generated by the set {in(F) : F ∈ I^{(ℓ,k)}} has as many points as the number of holomorphic branches of V^{(ℓ,k)} passing through points of (π^{(ℓ,k)})^{-1}(0), namely #(Ṽ^{(ℓ,k)} ∩ Q̄[[T]]^n) with the notations of Lemma 1. This is the content of our next result.

Proposition 4. With notations and assumptions as above, the variety defined by the initial ideal in(I^{(ℓ,k)}) is

    W^{(ℓ,k)} = {σ_{k'}^{(ℓ')}(a_R^{(ℓ')}) σ_{k'}^{(ℓ')}(γ_{ℓ'}^{-1/e_{ℓ'}})^R σ_k^{(ℓ)}(γ_ℓ^{1/e_ℓ})^R : (ℓ', k') ∈ L_{ℓ,k}}.
Proof. Let Ŵ^{(ℓ,k)} := {σ_{k'}^{(ℓ')}(a_R^{(ℓ')}) σ_{k'}^{(ℓ')}(γ_{ℓ'}^{-1/e_{ℓ'}})^R σ_k^{(ℓ)}(γ_ℓ^{1/e_ℓ})^R : (ℓ', k') ∈ L_{ℓ,k}}. We want to show that W^{(ℓ,k)} = Ŵ^{(ℓ,k)} holds.

We first prove the inclusion W^{(ℓ,k)} ⊃ Ŵ^{(ℓ,k)}. Let b ∈ Ŵ^{(ℓ,k)} and let F ∈ I^{(ℓ,k)}. Then there exists (ℓ', k') ∈ L_{ℓ,k} such that b = σ_{k'}^{(ℓ')}(a_R^{(ℓ')}) σ_{k'}^{(ℓ')}(γ_{ℓ'}^{-1/e_{ℓ'}})^R σ_k^{(ℓ)}(γ_ℓ^{1/e_ℓ})^R holds. Let us write F = T^ρ in(F) + F̂, with in(F) ∈ Q̄[Y]\{0} and F̂ ∈ Q̄[T, Y] of order at least ρ + 1 in T. From Lemma 1 we have

    0 = F(T, Σ_{m ≥ R} σ_{k'}^{(ℓ')}(a_m^{(ℓ')}) σ_{k'}^{(ℓ')}(γ_{ℓ'}^{-1/e_{ℓ'}})^m σ_k^{(ℓ)}(γ_ℓ^{1/e_ℓ})^m T^{m−R})
      = T^ρ in(F)(σ_{k'}^{(ℓ')}(a_R^{(ℓ')}) σ_{k'}^{(ℓ')}(γ_{ℓ'}^{-1/e_{ℓ'}})^R σ_k^{(ℓ)}(γ_ℓ^{1/e_ℓ})^R) + T^{ρ+1} f̂(T)
      = T^ρ in(F)(b) + T^{ρ+1} f̂(T),

whence in(F)(b) = 0, i.e., b ∈ W^{(ℓ,k)}.
In order to prove the converse inclusion, we consider the factorization of m_u induced by the classical Puiseux expansions of the branches of V lying above 0:

    m_u = Π_{ℓ=1}^g Π_{k=1}^{f_ℓ} Π_{j=1}^{e_ℓ} (Z − Σ_{m ≥ m_ℓ} U(σ_k^{(ℓ)}(a_m^{(ℓ)})) σ_k^{(ℓ)}(γ_ℓ^{-1/e_ℓ})^m ζ_ℓ^{jm} E^{m/e_ℓ}),

where σ_1^{(ℓ)}, ..., σ_{f_ℓ}^{(ℓ)} range over all the morphisms of the Galois group of the field extension Q ↪ K^{(ℓ)}, and γ_ℓ^{-1/e_ℓ} and ζ_ℓ denote an e_ℓth root of γ_ℓ^{-1} and a primitive e_ℓth root of 1, respectively. From [23, Theorem 2] we deduce that, for 1 ≤ ℓ ≤ g,

    m_u^{(ℓ)} := Π_{k=1}^{f_ℓ} m_u^{(ℓ,k)} := Π_{k=1}^{f_ℓ} Π_{j=1}^{e_ℓ} (Z − Σ_{m ≥ m_ℓ} U(σ_k^{(ℓ)}(a_m^{(ℓ)})) σ_k^{(ℓ)}(γ_ℓ^{-1/e_ℓ})^m ζ_ℓ^{jm} E^{m/e_ℓ})        (12)
is an irreducible polynomial of Q((E))[Z] and, for 1 ≤ k ≤ f_ℓ, m_u^{(ℓ,k)} is an irreducible element of Q̄((E))[Z] satisfying

    m_u^{(ℓ,k)}(σ_k^{(ℓ)}(γ_ℓ) T^{e_ℓ}, Z) = Π_{j=1}^{e_ℓ} (Z − Σ_{m ≥ m_ℓ} U(σ_k^{(ℓ)}(a_m^{(ℓ)})) (ζ_ℓ^j T)^m).        (13)

For 1 ≤ ℓ ≤ g and 1 ≤ k ≤ f_ℓ, let us consider the morphism of Q̄-algebras

    Ψ̃_R^{(ℓ,k)} : Q̄((E))[X] → Q̄((T))[Y],
    F(E, X) ↦ F(σ_k^{(ℓ)}(γ_ℓ) T^{e_ℓ}, Σ_{m=m_ℓ}^{R−1} σ_k^{(ℓ)}(a_m^{(ℓ)}) T^m + Y T^R).

Let us fix ℓ', k' with 1 ≤ ℓ' ≤ g and 1 ≤ k' ≤ f_{ℓ'}. Applying the morphism Ψ̃_R^{(ℓ,k)} to the polynomial m_u^{(ℓ',k')}(E, U(X)), from identity (12) we obtain

    Ψ̃_R^{(ℓ,k)}(m_u^{(ℓ',k')}(E, U(X)))
      = Π_{j=1}^{e_{ℓ'}} (Σ_{m=m_ℓ}^{R−1} U(σ_k^{(ℓ)}(a_m^{(ℓ)})) T^m + U(Y) T^R
          − Σ_{m ≥ m_{ℓ'}} U(σ_{k'}^{(ℓ')}(a_m^{(ℓ')})) σ_{k'}^{(ℓ')}(γ_{ℓ'}^{-1/e_{ℓ'}})^m σ_k^{(ℓ)}(γ_ℓ^{1/e_ℓ})^m ζ_{ℓ'}^{jm} T^{m e_ℓ/e_{ℓ'}}).

This identity shows that all the factors of Ψ̃_R^{(ℓ,k)}(m_u^{(ℓ',k')}(E, U(X))) have order at most R, and that the coefficient of the least nonzero power of T arising in the Laurent series Ψ̃_R^{(ℓ,k)}(m_u^{(ℓ',k')}(E, U(X))) ∈ Q̄[Y]((T)) is
• either of the form ω U(Y − σ_{k'}^{(ℓ')}(a_R^{(ℓ')}) σ_{k'}^{(ℓ')}(γ_{ℓ'}^{-1/e_{ℓ'}})^R σ_k^{(ℓ)}(γ_ℓ^{1/e_ℓ})^R) with ω ∈ Q̄\{0}, in case that (ℓ', k') ∈ L_{ℓ,k} holds,
• or a nonzero constant ω ∈ Q̄ otherwise.
We deduce that the coefficients of the least nonzero power of T arising in the following elements of Q̄[Y]((T)):

    Ψ̃_R^{(ℓ,k)}(Π_{(ℓ',k') ∈ L_{ℓ,k}} m_u^{(ℓ',k')}(E, U(X))),   Ψ̃_R^{(ℓ,k)}(Π_{(ℓ',k') ∉ L_{ℓ,k}} m_u^{(ℓ',k')}(E, U(X)))

are of the form α Π_{b ∈ Ŵ^{(ℓ,k)}} U(Y − b) ∈ Q̄[Y] with α ∈ Q̄\{0}, and a constant β̃ ∈ Q̄\{0}, respectively. We conclude that the following identity holds:

    in(Ψ_R^{(ℓ,k)}(m_u(E, U(X)))) = α̃ Π_{b ∈ Ŵ^{(ℓ,k)}} U(Y − b),   α̃ ∈ Q̄\{0}.        (14)

Since m_u(E, U(X)) ∈ I(V), we conclude that there exists ρ ∈ Z such that T^ρ Ψ_R^{(ℓ,k)}(m_u(E, U(X))) ∈ I^{(ℓ,k)}. Then in(Ψ_R^{(ℓ,k)}(m_u(E, U(X)))) ∈ in(I^{(ℓ,k)}).

Now, let U_1, ..., U_n be Q-linearly independent generic linear forms. Repeating the previous arguments with U_1, ..., U_n, from identity (14) we conclude that W^{(ℓ,k)} is a
zero-dimensional subvariety of A^n. Furthermore, we have

    deg Ŵ^{(ℓ,k)} ≤ deg W^{(ℓ,k)} ≤ deg Π_{b ∈ Ŵ^{(ℓ,k)}} U_1(Y − b) = #(Ŵ^{(ℓ,k)}).

This shows that #(Ŵ^{(ℓ,k)}) = #(W^{(ℓ,k)}). Therefore, taking into account the inclusion Ŵ^{(ℓ,k)} ⊂ W^{(ℓ,k)}, we see that W^{(ℓ,k)} = Ŵ^{(ℓ,k)} holds.

Now we exhibit a flatness condition which assures that any point of the fiber (π^{(ℓ,k)})^{-1}(0) is (G_1^{(ℓ,k)}, ..., G_n^{(ℓ,k)})-smooth. For this purpose, we introduce the notion of standard basis (see [20]). A set {G_1, ..., G_s} ⊂ Q̄[T, Y] := Q̄[T, Y_1, ..., Y_n] is called a standard basis (of the ideal I they generate) if the ideal (in(G_1), ..., in(G_s)) generated by the initial forms of G_1, ..., G_s in Q̄[Y] agrees with the ideal in(I) := (in(G) : G ∈ I) generated by the initial forms of all the polynomials G ∈ I.

Theorem 5. Let notations and assumptions be as above. Suppose further that G_1^{(ℓ,k)}(T, Y), ..., G_n^{(ℓ,k)}(T, Y) form a standard basis of the ideal I^{(ℓ,k)}. Then the Jacobian determinant J_G does not vanish at any point of (π^{(ℓ,k)})^{-1}(0).

Proof. Since G_1^{(ℓ,k)}, ..., G_n^{(ℓ,k)} form a standard basis of I^{(ℓ,k)}, we see that

    (π^{(ℓ,k)})^{-1}(0) = {0} × V(G_1^{(ℓ,k)}(0, Y), ..., G_n^{(ℓ,k)}(0, Y))
      = {0} × V(in(G_1^{(ℓ,k)}), ..., in(G_n^{(ℓ,k)})) = {0} × W^{(ℓ,k)}

holds. From Proposition 4 we see that for any b ∈ W^{(ℓ,k)} there exists a unique vector of power series φ ∈ Q̄[[T]]^n such that φ(0) = b and G_i^{(ℓ,k)}(T, φ) = 0 hold for 1 ≤ i ≤ n.
where the univariate polynomials q^{(ℓ)}, f^{(ℓ)}, f_m^{(ℓ)} ∈ Q[S_1] encode the ℓth parameterization: for every root s_1 of q^{(ℓ)} there is a morphism σ_k^{(ℓ)} with f^{(ℓ)}(s_1) = σ_k^{(ℓ)}(γ_ℓ)^{-1} and f_{m,j}^{(ℓ)}(s_1) = σ_k^{(ℓ)}(a_{m,j}^{(ℓ)}). Finally, let p^{(ℓ)} ∈ Q[S_1, S_2] be the polynomial p^{(ℓ)} := S_2^{e_ℓ} − f^{(ℓ)}(S_1), and let

    W^{(ℓ)} := {(s_1, s_2) ∈ A^2 : p^{(ℓ)}(s_1, s_2) = 0, q^{(ℓ)}(s_1) = 0}.        (15)
It is easy to see that W^{(ℓ)} is a zero-dimensional variety of degree deg W^{(ℓ)} = e_ℓ f_ℓ. [23] shows that the field K^{(ℓ)} is the field extension of Q generated by the coefficients a_{j,m}^{(ℓ)} for 1 ≤ j ≤ n and m_ℓ ≤ m ≤ R. In particular, K^{(ℓ)} is the minimal field extension of Q containing the coefficients of the singular parts of the given system of rational Puiseux expansions.

For ν ≥ R, let u^{(ν,ℓ)} := Σ_{m=m_ℓ}^ν U(f_m^{(ℓ)}(S_1))(S_2 T)^m, and let

    Π_{k=1}^{f_ℓ} Π_{j=1}^{e_ℓ} (Z − Σ_{m=m_ℓ}^ν U(σ_k^{(ℓ)}(a_m^{(ℓ)})) (ζ_ℓ^j σ_k^{(ℓ)}(γ_ℓ^{-1/e_ℓ}) T)^m)        (16)

be the corresponding product over the conjugates of the ℓth parameterization. Observe that if the norm in the field extension K^{(ℓ)}(σ_k^{(ℓ)}(γ_ℓ^{-1/e_ℓ}) T)/Q(T^{e_ℓ}) is extended to polynomials, then the congruence

    m_u^{(ℓ)}(T^{e_ℓ}, Z) ≡ Π_{k=1}^{f_ℓ} Π_{j=1}^{e_ℓ} (Z − Σ_{m=m_ℓ}^ν U(σ_k^{(ℓ)}(a_m^{(ℓ)})) (ζ_ℓ^j σ_k^{(ℓ)}(γ_ℓ^{-1/e_ℓ}) T)^m)   mod (T^{ν + θ_0 m_ℓ e_ℓ f_ℓ + 1})        (17)

holds in Q((T))[Z], with θ_0 := −1 for m_ℓ < 0 and θ_0 := 0 otherwise.
Let N_{G^{(ℓ)}} be the Newton–Hensel operator associated to G_1^{(ℓ)}, ..., G_n^{(ℓ)}, namely

    N_{G^{(ℓ)}}(Y) := Y − ((∂G_i^{(ℓ)}/∂Y_j)_{1≤i,j≤n})^{-1} (G_1^{(ℓ)}, ..., G_n^{(ℓ)})^t,        (18)

where t denotes transposition, and let N_{G^{(ℓ)}}^ν denote the νth fold iteration of N_{G^{(ℓ)}}. Finally, let

    ũ^{(ν,ℓ)} := U(Σ_{m=m_ℓ}^{R−1} f_m^{(ℓ)}(S_1)(S_2 T)^m + N_{G^{(ℓ)}}^ν(S_1, S_2, T, f_R^{(ℓ)}(S_1) S_2^R) T^R)

and consider its characteristic polynomial in Q(S_1, S_2, T)[Z]. Then, for (s_1, s_2) ∈ W^{(ℓ)} corresponding to the morphism σ_k^{(ℓ)}, we have

    G_i^{(ℓ)}(s_1, s_2, T, Y) = s_2^{α_{i,ℓ}} G_i^{(ℓ,k)}(s_2 T, s_2^{−R} Y).        (19)

Let us observe that s_2^R σ_k^{(ℓ)}(a_R^{(ℓ)}) ∈ A^n belongs to the affine variety defined by G_1^{(ℓ)}(s_1, s_2, 0, Y), ..., G_n^{(ℓ)}(s_1, s_2, 0, Y). Furthermore, from Theorem 5 and identity (19) we conclude that J_{G^{(ℓ)}}(T, Y) := det(∂G_i^{(ℓ)}/∂Y_j)_{1≤i,j≤n}(s_1, s_2, T, Y) does not vanish at (0, s_2^R σ_k^{(ℓ)}(a_R^{(ℓ)})) ∈ A^{n+1}, and hence J_{G^{(ℓ)}}(T, s_2^R σ_k^{(ℓ)}(a_R^{(ℓ)})) is a unit in the local ring Q̄[[T]]. Therefore, we obtain

    U(Σ_{m=m_ℓ}^{R−1} f_m^{(ℓ)}(s_1)(s_2 T)^m + N_{G^{(ℓ)}}^ν(s_1, s_2, T, σ_k^{(ℓ)}(a_R^{(ℓ)}) s_2^R) T^R)
      ≡ U(Σ_{m ≥ m_ℓ} σ_k^{(ℓ)}(a_m^{(ℓ)})(s_2 T)^m)   mod (T^{R+2^ν}),

which implies ũ^{(ν,ℓ)}(s_1, s_2, T) ≡ u^{(R−1+2^ν,ℓ)}(s_1, s_2, T) mod (T^{R+2^ν}). Lemma 6 shows that
standard basis requirement. Nevertheless, this is not an arbitrary "algebraic" requirement, as shown by the following result:

Lemma 8. Let notations and assumptions be as in Lemmas 1–3. Suppose that the morphism π^{(ℓ,k)} is unramified at T = 0. Then G_1^{(ℓ,k)}, ..., G_n^{(ℓ,k)} form a standard basis of the ideal I^{(ℓ,k)}.

Proof. Let x ∈ A^{n+1} be a point of the fiber (π^{(ℓ,k)})^{-1}(0). Let (O_{V^{(ℓ,k)},x}, m_x) denote the local ring of the point x on the variety V^{(ℓ,k)} and let (O_{A^1,0}, m_0) denote the local ring of 0 on A^1. Since π^{(ℓ,k)} is unramified at T = 0, we have

    m_x = ((π^{(ℓ,k)})^*(m_0))        (20)

for any x ∈ (π^{(ℓ,k)})^{-1}(0), where (π^{(ℓ,k)})^* denotes the local homomorphism (π^{(ℓ,k)})^* : O_{A^1,0} → O_{V^{(ℓ,k)},x} induced by the morphism π^{(ℓ,k)}. Identity (20) implies that the morphism d_x π^{(ℓ,k)} : T_{V^{(ℓ,k)},x} → T_{A^1,0} of tangent spaces is injective [22]. We deduce that the dimension dim(T_{V^{(ℓ,k)},x}) of the tangent space of V^{(ℓ,k)} at x is at most 1. Taking into account that V^{(ℓ,k)} is an equidimensional variety of dimension 1 (Lemma 2), we conclude that dim(T_{V^{(ℓ,k)},x}) = 1. Therefore, x is a smooth point of V^{(ℓ,k)}.

Identity (20) shows that the quotient ring O_{V^{(ℓ,k)},x}/(π^{(ℓ,k)})^*(m_0) is a zero-dimensional Q̄-algebra. Let us observe that O_{V^{(ℓ,k)},x} is a Cohen–Macaulay ring (because it is a localization of a Cohen–Macaulay ring), the local ring O_{A^1,0} is a regular ring, and the identity dim O_{V^{(ℓ,k)},x} = dim O_{A^1,0} + dim O_{V^{(ℓ,k)},x}/(π^{(ℓ,k)})^*(m_0) holds. Then applying [48, Theorem 23.1] we conclude that the local homomorphism

    (π^{(ℓ,k)})^* : O_{A^1,0} → O_{V^{(ℓ,k)},x}        (21)

induced by π^{(ℓ,k)} is flat.

We observe that the localization Q̄[V^{(ℓ,k)}]_{m_0} is a semilocal ring, whose maximal ideals correspond to the maximal ideals m_x induced by the points x of (π^{(ℓ,k)})^{-1}(0). Therefore, since the morphism of (21) is flat for any point x ∈ (π^{(ℓ,k)})^{-1}(0), applying [48, Theorem 7.1] we conclude that

    (π^{(ℓ,k)})^* : Q̄[A^1]_{m_0} → Q̄[V^{(ℓ,k)}]_{m_0}

is flat, i.e., π^{(ℓ,k)} is flat at T = 0. Therefore, from [5, Part I, Proposition 3.1] (see also [6]) it follows that any syzygy (h_1, ..., h_n) ∈ Q̄[Y]^n of the polynomials G_1^{(ℓ,k)}(0, Y), ..., G_n^{(ℓ,k)}(0, Y) "lifts" to a syzygy (H_1, ..., H_n) ∈ Q̄[T, Y]^n of G_1^{(ℓ,k)}, ..., G_n^{(ℓ,k)}, i.e., for 1 ≤ i ≤ n the identity H_i(0, Y) = h_i(Y) holds.

Now we adapt the contents of e.g., [49] to our setting. For F ∈ Q̄[T, Y], let o_T(F) denote the highest power of T dividing F. We claim that any polynomial G ∈ I^{(ℓ,k)} has a representation

    G = Σ_{i=1}^n H_i G_i^{(ℓ,k)}        (22)
with order o_T(H_i) ≥ o_T(G) for 1 ≤ i ≤ n. Let G ∈ I^{(ℓ,k)} be a polynomial with a representation G = Σ_{i=1}^n H_i G_i^{(ℓ,k)}. Let μ := min{o_T(H_i) : 1 ≤ i ≤ n}, and suppose that μ < o_T(G) holds. Let J be the set of indices i for which μ = o_T(H_i) holds. Then the identity

    Σ_{i ∈ J} (T^{−μ} H_i)(0, Y) G_i^{(ℓ,k)}(0, Y) = 0

shows that (h_1, ..., h_n) ∈ Q̄[Y]^n, with h_i := (T^{−μ} H_i)(0, Y) if i ∈ J and h_i := 0 otherwise, is a syzygy of G_1^{(ℓ,k)}(0, Y), ..., G_n^{(ℓ,k)}(0, Y). Then there exists a lifting (H̃_1, ..., H̃_n) ∈ Q̄[T, Y]^n of the syzygy (h_1, ..., h_n), and we have

    G = Σ_{i=1}^n (H_i − T^μ H̃_i) G_i^{(ℓ,k)}

with o_T(H_i − T^μ H̃_i) > μ for 1 ≤ i ≤ n. Repeating this argument at most o_T(G) times, we conclude the validity of our claim.

Finally, let G ∈ I^{(ℓ,k)}. Then we have a representation of G as in (22), with order o_T(H_i) ≥ o_T(G) for 1 ≤ i ≤ n. Let J be the (nonempty) set of indices i for which o_T(G) = o_T(H_i) holds. Then we have

    in(G) = (T^{−o_T(G)} G)(0, Y) = Σ_{i ∈ J} (T^{−o_T(G)} H_i)(0, Y) · G_i^{(ℓ,k)}(0, Y)
          = Σ_{i ∈ J} (T^{−o_T(G)} H_i)(0, Y) · in(G_i^{(ℓ,k)}).
This shows that G_1^{(ℓ,k)}, ..., G_n^{(ℓ,k)} form a standard basis of the ideal I^{(ℓ,k)}.

4. Algorithms and complexity estimates

Let notations and assumptions be as in Section 2.1. Let δ := deg V denote the degree of the variety V, and let D := deg π denote the degree of the morphism π : V → A^1. Suppose that we are given a straight-line program β computing F_1, ..., F_n with space S and time T.

Let S_1, S_2 be indeterminates over Q. With the notations of Section 3.3, for 1 ≤ ℓ ≤ g and m_ℓ ≤ m ≤ R, let q^{(ℓ)}, f^{(ℓ)}, f_{m,1}^{(ℓ)}, ..., f_{m,n}^{(ℓ)} ∈ Q[S_1] and p^{(ℓ)} ∈ Q[S_1, S_2] be the polynomials defining the system of rational Puiseux expansions of the branches of V lying above 0 of Section 3.3. In particular, we have the estimates deg(q^{(ℓ)}) = f_ℓ, deg(f^{(ℓ)}) < f_ℓ and deg(f_{m,i}^{(ℓ)}) < f_ℓ for 1 ≤ i ≤ n, and the singular parts of the (classical) Puiseux expansions of the branches of V lying above 0 are given by

    ∪_{ℓ=1}^g {Σ_{m=m_ℓ}^R f_m^{(ℓ)}(s_1) s_2^m T^m : p^{(ℓ)}(s_1, s_2) = 0, q^{(ℓ)}(s_1) = 0},        (23)

where f_m^{(ℓ)} := (f_{m,1}^{(ℓ)}, ..., f_{m,n}^{(ℓ)}) ∈ Q[S_1]^n. Let U ∈ Q[X] be a generic linear form, i.e., a linear form whose projection polynomial m_u ∈ Q(E)[Z] satisfies deg_Z m_u = D. Then identity (12) of Section 3 shows that m_u has the following factorization into irreducible
factors in Q((E))[Z]:

    m_u = Π_{ℓ=1}^g m_u^{(ℓ)} := Π_{ℓ=1}^g Π_{k=1}^{f_ℓ} Π_{j=1}^{e_ℓ} (Z − Σ_{m ≥ m_ℓ} U(σ_k^{(ℓ)}(a_m^{(ℓ)})) σ_k^{(ℓ)}(γ_ℓ^{-1/e_ℓ})^m ζ_ℓ^{jm} E^{m/e_ℓ}).        (24)
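The lifting that produces the approximations of the factors m_u^{(ℓ)} below rests on the quadratic convergence of the Newton–Hensel operator. As a hedged illustration of the principle (our own univariate toy in Q[[E]], not the paper's multivariate operator (18)), each Newton step doubles the E-adic precision of an approximate root:

```python
# Generic sketch (ours): Newton-Hensel lifting of the unramified
# approximate root x0 = 1 of F(E, X) = X^2 - (1 + E) in Q[[E]].
# Each step x <- x - F(x)/F'(x) doubles the E-adic precision.
from fractions import Fraction

N = 16  # target precision: computations modulo E^N

def mul(a, b):
    """Truncated power-series product."""
    c = [Fraction(0)] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < N:
                c[i + j] += ai * bj
    return c

def inv(a):
    """Truncated series inverse; requires a[0] != 0."""
    c = [Fraction(0)] * N
    c[0] = 1 / a[0]
    for k in range(1, N):
        c[k] = -sum(a[i] * c[k - i] for i in range(1, k + 1)) / a[0]
    return c

one_plus_E = [Fraction(1), Fraction(1)] + [Fraction(0)] * (N - 2)
x = [Fraction(1)] + [Fraction(0)] * (N - 1)     # x0 = 1: F(x0) = O(E)
for _ in range(4):                              # precision 2, 4, 8, 16
    F = [a - b for a, b in zip(mul(x, x), one_plus_E)]
    x = [xi - d for xi, d in zip(x, mul(F, inv([2 * xi for xi in x])))]

resid = [a - b for a, b in zip(mul(x, x), one_plus_E)]
assert all(c == 0 for c in resid)               # x^2 = 1 + E up to O(E^16)
assert x[1] == Fraction(1, 2)                   # sqrt(1+E) = 1 + E/2 - E^2/8 ...
assert x[2] == Fraction(-1, 8)
```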
In this section, we exhibit an algorithm which has as input the straight-line program β and the dense representation of p^{(ℓ)}, q^{(ℓ)}, f^{(ℓ)}, f_{m,1}^{(ℓ)}, ..., f_{m,n}^{(ℓ)} for 1 ≤ ℓ ≤ g and m_ℓ ≤ m ≤ R, and computes a geometric solution of V.

Let us fix ℓ with 1 ≤ ℓ ≤ g. The critical part of our algorithm is a procedure which computes a suitable approximation m̂_u^{(ℓ)} ∈ Q(E)[Z] of the polynomial m_u^{(ℓ)} ∈ Q((E))[Z]. This procedure applies our variant of the global Newton–Hensel lifting of [28,30], based on Theorem 7. For this purpose, we shall deal with the variety W^{(ℓ)} of (15), namely

    W^{(ℓ)} := {(s_1, s_2) ∈ A^2 : q^{(ℓ)}(s_1) = 0, p^{(ℓ)}(s_1, s_2) = 0}.

From the fact that deg W^{(ℓ)} = e_ℓ f_ℓ holds, we easily conclude that S_2 is a primitive element of the Q-algebra extension Q ↪ Q[W^{(ℓ)}]. Therefore, we have a geometric solution of W^{(ℓ)} of the form

    W^{(ℓ)} = {(s_1, s_2) ∈ A^2 : m_{S_2}^{(ℓ)}(s_2) = 0, s_1 (∂m_{S_2}^{(ℓ)}/∂Z)(s_2) − v^{(ℓ)}(s_2) = 0},        (25)

where m_{S_2}^{(ℓ)} ∈ Q[Z] is the minimal polynomial of S_2 in the extension Q ↪ Q[W^{(ℓ)}] and v^{(ℓ)} ∈ Q[Z] satisfies deg v^{(ℓ)} < deg W^{(ℓ)}.

In the sequel, time-complexity estimates will be given using the standard "soft-Oh" notation O~, which does not take into account polylogarithmic terms.

Lemma 9. There exists a computation tree computing the geometric solution (25) of W^{(ℓ)} with space O(e_ℓ f_ℓ^2) and time O~(e_ℓ f_ℓ^2).

Proof. Let us suppose first f_ℓ = 1. Then we may assume without loss of generality q^{(ℓ)} = S_1. Furthermore, we have f^{(ℓ)} ∈ Q\{0} and p^{(ℓ)} = S_2^{e_ℓ} − f^{(ℓ)}. Therefore, m_{S_2}^{(ℓ)} = p^{(ℓ)} = Z^{e_ℓ} − f^{(ℓ)} and v^{(ℓ)} = 0 yield in fact the geometric solution of W^{(ℓ)} we are looking for (and we have nothing to compute).

Now suppose that f_ℓ > 1 holds. Let us introduce a new indeterminate λ, and let us consider the linear form L := λ S_1 + S_2 ∈ Q[λ][S_1, S_2]. It is easy to see that L is a primitive element of the integral ring extension Q[λ] ↪ Q[λ] ⊗ Q[W^{(ℓ)}], with minimal equation

    m_L^{(ℓ)}(Z) = Res_{S_1}(q^{(ℓ)}(S_1), p^{(ℓ)}(S_1, Z − λ S_1)),        (26)

where Res_{S_1}(f, g) denotes the resultant of f and g with respect to S_1. Following an idea originally due to [44] (see also [2,33,47, Section II.21,54,56]), we have a congruence
relation

    m_L^{(ℓ)}(Z) ≡ m_{S_2}^{(ℓ)}(Z) + λ ṽ^{(ℓ)}(Z)   mod (λ^2),

with ṽ^{(ℓ)} ∈ Q[Z], deg ṽ^{(ℓ)} < e_ℓ f_ℓ and S_1 (∂m_{S_2}^{(ℓ)}/∂Z)(S_2) + ṽ^{(ℓ)}(S_2) ∈ I(W^{(ℓ)}). Then m_{S_2}^{(ℓ)} and v^{(ℓ)} := −ṽ^{(ℓ)} can be obtained from the resultant of the right-hand side of identity (26) modulo λ^2. Using interpolation in the variable Z, this computation can be performed with space O(e_ℓ f_ℓ^2) and time O~(e_ℓ f_ℓ^2).

Our variant of the global Newton–Hensel lifting requires the Rth "initial approximation" of m_u^{(ℓ)} given by the following expression (compare with (24)):

    m̃_u^{(ℓ)}(T^{e_ℓ}, Z) := Π_{k=1}^{f_ℓ} Π_{j=1}^{e_ℓ} (Z − Σ_{m=m_ℓ}^R U(σ_k^{(ℓ)}(a_m^{(ℓ)})) σ_k^{(ℓ)}(γ_ℓ^{-1/e_ℓ})^m ζ_ℓ^{jm} T^m).        (27)
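The λ-trick behind identity (26) can be checked numerically on toy data of our own choosing (q(S_1) = S_1^2 − 2, p(S_1, S_2) = S_2^2 − S_1, so e_ℓ f_ℓ = 4): expanding the product defining m_L modulo λ^2 with dual numbers recovers the minimal polynomial m_{S_2} as the λ^0 part and ṽ as the λ^1 part:

```python
# Numerical sketch (our own toy data, not from the paper) of the
# congruence m_L(Z) = m_{S2}(Z) + lambda*v~(Z) mod lambda^2, for
# W = {(s1, s2) : s1^2 - 2 = 0, s2^2 - s1 = 0}. Dual numbers (a, b)
# represent a + b*lambda with lambda^2 = 0.
import cmath

pts = [(s1, s2)
       for s1 in (2 ** 0.5, -(2 ** 0.5))
       for s2 in (cmath.sqrt(s1), -cmath.sqrt(s1))]

def dmul(x, y):
    """Product of dual numbers."""
    return (x[0] * y[0], x[0] * y[1] + x[1] * y[0])

def ev(c, z):
    """Evaluate a coefficient list (low degree first) at z."""
    return sum(ck * z ** k for k, ck in enumerate(c))

mL = [(1 + 0j, 0j)]                      # coefficients of m_L mod lambda^2
for s1, s2 in pts:                       # multiply by Z - (lambda*s1 + s2)
    root = (-s2, complex(-s1))
    mL = [(0j, 0j)] + mL
    for k in range(len(mL) - 1):
        t = dmul(mL[k + 1], root)
        mL[k] = (mL[k][0] + t[0], mL[k][1] + t[1])

mS2 = [c[0] for c in mL]                 # lambda^0 part; here Z^4 - 2
vt = [c[1] for c in mL]                  # lambda^1 part: v~(Z)
assert abs(mS2[0] + 2) < 1e-9 and abs(mS2[4] - 1) < 1e-9

# s1 * m_{S2}'(s2) + v~(s2) vanishes on W, as the congruence predicts
dmS2 = [k * mS2[k] for k in range(1, len(mS2))]
for s1, s2 in pts:
    assert abs(s1 * ev(dmS2, s2) + ev(vt, s2)) < 1e-8
```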
Lemma 10. There exists a computation tree which takes as input the polynomials p^{(ℓ)}, q^{(ℓ)}, f^{(ℓ)}, f_{m,i}^{(ℓ)} (1 ≤ i ≤ n), m_{S_2}^{(ℓ)}, v^{(ℓ)}, which define the ℓth expansion of the given system of rational Puiseux expansions of V and form the geometric solution (25) of W^{(ℓ)}, and computes the dense representation of m̃_u^{(ℓ)} with space O(R_ℓ e_ℓ f_ℓ) and time O~(R_ℓ e_ℓ^2 f_ℓ^2), where R_ℓ := (R − m_ℓ) e_ℓ f_ℓ + 1.

Proof. From the definition of m̃_u^{(ℓ)} and the variety W^{(ℓ)} we easily see that T^{−m_ℓ e_ℓ f_ℓ} m̃_u^{(ℓ)}(T^{e_ℓ}, T^{m_ℓ} Z) equals the characteristic polynomial
Newton–Hensel lifting (Theorem 7), combined with an adaptation of the procedure of [33, Proposition 7]. For this purpose, following Theorem 7, let Y_1, ..., Y_n be indeterminates over Q, let Y := (Y_1, ..., Y_n), and let us define G_1^{(ℓ)}, ..., G_n^{(ℓ)} ∈ Q[S_1, S_2^{-1}, S_2, T, Y] by

    G_j^{(ℓ)} := T^{α_{j,ℓ}} F_j(T^{e_ℓ}, Σ_{m=m_ℓ}^{R−1} f_m^{(ℓ)}(S_1)(S_2 T)^m + Y T^R).        (28)
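The quantities S_ℓ and T_ℓ appearing below measure the space and time needed to evaluate G_1^{(ℓ)}, ..., G_n^{(ℓ)} by a straight-line program (Section 2.3). As a minimal operational sketch of such a program and its measures (our own illustration, using a toy polynomial rather than an actual G_j^{(ℓ)}):

```python
# Minimal sketch (ours) of a straight-line program with its time measure
# (number of arithmetic operations) and a naive space measure (registers
# held). The toy program computes X^2 - E; a G_j of (28) would be encoded
# the same way, only with more instructions.
from fractions import Fraction

# instruction = (op, i, j): apply op to the results at indices i and j
program = [
    ("mul", 1, 1),   # r2 = X * X
    ("sub", 2, 0),   # r3 = r2 - E
]

def run(slp, inputs):
    regs = list(inputs)              # inputs occupy the first registers
    for op, i, j in slp:
        a, b = regs[i], regs[j]
        regs.append({"add": a + b, "sub": a - b, "mul": a * b}[op])
    return regs[-1], len(slp), len(regs)

value, time_measure, space_measure = run(program, [Fraction(3), Fraction(2)])
assert value == Fraction(1)          # 2^2 - 3 = 1
assert (time_measure, space_measure) == (2, 4)
```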
Proposition 11. Let us ?x ;¿0. Then there exists a computation tree which takes (‘) (‘) (‘) as input the polynomials p(‘) ; q(‘) ; f(‘) ; fm; i (16k6n); mS2 ; v , which de?ne the ‘th parameterization of the given system of rational Puiseux expansions of V and form the geometric solution (25) of W (‘) , and computes an approximation mˆ (‘) u ∈ Q(E)[Z] (‘) of mu in Q((E))[Z] with precision (R+;)=e‘ +1 and parameterizations of Y1 ; : : : ; Yn in terms of the linear form U up to order (R + ;)=e‘ + 1, with space and time O(ne‘ f‘ (S‘ (; + 0 m‘ e‘ f‘ ) + R ‘ )) and O˜(ne‘ f‘ (T‘ + n4 )(; + 0 m‘ e‘ f‘ + (R ‘ − 1)e‘ f‘ )); respectively, where R ‘ := (R − m‘ )e‘ f‘ + 1, 0 := − 1 for m‘ ¡0 and 0 := 0 otherwise, and S‘ , T‘ denote the space and time complexity required for the evaluation of the polynomials G1(‘) ; : : : ; Gn(‘) . Proof. Theorem 5 shows that the Newton operator NG(‘) of (18) is well de(ned at fR(‘) (s1 )s2R for any (s1 ; s2 ) ∈ W (‘) . Then Theorem 7 shows that from the A := log2 (; + 0 m‘ e‘ f‘ +1)-fold iteration of the Newton operator NG(‘) we obtain a rational function mˆ u ∈ Q(E)[Z] which approximates mu(‘) in Q((E)) with precision (R + ;)=e‘ . In order to compute mˆ u (T e‘ ; Z) we use an adaptation of the procedure of [33, Proposition 7]: we start with the initial approximation provided by the polynomial m˜ u(‘) of (27) and parameterizations of X1 ; : : : ; Xn in terms of the linear form U up to order R + 1, i.e., elements v˜1(‘) ; : : : ; v˜n(‘) of Q(E)[Z] such that (@m˜ u(‘) =@Z)(T e‘; U )Xi ≡ v˜i(‘) (T e‘; U ) mod(T R+1 ; mu(‘) (T e‘ ; U )). Then we perform A steps of the global Newton– Hensel lifting of [33, Proposition 7] applied to the polynomials G1(‘) ; : : : ; Gn(‘) . Applying Lemma 10 we obtain the polynomial m˜ u(‘) of (27) with space O(R ‘ e‘ f‘ ) and time O˜(R ‘ e‘2 f‘2 ). 
Combining Lemma 10 and the formulae of, e.g., [2,56] or [33] as in the proof of Lemma 9, we obtain the parameterizations of $X_1,\dots,X_n$ in terms of $U$ up to order $R+1$ with space $O(nR_\ell e_\ell f_\ell)$ and time $\tilde O(nR_\ell e_\ell^2 f_\ell^2)$. Now, applying [33, Proposition 7] we obtain an approximation of $m_u^{(\ell)}$ with precision $R+\sigma+1$ in $Q((T))[Z]$ and parameterizations of $X_1,\dots,X_n$ in terms of $U$ up to order $R+\sigma+1$ with space and time
$$O\big(n e_\ell f_\ell(S_\ell(\sigma + \nu m_\ell e_\ell f_\ell) + R_\ell)\big) \quad\text{and}\quad \tilde O\big(n e_\ell f_\ell(T_\ell + n^4)(\sigma + \nu m_\ell e_\ell f_\ell + R_\ell)\big),$$
respectively. Since $m_u^{(\ell)}(T^{e_\ell}, Z)$ and the parameterizations of $X_1,\dots,X_n$ in terms of $U$ are elements of $Q((T^{e_\ell}))[Z]$, replacing $T^{e_\ell}$ by $E$ we obtain $\hat m_u^{(\ell)}$ and the parameterizations of $X_1,\dots,X_n$ in terms of $U$ up to order $(R+\sigma)/e_\ell + 1$.
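The $\log_2$-bounded iteration count in the proof comes from the quadratic convergence of Newton iteration over power series. The following sketch illustrates this doubling of precision on a toy univariate example of our own choosing (computing a square root in Q[[E]] with sympy); it is only an illustration of the general mechanism, not the operator $N_{G^{(\ell)}}$ above:

```python
# A minimal sketch of quadratic Newton-Hensel lifting: a simple root x0 of
# F(0, X) is lifted to a power-series root of F(E, X) modulo E^(2^k),
# the precision doubling at every step, so log2(precision) steps suffice.
import sympy as sp

E, X = sp.symbols('E X')

def newton_hensel_lift(F, x0, steps):
    """Lift x0 with F(0, x0) = 0 and dF/dX(0, x0) != 0 to precision E^(2^steps)."""
    x, prec = sp.Integer(x0), 1
    dF = sp.diff(F, X)
    for _ in range(steps):
        prec *= 2  # quadratic convergence: precision doubles each step
        # One Newton step, truncated modulo E^prec.
        x = sp.series(x - F.subs(X, x) / dF.subs(X, x), E, 0, prec).removeO()
        x = sp.expand(x)
    return x, prec

# Example: lift the root X = 1 of X^2 - (1 + E), i.e. compute sqrt(1 + E).
root, prec = newton_hensel_lift(X**2 - (1 + E), 1, 3)
residual = sp.expand(sp.series(root**2 - (1 + E), E, 0, prec).removeO())
print(residual)  # 0: the lift is correct modulo E^8
```

Three steps already give the expansion of $\sqrt{1+E}$ correct modulo $E^8$, which is the behaviour the complexity estimate of Proposition 11 exploits.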
A. Bompadre et al. / Theoretical Computer Science 315 (2004) 335 – 369
Adding the complexity of each step of our procedure, the proposition follows. □

Now we state the main result of this section:

Theorem 12. There exists a computation tree in $Q[E,X]$ which takes as input the straight-line program $\beta$ defining the polynomials $F_1,\dots,F_n$ and the given system of rational Puiseux expansions, and computes a geometric solution of $V$ with space and time
$$O\Big(n\sum_{\ell=1}^{g} e_\ell^2 f_\ell\big(S_\ell(\nu m_\ell f_\ell + 1) + R_\ell\big)\Big) \quad\text{and}\quad \tilde O\Big(\sum_{\ell=1}^{g} n e_\ell^2 f_\ell (T_\ell + n^4)\big(\delta + \nu m_\ell f_\ell + (R_\ell - 1)f_\ell\big)\Big),$$
respectively, where $R_\ell := (R - m_\ell)e_\ell f_\ell + 1$, $\nu := -1$ for $m_\ell < 0$ and $\nu := 0$ otherwise, and $S_\ell$, $T_\ell$ denote the space and time complexity required for the evaluation of the polynomials $G_1^{(\ell)},\dots,G_n^{(\ell)}$ of (28). Furthermore, for any $B \ge 2$, such a computation tree can be randomly constructed with a probability of success of at least $1 - 1/(2B) \ge \tfrac34$.

Proof. Let $U \in Q[X]$ be a generic linear form. Let us fix $B \ge 2$. Using the Zippel–Schwartz test (see [59,68]), we conclude that the coefficients of $U$ can be randomly chosen in the set $\{1,\dots,4BnD^2\}$ with a probability of success of at least $1 - 1/(2B) \ge 3/4$, where $D := \deg \pi$. Let $\delta := \deg V$. Applying Proposition 11 for $1 \le \ell \le g$ with $\sigma := \delta e_\ell - R$, we obtain elements $\hat m_u^{(\ell)}, \hat v_1^{(\ell)},\dots,\hat v_n^{(\ell)}$ ($1 \le \ell \le g$) of $Q(E)[Z]$ such that:
(1) $\hat m_u^{(\ell)}(E,Z) \equiv m_u^{(\ell)}(E,Z) \bmod (E^{\delta+1})$,
(2) $(\partial \hat m_u^{(\ell)}/\partial Z)(E,U)X_i \equiv \hat v_i^{(\ell)}(E,U) \bmod (E^{\delta+1}, m_u^{(\ell)}(E,U))$,
(3) $\deg_Z \hat m_u^{(\ell)} \le e_\ell f_\ell$ and $\deg_Z \hat v_i^{(\ell)} \le e_\ell f_\ell - 1$ for $1 \le i \le n$.
These polynomials can be computed with space and time
$$O\Big(n\sum_{\ell=1}^{g} e_\ell^2 f_\ell\big(S_\ell(\nu m_\ell f_\ell + 1) + R_\ell\big)\Big) \quad\text{and}\quad \tilde O\Big(\sum_{\ell=1}^{g} n e_\ell^2 f_\ell (T_\ell + n^4)\big(\delta + \nu m_\ell f_\ell + (R_\ell - 1)f_\ell\big)\Big).$$
Let $v_1,\dots,v_n$ be the elements of $Q(E)[Z]$ parameterizing $X_1,\dots,X_n$ in terms of the linear form $U$ in $V$, i.e., satisfying $(\partial m_u/\partial Z)(E,U)X_i \equiv v_i(E,U) \bmod I(V)$ for $1 \le i \le n$. From [58, Proposition 1] we see that the orders $o_E(m_u), o_E(v_1),\dots,o_E(v_n)$ are bounded from below by $-\delta$. Combining this observation with properties (1), (2), (3) we conclude that the following congruence relations hold in $Q((E))[Z]$:
$$\bar m_u := \prod_{\ell=1}^{g} \hat m_u^{(\ell)} \equiv m_u \bmod (E^{2\delta+1}), \qquad \bar v_i := \sum_{1 \le \ell \le g} \hat v_i^{(\ell)} \prod_{\ell' \ne \ell} \hat m_u^{(\ell')} \equiv v_i \bmod (E^{2\delta+1}).$$
Using fast procedures for multiplication and the Chinese Remainder Theorem (see e.g. [9]), we compute the polynomials $\bar m_u, \bar v_1,\dots,\bar v_n$ using space $O(n\delta D)$ and time $\tilde O(n\delta D)$. Taking into account the estimates
$$\deg_Z m_u = D, \quad \deg_Z v_i \le D - 1 \ (1 \le i \le n), \quad \deg_E m_u \le \delta, \quad \deg_E v_i \le \delta \ (1 \le i \le n)$$
(see [58]), we conclude that $m_u, v_1,\dots,v_n$ can be computed from the truncated Laurent series $\bar m_u, \bar v_1,\dots,\bar v_n$ using Padé approximants. More precisely, by interpolation in the variable $Z$ we reduce the computation of the polynomials $m_u, v_1,\dots,v_n$ to at most $(n+1)D$ problems of Padé approximation of degree at most $\delta$. Thus, using a fast algorithm for computing Padé approximations (see e.g. [9]), we conclude that the polynomials $m_u, v_1,\dots,v_n$ can be computed using space $O(n\delta D)$ and time $\tilde O(n\delta D)$. Adding the space and time complexity of each step of our procedure, we deduce the complexity estimate of the statement of Theorem 12. □

Let us make here a few remarks concerning the hypotheses and complexity estimates of Theorem 12. First we observe that the parameters $S_\ell$ and $T_\ell$ can be estimated by $O(S + n)$ and $\tilde O(T + nR_\ell)$, respectively, where $S$ and $T$ are the space and time complexity of the straight-line program computing $F_1,\dots,F_n$. Then we have the worst-case estimates $O(n^2 S D^4)$ and $\tilde O(n^4 T D^4)$ for the space and time complexity of the procedure underlying Theorem 12. Nevertheless, these estimates can be improved in several important cases, such as that with $R = m_\ell$ and $Q[E]_{(E)} \hookrightarrow Q[V]_{(E)}$ an integral extension. In this case, we have the estimates $O((S+n)nDe)$ and $\tilde O((T+n^4)nDe)$, respectively, with $e := \max\{e_\ell : 1 \le \ell \le g\}$ (see Sections 5.3 and 5.4). Theorem 12 generalizes the results of [39,58] in the unidimensional case. More precisely, in case the “known” fiber is unramified, our space and time complexity estimates are $O((n+S)nD)$ and $\tilde O((n^4+T)D)$, which improve the estimates of [39] and have the same asymptotic behaviour as those of [58].
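The recovery of $m_u, v_1,\dots,v_n$ from truncated series by Padé approximation can be sketched with the classical extended-Euclidean method. The helper below is our own textbook-style illustration in sympy, not the fast algorithm of [9]: a rational function $p/q$ with numerator and denominator degrees at most $\delta$ is recovered from its series modulo $E^{2\delta+1}$.

```python
# Pade approximation via the extended Euclidean algorithm: run Euclid on
# (E^(2*delta+1), S) and stop as soon as the remainder has degree <= delta;
# the remainder/cofactor pair is then the sought rational function p/q.
import sympy as sp

E = sp.symbols('E')

def pade_from_series(series_poly, delta):
    """Recover p/q, deg p <= delta, deg q <= delta, from a series mod E^(2*delta+1)."""
    r0, r1 = sp.Poly(E**(2 * delta + 1), E), sp.Poly(series_poly, E)
    t0, t1 = sp.Poly(0, E), sp.Poly(1, E)
    while r1.degree() > delta:          # half-extended Euclid, early stop
        q, r = sp.div(r0, r1)
        r0, r1 = r1, r
        t0, t1 = t1, t0 - q * t1
    return r1.as_expr() / t1.as_expr()  # p / q

# Example: the series of 1/(1 - E) truncated modulo E^3 (delta = 1).
approx = pade_from_series(1 + E + E**2, 1)
print(sp.simplify(approx - 1 / (1 - E)))  # 0: the rational function is recovered
```

Each of the at most $(n+1)D$ Padé problems mentioned above is an instance of this computation, performed on the coefficient series obtained by interpolation in $Z$.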
The algorithm underlying Theorem 12 proceeds by computing a suitable approximation of the factors $m_u^{(\ell)}$ of the minimal polynomial $m_u = \prod_{1\le\ell\le g} m_u^{(\ell)}$ of the linear form $U$. Observe that for $1 \le \ell \le g$ the polynomial $m_u^{(\ell)}$ is an irreducible polynomial of $Q((E))[Z]$ (see Section 3). In this sense, this algorithm constitutes an improvement of the refinements described in Section 3 of [39] (based on the factorization of the polynomial $m_u$ in $Q[E,Z]$). The singular parts (23) can be efficiently computed from the input polynomials $F_1,\dots,F_n$ and a geometric solution of an unramified fiber of the morphism $\pi$, by a suitable combination of the following algorithmic tools:
• A Newton polygon algorithm for computing the singular parts of a system of rational Puiseux expansions as in [23] or [66].
• A projection procedure for unramified fibers as in [58].
The asymptotic space and time complexity of such a procedure is roughly $\tilde O(D^4 + \varrho^2)$ and $\tilde O(D^8 + \varrho^2)$, respectively, where $\varrho$ denotes the geometric degree of the system $F_1,\dots,F_n$ (in the sense of [28]). Observe that the estimates $D \le \delta \le \varrho$ hold. Nevertheless,
as we are only interested in particular cases where the singular parts can be immediately generated (see Sections 5.3 and 5.4), we are not going to use this procedure.

5. Examples

In this section, we apply our algorithmic method in order to compute a geometric solution of certain zero-dimensional polynomial equation systems. In Section 5.1 we treat the case of Pham–Brieskorn systems. In Section 5.2 we treat a family of systems which arise from a semidiscretization of certain parabolic differential equations with nonlinear source terms and nonlinear boundary conditions. Finally, in Section 5.4 we treat a generalization of Reimer systems, which we call generalized Reimer systems. In all the above cases, we “deform” the polynomial equation system under consideration into a one-dimensional polynomial equation system satisfying the hypotheses of Theorem 7. Then the algorithm underlying the proof of Theorem 12 yields an efficient procedure to compute a geometric solution of the original zero-dimensional polynomial equation system.

5.1. Pham–Brieskorn systems

Let us fix $n, d \in \mathbb N$. Let $g_1,\dots,g_n \in Q[X] := Q[X_1,\dots,X_n]$ satisfy $\deg(g_i) < d$ and $g_i(0,\dots,0) \ne 0$ for $1 \le i \le n$. Let us define $f_1,\dots,f_n \in Q[X]$ by
$$f_1 := X_1^d - g_1,\ \dots,\ f_n := X_n^d - g_n. \qquad (29)$$
A system of this form is called a Pham–Brieskorn system (see e.g. [11,34,35,53]). It is easy to see that $f_1,\dots,f_n$ form a regular sequence of $Q[X]$ and generate a radical ideal of $Q[X]$. Therefore, $f_1,\dots,f_n$ define a zero-dimensional affine subvariety $\tilde V$ of $\mathbb A^n$. Our aim is to compute a geometric solution of this variety $\tilde V$. Let $E$ be an indeterminate over $Q$ and define $F_1,\dots,F_n \in Q[E,X]$ by
$$F_1 := X_1^d - E g_1,\ \dots,\ F_n := X_n^d - E g_n. \qquad (30)$$
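The effect of the deformation (30) can be made concrete on a small instance; in the sketch below (sympy, with an illustrative choice of the $g_i$ that is ours, not the paper's), the fiber $E = 1$ recovers the input system (29), while $E = 0$ degenerates to $X_i^d = 0$, whose only common root is the origin:

```python
# Sketch of the deformation (30) for a concrete Pham-Brieskorn system (29).
import sympy as sp

E, X1, X2 = sp.symbols('E X1 X2')
d = 3
g = [X2 + 1, X1 - 2]                      # deg(g_i) < d, g_i(0, 0) != 0
f = [X1**d - g[0], X2**d - g[1]]          # Pham-Brieskorn system (29)
F = [X1**d - E * g[0], X2**d - E * g[1]]  # deformed system (30)

# The fiber E = 1 is the input system; E = 0 gives X_i^d = 0, i.e. {0}.
assert all(sp.expand(Fi.subs(E, 1) - fi) == 0 for Fi, fi in zip(F, f))
assert [sp.expand(Fi.subs(E, 0)) for Fi in F] == [X1**d, X2**d]
print("deformation fibers check out")
```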
Let $V$ be the affine subvariety of $\mathbb A^{n+1}$ defined by the polynomials $F_1,\dots,F_n$, and let $\pi : V \to \mathbb A^1$ be the morphism defined by $\pi(\varepsilon, x) = \varepsilon$. We observe that $\pi^{-1}(1) = \{1\} \times \tilde V$ and $\pi^{-1}(0) = \{0\} \subset \mathbb A^{n+1}$ hold. In Section 5.3 we exhibit an algorithm which computes a geometric solution of the variety $V$. Furthermore, specializing the polynomials of $Q[E,X]$ which constitute this geometric solution to the value $E = 1$, we shall obtain a geometric solution of $\tilde V$.

5.2. Systems coming from a semidiscretization of certain parabolic differential equations

In this section, we consider a family of polynomial equation systems which arises in the analysis of the stationary solutions of a numerical approximation, obtained by
a semidiscretization in space, of certain parabolic differential equations with nonlinear source terms and nonlinear boundary conditions (see e.g. [13,25]). Let us fix $n, d \in \mathbb N$ with $d \ge 2$. Let $T$ be an indeterminate over $Q$, and let $g, h \in Q[T]\setminus\{0\}$ satisfy $\deg(g) < d$ and $\deg(h) = d$. Let us write $h = aT^d + h_1(T)$ with $a \ne 0$ and $\deg(h_1) < d$. Let $f_1,\dots,f_n$ be the polynomials of $Q[X] := Q[X_1,\dots,X_n]$ defined in the following way:
$$f_1 := 2(n-1)^2\big(X_2^d - X_1^d\big) - g(X_1),$$
$$f_i := (n-1)^2\big(X_{i+1}^d - 2X_i^d + X_{i-1}^d\big) - g(X_i) \quad (2 \le i \le n-1),$$
$$f_n := 2(n-1)^2\big(X_{n-1}^d - X_n^d\big) + 2(n-1)h(X_n) - g(X_n). \qquad (31)$$
An important case of study is that of the stationary solutions of the porous medium equation with nonlinear source terms and nonlinear boundary condition (see e.g. [17,41]). Typical discretizations of this problem lead, for example, to instances of system (31) with $h := T^d$ and $g := T$ (see e.g. [25]). Let $\tilde V$ be the affine subvariety of $\mathbb A^n$ defined by the polynomials $f_1,\dots,f_n$. Our aim is to exhibit an efficient algorithm which computes a geometric solution of the variety $\tilde V$. For this purpose, let $f := (f_1,\dots,f_n)$, $e_n := (0,\dots,0,1) \in Q^n$, $G := (g(X_1),\dots,g(X_n))$, and $X^d := (X_1^d,\dots,X_n^d)$. Let $A \in Q^{n\times n}$ be the following nonsingular tridiagonal matrix:
$$A := (n-1)^2 \begin{pmatrix} -2 & 2 & & & \\ 1 & -2 & 1 & & \\ & \ddots & \ddots & \ddots & \\ & & 1 & -2 & 1 \\ & & & 2 & -2 + \dfrac{2a}{n-1} \end{pmatrix}.$$
Then the polynomials $f_1,\dots,f_n$ can be expressed as
$$f^t = A\cdot(X^d)^t + 2(n-1)h_1(X_n)e_n^t - G^t, \qquad (32)$$
where $t$ denotes transposition. In order to solve the system defined by the polynomials in (32), we introduce a new indeterminate $E$ and consider the following polynomials of $Q[E,X]$:
$$(\tilde F_1,\dots,\tilde F_n)^t := A\cdot(X^d)^t + E\big(2(n-1)h_1(X_n)e_n^t - G^t\big) - 2(n-1)E(1-E)e_n^t. \qquad (33)$$
Let $V$ be the affine subvariety of $\mathbb A^{n+1}$ defined by the polynomials $\tilde F_1,\dots,\tilde F_n$ and let $\pi : V \to \mathbb A^1$ be the morphism defined by $\pi(\varepsilon, x) = \varepsilon$. We observe that $\pi^{-1}(1) = \{1\} \times \tilde V$ and $\pi^{-1}(0) = \{0\} \subset \mathbb A^{n+1}$. Since the matrix $A$ is nonsingular, multiplying both sides of (33) by $A^{-1}$ we obtain the following polynomials, whose zero set also defines the
variety $V$:
$$(F_1,\dots,F_n)^t := (X^d)^t + E A^{-1}\big(2(n-1)h_1(X_n)e_n^t - G^t\big) - E(E-1)v^t, \qquad (34)$$
where $v := \frac{n-1}{2a}(1,\dots,1)$. In Section 5.3 we exhibit an algorithm computing a geometric solution of the variety $V$. By specializing the polynomials of $Q[E,X]$ which constitute this geometric solution to the value $E = 1$, we shall obtain a geometric solution of our input variety $\tilde V$.

5.3. A common approach to both examples

In this section, we describe an algorithm which finds a geometric solution of the variety defined by any system of form (30) or (34). Then we specialize the polynomials of $Q[E,X]$ which form such a geometric solution to the value $E = 1$ in order to obtain a geometric solution of the variety defined by the corresponding system of form (29) or (31). Let us fix $n, d \in \mathbb N$. For $1 \le i \le n$, let $H_i \in Q[E,X]$ satisfy $\deg H_i \le d-1$ and $\gamma_i := H_i(0,0) \ne 0$. Suppose further that we are given a straight-line program computing the polynomials $H_1,\dots,H_n$ using space $S$ and time $T$. For $1 \le i \le n$, let us define $F_i \in Q[E,X]$ by the following expression:
$$F_i := X_i^d - E H_i(E, X). \qquad (35)$$
Let $I$ be the ideal of $Q[E,X]$ generated by $F_1,\dots,F_n$ and let $V$ be the affine subvariety of $\mathbb A^{n+1}$ defined by $I$. Let $\pi : V \to \mathbb A^1$ denote the restriction to $V$ of the canonical projection onto the first coordinate. Our purpose is to compute a geometric solution of $\{1\} \times \tilde V := \pi^{-1}(1)$. It is easy to see that any system of form (30) or (34) is a particular instance of a system of form (35). In order to apply our algorithmic method, we first show in Lemmas 13 and 14 below that the polynomials $F_1,\dots,F_n$ of (35) form a regular sequence of $Q[E,X]$, that the ideal $I \subset Q[E,X]$ they generate is radical, and that the morphism $\pi$ is finite and generically unramified.

Lemma 13. The polynomials $F_1,\dots,F_n$ form a regular sequence of $Q[E,X]$ and the morphism $\pi$ is finite.

Proof. From Buchberger's first criterion (see e.g. [7]), we conclude that for $1 \le i \le n$ the polynomials $F_1,\dots,F_i$ form a Gröbner basis of the ideal they generate with respect to the graded lexicographical order induced by the ordering $X_1 > \dots > X_n > E$. This implies that the affine variety of $\mathbb A^{n+1}$ defined by $F_1,\dots,F_i$ has codimension $i$ for $1 \le i \le n$. Then $F_1,\dots,F_n$ form a regular sequence of $Q[E,X]$. Furthermore, we observe that the leading monomial of $F_i$ under this order is $X_i^d$ for $1 \le i \le n$. Therefore, the set $\{X_1^{i_1}\cdots X_n^{i_n} : 0 \le i_1,\dots,i_n < d\}$ is a basis of $Q[E,X]/I$ as a $Q[E]$-module. This proves that $\pi$ is a finite morphism. □
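The Gröbner-basis argument can be checked on a tiny instance of (35), here with $n = d = 2$ and an illustrative choice of the $H_i$ (ours, for the sketch only): the leading monomials $X_i^d$ are pairwise coprime, so by Buchberger's first criterion the $F_i$ are already a (reduced) Gröbner basis.

```python
# Small instance of (35): F1, F2 are already a reduced Groebner basis for
# grlex with X1 > X2 > E, since their leading monomials X1^2, X2^2 are
# coprime; hence the monomials X1^i1 * X2^i2 with 0 <= i1, i2 < d give a
# Q[E]-module basis of Q[E, X]/I, as used in the proof of Lemma 13.
import sympy as sp

E, X1, X2 = sp.symbols('E X1 X2')
F1 = X1**2 - E * (X2 + 1)   # X1^d - E*H1 with H1 = X2 + 1
F2 = X2**2 - E * (X2 + 2)   # X2^d - E*H2 with H2 = X2 + 2

gb = sp.groebner([F1, F2], X1, X2, E, order='grlex')
assert set(gb.exprs) == {sp.expand(F1), sp.expand(F2)}
print("F1, F2 are already a reduced Groebner basis:", list(gb.exprs))
```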
For $1 \le i \le n$, let $G_i \in Q[E,X]$ be the following polynomial:
$$G_i(E,X) := E^{-d} F_i(E^d, EX).$$
Let $\tilde W \subset \mathbb A^{n+1}$ be the affine variety defined by $G_1,\dots,G_n$, and let $\tilde\pi : \tilde W \to \mathbb A^1$ be the morphism induced by the canonical projection onto the first coordinate. We claim that the morphism $\tilde\pi$ is generically unramified. Let us observe that for $\varepsilon \ne 0$ we have $\#(\tilde\pi^{-1}(\varepsilon)) = \#(\pi^{-1}(\varepsilon^d))$. Therefore, from the fact that the morphism $\pi$ is finite we easily conclude that $\tilde\pi$ is dominant and $\dim \tilde W \ge 1$ holds. Furthermore, from the fact that $Q(V)$ is a zero-dimensional $Q(E)$-algebra, we deduce that $Q(\tilde W)$ is also a zero-dimensional $Q(E)$-algebra. This shows that $\tilde W$ is a one-dimensional variety. Let us fix $\varepsilon \in \mathbb A^1$. Taking into account that $\deg_X G_i(\varepsilon, X) = d$ for $1 \le i \le n$, from the Bézout inequality (see [26,36]) we deduce that $\deg \tilde\pi^{-1}(\varepsilon) \le d^n$ holds. On the other hand, for $1 \le i \le n$ we have $G_i(0,X) = X_i^d - \gamma_i$, where $\gamma_i = H_i(0,0) \ne 0$. This implies that $\tilde\pi^{-1}(0)$ has cardinality $d^n$. We conclude that any generic fiber $\tilde\pi^{-1}(\varepsilon)$ has cardinality $d^n$.

Lemma 14. $I$ is a radical ideal and the morphism $\pi$ is generically unramified.

Proof. For a generic choice $\varepsilon \in \mathbb A^1$, we have $\#(\tilde\pi^{-1}(\varepsilon)) = \#(\pi^{-1}(\varepsilon^d)) = d^n$. This implies that there exists a fiber $\pi^{-1}(\varepsilon)$ of cardinality $d^n$. On the other hand, applying the Bézout inequality (see [26,36]) we see that $\#(\pi^{-1}(\varepsilon)) \le d^n$ holds for any $\varepsilon \in \mathbb A^1$. We conclude that $\#(\pi^{-1}(\varepsilon)) = d^n$ holds for any generic choice of the value $\varepsilon \in \mathbb A^1$. Let $\varepsilon$ be a generic element of $\mathbb A^1$. Then $\dim_{\mathbb C} \mathbb C[X]/(F_1(\varepsilon,X),\dots,F_n(\varepsilon,X)) = d^n = \deg \pi^{-1}(\varepsilon)$. This implies (see e.g. [20, Corollary 2.6]) that $\pi^{-1}(\varepsilon)$ is a smooth variety and that the polynomials $F_1(\varepsilon,X),\dots,F_n(\varepsilon,X)$ generate a radical ideal of $\mathbb C[X]$. In particular, the Jacobian determinant $J_F(\varepsilon,X) := \det(\partial F_i/\partial X_j)_{1\le i,j\le n}(\varepsilon,X)$ does not vanish at any point $x \in \mathbb A^n$ with $(\varepsilon,x) \in \pi^{-1}(\varepsilon)$. Thus $J_F(E,X)$ is not a zero divisor of $Q[E,X]/I$ and $\pi$ is generically unramified.
Finally, since $F_1,\dots,F_n$ form a regular sequence of $Q[E,X]$, from [24, Theorem 18.15] we deduce that the ideal $I$ is radical. □

Let us observe that the origin $0 \in \mathbb A^{n+1}$ is the only point of $\pi^{-1}(0)$. Therefore, there are $\deg(\pi) = d^n$ branches of the curve $V$ passing through $0 \in \mathbb A^{n+1}$. For $F \in Q[E,X]$, let us write $F(E^d, EX) = E^\alpha f(X) + O(E^{\alpha+1})$, with $f \ne 0$. We define the initial term of $F$ with respect to the weight $(d,1,\dots,1)$ as the polynomial $\mathrm{in}_d(F) := f$. Let $\mathrm{in}_d(I) \subset Q[X]$ be the ideal generated by the set $\{\mathrm{in}_d(F) : F \in I\}$ and let $W \subset \mathbb A^n$ be the affine variety defined by $\mathrm{in}_d(I)$.

Lemma 15. $W = V(X_1^d - \gamma_1,\dots,X_n^d - \gamma_n)$ and $G_1,\dots,G_n$ form a standard basis.

Proof. Let us observe that the set $\{\mathrm{in}_d(F) : F \in I\}$ is contained in the set of initial terms (in the sense of Section 3) of the polynomials of the ideal $(G_1,\dots,G_n)$. Let $F \in (G_1,\dots,G_n)$, and let us write $F = E^\alpha \tilde F(E,X)$, with $\alpha \ge 0$ and $\tilde F(0,X) \ne 0$.
Since $E$ is not a zero divisor of the $Q$-algebra $Q[E,X]/(G_1,\dots,G_n)$, we conclude that $\tilde F \in (G_1,\dots,G_n)$ holds. Then
$$\mathrm{in}_d(F) = \tilde F(0,X) \in (G_1(0,X),\dots,G_n(0,X)) = (X_1^d - \gamma_1,\dots,X_n^d - \gamma_n),$$
which implies that $\mathrm{in}_d(I) \subset (X_1^d - \gamma_1,\dots,X_n^d - \gamma_n)$ holds and $G_1,\dots,G_n$ form a standard basis. On the other hand,
$$(X_1^d - \gamma_1,\dots,X_n^d - \gamma_n) = (\mathrm{in}_d(F_1),\dots,\mathrm{in}_d(F_n)) \subset \mathrm{in}_d(I),$$
from which the statement of Lemma 15 follows. □

Since there are $d^n$ branches of $V$ lying above 0 and $\deg W = d^n$, we conclude that the system of (classical) Puiseux expansions of the branches of the curve $V$ lying above 0 has regularity index 1, and the singular parts of its expansions are represented by the points of $W$. Lemmas 14 and 15 show that the polynomials of (35) satisfy the hypotheses of Theorems 7 and 12. In order to apply the algorithm underlying Theorem 12 to our input system, we first need an explicit description of the set of singular parts of a system of rational Puiseux expansions of the branches of $V$ lying above 0. For this purpose, we observe that the set of singular parts is given by
$$\{(T^d,\ \zeta^{j_1}\gamma_1^{1/d}T,\dots,\zeta^{j_n}\gamma_n^{1/d}T) : 0 \le j_1,\dots,j_n < d\} \subset \overline{Q}[T]^{n+1},$$
where $\zeta \in \overline{Q}$ is a primitive $d$th root of 1 and $\gamma_1^{1/d},\dots,\gamma_n^{1/d} \in \overline{Q}$ are $d$th roots of $\gamma_1,\dots,\gamma_n$, respectively. Replacing $T$ by $\gamma_1^{-1/d}T$ we obtain the following system of rational Puiseux expansions of the branches of $V$ lying above 0:
$$\{(\gamma_1^{-1}T^d,\ T,\ \zeta^{j_2}\rho_2^{1/d}T,\dots,\zeta^{j_n}\rho_n^{1/d}T) : 0 \le j_2,\dots,j_n < d\} \subset \overline{Q}[T]^{n+1},$$
where $\rho_2^{1/d},\dots,\rho_n^{1/d} \in \overline{Q}$ are $d$th roots of $\rho_2 := \gamma_1^{-1}\gamma_2,\dots,\rho_n := \gamma_1^{-1}\gamma_n$, respectively. With the notations of Section 2.2, we have $g = 1$, $e_1 = d$, $f_1 = d^{n-1}$. Let $Y_2,\dots,Y_n$ be new indeterminates over $Q$.
Let
$$W_0 := \{(\zeta^{j_2}\rho_2^{1/d},\dots,\zeta^{j_n}\rho_n^{1/d}) : 0 \le j_2,\dots,j_n < d\} = V(Y_2^d - \rho_2,\dots,Y_n^d - \rho_n).$$
Then we see that a geometric solution of the variety $W_0$ yields the polynomials $q^{(1)}, f_2^{(1)},\dots,f_n^{(1)}$ required for the application of the algorithm of Theorem 12. Let $U := \lambda_2 Y_2 + \dots + \lambda_n Y_n$ be a linear form of $Q[Y_2,\dots,Y_n]$ inducing a primitive element of the $Q$-algebra extension $Q \hookrightarrow Q[W_0]$. Let us fix $B \ge 2$. Using the Zippel–Schwartz test (see [59,68]), we conclude that the coefficients of $U$ can be randomly chosen in the set $\{1,\dots,4Bnd^{2n-2}\}$ with a probability of success of at least $1 - 1/(2B) \ge \tfrac34$. We now describe an algorithm computing a geometric solution of $W_0$. Let $m_2,\dots,m_n \in Q[Z]$ be the sequence of polynomials defined recursively by
$$m_2 := Z^d - \rho_2, \qquad m_i := \mathrm{Res}_{\tilde Z}\big(\lambda_i^{-d}(Z - \tilde Z)^d - \rho_i,\ m_{i-1}(\tilde Z)\big) \quad (3 \le i \le n).$$
Then the polynomial $m_n$ equals (up to scaling by a nonzero element of $Q$) the minimal polynomial $q^{(1)} \in Q[Z]$ of the coordinate function induced by $U$ in the $Q$-algebra extension $Q \hookrightarrow Q[W_0]$. Combining fast algorithms for the computation of univariate resultants (see e.g. [46]) and univariate interpolation (see e.g. [9]) as in e.g. [33], we conclude that $q^{(1)}$ can be computed in space $O(d^{2n-2})$ and time $\tilde O(d^{2n-2})$. Combining this algorithm and the formulae of e.g. [2] or [56] as in the proof of Lemma 9, we obtain a geometric solution of $W_0$ with space $O(nd^{2n-2})$ and time $\tilde O(d^{2n-2})$. Finally, applying Theorem 12 we obtain the following result:

Theorem 16. There exists a computation tree computing a geometric solution of the variety $V$ with space $O(nSd^{2n})$ and time $\tilde O(Td^{2n})$.

The geometric solution provided by Theorem 16 consists of a randomly chosen linear form $U \in Q[X]$ and polynomials $m_u, v_1,\dots,v_n \in Q[E,Z]$. Suppose that $U$ is also a primitive element of the original variety $\{1\} \times \tilde V = V \cap (\{1\} \times \mathbb A^n)$. Specializing $m_u, v_1,\dots,v_n$ to the value $E = 1$, we obtain polynomials $m_u(1,Z), v_1(1,Z),\dots,v_n(1,Z)$ defining a (possibly nonreduced) Shape-Lemma-like representation of $\tilde V$. Therefore, computing a square-free representation of $m_u(1,Z)$ and cleaning the multiple factors of the polynomial $m_u(1,Z)$ out of $v_1(1,Z),\dots,v_n(1,Z)$, we obtain a geometric solution of $\tilde V$ within the same complexity estimate (see [33] for details). This result improves the $\tilde O(3^n d^{2n})$ time-complexity estimate of [50]. Let us also mention the results of [51], where the authors announce an $\tilde O(d^{2n})$ time-complexity estimate for approximating one root of a Pham system. Comparing our result with the $\tilde O(Td^{2n-1})$ time-complexity estimate provided by the application of the algorithm of [33] to this case, we see that the performance of [33] is better.
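The flavor of the resultant recursion $m_2,\dots,m_n$ can be conveyed by a tiny instance of our own choosing (this is a sketch, not the paper's general routine): for $W_0 = V(Y_2^2 - 2, Y_3^2 - 3)$ and $U = Y_2 + Y_3$, two iterated univariate resultants produce the minimal polynomial of $\sqrt 2 + \sqrt 3$.

```python
# Minimal polynomial of a linear form on W0 = V(Y2^2 - 2, Y3^2 - 3) via
# iterated univariate resultants, eliminating Y3 and then Y2.
import sympy as sp

Z, Y2, Y3 = sp.symbols('Z Y2 Y3')

step = sp.resultant(Y3**2 - 3, Z - Y2 - Y3, Y3)  # (Z - Y2)^2 - 3
m = sp.resultant(Y2**2 - 2, step, Y2)            # eliminate Y2 as well

print(sp.expand(m))  # Z**4 - 10*Z**2 + 1
```

The output is indeed the degree-4 minimal polynomial of $\sqrt 2 + \sqrt 3$, matching the cardinality $d^{n-1} \cdot d = 4$ of $W_0$-fibers in this toy case.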
Nevertheless, let us observe that the leading term $d^{2n}$ of our time-complexity estimate can be expressed as $\delta \deg_E m_u$, and we are dealing in this case with an “ill-conditioned” system, for which the worst-case estimates $\delta = d^n$ and $\deg_E m_u = d^n$ hold. If the input system satisfies $\deg_E m_u \ll d^n$, then the performance of [33] does not change, whereas in our time-complexity estimate the $d^{2n}$ factor reduces accordingly. Furthermore, if $\deg_E m_u = 1$, we achieve the lower bound $d^n$ of this factor (see [16]).

5.4. Reimer systems

In this section, we consider another family of examples called (generalized) Reimer systems (compare [8]). Let us fix $n \in \mathbb N$, and let us define $f_1,\dots,f_n \in Q[X] := Q[X_1,\dots,X_n]$ in the following way:
$$f_i := b_i + \sum_{j=1}^{n} a_{i,j} X_j^{i+1}, \qquad (36)$$
where the $a_{i,j}, b_i$ ($1 \le i, j \le n$) are generic elements of $Q$ (see Lemma 17 below) with $b_i, a_{i,i} \ne 0$ for $1 \le i \le n$. Let $\tilde V$ be the affine subvariety of $\mathbb A^n$ defined by $f_1,\dots,f_n$. Our purpose is to compute a geometric solution of $\tilde V$. Our next result shows that $\tilde V$ has dimension zero and degree $(n+1)!$.
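For a quick sanity check of this degree count, consider a small instance of (36) with $n = 2$ and illustrative coefficients of our own choosing: eliminating $X_1$ by a univariate resultant (a Bézout-style shortcut, not the method used below) leaves an eliminant of degree $(n+1)! = 6$ in $X_2$.

```python
# Degree count for a generalized Reimer system (36) with n = 2: the degrees
# of f1, f2 are 2 and 3, so the eliminant has degree 2 * 3 = 3! = (n + 1)!.
import sympy as sp
from math import factorial

X1, X2 = sp.symbols('X1 X2')
n = 2
f1 = 1 + X1**2 + 2 * X2**2   # b1 + a11*X1^2 + a12*X2^2
f2 = 1 + X1**3 + 3 * X2**3   # b2 + a21*X1^3 + a22*X2^3

r = sp.resultant(f1, f2, X1)  # eliminant in X2
print(sp.degree(r, X2))       # 6
assert sp.degree(r, X2) == factorial(n + 1)
```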
Lemma 17. Let $U := (U_{i,j})_{1\le i,j\le n}$ be a matrix of indeterminates and let $H_1,\dots,H_n$ be the elements of $Q[U,X]$ defined in the following way:
$$H_i := b_i + \sum_{j=1}^{n} U_{i,j} X_j^{i+1}.$$
Then there exists a nonempty Zariski open set $\mathcal U \subset \mathbb A^{n^2}$ with the following property: for any $u \in \mathcal U$, the affine subvariety of $\mathbb A^n$ defined by the polynomials $H_1(u,X),\dots,H_n(u,X)$ has dimension 0 and degree $(n+1)!$.

Proof. Let $\mathcal Z$ be the affine variety of $\mathbb A^{n^2+n}$ defined by $H_1,\dots,H_n$ and let $\pi_U : \mathcal Z \to \mathbb A^{n^2}$ be the morphism defined by $\pi_U(u,x) = u$. Let $\mathfrak p$ be the prime ideal of $Q[U]$ generated by the set $\{U_{i,j} : 1 \le i,j \le n,\ i \ne j\}$. We claim that $H_1,\dots,H_n$ form a regular sequence of $Q[U]_{\mathfrak p}[X]$. In order to prove this claim, following [38], we define a “triangular” sequence $(R_j^{(i)})_{1\le i\le n-1,\ i+1\le j\le n} \subset Q[U,X]$ in the following way:
• $R_j^{(1)} := \mathrm{Res}_{X_1}(H_1, H_j)$ for $j = 2,\dots,n$.
• $R_j^{(i)} := \mathrm{Res}_{X_i}(R_i^{(i-1)}, R_j^{(i-1)})$ for $2 \le i \le n-1$ and $i+1 \le j \le n$.
From elementary properties of the resultant we see that $R_i^{(i-1)}$ is a nonzero element of $Q[U, X_i,\dots,X_n] \cap (H_1,\dots,H_i)$, with $\deg_X R_i^{(i-1)} = \deg_{X_i} R_i^{(i-1)}$. Furthermore, a recursive argument shows that the coefficient of the highest power of $X_i$ occurring in $R_i^{(i-1)}$ does not belong to the prime ideal $\mathfrak p$. We conclude that $H_1,\dots,H_i$ define an ideal of $Q[U]_{\mathfrak p}[X]$ of Krull dimension $n - i$. This implies that $H_1,\dots,H_n$ form a regular sequence of $Q[U]_{\mathfrak p}[X]$. Furthermore, the polynomial $R_n^{(n-1)}$ gives an integral dependence equation for the coordinate class of $X_n$ in the ring $Q[U]_{\mathfrak p}[X_1,\dots,X_n]/(H_1,\dots,H_n)$ over the ring $Q[U]_{\mathfrak p}$. Then a recursive argument with the polynomials $R_i^{(i-1)}$ for $1 \le i \le n$ shows that
$$Q[U]_{\mathfrak p} \hookrightarrow Q[U]_{\mathfrak p}[X_1,\dots,X_n]/(H_1,\dots,H_n) \qquad (37)$$
is an integral $Q$-algebra extension.
We conclude that there exists a Zariski neighborhood $\tilde{\mathcal U} \subset \mathbb A^{n^2}$ of $V(\mathfrak p)$ such that $\pi_U|_{\mathcal Z \cap (\tilde{\mathcal U} \times \mathbb A^n)} : \mathcal Z \cap (\tilde{\mathcal U} \times \mathbb A^n) \to \tilde{\mathcal U}$ is a finite morphism and $\mathcal Z \cap (\tilde{\mathcal U} \times \mathbb A^n)$ is an equidimensional variety of dimension $n^2$. This shows that for any choice of $u \in \tilde{\mathcal U}$ the variety $\mathcal Z \cap \{U = u\} = \pi_U^{-1}(u)$ has dimension 0. Now we show the existence of the Zariski open set $\mathcal U \subset \tilde{\mathcal U}$ of the statement of the lemma. First, we observe that the Bézout inequality (see [26,36]) implies $\deg(\pi_U^{-1}(u)) \le (n+1)!$ for any $u \in \tilde{\mathcal U}$. On the other hand, for any nonsingular diagonal matrix $u^{(0)} \in \tilde{\mathcal U}$ we have $\deg(\pi_U^{-1}(u^{(0)})) = (n+1)!$. We conclude that there exists a non-empty Zariski open set $\mathcal U \subset \tilde{\mathcal U}$ such that $\deg(\pi_U^{-1}(u)) = (n+1)!$ holds for any $u \in \mathcal U$. Let us observe that for any $u \in \mathcal U$ we have that $\mathbb C[X]/(H_1(u,X),\dots,H_n(u,X))$ is a finite-dimensional $\mathbb C$-vector space of dimension at most $(n+1)!$. On the other hand, we have $\#(\pi_U^{-1}(u)) = (n+1)!$. We conclude that the polynomials $H_1(u,X),\dots,H_n(u,X)$
generate a radical zero-dimensional ideal of $\mathbb C[X]$, and hence the Jacobian determinant $J_H(u,X) := \det(\partial H_i/\partial X_j)_{1\le i,j\le n}(u,X)$ does not vanish at any point $x$ with $(u,x) \in \pi_U^{-1}(u)$. This implies that $J_H$ does not vanish at any point of $\mathcal Z \cap (\mathcal U \times \mathbb A^n)$. □

In order to solve a system of form (36) with $a := (a_{i,j})_{1\le i,j\le n} \in \mathcal U$, let us introduce an indeterminate $E$ over $Q$ and the following elements of $Q[E,X]$:
$$F_i := b_i E^{i+1} + a_{i,i} X_i^{i+1} + \sum_{\substack{1 \le j \le n \\ j \ne i}} a_{i,j} E X_j^{i+1} \quad (1 \le i \le n). \qquad (38)$$
Let $V$ be the affine subvariety of $\mathbb A^{n+1}$ defined by $F_1,\dots,F_n$ and let $\pi : V \to \mathbb A^1$ be the morphism defined by $\pi(\varepsilon,x) := \varepsilon$. We have $\pi^{-1}(1) = \{1\} \times \tilde V$ and $\pi^{-1}(0) = \{0\} \subset \mathbb A^{n+1}$. We are going to show that $F_1,\dots,F_n$ form a regular sequence of $Q[E]_{(E)}[X]$ and generate a radical ideal of $Q[E]_{(E)}[X]$, and that the morphism $\pi$ is dominant and generically unramified. For this purpose, let us define $G_1,\dots,G_n \in Q[E,X]$ in the following way:
$$G_i := E^{-(i+1)} F_i(E, EX) = b_i + \sum_{j=1}^{n} g_{i,j} X_j^{i+1},$$
where $g_{i,j} := a_{i,j}E$ for $i \ne j$ and $g_{i,i} := a_{i,i}$. Let $\tilde W$ be the affine subvariety of $\mathbb A^{n+1}$ defined by $G_1,\dots,G_n$, and let $\tilde\pi : \tilde W \to \mathbb A^1$ be the morphism defined by $\tilde\pi(\varepsilon,x) = \varepsilon$. Observe that $g(1) \in \mathcal U$ holds, where $\mathcal U \subset \mathbb A^{n^2}$ is the Zariski open set of the statement of Lemma 17. Therefore, for a generic choice $\varepsilon \in \mathbb A^1$ we have $g(\varepsilon) \in \mathcal U$. Taking into account the remarks after the proof of Lemma 17, we conclude that $\tilde\pi$ is dominant and generically unramified. Finally, since $\#(\tilde\pi^{-1}(\varepsilon)) = \#(\pi^{-1}(\varepsilon))$ holds for any $\varepsilon \ne 0$, we deduce the following result:

Lemma 18. The morphism $\pi$ is dominant and generically unramified.

On the other hand, we have the following result:

Lemma 19. $F_1,\dots,F_n$ form a regular sequence in $Q[E]_{(E)}[X]$ and generate a radical ideal of $Q[E]_{(E)}[X]$.

Proof. For $1 \le i \le n$, let $\hat F_i \in Q[E, X_0,\dots,X_n]$ denote the homogenization of the polynomial $F_i$ with respect to the variables $X$. We have $\hat F_i \equiv a_{i,i}X_i^{i+1} \bmod (E)$. Following [38], we define the following “triangular” sequence $(\hat R_j^{(i)})_{1\le i\le n-1,\ i+1\le j\le n}$ of $Q[E, X_0, X]$:
• $\hat R_j^{(1)} := \mathrm{Res}_{X_1}(\hat F_1, \hat F_j)$ for $j = 2,\dots,n$.
• $\hat R_j^{(i)} := \mathrm{Res}_{X_i}(\hat R_i^{(i-1)}, \hat R_j^{(i-1)})$ for $2 \le i \le n-1$ and $i+1 \le j \le n$.
From the elementary properties of the resultant we deduce that $\hat R_j^{(i)}$ is a homogeneous polynomial of $(\hat F_1,\dots,\hat F_j) \cap Q[E, X_0, X_{i+1},\dots,X_n]$. Furthermore, taking into account the congruence relation $\hat F_i \equiv a_{i,i}X_i^{i+1} \bmod (E)$, a simple recursive argument shows that $\hat R_i^{(i-1)} \equiv c_i X_i^{m_i} \bmod (E)$ holds for suitable $c_i \in Q\setminus\{0\}$ and $m_i \in \mathbb N$. This shows that
the coefficient of $X_i^{m_i}$ in $\hat R_i^{(i-1)}$ does not belong to the prime ideal $(E) \subset Q[E]$. Specializing the variable $X_0$ to the value $X_0 = 1$, with a similar argument as in the proof of Lemma 17 we conclude that $F_1,\dots,F_n$ form a regular sequence of $Q[E]_{(E)}[X]$ and that $Q[E]_{(E)} \hookrightarrow Q[E]_{(E)}[X]/(F_1,\dots,F_n)$ is an integral $Q$-algebra extension.
Finally, since $F_1,\dots,F_n$ form a regular sequence of $Q[E]_{(E)}[X]$ and the morphism $\pi$ is generically unramified, applying [24, Theorem 18.15] as in Lemma 14 we conclude that the ideal generated by $F_1,\dots,F_n$ in $Q[E]_{(E)}[X]$ is radical. □

Let us observe that the origin $0 \in \mathbb A^{n+1}$ is the only point of $\pi^{-1}(0)$. Therefore, there are $\deg(\pi) = (n+1)!$ branches of $V$ passing through $0 \in \mathbb C^{n+1}$. For any $F \in Q[E]_{(E)}[X]$, let us write $F(E, EX) = E^\alpha \tilde F(E,X)$ with $\tilde F \in Q[E]_{(E)}[X] \setminus (E)Q[E]_{(E)}[X]$. We define the initial term of $F$ with respect to the weight $(1,\dots,1)$ as $\mathrm{in}_1(F) := \tilde F(0,X)$. Let $I$ be the ideal of $Q[E]_{(E)}[X]$ generated by $F_1,\dots,F_n$, and let $\mathrm{in}_1(I) \subset Q[X]$ be the ideal generated by the set $\{\mathrm{in}_1(F) : F \in I\}$. Let $W := V(\mathrm{in}_1(I)) \subset \mathbb A^n$.

Lemma 20. $W = V(a_{1,1}X_1^2 + b_1,\dots,a_{n,n}X_n^{n+1} + b_n)$ and $G_1,\dots,G_n$ form a standard basis in $Q[E]_{(E)}[X]$.

Proof. Let us observe that the set $\{\mathrm{in}_1(F) : F \in I\}$ is contained in the set of initial terms (in the sense of Section 3) of the polynomials of the ideal $(G_1,\dots,G_n) \subset Q[E]_{(E)}[X]$. Let $F \in (G_1,\dots,G_n)$ and write $F = E^\alpha \tilde F(E,X)$, with $\alpha \ge 0$ and $\tilde F(0,X) \ne 0$. Since $E$ is not a zero divisor of the $Q$-algebra $Q[E]_{(E)}[X]/(G_1,\dots,G_n)$, we conclude that $\tilde F \in (G_1,\dots,G_n)$ holds. Then
$$\mathrm{in}_1(F) = \tilde F(0,X) \in (G_1(0,X),\dots,G_n(0,X)) = (a_{1,1}X_1^2 + b_1,\dots,a_{n,n}X_n^{n+1} + b_n),$$
which implies that $\mathrm{in}_1(I) \subset (a_{1,1}X_1^2 + b_1,\dots,a_{n,n}X_n^{n+1} + b_n)$ holds and $G_1,\dots,G_n$ form a standard basis. On the other hand, we have the inclusion
$$(a_{1,1}X_1^2 + b_1,\dots,a_{n,n}X_n^{n+1} + b_n) = (\mathrm{in}_1(F_1),\dots,\mathrm{in}_1(F_n)) \subset \mathrm{in}_1(I),$$
from which the lemma follows. □

Since there are $(n+1)!$ branches of $V$ lying above 0 and $\deg W = (n+1)!$, we conclude that the system of (classical) Puiseux expansions of the branches of the curve $V$ lying above 0 has regularity index 1, and the singular parts of its expansions are represented by the points of $W$. Lemmas 18–20 show that the polynomials $F_1,\dots,F_n$ of (38) satisfy all the hypotheses of Theorems 7 and 12 (see the remark right before Section 3.1). In order to apply the algorithm underlying Theorem 12 we need a description of the singular parts of the branches of $V$ lying above 0. A similar argument as in Section 5.3 shows that, with the notations of Section 2.2, $g = 1$, $e_1 = 1$ and $f_1 = (n+1)!$ in this case. Hence we have that a geometric solution of the variety $W$ yields the polynomials $q^{(1)}, f_1^{(1)},\dots,f_n^{(1)}$
required for the application of Theorem 12. Such a geometric solution can be obtained in space O(n(n + 1)!2 ) and time O˜((n + 1)!2 ), using a similar algorithm to that of Section 5.3. Finally, applying Theorem 12 we obtain the following result: Theorem 21. There exists a straight-line program computing a geometric solution of the variety V with space O(n(n + 1)!2 ) and time O˜((n + 1)!2 ). In order to obtain a geometric solution of the variety {1} × V˜ = −1 (1) from the geometric solution of V provided by Theorem 21, we proceed in a similar way as in Section 5.3 (see the remarks after Theorem 16). Acknowledgements The authors wish to thank Joos Heintz for his helpful remarks. They are also grateful to the anonymous referees for many suggestions which helped to signi(cantly improve the presentation of the results of this paper. G. Matera and R. Wachenchauzer thank the Departamento de ComputaciQon, Universidad Favaloro, where they did part of this work. References [1] E. Allgower, K. Georg, Numerical Continuation Methods: An Introduction, Springer Series in Computational Mathematics, Vol. 13, Springer, Heidelberg, 1990. [2] M. Alonso, E. Becker, M.-F. Roy, T. WXormann, Zeroes, multiplicities and idempotents for zerodimensional systems, in: L. GonzQalez-Vega, T. Recio (Eds.), Algorithms in Algebraic Geometry and Applications Progress in Mathematics, Vol. 143, BirkhXauser, Basel, 1996, pp. 1–15. [3] M. Alonso, G. Niesi, T. Mora, M. Raimondo, Local parametrizations of space curves at singular points, in: I. Herman, C. Pienovi (Eds.), Computer Graphics and Mathematics, Springer Eurographic Seminar Series, Springer, Berlin Heidelberg, New York, 1991, pp. 61–90. [4] M. Alonso, G. Niesi, T. Mora, M. Raimondo, An algorithm for computing analytic branches of space curves at singular points, in: W.-T. Wu, M.-D. Cheng (Eds.), Proc. 1992 International Workshop on Mathematical Mechanization, Int. Academic Pub., 1992, pp. 135–166. [5] M. 
Theoretical Computer Science 315 (2004) 371 – 404
www.elsevier.com/locate/tcs
Approximating shortest path for the skew lines problem in time doubly logarithmic in 1/epsilon

D. Burago^a,1, D. Grigoriev^b,*,2, A. Slissenko^c,3

^a Department of Mathematics, Penn State University, University Park, PA 16802, USA
^b Institut Mathématique de Rennes, Université Rennes 1, Beaulieu 35042, Rennes, France
^c Laboratory for Algorithmics, Complexity and Logic, Department of Informatics, University Paris 12, 61 Av. du Gén. de Gaulle, 94010 Créteil, France
Abstract

We consider two three-dimensional situations when a polytime algorithm for approximating a shortest path can be constructed. The main part of the paper treats a well-known problem of constructing a shortest path touching lines in R³: given a list of straight lines L = (L₁, …, Lₙ) in R³ and two points s and t, find a shortest path that, starting from s, touches the lines Lᵢ in the given order and ends at t. We remark that such a shortest path is unique. We show that it can be length–position ε-approximated (i.e. both its length and its position can be found approximately) in time (Rn/(d̃ α̃))¹⁶ + O(n² log log(1/ε)), where d̃ is the minimal distance between consecutive lines of L, α̃ is the minimum of sines of angles between consecutive lines, and R is the radius of a ball where the initial approximation can be placed (such a radius can be easily computed from the initial data). As computational model we take the real RAM extended by square and cubic root extraction. This problem of constructing a shortest path touching lines has been known for quite some time to be a challenging problem. The existing methods for approximating shortest paths, based on adding Steiner points which form a grid and subsequently applying Dijkstra's algorithm for finding a shortest path in the grid, provide a complexity bound which depends polynomially on 1/ε, while our algorithm for the problem under consideration has complexity linear in log log(1/ε). Our algorithm is motivated by the observation that the shortest path in question is a geodesic in a certain length space of non-positive curvature (in the sense of A.D. Alexandrov), and it relies on the (elementary) theory of CAT(0)-spaces.

∗ Corresponding author.
E-mail addresses: [email protected] (D. Burago), [email protected] (D. Grigoriev), [email protected] (A. Slissenko).
1 Partially supported by an Alfred P. Sloan Fellowship and NSF Grant DMS-9803129.
2 Member of St-Petersburg Steklov Mathematical Institute, Russian Academy of Sciences, St-Petersburg, Russia.
3 Member of St-Petersburg Institute for Informatics, Russian Academy of Sciences, St-Petersburg, Russia.

© 2004 Elsevier B.V. All rights reserved.
0304-3975/$ - see front matter doi:10.1016/j.tcs.2004.01.014
In the second part of the paper we analyze very simple grid approximations. We assume that a parameter a > 0 describing separability of obstacles is given and the part of a grid with mesh size a outside the obstacles is built (for semi-algebraic obstacles all these pre-calculations are polytime). We show that there is an algorithm of time complexity O((1/a)⁶) which, given a-separated obstacles in a unit cube, finds a path (between given vertices s and t of the grid) whose length is bounded from above by 84|γ∗| + 96a, where |γ∗| is the length of a shortest path. On the other hand, as we show by an example, one cannot approximate the length of a shortest path better than 7|γ∗| if one uses only grid polygons (constructed only from grid edges). For semi-algebraic obstacles our computational model is bitwise. For a general type of obstacles the model is bitwise modulo constructing the part of the grid admissible for our paths. Observe that the existing methods for approximating shortest paths are not directly applicable to semi-algebraic obstacles, since they usually place the Steiner points forming a grid on the edges of polyhedral obstacles.

Keywords: Shortest path; Skew lines problem
1. Introduction

We consider two three-dimensional situations when a polytime algorithm for approximating a shortest path can be constructed. The main part of the paper concerns a known problem of constructing a shortest path touching lines in R³ in a specified order: given a list of straight lines L = (L₁, …, Lₙ) in three-dimensional space and two points s and t, find a shortest path that starts from s, touches the lines Lᵢ in the given order and ends at t. We will call this problem the skew lines problem (cf. [22]). The second situation is, in a way, simpler: we look for a length approximation of a shortest path (i.e. for a path such that its length, but not necessarily its position, is close to the length of shortest paths) amidst obstacles that are separated.

1.1. Related results

The general problem of constructing a shortest path, or even a 'sufficiently good approximation' of such a path, is well known [18]. It was shown to be NP-hard even for convex polyhedral obstacles [9] (by "polyhedral obstacles" we mean unions of polyhedra). The skew lines problem in R³ is also well known. This problem is mentioned in [22] as, presumably, representing the basic difficulties in constructing shortest paths in three-dimensional Euclidean space. The paper [22] states "For example, there is no efficient algorithm known for finding the shortest path touching several such lines in a specified order", where "such lines" means "skew lines in three-dimensional space". The same problem of finding a shortest path touching straight lines was mentioned in a survey [23] as presumably difficult, though a possibility of using numerical methods was mentioned without any elaboration. In the problem of approximation of shortest paths we distinguish two types of approximations: length approximation and length–position approximation.
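Once the order of the lines is fixed, the skew lines problem amounts to minimizing a smooth function of n real variables: choose a parameter tᵢ for the touch point on each line Lᵢ and measure the resulting polygonal path. A minimal sketch of this objective (the helper names are ours, not the paper's notation):

```python
import math

def path_length(s, t, lines, params):
    """Length of the polygonal path s -> P_1 -> ... -> P_n -> t, where
    P_i = a_i + params[i] * u_i is the touch point chosen on the i-th line.
    `lines` is a list of (a_i, u_i) pairs: a point on the line and a
    direction vector; all points are 3-tuples."""
    def shift(p, v, c):
        return tuple(pi + c * vi for pi, vi in zip(p, v))
    def dist(p, q):
        return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))
    pts = [s] + [shift(a, u, ti) for (a, u), ti in zip(lines, params)] + [t]
    return sum(dist(pts[i], pts[i + 1]) for i in range(len(pts) - 1))
```

The paper's results say this function has a unique, strict minimizer and bound how fast numerical methods reach it.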
Length approximation means that, given an ε, we look for a path whose length is ε-close to the length of a shortest path (ε-close may mean either additive or multiplicative error). Position approximation presumes that the path we wish to construct is in a given neighborhood of a shortest path. An ε-neighborhood of a path γ is the union of balls of radius ε centered at the points of γ. To construct a length–position approximation means that, given an ε, we look for a path that is in an ε-neighborhood of a shortest path and whose length is ε-close to the length of a shortest path. In [9] one can find the following result that can be related to the complexity of position approximation: for the case of polyhedral obstacles, finding the sequence of edges touched by a shortest path is NP-hard. In the same article [9] it is proven that for polyhedral obstacles with description size N, determining O(√N) bits of the length of a shortest path is also NP-hard. Concerning weaker approximations, for the case of polyhedral obstacles, paper [22] describes an algorithm that finds a path whose length is at most (1 + ε) times the length of a shortest path, and whose running time is polynomial in the size of the input data and 1/ε. Considerable gaps in this paper have been fixed in [11], and the complexity of the algorithm of [11] is roughly O(n⁴N/ε), where n is the total number of edges in the polyhedra. Further, another method was exhibited in [12] with a complexity bound depending on n and on ε as O(n²/ε⁴). An algorithm with complexity bound O(n⁴/ε⁶) is designed in [15] which, in addition, constructs for a fixed source s a preprocessed data structure allowing one to compute subsequently an approximation for any target t within the complexity bound O(log(n/ε)). For the easier problem of constructing a shortest path on a given polyhedral surface one can find a shortest path exactly.
First, an algorithm with complexity O(n³ log n) was exhibited in [24] in the case of a convex surface; thereupon it was generalized in [19] to arbitrary polyhedral surfaces, with complexity O(n² log n). Finally, the latter algorithm was improved in [10], providing complexity O(n²). To improve on the quadratic complexity, several algorithms which give ε-approximations of a shortest path were produced: in [1] with complexity bound O(n + 1/ε³) in the case of a convex surface, then in [3] with complexity roughly O(n/ε) in the case of a simple polyhedral surface. In [15], for a fixed s, a preprocessed data structure is constructed with complexity O(n/ε³) in the case of a convex surface and with complexity O(n² + n/ε) in the case of an arbitrary polyhedral surface, respectively, which allows one for any target t to output an approximation within complexity O(log(n/ε)). All the mentioned approximation results use an approach that places Steiner points on the edges of the polyhedral obstacles to form a grid and then finds a shortest path in this grid by applying Dijkstra's algorithm [2]. Since the number of the placed Steiner points depends polynomially on n and on 1/ε, the complexity of this approach depends on n and on 1/ε as well. The approach of [20] to solve the weighted region problem gives an algorithm whose running time depends polynomially on log(1/ε). Combining binary search with local optimization as in [20] one may probably find an approximation for the skew lines problem; however, in any case, our algorithm is exponentially faster in 1/ε (and our text is shorter than that of [20]).
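All the Steiner-point methods cited above share the same search step: once the points are placed and mutually visible ones are joined by edges weighted with Euclidean distances, a single-source shortest path is computed. A generic sketch of that step (the graph construction, where the per-method geometry lives, is elided):

```python
import heapq

def dijkstra(adj, src):
    """Shortest-path distances from src in a weighted graph given as
    {node: [(neighbor, weight), ...]}. The geometry of each cited method
    enters only through how the Steiner points (nodes) and visibility
    edges (weights) are generated; the search itself is always this."""
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist
```

Since the number of nodes grows polynomially in 1/ε, so does the total running time — which is exactly the dependence the paper's algorithm avoids.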
1.2. On computational models

In the present paper we also study the problem of approximating shortest paths in R³, trying to improve the computational complexity in some aspect. Even in simple two-dimensional situations, exact constructions of shortest paths involve substantial extensions of the computational model by operations like square root extraction or more powerful ones, see e.g. [13,14,17]. Note that a shortest path in R³ between two rational points, with even one cylindrical obstacle, may have transcendental length and necessitate using transcendental points in its construction [16]. The algorithm we propose for length–position approximation of the shortest path touching lines is based on methods of steepest descent. Each iteration of such a method involves measuring distances between points and other algebraic but not rational operations. If one uses a bitwise computational model, the error analysis becomes difficult and demands considerable extra work. So, to give a solution to the skew lines problem, we exploit here an "extended" computational model over the reals, which we make precise below. As mentioned above, in this paper we present two results: the first treats the skew lines problem, for which we give a length–position approximation. The second is an analysis of the complexity of a length approximation produced by a simple grid method for finding a path amidst separated semi-algebraic obstacles. Though the obtained estimate shows that the quality of approximation is rather low, the simplicity of the method makes it worthy of analysis. For our main result, concerning a length–position approximation of the shortest path touching lines in R³, our computational model is real RAMs [6] with square and cubic root extraction. This model extends (by also allowing the extraction of cubic roots) the common model used (implicitly) in [1,3,10,12,15,19,20], which admits rational operations and extracting square roots.
We mention also that in [11,22] a bitwise model was used which takes into account the bit complexity. Our second result uses the bitwise model modulo constructing a grid; the latter can be done in polytime for semi-algebraic obstacles.

1.3. Our results and methods we use

We start with a remark that the shortest path γ∗ that we seek is unique. We show that this path γ∗ can be length–position ε-approximated in time (Rn/(d̃ α̃))¹⁶ + O(n² log log(1/ε)), where d̃ is the minimal distance between consecutive lines of L, α̃ is the minimum of sines of angles between consecutive lines, and R is the radius of a ball where the initial approximation can be placed (we have this formula under the condition that d̃ ≤ 1; this does not diminish the generality; without this condition one must take the maximum of (35) and (39)). Such a satisfactory radius R can be easily computed from the initial data: R ∈ [√n |γ∗|, 2n^{3/2} |γ∗|], where |γ∗| is the length of γ∗ (estimate (3) in Section 2). Observe that when α̃ = 0 or d̃ = 0 it could happen that the gradient descent and Newton's method, which are invoked in our algorithm, do not converge. (There are ideas of how to treat these particular cases, but they need another, geometrically more 'invariant' approach.)
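The first of the two numerical stages can be illustrated on a toy one-line instance. The sketch below is a plain gradient descent with numerical gradients, standing in for the paper's quantitatively analyzed descent (the subsequent Newton polishing and the convexity-based step bounds are the paper's actual contribution); all names and constants here are our own choices:

```python
import math

def descend(f, x0, step=0.05, iters=2000, h=1e-6):
    """Plain gradient descent with forward-difference gradients on a
    convex function f: R^n -> R."""
    x = list(x0)
    for _ in range(iters):
        fx = f(x)
        grad = []
        for i in range(len(x)):
            y = list(x)
            y[i] += h
            grad.append((f(y) - fx) / h)
        x = [xi - step * gi for xi, gi in zip(x, grad)]
    return x

# Toy instance: s = (0,0,0), t = (4,0,0); one line {(2, 1, z) : z in R}.
# Path length as a function of the single touch parameter z:
def length(x):
    z = x[0]
    return (math.sqrt(4 + 1 + z * z)      # |s P|, with P = (2, 1, z)
            + math.sqrt(4 + 1 + z * z))   # |P t|, t = (4, 0, 0)
```

By symmetry the minimum is at z = 0, and the descent converges to it; a lower bound on the Hessian eigenvalues (Proposition 2 in the paper) is what turns such convergence into an explicit iteration count.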
The algorithm is based on the following geometric idea, which shows that there is a unique minimum for the problem in question, and that it is a strict minimum (the latter is quantified by means of a convexity estimate). Given a sequence of lines, one can consider a sequence of copies of Euclidean space and glue them into a "chain" by attaching "neighboring" spaces along the corresponding line. The resulting (singular) space happens to be of non-positive curvature (see [4,5] and Section 2.1 below). Now the shortest path we want to construct is a geodesic in this new space, and this immediately implies its uniqueness (by the Cartan–Hadamard Theorem [4]). Furthermore, convexity comparisons for the distance functions in non-positively curved spaces allow us to estimate the rate of convergence of gradient descent methods. This approach is somewhat similar to applications of Alexandrov geometry to certain problems concerning hard ball gas models and other semi-dispersing billiard systems, see [7,8]. To avoid excessive use of Alexandrov geometry, we formulate the mentioned convexity property in terms of the Hessian of the distance function and prove it by direct computation. The exposed geometric idea helps us to achieve the principal feature of our algorithm, namely that its complexity depends linearly on log log(1/ε) rather than polynomially on 1/ε as in the existing algorithms from [1,3,11,12,15,22] based on the grid method (at the expense of a worse dependency on n and on other geometric parameters). It would be interesting to describe more situations when a better dependency of the complexity on 1/ε (logarithmic, cf. [20], or even doubly logarithmic) would be possible simultaneously with a better dependency on geometric parameters. However, our algorithm does not allow one to solve the skew lines problem for bitwise models of computation. The latter would require, in particular, estimating the bitsize of the output data.
For bitwise models one can consider the following two settings.

SkewLines Bitwise Exact: Given a list of n lines and two points s and t in R³, find n algebraic numbers on the consecutive lines such that the path going through these points is the shortest path between s and t touching the lines.

SkewLines Bitwise Approximate: Given a list of n lines, two points s and t in R³, and an ε, find an ε-approximation to the shortest path between s and t touching the lines.

We believe that our result is a step towards solving the SkewLines Bitwise Approximate problem, for which it would be interesting to design an algorithm with complexity polynomial in log(1/ε). Very likely the method we use can be generalized to find a shortest path touching cylinders in R³ in a prescribed order. It is straightforward (a rather general case of the three-dimensional TSP with Neighborhoods [18]) that if the order of touching the lines is not prescribed, the problem of finding a shortest path becomes NP-hard. Indeed, take the lines to be parallel and place the points s and t on a plane orthogonal to these lines. Then the problem of finding a shortest path touching all the lines is equivalent to the Euclidean traveling salesman problem in which it is necessary to find a shortest path between s and t passing through all the intersection points of the lines with the plane, which is known to be NP-complete [21]. We can slightly incline the lines if we wish not to have parallel ones. Our second result is an analysis of very simple grid approximations. The question concerning possibilities of 'generalized' grid approximations is mentioned in the
conclusion of [11]. The matter is that the results of [11,22] are based on such a method. The initial idea used in [11,22] is straightforward: partition the edges of the polyhedra that constitute the obstacles into sufficiently small segments and take them as the vertices of a visibility graph. Then determine whether two such segments are mutually visible, in which case connect them by edges. It is not a simple task to implement this idea correctly in the bitwise computational model because of the necessity to 'approximate approximations', and this was the main source of gaps in [22]. The primitive grid method we use avoids such difficulties even for semi-algebraic obstacles. But one cannot get approximations that are, in the worst case, better than 7|γ∗|, where |γ∗| is the length of the shortest path. More precisely, our result is as follows. When considering grid approximations we assume that a parameter a > 0 describing the separability of obstacles and the part of the grid admissible for the paths are given, and that s and t are vertices of the grid lying in the space admissible for the paths. We show that there is an algorithm that in the bitwise machine model has running time O((1/a)⁶) and that, for given a-separated obstacles in a unit cube, finds a path (between given points s and t) whose length is bounded from above by 84|γ∗| + 96a, where |γ∗| is the length of a shortest path. On the other hand, we show by an example that one cannot approximate the length of shortest paths better than 7|γ∗|. We conjecture that 7 is the exact bound for the three-dimensional case for the method we consider. We do not discuss in detail how to construct the parameter a and the part of the grid admissible for the paths. This depends on the type of obstacles. For semi-algebraic obstacles in R³ it can be done (precisely, in terms of algebraic numbers) in polytime by a bitwise machine. After that, the construction of a shortest path approximation deals only with natural numbers of modest size.
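A minimal version of this grid search, assuming the admissible part of the grid has already been computed and encoding its complement as a set of blocked vertices (the names and the unit-weight Dijkstra over the six axis neighbours are our choices, not the paper's code):

```python
import heapq

def grid_path_length(a_inv, blocked, s, t):
    """Shortest path between grid vertices s and t of the mesh-size-a grid
    on the unit cube, moving along grid edges only and avoiding `blocked`
    vertices. a_inv = 1/a; vertices are integer triples in {0..a_inv}^3;
    the returned length is (number of edges) * a, or None if unreachable."""
    a = 1.0 / a_inv
    if s in blocked or t in blocked:
        return None
    dist = {s: 0}
    heap = [(0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == t:
            return d * a
        if d > dist.get(u, 10**9):
            continue  # stale heap entry
        x, y, z = u
        for dx, dy, dz in ((1, 0, 0), (-1, 0, 0), (0, 1, 0),
                           (0, -1, 0), (0, 0, 1), (0, 0, -1)):
            v = (x + dx, y + dy, z + dz)
            if all(0 <= c <= a_inv for c in v) and v not in blocked:
                if d + 1 < dist.get(v, 10**9):
                    dist[v] = d + 1
                    heapq.heappush(heap, (d + 1, v))
    return None
```

The grid has O((1/a)³) vertices, so the search itself is polynomial in 1/a; the paper's contribution is the 84|γ∗| + 96a upper bound and the factor-7 lower bound for paths restricted to such grid edges.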
As we mentioned above, the methods of placing Steiner points and forming a grid developed in [1,3,11,12,15,22] cannot be directly applied to semi-algebraic obstacles, since they place Steiner points just on the edges of polyhedral obstacles. On the other hand, in our simple grid method the dependency on the complexity of the obstacles is reduced to the construction of the grid; after that, the complexity of the algorithm depends only on the parameter a of the obstacles. This is in accordance with the proposal expressed in [3]: "…while studying the performance of geometric algorithms, geometric parameters (e.g. fatness, density, aspect ratio, longest, closest) should not be ignored…".

1.4. Structure of the paper

The paper is divided into two parts. Its structure is as follows. The first and main part of the paper deals with shortest paths touching lines. In Section 2 we give basic properties of the spaces under consideration; in particular, we explain what geometric properties ensure the uniqueness of the shortest path. Our main theorem is formulated in Section 2.5. In fact, this part is not needed for the proof, as we reduce the problem to a problem of computing the minimum of a strictly convex function of many variables on a compact space. However, these geometric considerations, which are easy to apply, show the road to take, and they are very instructive indeed for the problem of shortest path, which has rather varied technical contexts and rarely
has a unique solution. Section 3 contains the main technical estimates. The central point is to bound from below the eigenvalues of the Hessian of the path length function (Section 3.2, Proposition 2). This bound is crucial for the estimation of the complexity of the gradient descent. The latter provides an initial approximation for the application of Newton's method, which follows the gradient descent. Section 4 gives our approximation algorithm. It starts with the construction of an initial approximation, then applies gradient descent and after that Newton's method. In the last section of Part 1, Section 5, we estimate the complexity of the algorithm. Part 2 presents our grid algorithm. The definition of separability is given in Section 6. The same section (Proposition 7) gives a lower bound for the probability of getting well-separated obstacles when randomly choosing balls as obstacles. Then in Section 7 we present the algorithm that constructs a length approximation.

Part 1: Shortest paths touching lines

2. Basic properties of shortest paths touching lines

Our construction is based on the following initial observation: we are seeking a shortest path in a simply connected metric space of non-positive curvature. This observation guarantees the uniqueness of the shortest path; moreover, it suggests that the distance functions in the space enjoy certain convexity properties, which will actually permit us to use the method of steepest descent to reach an approximate position of the shortest path. As we mentioned in the Introduction, formally speaking we can omit these considerations and go directly to the estimates of the convexity of the length function within our parametrization of paths. However, these geometric arguments are simple and direct, and can be used in other problems of construction of shortest paths. We remark that usually the uniqueness of the shortest path does not hold even though the metric function is quite good (e.g. a sphere as an obstacle in R³).
First, we recall some known facts about spaces of non-positive curvature.

2.1. On spaces of non-positive curvature

Let M be a simply connected metric space. Denote by $|XY|_M$, or by $|XY|$ if M is clear from the context, the distance between points X and Y in M. A path in M between two points A and B is a continuous mapping $\varphi$ of $[0,1]$ into M such that $\varphi(0)=A$ and $\varphi(1)=B$. The length of a path $\varphi$ can be defined as the limit of the sums $\sum_{0\le i\le n-1} |\varphi(i/n)\,\varphi((i+1)/n)|$ as $n\to\infty$. A shortest path between two points is a path connecting these points and having the smallest length. We suppose that for any two points of M there exists a shortest path between them. By the angle between two intervals going out of the same point on the plane $\mathbb{R}^2$ we mean the (nondirected) angle in $[0,\pi]$. Given three points A, B and C, the angle between the intervals AB and AC will be denoted by $\angle BAC$ or $\angle CAB$, the angle between two intervals $\sigma$ and $\sigma'$ will be denoted by $\angle\sigma\sigma'$, and the angle between two vectors will
Fig. 1. Comparing triangles in M and on the plane.
be treated as the angle between two intervals obtained from the vectors by translating them to the same origin. If different spaces are considered in the same context, the angle will be denoted by $\angle_M$. Take any three points A, B and C in M (Fig. 1). Let $\gamma_{AB}$ and $\gamma_{AC}$ be shortest paths respectively between A and B, and between A and C. On these paths we take two arbitrary points $\gamma_{AB}(t)$ and $\gamma_{AC}(\tau)$, where t and $\tau$ are the values of the parameters determining these points. Then we choose any point $A'$ in $\mathbb{R}^2$ and take in $\mathbb{R}^2$ two points $B'_t$ and $C'_\tau$ that satisfy the equalities
$$|A'B'_t|_{\mathbb{R}^2} = |\gamma_{AB}([0,t])|_M,\qquad |A'C'_\tau|_{\mathbb{R}^2} = |\gamma_{AC}([0,\tau])|_M,\qquad |B'_t C'_\tau|_{\mathbb{R}^2} = |\gamma_{AB}(t)\,\gamma_{AC}(\tau)|_M.$$
We are interested in the triangle $A'B'_tC'_\tau$ in the plane $\mathbb{R}^2$. Denote by $\alpha(t,\tau)$ the angle between the intervals $A'B'_t$ and $A'C'_\tau$ in $\mathbb{R}^2$. The angle between $\gamma_{AB}$ and $\gamma_{AC}$ at A is defined as $\lim_{t,\tau\to 0}\alpha(t,\tau)$, if the latter exists. Define the angle at A to B and C, or shorter $\angle_M(BAC)$, as the supremum of the angles at A between a shortest path from A to B and a shortest path from A to C. Consider again three points A, B and C in M. Draw in $\mathbb{R}^2$ a triangle $A'B'C'$ determined by the following conditions:
$$|A'B'|_{\mathbb{R}^2} = |AB|_M,\qquad |A'C'|_{\mathbb{R}^2} = |AC|_M \qquad\text{and}\qquad \angle_M(BAC) = \angle_{\mathbb{R}^2}(B'A'C').$$
By definition, the space M is of non-positive curvature if the angle $\angle_M(BAC)$ exists and $|B'C'|_{\mathbb{R}^2} \le |BC|_M$ for any three points A, B and C of M. The following is known (see, for instance, [4]).

Uniqueness of the shortest path: In any simply connected space of non-positive curvature, there is a unique shortest path between any two points.

2.2. The configuration space $R_L$
The shortest paths touching the lines of L in the given order are polygonal chains (broken lines). The space where we are looking for such a shortest broken line can be obtained by the following operation. For each pair $(L_i, L_{i+1})$ take a copy $R_i$ of $\mathbb{R}^3$. Glue consecutively $R_i$ and $R_{i+1}$ along the line $L_{i+1}$. Denote the obtained space by $R_L$. Each space $R_i$ is a space of non-positive curvature, and they are glued along isometric convex sets. Reshetnyak's theorem (see [5]) implies that the resulting space $R_L$ is of non-positive curvature.

The next two paragraphs are not necessary to describe the main algorithm, but rather clarify some geometric background behind it. Similarly to nonpositively curved spaces, one can consider spaces with other curvature bounds (say, spaces with curvature bounded above by $-1$). They feature even stronger convexity of distance functions, and algorithms for constructing shortest paths converge in such spaces even faster. Shortest paths in such spaces have remarkable trapping properties, as asserted by the classic Morse Lemma. The spaces constructed in this paper by gluing several copies of Euclidean space along a number of lines are nonpositively curved. Of course, they are not spaces of curvature bounded above by any negative number, for they have flat parts (regions in the Euclidean spaces used for this construction). However, if one considers two points A, B in different copies of Euclidean space such that the shortest path between these points meets at least two lines used to glue the spaces together, and the gluing lines are not parallel, then the distance function $|A\,\cdot\,|$ restricted to a neighborhood of B has the same convexity properties as if the space were of curvature bounded above by a negative constant. This allows us to treat such shortest paths as if they were in a space with a negative upper curvature bound.
Note that it is easy to see geometrically where this negative curvature comes from if the gluing lines are skew lines, and what happens if the gluing lines are parallel. First assume that the gluing lines are parallel. Consider the following plane lying in our space and passing through A and B. It is made of three parts: a half-plane containing A and bounded by the first gluing line (this part lies in the same copy of Euclidean space as A), then a strip between the two gluing lines (it lies in the copy of Euclidean space which the shortest path connecting A and B passes through), and finally a half-plane containing B and bounded by the second gluing line. These three parts indeed form a plane (a surface isometric to a plane) lying in our space totally geodesically. It contains A and B, and hence the convexity properties of the distance function $|A\,\cdot\,|$ restricted to a neighborhood of B are no better than in a flat space. To see where the negative curvature comes from in the case of skew lines, consider a short Euclidean segment containing B and parallel to the second gluing line. Connect all points of this segment with A. If the gluing lines were parallel, we would get just a part of the plane discussed above (more precisely, a triangular region). However, in the case of skew lines the family of segments connecting A with the segment forms a non-flat surface. This surface has two flat parts: a triangle with a vertex at A, and a trapezoid (one of the bases of the trapezoid is the segment, and the other one lies in the gluing line). However, the third part of this surface is a ruled surface connecting two segments on the gluing lines, and this is indeed a negatively curved surface.
A similar construction can be used to obtain a space of non-positive curvature in which to seek shortest paths consecutively touching convex cylinders $Z = (Z_1,\dots,Z_n)$ in $\mathbb{R}^N$. In this case the gluings are made along the corresponding cylinders. Consider the shortest path problem in $R_L$, where two points s and t are fixed. From the uniqueness of the shortest path between two points in a simply connected nonpositively curved space we immediately conclude:

Claim 1. There is a unique shortest path between any two given points in $R_L$; hence a shortest path touching the lines in a prescribed order and connecting two given points is unique.

Furthermore, a standard distance comparison argument with a Euclidean development of a broken line representing an approximate solution implies that length-approximation guarantees position-approximation (this is proven in detail in Lemma 4, Section 4.1):

Claim 2. If the length of the actual shortest path is $L$, every path whose length is less than $L+\varepsilon$ belongs to the $\sqrt{\varepsilon L}/2$-neighborhood of the actual shortest path.

To simplify our constructions we will assume (see the Assumption in Section 2.5) that the consecutive lines of L are pairwise disjoint. The case of intersecting consecutive lines needs particular attention when our approximation to the shortest path is close to a point of intersection of lines. By a path in $R_L$ we mean a polygon P consisting of $n+1$ links (straight-line intervals), denoted by $P_i$, $1\le i\le n+1$, connecting s with t via the consecutive lines, i.e. the link $P_1$ connects s with $L_1$, the link $P_{i+1}$ connects $L_i$ with $L_{i+1}$ ($1\le i\le n-1$), and the link $P_{n+1}$ connects $L_n$ with t. When speaking about the order of points on a path we mean the order corresponding to going along the path starting from s. Thus any link of a path is directed from s to t. Notice that for a shortest path P the angle between $P_i$ (incident ray) and $L_i$ must be equal to the angle between $L_i$ and $P_{i+1}$ (emergent ray).
Clearly, the incident angle being fixed, the emergent rays constitute a cone (which may have up to two intersection points with $L_{i+1}$; this means that a geodesic in our configuration space can branch, having two different continuations after passing through a line $L_i$).

2.3. Technical notations

Throughout the text we use some notations that we summarize here; the reader can consult the list when coming across a new notation. The list is divided into four parts: notations concerning vectors, lengths, angles and paths. Some more local notation will appear later.

Notation 1 (Vectors, lines, points).
• $\langle U,V\rangle$ is the scalar product of vectors U and V.
• $|U|$ or $\|U\|$ is the $L_2$-norm of a vector U (the length of U).
• $\|A\|$, where A is a matrix, is the spectral norm of A. For positive definite symmetric matrices (our main case) $\|A\| = \sup\{\langle AU,U\rangle : \|U\|=1\}$.
• $L = (L_1,\dots,L_n)$ is the list of straight lines in $\mathbb{R}^3$ we consider, and s and t are respectively the first and the last point to connect by a polygon touching the lines of L in the order of the list.
• $\omega_i^0$ is a fixed vector determining a point on $L_i$. We use it as the origin of coordinates on $L_i$.
• $\omega_i$ is a unit vector defining the direction on $L_i$.
• $T = (t_1,\dots,t_n)$ is a list of n reals that serve as coordinates respectively on $L_1,\dots,L_n$. Our applications of the gradient and Newton methods take place in the space $\mathbb{R}^n$ of such points T.
• To make the notations more uniform we assume that $\omega_0^0 = s$, $\omega_{n+1}^0 = t$, and that always $t_0 = t_{n+1} = 0$.
• $W_i(t_i) =_{df} \omega_i^0 + t_i\omega_i$; clearly, it is a point on $L_i$ determined by a single real parameter $t_i$. Note that $W_0(t_0) = s$ and $W_{n+1}(t_{n+1}) = t$.
• $V_i = V_i(t_i, t_{i+1}) =_{df} W_{i+1}(t_{i+1}) - W_i(t_i)$ is the vector connecting $W_i$ with $W_{i+1}$ (in this order), $0\le i\le n$.
• $D_0$ is a vector from s to $L_1$ perpendicular to $L_1$, $D_n$ is a vector from $L_n$ to t perpendicular to $L_n$, and $D_i$ is a vector connecting $L_i$ and $L_{i+1}$ and perpendicular to both of them.
• For a directed interval $\sigma$ in $\mathbb{R}^3$ we denote by $\sigma^-$ and $\sigma^+$ its beginning and its end, respectively.

Notation 2 (Lengths).
• $\rho$ will be used to denote various $L_2$-distances; e.g. $\rho(s, L_i)$ will denote the distance between s and $L_i$.
• $v_i =_{df} |V_i|$ is the length of $V_i$, $0\le i\le n$.
• $d_i$ is the square of the distance between $L_i$ and $L_{i+1}$ for $1\le i\le n-1$, i.e. $d_i = \rho(L_i, L_{i+1})^2$; $d_0 =_{df} \rho(s, L_1)^2$, $d_n =_{df} \rho(L_n, t)^2$.
• $d =_{df} \min\{d_i : 0\le i\le n\}$.
• $\tilde d =_{df} +\sqrt{d}$. To simplify formulas we assume that $\tilde d\le 1$; we do not lose generality, as we can change the coordinates in an appropriate way, and within our model of computation this change does not affect the complexity.
In the proofs we will also give formulas without this assumption.

Notation 3 (Angles).
• $\angle ABC$, where A, B and C are points, is the angle at B in the triangle determined by these three points. In fact, we will consider angles in $\mathbb{R}^3$, though formally speaking we are in the configuration space $R_L$ defined in Section 2.2.
• $\angle VV'$, where V and $V'$ are vectors, is the angle in $[0,\pi]$ between these vectors.
• $\lambda_i =_{df} (\sin\angle\omega_i\omega_{i+1})^2$, $1\le i\le n-1$; $\lambda =_{df} \min\{\lambda_i : 1\le i\le n-1\}$.
• $\tilde\lambda =_{df} +\sqrt{\lambda}$.
• $\mu_i =_{df} (\cos\angle\omega_i\omega_{i+1})^2$, $\tilde\mu_i =_{df} \cos\angle\omega_i\omega_{i+1}$, $1\le i\le n-1$.
Notation 4 (Paths).
• A path $\gamma(T)$ is determined by a list of reals T that gives the consecutive points $W_i(t_i)$ of $\gamma(T)$ on the lines $L_i$ connected by this path. Thus, $W_0(t_0)=s$, $W_i(t_i)=\gamma(T)\cap L_i$ for $1\le i\le n$ and $W_{n+1}(t_{n+1})=t$, where $W_i$, $t_i$ and $L_i$ are defined in Notation 1.
• By $P_i$ for a path P we denote the ith link of P. We consider $P_i$ as an interval directed from $L_i$ to $L_{i+1}$, $0\le i\le n$. Regarded as a vector, it coincides with $V_i$.
• $\gamma^*$ is the shortest path between s and t.
• $T^* = (t_1^*,\dots,t_n^*)$ is the point in $\mathbb{R}^n$ defining $\gamma^*$, i.e. $\gamma^* = \gamma(T^*)$.
• $\gamma^0$ is the initial approximation of $\gamma^*$ defined below in Section 2.4.
• $T^0$ is the point in $\mathbb{R}^n$ defining $\gamma^0$, i.e. $\gamma^0 = \gamma(T^0)$.
• $r =_{df} |\gamma^0|$ is the length of the initial approximation. It will be clear from its construction that $\tilde d < r$. Moreover, without loss of generality, just to simplify some formulas that will appear in the estimations of complexity, we suppose that $r\ge 1$.
• $R =_{df} r\sqrt{n}$ is the radius of a ball in $\mathbb{R}^n$ that will contain all paths in question.
• B is the closed ball in $\mathbb{R}^n$ of radius R centered at $T^0$; it contains all paths under consideration.

2.4. Initial approximation for the shortest path

Denote by $P_i^0$ the base of the perpendicular from s to $L_i$. Take as the initial approximation to the shortest path the polygon $\gamma^0$ that starts at s, then goes to $P_1^0$, then to $P_2^0$ and so on to $P_n^0$ and finally to t. Clearly $\gamma^0\subseteq B$ and
$$|P_i^0 P_{i+1}^0| \le |sP_i^0| + |sP_{i+1}^0| \quad\text{for } 1\le i\le n-1. \tag{1}$$
The length of $\gamma^0$ can be bounded in terms of the length of the shortest path as
$$|\gamma^*| \le |\gamma^0| = r \le 2\sum_{i=1}^{n-1} \rho(s, L_i) + |st| \le 2n|\gamma^*|. \tag{2}$$
The first inequality follows from the fact that the length of any path connecting s and t and touching the lines from L is greater than or equal to $|\gamma^*|$, the second inequality is implied by (1), and the third one is due to the fact that the distance from s to any line (as well as $|st|$) is not greater than $|\gamma^*|$. For R, the estimates (2) give
$$\sqrt{n}\,|\gamma^*| \le R \le 2n^{3/2}|\gamma^*|. \tag{3}$$
We remark that if $L_i = \{W_i^0 + t\omega_i : t\in\mathbb{R}\}$ and s is taken as the origin, then $P_i^0 = W_i^0 + t_i^0\omega_i$ is determined by $\langle P_i^0, \omega_i\rangle = 0$, and thus
$$P_i^0 = W_i^0 - \langle W_i^0, \omega_i\rangle\,\omega_i. \tag{4}$$
The parameter $R = |\gamma^0|\sqrt{n}$ enters our estimation of complexity, and one may ask what the dependence of R on the length of the input is. Suppose that the input lines are represented by $W_i^0$ and $\omega_i$, and the maximum of the lengths of the involved numbers is $\beta$. Then the length of the input is $O(n\beta)$. It is clear that in the worst case the value R is exponential in this input length. But it can just as well be linear, logarithmic and so on. The precise bound on R is an open question.
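The construction of $\gamma^0$ from formula (4) is straightforward to implement. Below is a minimal Python sketch (the function and variable names are ours, not the paper's); it handles a general s by projecting $s - W_i^0$ onto $\omega_i$, which reduces to (4) when s is the origin.

```python
import math

def dot(u, v): return sum(a * b for a, b in zip(u, v))

def foot_of_perpendicular(s, w0, w):
    # Foot of the perpendicular from s onto the line {w0 + t*w : t real},
    # w a unit direction vector; this is formula (4) when s is the origin.
    t = dot([a - b for a, b in zip(s, w0)], w)
    return tuple(a + t * b for a, b in zip(w0, w))

def initial_path(s, t, lines):
    # lines is a list of pairs (w0_i, w_i); the result is the polygon
    # gamma^0 = s, P_1^0, ..., P_n^0, t of Section 2.4.
    return [s] + [foot_of_perpendicular(s, w0, w) for (w0, w) in lines] + [t]

def path_length(points):
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))
```

On concrete data one can then check the bound (2): the length of the polygon returned by `initial_path` does not exceed twice the sum of the distances from s to the lines plus $|st|$.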
2.5. Main Theorem

Hereafter we suppose that

Assumption. $\tilde\lambda > 0$ and $\tilde d > 0$.

The first inequality says that there are no consecutive parallel lines among the $L_i$, and the second one says that the consecutive lines do not intersect and the points s and t are not on the lines. Now we can formulate our main result:

Main Theorem. There is an algorithm that, given $\varepsilon$, $1\ge\varepsilon>0$, constructs an $\varepsilon$-approximation of the shortest path touching the lines of L in the given order. Its complexity is bounded by a polynomial in n, R, $1/\tilde d$ and $1/\tilde\lambda$ (coming from the gradient descent), plus $O\!\left(n^2\log\log\frac{1}{\varepsilon}\right)$ for the concluding Newton phase, where R, $\tilde d$ and $\tilde\lambda$ are defined in Notations 2-4 of Section 2.3.
3. Bounds on eigenvalues of the Hessian of the path length

This section contains technical bounds on the norms of the derivatives of the path-length function, in particular the main estimate, namely, a lower bound on the eigenvalues of the Hessian ('second derivative') of the path length. The fact that the length function is 'at least as convex as in Euclidean space' follows from the curvature bound. We will make a direct computation though, due to the fact that we use a particular coordinate system. Consider a path $\gamma(T) = \gamma(t_1,\dots,t_n)$ represented by the points $W_i$, $0\le i\le n+1$ (Notation 4 from Section 2.3).

Notations 5.
• $f(T) =_{df} |\gamma| = \sum_{i=0}^{n} v_i$ is the length of $\gamma(T)$, where $v_i$ is defined in Notation 2.
• $g(T) =_{df} f'(T) =_{df} \operatorname{grad} f(T) = \left(\frac{\partial f}{\partial t_1}(T),\dots,\frac{\partial f}{\partial t_n}(T)\right)$ is its gradient (first derivative).
Fig. 2. The Hessian $\Phi$ of the path length.
Let us begin with the formula for g, that is, the first variation of the length. For $1\le i\le n$ we set
$$g_i = \frac{\partial f}{\partial t_i} = \frac{\partial(v_{i-1}+v_i)}{\partial t_i} = \frac{\partial v_{i-1}}{\partial t_i} + \frac{\partial v_i}{\partial t_i} = \frac{\langle V_{i-1}, \omega_i\rangle}{v_{i-1}} - \frac{\langle V_i, \omega_i\rangle}{v_i} \tag{5}$$
and
$$g = \big(g_1(t_1,t_2),\ g_2(t_1,t_2,t_3),\ \dots,\ g_{n-1}(t_{n-2},t_{n-1},t_n),\ g_n(t_{n-1},t_n)\big) \tag{6}$$
(which stresses the variables on which each component of g depends).

3.1. The second variation formula for the path length

The Hessian of $|\gamma|$, which we will denote $\Phi = \Phi(T)$ (the Jacobian of $|\gamma|'$), looks as shown in Fig. 2. The matrix $\Phi$ is symmetric and 3-diagonal with positive diagonal entries. Here are the formulas for the derivatives involved in $\Phi$, in arbitrary coordinates (for further reference):
$$\frac{\partial^2 v_i}{\partial t_i\,\partial t_{i+1}} = \frac{\partial}{\partial t_i}\,\frac{\langle V_i,\omega_{i+1}\rangle}{v_i} = \frac{\langle\frac{\partial V_i}{\partial t_i},\omega_{i+1}\rangle}{v_i} - \frac{\langle V_i,\omega_{i+1}\rangle\langle V_i,\frac{\partial V_i}{\partial t_i}\rangle}{v_i^3} = -\frac{\langle\omega_i,\omega_{i+1}\rangle}{v_i} + \frac{\langle V_i,\omega_{i+1}\rangle\langle V_i,\omega_i\rangle}{v_i^3} \tag{7}$$
$$= -\frac{1}{v_i}\left(\cos\angle\omega_i\omega_{i+1} - \cos\angle V_i\omega_{i+1}\cdot\cos\angle V_i\omega_i\right), \tag{8}$$
$$\frac{\partial^2 v_i}{\partial t_i^2} = \frac{1}{v_i}\left(1 - \frac{\langle V_i,\omega_i\rangle^2}{v_i^2}\right) = \frac{(\sin\angle V_i\omega_i)^2}{v_i}, \tag{9}$$
$$\frac{\partial^2 v_i}{\partial t_{i+1}^2} = \frac{1}{v_i}\left(1 - \frac{\langle V_i,\omega_{i+1}\rangle^2}{v_i^2}\right) = \frac{(\sin\angle V_i\omega_{i+1})^2}{v_i}. \tag{10}$$
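The first variation formula (5) is easy to validate numerically against finite differences of the length function. The following Python sketch (the helper names are ours, not the paper's) computes the gradient by formula (5):

```python
import math

def dot(u, v): return sum(a * b for a, b in zip(u, v))

def path_points(s, t, lines, T):
    # W_0 = s, W_i = w0_i + t_i * w_i for 1 <= i <= n, W_{n+1} = t.
    return [s] + [tuple(a + ti * b for a, b in zip(w0, w))
                  for (w0, w), ti in zip(lines, T)] + [t]

def length(s, t, lines, T):
    P = path_points(s, t, lines, T)
    return sum(math.dist(P[k], P[k + 1]) for k in range(len(P) - 1))

def gradient(s, t, lines, T):
    # Formula (5): g_i = <V_{i-1}, w_i>/v_{i-1} - <V_i, w_i>/v_i.
    P = path_points(s, t, lines, T)
    V = [tuple(b - a for a, b in zip(P[k], P[k + 1])) for k in range(len(P) - 1)]
    v = [math.sqrt(dot(Vk, Vk)) for Vk in V]
    return [dot(V[i - 1], lines[i - 1][1]) / v[i - 1] - dot(V[i], lines[i - 1][1]) / v[i]
            for i in range(1, len(lines) + 1)]
```

Comparing `gradient` with central differences of `length` on a nondegenerate configuration gives agreement up to the discretization error, which is a convenient sanity check for the signs in (5).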
3.2. Lower and upper bounds on the eigenvalues of the Hessian of the path length

Decompose $\Phi$ into a sum of matrices with $2\times 2$ blocks corresponding to the links of $\gamma$, plus two $1\times 1$ blocks for the first and the last links.

Notation 6.
• $\Phi_0 =_{df} \left(\frac{\partial^2 v_0}{\partial t_1^2}\right)$.
• $\Phi_i =_{df} \begin{pmatrix} \frac{\partial^2 v_i}{\partial t_i^2} & \frac{\partial^2 v_i}{\partial t_i\,\partial t_{i+1}} \\ \frac{\partial^2 v_i}{\partial t_i\,\partial t_{i+1}} & \frac{\partial^2 v_i}{\partial t_{i+1}^2} \end{pmatrix}$ for $1\le i\le n-1$.
• $\Phi_n =_{df} \left(\frac{\partial^2 v_n}{\partial t_n^2}\right)$.
• $\tilde\Phi_i$ is the $n\times n$ matrix consisting of $\Phi_i$ situated at its proper place in $\Phi$ and with zeros at all other places. More precisely, the $(1,1)$-element of $\Phi_i$ is placed at $(i,i)$ for $1\le i\le n$ and at $(1,1)$ for $i=0$.

Within these notations
$$\Phi = \sum_{0\le i\le n} \tilde\Phi_i. \tag{11}$$
3.2.1. Lower bound on the eigenvalues of $\Phi$

Strict convexity of the metric implies that the second derivative $\Phi$ is positive definite. However, we need constructive upper and lower bounds on the eigenvalues of $\Phi$, so we do not use explicitly the just mentioned property of the metric; it will follow from the estimates below. We start with the more difficult question of obtaining a lower bound on the least eigenvalue of $\Phi$, i.e. on $\inf\{\langle\Phi U,U\rangle : \|U\|=1\}$. To do this we reduce the problem to the same problem for the matrices $\Phi_i$.

Lemma 1. The only eigenvalue of $\Phi_0$ and that of $\Phi_n$ is greater than $d/2r^3$, and both eigenvalues of $\Phi_i$ for $1\le i\le n-1$ are greater than $d\lambda/2r^3$. Thus $d\lambda/2r^3$ is a lower bound on the eigenvalues of $\Phi_i$ for all i (and for any $t_i$).
Proof. As assumed at the beginning of Section 2.5, there are no consecutive parallel lines among the $L_i$. Thus the vectors $\omega_i$, $\omega_{i+1}$ and $D_i$ (see the notations in Section 2.3) constitute a basis in $\mathbb{R}^3$, and we can represent the link $V_i$ as $V_i = t_i\omega_i + t_{i+1}\omega_{i+1} + D_i$ (here $t_i$ and $t_{i+1}$ are different from those used in the notations of Section 2.3). To simplify the computations, rewrite this formula as $W = xA + yB + D$, where $W = V_i$, $x = t_i$, $A = \omega_i$, $y = t_{i+1}$, $B = \omega_{i+1}$ and $D = D_i$. For $i=0$ we put $x=0$, and for $i=n$ we put $y=0$. And let $v = \sqrt{\langle W,W\rangle}$. Within these notations $d_i = \langle D,D\rangle$, $D\perp A$ and $D\perp B$.

For $i=0$ we have that the only eigenvalue of $\Phi_0$ is its element $\partial^2 v_0/\partial t_1^2 = (1/v_0)(1 - \langle W,B\rangle^2/v_0^2)$. Here $\langle W,B\rangle = \langle yB+D,\,B\rangle = y$ and $\langle W,W\rangle = v_0^2 = \langle yB+D,\,yB+D\rangle = y^2 + d_0$, and thus
$$\Phi_0 = \frac{1}{v_0}\left(1 - \frac{y^2}{v_0^2}\right) = \frac{1}{v_0^3}(y^2 + d_0 - y^2) = \frac{d_0}{v_0^3} \ge \frac{d}{r^3}$$
(recall that every link of a path under consideration is not longer than the whole path, hence $v_i\le r$). Similarly, for $i=n$ we have $\Phi_n \ge d_n/v_n^3 \ge d/r^3$.

Consider $i\in\{1,\dots,n-1\}$. Notice that in our notations $v^2 = \langle xA+yB+D,\ xA+yB+D\rangle = x^2 + 2xy\tilde\mu_i + y^2 + d_i$ and
$$\Phi_i = \begin{pmatrix} \frac{\partial^2 v}{\partial x^2} & \frac{\partial^2 v}{\partial x\,\partial y} \\ \frac{\partial^2 v}{\partial x\,\partial y} & \frac{\partial^2 v}{\partial y^2} \end{pmatrix}.$$
Now we compute the second derivatives of v:
$$\frac{\partial^2 v}{\partial x^2} = \frac{1}{v}\left(1 - \frac{\langle xA+yB+D,\ A\rangle^2}{v^2}\right) = \frac{1}{v^3}\big(v^2 - (x+y\tilde\mu_i)^2\big) = \frac{1}{v^3}\big(y^2 + d_i - (y\tilde\mu_i)^2\big) = \frac{1}{v^3}(y^2\lambda_i + d_i),$$
$$\frac{\partial^2 v}{\partial y^2} = \frac{1}{v}\left(1 - \frac{\langle xA+yB+D,\ B\rangle^2}{v^2}\right) = \frac{1}{v^3}\big(v^2 - (x\tilde\mu_i+y)^2\big) = \frac{1}{v^3}(x^2\lambda_i + d_i),$$
$$\begin{aligned}
\frac{\partial^2 v}{\partial x\,\partial y} &= -\frac{1}{v}\left(\tilde\mu_i - \frac{\langle xA+yB+D,\ B\rangle\langle xA+yB+D,\ A\rangle}{v^2}\right) \\
&= -\frac{1}{v^3}\big(\tilde\mu_i(x^2+2xy\tilde\mu_i+y^2+d_i) - (x\tilde\mu_i+y)(x+y\tilde\mu_i)\big) \\
&= -\frac{1}{v^3}\big((x^2\tilde\mu_i + 2xy\tilde\mu_i^2 + y^2\tilde\mu_i + d_i\tilde\mu_i) - (x^2\tilde\mu_i + xy + xy\tilde\mu_i^2 + y^2\tilde\mu_i)\big) \\
&= -\frac{1}{v^3}(xy\tilde\mu_i^2 + d_i\tilde\mu_i - xy) = -\frac{1}{v^3}(d_i\tilde\mu_i - xy\lambda_i).
\end{aligned}$$
Let us estimate the determinant:
$$\begin{aligned}
\det\Phi_i &= \frac{1}{v^6}\big((y^2\lambda_i + d_i)(x^2\lambda_i + d_i) - (d_i\tilde\mu_i - xy\lambda_i)^2\big) \\
&= \frac{1}{v^6}\big(x^2y^2\lambda_i^2 + d_i\lambda_i(x^2+y^2) + d_i^2 - d_i^2\mu_i - x^2y^2\lambda_i^2 + 2xy\lambda_i d_i\tilde\mu_i\big) \\
&= \frac{\lambda_i}{v^6}\big(d_i(x^2+y^2) + d_i^2 + 2xy\,d_i\tilde\mu_i\big) \\
&\ge \frac{\lambda_i}{v^6}\big(d_i(x^2+y^2) + d_i^2 - 2|x||y|d_i\big) \ge \frac{d_i^2\lambda_i}{v^6}.
\end{aligned}$$
Moreover, since $x^2 + 2xy\tilde\mu_i + y^2 + d_i = v^2$, the third line shows that in fact $\det\Phi_i = d_i\lambda_i/v^4$. Note that $\operatorname{Tr}\Phi_i = (1/v^3)(\lambda_i(x^2+y^2) + 2d_i)$ and that
$$\lambda_i(x^2+y^2) = (1-\tilde\mu_i^2)(x^2+y^2) \le 2(1-|\tilde\mu_i|)(x^2+y^2) \le 2(x^2 + 2xy\tilde\mu_i + y^2) = 2(v^2 - d_i),$$
whence $\operatorname{Tr}\Phi_i \le 2/v$. The smallest of the two eigenvalues of $\Phi_i$ is therefore not less than
$$\frac{\det\Phi_i}{\operatorname{Tr}\Phi_i} \ge \frac{d_i\lambda_i}{v^4}\cdot\frac{v}{2} = \frac{d_i\lambda_i}{2v^3} \ge \frac{d\lambda}{2r^3}.$$

Proposition 2. All eigenvalues of $\Phi(T)$ are greater than $d\lambda/2r^3$. Thus $\Phi(T)$ is positive definite.
Proof. Take any $U\in\mathbb{R}^n$ with $\|U\|=1$. By (11),
$$\langle\Phi U, U\rangle = \sum_i \langle\tilde\Phi_i U, U\rangle = \sum_i \langle\Phi_i U_i, U_i\rangle,$$
where $U_i$ is the appropriate 2-vector (1-vector for $i=0$ and $i=n$). From Lemma 1 we have
$$\langle\Phi U, U\rangle \ge \frac{d\lambda}{2r^3}\sum_i \langle U_i, U_i\rangle = \frac{d\lambda}{2r^3}\cdot 2 = \frac{d\lambda}{r^3},$$
since every coordinate of U occurs in exactly two of the vectors $U_i$. This inequality gives a lower bound on the eigenvalues of $\Phi$. Hence $\|\Phi\| \ge d\lambda/r^3$ and $\|\Phi^{-1}\| \le r^3/(d\lambda)$.

3.2.2. Upper bound on the norm of $\Phi$

The matrix $\Phi$ is symmetric; therefore, using formulas (8)-(10), Notation 2 and standard inequalities for the norms, one gets, for all T,
$$\|\Phi\| = \|\Phi(T)\| \le \sqrt{n}\,\|\Phi\|_1 = \sqrt{n}\max_i \sum_j |\Phi_{i,j}| \le 4\sqrt{n}\max_i \frac{1}{v_i} \le \frac{4\sqrt{n}}{\tilde d}. \tag{12}$$
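The tridiagonal Hessian of Fig. 2 can be assembled directly from formulas (7), (9), (10) and checked against finite differences of the length function; Proposition 2 then shows up as positivity of the leading principal minors. The sketch below is ours (helper names included) and is only a numerical illustration of these bounds:

```python
import math

def dot(u, v): return sum(a * b for a, b in zip(u, v))

def length(s, t, lines, T):
    P = [s] + [tuple(a + ti * b for a, b in zip(w0, w))
               for (w0, w), ti in zip(lines, T)] + [t]
    return sum(math.dist(P[k], P[k + 1]) for k in range(len(P) - 1))

def hessian(s, t, lines, T):
    # Tridiagonal Hessian Phi of Fig. 2; diagonal entries from (9)-(10),
    # off-diagonal entries from (7). The 0-based index i stands for line i+1.
    P = [s] + [tuple(a + ti * b for a, b in zip(w0, w))
               for (w0, w), ti in zip(lines, T)] + [t]
    V = [tuple(b - a for a, b in zip(P[k], P[k + 1])) for k in range(len(P) - 1)]
    v = [math.sqrt(dot(Vk, Vk)) for Vk in V]
    n = len(lines)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        w = lines[i][1]
        H[i][i] = ((1 - (dot(V[i], w) / v[i]) ** 2) / v[i]
                   + (1 - (dot(V[i + 1], w) / v[i + 1]) ** 2) / v[i + 1])
    for i in range(n - 1):
        w, w1 = lines[i][1], lines[i + 1][1]
        H[i][i + 1] = H[i + 1][i] = (-dot(w, w1) / v[i + 1]
                                     + dot(V[i + 1], w1) * dot(V[i + 1], w) / v[i + 1] ** 3)
    return H
```

On nondegenerate data the row sums of this matrix stay below $4\max_i(1/v_i)$, in line with the bound (12).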
3.3. Newton's method for the search of a zero of the gradient of the path length

Our algorithm consists of three phases, described below in Section 4. The third phase is an application of Newton's method to approximate the zero of the gradient $f'$ of the length function. The main difficulty, however, is to find an appropriate initial approximation for the Newton iterations. That will be done by a gradient descent described in Section 4.3. We will use the bounds we have just obtained to estimate the rate of convergence of Newton's method. First let us recall standard facts about Newton's method. Consider a mapping $H = (h_1,\dots,h_n)$ from $\mathbb{R}^n$ into $\mathbb{R}^n$. The letter X, with or without superscripts and subscripts, will denote column vectors from $\mathbb{R}^n$. Newton's iterations are defined by the formula
$$X^{k+1} = X^k - (H'(X^k))^{-1} H(X^k), \tag{13}$$
where $H'$ is the Jacobian of H. Proposition 3 gives a sufficient condition to ensure fast convergence of Newton's iterations (it can be found in textbooks on numerical methods).

Proposition 3. Suppose that $X_0$ is a zero of H, i.e. $H(X_0)=\bar 0$, where $\bar 0 = (0,\dots,0)\in\mathbb{R}^n$. Let $a, a_1, a_2$ be reals such that $0<a$ and $0\le a_1, a_2<\infty$, and denote $\Theta_a = \{X : \|X - X_0\| < a\}$, $c = a_1a_2$ and $b = \min\{a, 1/c\}$. If $X^0\in\Theta_b$ and
(A) $\|(H'(X))^{-1}\| \le a_1$ for $X\in\Theta_a$,
(B) $\|H(X_1) - H(X_2) - H'(X_2)(X_1 - X_2)\| \le a_2\|X_1 - X_2\|^2$ for $X_1, X_2\in\Theta_a$,
then
$$\|X^k - X_0\| \le \frac{1}{c}\left(c\,\|X^0 - X_0\|\right)^{2^k}. \tag{14}$$
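The doubly exponential decay (14) is easy to observe in practice. Below is a minimal sketch of the iteration (13) for n = 2, with the 2x2 linear system solved by Cramer's rule; the test mapping used with it is our own illustration, not the paper's H:

```python
def newton(H, J, X, steps):
    # Newton iterations (13): X^{k+1} = X^k - (H'(X^k))^{-1} H(X^k),
    # written out for n = 2 with Cramer's rule for the 2x2 system.
    for _ in range(steps):
        (a, b), (c, d) = J(X)
        f1, f2 = H(X)
        det = a * d - b * c
        X = (X[0] - (d * f1 - b * f2) / det,
             X[1] - (a * f2 - c * f1) / det)
    return X
```

With $c\,\|X^0 - X_0\| \le 1/2$, formula (14) gives an error of at most $(1/2)^{2^k}/c$, so a handful of iterations suffices for any realistic precision.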
In Proposition 3 there are two constants $a_1$ and $a_2$: the norm of the inverse of $H'$ has to be at most $a_1$ (condition (A)); and $a_2$ is, in fact, an upper bound on the norm of the second derivative of H (condition (B)). Both bounds must hold in the a-neighborhood of the zero $X_0$ of H. In our case a, which appears only in (B), will be 'big' (more precisely, we will take $a = R$), so we can ignore it for now. Then the convergence is determined by two things: the parameter $b = 1/(a_1a_2)$, and the choice of the initial approximation $X^0$, which must be in an open b-neighborhood of the zero $X_0$. We are interested in making $c\,\|X^0 - X_0\|$ smaller than 1, with a known upper bound less than 1; we will construct $X^0$ such that $c\,\|X^0 - X_0\| \le \frac{1}{2}$. We will take the bound from Proposition 2 for $a_1$. As for $a_2$, it is not hard to estimate it using the formulas for the elements of $\Phi$. We will choose $X^0$ in Section 4.3.

3.3.1. Choosing parameters $a_1$ and $a_2$ to satisfy (A) and (B) of Proposition 3

We start with calculating $a_2$. To satisfy condition (B) it is sufficient to show that the (partial) second derivatives of the components of $f'$ (i.e. the third derivatives of f) are bounded in some neighborhood of $X_0$. We use Taylor's formula. Take any $T_1, T_2\in B$. In our case
$$H(X_1) - H(X_2) - H'(X_2)(X_1 - X_2) = f'(T_1) - f'(T_2) - \Phi(T_2)(T_1 - T_2). \tag{15}$$
We assume that the vectors involved in (15) are represented as columns. To bound the norm of this vector from above, first estimate its components. Using the notations $T_j = (t_1^{(j)},\dots,t_n^{(j)})$, $j = 1,2$, we can write the ith component of (15) as
$$g_i(t_{i-1}^{(1)}, t_i^{(1)}, t_{i+1}^{(1)}) - g_i(t_{i-1}^{(2)}, t_i^{(2)}, t_{i+1}^{(2)}) - \sum_j \frac{\partial g_i(t^{(2)})}{\partial t_j}\,(t_j^{(1)} - t_j^{(2)}). \tag{16}$$
Taylor's formula says that (16) is equal to
$$\frac{1}{2}\sum_{j,k\,=\,i-1,i,i+1} \frac{\partial^2 g_i(\xi)}{\partial t_j\,\partial t_k}\,(t_j^{(1)} - t_j^{(2)})(t_k^{(1)} - t_k^{(2)}) \tag{17}$$
for some vector $\xi = (\xi_j)_j$ whose jth component is between $t_j^{(1)}$ and $t_j^{(2)}$. Equalities (7)-(10) show that the second derivatives of $g_i$ (notation from (6)) involved in (17) are sums of a bounded number of terms, each a product of bounded trigonometric factors with $1/v_i^2$ or $1/v_{i-1}^2$. Thus the absolute value of each second derivative of $g_i$ is bounded by $O(1/d)$. Hence, the absolute value of each component of the vector (15) is bounded by $O(\|T_1-T_2\|^2/d)$. This implies that in our case (see (15))
$$\|H(X_1) - H(X_2) - H'(X_2)(X_1 - X_2)\| = \|f'(T_1) - f'(T_2) - \Phi(T_2)(T_1 - T_2)\| \le C_1\frac{\sqrt{n}}{d}\,\|T_1 - T_2\|^2 \tag{18}$$
for some constant $C_1 > 0$ and for $T_1, T_2\in B$. Thus we set
$$a_2 = C_1\cdot\frac{\sqrt{n}}{d}. \tag{19}$$
As in our case $H' = \Phi$, Proposition 2 permits us to take as $a_1$ the bound for $\|\Phi^{-1}\|$ from that proposition. Now we can define all the parameters for Newton's method:
• $a = R$ (justified by (18)),
• $a_1 = r^3/(d\lambda)$ (see Proposition 2), $a_2 = C_1(\sqrt{n}/d)$ (see (18)),
• $c = a_1a_2 = C_1\sqrt{n}\,r^3/(\lambda d^2)$ for the constant $C_1$ from (18).
Our assumptions about r (see Notation 4) imply that $a = R \ge 1/c$. Hence we take
$$b = \frac{1}{c} = \frac{\lambda d^2}{C_1\sqrt{n}\,r^3} = \frac{\tilde\lambda^2\tilde d^4}{C_1\sqrt{n}\,r^3}. \tag{20}$$
For further reference we rewrite (20) as (we use Notations 2 and 3)
$$\frac{b}{2} = c_2\,\frac{\lambda d^2}{\sqrt{n}\,r^3} = c_2\,\frac{\tilde\lambda^2\tilde d^4}{\sqrt{n}\,r^3}, \tag{21}$$
where $c_2 =_{df} 1/(2C_1)$. To ensure fast convergence we will need a 'good' initial approximation $X^0$ for Newton's method, namely such that
$$X^0\in\Theta_{b/2}. \tag{22}$$
For such an $X^0$ we have
$$c\,\|X^0 - X_0\| \le \tfrac{1}{2}, \tag{23}$$
and the rate of convergence guaranteed by Proposition 3 will be $(\frac{1}{2})^{2^k}$, where k is the number of iterations (see Proposition 3, formula (14)).

4. Algorithm for the shortest path touching lines

The algorithm takes as input a list L of lines, points s and t, and $\varepsilon$, $1\ge\varepsilon>0$. The algorithm outputs a path, represented as the list of points where it meets the lines of L. The output path is in the $\varepsilon$-neighborhood of the shortest path, and its length is $\varepsilon$-close to the length of the shortest path. We assume that the lines of L are represented by the vectors $\omega_i^0$ (a point on the line $L_i$) and $\omega_i$ (a unit vector directed along the line $L_i$) introduced in Section 2.3. Transforming another usual representation of lines and points into this form is of linear complexity in our computation model. The algorithm consists of three phases:

Phase 1: Preliminary computations.
Phase 2: Application of a gradient method to find an initial approximation for Newton's method.
Phase 3: Application of Newton's method.

Below in this section we describe the phases. Their descriptions are rather short given the technique developed before, though some new notions will be needed. Together with this description we make some estimations of complexity concerning 'local' computations. The global estimation of the complexity will be done in Section 5.

4.1. Length approximation from a position approximation

The algorithm seeks an approximation of the position of the shortest path. An appropriate approximation of the length is in our case 'automatic', as follows from Lemma 4 below.
Fig. 3. Comparing the lengths of two paths.
Lemma 4. $|\gamma(T)| - |\gamma(T^*)| \le 2\|T - T^*\|_1 \le 2\sqrt{n}\,\|T - T^*\|$.

Proof. The last inequality is standard. Let us prove the first one. We compare $\gamma = \gamma(T)$ and $\gamma^* = \gamma(T^*)$ linkwise; see Fig. 3, where $\gamma_i = AB$, $\gamma_i^* = A^*B^*$, $|AA^*| = |t_i - t_i^*|$, $|BB^*| = |t_{i+1} - t_{i+1}^*|$. The triangle inequality immediately implies that
$$|AB| - |A^*B| \le |AA^*|, \qquad |A^*B| - |A^*B^*| \le |BB^*|,$$
and thus $|AB| - |A^*B^*| \le |AA^*| + |BB^*|$. Similarly, we get $|A^*B^*| - |AB| \le |AA^*| + |BB^*|$. Hence, $\big|\,|AB| - |A^*B^*|\,\big| \le |AA^*| + |BB^*|$. It remains to notice that each $|AA^*| = |t_i - t_i^*|$ is used twice for $1\le i\le n$.

Thus, to obtain a length approximation with precision $\varepsilon$, we will construct a position $\varepsilon/(2\sqrt{n})$-approximation.

4.2. Preliminary computations

Phase 1: Compute the path $\gamma^0$ and the values $d$, $\tilde d$, $\lambda$, $\tilde\lambda$, r, R, and save also all intermediate values that appear within these calculations (all of this will be used in the estimations that govern the two other phases). Our construction of the initial approximation $\gamma^0$ was described in Section 2.4. The values $d$, $\tilde d$ are defined in Notation 2, the values $\lambda$, $\tilde\lambda$ in Notation 3, and the values r, R in Notation 4.

4.3. Initial gradient descent to an initial approximation for Newton's method

Recall that $f(t_1,\dots,t_n)$ denotes the length of a path represented by points on the lines of L, i.e. $f(t_1,\dots,t_n) = f(T) = |\gamma(T)|$ (see Notations 5). This function f is convex (see Proposition 2), and the point at which it attains its minimum is denoted by $T^*$ (Section 2.3, Notation 4). Proposition 2 gives a positive lower bound for the norm of the Hessian $\Phi$ of f; we denote this lower bound by $\kappa$. Let $\tilde\kappa$ be the upper bound on the norm of $\Phi$ from (12).
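The bound of Lemma 4 is easy to observe on concrete data. A small Python sketch (the names are ours):

```python
import math

def path_length(s, t, lines, T):
    # |gamma(T)| for the path s -> W_1(t_1) -> ... -> W_n(t_n) -> t,
    # where W_i(t_i) = w0_i + t_i * w_i and each w_i is a unit vector.
    P = [s] + [tuple(a + ti * b for a, b in zip(w0, w))
               for (w0, w), ti in zip(lines, T)] + [t]
    return sum(math.dist(P[k], P[k + 1]) for k in range(len(P) - 1))
```

Since moving $W_i$ along the unit direction $\omega_i$ by $|\Delta t_i|$ moves the common endpoint of the two adjacent links by exactly $|\Delta t_i|$, the difference of the lengths of two paths is bounded by $2\|\Delta T\|_1$, as in Lemma 4.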
Consider $g =_{df} \operatorname{grad} f = (\partial f/\partial t_1,\dots,\partial f/\partial t_n)$. Notice that $g(T^*) = 0$. We have $g(T) = \int_{T^*}^{T} \Phi$, where one can integrate $\Phi$ along any path from $T^*$ to T. Denote by $\delta$ the distance from $T^*$ to T. Since $\Phi$ is positive definite with all eigenvalues at least $\kappa$, we have $\langle\Phi(V), V\rangle/\|V\| \ge \kappa\|V\|$ for all vectors V. Hence the projection onto the segment from $T^*$ to T of the integral of $\Phi$ along this segment is at least $\kappa\delta$, and in particular the norm of the integral itself is at least $\kappa\delta$. Recalling that this integral is equal to the gradient of f at T, we conclude that the absolute value $|(\partial f/\partial t_i)(T)|$ of one of the coordinates of the gradient $g(T)$ is at least $\kappa\delta/\sqrt{n}$. Choose i such that
$$\left|\frac{\partial f}{\partial t_i}(T)\right| \ge \frac{\kappa\delta}{\sqrt{n}}. \tag{24}$$
Let us estimate how one can decrease the length of the path by changing $t_i$ (which geometrically means moving the point where the path meets $L_i$). To study how f depends on $t_i$, introduce a function of one variable given by $\varphi(\tau) = f(t_1,\dots,t_{i-1},\tau,t_{i+1},\dots,t_n)$. The function $\varphi$ is the restriction of f to a straight line and hence convex. Let $\varphi$ attain its minimum at $\tau_0$. We summarize the just introduced notations for further reference:

Notations 7.
• $\tilde\kappa = 4\sqrt{n}/\tilde d$ is an upper bound for $\|\Phi\|$ (formula (12)).
• $\kappa = d\lambda/r^3$ is a lower bound for $\|\Phi\|$ (Proposition 2).
• $g =_{df} \operatorname{grad} f = (\partial f/\partial t_1,\dots,\partial f/\partial t_n)$.
• $\delta$ is the distance from $T^*$ to T, i.e. $\delta =_{df} \|T^* - T\|$, where $T^*$ is the point where the length of our path $\gamma(T^*)$ is minimal and T is the current point in the space of parameters determining our path.
• $\varphi(\tau) = f(t_1,\dots,t_{i-1},\tau,t_{i+1},\dots,t_n)$ (f is defined in Notations 5).
• i is chosen so that (24) holds.

We wish to estimate $|\varphi(t_i) - \varphi(\tau_0)|$ from below. Notice that our bounds on the Hessian of f imply that $\kappa \le \varphi'' \le \tilde\kappa$. We need the following lemma.

Lemma 5. Let h be a strictly convex function, i.e. $h'' > 0$, on a segment $\sigma$. Suppose that h attains its minimum at a point $\nu_0$, and assume that $h'' \le \tilde\kappa$. Then
$$h(\nu) - h(\nu_0) \ge \frac{(h'(\nu))^2}{4\tilde\kappa} \tag{25}$$
for all $\nu\in\sigma$.

Proof. Suppose without loss of generality that $\nu_0 < \nu$, and let a point $\nu_1$ be such that $h'(\nu_1) = h'(\nu)/2$. This defines $\nu_1$ uniquely, since the derivative $h'$ of h is strictly increasing.
Then
$$h(\nu) - h(\nu_1) \ge (\nu - \nu_1)\cdot\frac{h'(\nu)}{2}. \tag{26}$$
On the other hand, integrating $h''$ from $\nu_1$ to $\nu$ yields
$$\frac{h'(\nu)}{2} \le \tilde\kappa\cdot(\nu - \nu_1). \tag{27}$$
Multiplying (26) and (27) we obtain (25), since $h(\nu_1) \ge h(\nu_0)$.
Lemma 5 gives:

Lemma 6. In the notations introduced above (Notations 7),
$$|\varphi(t_i) - \varphi(\tau_0)| \ge \frac{\kappa^2\delta^2}{4n\tilde\kappa}.$$

Indeed, by (24) we have $|\varphi'(t_i)| = |(\partial f/\partial t_i)(T)| \ge \kappa\delta/\sqrt{n}$, and it remains to apply (25). Lemma 6 describes what one gains (in terms of shortening the path in question) by setting $t_i = \tau_0$. To use this lemma we introduce:

Notations 8.
• $gain =_{df} |\varphi(t_i) - \varphi(\tau_0)|$ is the gain in the shortening of the path $\gamma(T)$ after setting $t_i = \tau_0$, where $\tau_0$ is the point where $\varphi$ attains its minimum.
• $UpBnd =_{df} (gain\cdot 4n\tilde\kappa)^{1/2}/\kappa$ is an upper bound on $\delta$ (the latter denotes the distance between $T^*$ and T; see Notations 7).

Phase 2: Gradient method procedure.

Phase 2.1: Find i such that
$$\left|\frac{\partial f}{\partial t_i}(T)\right| = \max_j \left|\frac{\partial f}{\partial t_j}(T)\right|.$$
(See formulas (5) for $\partial f/\partial t_i = \partial|\gamma|/\partial t_i$.) This is done by using formulas (5), which involve only arithmetic operations and square root extractions.

Claim 3. The complexity of Phase 2.1 is linear in n.

Phase 2.2: Find the point $\tau_0$ where $\varphi(\tau)$ attains its minimum. Since $\varphi$ is strictly convex, this happens at the point where its derivative vanishes. Using our notations (see Notations 5 for $\varphi$ and Notation 2 of Section 2.3 for $v_i$) we have
$$\varphi(\tau) = v_0(t_1) + v_1(t_1,t_2) + \dots + v_{i-1}(t_{i-1},\tau) + v_i(\tau,t_{i+1}) + \dots + v_n(t_n).$$
Hence, using (5), we have
$$\frac{d\varphi}{d\tau}(\tau) = \frac{dv_{i-1}(t_{i-1},\tau)}{d\tau} + \frac{dv_i(\tau,t_{i+1})}{d\tau} = \frac{\langle V_{i-1}(t_{i-1},\tau),\,\omega_i\rangle}{v_{i-1}} - \frac{\langle V_i(\tau,t_{i+1}),\,\omega_i\rangle}{v_i}. \tag{28}$$
Equating expression (28) to zero, then squaring both sides and expressing the square of the length of a vector v_j by the scalar product of the vector by itself, we get

  ⟨V_{i−1}(t_{i−1}, τ), ω_i⟩² / ⟨V_{i−1}(t_{i−1}, τ), V_{i−1}(t_{i−1}, τ)⟩ = ⟨V_i(τ, t_{i+1}), ω_i⟩² / ⟨V_i(τ, t_{i+1}), V_i(τ, t_{i+1})⟩.   (29)

Notice that both V_{i−1}(t_{i−1}, τ) and V_i(τ, t_{i+1}) depend on τ linearly (see Notations 1). Thus, Eq. (29) is of the 4th degree in τ. The classical method of L. Ferrari reduces this equation to equations of the 2nd and of the 3rd degree. This method uses a fixed number of arithmetical operations. The solutions of these equations of the 2nd and 3rd degree (formulas of G. Cardano for the latter) can be found using explicit formulas over arithmetic operations and square and cubic root extraction; all these operations are admissible in our computational model (see the Introduction). Hence,

Claim 4. The complexity of Phase 2.2 is O(1).

Note that, for a standard convexity reason, the modification of the path described in this phase cannot make it leave a ball where the path lay. Hence the path stays in the ball B (Notations 4).

Phase 2.3: Calculate (we use Notations 8) gain, the point T' = (t_1, …, t_{i−1}, τ_0, t_{i+1}, …, t_n) and UpBnd, where τ_0 is from Phase 2.2 just above. If ε/√n < UpBnd < b/2 (cf. (21) and (22)) then go to Phase 3 with T' as the initial approximation for Newton's method. Otherwise, repeat Phase 2 with T = T'. Thus, the recalculation of gain, T' and UpBnd is iterated while UpBnd ≥ max{ε/√n, b/2}. As one can see from the formulas defining the values of gain, T' and UpBnd (Notations 8, 5 and the respective notations from Section 2.3),

Claim 5. The complexity of computing gain, T' and UpBnd is linear in n.

4.4. Application of Newton's method

Phase 3 (Newton's method). Apply Newton's method with the value of T', found by Phase 2.3, as X^0. Iterate (13) until obtaining a sufficiently close approximation; see Section 5 below, formula (46). The rate of convergence is estimated in Proposition 3, formula (14).
5. Complexity of the algorithm

Now we summarize the comments on the complexity made in the previous sections to count the number of steps used by the algorithm to find a length–position ε-approximation of the shortest path touching the lines of L. By C_i, c_j we will denote various positive constants (using capital C for upper bounds, and lower case c for lower bounds).

5.1. Complexity of preliminary computations

Phase 1 finds perpendiculars from s to lines L_i. These perpendiculars fall at points P_i ∈ L_i. To find a point P_i we solve a linear equation with one unknown t_i: ⟨W_i(t_i) − s, ω_i⟩ = 0. This can be done using only arithmetic operations. The complexity of solving one equation is constant, thus the total complexity of finding all P_i's is linear in n. The points P_i represent the polygon Π_0, and computing its length |Π_0| also involves square root extraction. However, the complexity is still linear in n. Thus, the parameter R has been computed with linear complexity. To compute the sines σ̃_i = sin ∠ω_iω_{i+1} and cosines c̃_i = cos ∠ω_iω_{i+1} we compute the scalar products ⟨ω_i, ω_{i+1}⟩, and use the standard formulas that involve arithmetic operations and square root extractions only. The complexity is again linear in n. The next parameter to find is the minimal distance between consecutive lines. For two consecutive lines L_i and L_{i+1}, the distance between them is the length of the segment connecting the lines and perpendicular to both of them; this is again a standard "analytic geometry" computation, which yields d̃ and d in linear time. The calculation of λ and λ̃ adds only O(1) to the complexity (see formulas in Notations 5). Hence,

Claim 6. The complexity of Phase 1 is linear in n.

5.2. Complexity of gradient descent

The complexity of one iteration of the gradient descent of Phase 2 was estimated from above at the end of Section 4.3. This complexity is O(n). So we are to estimate a sufficient number of iterations. Recall that ‖T − T*‖ = δ ≤ UpBnd (Notations 5 and Lemma 6). Phase 2.3 iterates the calculation of UpBnd until the latter becomes less than max{ε/√n, b/2}.
It is evident that initially we have gain ≤ r (r is the length of the initial approximation to the shortest path, see Notations 4). Hence, for the initial gain (we use the expression for UpBnd from Notations 8)

  UpBnd ≤ (gain · 4nλ̃)^{1/2}/λ ≤ c₃ · (r·n·n^{1/2})^{1/2} · r³ / (d̃^{1/2}·d·σ̃) = c₃ · r^{7/2}·n^{3/4} / (d̃^{5/2}·σ̃) =_df B₀.   (30)

(We replaced d by d̃² and λ, λ̃ by their expressions using, respectively, Notations 2 and Notations 7.) So we can estimate the number of iterations in Phase 2.3 as the ratio of the right-hand side value in formula (30) over a lower bound for gain that is valid while UpBnd goes down to the demanded value. Such a lower bound can be found from the condition UpBnd ≥ max{ε/√n, b/2} that controls the continuation of the iterations:

  UpBnd = (gain · 4nλ̃)^{1/2}/λ ≥ max{ε/√n, b/2}.   (31)

This condition (31) gives

  gain ≥ (λ²/(4nλ̃)) · max{ε²/n, b²/4} = (d²·σ̃²·d̃ / (r⁶·4n·4√n)) · max{ε²/n, b²/4}
       = (d̃⁵·σ̃² / (r⁶·16n^{3/2})) · max{ε²/n, b²/4}   (32)

       = { (d̃⁵·σ̃² / (r⁶·16n^{3/2})) · (b²/4) =_df g₁  if ε²/n < b²/4,
           (d̃⁵·σ̃² / (r⁶·16n^{3/2})) · (ε²/n) =_df g₂  if ε²/n ≥ b²/4. }   (33)
To estimate the number of iterations of recalculations of UpBnd in Phase 2 we estimate B₀/g₁ and B₀/g₂. Denote the number of iterations by NbIter, and consider 2 cases corresponding to the cases in (33).

Case 1: ε²/n < b²/4. Replace b²/4 by its value from (21):

  g₁ = (d̃⁵·σ̃² / (r⁶·16n^{3/2})) · c·(d²·σ̃²·ε² / (n·r³)) = c · d̃⁹·σ̃⁴·ε² / (r⁹·n^{5/2}).   (34)

Divide B₀ from (30) by g₁ from (34):

  NbIter ≤ O( r^{7/2}·n^{3/4}·r⁹·n^{5/2} / (d̃^{5/2}·σ̃·d̃⁹·σ̃⁴·ε²) ) = O( r^{12.5}·n^{3.25} / (d̃^{11.5}·σ̃⁵·ε²) ).   (35)

We take into account that d̃ ≤ 1 < r (Notations 2 and 4), ε ≤ 1, σ̃ ≤ 1 (Notations 3) and n ≥ 1, and rewrite (35) as

  NbIter ≤ O( r^{13}·n⁴ / (d̃·σ̃·ε)^{12.5} ) ≤ O( (rn/(d̃·σ̃·ε))^{13} ).   (36)

Case 2: ε²/n ≥ b²/4. Divide B₀ from (30), this time, by g₂ from (33):

  NbIter ≤ B₀/g₂ = O( r^{7/2}·n^{3/4}·r⁶·n^{3/2}·n / (d̃^{5/2}·σ̃·d̃⁵·σ̃²·ε²) ) = O( r^{19/2}·n^{13/4} / (d̃^{15/2}·σ̃³·ε²) ).   (37)

In Case 2 we can bound 1/σ̃² using the Case 2 condition:

  1/σ̃² ≤ O( r⁶·n / (d⁴·ε²) ) = O( r⁶·n / (d̃⁸·ε²) ),   (38)
where we used also (21) to replace b and Notations 2 to replace d by d̃². From (38) and (37) we get

  NbIter ≤ O( r^{19/2}·n^{13/4}·r⁶·n / (d̃^{15/2}·σ̃·d̃⁸·ε⁴) ) = O( r^{31/2}·n^{17/4} / (d̃^{31/2}·σ̃·ε⁴) ) = O( r^{15.5}·n^{4.25} / (d̃^{15.5}·σ̃·ε⁴) )   (39)

  ≤ O( r^{16}·n⁵ / (d̃·σ̃·ε)^{15.5} ) ≤ O( (rn/(d̃·σ̃·ε))^{16} ).   (40)

The bound in (40) majorizes the bound in (36), so we can take the bound from (40) for further references. Hence, in any case

  NbIter ≤ O( (rn/(d̃·σ̃·ε))^{16} ).   (41)
From this bound (41) and Claim 5, which says that the complexity of calculating gain, T' and UpBnd in Phase 2.3 is linear, we get

Claim 7. The complexity of Phase 2.3 is O((rn/(d̃·σ̃·ε))^{16}), which is majorized by O((Rn/(d̃·σ̃·ε))^{16}).

From Claims 3, 4 and 7 we deduce

Claim 8. The complexity of Phase 2 is O((rn/(d̃·σ̃·ε))^{16}) or, in other terms, O((Rn/(d̃·σ̃·ε))^{16}).

5.3. Complexity of Newton's method

Once ε/√n ≤ b/2, and this is the case we are mainly interested in, Phase 3 of our algorithm applies Newton's method. Formula (14) for the convergence of Newton's method estimates the error after k iterations by

  (1/c) · (c·‖X⁰ − X₀‖)^{2^k},   (42)

which we want to be smaller than ε/√n. For our choice of initial approximation for Newton's method, see (23), and for our value of 1/c, see (20), value (42) takes the form

  (1/c) · (1/2)^{2^k} = C₁ · (nr³/d²) · (1/2)^{2^k}.   (43)

Thus we need

  C₁ · (nr³/d²) · (1/2)^{2^k} < ε/√n.   (44)
Inequality (44) is equivalent to

  2^k > log(1/ε) + (3/2)·log n + 3·log r − 2·log d + O(1),

and the right-hand side is at most

  log(1/ε) + O(log r + log(1/d̃)),   (45)

where all log's are with base 2; recall also that (without loss of generality) we assumed n ≤ r. From (45) we can bound k as

  k < log( log(1/ε) + O(log r + log(1/d̃)) ).   (46)

It remains to estimate the complexity of computing the kth iterate X^{k+1} using formula (13). In our case F(T) = grad f(T) and F'(T)^{−1} = Φ(T)^{−1}. The complexity of calculating F(T) is shown to be linear in n. Computing the inverse of a tridiagonal n × n matrix Φ(T) takes O(n²) operations. Note that the complexity of computing one element of Φ(T) is constant. Thus the total complexity of Newton's method is

  O( n² · log( log(1/ε) + O(log r + log(1/d̃)) ) ).   (47)

Taking into consideration that log(x + y) ≤ log x + log y for x, y ≥ 2, we can simplify this estimation (47), obtaining the following bound for the complexity of Newton's method:

Claim 9. The complexity of Phase 3 is

  O( n² · log log(1/ε) ) + O( n² · (log r + log(1/d̃)) ).   (48)
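Since the Hessian couples only consecutive parameters, Φ(T) is tridiagonal, and a Newton step can be sketched concretely. The code below is our own illustration, not the paper's procedure (the function names and the quadratic test problem are assumptions): instead of forming the O(n²) inverse mentioned above, it solves the tridiagonal system for the Newton correction in O(n) by the standard Thomas algorithm.

```python
# A sketch (our own illustration, not the paper's procedure) of one Newton
# step X_next = X - Phi(X)^{-1} * grad f(X) when the Hessian Phi is
# tridiagonal, as it is here because f couples only consecutive parameters.
# Instead of forming the inverse in O(n^2), the tridiagonal system for the
# Newton correction is solved directly in O(n) by the Thomas algorithm.

def solve_tridiagonal(sub, diag, sup, rhs):
    """Solve the system given by subdiagonal sub, diagonal diag, superdiagonal sup."""
    n = len(diag)
    c, d = [0.0] * n, [0.0] * n
    c[0] = sup[0] / diag[0] if n > 1 else 0.0
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - sub[i - 1] * c[i - 1]
        c[i] = sup[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - sub[i - 1] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def newton_step(x, grad, hess_bands):
    """One Newton iteration: x - Phi^{-1} grad(x), Phi given by its three bands."""
    sub, diag, sup = hess_bands
    delta = solve_tridiagonal(sub, diag, sup, grad(x))
    return [xi - di for xi, di in zip(x, delta)]
```

For a quadratic f with Hessian tridiag(−1, 2, −1), a single Newton step from any point lands exactly on the minimizer, consistent with the quadratic convergence estimate (42).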
Combining complexity estimates for the three phases stated in Claims 6, 8 and 9 and observing that the term O(n²(log r + log(1/d̃))) is much smaller than the bound on the complexity of the gradient procedure, we
6.1. Obstacles and paths

Let W be an arbitrary set in R³, and s and t be two points in its complement. Denote by cl(S), int(S) and bnd(S), respectively, the closure, the interior and the boundary of a set S. Denote by B_v(a), where a ∈ R_{>0} and v ∈ R³, the ball of radius a centered at v. The boundary bnd(W) = cl(W)\int(W) may contain 'degenerate' pieces, for example an isolated point or a point with a neighborhood homeomorphic to a segment. Such pieces can hardly be considered as obstacles. So we assume that every point of bnd(W) has a two-dimensional neighborhood. For technical reasons it is convenient to assume that a neighborhood of each point of bnd(W) intersects int(W). To achieve this we can 'slightly inflate' W. It is known how to do it efficiently for semi-algebraic obstacles, see e.g. [13]. A path is a continuous piecewise smooth image of a closed segment. A simple path or a quasi-segment is a path without self-intersections (which is, clearly, homeomorphic to a segment). The set R³\W will be called the free space, and its closure will be called the space admissible for trajectories. We consider only paths lying in the admissible space and not intersecting the interior of W.

6.2. Separability and random separated balls

We say that obstacles W are a-separated if for any v ∈ R³ the set W ∩ B_v(a) is connected. For example, if each connected component of W is convex and the distance between each two connected components is greater than 2a, then W is a-separated. However, the convexity of the obstacles is not assumed in the general case. What is imposed by a-separation is a certain smoothness of concave (from the point of view of an observer outside the obstacles) pieces of the boundary: a ball of radius 2a that goes inside 'fjords' of an obstacle cannot touch or intersect two pieces of the obstacles that are 'remote' if one follows the boundary.
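The sufficient condition just stated can be demonstrated for a union of balls. The sketch below is our own illustration (the helper names and sample data are ours): if the pairwise distances between the balls exceed 2a, a probe ball B_v(a) can meet at most one of them, so W ∩ B_v(a) is connected (indeed convex), i.e. W is a-separated.

```python
import itertools
import math
import random

# Sketch of the sufficient condition in Section 6.2 (our own illustration):
# if the obstacle W is a union of balls whose pairwise distances exceed 2a,
# then any probe ball B_v(a) meets at most one of them (by the triangle
# inequality), so W ∩ B_v(a) is connected, i.e. W is a-separated.

def pairwise_separated(centers, radii, a):
    """True if the distance between every two balls exceeds 2a."""
    for (c1, r1), (c2, r2) in itertools.combinations(zip(centers, radii), 2):
        if math.dist(c1, c2) - r1 - r2 <= 2 * a:
            return False
    return True

def balls_met(centers, radii, a, v):
    """How many obstacle balls intersect the probe ball B_v(a)."""
    return sum(math.dist(v, c) <= r + a for c, r in zip(centers, radii))

# Sample obstacle: two small balls, far apart in the unit cube.
centers = [(0.1, 0.1, 0.1), (0.9, 0.9, 0.9)]
radii = [0.05, 0.05]
a = 0.1
assert pairwise_separated(centers, radii, a)
rng = random.Random(0)
for _ in range(1000):
    v = (rng.random(), rng.random(), rng.random())
    assert balls_met(centers, radii, a, v) <= 1
```

The triangle inequality gives the general argument: if B_v(a) met two balls, their distance would be at most 2a, contradicting the hypothesis.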
Our main goal is to describe an algorithm that constructs a length approximation to a shortest path under the condition of separability of obstacles. But now we will make a digression, estimating the expectation of separability of obstacles constituted by n randomly chosen balls. The centers of the balls are independently chosen in the unit cube [0, 1]³, and their radii are independently chosen from an interval [0, r] under the uniform distribution.

Proposition 7. There exist constants c₁, c₂ > 0 such that for any r < c₁n^{−2/3} the union of n randomly chosen balls in [0, 1]³ is (c₂n^{−2/3})-separated with probability greater than 2/3.

Proof. We claim that n randomly chosen points in the unit cube are c₃n^{−2/3}-separated with probability greater than 2/3 for some constant c₃ > 0.
Indeed, a choice of n points in [0, 1]³ is equivalent to a choice of a point p = (x₁, y₁, z₁, …, x_n, y_n, z_n) ∈ [0, 1]^{3n}. For any pair of natural numbers i, j such that 1 ≤ i < j ≤ n, the measure of those points p for which |x_i − x_j|, |y_i − y_j|, |z_i − z_j| ≤ c₃n^{−2/3} is not greater than (√2·c₃·n^{−2/3})³. This is true for any constant c₃. Hence the measure of those p for which there exists a pair 1 ≤ i < j ≤ n such that this condition holds is not greater than [n(n − 1)/2]·(√2·c₃·n^{−2/3})³ ≤ 1/3 for an appropriate constant c₃. To finish the proof of the lemma we choose c₁ and c₂ so that 2c₁ + c₂ < c₃. □

7. A grid algorithm for a shortest path approximation

In this section we assume that the obstacles W ⊆ [0, 1]³ are a-separated and that s, t ∈ [0, 1]³.

7.1. Approximation algorithm and its complexity

Denote by L ⊆ [0, 1]³ a cubic grid with mesh (edge of the basic cube) a. Without loss of generality we assume that 1/a is an integer and that s and t are nodes of the grid. Consider a graph G whose vertices are those nodes of the grid that do not belong to W and whose edges are those edges of the grid that do not intersect W. Assume that the length of every edge in G is a. The length of a path P will be denoted by |P|.

Theorem on Simple Grid Method. There is an algorithm of time complexity O((1/a)⁶) that, using the graph G defined above, constructs a grid polygon connecting s and t and satisfying estimate (49).
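The grid method can be sketched as follows. This is our own minimal illustration with simplified assumptions: nodes are indexed 0..m−1 per axis and the obstacle test is abstracted into a predicate blocked_edge. Since all edges of G have the same length a, breadth-first search already computes shortest grid paths, in time linear in the size of G.

```python
from collections import deque

# Minimal sketch (our illustration, with simplified assumptions) of the
# grid method: take the graph G on the nodes of a cubic grid with mesh a,
# keep only edges that miss the obstacles W (via blocked_edge), and find a
# shortest grid polygon by breadth-first search; all edges have the same
# length a, so BFS is a valid shortest-path algorithm here.

def grid_shortest_path(m, blocked_edge, s, t):
    """m: nodes per axis; blocked_edge(u, v) -> True if edge u-v meets W.
    Returns the number of grid edges on a shortest s-t path in G (length
    in units of a), or None if t is unreachable in G."""
    def neighbors(v):
        for axis in range(3):
            for step in (-1, 1):
                w = list(v)
                w[axis] += step
                if 0 <= w[axis] < m:
                    yield tuple(w)

    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            return dist[u]
        for w in neighbors(u):
            if w not in dist and not blocked_edge(u, w):
                dist[w] = dist[u] + 1
                queue.append(w)
    return None
```

With no obstacles, the shortest grid path between opposite corners of a 4×4×4 node grid uses 9 edges (the Manhattan distance), as expected.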
Then this grid polygon can be constructed in quadratic time as a shortest path Π_G in the graph G by using any polytime algorithm for a shortest path in a graph (e.g. see [2]). Estimate (49), and hence the Theorem on Simple Grid Method, will follow from Lemmas 9 and 10. We say that a cube of the grid is visited by a path P if the closure of this cube without its vertices intersects P. Notice that P does not determine uniquely the order of visited cubes, as P may visit two cubes simultaneously by going along their common edge.

Lemma 9. If two nodes v₁ and v₂ of the grid are connected by a path P that visits s distinct cubes of the grid, then one can connect v₁ and v₂ by a grid polygon whose length is not greater than 12sa.

Proof. For each cube K, we can consider a maximal segment of P contained in K. Thus P is subdivided into intervals, each contained in one cube of the grid and such that its continuation in either direction leaves the cube. Consider such a segment (a subpath) [wu] for a cube K, that is, P enters K via a point w and leaves it via a point u (of course, P can make several visits like that to a cube). Let u lie in a face with vertices u₁, u₂, u₃, u₄. Then one of the four intervals uu_i, 1 ≤ i ≤ 4, does not intersect the obstacles W. Indeed, assume the contrary. Then, because of a-separability of W, the intersection of W with any cube of the grid and, thus, with any of its faces is a convex set. The fact that a segment lying in the face intersects W means that there are points of the segment in the interior of W. But u ∉ int(W). So if u is different from any u_i, then our assumption implies that u ∈ int(W), for u is in the convex hull of the intersection of the face with int(W), a contradiction. If u = u_i for some i, then uu_i = {u} has the desired property.

Now replace P by a path P' obtained by a sequence of the following modifications. Take any cube K visited by P, together with a maximal segment [wu] of P in K. Let i, 1 ≤ i ≤ 4, be such that the segment uu_i does not intersect W. Insert the intervals uu_i and u_iu into P to force P to visit a vertex of the cube by going there and back. Repeating the same procedure for the entry point w (which is in its turn an exit point for some other cube), we modify our segment so that it begins and ends at a vertex of K. We will say that the visit of P to K terminates at u_i. Perform this operation for all cubes visited by P. This gives P', a new path which visits the same cubes, and for which every maximal subpath contained in a cube begins at and leaves the cube via a vertex. We remark that though the number of cubes visited by P' is the same as for P, i.e. s, the length of P' might have increased. Delete from P' all loops connecting the same vertex, thus obtaining a new path P''. Notice that the number of vertices visited by P'' is at most 8s, for any vertex of any of the visited cubes is visited at most once. Moreover, P'' still connects v₁ and v₂. Now consider a maximal subpath of P'' contained in a grid cube K and connecting two vertices of K. Denote by W₁ the (convex) intersection W ∩ K. Consider the following homotopy of this subpath in K: pull each of its points v in the direction from
the point of W₁ nearest to v. (The nearest point is unique since W₁ is convex.) Let the magnitude of the velocity of v (with respect to the parameter of the homotopy) be equal to the distance from v to the boundary ∂K of K. This way we "push" the maximal subpath of P'' in question away from W₁ until we transform it into a path lying in the boundary ∂K of K, connecting the same vertices of K and not intersecting W. Of course, its length still might have increased. Repeat the same procedure for each maximal subpath contained in one cube, having constructed a path going along faces. Having repeated again the same operation for each maximal subpath contained in one boundary square of a grid cube (and now using a homotopy pushing this subpath to the boundary of this square), we end up with a polygonal path P''' that consists of edges of the grid, connects the same vertices v₁ and v₂, and is contained in at most s cubes. After removing all loops from this path, we obtain a path that traverses each edge at most once. Since s cubes together have at most 12s edges, we have constructed a path with the required properties and whose length is at most 12as. □

Lemma 10. If a path P connecting nodes v₁ and v₂ of the grid has visited s different cubes of the grid, then |P| ≥ ((s − 2)/7)·a.

Proof. Mark the points where P visits for the first time the 1st, the 9th, the 16th, …, the (7l + 2)th, … cube (we count only the visits to new cubes). The pieces of P between two consecutive marked points will be called intervals. Clearly, the number of intervals is at least (s − 2)/7. It remains to show that the length of each interval is at least a. Reasoning by contradiction, note that the projection of an interval whose length is less than a to each coordinate axis is a segment of length less than a. Hence it is contained in the interior of the union of two adjacent intervals of the form [ka, (k + 1)a]. Now the product of these intervals, which is the interior of the union of eight grid cubes with a common vertex, contains the interval. This contradicts the assumption that the interval visited at least 9 cubes. □

Recall that s denotes the number of different cubes of the grid visited by a shortest path Π*. Lemma 10 gives |Π*| ≥ ⌊(s − 2)/7⌋·a ≥ ((s − 2)/7 − 6/7)·a. On the other hand, for the length of a shortest grid polygon Π_G Lemma 9 gives |Π_G| ≤ 12sa. Estimate (49) immediately follows from these two inequalities, and this concludes the proof of the Theorem on Simple Grid Method.

Remark 3. The estimate given by the Theorem on Simple Grid Method is, obviously, far from exact. We conjecture that |Π_G| ≤ 7|Π*|. The latter bound cannot be essentially improved, as one can see from the following example. Denote by v₁, v₂, v₃, v₄ the vertices of the bottom face of the unit cube (ordered counterclockwise), and by u₁, u₂, u₃, u₄ the respective vertices of the top face, see Fig. 4. Let W₀ be the polyhedron with the following 5 vertices: 3 vertices at the middles of edges v₁v₂, v₃v₄, u₃u₄, and 2 vertices on edges u₁v₁ and u₂v₂ close to points v₁ and v₂, respectively. To obtain W, apply to W₀ a homothety (scaling) with coefficient (1 + ε), for a small enough ε > 0, centered at the center of the cube.
Fig. 4. Some faces of W₀ (the figure shows the unit cube with bottom-face vertices v₁, v₂, v₃, v₄ and top-face vertices u₁, u₂, u₃, u₄).
For this obstacle the only grid polygon connecting v₁ and v₂ visits consecutively v₁, v₄, u₄, u₁, u₂, u₃, v₃, v₂. On the other hand, one can see that v₁ and v₂ can be connected by a path (avoiding W and contained in the cube) of length close to 1.

Acknowledgements

We are thankful to Misha Gromov for sharing his geometric insight, and to Yuri Burago for his useful comments and his help with the references. We are also grateful to the anonymous referee for many useful remarks that helped to improve the presentation.

References

[1] P.K. Agarwal, S. Har-Peled, M. Sharir, K.R. Varadarajan, Approximate shortest paths on a convex polytope in three dimensions, J. Assoc. Comput. Mach. 44 (1997) 567–584.
[2] A. Aho, J. Hopcroft, J. Ullman, The Design and Analysis of Computer Algorithms, Addison-Wesley, Reading, MA, 1976.
[3] L. Alexandrov, M. Lanthier, A. Maheshwari, J.-R. Sack, An ε-approximation algorithm for weighted shortest paths on polyhedral surfaces, in: Proc. 6th Scandinavian Workshop on Algorithm Theory, Lecture Notes in Computer Science, Vol. 1432, Springer, Berlin, 1998, pp. 11–22.
[4] W. Ballmann, Lectures on Spaces of Nonpositive Curvature, DMV Seminar, Band 25, Birkhäuser, Basel, Boston, Berlin, 1995.
[5] V. Berestovskij, I. Nikolaev, Multidimensional generalized Riemann spaces, in: Yu. Reshetnyak (Ed.), Geometry 4, Non-regular Riemann Geometry, Encyclopaedia of Mathematical Sciences, Vol. 70, Springer, Berlin, 1993.
[6] L. Blum, M. Shub, S. Smale, On a theory of computation and complexity over real numbers: NP-completeness, recursive functions and universal machines, Bull. Amer. Math. Soc. 1 (1989) 1–46.
[7] D. Burago, Hard balls gas and Alexandrov spaces of curvature bounded above, Proc. ICM, Invited Lecture, Vol. 2, Berlin, 1998, pp. 289–298.
[8] D. Burago, S. Ferleger, A. Kononenko, Uniform estimates on the number of collisions in semi-dispersing billiards, Ann. Math. 147 (1998) 695–708.
[9] J. Canny, J. Reif, New lower bound technique for robot motion planning problems, in: Proc. 28th Ann. IEEE Symp. on Foundations of Computer Science, Los Angeles, 1987, pp. 49–60.
[10] J. Chen, Y. Han, Shortest paths on a polyhedron, in: Proc. 6th Ann. ACM Symp. on Computational Geometry, Berkeley, June 6–8, New York, 1990, pp. 360–369.
[11] J. Choi, J. Sellen, C.-K. Yap, Approximate Euclidean shortest path in 3-space, in: Proc. 10th ACM Symp. on Computational Geometry, Stony Brook, New York, 1994, pp. 41–48.
[12] K.L. Clarkson, Approximate algorithms for shortest path motion planning, in: Proc. 19th Ann. ACM Symp. on Theory of Computing, New York, 1987, pp. 56–65.
[13] D. Grigoriev, A. Slissenko, Polytime algorithm for the shortest path in a homotopy class amidst semi-algebraic obstacles in the plane, in: Proc. of the 1998 Internat. Symp. on Symbolic and Algebraic Computations (ISSAC'98), ACM Press, New York, 1998, pp. 17–24.
[14] D. Grigoriev, A. Slissenko, Computing minimum-link path in a homotopy class amidst semi-algebraic obstacles in the plane, St. Petersburg Math. J. 10 (2) (1999) 315–332.
[15] S. Har-Peled, Constructing approximate shortest path maps in three dimensions, SIAM J. Comput. 28 (4) (1999) 1182–1197.
[16] J. Heintz, T. Krick, A. Slissenko, P. Solernó, Une borne inférieure pour la construction de chemins polygonaux dans R^n, Publications du département de mathématiques de l'Université de Limoges, l'Université de Limoges, France, 1993, pp. 94–100.
[17] J. Heintz, T. Krick, A. Slissenko, P. Solernó, Search for shortest path around semialgebraic obstacles in the plane, J. Math. Sci. 70 (4) (1994) 1944–1949 (translation into English of the paper published in Zapiski Nauchn. Semin. LOMI 192 (1991) 163–173).
[18] J.S.B. Mitchell, Geometric shortest paths and network optimization, in: J.-R. Sack, J. Urrutia (Eds.), Handbook of Computational Geometry, Elsevier, North-Holland, Amsterdam, 2000, pp. 633–701.
[19] J.S.B. Mitchell, D. Mount, C.H. Papadimitriou, The discrete geodesic problem, SIAM J. Comput. 16 (4) (1987) 647–668.
[20] J. Mitchell, C. Papadimitriou, The weighted region problem: finding shortest paths through a weighted planar subdivision, J. Assoc. Comput. Mach. 38 (1991) 18–73.
[21] C.H. Papadimitriou, Euclidean TSP is NP-complete, Theoret. Comput. Sci. 4 (1977) 237–244.
[22] C.H. Papadimitriou, An algorithm for shortest-path motion in three dimensions, Inform. Process. Lett. 20 (1985) 259–263.
[23] J.T. Schwartz, M. Sharir, Algorithmic motion planning in robotics, in: J. van Leeuwen (Ed.), Handbook of Theoretical Computer Science, Vol. A, Elsevier, Amsterdam, 1990, pp. 391–430.
[24] M. Sharir, A. Schorr, On shortest paths in polyhedral spaces, SIAM J. Comput. 15 (1986) 193–215.
Theoretical Computer Science 315 (2004) 405 – 417
www.elsevier.com/locate/tcs
The abc conjecture and correctly rounded reciprocal square roots

Ernie Croot^a, Ren-Cang Li^b,*, Hui June Zhu^c

a Department of Mathematics, University of California, Berkeley, CA 94720, USA
b Department of Mathematics, University of Kentucky, 715 Patterson Office Tower, Lexington, KY 40506, USA
c Department of Mathematics and Statistics, McMaster University, Hamilton, Canada, L8S 4K1
Abstract

The reciprocal square root calculation η = 1/√x is very common in scientific computations. Having a correctly rounded implementation of it is of great importance in producing numerically predictable code in today's heterogeneous computing environment. Existing results suggest that to get the correctly rounded η in a floating point number system with p significant bits, we may have to compute up to 3p + 1 leading bits of η. However, numerical evidence indicates the actual number may be as small as 2p plus a few more bits. This paper attempts to bridge the gap by showing that this is indeed true, assuming the abc conjecture, which is widely purported to hold. (But our results do not tell exactly how many more bits beyond the 2p bits, due to the fact that the constants involved in the conjecture are ineffective.) Along the way, rough bounds which are comparable to the existing ones are also proven. The technique used here is a combination of the classical Liouville's estimation and contemporary number theory.
© 2004 Elsevier B.V. All rights reserved.

Keywords: Correct rounding; Reciprocal square root; The abc conjecture; Floating point number; Algebraic number
1. Introduction

Since computers have only finite memory, any involved real numbers have to be finitely approximated, in the form of floating point numbers (FPNs). By default in this paper, all FPNs, unless otherwise explicitly stated, are binary and of the same type,*

* Corresponding author. E-mail addresses: [email protected] (E. Croot), [email protected] (R.-C. Li), [email protected] (H.J. Zhu).
0304-3975/$ - see front matter © 2004 Elsevier B.V. All rights reserved. doi:10.1016/j.tcs.2004.01.013
E. Croot et al. / Theoretical Computer Science 315 (2004) 405 – 417
i.e., p bits in the significand, hidden bits (if any) included. Thus an FPN x takes the form

  x = ±2^{m_x} · (1.x₁ ⋯ x_{p−1}),   (1.1)

where m_x is the exponent, p is the number of significant digits, and x_i ∈ {0, 1}. In most commonly used FPN systems (see [1,2]), p = 24 ("single"), 53 ("double"), 64 ("double-extended"), or 113 ("quad"). Since one binary digit takes one bit to store, binary digits and bits are often used indistinguishably. Even though there are restraints upon m_x in actual FPN systems, beyond which underflow or overflow occurs, for the purpose of correct roundedness these exceptional cases may be resolved by interpreting our results with a different p other than the default one. Consequently, we impose no restraints upon m_x in this paper. Binary FPN systems are the most common on today's computers. But this paper can be modified in a straightforward way for FPN systems in radices other than 2, e.g., the decimal FPN as proposed in [4]. Clearly no irrational real number can be written in the form (1.1) without rounding. Without loss of generality, from now on we assume η > 0 (which evidently holds for the reciprocal square root to be discussed soon) and write

  η = 2^{m_η} · (1.y₁ ⋯ y_{p−1} y_p ⋯),   (1.2)

where y_i ∈ {0, 1}. The IEEE standard mandates four rounding modes (see [1]). They are: rounding to nearest (or to nearest even whenever in a tie), rounding towards +∞, rounding towards −∞, and rounding towards 0. Under the rounding to nearest mode, correctly rounded η is

  2^{m_η} · (1.y₁ ⋯ y_{p−1})                     if δ < 1/2, or δ = 1/2 and y_{p−1} = 0;
  2^{m_η} · (1.y₁ ⋯ y_{p−1}) + 2^{−p+1+m_η}      if δ > 1/2, or δ = 1/2 and y_{p−1} = 1,

where δ := 0.y_p y_{p+1} ⋯. Notice δ = 1/2 does not occur as η is irrational, but we include the case merely for completeness. Under the last three modes (collectively termed the direct rounding modes), correctly rounded η is 2^{m_η} · (1.y₁ ⋯ y_{p−1}) or 2^{m_η} · (1.y₁ ⋯ y_{p−1}) + 2^{−p+1+m_η}, depending on whether δ > 0 or not. Therefore, finding the correctly rounded η rests on correct estimation of δ. Our central concern is the following question, which is of practical and theoretical importance in computer arithmetic. See existing results in [8,11].

Question 1.1. Given p and an irrational algebraic number η by its minimal polynomial, find the minimal number of correct leading significant bits of η in (1.2) that is necessary to round η correctly to an FPN of p significant digits.
Let

  n := { p + 1  in the nearest rounding mode,
         p      in the direct rounding modes. }   (1.3)
When the binary representation (1.2) of η contains k consecutive 0's (resp., 1's) starting at the bit y_n, we say it has a 0-chain (resp., 1-chain) of length k. It can be seen that the worst scenarios for rounding η correctly can only happen when there is a long 0-chain or 1-chain. Specifically, they are, in the nearest rounding mode, y_p = 1 (resp., y_p = 0) followed by a long 0-chain (resp., 1-chain), and, in the direct rounding modes, any y_{p−1} followed by a long 0-chain or 1-chain. In both cases we may write

  η = 2^{m_η} · (Y_n + δ·2^{−n+1}),  Y_n = 1.y₁ ⋯ y_{n−1},   (1.4)

where |δ| < 1/2 and y_p = 1 if in the nearest rounding mode. Some of the y_i's in (1.4) may differ from those in (1.2). However, this will not affect our argument in the rest of this paper as long as their m_η's are equal, which is indeed the case with possible exceptions

  Y_n = 2 − 2^{−n+1}  and  0 ≤ δ < 1/2.   (1.5)

But these exceptions can be handled individually. For the reciprocal square root case, no such cases are possible because then x = 1/η² would not be any FPN. So we shall assume for the remainder of the paper that the m_η in (1.4) and (1.2) are the same. The length of the 0-chain (or 1-chain) in η, denoted by D(δ), is equal to

  D(δ) := ⌊−log₂|δ|⌋ − 1,   (1.6)

where ⌊·⌋ means the floor of a real number. Note that δ > 0 (resp., δ < 0) corresponds to the case with a long 0-chain (resp., 1-chain). Then we find that

  q := p + 1 + D(δ)   (1.7)

leading correct significant digits are sufficient for resolving Question 1.1 and hence it suffices to give an upper bound on D(δ). This will be the subject of the paper. Our main results concern Question 1.1 for the reciprocal square root as an illustrative example. The technical details can be modified for other algebraic numbers like the cube root or powers of other fractions and their reciprocals. But we shall omit the details because of the similarity in technicality. There is another reason for our choice of the reciprocal square root, too. It is ubiquitous in scientific computations, and has become part of the elementary function libraries libm provided by major computer vendors such as HP [12], IBM [9], and Intel.¹ Both HP and IBM name it rsqrt, while Intel names it invsqrt. Owing to speed considerations, HP's and Intel's reciprocal square root subroutines for Itanium are not guaranteed to be correctly rounded, except for IEEE single precision ones. The authors are unsure about IBM's rsqrt, but doubted it was

¹ See http://www.intel.com/software/products/opensource/whats new.htm #opt math.
E. Croot et al. / Theoretical Computer Science 315 (2004) 405 – 417
correctly rounded. In any event, it is fair to say that any of these library implementations may run roughly twice as fast as taking a square root and then a division.² The remainder of this paper is organized as follows. Our major contribution, sharper bounds on D(δ) based on the famous abc conjecture (as yet unproven, but widely anticipated to hold) from Number Theory, is presented in Section 2, where we also prove rough bounds on D(δ) which are comparable to the existing ones (see [8,11]). The sharper bounds can differ from the rough ones by as many as p bits. Section 3 presents a brief discussion of the abc conjecture, along with other theorems that relate to the approximation of an algebraic number by rational ones. We give our concluding remarks in Section 4.

2. Reciprocal square root

Fix an FPN x = 2^{m_x}(1.x_1 x_2 ··· x_{p−1}) in the standard form such that x is not an even power of 2 (otherwise 1/√x would be a power of 2 and thus an exact FPN) and m_x = 0 or 1. Our later theorems are stated in more general terms, i.e., without restraining m_x to either 0 or 1. This is done by making

m̃_x := m_x mod 2 ∈ {0, 1}  (2.1)

appear in the bounds instead. Since 1 < 2^{m_x}(1.x_1 x_2 ··· x_{p−1}) < 4, we have 1 < √(2^{m_x}(1.x_1 x_2 ··· x_{p−1})) < 2. Let β := 1/√x. Since 1/2 < β < 1, we have m = −1 in (1.2). β is the positive solution of the equation f(u) := 1 − xu² = 0. For any approximation Z to β, we have f(Z) = f(Z) − f(β) = f′(ξ)(Z − β) and hence

Z − β = f(Z)/(−2xξ)  (2.2)

for some ξ between β and Z. This approach is in a similar spirit to that of Liouville's estimation for arbitrary algebraic numbers of higher orders (see [13,15]). Recall (1.3). Set Z = 2^{−1}Y_n. Then β = Z + δ2^{−n}. We may write ξ = Z + tδ2^{−n} = β + (t − 1)δ2^{−n} = β(1 + ε) for some 0 < t < 1, with ε = 2^{−1}(t − 1)δ2^{−n+1}/β. By (2.2) we have

−δ2^{−n} = (1 − xZ²)/(−2xξ),
δ = 2^{n−1}(1 − xZ²)·(1/x)·(1/ξ) = 2^{n−1}(1 − xZ²)·β²/ξ = 2^{n−1}(1 − xZ²)·β/(1 + ε),
log_2|δ| = n − 1 + log_2|1 − xZ²| + log_2(β/(1 + ε)).
² A correctly rounded square root followed by a correctly rounded division does not guarantee a correctly rounded reciprocal square root.
Notice

β/(1 + ε) = β(1 − ε + ε² − ···) = β − 2^{−1}(t − 1)δ2^{−n+1} + ··· = 2^{−1}(Y_n + (2 − t)δ2^{−n+1}) + ··· .

It can be seen that Y_n ≠ 1: otherwise x = 1/β² = 4/(1 + δ2^{−n+1})² cannot be an FPN unless δ = 0, which corresponds to the excluded case when x is a power of 2. Since |δ| < 1/2 and Y_n > 1, and the exceptional case (1.5) is excluded, we have ⌊log_2(β/(1 + ε))⌋ = ⌊log_2 β⌋ = −1. Therefore

log_2|δ| ≥ n − 2 + log_2|1 − xZ²|.  (2.3)

To bound D(δ) it suffices to get min|1 − xZ²| (or a lower bound on it) over all FPNs x. In what follows, we shall bound min|1 − xZ²| in two different ways: a crude one that leads to rough bounds, and a more refined one that first reformulates it as an integer minimization problem and then employs the abc conjecture. The latter leads to sharper bounds.

2.1. Rough bounds

Theorem 2.1. Let x be an FPN of p significant digits, β = 1/√x, and let m̃_x be defined as in (2.1). Then either β is an FPN, or

D(δ) ≤ 2p + 1 − m̃_x in the nearest rounding mode,  (2.4)
D(δ) ≤ 2p − m̃_x in the direct rounding modes.

Proof. As we remarked at the beginning of Section 2, we may assume m_x = m̃_x for the purpose of this proof. Because

xZ² = 2^{m_x−2}(1.x_1 ··· x_{p−1})(1.y_1 ··· y_{n−1})²,
(2.5)
in its fixed-point binary representation, the least significant bit of xZ² is given by

2^{m_x−2} × x_{p−1}2^{−p+1} × (y_{n−1}2^{−n+1})² = x_{p−1}y²_{n−1} × 2^{−p−2n+1+m_x}.

Therefore |1 − xZ²| ≥ 2^{−(p+2n−1−m_x)}, and thus

log_2|δ| ≥ n − 2 − (p + 2n − 1 − m_x) = −p − n − 1 + m_x.

Eq. (2.4) is a consequence of (1.3), (1.6), and (2.3).

Applying Theorem 2.1 to the case n = p, we arrive at a bound 2p − m_x that is better than the bound 2n + 1 = 2p + 1 of [11], but one bit worse than the bound 2n − 1 = 2p − 1 of [8] for the case m_x = 0.
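For small precisions, Theorem 2.1 can be sanity-checked by exhaustive search. The following sketch (our own illustration in Python, not part of the original text) extracts exact bits of β = 1/√x via integer square roots; since the run of identical bits starting at y_{p+1} exceeds D(δ) by at most one, the theorem implies that this run is at most 2p + 2 in the nearest rounding mode.

```python
from math import isqrt

def inv_sqrt_bits(M: int, s: int, K: int) -> int:
    """Exact floor(2^K / sqrt(M / 2^s)); floor(sqrt(floor(z))) = floor(sqrt(z))."""
    return isqrt((1 << (2 * K + s)) // M)

def max_run(p: int, K: int = 96) -> int:
    """Longest run of identical bits starting at y_{p+1} in beta = 1/sqrt(x),
    over all p-bit FPNs x = 2^mx * 1.x1...x_{p-1} with mx in {0, 1}, x != 1."""
    worst = 0
    for mx in (0, 1):
        for sig in range(1 << (p - 1), 1 << p):
            if mx == 0 and sig == 1 << (p - 1):
                continue                           # x = 1: beta is itself an FPN
            # x = sig / 2^(p-1-mx) lies in (1, 4), so beta = 2^-1 * (1.y1 y2 ...)
            N = inv_sqrt_bits(sig, p - 1 - mx, K)  # N = floor(2^K * beta), exactly
            assert N.bit_length() == K             # since 1/2 < beta < 1
            y = [(N >> (K - 1 - i)) & 1 for i in range(1, K)]  # y[i-1] = y_i
            b, run = y[p], 0                       # chain starts at y_{p+1} (n = p+1)
            while p + run < len(y) and y[p + run] == b:
                run += 1
            worst = max(worst, run)
    return worst
```

For p = 8 this checks all 510 admissible significands; Theorem 2.1 then guarantees max_run(8) ≤ 2·8 + 2 = 18.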
2.2. Sharper bounds assuming the abc conjecture

We shall first reduce the computation of min|1 − xZ²| to an integer approximation problem, which allows us to employ the abc conjecture. The reader is referred to Section 3 for a discussion of it. Set

a = (1x_1 ··· x_{p−1})_binary,  b = (1y_1 ··· y_{p−1})_binary.

It can be seen that

2^{p−1} < a < 2^p − 1.  (2.6)

Since 2^{−m_x−1} < β² = 1/x < 2^{−m_x}, we have 2^{(1−m_x)/2} < 2β = 1.y_1 ··· y_{p−1} ··· < 2^{(2−m_x)/2}. Thus b = 2^{p−1}(1.y_1 y_2 ··· y_{p−1}) > 2^{p−1}(1.y_1 y_2 ··· y_{p−1} ···) − 1 > 2^{p−1+(1−m_x)/2} − 1. Combining the above inequalities gives

2^{p−(1+m_x)/2} − 1 < b < 2^{p−m_x/2}.
(2.7)
For the nearest rounding, that is, n = p + 1 and y_p = 1, we have

1 − xZ² = 1 − 2^{m_x−p+1}a(2^{−p}(b + 1/2))² = 1 − 2^{m_x−p+1}a(2^{−p−1}(2b + 1))² = 1 − 2^{−3p−1+m_x}a(2b + 1)² = 2^{−3p−1+m_x}(2^{3p+1−m_x} − a(2b + 1)²).
(2.8)
For the direct rounding, that is, n = p, we have

1 − xZ² = 1 − 2^{m_x−p+1}a(2^{−p}b)² = 1 − 2^{−3p+1+m_x}ab² = 2^{−3p+1+m_x}(2^{3p−1−m_x} − ab²).
(2.9)
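The identities (2.8) and (2.9) can be verified with exact rational arithmetic. The following sketch (ours, in Python, using arbitrarily chosen sample significands) builds b as the first p bits of β = 1/√x and checks both identities exactly:

```python
from fractions import Fraction
from math import isqrt

def check_identities(p: int, mx: int, a: int) -> bool:
    """Verify (2.8) and (2.9) exactly for x = 2^mx * a / 2^(p-1)."""
    x = Fraction(a * (1 << mx), 1 << (p - 1))
    # b = (1 y1 ... y_{p-1})_2 = floor(2^p * beta), beta = 1/sqrt(x),
    # computed exactly with an integer square root
    b = isqrt(((1 << (2 * p)) * x.denominator) // x.numerator)
    # nearest rounding (n = p+1, y_p = 1): Z = 2^{-p-1} (2b + 1), and (2.8)
    Z = Fraction(2 * b + 1, 1 << (p + 1))
    ok1 = (1 - x * Z * Z) == Fraction(
        (1 << (3 * p + 1 - mx)) - a * (2 * b + 1) ** 2, 1 << (3 * p + 1 - mx))
    # direct rounding (n = p): Z = 2^{-p} b, and (2.9)
    Z = Fraction(b, 1 << p)
    ok2 = (1 - x * Z * Z) == Fraction(
        (1 << (3 * p - 1 - mx)) - a * b * b, 1 << (3 * p - 1 - mx))
    return ok1 and ok2
```

The equalities are algebraic in a and b, so the check succeeds for every significand, e.g. check_identities(8, 0, 0b10110111) and check_identities(8, 1, 0b11111111).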
The above discussion, (1.6), and (2.3) lead to:

Lemma 2.2. Let x be an FPN with exponent m_x, β = 1/√x, and let m̃_x be defined as in (2.1). Then either β is an FPN, or

D(δ) ≤ 2p + 1 − m̃_x − log_2|2^{3p+1−m̃_x} − a(2b + 1)²| in the nearest rounding mode,
D(δ) ≤ 2p − m̃_x − log_2|2^{3p−1−m̃_x} − ab²| in the direct rounding modes.
Better bounds now rest on the solutions to the following integer minimization problems: given p and m_x ∈ {0, 1}, find

min |2^{3p+1−m_x} − a(2b + 1)²|,  (2.10)
min |2^{3p−1−m_x} − ab²|,  (2.11)

subject to (2.6) and (2.7). Solving them will provide us with much sharper bounds on the number D(δ) of consecutive 0's (and 1's) by Lemma 2.2. This is the place where we need help from the abc conjecture. The reader who is not familiar with the conjecture is referred to Section 3 before proceeding from here.

Lemma 2.3. Assume the abc conjecture holds. Let p be a positive integer and m_x = 0 or 1, and let a and b be integers satisfying (2.6) and (2.7). For any 0 < ε < 1 there exists a positive constant C_ε such that

min |2^{3p+1−m_x} − a(2b + 1)²| ≥ C_ε(2^p)^{1−ε},  (2.12)
min |2^{3p−1−m_x} − ab²| ≥ C_ε(2^p)^{1−ε}.  (2.13)
Proof. (1) We shall first prove (2.12). Let

d := 2^{3p+1−m_x} − a(2b + 1)².  (2.14)

Write a = 2^s a′ for some odd integer a′ and 0 ≤ s < p. Then (2.14) reduces to d = 2^s d′ with d′ = 2^{3p+1−m_x−s} − a′(2b + 1)². It can be seen that gcd(2^{3p+1−m_x−s}, a′(2b + 1)², d′) = 1, since a′(2b + 1)² and d′ are odd integers. Write η := ε/4. By the abc conjecture (see Conjecture 3.1), there exists a constant A_η > 0 such that

2^{3p+1−m_x−s} ≤ A_η (rad(2^{3p+1−m_x−s} a′(2b + 1)² d′))^{1+η}.  (2.15)

It can be seen that rad(2^{3p+1−m_x−s} a′(2b + 1)² d′) ≤ 2a′(2b + 1)|d′|. Thus we get

2^{3p+1−m_x−s} ≤ A_η (2a′(2b + 1)|d′|)^{1+η} < A_η (2^{2p+1−s}|d′|)^{1+η}.

Writing B_η := A_η^{−1/(1+η)}, the above inequality is equivalent to

|d′| ≥ B_η 2^{(3p+1−m_x−s)/(1+η)−(2p+1−s)}.  (2.16)

Since η < 1 and 0 ≤ s < p, we have

(3p + 1 − m_x − s)/(1 + η) − (2p + 1 − 2s) ≥ (3p + 1 − m_x − s)(1 − η) − (2p + 1 − 2s)
= p − m_x + s − η(3p + 1 − m_x − s) ≥ p − 4ηp − 1 = p(1 − ε) − 1.

Write C_ε := B_η/2; then (2.16) implies that

|d| = 2^s|d′| ≥ B_η 2^{(3p+1−m_x−s)/(1+η)−(2p+1−2s)} ≥ (B_η/2)(2^p)^{1−ε} = C_ε(2^p)^{1−ε}.

By (2.14) this proves our first claim.

(2) Now we prove (2.13). Since it is similar to part (1), we shall outline our proof but omit details. Let d := 2^{3p−1−m_x} − ab². Write d = 2^s d′ for some odd integer d′ and 0 ≤ s < p. Then d′ = 2^{3p−1−m_x−s} − 2^{−s}ab². It is easy to see that gcd(2^{3p−1−m_x−s}, 2^{−s}ab², d′) = 1. By the abc conjecture, we have

2^{3p−1−m_x−s} ≤ A_η (2 · 2^{−s}ab|d′|)^{1+η}.  (2.17)

Write B_η := A_η^{−1/(1+η)}. Note that ab < 2^{2p−m_x/2}. By (2.17) we have

|d′| ≥ B_η 2^{(3p−1−m_x−s)/(1+η)+s−1−2p+m_x/2}.

But

(3p − 1 − m_x − s)/(1 + η) + m_x/2 + s − 1 − 2p ≥ (3p − 1 − m_x − s)(1 − η) + m_x/2 + s − 1 − 2p = p − 2 − m_x/2 − η(3p − 1 − m_x − s) ≥ p(1 − ε) − 5/2.

Let C_ε := (√2/8)B_η; then we have |d| ≥ |d′| ≥ B_η 2^{p(1−ε)−5/2} = (√2/8)B_η(2^p)^{1−ε} = C_ε(2^p)^{1−ε}. This finishes our proof.

Theorem 2.4. Assume the abc conjecture holds. Let x be an FPN with exponent m_x, β = 1/√x, and let m̃_x be defined as in (2.1). Then either β is an FPN, or for any 0 < ε < 1 there exists a positive constant C_ε (depending only on ε) such that

D(δ) ≤ p + 1 − m̃_x − log_2 C_ε + pε in the nearest rounding mode,
D(δ) ≤ p − m̃_x − log_2 C_ε + pε in the direct rounding modes.

Proof. As a consequence of Lemmas 2.2 and 2.3, we have in the rounding-to-nearest mode

D(δ) ≤ 2p + 1 − m̃_x − log_2 C_ε − p(1 − ε) = p + 1 − m̃_x − log_2 C_ε + pε
and in the direct rounding modes

D(δ) ≤ 2p − m̃_x − log_2 C_ε − p(1 − ε) = p − m̃_x − log_2 C_ε + pε.

The proof is completed.

The bounds in Theorem 2.4 are sharper than those in Theorem 2.1 by roughly p(1 − ε) + log_2 C_ε. From the proof, one has C_ε = (1/2)c_{ε/4}^{−4/(ε+4)} for any ε > 0, where c_{ε/4} is the constant given in the abc conjecture. One notices that as ε > 0 approaches 0, C_ε approaches 1/(2c_{ε/4}). As c_ε is ineffective, our sharper bounds in Theorem 2.4 can only serve as theoretical background for a better understanding of the correct rounding of the reciprocal square root.

3. The abc conjecture, Roth's theorem and Liouville's estimates

We shall first give a brief discussion of the abc conjecture, which we applied in proving Theorem 2.4. The abc conjecture was proposed by Masser and Oesterlé independently. There are several versions of the conjecture by now, but only the traditional one is used here. For a readable survey see [6,16], and also [7, Part D] or [10, IV, Section 7]. For any non-zero integer N, let rad(N) denote the radical of N, that is, rad(N) := ∏_{ℓ|N} ℓ, where the product ranges over all distinct prime divisors ℓ of N. For instance, rad(−6) = 6 and rad(8) = 2.

Conjecture 3.1 (The abc conjecture). Given ε > 0, there exists a positive number c_ε such that

max(|A|, |B|, |C|) ≤ c_ε · rad(ABC)^{1+ε}

for any non-zero integers A, B, C with gcd(A, B, C) = 1 and A + B = C.

As of today the abc conjecture remains one of the most famous open questions in Number Theory. If the integers A, B, C are replaced by polynomials in one variable over a field, an analogous statement of the abc conjecture is known to hold, thanks to Mason (see [10, p. 194]). There is a long and ever-growing list of significant consequences. Remarkably, a stronger version of the abc conjecture implies the famous Roth theorem [17] and Fermat's last theorem (resolved by Wiles; see [18]). For these reasons, the abc conjecture is widely anticipated to hold among number theorists. A good starting point for all of this is the survey [6].

In our application of Conjecture 3.1 (see Lemma 2.3 and Theorem 2.4), the integers A, B, and C take more specific forms: one of them is a power of 2, and another one is either ab² or a(2b + 1)² with some constraints on the integers a and b. It is a natural question to ask whether these structures on A, B, and C yield a constrained abc conjecture; it is conceivable that one should arrive at a smaller constant c_ε for a given ε in these cases. But we do not know, and we did not pursue this direction further.
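As a concrete illustration (ours, not from the paper), the radical and the "quality" log C / log rad(ABC) of an abc triple are simple to compute; the triple 2 + 3¹⁰·109 = 23⁵, due to Reyssat, is the well-known example with the largest known quality (about 1.63):

```python
from math import gcd, log

def rad(n: int) -> int:
    """Radical of n: the product of the distinct prime divisors of |n|."""
    n, r, d = abs(n), 1, 2
    while d * d <= n:          # trial division; fine for moderate n
        if n % d == 0:
            r *= d
            while n % d == 0:
                n //= d
        d += 1
    return r * n if n > 1 else r

# The examples from the text:
assert rad(-6) == 6 and rad(8) == 2

# Reyssat's abc triple A + B = C with gcd(A, B, C) = 1
A, B, C = 2, 3**10 * 109, 23**5
assert A + B == C and gcd(A, gcd(B, C)) == 1
quality = log(C) / log(rad(A * B * C))   # rad(ABC) = 2 * 3 * 109 * 23
```

The abc conjecture says that quality > 1 + ε happens only finitely often for each ε > 0.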
We shall emphasize that the conjecture does not give an effective bound; namely, one does not know how small c_ε can really be. Below we give some number-theoretical perspective on our central question. Note that Question 1.1 can be reformulated as follows.

Question 3.2. Given p and an algebraic number α by its minimal polynomial, find the minimal q such that there is an FPN y with q correct significant digits and the rounding of y is equal to the rounding of α to p significant bits.

For simplicity, we confine ourselves to the rounding-to-nearest mode in this section.

Proposition 3.3. Suppose y is rounded to an FPN of p significant digits, written as M/2^t, where M is an integer satisfying 2^{p−1} ≤ M ≤ 2^p − 1 and t = p − 1 − m_y. If y is a rational approximation to α precise enough so that

|y − M/2^t| + |α − y| < 1/2^{t+1},  (3.1)

then the number of correct significant bits of y is no fewer than q as required by Question 3.2. An exception to this is when the first p + 1 leading significant bits of α are all ones, for which 1/2^{t+1} must be replaced by 1/2^{t+2}.

Proof. Eq. (3.1) holds for y sufficiently close to α, since |y − M/2^t| ≤ 1/2^{t+1}, where the inequality is strict for y sufficiently close to α (since α is irrational). Then, by the triangle inequality, we will have |α − M/2^t| < 1/2^{t+1}, which holds if and only if α is also rounded to M/2^t. We leave the exceptional case to the reader.

In practice, a computer can perform such an approximation task by computing more and more significant bits of α, yielding y closer and closer to α, until the inequality holds. But unless α is an algebraic number such as the reciprocal square root, in general there is no way of gauging, a priori, how precise an approximation we will need; that is, we have no way of estimating the "run time" of such a procedure for an arbitrary irrational α. In what follows we shall see what we can get from Roth's theorem (see [5, p. 30] or [17]): Let κ > 0 be an arbitrarily small real number. If u, v are integers, v ≥ 1, then

|α − u/v| > c_{κ,α}/v^{2+κ},

where c_{κ,α} > 0 is some (ineffective) constant that depends only on κ and α. From this we have the following.

Proposition 3.4. Let the notation be as in Proposition 3.3. If

|y − α| < c_{κ,α}/2^{(t+1)(2+κ)},
then the number of significant digits of y is no fewer than q as required by Question 3.2.

Proof. If α is not rounded to M/2^t, so that (3.1) fails to hold, we will have

1/2^{t+1} ≤ |α − M/2^t| ≤ |M/2^t − y| + |y − α| < 1/2^{t+1} + c_{κ,α}/2^{(t+1)(2+κ)}.

Thus, for b = ±1, we will have

|α − (2M + b)/2^{t+1}| ≤ c_{κ,α}/2^{(t+1)(2+κ)},

which is impossible by Roth's theorem. Thus, for such y, α is rounded to M/2^t.

It can be seen that any FPN y with (t + 1)(2 + κ) − log_2 c_{κ,α} + m correct significant bits satisfies the condition of Proposition 3.4. By proper scaling by some power of 2, we may assume m = 0. This then implies that

q ≤ (t + 1)(2 + κ) − log_2 c_{κ,α} ≤ p(2 + κ) − log_2 c_{κ,α},  (3.2)

where we have used t = p − 1 − m_y ≤ p. This bound is comparable to Theorem 2.4, except that the constant term log_2 c_{κ,α} here depends on α itself and hence is of little practical value. Along the same lines one may be able to develop a similar proposition based on Liouville's theorem [3], as in Section 2.1. We leave this as an exercise to the interested reader.

4. Conclusions

We have studied the minimal number q of leading correct significant bits of the reciprocal square root β = 1/√x, over the entire range of an FPN system, that suffices for correct rounding according to the IEEE standards. The technique used is a combination of the classical Liouville estimation and modern number theory. The main results are summarized in Theorems 2.1 and 2.4, the latter of which provides much sharper estimates. However, the sharper bounds are only of theoretical interest for now, as they are built upon the unproven, though widely believed, abc conjecture. Even so, the effort here represents a step forward in bridging the gap between the existing results on the minimal number q and the numerically observed one. Our study of the reciprocal square root is, to a certain extent, representative of other algebraic functions in scientific computations, most notably the cube root x^{1/3}, which has been included in libm by HP [12], Intel (see the web page in a previous footnote), and IBM [9]. It can be proved that: Assume the abc conjecture holds. Let x be an FPN and β = x^{1/3}. Then either β is an FPN, or for any 0 < ε < 1 there exists a positive constant C_ε (depending only on ε) such that D(δ) ≤ p(1 + ε) + C_ε. In view of the similar technicality, and to keep this paper short, we omit the details.
Our focus on binary FPN systems is representative, too. Extensions to FPN systems with a radix other than 2 can be done along lines similar to those here.

Acknowledgements

The authors are grateful to the anonymous referees for detailed and extremely helpful comments and suggestions. The work of Ren-Cang Li was supported in part by the National Science Foundation under Grant No. ACI-9721388 and by the National Science Foundation CAREER award under Grant No. CCR-9875201. Part of this work was conceived while he was on leave at Hewlett-Packard Company. He is grateful for help received from Jim Thomas, Jon Okada, and Peter Markstein of the HP Itanium floating point and elementary math library team at Cupertino, California. The research of Hui June Zhu was done while she was a postdoctoral fellow at the University of California at Berkeley, and she was partially supported by a grant from the David and Lucile Packard Foundation to Bjorn Poonen of the University of California at Berkeley.

References

[1] American National Standards Institute and Institute of Electrical and Electronics Engineers, IEEE standard for binary floating-point arithmetic, ANSI/IEEE Standard, Std 754-1985, New York.
[2] American National Standards Institute and Institute of Electrical and Electronics Engineers, IEEE standard for radix independent floating-point arithmetic, ANSI/IEEE Standard, Std 854-1987, New York.
[3] A. Baker, Transcendental Number Theory, 2nd Edition, Cambridge University Press, Cambridge, 1979.
[4] M.F. Cowlishaw, E.M. Schwarz, R.M. Smith, C.F. Webb, A decimal floating-point specification, in: N. Burgess, L. Ciminiera (Eds.), Proc. 15th IEEE Symp. on Computer Arithmetic, Vail, Colorado, IEEE Computer Society Press, Los Alamitos, CA, 2001, pp. 147–154.
[5] J. Esmonde, M.R. Murty, Problems in Algebraic Number Theory, Graduate Texts in Mathematics, Vol. 190, Springer, New York, 1999.
[6] A. Granville, T.J. Tucker, It's as easy as abc, Notices Amer. Math. Soc. 49 (10) (2002) 1224–1231.
[7] M. Hindry, J.H. Silverman, Diophantine Geometry: An Introduction, Graduate Texts in Mathematics, Vol. 201, Springer, New York, 2000.
[8] C.S. Iordache, D.W. Matula, Infinitely precise rounding for division, square root, and square root reciprocal, in: I. Koren, P. Kornerup (Eds.), Proc. 14th IEEE Symp. on Computer Arithmetic, Adelaide, Australia, IEEE Computer Society Press, Los Alamitos, CA, 1999, pp. 233–240.
[9] IBM, Technical Reference: Base Operating System and Extensions, 4th Edition, Vol. 2, International Business Machines Corporation, 2002, available at http://www16.boulder.ibm.com/cgi-bin/ds rslt1.
[10] S. Lang, Algebra, 3rd Edition, Addison-Wesley, Reading, MA, 1992.
[11] T. Lang, J.-M. Muller, Bounds on runs of zeros and ones for algebraic functions, in: N. Burgess, L. Ciminiera (Eds.), Proc. 15th IEEE Symp. on Computer Arithmetic, Vail, Colorado, IEEE Computer Society Press, Los Alamitos, CA, 2001, pp. 13–20.
[12] R.-C. Li, P. Markstein, J. Okada, J. Thomas, The libm library and floating-point arithmetic in HP-UX for Itanium II, available at http://h21007.www2.hp.com/dspp/files/unprotected/Itanium/FP White Paper v2.pdf (June 2002).
[13] J. Liouville, Sur des classes très étendues de quantités dont la valeur n'est ni algébrique, ni même réductible à des irrationnelles algébriques, C.R. Acad. Sci. Paris Sér. A 18 (1844) 883–885.
[14] J. Liouville, Nouvelle démonstration d'un théorème sur les irrationnelles algébriques inséré dans le compte rendu de la dernière séance, C.R. Acad. Sci. Paris Sér. A 18 (1844) 910–911.
[15] J. Liouville, Sur des classes très étendues de quantités dont la valeur n'est ni algébrique, ni même réductible à des irrationnelles algébriques, J. Math. Pures Appl. 16 (1851) 133–142.
[16] B. Mazur, Questions about powers of numbers, Notices Amer. Math. Soc. 47 (2) (2000) 195–202.
[17] K.F. Roth, Rational approximations to algebraic numbers, Mathematika 2 (1955) 1–20.
[18] A. Wiles, Modular elliptic curves and Fermat's last theorem, Ann. Math. 141 (1995) 443–551.
Theoretical Computer Science 315 (2004) 419 – 452
www.elsevier.com/locate/tcs
Fast arithmetic with general Gauß periods

Joachim von zur Gathen∗, Michael Nöcker

Faculty of Computer Science, Electrical Engineering, and Mathematics, University of Paderborn, D-33095 Paderborn, Germany
Abstract

We show how to apply fast arithmetic in conjunction with general Gauß periods in finite fields. This is an essential ingredient for some efficient exponentiation algorithms.
© 2004 Elsevier B.V. All rights reserved.

Keywords: Exponentiation; Finite fields; Normal basis; Gauß period; Efficient arithmetic
1. Introduction

Exponentiation is an important task with several applications in computer algebra and cryptography. If the ground domain is a finite field of "small" characteristic, then normal bases are a well-known and useful tool for this purpose. The goal of this paper is a computational framework in which one can combine the use of these normal bases with fast polynomial arithmetic. If q is a prime power and F_{q^n} an extension of F_q, then an element α ∈ F_{q^n} is normal over F_q if and only if its conjugates α, α^q, α^{q²}, ..., α^{q^{n−1}} are linearly independent over F_q. A qth power of an element represented in this basis is just a cyclic shift of coordinates, and a general exponentiation also requires fewer operations than in the usual polynomial representation given by an irreducible polynomial. This is one reason why normal elements are an attractive data structure. An apparent drawback is that multiplication in this data structure is generally based on linear algebra and hence seems quite expensive. A construction of special normal elements is via Gauß periods. We have an integer k, a prime number r with nk = r − 1, a primitive rth root of unity ζ in some
∗ Corresponding author.
E-mail address: [email protected] (J. von zur Gathen). URL: http://www-math.upb.de/∼aggathen/
0304-3975/$ - see front matter © 2004 Elsevier B.V. All rights reserved. doi:10.1016/j.tcs.2004.01.012
extension of F_q, a subgroup K ⊆ Z_r^× with k elements, and the Gauß period

α := Σ_{a∈K} ζ^a.

Then α ∈ F_{q^n}, and it is normal over F_q if and only if q mod r and K generate the group Z_r^×, that is, ⟨q, K⟩ = Z_r^× (see [2,18]). Rather than cumbersome matrix multiplication, as used for general normal bases, one can use polynomial multiplication to multiply elements in such a special normal basis. One can plug in any multiplication routine, from classical via Karatsuba to asymptotically fast ones (FFT-based or Cantor's method). This results in a speedup by an order of magnitude and the fastest exponentiation algorithms in large finite fields of small characteristic known today, both in theory and in software practice [7–9]. The time taken by the multiplication algorithm grows with the parameter k, which is extraneous to the base problem of calculating in F_{q^n}. It is desirable to choose k small, ideally k = 1 or k = 2 (the corresponding basis is then called an optimal normal basis; see [15]). But that is not always possible. The applicability of this method was broadened by a recent generalization of Gauß periods from prime numbers r to arbitrary integers r. Gauß, who had used his periods for the construction of the regular 17-gon, had already presaged this in Article 356 of his Disquisitiones Arithmeticae [10], but never published the general method: "These theorems retain the same or even greater elegance when they are extended to composite values of n. But these matters are on a higher level of investigation, and we will reserve their consideration for another occasion." [Gauß' n is the r used above.] The goal of this paper is to show that the use of (fast) polynomial arithmetic is also feasible with these general Gauß periods. We achieve this in three steps: first, when r is a prime power; then, when r is arbitrary and the Gauß period is of a special form, called decomposable. Lastly, we show that for an arbitrary Gauß period, we can always find a decomposable one with the same parameters.

Table 1
The percentage for which the minimal parameter k ∈ N≥1 is given by the special class of Gauß period. The values are given for all field extensions F_{q^n} with 2 ≤ n < 10000. The values of q are given in the first row; e.g., the distribution over the binary field F_2 is listed in the second column. The search for k = φ(r)/n is restricted to r ≤ 1 000 000.

Minimal value of the parameter k for normal Gauß periods with respect to the class

Class\q         2      3      5      7      11     13     17     19
Prime           57.79  63.04  63.25  63.24  64.71  65.27  64.93  65.20
Squarefree      26.19  29.22  30.35  23.35  25.78  25.16  32.33  22.59
Prime power     0.87   0.89   0.92   0.84   0.95   1.08   0.79   0.62
General         2.66   6.85   5.48   12.56  8.56   8.49   1.95   11.58
No normal GP    12.50  0.00   0.00   0.00   0.00   0.00   0.00   0.00

Table 1, whose details are explained in Section 7.1.1, shows that for roughly 35% of the field extensions in our experiments, general Gauß periods reduce the minimal value of k as compared to prime Gauß periods. The progress of the present work is to extend the applicability of polynomial arithmetic from the prime case to the general situation.
2. Gauß periods

In an arbitrary normal basis, all known multiplication algorithms, such as the Massey–Omura multiplier, make use of linear algebra. Our goal is to replace matrix-based multiplication by faster algorithms for specific normal elements, namely Gauß periods. This has been achieved by Gao et al. [7–9] for prime Gauß periods over F_q, and also by Blake et al. [3] for the special case of optimal normal bases (corresponding to k ∈ {1, 2}) in F_{2^n}. Our results generalize all these. In this section, we present Gauß periods and some of their properties for further use. We use the following notation throughout this paper.

Notation 2.1. k, n, q, and r are positive integers with q a prime power, r ≥ 2, gcd(q, r) = 1, and φ(r) = nk, where φ denotes Euler's totient function, and ζ is a primitive rth root of unity in an extension field of F_q. Furthermore, K is a subgroup of Z_r^× of order k. We let

r = r_1 ··· r_t with r_i = p_i^{e_i} for 1 ≤ i ≤ t  (2.2)

be the prime power factorization of r, where p_1, ..., p_t are pairwise distinct primes and e_1, ..., e_t ∈ N≥1. We call R_1 = ∏_{1≤i≤t, e_i=1} p_i the squarefree part of r and R_2 = r/R_1 the non-squarefree part. (This is not to be confused with another common designation, namely that of p_1 ··· p_t as the squarefree part.) We say that r is squarefree when r = R_1. Feisel et al. [5] introduced the following Gauß periods.

Definition 2.3. In the above notation, let

b(x) = x^{R_2} · ∏_{1≤i≤t, p_i | R_2} ( Σ_{1≤s≤e_i} x^{r/p_i^s} ) ∈ F_q[x].  (2.4)

The Gauß period of type (n, K) over F_q given by ζ is defined as

α = Σ_{a∈K} b(ζ^a).

It is easy to see that α ∈ F_{q^n}. When r is prime, a prime power, or squarefree, we call α a prime, prime power, or squarefree Gauß period, respectively. The definition of α simplifies in these cases:

r prime or squarefree ⇒ α = Σ_{a∈K} ζ^a,
r = p^e a prime power ⇒ α = Σ_{a∈K} Σ_{0≤s<e} ζ^{a p^s}.
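To make Definition 2.3 concrete, the exponent set of α = Σ_{a∈K} b(ζ^a) can be computed directly from (2.4), reducing exponents mod r since ζ^r = 1. A sketch (our illustration, not from the paper; `fact` maps each prime p_i to its exponent e_i):

```python
def gauss_period_exponents(r: int, fact: dict, K) -> list:
    """Exponents e with alpha = sum over e of zeta^e, per Definition 2.3:
    b(x) = x^R2 * prod over p_i | R2 of (sum over 1 <= s <= e_i of x^(r/p_i^s))."""
    R1 = 1
    for p, e in fact.items():
        if e == 1:
            R1 *= p                    # squarefree part of r
    R2 = r // R1                       # non-squarefree part
    exps = {R2}                        # exponents occurring in b(x)
    for p, e in fact.items():
        if R2 % p == 0:                # expand one product factor of (2.4)
            exps = {E + r // p**s for E in exps for s in range(1, e + 1)}
    # alpha = sum over a in K of b(zeta^a); reduce exponents mod r
    return sorted({a * E % r for a in K for E in exps})
```

This reproduces the periods computed in Example 2.5 below: for r = 9, K = {1, 8} it yields the exponents {1, 3, 6, 8}, and for r = 45, K = {1, 26} it yields {4, 14, 24, 39}; for squarefree r (R_2 = 1) it returns K itself.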
Example 2.5. Let q = 2.
(i) Let r = 5, ζ ∈ F_{2^4} a primitive 5th root of unity, and let K = {1} be the uniquely determined subgroup of Z_5^× of order k = 1. Then α = ζ is a prime Gauß period of type (4, {1}) in F_{2^4} over F_2.
(ii) Let r = 3², ζ a primitive 9th root of unity, and K = {1, 8}. Then α = ζ^{3^0·1} + ζ^{3^1·1} + ζ^{3^0·8} + ζ^{3^1·8} = ζ + ζ³ + ζ⁸ + ζ⁶ is a prime power Gauß period of type (3, {1, 8}) in F_{2^3} over F_2.
(iii) Let r = 3²·5, and let ζ be a primitive 45th root of unity. There are three subgroups of order k = 2 of Z_45^×, which define three different Gauß periods in F_{2^12}. The subgroup K_1 = {1, 26} determines α_1 = ζ^14 + ζ^24 + ζ^4 + ζ^39 of type (12, {1, 26}), K_2 = {1, 44} generates α_2 = ζ^14 + ζ^24 + ζ^21 + ζ^31, and K_3 = {1, 19} defines α_3 = ζ^14 + ζ^24 + ζ^6 + ζ^41.

We denote by ⟨q, K⟩ = {q^h a : h ∈ Z, a ∈ K} the subgroup of Z_r^× that is jointly generated by (q mod r) and K. Normality of Gauß periods can be characterized by this subgroup.

Theorem 2.6 (Normal Gauß Period, Feisel et al. [5]). Let α be the Gauß period of type (n, K) over F_q. Then α is normal in F_{q^n} if and only if ⟨q, K⟩ = Z_r^×.

Example 2.5 (continued). (i) Since ⟨2, {1}⟩ = {2, 4, 3, 1} = Z_5^×, the Gauß period of type (4, {1}) is normal in F_16 over F_2. (ii) One can easily check that ⟨2, {1, 8}⟩ = Z_9^×. Hence, the Gauß period of type (3, {1, 8}) is normal in F_8 over F_2. (iii) Only the two subgroups K_1 = {1, 26} and K_2 = {1, 44} generate normal Gauß periods in F_{2^12} over F_2. For K_3 = {1, 19} we have ⟨2, {1, 19}⟩ = {1, 2, 4, 8, 16, 17, 19, 23, 31, 32, 34, 38} ≠ Z_45^×. Thus, the Gauß period of type (12, {1, 19}) over F_2 is not normal in F_4096.

Two Gauß periods of the same type but given by different primitive rth roots of unity are conjugate. The following is the main result of this paper.

Theorem 2.7. Let α be a normal Gauß period of type (n, K) over F_q, and r = r_1 ··· r_t the prime power factorization (2.2) of r with K ⊆ Z_r^×. Then there exists a normal Gauß period with the same parameters so that two elements of F_{q^n} represented in this normal basis can be multiplied with

O( r · Σ_{1≤i≤t} (log r_i · log log r_i) ) or O(nk log(nk) log log(nk))

operations in F_q. The proof is given at the end of Section 6.
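The normality criterion of Theorem 2.6 is easy to test by brute force for moderate r. A sketch (ours, in Python) that generates the subgroup ⟨q, K⟩ ⊆ Z_r^× and compares its order with φ(r), checked against the cases of Example 2.5:

```python
from math import gcd

def generated_subgroup(r: int, gens) -> set:
    """Subgroup of Z_r^* generated by the given elements (all coprime to r)."""
    group, frontier = {1 % r}, [1 % r]
    while frontier:
        g = frontier.pop()
        for h in gens:
            gh = g * h % r
            if gh not in group:
                group.add(gh)
                frontier.append(gh)
    return group

def is_normal_gauss_period(q: int, r: int, K) -> bool:
    """Theorem 2.6: the Gauss period of type (n, K) is normal over F_q
    iff <q, K> is all of Z_r^*."""
    phi = sum(1 for a in range(1, r) if gcd(a, r) == 1)   # Euler phi(r)
    return len(generated_subgroup(r, list(K) + [q % r])) == phi

# Example 2.5: (i) r = 5, K = {1}; (iii) r = 45 with K1, K2, K3
assert is_normal_gauss_period(2, 5, {1})
assert is_normal_gauss_period(2, 45, {1, 26})       # K1: normal
assert is_normal_gauss_period(2, 45, {1, 44})       # K2: normal
assert not is_normal_gauss_period(2, 45, {1, 19})   # K3: not normal
```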
Fig. 1. Four projection homomorphisms.
3. Towers of groups and fields

Let α be a normal Gauß period of type (n, K) over F_q, and σ the Frobenius automorphism of F_{q^n} over F_q. Wassermann [18], Bemerkung 3.1.2, observed that for a prime Gauß period, q → σ induces an isomorphism from Z_r^×/K to Gal(F_{q^n} : F_q). This is also true for general Gauß periods. Let r′ ≥ 2 be a divisor of r,

π_{r′}: Z_r^× → Z_{r′}^× with π_{r′}(a) = (a mod r′)  (3.1)

the canonical projection of Z_r^× onto Z_{r′}^×, and π_{r′}(K) the image of K ⊆ Z_r^× under this epimorphism. Thus π_{r′}(K) is a subgroup of Z_{r′}^×. The order k′ of π_{r′}(K) divides both k = #K and φ(r′) = #Z_{r′}^×. The following lemma states that the canonical projection gives a normal Gauß period in a subfield of F_{q^n}.

Lemma 3.2. Let α be a normal Gauß period of type (n, K) over F_q given by ζ, r′ ≥ 2 a divisor of r, π_{r′} as in (3.1), k′ = #π_{r′}(K), and n′ = φ(r′)/k′. Then n′ divides n, ζ^{r/r′} is a primitive r′th root of unity, and the Gauß period of type (n′, π_{r′}(K)) over F_q with respect to ζ^{r/r′} is normal in F_{q^{n′}} over F_q.

Proof. The canonical projection π_{r′} is surjective, and ⟨q, K⟩ = Z_r^×, hence ⟨π_{r′}(q), π_{r′}(K)⟩ = Z_{r′}^×. The square of group homomorphisms in Fig. 1 commutes. The top and right-hand maps are surjective, and hence also the bottom one. It follows that n′ = #(Z_{r′}^×/π_{r′}(K)) divides n = #(Z_r^×/K). The other claims are clear.

The connection between the group Z_{r′}^× and the normal Gauß period in a subfield plays an important role in what follows. We illustrate this in the case of prime power Gauß periods. Let r be a prime power p^e with e ≥ 2, and let ζ be a primitive p^e-th root of unity. We suppose that the subgroup K of Z_r^× defines a normal Gauß period α = Σ_{a∈K} Σ_{0≤s<e} ζ^{a p^s} of type (n, K) over F_q with respect to ζ. Then ⟨q, K⟩ = Z_{p^e}^×.
For 0 < ℓ < e, the element ζ_ℓ = ζ^{p^{e−ℓ}} is a primitive p^ℓ-th root of unity, and we set n_ℓ = φ(p^ℓ)/#π_{p^ℓ}(K). Then

α_ℓ = Σ_{a∈π_{p^ℓ}(K)} Σ_{0≤s<ℓ} ζ_ℓ^{a p^s}

is the Gauß period of type (n_ℓ, π_{p^ℓ}(K)) over F_q with respect to ζ_ℓ, by Lemma 3.2. Since ⟨q, π_{p^ℓ}(K)⟩ = Z_{p^ℓ}^×, the Gauß period α_ℓ is normal in F_{q^{n_ℓ}} over F_q.

Example 2.5 (continued). (ii) The canonical projection π_3: Z_9^× → Z_3^× maps K = {1, 8} onto the subgroup π_3(K) = {1, 2} of Z_3^×, and ζ_1 = ζ^{3^{2−1}} = ζ³ is a primitive third root of unity. Lemma 3.2 says that α_1 = Σ_{a∈π_3(K)} ζ_1^a = ζ_1 + ζ_1² = 1 is a normal Gauß period of type (1, {1, 2}) over F_2. In fact, we have ⟨2, {1, 2}⟩ = Z_3^×, and α_1 is indeed a normal prime Gauß period.
3.1. Cyclotomic polynomials

Primitive roots of unity are related to a special class of polynomials, the cyclotomic polynomials; see Lidl and Niederreiter [13], Section 2.4, for details. When q is a prime power, r a positive integer coprime to q, and ζ a primitive rth root of unity over F_q, then

Φ_r = ∏_{0<s<r, gcd(s,r)=1} (x − ζ^s) ∈ F_q[x]

is the rth cyclotomic polynomial over F_q. Since the roots of Φ_r are all φ(r) distinct primitive rth roots of unity, the degree of Φ_r is φ(r), and ζ ∈ F_{q^{φ(r)}}. Over the field Q of rational numbers, the cyclotomic polynomial Φ_r is always irreducible. This is no longer true in the case of a finite field F_q of nonzero characteristic. But in this case the factorization pattern is well known.

Fact 3.3 (Lidl and Niederreiter [13, Theorem 2.47]). Let q be a prime power coprime to a positive integer r, and let N = ord_r(q) be the order of q in Z_r^×. Then the rth cyclotomic polynomial Φ_r ∈ F_q[x] factors into φ(r)/N distinct monic irreducible polynomials of the same degree N.

We denote the d = φ(r)/N irreducible factors by ϕ_1, ..., ϕ_d ∈ F_q[x]. By the Chinese Remainder Theorem we have the isomorphism of F_q-algebras

χ: R = F_q[x]/(Φ_r) → F_q[x]/(ϕ_1) × ··· × F_q[x]/(ϕ_d),
A → (A mod ϕ_1, ..., A mod ϕ_d).  (3.4)
Since r () = 0 for any primitive rth root of unity ∈ Fq (r) , we know that the minimal polynomial of in Fq [x] is one of the 1 ; : : : ; d . Then ’ : Fq () → Fq [x]=( ) with ’ Ai i = Ai (xi mod ) 06i¡N
06i¡N
is the canonical isomorphism between the two images of FqN . The 0eld Fq () is a sub0eld of Fq (). Thus, we know the image of in Fq [x]=( ). The key for fast multiplication of Gau# periods lies in the choice of a suitable preimage of in R.
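The factorization pattern of Fact 3.3 is easy to check computationally: the degree $N = \mathrm{ord}_r(q)$ and the number of factors $d = \varphi(r)/N$ only require modular arithmetic. A small sketch (function names are ours):

```python
from math import gcd

def euler_phi(r):
    # Euler's totient by trial division
    result, m, p = r, r, 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            result -= result // p
        p += 1
    if m > 1:
        result -= result // m
    return result

def cyclotomic_factor_pattern(q, r):
    """Return (N, d): over F_q, Phi_r splits into d = phi(r)/N distinct
    monic irreducible factors, each of degree N = ord_r(q) (Fact 3.3)."""
    assert gcd(q, r) == 1
    N, t = 1, q % r
    while t != 1:
        t = t * q % r
        N += 1
    return N, euler_phi(r) // N

# r = 9, q = 2: ord_9(2) = 6 = phi(9), so Phi_9 stays irreducible over F_2
assert cyclotomic_factor_pattern(2, 9) == (6, 1)
```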
For each $i \le d$, let $c_i \in K$ be such that $\zeta_i = \zeta^{c_i}$ is a root of $\Phi_i$. Then we have

$$\beta = \sum_{a \in K} b(\zeta^a) = \sum_{a \in K} b(\zeta^{c_i a}) = \sum_{a \in K} b(\zeta_i^a),$$

since $a \mapsto c_i a$ is a bijection of $K$. Applying the inverse of the isomorphism $\chi$, we get the preimage

$$\chi^{-1}\Bigl(\sum_{a \in K} b(x^a \bmod \Phi_1), \ldots, \sum_{a \in K} b(x^a \bmod \Phi_d)\Bigr) = \sum_{a \in K} b(x^a \bmod \Phi_r)$$

of $\beta$ in $R$. Finally, let $\phi_1, \ldots, \phi_d$ be the canonical isomorphisms with $\zeta_i = \zeta^{c_i}$ and $\Phi_i(\zeta_i) = 0$ for $1 \le i \le d$. We define the homomorphism of $\mathbb{F}_q$-algebras

$$\phi\colon \mathbb{F}_q(\beta) \to R = \mathbb{F}_q[x]/(\Phi_r), \qquad A \mapsto (\phi_1(A), \ldots, \phi_d(A)). \tag{3.5}$$

If $A = \sum_{0 \le h < n} A_h \beta^{q^h}$ is given as a linear combination of the conjugates of $\beta$, then

$$\phi\Bigl(\sum_{0 \le h < n} A_h \beta^{q^h}\Bigr) = \sum_{0 \le h < n} A_h \sum_{a \in K} b\bigl(x^{aq^h} \bmod \Phi_r\bigr).$$

This map allows us to transfer multiplication in the normal basis representation of $\mathbb{F}_{q^n} = \mathbb{F}_q(\beta)$ to multiplication in $R$, which is just polynomial multiplication modulo $\Phi_r$. Wonderful. The only drawback is that the original problem size is $n = \dim_{\mathbb{F}_q} \mathbb{F}_{q^n}$, while the new problem size $nk = \varphi(r) = \dim_{\mathbb{F}_q} R$ is larger by a factor of $k$. We want to keep this extraneous factor $k$ as small as possible.

3.2. Field towers, traces, and normal elements

We conclude this section by collecting some well-known properties of normal elements that are useful subsequently. The properties listed below hold not only for normal Gauß periods but for all normal bases. We will discuss the algorithmic aspects for normal bases generated by Gauß periods in the subsequent sections.

3.2.1. The product of normal elements

It is a well-known fact (see e.g. [14]) that normality is inherited along a tower of fields

$$\mathbb{F}_q \subseteq \mathbb{F}_{q^{n_1}} \subseteq \mathbb{F}_{q^{n_1 n_2}} \subseteq \cdots \subseteq \mathbb{F}_{q^{n_1 \cdots n_t}},$$

whenever the degrees $n_1, \ldots, n_t \ge 1$ are pairwise coprime.

Fact 3.6. Let $n_1$ and $n_2$ be two coprime positive integers, $n = n_1 \cdot n_2$, and let $\alpha_i$ be a normal element in $\mathbb{F}_{q^{n_i}}$ over $\mathbb{F}_q$ for $i = 1, 2$. Then $\alpha = \alpha_1 \cdot \alpha_2$ is normal in $\mathbb{F}_{q^n}$ over $\mathbb{F}_q$.
Fact 3.6 shows a way to compute the multiplication matrix $T_N$ of the normal basis $N = (\alpha, \ldots, \alpha^{q^{n_1 n_2 - 1}})$ if $\gcd(n_1, n_2) = 1$ and the matrices $T_{N_i}$ are already given for $i = 1, 2$.

Fact 3.7. Let $n_1$, $n_2$ and $\alpha_1$, $\alpha_2$ be as in Fact 3.6, and set $n = n_1 \cdot n_2$. Let $T_{N_1} = (u_{j_1, h_1})_{0 \le j_1, h_1 < n_1}$ and $T_{N_2} = (v_{j_2, h_2})_{0 \le j_2, h_2 < n_2}$ be the multiplication matrices of $N_i = (\alpha_i^{q^h}\colon 0 \le h < n_i)$ for $i = 1, 2$.
(i) The multiplication matrix $T_N = (t_{j,h})_{0 \le j, h < n}$ of $\alpha = \alpha_1 \cdot \alpha_2$ is given by $t_{j,h} = u_{j_1, h_1} \cdot v_{j_2, h_2}$, where $j \equiv j_i \bmod n_i$ and $h \equiv h_i \bmod n_i$ for $i = 1, 2$.
(ii) The density $d_N$ of $T_N$ is the product of the densities $d_{N_1}$ and $d_{N_2}$ of $T_{N_1}$ and $T_{N_2}$, respectively.
(iii) The multiplication matrix $T_N$ can be calculated with $d_N = d_{N_1} \cdot d_{N_2}$ multiplications in $\mathbb{F}_q$ from $T_{N_1}$ and $T_{N_2}$.

3.2.2. The trace of a normal element

The trace also inherits normality. The next fact is true for all Galois extensions over a finite field, see [11, Lemma 5.3]. Thus the trace map inherits normality downwards a field tower, while multiplication induces normality upwards.

Fact 3.8. Let $n_1$ and $n_2$ be two coprime positive integers and $n = n_1 \cdot n_2$. If $\alpha$ is normal in $\mathbb{F}_{q^n}$ over $\mathbb{F}_q$, then $\mathrm{Tr}_{q^n/q^{n_1}}(\alpha)$ is normal in $\mathbb{F}_{q^{n_1}}$ over $\mathbb{F}_q$.

In the special case where $n = n_1 \cdot n_2$ is the product of two coprime factors we get some further useful properties. A proof of Lemma 3.9(i) is given in Jungnickel [12], Lemma 5.1.8, and a special version of Lemma 3.9(ii) is cited in [1] for optimal normal bases. The proof technique will be used extensively in our algorithms, in particular analogs of the index maps $\rho_{n_1}$ and $\rho_{n_2}$.

Lemma 3.9. Let $n_1$ and $n_2$ be coprime positive integers, $n = n_1 \cdot n_2$, and let $\alpha_1$ and $\alpha_2$ be normal in $\mathbb{F}_{q^{n_1}}$ and $\mathbb{F}_{q^{n_2}}$ over $\mathbb{F}_q$, respectively. Then
(i) $\mathrm{Tr}_{q^n/q^{n_2}}(\alpha_1 \cdot \alpha_2) = \mathrm{Tr}_{q^{n_1}/q}(\alpha_1) \cdot \alpha_2$, and
(ii) $\alpha_2$ is normal in $\mathbb{F}_{q^n}$ over $\mathbb{F}_{q^{n_1}}$.

Proof (see Fig. 2). (i) We have

$$\mathrm{Tr}_{q^n/q^{n_2}}(\alpha_1 \cdot \alpha_2) = \sum_{0 \le i < n/n_2} (\alpha_1 \cdot \alpha_2)^{q^{in_2}} = \sum_{0 \le i < n/n_2} \alpha_1^{q^{in_2}} \cdot \alpha_2^{q^{in_2}} = \alpha_2 \cdot \sum_{0 \le i < n/n_2} \alpha_1^{q^{in_2}},$$

since $\alpha_2 \in \mathbb{F}_{q^{n_2}}$, that is, $\alpha_2^{q^{in_2}} = \alpha_2$ for all $1 \le i < n/n_2$. Moreover, the map $\rho_{n_2}\colon \{0, \ldots, n_1 - 1\} \to \{0, \ldots, n_1 - 1\}$ with $\rho_{n_2}(i) = n_2 i \;\mathrm{rem}\; n_1$ is a bijection
Fig. 2.
and hence

$$\sum_{0 \le i < n/n_2} \alpha_1^{q^{in_2}} = \sum_{0 \le i < n_1} \alpha_1^{q^i} = \mathrm{Tr}_{q^{n_1}/q}(\alpha_1).$$

(ii) Since $N_2 = (\alpha_2, \ldots, \alpha_2^{q^{n_2 - 1}})$ is a basis for $\mathbb{F}_{q^{n_2}}$ over $\mathbb{F}_q$, the set $N_2$ is a basis of $\mathbb{F}_{q^n}$ over $\mathbb{F}_{q^{n_1}}$. By assumption, $n_1$ and $n_2$ are coprime, and hence the map $\rho_{n_1}\colon \{0, \ldots, n_2 - 1\} \to \{0, \ldots, n_2 - 1\}$ with $\rho_{n_1}(i) = n_1 i \;\mathrm{rem}\; n_2$ is a bijection. Therefore, the set $\{\alpha_2^{q^{n_1 h}}\colon 0 \le h < n_2\} = \{\alpha_2^{q^h}\colon 0 \le h < n_2\}$ is the set of all $n_2$ conjugates of $\alpha_2$ over $\mathbb{F}_{q^{n_1}}$, and $N_2$ is a normal basis over $\mathbb{F}_{q^{n_1}}$ as claimed. $\square$

4. The prime power case

We are now ready to develop an algorithm that integrates polynomial multiplication into a normal basis representation whenever the normal element is a Gauß period. In this section, we restrict to the case where $\beta = \sum_{a \in K} \sum_{0 \le s < e} \zeta^{ap^s}$ is a prime or prime power Gauß period of type $(n, K)$ over $\mathbb{F}_q$, that is, $r = p^e$. The main result of this section generalizes the approach that was described in [7,9] for prime Gauß periods.

Result 4.1. Let $p$ be a prime, $e$ a positive integer, and $\beta$ a normal prime power Gauß period of type $(n, K)$ over $\mathbb{F}_q$, where $K$ is a subgroup of $\mathbb{Z}_{p^e}^\times$. Two elements of $\mathbb{F}_{q^n}$ expressed in the normal basis $N = (\beta, \ldots, \beta^{q^{n-1}})$ can be multiplied with at most $O(p^e \log p^e \cdot \log\log p^e)$ operations in $\mathbb{F}_q$.

The underlying algorithm is one of the cornerstones of this paper. It consists of three parts: multiplication in $\mathbb{F}_q[x]/(x^{p^e} - 1)$, sorting the product to identify prime (power) Gauß periods in subfields of $\mathbb{F}_{q^n}$, and then applying the trace map to return to the linear combination of the conjugates of the prime (power) Gauß period.

4.1. An algorithm for fast multiplication

We start with an example illustrating the algorithmic ideas.

Example 4.2. Let $\zeta$ be a primitive 9th root of unity, and let $\beta$ be the normal Gauß period of type $(3, \{1, 8\})$ over $\mathbb{F}_2$ as in Example 2.5(ii). The conjugates of $\beta = \zeta + \zeta^3 + \zeta^8 + \zeta^6$ are $\beta^2 = \zeta^2 + \zeta^6 + \zeta^7 + \zeta^3$ and $\beta^{2^2} = \zeta^4 + \zeta^3 + \zeta^5 + \zeta^6$.
(i) To calculate the product $\beta^{2^2} \cdot \beta$ as a linear combination of $\beta, \beta^2, \beta^4$, we regard the conjugates of $\beta$ as elements of $\mathbb{F}_2(\zeta)$. The product in this extension field is

$$\beta^4 \cdot \beta = (\zeta^4 + \zeta^3 + \zeta^5 + \zeta^6) \cdot (\zeta + \zeta^3 + \zeta^8 + \zeta^6) = \zeta + \zeta^8.$$

Both $\zeta$ and $\zeta^8$ are summands of $\beta$. We complete the missing terms to get

$$\beta^4 \cdot \beta = (\zeta + \zeta^3 + \zeta^8 + \zeta^6) + \zeta^3 + \zeta^6.$$

(ii) Observe that $\zeta^3$ and $\zeta^6$ are primitive third roots of unity over $\mathbb{F}_2$. We apply the canonical projection $\pi_3\colon \mathbb{Z}_9^\times \to \mathbb{Z}_3^\times$ as defined in (3.1). Then $\pi_3(\{1, 8\}) = \{1, 2\} = \mathbb{Z}_3^\times$, and hence $n_1 = \varphi(3)/\#\{1, 2\} = 1$. Thus, the projection generates the prime Gauß period $\beta_1 = \zeta^3 + (\zeta^3)^2$ over $\mathbb{F}_2$. We substitute $\beta_1$ for $\zeta^3 + \zeta^6$ to get $\beta^4 \cdot \beta = \beta + \beta_1$.

(iii) In order to express $\beta_1$ as a linear combination of the conjugates of $\beta$, we compute the trace of $\beta$ over $\mathbb{F}_2$:

$$\mathrm{Tr}_{2^3/2}(\beta) = \sum_{0 \le i < 3} \beta^{2^i} = (\zeta + \zeta^3 + \zeta^8 + \zeta^6) + (\zeta^2 + \zeta^6 + \zeta^7 + \zeta^3) + (\zeta^4 + \zeta^3 + \zeta^5 + \zeta^6) = \zeta + \zeta^2 + \zeta^3 + \zeta^4 + \zeta^5 + \zeta^6 + \zeta^7 + \zeta^8.$$

We sort the summands and apply the fact that $0 = \Phi_3(\zeta^3) = 1 + \zeta^3 + \zeta^6$ to get

$$\mathrm{Tr}_{2^3/2}(\beta) = \zeta \cdot (1 + \zeta^3 + \zeta^6) + \zeta^2 \cdot (1 + \zeta^3 + \zeta^6) + \zeta^3 + \zeta^6 = \zeta^3 + \zeta^6 = \beta_1.$$

Indeed, the trace yields a linear combination of the conjugates of $\beta$ for $\beta_1$. We insert this linear combination:

$$\beta^4 \cdot \beta = \beta + \beta_1 = \beta + \mathrm{Tr}_{2^3/2}(\beta) = \beta^2 + \beta^4,$$

which completes the computation.

We will show that the map $\phi\colon \mathbb{F}_q(\beta) \to R = \mathbb{F}_q[x]/(\Phi_{p^e})$ as in (3.5) is in fact an injective ring homomorphism if $\beta$ is normal over $\mathbb{F}_q$.

4.1.1. A sum of Gauß periods

We use the following notation.

Notation 4.3. Let $\zeta$ be a primitive $p^e$th root of unity. For $0 < \ell \le e$ let $\pi_{p^\ell}$ be the canonical projection from $\mathbb{Z}_{p^e}^\times$ onto $\mathbb{Z}_{p^\ell}^\times$. Set $k_\ell = \#\pi_{p^\ell}(K)$ and $n_\ell = \varphi(p^\ell)/k_\ell$. The Gauß period of type $(n_\ell, \pi_{p^\ell}(K))$ over $\mathbb{F}_q$ with respect to $\zeta_\ell = \zeta^{p^{e-\ell}}$ is denoted by $\beta_\ell$. We set $n_0 = k_0 = 1$.
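The identity $\beta^4 \cdot \beta = \beta^2 + \beta^4$ from Example 4.2 can be verified mechanically with bit-mask arithmetic in $\mathbb{F}_2[x]$ (bit $i$ of an integer stands for $x^i$; this encoding is our own illustration, not part of the paper):

```python
# GF(2)[x] arithmetic on coefficient bitmasks: bit i <-> x^i
def pmul(a, b):
    # carry-less (XOR) polynomial multiplication
    res = 0
    while b:
        if b & 1:
            res ^= a
        a <<= 1
        b >>= 1
    return res

def pmod(a, m):
    # remainder of a modulo m in GF(2)[x]
    while a.bit_length() >= m.bit_length():
        a ^= m << (a.bit_length() - m.bit_length())
    return a

def from_exps(exps):
    m = 0
    for e in exps:
        m ^= 1 << e
    return m

PHI9  = from_exps([6, 3, 0])      # Phi_9 = x^6 + x^3 + 1 over F_2
beta  = from_exps([1, 3, 8, 6])   # beta   = zeta + zeta^3 + zeta^8 + zeta^6
beta2 = from_exps([2, 6, 7, 3])   # beta^2
beta4 = from_exps([4, 3, 5, 6])   # beta^4

# beta^4 * beta, first mod (x^9 - 1), then mod Phi_9
lhs = pmod(pmod(pmul(beta4, beta), (1 << 9) ^ 1), PHI9)
rhs = pmod(beta2 ^ beta4, PHI9)
assert lhs == rhs                 # beta^4 * beta = beta^2 + beta^4
```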
We take a look at the summands of the product $\phi(A) \cdot \phi(B)$, and want to write a preimage under $\phi$ of this product in $\mathbb{F}_q[x]/(x^{p^e} - 1)$ in a particular way. We note that $x^a \equiv x^b \bmod (x^{p^e} - 1)$ if $a \equiv b \bmod p^e$. For all $0 \le i < n$, we define the nonnegative integers

$$u_{\ell,h}^{(i)} = \#\{a \in K\colon 1 + aq^i \in p^{e-\ell} q^h K\} \quad\text{for } 0 < \ell \le e \text{ and } 0 \le h < n_\ell,$$

$$v_{\ell,h}^{(i)} = \#\{a \in K\colon 1 + ap^\ell q^i \in q^h K\} \quad\text{for } 0 < \ell < e \text{ and } 0 \le h < n_\ell. \tag{4.4}$$

Furthermore, we set

$$u_{0,0}^{(i)} = \begin{cases} 1 & \text{if there is } a \in K \text{ such that } 1 + aq^i \equiv 0 \bmod p^e, \\ 0 & \text{otherwise.} \end{cases}$$

These numbers define the special form of the preimage in $\mathbb{F}_q[x]/(x^{p^e} - 1)$ of $\phi(A) \cdot \phi(B)$ that we are looking for. Subsequently, we suppose that $\langle q, K\rangle = \mathbb{Z}_{p^e}^\times$. Since $\phi$ is additive, it is sufficient to look at the following product. A generalization is shown in Proposition 4.10.

Lemma 4.5. Let $0 \le i < n$ and let $F$ be the prime subfield of $\mathbb{F}_q$. Then there are $C_0^{(i)}$ and $C_{\ell,h}^{(i)}$ in $F$ for $0 < \ell \le e$ and $0 \le h < n_\ell$ such that

$$\Bigl(\sum_{a \in K} \sum_{0 \le s < e} x^{ap^s q^i}\Bigr) \cdot \Bigl(\sum_{b \in K} \sum_{0 \le s' < e} x^{bp^{s'}}\Bigr) \equiv C_0^{(i)} + \sum_{0 < \ell \le e} \sum_{0 \le h < n_\ell} C_{\ell,h}^{(i)} \Bigl(\sum_{a \in \pi_{p^\ell}(K)} \sum_{0 \le s < \ell} \bigl(x^{p^{e-\ell}}\bigr)^{ap^s}\Bigr)^{q^h} \bmod (x^{p^e} - 1).$$

Since $\zeta$ is a root of $x^{p^e} - 1$, the product of $\beta^{q^i}$ times $\beta$ can be written as a sum of those Gauß periods $\beta_\ell$ which are given by the canonical projection of $K$ onto $\mathbb{Z}_{p^\ell}^\times$.

Corollary 4.6. Let $\beta$ be the Gauß period of type $(n, K)$ over $\mathbb{F}_q$ with respect to $\zeta$. For $0 < \ell \le e$, let $\beta_\ell$ be the Gauß period of type $(n_\ell, \pi_{p^\ell}(K))$ over $\mathbb{F}_q$ with respect to $\zeta^{p^{e-\ell}}$. For $0 \le i < n$, let $C_0^{(i)}$ and $C_{\ell,h}^{(i)}$ for $0 < \ell \le e$ and $0 \le h < n_\ell$ be as in Lemma 4.5. Then

$$\beta^{q^i} \cdot \beta = C_0^{(i)} + \sum_{0 < \ell \le e} \sum_{0 \le h < n_\ell} C_{\ell,h}^{(i)} \beta_\ell^{q^h}.$$

We start with a proposition that describes the coefficients of the preimage of $\phi(A) \cdot \phi(B)$ in $\mathbb{F}_q[x]/(x^{p^e} - 1)$ in terms of the $u_{\ell,h}^{(i)}$ and $v_{\ell,h}^{(i)}$.
Proposition 4.7. Let $0 \le i < n$ be fixed and let $u_{\ell,h}^{(i)}$ and $v_{\ell,h}^{(i)}$ be as in (4.4). Set

$$C_0 = k \cdot \sum_{0 \le \ell \le e} (e - \ell) \sum_{0 \le h < n_\ell} u_{\ell,h}^{(i)}$$

and

$$C_{p^{e-\ell} q^h} = \frac{k}{k_\ell} \Bigl(\sum_{\ell \le s \le e} u_{s,h}^{(i)} + \sum_{0 < s < \ell} \bigl(v_{s,h}^{(i)} + v_{s,h-i}^{(n-i)}\bigr)\Bigr) \quad\text{for all } 0 < \ell \le e \text{ and } 0 \le h < n_\ell.$$

Then

$$\Bigl(\sum_{a \in K} \sum_{0 \le s < e} x^{ap^s q^i}\Bigr) \cdot \Bigl(\sum_{b \in K} \sum_{0 \le s' < e} x^{bp^{s'}}\Bigr) \equiv C_0 + \sum_{0 < \ell \le e} \sum_{0 \le h < n_\ell} C_{p^{e-\ell} q^h} \sum_{a \in \pi_{p^\ell}(K)} \bigl(x^{p^{e-\ell}}\bigr)^{aq^h} \bmod (x^{p^e} - 1).$$
Proof. A straightforward computation gives

$$\Bigl(\sum_{a \in K} \sum_{0 \le s < e} x^{ap^s q^i}\Bigr) \cdot \Bigl(\sum_{b \in K} \sum_{0 \le s' < e} x^{bp^{s'}}\Bigr) = \sum_{a,b \in K} \sum_{0 \le s, s' < e} x^{ap^s q^i + bp^{s'}}$$

$$= \sum_{a,b \in K} \sum_{0 \le s < e} x^{ap^s q^i + bp^s} + \sum_{a,b \in K} \sum_{0 < \ell < e} \sum_{0 \le s < e-\ell} \bigl(x^{ap^s q^i + bp^{s+\ell}} + x^{ap^{s+\ell} q^i + bp^s}\bigr)$$

$$\equiv \sum_{a,b \in K} \sum_{0 \le s < e} x^{bp^s(1 + aq^i)} + \sum_{a,b \in K} \sum_{0 < \ell < e} \sum_{0 \le s < e-\ell} x^{bp^s(1 + ap^\ell q^i)} + \sum_{a,b \in K} \sum_{0 < \ell < e} \sum_{0 \le s < e-\ell} x^{ap^s q^i(1 + bp^\ell q^{n-i})} \bmod (x^{p^e} - 1).$$

We consider the three major summands separately. Fix $a \in K$. Then $1 + aq^i$ is either congruent to $0$ modulo $p^e$, or there are $0 < \ell \le e$ and $0 \le h < n_\ell$ such that $1 + aq^i \in p^{e-\ell} q^h K \subseteq \mathbb{Z}_{p^e}$. Then

$$\sum_{b \in K} \sum_{0 \le s < e} x^{bp^s(1 + aq^i)} \equiv \sum_{b \in K} \sum_{0 \le s < e} x^0 \equiv ke \bmod (x^{p^e} - 1)$$
if $1 + aq^i \equiv 0 \bmod p^e$, and otherwise we have

$$\sum_{b \in K} \sum_{0 \le s < e} x^{bp^s(1 + aq^i)} \equiv \sum_{b \in K} \sum_{0 \le s < e} x^{bp^{e-\ell+s} q^h} \equiv \sum_{b \in K} \sum_{0 \le s < \ell} x^{bp^{e-(\ell-s)} q^h} + k(e - \ell) \equiv \sum_{0 < s \le \ell} \frac{k}{k_s} \sum_{b \in \pi_{p^s}(K)} \bigl(x^{p^{e-s}}\bigr)^{bq^h} + k(e - \ell) \bmod (x^{p^e} - 1),$$

since the terms with $s \ge \ell$ reduce to $x^0 = 1$. If $a$ runs through $K$ then we get the first intermediate result

$$\sum_{a,b \in K} \sum_{0 \le s < e} x^{bp^s(1 + aq^i)} \equiv u_{0,0}^{(i)} ke + \sum_{0 < \ell \le e} \sum_{0 \le h < n_\ell} u_{\ell,h}^{(i)} \Bigl(k(e - \ell) + \sum_{0 < s \le \ell} \frac{k}{k_s} \sum_{b \in \pi_{p^s}(K)} \bigl(x^{p^{e-s}}\bigr)^{bq^h}\Bigr)$$

$$\equiv k \sum_{0 \le \ell \le e} (e - \ell) \sum_{0 \le h < n_\ell} u_{\ell,h}^{(i)} + \sum_{0 < \ell \le e} \sum_{0 \le h < n_\ell} \Bigl(\sum_{\ell \le s \le e} u_{s,h}^{(i)}\Bigr) \frac{k}{k_\ell} \sum_{b \in \pi_{p^\ell}(K)} \bigl(x^{p^{e-\ell}}\bigr)^{bq^h} \bmod (x^{p^e} - 1).$$

For the second sum, we fix $a \in K$ and $0 < \ell < e$. Since $1 + ap^\ell q^i \in \mathbb{Z}_{p^e}^\times$ and $\langle q, K\rangle = \mathbb{Z}_{p^e}^\times$, there is $0 \le h < n$ such that $1 + ap^\ell q^i \in q^h K$. Then we get

$$\sum_{b \in K} \sum_{0 \le s < e-\ell} x^{bp^s(1 + ap^\ell q^i)} \equiv \sum_{b \in K} \sum_{0 \le s < e-\ell} x^{bp^s q^h} \equiv \sum_{\ell < s \le e} \frac{k}{k_s} \sum_{b \in \pi_{p^s}(K)} \bigl(x^{p^{e-s}}\bigr)^{bq^h} \bmod (x^{p^e} - 1).$$

If $a$ runs through $K$ then the sum over all $0 < \ell < e$ is given by

$$\sum_{a \in K} \sum_{0 < \ell < e} \sum_{b \in K} \sum_{0 \le s < e-\ell} x^{bp^s(1 + ap^\ell q^i)} \equiv \sum_{0 < \ell < e} \sum_{0 \le h < n_\ell} v_{\ell,h}^{(i)} \sum_{\ell < s \le e} \frac{k}{k_s} \sum_{b \in \pi_{p^s}(K)} \bigl(x^{p^{e-s}}\bigr)^{bq^h}$$

$$\equiv \sum_{1 < \ell \le e} \sum_{0 \le h < n_\ell} \frac{k}{k_\ell} \Bigl(\sum_{0 < s < \ell} v_{s,h}^{(i)}\Bigr) \sum_{b \in \pi_{p^\ell}(K)} \bigl(x^{p^{e-\ell}}\bigr)^{bq^h} \bmod (x^{p^e} - 1).$$
By exchanging the roles of $a$ and $b$ and substituting $n - i$ for $i$, we get the formula for the third summand:

$$\sum_{b \in K} \sum_{0 < \ell < e} \sum_{a \in K} \sum_{0 \le s < e-\ell} x^{ap^s q^i(1 + bp^\ell q^{n-i})} \equiv \sum_{1 < \ell \le e} \sum_{0 \le h < n_\ell} \frac{k}{k_\ell} \Bigl(\sum_{0 < s < \ell} v_{s,h}^{(n-i)}\Bigr) \sum_{a \in \pi_{p^\ell}(K)} \bigl(\bigl(x^{p^{e-\ell}}\bigr)^a\bigr)^{q^{i+h}}$$

$$\equiv \sum_{1 < \ell \le e} \sum_{0 \le h < n_\ell} \frac{k}{k_\ell} \Bigl(\sum_{0 < s < \ell} v_{s,h-i}^{(n-i)}\Bigr) \sum_{a \in \pi_{p^\ell}(K)} \bigl(\bigl(x^{p^{e-\ell}}\bigr)^a\bigr)^{q^h} \bmod (x^{p^e} - 1).$$

Adding the three intermediate results proves the claim. $\square$

With the help of this proposition, we can group all summands of the preimage of $\phi(A) \cdot \phi(B)$ in $\mathbb{F}_q[x]/(x^{p^e} - 1)$ — except the constant coefficient — in terms of $\sum_{a \in \pi_{p^\ell}(K)} (x^{p^{e-\ell}})^{aq^h}$ with $0 < \ell \le e$ and $0 \le h < n_\ell$. Let $0 \le i < n$ be fixed as before; we omit it in the notation. Now our approach is to sort these terms into sums which are preimages of $\beta_\ell$, for $0 < \ell \le e$, in $R$. This is obvious but a little bit technical. Thus, we define two useful sequences of integers for all $0 < \ell \le e$, $\ell \le s < e$, and $0 \le h < n_\ell$:

$$D_{\ell,h}^{(e)} = 0, \qquad D_{\ell,h}^{(s)} = D_{\ell,h}^{(s+1)} + \frac{k_{s+1}}{k_\ell} \sum_{0 \le j < n_{s+1}/n_\ell} C_{s+1,\, h+jn_\ell}, \qquad C_{\ell,h} = C_{p^{e-\ell} q^h} - D_{\ell,h}^{(\ell)}. \tag{4.8}$$

Informally speaking, the $D_{\ell,h}^{(s)}$ are those parts of the $C_{p^{e-\ell} q^h}$ which have already been identified as Gauß periods. We give some alternative computations of the $D_{\ell,h}^{(s)}$ to illustrate this.

Lemma 4.9. Let $D_{\ell,h}^{(s)}$ and $C_{\ell,h}$ be as above. Then
(i) $D_{\ell,h}^{(s)} = \sum_{s \le s' < e} \frac{k_{s'+1}}{k_\ell} \bigl(\sum_{0 \le j < n_{s'+1}/n_\ell} C_{s'+1,\, h+jn_\ell}\bigr)$ for $0 < \ell \le s < e$,
(ii) $D_{\ell,h}^{(\ell+1)} = \frac{k_{\ell+1}}{k_\ell} \sum_{0 \le j < n_{\ell+1}/n_\ell} D_{\ell+1,\, h+jn_\ell}^{(\ell+1)}$ for $0 < \ell < e$,
(iii) $D_{\ell,h}^{(\ell)} = \frac{k_{\ell+1}}{k_\ell} \sum_{0 \le j < n_{\ell+1}/n_\ell} \bigl(D_{\ell+1,\, h+jn_\ell}^{(\ell+1)} + C_{\ell+1,\, h+jn_\ell}\bigr)$ for $0 < \ell < e$.

Proof. (i) We proceed by induction on $s$, downwards. For $s = e - 1$, by definition we have for all $0 < \ell < e$ that

$$D_{\ell,h}^{(e-1)} = D_{\ell,h}^{(e)} + \frac{k_e}{k_\ell} \sum_{0 \le j < n_e/n_\ell} C_{e,\, h+jn_\ell} = \sum_{e-1 \le s' < e} \frac{k_{s'+1}}{k_\ell} \sum_{0 \le j < n_{s'+1}/n_\ell} C_{s'+1,\, h+jn_\ell},$$
using $D_{\ell,h}^{(e)} = 0$. We suppose that the claimed formula is also true for $s + 1 < e$. Inserting the induction hypothesis into the definition of $D_{\ell,h}^{(s)}$ gives

$$D_{\ell,h}^{(s)} = D_{\ell,h}^{(s+1)} + \frac{k_{s+1}}{k_\ell} \sum_{0 \le j < n_{s+1}/n_\ell} C_{s+1,\, h+jn_\ell} = \sum_{s+1 \le s' < e} \frac{k_{s'+1}}{k_\ell} \sum_{0 \le j < n_{s'+1}/n_\ell} C_{s'+1,\, h+jn_\ell} + \frac{k_{s+1}}{k_\ell} \sum_{0 \le j < n_{s+1}/n_\ell} C_{s+1,\, h+jn_\ell} = \sum_{s \le s' < e} \frac{k_{s'+1}}{k_\ell} \sum_{0 \le j < n_{s'+1}/n_\ell} C_{s'+1,\, h+jn_\ell},$$

and the induction step is complete.

(ii) Let $0 < \ell < e$. Then

$$D_{\ell,h}^{(\ell+1)} = \sum_{\ell+1 \le s' < e} \frac{k_{s'+1}}{k_\ell} \sum_{0 \le j < n_{s'+1}/n_\ell} C_{s'+1,\, h+jn_\ell}$$

by (i). We sort the summands and use (i) again to obtain

$$D_{\ell,h}^{(\ell+1)} = \sum_{\ell+1 \le s' < e} \frac{k_{\ell+1}}{k_\ell} \cdot \frac{k_{s'+1}}{k_{\ell+1}} \sum_{0 \le j < n_{\ell+1}/n_\ell} \sum_{0 \le i < n_{s'+1}/n_{\ell+1}} C_{s'+1,\, h+(jn_\ell + in_{\ell+1})} = \frac{k_{\ell+1}}{k_\ell} \sum_{0 \le j < n_{\ell+1}/n_\ell} \sum_{\ell+1 \le s' < e} \frac{k_{s'+1}}{k_{\ell+1}} \sum_{0 \le i < n_{s'+1}/n_{\ell+1}} C_{s'+1,\, (h+jn_\ell)+in_{\ell+1}} = \frac{k_{\ell+1}}{k_\ell} \sum_{0 \le j < n_{\ell+1}/n_\ell} D_{\ell+1,\, h+jn_\ell}^{(\ell+1)}.$$

(iii) For $0 < \ell < e$, the definition in (4.8) gives

$$D_{\ell,h}^{(\ell)} = D_{\ell,h}^{(\ell+1)} + \frac{k_{\ell+1}}{k_\ell} \sum_{0 \le j < n_{\ell+1}/n_\ell} C_{\ell+1,\, h+jn_\ell} = \frac{k_{\ell+1}}{k_\ell} \sum_{0 \le j < n_{\ell+1}/n_\ell} \bigl(D_{\ell+1,\, h+jn_\ell}^{(\ell+1)} + C_{\ell+1,\, h+jn_\ell}\bigr),$$

where the second equality follows from (ii). $\square$
We prove with the help of these sequences $D_{\ell,h}^{(s)}$ and $C_{\ell,h}$ that the preimage of $\phi(A) \cdot \phi(B)$ in $\mathbb{F}_q[x]/(x^{p^e} - 1)$ can be written as a sum of Gauß periods. The following proposition includes Lemma 4.5 as the special case $\ell' = 0$.

Proposition 4.10. Let $C_{\ell,h}$ and $D_{\ell,h}^{(s)}$ be as in (4.8), and let $0 \le \ell' \le e$. Then

$$\Bigl(\sum_{a \in K} \sum_{0 \le s < e} x^{ap^s q^i}\Bigr) \cdot \Bigl(\sum_{b \in K} \sum_{0 \le s' < e} x^{bp^{s'}}\Bigr) \equiv C_0 + \sum_{\ell' < \ell \le e} \sum_{0 \le h < n_\ell} C_{\ell,h} \Bigl(\sum_{a \in \pi_{p^\ell}(K)} \sum_{0 \le s < \ell} \bigl(x^{p^{e-\ell}}\bigr)^{ap^s}\Bigr)^{q^h} + \sum_{0 < \ell \le \ell'} \sum_{0 \le h < n_\ell} \bigl(C_{p^{e-\ell} q^h} - D_{\ell,h}^{(\ell')}\bigr) \sum_{a \in \pi_{p^\ell}(K)} x^{p^{e-\ell} a q^h} \bmod (x^{p^e} - 1)$$

for all $0 \le \ell' \le e$.

Proof. We use induction on $\ell'$, downwards. For $\ell' = e$, the right-hand side of the claimed congruence is

$$C_0 + 0 + \sum_{0 < \ell \le e} \sum_{0 \le h < n_\ell} \bigl(C_{p^{e-\ell} q^h} - D_{\ell,h}^{(e)}\bigr) \sum_{a \in \pi_{p^\ell}(K)} x^{p^{e-\ell} a q^h},$$

which is just the right-hand side of the congruence in Proposition 4.7, since all $D_{\ell,h}^{(e)}$ are zero. Now, we suppose that the formula is true for some $\ell'$ with $0 < \ell' \le e$. Then for all $0 \le h < n_{\ell'}$,

$$\bigl(C_{p^{e-\ell'} q^h} - D_{\ell',h}^{(\ell')}\bigr) \sum_{a \in \pi_{p^{\ell'}}(K)} x^{p^{e-\ell'} a q^h} \overset{(4.8)}{\equiv} C_{\ell',h} \sum_{a \in \pi_{p^{\ell'}}(K)} x^{p^{e-\ell'} a q^h} \equiv C_{\ell',h} \sum_{a \in \pi_{p^{\ell'}}(K)} \sum_{0 \le s < \ell'} x^{p^{e-\ell'} a p^s q^h} - C_{\ell',h} \sum_{a \in \pi_{p^{\ell'}}(K)} \sum_{1 \le s < \ell'} x^{p^{e-(\ell'-s)} a q^h} \bmod (x^{p^e} - 1).$$

We sort the summands, adding the first term of the difference to the already collected summands:

$$C_0 + \sum_{\ell' < \ell \le e} \sum_{0 \le h < n_\ell} C_{\ell,h} \Bigl(\sum_{a \in \pi_{p^\ell}(K)} \sum_{0 \le s < \ell} \bigl(x^{p^{e-\ell}}\bigr)^{ap^s}\Bigr)^{q^h} + \sum_{0 \le h < n_{\ell'}} C_{\ell',h} \Bigl(\sum_{a \in \pi_{p^{\ell'}}(K)} \sum_{0 \le s < \ell'} \bigl(x^{p^{e-\ell'}}\bigr)^{ap^s}\Bigr)^{q^h} \equiv C_0 + \sum_{\ell'-1 < \ell \le e} \sum_{0 \le h < n_\ell} C_{\ell,h} \Bigl(\sum_{a \in \pi_{p^\ell}(K)} \sum_{0 \le s < \ell} \bigl(x^{p^{e-\ell}}\bigr)^{ap^s}\Bigr)^{q^h} \bmod (x^{p^e} - 1).$$

The remaining part is

$$\sum_{0 < \ell < \ell'} \sum_{0 \le h < n_\ell} \bigl(C_{p^{e-\ell} q^h} - D_{\ell,h}^{(\ell')}\bigr) \sum_{a \in \pi_{p^\ell}(K)} x^{p^{e-\ell} a q^h} - \sum_{0 \le h < n_{\ell'}} C_{\ell',h} \sum_{a \in \pi_{p^{\ell'}}(K)} \sum_{1 \le s < \ell'} x^{p^{e-(\ell'-s)} a q^h}$$

$$\equiv \sum_{0 < \ell < \ell'} \sum_{0 \le h < n_\ell} \bigl(C_{p^{e-\ell} q^h} - D_{\ell,h}^{(\ell')}\bigr) \sum_{a \in \pi_{p^\ell}(K)} x^{p^{e-\ell} a q^h} - \sum_{0 \le h < n_{\ell'}} C_{\ell',h} \sum_{0 < s < \ell'} \frac{k_{\ell'}}{k_s} \sum_{a \in \pi_{p^s}(K)} \bigl(x^{p^{e-s}}\bigr)^{a q^h}$$

$$\equiv \sum_{0 < \ell < \ell'} \sum_{0 \le h < n_\ell} \Bigl(C_{p^{e-\ell} q^h} - D_{\ell,h}^{(\ell')} - \frac{k_{\ell'}}{k_\ell} \sum_{0 \le j < n_{\ell'}/n_\ell} C_{\ell',\, h+jn_\ell}\Bigr) \sum_{a \in \pi_{p^\ell}(K)} x^{p^{e-\ell} a q^h} \bmod (x^{p^e} - 1).$$

But $D_{\ell,h}^{(\ell')} + \frac{k_{\ell'}}{k_\ell} \sum_{0 \le j < n_{\ell'}/n_\ell} C_{\ell',\, h+jn_\ell} = D_{\ell,h}^{(\ell'-1)}$ by construction in (4.8), and the induction step follows. $\square$
4.1.2. Applying the trace map

The last ingredient is the trace map. It provides a way of writing a normal Gauß period $\beta_\ell \in \mathbb{F}_{q^{n_\ell}}$ as a linear combination of the elements of the normal basis $N = (\beta, \beta^q, \ldots, \beta^{q^{n-1}})$ of $\mathbb{F}_{q^n}$.

Lemma 4.11. Let $r = p^e$ be a prime power, and let $\beta$ be a prime power Gauß period of type $(n, K)$ over $\mathbb{F}_q$ with respect to $\zeta$, where $\langle q, K\rangle = \mathbb{Z}_{p^e}^\times$. For any $0 < \ell \le e$, let $\beta_\ell$ be the Gauß period of type $(n_\ell, \pi_{p^\ell}(K))$ over $\mathbb{F}_q$ with respect to $\zeta^{p^{e-\ell}}$. Then

$$\sum_{0 \le i < n/n_\ell} \beta^{q^{in_\ell}} = p^{e-\ell} \beta_\ell \quad\text{for } 0 < \ell \le e.$$

Furthermore, we have

$$\sum_{0 \le i < n} \beta^{q^i} = -p^{e-1}.$$

We again derive these formulas step by step, and will give a proof of Lemma 4.11 as a conclusion at the end of this paragraph. Moreover, we show that this lemma includes the reduction modulo $\Phi_{p^e}$ we are looking for. We start by defining a set of polynomials $\psi_0, \psi_{\ell,b} \in \mathbb{F}_q[x]$ for $0 < \ell < e$ and $b \in \pi_{p^\ell}(K)$. Since we are still working in the ring $\mathbb{F}_q[x]/(x^{p^e} - 1)$, we assume all polynomials to be reduced modulo $x^{p^e} - 1$; that is, we identify $(a \bmod p^e) \in \mathbb{Z}_{p^e}^\times$ with its canonical representative $\bar a \in \mathbb{Z}$, $0 < \bar a < p^e$, such that $\bar a \equiv a \bmod p^e$. For $0 < \ell < e$, $0 \le i < n_{\ell+1}/n_\ell$, and $b \in \pi_{p^\ell}(K)$, we consider

$$I_{\ell,b,i} = \{a \in \pi_{p^{\ell+1}}(K)\colon a \equiv q^{-in_\ell} b \bmod p^\ell\},$$

the set of all elements in $\pi_{p^{\ell+1}}(K)$ that are preimages of $q^{-in_\ell} b$ under the canonical projection $\tau\colon \mathbb{Z}_{p^{\ell+1}}^\times \to \mathbb{Z}_{p^\ell}^\times$. For $0 < \ell < e$ and $b \in \pi_{p^\ell}(K)$, we set

$$\psi_0 = \sum_{0 \le i < n_1} \sum_{a \in \pi_p(K)} \bigl(x^{p^{e-1}}\bigr)^{aq^i} + 1 \in \mathbb{F}_q[x]$$

and

$$\psi_{\ell,b} = \sum_{0 \le i < n_{\ell+1}/n_\ell} \sum_{a \in I_{\ell,b,i}} \sum_{0 \le s < \ell+1} \bigl(x^{p^{e-(\ell+1)}}\bigr)^{ap^s q^{in_\ell}} - p \sum_{0 \le s < \ell} \bigl(x^{p^{e-\ell}}\bigr)^{bp^s} \in \mathbb{F}_q[x]. \tag{4.12}$$

Proposition 4.13. For $0 < \ell < e$, let $\psi_0$ and $\psi_{\ell,b}$ be the polynomials as in (4.12) for all $b \in \pi_{p^\ell}(K)$. Then $\Phi_{p^e}$ divides $\psi_0$ and $\psi_{\ell,b}$.

Proof. Fix $0 < \ell < e$, and let $\tau\colon \mathbb{Z}_{p^{\ell+1}}^\times \to \mathbb{Z}_{p^\ell}^\times$ with $\tau(a) = (a \bmod p^\ell)$ be the canonical projection from $\mathbb{Z}_{p^{\ell+1}}^\times$ onto $\mathbb{Z}_{p^\ell}^\times$. Since $\pi_{p^\ell} = \tau \circ \pi_{p^{\ell+1}}$, the projection $\tau$ is a surjective homomorphism. Thus, each element $b \in \mathbb{Z}_{p^\ell}^\times$ has a preimage set $\tau^{-1}(b) = \{a \in \mathbb{Z}_{p^{\ell+1}}^\times\colon a \equiv b \bmod p^\ell\}$ of order $\#\tau^{-1}(b) = \#\mathbb{Z}_{p^{\ell+1}}^\times / \#\mathbb{Z}_{p^\ell}^\times = p^\ell(p-1)/(p^{\ell-1}(p-1)) = p$. One can easily check that the kernel of $\tau$ is $\ker \tau = \{(1 + p^\ell z) \bmod p^{\ell+1}\colon 0 \le z < p\}$. This gives a second way to express the preimage set of $b$ in $\mathbb{Z}_{p^{\ell+1}}^\times$:

$$\tau^{-1}(b) = b \cdot \ker \tau = \{(b + zp^\ell) \bmod p^{\ell+1}\colon 0 \le z < p\}. \tag{4.14}$$

Here we use that the map $\sigma_b\colon \{0, \ldots, p-1\} \to \{0, \ldots, p-1\}$ with $\sigma_b(z) = bz \;\mathrm{rem}\; p$ is a permutation because $\gcd(b, p) = 1$.
We can also give a description of $\tau^{-1}(b)$ involving the $I_{\ell,b,i}$. Since $q^{n_\ell} \in \pi_{p^\ell}(K)$, the inverse of $q^{in_\ell}$ is also an element of $\pi_{p^\ell}(K)$. Thus, the set $I_{\ell,b,i}$ contains $k_{\ell+1}/k_\ell$ elements. For $0 \le i < n_{\ell+1}/n_\ell$ and $a \in I_{\ell,b,i}$, we have $\tau(q^{in_\ell} a) \equiv q^{in_\ell} \cdot q^{-in_\ell} b \equiv b \bmod p^\ell$. Hence, the set $\{q^{in_\ell} a\colon 0 \le i < n_{\ell+1}/n_\ell \text{ and } a \in I_{\ell,b,i}\}$ is a subset of $\tau^{-1}(b)$. But $\bigcup_{0 \le i < n_{\ell+1}} q^i \pi_{p^{\ell+1}}(K)$ is a partition of $\mathbb{Z}_{p^{\ell+1}}^\times$, and both sets have $(n_{\ell+1}/n_\ell) \cdot (k_{\ell+1}/k_\ell) = \varphi(p^{\ell+1})/\varphi(p^\ell) = p$ elements. Therefore, equality holds:

$$\tau^{-1}(b) = \Bigl\{q^{in_\ell} a\colon 0 \le i < \frac{n_{\ell+1}}{n_\ell} \text{ and } a \in I_{\ell,b,i}\Bigr\}. \tag{4.15}$$

With the help of these formulas we have for $0 < \ell < e$ and all $b \in \pi_{p^\ell}(K)$:

$$\sum_{0 \le i < n_{\ell+1}/n_\ell} \sum_{a \in I_{\ell,b,i}} \sum_{0 \le s < \ell+1} \bigl(x^{p^{e-(\ell+1)}}\bigr)^{ap^s q^{in_\ell}} \overset{(4.15)}{\equiv} \sum_{a \in \tau^{-1}(b)} \sum_{0 \le s < \ell+1} \bigl(x^{p^{e-(\ell+1)}}\bigr)^{ap^s} \overset{(4.14)}{\equiv} \sum_{0 \le z < p} \sum_{0 \le s < \ell+1} \bigl(x^{p^{e-(\ell+1)}}\bigr)^{p^s(b+zp^\ell)} \equiv \sum_{0 \le s < \ell+1} \bigl(x^{p^{e-(\ell+1)}}\bigr)^{bp^s} \sum_{0 \le z < p} \bigl(x^{p^{e-1}}\bigr)^{zp^s} \bmod (x^{p^e} - 1).$$

For $s = 0$, the inner sum vanishes modulo $\Phi_{p^e}$, since

$$\sum_{0 \le z < p} \bigl(x^{p^{e-1}}\bigr)^z = \frac{x^{p^e} - 1}{x^{p^{e-1}} - 1} \equiv 0 \bmod \Phi_{p^e}.$$

For $s \ge 1$, we simplify modulo $\Phi_{p^e}$:

$$\sum_{0 \le z < p} \bigl(x^{p^{e-1}}\bigr)^{zp^s} \equiv \sum_{0 \le z < p} 1^{zp^{s-1}} \equiv p \bmod \Phi_{p^e}.$$

Inserting both formulas gives

$$\sum_{0 \le i < n_{\ell+1}/n_\ell} \sum_{a \in I_{\ell,b,i}} \sum_{0 \le s < \ell+1} \bigl(x^{p^{e-(\ell+1)}}\bigr)^{ap^s q^{in_\ell}} \equiv \bigl(x^{p^{e-(\ell+1)}}\bigr)^{bp^0} \cdot 0 + \sum_{1 \le s < \ell+1} \bigl(x^{p^{e-(\ell+1)}}\bigr)^{bp^s} \cdot p \equiv p \sum_{0 \le s < \ell} \bigl(x^{p^{e-\ell}}\bigr)^{bp^s} \bmod \Phi_{p^e}. \tag{4.16}$$
It follows by construction of $\psi_{\ell,b}$ in (4.12) that $\Phi_{p^e}$ is a divisor of $\psi_{\ell,b}$ for $0 < \ell < e$ and $b \in \pi_{p^\ell}(K)$. For $\psi_0$ we have

$$\sum_{a \in \pi_p(K)} \sum_{0 \le i < n_1} \bigl(x^{p^{e-1}}\bigr)^{aq^i} = \sum_{a \in \mathbb{Z}_p^\times} \bigl(x^{p^{e-1}}\bigr)^a = \sum_{0 \le z < p} \bigl(x^{p^{e-1}}\bigr)^z - 1 = \frac{x^{p^e} - 1}{x^{p^{e-1}} - 1} - 1 \equiv -1 \bmod \Phi_{p^e},$$

since $\langle q, \pi_p(K)\rangle = \mathbb{Z}_p^\times$, and the claim follows also for $\psi_0$. $\square$
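The identity $\sum_{0 \le z < p} x^{zp^{e-1}} = (x^{p^e} - 1)/(x^{p^{e-1}} - 1) = \Phi_{p^e}$, used twice in the proof above, is easy to confirm with integer polynomial arithmetic, here for $p = 3$, $e = 2$ (a throwaway check; the coefficient-list encoding is ours):

```python
def pmul_int(a, b):
    # schoolbook product of two polynomials given as coefficient lists
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c

p, e = 3, 2
phi = [0] * ((p - 1) * p ** (e - 1) + 1)
for z in range(p):
    phi[z * p ** (e - 1)] = 1                     # Phi_9 = 1 + x^3 + x^6
lower = [-1] + [0] * (p ** (e - 1) - 1) + [1]     # x^{p^{e-1}} - 1
full  = [-1] + [0] * (p ** e - 1) + [1]           # x^{p^e} - 1
# Phi_{p^e} * (x^{p^{e-1}} - 1) == x^{p^e} - 1
assert pmul_int(phi, lower) == full
```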
Let $\zeta_\ell = \zeta^{p^{e-\ell}}$ be a primitive $p^\ell$th root of unity for $0 \le \ell < e$. Since $e - \ell \ge 1$ and $(\zeta_\ell)^{p^{e-1}} = \zeta^{p^{e-\ell+e-1}} = 1$, a simple computation gives

$$\psi_0(\zeta_\ell) = \sum_{a \in \pi_p(K)} \sum_{0 \le i < n_1} \bigl(\zeta_\ell^{p^{e-1}}\bigr)^{aq^i} + 1 = n_1 k_1 + 1 = \varphi(p) + 1 = p \ne 0 \quad\text{in } \mathbb{F}_q,$$

because $p$ is coprime to $q$. Thus, $\gcd(\psi_0, \Phi_{p^\ell}) = 1$ for $0 \le \ell < e$, and for all $b \in \pi_{p^\ell}(K)$ we have

$$\gcd\bigl(\psi_0, \psi_{1,b}, \ldots, \psi_{e-1,b}, x^{p^e} - 1\bigr) = \Phi_{p^e} \quad\text{in } \mathbb{F}_q[x]. \tag{4.17}$$
Since $q^{in_\ell} \in \pi_{p^\ell}(K)$, every $a \in \pi_{p^{\ell+1}}(K)$ lies in $I_{\ell,b,i}$ for a suitable $b \in \pi_{p^\ell}(K)$; for each fixed $i$, the sets $I_{\ell,b,i}$ with $b \in \pi_{p^\ell}(K)$ partition $\pi_{p^{\ell+1}}(K)$:

$$\pi_{p^{\ell+1}}(K) = \bigcup_{b \in \pi_{p^\ell}(K)} I_{\ell,b,i} \quad\text{for each } 0 \le i < n_{\ell+1}/n_\ell.$$

A direct consequence is that for $0 < \ell < e$ the sum $\psi_\ell = \sum_{b \in \pi_{p^\ell}(K)} \psi_{\ell,b}$, that is,

$$\psi_\ell = \sum_{0 \le i < n_{\ell+1}/n_\ell} \sum_{a \in \pi_{p^{\ell+1}}(K)} \sum_{0 \le s < \ell+1} \bigl(x^{p^{e-(\ell+1)}}\bigr)^{ap^s q^{in_\ell}} - p \sum_{b \in \pi_{p^\ell}(K)} \sum_{0 \le s < \ell} \bigl(x^{p^{e-\ell}}\bigr)^{bp^s}, \tag{4.18}$$

is divisible by $\Phi_{p^e}$.

Remark 4.19. Successively applying (4.12) and (4.18), respectively, we can transform the equation given in Lemma 4.5 into

$$\Bigl(\sum_{a \in K} \sum_{0 \le s < e} x^{ap^s q^i}\Bigr) \cdot \Bigl(\sum_{b \in K} \sum_{0 \le s' < e} x^{bp^{s'}}\Bigr) \equiv \sum_{0 \le h < n} C_{e,h} \Bigl(\sum_{a \in K} \sum_{0 \le s < e} x^{ap^s}\Bigr)^{q^h} \bmod \Phi_{p^e},$$

where $C_{e,h}$ depends on the given $0 \le i < n$. This is indeed a way to compute a suitable preimage of $\phi(A) \cdot \phi(B)$ in $R = \mathbb{F}_q[x]/(\Phi_{p^e})$. We observe that the final formula rests on a basis of $R$ which supports the back-transformation into a linear combination of the conjugates of a normal Gauß period.

Lemma 4.20. Let $\xi = (x \bmod \Phi_{p^e})$ and $R = \mathbb{F}_q[x]/(\Phi_{p^e})$. If $\mathbb{Z}_{p^e}^\times = \langle q, K\rangle$ for a subgroup $K$ of $\mathbb{Z}_{p^e}^\times$, then

$$B = \Bigl\{\sum_{0 \le s < e} \xi^{ap^s}\colon a \in \mathbb{Z}_{p^e}^\times\Bigr\}$$

is a basis of $R$.

Proof. The set $B^\circ = \{1, \xi, \ldots, \xi^{\varphi(p^e)-1}\}$ is a basis of $R$. Since $B$ has at most $\#B^\circ = \varphi(p^e)$ elements, it is sufficient to prove that every element of $B^\circ$ lies in the $\mathbb{F}_q$-span of $B$.
By construction, we have $\sum_{0 \le s < e} \xi^{ap^s} \in B$ for $a \in \mathbb{Z}_{p^e}^\times$. By induction on $\ell$, we find with Proposition 4.13 that for $0 < \ell < e$ the element $\sum_{0 \le s < \ell} (\xi^{p^{e-\ell}})^{ap^s}$ lies in the span of $B$ for $a \in \mathbb{Z}_{p^\ell}^\times$, since $\psi_{\ell,b}(\xi) = 0$. Furthermore, $-1$ lies in the span of $B$, since $\psi_0(\xi) = 0$. Now let $1 \le a < \varphi(p^e)$. Then there exist uniquely determined $0 < \ell \le e$ and $c \in \mathbb{Z}_{p^\ell}^\times$ such that $a \equiv p^{e-\ell} c \bmod p^e$, and

$$\sum_{0 \le s < \ell} \bigl(\xi^{p^{e-\ell}}\bigr)^{cp^s} = \xi^{cp^{e-\ell}} + \sum_{1 \le s < \ell} \bigl(\xi^{p^{e-\ell}}\bigr)^{cp^s} = \xi^{cp^{e-\ell}} + \sum_{0 \le s < \ell-1} \bigl(\xi^{p^{e-(\ell-1)}}\bigr)^{cp^s}.$$

But both $\sum_{0 \le s < \ell} (\xi^{p^{e-\ell}})^{cp^s}$ and $\sum_{0 \le s < \ell-1} (\xi^{p^{e-\ell+1}})^{cp^s}$ lie in the span of $B$. Hence, $\xi^a = \xi^{cp^{e-\ell}}$ lies in the span of $B$ for all $0 \le a < \varphi(p^e)$, and the claim follows. $\square$

Now we translate this result into the language of traces that has motivated the choice of the $\psi_{\ell,b}$. Let $\mathrm{Tr}_{q^{n_\ell}/q^{n_{\ell-1}}}$ be the trace map of $\mathbb{F}_{q^{n_\ell}}$ into $\mathbb{F}_{q^{n_{\ell-1}}}$ for $0 < \ell \le e$; here $n_0 = 1$ by definition. We have

$$\mathrm{Tr}_{q^{n_\ell}/q^{n_{\ell-1}}}(\beta_\ell) = \sum_{0 \le i < n_\ell/n_{\ell-1}} \beta_\ell^{q^{in_{\ell-1}}}.$$
Since $\zeta$ is a root of $\Phi_{p^e}$, we can apply (4.18) to $\beta_\ell = \sum_{a \in \pi_{p^\ell}(K)} \sum_{0 \le s < \ell} (\zeta^{p^{e-\ell}})^{ap^s}$. Then

$$\mathrm{Tr}_{q^{n_{\ell+1}}/q^{n_\ell}}(\beta_{\ell+1}) = p\beta_\ell \quad\text{for all } 1 \le \ell < e. \tag{4.21}$$

For $\psi_0$, we simply have $\psi_0(\zeta) = 0$ and

$$\mathrm{Tr}_{q^{n_1}/q}(\beta_1) = -1. \tag{4.22}$$

The trace map is transitive, so that $\mathrm{Tr}_{q^n/q^{n_\ell}}(\beta) = \mathrm{Tr}_{q^{n_{\ell+1}}/q^{n_\ell}}(\mathrm{Tr}_{q^n/q^{n_{\ell+1}}}(\beta))$. We use this to prove Lemma 4.11 by induction on $0 \le \ell \le e$. The case $\ell = 0$ is also called the absolute trace.

Proof of Lemma 4.11. For $\ell = e$, we have $\mathrm{Tr}_{q^n/q^{n_e}}(\beta) = \beta = p^{e-e}\beta_e$ since $n = n_e$. Now we suppose that the claim is true for $\ell + 1$ with $0 < \ell < e$. Then

$$\mathrm{Tr}_{q^n/q^{n_\ell}}(\beta) = \mathrm{Tr}_{q^{n_{\ell+1}}/q^{n_\ell}}\bigl(p^{e-(\ell+1)} \beta_{\ell+1}\bigr) = p^{e-(\ell+1)} \,\mathrm{Tr}_{q^{n_{\ell+1}}/q^{n_\ell}}(\beta_{\ell+1}) \overset{(4.21)}{=} p^{e-(\ell+1)} \cdot p\beta_\ell = p^{e-\ell}\beta_\ell.$$

For $\ell = 0$ we get $\mathrm{Tr}_{q^n/q}(\beta) = p^{e-1} \,\mathrm{Tr}_{q^{n_1}/q}(\beta_1) \overset{(4.22)}{=} -p^{e-1}$ in the same way. $\square$

We finally rewrite Remark 4.19, inserting the root $\zeta$ of $\Phi_{p^e}$.

Remark 4.23. The primitive $p^e$th root of unity $\zeta$ is a zero of $\Phi_{p^e}$, and we have

$$\beta^{q^i} \cdot \beta = \sum_{0 \le h < n} C_{e,h} \beta^{q^h}$$

for all $0 \le i < n$. The $C_{e,h}$ depend on the given $0 \le i < n$. They are elements of the prime subfield $F$ of $\mathbb{F}_q$ because $C_{p^{e-\ell} q^h}^{(i)} \in F$ by Lemma 4.5 and all manipulations on the coefficients are done in $F$. Thus, the multiplication matrix $T_N$ has entries in $F$.

4.1.3. The complete algorithm

We have presented all parts of the algorithm, and now summarize the complete multiplication routine.

Algorithm 4.24. The prime power case.
Input: A normal prime power Gauß period $\beta$ of type $(n, K)$ over $\mathbb{F}_q$ with $K$ a subgroup of $\mathbb{Z}_{p^e}^\times$ of order $k$, and two elements $A = \sum_{0 \le i < n} A_i \beta^{q^i}$ and $B = \sum_{0 \le i < n} B_i \beta^{q^i}$ of $\mathbb{F}_{q^n}$ with coefficients $A_i, B_i \in \mathbb{F}_q$ for $0 \le i < n$.
Output: The product $C = \sum_{0 \le i < n} C_i \beta^{q^i}$ of $A$ and $B$ with coefficients $C_i \in \mathbb{F}_q$ for $0 \le i < n$.

Transformation from $\mathbb{F}_{q^n}$ into $\mathbb{F}_q[x]/(x^{p^e} - 1)$:
1. $A'_j \leftarrow 0$ and $B'_j \leftarrow 0$ for all $0 < j < p^e$.
2. For all $0 \le i < n$ and $a \in K$ do: set $j = aq^i \;\mathrm{rem}\; p^e$ and $A'_j \leftarrow A_i$ and $B'_j \leftarrow B_i$.
3. For $0 < \ell < e$ and all $i \in \mathbb{Z}_{p^{e-(\ell-1)}}^\times$ do
4.  set $j = i \cdot p^\ell \;\mathrm{rem}\; p^e$ and $A'_j \leftarrow A'_j + A'_i$, $B'_j \leftarrow B'_j + B'_i$.
5. Set $A' = \sum_{1 \le j < p^e} A'_j x^j$ and $B' = \sum_{1 \le j < p^e} B'_j x^j$.

Multiplication in $\mathbb{F}_q[x]/(x^{p^e} - 1)$:
6. Compute $C' = \sum_{2 \le j < 2p^e - 1} C'_j x^j \leftarrow A' \cdot B'$ with (fast) polynomial multiplication in $\mathbb{F}_q[x]$.
7. Reduce $C'$ modulo $x^{p^e} - 1$: for $2 \le j < p^e - 1$ do $C'_j \leftarrow C'_j + C'_{j+p^e}$. Set $C'_0 = C'_{p^e}$, $C'_1 = C'_{p^e+1}$, and $C' = \sum_{0 \le j < p^e} C'_j x^j$.

Write the product as a sum of Gauß periods in $\mathbb{F}_q[x]/(x^{p^e} - 1)$:
8. Set $C_0 \leftarrow C'_0$.
9. For all $0 < \ell \le e$ and $0 \le h < n_\ell$ do $D_{\ell,h}^{(\ell)} \leftarrow 0$; for $0 \le h < n$ do $C_{e,h} \leftarrow C'_{q^h \,\mathrm{rem}\, p^e}$.
10. For $\ell$ from $e - 1$ down to $1$ do steps 11–14:
11.  For $0 \le h < n_\ell$ do
12.   $D_{\ell,h}^{(\ell)} \leftarrow \frac{k_{\ell+1}}{k_\ell} \sum_{0 \le j < n_{\ell+1}/n_\ell} \bigl(D_{\ell+1,\, h+jn_\ell}^{(\ell+1)} + C_{\ell+1,\, h+jn_\ell}\bigr)$.
13.  For $0 \le h < n_\ell$ do
14.   $C_{\ell,h} \leftarrow C'_{p^{e-\ell} q^h \,\mathrm{rem}\, p^e} - D_{\ell,h}^{(\ell)}$.
15. Set $C' = C_0 + \sum_{0 < \ell \le e} \sum_{0 \le h < n_\ell} C_{\ell,h} \bigl(\sum_{a \in \pi_{p^\ell}(K)} \sum_{0 \le s < \ell} (x^{p^{e-\ell}})^{ap^s}\bigr)^{q^h} \bmod (x^{p^e} - 1)$.

Reduction modulo $\Phi_{p^e} \in \mathbb{F}_q[x]$, applying the trace map:
16. For $0 \le h < n_1$ do $C_{1,h} \leftarrow C_{1,h} - C_0$.
17. For $1 \le \ell < e$ and $0 \le h < n_\ell$ do
18.  for $0 \le i < n_{\ell+1}/n_\ell$ do $C_{\ell+1,\, h+in_\ell} \leftarrow C_{\ell+1,\, h+in_\ell} + p^{-1} C_{\ell,h}$.

Back transformation from $R = \mathbb{F}_q[x]/(\Phi_{p^e})$ into $\mathbb{F}_{q^n}$:
19. For $0 \le h < n$ do set $C_h \leftarrow C_{e,h}$.
20. Return $C = \sum_{0 \le h < n} C_h \beta^{q^h}$.
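Steps 6–7 of Algorithm 4.24 amount to one multiplication modulo $x^{p^e} - 1$, that is, a cyclic convolution of the coefficient vectors. A naive $O(m^2)$ stand-in for the fast multiplication (the paper uses an $M(m)$-time routine instead; this sketch is only illustrative), checked against Example 4.2 with $m = 9$ over $\mathbb{F}_2$:

```python
def cyclic_mul(a, b, p):
    # coefficient vectors of length m, multiplied mod (x^m - 1) over F_p;
    # schoolbook version of steps 6-7 of Algorithm 4.24
    m = len(a)
    assert len(b) == m
    c = [0] * m
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                c[(i + j) % m] = (c[(i + j) % m] + ai * bj) % p
    return c

# Example 4.2: beta^4 * beta in F_2[x]/(x^9 - 1) equals zeta + zeta^8
beta4 = [0, 0, 0, 1, 1, 1, 1, 0, 0]   # zeta^3 + zeta^4 + zeta^5 + zeta^6
beta  = [0, 1, 0, 1, 0, 0, 1, 0, 1]   # zeta + zeta^3 + zeta^6 + zeta^8
assert cyclic_mul(beta4, beta, 2) == [0, 1, 0, 0, 0, 0, 0, 0, 1]
```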
Lemma 4.25. Algorithm 4.24 works as specified.

Proof. The computation of the transformation in steps 1–5 follows the definition of Gauß periods. The multiplication in steps 6–7 in $\mathbb{F}_q[x]/(x^{p^e} - 1)$ generates a preimage of the product $A \cdot B$. To compute the reduction modulo $\Phi_{p^e}$, we apply the reordering of the summands according to Proposition 4.10 in steps 8–15. Notice that we compute only the $D_{\ell,h}^{(\ell)}$ for $1 \le \ell < e$, according to Lemma 4.9(iii). These are sufficient to get all coefficients of Lemma 4.5, see (4.8). The reduction in steps 16–18 is done according to (4.12) and (4.18), respectively. Thus, we get the preimage of $A \cdot B$ in the ring $R = \mathbb{F}_q[x]/(\Phi_{p^e})$ under the isomorphism as stated in Remark 4.19. The final back transformation (steps 19–20) uses the fact that $C$ is a linear combination of the conjugates of $\beta$ as claimed in Remark 4.23. $\square$

It remains to count the number of operations in $\mathbb{F}_q$. Here $M(n)$ denotes a multiplication time, so that two polynomials in $\mathbb{F}_q[x]$ of degree at most $n$ can be multiplied with $O(M(n))$ operations in $\mathbb{F}_q$. We may use $M(n) = n \log n \log\log n$ by Schönhage and Strassen [17] and Schönhage [16]; see also Cantor [4]. We recall that $n_\ell \le \varphi(p^\ell)$ for $1 \le \ell \le e$. Furthermore, the telescoping sum below is useful:

$$\sum_{1 \le \ell \le e} \varphi(p^\ell) = \sum_{1 \le \ell \le e} (p^\ell - p^{\ell-1}) = p^e + \sum_{1 \le \ell < e} p^\ell - \sum_{1 \le \ell < e} p^\ell - p^0 = p^e - 1.$$

We have the following estimates for each part of the algorithm. We emphasize the prime case $e = 1$ since some steps are omitted in this special situation.
• The transformation (steps 1–5) is calculated with 2 additions for each $i \in \mathbb{Z}_{p^{e-(\ell-1)}}^\times$, where $0 < \ell < e$. This results in a total of at most

$$2 \sum_{0 < \ell < e} \varphi(p^{e-(\ell-1)}) = 2 \sum_{2 \le \ell \le e} \varphi(p^\ell) = 2(p^e - 1 - \varphi(p)) = 2p^e - 2p$$

operations in $\mathbb{F}_q$. For the prime case $e = 1$ we have $2(p^e - p) = 0$ operations.
• Since both $A'$ and $B'$ have constant coefficient zero, the multiplication modulo $x^{p^e} - 1$ in steps 6–7 can be done with $M(p^e - 1) + (p^e - 3)$ operations. The second term counts the additions. If $e = 1$ then $p - 1 = \varphi(p) = nk$.
• The sorting of the summands in steps 8–15 is omitted in the prime case $e = 1$. Otherwise $e \ge 2$, and we may assume that $k_{\ell+1}/k_\ell$ is precomputed for all $0 < \ell < e$. Then the number of operations is bounded by

$$\sum_{1 \le \ell < e} \sum_{0 \le h < n_\ell} \Bigl(2 \cdot \frac{n_{\ell+1}}{n_\ell} + 1\Bigr) = 2 \sum_{1 \le \ell < e} n_{\ell+1} + \sum_{1 \le \ell < e} n_\ell \le 2 \sum_{2 \le \ell \le e} \varphi(p^\ell) + \sum_{1 \le \ell < e} \varphi(p^\ell) = 2(p^e - p) + p^{e-1} - 1.$$

• The trace is applied in steps 16–18. Step 16 is executed for all $e \ge 1$ with $n_1 \le p - 1$ operations. For $e = 1$, we have $n_1 = n$. If $e \ge 2$, the subsequent iterative computation of the trace map in steps 17–18 can be done with

$$\sum_{1 \le \ell < e} \sum_{0 \le h < n_\ell} \sum_{0 \le i < n_{\ell+1}/n_\ell} 2 = 2 \sum_{1 \le \ell < e} n_{\ell+1} \le 2 \sum_{2 \le \ell \le e} \varphi(p^\ell) = 2(p^e - p)$$

further operations if we suppose $p^{-1}$ to be precomputed.
• The back-transformation (steps 19–20) can be done without operations in $\mathbb{F}_q$.

We summarize this detailed cost analysis in the next theorem.

Theorem 4.26. Let $q$ be a prime power coprime to a prime $p$, and $e$ a positive integer such that there exists a normal Gauß period $\beta$ of type $(n, K)$ over $\mathbb{F}_q$, where $K$ is a subgroup of $\mathbb{Z}_{p^e}^\times$. In the normal basis representation with respect to $N = (\beta, \ldots, \beta^{q^{n-1}})$, two elements of $\mathbb{F}_{q^n}$ can be multiplied with at most

$$M(p^e - 1) + 7p^e + p^{e-1} - 6p - 4 + n \le M(p^e) + 8p^e \in O(M(p^e))$$

operations in $\mathbb{F}_q$.

We remark that all divisions in the algorithm (steps 12 and 18) are performed in the prime subfield of $\mathbb{F}_{q^n}$. The only operations that are performed in $\mathbb{F}_q$ are additions, subtractions, and multiplications. The result of Gao et al. [7,9] for the prime case $kn = \varphi(p) = \varphi(p^e)$ is a corollary.

Corollary 4.27 ([9, Theorem 4.1]). Let $\mathbb{F}_{q^n}$ be given by a normal basis $N = (\beta, \ldots, \beta^{q^{n-1}})$, where $\beta$ is a prime Gauß period of type $(n, k)$ over $\mathbb{F}_q$. Then two elements of $\mathbb{F}_{q^n}$ given as linear combinations of the basis elements can be multiplied with at most $M(kn) + (k+1)n - 3$ operations in $\mathbb{F}_q$.

5. Decomposable Gauß periods

The main work in our connection between polynomial arithmetic and Gauß periods is for a special case, namely decomposable Gauß periods, the topic of this section. The general case is dealt with later. Let $\beta$ be a normal Gauß period of type $(n, K)$ over $\mathbb{F}_q$ and $r = r_1 \cdots r_t$ the prime power decomposition as in (2.2), so that

$$\mathbb{Z}_r^\times \cong \mathbb{Z}_{r_1}^\times \times \cdots \times \mathbb{Z}_{r_t}^\times, \qquad K \subseteq \pi_{r_1}(K) \times \cdots \times \pi_{r_t}(K). \tag{5.1}$$

Sometimes, $K$ equals this direct sum of its projections.
Example 2.5 (continued). (iii) Recall the two subgroups K_1 = {1, 26} ≅ {1, 8} × {1} = π_9(K_1) × π_5(K_1) and K_2 = {1, 44} ⊂ {1, 19, 26, 44} ≅ {1, 8} × {1, 4} = π_9(K_2) × π_5(K_2) of Z×_45. Both generate normal Gauß periods in F_{2^12} over F_2. Thus K_1 is the direct sum of its projected images while K_2 is not.

Definition 5.2. Let r ≥ 2 be an integer with prime power decomposition r = r_1 ⋯ r_t, and let K be a subgroup of Z×_r.
(i) Let π_{r_i}: Z×_r → Z×_{r_i} for 1 ≤ i ≤ t be the canonical projection. The subgroup K is called decomposable if K ≅ π_{r_1}(K) × ⋯ × π_{r_t}(K).
(ii) A Gauß period of type (n, K) over F_q is decomposable if and only if K is decomposable.

Let R_1 be the squarefree part of r as in Definition 2.3. We call a Gauß period of type (n, K) over F_q with K ⊆ Z×_r squarefree if r = R_1. If K is decomposable, then we can factor the normal Gauß period α. For squarefree r, this (and also Proposition 5.4 below) is in Gao [6], Theorem 1.5.

Lemma 5.3. Let α be a decomposable normal Gauß period of type (n, K) over F_q given by ζ, let r = r_1 ⋯ r_t be the prime power decomposition, and for 1 ≤ i ≤ t let α_i be the Gauß period of type (n_i, π_{r_i}(K)) over F_q with respect to ζ_i = ζ^{r/r_i}, where n_i = φ(r_i)/#π_{r_i}(K). Then there exist h_1, …, h_t with 0 ≤ h_i < n_i for i ≤ t and such that

    α = Π_{1≤i≤t} α_i^{q^{h_i}}.
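Decomposability in the sense of Definition 5.2 is easy to test: K embeds into the direct product of its projections, so it is decomposable exactly when #K equals the product of the projection orders. A small sketch (the function name is ours), checked against Example 2.5(iii):

```python
from math import prod

def is_decomposable(K, prime_powers):
    """K: subgroup of (Z/r)^* given as a set of ints; prime_powers: [r1, ..., rt].
    K embeds into the product of its projections, so it is decomposable
    (Definition 5.2) iff #K equals the product of the projection orders."""
    projections = [{a % ri for a in K} for ri in prime_powers]
    return len(K) == prod(len(P) for P in projections)

# Example 2.5(iii): subgroups of (Z/45)^*, with 45 = 9 * 5
print(is_decomposable({1, 26}, [9, 5]))   # True:  projections {1,8} x {1}
print(is_decomposable({1, 44}, [9, 5]))   # False: projections {1,8} x {1,4}
```

For K_2 = {1, 44} the projections have 2 × 2 = 4 elements but #K_2 = 2, which is exactly the failure of decomposability noted in the example.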
Before we give the proof, we illustrate it by an example.

Example 2.5 (continued). (iii) Let ζ be a primitive 45th root of unity. The normal Gauß period α = ζ^{14} + ζ^{24} + ζ^4 + ζ^{39} of type (12, {1, 26}) with {1, 26} ⊂ Z×_45 is decomposable. The canonical projections along the prime power decomposition 45 = 3²·5 generate the prime Gauß period α_5 = ζ^9 of type (4, {1}) and the prime power Gauß period α_9 = ζ^5 + (ζ^5)^3 + (ζ^5)^8 + (ζ^5)^6 of type (3, {1, 8}) over F_2. Computing the product α_5 · α_9 = ζ^9 · (ζ^5 + ζ^{15} + ζ^{40} + ζ^{30}) = ζ^{14} + ζ^{24} + ζ^4 + ζ^{39} verifies that α_5 · α_9 is indeed a factorization of α.

Proof. We divide the proof into three steps. Since α is normal, we have ⟨q, K⟩ = Z×_r by Theorem 2.6.

Claim. A decomposable normal Gauß period can be written as a product of a squarefree Gauß period and a non-squarefree Gauß period.
Let R_1 be the squarefree part of r and R_2 = r/R_1, and set a_i ≡ a mod R_i for i = 1, 2. For a primitive rth root of unity ζ, ζ_i = ζ^{r/R_i} is a primitive R_ith root of unity for i = 1, 2. Hence ζ_1^a = ζ_1^{a_1} and ζ_2^a = ζ_2^{a_2}. Because K is decomposable, we have the direct sum K = π_{R_1}(K) × π_{R_2}(K). By a straightforward computation we have

    α = Σ_{a∈K} b(ζ^a)
      = Σ_{a∈K} b(ζ_1^{a_1}) · Π_{1≤i≤t, p_i | R_2} Π_{1≤s≤e_i} ζ^{a R_1 R_2 / p_i^s}
      = Σ_{(a_1,a_2) ∈ π_{R_1}(K) × π_{R_2}(K)} b(ζ_1^{a_1}) · Π_{1≤i≤t, p_i | R_2} Π_{1≤s≤e_i} (ζ^{r/R_2})^{a_2 R_2 / p_i^s}
      = Σ_{a_1 ∈ π_{R_1}(K)} b(ζ_1^{a_1}) · Σ_{a_2 ∈ π_{R_2}(K)} b(ζ_2^{a_2}).
The first factor is a squarefree Gauß period of type (φ(R_1)/#π_{R_1}(K), π_{R_1}(K)) over F_q with respect to ζ_1 = ζ^{r/R_1}, the second one is a non-squarefree Gauß period. This proves the claim.

Claim. A decomposable non-squarefree Gauß period which is not a prime power Gauß period can be written as a product of a non-squarefree Gauß period and a prime power Gauß period.

Let α be a non-squarefree Gauß period. Since it is not a prime power Gauß period, we have t ≥ 2. Set R = r_1 ⋯ r_{t−1} ≥ 2. Then r_t ≥ 2 is a prime power coprime to R. For a primitive rth root of unity ζ, ζ_1 = ζ^{r_t} = ζ^{r/R} is a primitive Rth root of unity, and ζ_2 = ζ^R = ζ^{r/r_t} is a primitive r_tth root of unity. Let a_1 ≡ a mod R and a_2 ≡ a mod r_t. Then

    α = Σ_{a∈K} b(ζ^a)
      = Σ_{a∈K} Π_{1≤i≤t} Π_{1≤s≤e_i} ζ^{a r / p_i^s}
      = Σ_{a∈K} Π_{1≤i≤t, p_i | R} Π_{1≤s≤e_i} (ζ^{r_t})^{a R / p_i^s} · Π_{1≤i≤t, p_i | r_t} Π_{1≤s≤e_i} (ζ^R)^{a r_t / p_i^s}
      = Σ_{(a_1,a_2) ∈ π_R(K) × π_{r_t}(K)} Π_{1≤i≤t, p_i | R} Π_{1≤s≤e_i} ζ_1^{a_1 R / p_i^s} · Π_{1≤i≤t, p_i | r_t} Π_{1≤s≤e_i} ζ_2^{a_2 r_t / p_i^s}
      = Σ_{a_1 ∈ π_R(K)} b(ζ_1^{a_1}) · Σ_{a_2 ∈ π_{r_t}(K)} b(ζ_2^{a_2})
with the first factor a non-squarefree Gauß period and the second one a prime power Gauß period. This shows the claim.

Claim. A squarefree Gauß period which is not a prime Gauß period can be written as a product of (conjugates of) a squarefree Gauß period and a prime Gauß period.

Let ζ be a primitive rth root of unity, and let R = r_1 ⋯ r_{t−1}, which is greater than 1 and coprime to r_t. Let ζ_1 = ζ^{r_t} be a primitive Rth root of unity and ζ_2 = ζ^R a primitive
r_tth root of unity, and u_1, u_2 ∈ Z such that u_1 r_t + u_2 R = 1; we can find these by the Extended Euclidean Algorithm. Let a_1 and a_2 be the projections of a onto Z×_R and Z×_{r_t}, respectively, and set n_1 = φ(R)/#π_R(K) and n_2 = φ(r_t)/#π_{r_t}(K). Since α is normal, we have ⟨q, π_R(K)⟩ = Z×_R and ⟨q, π_{r_t}(K)⟩ = Z×_{r_t}. Thus, there are 0 ≤ h_1 < n_1 and 0 ≤ h_2 < n_2 such that u_1 ∈ q^{h_1} π_R(K) and u_2 ∈ q^{h_2} π_{r_t}(K). The first factor is a squarefree Gauß period of type (n_1, π_R(K)) over F_q with respect to ζ^{r/R}, and the second factor is a prime Gauß period of type (n_2, π_{r_t}(K)) over F_q with respect to ζ^{r/r_t}. The claim is proven. Induction on the number t of prime divisors of r completes the proof of the lemma.

5.1. Fast multiplication for decomposable Gauß periods

If a normal Gauß period is decomposable then its factorization into prime and prime power Gauß periods is related to a tower of fields. Each Gauß period along this tower satisfies the assumptions of Fact 3.6, i.e. the extension degrees are pairwise coprime.

Proposition 5.4. Let r, q, n, k be positive integers such that q ≥ 2 and r ≥ 2 are coprime and φ(r) = nk. Let r_1 ⋯ r_t be the prime power decomposition of r. Let K be a subgroup of Z×_r of order k, let K_i = π_{r_i}(K) be its image of order k_i in Z×_{r_i} under the canonical projection π_{r_i}, and n_i = φ(r_i)/k_i for 1 ≤ i ≤ t. Then the following are equivalent:
(i) ⟨q, K⟩ = Z×_r and K is decomposable.
(ii) ⟨q, K_i⟩ = Z×_{r_i} for all 1 ≤ i ≤ t, and n = n_1 ⋯ n_t with n_1, …, n_t pairwise coprime.

Proof. (i) ⇒ (ii): The canonical projection π_{r_i} is an epimorphism. Thus, Z×_{r_i} = π_{r_i}(Z×_r) = π_{r_i}(⟨q, K⟩) = ⟨q, K_i⟩ for all 1 ≤ i ≤ t. Since K is decomposable, we have k = k_1 ⋯ k_t and n = φ(r)/k = Π_{1≤i≤t} φ(r_i)/k_i = Π_{1≤i≤t} n_i. We prove by induction on the number of prime divisors that n_1, …, n_t are pairwise coprime. For i = 1 there is nothing to show. Thus, we suppose that the claim is true for K' = K_1 × ⋯ × K_i, which is a decomposable subgroup of Z×_{r'} of order k', where r' = r_1 ⋯ r_i.
By construction we have ⟨q, K'⟩ = Z×_{r'} and n' = φ(r')/k' = n_1 ⋯ n_i. We suppose that d = gcd(n', n_{i+1}) > 1, i.e. n'·n_{i+1}/d < n_1 ⋯ n_{i+1}. Since q^{n_{i+1}} ∈ K_{i+1}, we have q^{n_{i+1}·n'/d} ∈ K_{i+1}. But also q^{n'·n_{i+1}/d} ∈ K', since q^{n'} ∈ K', and we conclude with the help of the Chinese Remainder Theorem that q^{n'·n_{i+1}/d} ∈ K' × K_{i+1}. Then #⟨q, K' × K_{i+1}⟩ ≤ (n'·n_{i+1}/d)·k'·k_{i+1} < (n'·k')·(n_{i+1}·k_{i+1}) = φ(r')·φ(r_{i+1}) = #(Z×_{r_1} × ⋯ × Z×_{r_{i+1}}), which is a contradiction. Hence, n' and n_{i+1} are coprime. The induction hypothesis guarantees that n_1, …, n_i are pairwise coprime, and the claim holds for n_1, …, n_{i+1}.
(ii) ⇒ (i): The group K can be regarded as a subgroup of K_1 × ⋯ × K_t; hence k is a divisor of k_1 ⋯ k_t. By assumption we have n = n_1 ⋯ n_t. Thus, k = φ(r)/n = Π_{1≤i≤t} φ(r_i)/n_i = k_1 ⋯ k_t, i.e. the subgroup K is decomposable. We always have ⟨q, K⟩ ⊆ Z×_r, and it remains to prove the other inclusion to show equality. Let a be an element of Z×_r and a_i = π_{r_i}(a) for all 1 ≤ i ≤ t. For 1 ≤ i ≤ t there are c_i ∈ K_i and 0 ≤ h_i < n_i such that a_i = q^{h_i} c_i ∈ ⟨q, K_i⟩ = Z×_{r_i}. But n_1, …, n_t are pairwise coprime,
and by the Chinese Remainder Theorem there exists 0 ≤ h < n with h ≡ h_i mod n_i for 1 ≤ i ≤ t. Since q^{n_i} ∈ K_i, we have q^h ≡ q^{h_i} c_i' mod r_i for suitable c_i' ∈ K_i, 1 ≤ i ≤ t. We set c = (c_1/c_1', …, c_t/c_t') ∈ K to get a ≡ q^h c mod r. Thus ⟨q, K⟩ ⊇ Z×_r and hence ⟨q, K⟩ = Z×_r, as claimed.

The factorization of a normal decomposable Gauß period offers a recursive approach to fast multiplication whenever F_{q^n} is represented by a normal basis N = (α, …, α^{q^{n−1}}).

Remark 5.5. Let n_1 and n_2 be two coprime integers, and set n = n_1·n_2. Let β_1 ∈ F_{q^{n_1}} and β_2 ∈ F_{q^{n_2}} be normal elements over F_q, and α = β_1·β_2 be a normal element in F_{q^n}.
(i) The element β_2 is normal in F_{q^n} over F_{q^{n_1}}.
(ii) Transforming an element given as a linear combination of the conjugates of α over F_q into a linear combination of the conjugates of β_2 over F_{q^{n_1}} can be computed without operations in F_q.

Proof. (i) This is just Lemma 3.9(ii).
(ii) Let A = Σ_{0≤h<n} A_h α^{q^h} be an element of F_{q^n}. Let h_i ≡ h mod n_i for i = 1, 2. Then α^{q^h} = β_1^{q^{h_1}} · β_2^{q^{h_2}}, and

    A = Σ_{0≤h<n_1 n_2} A_h (β_1^{q^{h_1}} · β_2^{q^{h_2}}) = Σ_{0≤h_2<n_2} ( Σ_{0≤h_1<n_1} A_{(h_1,h_2)} β_1^{q^{h_1}} ) β_2^{q^{h_2}},

where we identify h and (h_1, h_2) = (h mod n_1, h mod n_2). Since n_1 and n_2 are coprime, we have {n_1·a rem n_2 : 0 ≤ a < n_2} = {0, …, n_2 − 1} and

    A = Σ_{0≤h_2<n_2} ( Σ_{0≤h_1<n_1} A_{(h_1, n_1 h_2)} β_1^{q^{h_1}} ) β_2^{(q^{n_1})^{h_2}}.
This just means sorting the coefficients of A and can be done without operations in F_q.

5.1.1. A constructive proof
We are now ready to apply fast polynomial multiplication if F_{q^n} is represented by a normal basis N = (α, …, α^{q^{n−1}}) over F_q, where α is a decomposable Gauß period.

Theorem 5.6. Let α be a decomposable normal Gauß period of type (n, K) over F_q with K a subgroup of Z×_r, and let r_1 ⋯ r_t be the prime power decomposition of r. Then two elements of F_{q^n} given as linear combinations of the elements of the normal basis N = (α, …, α^{q^{n−1}}) can be multiplied with at most

    O( r · Σ_{1≤i≤t} log r_i · log log r_i )

operations in F_q.
Proof. We prove the claim by induction on the number t of prime divisors of r. If t = 1, the claim follows from Theorem 4.26. Now we suppose t ≥ 2. We can write α = Π_{1≤i≤t} α_i^{q^{h_i}} as a product of conjugates of normal prime and prime power Gauß periods α_i of type (n_i, π_{r_i}(K)) over F_q by Lemma 5.3. Set n' = n/n_t. The element α' = Π_{1≤i≤t−1} α_i^{q^{h_i}} is normal in F_{q^{n'}} over F_q. Since α is decomposable, Proposition 5.4 shows that n' and n_t are coprime. Then α_t is a normal prime or prime power Gauß period in F_{q^n} over F_{q^{n'}} by Remark 5.5(i). As claimed in Remark 5.5(ii), we can multiply two elements of F_{q^n} over F_q by multiplying them in F_{q^n} over F_{q^{n'}}. By Theorem 4.26, the multiplication can be done with at most O(M(r_t)) operations (additions, multiplications) in F_{q^{n'}}. Moreover, α' is a decomposable normal Gauß period of type (n', π_{r_1}(K) × ⋯ × π_{r_{t−1}}(K)) over F_q. By the induction hypothesis, multiplication in F_{q^{n'}} can be done with at most O(Σ_{1≤i≤t−1} M(r_i)) operations in F_q, and the claim follows.

Example 2.5 (continued). (iii) The decomposable Gauß period α = ζ^{14} + ζ^{24} + ζ^4 + ζ^{39} of type (12, {1, 26}) with {1, 26} ⊂ Z×_45 over F_2 is normal in F_{2^12}. We calculate the product α^{2^2} · α.
(i) As shown above, α factors into α = α_5 · α_9 with α_5 a prime Gauß period of type (4, {1}) over F_2, and α_9 a prime power Gauß period of type (3, {1, 8}) over F_2 where {1, 8} ⊂ Z×_9. We transform the task into a multiplication over F_8:

    α^4 · α = (α_5^4 · α_9^4) · (α_5 · α_9) = (α_5^4 · α_5) · (α_9^4 · α_9).

Now α_9^4 · α_9 = α_9^2 + α_9^4 as computed in Example 4.2.
(ii) It remains to perform the arithmetic in F_8 over F_2. Since α_5 is a prime Gauß period, we have

    α_5^4 · α_5 = (ζ^9)^4 · (ζ^9) = (ζ^9)^5 = 1 = α_5 + α_5^2 + α_5^4 + α_5^8.

(iii) Combining both results gives

    α^4 · α = (α_5 + α_5^2 + α_5^4 + α_5^8) · (α_9^2 + α_9^4)
            = α_5^{2^0} α_9^{2^1} + α_5^{2^1} α_9^{2^1} + α_5^{2^2} α_9^{2^1} + α_5^{2^3} α_9^{2^1}
              + α_5^{2^0} α_9^{2^2} + α_5^{2^1} α_9^{2^2} + α_5^{2^2} α_9^{2^2} + α_5^{2^3} α_9^{2^2}
            = α^{2^4} + α^{2^1} + α^{2^{10}} + α^{2^7} + α^{2^8} + α^{2^5} + α^{2^2} + α^{2^{11}},

since α^{2^h} = α_5^{2^{h_1}} · α_9^{2^{h_2}} = (α_5 · α_9)^{2^{9h_1 + 4h_2 mod 12}}.
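The coefficient re-sorting of Remark 5.5(ii), which drives the recursive multiplication of Theorem 5.6 and the example above, is a pure permutation of indices given by the Chinese Remainder Theorem. A minimal sketch (the function name is ours):

```python
def resort(coeffs, n1, n2):
    # grid[h2][h1] = A_{(h1, h2)} with (h1, h2) = (h mod n1, h mod n2);
    # a pure permutation whenever gcd(n1, n2) = 1
    n = n1 * n2
    assert len(coeffs) == n
    grid = [[None] * n1 for _ in range(n2)]
    for h, c in enumerate(coeffs):
        grid[h % n2][h % n1] = c
    return grid

# n = 12 = 4 * 3 as in Example 2.5(iii): every slot is filled exactly once,
# so no operations in F_q are needed -- the coefficients are only re-sorted
grid = resort(list(range(12)), 4, 3)
print(grid)
```

Each inner row collects the coefficients of one conjugate of β_2, exactly as in the double sum of Remark 5.5(ii).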
6. From general to decomposable Gauß periods

There is one step missing to derive Theorem 2.7 from Theorem 5.6: not every normal Gauß period is decomposable, as already illustrated in Example 2.5(iii). We now show that a normal Gauß period always entails a decomposable normal Gauß period with the same parameters. The proof of Theorem 6.3 is based on the following result of Gao [6], Theorem 1.1.
Fact 6.1. Let Z be an Abelian group of finite order. Let Q be a subset and K a subgroup of Z such that Z = ⟨Q, K⟩. Then, for any direct sum decomposition Z = Z_1 × ⋯ × Z_t, there exists a subgroup L of the form L = L_1 × ⋯ × L_t with L_i a subgroup of Z_i for 1 ≤ i ≤ t such that Z = ⟨Q, L⟩ and Z/L ≅ Z/K.

For our situation, we formulate the following special case.

Corollary 6.2. Let r and q be coprime positive integers greater than 2, and r_1 ⋯ r_t be the prime power factorization (2.2) of r. If there is a subgroup K of Z×_r with ⟨q, K⟩ = Z×_r, then there is a decomposable subgroup L of Z×_r of the same order #L = #K such that ⟨q, L⟩ = Z×_r.

Theorem 6.3. Let r, q, n, k be positive integers with r, q ≥ 2 such that r and q are coprime and φ(r) = nk. Then there is a normal Gauß period of type (n, K) over F_q with K a subgroup of Z×_r of order k if and only if such a period exists with decomposable K.

Proof. This follows from Corollary 6.2 and the normal Gauß period Theorem 2.6.

We merge Theorem 6.3 with Theorem 5.6, and apply fast polynomial multiplication to prove Theorem 2.7.

Proof of Theorem 2.7. Let α be a general Gauß period of type (n, K) over F_q generating a normal basis of F_{q^n}. By Theorem 6.3 there is a normal decomposable Gauß period β of type (n, L) in F_{q^n} with #L = #K. Thus, we can write an element of F_{q^n} as a linear combination of the elements of the normal basis N = (β, …, β^{q^{n−1}}) over F_q. In this case Theorem 5.6 states that we can apply fast polynomial multiplication to compute the product of two elements of F_{q^n}. Inserting M(r_i) = O(r_i log r_i · log log r_i) for 1 ≤ i ≤ t proves the claimed bound on the number of operations in F_q. In the final estimate of the theorem, one can replace the factor log(nk) by the entropy of (r_1, …, r_t).

7. Existence of normal Gauß periods

7.1. A criterion for the existence of a normal Gauß period
Given a prime power q and an integer n, how can we find normal Gauß periods in F_{q^n} over F_q? We start with two previous results.
Fact 7.1 ([6, Theorem 1.4]). Let p be a prime, n and e be positive integers, and set q = p^e. There exist a positive integer r and a subgroup K ⊆ Z×_r such that the Gauß period of type (n, K) over F_q is normal in F_{q^n} if and only if the following hold: gcd(e, n) = 1, and 8 ∤ n in the case p = 2.
Fact 7.2 ([9, Theorem 3.1]). Let r = p^e be a prime power not divisible by 8, and let q be an integer greater than 1 and coprime to r. Let n be a positive divisor of φ(r), and K the uniquely determined subgroup of Z×_r of order k = φ(r)/n. Then ⟨q, K⟩ = Z×_r if and only if gcd(φ(r)/N, n) = 1, where N = ord_r(q) is the order of q in Z×_r.

For the non-cyclic group Z×_{2^e} with e ≥ 3 this criterion is no longer true.

Example 7.3. For r = 8 and K = {1, 7}, we have ⟨3, K⟩ = {1, 3, 5, 7} = Z×_8 and n = φ(8)/#K = 4/2 = 2. Furthermore, N = ord_8(3) = 2, so that φ(8)/N = 2, and gcd(φ(8)/N, φ(8)/#K) = gcd(2, 2) = 2 ≠ 1.

For n = 1 and k = #Z×_{2^e}, we can always choose the trivial subgroup K = Z×_{2^e} to get ⟨q, K⟩ = Z×_{2^e}. For n ≥ 2 we recall that Z×_{2^e} is the direct product of the two cyclic groups {±1} = ⟨−1 mod 2^e⟩ and ⟨5 mod 2^e⟩ = {(4i + 1) mod 2^e : 0 ≤ i < 2^{e−2}}. We start with the assumption that the subgroup ⟨q⟩ generated by q has the maximal possible order N = ord_{2^e}(q) = 2^{e−2}.
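For prime powers r not divisible by 8, the criterion of Fact 7.2 is directly computable. A hedged sketch (the helper names are ours; the assertion guards the excluded case 8 | r illustrated by Example 7.3):

```python
from math import gcd

def euler_phi(r):
    return sum(1 for a in range(1, r) if gcd(a, r) == 1)

def mult_order(q, r):
    n, x = 1, q % r
    while x != 1:
        x = x * q % r
        n += 1
    return n

def normal_period_exists(q, r, n):
    # Fact 7.2: for a prime power r with 8 not dividing r, there is a subgroup K
    # of order phi(r)/n with <q, K> = (Z/r)^* iff gcd(phi(r)/N, n) = 1
    assert r % 8 != 0 and gcd(q, r) == 1 and euler_phi(r) % n == 0
    N = mult_order(q, r)
    return gcd(euler_phi(r) // N, n) == 1

print(normal_period_exists(2, 9, 3), normal_period_exists(7, 9, 2))  # True False
```

The first call matches the prime power Gauß period of type (3, {1, 8}) over F_2 from Example 2.5; for q = 7 and n = 2 we get gcd(φ(9)/3, 2) = 2 ≠ 1, so no suitable subgroup exists.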
Proposition 7.4. Let r ≥ 16 be a power of 2, say r = 2^e, and let q ≥ 3 be odd. If N = ord_r(q) = 2^{e−2} and n ≥ 2 is a divisor of N, then K = {±1}·⟨5^n mod 2^e⟩ is a subgroup of Z×_r of order k = φ(r)/n such that ⟨q, K⟩ = Z×_r.

Proof. For r = 2^e and e ≥ 4, the subgroup K of Z×_r has order 2·2^{e−2}/n = 2^{e−1}/n = φ(r)/n. We have #⟨q⟩ = N = 2^{e−2} by assumption, and ⟨q⟩ ∩ {±1} = {1}, because the only element of order 2 in the cyclic group ⟨q⟩ is a power of 5 and −1 ∉ ⟨5 mod 2^e⟩. By construction, −1 ∈ K; hence ⟨q⟩ ∪ (−1)·⟨q⟩ is a subset of ⟨q, K⟩ of order 2·2^{e−2}. We conclude that #⟨q, K⟩ = φ(r), and ⟨q, K⟩ = Z×_r, as claimed.

Lemma 7.5. Let e ≥ 4 be an integer, let q be an odd prime power and K be a subgroup of order k of Z×_{2^e} such that ⟨q, K⟩ = Z×_{2^e}, and n = φ(2^e)/k. If n ≥ 4, then q has maximal order N = ord_{2^e}(q) = 2^{e−2}.

Proof. Since n divides N, we have N ≥ 4. Furthermore, the subgroup K has order #K = φ(2^e)/n ≤ 2^{e−1}/4 = 2^{e−3}. Let ¯ : Z×_{2^e} → ⟨5 mod 2^e⟩ be the canonical projection with a ∈ ā·{±1}. Then ⟨q̄⟩ is a cyclic subgroup of ⟨5 mod 2^e⟩. The projection is an epimorphism; hence ⟨q̄, K̄⟩ = ⟨5 mod 2^e⟩. But #K̄ ≤ #K ≤ 2^{e−3} = 2^{e−2}/2, and since ⟨5 mod 2^e⟩ is cyclic, its subgroups form a chain; the subgroup ⟨q̄⟩ must therefore be the one of maximal order 2^{e−2}. We conclude that ⟨q̄⟩ = ⟨5 mod 2^e⟩, and N = ord_{2^e}(q) ≥ #⟨5 mod 2^e⟩ = 2^{e−2}. But a cyclic subgroup of Z×_{2^e} has order at most 2^{e−2}, and thus N = 2^{e−2}.

For e = 3, we always have N = 2, and there is a subgroup K ⊆ Z×_8 of order 2 with ⟨q, K⟩ = Z×_8; for given q ≥ 3 we can choose K = ⟨a⟩ with a ∈ Z×_8 \ {1, q mod 8}.

The only case left is n = 2 and 2 ≤ N < 2^{e−2} for e ≥ 4. Here two different cases of q are important: since q is an odd prime power, either q ≡ 1 mod 4 or q ≡ 3 mod 4, and these two cases have different projections of ⟨q⟩ onto {±1}. We
consider the canonical projection π : Z×_{2^e} → Z×_4. Then ker π = ⟨5 mod 2^e⟩, and applying the fundamental theorem on group homomorphisms we obtain a bijection between {±1} ≅ Z×_{2^e}/ker π and Z×_4. Thus, the image of ⟨q⟩ modulo ⟨5 mod 2^e⟩ is {±1} if q ≡ 3 mod 4 and {1} if q ≡ 1 mod 4.
Lemma 7.6. Let e ≥ 4 be an integer, r = 2^e, and let q ≥ 3 be an odd integer with 2 ≤ N = ord_r(q) < 2^{e−2}. Then there is a subgroup K ⊆ Z×_{2^e} of order k = φ(r)/2 = 2^{e−2} such that ⟨q, K⟩ = Z×_r if and only if q ≡ 3 mod 4.

Proof. For q ≡ 3 mod 4, the projection of q modulo ⟨5 mod 2^e⟩ is −1. Choosing the subgroup K = ⟨5 mod 2^e⟩ of order k = 2^{e−2} therefore gives ⟨q, K⟩ = Z×_{2^e}.
For q ≡ 1 mod 4, we have q ∈ ⟨5 mod 2^e⟩, and since N ≤ 2^{e−3}, even ⟨q⟩ ⊆ ⟨5² mod 2^e⟩. Since e ≥ 4, there are three subgroups of Z×_{2^e} of order k = 2^{e−2} ≥ 4: K_1 = ⟨5 mod 2^e⟩, K_2 = ⟨−5 mod 2^e⟩, and K_3 = {±1}·⟨5² mod 2^e⟩. All three contain ⟨5² mod 2^e⟩ and hence q: we have 5² ∈ ⟨5 mod 2^e⟩, 5² = (−5)² ∈ ⟨−5 mod 2^e⟩, and 1·5² ∈ {±1}·⟨5² mod 2^e⟩. Hence ⟨q, K_i⟩ = K_i ≠ Z×_{2^e} for 1 ≤ i ≤ 3. Thus, there is no suitable subgroup in the case q ≡ 1 mod 4.

We collect the findings above to get the following criterion on the existence of a suitable subgroup K of Z×_{2^e}.

Lemma 7.7. Let r ≥ 8 be a power of two. Let q > 1 be an odd integer, and n a divisor of N = ord_r(q). Set k = φ(r)/n. Then the following are equivalent:
(i) There is a subgroup K ⊆ Z×_r of order k with ⟨q, K⟩ = Z×_r.
(ii) One of the following criteria holds:
• n = 1, or
• n = 2 and q ≡ 3 mod 4, or
• N = r/4.

Proof. We write r = 2^e with e ≥ 3. If one of the criteria in (ii) is satisfied, then either n = 1 and K = Z×_{2^e}, or Proposition 7.4 or Lemma 7.6, respectively, guarantees the existence of a subgroup K of order k with ⟨q, K⟩ = Z×_{2^e} for e ≥ 4. There are two more cases to consider. For e = 3 and n = 2 we have N = ord_8(q) = 2. Then we can choose K = {1, 3} if q ≡ 1 mod 4 and K = {1, 5} if q ≡ 3 mod 4. Thus, it remains to prove that in the case n = 2, q ≡ 1 mod 4 and N < 2^{e−2} there is no suitable subgroup. We have q ∈ ⟨5 mod 2^e⟩, and thus ⟨q⟩ ⊆ ⟨5² mod 2^e⟩. But 5² mod 2^e is an element of all three subgroups of order k = 2^{e−2} of Z×_{2^e}: we have 5² ∈ ⟨5 mod 2^e⟩, 5² = (−5)² ∈ ⟨−5 mod 2^e⟩, and 1·5² ∈ {±1}·⟨5² mod 2^e⟩. Since we have discussed all possible cases, equivalence holds.

We now have the following criterion for the existence of a normal Gauß period. For squarefree r, this follows from Theorem 1.5 in [6].
Table 2
Percentage of field extensions F_{q^n} over F_q with 2 ≤ n < 10000 for which there is a normal basis given by a squarefree Gauß period of type (n, K) over F_q

Existence of normal bases generated by a squarefree Gauß period with given parameter k ≥ 1

  k \ q          2       3       5       7       11      13      17      19
  k = 1          4.70    4.76    4.92    4.65    4.43    4.57    4.50    4.72
  k ≤ 2          25.22   25.78   24.60   23.21   23.77   22.67   25.18   22.75
  k ≤ log2 n     75.90   86.23   86.11   85.18   85.24   84.51   86.31   83.84
  k ≤ √n         87.24   99.65   99.68   99.63   99.66   99.57   99.57   99.50
  k < ∞          87.50   100.00  100.00  100.00  100.00  100.00  100.00  99.98

The rows show the distribution if the value of k = #K is restricted. We limited our experiments for r with φ(r) = nk to 2 ≤ r < 1 000 000.
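Prime power by prime power, the criteria of Fact 7.2 and Lemma 7.7 combine into the existence test stated in Theorem 7.8 below. A brute-force sketch of that test (all helper names are ours; the search simply tries all pairwise coprime factorizations of n):

```python
from math import gcd

def euler_phi(r):
    return sum(1 for a in range(1, r) if gcd(a, r) == 1)

def mult_order(q, r):
    n, x = 1, q % r
    while x != 1:
        x = x * q % r
        n += 1
    return n

def local_ok(q, ri, ni):
    # local criterion at the prime power ri, following Fact 7.2 / Lemma 7.7
    if euler_phi(ri) % ni:
        return False
    Ni = mult_order(q, ri)
    if ri % 8:
        return gcd(euler_phi(ri) // Ni, ni) == 1
    return Ni % ni == 0 and (ni == 1 or (ni == 2 and q % 4 == 3) or Ni == ri // 4)

def normal_gauss_period_exists(q, n, prime_powers):
    # search for pairwise coprime n_1, ..., n_t with n_1 * ... * n_t = n;
    # 'used' is the product of the n_i chosen so far
    def search(i, rest, used):
        if i == len(prime_powers):
            return rest == 1
        return any(search(i + 1, rest // ni, used * ni)
                   for ni in range(1, rest + 1)
                   if rest % ni == 0 and gcd(ni, used) == 1
                   and local_ok(q, prime_powers[i], ni))
    return search(0, n, 1)

print(normal_gauss_period_exists(2, 12, [9, 5]))   # True, cf. Example 2.5(iii)
print(normal_gauss_period_exists(2, 8, [9, 5]))    # False
```

For q = 2, n = 12, r = 45 the search finds (n_1, n_2) = (3, 4), matching the type (12, {1, 26}) period of Example 2.5(iii); for n = 8 no coprime factorization passes the local tests.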
Theorem 7.8. Let q be a prime power and r and n be positive integers such that gcd(r, q) = 1 and n divides φ(r). Let k = φ(r)/n and r_1 ⋯ r_t be the prime power decomposition of r. Then the following properties are equivalent:
(i) There is a subgroup K of Z×_r of order k such that the Gauß period of type (n, K) over F_q is normal.
(ii) There are pairwise coprime positive integers n_1, …, n_t such that n = n_1 ⋯ n_t and, with N_i = ord_{r_i}(q) for 1 ≤ i ≤ t:
• gcd(φ(r_i)/N_i, n_i) = 1 if r_i is not divisible by 8, and
• n_i divides N_i and either n_i = 1, or n_i = 2 and q ≡ 3 mod 4, or N_i = r_i/4, if 8 divides r_i.

Proof. (i) ⇒ (ii): By Theorem 6.3 there is a decomposable Gauß period of type (n, L) over F_q with ⟨q, L⟩ = Z×_r. By Proposition 5.4 the n_i = φ(r_i)/#π_{r_i}(L) for 1 ≤ i ≤ t are pairwise coprime and n_1 ⋯ n_t = n. Furthermore, ⟨q, π_{r_i}(L)⟩ = Z×_{r_i}, and the criteria follow immediately with Fact 7.2 and Lemma 7.7.
(ii) ⇒ (i): By Fact 7.2 and Lemma 7.7, respectively, there is a subgroup L_i of order k_i = φ(r_i)/n_i such that ⟨q, L_i⟩ = Z×_{r_i} for all 1 ≤ i ≤ t. Obviously, L = L_1 × ⋯ × L_t meets the assumptions of Proposition 5.4. By the normal Gauß period Theorem 2.6, the criterion ⟨q, L⟩ = Z×_r is sufficient for the Gauß period of type (n, L) over F_q to be normal.

7.1.1. Experiments
Tables 1 and 2 present results about the smallest values of k that lead to normal Gauß periods. Table 1 illustrates the progress made by the various categories of Gauß periods, going from the most specialized category "prime" in the first row to the general periods in the fourth row. In each row we find the percentage of n having a normal Gauß period of its row category with a smaller value of k than any of the more specialized categories above it. The extension degree n goes from 2 to 10 000.
The second column says, for example, that for 26.19% of those n some squarefree Gauß period in F_{2^n} over F_2 has a smaller value of k than any prime Gauß period and that no general Gauß period improves on this k, and for 2.66% a general Gauß period provides a smaller
k than any of the specialized categories in the three rows above. Similarly, Table 2 shows the percentage of extensions with squarefree Gauß periods when the value of k is bounded in terms of n, again for 2 ≤ n ≤ 10 000. For both tables, the value of r was limited to 10^6.

Acknowledgements
We thank the anonymous referees for a large number of corrections and useful suggestions, and Victor Pan for his efforts in handling the paper.

References
[1] G.B. Agnew, R.C. Mullin, S.A. Vanstone, An implementation of elliptic curve cryptosystems over F_{2^155}, IEEE J. Selected Areas Comm. 11 (5) (1993) 804–813.
[2] D.W. Ash, I.F. Blake, S.A. Vanstone, Low complexity normal bases, Discrete Appl. Math. 25 (1989) 191–210.
[3] I.F. Blake, R.M. Roth, G. Seroussi, Efficient arithmetic in GF(2^n) through palindromic representation, Tech. Report HPL-98-134, Visual Computing Department, Hewlett Packard Laboratories, 1998. Available via www.hpl.hp.com/techreports/98/HPL-98-134.html.
[4] D.G. Cantor, On arithmetical algorithms over finite fields, J. Combin. Theory Ser. A 50 (1989) 285–300.
[5] S. Feisel, J. von zur Gathen, M.A. Shokrollahi, Normal bases via general Gauß periods, Math. Comput. 68 (225) (1999) 271–290. URL: http://www.ams.org/journal-getitem?pii=S0025-5718-99-00988-6.
[6] S. Gao, Abelian groups, Gauss periods, and normal bases, Finite Fields Their Appl. 7 (1) (2001) 148–164.
[7] S. Gao, J. von zur Gathen, D. Panario, Gauss periods and fast exponentiation in finite fields, in: Proc. LATIN'95, Valparaíso, Chile, Lecture Notes in Computer Science, Vol. 911, Springer, Berlin, 1995, pp. 311–322. ISSN 0302-9743. Final version in Mathematics of Computation and Journal of Symbolic Computation.
[8] S. Gao, J. von zur Gathen, D. Panario, Gauss periods: orders and cryptographical applications, Math. Comput. 67 (221) (1998) 343–352. With microfiche supplement.
[9] S. Gao, J. von zur Gathen, D. Panario, V. Shoup, Algorithms for exponentiation in finite fields, J. Symbolic Comput. 29 (6) (2000) 879–889. URL: http://www.idealibrary.com/links/doi/10.1006/jsco.1999.0309.
[10] C.F. Gauß, Disquisitiones Arithmeticae, Gerh. Fleischer Iun., Leipzig, 1801 (English translation by A.A. Clarke, Springer, New York, 1986).
[11] D. Hachenberger, Finite Fields: Normal Bases and Completely Free Elements, The Kluwer Internat. Series in Engineering and Computer Science, Kluwer Academic Publishers, Boston/Dordrecht/London, 1997.
[12] D. Jungnickel, Finite Fields: Structure and Arithmetics, BI Wissenschaftsverlag, Mannheim, 1993.
[13] R. Lidl, H. Niederreiter, Finite Fields, Encyclopedia of Mathematics and its Applications, Vol. 20, Addison-Wesley, Reading, MA, 1983.
[14] A.J. Menezes, I.F. Blake, X. Gao, R.C. Mullin, S.A. Vanstone, T. Yaghoobian, Applications of Finite Fields, Kluwer Academic Publishers, Norwell, MA, 1993.
[15] R.C. Mullin, I.M. Onyszchuk, S.A. Vanstone, R.M. Wilson, Optimal normal bases in GF(p^n), Discrete Appl. Math. 22 (1989) 149–161.
[16] A. Schönhage, Schnelle Multiplikation von Polynomen über Körpern der Charakteristik 2, Acta Inform. 7 (1977) 395–398.
[17] A. Schönhage, V. Strassen, Schnelle Multiplikation großer Zahlen, Computing 7 (1971) 281–292.
[18] A. Wassermann, Zur Arithmetik in endlichen Körpern, Bayreuther Math. Schriften 44 (1993) 147–251.
Theoretical Computer Science 315 (2004) 453 – 468
www.elsevier.com/locate/tcs
Split algorithms for skewsymmetric Toeplitz matrices with arbitrary rank profile

Georg Heinig a,∗, Karla Rost b

a Dept. of Math. and Comp. Sci., Kuwait University, P.O. Box 5969, Safat 13060, Kuwait
b Dept. of Mathematics, Chemnitz University of Technology, D-09107 Chemnitz, Germany
Abstract

Split Levinson-type and Schur-type algorithms for the solution of linear systems with a nonsingular skewsymmetric Toeplitz matrix are designed. In contrast to previous ones, the algorithms work for any nonsingular skewsymmetric Toeplitz matrix. Moreover, generalizations of ZW- and WZ-factorizations of skewsymmetric Toeplitz matrices related to the new split algorithms are presented.
© 2004 Elsevier B.V. All rights reserved.

MSC: 65F05; 15A06; 15A23; 15A09

Keywords: Toeplitz matrix; Skewsymmetric matrix; Split algorithm; Levinson algorithm; Schur algorithm; WZ-factorization
1. Introduction

This paper is dedicated to fast algorithms for nonsingular skewsymmetric Toeplitz matrices, i.e. matrices of the form T_N = [a_{i−j}]_{i,j=1}^N with a_{−j} = −a_j. We assume that the entries are from a field F of characteristic different from two. A general linear system T_N f = b with a nonsingular Toeplitz coefficient matrix can be solved "fast", with complexity O(N²), using Levinson-type or Schur-type algorithms. A problem is that the classical Levinson and Schur algorithms work only if the matrix T_N is strongly nonsingular, which means that all leading principal submatrices T_k = [a_{i−j}]_{i,j=1}^k are nonsingular for k = 1, …, N. This condition is never satisfied for a

∗ This work was supported by Research Grant SM05/02 of Kuwait University. Corresponding author. E-mail addresses: [email protected] (G. Heinig), [email protected] (K. Rost).
0304-3975/$ – see front matter © 2004 Elsevier B.V. All rights reserved. doi:10.1016/j.tcs.2004.01.003
G. Heinig, K. Rost / Theoretical Computer Science 315 (2004) 453 – 468
skewsymmetric Toeplitz matrix, since skewsymmetric matrices of odd order are always singular. The problem of fast solving skewsymmetric Toeplitz systems was addressed in our recent paper [14]. In that paper, fast algorithms were designed for skewsymmetric Toeplitz matrices which work under the condition that every leading principal submatrix of even order is nonsingular, which means the same as the nonsingularity of all central submatrices T_{N−2ℓ} = [a_{i−j}]_{i,j=ℓ+1}^{N−ℓ}, ℓ = 0, 1, …, N/2 − 1. Matrices with the latter property are called centro-nonsingular. The algorithms in [14] are, in principle, split algorithms in the sense of Delsarte–Genin [3,4]. Some algorithms in [14] are the skewsymmetric counterparts of double-step split algorithms for symmetric Toeplitz matrices proposed in [16] and [8]. However, surprisingly, there are also algorithms for skewsymmetric Toeplitz matrices that do not have an obvious symmetric counterpart, which is due to some additional symmetry properties of skewsymmetric Toeplitz matrices.

An algorithm for Toeplitz matrices working without additional conditions was first proposed in [7]. A discussion of algorithms of this kind can also be found in [19]. But these algorithms are for general Toeplitz matrices and do not fully utilize additional symmetry properties like symmetry or skewsymmetry. Thus, the aim of the present paper is to design (split) algorithms that exploit both the Toeplitz structure and the skewsymmetry and work without assumptions on the rank profile. Split algorithms for general symmetric Toeplitz matrices were designed in our recent paper [15]. Let us reiterate that the skewsymmetric case is not simply an analogue of the symmetric case but has some specific peculiarities.

Our approach is based on a look-ahead strategy. In the algorithms we consider only those submatrices T_n which are nonsingular. Let n_1 < n_2 < ⋯ < n_r = N be the set of all n = n_k for which T_n is nonsingular, and let u^{(k)} be the vector spanning the (one-dimensional) nullspace of T_{n_k+1}. Here T_{N+1} means any skewsymmetric extension of T_N. The Levinson-type algorithm computes a vector u^{(k+1)} from u^{(k)} and u^{(k−1)} by a three-term recursion, and the Schur-type algorithm computes the corresponding residuals. The last two vectors u^{(k)} determine the inverse matrix via an "inversion formula" which allows to solve a linear system efficiently. Note that a different approach for solving skewsymmetric Toeplitz systems will be discussed in a forthcoming paper [9]. The approach in [9] is based on the recursion of fundamental systems (see [13]). One of its advantages is that it can easily be generalized to the block case, which is not the case for the look-ahead approach.

Like the classical Schur algorithm is related to an LU-factorization of the Toeplitz matrix and the classical Levinson algorithm is related to a UL-factorization of its inverse, the split Schur algorithm for symmetric Toeplitz matrices is related to a ZW-factorization¹ of the matrix and the split Levinson algorithm to a WZ-factorization of its inverse. This was observed in [5]. Concerning WZ-factorization for general matrices we refer to [6,18] and references therein.

¹ The definitions of Z- and W-matrices are given in Section 6.

In [14] the structure of the ZW-factorization of centro-nonsingular skewsymmetric Toeplitz matrices was studied. It was shown that such a matrix T_N admits a representation T_N = ZXZ^T in which Z is a special unit Z-matrix and X is a skewsymmetric antidiagonal matrix (and a similar result for T_N^{−1}). In the present paper we show that, more generally, any nonsingular skewsymmetric Toeplitz matrix admits such a representation in which X is a skewsymmetric block antidiagonal matrix and the blocks are multiples of the identity. The factors Z and X can be computed with the help of the generalized split Schur algorithm. The factorization combined with back substitution gives another possibility to solve linear systems without computing the vectors u^{(k)}. Besides the solution via inversion formula and factorization we also discuss the solution via direct recursion. We refrain from computing the computational complexities in all cases, since their exact values depend on the rank profile of the matrix. However, it can be pointed out that these values are in general not essentially higher, and in most cases even lower, than the corresponding values computed in [14] for the case of a centro-nonsingular skewsymmetric Toeplitz matrix.

Let us introduce some notation that will be used throughout the paper. We denote by J_n the n × n counteridentity matrix, which has ones on the antidiagonal and zeros elsewhere. A vector u ∈ F^n is called symmetric if u = J_n u and skewsymmetric if u = −J_n u. An n × n matrix B is called centrosymmetric if J_n B J_n = B and centro-skewsymmetric if J_n B J_n = −B. Let F^n_± be the subspaces of F^n consisting of all symmetric, respectively skewsymmetric, vectors. Occasionally we will use polynomial language. For a matrix A = [a_{ij}], A(t, s) will denote the bivariate polynomial

    A(t, s) = Σ_{i,j} a_{ij} t^{i−1} s^{j−1},

and for u = (u_i)_{i=1}^n we set u(t) = Σ_{i=1}^n u_i t^{i−1}. For a vector u = (u_i)_{i=1}^l, let M_k(u) denote the (k + l − 1) × k matrix

    M_k(u) = [ u_1            ]
             [  ⋮    ⋱        ]
             [ u_l        u_1 ]
             [       ⋱    ⋮   ]
             [            u_l ]

whose k columns contain the entries of u shifted down by one position each. It is easily checked that, for x ∈ F^k, (M_k(u)x)(t) = u(t)x(t). Furthermore, e_k ∈ F^n will denote the kth vector in the standard basis of F^n, and 0_k will denote a zero vector of length k. If the length of the vector is clear or irrelevant we omit the subscript.

2. Inversion formula

From now on, let T_N = [a_{i−j}]_{i,j=1}^N be a nonsingular skewsymmetric Toeplitz matrix and T_{N+1} any skewsymmetric (N + 1) × (N + 1) Toeplitz extension of T_N. Clearly, N must be even, and T_{N+1} and T_{N−1} have one-dimensional nullspaces. Let u ∈ F^{N+1}
G. Heinig, K. Rost / Theoretical Computer Science 315 (2004) 453 – 468
and $u' \in \mathbb{F}^{N-1}$ be the vectors spanning these nullspaces. In [13] (see also [14]) it was shown that the vectors $u$ and $u'$ are symmetric. Since $T_N$ is nonsingular, the last component of $u$ is nonzero; therefore we may assume that it is equal to 1. Note that the last component of $u'$ might be zero. Let $r$ be defined by
$$r = [a_{N-1} \;\cdots\; a_1]\, u'.$$
Since $T_N$ is nonsingular, we have $r \neq 0$. It is worth mentioning that the vectors
$$\frac{1}{r}\begin{bmatrix} u' \\ 0 \end{bmatrix}, \qquad -\frac{1}{r}\begin{bmatrix} 0 \\ u' \end{bmatrix}$$
are the last and the first columns of $T_N^{-1}$, respectively. We introduce the (symmetric) vector
$$x = \frac{1}{r}\begin{bmatrix} 0 \\ u' \\ 0 \end{bmatrix} \in \mathbb{F}^{N+1},$$
which is the solution of the equation $T_{N+1} x = e_{N+1} - e_1$. The following is a specification of a well-known inversion formula for general Toeplitz matrices (see [10,1]) to the case of skewsymmetric matrices; it was discussed in [14].

Theorem 2.1. The inverse of $T_N$ is given by
$$T_N^{-1}(t,s) = \frac{u(t)x(s) - x(t)u(s)}{1 - ts}. \tag{1}$$

Formula (1) can be expressed in matrix form in many ways. Let us present one of them, the "classical" Gohberg–Semencul formula built from triangular Toeplitz matrices. For a vector $v = (v_i)_{i=1}^{N+1}$, let $L(v)$ denote the $N \times N$ lower triangular Toeplitz matrix
$$L(v) = \begin{bmatrix} v_1 & & 0 \\ \vdots & \ddots & \\ v_N & \cdots & v_1 \end{bmatrix}.$$

Corollary 2.2. The inverse of $T_N$ is given by
$$T_N^{-1} = L(u)L(x)^T - L(x)L(u)^T. \tag{2}$$

The direct application of (2) has complexity $O(N^2)$, but if $\mathbb{F}$ is the field of real or complex numbers, fast algorithms with complexity $O(N \log N)$ can be applied. Let us mention that there are formulas for $T_N^{-1}$ that contain only diagonal matrices and discrete
Fourier or real trigonometric transformations, which are ready for implementation (see for example [11,12] and the references therein). Note also that formula (2) can be written in terms of polynomial multiplication, and polynomial multiplication can be carried out with complexity $O(N \log N \log\log N)$ in any field (see [17] and the references therein).

3. Recursion background

We are going to show some facts which will be the basis for the split algorithms developed in the next sections. Besides the (nonsingular) matrix $T_N$ and its extension $T_{N+1}$ we consider its central submatrices. Recall that $N$ is even, so all central submatrices of $T_N$ have even order. These central submatrices coincide with the leading principal submatrices $T_k = [a_{i-j}]_{i,j=1}^k$ for even $k$.

Let $T_n$ be nonsingular. Then $T_{n+1}$ has a one-dimensional kernel. Let $u_n$ span the kernel of $T_{n+1}$. Since the last component of $u_n$ does not vanish, we may assume that it is equal to 1. As mentioned above, $u_n$ is symmetric. We introduce the numbers
$$r_j = [a_{j+n} \;\cdots\; a_j]\, u_n$$
for $j = 1, \ldots, N-n$, which will be called the residuals of $u_n$.

Proposition 3.1. Let $r_1 = \cdots = r_{d-1} = 0$, $r_d \neq 0$, and $m = n + 2d$. Then $T_{n+1}, \ldots, T_{m-1}$ are singular and $T_m$ is nonsingular.

Proof. We have
$$T_m M_{2d}(u_n) = \begin{bmatrix} O_{d \times d} & -R \\ O & O \\ R^T & O_{d \times d} \end{bmatrix}, \tag{3}$$
where $R$ denotes the $d \times d$ upper triangular Toeplitz matrix
$$R = \begin{bmatrix} r_d & \cdots & r_{2d-1} \\ & \ddots & \vdots \\ 0 & & r_d \end{bmatrix}.$$
Hence
$$T_{n+2k+1} \begin{bmatrix} 0_k \\ u_n \\ 0_k \end{bmatrix} = 0$$
for $k = 0, \ldots, d-1$, which means that the matrices $T_{n+1}, \ldots, T_{m-1}$ are singular. Furthermore, we conclude from (3) that the vectors $e_1, \ldots, e_d$ and $e_{m-d+1}, \ldots, e_m$ belong to the range of $T_m$ and also to the range of $T_m^T$.
Suppose that $T_m v = 0$. Then $g^T v = 0$ for all vectors $g$ from the range of $T_m^T$. Hence the first and the last $d$ components of $v$ vanish, i.e., $v$ is of the form $v = [0_d^T \;\; \tilde v^T \;\; 0_d^T]^T$, where $\tilde v$ belongs to the kernel of $T_n$. Since $T_n$ is nonsingular, we conclude that $v = 0$. Thus $T_m$ is nonsingular.

Besides the vector $u_n$ we consider a solution $x_n$ of the equation $T_{n+1} x_n = e_{n+1} - e_1$. Since $u_n^T(e_{n+1} - e_1) = 0$, this equation has a (non-unique) solution, which can be chosen symmetric, due to the centro-skewsymmetry of $T_{n+1}$. We introduce the numbers
$$s_j = [a_{j+n} \;\cdots\; a_j]\, x_n$$
for $j = 0, \ldots, N-n$. In particular, $s_0 = 1$.

Let $x_m$ be a solution of the equation $T_{m+1} x_m = e_{m+1} - e_1$ and $u_m$ the vector spanning the kernel of $T_{m+1}$ with last component equal to 1. We show now how $u_m$ and $x_m$ can be computed from $u_n$ and $x_n$. From (3) we conclude that
$$T_{m+1} \begin{bmatrix} 0_d \\ u_n \\ 0_d \end{bmatrix} = r_d\,(e_{m+1} - e_1). \tag{4}$$
Thus $x_m$ can be chosen as
$$x_m = \frac{1}{r_d} \begin{bmatrix} 0_d \\ u_n \\ 0_d \end{bmatrix}.$$
To find $u_m$ we observe that
$$T_{m+1} M_{2d+1}(u_n) = \begin{bmatrix} O & -\tilde R' \\ O & O \\ \tilde R^T & O \end{bmatrix}, \tag{5}$$
where $\tilde R$ denotes the $(d+1) \times (d+1)$ upper triangular Toeplitz matrix
$$\tilde R = \begin{bmatrix} r_d & \cdots & r_{2d} \\ & \ddots & \vdots \\ 0 & & r_d \end{bmatrix}$$
and $\tilde R'$ consists of its first $d$ rows. Let $c = (c_i)_{i=1}^{d+1}$ be the solution of the triangular Toeplitz system
$$\tilde R^T c = s, \qquad s = (s_{i-1})_{i=1}^{d+1}.$$
Furthermore, let $\tilde c \in \mathbb{F}^{2d+1}_+$ be the symmetric extension of $c$ (the symmetric vector whose first $d+1$ components form $c$), $q = 1/c_1$, and $p = q\tilde c$. Then we have
$$T_{m+1} \left( M_{2d+1}(u_n)\, p \;-\; q \begin{bmatrix} 0_d \\ x_n \\ 0_d \end{bmatrix} \right) = 0.$$
By construction, the last (and the first) component of $M_{2d+1}(u_n)p$ equals 1. We arrive at the relation
$$u_m = M_{2d+1}(u_n)\, p \;-\; q \begin{bmatrix} 0_d \\ x_n \\ 0_d \end{bmatrix}. \tag{6}$$
We write relations (4) and (6) in polynomial language and arrive at the following.

Proposition 3.2. The vectors $u_m$ and $x_m$ can be computed from $u_n$ and $x_n$ via
$$u_m(t) = p(t)\,u_n(t) - q\, t^d x_n(t), \qquad x_m(t) = \frac{1}{r_d}\, t^d u_n(t). \tag{7}$$
4. Split algorithms

We discuss now the algorithms emerging from the recursion described in Proposition 3.2. First we introduce some notation. Let $n_1 < \cdots < n_\ell = N$ be the integers $n \in \{1, 2, \ldots, N\}$ for which $T_n$ is nonsingular, let $d_k = \frac{1}{2}(n_{k+1} - n_k)$, and let $u^{(k)}$ be the vector spanning the kernel of $T_{n_k+1}$ with last component equal to 1 and $x^{(k)}$ a solution of $T_{n_k+1} x^{(k)} = e_{n_k+1} - e_1$. The residuals $r_j^{(k)}$ and $s_j^{(k)}$ of $u^{(k)}$ and $x^{(k)}$ are defined by
$$r_j^{(k)} = [a_{j+n_k} \;\cdots\; a_j]\, u^{(k)}, \qquad s_j^{(k)} = [a_{j+n_k} \;\cdots\; a_j]\, x^{(k)}, \tag{8}$$
respectively, for $j = 0, \ldots, N - n_k$. Clearly, $r_0^{(k)} = 0$ and $s_0^{(k)} = 1$. Our aim is to find $u = u^{(\ell)}$ and $x = x^{(\ell)}$. Then the solution of a linear system $T_N f = b$ can be computed using the formula from Corollary 2.2 or another inversion formula.

First let us note that, according to (7),
$$x^{(k)} = \frac{1}{r_{d_{k-1}}^{(k-1)}} \begin{bmatrix} 0_{d_{k-1}} \\ u^{(k-1)} \\ 0_{d_{k-1}} \end{bmatrix}
\qquad\text{and}\qquad
s_j^{(k)} = \frac{1}{r_{d_{k-1}}^{(k-1)}}\; r_{j+d_{k-1}}^{(k-1)}.$$
That means it is sufficient to compute the residuals $r_j^{(k)}$ and to construct the vectors $u^{(k)}$.

For initialization we set $n_0 = 0$ and $u^{(0)} = 1$. Then $r_j^{(0)} = a_j$. If $a_1 = \cdots = a_{d-1} = 0$ and $a_d \neq 0$, then $n_1 = 2d$. The vector $u^{(1)}$ is the normalized solution of the homogeneous system $T_{2d+1} v = 0$. We show how this solution can be found. We form the matrix
$$\tilde R^{(0)} = \begin{bmatrix} a_d & \cdots & a_{2d} \\ & \ddots & \vdots \\ 0 & & a_d \end{bmatrix}.$$
Let $c$ be the solution of the triangular Toeplitz system $(\tilde R^{(0)})^T c = e_1$ and $v = \tilde c \in \mathbb{F}^{2d+1}_+$ its symmetric extension. Then $T_{2d+1} v = 0$. Hence $u^{(1)} = (1/c_1)v$, where $c_1$ is the first component of $c$.

We assume now that $n_{k-1}$, $n_k$, $u^{(k-1)}$ and $u^{(k)}$ are given. We also need some of the values $r_j^{(k-1)}$ ($j = 1, \ldots, 2d_{k-1}$) that were computed in the previous step. Now $n_{k+1}$ and $u^{(k+1)}$ are computed as follows. If $r_1^{(k)} = \cdots = r_{d-1}^{(k)} = 0$ and $r_d^{(k)} \neq 0$, then $d_k = d$, i.e., $n_{k+1} = n_k + 2d$.

We compute the numbers $r_{d_k}^{(k)}, \ldots, r_{2d_k}^{(k)}$ and form the matrix
$$\tilde R^{(k)} = \begin{bmatrix} r_{d_k}^{(k)} & \cdots & r_{2d_k}^{(k)} \\ & \ddots & \vdots \\ 0 & & r_{d_k}^{(k)} \end{bmatrix}.$$
If $d_k > d_{k-1}$, then it is necessary to compute also the numbers $r_j^{(k-1)}$ for $j = 2d_{k-1}+1, \ldots, d_k + d_{k-1}$ in order to form the vector $r^{(k-1)} = (r_j^{(k-1)})_{j=d_{k-1}}^{d_k+d_{k-1}}$. Let $c^{(k)}$ be the solution of the triangular Toeplitz system
$$(\tilde R^{(k)})^T c^{(k)} = r^{(k-1)},$$
let $q^{(k)} = 1/c_1$, where $c_1$ is the first component of $c^{(k)}$, and let $p^{(k)} \in \mathbb{F}^{2d_k+1}_+$ be $q^{(k)}$ times the symmetric extension of $c^{(k)}$. Then
$$u^{(k+1)} = M_{2d_k+1}(u^{(k)})\, p^{(k)} \;-\; q^{(k)} \begin{bmatrix} 0_{d_k+d_{k-1}} \\ u^{(k-1)} \\ 0_{d_k+d_{k-1}} \end{bmatrix}.$$
In polynomial language the recursion can be written as follows.
Theorem 4.1. The polynomials $u^{(k)}(t)$ satisfy the three-term recursion
$$u^{(k+1)}(t) = p^{(k)}(t)\, u^{(k)}(t) - t^{d_k+d_{k-1}}\, q^{(k)} u^{(k-1)}(t).$$

Example 1. Consider the skewsymmetric Toeplitz matrix $T_6 = [a_{i-j}]_{i,j=1}^6$ with $(a_k)_{k=1}^5 = (1, 2, 3, 5, 6)$. Since we also need an extension of $T_6$, we set $a_6 = 0$. The standard setting for initialization is $n_0 = 0$, $u^{(0)} = 1$ and $r_j^{(0)} = a_j$. Since $r_1^{(0)} = 1 \neq 0$ we have $d_0 = 1$ and $n_1 = n_0 + 2d_0 = 2$. We obtain $x^{(1)} = [0\;\, 1\;\, 0]^T$ and $u^{(1)} = [1\;\, {-2}\;\, 1]^T$. With $u^{(0)}$ and $u^{(1)}$ we can start the recursion. We compute the residuals $r_1^{(1)} = 0$, $r_2^{(1)} = 1$. Thus $d_1 = 2$, $n_2 = n_1 + 2d_1 = 6$, and $x^{(2)} = [0\;\, 0\;\, 1\;\, {-2}\;\, 1\;\, 0\;\, 0]^T$. In order to form the matrix $\tilde R^{(1)}$ we find that $r_3^{(1)} = -1$ and $r_4^{(1)} = -7$, and in order to form the vector $r^{(0)}$ we observe that $r_2^{(0)} = a_2 = 2$ and $r_3^{(0)} = a_3 = 3$. The solution of the system $(\tilde R^{(1)})^T c^{(1)} = r^{(0)}$ is $c^{(1)} = [1\;\, 3\;\, 13]^T$. Hence $p^{(1)} = [1\;\, 3\;\, 13\;\, 3\;\, 1]^T$, which gives
$$u^{(2)} = [1\;\, 1\;\, 8\;\, {-21}\;\, 8\;\, 1\;\, 1]^T.$$
The inverse of $T_6$ is now given by Corollary 2.2 with $x = x^{(2)}$ and $u = u^{(2)}$. A check shows that this really gives the inverse matrix.

Let us discuss the complexity of the algorithm emerging from Theorem 4.1. Surprisingly, the existence of singular central submatrices does not increase the complexity; in many cases it even decreases it. For simplicity we assume that all $d_k$ are equal to $d$, where $d$ is small compared with $N$, and we neglect lower-order terms. The amount of work for the inner product calculations is almost independent of $d$: we have to compute about $N$ inner products of a symmetric and a general vector, which requires $\frac{1}{2}N^2$ additions and $\frac{1}{4}N^2$ multiplications. Then we have in each step $2d+1$ vector additions of symmetric vectors and $2d+1$ multiplications of a symmetric vector by a scalar. This results in $(\frac{1}{4} + \frac{1}{8d})N^2$ additions and $(\frac{1}{8} + \frac{1}{8d})N^2$ multiplications. Thus, the overall complexity is about $(\frac{3}{4} + \frac{1}{8d})N^2$ additions and $(\frac{3}{8} + \frac{1}{8d})N^2$ multiplications. That means the amount of work decreases when $d$ increases.
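The numbers of Example 1 can be replayed mechanically. The following sketch (plain Python; all function and variable names are ours, not the paper's) reproduces $u^{(2)}$ via the three-term recursion of Theorem 4.1 and then checks Corollary 2.2 by multiplying $T_6$ with the reconstructed inverse:

```python
def poly_mul(p, q):
    """Coefficient list of the product p(t)q(t)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def skew_toeplitz(a, n):
    """n x n skewsymmetric Toeplitz matrix [a_{i-j}] from a = (a_1, ..., a_{n-1})."""
    aa = [0] + list(a)                      # aa[k] = a_k, a_0 = 0, a_{-k} = -a_k
    return [[(aa[i - j] if i >= j else -aa[j - i]) for j in range(n)]
            for i in range(n)]

def lower_toeplitz(v, n):
    """n x n lower triangular Toeplitz matrix L(v) with first column v_1..v_n."""
    return [[v[i - j] if i >= j else 0 for j in range(n)] for i in range(n)]

# Example 1 data: u^(0) = 1, u^(1) = [1,-2,1], p^(1) = [1,3,13,3,1], q^(1) = 1
u0, u1, p1, q1 = [1], [1, -2, 1], [1, 3, 13, 3, 1], 1

# Theorem 4.1: u^(2)(t) = p^(1)(t) u^(1)(t) - t^(d1+d0) q^(1) u^(0)(t), d1+d0 = 3
u2 = poly_mul(p1, u1)
u2[3] -= q1 * u0[0]

# x^(2) = (1/r_2^(1)) [0_2; u^(1); 0_2], with r_2^(1) = 1
x2 = [0, 0] + u1 + [0, 0]

# Corollary 2.2: T_6^{-1} = L(u)L(x)^T - L(x)L(u)^T
T6 = skew_toeplitz([1, 2, 3, 5, 6], 6)
Lu, Lx = lower_toeplitz(u2, 6), lower_toeplitz(x2, 6)
Tinv = [[sum(Lu[i][k] * Lx[j][k] - Lx[i][k] * Lu[j][k] for k in range(6))
         for j in range(6)] for i in range(6)]
```

Multiplying `T6` by `Tinv` indeed returns the identity matrix, confirming both the recursion step and the inversion formula on this example.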
In the case $d = 1$, which is the centro-nonsingular case, Theorem 4.1 is just Theorem 3.2 in [14]. In this case the complexity is $\frac{7}{8}N^2$ additions and $\frac{1}{2}N^2$ multiplications (compare [14]).

The algorithm just described is a split Levinson-type algorithm and includes the calculation of the residuals via long inner products, which might be inconvenient in parallel computing. We show now that the residuals can also be computed by a Schur-type algorithm. The Schur-type algorithm is of independent interest, since it provides a factorization, which will be described in Section 6.

We consider the full residual vectors $r^{(k)} = (r_j^{(k)})_{j=1}^{N-n_k}$ and the corresponding polynomials $r^{(k)}(t)$. By the definition of the integer $d_k$, $\tilde r^{(k)}(t) = t^{-d_k+1} r^{(k)}(t)$ is a polynomial. The monic, symmetric polynomial $p^{(k)}(t)$ and the number $q^{(k)} \in \mathbb{F}$ have been constructed in such a way that the polynomial
$$\tilde r^{(k)}(t)\, p^{(k)}(t) - q^{(k)} \tilde r^{(k-1)}(t)$$
has a zero of order $d_k + 1$ at $t = 0$; the remainder will give us $r^{(k+1)}(t)$. Let $P_m$ denote the projector mapping a polynomial $\sum_{j=1}^{N} p_j t^{j-1}$ ($N \geq m$) to $\sum_{j=1}^{m} p_j t^{j-1}$, i.e., cutting off the high powers. Theorem 4.1 immediately gives us the following recursion formula for the residuals.

Theorem 4.2. The polynomials $r^{(k)}(t)$ satisfy the recursion
$$r^{(k+1)}(t) = P_{N-n_{k+1}}\bigl(t^{-2d_k}\, p^{(k)}(t)\, r^{(k)}(t) - t^{-d_{k-1}-d_k}\, q^{(k)} r^{(k-1)}(t)\bigr).$$

To write this recursion in matrix form we introduce the matrix
$$Q^{(k)} = [\,r_{2d_k+i-j+1}^{(k)}\,]_{i=1,\,j=1}^{\varrho_k,\;2d_k+1},$$
where $\varrho_k = N - n_{k+1} = N - n_k - 2d_k$. Now we have
$$r^{(k+1)} = Q^{(k)} p^{(k)} - q^{(k)} \bar r^{(k-1)}, \qquad \bar r^{(k-1)} = [\,r_{d_k+d_{k-1}+i}^{(k-1)}\,]_{i=1}^{\varrho_k}.$$
The recursion starts with $\bar r^{(-1)} = 0$, $r^{(0)} = [a_j]_{j=1}^N$, $p^{(0)} = u^{(1)}$, and $Q^{(0)} = [a_{n_1+i-j+1}]_{i=1,\,j=1}^{N-n_1,\;n_1+1}$. The vector $u^{(1)}$ is computed as described in the initialization of the Levinson-type recursion, via the solution of a triangular $(d_1+1) \times (d_1+1)$ Toeplitz system. Theorem 4.2 can be combined with Theorem 4.1 to compute $u$ and $x$, the parameters for the inversion formula.
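One step of this Schur-type residual recursion can be checked on the data of Example 1: for $k = 0$ the correction term vanishes ($\bar r^{(-1)} = 0$), so the coefficients of $p^{(0)}(t)\,r^{(0)}(t)$ at degrees $2d_0, \ldots, 2d_0 + (N-n_1) - 1$ must reproduce the residuals of $u^{(1)}$. A sketch (plain Python; helper names ours):

```python
def poly_mul(p, q):
    """Coefficient list of the product p(t)q(t)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

a = [1, 2, 3, 5, 6, 0]                 # a_1..a_6 of Example 1 (a_6 = 0)
r0 = a[:]                              # r_j^(0) = a_j, as coefficients of r^(0)(t)
p0 = [1, -2, 1]                        # p^(0) = u^(1)
N, n1, d0 = 6, 2, 1

# Theorem 4.2 with k = 0: keep the coefficient window [2*d0, 2*d0 + (N - n1))
prod = poly_mul(p0, r0)
r1 = prod[2 * d0 : 2 * d0 + (N - n1)]

# cross-check against the definition r_j^(1) = [a_{j+2}  a_{j+1}  a_j] u^(1)
u1 = [1, -2, 1]
aa = [0] + a                           # aa[k] = a_k with a_0 = 0
r1_direct = [sum(aa[j + 2 - i] * u1[i] for i in range(3))
             for j in range(1, N - n1 + 1)]
```

Both routes give the residual vector $(r_1^{(1)}, \ldots, r_4^{(1)}) = (0, 1, -1, -7)$, the numbers used in Example 1.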
5. Solution of linear systems

In this section we show how to solve a linear system $T_N f_N = b_N$ with a nonsingular $N \times N$ skewsymmetric Toeplitz coefficient matrix $T_N$ recursively, without using the inversion formula. We use all notations introduced in the previous section.

Suppose that $b = [b_i]_{i=1}^N$. We set $b^{(k)} = [b_i]_{i=(N-n_k)/2+1}^{(N+n_k)/2} \in \mathbb{F}^{n_k}$ and consider the systems
$$T^{(k)} f^{(k)} = b^{(k)},$$
where $T^{(k)} = T_{n_k}$. Our aim is to compute $f^{(k+1)}$ from $f^{(k)}$. Since $T^{(k+1)}$ is of the form
$$T^{(k+1)} = \begin{bmatrix} * & -B_-^{(k)} & * \\ * & T^{(k)} & * \\ * & B_+^{(k)} & * \end{bmatrix},$$
where
$$B_+^{(k)} = \begin{bmatrix} a_{n_k} & \cdots & a_1 \\ \vdots & & \vdots \\ a_{n_k+d_k-1} & \cdots & a_{d_k} \end{bmatrix} = [\,a_{n_k+i-j}\,]_{i=1,\,j=1}^{d_k,\;n_k},
\qquad
B_-^{(k)} = J_{d_k} B_+^{(k)} J_{n_k},$$
we have
$$T^{(k+1)} \begin{bmatrix} 0 \\ f^{(k)} \\ 0 \end{bmatrix} = \begin{bmatrix} -\varphi_-^{(k)} \\ b^{(k)} \\ \varphi_+^{(k)} \end{bmatrix},
\qquad \varphi_\pm^{(k)} = B_\pm^{(k)} f^{(k)}.$$
As in Section 4 we obtain
$$T^{(k+1)} M_{2d_k}(u^{(k)}) = \begin{bmatrix} O & -R^{(k)} \\ O & O \\ (R^{(k)})^T & O \end{bmatrix},
\qquad
R^{(k)} = \begin{bmatrix} r_{d_k} & \cdots & r_{2d_k-1} \\ & \ddots & \vdots \\ 0 & & r_{d_k} \end{bmatrix}.$$
Hence we have, for $\xi_\pm^{(k)} \in \mathbb{F}^{d_k}$,
$$T^{(k+1)} \left( \begin{bmatrix} 0 \\ f^{(k)} \\ 0 \end{bmatrix} + M_{2d_k}(u^{(k)}) \begin{bmatrix} \xi_+^{(k)} \\ \xi_-^{(k)} \end{bmatrix} \right)
= \begin{bmatrix} -R^{(k)}\xi_-^{(k)} - \varphi_-^{(k)} \\ b^{(k)} \\ (R^{(k)})^T \xi_+^{(k)} + \varphi_+^{(k)} \end{bmatrix}. \tag{9}$$
From this relation we conclude the following.

Theorem 5.1. Suppose that
$$b^{(k+1)} = \begin{bmatrix} b_-^{(k)} \\ b^{(k)} \\ b_+^{(k)} \end{bmatrix},$$
where $b_\pm^{(k)} \in \mathbb{F}^{d_k}$, and let $\xi_\pm^{(k)}$ be the solutions of
$$(R^{(k)})^T \xi_+^{(k)} = b_+^{(k)} - \varphi_+^{(k)}, \qquad R^{(k)} \xi_-^{(k)} = -b_-^{(k)} - \varphi_-^{(k)}. \tag{10}$$
Then the solution $f^{(k+1)}$ of $T^{(k+1)} f^{(k+1)} = b^{(k+1)}$ is given by
$$f^{(k+1)} = \begin{bmatrix} 0 \\ f^{(k)} \\ 0 \end{bmatrix} + M_{2d_k}(u^{(k)}) \begin{bmatrix} \xi_+^{(k)} \\ \xi_-^{(k)} \end{bmatrix}. \tag{11}$$
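For the matrix of Example 1 the recursion consists of the single step $n_1 = 2 \to n_2 = 6$, and Theorem 5.1 can be traced explicitly. The following sketch (plain Python; all helper names are ours, and the right-hand side is an arbitrary choice) implements (10) and (11) for this case:

```python
def skew_toeplitz(a, n):
    """n x n skewsymmetric Toeplitz matrix [a_{i-j}], a_0 = 0, a_{-k} = -a_k."""
    aa = [0] + list(a)
    return [[(aa[i - j] if i >= j else -aa[j - i]) for j in range(n)]
            for i in range(n)]

a = [1, 2, 3, 5, 6]                     # the matrix of Example 1
T6 = skew_toeplitz(a, 6)
b = [1, 0, 2, 0, 0, 3]                  # an arbitrary right-hand side
u1, r2, r3 = [1, -2, 1], 1, -1          # kernel vector of T_3 and its residuals
n1, d = 2, 2                            # n_1 = 2, d_1 = 2, n_2 = 6

# start: solve the central 2 x 2 system T_2 f^(1) = (b_3, b_4)
f1 = [b[3], -b[2]]                      # since T_2 = [[0, -1], [1, 0]]

# phi_+- = B_+- f^(1), with B_+ = [a_{n_1+i-j}] and B_- = J B_+ J
Bp = [[a[n1 + i - j - 1] for j in range(n1)] for i in range(d)]
Bm = [[Bp[d - 1 - i][n1 - 1 - j] for j in range(n1)] for i in range(d)]
php = [sum(Bp[i][j] * f1[j] for j in range(n1)) for i in range(d)]
phm = [sum(Bm[i][j] * f1[j] for j in range(n1)) for i in range(d)]

# the triangular systems (10), with R = [[r2, r3], [0, r2]]
tp = [b[4] - php[0], b[5] - php[1]]     # R^T xi_+ = b_+ - phi_+
xp1 = tp[0] / r2
xp = [xp1, (tp[1] - r3 * xp1) / r2]
tm = [-b[0] - phm[0], -b[1] - phm[1]]   # R xi_- = -b_- - phi_-
xm2 = tm[1] / r2
xm = [(tm[0] - r3 * xm2) / r2, xm2]

# formula (11): f = [0; f^(1); 0] + M_4(u^(1)) [xi_+; xi_-]
f = [0, 0, f1[0], f1[1], 0, 0]
for j, xi in enumerate(xp + xm):
    for i in range(3):                  # add xi times u^(1) shifted down by j
        f[j + i] += xi * u1[i]
```

The final vector `f` satisfies $T_6 f = b$, so the single update step indeed extends the central solution to the full system.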
For one step of the recursion one has first to compute the vectors $\varphi_\pm^{(k)}$, which requires the multiplication of a vector by the $d_k \times n_k$ Toeplitz matrices $B_\pm^{(k)}$, then to solve two triangular $d_k \times d_k$ Toeplitz systems (with actually the same coefficient matrix) to get $\xi_\pm^{(k)}$, and finally to apply formula (11).

The computation of the vectors $\varphi_\pm^{(k)}$ requires long inner product calculations, which can be avoided if the full residual vectors $\tilde\varphi_\pm^{(k)} \in \mathbb{F}^{(N-n_k)/2}$ are considered. These vectors are given by
$$T_N \begin{bmatrix} 0 \\ f^{(k)} \\ 0 \end{bmatrix} = \begin{bmatrix} -\tilde\varphi_-^{(k)} \\ b^{(k)} \\ \tilde\varphi_+^{(k)} \end{bmatrix}.$$
Let $Q_\pm^{(k)}$ be defined by
$$Q_+^{(k)} = [\,r_{2d_k+i-j}^{(k)}\,]_{i=1,\,j=1}^{\varrho_k,\;d_k}, \qquad Q_-^{(k)} = J_{\varrho_k} Q_+^{(k)} J_{d_k},$$
where $\varrho_k = \frac{1}{2}(N - n_{k+1})$. Then we conclude from (9) that
$$\tilde\varphi_\pm^{(k+1)} = Q_\pm^{(k)} \xi_\pm^{(k)} + (\tilde\varphi_\pm^{(k)})',$$
where the prime at $\tilde\varphi_+^{(k)}$ means that the first $d_k$ components are deleted and at $\tilde\varphi_-^{(k)}$ that the last $d_k$ components are deleted.
6. Generalized ZW-factorization

Just as the classical Schur algorithm for symmetric Toeplitz matrices is related to the LU-factorization of the matrix, and the classical Levinson algorithm to a UL-factorization of its inverse, the split Schur algorithm is related to a ZW-factorization of the matrix and the split Levinson algorithm to a WZ-factorization of the inverse. In [14] the latter factorizations were investigated for skewsymmetric Toeplitz matrices. It was shown that centro-nonsingular skewsymmetric Toeplitz matrices admit a ZW-factorization in which the factors possess some additional symmetry properties. We are going to generalize this result to arbitrary nonsingular skewsymmetric Toeplitz matrices. The factorization will lead to the possibility of solving a linear system by a pure Schur-type algorithm.

To be more precise, let us recall some concepts. A matrix $A = [a_{ij}]_{i,j=1}^n$ is called a W-matrix if $a_{ij} = 0$ for all $(i,j)$ for which $i > j$ and $i + j > n$, or $i < j$ and $i + j \leq n$.
The matrix $A$ will be called a unit W-matrix if, in addition, $a_{ii} = 1$ for $i = 1, \ldots, n$ and $a_{i,n+1-i} = 0$ for $i \neq (n+1)/2$. The transpose of a W-matrix is called a Z-matrix. A matrix which is both a Z- and a W-matrix will be called an X-matrix. The names come from the shapes of the sets of all possible positions for nonzero entries, which resemble the letters W, Z and X, respectively.
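These zero patterns are easy to state as predicates. A small sketch (plain Python; the function names are ours) implementing the definitions above literally:

```python
def is_w_matrix(A):
    """a_ij = 0 whenever (i > j and i + j > n) or (i < j and i + j <= n),
    with 1-based indices i, j."""
    n = len(A)
    return all(A[i - 1][j - 1] == 0
               for i in range(1, n + 1) for j in range(1, n + 1)
               if (i > j and i + j > n) or (i < j and i + j <= n))

def is_z_matrix(A):
    # a Z-matrix is the transpose of a W-matrix
    n = len(A)
    return is_w_matrix([[A[j][i] for j in range(n)] for i in range(n)])

def is_unit(A):
    """Unit diagonal, and zero antidiagonal except possibly at i = (n+1)/2."""
    n = len(A)
    return (all(A[i][i] == 1 for i in range(n)) and
            all(A[i][n - 1 - i] == 0 for i in range(n) if 2 * i + 1 != n))
```

An identity matrix passes both predicates (it is an X-matrix); a matrix with a nonzero entry in a forbidden position, e.g. in the lower left corner, is rejected.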
A unit Z- or W-matrix is obviously nonsingular, and a linear system with such a coefficient matrix can be solved by back substitution with $n^2/2$ additions and $n^2/2$ multiplications.

A representation $A = ZXW$ in which $Z$ is a unit Z-matrix, $W$ is a unit W-matrix, and $X$ is a nonsingular X-matrix is called a ZW-factorization. A WZ-factorization is defined analogously. $A$ admits a ZW-factorization if and only if $A$ is centro-nonsingular; under the same condition $A^{-1}$ admits a WZ-factorization. That means that if $A$ is not centro-nonsingular, then no such factorization exists, and a generalization is not immediately at hand. We show now that, nevertheless, in the special case of a skewsymmetric Toeplitz matrix there is a natural generalization of the factorization result in [14].

We introduce the $N \times d_k$ matrices $W_\pm^{(k)}$ by
$$W_-^{(k)} = \begin{bmatrix} O_{\varrho_k \times d_k} \\ M_{d_k}(u^{(k)}) \\ O_{d_k \times d_k} \\ O_{\varrho_k \times d_k} \end{bmatrix},
\qquad
W_+^{(k)} = \begin{bmatrix} O_{\varrho_k \times d_k} \\ O_{d_k \times d_k} \\ M_{d_k}(u^{(k)}) \\ O_{\varrho_k \times d_k} \end{bmatrix},$$
where $\varrho_k = \frac{1}{2}(N - n_{k+1})$, and form the matrix
$$W = [\,W_-^{(\ell-1)} \;\cdots\; W_-^{(0)} \;\; W_+^{(0)} \;\cdots\; W_+^{(\ell-1)}\,]. \tag{12}$$
Recall that $u^{(0)} = 1$ and $n_0 = 0$. Obviously, $W$ is a centrosymmetric unit W-matrix. We have
$$T_N W_-^{(k)} = \begin{bmatrix} -S_-^{(k)} \\ O_{(n_{k+1}-d_k) \times d_k} \\ (R^{(k)})^T \\ S_+^{(k)} \end{bmatrix},
\qquad
T_N W_+^{(k)} = \begin{bmatrix} -\hat S_+^{(k)} \\ -R^{(k)} \\ O_{(n_{k+1}-d_k) \times d_k} \\ \hat S_-^{(k)} \end{bmatrix},$$
where
$$S_+^{(k)} = [\,r_{2d_k+i-j}^{(k)}\,]_{i=1,\,j=1}^{\varrho_k,\;d_k}, \qquad
S_-^{(k)} = [\,r_{\varrho_k-i+j}^{(k)}\,]_{i=1,\,j=1}^{\varrho_k,\;d_k},$$
and $\hat S_\pm^{(k)} = J_{\varrho_k} S_\pm^{(k)} J_{d_k}$. We set $r^{(k)} = r_{d_k}^{(k)}$,
$$Z_+^{(k)} = \frac{1}{r^{(k)}}\, T_N W_-^{(k)}, \qquad Z_-^{(k)} = -\frac{1}{r^{(k)}}\, T_N W_+^{(k)},$$
and form the matrix
$$Z = [\,Z_-^{(\ell-1)} \;\cdots\; Z_-^{(0)} \;\; Z_+^{(0)} \;\cdots\; Z_+^{(\ell-1)}\,]. \tag{13}$$
Then $Z$ is a centrosymmetric unit Z-matrix. Furthermore,
$$T_N W = ZX,$$
where $X$ is the skewsymmetric block antidiagonal matrix with the blocks
$$-r^{(\ell-1)} I_{d_{\ell-1}},\; \ldots,\; -r^{(0)} I_{d_0},\;\; r^{(0)} I_{d_0},\; \ldots,\; r^{(\ell-1)} I_{d_{\ell-1}} \tag{14}$$
along the antidiagonal, from the upper right corner to the lower left corner. This leads to the following.

Theorem 6.1. A nonsingular skewsymmetric Toeplitz matrix and its inverse admit representations
$$T_N = Z X Z^T, \qquad T_N^{-1} = W X^{-1} W^T,$$
where $Z$ is the centrosymmetric Z-matrix given by (13), $W$ is the centrosymmetric W-matrix given by (12), and $X$ is the skewsymmetric block antidiagonal matrix given by (14).

Example 2. Let us illustrate the factorizations for a nonsingular skewsymmetric Toeplitz matrix $T_6 = [a_{i-j}]_{i,j=1}^6$ with $a_1 \neq 0$ for which $T_4$ is singular. That means we have $n_1 = 2$ and $N = n_2 = 6$. Let $u^{(1)} = [1\;\, u\;\, 1]^T$ span the nullspace of $T_3$, so that $u = -a_2/a_1$. According to (12), $W$ is built from shifted copies of $u^{(1)}$ and of $u^{(0)} = 1$:
$$W = \begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 \\
u & 1 & 0 & 0 & 0 & 0 \\
1 & u & 1 & 0 & 1 & 0 \\
0 & 1 & 0 & 1 & u & 1 \\
0 & 0 & 0 & 0 & 1 & u \\
0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}.$$
The matrix $Z$ is obtained from (13); apart from the unit diagonal, its nonzero entries are (up to sign) the quotients $a_2/a_1$, $a_3/a_1$ and $r_3/r_2$. According to (14),
$$X = \begin{bmatrix}
0 & 0 & 0 & 0 & -r_2 & 0 \\
0 & 0 & 0 & 0 & 0 & -r_2 \\
0 & 0 & 0 & -a_1 & 0 & 0 \\
0 & 0 & a_1 & 0 & 0 & 0 \\
r_2 & 0 & 0 & 0 & 0 & 0 \\
0 & r_2 & 0 & 0 & 0 & 0
\end{bmatrix},$$
where $r_2 = a_4 + a_3 u + a_2$ and $r_3 = a_5 + a_4 u + a_3$.

Let us point out that the factorization of $T_N$ can be computed with the help of the Schur-type algorithm emerging from Theorem 4.2, and the factorization of $T_N^{-1}$ with the help of the Levinson-type algorithm emerging from Theorem 4.1. Thus these algorithms can be used to solve linear systems via factorization and back substitution or matrix multiplication, respectively.

7. Concluding remarks

The algorithms described in the previous sections lead to several methods for solving a linear system $T_N f = b$ with a nonsingular skewsymmetric Toeplitz coefficient matrix. There are three possibilities, namely (a) via the inversion formula, (b) via direct recursion, and (c) via factorization. For each possibility there is a Levinson-type and a Schur-type version, so we have six methods. In [14] these six methods (and two more) are described in detail and compared from the viewpoint of complexity in sequential processing in the centro-nonsingular case. In the general case complexity matters are more complicated, since the complexity heavily depends on the rank profile, but the comparison gives, in principle, the same result. It turns out that the Levinson-type algorithm combined with the inversion formula is the most efficient one from the complexity point of view, provided that a fast algorithm is used for matrix-vector multiplication. If the latter is carried out in the classical way, then direct recursion and factorization are preferable.

Let us point out that complexity is not the only criterion for estimating the performance of an algorithm. In floating-point arithmetic, stability is an important issue. It is well known that, as a rule, Schur-type algorithms are more stable than Levinson-type algorithms (see [2]). From this point of view a solution via ZW-factorization and back substitution might be preferable over the other methods.
Furthermore, all Schur-type versions are preferable in parallel computing, since they avoid inner product calculations.
References

[1] D.A. Bini, V.Y. Pan, Matrix and Polynomial Computations 1: Fundamental Algorithms, Birkhäuser Verlag, Basel, Boston, Berlin, 1994.
[2] R.P. Brent, Stability of fast algorithms for structured linear systems, in: T. Kailath, A.H. Sayed (Eds.), Fast Reliable Algorithms for Matrices with Structure, SIAM, Philadelphia, 1999.
[3] P. Delsarte, Y. Genin, The split Levinson algorithm, IEEE Trans. Acoust. Speech Signal Process. ASSP-34 (1986) 470–477.
[4] P. Delsarte, Y. Genin, On the splitting of classical algorithms in linear prediction theory, IEEE Trans. Acoust. Speech Signal Process. ASSP-35 (1987) 645–653.
[5] C.J. Demeure, Bowtie factors of Toeplitz matrices by means of split algorithms, IEEE Trans. Acoust. Speech Signal Process. ASSP-37 (10) (1989) 1601–1603.
[6] D.J. Evans, M. Hatzopoulos, A parallel linear systems solver, Internat. J. Comput. Math. 7 (3) (1979) 227–238.
[7] G. Heinig, Inversion of Toeplitz and Hankel matrices with singular sections, Wiss. Zeitschr. d. TH Karl-Marx-Stadt 25 (3) (1983) 326–333.
[8] G. Heinig, Chebyshev–Hankel matrices and the splitting approach for centrosymmetric Toeplitz-plus-Hankel matrices, Linear Algebra Appl. 327 (1–3) (2001) 181–196.
[9] G. Heinig, A. Al-Rashidi, Fast algorithms for skewsymmetric Toeplitz matrices based on recursion of fundamental systems, in preparation.
[10] G. Heinig, K. Rost, Algebraic Methods for Toeplitz-like Matrices and Operators, Birkhäuser Verlag, Basel, Boston, Stuttgart, 1984.
[11] G. Heinig, K. Rost, DFT representations of Toeplitz-plus-Hankel Bezoutians with application to fast matrix-vector multiplication, Linear Algebra Appl. 284 (1998) 157–175.
[12] G. Heinig, K. Rost, Efficient inversion formulas for Toeplitz-plus-Hankel matrices using trigonometric transformations, in: V. Olshevsky (Ed.), Structured Matrices in Mathematics, Computer Science, and Engineering, Vol. 2, AMS-Series, Providence, RI, Contemp. Math. 281 (2001) 247–264.
[13] G. Heinig, K. Rost, Centrosymmetric and centro-skewsymmetric Toeplitz matrices and Bezoutians, Linear Algebra Appl. 343–344 (2001) 195–209.
[14] G. Heinig, K. Rost, Fast algorithms for skewsymmetric Toeplitz matrices, Oper. Theory Adv. Appl. 135 (2002) 193–208.
[15] G. Heinig, K. Rost, Split algorithms for symmetric Toeplitz matrices with arbitrary rank profile, Numer. Linear Algebra Appl., to appear.
[16] A. Melman, A two-step even-odd split Levinson algorithm for Toeplitz systems, Linear Algebra Appl. 338 (2001) 219–237.
[17] V.Y. Pan, Structured Matrices and Polynomials, Birkhäuser Verlag, Boston, Springer, New York, 2001.
[18] S. Chandra Sekhara Rao, Existence and uniqueness of WZ factorization, Parallel Comput. 23 (8) (1997) 1129–1139.
[19] V.V. Voevodin, E.E. Tyrtyshnikov, Numerical Processes with Toeplitz Matrices, Nauka, Moscow, 1987 (in Russian).
Theoretical Computer Science 315 (2004) 469 – 510
www.elsevier.com/locate/tcs
The aggregation and cancellation techniques as a practical tool for faster matrix multiplication

Igor Kaporin¹

Computational Center of the Russian Academy of Sciences, Vavilova 40, 119991 Moscow, Russia
Abstract

The main purpose of this paper is to present the fast matrix multiplication algorithm taken from the paper of Laderman et al. (Linear Algebra Appl. 162–164 (1992) 557) in a refined compact "analytical" form and to demonstrate that it can be implemented as quite efficient computer code. Our improved presentation enables us to simplify substantially the analysis of the computational complexity and numerical stability of the algorithm as well as its computer implementation. The algorithm multiplies two N × N matrices using O(N^2.7760) arithmetic operations. In the case where N = 18·48^k, for a positive integer k, the total number of flops required by the algorithm is 4.894N^2.7760 − 16.165N^2, which may be compared to a similar estimate for the Winograd algorithm, 3.732N^2.8074 − 5N^2 flops, N = 8·2^k, the latter being the current record bound among all known practical algorithms. Moreover, we present a pseudo-code of the algorithm which demonstrates its very moderate working memory requirements, much smaller than those of the best available implementations of the Strassen and Winograd algorithms. For matrices of medium-large size (say, 2000 ≤ N < 10,000) we consider one-level algorithms and compare them with the (multilevel) Strassen and Winograd algorithms. The results of numerical tests clearly indicate that our accelerated matrix multiplication routines, implementing two or three disjoint product-based algorithms, are comparable in computational time with an implementation of the Winograd algorithm and clearly outperform it with respect to working space and (especially) numerical stability. The tests were performed for matrices of order up to 7000, both in double and single precision.
© 2004 Elsevier B.V. All rights reserved.

Keywords: Fast matrix multiplication; Strassen algorithm; Winograd algorithm; Pan's aggregation/cancellation method; Numerical stability; Computational complexity
¹ Supported by the NSF grant CCR-9732206. E-mail address: [email protected] (I. Kaporin).
0304-3975/$ - see front matter © 2004 Elsevier B.V. All rights reserved.
doi:10.1016/j.tcs.2004.01.004
1. Introduction

Matrix multiplication is one of the most basic computational tasks arising in numerical computing. Software implementing this operation (among other basic linear algebra modules) is always included in general-purpose scientific packages, or invoked by them; see, e.g., [10,13]. The most widely known is the LAPACK library, which includes, e.g., such routines as DGEMM and SGEMM (multiplication of general rectangular matrices in double and single precision, respectively). Matrix multiplication is also a basic operation for many important non-numerical computational problems, such as:
• transitive closure and all-pairs-shortest-distance problems in graphs [1,29];
• parsing algorithms for context-free grammars (as is known, context-free language recognition over an input sequence of length n can be reduced to multiplication of n × n matrices) [15,27];
• pattern recognition tasks (classification and finding similar objects), arising, e.g., in factor analysis of texts or in image retrieval; see [8] and the references therein;
• computational molecular biology (processing gene expression profiles, which is reduced to the problem of identification of Boolean networks) [2,5].
In some of the above problems the matrices are Boolean rather than filled with floating-point numbers; however, most of the results on fast matrix multiplication still hold true. Moreover, the numerical stability problem disappears in the Boolean setting.

As a part of the intensive development of software for fundamental computational kernels during the last three decades, a considerable effort was directed towards efficient implementation of fast matrix multiplication (MM) algorithms [3,7,12,17,23,26]. However, only the Strassen algorithm (1969) [25] and the rather similar Winograd algorithm (1974), see, e.g., [6,14], have been implemented. The latter is often referred to as Strassen–Winograd's, and hereafter we use the abbreviation SW.

The main deficiencies of the SW-based implementations are:
• the much larger worst-case upper bound for the floating-point error as compared to that of the classical O(n³) procedure (hence, the Strassen-type algorithms cannot be safely used in single-precision floating-point computations), cf. [6,7,12,14];
• the need for a rather large volume of working memory;
• the discrepancy between the algorithmic tunings providing the minimization of the total operation count and the tunings aimed at the maximization of Mflops performance on modern RISC computers; see, e.g., [24];
• algorithmic complications arising for inputs that are rectangular matrices of arbitrary sizes.
Some problems also arise with efficient parallel implementation, but these issues are not treated here.

However, there exists another class of practical matrix multiplication algorithms which are clearly better than the SW ones with respect to numerical stability and workspace consumption, and are competitive with respect to operation count and running time for realistic matrix sizes. The basis for the construction of such algorithms was set in [19,20,21], where the so-called aggregation-cancellation techniques were proposed for calculating two or three disjoint matrix products. Later on, in [18] a great practical
potential hidden in such designs was revealed, in particular the gain in floating-point accuracy, but also their rather regular structure and very moderate working memory requirements, typically smaller than those of the available SW implementations.

Our refined algorithm multiplies two N × N matrices by using O(N^2.7760) flops (floating-point arithmetic operations). In the case where N = 18·48^k, for a positive integer k, the total number of flops required by the algorithm is 4.894N^2.7760 − 16.165N^2, which may be related to the estimate T_SW = 3.732N^2.8074 − 5N^2 flops, N = 8·2^k, for the SW algorithm. The latter was the current record bound among all known practical algorithms. (We do not count the theoretically fast algorithms [9,16] that support even much smaller exponents (2.375... for square matrix multiplications) but are not competitive even with the classical algorithm unless N is immensely large.)

Our numerical tests indicate that the fast matrix multiplication routine implementing our algorithm based on two and three disjoint products is comparable to an implementation of the SW algorithm with respect to time, but takes considerably less working storage and possesses much better numerical stability (almost as good as for some implementations of the standard MM algorithm). The tests were performed for matrices of order up to 7000, both in double and single precision.

The paper is organized as follows. In Section 2, we restate and refine some results from [18]; one of the main results is the n × 2n by 2n × n MM algorithm requiring n³ + 3n² − n bilinear multiplications. This also serves as an elementary introduction into our subject. In Section 3 we present a refined compact version of the fast Disjoint Triple MM algorithm taken from [19] as well as the related n × 3n by 3n × n matrix multiplication algorithm using n³ + 12n² + 24n bilinear multiplications derived similarly to [18].
We give there pseudo-codes for the key algorithms, as well as the analysis of the computational complexity and a discussion of the numerical stability and computer implementation of the algorithm. In Section 4, we outline one-level procedures derived from the above rectangular MM algorithms, in particular their adjustment to odd-sized and rectangular inputs. In Section 5, the results of numerical tests are given. Finally, concluding remarks are given in Section 6.

2. Two disjoint product based algorithms

Let us devise fast MM algorithms [18,22] by relying on the aggregation technique, specifically on the so-called 2-procedure; hereafter we refer to them as PK2-algorithms.

2.1. A recursive procedure for two disjoint MM

To compute two generally disjoint matrix products
$$Z = XY, \qquad W = UV,$$
where all of $U, V, W, X, Y, Z$ are $n \times n$ block matrices with the blocks properly dimensioned, consider the $n^3$ aggregates
$$m_{ijk} = (x_{ik} + u_{kj})(y_{kj} + v_{ji}).$$
Summation over k or over j gives us z_{ij} or w_{ki}, respectively, up to some additive correction terms which involve only 3n^2 multiplications:

z_{ij} = −c_j − (x_i + u_j) v_{ji} + Σ_{k=1}^n m_{ijk},

w_{ki} = −r_k − x_{ik} (y_k + v_i) + Σ_{j=1}^n m_{ijk},

where

c_j = Σ_{k=1}^n u_{kj} y_{kj},   r_k = Σ_{j=1}^n u_{kj} y_{kj},
x_i = Σ_{k=1}^n x_{ik},   u_j = Σ_{k=1}^n u_{kj},
y_k = Σ_{j=1}^n y_{kj},   v_i = Σ_{j=1}^n v_{ji}.
Hence, the number of multiplications is only μ(n) = n^3 + 3n^2 (compared to 2n^3 for the double application of the standard algorithm). The number of additions and subtractions must be accounted separately for each typical size of matrix blocks involved. In what follows, the three matrix pairs {X, U}, {Y, V}, and {Z, W} are composed of l × l/t, l/t × l, and l × l blocks, respectively, where t = 2 for the 2-procedure (Section 2) and t = 3 for the 3-procedure (Section 3). One can see that the number of additions and subtractions is α_1(n) = 2n^3 + 6n^2 − 4n for the input-type blocks (i.e., related to the input matrices X, Y, U, V), and α_2(n) = 2n^3 + 4n^2 − 2n for the output-type blocks (i.e., related to the output matrices Z, W). Since the number n^3 + 3n^2 is always even, a recursive algorithm groups smaller MM problems into pairs, and for each pair the same procedure applies. For N = n^k l with some fixed l, one has

b(N) = (μ(n)/2) b(N/n) = ··· = ((n^3 + 3n^2)/2)^k b(l),

where b(N) is the number of multiplications for two N × N disjoint matrix products. Thus, the total number of operations can be estimated as O(N^{ω(n)}), where

ω(n) = log((n^3 + 3n^2)/2) / log n;

in particular, ω(13) = 2 + log_13 8 < 2.81071. This exponent ω slightly exceeds ω = log_2 7 < 2.80736 of the Strassen-type algorithms, but the fast MM algorithm above is much more appealing from the practical viewpoint, especially for floating-point calculations, cf. [18].
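The 2-procedure above can be checked directly on scalar entries (blocks of size 1). The following sketch is an illustration of ours, not the paper's code: it computes Z = XY and W = UV via the n^3 aggregates plus the 3n^2 correction products, and verifies the result against the definition:

```python
import random

def two_disjoint_products(X, Y, U, V):
    """Compute Z = X*Y and W = U*V via the aggregates
    m_ijk = (x_ik + u_kj)(y_kj + v_ji) and the corrections above."""
    n = len(X)
    # row/column sums used in the correction terms
    c = [sum(U[k][j] * Y[k][j] for k in range(n)) for j in range(n)]
    r = [sum(U[k][j] * Y[k][j] for j in range(n)) for k in range(n)]
    xs = [sum(X[i][k] for k in range(n)) for i in range(n)]
    us = [sum(U[k][j] for k in range(n)) for j in range(n)]
    ys = [sum(Y[k][j] for j in range(n)) for k in range(n)]
    vs = [sum(V[j][i] for j in range(n)) for i in range(n)]
    m = [[[(X[i][k] + U[k][j]) * (Y[k][j] + V[j][i])
           for k in range(n)] for j in range(n)] for i in range(n)]
    Z = [[-c[j] - (xs[i] + us[j]) * V[j][i] + sum(m[i][j])
          for j in range(n)] for i in range(n)]
    W = [[-r[k] - X[i][k] * (ys[k] + vs[i])
          + sum(m[i][j][k] for j in range(n))
          for i in range(n)] for k in range(n)]
    return Z, W

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

n = 5
rnd = lambda: [[random.randint(-9, 9) for _ in range(n)] for _ in range(n)]
X, Y, U, V = rnd(), rnd(), rnd(), rnd()
Z, W = two_disjoint_products(X, Y, U, V)
assert Z == matmul(X, Y) and W == matmul(U, V)
```

Integer inputs make the check exact; with blocks in place of scalars the same identities hold, which is what the recursive procedure exploits.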
2.2. The algorithm for n × 2n by 2n × n product

For the computation of a single matrix product, one can save more operations. Consider the product H of an n × 2n block matrix F by a 2n × n block matrix G:

H = FG.

The standard algorithm “by definition” h_{ij} = Σ_k f_{ik} g_{kj} uses 2n^3 block multiplications and 2n^3 − n^2 (output-type) block additions. The original problem is reduced to two disjoint products by the column splitting of F and the row splitting of G into two equal blocks each, that is,

F = [X U],   G = [Y; V],

where X, U and Y, V have the block sizes n × n. The equations

Z = XY,   W = UV,   H = Z + W

reduce the problem to a pair of disjoint matrix multiplications and an n × n matrix addition. Analysis of the expression for z_{ii} + w_{ii} shows, however, that we may remove the aggregates m_{iii} from the summation by spreading their terms among the diagonal corrections for h_{ii}. Indeed, for i ≠ j one can directly use the formulas of the preceding subsection,

h_{ij} = z_{ij} + w_{ij} = −c_j − r_i − (x_i + u_j) v_{ji} − x_{ji} (y_i + v_j) + Σ_{k=1}^n (m_{ijk} + m_{jki}),

while for i = j one readily obtains

h_{ii} = −c_i − r_i − (x_i + u_i) v_{ii} − x_{ii} (y_i + v_i) + 2m_{iii} + Σ_{k≠i} (m_{iik} + m_{iki})
     = −c'_i − r'_i − (x'_i + u'_i − u_{ii}) v_{ii} − x_{ii} (y'_i − y_{ii} + v'_i) + Σ_{k≠i} (m_{iik} + m_{iki}),

where

c'_i = c_i + u_{ii} y_{ii},   r'_i = r_i + u_{ii} y_{ii},
x'_i = x_i + x_{ii},   u'_i = u_i + u_{ii},   y'_i = y_i + y_{ii},   v'_i = v_i + v_{ii}.

Introducing the notations

F_{i,j} = x_{ij},   F_{i,n+j} = u_{ij},   G_{i,j} = y_{ij},   G_{n+i,j} = v_{ij}

for the entries of the input matrices and

F0_i = −x'_i − u'_i + u_{ii},   F1_i = −x_i,   F2_i = −u_i,
G0_i = −y'_i + y_{ii} − v'_i,   G1_i = −y_i,   G2_i = −v_i,
H1_i = −c'_i,   H2_i = −r'_i,
for temporary variables, one can obtain the following pseudo-code for the algorithm (the latter eight equalities are valid at Step 4 below):

Step 1:
  F1_i = 0, F2_i = 0, G1_i = 0, G2_i = 0, H1_i = 0, H2_i = 0,  i = 1, …, n

Step 2:
  do i = 1, n
    do j = 1, n
      P := F_{i,n+j} · G_{i,j}
      if (i = j) then
        H_{i,i} := P
      else
        H1_j := H1_j − P
        H2_i := H2_i − P
        F1_i := F1_i − F_{i,j}
        F2_j := F2_j − F_{i,n+j}
        G1_i := G1_i − G_{i,j}
        G2_j := G2_j − G_{n+i,j}
      end if
    end do
  end do

Step 3:
  do i = 1, n
    F0_i := F1_i + F2_i + F_{i,n+i}
    F1_i := F1_i − F_{i,i}
    F2_i := F2_i − F_{i,n+i}
    G0_i := G1_i + G2_i + G_{i,i}
    G1_i := G1_i − G_{i,i}
    G2_i := G2_i − G_{n+i,i}
    P := H_{i,i}
    H_{i,i} := H1_i + H2_i
    H1_i := H1_i − P
    H2_i := H2_i − P
  end do

Step 4:
  do i = 1, n
    do j = 1, n
      if (i = j) then
        H_{i,i} := H_{i,i} + F0_i · G_{n+i,i} + F_{i,i} · G0_i
      else
        S1 := F1_i + F2_j
        S2 := G1_i + G2_j
        H_{i,j} := H1_j + H2_i + S1 · G_{n+j,i} + F_{j,i} · S2
      end if
    end do
  end do

Step 5:
  do i = 1, n
    do j = 1, n
      do k = 1, n
        if (|i − j| + |j − k| ≠ 0) then
          S1 := F_{i,k} + F_{k,n+j}
          S2 := G_{k,j} + G_{n+j,i}
          P := S1 · S2
          H_{i,j} := H_{i,j} + P
          H_{k,i} := H_{k,i} + P
        end if
      end do
    end do
  end do

Here P, S1, S2 are temporary variables, and the symbol “:=” denotes in-place updating. The symbols F0_i, H_{i,j}, … indicate storage areas rather than algebraic terms. The working memory is exactly defined by the matrix blocks F0_i, F1_i, F2_i, G0_i, G1_i, G2_i, H1_i, H2_i, i = 1, …, n. For n of the order of tens, this typically comprises only a small fraction of the total volume of the input and output data. The operation count for the above algorithm is as follows. The number of multiplications is μ̃(n) = n^3 + 3n^2 − n (n^2 at Step 2; 2n^2 at Step 4; n^3 − n at Step 5), and the number of block additions and subtractions is α̃_1(n) = 2n^3 + 6n^2 − 4n for the input-type blocks (4(n^2 − 2n) at Steps 1–2; 8n at Step 3; 2(n^2 − n) at Step 4; 2(n^3 − n) at Step 5), and α̃_2(n) = 2n^3 + 5n^2 − 4n for the output-type blocks (2(n^2 − 2n) at Steps 1–2; 3n at Step 3; 2n + 3(n^2 − n) at Step 4; 2(n^3 − n) at Step 5). Here we assumed a non-trivial initialization of F1, …, H2 (different from zeroing at Step 1 above), which allows us to eliminate 6n fictitious subtractions from zero at Step 2. A similar algorithm with a larger number of multiplications, n^3 + 3n^2, was described in [18].
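The splitting that underlies this section — H = FG reduced to H = XY + UV by halving the columns of F and the rows of G — can be illustrated directly (a sketch of ours, with scalar entries):

```python
import random

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

n = 4
F = [[random.randint(-5, 5) for _ in range(2 * n)] for _ in range(n)]  # n x 2n
G = [[random.randint(-5, 5) for _ in range(n)] for _ in range(2 * n)]  # 2n x n

X = [row[:n] for row in F]   # left half of F
U = [row[n:] for row in F]   # right half of F
Y = G[:n]                    # top half of G
V = G[n:]                    # bottom half of G

Z, W = matmul(X, Y), matmul(U, V)
H = [[Z[i][j] + W[i][j] for j in range(n)] for i in range(n)]
assert H == matmul(F, G)     # H = FG = XY + UV
```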
2.3. The recursive algorithm for square matrices

Multiplying a pair of N × N matrices F and G with numerical entries, assume, for simplicity, that N = n^k l, where n and l are even and k ≥ 1, so that M = N/n is also even. Represent F as an n × 2n block matrix with N/n × N/(2n) blocks, G as a 2n × n block matrix with N/(2n) × N/n blocks, and H as an n × n block matrix with N/n × N/n blocks. Then the algorithm of the preceding subsection can be readily applied, using

T_PK2(N) = α̃_1(n) N^2/(2n^2) + α̃_2(n) N^2/n^2 + (μ̃(n)/2) T_2(N/n)

arithmetic operations, where T_2(M) operations are required for the computation of a pair of M × M/2 by M/2 × M matrix products. The latter problem can be solved either by the standard algorithm (T_2(M) = 2M^3 − 2M^2), which gives rise to the so-called one-level algorithm [18], or by the application of the (generally, recursive) algorithm of Section 2.1. The one-level algorithm is hereafter referred to as PK21. For this algorithm, one readily obtains

T_PK21(N) = (1 + 3/n − 1/n^2) N^3 + (2n + 5 − 5/n) N^2,

which has its minimum near n = O(N^{1/2}). However, the actual constant within this “O” should be adjusted when running the corresponding PK21 code on a specific computer (see Section 5). If one decides to use recursive calls, Step 5 in the above pseudo-code should be unrolled twice:

  do i = 1, n/2
    do j = 1, n
      do k = 1, n
        if (|i − j| + |j − k| ≠ 0) then
          S1 := F_{i,k} + F_{k,n+j}
          S2 := G_{k,j} + G_{n+j,i}
          T1 := F_{n+1−i,n+1−k} + F_{n+1−k,2n+1−j}
          T2 := G_{n+1−k,n+1−j} + G_{2n+1−j,n+1−i}
          P := S1 · S2,  Q := T1 · T2
          H_{i,j} := H_{i,j} + P
          H_{k,i} := H_{k,i} + P
          H_{n+1−i,n+1−j} := H_{n+1−i,n+1−j} + Q
          H_{n+1−k,n+1−i} := H_{n+1−k,n+1−i} + Q
        end if
      end do
    end do
  end do
For the recursive algorithm we have

T_2(M) = α_1(n) M^2/(2n^2) + α_2(n) M^2/n^2 + (μ(n)/2) T_2(M/n),

where M = n^{k−1} l and T_2(l) = 2l^3 − 2l^2. We need the following simple technical result, cf. [1].

Lemma (FMM recursion). Let T(l) be given, M = n^m l, and

T(M) = λ T(M/n) + β n^{−2} M^2

for some constants λ > n^2 and β. Then

T(M) = (T(l) + γ l^2) λ^m − γ M^2,   where γ = β/(λ − n^2).

Corollary. Under the assumptions of the FMM recursion Lemma, it holds that

T(M) = (T(l) l^{−ω} + γ l^{2−ω}) M^ω − γ M^2,   where ω = log λ / log n.

In our case,

λ = μ(n)/2 = (n^3 + 3n^2)/2,   β = α_1(n)/2 + α_2(n) = 3n^3 + 7n^2 − 4n,

and, consequently,

ω = log((n^3 + 3n^2)/2) / log n,   γ = (6n^2 + 14n − 8)/(n^2 + n).
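The lemma is easy to validate numerically. The sketch below is an illustration of ours, with arbitrarily chosen constants satisfying λ > n^2; it compares the recursion with the closed form in exact arithmetic:

```python
from fractions import Fraction as Fr

# FMM recursion lemma: if T(M) = lam*T(M/n) + beta*M^2/n^2 with T(l) given,
# then T(n^m * l) = (T(l) + gam*l^2)*lam^m - gam*M^2, gam = beta/(lam - n^2).
n, l = 3, 2
lam, beta, T_l = 14, 30, 10          # example constants with lam > n^2
gam = Fr(beta, lam - n * n)

def T_rec(M):
    if M == l:
        return Fr(T_l)
    return lam * T_rec(M // n) + Fr(beta * M * M, n * n)

for m in range(4):
    M = n**m * l
    closed = (T_l + gam * l * l) * lam**m - gam * M * M
    assert T_rec(M) == closed
```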
Applying the lemma and using M = N/n, m = k − 1, T_2(l) = 2l^3 − 2l^2, we obtain

T_2(N/n) = (2l^3 + ((4n^2 + 12n − 8)/(n^2 + n)) l^2) ((n^3 + 3n^2)/2)^{k−1} − ((6n^2 + 14n − 8)/(n^2 + n)) (N/n)^2.

Insert this into the formula for T_PK2, use ((n^3 + 3n^2)/2)^k = (N/l)^ω, and after some simplifications obtain

T_PK2(N) = ((n^2 + 3n − 1)/(n^2 + 3n)) (2l^{3−ω} + 4 ((n^2 + 3n − 2)/(n^2 + n)) l^{2−ω}) N^ω − ((5n^3 + 12n^2 − 13n + 4)/(n^3 + n^2)) N^2.

For n = 12 we obtain ω ≤ 2.81086 and

T_PK2(N) = (179/180) (2l^{3−ω} + (178/39) l^{2−ω}) N^ω − (1277/234) N^2.
Table 1
PK2 exponents ω(n)

  n     (n^3 + 3n^2)/2    ω(n) = log((n^3 + 3n^2)/2) / log n
  8        352            2.819810
  10       650            2.812913
  12      1080            2.810856
  14      1666            2.810920
  16      2432            2.811981
  18      3402            2.813520
  20      4600            2.815275
  22      6050            2.817112
  24      7776            2.818957
  26      9802            2.820770
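The entries of Table 1 follow directly from the definition of ω(n); a short check of ours:

```python
import math

def omega_pk2(n):
    # omega(n) = log((n^3 + 3n^2)/2) / log(n)
    return math.log((n**3 + 3 * n**2) / 2) / math.log(n)

assert abs(omega_pk2(8) - 2.819810) < 1e-5
assert abs(omega_pk2(12) - 2.810856) < 1e-5       # smallest even-n value
# (13^3 + 3*13^2)/2 = 8 * 13^2, hence omega(13) = 2 + log_13(8)
assert abs(omega_pk2(13) - (2 + math.log(8, 13))) < 1e-12
```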
Table 1 shows ω(n) for nearby even n. (Although the smallest value is ω(13), odd n's are less convenient for coding.) Finally, choosing l = 10, so that N = 10 · 12^k, we obtain

T_PK2(N) ≤ 3.776 N^{2.81086} − 5.457 N^2.

This estimate should be compared with the similar bounds T_S(N) = 3.895 N^{2.80736} − 6N^2, N = 10 · 2^k, for the Strassen algorithm [25], and T_SW(N) = 3.732 N^{2.80736} − 5N^2, N = 8 · 2^k, for a similar algorithm by Winograd. Thus, the PK2 algorithm can be quite competitive with Strassen-type algorithms for not very large matrices.

Remark 1. Since all the above functions T_PK2(N), T_S(N), and T_SW(N) are defined for values of N belonging to special subsets of integers (which never intersect), the above formulas cannot be used for extracting the “best” algorithm unless N is very large. For concrete values of N one should first specify the rule by which these algorithms are generalized to an arbitrary N. In particular, one can use padding by zeroes (i.e., inflating the matrix dimension to the closest regular value) or peeling (i.e., two-by-two block splitting with a regularly sized leading block of maximum possible dimension), in their static or dynamic versions. Also, for the PK2 method one can use n ≠ 12, different for each recursion level. After all, the concrete software design and hardware features can affect the performance much more essentially than certain less-than-10-per-cent operation count variations. For certain regular (but “non-optimal”) matrix sizes N and cut-off parameters l, one can find the values of T_SW(N) and T_PK21(N), e.g., in Table 5, see Section 5.

Remark 2. In [18], a somewhat underestimated operation count was mistakenly given for a similar matrix multiplication algorithm.
3. Three disjoint product based algorithms

Our next construction of fast MM algorithms relies on aggregation/cancellation techniques and on a two-level block matrix structure; the aggregates involve quadruple rather than double indexing of matrix entries. This enables us to develop the so-called 3-Procedure (for computing Three Disjoint MMs), and we refer to the resulting methods for a single matrix product as the “PK3 algorithms”. In our exposition, we follow the notations of [11, Section 5]. Our basic problem is the calculation of three disjoint n × n matrix products

C^0 = A^0 B^0,   W^0 = U^0 V^0,   Z^0 = X^0 Y^0,   (1)

and, for simplicity, we let n be even,

n = 2m − 2.   (2)
We first describe preprocessing of the input matrices similar to that in [19].

3.1. Reduction to the case of zero row and column sums

We assume, for simplicity, that the entries of A^0, B^0, U^0, V^0, X^0, Y^0 are real numbers. (In general, these matrices can be composed of rectangular submatrices, and then our formulae (3)–(8) would still apply.) Write

A^0 = [A^0_{11} A^0_{12}; A^0_{21} A^0_{22}],   B^0 = [B^0_{11} B^0_{12}; B^0_{21} B^0_{22}],

and similarly for U^0, V^0, X^0, Y^0, where each of the four submatrices has the size (m − 1) × (m − 1), cf. (2). Let I be the (m − 1) × (m − 1) identity matrix and let

u_0^T = [1 … 1]   and   u^T = [u_0^T 1]

denote the (m − 1)- and m-vectors composed of all ones, respectively. Define the matrices

L = [I; −u_0^T],   R = [I − (1/m) u_0 u_0^T,  −(1/m) u_0]

of sizes m × (m − 1) and (m − 1) × m, respectively. Noting that

u^T L = 0,   R u = 0,   R L = I,

consider the transformations

A_{11} = L A^0_{11} R,   B_{11} = L B^0_{11} L^T

of the blocks A^0_{11} and B^0_{11}. Then, clearly,

u^T A_{11} = 0,   A_{11} u = 0,   u^T B_{11} = 0,   B_{11} u = 0,

A_{11} B_{11} = (L A^0_{11} R)(L B^0_{11} L^T) = L (A^0_{11} B^0_{11}) L^T = [A^0_{11} B^0_{11}  *;  *  *].

Now, replace each of the four (m − 1) × (m − 1) blocks A^0_{ij} in A^0 and B^0_{ij} in B^0 by the transformed m × m blocks A_{ij} and B_{ij} with zero row and column sums, and arrive at the matrices

A = [L A^0_{11} R  L A^0_{12} R;  L A^0_{21} R  L A^0_{22} R] = [L 0; 0 L] A^0 [R 0; 0 R],

B = [L B^0_{11} L^T  L B^0_{12} L^T;  L B^0_{21} L^T  L B^0_{22} L^T] = [L 0; 0 L] B^0 [L^T 0; 0 L^T].

The product

A^0 B^0 = C^0 = [C^0_{11} C^0_{12}; C^0_{21} C^0_{22}]

is recovered from the (m − 1) × (m − 1) leading submatrices of the m × m blocks C_{11}, C_{12}, C_{21}, C_{22} in the product

C = AB = [L 0; 0 L] A^0 B^0 [L^T 0; 0 L^T] = [[C^0_{11} *; * *]  [C^0_{12} *; * *];  [C^0_{21} *; * *]  [C^0_{22} *; * *]].

To conclude this section, let us specify the transformation H = LGR of an (m − 1) × (m − 1) submatrix G of a left multiplier (e.g., G = A^0_{11} into H = A_{11}):

H_{im} = −(1/m) Σ_{j=1}^{m−1} G_{ij},   i = 1, …, m − 1,   (3)
H_{ij} = G_{ij} + H_{im},   i = 1, …, m − 1, j = 1, …, m − 1,   (4)
H_{mj} = −Σ_{i=1}^{m−1} H_{ij},   j = 1, …, m − 1.   (5)

For the right multipliers, the transformation of an (m − 1) × (m − 1) submatrix G (e.g., G = B^0_{11} into H = B_{11}), given by H = L G L^T, is even simpler:

H_{im} = −Σ_{j=1}^{m−1} G_{ij},   i = 1, …, m − 1,   (6)
H_{ij} = G_{ij},   i = 1, …, m − 1, j = 1, …, m − 1,   (7)
H_{mj} = −Σ_{i=1}^{m−1} H_{ij},   j = 1, …, m − 1.   (8)
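Transformations (3)–(8), the zero-sum property, and the recovery of the product from the leading submatrix can all be verified on small random blocks. The following is a sketch of ours using exact rational arithmetic:

```python
import random
from fractions import Fraction as Fr

m = 4  # transformed blocks are m x m, inputs are (m-1) x (m-1)

def left_transform(G):       # H = L G R, formulas (3)-(5)
    h = [[Fr(0)] * m for _ in range(m)]
    for i in range(m - 1):
        h[i][m - 1] = -Fr(sum(G[i]), m)                    # (3)
        for j in range(m - 1):
            h[i][j] = G[i][j] + h[i][m - 1]                # (4)
    for j in range(m):
        h[m - 1][j] = -sum(h[i][j] for i in range(m - 1))  # (5)
    return h

def right_transform(G):      # H = L G L^T, formulas (6)-(8)
    h = [[Fr(0)] * m for _ in range(m)]
    for i in range(m - 1):
        for j in range(m - 1):
            h[i][j] = Fr(G[i][j])                          # (7)
        h[i][m - 1] = -Fr(sum(G[i]))                       # (6)
    for j in range(m):
        h[m - 1][j] = -sum(h[i][j] for i in range(m - 1))  # (8)
    return h

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A0 = [[random.randint(-5, 5) for _ in range(m - 1)] for _ in range(m - 1)]
B0 = [[random.randint(-5, 5) for _ in range(m - 1)] for _ in range(m - 1)]
A11, B11 = left_transform(A0), right_transform(B0)

# zero row and column sums of the transformed blocks
assert all(sum(row) == 0 for row in A11) and all(sum(row) == 0 for row in B11)
assert all(sum(A11[i][j] for i in range(m)) == 0 for j in range(m))
assert all(sum(B11[i][j] for i in range(m)) == 0 for j in range(m))

# the (m-1) x (m-1) leading block of A11*B11 equals A0*B0
P = matmul(A11, B11)
assert [row[:m - 1] for row in P[:m - 1]] == matmul(A0, B0)
```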
Due to (5) and (8), we avoid computing the entries A_{pm+m,qm+m}, B_{pm+m,qm+m}, …, Y_{pm+m,qm+m}, p = 0, 1, q = 0, 1, which are not used in our algorithm (as one can see in the next section).

Remark 3. The above preprocessing algorithm is different from that in Section 5 of [19], where the same transformation is made for both the left and right multiplicands (e.g., for A^0 and B^0, respectively), followed by a post-processing stage. In our case, there is no numerical post-processing, and the operation count corresponding to (3)–(8) is, therefore, only about 5/8 times that involved in the preprocessing in [19]. To obtain our next algorithm for three disjoint matrix products, we removed some redundant operations from the algorithm in Section 5 of [19], changed some signs in the aggregates, and reordered rows and columns in the transformed matrices A, B, U, V, X, Y.

3.2. A compact form of the aggregation-cancellation algorithm

Suppose all six input matrices A^0, B^0, U^0, V^0, X^0, Y^0 are preprocessed as in the preceding subsection. Then the following three disjoint products,

C = AB,   W = UV,   Z = XY,
are actually computed, where each matrix has size (n + 2) × (n + 2) for n + 2 = 2m. For the transformed matrices we have the following “zero-sum” relationships:

Σ_{i=1}^m A_{pm+i,qm+j} = 0, 1 ≤ j ≤ m;   Σ_{j=1}^m A_{pm+i,qm+j} = 0, 1 ≤ i ≤ m;   p = 0, 1, q = 0, 1;
Σ_{j=1}^m B_{qm+j,rm+k} = 0, 1 ≤ k ≤ m;   Σ_{k=1}^m B_{qm+j,rm+k} = 0, 1 ≤ j ≤ m;   q = 0, 1, r = 0, 1;
Σ_{j=1}^m U_{rm+j,pm+k} = 0, 1 ≤ k ≤ m;   Σ_{k=1}^m U_{rm+j,pm+k} = 0, 1 ≤ j ≤ m;   r = 0, 1, p = 0, 1;
Σ_{k=1}^m V_{pm+k,qm+i} = 0, 1 ≤ i ≤ m;   Σ_{i=1}^m V_{pm+k,qm+i} = 0, 1 ≤ k ≤ m;   p = 0, 1, q = 0, 1;
Σ_{k=1}^m X_{qm+k,rm+i} = 0, 1 ≤ i ≤ m;   Σ_{i=1}^m X_{qm+k,rm+i} = 0, 1 ≤ k ≤ m;   q = 0, 1, r = 0, 1;
Σ_{i=1}^m Y_{rm+i,pm+j} = 0, 1 ≤ j ≤ m;   Σ_{j=1}^m Y_{rm+i,pm+j} = 0, 1 ≤ i ≤ m;   r = 0, 1, p = 0, 1.
To devise our algorithm, consider the 8m^3 = (n + 2)^3 products (the so-called aggregates, cf. [19])

M^{pqr}_{ijk} = ((−1)^r A_{pm+i,qm+j} + (−1)^q U_{rm+j,pm+k} + (−1)^p X_{qm+k,rm+i})
         × (B_{qm+j,rm+k} + V_{pm+k,qm+i} + Y_{rm+i,pm+j}),
1 ≤ i ≤ m, 1 ≤ j ≤ m, 1 ≤ k ≤ m, p = 0, 1, q = 0, 1, r = 0, 1.   (9)

Each of these products equals the sum of the following nine terms:

M^{pqr}_{ijk} = (−1)^r A_{pm+i,qm+j} B_{qm+j,rm+k} + (−1)^r A_{pm+i,qm+j} V_{pm+k,qm+i}
      + (−1)^r A_{pm+i,qm+j} Y_{rm+i,pm+j} + (−1)^q U_{rm+j,pm+k} B_{qm+j,rm+k}
      + (−1)^q U_{rm+j,pm+k} V_{pm+k,qm+i} + (−1)^q U_{rm+j,pm+k} Y_{rm+i,pm+j}
      + (−1)^p X_{qm+k,rm+i} B_{qm+j,rm+k} + (−1)^p X_{qm+k,rm+i} V_{pm+k,qm+i}
      + (−1)^p X_{qm+k,rm+i} Y_{rm+i,pm+j}.
Sum these quantities over q, j, over p, k, and over r, i; note that the sums of the type

Σ_{q,j} (−1)^q U_{rm+j,pm+k} Y_{rm+i,pm+j},   Σ_{p,k} (−1)^p X_{qm+k,rm+i} B_{qm+j,rm+k},   Σ_{r,i} (−1)^r A_{pm+i,qm+j} V_{pm+k,qm+i}

are equal to zero (due to the so-called cancellation effect, cf. [19]), and take into account the zero-sum properties of the input matrices. This produces the following expressions for (AB)_{pm+i,rm+k}, (UV)_{rm+j,qm+i}, and (XY)_{qm+k,pm+j}, respectively, which define the desired algorithm:

(AB)_{pm+i,rm+k} = (−1)^r Σ_{q,j} M^{pqr}_{ijk} − (−1)^{p+r} m Σ_q X_{qm+k,rm+i} V_{pm+k,qm+i}
      − Σ_{q,j} A_{pm+i,qm+j} Y_{rm+i,pm+j} − Σ_{q,j} (−1)^{q+r} U_{rm+j,pm+k} B_{qm+j,rm+k},
1 ≤ i ≤ m − 1, 1 ≤ k ≤ m − 1, p = 0, 1, r = 0, 1;   (10)

(UV)_{rm+j,qm+i} = (−1)^q Σ_{p,k} M^{pqr}_{ijk} − (−1)^{r+q} m Σ_p A_{pm+i,qm+j} Y_{rm+i,pm+j}
      − Σ_{p,k} U_{rm+j,pm+k} B_{qm+j,rm+k} − Σ_{p,k} (−1)^{p+q} X_{qm+k,rm+i} V_{pm+k,qm+i},
1 ≤ j ≤ m − 1, 1 ≤ i ≤ m − 1, r = 0, 1, q = 0, 1;   (11)

(XY)_{qm+k,pm+j} = (−1)^p Σ_{r,i} M^{pqr}_{ijk} − (−1)^{q+p} m Σ_r U_{rm+j,pm+k} B_{qm+j,rm+k}
      − Σ_{r,i} X_{qm+k,rm+i} V_{pm+k,qm+i} − Σ_{r,i} (−1)^{r+p} A_{pm+i,qm+j} Y_{rm+i,pm+j},
1 ≤ k ≤ m − 1, 1 ≤ j ≤ m − 1, q = 0, 1, p = 0, 1.   (12)
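Formula (10) can be verified numerically for random preprocessed (zero-sum) inputs. The following sketch of ours builds zero-sum m × m blocks by bordering, forms the aggregates (9), and checks (10) against the definition of AB (indices are 0-based, so the range 1 ≤ i ≤ m − 1 becomes 0 ≤ i ≤ m − 2):

```python
import random

m = 3
sgn = lambda e: -1 if e % 2 else 1      # (-1)^e

def zero_sum_block():
    """m x m integer block with zero row and column sums (bordering)."""
    g = [[random.randint(-4, 4) for _ in range(m)] for _ in range(m)]
    for i in range(m - 1):
        g[i][m - 1] = -sum(g[i][j] for j in range(m - 1))
    for j in range(m):
        g[m - 1][j] = -sum(g[i][j] for i in range(m - 1))
    return g

def block_matrix():
    """2m x 2m matrix whose four m x m blocks each have zero sums."""
    M = [[0] * (2 * m) for _ in range(2 * m)]
    for a in (0, 1):
        for b in (0, 1):
            g = zero_sum_block()
            for i in range(m):
                for j in range(m):
                    M[a * m + i][b * m + j] = g[i][j]
    return M

A, U, X = block_matrix(), block_matrix(), block_matrix()
B, V, Y = block_matrix(), block_matrix(), block_matrix()

def agg(p, q, r, i, j, k):              # the aggregates (9)
    return ((sgn(r) * A[p*m+i][q*m+j] + sgn(q) * U[r*m+j][p*m+k]
             + sgn(p) * X[q*m+k][r*m+i])
            * (B[q*m+j][r*m+k] + V[p*m+k][q*m+i] + Y[r*m+i][p*m+j]))

def ab_entry(p, i, r, k):               # right-hand side of (10)
    s = 0
    for q in (0, 1):
        s -= sgn(p + r) * m * X[q*m+k][r*m+i] * V[p*m+k][q*m+i]
        for j in range(m):
            s += sgn(r) * agg(p, q, r, i, j, k)
            s -= A[p*m+i][q*m+j] * Y[r*m+i][p*m+j]
            s -= sgn(q + r) * U[r*m+j][p*m+k] * B[q*m+j][r*m+k]
    return s

AB = [[sum(A[i][t] * B[t][j] for t in range(2 * m)) for j in range(2 * m)]
      for i in range(2 * m)]
for p in (0, 1):
    for r in (0, 1):
        for i in range(m - 1):
            for k in range(m - 1):
                assert ab_entry(p, i, r, k) == AB[p*m+i][r*m+k]
```

Formulas (11) and (12) can be checked the same way by permuting the roles of the six matrices.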
In the next section, we estimate the arithmetic complexity of this algorithm.

Remark 4. The above algorithm can be easily generalized to the case where the sizes of the input matrices are n_1 × n_2, n_2 × n_3, n_2 × n_3, n_3 × n_1, n_3 × n_1, and n_1 × n_2 for A^0, B^0, U^0, V^0, X^0, and Y^0, respectively, as in [19].

Remark 5. For each fixed triple i, j, k, the eight products (9) obtained with different p, q, r correspond exactly to the eight products P_1, …, P_8 introduced in [11, p. 572], as follows: P_1 = M^{000}, P_2 = M^{010}, P_3 = M^{100}, P_4 = M^{001}, P_5 = −M^{111}, P_6 = −M^{101}, P_7 = −M^{011}, P_8 = −M^{110}.

3.3. Asymptotics for bilinear multiplicative cost

The algorithm can be summarized as follows:
• Split the matrices properly and apply transformation (3)–(8) to each of the 24 blocks A^0_{11}, …, Y^0_{22}; then perform all the matrix additions involved in (9).
• Perform the (bilinear) matrix multiplications involved in (9)–(12) (in general, either a recursive call, or the trivial algorithm, or another algorithm can be applied here).
• Perform all additions involved in (10)–(12) (as follows from Section 3.1, for the resulting products C, W, and Z, the bordering rows and columns introduced at the preprocessing stage need not be calculated).

This rather rough sketch makes it possible to estimate the number of bilinear multiplications involved. To estimate the number of linear operations (additions, subtractions, and multiplications by scalars) and the working memory usage, we have to reorder the computations properly, see Sections 3.4–3.6.

Note that for all p, q, r there is no actual need to calculate the products M^{pqr}_{imm}, M^{pqr}_{mim}, M^{pqr}_{mmi}, i = 1, …, m, and A_{pm+m,qm+m} Y_{rm+m,pm+m}, U_{rm+m,pm+m} B_{qm+m,rm+m}, X_{qm+m,rm+m} V_{pm+m,qm+m}, since these quantities are never used in (10)–(12). The remaining products M^{pqr}_{ijk} and the correction terms of the type A_{pm+i,qm+j} Y_{rm+i,pm+j} are computed by using 8(m^3 − 3m + 2) and 24(m^2 − 1) multiplications, respectively. Adding these quantities and recalling 2m = n + 2 yields the following expression for the total number of bilinear multiplications:

μ(n) = 8m^3 + 24m^2 − 24m − 8 = n^3 + 12n^2 + 24n.

This number is divisible by 3 whenever

n = 6k,   k = 1, 2, …

(recall that we already assumed that n is even). Hence, the MMs of smaller size in this construction can be regrouped again in triples. Assuming that N = n^k l, k ≥ 1, l ≥ 1,
one readily obtains the following recurrence relation:

b(N) = (μ(n)/3) b(N/n) = (n^3/3 + 4n^2 + 8n) b(N/n) = ··· = (n^3/3 + 4n^2 + 8n)^k b(l),

where b(N) is the number of bilinear multiplications in the resulting recursive algorithm for three disjoint products of N × N matrices. For n = 48, fixed l, and k → ∞, we obtain an algorithm with asymptotic complexity

T(N) = O(N^{2.7760}).

In general, the “base n” algorithm has the asymptotic complexity O(N^{ω(n)}), where ω(n) = log_n (μ(n)/3) (cf. Section 2.1); some exponents ω(n) are shown in Table 2. The above asymptotics hold for all N, since the limitation N = n^k l can be relaxed using simple bordering of the original matrices by zeroes (also called static padding) [25]. Such techniques may also be of practical use; see Section 4.2, where the case of rectangular matrices is considered.

3.4. Implementation details for the 3-procedure

Next, we study the computational scheme for Three Disjoint MMs in some detail, to estimate the number of linear operations involved and the working memory used.
Table 2
PK3 exponents ω(n)

  n     (n^3 + 12n^2 + 24n)/3    ω(n) = log((n^3 + 12n^2 + 24n)/3) / log n
  12         1248                2.869040
  18         3384                2.811685
  24         7104                2.790517
  30        12,840               2.781468
  36        21,024               2.777555
  42        32,088               2.776125
  48        46,464               2.775995
  54        64,584               2.776577
  60        86,880               2.777559
  66       113,784               2.778763
  72       145,728               2.780085
  78       183,144               2.781464
  84       226,464               2.782860
  90       276,120               2.784249
  96       332,544               2.785617
Let us introduce the following more compact notation using four-dimensional indexing:

A_{pm+i,qm+j} = A^{pq}_{ij},   …,   Z_{qm+k,pm+j} = Z^{qp}_{kj}.
The main part of the algorithm described by (9)–(12) can be implemented as shown by the following pseudo-code:

Step 1:
  do p = 0, 1; q = 0, 1:
    do i = 1, …, m:
      C^{pq}_{mi} := 0, C^{pq}_{im} := 0, W^{pq}_{mi} := 0, W^{pq}_{im} := 0, Z^{pq}_{mi} := 0, Z^{pq}_{im} := 0
    end do
  end do

Step 2:
  do p = 0, 1; q = 0, 1; r = 0, 1:
    do i = 1, …, m; j = 1, …, m:
      if (i < m or j < m) C^{00}_{mm} := A^{pq}_{ij} Y^{rp}_{ij}
      if (i < m and j < m) then
        if (p = 0) then
          W^{rq}_{ji} := C^{00}_{mm}
        else
          W^{rq}_{ji} := −(−1)^{q+r} m (W^{rq}_{ji} + C^{00}_{mm})
        end if
      end if
      if (i < m) C^{pr}_{im} := C^{pr}_{im} + C^{00}_{mm}
      if (j < m) C^{qp}_{mj} := C^{qp}_{mj} + (−1)^{p+r} C^{00}_{mm}
    end do
    do j = 1, …, m; k = 1, …, m:
      if (j < m or k < m) C^{00}_{mm} := U^{rp}_{jk} B^{qr}_{jk}
      if (j < m and k < m) then
        if (r = 0) then
          Z^{qp}_{kj} := C^{00}_{mm}
        else
          Z^{qp}_{kj} := −(−1)^{p+q} m (Z^{qp}_{kj} + C^{00}_{mm})
        end if
      end if
      if (j < m) W^{rq}_{jm} := W^{rq}_{jm} + C^{00}_{mm}
      if (k < m) W^{pr}_{mk} := W^{pr}_{mk} + (−1)^{q+r} C^{00}_{mm}
    end do
    do k = 1, …, m; i = 1, …, m:
      if (i < m or k < m) C^{00}_{mm} := X^{qr}_{ki} V^{pq}_{ki}
      if (i < m and k < m) then
        if (q = 0) then
          C^{pr}_{ik} := C^{00}_{mm}
        else
          C^{pr}_{ik} := −(−1)^{p+r} m (C^{pr}_{ik} + C^{00}_{mm})
        end if
      end if
      if (k < m) Z^{qp}_{km} := Z^{qp}_{km} + C^{00}_{mm}
      if (i < m) Z^{rq}_{mi} := Z^{rq}_{mi} + (−1)^{p+q} C^{00}_{mm}
    end do
  end do

Step 3:
  do p = 0, 1; q = 0, 1; r = 0, 1:
    do i = 1, …, m; j = 1, …, m; k = 1, …, m:
      if ((i < m, j < m) or (j < m, k < m) or (k < m, i < m)) then
        C^{01}_{mm} := (−1)^r A^{pq}_{ij} + (−1)^q U^{rp}_{jk} + (−1)^p X^{qr}_{ki}
        C^{10}_{mm} := B^{qr}_{jk} + V^{pq}_{ki} + Y^{rp}_{ij}
        C^{00}_{mm} := C^{01}_{mm} C^{10}_{mm}
      end if
      if (i < m and k < m) C^{pr}_{ik} := C^{pr}_{ik} + (−1)^r C^{00}_{mm}
      if (i < m and j < m) W^{rq}_{ji} := W^{rq}_{ji} + (−1)^q C^{00}_{mm}
      if (j < m and k < m) Z^{qp}_{kj} := Z^{qp}_{kj} + (−1)^p C^{00}_{mm}
    end do
  end do

Step 4:
  do p = 0, 1; r = 0, 1:
    do i = 1, …, m − 1; k = 1, …, m − 1:
      C^{pr}_{ik} := C^{pr}_{ik} − C^{pr}_{im} − W^{pr}_{mk}
    end do
  end do
  do q = 0, 1; r = 0, 1:
    do j = 1, …, m − 1; i = 1, …, m − 1:
      W^{rq}_{ji} := W^{rq}_{ji} − W^{rq}_{jm} − Z^{rq}_{mi}
    end do
  end do
  do p = 0, 1; q = 0, 1:
    do k = 1, …, m − 1; j = 1, …, m − 1:
      Z^{qp}_{kj} := Z^{qp}_{kj} − Z^{qp}_{km} − C^{qp}_{mj}
    end do
  end do

We use the bordering rows of the resulting matrices C^{pr}, W^{rq}, Z^{qp} as temporary variables for the accumulation of appropriate sums. The symbol “:=” denotes in-place updating, so our symbols C^{pr}_{ik}, W^{rq}_{ji}, Z^{qp}_{kj} indicate certain storage areas rather than algebraic terms. Obviously, the required memory does not exceed the amount of bordering
introduced for all the input and output matrices. We choose n = 2m − 2 of the order of tens, so this typically comprises only a moderate fraction (not larger than (4n + 4)/n^2) of the total input data volume. We have not commented above on the grouping of matrix products into triples as implied by the recursion. However, for matrix sizes not larger than 10,000, the one-level scheme appears to be most efficient, at least for many modern RISC computers (see Sections 4 and 5). In this case, no grouping by triples is required, whereas grouping into pairs should be done if the 2-procedure instead of the standard MM is used at the inner level; the latter choice seems to be good for very large matrix sizes.

3.5. Scalar multiplications and additions: exact operation count

To show that the algorithm is practically competitive, e.g., with the ones presented in [18,19,25], we should estimate the actual number of linear operations. The number of “linear operations” (i.e., matrix additions, subtractions, and multiplications by the scalars m^{−1} or m, required for performing (3) or for computing the correction terms in (10)–(12), respectively) can be estimated as follows:
• Steps (3)–(8) take 12(5m^2 − 13m + 8) operations applied to input-type blocks.
• Step (9) involves 32(m^3 − 3m + 2) operations applied to input-type blocks.
• Steps (10)–(12) can be performed in 24(m^3 + 2m^2 − 6m + 3) operations applied to output-type blocks.

Substituting 2m = n + 2, one obtains the estimates α_1(n) = 4n^3 + 39n^2 − 18n and α_2(n) = 3n^3 + 30n^2 + 12n for the linear operations performed on the input-type and the output-type blocks, respectively. In Section 3.7, the above formulas are used as the basis for the operation count for a regular level of recursion in the above algorithm.

3.6. An algorithm for a single matrix product

The above procedure can be applied to multiply a single pair of N × N matrices with scalar coefficients, quite similarly to the approach of Section 2.2 (cf. [18]).
Consider the product H = FG of two square N × N matrices. Let N be an integer multiple of 3. Split the columns of F and the rows of G into three equal blocks each, that is,

F = [A X U],   G = [B; Y; V],

where A, X, U and B, Y, V have the sizes N × N/3 and N/3 × N, respectively. Then, by computing

C = AB,   Z = XY,   W = UV,
one obtains the required product as

H = C + Z + W,

and the problem is thus reduced to a triple of disjoint matrix multiplications, followed by a pair of N × N matrix additions. We keep the working memory as small as in Section 3.4 by accumulating all three products simultaneously in the course of the calculations. Indeed, as one can see from the pseudo-code below, after adding the bordering block rows and columns to the input and output matrices, all the subsequent computations can be performed in-place again. Write, as above,

F_{pm+i,qm+j} = F^{pq}_{ij},   G_{qm+j,rm+k} = G^{qr}_{jk},   H_{pm+i,rm+k} = H^{pr}_{ik},

and summarize the main part of the algorithm (performed after completing the bordering) as follows:

Step 1:
  do p = 0, 1; q = 0, 1:
    do i = 1, …, m; j = 1, …, m:
      H^{pq}_{ij} := 0
    end do
  end do

Step 2:
  do p = 0, 1; q = 0, 1; r = 0, 1:
    do i = 1, …, m; j = 1, …, m:
      if (i < m or j < m) H^{00}_{mm} := F^{pq}_{ij} G^{rp}_{2m+i,j}
      if (i < m and j < m) H^{rq}_{ji} := H^{rq}_{ji} − (−1)^{q+r} H^{00}_{mm}
      if (i < m) H^{pr}_{im} := H^{pr}_{im} + H^{00}_{mm}
      if (j < m) H^{qp}_{mj} := H^{qp}_{mj} + (−1)^{p+r} H^{00}_{mm}
    end do
    do j = 1, …, m; k = 1, …, m:
      if (j < m or k < m) H^{00}_{mm} := F^{rp}_{j,m+k} G^{qr}_{jk}
      if (j < m and k < m) H^{qp}_{kj} := H^{qp}_{kj} − (−1)^{p+q} H^{00}_{mm}
      if (j < m) H^{rq}_{jm} := H^{rq}_{jm} + H^{00}_{mm}
      if (k < m) H^{pr}_{mk} := H^{pr}_{mk} + (−1)^{q+r} H^{00}_{mm}
    end do
    do k = 1, …, m; i = 1, …, m:
      if (i < m or k < m) H^{00}_{mm} := F^{qr}_{k,2m+i} G^{pq}_{m+k,i}
      if (i < m and k < m) H^{pr}_{ik} := H^{pr}_{ik} − (−1)^{q+r} H^{00}_{mm}
      if (k < m) H^{qp}_{km} := H^{qp}_{km} + H^{00}_{mm}
      if (i < m) H^{rq}_{mi} := H^{rq}_{mi} + (−1)^{p+q} H^{00}_{mm}
    end do
  end do
Step 3:
  do p = 0, 1; q = 0, 1:
    do i = 1, …, m − 1; j = 1, …, m − 1:
      H^{pq}_{ij} := m H^{pq}_{ij}
    end do
  end do

Step 4:
  do p = 0, 1; q = 0, 1; r = 0, 1:
    do i = 1, …, m; j = 1, …, m; k = 1, …, m:
      if ((i < m, j < m) or (j < m, k < m) or (k < m, i < m)) then
        H^{01}_{mm} := (−1)^r F^{pq}_{ij} + (−1)^q F^{rp}_{j,m+k} + (−1)^p F^{qr}_{k,2m+i}
        H^{10}_{mm} := G^{qr}_{jk} + G^{pq}_{m+k,i} + G^{rp}_{2m+i,j}
        H^{00}_{mm} := H^{01}_{mm} H^{10}_{mm}
      end if
      if (i < m and k < m) H^{pr}_{ik} := H^{pr}_{ik} + (−1)^r H^{00}_{mm}
      if (i < m and j < m) H^{rq}_{ji} := H^{rq}_{ji} + (−1)^q H^{00}_{mm}
      if (j < m and k < m) H^{qp}_{kj} := H^{qp}_{kj} + (−1)^p H^{00}_{mm}
    end do
  end do

Step 5:
  do p = 0, 1; q = 0, 1:
    do i = 1, …, m − 1; j = 1, …, m − 1:
      H^{pq}_{ij} := H^{pq}_{ij} − H^{pq}_{im} − H^{pq}_{mj}
    end do
  end do

Fortunately, the algorithm for a single MM appears to be even more compact than the generic Three Disjoint Product procedure. Here we use 4m^2 redundant additions due to the simplistic initialization at Step 1 above, but we save many scalar multiplications by m, performing them just once at Step 3. The latter algorithm actually presents a procedure for multiplying an n × 3n matrix F by a 3n × n matrix G, and requires μ̃(n) = n^3 + 12n^2 + 24n bilinear multiplications (the same as above) and α̃_1(n) = 4n^3 + 39n^2 − 18n and α̃_2(n) = 3n^3 + 27n^2 + 9n linear operations performed on input-type and output-type blocks, respectively. The above formulas are used in the next section to estimate the complexity of the starting level of recursion in the PK3 algorithm.
Note that in the above algorithm, the preprocessing stage of Section 3.1 is made separately for every n × n block, a triple of which composes F or G. Similarly to Section 3.4, the working memory volume for the above procedure is bounded by the total amount of the bordering blocks introduced at the preprocessing stage, i.e., ((n + 2)^2 − n^2)/n^2 = (4n + 4)/n^2 times the memory occupied by A, B, and C. When the recursive base-n algorithm is applied (see the next subsection), the above quantity should be multiplied by 1 + n^{−2} + ··· + n^{−2k+2} ≤ n^2/(n^2 − 1). For instance, in the case of multiplying N × N matrices (C = A · B, N = n^k l), the working memory volume for the PK3 method is estimated as

W_PK3 ≤ (12/(n − 1)) N^2,

while the Winograd method requires [12]

W_SW ≈ (2/3) N^2
workspace. If one takes, e.g., n = 48, then the workspace for the Winograd method appears to be more than 2.6 times larger than that required by the PK3 method.

3.7. Recursive algorithm and its best-case performance

Let the (block) sizes of all these matrices be n × n. This corresponds to the assumption that N = n^k l, where n is an integer multiple of 6 (as was assumed earlier) and l is an integer multiple of 3, so that each of the matrices A, X, U and B, Y, V is partitioned as a square n × n block matrix composed of l × l/3 and l/3 × l submatrices, respectively (l = N/n). Hence, the above recursion scheme readily applies. Noting that the recursive 3-Procedure and the corresponding PK3 method for square matrix multiplication differ only in their initialization stage, one can formally write

T_PK3(N) = −((3n + 3)/n) N^2 + T_3(N).

Each input-type and output-type linear operation takes (N/n)^2/3 and (N/n)^2 flops, respectively, so the 3-Procedure (i.e., three disjoint products of N × N/3 by N/3 × N matrices) uses

T_3(N) = (α_1(n)/3 + α_2(n)) N^2/n^2 + (μ(n)/3) T_3(N/n)
      = ((13n^3 + 129n^2 + 18n)/3) N^2/n^2 + ((n^3 + 12n^2 + 24n)/3) T_3(N/n)

flops, and T_3(l) = 2l^3 − 3l^2 at the cut-off size l. Applying now the FMM recursion Lemma, one obtains

T_3(N) = (2l^{3−ω} + ((10n^2 + 102n − 54)/(n^2 + 9n + 24)) l^{2−ω}) N^ω − ((13n^2 + 129n + 18)/(n^2 + 9n + 24)) N^2,
and therefore,

T_PK3(N) = (2l^{3−ω} + ((10n^2 + 102n − 54)/(n^2 + 9n + 24)) l^{2−ω}) N^ω − ((16n^3 + 159n^2 + 117n + 72)/(n^3 + 9n^2 + 24n)) N^2,

where

ω = log(n^3/3 + 4n^2 + 8n) / log n.
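As a numerical sanity check of this closed form at the parameters chosen next (n = 48, cut-off l = 18), one can evaluate the coefficients directly (a sketch of ours):

```python
import math

n, l = 48, 18
omega = math.log(n**3 / 3 + 4 * n**2 + 8 * n) / math.log(n)

# coefficients of N^omega and N^2 in T_PK3(N)
c1 = (2 * l**(3 - omega)
      + (10*n**2 + 102*n - 54) / (n**2 + 9*n + 24) * l**(2 - omega))
c2 = (16*n**3 + 159*n**2 + 117*n + 72) / (n**3 + 9*n**2 + 24*n)

# the rational coefficients reduce to 4647/460 and 29743/1840 at n = 48
assert abs((10*n**2 + 102*n - 54) / (n**2 + 9*n + 24) - 4647 / 460) < 1e-12
assert abs(c2 - 29743 / 1840) < 1e-12
assert abs(omega - 2.7760) < 1e-4
assert 4.89 < c1 < 4.90 and 16.16 < c2 < 16.17
```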
To minimize ω, set n = 48 to obtain

T_PK3(N) = (2l^{3−ω} + (4647/460) l^{2−ω}) N^ω − (29743/1840) N^2.

With the optimum l = 18, this yields

T_PK3(N) ≤ 4.894 N^{2.7760} − 16.165 N^2.

With respect to the total operation count, the above PK3 algorithm is quite competitive with Strassen's, for which

T_S(N) ≈ 3.895 N^{2.80736} − 6N^2,

and even with Winograd's one, which has

T_SW(N) ≈ 3.732 N^{2.80736} − 5N^2.

For the reasons quoted above in Remark 1, we refrain from a direct comparison of Strassen-type methods and the PK3 algorithm based on the above best-case operation counts. With respect to the running time, the actual cross-over points for these methods will mostly depend on implementation details and the computational platform rather than on their operation counts. (Of course, for sufficiently large values of N the PK3 algorithm will always run faster, due to its smaller exponent ω = 2.7760.) For certain regular (but “non-optimal”) matrix sizes N and cut-off parameters l, one can find the values of T_SW(N) and T_PK31(N), e.g., in Table 5 below, see Section 5.

3.8. Cross-over point between the PK and SW algorithms

It appears that the total operation count for the FMM algorithms based on the 2- and 3-procedures is comparable with that of the Winograd algorithm for square matrices of order 500 < N < 4640. For larger orders, the new algorithms are slightly better, at least for 4641 ≤ N < 200,000, with just a few marginal exceptions near N = 33,000. The numerical comparison was performed as follows. For an arbitrary N, the operation count for the Winograd algorithm was estimated as

T_SW(N) = min(T_SW^{stat padd}, T_SW^{dyn padd}, T_SW^{stat peel}, T_SW^{dyn peel}),
where
• "stat padd" denotes the odd-size fix-up by "static padding", i.e., by embedding the original N x N matrices into matrices of the (generally larger) size N+ = 2^k l, with subsequent application of the Winograd algorithm. This approach was historically the first [25]. In our calculations, the values of k and l delivering the minimum total operation count were obtained through exhaustive search.
• "dyn padd" denotes the odd-size fix-up by "dynamic padding", i.e., at each recursive step one increases, if necessary, the matrix size by one to make it even. Then the Winograd recursion is applied, and the recursion stops when the operation count attains its minimum. The method of row/column duplication [12] is described in a different way but yields the same operation count.
• "stat peel" denotes the odd-size fix-up by "static peeling", i.e., splitting the original N x N matrices into 2 x 2 block form with the upper left block of size N1 = 2^k l and subsequent application of the Winograd algorithm for the multiplication of such blocks. The rest of the calculations, involving rectangular blocks, is performed by the standard MM algorithm. The values of k and l delivering the minimum total operation count were obtained through exhaustive search.
• "dyn peel" denotes the odd-size fix-up by "dynamic peeling", i.e., at each recursive step one splits, if necessary, the matrix into 2 x 2 block form with a 1 x 1 lower right block, so that the size N - 1 of the upper left block is even. Then the Winograd recursion is applied to the upper left blocks, while the arising matrix-vector and vector-vector operations are performed by the standard algorithm. The recursion stops when the operation count attains its minimum.

While static peeling appears to be always worse than dynamic peeling, there is no clear loser among the remaining three algorithms. On average, dynamic peeling requires fewer (by up to 20%) operations than the padding algorithms in about 85% of the cases.
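The dynamic-padding count can be sketched as a short recursion. This is a hedged illustration, not the author's code: one Winograd step is taken to cost 7 half-size multiplications plus 15 half-size block additions, which is consistent with the closed-form count TSW(2^k l) = (2l^3 + 4l^2) 7^k - 5 l^2 4^k used in Section 4.1, and the recursion stops as soon as a step no longer pays off:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def t_sw_dynpad(n):
    """Operation count for the Winograd recursion with dynamic padding:
    an odd size is grown by one, then one recursion step costs 7
    multiplications plus 15 additions of half-size blocks; recursion
    stops when it no longer reduces the count."""
    standard = 2 * n**3 - n**2          # classical n x n MM count
    if n <= 2:
        return standard
    m = (n + 1) // 2                    # pad odd n to n + 1, then halve
    return min(standard, 7 * t_sw_dynpad(m) + 15 * m * m)
```

For N = 64 this reproduces the static formula with l = 8, k = 3; dynamic peeling admits an analogous recursion with border (matrix-vector) costs in place of padding.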
The operation count for the PK2/PK3 algorithm was estimated as

   TPK(N) = min(TPK22^(stat padd), TPK22^(stat peel), TPK31/21^(stat padd), TPK31/21^(stat peel), TPK32^(stat padd), TPK32^(stat peel)),

where
• "stat padd" and "stat peel" denote the same approaches to the odd-size fix-up as above, but with the regular problem size N = nml instead of N = 2^k l.
• PK22 denotes the two-level MM algorithm which uses the algorithms of Sections 2.2 and 2.1 and the standard procedure at its outer, middle, and inner recursion levels, respectively. The total operation count for the regular case N = nml, with n and l even, is

   TPK22(nml) = ((3n^2 + 8n - 6)/n) N^2 + ((n^2 + 3n - 1)/n) (((m^2 + 2m - 2)/m) N^2 + ((m + 3)/(2mn)) N^3).

• PK31/21 denotes the two-level MM algorithm which uses the algorithms of Sections 3.6 and 2.1 and the standard procedure at its outer, middle, and inner recursion levels, respectively. The total operation count for the regular case N = nml, with n even and l divisible by 3, is

   TPK31/21(nml) = ((13n^2 + 120n + 9)/(3n)) N^2 + ((n^2 + 12n + 24)/n) (((5m^2 + 9m - 10)/(6m)) N^2 + ((m + 3)/(2mn)) N^3).

• PK32 denotes the two-level MM algorithm which uses the algorithms of Sections 3.6 and 3.4 and the standard procedure at its outer, middle, and inner recursion levels, respectively. The total operation count for the regular case N = nml, with n divisible by 6, m even, and l divisible by 3, is

   TPK32(nml) = ((13n^2 + 120n + 9)/(3n)) N^2 + ((n^2 + 12n + 24)/(3n)) (((12m^2 + 117m - 6)/(3m)) N^2 + ((m^2 + 12m + 24)/(3m^2 n)) N^3).
In the above algorithms, the values of n, m, and l for which the total operation count is minimum were obtained through exhaustive search. The values of Nbeg, Nend, and

   K = |{N : Nbeg <= N <= Nend, TSW/TPK > 1}|,
   Rmin = min over Nbeg <= N <= Nend of TSW/TPK,
   Rmax = max over Nbeg <= N <= Nend of TSW/TPK

are given in Table 3. These data confirm that the new algorithms are quite competitive with the Winograd algorithm with respect to the total operation count. Of course, the above two-level algorithms are efficient only for limited values of N. For instance, the obvious PK32/31 or PK33 three-level procedures should be tried as N approaches 200,000.

Remark 6. The multiplicative constant in TPK3(N) becomes somewhat smaller than 4.894 when the algorithm of Section 2.1 is employed instead of the standard MM at the lowest level l. The one-level procedures PK21 and PK31 can be readily implemented in codes running at high Mflops rates in the range 1000 <= N <= 10,000, which may not be the case for the two-level procedures described above. This explains the choice of algorithms for numerical testing in Section 5.

3.9. Estimating numerical stability of the 3-Procedure

As we show in Section 5, the presented matrix multiplication algorithm (similar to the one in [18]) demonstrates very good numerical stability due to the structural advantage given by the "long base" recursions. This is an essential property of the algorithms based on the schemes in [19,20,21], whereas the Strassen-type algorithms use "base two" recursions and are therefore much less numerically stable. The techniques for the estimation of stability of MM algorithms can be found in [10,13,14]. The
Table 3
The ratio R = TSW(N)/TPK(N)

Nbeg     Nend     K       Rmin    Rmax
500      999      1       0.905   1.003
1000     1999     24      0.941   1.012
2000     2999     131     0.943   1.022
3000     3999     822     0.975   1.051
4000     4999     976     0.991   1.056
5000     5999     1000    1.008   1.067
6000     6999     1000    1.013   1.086
7000     7999     1000    1.024   1.078
8000     8999     1000    1.021   1.078
9000     9999     1000    1.021   1.083
10,000   14,999   5000    1.018   1.088
15,000   19,999   5000    1.018   1.076
20,000   24,999   5000    1.015   1.065
25,000   29,999   5000    1.003   1.062
30,000   31,999   2000    1.006   1.064
32,000   32,999   989     0.999   1.044
33,000   34,999   2000    1.006   1.059
35,000   39,999   5000    1.006   1.066
40,000   49,999   10,000  1.006   1.074
50,000   59,999   10,000  1.017   1.091
60,000   69,999   10,000  1.023   1.084
70,000   79,999   10,000  1.015   1.081
80,000   89,999   10,000  1.025   1.079
90,000   99,999   10,000  1.028   1.082
100,000  109,999  10,000  1.037   1.084
110,000  119,999  10,000  1.022   1.085
120,000  129,999  10,000  1.028   1.075
130,000  139,999  10,000  1.015   1.069
140,000  159,999  20,000  1.015   1.066
160,000  199,999  40,000  1.012   1.063
general approach to the theoretical estimation of the error growth factor for the floating-point implementation of such algorithms can be found in [6], where the whole class of Strassen-like algorithms was analyzed. Using the standard techniques [6,13,14] for estimating the numerical error growth for the 3-Procedure, one can obtain the following result (somewhat similar to that presented for the 2-Procedure in [18]). Denote by ε the machine tolerance (usually near 10^-15 and 10^-7 in double and single precision, respectively) and use the matrix norm

   ||S|| = max over i,j of |(S)_{i,j}|;
then the error in the floating-point implementation of the 3-Procedure applied to a triple of N x N/3 by N/3 x N products C0 = A0 B0, W0 = U0 V0, Z0 = X0 Y0, with N = n^(k-1) l, l <= n, and n >= 6, satisfies the bound

   ||fl([C0|W0|Z0]) - [C0|W0|Z0]|| <= O(N^(log(3n^2 + 24n - 52)/log n)) ε ||[A0|U0|X0]|| ||[B0|V0|Y0]|| + O(ε^2).
Here and hereafter, [C0|W0|Z0] denotes the N x N matrix having 1 x 3 block structure, etc. The sketch of the proof is as follows. (We try to stay as close as possible to the analysis of Strassen's algorithm in [13,14].) The floating-point model of scalar additions/subtractions and multiplications is

   fl(a ± b) = a(1 + α) ± b(1 + β),   fl(ab) = ab(1 + γ),

where |α|, |β|, |γ| <= ε. Hereafter, we will use the notation

   Δ(S) = fl(S) - S.

Together with the simple estimate ||Δ(S1 + S2)|| <= 2 ε ||[S1|S2]||, we use its general form

   ||Δ(sum_{i=1}^{J} S_i)|| <= ((J^2 + J - 2)/2) ε ||[S1| ... |SJ]|| + O(ε^2),

valid for arbitrary matrices S1, ..., SJ, as well as the error bound for the standard algorithm applied to the product PQ of an I x J matrix P by a J x K matrix Q:

   ||Δ(PQ)|| <= ((J^2 + 3J - 2)/2) ε ||P|| ||Q|| + O(ε^2).

The latter, taken with I = K = l and J = l/3, yields

   ||Δ([A0 B0 | U0 V0 | X0 Y0])|| <= φ(l) ε ||[A0|U0|X0]|| ||[B0|V0|Y0]|| + O(ε^2)

with

   φ(l) = (l^2 + 9l - 18)/18 <= l^2/9,   l >= 6,

which can be used as the induction basis. The inductive hypothesis of the same form (with φ(l) replaced by φ(N)) is then proved for one recursive step of the algorithm (as specified in Section 3.4 above) with

   φ(N) <= c1 φ(N/n) + c2 n^2 N,

where (under the condition n >= 6) c1 = 3n^2 + 24n - 52 and c2 > 0 is a certain absolute constant.
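The exponent implied by this recursion, together with the corresponding classical error-growth exponents log2(12) and log2(18) of the Strassen and Winograd recursions [6,13], can be evaluated directly. A short sketch (the values 2.322, 3.585, and 4.170 are the ones quoted at the end of this section):

```python
import math

def error_growth_exponent(n):
    """Exponent in the bound phi(N) = O(N^(log(3n^2+24n-52)/log n))
    following from phi(N) <= c1 * phi(N/n) + c2 * n^2 * N with
    c1 = 3n^2 + 24n - 52."""
    return math.log(3 * n**2 + 24 * n - 52) / math.log(n)

# classical error-growth exponents of base-2 Strassen-type recursions
strassen_exponent = math.log2(12)   # about 3.585
winograd_exponent = math.log2(18)   # about 4.170
```

With the "optimum" n = 48 the 3-Procedure exponent is about 2.322, far below the base-2 values.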
The proof is quite similar to [13,14]. Note that the main input to the bound comes from the preprocessing procedure (3)-(8), which maps A0 to A, B0 to B, etc. Indeed, for all blocks of the left and right multiplicands one has, for example (recall that n + 2 = 2m),

   ||A^pq_ij|| <= ((2n - 2)/(n + 2)) ||A0||,    i < m, j < m,
   ||A^pq_ij|| <= (n/(n + 2)) ||A0||,           i < m, j = m,
   ||A^pq_ij|| <= ((n - 1)n/(n + 2)) ||A0||,    i = m, j < m,

and

   ||B^pq_ij|| <= ||B0||,          i < m, j < m,
   ||B^pq_ij|| <= (n/2) ||B0||,    i < m, j = m, or i = m, j < m,

respectively (and the same for the blocks of U, V and X, Y). In order to estimate the value of c1, it suffices to assume that no numerical errors are introduced by the scalar matrix operations at the recursion level, and to account only for the multiplication errors, for instance,

   ||Δ(X^qr_ki V^pq_ki)|| <= ((2n - 2)/(n + 2)) φ(N/n) ε ||X0|| ||V0||,

etc. This results in

   ||Δ((AB)^pr_ik)|| <= Σ_{q=0,1; j<m} ||Δ(M^pqr_ijk)|| + Σ_{q=0,1} ||Δ(M^pqr_imk)||
                        + m Σ_{q=0,1} ( ||Δ(A^pq_im Y^rp_im)|| + ||Δ(U^rp_mk B^qr_mk)|| + ||Δ(X^qr_ki V^pq_ki)|| )
                        <= (n^3 + 12n^2 - 9n - 2) φ(N/n) ε ||[A0|U0|X0]|| ||[B0|V0|Y0]||,

which, after accounting for the norm growth introduced by the preprocessing, leads to the recursion coefficient c1 = 3n^2 + 24n - 52 and hence to

   φ(N) = O(N^(log(3n^2 + 24n - 52)/log n)),

which easily yields the required error estimate.
The obtained error growth estimate is O(N^2) for any fixed k, e.g., for k = 2, which corresponds to the one-level PK31 algorithm discussed later in Sections 4 and 5. Hence, the error growth is asymptotically the same as that of the standard MM method. Even with the "optimum" n = 48, one gets an error growth of only O(N^2.322), which should be compared to the rather disappointing estimates O(N^3.585) and O(N^4.170) for the Strassen and Winograd algorithms, respectively (valid for a fixed size of the innermost matrix multiplications, see [6,13]). The numerical tests given below clearly confirm this theoretical comparison of stability between the Strassen-type algorithms and the new ones.

4. One-level algorithms for medium-size matrices

As follows from considerations regarding the performance of modern RISC computers, it appears that when the matrix size is not too large, say N < 10,000, it makes sense to perform only one step of the recursion and then switch to the standard algorithm. Otherwise, a large number of small subproblems of matrix addition/subtraction and multiplication arises, and they cannot be processed at high Mflops rates. Hence, to multiply two not too large N x N matrices, N = nl, it is enough to apply the procedure of the preceding section for m = (n + 2)/2, with all block multiplications being l x l/3 by l/3 x l ones and performed, e.g., by a properly tuned standard MM routine such as the DMR code [4,11]. The one-level PK21 algorithm was already outlined in Section 2.3. We now consider the PK31 algorithm, where the triple disjoint product procedure is applied once, and then a switch is made to the standard MM. If nearly optimum (from the viewpoint of Section 3.3) values n ≈ 50 are used, then 40 <= l <= 200, which is rather advantageous for attaining sufficiently high performance for MM of sizes 2000 to 10,000 on RISC computers. Next, we present an analysis showing the optimum n which minimizes the total operation count for the two-level method.
As follows from the discussion presented in Section 3.7,

   TPK31(N) = ((13n^2 + 120n + 9)/(3n)) N^2 + ((n^3 + 12n^2 + 24n)/3) (2 (N/n)^3 - 3 (N/n)^2)
            = (2/3 + 8/n + 16/n^2) N^3 + ((10/3) n + 28 - 21/n) N^2

for N = nl. The minimizer of the latter expression is

   n* ≈ sqrt(2.4 N),

and the corresponding operation count is given by

   TPK31*(N) ≈ (2/3) N^3 + 10.33 N^(5/2) + O(N^2).

This minimum is attained at

   l* ≈ sqrt(5N/12),
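The approximation n* ≈ sqrt(2.4 N) can be checked by brute force. A hedged sketch assuming the expanded one-level count TPK31(N) = (2/3 + 8/n + 16/n^2) N^3 + (10n/3 + 28 - 21/n) N^2 as reconstructed from this section (t_pk31 is a hypothetical helper name, not code from the paper):

```python
def t_pk31(N, n):
    """One-level PK31 operation count, expanded form (reconstruction):
    T(N) = (2/3 + 8/n + 16/n^2) N^3 + (10n/3 + 28 - 21/n) N^2."""
    return (2/3 + 8/n + 16/n**2) * N**3 + (10 * n / 3 + 28 - 21 / n) * N**2

N = 4608
n_best = min(range(2, 400), key=lambda n: t_pk31(N, n))   # integer minimizer
n_star = (2.4 * N) ** 0.5                                  # predicted, about 105.2
```

The integer minimizer lands within a few units of sqrt(2.4 N), and the resulting count stays well below the standard 2N^3 - N^2.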
which satisfies 32 <= l <= 62 for 2000 < N < 10,000. Such bounds on l seem rather satisfactory for attaining good Mflops performance. Note also that T as a function of l is very flat to the right of l*, so using a somewhat larger l would only slightly increase the operation count while possibly improving the Mflops rate of the standard MM routine at the inner level considerably. Using larger values of l is also necessary to adjust the algorithm to odd-sized and rectangular input matrices by padding them with zeros; see the next section. The latter operation count should be compared with the related one in Section 2.3,

   TPK21*(N) = N^3 + 4.45 N^(5/2) + O(N^2).

The latter bound is clearly inferior for sufficiently large values of N. However, the advantage of the PK21 algorithm is that it tends to have a larger optimum cut-off level l* ≈ sqrt(2N/3), and therefore may deliver better Mflops performance. It should be noted that for realistic cut-off sizes l and limited values of N, say 500 < N < 18,000, these simple procedures appear to be quite competitive even in operation count with the Strassen-type algorithms. This is demonstrated in the next section, where the operation count of the above-mentioned methods is estimated for an arbitrary value of N.

4.1. The comparison of performance for odd-sized matrices

Consider the case where N is an arbitrary number. Both algorithms of the preceding section can be employed using the bordering technique (also called static padding) described above in Section 3.8. For Winograd's algorithm we find some N+ such that N <= N+ = 2^k l, for which the operation count TSW(N+) is minimum. Similarly, for the one-level algorithm we use N+ such that N <= N+ = nl (with n even and l an integer multiple of 3) for which TPK31(N+) is minimum. To this end, the estimated total number of operations is

   TSW(2^k l) = (2 l^3 + 4 l^2) 7^k - 5 l^2 4^k

and

   TPK31(nl) = ((2n^2 + 24n + 48) l + 10n^2 + 84n - 63) n l^2 / 3.

Then the original matrices were augmented with N+ - N zero rows and columns, and the above-described procedures were applied. The results shown in Fig. 1 (where we give the ratio T(N)/N^2 versus N) confirm our best expectations. Indeed, for all medium-large matrices (1500 < N < 18,000), the one-level PK31 algorithm requires a clearly smaller number of operations, provided that the cut-off size satisfies l >= 72. A similar comparison can be made between the SW algorithm and PK21 (the one-level 2-Procedure, see Section 2.3 above), for which

   TPK21(nl) = ((n^2 + 3n - 1) l + 2n^2 + 5n - 5) n l^2

under the same restriction l >= 72. In this case, one can observe that PK21 has (on average) a better operation count for all matrix sizes in the range 500 <= N <= 2300.
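The static-padding comparison of this subsection can be reproduced by exhaustive search over the admissible regular sizes. A hedged sketch using the two operation-count formulas above, with the cut-off restriction l >= 72 as in Fig. 1 (helper names are illustrative):

```python
from math import ceil

def t_sw_static(N, lmin=72):
    """Best Winograd count over static paddings N+ = 2^k * l >= N with
    cut-off l >= lmin, using T_SW(2^k l) = (2l^3 + 4l^2) 7^k - 5 l^2 4^k."""
    best = None
    for k in range(0, 12):
        l = max(lmin, ceil(N / 2**k))   # smallest admissible cut-off
        t = (2 * l**3 + 4 * l**2) * 7**k - 5 * l**2 * 4**k
        best = t if best is None else min(best, t)
    return best

def t_pk31_static(N, lmin=72):
    """Best one-level PK31 count over static paddings N+ = n * l >= N,
    n even, l a multiple of 3 with l >= lmin, using
    T_PK31(nl) = ((2n^2 + 24n + 48) l + 10n^2 + 84n - 63) n l^2 / 3."""
    best = None
    for n in range(2, 301, 2):
        l = max(lmin, ceil(N / n))
        l += -l % 3                     # round up to a multiple of 3
        t = ((2 * n**2 + 24 * n + 48) * l + 10 * n**2 + 84 * n - 63) * n * l**2 // 3
        best = t if best is None else min(best, t)
    return best
```

For example, at N = 5000 the PK31 count comes out clearly below the best SW count, consistent with the claim above for 1500 < N < 18,000.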
[Fig. 1 here: the ratio (total ops. count)/N^2 versus the matrix size N (0 to 18,000), for cut-off l >= 72; upper curve: standard algorithm, middle: SW algorithm, lower: PK31 algorithm.]

Fig. 1. Standard, Strassen-Winograd, and PK31 operation counts (cut-off bounded: l >= 72).
Note that for N > 18,000 one can switch to two-level algorithms, e.g., PK22 (see [18]) or two-level designs for the 3-Procedure. Recall that imposing a lower bound on the cut-off size l (say, near 72, or even more, as in our numerical experiments) is necessary for attaining a satisfactory Mflops rate on RISC computers.

Remark 7. The peeling techniques described earlier in Section 3.8 can also be used to perform the above comparison. In this case, one can expect somewhat smaller (on average) operation counts; however, the less regular structure of the arising algorithms may deteriorate their Mflops performance.

4.2. Adjustment of fast algorithms for rectangular MM

We mainly follow the bordering techniques outlined in [17,26]. Assume that we are multiplying an N x K matrix A by a K x M matrix B. The design can rely on using either a Strassen-type algorithm for n x n by n x n MM with n = 2^k, k >= 1, or a 2-Procedure related algorithm for n x 2n by 2n x n MM with n >= 4, or a 3-Procedure
Table 4
MM on Pentium III server: double precision

Size  Method     Total ops.  Time (s)  Mflops  Err      Rel. mem.
1152  DGEMM      0.306+10    29.83     102.5   6.77-14  0.0000
1152  DMR        0.306+10    8.57      356.6   8.81-15  0.0031
1152  SW(18)     0.152+10    8.28      183.6   8.30-11  0.6700
1152  SW(36)     0.165+10    7.58      217.7   1.93-11  0.6700
1152  SW(72)     0.184+10    7.66      240.2   4.41-12  0.6700
1152  SW(144)    0.207+10    7.27      284.7   8.82-13  0.6700
1152  SW(288)    0.235+10    7.15      328.7   1.81-13  0.6700
1152  PK21(72)   0.186+10    7.21      258.6   2.27-13  0.2747
1152  PK31(72)   0.198+10    8.99      220.8   8.44-14  0.8260
2304  DGEMM      0.245+11    238.03    102.7   1.72-13  0.0000
2304  DMR        0.245+11    70.12     349.4   1.76-14  0.0008
2304  SW(18)     0.106+11    56.68     187.0   6.91-10  0.6700
2304  SW(36)     0.116+11    51.25     226.3   1.40-10  0.6700
2304  SW(72)     0.129+11    51.26     251.6   3.58-11  0.6700
2304  SW(144)    0.145+11    50.08     289.5   7.78-12  0.6700
2304  SW(288)    0.165+11    49.66     332.3   1.41-12  0.6700
2304  PK21(144)  0.147+11    54.16     271.8   3.27-13  0.2562
2304  PK31(96)   0.131+11    58.46     224.6   1.96-13  0.5281
4608  DGEMM      0.196+12    1911.80   102.3   3.58-13  0.0000
4608  DMR        0.196+12    553.79    353.9   4.60-14  0.0002
4608  SW(18)     0.746+11    386.24    193.1   4.47-09  0.6700
4608  SW(36)     0.810+11    357.49    226.6   1.10-09  0.6700
4608  SW(72)     0.902+11    361.32    249.6   2.25-10  0.6700
4608  SW(144)    0.102+12    334.53    304.9   5.17-11  0.6700
4608  SW(288)    0.115+12    364.77    315.3   1.05-11  0.6700
4608  PK21(144)  0.108+12    408.33    264.5   1.08-12  0.1265
4608  PK31(144)  0.941+11    363.38    259.0   4.72-13  0.3885
related algorithm for n x 3n by 3n x n MM with n = 2k, k >= 4. For simplicity, let us consider the case when the n x 2n by 2n x n algorithm of Section 2.2 is used. Assuming that n is considerably smaller than min(N, K, M), represent the matrix sizes as

   N = n lN - rN,    0 <= rN < n,
   K = 2n lK - rK,   0 <= rK < 2n,
   M = n lM - rM,    0 <= rM < n.

Then we set

   N+ = n lN,   K+ = 2n lK,   M+ = n lM,
Table 5
MM on Pentium III server: single precision

Size  Method     Total ops.  Time (s)  Mflops  Err      Rel. mem.
2304  SGEMM      0.245+11    221.42    110.5   7.76-05  0.0000
2304  DMR        0.245+11    59.81     408.9   1.06-05  0.0008
2304  SW(18)     0.106+11    42.83     250.1   3.45-01  0.6700
2304  SW(36)     0.116+11    39.25     294.5   6.50-02  0.6700
2304  SW(72)     0.129+11    39.44     326.3   1.68-02  0.6700
2304  SW(144)    0.145+11    40.50     358.3   3.90-03  0.6700
2304  SW(288)    0.165+11    43.36     379.9   7.67-04  0.6700
2304  PK21(144)  0.147+11    45.15     326.0   1.45-04  0.2562
2304  PK31(96)   0.131+11    46.45     282.8   1.40-04  0.5281
4608  SGEMM      0.196+12    1741.96   112.3   1.83-04  0.0000
4608  DMR        0.196+12    472.87    413.8   2.87-05  0.0002
4608  SW(18)     0.746+11    306.16    244.6   2.72+00  0.6700
4608  SW(36)     0.810+11    281.43    287.4   5.64-01  0.6700
4608  SW(72)     0.902+11    283.74    317.7   1.26-01  0.6700
4608  SW(144)    0.102+12    292.15    348.0   2.46-02  0.6700
4608  SW(288)    0.115+12    306.75    376.2   5.15-03  0.6700
4608  PK21(144)  0.108+12    336.52    322.4   4.73-04  0.1265
4608  PK31(144)  0.941+11    308.71    304.9   2.51-04  0.3885
6912  SGEMM      0.660+12    5873.83   112.4   1.71-04  0.0000
6912  DMR        0.660+12    1607.36   410.9   8.57-05  0.0001
6912  SW(27)     0.231+12    915.67    252.5   3.05+00  0.6700
6912  SW(54)     0.268+12    844.40    317.4   6.23-01  0.6700
6912  SW(108)    0.302+12    877.34    343.8   1.42-01  0.6700
6912  SW(216)    0.342+12    933.15    366.1   2.77-02  0.6700
6912  PK21(192)  0.361+12    1075.20   336.1   8.70-04  0.1118
6912  PK31(144)  0.286+12    950.61    300.9   4.73-04  0.2560
and note that

   lN < N/n + 1,   lK < K/(2n) + 1,   lM < M/n + 1.

Next, we augment the matrix A by N+ - N null rows and K+ - K null columns to obtain the N+ x K+ matrix A+, and augment the matrix B by K+ - K null rows and M+ - M null columns to obtain the K+ x M+ matrix B+. Finally, we multiply these matrices using the algorithm of Section 2.2 to obtain C+ = A+ B+, and return the first N rows and M columns of C+ as the required product C. Since all the blocks of the so constructed matrices A+ and B+ are of sizes lN x lK and lK x lM, respectively, we have the following estimate in terms of the operation-count coefficient functions of n from Section 2.2:

   TPK21(N, K, M; n) = (a1(n)/2) (lN lK + lK lM) + b1(n) lN lM + c1(n) lN lM (2 lK - 1)
Table 6
MM on SUN workstation: single precision

Size  Method     Total ops.  Time (s)   Mflops  Err      Rel. mem.
2304  SGEMM      0.245+11    455.08     53.7    7.68-05  0.0000
2304  DMR        0.245+11    112.01     218.3   4.77-05  0.0008
2304  SW(36)     0.116+11    86.71      133.3   3.61-01  0.6700
2304  SW(72)     0.129+11    79.11      162.7   1.41-01  0.6700
2304  SW(144)    0.145+11    75.71      191.7   1.97-02  0.6700
2304  SW(216)    0.163+11    77.49      212.6   3.45-03  0.6700
2304  PK21(144)  0.147+11    76.45      192.5   2.14-04  0.2562
2304  PK31(96)   0.131+11    98.58      133.3   1.41-04  0.5281
4608  SGEMM      0.196+12    3647.83    53.6    1.72-04  0.0000
4608  DMR        0.196+12    987.23     198.2   3.81-05  0.0002
4608  SW(36)     0.810+11    639.30     126.5   2.90+00  0.6700
4608  SW(72)     0.902+11    575.50     156.6   1.08+00  0.6700
4608  SW(144)    0.102+12    557.50     182.4   1.49-01  0.6700
4608  SW(216)    0.115+12    548.28     210.5   3.61-02  0.6700
4608  PK21(144)  0.108+12    570.62     190.1   6.63-04  0.1265
4608  PK31(144)  0.941+11    597.62     157.5   2.52-04  0.3886
6912  SGEMM      0.660+12    12332.00   53.5    3.85-04  0.0000
6912  DMR        0.660+12    3143.21    210.1   6.87-05  0.0001
6912  SW(27)     0.231+12    2280.88    101.4   1.87+01  0.6700
6912  SW(54)     0.268+12    1774.05    151.1   6.47+00  0.6700
6912  SW(108)    0.302+12    1646.10    183.2   1.40+00  0.6700
6912  SW(216)    0.342+12    1720.38    198.6   2.46-01  0.6700
6912  PK21(192)  0.361+12    1757.01    205.7   1.11-03  0.1118
6912  PK31(144)  0.286+12    1814.96    157.6   4.73-04  0.2560
   < (n^3 + 3n^2 - 2n) (N/n + 1)(K/(2n) + 1)
     + (n^3 + 3n^2 - 2n) (K/(2n) + 1)(M/n + 1)
     + (2n^3 + 5n^2 - 4n) (N/n + 1)(M/n + 1)
     + (n^3 + 3n^2 - n) (N/n + 1)(M/n + 1)(K/n + 1)
   = (1 + 3/n - 1/n^2) NKM + ((3n^2 + 9n - 4)/(2n)) (NK + KM) + ((3n^2 + 8n - 5)/n) NM
     + (4n^2 + 11n - 7)(N + M) + (2n^2 + 6n - 3) K + 5n^3 + 14n^2 - 9n
   = NKM (1 + 3/n + 3n (1/(2M) + 1/(2N) + 1/K)) + O(NK + KM + NM).
Hence, if min(N, K, M) is large and n is chosen as

   n* ≈ sqrt(2NKM / (NK + KM + 2NM)),

then the operation count is almost half that of the standard algorithm's 2NKM - NM:

   TPK21(N, K, M; n*) = NKM (1 + 3 sqrt(2/M + 2/N + 4/K)) + O(NK + KM + NM).
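The bordering set-up can be sketched in a few lines (hypothetical helper name; rounding n* to an even value is an illustrative choice matching the evenness assumed for the 2-Procedure base size):

```python
from math import ceil, sqrt

def pk21_rect_setup(N, K, M):
    """Bordering set-up for an N x K by K x M product with the
    2-Procedure: pick n near the optimizer sqrt(2NKM/(NK+KM+2NM)),
    then pad the sizes to N+ = n*lN, K+ = 2n*lK, M+ = n*lM."""
    n = max(2, 2 * round(sqrt(2 * N * K * M / (N * K + K * M + 2 * N * M)) / 2))
    lN, lK, lM = ceil(N / n), ceil(K / (2 * n)), ceil(M / n)
    return n, n * lN, 2 * n * lK, n * lM

n, Np, Kp, Mp = pk21_rect_setup(3000, 3000, 3000)   # square case: n near sqrt(N/2)
```

In the square case N = K = M the formula reduces to n* ≈ sqrt(N/2), and the padding overhead stays below one block in each dimension.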
If N ≈ K ≈ M, then we still have n = O(N^(1/2)), and therefore the cut-off levels are again O(N^(1/2)), but with a somewhat larger constant, which even gives some additional advantage in improving Mflops performance on RISC computers; see the next section.

5. Numerical results

For the numerical tests we used a server installed at GC CUNY with two Pentium III XEON 733 MHz processors, 1 GB ECC RAM, and 50 GB RAID 5 storage. The operating system is RedHat Linux 7.2; the tests were run on a single processor. The object code was compiled using the command line "g77 -O3 -funroll-loops *.f". Another set of test runs was performed on a single processor of a multiprocessor high-performance SUN workstation under UNIX. In this case the codes were compiled with the command line "f77 -O4 -native -dalign -fsimple=1 *.f". We used the matrix-matrix multiplication Fortran routine DMR [11] as the lowest-level procedure for fast matrix multiplication (both in the Strassen-Winograd and the PK2/PK3 codes), as well as the benchmark code implementing the standard O(N^3) algorithm. DMR is a public-domain code optimized for the IBM RS6000 architecture and based on the use of blocked and unrolled matrix-matrix multiplication. The source code of DMR can be downloaded from http://www.netlib.org/blas/dmr. We also give a comparison with the "plain" MM routines DGEMM/SGEMM downloaded from the same NETLIB/BLAS website. Note that the Mflops rates for the latter codes are almost four times worse than those attained by DMR.

Remark 8. If available, one should also try vendor BLAS utilities to implement the lowest-level matrix multiplications, as well as the matrix additions, etc. Another possibility for choosing the building blocks of the fast algorithms is the automatically tuned library ATLAS [28], which possesses both good portability and high performance.

The test problem C = AB was chosen with

   A = I + u v^T,   B = I - u v^T / (1 + v^T u),   C = I,
[Fig. 2 here: log-log plots of running time (s) and floating-point error versus matrix size for DGEMM, DMR, SW, PK21, and PK31 on the Pentium III server in double precision.]

Fig. 2. MM on Pentium III server in double precision: running time (left) and floating-point error (right).
[Fig. 3 here: log-log plots of running time (s) and floating-point error versus matrix size for SGEMM, DMR, SW, PK21, and PK31 on the Pentium III server in single precision.]

Fig. 3. MM on Pentium III server in single precision: running time (left) and floating-point error (right).
[Fig. 4 here: log-log plots of running time (s) and floating-point error versus matrix size for SGEMM, DMR, SW, PK21, and PK31 on the SUN workstation in single precision.]

Fig. 4. MM on SUN workstation in single precision: running time (left) and floating-point error (right).
where the vectors u and v were specified by

   u_i = 1/(N + 1 - i),   v_i = sqrt(i),   i = 1, ..., N.

Therefore, the computational error was measured as

   Err = max over i,j of |(fl(C))_{i,j} - δ_{i,j}|,

where fl(C) denotes the product computed in the double-precision floating-point arithmetic and δ_{i,j} stands for the Kronecker delta. In Tables 4-6 we display (for several matrix sizes N = 2^n 3^m) the total operation count, the CPU time in seconds, the performance in megaflops, the floating-point error as defined above, and the memory volume in float words per N^2. The SW method and our one-level 2-Procedure and 3-Procedure based methods with cut-off level l = 2^p 3^q are denoted here SW(l), PK21(l) with l ≈ 2.5 sqrt(N), and PK31(l) with l ≈ 2 sqrt(N), respectively. We did not actually run SGEMM with N = 6912 on the SUN workstation; extrapolated data are given instead. The results show that the new algorithms are quite competitive with the SW algorithm with respect to the total operation count and, at the same time, provide a dramatic improvement in the precision of the floating-point result.

Remark 9. An unexpected observation is that the running time of our SW routine decreases as the total operation count increases. This effect definitely suggests that local data processing (within CPU/registers/cache) is many times faster than main memory addressing. Therefore, the elapsed time depends on the number of main memory references rather than on the arithmetic operation count (cf. [11]). This also applies to the PKt1(l) codes, t = 2, 3, where the numerous l x l/t, l/t x l, and l x l matrix additions take a relatively large fraction of the time (running at about 50 Mflops) as compared to the roughly 3t - 2 times smaller number of l x l/t by l/t x l matrix multiplications (running at over 350 Mflops).

In Figs. 2-4, some data from these tables are visualized to show the computing time and the floating-point error versus the matrix size in log-log scale. The cut-off size was chosen as l = 72 for the Strassen-Winograd algorithm, except for the case N = 6912, where l = 54. In Fig. 5, the ratio of computing times TPK1(N)/TDMR(N) for the Pentium III is shown for all N = 1000, 1001, ..., 4000. It is seen that this ratio approaches its limiting value 1/2 as the matrix size increases. Here we used the simple bordering approach described in Section 4.1. It can be seen that, despite somewhat lower Mflops rates, the PK2 and, especially, PK3 methods make it possible to perform matrix multiplication up to 1.7 times faster even when compared to one of the fastest available Fortran codes, the DMR routine. The Strassen-Winograd algorithm appears to be somewhat faster than PK21 and PK31 (mainly because of better Mflops performance due to a smaller percentage of matrix addition calls), but its numerical accuracy is several orders of magnitude worse exactly in the cases when the operation count is minimum.
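The test problem above is the Sherman-Morrison identity: (I + u v^T)(I - u v^T/(1 + v^T u)) = I exactly, so any deviation of the computed product from the identity matrix is pure rounding error. A small sketch in Python (plain lists rather than the paper's Fortran; sherman_morrison_error is a hypothetical name):

```python
from math import sqrt

def sherman_morrison_error(N):
    """Build A = I + u v^T and B = I - u v^T/(1 + v^T u) with
    u_i = 1/(N+1-i), v_i = sqrt(i); exactly A B = I, so
    Err = max |C_ij - delta_ij| measures pure floating-point error."""
    u = [1.0 / (N + 1 - i) for i in range(1, N + 1)]
    v = [sqrt(i) for i in range(1, N + 1)]
    c = sum(ui * vi for ui, vi in zip(u, v))            # v^T u
    A = [[(1.0 if i == j else 0.0) + u[i] * v[j] for j in range(N)]
         for i in range(N)]
    B = [[(1.0 if i == j else 0.0) - u[i] * v[j] / (1.0 + c) for j in range(N)]
         for i in range(N)]
    # standard triple-loop product C = A B, then deviation from I
    C = [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)]
         for i in range(N)]
    return max(abs(C[i][j] - (1.0 if i == j else 0.0))
               for i in range(N) for j in range(N))
```

For small N this error sits near the double-precision roundoff level, mirroring the Err column of Tables 4-6.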
[Fig. 5 here: the runtime reduction ratio versus matrix size N, 1000 <= N <= 4000, ranging from about 0.66 to 0.86.]

Fig. 5. Running time reduction with PK1 compared to DMR on Pentium III: double precision.
Remark 10. With respect to the estimated operation count, it should be noted that Table 3 in Section 3.8 and Tables 4-6 report on different PK-type methods. This explains, for instance, why the best operation count for SW is smaller than that of PK for N = 6912. In Table 3, the size N = 6912 is treated with the PK31/21 algorithm (i.e., the two-level scheme with N = nml, n = 48, m = 12, l = 12), which is chosen there to optimize the operation count. Tables 5 and 6 report the case N = 6912 for the PK31 algorithm (i.e., the one-level scheme with N = nl, n = 48, l = 144), which was chosen for the actual computer implementation to provide a better compromise between operation count and Mflops rate.
6. Conclusions

In this paper, a new class of practically applicable fast matrix multiplication algorithms was described which is quite competitive with the Strassen and Winograd methods with respect to the total arithmetic cost. At the same time, the new algorithms are considerably more numerically stable, take much less working storage, and have a clear and flexible structure that makes them rather appealing for implementation on computers with memory hierarchy and/or parallel processing.
Acknowledgements

This work has been supported by NSF grant CCR-9732206. The author would like to acknowledge the helpful comments of Prof. V.Y. Pan on the subject of the paper and the assistance of A. Koukinova in performing the experiments shown in Fig. 5. The author also acknowledges the many helpful comments of the anonymous referees, which made it possible to considerably improve the presentation of the paper.

References

[1] A.V. Aho, J.E. Hopcroft, J.D. Ullman, The Design and Analysis of Computer Algorithms, Addison-Wesley, Reading, MA, 1974.
[2] T. Akutsu, S. Miyano, S. Kuhara, Algorithms for identifying Boolean networks and related biological networks based on matrix multiplication and fingerprint function, J. Comput. Biol. 7 (2000) 331-343.
[3] D.H. Bailey, Extra high speed matrix multiplication on the Cray-2, SIAM J. Sci. Stat. Comput. 9 (3) (1988) 603-607.
[4] Basic Linear Algebra Subprograms, http://www.netlib.org/blas/dmr.
[5] T. Biedl, B. Brejova, E.D. Demaine, A.M. Hamel, T. Vinar, Optimal arrangement of leaves in the tree representing hierarchical clustering of gene expression data, Tech. Report 2001-14, Department of Computer Science, University of Waterloo, 2001, 12 pp.
[6] D. Bini, G. Lotti, Stability of fast algorithms for matrix multiplication, Numer. Math. 36 (1980) 63-72.
[7] R.P. Brent, Algorithms for matrix multiplication, Report TR-CS-70-157, Department of Computer Science, Stanford University, March 1970, 52 pp.
[8] E. Cohen, D. Lewis, Approximating matrix multiplication for pattern recognition tasks (special issue of selected papers from SODA'97), J. Algorithms 30 (1999) 211-252.
[9] D. Coppersmith, S. Winograd, Matrix multiplication via arithmetic progressions, J. Symbolic Comput. 9 (3) (1990) 251-280.
[10] J.W. Demmel, N.J. Higham, Stability of block algorithms with fast level-3 BLAS, ACM Trans. Math. Softw. 18 (3) (1992) 274-291.
[11] J.J. Dongarra, P. Mayes, G. Radicati di Brozolo, The IBM RISC System/6000 and linear algebra operations, Tech. Report CS-90-122, Computer Science Department, University of Tennessee, 1990.
[12] C.C. Douglas, M. Heroux, G. Slishman, R.M. Smith, GEMMW: a portable Level 3 BLAS Winograd variant of Strassen's matrix-matrix multiply algorithm, J. Comput. Phys. 110 (1994) 1-10.
[13] N. Higham, Exploiting fast matrix multiplication within the Level 3 BLAS, ACM Trans. Math. Softw. 16 (4) (1990) 352-368.
[14] N. Higham, Accuracy and Stability of Numerical Algorithms, SIAM Publications, Philadelphia, 1996.
[15] J.E. Hopcroft, J.D. Ullman, Introduction to Automata Theory, Languages, and Computation, Addison-Wesley, Reading, MA, 1979.
[16] X. Huang, V.Y. Pan, Fast rectangular matrix multiplication and applications, J. Complexity 14 (1998) 257-299.
[17] S. Huss-Lederman, E.M. Jacobson, J.R. Johnson, A. Tsao, T. Turnbull, A portable implementation of Strassen's algorithm (DGEFMM User's Guide), Computer Science Department, University of Wisconsin-Madison, Madison, WI, November 12, 1996.
[18] I. Kaporin, A practical algorithm for faster matrix multiplication, Numer. Linear Algebra Appl. 6 (1999) 687-700.
[19] J. Laderman, V.Y. Pan, X.-H. Sha, On practical algorithms for accelerated matrix multiplication, Linear Algebra Appl. 162-164 (1992) 557-588.
[20] V.Y. Pan, Computation schemes for a product of matrices and for the inverse matrix (in Russian), Uspekhi Mat. Nauk 27 (5) (1972) 249-250.
Theoretical Computer Science 315 (2004) 511 – 523
www.elsevier.com/locate/tcs
Fast inversion of triangular Toeplitz matrices

Fu-Rong Lin^{a,∗,1}, Wai-Ki Ching^{b,2}, Michael K. Ng^{b,3}

a Department of Mathematics, Shantou University, Shantou, Guangdong 515063, PR China
b Department of Mathematics, The University of Hong Kong, Pokfulam Road, Hong Kong
Abstract

In this paper, we present an approximate inversion method for triangular Toeplitz matrices based on trigonometric polynomial interpolation. To obtain an approximate inverse of high accuracy for a triangular Toeplitz matrix of size n, our algorithm requires two fast Fourier transforms (FFTs) and one fast cosine transform of 2n-vectors. We then revise the approximate method proposed by Bini (SIAM J. Comput. 13 (1984) 268). The complexity of the revised Bini algorithm is two FFTs of 2n-vectors.
© 2004 Published by Elsevier B.V.

Keywords: Triangular Toeplitz matrix; Interpolation; Fast Fourier transform; Fast cosine transform
1. Introduction

In this paper, we introduce a method for fast inversion of a lower triangular Toeplitz matrix
$$T_n = \begin{pmatrix} t_0 & & & \\ t_1 & t_0 & & \\ \vdots & \ddots & \ddots & \\ t_{n-1} & \cdots & t_1 & t_0 \end{pmatrix},$$
∗ Corresponding author. Department of Mathematics, Shantou University, Shantou, Guangdong 515063, PR China. E-mail addresses: [email protected] (F.-R. Lin), [email protected] (W.-K. Ching), [email protected] (M.K. Ng). 1 Research supported by the Natural Science Foundation of China No. 19901017. 2 Research supported in part by RGC Grant No. HKU 7126/02P and HKU CRCG Grant No. 10203919. 3 Research supported in part by Hong Kong Research Grants Council Grant Nos. HKU 7132/00P and 7130/02P, and HKU CRCG Grant Nos. 10203501, 10203907 and 10204437.
0304-3975/$ - see front matter © 2004 Published by Elsevier B.V.
doi:10.1016/j.tcs.2004.01.005
where $t_j$, $j = 0, 1, \ldots, n-1$, are real with $t_0 \neq 0$. Applications of triangular Toeplitz matrix inversion include the Gauss–Seidel iteration for Toeplitz systems, which can also be used as a smoother in multigrid methods. We remark that Toeplitz systems arise in a variety of applications in mathematics and engineering, e.g. signal and image processing [5].

There are two types of inversion methods for triangular Toeplitz matrices: exact inversion (see for instance [3,7]) and approximate inversion (see for instance [2,9,11]). A simple method for obtaining the inverse of a triangular Toeplitz matrix is forward substitution, which requires about $n(n+1)$ arithmetic operations. Another exact method is the divide-and-conquer method, which requires $O(n \log n)$ operations; more precisely, it requires about 10 fast Fourier transforms (FFTs) of $n$-vectors (FFT($n$)) [7].

Bini [2] proposed an approximate method for the fast inversion of triangular Toeplitz matrices. Let $H_n = [h_{jk}]_{j,k=1}^n$, where all entries of $H_n$ are zero except that $h_{j+1,j} = 1$ for $j = 1, \ldots, n-1$. We see that
$$T_n = \sum_{j=0}^{n-1} t_j H_n^j.$$
His basic idea is to approximate $H_n$ by $H_n^{(\epsilon)} = [h_{ij}^{(\epsilon)}]_{i,j=1}^n$, where $h_{ij}^{(\epsilon)} = h_{ij}$ when $(i,j) \neq (1,n)$ and $h_{1n}^{(\epsilon)} = \epsilon^n$. Here $\epsilon$ is a small positive number. Let
$$D_n^{(\epsilon)} = \mathrm{diag}(1, \epsilon, \ldots, \epsilon^{n-1}).$$
It is easy to check that $D_n^{(\epsilon)} H_n^{(\epsilon)} (D_n^{(\epsilon)})^{-1}$ can be diagonalized by the Fourier matrix; it follows that the inverse of
$$T_n^{(\epsilon)} = \sum_{j=0}^{n-1} t_j (H_n^{(\epsilon)})^j$$
can be computed efficiently by using FFTs. We note that since we only require $\epsilon^n$ to be small enough, the value of $\epsilon$ can be chosen to be a number that is close to one. According to our preliminary numerical tests, $\epsilon = (0.5 \cdot 10^{-8})^{1/n}$ is a good choice for his method.

The main contribution of this paper is an approximate inversion method for triangular Toeplitz matrices based on trigonometric polynomial interpolation. To obtain an approximate inverse of high accuracy for a triangular Toeplitz matrix of size $n$-by-$n$, our algorithm requires two FFTs and one fast cosine transform (DCT) of $2n$-vectors. We then revise the approximate method proposed by Bini [2] to obtain the inverse of a triangular Toeplitz matrix with higher accuracy. Our algorithms are numerical, but their applications include polynomial division, which is a fundamental problem of computer algebra and which happens to be essentially equivalent to triangular Toeplitz matrix inversion [3]. Both of our algorithms are technically similar to the algorithms proposed in [8] (see also [9]) for polynomial division.
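The forward-substitution baseline mentioned above is worth keeping in mind: solving $T_n b = e_1$ row by row yields the first column of $T_n^{-1}$, which by the Toeplitz structure determines the whole inverse. A minimal sketch (our own illustration with a hypothetical helper name, not code from the paper):

```python
def first_column_of_inverse(t):
    """First column b of T_n^{-1} by forward substitution, where t is the
    first column of the lower triangular Toeplitz matrix T_n (t[0] != 0).
    Solves T_n b = e_1 in about n(n+1) arithmetic operations."""
    n = len(t)
    b = [0.0] * n
    b[0] = 1.0 / t[0]
    for k in range(1, n):
        # Row k of T_n b = e_1:  sum_{j=0..k} t[k-j] * b[j] = 0
        b[k] = -sum(t[k - j] * b[j] for j in range(k)) / t[0]
    return b
```

For the example used later in the paper ($t_0 = 1$, $t_1 = -1$, $t_k = 0$ otherwise) this returns the all-ones vector, matching $b_k = 1$.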
The outline of the paper is as follows. In Section 2, we present our approximate inversion method based on trigonometric polynomial interpolation. In Section 3, we revise the approximate method proposed by Bini [2]. We also discuss the relationship between Bini's algorithm, our algorithms and some other known algorithms. In Section 4, we show some numerical examples. Finally, a summary is given in Section 5 to conclude the paper.

2. Inversion by interpolation

We first state some properties of a lower triangular Toeplitz matrix.

Property 1. For a lower triangular Toeplitz matrix $T_n$, define the polynomial
$$\phi_n(z) = \sum_{k=0}^{n-1} t_k z^k. \qquad (1)$$
Let the Maclaurin series of $\phi_n^{-1}(z)$ be given by
$$\phi_n^{-1}(z) = \sum_{k=0}^{\infty} b_k z^k; \qquad (2)$$
then
$$T_n^{-1} = \begin{pmatrix} b_0 & & & \\ b_1 & b_0 & & \\ \vdots & \ddots & \ddots & \\ b_{n-1} & \cdots & b_1 & b_0 \end{pmatrix}. \qquad (3)$$
Thus, in order to obtain $T_n^{-1}$, we only need to compute the coefficients $b_k$ for $k = 0, 1, \ldots, n-1$.

Property 2. Replacing $z$ in (1) and (2) by $\epsilon z$, we get
$$\phi_{n,\epsilon}(z) = \phi_n(\epsilon z) = \sum_{k=0}^{n-1} (t_k \epsilon^k) z^k \quad \text{and} \quad \phi_{n,\epsilon}^{-1}(z) = \phi_n^{-1}(\epsilon z) = \sum_{k=0}^{\infty} (b_k \epsilon^k) z^k.$$
Equivalently, we have
$$\begin{pmatrix} t_0 & & & \\ \epsilon t_1 & t_0 & & \\ \vdots & \ddots & \ddots & \\ \epsilon^{n-1} t_{n-1} & \cdots & \epsilon t_1 & t_0 \end{pmatrix}^{-1} = \begin{pmatrix} b_0 & & & \\ \epsilon b_1 & b_0 & & \\ \vdots & \ddots & \ddots & \\ \epsilon^{n-1} b_{n-1} & \cdots & \epsilon b_1 & b_0 \end{pmatrix}.$$
We note that we can choose $\epsilon \in (0, 1)$ such that
$$\sum_{k=0}^{\infty} |b_k \epsilon^k| < \infty. \qquad (4)$$
Property 3. The inverse of the leading principal sub-matrix $T_n(1{:}p, 1{:}p)$ is equal to the leading principal sub-matrix $T_n^{-1}(1{:}p, 1{:}p)$ for $1 \le p \le n$.

2.1. Approximate method

In this subsection, we present our algorithm. By replacing $z$ by $e^{-ix}$, where $i = \sqrt{-1}$ and $x$ is a real variable, we see that $\phi_n(e^{-ix})$ is a trigonometric polynomial. Therefore, one possible way to obtain $b_k$ is to compute the Fourier coefficients of $1/\phi_n(e^{-ix})$:
$$b_k = \frac{1}{2\pi} \int_0^{2\pi} \frac{1}{\phi_n(e^{-ix})}\, e^{ikx}\, dx \quad \text{for } k = 0, 1, \ldots, n-1. \qquad (5)$$
Unfortunately, it is difficult to obtain $b_k$ accurately by using the above formula since the explicit form of $\phi_n^{-1}$ is generally unknown. Therefore, we consider interpolating $\phi_n^{-1}$, defined as in (2), by a trigonometric polynomial. We note that it is difficult to compute $b_k$ by algebraic polynomial interpolation since high-order Vandermonde matrices are very ill-conditioned. We let
$$\phi(\theta) \equiv \phi_{n,\epsilon}(e^{-i\theta}) = \sum_{k=0}^{n-1} (t_k \epsilon^k) e^{-ik\theta} = \phi^{(r)}(\theta) + i\phi^{(i)}(\theta), \qquad (6)$$
where $\phi^{(r)}(\theta)$ and $\phi^{(i)}(\theta)$ are the real and imaginary parts of $\phi(\theta)$, respectively. Similarly, by using (2), we have
$$h(\theta) \equiv \phi^{-1}(\theta) = \sum_{k=0}^{\infty} (b_k \epsilon^k) e^{-ik\theta}.$$
In particular, the real part of $h(\theta)$ is given by
$$h^{(r)}(\theta) = \sum_{k=0}^{\infty} (b_k \epsilon^k) \cos(k\theta) = \frac{\phi^{(r)}(\theta)}{(\phi^{(r)}(\theta))^2 + (\phi^{(i)}(\theta))^2}. \qquad (7)$$
Obviously, $h^{(r)}(\theta)$ is a $2\pi$-periodic even function. To obtain approximate values $\hat{b}_k$ for $b_k$ ($k = 0, 1, \ldots, n-1$), we interpolate $h^{(r)}$ by a function in $\mathcal{T}_{n-1}$, where $\mathcal{T}_m$ denotes the set of all even trigonometric polynomials of degree $\le m$. We use the equidistant points
$$\theta_k = \frac{(2k-1)\pi}{2n}, \quad k = 1, 2, \ldots, n,$$
as the interpolating knots. The advantages of using these knots are that $\hat{b}_k$ can be obtained efficiently by using the DCT and that the interpolating trigonometric polynomial approximates the original function accurately. Let
$$P_{n-1}(\theta) = \sum_{k=0}^{n-1} c_k \cos k\theta$$
be the corresponding interpolating polynomial for $h^{(r)}(\theta)$. By using the interpolating conditions $P_{n-1}(\theta_k) = h^{(r)}(\theta_k)$, $k = 1, \ldots, n$, we have
$$C (c_0, c_1, \ldots, c_{n-1})^t = (h^{(r)}(\theta_1), h^{(r)}(\theta_2), \ldots, h^{(r)}(\theta_n))^t, \qquad (8)$$
where
$$[C]_{j,k} = \cos\frac{(k-1)(2j-1)\pi}{2n}.$$
Noting that $C = (C^c)^t D$, where $C^c$ is the discrete cosine transform matrix and
$$D = \mathrm{diag}\!\left(\sqrt{n},\ \sqrt{\frac{n}{2}}\, I_{n-1}\right),$$
we see that if the values of $h^{(r)}(\theta_k)$, $k = 1, 2, \ldots, n$, are known, then the $c_k$ can be obtained by using one DCT of an $n$-vector (DCT($n$)). Finally, $\hat{b}_k$ can be obtained in $O(n)$ divisions by making use of
$$\hat{b}_k = c_k \epsilon^{-k}, \quad k = 0, 1, \ldots, n-1.$$
Regarding the accuracy of the interpolating polynomial $P_{n-1}$, we have
$$\|P_{n-1} - h^{(r)}\| \le \left(2 + \frac{2}{\pi}\log n\right) E_{n-1}(h^{(r)}),$$
where $\|\cdot\|$ denotes the supremum norm and
$$E_m(h^{(r)}) = \min_{P \in \mathcal{T}_m} \|P - h^{(r)}\|$$
is the error of the best approximation in $\mathcal{T}_m$; see for instance [4,10]. By Jackson's theorem, if $f$ possesses a continuous $p$th derivative, then
$$E_{m-1}(f) \le \left(\frac{\pi}{2}\right)^p \left(\frac{1}{m}\right)^p \|f^{(p)}\|; \qquad (9)$$
see [6] for instance. For the accuracy of $\hat{b}_k$, $k = 0, 1, \ldots, n-1$, we have the following theorem.

Theorem 1. Let the Maclaurin series of $\phi_n^{-1}(\epsilon z)$ be given by $\sum_{k=0}^{\infty} (b_k \epsilon^k) z^k$ and let $\epsilon \in (0,1)$ be such that $\sum_{k=0}^{\infty} |b_k \epsilon^k| < \infty$. Let
$$P_{n-1}(\theta) = \sum_{j=0}^{n-1} c_j \cos j\theta$$
be the interpolating polynomial for $h^{(r)}(\theta)$ with interpolating knots $\theta_k = (2k-1)\pi/(2n)$, $k = 1, 2, \ldots, n$, and let $\hat{b}_j = c_j \epsilon^{-j}$, $j = 0, 1, \ldots, n-1$. Then we have
$$\hat{b}_0 = b_0 + \sum_{m=1}^{\infty} (-1)^m \epsilon^{2mn} b_{2mn},$$
$$\hat{b}_j = b_j + \sum_{m=1}^{\infty} (-1)^m \left(\epsilon^{2mn} b_{2mn+j} + \epsilon^{2(mn-j)} b_{2mn-j}\right) \quad (j = 1, 2, \ldots, n-1). \qquad (10)$$

Proof. Let us consider the interpolating polynomial of $\cos j\theta$ with $\theta_k$ ($k = 1, 2, \ldots, n$) as interpolating knots, where $j \ge 0$ is an integer. Let $m$ and $j$ be integers; we have that for $k = 1, 2, \ldots, n$,
$$\cos((2mn \pm j)\theta_k) = \cos\!\left((2mn \pm j)\frac{(2k-1)\pi}{2n}\right) = (-1)^m \cos\frac{(2k-1)j\pi}{2n}.$$
In particular, we have $\cos(2mn\theta_k) = (-1)^m$ and $\cos((2m+1)n\theta_k) = 0$. Now we see that the interpolating polynomials of $\cos(2mn\theta)$, $\cos((2m+1)n\theta)$ and $\cos((2mn \pm j)\theta)$ ($1 \le j \le n-1$) are $(-1)^m$, $0$, and $(-1)^m \cos(j\theta)$, respectively. It follows that
$$c_0 = b_0 + \sum_{m=1}^{\infty} (-1)^m \epsilon^{2mn} b_{2mn},$$
$$c_j = b_j \epsilon^j + \epsilon^j \sum_{m=1}^{\infty} (-1)^m \left(\epsilon^{2mn} b_{2mn+j} + \epsilon^{2(mn-j)} b_{2mn-j}\right), \quad j = 1, 2, \ldots, n-1.$$
Using $c_j = \hat{b}_j \epsilon^j$, (10) follows.

It follows from (10) that the smaller the value of $\epsilon$, the more accurate the approximate inverse $(\hat{b}_j$, $j = 0, 1, \ldots, n-1)$ will be. Discarding the terms containing the factor $\epsilon^{2n}$ in (10) (we can choose $\epsilon$ such that $\epsilon^{2n}$ is very close to zero), we get
$$\hat{b}_0 \approx b_0, \qquad \hat{b}_j \approx b_j - \epsilon^{2(n-j)} b_{2n-j} = b_j - \epsilon^{2(n-j)} b_{n+(n-j)}, \quad j = 1, 2, \ldots, n-1.$$
Numerically, we cannot set $\epsilon$ too small; otherwise the computation of $c_j/\epsilon^j$ brings in very large rounding errors for large $j$. For $\epsilon$ close to 1, $\hat{b}_j$ can be very accurate for small $j$ since $\epsilon^{2(n-j)}$ is very close to zero. However, $\hat{b}_j$ may not be accurate for $j$ close to $n$, e.g. $\hat{b}_{n-1} - b_{n-1} \approx -\epsilon^2 b_{n+1}$. In addition, rounding errors make the numerical results less accurate. To illustrate the results of Theorem 1, we plot in Fig. 1 the errors in finding the inverse of a triangular Toeplitz matrix with entries given by
$$t_j = 1/(1+j)^2, \quad j = 0, 1, \ldots, 511; \qquad \epsilon = 1 \ \text{and} \ \epsilon = 2^{-36/512}.$$
Here $b$ is the first column of the inverse of $T_n$ obtained by the divide-and-conquer method and $\hat{b}$ is the first column of the approximate inverse obtained by the interpolation method. It is clear that the numerical results are consistent with the theoretical results.
Fig. 1. $\log_{10}(|\hat{b} - b|)$ for $t_j = 1/(j+1)^2$, $j = 0, 1, \ldots, 511$. The left plot is for $\epsilon = 1$ and the right one for $\epsilon = 2^{-36/512}$.
We observe that by choosing a suitable $\epsilon$, the accuracy of the approximate solution can be improved remarkably, especially for $j$ not close to $n$. Using Property 3 of lower triangular Toeplitz matrices, one can improve the accuracy of the numerical inverse by interpolating $h^{(r)}$ by a trigonometric polynomial of degree $n + n_0$, where $n_0 > 0$. After obtaining the coefficients $c_j$ for $j = 0, 1, \ldots, n + n_0 - 1$, we compute $\hat{b} = [c_j/\epsilon^j]_{j=0}^{n-1}$. For simplicity, we set $n_0 = n$. The algorithm is given as follows.

Algorithm 1. Inversion based on interpolation
Step 0: Choose $\epsilon \in (0,1)$ and compute $\tilde{t}_j = \epsilon^j t_j$ for $j = 0, 1, \ldots, n-1$.
Step 1: Compute $\phi(\theta_k) = \sum_{l=0}^{n-1} \tilde{t}_l e^{-il\theta_k}$, where $\theta_k = (2k-1)\pi/(4n)$ for $k = 1, \ldots, 2n$.
Step 2: Compute $h_k = h^{(r)}(\theta_k) = \phi^{(r)}(\theta_k)/((\phi^{(r)}(\theta_k))^2 + (\phi^{(i)}(\theta_k))^2)$, $k = 1, \ldots, 2n$.
Step 3: Solve
$$C (c_0, c_1, \ldots, c_{2n-1})^t = (h_1, h_2, \ldots, h_{2n})^t,$$
where $[C]_{j,k} = \cos((k-1)(2j-1)\pi/(4n))$, $j, k = 1, \ldots, 2n$. Compute $[\hat{b}_k]_{k=0}^{n-1} = [c_k \epsilon^{-k}]_{k=0}^{n-1}$.

To end this subsection, we discuss the cost of Algorithm 1. Since
$$\phi(\theta_{2k}) = \sum_{l=0}^{n-1} \tilde{t}_l e^{-il\theta_{2k}} = \sum_{l=1}^{n} \left(\tilde{t}_{l-1}\, e^{-3\pi i(l-1)/(4n)}\right) e^{-2\pi i(l-1)(k-1)/(2n)},$$
we see that the values of $\phi(\theta_{2k})$ for $k = 1, \ldots, n$ can be computed by one FFT($2n$). Similarly, the values $\phi(\theta_{2k-1})$, $k = 1, \ldots, n$, can be obtained by one FFT($2n$) too. The main cost of Step 3 is one DCT($2n$). Therefore, the cost of Algorithm 1 is about two FFT($2n$) and one DCT($2n$). Finally, we remark that $\epsilon = 2^{-18/n}$ is a good choice.
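A compact numerical sketch of Algorithm 1 follows (our own illustration with a hypothetical function name, not code from the paper): plain $O(n^2)$ sums replace the two FFT($2n$) and the DCT($2n$), and the solve in Step 3 is written out explicitly via the discrete orthogonality of cosines at the knots $\theta_k = (2k-1)\pi/(4n)$.

```python
import cmath
import math

def interp_inverse(t, eps):
    """Sketch of Algorithm 1: approximate first column of T_n^{-1} via
    trigonometric interpolation of h^(r).  Plain O(n^2) sums stand in for
    the FFT(2n)/DCT(2n) of the actual algorithm."""
    n = len(t)
    N = 2 * n                                  # degree-2n interpolation (n0 = n)
    ts = [eps**j * t[j] for j in range(n)]     # Step 0: scaled coefficients
    th = [(2 * k - 1) * math.pi / (2 * N) for k in range(1, N + 1)]
    h = []
    for x in th:                               # Steps 1-2: h^(r) at the knots
        phi = sum(ts[l] * cmath.exp(-1j * l * x) for l in range(n))
        h.append(phi.real / (phi.real**2 + phi.imag**2))
    b = []
    for k in range(n):                         # Step 3: c_k, then b_k = c_k / eps^k
        s = sum(h[j] * math.cos(k * th[j]) for j in range(N))
        c_k = s / N if k == 0 else 2 * s / N
        b.append(c_k / eps**k)
    return b
```

With the suggested choice $\epsilon = 2^{-18/n}$, the aliasing error terms of Theorem 1 (with $n$ replaced by $2n$) are of order $\epsilon^{2(n+1)}$, so even modest $n$ gives many correct digits.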
3. The revised version of Bini's algorithm

The idea of Bini's algorithm is to approximate $T_n$ by
$$T_n^{(\epsilon)} = \sum_{j=0}^{n-1} t_j (H_n^{(\epsilon)})^j$$
(cf. Section 1). The inverse of $T_n^{(\epsilon)}$ can be computed fast by using the decomposition
$$(T_n^{(\epsilon)})^{-1} = (D_n^{(\epsilon)})^{-1} F_n^* D_d^{-1} F_n D_n^{(\epsilon)}, \qquad (11)$$
where $D_n^{(\epsilon)} = \mathrm{diag}(1, \epsilon, \ldots, \epsilon^{n-1})$, $D_d = \mathrm{diag}(d)$ with $d = \sqrt{n}\, F_n D_n^{(\epsilon)} [t_j]_{j=0}^{n-1}$, and $F_n$ is the $n$-by-$n$ Fourier matrix. By using (11), we see that Bini's algorithm for computing the first column $b^{(\epsilon)}$ of $(T_n^{(\epsilon)})^{-1}$ is as follows:
Step 0: Choose $\epsilon \in (0,1)$. Compute $\tilde{t}_k = t_k \epsilon^k$ for $k = 0, 1, \ldots, n-1$.
Step 1: Compute $d = (\sqrt{n} F_n)\tilde{t}$.
Step 2: Compute $c = [c_j]_{j=0}^{n-1} = [1/d_j]_{j=0}^{n-1}$.
Step 3: Compute $f = (F_n^*/\sqrt{n})\, c$.
Step 4: Compute $[b_k^{(\epsilon)}]_{k=0}^{n-1} = [f_k \epsilon^{-k}]_{k=0}^{n-1}$.

It is not difficult to check that for $k = 0, 1, \ldots, n-1$,
$$\sum_{j=0}^{n-1} (b_j^{(\epsilon)} \epsilon^j) e^{-2\pi ijk/n} = \left(\sum_{j=0}^{n-1} (t_j \epsilon^j) e^{-2\pi ijk/n}\right)^{-1} = \frac{1}{\phi(2\pi k/n)}, \qquad (12)$$
where $\phi$ is defined as in (6). That is, Bini's algorithm is equivalent to interpolating $1/\phi$ by a trigonometric polynomial of degree less than $n$ with $\theta_k = 2\pi k/n$, $k = 1, \ldots, n$, as the interpolating knots.

Similarly to Theorem 1, we have the following theorem concerning the error in $b^{(\epsilon)}$. Since the proof is similar to that of Theorem 1, we omit it.

Theorem 2. Let the Maclaurin series of $\phi_n^{-1}(\epsilon z)$ be given by $\sum_{k=0}^{\infty} (b_k \epsilon^k) z^k$ and let $\epsilon \in (0,1)$ be such that $\sum_{k=0}^{\infty} |b_k \epsilon^k| < \infty$. Let $b^{(\epsilon)}$ be the first column of the approximate inverse obtained by Bini's algorithm. Then $b_k^{(\epsilon)} - b_k = O(\epsilon^n)$, $k = 0, 1, \ldots, n-1$. More precisely,
$$b_k^{(\epsilon)} = b_k + \epsilon^n \sum_{j=1}^{\infty} \epsilon^{(j-1)n} b_{k+jn}, \quad k = 0, 1, \ldots, n-1. \qquad (13)$$
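Steps 0–4 above translate directly into code. The sketch below is our own illustration (hypothetical function name, not code from the paper), with plain $O(n^2)$ DFT sums standing in for the two FFT($n$) of the actual algorithm:

```python
import cmath

def bini_first_column(t, eps):
    """Bini's algorithm (sketch): first column of (T_n^{(eps)})^{-1}, an
    O(eps^n)-accurate approximation to the first column of T_n^{-1}.
    Plain O(n^2) DFT sums are used where the paper uses FFT(n)."""
    n = len(t)
    ts = [t[k] * eps**k for k in range(n)]                        # Step 0
    w = -2j * cmath.pi / n
    d = [sum(ts[j] * cmath.exp(w * j * k) for j in range(n))      # Step 1
         for k in range(n)]
    c = [1.0 / dk for dk in d]                                    # Step 2
    f = [sum(c[k] * cmath.exp(-w * j * k) for k in range(n)) / n  # Step 3
         for j in range(n)]
    return [f[k].real / eps**k for k in range(n)]                 # Step 4
```

For the example $t_0 = 1$, $t_1 = -1$ discussed below, the computed entries are $1/(1-\epsilon^n)$, so with $\epsilon = (0.5 \times 10^{-8})^{1/n}$ the error is about $0.5 \times 10^{-8}$, consistent with Theorem 2.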
Theoretically, the smaller $\epsilon$, the more accurate the approximate inverse. On the other hand, if $\epsilon$ is close to zero, $D_n^{(\epsilon)}$ will be very ill-conditioned for large $n$. Therefore, we must choose a suitable $\epsilon$ to balance these two facts.

Now let us illustrate the effect of rounding error by a simple example. Let $T_n$ be the lower triangular Toeplitz matrix with first column given by $t_0 = 1$, $t_1 = -1$, $t_k = 0$ for $k = 2, \ldots, n-1$. Then the first column of $T_n^{-1}$ is given by $b_k = 1$ for $k = 0, 1, \ldots, n-1$, and the first column of $(T_n^{(\epsilon)})^{-1}$ is $b_k^{(\epsilon)} = 1/(1 - \epsilon^n)$, $k = 0, 1, \ldots, n-1$. That is, $b_k^{(\epsilon)} - b_k = \epsilon^n/(1 - \epsilon^n) \approx \epsilon^n$ for all $k$. The errors in the numerical results $\tilde{b}^{(\epsilon)}$ of Bini's algorithm for $\epsilon = (0.5 \times 10^{-8})^{1/n}$ and $\epsilon = (1.0 \times 10^{-10})^{1/n}$ are shown in Fig. 2.

Fig. 2. $\log_{10}(|\tilde{b}^{(\epsilon)} - b|)$ for $t_0 = 1$, $t_1 = -1$, $t_k = 0$, $k = 2, 3, \ldots, 511$. The left plot is for $\epsilon = (0.5 \times 10^{-8})^{1/n}$ and the right one for $\epsilon = 10^{-10/n}$.

From Fig. 2, we see that $\tilde{b}_k^{(\epsilon)}$ is less accurate for $k$ close to $n$. Based on the above discussion, one can revise Bini's algorithm so that it yields an accurate approximate inverse. The idea is similar to what we did for the interpolation method in Section 2: embed the $n$-by-$n$ triangular Toeplitz matrix into an $(n+n_0)$-by-$(n+n_0)$ triangular (banded) Toeplitz matrix, where $n_0$ is a positive integer. For simplicity, we set $n_0 = n$. The algorithm can be stated as follows.

Algorithm 2. Revised version of Bini's algorithm
Step 0: Choose $\epsilon \in (0,1)$. Compute $\tilde{t}_k = t_k \epsilon^k$ for $k = 0, 1, \ldots, n-1$ and set $\tilde{t}_k = 0$ for $k = n, n+1, \ldots, 2n-1$.
Step 1: Compute $d = (\sqrt{2n} F_{2n})[\tilde{t}_k]_{k=0}^{2n-1}$.
Step 2: Compute $c = [c_j]_{j=0}^{2n-1} = [1/d_j]_{j=0}^{2n-1}$.
Step 3: Compute $f = (F_{2n}^*/\sqrt{2n})\, c$.
Step 4: Compute $[b_k^{(\epsilon)}]_{k=0}^{n-1} = [f_k \epsilon^{-k}]_{k=0}^{n-1}$.

It is obvious that the cost of Algorithm 2 is about two FFT($2n$). Numerical tests show that $\epsilon = (0.5 \times 10^{-8})^{1/n}$ and $\epsilon = 10^{-5/n}$ are good choices for Bini's and the revised Bini algorithm, respectively.

Remarks. (1) Bini's algorithm is equivalent to the algorithm proposed by Schönhage [11]; see [3] for details. In [11], the first column $[b_k]_{k=0}^{n-1}$ of the matrix $T_n^{-1}$ is obtained
by using the formula
$$b_k = \frac{1}{2\pi \epsilon^k} \int_0^{2\pi} \frac{1}{\phi_{n,\epsilon}(e^{-ix})}\, e^{ikx}\, dx \quad \text{for } k = 0, 1, \ldots, n-1,$$
and approximating the right-hand side by the rectangular rule with $x_j = 2\pi j/n$, $j = 0, 1, \ldots, n-1$, as the quadrature grid.

(2) Bini's algorithm can also be derived from the polynomial division algorithms proposed by Pan et al. [8]; see also [9]. Here, we briefly describe their ideas. Associated with the first columns of $T_n$ and $T_n^{-1}$, we define the two polynomials
$$T(z) = \sum_{j=0}^{n-1} t_{n-1-j} z^j \quad \text{and} \quad B(z) = \sum_{j=0}^{n-1} b_{n-1-j} z^j.$$
It is easily checked that $R(z) = z^{2n-2} - T(z)B(z)$ is a polynomial of degree less than $n-1$. That is, $B(z)$ and $R(z)$ are the quotient and the remainder of the division of the polynomial $z^{2n-2}$ by $T(z)$. The first key point of the algorithm proposed in [8] is that $R(z)/T(z) \to 0$ as $|z| \to \infty$. In particular, for sufficiently small $\epsilon$, we have
$$B(e^{ix}/\epsilon) \approx \frac{(e^{ix}/\epsilon)^{2n-2}}{T(e^{ix}/\epsilon)},$$
where $x$ is any real number. Setting $x_k = 2\pi k/n$, $k = 0, 1, \ldots, n-1$, we get
$$\sum_{j=0}^{n-1} b_{n-1-j}\, (e^{i2\pi k/n}/\epsilon)^j \approx \epsilon^{1-n} e^{i2\pi k(n-1)/n} \left(\sum_{j=0}^{n-1} (t_{n-1-j}\, \epsilon^{n-1-j})\, e^{-i2\pi k(n-1-j)/n}\right)^{-1},$$
i.e.
$$\sum_{j=0}^{n-1} (b_{n-1-j}\, \epsilon^{n-1-j})\, e^{-i2\pi k(n-1-j)/n} \approx \left(\sum_{j=0}^{n-1} (t_{n-1-j}\, \epsilon^{n-1-j})\, e^{-i2\pi k(n-1-j)/n}\right)^{-1},$$
which is equivalent to (12). Thus, we arrive at Bini's algorithm.

The second key idea in [8] is that the errors in the approximation of the coefficients $b_j$ are proportional to $\epsilon^j$; that is, the errors decrease dramatically if the polynomial $T(z)$ is replaced by $z^k T(z)$ for larger $k$. This idea is the basis for the improved version of the algorithm presented in [8, Section 5]. Our embedding of $T_n$ can be viewed as the matrix version of the scaling of $T(z)$ by the power $z^k$, although our work was independent, our derivation was distinct (it came from the study of triangular Toeplitz matrices rather than from the polynomials of [8,9]), and we learned about [8,9] only from the referees of our paper.
(3) Our revision of Bini's algorithm exploits the idea of interpolation for increasing the order of approximation. We trace this idea back to [1]. Using (13), we have that for the revised Bini algorithm with $n_0 = n$, the first column of the approximate inverse (denoted by $[b_{k,2n}^{(\epsilon)}]_{k=0}^{n-1}$) satisfies
$$b_{k,2n}^{(\epsilon)} = b_k + \epsilon^{2n} \sum_{j=1}^{\infty} \epsilon^{2(j-1)n} b_{k+2jn}, \quad k = 0, 1, \ldots, n-1.$$
On the other hand, applying the interpolation technique to the approximate inverses of Bini's algorithm for different $\epsilon$'s, we can get approximate inverses of high accuracy. For example, we can obtain the approximate inverse
$$\tfrac{1}{2}\left((T_n^{(\epsilon)})^{-1} + (T_n^{(\sqrt[n]{-\epsilon^n})})^{-1}\right)$$
by interpolating the approximate inverses for $\epsilon$ and $\sqrt[n]{-\epsilon^n}$. Using (13) again, it is easy to see that the above two formulae are equivalent.
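The averaging identity in Remark (3) is easy to check numerically. Below is our own sketch (hypothetical helper names, plain $O(n^2)$ DFT sums instead of FFTs): averaging Bini's approximate inverses for $\epsilon$ and $\sqrt[n]{-\epsilon^n} = \epsilon e^{i\pi/n}$ cancels the odd-order error terms in (13), reproducing the accuracy of the revised algorithm.

```python
import cmath
import math

def bini_col(t, eps):
    """First column of (T_n^{(eps)})^{-1}; eps may be complex.
    Plain O(n^2) DFT sums stand in for FFT(n)."""
    n = len(t)
    ts = [t[k] * eps**k for k in range(n)]
    w = -2j * cmath.pi / n
    d = [sum(ts[j] * cmath.exp(w * j * k) for j in range(n)) for k in range(n)]
    f = [sum(cmath.exp(-w * j * k) / d[k] for k in range(n)) / n
         for j in range(n)]
    return [f[k] / eps**k for k in range(n)]

def averaged(t, eps):
    """Average of Bini's approximate inverses for eps and the n-th root of
    -eps^n; the odd terms of (13) cancel, leaving O(eps^{2n}) error."""
    e2 = eps * cmath.exp(1j * math.pi / len(t))   # n-th root of -eps^n
    b1, b2 = bini_col(t, eps), bini_col(t, e2)
    return [0.5 * (x + y).real for x, y in zip(b1, b2)]
```

For the banded example $t_0 = 1$, $t_1 = -1$ with $\epsilon = 10^{-5/n}$, the plain Bini error is about $\epsilon^n = 10^{-5}$ while the averaged error drops to about $\epsilon^{2n} = 10^{-10}$.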
4. Numerical examples

In this section, we test the accuracy of our algorithm, Bini's algorithm and the revised Bini algorithm on six different sequences of lower triangular Toeplitz matrices:
(i) $t_j = 1/(j+1)^3$, $j = 0, 1, \ldots, n-1$;
(ii) $t_j = 1/(j+1)^2$, $j = 0, 1, \ldots, n-1$;
(iii) $t_j = 1/(j+1)$, $j = 0, 1, \ldots, n-1$;
(iv) $t_j = 1/\log(j+2)$, $j = 0, 1, \ldots, n-1$;
(v) $t_0 = 1$, $t_1 = 1$ and $t_j = 0$ for $j = 2, \ldots, n-1$;
(vi) $t_0 = 1$, $t_1 = -2$, $t_2 = 1$ and $t_j = 0$ for $j = 3, \ldots, n-1$.
We note that sequences (i)–(iv) are quite well conditioned (for $n = 4096$, the condition numbers of the corresponding matrices are 1.4, 2.3, 16.8 and 799.5, respectively), sequence (v) has inverse $b_j = (-1)^j$, and sequence (vi) has inverse $b_j = j+1$ for $j = 0, 1, \ldots, n-1$.

In the following tables, we show the relative accuracy of the approximate inverse,
$$\frac{\|\tilde{b} - b\|_1}{\|b\|_1},$$
where $\tilde{b}$ is the first column of the approximate inverse and $b$ is the first column of the exact inverse. In each table, the second, third and fourth rows display the accuracy of the computed inverses for Bini's algorithm, the revised Bini algorithm and our algorithm, respectively. For sequences (i)–(iv), the exact inverses are unknown, so we set $b$ to the first column of the inverse computed by the divide-and-conquer method.

We observe from Tables 1–6 that all approximate inverses of the three methods are very accurate. According to these very limited samples, Bini's algorithm is suitable for cases where very high accuracy is not required, for instance in the Gauss–Seidel iteration for Toeplitz systems. The revised Bini algorithm and our
Table 1. Accuracy for $t_j = 1/(j+1)^3$, $j = 0, 1, \ldots, n-1$

  n              128       256       512       1024      2048      4096
  Bini           1.0e-8    1.9e-8    2.5e-8    3.8e-8    5.0e-8    7.0e-8
  Revised Bini   4.5e-12   9.0e-12   1.2e-11   1.9e-11   3.3e-11   4.6e-11
  Interpolation  2.7e-11   3.1e-11   4.6e-11   7.5e-11   1.2e-10   1.6e-10
Table 2. Accuracy for $t_j = 1/(j+1)^2$, $j = 0, 1, \ldots, n-1$

  n              128       256       512       1024      2048      4096
  Bini           9.5e-9    1.4e-8    2.5e-8    3.3e-8    5.6e-8    7.6e-8
  Revised Bini   5.5e-12   8.5e-12   1.3e-11   2.1e-11   3.1e-11   4.2e-11
  Interpolation  2.3e-11   2.6e-11   4.4e-11   7.2e-11   8.9e-11   1.5e-10
Table 3. Accuracy for $t_j = 1/(j+1)$, $j = 0, 1, \ldots, n-1$

  n              128       256       512       1024      2048      4096
  Bini           5.0e-9    2.0e-8    2.9e-8    4.2e-8    5.6e-8    8.2e-8
  Revised Bini   7.3e-12   1.2e-11   1.7e-11   2.3e-11   3.3e-11   4.7e-11
  Interpolation  2.0e-11   3.7e-11   4.8e-11   6.6e-11   9.8e-11   1.6e-10
Table 4. Accuracy for $t_j = 1/\log(j+2)$, $j = 0, 1, \ldots, n-1$

  n              128       256       512       1024      2048      4096
  Bini           1.5e-8    2.1e-8    3.5e-8    6.4e-8    1.1e-7    1.9e-7
  Revised Bini   7.2e-12   1.8e-11   3.4e-11   4.8e-11   7.4e-11   1.3e-11
  Interpolation  2.2e-11   3.7e-11   7.6e-11   1.1e-10   1.7e-10   3.4e-10
Table 5. Accuracy for $t_0 = t_1 = 1$, $t_j = 0$, $j = 2, \ldots, n-1$

  n              128       256       512       1024      2048      4096
  Bini           5.2e-9    5.7e-9    5.3e-9    5.7e-9    5.9e-9    9.2e-9
  Revised Bini   1.0e-10   1.0e-10   1.0e-10   1.0e-10   1.0e-10   9.7e-11
  Interpolation  1.4e-12   7.0e-12   5.8e-12   8.9e-12   1.1e-11   7.4e-11
Table 6. Accuracy for $t_0 = t_2 = 1$, $t_1 = -2$, $t_j = 0$, $j = 3, \ldots, n-1$

  n              128       256       512       1024      2048      4096
  Bini           1.4e-8    1.5e-8    1.7e-8    2.3e-8    6.0e-8    1.1e-7
  Revised Bini   5.0e-10   5.0e-10   5.0e-10   4.7e-10   4.0e-10   9.2e-10
  Interpolation  6.8e-12   1.7e-11   2.8e-11   1.0e-10   9.7e-10   4.0e-9
interpolation method are more accurate than Bini's algorithm. However, we remark that the revised Bini algorithm only requires two FFT($2n$) (about twice the cost of Bini's algorithm, or 4/5 of that of Algorithm 1).

5. Summary

In summary, we have proposed an approximate inversion method for triangular Toeplitz matrices based on trigonometric polynomial interpolation, fast Fourier transforms and the fast cosine transform. We have also revised Bini's approximate method by extending the approach proposed by Bini in 1980. Some numerical examples are included.

Acknowledgements

We thank Prof. E.I. Landowne for fixing the formula in our Remark 2, and Prof. V.Y. Pan and the referees for their valuable and detailed comments and suggestions for improving the paper.

References
[1] D. Bini, Relations between ES-algorithms and APA-algorithms, Applications, Calcolo XVII (1980) 87–97.
[2] D. Bini, Parallel solution of certain Toeplitz linear systems, SIAM J. Comput. 13 (1984) 268–276.
[3] D. Bini, V. Pan, Polynomial division and its computational complexity, J. Complexity 2 (1986) 179–203.
[4] L. Brutman, Lebesgue functions for polynomial interpolation—a survey, Ann. Numer. Math. 4 (1997) 111–127.
[5] R. Chan, M.K. Ng, Conjugate gradient methods for Toeplitz systems, SIAM Rev. 38 (1996) 427–482.
[6] E.W. Cheney, Introduction to Approximation Theory, AMS Chelsea Publishing, New York, 1982.
[7] D. Commenges, M. Monsion, Fast inversion of triangular Toeplitz matrices, IEEE Trans. Automat. Control AC-29 (1984) 250–251.
[8] V. Pan, E. Landowne, A. Sadikou, Univariate polynomial division with a remainder by means of evaluation and interpolation, in: Proceedings of the Third IEEE Symposium on Parallel and Distributed Processing, Dallas, TX, USA, December 1991, pp. 212–217.
[9] V. Pan, A. Sadikou, E. Landowne, Polynomial division with a remainder by means of evaluation and interpolation, Inform. Process. Lett. 44 (1992) 149–153.
[10] T.J. Rivlin, The Lebesgue constants for polynomial interpolation, in: H.C. Garnier, et al. (Eds.), Functional Analysis and its Applications, Springer, Berlin, 1974, pp. 422–437.
[11] A. Schönhage, Asymptotically fast algorithms for the numerical multiplication and division of polynomials with complex coefficients, in: Proceedings, EUROCAM, Marseille, 1982.
Theoretical Computer Science 315 (2004) 525 – 555
www.elsevier.com/locate/tcs
High probability analysis of the condition number of sparse polynomial systems

Gregorio Malajovich^{a,∗,1}, J. Maurice Rojas^{b,2}

a Departamento de Matemática Aplicada, Universidade Federal do Rio de Janeiro, Caixa Postal 68530, CEP 21945-970, Rio de Janeiro, RJ, Brazil
b Department of Mathematics, Texas A&M University, TAMU 3368, College Station, TX 77843-3368, USA
Abstract

Let $f := (f_1, \ldots, f_n)$ be a random polynomial system with fixed $n$-tuple of supports. Our main result is an upper bound on the probability that the condition number of $f$ in a region $U$ is larger than $1/\epsilon$. The bound depends on an integral of a differential form on a toric manifold and admits a simple explicit upper bound when the Newton polytopes (and underlying variances) are all identical. We also consider polynomials with real coefficients and give bounds for the expected number of real roots and (restricted) condition number. Using a Kähler geometric framework throughout, we also express the expected number of roots of $f$ inside a region $U$ as the integral over $U$ of a certain mixed volume form, thus recovering the classical mixed volume when $U = (\mathbb{C}^*)^n$.
© 2004 Elsevier B.V. All rights reserved.

Keywords: Mixed volume; Condition number; Polynomial systems; Sparse; Random
1. Introduction

From the point of view of numerical analysis, it is not only the number of complex solutions of a polynomial system that makes it hard to solve numerically but also the sensitivity of its roots to small perturbations in the coefficients. This is formalized in

∗ Corresponding author. Tel.: 5521-2562-7515; fax: 5521-2290-1095.
E-mail addresses: [email protected] (G. Malajovich), [email protected] (J.M. Rojas).
URLs: http://www.labma.ufrj.br/∼gregorio, http://www.math.tamu.edu/∼rojas
1 Partially supported by CNPq grant 303472/02-2, by CERG (Hong Kong) grants 9040393, 9040402, and 9040188, by FAPERJ and by Fundação José Pelúcio Ferreira.
2 Partially supported by Hong Kong UGC grant #9040402-730, Hong Kong/France PROCORE Grant #9050140-730, and NSF Grants DMS-0138446 and DMS-0211458.
0304-3975/$ - see front matter © 2004 Elsevier B.V. All rights reserved.
doi:10.1016/j.tcs.2004.01.006
the condition number, $\mu(f, \zeta)$ (cf. Definition 4 of Section 1.1), which dates back to the work of Alan Turing [39]. In essence, $\mu(f, \zeta)$ measures the sensitivity of a solution $\zeta$ to perturbations of a problem $f$, and a large condition number is meant to imply that $f$ is intrinsically hard to solve numerically. Such analysis of numerical conditioning, while having been applied for decades in numerical linear algebra (see, e.g., [11]), was only applied to computational algebraic geometry toward the end of the twentieth century (see, e.g., [33]). Here we use Kähler geometry to analyze the numerical conditioning of sparse polynomial systems, thus setting the stage for more realistic complexity bounds for the numerical solution of polynomial systems. Our bounds generalize some earlier results of Kostlan [20] and Shub and Smale [36] on the more restricted dense case, and also yield new formulae for the expected number of roots (real and complex) in a region. The appellations "sparse" and "dense" respectively refer to either (a) taking into account the underlying monomial term structure or (b) ignoring this finer structure and simply working with degrees of polynomials. Since many polynomial systems occurring in practice have rather restricted monomial term structure, sparsity is an important consideration and we therefore strive to state our complexity bounds in terms of this refined information.

To give the flavor of our results, let us first make some necessary definitions. We must first formalize the spaces of polynomial systems we work with and how we measure perturbations in the spaces of problems and solutions.

Definition 1. Given any finite subset $A \subset \mathbb{Z}^n$, let $F_{\mathbb{C}}(A)$ (resp. $F_{\mathbb{R}}(A)$) denote the vector space of all polynomials in $\mathbb{C}[x_1, \ldots, x_n]$ (resp. $\mathbb{R}[x_1, \ldots, x_n]$) of the form $\sum_{a \in A} c_a x^a$, where the notation $x^a := x_1^{a_1} \cdots x_n^{a_n}$ is understood. For any finite subsets $A_1, \ldots, A_n \subset \mathbb{Z}^n$ we then let $\mathcal{A} := (A_1, \ldots, A_n)$ and $F_{\mathbb{C}}(\mathcal{A}) := F_{\mathbb{C}}(A_1) \times \cdots \times F_{\mathbb{C}}(A_n)$ (resp. $F_{\mathbb{R}}(\mathcal{A}) := F_{\mathbb{R}}(A_1) \times \cdots \times F_{\mathbb{R}}(A_n)$).

The $n$-tuple $\mathcal{A}$ will thus govern our notion of sparsity as well as the perturbations allowed in the coefficients of our polynomial systems. It is then easy to speak of random polynomial systems and the distance to the nearest degenerate system. Recall that a degenerate root of $f$ is simply a root of $f$ having Jacobian of rank $< n$.

Definition 2. By a complex (resp. real) random sparse polynomial system we will mean a choice of $\mathcal{A} := (A_1, \ldots, A_n)$ and an assignment of a probability measure to each $F_{\mathbb{C}}(A_i)$ (resp. $F_{\mathbb{R}}(A_i)$) as follows: endow $F_{\mathbb{C}}(A_i)$ (resp. $F_{\mathbb{R}}(A_i)$) with an independent complex (resp. real) Gaussian distribution having mean 0 and a (positive definite and diagonal) variance matrix $C_i$. Finally, let the discriminant variety, $\Sigma(\mathcal{A})$, denote the set of all $f \in F_{\mathbb{C}}(\mathcal{A})$ (resp. $f \in F_{\mathbb{R}}(\mathcal{A})$) with a degenerate root, and define $F_\zeta(\mathcal{A}) := \{f \in F_{\mathbb{C}}(\mathcal{A}) \mid f(\zeta) = 0\}$ (resp. $F_\zeta(\mathcal{A}) := \{f \in F_{\mathbb{R}}(\mathcal{A}) \mid f(\zeta) = 0\}$) and $\Sigma_\zeta(\mathcal{A}) := F_\zeta(\mathcal{A}) \cap \Sigma(\mathcal{A})$.

Theorem 1. Suppose $A \subset \mathbb{Z}^n$ is a finite set whose convex hull has positive volume, and let $\mathcal{A} := (A, \ldots, A)$ ($n$ copies). Then there is a natural metric $d(\cdot,\cdot)$ on $F_{\mathbb{C}}(\mathcal{A})$ such that
μ(f, ζ) = 1/d(f, Σ_ζ(𝒜)). Furthermore,

$$\mathrm{Prob}\!\left[\mu(f,\zeta) > \frac{1}{\varepsilon}\ \text{for some root}\ \zeta\in(\mathbb{C}^*)^n\ \text{of}\ f\right] \;\le\; n^3(n+1)\,\mathrm{Vol}(A)\,(\#A-1)(\#A-2)\,\varepsilon^4,$$

where f is a complex random sparse polynomial system, #A denotes the number of points in A, and Vol(A) denotes the volume of the convex hull of A (normalized so that Vol(O, e_1, …, e_n) = 1).

The above theorem is in fact a simple corollary of two much more general theorems (Theorems 4 and 5), which also include as a special case an analogous result of Shub and Smale in the dense case [6, Theorem 1, p. 237]. We also note that theorems such as the one above are natural precursors to explicit bounds on the number of steps required for a homotopy algorithm [33] to solve f. We will pursue the latter topic in a future paper. Indeed, one of our long-term goals is to provide a rigorous and explicit complexity analysis of the numerical homotopy algorithms for sparse polynomial systems developed by Verschelde et al. [40], Huber and Sturmfels [17], and Li and Li [21].

The framework underlying our first main theorem involves Kähler geometry, which is the intersection of Riemannian metrics and symplectic and complex structures on manifolds. On a more concrete level, we can give new formulae for the expected number of roots of f in a region U. For technical reasons, we will mainly work with logarithmic coordinates. That is, we will let T^n be the n-fold product of cylinders (R × (R mod 2π))^n ⊂ C^n, and use coordinates p + iq := (p_1 + iq_1, …, p_n + iq_n) ∈ T^n to stand for a root ζ := exp(p + iq) := (e^{p_1+iq_1}, …, e^{p_n+iq_n}) of f. Roots with zero coordinates can be handled by then working in a suitable toric compactification, and this is made precise in Section 2. The idea of working with roots of polynomial systems in logarithmic coordinates seems to be extremely classical, yet it gives rise to interesting and surprising connections (see the discussions in [24,25,41]).

Theorem 2. Let A_1, …, A_n be finite subsets of Z^n and U ⊆ T^n be a measurable region. Pick positive definite diagonal variance matrices C_1, …, C_n and consider a complex random polynomial system as in Definition 2, for some (A_1, C_1; …; A_n, C_n). Then there are natural real 2-forms ω_{A_1}, …, ω_{A_n} on T^n such that the expected number of roots of f in exp(U) ⊆ (C^*)^n is exactly

$$\frac{(-1)^{n(n-1)/2}}{\pi^n}\int_U \omega_{A_1}\wedge\cdots\wedge\omega_{A_n}.$$

In particular, when exp(U) = (C^*)^n, the above expression is exactly the mixed volume of the convex hulls of A_1, …, A_n (normalized so that the mixed volume of n standard n-simplices is 1). See [7,31] for the classical definition of mixed volume and its main properties.

The result above generalizes the famous connection between root counting and mixed
volumes discovered by David N. Bernshtein [5]. The special case of unmixed systems with identical coefficient distributions (A_1 = ⋯ = A_n, C_1 = ⋯ = C_n) recovers a particular case of Theorem 8.1 in [12]. However, comparing Theorem 2 and [12, Theorem 8.1], this is the only overlap, since neither theorem generalizes the other. The very last assertion of Theorem 2 (for uniform variance C_i = I for all i) was certainly known to Gromov [14], and a version of Theorem 2 was known to Kazarnovskii [18, p. 351] and Khovanskii [19, Proposition 1, Section 1.13]. In [18], the supports A_i are even allowed to have complex exponents. However, uniform variance is again assumed. His method may imply this special case of Theorem 2, but the indications given in [18] were insufficient for us to reconstruct a proof. Also, there is some intersection with a result by Passare and Rullgård (Theorem 5 in [29] and Theorem 20 in [28]). However, this result is about a more restrictive choice of the domain U and a more general class of functions (holomorphic, not polynomial) under a different averaging process.

As a consequence of our last result, we can also give a coarse estimate on the expected number of real roots in a region.

Theorem 3. Let U be a measurable subset of R^n with Lebesgue volume λ(U). Then, following the notation above, suppose instead that f is a real random polynomial system. Then the average number of real roots of f in exp(U) ⊂ R_+^n is bounded above by

$$(4\pi^2)^{-n/2}\,\sqrt{\lambda(U)}\;\sqrt{(-1)^{n(n-1)/2}\int_{(p,q)\in U\times[0,2\pi)^n}\omega_{A_1}\wedge\cdots\wedge\omega_{A_n}}.$$

This bound is of interest when n and U are fixed, in which case the expected number of positive real roots grows as the square root of the mixed volume.

1.1. Stronger results via mixed metrics

Our remaining new results, which further sharpen the preceding bounds and formulae, will require some additional notation.

Definition 3. We define a norm on F_C(A_i) by $\|f_i\|_{C_i^{-1}} := \sqrt{c^i\,C_i^{-1}\,(c^i)^H}$, where $c^i$ is the row vector of coefficients of f_i and $(\cdot)^H$ denotes the usual Hermitian conjugate transpose. Finally, we define a norm on F_C(𝒜) by $\|f\|^2 := \sum_{i=1}^n \|f_i\|^2_{C_i^{-1}}$, and a metric d_P on the product

$$\mathbb P(\mathcal F_{\mathbb C}(\mathcal A)) := \mathbb P(\mathcal F_{\mathbb C}(A_1))\times\cdots\times\mathbb P(\mathcal F_{\mathbb C}(A_n))$$

of projective spaces by $d_P(f,g) := \sqrt{\sum_{i=1}^n \min_{\lambda\in\mathbb C^*}\|f_i-\lambda g_i\|^2/\|f_i\|^2}$, where we implicitly use the natural embedding of P(F_C(A_i)) into the unit hemisphere of F_C(A_i). Each of the terms in the sum above corresponds to the square of the sine of the Fubini (or angular) distance between f_i and g_i. Therefore, d_P is never larger than the Hermitian distance between points in F_C(𝒜), but is a correct first-order approximation of the distance when g → f in P(F_C(𝒜)) (compare with [6, Chapter 12]). Recall that T_p M denotes the tangent space at p of a manifold M.
Definition 4. Define the evaluation map, ev_𝒜, as follows:

$$\mathrm{ev}_{\mathcal A} : \mathcal F\times T^n \to \mathbb C^n,\qquad ((f_1,\ldots,f_n),\ p+iq)\ \mapsto\ (f_1(\exp(p+iq)),\ldots,f_n(\exp(p+iq))).$$

Given any root exp(p+iq) of an f in F_C(𝒜), the condition number of f at p+iq, μ(f, p+iq), is then defined to be the operator norm

$$\|DG|_f\| := \max_{\|\dot g\|=1} \|DG|_f\,\dot g\|,$$

where G is the unique branch of the implicit function which satisfies G(f) = p+iq and ev_𝒜(g, G(g)) = O for all g sufficiently near f, and DG : T_f F_C(𝒜) → T_{p+iq} T^n is the derivative of G. (We set the condition number μ(f, p+iq) := +∞ in the event that Df does not have full rank and G thus fails to be uniquely defined.)

Note that the implied norm on T_f F_C(𝒜) was detailed in the previous definition, while the implied norm on T_{p+iq} T^n has intentionally been left unspecified. This is because, while F_C(𝒜) admits a natural Hermitian structure, the solution-space T^n admits n different natural Hermitian structures (one from each support A_i, as we shall see in the next section). Nevertheless, we can give useful bounds on the condition number and give an unambiguous definition in certain cases.

Theorem 4 (Condition Number Theorem). If (p,q) ∈ T^n is a non-degenerate root of f, then

$$\max_{\|\dot f\|\le 1}\ \min_i\ \|DG\,\dot f\|_{A_i} \;\le\; \frac{1}{d_P(f,\Sigma_{(p,q)})} \;\le\; \max_{\|\dot f\|\le 1}\ \max_i\ \|DG\,\dot f\|_{A_i}.$$

In particular, if A_1 = ⋯ = A_n and C_1 = ⋯ = C_n, then

$$\max_{\|\dot f\|\le 1}\ \min_i\ \|DG\,\dot f\|_{A_i} \;=\; \max_{\|\dot f\|\le 1}\ \max_i\ \|DG\,\dot f\|_{A_i} \;=\; \frac{1}{d_P(f,\Sigma_{(p,q)})}$$

and we can define μ(f, (p,q)) to be any of the three preceding quantities.

This generalizes [6, Theorem 3, p. 234], which is essentially equivalent to the last assertion above, in the special case where A_i is an n-column matrix whose rows {(A_i)_α}_α consist of all partitions of d_i into n non-negative integers and

$$C_i = \mathrm{Diag}_\alpha\!\left(\frac{d_i!}{(A_i)_{\alpha 1}!\,(A_i)_{\alpha 2}!\cdots(A_i)_{\alpha n}!\,\bigl(d_i-\sum_{j=1}^n (A_i)_{\alpha j}\bigr)!}\right)$$

— in short, the case where one considers complex random polynomial systems with f_i a degree d_i polynomial and the underlying probability measure invariant under a natural action of the unitary group U(n+1) on the space of roots. The last assertion of Theorem 4 also bears some similarity to Theorem D of [9], where the notion of metric is considerably loosened to give a statement which applies to an even more general class of equations. However, our philosophy is radically different: we consider the inner product in F_C(𝒜) as the starting point of our investigation and we do not change the metric in the fiber F_{(p,q)}.
Theorem 4 thus gives us some insight about reasonable intrinsic metric structures on T^n. In view of the preceding theorem, we can define a restricted condition number with respect to any measurable sub-region U ⊂ T^n as follows:

Definition 5. We let μ(f, U) := 1 / min_{(p,q)∈U} d_P(f, Σ_{(p,q)}). Also, via the natural GL(n)-action on T_{(p,q)} T^n defined by (ṗ, q̇) ↦ (Lṗ, Lq̇) for any L ∈ GL(n), we define the mixed dilation of the tuple (ω_{A_1}, …, ω_{A_n}) as

$$\nu(\omega_{A_1},\ldots,\omega_{A_n};(p,q)) := \min_{L\in GL(n)}\ \max_i\ \frac{\max_{\|u\|=1}\,(\omega_{A_i})_{(p,q)}(Lu,\,JLu)}{\min_{\|u\|=1}\,(\omega_{A_i})_{(p,q)}(Lu,\,JLu)},$$

where J : T T^n → T T^n is the canonical complex structure of T^n. Finally, we define ν_U := sup_{(p,q)∈U} ν(ω_{A_1}, …, ω_{A_n}; (p,q)), provided the supremum exists, and ν_U := +∞ otherwise.

We can then bound the expected number of roots with condition number > ε^{-1} on U in terms of the mixed volume form, the mixed dilation ν_U, and the expected number of ill-conditioned roots in the linear case. The linear case corresponds to the point sets and variance matrices below:

$$A_i^{\mathrm{Lin}} = \begin{pmatrix} 0 & \cdots & 0\\ 1 & & \\ & \ddots & \\ & & 1 \end{pmatrix},\qquad C_i^{\mathrm{Lin}} = \begin{pmatrix} 1 & & \\ & \ddots & \\ & & 1 \end{pmatrix}.$$

Theorem 5 (Expected value of the condition number). Let ρ^{Lin}(n, ε) be the probability that a complex random linear system of n polynomials in n variables has condition number larger than ε^{-1}. Let ρ^{𝒜}(U, ε) be the probability that μ(f, U) > ε^{-1} for a complex random polynomial system f with supports A_1, …, A_n and variance matrices C_1, …, C_n. Then

$$\rho^{\mathcal A}(U,\varepsilon) \;\le\; \frac{\int_U \bigwedge_i \omega_{A_i}}{\int_U \bigwedge_i \omega_{A_i^{\mathrm{Lin}}}}\ \rho^{\mathrm{Lin}}\!\left(n,\ \sqrt{\nu_U}\,\varepsilon\right).$$

Our final main result concerns the distribution of the real roots of a real random polynomial system. Let ρ^{R}(n, ε) be the probability that a real random linear system of n polynomials in n variables has condition number larger than ε^{-1}.

Theorem 6. Let A = A_1 = ⋯ = A_n and C = C_1 = ⋯ = C_n, and let U ⊆ R^n be measurable. Let f be a real random polynomial system. Then

$$\mathrm{Prob}[\mu(f,U) > \varepsilon^{-1}] \;\le\; E(U)\,\rho^{\mathbb R}(n,\varepsilon),$$

where E(U) is the expected number of real roots on U.
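Theorem 6 is phrased in terms of E(U), the expected number of real roots. For one univariate random polynomial of degree d with the Kostlan measure (dense support, variance matrix C = Diag(binom(d, a))), the expected total number of real roots is exactly √d — the square root of the number of complex roots, as recalled below for the dense case. The Monte Carlo sketch below checks that value numerically; the sampling scheme, seed, and tolerances are our choices:

```python
import math
import numpy as np

rng = np.random.default_rng(0)

def expected_real_roots(d, trials=4000):
    """Monte Carlo estimate of the average number of real roots of a
    Kostlan random polynomial of degree d: coefficient of x^a drawn
    from N(0, binomial(d, a)).  The exact expectation is sqrt(d)."""
    sd = np.sqrt([math.comb(d, a) for a in range(d + 1)])
    total = 0
    for _ in range(trials):
        coeffs = rng.standard_normal(d + 1) * sd    # variance profile is
        roots = np.roots(coeffs)                    # symmetric, so coefficient
        total += int(np.sum(np.abs(roots.imag) < 1e-7))  # order is irrelevant
    return total / trials

est = expected_real_roots(4)   # exact value: sqrt(4) = 2
```

For real coefficient vectors `np.roots` works with a real companion matrix, so genuinely real roots come back with zero imaginary part and the threshold test is robust.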
Note that E(U) depends on C, so even if we make U = R^n we may still obtain a bound depending on C. Shub and Smale showed in [32] that the expected number of real roots in the dense case (with a particular choice of probability measure) is exactly the square root of the expected number of complex roots. The sparse analogue of this result seems hard to prove even in the general unmixed case: explicit formulae for the unmixed case are known only in certain special cases, e.g., certain systems of bounded multi-degree [30,27]. Hence our last theorem can be interpreted as another step toward a fuller generalization.

2. Symplectic geometry and polynomial systems

2.1. Some basic definitions and examples

For the standard definitions and properties of symplectic structures, complex structures, Riemannian manifolds, and Kähler manifolds, we refer the reader to [26,8]. A treatment focusing on toric manifolds can be found in [15, Appendix A]. We briefly review a few of the basics before moving on to the proofs of our theorems.

Definition 6 (Kähler manifolds). Let M be a complex manifold, with complex structure J and a strictly positive symplectic (1,1)-form ω on M (considered as a real manifold). We then call the triple (M, ω, J) a Kähler manifold.

Example 1 (Affine space). We identify C^M with R^{2M} and use coordinates $Z^i = X^i + \sqrt{-1}\,Y^i$. The canonical 2-form $\omega_Z = \sum_{i=1}^M dX_i\wedge dY_i$ makes C^M into a symplectic manifold. The natural complex structure J is just multiplication by $\sqrt{-1}$. The triple (C^M, ω_Z, J) is a Kähler manifold.

Example 2 (Projective space). Projective space P^{M−1} admits a canonical 2-form defined as follows. Let Z = (Z^1, …, Z^M) ∈ (C^M)^*, and let [Z] = (Z^1 : ⋯ : Z^M) ∈ P^{M−1} be the corresponding point in P^{M−1}. The tangent space T_{[Z]} P^{M−1} may be modeled by Z^⊥ ⊂ T_Z C^M. Then we can define a 2-form on P^{M−1} by setting

$$\omega_{[Z]}(u,v) = \|Z\|^{-2}\,\omega_Z(u,v),$$

where it is assumed that u and v are orthogonal to Z. The latter assumption tends to be quite inconvenient, and most people prefer to pull ω_{[Z]} back to C^M by the canonical projection π : Z ↦ [Z]. It is standard to write the pull-back Ω = π^*ω_{[Z]} as

$$\Omega_Z = -\tfrac12\,d\,J^*\,d\ \tfrac12\log\|Z\|^2,$$

using the notation $d\varphi = \sum_i \frac{\partial\varphi}{\partial p_i}\,dp_i + \frac{\partial\varphi}{\partial q_i}\,dq_i$, and where J^* denotes the pull-back by J. Projective space also inherits the complex structure from C^M. Then ω_{[Z]} is a strictly positive (1,1)-form. The corresponding metric is called the Fubini–Study metric on P^{M−1}.
Remark 1. Some authors prefer to write $\sqrt{-1}\,\partial\bar\partial$ instead of $-\tfrac12\,d\,J^*\,d$. The following notation is assumed: $\partial\varphi = \sum_i \frac{\partial\varphi}{\partial Z_i}\,dZ_i$ and $\bar\partial\varphi = \sum_i \frac{\partial\varphi}{\partial \bar Z_i}\,d\bar Z_i$. Then they write Ω_Z as

$$\Omega_Z = \sqrt{-1}\left(\frac{\sum_i dZ_i\wedge d\bar Z_i}{2\,\|Z\|^2} \;-\; \frac{\bigl(\sum_i \bar Z_i\,dZ_i\bigr)\wedge\bigl(\sum_j Z_j\,d\bar Z_j\bigr)}{2\,\|Z\|^4}\right).$$

Example 3 (Toric Kähler manifolds from point sets). Let A be any M × n matrix with integer entries whose row vectors have n-dimensional convex hull, and let C be any diagonal positive definite M × M matrix. Define the map $\hat V_A$ from C^n into C^M by

$$\hat V_A : z \mapsto C^{1/2}\begin{pmatrix} z^{A_1}\\ \vdots\\ z^{A_M} \end{pmatrix},$$

where A_1, …, A_M denote the rows of A. We can also compose with the projection π into projective space to obtain a slightly different map $V_A = \pi\circ\hat V_A : \mathbb C^n \to \mathbb P^{M-1}$ defined by $V_A : z\mapsto[\hat V_A(z)]$. When C is the identity, the Zariski closure of the image of V_A is called the Veronese variety and the map V_A is called the Veronese embedding. Note that V_A is not defined for certain values of z, like z = 0. Those values comprise the exceptional set, which is a subset of the coordinate hyperplanes.

There is then a natural symplectic structure on the closure of the image of V_A, given by the restriction of the Fubini–Study 2-form Ω. We will see below (Lemma 1) that, by our assumption on the convex hull of the rows of A, DV_A is of rank n for z ∈ (C^*)^n. Thus, we can pull this structure back to (C^*)^n by $\vartheta_A = V_A^*\Omega$. Also, we can pull back the complex structure of P^{M−1}, so that ϑ_A becomes a strictly positive (1,1)-form. Therefore, the matrix A defines a Kähler manifold ((C^*)^n, ϑ_A, J).

The reason we introduced C in the definition of $\hat V_A$ is as follows: if f denotes also the row vector of the scaled coefficients of f, then $f(z) = \sum_a f_a\,(C_a)^{1/2} z^a = f\cdot\hat V_A(z)$. This way, the 2-norm of the row vector f is also the norm of the polynomial f in $(\mathcal F_A, \|\cdot\|_{C^{-1}})$. A random normal polynomial with variance matrix C corresponds to a random normal row vector f with unit variance.

Example 4 (Toric manifolds in logarithmic coordinates). For any matrix A as in the previous example, we can pull back the Kähler structure of ((C^*)^n, ϑ_A, J) to obtain another Kähler manifold (T^n, ω_A, J). (Actually, it is the same object in logarithmic coordinates, minus points at "infinity".) An equivalent definition is to pull back the Kähler structure of the Veronese variety by $\hat v_A \stackrel{\mathrm{def}}{=} \hat V_A\circ\exp$.

Remark 2. The Fubini–Study metric on C^M was constructed by applying the operator $-\tfrac12\,d\,J^*\,d$ to a certain convex function (in our case, $\tfrac12\log\|Z\|^2$). This is a general standard way to construct Kähler structures. In [14], it is explained how to associate a (non-unique) convex function to any convex body, thus producing an associated Kähler metric.
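Example 3's bookkeeping can be checked numerically: with f the row vector of scaled coefficients, f·V̂_A(z) reproduces the polynomial value Σ_a f_a (C_a)^{1/2} z^a. A minimal sketch (hypothetical helper names):

```python
import numpy as np

def veronese_hat(A, C_diag, z):
    """The map Vhat_A(z) = C^{1/2} (z^{A_1}, ..., z^{A_M})^T of Example 3,
    for a diagonal positive definite C given by its diagonal."""
    A = np.asarray(A)
    monomials = np.prod(np.asarray(z, dtype=complex) ** A, axis=1)
    return np.sqrt(C_diag) * monomials

# Scaled coefficient row vector f: the polynomial is sum_a f_a C_a^{1/2} z^a.
A = [(0, 0), (1, 0), (0, 1)]            # monomials 1, z1, z2
C_diag = np.array([1.0, 4.0, 9.0])
f = np.array([1.0, 0.5, -1.0])
z = (2.0, 1.0)
lhs = f @ veronese_hat(A, C_diag, z)    # f . Vhat_A(z)
rhs = 1.0 * 1.0 + 0.5 * 2.0 * z[0] - 1.0 * 3.0 * z[1]   # direct evaluation
```

The point of the scaling is the last sentence of Example 3: the Euclidean norm of the row vector f equals the ‖·‖_{C^{-1}} norm of the polynomial, so unit-variance Gaussian row vectors realize exactly the random systems of Definition 2.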
For the record, we state explicit formulae for several of the invariants associated to the Kähler manifold (T^n, ω_A, J). First of all, the function $g_A = g\circ\hat v_A$ is precisely:

Formula 2.1.1 (The canonical integral g_A (or Kähler potential) of the convex set associated to A).

$$g_A(p) := \tfrac12\,\log\bigl((\exp(A\cdot p))^T\,C\,(\exp(A\cdot p))\bigr).$$

The terminology integral is borrowed from mechanics, and it refers to the invariance of g_A under a [0, 2π)^n-action. Also, the gradient of g_A is called the momentum map. Recall that the Veronese embedding takes values in projective space. We will use the following notation: $v_A(p) = \hat v_A(p)/\|\hat v_A(p)\|$. This is independent of the choice of representative of the equivalence class v_A(p). Now, let $v_A(p)^2$ mean coordinate-wise squaring, and let $v_A(p)^{2T}$ be the transpose of $v_A(p)^2$. The gradient of g_A is then:

Formula 2.1.2 (The momentum map associated to A).

$$\nabla g_A = v_A(p)^{2T}\,A.$$

Formula 2.1.3 (Second derivative of g_A).

$$D^2 g_A = 2\,Dv_A(p)^T\,Dv_A(p).$$

We also have the following formulae:

Formula 2.1.4 (The symplectic 2-form associated to A).

$$(\omega_A)_{(p,q)} = \tfrac12\sum_{ij}(D^2 g_A)_{ij}\;dp_i\wedge dq_j.$$

Formula 2.1.5 (Hermitian structure of T^n associated to A).

$$\langle u, w\rangle_{A,(p,q)} = u^H\left(\tfrac12 D^2 g_A\right)_p w.$$

In general, the function v_A goes from T^n into projective space. Therefore, its derivative is a mapping

$$(Dv_A)_{(p,q)} : T_{(p,q)}T^n \to T_{v_A(p+q\sqrt{-1})}\mathbb P^{M-1} \simeq \hat v_A(p+q\sqrt{-1})^\perp \subset \mathbb C^M.$$

For convenience, we will write this derivative as a mapping into C^M, with range $\hat v_A(p+q\sqrt{-1})^\perp$. Let P_v be the projection operator

$$P_v = I - \frac{1}{\|v\|^2}\,v\,v^H.$$

We then have the following formula.
Formula 2.1.6 (Derivative of v_A).

$$(Dv_A)_{(p,q)} = P_{\hat v_A(p+q\sqrt{-1})}\;\mathrm{Diag}\!\left(\frac{\hat v_A(p+q\sqrt{-1})}{\|\hat v_A(p+q\sqrt{-1})\|}\right) A.$$

Lemma 1. Let A be a matrix with non-negative integer entries, such that Conv(A) has dimension n. Then $(Dv_A)_p$ (resp. $(Dv_A)_{p+iq}$) is injective, for all p ∈ R^n (resp. for all p+iq ∈ C^n).

Proof. We prove only the real case (the complex case is analogous). The conclusion of this lemma can fail only if there are p ∈ R^n and u ≠ 0 with $(Dv_A)_p u = 0$. This means that

$$P_{v_A(p)}\,\mathrm{Diag}(v_A(p))\,A\,u = 0.$$

This can only happen if Diag(v_A(p)) A u is in the space spanned by v_A(p) or, equivalently, A u is in the space spanned by (1, 1, …, 1)^T. This means that all the rows a of A satisfy a·u = λ for some fixed λ. Interpreting each row of A as a vertex of Conv(A), this means that Conv(A) is contained in the affine plane {a : a·u = λ}, contradicting the assumption that Conv(A) has dimension n.

An immediate consequence of Formula 2.1.6 is:

Lemma 2. Let f ∈ F_A and (p,q) ∈ T^n be such that $f\cdot\hat v_A(p+q\sqrt{-1}) = 0$. Then

$$f\cdot(Dv_A)_{(p,q)} = \frac{1}{\|\hat v_A(p,q)\|}\;f\cdot(D\hat v_A)_{(p,q)}.$$
In other words, when $(f\circ\exp)(p+q\sqrt{-1})$ vanishes, Dv_A and D$\hat v$_A are the same up to scaling. Noting that the Hermitian metric can be written $\langle u, w\rangle_{A,(p,q)} = u^H\,Dv_A(p,q)^H\,Dv_A(p,q)\,w$, we also obtain the following formula.

Formula 2.1.7 (Volume element of (T^n, ω_A, J)).

$$dT^n_A = \det\left(\tfrac12 D^2 g_A(p)\right)\,dp_1\wedge\cdots\wedge dp_n\wedge dq_1\wedge\cdots\wedge dq_n.$$

2.2. Toric actions and the momentum map

The momentum map, also called the moment map, was introduced in its modern formulation by Smale [37] and Souriau [38]. The reader may consult one of the many textbooks on the subject (such as Abraham and Marsden [1] or McDuff and Salamon [26]) for a general exposition (see also the discussion at the end of [23]). In this section we instead follow the point of view of Gromov [14]. The main results in this section are the two propositions below.

Proposition 1. The momentum map ∇g_A maps T^n onto the interior of Conv(A). When ∇g_A is restricted to the real n-plane [q = 0] ⊂ T^n, this mapping is a bijection.
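Proposition 1 is easy to probe numerically: by Formula 2.1.2 the momentum map is a convex combination of the rows of A, weighted by the squared normalized Veronese coordinates, so its values visibly land in Conv(A). A sketch, assuming a diagonal C given by its diagonal (helper name is ours):

```python
import numpy as np

def momentum_map(A, C_diag, p):
    """The momentum map of Formula 2.1.2: grad g_A(p) = v_A(p)^{2T} A.
    The squared normalized coordinates of vhat_A(p) are the weights
    C_a * exp(2 a.p), normalized to sum to 1, so the value is a convex
    combination of the rows of A and lies in Conv(A) (Proposition 1)."""
    A = np.asarray(A, dtype=float)
    w = C_diag * np.exp(2.0 * A @ p)
    w = w / w.sum()            # convex weights on the rows of A
    return w @ A

A = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 2.0)]
mu = momentum_map(A, np.ones(4), np.array([0.3, -0.7]))
# As p -> t*v with t large, the weight concentrates on the row maximizing
# a.v, which is exactly the vertex argument in the proof of Proposition 1.
```

At p = 0 with C = I the weights are uniform, so the momentum map returns the barycenter of the rows of A; pushing p far in a direction v drives it toward the vertex of Conv(A) extremal for v.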
This would appear to be a particular case of the Atiyah–Guillemin–Sternberg theorem [2,16]. However, technical difficulties prevent us from directly applying this result here.³

Proposition 2. The momentum map ∇g_A is a volume-preserving map from the manifold (T^n, ω_A, J) into Conv(A), up to a constant, in the following sense: if U is a measurable region of Conv(A), then

$$\mathrm{Vol}\bigl((\nabla g_A)^{-1}(U)\bigr) = \pi^n\,\mathrm{Vol}(U).$$

Proof of Proposition 2. Consider the mapping

$$M : T^n \to \tfrac12\,\mathrm{Conv}(A)\times(\mathbb R\bmod 2\pi)^n,\qquad (p,q)\mapsto\left(\tfrac12\nabla g_A(p),\ q\right).$$

Since we assume dim Conv(A) = n, we can apply Proposition 1 and conclude that M is a diffeomorphism. The pull-back of the canonical symplectic structure in R^{2n} by M is precisely ω_A, because of Formulae 2.1.3 and 2.1.4. Diffeomorphisms with that property are called symplectomorphisms. Since the volume form of a symplectic manifold depends only on the canonical 2-form, symplectomorphisms preserve volume. We compose with a scaling by ½ in the first n variables, which divides Vol(U) by 2^n, and we are done.

Before proving Proposition 1, we will need the following result about convexity, which has been attributed to Legendre. (See also [14, Convexity Theorem 1.2] and a generalization in [4, Theorem 5.1].)

Legendre's Theorem. If f is convex and of class C² on R^n, then the closure of the image {∇f|_r : r ∈ R^n} in R^n is convex.

By replacing f by g_A, we conclude that the image of the momentum map ∇g_A is convex.

Proof of Proposition 1. The momentum map ∇g_A maps T^n onto the interior of Conv(A). Indeed, let a = A_α be a row of A, associated to a vertex of Conv(A). Then there is a direction v ∈ R^n such that

$$a\cdot v = \max_{x\in\mathrm{Conv}(A)} x\cdot v$$

for some unique a. We claim that a ∈ ∇g_A(R^n). Indeed, let x(t) = v_A(tv), t a real parameter. If b is another row of A, then

$$e^{a\cdot tv} = e^{t\,a\cdot v} \;\gg\; e^{t\,b\cdot v} = e^{b\cdot tv}$$

³ The Atiyah–Guillemin–Sternberg Theorem applies to compact symplectic manifolds, and the implied compactification of T^n may have singularities.
as t → ∞. We can then write $\hat v_A(tv)^{2T}$ as

$$\hat v_A(tv)^{2T} = \left(C\;\mathrm{Diag}\bigl(e^{tA_1\cdot v},\ldots,e^{tA_M\cdot v}\bigr)\begin{pmatrix} e^{tA_1\cdot v}\\ \vdots\\ e^{tA_M\cdot v}\end{pmatrix}\right)^{\!T}.$$

Since C is positive definite, $C_{\alpha\alpha} > 0$ and

$$\lim_{t\to\infty} v_A(tv)^{2T} = \lim_{t\to\infty}\frac{\hat v_A(tv)^{2T}}{\|\hat v_A(tv)\|^2} = \frac{C_{\alpha\alpha}}{C_{\alpha\alpha}}\,e_a^T = e_a^T,$$

where e_a is the unit vector in R^M corresponding to the row a. It follows that $\lim_{t\to\infty}\nabla g_A(tv) = a$.

When we set q = 0, we have det D²g_A ≠ 0 on R^n, so we have a local diffeomorphism at each point p ∈ R^n. Assume that $(\nabla g_A)_p = (\nabla g_A)_{p'}$ for p ≠ p'. Then let γ(t) = (1−t)p + tp'. The function $t\mapsto(\nabla g_A)_{\gamma(t)}\cdot\gamma'(t)$ has the same value at 0 and at 1, hence by Rolle's Theorem its derivative must vanish at some t* ∈ (0, 1). In that case,

$$(D^2 g_A)_{\gamma(t^*)}\bigl(\gamma'(t^*),\,\gamma'(t^*)\bigr) = 0,$$

and since γ'(t*) = p' − p ≠ 0, det D²g_A must vanish at γ(t*) ∈ R^n. This contradicts Lemma 1.

2.3. The condition matrix

Following [6], we look at the linearization of the implicit function $p+q\sqrt{-1} = G(f)$ for the equation $\mathrm{ev}_{\mathcal A}(f,\,p+q\sqrt{-1}) = 0$.

Definition 7. The condition matrix of ev at $(f,\,p+q\sqrt{-1})$ is

$$DG = D_{T^n}(\mathrm{ev})^{-1}\,D_{\mathcal F}(\mathrm{ev}),$$

where $\mathcal F = \mathcal F_{A_1}\times\cdots\times\mathcal F_{A_n}$. Above, $D_{T^n}(\mathrm{ev})$ is a linear operator from an n-dimensional complex space into C^n, while $D_{\mathcal F}(\mathrm{ev})$ goes from an (M_1 + ⋯ + M_n)-dimensional complex space into C^n.

Lemma 3. If p+iq ∈ T^n and f(exp(p+iq)) = O, then

$$\det(DG\,DG^H)^{-1}\;dp_1\wedge dq_1\wedge\cdots\wedge dp_n\wedge dq_n \;=\; (-1)^{n(n-1)/2}\,\bigwedge_i \sqrt{-1}\;f^i\cdot(Dv_{A_i})_{(p,q)}\,dp\,\wedge\,\bar f^i\cdot(Dv_{A_i})_{(p,-q)}\,dq.$$

Note that although $f^i\cdot(Dv_{A_i})_{(p,q)}\,dp$ is a complex-valued form, each wedge $f^i\cdot(Dv_{A_i})_{(p,q)}\,dp\wedge\bar f^i\cdot(Dv_{A_i})_{(p,-q)}\,dq$ is a real-valued 2-form.
Proof of Lemma 3. We compute:

$$D_{\mathcal F}(\mathrm{ev})|_{(p,q)} = \begin{pmatrix} \sum_{\alpha=1}^{M_1}\hat v^\alpha_{A_1}(p+q\sqrt{-1})\,df^1_\alpha\\ \vdots\\ \sum_{\alpha=1}^{M_n}\hat v^\alpha_{A_n}(p+q\sqrt{-1})\,df^n_\alpha \end{pmatrix},$$

and hence $D_{\mathcal F}(\mathrm{ev})\,D_{\mathcal F}(\mathrm{ev})^H = \mathrm{Diag}\bigl(\|\hat v_{A_i}\|^2\bigr)$. Also,

$$D_{T^n}(\mathrm{ev}) = \begin{pmatrix} f^1\cdot D\hat v_{A_1}\\ \vdots\\ f^n\cdot D\hat v_{A_n} \end{pmatrix}.$$

Therefore,

$$\det(DG_{(p,q)}\,DG_{(p,q)}^H)^{-1} = \left|\det\begin{pmatrix} \frac{1}{\|\hat v_{A_1}\|}\,f^1\cdot D\hat v_{A_1}\\ \vdots\\ \frac{1}{\|\hat v_{A_n}\|}\,f^n\cdot D\hat v_{A_n} \end{pmatrix}\right|^2.$$

We can now use Lemma 2 to conclude the following:

Formula 2.3.1 (Determinant of the condition matrix).

$$\det(DG_{(p,q)}\,DG_{(p,q)}^H)^{-1} = \left|\det\begin{pmatrix} f^1\cdot Dv_{A_1}\\ \vdots\\ f^n\cdot Dv_{A_n} \end{pmatrix}\right|^2.$$

We can now write the same formula as the determinant of a block matrix:

$$\det(DG_{(p,q)}\,DG_{(p,q)}^H)^{-1} = \det\begin{pmatrix} \begin{matrix} f^1\cdot Dv_{A_1}\\ \vdots\\ f^n\cdot Dv_{A_n} \end{matrix} & \\ & \begin{matrix} \bar f^1\cdot D\bar v_{A_1}\\ \vdots\\ \bar f^n\cdot D\bar v_{A_n} \end{matrix} \end{pmatrix}$$

and replace the determinant by a wedge. The factor $(-1)^{n(n-1)/2}$ comes from replacing $dp_1\wedge\cdots\wedge dp_n\wedge dq_1\wedge\cdots\wedge dq_n$ by $dp_1\wedge dq_1\wedge\cdots\wedge dp_n\wedge dq_n$.
We are now ready to prove our main theorems.

3. The proofs of Theorems 1–6

We first prove that Theorem 1 follows from Theorem 4. Then we will prove our remaining main theorems in the following order: 2, 4, 5, 3, 6.

3.1. Proof of Theorem 1

The first assertion, modulo an exponential change of coordinates and using the multiprojective metric d_P(·,·), follows immediately from Theorem 4. As for the rest of Theorem 1, Theorem 4 applied to the linear case then provides the following interpretation of ρ^{Lin}(n, ε):

$$\rho^{\mathrm{Lin}}(n,\varepsilon) = \mathrm{Prob}\bigl[d_P(f,\Sigma_{(p,q)}) < \varepsilon\bigr],$$

where f is a complex random linear polynomial system, and (p,q) is such that f(exp(p+iq)) = 0. So we are on our way to proving the inequality

$$\mathrm{Prob}\bigl[d_P(f,\Sigma_{(p,q)}) < \varepsilon\bigr] \;\le\; n^3(n+1)\,\mathrm{Vol}(A)\,(\#A-1)(\#A-2)\,\varepsilon^4$$

for general f, which clearly implies our desired bound. To prove the latter inequality, recall that by the definition of the multiprojective distance d_P(·,·), we have the following equality:

$$d_P(f,\Sigma_{(p,q)})^2 = \min_{\substack{g\in\Sigma_{(p,q)}\\ \lambda\in(\mathbb C^*)^n}}\ \sum_{i=1}^n \frac{\|f^i-\lambda_i g^i\|^2}{\|f^i\|^2}.$$

So let g be such that the above minimum is attained. Without loss of generality, we may scale the g^i so that λ_1 = ⋯ = λ_n = 1. In that case,

$$d_P(f,\Sigma_{(p,q)})^2 = \sum_{i=1}^n \frac{\|f^i-g^i\|^2}{\|f^i\|^2} \;\ge\; \frac{\sum_{i=1}^n \|f^i-g^i\|^2}{\sum_{j=1}^n \|f^j\|^2}.$$

We are then in the setting of [6, pp. 248–250], where we identify our linear f with a normally distributed (n+1) × n complex matrix. The right-hand side in the above inequality is then precisely the left-hand term in [6, Remark 2, p. 250]. Therefore, using the notation of [6, Proposition 4], $d_P(f,\Sigma_{(p,q)}) \ge d_F(f,\Sigma_x)$. So it follows that

$$\rho^{\mathrm{Lin}}(n,\varepsilon) = \mathrm{Prob}\bigl[d_P(f,\Sigma_{(p,q)}) < \varepsilon\bigr] \;\le\; \mathrm{Prob}\bigl[d_F(f,\Sigma_x) < \varepsilon\bigr],$$

and the last probability is bounded above by $n^3(n+1)(\#A-1)(\#A-2)\,\varepsilon^4$ via [6, Theorem 6, p. 254]. Theorem 1 now follows.
3.2. Proof of Theorem 2

Using [6, Theorem 5, p. 243] (or Proposition 5, p. 31 below), we deduce that the average number of complex roots is

$$\mathrm{Avg} = \int_{(p,q)\in U}\ \int_{f\in\mathcal F_{(p,q)}} \frac{e^{-\sum_i\|f^i\|^2/2}}{(2\pi)^{\sum_i M_i}}\ \det(DG_{(p,q)}\,DG_{(p,q)}^H)^{-1}.$$

By Lemma 3, we can replace the inner integral by a 2n-form valued integral:

$$\mathrm{Avg} = (-1)^{n(n-1)/2}\int_{(p,q)\in U}\ \int_{f\in\mathcal F_{(p,q)}} \frac{e^{-\sum_i\|f^i\|^2/2}}{(2\pi)^{\sum_i M_i}}\ \bigwedge_i f^i\cdot(Dv_{A_i})_{(p,q)}\,dp\,\wedge\,\bar f^i\cdot(Dv_{A_i})_{(p,-q)}\,dq.$$

Since the image of $Dv_{A_i}$ is precisely $\mathcal F_{A_i,(p,q)}\subset\mathcal F_{A_i}$, one can add n extra variables corresponding to the directions $v_{A_i}(p+q\sqrt{-1})$ without changing the integral: we write $\mathcal F_{A_i} = \mathcal F_{A_i,(p,q)}\times\mathbb C\,v_{A_i}(p+q\sqrt{-1})$. Since $(f^i + t\,v_{A_i}(p+q\sqrt{-1}))\cdot Dv_{A_i}$ is equal to $f^i\cdot Dv_{A_i}$, the average number of roots is indeed:

$$\mathrm{Avg} = (-1)^{n(n-1)/2}\int_{(p,q)\in U}\ \int_{f\in\mathcal F} \frac{e^{-\sum_i\|f^i\|^2/2}}{(2\pi)^{\sum_i (M_i+1)}}\ \bigwedge_i f^i\cdot(Dv_{A_i})_{(p,q)}\,dp\,\wedge\,\bar f^i\cdot(Dv_{A_i})_{(p,-q)}\,dq.$$

In the integral above, all the terms that are multiples of $f^i_\alpha\bar f^i_\beta$ for some α ≠ β will cancel out. Therefore,

$$\mathrm{Avg} = (-1)^{n(n-1)/2}\int_{(p,q)\in U}\ \int_{f\in\mathcal F} \frac{e^{-\sum_i\|f^i\|^2/2}}{(2\pi)^{\sum_i (M_i+1)}}\ \bigwedge_i \sum_\alpha |f^i_\alpha|^2\,(Dv_{A_i})^\alpha_{(p,q)}\,dp\,\wedge\,(Dv_{A_i})^\alpha_{(p,-q)}\,dq.$$

Now, we apply the integral formula

$$\int_{x\in\mathbb C^M}|x_1|^2\ \frac{e^{-\|x\|^2/2}}{(2\pi)^M} \;=\; \int_{x_1\in\mathbb C}|x_1|^2\ \frac{e^{-|x_1|^2/2}}{2\pi} \;=\; 2$$

to obtain:

$$\mathrm{Avg} = \frac{(-1)^{n(n-1)/2}}{\pi^n}\int_{(p,q)\in U}\ \bigwedge_i \sum_\alpha (Dv_{A_i})^\alpha_{(p,q)}\,dp\,\wedge\,(Dv_{A_i})^\alpha_{(p,-q)}\,dq.$$

According to Formulae 2.1.3 and 2.1.4, the integrand is just $\bigwedge_i \omega_{A_i}$, and thus

$$\mathrm{Avg} = \frac{(-1)^{n(n-1)/2}}{\pi^n}\int_U \bigwedge_i \omega_{A_i} = \frac{n!}{\pi^n}\int_U dT^n.$$
3.3. Proof of Theorem 4

Let (p,q) ∈ T^n and let f ∈ F_{(p,q)}. Without loss of generality, we can assume that f is scaled so that ‖f^i‖ = 1 for all i. Let δf ∈ F_{(p,q)} be such that f + δf is singular at (p,q), and assume that Σ‖δf^i‖² is minimal. Then, due to the scaling we chose,

$$d_P(f,\Sigma_{(p,q)}) = \sqrt{\sum_i\|\delta f^i\|^2}.$$

Since f + δf is singular, there is a vector u ≠ 0 such that

$$\begin{pmatrix} (f^1+\delta f^1)\cdot(D\hat v_{A_1})_{(p,q)}\\ \vdots\\ (f^n+\delta f^n)\cdot(D\hat v_{A_n})_{(p,q)} \end{pmatrix} u = 0,$$

and hence

$$\begin{pmatrix} (f^1+\delta f^1)\cdot(Dv_{A_1})_{(p,q)}\\ \vdots\\ (f^n+\delta f^n)\cdot(Dv_{A_n})_{(p,q)} \end{pmatrix} u = 0.$$

This means that

$$f^i\cdot Dv_{A_i}\,u = -\,\delta f^i\cdot Dv_{A_i}\,u\qquad (i = 1,\ldots,n).$$

Let D(f) denote the matrix

$$D(f) \stackrel{\mathrm{def}}{=} \begin{pmatrix} f^1\cdot(Dv_{A_1})_{(p,q)}\\ \vdots\\ f^n\cdot(Dv_{A_n})_{(p,q)} \end{pmatrix}.$$

Given v = D(f)u, we obtain:

$$v_i = -\,\delta f^i\cdot Dv_{A_i}\,D(f)^{-1}v\qquad (i = 1,\ldots,n).\tag{3.3.1}$$

We can then scale u and v such that ‖v‖ = 1.

Claim 1. Under the assumptions above, δf^i is colinear to $(Dv_{A_i}D(f)^{-1}v)^H$.

Proof. Assume that δf^i = g + h, with g colinear and h orthogonal to $(Dv_{A_i}D(f)^{-1}v)^H$. As the image of $Dv_{A_i}$ is orthogonal to $v_{A_i}$, g is orthogonal to $v_{A_i}^H$, so ev(g, (p,q)) = 0 and hence ev(h, (p,q)) = 0. We can therefore replace δf^i by g without compromising equality (3.3.1). Since δf was minimal, this implies h = 0.
We now obtain an explicit expression for δf^i in terms of v:

$$\delta f^i = -\,v_i\ \frac{(Dv_{A_i}D(f)^{-1}v)^H}{\|Dv_{A_i}D(f)^{-1}v\|^2}.$$

Therefore,

$$\|\delta f^i\| = \frac{|v_i|}{\|Dv_{A_i}D(f)^{-1}v\|} = \frac{|v_i|}{\|D(f)^{-1}v\|_{A_i}}.$$

So we have proved the following result:

Lemma 4. Fix v so that ‖v‖ = 1 and let δf ∈ F_{(p,q)} be such that Eq. (3.3.1) holds and ‖δf‖ is minimal. Then

$$\|\delta f^i\| = \frac{|v_i|}{\|D(f)^{-1}v\|_{A_i}}.$$

Lemma 4 provides an immediate lower bound for $\|\delta f\| = \sqrt{\sum_i\|\delta f^i\|^2}$: since

$$\|\delta f^i\| \ge \frac{|v_i|}{\max_j\|D(f)^{-1}v\|_{A_j}},$$

we can use ‖v‖ = 1 to deduce that

$$\sqrt{\sum_i\|\delta f^i\|^2}\ \ge\ \frac{1}{\max_j\|D(f)^{-1}v\|_{A_j}}\ \ge\ \frac{1}{\max_{\|v\|=1}\max_j\|D(f)^{-1}v\|_{A_j}}.$$

Also, for any v with ‖v‖ = 1, we can choose δf minimal so that Eq. (3.3.1) applies. Using Lemma 4, we obtain:

$$\|\delta f^i\| \le \frac{|v_i|}{\min_j\|D(f)^{-1}v\|_{A_j}}.$$

Hence

$$\sqrt{\sum_i\|\delta f^i\|^2}\ \le\ \frac{1}{\min_j\|D(f)^{-1}v\|_{A_j}}.$$

Since this is true for any v, and δf is minimal for all v, we have

$$\sqrt{\sum_i\|\delta f^i\|^2}\ \le\ \frac{1}{\max_{\|v\|=1}\min_j\|D(f)^{-1}v\|_{A_j}},$$

and this proves Theorem 4.

3.4. The idea behind the proof of Theorem 5

The proof of Theorem 5 is long. We first sketch the idea of the proof. Recall that F_{(p,q)} is the set of all f ∈ F such that $\mathrm{ev}(f,\,p+q\sqrt{-1}) = 0$, and that Σ_{(p,q)} is the
restriction of the discriminant to the fiber F_{(p,q)}:

$$\Sigma_{(p,q)} \stackrel{\mathrm{def}}{=} \{f\in\mathcal F_{(p,q)} : D(f)_{(p,q)}\ \text{does not have full rank}\}.$$

The space F is endowed with a Gaussian probability measure, with volume element

$$\frac{e^{-\|f\|^2/2}}{(2\pi)^{\sum_i M_i}}\ d\mathcal F,$$

where dF is the usual volume form in $\mathcal F = (\mathcal F_{A_1},\langle\cdot,\cdot\rangle_{A_1})\times\cdots\times(\mathcal F_{A_n},\langle\cdot,\cdot\rangle_{A_n})$ and $\|f\|^2 = \sum_i\|f^i\|^2_{A_i}$. For U a set in T^n, we defined earlier (in the statement of Theorem 5) the quantity

$$\rho^{\mathcal A}(U,\varepsilon) \stackrel{\mathrm{def}}{=} \mathrm{Prob}[\mu(f,U)>\varepsilon^{-1}] = \mathrm{Prob}\bigl[\exists\,(p,q)\in U : d_P(f,\Sigma_{(p,q)})<\varepsilon\bigr].$$

The naïve idea for bounding ρ^𝒜(U, ε) is as follows. Let $V(\varepsilon) \stackrel{\mathrm{def}}{=} \{(f,(p,q))\in\mathcal F\times U : \mathrm{ev}(f,(p,q)) = 0\ \text{and}\ d_P(f,\Sigma_{(p,q)})<\varepsilon\}$. We also define π : V(ε) → F as the canonical projection mapping F × U to F, and set $\#_{V(\varepsilon)}(f) \stackrel{\mathrm{def}}{=} \#\{(p,q)\in U : (f,(p,q))\in V(\varepsilon)\}$. Then,

$$\rho^{\mathcal A}(U,\varepsilon) = \int_{f\in\mathcal F}\chi_{\pi(V(\varepsilon))}(f)\ \frac{e^{-\|f\|^2/2}}{(2\pi)^{\sum_i M_i}}\ d\mathcal F \;\le\; \int_{f\in\mathcal F}\#_{V(\varepsilon)}(f)\ \frac{e^{-\|f\|^2/2}}{(2\pi)^{\sum_i M_i}}\ d\mathcal F,$$

with equality in the linear case and when ε > √n. Now we apply the coarea formula [6, Theorem 5, p. 243] to obtain:

$$\rho^{\mathcal A}(U,\varepsilon) \;\le\; \int_{(p,q)\in U\subset T^n}\ \int_{\substack{f\in\mathcal F_{(p,q)}\\ d_P(f,\Sigma_{(p,q)})<\varepsilon}} \frac{1}{NJ(f,(p,q))}\ \frac{e^{-\|f\|^2/2}}{(2\pi)^{\sum_i M_i}}\ d\mathcal F\ dV_{T^n},$$

where dV_{T^n} stands for Lebesgue measure in T^n. Again, in the linear case, we have equality. We already know from Lemma 3 that

$$1/NJ(f,(p,q)) = \bigwedge_{i=1}^n f^i\cdot(Dv_{A_i})_{(p,q)}\,dp\,\wedge\,\bar f^i\cdot(D\bar v_{A_i})_{(p,q)}\,dq.$$

We should focus now on the inner integral. In each coordinate space F_{A_i}, we can introduce a new orthonormal system of coordinates (depending on (p,q)) by decomposing

$$f^i = f^i_{\mathrm I} + f^i_{\mathrm{II}} + f^i_{\mathrm{III}},$$

where $f^i_{\mathrm I}$ is the component colinear to $v_{A_i}^H$, $f^i_{\mathrm{II}}$ is the projection of f^i onto $(\mathrm{range}\ Dv_{A_i})^H$, and $f^i_{\mathrm{III}}$ is orthogonal to $f^i_{\mathrm I}$ and $f^i_{\mathrm{II}}$.
Of course, $f^i \in (F_{A_i})_{(p,q)}$ if and only if $f_I^i = 0$. Also,
$$\bigwedge_{i=1}^n f^i\cdot(Dv^{A_i})_{(p,q)}\,dp \wedge \bar f^i\cdot(\overline{Dv^{A_i}})_{(p,q)}\,dq = \bigwedge_{i=1}^n f_{II}^i\cdot(Dv^{A_i})_{(p,q)}\,dp \wedge \bar f_{II}^i\cdot(\overline{Dv^{A_i}})_{(p,q)}\,dq.$$
It is an elementary fact that
$$d_P(f_{II}^i + f_{III}^i,\,\Sigma_{(p,q)}) \le d_P(f_{II}^i,\,\Sigma_{(p,q)}).$$
It follows that for $f \in F_{(p,q)}$:
$$d_P(f,\Sigma_{(p,q)}) \le d_P(f_{II},\Sigma_{(p,q)}),$$
with equality in the linear case. Hence, we obtain:
$$\rho^A(U,\varepsilon) \le \int_{(p,q)\in U\subset T^n} \int_{\substack{f\in F_{(p,q)}\\ d_P(f_{II},\Sigma_{(p,q)})<\varepsilon}} \bigwedge_{i=1}^n f_{II}^i\cdot(Dv^{A_i})_{(p,q)}\,dp \wedge \bar f_{II}^i\cdot(\overline{Dv^{A_i}})_{(p,q)}\,dq \cdot \frac{e^{-\sum_i\|f_{II}^i+f_{III}^i\|^2/2}}{(2\pi)^{\sum M_i}}\,dF\,dV_{T^n},$$
with equality in the linear case. We can integrate the $\sum_i(M_i - n - 1)$ variables $f_{III}$ to obtain:
Proposition 3.
$$\rho^A(U,\varepsilon) \le \int_{(p,q)\in U\subset T^n} \int_{\substack{f_{II}\in\mathbb{C}^{n^2}\\ d_P(f_{II},\Sigma_{(p,q)})<\varepsilon}} \bigwedge_{i=1}^n f_{II}^i\cdot(Dv^{A_i})_{(p,q)}\,dp \wedge \bar f_{II}^i\cdot(\overline{Dv^{A_i}})_{(p,q)}\,dq \cdot \frac{e^{-\sum_i\|f_{II}^i\|^2/2}}{(2\pi)^{n(n+1)}}\,dV_{T^n}$$
with equality in the linear case.

3.5. Proof of Theorem 5

The domain of integration in Proposition 3 makes integration extremely difficult. In order to estimate the inner integral, we will need to perform a change of coordinates. Unfortunately, the Gaussian in Proposition 3 makes that change of coordinates extremely hard, and we will have to restate Proposition 3 in terms of integrals over a product of projective spaces.
The domain of integration will be $P^{n-1}\times\cdots\times P^{n-1}$. Translating an integral in terms of Gaussians into an integral in terms of projective spaces is not immediate, and we will use the following elementary fact about Gaussians:

Lemma 5. Let $\varphi : \mathbb{C}^n \to \mathbb{R}$ be $\mathbb{C}^*$-invariant (in the sense of the usual scaling action). Then we can also interpret $\varphi$ as a function from $P^{n-1}$ into $\mathbb{R}$, and:
$$\frac{1}{\mathrm{Vol}(P^{n-1})}\int_{[x]\in P^{n-1}} \varphi(x)\,d[x] = \int_{x\in\mathbb{C}^n} \varphi(x)\,\frac{e^{-\|x\|^2/2}}{(2\pi)^n}\,dx,$$
where, respectively, the natural volume forms on $P^{n-1}$ and $\mathbb{C}^n$ are understood for each integral.

Now the integrand in Proposition 3 is not $\mathbb{C}^*$-invariant. This is why we will need the following formula:

Lemma 6. Under the hypotheses of Lemma 5,
$$\frac{1}{\mathrm{Vol}(P^{n-1})}\int_{[x]\in P^{n-1}} \varphi(x)\,d[x] = \frac{1}{2n}\int_{x\in\mathbb{C}^n} \frac{e^{-\|x\|^2/2}}{(2\pi)^n}\,\|x\|^2\,\varphi(x)\,dx,$$
where, respectively, the natural volume forms on $P^{n-1}$ and $\mathbb{C}^n$ are understood for each integral.

Proof.
$$\begin{aligned}
\int_{x\in\mathbb{C}^n} \frac{e^{-\|x\|^2/2}}{(2\pi)^n}\,\|x\|^2\,\varphi(x)\,dx
&= \int_{\xi\in S^{2n-1}}\int_{r=0}^{\infty} \frac{e^{-|r|^2/2}}{(2\pi)^n}\,|r|^{2n+1}\,\varphi(\xi)\,dr\,d\xi\\
&= \int_{\xi\in S^{2n-1}} \left( \left[-|r|^{2n}\,\frac{e^{-|r|^2/2}}{(2\pi)^n}\right]_0^{\infty} + 2n\int_0^{\infty} |r|^{2n-1}\,\frac{e^{-|r|^2/2}}{(2\pi)^n}\,dr \right) \varphi(\xi)\,d\xi\\
&= 2n\int_{x\in\mathbb{C}^n} \frac{e^{-\|x\|^2/2}}{(2\pi)^n}\,\varphi(x)\,dx.
\end{aligned}$$
We can now introduce the notation:
$$\mathrm{WEDGE}^A(f_{II}) \stackrel{\mathrm{def}}{=} \bigwedge_{i=1}^n \frac{1}{\|f_{II}^i\|^2}\, f_{II}^i\cdot(Dv^{A_i})_{(p,q)}\,dp \wedge \bar f_{II}^i\cdot(\overline{Dv^{A_i}})_{(p,q)}\,dq.$$
This function is invariant under the $(\mathbb{C}^*)^n$-action $\lambda\cdot f_{II} : f_{II} \mapsto (\lambda_1 f_{II}^1,\ldots,\lambda_n f_{II}^n)$. We adopt the following conventions: $F_{II} \subset F$ is the space spanned by the coordinates $f_{II}$ and $P(F_{II})$ is its quotient by $(\mathbb{C}^*)^n$.
We apply Lemma 6 $n$ times and obtain:

Proposition 4. Let $\mathrm{VOL} \stackrel{\mathrm{def}}{=} \mathrm{Vol}(P^{n-1})^n$. Then,
$$\rho^A(U,\varepsilon) \le \frac{(2n)^n}{\mathrm{VOL}} \int_{(p,q)\in U\subset T^n} \int_{\substack{f_{II}\in P(F_{II})\\ d_P(f_{II},\Sigma_{(p,q)})<\varepsilon}} \mathrm{WEDGE}^A(f_{II})\,dP(F_{II})\,dV_{T^n}$$
with equality when $B \ge \sqrt{n}$. In the linear case,
$$\rho^{\mathrm{Lin}}(U,\varepsilon) = \frac{(2n)^n}{\mathrm{VOL}} \int_{(p,q)\in U\subset T^n} \int_{\substack{g_{II}\in P(F_{II}^{\mathrm{Lin}})\\ d_P(g_{II},\Sigma^{\mathrm{Lin}}_{(p,q)})<\varepsilon}} \mathrm{WEDGE}^{\mathrm{Lin}}(g_{II})\,d(PF_{II}^{\mathrm{Lin}})\,dV_{T^n}.$$
Now we introduce the following change of coordinates. Let $L \in GL(n)$ be such that the minimum in Definition 5, p. 6, is attained:
$$\varphi : P^{n-1}\times\cdots\times P^{n-1} \to P^{n-1}\times\cdots\times P^{n-1},\qquad f_{II} \mapsto g_{II} = \varphi(f_{II}),\quad \text{such that } g_{II}^i = f_{II}^i\cdot Dv^{A_i} L.$$
Without loss of generality, we scale $L$ such that $\det L = 1$. The following property follows from the definition of WEDGE:
$$\mathrm{WEDGE}^A(f_{II}) = \mathrm{WEDGE}^{\mathrm{Lin}}(g_{II}) \prod_{i=1}^n \frac{\|g_{II}^i\|^2}{\|f_{II}^i\|^2}. \tag{3.5.1}$$
Assume now that $d_P(f_{II},\Sigma_{(p,q)}) < \varepsilon$. Then there is $\delta f \in F_{II}$ such that $f + \delta f \in \Sigma_{(p,q)}$ and $\|\delta f\| \le \varepsilon$ (assuming the scaling $\|f_{II}^i\| = 1$ for all $i$). Setting $g_{II} = \varphi(f_{II})$ and $\delta g = \varphi(\delta f)$, we obtain that $g + \delta g \in \Sigma^{\mathrm{Lin}}_{(p,q)}$. Hence
$$d_P(g,\Sigma^{\mathrm{Lin}}_{(p,q)})^2 \le \sum_{i=1}^n \frac{\|\delta g^i\|^2}{\|g_{II}^i\|^2}.$$
At each value of $i$,
$$\frac{\|\delta g^i\|}{\|g_{II}^i\|} \le \kappa(D_{f_{II}^i}\varphi^i)\,\frac{\|\delta f^i\|}{\|f_{II}^i\|},$$
where $\kappa$ denotes Wilkinson's condition number of the linear operator $D_{f_{II}^i}\varphi^i$. This is precisely $\kappa(Dv^{A_i} L)$. Thus,
$$d_P(g,\Sigma^{\mathrm{Lin}}_{(p,q)}) \le \varepsilon\,\max_i \kappa(Dv^{A_i} L) = \varepsilon\,\max_i \kappa(\omega^{A_i}).$$
Thus, an $\varepsilon$-neighborhood of $\Sigma^A_{(p,q)}$ is mapped into a $\sqrt{\kappa_U}\,\varepsilon$-neighborhood of $\Sigma^{\mathrm{Lin}}_{(p,q)}$.
We use this property and Eq. (3.5.1) to bound:
$$\rho^A(U,\varepsilon) \le \frac{(2n)^n}{\mathrm{VOL}} \int_{(p,q)\in U\subset T^n} \int_{\substack{g_{II}\in P^{n-1}\times\cdots\times P^{n-1}\\ d_P(g_{II},\Sigma^{\mathrm{Lin}}_{(p,q)})<\sqrt{\kappa_U}\,\varepsilon}} \mathrm{WEDGE}^{\mathrm{Lin}}(g_{II}) \cdot \prod_{i=1}^n \frac{\|g_{II}^i\|^2}{\|f_{II}^i\|^2}\cdot |J_{g_{II}}\varphi^{-1}|^2\, d(P^{n-1}\times\cdots\times P^{n-1})\,dV_{T^n} \tag{3.5.2}$$
where $J_{g_{II}}\varphi^{-1}$ is the Jacobian of $\varphi^{-1}$ at $g_{II}$.

Remark 3. Considering each $Dv^{A_i}$ as a map from $\mathbb{C}^n$ into $\mathbb{C}^n$, the Jacobian is
$$J_{g_{II}}\varphi^{-1} = \prod_{i=1}^n \frac{\|\varphi^{-1}(g_{II})^i\|^n}{\|g_{II}^i\|^n}\,(\det Dv^{A_i\,H} Dv^{A_i})^{-1/2}.$$
We will not use this value in the sequel.

In order to simplify the expressions for the bound on $\rho^A(U,\varepsilon)$, it is convenient to introduce the following notation:
$$d\mathcal{P} \stackrel{\mathrm{def}}{=} \frac{(2n)^n\,\mathrm{WEDGE}^{\mathrm{Lin}}(g_{II})\; d(P^{n-1}\times\cdots\times P^{n-1})}{\mathrm{VOL}\; n!\,(\omega^{\mathrm{Lin}})^n},$$
$$H \stackrel{\mathrm{def}}{=} \prod_{i=1}^n \frac{\|g_{II}^i\|^2}{\|f_{II}^i\|^2}\,|J_{g_{II}}\varphi^{-1}|^2,$$
$$\chi_\delta \stackrel{\mathrm{def}}{=} \chi_{\{g\,:\,d_P(g,\Sigma^{\mathrm{Lin}}_{(p,q)})<\delta\}}.$$
Now Eq. (3.5.2) becomes:
$$\rho^A(U,\varepsilon) \le n! \int_{(p,q)\in U\subset T^n} (\omega^{\mathrm{Lin}})^n \int_{g_{II}\in P^{n-1}\times\cdots\times P^{n-1}} d\mathcal{P}\; H(g_{II})\,\chi_{\sqrt{\kappa_U}\,\varepsilon}(g_{II}). \tag{3.5.3}$$
Lemma 7. Let $(p,q)$ be fixed. Then $P^{n-1}\times\cdots\times P^{n-1}$, together with the density function $d\mathcal{P}$, is a probability space.

Proof. The expected number of roots in $U$ for a linear system is
$$n! \int_{(p,q)\in U} (\omega^{\mathrm{Lin}})^n \int_{g_{II}\in P^{n-1}\times\cdots\times P^{n-1}} d\mathcal{P}$$
and is also equal to $n! \int_U (\omega^{\mathrm{Lin}})^n$. This holds for all $U$, hence the volume forms are the same and
$$\int_{g_{II}\in P^{n-1}\times\cdots\times P^{n-1}} d\mathcal{P} = 1.$$
This allows us to interpret the inner integral of Eq. (3.5.3) as the expected value of a product. This is less than the product of the expected values, and:
$$\rho^A(U,\varepsilon) \le n! \int_{(p,q)\in U\subset T^n} (\omega^{\mathrm{Lin}})^n \left( \int_{g_{II}\in P^{n-1}\times\cdots\times P^{n-1}} d\mathcal{P}\; H(g_{II}) \right)\cdot\left( \int_{g_{II}\in P^{n-1}\times\cdots\times P^{n-1}} d\mathcal{P}\;\chi_{\sqrt{\kappa_U}\,\varepsilon}(g_{II}) \right).$$
Because generic (square) systems of linear equations have exactly one root, we can also consider $U$ as a probability space, with probability measure $(1/\mathrm{Vol}^{\mathrm{Lin}}(U))\,n!\,(\omega^{\mathrm{Lin}})^n$. Therefore, we can bound:
$$\rho^A(U,\varepsilon) \le \frac{1}{\mathrm{Vol}^{\mathrm{Lin}}(U)} \left( n! \int_{(p,q)\in U} (\omega^{\mathrm{Lin}})^n \int_{g_{II}\in P^{n-1}\times\cdots\times P^{n-1}} d\mathcal{P}\; H(g_{II}) \right)\cdot\left( n! \int_{(p,q)\in U} (\omega^{\mathrm{Lin}})^n \int_{g_{II}\in P^{n-1}\times\cdots\times P^{n-1}} d\mathcal{P}\;\chi_{\sqrt{\kappa_U}\,\varepsilon}(g_{II}) \right).$$
The first parenthetical expression is $\mathrm{Vol}^A(U)$, the volume of $U$ with respect to the toric volume form associated to $A = (A_1,\ldots,A_n)$. The second parenthetical expression is $\rho^{\mathrm{Lin}}(\sqrt{\kappa_U}\,\varepsilon, U)$. This concludes the proof of Theorem 5.

3.6. Proof of Theorem 3

As in the complex case (Theorem 2), the expected number of roots can be computed by applying the coarea formula:
$$\mathrm{AVG} = \int_{p\in U} \int_{f\in F^R_p} \frac{e^{-\sum_i\|f^i\|^2/2}}{\sqrt{2\pi}^{\,\sum_{i=1}^n M_i}}\,\det(DG\,DG^H)^{-1/2}.$$
Now there are three big differences. The set $U$ is in $\mathbb{R}^n$ instead of $T^n$, the space $F^R_p$ contains only real polynomials (and therefore has half the dimension), and we are integrating the square root of $1/\det(DG\,DG^H)$. Since we do not know in general how to integrate such a square root, we bound the inner integral as follows. We consider the real Hilbert space of functions integrable in $F^R_p$ endowed with Gaussian probability measure. The inner product in this space is:
$$\langle\varphi,\psi\rangle \stackrel{\mathrm{def}}{=} \int_{F^R_p} \varphi(f)\,\psi(f)\,\frac{e^{-\sum_i\|f^i\|^2/2}}{\prod_{i=1}^n \sqrt{2\pi}^{\,M_i-1}}\,dV,$$
where $dV$ is Lebesgue volume. If $1$ denotes the constant function equal to $1$, we interpret
$$\mathrm{AVG} = \int_{p\in U} (2\pi)^{-n/2}\,\big\langle \det(DG\,DG^H)^{-1/2},\,1\big\rangle.$$
Hence the Cauchy–Schwartz inequality implies:
$$\mathrm{AVG} \le \int_{p\in U} (2\pi)^{-n/2}\,\big\|\det(DG\,DG^H)^{-1/2}\big\|\,\|1\|.$$
By construction, $\|1\| = 1$, and we are left with:
$$\mathrm{AVG} \le \int_{p\in U} (2\pi)^{-n/2} \sqrt{\int_{F^R_p} \frac{e^{-\sum_i\|f^i\|^2/2}}{\prod_{i=1}^n \sqrt{2\pi}^{\,M_i-1}}\,\det(DG\,DG^H)^{-1}}.$$
As in the complex case, we add $n$ extra variables:
$$\mathrm{AVG} \le (2\pi)^{-n/2} \int_{p\in U} \sqrt{\int_{F^R} \frac{e^{-\sum_i\|f^i\|^2/2}}{\prod_{i=1}^n \sqrt{2\pi}^{\,M_i}}\,\det(DG\,DG^H)^{-1}},$$
and we interpret $\det(DG\,DG^H)^{-1}$ in terms of a wedge. Since
$$\int_{x\in\mathbb{R}^M} \frac{e^{-\|x\|^2/2}\,|x_1|^2}{\sqrt{2\pi}^{\,M}} = \int_{y\in\mathbb{R}} \frac{e^{-y^2/2}\,y^2}{\sqrt{2\pi}} = \int_{y\in\mathbb{R}} \frac{e^{-y^2/2}}{\sqrt{2\pi}} = 1,$$
we obtain:
$$\mathrm{AVG} \le (2\pi)^{-n/2} \int_{p\in U} \sqrt{n!\,dT^n}.$$
Now we would like to use Cauchy–Schwartz again. This time, the inner product is defined as
$$\langle\varphi,\psi\rangle \stackrel{\mathrm{def}}{=} \int_{p\in U} \varphi(p)\,\psi(p)\,dV.$$
Hence,
$$\mathrm{AVG} \le (2\pi)^{-n/2}\,\big\langle\sqrt{n!\,dT^n},\,1\big\rangle \le (2\pi)^{-n/2}\,\big\|\sqrt{n!\,dT^n}\big\|\,\|1\|.$$
This time, $\|1\|^2 = \mathrm{Vol}(U)$, so we bound:
$$\mathrm{AVG} \le (2\pi)^{-n/2}\sqrt{\mathrm{Vol}(U)\int_U n!\,dT^n} \le (4\pi^2)^{-n/2}\sqrt{\mathrm{Vol}(U)\int_{(p,q)\in T^n,\ p\in U} n!\,dT^n}.$$
3.7. Proof of Theorem 6

Let $\varepsilon > 0$. As in the mixed case, we define:
$$\rho^R(U,\varepsilon) \stackrel{\mathrm{def}}{=} \mathrm{Prob}_{f\in F}[\mu(f,U) > \varepsilon^{-1}] = \mathrm{Prob}_{f\in F}[\exists p\in U : \mathrm{ev}(f,p) = 0 \text{ and } d_P(f,\Sigma_p) < \varepsilon],$$
where now $U \subset \mathbb{R}^n$. Let $V(\varepsilon) \stackrel{\mathrm{def}}{=} \{(f,p)\in F^R\times U : \mathrm{ev}(f,p) = 0 \text{ and } d_P(f,\Sigma_p) < \varepsilon\}$. We also define $\pi : V(\varepsilon)\to F^R$ to be the canonical projection mapping $F^R\times U$ to $F^R$, and set $\#_{V(\varepsilon)}(f) \stackrel{\mathrm{def}}{=} \#\{p\in U : (f,p)\in V(\varepsilon)\}$. Then,
$$\begin{aligned}
\rho^R(U,\varepsilon) &= \int_{f\in F^R} \chi_{\pi(V(\varepsilon))}(f)\,\frac{e^{-\sum_i\|f^i\|^2/2}}{\sqrt{2\pi}^{\,\sum M_i}}\,dF^R\\
&\le \int_{f\in F^R} \#_{V(\varepsilon)}(f)\,\frac{e^{-\sum_i\|f^i\|^2/2}}{\sqrt{2\pi}^{\,\sum M_i}}\,dF^R\\
&\le \int_{p\in U\subset\mathbb{R}^n} \int_{\substack{f\in F^R_p\\ d_P(f,\Sigma_p)<\varepsilon}} \frac{e^{-\sum_i\|f^i\|^2/2}}{\sqrt{2\pi}^{\,\sum M_i}}\,\frac{1}{NJ(f,p)}\,dF^R_p\,dV_{T^n}.
\end{aligned}$$
As before, we change coordinates in each fiber of $F^R_A$ by $f = f_I + f_{II} + f_{III}$, with $f_I^i$ collinear to $v_A^T$, $(f_{II}^i)^T$ in the range of $Dv_A$, and $f_{III}$ orthogonal to $f_I^i$ and $f_{II}^i$. This coordinate system is dependent on $p + q\sqrt{-1}$. In the new coordinate system, formula 2.3.1 splits as follows:
$$\det(DG_{(p)}\,DG_{(p)}^H)^{-1/2}\,dV_{T^n}
= \det\begin{pmatrix} (f_{II}^1)_1 & \cdots & (f_{II}^1)_n\\ \vdots & & \vdots\\ (f_{II}^n)_1 & \cdots & (f_{II}^n)_n \end{pmatrix}
\det\begin{pmatrix} (Dv_A^{II})_{11} & \cdots & (Dv_A^{II})_{1n}\\ \vdots & & \vdots\\ (Dv_A^{II})_{n1} & \cdots & (Dv_A^{II})_{nn} \end{pmatrix} dV
= \det\begin{pmatrix} (f_{II}^1)_1 & \cdots & (f_{II}^1)_n\\ \vdots & & \vdots\\ (f_{II}^n)_1 & \cdots & (f_{II}^n)_n \end{pmatrix}\sqrt{\det Dv_A^H\,Dv_A}\; dV.$$
The integral $E(U)$ of $\sqrt{\det Dv_A\,Dv_A^H}$ is the expected number of real roots on $U$; therefore
$$\rho^R(U,\varepsilon) \le E(U) \int_{\substack{f_{II}+f_{III}\in F^R_p\\ d_P(f_{II}+f_{III},\Sigma_p)<\varepsilon}} \frac{e^{-\sum_i\|f_{II}^i+f_{III}^i\|^2/2}}{\sqrt{2\pi}^{\,\sum M_i}}\cdot\left|\det\begin{pmatrix} (f_{II}^1)_1 & \cdots & (f_{II}^1)_n\\ \vdots & & \vdots\\ (f_{II}^n)_1 & \cdots & (f_{II}^n)_n \end{pmatrix}\right|\,dF^R_p.$$
In the new system of coordinates, $\Sigma_p$ is defined by the equation:
$$\det\begin{pmatrix} (f_{II}^1)_1 & \cdots & (f_{II}^1)_n\\ \vdots & & \vdots\\ (f_{II}^n)_1 & \cdots & (f_{II}^n)_n \end{pmatrix} = 0.$$
Since $\|f_{II} + f_{III}\| \ge \|f_{II}\|$,
$$d_P(f_{II}+f_{III},\Sigma_p) < \varepsilon \;\Rightarrow\; d_P(f_{II},\Sigma_p) < \varepsilon.$$
This implies:
$$\rho^R(U,\varepsilon) \le E(U) \int_{\substack{f_{II}+f_{III}\in F^R_p\\ d_P(f_{II},[\det=0])<\varepsilon}} \frac{e^{-\sum_i\|f_{II}^i+f_{III}^i\|^2/2}}{\sqrt{2\pi}^{\,\sum M_i}}\cdot\left|\det\begin{pmatrix} (f_{II}^1)_1 & \cdots & (f_{II}^1)_n\\ \vdots & & \vdots\\ (f_{II}^n)_1 & \cdots & (f_{II}^n)_n \end{pmatrix}\right|\,dF^R_p.$$
We can integrate the $\sum_i(M_i - n - 1)$ variables $f_{III}$ to obtain:
$$\rho^R(U,\varepsilon) = E(U) \int_{\substack{f_{II}\in\mathbb{R}^{n^2}\\ d_P(f_{II},[\det=0])<\varepsilon}} \frac{e^{-\sum_i\|f_{II}^i\|^2/2}}{\sqrt{2\pi}^{\,n^2}}\,|\det f_{II}|\,d\mathbb{R}^{n^2}.$$
This is $E(U)$ times the probability $\rho(n,\varepsilon)$ for the linear case.

Acknowledgements

Steve Smale provided valuable inspiration for us to extend the theory of [32–36] to sparse polynomial systems. He also provided examples on how to eliminate the dependency upon unitary invariance in the dense case. The paper by Gromov [14] was of foremost importance to this research. To the best of our knowledge, [14] is the only clear exposition available of mixed volume in terms of a wedge of differential forms. We thank Mike Shub for pointing out that reference and for many suggestions.
We would like to thank Jean-Pierre Dedieu for sharing his thoughts with us on Newton iteration in Riemannian and quotient manifolds. Also, we would like to thank Felipe Acker, Felipe Cucker, Alicia Dickenstein, Ioannis Emiris, Askold Khovanskii, Eric Kostlan, T.Y. Li, Nelson Maculan, Martin Sombra and Jorge P. Zubelli for their suggestions and support.

This paper was written while G.M. was visiting the Liu Bie Ju Center for Mathematics at the City University of Hong Kong. He wishes to thank City U for its generous support. Some of the material above was previously circulated in the technical report [22].

Appendix A. The coarea formula

Here we give a short proof of the coarea formula, in a version suitable to the setting of this paper. This means we take all manifolds and functions smooth and avoid measure theory as much as possible.

Proposition 5. (1) Let $X$ be a smooth Riemann manifold, of dimension $M$ and volume form $|dX|$. (2) Let $Y$ be a smooth Riemann manifold, of dimension $n$ and volume form $|dY|$. (3) Let $U$ be an open set of $X$, and $F : U \to Y$ be a smooth map, such that $DF_x$ is surjective for all $x$ in $U$. (4) Let $\varphi : X \to \mathbb{R}^+$ be a smooth function with compact support contained in $U$. Then for almost all $z \in F(U)$, $V_z \stackrel{\mathrm{def}}{=} F^{-1}(z)$ is a smooth Riemann manifold, and
$$\int_X \varphi(x)\,NJ(F,x)\,|dX| = \int_{z\in Y}\int_{x\in V_z} \varphi(x)\,|dV_z|\,|dY|,$$
where $|dV_z|$ is the volume element of $V_z$ and $NJ(F,x) = \sqrt{\det DF_x^H\,DF_x}$ is the product of the singular values of $DF_x$.

By the implicit function theorem, whenever $V_z$ is non-empty, it is a smooth $(M-n)$-dimensional Riemann submanifold of $X$. By the same reason, $V := \{(z,x) : x \in V_z\}$ is also a smooth manifold. Let $\sigma$ be the following $M$-form restricted to $V$:
$$\sigma = dY \wedge dV_z.$$
This is not the volume form of $V$. The proof of Proposition 5 is divided into two steps:

Lemma 8.
$$\int_V \varphi(x)\,|\sigma| = \int_X \varphi(x)\,NJ(F,x)\,|dX|.$$
Lemma 9.
$$\int_V \varphi(x)\,|\sigma| = \int_{z\in Y}\int_{x\in V_z} \varphi(x)\,|dV_z|\,|dY|.$$

Proof of Lemma 8. We parametrize:
$$\iota : X \to V,\qquad x \mapsto (F(x), x).$$
Then,
$$\int_V \varphi(x)\,|\sigma| = \int_X (\varphi\circ\iota)(x)\,|\iota^*\sigma|.$$
We can choose an orthonormal basis $u_1,\ldots,u_M$ of $T_x X$ such that $u_{n+1},\ldots,u_M \in \ker DF_x$. Then,
$$D\iota(u_i) = \begin{cases} (DF_x u_i,\,u_i) & i = 1,\ldots,n,\\ (0,\,u_i) & i = n+1,\ldots,M. \end{cases}$$
Thus,
$$|\iota^*\sigma(u_1,\ldots,u_M)| = |\sigma(D\iota\,u_1,\ldots,D\iota\,u_M)| = |dY(DF_x u_1,\ldots,DF_x u_n)|\,|dV_z(u_{n+1},\ldots,u_M)| = \big|\det DF_x|_{\ker DF_x^\perp}\big| = NJ(F,x),$$
and hence
$$\int_V \varphi(x)\,|\sigma| = \int_X \varphi(x)\,NJ(F,x)\,|dX|.$$

Proof of Lemma 9. We will prove this Lemma locally; this implies the full Lemma through a standard argument (partitions of unity in a compact neighborhood of the support of $\varphi$). Let $x_0, z_0$ be fixed. A small enough neighborhood of $(x_0,z_0) \subset V_{z_0}$ admits a fibration over $V_{z_0}$ by planes orthogonal to $\ker DF_{x_0}$. We parametrize:
$$J : Y\times V_{z_0} \to V,\qquad (z,x) \mapsto (z, K(x,z)),$$
where $K(x,z)$ is the solution of $F(K) = z$ in the fiber passing through $(z_0,x)$. Remark that $J^* dY = dY$, and $J^* dV_z = K^* dV_z$. Therefore,
$$J^*(dY\wedge dV_z) = dY\wedge(K^* dV_z).$$
Also, if one fixes $z$, then $K$ is a parametrization $V_{z_0} \to V_z$. We have:
$$\int_V \varphi(x)\,|\sigma| = \int_{Y\times V_{z_0}} \varphi(K(x,z))\,|J^*\sigma| = \int_{z\in Y}\int_{x\in V_{z_0}} \varphi(K(x,z))\,|K^* dV_z|\,|dY| = \int_{z\in Y}\int_{x\in V_z} \varphi(x)\,|dV_z|\,|dY|.$$

The proposition below is essentially Theorem 3, p. 240 of [6]. However, we do not require our manifolds to be compact. We assume all maps and manifolds are smooth, so that we can apply Proposition 5.

Proposition 6. (1) Let $X$ be a smooth $M$-dimensional manifold with volume element $|dX|$. (2) Let $Y$ be a smooth $n$-dimensional manifold with volume element $|dY|$. (3) Let $V$ be a smooth $M$-dimensional submanifold of $X\times Y$, and let $\pi_1 : V\to X$ and $\pi_2 : V\to Y$ be the canonical projections from $X\times Y$ to its factors. (4) Let $\Sigma$ be the set of critical points of $\pi_1$; we assume that $\Sigma$ has measure zero and that $\Sigma$ is a manifold. (5) We assume that $\pi_2$ is regular (all points in $\pi_2(V)$ are regular values). (6) For any open set $U\subset V$, for any $x\in X$, we write $\#_U(x) \stackrel{\mathrm{def}}{=} \#\{\pi_1^{-1}(x)\cap U\}$. We assume that $\int_{x\in X} \#_V(x)\,|dX|$ is finite.

Then, for any open set $U\subset V$,
$$\int_{x\in\pi_1(U)} \#_U(x)\,|dX| = \int_{z\in Y}\int_{\substack{x\in V_z\\ (x,z)\in U}} \frac{1}{\sqrt{\det DG_x\,DG_x^H}}\,|dV_z|\,|dY|,$$
where $G$ is the implicit function for $(\hat x, G(\hat x))\in V$ in a neighborhood of $(x,z)\in V\setminus\Sigma$.

Proof. Every $(x,z)\in U\setminus\Sigma$ admits an open neighborhood such that $\pi_1$ restricted to that neighborhood is a diffeomorphism. This defines an open covering of $U\setminus\Sigma$. Since $U\setminus\Sigma$ is locally compact, we can take a countable sub-covering and define a partition of unity $(\varphi_\alpha)_{\alpha\in L}$ subordinated to that sub-covering. Also, if we fix a value of $z$, then $(\varphi_\alpha)_{\alpha\in L}$ becomes a partition of unity for $\pi_1(\pi_1^{-1}(V_z)\cap U)$. Therefore,
$$\int_{x\in\pi_1(U)} \#_U(x)\,|dX| = \sum_{\alpha\in L} \int_{(x,z)\in\mathrm{Supp}\,\varphi_\alpha} \varphi_\alpha(x,z)\,|dX| = \sum_{\alpha\in L} \int_{z\in Y}\int_{(x,z)\in\mathrm{Supp}\,\varphi_\alpha} \frac{\varphi_\alpha(x,z)}{NJ(G,x)}\,|dV_z|\,|dY|$$
$$= \int_{z\in Y}\sum_{\alpha\in L} \int_{(x,z)\in\mathrm{Supp}\,\varphi_\alpha} \frac{\varphi_\alpha(x,z)}{NJ(G,x)}\,|dV_z|\,|dY| = \int_{z\in Y}\int_{x\in V_z} \frac{1}{NJ(G,x)}\,|dV_z|\,|dY|,$$
where the second equality uses Proposition 5 with $\varphi = \varphi_\alpha/NJ$. Since $NJ = \sqrt{\det DG_x\,DG_x^H}$, we are done.

References

[1] R. Abraham, J.E. Marsden, Foundations of Mechanics, 2nd Edition, Benjamin/Cummings, Advanced Book Program, Reading, MA, 1978.
[2] M.F. Atiyah, Convexity and commuting Hamiltonians, Bull. London Math. Soc. 14 (1) (1982) 1–15.
[3] M.F. Atiyah, Angular momentum, convex polyhedra and algebraic geometry, Proc. Edinburgh Math. Soc. (2) 26 (2) (1983) 121–133.
[4] M. Avriel, Nonlinear Programming: Analysis and Methods, Prentice-Hall, Englewood Cliffs, NJ, 1976.
[5] D.N. Bernstein, The number of roots of a system of equations, Functional Anal. Appl. 9 (3) (1975) 183–185.
[6] L. Blum, F. Cucker, M. Shub, S. Smale, Complexity and Real Computation, Springer, New York, 1998.
[7] Yu.D. Burago, V.A. Zalgaller, Geometric Inequalities, Grundlehren der Mathematischen Wissenschaften, Vol. 285, Springer, Berlin, 1988.
[8] S.S. Chern, W.H. Chen, K.S. Lam, Lectures on Differential Geometry, World Scientific, River Edge, NJ, 1999.
[9] J.-P. Dedieu, Approximate solutions of numerical problems, condition number analysis and condition number theorem, in: The Mathematics of Numerical Analysis (Park City, UT, 1995), American Mathematical Society, Providence, RI, 1996, pp. 263–283.
[10] T. Delzant, Hamiltoniens périodiques et images convexes de l'application moment, Bull. Soc. Math. France 116 (3) (1988) 315–339.
[11] J.W. Demmel, Applied Numerical Linear Algebra, SIAM, Philadelphia, PA, 1997.
[12] A. Edelman, E. Kostlan, How many zeros of a random polynomial are real?, Bull. Amer. Math. Soc. (N.S.) 32 (1) (1995) 1–37.
[13] G. Ewald, Combinatorial Convexity and Algebraic Geometry, Springer, New York, 1996.
[14] M. Gromov, Convex sets and Kähler manifolds, in: Advances in Differential Geometry and Topology, World Scientific, Teaneck, NJ, 1990, pp. 1–38.
[15] V. Guillemin, Moment Maps and Combinatorial Invariants of Hamiltonian T^n-Spaces, Progress in Mathematics, Vol. 122, Birkhäuser Boston, Boston, MA, 1994.
[16] V. Guillemin, S. Sternberg, Convexity properties of the moment mapping, Invent. Math. 67 (3) (1982) 491–513.
[17] B. Huber, B. Sturmfels, A polyhedral method for solving sparse polynomial systems, Math. Comp. 64 (212) (1995) 1541–1555.
[18] B.Ja. Kazarnovskiĭ, On zeros of exponential sums, Soviet Math. Doklady 23 (2) (1981) 347–351.
[19] A.G. Khovanskiĭ, Fewnomials, Translations of Mathematical Monographs, Vol. 88, American Mathematical Society, Providence, RI, 1991.
[20] E. Kostlan, On the distribution of roots of random polynomials, in: From Topology to Computation: Proceedings of the Smalefest (Berkeley, CA, 1990), Springer, New York, 1993, pp. 419–431.
[21] T.-Y. Li, X. Li, Finding mixed cells in the mixed volume computation, Found. Comput. Math. 1 (2) (2001) 161–181.
[22] G. Malajovich, J.M. Rojas, Random sparse polynomial systems, Math ArXiv preprint NA/0012104, 2000.
[23] G. Malajovich, J.M. Rojas, Polynomial systems and the momentum map, in: F. Cucker, J.M. Rojas (Eds.), Foundations of Computational Mathematics: Proceedings of SMALEFEST 2000, World Scientific, Singapore, 2002, pp. 251–266.
[24] G. Malajovich, J. Zubelli, On the geometry of Graeffe iteration, J. Complexity 17 (3) (2001) 541–573.
[25] G. Malajovich, J. Zubelli, Tangent Graeffe iteration, Numer. Math. 89 (4) (2001) 749–782.
[26] D. McDuff, D. Salamon, Introduction to Symplectic Topology, 2nd Edition, Clarendon Press, Oxford University Press, New York, 1998.
[27] A. McLennan, The expected number of real roots of a multihomogeneous system of polynomial equations, Amer. J. Math. 124 (1) (2002) 49–73.
[28] G. Mikhalkin, Amoebas of algebraic varieties, preprint, Mathematics ArXiv AG/0108225, 2001.
[29] M. Passare, H. Rullgård, Amoebas, Monge-Ampère measures, and triangulations of the Newton polytope, Research Reports in Mathematics No. 10, Department of Mathematics, Stockholm University, 2000.
[30] J.M. Rojas, On the average number of real roots of certain random sparse polynomial systems, Lect. Appl. Math. 32 (1996) 689–699.
[31] J.R. Sangwine-Yager, Mixed volumes, in: Handbook of Convex Geometry, Vol. A, B, North-Holland, Amsterdam, 1993, pp. 43–71.
[32] M. Shub, S. Smale, Complexity of Bézout's theorem. II. Volumes and probabilities, in: Computational Algebraic Geometry (Nice, 1992), Birkhäuser Boston, Boston, MA, 1993, pp. 267–285.
[33] M. Shub, S. Smale, Complexity of Bézout's theorem. I. Geometric aspects, J. Amer. Math. Soc. 6 (2) (1993) 459–501.
[34] M. Shub, S. Smale, Complexity of Bézout's theorem. III. Condition number and packing, J. Complexity 9 (1) (1993) 4–14 (Festschrift for Joseph F. Traub, Part I).
[35] M. Shub, S. Smale, Complexity of Bézout's theorem. V. Polynomial time, Theoret. Comput. Sci. 133 (1) (1994) 141–164.
[36] M. Shub, S. Smale, Complexity of Bézout's theorem. IV. Probability of success; extensions, SIAM J. Numer. Anal. 33 (1) (1996) 128–148.
[37] S. Smale, Topology and mechanics, I, Invent. Math. 10 (1970) 305–331.
[38] J.-M. Souriau, Structure des systèmes dynamiques, Maîtrises de mathématiques, Dunod, Paris, 1970.
[39] A. Turing, On computable numbers, with an application to the Entscheidungsproblem, Proc. London Math. Soc. 2 (42) (1936) 230–265.
[40] J. Verschelde, P. Verlinden, R. Cools, Homotopies exploiting Newton polytopes for solving sparse polynomial systems, SIAM J. Numer. Anal. 31 (3) (1994) 915–930.
[41] O. Viro, Dequantization of real algebraic geometry on logarithmic paper, in: European Congress of Mathematics, Vol. I (Barcelona, 2000), Progress in Mathematics, Vol. 201, Birkhäuser, Basel, 2001, pp. 135–146.
Theoretical Computer Science 315 (2004) 557 – 579
www.elsevier.com/locate/tcs
Matrix algebra preconditioners for multilevel Toeplitz systems do not insure optimal convergence rate

D. Noutsos^a, S. Serra Capizzano^{b,∗}, P. Vassalos^a

^a Department of Mathematics, University of Ioannina, T.K 45110, Greece
^b Dipartimento di Chimica, Fisica e Matematica, Università dell'Insubria - Sede di Como, Via Valleggio 11, Como 22100, Italy
Abstract

In the last decades several matrix algebra optimal and superlinear preconditioners (those assuring a strong clustering at the unity) have been proposed for the solution of polynomially ill-conditioned Toeplitz linear systems. The corresponding generalizations for multilevel structures are neither optimal nor superlinear (see e.g. Contemp. Math. 281 (2001) 193). Concerning the notion of superlinearity, it has been recently shown that the proper clustering cannot be obtained in general (see Linear Algebra Appl. 343–344 (2002) 303; SIAM J. Matrix Anal. Appl. 22(1) (1999) 431; Math. Comput. 72 (2003) 1305). In this paper, by exploiting a proof technique previously proposed by the authors (see Contemp. Math. 323 (2003) 313), we prove that the spectral equivalence and the essential spectral equivalence (up to a constant number of diverging eigenvalues) are impossible too. In conclusion, optimal matrix algebra preconditioners in the multilevel setting simply do not exist in general, and therefore the search for optimal iterative solvers should be oriented to different directions, with special attention to multilevel/multigrid techniques.
© 2004 Elsevier B.V. All rights reserved.

MSC: 65F10; 15A12; 15A18

Keywords: Preconditioning and multigrid; Finite difference and Toeplitz matrices; Matrix algebras; (Essential) Spectral equivalence
∗ Corresponding author.
E-mail addresses: [email protected] (D. Noutsos), [email protected], [email protected] (S. Serra Capizzano), [email protected] (P. Vassalos).
0304-3975/$ - see front matter © 2004 Elsevier B.V. All rights reserved.
doi:10.1016/j.tcs.2004.01.007
D. Noutsos et al. / Theoretical Computer Science 315 (2004) 557 – 579
1. Introduction

In the past two decades a lot of attention has been paid to the solution of multilevel Toeplitz systems, owing to the several applications in which these structures occur, as in signal processing, image restoration, PDEs, time series (see e.g. [6]). If $f$ is a complex valued function of $d$ variables, integrable on the $d$-cube $Q^d := (0,2\pi)^d$, the symbol $\int_{Q^d}$ stands for $(2\pi)^{-d}\int_{Q^d}$, $\hat x = (x_1,\ldots,x_d)^T$, $\hat y = (y_1,\ldots,y_d)^T$, and $\langle\hat x,\hat y\rangle = \sum_j \bar x_j y_j$ is the usual inner product, then the Fourier coefficients of $f$, given by
$$\hat f_{\hat\jmath} := \int_{Q^d} f(\hat x)\,e^{-\mathrm{i}\langle\hat\jmath,\hat x\rangle}\,d\hat x,\qquad \mathrm{i}^2 = -1,\ \hat\jmath\in\mathbb{Z}^d,$$
are used for building up $d$-level Toeplitz matrices generated by $f$. More precisely, if $\hat n = (n_1,\ldots,n_d)^T$ is a $d$-index with positive integer entries, then the symbol $T_{\hat n}(f)$ denotes the $d$-level Toeplitz matrix of order $N(\hat n)$ (throughout, we let $N(\hat n) := \prod_{i=1}^d n_i$) constructed according to the rule
$$T_{\hat n}(f) = \sum_{|\hat\jmath|<\hat n} \hat f_{\hat\jmath}\,J_{\hat n}^{(\hat\jmath)} = \sum_{|j_1|<n_1}\cdots\sum_{|j_d|<n_d} \hat f_{(j_1,\ldots,j_d)}\,J_{n_1}^{(j_1)}\otimes\cdots\otimes J_{n_d}^{(j_d)}. \tag{1}$$
In the above equation, $\otimes$ denotes tensor product, $J_m^{(l)}$ denotes the matrix of order $m$ whose $(i,j)$ entry equals 1 if $j - i = l$ and equals zero otherwise, while $J_{\hat n}^{(\hat\jmath)}$, where $\hat\jmath$ and $\hat n$ are multi-indices, is the tensor product of all $J_{n_i}^{(j_i)}$, for $i = 1,\ldots,d$. Furthermore, multilevel Toeplitz matrices are not only interesting from the point of view of the applications (or from a "pure mathematics" point of view [4,29]), but also from the viewpoint of complexity theory [3], since the cost of determining the vector $u = T_{\hat n}(f)v$, for an arbitrary vector $v$, is of $O(N(\hat n)\log N(\hat n))$ arithmetic operations, that is, the cost of applying a constant number of multilevel Fast Trigonometric/Fourier transforms (see e.g. [17,3]). Regarding the applications, we remind that the main problem is to solve linear systems of the form $T_{\hat n}(f)u = v$ for a given vector $v$ and for a given $L^1$ symbol $f$. Since the matrix vector multiplication can be performed efficiently, a simple but good idea is to solve the considered linear systems by using iterative solvers in which the involved matrices preserve a Toeplitz structure. Some possibilities are the following: conjugate gradient methods, Chebyshev iterations, Jacobi or Richardson methods with or without polynomial or matrix algebra preconditioning (see [10]). Under these assumptions, the total cost for computing $u$ within a preassigned accuracy $\varepsilon$ is $O(k_{\hat n}(\varepsilon)N(\hat n)\log N(\hat n))$, where $k_{\hat n}(\varepsilon)$ is the required number of iterations. If $f$ is strictly positive and bounded, or if the closed convex hull of the range of $f$ is bounded and does not contain the complex zero, then many of the cited iterations are optimal and we have $k_{\hat n}(\varepsilon) = O(1)$ [25]. The same is true in the case where $f$ is continuous, nonnegative, with a finite number of zeros of even orders, the number $d$ of levels equals 1, and we use a preconditioned conjugate gradient (PCG) method [5,8,7,19,15,16].
Here we want to consider the same case ($f$ nonnegative with a finite number of zeros) but in the multilevel setting, i.e. $d > 1$. The reason for this attention relies on the importance of the considered case, since the discretization of elliptic $d$-dimensional
PDEs, by Finite Differences on equispaced grids, leads to sequences $\{T_{\hat n}(p)\}$ where $p$ is positive except at $\hat x = (0,\ldots,0)^T$ and is a multivariate trigonometric polynomial. A similar situation occurs in the case of image restoration problems, where the sequence $\{T_{\hat n}(p)\}$ is associated to a polynomial $p$ which is positive everywhere but at $\hat x = (\pi,\pi)^T \in Q^2$. Unfortunately, under the assumption that the preconditioners belong to matrix algebras related to Fast Trigonometric Transforms, no optimal PCG methods are known in this case, in the sense that the number of iterations $k_{\hat n}(\varepsilon)$ is a mildly diverging function of the dimensions $\hat n$ (generally $k_{\hat n}(\varepsilon) \sim [N(\hat n)]^\alpha$ with some $\alpha\in(0,1)$). In this paper, we show that the search for essentially spectrally equivalent (up to a constant number of diverging eigenvalues) preconditioners cannot be successful in general (at least in the multilevel circulant and $\tau$ cases). Indeed we will use a proof technique proposed in [13] for obtaining such negative results on the important case of $\{T_{\hat n}(p_{\hat k})\}$ with
$$p_{\hat k}(\hat x) = (2 - 2\cos(x_1))^{k_1} + (2 - 2\cos(x_2))^{k_2} + \cdots + (2 - 2\cos(x_d))^{k_d},$$
$\hat k = (k_1,k_2,\ldots,k_d)^T$, $k_j$ positive integers, and $\hat x = (x_1,x_2,\ldots,x_d)^T$. More precisely, concerning the circulant algebra, we demonstrate the result under the assumption that $\min_i k_i \ge 1$, and regarding the $\tau$ algebra we will prove the same with the restriction that $\min_i k_i \ge 2$: we recall that these statements widely extend the analysis provided in [13], where it was considered the case $k_i = 1$, $i = 1,2$, $d = 2$ for the circulants and the case $k_i = 2$, $i = 1,2$, $d = 2$ for the $\tau$ class.
Finally, we stress the following two points:
• the proof in the general case is much more difficult than the basic cases considered in [13]: the reason is due to the fact that the rank corrections which separate the Toeplitz matrix $T_{\hat n}(p_{\hat k})$ from the corresponding circulant and $\tau$ natural approximations are essentially indefinite, while we need a positive bound from below for the restrictions of these rank corrections on suitable subspaces of frequencies. These key facts are proven in Lemmata 2.6 and 2.9 by using combinatorial arguments developed in Lemmata 2.3 and 2.4;
• it is worth mentioning that, after a suitable scaling, the matrices in $\{T_{\hat n}(p_{\hat k})\}$ with $\hat k = k(1,\ldots,1)^T$, $k \ge 1$, represent the centered Finite Differences discretization of precision order two of the elliptic differential equation $(-1)^k\nabla^{2k}u = f$ with proper homogeneous boundary conditions and, in addition, they can be used as optimal preconditioners (see e.g. [20,1]) for variable coefficient elliptic and semi-elliptic Partial Differential Equations.

We observe that it is important to have a wide class of counterexamples in applicative fields, because this shows that our negative results are really meaningful and interesting in applications (the latter was not so evident in [13] since we provided only two counterexamples).

2. Tools, definitions and main results

In the following, we will restrict our attention to the simpler case where the generating function $f$ is nonnegative, multivariate and has isolated zeros, so that the matrices
$T_{\hat n}(f)$ are positive definite and ill-conditioned. We will consider as case study two multilevel matrix algebras: the $d$-level $\tau$ algebra and the $d$-level circulant algebra. The $d$-level $\tau$ algebra is generated by the $d$-dimensional basic structures:
$$\tau_{n_1}\otimes I_{n_2}\otimes\cdots\otimes I_{n_d},\quad I_{n_1}\otimes\tau_{n_2}\otimes\cdots\otimes I_{n_d},\ \ldots,\ I_{n_1}\otimes I_{n_2}\otimes\cdots\otimes\tau_{n_d},$$
and each matrix of the algebra is simultaneously diagonalized by the orthogonal matrix $Q$ of size $N(\hat n)$, where $I_m$ denotes the $m$-sized identity matrix,
$$\tau_m = \begin{pmatrix} 0 & 1 & & \\ 1 & 0 & \ddots & \\ & \ddots & \ddots & 1\\ & & 1 & 0 \end{pmatrix}\in\mathbb{R}^{m\times m},$$
and the columns of the matrix $Q$ are given by $v_{j_1}^{(n_1)}\otimes v_{j_2}^{(n_2)}\otimes\cdots\otimes v_{j_d}^{(n_d)}$ with
$$v_s^{(m)} = \left(\sqrt{\frac{2}{m+1}}\,\sin\frac{sj\pi}{m+1}\right)_{j=1}^{j=m}. \tag{2}$$
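The sine vectors in (2) can be checked numerically to diagonalize $\tau_m$; a quick verification of ours (not part of the paper), using the well-known eigenvalues $2\cos(s\pi/(m+1))$ of $\tau_m$:

```python
import numpy as np

m = 8
# tau_m: zero diagonal, ones on the first sub- and super-diagonal
tau = np.diag(np.ones(m - 1), 1) + np.diag(np.ones(m - 1), -1)
# columns of Q from formula (2)
s = np.arange(1, m + 1)
Q = np.sqrt(2.0 / (m + 1)) * np.sin(np.outer(s, s) * np.pi / (m + 1))
assert np.allclose(Q.T @ Q, np.eye(m))              # Q is orthogonal
D = Q.T @ tau @ Q
assert np.allclose(D, np.diag(np.diag(D)))          # Q diagonalizes tau_m
assert np.allclose(np.diag(D), 2 * np.cos(s * np.pi / (m + 1)))
```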
Similarly, the $d$-level circulant algebra is generated by the $d$-dimensional basic structures
$$Z_{n_1}\otimes I_{n_2}\otimes\cdots\otimes I_{n_d},\quad I_{n_1}\otimes Z_{n_2}\otimes\cdots\otimes I_{n_d},\ \ldots,\ I_{n_1}\otimes I_{n_2}\otimes\cdots\otimes Z_{n_d},$$
and each matrix of the algebra is simultaneously diagonalized by the unitary Fourier matrix $F$ of size $N(\hat n)$, where
$$Z_m = \begin{pmatrix} 0 & \cdots & 0 & 1\\ 1 & 0 & & \\ & \ddots & \ddots & \\ & & 1 & 0 \end{pmatrix}\in\mathbb{R}^{m\times m},$$
and the columns of the matrix $F$ are given by $f_{j_1}^{(n_1)}\otimes f_{j_2}^{(n_2)}\otimes\cdots\otimes f_{j_d}^{(n_d)}$ with
$$f_s^{(m)} = \frac{1}{\sqrt m}\left(e^{\mathrm{i}\,\frac{2\pi(s-1)(j-1)}{m}}\right)_{j=1}^{j=m}. \tag{3}$$
2(j1 − 1) 2(j2 − 1) 2(jd − 1) pjcˆ = p ; ;:::; n1 n2 nd and where T˜ nˆ (p) is d-level Toeplitz matrix of rank proportional to
d
i=1
ni .
Concerning the case of d-level τ matrices, we remark that the involved structures are inherently symmetric and real, and therefore we restrict our attention to real-valued even polynomials p (i.e. $p(\hat x)\equiv p(x_1,\ldots,x_d)=p(|x_1|,\ldots,|x_d|)$ for every $\hat x\in Q^d$). In that setting we have

$$T_{\hat n}(p)=\tau_{\hat n}(p)+H_{\hat n}(p),$$

where $\tau_{\hat n}(p)$ is the d-level τ matrix whose eigenvalues are

$$p_{\hat j}=p\!\left(\frac{\pi j_1}{n_1+1},\ \frac{\pi j_2}{n_2+1},\ \ldots,\ \frac{\pi j_d}{n_d+1}\right)$$

and where $H_{\hat n}(p)$ is a d-level Hankel matrix of rank proportional to $\sum_{i=1}^{d} n_i$. In order to clarify these statements and to give more details, we report the following four lemmas, which will be quite useful in the subsequent analysis.

Lemma 2.1. Let $p_{\hat k}(\hat x)=(2-2\cos(x_1))^{k_1}+(2-2\cos(x_2))^{k_2}+\cdots+(2-2\cos(x_d))^{k_d}$, where $\hat k=(k_1,k_2,\ldots,k_d)^T$ is a vector of nonnegative indices with $\|\hat k\|_\infty\ge 1$ and $\hat x=(x_1,x_2,\ldots,x_d)^T$ is a multivariate vector in $Q^d$. If $\|\hat k\|_\infty=1$ then $T_{\hat n}(p)=\tau_{\hat n}(p)$, while if $\|\hat k\|_\infty\ge 2$ then $T_{\hat n}(p)=\tau_{\hat n}(p)+H_{\hat n}(p)$, where

$$H_{\hat n}(p)=(E_{k_1}^{(n_1)}+(E_{k_1}^{(n_1)})^R)\otimes I_{n_2}\otimes\cdots\otimes I_{n_d}+I_{n_1}\otimes(E_{k_2}^{(n_2)}+(E_{k_2}^{(n_2)})^R)\otimes\cdots\otimes I_{n_d}+\cdots+I_{n_1}\otimes I_{n_2}\otimes\cdots\otimes(E_{k_d}^{(n_d)}+(E_{k_d}^{(n_d)})^R)\qquad(4)$$

with $E_k^{(m)}=0$ if $k\le 1$ and, if $k\ge 2$, with $E_k^{(m)}$ the m-sized low-rank Hankel matrix whose entries are

$$(E_k^{(m)})_{i,j}=(-1)^{i+j}\binom{2k}{k-i-j}\ \ \text{for } 2\le i+j\le k,\ \ \text{and zero otherwise},\qquad(5)$$

where $\binom{\cdot}{\cdot}$ denotes the binomial coefficient operator. Moreover $(E_k^{(m)})^R=JE_k^{(m)}J$ with $J$ being the m-by-m flip matrix (ones on the antidiagonal, zeros elsewhere).
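A small numerical illustration of Lemma 2.1 for d = 1 and k = 2 (our sketch, NumPy): with $A=T_n(2-2\cos x)$, the τ part of $T_n((2-2\cos x)^2)$ is $A^2$, and the Hankel correction $E_2^{(n)}+(E_2^{(n)})^R$ reduces to unit bumps in the (1,1) and (n,n) entries.

```python
import numpy as np

n = 10
A = 2*np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)               # T_n(2-2cos x)
T = 6*np.eye(n) - 4*np.eye(n, k=1) - 4*np.eye(n, k=-1) \
    + np.eye(n, k=2) + np.eye(n, k=-2)                           # T_n((2-2cos x)^2)

H = np.zeros((n, n))
H[0, 0] = H[-1, -1] = 1.0          # E_2 + E_2^R: only the (1,1), (n,n) entries survive
assert np.allclose(T, A @ A + H)   # T_n(p) = tau_n(p) + H_n(p), as in Lemma 2.1

# eigenvalues of the tau part sample p at the grid pi*j/(n+1)
theta = np.pi * np.arange(1, n + 1) / (n + 1)
assert np.allclose(np.sort(np.linalg.eigvalsh(A @ A)),
                   np.sort((2 - 2*np.cos(theta))**2))
```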
Lemma 2.2. Let $p_{\hat k}(\hat x)=(2-2\cos(x_1))^{k_1}+(2-2\cos(x_2))^{k_2}+\cdots+(2-2\cos(x_d))^{k_d}$, where $\hat k=(k_1,k_2,\ldots,k_d)^T$ is a vector of nonnegative indices with $\|\hat k\|_\infty\ge 1$ and $\hat x=(x_1,x_2,\ldots,x_d)^T$ is a multivariate vector in $Q^d$. Then $T_{\hat n}(p)=C_{\hat n}(p)+\tilde T_{\hat n}(p)$, where

$$\tilde T_{\hat n}(p)=(\hat J_{k_1}^{(n_1)}+(\hat J_{k_1}^{(n_1)})^T)\otimes I_{n_2}\otimes\cdots\otimes I_{n_d}+I_{n_1}\otimes(\hat J_{k_2}^{(n_2)}+(\hat J_{k_2}^{(n_2)})^T)\otimes\cdots\otimes I_{n_d}+\cdots+I_{n_1}\otimes I_{n_2}\otimes\cdots\otimes(\hat J_{k_d}^{(n_d)}+(\hat J_{k_d}^{(n_d)})^T)\qquad(6)$$

with $\hat J_k^{(m)}$ the m-sized low-rank Toeplitz matrix whose nonzero entries, located in its upper-right corner, are

$$(\hat J_k^{(m)})_{i,\,m+1-j}=(-1)^{i+j}\binom{2k}{k-i-j+1}\ \ \text{for } i+j\le k+1,\ \ \text{and zero otherwise}.\qquad(7)$$
Lemma 2.3. Let $p_{\hat k}(\hat x)=(2-2\cos(x_1))^{k_1}+(2-2\cos(x_2))^{k_2}+\cdots+(2-2\cos(x_d))^{k_d}$, where $\hat k=(k_1,k_2,\ldots,k_d)^T$ is a vector of nonnegative indices with $\|\hat k\|_\infty\ge 2$ and $\hat x=(x_1,x_2,\ldots,x_d)^T$ is a multivariate vector in $Q^d$. Let $H_{\hat n}(p_{\hat k})$ be the Hankel correction defined in (4) and let $v=v_{s_1}^{(n_1)}\otimes v_{s_2}^{(n_2)}\otimes\cdots\otimes v_{s_d}^{(n_d)}$ be an eigenvector of the τ algebra. Then

$$v^T H_{\hat n}(p_{\hat k})\,v=\sum_{j=1}^{d}\frac{4\sin^2(\theta_{s_j}^{(n_j)})}{n_j+1}\,r\!\left(\theta_{s_j}^{(n_j)}\right),\qquad(8)$$

where $\theta_s^{(m)}=s\pi/(m+1)$ and

$$\lim_{\theta=\theta_{s_j}^{(n_j)}\to 0} r(\theta)=\binom{2k_j-4}{k_j-2}.\qquad(9)$$

Finally, if $n_j\sim m$ for every $j$ and $\theta_{s_j}^{(n_j)}=o(1)$, then

$$v^T H_{\hat n}(p_{\hat k})\,v\ \ge\ c\,m^{-1}\max_j\{\theta_{s_j}^{(n_j)}\}^2\qquad(10)$$

for a suitable positive constant $c$ independent of $m$.
Proof. By the definition of the matrix $H_{\hat n}(p_{\hat k})$ and of the vector $v$, it follows that $v^T H_{\hat n}(p_{\hat k})v$ can be written as the sum over $i$ of the terms

$$(v_{s_1}^{(n_1)}\otimes\cdots\otimes v_{s_d}^{(n_d)})^T\bigl(I_{n_1}\otimes\cdots\otimes(E_{k_i}^{(n_i)}+(E_{k_i}^{(n_i)})^R)\otimes\cdots\otimes I_{n_d}\bigr)(v_{s_1}^{(n_1)}\otimes\cdots\otimes v_{s_d}^{(n_d)})$$
$$=(v_{s_1}^{(n_1)T}v_{s_1}^{(n_1)})(v_{s_2}^{(n_2)T}v_{s_2}^{(n_2)})\cdots\bigl(v_{s_i}^{(n_i)T}(E_{k_i}^{(n_i)}+(E_{k_i}^{(n_i)})^R)v_{s_i}^{(n_i)}\bigr)\cdots(v_{s_d}^{(n_d)T}v_{s_d}^{(n_d)})$$
$$=v_{s_i}^{(n_i)T}(E_{k_i}^{(n_i)}+(E_{k_i}^{(n_i)})^R)v_{s_i}^{(n_i)}\qquad(11)$$

and therefore

$$v^T H_{\hat n}(p)v=\sum_{i=1}^{d} v_{s_i}^{(n_i)T}(E_{k_i}^{(n_i)}+(E_{k_i}^{(n_i)})^R)v_{s_i}^{(n_i)}.$$

We now have to compute each term of the sum in the above expression. For the sake of simplicity, we put $s$, $m$ and $k$ in place of $s_i$, $n_i$ and $k_i$, respectively, and we write $\theta=\theta_{s_j}^{(n_j)}=\theta_s^{(m)}=s\pi/(m+1)$. From (2) and from (5) we get

$$v_s^{(m)T}(E_k^{(m)}+(E_k^{(m)})^R)v_s^{(m)}=2\,v_s^{(m)T}E_k^{(m)}v_s^{(m)}$$

because of the centrosymmetry of the matrix $E_k^{(m)}+(E_k^{(m)})^R$ and of the centrosymmetry/anticentrosymmetry of the eigenvectors $v_s^{(m)}$. By considering the last vector-matrix-vector product, we deduce the following "binomial coefficient based" expression:

$$v_s^{(m)T}E_k^{(m)}v_s^{(m)}=\frac{2}{m+1}\sum_{i=1}^{k-1}\sin(i\theta)\sum_{j=1}^{k-i}(-1)^{i+j}\binom{2k}{k-i-j}\sin(j\theta)$$
$$=\frac{2}{m+1}\sum_{l=2}^{k}(-1)^l\binom{2k}{k-l}\sum_{j=1}^{l-1}\sin(j\theta)\sin((l-j)\theta)$$
$$=\frac{2\sin^2(\theta)}{m+1}\sum_{l=2}^{k}(-1)^l\binom{2k}{k-l}\sum_{j=1}^{l-1}\frac{\sin(j\theta)}{\sin(\theta)}\,\frac{\sin((l-j)\theta)}{\sin(\theta)}=\frac{2\sin^2(\theta)}{m+1}\,r(\theta).\qquad(12)$$

Therefore (8) is proved. Moreover, it is obvious that the term $2\sin^2(\theta)/(m+1)$ tends to zero at least as $m^{-1}$ and, if $\theta$ tends to zero, it follows that $m$ tends to infinity and therefore its global asymptotic order is $m^{-1}\theta^2$. Consequently, the final relation (10) is also proven if we show that (9) holds. To this aim, it remains to estimate the double sum $r(\theta)$ appearing in (12):

$$r(\theta)=\sum_{l=2}^{k}(-1)^l\binom{2k}{k-l}\sum_{j=1}^{l-1}\frac{\sin(j\theta)}{\sin(\theta)}\,\frac{\sin((l-j)\theta)}{\sin(\theta)}.\qquad(13)$$
We take the limit as $\theta$ tends to zero:

$$\lim_{\theta\to 0} r(\theta)=\lim_{\theta\to 0}\sum_{l=2}^{k}(-1)^l\binom{2k}{k-l}\sum_{j=1}^{l-1}\frac{\sin(j\theta)}{\sin(\theta)}\,\frac{\sin((l-j)\theta)}{\sin(\theta)}$$
$$=\sum_{l=2}^{k}(-1)^l\binom{2k}{k-l}\sum_{j=1}^{l-1} j(l-j)=\sum_{l=2}^{k}(-1)^l\binom{2k}{k-l}\binom{l+1}{3}.\qquad(14)$$

We then use the following relationship:

$$\sum_{k'\le l'}\binom{l'-k'}{m'}\binom{s'}{k'-n'}(-1)^{k'}=(-1)^{l'+m'}\binom{s'-m'-1}{l'-m'-n'}\qquad(15)$$

concerning a special sum of products of binomial coefficients with $l',m',n'\ge 0$ integer numbers (refer to Eq. (5.25) in Table 169, p. 169 of the book of Graham et al. [11]; we use primes for the parameters in (15) just in order to avoid confusion with the parameters used in this paper). By replacing $s'=2k$, $k'=k-l+1$, $l'=k+2$, $m'=3$ and $n'=1$ in relationship (15), we deduce

$$\sum_{l=2}^{k}(-1)^l\binom{2k}{k-l}\binom{l+1}{3}=\binom{2k-4}{k-2}.\qquad(16)$$

Thus

$$\lim_{\theta\to 0} r(\theta)=\binom{2k-4}{k-2}.$$
Finally the claimed thesis follows by replacing back $s_i$, $n_i$ and $k_i$ for $s$, $m$ and $k$, respectively. □

Lemma 2.4. Let $p_{\hat k}(\hat x)=(2-2\cos(x_1))^{k_1}+(2-2\cos(x_2))^{k_2}+\cdots+(2-2\cos(x_d))^{k_d}$, where $\hat k=(k_1,k_2,\ldots,k_d)^T$ is a vector of nonnegative indices with $\|\hat k\|_\infty\ge 1$ and $\hat x=(x_1,x_2,\ldots,x_d)^T$ is a multivariate vector in $Q^d$. Let $\tilde T_{\hat n}(p_{\hat k})$ be the Toeplitz correction defined in (6) and let $v=f_{s_1}^{(n_1)}\otimes f_{s_2}^{(n_2)}\otimes\cdots\otimes f_{s_d}^{(n_d)}$ be an eigenvector of the d-level circulant algebra. Then

$$v^H\tilde T_{\hat n}(p_{\hat k})\,v=\sum_{j=1}^{d}\frac{2}{n_j}\,r\!\left(\theta_{s_j}^{(n_j)}\right),\qquad(17)$$

where $\theta_s^{(m)}=2\pi(s-1)/m$ and

$$\lim_{\theta=\theta_{s_j}^{(n_j)}\to 0} r(\theta)=\binom{2k_j-2}{k_j-1}.\qquad(18)$$
Finally, if $n_j\sim m$ for every $j$ and $\theta_{s_j}^{(n_j)}=o(1)$, then

$$v^H\tilde T_{\hat n}(p_{\hat k})\,v\ \ge\ c\,m^{-1}\qquad(19)$$

for a suitable positive constant $c$ independent of $m$.

Proof. By the definition of the matrix $\tilde T_{\hat n}(p_{\hat k})$ (see (6)) and of the vector $v$ it follows that

$$v^H\tilde T_{\hat n}(p_{\hat k})v=\sum_{i=1}^{d} f_{s_i}^{(n_i)H}(\hat J_{k_i}^{(n_i)}+(\hat J_{k_i}^{(n_i)})^T)f_{s_i}^{(n_i)}.$$

We now have to compute each term of the previous sum. For simplicity we put $s$, $m$ and $k$ in the places of $s_i$, $n_i$ and $k_i$, respectively, and $\theta=2\pi(s-1)/m$. From the form (3) of the eigenvectors $f_s^{(m)}$ of circulant matrices and from the form (7) of the low-rank Toeplitz matrix $\hat J_k^{(m)}$ we get

$$f_s^{(m)H}\hat J_k^{(m)}f_s^{(m)}=\frac1m\sum_{i=1}^{k}\mathrm e^{-\mathrm i(i-1)\theta}\sum_{j=1}^{k-i+1}(-1)^{i+j}\binom{2k}{k-i-j+1}\mathrm e^{\mathrm i(m-j)\theta}$$
$$=\frac1m\sum_{l=2}^{k+1}(-1)^l\binom{2k}{k-l+1}\sum_{j=1}^{l-1}\mathrm e^{\mathrm i(m-l+1)\theta}=\frac1m\sum_{l=2}^{k+1}(-1)^l\binom{2k}{k-l+1}\binom{l-1}{1}\mathrm e^{\mathrm i(m-l+1)\theta},\qquad(20)$$

while

$$f_s^{(m)H}(\hat J_k^{(m)})^T f_s^{(m)}=\frac1m\sum_{l=2}^{k+1}(-1)^l\binom{2k}{k-l+1}\binom{l-1}{1}\mathrm e^{-\mathrm i(m-l+1)\theta}.\qquad(21)$$

Therefore

$$f_s^{(m)H}(\hat J_k^{(m)}+(\hat J_k^{(m)})^T)f_s^{(m)}=\frac2m\sum_{l=2}^{k+1}(-1)^l\binom{2k}{k-l+1}\binom{l-1}{1}\cos((m-l+1)\theta)=\frac2m\,r(\theta)\qquad(22)$$

and (17) is proven. In addition, it is obvious that the statement contained in (19) is simply implied by (18), and thus we prove the latter. For this purpose, we have to estimate the sum in (22):

$$r(\theta)=\sum_{l=2}^{k+1}(-1)^l\binom{2k}{k-l+1}\binom{l-1}{1}\cos((m-l+1)\theta).\qquad(23)$$

We take the limit as $\theta$ tends to zero:

$$\lim_{\theta\to 0} r(\theta)=\sum_{l=2}^{k+1}(-1)^l\binom{2k}{k-l+1}\binom{l-1}{1};\qquad(24)$$
we use again relation (15) and, by replacing $s'=2k$, $k'=k-l+1$, $l'=k$, $m'=1$ and $n'=0$, we obtain

$$\sum_{l=2}^{k+1}(-1)^l\binom{2k}{k-l+1}\binom{l-1}{1}=\binom{2k-2}{k-1}.\qquad(25)$$

Therefore

$$\lim_{\theta\to 0} r(\theta)=\binom{2k-2}{k-1}.$$
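Both closed forms (16) and (25) are easy to confirm for small k with exact integer arithmetic (a verification snippet of ours):

```python
from math import comb

def lhs16(k):  # left-hand side of (16)
    return sum((-1)**l * comb(2*k, k - l) * comb(l + 1, 3) for l in range(2, k + 1))

def lhs25(k):  # left-hand side of (25)
    return sum((-1)**l * comb(2*k, k - l + 1) * comb(l - 1, 1) for l in range(2, k + 2))

for k in range(2, 12):
    assert lhs16(k) == comb(2*k - 4, k - 2)   # identity (16)
    assert lhs25(k) == comb(2*k - 2, k - 1)   # identity (25)
```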
Finally, the claimed thesis follows by replacing back $s_i$, $n_i$ and $k_i$ for $s$, $m$ and $k$, respectively. □

A tool for proving that a PCG method is optimal, when the coefficient matrix sequence is $\{A_n\}_n$ and the preconditioning sequence is $\{P_n\}_n$, is the spectral equivalence, or the essential spectral equivalence, between the two sequences.

Definition 2.1. Given two sequences $\{A_n\}_n$ and $\{P_n\}_n$ of positive definite matrices of increasing size $d_n$ ($d_n<d_{n+1}$ for all $n$), we say that they are spectrally equivalent iff all the eigenvalues $\{\lambda(P_n^{-1}A_n)\}_n$ of $\{P_n^{-1}A_n\}_n$ belong to a positive interval $[\alpha,\beta]$ independent of $n$ with $0<\alpha\le\beta<\infty$. We say that the sequences $\{A_n\}_n$ and $\{P_n\}_n$ are essentially spectrally equivalent iff there is at most a constant number of outliers and they are all bigger than $\beta$.

In practice, in terms of Rayleigh quotients, the spectral equivalence means that for every nonzero $v\in\mathbb C^{d_n}$ we have

$$\alpha\ \le\ \frac{v^H A_n v}{v^H P_n v}\ \le\ \beta,$$

while the essential spectral equivalence is equivalent to the following two conditions: for every nonzero $v\in\mathbb C^{d_n}$ we have

$$\alpha\ \le\ \frac{v^H A_n v}{v^H P_n v}$$

and there exists a constant positive integer $q$ independent of $n$ such that, for every subspace $V$ of dimension greater than $q$, we have

$$\min_{v\in V,\,v\ne 0}\frac{v^H A_n v}{v^H P_n v}\ \le\ \beta.\qquad(26)$$

In other words, calling $\lambda_1\ge\lambda_2\ge\cdots\ge\lambda_{d_n}$ the eigenvalues of $P_n^{-1}A_n$, we have $\lambda_{q+1}\le\beta$ and possibly the first $q$ eigenvalues diverging to infinity as $n$ tends to infinity. In view of the minimax characterization

$$\lambda_{q+1}=\max_{\dim V=q+1}\ \min_{v\in V,\,v\ne 0}\frac{v^H A_n v}{v^H P_n v},$$

it follows that for every subspace $V$ of dimension $q+1$ we must have (26).
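Definition 2.1 can be made concrete in the one-level case (an illustrative NumPy sketch of ours, not from the paper): for $A_n=T_n((2-2\cos x)^2)$ and $P_n=\tau_n((2-2\cos x)^2)$, all eigenvalues of $P_n^{-1}A_n$ equal 1 except at most two outliers coming from the rank-2 Hankel correction, so the sequences are essentially spectrally equivalent with $\alpha=1$ and $q=2$, even though the outliers grow with n:

```python
import numpy as np

def pair(n):
    A1 = 2*np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # T_n(2-2cos x)
    P = A1 @ A1                                           # tau_n((2-2cos x)^2)
    A = P.copy()
    A[0, 0] += 1.0                                        # add the rank-2 Hankel
    A[-1, -1] += 1.0                                      # correction: A = T_n(p)
    return A, P

maxima = []
for n in (16, 32, 64):
    A, P = pair(n)
    lam = np.sort(np.linalg.eigvals(np.linalg.solve(P, A)).real)
    assert np.allclose(lam[:n-2], 1.0)    # all but (at most) two eigenvalues are 1
    maxima.append(lam[-1])

assert maxima[0] < maxima[1] < maxima[2]  # the two outliers diverge with n
```

This is exactly the d = 1 behavior that, by the theorems of the next subsection, fails to carry over to d ≥ 2.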
2.1. Negative results: the τ case

We begin with the main negative theorems for the τ case.

Theorem 2.5. Let $f$ be asymptotic to $p_{\hat k}(\hat x)=(2-2\cos(x_1))^{k_1}+(2-2\cos(x_2))^{k_2}+\cdots+(2-2\cos(x_d))^{k_d}$, where $\hat k=(k_1,k_2,\ldots,k_d)^T$ is a vector of nonnegative indices with $\min_i k_i\ge 2$ and $\hat x=(x_1,x_2,\ldots,x_d)^T$ is a multivariate vector in $Q^d$. Let also $\beta$ be a fixed positive number independent of $\hat n=(n_1,n_2,\ldots,n_d)^T$ with $n_i\sim n_j$, $\forall i,j=1,\ldots,d$. Then for every sequence $\{P_{\hat n}\}$ with $P_{\hat n}\in\tau_{\hat n}$ and such that

$$\lambda_{\max}(P_{\hat n}^{-1}T_{\hat n}(f))\ \le\ \beta\qquad(27)$$

uniformly with respect to $\hat n$, we have
(a) the minimal eigenvalue of $P_{\hat n}^{-1}T_{\hat n}(f)$ tends to zero (in other words $\{T_{\hat n}(f)\}$ does not possess spectrally equivalent preconditioners in the $\tau_{\hat n}$ algebra);
(b) the number $\#\{\lambda^{(\hat n)}\in\sigma(P_{\hat n}^{-1}T_{\hat n}(f)):\lambda^{(\hat n)}\to_{N(\hat n)\to\infty}0\}$ tends to infinity as $N(\hat n)$ tends to infinity.

Proof. Consider the eigenvectors $v=\hat v_{\hat s}^{(\hat n)}=v_{s_1}^{(n_1)}\otimes v_{s_2}^{(n_2)}\otimes\cdots\otimes v_{s_d}^{(n_d)}$ of the τ algebra and call $\lambda_{\hat s}$ the corresponding eigenvalue of $P_{\hat n}$. We have

$$\beta\ \ge\ \lambda_1\ \ge\ \frac{v^T T_{\hat n}(p_{\hat k})v}{v^T P_{\hat n}v}=\frac{p_{\hat s}+v^T H_{\hat n}(p_{\hat k})v}{\lambda_{\hat s}},$$

where, by Lemma 2.3,

$$v^T H_{\hat n}(p_{\hat k})v=\sum_{j=1}^{d}\frac{4\sin^2(\theta_{s_j}^{(n_j)})}{n_j+1}\,r\!\left(\theta_{s_j}^{(n_j)}\right).$$
Consequently we have

$$\lambda_{\hat s}\ \ge\ \frac1\beta\left(p_{\hat s}+\sum_{j=1}^{d}\frac{4\sin^2(\theta_{s_j}^{(n_j)})}{n_j+1}\,r\!\left(\theta_{s_j}^{(n_j)}\right)\right).$$

Now, for every multi-index $\hat s$ such that $p_{\hat s}\le m^{-1/2}$, taking into account that $k=\min_i k_i\ge 2$, we deduce that

$$\lim_{n_j\to\infty}\theta_{s_j}^{(n_j)}=0.$$

Therefore Lemma 2.3 applies and $r(\theta_{s_j}^{(n_j)})=c_j(1+o(1))$, where the positive constant

$$c_j=\binom{2k_j-4}{k_j-2}$$

is independent of $n_j$. Finally, by considering that $n_j\sim m$ for every $j$, it follows that

$$\sum_{j=1}^{d}\frac{4\sin^2(\theta_{s_j}^{(n_j)})}{n_j+1}\,r\!\left(\theta_{s_j}^{(n_j)}\right)\ \ge\ c\,m^{-3}$$

for a suitable positive constant $c$, and hence

$$\lambda_{\hat s}\ \ge\ \frac{c}{m^3}.\qquad(28)$$

For the complementary set of indices, such that $p_{\hat s}>m^{-1/2}$, we have $|v^T H_{\hat n}(p_{\hat k})v|\le c_1 m^{-1}$ for every $v=\hat v_{\hat s}^{(\hat n)}$; hence it follows that

$$\lambda_{\hat s}\ \ge\ \frac{m^{-1/2}-c_1 m^{-1}}{\beta}\ \ge\ \frac{c}{\sqrt m}$$

for a suitable positive constant $c$. In conclusion, (28) is satisfied uniformly with regard to the multi-index $\hat s$. On the other hand, from the asymptotic knowledge of the eigenvalues of $T_{\hat n}(p_{\hat k})$, we infer that $T_{\hat n}(f)$ has $g^d(m)$ eigenvalues $\lambda(m)$ going to zero asymptotically faster than $m^{-3}$, i.e. $\lambda(m)=o(m^{-3})$, where $g(m)\to\infty$ with the constraint that $g(m)=o(m^{1-3/(2k)})$. Finally, we deduce that at least $g^d(m)$ eigenvalues of the preconditioned matrix collapse to zero as $m$ tends to infinity, and this proves both claims (a) and (b). □

Now we prove that the essential spectral equivalence is also impossible: the problem is more involved and the technique that we use is different from the one of the previous theorem. Indeed, to this purpose, we need a further preliminary lemma.

Lemma 2.6. Let $p_{\hat k}(\hat x)=(2-2\cos(x_1))^{k_1}+(2-2\cos(x_2))^{k_2}+\cdots+(2-2\cos(x_d))^{k_d}$, where $\hat k=(k_1,k_2,\ldots,k_d)^T$ is a vector of nonnegative indices with $\min_i k_i\ge 2$. Let
$H_{\hat n}(p_{\hat k})$ be the Hankel correction defined in (4). Then, for every fixed $t$ independent of $\hat n$, there exists a subspace $V_t$ of dimension $t$ such that

$$\min_{v\in V_t,\ \|v\|_2=1} v^T H_{\hat n}(p_{\hat k})\,v\ \ge\ \frac{4\sin^2(\theta_1^{(n_1)})}{n_1+1}\,r(\theta_1^{(n_1)})(1+o(1))$$

with $r(\cdot)$ and $\theta_1^{(n_1)}$ defined as in Lemma 2.3.

Proof. We choose vectors

$$w_i=v_1^{(n_1)}\otimes v_{s_2}^{(n_2)}\otimes\cdots\otimes v_{s_{d-1}}^{(n_{d-1})}\otimes v_{q_i}^{(n_d)},\qquad i=1,2,\ldots,t,$$

such that $\sum_{j=2}^{d-1}s_j^{2k_j}+q_i^{2k_d}=o(m)$. Thus the subspace $V_t$ is contained in $W_{\hat n}$ and it can be written as

$$V_t=\Bigl\{v:\ v=\sum_{i=1}^{t}c_i\,v_1^{(n_1)}\otimes v_{s_2}^{(n_2)}\otimes\cdots\otimes v_{s_{d-1}}^{(n_{d-1})}\otimes v_{q_i}^{(n_d)},\ c_i\in\mathbb R\Bigr\}$$
$$=\Bigl\{v:\ v=v_1^{(n_1)}\otimes v_{s_2}^{(n_2)}\otimes\cdots\otimes v_{s_{d-1}}^{(n_{d-1})}\otimes\sum_{i=1}^{t}c_i\,v_{q_i}^{(n_d)},\ c_i\in\mathbb R\Bigr\}.$$
We now have to estimate the quantity $\min_{v\in V_t,\|v\|_2=1}v^T H_{\hat n}(p_{\hat k})v$. Taking into account Lemma 2.3, setting $v=v_1^{(n_1)}\otimes v_{s_2}^{(n_2)}\otimes\cdots\otimes v_{s_{d-1}}^{(n_{d-1})}\otimes(\sum_{i=1}^{t}c_i v_{q_i}^{(n_d)})$ with $\sum_{i=1}^{t}c_i^2=1$, and defining

$$S(c_1,\ldots,c_t)=2\sum_{i=1}^{t}\sum_{j=1}^{t}c_i c_j\,v_{q_i}^{(n_d)T}E_{k_d}^{(n_d)}v_{q_j}^{(n_d)},$$

we have

$$v^T H_{\hat n}(p_{\hat k})v=\Bigl(v_1^{(n_1)}\otimes v_{s_2}^{(n_2)}\otimes\cdots\otimes v_{s_{d-1}}^{(n_{d-1})}\otimes\sum_{i=1}^{t}c_i v_{q_i}^{(n_d)}\Bigr)^{T}\Bigl(\sum_{j=1}^{d}I_{n_1}\otimes\cdots\otimes(E_{k_j}^{(n_j)}+(E_{k_j}^{(n_j)})^R)\otimes\cdots\otimes I_{n_d}\Bigr)\Bigl(v_1^{(n_1)}\otimes v_{s_2}^{(n_2)}\otimes\cdots\otimes v_{s_{d-1}}^{(n_{d-1})}\otimes\sum_{i=1}^{t}c_i v_{q_i}^{(n_d)}\Bigr)$$
$$=2\,v_1^{(n_1)T}E_{k_1}^{(n_1)}v_1^{(n_1)}+2\sum_{j=2}^{d-1}v_{s_j}^{(n_j)T}E_{k_j}^{(n_j)}v_{s_j}^{(n_j)}+2\sum_{i=1}^{t}\sum_{j=1}^{t}c_i c_j\,v_{q_i}^{(n_d)T}E_{k_d}^{(n_d)}v_{q_j}^{(n_d)}$$
$$=\frac{4\sin^2(\theta_1^{(n_1)})}{n_1+1}\,r(\theta_1^{(n_1)})+\sum_{j=2}^{d-1}\frac{4\sin^2(\theta_{s_j}^{(n_j)})}{n_j+1}\,r(\theta_{s_j}^{(n_j)})+S(c_1,\ldots,c_t)$$
$$\ge\ \frac{4\sin^2(\theta_1^{(n_1)})}{n_1+1}\,r(\theta_1^{(n_1)})+S(c_1,\ldots,c_t).$$
In the last inequality we observe that equality takes place if $d=2$, while we obtain a strict inequality otherwise. The minimum of the above quantity is obtained by minimizing $S(c_1,\ldots,c_t)$ under the assumption $\|v\|_2=1\ \Leftrightarrow\ \sum_{i=1}^{t}c_i^2=1$. We first compute the general term $v_{q_s}^{(n_d)T}E_{k_d}^{(n_d)}v_{q_r}^{(n_d)}$, $s,r=1,2,\ldots,t$, as done in Lemma 2.3 for $v_{s_j}^{(n_j)T}E_{k_j}^{(n_j)}v_{s_j}^{(n_j)}$:

$$v_{q_s}^{(n_d)T}E_{k_d}^{(n_d)}v_{q_r}^{(n_d)}=\frac{2}{n_d+1}\sum_{i=1}^{k_d-1}\sin(i\theta_{q_s})\sum_{j=1}^{k_d-i}(-1)^{i+j}\binom{2k_d}{k_d-i-j}\sin(j\theta_{q_r})$$
$$=\frac{2}{n_d+1}\sum_{l=2}^{k_d}(-1)^l\binom{2k_d}{k_d-l}\sum_{j=1}^{l-1}\sin(q_s j\theta)\sin(q_r(l-j)\theta)$$
$$=\frac{2\sin^2(\theta)}{n_d+1}\sum_{l=2}^{k_d}(-1)^l\binom{2k_d}{k_d-l}\sum_{j=1}^{l-1}\frac{\sin(q_s j\theta)}{\sin(\theta)}\,\frac{\sin(q_r(l-j)\theta)}{\sin(\theta)}=\frac{2\sin^2(\theta)}{n_d+1}\,z_{sr}(\theta),\qquad\theta=\theta_1^{(n_d)}=\frac{\pi}{n_d+1}.$$

By simple manipulations (refer to Lemma 2.3) we get

$$\lim_{\theta\to 0}z_{sr}(\theta)=q_s q_r\binom{2k_d-4}{k_d-2}.$$

Consequently, the term $S(c_1,\ldots,c_t)$ is given by

$$S(c_1,\ldots,c_t):=S(\theta)=\frac{4\sin^2(\theta)}{n_d+1}\sum_{i=1}^{t}\sum_{j=1}^{t}c_i c_j z_{ij}(\theta)=\frac{4\sin^2(\theta)}{n_d+1}\,z(\theta).$$

Thus

$$\lim_{\theta\to 0}z(\theta)=\binom{2k_d-4}{k_d-2}\sum_{i=1}^{t}\sum_{j=1}^{t}c_i q_i c_j q_j=\binom{2k_d-4}{k_d-2}\Bigl(\sum_{i=1}^{t}c_i q_i\Bigr)^2\ \ge\ 0.$$

Since $\theta=o(1)$ and the above limit attains its minimum at $\sum_{i=1}^{t}c_i q_i=0$, we deduce that $\min_{v\in V_t,\|v\|_2=1}z(\theta)=r(\theta)\,o(1)$ with $r(\cdot)$ defined as in Lemma 2.3 and, again by Lemma 2.3, with

$$\lim_{\theta\to 0}r(\theta)=\binom{2k_d-4}{k_d-2}.$$
Hence, since $n_1\sim n_d$, we deduce that

$$\min_{v\in V_t,\|v\|_2=1}S(\theta)=\frac{4\sin^2(\theta)}{n_d+1}\,r(\theta)\,o(1)=\frac{4\sin^2(\theta_1^{(n_1)})}{n_1+1}\,r(\theta_1^{(n_1)})\,o(1),$$

which completes the proof. □

Theorem 2.7. Let $f$ be asymptotic to $p_{\hat k}(\hat x)=(2-2\cos(x_1))^{k_1}+(2-2\cos(x_2))^{k_2}+\cdots+(2-2\cos(x_d))^{k_d}$, where $\hat k=(k_1,k_2,\ldots,k_d)^T$ is a vector of nonnegative indices with $\min_i k_i\ge 2$ and $\hat x=(x_1,x_2,\ldots,x_d)^T$ is a multivariate vector in $Q^d$. Let also $\alpha$ be a fixed positive number independent of $\hat n=(n_1,n_2,\ldots,n_d)^T$ with $n_i\sim n_j$, $\forall i,j=1,\ldots,d$. Then for every sequence $\{P_{\hat n}\}$ with $P_{\hat n}\in\tau_{\hat n}$ and such that

$$\lambda_{\min}(P_{\hat n}^{-1}T_{\hat n}(f))\ \ge\ \alpha\qquad(29)$$

uniformly with respect to $\hat n$, we have
(a) the maximal eigenvalue of $P_{\hat n}^{-1}T_{\hat n}(f)$ diverges to infinity;
(b) the sequences $\{T_{\hat n}(f)\}$ and $\{P_{\hat n}\}$ cannot be essentially spectrally equivalent.

Proof. Suppose, by contradiction, that there exists a constant positive integer $q$ independent of $\hat n$ such that

$$\lambda_{\min}(P_{\hat n}^{-1}T_{\hat n}(f))\ \ge\ \alpha\qquad\text{and}\qquad\lambda_{q+1}\ \le\ \beta.$$

Therefore, from the relation $\lambda_{N(\hat n)}\ge\alpha$, it follows that $T_{\hat n}(p_{\hat k})\ge\alpha P_{\hat n}$, where the relation is in the sense of the partial ordering between Hermitian matrices. On the other hand, from well-known results on the asymptotic spectra of Toeplitz matrices (see [29] and references therein), we infer that the smallest eigenvalue of $T_{\hat n}(p_{\hat k})$ is asymptotic to

$$\sum_{i=1}^{d}\frac{1}{n_i^{2k_i}}\ \sim\ \max_{i=1,\ldots,d}\frac{1}{n_i^{2k_i}}.$$

Let $m$ be the value of $n_i$ and $k$ the value of $k_i$ corresponding to $\max_{i=1,\ldots,d}n_i^{-2k_i}$ (we can assume $n_i\sim m$ for every $i$ and therefore $k=\min_i k_i$). It can also be derived, from the above reference and from [2], that $T_{\hat n}(p_{\hat k})$ possesses $g^d(m)$ eigenvalues $\lambda(m)$ going to zero asymptotically faster than $m^{-2k+1}$, i.e. $\lambda(m)=o(m^{-2k+1})$, where $g(m)\to\infty$ with
the constraint that $g(m)=o(m^{1/(2k)})$. Consequently, since $P_{\hat n}\le(1/\alpha)T_{\hat n}(p_{\hat k})$, it follows that $P_{\hat n}$ also has $g^d(m)$ eigenvalues $\lambda(m)$ going to zero asymptotically faster than $m^{-2k+1}$. We suppose that $P_{\hat n}^{-1}T_{\hat n}(p_{\hat k})$ has at most $q$ eigenvalues bigger than $\beta$. Let $\lambda_{\hat s}$, $\hat s=(s_1,s_2,\ldots,s_d)^T$, be the eigenvalue of $P_{\hat n}$ related to the eigenvector $\hat v_{\hat s}^{(\hat n)}=v_{s_1}^{(n_1)}\otimes v_{s_2}^{(n_2)}\otimes\cdots\otimes v_{s_d}^{(n_d)}$. Since $\gamma\,\tau_{\hat n}(p_{\hat k})\le T_{\hat n}(p_{\hat k})$ for a positive constant $\gamma$ independent of $m$ (see e.g. [23]), and since $P_{\hat n}^{-1}T_{\hat n}(p_{\hat k})$ has at most $q$ eigenvalues bigger than $\beta$, it follows that $P_{\hat n}^{-1}\tau_{\hat n}(p_{\hat k})$ has at most $q$ eigenvalues bigger than $\beta/\gamma$ ($\lambda(P_{\hat n}^{-1}\tau_{\hat n}(p_{\hat k}))\le\beta/\gamma$ with at most $q$ outliers). Therefore, because $P_{\hat n}$ and $\tau_{\hat n}(p_{\hat k})$ both belong to the τ algebra, we infer that

$$\lambda_{\hat s}\ \ge\ \frac{\gamma}{\beta}\,p_{\hat s}$$

with at most the exception of $q$ indices $\hat s$. The eigenvalues of $P_{\hat n}$ that are $o(m^{-2k+1})$ are also such that $\sum_{i=1}^{d}(s_i/n_i)^{2k_i}=o(m^{-2k+1})$, i.e.

$$\sum_{i=1}^{d}s_i^{2k_i}=o(m).\qquad(30)$$

This means that the subspace $W_{\hat n}$ spanned by the eigenvectors related to the $o(m^{-2k+1})$ eigenvalues of $P_{\hat n}$ has to be contained (up to $q$ possible indices) in

$$\mathrm{span}\Bigl\{v_{s_1}^{(n_1)}\otimes v_{s_2}^{(n_2)}\otimes\cdots\otimes v_{s_d}^{(n_d)}:\ \sum_{i=1}^{d}s_i^{2k_i}=o(m)\Bigr\}.$$
Now we look for the contradiction. By using (4) and by Lemma 2.6, we infer the following chain of relations:

$$\beta\ \ge\ \lambda_{q+1}=\max_{\dim V=q+1}\ \min_{v\in V,\,v\ne 0}\frac{v^T T_{\hat n}(p_{\hat k})v}{v^T P_{\hat n}v}$$
$$=\max_{\dim V=q+1}\ \min_{v\in V,\,v\ne 0}\frac{v^T(\tau_{\hat n}(p_{\hat k})+H_{\hat n}(p_{\hat k}))v}{v^T P_{\hat n}v}$$
$$\ge\ \max_{\dim V=q+1}\ \min_{v\in V,\,v\ne 0}\frac{v^T H_{\hat n}(p_{\hat k})v}{v^T P_{\hat n}v}\ \ge\ \min_{v\in V_{q+1},\,v\ne 0}\frac{v^T H_{\hat n}(p_{\hat k})v}{v^T P_{\hat n}v}$$
$$\ge\ \frac{\min_{v\in V_{q+1},\|v\|_2=1}v^T H_{\hat n}(p_{\hat k})v}{\max_{v\in V_{q+1},\|v\|_2=1}v^T P_{\hat n}v}\ \ge\ \frac{(4\sin^2(\theta_1^{(n_1)})/(n_1+1))\,r(\theta_1^{(n_1)})(1+o(1))}{\max_{v\in V_{q+1},\|v\|_2=1}v^T P_{\hat n}v}.\qquad(31)$$
As a consequence, since $V_{q+1}\subset W_{\hat n}$, so that $\max_{v\in V_{q+1},\|v\|_2=1}v^T P_{\hat n}v=\max_{\hat v_{\hat s}^{(\hat n)}\in W_{\hat n}}\lambda_{\hat s}=o(m^{-2k+1})$, we obtain

$$\beta\ \ge\ \Bigl(\max_{\hat v_{\hat s}^{(\hat n)}\in W_{\hat n}}\lambda_{\hat s}\Bigr)^{-1}\frac{4\sin^2(\theta)}{n_1+1}\,r(\theta)\,(1+o(1)),\qquad\theta=\frac{\pi}{n_1+1}.$$

Finally

$$\max_{\hat v_{\hat s}^{(\hat n)}\in W_{\hat n}}\lambda_{\hat s}=o(m^{-2k+1})\ \ge\ \frac{4}{\beta(n_1+1)}\sin^2\Bigl(\frac{\pi}{n_1+1}\Bigr)r(\theta)\,(1+o(1))\ \sim\ m^{-3},$$

which is a contradiction for all $k\ge 2$, since $2k-1\ge 3$ for all $k\ge 2$ and since $n_1\sim m$. □

2.2. Negative results: the circulant case

We directly state and prove the main negative results for the circulant case.

Theorem 2.8. Let $f$ be asymptotic to $p_{\hat k}(\hat x)=(2-2\cos(x_1))^{k_1}+(2-2\cos(x_2))^{k_2}+\cdots+(2-2\cos(x_d))^{k_d}$, where $\hat k=(k_1,k_2,\ldots,k_d)^T$ is a vector of nonnegative indices with $\min_i k_i\ge 1$ and $\hat x=(x_1,x_2,\ldots,x_d)^T$ is a multivariate vector in $Q^d$. Let also $\beta$ be a fixed positive number independent of $\hat n=(n_1,n_2,\ldots,n_d)^T$ with $n_i\sim n_j$, $\forall i,j=1,\ldots,d$. Then for every sequence $\{P_{\hat n}\}$ with $P_{\hat n}$ d-level circulant and such that

$$\lambda_{\max}(P_{\hat n}^{-1}T_{\hat n}(f))\ \le\ \beta\qquad(32)$$

uniformly with respect to $\hat n$, we have
(a) the minimal eigenvalue of $P_{\hat n}^{-1}T_{\hat n}(f)$ tends to zero (in other words $\{T_{\hat n}(f)\}$ does not possess spectrally equivalent preconditioners in the d-level circulant algebra);
(b) the number $\#\{\lambda^{(\hat n)}\in\sigma(P_{\hat n}^{-1}T_{\hat n}(f)):\lambda^{(\hat n)}\to_{N(\hat n)\to\infty}0\}$ tends to infinity as $N(\hat n)$ tends to infinity.

Proof. Let $m$ and $k$ be as before (we can assume $n_i\sim m$ for every $i$ and therefore $k=\min_i k_i$) and let us consider the eigenvectors of the multilevel
circulant algebra $v=\hat f_{\hat s}^{(\hat n)}=f_{s_1}^{(n_1)}\otimes f_{s_2}^{(n_2)}\otimes\cdots\otimes f_{s_d}^{(n_d)}$. By making use of relation (6) we will give a direct proof. Calling $\lambda_{\hat s}$ the corresponding eigenvalue of $P_{\hat n}$, we have

$$\beta\ \ge\ \lambda_1\ \ge\ \frac{v^H T_{\hat n}(p_{\hat k})v}{v^H P_{\hat n}v}=\frac{v^H C_{\hat n}(p_{\hat k})v+v^H\tilde T_{\hat n}(p_{\hat k})v}{v^H P_{\hat n}v}=\frac{p_{\hat s}^c+v^H\tilde T_{\hat n}(p_{\hat k})v}{\lambda_{\hat s}},$$

where, by Lemma 2.4,

$$v^H\tilde T_{\hat n}(p_{\hat k})v=\sum_{j=1}^{d}\frac{2}{n_j}\,r\!\left(\theta_{s_j}^{(n_j)}\right).$$

Consequently we have

$$\lambda_{\hat s}\ \ge\ \frac1\beta\left(p_{\hat s}^c+\sum_{j=1}^{d}\frac{2}{n_j}\,r\!\left(\theta_{s_j}^{(n_j)}\right)\right).$$

Now, for every multi-index $\hat s$ such that $p_{\hat s}^c\le m^{-1/2}$, taking into account that $k=\min_i k_i\ge 1$, we deduce that

$$\lim_{n_j\to\infty}\theta_{s_j}^{(n_j)}=0.$$

Therefore Lemma 2.4 applies and $r(\theta_{s_j}^{(n_j)})=c_j(1+o(1))$, where

$$c_j=\binom{2k_j-2}{k_j-1}$$

is independent of $n_j$. Finally, by considering that $n_j\sim m$ for every $j$, it follows that

$$\sum_{j=1}^{d}\frac{2}{n_j}\,r\!\left(\theta_{s_j}^{(n_j)}\right)\ \ge\ c\,m^{-1}$$

for a proper positive constant $c$ and thus

$$\lambda_{\hat s}\ \ge\ \frac{c}{m}.\qquad(33)$$

For the complementary set of indices, such that $p_{\hat s}^c>m^{-1/2}$, we have $|v^H\tilde T_{\hat n}(p_{\hat k})v|\le c_1 m^{-1}$ for every $v=\hat f_{\hat s}^{(\hat n)}$; it follows that

$$\lambda_{\hat s}\ \ge\ \frac{m^{-1/2}-c_1 m^{-1}}{\beta}\ \ge\ \frac{c}{\sqrt m}$$
for a suitable positive constant $c$, and consequently (33) is satisfied uniformly with regard to the multi-index $\hat s$. Moreover, from the asymptotic knowledge of the eigenvalues of $T_{\hat n}(p_{\hat k})$, we infer that $T_{\hat n}(f)$ has $g^d(m)$ eigenvalues $\lambda(m)$ going to zero asymptotically faster than $m^{-1}$, i.e. $\lambda(m)=o(m^{-1})$, where $g(m)\to\infty$ with the constraint that $g(m)=o(m^{1-1/(2k)})$. Finally, we deduce that at least $g^d(m)$ eigenvalues of the preconditioned matrix collapse to zero as $m$ tends to infinity, and this plainly implies that both statements (a) and (b) hold. □

Analogously to the τ case, we prove that the essential spectral equivalence is also impossible, with the help of the following lemma.

Lemma 2.9. Let $p_{\hat k}(\hat x)=(2-2\cos(x_1))^{k_1}+(2-2\cos(x_2))^{k_2}+\cdots+(2-2\cos(x_d))^{k_d}$, where $\hat k=(k_1,k_2,\ldots,k_d)^T$ is a vector of nonnegative indices with $\min_i k_i\ge 2$. Let $\tilde T_{\hat n}(p_{\hat k})$ be the Toeplitz correction defined in (6). Then, for every fixed $t$ independent of $\hat n$, there exists a subspace $V_t$ of dimension $t$ such that

$$\min_{v\in V_t,\ \|v\|_2=1} v^H\tilde T_{\hat n}(p_{\hat k})\,v\ \ge\ \frac{2}{n_1}\,r(\theta_1^{(n_1)})(1+o(1))$$
with r(·) and %1(n1 ) deJned as in Lemma 2.6. Proof. We take the same choice of the subspace Vt as in Lemma 2.6, with the only (n ) di6erence that the eigenvectors vj j of the class are replaced by the circulant eigen(n )
vectors fj j . Then the proof follows exactly the same steps as the one of Lemma 2.6, where the role of Lemma 2.3 is naturally replaced by that of Lemma 2.4. Theorem 2.10. Let f be asymptotic to pkˆ (ˆx) = (2 − 2 cos(x1 ))k1 + (2 − 2 cos(x2 ))k2 + · · · + (2 − 2 cos(xd ))kd , where kˆ = (k1 ; k2 ; : : : ; kd )T is a vector of nonnegative indices with min ki ¿1 and xˆ = (x1 ; x2 ; : : : ; xd )T is a multivariate vector in Qd . Let also + be a <xed positive number independent of nˆ = (n1 ; n2 ; : : : ; nd )T with ni ∼ nj , ∀i; j = 1; : : : ; d. Then for every sequence {Pnˆ } with Pnˆ d-level circulant and such that -min (Pnˆ−1 Tnˆ (f)) ¿ +
(34)
uniformly with respect to n, ˆ we have (a) the maximal eigenvalue of Pnˆ−1 Tnˆ ( f) diverges to in
tends to infinity; (b) the sequences $\{T_{\hat n}(f)\}$ and $\{P_{\hat n}\}$ cannot be essentially spectrally equivalent.

Proof. Proceeding as in the proof of Theorem 2.7, suppose that $P_{\hat n}^{-1}T_{\hat n}(p_{\hat k})$ has at most $q$ eigenvalues bigger than $\beta$. Then the subspace $W_{\hat n}$ spanned by the eigenvectors related to the $o(m^{-2k+1})$ eigenvalues of $P_{\hat n}$ has to be contained in

$$\mathrm{span}\Bigl\{f_{s_1}^{(n_1)}\otimes f_{s_2}^{(n_2)}\otimes\cdots\otimes f_{s_d}^{(n_d)}:\ \sum_{i=1}^{d}s_i^{2k_i}=o(m)\Bigr\}$$

with the possible exception of $q$ indices $\hat s$. Now we look for the contradiction:

$$\beta\ \ge\ \lambda_{q+1}=\max_{\dim V=q+1}\ \min_{v\in V,\,v\ne 0}\frac{v^H T_{\hat n}(p_{\hat k})v}{v^H P_{\hat n}v}=\max_{\dim V=q+1}\ \min_{v\in V,\,v\ne 0}\frac{v^H(C_{\hat n}(p_{\hat k})+\tilde T_{\hat n}(p_{\hat k}))v}{v^H P_{\hat n}v}\ \ge\ \max_{\dim V=q+1}\ \min_{v\in V,\,v\ne 0}\frac{v^H\tilde T_{\hat n}(p_{\hat k})v}{v^H P_{\hat n}v}.\qquad(35)$$
Finally, in analogy with the τ case, we choose a subspace $V_{q+1}\subset W_{\hat n}$ as in Lemma 2.9 and the proof is obtained in a perfectly identical way. More specifically, we infer

$$\beta\ \ge\ \frac{2}{n_1}\,r(\theta)\Bigl(\max_{\hat f_{\hat s}^{(\hat n)}\in W_{\hat n}}\lambda_{\hat s}\Bigr)^{-1}(1+o(1))$$

and then

$$\max_{\hat f_{\hat s}^{(\hat n)}\in W_{\hat n}}\lambda_{\hat s}=o(m^{-2k+1})\ \ge\ \frac{2}{\beta\,n_1}\,r(\theta)\,(1+o(1))\ \sim\ m^{-1},$$

which is a contradiction for all $k\ge 1$, since $2k-1\ge 1$ for all $k\ge 1$ and $n_1\sim m$. □
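The contrast between d = 1 and d ≥ 2 can be observed numerically on the simplest two-level τ example (our sketch, NumPy; sizes illustrative): with the natural τ preconditioner, no eigenvalue of $P_{\hat n}^{-1}T_{\hat n}$ falls below 1, but the largest eigenvalues keep growing with the size, as Theorem 2.7 predicts.

```python
import numpy as np

def tau_vs_toeplitz_2d(n):
    # 1D pieces: A = T_n(2-2cos x); tau part A@A; Hankel corner correction H1
    A = 2*np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    H1 = np.zeros((n, n))
    H1[0, 0] = H1[-1, -1] = 1.0
    I = np.eye(n)
    P = np.kron(A @ A, I) + np.kron(I, A @ A)        # tau_{n,n}(p_(2,2))
    T = P + np.kron(H1, I) + np.kron(I, H1)          # T_{n,n}(p_(2,2))
    return np.linalg.eigvals(np.linalg.solve(P, T)).real

small, large = tau_vs_toeplitz_2d(6), tau_vs_toeplitz_2d(12)
assert small.min() > 1 - 1e-8       # T >= P, so no eigenvalue falls below 1 ...
assert large.max() > small.max()    # ... but the largest ones grow with n
```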
Remark 2.1 (Generalizations). The proofs in both the circulant and τ cases have been worked out when the symbol f has a unique root at zero of even order. We believe that a similar construction can be done in the case where f has either a zero located elsewhere or multiple distinct roots. We also remark that if some of the terms $(2-2\cos(x_i))^{k_i}$ do not appear in the polynomial p, then the above theory holds as well. In such a case the function f has a hyperplane of roots along the directions $x_i$ whose term $(2-2\cos(x_i))^{k_i}$ does not appear. As a consequence, the negative results for the matrix algebra preconditioners are applicable in a more general setting.

Remark 2.2 (Partial differential equations). The Finite Difference and Finite Element discretization of elliptic PDEs over a rectangular domain, with variable coefficients and suitable boundary conditions, leads to sequences of nonstructured matrices (or with hidden structure [24]!) which are spectrally equivalent to sequences of the form $T_{\hat n}(p_{\hat k})$ with $\hat k=k(1,\ldots,1)^T$, $k\ge 1$. Therefore, due to the transitivity of equivalence relations, it follows that the negative results presented in this paper plainly generalize to this more general context too.
3. Conclusions

From this note and from [13,26,27,22] we deduce a message which can be summarized as follows: the d-level case with $d\ge 2$ is dramatically different from the scalar Toeplitz case in terms of preconditioning using fast-transform algebras. More precisely, in the d-level case with $d\ge 2$ it is impossible (except for rare exceptions) to find superlinear and/or (essentially) spectrally equivalent matrix algebra preconditioners: we notice that the approach in [26,22] mainly pertains to the impossibility of a strong clustering, while the approach used here and in [13] pertains to the impossibility of the essential spectral equivalence. On the other hand, optimality has been proved in the case of multilevel band Toeplitz preconditioning [18,12,21] (see also [14] for a new proposal). However, it should be mentioned that solving those multilevel banded systems optimally is still a difficult problem, which has been solved (both theoretically and practically) only in some cases with the help of a multigrid strategy [9,28,23]. In conclusion, the positive message of this note is an invitation for researchers interested in this field to work on multigrid/multilevel procedures for multilevel banded Toeplitz structures and to give practical techniques for devising spectrally equivalent preconditioners having a multilevel and banded pattern.
Acknowledgements Warm thanks to Kolia Zamarashkin and to the referees for their very helpful suggestions.
References

[1] D. Bertaccini, G. Golub, S. Serra Capizzano, C. Tablino Possio, Preconditioned HSS method for the solution of non-Hermitian positive definite linear systems, Tech. Report SCCM-02-10, Stanford University, 2002.
[2] D. Bini, M. Capovani, Spectral and computational properties of band symmetric Toeplitz matrices, Linear Algebra Appl. 52/53 (1983) 99–126.
[3] D. Bini, V. Pan, Matrix and Polynomial Computations, vol. 1: Fundamental Algorithms, Birkhäuser, Boston, MA, 1994.
[4] A. Böttcher, B. Silbermann, Introduction to Large Truncated Toeplitz Operators, Springer, New York, NY, 1999.
[5] R.H. Chan, Toeplitz preconditioners for Toeplitz systems with nonnegative generating functions, IMA J. Numer. Anal. 11 (1991) 333–345.
[6] R.H. Chan, M. Ng, Conjugate gradient methods for Toeplitz systems, SIAM Rev. 38 (1996) 427–482.
[7] F. Di Benedetto, Analysis of preconditioning techniques for ill-conditioned Toeplitz matrices, SIAM J. Sci. Comput. 16 (1995) 682–697.
[8] F. Di Benedetto, G. Fiorentino, S. Serra Capizzano, C.G. preconditioning for Toeplitz matrices, Comput. Math. Appl. 25 (1993) 35–45.
[9] G. Fiorentino, S. Serra Capizzano, Multigrid methods for symmetric positive definite block Toeplitz matrices with nonnegative generating functions, SIAM J. Sci. Comput. 17 (4) (1996) 1068–1081.
[10] G. Golub, C. Van Loan, Matrix Computations, The Johns Hopkins University Press, Baltimore, MD, 1983.
[11] R.L. Graham, D.E. Knuth, O. Patashnik, Concrete Mathematics. A Foundation for Computer Science, Addison-Wesley, Reading, MA, 1989.
[12] M. Ng, Band preconditioners for block-Toeplitz–Toeplitz-block systems, Linear Algebra Appl. 259 (1997) 307–327.
[13] D. Noutsos, S. Serra Capizzano, P. Vassalos, Spectral equivalence and matrix algebra preconditioners for multilevel Toeplitz systems: a negative result (V. Olshevsky, Ed.), Contemp. Math. 323 (2003) 313–322.
[14] D. Noutsos, P. Vassalos, New band Toeplitz preconditioners for ill-conditioned symmetric positive definite Toeplitz systems, SIAM J. Matrix Anal. Appl. 23 (3) (2002) 728–743.
[15] D. Potts, G. Steidl, Preconditioners for ill-conditioned Toeplitz systems constructed from positive kernels, SIAM J. Sci. Comput. 22 (5) (2001) 1741–1761.
[16] D. Potts, G. Steidl, Preconditioning of Hermitian block-Toeplitz–Toeplitz-block matrices by level-1 preconditioners (V. Olshevsky, Ed.), Contemp. Math. 281 (2001) 193–212.
[17] K. Rao, P. Yip, Discrete Cosine Transform: Algorithms, Advantages, Applications, Academic Press, New York, NY, 1990.
[18] S. Serra Capizzano, Preconditioning strategies for asymptotically ill-conditioned block Toeplitz systems, BIT 34 (1994) 579–594.
[19] S. Serra Capizzano, Optimal, quasi-optimal and superlinear band-Toeplitz preconditioners for asymptotically ill-conditioned positive definite Toeplitz systems, Math. Comput. 66 (1997) 651–665.
[20] S. Serra Capizzano, The rate of convergence of Toeplitz based PCG methods for second order nonlinear boundary value problems, Numer. Math. 81 (3) (1999) 461–495.
[21] S. Serra Capizzano, Spectral and computational analysis of block Toeplitz matrices having nonnegative definite matrix-valued generating functions, BIT 39 (1) (1999) 152–175.
[22] S. Serra Capizzano, Matrix algebra preconditioners for multilevel Toeplitz matrices are not superlinear, Linear Algebra Appl. 343–344 (2002) 303–319.
[23] S. Serra Capizzano, Convergence analysis of two-grid methods for elliptic Toeplitz and PDEs matrix sequences, Numer. Math. 92 (3) (2002) 433–465.
[24] S. Serra Capizzano, Generalized locally Toeplitz sequences: spectral analysis and applications to discretized partial differential equations, Linear Algebra Appl. 366 (1) (2003) 371–402.
[25] S. Serra Capizzano, P. Tilli, Extreme singular values and eigenvalues of non-Hermitian block Toeplitz matrices, J. Comput. Appl. Math. 108 (1/2) (1999) 113–130.
[26] S. Serra Capizzano, E. Tyrtyshnikov, Any circulant-like preconditioner for multilevel matrices is not superlinear, SIAM J. Matrix Anal. Appl. 22 (1) (1999) 431–439.
[27] S. Serra Capizzano, E. Tyrtyshnikov, How to prove that a preconditioner cannot be superlinear, Math. Comput. 72 (2003) 1305–1316.
[28] H. Sun, X. Jin, Q. Chang, Convergence of the multigrid method for ill-conditioned block Toeplitz systems, BIT 41 (2001) 179–190.
[29] H. Widom, Toeplitz matrices, in: I. Hirschman Jr. (Ed.), Studies in Real and Complex Analysis, Math. Assoc. Amer., Washington, DC, 1965.
Theoretical Computer Science 315 (2004) 581 – 592
www.elsevier.com/locate/tcs
Iterative inversion of structured matrices

Victor Y. Pan^a, Marc Van Barel^{b,*}, Xinmao Wang^c, Gianni Codevico^a

^a Department of Mathematics and Computer Science, Lehman College, City University of New York, Bronx, NY 10468, USA
^b Department of Computer Science, Katholieke Universiteit Leuven, Celestijnenlaan 200A, Leuven (Heverlee) B-3001, Belgium
^c Ph.D. Program in Mathematics, Graduate School of CUNY, New York, NY 10016, USA
Abstract

Iterative processes for the inversion of structured matrices can be further improved by using a technique for compression and refinement via the least-squares computation. We review such processes and elaborate upon the incorporation of this technique into the known frameworks.

Keywords: Structured matrices; Displacement rank; Iterative inversion; Least-squares computations
The research of V.Y.P. and X.W. was supported by NSF Grant CCR 9732206 and PSC CUNY Award 66406-0033. The research of M.V.B. and G.C. was supported by the Research Council K.U. Leuven, project OT/00/16 (SLAP: Structured Linear Algebra Package), by the Fund for Scientific Research–Flanders (Belgium), projects G.0078.01 (SMA: Structured Matrices and their Applications), G.0176.02 (ANCILA: Asymptotic aNalysis of the Convergence behavior of Iterative methods in numerical Linear Algebra), and G.0184.02 (CORFU: Constructive study of Orthogonal Functions), and by the Belgian Programme on Interuniversity Poles of Attraction, initiated by the Belgian State, Prime Minister's Office for Science, Technology and Culture, project IUAP V-22 (Dynamical Systems and Control: Computation, Identification & Modelling). The scientific responsibility rests with the authors.
* Corresponding author.
E-mail addresses: [email protected] (V.Y. Pan), [email protected] (M. Van Barel), [email protected] (X. Wang), [email protected] (G. Codevico).

1. Introduction

Structured matrices such as Toeplitz, Hankel, Vandermonde, and Cauchy matrices, as well as matrices with a structure generalizing the latter classes, are omnipresent in computations for sciences, engineering, and signal processing. Displacement representations of such a matrix enable its fast multiplication by a vector and expression of
c 2004 Elsevier B.V. All rights reserved. 0304-3975/$ - see front matter doi:10.1016/j.tcs.2004.01.008
its inverse via the solutions of a few linear systems of equations. The latter problems (of inversion and linear system solving) are highly important for the theory and practice of computing. Some effective direct solution algorithms exploiting the displacement representation can be found, e.g., in [5,7,13,18–20,25,32]. Alternative iterative methods were proposed, e.g., in [3,21–24,26,27,31]. The latter methods nontrivially extend some preceding work for general input matrices [2,29,33] and can be most effective for well-conditioned inputs. We briefly survey the state of the art in Sections 2–5. In particular, in Section 5, we cover two policies for keeping matrices compressed during the iterative process. In Section 6, we cover another technique, based on least-squares computation, which enables both compression and refinement of the computed approximations to the inverse (see Theorem 6.1). We elaborate upon the incorporation of this technique into the known frameworks for iterative inversion. Section 7 demonstrates the validity of the method by numerical experiments. Due to the well-known close correlation between computations with structured matrices and with polynomials and rational functions [25], many fundamental algebraic computations, such as polynomial multiplication and division and polynomial and rational interpolation and multipoint evaluation, can be reduced to operations with structured matrices. So our work may serve as an example of the effective application of numerical methods to solve fundamental problems of algebraic computation.

2. Iterative matrix inversion
Newton's iteration for matrix inversion,

X_{i+1} = X_i (2I + M X_i),   i = 0, 1, ...,   (2.1)
defines a sequence of approximations X_0, X_1, ... to −M^{−1}, with the residuals I + MX_i and I + X_iM squared in each step (2.1). Thus, the matrices X_i rapidly converge to −M^{−1} if initially the norms ‖I + MX_0‖ and/or ‖I + X_0M‖ are substantially less than 1. In some cases an initial approximation X_0 can be supplied from outside; otherwise it can be generated according to some rules specified in [2,27,29,33], [25, Chapter 6]. (See also the homotopy/continuation approach in [25, Chapter 6], [27].) There are certain policies allowing convergence acceleration, such as

X_{i+1} = a_i X_i (2I + M X_i),   i = 0, 1, ...,   (2.2)
decreasing the number of steps by a factor of 2 versus (2.1), for scalars a_i specified in [29] (cf. [25, p. 191]), and

X_{i+1} = X_i (I + R_i + R_i^2 + ··· + R_i^{p−1}),   R_i = I + M X_i,   i = 0, 1, ....   (2.3)

Note that R_{i+1} = R_i^p under (2.3), that multiplication of p matrix pairs is needed per step (2.3), and that processes (2.2) and (2.3) turn into (2.1) where a_i = 1 for all i and p = 2, respectively.
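As a concrete illustration, the basic process (2.1) can be sketched in a few lines of NumPy. This is a dense-arithmetic toy that ignores the displacement structure exploited in the rest of the paper; the scaled initial guess X_0 = −M^T/(‖M‖_1 ‖M‖_∞) is one standard choice from the literature cited above, guaranteeing ‖I + MX_0‖_2 < 1 for any nonsingular M.

```python
import numpy as np

def newton_inverse(M, steps=20):
    """Process (2.1): X_{i+1} = X_i (2I + M X_i), so X_i -> -M^{-1};
    the residual I + M X_i is squared in each step."""
    n = M.shape[0]
    I = np.eye(n)
    # Scaled initial guess: ||I + M X_0||_2 < 1 since ||M||_2^2 <= ||M||_1 ||M||_inf.
    X = -M.T / (np.linalg.norm(M, 1) * np.linalg.norm(M, np.inf))
    for _ in range(steps):
        X = X @ (2 * I + M @ X)
    return X

rng = np.random.default_rng(0)
M = np.eye(6) + 0.1 * rng.standard_normal((6, 6))  # a well-conditioned test matrix
X = newton_inverse(M)
residual = np.linalg.norm(np.eye(6) + M @ X, 2)
```

With 20 steps the residual norm is squared 20 times, so for a well-conditioned M it reaches machine precision long before the loop ends.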
Table 1
Four classes of structured matrices

Toeplitz matrices: T = (t_{i−j})_{i,j=0}^{n−1} (entries constant along each diagonal)
Hankel matrices: H = (h_{i+j})_{i,j=0}^{n−1} (entries constant along each antidiagonal)
Vandermonde matrices: V = (t_i^j)_{i,j=0}^{n−1}
Cauchy matrices: C = (1/(s_i − t_j))_{i,j=0}^{n−1}
3. Structured matrices
Iterative inversion is most effective for structured matrices, for which matrix–matrix and matrix–vector multiplication can be performed at a low computational cost (using O(n log n) or O(n log^2 n) flops, versus the order of n^3 for n × n general matrices). Table 1 displays the most popular classes of structured matrices. Each of these n × n matrices is completely defined by n, 2n − 1, or 2n parameters. More generally, many other matrices with similar structures can be represented with O(n) parameters as follows. Associate with a fixed class of structured matrices M a pair of operator matrices A and B such that the Stein and/or Sylvester displacements of M,

Δ_{A,B}(M) = M − AMB = GH^T,   ∇_{A,B}(M) = AM − MB = GH^T,   (3.1)
respectively, have small rank r (called the displacement rank of M ). The n × n matrix M can be eKectively expressed via the columns of its displacement generator matrices G and H of size n × r. Then operate with structured matrices represented in this compressed form. We trace this important approach back to [12,14,15,17] (cf. also [10,18]); it has a huge bibliography (cf. [4,13,16,25] for surveys and details). We cover the Sylvester displacement representation referring the reader to [25] and the references therein on the dual Stein displacement representation. Typical operator matrices are the unit f-circulant matrices, n−1 Zf = (zi;j )i;j=0 ;
(3.2)
where zi+1; i = 1, i = 0; 1; : : : ; n − 2, z0;n−1 = f, zi;j = 0 if (i − j) mod n = 1, and diagonal n−1 n−1 for s = (si )i=0 . Table 2 shows operator matrices associated matrices, Ds = diag(si )i=0 with structured matrices of Table 1, the displacement rank, and the cost in Nops for multiplication by a vector. For these operator matrices, the arithmetic cost of multiplication of a structured matrix by a vector lies in O(rn log n) with the operators ∇Ze ; Zf
Table 2
Structured matrix operator matrices

Matrix class | Pair of operator matrices | Displacement rank | No. of flops for multiplication by vector
Toeplitz (t_{i−j})_{i,j} | Z_e, Z_f, e ≠ f | ≤ 2 | O(n log n)
Hankel (h_{i+j})_{i,j} | Z_e, Z_f^T, ef ≠ 1 | ≤ 2 | O(n log n)
Vandermonde (t_i^j)_{i,j} | D_t, Z_f^T | ≤ 1 | O(n log^2 n)
Cauchy (1/(s_i − t_j))_{i,j} | D_s, D_t | ≤ 1 | O(n log^2 n)
and ∇_{Z_e,Z_f^T} (that is, in the Toeplitz/Hankel, T/H, case) and in O(rn log^2 n) with the operators ∇_{D(t),Z_f^T} and ∇_{D(s),D(t)} (that is, in the Vandermonde/Cauchy, V/C, case).

4. Structured iterative inversion via matrix-by-vector multiplication
The acceleration of iterative inversion of structured matrices is achieved by reducing processes (2.1)–(2.3) to matrix-by-vector multiplication. Similarly to (3.1), write

∇_{B,A}(X_i) = G_i H_i^T,   (4.1)

where G_i, H_i are n × l_i matrices, r ≤ l_i ≤ n, and observe that iteration (2.2) can be performed by computing the generators

G_{i+1} = (a_i(2I + X_iM)G_i, a_iG_i, a_iX_iG),   (4.2)
H_{i+1}^T = (H_i^T; H_i^T M X_i; H^T X_i) (stacking the three blocks of rows)   (4.3)

for X_{i+1}. For a_i = 1, this turns into a compressed version of iteration (2.1), and similar expressions can be derived for process (2.3). Eqs. (4.2) and (4.3) reduce step (2.2) (or (2.1)) essentially to multiplication of the matrices M, M^T, X_i, and X_i^T by l_i, r, r + l_i, and r + l_i vectors, respectively, where r and l_i denote the lengths of the displacement generators available for M and X_i, respectively. This means

c_{r,n,l_i} = O((r + l_i)^2 n log^d n)   (4.4)
flops per step (2.2), where d = 1 (in the T/H case) or d = 2 (in the V/C case), so it is crucial to bound l_i to make the iteration effective. Typical initial choices of X_0 achieve l_0 ≤ r, but processes (4.2) and (4.3) may inflate this to l_i = 3^i l_0, so special care should be taken periodically to keep the computations effective.

5. Compression of the iterates via the truncation of singular values or via substitution
To compress the iterates X_{i+1} one should modify (2.1)–(2.3) as follows:

X̂_{i+1} = g(X_i, M),   X_{i+1} = f(X̂_{i+1}),   (5.1)
where g(X_i, M) is the iteration defined by (2.1), (2.2), or (2.3), and the function f(W) defines a compression rule. Unfortunately, already with the first compression step, the theorems in [29] supporting the twofold acceleration of (2.2) versus (2.1) no longer hold, so one may either ensure a desired decrease of the residual norm in fewer steps (2.2) by postponing compression (5.1), thus spending more flops per step, or use compression and then risk divergence or a slow-down of convergence. The first simple policy for defining f(X̂_{i+1}) in (5.1) (proposed in [22–24] and described as Subroutine R1 in [25, Section 6.4]) is to truncate the smallest singular values of the displacement ∇_{B,A}(X̂_{i+1}) = Ĝ_{i+1}Ĥ_{i+1}^T. Then we recover X_{i+1} from the resulting shorter displacement generator. Another policy [26,31] (called compression via substitution, relying on the observation that

∇_{B,A}(−M^{−1}) = G_−H_−^T = M^{−1}∇_{A,B}(M)M^{−1} = M^{−1}GH^TM^{−1}

and described as Subroutine R2 in [25, Section 6.5]) is to replace (2.1)–(2.3) by the process

G_{i+1} = g(X_i, M)G,   H_{i+1}^T = H^T g(X_i, M),   (5.2)
requiring about c_{r,n,l_i} flops per step. The reader is referred to [22,23,26,28] for estimates of how much (or how little) these compression policies slow down the convergence.

6. Compression using a least-squares criterion
The third policy is to compute a least-squares refinement G_{i+1}, H_{i+1} of the displacement generator Ĝ_{i+1}, Ĥ_{i+1} of the computed approximation X̂_{i+1} to −M^{−1} such that

∇_{B,A}(X̂_{i+1}) = Ĝ_{i+1}Ĥ_{i+1}^T,   (6.1)

G_{i+1} = Ĝ_{i+1}Y_{i+1,G},   H_{i+1} = Ĥ_{i+1}Y_{i+1,H},   (6.2)

and the column-wise vector 2-norms

N_G = ‖G − MĜ_{i+1}Y_{i+1,G}‖_2   (6.3)

and

N_H = ‖H − M^TĤ_{i+1}Y_{i+1,H}‖_2   (6.4)

are minimized over all l_{i+1} × r matrices Y_{i+1,G} and Y_{i+1,H}. The pair G_{i+1}, H_{i+1} is used as a displacement generator representing the matrix

X_{i+1} = ∇_{B,A}^{−1}(G_{i+1}H_{i+1}^T)

of (5.1) (cf. (4.1)). Besides compression, this policy is also intended to refine the approximation to −M^{−1} (see Theorem 6.1).
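A minimal NumPy sketch of this least-squares step: the minimizers of (6.3) and (6.4) are ordinary linear least-squares solutions, obtained here with `lstsq` rather than via explicit normal equations; all names and dimensions below are illustrative.

```python
import numpy as np

def ls_compress(M, G, H, G_hat, H_hat):
    """Compress a length-l generator (G_hat, H_hat) of an approximate inverse
    down to length r by minimizing, column-wise, the norms
    ||G - M G_hat Y_G|| and ||H - M^T H_hat Y_H|| over all l x r matrices Y."""
    Y_G = np.linalg.lstsq(M @ G_hat, G, rcond=None)[0]
    Y_H = np.linalg.lstsq(M.T @ H_hat, H, rcond=None)[0]
    return G_hat @ Y_G, H_hat @ Y_H  # the refined length-r generator

rng = np.random.default_rng(1)
n, r, l = 8, 2, 5
M = np.eye(n) + 0.1 * rng.standard_normal((n, n))
G, H = rng.standard_normal((n, r)), rng.standard_normal((n, r))          # generator of M
G_hat, H_hat = rng.standard_normal((n, l)), rng.standard_normal((n, l))  # inflated generator
G1, H1 = ls_compress(M, G, H, G_hat, H_hat)
```

By optimality of the least-squares solution, no other choice of the l × r matrix Y can produce a smaller residual ‖G − MĜY‖.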
We consider two applications of this policy of compression and refinement. The minimizing matrices Y_{i+1,G} and Y_{i+1,H} of (6.3) and (6.4) satisfy the normal equations

(Ĝ_{i+1}^T M^T)MĜ_{i+1}Y_{i+1,G} = Ĝ_{i+1}^T M^T G,   (6.5)

(Ĥ_{i+1}^T M)M^TĤ_{i+1}Y_{i+1,H} = Ĥ_{i+1}^T M H.   (6.6)

This amounts to multiplication of each of the matrices M and M^T by l_{i+1} vectors (that is, a fraction of the c_{r,n,l_i} flops of (4.4)), 2l_{i+1}(l_{i+1} + r)(2n − 1) flops for computing the coefficients of the normal equations (6.5) and (6.6), and 2rl_{i+1}(l_{i+1} − 1)(2l_{i+1} + 5)/3 = 4rl_{i+1}^3/3 + O(rl_{i+1}^2) flops for solving these equations. Clearly, the computations in (6.1)–(6.4) compress the generator Ĝ_{i+1}, Ĥ_{i+1} to the smallest length r. Their role in refinement can be seen from the next theorem, applied for

X_* = X_{i+1},   G_* = G_{i+1},   H_* = H_{i+1}.

Theorem 6.1. Let A, B, M, and X_* be n × n matrices and let G, H, G_*, and H_* be n × r matrices, 1 ≤ r < n, such that M is nonsingular,

∇_{A,B}(M) = AM − MB = GH^T,   ∇_{B,A}(X_*) = BX_* − X_*A = G_*H_*^T.

Then

M∇_{B,A}(X_* + M^{−1})M = −G(H^T − H_*^TM) − (G − MG_*)H_*^TM.

Proof. The generator G_−, H_− for the inverse matrix −M^{−1} satisfies the following relation:

∇_{B,A}(−M^{−1}) = M^{−1}∇_{A,B}(M)M^{−1} = G_−H_−^T.

Hence, G_− and H_− can be chosen such that G = MG_− and M^TH_− = H.
Therefore, we have

M∇_{B,A}(X_* + M^{−1})M = M(−G_−H_−^T + G_*H_*^T)M
= −GH^T + MG_*H_*^TM = −G(H^T − H_*^TM) − (G − MG_*)H_*^TM.

The computations in (6.1)–(6.4) minimize the 2-norms of the approximations of G by MG_* and of H by M^TH_*; since G = MG_− and H = M^TH_−, this should move G_* closer to G_− and H_* closer to H_−. By Theorem 6.1, we get the following result.

Corollary 6.1. Using the same notation as in Theorem 6.1 and the 2-norm, we obtain

N = ‖M∇_{B,A}(X_* + M^{−1})M‖_2 ≤ ‖G‖_2 ‖H^T − H_*^TM‖_2 + ‖G − MG_*‖_2 ‖H_*^TM‖_2.   (6.7)

Hence, decreasing the norms ‖H − M^TH_*‖_2 and ‖G − MG_*‖_2 decreases the upper bound on the norm N and, consequently, on the error norm

E = ‖X_* + M^{−1}‖_2 ≤ N ‖∇_{B,A}^{−1}‖_2 ‖M^{−1}‖_2^2.
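Theorem 6.1 is a purely algebraic identity, so it can be sanity-checked numerically with arbitrary factorizations of the two displacements. The sketch below uses trivial full-length generators with H = H_* = I; all matrix sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 6
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
M = np.eye(n) + 0.1 * rng.standard_normal((n, n))  # nonsingular
X_star = rng.standard_normal((n, n))               # any candidate approximation to -M^{-1}
I = np.eye(n)

# Trivial (full-length) generators of the two displacements:
# nabla_{A,B}(M) = AM - MB = G H^T  and  nabla_{B,A}(X*) = B X* - X* A = G* H*^T.
G, H = A @ M - M @ B, I
G_s, H_s = B @ X_star - X_star @ A, I

Minv = np.linalg.inv(M)
lhs = M @ (B @ (X_star + Minv) - (X_star + Minv) @ A) @ M
rhs = -G @ (H.T - H_s.T @ M) - (G - M @ G_s) @ H_s.T @ M
```

The two sides agree up to rounding, which is exactly the statement of Theorem 6.1.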
In [25,30], quite tight upper and lower bounds on ‖∇_{B,A}^{−1}‖_2 are derived for various often-used pairs A and B. In particular, by Corollary 8.10 of [30] we have

‖∇_{Z_e,Z_f}^{−1}‖_2 ≤ √(2r) (ẽf̃)^{(n−1)/n} / min_{i,j} |e^{1/n}ω^i − f^{1/n}ω^j|,

provided that ẽ = max{|e|, 1/|e|}, f̃ = max{|f|, 1/|f|}, ω = exp(2π√−1/n), and the operator is applied to matrices of rank r. We also note the respective bounds on the norms of the left and right residuals:

‖I + X_iM‖_2 ≤ E‖M‖_2,   ‖I + MX_i‖_2 ≤ E‖M‖_2.
Remark 6.1. For a given displacement ∇_{A,B}(M), the choice of the generator pair G, H satisfying (3.1) is not unique, but this choice does not affect the norm N of (6.7), which depends only on GH^T. This follows because Y_{i+1,G} and Y_{i+1,H} depend only on GH^T, as can be seen from (6.5) and (6.6) in the full-rank case and from the QR factorization of MĜ_{i+1} and M^TĤ_{i+1} with column pivoting [11, Section 5.4.1] in the rank-deficient case.

7. Numerical experiments
Let us specify a particular invertible displacement operator, which we used to perform the numerical tests. We write C^+ = Z_{+1}, C^− = Z_{−1} (cf. (3.2)), and denote by
C^+(x) = Σ_{i=1}^n x_i(C^+)^{i−1} and C^−(x) = Σ_{i=1}^n x_i(C^−)^{i−1} the (+1)-circulant and (−1)-circulant matrices having x as the first column. We recall from [6] that

C^+(x) = F diag(y)F^H,   C^−(x) = DF diag(ŷ)F^H D^H,
y = (1/n) F^H x,   ŷ = (1/n) F^H D^H x,
F = (ω^{(i−1)(j−1)})_{i,j=1,...,n},   D = diag(1, μ, μ^2, ..., μ^{n−1}),   (7.1)
ω = cos(2π/n) + √−1 sin(2π/n) = exp(2π√−1/n),
μ = cos(π/n) + √−1 sin(π/n) = exp(π√−1/n).
Consider the invertible operators

∇_+(M) = C^+M − MC^−   and   ∇_−(M) = C^−M − MC^+.
The following theorem summarizes some well-known results (see, e.g., [1,8,9,25]).

Theorem 7.1. It holds that

∇_+(M) = Σ_{i=1}^k u_iv_i^T ⇔ M = (1/2) Σ_{i=1}^k C^+(u_i)C^−(Jv_i),

∇_−(M) = Σ_{i=1}^k u_iv_i^T ⇔ M = −(1/2) Σ_{i=1}^k C^−(u_i)C^+(Jv_i),

∇_−(M^{−1}) = −M^{−1}∇_+(M)M^{−1},

where J is the permutation matrix having ones on the antidiagonal. In particular, if ∇_+(M) = Σ_{i=1}^k u_iv_i^T and det M ≠ 0, we obtain

M^{−1} = (1/2) Σ_{i=1}^k C^−(M^{−1}u_i)C^+(JM^{−T}v_i).   (7.2)
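The factorization C^+(x) = F diag(y) F^H with y = (1/n)F^H x from (7.1) is what turns multiplication by ±1-circulants into FFT computations. A small NumPy check of the (+1)-circulant case:

```python
import numpy as np

def circ_plus(x):
    """C^+(x): the (+1)-circulant matrix with x as its first column."""
    n = len(x)
    return np.array([[x[(i - j) % n] for j in range(n)] for i in range(n)])

n = 8
x = np.random.default_rng(4).standard_normal(n)
omega = np.exp(2j * np.pi / n)
F = omega ** np.outer(np.arange(n), np.arange(n))  # F = (omega^{(i-1)(j-1)})_{i,j}
y = F.conj().T @ x / n                             # y = (1/n) F^H x
C_spectral = F @ np.diag(y) @ F.conj().T           # C^+(x) = F diag(y) F^H
```

The vector y coincides with np.fft.fft(x)/n, so multiplying C^+(x) by a vector costs O(n log n) flops via three FFTs.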
We have implemented the classical approach, based on truncating the smallest singular values, and the new least-squares based compression approach for the displacement operators ∇_+ and ∇_−. The software was written in Matlab. 1 Experimental tests based on this algorithm clearly show that the new least-squares compression approach gives more accurate results than the classical compression for well-conditioned matrices. The algorithm is applied to 100 Toeplitz matrices M = T of size 100 × 100, where the entries of each Toeplitz matrix are uniformly random between zero and one. For each of the 100 samples, the starting point is computed as X_0 = T^{−1}(I + εR), where the entries of R are uniformly random between −1 and +1. The parameter ε is determined such that the norm of the left residual I + X_0T equals 1 and the norm of the right residual I + TX_0 is larger than 1. Fig. 1 gives a histogram of log_10(cond(T)).
1 Matlab is a registered trademark of The MathWorks.
Fig. 1. Histogram of log10 of the condition numbers of the 100 Toeplitz matrices.
Let R_i = I + X_iT be the left residual for X_i. Then the choice of our starting point guarantees that the convergence is reflected in the behaviour of the sequence ‖R_0‖, ‖R_1‖, ..., because for the Newton iteration without compression we have ‖R_{i+1}‖ ≤ ‖R_i‖^2. Fig. 2 shows the results for these 100 samples. Each plot shows a histogram of −log_10 ‖R_i‖ (as the x-coordinate) for i = 0, 3, and 6. The y-coordinate shows the number of sampled matrices (out of the total of 100) with this value of ‖R_i‖. The histograms on the left show the results when the classical compression is used, while the histograms on the right correspond to the least-squares compression. In the first iteration the residual norms are the same in all tests, but already 3 iterations with the least-squares compression generically result in a significantly smaller left residual norm. This is illustrated in the histograms by the fact that the black area in the right figure is shifted more towards the right compared to the left plot. In the 6th iteration the difference between the two approaches shows up even more dramatically.

8. Conclusion
We first recalled Newton's iteration algorithms for the inversion of structured matrices and then presented an alternative compression strategy based on a least-squares criterion. The numerical experiments with inverting random Toeplitz matrices show that for well-conditioned matrices this new compression scheme requires fewer iteration steps to obtain the same accuracy. In a companion paper [34], an alternative initial iteration step is proposed, leading to a much more robust iteration scheme, especially
Fig. 2. Distribution of the left residual norms in 1, 3, and 6 iterations in the two approaches: the classical truncation of singular values (left) and the least-squares compression (right).
when the matrix is ill-conditioned. In an upcoming paper, we also investigate a general compression framework with truncation, substitution, and the least-squares approach as special cases. To arrive at a user-friendly, robust, and efficient iteration scheme for approximating the inverse of a structured matrix, a lot of research still has to be done. Several other compression strategies are possible. Finding the right combination of all the possible variants for inverting a fixed Toeplitz matrix is not trivial.
References
[1] G. Ammar, P. Gader, A variant of the Gohberg–Semencul formula involving circulant matrices, SIAM J. Matrix Anal. Appl. 12 (3) (1991) 534–541.
[2] A. Ben-Israel, A note on an iterative method for generalized inversion of matrices, Math. Comput. 20 (1966) 439–440.
[3] D. Bini, B. Meini, Approximate displacement rank and applications, in: V. Olshevsky (Ed.), Proc. AMS Conf. "Structured Matrices in Operator Theory, Numerical Analysis, Control, Signal and Image Processing", Boulder, 1999, American Mathematical Society, Providence, RI, 2001, pp. 215–232.
[4] D. Bini, V. Pan, Polynomial and Matrix Computations, Vol. 1: Fundamental Algorithms, Birkhäuser, Boston, 1994.
[5] R.R. Bitmead, B.D.O. Anderson, Asymptotically fast solution of Toeplitz and related systems of linear equations, Linear Algebra Appl. 34 (1980) 103–116.
V.Y. Pan et al. / Theoretical Computer Science 315 (2004) 581 – 592
591
[6] W. Cline, R. Plemmons, G. Worm, Generalized inverses of certain Toeplitz matrices, Linear Algebra Appl. 8 (1974) 25–33.
[7] I. Gohberg, T. Kailath, V. Olshevsky, Fast Gaussian elimination with partial pivoting for matrices with displacement structure, Math. Comput. 64 (212) (1995) 1557–1576.
[8] I. Gohberg, V. Olshevsky, Circulant displacement and decomposition of matrices, Integral Equations Operator Theory 15 (1992) 730–743.
[9] I. Gohberg, V. Olshevsky, Complexity of multiplication with vectors for structured matrices, Linear Algebra Appl. 202 (1994) 163–192.
[10] I. Gohberg, A. Semencul, On the inversion of finite Toeplitz matrices and their continuous analogs, Mat. Issled. 2 (1972) 187–224.
[11] G.H. Golub, C.F. Van Loan, Matrix Computations, 3rd ed., The Johns Hopkins University Press, Baltimore, MD, 1996.
[12] G. Heinig, K. Rost, Algebraic Methods for Toeplitz-like Matrices and Operators, Akademie-Verlag, Berlin, and Birkhäuser, Basel/Stuttgart, 1984.
[13] T. Kailath, A.H. Sayed (Eds.), Fast Reliable Algorithms for Matrices with Structure, SIAM Publications, Philadelphia, 1999.
[14] T. Kailath, S.-Y. Kung, M. Morf, Displacement ranks of matrices and linear equations, J. Math. Anal. Appl. 68 (1979) 395–407.
[15] T. Kailath, S. Kung, M. Morf, Displacement ranks of a matrix, Bull. Amer. Math. Soc. 1 (1979) 769–773.
[16] T. Kailath, A. Sayed, Displacement structure: theory and applications, SIAM Rev. 37 (3) (1995) 297–386.
[17] T. Kailath, A. Vieira, M. Morf, Inverses of Toeplitz operators, innovations and orthogonal polynomials, SIAM Rev. 20 (1978) 106–119.
[18] M. Morf, Fast algorithms for multivariable systems, Ph.D. Thesis, Department of Electrical Engineering, Stanford University, Stanford, CA, 1974.
[19] M. Morf, Doubling algorithms for Toeplitz and related equations, in: Proc. IEEE Internat. Conf. on ASSP, IEEE Press, Piscataway, NJ, 1980, pp. 954–959.
[20] V. Olshevsky, V.Y. Pan, A unified superfast algorithm for boundary rational tangential interpolation problem, in: Proc. 39th Ann. IEEE Symp. Foundations of Computer Science (FOCS'98), IEEE Computer Society Press, Los Alamitos, CA, 1998, pp. 192–201.
[21] V.Y. Pan, Fast and efficient parallel inversion of Toeplitz and block Toeplitz matrices, Oper. Theory: Adv. Appl. 40 (1989) 359–389.
[22] V.Y. Pan, Parallel solution of Toeplitz-like linear systems, J. Complexity 8 (1992) 1–21.
[23] V.Y. Pan, Decreasing the displacement rank of a matrix, SIAM J. Matrix Anal. Appl. 14 (1) (1993) 118–121.
[24] V.Y. Pan, Concurrent iterative algorithm for Toeplitz-like linear systems, IEEE Trans. Parallel Distributed Systems 4 (1993) 592–600.
[25] V.Y. Pan, Structured Matrices and Polynomials: Unified Superfast Algorithms, Birkhäuser/Springer, Boston/New York, 2001.
[26] V.Y. Pan, S. Branham, R.E. Rosholt, A. Zheng, Newton's iteration for structured matrices and linear systems of equations, in: T. Kailath, A.H. Sayed (Eds.), Fast Reliable Algorithms for Matrices with Structure, SIAM Publications, Philadelphia, 1999, pp. 189–210.
[27] V.Y. Pan, M. Kunin, R.E. Rosholt, H. Cebecioğlu, Residual correction algorithms for general and structured matrices, preprint, 2001.
[28] V.Y. Pan, Y. Rami, X. Wang, Structured matrices and Newton's iteration, Linear Algebra Appl. 343–344 (2002) 233–265.
[29] V.Y. Pan, R. Schreiber, An improved Newton iteration for the generalized inverse of a matrix, with applications, SIAM J. Sci. Statist. Comput. 12 (5) (1991) 1109–1131.
[30] V.Y. Pan, X. Wang, Inversion of displacement operators, SIAM J. Matrix Anal. Appl. 24 (3) (2003).
[31] V.Y. Pan, A. Zheng, X. Huang, O. Dias, Newton's iteration for inversion of Cauchy-like and other structured matrices, J. Complexity 13 (1) (1997) 108–124.
[32] V.Y. Pan, A. Zheng, Superfast algorithms for Cauchy-like matrix computations and extensions, Linear Algebra Appl. 310 (2000) 83–108.
[33] T. Söderström, G. Stewart, On the numerical properties of an iterative method for computing the Moore–Penrose generalized inverse, SIAM J. Numer. Anal. 11 (1974) 61–74.
[34] M. Van Barel, G. Codevico, An adaptation of the Newton iteration method to solve symmetric positive definite Toeplitz systems, Report TW 349, Department of Computer Science, K.U. Leuven, Leuven, Belgium, November 2002; URL http://www.cs.kuleuven.ac.be/publicaties/rapporten/tw/TW349.abs.html.
Theoretical Computer Science 315 (2004) 593 – 625
www.elsevier.com/locate/tcs
Deformation techniques to solve generalised Pham systems
Luis Miguel Pardo a,∗, Jorge San Martín b
a Departamento de Matemáticas, Estadística y Computación, Facultad de Ciencias, Universidad de Cantabria, E-39071 Santander, Spain
b Departamento de Informática, Estadística y Telemática, Escuela Superior de Ciencias Experimentales y Tecnológicas, Universidad Rey Juan Carlos, c/Tulipán s/n, 28933 Móstoles, Spain
Abstract
In Heintz et al. (Electron. J. SADIO 1(1) (1998) 37), Castro et al. (Found. Comput. Math. (2003), to appear) and Pardo (Proceedings EACA'2000, 2000, pp. 25–51), the authors have shown that universal solving procedures require exponential running time. Roughly speaking, a universal solving procedure takes as input a system of multivariate polynomial equations and outputs complete symbolic information on the solution variety. Here, we introduce a non-universal solving procedure adapted to Generalised Pham Systems. The aim is to compute partial information on the variety defined by the input system. The algorithm is based on a homotopic deformation and on a non-Archimedean lifting procedure from a non-singular zero of the homotopy curve. The complexity of the procedure is also stated; it depends on an intrinsic quantity called the deformation degree of the given input system.
© 2004 Elsevier B.V. All rights reserved.
Keywords: Generalised Pham system; Universal algorithms; Homotopic deformation; Geometric degree
1. Introduction
In [4,25,42], the authors prove that universal elimination procedures require exponential running time. In fact, these authors show that the Bézout number of some input systems of polynomial equations is a lower bound for the output length and, hence, for the running time of universal elimination procedures. In these three papers, the authors

Research was partially supported by the Spanish grant BFM2000-0349.
∗ Corresponding author.
E-mail addresses: [email protected] (L.M. Pardo), [email protected] (J.S. Martín).
0304-3975/$ - see front matter © 2004 Elsevier B.V. All rights reserved.
doi:10.1016/j.tcs.2004.01.009
594
L.M. Pardo, J.S. Mart n / Theoretical Computer Science 315 (2004) 593 – 625
also observed that most symbolic procedures in Elimination Theory are universal in their sense. Roughly speaking, a universal elimination procedure is based on some universal polynomial equation solver. A polynomial equation solver is a device that takes as input a system of multivariate polynomials F := [f_1, ..., f_r] ∈ Q[X_1, ..., X_n]^r and returns some information concerning the solution variety V(F) ⊆ C^n,

V(F) := {x ∈ C^n : f_i(x) = 0, 1 ≤ i ≤ r}.

Informally speaking, a polynomial equation solver is called universal if, for every input F, the output contains enough information to answer any elimination question involving V(F). We refer to [4] for precise definitions and statements concerning universal procedures. An alternative for improving the efficiency of symbolic elimination procedures is the introduction of symbolic polynomial equation solvers that are not universal in the previously quoted sense. This paper is devoted to exhibiting a non-universal polynomial equation solver of symbolic nature (cf. Algorithm 2 in Section 3). A classical example of a non-universal polynomial equation solver is the numerical approach to solving. Most of the polynomial system solvers in numerical analysis follow this pattern: given a system F of polynomial equations and an accuracy ε > 0, output a zero of the system F up to a distance smaller than ε. This approach to solving is followed in most studies of numerical-analysis polynomial equation solvers (cf. [10,38,40,53,54,58,59]). Observe that a numerical-analysis procedure that approximates all solutions of a given system immediately requires a running time greater than the number of actual zeros of the input system. Since the number of solutions generically equals the Bézout number, it follows that this kind of numerical-analysis procedure also behaves as a universal procedure and its running time is at least exponential in the number of variables (cf. [42] for precise statements).
Note that the output of a non-universal symbolic solver is not a complete description of the solution variety V(F). In fact, we will show an algorithm whose output contains partial information on V(F). The amount of information contained in the output is conceptually inspired by Approximate Zero Theory (introduced by Smale in [52] and developed by Shub and Smale in the series of papers [47–51]). In this approach to polynomial equation solving, the input is a system of polynomial equations F := [f_1, ..., f_n] ∈ Q[X_1, ..., X_n]^n and the output is an approximate zero of F with associated zero ζ ∈ V(F). An approximate zero of F is a point z ∈ Q[i]^n such that the sequence of iterates of the Newton operator N_F applied to z converges quadratically to the actual zero ζ ∈ V(F) ⊆ C^n. Such numerical procedures are not universal procedures in the sense of [4,25,42]. Observe that an approximate zero z ∈ Q[i]^n associated with some actual zero ζ ∈ V(F) does not contain complete information on V(F). In fact, in [5], the authors prove that the amount of information contained in an approximate zero is computationally equivalent to the amount of information contained in the residue class field Q(ζ) of ζ (the minimal field extension of Q that contains all coordinates of ζ). In other words, the digits of the approximate zero z are enough to reconstruct the whole symbolic structure of the field Q(ζ). This is why in these pages we consider this information (the residue class field Q(ζ)) as the minimal unit of symbolic information
over Q. This unit of information is also called a Q-irreducible component of V(F) (see Section 2.1 below). Thus, the symbolic non-universal polynomial equation solver we introduce outputs an amount of information equivalent to an approximate zero. Namely, we will show an algorithm that performs the following task:
Input: A system F := [f_1, ..., f_n] ∈ Z[X_1, ..., X_n]^n of multivariate polynomial equations.
Output: A symbolic encoding of some Q-irreducible component of V(F).
It should be clear from the context that computing a full symbolic description of V(F) and then applying some kind of factorisation technique to compute a Q-irreducible component of V(F) makes no sense: that is already a universal procedure. Hence, the key will be to compute a Q-irreducible component of V(F) without computing (as far as possible) full information on V(F). Here, symbolic encodings of Q-irreducible components follow the trends of Kronecker's (also called geometric) encodings of equidimensional algebraic varieties as used in the series of papers [12–15,17,21,26,33,41]. A Kronecker encoding of an equidimensional algebraic variety V ⊆ C^n is a birational isomorphism of V with some well-suited hypersurface embedded in an affine space of appropriate dimension. A more precise definition may be seen in Section 2.1 below.
1.1. Main statements
In these pages we study a particular class of homotopic deformations from a symbolic (non-Archimedean) approach. This particular class of deformations is well-suited for Generalised Pham Systems. A generalised Pham system is a system of multivariate polynomials F ∈ Q[X_1, ..., X_n]^n such that the homogeneous components of highest degree of the polynomials in F define the empty variety in the projective space of dimension n − 1 (cf. Section 3.1 for a more precise definition and basic properties of generalised Pham systems). The reader can also refer to [3,6,37,38] and the references therein. Our algorithm has two main features.
First, it is non-universal and well-behaved with respect to Generalised Pham Systems. Secondly, it is based on a homotopic deformation technique. The use of homotopic deformation techniques within a symbolic context is not new at all. Deformation techniques were used in [10,11,19] and the references therein. Linear deformation techniques underlie the algorithmic approach of [2,43,57] and the references therein. Linear homotopic deformation techniques are also implicitly considered in the sequence of papers that developed the Kronecker-like approach to solving (cf. [12–15,41]), and they were explicitly discussed in [24]. Otherwise said, there is no novelty in the use of homotopic deformations within a symbolic framework. The novelty here is the use of a non-universal homotopic deformation technique. All homotopic deformation techniques take as input a system of polynomial equations F and introduce a deformation to obtain a continuous path of systems {F_t : t ∈ [0,1]}. These algorithms compute (in different forms) a universal description of some unramified fibre that is "easy to solve". Namely, they compute all the information about all the zeros of the "easy-to-solve" system (say F_1). From this complete information
on the unramified fibre V(F_1), these standard algorithms also compute universal (i.e. complete) information about V(F_t) and then eliminate T to compute universal information on the solution variety V(F_0) = V(F) defined by the input system F. Our main algorithm below does not behave in this way. We also introduce a linear homotopic deformation that defines a solution curve V(F_a). We also search for some unramified fibre V(F_a) ∩ V(T − 1). However, the unramified fibre is not assumed to be "easy to solve"; in fact, it can also be "hard to solve", and we do not care very much about that. The reason is simple: our algorithm does not compute a complete (universal) description of the unramified fibre V(F_a) ∩ V(T − 1), because that is precisely what is not wanted here. The "easy task" in the unramified fibre V(F_a) ∩ V(T − 1) is not to solve it completely, but to find a point, which, by the way, is the point (1, a). From this information alone we lift to compute just some piece W_a of the curve V(F_a) (and not the complete curve). Then, from this subvariety W_a of V(F_a) we also compute some partial information on the solution variety V(F). The algorithm is designed this way in order to test two main aspects. First, the possibility of having a non-universal symbolic polynomial equation solver; our algorithm behaves this way. Secondly, the hope that some of these algorithms could improve the existing upper complexity bounds. In Theorems 1 and 2 we show that this goal is far from being reached by our algorithm. The restriction to Generalised Pham Systems is not a serious restriction at all. Firstly, because Generalised Pham Systems are densely and uniformly distributed among the systems of n polynomial equations in n variables. Secondly, because using different strategies one can reduce the input system to a Generalised Pham System. This is achieved, for instance, by means of the strategy introduced in [19].
We do not include this reduction in order to keep this paper as short as possible; the reader may find different strategies for this reduction elsewhere. More precisely, the non-Archimedean homotopic deformation we introduce works as follows. Let F ∈ Z[X1, …, Xn]^n be the input generalised Pham system and H ∈ N a positive integer. Let a ∈ Zn be randomly chosen such that a ≤ H and such that the Jacobian matrix defined by F at a is regular (i.e. DF(a) ∈ GL(n, Q)). We consider the following deformation of the original system:

f1(X1, …, Xn) − T f1(a) = 0
⋮
fn(X1, …, Xn) − T fn(a) = 0.     (1)

Let Wa ⊆ Cn+1 be the unique Q-irreducible component of the homotopic curve (1) that contains the point (1, a) ∈ Cn+1 (see Proposition 19 below). Then, the algorithm outputs a Kronecker's encoding of Wa ∩ V(T) ⊆ Cn, which turns out to be a non-trivial component of the solution variety V(F) (see Section 4 below). The algorithm can be summarised in the following theorem.

Theorem 1. There is a bounded error probability Turing machine M that performs the following task: The machine M takes as input
(1) A straight-line program of size L, depth ℓ and parameters in Z of bit length at most log H that evaluates a list of polynomials F := [f1, …, fn] ∈ Z[X1, …, Xn]^n such that F is a generalised Pham system.
The machine M outputs a Kronecker's encoding of some non-empty Q-definable component of V(F). The running time of the machine M is polynomial in the following quantities:

L, d, n, log H, def deg(F),

where d is the maximum of the degrees of the polynomials in F and def deg(F) is an intrinsic quantity called the deformation geometric degree of F.

Our algorithm assumes that the input system is given by its straight-line program encoding, whereas the output consists of univariate polynomials given by their dense encoding. Nevertheless, the integer coefficients of the univariate output polynomials are given by their straight-line program encoding. In Section 2.3 below, the reader will find more precise statements of these encodings. The time complexity of this algorithm depends polynomially on the input length and on the intrinsic quantity def deg(F). The quantity def deg(F) can be defined in the following terms. Let V(Fa) ⊆ Cn+1 be the equidimensional curve (1). As observed in Proposition 19, there is one and only one Q-irreducible component Wa ⊆ V(Fa) that contains the smooth point (1, a) ∈ V(Fa). This unique component Wa determines the deformation degree of the generalised Pham system F as

def deg(F) := max{deg Wa : DF(a) ∈ GL(n, Q)},

where deg(Wa) is the geometric degree of Wa in the sense of [23]. In Proposition 33 below we establish an upper bound on def deg(F) in terms of the geometric degree of some special subvariety of C2n+1. Namely, let V(FY) ⊆ C2n+1 be the algebraic variety given as the set of common zeros of the system of polynomial equations:

f1(X1, …, Xn) − T f1(Y1, …, Yn) = 0
⋮
fn(X1, …, Xn) − T fn(Y1, …, Yn) = 0.

From Proposition 33 there is only one Q-irreducible component WY of V(FY) such that the following holds:
• codim(WY) = n;
• {(1, x1, …, xn, y1, …, yn) ∈ C2n+1 : xi = yi, 1 ≤ i ≤ n} ⊆ WY.
Then, the deformation degree of F satisfies

def deg(F) ≤ min{deg WY, ∏_{i=1}^n deg(fi)}.
Observe that the upper bound given by the Bézout number (∏ deg(fi)) is not always attained:
Let d1, …, dn ∈ 2N be positive even numbers and let F be the generalised Pham system given by the following equality:

F := [X1^{d1}, …, Xn^{dn}] ∈ Z[X1, …, Xn]^n.

The set of regular points of the corresponding mapping F is the Zariski open set given by {(x1, …, xn) ∈ Cn : ∏_{i=1}^n xi ≠ 0}. Let a ∈ Zn be one such regular point. Then, the number of Q-irreducible components of V(Fa) is at least 2^{n−1}. Thus, we conclude

def deg(F) ≤ (∏_{i=1}^n di) / 2^{n−1} < ∏_{i=1}^n di.
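For concreteness, the smallest case of this example can be worked out by hand (a sketch of our own for n = 2, d1 = d2 = 2; the component count matches the bound above):

```latex
% F = [X_1^2, X_2^2], a = (a_1, a_2) with a_1 a_2 \neq 0.
% The deformed curve (1) is  X_1^2 - T a_1^2 = 0,\; X_2^2 - T a_2^2 = 0,
% so on V(F_a) we have a_2^2 X_1^2 = a_1^2 X_2^2 and the curve splits as
\[
V(F_a) = \{\, a_2 X_1 = a_1 X_2,\; T a_1^2 = X_1^2 \,\}
\;\cup\;
\{\, a_2 X_1 = -a_1 X_2,\; T a_1^2 = X_1^2 \,\},
\]
% i.e. 2 = 2^{\,n-1} components, each of degree 2, whereas the B\'ezout
% number is d_1 d_2 = 4.  The point (1, a) lies on the first component, so
% \(\operatorname{def}\deg(F) \le 2 < 4\).
```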
However, this improvement of the efficiency with respect to the standard symbolic methods has some drawbacks. In fact, in Section 5 we prove

Theorem 2. With the same notations as above, there are infinitely many points a ∈ Zn such that the previous algorithm outputs V(F). In particular, on the average, we should have:
(1) deg V(F) ≤ def deg(F).
(2) The algorithm behaves as a universal symbolic polynomial equation solver.

The reader should observe that the output of the algorithm in Theorem 1 is not a Q-definable irreducible component of the solution variety. In fact, this algorithm outputs information on some subvariety of V(F), whereas we wanted to compute information concerning irreducibility. This can also be done by means of a factoring procedure adapted to the straight-line program encoding of integers (cf. [5]). We also exhibit the following theorem:

Theorem 3. There is a bounded error probability Turing machine M that performs the following task: The input of machine M is
(1) A straight-line program of size L, depth ℓ and parameters in Z of bit length at most log H that evaluates a list of polynomials F := [f1, …, fn] ∈ Z[X1, …, Xn]^n such that F is a generalised Pham system.
The output of M is a Kronecker's encoding of the residue class field of some zero ζ ∈ V(F). The running time of M is polynomial in the following quantities:

L, d, n, log H, def deg(F), ht(ζ),

where d is the maximum of the degrees of the polynomials in F and ht(ζ) is the height of the residue class field of the point ζ ∈ Cn, whose coordinates are algebraic over Q.
2. Basic notions and notations

2.1. Kronecker's encoding

A Q-definable algebraic variety V ⊆ Cn is the set of common zeros of a finite set of polynomial equations with coefficients in the field Q. Namely, V ⊆ Cn is a Q-definable algebraic variety if there are polynomials F := [f1, …, fs] ∈ Q[X1, …, Xn]^s such that V = V(F). The class of all Q-definable algebraic varieties defines a unique Noetherian Zariski topology on Cn. This Noetherian topology has the corresponding notion of irreducible closed sets, which we call Q-definable irreducible algebraic subsets of Cn. Additionally, every Q-definable algebraic variety V ⊆ Cn has a unique minimal description as a finite union of Q-definable irreducible algebraic varieties V = V1 ∪ ⋯ ∪ Vt ⊆ Cn. These Q-definable irreducible varieties V1, …, Vt are called the Q-irreducible components of V. The C-irreducible components of V are simply called irreducible components. Observe that if W is an irreducible component of V, then there is a Q-irreducible component WQ of V such that W ⊆ WQ.

A Q-definable algebraic variety V ⊆ Cn is said to be a Q-definable complete intersection of codimension r if there are polynomials F := [f1, …, fr] ∈ Q[X1, …, Xn]^r such that V = V(F) and dim V = n − r. Observe that, by Macaulay's Unmixedness Theorem (cf. [36]), if V ⊆ Cn is a Q-definable complete intersection variety of codimension r, then all the Q-irreducible components of V also have dimension n − r.

In [33], L. Kronecker introduced a notion of description of equidimensional algebraic varieties that, for the sake of readability, we reproduce here. This notion has been extensively used in the sequence of papers [12–15,17,18,21,24,26]. Let V ⊆ Cn be an equidimensional Q-definable algebraic variety of dimension n − r. By Noether's Normalisation Lemma, there are generically many non-singular matrices γ ∈ GL(n, Q) such that the following holds: Let (Y1, …, Yn) be the new coordinates of the affine space Cn defined by γ.
Then, the following is an integral ring extension:

A := Q[Y1, …, Yn−r] ↪ Q[V] := Q[X1, …, Xn]/I(V).

We say that the variables (Y1, …, Yn) defined by γ are in Noether position with respect to the variety V. Observe that if (Y1, …, Yn) are in Noether position with respect to an equidimensional algebraic variety V ⊆ Cn and W is a Q-irreducible component of V, then the variables (Y1, …, Yn) are also in Noether position with respect to W.

Moreover, let V ⊆ Cn be a Q-definable complete intersection variety of codimension r. Let F := [f1, …, fr] ∈ Q[X1, …, Xn]^r be a system of polynomial equations defining the variety V (i.e. V(F) = V). Let (F) be the ideal in Q[X1, …, Xn] generated by {f1, …, fr} and assume that (F) is a radical ideal. Let γ ∈ GL(n, Q) be a non-singular matrix that puts the variables in Noether position with respect to the variety V. Then, the ring extension

A := Q[Y1, …, Yn−r] ↪ B := Q[X1, …, Xn]/(F)

is integral. Because of Macaulay's Unmixedness Theorem, we conclude that B is a Cohen–Macaulay ring and, from [16, Lemma 3.3.1], we also know that B is a free A-module of positive rank.
Let V ⊆ Cn be a Q-definable equidimensional algebraic variety of codimension r and let γ ∈ GL(n, Q) be a non-singular matrix that puts the variables in Noether position with respect to V. We denote by (Y1, …, Yn) the set of coordinates in Cn given by γ. Let u ∈ Q[Y1, …, Yn] be a polynomial. We define the regular mapping ϕu : Cn → Cn−r+1, depending on γ and u, as the mapping given by the following identity:

ϕu(x1, …, xn) := (y1, …, yn−r, u(x1, …, xn)).

Let ϕu|V : V → Cn−r+1 be the restriction of ϕu to the algebraic variety V. The image of ϕu|V (i.e. ϕu(V)) is a Q-definable hypersurface Hu of Cn−r+1. Let mu ∈ Z[Y1, …, Yn−r][Z] be the minimal polynomial equation of the hypersurface Hu. The polynomial mu is a square-free, primitive polynomial, monic with respect to the variable Z (up to a non-zero integer). We say that u is a primitive element with respect to the variety V if ϕu|V defines a birational isomorphism between V and Hu. In this case, there are polynomials
• ρ ∈ Z[Y1, …, Yn−r]\{0},
• v1, …, vn ∈ Z[Y1, …, Yn−r, Z],
such that the rational mapping (ϕu|V)^{−1} : Hu → V is given by the following identity:

(ϕu|V)^{−1}(y1, …, yn−r, z) := ((v1/ρ)(y1, …, yn−r, z), …, (vn/ρ)(y1, …, yn−r, z))

for every (y1, …, yn−r, z) ∈ Hu such that ρ(y1, …, yn−r) ≠ 0. The rational functions {vi/ρ : 1 ≤ i ≤ n} are called the parametrisations with respect to the Noether normalisation given by γ and the primitive element u. The non-zero polynomial ρ is called a discriminant associated to γ and u.

Definition 4. Let V ⊆ Cn be a Q-definable equidimensional algebraic variety of codimension r. A Kronecker's encoding of V is given by the following sequence of items:
(1) A non-singular matrix γ ∈ GL(n, Z) that puts the variables in Noether position with respect to the variety V.
(2) A linear form u := λ1X1 + ⋯ + λnXn ∈ Z[X1, …, Xn] which is a primitive element with respect to the Noether normalisation given by γ and with respect to the variety V.
(3) The minimal polynomial mu ∈ Z[Y1, …, Yn−r][Z] of the hypersurface Hu := ϕu(V).
(4) A non-zero discriminant ρ ∈ Z[Y1, …, Yn−r] associated to γ and u.
(5) The parametrisations {v1, …, vn} ⊆ Z[Y1, …, Yn−r][Z] associated to γ, u, V and ρ.

In [14,41], Kronecker's encoding and Kronecker's polynomial system solver were rediscovered without knowledge of their existing ancestor. In [12,15] the main difficulties in Kronecker's original approach were solved. For a Q-definable complete intersection algebraic variety V ⊆ Cn of dimension n − r, let u ∈ Q[Yn−r+1, …, Yn] be a primitive element of some Kronecker's encoding of V. Let Hu ⊆ Cn−r+1 be the hypersurface introduced above with minimal polynomial mu.
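As an illustration (a toy instance of our own, not taken from the paper), a Kronecker's encoding of the zero-dimensional variety V = V(X1^2 − 2, X2 − X1) ⊂ C2 can be written down directly:

```latex
% V = \{(\sqrt 2, \sqrt 2),\,(-\sqrt 2, -\sqrt 2)\}, codimension r = n = 2,
% so there are no free variables Y_1,\dots,Y_{n-r} and the Noether matrix
% can be taken to be the identity.
% Primitive element: u := X_1 (it separates the two points of V).
% Minimal polynomial of H_u = u(V) \subset \mathbb{C}:
\[ m_u(Z) = Z^2 - 2. \]
% Discriminant and parametrisations (both coordinates of a point of V are
% recovered from the value z of the primitive element):
\[ \rho = 1, \qquad v_1(Z) = Z, \qquad v_2(Z) = Z, \]
% i.e. each point of V is (v_1(z)/\rho,\, v_2(z)/\rho) = (z, z) for a root
% z of m_u; since m_u is irreducible over \mathbb{Q}, V is Q-irreducible.
```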
Then, the Q-irreducible components of V are in one-to-one correspondence with those of Hu and, hence, in one-to-one correspondence with the irreducible factors of mu.

2.2. Geometric degree

For the sake of completeness, we summarise some basic facts concerning the geometric degree as introduced in [23] (cf. also [9,60] for alternative notions). Let V ⊆ Cn be a zero-dimensional variety; the geometric degree of V is the number of points in V. If V ⊆ Cn is an equidimensional algebraic variety, the geometric degree of V is the maximum of the degrees of the intersections of V with affine linear varieties H of dimension dim H = codim V such that V ∩ H is zero-dimensional. In the general case, when V ⊆ Cn is not equidimensional, let V = ⋃j Cj be an equidimensional decomposition of the variety V; we define the (geometric) degree of V as deg V := ∑j deg Cj.

A key result due to [23] is the Bézout Inequality: given two algebraic varieties V, V′ ⊆ Cn, then deg(V ∩ V′) ≤ deg V · deg V′. For instance, given a system F := [f1, …, fr] ∈ Q[X1, …, Xn]^r of polynomial equations defining a complete intersection variety V(F) ⊆ Cn, we have deg V(F) ≤ ∏_{i=1}^r deg fi, and this quantity ∏_{i=1}^r deg fi is called the Bézout number of the system F. This last inequality is not always an equality; however, it is generically (i.e. up to a zero measure set of the space of polynomial equations of given degree) an equality. A consequence of Bézout's Inequality above is the following proposition.

Proposition 5 (Sabia and Solernó [44]). Let V ⊆ Cn be a Q-definable equidimensional algebraic variety of codimension r. Assume that the variables are in Noether position with respect to V. Let mu be the minimal polynomial of the complex hypersurface Hu ⊆ Cn−r+1. Then, deg mu ≤ deg V. Moreover, the total degree of the discriminant ρ and the total degrees of the parametrisations v1, …, vn are also bounded by a quantity that depends polynomially on deg V.

2.3.
Straight-line programs

Our basic data structure to handle integer numbers and polynomials is the straight-line program. In this section, we state its definition and the model used to codify Kronecker's encodings of algebraic varieties. For a more detailed treatment of straight-line programs as data structures, see [31,41,56] and the references therein.

Definition 6. A division-free non-scalar straight-line program with inputs X1, …, Xn is a pair Γ := (G, Q), where G is a directed acyclic graph with n + 1 input gates, and Q is a function that assigns to every gate (i, j) one of the following instructions:

i = 0:  Q_{0,1} := 1, Q_{0,2} := X1, …, Q_{0,n+1} := Xn;

i ≥ 1:  Q_{i,j} := ( ∑_{r ≤ i−1, 1 ≤ s ≤ L_r} A^{rs}_{i,j} Q_{r,s} ) · ( ∑_{r′ ≤ i−1, 1 ≤ s′ ≤ L_{r′}} B^{r′s′}_{i,j} Q_{r′,s′} ),
where 0 ≤ i ≤ ℓ and the A^{rs}_{i,j}, B^{r′s′}_{i,j} are indeterminates over Z called the parameters of Γ. The size of the straight-line program Γ is L(Γ) = L0 + ⋯ + Lℓ (where L0 := n + 1), and its depth is ℓ(Γ) = ℓ.
We identify A = (A^{rs}_{i,j}) and B = (B^{rs}_{i,j}). Semantically speaking, the straight-line program Γ defines an evaluation algorithm for the polynomials

Q_{i,j} = ∑_{|μ| ≤ 2^i} Q^{μ}_{i,j}(A, B) X1^{μ1} ⋯ Xn^{μn},

where each coefficient Q^{μ}_{i,j}(A, B) is a polynomial in Z[A, B]. A finite set of polynomials f1, …, fr ∈ Z[X1, …, Xn] is said to be evaluated by a straight-line program Γ with parameters in a set F ⊂ Z if, specialising the coordinates of the parameters A and B in Γ to values in F, there exist gates (i1, j1), …, (ir, jr) of Γ such that fk = Q_{ik,jk}(a, b, X1, …, Xn) holds for every k, 1 ≤ k ≤ r. Specialising in the indicated way the parameters of Γ to values of F, we obtain a copy of the directed acyclic graph G underlying the straight-line program Γ and of its instruction assignment Q. We call this copy a straight-line program in Z[X1, …, Xn] with parameters in F. The gates of Γ correspond to polynomials belonging to Z[X1, …, Xn]. In this way f1, …, fr are represented, computed or evaluated by Γ. We say that f ∈ Z[X1, …, Xn] is computable (or evaluated) by a straight-line program with parameters of height h if the specialisation of A and B is done with integer numbers of height at most h. Finally, we can also encode an integer number by a straight-line program: an integer number ν ∈ Z is said to be computed by a straight-line program if it can be computed by a straight-line program when ν is considered as an element of Z[X].

2.3.1. Straight-line program encoding for varieties

Here we discuss how our Turing machines work with Kronecker's encodings of algebraic varieties. Let V := V(F) ⊆ Cn be a complete intersection algebraic variety of codimension r, where F := [f1, …, fr]. Then, a Kronecker's encoding of V is the list of items [γ, u, mu, ρ, v1, …, vn] satisfying the properties described in Definition 4 above. A mixed dense/straight-line program data structure of a Kronecker's encoding of V is a straight-line program Γ such that:
(0) Γ evaluates {f1, …, fr}.
(I) Γ evaluates the integral entries of γ ∈ GL(n, Q).
(II) Γ evaluates u := λ1X1 + ⋯ + λnXn ∈ Z[X1, …, Xn].
(III) Γ evaluates mu ∈ Z[Y1, …, Yn−r][Z]. This polynomial mu is encoded as the list of its coefficients with respect to the variable Z. The coefficients in Z[Y1, …, Yn−r] are polynomials evaluated by Γ at labelled nodes.
(IV) Γ evaluates ρ ∈ Z[Y1, …, Yn−r].
(V) Γ evaluates {v1, …, vn} ⊆ Z[Y1, …, Yn−r][Z]. Again, the vi are encoded as the lists of their coefficients in Z[Y1, …, Yn−r], and Γ evaluates these coefficients at labelled nodes.
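Definition 6 can be sketched in code. The following is a minimal interpreter for division-free non-scalar straight-line programs (the gate layout and the example polynomial are our own illustration, not from the paper): every gate at level i ≥ 1 multiplies two integer linear combinations of all previously computed gates.

```python
# Minimal evaluator for a division-free non-scalar straight-line program
# (Definition 6): level 0 holds the inputs 1, X1, ..., Xn; every later gate
# is a product of two integer linear combinations of earlier gates.
def eval_slp(levels, inputs):
    gates = [1] + list(inputs)            # Q_{0,1} = 1, Q_{0,s+1} = X_s
    for level in levels:                  # levels 1 .. depth
        new = []
        for a_coefs, b_coefs in level:    # gate = (sum A*Q) * (sum B*Q)
            left = sum(c * q for c, q in zip(a_coefs, gates))
            right = sum(c * q for c, q in zip(b_coefs, gates))
            new.append(left * right)
        gates.extend(new)
    return gates

# Example: evaluate ((X1 + X2)^2 - 1) * X1 at (X1, X2) = (2, 3), with
# size L = L0 + L1 + L2 = 3 + 1 + 1 = 5 and depth 2.
levels = [
    [([0, 1, 1], [0, 1, 1])],             # Q_{1,1} = (X1 + X2) * (X1 + X2)
    [([-1, 0, 0, 1], [0, 1, 0, 0])],      # Q_{2,1} = (Q_{1,1} - 1) * X1
]
result = eval_slp(levels, (2, 3))[-1]     # (25 - 1) * 2 = 48
```

The same program evaluated with indeterminate (e.g. symbolic) inputs would produce the polynomial itself, which is the sense in which a straight-line program encodes a polynomial.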
2.4. Some preliminary subalgorithms to be used in the sequel

2.4.1. Elimination step

The following statement is a consequence of the technical tools used in the series of papers [12–15,17,18,21,24,26,41].

Theorem 7. There is a bounded error probability Turing machine M1 that performs the following task:
• The input of machine M1 is given by the following list of items:
◦ A Kronecker's encoding of a Q-definable algebraic variety V.
◦ A polynomial g ∈ Z[X1, …, Xn] such that g is not a zero divisor in the residue ring Q[V] and such that V ∩ V(g) ≠ ∅.
• The output of machine M1 is a Kronecker's encoding of the Q-definable equidimensional algebraic variety V ∩ V(g).
The input of machine M1 is represented in the following form:
(1) A straight-line program Γ1 that codifies a mixed dense/straight-line program representation of a Kronecker's encoding of V.
(2) The additional polynomial g is given by a non-scalar straight-line program Γ2 that evaluates g.
The running time of M1 is at most polynomial in the quantities deg(V), L, n, d, where L is the maximum of the sizes of Γ1 and Γ2, and d is the degree of g. The output of M1 (i.e. the Kronecker's encoding of V ∩ V(g)) is also given by a mixed dense/straight-line program representation of the corresponding Kronecker's encoding.

2.4.2. Non-Archimedean approximants

Let b ∈ Z be a fixed integer number and K a field of characteristic 0. In this section we propose an algorithm to solve the following problem: "Given a non-Archimedean approximant of an integral formal power series θ ∈ K[[T − b]], compute its minimal polynomial in K[T, Z]." This is just one classical problem in a series concerning non-Archimedean approximants and minimal polynomials. In [35] the authors introduced algorithms of this kind for p-adic approximants. Diophantine approximants were considered in [30]. A treatment close to ours is that of [19].
The new outcome here is not the concept of the procedure but the fact that it is well-suited to mixed dense/straight-line program data structures, with precise estimates on its complexity.

Definition 8. Let K be a field of characteristic zero. A formal power series θ ∈ K[[T − b]] is an integral formal power series if there exists a non-zero polynomial q(T, Z) ∈ K[T, Z] such that the following properties hold:
• q(T, Z) is irreducible in K[T, Z].
• q(T, θ) = 0.
• deg q = degZ q.
Such a polynomial q is unique (up to a constant in K), and it is called the minimal polynomial of θ. If d = deg q, we say that θ has degree d.

The regular local ring K[[T − b]] has a natural non-Archimedean absolute value given by its discrete valuation (cf. [61] for instance). Let | · | : K[[T − b]]\{0} → R+ be the non-Archimedean absolute value associated to the (T − b)-adic filtration in the local ring K[[T − b]]. For every formal power series θ = ∑k ak (T − b)^k ∈ K[[T − b]] and every positive integer d ∈ N, we define the truncated Taylor series expansion of θ up to degree d as the univariate polynomial θd := ∑_{k=0}^{d−1} ak (T − b)^k. For every polynomial q(T, Z) ∈ K[T, Z] and every positive integer d ∈ N, we have |q(T, θ) − q(T, θd)| ≤ 1/2^d, and the following equivalence also holds:

|q(T, θ)| ≤ 1/2^d ⇔ |q(T, θd)| ≤ 1/2^d.     (2)
Definition 9. Let θ ∈ K[[T − b]] be a formal power series and let m, k ∈ N be two positive integer numbers. Let K[T, Z]m be the K-vector space of all polynomials in K[T, Z] of (total) degree at most m. We define the subset Lm,k(θ) ⊆ K[T, Z]m by the following identity:

Lm,k(θ) := { g ∈ K[T, Z]m : |g(T, θ)| ≤ 1/2^k }.
Observe that Lm,k(θ) is a K-vector space of finite dimension. From Equivalence (2) above, we conclude the following chain of set equalities:

Lm,k(θ) = { g ∈ K[T, Z]m : |g(T, θk)| ≤ 1/2^k }
        = { (aij) ∈ K^{(m+2 choose 2)} : ∑_{i+j ≤ m} aij T^i (θk)^j ∈ (T − b)^k }.     (3)
Proposition 10. With the same notations as above, let θ be an integral formal power series of degree d with coefficients in the field K. Let m, k ∈ N be two positive integers. If m ≥ d and k ≥ m^2 + 1, then, for every g ∈ Lm,k(θ), g(T, θ) = 0.

Proof. Let q(T, Z) ∈ K[T, Z] be the minimal polynomial of θ. This polynomial is an irreducible polynomial, monic up to a constant, that defines a plane algebraic curve V(q) ⊆ K̄^2, where K̄ is the algebraic closure of K. Additionally, the ring extension A := K[T] ↪ B := K[T, Z]/(q) is integral and, from [16, Lemma 3.3.1], B is a free A-module. Now, assume that g ∈ Lm,k(θ) is a non-zero polynomial. Let ηg : B → B be the homothesy given by ηg(h̄) := g·h̄ ∈ B for every h̄ ∈ B, where ¯· denotes residue classes modulo the ideal (q).
Let G(T, U) ∈ K[T][U] be the minimal polynomial of ηg. This polynomial is monic with respect to the variable U (up to a constant in K), and its total degree is at most deg(g)·deg(q) ≤ m^2 (cf. [23]). As G(T, g) ∈ (q), the polynomial G(T, g(T, Z)) ∈ K[T, Z] vanishes on the curve V(q). Now we proceed by extending scalars, tensoring with K[[T − b]]. Namely, as B is a free A-module, the following is also an integral ring extension:

A ⊗A K[[T − b]] = K[[T − b]] ↪ B′ := K[[T − b]] ⊗A B,

and B′ is the completion of B. In fact, we have B′ = K[[T − b]][Z]/(q)^e. As G(T, g) ∈ (q) in B, we also have G(T, g) ∈ (q)^e in B′. As q(T, θ) = 0, then G(T, g(T, θ)) = 0 too. Finally, observe that g(T, θ) ∈ K[[T − b]] is an integral formal power series and we have just shown that the minimal polynomial with coefficients in K[T] satisfied by g(T, θ) has degree at most m^2. Let us denote by Φ(T, R) ∈ K[T][R] the minimal polynomial of g(T, θ) over K[T]. We write it in the following form:

Φ(T, R) = a0(T) + a1(T)R + ⋯ .

If we evaluate this last expression at R = g(T, θ), we get

0 = Φ(T, g(T, θ)) = a0(T) + a1(T)g(T, θ) + ⋯ .     (4)
Since g(T, Z) belongs to Lm,k(θ), it verifies g(T, θ) ∈ (T − b)^k, so by hypothesis it also holds that g(T, θ) ∈ (T − b)^{m^2+1}, and we conclude from Eq. (4) that a0(T) ∈ (T − b)^{m^2+1}. As a0 ∈ K[T] and deg(a0) ≤ deg Φ ≤ m^2, we conclude that a0(T) ≡ 0 in K[T]. Therefore, Φ(T, R) = R·A(T, R) ∈ K[T, R], where A(T, R) is a polynomial, monic with respect to the variable R, of total degree at most deg Φ − 1. As Φ is the minimal polynomial of g(T, θ) over K[T], we conclude that A(T, g(T, θ)) ≠ 0. Hence, as K[[T − b]] is an integral domain, the proof is finished since

Φ(T, g(T, θ)) = 0 ∧ A(T, g(T, θ)) ≠ 0 ⟹ g(T, θ) = 0 in K[[T − b]].
Remark 11. Let {U1, …, Un} be new variables and let K := Q(U1, …, Un) be a transcendental extension of Q. Let θ ∈ K[[T − b]] be an integral formal power series and let q(T, Z) ∈ K[T, Z] be its minimal polynomial over K[T]. Then, q(T, Z) is an irreducible polynomial characterised by the following property: "Assume deg(q) = d and let m, k ∈ N be two positive integers such that m ≥ d and k ≥ m^2 + 1. Then, q(T, Z) is the lowest-degree monic (up to a constant in K) polynomial in Lm,k(θ)."

Now we are in a position to state the basic algorithm of this section.

Theorem 12. Let K := Q(U1, …, Un) be a transcendental extension of Q as in Remark 11. Then, there is a universal constant c > 0 such that the following holds:
There is a bounded error probability Turing machine M2 that performs the following task:
• The input of machine M2 is a straight-line program Γ of size L, depth ℓ and parameters in a finite set F ⊆ Z. The straight-line program Γ evaluates the coefficients in K of some polynomial g ∈ K[T, Z] such that g is the Taylor expansion up to order D^2 + 1 of an integral power series θ ∈ K[[T − b]] of degree D. Moreover, assume that deg(g) ≤ D^2 + 1.
• The output of machine M2 is a straight-line program Γ1 of size L1, depth ℓ1 and parameters in the finite set F1 := F ∪ {x ∈ Z : |x| ≤ (nD)^c}. This straight-line program Γ1 evaluates the minimal polynomial of θ over K[T, Z].
The running time of M2 is at most polynomial in the quantities D, L, n. The total size L1 of the output straight-line program Γ1 is at most the running time of M2 and, hence, polynomial in the quantities D, L, n.

Proof. From Equality (3), given m, k ∈ N and given θk = g, we can always compute a basis of the K-vector space Lm,k(θ) using the linear algebra methods adapted to straight-line program encodings as in [31] (which are based on [1] or [8,39]). These linear algebra methods adapted to straight-line program encodings involve random methods based either on Zippel–Schwartz tests (cf. [45] or [62]) or on correct-test sequences (cf. [27] or [31]). The running time of these procedures is polynomial in the desired quantities. Once a basis of Lm,k(θ) has been computed, we can easily find the desired lowest-degree monic (up to a constant in K) polynomial q(T, Z) ∈ Lm,k(θ).

Remark 13. Observe that if either m^2 < D or k < m^2 + 1, the same algorithm computes either the minimal polynomial of some different integral formal power series θ′ of degree lower than that of θ, or it outputs that Lm,k(θ) is the null vector space. In either case we can proceed to the output for further discussion.
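The linear-algebra step in this proof can be illustrated numerically (a toy instance of our own, with dense rational arithmetic in place of the straight-line program and probabilistic machinery of the paper): for θ = √(1 + T) ∈ Q[[T]], of degree D = 2, take m = 2 and k = m² + 1 = 5 and compute the kernel described in Equality (3); by Proposition 10 its single generator is proportional to the minimal polynomial Z² − T − 1.

```python
from fractions import Fraction as Fr

M, K = 2, 5                      # degree bound m and truncation order k = m^2 + 1

def mul(a, b):                   # product of truncated series mod T^K
    c = [Fr(0)] * K
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < K:
                c[i + j] += ai * bj
    return c

# theta = sqrt(1 + T): Taylor coefficients binom(1/2, n)
theta = [Fr(1)]
for n in range(1, K):
    theta.append(theta[-1] * (Fr(1, 2) - (n - 1)) / n)

# powers theta^0, theta^1, theta^2 mod T^K
powers = [[Fr(1)] + [Fr(0)] * (K - 1)]
for _ in range(M):
    powers.append(mul(powers[-1], theta))

# Columns of the linear system of Equality (3): each monomial T^i Z^j with
# i + j <= m is sent to the first K Taylor coefficients of T^i * theta_k^j.
monos = [(i, j) for i in range(M + 1) for j in range(M + 1 - i)]
cols = []
for i, j in monos:
    v = [Fr(0)] * K
    for t in range(K - i):
        v[t + i] = powers[j][t]
    cols.append(v)

# Gauss-Jordan elimination over Q, then read off a basis of the kernel.
rows = [[col[r] for col in cols] for r in range(K)]
pivots, r = [], 0
for c in range(len(monos)):
    p = next((i for i in range(r, K) if rows[i][c] != 0), None)
    if p is None:
        continue
    rows[r], rows[p] = rows[p], rows[r]
    rows[r] = [x / rows[r][c] for x in rows[r]]
    for i in range(K):
        if i != r and rows[i][c] != 0:
            f = rows[i][c]
            rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
    pivots.append(c)
    r += 1

kernel = []
for free in (c for c in range(len(monos)) if c not in pivots):
    v = [Fr(0)] * len(monos)
    v[free] = Fr(1)
    for pr, pc in enumerate(pivots):
        v[pc] = -rows[pr][free]
    kernel.append(v)

# Normalising the Z^2 coefficient to 1 recovers the minimal polynomial
# Z^2 - T - 1 of theta, as the unique generator of the kernel.
q = [x / kernel[0][monos.index((0, 2))] for x in kernel[0]]
```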
3. Generalised Pham systems

In this section, we briefly discuss some basic facts concerning generalised Pham systems. The reader may find additional information on Pham systems in [3,6] or [37,38] and the references therein.

3.1. Basic notions and notations

In the sequel, K will denote a field of characteristic zero and K̄ its algebraic closure.

Definition 14. A Pham system of codimension r (r ≤ n) is a finite subset of polynomials F := [f1, …, fr] ∈ K[X1, …, Xn]^r such that for every i, 1 ≤ i ≤ r, there are a polynomial gi ∈ K[X1, …, Xn] and a natural number di ∈ N\{0} such that fi = Xi^{di} + gi and deg gi < di.
For every Pham system F ∈ K[X1, …, Xn]^r of codimension r, we denote by (F) the ideal in K[X1, …, Xn] generated by the elements of F. The next lemma follows from a classical and elementary argument.

Lemma 15. Let F := [f1, …, fr] ∈ K[X1, …, Xn]^r be a Pham system of codimension r, and let B be the ring B := K[X1, …, Xn]/(F). Then, the extension K[Xr+1, …, Xn] ↪ B is an integral ring extension. In particular, V(F) ⊆ K̄^n is an algebraic variety of pure codimension r and B is a free (Cohen–Macaulay) K[Xr+1, …, Xn]-module.

Let X0 be a new variable. For every polynomial f ∈ K[X1, …, Xn], let f^h ∈ K[X0, X1, …, Xn] be the homogenisation of f with respect to the new variable X0. Let Pn(K̄) be the n-dimensional projective space over K̄ and let H∞ := {X0 = 0} ⊆ Pn(K̄) be the hyperplane of points at infinity in Pn(K̄) with respect to the new variable X0. For every list of polynomials F := [f1, …, fs] ∈ K[X1, …, Xn]^s, let us denote by V(F^h) the projective variety of the common zeros of [f1^h, …, fs^h] in Pn(K̄).

Definition 16. A generalised Pham system is a finite subset of polynomials F := [f1, …, fn] ∈ K[X1, …, Xn]^n such that the projective variety V(F^h) ⊆ Pn(K̄) is a zero-dimensional projective variety without points at infinity (i.e. V(F^h) ∩ H∞ = ∅). In other words, a system F := [f1, …, fn] ∈ K[X1, …, Xn]^n is a generalised Pham system if and only if for every i, 1 ≤ i ≤ n, there are polynomials Ai, gi ∈ K[X1, …, Xn] such that fi = Ai + gi and the following properties hold:
• For every i, 1 ≤ i ≤ n, Ai ∈ K[X1, …, Xn] is a homogeneous polynomial of degree deg fi.
• For every i, 1 ≤ i ≤ n, gi is a polynomial of degree at most deg fi − 1.
• The projective algebraic variety V(C) ⊆ Pn−1(K̄) is empty, where C := [A1, …, An] is the list of leading homogeneous terms of F.
For every generalised Pham system F ∈ K[X1, …, Xn]^n, we also denote by (F) the ideal in K[X1, …, Xn] generated by the elements of F.

Proposition 17. Let F := [f1, …, fn] ∈ K[X1, …, Xn]^n be a generalised Pham system. Then, V(F) ⊆ K̄^n is a non-empty zero-dimensional algebraic variety. Moreover, the Jacobian determinant det(DF) = det(∂fi/∂Xj) ∈ K[X1, …, Xn] is a non-zero polynomial.

The following elementary lemma follows from the upper degree bounds in the Hilbert Nullstellensatz. The reader may find some of them in [13,31,32,44] and the references therein.

Lemma 18. Let F := [f1, …, fn] ∈ K[X1, …, Xn]^n be a generalised Pham system. Then, the ideal (F) contains a Pham system of codimension n.

Proof of Proposition 17. By the previous lemma, the ideal (F) contains a Pham system of codimension n. Hence, V(F) is either empty or a zero-dimensional affine algebraic variety.
Let V(F^h) ⊆ Pn(K̄) be the projective algebraic variety associated to the system F. Since V(F^h) is defined as the set of common zeros of n homogeneous polynomials in n + 1 variables, V(F^h) ≠ ∅ (see for instance [46]). Moreover, as F is a generalised Pham system, V(F^h) is a zero-dimensional projective variety with V(F^h) ∩ H∞ = ∅, and that implies V(F) ≠ ∅. Thus, V(F) ⊆ K̄^n is a non-empty zero-dimensional algebraic variety.

As for the second claim, let F : K̄^n → K̄^n be the polynomial mapping given by the identity F(x) := (f1(x), …, fn(x)) for all x ∈ K̄^n. First of all, we observe that F is surjective. In order to prove this claim, let λ := (λ1, …, λn) ∈ K̄^n be a point. Then, the fibre F^{−1}(λ) is the set of common zeros of the generalised Pham system given by the sequence of polynomials [f1 − λ1, …, fn − λn]. Thus, F^{−1}(λ) is a non-empty zero-dimensional variety and F is a surjective mapping. From the Second Bertini Theorem (cf. [46, p. 141, Theorem 2]) there is a zero measure subset U ⊆ K̄^n such that for every x ∈ F^{−1}(K̄^n \ U) the tangent mapping DF(x) : TxK̄^n → TF(x)K̄^n is surjective. In particular, DF(x) is a non-singular matrix and det(DF) ∈ K[X1, …, Xn] is a non-zero polynomial.

3.2. Deforming a generalised Pham system

In the sequel we assume that all the polynomials in a generalised Pham system have degree at least 2. Let F := [f1, …, fn] ∈ K[X1, …, Xn]^n be a generalised Pham system and let a ∈ K^n be a regular point of the mapping F : K̄^n → K̄^n (namely, a ∈ K^n such that the Jacobian matrix DF(a) is non-singular). We define the deformation of F at a as the system of polynomial equations:

Fa := [f1 − Tf1(a), …, fn − Tfn(a)] ∈ K[T, X1, …, Xn]^n.

In a numerical analysis context, this deformation is called a "Newton homotopy" or "global homotopy". This deformation is a particular case of the linear deformation (1 − T)F + TG, where G ∈ K[X1, …, Xn]^n.
In our particular case, G := F − F(a). Let V(Fa) ⊆ K^{n+1} be the K-definable algebraic variety given by

V(Fa) := {(t, x) ∈ K^{n+1} : fi(x) − t fi(a) = 0, 1 ≤ i ≤ n}.
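To make the deformation concrete, here is a minimal self-contained sketch; the quadratic system, the point a and all names in it are invented for illustration and do not come from the paper:

```python
# Toy illustration of the deformation F_a (the "Newton homotopy") for a
# generalised Pham system.  The system and the point below are invented.

def F(x1, x2):
    # leading forms x1**2 and x2**2 have no common projective zero,
    # so this is a (generalised) Pham system
    return (x1**2 + x2 - 3, x2**2 + x1 - 5)

def DF_det(x1, x2):
    # determinant of the Jacobian [[2*x1, 1], [1, 2*x2]]
    return 4 * x1 * x2 - 1

a = (1, 1)
assert DF_det(*a) != 0          # DF(a) is non-singular: a is regular

def Fa(t, x1, x2):
    # the deformation F_a = [f_i - T*f_i(a)]
    fa = F(*a)
    return tuple(fi - t * fia for fi, fia in zip(F(x1, x2), fa))

# by construction, (t, x) = (1, a) lies on the curve V(F_a),
# and at t = 0 the deformation specialises back to F itself
assert Fa(1, *a) == (0, 0)
assert Fa(0, 2, 1) == F(2, 1)
```

At T = 1 the deformed system has the known regular zero a, while at T = 0 it coincides with F itself, which is exactly why following the curve V(Fa) from T = 1 down to T = 0 solves the original system.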
Finally, let (Fa) ⊆ K[T, X1,…,Xn] be the ideal generated by the set of polynomials {fi(X1,…,Xn) − T fi(a) : 1 ≤ i ≤ n}.

Proposition 19. Let F be a generalised Pham system with coefficients in K and let a ∈ K^n be a regular point of F : K^n → K^n (i.e. DF(a) ∈ GL(n, K)). With the same notations as above, the following properties hold:
(1) The ideal (Fa) contains a Pham system of codimension 1.
(2) The following is an integral ring extension:

K[T] ↪ B := K[T, X1,…,Xn]/(Fa)

and B is a free K[T]-module of positive rank.
(3) The variety V(Fa) is an equidimensional curve (i.e. V(Fa) has no isolated component of dimension 0).
(4) The point (1, a) ∈ V(Fa) is a smooth point of V(Fa) and there is one and only one K-irreducible component Wa of V(Fa) such that (1, a) ∈ Wa.

Proof. Assume that F := [f1,…,fn] ∈ K[X1,…,Xn]^n. According to the notations of Definition 16, for every j, 1 ≤ j ≤ n, fj = Aj + gj, where Aj ∈ K[X1,…,Xn] is a homogeneous polynomial of degree deg(fj) and gj ∈ K[X1,…,Xn] is a polynomial of degree at most deg(fj) − 1. As the projective algebraic variety V_P(A1,…,An) is empty, there is some constant D = D(deg f1,…,deg fn) such that for every i, 1 ≤ i ≤ n, there are homogeneous polynomials hij ∈ K[X1,…,Xn], 1 ≤ j ≤ n, of degree D − deg(fj), such that the following equality holds:

Xi^D = Σ_{j=1}^n hij Aj.

Hence the following equality also holds:

Xi^D − Σ_{j=1}^n hij (fj − T fj(a)) = − Σ_{j=1}^n hij (gj − T fj(a)).
For every i, 1 ≤ i ≤ n, let Gi(T, X1,…,Xn) ∈ K[T, X1,…,Xn] be the polynomial given by the following identity:

Gi(T, X1,…,Xn) := Σ_{j=1}^n T hij fj(a) − Σ_{j=1}^n hij gj.
Observe that Xi^D − Gi(T, X1,…,Xn) = Σ_{j=1}^n hij (fj − T fj(a)) ∈ (Fa). As deg(fi) ≥ 2 for every i, 1 ≤ i ≤ n, we conclude that deg(Gi) ≤ D − 1 for every i, 1 ≤ i ≤ n. In particular, the system G := [X1^D − G1,…,Xn^D − Gn] ∈ K[T, X1,…,Xn]^n is a Pham system of codimension n. Moreover, (G) ⊆ (Fa). As (1, a) ∈ V(Fa), we conclude that V(Fa) is either a curve in K^{n+1} or a zero-dimensional algebraic variety. Moreover, from Lemma 15, the ring extension K[T] → K[T, X1,…,Xn]/(G) is integral, where (G) is the ideal generated by the elements of G.

We claim that (Fa) ∩ K[T] = (0). In order to prove this claim, let h(T) ∈ K[T] be a polynomial in the ideal (Fa). Then, for every i, 1 ≤ i ≤ n, there are polynomials hi(T, X1,…,Xn) ∈ K[T, X1,…,Xn] such that the following holds:

h(T) = Σ_{i=1}^n hi(T, X1,…,Xn)(fi(X1,…,Xn) − T fi(a)).

Hence, if h(T) were a non-zero polynomial, there would exist t0 ∈ Q such that h(t0) ≠ 0. Thus it would follow that

0 ≠ h(t0) = Σ_{i=1}^n hi(t0, X1,…,Xn)(fi(X1,…,Xn) − t0 fi(a)).   (5)
On the other hand, let Fa,t0 ⊆ K[X1,…,Xn] be the system of polynomials given by the following equality:

Fa,t0 := [f1(X1,…,Xn) − t0 f1(a),…,fn(X1,…,Xn) − t0 fn(a)] ∈ K[X1,…,Xn]^n.

Observe that Fa,t0 is a generalised Pham system in K[X1,…,Xn]. Hence Proposition 17 implies that V(Fa,t0) ≠ ∅, in contradiction with Eq. (5) above. Thus, (Fa) ∩ K[T] = (0) and we have the following commutative diagram of ring extensions:

K[T] ↪ B2 := K[T, X1,…,Xn]/(G)
            ↓ π
K[T] ↪ B1 := K[T, X1,…,Xn]/(Fa),

where π : B2 → B1 is the canonical projection. In particular, the ring extension K[T] ↪ B1 is an integral ring extension, and (Fa) is a complete intersection ideal of codimension 1. Now, from [16, Lemma 3.3.1] we conclude that B1 is a free K[T]-module of positive rank. From Macaulay's Unmixedness Theorem (cf. [36, Proposition 16f] for instance), we know that the ideal (Fa) has no embedded associated primes. In particular, all associated primes over (Fa) have codimension 1 and V(Fa) is an equidimensional curve.

Let ma ⊆ K[T, X1,…,Xn] be the maximal ideal associated to the point (1, a), namely ma := (T − 1, X1 − a1,…,Xn − an), where a = (a1,…,an) ∈ K^n. Consider the localisation K[T, X1,…,Xn]_ma of K[T, X1,…,Xn] at ma. From the Jacobian Criterion (cf. [22]), the set Fa is part of a regular system of parameters that generate the maximal ideal of K[T, X1,…,Xn]_ma. As the ideal (Fa) is a complete intersection ideal of codimension 1, we conclude that (B1)_ma := K[T, X1,…,Xn]_ma/(Fa)_ma is a regular local ring of dimension 1, and the ideal (Fa)_ma is a prime ideal in K[T, X1,…,Xn]_ma. Then, we conclude that there is a unique K-irreducible component Wa of V(Fa) such that (1, a) ∈ Wa. Moreover, I(V(Fa))_ma = I(Wa)_ma = (Fa)_ma. Hence, the local ring of V(Fa) at (1, a) equals (B1)_ma, a regular local ring of dimension 1, and (1, a) ∈ V(Fa) is a smooth point of V(Fa).
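The smoothness asserted in the fourth claim can be checked by hand on a toy instance: it amounts to the Jacobian of Fa with respect to the n + 1 variables (T, X1,…,Xn) having full rank n at (1, a). A minimal sketch, with an invented system and point:

```python
# Smoothness of (1, a) on V(F_a), checked on an invented toy instance:
# f1 = x1**2 + x2 - 3,  f2 = x2**2 + x1 - 5,  a = (1, 1).

f1_a, f2_a = -1, -3                     # the values f_i(a)

def jac_Fa(t, x1, x2):
    # rows are the gradients of f_i - T*f_i(a) w.r.t. (T, x1, x2)
    return [[-f1_a, 2 * x1, 1],
            [-f2_a, 1, 2 * x2]]

J = jac_Fa(1, 1, 1)
# rank 2 already follows from the (x1, x2)-minor, which is det DF(a):
minor = J[0][1] * J[1][2] - J[0][2] * J[1][1]
assert minor != 0                       # so (1, a) is a smooth point
```

The non-vanishing (x1, x2)-minor is exactly the regularity hypothesis DF(a) ∈ GL(n, K), which is why regularity of a immediately yields smoothness of (1, a) on the curve.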
Corollary 20. With the same notations as in Proposition 19 above, let K = Q and let Wa be the unique Q-irreducible component of V(Fa) that contains (1, a). Then, there is at least one Q-irreducible component W of V(F) such that {0} × W ⊆ Wa ∩ V(T), where V(T) := {(0, x) ∈ C^{n+1} : x ∈ C^n}.

Proof. From the second claim of Proposition 19 above, we have the integral ring extension Q[T] ↪ B := Q[T, X1,…,Xn]/(Fa). Then, as I(Wa) is a minimal prime ideal
over (Fa), the following is also an integral ring extension:

Q[T] ↪ Q[Wa] = Q[T, X1,…,Xn]/I(Wa).

From the Krull–Cohen–Seidenberg Theorems, we conclude that Wa ∩ V(T) is a non-empty zero-dimensional algebraic variety. Hence, as Wa ∩ V(T) ⊆ {0} × V(F) and Wa ∩ V(T) ≠ ∅, the claim follows.

Observe that in the previous Corollary we have shown that T is not a zero divisor in Q[Wa] = Q[T, X1,…,Xn]/I(Wa). Hence, the algorithm cited in Theorem 7 can be applied to perform the following task:
• Take as input a Kronecker's encoding of the curve V(Fa).
• Output a Kronecker's encoding of some Q-definable component of V(F).

Corollary 21. With the same notations and assumptions as in Proposition 19 above, let Wa ⊆ C^{n+1} be the unique Q-irreducible component of V(Fa) that contains (1, a) ∈ C^{n+1}. Let Q[Wa]_ma be the localisation of Q[Wa] at the maximal ideal ma := (T − 1, X1 − a1,…,Xn − an), where a = (a1,…,an) ∈ Q^n. Then, Q[T]_(T−1) ↪ Q[Wa]_ma is an integral ring extension and Q[Wa]_ma is a free Q[T]_(T−1)-module of positive rank. Hence, the following inequalities hold:

rank_{Q[T]_(T−1)} Q[Wa]_ma ≤ deg(Wa) ≤ ∏_{i=1}^n deg(fi).
Proof. From Proposition 19 above, we have that A := Q[T] ↪ B := Q[T, X1,…,Xn]/(Fa) is an integral ring extension and B is a free A-module of positive rank. From Bézout's inequality we also conclude that deg(Wa) ≤ deg(V(Fa)) ≤ ∏_{i=1}^n deg(fi). Finally, in the proof of Proposition 19 we have shown that

Q[Wa]_ma = Q[T, X1,…,Xn]_ma/(Fa)_ma.

Additionally, let Q(T) be the field of fractions of Q[T] and let Q(Wa) be the field of rational functions defined on Wa. As Q[T] ↪ Q[Wa] is an integral ring extension, Q(Wa) is a finite field extension of Q(T). From the definition of geometric degree in [23], we have [Q(Wa) : Q(T)] ≤ deg(Wa). In order to conclude the proof of this Corollary we just have to observe that

Q[T]_(T−1) = Q[T, X1,…,Xn]_ma/(X1 − a1,…,Xn − an)_ma

and the following is an integral ring extension:

Q[T, X1,…,Xn]_ma/(X1 − a1,…,Xn − an)_ma ↪ Q[T, X1,…,Xn]_ma/(Fa)_ma.
4. The algorithm

We are now in a position to exhibit the algorithm referred to in the Introduction. It has three main steps:

Step 1: Choose at random a point a ∈ Z^n of bounded height such that DF(a) ∈ GL(n, Q). This is achieved by any of the probabilistic zero tests based either on the Zippel–Schwartz test (as in [45,62]) or on correct-test sequences (as in [27] or [31]). In the sequel we always assume that the regular point a satisfies ‖a‖ ≤ (nd)^O(1). This upper bound is an immediate consequence of applying any of these probabilistic zero tests.

Step 2: Lifting step. From the smooth point (1, a) of the curve V(Fa), compute a Kronecker's encoding of the Q-irreducible component Wa ⊆ C^{n+1} of V(Fa).

Step 3: Using the algorithm cited in Theorem 7 above, compute a Kronecker's encoding of the intersection Wa ∩ V(T).

The key ingredient is clearly the algorithm that performs Step 2. We start with a description of this algorithm.

4.1. Lifting step

First of all, the following technical property holds:

Proposition 22. Let F := [f1,…,fn] ∈ Q[X1,…,Xn]^n be a generalised Pham system, and let a ∈ Q^n be a point such that DF(a) ∈ GL(n, Q). Let Fa ⊆ Q[T, X1,…,Xn] be the deformation of F given by the regular point a ∈ Q^n. Then, the following properties hold:
(1) There is a holomorphic mapping A : D → C^n, defined in an open neighbourhood D ⊆ C of 1 ∈ D, such that V(Fa) agrees with the graph of A near the simple point (1, a) ∈ V(Fa).
(2) Assume that A := (A1,…,An), where the Ai : D → C are holomorphic mappings. For every i, 1 ≤ i ≤ n, let φi ∈ C[[T − 1]] be the Taylor expansion of Ai at T = 1. Then, φi ∈ Q[[T − 1]] and φi is integral over Q[T].
(3) Let Wa ⊆ C^{n+1} be the unique Q-irreducible component of V(Fa) that contains the point (1, a). Then, for every i, 1 ≤ i ≤ n, the integral formal power series φi ∈ Q[[T − 1]] has degree at most deg(Wa).

Proof. The first claim of this proposition is granted by the Implicit Function Theorem (cf.
[20] for instance). Moreover, since Wa is the unique Q-irreducible component of V(Fa) that contains the point (1, a), we conclude that, near (1, a), Wa agrees with the graph of A.

For every i, 1 ≤ i ≤ n, let πi : C^{n+1} → C^2 be the canonical projection πi(t, x1,…,xn) := (t, xi), ∀(t, x1,…,xn) ∈ C^{n+1}, and let Vi := πi(Wa) be the i-th projection of the Q-irreducible variety Wa. As Q[T] ↪ Q[Wa] is an integral ring extension, Vi ⊆ C^2 is a hypersurface and there is a polynomial qi(T, Xi) ∈ Q[T, Xi] of degree at most deg(Wa), monic with respect to the variable Xi, such that qi|Vi ≡ 0. As the graph of A locally agrees with Wa near (1, a), we conclude that the graph of the holomorphic mapping πi ∘ A : D → C^2 is included in
Vi near πi(1, a). In particular, qi(T, Xi) vanishes on the graph of Ai. Then, by the Identity Principle (cf. [20] for instance) we conclude that for every i, 1 ≤ i ≤ n, the following holds:

qi(T, φi) ≡ 0 in C[[T − 1]].
Moreover, since (1, a) ∈ Q^{n+1} and F ∈ Q[X1,…,Xn]^n, using Hensel's Lemma (cf. [12] or [61] for instance) we conclude that φi ∈ Q[[T − 1]]. In particular, φi ∈ Q[[T − 1]] is an integral formal power series of degree at most deg(Wa), as wanted.

As in Section 2.4.2, let {U1,…,Un} be independent variables over Q, let K := Q(U1,…,Un) be the corresponding transcendental field extension of Q and let K̄ be the algebraic closure of K. For a generalised Pham system F := [f1,…,fn] ∈ Q[X1,…,Xn]^n and for a regular point a ∈ Q^n, let φ1,…,φn be the Taylor expansions of the holomorphic functions A1,…,An of the second claim of Proposition 22 above. We have φi ∈ Q[[T − 1]] for every i, 1 ≤ i ≤ n. Then, the following is a formal power series in K[[T − 1]]:

u := U1 φ1 + ⋯ + Un φn ∈ K[[T − 1]].

Moreover, u is integral over the ring K[T] and the following Proposition holds:

Proposition 23. With the same notations as above, let qu(T, Z) ∈ K[T, Z] be the minimal polynomial of the integral power series u = U1 φ1 + ⋯ + Un φn defined above. Then, qu(T, Z) is the Chow polynomial of the K-definable irreducible variety Wa ⊆ K̄^{n+1} with respect to the Noether normalisation

A := K[T] ↪ B := K[T, X1,…,Xn]/I(Wa).   (6)

In particular, qu(T, Z) is an irreducible polynomial of total degree at most 2 deg(Wa).

The reader should observe that the Chow polynomial with respect to the Noether normalisation (6) may also be defined in the following terms. Let {U1,…,Un} be some new variables, let K := Q(U1,…,Un) and let

K ⊗_Q A = K[T] ↪ K ⊗_Q B =: B_K = K[T, X1,…,Xn]/I(Wa)^e

be the integral ring extension obtained by extending scalars. Let ηu : B_K → B_K be the homothesy defined by

ηu(ḡ) := (U1 X1 + ⋯ + Un Xn) ḡ ∈ B_K, ∀ḡ ∈ B_K,

where ¯· denotes residue class modulo the extended ideal I(Wa)^e. The minimal equation of ηu is a polynomial in Q[U1,…,Un][T, Z], monic with respect to the variable Z, of total degree at most 2 deg(Wa). This minimal equation of ηu is called the Chow polynomial of Wa with respect to the Noether normalisation (6). The degree bound is a consequence of Bézout's inequality as in [23].

Finally, we shall make use of the Newton operator as in [12]. From now on, let F := [f1,…,fn] ∈ Q[X1,…,Xn]^n be a generalised Pham system and let a ∈ Q^n be a
regular point of F (i.e. DF(a) ∈ GL(n, Q)). Let Wa be the unique Q-irreducible component of V(Fa) that contains the point (1, a). We define the Newton operator associated to the system Fa as

N_Fa(Z1,…,Zn) := (Z1,…,Zn)^t − DFa(Z)^(−1) · (f1(Z) − T f1(a),…,fn(Z) − T fn(a))^t.

This Newton operator satisfies the following standard and well-known Proposition.

Proposition 24. With the same notations and assumptions as above, for every positive integer k ∈ N, let N_Fa^k(a) ∈ Q[[T − 1]]^n be the list of rational functions (in Q[T]_(T−1)^n) given by the following recursion: N_Fa^0(a) := a ∈ Q[[T − 1]]^n and, for every k ≥ 1, N_Fa^k(a) := N_Fa(N_Fa^(k−1)(a)) ∈ Q[[T − 1]]^n. Then, the sequence {N_Fa^k(a) : k ∈ N} is well-defined. Moreover, let ‖·‖ : Q[[T − 1]]^n → R_+ be the maximum norm with respect to the non-Archimedean absolute value |·| : Q[[T − 1]] → R. Then, for every positive integer k ∈ N, the following holds:

‖(φ1,…,φn) − N_Fa^k(a)‖ ≤ 1/2^(2^k),
where (φ1,…,φn) ∈ Q[[T − 1]]^n are the implicit formal power series of the second claim of Proposition 22.

The following algorithm easily follows from the one discussed in [12,15]. This algorithm uses Strassen's Vermeidung von Divisionen technique (cf. [55], as adapted in [31]).

Proposition 25. There is a deterministic Turing machine M4 that performs the following task:
• The input of machine M4 is given by the following information:
◦ A straight-line program of size L, depth ℓ and parameters in Z of bit length at most log2 H that evaluates a generalised Pham system F := [f1,…,fn] ∈ Z[X1,…,Xn]^n.
◦ A regular point a ∈ Z^n such that ‖a‖ ≤ H.
◦ A positive integer D ∈ N.
• The output of machine M4 is the truncated Taylor series expansion (up to degree D) u_D of the integral formal power series

u := U1 φ1 + ⋯ + Un φn ∈ Q(U1,…,Un)[[T − 1]].

The polynomial u_D is given by its dense encoding in Q(U1,…,Un)[T − 1] and its coefficients are given by a straight-line program of size polynomial in the quantities D, L, d, n, where d := max{deg fi : 1 ≤ i ≤ n}.
The running time of M4 is polynomial in the quantities D, L, d, n, log H.

The following algorithm is due to [17] (cf. also [34]). We restate it as adapted to our particular situation.

Theorem 26 (Giusti et al. [17]). There is a bounded error probability Turing machine M5 that performs the following task:
• The machine M5 takes as input the following information:
◦ A straight-line program that evaluates a generalised Pham system F := [f1,…,fn] ∈ Z[X1,…,Xn]^n. The size of the program is at most L, its depth is ℓ and its parameters have bit length at most h.
◦ A regular point a ∈ Z^n of bit length at most h.
◦ An irreducible monic polynomial q ∈ Z[U1,…,Un][T, Z] encoded by a non-scalar straight-line program of size at most L, depth at most ℓ and parameters of bit length at most h. Assume that the total degree of q is at most D.
• The machine M5 outputs the following information:
◦ First of all, M5 decides whether q is the Chow polynomial of the unique Q-irreducible component Wa of V(Fa) with respect to the Noether normalisation Q[T] ↪ Q[T, X1,…,Xn]/(Fa).
◦ If so, M5 outputs a Kronecker's encoding of Wa.
The running time of M5 is polynomial in max{D, deg Wa}, L, n, d, h, where d := max{deg fi : 1 ≤ i ≤ n}.

In fact, the algorithm in [17] can also be replaced by the "two-by-two reconstruction" algorithm in [31], with similar time bounds and characteristics. The procedure first computes a Kronecker's encoding of some curve C associated to the polynomial q(U1,…,Un, T, Z). Then, M5 decides whether C ⊆ V(Fa) and (1, a) ∈ C. If this is the case, then C = Wa and we already have a Kronecker's encoding of Wa.

Now we can finally define the subalgorithm that performs Step 2. This is Algorithm 1.

Algorithm 1.
INPUT
• A straight-line program that evaluates a generalised Pham system F := [f1,…,fn] ∈ Z[X1,…,Xn]^n.
• A regular point a ∈ Z^n.
D ← 1
already_computed ← false
while (D ≤ ∏_{i=1}^n deg(fi)) ∧ (¬ already_computed) do
  Apply the Newton operator, via the Turing machine M4 of Proposition 25 above, to compute a truncated Taylor expansion (up to degree D^2 + 1) u_D ∈ K[[T − 1]].
  Apply the Turing machine M2 of Theorem 12 to u_D. The output is a polynomial q_D ∈ K[T, Z] of degree at most D^2 + 1.
  if q_D = 0 then
    D ← D + 1
  else
    Apply the Turing machine M5 of Theorem 26 to decide whether q_D is the Chow polynomial of Wa with respect to the Noether normalisation Q[T] ↪ Q[T, X1,…,Xn]/I(Wa).
    if this is the case then
      already_computed ← true
    else
      D ← D + 1
    end if
  end if
end while
OUTPUT the Kronecker's encoding of Wa.

The following Theorem is simply a consequence of our previous discussion. The reader should simply note that the output of the lifting step is a Kronecker's encoding given by polynomials in Z[T, Z] in their dense encoding, whose coefficients (in Z) are given in a straight-line program encoding of size at most the running time of the procedure.

Theorem 27. Algorithm 1 outputs a Kronecker's encoding of Wa in time at most polynomial in the quantities deg(Wa), L, n, d, h, where d := max{deg(fi) : 1 ≤ i ≤ n}, L is an upper bound on the size of the input straight-line program, and h is an upper bound on the bit length of its parameters and of the coordinates of a.

4.2. Proofs of the main Theorems 1 and 3

Proof of Theorem 1. The algorithm cited in Theorem 1 is Algorithm 2 below.

Algorithm 2.
INPUT
• A non-scalar straight-line program evaluating a generalised Pham system F := [f1,…,fn] ∈ Z[X1,…,Xn]^n.
Choose at random a point a ∈ Z^n such that DF(a) ∈ GL(n, Q).
Apply the LIFTING STEP algorithm described in Theorem 27 above. The output is a Kronecker's encoding of Wa.
Apply the elimination algorithm of Theorem 7 above. The output is a Kronecker's encoding of the non-empty (see Corollary 20 above) zero-dimensional algebraic variety W := Wa ∩ V(T).
OUTPUT the Kronecker's encoding of W.
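The overall strategy of Algorithm 2 — start at the known smooth point (1, a) of V(Fa) and follow the curve from T = 1 down to T = 0 — can be mimicked numerically. The sketch below does naive path tracking with Newton corrections on an invented toy system; it merely stands in for the symbolic Kronecker-encoding machinery of the paper:

```python
# End-to-end toy of the homotopy idea: start at the known point (1, a)
# of the curve V(F_a) and follow the branch down to T = 0, landing on a
# genuine zero of F.  The quadratic system below is invented.

def F(x1, x2):
    return (x1**2 + x2 - 3.0, x2**2 + x1 - 5.0)

a = (1.0, 1.0)
fa = F(*a)                              # F(a) = (-1, -3)

def newton_step(t, x1, x2):
    # one Newton correction for F(.) - t*F(a) = 0 at fixed t
    g1, g2 = (fi - t * fia for fi, fia in zip(F(x1, x2), fa))
    j11, j12, j21, j22 = 2 * x1, 1.0, 1.0, 2 * x2
    det = j11 * j22 - j12 * j21
    return (x1 - (j22 * g1 - j12 * g2) / det,
            x2 - (-j21 * g1 + j11 * g2) / det)

x1, x2 = a
for k in range(10, -1, -1):             # t walks from 1 down to 0
    t = k / 10.0
    for _ in range(8):                  # Newton corrections at fixed t
        x1, x2 = newton_step(t, x1, x2)

# the tracked point is now (numerically) a zero of F itself
assert all(abs(v) < 1e-8 for v in F(x1, x2))
```

The step size 0.1 and the eight corrections per step are arbitrary choices that happen to keep this toy path inside the Newton basin; the paper's symbolic procedure needs no such numerical tuning.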
From Corollary 20, we know that W := Wa ∩ V(T) is a non-empty zero-dimensional subvariety of V(F). Hence, this algorithm computes what was announced in the claim of Theorem 1. In what concerns complexity, our intermediate results show that the time complexity of this procedure is polynomial in the input length and in the geometric degree deg(Wa). As deg(Wa) ≤ deg(F), the theorem follows.

Proof of Theorem 3. As observed in the Introduction, the output of the algorithm of Theorem 1 is the Kronecker's encoding of some zero-dimensional Q-definable component W of V(F). This encoding is given by the following information:
(1) A primitive element u := λ1 X1 + ⋯ + λn Xn ∈ Z[X1,…,Xn] whose coefficients are given by their binary/decimal expansion.
(2) The minimal equation mu ∈ Z[T] of the primitive element. This polynomial is given in dense encoding, but its coefficients are given in straight-line program encoding.
(3) The discriminant ρ ∈ Z, given by its straight-line program encoding.
(4) The parametrisations v1,…,vn ∈ Z[T], whose coefficients are also given by their straight-line program encoding.

As W is a Q-definable non-empty zero-dimensional variety, there is some ζ ∈ C^n such that ζ ∈ W. Then, there is at least one Q-irreducible component W' of W such that W' contains the point ζ ∈ C^n. In fact, all Q-irreducible components of W are of this kind, and W has a minimal irreducible decomposition W = W1 ∪ ⋯ ∪ Ws, where deg W = Σ_{i=1}^s #(Wi) ≤ deg(F) and ζi ∈ Wi for every i, 1 ≤ i ≤ s. Moreover, each Q-irreducible component of W is identified one-to-one with some irreducible factor of the polynomial mu over Q[T]. Thus, the algorithm that proves Theorem 3 is Algorithm 3 below.

Algorithm 3.
INPUT
• A non-scalar straight-line program evaluating a generalised Pham system F := [f1,…,fn] ∈ Z[X1,…,Xn]^n.
Apply the algorithm of Theorem 1 to output a Kronecker's encoding of W := Wa ∩ V(T).
Factor the minimal polynomial mu ∈ Z[Z] of the primitive element u with respect to the variety W. Choose one of these factors q ∈ Z[Z].
Reduce the parametrisations with respect to the polynomial q and output new parametrisations ρ', w1,…,wn ∈ Z[Z].
OUTPUT q, ρ', w1,…,wn ∈ Z[Z].

There is one new task performed by this algorithm: factoring a univariate polynomial whose coefficients are given in straight-line program encoding. This problem was first discussed in [28,29]. However, E. Kaltofen did not take into account that the bit complexity depends not only on the degree and
the size of the straight-line program. As observed in [5], the factorisation of univariate polynomials with integral coefficients given by straight-line programs also depends on the height of the factors. In fact, in [5] the authors proved the following statement:

Theorem 28 (Castro et al. [5]). There is a deterministic Turing machine M6 that performs the following task:
• The input of M6 is given by the following items:
◦ A polynomial p ∈ Z[T] of degree at most d whose coefficients are encoded by a straight-line program of size L, using parameters of bit length at most h.
◦ A positive integer H ∈ N.
• The output of M6 is the list of all the irreducible factors of p whose coefficients can be written with at most H bits (i.e. the irreducible factors of p of logarithmic height at most H).
The running time of M6 is polynomial in the quantities d, L, H.

Using the machine M6 in the factoring step of Algorithm 3 above, we can find the minimum H such that mu ∈ Z[T] has an irreducible factor whose coefficients have bit length at most H. Choosing just one of these factors, we proceed to the reduction step of the same algorithm. The height of a zero ζ ∈ C^n is precisely the maximum number of digits required to represent the coefficients of a Kronecker's encoding of W. Hence, ht(ζ) ≤ H and the Theorem follows.

5. Universal behaviour

In this section we show that, although the algorithm in Theorem 1 is not universal in the sense of [4,25,42], unfortunately it behaves on the average as a universal symbolic polynomial equation solver.

Proposition 29. Let F be a generalised Pham system with coefficients in Q and let a ∈ Q^n be a point such that F(a) ∈ Q^n is a regular value of F (i.e. every point c ∈ C^n in the fibre F^(−1)({F(a)}) is a regular point of F).
Then, we have:
(1) For every point c ∈ C^n in the fibre F^(−1)({F(a)}), there is one and only one Q-irreducible component Wc of V(Fa) that contains the point (1, c) (i.e. (1, c) ∈ Wc).
(2) There is a finite subset S ⊆ F^(−1)({F(a)}) such that the following is the decomposition of V(Fa) into Q-irreducible components:

V(Fa) = ⋃_{c ∈ S} Wc.

Proof. From Definition 16, if F := [f1,…,fn] ∈ Q[X1,…,Xn]^n is a generalised Pham system, it is also a generalised Pham system in C[X1,…,Xn]^n. As c ∈ F^(−1)({F(a)}), we have F(c) = F(a), and also V(Fc) = V(Fa). As F(a) is a regular value, c ∈ C^n is also a regular point of the mapping F : C^n → C^n. Hence, Proposition 19 applies and there is one and only one (C-)irreducible component Vc of V(Fc) that contains the smooth point (1, c). Next, as V(Fc) = V(Fa), there is at least one Q-irreducible component Wc of V(Fa) that contains Vc and the smooth point (1, c). Additionally, as (1, c) is a smooth point of V(Fa) = V(Fc), the variety Wc is unique and the first claim holds.
On the other hand, let W ⊆ V(Fa) be a Q-irreducible component of V(Fa). The ring extension Q[T] ↪ Q[T, X1,…,Xn]/I(W) is integral. In particular, W ∩ V(T − 1) is a non-empty algebraic variety contained in

V(Fa) ∩ V(T − 1) = {(1, x) ∈ C^{n+1} : F(x) − F(a) = 0}.

Then, if (1, c) ∈ W ∩ V(T − 1), we conclude that F(c) = F(a) (or, equivalently, c ∈ F^(−1)({F(a)})) and the first claim implies W = Wc.

Let F ∈ Q[X1,…,Xn]^n be a generalised Pham system. For every point a ∈ Q^n such that F(a) ∈ Q^n is a regular value, we can decompose V(Fa) according to either Q-irreducible components or (C-)irreducible components. We shall introduce some notation to distinguish the two. Thus, we may assume that there are two subsets S, S̃ ⊆ F^(−1)({F(a)}) such that

V(Fa) = ⋃_{c ∈ S} Wc = ⋃_{c ∈ S̃} W̃c,

where Wc ⊆ V(Fa) is the unique Q-irreducible component of V(Fa) that contains the smooth zero (1, c) and W̃c is the unique (C-)irreducible component of V(Fa) that contains the smooth zero (1, c). Additionally, we have W̃c ⊆ Wc. As {0} × V(F) = V(Fa) ∩ V(T), the following corollary immediately follows:

Corollary 30. With the same notations and assumptions as above, let a ∈ Q^n be such that F(a) ∈ Q^n is a regular value and let ζ ∈ V(F) be a zero of the generalised Pham system. Then, there is some c ∈ F^(−1)({F(a)}) such that

(0, ζ) ∈ W̃c ⊆ Wc.

We shall make use of a generic deformation of a generalised Pham system in the following terms. Let F := [f1,…,fn] ∈ Q[X1,…,Xn]^n be a generalised Pham system with rational coefficients. Let {Y1,…,Yn} be a set of variables algebraically independent over C. Let us define the system of polynomials FY given by the following identities:

fi^(Y) := fi(X1,…,Xn) − T fi(Y1,…,Yn) ∈ Q[T, X1,…,Xn, Y1,…,Yn],
FY := [f1^(Y),…,fn^(Y)] ∈ Q[T, X1,…,Xn, Y1,…,Yn]^n.

We call FY the generic deformation of the generalised Pham system F. Let W(FY) ⊆ C^{2n+1} be the algebraic variety given by

W(FY) := {(t, x, y) ∈ C^{2n+1} : fi^(Y)(t, x, y) = 0, 1 ≤ i ≤ n}.

Observe that for every a := (a1,…,an) ∈ Q^n, the following equality holds:

V(Fa) = W(FY) ∩ V(Y1 − a1,…,Yn − an).

Proposition 19 above may be rewritten in the following terms:
Proposition 31. Let F := [f1,…,fn] ∈ Q[X1,…,Xn]^n be a generalised Pham system and let FY ∈ Q[T, X1,…,Xn, Y1,…,Yn]^n be its generic deformation. Let K := Q(Y1,…,Yn) be the field of rational functions with rational coefficients and let (FY)^e be the ideal generated by FY in the ring K[T, X1,…,Xn]. Then, the ring extension K[T] ↪ K[T, X1,…,Xn]/(FY)^e is integral. Moreover, there is a non-zero polynomial h ∈ Q[Y1,…,Yn] such that the following is also an integral ring extension:

Q[T, Y1,…,Yn]_h ↪ Q[T, X1,…,Xn, Y1,…,Yn]_h/(FY)^{ec},   (7)

where Q[T, Y1,…,Yn]_h and Q[T, X1,…,Xn, Y1,…,Yn]_h are the respective localisations at the multiplicative system S := {1, h, h^2,…}, and (FY)^{ec} is the ideal generated by FY in Q[T, X1,…,Xn, Y1,…,Yn]_h.

Proposition 32. With the same notations and assumptions as above, there is a unique prime ideal pY ∈ Spec(Q[T, X1,…,Xn, Y1,…,Yn]) such that the following properties hold:
(1) pY ∩ Q[Y1,…,Yn] = (0).
(2) pY is a minimal prime ideal over (FY) of coheight n + 1.
(3) Let pY^e be the prime ideal generated by pY in K[T, X1,…,Xn]. Then, pY^e is the unique minimal prime ideal over (FY)^e contained in the maximal ideal of K[T, X1,…,Xn] generated by {T − 1, X1 − Y1,…,Xn − Yn}.
(4) The following is an integral ring extension:

Q[T, Y1,…,Yn]_h ↪ Q[T, X1,…,Xn, Y1,…,Yn]_h/pY^{ec},   (8)

where h ∈ Q[Y1,…,Yn]\{0} is the non-zero polynomial of Proposition 31 above and pY^{ec} is the ideal generated in Q[T, X1,…,Xn, Y1,…,Yn]_h by pY.

Proof. From Proposition 31 above, there is one and only one prime ideal P ∈ Spec(K[T, X1,…,Xn]) such that P is a minimal prime ideal over (FY)^e and P is contained in the ideal generated in K[T, X1,…,Xn] by {T − 1, X1 − Y1,…,Xn − Yn}. Let m ⊆ K[T, X1,…,Xn] be the maximal ideal m := (T − 1, X1 − Y1,…,Xn − Yn). Then, the following properties hold:
• FY is part of a regular system of parameters in the local ring

A := K[T, X1,…,Xn]_m = Q[Y1,…,Yn, T, X1,…,Xn]_(T−1, X1−Y1,…,Xn−Yn).

• P_m = (FY)^e_m is the unique prime ideal generated by FY in the local ring A.

Then, there is a unique prime ideal pY ∈ Spec(Q[T, X1,…,Xn, Y1,…,Yn]) such that (pY)_m = (FY)^e_m and pY ⊆ (T − 1, X1 − Y1,…,Xn − Yn). We also have (FY) ⊆ pY and pY ∩ Q[Y1,…,Yn] = (0). From Krull's Principal Ideal Theorem, we conclude that ht(pY) ≤ n. Additionally, from the integral ring extension (7) we conclude that the ring extension Q[T, Y1,…,Yn]_h ↪ Q[T, X1,…,Xn, Y1,…,Yn]_h/pY^{ec} is integral, where pY^{ec} is the extension of pY to Q[T, X1,…,Xn, Y1,…,Yn]_h. In particular, we conclude that ht(pY) ≥ n and the second claim follows. The reader should observe that the third and the fourth claims have already been stated above.
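The generic deformation can be illustrated with the same invented toy system as before: specialising the Y-variables at a point a recovers the deformation Fa, as the equality V(Fa) = W(FY) ∩ V(Y1 − a1,…,Yn − an) above asserts. A minimal sketch:

```python
# Toy illustration of the generic deformation F_Y: each f_i^(Y) is
# f_i(x) - t*f_i(y), and specialising y at a point a gives back the
# deformation F_a.  The quadratic system is invented for the example.

def F(x1, x2):
    return (x1**2 + x2 - 3, x2**2 + x1 - 5)

def FY(t, x, y):
    # the generic deformation, a system in (t, x1, x2, y1, y2)
    return tuple(fi - t * fiy for fi, fiy in zip(F(*x), F(*y)))

a = (1, 1)

def Fa(t, x):
    # the deformation at a, as in Section 3.2
    return tuple(fi - t * fia for fi, fia in zip(F(*x), F(*a)))

# specialising Y = a in F_Y recovers F_a at every sample point
samples = [(0, (2, -1)), (1, (1, 1)), (3, (-2, 5))]
assert all(FY(t, x, a) == Fa(t, x) for t, x in samples)
```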
Proposition 33. With the same notations and assumptions as in the previous Proposition, let WY ⊆ C2n+1 be the algebraic variety de=ned as the set of common zeros de=ned by the polynomials in pY . Then, the following properties hold: (1) WY is a Q-de=nable irreducible algebraic variety of dimension n + 1. (2) For every c := (c1 ; : : : ; cn ) ∈ Cn such that h(c) = 0, the algebraic set WY(c) := WY ∩ V (Y1 − c1 ; : : : ; Yn − cn ) is a curve in C2n+1 . (3) For every point c ∈ Cn such that F(c) is a regular value and such that h does not vanish on the =bre F −1 ({F(c)}), then WY(c) is equidimensional and veri=es c × {c} ⊆ W (c) . Moreover, if c ∈ Qn is a rational point, then W (c) the inclusion W Y
Y
is a Q-de=nable equidimensional algebraic variety and veri=es Wc × {c} ⊆ WY(c) .
Proof. We clearly have that W_Y is a Q-definable irreducible algebraic variety of dimension n + 1. Since W_Y = V(p_Y) ⊆ C^{2n+1}, taking into account the integral ring extension (8) and extending scalars (i.e. tensoring by C ⊗_Q ·), the following is also an integral ring extension:

C[T, Y_1, …, Y_n]_h ↪ C[T, X_1, …, X_n, Y_1, …, Y_n]_h / (C ⊗_Q p_Y^{ec}).

From the Krull–Cohen–Seidenberg Theorem, we conclude that for every c ∈ C^n with h(c) ≠ 0 the algebraic set W_Y^{(c)} := W_Y ∩ V(Y_1 − c_1, …, Y_n − c_n) is non-empty. From Krull's Principal Ideal Theorem, we conclude that dim W_Y^{(c)} ≥ 1. On the other hand, we have the inclusion W_Y^{(c)} ⊆ V(F_c) × {c}, and the set V(F_c) × {c} is a curve. The second claim then follows.
Assume now that F(c) is a regular value and that h does not vanish on the fibre F^{−1}({F(c)}). From Krull's Principal Ideal Theorem, every minimal prime ideal over p_Y + (Y_1 − c_1, …, Y_n − c_n) has height at most 2n. Then, every irreducible component of W_Y^{(c)} is also a curve. Thus, there is a finite subset S_1 ⊆ F^{−1}({F(c)}) such that

W_Y^{(c)} = ⋃_{a ∈ S_1} W_a × {c}.
Finally, as p_Y ⊆ (T − 1, X_1 − Y_1, …, X_n − Y_n), W_Y also contains the diagonal Δ ⊆ C^{2n+1} given by the identity, Δ := {(t, x, y) ∈ C^{2n+1} : x = y}. In particular, (1, c, c) ∈ W_Y^{(c)} and, by irreducibility, W_c × {c} ⊆ W_Y^{(c)}. Moreover, if c belongs to Q^n, then W_Y^{(c)} = W_Y ∩ V(Y_1 − c_1, …, Y_n − c_n) is a Q-definable algebraic variety contained in V(F_c) × {c}. This implies W_c × {c} ⊆ W_Y^{(c)}.

Proposition 34. With the same notations as above, let ζ ∈ V(F) be a zero of the generalised Pham system. Let A_ζ ⊆ C^n be the constructible set given by the following identity:

A_ζ := {z ∈ C^n : (0, ζ, z) ∈ W_Y^{(z)}, h(z) ≠ 0}.

Then, A_ζ contains a non-empty Zariski open set.
Proof. Assume that A_ζ is contained in some proper hypersurface H := V(G). From the second Bertini Theorem (cf. [46]), there is an open set U ⊆ C^n such that every x ∈ U is a regular value of the surjective mapping F : C^n → C^n. Let c ∈ C^n be such that F(c) ∈ U is a regular value. Then, there is some a ∈ C^n such that F(a) = F(c) and (0, ζ) ∈ W_a. Thus, either h(a) = 0, or h(a) ≠ 0 and (0, ζ, a) ∈ W_Y^{(a)}. This second case implies a ∈ A_ζ and hence G(a) = 0. In conclusion, U is contained in the constructible set U_0 := F(V(G) ∪ V(h)). But dim U_0 ≤ dim(V(G) ∪ V(h)) ≤ n − 1, which yields a contradiction. Then, the Proposition follows.

Corollary 35. There is a Zariski open set A ⊆ C^n such that the following holds for every c ∈ A: let π : C^{2n+1} → C^n be the canonical projection onto the second group of coordinates, π(t, x, y) := x for all (t, x, y) ∈ C^{2n+1}. Then, π(W_Y^{(c)} ∩ V(T)) = V(F).

Proof. We just need to observe that A := ⋂_{ζ ∈ V(F)} A_ζ, and the result follows from the previous Proposition.
Proposition 36. With the same notations as in Proposition 19, there exist infinitely many integer points a ∈ Z^n such that the following properties hold:
(1) F(a) is a regular value of F : C^n → C^n.
(2) h(a) ≠ 0.
(3) π(W_a ∩ V(T)) = V(F), where π : C^{n+1} → C^n stands for the canonical projection π(t, x) := x for all (t, x) ∈ C^{n+1}.

Proof. Since a ∈ Z^n, we apply Proposition 33 to conclude that W_a × {a} ⊆ W_Y^{(a)}. Then, it suffices to show that we can choose infinitely many a ∈ Z^n such that W_Y^{(a)} is Q-irreducible, for then the equality W_a × {a} = W_Y^{(a)} holds, and the result follows from Corollary 35 above. Now, consider the following ring extension:

Q[T, Y_1, …, Y_n]_h ↪ Q[T, X_1, …, X_n, Y_1, …, Y_n]_h / p_Y^{ec}.

Observe that p_Y^{ec} is a prime ideal in Q[T, X_1, …, X_n, Y_1, …, Y_n]_h. There is a polynomial q(T, Y_1, …, Y_n, Z) ∈ Q[T, Y_1, …, Y_n, Z]_h such that the following is an isomorphism:

Q[T, X_1, …, X_n, Y_1, …, Y_n]_h / p_Y^{ec} ≅ Q[T, Y_1, …, Y_n, Z]_h / (q)_h.

Observe that q(T, Y_1, …, Y_n, Z) is an irreducible polynomial in the ring Q[T, Y_1, …, Y_n, Z]_h. Now, for every integer point a := (a_1, …, a_n) ∈ Z^n such that h(a) ≠ 0 and q(T, a_1, …, a_n, Z) is irreducible in Q[T, Z], we have that W_Y^{(a)} is a Q-irreducible variety. The existence of infinitely many integer points a ∈ Z^n verifying that property is guaranteed by Hilbert's Irreducibility Theorem (cf. [7] or [63]).

Proof of Theorem 2. It follows from Proposition 36 above.
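The specialization step above can be illustrated numerically: an irreducible polynomial q(T, Y, Z) remains irreducible after substituting Y = a for "most" integers a, by Hilbert's Irreducibility Theorem. The following sympy sketch uses a made-up polynomial q, not the one constructed in the proof (which is not computed here):

```python
# Illustration of the specialization argument: q(T, Y, Z) is irreducible,
# and its specializations Y = a stay irreducible for every tested integer
# except those in a "thin" exceptional set (here, a = 0).
import sympy as sp

T, Y, Z = sp.symbols('T Y Z')
q = Z**2 - T*Y          # made-up example, irreducible in Q[T, Y, Z]

assert sp.Poly(q, T, Y, Z).is_irreducible

# integer specializations Y = a that keep q irreducible in Q[T, Z]
good = [a for a in range(-5, 6)
        if sp.Poly(q.subs(Y, a), T, Z).is_irreducible]
print(good)             # every a in the range except a = 0, where q = Z**2
```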
References
[1] S.J. Berkowitz, On computing the determinant in small parallel time using a small number of processors, Inform. Process. Lett. 18 (1984) 147–150.
[2] D.N. Bernstein, A.G. Kušnirenko, A.G. Hovanskiĭ, Newton polyhedra, Uspehi Mat. Nauk 31 (3(189)) (1976) 201–202.
[3] A. Bompadre, Un problema de eliminación geométrica en sistemas de Pham–Brieskorn, Master's Thesis, Universidad de Buenos Aires, Argentina, 2000.
[4] D. Castro, M. Giusti, J. Heintz, G. Matera, L.M. Pardo, The hardness of polynomial equation solving, Found. Comput. Math. 3 (2003) 347–420.
[5] D. Castro, K. Hägele, J.E. Morais, L.M. Pardo, Kronecker's and Newton's approaches to solving: a first comparison, J. Complexity 17 (1) (2001) 212–303.
[6] E. Cattani, A. Dickenstein, B. Sturmfels, Computing multidimensional residues, in: T. Recio, L. González (Eds.), Algorithms in Algebraic Geometry and Applications, Birkhäuser, Basel, 1996, pp. 135–164.
[7] S.D. Cohen, The distribution of Galois groups and Hilbert's irreducibility theorem, Proc. London Math. Soc. (3) 43 (2) (1981) 227–250.
[8] L. Csanky, Fast parallel matrix inversion algorithms, SIAM J. Comput. 5 (1976) 618–623.
[9] W. Fulton, Intersection Theory, Ergebnisse der Mathematik, 3. Folge, Band 2, Springer, Berlin, 1984.
[10] C.B. García, W.I. Zangwill, Pathways to Solutions, Fixed Points, and Equilibria, Prentice-Hall, Englewood Cliffs, NJ, 1981.
[11] M. Giusti, J. Heintz, La détermination des points isolés et de la dimension d'une variété algébrique peut se faire en temps polynomial, in: Computational Algebraic Geometry and Commutative Algebra, Cortona, 1991, Sympos. Math. XXXIV, Cambridge University Press, Cambridge, 1993, pp. 216–256.
[12] M. Giusti, J. Heintz, K. Hägele, J.E. Morais, L.M. Pardo, J.L. Montaña, Lower bounds for Diophantine approximations, J. Pure Appl. Algebra 117/118 (1997) 277–317.
[13] M. Giusti, J. Heintz, J.E. Morais, J. Morgenstern, L.M. Pardo, Straight-line programs in geometric elimination theory, J. Pure Appl. Algebra 124 (1–3) (1998) 101–146.
[14] M. Giusti, J. Heintz, J.E. Morais, L.M. Pardo, When polynomial equation systems can be "solved" fast?, in: Applied Algebra, Algebraic Algorithms and Error-Correcting Codes, Paris, 1995, Lecture Notes in Computer Science, Vol. 948, Springer, Berlin, 1995, pp. 205–231.
[15] M. Giusti, J. Heintz, J.E. Morais, L.M. Pardo, Le rôle des structures de données dans les problèmes d'élimination, C.R. Acad. Sci. Paris Sér. I Math. 325 (11) (1997) 1223–1228.
[16] M. Giusti, J. Heintz, J. Sabia, On the efficiency of effective Nullstellensätze, Comput. Complexity 3 (1) (1993) 56–95.
[17] M. Giusti, G. Lecerf, B. Salvy, A Gröbner free alternative for polynomial system solving, J. Complexity 17 (1) (2001) 154–211.
[18] M. Giusti, É. Schost, Solving some overdetermined polynomial systems, in: Proc. 1999 Internat. Symp. on Symbolic and Algebraic Computation, Vancouver, BC, ACM, New York, 1999, pp. 1–8 (electronic).
[19] D.Yu. Grigoriev, Factorization of polynomials over a finite field and the solution of systems of algebraic equations, Zapiski Nauchnykh Seminarov Leningradskogo Otdeleniya Matematicheskogo Instituta 137 (1984) 20–79.
[20] R.C. Gunning, H. Rossi, Analytic Functions of Several Complex Variables, Prentice-Hall, Englewood Cliffs, NJ, 1965.
[21] K. Hägele, J.E. Morais, L.M. Pardo, M. Sombra, On the intrinsic complexity of the arithmetic Nullstellensatz, J. Pure Appl. Algebra 146 (2) (2000) 103–183.
[22] R. Hartshorne, Algebraic Geometry, Springer, New York, 1977.
[23] J. Heintz, Definability and fast quantifier elimination in algebraically closed fields, Theoret. Comput. Sci. 24 (3) (1983) 239–277.
[24] J. Heintz, T. Krick, S. Puddu, J. Sabia, A. Waissbein, Deformation techniques for efficient polynomial equation solving, J. Complexity 16 (1) (2000) 70–109.
[25] J. Heintz, G. Matera, L.M. Pardo, R. Wachenchauzer, The intrinsic complexity of parametric elimination methods, Electron. J. SADIO 1 (1) (1998) 37–51 (electronic).
[26] J. Heintz, G. Matera, A. Waissbein, On the time–space complexity of geometric elimination procedures, Appl. Algebra Engrg. Comm. Comput. 11 (4) (2001) 239–296.
[27] J. Heintz, C.P. Schnorr, Testing polynomials which are easy to compute, in: Logic and Algorithmic, Monograph. Enseign. Math. 30, Univ. Genève, Geneva, 1982, pp. 237–254.
[28] E. Kaltofen, Polynomial-time reductions from multivariate to bi- and univariate integral polynomial factorization, SIAM J. Comput. 14 (2) (1985) 469–489.
[29] E. Kaltofen, Factorization of polynomials given by straight-line programs, in: S. Micali (Ed.), Randomness and Computation, Advances in Computing Research, Vol. 5, JAI Press, Greenwich, CT, 1989, pp. 375–412.
[30] R. Kannan, A.K. Lenstra, L. Lovász, Polynomial factorization and nonrandomness of bits of algebraic and some transcendental numbers, Math. Comput. 50 (181) (1988) 235–250.
[31] T. Krick, L.M. Pardo, A computational method for Diophantine approximation, in: T. Recio, L. González (Eds.), Algorithms in Algebraic Geometry and Applications, Birkhäuser, Basel, 1996, pp. 193–253.
[32] T. Krick, L.M. Pardo, M. Sombra, Sharp estimates for the arithmetic Nullstellensatz, Duke Math. J. 109 (3) (2001) 521–598.
[33] L. Kronecker, Grundzüge einer arithmetischen Theorie der algebraischen Grössen, J. Reine Angew. Math. 92 (1882) 1–122.
[34] G. Lecerf, Une alternative aux méthodes de réécriture pour la résolution des systèmes algébriques, Ph.D. Thesis, École polytechnique, Palaiseau, France, 2001.
[35] A.K. Lenstra, H.W. Lenstra Jr., L. Lovász, Factoring polynomials with rational coefficients, Math. Ann. 261 (4) (1982) 515–534.
[36] H. Matsumura, Commutative Algebra, 2nd Edition, Benjamin/Cummings, Reading, MA, 1980.
[37] B. Mourrain, V.Y. Pan, Solving special polynomial systems by using structured matrices and algebraic residues, in: F. Cucker, M. Shub (Eds.), Foundations of Computational Mathematics, Springer, Berlin, 1997, pp. 287–304.
[38] B. Mourrain, V.Y. Pan, Multivariate polynomials, duality, and structured matrices, J. Complexity 16 (1) (2000) 110–180.
[39] K. Mulmuley, A fast parallel algorithm to compute the rank of a matrix over an arbitrary field, Combinatorica 7 (1) (1987) 101–104.
[40] V.Y. Pan, Y. Rami, X. Wang, Structured matrices and Newton's iteration: unified approach, Linear Algebra Appl. 343/344 (2002) 233–265.
[41] L.M. Pardo, How lower and upper complexity bounds meet in elimination theory, in: Applied Algebra, Algebraic Algorithms and Error-Correcting Codes (Paris, 1995), Lecture Notes in Computer Science, Vol. 948, Springer, Berlin, 1995, pp. 33–69.
[42] L.M. Pardo, Universal elimination requires exponential running time (extended abstract), in: Proceedings EACA'2000, 2000, pp. 25–51.
[43] J.M. Rojas, Some speed-ups and speed limits for real algebraic geometry, J. Complexity 16 (3) (2000) 552–571.
[44] J. Sabia, P. Solernó, Bounds for traces in complete intersections and degrees in the Nullstellensatz, Appl. Algebra Engrg. Comm. Comput. 6 (6) (1995) 353–376.
[45] J.T. Schwartz, Fast probabilistic algorithms for verification of polynomial identities, J. ACM 27 (4) (1980) 701–717.
[46] I.R. Shafarevich, Basic Algebraic Geometry, Vol. 1, 2nd Edition, Springer, Berlin, 1994.
[47] M. Shub, S. Smale, Complexity of Bézout's theorem I: geometric aspects, J. Amer. Math. Soc. 6 (2) (1993) 459–501.
[48] M. Shub, S. Smale, Complexity of Bézout's theorem II: volumes and probabilities, in: Effective Methods in Algebraic Geometry, Progress in Mathematics, Vol. 109, Birkhäuser, Basel, 1993, pp. 267–285.
[49] M. Shub, S. Smale, Complexity of Bézout's theorem III: condition number and packing, J. Complexity 9 (1993) 4–14.
[50] M. Shub, S. Smale, Complexity of Bézout's theorem V: polynomial time, Theoret. Comput. Sci. 133 (1994) 141–164.
[51] M. Shub, S. Smale, Complexity of Bézout's theorem IV: probability of success and extensions, SIAM J. Numer. Anal. 33 (1) (1996) 128–148.
[52] S. Smale, The fundamental theorem of algebra and complexity theory, Bull. Amer. Math. Soc. (N.S.) 4 (1) (1981) 1–36.
[53] A.J. Sommese, J. Verschelde, C.W. Wampler, Numerical decomposition of the solution sets of polynomial systems into irreducible components, SIAM J. Numer. Anal. 38 (6) (2001) 2022–2046 (electronic).
[54] A.J. Sommese, J. Verschelde, C.W. Wampler, Numerical irreducible decomposition using projections from points on the components, in: Symbolic Computation: Solving Equations in Algebra, Geometry, and Engineering, Contemporary Mathematics, Vol. 286, American Mathematical Society, Providence, RI, 2001, pp. 37–51.
[55] V. Strassen, Vermeidung von Divisionen, J. Reine Angew. Math. 264 (1973) 184–202.
[56] V. Strassen, Algebraic complexity theory, in: Handbook of Theoretical Computer Science, Vol. A, Elsevier, Amsterdam, 1990, pp. 633–672.
[57] B. Sturmfels, Solving Systems of Polynomial Equations, CBMS Regional Conference Series in Mathematics, Vol. 97, Conference Board of the Mathematical Sciences, Washington, DC, 2002.
[58] J. Verschelde, Toric Newton method for polynomial homotopies, J. Symbolic Comput. 29 (4–5) (2000) 777–793.
[59] J. Verschelde, P. Verlinden, R. Cools, Homotopies exploiting Newton polytopes for solving sparse polynomial systems, SIAM J. Numer. Anal. 31 (3) (1994) 915–930.
[60] W. Vogel, Lectures on Bézout's Theorem, Tata Lecture Notes, Vol. 74, Springer, Berlin, 1984.
[61] O. Zariski, P. Samuel, Commutative Algebra II, Graduate Texts in Mathematics, Vol. 39, Springer, Berlin, 1960.
[62] R. Zippel, Probabilistic algorithms for sparse polynomials, in: Proceedings EUROSAM'79, 1979, pp. 216–226.
[63] R. Zippel, Effective Polynomial Computation, Kluwer Academic Publishers, Dordrecht, 1993.
Theoretical Computer Science 315 (2004) 627 – 650
www.elsevier.com/locate/tcs
Parametrization of approximate algebraic curves by lines
Sonia Pérez-Díaz^a, Juana Sendra^b, J. Rafael Sendra^{a,*}
^a Departamento de Matemáticas, Universidad de Alcalá, Facultad de Ciencias, Apartado de Correos 20, E-28871 Madrid, Spain
^b Departamento de Matemáticas, Universidad Carlos III, E-28911 Madrid, Spain
Abstract
It is well known that irreducible algebraic plane curves having a singularity of maximum multiplicity are rational and can be parametrized by lines. In this paper, given a tolerance ε > 0 and an ε-irreducible algebraic plane curve C of degree d having an ε-singularity of multiplicity d − 1, we provide an algorithm that computes a proper parametrization of a rational curve that is exactly parametrizable by lines. Furthermore, the error analysis shows that, under certain initial conditions that ensure that points are projectively well defined, the output curve lies within the offset region of C at distance at most 2√2 ε^{1/(2d)} exp(2).
© 2004 Elsevier B.V. All rights reserved.
Keywords: Approximate algebraic curves; Rational parametrization; Hybrid symbolic-numeric methods
1. Introduction

Over the past several years, many authors have approached computer algebra problems by means of symbolic-numeric techniques. For instance, among others, methods for computing greatest common divisors of approximate polynomials (see [6,9,15,29]), for determining functional decomposition (see [10]), for testing primality (see [21]), for finding zeros of multivariate systems (see [9,16,18]), for factoring approximate polynomials (see [11,20,30,31]), or for the numerical computation of Gröbner bases (see [28,36]) have been developed.
Authors partially supported by BMF2002-04402-C02-01, HU2001-0002 and GAIA II (IST-2002-35512).
* Corresponding author.
E-mail addresses: [email protected] (S. Pérez-Díaz), [email protected] (J. Sendra), [email protected] (J.R. Sendra).
0304-3975/$ - see front matter © 2004 Elsevier B.V. All rights reserved.
doi:10.1016/j.tcs.2004.01.010
628
S. Pérez-Díaz et al. / Theoretical Computer Science 315 (2004) 627 – 650
Similarly, hybrid (i.e. symbolic and numeric) methods for the algorithmic treatment of algebraic curves and surfaces have been presented. For instance, the computation of singularities has been treated in [3,5,13,22,26], implicitization methods have been proposed in [12,14], and the numerical condition of implicitly given algebraic curves and surfaces has been analyzed (see [17]). Also, piecewise parametrizations are provided (see [11,23,19]) by means of a combination of algebraic and numerical techniques for solving differential equations and rational B-spline manipulations. However, although many authors have addressed the problem of globally and symbolically parametrizing algebraic curves and surfaces (see [1,24,25,32–34]), only few results have been achieved for the case of approximate algebraic varieties.
The statement of the problem for the approximate case is slightly different from the classical symbolic parametrization question. Intuitively speaking, one is given an irreducible affine algebraic plane curve C, that may or may not be rational, and a tolerance ε > 0, and the problem consists in computing a rational curve C̄, and its parametrization, such that almost all points of the rational curve C̄ are in the "vicinity" of C. The notion of vicinity may be introduced as the offset region limited by the external and internal offset to C at distance ε (see Section 4 for more details, and [2] for basic concepts on offsets), and therefore the problem consists in finding, if it is possible, a rational curve C̄ lying within the offset region of C. For instance, let us suppose that we are given a tolerance ε = 0.001, and that we are given the quartic C defined by

16.001 + 24.001x + 8y − 2y² + 12yx + 14.001x² + 2y²x + x²y + x⁴ − y³ + 6.001x³.

Note that C has genus 3, and therefore the input curve is not rational.
Our method provides as an answer the quartic C̄ defined by

16.008 + 24.012x + 8y − 2y² + 12yx + 14.006x² + 2y²x + x²y + x⁴ − y³ + 6.001x³.

Now, it is easy to check that the new curve C̄ has an affine triple point at (−2, −2), and hence it is rational. Furthermore, it can be parametrized by

P(t) = (t³ − 0.001 − t − 2t², t⁴ + 1.999t − t² − 2t³ − 2).

In Fig. 1 one may check that C and C̄ are close (see Example 2 in Section 3 for more details).
The notion of vicinity is geometric and, in general, it may be difficult to deduce it directly from the coefficients of the implicit equations, in the sense that two implicit equations f₁ and f₂ may satisfy that ‖f₁ − f₂‖ is small while they define algebraic curves that are not close; i.e. neither of them lies in the vicinity of the other. For example, if we consider the line f₁ = x + y and the conic f₂ = x + y + (1/1000)x² + (1/1000)y² − 1/1000, we have that ‖f₁ − f₂‖∞ = 1/1000. Nevertheless, the curves defined by f₁ and f₂ are not close. The problem of relating the tolerance with the vicinity notion may be approached either by analyzing locally the condition number of the implicit equations (see [17]) or
Fig. 1. Curve C (left) and curve C̄ (right).
by studying whether for almost every point P on the original curve there exists a point Q on the output curve such that the Euclidean distance of P and Q is significantly smaller than the tolerance. In this paper our error analysis will be based on the second approach. From this fact, and using [17], one may derive upper bounds for the distance of the offset region.
In [4], the problem described above is studied for the case of approximate irreducible conics, rational cubics and quadrics, and the error analysis for the conic case is presented. In this paper, although we do not give an answer for the general case, we extend the results in [4] by showing how to solve the question for the special case of curves parametrizable by lines. More precisely, we provide an algorithm that parametrizes approximate irreducible algebraic curves of degree d having an ε-singularity of multiplicity d − 1 (see Section 2). We illustrate the results by some examples (see Section 3), and we analyze the numerical error, showing that the output rational curve lies within the offset region of the input perturbated curve at distance at most 2√2 ε^{1/(2d)} exp(2) (see Section 4).

2. Numerical parametrization by lines

It is well known that irreducible algebraic curves having a singularity of maximum multiplicity are rational, and that they can be parametrized by lines. Examples of curves parametrizable by lines are irreducible conics, irreducible cubics with a double point, irreducible quartics with a triple point, etc. In this section, we show that this property is also true if one considers approximate irreducible algebraic curves that "almost" have a singularity of maximum multiplicity.
Before describing the method for the approximate case, and for reasons of completeness, we briefly recall here the algorithmic approach for symbolically parametrizing
curves having a singularity of maximum multiplicity. The geometric idea for these types of curves is to consider a pencil of lines passing through the singular point if the curve has degree bigger than 2, or through a simple point if the curve is a conic. In this situation, all but finitely many lines in the pencil intersect the original curve at exactly two different points: the base point of the pencil and a free point on the curve. The free intersection point depends rationally on the parameter defining the line, and it yields a rational parametrization of the curve. More precisely, the symbolic algorithm for parametrizing curves by lines (where the trivial case of lines is excluded) can be outlined as follows (see [33,34] for details):

Symbolic parametrization by lines
• Given an irreducible polynomial f(x, y) ∈ K[x, y] (K is an algebraically closed field of characteristic zero), defining an irreducible affine algebraic plane curve C of degree d > 1, with a (d − 1)-fold point if d ≥ 3.
• Compute a rational parametrization P(t) = (p₁(t), p₂(t)) of C.
1. If d = 2, take a point P on C; else determine the (d − 1)-fold point P of C.
2. If P is at infinity, consider a linear change of variables such that P is transformed into an affine point. Let P = (a, b).
3. Compute

A(x, y, t) = [ (1/(d−1)!) ∂^{d−1}f/∂x^{d−1} + (t/(d−2)!) ∂^{d−1}f/∂x^{d−2}∂y + ⋯ + (t^{d−1}/(d−1)!) ∂^{d−1}f/∂y^{d−1} ] / [ (1/d!) ∂^d f/∂x^d + (t/(d−1)!) ∂^d f/∂x^{d−1}∂y + ⋯ + (t^d/d!) ∂^d f/∂y^d ]

and return P(t) = (−A(P, t) + a, −tA(P, t) + b).
and return P(t) = (−A(P; t) + a; −tA(P; t) + b): Remark. The parametrization can also be obtained as −gd−1 (1; t) −tgd−1 (1; t) P(t) = + a; +b ; gd (1; t) gd (1; t) where gd (x; y) and gd−1 (x; y) are the homogeneous components of g(x; y) = f(x + a; y + b) of degree d and d − 1, respectively. Observe that both components of P(t) have the same denominator. Now, we proceed to describe the method to parametrize by lines approximate algebraic curves. For this purpose, we distinguish between the conic case and the general case. The main di5erence between these two cases is that in the case of conics, if the approximate curve is irreducible, the rationality is preserved. As we will see, the results obtained for conics are similar to those presented in [4]. Afterwards, the ideas for the 2-degree case will be generalized to any degree and therefore results in [4] will be extended. Throughout this section, we 4x a tolerance ¿0 and we will use the polynomial ∞-norm; i.e if p(x; y) = i; j∈I ai; j xi y j ∈ C[x; y] then p(x; y) is de4ned as max{|ai; j |=i; j ∈ I }. In particular if p(x; y) is a constant coeIcient p(x; y) will denote its module.
2.1. Parametrization of approximate conics

Let C be a conic defined by an ε-irreducible (over C) polynomial f(x, y) ∈ C[x, y]; that is, f(x, y) cannot be expressed as f(x, y) = g(x, y)h(x, y) + E(x, y), where g, h, E ∈ C[x, y] and ‖E(x, y)‖ < ε‖f(x, y)‖ (see for instance [11]). In particular, this implies that f(x, y) is irreducible and therefore C is rational. Thus, one may try to apply the symbolic parametrization algorithm to C. In order to do that, one has to compute a simple point on C. Furthermore, one may check whether the simple point can be taken over R and, if possible, compute it. This can be done either symbolically, for instance introducing algebraic numbers with the techniques presented in [35], or numerically by root finding methods. If one works symbolically, then the direct application of the algorithm will provide an exact answer. Let us assume that the simple point is approximated. For this purpose, we introduce the notion of ε-point.

Definition 1. We say that P̄ = (ā, b̄) ∈ C² is an ε-affine point of an algebraic plane curve C defined by a polynomial f(x, y) ∈ C[x, y] if it holds that

|f(P̄)| / ‖f(x, y)‖ < ε,

that is, P̄ is a simple point on C computed under fixed precision ε‖f(x, y)‖. Note that we require the relative error w.r.t. ‖f(x, y)‖ because, for any non-zero complex number λ, the polynomial λf(x, y) also defines C.

In this situation, let P̄ = (ā, b̄) be an ε-affine point of C, and let us consider the conic C̄ defined by the polynomial

f̄(x, y) = f(x, y) − f(P̄).

Now, P̄ really is a point on C̄. Furthermore, C̄ is irreducible. Indeed, if f̄ factors as f̄ = ḡ·h̄, then f = ḡ·h̄ + f(P̄) and |f(P̄)| < ε‖f(x, y)‖, that is, f is not ε-irreducible, which is impossible. Therefore, we have constructed a rational conic, namely C̄, on which we know a simple point, namely P̄. Hence, we may directly apply the symbolic algorithm to C̄ to get the rational parametrization

P̄(t) = (−A(P̄, t) + ā, −tA(P̄, t) + b̄),

where

A(x, y, t) = ( ∂f̄/∂x + t·∂f̄/∂y ) / ( (1/2!)·∂²f̄/∂x² + t·∂²f̄/∂x∂y + (t²/2!)·∂²f̄/∂y² ).
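For concreteness, here is a small sympy sketch of the conic construction; the perturbed circle and the chosen ε-point are our illustrative assumptions:

```python
# Sketch of the approximate-conic case: subtract f(P̄) so that the ε-point
# becomes an exact point of the new conic, then apply the A(x, y, t)
# formula above to the corrected conic fbar.
import sympy as sp

x, y, t = sp.symbols('x y t')

f = x**2 + y**2 - 1 + sp.Rational(1, 10**4)    # perturbed circle; ε-point near (1, 0)
a, b = 1, 0                                     # P̄ = (ā, b̄), with |f(P̄)| = 1/10**4

fbar = sp.expand(f - f.subs({x: a, y: b}))      # exact conic through P̄: x**2 + y**2 - 1

num = sp.diff(fbar, x) + t*sp.diff(fbar, y)
den = (sp.Rational(1, 2)*sp.diff(fbar, x, 2) + t*sp.diff(fbar, x, y)
       + sp.Rational(1, 2)*t**2*sp.diff(fbar, y, 2))
A = (num/den).subs({x: a, y: b})

P = (sp.cancel(-A + a), sp.cancel(-t*A + b))    # rational parametrization of fbar
print(P)
```

The output is the familiar rational parametrization of the unit circle through (1, 0), as one can verify by substituting it into fbar.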
2.2. Parametrization of approximate curves

In this subsection we deal with approximate curves of degree bigger than 2. In this case, the main difficulty is that the given approximate algebraic curve is, in general, non-rational, even though it might correspond to the perturbation of a rational curve. The idea to solve the problem is to generalize the construction done for conics. For
this purpose, we observe that the output curve in the 2-degree case is the original polynomial minus its Taylor expansion up to order 1 at the ε-point, i.e. the evaluation of the polynomial at the point. We will see that, for curves of degree d having "almost" a singularity of multiplicity d − 1, one may subtract from the original polynomial its Taylor expansion up to order d − 1 at the quasi-singularity to get a rational curve close to the given one. To be more precise, we first introduce the notion of ε-singularity.

Definition 2. We say that P̄ = (ā, b̄) ∈ C² is an ε-affine singularity of multiplicity r of an algebraic plane curve defined by a polynomial f(x, y) ∈ C[x, y] if, for 0 ≤ i + j ≤ r − 1, it holds that

| (∂^{i+j}f/∂x^i∂y^j)(P̄) | / ‖f(x, y)‖ < ε.

Note that an ε-singularity of multiplicity 1 is an ε-point on the curve. Similarly, one may introduce the corresponding notion for ε-singularities at infinity. However, here we will work only with ε-affine singularities, taking into account that the user can always prepare the input, by means of a suitable linear change of coordinates, in order to be in the affine case. Alternatively, one may also use the method described in [9].
In this situation, we denote by L_d the set of all ε-irreducible (over C) real algebraic curves of degree d having an ε-singularity of multiplicity d − 1, which we assume to be real. In the previous subsection we have seen how to parametrize by lines the elements in L₂. In the following, we assume that d > 2 and we show that the elements in L_d can also be parametrized by lines.
In order to check whether a given curve C of degree d, defined by a polynomial f(x, y), belongs to L_d, one has to check the ε-irreducibility of f(x, y) as well as the existence of an ε-singularity of multiplicity d − 1. To analyze the ε-irreducibility, one may use any of the existing algorithms (e.g. [11,21,20,31]). The algorithm given in [11] has polynomial complexity.
However, although the algorithm given in [21] has exponential complexity, in practice it has very good performance. Furthermore, the algorithms in [20,31] provide improvements to the methods described in [21]. For checking the existence, and for the computation, of ε-singularities of multiplicity d − 1, one has to solve the system of algebraic equations

∂^{i+j}f/∂x^i∂y^j (x, y) = 0,   i + j = 0, …, d − 2,

under fixed precision ε·‖f(x, y)‖, by applying root finding techniques (see [9,22,26,27]). Nevertheless, one may accelerate the computation by reducing the number of equations and the degrees involved in the system. More precisely, for some i₀, j₀, i₁, j₁ such that i₀ + j₀ = i₁ + j₁ = d − 2, one computes the solutions of the system

∂^{i₀+j₀}f/∂x^{i₀}∂y^{j₀} (x, y) = ∂^{i₁+j₁}f/∂x^{i₁}∂y^{j₁} (x, y) = 0,
Fig. 2. Real part of the curve C.
under fixed precision ε‖f(x, y)‖. Note that the two equations involved are quadratic. For this purpose, one may use well known methods (see for instance [9,22,26,27]). Once these solutions have been approximated, one may proceed as follows: if any of the roots obtained above, say P̄, satisfies

| ∂^{i+j}f/∂x^i∂y^j (P̄) | ≤ ε‖f(x, y)‖,   i + j = 0, …, d − 3,

then P̄ is an ε-singularity of multiplicity d − 1; otherwise, C does not have ε-singularities of multiplicity d − 1.
As an example (see Example 3 in Section 3), let ε = 0.001, and let C be the real ε-irreducible quartic defined by

f(x, y) = x⁴ + 2y⁴ + 1.001x³ + 3x²y − y²x − 3y³ + 0.00001y² − 0.001x − 0.001y − 0.001.

Applying the process described above, one gets that C has a 3-fold ε-singularity at P̄ = (−0.1248595915·10⁻⁶, 0.1249844199·10⁻⁶). In Fig. 2 appears the plot of the real part of C, and one sees that P̄ is "almost" a triple point of the curve. Alternatively to the approach described above, one may use the techniques presented in [5] in combination with the Gap Theorem (see [8]) and the Test Criterion.
Now, in order to parametrize the approximate algebraic curve C ∈ L_d, we consider a pencil of lines H_t passing through the ε-singularity P̄ = (ā, b̄) of multiplicity d − 1. That is, H_t is defined by the polynomial

H_t(x, y, t) = y − tx − b̄ + āt.

If P̄ had really been a singularity, then the above symbolic algorithm would have output the parametrization (p̄₁(t), p̄₂(t)) ∈ R(t)², where p̄₁(t) is the root in R(t) of
the polynomial f(x; tx + bJ − at) J d−1 (x − a) J and pJ 2 (t) = t pJ 1 (t) + bJ − t a. J However, in our case PJ is not a singularity but an -singularity. Then, the idea consists in computing the root in R(t) of the quotient of J at) J at)) f(x; tx+ b− J and (x−a) J d−1 w.r.t. x (note that degx (f(x; tx+ b− J = d, and therefore J = (pJ 1 (t); t pJ 1 (t) + the quotient has degree 1 in x), say pJ 1 (t), to 4nally consider P(t) J bJ − t a) J as approximate parametrization of C. In the next lemma we prove that P(t) is really a rational parametrization, and in Section 4, we will see that the error analysis shows that this construction generates a rational curve close to the original one. J Lemma 1. Let f(x; y) be the implicit equation of a curve C ∈ Ld and let PJ = (a; J b) be the -singularity of multiplicity d − 1 of C. Let pJ 1 (t) be the root in R(t) of the quotient of f(x; tx + bJ − at) J and (x − a) J d−1 , and let pJ 2 (t) = t pJ 1 (t) + bJ − t a. J Then J = (pJ 1 (t); pJ 2 (t)) is a rational parametrization. P(t) Proof. To prove the lemma one has to show that at least one of the components of J P(t) is not a constant. Let g(x; t) = f(x; tx + bJ − at). J We see that pJ 1 (t) = a. J Indeed, if pJ 1 (t) = a, J since pJ1 (t) is the root of quotient of g(x; t) and (x − a) J d−1 , one has that g(x; t) = (x − a) J d + R(t), where ∈ R? , and R(t) ∈ R(t). Moreover, since R(t) is the remainder and (x − a) J d−1 is monic in x, one has that R(t) is a polynomial. Let us say s that R(t) = as t + · · · + a0 , with as = 0. Thus,
f(x, y) = g(x, (y − b̄)/(x − ā)) = λ(x − ā)^d + [a_s(y − b̄)^s + a_{s−1}(y − b̄)^{s−1}(x − ā) + ⋯ + a₀(x − ā)^s]/(x − ā)^s.
However, if s > 0, this implies that (x − ā) divides a_s(y − b̄)^s, which is impossible because a_s ≠ 0. Hence s = 0; i.e., R(t) is a constant μ. That is, f(x, y) = λ(x − ā)^d + μ. Therefore, since f(x, y) is then a univariate polynomial of degree bigger than 1, it is reducible and hence not ε-irreducible, which is impossible.

Lemma 2. The parametrization P̄(t) = (p̄₁(t), p̄₂(t)) in Lemma 1 is proper.

Proof. Note that t = (p̄₂ − b̄)/(p̄₁ − ā). Thus, P̄(t) is proper and its inverse is (y − b̄)/(x − ā).

In the next lemma, for P ∈ R² and δ > 0, we denote by D(P, δ) the Euclidean disk D(P, δ) = {(x, y) ∈ R² : ‖(x, y) − P‖₂ ≤ δ}.
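The quotient construction of Lemma 1 can be carried out with exact arithmetic; the sketch below is our own illustration (not taken from the paper) on a curve whose double point is exact rather than approximate, and it recovers the classical parametrization of the cuspidal cubic.

```python
import sympy as sp

x, y, t = sp.symbols('x y t')

# cuspidal cubic: d = 3, exact singularity of multiplicity d-1 = 2 at (0,0)
f = y**2 - x**3
a, b, d = 0, 0, 3

# restrict f to the pencil of lines y = t*x + b - a*t through (a, b)
g = f.subs(y, t*x + b - a*t)

# divide by (x - a)^(d-1); the quotient has degree 1 in x
q, rem = sp.div(sp.Poly(g, x), sp.Poly((x - a)**(d - 1), x))
p1 = sp.solve(q.as_expr(), x)[0]        # root of the quotient in R(t)
p2 = sp.expand(t*p1 + b - a*t)

print(p1, p2)                            # t**2 t**3
print(sp.expand(f.subs({x: p1, y: p2}))) # 0: the parametrization lies on f
```

For an ε-singularity the same division is performed and the remainder, which no longer vanishes, is simply discarded.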
Lemma 3. Let C be an affine algebraic curve, defined by a polynomial f(x, y) ∈ R[x, y], having a real ε-singularity P̄ of multiplicity r. Then, there exists δ > 0 such that any point Q ∈ D(P̄, δ) is also an ε-singularity of multiplicity r of C.

Proof. We denote by f_{i,j} the partial derivative ∂^{i+j}f/∂x^i∂y^j. Since P̄ is an ε-singularity of multiplicity r, for i + j = 1, …, r − 1 it holds that |f_{i,j}(P̄)| < ε‖f(x, y)‖. Let us denote |f_{i,j}(P̄)| = κ_{i,j} for i + j = 1, …, r − 1. Then, for each i, j there exists γ_{i,j} > 0 such that κ_{i,j} = ε‖f(x, y)‖ − γ_{i,j} < ε‖f(x, y)‖. We consider γ = min{γ_{i,j} : i + j = 1, …, r − 1} (note that γ > 0). On the other hand, since all partial derivatives are continuous, let M bound all partial derivatives up to order r in a compact disk D(P̄, δ₀), δ₀ > 0, and let δ be strictly smaller than min{γ/(2M), δ₀}; note that M > 0, since otherwise C would contain a disk of points, which is impossible. Now, take Q ∈ D(P̄, δ). Then, by applying the Mean Value Theorem, we have that for i + j = 1, …, r − 1,

|f_{i,j}(Q)| ≤ |f_{i,j}(P̄)| + |f_{i,j}(P̄) − f_{i,j}(Q)| ≤ κ_{i,j} + |∇f_{i,j}(ξ_{i,j}) · (P̄ − Q)ᵀ|,

where ξ_{i,j} is on the segment joining Q and P̄. Then, one concludes that

|f_{i,j}(Q)| ≤ ε‖f(x, y)‖ − γ_{i,j} + 2Mδ ≤ ε‖f(x, y)‖ − γ + 2Mδ < ε‖f(x, y)‖.

Therefore, Q is an ε-singularity of multiplicity r of C.

Now, let C ∈ L_d^ε be defined by the polynomial f(x, y). Then, by Lemma 3, one deduces that C has infinitely many (d − 1)-fold ε-singularities. For our purposes, we are interested in choosing the singularity appropriately. More precisely, we say that P̄ = (ā, b̄) is a proper (d − 1)-fold ε-singularity of C if the polynomial

Σ_{j1+j2=d−1}^{d} [∂^{j1+j2}f/∂x^{j1}∂y^{j2}](P̄) · (x − ā)^{j1}(y − b̄)^{j2}/(j1! j2!)
is irreducible over C. Note that this is always possible, because a small perturbation of the coefficients of a polynomial transforms it into an irreducible polynomial. The following theorem shows that the implicit equation of the rational curve defined by the parametrization generated by the above process can also be obtained, as in the conic case, by Taylor expansions at the ε-singularity. In fact, the theorem includes as a particular case the result for conics. This result will avoid quotient computations and will be used to analyze the error.

Theorem 1. Let f(x, y) be the implicit equation of a curve C ∈ L_d^ε and let P̄ = (ā, b̄) be a proper ε-singularity of multiplicity d − 1 of C. Let p̄₁(t) be the root in R(t) of the quotient of f(x, tx + b̄ − āt) and (x − ā)^{d−1}, and let p̄₂(t) = t p̄₁(t) + b̄ − tā. Then the implicit equation of the rational curve C̄ defined by the parametrization
P̄(t) = (p̄₁(t), p̄₂(t)) is

f̄(x, y) = f(x, y) − T(x, y),

where T(x, y) is the Taylor expansion up to order d − 1 of f(x, y) at P̄.

Proof. Let

f(x, y) = f(P̄) + Σ_{j1+j2=1}^{d} [∂^{j1+j2}f/∂x^{j1}∂y^{j2}](P̄) (x − ā)^{j1}(y − b̄)^{j2}/(j1! j2!)

be the Taylor expansion of f(x, y) at P̄. Thus,

f(x, tx + b̄ − tā) = f(P̄) + Σ_{j1+j2=1}^{d} [∂^{j1+j2}f/∂x^{j1}∂y^{j2}](P̄) (x − ā)^{j1+j2} t^{j2}/(j1! j2!)

= (x − ā)^{d−1} Σ_{j1+j2=d−1}^{d} [∂^{j1+j2}f/∂x^{j1}∂y^{j2}](P̄) (x − ā)^{j1+j2−d+1} t^{j2}/(j1! j2!) + f(P̄) + Σ_{j1+j2=1}^{d−2} [∂^{j1+j2}f/∂x^{j1}∂y^{j2}](P̄) (x − ā)^{j1+j2} t^{j2}/(j1! j2!)

= (x − ā)^{d−1} M(x, t) + N(x, t),

where

N(x, t) = T(x, tx + b̄ − tā),  M(x, t) = S(x, tx + b̄ − tā)/(x − ā)^{d−1},

and S(x, y) is the Taylor expansion from order d − 1 up to order d at P̄. We observe that deg_x(M) = 1 and deg_x(N) ≤ d − 2. On the other hand, let U(x, t) and V(x, t) be the quotient and the remainder of f(x, tx + b̄ − tā) and (x − ā)^{d−1} w.r.t. x, respectively. Then f(x, tx + b̄ − tā) = (x − ā)^{d−1} U(x, t) + V(x, t) with deg_x(V) ≤ d − 2. Therefore,

(x − ā)^{d−1}(M(x, t) − U(x, t)) = V(x, t) − N(x, t).

Thus, since the degree w.r.t. x of V − N is smaller than or equal to d − 2, and (x − ā)^{d−1} divides V − N, one gets that M = U and V = N. In this situation,

f̄(P̄(t)) = f(P̄(t)) − T(P̄(t)) = f(p̄₁(t), t p̄₁(t) + b̄ − tā) − T(P̄(t)) = (p̄₁(t) − ā)^{d−1} U(p̄₁(t), t) + N(p̄₁(t), t) − T(P̄(t)) = T(P̄(t)) − T(P̄(t)) = 0.
Moreover, since P̄ is a proper ε-singularity of multiplicity d − 1 of C, one has that f̄ is irreducible, and thus P̄(t) parametrizes C̄.
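Theorem 1 can be checked on a small toy instance. The sketch below is our own example (not from the paper): a cusp perturbed by 10⁻⁴, so that (0, 0) is an ε-singularity of multiplicity 2; truncating the Taylor expansion there yields f̄, and the parametrization built from the quotient lies exactly on f̄ = f − T.

```python
import sympy as sp

x, y, t = sp.symbols('x y t')

# perturbed cusp: (0,0) is an eps-singularity of multiplicity d-1 = 2
d = 3
f = y**2 - x**3 + sp.Rational(1, 10000)
a, b = 0, 0

# T = Taylor expansion of f at (a,b) up to order d-1 (orders 0..d-2)
T = sum(sp.diff(sp.diff(f, x, i), y, j).subs({x: a, y: b})
        * (x - a)**i * (y - b)**j / (sp.factorial(i) * sp.factorial(j))
        for i in range(d - 1) for j in range(d - 1 - i))
fbar = sp.expand(f - T)                 # here fbar = y**2 - x**3

# p1 = root of the quotient of f(x, t*x + b - a*t) by (x - a)^(d-1)
g = f.subs(y, t*x + b - a*t)
q, rem = sp.div(sp.Poly(g, x), sp.Poly((x - a)**(d - 1), x))
p1 = sp.solve(q.as_expr(), x)[0]
p2 = sp.expand(t*p1 + b - a*t)

print(sp.expand(fbar.subs({x: p1, y: p2})))   # 0: P(t) parametrizes fbar
```

Here the remainder of the division equals the Taylor truncation restricted to the line, exactly as the proof of Theorem 1 predicts (V = N).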
This result can be applied to derive an algorithm for parametrizing approximate algebraic curves by lines, similar to the symbolic algorithm.

Numerical parametrization by lines
• Given: the defining polynomial f(x, y) of C ∈ L_d^ε, d ≥ 2.
• Compute: a rational parametrization P̄(t) of a rational curve C̄ close to C.
1. If d = 2, compute an affine ε-point P̄ of C; else compute a proper ε-singularity P̄ of C of multiplicity d − 1.
2. Compute f̄(x, y) = f(x, y) − T(x, y), where T(x, y) is the Taylor expansion of f(x, y) up to order d − 1 at P̄.
3. Apply step 3 of the symbolic algorithm to f̄ and P̄.

3. Examples

In this section, we illustrate the numerical parametrization algorithm developed in Section 2 by some examples, where one can check that the output rational curve C̄ is close to the original curve C. This behavior will be clarified in the error analysis section. We give one example in detail, where we explain how the algorithm is performed, and we summarize seven other examples in different tables. In these tables we show the input curve C, the tolerance ε considered, the ε-singularity, the output curve C̄, the output parametrization P̄(t) defining the curve C̄, and a figure representing C and C̄.

Example 1. We consider ε = 0.001 and the curve C of degree 6 defined by the polynomial

f(x, y) = y⁶ + x⁶ + 2yx⁴ − 2y⁴x + 10⁻³x + 10⁻³y + 2·10⁻³ + 10⁻³x⁴.

First of all, by applying the algorithm developed in [11], we observe that the polynomial f(x, y) is ε-irreducible. Now, we apply the first step of the algorithm Numerical Parametrization by Lines, and we compute the ε-singularity. For this purpose, we determine the solutions of the system (see [9,27])

∂⁴f/∂x⁴ (x, y) = ∂⁴f/∂y⁴ (x, y) = 0,

under fixed precision ε‖f(x, y)‖ = 0.002.
We get four solutions:

P̄₁ = (−0.06650062380 + 0.1157587268i, 0.06683312414 + 0.1154704132i),
P̄₂ = (−0.06650062380 − 0.1157587268i, 0.06683312414 − 0.1154704132i),
P̄₃ = (0.1875000000·10⁻⁵, −0.50000002·10⁻³),
P̄₄ = (0.1329993725, −0.1331662483).

Only the root P̄₃ satisfies that

|∂^{i+j}f/∂x^i∂y^j (P̄₃)| ≤ 0.002,  i + j = 0, …, 3.
Then P̄ = P̄₃ = (0.1875000000·10⁻⁵, −0.50000002·10⁻³) is an ε-singularity of multiplicity 5, and therefore C ∈ L_6^{0.001}. Applying the second step of the algorithm Numerical Parametrization by Lines, we compute

f̄(x, y) = f(x, y) − T(x, y),

where T(x, y) is the Taylor expansion of f(x, y) up to order 5 at P̄:

T(x, y) = 0.001000000000x + 0.0010000000000y + 0.1000000173·10⁻⁸yx + 0.1300000000·10⁻¹⁰x⁴ + 0.7500000034·10⁻⁸x³ − 0.2499999700·10⁻⁸y³ + 0.4000000160·10⁻²xy³ + 0.1500000000·10⁻⁴x³y − 0.2109375027·10⁻¹³x² + 0.3000000000·10⁻¹²y⁴ − 0.2812500001·10⁻¹¹y² − 0.4218750000·10⁻¹⁰yx² + 0.3000000240·10⁻⁵y²x + 0.2000000000·10⁻².

One gets the curve C̄ defined by

f̄(x, y) = −0.1250000464·10⁻¹²x + 0.1125000100·10⁻¹⁴y + 0.9999999873·10⁻³x⁴ + 2yx⁴ − 2y⁴x − 0.1000000173·10⁻⁸yx + y⁶ + x⁶ − 0.7500000036·10⁻⁸x³ + 0.2499999700·10⁻⁸y³ + 0.2109375029·10⁻¹³x² − 0.3000000180·10⁻¹²y⁴ + 0.2812500000·10⁻¹¹y² − 0.1500000000·10⁻⁴x³y − 0.4000000160·10⁻²xy³ − 0.3000000240·10⁻⁵y²x + 0.4218750000·10⁻¹⁰yx² + 0.1562500311·10⁻¹⁸.

Now, we apply step 3 of the symbolic algorithm to f̄ and P̄. Thus, we compute

A(x, y, t) = [∂⁵f̄/∂x⁵ · 1/5! + ∂⁵f̄/∂x⁴∂y · t/4! + ⋯ + ∂⁵f̄/∂y⁵ · t⁵/5!] / [∂⁶f̄/∂x⁶ · 1/6! + ∂⁶f̄/∂x⁵∂y · t/5! + ⋯ + ∂⁶f̄/∂y⁶ · t⁶/6!]

= (6x + 2.000000000t − 2.000000000t⁴ + 6yt⁵)/(1 + t⁶),

and we return

P(t) = (−A(P̄, t) + 0.1875000000·10⁻⁵, −tA(P̄, t) − 0.50000002·10⁻³) = (p̄₁(t), p̄₂(t)),

where

p̄₁(t) = (−0.9375000000·10⁻⁵ − 2.000000000t + 2.000000000t⁴ + 0.3000000120·10⁻²t⁵ + 0.1875000000·10⁻⁵t⁶)/(1 + t⁶)

and

p̄₂(t) = (−0.4887500200·10⁻³ + 2.000000000t − 2.000000000t⁴ − 0.3000000120·10⁻²t⁵ − 0.5000000200·10⁻³t⁶)/(1 + t⁶).
See Fig. 3 to compare the input curve and the rational output curve.
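Step 3 of the symbolic algorithm, as used in Example 1, forms the ratio A of the order-(d−1) and order-d derivative sums of f̄ and returns P(t) = (−A(P̄, t) + ā, −tA(P̄, t) + b̄). A generic sketch of ours (the function name is our own; the exact cuspidal cubic stands in for f̄ so the arithmetic stays exact):

```python
import sympy as sp

x, y, t = sp.symbols('x y t')

def step3_parametrization(fbar, P, d):
    """Return (-A(P,t) + a, -t*A(P,t) + b), where A is the ratio of the
    order-(d-1) and order-d derivative sums of fbar."""
    def deriv_sum(order):
        return sum(sp.diff(sp.diff(fbar, x, j1), y, order - j1)
                   * t**(order - j1)
                   / (sp.factorial(j1) * sp.factorial(order - j1))
                   for j1 in range(order + 1))
    A = deriv_sum(d - 1) / deriv_sum(d)
    A0 = A.subs({x: P[0], y: P[1]})
    return (sp.simplify(-A0 + P[0]), sp.simplify(-t*A0 + P[1]))

# exact check on the cuspidal cubic (d = 3, double point at the origin)
p1, p2 = step3_parametrization(y**2 - x**3, (0, 0), 3)
print(p1, p2)   # t**2 t**3
```

By Theorem 1 this avoids the polynomial division of the earlier construction: only derivative evaluations at P̄ are needed.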
Fig. 3. Input curve C (left) and output curve C̄ (right).
Example 2.
Input curve C: 16.001 + 24.001x + 8y − 2y² + 12yx + 14.001x² + 2y²x + x²y + x⁴ − y³ + 6.001x³
Tolerance: 0.001
ε-Singularity: (−2, −2)
Output curve C̄: 16.008 + 24.012x + 8y − 2y² + 12yx + 14.006x² + 2y²x + x²y + x⁴ − y³ + 6.001x³
Parametrization P̄(t) = (p̄₁(t), p̄₂(t)): p̄₁ = t³ − 0.001 − t − 2t², p̄₂ = t⁴ + 1.999t − t² − 2t³ − 2
Figures: curve C (left), curve C̄ (right)
Example 3.
Input curve C: x⁴ + 2y⁴ + 1.001x³ + 3x²y − y²x − 3y³ + 0.00001y² − 0.001x − 0.001y − 0.001
Tolerance: 0.001
ε-Singularity: (−0.1248595915·10⁻⁶, 0.1249844199·10⁻⁶)
Output curve C̄: x⁴ + 2y⁴ + 1.001x³ + 3x²y − y²x − 3y³ + 10⁻⁶y² − 0.6243761996·10⁻¹³x − 0.6260915576·10⁻¹³y + 0.9744187291·10⁻²³ − 0.3522924910·10⁻¹⁶x² + 0.9991263887·10⁻⁶xy
Parametrization P̄(t) = (p̄₁(t), p̄₂(t)):
p̄₁ = −0.487671·(2.0526 + 6.15167t − 2.05055t² − 6.15167t³ + 0.512063·10⁻⁶t⁴)/(1 + 2t⁴),
p̄₂ = 0.487671·(0.256287·10⁻⁶ − 2.05260t − 6.15167t² + 2.05055t³ + 6.15167t⁴)/(1 + 2t⁴)
Figures: curve C (left), curve C̄ (right)
Example 4.
Input curve C: y⁵ + x⁵ + x⁴ + 0.001x + 0.001y + 0.002 + 0.001x² + 0.005y² + 0.001x³
Tolerance: 0.01
ε-Singularity: (−0.0002501, 0)
Output curve C̄: y⁵ + x⁵ + x⁴ + 0.6255863298·10⁻¹⁰x + 0.9999998183·10⁻³x³ + 0.3912115701·10⁻¹⁴ + 0.3751562603·10⁻⁶x²
Parametrization P̄(t) = (p̄₁(t), p̄₂(t)): p̄₁ = −0.41902244·10⁻⁶·(2384119 + 597t⁵)/(1 + t⁵), p̄₂ = −0.9987492180·t/(1 + t⁵)
Figures: curve C (left), curve C̄ (right)
Example 5.
Input curve C: −10x + 2y + xy⁴ + 862x⁴y − 359x³y² + 3.099 − 859.967x³y + 39x²y³ + 299.011x²y² + 52x²y − 3xy³ + 5xy² − 7.901xy + 687x⁴ − 642x⁵ − 67.989x³ + 14x² − 9.989y⁴ + y⁵ − 4y³ − y²
Tolerance: 0.1
ε-Singularity: (0.999067678, 1.99734)
Output curve C̄: −10.12701492x + 1.548607302y + xy⁴ + 862x⁴y − 359x³y² − 859.9670000x³y + 39x²y³ + 299.0110000x²y² + 52.18519488x²y − 3xy³ + 4.626307400xy² − 7.063248589xy − 642x⁵ − 67.98172465x³ + 13.33333837x² − 9.989000000y⁴ + y⁵ − 3.999974822y³ − 0.9012712980y² + 687x⁴ + 3.247948193
Parametrization P̄(t) = (p̄₁(t), p̄₂(t)):
p̄₁ = 0.22545229·(0.69592866·10³ − 0.128422685·10⁴t + 0.81893515·10³t² − 0.19495476·10³t³ + 0.0102t⁴ + 4.4313t⁵)/(t⁵ + t⁴ + 39t³ − 359t² + 862t − 642),
p̄₂ = 0.22545229·(−0.56876434·10⁴ + 0.111775629·10⁵t − 0.82845609·10⁴t² + 0.27553162·10⁴t³ − 0.35891982·10³t⁴ + 4.4380666t⁵)/(t⁵ + t⁴ + 39t³ − 359t² + 862t − 642)
Figures: curve C (left), curve C̄ (right)
Example 6.
Input curve C: x³ + x²y + x² + xy² + y³ + y² − 0.999990x − 0.999980y − 0.9999600
Tolerance: 0.01
ε-Singularity: (−0.99000000, 0)
Output curve C̄: x³ + x²y + x² + xy² + y³ + y² − 0.9603000x − 0.9801000y − 0.9604980
Parametrization P̄(t) = (p̄₁(t), p̄₂(t)): p̄₁ = (0.98 + 0.99t − t² − 0.99t³)/(1 + t + t² + t³), p̄₂ = t(1.97 + 1.98t − 0.01t²)/(1 + t + t² + t³)
Figures: curve C (left), curve C̄ (right)
Example 7.
Input curve C: y⁵ + x⁵ + x⁴ − 2y⁴ + 10⁻³x + 10⁻³y + 10⁻³ + 10⁻³x² + 10⁻³x³ + 2·10⁻³y²x + 10⁻³y³
Tolerance: 0.01
ε-Singularity: (−0.2501564001·10⁻³, 0.1250195·10⁻³)
Output curve C̄: 0.6255863298·10⁻¹⁰x + 0.1562864926·10⁻¹⁰y + y⁵ + x⁵ + x⁴ − 2y⁴ + 0.9999998183·10⁻³x³ + 0.3751562603·10⁻⁶x² + 0.9999997015·10⁻³y³ − 0.1875194239·10⁻⁶y² + 0.3423651857·10⁻¹⁴
Parametrization P̄(t) = (p̄₁(t), p̄₂(t)):
p̄₁ = −0.114881528·(8.695909548 − 0.1740379799·10²t⁴ + 0.2177516307·10⁻²t⁵)/(1 + t⁵),
p̄₂ = 0.2297630556·(0.544123596·10⁻³ − 4.346866016t + 8.702443119t⁵)/(1 + t⁵)
Figures: curve C (left), curve C̄ (right)
Example 8.
Input curve C: 291.9690000x − 17.00300000y − 100.9940000y² + 20y⁴x − 511.9760000x² + x⁷ − 14x⁶ + 82x⁵ − 259.9990000x⁴ + 479.9920000x³ + 29y⁵ − 74.99900000y⁴ − 40y³x + 40y²x − 160x²y + 140xy + 2x⁵y − 20x⁴y + 80x³y + y⁷ − 7y⁶ + 114.9960000y³ − 72.98400000 − 4y⁵x
Tolerance: 0.001
ε-Singularity: (2, 1)
Output curve C̄: −73 + 292x − 17y − 101y² − 512x² + x⁷ − 14x⁶ − 260x⁴ + 480x³ + 29y⁵ − 75y⁴ − 40y³x − 160x²y + 140xy + 2x⁵y − 20x⁴y + 80x³y + y⁷ − 7y⁶ + 115y³ − 4y⁵x + 20y⁴x + 82x⁵ + 40y²x
Parametrization P̄(t) = (p̄₁(t), p̄₂(t)): p̄₁ = 2(t⁷ + 1 + 2t⁵ − t)/(t⁷ + 1), p̄₂ = (4t⁶ − 2t² + t⁷ + 1)/(t⁷ + 1)
Figures: curve C (left), curve C̄ (right)
4. Error analysis

The examples in Section 3 show that, in practice, the output curve of our algorithm is quite close to the input one. In this section we analyze how far apart these two affine curves are. To be more precise, let C ∈ L_d^ε be defined by f(x, y). In addition, we will denote by

P̄(t) = (p̄₁(t)/q̄(t), p̄₂(t)/q̄(t)),

where gcd(p̄ᵢ, q̄) = 1, the generated parametrization of the output curve C̄. Moreover, since we will measure distances, we may assume that the ε-singularity of C is the
origin; otherwise one can apply a translation moving it to the origin, which preserves distances. Also, we assume that ‖f(x, y)‖ = 1; otherwise we consider f(x, y)/‖f(x, y)‖. If one does not normalize the input polynomial f(x, y), a similar treatment with relative errors can be done. In this situation, the general strategy we will follow is to show that almost any affine real point on C̄ is at small distance from an affine real point on C. For this purpose, we observe that P̄(t) is an exact parametrization of C̄ obtained by lines, and therefore all affine real points on C̄ are obtained as the intersection of a line of the form y = tx, for t real, with C̄. Then, if one intersects the curve C with the same line, one gets d points on C, counted properly, and we show that at least one of these intersection points on C is close to the initial point on C̄. Also, we observe that it is enough to reason with slope parameter values of t in the interval [−1, 1], because if |t| > 1 one may apply a similar strategy intersecting with lines of the form x = ty. Therefore, let t₀ ∈ R be such that |t₀| ≤ 1 and q̄(t₀) ≠ 0. Then, the corresponding point Q̄ on C̄ is Q̄ = P̄(t₀). Let us express Q̄ as

Q̄ = (ā, b̄) = (ā₁/c̄, b̄₁/c̄),

where ā₁ = p̄₁(t₀), b̄₁ = p̄₂(t₀) and c̄ = q̄(t₀). Observe that, since we are cutting with the line y = t₀x, it holds that b̄ = t₀ā. Thus, if we write the affine point Q̄ projectively, one has (ā₁ : t₀ā₁ : c̄). Now, observe that if |ā₁| and |c̄| are simultaneously very small, i.e. very close to ε, this point is not well defined as an element in P²(R). For this reason, we will assume that either |ā₁| or |c̄| is bigger than a certain bound that depends on the tolerance. In fact, for our error analysis, we fix that

|ā₁| > ε^{1/d} or |c̄| > ε^{1/d}.

Furthermore, we observe that the defining polynomials of C̄ and C have the same homogeneous form of maximum degree, and hence both curves have the same points at infinity. Now, let Q = (a, b) be any affine point in C ∩ {y = t₀x}; note that here it also holds that b = t₀a. We want to compute the Euclidean distance between Q̄ and Q. In order to do that, we observe that

‖Q̄ − Q‖₂ = √((ā − a)² + (b̄ − b)²) = √((ā − a)²(1 + t₀²)) ≤ √2 |ā − a|.

Therefore, we focus on the problem of computing a good bound for |ā − a|. For this purpose, we first prove two different lemmas that will be used as general strategies in our reasonings.

Lemma 2. It holds that |ā − a| ≤ ε · C,
where

C = [Σ_{j1+j2=0}^{d−2} |ā|^{j1+j2} |t₀|^{j2} / (j1! j2!)] / (|ā|^{d−1} |c̄|).
J t0 x) = xd−1 Proof. First of all, we note that aJ is a root of the univariate polynomial f(x; (cx J − aJ1 ), and that a is a root of the univariate polynomial f(x; t0 x) = xd−1 (cx J − aJ1 ) +
d−2 j1 +j2 =0
@j1 +j2 f 1 (0; 0)xj1 (t0 x)j2 : j j 1 2 @ x@ y j1 !j2 !
Since (0; 0) is the (d − 1)-fold -singularity of CJ it holds that j1 +j2 j 1 @ f |t0 | 2 J t0 x) = (0; 0) max f(x; t0 x) − f(x; j1 +j2 =0;:::;d−2 @j1 x@j2 y j1 !j2 ! j1 +j2 @ f 6 max (0; 0) ¡ f(x; y) = j1 +j2 =0;:::;d−2 @j1 x@j2 y J t0 x) can be written as and thus f(x; J t0 x) = f(x; t0 x) + R(x) where R ∈ R[x] f(x;
and
R(x) ¡ :
Therefore, by applying standard numerical techniques to measure |aJ − a| by means of the condition number (see for instance [7, p. 303]), one deduces that |aJ − a| 6 · C; where
d−2 C=
|a| J j1 +j2 |t0 |j2 1=j1 !j2 ! = J |@f=@x( a; J t0 a)| J
j1 +j2 =0
d−2
j1 +j2 =0
|a| J j1 +j2 |t0 |j2 1=j1 !j2 ! |a| J d−1 |c| J
:
Lemma 3. Let h(x) = c ∏_{i=1}^{n} (x − cᵢ) ∈ C[x], with deg(h) = n, and let α ∈ C be such that |h(α)| ≤ ε. Then there exists a root c_{i₀} of h(x) such that

|α − c_{i₀}| ≤ (ε/|c|)^{1/n}.

Proof. Let us assume that for i = 1, …, n, |α − cᵢ| > (ε/|c|)^{1/n}. Then

|h(α)| = |c| ∏_{i=1}^{n} |α − cᵢ| > ε,

which contradicts |h(α)| ≤ ε.
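Numerically, the bound in Lemma 3 is easy to observe. The sketch below is an illustration of ours: for h(x) = x³ − 10⁻⁹ and α = 0 we have |h(α)| = 10⁻⁹, and the nearest root sits exactly at distance (10⁻⁹/|c|)^{1/3} = 10⁻³, so the bound is tight here.

```python
import numpy as np

# h(x) = x^3 - 1e-9, leading coefficient c = 1, degree n = 3
coeffs = [1.0, 0.0, 0.0, -1e-9]
alpha = 0.0

eps = abs(np.polyval(coeffs, alpha))            # |h(alpha)| = 1e-9
bound = (eps / abs(coeffs[0]))**(1.0 / 3.0)     # (eps/|c|)^(1/n) = 1e-3

roots = np.roots(coeffs)
dist = min(abs(roots - alpha))                  # nearest root of h to alpha
print(dist <= bound + 1e-9)                     # True
```

This n-th-root behavior is exactly why the tolerance ε enters the final distance bounds only with exponent 1/d (and 1/(2d) in Theorem 2).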
Now, we proceed to analyze |ā − a| by using the previous lemmas. For this purpose, we distinguish different cases depending on the values of |ā₁| and |c̄|.

Lemma 4. Let |c̄| ≥ 1. Then, it holds that:
1. If |ā| ≥ 1, then |ā − a| ≤ ε exp(2).
2. If |ā| ≤ 1, then |ā − a| ≤ (ε exp(2))^{1/d}.

Proof. 1. If |ā| ≥ 1, we have that the constant C in Lemma 2 can be bounded as

C = [Σ_{j1+j2=0}^{d−2} |ā|^{j1+j2}|t₀|^{j2}/(j1! j2!)] / (|ā|^{d−1}|c̄|) = [Σ_{k=0}^{d−2} (|ā| + |āt₀|)^k/k!] / (|ā|^{d−1}|c̄|) ≤ Σ_{k=0}^{d−2} (1 + |t₀|)^k/(k! |ā|^{d−1−k}) ≤ Σ_{k=0}^{d−2} (1 + |t₀|)^k/k! ≤ exp(1 + |t₀|) ≤ exp(2).

Therefore, by Lemma 2 we deduce that |ā − a| ≤ ε exp(2).

2. If |ā| ≤ 1, we have that

|f(ā, āt₀)| = |f̄(ā, āt₀) + Σ_{j1+j2=0}^{d−2} [∂^{j1+j2}f/∂x^{j1}∂y^{j2}](0, 0) ā^{j1}(t₀ā)^{j2}/(j1! j2!)| = |Σ_{j1+j2=0}^{d−2} [∂^{j1+j2}f/∂x^{j1}∂y^{j2}](0, 0) ā^{j1}(t₀ā)^{j2}/(j1! j2!)| ≤ Σ_{j1+j2=0}^{d−2} |[∂^{j1+j2}f/∂x^{j1}∂y^{j2}](0, 0)| |ā|^{j1}|t₀|^{j2}|ā|^{j2}/(j1! j2!) ≤ ε exp(|ā|(1 + |t₀|)) ≤ ε exp(2).

In this situation, by Lemma 3 we deduce that there exists a root of the univariate polynomial f(x, t₀x), which we can assume w.l.o.g. is a, such that

|ā − a| ≤ (ε exp(2)/|c̄|)^{1/d} ≤ (ε exp(2))^{1/d}.
Lemma 5. Let |c̄| < 1 and |ā₁| ≥ 1. Then, it holds that |ā − a| ≤ ε exp(2).
Proof. Since |c̄| < 1 and |ā₁| ≥ 1, we have that the constant C in Lemma 2 can be bounded as

C = [Σ_{j1+j2=0}^{d−2} |ā|^{j1+j2}|t₀|^{j2}/(j1! j2!)] / (|ā|^{d−1}|c̄|) = [Σ_{k=0}^{d−2} (|ā₁| + |ā₁t₀|)^k |c̄|^{d−2−k}/k!] / |ā₁|^{d−1} ≤ Σ_{k=0}^{d−2} (1 + |t₀|)^k/(k! |ā₁|^{d−1−k}) ≤ Σ_{k=0}^{d−2} (1 + |t₀|)^k/k! ≤ exp(1 + |t₀|) ≤ exp(2).

Therefore, by Lemma 2 we deduce that |ā − a| ≤ ε exp(2).

Finally, it only remains to analyze the case where |c̄| < 1 and |ā₁| < 1. In order to do that, we recall that we have assumed that either |ā₁| or |c̄| is bigger than ε^{1/d}. In the next lemma, we study these cases.

Lemma 6. It holds that:
1. If |c̄| < 1 and ε^{1/d} < |ā₁| < 1, then |ā − a| ≤ ε^{1/d} exp(2).
2. If |ā₁| < 1 and ε^{1/d} < |c̄| < 1, then |ā − a| ≤ (ε^{1/2} exp(2))^{1/d}.

Proof. 1. If |c̄| < 1 and |ā₁| > ε^{1/d}, we have that the constant C in Lemma 2 can be bounded as

C = [Σ_{j1+j2=0}^{d−2} |ā|^{j1+j2}|t₀|^{j2}/(j1! j2!)] / (|ā|^{d−1}|c̄|) = Σ_{j1+j2=0}^{d−2} |ā₁|^{j1+j2−d+1}|c̄|^{d−j1−j2−2}|t₀|^{j2}/(j1! j2!) ≤ [Σ_{j1+j2=0}^{d−2} |t₀|^{j2}/(j1! j2!)] / |ā₁|^{d−1} ≤ exp(2)/|ā₁|^{d−1} ≤ exp(2) ε^{−1+1/d}.

Therefore, by Lemma 2 we deduce that |ā − a| ≤ ε^{1/d} exp(2).

2. Let ε^{1/d} < |c̄| < 1 and |ā₁| < 1. First we assume that |ā₁| ≤ ε^{1/d}; otherwise we reason as in the previous case. Thus, one has that |ā₁| ≤ ε^{1/d} < |c̄| < 1. In these conditions, we deduce that

|f(ā, āt₀)| = |f̄(ā, āt₀) + Σ_{j1+j2=0}^{d−2} [∂^{j1+j2}f/∂x^{j1}∂y^{j2}](0, 0) ā^{j1}(t₀ā)^{j2}/(j1! j2!)| = |Σ_{j1+j2=0}^{d−2} [∂^{j1+j2}f/∂x^{j1}∂y^{j2}](0, 0) ā^{j1}(t₀ā)^{j2}/(j1! j2!)| ≤ Σ_{j1+j2=0}^{d−2} |[∂^{j1+j2}f/∂x^{j1}∂y^{j2}](0, 0)| |ā|^{j1}|t₀|^{j2}|ā|^{j2}/(j1! j2!) ≤ ε exp(|ā|(1 + |t₀|)) ≤ ε exp(2).

Now, by Lemma 3 we deduce that there exists a root of the univariate polynomial f(x, t₀x), which we can assume w.l.o.g. is a, such that

|ā − a| ≤ (ε exp(2)/|c̄|)^{1/d} = (ε exp(2))^{1/d} (1/|c̄|)^{1/d} ≤ (ε exp(2))^{1/d} (1/ε^{1/d})^{1/d} = exp(2)^{1/d} ε^{(d−1)/d²} ≤ exp(2)^{1/d} ε^{1/(2d)} = (ε^{1/2} exp(2))^{1/d},

where the last inequality uses (d − 1)/d² ≥ 1/(2d) for d ≥ 2.
From the previous lemmas, one deduces the following theorem.

Theorem 2. For almost all affine real points Q̄ ∈ C̄ there exists an affine real point Q ∈ C such that

‖Q̄ − Q‖₂ ≤ √2 ε^{1/(2d)} exp(2).

Proof. Applying Lemmas 4–6, one deduces that

‖Q̄ − Q‖₂ = √((ā − a)² + (b̄ − b)²) = √((ā − a)²(1 + t₀²)) ≤ √2 |ā − a| ≤ √2 ε^{1/(2d)} exp(2).

Now, let Q̄ = (ā, b̄) be a regular point on C̄ such that there exists Q = (a, b) ∈ C with ‖Q̄ − Q‖₂ ≤ √2 ε^{1/(2d)} exp(2) (see Theorem 2). In this situation, we consider the tangent line to C̄ at Q̄, i.e. T(x, y) = n_x(x − ā) + n_y(y − b̄), where (n_x, n_y) is the unitary normal vector to C̄ at Q̄. Then, we bound the value T(Q):

|T(Q)| ≤ |n_x| · |a − ā| + |n_y| · |b − b̄| ≤ ‖Q̄ − Q‖₂ (|n_x| + |n_y|) ≤ 2√2 ε^{1/(2d)} exp(2).

Therefore, reasoning as in Section 2.2 of [17], one deduces the following theorem.

Theorem 3. C is contained in the offset region of C̄ at distance 2√2 ε^{1/(2d)} exp(2).
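To get a feel for the magnitude of the guarantee, the distance bound of Theorem 2 can be evaluated for the tolerances and degrees used in Section 3. This is our own numerical illustration; note that the bound is a coarse worst case, and the examples above are observed to be far closer in practice.

```python
import math

def theorem2_bound(eps, d):
    """sqrt(2) * eps^(1/(2d)) * exp(2): the distance bound of Theorem 2."""
    return math.sqrt(2) * eps**(1.0 / (2 * d)) * math.exp(2)

# (eps, d) pairs as in Examples 1, 3 and 6
for eps, d in [(0.001, 6), (0.001, 4), (0.01, 3)]:
    print(d, eps, theorem2_bound(eps, d))
```

The slow ε^{1/(2d)} decay reflects the n-th-root loss in Lemma 3: halving the distance guarantee requires shrinking the tolerance by a factor of 2^{2d}.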
References

[1] S. Abhyankar, C. Bajaj, Automatic parametrization of rational curves and surfaces III: algebraic plane curves, Comput. Aided Geom. Des. 5 (1988) 321–390.
[2] E. Arrondo, J. Sendra, J.R. Sendra, Parametric generalized offsets to hypersurfaces, J. Symbolic Comput. 23 (1997) 267–285.
[3] C. Bajaj, C.M. Hoffmann, J.E. Hopcroft, R.E. Lynch, Tracing surface intersections, Comput. Aided Geom. Des. 5 (1988) 285–307.
[4] C. Bajaj, A. Royappa, Parameterization in finite precision, Algorithmica 27 (1) (2000) 100–114.
[5] C. Bajaj, G. Xu, Piecewise approximations of real algebraic curves, J. Comput. Math. 15 (1) (1997) 55–71.
[6] B. Beckermann, G. Labahn, When are two polynomials relatively prime? J. Symbolic Comput. 26 (1998) 677–689.
[7] R. Bulirsch, J. Stoer, Introduction to Numerical Analysis, Springer, New York, 1993.
[8] J. Canny, The complexity of robot motion planning, ACM Doctoral Dissertation Award, The MIT Press, Cambridge, MA, 1987.
[9] R.M. Corless, P.M. Gianni, B.M. Trager, S.M. Watt, The singular value decomposition for polynomial systems, Proceedings of the ISSAC 1995, ACM Press, New York, 1995, pp. 195–207.
[10] R.M. Corless, M.W. Giesbrecht, D.J. Jeffrey, S.M. Watt, Approximate polynomial decomposition, in: S.S. Dooley (Ed.), Proceedings of the ISSAC 1999, Vancouver, Canada, ACM Press, New York, 1999, pp. 213–220.
[11] R.M. Corless, M.W. Giesbrecht, I.S. Kotsireas, M. van Hoeij, S.M. Watt, Towards factoring bivariate approximate polynomials, Proceedings of the ISSAC 2001 (B. Mourrain, Ed.), London, 2001, pp. 85–92.
[12] R.M. Corless, M.W. Giesbrecht, I. Kotsireas, S.M. Watt, Numerical implicitization of parametric hypersurfaces with linear algebra, Proceedings of Artificial Intelligence with Symbolic Computation (AISC 2000), Lecture Notes in Artificial Intelligence, Vol. 1930, Springer, Berlin, pp. 174–183.
[13] J. Demmel, D. Manocha, Algorithms for intersecting parametric and algebraic curves II: multiple intersections, Graphical Models and Image Processing: GMIP 57 (2) (1995) 81–100.
[14] T. Dokken, Approximate implicitization, in: T. Lyche, L.L. Schumaker (Eds.), Mathematical Methods for Curves and Surfaces: Oslo 2000, Innovations in Applied Mathematics Series, Vanderbilt University Press, 2001, pp. 81–102.
[15] I.Z. Emiris, A. Galligo, H. Lombardi, Certified approximate univariate GCDs, J. Pure Appl. Algebra 117 and 118 (1997) 229–251.
[16] I.Z. Emiris, V.Y. Pan, Symbolic and numeric methods for exploiting structure in constructing resultant matrices, J. Symbolic Comput. 33 (4) (2002) 393–413.
[17] R.T. Farouki, V.T. Rajan, On the numerical condition of algebraic curves and surfaces 1: implicit equations, Comput. Aided Geom. Des. 5 (1988) 215–252.
[18] S. Fortune, Polynomial root-finding using iterated eigenvalue computation, Proceedings of the ISSAC 2001, ACM Press, New York, pp. 121–128.
[19] J. Gahleitner, B. Jüttler, J. Schicho, Approximate parameterization of planar cubic curve segments, Proceedings of the Fifth International Conference on Curves and Surfaces, Saint-Malo, 2002, Nashboro Press, Nashville, TN, 2002, pp. 1–13.
[20] A. Galligo, D. Rupprecht, Irreducible decomposition of curves, J. Symbolic Comput. 33 (2002) 661–677.
[21] A. Galligo, S.M. Watt, A numerical absolute primality test for bivariate polynomials, in: W. Küchlin (Ed.), Proceedings of the ISSAC 1997, Maui, USA, ACM, New York, 1997, pp. 217–224.
[22] G.H. Golub, C.F. Van Loan, Matrix Computations, The Johns Hopkins University Press, Baltimore and London, 1989.
[23] E. Hartmann, Numerical parameterization of curves and surfaces, Comput. Aided Geom. Des. 17 (2000) 251–266.
[24] M. van Hoeij, Computing parametrizations of rational algebraic curves, in: J. von zur Gathen (Ed.), Proceedings of the ISSAC 94, ACM Press, New York, pp. 187–190.
[25] M. van Hoeij, Rational parametrizations of curves using canonical divisors, J. Symbolic Comput. 23 (1997) 209–227.
[26] C.M. Hoffmann, Geometric and Solid Modeling, Morgan Kaufmann, Los Altos, CA, 1993.
[27] S. Krishnan, D. Manocha, Solving algebraic systems using matrix computations, SIGSAM Bull. ACM 30 (4) (1996) 4–21.
[28] H.M. Möller, Gröbner bases and numerical analysis, in: B. Buchberger, F. Winkler (Eds.), Gröbner Bases and Applications, Lecture Notes in Statistics, Vol. 251, Springer, Berlin, 1998, pp. 159–178.
[29] V.Y. Pan, Numerical computation of a polynomial GCD and extensions, Tech. Report N. 2969, Sophia-Antipolis, France, 1996.
[30] V.Y. Pan, Univariate polynomials: nearly optimal algorithms for factorization and rootfinding, Proceedings of the ISSAC 2001, London, Ont., Canada, ACM Press, New York, NY, USA, 2001, pp. 253–267.
[31] T. Sasaki, Approximate multivariate polynomial factorization based on zero-sum relations, Proceedings of the ISSAC 2001, ACM Press, New York, NY, USA, 2001, pp. 284–291.
[32] J. Schicho, Rational parametrization of surfaces, J. Symbolic Comput. 26 (1998) 1–9.
[33] J.R. Sendra, F. Winkler, Symbolic parametrization of curves, J. Symbolic Comput. 12 (1991) 607–631.
[34] J.R. Sendra, F. Winkler, Parametrization of algebraic curves over optimal field extensions, J. Symbolic Comput. 23 (1997) 191–207.
[35] J.R. Sendra, F. Winkler, Algorithms for rational real algebraic curves, Fund. Inform. 39 (1999) 211–228.
[36] H. Stetter, Stabilization of polynomial systems solving with Gröbner bases, Proceedings of the ISSAC 97, 1997, pp. 117–124.
Theoretical Computer Science 315 (2004) 651 – 669
www.elsevier.com/locate/tcs
Numerical factorization of multivariate complex polynomials

Andrew J. Sommese^a, Jan Verschelde^{b,∗}, Charles W. Wampler^c

^a Department of Mathematics, University of Notre Dame, Notre Dame, IN 46556-4618, USA
^b Department of Mathematics, Statistics, and Computer Science, University of Illinois at Chicago, 851 South Morgan (M/C 249), Chicago, IL 60607-7045, USA
^c General Motors Research & Development, Mail Code 480-106-359, 30500 Mound Road, Warren, MI 48090-9055, USA
Abstract

One can consider the problem of factoring multivariate complex polynomials as a special case of the decomposition of a pure dimensional solution set of a polynomial system into irreducible components. The importance and nature of this problem, however, justify a special treatment. We exploit the reduction to the univariate root-finding problem as a way to sample the polynomial more efficiently, certify the decomposition with linear traces, and apply interpolation techniques to construct the irreducible factors. With a random combination of differentials we lower multiplicities and reduce to the regular case. Estimates on the location of the zeroes of the derivative of polynomials provide bounds on the required precision. We apply our software to study the singularities of Stewart–Gough platforms.
© 2004 Elsevier B.V. All rights reserved.

MSC: Primary 13P05; 14Q99; Secondary 65H10; 68W30

Keywords: Approximate factorization; Divided differences; Generic points; Homotopy continuation; Irreducible decomposition; Newton interpolation; Numerical algebraic geometry; Monodromy; Multiple roots; Polynomial; Stewart–Gough platform; Symbolic-numeric computation; Traces; Witness points
∗ Corresponding author. Fax: +1-312-996-1491.
E-mail addresses: [email protected] (A.J. Sommese), [email protected], [email protected] (J. Verschelde), [email protected] (C.W. Wampler).
URLs: http://www.nd.edu/∼sommese, http://www.math.uic.edu/∼jan
0304-3975/$ - see front matter © 2004 Elsevier B.V. All rights reserved.
doi:10.1016/j.tcs.2004.01.011
1. Introduction

We consider a polynomial f with complex coefficients in several variables. We wish to write f as a product of irreducible polynomials:

f(x) = ∏_{i=1}^{N} qᵢ(x)^{μᵢ},  x = (x₁, x₂, …, xₙ),  Σ_{i=1}^{N} μᵢ deg(qᵢ) = deg(f).  (1)
Note that for each i, the factor qᵢ occurs with multiplicity μᵢ. The problem of factoring multivariate polynomials occurs frequently in computer algebra. Especially the case when the coefficients of f are known only approximately is important for applications and is stated as a challenge problem to symbolic computation in [10]. Recent papers on this problem are [2,3,5,6,8,19,20]. These papers propose algorithms in hybrid symbolic-numeric computation [4]. We find our way of working very much related to the method of computing the approximate gcd of two polynomials using their zeros, as presented in [1]. Using homotopies theoretically, the complexity of factoring polynomials with rational coefficients was shown in [1] to be in NC. The crux of our approach is the numerical computation of witness sets, which in the case of a single polynomial means finding the intersection of f⁻¹(0) with a generic line in Cⁿ and then partitioning these witness points according to their membership in the irreducible factors. In this way, each factor qᵢ is witnessed by deg(qᵢ) distinct points of multiplicity μᵢ. The main contribution of this paper is a symbolic-numeric method for reducing multiplicities by differentiation, along with an analysis of the numerical stability of this step. Subsequent to the determination of the witness sets, one can numerically reconstruct the coefficients of each qᵢ by tracking the witness points in a continuation as the generic line is moved and interpolating points on these paths. For many purposes, the interpolation step is not necessary; for example, using only the witness set for a component, a homotopy membership test can check if a given point lies on that component. In this sense, we may consider the polynomial to be numerically factored once the witness set for f has been decomposed into the witness sets for its irreducible components qᵢ.
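For a single polynomial, the witness-set computation just described reduces to univariate root-finding along a generic line. The sketch below is our own illustration (the example polynomial, tolerances, and variable names are ours, not the authors' software): it slices f = (x² − y)(x + y) with a random complex line and recovers deg(f) = 3 witness points, two on the degree-2 factor and one on the degree-1 factor.

```python
import numpy as np

# f(x, y) = (x^2 - y)*(x + y): irreducible factors of degrees 2 and 1
def f(x, y):
    return (x**2 - y) * (x + y)

deg = 3
rng = np.random.default_rng(7)
p = rng.standard_normal(2) + 1j * rng.standard_normal(2)  # generic point
v = rng.standard_normal(2) + 1j * rng.standard_normal(2)  # generic direction

# f restricted to the line (x,y) = p + t*v is a univariate cubic in t;
# recover its coefficients by interpolation at deg+1 nodes
ts = np.arange(deg + 1, dtype=complex)
vals = np.array([f(*(p + t * v)) for t in ts])
coeffs = np.linalg.solve(np.vander(ts, deg + 1), vals)

troots = np.roots(coeffs)
witness = [p + t * v for t in troots]      # deg(f) witness points on f = 0

# partition the witness points by factor membership
on_parabola = [w for w in witness if abs(w[0]**2 - w[1]) < 1e-6]
on_line = [w for w in witness if abs(w[0] + w[1]) < 1e-6]
print(len(on_parabola), len(on_line))      # 2 1
```

In the paper the partitioning is done by monodromy and trace tests rather than by knowing the factors in advance; the point of the sketch is only the line-slicing step.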
Because interpolation can be sensitive to errors in the sample points, it is preferable to work directly with the witness sets whenever possible. Even so, for the sake of completeness, we carry out the interpolation step in our test examples. The approach just described is a specialization of the tools we built for computing an irreducible decomposition of the solution set of a polynomial system. These tools were developed to implement the research program Numerical Algebraic Geometry, outlined in [30]. In [22] we gave algorithms to decompose solution sets of polynomial systems into irreducible components of various degrees and dimensions, applying an embedding and sequence of homotopies [21] to find generic points efficiently. The homotopy test presented in [23] to determine whether a given point belongs to an irreducible component led to the use of monodromy [24], for which linear traces [25] provide an efficient verification and interpolation method. Applications of our software [28]
A.J. Sommese et al. / Theoretical Computer Science 315 (2004) 651 – 669
to design problems in mechanical engineering are described in [27]. A tutorial to our recent developments can be found in [29]. In [24] we gave an algorithm for using monodromy to decompose the zero set of a polynomial system into irreducible components. The main difficulty in the use of monodromy occurs when we track points on irreducible components of multiplicity at least two. In [26] we presented a method for tracking these singular paths, but it necessitates special care and usually requires higher precision arithmetic. In this article we specialize the algorithm of [25] to the case of a single polynomial f(x) on C^n, where it is possible to replace singular paths by nonsingular ones via differentiation.

Consider the example f(x) = (x_1^2 - x_2)^3 (x_1^2 + x_2^2 + x_2^3). If this polynomial is represented exactly, we could symbolically differentiate twice with respect to x_1 to obtain a polynomial where x_1^2 - x_2 is a factor of multiplicity one. However, when f is represented numerically in unfactored form, care must be taken to ensure that differentiation does not lead to numerical instability and erroneous results.

For simplicity, first assume that the polynomial is on C. Data for polynomials arising in engineering and science is sometimes noisy. Also, zeroes of polynomials of multiplicity more than one are hard to compute exactly. Thus even if the actual polynomial has a factor with multiplicity μ ≥ 2, we must expect that if we solve the restriction of the polynomial to a general line, we will find not a single zero of multiplicity μ, but a cluster of μ zeroes. Assume a polynomial has a zero z of multiplicity μ. Differentiating μ - 1 times will yield a polynomial having a nonsingular factor x - z. Because of slightly perturbed coefficients or roundoff errors, we may have a nearby polynomial f(x) containing a factor \prod_{i=1}^{\mu} (x - z_i) with z_i near z. In contrast to the case of an exact multiple root, it does not follow that f^{(\mu-1)}(x) has a single zero near z.
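The spreading of a multiple root into a cluster is easy to observe numerically. The following sketch (our own illustration, not from the paper) perturbs the coefficients of a polynomial with a triple root at the 1e-10 level; the triple root scatters into a cluster of radius roughly (1e-10)^(1/3):

```python
import numpy as np

# h(z) = (z - 1)^3 (z + 2): a triple root at z = 1.
h = np.poly([1, 1, 1, -2])          # coefficients, highest degree first
rng = np.random.default_rng(1)
h_noisy = h + 1e-10 * rng.standard_normal(h.size)   # simulate noisy data

roots = np.roots(h_noisy)
cluster = roots[np.abs(roots - 1) < 0.1]
radius = np.abs(cluster - 1).max()
# The triple root survives only as a cluster of 3 roots whose radius is on
# the order of (1e-10)^(1/3), far larger than the coefficient noise itself.
```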
Here a remarkable result of Marden and Walsh [11] (formulated as Corollary 3.3 below) gives mild numerical conditions guaranteeing the numerical stability of the symbolic operation of differentiation. It guarantees that if we have a cluster of μ roots in a disk D of radius r, and no root outside of D is a distance less than R from the center of the disk D, then under mild conditions on the size of R/r, f^{(\mu-1)}(x) has one root in D, and a lower bound is given for the distance of any root of f^{(\mu-1)}(z) outside of D to the center of D.

For polynomials of several variables, i.e., x ∈ C^n, we find the roots of the restriction of f(x) to a general line, i.e., a line x(t) = x_0 + tv where we have chosen random vectors x_0, v ∈ C^n. For roots of multiplicity one, we can use the monodromy technique of [24], or, if the degree of f(x) is low, use the trace theorem of [25] to justify the partial sum approach of [19]. To deal with the clusters of μ roots we can compute the (μ - 1)-th derivative of f(x) in the direction v, i.e., with v = (v_1, \ldots, v_n), we compute

g(x) := \left( v_1 \frac{\partial}{\partial x_1} + \cdots + v_n \frac{\partial}{\partial x_n} \right)^{\mu-1} f(x) (2)

and apply the techniques to the multiplicity one roots of g(x) corresponding to the clusters. Since

g(x_0 + tv) = \left( \frac{d}{dt} \right)^{\mu-1} f(x_0 + tv), (3)
we can use Corollary 3.3 to check that g(x_0 + tv) has multiplicity one roots corresponding to the multiplicity μ clusters. Now as we vary the line, we can use g(x) to track the appropriate roots. As a numerical safety check, we can verify that the continuations of these roots on the line's intersection with g^{-1}(0) have small residual when we evaluate f(x) at them.

In this paper, we first outline the algorithms using pseudocode. Then we justify our use of differentiation to remove multiplicities and examine the implications of the results of Marden and Walsh for the behavior of root clusters under differentiation. After discussing some numerical aspects of our implementation, we apply our software to a problem from mechanical engineering concerning the singularities of Stewart–Gough platforms.
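As a small self-contained illustration (our own toy example, restricted to one variable): for a polynomial with a noisy triple root, differentiating μ − 1 = 2 times turns the scattered cluster into a single well-conditioned simple root near the original location.

```python
import numpy as np

# h(z) = (z - 1)^3 (z + 2) with coefficients perturbed at the 1e-10 level.
h = np.poly([1, 1, 1, -2])
rng = np.random.default_rng(2)
h_noisy = h + 1e-10 * rng.standard_normal(h.size)

# Differentiate mu - 1 = 2 times: the triple root becomes a simple root,
# which is numerically well conditioned.
g = np.polyder(h_noisy, m=2)
simple = [z for z in np.roots(g) if abs(z - 1) < 0.1]
```

The simple root of the second derivative is recovered to nearly full working precision, while the cluster of the original polynomial is only accurate to about the cube root of the noise level.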
2. Algorithms

Given a pure k-dimensional affine algebraic set Z, we use the term "witness point set" to designate the intersections of Z with a generic linear space L^{N-k} of complementary dimension N - k (see, for example, [22]). L^{N-k} can be defined by k linear equations on C^N, each having random, complex coefficients, or, equivalently, it can be given in parametric form as x = x_0 + \sum_{i=1}^{N-k} v_i t_i, where x_0, v_i ∈ C^N are random and complex. In the case at hand, Z is a hypersurface given by a single polynomial equation f(x) = 0, so we intersect it with a one-dimensional linear space, L^1.

In the initial stage we compute a set of witness points on the hypersurface defined by f(x) = 0 and store clustered points according to the size of the cluster. More precisely, if d = deg(f), WitnessGenerate computes d witness points and partitions the set of witness points into W = {W_1, W_2, \ldots, W_m}, where each W_i is a set of clusters of size i.

Algorithm W = WitnessGenerate(f, x_0, v)
Input: f(x) polynomial in n variables with complex coefficients; x_0 and v represent a random line x(t) = x_0 + tv.
Output: W = {W_1, W_2, \ldots, W_m}, m = max_{i=1}^{d} μ_i, for all X ∈ W_i: #X = i.

The method to solve a polynomial in one variable is invoked once in WitnessGenerate. The algorithm RegularFactor assumes the witness points all have multiplicity one. It is invoked repeatedly in the main factorization algorithm.

Algorithm P = RegularFactor(f, x_0, v, W_1)
Input: f(x) polynomial in n variables with complex coefficients; x_0 and v represent a random line x(t) = x_0 + tv; W_1 set of t-values for which f(x(t)) = 0, with multiplicity one.
Output: P = {p_1, p_2, \ldots, p_k}, k irreducible factors of f with \sum_{i=1}^{k} deg(p_i) = #W_1.

We have two different implementations of RegularFactor:
(1) Using the monodromy grouping algorithm [24], certified with linear traces [25].
(2) Applying linear traces to enumerated factors [5,6,19].
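The partitioning step in WitnessGenerate can be sketched as follows (our own simplistic greedy clustering with a fixed tolerance; the actual implementation is more careful about choosing tolerances):

```python
from collections import defaultdict

def partition_witness_points(roots, tol=1e-4):
    """Sketch of the partitioning step of WitnessGenerate: greedily cluster
    nearby t-values, then return W[i] = list of clusters of size i."""
    clusters = []
    for z in roots:
        for c in clusters:
            if abs(z - c[0]) < tol:   # close to an existing cluster center
                c.append(z)
                break
        else:
            clusters.append([z])      # start a new cluster
    W = defaultdict(list)
    for c in clusters:
        W[len(c)].append(c)
    return dict(W)

# A triple cluster near t = 2 plus two simple witness points.
ts = [0.5, -1.3, 2.0, 2.0 + 1e-6, 2.0 - 1e-6j]
W = partition_witness_points(ts)
```

On this input the result has two clusters of size one and one cluster of size three, mirroring the structure W = {W_1, W_3}.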
Fig. 1. Illustration of the combinatorial method to factor a cubic, given by witness points 1, 2, 3. Every question is answered by a linear trace test.
Both methods apply path-following techniques. In each step of the path tracker we slightly move the random line, predict the location of the solutions, and feed the predicted solutions to Newton's method for correction. The main difference between the two implementations lies in the number of computed samples. With monodromy we compute witness point sets on many random lines, and take the witness points connected by paths as belonging to the same irreducible factor. Linear traces are then applied to certify the factorization predicted by the monodromy. In the enumeration method, we also use linear traces, but plainly enumerate all possible factorizations. For three witness points, the enumeration method runs as in Fig. 1. Each test is answered by computing a linear trace and comparing the value at the linear trace with the sum directly computed from the samples. The largest number of tests in this algorithm occurs when the factor witnessed by W_1 is irreducible, and equals 2^{w-1} - 1, where w = #W_1.

After the grouping of the witness points along the irreducible factors, we can apply interpolation techniques to find symbolic expressions for the polynomials. In our implementation we postponed all interpolation to the end, because this stage is the most time consuming and sometimes also not really necessary.

After WitnessGenerate and RegularFactor we have all irreducible factors of multiplicity one. To build the higher multiplicity factors, we propose to take random combinations of all partial derivatives. Each differentiation cuts the multiplicity by one and the number of solutions in the cluster drops accordingly. To process W_i, f is differentiated i - 1 times, yielding g := D^{(i-1)} f. The routine RefineCluster takes the center of each clustered set of W_i as an initial approximation for root refinement with g. Thereafter we can apply RegularFactor on g and the reduced set W_i as before.
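The enumeration method and the linear trace test can be sketched on a toy example (our own construction: f(x, y) = (x^2 − y)(x + y − 1), sliced by the moving line y = 2x + s; the sum of a true factor's witness points is an affine linear function of s):

```python
import math
from itertools import combinations

# On the line y = 2x + s the witness x-values are the roots of
# (x^2 - 2x - s)(3x + s - 1): namely 1 +/- sqrt(1+s) and (1-s)/3.
def witness(s):
    return [1 + math.sqrt(1 + s), 1 - math.sqrt(1 + s), (1 - s) / 3]

def is_linear_trace(subset, samples=(0.0, 1.0, 2.0)):
    """Linear trace test: the sum over a true factor's witness points is an
    affine linear function of s, so its second difference vanishes."""
    t = [sum(witness(s)[i] for i in subset) for s in samples]
    return abs(t[0] - 2 * t[1] + t[2]) < 1e-8

def enumerate_factors(w):
    """Fig. 1 style enumeration: grow subsets containing the first unassigned
    witness point until the linear trace test certifies a factor."""
    points, factors = list(range(w)), []
    while points:
        first, rest = points[0], points[1:]
        for size in range(len(rest) + 1):
            hit = next((sorted({first, *c}) for c in combinations(rest, size)
                        if is_linear_trace(sorted({first, *c}))), None)
            if hit is not None:
                factors.append(hit)
                points = [p for p in points if p not in hit]
                break
    return factors
```

Here enumerate_factors(3) recovers the grouping {0, 1} (the quadratic factor x^2 − y) and {2} (the linear factor x + y − 1), and a mixed subset such as {0, 2} fails the trace test.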
With the witness points grouped according to the factors, we can apply interpolation methods, e.g., using traces [20], to obtain symbolic representations of the factors.

3. How clusters of zeroes spread out under differentiation

At the innermost level of our routine, we have reduced the problem to following a multiple root of a complex polynomial in one variable. Recall that if h(z) has a root
of multiplicity μ at a point z_0, then its derivative h'(z) has a root of multiplicity μ - 1 at z_0, and so h^{(\mu-1)}(z) has a nonsingular root at z_0. While a nonsingular root is much easier to compute accurately than a multiple root, we must be concerned about the numerical stability of the differentiation. This problem manifests itself in the way that the roots away from z_0 move under differentiation. Ignoring the roots exactly at z_0, it may well happen that h'(z) has at least one root closer to z_0 than any root of h(z), and after μ - 1 derivatives, some root may come so close to z_0 as to be numerically indistinguishable from z_0 itself.

For a simple example showing the problem, consider h(z) = z^{\mu}(z - 1)^{d-\mu}. The root z = 0 occurs with multiplicity μ, with the d - μ remaining roots at z = 1. The derivative h'(z) has z = 0 as a root of multiplicity μ - 1, but it also has a root at z = μ/d, which for large d can be near zero. If we need to differentiate μ - 1 times, this can lead to a serious problem.

Algorithm Q = Factor(f)
Input: f(x) polynomial in n variables with complex coefficients.
Output: Q = { (q_i, μ_i) | i = 1, 2, \ldots, N }, irreducible factors q_i with multiplicities μ_i.

x_0 := Random(n, C);                      [choose n random numbers in C]
v := Random(n, C);                        [v is direction of line x(t) = x_0 + tv]
W := WitnessGenerate(f, x_0, v);          [find witness points]
Q := ∅;                                   [Q will collect all factors]
P := RegularFactor(f, x_0, v, W_1);       [find regular factors]
for all p_i ∈ P do
  Q := Q ∪ (p_i, 1);                      [collect multiplicity one factors]
end for;
D := v_1 ∂/∂x_1 + ... + v_n ∂/∂x_n;       [differential along direction v]
g := f;                                   [g is (μ-1)-th differential of f, μ = 1]
for μ = 2, 3, \ldots, #W do               [construct multiplicity μ factors]
  g := Dg;                                [differentiate so that g = D^{(μ-1)} f]
  W_μ := RefineCluster(g, x_0, v, W_μ);   [refine center of clusters]
  P := RegularFactor(g, x_0, v, W_μ);     [find regular factors]
  for all p_i ∈ P do
    Q := Q ∪ (p_i, μ);                    [collect multiplicity μ factors]
  end for;
end for;
return Q.
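Specialized to one variable, the loop above can be sketched as follows (our own toy version: clusters are detected by a naive distance threshold, and each cluster's center is refined by Newton's method on the appropriate derivative, where the root is nonsingular):

```python
import numpy as np

def factor_by_multiplicity(coeffs, tol=1e-3):
    """Univariate analogue of Factor: cluster the roots of h; for a cluster
    of size mu, refine its center by Newton's method on the (mu-1)-th
    derivative.  A sketch only: assumes well-separated clusters."""
    clusters = []
    for z in np.roots(coeffs):
        for c in clusters:
            if abs(z - c[0]) < tol:
                c.append(z)
                break
        else:
            clusters.append([z])
    result = []
    for c in clusters:
        mu = len(c)
        g = np.polyder(np.asarray(coeffs, dtype=complex), mu - 1)
        z = np.mean(c)                     # center of the cluster
        for _ in range(50):                # Newton refinement on g
            z = z - np.polyval(g, z) / np.polyval(np.polyder(g), z)
        result.append((z, mu))
    return result

# h(z) = (z - 0.5)(z + 2)^2 (z - 1)^3: multiplicities 1, 2, 3.
h = np.poly([0.5, -2, -2, 1, 1, 1])
found = sorted(factor_by_multiplicity(h), key=lambda rm: rm[1])
```

Each root is recovered together with its multiplicity, and the refined centers are far more accurate than the raw clustered roots.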
The problem is further exacerbated if, due to numerical roundoff in its representation, we begin with ĥ(z), nearby to h(z), having a cluster of μ roots near z_0. After differentiation, we would like to have an ĥ'(z) with a cluster of μ - 1 roots near z_0 and all other roots far away, but as seen from the above example, this may not always be the case. The following result gives some bounds on the behavior of the roots under differentiation and helps in guiding the choice of how many digits of precision we should use in the implementation of our algorithm. The result follows from a very
special case of a beautiful, but somewhat intricate, classical result of Marden and Walsh [11, Theorem 21.1] about the geometry of the zeroes of the derivative of a polynomial. For the convenience of the reader, we include a self-contained proof of the result we need. As a preliminary step, we derive a simple result on sums of complex numbers.

Lemma 3.1. Let u_1, \ldots, u_{\mu} denote μ > 0 complex numbers satisfying |u_i - ω| < r, where ω and r are real numbers satisfying ω > r > 0. Then

\left| \sum_{i=1}^{\mu} \frac{1}{u_i} \right| > \frac{\mu}{\omega + r}. (4)
Proof. Write each u_i in polar form, r_i e^{\sqrt{-1}\,\theta_i}. Since ω > r, the real parts of each 1/u_i are positive, and we have from the triangle inequality that

\left| \sum_{i=1}^{\mu} \frac{1}{u_i} \right| \ge \mathrm{Re} \sum_{i=1}^{\mu} \frac{1}{u_i} = \sum_{i=1}^{\mu} \frac{1}{r_i} \cos(-\theta_i). (5)

Note that for a fixed r_i, the smallest value of the positive number (1/r_i)\cos(-\theta_i) for |u_i - ω| < r occurs at the boundary |u_i - ω| = r. It is a simple calculus problem that the minimum of the real part of 1/u, for u satisfying |u - ω| = r, occurs when u = ω + r.

Theorem 3.2 (Marden and Walsh [11]). Let h(z) be a degree d polynomial of one complex variable. Assume that h(z) has μ zeroes in the disk Δ_r(z_0) := {z ∈ C : |z - z_0| ≤ r} and d - μ zeroes in the region {z ∈ C : |z - z_0| ≥ R}, where R > r. Then, if μ(R + r) > 2dr, it follows that h'(z) has μ - 1 roots in Δ_r(z_0) and all the remaining d - μ roots in the region

\left\{ z ∈ C : |z - z_0| \ge \frac{\mu}{d}(R + r) - r \right\}. (6)

Proof. By translation we can assume without loss of generality that z_0 = 0. In this case we abbreviate Δ_r(0) to Δ_r. Without loss of generality we assume that h(z) is monic, i.e., that the highest order term of h(z) is z^d. Note that to prove the theorem, it is enough to prove the following assertion.

Claim 1. Given a real number ω satisfying R > ω > r, it follows that if

\omega < \frac{\mu}{d}(R + r) - r, (7)

then h' has exactly μ - 1 zeroes in the disk Δ_ω.

The assumptions of Theorem 3.2 and Claim 1 imply that we may write h(z) = p(z)q(z), where p(z) is a monic polynomial of degree μ, which has the same zeroes, with multiplicities, that h(z) has in Δ_r, and q(z) is a monic polynomial with all roots at least distance R > ω > r from the origin. The polynomial p'(z)q(z) has the same zeroes in
Δ_r as p'(z). Therefore, it suffices to check that, for ω > r satisfying Eq. (7), p'(z)q(z) and h'(z) have the same number of zeroes, counting multiplicities, in Δ_ω. By Rouché's Theorem, e.g., [11, p. 2], we know that p'(z)q(z) and h'(z) = p'(z)q(z) + p(z)q'(z) have the same number of zeroes in Δ_ω if

|p(z)q'(z)| < |p'(z)q(z)| (8)

for z satisfying |z| = ω. Therefore, to prove Claim 1, it suffices to show that Eq. (7) implies Eq. (8). Since h(z) has no zeroes satisfying |z| = ω, it suffices to show that Eq. (7) implies

\left| \frac{q'(z)}{q(z)} \right| < \left| \frac{p'(z)}{p(z)} \right|. (9)

Letting z_1, \ldots, z_{\mu} denote the zeroes of p(z), each listed a number of times equal to its multiplicity, and letting w_1, \ldots, w_{d-\mu} denote the zeroes of q(z), each listed a number of times equal to its multiplicity, we see that Eq. (9) is equivalent to

\left| \sum_{j=1}^{d-\mu} \frac{1}{z - w_j} \right| < \left| \sum_{i=1}^{\mu} \frac{1}{z - z_i} \right|. (10)

Consequently, we prove Claim 1, and hence the theorem, by showing that for |z| = ω,

\left| \sum_{j=1}^{d-\mu} \frac{1}{z - w_j} \right| \le \frac{d - \mu}{R - \omega} < \frac{\mu}{\omega + r} \le \left| \sum_{i=1}^{\mu} \frac{1}{z - z_i} \right|. (11)

The leftmost inequality in expression (11) follows from the triangle inequality and the fact that for z satisfying |z| = ω we have |z - w_j| ≥ R - ω. The middle inequality is a simple consequence of Eq. (7). To complete the proof, we proceed as follows. For ω fixed, let z_* = \omega e^{\sqrt{-1}\,\theta_*} denote the point that minimizes the rightmost side of expression (11). Then,

\left| \sum_{i=1}^{\mu} \frac{1}{z - z_i} \right| \ge \left| \sum_{i=1}^{\mu} \frac{1}{\omega e^{\sqrt{-1}\,\theta_*} - z_i} \right| = \left| \sum_{i=1}^{\mu} \frac{1}{\omega - z_i e^{-\sqrt{-1}\,\theta_*}} \right|. (12)

Since |z_i| ≤ r, we see that Lemma 3.1 applies, and the result is shown.

We denote the kth derivative of h by h^{(k)}(z).

Corollary 3.3. Let h(z) be a degree d polynomial of one complex variable. Assume that h(z) has μ zeroes in the disk Δ_r(z_0) := {z ∈ C : |z - z_0| ≤ r} and d - μ zeroes in the region {z ∈ C : |z - z_0| ≥ R}, where R > r. Assume further that k ≤ μ - 1. Then, if R/r ≥ 2\binom{d}{\mu}/(d - \mu + 1) - 1, it follows that h^{(k)}(z) has μ - k roots in Δ_r(z_0) and all the remaining d - μ roots in the region

\left\{ z ∈ C : |z - z_0| \ge \frac{\binom{\mu}{k}}{\binom{d}{k}}(R + r) - r \right\}. (13)
Proof. Use Theorem 3.2 k times.

Suppose that h(z) is an approximation to an underlying exact polynomial having a root of multiplicity μ. If we increase the precision used to evaluate h(z) sufficiently, it will have a tight cluster of roots near the multiple root of the exact polynomial. Refining the precision until the radius r of the cluster is within the bound given by Corollary 3.3 assures us that the roots of h^{(k)}(z) remain clustered. In particular, letting r denote the radius of a disk Δ_r(z_0) around z_0, the multiplicity weighted average of the roots, which contains the cluster of roots, and letting R denote the distance from z_0 to the first root outside of Δ_r(z_0), we have that the conservative estimate

\frac{R}{r} \ge \frac{2\binom{d}{\mu}}{d - \mu + 1} (14)

guarantees that h^{(k)}(z), for all k ≤ μ - 1, has exactly μ - k zeroes in Δ_r(z_0). For example, for a polynomial of degree 75, with roots of at most multiplicity 10, log_{10}(R/r) ≥ 10.41 is sufficient. For a given d, the worst case happens when μ is approximately d/2. Thus log_{10}(R/r) ≥ 57.3 is sufficient for a degree 200 polynomial having a worst case root of multiplicity 100. We may also regard log_{10}(R/r) as a measure of the number of decimal places needed to accurately carry out the computations. Using Stirling's approximation, we see that the number of decimal places needed in case μ ≈ d/2, and thus for all μ with the given d, is approximately

\log_{10}(R/r) \ge 0.3\, d. (15)
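The quoted digit counts can be reproduced directly (a quick check of our own, assuming estimate (14) in the form R/r ≥ 2·C(d, μ)/(d − μ + 1)):

```python
import math

def digits_needed(d, mu):
    """log10 of the right-hand side of estimate (14): 2*C(d,mu)/(d-mu+1),
    read as the number of decimal digits the ratio R/r must span."""
    return math.log10(2 * math.comb(d, mu) / (d - mu + 1))

# Degree 75, multiplicity at most 10: about 10.4 digits.
d75 = digits_needed(75, 10)
# Worst case mu ~ d/2: degree 200, multiplicity 100 needs about 57.3 digits,
# consistent with the Stirling estimate log10(R/r) >= 0.3*d.
d200 = digits_needed(200, 100)
```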
To determine how to set the precision, we rearrange inequality (14) into

r \le \frac{d - \mu + 1}{2\binom{d}{\mu}}\, R. (16)
We consider the quantities at the right of (16) as fixed, while the radius r of the cluster is variable. The right-hand side of (16) imposes a bound on the accuracy of the cluster, i.e., the number of decimal places all points in the cluster need to agree on. In a straightforward but effective manner we can apply Newton's method locally to all points in the cluster, adapting the precision until the radius r is small enough.

4. Computational experiments

In this section, we report numerical results obtained with PHCpack [31]. This software package has recently been extended with facilities to handle positive dimensional solution components (see [28]). In particular, invoking the executable with the option -f gives access to the capability to factor multivariate polynomials. The black-box solver of PHCpack now applies the numerical factorization methods when given on input a single polynomial in several variables. All our experiments were done with standard double precision arithmetic.
4.1. A numerical implementation

Specialized versions of the path tracking routines in PHCpack have been built to deal with the case of homotopies between systems of one polynomial equation in several variables:

f(z(t, \lambda)) = 0, (17)
z(t, \lambda) = (x_0 + tv)\lambda + (y_0 + tw)(1 - \lambda), (18)

where λ defines the movement from the line x(t) = x_0 + tv to the line y(t) = y_0 + tw, as λ moves from 1 to 0, i.e., z(t, λ) = x(t)λ + y(t)(1 - λ). One motivation for this specialized code is to save linear algebra operations. Without the parametric representation of a general line x(t) = x_0 + tv, we would have to consider polynomial systems of n equations consisting of one polynomial (the polynomial we wish to factor) and n - 1 hyperplanes to cut out the line. Now we can use, as in [20], the method of Weierstrass^1 (also known as Durand–Kerner) to solve f(x(t)) = 0. See [32] for methods to locate zero clusters of univariate polynomials. The other motivation is that we hope to have a better understanding and control of the numerical stability of the algorithms.

To give an impression of the numerical difficulty of computing witness points, consider the substitution of x(t) = x_0 + tv into the polynomial f(x) = x^d (in one variable x). Application of the binomial theorem gives

f(x) = (x_0 + tv)^d = \sum_{k=0}^{d} \binom{d}{k} x_0^{d-k} v^k t^k = 0, (19)

which must be solved for t. Assuming that the magnitudes of x_0 and v are approximately one, the leading coefficient is also of magnitude one, but for degree d = 30, the largest coefficient (k = 15) has a magnitude occupying nine decimal places. Such large ranges in coefficients are known to cause numerical sensitivity. Extrapolating from this simple case to the general case of computing all witness points, solving f(x(t)) = f(x_0 + tv) = 0 for t, we warn that even if the original coefficients of f are nice, and if we choose the entries of x_0 and v on the complex unit circle, for large degrees, the univariate polynomial in t may have coefficients that vary greatly in magnitude. To deal with such polynomials numerically, higher working precision may be required. This also implies that the path tracking in the monodromy phase of the algorithm may need higher precision.
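The coefficient growth in Eq. (19) is easy to verify (our own quick check):

```python
import math

# Coefficient growth in Eq. (19): with |x0| ~ |v| ~ 1, the t^k coefficient
# of (x0 + t*v)^d is governed by the binomial coefficient C(d, k).
d = 30
largest = max(math.comb(d, k) for k in range(d + 1))
# The middle coefficient C(30, 15) = 155117520 indeed occupies nine decimal
# places, while the leading coefficient has magnitude about one.
```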
In previous work on polynomial systems having positive dimensional solution sets of higher multiplicity [28], we already extended the path-tracking routines in PHCpack to multi-precision. For such general polynomial systems, multi-precision path tracking tends to be computationally expensive, but we expect more reasonable execution times when addressing homotopies of only one polynomial equation.

^1 In [15] this method is qualified as "quite effective and increasingly popular". Convergence is global and quadratic in the limit [17].
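For reference, the Weierstrass/Durand–Kerner iteration mentioned above can be sketched in a few lines (our own minimal version, without the stopping criteria and cluster safeguards a production solver needs):

```python
import numpy as np

def durand_kerner(coeffs, iters=200):
    """Weierstrass/Durand-Kerner iteration: simultaneous approximation of
    all roots of a monic univariate polynomial (coefficients given highest
    degree first)."""
    n = len(coeffs) - 1
    # Customary choice of distinct, non-real starting points.
    x = (0.4 + 0.9j) ** np.arange(n)
    for _ in range(iters):
        for i in range(n):
            others = np.prod(x[i] - np.delete(x, i))
            x[i] = x[i] - np.polyval(coeffs, x[i]) / others
    return x

# (z - 1)(z - 2)(z - 3) = z^3 - 6z^2 + 11z - 6
roots = np.sort_complex(durand_kerner([1.0, -6.0, 11.0, -6.0]))
```

Updating each approximation in place (Gauss–Seidel style) is the usual accelerated variant; the iteration converges quadratically near the roots.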
Our working precision determines the accuracy of the algorithm RegularFactor. For instance, consider the polynomial f(x, y) = xy + 10^{-16}. Working with standard double precision floating point numbers, the constant in f(x, y) will be ignored: a loop showing that f is irreducible will not be found, and the validation with linear traces will confirm the breakup into the factors x and y. On the other hand, doubling the precision will show that f is irreducible. In principle, the groupings of the monodromy algorithm can deal with approximate coefficients if we set the working precision according to the accuracy level of the coefficients. However, this scheme only works for sufficiently low degrees, because more precision is usually needed to evaluate polynomials of high degree.

To obtain symbolic expressions of the polynomial factors, we applied the interpolation methods using traces, developed and implemented for any degree and any number of variables, using multi-precision arithmetic if needed, as reported in [25]. Multivariate Newton interpolation with divided differences is outlined in [9]. Since the algorithms for multivariate Newton interpolation involve a recursive application of the classical one variable case, the error analysis in [7, pp. 110–111] shows the relation between the errors on the coefficients, the degree of the polynomial, and the working precision. In particular, the error Δc on the coefficients is bounded by

|\Delta c| \le \left( \frac{1}{(1 - 3u)^n} - 1 \right) |L|\, |f|, (20)

where u is an upper bound on the roundoff in one arithmetical operation (depending on the working precision), n is the degree of the polynomial, L is the product of n bidiagonal matrices of dimension n (expressing the computation of divided differences in matrix–vector form), and f is the vector of function values at the sample points. Note, however, that usually we do not need the symbolic representation of the factors to work with them.
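The loss of the constant term in double precision can be demonstrated immediately (our own illustration):

```python
# In double precision the tiny constant in f(x, y) = x*y + 1e-16 is lost
# whenever |x*y| is of order one, so the polynomial behaves numerically
# like the reducible product x*y.
def f(x, y):
    return x * y + 1e-16

lost = f(1.0, 1.0)    # 1.0 + 1e-16 rounds back to exactly 1.0
kept = f(0.0, 5.0)    # the constant only survives when x*y is tiny
```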
For instance, with the witness points we can determine whether a point satisfies a factor, via the homotopy membership test of [23].

4.2. Singularities of Stewart–Gough platforms

A mechanical device of considerable interest in mechanical engineering is the Stewart–Gough platform, consisting of a moving platform supported above a stationary base by six legs. One end of the ith leg connects to the base via a ball joint centered on point b_i (given in the base coordinate system) and the other end connects to the platform via a ball joint at point a_i (given in platform coordinates). The length of the leg, L_i, is controlled by a linear actuator. A good general reference discussing this device and its relatives is Merlet [14]. For fixed leg lengths, the device is generally rigid, but at singular configurations, the rigidity of the device is lost. That is, even though the leg lengths are held constant, the platform has at least one combination of velocity and angular velocity that, up to first order, is unconstrained. In the design of such a device, an understanding of its singularities is crucial; see [12,13] and their references for background.

The condition for singularity can be derived as follows. We represent the position of the platform by p ∈ C^3 and the orientation by a quaternion q ∈ P^3. Letting v_i be the
vector, in base coordinates, from base point b_i to the corresponding platform point a_i, we have

v_i := -b_i + p + R(q)a_i = -b_i + p + q a_i q' / (q q'), (21)

where R(q) is the 3×3 rotation matrix giving the same rotation to a vector w as the quaternion operation q w q'/(q q'). The squared length of leg i is L_i^2 = v_i · v_i, so

L_i \dot{L}_i = v_i \cdot \dot{v}_i = v_i \cdot (\dot{p} + \omega \times (R a_i)), (22)

where "·" is the vector inner product, × is the vector cross product, and ω is the angular velocity vector. Rewriting R = \hat{R}/Q with

Q := q q' = q_0^2 + q_1^2 + q_2^2 + q_3^2 (23)
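As a sanity check on the quaternion conventions of Eqs. (21)–(23), the following sketch (our own, with hypothetical helper names) implements the rotation q w q'/(q q') and verifies that it preserves lengths even for a non-unit quaternion, precisely because of the division by Q:

```python
import numpy as np

def quat_mul(a, b):
    """Hamilton product of quaternions (q0, q1, q2, q3)."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                     w1*x2 + x1*w2 + y1*z2 - z1*y2,
                     w1*y2 - x1*z2 + y1*w2 + z1*x2,
                     w1*z2 + x1*y2 - y1*x2 + z1*w2])

def rotate(q, w):
    """R(q) w = q w q' / (q q') for a (not necessarily unit) quaternion q,
    with q' the conjugate and Q = q q' = q0^2 + q1^2 + q2^2 + q3^2."""
    Q = np.dot(q, q)
    q_conj = q * np.array([1.0, -1.0, -1.0, -1.0])
    v = quat_mul(quat_mul(q, np.concatenate(([0.0], w))), q_conj) / Q
    return v[1:]

rng = np.random.default_rng(3)
q = rng.standard_normal(4)
w = rng.standard_normal(3)
r_w = rotate(q, w)
# A rotation preserves length even for non-unit q, thanks to dividing by Q.
```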
and substituting from Eq. (21), we may transform Eq. (22) into

Q L_i \dot{L}_i = (Q(p - b_i) + \hat{R} a_i) \cdot \dot{p} + ((\hat{R} a_i) \times (p - b_i)) \cdot \omega. (24)

Letting J be the matrix whose ith column is

J_i = \begin{bmatrix} Q(p - b_i) + \hat{R} a_i \\ (\hat{R} a_i) \times (p - b_i) \end{bmatrix}, \quad i = 1, \ldots, 6, (25)
the singularity condition is 0 = Jw for some w ≠ 0, or, equivalently, det J = 0. Since the elements of \hat{R} and Q are quadratic polynomials in q, one sees that det J is a polynomial in p, q, a_i, b_i. Taking all of these as variables, the first three rows of J are cubic and the last three are quadric, so det J is a homogeneous polynomial of degree 1728 in 42 variables. Not much understanding is likely to result from analyzing such a complicated object, nor could we begin to deal with it numerically. However, considerable insight can be gained by studying cases where some variables are taken as given, as has been done in [12,18]. In the next few paragraphs, we study such cases, some never before published, and use our numerical algorithm to factor det J.

In all of the following examples, we expanded det J into monomials for convenient input to our computer code. This made automatic computation of derivatives very simple, but it is a very inefficient way to evaluate the polynomial. It would be much more efficient and accurate to evaluate the matrix entries numerically and then evaluate the determinant by reducing the matrix to triangular form. Therefore, the computation times reported here are far from the best that could be achieved. The examples serve to show how the algorithm works on fully expanded polynomials.

4.2.1. General platform, fixed position

For the general platform, we give p, a_i and b_i, i = 1, \ldots, 6, as random, complex values; that is, we choose a generic Stewart–Gough platform at a generic position, and look for singularities arising from rotations of the platform. The factorization of one such example will indicate, with probability one, the form of the factorization for
Table 1
Cluster radius r versus distance R to the nearest root outside the cluster, for the first case of general platform, fixed position. There are three roots in every cluster.

Cluster    r          R          R/r
One        1.7E-05    3.4E-01    2.0E+04
Two        4.9E-06    1.7E-01    3.6E+04

Table 2
Execution times for the first case of general platform, fixed position

Elapsed user CPU times on 2.4 GHz Windows XP
1. Monodromy grouping:           0 h  6 min 40 s 469 ms
2. Linear traces certification:  0 h  0 min 30 s 672 ms
3. Interpolation at factors:     1 h 41 min 53 s  78 ms
4. Multiplication validation:    0 h  0 min  8 s 156 ms
Total time for all 4 stages:     1 h 49 min 12 s 391 ms
almost all^2 Stewart–Gough platforms. In this case, det J is a homogeneous polynomial of degree 12 in q = (q_0, q_1, q_2, q_3). We find numerically that a generic line meets det J = 0 in six regular points and two singular points of multiplicity 3. The regular points form one factor of degree six. Using differentiation to remove the multiplicity, we find that the two singular points form one quadratic factor, and interpolating that factor shows it to be precisely Q from Eq. (23). That is,

det J = F_1(q) Q^3, (26)

where F_1(q) is a sextic. Since a quaternion with zero norm does not represent a valid rotation matrix, the factor Q^3 = 0 is not of physical significance.

The computed factorization is certified with linear traces by comparing the value at the linear trace with the calculated sum of the witness points on each factor. The maximal difference in the comparison for the two factors is 2.049E-13. If we multiply the interpolated factors and take the difference with the original polynomial, then the largest norm of the coefficients of the difference polynomial is 1.919E-05, as explained by roundoff in the interpolation of high degree polynomials.

The data for the cluster analysis, comparing cluster radius r and distance R to the nearest other root outside the cluster, is given in Table 1. For d = 12 and μ = 3, the right-hand side of estimate (14) evaluates to 44. This bound is clearly smaller than the observed ratios R/r ≈ 10^4, so the initial approximations for the multiple root are accurate enough for the differentiation process.

Table 2 lists the execution times for each stage in the factorization: monodromy grouping, certification with linear traces, interpolation in the factors, and finally, the

^2 The exceptions will be an algebraic subset of the space of all platform devices as parameterized by p, a_i, b_i, i = 1, \ldots, 6.
Table 3
Cluster radius r versus distance R to the nearest root outside the cluster, for the second case of planar base and platform, fixed position. There are three roots in every cluster.

Cluster    r          R          R/r
One        6.2E-05    2.4E-01    3.8E+04
Two        4.8E-05    6.0E-01    1.2E+04

Table 4
Execution times for the second case of planar base and platform, fixed position

Elapsed user CPU times on 2.4 GHz Windows XP
1. Monodromy grouping:           0 h 17 min 34 s 735 ms
2. Linear traces certification:  0 h  0 min 27 s 359 ms
3. Interpolation at factors:     1 h 24 min 45 s 766 ms
4. Multiplication validation:    0 h  0 min  8 s 172 ms
Total time for all 4 stages:     1 h 42 min 56 s  32 ms
comparison between the product of the factors with the original polynomial. The evaluation of a polynomial of degree 12 in four variables with 910 terms is responsible for the dominance of stages one and three in the overall execution time.

4.2.2. Planar base and platform, fixed position

This is the same as the former case, except that the third component of each of a_i, b_i, i = 1, \ldots, 6, is zero, meaning that the points of the base are all in a common plane, as are the points of the platform. Now det J is still homogeneous of degree 12, and it still factors into two pieces: one irreducible factor of degree six, and the quadratic factor having multiplicity three.

The maximal difference in certifying with linear traces is 4.147E-11, i.e., the difference between the calculated sum of the roots and the value predicted by the linear trace of the factor is 4.147E-11, showing the influence of roundoff in evaluating a polynomial of degree 12 in four variables with 910 terms. The influence of roundoff in the comparison between the original and the product of the interpolated factors is even more obvious: we see 9.483E-05 as the maximal norm of the difference in the coefficients.

In Table 3 we summarize the results of the cluster analysis for two clusters which contain witness points of multiplicity three. As in Table 1, we can make the same observations, and conclude that R/r ≈ 10^4 > 44 is safe for the differentiations. Table 4 shows the execution times. The algorithm RegularFactor takes so much time because of the cost of evaluating a polynomial of degree 12 in four variables with 910 terms.
A.J. Sommese et al. / Theoretical Computer Science 315 (2004) 651 – 669
665
Table 5
Cluster radius r versus distance R to the nearest root outside the cluster, for the third case of planar base and platform, parallel planes. There are three roots in the first cluster, and five roots in each of the other two clusters.

Cluster    r          R          R/r
One        5.1E-07    1.0E+00    2.0E+06
Two        7.3E-04    3.4E-01    4.7E+02
Three      4.0E-03    7.2E-01    1.8E+02
4.2.3. Planar base and platform, parallel planes
In this case, which was studied in [12,18], we consider a device with planar base and platform in a configuration with the two planes parallel to each other. The condition of parallelism means that the platform is rotated only about the third axis, so q1 = q2 = 0. The position, p, is now left as a variable, and det J becomes cubic in p, homogeneous of degree 12 in (q0, q3), and of degree 15 in (p, q) together. One does not know a priori that the contribution of p will factor out separately, but in fact it does. The computed factorization is

    det J = a p3^3 (q0 + b q3)(q0 + c q3)(q0 + i q3)^5 (q0 − i q3)^5,    (27)

where a, b, c are constants that depend on the choice of ai, bi. This result is in agreement with [18], when we consider that, over the complex numbers, any homogeneous polynomial in two variables breaks into linear factors. Notice that the multiplicity-five factors are points on Q from Eq. (23) and therefore are not of physical significance. The condition p3 = 0 means that the two planes coincide, which is clearly singular since then the legs provide no support perpendicular to the plane. Otherwise, singularity does not depend on position at all, as the other factors depend only on orientation. This fact is used to advantage in [18] to characterize the singularities of the planar–planar Stewart–Gough platforms.
The numerical results give a maximal difference over all factors between the computed sum of roots and the value of the linear trace as 8.047E-08. When we interpolate the factors and compare the multiplied factors with the original polynomial, we find 3.599E-07 as the highest norm of the difference between the coefficients. In Table 5 we summarize the results of the cluster analysis for the three factors occurring with multiplicities three, five, and five. Observe that the cluster radius grows as the multiplicity gets larger. The bound in the estimate of the right-hand side of (14) now evaluates to 546, using d = 15 and multiplicity 5. While the ratio R/r now lies much closer to this bound, numerically we can still apply the differentiation procedure successfully. Table 6 shows the execution times.
Since there is a single cluster that contains three witness points, we may immediately conclude that it represents a linear factor of multiplicity three, namely p3^3, without further testing. In contrast, there are two clusters of size five, so we must apply monodromy or trace tests to determine whether they represent an irreducible quadratic or factor into two linears. While the other polynomials of Sections 4.2.1 and 4.2.2 each have 910 terms, this polynomial is much
Table 6
Execution times for the third case of planar base and platform, parallel planes

Elapsed user CPU times on 2.4 GHz Windows XP
1. Monodromy grouping:           1 min 13 s 656 ms
2. Linear traces certification:  0 min  3 s 891 ms
3. Interpolation at factors:     0 min  4 s 734 ms
4. Multiplication validation:    0 min  1 s 657 ms
Total time for all 4 stages:     1 min 23 s 938 ms
Table 7
Execution times for the factorization using monodromy, compared to enumerating factors, for the three cases of the singularities of the Stewart–Gough platform

User CPU times on 2.4 GHz Windows XP
Case    Monodromy              Enumeration
1        6 min 40 s 460 ms     40 s 750 ms
2       17 min 34 s 735 ms     31 s 657 ms
3        1 min 13 s 656 ms      3 s   0 ms
Table 8
Execution times for the factorization using monodromy, compared to enumerating the irreducible factors, for three very sparse random polynomials of increasing degrees

User CPU times on 2.4 GHz Windows XP
Degree    Monodromy       Enumeration
10         5 s 484 ms         312 ms
15         8 s 187 ms     1 s 453 ms
16        16 s  63 ms     2 s 875 ms
sparser: only 24 terms. The sparsity reduces the cost of evaluation and explains why this polynomial is factored much faster than the other ones.

4.3. Monodromy compared to the enumeration method
In this section, we show that for the polynomials of modest degrees that we factored in this paper, the enumeration method outperforms the monodromy algorithm. In Table 7 we list execution times for the three cases treated above. We see the enumeration method as a clear winner. Recall that the highest degree of an irreducible factor is six, so only relatively few tests are needed in the enumeration method.
Irreducible polynomials are the most difficult for the enumeration method. In Table 8 we compare execution times again, but now for random irreducible polynomials of five
monomials, and for increasing degrees. We see that the ratio of monodromy time to enumeration time drops from 18 to about 6 as the degree increases.
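The fact used in Section 4.2.3 that, over the complex numbers, any homogeneous polynomial in two variables breaks into linear factors can be illustrated numerically by dehomogenizing. The form h below is a hypothetical degree-four example, not the actual orientation part of det J in (27):

```python
import numpy as np

# Hypothetical homogeneous form of degree 4 in (q0, q3):
#   h(q0, q3) = (q0 + 2 q3)(q0 - 3 q3)(q0 + i q3)(q0 - i q3).
# Dehomogenizing at q3 = 1 gives a univariate polynomial in q0 whose roots
# r_k recover the linear factors q0 - r_k q3.
true_roots = np.array([-2.0, 3.0, -1j, 1j])
coeffs = np.poly(true_roots)   # coefficients of h(q0, 1); real, since the
                               # complex roots occur in a conjugate pair
recovered = np.roots(coeffs)   # numerical splitting into linear factors
# Over C the splitting is always complete, including the conjugate pair
# (q0 + i q3)(q0 - i q3), which admits no real linear factors.
```

This is the mechanism by which the multiplicity-five quadratic factor in (27) splits into the two linear factors (q0 + i q3) and (q0 − i q3).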
5. Conclusions
In this paper, we show how the numerical factorization of a single polynomial in several variables with approximate complex coefficients can be accomplished by specializing continuation methods for the numerical irreducible decomposition of a polynomial system. In the case of a system of polynomials, components with multiplicity > 1 require the tracking of a singular path, but this difficult numerical task can be avoided in the specialization to a single polynomial. To do so, we symbolically differentiate the polynomial μ − 1 times, where μ is the multiplicity, thus replacing singular roots with nonsingular ones. In floating-point calculations, a multiple root becomes a cluster of nearly singular roots. Via a result of Marden and Walsh, we can estimate the precision needed to successfully apply differentiation to replace such clusters with one nonsingular root.
To illustrate the methods, we applied the algorithms in a study of singularities of Stewart–Gough platforms. We experienced that a numerical factorization (i.e., partition of the witness set) is usually less expensive and more numerically stable than the subsequent interpolation to obtain coefficients for the factor written as a polynomial. Since most questions can be answered via continuation of the witness set, the interpolation step can usually be omitted.
As the algorithms require repeated evaluations of the polynomial and its derivatives, a major factor in the cost of the method is the sparsity of the polynomials. The sparser the polynomials, the faster the evaluation and interpolation algorithms run. Since the polynomials we can factor with standard arithmetic have modest degrees, the enumeration methods of Galligo and Rupprecht [5,6,19] proved to be faster than the monodromy method in our tests. If one were to factor polynomials of higher degree, the speed advantage might reverse, due to the exponential growth of the number of cases the enumeration method may have to test.
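The passage from a multiple root to a floating-point cluster and back, as described above, can be reproduced in a small univariate experiment (a stand-in example, not the witness-set computation of the paper):

```python
import numpy as np

# f has a root of multiplicity mu = 3 at x = 1; floating-point root-finding
# returns it as a cluster of three nearby roots.
f = np.poly([1.0, 1.0, 1.0, 2.0])            # f(x) = (x - 1)^3 (x - 2)
roots = np.roots(f)
cluster = sorted(roots, key=lambda z: abs(z - 1.0))[:3]
radius = max(abs(z - 1.0) for z in cluster)  # cluster radius r, roughly eps^(1/3)

# Differentiating mu - 1 = 2 times replaces the triple root by a simple,
# well-conditioned root of f'' at the same location.
f2 = np.polyder(f, 2)
simple = min(np.roots(f2), key=lambda z: abs(z - 1.0))
err = abs(simple - 1.0)                      # far below the cluster radius
```

The gap between the cluster radius and the accuracy of the differentiated root is exactly what the Marden–Walsh estimate quantifies: it tells us how well separated a cluster must be before differentiation can safely replace it by one nonsingular root.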
Acknowledgements
We gratefully acknowledge the support of this work by the Volkswagen-Stiftung (RiP program at Oberwolfach). The first author thanks the Duncan Chair of the University of Notre Dame and the National Science Foundation; this material is based upon work supported by the National Science Foundation under Grant No. 0105653. The second author thanks the Department of Mathematics of the University of Illinois at Chicago and the National Science Foundation; this material is based upon work supported by the National Science Foundation under Grant No. 0105739 and Grant No. 0134611. The third author thanks General Motors Research and Development for their support. The authors thank André Galligo for bringing the work of [1] to their attention. Last but not least, we are grateful to the editors and referees for their suggestions to improve the paper.
References
[1] C. Bajaj, J. Canny, T. Garrity, J. Warren, Factoring rational polynomials over the complex numbers, SIAM J. Comput. 22 (2) (1993) 318–331.
[2] R.M. Corless, A. Galligo, I.S. Kotsireas, S.M. Watt, A geometric-numeric algorithm for factoring multivariate polynomials, in: T. Mora (Ed.), Proc. 2002 Internat. Symp. on Symbolic and Algebraic Computation (ISSAC 2002), ACM, New York, 2002.
[3] R.M. Corless, M.W. Giesbrecht, M. van Hoeij, I.S. Kotsireas, S.M. Watt, Towards factoring bivariate approximate polynomials, in: B. Mourrain (Ed.), Proc. 2001 Internat. Symp. on Symbolic and Algebraic Computation (ISSAC 2001), ACM, New York, 2001, pp. 85–92.
[4] R.M. Corless, E. Kaltofen, S.M. Watt, Hybrid methods, in: J. Grabmeier, E. Kaltofen, V. Weispfenning (Eds.), Computer Algebra Handbook, Springer, Berlin, 2002, pp. 112–125.
[5] A. Galligo, D. Rupprecht, Semi-numerical determination of irreducible branches of a reduced space curve, in: B. Mourrain (Ed.), Proc. 2001 Internat. Symp. on Symbolic and Algebraic Computation (ISSAC 2001), ACM, New York, 2001, pp. 137–142.
[6] A. Galligo, D. Rupprecht, Irreducible decomposition of curves, J. Symbolic Comput. 33 (5) (2002) 661–677.
[7] N.J. Higham, Accuracy and Stability of Numerical Algorithms, SIAM, Philadelphia, PA, 1996.
[8] Y. Huang, W. Wu, H.J. Stetter, L. Zhi, Pseudofactors of multivariate polynomials, in: C. Traverso (Ed.), Proc. 2000 Internat. Symp. on Symbolic and Algebraic Computation (ISSAC 2000), 2000, pp. 161–168.
[9] E. Isaacson, H.B. Keller, Analysis of Numerical Methods, Dover, New York, 1994 (Dover reprint of the 1966 Wiley edition).
[10] E. Kaltofen, Challenges of symbolic computation: my favorite open problems, J. Symbolic Comput. 29 (6) (2000) 891–919.
[11] M. Marden, The Geometry of the Zeroes of a Polynomial in a Complex Variable, Mathematical Surveys, Vol. 3, American Mathematical Society, New York, 1949.
[12] B. Mayer St-Onge, C.M. Gosselin, Singularity analysis and representation of the general Gough–Stewart platform, Internat. J. Robotics Res. 19 (3) (2000) 271–288.
[13] J.P. Merlet, Singular configurations of parallel manipulators and Grassmann geometry, Internat. J. Robotics Res. 8 (5) (1989) 45–56.
[14] J.P. Merlet, Parallel Robots, Kluwer Academic Publishers, Dordrecht, The Netherlands, 2000.
[15] V. Pan, Solving a polynomial equation: some history and recent progress, SIAM Rev. 39 (2) (1997) 187–220.
[16] V. Pan, Computation of approximate polynomial GCDs and an extension, Inform. Comput. 167 (2001) 71–85.
[17] L. Pasquini, D. Trigiante, A globally convergent method for simultaneously finding polynomial roots, Math. Comput. 44 (169) (1985) 135–149.
[18] F. Pernkopf, M.L. Husty, Singularity analysis of spatial Stewart–Gough platforms with planar base and platform, in: Proc. ASME Design Eng. Tech. Conf., Montreal, Canada, September 30–October 2, 2002.
[19] D. Rupprecht, Semi-numerical absolute factorization of polynomials with integer coefficients, manuscript.
[20] T. Sasaki, Approximate multivariate polynomial factorization based on zero-sum relations, in: B. Mourrain (Ed.), Proc. 2001 Internat. Symp. on Symbolic and Algebraic Computation (ISSAC 2001), ACM, New York, 2001, pp. 284–291.
[21] A.J. Sommese, J. Verschelde, Numerical homotopies to compute generic points on positive dimensional algebraic sets, J. Complexity 16 (3) (2000) 572–602.
[22] A.J. Sommese, J. Verschelde, C.W. Wampler, Numerical decomposition of the solution sets of polynomial systems into irreducible components, SIAM J. Numer. Anal. 38 (6) (2001) 2022–2046.
[23] A.J. Sommese, J. Verschelde, C.W. Wampler, Numerical irreducible decomposition using projections from points on the components, in: E.L. Green, S. Hoşten, R.C. Laubenbacher, V. Powers (Eds.), Symbolic Computation: Solving Equations in Algebra, Geometry, and Engineering, Contemporary Mathematics, Vol. 286, American Mathematical Society, Providence, RI, 2001, pp. 37–51.
[24] A.J. Sommese, J. Verschelde, C.W. Wampler, Using monodromy to decompose solution sets of polynomial systems into irreducible components, in: C. Ciliberto, F. Hirzebruch, R. Miranda, M. Teicher (Eds.), Application of Algebraic Geometry to Coding Theory, Physics and Computation, Proc. NATO Conf., Eilat, Israel, February 25–March 1, 2001, Kluwer Academic Publishers, Dordrecht, 2001, pp. 297–315.
[25] A.J. Sommese, J. Verschelde, C.W. Wampler, Symmetric functions applied to decomposing solution sets of polynomial systems, SIAM J. Numer. Anal. 40 (6) (2002) 2026–2046.
[26] A.J. Sommese, J. Verschelde, C.W. Wampler, A method for tracking singular paths with application to the numerical irreducible decomposition, in: M.C. Beltrametti, F. Catanese, C. Ciliberto, A. Lanteri, C. Pedrini (Eds.), Algebraic Geometry, a Volume in Memory of Paolo Francia, W. de Gruyter, Berlin, 2002, pp. 329–345.
[27] A.J. Sommese, J. Verschelde, C.W. Wampler, Advances in polynomial continuation for solving problems in kinematics, in: Proc. ASME Design Engineering Tech. Conf. (CDROM), Paper DETC2002/MECH-34254, Montreal, Quebec, September 29–October 2, 2002 (a revised version will appear in the ASME Journal of Mechanical Design).
[28] A.J. Sommese, J. Verschelde, C.W. Wampler, Numerical irreducible decomposition using PHCpack, in: M. Joswig, N. Takayama (Eds.), Algebra, Geometry, and Software Systems, Springer, Berlin, 2003, pp. 109–130.
[29] A.J. Sommese, J. Verschelde, C.W. Wampler, Introduction to numerical algebraic geometry, in: Graduate School on Systems of Polynomial Equations: From Algebraic Geometry to Industrial Applications, 2003, pp. 229–247 (organized by A. Dickenstein, I. Emiris, 14–25 July 2003, Buenos Aires, Argentina; notes published by INRIA).
[30] A.J. Sommese, C.W. Wampler, Numerical algebraic geometry, in: J. Renegar, M. Shub, S. Smale (Eds.), The Mathematics of Numerical Analysis, Lectures in Applied Mathematics, Vol. 32, Proc. AMS–SIAM Summer Seminar in Applied Mathematics, July 17–August 11, 1995, Park City, UT, 1996, pp. 749–763.
[31] J. Verschelde, Algorithm 795: PHCpack: a general-purpose solver for polynomial systems by homotopy continuation, ACM Trans. Math. Software 25 (2) (1999) 251–276. Software available at http://www.math.uic.edu/~jan.
[32] J.-C. Yakoubsohn, Finding a cluster of zeros of univariate polynomials, J. Complexity 16 (3) (2000) 603–638.
Theoretical Computer Science 315 (2004) 671–672
Author index volume 315 (2004)
The issue number is given in front of the page numbers.

Bai, Z.-J. and R.H. Chan, Inverse eigenproblem for centrosymmetric and centroskew matrices and their approximation (2–3) 309–318
Bauer, A., L. Birkedal and D.S. Scott, Equilogical spaces (1) 35–59
Berardi, S. and C. Berline, Building continuous webbed models for system F (1) 3–34
Berline, C., see S. Berardi (1) 3–34
Bini, D.A. and L. Gemignani, Bernstein–Bezoutian matrices (2–3) 319–333
Birkedal, L., see A. Bauer (1) 35–59
Bompadre, A., G. Matera, R. Wachenchauzer and A. Waissbein, Polynomial equation solving by lifting procedures for ramified fibers (2–3) 335–369
Burago, D., D. Grigoriev and A. Slissenko, Approximating shortest path for the skew lines problem in time doubly logarithmic in 1/epsilon (2–3) 371–404
Calcagno, C., Two-level languages for program optimization (1) 61–81
Chan, R.H., see Z.-J. Bai (2–3) 309–318
Ching, W.-K., see F.-R. Lin (2–3) 511–523
Codevico, G., see V.Y. Pan (2–3) 581–592
Croot, E., R.-C. Li and H.J. Zhu, The abc conjecture and correctly rounded reciprocal square roots (2–3) 405–417
de Paiva, V., see A. Schalk (1) 83–107
Emiris, I.Z., B. Mourrain and V.Y. Pan, Preface (2–3) 307–308
Füredi, Z. and R.P. Kurshan, Minimal length test vectors for multiple-fault detection (1) 191–208
Gemignani, L., see D.A. Bini (2–3) 319–333
Grigoriev, D., see D. Burago (2–3) 371–404
Heinig, G. and K. Rost, Split algorithms for skewsymmetric Toeplitz matrices with arbitrary rank profile (2–3) 453–468
Kaporin, I., The aggregation and cancellation techniques as a practical tool for faster matrix multiplication (2–3) 469–510
Konečný, M., Real functions incrementally computable by finite automata (1) 109–133
Kurshan, R.P., see Z. Füredi (1) 191–208
Li, R.-C., see E. Croot (2–3) 405–417
Lin, F.-R., W.-K. Ching and M.K. Ng, Fast inversion of triangular Toeplitz matrices (2–3) 511–523
Lowe, G., Semantic models for information flow (1) 209–256
Malajovich, G. and J.M. Rojas, High probability analysis of the condition number of sparse polynomial systems (2–3) 525–555
Martín, J.S., see L.M. Pardo (2–3) 593–625
Matera, G., see A. Bompadre (2–3) 335–369
Mislove, M., Mathematical Foundations of Programming Semantics: Papers from MFPS 14 and MFPS 16 (1) 1–2
Mourrain, B., see I.Z. Emiris (2–3) 307–308
Ng, M.K., see F.-R. Lin (2–3) 511–523
Nöcker, M., see J. von zur Gathen (2–3) 419–452
Noutsos, D., S. Serra Capizzano and P. Vassalos, Matrix algebra preconditioners for multilevel Toeplitz systems do not insure optimal convergence rate (2–3) 557–579
O'Hearn, P.W., see D.J. Pym (1) 257–305
Pan, V.Y., M. Van Barel, X. Wang and G. Codevico, Iterative inversion of structured matrices (2–3) 581–592
Pan, V.Y., see I.Z. Emiris (2–3) 307–308
Pardo, L.M. and J.S. Martín, Deformation techniques to solve generalised Pham systems (2–3) 593–625
Pérez-Díaz, S., J. Sendra and J.R. Sendra, Parametrization of approximate algebraic curves by lines (2–3) 627–650
Pym, D.J., P.W. O'Hearn and H. Yang, Possible worlds and resources: the semantics of BI (1) 257–305
Rojas, J.M., see G. Malajovich (2–3) 525–555
Rost, K., see G. Heinig (2–3) 453–468
Schalk, A. and V. de Paiva, Poset-valued sets or how to build models for linear logics (1) 83–107
Schellekens, M.P., The correspondence between partial metrics and semivaluations (1) 135–149
Scott, D.S., see A. Bauer (1) 35–59
Sendra, J., see S. Pérez-Díaz (2–3) 627–650
Sendra, J.R., see S. Pérez-Díaz (2–3) 627–650
Serra Capizzano, S., see D. Noutsos (2–3) 557–579
Slissenko, A., see D. Burago (2–3) 371–404
Sommese, A.J., J. Verschelde and C.W. Wampler, Numerical factorization of multivariate complex polynomials (2–3) 651–669
Van Barel, M., see V.Y. Pan (2–3) 581–592
Vassalos, P., see D. Noutsos (2–3) 557–579
Verschelde, J., see A.J. Sommese (2–3) 651–669
von zur Gathen, J. and M. Nöcker, Fast arithmetic with general Gauß periods (2–3) 419–452
Wachenchauzer, R., see A. Bompadre (2–3) 335–369
Waissbein, A., see A. Bompadre (2–3) 335–369
Wampler, C.W., see A.J. Sommese (2–3) 651–669
Wang, X., see V.Y. Pan (2–3) 581–592
Yang, H., see D.J. Pym (1) 257–305
Yang, Z., Encoding types in ML-like languages (1) 151–190
Zhu, H.J., see E. Croot (2–3) 405–417