DOI: 10.2478/s11533-006-0045-2 Research article CEJM 5(1) 2007 1–18
A family of regular vertex operator algebras with two generators
Dražen Adamović∗
Department of Mathematics, University of Zagreb, 10 000 Zagreb, Croatia
Received 7 September 2006; accepted 20 November 2006
Abstract: For every $m \in \mathbb{C} \setminus \{0, -2\}$ and every nonnegative integer $k$ we define the vertex operator (super)algebra $D_{m,k}$ having two generators and rank $\frac{3m}{m+2}$. If $m$ is a positive integer then $D_{m,k}$ can be realized as a subalgebra of a lattice vertex algebra. In this case, we prove that $D_{m,k}$ is a regular vertex operator (super)algebra and find the number of inequivalent irreducible modules.
© Versita Warsaw and Springer-Verlag Berlin Heidelberg. All rights reserved.
Keywords: Vertex operator algebras, vertex operator superalgebras, rationality, regularity, lattice vertex operator algebras
MSC (2000): 17B69
1
Introduction
In the theory of vertex operator (super)algebras, the classification and construction of rational vertex operator (super)algebras are important problems. These problems are connected with the classification of rational conformal field theories in physics. The rationality of certain familiar vertex operator (super)algebras was proved in papers [1–3, 7, 8, 14, 20, 25]. It is natural to consider rational vertex operator (super)algebras of a given rank. In particular, in the rank one case, for every positive integer $k$ we have the well-known rational vertex operator (super)algebra $F_k$ associated to the lattice $\sqrt{k}\,\mathbb{Z}$. These vertex operator (super)algebras are generated by two generators. In the present paper we concentrate on vertex operator (super)algebras of rank
E-mail:
[email protected]
D. Adamovi´c / Central European Journal of Mathematics 5(1) 2007 1–18
$c_m = \frac{3m}{m+2}$, $m \in \mathbb{C} \setminus \{0, -2\}$. This rank is attained by the vertex operator algebra $L(m, 0)$ associated to the irreducible vacuum $\hat{sl}_2$-module of level $m$ and by the vertex operator superalgebra $L_{c_m}$ associated to the vacuum module for the $N = 2$ superconformal algebra with central charge $c_m$ (cf. [2, 3, 11, 15–17]). In the case $m = 1$ these vertex operator (super)algebras are included into the family $F_k$, $k \in \mathbb{N}$, since $L(1, 0) \cong F_2$ and $L_{c_1} \cong F_3$. The main purpose of this article is to include $L(m, 0)$ and $L_{c_m}$ into the family $D_{m,k}$, $k \in \mathbb{Z}_{\ge 0}$, of rational vertex operator (super)algebras of rank $c_m$ for an arbitrary positive integer $m$. In fact, for every $m \in \mathbb{C} \setminus \{0, -2\}$ we define the vertex operator (super)algebra $D_{m,k}$ as a subalgebra of the vertex operator (super)algebra $L(m, 0) \otimes F_k$ (cf. Section 4). In the special case $k = 1$, $D_{m,1}$ is in fact the $N = 2$ vertex operator superalgebra $L_{c_m}$ constructed by using the Kazama-Suzuki mapping (cf. [15, 19]). We also have that $D_{m,0} \cong L(m, 0)$ and $D_{1,k} \cong F_{k+2}$. Moreover, we shall demonstrate that $D_{m,k}$ has many properties similar to those of affine and $N = 2$ superconformal vertex algebras. When $m$ is not a nonnegative integer, $D_{m,k}$ has infinitely many irreducible representations. Thus, it is not rational (cf. Section 4). In order to construct new examples of rational vertex operator (super)algebras we shall consider the case when $m$ is a positive integer. Then $D_{m,k}$ can be embedded into a lattice vertex algebra (cf. Section 5). In fact, we shall prove that
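$$D_{m,k} \otimes F_{-k} \cong L(m, 0) \otimes F_{-\frac{k}{2}(mk+2)} \quad (k \text{ even}), \tag{1}$$
$$D_{m,k} \otimes F_{-k} \cong L(m, 0) \otimes F_{-2k(mk+2)} \oplus L(m, m) \otimes MF_{-2k(mk+2)} \quad (k \text{ odd}). \tag{2}$$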
These relations completely determine the structure of $D_{m,k} \otimes F_{-k}$ as a weak $L(m, 0)$-module. In [9] the notion of a regular vertex operator algebra was introduced, i.e. a rational vertex operator algebra with the property that every weak module is completely reducible. The relations (1) and (2), together with the regularity results from [9] and [21], imply that $D_{m,k}$ is a simple regular vertex operator algebra if $k$ is even, and a simple regular vertex operator superalgebra if $k$ is odd. It was shown in [5] that regularity is equivalent to rationality and $C_2$-cofiniteness. Therefore, the vertex operator (super)algebras $D_{m,k}$ are also rational and $C_2$-cofinite. Let us here discuss the case $k = 2n$, where $n$ is a positive integer. The relation (1) suggests that one can study the dual pair $(D_{m,2n}, F_{-2n})$ directly inside $L(m, 0) \otimes F_{-2n(nm+1)}$. This approach requires many deep results on the structure of the vertex operator algebra $L(m, 0)$ and deserves to be investigated independently. Instead of this approach, we realize the vertex algebra $D_{m,2n} \otimes F_{-2n}$ inside a larger lattice vertex algebra. Then the formulas for the generators are much simpler (cf. Section 6). A similar analysis can be done when $k$ is odd (cf. Section 7). This approach was also used in [3] for studying the fusion rules for the $N = 2$ vertex operator superalgebra $D_{m,1}$. Our results show that for every $m \in \mathbb{N}$, there exists an infinite family of rational vertex operator algebras of rank $c_m$. We believe that these algebras will have an important role in the classification of rational vertex operator algebras of this rank. As an example, in this paper we shall consider in detail the vertex operator (super)algebras of rank $c_4 = 2$.
Then our vertex operator (super)algebras $D_{m,k}$ admit nice realizations. In Section 8 we show that $D_{4,k}$ is a $\mathbb{Z}_2$-orbifold model of a lattice vertex operator superalgebra under an automorphism of order two. This paper is a slightly modified version of the preprint math.QA/0111055.
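The numerical data appearing in the Introduction can be tabulated with a few lines of Python. This is only an illustrative sketch: the function names `central_charge` and `rank_one_index` are hypothetical helpers, not notation from the paper, and the second one simply reads off the index of the rank-one factor $F_N$ on the right-hand sides of (1) and (2).

```python
from fractions import Fraction


def central_charge(m):
    """Rank c_m = 3m/(m+2); undefined at m = -2."""
    m = Fraction(m)
    if m == -2:
        raise ValueError("c_m is undefined for m = -2")
    return 3 * m / (m + 2)


def rank_one_index(m, k):
    """Index N of the factor F_N on the right-hand side of (1) or (2)."""
    if k % 2 == 0:
        return -(k // 2) * (m * k + 2)  # relation (1), k even
    return -2 * k * (m * k + 2)         # relation (2), k odd
```

For instance, `central_charge(1)` and `central_charge(4)` give the ranks $c_1 = 1$ and $c_4 = 2$ mentioned above, and for $k = 2n$ the index returned by `rank_one_index` agrees with $-2n(nm+1)$.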
2
Preliminaries
In this section we recall the definition of vertex operator superalgebras and their modules (cf. [12, 13, 18, 20]). We also recall the basic properties of regular vertex operator superalgebras.
Let $V = V_{\bar 0} \oplus V_{\bar 1}$ be any $\mathbb{Z}_2$-graded vector space. Then any element $u \in V_{\bar 0}$ (resp. $u \in V_{\bar 1}$) is said to be even (resp. odd). We define $|u| = \bar 0$ if $u$ is even and $|u| = \bar 1$ if $u$ is odd. Elements in $V_{\bar 0}$ or $V_{\bar 1}$ are called homogeneous. Whenever $|u|$ is written, it is understood that $u$ is homogeneous.
Definition 2.1. A vertex superalgebra is a triple $(V, Y, \mathbf{1})$ where $V = V_{\bar 0} \oplus V_{\bar 1}$ is a $\mathbb{Z}_2$-graded vector space, $\mathbf{1} \in V_{\bar 0}$ is a specified element called the vacuum of $V$, and $Y$ is a linear map
$$Y(\cdot, z) : V \to (\mathrm{End}\, V)[[z, z^{-1}]]; \quad a \mapsto Y(a, z) = \sum_{n \in \mathbb{Z}} a_n z^{-n-1} \in (\mathrm{End}\, V)[[z, z^{-1}]]$$
satisfying the following conditions for $a, b \in V$:
(V1) $|a_n b| = |a| + |b|$.
(V2) $a_n b = 0$ for $n$ sufficiently large.
(V3) $[D, Y(a, z)] = Y(D(a), z) = \frac{d}{dz} Y(a, z)$, where $D \in \mathrm{End}\, V$ is defined by $D(a) = a_{-2}\mathbf{1}$.
(V4) $Y(\mathbf{1}, z) = I_V$ (the identity operator on $V$).
(V5) $Y(a, z)\mathbf{1} \in (\mathrm{End}\, V)[[z]]$ and $\lim_{z \to 0} Y(a, z)\mathbf{1} = a$.
(V6) The following Jacobi identity holds:
$$z_0^{-1}\delta\!\left(\frac{z_1 - z_2}{z_0}\right) Y(a, z_1)Y(b, z_2) - (-1)^{|a||b|} z_0^{-1}\delta\!\left(\frac{z_2 - z_1}{-z_0}\right) Y(b, z_2)Y(a, z_1) = z_2^{-1}\delta\!\left(\frac{z_1 - z_0}{z_2}\right) Y(Y(a, z_0)b, z_2).$$
A vertex superalgebra $V$ is called a vertex operator superalgebra if there is a special element $\omega \in V_{\bar 0}$ (called the Virasoro element) whose vertex operator we write in the form $Y(\omega, z) = \sum_{n \in \mathbb{Z}} \omega_n z^{-n-1} = \sum_{n \in \mathbb{Z}} L(n) z^{-n-2}$, such that
(V7) $[L(m), L(n)] = (m - n)L(m + n) + \delta_{m+n,0}\frac{m^3 - m}{12}c$, $c = \mathrm{rank}\, V \in \mathbb{C}$.
(V8) $L(-1) = D$.
(V9) $V = \oplus_{n \in \frac{1}{2}\mathbb{Z}} V(n)$ is $\frac{1}{2}\mathbb{Z}$-graded so that $V_{\bar 0} = \oplus_{n \in \mathbb{Z}} V(n)$, $V_{\bar 1} = \oplus_{n \in \frac{1}{2} + \mathbb{Z}} V(n)$, $L(0)|_{V(n)} = n I_V|_{V(n)}$, $\dim V(n) < \infty$, and $V(n) = 0$ for $n$ sufficiently small.
We shall sometimes refer to the vertex operator superalgebra $V$ as the quadruple $(V, Y, \mathbf{1}, \omega)$.
Remark 2.2. If in the definition of vertex (operator) superalgebra the odd subspace $V_{\bar 1} = 0$, we get the usual definition of vertex (operator) algebra.
We will say that the vertex operator superalgebra is generated by the set $S$ if
$$V = \mathrm{span}_{\mathbb{C}}\{u^1_{n_1} \cdots u^r_{n_r}\mathbf{1} \mid u^1, \dots, u^r \in S,\ n_1, \dots, n_r \in \mathbb{Z},\ r \in \mathbb{Z}_{\ge 0}\}.$$
A subspace $I \subset V$ is called an ideal in the vertex operator superalgebra $V$ if $a_n I \subset I$ for every $a \in V$ and $n \in \mathbb{Z}$. A vertex operator superalgebra $V$ is called simple if it does not contain any proper non-zero ideal.
There is a canonical automorphism $\sigma_V$ of the vertex operator superalgebra $V$ such that $\sigma_V|_{V_{\bar 0}} = 1$ and $\sigma_V|_{V_{\bar 1}} = -1$.
Definition 2.3. Let $V$ be a vertex operator superalgebra. A weak $V$-module is a pair $(M, Y_M)$, where $M = M_{\bar 0} \oplus M_{\bar 1}$ is a $\mathbb{Z}_2$-graded vector space, and $Y_M(\cdot, z)$ is a linear map
$$Y_M : V \to \mathrm{End}(M)[[z, z^{-1}]], \quad a \mapsto Y_M(a, z) = \sum_{n \in \mathbb{Z}} a_n z^{-n-1},$$
satisfying the following conditions for $a, b \in V$ and $v \in M$:
(M1) $|a_n v| = |a| + |v|$ for any $a \in V$.
(M2) $Y_M(\mathbf{1}, z) = I_M$.
(M3) $a_n v = 0$ for $n$ sufficiently large.
(M4) The following Jacobi identity holds:
$$z_0^{-1}\delta\!\left(\frac{z_1 - z_2}{z_0}\right) Y_M(a, z_1)Y_M(b, z_2) - (-1)^{|a||b|} z_0^{-1}\delta\!\left(\frac{z_2 - z_1}{-z_0}\right) Y_M(b, z_2)Y_M(a, z_1) = z_2^{-1}\delta\!\left(\frac{z_1 - z_0}{z_2}\right) Y_M(Y(a, z_0)b, z_2).$$
A weak $V$-module $(M, Y_M)$ is called a $V$-module if
(M5) $M = \oplus_{n \in \mathbb{C}} M(n)$;
(M7) $L(0)u = nu$, $u \in M(n)$; $\dim M(n) < \infty$;
(M8) $M(n) = 0$ for $n$ sufficiently small.
We recall the definition of a regular vertex operator algebra introduced by C. Dong, H. Li and G. Mason in [9].
Definition 2.4. The vertex operator superalgebra $V$ is called regular if every weak $V$-module is a direct sum of irreducible modules.
If a vertex operator superalgebra $V$ is regular, then $V$ is also a rational vertex operator superalgebra, meaning that $V$ has only finitely many irreducible modules and that every $V$-module is completely reducible.
A vertex operator superalgebra $V$ is called $C_2$-cofinite if the subspace $C_2(V) = \mathrm{span}_{\mathbb{C}}\{u_{-2}v \mid u, v \in V\}$ has finite codimension in $V$. This condition is important in the representation theory of vertex operator superalgebras.
Proposition 2.5. ([5, 23]) The vertex operator superalgebra $V$ is regular if and only if $V$ is rational and $C_2$-cofinite.
Remark 2.6. A regularity result for the affine, Virasoro and lattice vertex operator algebras was obtained in [9]. Regularity of vertex operator superalgebras associated to minimal models for the Neveu-Schwarz and $N = 2$ superconformal algebra was proved in [3, 4].
3
Lattice and affine vertex algebras
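In this section, we shall recall the lattice construction of vertex superalgebras from [8, 18]. Let $L$ be a lattice. Set $\mathfrak{h} = \mathbb{C} \otimes_{\mathbb{Z}} L$ and extend the $\mathbb{Z}$-form $\langle \cdot, \cdot \rangle$ on $L$ to $\mathfrak{h}$. Let $\hat{\mathfrak{h}} = \mathbb{C}[t, t^{-1}] \otimes \mathfrak{h} \oplus \mathbb{C}c$ be the affinization of $\mathfrak{h}$. We also use the notation $h(n) = t^n \otimes h$ for $h \in \mathfrak{h}$, $n \in \mathbb{Z}$.
Set $\hat{\mathfrak{h}}^+ = t\mathbb{C}[t] \otimes \mathfrak{h}$; $\hat{\mathfrak{h}}^- = t^{-1}\mathbb{C}[t^{-1}] \otimes \mathfrak{h}$. Then $\hat{\mathfrak{h}}^+$ and $\hat{\mathfrak{h}}^-$ are abelian subalgebras of $\hat{\mathfrak{h}}$. Let $U(\hat{\mathfrak{h}}^-) = S(\hat{\mathfrak{h}}^-)$ be the universal enveloping algebra of $\hat{\mathfrak{h}}^-$. Let $\lambda \in \mathfrak{h}$. Consider the induced $\hat{\mathfrak{h}}$-module
$$M(1, \lambda) = U(\hat{\mathfrak{h}}) \otimes_{U(\mathbb{C}[t] \otimes \mathfrak{h} \oplus \mathbb{C}c)} \mathbb{C} \simeq S(\hat{\mathfrak{h}}^-) \quad \text{(linearly)},$$
where $t\mathbb{C}[t] \otimes \mathfrak{h}$ acts trivially on $\mathbb{C}$, $t^0 \otimes h$ acts as $\langle h, \lambda \rangle$ for $h \in \mathfrak{h}$, and $c$ acts on $\mathbb{C}$ as multiplication by 1. We shall write $M(1)$ for $M(1, 0)$. For $h \in \mathfrak{h}$ and $n \in \mathbb{Z}$ write $h(n) = t^n \otimes h$. Set $h(z) = \sum_{n \in \mathbb{Z}} h(n) z^{-n-1}$. Then $M(1)$ is a vertex operator algebra which is generated by the fields $h(z)$, $h \in \mathfrak{h}$, and $M(1, \lambda)$, for $\lambda \in \mathfrak{h}$, are irreducible modules for $M(1)$.
Let $\hat{L}$ be the canonical central extension of $L$ by the cyclic group $\langle \pm 1 \rangle$:
$$1 \to \langle \pm 1 \rangle \to \hat{L} \;\bar{\to}\; L \to 1 \tag{3}$$
with the commutator map $c(\alpha, \beta) = (-1)^{\langle \alpha, \beta \rangle + \langle \alpha, \alpha \rangle \langle \beta, \beta \rangle}$ for $\alpha, \beta \in L$. Let $e : L \to \hat{L}$ be a section such that $e_0 = 1$ and let $\epsilon : L \times L \to \langle \pm 1 \rangle$ be the corresponding 2-cocycle. Then
$$\epsilon(\alpha, \beta)\epsilon(\beta, \alpha) = (-1)^{\langle \alpha, \beta \rangle + \langle \alpha, \alpha \rangle \langle \beta, \beta \rangle}, \qquad \epsilon(\alpha, \beta)\epsilon(\alpha + \beta, \gamma) = \epsilon(\beta, \gamma)\epsilon(\alpha, \beta + \gamma) \tag{4}$$
and $e_\alpha e_\beta = \epsilon(\alpha, \beta) e_{\alpha + \beta}$ for $\alpha, \beta, \gamma \in L$. Form the induced $\hat{L}$-module
$$\mathbb{C}\{L\} = \mathbb{C}[\hat{L}] \otimes_{\langle \pm 1 \rangle} \mathbb{C} \simeq \mathbb{C}[L] \quad \text{(linearly)},$$
where $\mathbb{C}[\cdot]$ denotes the group algebra and $-1$ acts on $\mathbb{C}$ as multiplication by $-1$. For $a \in \hat{L}$, write $\iota(a)$ for $a \otimes 1$ in $\mathbb{C}\{L\}$. Then the action of $\hat{L}$ on $\mathbb{C}\{L\}$ is given by: $a \cdot \iota(b) = \iota(ab)$ and $(-1) \cdot \iota(b) = -\iota(b)$ for $a, b \in \hat{L}$.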
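Furthermore we define an action of $\mathfrak{h}$ on $\mathbb{C}\{L\}$ by: $h \cdot \iota(a) = \langle h, \bar{a} \rangle \iota(a)$ for $h \in \mathfrak{h}$, $a \in \hat{L}$. Define $z^h \cdot \iota(a) = z^{\langle h, \bar{a} \rangle} \iota(a)$.
The untwisted space associated with $L$ is defined to be
$$V_L = \mathbb{C}\{L\} \otimes_{\mathbb{C}} M(1) \simeq \mathbb{C}[L] \otimes S(\hat{\mathfrak{h}}^-) \quad \text{(linearly)}.$$
Then $\hat{L}$, $\hat{\mathfrak{h}}$, $z^h$ ($h \in \mathfrak{h}$) act naturally on $V_L$ by acting on either $\mathbb{C}\{L\}$ or $M(1)$ as indicated above. Define $\mathbf{1} = \iota(e_0) \in V_L$. We use a normal ordering procedure, indicated by open colons, which signify that in the enclosed expression, all creation operators $h(n)$ ($n < 0$), $a \in \hat{L}$ are to be placed to the left of all annihilation operators $h(n)$, $z^h$ ($h \in \mathfrak{h}$, $n \ge 0$). For $a \in \hat{L}$, set
$$Y(\iota(a), z) = {:}\, e^{\int (\bar{a}(z) - \bar{a}(0) z^{-1})}\, a\, z^{\bar{a}} \,{:}.$$
Let $a \in \hat{L}$; $h_1, \dots, h_k \in \mathfrak{h}$; $n_1, \dots, n_k \in \mathbb{Z}$ ($n_i > 0$). Set $v = \iota(a) \otimes h_1(-n_1) \cdots h_k(-n_k) \in V_L$. Define the vertex operator $Y(v, z)$ by
$${:} \left( \frac{1}{(n_1 - 1)!} \left( \frac{d}{dz} \right)^{n_1 - 1} h_1(z) \right) \cdots \left( \frac{1}{(n_k - 1)!} \left( \frac{d}{dz} \right)^{n_k - 1} h_k(z) \right) Y(\iota(a), z) \,{:}. \tag{5}$$
This gives us a well-defined linear map
$$Y(\cdot, z) : V_L \to (\mathrm{End}\, V_L)[[z, z^{-1}]], \quad v \mapsto Y(v, z) = \sum_{n \in \mathbb{Z}} v_n z^{-n-1}, \quad (v_n \in \mathrm{End}\, V_L).$$
Let $\{h_i \mid i = 1, \dots, d\}$ be an orthonormal basis of $\mathfrak{h}$ and set
$$\omega = \frac{1}{2} \sum_{i=1}^{d} h_i(-1) h_i(-1) \in V_L.$$
Then $Y(\omega, z) = \sum_{n \in \mathbb{Z}} L(n) z^{-n-2}$ gives rise to a representation of the Virasoro algebra on $V_L$ with central charge $d$ and
$$L(0)\left(\iota(a) \otimes h_1(-n_1) \cdots h_k(-n_k)\right) = \left( \frac{1}{2} \langle \bar{a}, \bar{a} \rangle + n_1 + \cdots + n_k \right) \left(\iota(a) \otimes h_1(-n_1) \cdots h_k(-n_k)\right). \tag{6}$$
The following theorem was proved in [8] and [18].
Theorem 3.1. (i) The structure $(V_L, Y, \mathbf{1})$ is a vertex (super)algebra. (ii) Assume that $L$ is a positive definite lattice. Then the structure $(V_L, Y, \mathbf{1}, \omega)$ is a vertex operator (super)algebra.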
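Define the Schur polynomials $p_r(x_1, x_2, \cdots)$ in variables $x_1, x_2, \cdots$ by the following equation:
$$\exp\left( \sum_{n=1}^{\infty} \frac{x_n}{n} y^n \right) = \sum_{r=0}^{\infty} p_r(x_1, x_2, \cdots) y^r. \tag{7}$$
For any monomial $x_1^{n_1} x_2^{n_2} \cdots x_r^{n_r}$ we have an element $h(-1)^{n_1} h(-2)^{n_2} \cdots h(-r)^{n_r} \mathbf{1}$ in both $M(1)$ and $V_L$ for $h \in \mathfrak{h}$. Then for any polynomial $f(x_1, x_2, \cdots)$, $f(h(-1), h(-2), \cdots)\mathbf{1}$ is a well-defined element in $M(1)$ and $V_L$. In particular, $p_r(h(-1), h(-2), \cdots)\mathbf{1}$ for $r \in \mathbb{N}$ are elements of $M(1)$ and $V_L$.
Suppose $a, b \in \hat{L}$ such that $\bar{a} = \alpha$, $\bar{b} = \beta$. Then
$$Y(\iota(a), z)\iota(b) = z^{\langle \alpha, \beta \rangle} \exp\left( \sum_{n=1}^{\infty} \frac{\alpha(-n)}{n} z^n \right) \iota(ab) = \sum_{r=0}^{\infty} p_r(\alpha(-1), \alpha(-2), \cdots)\iota(ab) z^{r + \langle \alpha, \beta \rangle}. \tag{8}$$
Thus
$$\iota(a)_i \iota(b) = 0 \quad \text{for } i \ge -\langle \alpha, \beta \rangle. \tag{9}$$
Especially, if $\langle \alpha, \beta \rangle \ge 0$, we have $\iota(a)_i \iota(b) = 0$ for $i \ge 0$, and if $\langle \alpha, \beta \rangle = -n < 0$, we get
$$\iota(a)_{i-1} \iota(b) = p_{n-i}(\alpha(-1), \alpha(-2), \cdots)\iota(ab) \quad \text{for } i \in \{0, \dots, n\}. \tag{10}$$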
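The Schur polynomials in (7) can be evaluated numerically with a short recurrence. The sketch below is an illustration only: differentiating the generating function (7) with respect to $y$ gives $r\,p_r = \sum_{i=1}^{r} x_i\, p_{r-i}$, and the hypothetical helper `schur_values` applies this recurrence to commuting rational values substituted for $x_1, x_2, \dots$ (it does not model the operators $\alpha(-n)$ themselves).

```python
from fractions import Fraction


def schur_values(xs, r_max):
    """Return [p_0, ..., p_{r_max}] evaluated at x_n = xs[n-1] (missing x_n are 0).

    Uses r * p_r = sum_{i=1}^{r} x_i * p_{r-i}, which follows from
    differentiating the generating function (7) with respect to y.
    """
    def x(n):
        return Fraction(xs[n - 1]) if n <= len(xs) else Fraction(0)

    p = [Fraction(1)]  # p_0 = 1
    for r in range(1, r_max + 1):
        p.append(sum(x(i) * p[r - i] for i in range(1, r + 1)) / r)
    return p
```

Two quick sanity checks: with $x_n = 1$ for all $n$ the left-hand side of (7) is $1/(1-y)$, so every $p_r$ equals 1; with $x_1 = 1$ and all other $x_n = 0$ it is $e^y$, so $p_r = 1/r!$.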
Let $n \in \mathbb{Z}$, $n \ne 0$, and $\langle \beta, \beta \rangle = n$. Define
$$L_n = \mathbb{Z}\beta, \qquad F_n = V_{L_n}.$$
Then $F_n$ is a simple vertex algebra if $n$ is even, and a simple vertex superalgebra if $n$ is odd. For $i \in \mathbb{Z}$, let $\bar{i} = i + n\mathbb{Z} \in \mathbb{Z}/n\mathbb{Z}$. We define $F_n^{\bar{i}} = V_{\mathbb{Z}\beta + \frac{i}{n}\beta}$. Clearly $F_n = F_n^{\bar{0}}$. It is well-known (cf. [7, 8, 26]) that the set $\{F_n^{\bar{i}}\}_{i = 0, \dots, |n| - 1}$ provides all irreducible $F_n$-modules. In particular, $F_n$ has $|n|$ inequivalent irreducible modules.
If $n = 2k$ is even, we define $\tilde{L}_{2k} = \frac{\beta}{2} + \mathbb{Z}\beta$, and $MF_{2k} = V_{\tilde{L}_{2k}} = F_{2k}^{\bar{k}}$. Then $F_{2k}$ is a vertex algebra, and $MF_{2k}$ is an $F_{2k}$-module. We shall also need the following result from [9].
Proposition 3.2. [9] Assume that $n \in \mathbb{Z}$, $n \ne 0$. Then the vertex (super)algebra $F_n$ is regular, i.e., any (weak) $F_n$-module is completely reducible.
Let $\mathfrak{g}$ be the Lie algebra $sl_2$ with generators $e, f, h$ and relations $[e, f] = h$, $[h, e] = 2e$, $[h, f] = -2f$. Let $\hat{\mathfrak{g}} = \mathfrak{g} \otimes \mathbb{C}[t, t^{-1}] \oplus \mathbb{C}K$ be the corresponding affine Lie algebra of type $A_1^{(1)}$. As usual we write $x(n)$ for $x \otimes t^n$ where $x \in \mathfrak{g}$ and $n \in \mathbb{Z}$. Let $\Lambda_0, \Lambda_1$ denote the fundamental weights for $\hat{\mathfrak{g}}$. For any complex numbers $m, j$, let $L(m, j) =$
$L((m - j)\Lambda_0 + j\Lambda_1)$ be the irreducible highest weight $\hat{sl}_2$-module with the highest weight $(m - j)\Lambda_0 + j\Lambda_1$. Then $L(m, 0)$ has a natural structure of a simple vertex operator algebra. Let $\mathbf{1}_m$ denote the vacuum vector in $L(m, 0)$. If $m$ is a positive integer then $L(m, 0)$ is a regular vertex operator algebra, and the set $\{L(m, j)\}_{j = 0, \dots, m}$ provides all inequivalent irreducible $L(m, 0)$-modules.
We shall now recall the lattice construction of the vertex operator algebra $L(m, 0)$. Define the following lattice
$$A_{1,m} = \mathbb{Z}\alpha_1 + \cdots + \mathbb{Z}\alpha_m, \qquad \langle \alpha_i, \alpha_j \rangle = 2\delta_{i,j},$$
for every $i, j \in \{1, \dots, m\}$. Define also $\tilde{A}_{1,m} = \frac{\alpha_1 + \cdots + \alpha_m}{2} + A_{1,m}$. We have:
Lemma 3.3. [8] The vectors
$$E = \iota(e_{\alpha_1}) + \cdots + \iota(e_{\alpha_m}), \qquad F = \iota(e_{-\alpha_1}) + \cdots + \iota(e_{-\alpha_m})$$
generate a subalgebra of $V_{A_{1,m}}$ isomorphic to $L(m, 0)$. Moreover, $L(m, m)$ is an $L(m, 0)$-submodule of $V_{\tilde{A}_{1,m}}$.
4
The definition of Dm,k
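In this section we give the definition of the vertex operator (super)algebra $D_{m,k}$. Let the vertex (super)algebras $L(m, 0)$ and $F_k$ be defined as in Section 3.
Definition 4.1. Let $m \in \mathbb{C} \setminus \{0, -2\}$, and let $k$ be a nonnegative integer. Let $D_{m,k}$ be the vertex subalgebra of the vertex operator (super)algebra $L(m, 0) \otimes F_k$ generated by the vectors:
$$\bar{X} = e(-1)\mathbf{1}_m \otimes \iota(e_\beta) \quad \text{and} \quad \bar{Y} = f(-1)\mathbf{1}_m \otimes \iota(e_{-\beta}).$$
Let $\mathbf{1}_{m,k} = \mathbf{1}_m \otimes \mathbf{1} \in D_{m,k} \subset L(m, 0) \otimes F_k$. Define the following elements of $D_{m,k}$:
$$\bar{H} = \bar{X}_k \bar{Y} = h(-1)\mathbf{1}_m \otimes \mathbf{1} + m\mathbf{1}_m \otimes \beta(-1)\mathbf{1},$$
$$\omega_{m,k} = \frac{1}{2(m + 2)} \left( \bar{X}_{k-1}\bar{Y} + \bar{Y}_{k-1}\bar{X} + \frac{1 - k}{mk + 2} \bar{H}_{-1}^2 \mathbf{1}_{m,k} \right).$$
Assume that $mk + 2 \ne 0$. Then the components of the field $Y(\omega_{m,k}, z) = \sum_{n \in \mathbb{Z}} L(n) z^{-n-2}$ give rise to a representation of the Virasoro algebra of central charge $c_m = \frac{3m}{m+2}$. We shall now investigate the conformal structure on $D_{m,k}$ defined by the Virasoro element $\omega_{m,k}$. For $n \ge 0$ one has
$$L(n)\bar{X} = \delta_{n,0}\left(1 + \frac{k}{2}\right)\bar{X} \quad \text{and} \quad L(n)\bar{Y} = \delta_{n,0}\left(1 + \frac{k}{2}\right)\bar{Y}. \tag{11}$$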
Therefore the generators $\bar{X}$ and $\bar{Y}$ of $D_{m,k}$ are primary vectors of conformal weight $1 + \frac{k}{2}$ for the Virasoro algebra. Moreover, the operator $L(0)$ defines on $D_{m,k}$ a $\mathbb{Z}_{\ge 0}$-gradation if $k$ is even and a $\frac{1}{2}\mathbb{Z}_{\ge 0}$-gradation if $k$ is odd.
Assume first that $k$ is even. Then $D_{m,k}$ is a subalgebra of the vertex algebra $L(m, 0) \otimes F_k$. Since the operator $L(0)$ defines on $D_{m,k}$ a $\mathbb{Z}_{\ge 0}$-gradation, we have that $D_{m,k}$ is a vertex operator algebra.
If $k$ is odd, then $\iota(e_\beta)$ and $\iota(e_{-\beta})$ are odd elements in $F_k$, which implies that $\bar{X}$ and $\bar{Y}$ are also odd elements in the vertex superalgebra $L(m, 0) \otimes F_k$. Therefore $D_{m,k}$ carries the structure of a vertex operator superalgebra which is generated by the odd elements $\bar{X}$ and $\bar{Y}$ of half-integer conformal weight.
In this way we get the following theorem.
Theorem 4.2. Let $m \in \mathbb{C} \setminus \{0, -2\}$, and let $k$ be a nonnegative integer. Assume that $mk + 2 \ne 0$. Then $D_{m,k}$ is a vertex operator algebra if $k$ is even and a vertex operator superalgebra if $k$ is odd. The Virasoro element is $\omega_{m,k}$, the vacuum vector is $\mathbf{1}_{m,k}$ and the rank is $c_m$.
Let $k = 0$. Then $D_{m,0}$ is isomorphic to the $\hat{sl}_2$ vertex operator algebra $L(m, 0)$. Note also that the vector
$$\omega_{m,0} = \frac{1}{2(m + 2)} \left( \bar{X}_{-1}\bar{Y} + \bar{Y}_{-1}\bar{X} + \frac{1}{2}\bar{H}_{-1}^2 \mathbf{1}_{m,0} \right)$$
coincides with the Virasoro element in $L(m, 0)$ constructed using the Sugawara construction.
For $k = 1$, $D_{m,1}$ is in fact the vertex operator superalgebra associated to the vacuum representation of the $N = 2$ superconformal algebra constructed using the Kazama-Suzuki mapping (cf. [15, 19]). The Virasoro element in $D_{m,1}$ is
$$\omega_{m,1} = \frac{1}{2(m + 2)} \left( \bar{X}_0 \bar{Y} + \bar{Y}_0 \bar{X} \right).$$
Its representation theory was studied in [2, 3, 11]. It was proved in [3] that if $m$ is a positive integer, then $D_{m,1}$ is a regular vertex operator superalgebra and that the vertex superalgebra $D_{m,1} \otimes F_{-1}$ is a simple current extension of the vertex algebra $L(m, 0) \otimes F_{-2(m+2)}$. When $m$ is not a nonnegative integer, $D_{m,1}$ is not rational. In Theorem 4.4 we will generalize this fact to every positive integer $k$. The definition of $D_{m,k}$ implies that for every weak $L(m, 0)$-module $M$, $M \otimes F_k$ is a weak module for $D_{m,k}$. Thus, the representation theory of $D_{m,k}$ is closely related to the representation theory of the vertex operator algebra $L(m, 0)$. The case when $m$ is a nonnegative integer will be studied in the following sections. When $m \ne -2$ and $m$ is not an admissible rational number, every highest weight $\hat{sl}_2$-module of level $m$ is a module for the vertex operator algebra $L(m, 0)$. This easily gives that $D_{m,k}$ is not rational. In the case when $m$ is an admissible rational number, by using arguments similar to those of
[2], and by using the representation theory of the vertex operator algebra $L(m, 0)$ in this case (cf. [6]), one can construct infinitely many inequivalent irreducible $D_{m,k}$-modules. In order to be more precise, we shall state the following lemma.
Lemma 4.3. Assume that $m$ is not a nonnegative integer and $m \ne -2$, $mk + 2 \ne 0$. Let $k \ge 1$. Then for every $t \in \mathbb{C}$ there is an ordinary $D_{m,k}$-module $N_t$ such that $N_t = \oplus_{n \in \frac{1}{2}\mathbb{Z}_{\ge 0}} N_t(n)$, and the top level $N_t(0)$ satisfies
$$N_t(0) = \mathbb{C}w, \qquad L(n)w = t\delta_{n,0}w \quad \text{for } n \ge 0.$$
Proof. The proof uses considerations similar to those in [2], Section 6. Assume that $m$ is not a positive integer and $t \in \mathbb{C}$. The results from [6] give that for every $q \in \mathbb{C}$ there is a $\mathbb{Z}_{\ge 0}$-graded $L(m, 0)$-module $M_q = \oplus_{n \in \mathbb{Z}_{\ge 0}} M_q(n)$ and a weight vector $v_q \in M_q(0)$ such that
$$\Omega(0)|_{M_q(0)} \equiv \frac{(m + 2)m}{2}\mathrm{Id}, \qquad h(0)v_q = qv_q,$$
where $\Omega(0) = e(0)f(0) + f(0)e(0) + \frac{1}{2}h(0)^2$ is the Casimir element acting on the $sl_2$-module $M_q(0)$. Then $M_q \otimes F_k$ is a weak $D_{m,k}$-module. Choose $q \in \mathbb{C}$ such that
$$\frac{m}{4} - \frac{k}{4(mk + 2)}q^2 = t.$$
Let $N_t$ be the $D_{m,k}$-submodule of $M_q \otimes F_k$ generated by the vector $w = v_q \otimes \mathbf{1}$. Then for $n \ge 0$ we have that
$$L(n)w = \delta_{n,0}\left( \frac{m}{4} - \frac{k}{4(mk + 2)}q^2 \right)w = \delta_{n,0}tw.$$
Now it is easy to see that $N_t$ is an ordinary $\frac{1}{2}\mathbb{Z}_{\ge 0}$-graded $D_{m,k}$-module with the top level $N_t(0) = \mathbb{C}w$ and that $L(0)|_{N_t(0)} \equiv t\,\mathrm{Id}$. Thus, the lemma holds.
In fact, Lemma 4.3 gives that there are uncountably many inequivalent irreducible $D_{m,k}$-modules. Thus, we conclude that the following theorem holds.
Theorem 4.4. Let $k$ be a nonnegative integer. Assume that $m$ is not a nonnegative integer and that $m \ne -2$, $mk + 2 \ne 0$. Then for every positive integer $k$, the vertex operator (super)algebra $D_{m,k}$ is not rational.
Remark 4.5. In what follows we will prove that if $m$ is a positive integer, then $D_{m,k}$ is rational. In fact, we will establish a more general complete reducibility theorem, which will imply that $D_{m,k}$ is regular in the sense of [9].
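The choice of $q$ in the proof of Lemma 4.3 amounts to solving a quadratic equation, which can be checked numerically. The following is a minimal sketch under the assumptions of the lemma ($k \ge 1$, $mk + 2 \ne 0$); the helper names `top_weight` and `solve_q` are hypothetical and only mirror the formula $\frac{m}{4} - \frac{k}{4(mk+2)}q^2 = t$.

```python
import cmath


def top_weight(m, k, q):
    """L(0)-eigenvalue of w = v_q (x) 1, as computed in the proof of Lemma 4.3."""
    return m / 4 - k * q ** 2 / (4 * (m * k + 2))


def solve_q(m, k, t):
    """Return one complex q with top_weight(m, k, q) == t.

    Requires k != 0 and mk + 2 != 0, as in the hypotheses of the lemma.
    """
    if k == 0 or m * k + 2 == 0:
        raise ValueError("need k != 0 and mk + 2 != 0")
    return cmath.sqrt((m / 4 - t) * 4 * (m * k + 2) / k)
```

Since the square root always exists over $\mathbb{C}$, every $t \in \mathbb{C}$ is attained, which is exactly why the lemma produces a module for each $t$.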
5
The lattice construction of Dm,k for m ∈ N
In this section we give the lattice construction of the vertex operator (super)algebra Dm,k . This construction is a generalization of the lattice constructions of the vertex operator
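algebra $L(m, 0)$ (cf. [8] and our Lemma 3.3) and of the $N = 2$ vertex operator superalgebra $L_{c_m}$ (cf. [3]). Let $m \in \mathbb{N}$ and $k \in \mathbb{Z}_{\ge 0}$. Define the lattice
$$\Gamma_{m,k} = \mathbb{Z}\gamma_1 + \cdots + \mathbb{Z}\gamma_m, \qquad \langle \gamma_i, \gamma_j \rangle = 2\delta_{i,j} + k$$
for every $i, j \in \{1, \dots, m\}$. Then $V_{\Gamma_{m,k}}$ is a vertex operator algebra if $k$ is even and a vertex operator superalgebra if $k$ is odd.
Proposition 5.1. Let $m \in \mathbb{N}$ and $k \in \mathbb{Z}_{\ge 0}$. The vertex operator (super)algebra $D_{m,k}$ is isomorphic to the subalgebra of the vertex operator (super)algebra $V_{\Gamma_{m,k}}$ generated by the vectors
$$\bar{X} = \iota(e_{\gamma_1}) + \cdots + \iota(e_{\gamma_m}), \qquad \bar{Y} = \iota(e_{-\gamma_1}) + \cdots + \iota(e_{-\gamma_m}).$$
Set $\bar{H} = \bar{X}_k \bar{Y}$. Then the Virasoro element in $D_{m,k}$ is given by
$$\bar{\omega}_{m,k} = \frac{1}{2(m + 2)} \left( \bar{X}_{k-1}\bar{Y} + \bar{Y}_{k-1}\bar{X} + \frac{1 - k}{mk + 2}\bar{H}_{-1}^2 \mathbf{1} \right)$$
$$= \frac{1}{2(m + 2)} \sum_{i=1}^{m} \gamma_i(-1)^2 \mathbf{1} + \frac{1}{m + 2} \sum_{i \ne j} \iota(e_{\gamma_i - \gamma_j}) + \frac{1 - k}{2(m + 2)(mk + 2)} \left( \sum_{i=1}^{m} \gamma_i(-1) \right)^2 \mathbf{1}.$$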
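Proof. Define the lattice $\Gamma_1$ by
$$\Gamma_1 = \mathbb{Z}\alpha_1 + \cdots + \mathbb{Z}\alpha_m + \mathbb{Z}\beta, \qquad \langle \alpha_i, \alpha_j \rangle = 2\delta_{i,j}, \quad \langle \alpha_i, \beta \rangle = 0, \quad \langle \beta, \beta \rangle = k.$$
For $i = 1, \dots, m$ set $\gamma_i = \alpha_i + \beta$. It is clear that the lattice $\Gamma_{m,k}$ can be identified with the sublattice $\mathbb{Z}\gamma_1 + \cdots + \mathbb{Z}\gamma_m$ of the lattice $\Gamma_1$. In the same way $V_{\Gamma_{m,k}}$ can be treated as a subalgebra of the vertex operator (super)algebra $V_{\Gamma_1}$. Lemma 3.3 implies that
$$E = \iota(e_{\alpha_1}) + \cdots + \iota(e_{\alpha_m}), \qquad F = \iota(e_{-\alpha_1}) + \cdots + \iota(e_{-\alpha_m})$$
generate a subalgebra of $V_{\Gamma_1}$ isomorphic to $L(m, 0)$, and the elements $\iota(e_\beta)$, $\iota(e_{-\beta})$ generate a subalgebra isomorphic to $F_k$. Since
$$\bar{X} = E_{-1}\iota(e_\beta) \quad \text{and} \quad \bar{Y} = F_{-1}\iota(e_{-\beta}),$$
we conclude that the vertex subalgebra generated by the elements $\bar{X}, \bar{Y} \in V_{\Gamma_{m,k}} \subset V_{\Gamma_1}$ is isomorphic to the vertex operator (super)algebra $D_{m,k}$. This concludes the proof of the proposition.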
The previous result implies that we can identify the generators of $D_{m,k}$ in $L(m, 0) \otimes F_k$ with the generators of $D_{m,k}$ in $V_{\Gamma_{m,k}}$. We shall also prove an interesting proposition which identifies some regular subalgebras of $D_{m,k}$.
Proposition 5.2. For every positive integer $n$ we have that
$$\iota(e_{n(\gamma_1 + \cdots + \gamma_m)}),\ \iota(e_{-n(\gamma_1 + \cdots + \gamma_m)}) \in D_{m,k}.$$
In particular, $D_{m,k}$ has a vertex subalgebra isomorphic to $F_{n^2 m(mk+2)}$.
Proof. Using relations (9) and (10), it is easy to prove that
$$\bar{X}_{-(nm-1)k-2n+1} \cdots \bar{X}_{-(n-1)mk-2n+1} \cdots \bar{X}_{-(2m-1)k-3} \cdots \bar{X}_{-mk-3} \cdot \bar{X}_{-(m-1)k-1} \cdots \bar{X}_{-k-1}\bar{X}_{-1}\mathbf{1} = C\iota(e_{n(\gamma_1 + \cdots + \gamma_m)})$$
for some nontrivial constant $C$. Thus $\iota(e_{n(\gamma_1 + \cdots + \gamma_m)}) \in D_{m,k}$. Similarly we prove that $\iota(e_{-n(\gamma_1 + \cdots + \gamma_m)}) \in D_{m,k}$. The second assertion of the proposition follows from the fact that the vectors $\iota(e_{\pm n(\gamma_1 + \cdots + \gamma_m)})$ generate a subalgebra of $V_{\Gamma_{m,k}}$ isomorphic to $F_{n^2 m(mk+2)}$.
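The pattern of mode indices in the proof of Proposition 5.2 can be generated mechanically from (9) and (10). The sketch below is an illustration under the following reading of the proof: after $r$ full copies of $\gamma_1 + \cdots + \gamma_m$ and $s$ further distinct $\gamma$'s have been accumulated, a new $\gamma_i$ pairs with the current lattice element to $r(mk+2) + sk$, so the lowest nonvanishing mode is minus that pairing minus one. The function names are hypothetical, and the code only tracks indices and norms, not cocycle signs or the constant $C$.

```python
def mode_indices(m, k, n):
    """Modes j_1, j_2, ... (j_1 applied first) used in the proof of Proposition 5.2."""
    modes = []
    for full_copies in range(n):      # copies of gamma_1+...+gamma_m already present
        for s in range(m):            # further distinct gamma's added in this round
            pairing = full_copies * (m * k + 2) + s * k
            modes.append(-pairing - 1)
    return modes


def norm_of_sum(m, k, n):
    """<n(gamma_1+...+gamma_m), n(gamma_1+...+gamma_m)> from <gamma_i, gamma_j> = 2 delta_ij + k."""
    return n * n * sum(2 * (i == j) + k for i in range(m) for j in range(m))
```

The first entries of `mode_indices` are $-1, -k-1, \dots, -(m-1)k-1$ and the last one is $-(nm-1)k-2n+1$, matching the displayed product, while `norm_of_sum` reproduces the index $n^2 m(mk+2)$ of the rank-one subalgebra.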
6
Regularity of the vertex operator algebra Dm,2n
In this section we study the vertex algebra $L(m, 0) \otimes F_{-2n(mn+1)}$ where $m, n$ are positive integers. We know that $L(m, 0) \otimes F_{-2n(mn+1)}$ is a simple regular vertex algebra. Its irreducible modules are:
$$L(m, r) \otimes F^{\bar{s}}_{-2n(mn+1)}, \qquad r \in \{0, \dots, m\}, \quad \bar{s} \in \frac{\mathbb{Z}}{-2n(mn + 1)\mathbb{Z}}.$$
The fusion rules can be calculated easily from the fusion rules for $L(m, 0)$ and $F_{-2n(mn+1)}$. Our main goal is to show that the vertex operator algebra $D_{m,2n}$ is isomorphic to a subalgebra of $L(m, 0) \otimes F_{-2n(mn+1)}$. In order to do this, we shall first give the lattice construction of the vertex algebra $L(m, 0) \otimes F_{-2n(mn+1)}$. Define the following lattice:
$$L = \mathbb{Z}\alpha_1 + \cdots + \mathbb{Z}\alpha_m + \mathbb{Z}\beta, \qquad \langle \alpha_i, \alpha_j \rangle = 2\delta_{i,j}, \quad \langle \alpha_i, \beta \rangle = 0, \quad \langle \beta, \beta \rangle = -2n(mn + 1)$$
for every $i, j \in \{1, \dots, m\}$. We shall now give another description of the lattice $L$. Define $\delta = n\alpha_1 + \cdots + n\alpha_m + \beta$ and, for $i = 1, \dots, m$, $\gamma_i = \alpha_i + \delta$. Since
$$\alpha_i = \gamma_i - \delta, \qquad \beta = (nm + 1)\delta - n(\gamma_1 + \cdots + \gamma_m),$$
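we have that
$$L = \mathbb{Z}\gamma_1 + \cdots + \mathbb{Z}\gamma_m + \mathbb{Z}\delta, \qquad \langle \gamma_i, \gamma_j \rangle = 2\delta_{i,j} + 2n, \quad \langle \gamma_i, \delta \rangle = 0, \quad \langle \delta, \delta \rangle = -2n$$
for every $i, j \in \{1, \dots, m\}$. In fact, we have proved that
$$L \cong \Gamma_{m,2n} + L_{-2n} \cong A_{1,m} + L_{-2n(mn+1)}, \tag{12}$$
which implies that
$$V_L \cong V_{\Gamma_{m,2n}} \otimes F_{-2n} \cong V_{A_{1,m}} \otimes F_{-2n(mn+1)}. \tag{13}$$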
Define the following vectors in the vertex algebra $V_L$:
$$E = \iota(e_{\alpha_1}) + \cdots + \iota(e_{\alpha_m}); \qquad F = \iota(e_{-\alpha_1}) + \cdots + \iota(e_{-\alpha_m}).$$
These vectors generate a subalgebra of $V_L$ isomorphic to $L(m, 0)$. As in Section 5 we define:
$$\bar{X} = \iota(e_{\gamma_1}) + \cdots + \iota(e_{\gamma_m}); \qquad \bar{Y} = \iota(e_{-\gamma_1}) + \cdots + \iota(e_{-\gamma_m}).$$
Clearly $\bar{X}, \bar{Y}$ generate a subalgebra isomorphic to $D_{m,2n}$. In fact, the definition of the elements $E$, $F$, $\bar{X}$, $\bar{Y}$ together with relations (12) and (13) imply the following lemma.
Lemma 6.1.
(1) Let $V$ be the subalgebra of $V_L$ generated by the vectors $E$, $F$, $\iota(e_\beta)$, $\iota(e_{-\beta})$. Then $V \cong L(m, 0) \otimes F_{-2n(mn+1)}$.
(2) Let $W$ be the subalgebra of $V_L$ generated by the vectors $\bar{X}$, $\bar{Y}$, $\iota(e_\delta)$, $\iota(e_{-\delta})$. Then $W \cong D_{m,2n} \otimes F_{-2n}$.
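The change of basis from $(\alpha_1, \dots, \alpha_m, \beta)$ to $(\gamma_1, \dots, \gamma_m, \delta)$ behind relation (12) is a finite computation with Gram matrices, which can be checked directly. The sketch below assumes only the pairings stated in this section; the function name `gram_section6` is a hypothetical helper.

```python
def gram_section6(m, n):
    """Gram matrix of (gamma_1, ..., gamma_m, delta) computed in the basis (alpha_1, ..., alpha_m, beta)."""
    dim = m + 1
    # Gram matrix of (alpha_1, ..., alpha_m, beta): <alpha_i, alpha_j> = 2 delta_ij, <beta, beta> = -2n(mn+1).
    g = [[0] * dim for _ in range(dim)]
    for i in range(m):
        g[i][i] = 2
    g[m][m] = -2 * n * (m * n + 1)

    delta = [n] * m + [1]          # delta = n(alpha_1 + ... + alpha_m) + beta
    basis = []
    for i in range(m):
        gamma = delta[:]
        gamma[i] += 1              # gamma_i = alpha_i + delta
        basis.append(gamma)
    basis.append(delta)

    def pair(u, v):
        return sum(u[a] * g[a][b] * v[b] for a in range(dim) for b in range(dim))

    return [[pair(u, v) for v in basis] for u in basis]
```

For every $m, n$ tried, the result has the block form $\langle \gamma_i, \gamma_j \rangle = 2\delta_{i,j} + 2n$, $\langle \gamma_i, \delta \rangle = 0$, $\langle \delta, \delta \rangle = -2n$, i.e. the Gram matrix of $\Gamma_{m,2n} + L_{-2n}$.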
Now using standard calculations in lattice vertex algebras one easily gets the following important lemma.
Lemma 6.2. In the vertex algebra $V_L$ the following relations hold:
(1) $\bar{X} = (E_{-2n-1}\iota(e_{n(\alpha_1 + \cdots + \alpha_m)}))_{-1}\iota(e_\beta)$;
(2) $\bar{Y} = (F_{-2n-1}\iota(e_{-n(\alpha_1 + \cdots + \alpha_m)}))_{-1}\iota(e_{-\beta})$;
(3) $\iota(e_\delta) = \iota(e_{n(\alpha_1 + \cdots + \alpha_m)})_{-1}\iota(e_\beta)$;
(4) $\iota(e_{-\delta}) = \iota(e_{-n(\alpha_1 + \cdots + \alpha_m)})_{-1}\iota(e_{-\beta})$;
(5) $E = \bar{X}_{-1}\iota(e_{-\delta})$;
(6) $F = \bar{Y}_{-1}\iota(e_\delta)$;
(7) $\iota(e_\beta) = \iota(e_{(nm+1)\delta})_{-1}\iota(e_{-n(\gamma_1 + \cdots + \gamma_m)})$;
(8) $\iota(e_{-\beta}) = \iota(e_{-(nm+1)\delta})_{-1}\iota(e_{n(\gamma_1 + \cdots + \gamma_m)})$.
Theorem 6.3. The vertex subalgebras $V$ and $W$ coincide. In particular, we have the following isomorphism of vertex algebras:
$$L(m, 0) \otimes F_{-2n(mn+1)} \cong D_{m,2n} \otimes F_{-2n}. \tag{14}$$
Proof. Using the same arguments as in the proof of Proposition 5.2 we get
$$\iota(e_{\pm n(\alpha_1 + \cdots + \alpha_m)}) \in V, \qquad \iota(e_{\pm n(\gamma_1 + \cdots + \gamma_m)}) \in W.$$
Then the relations (1)–(4) in Lemma 6.2 imply that $\bar{X}, \bar{Y}, \iota(e_{\pm\delta}) \in V$. Thus $W \subset V$. Similarly, the relations (5)–(8) in Lemma 6.2 give that $V \subset W$. Hence, $V = W$. Then Lemma 6.1 implies that $L(m, 0) \otimes F_{-2n(mn+1)} \cong D_{m,2n} \otimes F_{-2n}$.
The next result follows from [9, 10] and [12].
Proposition 6.4. Let $V$ be a vertex operator (super)algebra and $s \in \mathbb{Z}$, $s \ne 0$. Then we have:
(1) $V \otimes F_s$ is a simple vertex superalgebra if and only if $V$ is a simple vertex operator (super)algebra.
(2) $V \otimes F_s$ is a regular vertex superalgebra if and only if $V$ is a regular vertex operator (super)algebra.
Theorem 6.5. Let $m, m_1, \dots, m_r$ be positive integers and let $k, k_1, \dots, k_r$ be positive even integers.
(1) The vertex operator algebra $D_{m,k}$ is simple and regular. In particular, $D_{m,k}$ is rational and $C_2$-cofinite.
(2) The vertex operator algebra $D_{m_1,k_1} \otimes \cdots \otimes D_{m_r,k_r}$ is simple and regular.
Proof. Since $L(m, 0)$ and $F_{-2n(nm+1)}$ are simple regular vertex algebras, Proposition 6.4 implies that $L(m, 0) \otimes F_{-2n(nm+1)}$ is also simple and regular. Since $L(m, 0) \otimes F_{-2n(nm+1)} \cong D_{m,2n} \otimes F_{-2n}$, using again Proposition 6.4 we get that the vertex operator algebra $D_{m,2n}$ is simple and regular. This gives (1). The proof of (2) is now standard (cf. [9]).
Since $L(m, 0)$ has $(m + 1)$ inequivalent irreducible modules, and for every $k \in \mathbb{Z}$, $k \ne 0$, $F_k$ has $|k|$ inequivalent irreducible modules, we get:
Corollary 6.6. The vertex operator algebra $D_{m,2n}$ has exactly $(m + 1)(nm + 1)$ inequivalent irreducible representations.
7
Regularity of the vertex operator superalgebra Dm,k for k odd
In this section, we shall consider the case when $k$ is an odd natural number. When $k = 1$, $D_{m,1}$ is the vertex operator superalgebra associated to the unitary vacuum representation for the $N = 2$ superconformal algebra. This case was studied in [3]. First we see that the following relation between lattices holds:
$$\Gamma_{m,k} + L_{-k} \cong (A_{1,m} + L_{-2k(mk+2)}) \cup (\tilde{A}_{1,m} + \tilde{L}_{-2k(mk+2)}), \tag{15}$$
which implies the following isomorphism of vertex algebras:
$$V_{\Gamma_{m,k}} \otimes F_{-k} \cong (V_{A_{1,m}} \otimes F_{-2k(mk+2)}) \oplus (V_{\tilde{A}_{1,m}} \otimes MF_{-2k(mk+2)}). \tag{16}$$
Using (15), (16) and a completely analogous proof to that of Theorem 7.1 in [3], we get the following result.
Theorem 7.1. We have the following isomorphism of vertex superalgebras:
$$D_{m,k} \otimes F_{-k} \cong L(m, 0) \otimes F_{-2k(km+2)} \oplus L(m, m) \otimes MF_{-2k(km+2)}.$$
In other words, the vertex superalgebra $D_{m,k} \otimes F_{-k}$ is a simple current extension of the vertex algebra $L(m, 0) \otimes F_{-2k(km+2)}$.
By using Proposition 6.4, Theorem 7.1 and the fact that a simple current extension of a simple regular vertex algebra is a simple regular vertex (super)algebra (cf. [21]) we get the following theorem.
Theorem 7.2. Let $m, m_1, \dots, m_r$ be positive integers and let $k, k_1, \dots, k_r$ be positive odd integers.
(1) The vertex operator superalgebra $D_{m,k}$ is simple and regular. In particular, $D_{m,k}$ is rational and $C_2$-cofinite.
(2) The vertex operator superalgebra $D_{m_1,k_1} \otimes \cdots \otimes D_{m_r,k_r}$ is simple and regular.
We also have:
Corollary 7.3. The vertex operator superalgebra $D_{m,k}$ has exactly $\frac{(m+1)(km+2)}{2}$ inequivalent irreducible representations.
Proof. The results from [21] imply that the extended vertex superalgebra $L(m, 0) \otimes F_{-2k(km+2)} \oplus L(m, m) \otimes MF_{-2k(km+2)}$ has exactly $\frac{1}{2}(m + 1)k(km + 2)$ inequivalent irreducible representations (see also [3, 22]). Since the vertex superalgebra $F_{-n}$ has $n$ inequivalent irreducible representations, we conclude that $D_{m,k}$ has to have $\frac{(m+1)(km+2)}{2}$ inequivalent irreducible representations.
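The counts in Corollary 6.6 and Corollary 7.3 come from dividing the number of irreducible modules of $D_{m,k} \otimes F_{-k}$ by the $k$ irreducible modules of $F_{-k}$. The sketch below just repeats that arithmetic; `count_irreducibles` is a hypothetical helper name, and the totals it divides are the ones quoted in the text, not independently derived.

```python
def count_irreducibles(m, k):
    """Number of irreducible D_{m,k}-modules for positive integers m, k."""
    if m < 1 or k < 1:
        raise ValueError("m and k must be positive integers")
    if k % 2 == 0:
        n = k // 2
        # L(m,0) (x) F_{-2n(mn+1)} has (m+1) * 2n(mn+1) irreducible modules (Section 6).
        total = (m + 1) * 2 * n * (m * n + 1)
    else:
        # The simple current extension has (m+1) k (km+2) / 2 irreducible modules (proof of Corollary 7.3).
        total = (m + 1) * k * (k * m + 2) // 2
    return total // k   # divide by the k irreducible modules of F_{-k}
```

As a consistency check with the Introduction, for $m = 1$ the function returns $k + 2$, which is the number of irreducible modules of $F_{k+2} \cong D_{1,k}$.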
8
Realization of the vertex operator algebra D4,k
The lattice construction of Dm,k in Section 5 is based on a very general lattice realization of the vertex operator algebra L(m, 0). Since in some special cases L(m, 0) admits other realizations, one can apply them in the theory of our vertex operator algebras Dm,k . As an example, in this section we shall consider the case m = 4. We will show that the vertex operator (super)algebra D4,k is the fixed point subalgebra of an automorphism g of the lattice vertex operator (super)algebra VPk . Our construction generalizes the fact that the vertex operator algebra L(4, 0) can be constructed as a subalgebra of the lattice vertex operator algebra VA2 . For every k ∈ Z≥0 , we define the following lattice Pk = Zγ1 + Zγ2 ,
γ1 , γ1 = γ2 , γ2 = k + 2, γ1 , γ2 = k − 1.
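Then $V_{P_k}$ is a vertex operator algebra if $k$ is even and a vertex operator superalgebra if $k$ is odd. Set $P = P_k + L_{-k}$, where $L_{-k} = \mathbb{Z}\delta$ and
\[ \langle\delta,\gamma_1\rangle = \langle\delta,\gamma_2\rangle = 0, \qquad \langle\delta,\delta\rangle = -k. \]
Define
\[ \alpha_1 = \gamma_1 + \delta, \qquad \alpha_2 = \gamma_2 + \delta, \qquad \beta = k(\gamma_1 + \gamma_2) + (2k+1)\delta. \]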
It is easy to see that $P = A_2 + \mathbb{Z}\beta$, where $A_2 = \mathbb{Z}\alpha_1 + \mathbb{Z}\alpha_2$ is the root lattice of type $A_2$. Since $\langle\beta,\beta\rangle = -k(2k+1)$ we get that the following relation between lattices holds: $P_k + L_{-k} \cong A_2 + L_{-k(2k+1)}$. Therefore, we have the following isomorphism of vertex (super)algebras:
\[ V_P \cong V_{P_k} \otimes F_{-k} \cong V_{A_2} \otimes F_{-k(2k+1)}. \tag{17} \]
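The lattice identities behind (17) reduce to integer arithmetic with the Gram matrix of $P$ in the basis $(\gamma_1, \gamma_2, \delta)$. The sketch below (helper names are illustrative, not from the paper) checks that $\alpha_1, \alpha_2$ span a copy of $A_2$, that $\beta$ is orthogonal to them with $\langle\beta,\beta\rangle = -k(2k+1)$, and that the change of basis to $(\alpha_1, \alpha_2, \beta)$ has determinant 1.

```python
def gram(k):
    """Gram matrix of P = P_k + L_{-k} in the basis (gamma1, gamma2, delta)."""
    return [[k + 2, k - 1, 0],
            [k - 1, k + 2, 0],
            [0, 0, -k]]


def pair(u, v, k):
    """Bilinear form <u, v> for integer coordinate vectors u and v."""
    g = gram(k)
    return sum(u[i] * g[i][j] * v[j] for i in range(3) for j in range(3))


def new_basis(k):
    """Coordinates of alpha1, alpha2 and beta in the basis (gamma1, gamma2, delta)."""
    alpha1 = (1, 0, 1)             # gamma1 + delta
    alpha2 = (0, 1, 1)             # gamma2 + delta
    beta = (k, k, 2 * k + 1)       # k(gamma1 + gamma2) + (2k+1) delta
    return alpha1, alpha2, beta


def det3(a):
    """Determinant of a 3x3 integer matrix given as a sequence of rows."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


if __name__ == "__main__":
    k = 3
    a1, a2, b = new_basis(k)
    print(pair(a1, a1, k), pair(a1, a2, k), pair(b, b, k), det3([a1, a2, b]))
```

A determinant of 1 means that $(\alpha_1, \alpha_2, \beta)$ is again a $\mathbb{Z}$-basis of $P$, which is the content of $P = A_2 + \mathbb{Z}\beta$.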
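Let $g$ be the automorphism of $V_P$ which is uniquely determined by
\[ g(\iota(e^{\pm\gamma_1})) = \iota(e^{\pm\gamma_2}), \qquad g(\iota(e^{\pm\gamma_2})) = \iota(e^{\pm\gamma_1}), \qquad g(\iota(e^{\pm\delta})) = \iota(e^{\pm\delta}). \]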
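The map $g$ is an automorphism of order two of the vertex (super)algebra $V_P$, and it is lifted from the automorphism $\gamma_1 \mapsto \gamma_2$, $\gamma_2 \mapsto \gamma_1$, $\delta \mapsto \delta$ of the lattice $P$. The definition of $g$ implies that
\[ g(\iota(e^{\pm\alpha_1})) = \iota(e^{\pm\alpha_2}), \qquad g(\iota(e^{\pm\alpha_2})) = \iota(e^{\pm\alpha_1}), \qquad g(\iota(e^{\pm\beta})) = (-1)^k \iota(e^{\pm\beta}). \]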
Let $W$ be one of the subalgebras $V_{P_k}$, $V_{A_2}$ or $V_{\mathbb{Z}\beta}$. Then $W$ is $g$–invariant and $W = W^0 \oplus W^1$, where
\[ W^0 = \{w \in W \mid gw = w\}, \qquad W^1 = \{w \in W \mid gw = -w\}. \]
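We have the following isomorphism of vertex algebras
\[ V_{P_k}^0 \otimes F_{-k} \cong \begin{cases} V_{A_2}^0 \otimes F_{-k(2k+1)} & \text{if } k \text{ is even}, \\ V_{A_2}^0 \otimes F_{-4k(2k+1)} \oplus V_{A_2}^1 \otimes MF_{-4k(2k+1)} & \text{if } k \text{ is odd}. \end{cases} \]
Next we recall the important fact (see Note 7.3.2 of [24]) that
\[ V_{A_2}^0 \cong L(4,0), \qquad V_{A_2}^1 \cong L(4,4). \tag{18} \]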
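Combining (18), Theorem 6.3 and Theorem 7.1 we get that $V_{P_k}^0 \otimes F_{-k} \cong D_{4,k} \otimes F_{-k}$. This implies that $D_{4,k} \cong V_{P_k}^0$. In this way we have proved the following result.

Theorem 8.1. We have:
\[ D_{4,k} \cong V_{P_k}^0. \]
Under this isomorphism, the generators of $D_{4,k}$ are mapped to
\[ \bar{X} \mapsto \sqrt{2}\,\bigl(\iota(e^{\gamma_1}) + \iota(e^{\gamma_2})\bigr), \qquad \bar{Y} \mapsto \sqrt{2}\,\bigl(\iota(e^{-\gamma_1}) + \iota(e^{-\gamma_2})\bigr). \]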
Acknowledgment We would like to thank the referee for his valuable comments.
References
[1] D. Adamović: "Rationality of Neveu-Schwarz vertex operator superalgebras", Int. Math. Res. Not., Vol. 17, (1997), pp. 865–874.
[2] D. Adamović: "Representations of the N = 2 superconformal vertex algebra", Int. Math. Res. Not., Vol. 2, (1999), pp. 61–79.
[3] D. Adamović: "Vertex algebra approach to fusion rules for N = 2 superconformal minimal models", J. Algebra, Vol. 239, (2001), pp. 549–572.
[4] D. Adamović: Regularity of certain vertex operator superalgebras, Contemp. Math., Vol. 343, Amer. Math. Soc., Providence, 2004, pp. 1–16.
[5] T. Abe, G. Buhl and C. Dong: "Rationality, regularity and C2–cofiniteness", Trans. Amer. Math. Soc., Vol. 356, (2004), pp. 3391–3402.
[6] D. Adamović and A. Milas: "Vertex operator algebras associated to the modular invariant representations for $A_1^{(1)}$", Math. Res. Lett., Vol. 2, (1995), pp. 563–575.
[7] C. Dong: "Vertex algebras associated with even lattices", J. Algebra, Vol. 160, (1993), pp. 245–265.
[8] C. Dong and J. Lepowsky: Generalized vertex algebras and relative vertex operators, Birkhäuser, Boston, 1993.
[9] C. Dong, H. Li and G. Mason: "Regularity of rational vertex operator algebras", Adv. Math., Vol. 132, (1997), pp. 148–166.
[10] C. Dong, G. Mason and Y. Zhu: "Discrete series of the Virasoro algebra and the Moonshine module", Proc. Sympos. Math. Amer. Math. Soc., Vol. 56(2), (1994), pp. 295–316.
[11] W. Eholzer and M.R. Gaberdiel: "Unitarity of rational N = 2 superconformal theories", Comm. Math. Phys., Vol. 186, (1997), pp. 61–85.
[12] I.B. Frenkel, Y.-Z. Huang and J. Lepowsky: "On axiomatic approaches to vertex operator algebras and modules", Memoirs Am. Math. Soc., Vol. 104, 1993.
[13] I.B. Frenkel, J. Lepowsky and A. Meurman: Vertex Operator Algebras and the Monster, Pure Appl. Math., Vol. 134, Academic Press, New York, 1988.
[14] I.B. Frenkel and Y. Zhu: "Vertex operator algebras associated to representations of affine and Virasoro algebras", Duke Math. J., Vol. 66, (1992), pp. 123–168.
[15] B.L. Feigin, A.M. Semikhatov and I.Yu. Tipunin: "Equivalence between chain categories of representations of affine sl(2) and N = 2 superconformal algebras", J. Math. Phys., Vol. 39, (1998), pp. 3865–3905.
[16] B.L. Feigin, A.M. Semikhatov and I.Yu. Tipunin: "A semi-infinite construction of unitary N = 2 modules", Theor. Math. Phys., Vol. 126(1), (2001), pp. 1–47.
[17] Y.-Z. Huang and A. Milas: "Intertwining operator superalgebras and vertex tensor categories for superconformal algebras, II", Trans. Amer. Math. Soc., Vol. 354, (2002), pp. 363–385.
[18] V.G. Kac: Vertex Algebras for Beginners, University Lecture Series, Vol. 10, 2nd ed., AMS, 1998.
[19] Y. Kazama and H. Suzuki: "New N = 2 superconformal field theories and superstring compactifications", Nuclear Phys. B, Vol. 321, (1989), pp. 232–268.
[20] H. Li: "Local systems of vertex operators, vertex superalgebras and modules", J. Pure Appl. Algebra, Vol. 109, (1996), pp. 143–195.
[21] H. Li: "Extension of Vertex Operator Algebras by a Self-Dual Simple Module", J. Algebra, Vol. 187, (1997), pp. 236–267.
[22] H. Li: "Certain extensions of vertex operator algebras of affine type", Comm. Math. Phys., Vol. 217, (2001), pp. 653–696.
[23] H. Li: "Some finiteness properties of regular vertex operator algebras", J. Algebra, Vol. 212, (1999), pp. 495–514.
[24] M. Wakimoto: Lectures on infinite-dimensional Lie algebra, World Scientific Publishing Co., Inc., River Edge, NJ, 2001.
[25] W. Wang: "Rationality of Virasoro Vertex operator algebras", Internat. Math. Res. Notices, Vol. 71(1), (1993), pp. 197–211.
[26] Xu Xiaoping: Introduction to vertex operator superalgebras and their modules, Mathematics and Its Applications, Vol. 456, Kluwer Academic Publishers, 1998.
DOI: 10.2478/s11533-006-0035-4 Research article CEJM 5(1) 2007 19–49
Perturbation index of linear partial differential-algebraic equations with a hyperbolic part
Lutz Angermann (1), Joachim Rang (2)
(1) Institut für Mathematik, TU Clausthal, D-38678 Clausthal–Zellerfeld, Germany
(2) Institut für Analysis und Numerik, Otto-von-Guericke Universität Magdeburg, PF 4120, D-39016 Magdeburg, Germany
Received 31 August 2005; accepted 22 September 2006
Abstract: This paper deals with linear partial differential-algebraic equations (PDAEs) which have a hyperbolic part. If the spatial differential operator satisfies a Gårding-type inequality in a suitable function space setting, a perturbation index can be defined. Theoretical and practical examples are considered.
© Versita Warsaw and Springer-Verlag Berlin Heidelberg. All rights reserved.
Keywords: Partial differential-algebraic equations, perturbation index
MSC (2000): 34A09, 35M10, 35B20
1
Introduction
Systems of differential equations are widely used to describe diverse physical phenomena in such fields as combustion, biology, chemistry, metallurgy, medicine, and fluid mechanics. The well-known Navier-Stokes system forms a representative example. Typically, these systems consist of partial differential, ordinary differential, and algebraic equations and are often called partial differential algebraic equations (PDAEs). In most cases these problems are solved numerically with the help of the vertical or the horizontal method of lines (MOL). Using the vertical method of lines, the PDAE is first semi-discretized in
L. Angermann, J. Rang / Central European Journal of Mathematics 5(1) 2007 19–49
space with (conformal) finite elements. This procedure leads to a differential algebraic equation, the so-called MOL-DAE. A DAE consists of ordinary differential equations (ODEs) coupled with finite-dimensional algebraic equations. According to the survey [6], DAEs are singular implicit ODEs of the form
\[ F(t, u, \dot{u}) = 0, \qquad t \in J, \]
where $J$ is a time interval, $\dot{u}$ denotes the (partial) derivative of $u$ w.r.t. $t$, and the matrix $\frac{\partial F(t,u,v)}{\partial v}$ is singular everywhere in $J$. Otherwise the above system leads to an implicit ODE. For a historical overview we refer to [6] and for a detailed introduction into the theory of DAEs to [7] and [3].

DAEs can be classified by means of a so-called "index" which plays a fundamental role in both theoretical and numerical investigations of such problems. It has turned out to give insight into the solution properties, as well as into the numerical difficulties to be expected when solving these problems, e.g. how to obtain consistent initial data if there are hidden constraints. To a certain extent, the DAE index is a measure of the singularity of the DAE. There are various types of indices known (see, e.g., [11, Sect. 1.2]), the best known being the differentiation index and the perturbation index. In [22] a comparison of both types of indices can be found which shows that the perturbation index seems to be a better measure. Of course sometimes the estimate may be too pessimistic. In [12] and, more recently, in [21] flowcharts are presented which suggest how to select numerical methods depending on the index of the problem. The numerical methods resulting from such a selection have good stability properties and are able to solve MOL-DAEs of index 1 and 2. Unfortunately, a differentiation index cannot be defined for general PDAEs (see [20]). A differentiation index for special classes of PDAEs can be found in [17].
In this note we make use of the perturbation index defined in [20] which is an extension of the classical perturbation index for DAEs known from [9]. Also, in [20] a more detailed overview on related papers dealing with index concepts for PDAEs is given. The present paper investigates linear PDAEs within the framework of weak solutions, i.e. the PDAEs are considered as abstract DAEs in suitable function spaces of Sobolev type. The appropriate treatment of boundary conditions is obtained by the requirement that the spatial component of the differential operator has to satisfy a special inequality which is a weak form of a Gårding-type inequality. Based on this, an index concept extending the classical perturbation index is introduced and theoretical results as well as practical examples are presented.
2
The problem and its weak formulation
Let $\Omega \subset \mathbb{R}^d$, $d \in \{2, 3\}$, be a domain with a Lipschitzian boundary $\partial\Omega$ and let $J := (0, t)$, $t \in (0, \infty]$, be some time interval. In a very few examples we will also allow the case $d = 1$, where $\Omega$ reduces to an interval of the real axis. We consider the following linear system of $n \in \mathbb{N}$ partial differential, ordinary differential, and algebraic equations with respect to the unknown $u = (u_1, \dots, u_n) : J \times \Omega \to \mathbb{R}^n$:
\[ A\dot{u} + Lu = f \quad \text{in } J \times \Omega, \tag{1} \]
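where $A : \Omega \to \mathbb{R}^{n,n}$, $f : J \times \Omega \to \mathbb{R}^n$,
\[ (Lu)_i := \sum_{j=1}^{n} L_{ij} u_j, \quad i = 1, \dots, n, \qquad L_{ij} w := -\nabla \cdot (K_{ij} \nabla w - b_{ij} w) + c_{ij} w, \quad i, j = 1, \dots, n. \]
Here the coefficients $K_{ij} : \Omega \to \mathbb{R}^{d,d}$, $b_{ij} : \Omega \to \mathbb{R}^d$, $c_{ij} : \Omega \to \mathbb{R}$, $i, j = 1, \dots, n$, have the properties
\[ A \in L_\infty(\Omega)^{n,n}, \qquad f \in C(J, H^s(\Omega)^n) \ \text{for some } s \ge 0, \tag{2} \]
\[ K_{ij} = K_{ij}^{\top} \in W_\infty^1(\Omega)^{d,d}, \quad b_{ij} \in L_\infty(\Omega)^d, \quad c_{ij} \in L_\infty(\Omega), \qquad i, j = 1, \dots, n. \tag{3} \]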
We do not assume that the matrix function $A$ in (1) is regular a.e. in $\Omega$. In such a case, the system (1) is called a partial differential-algebraic equation (PDAE). The boundary conditions are formally formulated in a slightly different way from [20] as follows. Given piecewise continuous functions $m_{ij}, \mu_i : \partial\Omega \to \mathbb{R}$ and $u_{\Gamma_i} : J \times \partial\Omega \to \mathbb{R}$, the boundary conditions read as
\[ \sum_{j=1}^{n} \bigl[ \mu_i\, \nu \cdot (K_{ij} \nabla u_j - b_{ij} u_j) + m_{ij} u_j \bigr] + u_{\Gamma_i} = 0 \quad \text{on } J \times \partial\Omega, \quad i = 1, \dots, n, \tag{4} \]
where $\nu$ denotes the outer unit normal. With this formulation it is possible to use the common Dirichlet and flux boundary conditions as well as the conditions described in [5]. First we set
\[ \Gamma_{Ni} := \operatorname{int}(\operatorname{supp} \mu_i), \qquad \Gamma_{Di} := \partial\Omega \setminus \Gamma_{Ni}, \qquad \kappa_{ij} := \operatorname*{ess\,sup}_{x \in \Omega} \|K_{ij}(x)\|_2, \qquad \beta_{ij} := \operatorname*{ess\,sup}_{x \in \Omega} \|b_{ij}(x)\|_2, \]
where $\|\cdot\|_2$ denotes the Euclidean norm in $\mathbb{R}^d$ or the corresponding matrix norm depending on the context, and $\operatorname{int}(\cdot)$ is the set of points which are interior as elements of a subset of the boundary $\partial\Omega$. Furthermore, given some $u_0 : \Omega \to \mathbb{R}^n$, we have the implicit initial condition
\[ A(u - u_0) = 0 \quad \text{for } x \in \Omega. \tag{5} \]
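Next we define the following index sets:
\[ [1, n]_{\mathbb{N}} := \{1, 2, \dots, n\}, \]
\[ N_E := \Bigl\{ i \in [1, n]_{\mathbb{N}} : \sum_{j=1}^{n} (\kappa_{ij} + \kappa_{ji}) > 0 \Bigr\}, \]
\[ N_H := \Bigl\{ i \in [1, n]_{\mathbb{N}} \setminus N_E : \sum_{j=1}^{n} \beta_{ji} > 0 \Bigr\}, \]
\[ N_A := [1, n]_{\mathbb{N}} \setminus (N_E \cup N_H). \]
Thus we get a partition of $[1, n]_{\mathbb{N}}$ into three pairwise disjoint index sets. Without loss of generality we may assume that the indices can be arranged in such a way that
\[ \max_{i \in N_E} i < \min_{i \in N_H} i \le \max_{i \in N_H} i < \min_{i \in N_A} i. \tag{6} \]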
In the following we will give a functional-analytic formulation of the PDAE (1). We set
\[ l_i := \begin{cases} 1, & i \in N_E \cup N_H, \\ 0, & \text{otherwise}, \end{cases} \tag{7} \]
\[ Y := \prod_{i=1}^{n} H^{l_i}(\Omega), \qquad X := L_2(\Omega)^n. \]
The norms on $Y$, $X$ are defined in the usual way, where $v = (v_1, \dots, v_n) \in X$ resp. $Y$:
\[ \|v\|_X^2 := \sum_{i=1}^{n} \|v_i\|_{0,2,\Omega}^2, \qquad \|v\|_Y^2 := \sum_{i=1}^{n} \|v_i\|_{l_i,2,\Omega}^2. \tag{8} \]
The following examples illustrate these settings under the assumptions (2)–(3).

Example 2.1. Let $d := 1$, $\Omega := (0, 1)$ and consider the following PDE (see [8]), where $a_{11} > 0$ and $a_{22} > 0$ a.e. in $\Omega$ and $v'$ denotes the partial derivative of $v$ w.r.t. $x$:
\[ \begin{cases} a_{11} \dot{u}_1 + b_{12} u_2' + c_{11} u_1 = f_1 & \text{in } J \times \Omega, \\ a_{22} \dot{u}_2 + b_{12} u_1' + c_{22} u_2 = f_2 & \text{in } J \times \Omega, \\ u_1(t, 0) = g_1(t) & t \in J, \\ u_2(t, 1) = g_2(t) & t \in J, \\ u_1(0, x) = u_{10}(x) & x \in \Omega, \\ u_2(0, x) = u_{20}(x) & x \in \Omega. \end{cases} \tag{9} \]
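Writing this problem in the form (1), (4), (5) we see that $n = 2$ and
\[ A = \begin{pmatrix} a_{11} & 0 \\ 0 & a_{22} \end{pmatrix}, \quad (K_{ij}) = 0, \quad (b_{ij}) = \begin{pmatrix} 0 & b_{12} \\ b_{12} & 0 \end{pmatrix}, \quad (c_{ij}) = \begin{pmatrix} c_{11} & 0 \\ 0 & c_{22} \end{pmatrix}, \]
\[ \mu_1 = \mu_2 = 0, \qquad m_{ij} = \delta_{ij} \ \text{on } \partial\Omega = \{0, 1\}, \qquad u_{\Gamma_i} = -g_i. \]
Then, if $\beta_{12} > 0$, we have $N_E = N_A = \emptyset$, $N_H = \{1, 2\}$, $l_1 = l_2 = 1$, $Y = H^1(\Omega)^2$; otherwise $N_E = N_H = \emptyset$, $N_A = \{1, 2\}$, $l_1 = l_2 = 0$, $Y = L_2(\Omega)^2$.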
Example 2.2. Consider the following PDAE
\[ \begin{cases} \dot{u}_1 + \nabla \cdot (b_{11} u_1 + b_{12} u_2) + c_{11} u_1 + c_{12} u_2 = f_1 & \text{in } J \times \Omega, \\ \dot{u}_2 + \nabla \cdot (b_{12} u_1 + b_{22} u_2) + c_{21} u_1 + c_{22} u_2 = f_2 & \text{in } J \times \Omega, \\ \nu \cdot (b_{11} u_1 + b_{12} u_2) = g_1 & \text{in } J \times \partial\Omega, \\ \nu \cdot (b_{12} u_1 + b_{22} u_2) = g_2 & \text{in } J \times \partial\Omega, \\ u_1(0, x) = u_{10}(x) & x \in \Omega, \\ u_2(0, x) = u_{20}(x) & x \in \Omega. \end{cases} \tag{10} \]
Writing this problem in the form (1), (4), (5) we see that $n = 2$ and
\[ A = I, \quad (K_{ij}) = 0, \quad (b_{ij}) = \begin{pmatrix} b_{11} & b_{12} \\ b_{12} & b_{22} \end{pmatrix}, \quad (c_{ij}) = \begin{pmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{pmatrix}, \]
\[ \mu_1 = \mu_2 = 1, \qquad m_{ij} = 0 \ \text{on } \partial\Omega, \qquad u_{\Gamma_i} = g_i. \]
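Then, if all functions $b_{ij}$ are nontrivial, we have $N_E = N_A = \emptyset$, $N_H = \{1, 2\}$, $l_1 = l_2 = 1$, $Y = H^1(\Omega)^2$.

Example 2.3. Consider the following PDAE
\[ \begin{cases} \dot{u}_1 - \nabla \cdot (K_{11} \nabla u_1 - b_{11} u_1 - b_{12} u_2) + c_{11} u_1 + c_{12} u_2 = f_1 & \text{in } J \times \Omega, \\ \dot{u}_2 + \nabla \cdot (b_{12} u_1 + b_{22} u_2) + c_{21} u_1 + c_{22} u_2 = f_2 & \text{in } J \times \Omega, \\ u_1 = g_1 & \text{in } J \times \partial\Omega, \\ \nu \cdot (b_{12} u_1 + b_{22} u_2) = g_2 & \text{in } J \times \partial\Omega, \\ u_1(0, x) = u_{10}(x) & x \in \Omega, \\ u_2(0, x) = u_{20}(x) & x \in \Omega. \end{cases} \tag{11} \]
Writing this problem in the form (1), (4), (5) we see that $n = 2$ and
\[ A = I, \quad (K_{ij}) = \begin{pmatrix} I & 0 \\ 0 & 0 \end{pmatrix}, \quad (b_{ij}) = \begin{pmatrix} b_{11} & b_{12} \\ b_{12} & b_{22} \end{pmatrix}, \quad (c_{ij}) = \begin{pmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{pmatrix}, \]
\[ \mu_1 = 0, \quad \mu_2 = 1, \qquad m_{1j} = \delta_{1j}, \quad m_{2j} = 0 \ \text{on } \partial\Omega, \qquad u_{\Gamma_1} = -g_1, \quad u_{\Gamma_2} = g_2. \]
Then, if $b_{12}$ or $b_{22}$ are nontrivial, we have $N_E = \{1\}$, $N_H = \{2\}$, $N_A = \emptyset$, $l_1 = l_2 = 1$, $Y = H^1(\Omega)^2$. More examples can be found in [20] and [23].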
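The classification used in Examples 2.1–2.3 follows mechanically from the definitions of $N_E$, $N_H$, $N_A$ and of the orders $l_i$ in (7). The sketch below assumes that the numbers $\kappa_{ij}$ and $\beta_{ij}$ are already available as nested lists; the function names and the sample values are illustrative and not taken from the paper.

```python
def index_sets(kappa, beta):
    """Split {1, ..., n} into (N_E, N_H, N_A) from the matrices kappa and beta."""
    n = len(kappa)
    # N_E: some diffusion coefficient K_ij or K_ji is nontrivial.
    N_E = {i + 1 for i in range(n)
           if sum(kappa[i][j] + kappa[j][i] for j in range(n)) > 0}
    # N_H: not elliptic, but some convection coefficient b_ji is nontrivial.
    N_H = {i + 1 for i in range(n)
           if (i + 1) not in N_E and sum(beta[j][i] for j in range(n)) > 0}
    # N_A: everything else (purely algebraic/ODE components).
    N_A = set(range(1, n + 1)) - N_E - N_H
    return N_E, N_H, N_A


def sobolev_orders(n, N_E, N_H):
    """Orders l_i from (7): 1 on N_E and N_H, 0 otherwise."""
    return [1 if i in (N_E | N_H) else 0 for i in range(1, n + 1)]


if __name__ == "__main__":
    # Example 2.1 with beta_12 > 0 (sample value 1.0): purely hyperbolic.
    print(index_sets([[0, 0], [0, 0]], [[0, 1.0], [1.0, 0]]))
    # Example 2.3 with all b_ij nontrivial (sample values 1.0): mixed case.
    print(index_sets([[1.0, 0], [0, 0]], [[1.0, 1.0], [1.0, 1.0]]))
```

Note that a component with a nontrivial $K_{ij}$ is always assigned to $N_E$ first, so the three sets are disjoint by construction.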
Now we give an abstract formulation of the problem (1), (4) and (5) under the assumptions (2)–(3) and the following symmetry condition:
\[ b_{ij} = b_{ji} \quad \forall (i, j) \in [1, n]_{\mathbb{N}}^2 \setminus N_E^2. \tag{12} \]
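As usual, for $u, v \in Y$, we formally multiply $Lu$ by $v$, integrate the result over $\Omega$ and integrate by parts, where $(\cdot, \cdot)$ denotes the $L_2(\Omega)$- or $L_2(\Omega)^d$-inner product, and, with the subscript $\partial\Omega$ or $\Gamma_{Ni}$, the $L_2(\partial\Omega)$- or $L_2(\Gamma_{Ni})^d$-inner product:
\[ \begin{aligned} (Lu, v) &= \sum_{i,j=1}^{n} \bigl( -\nabla \cdot (K_{ij} \nabla u_j - b_{ij} u_j) + c_{ij} u_j,\ v_i \bigr) \\ &= \sum_{i \in N_E} \sum_{j=1}^{n} \Bigl[ (K_{ij} \nabla u_j - b_{ij} u_j, \nabla v_i) + (c_{ij} u_j, v_i) - \bigl( \nu \cdot (K_{ij} \nabla u_j - b_{ij} u_j), v_i \bigr)_{\partial\Omega} \Bigr] \\ &\quad + \sum_{i \in N_H} \sum_{j=1}^{n} \bigl( \nabla \cdot (b_{ij} u_j) + c_{ij} u_j,\ v_i \bigr) + \sum_{i \in N_A} \sum_{j=1}^{n} (c_{ij} u_j, v_i). \end{aligned} \]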
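Here we have used that, by definition of the index sets, $K_{ij} = 0$ for $i \in [1, n]_{\mathbb{N}} \setminus N_E$, $j \in [1, n]_{\mathbb{N}}$. Namely, a nontrivial coefficient $K_{ij}$ for some $i \in [1, n]_{\mathbb{N}} \setminus N_E$ would imply $\kappa_{ij} > 0$ and, thus, $i, j \in N_E$. A similar argument applies to the third term. A nontrivial coefficient $b_{ij}$ for some $i \in N_A$ would imply, by the symmetry assumption (12), $\beta_{ji} > 0$ and, therefore, $i \in N_H$. In the next step, we separate also with respect to the summation over $j$:
\[ \begin{aligned} (Lu, v) &= \sum_{i,j \in N_E} \Bigl[ (K_{ij} \nabla u_j - b_{ij} u_j, \nabla v_i) + (c_{ij} u_j, v_i) - \bigl( \nu \cdot (K_{ij} \nabla u_j - b_{ij} u_j), v_i \bigr)_{\partial\Omega} \Bigr] \\ &\quad + \sum_{i \in N_E} \sum_{j \in N_H} \Bigl[ -(b_{ij} u_j, \nabla v_i) + (c_{ij} u_j, v_i) + \bigl( (\nu \cdot b_{ij}) u_j, v_i \bigr)_{\partial\Omega} \Bigr] \\ &\quad + \sum_{i \in N_E} \sum_{j \in N_A} (c_{ij} u_j, v_i) \\ &\quad + \sum_{i \in N_H} \sum_{j \in N_E \cup N_H} \bigl( \nabla \cdot (b_{ij} u_j) + c_{ij} u_j,\ v_i \bigr) \\ &\quad + \sum_{i \in N_H} \sum_{j \in N_A} (c_{ij} u_j, v_i) \\ &\quad + \sum_{i \in N_A} \sum_{j=1}^{n} (c_{ij} u_j, v_i). \end{aligned} \tag{13} \]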
The coefficients Kij in the second and the third lines disappear by the definition of the index sets (this is the same argument as above). The coefficients bij in the third and the fifth lines disappear by the definition of the index set NH (but not as a consequence of the symmetry assumption (12)).
We introduce the following function spaces:
\[ V_i := \begin{cases} \{v \in H^1(\Omega) : v|_{\Gamma_{Di}} = 0\}, & i \in N_E, \\ H^1(\Omega), & i \in N_H, \\ L_2(\Omega), & i \in N_A, \end{cases} \qquad V := \prod_{i=1}^{n} V_i. \]
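The norm on $V$ is defined by restricting the norm of $Y$, cf. (8). Using these definitions and integrating by parts in the fourth line of (13), we get
\[ \begin{aligned} (Lu, v) &= \sum_{i,j \in N_E} \Bigl[ (K_{ij} \nabla u_j - b_{ij} u_j, \nabla v_i) + (c_{ij} u_j, v_i) - \bigl( \nu \cdot (K_{ij} \nabla u_j - b_{ij} u_j), v_i \bigr)_{\Gamma_{Ni}} \Bigr] \\ &\quad + \sum_{i \in N_E} \sum_{j \in N_H} \Bigl[ -(b_{ij} u_j, \nabla v_i) + (c_{ij} u_j, v_i) + \bigl( (\nu \cdot b_{ij}) u_j, v_i \bigr)_{\Gamma_{Ni}} \Bigr] \\ &\quad + \sum_{i \in N_H} \sum_{j \in N_E \cup N_H} \Bigl[ -(b_{ij} u_j, \nabla v_i) + (c_{ij} u_j, v_i) + \bigl( (\nu \cdot b_{ij}) u_j, v_i \bigr)_{\partial\Omega} \Bigr] \\ &\quad + \sum_{i \in N_E \cup N_H} \sum_{j \in N_A} (c_{ij} u_j, v_i) + \sum_{i \in N_A} \sum_{j=1}^{n} (c_{ij} u_j, v_i). \end{aligned} \]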
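If we take into consideration the definition of the index sets, the boundary conditions (4) read as follows:
\[ \sum_{j \in N_E} \bigl[ \mu_i\, \nu \cdot (K_{ij} \nabla u_j - b_{ij} u_j) + m_{ij} u_j \bigr] + \sum_{j \in N_H} \bigl( m_{ij} - \mu_i (\nu \cdot b_{ij}) \bigr) u_j + \sum_{j \in N_A} m_{ij} u_j + u_{\Gamma_i} = 0 \quad \text{if } i \in N_E, \]
\[ \sum_{j \in N_E \cup N_H} \bigl( m_{ij} - \mu_i (\nu \cdot b_{ij}) \bigr) u_j + \sum_{j \in N_A} m_{ij} u_j + u_{\Gamma_i} = 0 \quad \text{if } i \in N_H, \]
\[ \sum_{j=1}^{n} m_{ij} u_j + u_{\Gamma_i} = 0 \quad \text{if } i \in N_A. \]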
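We note that $\mu_i$ is piecewise continuous by assumption and, therefore, $\Gamma_{Ni}$ is either empty or has a positive boundary measure $|\Gamma_{Ni}|$. Now we use the following assumptions w.r.t. the boundary data:
\[ \begin{aligned} \mu_i^{-1} &\in L_\infty(\Gamma_{Ni}) && \forall i \in N_E \ \text{s.t. } |\Gamma_{Ni}| > 0, \\ \mu_i &= 1 && \forall i \in N_H, \\ u_{\Gamma_i} &= 0 && \forall i \in N_A, \\ m_{ij} &= m_{ji} && \forall (i, j) \in [1, n]_{\mathbb{N}}^2 \setminus N_H^2, \\ m_{ij} &= 0 && \forall j \in N_E \setminus \{i\},\ \forall j \in N_A. \end{aligned} \tag{14} \]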
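Then the boundary conditions for $i \in N_A$ are satisfied identically, and the remaining conditions get the form
\[ \mu_i \sum_{j \in N_E} \nu \cdot (K_{ij} \nabla u_j - b_{ij} u_j) - \mu_i \sum_{j \in N_H} (\nu \cdot b_{ij}) u_j + m_{ii} u_i + u_{\Gamma_i} = 0 \quad \text{if } i \in N_E, \]
\[ \sum_{j \in N_H} m_{ij} u_j - \sum_{j \in N_E \cup N_H} (\nu \cdot b_{ij}) u_j = 0 \quad \text{if } i \in N_H. \]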
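This formally implies that
\[ \begin{aligned} (Lu, v) &= \sum_{i,j \in N_E} \bigl[ (K_{ij} \nabla u_j - b_{ij} u_j, \nabla v_i) + (c_{ij} u_j, v_i) \bigr] + \sum_{i \in N_E} \sum_{j \in N_H} \bigl[ -(b_{ij} u_j, \nabla v_i) + (c_{ij} u_j, v_i) \bigr] \\ &\quad + \sum_{i \in N_E} \Bigl( \frac{m_{ii}}{\mu_i} u_i, v_i \Bigr)_{\Gamma_{Ni}} + \sum_{i \in N_E} \Bigl( \frac{u_{\Gamma_i}}{\mu_i}, v_i \Bigr)_{\Gamma_{Ni}} \\ &\quad + \sum_{i \in N_H} \sum_{j \in N_E \cup N_H} \bigl[ -(b_{ij} u_j, \nabla v_i) + (c_{ij} u_j, v_i) \bigr] + \sum_{i,j \in N_H} (m_{ij} u_j, v_i)_{\partial\Omega} \\ &\quad + \sum_{i \in N_E \cup N_H} \sum_{j \in N_A} (c_{ij} u_j, v_i) + \sum_{i \in N_A} \sum_{j=1}^{n} (c_{ij} u_j, v_i) \\ &= \sum_{i,j \in N_E} (K_{ij} \nabla u_j - b_{ij} u_j, \nabla v_i) - \sum_{i \in N_E} \sum_{j \in N_H} (b_{ij} u_j, \nabla v_i) - \sum_{i \in N_H} \sum_{j \in N_E \cup N_H} (b_{ij} u_j, \nabla v_i) \\ &\quad + \sum_{i,j=1}^{n} (c_{ij} u_j, v_i) + \sum_{i \in N_E} \Bigl( \frac{m_{ii}}{\mu_i} u_i, v_i \Bigr)_{\Gamma_{Ni}} + \sum_{i,j \in N_H} (m_{ij} u_j, v_i)_{\partial\Omega} + \sum_{i \in N_E} \Bigl( \frac{u_{\Gamma_i}}{\mu_i}, v_i \Bigr)_{\Gamma_{Ni}}. \end{aligned} \]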
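Now we can introduce the following linear operators $A, B : Y \to V^*$ and right-hand sides $f_\Omega : J \to X$, $f_N : J \to V^*$:
\[ \begin{aligned} \langle Au, v \rangle &:= \sum_{i,j=1}^{n} (a_{ij} u_j, v_i), \\ \langle Bu, v \rangle &:= \sum_{i,j \in N_E} (K_{ij} \nabla u_j - b_{ij} u_j, \nabla v_i) - \sum_{i \in N_E} \sum_{j \in N_H} (b_{ij} u_j, \nabla v_i) - \sum_{i \in N_H} \sum_{j \in N_E \cup N_H} (b_{ij} u_j, \nabla v_i) \\ &\quad + \sum_{i,j=1}^{n} (c_{ij} u_j, v_i) + \sum_{i \in N_E} \Bigl( \frac{m_{ii}}{\mu_i} u_i, v_i \Bigr)_{\Gamma_{Ni}} + \sum_{i,j \in N_H} (m_{ij} u_j, v_i)_{\partial\Omega}, \\ \langle f_\Omega, v \rangle &:= \sum_{i=1}^{n} (f_i, v_i), \qquad \langle f_N, v \rangle := \sum_{i \in N_E} \Bigl( \frac{u_{\Gamma_i}}{\mu_i}, v_i \Bigr)_{\Gamma_{Ni}}. \end{aligned} \tag{15} \]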
With this, we get the following operator equation in $V^*$ w.r.t. the unknown element $u : J \to Y$:
\[ A\dot{u} + Bu = f_\Omega + f_N. \tag{16} \]
Given some $u_0 \in Y$, the initial condition reads as
\[ A(u - u_0) = 0. \tag{17} \]
Equation (16) is called an abstract DAE (ADAE). In order to be able to include inhomogeneous Dirichlet boundary conditions, we assume that there exists some abstract function $u_D : J \to Y$ with $u_{Dj} = u_{\Gamma_j}$ on $\Gamma_{Dj}$ for $j \in [1, n]_{\mathbb{N}}$. Using the representation $u = u_{\mathrm{hom}} + u_D$, where $u_{\mathrm{hom}} : J \to V$, and introducing the right-hand sides $f_D, f : J \to V^*$ by
\[ f_D := -A\dot{u}_D - Bu_D \quad \text{and} \quad f := f_\Omega + f_D + f_N, \tag{18} \]
we get the following operator equation in $V^*$ w.r.t. the unknown element $u_{\mathrm{hom}} : J \to V$:
\[ A\dot{u}_{\mathrm{hom}} + Bu_{\mathrm{hom}} = f. \tag{19} \]
If there are no Dirichlet boundary conditions at all, then we formally set $u_D = 0$. In [20] and [19] we have seen that the perturbation index can be determined with the help of a Gårding-type inequality, i.e. there exist two constants $\lambda \ge 0$, $c > 0$ such that
\[ \forall v \in V : \quad \langle Bv, v \rangle + \lambda \|v\|_X^2 \ge c \|v\|_V^2. \tag{20} \]
Sufficient conditions under which the operator $B$ satisfies a Gårding-type inequality can be found in [20]. Unfortunately, (20) is not satisfied for problems with a hyperbolic part as for example (9), (10), and (11). In this case the operator $B$ should satisfy an estimate of the form
\[ \langle Bv, v \rangle + \lambda \|v\|_X^2 \ge c \sum_{i \in N_E} \|\nabla v_i\|_{0,2,\Omega}^2. \tag{21} \]
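To give a short formulation of the corresponding result, we define the following matrices $B, M : \Omega \to \mathbb{R}^{n,n}$ by
\[ B := \tfrac{1}{2} (\nu \cdot b_{ij})_{i,j=1}^{n}, \qquad M := (m_{ij})_{i,j=1}^{n}. \]
By assumption (6), $B$ and $M$ have the following block structure:
\[ B = \begin{pmatrix} B_{EE} & B_{EH} & B_{EA} \\ B_{HE} & B_{HH} & B_{HA} \\ B_{AE} & B_{AH} & B_{AA} \end{pmatrix}, \qquad M = \begin{pmatrix} M_{EE} & M_{EH} & M_{EA} \\ M_{HE} & M_{HH} & M_{HA} \\ M_{AE} & M_{AH} & M_{AA} \end{pmatrix}. \]
Taking into consideration the definition of the index sets and assumption (14), this structure simplifies to
\[ B = \begin{pmatrix} B_{EE} & B_{EH} & 0 \\ B_{HE} & B_{HH} & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad M = \begin{pmatrix} M_{EE} & 0 & 0 \\ 0 & M_{HH} & 0 \\ 0 & 0 & 0 \end{pmatrix}, \]
where, thanks to (12) and (14), again, $B_{HE} = B_{EH}^{\top}$, $B_{HH} = B_{HH}^{\top}$, and $M_{EE}$ is a diagonal matrix. Finally, we define a diagonal matrix $D_E = \operatorname{diag}(d_i)$, $i \in N_E$, by
\[ d_i := \begin{cases} \mu_i^{-1}, & |\Gamma_{Ni}| > 0, \\ 0, & \text{otherwise}. \end{cases} \]

Lemma 2.4. Let there exist constants $\kappa_{ii} > 0$, $i \in N_E$, such that
\[ \xi \cdot (K_{ii}(x) \xi) \ge \kappa_{ii} \|\xi\|_2^2 \quad \forall \xi \in \mathbb{R}^d, \ \forall x \in \Omega. \]
Let $b_{ij} \in W_\infty^1(\Omega)^d$ ($i, j \in [1, n]_{\mathbb{N}}$) be such that the symmetry condition (12) is satisfied but for all indices, i.e.
\[ b_{ij} = b_{ji} \quad \forall i, j \in [1, n]_{\mathbb{N}}, \tag{22} \]
and let the symmetric part of the matrix
\[ \begin{pmatrix} D_E M_{EE} & 0 \\ 0 & M_{HH} \end{pmatrix} - \begin{pmatrix} B_{EE} & B_{EH} \\ B_{HE} & B_{HH} \end{pmatrix} \]
be positive semidefinite. Finally, let the entries of a matrix $\kappa$ be given by
\[ \kappa_{ij} := \begin{cases} \kappa_{ii}, & i = j, \\ -\operatorname*{ess\,sup}_{x \in \Omega} \|K_{ij}(x)\|_2, & i \ne j, \end{cases} \qquad i, j \in N_E, \]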
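and let $\kappa_{\mathrm{sym}} := \frac{\kappa + \kappa^{\top}}{2}$ be positive definite. Then the inequality (21) is satisfied for some positive $c$ and $\lambda$.

Proof. Since $\nabla(v_i v_j) = v_i \nabla v_j + v_j \nabla v_i$, we can write
\[ \sum_{i,j \in N_E \cup N_H} (b_{ij} v_j, \nabla v_i) = \sum_{i,j \in N_E \cup N_H} (b_{ij}, \nabla(v_i v_j)) - \sum_{i,j \in N_E \cup N_H} (b_{ij} v_i, \nabla v_j). \]
From the symmetry assumption (22) we see that
\[ \sum_{i,j \in N_E \cup N_H} (b_{ij} v_i, \nabla v_j) = \sum_{i,j \in N_E \cup N_H} (b_{ji} v_i, \nabla v_j) = \sum_{i,j \in N_E \cup N_H} (b_{ij} v_j, \nabla v_i), \]
consequently,
\[ \begin{aligned} \sum_{i,j \in N_E \cup N_H} (b_{ij} v_j, \nabla v_i) &= \frac{1}{2} \sum_{i,j \in N_E \cup N_H} (b_{ij}, \nabla(v_i v_j)) \\ &= -\frac{1}{2} \sum_{i,j \in N_E \cup N_H} ((\nabla \cdot b_{ij}) v_j, v_i) + \frac{1}{2} \sum_{i,j \in N_E \cup N_H} ((\nu \cdot b_{ij}) v_j, v_i)_{\partial\Omega}. \end{aligned} \]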
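Thus we have, by assumption,
\[ \begin{aligned} \langle Bv, v \rangle &= \sum_{i,j \in N_E} (K_{ij} \nabla v_j, \nabla v_i) - \sum_{i,j \in N_E \cup N_H} (b_{ij} v_j, \nabla v_i) + \sum_{i,j=1}^{n} (c_{ij} v_j, v_i) \\ &\quad + \sum_{i \in N_E} \Bigl( \frac{m_{ii}}{\mu_i} v_i, v_i \Bigr)_{\Gamma_{Ni}} + \sum_{i,j \in N_H} (m_{ij} v_j, v_i)_{\partial\Omega} \\ &= \sum_{i,j \in N_E} (K_{ij} \nabla v_j, \nabla v_i) + \frac{1}{2} \sum_{i,j \in N_E \cup N_H} ((\nabla \cdot b_{ij}) v_j, v_i) + \sum_{i,j=1}^{n} (c_{ij} v_j, v_i) \\ &\quad + \sum_{i \in N_E} \Bigl( \frac{m_{ii}}{\mu_i} v_i, v_i \Bigr)_{\Gamma_{Ni}} + \sum_{i,j \in N_H} (m_{ij} v_j, v_i)_{\partial\Omega} - \frac{1}{2} \sum_{i,j \in N_E \cup N_H} ((\nu \cdot b_{ij}) v_j, v_i)_{\partial\Omega} \\ &\ge \sum_{i,j \in N_E} (K_{ij} \nabla v_j, \nabla v_i) + \frac{1}{2} \sum_{i,j \in N_E \cup N_H} ((\nabla \cdot b_{ij}) v_j, v_i) + \sum_{i,j=1}^{n} (c_{ij} v_j, v_i). \end{aligned} \]
With $\gamma_{ij} := \frac{1}{2} \operatorname*{ess\,sup}_{x \in \Omega} (\nabla \cdot b_{ij} + 2 c_{ij})$ we get
\[ \begin{aligned} \langle Bv, v \rangle &\ge \sum_{i \in N_E} \kappa_{ii} \|\nabla v_i\|^2 - \sum_{\substack{i,j \in N_E \\ i \ne j}} \kappa_{ij} \|\nabla v_j\| \, \|\nabla v_i\| - \sum_{i,j=1}^{n} \gamma_{ij} \|v_i\| \, \|v_j\| \\ &\ge \sum_{i \in N_E} \kappa_{ii} \|\nabla v_i\|^2 - \sum_{\substack{i,j \in N_E \\ i \ne j}} \kappa_{ij} \|\nabla v_i\| \, \|\nabla v_j\| - \lambda \|v\|_X^2, \end{aligned} \]
where $\lambda$ is the spectral norm of the matrix $\gamma := (\gamma_{ij})_{i,j=1}^{n}$. If $c$ denotes the spectral norm of the matrix $\kappa_{\mathrm{sym}}$, it follows
\[ \langle Bv, v \rangle \ge c \sum_{i \in N_E} \|\nabla v_i\|^2 - \lambda \|v\|_X^2 \]
and the Lemma is proven.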
Example 2.5. We consider the PDAE (9) of Example 2.1. The inequality (21) is satisfied with $\lambda = \max\{\gamma_{11}, \gamma_{22}\}$ and arbitrary $c > 0$ (note that $N_E = \emptyset$).

Example 2.6. We consider the PDAE (10) of Example 2.2. The inequality (21) is satisfied with $\lambda = \left\| \begin{pmatrix} \gamma_{11} & \gamma_{12} \\ \gamma_{21} & \gamma_{22} \end{pmatrix} \right\|_2$ and arbitrary $c > 0$.

Example 2.7. We consider the PDAE (11) of Example 2.3. The inequality (21) is satisfied with $c = \kappa_{11}$ and $\lambda = \left\| \begin{pmatrix} \gamma_{11} & \gamma_{12} \\ \gamma_{21} & \gamma_{22} \end{pmatrix} \right\|_2$.

More examples can be found in [20] or [23].
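In Examples 2.6 and 2.7 the constant $\lambda$ is the spectral norm of a $2 \times 2$ matrix $\gamma$, which has a closed form via the eigenvalues of $\gamma^{\top}\gamma$. The sketch below evaluates it; the function name and the numerical entries are hypothetical stand-ins for the $\gamma_{ij}$, not values from the paper.

```python
import math


def spectral_norm_2x2(g):
    """Largest singular value of a 2x2 matrix g = [[a, b], [c, d]]."""
    (a, b), (c, d) = g
    tr = a * a + b * b + c * c + d * d      # trace of g^T g
    det = (a * d - b * c) ** 2              # determinant of g^T g
    disc = max(tr * tr - 4.0 * det, 0.0)    # guard against rounding below zero
    return math.sqrt((tr + math.sqrt(disc)) / 2.0)


if __name__ == "__main__":
    gamma = [[1.0, 2.0], [3.0, 4.0]]        # hypothetical values of gamma_ij
    print("lambda =", spectral_norm_2x2(gamma))
```

For a diagonal $\gamma$ this reduces to the largest absolute diagonal entry, which is the situation of Example 2.5 when $\gamma_{11}, \gamma_{22} \ge 0$.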
3
The perturbation index
In this section we introduce an extension of the perturbation index, which is known from the theory of DAEs, to the case of ADAEs and PDAEs. In particular, it turns out that the introduced perturbation index coincides with the classical notion in the case of DAEs. Let $u$ be a weak solution of the ADAE (16) which is consistent with the initial value $u_0 \in Y$, i.e. $A(u - u_0) = 0$. The notion of the perturbation index is based on the investigation of the sensitivity of this solution with respect to initial values, boundary values and right-hand sides. Starting from the ADAE written in the form (19), we introduce perturbations $\delta_\Omega : J \to X$ (of the structure (15) of the right-hand side), $\delta_N : J \to V^*$ (of the structure (15) of the Neumann-type boundary conditions), and $\delta_D : J \to V^*$ (of the structure (18) of the Dirichlet-type boundary conditions) and look for a solution $\hat{u}_{\mathrm{hom}} : J \to V$ of the equation
\[ A\dot{\hat{u}}_{\mathrm{hom}} + B\hat{u}_{\mathrm{hom}} = f_\Omega + f_D + f_N + \delta_\Omega + \delta_D + \delta_N. \tag{23} \]
Subtracting (23) from (19) leads to the so-called homogenized error equation with respect to $\varepsilon_{\mathrm{hom}} := \hat{u}_{\mathrm{hom}} - u_{\mathrm{hom}} : J \to V$:
\[ A\dot{\varepsilon}_{\mathrm{hom}} + B\varepsilon_{\mathrm{hom}} = \delta_\Omega + \delta_D + \delta_N =: \delta. \tag{24} \]
Now we can define the perturbation index of an ADAE.

Definition 3.1. Let $F$ be a family of right-hand sides such that, for any $f \in F$, the ADAE (16) has only one weak solution. Then the ADAE (16) has the perturbation index $i_p$ along the solution $u$ on $J$, if $i_p$ is the smallest integer such that, for all $\hat{u}$ having defects $\delta_\Omega : J \to X$ and $\delta_D, \delta_N : J \to V^*$, i.e.
\[ A\dot{\hat{u}}_{\mathrm{hom}} + B\hat{u}_{\mathrm{hom}} = f_\Omega + f_D + f_N + \delta_\Omega + \delta_D + \delta_N, \]
there is on $J$ an estimate of the form
\[ \|\hat{u}_{\mathrm{hom}}(t) - u_{\mathrm{hom}}(t)\|_X \le C \Bigl( \|\hat{u}_{\mathrm{hom}}(0) - u_{\mathrm{hom}}(0)\|_X + \sum_{j=0}^{i_p - 1} \sup_{\tau \in J} \Bigl\| \frac{\partial^j \delta(\tau)}{(\partial\tau)^j} \Bigr\|_* \Bigr), \tag{25} \]
where
\[ \|\delta\|_* := \|\delta_\Omega\|_X + \|\delta_D\|_{V^*} + \|\delta_N\|_{V^*}. \]
Here the constant $C$ may depend only on $A$, $B$, $f$ and the length $t$ of $J$.

Remark 3.2.
(i) In the definition it is implicitly assumed that equation (23) is solvable in $J$ for the perturbations $\delta$ under consideration.
(ii) Recall that the norms of the spaces $V_i$, $V$, $V_i^*$ and $V^*$ are defined as
\[ \begin{aligned} \|v_i\|_{V_i}^2 &:= \|v_i\|_{l_i,2,\Omega}^2 := \sum_{|\alpha| \le l_i} \|\partial^\alpha v_i\|_{0,2,\Omega}^2, && v_i \in V_i, \\ \|v\|_V^2 &:= \sum_{i=1}^{n} \|v_i\|_{V_i}^2 := \sum_{i=1}^{n} \sum_{|\alpha| \le l_i} \|\partial^\alpha v_i\|_{0,2,\Omega}^2, && v \in V, \\ \|\delta_i\|_{V_i^*} &:= \sup_{v_i \in V_i \setminus \{0\}} \frac{|\langle \delta_i, v_i \rangle|}{\|v_i\|_{V_i}}, && \delta_i \in V_i^*, \\ \|\delta\|_{V^*} &:= \sum_{i=1}^{n} \|\delta_i\|_{V_i^*}, && \delta \in V^*. \end{aligned} \]
(iii) Concerning the problem of existence and uniqueness of weak solutions, we refer to the literature, e.g. [4] and [24].
4
Hyperbolic PDAEs
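As a first application of the above theory, we investigate the linear hyperbolic PDE
\[ \sum_{j=1}^{n} \bigl( \nabla \cdot (b_{ij} u_j) + c_{ij} u_j \bigr) = f_i \quad \text{in } \Omega, \quad i \in [1, n]_{\mathbb{N}}, \tag{26} \]
with the data $b_{ij} \in W_\infty^1(\Omega)^d$, $c_{ij} \in L_\infty(\Omega)$, $i, j \in [1, n]_{\mathbb{N}}$, and $f \in C(J, L_2(\Omega)^n)$. Moreover we assume that the PDE (26) is symmetric, i.e. we have $b_{ij} = b_{ji}$, $i, j \in [1, n]_{\mathbb{N}}$ (cf. (22)). The boundary conditions are
\[ \sum_{j=1}^{n} \bigl( (\nu \cdot b_{ij}) u_j - m_{ij} u_j \bigr) = 0 \quad \text{on } \partial\Omega, \quad i \in [1, n]_{\mathbb{N}}, \tag{27} \]
where, as above, $B := \frac{1}{2} (\nu \cdot b_{ij})_{i,j=1}^{n}$, $M := (m_{ij})_{i,j=1}^{n}$ and (cf. [5, Sect. 5])
\[ \begin{aligned} &M \ \text{is continuous on } \partial\Omega, \\ &\xi \cdot [(M + M^{\top}) \xi] \ge 0 \quad \forall \xi \in \mathbb{R}^n \ \text{on } \partial\Omega, \\ &\ker(2B - M) \oplus \ker(2B + M) = \mathbb{R}^n \quad \text{on } \partial\Omega. \end{aligned} \tag{28} \]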
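Theorem 4.1. Consider the symmetric hyperbolic PDE (26) with the boundary conditions (27) under the assumptions (28). Assume that the weak problem
\[ \langle Bu, v \rangle = \langle f_\Omega, v \rangle \quad \forall v \in V := H^1(\Omega)^n \tag{29} \]
has a unique solution. If there exists a constant $C > 0$ such that
\[ \sum_{i,j=1}^{n} \Bigl( \frac{1}{2} \nabla \cdot b_{ij} + c_{ij} \Bigr) \xi_i \xi_j \ge C \|\xi\|_2^2 \quad \forall \xi \in \mathbb{R}^n \ \text{a.e. in } \Omega, \tag{30} \]
then the problem (29) has the perturbation index $i_p = 1$.

Proof. To keep the notation simple we set $w := \varepsilon_{\mathrm{hom}}$ and consider the weak problem
\[ \langle Bw, v \rangle = \langle \delta_\Omega, v \rangle \quad \forall v \in V. \tag{31} \]
From the proof of Lemma 2.4 we know that
\[ \begin{aligned} \langle Bw, w \rangle &= \frac{1}{2} \sum_{i,j=1}^{n} ((\nabla \cdot b_{ij}) w_j, w_i) + \sum_{i,j=1}^{n} (c_{ij} w_j, w_i) \\ &\quad + \sum_{i,j=1}^{n} (m_{ij} w_j, w_i)_{\partial\Omega} - \frac{1}{2} \sum_{i,j=1}^{n} ((\nu \cdot b_{ij}) w_j, w_i)_{\partial\Omega}. \end{aligned} \tag{32} \]
Using the boundary condition (27), we get
\[ \langle Bw, w \rangle = \sum_{i,j=1}^{n} \Bigl( \Bigl( \frac{1}{2} \nabla \cdot b_{ij} + c_{ij} \Bigr) w_j, w_i \Bigr) + \frac{1}{2} \sum_{i,j=1}^{n} (m_{ij} w_j, w_i)_{\partial\Omega}. \]
Then it follows, by the semidefiniteness of $M + M^{\top}$ (see (28)) and the condition (30), that
\[ \langle Bw, w \rangle \ge C \sum_{i=1}^{n} (w_i, w_i) = C \|w\|_X^2. \]
The right-hand side of (31) with $v = w$ can be estimated by the Cauchy-Schwarz inequality:
\[ \langle \delta_\Omega, w \rangle \le \|\delta_\Omega\|_X \|w\|_X. \]
Thus we arrive at
\[ \|w\|_X \le \frac{1}{C} \|\delta_\Omega\|_X. \]
Hence the problem has the perturbation index $i_p = 1$.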
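Remark 4.2. Instead of using the boundary condition (27) under the assumptions (28) it is sufficient to suppose that the symmetric part of $M_{HH} - B_{HH}$ is positive definite. Then we also have, by (32) and (30), that $\langle Bw, w \rangle \ge C \|w\|_X^2$. Details about the solvability of the strong and the weak problems can be found in [5] and [14].

Theorem 4.3. Consider the symmetric hyperbolic PDE
\[ \begin{cases} \dot{u}_i + \sum_{j=1}^{n} \bigl( \nabla \cdot (b_{ij} u_j) + c_{ij} u_j \bigr) = f_i & \text{in } J \times \Omega, \\ u_i(0, x) = u_{i0} & \text{in } \Omega, \end{cases} \qquad i \in [1, n]_{\mathbb{N}}, \tag{33} \]
with the boundary conditions (27) under the assumptions (28). If the weak problem
\[ (\dot{u}, v) + \langle Bu, v \rangle = \langle f, v \rangle \quad \forall v \in V := H^1(\Omega)^n \]
is uniquely solvable, it has the perturbation index $i_p = 1$.

Proof. In the proof of Theorem 4.1 we have seen that
\[ \langle Bw, w \rangle \ge \sum_{i,j=1}^{n} \Bigl( \Bigl( \frac{1}{2} \nabla \cdot b_{ij} + c_{ij} \Bigr) w_j, w_i \Bigr). \]
Setting $\gamma := (\gamma_{ij})_{i,j=1}^{n}$ with $\gamma_{ij} := \frac{1}{2} \operatorname{ess\,sup} (2 c_{ij} + \nabla \cdot b_{ij})$, we get
\[ (\dot{w}, w) - \lambda \|w\|_X^2 \le (\dot{w}, w) + \langle Bw, w \rangle = \langle \delta_\Omega, w \rangle, \]
where $\lambda$ is the spectral norm of the matrix $\gamma$. Because of
\[ \langle \delta_\Omega, w \rangle \le \frac{1}{2} \|w\|_X^2 + \frac{1}{2} \|\delta_\Omega\|_X^2 \]
we obtain
\[ (\dot{w}, w) - \lambda \|w\|_X^2 \le \frac{1}{2} \|w\|_X^2 + \frac{1}{2} \|\delta_\Omega\|_X^2 \]
and, with $\mu := 2\lambda + 1$,
\[ 2 (\dot{w}, w) - \mu \|w\|_X^2 \le \|\delta_\Omega\|_X^2. \]
Using the relation
\[ (\dot{w}, w) = \frac{1}{2} \frac{d}{dt} (w, w) = \frac{1}{2} \frac{d}{dt} \|w\|_X^2, \]
we get the following estimate:
\[ \frac{d}{dt} \|w\|_X^2 - \mu \|w\|_X^2 \le \|\delta_\Omega\|_X^2. \tag{34} \]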
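From
\[ \frac{d}{dt} \bigl( e^{-\mu t} \|w\|_X^2 \bigr) = e^{-\mu t} \frac{d}{dt} \|w\|_X^2 - \mu e^{-\mu t} \|w\|_X^2 \]
we obtain
\[ \frac{d}{dt} \bigl( e^{-\mu t} \|w\|_X^2 \bigr) \le e^{-\mu t} \|\delta_\Omega\|_X^2. \]
Integration yields
\[ e^{-\mu t} \|w\|_X^2 - \|w_0\|_X^2 \le \int_0^t e^{-\mu s} \|\delta_\Omega\|_X^2 \, ds \]
and
\[ \begin{aligned} \|w\|_X^2 &\le e^{\mu t} \|w_0\|_X^2 + \int_0^t e^{\mu (t - s)} \|\delta_\Omega\|_X^2 \, ds \\ &\le e^{\mu t} \|w_0\|_X^2 + \sup_{t \in J} \|\delta_\Omega(t)\|_X^2 \int_0^t e^{\mu (t - s)} \, ds \\ &\le e^{\mu t} \|w_0\|_X^2 + \frac{e^{\mu t} - 1}{\mu} \sup_{t \in J} \|\delta_\Omega(t)\|_X^2 \\ &\le \max \Bigl\{ e^{\mu t}, \frac{e^{\mu t} - 1}{\mu} \Bigr\} \Bigl( \|w_0\|_X^2 + \sup_{t \in J} \|\delta_\Omega(t)\|_X^2 \Bigr). \end{aligned} \]
Finally we have
\[ \|w\|_X \le \sqrt{ \max \Bigl\{ e^{\mu t}, \frac{e^{\mu t} - 1}{\mu} \Bigr\} } \, \Bigl( \|w_0\|_X + \sup_{t \in J} \|\delta_\Omega(t)\|_X \Bigr) \]
and the problem has the perturbation index 1.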
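The Gronwall-type step from (34) to the bound on $\|w\|_X^2$ can be illustrated on a scalar model. The sketch below is not part of the paper: it treats $y(t)$ as a stand-in for $\|w(t)\|_X^2$ and $d(t)$ as a stand-in for $\|\delta_\Omega(t)\|_X^2$, integrates the equality case $y' = \mu y + d$ with forward Euler, and compares the result with $e^{\mu t} y_0 + \frac{e^{\mu t} - 1}{\mu} \sup d$. All names and sample values are assumptions made for illustration.

```python
import math


def gronwall_bound(mu, t, y0, d_sup):
    """Bound from the proof: e^{mu t} y0 + (e^{mu t} - 1) / mu * sup d."""
    return math.exp(mu * t) * y0 + (math.exp(mu * t) - 1.0) / mu * d_sup


def euler_equality_case(mu, t, y0, d, steps=10000):
    """Forward Euler for y' = mu * y + d(s), the equality case of (34)."""
    h = t / steps
    y = y0
    for n in range(steps):
        y += h * (mu * y + d(n * h))
    return y


if __name__ == "__main__":
    mu, t, y0 = 3.0, 1.0, 2.0                 # mu = 2 * lambda + 1 with lambda = 1
    defect = lambda s: math.sin(s) ** 2       # hypothetical defect, sup <= 1
    print(euler_equality_case(mu, t, y0, defect), "<=", gronwall_bound(mu, t, y0, 1.0))
```

For $\mu > 0$, $y_0 \ge 0$ and $0 \le d \le \sup d$ the Euler iterate stays below the bound at every step, since $1 + \mu h \le e^{\mu h}$ and $h \le (e^{\mu h} - 1)/\mu$.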
Example 4.4. Let $d := 1$, $\Omega := (0, 1)$ and $b > 0$ be a given constant. The scalar hyperbolic problem
\[ \begin{cases} \dot{u} + (bu)' = f(t, x) & (t, x) \in J \times \Omega, \\ u(t, 0) = 0 & t \in J, \\ u(0, x) = u_0(x) & x \in \Omega, \end{cases} \]
has the perturbation index $i_p = 1$, since the choice $m := b$ leads to the equivalent boundary conditions
\[ (-b - m(0)) u(t, 0) = 0, \qquad (b - m(1)) u(t, 1) = 0, \]
which have the form (27) and satisfy (28).

Example 4.5. Consider the PDE (9) from Example 2.1 with the boundary conditions
\[ u_1(t, 0) + u_2(t, 0) = 0, \qquad u_1(t, 1) - u_2(t, 1) = 0. \]
If $b_{12}(0) > 0$, $b_{12}(1) > 0$, then the problem has the perturbation index $i_p = 1$.
First of all we observe that the result of Theorem 4.3 can be easily extended to the case where the matrix $A$ is symmetric positive definite. Since $a_{11} > 0$ and $a_{22} > 0$ for all $x \in \Omega$, this condition is satisfied. Moreover, with the choice $m_{11} := m_{22} := b_{12}$ we get the equivalent boundary conditions
\[ \begin{pmatrix} -m_{11}(0) & -b_{12}(0) \\ -b_{12}(0) & -m_{22}(0) \end{pmatrix} \begin{pmatrix} u_1(t, 0) \\ u_2(t, 0) \end{pmatrix} = 0, \qquad \begin{pmatrix} -m_{11}(1) & b_{12}(1) \\ b_{12}(1) & -m_{22}(1) \end{pmatrix} \begin{pmatrix} u_1(t, 1) \\ u_2(t, 1) \end{pmatrix} = 0, \]
which have the form (27) and satisfy (28). This result coincides with the estimate given in [8, Example 8].

Example 4.6. The PDE (10) from Example 2.2 with the boundary conditions (27) satisfying (28) has the perturbation index $i_p = 1$.
5
Mixed hyperbolic-parabolic PDAEs
In this section systems are considered which have a parabolic and a hyperbolic part. Many physical phenomena can be described with the help of such systems. One representative example is the compressible Navier-Stokes system which will be treated in Section 6.2. We start with the situation where $N_E = [1, n_1]_{\mathbb{N}}$, $N_H = [n_1 + 1, n]_{\mathbb{N}}$ for some $n_1, n \in \mathbb{N}$, $n_1 < n$.

Theorem 5.1. Consider the PDAE
\[ \begin{aligned} \dot{u}_i - \sum_{j=1}^{n_1} \nabla \cdot (K_{ij} \nabla u_j) + \sum_{j=1}^{n} \bigl( \nabla \cdot (b_{ij} u_j) + c_{ij} u_j \bigr) &= f_i, && i = 1, \dots, n_1, \\ \dot{u}_i + \sum_{j=1}^{n} \bigl( \nabla \cdot (b_{ij} u_j) + c_{ij} u_j \bigr) &= f_i, && i = n_1 + 1, \dots, n \end{aligned} \]
under the assumptions of Lemma 2.4. The initial condition reads as $u(0, x) = u_0(x)$. If the corresponding weak problem is uniquely solvable and if $\delta$ admits an estimate of the type
\[ |\langle \delta, v \rangle| \le \|\delta\|_* \Bigl( \sum_{i=1}^{n_1} \|\nabla v_i\|^2 + \|v\|_X^2 \Bigr)^{1/2} \quad \forall v \in V, \tag{35} \]
then the PDAE has the perturbation index $i_p = 1$.

Proof. The error equation of the weak problem with $v = w$ reads as
\[ (\dot{w}, w) + \langle Bw, w \rangle = \langle \delta, w \rangle. \]
Since the assumptions of Lemma 2.4 are fulfilled, the inequality (21) is valid. So we have
\[ \frac{1}{2} \frac{d}{dt} \|w\|_X^2 + c \sum_{i=1}^{n_1} \|\nabla w_i\|^2 - \lambda \|w\|_X^2 \le \langle \delta, w \rangle. \]
Because of
\[
\langle\delta, w\rangle \le \|\delta\|_*\Big(\sum_{i=1}^{n_1}\|\nabla w_i\|^2 + \|w\|_X^2\Big)^{1/2}
\le \frac{c}{2}\Big(\sum_{i=1}^{n_1}\|\nabla w_i\|^2 + \|w\|_X^2\Big) + \frac{1}{2c}\|\delta\|_*^2,
\]
where we have used the ε-inequality with $\varepsilon := c/2$, we get
\[
\frac12\frac{d}{dt}\|w\|_X^2 + \frac{c}{2}\sum_{i=1}^{n_1}\|\nabla w_i\|^2 - \Big(\lambda + \frac{c}{2}\Big)\|w\|_X^2 \le \frac{1}{2c}\|\delta\|_*^2.
\]
With $\mu := 2\lambda + c$ it follows that
\[
\frac{d}{dt}\|w\|_X^2 - \mu\|w\|_X^2 \le \frac{1}{c}\|\delta\|_*^2.
\]
Using the same technique as in the proof of the previous theorem, we obtain the desired result.

Remark 5.2. The estimate (35) is essentially a regularity estimate. This also applies to the comparable assumptions in the subsequent theorems.

Example 5.3. The PDE (11) from Example 2.3 has the perturbation index $i_p = 1$ provided the matrix
\[
\begin{pmatrix} 2 - \nu\cdot b_{11} & -\nu\cdot b_{12}\\ -\nu\cdot b_{12} & -\nu\cdot b_{22} \end{pmatrix}
\]
is positive definite on $\partial\Omega$.

The next theorem corresponds to a situation where $N_E = N_{E1} \cup N_{E2}$ with $N_{E1} = [1, n_1]_{\mathbb N}$, $N_{E2} = [n_1+n_2+1, n]_{\mathbb N}$ and $N_H = [n_1+1, n_1+n_2]_{\mathbb N}$ for some $n_1, n_2, n \in \mathbb N$, $n_1 < n_2 < n$ (i.e. we do not require that the indices are arranged in correspondence with (6)).

Theorem 5.4. Consider the PDAE
\[
\begin{aligned}
\dot u_i - \sum_{j=1}^{n_1}\nabla\cdot(K_{ij}\nabla u_j) + \sum_{j=1}^{n_1+n_2}\nabla\cdot(b_{ij}u_j) + \sum_{j=1}^{n} c_{ij}u_j &= f_i, & i &\le n_1,\\
\dot u_i + \sum_{j=1}^{n_1+n_2}\nabla\cdot(b_{ij}u_j) + \sum_{j=1}^{n} c_{ij}u_j &= f_i, & n_1 < i &\le n_1+n_2,\\
\sum_{j=n_1+n_2+1}^{n}\big[-\nabla\cdot(K_{ij}\nabla u_j) + \nabla\cdot(b_{ij}u_j)\big] + \sum_{j=1}^{n} c_{ij}u_j &= f_i, & n_1+n_2 < i &\le n
\end{aligned}
\]
and assume that the conditions of Lemma 2.4 are satisfied. If, in addition, the restriction of $\langle Bv, v\rangle$ to the subspace
\[
V_{E2} := \prod_{i=1}^{n_1+n_2}\{0\} \times \prod_{i=n_1+n_2+1}^{n} V_i
\]
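is coercive, i.e. there exists a constant $\alpha > 0$ such that
\[
\langle Bv, v\rangle \ge \alpha \sum_{i=n_1+n_2+1}^{n} \|v_i\|_{1,2,\Omega}^2 \quad \forall v \in V_{E2},
\]
and if $\delta$ admits an estimate of the type
\[
|\langle\delta, v\rangle| \le \|\delta\|_*\Big(\sum_{i=1}^{n_1}\|\nabla v_i\|^2 + \sum_{i=n_1+n_2+1}^{n}\|\nabla v_i\|^2 + \|v\|_X^2\Big)^{1/2} \quad \forall v \in V,
\]
then the weak problem has the perturbation index $i_p = 1$.

Proof. First we consider the restriction of the error equation
\[
(A\dot w, v) + \langle Bw, v\rangle = \langle\delta, v\rangle \tag{36}
\]
to the subspace $V_{E2}$, i.e. $\langle Bw, v\rangle = \langle\delta, v\rangle$ for all $v \in V_{E2}$. In particular, if we take
\[
v = (v_i)_{i=1}^{n} \quad\text{with}\quad v_i := \begin{cases} w_i, & i \in [n_1+n_2+1, n]_{\mathbb N},\\ 0, & \text{otherwise}, \end{cases}
\]
then we get $\langle Bv, v\rangle = \langle\delta, v\rangle - \langle B(w-v), v\rangle$. Notice that, by assumption,
\[
|\langle\delta, v\rangle| \le \|\delta\|_*\Big(\sum_{i=n_1+n_2+1}^{n}\|w_i\|_{1,2,\Omega}^2\Big)^{1/2}
\le \frac{1}{\alpha}\|\delta\|_*^2 + \frac{\alpha}{4}\sum_{i=n_1+n_2+1}^{n}\|w_i\|_{1,2,\Omega}^2, \tag{37}
\]
where we have used the ε-inequality with $\varepsilon := \alpha/4$. Since
\[
|\langle B(w-v), v\rangle| = \Big|\sum_{i=n_1+n_2+1}^{n}\sum_{j=1}^{n_1+n_2}(c_{ij}w_j, w_i)\Big| \le \sigma\sum_{i=n_1+n_2+1}^{n}\sum_{j=1}^{n_1+n_2}\|w_i\|\,\|w_j\|,
\]
where
\[
\sigma := \max_{\substack{i\in[n_1+n_2+1,n]_{\mathbb N}\\ j\in[1,n_1+n_2]_{\mathbb N}}} \operatorname*{ess\,sup}_{x\in\Omega}|c_{ij}(x)|,
\]
the ε-inequality yields
\[
\begin{aligned}
|\langle B(w-v), v\rangle| &\le \sigma\sum_{i=n_1+n_2+1}^{n}\sum_{j=1}^{n_1+n_2}\Big(\varepsilon\|w_i\|^2 + \frac{1}{4\varepsilon}\|w_j\|^2\Big)\\
&\le \sigma\Big(\varepsilon(n_1+n_2)\sum_{i=n_1+n_2+1}^{n}\|w_i\|^2 + \frac{n-n_1-n_2}{4\varepsilon}\sum_{j=1}^{n_1+n_2}\|w_j\|^2\Big).
\end{aligned} \tag{38}
\]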
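The choice $\varepsilon := \alpha/(4\sigma(n_1+n_2))$ leads to
\[
|\langle B(w-v), v\rangle| \le \frac{\alpha}{4}\sum_{i=n_1+n_2+1}^{n}\|w_i\|^2 + \frac{\sigma^2(n-n_1-n_2)(n_1+n_2)}{\alpha}\sum_{j=1}^{n_1+n_2}\|w_j\|^2.
\]
Using this estimate, we conclude from (37), (38) that
\[
\alpha\sum_{i=n_1+n_2+1}^{n}\|w_i\|_{1,2,\Omega}^2 \le \frac{1}{\alpha}\|\delta\|_*^2 + \frac{\alpha}{2}\sum_{i=n_1+n_2+1}^{n}\|w_i\|_{1,2,\Omega}^2 + \frac{\sigma^2(n-n_1-n_2)(n_1+n_2)}{\alpha}\sum_{j=1}^{n_1+n_2}\|w_j\|^2,
\]
hence
\[
\frac{\alpha}{2}\sum_{i=n_1+n_2+1}^{n}\|w_i\|_{1,2,\Omega}^2 \le \frac{1}{\alpha}\|\delta\|_*^2 + \frac{\sigma^2(n-n_1-n_2)(n_1+n_2)}{\alpha}\sum_{j=1}^{n_1+n_2}\|w_j\|^2. \tag{39}
\]
Returning to the general error equation (36) and setting $v := w$, we have by Lemma 2.4 that
\[
\begin{aligned}
\frac12\sum_{i=1}^{n_1+n_2}\frac{d}{dt}\|w_i\|^2 + c\sum_{i=1}^{n_1}\|\nabla w_i\|^2 + c\sum_{i=n_1+n_2+1}^{n}\|\nabla w_i\|^2 - \lambda\|w\|_X^2 &\le \langle\delta, w\rangle\\
&\le \frac{1}{4\varepsilon}\|\delta\|_*^2 + \varepsilon\sum_{i=1}^{n_1}\|\nabla w_i\|^2 + \varepsilon\sum_{i=n_1+n_2+1}^{n}\|\nabla w_i\|^2 + \varepsilon\|w\|_X^2.
\end{aligned}
\]
With $\varepsilon := c/2$ we get
\[
\frac12\sum_{i=1}^{n_1+n_2}\frac{d}{dt}\|w_i\|^2 + \frac{c}{2}\sum_{i=1}^{n_1}\|\nabla w_i\|^2 + \frac{c}{2}\sum_{i=n_1+n_2+1}^{n}\|\nabla w_i\|^2 - \Big(\lambda + \frac{c}{2}\Big)\|w\|_X^2 \le \frac{1}{2c}\|\delta\|_*^2.
\]
Hence
\[
\sum_{i=1}^{n_1+n_2}\frac{d}{dt}\|w_i\|^2 - \mu\|w\|_X^2 \le \frac{1}{c}\|\delta\|_*^2,
\]
where $\mu := 2\lambda + c$. By definition of the $X$-norm, this estimate can be rewritten as
\[
\sum_{i=1}^{n_1+n_2}\Big(\frac{d}{dt}\|w_i\|^2 - \mu\|w_i\|^2\Big) \le \frac{1}{c}\|\delta\|_*^2 + \mu\sum_{i=n_1+n_2+1}^{n}\|w_i\|^2.
\]
The last term on the right-hand side can be estimated by means of (39):
\[
\sum_{i=1}^{n_1+n_2}\Big(\frac{d}{dt}\|w_i\|^2 - \mu\|w_i\|^2\Big) \le \Big(\frac{1}{c} + \frac{2\mu}{\alpha^2}\Big)\|\delta\|_*^2 + \tilde\mu\sum_{i=1}^{n_1+n_2}\|w_i\|^2,
\]
where $\tilde\mu := 2\mu\sigma^2(n-n_1-n_2)(n_1+n_2)/\alpha^2$. Thus we arrive at the relation
\[
\sum_{i=1}^{n_1+n_2}\Big(\frac{d}{dt}\|w_i\|^2 - (\mu+\tilde\mu)\|w_i\|^2\Big) \le \Big(\frac{1}{c} + \frac{2\mu}{\alpha^2}\Big)\|\delta\|_*^2.
\]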
After integration and together with (39) the desired estimate follows.
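The ε-inequality $ab \le \varepsilon a^2 + b^2/(4\varepsilon)$ carries most of the bookkeeping in this proof. As a minimal sketch (the function names and the sample values of $\alpha$, $\sigma$, $n$, $n_1$, $n_2$ are hypothetical, chosen only for illustration), the following checks in exact rational arithmetic that the choice $\varepsilon := \alpha/(4\sigma(n_1+n_2))$ produces the two coefficients $\alpha/4$ and $\sigma^2(n-n_1-n_2)(n_1+n_2)/\alpha$ used in the coupling estimate.

```python
from fractions import Fraction

def young_gap(a, b, eps):
    """eps*a^2 + b^2/(4*eps) - a*b, which equals (2*eps*a - b)^2 / (4*eps) >= 0."""
    return eps * a * a + b * b / (4 * eps) - a * b

def coupling_coefficients(alpha, sigma, n, n1, n2):
    """Coefficients of the coupling estimate for eps := alpha / (4*sigma*(n1+n2))."""
    eps = alpha / (4 * sigma * (n1 + n2))
    coeff_i = sigma * eps * (n1 + n2)             # factor of the sum over i > n1+n2
    coeff_j = sigma * (n - n1 - n2) / (4 * eps)   # factor of the sum over j <= n1+n2
    return eps, coeff_i, coeff_j

# Hypothetical sample values, not taken from the paper.
alpha, sigma, n, n1, n2 = Fraction(3, 2), Fraction(5), 7, 2, 3
eps, coeff_i, coeff_j = coupling_coefficients(alpha, sigma, n, n1, n2)
print(coeff_i == alpha / 4)
print(coeff_j == sigma**2 * (n - n1 - n2) * (n1 + n2) / alpha)
```

Using `Fraction` avoids rounding, so the two comparisons test the algebra itself rather than a floating-point approximation of it.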
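In the rest of this section we consider the situation where $N_E = [1, n_1]_{\mathbb N}$, $N_H = N_{H1} \cup N_{H2}$ with $N_{H1} = [n_1+1, n_1+n_2]_{\mathbb N}$ and $N_{H2} = [n_1+n_2+1, n]_{\mathbb N}$ for some $n_1, n_2, n \in \mathbb N$, $n_1 < n_2 < n$.

Theorem 5.5. Consider the PDAE
\[
\begin{aligned}
\dot u_i - \sum_{j=1}^{n_1}\nabla\cdot(K_{ij}\nabla u_j) + \sum_{j=1}^{n_1+n_2}\nabla\cdot(b_{ij}u_j) + \sum_{j=1}^{n} c_{ij}u_j &= f_i, & i &\le n_1,\\
\dot u_i + \sum_{j=1}^{n_1+n_2}\nabla\cdot(b_{ij}u_j) + \sum_{j=1}^{n} c_{ij}u_j &= f_i, & n_1 < i &\le n_1+n_2,\\
\sum_{j=n_1+n_2+1}^{n}\nabla\cdot(b_{ij}u_j) + \sum_{j=1}^{n} c_{ij}u_j &= f_i, & n_1+n_2 < i &\le n
\end{aligned}
\]
and assume that the conditions of Lemma 2.4 are satisfied. If, in addition, the restriction of $\langle Bv, v\rangle$ to the subspace
\[
V_{H2} := \prod_{i=1}^{n_1+n_2}\{0\} \times \prod_{i=n_1+n_2+1}^{n} V_i
\]
is coercive, i.e. there exists a constant $\alpha > 0$ such that
\[
\langle Bv, v\rangle \ge \alpha\sum_{i=n_1+n_2+1}^{n}\|v_i\|^2 \quad \forall v \in V_{H2},
\]
and if $\delta$ admits an estimate of the type
\[
|\langle\delta, v\rangle| \le \|\delta\|_*\Big(\sum_{i=1}^{n_1}\|\nabla v_i\|^2 + \|v\|_X^2\Big)^{1/2} \quad \forall v \in V,
\]
then the weak problem has the perturbation index $i_p = 1$.

Proof. First we consider the restriction of the error equation
\[
(A\dot w, v) + \langle Bw, v\rangle = \langle\delta, v\rangle \tag{40}
\]
to the subspace $V_{H2}$, i.e. $\langle Bw, v\rangle = \langle\delta, v\rangle$ for all $v \in V_{H2}$. In particular, if we take
\[
v = (v_i)_{i=1}^{n} \quad\text{with}\quad v_i := \begin{cases} w_i, & i \in [n_1+n_2+1, n]_{\mathbb N},\\ 0, & \text{otherwise}, \end{cases}
\]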
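then we get
\[
\langle Bv, v\rangle = \langle\delta, v\rangle - \langle B(w-v), v\rangle. \tag{41}
\]
By assumption,
\[
|\langle\delta, v\rangle| \le \|\delta\|_*\Big(\sum_{i=n_1+n_2+1}^{n}\|w_i\|^2\Big)^{1/2} \le \frac{1}{\alpha}\|\delta\|_*^2 + \frac{\alpha}{4}\sum_{i=n_1+n_2+1}^{n}\|w_i\|^2, \tag{42}
\]
where we have used the ε-inequality with $\varepsilon := \alpha/4$. Furthermore, as in the proof of Theorem 5.4 we can show that
\[
|\langle B(w-v), v\rangle| \le \frac{\alpha}{4}\sum_{i=n_1+n_2+1}^{n}\|w_i\|^2 + \frac{\sigma^2(n-n_1-n_2)(n_1+n_2)}{\alpha}\sum_{j=1}^{n_1+n_2}\|w_j\|^2.
\]
Putting this estimate together with (41), (42), we see that
\[
\frac{\alpha}{2}\sum_{i=n_1+n_2+1}^{n}\|w_i\|^2 \le \frac{1}{\alpha}\|\delta\|_*^2 + \frac{\sigma^2(n-n_1-n_2)(n_1+n_2)}{\alpha}\sum_{j=1}^{n_1+n_2}\|w_j\|^2. \tag{43}
\]
Returning to the general error equation (40) and setting $v := w$, we have by Lemma 2.4 that
\[
\frac12\sum_{i=1}^{n_1+n_2}\frac{d}{dt}\|w_i\|^2 + c\sum_{i=1}^{n_1}\|\nabla w_i\|^2 - \lambda\|w\|_X^2 \le \langle\delta, w\rangle \le \frac{1}{4\varepsilon}\|\delta\|_*^2 + \varepsilon\sum_{i=1}^{n_1}\|\nabla w_i\|^2 + \varepsilon\|w\|_X^2.
\]
With $\varepsilon := c/2$ we get
\[
\frac12\sum_{i=1}^{n_1+n_2}\frac{d}{dt}\|w_i\|^2 + \frac{c}{2}\sum_{i=1}^{n_1}\|\nabla w_i\|^2 - \Big(\lambda + \frac{c}{2}\Big)\|w\|_X^2 \le \frac{1}{2c}\|\delta\|_*^2.
\]
Hence
\[
\sum_{i=1}^{n_1+n_2}\frac{d}{dt}\|w_i\|^2 - \mu\|w\|_X^2 \le \frac{1}{c}\|\delta\|_*^2,
\]
where $\mu := 2\lambda + c$. The rest of the proof runs as in the proof of Theorem 5.4.

Finally we give a result for a problem with perturbation index 2.

Theorem 5.6. Consider the PDAE
\[
\begin{aligned}
\dot u_i - \sum_{j=1}^{n_1}\nabla\cdot(K_{ij}\nabla u_j) + \sum_{j=1}^{n_1}\nabla\cdot(b_{ij}u_j) + \sum_{j=1}^{n} c_{ij}u_j &= f_i, & i &\le n_1,\\
\dot u_i + \sum_{j=n_1+n_2+1}^{n} a_{ij}\dot u_j + \sum_{j=n_1+1}^{n_1+n_2}\nabla\cdot(b_{ij}u_j) + \sum_{j=1}^{n} c_{ij}u_j &= f_i, & n_1 < i &\le n_1+n_2,\\
\sum_{j=n_1+n_2+1}^{n}\nabla\cdot(b_{ij}u_j) + \sum_{j=n_1+n_2+1}^{n} c_{ij}u_j &= f_i, & n_1+n_2 < i &\le n
\end{aligned}
\]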
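and assume that the conditions of Lemma 2.4 are satisfied. If, in addition, the restriction of $\langle Bv, v\rangle$ to the subspace
\[
V_{H2} := \prod_{i=1}^{n_1+n_2}\{0\} \times \prod_{i=n_1+n_2+1}^{n} V_i
\]
is coercive, i.e. there exists a constant $\alpha > 0$ such that
\[
\langle Bv, v\rangle \ge \alpha\sum_{i=n_1+n_2+1}^{n}\|v_i\|^2 \quad \forall v \in V_{H2},
\]
and if $\delta$ admits the estimates
\[
|\langle\delta, v\rangle| \le \|\delta\|_*\Big(\sum_{i=1}^{n_1}\|\nabla v_i\|^2 + \|v\|_X^2\Big)^{1/2} \quad \forall v \in V
\]
and
\[
|\langle\dot\delta, v\rangle| \le \|\dot\delta\|_*\Big(\sum_{i=1}^{n_1}\|\nabla v_i\|^2 + \|v\|_X^2\Big)^{1/2} \quad \forall v \in V,
\]
then the weak problem has the perturbation index $i_p = 2$.

Proof. As in the proof of Theorem 5.5 we restrict the error equation
\[
(A\dot w, v) + \langle Bw, v\rangle = \langle\delta, v\rangle \tag{44}
\]
to the subspace $V_{H2}$ and take the same particular test function
\[
v = (v_i)_{i=1}^{n} \quad\text{with}\quad v_i := \begin{cases} w_i, & i \in [n_1+n_2+1, n]_{\mathbb N},\\ 0, & \text{otherwise}. \end{cases}
\]
Since $\langle B(w-v), v\rangle = 0$, we get $\langle Bv, v\rangle = \langle\delta, v\rangle$. By the assumption w.r.t. $\delta$,
\[
|\langle\delta, v\rangle| \le \frac{1}{2\alpha}\|\delta\|_*^2 + \frac{\alpha}{2}\sum_{i=n_1+n_2+1}^{n}\|w_i\|^2,
\]
where we have used the ε-inequality with $\varepsilon := \alpha/2$. Since $\langle Bv, v\rangle$ is coercive on $V_{H2}$, it follows that
\[
\sum_{i=n_1+n_2+1}^{n}\|w_i\|^2 \le \frac{1}{\alpha^2}\|\delta\|_*^2. \tag{45}
\]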
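In a next step we differentiate the restriction of the error equation to $V_{H2}$ w.r.t. $t$ and get
\[
\langle B\dot w, v\rangle = \langle\dot\delta, v\rangle \quad \forall v \in V_{H2}.
\]
Using the particular test function
\[
v = (v_i)_{i=1}^{n} \quad\text{with}\quad v_i := \begin{cases} \dot w_i, & i \in [n_1+n_2+1, n]_{\mathbb N},\\ 0, & \text{otherwise}, \end{cases}
\]
we obtain, as in the first part of this proof, the estimate
\[
\sum_{i=n_1+n_2+1}^{n}\|\dot w_i\|^2 \le \frac{1}{\alpha^2}\|\dot\delta\|_*^2. \tag{46}
\]
Returning to the general error equation (44) and setting $v := w$, we have by Lemma 2.4 that
\[
\frac12\sum_{i=1}^{n_1+n_2}\frac{d}{dt}\|w_i\|^2 + \sum_{i=n_1+1}^{n_1+n_2}\sum_{j=n_1+n_2+1}^{n}(a_{ij}\dot w_j, w_i) + c\sum_{i=1}^{n_1}\|\nabla w_i\|^2 - \lambda\|w\|_X^2 \le \langle\delta, w\rangle.
\]
Putting the second term of the left-hand side to the right-hand side and using the estimate
\[
\Big|\sum_{i=n_1+1}^{n_1+n_2}\sum_{j=n_1+n_2+1}^{n}(a_{ij}\dot w_j, w_i)\Big| \le \sum_{i=n_1+1}^{n_1+n_2}\sum_{j=n_1+n_2+1}^{n}\|a_{ij}\|_{0,\infty,\Omega}\|\dot w_j\|\,\|w_i\|
\le \eta\Big(\sum_{i=n_1+n_2+1}^{n}\|\dot w_i\|^2\Big)^{1/2}\Big(\sum_{i=n_1+1}^{n_1+n_2}\|w_i\|^2\Big)^{1/2},
\]
where
\[
\eta^2 := \sup_{\xi_j\in\mathbb R:\ n_1+n_2<j\le n} \frac{\sum_{i=n_1+1}^{n_1+n_2}\Big(\sum_{j=n_1+n_2+1}^{n}\|a_{ij}\|_{0,\infty,\Omega}\,\xi_j\Big)^2}{\sum_{j=n_1+n_2+1}^{n}\xi_j^2},
\]
we get with the help of the ε-inequality
\[
\begin{aligned}
\frac12\sum_{i=1}^{n_1+n_2}\frac{d}{dt}\|w_i\|^2 + c\sum_{i=1}^{n_1}\|\nabla w_i\|^2 - \lambda\|w\|_X^2 \le{}& \frac{1}{4\varepsilon_1}\|\delta\|_*^2 + \varepsilon_1\sum_{i=1}^{n_1}\|\nabla w_i\|^2 + \varepsilon_1\|w\|_X^2\\
&+ \frac{\eta^2}{4\varepsilon_2}\sum_{i=n_1+n_2+1}^{n}\|\dot w_i\|^2 + \varepsilon_2\sum_{i=n_1+1}^{n_1+n_2}\|w_i\|^2.
\end{aligned}
\]
With $\varepsilon_1 := c/2$ and $\varepsilon_2 := 1/2$ it follows that
\[
\begin{aligned}
\frac12\sum_{i=1}^{n_1+n_2}\frac{d}{dt}\|w_i\|^2 + \frac{c}{2}\sum_{i=1}^{n_1}\|\nabla w_i\|^2 - \Big(\lambda + \frac{c}{2} + \frac12\Big)\|w\|_X^2 &\le \frac{1}{2c}\|\delta\|_*^2 + \frac{\eta^2}{2}\sum_{i=n_1+n_2+1}^{n}\|\dot w_i\|^2\\
&\le \frac{1}{2c}\|\delta\|_*^2 + \frac{\eta^2}{2\alpha^2}\|\dot\delta\|_*^2,
\end{aligned}
\]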
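where we have used (46). So we arrive at
\[
\sum_{i=1}^{n_1+n_2}\frac{d}{dt}\|w_i\|^2 - \mu\|w\|_X^2 \le \frac{1}{c}\|\delta\|_*^2 + \frac{\eta^2}{\alpha^2}\|\dot\delta\|_*^2,
\]
where $\mu := 2\lambda + c + 1$. By definition of the $X$-norm, this estimate can be rewritten as
\[
\sum_{i=1}^{n_1+n_2}\Big(\frac{d}{dt}\|w_i\|^2 - \mu\|w_i\|^2\Big) \le \frac{1}{c}\|\delta\|_*^2 + \frac{\eta^2}{\alpha^2}\|\dot\delta\|_*^2 + \mu\sum_{i=n_1+n_2+1}^{n}\|w_i\|^2.
\]
Using (45), we get
\[
\sum_{i=1}^{n_1+n_2}\Big(\frac{d}{dt}\|w_i\|^2 - \mu\|w_i\|^2\Big) \le \Big(\frac{1}{c} + \frac{\mu}{\alpha^2}\Big)\|\delta\|_*^2 + \frac{\eta^2}{\alpha^2}\|\dot\delta\|_*^2.
\]
After integration and with (45) again we conclude that the weak problem has the perturbation index $i_p = 2$.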
6 Applications
In this section we apply the above results to two nonlinear problems of great practical interest. Like other authors (e.g. [25]), we use a linearization approach. The reason is that the generalization of Definition 3.1 to the case of nonlinear PDAEs causes several difficulties. For instance, it could happen that the family F is too small.
6.1 The compressible Euler equations

These equations read as
\[
\begin{aligned}
\rho\dot u + \rho(u\cdot\nabla)u + \nabla p &= f,\\
\dot\rho + \nabla\cdot(\rho u) &= f_{d+1},\\
p &= r(\rho),
\end{aligned}
\]
where $p$ is the pressure and $r(\rho)$ is a given function defining the equation of state. The fluid density $\rho = \rho(t,x)$ and the velocity field $u = (u_1(t,x), \dots, u_d(t,x))$ are the unknown functions. The boundary and the initial conditions are
\[
\nu\cdot u = 0 \ \text{ on } J\times\partial\Omega, \qquad u(0,x) = u_0(x),\ x\in\Omega, \qquad \rho(0,x) = \rho_0(x),\ x\in\Omega.
\]
More information about the mathematical theory and the physical background of the Euler equations can be found in [18], [2], [15], and [16]. First we write the equations component-wise and use the fact that
\[
\partial_x p = \partial_x r(\rho) = \partial_\rho r\,\partial_x\rho.
\]
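In general, the following inequalities are satisfied: $r > 0$ and $c^2 := \partial_\rho r > 0$. In the three-dimensional case (i.e. $d = 3$) we get
\[
\begin{aligned}
\rho\dot u_1 + \rho(u_1\partial_x u_1 + u_2\partial_y u_1 + u_3\partial_z u_1) + c^2\partial_x\rho &= f_1,\\
\rho\dot u_2 + \rho(u_1\partial_x u_2 + u_2\partial_y u_2 + u_3\partial_z u_2) + c^2\partial_y\rho &= f_2,\\
\rho\dot u_3 + \rho(u_1\partial_x u_3 + u_2\partial_y u_3 + u_3\partial_z u_3) + c^2\partial_z\rho &= f_3,\\
\dot\rho + \rho(\partial_x u_1 + \partial_y u_2 + \partial_z u_3) + u_1\partial_x\rho + u_2\partial_y\rho + u_3\partial_z\rho &= f_4.
\end{aligned}
\]
Next we linearize the system at the point $(u_1, u_2, u_3, \rho)$ and obtain a system in the new unknown variables $w = (w_1, w_2, w_3, w_4)^\top$
\[
A\dot w + \sum_{j=1}^{n} L_{ij}w_j = \tilde f \tag{47}
\]
with
\[
L_{ij}w := \tilde b_{ij}\cdot\nabla w + \tilde c_{ij}w, \quad i, j \in [1, n]_{\mathbb N},
\]
and
\[
A := \begin{pmatrix} \rho & 0 & 0 & 0\\ 0 & \rho & 0 & 0\\ 0 & 0 & \rho & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad
\tilde f := \begin{pmatrix} f_1\\ f_2\\ f_3\\ f_4 \end{pmatrix},
\]
\[
\tilde b_{11} := \tilde b_{22} := \tilde b_{33} := \begin{pmatrix} \rho u_1\\ \rho u_2\\ \rho u_3 \end{pmatrix}, \qquad
\tilde b_{12} := \tilde b_{13} := \tilde b_{21} := \tilde b_{23} := \tilde b_{31} := \tilde b_{32} := \begin{pmatrix} 0\\ 0\\ 0 \end{pmatrix},
\]
\[
\tilde b_{14} := \begin{pmatrix} c^2\\ 0\\ 0 \end{pmatrix}, \quad
\tilde b_{24} := \begin{pmatrix} 0\\ c^2\\ 0 \end{pmatrix}, \quad
\tilde b_{34} := \begin{pmatrix} 0\\ 0\\ c^2 \end{pmatrix},
\]
\[
\tilde b_{41} := \begin{pmatrix} \rho\\ 0\\ 0 \end{pmatrix}, \quad
\tilde b_{42} := \begin{pmatrix} 0\\ \rho\\ 0 \end{pmatrix}, \quad
\tilde b_{43} := \begin{pmatrix} 0\\ 0\\ \rho \end{pmatrix}, \quad
\tilde b_{44} := \begin{pmatrix} u_1\\ u_2\\ u_3 \end{pmatrix},
\]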
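\[
\tilde C := \begin{pmatrix}
\rho\,\partial_x u_1 & \rho\,\partial_y u_1 & \rho\,\partial_z u_1 & \dot u_1 + u_1\partial_x u_1 + u_2\partial_y u_1 + u_3\partial_z u_1\\
\rho\,\partial_x u_2 & \rho\,\partial_y u_2 & \rho\,\partial_z u_2 & \dot u_2 + u_1\partial_x u_2 + u_2\partial_y u_2 + u_3\partial_z u_2\\
\rho\,\partial_x u_3 & \rho\,\partial_y u_3 & \rho\,\partial_z u_3 & \dot u_3 + u_1\partial_x u_3 + u_2\partial_y u_3 + u_3\partial_z u_3\\
\partial_x\rho & \partial_y\rho & \partial_z\rho & \partial_x u_1 + \partial_y u_2 + \partial_z u_3
\end{pmatrix}.
\]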
This system can be symmetrized by the following scaling. Using the new variable $\tilde w_4 := c^2 w_4/\rho$ and multiplying the fourth equation by $c^2$, we obtain a symmetric hyperbolic system in the new variables $(w_1, w_2, w_3, \tilde w_4)^\top$. For simplification of the presentation we write again $(w_1, w_2, w_3, w_4)^\top$ instead of $(w_1, w_2, w_3, \tilde w_4)^\top$. Then the system (47) can be written in the form (1), (4), (5) with the settings $b_{ij} := \tilde b_{ij}$ and $c_{ij} := \tilde c_{ij} - \nabla\cdot\tilde b_{ij}$.

Theorem 6.1. Let the linearized compressible Euler equations be uniquely solvable. Then the linearized system has the perturbation index 1.

Proof. Since both $u = (u_1, u_2, u_3)^\top$ and $(w_1, w_2, w_3)^\top$ satisfy the boundary condition $\nu\cdot u = 0$ resp. $\sum_{i=1}^{3}\nu_i w_i = 0$ on $J\times\partial\Omega$, we have
\[
2B_{HH} = \begin{pmatrix}
\nu\cdot u & 0 & 0 & \nu_1\\
0 & \nu\cdot u & 0 & \nu_2\\
0 & 0 & \nu\cdot u & \nu_3\\
\nu_1 & \nu_2 & \nu_3 & \nu\cdot u
\end{pmatrix}
= \begin{pmatrix}
0 & 0 & 0 & \nu_1\\
0 & 0 & 0 & \nu_2\\
0 & 0 & 0 & \nu_3\\
\nu_1 & \nu_2 & \nu_3 & 0
\end{pmatrix}
\quad\text{and}\quad M_{HH} = 0 \ \text{ on } J\times\partial\Omega.
\]
Therefore, $w^\top B_{HH} w = w_4\sum_{i=1}^{3}\nu_i w_i = 0$, and we can use Remark 4.2 to obtain the assertion.
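The identity $w^\top B_{HH} w = w_4\sum_i \nu_i w_i$ used in the proof can be checked directly. The sketch below is illustrative only: the normal vector and the test vectors are hypothetical sample values, and the helper names are not from the paper.

```python
def quadratic_form(B, w):
    """Return w^T B w for a square matrix B given as a list of rows."""
    n = len(w)
    return sum(w[i] * B[i][j] * w[j] for i in range(n) for j in range(n))

def boundary_matrix(nu):
    """B_HH with 2*B_HH = [[0,0,0,nu1],[0,0,0,nu2],[0,0,0,nu3],[nu1,nu2,nu3,0]]."""
    B = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        B[i][3] = B[3][i] = 0.5 * nu[i]
    return B

nu = [0.0, 0.6, 0.8]                  # hypothetical unit outer normal
B = boundary_matrix(nu)

# General vector: the form reduces to w4 * (nu . (w1, w2, w3)).
w_general = [1.0, 2.0, 3.0, 4.0]
q_general = quadratic_form(B, w_general)

# Vector satisfying the boundary condition nu . (w1, w2, w3) = 0.
w_tangential = [0.3, 0.4, -0.3, 1.7]
q_tangential = quadratic_form(B, w_tangential)
print(q_general, q_tangential)
```

For the tangential vector the form vanishes up to rounding, which is the property the proof relies on.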
6.2 The compressible Navier-Stokes equations

The compressible Navier-Stokes equations read as
\[
\begin{aligned}
\rho\dot u - \mu\Delta u - (\mu + \mu')\nabla(\nabla\cdot u) + \rho(u\cdot\nabla)u + \nabla p &= f,\\
\dot\rho + \nabla\cdot(\rho u) &= f_{d+1},\\
p &= r(\rho),
\end{aligned}
\]
where $\mu > 0$, $\mu' \ge 0$, $p$ is the pressure, and $r(\rho)$ is a given function. The fluid density $\rho = \rho(t,x)$ and the velocity field $u = (u_1(t,x), \dots, u_d(t,x))$ are the unknown functions. Furthermore, the boundary and initial conditions are given by
\[
u = g \ \text{ on } J\times\partial\Omega, \qquad u(0,x) = u_0(x),\ x\in\Omega, \qquad \rho(0,x) = \rho_0(x),\ x\in\Omega.
\]
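The reader is referred to the classical references [1] and [13] for the physical aspects. Here we only consider the two-dimensional case (i.e. $d = 2$). A generalization to the three-dimensional case can be done in a straightforward manner. First we write the equations component-wise and use the information $\partial_x p = \partial_x r(\rho) = \partial_\rho r\,\partial_x\rho$. As in the previous section, we have $r > 0$ and $c^2 := \partial_\rho r > 0$. We get
\[
\begin{aligned}
\rho\dot u_1 - (2\mu + \mu')\partial_{xx}u_1 - (\mu + \mu')\partial_{xy}u_2 - \mu\,\partial_{yy}u_1 + \rho(u_1\partial_x u_1 + u_2\partial_y u_1) + c^2\partial_x\rho &= f_1,\\
\rho\dot u_2 - \mu\,\partial_{xx}u_2 - (\mu + \mu')\partial_{xy}u_1 - (2\mu + \mu')\partial_{yy}u_2 + \rho(u_1\partial_x u_2 + u_2\partial_y u_2) + c^2\partial_y\rho &= f_2,\\
\dot\rho + \rho(\partial_x u_1 + \partial_y u_2) + u_1\partial_x\rho + u_2\partial_y\rho &= f_3.
\end{aligned}
\]
Next we linearize the system at the point $(u_1, u_2, \rho)$ and obtain a system in the unknown variables $w = (w_1, w_2, w_3)^\top$
\[
A\dot w + Lw = \tilde f
\]
with
\[
A := \begin{pmatrix} \rho & 0 & 0\\ 0 & \rho & 0\\ 0 & 0 & 1 \end{pmatrix}, \qquad
\tilde f := \begin{pmatrix} f_1\\ f_2\\ f_3 \end{pmatrix}, \qquad
K_{11} := \begin{pmatrix} 2\mu + \mu' & 0\\ 0 & \mu + \mu' \end{pmatrix},
\]
\[
K_{12} := K_{21} := \frac{\mu + \mu'}{2}\begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix}, \qquad
K_{22} := \begin{pmatrix} \mu + \mu' & 0\\ 0 & 2\mu + \mu' \end{pmatrix},
\]
\[
b_{11} := \begin{pmatrix} \rho u_1\\ \rho u_2 \end{pmatrix}, \quad
b_{12} := \begin{pmatrix} 0\\ 0 \end{pmatrix}, \quad
b_{13} := \begin{pmatrix} \partial_\rho r\\ 0 \end{pmatrix}, \quad
b_{21} := \begin{pmatrix} 0\\ 0 \end{pmatrix}, \quad
b_{22} := \begin{pmatrix} \rho u_1\\ \rho u_2 \end{pmatrix}, \quad
b_{23} := \begin{pmatrix} 0\\ \partial_\rho r \end{pmatrix},
\]
\[
C := \begin{pmatrix}
\rho\,\partial_x u_1 & \rho\,\partial_y u_1 & \dot u_1 + u_1\partial_x u_1 + u_2\partial_y u_1\\
\rho\,\partial_x u_2 & \rho\,\partial_y u_2 & \dot u_2 + u_1\partial_x u_2 + u_2\partial_y u_2\\
\partial_x\rho & \partial_y\rho & \partial_x u_1 + \partial_y u_2
\end{pmatrix}.
\]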
Again, the linearized system can be symmetrized by the following scaling. Using the new variable $\tilde w_3 := c^2 w_3/\rho$ and multiplying the third equation by $c^2$, we obtain a symmetric problem. As in the previous section the system can be written in the form (1), (4), (5).

Theorem 6.2. Let the linearized compressible Navier-Stokes equations be uniquely solvable. Then the linearized system has the perturbation index 1.

Proof. In the following we show that the assumptions of Theorem 5.1 are fulfilled. First we have $\kappa_{11} = \kappa_{22} = \mu + \mu'$ and $\kappa_{12} = \kappa_{21} = \frac12(\mu + \mu')$. Hence the matrix
\[
\kappa = \frac12(\mu + \mu')\begin{pmatrix} 2 & 1\\ 1 & 2 \end{pmatrix}
\]
is positive definite. Since we have prescribed Dirichlet boundary conditions on the whole boundary, the assumptions of Lemma 2.4 w.r.t. the boundary matrices are satisfied trivially. So we may apply Theorem 5.1, and the problem has the perturbation index $i_p = 1$.
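The positive definiteness of $\kappa$ follows from Sylvester's criterion (its eigenvalues are $\frac12(\mu+\mu')$ and $\frac32(\mu+\mu')$). A minimal sketch of that check, with hypothetical viscosity values and helper names chosen only for illustration:

```python
def is_positive_definite_2x2(m):
    """Sylvester's criterion for a symmetric 2x2 matrix: both leading minors positive."""
    return m[0][0] > 0 and m[0][0] * m[1][1] - m[0][1] * m[1][0] > 0

def kappa(mu, mu_prime):
    """kappa = (mu + mu')/2 * [[2, 1], [1, 2]] as in the proof of Theorem 6.2."""
    s = mu + mu_prime
    return [[s, 0.5 * s], [0.5 * s, s]]

# Hypothetical viscosities with mu > 0 and mu' >= 0.
for mu, mu_prime in [(1.0, 0.0), (0.01, 2.5), (3.0, 3.0)]:
    print(mu, mu_prime, is_positive_definite_2x2(kappa(mu, mu_prime)))
```

The determinant is $\frac34(\mu+\mu')^2$, so the criterion holds whenever $\mu + \mu' > 0$.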
7 Conclusion
We discussed systems of partial differential equations of the form $A\dot u + Lu = f$ with a possibly singular coefficient matrix $A$ and a linear differential operator $L$ with respect to the space variables, subject to general boundary conditions. Stability results with respect to perturbations of the problem data (of the right-hand side as well as of the boundary data) were derived. The main tool for these estimates was the availability of a Gårding-type inequality. The similarity of the perturbation results with those of the perturbation index for ordinary differential equations motivated a definition of a perturbation index for the class of problems considered here. In particular, the knowledge of the index is important for the selection of appropriate numerical methods. As an application, linearizations of the compressible Euler and Navier-Stokes equations were considered.
References
[1] G.K. Batchelor: An introduction to fluid dynamics, 2nd ed., Cambridge University Press, Cambridge, 1999.
[2] J.T. Beale, T. Kato and A. Majda: “Remarks on the breakdown of smooth solutions for the 3-D Euler equations”, Comm. Math. Phys., Vol. 94(1), (1984), pp. 61–66.
[3] K.E. Brenan, S.L. Campbell and L.R. Petzold: Numerical Solution of Initial-Value Problems in DAEs, Classics in Applied Mathematics, Vol. 14, SIAM, Philadelphia, 1996.
[4] A. Favini and A. Yagi: Degenerate differential equations in Banach spaces, Marcel Dekker, New York-Basel-Hong Kong, 1999.
[5] K.O. Friedrichs: “Symmetric positive linear differential equations”, Comm. Pure Appl. Math., Vol. 11, (1958), pp. 333–418.
[6] E. Griepentrog, M. Hanke and R. März: Toward a better understanding of differential-algebraic equations (Introductory survey), Seminarberichte Nr. 92-1, Humboldt-Universität zu Berlin, Fachbereich Mathematik, Berlin, 1992.
[7] E. Griepentrog and R. März: Differential-algebraic equations and their numerical treatment, Teubner-Texte zur Mathematik, Vol. 88, Teubner, Leipzig, 1986.
[8] M. Günther and Y. Wagner: “Index concepts for linear mixed systems of differential-algebraic and hyperbolic-type equations”, SIAM J. Sci. Comput., Vol. 22(5), (2000), pp. 1610–1629.
[9] E. Hairer and G. Wanner: Solving ordinary differential equations II: Stiff and differential-algebraic problems, Springer Series in Computational Mathematics, Vol. 14, 2nd edition, Springer-Verlag, Berlin, 1996.
[10] V. John, G. Matthies and J. Rang: “A comparison of time-discretization/linearization approaches for the incompressible Navier-Stokes equations”, Comput. Methods Appl. Mech. Engrg., Vol. 195(44-47), (2006), pp. 5995–6010.
[11] P. Kunkel and V. Mehrmann: Differential-Algebraic Equations, EMS Publishing House, Zürich, 2006.
[12] J. Lang: Adaptive Multilevel Solution of Nonlinear Parabolic PDE Systems, Lecture Notes in Computational Science and Engineering, Vol. 16, Springer-Verlag, Berlin, 2001.
[13] L. Landau and E. Lifschitz: Fluid mechanics, Addison–Wesley, 1953.
[14] P. Lesaint: “Finite element methods for symmetric hyperbolic systems”, Numer. Math., Vol. 21, (1973), pp. 244–255.
[15] A. Majda: Compressible fluid flow and systems of conservation laws in several space variables, Applied Mathematical Sciences, Vol. 53, Springer-Verlag, New York, 1984.
[16] A. Majda: “The interaction of nonlinear analysis and modern applied mathematics”, In: Proceedings of the International Congress of Mathematicians, Tokyo, Math. Soc. Japan, (1990), pp. 175–191.
[17] W.S. Martinson and P.I. Barton: “A Differentiation Index for Partial Differential Equations”, SIAM J. Sci. Comput., Vol. 21(6), (2000), pp. 2295–2315.
[18] M. Marion and R. Temam: “Navier-Stokes equations: theory and approximation”, In: P.G. Ciarlet and J.L. Lions (Eds.): Handbook of numerical analysis, Handb. Numer. Anal., Vol. 6, North-Holland, Amsterdam, 1998, pp. 503–688.
[19] J. Rang and L. Angermann: The perturbation index of linearized problems in porous media, Mathematik-Bericht Nr. 2004/1, Institut für Mathematik, TU Clausthal, Clausthal, 2004.
[20] J. Rang and L. Angermann: “The perturbation index of linear partial differential algebraic equations”, Appl. Numer. Math., Vol. 53(2-4), (2005), pp. 437–456.
[21] J. Rang and L. Angermann: “New Rosenbrock W-methods of order 3 for PDAEs of index 1”, BIT, Vol. 45(4), (2005), pp. 761–787.
[22] J. Rang and L. Angermann: Remarks on the differentiation index and on the perturbation index of non-linear differential algebraic equations, Mathematik-Bericht Nr. 2005/3, Institut für Mathematik, TU Clausthal, Clausthal, 2005.
[23] J. Rang: Stability estimates and numerical methods for degenerate parabolic differential equations, PhD thesis, Technische Universität Clausthal, Clausthal, 2004.
[24] R.E. Showalter: Monotone operators in Banach spaces and nonlinear partial differential equations, AMS, Providence, 1997.
[25] C. Tischendorf: Coupled systems of differential algebraic and partial differential equations in circuit and device simulation, Habilitation Thesis, Humboldt University at Berlin, 2003.
DOI: 10.2478/s11533-006-0034-5 Research article CEJM 5(1) 2007 50–83

Neighbourhoods of independence and associated geometry in manifolds of bivariate Gaussian and Freund distributions

Khadiga Arwini, Christopher Terence John Dodson
School of Mathematics, University of Manchester, Manchester M60 1QD, UK

Received 6 December 2004; accepted 30 September 2006

Abstract: We provide explicit information geometric tubular neighbourhoods containing all bivariate distributions sufficiently close to the cases of independent Poisson or Gaussian processes. This is achieved via affine immersions of the 4-manifold of Freund bivariate distributions and of the 5-manifold of bivariate Gaussians. We provide also the α-geometry for both manifolds. The Central Limit Theorem makes our neighbourhoods of independence limiting cases for a wide range of bivariate distributions; the topological character of the results makes them stable under small perturbations, which is important for applications in models of stochastic processes.
© Versita Warsaw and Springer-Verlag Berlin Heidelberg. All rights reserved.

Keywords: Information geometry, statistical manifold, bivariate distribution, neighbourhoods of independence, exponential distribution, Freund distribution, Gaussian distribution
MSC (2000): 53B20, 60G05
1 Introduction
In general, a probability density function depends on a set of parameters, $\theta_1, \theta_2, \dots, \theta_n$, and we say that we have an n-dimensional family of probability density functions. Let $\Theta$ be the parameter space of such an n-dimensional smooth family defined on some fixed event space $\Omega$:
\[
\{p_\theta \mid \theta \in \Theta\} \quad\text{with}\quad \int_\Omega p_\theta = 1 \ \text{ for all } \theta \in \Theta.
\]
K. Arwini, C.T.J. Dodson / Central European Journal of Mathematics 5(1) 2007 50–83
Then, the derivatives of the log-likelihood function, $l = \log p_\theta$, yield a matrix with entries
\[
g_{ij} = \int_\Omega p_\theta\,\frac{\partial l}{\partial\theta_i}\frac{\partial l}{\partial\theta_j} = -\int_\Omega p_\theta\,\frac{\partial^2 l}{\partial\theta_i\,\partial\theta_j}, \tag{1}
\]
for coordinates $(\theta_i)$ about $\theta \in \Theta \subseteq \mathbb R^n$. This gives rise to a positive definite matrix, so inducing a Riemannian metric $g$, the Fisher metric on $\Theta$, using for coordinates the parameters $(\theta_i)$; this metric is called the information metric for the family of probability density functions. The second equality here is subject to certain regularity conditions. Amari [1] and Amari and Nagaoka [2] provide modern accounts of the differential geometry that arises from the Fisher information metric.

An n-dimensional set of probability density functions $S = \{p_\theta \mid \theta \in \Theta \subset \mathbb R^n\}$ for random variable $x \in \Omega \subseteq \mathbb R$ is said to be an exponential family [2] when the density functions can be expressed in terms of functions $\{C, F_1, \dots, F_n\}$ on $\mathbb R$ and a function $\varphi$ on $\Theta$ as:
\[
p_\theta(x) = e^{\{C(x) + \sum_i \theta_i F_i(x) - \varphi(\theta)\}}. \tag{2}
\]
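Then we say that $(\theta_i)$ are its natural coordinates, and $\varphi$ is its potential function. From the normalization condition $\int_\Omega p_\theta(x)\,dx = 1$ we obtain:
\[
\varphi(\theta) = \log\int_\Omega e^{\{C(x) + \sum_i \theta_i F_i(x)\}}\,dx. \tag{3}
\]
With $\partial_i = \frac{\partial}{\partial\theta_i}$, we use the log-likelihood function $l(\theta, x) = \log(p_\theta(x))$ to obtain
\[
\partial_i l(\theta, x) = F_i(x) - \partial_i\varphi(\theta)
\]
and
\[
\partial_i\partial_j l(\theta, x) = -\partial_i\partial_j\varphi(\theta).
\]
The Fisher information metric $g$ on the n-dimensional space of parameters $\Theta \subset \mathbb R^n$, equivalently on the set $S = \{p_\theta \mid \theta \in \Theta \subset \mathbb R^n\}$, has coordinates:
\[
[g_{ij}] = -\int_\Omega [\partial_i\partial_j l(\theta, x)]\,p_\theta(x)\,dx = [\partial_i\partial_j\varphi(\theta)] = [\varphi_{ij}(\theta)]. \tag{4}
\]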
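As a concrete instance of (3) and (4), consider the one-parameter exponential density $p_\theta(x) = \theta e^{-\theta x}$ on $x > 0$, for which $C = 0$, $F(x) = -x$, $\varphi(\theta) = -\log\theta$ and hence $g = \varphi''(\theta) = 1/\theta^2$. The sketch below is illustrative only (this example, the quadrature rule and the step sizes are assumptions, not taken from the paper): it evaluates $\varphi$ from (3) by a truncated midpoint rule and takes a second central difference to approximate the metric.

```python
import math

def potential(theta, n=20_000):
    """phi(theta) = log of the integral of exp(-theta*x) over x > 0 (midpoint rule, truncated)."""
    upper = 40.0 / theta            # exp(-40) is negligible beyond this point
    h = upper / n
    total = sum(math.exp(-theta * (k + 0.5) * h) for k in range(n)) * h
    return math.log(total)

def fisher_metric(theta, step=1e-2):
    """Second central difference of the potential, approximating g = phi''(theta)."""
    return (potential(theta + step) - 2.0 * potential(theta)
            + potential(theta - step)) / step**2

theta = 2.0
print(fisher_metric(theta), 1.0 / theta**2)
```

The two printed numbers should agree to a few decimal places, reflecting $g(\theta) = 1/\theta^2$ for this family.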
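Then, $(S, g)$ is a Riemannian n-manifold with Levi-Civita connection given by:
\[
\Gamma^k_{ij}(\theta) = \sum_{h=1}^{n}\frac12\,g^{kh}\big(\partial_i g_{jh} + \partial_j g_{ih} - \partial_h g_{ij}\big)
= \sum_{h=1}^{n}\frac12\,g^{kh}\,\partial_i\partial_j\partial_h\varphi(\theta)
= \sum_{h=1}^{n}\frac12\,\varphi^{kh}(\theta)\,\varphi_{ijh}(\theta),
\]
where $[\varphi^{hk}(\theta)]$ represents the inverse to $[\varphi_{hk}(\theta)]$.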
There is a family of symmetric connections which includes the Levi-Civita case and has significance in mathematical statistics. Consider for $\alpha \in \mathbb R$ the function $\Gamma^{(\alpha)}_{ij,k}$ which maps each point $\theta \in \Theta$ to the following value:
\[
\Gamma^{(\alpha)}_{ij,k}(\theta) = \int_\Omega\Big(\partial_i\partial_j l + \frac{1-\alpha}{2}\,\partial_i l\,\partial_j l\Big)\partial_k l\; p_\theta
= \frac{1-\alpha}{2}\,\partial_i\partial_j\partial_k\varphi(\theta) = \frac{1-\alpha}{2}\,\varphi_{ijk}(\theta). \tag{5}
\]
So we have an affine connection $\nabla^{(\alpha)}$ on the statistical manifold $(S, g)$ defined by
\[
g\big(\nabla^{(\alpha)}_{\partial_i}\partial_j, \partial_k\big) = \Gamma^{(\alpha)}_{ij,k},
\]
where $g$ is the Fisher information metric. We call this $\nabla^{(\alpha)}$ the α-connection; it is clearly a symmetric connection and defines an α-curvature. We have also
\[
\nabla^{(\alpha)} = (1-\alpha)\,\nabla^{(0)} + \alpha\,\nabla^{(1)} = \frac{1+\alpha}{2}\,\nabla^{(1)} + \frac{1-\alpha}{2}\,\nabla^{(-1)}.
\]
For a submanifold $M \subset S$, the α-connection on $M$ is simply the restriction with respect to $g$ of the α-connection on $S$. Note that the 0-connection is the Riemannian or Levi-Civita connection with respect to the Fisher metric and its uniqueness implies that an α-connection is a metric connection if and only if $\alpha = 0$.

In [4] we proved that every neighbourhood of an exponential distribution contains a neighbourhood of gamma distributions, in the subspace topology of $\mathbb R^3$, using an information geometric affine immersion of Dodson and Matsuzoe [10]. As part of a study of the information geometry and topology of near random and bivariate stochastic processes, cf. [3, 4, 6–9], we calculated the geometry of the Riemannian 4-manifold of Freund bivariate (mixture) exponential density functions. This family is important because exponential distributions represent intervals between events for Poisson processes and Freund distributions can model bivariate processes with positive and negative covariance. We derive the induced α-geometry, i.e., the α-Ricci curvature, the α-scalar curvature etc. The case $\alpha = 0$ recovers the Levi-Civita connection and it has a positive constant 0-scalar curvature.

Sato et al. [16] provided the bivariate Gaussian distributions as a Riemannian 5-manifold; it has a negative constant 0-scalar curvature and if the covariance is zero, the space becomes an Einstein space. We calculate the α-geometry. In each of the Freund and bivariate Gaussian cases we provide explicitly an affine immersion and examples of neighbourhoods of independence.
Thus, including the results we reported in [4], we now have explicit representations in $\mathbb R^3$ of information geometric tubular neighbourhoods containing by continuity each of the following:
• All distributions sufficiently close to a Poisson distribution
• All distributions sufficiently close to a uniform distribution
• All bivariate distributions sufficiently close to the independent bivariate Poisson distribution
• All bivariate distributions sufficiently close to the independent bivariate Gaussian distribution.

These results have wide application in the theory of stochastic processes because Poisson distributions model the random state of independent haphazard events, and provide good limiting models for some binomial distributions. Moreover, the Central Limit Theorem makes our neighbourhoods of Gaussian independence limiting cases for a wide range of bivariate distributions other than Gaussian. There are practical applications because the topological character of the results makes them stable under small perturbations. The authors used Mathematica to perform analytic calculations and the interactive notebooks are available for others to use [5].
2 Freund bivariate exponential 4-manifold F
Freund [11] introduced a bivariate exponential mixture distribution arising from the following reliability considerations. Suppose that an instrument has two components A and B with lifetimes $X$ and $Y$ respectively having probability density functions (when both components are in operation)
\[
f_X(x) = \alpha_1 e^{-\alpha_1 x}; \qquad f_Y(y) = \alpha_2 e^{-\alpha_2 y}
\]
for $\alpha_1, \alpha_2 > 0$; $x, y > 0$. Then $X$ and $Y$ are dependent in that a failure of either component changes the parameter of the life distribution of the other component. Thus when A fails, the parameter for $Y$ becomes $\beta_2$; when B fails, the parameter for $X$ becomes $\beta_1$. There is no other dependence. Hence the joint probability density function of $X$ and $Y$ is:
\[
f(x, y) = \begin{cases}
\alpha_1\beta_2\, e^{-\beta_2 y - (\alpha_1 + \alpha_2 - \beta_2)x} & \text{for } 0 < x < y,\\
\alpha_2\beta_1\, e^{-\beta_1 x - (\alpha_1 + \alpha_2 - \beta_1)y} & \text{for } 0 < y < x,
\end{cases} \tag{6}
\]
where $\alpha_i, \beta_i > 0$ ($i = 1, 2$). The marginal probability density function of $X \ge 0$ is (provided that $\alpha_1 + \alpha_2 \ne \beta_1$)
\[
f_X(x) = \Big(\frac{\alpha_2}{\alpha_1 + \alpha_2 - \beta_1}\Big)\beta_1 e^{-\beta_1 x} + \Big(\frac{\alpha_1 - \beta_1}{\alpha_1 + \alpha_2 - \beta_1}\Big)(\alpha_1 + \alpha_2)\,e^{-(\alpha_1 + \alpha_2)x}. \tag{7}
\]
The marginal probability density function of $Y \ge 0$ is (provided that $\alpha_1 + \alpha_2 \ne \beta_2$)
\[
f_Y(y) = \Big(\frac{\alpha_1}{\alpha_1 + \alpha_2 - \beta_2}\Big)\beta_2 e^{-\beta_2 y} + \Big(\frac{\alpha_2 - \beta_2}{\alpha_1 + \alpha_2 - \beta_2}\Big)(\alpha_1 + \alpha_2)\,e^{-(\alpha_1 + \alpha_2)y}. \tag{8}
\]
We can see that the marginal functions are not exponential but rather mixtures of exponential densities if $\alpha_i > \beta_i$; otherwise, they are weighted averages. This family should be termed bivariate mixture exponential densities rather than simply bivariate exponential densities. The marginal density functions $f_X(x)$ and $f_Y(y)$ are exponential distributions only in the special case $\alpha_i = \beta_i$ ($i = 1, 2$). The covariance and correlation coefficient of $X$ and $Y$ are given by:
\[
\operatorname{Cov}(X, Y) = \frac{\beta_1\beta_2 - \alpha_1\alpha_2}{\beta_1\beta_2(\alpha_1 + \alpha_2)^2}, \tag{9}
\]
\[
\rho(X, Y) = \frac{\beta_1\beta_2 - \alpha_1\alpha_2}{\sqrt{\big(\alpha_2^2 + 2\alpha_1\alpha_2 + \beta_1^2\big)\big(\alpha_1^2 + 2\alpha_1\alpha_2 + \beta_2^2\big)}}. \tag{10}
\]
Note that $-\frac13 < \rho(X, Y) < 1$. The correlation coefficient $\rho(X, Y) \to 1$ when $\beta_1, \beta_2 \to \infty$, and $\rho(X, Y) \to -\frac13$ when $\alpha_1 = \alpha_2$ and $\beta_1, \beta_2 \to 0$. In many applications, $\beta_i > \alpha_i$ ($i = 1, 2$) (i.e., lifetime tends to be shorter when the other component is out of action); in such cases the correlation is positive.
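The range and limiting values of the correlation coefficient (10) can be explored numerically. This is a minimal sketch with a hypothetical function name and sample parameter values; it evaluates (10) at the independent case $\alpha_i = \beta_i$ and near the two limits.

```python
import math

def freund_correlation(a1, a2, b1, b2):
    """Correlation coefficient (10) of the Freund distribution with parameters alpha_i, beta_i."""
    numerator = b1 * b2 - a1 * a2
    denominator = math.sqrt((a2**2 + 2 * a1 * a2 + b1**2) * (a1**2 + 2 * a1 * a2 + b2**2))
    return numerator / denominator

rho_independent = freund_correlation(1.0, 2.0, 1.0, 2.0)   # alpha_i = beta_i
rho_upper = freund_correlation(1.0, 1.0, 1e6, 1e6)         # beta_1, beta_2 large
rho_lower = freund_correlation(1.0, 1.0, 1e-6, 1e-6)       # alpha_1 = alpha_2, beta_i small
print(rho_independent, rho_upper, rho_lower)
```

The three values illustrate zero correlation at independence, the approach to 1 for large $\beta_i$, and the approach to $-\frac13$ for $\alpha_1 = \alpha_2$ with small $\beta_i$.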
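2.1 Freund Fisher metric

The Freund family F in coordinates $(\alpha_1, \alpha_2, \beta_1, \beta_2)$ has Fisher information metric components
\[
[g_{ij}] = \frac{1}{\alpha_1 + \alpha_2}\begin{pmatrix}
\frac{1}{\alpha_1} & 0 & 0 & 0\\
0 & \frac{1}{\alpha_2} & 0 & 0\\
0 & 0 & \frac{\alpha_2}{\beta_1^2} & 0\\
0 & 0 & 0 & \frac{\alpha_1}{\beta_2^2}
\end{pmatrix} \tag{11}
\]
with inverse
\[
[g^{ij}] = (\alpha_1 + \alpha_2)\begin{pmatrix}
\alpha_1 & 0 & 0 & 0\\
0 & \alpha_2 & 0 & 0\\
0 & 0 & \frac{\beta_1^2}{\alpha_2} & 0\\
0 & 0 & 0 & \frac{\beta_2^2}{\alpha_1}
\end{pmatrix}. \tag{12}
\]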
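Since both matrices are diagonal, the claim that (12) inverts (11) reduces to four scalar products. The following sketch (function names and sample parameters are hypothetical) stores only the diagonals and checks that their entrywise products equal 1.

```python
def freund_metric_diagonal(a1, a2, b1, b2):
    """Diagonal of the Fisher metric (11) in coordinates (alpha1, alpha2, beta1, beta2)."""
    s = a1 + a2
    return [1.0 / (a1 * s), 1.0 / (a2 * s), a2 / (b1**2 * s), a1 / (b2**2 * s)]

def freund_inverse_metric_diagonal(a1, a2, b1, b2):
    """Diagonal of the inverse metric (12)."""
    s = a1 + a2
    return [a1 * s, a2 * s, b1**2 * s / a2, b2**2 * s / a1]

params = (0.8, 1.3, 2.1, 0.5)   # hypothetical positive parameter values
g = freund_metric_diagonal(*params)
g_inv = freund_inverse_metric_diagonal(*params)
products = [x * y for x, y in zip(g, g_inv)]
print(products)
```

All four diagonal entries of (11) are positive for positive parameters, so the metric is positive definite throughout the parameter space.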
2.2 Natural coordinates and potential function It was noted by Leurgans, Tsai, and Crowley [14] that the family of Freund distributions forms an exponential family, with natural parameters (θ1 , θ2 , θ3 , θ4 ) = (α1 + β1 , α2 , log and potential function
α1 β2 α2 β1
, β2 )
(13)
K. Arwini, C.T.J. Dodson / Central European Journal of Mathematics 5(1) 2007 50–83
ϕ(θ) = − log
θ1 θ2 θ4 θ e 3 θ2 + θ4
55
= − log(α2 β1 ).
(14)
So by solving the equations θ1 = α1 + β1 , θ2 = α2 , θ3 = log
α1 β2 α2 β1
, θ4 = β2 .,
we obtain that: α1 =
eθ3
θ1 θ2 θ1 θ4 eθ3 , β1 = θ3 , α2 = θ2 , β2 = θ4 ., θ2 + θ4 e θ2 + θ4
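so (6) can be written in terms of the natural coordinate system as:
\[
f(x, y) =
\begin{cases}
\exp\left(\theta_1(-x) + \theta_3 + \theta_4(x-y) + \log\dfrac{\theta_1\theta_2\theta_4}{e^{\theta_3}\theta_2+\theta_4}\right) & \text{for } 0 < x < y, \\[2ex]
\exp\left(\theta_1(-y) + \theta_2(y-x) + \log\dfrac{\theta_1\theta_2\theta_4}{e^{\theta_3}\theta_2+\theta_4}\right) & \text{for } 0 < y < x.
\end{cases}
\tag{15}
\]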
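The Fisher metric with respect to the natural coordinates $(\theta_i)$ (13) is given by
\[
\left[\frac{\partial^2\varphi(\theta)}{\partial\theta_i\,\partial\theta_j}\right] =
\begin{bmatrix}
\frac{1}{\theta_1^2} & 0 & 0 & 0 \\[1ex]
0 & \frac{\theta_4\left(2e^{\theta_3}\theta_2+\theta_4\right)}{\theta_2^2\left(e^{\theta_3}\theta_2+\theta_4\right)^2} & \frac{e^{\theta_3}\theta_4}{\left(e^{\theta_3}\theta_2+\theta_4\right)^2} & -\frac{e^{\theta_3}}{\left(e^{\theta_3}\theta_2+\theta_4\right)^2} \\[1ex]
0 & \frac{e^{\theta_3}\theta_4}{\left(e^{\theta_3}\theta_2+\theta_4\right)^2} & \frac{e^{\theta_3}\theta_2\theta_4}{\left(e^{\theta_3}\theta_2+\theta_4\right)^2} & -\frac{e^{\theta_3}\theta_2}{\left(e^{\theta_3}\theta_2+\theta_4\right)^2} \\[1ex]
0 & -\frac{e^{\theta_3}}{\left(e^{\theta_3}\theta_2+\theta_4\right)^2} & -\frac{e^{\theta_3}\theta_2}{\left(e^{\theta_3}\theta_2+\theta_4\right)^2} & \frac{1}{\theta_4^2} - \frac{1}{\left(e^{\theta_3}\theta_2+\theta_4\right)^2}
\end{bmatrix}
\]
\[
=
\begin{bmatrix}
\frac{1}{(\alpha_1+\beta_1)^2} & 0 & 0 & 0 \\[1ex]
0 & \frac{\beta_1(2\alpha_1+\beta_1)}{\alpha_2^2(\alpha_1+\beta_1)^2} & \frac{\alpha_1\beta_1}{\alpha_2(\alpha_1+\beta_1)^2} & -\frac{\alpha_1\beta_1}{\alpha_2(\alpha_1+\beta_1)^2\beta_2} \\[1ex]
0 & \frac{\alpha_1\beta_1}{\alpha_2(\alpha_1+\beta_1)^2} & \frac{\alpha_1\beta_1}{(\alpha_1+\beta_1)^2} & -\frac{\alpha_1\beta_1}{(\alpha_1+\beta_1)^2\beta_2} \\[1ex]
0 & -\frac{\alpha_1\beta_1}{\alpha_2(\alpha_1+\beta_1)^2\beta_2} & -\frac{\alpha_1\beta_1}{(\alpha_1+\beta_1)^2\beta_2} & \frac{\alpha_1(\alpha_1+2\beta_1)}{(\alpha_1+\beta_1)^2\beta_2^2}
\end{bmatrix}.
\tag{16}
\]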
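Because (16) is the Hessian of the potential (14), its entries can be compared with finite differences of $\varphi$. The sketch below assumes a simple central-difference scheme with a hand-picked step size; `hessian_entry` and `metric_closed_form` are illustrative names, and the test point is arbitrary.

```python
import math

def phi(t):
    """Potential function (14) in natural coordinates."""
    t1, t2, t3, t4 = t
    return -math.log(t1 * t2 * t4 / (math.exp(t3) * t2 + t4))

def hessian_entry(f, t, i, j, h=1e-4):
    """Central finite-difference estimate of d^2 f / dt_i dt_j."""
    def shifted(di, dj):
        s = list(t)
        s[i] += di
        s[j] += dj
        return f(s)
    return (shifted(h, h) - shifted(h, -h) - shifted(-h, h) + shifted(-h, -h)) / (4 * h * h)

def metric_closed_form(t):
    """First matrix in (16)."""
    t1, t2, t3, t4 = t
    e = math.exp(t3)
    d2 = (e * t2 + t4) ** 2
    return [
        [1 / t1**2, 0.0, 0.0, 0.0],
        [0.0, t4 * (2 * e * t2 + t4) / (t2**2 * d2), e * t4 / d2, -e / d2],
        [0.0, e * t4 / d2, e * t2 * t4 / d2, -e * t2 / d2],
        [0.0, -e / d2, -e * t2 / d2, 1 / t4**2 - 1 / d2],
    ]

theta = [2.8, 1.3, 0.3, 0.4]
g = metric_closed_form(theta)
g_fd = [[hessian_entry(phi, theta, i, j) for j in range(4)] for i in range(4)]
max_err = max(abs(g[i][j] - g_fd[i][j]) for i in range(4) for j in range(4))
```

The discrepancy `max_err` is limited by the truncation error of the difference scheme, so it is small but not zero.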
2.3 Freund α-geometry

We report the analytic expressions for the α-connections and the α-curvature objects with respect to coordinates $(\alpha_1, \alpha_2, \beta_1, \beta_2)$; this is simpler than using the natural coordinates (13). Detailed expressions for the components of the α-connection (5) are given in Appendix A.
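Proposition 2.1. The nonzero independent components $R^{(\alpha)}_{ijkl}$ of the α-curvature tensor are given by:
\[
\begin{aligned}
R^{(\alpha)}_{1313} &= \frac{(\alpha^2-1)\,\alpha_2^2}{4\,\alpha_1(\alpha_1+\alpha_2)^3\beta_1^2}, &
R^{(\alpha)}_{1332} &= \frac{(\alpha^2-1)\,\alpha_2}{4\,(\alpha_1+\alpha_2)^3\beta_1^2}, \\
R^{(\alpha)}_{1414} &= \frac{(\alpha^2-1)\,\alpha_2}{4\,(\alpha_1+\alpha_2)^3\beta_2^2}, &
R^{(\alpha)}_{1424} &= \frac{-(\alpha^2-1)\,\alpha_1}{4\,(\alpha_1+\alpha_2)^3\beta_2^2}, \\
R^{(\alpha)}_{3232} &= \frac{(\alpha^2-1)\,\alpha_1}{4\,(\alpha_1+\alpha_2)^3\beta_1^2}, &
R^{(\alpha)}_{3434} &= \frac{(\alpha^2-1)\,\alpha_1\alpha_2}{4\,(\alpha_1+\alpha_2)^2\beta_1^2\beta_2^2}, \\
R^{(\alpha)}_{2424} &= \frac{(\alpha^2-1)\,\alpha_1^2}{4\,\alpha_2(\alpha_1+\alpha_2)^3\beta_2^2}.
\end{aligned}
\tag{17}
\]
□

Contracting $R^{(\alpha)}_{ijkl}$ with $g^{il}$ we obtain the components $R^{(\alpha)}_{jk}$ of the α-Ricci tensor.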
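Proposition 2.2. The α-Ricci tensor $R^{(\alpha)} = [R^{(\alpha)}_{jk}]$ is given by:
\[
R^{(\alpha)} = [R^{(\alpha)}_{jk}] = \frac{(1-\alpha^2)}{(\alpha_1+\alpha_2)}
\begin{bmatrix}
\frac{\alpha_2}{2\alpha_1(\alpha_1+\alpha_2)} & \frac{-1}{2(\alpha_1+\alpha_2)} & 0 & 0 \\[1ex]
\frac{-1}{2(\alpha_1+\alpha_2)} & \frac{\alpha_1}{2\alpha_2(\alpha_1+\alpha_2)} & 0 & 0 \\[1ex]
0 & 0 & \frac{\alpha_2}{2\beta_1^2} & 0 \\[1ex]
0 & 0 & 0 & \frac{\alpha_1}{2\beta_2^2}
\end{bmatrix}
\tag{18}
\]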
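The α-eigenvalues and the α-eigenvectors of the α-Ricci tensor are given by:
\[
(1-\alpha^2)
\begin{pmatrix}
0 \\[0.5ex]
\frac{1}{2\alpha_1\alpha_2} - \frac{1}{(\alpha_1+\alpha_2)^2} \\[0.5ex]
\frac{\alpha_2}{2(\alpha_1+\alpha_2)\beta_1^2} \\[0.5ex]
\frac{\alpha_1}{2(\alpha_1+\alpha_2)\beta_2^2}
\end{pmatrix}
\tag{19}
\]
\[
\begin{pmatrix}
\frac{\alpha_1}{\alpha_2} & 1 & 0 & 0 \\[0.5ex]
-\frac{\alpha_2}{\alpha_1} & 1 & 0 & 0 \\[0.5ex]
0 & 0 & 1 & 0 \\[0.5ex]
0 & 0 & 0 & 1
\end{pmatrix}
\tag{20}
\]
□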
Proposition 2.3. The manifold F has constant α-scalar curvature
\[
R^{(\alpha)} = \frac{3\,(1-\alpha^2)}{2}.
\tag{21}
\]
□
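The constant in (21) follows from contracting the α-Ricci tensor (18) with the inverse metric (12). The following sketch carries out that contraction numerically; the function names are illustrative and the parameter values are arbitrary.

```python
def freund_inverse_metric_diag(a1, a2, b1, b2):
    """Diagonal of the inverse metric (12)."""
    s = a1 + a2
    return [s * a1, s * a2, s * b1**2 / a2, s * b2**2 / a1]

def freund_ricci(a1, a2, b1, b2, alpha):
    """alpha-Ricci tensor (18) as a nested list."""
    s = a1 + a2
    c = (1 - alpha**2) / s
    return [
        [c * a2 / (2 * a1 * s), -c / (2 * s), 0.0, 0.0],
        [-c / (2 * s), c * a1 / (2 * a2 * s), 0.0, 0.0],
        [0.0, 0.0, c * a2 / (2 * b1**2), 0.0],
        [0.0, 0.0, 0.0, c * a1 / (2 * b2**2)],
    ]

def scalar_curvature(a1, a2, b1, b2, alpha):
    """Contract (18) with (12); only diagonal terms contribute because (12) is diagonal."""
    ginv = freund_inverse_metric_diag(a1, a2, b1, b2)
    ric = freund_ricci(a1, a2, b1, b2, alpha)
    return sum(ginv[i] * ric[i][i] for i in range(4))

r_half = scalar_curvature(0.7, 1.3, 2.1, 0.4, alpha=0.5)
r_zero = scalar_curvature(3.0, 0.2, 1.0, 5.0, alpha=0.0)
```

Both calls return $3(1-\alpha^2)/2$ regardless of the point chosen, which is the content of Proposition 2.3.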
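Proposition 2.4. The α-sectional curvatures $\varrho^{(\alpha)}(\lambda, \mu)$ ($\lambda, \mu = 1, 2, 3, 4$) are given by:
\[
\begin{aligned}
\varrho^{(\alpha)}(1,2) &= 0, \\
\varrho^{(\alpha)}(1,3) = \varrho^{(\alpha)}(1,4) &= \frac{(1-\alpha^2)\,\alpha_2}{4\,(\alpha_1+\alpha_2)}, \\
\varrho^{(\alpha)}(2,3) = \varrho^{(\alpha)}(2,4) &= \frac{(1-\alpha^2)\,\alpha_1}{4\,(\alpha_1+\alpha_2)}, \\
\varrho^{(\alpha)}(3,4) &= \frac{1-\alpha^2}{4}.
\end{aligned}
\tag{22}
\]
□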
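Proposition 2.5. The α-mean curvatures $\varrho^{(\alpha)}(\lambda)$ ($\lambda = 1, 2, 3, 4$) are given by:
\[
\begin{aligned}
\varrho^{(\alpha)}(1) &= \frac{(1-\alpha^2)\,\alpha_2}{6\,(\alpha_1+\alpha_2)}, \\
\varrho^{(\alpha)}(2) &= \frac{(1-\alpha^2)\,\alpha_1}{6\,(\alpha_1+\alpha_2)}, \\
\varrho^{(\alpha)}(3) = \varrho^{(\alpha)}(4) &= \frac{1-\alpha^2}{6}.
\end{aligned}
\tag{23}
\]
□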
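The mean curvatures (23) are consistent with the sectional curvatures (22) under one natural reading, which is an assumption here rather than a definition quoted from the paper: since the metric (11) is diagonal, the coordinate directions are mutually orthogonal, and the mean curvature in direction λ is taken to be the average of the sectional curvatures of the three coordinate planes containing it. A short sketch with illustrative names:

```python
def sectional(a1, a2, alpha):
    """Table of alpha-sectional curvatures (22); indices 0..3 stand for directions 1..4."""
    s = a1 + a2
    k = 1 - alpha**2
    table = [[0.0] * 4 for _ in range(4)]
    entries = [
        (0, 1, 0.0),
        (0, 2, k * a2 / (4 * s)), (0, 3, k * a2 / (4 * s)),
        (1, 2, k * a1 / (4 * s)), (1, 3, k * a1 / (4 * s)),
        (2, 3, k / 4),
    ]
    for lam, mu, value in entries:
        table[lam][mu] = table[mu][lam] = value
    return table

def mean_curvatures(a1, a2, alpha):
    """Average each row over the three other directions (diagonal entries are zero)."""
    return [sum(row) / 3 for row in sectional(a1, a2, alpha)]

a1, a2, alpha = 0.7, 1.3, 0.5
means = mean_curvatures(a1, a2, alpha)
expected = [
    (1 - alpha**2) * a2 / (6 * (a1 + a2)),
    (1 - alpha**2) * a1 / (6 * (a1 + a2)),
    (1 - alpha**2) / 6,
    (1 - alpha**2) / 6,
]
```

Under that assumption, `means` reproduces the four values listed in (23).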
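2.4 Dual coordinates

Since F is an exponential family, a mixture coordinate system is given by the potential function (14), that is,
\[
\begin{aligned}
\eta_1 &= \frac{\partial\varphi(\theta)}{\partial\theta_1} = -\frac{1}{\theta_1} = -\frac{1}{\alpha_1+\beta_1}, \\
\eta_2 &= \frac{\partial\varphi(\theta)}{\partial\theta_2} = -\frac{\theta_4}{\theta_2\left(e^{\theta_3}\theta_2+\theta_4\right)} = -\frac{\beta_1}{\alpha_2(\alpha_1+\beta_1)}, \\
\eta_3 &= \frac{\partial\varphi(\theta)}{\partial\theta_3} = 1 - \frac{\theta_4}{e^{\theta_3}\theta_2+\theta_4} = \frac{\alpha_1}{\alpha_1+\beta_1}, \\
\eta_4 &= \frac{\partial\varphi(\theta)}{\partial\theta_4} = -\frac{1}{\theta_4} + \frac{1}{e^{\theta_3}\theta_2+\theta_4} = -\frac{\alpha_1}{(\alpha_1+\beta_1)\,\beta_2}.
\end{aligned}
\tag{24}
\]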
Then $(\theta_1, \theta_2, \theta_3, \theta_4)$ is a 1-affine coordinate system, $(\eta_1, \eta_2, \eta_3, \eta_4)$ is a (−1)-affine coordinate system, and they are dual with respect to the Fisher information metric. The coordinates $(\eta_i)$ (24) have a potential function given by:
\[
\lambda = \log\left(\frac{\theta_1\theta_2\theta_4}{e^{\theta_3}\theta_2+\theta_4}\right) + \frac{e^{\theta_3}\theta_2\theta_3}{e^{\theta_3}\theta_2+\theta_4} - 2
= \log(\alpha_2\beta_1) + \frac{\alpha_1}{\alpha_1+\beta_1}\,\log\left(\frac{\alpha_1\beta_2}{\alpha_2\beta_1}\right) - 2.
\tag{25}
\]
The coordinates $(\theta_i)$ and $(\eta_i)$ form a dual coordinate system. Therefore the Freund manifold has dually orthogonal foliations (see, for example, Section 3.7 in [1]).
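The dual potential (25) is the Legendre transform of the potential (14), that is, $\lambda = \sum_i \theta_i\eta_i - \varphi(\theta)$ with $\eta_i$ from (24). The sketch below checks this identity at an arbitrary point; the function names are illustrative.

```python
import math

def phi(t1, t2, t3, t4):
    """Potential function (14)."""
    return -math.log(t1 * t2 * t4 / (math.exp(t3) * t2 + t4))

def eta(t1, t2, t3, t4):
    """Mixture coordinates (24), i.e. the gradient of phi."""
    d = math.exp(t3) * t2 + t4
    return (-1 / t1, -t4 / (t2 * d), 1 - t4 / d, -1 / t4 + 1 / d)

def dual_potential(t1, t2, t3, t4):
    """Closed form (25)."""
    d = math.exp(t3) * t2 + t4
    return math.log(t1 * t2 * t4 / d) + math.exp(t3) * t2 * t3 / d - 2

theta = (2.8, 1.3, 0.3, 0.4)
# Legendre transform: sum of theta_i * eta_i minus phi(theta).
legendre = sum(t * e for t, e in zip(theta, eta(*theta))) - phi(*theta)
```

The value of `legendre` agrees with `dual_potential(*theta)` to rounding error.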
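Take
\[
(\eta_1, \theta_2, \theta_3, \theta_4) = \left(-\frac{1}{\theta_1},\ \theta_2,\ \theta_3,\ \theta_4\right)
\]
as a coordinate system for F; then the Freund distributions take the form:
\[
f(x, y; \eta_1, \theta_2, \theta_3, \theta_4) =
\begin{cases}
-\dfrac{\theta_2\theta_4\, e^{\theta_3}}{\left(\theta_2 e^{\theta_3}+\theta_4\right)\eta_1}\; e^{\theta_4(x-y) + \frac{x}{\eta_1}} & \text{for } 0 < x < y, \\[2ex]
-\dfrac{\theta_2\theta_4}{\left(\theta_2 e^{\theta_3}+\theta_4\right)\eta_1}\; e^{\theta_2(y-x) + \frac{y}{\eta_1}} & \text{for } 0 < y < x,
\end{cases}
\tag{26}
\]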
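where $\eta_1 < 0$ and $\theta_i > 0$ (i = 2, 3, 4). The Fisher metric is
\[
[g_{ij}] = \frac{1}{\left(\theta_2 e^{\theta_3}+\theta_4\right)^2}
\begin{bmatrix}
\frac{\left(\theta_2 e^{\theta_3}+\theta_4\right)^2}{\eta_1^2} & 0 & 0 & 0 \\[1ex]
0 & \frac{\theta_4\left(2\theta_2 e^{\theta_3}+\theta_4\right)}{\theta_2^2} & \theta_4 e^{\theta_3} & -e^{\theta_3} \\[1ex]
0 & \theta_4 e^{\theta_3} & \theta_2\theta_4 e^{\theta_3} & -\theta_2 e^{\theta_3} \\[1ex]
0 & -e^{\theta_3} & -\theta_2 e^{\theta_3} & \frac{\left(\theta_2 e^{\theta_3}+\theta_4\right)^2}{\theta_4^2} - 1
\end{bmatrix}
\]
\[
= \frac{1}{(\alpha_1+\beta_1)^2}
\begin{bmatrix}
(\alpha_1+\beta_1)^4 & 0 & 0 & 0 \\[1ex]
0 & \frac{\beta_1(2\alpha_1+\beta_1)}{\alpha_2^2} & \frac{\alpha_1\beta_1}{\alpha_2} & -\frac{\alpha_1\beta_1}{\alpha_2\beta_2} \\[1ex]
0 & \frac{\alpha_1\beta_1}{\alpha_2} & \alpha_1\beta_1 & -\frac{\alpha_1\beta_1}{\beta_2} \\[1ex]
0 & -\frac{\alpha_1\beta_1}{\alpha_2\beta_2} & -\frac{\alpha_1\beta_1}{\beta_2} & \frac{\alpha_1(\alpha_1+2\beta_1)}{\beta_2^2}
\end{bmatrix}.
\tag{27}
\]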
We remark that (θi ) is a geodesic coordinate system of ∇(1) , and (ηi ) is a geodesic coordinate system of ∇(−1) .
2.5 Submanifolds of F

We consider four submanifolds $F_i$ (i = 1, 2, 3, 4) of the 4-manifold F of Freund bivariate exponential densities $f(x, y; \alpha_1, \alpha_2, \beta_1, \beta_2)$ (6), which includes the case of independent random variables. It includes also the special case of an Absolutely Continuous Bivariate Exponential Distribution called ACBED (or ACBVE) by Block and Basu (cf. Hutchinson and Lai [12]). We use the coordinate system $(\alpha_1, \alpha_2, \beta_1, \beta_2)$ for the submanifolds $F_i$ ($i \neq 4$), and the coordinate system $(\lambda_1, \lambda_{12}, \lambda_2)$ for ACBED of the Block and Basu case.

2.5.1 Independence submanifold: $F_1 \subset F$: $\beta_1 = \alpha_1$, $\beta_2 = \alpha_2$

The densities are of form:
\[
f(x, y; \alpha_1, \alpha_2) = f_1(x; \alpha_1)\, f_2(y; \alpha_2)
\tag{28}
\]
where the $f_i$ are the univariate exponential densities with parameters $\alpha_i > 0$ (i = 1, 2). This is the case for the independence of X and Y, so $F_1$ is the direct product of the Riemannian spaces $\{f_1(x; \alpha_1) = \alpha_1 e^{-\alpha_1 x},\ \alpha_1 > 0\}$ and $\{f_2(y; \alpha_2) = \alpha_2 e^{-\alpha_2 y},\ \alpha_2 > 0\}$.

Proposition 2.6. The metric tensor $[g_{ij}]$ is as follows:
\[
[g_{ij}] =
\begin{bmatrix}
\frac{1}{\alpha_1^2} & 0 \\[0.5ex]
0 & \frac{1}{\alpha_2^2}
\end{bmatrix}
\tag{29}
\]
□
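Proposition 2.7. The α-curvature tensor, α-Ricci tensor, and α-scalar curvature of $F_1$ are zero. □

2.5.2 Submanifold: $F_2 \subset F$: $\alpha_1 = \alpha_2$, $\beta_1 = \beta_2$

The probability density functions are of form:
\[
f(x, y; \alpha_1, \beta_1) =
\begin{cases}
\alpha_1\beta_1\, e^{-\beta_1 y - (2\alpha_1-\beta_1)x} & \text{for } 0 < x < y, \\
\alpha_1\beta_1\, e^{-\beta_1 x - (2\alpha_1-\beta_1)y} & \text{for } 0 < y < x,
\end{cases}
\tag{30}
\]
with parameters $\alpha_1, \beta_1 > 0$. The covariance, correlation coefficient and marginal density functions of X and Y are given by:
\[
\mathrm{Cov}(X, Y) = \frac{1}{4}\left(\frac{1}{\alpha_1^2} - \frac{1}{\beta_1^2}\right),
\tag{31}
\]
\[
\rho(X, Y) = 1 - \frac{4\alpha_1^2}{3\alpha_1^2+\beta_1^2},
\tag{32}
\]
\[
f_X(x) = \frac{\alpha_1}{2\alpha_1-\beta_1}\,\beta_1 e^{-\beta_1 x} + \frac{\alpha_1-\beta_1}{2\alpha_1-\beta_1}\,(2\alpha_1)\, e^{-2\alpha_1 x}, \quad x \ge 0,
\tag{33}
\]
\[
f_Y(y) = \frac{\alpha_1}{2\alpha_1-\beta_1}\,\beta_1 e^{-\beta_1 y} + \frac{\alpha_1-\beta_1}{2\alpha_1-\beta_1}\,(2\alpha_1)\, e^{-2\alpha_1 y}, \quad y \ge 0.
\tag{34}
\]
We see that $\rho(X, Y) = 0$ if and only if $\alpha_1 = \beta_1$. Also, $F_2$ forms an exponential family, with natural parameters $(\alpha_1, \beta_1)$ and potential function $\varphi = -\log(\alpha_1\beta_1)$.

Proposition 2.8. The submanifold $F_2$ is an isometric isomorph of the manifold $F_1$.

Proof. Since $\varphi = -\log(\alpha_1\beta_1)$ is a potential function, the Fisher metric is the Hessian of $\varphi$, that is,
\[
[g_{ij}] = \left[\frac{\partial^2\varphi}{\partial\theta_i\,\partial\theta_j}\right] =
\begin{bmatrix}
\frac{1}{\alpha_1^2} & 0 \\[0.5ex]
0 & \frac{1}{\beta_1^2}
\end{bmatrix}
\tag{35}
\]
where $(\theta_1, \theta_2) = (\alpha_1, \beta_1)$.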
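2.5.3 Submanifold: $F_3 \subset F$: $\beta_1 = \beta_2 = \alpha_1+\alpha_2$

The probability density functions are of form:
\[
f(x, y; \alpha_1, \alpha_2) =
\begin{cases}
\alpha_1(\alpha_1+\alpha_2)\, e^{-(\alpha_1+\alpha_2)y} & \text{for } 0 < x < y, \\
\alpha_2(\alpha_1+\alpha_2)\, e^{-(\alpha_1+\alpha_2)x} & \text{for } 0 < y < x,
\end{cases}
\tag{36}
\]
with parameters $\alpha_1, \alpha_2 > 0$. The covariance, correlation coefficient and marginal functions of X and Y are given by:
\[
\mathrm{Cov}(X, Y) = \frac{\alpha_1^2+\alpha_1\alpha_2+\alpha_2^2}{(\alpha_1+\alpha_2)^4},
\tag{37}
\]
\[
\rho(X, Y) = \frac{\alpha_1^2+\alpha_1\alpha_2+\alpha_2^2}{\sqrt{\left(2(\alpha_1+\alpha_2)^2-\alpha_1^2\right)\left(2\alpha_1^2+4\alpha_1\alpha_2+\alpha_2^2\right)}},
\tag{38}
\]
\[
f_X(x) = \left(\alpha_2(\alpha_1+\alpha_2)\,x + \alpha_1\right) e^{-(\alpha_1+\alpha_2)x}, \quad x \ge 0,
\tag{39}
\]
\[
f_Y(y) = \left(\alpha_1(\alpha_1+\alpha_2)\,y + \alpha_2\right) e^{-(\alpha_1+\alpha_2)y}, \quad y \ge 0.
\tag{40}
\]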
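Note that the correlation coefficient is positive.

Proposition 2.9. The metric tensor on $F_3$ is
\[
[g_{ij}] =
\begin{bmatrix}
\frac{\alpha_2+2\alpha_1}{\alpha_1(\alpha_1+\alpha_2)^2} & \frac{1}{(\alpha_1+\alpha_2)^2} \\[1ex]
\frac{1}{(\alpha_1+\alpha_2)^2} & \frac{\alpha_1+2\alpha_2}{\alpha_2(\alpha_1+\alpha_2)^2}
\end{bmatrix}.
\tag{41}
\]
□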
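Proposition 2.10. The α-curvature tensor, α-Ricci curvature, and α-scalar curvature of $F_3$ are zero. □

2.5.4 Submanifold: $F_4 \subset F$, ACBED of Block and Basu

The probability density functions are
\[
f(x, y; \lambda_1, \lambda_{12}, \lambda_2) =
\begin{cases}
\dfrac{\lambda_1\,\lambda\,(\lambda_2+\lambda_{12})}{\lambda_1+\lambda_2}\; e^{-\lambda_1 x - (\lambda_2+\lambda_{12})\,y} & \text{for } 0 < x < y, \\[2ex]
\dfrac{\lambda_2\,\lambda\,(\lambda_1+\lambda_{12})}{\lambda_1+\lambda_2}\; e^{-(\lambda_1+\lambda_{12})\,x - \lambda_2 y} & \text{for } 0 < y < x,
\end{cases}
\tag{42}
\]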
where the parameters $\lambda_1, \lambda_{12}, \lambda_2$ are positive, and $\lambda = \lambda_1+\lambda_2+\lambda_{12}$. This distribution was derived originally by omitting the singular part of the Marshall and Olkin distribution (cf. [13], page 139); Block and Basu called it the ACBED to emphasize that these are the absolutely continuous bivariate exponential distributions. Alternatively, these distributions can be obtained from (6), by taking
\[
\alpha_1 = \lambda_1 + \frac{\lambda_1\lambda_{12}}{\lambda_1+\lambda_2}, \qquad
\beta_1 = \lambda_1+\lambda_{12}, \qquad
\alpha_2 = \lambda_2 + \frac{\lambda_2\lambda_{12}}{\lambda_1+\lambda_2}, \qquad
\beta_2 = \lambda_2+\lambda_{12}.
\]
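The substitution above can be checked pointwise: inserting the four expressions into the Freund density (6) should reproduce the ACBED density (42). The sketch below does this at a few arbitrary points; the function names are illustrative.

```python
import math

def acbed_to_freund(l1, l12, l2):
    """Map ACBED parameters to Freund parameters (alpha1, alpha2, beta1, beta2)."""
    a1 = l1 + l1 * l12 / (l1 + l2)
    a2 = l2 + l2 * l12 / (l1 + l2)
    return a1, a2, l1 + l12, l2 + l12

def freund_density(x, y, a1, a2, b1, b2):
    """Joint density (6)."""
    if x < y:
        return a1 * b2 * math.exp(-b2 * y - (a1 + a2 - b2) * x)
    return a2 * b1 * math.exp(-b1 * x - (a1 + a2 - b1) * y)

def acbed_density(x, y, l1, l12, l2):
    """Joint density (42)."""
    lam = l1 + l2 + l12
    if x < y:
        return l1 * lam * (l2 + l12) / (l1 + l2) * math.exp(-l1 * x - (l2 + l12) * y)
    return l2 * lam * (l1 + l12) / (l1 + l2) * math.exp(-(l1 + l12) * x - l2 * y)

l1, l12, l2 = 0.8, 0.5, 1.7
freund_params = acbed_to_freund(l1, l12, l2)
points = [(0.2, 0.9), (1.3, 0.4), (0.05, 2.0), (3.0, 0.1)]
gaps = [abs(freund_density(x, y, *freund_params) - acbed_density(x, y, l1, l12, l2))
        for x, y in points]
```

Note that $\alpha_1+\alpha_2 = \lambda$ under this substitution, which is what makes the exponents match.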
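By substitution we obtain the covariance, correlation coefficient and marginal probability density functions:
\[
\mathrm{Cov}(X, Y) = \frac{(\lambda_1+\lambda_2)^2(\lambda_1+\lambda_{12})(\lambda_2+\lambda_{12}) - \lambda^2\lambda_1\lambda_2}{\lambda^2(\lambda_1+\lambda_2)^2(\lambda_1+\lambda_{12})(\lambda_2+\lambda_{12})},
\tag{43}
\]
\[
\rho(X, Y) = \frac{(\lambda_1+\lambda_2)^2(\lambda_1+\lambda_{12})(\lambda_2+\lambda_{12}) - \lambda^2\lambda_1\lambda_2}{\sqrt{\prod_{i=1,\, j\neq i}^{2}\left((\lambda_1+\lambda_2)^2(\lambda_i+\lambda_{12})^2 + \lambda_j(\lambda_j+2\lambda_i)\,\lambda^2\right)}},
\tag{44}
\]
\[
f_X(x) = \frac{-\lambda_{12}}{\lambda_1+\lambda_2}\,\lambda\, e^{-\lambda x} + \frac{\lambda}{\lambda_1+\lambda_2}\,(\lambda_1+\lambda_{12})\, e^{-(\lambda_1+\lambda_{12})\,x}, \quad x \ge 0,
\tag{45}
\]
\[
f_Y(y) = \frac{-\lambda_{12}}{\lambda_1+\lambda_2}\,\lambda\, e^{-\lambda y} + \frac{\lambda}{\lambda_1+\lambda_2}\,(\lambda_2+\lambda_{12})\, e^{-(\lambda_2+\lambda_{12})\,y}, \quad y \ge 0.
\tag{46}
\]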
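The correlation coefficient is positive, and the marginal density functions are a mixture of two exponentials.

Proposition 2.11. The metric tensor $[g_{ij}]$ using the coordinate system $(\lambda_1, \lambda_{12}, \lambda_2)$ is
\[
[g_{ij}] =
\begin{bmatrix}
\frac{\lambda_2}{\lambda_1+\lambda_2}\left(\frac{1}{(\lambda_1+\lambda_{12})^2} + \frac{1}{\lambda_1(\lambda_1+\lambda_2)}\right) + \frac{1}{\lambda^2} &
\frac{\lambda_2}{(\lambda_1+\lambda_2)(\lambda_1+\lambda_{12})^2} + \frac{1}{\lambda^2} &
\frac{-1}{(\lambda_1+\lambda_2)^2} + \frac{1}{\lambda^2} \\[2ex]
\frac{\lambda_2}{(\lambda_1+\lambda_2)(\lambda_1+\lambda_{12})^2} + \frac{1}{\lambda^2} &
\frac{1}{\lambda_1+\lambda_2}\left(\frac{\lambda_2}{(\lambda_1+\lambda_{12})^2} + \frac{\lambda_1}{(\lambda_2+\lambda_{12})^2}\right) + \frac{1}{\lambda^2} &
\frac{\lambda_1}{(\lambda_1+\lambda_2)(\lambda_2+\lambda_{12})^2} + \frac{1}{\lambda^2} \\[2ex]
\frac{-1}{(\lambda_1+\lambda_2)^2} + \frac{1}{\lambda^2} &
\frac{\lambda_1}{(\lambda_1+\lambda_2)(\lambda_2+\lambda_{12})^2} + \frac{1}{\lambda^2} &
\frac{\lambda_1}{\lambda_1+\lambda_2}\left(\frac{1}{(\lambda_2+\lambda_{12})^2} + \frac{1}{\lambda_2(\lambda_1+\lambda_2)}\right) + \frac{1}{\lambda^2}
\end{bmatrix}.
\tag{47}
\]
□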
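The Christoffel symbols, curvature tensor, Ricci tensor, scalar curvature, sectional curvatures and the mean curvatures were computed [3] but these are not listed here because the expressions are somewhat cumbersome. In the case when $\lambda_1 = \lambda_2$, this family of distributions becomes
\[
f(x, y; \lambda_1, \lambda_{12}) =
\begin{cases}
\dfrac{(2\lambda_1+\lambda_{12})(\lambda_1+\lambda_{12})}{2}\; e^{-\lambda_1 x - (\lambda_1+\lambda_{12})\,y} & \text{for } 0 < x < y, \\[2ex]
\dfrac{(2\lambda_1+\lambda_{12})(\lambda_1+\lambda_{12})}{2}\; e^{-\lambda_1 y - (\lambda_1+\lambda_{12})\,x} & \text{for } 0 < y < x,
\end{cases}
\tag{48}
\]
which is an exponential family with natural parameters $(\theta_1, \theta_2) = (\lambda_1, \lambda_{12})$ and potential function $\varphi(\theta) = \log(2) - \log(\lambda_1+\lambda_{12}) - \log(2\lambda_1+\lambda_{12})$. Note that, from equations (45) and (46), this family of bivariate distributions has two equal marginal density functions. So it is easy to derive the α-geometry; the metric tensor is:
\[
[g_{ij}] = \left[\frac{\partial^2\varphi}{\partial\theta_i\,\partial\theta_j}\right] =
\begin{bmatrix}
\frac{1}{(\lambda_1+\lambda_{12})^2} + \frac{4}{(2\lambda_1+\lambda_{12})^2} & \frac{1}{(\lambda_1+\lambda_{12})^2} + \frac{2}{(2\lambda_1+\lambda_{12})^2} \\[1ex]
\frac{1}{(\lambda_1+\lambda_{12})^2} + \frac{2}{(2\lambda_1+\lambda_{12})^2} & \frac{1}{(\lambda_1+\lambda_{12})^2} + \frac{1}{(2\lambda_1+\lambda_{12})^2}
\end{bmatrix}
\tag{49}
\]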
In this case, the α-curvature tensor, α-Ricci curvature, and α-scalar curvature are zero.
Additionally, since $(\lambda_1, \lambda_{12})$ is a 1-affine coordinate system, a (−1)-affine coordinate system is
\[
(\eta_1, \eta_2) = \left(-\frac{1}{\lambda_1+\lambda_{12}} - \frac{2}{2\lambda_1+\lambda_{12}},\ -\frac{1}{\lambda_1+\lambda_{12}} - \frac{1}{2\lambda_1+\lambda_{12}}\right)
\]
with potential function $\lambda = -2 - \log(2) + \log(2\lambda_1+\lambda_{12}) + \log(\lambda_1+\lambda_{12})$.

Fig. 1 Affine immersion in natural coordinates $(\alpha_1, \beta_1)$ as a surface in $\mathbb{R}^3$ for the Freund submanifold $F_2$. The curve $\alpha_1 = \beta_1$ in the surface consists of all bivariate distributions having common exponential marginals and zero covariance; its tubular neighbourhoods contain by continuity all immersions of bivariate exponential processes sufficiently close to the case of independence.
2.6 Affine immersion and neighbourhoods of independence

An important practical application of the Freund submanifold $F_2$ is the representation of a bivariate stochastic process with common marginal exponentials. The next results are important because they provide topological neighbourhoods of that subspace W in $F_2$ consisting of the bivariate processes that have zero covariance: we obtain neighbourhoods of independence for random (i.e. exponentially distributed) processes.

Proposition 2.12. Let F be the Freund 4-manifold with the Fisher metric g and the exponential connection $\nabla^{(1)}$. Denote by $(\theta_i)$ the natural coordinate system (13). Then F can be realized in $\mathbb{R}^5$ by the graph of a potential function, the affine immersion f:
\[
f : F \to \mathbb{R}^5 : (\theta_i) \mapsto
\begin{bmatrix}
\theta_i \\
\varphi(\theta)
\end{bmatrix},
\tag{50}
\]
where $\varphi(\theta)$ is the potential function $\varphi(\theta) = -\log\left(\frac{\theta_1\theta_2\theta_4}{e^{\theta_3}\theta_2+\theta_4}\right) = -\log(\alpha_2\beta_1)$. □
The case of Freund distributions with $\alpha_1 = \alpha_2$ and $\beta_1 = \beta_2$ is represented by the surface:
\[
\mathbb{R}^+ \times \mathbb{R}^+ \to \mathbb{R}^3 : (\alpha_1, \beta_1) \mapsto (\alpha_1, \beta_1, \varphi),
\]
where $\varphi = -\log(\alpha_1\beta_1)$. The submanifold $W \subset F_2$ consisting of the independent case ($\alpha_1 = \beta_1$) is represented by the curve:
\[
(0, \infty) \to \mathbb{R}^3 : (\alpha_1) \mapsto (\alpha_1, \alpha_1, -2\log\alpha_1).
\]
This is illustrated in Figure 1 which shows an affine embedding of $F_2$ as a surface in $\mathbb{R}^3$, and an $\mathbb{R}^3$-tubular neighbourhood of W, the curve $\alpha_1 = \beta_1$ in the surface. This curve represents all bivariate distributions having common exponential marginals and zero covariance; by continuity its tubular neighbourhoods contain all small enough departures from independence.

Proposition 2.13. In the affine embedding of the Freund submanifold $F_2$ in $\mathbb{R}^3$, a tubular neighbourhood of the curve $\alpha_1 = \beta_1$ will contain all affine immersions of bivariate exponential distributions sufficiently close to the case of independence. □
3
Bivariate Gaussian 5-manifold N
The bivariate Gaussian distribution has the form:
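\[
f(x, y) = \frac{1}{2\pi\sqrt{\sigma_1\sigma_2-\sigma_{12}^2}}\;
\exp\left(-\frac{1}{2\left(\sigma_1\sigma_2-\sigma_{12}^2\right)}\left(\sigma_2(x-\mu_1)^2 - 2\sigma_{12}(x-\mu_1)(y-\mu_2) + \sigma_1(y-\mu_2)^2\right)\right),
\tag{51}
\]
defined on $-\infty < x, y < \infty$ with parameters $(\mu_1, \mu_2, \sigma_1, \sigma_{12}, \sigma_2)$; where $-\infty < \mu_1, \mu_2 < \infty$, $0 < \sigma_1, \sigma_2 < \infty$ and $\sigma_{12}$ is the covariance of X and Y. The marginal density functions of X and Y are univariate Gaussian:
\[
f_X(x, \mu_1, \sigma_1) = \frac{1}{\sqrt{2\pi\sigma_1}}\; e^{-\frac{(x-\mu_1)^2}{2\sigma_1}},
\tag{52}
\]
\[
f_Y(y, \mu_2, \sigma_2) = \frac{1}{\sqrt{2\pi\sigma_2}}\; e^{-\frac{(y-\mu_2)^2}{2\sigma_2}}.
\tag{53}
\]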
The correlation coefficient is:
\[
\rho(X, Y) = \frac{\sigma_{12}}{\sqrt{\sigma_1\sigma_2}}.
\]
Since $\sigma_{12}^2 < \sigma_1\sigma_2$, we have $-1 < \rho(X, Y) < 1$; so we do not have the case when Y is a linear function of X.
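3.1 Fisher metric

The family N of bivariate Gaussian distributions with $(\mu_1, \mu_2, \sigma_1, \sigma_{12}, \sigma_2)$ as coordinate system becomes a 5-manifold with Fisher information metric components
\[
[g_{ij}] =
\begin{bmatrix}
\frac{\sigma_2}{\Delta} & -\frac{\sigma_{12}}{\Delta} & 0 & 0 & 0 \\[1ex]
-\frac{\sigma_{12}}{\Delta} & \frac{\sigma_1}{\Delta} & 0 & 0 & 0 \\[1ex]
0 & 0 & \frac{\sigma_2^2}{2\Delta^2} & -\frac{\sigma_2\sigma_{12}}{\Delta^2} & \frac{\sigma_{12}^2}{2\Delta^2} \\[1ex]
0 & 0 & -\frac{\sigma_2\sigma_{12}}{\Delta^2} & \frac{\sigma_1\sigma_2+\sigma_{12}^2}{\Delta^2} & -\frac{\sigma_1\sigma_{12}}{\Delta^2} \\[1ex]
0 & 0 & \frac{\sigma_{12}^2}{2\Delta^2} & -\frac{\sigma_1\sigma_{12}}{\Delta^2} & \frac{\sigma_1^2}{2\Delta^2}
\end{bmatrix}.
\tag{54}
\]
The inverse is
\[
[g^{ij}] =
\begin{bmatrix}
\sigma_1 & \sigma_{12} & 0 & 0 & 0 \\
\sigma_{12} & \sigma_2 & 0 & 0 & 0 \\
0 & 0 & 2\sigma_1^2 & 2\sigma_1\sigma_{12} & 2\sigma_{12}^2 \\
0 & 0 & 2\sigma_1\sigma_{12} & \sigma_1\sigma_2+\sigma_{12}^2 & 2\sigma_{12}\sigma_2 \\
0 & 0 & 2\sigma_{12}^2 & 2\sigma_{12}\sigma_2 & 2\sigma_2^2
\end{bmatrix}
\tag{55}
\]
where $\Delta = \sigma_1\sigma_2 - \sigma_{12}^2$. See Skovgaard [17] for the metric in the case of general multivariate Gaussians, which also form an exponential family.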
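That (55) really is the inverse of (54) can be confirmed by multiplying the two matrices. The sketch below uses plain nested lists; the function names are illustrative and the parameter values are arbitrary subject to $\sigma_1\sigma_2 > \sigma_{12}^2$.

```python
def gaussian_metric(s1, s12, s2):
    """Fisher metric (54) in coordinates (mu1, mu2, sigma1, sigma12, sigma2)."""
    d = s1 * s2 - s12**2
    return [
        [s2 / d, -s12 / d, 0, 0, 0],
        [-s12 / d, s1 / d, 0, 0, 0],
        [0, 0, s2**2 / (2 * d**2), -s2 * s12 / d**2, s12**2 / (2 * d**2)],
        [0, 0, -s2 * s12 / d**2, (s1 * s2 + s12**2) / d**2, -s1 * s12 / d**2],
        [0, 0, s12**2 / (2 * d**2), -s1 * s12 / d**2, s1**2 / (2 * d**2)],
    ]

def gaussian_inverse_metric(s1, s12, s2):
    """Inverse metric (55)."""
    return [
        [s1, s12, 0, 0, 0],
        [s12, s2, 0, 0, 0],
        [0, 0, 2 * s1**2, 2 * s1 * s12, 2 * s12**2],
        [0, 0, 2 * s1 * s12, s1 * s2 + s12**2, 2 * s12 * s2],
        [0, 0, 2 * s12**2, 2 * s12 * s2, 2 * s2**2],
    ]

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

product = matmul(gaussian_metric(2.0, 0.3, 1.5), gaussian_inverse_metric(2.0, 0.3, 1.5))
identity_gap = max(abs(product[i][j] - (1.0 if i == j else 0.0))
                   for i in range(5) for j in range(5))
```

Neither matrix depends on the means, so only the three covariance parameters are passed in.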
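3.2 Natural coordinates and potential function

Proposition 3.1. The set of all bivariate Gaussian distributions forms an exponential family, with natural coordinate system
\[
(\theta_1, \theta_2, \theta_3, \theta_4, \theta_5) = \left(\frac{\mu_1\sigma_2-\mu_2\sigma_{12}}{\Delta},\ \frac{\mu_2\sigma_1-\mu_1\sigma_{12}}{\Delta},\ \frac{-\sigma_2}{2\Delta},\ \frac{\sigma_{12}}{\Delta},\ \frac{-\sigma_1}{2\Delta}\right)
\tag{56}
\]
and corresponding potential function
\[
\varphi(\theta) = \log\left(2\pi\sqrt{\Delta}\right) + \frac{\mu_2^2\sigma_1 + \mu_1^2\sigma_2 - 2\mu_1\mu_2\sigma_{12}}{2\Delta}
= \log\left(2\pi\sqrt{\Delta}\right) - \Delta\left(\theta_2^2\theta_3 - \theta_1\theta_2\theta_4 + \theta_1^2\theta_5\right),
\tag{57}
\]
where
\[
\Delta = \frac{1}{4\theta_3\theta_5 - \theta_4^2}.
\]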
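Proof.
\[
\begin{aligned}
\log f(x, y) &= \log\left(\frac{1}{2\pi\sqrt{\Delta}}\; e^{-\frac{1}{2\Delta}\left(\sigma_2(x-\mu_1)^2 - 2\sigma_{12}(x-\mu_1)(y-\mu_2) + \sigma_1(y-\mu_2)^2\right)}\right) \\
&= \frac{\mu_1\sigma_2-\mu_2\sigma_{12}}{\Delta}\,x + \frac{\mu_2\sigma_1-\mu_1\sigma_{12}}{\Delta}\,y + \frac{-\sigma_2}{2\Delta}\,x^2 + \frac{\sigma_{12}}{\Delta}\,xy + \frac{-\sigma_1}{2\Delta}\,y^2 \qquad (58) \\
&\quad - \left(\log\left(2\pi\sqrt{\Delta}\right) + \frac{\mu_2^2\sigma_1 + \mu_1^2\sigma_2 - 2\mu_1\mu_2\sigma_{12}}{2\Delta}\right). \qquad (59)
\end{aligned}
\]
Hence the set of all bivariate Gaussian distributions is an exponential family. The line (58) implies that $\left(\frac{\mu_1\sigma_2-\mu_2\sigma_{12}}{\Delta}, \frac{\mu_2\sigma_1-\mu_1\sigma_{12}}{\Delta}, \frac{-\sigma_2}{2\Delta}, \frac{\sigma_{12}}{\Delta}, \frac{-\sigma_1}{2\Delta}\right)$ is a natural coordinate system and $(x_1, x_2, x_3, x_4, x_5) = (F_1(x), F_2(x), F_3(x), F_4(x), F_5(x)) = (x, y, x^2, xy, y^2)$ is a random variable, and (59) implies that
\[
\varphi(\theta) = \log\left(2\pi\sqrt{\Delta}\right) + \frac{\mu_2^2\sigma_1 + \mu_1^2\sigma_2 - 2\mu_1\mu_2\sigma_{12}}{2\Delta}
\]
is its potential function. We can write the potential function in terms of natural coordinates by solving the set of equations:
\[
\theta_1 = \frac{\mu_1\sigma_2-\mu_2\sigma_{12}}{\sigma_1\sigma_2-\sigma_{12}^2}, \quad
\theta_2 = \frac{\mu_2\sigma_1-\mu_1\sigma_{12}}{\sigma_1\sigma_2-\sigma_{12}^2}, \quad
\theta_3 = \frac{-\sigma_2}{2\left(\sigma_1\sigma_2-\sigma_{12}^2\right)}, \quad
\theta_4 = \frac{\sigma_{12}}{\sigma_1\sigma_2-\sigma_{12}^2}, \quad
\theta_5 = \frac{-\sigma_1}{2\left(\sigma_1\sigma_2-\sigma_{12}^2\right)};
\]
we obtain:
\[
\mu_1 = \frac{2\theta_1\theta_5-\theta_2\theta_4}{\theta_4^2-4\theta_3\theta_5}, \quad
\mu_2 = \frac{2\theta_2\theta_3-\theta_1\theta_4}{\theta_4^2-4\theta_3\theta_5}, \quad
\sigma_1 = \frac{2\theta_5}{\theta_4^2-4\theta_3\theta_5}, \quad
\sigma_{12} = \frac{\theta_4}{4\theta_3\theta_5-\theta_4^2}, \quad
\sigma_2 = \frac{2\theta_3}{\theta_4^2-4\theta_3\theta_5}.
\]
Then
\[
\varphi = \log\left(2\pi\sqrt{\Delta}\right) - \Delta\left(\theta_2^2\theta_3 - \theta_1\theta_2\theta_4 + \theta_1^2\theta_5\right), \quad \text{where } \Delta = \sigma_1\sigma_2-\sigma_{12}^2 = \frac{1}{4\theta_3\theta_5-\theta_4^2}.
\]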
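The conversion between $(\mu_1, \mu_2, \sigma_1, \sigma_{12}, \sigma_2)$ and the natural coordinates (56), and the two forms of the potential (57), can be exercised with a round trip. This is a minimal sketch with illustrative function names and an arbitrary test point.

```python
import math

def to_natural(m1, m2, s1, s12, s2):
    """Natural coordinates (56)."""
    d = s1 * s2 - s12**2
    return ((m1 * s2 - m2 * s12) / d, (m2 * s1 - m1 * s12) / d,
            -s2 / (2 * d), s12 / d, -s1 / (2 * d))

def from_natural(t1, t2, t3, t4, t5):
    """Inverse map; q equals -1/Delta."""
    q = t4**2 - 4 * t3 * t5
    return ((2 * t1 * t5 - t2 * t4) / q, (2 * t2 * t3 - t1 * t4) / q,
            2 * t5 / q, -t4 / q, 2 * t3 / q)

def potential_natural(t1, t2, t3, t4, t5):
    """Second form of (57), in natural coordinates."""
    d = 1 / (4 * t3 * t5 - t4**2)
    return math.log(2 * math.pi * math.sqrt(d)) - d * (t2**2 * t3 - t1 * t2 * t4 + t1**2 * t5)

def potential_standard(m1, m2, s1, s12, s2):
    """First form of (57), in the original parameters."""
    d = s1 * s2 - s12**2
    return (math.log(2 * math.pi * math.sqrt(d))
            + (m2**2 * s1 + m1**2 * s2 - 2 * m1 * m2 * s12) / (2 * d))

params = (0.5, -1.0, 2.0, 0.3, 1.5)   # (mu1, mu2, sigma1, sigma12, sigma2)
theta = to_natural(*params)
back = from_natural(*theta)
```

Here `back` reproduces `params`, and the two potential functions agree at corresponding points.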
Fig. 2 The 0-sectional curvatures $\varrho^{(0)}$ (vertical axis) as a function of correlation ρ (horizontal axis) for bivariate Gaussian manifold N, where: $\varrho^{(0)}(1,3) = \varrho^{(0)}(2,5) = \varrho^{(0)}(3,4) = \varrho^{(0)}(4,5) = -\tfrac{1}{2}$, $\varrho^{(0)}(1,2) = \tfrac{1}{4}$, $\varrho^{(0)}(1,4) = \varrho^{(0)}(2,4)$ and $\varrho^{(0)}(1,5) = \varrho^{(0)}(2,3)$. Note that $\varrho^{(0)}(1,4)$, $\varrho^{(0)}(1,5)$ and $\varrho^{(0)}(3,5)$ have limiting value $-\tfrac{1}{2}$ as $\rho \to \pm 1$.
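3.3 α-geometry

Since the analytic expressions for the α-connections and the α-curvature objects are very large in the natural coordinate system, we report these components in terms of the coordinate system $(\mu_1, \mu_2, \sigma_1, \sigma_{12}, \sigma_2)$. Details are given in Appendix B for the components of the α-connection from equation (5). Skovgaard [17] has given the formula for the 0-connection for multivariate Gaussians. These results were extended by Mitchell [15] who gave the metric and α-connections for multivariate elliptic distributions.

Proposition 3.2. The components of the α-Ricci tensor are given by the symmetric matrix $R^{(\alpha)} = [R^{(\alpha)}_{ij}]$:
\[
R^{(\alpha)} = \left(\alpha^2-1\right)
\begin{bmatrix}
\frac{\sigma_2}{2\Delta} & -\frac{\sigma_{12}}{2\Delta} & 0 & 0 & 0 \\[1ex]
-\frac{\sigma_{12}}{2\Delta} & \frac{\sigma_1}{2\Delta} & 0 & 0 & 0 \\[1ex]
0 & 0 & \frac{\sigma_2^2}{2\Delta^2} & -\frac{\sigma_2\sigma_{12}}{\Delta^2} & \frac{3\sigma_{12}^2-\sigma_1\sigma_2}{4\Delta^2} \\[1ex]
0 & 0 & -\frac{\sigma_2\sigma_{12}}{\Delta^2} & \frac{3\sigma_1\sigma_2+\sigma_{12}^2}{2\Delta^2} & -\frac{\sigma_1\sigma_{12}}{\Delta^2} \\[1ex]
0 & 0 & \frac{3\sigma_{12}^2-\sigma_1\sigma_2}{4\Delta^2} & -\frac{\sigma_1\sigma_{12}}{\Delta^2} & \frac{\sigma_1^2}{2\Delta^2}
\end{bmatrix}
\tag{60}
\]
□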
Fig. 3 The 0-mean curvatures $\varrho^{(0)}(\lambda)$ (vertical axis) as a function of correlation ρ (horizontal axis) for bivariate Gaussian manifold N, where: $\varrho^{(0)}(1) = \varrho^{(0)}(2) = -\tfrac{1}{8}$, $\varrho^{(0)}(3) = \varrho^{(0)}(5) = -\tfrac{1}{4}$, and $\varrho^{(0)}(4) \to -\tfrac{1}{4}$ as $\rho \to \pm 1$, and $\varrho^{(0)}(4) \to -\tfrac{3}{8}$ as $\rho \to 0$.

Proposition 3.3. The bivariate Gaussian manifold N has a constant α-scalar curvature $R^{(\alpha)}$:
\[
R^{(\alpha)} = \frac{9\left(\alpha^2-1\right)}{2}.
\tag{61}
\]
This recovers the known result for the 0-scalar curvature $R^{(0)} = -\tfrac{9}{2}$. □
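The constant in (61) can be reproduced by contracting the α-Ricci tensor (60) with the inverse metric (55). The sketch below performs the full double sum, since neither matrix is diagonal; the function names are illustrative and the test points are arbitrary.

```python
def inverse_metric(s1, s12, s2):
    """Inverse Fisher metric (55)."""
    return [
        [s1, s12, 0, 0, 0],
        [s12, s2, 0, 0, 0],
        [0, 0, 2 * s1**2, 2 * s1 * s12, 2 * s12**2],
        [0, 0, 2 * s1 * s12, s1 * s2 + s12**2, 2 * s12 * s2],
        [0, 0, 2 * s12**2, 2 * s12 * s2, 2 * s2**2],
    ]

def ricci(s1, s12, s2, alpha):
    """alpha-Ricci tensor (60)."""
    d = s1 * s2 - s12**2
    k = alpha**2 - 1
    m = [
        [s2 / (2 * d), -s12 / (2 * d), 0, 0, 0],
        [-s12 / (2 * d), s1 / (2 * d), 0, 0, 0],
        [0, 0, s2**2 / (2 * d**2), -s2 * s12 / d**2, (3 * s12**2 - s1 * s2) / (4 * d**2)],
        [0, 0, -s2 * s12 / d**2, (3 * s1 * s2 + s12**2) / (2 * d**2), -s1 * s12 / d**2],
        [0, 0, (3 * s12**2 - s1 * s2) / (4 * d**2), -s1 * s12 / d**2, s1**2 / (2 * d**2)],
    ]
    return [[k * x for x in row] for row in m]

def scalar_curvature(s1, s12, s2, alpha):
    """Full contraction of the inverse metric with the Ricci tensor."""
    ginv = inverse_metric(s1, s12, s2)
    ric = ricci(s1, s12, s2, alpha)
    return sum(ginv[i][j] * ric[i][j] for i in range(5) for j in range(5))

r0 = scalar_curvature(2.0, 0.3, 1.5, alpha=0.0)
r_half = scalar_curvature(0.7, -0.4, 1.1, alpha=0.5)
```

The result does not depend on the covariance parameters, only on α.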
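Hence N is ±1-flat, as in fact also are the multivariate Gaussians, by Theorems 2.5, 3.5 in Amari and Nagaoka [2].

Proposition 3.4. The α-sectional curvatures of N can be written as a function of correlation $\rho(X, Y)$ only, as follows:
\[
\varrho^{(\alpha)} = \left(\alpha^2-1\right)
\begin{bmatrix}
0 & -\frac{1}{4} & \frac{1}{2} & \frac{1+3\rho^2}{4(1+\rho^2)} & \frac{\rho^2}{2} \\[1ex]
-\frac{1}{4} & 0 & \frac{\rho^2}{2} & \frac{1+3\rho^2}{4(1+\rho^2)} & \frac{1}{2} \\[1ex]
\frac{1}{2} & \frac{\rho^2}{2} & 0 & \frac{1}{2} & \frac{\rho^2}{1+\rho^2} \\[1ex]
\frac{1+3\rho^2}{4(1+\rho^2)} & \frac{1+3\rho^2}{4(1+\rho^2)} & \frac{1}{2} & 0 & \frac{1}{2} \\[1ex]
\frac{\rho^2}{2} & \frac{1}{2} & \frac{\rho^2}{1+\rho^2} & \frac{1}{2} & 0
\end{bmatrix}.
\tag{62}
\]
Figure 2 shows a plot of the 0-sectional curvatures $\varrho^{(0)}$ as a function of correlation ρ for bivariate Gaussian manifold N. □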
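Proposition 3.5. The α-mean curvatures $\varrho^{(\alpha)}(\lambda)$ ($\lambda = 1, 2, 3, 4, 5$) are given by:
\[
\begin{aligned}
\varrho^{(\alpha)}(1) = \varrho^{(\alpha)}(2) &= \frac{\alpha^2-1}{8}, \\
\varrho^{(\alpha)}(3) = \varrho^{(\alpha)}(5) &= \frac{\alpha^2-1}{4}, \\
\varrho^{(\alpha)}(4) &= \frac{\left(\alpha^2-1\right)\left(3\sigma_1\sigma_2+\sigma_{12}^2\right)}{8\left(\sigma_1\sigma_2+\sigma_{12}^2\right)} = \frac{\left(\alpha^2-1\right)\left(3+\rho^2\right)}{8\left(1+\rho^2\right)}.
\end{aligned}
\tag{63}
\]
Figure 3 shows a plot of the 0-mean curvatures $\varrho^{(0)}(\lambda)$ as a function of correlation ρ for bivariate Gaussian manifold N. □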
3.4
Dual coordinates
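Since N is an exponential family, a mixture coordinate system is given by the potential function (59), that is,
\[
\begin{aligned}
\eta_1 &= \frac{\partial\varphi}{\partial\theta_1} = \frac{2\theta_1\theta_5-\theta_2\theta_4}{\theta_4^2-4\theta_3\theta_5} = \mu_1, \\
\eta_2 &= \frac{\partial\varphi}{\partial\theta_2} = \frac{2\theta_2\theta_3-\theta_1\theta_4}{\theta_4^2-4\theta_3\theta_5} = \mu_2, \\
\eta_3 &= \frac{\partial\varphi}{\partial\theta_3} = \frac{\theta_2^2\theta_4^2 + 2\theta_4\left(-2\theta_1\theta_2+\theta_4\right)\theta_5 + 4\left(\theta_1^2-2\theta_3\right)\theta_5^2}{\left(\theta_4^2-4\theta_3\theta_5\right)^2} = \mu_1^2+\sigma_1, \\
\eta_4 &= \frac{\partial\varphi}{\partial\theta_4} = -\frac{2\theta_2^2\theta_3\theta_4 + \theta_4^3 + 2\left(\theta_1^2-2\theta_3\right)\theta_4\theta_5 - \theta_1\theta_2\left(\theta_4^2+4\theta_3\theta_5\right)}{\left(\theta_4^2-4\theta_3\theta_5\right)^2} = \mu_1\mu_2+\sigma_{12}, \\
\eta_5 &= \frac{\partial\varphi}{\partial\theta_5} = \frac{4\theta_2^2\theta_3^2 - 4\theta_1\theta_2\theta_3\theta_4 + \left(\theta_1^2+2\theta_3\right)\theta_4^2 - 8\theta_3^2\theta_5}{\left(\theta_4^2-4\theta_3\theta_5\right)^2} = \mu_2^2+\sigma_2.
\end{aligned}
\tag{64}
\]
We have $(\theta_1, \theta_2, \theta_3, \theta_4, \theta_5)$ as a 1-affine coordinate system, so $(\eta_1, \eta_2, \eta_3, \eta_4, \eta_5)$ is a (−1)-affine coordinate system, and they are dual with respect to the Fisher information metric. The coordinates $(\eta_i)$ have a potential function given by:
\[
\lambda = -\left(1 + \log\left(2\pi\sqrt{\Delta}\right)\right).
\tag{65}
\]
The coordinates $(\theta_i)$ and $(\eta_i)$ form a dual coordinate system. Therefore the bivariate Gaussian manifold has dually orthogonal foliations (see, for example, Section 3.7 in [1]). Take
\[
(\eta_1, \eta_2, \theta_3, \theta_4, \theta_5) = \left(\mu_1,\ \mu_2,\ \frac{-\sigma_2}{2\left(\sigma_1\sigma_2-\sigma_{12}^2\right)},\ \frac{\sigma_{12}}{\sigma_1\sigma_2-\sigma_{12}^2},\ \frac{-\sigma_1}{2\left(\sigma_1\sigma_2-\sigma_{12}^2\right)}\right)
\]
as a coordinate system for N; then the bivariate Gaussian distributions take the form:
\[
f(x, y; \eta_1, \eta_2, \theta_3, \theta_4, \theta_5) = \frac{1}{2\pi}\sqrt{4\theta_3\theta_5-\theta_4^2}\;\; e^{\theta_3(x-\mu_1)^2 + \theta_4(x-\mu_1)(y-\mu_2) + \theta_5(y-\mu_2)^2}
\tag{66}
\]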
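and the Fisher metric is
\[
\begin{bmatrix}
\sigma_1 & \sigma_{12} & 0 & 0 & 0 \\[1ex]
\sigma_{12} & \sigma_2 & 0 & 0 & 0 \\[1ex]
0 & 0 & \frac{\sigma_2^2}{2\Delta^2} & -\frac{\sigma_2\sigma_{12}}{\Delta^2} & \frac{\sigma_{12}^2}{2\Delta^2} \\[1ex]
0 & 0 & -\frac{\sigma_2\sigma_{12}}{\Delta^2} & \frac{\sigma_1\sigma_2+\sigma_{12}^2}{\Delta^2} & -\frac{\sigma_1\sigma_{12}}{\Delta^2} \\[1ex]
0 & 0 & \frac{\sigma_{12}^2}{2\Delta^2} & -\frac{\sigma_1\sigma_{12}}{\Delta^2} & \frac{\sigma_1^2}{2\Delta^2}
\end{bmatrix}.
\tag{67}
\]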
We remark that (θi ) is a geodesic coordinate system of ∇(1) , and (ηi ) is a geodesic coordinate system of ∇(−1) .
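3.5 Bivariate Gaussian submanifolds

3.5.1 Independence submanifold: $N_1 \subset N$: $\sigma_{12} = 0$

The distributions are of form:
\[
f(x, y; \mu_1, \mu_2, \sigma_1, \sigma_2) = f_X(x, \mu_1, \sigma_1)\, f_Y(y, \mu_2, \sigma_2)
\tag{68}
\]
This is the case for statistical independence of X and Y, so the space $N_1$ is the direct product of two Riemannian spaces $\{f_X(x, \mu_1, \sigma_1),\ \mu_1 \in \mathbb{R},\ \sigma_1 \in \mathbb{R}^+\}$ and $\{f_Y(y, \mu_2, \sigma_2),\ \mu_2 \in \mathbb{R},\ \sigma_2 \in \mathbb{R}^+\}$. We report expressions for the metric, the α-connections and the α-curvature objects using the natural coordinate system
\[
(\theta_1, \theta_2, \theta_3, \theta_4) = \left(\frac{\mu_1}{\sigma_1},\ \frac{\mu_2}{\sigma_2},\ -\frac{1}{2\sigma_1},\ -\frac{1}{2\sigma_2}\right)
\]
and potential function
\[
\varphi = \log\left(2\pi\sqrt{\Delta}\right) - \Delta\left(\theta_2^2\theta_3 + \theta_1^2\theta_4\right); \qquad \Delta = \frac{1}{4\theta_3\theta_4}.
\]
Proposition 3.6. The metric tensor is:
\[
[g_{ij}] =
\begin{bmatrix}
\sigma_1 & 0 & 2\mu_1\sigma_1 & 0 \\
0 & \sigma_2 & 0 & 2\mu_2\sigma_2 \\
2\mu_1\sigma_1 & 0 & 2\sigma_1\left(2\mu_1^2+\sigma_1\right) & 0 \\
0 & 2\mu_2\sigma_2 & 0 & 2\sigma_2\left(2\mu_2^2+\sigma_2\right)
\end{bmatrix}.
\tag{69}
\]
□

Proposition 3.7. By direct calculation we have the α-curvature tensor given by
\[
R^{(\alpha)}_{1313} = -\left(\alpha^2-1\right)\sigma_1^3, \qquad R^{(\alpha)}_{2424} = -\left(\alpha^2-1\right)\sigma_2^3
\tag{70}
\]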
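while the other independent components are zero. By contraction we obtain the α-Ricci tensor:
\[
R^{(\alpha)} = \left(\alpha^2-1\right)
\begin{bmatrix}
\frac{\sigma_1}{2} & 0 & \mu_1\sigma_1 & 0 \\[0.5ex]
0 & \frac{\sigma_2}{2} & 0 & \mu_2\sigma_2 \\[0.5ex]
\mu_1\sigma_1 & 0 & \sigma_1\left(2\mu_1^2+\sigma_1\right) & 0 \\[0.5ex]
0 & \mu_2\sigma_2 & 0 & \sigma_2\left(2\mu_2^2+\sigma_2\right)
\end{bmatrix}.
\tag{71}
\]
The α-eigenvalues of the α-Ricci tensor are given by:
\[
\left(\alpha^2-1\right)
\begin{pmatrix}
\frac{\sigma_1}{4} + \mu_1^2\sigma_1 + \frac{\sigma_1}{4}\sqrt{16\mu_1^4 + (1-2\sigma_1)^2 + 8\mu_1^2(1+2\sigma_1)} + \frac{\sigma_1^2}{2} \\[1ex]
\frac{\sigma_1}{4} + \mu_1^2\sigma_1 - \frac{\sigma_1}{4}\sqrt{16\mu_1^4 + (1-2\sigma_1)^2 + 8\mu_1^2(1+2\sigma_1)} + \frac{\sigma_1^2}{2} \\[1ex]
\frac{\sigma_2}{4} + \mu_2^2\sigma_2 + \frac{\sigma_2}{4}\sqrt{16\mu_2^4 + (1-2\sigma_2)^2 + 8\mu_2^2(1+2\sigma_2)} + \frac{\sigma_2^2}{2} \\[1ex]
\frac{\sigma_2}{4} + \mu_2^2\sigma_2 - \frac{\sigma_2}{4}\sqrt{16\mu_2^4 + (1-2\sigma_2)^2 + 8\mu_2^2(1+2\sigma_2)} + \frac{\sigma_2^2}{2}
\end{pmatrix}
\]
The α-scalar curvature of $N_1$ is constant:
\[
R^{(\alpha)} = 2\left(\alpha^2-1\right)
\tag{72}
\]
The α-sectional curvatures:
\[
\varrho^{(\alpha)} = \frac{\left(\alpha^2-1\right)}{2}
\begin{bmatrix}
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0
\end{bmatrix}
\tag{73}
\]
The α-mean curvatures:
\[
\varrho^{(\alpha)}(1) = \varrho^{(\alpha)}(2) = \varrho^{(\alpha)}(3) = \varrho^{(\alpha)}(4) = \frac{\alpha^2-1}{6}.
\tag{74}
\]
□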
Proposition 3.8. The submanifold $N_1$ is an Einstein space.

Proof. By comparison of the metric tensor (69) with the Ricci tensor (71), we see that
\[
R^{(0)}_{ij} = \frac{R^{(0)}}{k}\, g_{ij}, \qquad k = \dim(N_1).
\]
So the submanifold $N_1$ with statistically independent random variables is an Einstein space.
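The comparison used in the proof of Proposition 3.8 can be made explicit: with $k = 4$ and the scalar curvature (72), every entry of the Ricci tensor (71) should equal $R^{(\alpha)}/k$ times the matching entry of the metric (69). A small sketch with illustrative function names and an arbitrary test point:

```python
def n1_metric(m1, m2, s1, s2):
    """Metric tensor (69) on the independence submanifold N1."""
    return [
        [s1, 0, 2 * m1 * s1, 0],
        [0, s2, 0, 2 * m2 * s2],
        [2 * m1 * s1, 0, 2 * s1 * (2 * m1**2 + s1), 0],
        [0, 2 * m2 * s2, 0, 2 * s2 * (2 * m2**2 + s2)],
    ]

def n1_ricci(m1, m2, s1, s2, alpha):
    """alpha-Ricci tensor (71)."""
    k = alpha**2 - 1
    base = [
        [s1 / 2, 0, m1 * s1, 0],
        [0, s2 / 2, 0, m2 * s2],
        [m1 * s1, 0, s1 * (2 * m1**2 + s1), 0],
        [0, m2 * s2, 0, s2 * (2 * m2**2 + s2)],
    ]
    return [[k * x for x in row] for row in base]

def einstein_gap(m1, m2, s1, s2, alpha):
    """Largest entry of Ricci - (R / dim) * g, with R from (72) and dim = 4."""
    scalar = 2 * (alpha**2 - 1)
    g = n1_metric(m1, m2, s1, s2)
    ric = n1_ricci(m1, m2, s1, s2, alpha)
    return max(abs(ric[i][j] - scalar / 4 * g[i][j]) for i in range(4) for j in range(4))

gap = einstein_gap(0.5, -1.0, 2.0, 1.5, alpha=0.0)
```

The proportionality factor is $(\alpha^2-1)/2$, so the same check passes for any value of α, not only $\alpha = 0$.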
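3.5.2 Identical marginal Gaussian submanifold: $N_2 \subset N$: $\sigma_1 = \sigma_2 = \sigma$ and $\mu_1 = \mu_2 = \mu$

The distributions are of form:
\[
f(x, y; \mu, \sigma, \sigma_{12}) = \frac{1}{2\pi\sqrt{\sigma^2-\sigma_{12}^2}}\;
\exp\left(-\frac{1}{2\left(\sigma^2-\sigma_{12}^2\right)}\left(\sigma(x-\mu)^2 - 2\sigma_{12}(x-\mu)(y-\mu) + \sigma(y-\mu)^2\right)\right)
\tag{75}
\]
The marginal functions are $f_X = f_Y \equiv N(\mu, \sigma)$, with correlation coefficient $\rho(X, Y) = \frac{\sigma_{12}}{\sigma}$. We report the expressions for the metric, the α-connections and the α-curvature objects using the natural coordinate system
\[
(\theta_1, \theta_2, \theta_3) = \left(\frac{\mu}{\sigma+\sigma_{12}},\ \frac{-\sigma}{2\left(\sigma^2-\sigma_{12}^2\right)},\ \frac{\sigma_{12}}{\sigma^2-\sigma_{12}^2}\right)
\]
and the potential function
\[
\varphi = -\frac{\theta_1^2}{2\theta_2+\theta_3} + \log(2\pi) - \frac{1}{2}\log\left(4\theta_2^2-\theta_3^2\right).
\]
Proposition 3.9. The metric tensor $[g_{ij}]$ is as follows:
\[
\begin{bmatrix}
2(\sigma+\sigma_{12}) & 4\mu(\sigma+\sigma_{12}) & 2\mu(\sigma+\sigma_{12}) \\[0.5ex]
4\mu(\sigma+\sigma_{12}) & 4\left(\sigma\left(2\mu^2+\sigma\right) + 2\mu^2\sigma_{12} + \sigma_{12}^2\right) & 4\left(\mu^2\sigma + \left(\mu^2+\sigma\right)\sigma_{12}\right) \\[0.5ex]
2\mu(\sigma+\sigma_{12}) & 4\left(\mu^2\sigma + \left(\mu^2+\sigma\right)\sigma_{12}\right) & \sigma\left(2\mu^2+\sigma\right) + 2\mu^2\sigma_{12} + \sigma_{12}^2
\end{bmatrix}
\tag{76}
\]
□

Proposition 3.10. By direct calculation we have the α-curvature tensor of $N_2$
\[
R^{(\alpha)}_{12kl} = \left(\alpha^2-1\right)
\begin{bmatrix}
0 & -2(\sigma+\sigma_{12})^3 & -(\sigma+\sigma_{12})^3 \\
2(\sigma+\sigma_{12})^3 & 0 & 0 \\
(\sigma+\sigma_{12})^3 & 0 & 0
\end{bmatrix},
\qquad
R^{(\alpha)}_{13kl} = \left(\alpha^2-1\right)
\begin{bmatrix}
0 & -(\sigma+\sigma_{12})^3 & -\frac{(\sigma+\sigma_{12})^3}{2} \\
(\sigma+\sigma_{12})^3 & 0 & 0 \\
\frac{(\sigma+\sigma_{12})^3}{2} & 0 & 0
\end{bmatrix}
\tag{77}
\]
while the other independent components are zero.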
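By contraction we obtain:

The α-Ricci tensor:
\[
R^{(\alpha)} = \left(\alpha^2-1\right)
\begin{bmatrix}
(\sigma+\sigma_{12}) & 2\mu(\sigma+\sigma_{12}) & \mu(\sigma+\sigma_{12}) \\[0.5ex]
2\mu(\sigma+\sigma_{12}) & (\sigma+\sigma_{12})\left(4\mu^2+\sigma+\sigma_{12}\right) & \frac{(\sigma+\sigma_{12})\left(4\mu^2+\sigma+\sigma_{12}\right)}{2} \\[0.5ex]
\mu(\sigma+\sigma_{12}) & \frac{(\sigma+\sigma_{12})\left(4\mu^2+\sigma+\sigma_{12}\right)}{2} & \frac{(\sigma+\sigma_{12})\left(4\mu^2+\sigma+\sigma_{12}\right)}{4}
\end{bmatrix},
\tag{78}
\]
The α-eigenvalues of the α-Ricci tensor are given by:
\[
\left(\alpha^2-1\right)
\begin{pmatrix}
0 \\[1ex]
\frac{(\sigma+\sigma_{12})}{8}\left(4\left(1+5\mu^2\right) + 5(\sigma+\sigma_{12}) - \sqrt{D}\right) \\[1ex]
\frac{(\sigma+\sigma_{12})}{8}\left(4\left(1+5\mu^2\right) + 5(\sigma+\sigma_{12}) + \sqrt{D}\right)
\end{pmatrix},
\qquad
D = 400\mu^4 + (4-5\sigma)^2 + 40\mu^2(4+5\sigma) + 5\sigma_{12}\left(-8+40\mu^2+10\sigma+5\sigma_{12}\right)
\tag{79}
\]
The α-scalar curvature:
\[
R^{(\alpha)} = \alpha^2-1
\tag{80}
\]
The α-sectional curvatures:
\[
\varrho^{(\alpha)} = \left(\alpha^2-1\right)
\begin{bmatrix}
0 & \frac{(\sigma+\sigma_{12})^2}{4\left(\sigma^2+\sigma_{12}^2\right)} & \frac{(\sigma+\sigma_{12})^2}{4\left(\sigma^2+\sigma_{12}^2\right)} \\[1ex]
\frac{(\sigma+\sigma_{12})^2}{4\left(\sigma^2+\sigma_{12}^2\right)} & 0 & 0 \\[1ex]
\frac{(\sigma+\sigma_{12})^2}{4\left(\sigma^2+\sigma_{12}^2\right)} & 0 & 0
\end{bmatrix}
\tag{81}
\]
The α-mean curvatures:
\[
\varrho^{(\alpha)}(1) = \frac{1}{4}\left(\alpha^2-1\right), \qquad
\varrho^{(\alpha)}(2) = \varrho^{(\alpha)}(3) = \frac{\left(\alpha^2-1\right)(\sigma+\sigma_{12})\left(4\mu^2+\sigma+\sigma_{12}\right)}{8\left(\sigma\left(2\mu^2+\sigma\right) + 2\mu^2\sigma_{12} + \sigma_{12}^2\right)}.
\tag{82}
\]
□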
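3.5.3 Central mean submanifold: $N_3 \subset N$: $\mu_1 = \mu_2 = 0$

The distributions are of form:
\[
f(x, y; \sigma_1, \sigma_2, \sigma_{12}) = \frac{1}{2\pi\sqrt{\sigma_1\sigma_2-\sigma_{12}^2}}\;
\exp\left(-\frac{1}{2\left(\sigma_1\sigma_2-\sigma_{12}^2\right)}\left(\sigma_2 x^2 - 2\sigma_{12}\,xy + \sigma_1 y^2\right)\right)
\tag{83}
\]
The marginal functions are $f_X(x, 0, \sigma_1)$ and $f_Y(y, 0, \sigma_2)$, with correlation coefficient $\rho(X, Y) = \frac{\sigma_{12}}{\sqrt{\sigma_1\sigma_2}}$. We report the metric and the α-curvature objects using the natural coordinate system
\[
(\theta_1, \theta_2, \theta_3) = \left(-\frac{\sigma_2}{2\left(\sigma_1\sigma_2-\sigma_{12}^2\right)},\ \frac{\sigma_{12}}{\sigma_1\sigma_2-\sigma_{12}^2},\ -\frac{\sigma_1}{2\left(\sigma_1\sigma_2-\sigma_{12}^2\right)}\right)
\]
and the potential function
\[
\varphi = \log(2\pi) - \frac{1}{2}\log\left(4\theta_1\theta_3-\theta_2^2\right).
\]
Proposition 3.11. The metric tensor is as follows:
\[
[g_{ij}] =
\begin{bmatrix}
2\sigma_1^2 & 2\sigma_1\sigma_{12} & 2\sigma_{12}^2 \\
2\sigma_1\sigma_{12} & \sigma_1\sigma_2+\sigma_{12}^2 & 2\sigma_2\sigma_{12} \\
2\sigma_{12}^2 & 2\sigma_2\sigma_{12} & 2\sigma_2^2
\end{bmatrix}.
\]
□

Proposition 3.12. By direct calculation we have the nonzero independent components of the α-curvature tensor of $N_3$
\[
\begin{aligned}
R^{(\alpha)}_{12kl} &= \left(\alpha^2-1\right)
\begin{bmatrix}
0 & -\sigma_1^2\Delta & -2\sigma_1\sigma_{12}\Delta \\
\sigma_1^2\Delta & 0 & -\sigma_1\sigma_2\Delta \\
2\sigma_1\sigma_{12}\Delta & \sigma_1\sigma_2\Delta & 0
\end{bmatrix}, \\[1ex]
R^{(\alpha)}_{13kl} &= \left(\alpha^2-1\right)
\begin{bmatrix}
0 & -2\sigma_1\sigma_{12}\Delta & -4\sigma_{12}^2\Delta \\
2\sigma_1\sigma_{12}\Delta & 0 & -2\sigma_2\sigma_{12}\Delta \\
4\sigma_{12}^2\Delta & 2\sigma_2\sigma_{12}\Delta & 0
\end{bmatrix}, \\[1ex]
R^{(\alpha)}_{23kl} &= \left(\alpha^2-1\right)
\begin{bmatrix}
0 & -\sigma_1\sigma_2\Delta & -2\sigma_2\sigma_{12}\Delta \\
\sigma_1\sigma_2\Delta & 0 & -\sigma_2^2\Delta \\
2\sigma_2\sigma_{12}\Delta & \sigma_2^2\Delta & 0
\end{bmatrix}.
\end{aligned}
\tag{84}
\]
By contraction we obtain:

The α-Ricci tensor:
\[
R^{(\alpha)} = \left(\alpha^2-1\right)
\begin{bmatrix}
\sigma_1^2 & \sigma_1\sigma_{12} & 2\sigma_{12}^2-\sigma_1\sigma_2 \\
\sigma_1\sigma_{12} & \sigma_1\sigma_2 & \sigma_2\sigma_{12} \\
2\sigma_{12}^2-\sigma_1\sigma_2 & \sigma_2\sigma_{12} & \sigma_2^2
\end{bmatrix},
\tag{85}
\]
The α-eigenvalues of the α-Ricci tensor are given by:
\[
\frac{\left(\alpha^2-1\right)}{2}
\begin{pmatrix}
0 \\[0.5ex]
\sigma_1^2+\sigma_1\sigma_2+\sigma_2^2 - \sqrt{\left(\sigma_1^2-\sigma_1\sigma_2+\sigma_2^2\right)^2 + 4\left(\sigma_1^2-4\sigma_1\sigma_2+\sigma_2^2\right)\sigma_{12}^2 + 16\sigma_{12}^4} \\[0.5ex]
\sigma_1^2+\sigma_1\sigma_2+\sigma_2^2 + \sqrt{\left(\sigma_1^2-\sigma_1\sigma_2+\sigma_2^2\right)^2 + 4\left(\sigma_1^2-4\sigma_1\sigma_2+\sigma_2^2\right)\sigma_{12}^2 + 16\sigma_{12}^4}
\end{pmatrix}
\]
The α-scalar curvature:
\[
R^{(\alpha)} = 2\left(\alpha^2-1\right)
\tag{86}
\]
The α-sectional curvatures:
\[
\varrho^{(\alpha)} = \left(\alpha^2-1\right)
\begin{bmatrix}
0 & \frac{1}{2} & \frac{\rho^2}{1+\rho^2} \\[0.5ex]
\frac{1}{2} & 0 & \frac{1}{2} \\[0.5ex]
\frac{\rho^2}{1+\rho^2} & \frac{1}{2} & 0
\end{bmatrix}
\tag{87}
\]
The α-mean curvatures:
\[
\varrho^{(\alpha)}(1) = \varrho^{(\alpha)}(3) = \frac{1}{4}\left(\alpha^2-1\right), \qquad
\varrho^{(\alpha)}(2) = \frac{\left(\alpha^2-1\right)\sigma_1\sigma_2}{2\left(\sigma_1\sigma_2+\sigma_{12}^2\right)} = \frac{\left(\alpha^2-1\right)}{2\left(1+\rho^2\right)}.
\tag{88}
\]
For $N_3$ the α-mean curvatures have limiting value $\frac{\left(\alpha^2-1\right)}{4}$ as $\rho^2 \to 1$. □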
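3.6 Affine immersion

Proposition 3.13. Let N be the bivariate Gaussian manifold with the Fisher metric g and the exponential connection $\nabla^{(1)}$. Denote by $(\theta_i)$ the natural coordinate system (58). Then N can be realized in $\mathbb{R}^6$ by the graph of a potential function, via the affine immersion $\{f, \xi\}$:
\[
f : \Theta \to \mathbb{R}^6 : (\theta_i) \mapsto
\begin{bmatrix}
\theta_i \\
\varphi(\theta)
\end{bmatrix},
\tag{89}
\]
where $\varphi(\theta)$ is the potential function $\varphi(\theta) = \log\left(2\pi\sqrt{\Delta}\right) - \Delta\left(\theta_2^2\theta_3 - \theta_1\theta_2\theta_4 + \theta_1^2\theta_5\right)$.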
Fig. 4 Affine immersion in natural coordinates $(\theta_1, \theta_2) = \left(\frac{-\sigma}{2\Delta}, \frac{\sigma_{12}}{\Delta}\right)$ as a surface in $\mathbb{R}^3$ for the bivariate Gaussian distributions with zero means and common standard deviation σ. The tubular neighbourhood surrounds the curve $\sigma_{12} = 0$ in the surface; this curve represents bivariate distributions having common Gaussian marginals and zero covariance; its tubular neighbourhoods contain by continuity all sufficiently small departures from independence.

Fig. 5 Continuous image of the affine immersion in Figure 4 as a surface in $\mathbb{R}^3$ using standard coordinates $(\sigma, \sigma_{12})$ for the bivariate Gaussian distributions with zero means and common standard deviation σ; the tubular neighbourhood surrounds the curve $\sigma_{12} = 0$ in the surface.
3.7 Neighbourhoods of independence

The case of bivariate Gaussian distributions with zero means ($\mu_1 = \mu_2 = 0$) and common standard deviation $\sigma_1 = \sigma_2 = \sigma$ is represented by the surface in $\mathbb{R}^3$:
\[
\mathbb{R}^- \times \mathbb{R} \to \mathbb{R}^3 : (\theta_1, \theta_2) \mapsto (\theta_1, \theta_2, \varphi(\theta)),
\]
where $(\theta_1, \theta_2) = \left(\frac{-\sigma}{2\Delta}, \frac{\sigma_{12}}{\Delta}\right)$; $\Delta = \sigma^2-\sigma_{12}^2$ and $\varphi(\theta) = \log\left(2\pi\sqrt{\Delta}\right)$. So the submanifold consisting of the independent case with zero means and common standard deviations is represented by the curve:
\[
(-\infty, 0) \to \mathbb{R}^3 : (\theta_1) \mapsto \left(\theta_1, 0, \log(-4\pi\Delta\,\theta_1)\right), \quad \text{that is,} \quad \left(-\frac{1}{2\sigma}\right) \mapsto \left(-\frac{1}{2\sigma}, 0, \log(2\pi\sigma)\right).
\]
Proposition 3.14. In the affine immersion as a surface in $\mathbb{R}^3$ for the bivariate Gaussian distributions with zero means and common standard deviation σ, tubular neighbourhoods of the curve of zero covariance will contain by continuity all immersions of bivariate Gaussian processes sufficiently close to the independence case.

Corollary 3.15. Via the Central Limit Theorem, the tubular neighbourhoods of the curve of zero covariance will contain all immersions of limiting bivariate processes sufficiently close to the independence case for all processes with marginals that converge in distribution to Gaussians. □

The figures show an affine embedding of the bivariate Gaussian with zero means ($\mu_1 = \mu_2 = 0$) and common standard deviation σ as a surface in $\mathbb{R}^3$, and an $\mathbb{R}^3$-tubular neighbourhood of the curve $\sigma_{12} = 0$ in the surface. This curve represents bivariate distributions having common Gaussian marginals and zero covariance; its tubular neighbourhoods represent departures from independence. In Figure 4 this is depicted in natural coordinates $\left(\frac{-\sigma}{2\Delta}, \frac{\sigma_{12}}{\Delta}\right)$ and in Figure 5 the corresponding surface and tubular neighbourhood (not here an affine immersion, just a continuous image) is shown in the usual $(\sigma, \sigma_{12})$ coordinates of the bivariate Gaussian family, with zero means and common standard deviation σ.
Acknowledgment The authors wish to thank the Libyan Ministry of Education for a scholarship for Arwini. Thanks are due also to a referee for suggesting a number of improvements in the presentation of the results.
References

[1] S.-I. Amari, O.E. Barndorff-Nielsen, R.E. Kass, S.L. Lauritzen and C.R. Rao: Differential Geometrical Methods in Statistics, Springer Lecture Notes in Statistics 28, Springer-Verlag, Berlin, 1985.
[2] S-I. Amari and H. Nagaoka: Methods of Information Geometry, American Mathematical Society, Oxford University Press, 2000.
[3] Khadiga Arwini: Differential geometry in neighbourhoods of randomness and independence, PhD thesis, UMIST, 2004.
[4] Khadiga Arwini and C.T.J. Dodson: “Information geometric neighbourhoods of randomness and geometry of the McKay bivariate gamma 3-manifold”, Sankhya: Indian Journal of Statistics, Vol. 66(2), (2004), pp. 211–231.
[5] Khadiga Arwini and C.T.J. Dodson: “Neighbourhoods of independence for random processes via information geometry”, Math. J., Vol. 9(4), (2005).
[6] Khadiga Arwini, L. Del Riego and C.T.J. Dodson: “Universal connection and curvature for statistical manifold geometry”, Houston J. Math., in press, (2006).
[7] Y. Cai, C.T.J. Dodson, O. Wolkenhauer and A.J. Doig: “Gamma Distribution Analysis of Protein Sequences shows that Amino Acids Self Cluster”, J. Theoretical Biology, Vol. 218(4), (2002), pp. 409–418.
[8] C.T.J. Dodson: “Spatial statistics and information geometry for parametric statistical models of galaxy clustering”, Int. J. Theor. Phys., Vol. 38(10), (1999), pp. 2585–2597.
[9] C.T.J. Dodson: “Geometry for stochastically inhomogeneous spacetimes”, Nonlinear Analysis, Vol. 47, (2001), pp. 2951–2958.
[10] C.T.J. Dodson and Hiroshi Matsuzoe: “An affine embedding of the gamma manifold”, Appl. Sci., Vol. 5(1), (2003), pp. 1–6.
[11] R.J. Freund: “A bivariate extension of the exponential distribution”, J. Am. Stat. Assoc., Vol. 56, (1961), pp. 971–977.
[12] T.P. Hutchinson and C.D. Lai: Continuous Multivariate Distributions, Emphasising Applications, Rumsby Scientific Publishing, Adelaide 1990.
[13] S. Kotz, N. Balakrishnan and N. Johnson: Continuous Multivariate Distributions, Volume 1, 2nd ed., John Wiley, New York, 2000.
[14] S. Leurgans, T.W.-Y. Tsai and J. Crowley: “Freund’s bivariate exponential distribution and censoring”, In: R.A. Johnson (Ed.): Survival Analysis, IMS Lecture Notes, Hayward, California, Institute of Mathematical Statistics, 1982.
[15] A.F.S. Mitchell: “The information matrix, skewness tensor and α-connections for the general multivariate elliptic distribution”, Ann. Ins. Stat. Math., Vol. 41, (1989), pp. 289–304.
[16] Y. Sato, K. Sugawa and M. Kawaguchi: The geometrical structure of the parameter space of the two-dimensional normal distribution, Division of information engineering, Hokkaido University, Sapporo, Japan, 1977.
[17] L.T. Skovgaard: “A Riemannian geometry of the multivariate normal model”, Scandinavian J. Stat., Vol. 11, (1984), pp. 211–223.
Appendix A
Freund 4-manifold F
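A.1 α-connection and α-curvature
Proposition A.1. The nonzero independent components $\Gamma^{(\alpha)}_{ij,k}$ in $(\alpha_1, \alpha_2, \beta_1, \beta_2)$ coordinates are
\[
\begin{aligned}
\Gamma^{(\alpha)}_{11,1} &= \frac{2(\alpha-1)\alpha_1 - (1+\alpha)\alpha_2}{2\alpha_1^2(\alpha_1+\alpha_2)^2}, &
\Gamma^{(\alpha)}_{11,2} &= \frac{1+\alpha}{2\alpha_1(\alpha_1+\alpha_2)^2}, \\
\Gamma^{(\alpha)}_{13,3} &= \frac{(\alpha-1)\alpha_2}{2(\alpha_1+\alpha_2)^2\beta_1^2}, &
\Gamma^{(\alpha)}_{12,2} &= \frac{\alpha-1}{2\alpha_2(\alpha_1+\alpha_2)^2}, \\
\Gamma^{(\alpha)}_{14,4} &= \frac{-(\alpha-1)\alpha_2}{2(\alpha_1+\alpha_2)^2\beta_2^2}, &
\Gamma^{(\alpha)}_{33,3} &= \frac{(\alpha-1)\alpha_2}{(\alpha_1+\alpha_2)\beta_1^3}, \\
\Gamma^{(\alpha)}_{33,2} &= \frac{-(1+\alpha)\alpha_1}{2(\alpha_1+\alpha_2)^2\beta_1^2}, &
\Gamma^{(\alpha)}_{22,2} &= \frac{-(1+\alpha)\alpha_1 + 2(\alpha-1)\alpha_2}{2\alpha_2^2(\alpha_1+\alpha_2)^2}, \\
\Gamma^{(\alpha)}_{24,4} &= \frac{(\alpha-1)\alpha_1}{2(\alpha_1+\alpha_2)^2\beta_2^2}, &
\Gamma^{(\alpha)}_{44,4} &= \frac{(\alpha-1)\alpha_1}{(\alpha_1+\alpha_2)\beta_2^3}.
\end{aligned}
\tag{A.1}
\]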
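Proposition A.2. The nonzero components $\Gamma^{i(\alpha)}_{jk}$ of the $\nabla^{(\alpha)}$-connections are given by:
\[
\begin{aligned}
\Gamma^{(\alpha)1}_{11} &= -\frac{1+\alpha}{2\alpha_1} + \frac{-1+3\alpha}{2(\alpha_1+\alpha_2)}, \\
\Gamma^{(\alpha)1}_{12} &= \Gamma^{(\alpha)3}_{13} = \Gamma^{(\alpha)2}_{12} = \Gamma^{(\alpha)4}_{24} = \frac{\alpha-1}{2(\alpha_1+\alpha_2)}, \\
\Gamma^{(\alpha)1}_{33} &= -\Gamma^{(\alpha)2}_{33} = \frac{(1+\alpha)\alpha_1\alpha_2}{2(\alpha_1+\alpha_2)\beta_1^2}, \\
\Gamma^{(\alpha)1}_{22} &= \frac{(1+\alpha)\alpha_1}{2\alpha_2(\alpha_1+\alpha_2)}, \qquad \Gamma^{(\alpha)3}_{32} = \text{(value not recoverable from the source text)}, \\
\Gamma^{(\alpha)1}_{44} &= -\Gamma^{(\alpha)2}_{44} = \frac{-(1+\alpha)\alpha_1\alpha_2}{2(\alpha_1+\alpha_2)\beta_2^2}, \\
\Gamma^{(\alpha)2}_{11} &= \frac{(1+\alpha)\alpha_2}{2\alpha_1(\alpha_1+\alpha_2)}, \qquad \Gamma^{(\alpha)4}_{14} = \text{(value not recoverable from the source text)}, \\
\Gamma^{(\alpha)2}_{22} &= -\frac{1+\alpha}{2\alpha_2} + \frac{-1+3\alpha}{2(\alpha_1+\alpha_2)}, \\
\Gamma^{(\alpha)4}_{44} &= \frac{\alpha-1}{\beta_2}.
\end{aligned}
\tag{A.2}
\]
B Bivariate Gaussian 5-manifold N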
We use coordinates (μ1 , μ2 , σ1 , σ12 , σ2 ).
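B.1 α-connection and α-curvature
Proposition B.1. The functions $\Gamma^{(\alpha)}_{ij,k}$ are given by:
\[
[\Gamma^{(\alpha)}_{ij,1}] = \frac{1}{2\Delta^2}\begin{pmatrix} 0 & A \\ A^T & 0 \end{pmatrix},
\quad\text{where}\quad
A = \begin{pmatrix}
-(1+\alpha)\sigma_2^2 & 2(1+\alpha)\sigma_2\sigma_{12} & -(1+\alpha)\sigma_{12}^2 \\
(1+\alpha)\sigma_2\sigma_{12} & -(1+\alpha)(\sigma_1\sigma_2+\sigma_{12}^2) & (1+\alpha)\sigma_1\sigma_{12}
\end{pmatrix},
\]
\[
[\Gamma^{(\alpha)}_{ij,2}] = \frac{1}{2\Delta^2}\begin{pmatrix} 0 & B \\ B^T & 0 \end{pmatrix},
\quad\text{where}\quad
B = \begin{pmatrix}
(1+\alpha)\sigma_2\sigma_{12} & -(1+\alpha)(\sigma_1\sigma_2+\sigma_{12}^2) & (1+\alpha)\sigma_1\sigma_{12} \\
-(1+\alpha)\sigma_{12}^2 & 2(1+\alpha)\sigma_1\sigma_{12} & -(1+\alpha)\sigma_1^2
\end{pmatrix},
\]
\[
[\Gamma^{(\alpha)}_{ij,3}] =
\begin{pmatrix}
\frac{-(\alpha-1)\sigma_2^2}{2\Delta^2} & \frac{(\alpha-1)\sigma_2\sigma_{12}}{2\Delta^2} & 0 & 0 & 0 \\
\frac{(\alpha-1)\sigma_2\sigma_{12}}{2\Delta^2} & \frac{-(\alpha-1)\sigma_{12}^2}{2\Delta^2} & 0 & 0 & 0 \\
0 & 0 & \frac{(1+\alpha)\sigma_2^3}{-2\Delta^3} & \frac{(1+\alpha)\sigma_2^2\sigma_{12}}{\Delta^3} & \frac{(1+\alpha)\sigma_2\sigma_{12}^2}{-2\Delta^3} \\
0 & 0 & \frac{(1+\alpha)\sigma_2^2\sigma_{12}}{\Delta^3} & \frac{-(1+\alpha)\sigma_2(\sigma_1\sigma_2+3\sigma_{12}^2)}{2\Delta^3} & \frac{(1+\alpha)\sigma_{12}(\sigma_1\sigma_2+\sigma_{12}^2)}{2\Delta^3} \\
0 & 0 & \frac{(1+\alpha)\sigma_2\sigma_{12}^2}{-2\Delta^3} & \frac{(1+\alpha)\sigma_{12}(\sigma_1\sigma_2+\sigma_{12}^2)}{2\Delta^3} & \frac{(1+\alpha)\sigma_1\sigma_{12}^2}{-2\Delta^3}
\end{pmatrix},
\]
\[
[\Gamma^{(\alpha)}_{ij,4}] = (\alpha+1)
\begin{pmatrix}
\frac{(\alpha-1)\sigma_2\sigma_{12}}{(\alpha+1)\Delta^2} & \frac{(\alpha-1)(\sigma_1\sigma_2+\sigma_{12}^2)}{-2(\alpha+1)\Delta^2} & 0 & 0 & 0 \\
\frac{(\alpha-1)(\sigma_1\sigma_2+\sigma_{12}^2)}{-2(\alpha+1)\Delta^2} & \frac{(\alpha-1)\sigma_1\sigma_{12}}{(\alpha+1)\Delta^2} & 0 & 0 & 0 \\
0 & 0 & \frac{\sigma_2^2\sigma_{12}}{\Delta^3} & \frac{\sigma_2(\sigma_1\sigma_2+3\sigma_{12}^2)}{-2\Delta^3} & \frac{\sigma_{12}(\sigma_1\sigma_2+\sigma_{12}^2)}{2\Delta^3} \\
0 & 0 & \frac{\sigma_2(\sigma_1\sigma_2+3\sigma_{12}^2)}{-2\Delta^3} & \frac{\sigma_{12}(3\sigma_1\sigma_2+\sigma_{12}^2)}{\Delta^3} & \frac{\sigma_1(\sigma_1\sigma_2+3\sigma_{12}^2)}{-2\Delta^3} \\
0 & 0 & \frac{\sigma_{12}(\sigma_1\sigma_2+\sigma_{12}^2)}{2\Delta^3} & \frac{\sigma_1(\sigma_1\sigma_2+3\sigma_{12}^2)}{-2\Delta^3} & \frac{\sigma_1^2\sigma_{12}}{\Delta^3}
\end{pmatrix},
\]
\[
[\Gamma^{(\alpha)}_{ij,5}] =
\begin{pmatrix}
\frac{-(\alpha-1)\sigma_{12}^2}{2\Delta^2} & \frac{(\alpha-1)\sigma_1\sigma_{12}}{2\Delta^2} & 0 & 0 & 0 \\
\frac{(\alpha-1)\sigma_1\sigma_{12}}{2\Delta^2} & \frac{-(\alpha-1)\sigma_1^2}{2\Delta^2} & 0 & 0 & 0 \\
0 & 0 & \frac{(1+\alpha)\sigma_2\sigma_{12}^2}{-2\Delta^3} & \frac{(1+\alpha)\sigma_{12}(\sigma_1\sigma_2+\sigma_{12}^2)}{2\Delta^3} & \frac{(1+\alpha)\sigma_1\sigma_{12}^2}{-2\Delta^3} \\
0 & 0 & \frac{(1+\alpha)\sigma_{12}(\sigma_1\sigma_2+\sigma_{12}^2)}{2\Delta^3} & \frac{-(1+\alpha)\sigma_1(\sigma_1\sigma_2+3\sigma_{12}^2)}{2\Delta^3} & \frac{(1+\alpha)\sigma_1^2\sigma_{12}}{\Delta^3} \\
0 & 0 & \frac{(1+\alpha)\sigma_1\sigma_{12}^2}{-2\Delta^3} & \frac{(1+\alpha)\sigma_1^2\sigma_{12}}{\Delta^3} & \frac{(1+\alpha)\sigma_1^3}{-2\Delta^3}
\end{pmatrix}.
\]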
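We have an affine connection $\nabla^{(\alpha)}$ defined by:
\[
\langle \nabla^{(\alpha)}_{\partial_i}\partial_j, \partial_k \rangle = \Gamma^{(\alpha)}_{ij,k}.
\]
So by solving the equations
\[
\Gamma^{(\alpha)}_{ij,k} = \sum_{h=1}^{5} g_{kh}\,\Gamma^{h(\alpha)}_{ij}, \qquad (k = 1, 2, 3, 4, 5),
\]
we obtain the components of $\nabla^{(\alpha)}$: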
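Proposition B.2. The components $\Gamma^{(\alpha)i}_{jk}$ of the $\nabla^{(\alpha)}$-connections are given by:
\[
\Gamma^{(\alpha)1} = [\Gamma^{(\alpha)1}_{ij}] = \frac{(1+\alpha)}{2\Delta}
\begin{pmatrix}
0 & 0 & -\sigma_2 & \sigma_{12} & 0 \\
0 & 0 & \sigma_{12} & -\sigma_1 & 0 \\
-\sigma_2 & \sigma_{12} & 0 & 0 & 0 \\
\sigma_{12} & -\sigma_1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0
\end{pmatrix},
\]
\[
\Gamma^{(\alpha)2} = [\Gamma^{(\alpha)2}_{ij}] = \frac{(1+\alpha)}{2\Delta}
\begin{pmatrix}
0 & 0 & 0 & -\sigma_2 & \sigma_{12} \\
0 & 0 & 0 & \sigma_{12} & -\sigma_1 \\
0 & 0 & 0 & 0 & 0 \\
-\sigma_2 & \sigma_{12} & 0 & 0 & 0 \\
\sigma_{12} & -\sigma_1 & 0 & 0 & 0
\end{pmatrix},
\]
\[
\Gamma^{(\alpha)3} = [\Gamma^{(\alpha)3}_{ij}] = \frac{1}{\Delta}
\begin{pmatrix}
\Delta(1-\alpha) & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 \\
0 & 0 & -(1+\alpha)\sigma_2 & (1+\alpha)\sigma_{12} & 0 \\
0 & 0 & (1+\alpha)\sigma_{12} & -(1+\alpha)\sigma_1 & 0 \\
0 & 0 & 0 & 0 & 0
\end{pmatrix},
\]
\[
\Gamma^{(\alpha)4} = [\Gamma^{(\alpha)4}_{ij}] = \frac{1}{2\Delta}
\begin{pmatrix}
0 & \Delta(1-\alpha) & 0 & 0 & 0 \\
\Delta(1-\alpha) & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & -(1+\alpha)\sigma_2 & (1+\alpha)\sigma_{12} \\
0 & 0 & -(1+\alpha)\sigma_2 & 2(1+\alpha)\sigma_{12} & -(1+\alpha)\sigma_1 \\
0 & 0 & (1+\alpha)\sigma_{12} & -(1+\alpha)\sigma_1 & 0
\end{pmatrix},
\]
\[
\Gamma^{(\alpha)5} = [\Gamma^{(\alpha)5}_{ij}] = \frac{1}{\Delta}
\begin{pmatrix}
0 & 0 & 0 & 0 & 0 \\
0 & \Delta(1-\alpha) & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & -(1+\alpha)\sigma_2 & (1+\alpha)\sigma_{12} \\
0 & 0 & 0 & (1+\alpha)\sigma_{12} & -(1+\alpha)\sigma_1
\end{pmatrix}.
\tag{B.1}
\]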
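The index raising that leads from Proposition B.1 to Proposition B.2 can be checked with exact arithmetic for the two mean directions. The sketch below is only an illustration: it assumes that the inverse Fisher metric on the $(\mu_1, \mu_2)$ block is the covariance matrix with entries $\sigma_1, \sigma_{12}, \sigma_2$ and that $\Delta = \sigma_1\sigma_2 - \sigma_{12}^2$, neither of which is restated in this appendix, and the function names are hypothetical.

```python
from fractions import Fraction as F

def lowered_blocks(alpha, s1, s12, s2):
    """Blocks A and B of Proposition B.1, with the 1/(2 Delta^2) prefactor included."""
    delta = s1 * s2 - s12 ** 2          # assumed definition of Delta
    c = (1 + alpha) / (2 * delta ** 2)
    A = [[-c * s2 ** 2, 2 * c * s2 * s12, -c * s12 ** 2],
         [c * s2 * s12, -c * (s1 * s2 + s12 ** 2), c * s1 * s12]]
    B = [[c * s2 * s12, -c * (s1 * s2 + s12 ** 2), c * s1 * s12],
         [-c * s12 ** 2, 2 * c * s1 * s12, -c * s1 ** 2]]
    return A, B

def raise_mean_index(alpha, s1, s12, s2):
    """Gamma^1_{ij} and Gamma^2_{ij} for i in {1, 2}, j in {3, 4, 5}.

    Assumption: the inverse metric on the mean block is [[s1, s12], [s12, s2]].
    """
    A, B = lowered_blocks(alpha, s1, s12, s2)
    up1 = [[s1 * A[i][j] + s12 * B[i][j] for j in range(3)] for i in range(2)]
    up2 = [[s12 * A[i][j] + s2 * B[i][j] for j in range(3)] for i in range(2)]
    return up1, up2

alpha, s1, s12, s2 = F(1, 3), F(2), F(1), F(3)
up1, up2 = raise_mean_index(alpha, s1, s12, s2)
```

With these sample values the two blocks agree with the off-diagonal blocks of $\Gamma^{(\alpha)1}$ and $\Gamma^{(\alpha)2}$ in (B.1), including the common factor $(1+\alpha)/(2\Delta)$.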
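B.1.1 α-curvature
Proposition B.3. The components $R^{(\alpha)}_{ijkl}$ of the α-curvature tensor are given by:
\[
[R^{(\alpha)}_{12kl}] = \frac{(\alpha^2-1)}{4\Delta^2}
\begin{pmatrix}
0 & \Delta & 0 & 0 & 0 \\
-\Delta & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & -\sigma_2 & \sigma_{12} \\
0 & 0 & \sigma_2 & 0 & -\sigma_1 \\
0 & 0 & -\sigma_{12} & \sigma_1 & 0
\end{pmatrix},
\]
\[
[R^{(\alpha)}_{13kl}] = \frac{(\alpha^2-1)}{4\Delta^3}\begin{pmatrix} 0 & C \\ -C^T & 0 \end{pmatrix},
\quad\text{where}\quad
C = \begin{pmatrix}
-\sigma_2^3 & 2\sigma_2^2\sigma_{12} & -\sigma_2\sigma_{12}^2 \\
\sigma_2^2\sigma_{12} & -\sigma_2(\sigma_1\sigma_2+\sigma_{12}^2) & \sigma_1\sigma_2\sigma_{12}
\end{pmatrix},
\]
\[
[R^{(\alpha)}_{14kl}] = \frac{(\alpha^2-1)}{4\Delta^3}\begin{pmatrix} 0 & D \\ -D^T & 0 \end{pmatrix},
\quad\text{where}\quad
D = \begin{pmatrix}
2\sigma_2^2\sigma_{12} & -\sigma_2(\sigma_1\sigma_2+3\sigma_{12}^2) & \sigma_{12}(\sigma_1\sigma_2+\sigma_{12}^2) \\
-2\sigma_2\sigma_{12}^2 & \sigma_{12}(3\sigma_1\sigma_2+\sigma_{12}^2) & -\sigma_1(\sigma_1\sigma_2+\sigma_{12}^2)
\end{pmatrix},
\]
\[
[R^{(\alpha)}_{15kl}] = \frac{(\alpha^2-1)}{4\Delta^3}\begin{pmatrix} 0 & E \\ -E^T & 0 \end{pmatrix},
\quad\text{where}\quad
E = \begin{pmatrix}
-\sigma_2\sigma_{12}^2 & \sigma_{12}(\sigma_1\sigma_2+\sigma_{12}^2) & -\sigma_1\sigma_{12}^2 \\
\sigma_{12}^3 & -2\sigma_1\sigma_{12}^2 & \sigma_1^2\sigma_{12}
\end{pmatrix},
\]
\[
[R^{(\alpha)}_{23kl}] = \frac{(\alpha^2-1)}{4\Delta^3}\begin{pmatrix} 0 & H \\ -H^T & 0 \end{pmatrix},
\quad\text{where}\quad
H = \begin{pmatrix}
\sigma_2^2\sigma_{12} & -2\sigma_2\sigma_{12}^2 & \sigma_{12}^3 \\
-\sigma_2\sigma_{12}^2 & \sigma_{12}(\sigma_1\sigma_2+\sigma_{12}^2) & -\sigma_1\sigma_{12}^2
\end{pmatrix},
\]
\[
[R^{(\alpha)}_{24kl}] = \frac{(\alpha^2-1)}{4\Delta^3}\begin{pmatrix} 0 & J \\ -J^T & 0 \end{pmatrix},
\quad\text{where}\quad
J = \begin{pmatrix}
-\sigma_2(\sigma_1\sigma_2+\sigma_{12}^2) & \sigma_{12}(3\sigma_1\sigma_2+\sigma_{12}^2) & -2\sigma_1\sigma_{12}^2 \\
\sigma_{12}(\sigma_1\sigma_2+\sigma_{12}^2) & -\sigma_1(\sigma_1\sigma_2+3\sigma_{12}^2) & 2\sigma_1^2\sigma_{12}
\end{pmatrix},
\]
\[
[R^{(\alpha)}_{25kl}] = \frac{(\alpha^2-1)}{4\Delta^3}\begin{pmatrix} 0 & K \\ -K^T & 0 \end{pmatrix},
\quad\text{where}\quad
K = \begin{pmatrix}
\sigma_1\sigma_2\sigma_{12} & -\sigma_1(\sigma_1\sigma_2+\sigma_{12}^2) & \sigma_1^2\sigma_{12} \\
-\sigma_1\sigma_{12}^2 & 2\sigma_1^2\sigma_{12} & -\sigma_1^3
\end{pmatrix},
\]
\[
[R^{(\alpha)}_{34kl}] = \frac{(\alpha^2-1)}{4\Delta^3}
\begin{pmatrix}
0 & -\sigma_2\Delta & 0 & 0 & 0 \\
\sigma_2\Delta & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & -\sigma_2^2 & \sigma_2\sigma_{12} \\
0 & 0 & \sigma_2^2 & 0 & -\sigma_1\sigma_2 \\
0 & 0 & -\sigma_2\sigma_{12} & \sigma_1\sigma_2 & 0
\end{pmatrix},
\]
\[
[R^{(\alpha)}_{35kl}] = \frac{(\alpha^2-1)}{4\Delta^3}
\begin{pmatrix}
0 & \sigma_{12}\Delta & 0 & 0 & 0 \\
-\sigma_{12}\Delta & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & \sigma_2\sigma_{12} & -\sigma_{12}^2 \\
0 & 0 & -\sigma_2\sigma_{12} & 0 & \sigma_1\sigma_{12} \\
0 & 0 & \sigma_{12}^2 & -\sigma_1\sigma_{12} & 0
\end{pmatrix},
\]
\[
[R^{(\alpha)}_{45kl}] = \frac{(\alpha^2-1)}{4\Delta^3}
\begin{pmatrix}
0 & -\sigma_1\Delta & 0 & 0 & 0 \\
\sigma_1\Delta & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & -\sigma_1\sigma_2 & \sigma_1\sigma_{12} \\
0 & 0 & \sigma_1\sigma_2 & 0 & -\sigma_1^2 \\
0 & 0 & -\sigma_1\sigma_{12} & \sigma_1^2 & 0
\end{pmatrix}.
\]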
DOI: 10.2478/s11533-006-0038-1 Research article CEJM 5(1) 2007 84–104
Integrable three-dimensional coupled nonlinear dynamical systems related to centrally extended operator Lie algebras and their Lax type three-linearization
J. Golenia¹∗, O.Ye. Hentosh²†, A.K. Prykarpatsky¹‡
¹ Department of Applied Mathematics, The AGH University of Science and Technology, Krakow 30059, Poland
² Department of Nonlinear Mathematical Analysis of NAS, Lviv 79060, Ukraine
Received 3 July 2006; accepted 14 September 2006
Abstract: The Hamiltonian representation for a hierarchy of Lax type equations on a dual space to the Lie algebra of integro-differential operators with matrix coefficients, extended by evolutions for eigenfunctions and adjoint eigenfunctions of the corresponding spectral problems, is obtained via some special Bäcklund transformation. The connection of this hierarchy with Lax integrable two-dimensional Davey-Stewartson type systems is studied.
© Versita Warsaw and Springer-Verlag Berlin Heidelberg. All rights reserved.
Keywords: Three-dimensional Lax-type flows, R-matrix approach, integrable Hamiltonian systems, Bäcklund transformation
MSC (2000): 35B7, 54B15, 34A30, 34B05, 34B15
Since the paper of M. Adler [1], which dealt with a one-dimensional differential operator algebra, it has been understood that a wide class of Lax integrable Korteweg-de Vries type nonlinear dynamical systems in partial derivatives [3, 6, 16, 33] can be described by means of Lie-algebraic techniques. It was shown that all of them can be represented as coadjoint orbits of Lie groups. The analog of the above construction for a class of matrix affine groups with central extensions was presented in [12, 16, 17, 25, 26], where its relationship with the momentum mapping, the R-matrix approach and versal deformations of differential operators was stated. However, the extension problem for Adler's construction in the case of a multi-dimensional differential operator algebra still stands open. Some preliminary results in this direction were obtained by L. Nizhnik [34] and recently by A. Samoilenko, Y. Prykarpatsky, J. Golenia and A. Prykarpatsky [18, 19, 21, 28].
In this article we suggest a new approach to a partial solution of this problem, based on the notions of a Bäcklund transformation [14, 16] and a tensor product of Poisson structures on a dual space of a one-dimensional differential operator algebra [3, 16, 20]. Making use of the invariance property of Casimir functionals under Bäcklund transformations, we construct a wide class of Lax integrable (2+1)-dimensional dynamical systems and for the first time represent them as a compatibility condition of three special linear first order differential equations, called here a triple linear Lax type representation.
As is well known, Lax type representations [6] for integrable (1+1)-dimensional nonlinear dynamical system hierarchies [3, 11, 16] on functional manifolds were first interpreted as Hamiltonian flows on the dual space to the Lie algebra of integro-differential operators in [1]. A Lie-algebraic method for constructing Lax type integrable (2+1)-dimensional nonlinear dynamical systems by means of two commuting flows from the hierarchy on a suitable coadjoint action orbit of a pseudo-differential operator with an infinite integral part was proposed in [4, 16, 30]. The connection of some Lax integrable (2+1)-dimensional systems with related hierarchies of Hamiltonian flows on the adjoint spaces to pseudo-differential operator Lie algebras, centrally extended by means of the standard Maurer-Cartan two-cocycle, was also intensively treated in [18, 25, 26, 32].
Since every Hamiltonian flow of such a type on the dual space either to an operator Lie algebra or to its central extension can be written as a compatibility condition for the corresponding isospectral problem for their eigenfunctions and their suitable evolutions, an important problem naturally arises: to find the Hamiltonian representation of the Lax type hierarchy coupled with the evolutions of a finite set of eigenfunctions and their appropriate adjoints. It was partially solved in [14, 15, 20, 27] and further developed in [5] for the Lie algebra of integro-differential operators and its super-generalization by means of the variational property of Casimir functionals under a Lie-Bäcklund transformation.
In Section 1 the general Lie-algebraic scheme of constructing Lax type integrable dynamical systems is described. Sections 2 and 3 are devoted to Bäcklund transformations of related tensor products of Poisson structures, based on the Casimir invariance [15, 20, 27] property, and their application to constructing the Lax type integrable Davey-Stewartson equation and its triple linear representation. Section 4 deals with a general Lie-algebraic scheme for constructing a hierarchy of Lax type integrable flows as Hamiltonian ones on the dual spaces to the centrally extended Lie algebra of integro-differential operators with matrix-valued coefficients. In Section 5 the corresponding Hamiltonian structure for the related coupled Lax type hierarchy is obtained by means of the Bäcklund transformation technique developed in [5, 20, 27].
In Section 6 the corresponding hierarchies of additional or so-called "ghost" symmetries [5, 10] for the coupled Lax type flows are also shown to be Hamiltonian. It is established that an additional hierarchy of Hamiltonian flows is generated by the Poisson structure equal to the tensor product of the R-deformed canonical Lie-Poisson bracket [12, 20, 27, 31] with the standard canonical Poisson bracket on a related eigenfunction space [15, 20, 27], and that the corresponding integer powers of suitable eigenvalues are their Hamiltonian functions. A method for introducing one more variable into (2+1)-dimensional integrable dynamical systems while preserving their Lax type integrability, based on the above additional symmetries, is proposed, and an integrable (3|1+1)-dimensional analog of the Davey-Stewartson system [29, 33] is constructed.
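1 The general Lie-algebraic scheme
Let $\tilde{\mathcal G} := C^\infty(S^1; \mathcal G)$ be a Lie algebra of loops taking values in a matrix Lie algebra $\mathcal G$. By means of $\tilde{\mathcal G}$ we construct a Lie algebra $\hat{\mathcal G}$ of matrix integro-differential operators [1, 18]:
\[
\hat a := \sum_{j \ll \infty} a_j \xi^j,
\]
where the symbol $\xi := \partial/\partial x$ denotes differentiation with respect to the independent variable $x \in \mathbb R/2\pi\mathbb Z \simeq S^1$. The usual Lie commutator on $\hat{\mathcal G}$ is defined as
\[
[\hat a, \hat b] := \hat a \circ \hat b - \hat b \circ \hat a
\]
for all $\hat a, \hat b \in \hat{\mathcal G}$, where "$\circ$" is the product of integro-differential operators, taking the form
\[
\hat a \circ \hat b := \sum_{\alpha \in \mathbb Z_+} \frac{1}{\alpha!}\, \frac{\partial^\alpha \hat a}{\partial \xi^\alpha}\, \frac{\partial^\alpha \hat b}{\partial x^\alpha}.
\]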
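The product formula can be tried out on a toy model. The following sketch is not the setting of the paper: it assumes scalar operators with polynomial (rather than periodic matrix-valued) coefficients and only nonnegative powers of $\xi$, so that the sum over $\alpha$ is finite. An operator is stored as a dictionary mapping a pair $(i, j)$ to the coefficient of $x^i \xi^j$; the function names are hypothetical.

```python
from math import comb, perm

def compose(a, b):
    """Symbol product a o b = sum_alpha (1/alpha!) d^alpha a/d xi^alpha * d^alpha b/d x^alpha.

    Operators are dicts {(i, j): c} standing for sums of c * x**i * xi**j, j >= 0.
    """
    out = {}
    for (i, j), ca in a.items():
        for (k, l), cb in b.items():
            for alpha in range(min(j, k) + 1):
                # (1/alpha!) * [j!/(j-alpha)!] * [k!/(k-alpha)!] = comb(j, alpha) * perm(k, alpha)
                coeff = ca * cb * comb(j, alpha) * perm(k, alpha)
                key = (i + k - alpha, j + l - alpha)
                out[key] = out.get(key, 0) + coeff
    return {key: c for key, c in out.items() if c != 0}

def commutator(a, b):
    """Lie commutator [a, b] = a o b - b o a."""
    ab, ba = compose(a, b), compose(b, a)
    diff = {key: ab.get(key, 0) - ba.get(key, 0) for key in set(ab) | set(ba)}
    return {key: c for key, c in diff.items() if c != 0}

xi = {(0, 1): 1}   # the symbol of d/dx
x = {(1, 0): 1}    # multiplication by x
```

In this toy model `compose(xi, x)` reproduces the Leibniz rule $\xi \circ x = x\xi + 1$, and `commutator(xi, x)` is the identity operator.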
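On the Lie algebra $\hat{\mathcal G}$ there exists the ad-invariant nondegenerate symmetric bilinear form:
\[
(\hat a, \hat b) := \int_0^{2\pi} \mathrm{Tr}\,(\hat a \circ \hat b)\, dx, \tag{1}
\]
where $\mathrm{Tr}$ is the operator given for all $\hat a \in \hat{\mathcal G}$ by the expression
\[
\mathrm{Tr}\, \hat a := \mathrm{res}_\xi\, \mathrm{tr}\, \hat a = \mathrm{tr}\, a_{-1},
\]
with $\mathrm{res}_\xi$ denoting the standard residue and $\mathrm{tr}$ the matrix trace. The scalar product (1) renders the Lie algebra $\hat{\mathcal G}$ metrizable. As a consequence, its dual linear space of matrix integro-differential operators $\hat{\mathcal G}^*$ is identified with the Lie algebra, that is $\hat{\mathcal G}^* \simeq \hat{\mathcal G}$. The linear subspaces $\hat{\mathcal G}_+ \subset \hat{\mathcal G}$ and $\hat{\mathcal G}_- \subset \hat{\mathcal G}$ defined as
\[
\hat{\mathcal G}_+ := \Bigl\{ \hat a := \sum_{j=0}^{n(\hat a) \ll \infty} a_j \xi^j : a_j \in \tilde{\mathcal G},\ j = \overline{0, n(\hat a)} \Bigr\}, \qquad
\hat{\mathcal G}_- := \Bigl\{ \hat b := \sum_{j=0}^{\infty} \xi^{-(j+1)} b_j : b_j \in \tilde{\mathcal G},\ j \in \mathbb Z_+ \Bigr\}, \tag{2}
\]
are Lie subalgebras in $\hat{\mathcal G}$ and $\hat{\mathcal G} = \hat{\mathcal G}_+ \oplus \hat{\mathcal G}_-$. Because of the splitting of $\hat{\mathcal G}$ into the direct sum of its Lie subalgebras (2) one can construct a so-called Lie-Poisson structure [1, 3, 12, 16, 25] on $\hat{\mathcal G}^*$, using the special linear endomorphism $R$ of $\hat{\mathcal G}$:
\[
R := (P_+ - P_-)/2, \qquad P_\pm \hat{\mathcal G} := \hat{\mathcal G}_\pm, \qquad P_\pm \hat{\mathcal G}_\mp = 0.
\]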
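The role of the endomorphism $R$ can be illustrated on a finite-dimensional analogue. The sketch below is an assumption-laden toy, not the operator algebra of the paper: square matrices stand in for $\hat{\mathcal G}$, upper-triangular matrices (diagonal included) for $\hat{\mathcal G}_+$ and strictly lower-triangular matrices for $\hat{\mathcal G}_-$. It builds the deformed bracket $[Ra, b] + [a, Rb]$ associated with such a splitting and measures its Jacobi defect exactly; all names are hypothetical.

```python
from fractions import Fraction

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def bracket(a, b):
    """Ordinary matrix commutator [a, b]."""
    return sub(matmul(a, b), matmul(b, a))

def r_map(a):
    """R = (P_+ - P_-)/2 for the splitting upper-triangular + strictly lower-triangular."""
    half = Fraction(1, 2)
    n = len(a)
    return [[half * a[i][j] if i <= j else -half * a[i][j] for j in range(n)]
            for i in range(n)]

def r_bracket(a, b):
    """Deformed bracket [a, b]_R = [R a, b] + [a, R b]."""
    return add(bracket(r_map(a), b), bracket(a, r_map(b)))

def jacobi_defect(a, b, c):
    """Cyclic sum [a,[b,c]_R]_R + [b,[c,a]_R]_R + [c,[a,b]_R]_R."""
    total = add(r_bracket(a, r_bracket(b, c)), r_bracket(b, r_bracket(c, a)))
    return add(total, r_bracket(c, r_bracket(a, b)))

a = [[1, 2, 0], [3, -1, 4], [0, 5, 2]]
b = [[0, 1, 1], [2, 0, -3], [1, 1, 1]]
c = [[2, 0, 1], [-1, 3, 0], [4, 1, -2]]
zero = [[0] * 3 for _ in range(3)]
```

Because both subspaces are subalgebras, the deformed bracket reduces to $[a_+, b_+] - [a_-, b_-]$, which is why the Jacobi defect vanishes for the sample matrices.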
For any Fréchet-smooth functionals $\gamma, \mu \in D(\hat{\mathcal G}^*)$ the Lie-Poisson bracket on $\hat{\mathcal G}^*$ is given by the expression:
\[
\{\gamma, \mu\}_R(\hat l) = \bigl(\hat l, [\nabla\gamma(\hat l), \nabla\mu(\hat l)]_R\bigr), \tag{3}
\]
where $\hat l \in \hat{\mathcal G}^*$ and for all $\hat a, \hat b \in \hat{\mathcal G}$ the R-commutator has the form [12, 26]:
\[
[\hat a, \hat b]_R := [R\hat a, \hat b] + [\hat a, R\hat b],
\]
subject to which the linear space $\hat{\mathcal G}$ becomes a Lie algebra too. The gradient $\nabla\gamma(\hat l) \in \hat{\mathcal G}$ of a functional $\gamma \in D(\hat{\mathcal G}^*)$ at a point $\hat l \in \hat{\mathcal G}^*$ with respect to the scalar product (1) is defined as
\[
\delta\gamma(\hat l) := \bigl(\nabla\gamma(\hat l), \delta\hat l\bigr),
\]
where the linear space isomorphism $\hat{\mathcal G} \simeq \hat{\mathcal G}^*$ is taken into account. The Lie-Poisson bracket (3) generates Hamiltonian dynamical systems on $\hat{\mathcal G}$ with Casimir invariants $\gamma \in I(\hat{\mathcal G}^*)$, satisfying the condition
\[
[\nabla\gamma(\hat l), \hat l] = 0, \tag{4}
\]
as the corresponding Hamiltonian functions. Owing to the expressions (3) and (4), the Hamiltonian system mentioned above takes the form
\[
d\hat l/dt := [R\nabla\gamma(\hat l), \hat l] = [\nabla\gamma_+(\hat l), \hat l], \tag{5}
\]
which is equivalent to the usual commutator Lax type representation [3, 6, 16, 33]. The relationship (5) is a compatibility condition for the linear integro-differential equations:
\[
\hat l f = \lambda f, \qquad df/dt = \nabla\gamma_+(\hat l) f, \tag{6}
\]
where $\lambda \in \mathbb C$ is a spectral parameter and the vector-function $f \in W(S^1; H)$ is an element of some matrix representation for the Lie algebra $\hat{\mathcal G}$ in some functional Hilbert space $H$. The algebraic properties of the equation (5) together with (6) and the associated dynamical system on the space of adjoint functions $f^* \in W^*(S^1; H)$:
\[
df^*/dt = -(\nabla\gamma(\hat l))^*_+ f^*, \tag{7}
\]
where $f^* \in W^*$ is a solution to the adjoint spectral problem
\[
\hat l^* f^* = \nu f^*,
\]
considered as coupled evolution equations on the space $\hat{\mathcal G}^* \oplus W \oplus W^*$, are the object of our further investigations.
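The statement that (5) is the compatibility condition of the pair (6) can be spelled out in one line. The following is a short sketch under the assumption, implicit in the isospectral setting, that the spectral parameter $\lambda$ does not depend on $t$:

```latex
\begin{aligned}
0 &= \frac{d}{dt}\bigl(\hat l f - \lambda f\bigr)
   = \frac{d\hat l}{dt} f + \hat l\,\nabla\gamma_+(\hat l) f - \lambda\,\nabla\gamma_+(\hat l) f \\
  &= \frac{d\hat l}{dt} f + \hat l\,\nabla\gamma_+(\hat l) f - \nabla\gamma_+(\hat l)\,\hat l f
   = \Bigl(\frac{d\hat l}{dt} - [\nabla\gamma_+(\hat l), \hat l]\Bigr) f .
\end{aligned}
```

Requiring this to hold for the eigenfunctions $f$ of the spectral problem gives back the Lax type equation (5).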
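2 The tensor product of Poisson structures and its Bäcklund transformation
To simplify the description below we will use the designation of the gradient vector
\[
\nabla\gamma(\tilde l, \tilde f, \tilde f^*) := (\delta\gamma/\delta\tilde l,\ \delta\gamma/\delta\tilde f,\ \delta\gamma/\delta\tilde f^*)
\]
for any smooth functional $\gamma \in D(\hat{\mathcal G}^* \oplus W \oplus W^*)$. On the spaces $\hat{\mathcal G}^*$ and $W \oplus W^*$ there exist canonical Poisson structures [3, 14, 27]
\[
\delta\gamma/\delta\tilde l \ \overset{\tilde\theta}{\longmapsto}\ [(\delta\gamma/\delta\tilde l)_+, \tilde l] - [\delta\gamma/\delta\tilde l, \tilde l]_+ \tag{8}
\]
at a point $\tilde l \in \hat{\mathcal G}^*$ and
\[
(\delta\gamma/\delta\tilde f,\ \delta\gamma/\delta\tilde f^*) \ \overset{\tilde J}{\longmapsto}\ (\delta\gamma/\delta\tilde f^*,\ -\delta\gamma/\delta\tilde f) \tag{9}
\]
at a point $(\tilde f, \tilde f^*) \in W \oplus W^*$ correspondingly. It should be noted that the Poisson structure (8) is transformed into (5) for any Casimir functional $\gamma \in I(\hat{\mathcal G}^*)$. Thus, on the augmented space $\hat{\mathcal G}^* \oplus W \oplus W^*$ one can obtain a Poisson structure as the tensor product $\tilde\Theta := \tilde\theta \otimes \tilde J$ of the structures (8) and (9).
Let us consider the following Bäcklund transformation [14, 16]:
\[
(\hat l, f, f^*) \ \overset{B}{\longmapsto}\ (\tilde l(\hat l, f, f^*),\ \tilde f = f,\ \tilde f^* = f^*), \tag{10}
\]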
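generating on $\hat{\mathcal G}^* \oplus W \oplus W^*$ a Poisson structure $\Theta$ with respect to the variables $(\hat l, f, f^*)$ of the coupled evolution equations (5)-(7). The main condition for the mapping (10) to be defined is the coincidence of the dynamical system
\[
(d\hat l/dt,\ df/dt,\ df^*/dt) := -\Theta\nabla\gamma(\hat l, f, f^*) \tag{11}
\]
with (5)-(7) in the case of $\gamma \in I(\hat{\mathcal G}^*)$, i.e. when the functional $\gamma$ is taken to be independent of the variables $(f, f^*) \in W \oplus W^*$. To satisfy that condition, one should find a variation of some Casimir functional $\gamma \in I(\hat{\mathcal G}^*)$ at $\delta\tilde l = 0$, taking into account the flows (6), (7) and the Bäcklund transformation (10):
\[
\begin{aligned}
\delta\gamma(\tilde l, \tilde f, \tilde f^*)\big|_{\delta\tilde l = 0}
&= (\langle \delta\gamma/\delta\tilde f, \delta\tilde f \rangle) + (\langle \delta\gamma/\delta\tilde f^*, \delta\tilde f^* \rangle) \\
&= \bigl[(\langle -d\tilde f^*/dt, \delta\tilde f \rangle) + (\langle d\tilde f/dt, \delta\tilde f^* \rangle)\bigr]\big|_{\tilde f = f,\ \tilde f^* = f^*} \\
&= (\langle (\delta\gamma/\delta\hat l)^*_+ f^*, \delta f \rangle) + (\langle (\delta\gamma/\delta\hat l)_+ f, \delta f^* \rangle) \\
&= (\langle f^*, (\delta\gamma/\delta\hat l)_+ \delta f \rangle) + (\langle (\delta\gamma/\delta\hat l)_+ f, \delta f^* \rangle) \\
&= (\delta\gamma/\delta\hat l,\ \delta f\, \xi^{-1} \otimes f^*) + (\delta\gamma/\delta\hat l,\ f \xi^{-1} \otimes \delta f^*) \\
&= (\delta\gamma/\delta\hat l,\ \delta(f \xi^{-1} \otimes f^*)) := (\delta\gamma/\delta\hat l, \delta\hat l),
\end{aligned}
\tag{12}
\]
where $\gamma \in I(\hat{\mathcal G}^*)$. As a result of the expression (12) one obtains the relationship
\[
\delta\hat l\,\big|_{\delta\tilde l = 0} = \delta(f \xi^{-1} \otimes f^*),
\]
or, having assumed the linear dependence of $\hat l$ on $\tilde l \in \hat{\mathcal G}^*$, one gets right away that
\[
\hat l = \tilde l + f \xi^{-1} \otimes f^*. \tag{13}
\]
Thus, the Bäcklund transformation (10) can now be written as
\[
(\hat l, f, f^*) \ \overset{B}{\longmapsto}\ (\tilde l = \hat l - f \xi^{-1} \otimes f^*,\ f,\ f^*). \tag{14}
\]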
The expression (14) generalizes the result obtained in the papers [20, 27, 35] for the Lie algebra $\hat{\mathcal G}$ of integro-differential operators with scalar coefficients. The existence of the Bäcklund transformation (10) makes it possible to formulate the following theorem.
Theorem 2.1. The dynamical system on $\hat{\mathcal G}^* \oplus W \oplus W^*$, being Hamiltonian with respect to the canonical Poisson structure $\tilde\Theta : T^*(\hat{\mathcal G}^* \oplus W \oplus W^*) \to T(\hat{\mathcal G}^* \oplus W \oplus W^*)$ and generated by the evolution equations
\[
d\tilde l/dt = [\nabla\gamma_+(\tilde l), \tilde l] - [\nabla\gamma(\tilde l), \tilde l]_+, \qquad
d\tilde f/dt = \delta\gamma/\delta\tilde f^*, \qquad
d\tilde f^*/dt = -\delta\gamma/\delta\tilde f,
\]
where $\gamma \in I(\hat{\mathcal G}^*)$ is a Casimir functional at $\hat l \in \hat{\mathcal G}^*$, connected with $\tilde l \in \hat{\mathcal G}^*$ by (13), is equivalent to the system (5), (6) and (7) via the Bäcklund transformation (14) constructed above.
By means of simple calculations via the formula (see [16, 27])
\[
\tilde\Theta = B' \Theta B'^{*},
\]
where $B' : T(\hat{\mathcal G}^* \oplus W \oplus W^*) \to T(\hat{\mathcal G}^* \oplus W \oplus W^*)$ is the Fréchet derivative of (14), one arrives at the following form of the Poisson structure $\Theta$ at an element $(\hat l, f, f^*) \in \hat{\mathcal G}^* \oplus W \oplus W^*$:
\[
\nabla\gamma(\hat l, f, f^*) \ \overset{\Theta}{\longmapsto}\
\begin{pmatrix}
[\hat l, (\delta\gamma/\delta\hat l)_+] - [\hat l, \delta\gamma/\delta\hat l]_+ - f \xi^{-1} \otimes \delta\gamma/\delta f + \delta\gamma/\delta f^*\, \xi^{-1} \otimes f^* \\
\delta\gamma/\delta f^* - (\delta\gamma/\delta\hat l)_+ f \\
-\delta\gamma/\delta f + (\delta\gamma/\delta\hat l)^*_+ f^*
\end{pmatrix},
\tag{15}
\]
which makes it possible to formulate the following theorem.
Theorem 2.2. The dynamical system (11), being Hamiltonian with respect to the Poisson structure $\Theta$ in the form (15) and a function $\gamma \in I(\hat{\mathcal G}^*)$, gives the inherited Hamiltonian representation for the coupled evolution equations (5)-(7).
By means of the expression (13) one can construct Hamiltonian evolution equations describing commutative flows on the augmented space $\hat{\mathcal G}^* \oplus W \oplus W^*$ at a fixed element $\tilde l \in \hat{\mathcal G}^*$. Owing to (15) such an equation is equivalent to the system
\[
\begin{cases}
d\hat l/d\tau_n = [\hat l^n_+, \hat l], \\
df/d\tau_n = \hat l^n_+ f, \\
df^*/d\tau_n = -(\hat l^*)^n_+ f^*,
\end{cases}
\tag{16}
\]
generated by the Casimir invariants $\gamma_n \in I(\hat{\mathcal G}^*)$, $n \in \mathbb N$, which are in involution with respect to the Poisson bracket (8) and take the standard form $\gamma_n = \frac{1}{n+1}(\hat l^n, \hat l)$ at $\hat l \in \hat{\mathcal G}^*$. The compatibility conditions of the Hamiltonian systems (16) for different $n \in \mathbb Z_+$ can be used for obtaining Lax integrable equations on the usual spaces of smooth 2π-periodic multivariable functions, as will be done in the next section.
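3 The Lax type integrable Davey-Stewartson equation and its triple linear representation
Choose the element $\tilde l \in \hat{\mathcal G}^*$ in an exact form such as
\[
\tilde l = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \xi - \begin{pmatrix} 0 & u \\ \bar u & 0 \end{pmatrix},
\]
where $u, \bar u \in S(S^1; \mathbb C)$ and $\mathcal G = gl(2; \mathbb C)$. Then
\[
\hat l = \tilde l + \begin{pmatrix} f_1 \xi^{-1} f_1^* & f_1 \xi^{-1} f_2^* + u \\ f_2 \xi^{-1} f_1^* + \bar u & f_2 \xi^{-1} f_2^* \end{pmatrix}, \tag{17}
\]
where $f = (f_1, f_2)$, $f^* = (f_1^*, f_2^*)$ and the bar can mean the complex conjugation or one related to it. Below we will study the evolutions (16) of vector-functions $(f, f^*) \in W(S^1; \mathbb C^2) \oplus W^*(S^1; \mathbb C^2)$ with respect to the variables $y = \tau_1$ and $t = \tau_2$ at the point (17). They can be obtained from the second and third equations in (16), letting $n = 1$ and $n = 2$, as well as from the first one. The latter is the compatibility condition of the spectral problem
\[
\hat l \Phi = \lambda \Phi, \tag{18}
\]
where $\Phi = (\Phi_1, \Phi_2) \in W(S^1; \mathbb C^2)$ and $\lambda \in \mathbb C$ is some complex parameter, with the following linear equations:
\[
d\Phi/dy = \hat l_+ \Phi, \tag{19}
\]
\[
d\Phi/dt = \hat l^2_+ \Phi, \tag{20}
\]
arising from (16) at $n = 1$ and $n = 2$ correspondingly. The compatibility of equations (18) and (19) leads to the relationships:
\[
\begin{aligned}
\partial u/\partial y &= -2 f_1 f_2^*, & \partial \bar u/\partial y &= -2 f_1^* f_2, \\
\partial f_1/\partial y &= \partial f_1/\partial x - u f_2, & \partial f_1^*/\partial y &= \partial f_1^*/\partial x - \bar u f_2^*, \\
\partial f_2/\partial y &= -\partial f_2/\partial x + \bar u f_1, & \partial f_2^*/\partial y &= -\partial f_2^*/\partial x + u f_1^*.
\end{aligned}
\tag{21}
\]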
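Analogously, replacing $t \in \mathbb R$ by $it \in i\mathbb R$, $i^2 = -1$, one gets from (18) and (20):
\[
\begin{aligned}
du/dt &= i\bigl(\partial^2 u/\partial x \partial y + 2u(f_1 f_1^* + f_2 f_2^*)\bigr), \\
d\bar u/dt &= -i\bigl(\partial^2 \bar u/\partial x \partial y + 2\bar u(f_1 f_1^* + f_2 f_2^*)\bigr), \\
\partial(f_1 f_1^*)/\partial y - \partial(f_1 f_1^*)/\partial x &= \tfrac{1}{2}\,\partial(u\bar u)/\partial y = -\bigl(\partial(f_2 f_2^*)/\partial x + \partial(f_2 f_2^*)/\partial y\bigr)
\end{aligned}
\tag{22}
\]
and
\[
\begin{aligned}
df_1/dt &= i\bigl(\partial^2 f_1/\partial x^2 + (2 f_1 f_1^* - u\bar u) f_1 - (\partial u/\partial x) f_2\bigr), \\
df_1^*/dt &= -i\bigl(\partial^2 f_1^*/\partial x^2 + (2 f_1 f_1^* - u\bar u) f_1^* - (\partial \bar u/\partial x) f_2^*\bigr), \\
df_2/dt &= i\bigl(\partial^2 f_2/\partial x^2 - (2 f_2 f_2^* + u\bar u) f_2 - (\partial \bar u/\partial x) f_1\bigr), \\
df_2^*/dt &= -i\bigl(\partial^2 f_2^*/\partial x^2 - (2 f_2 f_2^* + u\bar u) f_2^* - (\partial u/\partial x) f_1^*\bigr).
\end{aligned}
\tag{23}
\]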
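The last relation in (22) does not involve the $t$-flow and can be checked directly from the constraints (21). A brief worked verification, using only the product rule and the six relations of (21):

```latex
\begin{aligned}
\tfrac{1}{2}\,\partial_y (u\bar u)
  &= \tfrac{1}{2}\bigl(\bar u\,\partial_y u + u\,\partial_y \bar u\bigr)
   = -\bigl(\bar u f_1 f_2^* + u f_1^* f_2\bigr), \\
\partial_y (f_1 f_1^*) - \partial_x (f_1 f_1^*)
  &= f_1^*\bigl(\partial_y f_1 - \partial_x f_1\bigr) + f_1\bigl(\partial_y f_1^* - \partial_x f_1^*\bigr)
   = -u f_1^* f_2 - \bar u f_1 f_2^*, \\
\partial_x (f_2 f_2^*) + \partial_y (f_2 f_2^*)
  &= f_2^*\bigl(\partial_y f_2 + \partial_x f_2\bigr) + f_2\bigl(\partial_y f_2^* + \partial_x f_2^*\bigr)
   = \bar u f_1 f_2^* + u f_1^* f_2 .
\end{aligned}
```

The first two lines coincide, and the third is their negative, which is exactly the chain of equalities stated in (22).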
The relationships (22), (23) take the well known form of the Davey-Stewartson equation [3, 11, 36] when $\bar u \in S(S^1; \mathbb C)$ is the complex conjugate of $u \in S(S^1; \mathbb C)$. The compatibility for every pair of equations (18), (19) and (20) can be rewritten as first order linear ordinary differential equations as follows:
\[
d\Phi/dx = \begin{pmatrix} \lambda & u & -f_1 \\ \bar u & -\lambda & f_2 \\ f_1^* & f_2^* & 0 \end{pmatrix} \Phi, \tag{24}
\]
\[
d\Phi/dy = \begin{pmatrix} \lambda & 0 & -f_1 \\ 0 & \lambda & -f_2 \\ f_1^* & f_2^* & 0 \end{pmatrix} \Phi, \tag{25}
\]
\[
d\Phi/dt = i \begin{pmatrix}
\lambda^2 + f_1 f_1^* & \tfrac{1}{2}\,\partial u/\partial y & -\lambda f_1 - \partial f_1/\partial y \\
-\tfrac{1}{2}\,\partial \bar u/\partial y & -\lambda^2 - f_2 f_2^* & -\lambda f_2 - \partial f_2/\partial y \\
\lambda f_1^* + \partial f_1^*/\partial y & \lambda f_2^* + \partial f_2^*/\partial y & f_2 f_2^* - f_1 f_1^*
\end{pmatrix} \Phi, \tag{26}
\]
where $\Phi = (\Phi_1, \Phi_2, \Phi_3) \in W(S^1; \mathbb C^3)$, which provides its Lax type integrability. Thus, the following theorem holds.
Theorem 3.1. The Davey-Stewartson equation (22), (23) possesses the Lax representation as the compatibility condition for (24) and (26) under the additional constraint (21), arising naturally from the equations (24) and (25).
In fact, one has found above a triple linearization for a (2+1)-dimensional dynamical system, which is a new important ingredient of the Lie-algebraic approach to Lax type integrable flows, based on the Bäcklund type transformation (14) developed in this work. It is clear that a similar construction of a triple linearization like (24)-(26) can be carried out for many other (2+1)-dimensional dynamical systems, both old and new. Another paper on these dynamical systems is being prepared.
4 The centrally extended Lie-algebraic structure
Let $\tilde{\mathcal G} := C^\infty(S \times S; \mathcal G)$ be a current Lie algebra of mappings taking values in a semi-simple matrix Lie algebra $\mathcal G$. By means of this algebra $\tilde{\mathcal G}$ one constructs the Lie algebra $\hat{\mathcal G}$ of the following matrix integro-differential operators:
\[
a := I \xi^m + \sum_{j < m} a_j \xi^j,
\]
where $a_j \in \tilde{\mathcal G}$, $j < m$, $j \in \mathbb Z$, $m \in \mathbb N$, and, as before, the symbol $\xi := \partial/\partial x$ denotes differentiation with respect to the independent variable $x \in \mathbb R/2\pi\mathbb Z \simeq S$. The related centrally extended Lie commutator on $\hat{\mathcal G}_c := \hat{\mathcal G} \oplus \mathbb C$ is given as [5, 11, 26, 32]:
\[
[(a, \alpha), (b, \beta)] := ([a, b], \omega(a, b)), \tag{27}
\]
where $\alpha, \beta \in \mathbb C$, being generated by means of the standard Maurer-Cartan two-cocycle on $\hat{\mathcal G}$:
\[
\omega(a, b) := (a, [\partial/\partial y, b]),
\]
where $\partial/\partial y$ is the differentiation with respect to the independent variable $y \in S$ and $[\partial/\partial y, b] := \partial b/\partial y$. The commutator (27) can be deformed by means of the endomorphism $R$ of $\hat{\mathcal G}$ defined above:
\[
[(a, \alpha), (b, \beta)]_R := ([a, b]_R, \omega_R(a, b)), \tag{28}
\]
where the R-commutator takes the form
\[
[a, b]_R := [Ra, b] + [a, Rb],
\]
and the R-deformed two-cocycle is determined in the following way:
\[
\omega_R(a, b) := \omega(Ra, b) + \omega(a, Rb).
\]
For any Fréchet-smooth functionals $\gamma, \mu \in D(\hat{\mathcal G}_c^*)$ the Lie-Poisson bracket on $\hat{\mathcal G}_c^*$ related with the commutator (28) and the extended scalar product
\[
((a, \alpha), (b, \beta)) := (a, b) + \alpha\beta,
\]
where $a, b \in \hat{\mathcal G}$ and $\alpha, \beta \in \mathbb C$, is given as
\[
\{\gamma, \mu\}_R(l) = (l, [\nabla\gamma(l), \nabla\mu(l)]_R) + c\,\omega_R(\nabla\gamma(l), \nabla\mu(l)), \tag{29}
\]
where $l \in \hat{\mathcal G}^*$ and $c \in \mathbb C$. Based on the scalar product (1), the gradient $\nabla\gamma(l) \in \hat{\mathcal G}_c$ of some functional $\gamma \in D(\hat{\mathcal G}_c^*)$ at the point $l \in \hat{\mathcal G}_c^*$ is naturally defined as
\[
\delta\gamma(l) := (\nabla\gamma(l), \delta l).
\]
Construct now the Casimir functionals $\gamma_n \in I(\hat{\mathcal G}_c^*)$, $n \in \mathbb N$, as
\[
\gamma_n(l) := \int_0^{2\pi}\!\!\int_0^{2\pi} \mathrm{Tr}\,(\xi^n \hat l_0)\, dx\, dy, \tag{30}
\]
being invariant with respect to the $\mathrm{Ad}^*$-action of the abstract Lie group $\hat G_c$ corresponding to $\hat{\mathcal G}_c^*$ and satisfying the following condition [26]
\[
(l - c\,\partial/\partial y) \circ \Phi = \Phi \circ (l_0 - c\,\partial/\partial y) \tag{31}
\]
at a point $l \in \hat{\mathcal G}^*$. Here we have in (31)
\[
\hat l_0 := \xi^m + \sum_{j < m} c_j \xi^j \in \hat{\mathcal G}^*,
\]
with $c_j \in \tilde{\mathcal G}$, $[\xi, c_j] = 0$, $j < m$, $j \in \mathbb Z$, $m \in \mathbb N$, and
\[
\Phi = 1 + \sum_{r \in \mathbb N} \Phi_r \xi^{-r} \in \hat G_-,
\]
where $\hat G_-$ is the suitable abstract Lie group [4, 15, 26, 27, 32] generated by the Lie subalgebra $\hat{\mathcal G}_-$. Just as in [15, 20, 27], it can be shown that condition (31) is equivalent to the following relationship
\[
[l - c\,\partial/\partial y, \nabla\gamma_n(l)] = 0 \tag{32}
\]
for all $n \in \mathbb N$. In the case of $c = 0$ the Casimir functionals take the usual Adler form [1, 15]. The Lie-Poisson bracket (29) generates the hierarchy of Hamiltonian dynamical systems on $\hat{\mathcal G}_c^*$ with the Casimir functionals $\gamma_n \in I(\hat{\mathcal G}_c^*)$, $n \in \mathbb N$, as the corresponding Hamiltonian functions, taking the form:
\[
d\hat l/dt_n := [R\nabla\gamma_n(l), l - c\,\partial/\partial y] = [(\nabla\gamma_n(l))_+, l - c\,\partial/\partial y], \tag{33}
\]
where the lower index "+" denotes the differential part of the corresponding integro-differential operator. This equation is equivalent to the usual commutator Lax type representation. It is easy to verify that for every $n \in \mathbb N$ the above relationship is the compatibility condition for the following systems of linear integro-differential equations:
\[
(l - c\,\partial/\partial y) f = \lambda f \tag{34}
\]
and
\[
df/dt_n = (\nabla\gamma_n(l))_+ f, \tag{35}
\]
where $\lambda \in \mathbb C$ is a spectral parameter, $f \in W := W(S \times S; H)$ and $H$ is a matrix representation space of the Lie algebra $\mathcal G$. The dynamical system related to (35) on the adjoint function space $W^* := W^*(S \times S; H)$ takes the form:
\[
df^*/dt_n = -(\nabla\gamma_n(l))^*_+ f^*, \tag{36}
\]
where $f^* \in W^*$ is a solution of the adjoint spectral relationship
\[
(l^* + c\,\partial/\partial y) f^* = \nu f^*, \tag{37}
\]
with a spectral parameter $\nu \in \mathbb C$. Further, we will assume that the spectral relationship (34) admits $N \in \mathbb N$ different eigenvalues $\lambda_i \in \mathbb C$, $i = \overline{1, N}$, and study the algebraic properties of the equation (33) combined with $N \in \mathbb N$ copies of (35):
\[
df_i/dt_n = (\nabla\gamma_n(\hat l))_+ f_i, \tag{38}
\]
for the corresponding eigenfunctions $f_i \in W(S \times S; H)$, $i = \overline{1, N}$, and the same number of copies of (36):
\[
df_i^*/dt_n = -(\nabla\gamma_n(\hat l))^*_+ f_i^*, \tag{39}
\]
for the suitable adjoint eigenfunctions $f_i^* \in W^*(S \times S; H)$ related with $N \in \mathbb N$ different eigenvalues $\nu_i \in \mathbb C$, $i = \overline{1, N}$, of (37), all considered as a coupled evolution system on the space $\hat{\mathcal G}_c^* \oplus W^N \oplus W^{*N}$. The same problem at $c = 0$ and $N = 1$ has been studied before in the papers [14, 15, 27].
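5 The centrally extended Poisson bracket on the augmented phase space
To simplify the description below we shall use the following notation of the gradient vector:
\[
\nabla\gamma(\tilde l, \tilde{\mathbf f}, \tilde{\mathbf f}^*) := (\delta\gamma/\delta\tilde l,\ \delta\gamma/\delta\tilde{\mathbf f},\ \delta\gamma/\delta\tilde{\mathbf f}^*),
\]
where $\tilde{\mathbf f} := (\tilde f_1, \ldots, \tilde f_N)$, $\tilde{\mathbf f}^* := (\tilde f_1^*, \ldots, \tilde f_N^*)$ and $\delta\gamma/\delta\tilde{\mathbf f} := (\delta\gamma/\delta\tilde f_1, \ldots, \delta\gamma/\delta\tilde f_N)$, $\delta\gamma/\delta\tilde{\mathbf f}^* := (\delta\gamma/\delta\tilde f_1^*, \ldots, \delta\gamma/\delta\tilde f_N^*)$, at a point $(\tilde l, \tilde{\mathbf f}, \tilde{\mathbf f}^*) \in \hat{\mathcal G}^* \oplus W^N \oplus W^{*N}$ for any smooth functional $\gamma \in D(\hat{\mathcal G}_c^* \oplus W^N \oplus W^{*N})$. On the spaces $\hat{\mathcal G}_c^*$ and $W^N \oplus W^{*N}$ there exist canonical Poisson structures such as
\[
\delta\gamma/\delta\tilde l \ \overset{\tilde\theta}{\longmapsto}\ [\tilde l - c\,\partial/\partial y, (\delta\gamma/\delta\tilde l)_+] - [\tilde l - c\,\partial/\partial y, \delta\gamma/\delta\tilde l]_+, \tag{40}
\]
where $\tilde\theta : T^*(\hat{\mathcal G}_c^*) \to T(\hat{\mathcal G}_c^*)$ is an implectic operator corresponding to (29) at a point $\tilde l \in \hat{\mathcal G}^*$, and
\[
(\delta\gamma/\delta\tilde{\mathbf f},\ \delta\gamma/\delta\tilde{\mathbf f}^*) \ \overset{\tilde J}{\longmapsto}\ (-\delta\gamma/\delta\tilde{\mathbf f}^*,\ \delta\gamma/\delta\tilde{\mathbf f}), \tag{41}
\]
where $\tilde J : T^*(W^N \oplus W^{*N}) \to T(W^N \oplus W^{*N})$ is an implectic operator corresponding to the symplectic form $\omega^{(2)} = \sum_{i=1}^N d\tilde f_i^* \wedge d\tilde f_i$ at a point $(\tilde{\mathbf f}, \tilde{\mathbf f}^*) \in W^N \oplus W^{*N}$. It should be noted here that the Poisson structure (40) generates the equation (33) for any Casimir functional $\gamma \in I(\hat{\mathcal G}_c^*)$. Therefore, on the augmented phase space $\hat{\mathcal G}_c^* \oplus W^N \oplus W^{*N}$ one can obtain a Poisson structure as the tensor product $\tilde\Theta := \tilde\theta \otimes \tilde J$ of (40) and (41).
Consider now the following [15, 20, 27] Bäcklund transformation:
\[
(\tilde l, \tilde{\mathbf f}, \tilde{\mathbf f}^*) \ \overset{B}{\longmapsto}\ (l(\tilde l, \tilde{\mathbf f}, \tilde{\mathbf f}^*),\ \mathbf f = \tilde{\mathbf f},\ \mathbf f^* = \tilde{\mathbf f}^*), \tag{42}
\]
generating on $\hat{\mathcal G}_c^* \oplus W^N \oplus W^{*N}$ a Poisson structure $\Theta : T^*(\hat{\mathcal G}_c^* \oplus W^N \oplus W^{*N}) \to T(\hat{\mathcal G}_c^* \oplus W^N \oplus W^{*N})$. The main condition imposed on the mapping (42) is the coincidence of the resulting dynamical system
\[
(dl/dt_n,\ d\mathbf f/dt_n,\ d\mathbf f^*/dt_n) := -\Theta \nabla\bar\gamma_n(l, \mathbf f, \mathbf f^*) \tag{43}
\]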
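with the equations (33), (38) and (39) for the case when $\bar\gamma_n \in I(\hat{\mathcal G}_c^*)$, $n \in \mathbb N$, do not depend on the variables $(\mathbf f, \mathbf f^*) \in W^N \oplus W^{*N}$. To satisfy that condition we will find a variation of a Casimir functional $\bar\gamma_n := \gamma_n|_{l = l(\tilde l, \mathbf f, \mathbf f^*)} \in D(\hat{\mathcal G}_c^* \oplus W^N \oplus W^{*N})$, $n \in \mathbb N$, under the constraint $\delta\tilde l = 0$, taking into account the evolutions (38), (39) and the definition of the Bäcklund transformation (42). Hence we have
\[
\begin{aligned}
\delta\bar\gamma_n(\tilde l, \tilde{\mathbf f}, \tilde{\mathbf f}^*)\big|_{\delta\tilde l = 0}
&= \sum_{i=1}^N \bigl( \langle \delta\bar\gamma_n/\delta\tilde f_i, \delta\tilde f_i \rangle + \langle \delta\bar\gamma_n/\delta\tilde f_i^*, \delta\tilde f_i^* \rangle \bigr) \\
&= \sum_{i=1}^N \bigl( \langle -d\tilde f_i^*/dt_n, \delta\tilde f_i \rangle + \langle d\tilde f_i/dt_n, \delta\tilde f_i^* \rangle \bigr)\Big|_{\tilde{\mathbf f} = \mathbf f,\ \tilde{\mathbf f}^* = \mathbf f^*} \\
&= \sum_{i=1}^N \bigl( \langle (\delta\gamma_n/\delta l)^*_+ f_i^*, \delta f_i \rangle + \langle (\delta\gamma_n/\delta l)_+ f_i, \delta f_i^* \rangle \bigr) \\
&= \sum_{i=1}^N \bigl( \langle f_i^*, (\delta\gamma_n/\delta l)_+ \delta f_i \rangle + \langle (\delta\gamma_n/\delta l)_+ f_i, \delta f_i^* \rangle \bigr) \\
&= \sum_{i=1}^N \bigl( (\delta\gamma_n/\delta l,\ (\delta f_i) \xi^{-1} \otimes f_i^*) + (\delta\gamma_n/\delta l,\ f_i \xi^{-1} \otimes \delta f_i^*) \bigr) \\
&= \Bigl( \delta\gamma_n/\delta l,\ \delta \sum_{i=1}^N f_i \xi^{-1} \otimes f_i^* \Bigr) := (\delta\gamma_n/\delta l, \delta l),
\end{aligned}
\tag{44}
\]
where $\gamma_n \in I(\hat{\mathcal G}_c^*)$, $n \in \mathbb N$, and the brackets $\langle \cdot, \cdot \rangle$ denote the standard pairing of the spaces $W^*$ and $W$. As a result of the expression (44) one obtains the relationship:
\[
\delta l\,\big|_{\delta\tilde l = 0} = \sum_{i=1}^N \delta(f_i \xi^{-1} \otimes f_i^*). \tag{45}
\]
Having assumed the linear dependence of $l$ on $\tilde l \in \hat{\mathcal G}^*$ one gets right away from (45) that
\[
l = \tilde l + \sum_{i=1}^N f_i \xi^{-1} \otimes f_i^*. \tag{46}
\]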
96
J. Golenia et al. / Central European Journal of Mathematics 5(1) 2007 84–104
Thus, the B¨acklund transformation (42) can be written as B (˜l, ˜f, ˜f∗ ) :→ (l = ˜l +
N
fi ξ −1 ⊗ fi∗ , f, f ∗ ) .
(47)
i=1
The expression (47) generalizes the results obtained both for the scalar Lie algebra of integro-differential operators in [20, 26, 27] and for the matrix one in [15, 27]. The existence of the Bäcklund transformation (47) ensures the validity of the following theorem.

Theorem 5.1. Under the Bäcklund transformation (47), the dynamical system (43) on $\hat{G}_c^* \oplus W^N \oplus W^{*N}$ is equivalent to the following system of evolution equations:
$$d\tilde{l}/dt_n = [(\nabla\bar{\gamma}_n(\tilde{l}))_+, \tilde{l}\,] - [\nabla\bar{\gamma}_n(\tilde{l}), \tilde{l}\,]_+, \qquad d\tilde{f}/dt_n = \delta\bar{\gamma}_n/\delta\tilde{f}^*, \qquad d\tilde{f}^*/dt_n = -\delta\bar{\gamma}_n/\delta\tilde{f},$$
where $\bar{\gamma}_n := \gamma_n|_{l = l(\tilde{l}, f, f^*)} \in D(\hat{G}_c^* \oplus W^N \oplus W^{*N})$ and $\gamma_n \in I(\hat{G}_c^*)$ is a Casimir functional at a point $l \in \hat{G}^*$ for every $n \in \mathbb{N}$.

Now, by means of simple calculations via the formula
$$\Theta = B\,\tilde{\Theta}\,B^*,$$
where $B : T(\hat{G}_c^* \oplus W^N \oplus W^{*N}) \to T(\hat{G}_c^* \oplus W^N \oplus W^{*N})$ is the Fréchet derivative of (47), one easily finds the following form of the Bäcklund transformed Poisson structure $\Theta$ on $\hat{G}_c^* \oplus W^N \oplus W^{*N}$:
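$$\Theta : \nabla\gamma(l, f, f^*) \mapsto
\begin{pmatrix}
[\,l - c\,\partial/\partial y, (\delta\gamma/\delta l)_+] - [\,l - c\,\partial/\partial y, \delta\gamma/\delta l\,]_+ + \sum_{i=1}^{N} \big( f_i \xi^{-1} \otimes (\delta\gamma/\delta f_i) - (\delta\gamma/\delta f_i^*) \xi^{-1} \otimes f_i^* \big) \\
-\delta\gamma/\delta f^* - (\delta\gamma/\delta l)_+ f \\
\delta\gamma/\delta f + (\delta\gamma/\delta l)_+^* f^*
\end{pmatrix}, \tag{48}$$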
where $\gamma \in D(\hat{G}_c^* \oplus W^N \oplus W^{*N})$ is an arbitrary smooth functional. Thereby, one can formulate the following theorem.

Theorem 5.2. The hierarchy of dynamical systems (33), (38) and (39) is Hamiltonian with respect to the Poisson structure $\Theta$ in the form (48) and the functionals $\bar{\gamma}_n := \gamma_n \in I(\hat{G}_c^*)$, $n \in \mathbb{N}$, which are Casimir invariants on $\hat{G}_c^*$.

Based on the expression (43), one can construct a new hierarchy of Hamiltonian evolution equations describing commutative flows on the augmented phase space $\hat{G}_c^* \oplus W^N \oplus W^{*N}$. These flows are generated by the Casimir invariants $\gamma_n \in I(\hat{G}_c^*)$, $n \in \mathbb{N}$, which are in involution with respect to the Poisson bracket (29).
6 The hierarchies of additional symmetries
The hierarchy (33), (38) and (39) of evolution equations possesses another natural set of invariants, including all higher powers of the eigenvalues $\lambda_k$, $k = \overline{1,N}$. The latter can be considered as Fréchet smooth functionals on the augmented phase space $\hat{G}_c^* \oplus W^N \oplus W^{*N}$ owing to the evident representations
$$\lambda_k^s = \langle f_k^*, (l - c\,\partial/\partial y)^s f_k \rangle, \tag{49}$$
where $s \in \mathbb{N}$, holding under the normalizing constraints $\langle f_k^*, f_k \rangle = 1$. In the case of the Bäcklund transformation (46), where
$$l := l_+ + \sum_{i=1}^{N} f_i \xi^{-1} \otimes f_i^*, \tag{50}$$
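the formula (49) gives rise to the following variation of the functionals $\lambda_k^s \in D(\hat{G}_c^* \oplus W^N \oplus W^{*N})$, $k = \overline{1,N}$, $s \in \mathbb{N}$:
$$\begin{aligned}
\delta\lambda_k^s &= \langle \delta f_k^*, (l - c\,\partial/\partial y)^s f_k \rangle + \langle f_k^*, (\delta(l - c\,\partial/\partial y)^s) f_k \rangle + \langle f_k^*, (l - c\,\partial/\partial y)^s (\delta f_k) \rangle \\
&= (M_k^s, \delta l_+) + \sum_{i=1}^{N} \langle (-M_k^s + \delta_k^i (l - c\,\partial/\partial y)^s)^* f_i^*, \delta f_i \rangle + \sum_{i=1}^{N} \langle (-M_k^s + \delta_k^i (l - c\,\partial/\partial y)^s) f_i, \delta f_i^* \rangle,
\end{aligned}$$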
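where $\delta_k^i$, $i, k = \overline{1,N}$, is the Kronecker symbol and the operators $M_k^s$, $k = \overline{1,N}$, $s \in \mathbb{N}$, are determined as
$$M_k^s := \sum_{p=0}^{s-1} \big( (l - c\,\partial/\partial y)^p f_k \big) \xi^{-1} \otimes \big( (l^* + c\,\partial/\partial y)^{s-1-p} f_k^* \big).$$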
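As a quick illustration (a worked special case written out here, not taken from the source), the two lowest members of the family of operators $M_k^s$ follow directly from the defining sum by setting $s = 1$ and $s = 2$:

```latex
M_k^1 = f_k\,\xi^{-1} \otimes f_k^* ,
\qquad
M_k^2 = f_k\,\xi^{-1} \otimes \bigl( (l^* + c\,\partial/\partial y) f_k^* \bigr)
      + \bigl( (l - c\,\partial/\partial y) f_k \bigr)\,\xi^{-1} \otimes f_k^* .
```

The operator with $s = 1$ and $k = 1$ is the one that enters the flow $d/d\tau_{1,1}$ used in the example later in this section.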
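Thus, one obtains the exact forms of the gradients of the functionals $\lambda_k^s \in D(\hat{G}_c^* \oplus W^N \oplus W^{*N})$, $k = \overline{1,N}$:
$$\nabla\lambda_k^s(l_+, f, f^*) = \big( M_k^s,\ (-M_k^s + \delta_k^i (l - c\,\partial/\partial y)^s)^* f_i^*,\ (-M_k^s + \delta_k^i (l - c\,\partial/\partial y)^s) f_i : i = \overline{1,N} \big). \tag{51}$$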
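By means of the expressions (51), (40) and (41) one finds a new hierarchy of coupled evolution equations on $\hat{G}_c^* \oplus W^N \oplus W^{*N}$:
$$dl_+/d\tau_{s,k} = -[M_k^s, l_+ - c\,\partial/\partial y]_+, \tag{52}$$
$$df_i/d\tau_{s,k} = (-M_k^s + \delta_k^i (l - c\,\partial/\partial y)^s) f_i, \tag{53}$$
$$df_i^*/d\tau_{s,k} = (M_k^s - \delta_k^i (l - c\,\partial/\partial y)^s)^* f_i^*, \tag{54}$$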
where $i = \overline{1,N}$ and $\tau_{s,k} \in \mathbb{R}$, $s \in \mathbb{N}$, are evolution parameters. Owing to the Bäcklund transformation (50), the equation (52) can be rewritten equivalently in the following commutator form:
$$dl/d\tau_{s,k} = -[M_k^s, l - c\,\partial/\partial y] = -\lambda_k^p \nu_k^{s-1-p} [M_k^1, l - c\,\partial/\partial y] = \lambda_k^p \nu_k^{s-1-p}\, dl/d\tau_{1,k}, \tag{55}$$
where $p = \overline{0, s-1}$. Thereby, one can formulate the following theorem.

Theorem 6.1. For $k = \overline{1,N}$ and $s \in \mathbb{N}$ the dynamical systems (55), (53) and (54) are Hamiltonian with respect to the Poisson structure $\Theta$ in the form (48) and the invariant functionals $\bar{\gamma}_s := \lambda_k^s \in D(\hat{G}_c^* \oplus W^N \oplus W^{*N})$. The dynamical systems (55), (53) and (54) describe flows on $\hat{G}_c^* \oplus W^N \oplus W^{*N}$ that commute both with each other and with the hierarchy of Lax type dynamical systems (33), (38) and (39).

Proof. To prove the theorem it is sufficient to show that
$$[d/dt_n, d/d\tau_{1,k}] = 0, \qquad [d/d\tau_{1,k}, d/d\tau_{1,q}] = 0, \tag{56}$$
where $k, q = \overline{1,N}$ and $n \in \mathbb{N}$. The first equality in (56) follows from the identities
$$d(\nabla\gamma_n(l))_+/d\tau_{1,k} = [(\nabla\gamma_n(l))_+, M_1^1]_+, \qquad dM_1^1/dt_n = [(\nabla\gamma_n(l))_+, M_1^1]_-,$$
and the second one is a consequence of the relationship
$$dM_k^1/d\tau_{1,q} - dM_q^1/d\tau_{1,k} = [M_k^1, M_q^1],$$
which proves the theorem.
Thereby, for every $k = \overline{1,N}$ and all $s \in \mathbb{N}$ the dynamical systems (55), (53) and (54) on $\hat{G}_c^* \oplus W^N \oplus W^{*N}$ form a hierarchy of additional homogeneous, or so-called "ghost", symmetries for the Lax type flows (33), (38) and (39) on $\hat{G}_c^* \oplus W^N \oplus W^{*N}$. The additional symmetry hierarchy for Lax type integrable one-dimensional dynamical systems associated with the Lie algebra $\hat{G}^*$ of integro-differential operators was first described as some infinitely graded algebra in [10, 15]. It has been widely used for constructing Lax type integrable two-dimensional dynamical systems in [3, 10, 15, 36]. If $N \geq 2$, one can obtain a new class of nontrivial Hamiltonian flows $d/dT_n := d/dt_n + \sum_{k=1}^{N-1} d/d\tau_{n,k}$, $n \in \mathbb{N}$, on $\hat{G}_c^* \oplus W^N \oplus W^{*N}$ in the Lax type form by use of the invariants considered above for the centrally extended Lie algebra $\hat{G}_c^*$ of integro-differential operators. Acting on the eigenfunctions $(f_i, f_i^*) \in W \oplus W^*$, $i = \overline{1,N}$, these flows generate some integrable $(N+1)$-dimensional nonlinear dynamical systems. For example, in the case of the element $l := \partial/\partial x + f_1 \xi^{-1} \otimes f_1^* + f_2 \xi^{-1} \otimes f_2^* \in \hat{G}^*$ with $(f_1, f_2, f_1^*, f_2^*) \in W^2(S \times S; H) \times W^{*2}(S \times S; H)$, the flows $d/d\tau := d/d\tau_{1,1}$ and
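$d/dT := d/dT_2 = d/dt_2 + d/d\tau_{2,1}$ on $\hat{G}_c^* \oplus W^2 \oplus W^{*2}$, acting on the functions $f_i, f_i^*$, $i = 1, 2$, give rise to the dynamical systems
$$\begin{aligned}
f_{1,\tau} &= f_{1,x} - c f_{1,y} + f_2 u, & f_{1,\tau}^* &= f_{1,x}^* - c f_{1,y}^* + f_2^* \bar{u}, \\
f_{2,\tau} &= -f_1 \bar{u}, & f_{2,\tau}^* &= -f_1^* u,
\end{aligned} \tag{57}$$
and
$$\begin{aligned}
f_{1,T} &= f_{1,xx} + f_{1,\tau\tau} + w f_1 + 2 f_1 v_\tau, \\
f_{1,T}^* &= -f_{1,xx}^* - f_{1,\tau\tau}^* - w f_1^* - 2 f_1^* v_\tau, \\
f_{2,T} &= f_{2,xx} + w f_2 - f_{1,\tau} \bar{u} + f_1 \bar{u}_\tau, \\
f_{2,T}^* &= -f_{2,xx}^* - w f_2^* + f_{1,\tau}^* u - f_1^* u_\tau, \\
c\,w_y &= w_x - 2 (f_1 \otimes f_1^* + f_2 \otimes f_2^*)_x, \\
u_x &= f_1^T f_2^*, \qquad \bar{u}_x = f_1^{*T} f_2, \qquad v_x = f_1^T f_1^*,
\end{aligned} \tag{58}$$
where one puts $(\nabla\gamma_2(l))_+ := \partial^2/\partial x^2 + w$ for some function $w \in \tilde{G}$ depending parametrically on the variables $\tau, T \in \mathbb{R}$. The systems (57) and (58) represent a Lax type integrable (3+1)-dimensional generalization of the (2+1)-dimensional system that is equivalent to the Davey-Stewartson one [11, 29, 33], with an infinite sequence of conservation laws which can be found by the formula (30) in the form
$$\gamma_n(l) := \int_0^{2\pi}\!\!\int_0^{2\pi} \operatorname{tr}\big( f_1\, \partial^{n-1} f_1^*/\partial x^{n-1} + f_2\, \partial^{n-1} f_2^*/\partial x^{n-1} \big)\, dx\, dy,$$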
where $n \in \mathbb{N}$. The suitable Lax type linearization is given by the spectral problem (34) augmented by the set of evolution equations
$$f_\tau = -M_1^1 f, \tag{59}$$
$$f_T = ((\nabla\gamma_2(l))_+ - M_1^2) f, \tag{60}$$
for an arbitrary eigenfunction $f \in W(S \times S; H)$. The relationships (59) and (60) give rise to the additional nonlinear constraint
$$w_\tau = 2 (f_1 \otimes f_1^*)_x. \tag{61}$$
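To make the origin of the $\tau$-flow equations in (57) more transparent, here is a short derivation sketch. It rests on an assumption about notation that this part of the text does not restate: an operator $a\,\xi^{-1} \otimes b^*$ is taken to act on a function $g$ as $a\,\partial_x^{-1}(b^{*T} g)$. Under that assumption, the equation (53) with $s = k = 1$ and $M_1^1 = f_1 \xi^{-1} \otimes f_1^*$ gives:

```latex
% Assumption (not restated in the source): (a \xi^{-1} \otimes b^*) g := a\,\partial_x^{-1}(b^{*T} g)
\begin{aligned}
f_{1,\tau} &= (l - c\,\partial/\partial y - M_1^1) f_1
            = f_{1,x} - c f_{1,y} + f_2\,\partial_x^{-1}(f_2^{*T} f_1)
            = f_{1,x} - c f_{1,y} + f_2 u, \\
f_{2,\tau} &= -M_1^1 f_2
            = -f_1\,\partial_x^{-1}(f_1^{*T} f_2)
            = -f_1 \bar{u},
\end{aligned}
\qquad u_x = f_1^{T} f_2^{*}, \quad \bar{u}_x = f_1^{*T} f_2 .
```

In the first line the term $f_1\,\partial_x^{-1}(f_1^{*T} f_1)$ coming from $l$ cancels against $M_1^1 f_1$, which is why only the $f_2$ contribution survives.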
In the case $\dim H = 1$ the Lax type representation (34), (59) and (60) for the above-mentioned (3+1)-dimensional generalization (57), (58) and (61) of the Davey-Stewartson system [15, 29, 33] has the equivalent matrix form
$$\frac{dF}{dx} = \begin{pmatrix} 0 & 0 & f_1^* \\ 0 & 0 & f_2^* \\ -f_1 & -f_2 & \lambda + c\,\partial/\partial y \end{pmatrix} F, \qquad
\frac{dF}{d\tau} = \begin{pmatrix} -(\lambda + c\,\partial/\partial y) & \bar{u} & f_1^* \\ -u & 0 & 0 \\ -f_1 & 0 & 0 \end{pmatrix} F, \qquad
\frac{dF}{dT} = C F,$$
where $F = (F^1, F^2, F^3 = f) \in W(S \times S; \mathbb{C}^3)$, $C := \{C_{mn} \in gl(3; \mathbb{C}) : m, n = \overline{1,3}\}$, and
$$\begin{aligned}
C_{11} &= -(\lambda + c\,\partial/\partial y)^2 - \bar{u} u - 2 f_1 f_1^*, \\
C_{12} &= -f_1 f_2^* - (\lambda + c\,\partial/\partial y)\bar{u} - \bar{u}_\tau, \\
C_{13} &= 2\big( (\lambda + c\,\partial/\partial y) f_1^* - f_{1,x}^* \big) - \bar{u} f_2^*, \\
C_{21} &= -(\lambda + c\,\partial/\partial y) u - u_\tau - f_1 f_2^*, \\
C_{22} &= -f_2 f_2^* + \bar{u} u, \\
C_{23} &= (\lambda + c\,\partial/\partial y) f_2^* - f_{2,x}^* + u f_1^*, \\
C_{31} &= -(\lambda + c\,\partial/\partial y) f_1 - f_{1,x} - f_{1,\tau}, \\
C_{32} &= -(\lambda + c\,\partial/\partial y) f_2 - f_{2,x} + \bar{u} f_1, \\
C_{33} &= (\lambda + c\,\partial/\partial y)^2 + w - f_2 f_2^*,
\end{aligned}$$
to which one can apply the standard inverse spectral transform method [7, 11, 33]. The results obtained above can also be used for constructing a wide class of integrable (3+1)-dimensional nonlinear dynamical systems with triple Lax type linearizations [15].
7 Conclusions
As is well known, until now there have existed only two sufficiently regular algorithmic approaches [3, 16, 34] to constructing integrable multi-dimensional (mainly 2+1) dynamical systems on functional spaces. Our approach, devised in this work, is substantially based on the results obtained previously in [14, 20] and explains completely the analytical properties of the three-dimensional flows studied earlier in [3, 7, 33, 36]. As the key points of our approach we used the canonical Hamiltonian structures naturally existing on the augmented phase space and the related Bäcklund transformation, which preserves the Casimir invariants of a chosen matrix integro-differential Lie algebra. The latter gives rise to some additional Hamiltonian properties of the considered augmented evolution flows, studied before in [3, 16, 20] by making use of the standard inverse scattering transform [6, 11, 33] and the formal symmetry reduction for the KP-hierarchy [11, 36] of commuting operator flows. As one can convince oneself by analyzing the structure of the Bäcklund type transformation (14), it strongly depends on the type of ad-invariant scalar product chosen on the operator Lie algebra $\hat{G}$ and on its Lie algebra decomposition like (2). Since in general there exist other possibilities of choosing such decompositions and ad-invariant scalar products on $\hat{G}$, they naturally give rise to other types of corresponding Bäcklund transformations, which can be a subject of a separate investigation. Let us mention here only the choice of a scalar product related to the operator Lie algebra $\hat{G}$ centrally extended by means of the standard Maurer-Cartan two-cocycle [16, 18, 25], which brings about new types of multi-dimensional integrable flows.
The last aspect of the Bäcklund approach to constructing Lax type integrable flows and their partial solutions that is worth mentioning is related to Darboux-Bäcklund type transformations [9, 19, 21, 23, 24, 28] and their new generalization recently developed in [15, 37]. They give rise to very effective procedures for constructing multi-dimensional integrable flows on functional spaces with an arbitrary number of independent variables, simultaneously delivering a wide class of exact analytical solutions depending on many constant parameters, which may prove useful for diverse applications in applied sciences. All of the above aspects of Bäcklund type transformations can be treated as separate investigations, giving rise to new directions in the theory of multi-dimensional evolution flows and their integrability.

Several Lie-algebraic approaches [5, 15, 18, 26, 30] to constructing Lax integrable multi-dimensional (mainly 2+1) nonlinear dynamical systems on functional manifolds and their supersymmetric generalizations are well known. In this paper we developed a method of introducing one more commuting variable into Lax type integrable (2+1)-dimensional dynamical systems arising on the dual space to the centrally extended matrix Lie algebra of integro-differential operators. It is based on the natural hierarchy of additional symmetries [10, 13, 15, 20, 27]. The resulting integrable (3+1)-dimensional dynamical systems obtained by means of this method possess an infinite sequence of conservation laws and related triple Lax type linearizations. Owing to the latter property, their soliton type solutions can be found by means of either the standard inverse spectral transform method [7, 11] or Darboux-Bäcklund transformations [9, 19, 21, 22, 28]. The structure of the constructed Lie-Bäcklund transformation (47), a key point of the devised approach, strongly depends on the ad-invariant scalar product chosen for the operator Lie algebra $\hat{G}$ and on a suitable Lie algebra decomposition (see [3, 16]). Since there exist other possibilities of choosing the corresponding ad-invariant scalar products on $\hat{G}$, such decompositions will naturally give rise to other Bäcklund transformations. In further work the method is planned to be developed for some special centrally extended Lie algebra of super-integro-differential operators [8, 13].
Acknowledgements

The authors thank the Organizing Committee of the International Conference on Mathematical Analysis and Differential Equations, held in Uzhgorod in September 2006, for the invitation to present the results of the article at the Conference. One of the authors (A.P.) cordially thanks Prof. V.G. Samoylenko (T. Shevchenko National University, Kyiv) for fruitful discussions of some aspects of the developed Lie-Bäcklund transformations and their applications. The authors also want to thank Profs. L.P. Nizhnik (Institute of Mathematics of NAS, Kyiv) and P.I. Holod (UKMA, Kyiv) for valuable comments on diverse problems related to the results presented in the article. Last but not least, thanks are addressed both to the Referees, who read the article very thoroughly and made very useful suggestions and comments which improved its style and exposition, and to our friends Profs. V.V. Gafiychuk (Polytechnical University, Kraków) and M.M. Prytula (I.Ya. Franko National University, Lviv) for their permanent support and help in editing the article.
References

[1] M. Adler: “On a Trace Functional for Formal Pseudo-Differential Operators and the Symplectic Structures of a Korteweg-de Vries Equation”, Invent. Math., Vol. 50(2), (1979), pp. 219–248.
[2] V.I. Arnold: Mathematical Methods of Classical Mechanics, Nauka, Moscow, 1989 (in Russian).
[3] M. Blaszak: Multi-Hamiltonian Theory of Dynamical Systems, Springer-Verlag, Berlin-Heidelberg, 1998.
[4] L. Dickey: Soliton equations and Hamiltonian systems, World Scientific, Vol. 42, 1991.
[5] O.Ye. Hentosh: “Lax Integrable Supersymmetric Hierarchies on Extended Phase Spaces”, Symmetry, Integrability and Geometry: Methods and Applications, Vol. 1, (2005), p. 11 (to be published).
[6] P.D. Lax: “Periodic Solutions of the KdV Equation”, Commun. Pure Appl. Math., Vol. 28, (1975), pp. 141–188.
[7] S.V. Manakov: “The Method of Inverse Scattering Problem and Two-Dimensional Evolution Equations”, Adv. Math. Sci., Vol. 31(5), (1976), pp. 245–246.
[8] Yu.I. Manin and A.O. Radul: “A Supersymmetric Extension of the Kadomtsev-Petviashvili Hierarchy”, Comm. Math. Phys., Vol. 28, (1985), pp. 65–77.
[9] V.B. Matveev and M.I. Salle: Darboux-Bäcklund transformations and applications, Springer, New York, 1993.
[10] E. Nissimov and S. Pacheva: “Symmetries of Supersymmetric Integrable Hierarchies of KP Type”, J. Math. Phys., Vol. 43, (2002), pp. 2547–2586.
[11] S.P. Novikov (Ed.): Soliton Theory: Method of the Inverse Problem, Nauka, Moscow, 1980 (in Russian).
[12] W. Oevel: “R-Structures, Yang-Baxter Equations and Related Involution Theorems”, J. Math. Phys., Vol. 30, (1989), pp. 1140–1149.
[13] W. Oevel and Z. Popowicz: “The bi-Hamiltonian Structure of Fully Supersymmetric Korteweg-de Vries Systems”, Comm. Math. Phys., Vol. 139, (1991), pp. 441–460.
[14] W. Oevel and W. Strampp: “Constrained KP Hierarchy and bi-Hamiltonian Structures”, Comm. Math. Phys., Vol. 157, (1993), pp. 51–81.
[15] A.K. Prykarpatsky and O.Ye. Hentosh: “The Lie-Algebraic Structure of (2+1)-Dimensional Lax Type Integrable Nonlinear Dynamical Systems”, Ukrainian Math. J., Vol. 56, (2004), pp. 939–946.
[16] A.K. Prykarpatsky and I.V. Mykytiuk: Algebraic Integrability of Nonlinear Dynamical Systems on Manifolds: Classical and Quantum Aspects, Kluwer Academic Publishers, Dordrecht-Boston-London, 1998.
[17] A.K. Prykarpatsky and D. Blackmore: “Versal deformations of a Dirac type differential operator”, J. Nonlin. Math. Phys., Vol. 6(3), (1999), pp. 246–254.
[18] A.K. Prykarpatsky, V.Hr. Samoilenko, R.I. Andrushkiw, Yu.O. Mitropolsky and M.M. Prytula: “Algebraic Structure of the Gradient-Holonomic Algorithm for Lax Integrable Nonlinear Systems. I”, J. Math. Phys., Vol. 35, (1994), pp. 1763–1777.
[19] A.M. Samoilenko and A.K. Prykarpatsky: “The spectral and differential-geometric aspects of a generalized de Rham-Hodge theory related with Delsarte transmutation operators in multi-dimension and its applications to spectral and soliton problems”, Nonlinear Analysis TMA, Vol. 65, (2006), pp. 395–432.
[20] Y.A. Prykarpatsky: “The structure of integrable Lax type flows on nonlocal manifolds: dynamical systems with sources”, Math. Methods Phys.-Mech. Fields, Vol. 40(4), (1997), pp. 106–115.
[21] A.M. Samoilenko, A.K. Prykarpatsky and V.G. Samoylenko: “The structure of Darboux-type binary transformations and their applications in soliton theory”, Ukr. Math. J., Vol. 55(12), (2003), pp. 1704–1723 (in Ukrainian).
[22] Y.A. Prykarpatsky, A.M. Samoilenko and A.K. Prykarpatsky: “The multidimensional Delsarte transmutation operators, their differential-geometric structure and applications. Part 1”, Opuscula Math., Vol. 23, (2003), pp. 71–80.
[23] Y.A. Prykarpatsky, A.M. Samoilenko and A.K. Prykarpatsky: “The de Rham-Hodge-Skrypnik theory of Delsarte transmutation operators in multi-dimension and its applications”, Rep. Math. Phys., Vol. 55(3), (2005), pp. 351–363.
[24] J. Golenia, Y.A. Prykarpatsky, A.M. Samoilenko and A.K. Prykarpatsky: “The general differential-geometric structure of multi-dimensional Delsarte transmutation operators in parametric functional spaces and their applications in soliton theory. Part 2”, Opuscula Math., Vol. 24, (2004), pp. 71–83.
[25] A.G. Reiman and M.A. Semenov-Tian-Shansky: The Integrable Systems, Computer Science Institute Publisher, Moscow-Izhevsk, 2003 (in Russian).
[26] A.G. Reiman and M.A. Semenov-Tian-Shansky: “The Hamiltonian Structure of Kadomtsev-Petviashvili Type Equations”, In: LOMI Proceedings, Vol. 164, Nauka, Leningrad, 1987, pp. 212–227 (in Russian).
[27] A.M. Samoilenko and Y.A. Prykarpatsky: Algebraic-analytic aspects of completely integrable dynamical systems and their perturbations, Institute of Mathematics Publisher, Vol. 41, Kyiv, 2002 (in Ukrainian).
[28] A.M. Samoilenko, A.K. Prykarpatsky and Y.A. Prykarpatsky: “The spectral and differential-geometric aspects of a generalized de Rham-Hodge theory related with Delsarte transmutation operators in multidimension and its applications to spectral and soliton problems”, Nonlinear Anal., Vol. 65, (2006), pp. 395–432.
[29] A.M. Samoilenko, V.G. Samoilenko and Yu.M. Sydorenko: “The Kadomtsev-Petviashvili Equation Hierarchy with Nonlocal Constraints: Multi-Dimensional Generalizations and Exact Solutions of Reduced Systems”, Ukrainian Math. J., Vol. 49, (1999), pp. 78–97 (in Ukrainian).
[30] M. Sato: “Soliton Equations as Dynamical Systems on Infinite Grassmann Manifolds”, RIMS Kokyuroku, Kyoto Univ., Vol. 439, (1981), pp. 30–40.
[31] M.A. Semenov-Tian-Shansky: “What is the R-Matrix”, Funct. Anal. Appl., Vol. 17(4), (1983), pp. 17–33 (in Russian).
[32] L.A. Takhtadjian and L.D. Faddeev: Hamiltonian Approach in Soliton Theory, Springer, USA, 1986.
[33] B.E. Zakharov: “Integrable Systems in Multi-Dimensional Spaces”, Lect. Notes Phys., Vol. 153, (1983), pp. 190–216.
[34] L.P. Nizhnik: Inverse Scattering Problems for Hyperbolic Equations, Nauk. Dumka Publ., Kiev, 1991 (in Russian).
[35] M.M. Prytula: “Lie-algebraic structure of nonlinear dynamical systems on augmented functional manifolds”, Ukrainian Math. Zh., Vol. 49(11), (1997), pp. 1512–1518.
[36] B. Konopelchenko, Yu. Sidorenko and W. Strampp: “(1+1)-dimensional integrable systems as symmetry constraints of (2+1)-dimensional systems”, Phys. Lett. A, Vol. 157, (1991), pp. 17–21.
[37] J.C.C. Nimmo: “Darboux transformations from reductions of the KP-hierarchy”, In: V.G. Makhankov, A.R. Bishop and D.D. Holm: Nonlinear evolution equations and dynamical systems (NEEDS’94), World Scient. Publ., 1994.
DOI: 10.2478/s11533-006-0043-4
Research article
CEJM 5(1) 2007 105–133

Differential invariants of generic hyperbolic Monge–Ampère equations

Michal Marvan¹∗, Alexandre M. Vinogradov²,³†, Valery A. Yumaguzhin⁴‡

1 Mathematical Institute, Silesian University in Opava, 746 01 Opava, Czech Republic
2 Dipartimento di Matematica ed Informatica, University of Salerno, 84084 Fisciano (SA), Italy
3 Istituto Nazionale di Fisica Nucleare, Napoli-Salerno, Italy
4 Program Systems Institute of RAS, 152020, Pereslavl’-Zalesskiy, m. Botik, Russia

Received 11 April 2006; accepted 3 October 2006

Abstract: In this paper basic differential invariants of generic hyperbolic Monge–Ampère equations with respect to contact transformations are constructed and the equivalence problem for these equations is solved.
© Versita Warsaw and Springer-Verlag Berlin Heidelberg. All rights reserved.
Keywords: Monge–Ampère equation, contact transformation, Frölicher–Nijenhuis bracket, scalar differential invariant
MSC (2000): 58J45, 58J70
∗ E-mail: [email protected]
† E-mail: [email protected]
‡ E-mail: yuma@diffiety.botik.ru

1 Introduction

With this paper we start a systematic study of differential invariants of Monge–Ampère equations, with our objective being the classification problem, methods of integration, and other applications. Complete proofs of the results announced in [16] are presented. We are interested in the classical case of two independent variables. The Monge–Ampère
M. Marvan et al. / Central European Journal of Mathematics 5(1) 2007 105–133
equations merit special attention due to a large spectrum of various applications, first of all in differential geometry and mathematical physics. Moreover, they form a natural testing area for new methods emerging in the modern theory of nonlinear PDEs. In spite of more than 200 years of history of Monge–Ampère equations and numerous publications devoted to them, it would be an exaggeration to say that their nature is well understood. An important success was the establishment of existence and uniqueness theorems by Lewy and others (see [3, 10] for local aspects and [22] for global ones). The classical Monge integration method was modernized by Matsuda [17, 18], Morimoto [20], and others. Our interest in differential invariants is motivated not only by the classification problem but, no less, by the hope that they could illuminate many aspects of the theory of Monge–Ampère equations. According to [24] (see also [1]), scalar differential invariants provide a key to solving the classification problem for any kind of geometrical structure. In fact, geometrical structures of a given type are classified by solutions of a naturally associated classifying (differential) equation, which describes the “family ties” connecting the corresponding scalar differential invariants. More exactly, scalar differential invariants are smooth functions on the classifying diffiety, which is the infinite prolongation of the classifying equation. This diffiety generally has singularities, and its singular strata classify those geometrical structures that possess nontrivial symmetries. Each of these strata is also an infinitely prolonged differential equation in a smaller number of independent variables. For instance, homogeneous structures correspond to the zero-dimensional case. So the classification problem consists of a complete description of all strata composing the classifying diffiety, and therefore involves a complete symmetry analysis of the geometric structures under consideration.
The interested reader will find an illustration of the above in [25], where plane 3-webs, a rather simple geometrical structure, are considered. The classification problem for Monge–Ampère equations dates back to Sophus Lie. For modern proofs of Lie’s theorems and classification problems for various strata of Monge–Ampère equations see, e.g., [6–9, 13–15, 21, 23] and references therein. Directly using the geometry of jet bundles, in this paper we interpret a hyperbolic Monge–Ampère equation as a pair of 2-dimensional, skew-orthogonal, non-lagrangian subdistributions of the contact distribution on a 5-dimensional contact manifold. This pair of subdistributions was considered by other authors from a different point of view; see, for instance, [12, 14, 19]. We look for more than just scalar differential invariants of Monge–Ampère equations with respect to the group of contact transformations. Here, we limit ourselves to the case of generic hyperbolic equations, which is motivated by two reasons. First, the study of singular strata benefits very much from the knowledge of the generic one. Second, for hyperbolic equations, differential invariants are more easily visible due to the existence of bicharacteristics. Differential invariants found in this paper give a solution of the classification problem for generic hyperbolic equations. This solution requires substantial computer support in the analysis of concrete cases, and further work is necessary to improve its efficiency. Differential invariants for elliptic and parabolic Monge–Ampère equations can be obtained more or less straightforwardly by following the approach developed in this paper. This idea and the study of singular strata will be the subject of subsequent publications.
2 Preliminaries
Below, all manifolds and maps are supposed to be smooth. By $[f]^k_p$, $k = 0, 1, 2, \ldots, \infty$, we denote the $k$-jet of a map $f$ at a point $p$. $\mathbb{R}$ stands for the field of real numbers, and $\mathbb{R}^n$ for the $n$-dimensional arithmetic space.
2.1 Jet bundles

Here we recall the necessary definitions and facts about jet bundles, see [4, 5]. Let $M$ be an $n$-dimensional manifold, $E$ an $(n+m)$-dimensional manifold and $\pi : E \to M$ a fiber bundle. By
$$\pi_k : J^k\pi \to M, \qquad \pi_k : [S]^k_p \mapsto p, \qquad k = 0, 1, 2, \ldots$$
we denote the bundle of all $k$-jets of sections of $\pi$. For any $l > m \geq 0$, the natural projection is defined as
$$\pi_{l,m} : J^l\pi \to J^m\pi, \qquad \pi_{l,m} : [S]^l_p \mapsto [S]^m_p.$$
Any section $S$ of $\pi$ generates the section $j_k S$ of the bundle $\pi_k$ by the formula $j_k S : p \mapsto [S]^k_p$. Put $L^k_S = \operatorname{Im} j_k S$. Let $\theta_{k+1}$ be an arbitrary point of $J^{k+1}\pi$, $\theta_k = \pi_{k+1,k}(\theta_{k+1})$, and $T_{\theta_k}(J^k\pi)$ the tangent space to $J^k\pi$ at the point $\theta_k$. Then $\theta_{k+1}$ defines the subspace $K_{\theta_{k+1}} \subset T_{\theta_k}(J^k\pi)$ by the formula
$$K_{\theta_{k+1}} = T_{\theta_k}(L^k_S),$$
where $S$ is any section with $[S]^{k+1}_p = \theta_{k+1}$. Clearly, $\theta_{k+1}$ is identified with $K_{\theta_{k+1}}$. It is easy to prove that
$$T_{\theta_k}(J^k\pi) = K_{\theta_{k+1}} \oplus T_{\theta_k}(\pi_k^{-1}(p)). \tag{1}$$
Consider all submanifolds of the form $L^k_S$ containing $\theta_k$. The subspace spanned by their tangent spaces $T_{\theta_k}(L^k_S)$ is denoted by $C(\theta_k)$ and is called the Cartan plane at $\theta_k$. The distribution $C_k : \theta_k \mapsto C(\theta_k)$ is called the Cartan distribution on $J^k\pi$. The distribution $C_k$, $k \geq 1$, can be defined as the kernel of the Cartan form
$$U_k = \mathrm{pr}_2 \circ (\pi_{k,k-1})_*,$$
where $\mathrm{pr}_2 : T_{\theta_{k-1}}(J^{k-1}\pi) \to T_{\theta_{k-1}}(\pi_{k-1}^{-1}(p))$ is the projection generated by the direct sum decomposition (1).
2.2 The contact structure

Consider the trivial bundle
$$\tau : \mathbb{R}^2 \times \mathbb{R} \to \mathbb{R}^2, \qquad \tau : (x, y, z) \mapsto (x, y).$$
By $x, y, z, p = z_x, q = z_y, r = z_{xx}, s = z_{xy}, t = z_{yy}$ we denote the standard coordinates in $J^2\tau$. The Cartan distribution $C_1$ on $J^1\tau$ is identical to the contact structure on $J^1\tau$. In the standard coordinates, the corresponding contact 1-form $U_1$ has the canonical form
$$U_1 = dz - p\,dx - q\,dy.$$
A diffeomorphism $\varphi : J^1\tau \to J^1\tau$ is called a contact transformation if it preserves the Cartan distribution. Obviously, a diffeomorphism $\varphi$ is a contact transformation iff there exists a nowhere vanishing function $\lambda$ such that $\varphi^*(U_1) = \lambda U_1$. Any contact transformation $\varphi$ can be lifted to the diffeomorphism
$$\varphi^{(1)}_\tau : J^2\tau \to J^2\tau$$
by the formula
$$\varphi^{(1)}_\tau : \theta_2 \equiv K_{\theta_2} \mapsto \varphi_*(K_{\theta_2}) \equiv \tilde{\theta}_2 = \varphi^{(1)}_\tau(\theta_2).$$
If $\varphi$ is defined on an open set $V \subset J^1\tau$, then $\varphi^{(1)}_\tau$ is defined on an open, everywhere dense subset of $\tau_{2,1}^{-1}(V)$. A vector field $Z$ in $J^1\tau$ is a contact vector field if its flow $\varphi_t$ consists of contact transformations. Clearly, $Z$ is a contact vector field iff there exists a function $\lambda$ such that $L_Z(U_1) = \lambda U_1$, where $L_Z$ is the Lie derivative with respect to $Z$. There exists a natural one-to-one correspondence between the set of all contact vector fields in $J^1\tau$ and the set of all functions in $J^1\tau$. It is defined by the formula $Z \mapsto f = Z \,\lrcorner\, U_1$. The function $f = Z \,\lrcorner\, U_1$ is called the generating function of the contact vector field $Z$. The contact vector field $Z$ corresponding to $f$ is denoted by $Z_f$. In standard coordinates, the field $Z_f$ is given by the formula
$$Z_f = -f_p \frac{\partial}{\partial x} - f_q \frac{\partial}{\partial y} + (f - p f_p - q f_q) \frac{\partial}{\partial z} + (f_x + p f_z) \frac{\partial}{\partial p} + (f_y + q f_z) \frac{\partial}{\partial q}. \tag{2}$$
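The formula (2) can be checked numerically at a single point. The sketch below is only an illustration: the function names, the dictionary layout and the sample generating function $f = pq + z$ are assumptions made for this example and do not come from the source. It evaluates the components of $Z_f$ from the value and the partial derivatives of $f$ at a point, and then confirms that contracting $Z_f$ with $U_1 = dz - p\,dx - q\,dy$ returns $f$.

```python
def contact_vector_field(f, grad, p, q):
    """Components of Z_f in the basis (d/dx, d/dy, d/dz, d/dp, d/dq), per formula (2).

    f    -- value of the generating function at the point
    grad -- dict with the partial derivatives f_x, f_y, f_z, f_p, f_q at the point
    p, q -- the jet coordinates p, q of the point
    """
    fx, fy, fz, fp, fq = (grad[k] for k in ("x", "y", "z", "p", "q"))
    return {
        "x": -fp,
        "y": -fq,
        "z": f - p * fp - q * fq,
        "p": fx + p * fz,
        "q": fy + q * fz,
    }


def contract_with_U1(Z, p, q):
    """Value of the contact form U_1 = dz - p dx - q dy on the vector Z."""
    return Z["z"] - p * Z["x"] - q * Z["y"]


# Hypothetical sample: f = p*q + z at the point (x, y, z, p, q) = (1, 2, 3, 4, 5).
x, y, z, p, q = 1, 2, 3, 4, 5
f_value = p * q + z
f_grad = {"x": 0, "y": 0, "z": 1, "p": q, "q": p}

Z = contact_vector_field(f_value, f_grad, p, q)
recovered = contract_with_U1(Z, p, q)
```

The contraction reproduces $f$ identically, because the terms $p f_p$ and $q f_q$ in the $\partial/\partial z$ component cancel against the contributions of the $\partial/\partial x$ and $\partial/\partial y$ components.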
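2.3 Operations over vector-valued forms

Let $M$ be a smooth $n$-dimensional manifold, $\Lambda^i(M)$ the $C^\infty(M)$-module of $i$-forms on $M$ and $D(M)$ the $C^\infty(M)$-module of vector fields on $M$. Let $\alpha \in \Lambda^k(M)$, $\beta \in \Lambda^r(M)$, and $X, Y \in D(M)$. Then the Frölicher–Nijenhuis bracket $[[\cdot, \cdot]]$ of the vector-valued forms $\alpha \otimes X$ and $\beta \otimes Y$ is defined by the formula
$$[[\alpha \otimes X, \beta \otimes Y]] = \alpha \wedge \beta \otimes [X, Y] + \alpha \wedge X(\beta) \otimes Y - Y(\alpha) \wedge \beta \otimes X + (-1)^k d\alpha \wedge (X \,\lrcorner\, \beta) \otimes Y - (-1)^k (Y \,\lrcorner\, \alpha) \wedge d\beta \otimes X,$$
see [2]. The contraction $\lrcorner$ of the forms $\alpha \otimes X$ and $\beta \otimes Y$ is defined by the formula
$$(\alpha \otimes X) \,\lrcorner\, (\beta \otimes Y) = \alpha \wedge (X \,\lrcorner\, \beta) \otimes Y.$$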
2.4 Projectors and their curvatures

The following simple construction allows one to associate a vector-valued 2-form with a projector. Namely, let $P, Q$ be endomorphisms of the $C^\infty(M)$-module $D(M)$ such that $QP = 0$. Then
$$\Omega_{Q,P}(X, Y) = Q[P(X), P(Y)], \qquad X, Y \in D(M), \tag{3}$$
is obviously skew-symmetric and $C^\infty(M)$-bilinear, i.e., a vector-valued form. More precisely, it takes values in $\operatorname{Im} Q \subset D(M)$. If $P : D(M) \to D(M)$ is a projector, i.e., $P^2 = P$, then the associated curvature form of $P$ is defined to be
$$R_P = \Omega_{I-P,P} \tag{4}$$
with $I = \mathrm{id}_{D(M)}$.
3 Hyperbolic Monge–Ampère equations
3.1 Monge–Ampère equations

The Monge–Ampère equation is a partial differential equation of the form
$$N(z_{xx} z_{yy} - z_{xy}^2) + A z_{xx} + B z_{xy} + C z_{yy} + D = 0, \tag{5}$$
where $x, y$ are independent variables, $z$ is a dependent variable, $z_{xx} = \partial^2 z/\partial x^2$, $z_{xy} = \partial^2 z/\partial x\,\partial y$, $z_{yy} = \partial^2 z/\partial y^2$, and the coefficients $N, A, B, C, D$ are functions of $x, y, z, z_x = \partial z/\partial x$ and $z_y = \partial z/\partial y$. We identify equation (5) with the submanifold $E$ of the jet bundle $J^2\tau$ determined by the equation
$$N(rt - s^2) + Ar + Bs + Ct + D = 0. \tag{6}$$
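The identification of equation (5) with the submanifold (6) can be made concrete with a small sketch. Everything named below is an assumption introduced for illustration: the helper names, the tuple layout of a point of $J^2\tau$, and the sample equation $z_{xx} z_{yy} - z_{xy}^2 + 1 = 0$ with its solution $z = xy$. The code evaluates the left-hand side of (6) at the 2-jet of that section and finds that the jet lies on $E$.

```python
def monge_ampere_lhs(coeffs, jet2):
    """Left-hand side of equation (6) at a point of J^2(tau).

    coeffs -- tuple of callables (N, A, B, C, D), each a function of (x, y, z, p, q)
    jet2   -- tuple (x, y, z, p, q, r, s, t) of standard coordinates
    """
    x, y, z, p, q, r, s, t = jet2
    N, A, B, C, D = (c(x, y, z, p, q) for c in coeffs)
    return N * (r * t - s * s) + A * r + B * s + C * t + D


def jet2_of_xy(x, y):
    """2-jet at (x, y) of the section z = x*y: p = y, q = x, r = 0, s = 1, t = 0."""
    return (x, y, x * y, y, x, 0, 1, 0)


def const(value):
    """Constant coefficient, viewed as a function on J^1(tau)."""
    return lambda x, y, z, p, q: value


# Hypothetical sample equation: N = 1, A = B = C = 0, D = 1.
coeffs = (const(1), const(0), const(0), const(0), const(1))
value_on_solution = monge_ampere_lhs(coeffs, jet2_of_xy(2, 3))
value_off_solution = monge_ampere_lhs(coeffs, (2, 3, 6, 3, 2, 1, 0, 1))
```

A point of $J^2\tau$ belongs to $E$ exactly when this value is zero, so the second evaluation, taken at a jet with $r = t = 1$, $s = 0$, gives a nonzero result.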
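Obviously, $\tau_{2,1}(E) = J^1\tau$. Let $\theta_2 \in E$, $\tau_{2,1}(\theta_2) = \theta_1$, and let $F_{\theta_1}$ be the fiber of the projection $\tau_{2,1}$ over the point $\theta_1 \in J^1\tau$. Then the subspace
$$\mathrm{Smbl}_{\theta_2} E = T_{\theta_2} E \cap T_{\theta_2} F_{\theta_1},$$
where $T_{\theta_2} E$ is the tangent space to $E$ at $\theta_2$, is called the symbol of the equation $E$ at the point $\theta_2 \in E$. In terms of standard coordinates, $\mathrm{Smbl}_{\theta_2} E$ is described by the linear equation
$$N(t\tilde{r} + r\tilde{t} - 2s\tilde{s}) + A\tilde{r} + B\tilde{s} + C\tilde{t} = 0, \tag{7}$$
where $\tilde{r}, \tilde{s}, \tilde{t}$ are the standard coordinates in $T_{\theta_2} F_{\theta_1}$ generated by the standard coordinates on $J^2\tau$. A point $\theta_2 \in E$ can be elliptic, parabolic, or hyperbolic. To introduce these notions, let us consider a one-dimensional subspace $P \subset C(\theta_1)$ such that $(\tau_1)_* P \neq 0$. By definition, put
$$l(P) = \{ \theta_2 \in F_{\theta_1} \mid P \subset K_{\theta_2} \}.$$
The submanifold $l(P)$ is called a 1-ray. In terms of standard coordinates, let $\theta_1 = (x, y, z, p, q)$, $P = \langle v \rangle$ and
$$v = \zeta_1 \frac{\partial}{\partial x} + \zeta_2 \frac{\partial}{\partial y} + \mu \frac{\partial}{\partial z} + \eta_1 \frac{\partial}{\partial p} + \eta_2 \frac{\partial}{\partial q}. \tag{8}$$
Then $(\tau_1)_* P \neq 0$ means that
$$(\zeta_1, \zeta_2) \neq (0, 0), \tag{9}$$
$v \in C(\theta_1)$ means that
$$\mu = \zeta_1 p + \zeta_2 q, \tag{10}$$
and $P \subset K_{\theta_2}$ means that
$$\eta_1 = \zeta_1 r + \zeta_2 s, \qquad \eta_2 = \zeta_1 s + \zeta_2 t, \tag{11}$$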
where $r, s, t$ are the standard coordinates of $\theta_2$ in the fiber $F_{\theta_1}$. From system (11), we see that $l(P)$ is an affine straight line in $F_{\theta_1}$. By $l_{\theta_2}(P)$ we denote the tangent space $T_{\theta_2} l(P)$ to $l(P)$ at the point $\theta_2 \in l(P)$. We call it a 1-ray subspace. In terms of the standard coordinates $\tilde{r}, \tilde{s}, \tilde{t}$ in $T_{\theta_2} F_{\theta_1}$, vectors of $l_{\theta_2}(P)$ satisfy
$$\zeta_1 \tilde{r} + \zeta_2 \tilde{s} = 0, \qquad \zeta_1 \tilde{s} + \zeta_2 \tilde{t} = 0. \tag{12}$$
Obviously, $l_{\theta_2}(P)$ is spanned by the vector
$$(\tilde{r}, \tilde{s}, \tilde{t}) = (\zeta_2^2, -\zeta_1 \zeta_2, \zeta_1^2). \tag{13}$$
Taking into account (9), we observe that all 1-ray subspaces form the cone $V_{\theta_2} = \{\, \tilde r\tilde t - \tilde s^2 = 0 \,\}$ in the tangent space $T_{\theta_2}F_{\theta_1}$. This cone is called the cone of singular square forms. Obviously, the intersection $\mathrm{Smbl}_{\theta_2}\mathcal{E} \cap V_{\theta_2}$ is either zero, or a single 1-ray subspace, or two 1-ray subspaces. Correspondingly, the point $\theta_2 \in \mathcal{E}$ is then called elliptic, parabolic or hyperbolic. It is not difficult to prove that a contact transformation takes an elliptic, parabolic, or hyperbolic point to an elliptic, parabolic, or hyperbolic point, respectively.

The equation $\mathcal{E}$ is called elliptic, parabolic or hyperbolic if all its points are elliptic, parabolic or hyperbolic, respectively. In this work, we consider hyperbolic Monge–Ampère equations only. It is easy to see that $\mathcal{E}$ is hyperbolic iff its coefficients satisfy the condition
\[ \Delta = B^2 - 4AC + 4ND > 0 . \tag{14} \]
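Condition (14) is easy to evaluate pointwise. The sketch below is a minimal illustration in Python: the text states only the hyperbolic case $\Delta > 0$, and the labels for $\Delta < 0$ and $\Delta = 0$ follow the usual convention, which is an assumption here. The function names are hypothetical, and the coefficients are taken as numbers, i.e. as values of $N, A, B, C, D$ at one point.

```python
def discriminant(N, A, B, C, D):
    """Delta = B^2 - 4AC + 4ND from condition (14)."""
    return B * B - 4 * A * C + 4 * N * D


def point_type(N, A, B, C, D):
    """Classify by the sign of Delta.

    Only 'hyperbolic' (Delta > 0) is stated in the text; the other
    two labels follow the standard convention (assumed).
    """
    delta = discriminant(N, A, B, C, D)
    if delta > 0:
        return "hyperbolic"
    if delta < 0:
        return "elliptic"
    return "parabolic"


# Sample equations written in the form (6):
print(point_type(0, 0, 1, 0, 0))   # z_xy = 0
print(point_type(0, 1, 0, 1, 0))   # z_xx + z_yy = 0
print(point_type(1, 0, 0, 0, 1))   # z_xx z_yy - z_xy^2 = -1
```

For instance, the Laplace equation has symbol $\tilde r + \tilde t = 0$, which meets the cone $\tilde r\tilde t - \tilde s^2 = 0$ only at zero, in agreement with $\Delta = -4 < 0$.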
3.2 Skew-orthogonal distributions

Directly from the geometry of jet bundles we derive the interpretation of a hyperbolic Monge–Ampère equation as a pair of skew-orthogonal two-dimensional distributions in the Cartan distribution on $J^1\tau$. See [12, 14, 19] for an alternative approach.

Let $\theta_1$ be an arbitrary point of $J^1\tau$. By $Q_{\theta_1}$ we denote the union of all one-dimensional subspaces $P$ of $C(\theta_1)$ such that $(\tau_1)_* P \neq 0$ and the 1-ray $l(P)$ is tangent to $\mathcal{E}$ at least at one point.

Proposition 3.1. Let $\mathcal{E}$ be a hyperbolic Monge–Ampère equation. Then $Q_{\theta_1}$ is the union of two-dimensional subspaces $D^1_{\mathcal{E}}(\theta_1)$ and $D^2_{\mathcal{E}}(\theta_1)$ of the Cartan plane $C(\theta_1)$, so that
(1) $C(\theta_1) = D^1_{\mathcal{E}}(\theta_1) \oplus D^2_{\mathcal{E}}(\theta_1)$,
(2) $D^1_{\mathcal{E}}(\theta_1)$ and $D^2_{\mathcal{E}}(\theta_1)$ are skew-orthogonal with respect to the symplectic form $dU_1 = dx \wedge dp + dy \wedge dq$ on $C$.

Proof. We prove this proposition for Monge–Ampère equations such that $N \neq 0$. The proof for $N = 0$ follows from the fact that every Monge–Ampère equation can be transformed to one with $N \neq 0$ by an appropriate contact transformation.

Let $v \in Q_{\theta_1}$ and $P = \langle v \rangle$. The condition for $l(P)$ to be tangent to $\mathcal{E}$ can be written in the following way. We can assume that $v$ is of the form (8). Then the vector of fiber coordinates $(\zeta_2^2, -\zeta_1\zeta_2, \zeta_1^2)$ is tangent to $l(P)$. Now using (7) we deduce that $l(P)$ is tangent to $\mathcal{E}$ iff
\[ N(r\zeta_1^2 + 2s\zeta_1\zeta_2 + t\zeta_2^2) + A\zeta_2^2 - B\zeta_1\zeta_2 + C\zeta_1^2 = 0 . \]
Taking into account that the coordinates $\zeta_i$ and $\eta_i$ of $v$ are connected by equations (11), we reduce this equation to the form
\[ N(\zeta_1\eta_1 + \zeta_2\eta_2) + A\zeta_2^2 - B\zeta_1\zeta_2 + C\zeta_1^2 = 0 . \tag{15} \]
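Then in view of (9) we assume that $\zeta_1 \neq 0$ (the case $\zeta_2 \neq 0$ is analogous). Then from (11) we get
\[ r = \frac{1}{\zeta_1^2}\,(\eta_1\zeta_1 - \eta_2\zeta_2 + \zeta_2^2 t) , \qquad s = \frac{1}{\zeta_1}\,(\eta_2 - \zeta_2 t) . \]
Substituting these expressions for $r$ and $s$ in equation (6) and taking into account equation (15), we obtain the equation
\[ N\eta_2^2 + (A\zeta_2 - B\zeta_1)\eta_2 - A\zeta_1\eta_1 - D\zeta_1^2 = 0 . \tag{16} \]
Solving the system of equations (15) and (16) with respect to $\eta_1$ and $\eta_2$, we find
\[ \eta_1 = \frac{(B \mp \sqrt{\Delta})\zeta_2 - 2C\zeta_1}{2N} , \qquad \eta_2 = \frac{(B \pm \sqrt{\Delta})\zeta_1 - 2A\zeta_2}{2N} . \]
Finally, in view of (10), we see that
\[ v = \zeta_1 \left( \frac{\partial}{\partial x} + p\frac{\partial}{\partial z} - \frac{C}{N}\frac{\partial}{\partial p} + \frac{B \pm \sqrt{\Delta}}{2N}\frac{\partial}{\partial q} \right) + \zeta_2 \left( \frac{\partial}{\partial y} + q\frac{\partial}{\partial z} + \frac{B \mp \sqrt{\Delta}}{2N}\frac{\partial}{\partial p} - \frac{A}{N}\frac{\partial}{\partial q} \right) . \tag{17} \]
This proves that $Q_{\theta_1} = \langle X_1, X_2 \rangle \cup \langle X_3, X_4 \rangle$ with
\[ \begin{aligned}
X_1 &= \frac{\partial}{\partial x} + p\frac{\partial}{\partial z} - \frac{C}{N}\frac{\partial}{\partial p} + \frac{B - \sqrt{\Delta}}{2N}\frac{\partial}{\partial q} , \\
X_2 &= \frac{\partial}{\partial y} + q\frac{\partial}{\partial z} + \frac{B + \sqrt{\Delta}}{2N}\frac{\partial}{\partial p} - \frac{A}{N}\frac{\partial}{\partial q} , \\
X_3 &= \frac{\partial}{\partial x} + p\frac{\partial}{\partial z} - \frac{C}{N}\frac{\partial}{\partial p} + \frac{B + \sqrt{\Delta}}{2N}\frac{\partial}{\partial q} , \\
X_4 &= \frac{\partial}{\partial y} + q\frac{\partial}{\partial z} + \frac{B - \sqrt{\Delta}}{2N}\frac{\partial}{\partial p} - \frac{A}{N}\frac{\partial}{\partial q} .
\end{aligned} \tag{18} \]
Put
\[ D^1_{\mathcal{E}}(\theta_1) = \langle X_1, X_2 \rangle , \qquad D^2_{\mathcal{E}}(\theta_1) = \langle X_3, X_4 \rangle . \]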
Now it is straightforward to verify that the subspaces $D^1_{\mathcal{E}}(\theta_1)$ and $D^2_{\mathcal{E}}(\theta_1)$ are skew-orthogonal and $D^1_{\mathcal{E}}(\theta_1) \cap D^2_{\mathcal{E}}(\theta_1) = \{0\}$. This completes the proof.

From (18) we see that for a Monge–Ampère equation such that $N \neq 0$, the map $\tau_{1*}$ projects $D^1_{\mathcal{E}}(\theta_1)$ and $D^2_{\mathcal{E}}(\theta_1)$ onto the tangent space to the base of the bundle $\tau$ without degeneration. It should be noted that if $N = 0$ (that is, if $\mathcal{E}$ is a quasilinear second order PDE), then the projections $\tau_{1*} D^1_{\mathcal{E}}(\theta_1)$ and $\tau_{1*} D^2_{\mathcal{E}}(\theta_1)$ are one-dimensional.

Thus an arbitrary hyperbolic Monge–Ampère equation generates two 2-dimensional skew-orthogonal subdistributions of the Cartan distribution $C_1$ in $J^1\tau$.
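The verification called straightforward in the proof can be done numerically. The following Python sketch, under the assumption $N \neq 0$ and $\Delta > 0$, builds the fields (18) at one point, checks that combinations of $X_1, X_2$ and of $X_3, X_4$ satisfy (15) and (16), and evaluates the form $dx \wedge dp + dy \wedge dq$ on pairs of fields. The sample coefficients and all function names are illustrative, not taken from the paper.

```python
import math


def characteristic_fields(N, A, B, C, D):
    """Fields X1..X4 of (18) as tuples (zeta1, zeta2, eta1, eta2).

    The d/dz component is omitted: it is fixed by (10) and does not
    enter the form dx^dp + dy^dq.
    """
    root = math.sqrt(B * B - 4 * A * C + 4 * N * D)  # requires Delta > 0
    minus, plus = (B - root) / (2 * N), (B + root) / (2 * N)
    X1 = (1.0, 0.0, -C / N, minus)
    X2 = (0.0, 1.0, plus, -A / N)
    X3 = (1.0, 0.0, -C / N, plus)
    X4 = (0.0, 1.0, minus, -A / N)
    return X1, X2, X3, X4


def symplectic(v, w):
    """Value of dx^dp + dy^dq on the pair (v, w)."""
    return v[0] * w[2] - w[0] * v[2] + v[1] * w[3] - w[1] * v[3]


def residuals_15_16(N, A, B, C, D, v):
    """Left-hand sides of equations (15) and (16) for the vector v."""
    z1, z2, e1, e2 = v
    eq15 = N * (z1 * e1 + z2 * e2) + A * z2**2 - B * z1 * z2 + C * z1**2
    eq16 = N * e2**2 + (A * z2 - B * z1) * e2 - A * z1 * e1 - D * z1**2
    return eq15, eq16


def combine(a, X, b, Y):
    """Linear combination a*X + b*Y of two coefficient tuples."""
    return tuple(a * x + b * y for x, y in zip(X, Y))


coeffs = (2.0, 1.0, 3.0, -1.0, 0.5)  # sample N, A, B, C, D; Delta = 17
X1, X2, X3, X4 = characteristic_fields(*coeffs)
v = combine(0.7, X1, -1.3, X2)   # a vector of D^1
w = combine(0.7, X3, -1.3, X4)   # a vector of D^2
```

On this sample the cross pairings $dU_1(X_i, X_j)$, $i = 1, 2$, $j = 3, 4$, vanish, while $dU_1(X_1, X_2) = \sqrt{\Delta}/N \neq 0$, so each of the two planes is non-Lagrangian.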
Proposition 3.2. ([12]) Let $\mathcal{E}$ be a hyperbolic Monge–Ampère equation. Then $\theta_2 \in \mathcal{E}$ if and only if one of the following equivalent conditions holds:
(1) $K_{\theta_2} \cap D^1_{\mathcal{E}}(\theta_1)$ is 1-dimensional,
(2) $K_{\theta_2} \cap D^2_{\mathcal{E}}(\theta_1)$ is 1-dimensional.

Proof. As in the proof of Proposition 3.1 one can assume that $N \neq 0$.

Let $\theta_2 \in \mathcal{E}$. Then $\mathrm{Smbl}_{\theta_2}\mathcal{E} \cap V_{\theta_2} = \ell_{\theta_2}(v) \cup \ell_{\theta_2}(\tilde v)$, where $\ell_{\theta_2}(v)$ and $\ell_{\theta_2}(\tilde v)$ are different straight lines and, so, the vectors $v$ and $\tilde v$ are independent. They are skew-orthogonal, since $K_{\theta_2}$ is a Lagrangian plane in $C(\theta_1)$ and, by definition of $Q_{\theta_1}$, $v, \tilde v \in Q_{\theta_1}$. This means that $K_{\theta_2}$ intersects the planes $D^1_{\mathcal{E}}(\theta_1)$ and $D^2_{\mathcal{E}}(\theta_1)$ along $\langle v \rangle$ and $\langle \tilde v \rangle$, respectively.

Let $\theta_2$ be a point of $J^2\tau$ such that $K_{\theta_2}$ intersects the plane $D^1_{\mathcal{E}}(\theta_1)$ along a straight line, that is, $K_{\theta_2} \cap D^1_{\mathcal{E}}(\theta_1) = \langle v \rangle$. By substituting the coordinates $\eta_1, \eta_2$ of the vector $v$ given by formula (17) into eq. (11), we obtain
\[ \begin{aligned}
\left( r + \frac{C}{N} \right)\zeta_1 + \left( s - \frac{B - \sqrt{\Delta}}{2N} \right)\zeta_2 &= 0 , \\
\left( s - \frac{B + \sqrt{\Delta}}{2N} \right)\zeta_1 + \left( t + \frac{A}{N} \right)\zeta_2 &= 0 .
\end{aligned} \]
By hypothesis this system is of rank 1 (cf. (9)) and hence its determinant is zero. Now it remains to note that this is exactly equation (6) and, so, $\theta_2 \in \mathcal{E}$. The case of $D^2_{\mathcal{E}}(\theta_1)$ differs only by the sign at $\sqrt{\Delta}$.

An important consequence of this proposition is that a hyperbolic Monge–Ampère equation $\mathcal{E}$ is completely determined by one of the associated distributions $D^i_{\mathcal{E}}$, $i = 1, 2$.

Thus, every hyperbolic Monge–Ampère equation $\mathcal{E}$ is naturally equivalent to a pair of 2-dimensional, skew-orthogonal non-Lagrangian subdistributions $D^1_{\mathcal{E}}, D^2_{\mathcal{E}}$ of the Cartan distribution $C_1$ in $J^1\tau$. In particular, the equivalence problem for hyperbolic Monge–Ampère equations with respect to contact transformations may be interpreted as the equivalence problem for pairs of 2-dimensional, skew-orthogonal non-Lagrangian subdistributions of $C_1$ with respect to contact transformations.
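The last step of the proof, that the determinant of the linear system reproduces equation (6), can be checked numerically. The sketch below assumes $N \neq 0$, $\Delta > 0$, and that the second row of the system contains $t + A/N$, as obtained from $\eta_2 = \zeta_1 s + \zeta_2 t$ in (11). Under these assumptions the determinant equals the left-hand side of (6) divided by $N$. Function names and sample values are illustrative.

```python
import math


def system_matrix(N, A, B, C, D, r, s, t):
    """Coefficient matrix of the 2x2 system in (zeta1, zeta2)."""
    root = math.sqrt(B * B - 4 * A * C + 4 * N * D)  # requires Delta > 0
    return (
        (r + C / N, s - (B - root) / (2 * N)),
        (s - (B + root) / (2 * N), t + A / N),
    )


def det2(m):
    """Determinant of a 2x2 matrix given as nested tuples."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def lhs_6(N, A, B, C, D, r, s, t):
    """Left-hand side of equation (6)."""
    return N * (r * t - s * s) + A * r + B * s + C * t + D


coeffs = (2.0, 1.0, 3.0, -1.0, 0.5)   # sample N, A, B, C, D; Delta = 17
point = (0.3, -1.1, 2.5)              # sample fiber coordinates r, s, t
gap = det2(system_matrix(*coeffs, *point)) - lhs_6(*coeffs, *point) / coeffs[0]
```

Swapping the sign of $\sqrt{\Delta}$ in both rows leaves the determinant unchanged, which is why the case of the second distribution needs no separate computation.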
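3.3 Bundles of Monge–Ampère equations

From now on we put $M = J^1\tau$.

3.3.1 Bundles of hyperbolic Monge–Ampère equations

Let $\mathcal{E}$ be a Monge–Ampère equation (5). It is identified with the section $S_{\mathcal{E}}$, sending each point of $M$ to $[N : A : B : C : D]$ with the coefficients evaluated at that point, of the trivial bundle
\[ \rho : \mathbb{RP}^4 \times M \longrightarrow M , \]
which projects each pair consisting of a point $[v^0 : v^1 : v^2 : v^3 : v^4]$ and a point of $M$ onto the latter,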
where $\mathbb{RP}^4$ is the 4-dimensional projective space. Obviously, this identification is a bijection of the set of all Monge–Ampère equations onto the set of all sections of $\rho$.

Consider the open subset $E$ of the total space of $\rho$ defined by the condition (14), i.e.,
\[ (v^2)^2 - 4v^1v^3 + 4v^4v^0 > 0 . \]
Clearly, the section $S_{\mathcal{E}}$ corresponding to a hyperbolic Monge–Ampère equation $\mathcal{E}$ takes values in $E$. Thus we can define the bundle of hyperbolic Monge–Ampère equations as the restriction of $\rho$ to $E$:
\[ \pi = \rho|_E : E \longrightarrow M . \tag{19} \]
We use local coordinates $x, y, z, p, q, u^1, \dots, u^4$ in the total space $E$ of $\pi$, where $x, y, z, p, q$ are the standard coordinates on $M$, while the coordinates $u^1, \dots, u^4$ on the fibres of $\pi$ are defined as follows. Consider the affine hyperplane in $\mathbb{R}^5$ defined by the equation $v^0 = 1$. It generates the local chart in $E$
\[ [1 : v^1 : v^2 : v^3 : v^4] \mapsto (v^1, v^2, v^3, v^4) . \]
Following formulas (18), we introduce the local coordinates $u^1, \dots, u^4$ along the fibres of $\pi$ by
\[ u^1 = -v^3 , \qquad u^2 = \frac{v^2 - \sqrt{\Delta}}{2} , \qquad u^3 = \frac{v^2 + \sqrt{\Delta}}{2} , \qquad u^4 = -v^1 , \tag{20} \]
where $\Delta = (v^2)^2 - 4v^1v^3 + 4v^4$. These coordinates extend to the standard coordinates
\[ x, y, z, p, q, u^i, u^i_x, u^i_y, u^i_z, u^i_p, u^i_q, \dots, u^i_\sigma, \dots \]
on $J^k\pi$, used in this paper until we replace them with a more convenient set in Sect. 4.3.

3.3.2 The lifting of contact transformations

Let $\varphi$ be a contact transformation defined in $M$. Then $\varphi$ transforms any Monge–Ampère equation $\mathcal{E}$ to another Monge–Ampère equation $\tilde{\mathcal{E}}$. In other words, $\varphi$ induces a transformation of the corresponding sections $S_{\mathcal{E}} \mapsto S_{\tilde{\mathcal{E}}}$ and, consequently, a diffeomorphism $\varphi^{(0)}$ of the total space of $\pi$ such that the diagram
\[ \begin{array}{ccc}
E & \xrightarrow{\ \varphi^{(0)}\ } & E \\
\pi \big\downarrow & & \big\downarrow \pi \\
M & \xrightarrow{\ \ \varphi\ \ } & M
\end{array} \]
is commutative (in the domain of $\varphi^{(0)}$). The diffeomorphism $\varphi^{(0)}$ is called the lifting of $\varphi$ to the bundle $\pi$. The diffeomorphism $\varphi^{(0)}$, in its turn, can be lifted to a diffeomorphism $\varphi^{(k)}$ of $J^k\pi$ by the formula
\[ \varphi^{(k)}\bigl( [S]^k_x \bigr) = \bigl[ \varphi^{(0)} \circ S \circ \varphi^{-1} \bigr]^k_{\varphi(x)} . \]
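To make the fibre coordinates (20) concrete, here is a minimal Python sketch of the change of coordinates in the chart $v^0 = 1$. The inverse map is not written out in the text; it is derived here from (20), using $u^2 + u^3 = v^2$ and $u^2u^3 = v^1v^3 - v^4$, and should be read as an assumption of this sketch. In these coordinates the equation (6) takes the factorized form $(r - u^1)(t - u^4) - (s - u^2)(s - u^3) = 0$. Function names are hypothetical.

```python
import math


def to_u(v1, v2, v3, v4):
    """Fibre coordinates (20) in the chart v0 = 1 (hyperbolic case only)."""
    delta = v2 * v2 - 4 * v1 * v3 + 4 * v4
    if delta <= 0:
        raise ValueError("coordinates (20) require Delta > 0")
    root = math.sqrt(delta)
    return (-v3, (v2 - root) / 2, (v2 + root) / 2, -v1)


def from_u(u1, u2, u3, u4):
    """Inverse of (20), derived in this sketch rather than quoted."""
    return (-u4, u2 + u3, -u1, u1 * u4 - u2 * u3)


def equation_in_v(v, r, s, t):
    """Left-hand side of (6) with N = 1 and (A, B, C, D) = v."""
    v1, v2, v3, v4 = v
    return r * t - s * s + v1 * r + v2 * s + v3 * t + v4


def equation_in_u(u, r, s, t):
    """Factorized form of the same left-hand side in the coordinates u."""
    u1, u2, u3, u4 = u
    return (r - u1) * (t - u4) - (s - u2) * (s - u3)


v = (1, 4, 1, 1)      # sample with Delta = 16, so the square root is exact
u = to_u(*v)          # (-1, 0.0, 4.0, -1)
```

The factorized form matches the determinant of the linear system used in the proof of Proposition 3.2, up to the factor $N$.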
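Obviously, for any $l > m$, the diagram
\[ \begin{array}{ccc}
J^l\pi & \xrightarrow{\ \varphi^{(l)}\ } & J^l\pi \\
\pi_{l,m} \big\downarrow & & \big\downarrow \pi_{l,m} \\
J^m\pi & \xrightarrow{\ \varphi^{(m)}\ } & J^m\pi
\end{array} \]
is commutative (in the domains of $\varphi^{(l)}$). The diffeomorphism $\varphi^{(k)}$ is called the lifting of $\varphi$ to the jet bundle $J^k\pi$.

3.3.3 The lifting of contact vector fields

Let $Z$ be a contact vector field in $M$ and let $\varphi_t$ be its flow. Then $\varphi_t^{(k)}$ defines a vector field $Z^{(k)}$ in $J^k\pi$. This field is called the lifting of $Z$ to $J^k\pi$. Obviously,
\[ (\pi_{l,m})_* Z^{(l)} = Z^{(m)} , \qquad \infty \geq l > m \geq -1 , \]
where $Z^{(-1)} = Z$. It is not difficult to see that the map $Z \mapsto Z^{(k)}$ is a homomorphism of the Lie algebra of all contact vector fields on $M$ into the Lie algebra of all vector fields on $J^k\pi$.

The local expression of $Z^{(k)}$ can be found as follows. First, change the notation by putting $x^1 = x$, $x^2 = y$, $x^3 = z$, $x^4 = p$, $x^5 = q$. Recall that the operator $D_j$ of total derivative with respect to $x^j$ in $J^\infty\pi$ is given by the formula
\[ D_j = \frac{\partial}{\partial x^j} + \sum_{i=1}^{4} \sum_{|\sigma| \geq 0} u^i_{\sigma j} \frac{\partial}{\partial u^i_\sigma} , \qquad j = 1, 2, \dots, 5 . \]
The operator of evolution differentiation corresponding to a generating function $\psi(Z) = (\psi^1(Z), \dots, \psi^4(Z))^t$ is defined by the formula
\[ \text{Э}_{\psi(Z)} = \sum_{|\sigma| \geq 0} \sum_{i=1}^{4} D_\sigma\bigl(\psi^i(Z)\bigr) \frac{\partial}{\partial u^i_\sigma} , \]
where $\sigma = \{j_1 \dots j_r\}$, $D_\sigma = D_{j_1} \circ \dots \circ D_{j_r}$ and $\psi(Z)$ is defined as follows. Let $S$ be a section of $\pi$ defined in the domain of $Z$, $\theta_1 = [S]^1_x$, and $x = \pi_1(\theta_1)$; then
\[ \psi(Z)(\theta_1) = \left. \frac{d}{dt} \right|_{t=0} \bigl( \varphi_t^{(0)} \circ S \circ \varphi_t^{-1} \bigr)(x) . \]
If
\[ Z = \sum_{i=1}^{5} Z^i \frac{\partial}{\partial x^i} , \]
then the lifting $Z^{(\infty)}$ is defined by the formula (see [4, 5])
\[ Z^{(\infty)} = \sum_{j=1}^{5} Z^j D_j + \text{Э}_{\psi(Z)} . \tag{21} \]
It follows from this formula that
\[ Z^{(k)} = \sum_{j=1}^{5} Z^j D^k_j + \text{Э}^k_{\psi(Z)} , \tag{22} \]
where
\[ D^k_j = \frac{\partial}{\partial x^j} + \sum_{i=1}^{4} \sum_{0 \leq |\sigma| \leq k} u^i_{\sigma j} \frac{\partial}{\partial u^i_\sigma} , \qquad \text{Э}^k_{\psi(Z)} = \sum_{0 \leq |\sigma| \leq k} \sum_{i=1}^{4} D_\sigma\bigl(\psi^i(Z)\bigr) \frac{\partial}{\partial u^i_\sigma} . \]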
Let $f$ be the generating function of the contact vector field $Z$ (see formula (2)) and $\theta_1 = (x, y, z, p, q, u^i, u^i_x, u^i_y, u^i_z, u^i_p, u^i_q)$. Then the vector $\psi(Z_f)(\theta_1)$ is $(\psi^1, \dots, \psi^4)$ with
\[ \begin{aligned}
\psi^1 ={}& -u^1_z f - u^1_p f_x - u^1_q f_y + (-pu^1_p - qu^1_q + u^1) f_z + (u^1_x + pu^1_z) f_p + (u^1_y + qu^1_z) f_q \\
& + f_{xx} + 2p f_{xz} + p^2 f_{zz} + 2u^1 f_{xp} + (u^2 + u^3) f_{xq} + 2pu^1 f_{zp} + p(u^2 + u^3) f_{zq} \\
& + (u^1)^2 f_{pp} + (u^2 + u^3)u^1 f_{pq} + u^2u^3 f_{qq} , \\
\psi^2 ={}& -u^2_z f - u^2_p f_x - u^2_q f_y + (-pu^2_p - qu^2_q + u^2) f_z + (u^2_x + pu^2_z) f_p + (u^2_y + qu^2_z) f_q \\
& + f_{xy} + q f_{xz} + p f_{yz} + pq f_{zz} + u^2 f_{xp} + u^4 f_{xq} + u^1 f_{yp} + u^2 f_{yq} + (qu^1 + pu^2) f_{zp} + (qu^2 + pu^4) f_{zq} \\
& + u^1u^2 f_{pp} + (u^1u^4 + (u^2)^2) f_{pq} + u^2u^4 f_{qq} , \\
\psi^3 ={}& -u^3_z f - u^3_p f_x - u^3_q f_y + (-pu^3_p - qu^3_q + u^3) f_z + (u^3_x + pu^3_z) f_p + (u^3_y + qu^3_z) f_q \\
& + f_{xy} + q f_{xz} + p f_{yz} + pq f_{zz} + u^3 f_{xp} + u^4 f_{xq} + u^1 f_{yp} + u^3 f_{yq} + (qu^1 + pu^3) f_{zp} + (qu^3 + pu^4) f_{zq} \\
& + u^1u^3 f_{pp} + (u^1u^4 + (u^3)^2) f_{pq} + u^3u^4 f_{qq} , \\
\psi^4 ={}& -u^4_z f - u^4_p f_x - u^4_q f_y + (-pu^4_p - qu^4_q + u^4) f_z + (u^4_x + pu^4_z) f_p + (u^4_y + qu^4_z) f_q \\
& + f_{yy} + 2q f_{yz} + q^2 f_{zz} + (u^2 + u^3) f_{yp} + 2u^4 f_{yq} + q(u^2 + u^3) f_{zp} + 2qu^4 f_{zq} \\
& + u^2u^3 f_{pp} + (u^2 + u^3)u^4 f_{pq} + (u^4)^2 f_{qq} .
\end{aligned} \tag{23} \]
3.4 Differential invariants

By $\Gamma$ we denote the pseudogroup of all contact transformations of $M$. Its action is lifted to $J^k\pi$, $k \geq 0$, as explained above.
A function (vector field, differential form, or any other natural geometric object on $J^k\pi$) is a $k$th-order differential invariant of $\Gamma$ if for any $\varphi \in \Gamma$ the lifted transformation $\varphi^{(k)}$ preserves this object. In this work these differential invariants are also called differential invariants (of order $k$) of Monge–Ampère equations or simply differential invariants (of order $k$).

Let $\mathcal{E}$ be a Monge–Ampère equation, $S_{\mathcal{E}}$ the section of $\pi$ identified with $\mathcal{E}$, and $I$ a differential invariant of order $k$. Then the value of $I$ on $\mathcal{E}$ is defined as $(j_k S_{\mathcal{E}})^*(I)$ and denoted by $I_{\mathcal{E}}$. If a contact transformation $f$ transforms $\mathcal{E}$ to $\tilde{\mathcal{E}}$, then, obviously, $f^{(k)}$ transforms $I_{\mathcal{E}}$ to $I_{\tilde{\mathcal{E}}}$, for any $k$th-order invariant $I$.

Differential invariants that are functions are also called scalar differential invariants. By $A_k$ we denote the $\mathbb{R}$-algebra of all scalar differential invariants of order $\leq k$. By identifying $A_k$ with $\pi^*_{l,k}(A_k) \subset A_l$, $\forall k \leq l$, one gets a sequence of inclusions
\[ A_0 \subset A_1 \subset \dots \subset A_k \subset A_{k+1} \subset \dots \]
The $\mathbb{R}$-algebra $A = \bigcup_{k=0}^{\infty} A_k$ is called the algebra of scalar differential invariants of Monge–Ampère equations.

Remark 3.3. It is worth noticing that a differential invariant $I$ is completely determined by its values $I_{\mathcal{E}}$ on concrete equations $\mathcal{E}$. This observation will be used below.

Let $Z$ be a contact vector field in $M$ and $I$ a differential invariant of order $k$. Then $L_{Z^{(k)}}(I) = 0$, where $L$ stands for the Lie derivative. This means, in particular, that $k$th-order scalar invariants are first integrals of all contact vector fields lifted to $J^k\pi$.

Obviously, a scalar differential invariant of order $k$ is constant on any orbit of the action of $\Gamma$ on $J^k\pi$. Such an orbit consists, generally, of two components, since contact transformations need not be orientation preserving (e.g., the famous Legendre transformation $\bar x = p$, $\bar y = q$, $\bar z = xp + yq - z$, $\bar p = x$, $\bar q = y$ is not). In other words, the above-mentioned first integrals of $Z^{(k)}$ are, generally, invariant only with respect to the unit component of $\Gamma$ and will be called almost invariant. Anyway, generic orbits of contact transformations and of contact vector fields have the same dimension:

Proposition 3.4.
(1) $J^k\pi$ is an orbit of the action of $\Gamma$ iff $k = 0, 1$.
(2) The codimension of a generic orbit in $J^2\pi$ is equal to 2.
(3) The codimension of a generic orbit in $J^3\pi$ is equal to 29.

Proof. Let $\theta_k$ be a generic point of $J^k\pi$ and $\mathrm{Orb}_{\theta_k}$ the orbit of the action of $\Gamma$ on $J^k\pi$ passing through $\theta_k$. Then
\[ \operatorname{codim} \mathrm{Orb}_{\theta_k} = \dim J^k\pi - \dim \mathrm{Orb}_{\theta_k} . \]
The dimension of $\mathrm{Orb}_{\theta_k}$ is the dimension of the subspace spanned by all vectors $X^{(k)}(\theta_k)$, which can be calculated with the help of computer algebra using formulas (22) and (23).

Recall that for an arbitrary smooth function $\phi$ of $k = 1, 2, \dots$ arguments and arbitrary scalar differential invariants $I_1, \dots, I_k \in A_r$, the function $\phi(I_1, \dots, I_k)$ is a scalar differential invariant belonging to $A_r$, $r = 0, 1, 2, \dots$. Now the above proposition immediately implies

Corollary 3.5.
(1) The algebra of scalar differential invariants $A_2$ is generated by 2 functionally independent invariants.
(2) The algebra of scalar differential invariants $A_3$ is generated by 29 functionally independent invariants.

Differential invariants constructed below come mainly from natural geometric constructions, which makes it clear without further comment that they are invariant with respect to the full pseudogroup $\Gamma$. Although not impossible, it is quite a challenging task to obtain first integrals of $Z^{(k)}$ analytically even for small $k$.
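For orientation, the dimensions entering Proposition 3.4 can be tabulated. The sketch below assumes only that $M$ is 5-dimensional and that the fibre of $\pi$ is 4-dimensional, so that $\dim J^k\pi = 5 + 4\binom{5+k}{k}$ by the usual count of partial derivatives of order at most $k$. The codimensions are the ones quoted in Proposition 3.4; the orbit dimensions are derived from them here and are not stated in the text.

```python
from math import comb

BASE_DIM = 5    # coordinates x, y, z, p, q on M
FIBRE_DIM = 4   # coordinates u^1..u^4 on the fibres of pi

# Codimensions of generic orbits as stated in Proposition 3.4.
GENERIC_CODIM = {0: 0, 1: 0, 2: 2, 3: 29}


def jet_dim(k):
    """dim J^k(pi): base dimension plus one coordinate per derivative."""
    return BASE_DIM + FIBRE_DIM * comb(BASE_DIM + k, k)


def generic_orbit_dim(k):
    """Dimension of a generic orbit, from codim = dim J^k(pi) - dim Orb."""
    return jet_dim(k) - GENERIC_CODIM[k]


for k in range(4):
    print(k, jet_dim(k), generic_orbit_dim(k))
```

So a generic orbit in $J^2\pi$ has dimension 87 inside an 89-dimensional space, and in $J^3\pi$ dimension 200 inside a 229-dimensional space.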
4 Differential invariants on $J^2\pi$

The next step is the explicit construction of differential invariants that generate $A_2$ as a $C^\infty$-closed algebra.
4.1 Base projectors

Let $D$ be a distribution on $M$. Denote by $D^{(1)}$ the distribution generated by all vector fields $X$ and $[X, Y]$, $\forall X, Y \in D$. Setting $D^{(0)} = D$, we define $D^{(r+1)}$, $r = 0, 1, \dots$, inductively by the formula $D^{(r+1)} = (D^{(r)})^{(1)}$.

Lemma 4.1. For a hyperbolic Monge–Ampère equation $\mathcal{E}$,
\[ \dim (D^1_{\mathcal{E}})^{(1)} = \dim (D^2_{\mathcal{E}})^{(1)} = 3 . \]

Proof. Let $\omega \in \Lambda^1(M)$ and $X, Y \in D(M)$ be such that $\omega(X) = \omega(Y) = 0$. Then, by applying the formula
\[ d\omega(X, Y) = L_X(Y \,\lrcorner\, \omega) - L_Y(X \,\lrcorner\, \omega) - [X, Y] \,\lrcorner\, \omega , \]
one easily finds that $\omega([X, Y]) = -d\omega(X, Y)$. If now $\omega = U_1$ and the vector fields $X, Y \in D^i_{\mathcal{E}}$, $i = 1, 2$, are independent, then $dU_1(X, Y) \neq 0$ due to hyperbolicity of $\mathcal{E}$. So, the above formula shows that $U_1([X, Y]) \neq 0$, i.e., that $[X, Y]$ does not belong to the Cartan distribution on $M$. So, the fields $[X, Y]$, $X$ and $Y$ are linearly independent at every point of $M$.

Restricting ourselves to the generic case only, we assume from now on that
\[ \dim (D^1_{\mathcal{E}})^{(2)} = \dim (D^2_{\mathcal{E}})^{(2)} = 5 . \tag{24} \]
Suppose that vector fields $X_1, X_2$ generate the distribution $D^1_{\mathcal{E}}$ and vector fields $X_3, X_4$ generate the distribution $D^2_{\mathcal{E}}$. The 3-dimensional generic distributions $\langle X_1, X_2, [X_1, X_2] \rangle$ and $\langle X_3, X_4, [X_3, X_4] \rangle$ intersect along a one-dimensional subdistribution
\[ D^3_{\mathcal{E}} = \langle X_1, X_2, [X_1, X_2] \rangle \cap \langle X_3, X_4, [X_3, X_4] \rangle . \]
Hence, equation $\mathcal{E}$ generates a direct sum decomposition [12]
\[ T(M) = D^1_{\mathcal{E}} \oplus D^2_{\mathcal{E}} \oplus D^3_{\mathcal{E}} . \tag{25} \]
This decomposition generates six projections
\[ \begin{aligned}
P_i &: T(M) \to D^i_{\mathcal{E}} , & i &= 1, 2, 3 , \\
P^{(1)}_j &: T(M) \to D^j_{\mathcal{E}} \oplus D^3_{\mathcal{E}} , & j &= 1, 2 , \\
P_C &: T(M) \to C = D^1_{\mathcal{E}} \oplus D^2_{\mathcal{E}} .
\end{aligned} \]
These projections may be viewed as vector-valued 1-forms. Namely, let $X_5$ be a vector field generating $D^3_{\mathcal{E}}$. Consider the co-frame $\{\omega^1, \dots, \omega^5\}$ on $M$ dual to the frame $\{X_1, \dots, X_5\}$, i.e., $\omega^i(X_j) = \delta^i_j$. Then
\[ \begin{aligned}
P_1 &= \omega^1 \otimes X_1 + \omega^2 \otimes X_2 , \\
P_2 &= \omega^3 \otimes X_3 + \omega^4 \otimes X_4 , \\
P_3 &= \omega^5 \otimes X_5 , \\
P^{(1)}_j &= P_j + P_3 , \qquad j = 1, 2 , \\
P_C &= P_1 + P_2 .
\end{aligned} \tag{26} \]
These vector-valued differential 1-forms are, obviously, differential invariants of $\mathcal{E}$ with respect to contact transformations. Moreover, according to Proposition 3.2, the original equation $\mathcal{E}$ is completely determined by each of the projectors $P_1$, $P_2$.
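4.2 Coordinate-wise description of base projectors

In order to find local expressions for the above projectors, consider the vector fields $X_1, \dots, X_4$ given by (18) and use the notation (20), i.e.,
\[ \begin{aligned}
X_1 &= \frac{\partial}{\partial x} + p\frac{\partial}{\partial z} + u^1\frac{\partial}{\partial p} + u^2\frac{\partial}{\partial q} , &
X_2 &= \frac{\partial}{\partial y} + q\frac{\partial}{\partial z} + u^3\frac{\partial}{\partial p} + u^4\frac{\partial}{\partial q} , \\
X_3 &= \frac{\partial}{\partial x} + p\frac{\partial}{\partial z} + u^1\frac{\partial}{\partial p} + u^3\frac{\partial}{\partial q} , &
X_4 &= \frac{\partial}{\partial y} + q\frac{\partial}{\partial z} + u^2\frac{\partial}{\partial p} + u^4\frac{\partial}{\partial q} .
\end{aligned} \tag{27} \]
The remaining field $X_5$ is defined by the relation
\[ X_5 = \lambda^1 X_1 + \lambda^2 X_2 + \kappa [X_1, X_2] = \lambda^3 X_3 + \lambda^4 X_4 + \chi [X_3, X_4] . \tag{28} \]
A simple computation shows that
\[ \lambda^3 = \lambda^1 , \qquad \lambda^4 = \lambda^2 , \qquad \chi = -\kappa \neq 0 , \]
with
\[ \begin{aligned}
\lambda^1 = \frac{1}{u^2 - u^3} \Bigl[ & (u^2 + u^3)_y + q(u^2 + u^3)_z + u^4(u^2 + u^3)_q \\
& - 2(u^4_x + pu^4_z + u^1u^4_p) - (u^2 + u^3)u^4_q + u^3u^2_p + u^2u^3_p \Bigr] , \\
\lambda^2 = \frac{1}{u^2 - u^3} \Bigl[ & (u^2 + u^3)_x + p(u^2 + u^3)_z + u^1(u^2 + u^3)_p \\
& - 2(u^1_y + qu^1_z + u^4u^1_q) - (u^2 + u^3)u^1_p + u^2u^3_q + u^3u^2_q \Bigr] ,
\end{aligned} \tag{29} \]
provided that $X_5$ is normalized by the requirement $\kappa = 1$.

Brackets of the vector fields $X_1, \dots, X_5$ are described by means of the coefficients $b^i_{jk}$:
\[ [X_j, X_k] = \sum_{i=1}^{5} b^i_{jk} X_i . \]
Obviously, $b^i_{jk} = -b^i_{kj}$.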
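The relations $\lambda^3 = \lambda^1$, $\lambda^4 = \lambda^2$, $\chi = -\kappa$ can be recovered by hand. The following worked computation is a sketch based only on (27) and (28): it compares the coefficients of $\partial/\partial x$, $\partial/\partial y$ and $\partial/\partial z$ on both sides of (28), and it uses $u^2 - u^3 = -\sqrt{\Delta} \neq 0$ from (20). The omitted $\partial/\partial p$ and $\partial/\partial q$ components are the ones that lead to (29) and are not reproduced here.

```latex
\begin{align*}
[X_1, X_2] &= \bigl(X_1(q) - X_2(p)\bigr)\,\partial_z
             + \bigl(X_1(u^3) - X_2(u^1)\bigr)\,\partial_p
             + \bigl(X_1(u^4) - X_2(u^2)\bigr)\,\partial_q \\
           &= (u^2 - u^3)\,\partial_z + \dots , \\
[X_3, X_4] &= \bigl(X_3(q) - X_4(p)\bigr)\,\partial_z + \dots
            = (u^3 - u^2)\,\partial_z + \dots , \\[4pt]
\partial_x :\quad & \lambda^1 = \lambda^3 , \\
\partial_y :\quad & \lambda^2 = \lambda^4 , \\
\partial_z :\quad & \lambda^1 p + \lambda^2 q + \kappa\,(u^2 - u^3)
                   = \lambda^3 p + \lambda^4 q + \chi\,(u^3 - u^2)
                   \;\Longrightarrow\; \chi = -\kappa .
\end{align*}
```

With the normalization $\kappa = 1$ this gives $[X_1, X_2] = X_5 - \lambda^1 X_1 - \lambda^2 X_2$ and $[X_3, X_4] = -X_5 + \lambda^1 X_3 + \lambda^2 X_4$, so the $X_5$-components of the two brackets are $1$ and $-1$, respectively.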
4.3 Convenient coordinates on $J^k\pi$

The vector fields $X_i$, $i = 1, \dots, 5$, induce vector fields $\hat X_i$ on the bundle $J^\infty\pi$, uniquely defined by the condition $(j_k S_{\mathcal{E}})_* X_i = \hat X_i$ for all sections $S_{\mathcal{E}}$. Thus, $\hat X_1 = D_1 + pD_3 + u^1 D_4 + u^2 D_5$, etc., where $D_i$ denote the total derivatives, see Sect. 3.3.3.

Differential invariants of hyperbolic Monge–Ampère equations constructed below are described in terms of the quantities $\hat X_{i_1} \dots \hat X_{i_h} b^k_{ij}$. So, we need to know all algebraic relations connecting them, at least for $h = 0, 1$. To find these efficiently it is convenient to use a non-standard local chart in $J^k\pi$.

Lemma 4.2. The functions
\[ \tilde u^j_{i_1 \dots i_h} = \hat X_{i_1} \dots \hat X_{i_h} u^j , \qquad i_1 \leq \dots \leq i_h , \quad h \leq k , \tag{30} \]
together with the functions $x^i, u^j$ constitute a local chart on $J^k\pi$. Moreover, the standard jet coordinates on $J^k\pi$ are rational functions of these coordinates.

Proof. For $k = 2$ the assertion is verified directly. For $k > 2$ one can express the standard jet coordinates $u^j_{i_1 \dots i_k} = D_{i_1 \dots i_k} u^j$ in terms of coordinates (30) by making use of the following obvious facts. First, the fields $D_i$ are linear combinations of the fields $\hat X_i$ with coefficients in $C^\infty(J^2\pi)$. Second, the coefficients $b^j_{i_1 i_2}$ are functions on $J^2\pi$. Third, $\hat X_{i_2} \hat X_{i_1} f = -b^j_{i_1 i_2} \hat X_j f + \hat X_{i_1} \hat X_{i_2} f$ for every function $f \in C^\infty(J^k\pi)$, $k \geq 2$.

Omitting the explicit expression of the quantities $b^k_{ij}$ in terms of the coordinates $\tilde u^j_{i_1 \dots i_h}$, we only remark that they are essentially simpler than those in terms of the standard jet coordinates $u^j_{i_1 \dots i_h}$. Then it is easy to find the following complete system of functional relations among the quantities $b^k_{ij}$:
\[ \begin{aligned}
& b^1_{34} = 0 , \quad b^2_{34} = 0 , \quad b^3_{12} = 0 , \quad b^4_{12} = 0 , \quad b^5_{12} = 1 , \\
& b^3_{13} = -b^1_{13} , \quad b^4_{13} = -b^2_{13} , \quad b^5_{13} = 0 , \\
& b^3_{23} = -b^1_{23} , \quad b^4_{23} = -b^2_{23} , \quad b^5_{23} = 0 , \\
& b^3_{14} = -b^1_{14} , \quad b^4_{14} = -b^2_{14} , \quad b^5_{14} = 0 , \\
& b^3_{24} = -b^1_{24} , \quad b^4_{24} = -b^2_{24} , \quad b^5_{24} = 0 , \\
& b^3_{34} = -b^1_{12} , \quad b^4_{34} = -b^2_{12} , \quad b^5_{34} = -1 , \\
& b^5_{15} = -b^2_{14} - b^1_{13} , \quad b^5_{25} = -b^2_{24} - b^1_{23} , \\
& b^5_{35} = -b^1_{13} - b^2_{23} , \quad b^5_{45} = -b^2_{24} - b^1_{14} , \\
& b^4_{45} = -b^3_{35} + b^2_{25} + b^1_{15} .
\end{aligned} \tag{31} \]
Naturally, these relations reflect basic geometric properties of the fields $X_1, \dots, X_5$. For instance, the relation $b^3_{12} = b^4_{12} = 0$ is implied by the fact that $[X_1, X_2]$ belongs to the distribution $(D^1_{\mathcal{E}})^{(1)}$ generated by $X_1$, $X_2$ and $X_5$, etc.

Henceforth we shall simplify the notation by using $X_i$ for $\hat X_i$.
4.4 Curvatures

Using formulas (3), (4) and the direct sum decomposition (25), it is easy to compute the curvature forms of the projectors $P_1$, $P_2$, $P^{(1)}_1$, $P^{(1)}_2$, $P_C$, which are
\[ \begin{aligned}
R_1 &= \omega^1 \wedge \omega^2 \otimes X_5 , \\
R_2 &= -\omega^3 \wedge \omega^4 \otimes X_5 , \\
R^1_1 &= -(b^3_{15}\omega^1 + b^3_{25}\omega^2) \wedge \omega^5 \otimes X_3 - (b^4_{15}\omega^1 + b^4_{25}\omega^2) \wedge \omega^5 \otimes X_4 , \\
R^1_2 &= -(b^1_{35}\omega^3 + b^1_{45}\omega^4) \wedge \omega^5 \otimes X_1 - (b^2_{35}\omega^3 + b^2_{45}\omega^4) \wedge \omega^5 \otimes X_2 , \\
R &= R_1 + R_2 ,
\end{aligned} \tag{32} \]
respectively. It is clear that these curvature forms are differential invariants of $\mathcal{E}$.

Frölicher–Nijenhuis brackets of base projectors give new invariant vector-valued forms. These, however, turn out to be linear combinations of curvature forms. More exactly, a direct computation, which is omitted, shows that
\[ \begin{aligned}
[[P_1, P_2]] &= \tfrac12 \bigl( -[[P_1, P_1]] - [[P_2, P_2]] + [[P_3, P_3]] \bigr) , \\
[[P_1, P_3]] &= \tfrac12 \bigl( -[[P_1, P_1]] + [[P_2, P_2]] - [[P_3, P_3]] \bigr) , \\
[[P_2, P_3]] &= \tfrac12 \bigl( [[P_1, P_1]] - [[P_2, P_2]] - [[P_3, P_3]] \bigr)
\end{aligned} \]
and
\[ [[P_1, P_1]] = -2(R^1_2 + R_1) , \qquad [[P_2, P_2]] = -2(R^1_1 + R_2) , \qquad [[P_3, P_3]] = -2(R_1 + R_2) . \]
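4.5 Scalar invariants on $J^2\pi$

The following three invariant 5-forms with values in $D^3_{\mathcal{E}} = \langle X_5 \rangle$:
\[ \begin{aligned}
\tfrac12\,(R^1_2 \,\lrcorner\, R_1) \,\lrcorner\, (R^1_2 \,\lrcorner\, R_1) &= \Lambda_1\, \omega^1 \wedge \dots \wedge \omega^5 \otimes X_5 , \\
\tfrac12\,(R^1_1 \,\lrcorner\, R_2) \,\lrcorner\, (R^1_1 \,\lrcorner\, R_2) &= \Lambda_2\, \omega^1 \wedge \dots \wedge \omega^5 \otimes X_5 , \\
(R^1_2 \,\lrcorner\, R_1) \,\lrcorner\, (R^1_1 \,\lrcorner\, R_2) &= \Lambda_{12}\, \omega^1 \wedge \dots \wedge \omega^5 \otimes X_5 ,
\end{aligned} \tag{33} \]
with
\[ \begin{aligned}
\Lambda_1 &= b^2_{35} b^1_{45} - b^1_{35} b^2_{45} , \\
\Lambda_2 &= b^4_{15} b^3_{25} - b^3_{15} b^4_{25} , \\
\Lambda_{12} &= b^3_{15} b^1_{35} + b^4_{15} b^1_{45} + b^3_{25} b^2_{35} + b^4_{25} b^2_{45} ,
\end{aligned} \tag{34} \]
are proportional. Therefore, the corresponding proportionality factors are scalar differential invariants. In particular, such are
\[ I^1 = \Lambda_{12}/\Lambda_1 , \qquad I^2 = \Lambda_{12}/\Lambda_2 . \tag{35} \]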
Below it will be shown that $\Lambda_1$, $\Lambda_2$ are nowhere zero.

Theorem 4.3. The algebra of scalar differential invariants on $J^2\pi$ is generated by the invariants $I^1$ and $I^2$.

Proof. In view of Corollary 3.5, it is sufficient to show that $I^1$ and $I^2$ are functionally independent (on $J^2\pi$). But this is straightforward from the complete list of functional relations (31).

The coefficients $\Lambda_\sigma$, $\sigma = 1, 2, 12$, introduced in (34) have a geometrical meaning explained below. Fix a generator $W = fX_5$ in $D^3_{\mathcal{E}}$ and consider the maps
\[ \square^W_1 : D^2 \to D^1 , \qquad \square^W_2 : D^1 \to D^2 , \]
defined by the formulas
\[ \square^W_1(Z_2) = P_1([Z_2, W]) , \qquad \square^W_2(Z_1) = P_2([Z_1, W]) , \]
with $Z_1 \in D^1$, $Z_2 \in D^2$. Since $P_1(D^2_{\mathcal{E}}) = P_2(D^1_{\mathcal{E}}) = 0$, both $\square^W_1$ and $\square^W_2$ are $C^\infty(M)$-linear. This is seen as well from their local expressions
\[ \begin{aligned}
\square^W_1 &= f\, b^i_{j5}\, \omega^j \otimes X_i , & i &= 1, 2, \quad j = 3, 4 , \\
\square^W_2 &= f\, b^i_{j5}\, \omega^j \otimes X_i , & i &= 3, 4, \quad j = 1, 2 .
\end{aligned} \]
Consider also the 2-forms $\rho^W_i : D^i \times D^i \to \mathbb{R}$, $i = 1, 2$, defined by
\[ \rho^W_i(U_i, V_i)\, W = R_i(U_i, V_i) , \qquad U_i, V_i \in D^i_{\mathcal{E}} . \tag{36} \]
Then, obviously, $\rho^W_1 = (1/f)\,\omega^1 \wedge \omega^2$, $\rho^W_2 = -(1/f)\,\omega^3 \wedge \omega^4$, so that both are volume forms of $D^1$ and $D^2$, respectively. Moreover, we have
\[ \begin{aligned}
(\square^W_1)^*(\rho^W_1) &= f^2 \Lambda_2\, \rho^W_2 , \qquad (\square^W_2)^*(\rho^W_2) = f^2 \Lambda_1\, \rho^W_1 , \\
\operatorname{tr}(\square^W_1 \circ \square^W_2) &= \operatorname{tr}(\square^W_2 \circ \square^W_1) = f^2 \Lambda_{12} .
\end{aligned} \tag{37} \]

Proposition 4.4. If $\mathcal{E}$ is generic, then the functions $\Lambda_1$, $\Lambda_2$ are nowhere zero.

Proof. By the genericity condition (24), $\square^W_1$ and $\square^W_2$ are surjective, hence $\Lambda_1$, $\Lambda_2$ are nonzero.
4.5.1

Now consider the operators $\nabla^W_1 = \square^W_1 \circ \square^W_2$ and $\nabla^W_2 = \square^W_2 \circ \square^W_1$ acting on $D^1$ and $D^2$, respectively. It follows from (37) that
\[ \lambda^2 - f^2 \Lambda_{12}\, \lambda + f^4 \Lambda_1 \Lambda_2 \tag{38} \]
is the characteristic polynomial for each of them. Another peculiarity of the situation is that $\square^W_1$ sends eigenvectors of $\nabla^W_2$ to those of $\nabla^W_1$, and similarly for $\square^W_2$.

The discriminant of polynomial (38) is $f^4 \Lambda_1 \Lambda_2 (I^1 I^2 - 4)$. Its sign coincides, obviously, with the sign of
\[ I^1 I^2 (I^1 I^2 - 4) . \]
This proves that generic hyperbolic Monge–Ampère equations are subdivided into three subclasses as follows:
(1) subclass “h”: the operator $\nabla^W_i$ has two different real eigenfunctions $\Leftrightarrow$ $I^1 I^2 (I^1 I^2 - 4) > 0$,
(2) subclass “p”: the operator $\nabla^W_i$ has a unique real eigenfunction $\Leftrightarrow$ $I^1 I^2 (I^1 I^2 - 4) = 0$,
(3) subclass “e”: the operator $\nabla^W_i$ has no real eigenfunctions $\Leftrightarrow$ $I^1 I^2 (I^1 I^2 - 4) < 0$.

4.5.2 Some almost invariants

The previous considerations lead to an almost invariant choice of the generator $W = fX_5$ in $D^3_{\mathcal{E}}$. Namely, define functions $\Lambda^W_i$, $i = 1, 2$, by the relations
\[ (\square^W_1)^*(\rho^W_1) = \Lambda^W_2\, \rho^W_2 , \qquad (\square^W_2)^*(\rho^W_2) = \Lambda^W_1\, \rho^W_1 . \]
Obviously, $\Lambda^W_i = f^2 \Lambda_i$. This shows that, up to sign, the vector fields
\[ W_i = \frac{1}{\sqrt{|\Lambda^W_i|}}\, W , \qquad i = 1, 2 , \]
do not depend on the choice of $W$. In particular, $\Lambda^{X_5}_i = \Lambda_i$, so that
\[ W_i = \frac{1}{\sqrt{|\Lambda_i|}}\, X_5 , \qquad i = 1, 2 . \]
By duality, the 1-forms
\[ \vartheta_i = \sqrt{|\Lambda_i|}\, \omega^5 , \qquad i = 1, 2 , \]
are almost invariant as well.

It is not difficult to construct further almost invariant forms. For instance, the forms $\vartheta_{ij} = R_i \,\lrcorner\, \vartheta_j$, $i = 1, 2$, are manifestly almost invariant and have the following local expressions:
\[ \vartheta_{1j} = \sqrt{|\Lambda_j|}\, \omega^1 \wedge \omega^2 , \qquad \vartheta_{2j} = -\sqrt{|\Lambda_j|}\, \omega^3 \wedge \omega^4 . \]
The products
\[ \rho_j = (-\operatorname{sign} \Lambda_j)\, \vartheta_{1j} \wedge \vartheta_{2j} = \Lambda_j\, \omega^1 \wedge \omega^2 \wedge \omega^3 \wedge \omega^4 , \qquad j = 1, 2 , \tag{39} \]
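which are volume forms on the Cartan distribution $D^1_{\mathcal{E}} \oplus D^2_{\mathcal{E}}$, are, obviously, fully invariant. This is a very simple example of how an invariant can be constructed from almost invariants.

The forms $\rho_j$ can be described in a manifestly invariant way as follows:
\[ \rho_1 = \tfrac12 \bigl( (R^1_2 \,\lrcorner\, R_1) \,\lrcorner\, (R^1_2 \,\lrcorner\, R_1) \bigr)^{\cdot} , \qquad \rho_2 = \tfrac12 \bigl( (R^1_1 \,\lrcorner\, R_2) \,\lrcorner\, (R^1_1 \,\lrcorner\, R_2) \bigr)^{\cdot} , \]
where $\cdot$ stands for the self-contraction. Note that the form
\[ \rho_{12} = I^j \rho_j = \Lambda_{12}\, \omega^1 \wedge \omega^2 \wedge \omega^3 \wedge \omega^4 \tag{40} \]
is invariant too. Similarly, one can construct many other invariant forms. Some of them are:
\[ \begin{aligned}
(R^1_2 \,\lrcorner\, R_1)^{\cdot} &= -(b^1_{35}\omega^3 + b^1_{45}\omega^4) \wedge \omega^2 + (b^2_{35}\omega^3 + b^2_{45}\omega^4) \wedge \omega^1 , \\
(R^1_1 \,\lrcorner\, R_2)^{\cdot} &= (b^3_{15}\omega^1 + b^3_{25}\omega^2) \wedge \omega^4 - (b^4_{15}\omega^1 + b^4_{25}\omega^2) \wedge \omega^3 , \\
\bigl( (R^1_1 \,\lrcorner\, R_2) \,\lrcorner\, R^1_1 \bigr)^{\cdot} &= 2\Lambda_2\, \omega^1 \wedge \omega^2 \wedge \omega^5 , \\
\bigl( (R^1_2 \,\lrcorner\, R_1) \,\lrcorner\, R^1_2 \bigr)^{\cdot} &= 2\Lambda_1\, \omega^3 \wedge \omega^4 \wedge \omega^5 .
\end{aligned} \tag{41} \]
Now it is easy to construct almost invariant volume forms:
\[ \vartheta_j \wedge \rho_j = |\Lambda_j|^{3/2}\, \omega^1 \wedge \omega^2 \wedge \omega^3 \wedge \omega^4 \wedge \omega^5 , \qquad j = 1, 2 . \tag{42} \]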
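The subdivision into the subclasses “h”, “p” and “e” of Sect. 4.5.1 depends only on the pointwise values of $\Lambda_1$, $\Lambda_2$, $\Lambda_{12}$. The following Python sketch evaluates $I^1$, $I^2$ from (35) and applies the sign criterion of the text; it also computes the discriminant $f^4(\Lambda_{12}^2 - 4\Lambda_1\Lambda_2)$ of polynomial (38) directly for comparison. The function names are hypothetical. Note that for $\Lambda_{12} = 0$ the criterion and the discriminant can have different signs, so the comparison is meaningful only when $\Lambda_{12} \neq 0$; this caveat is an observation of the sketch, not a statement of the paper.

```python
from fractions import Fraction


def basic_invariants(lam1, lam2, lam12):
    """I^1 and I^2 from (35); Lambda_1, Lambda_2 must be nonzero."""
    if lam1 == 0 or lam2 == 0:
        raise ValueError("Lambda_1 and Lambda_2 must be nonzero")
    return Fraction(lam12) / Fraction(lam1), Fraction(lam12) / Fraction(lam2)


def subclass(lam1, lam2, lam12):
    """Subclass 'h', 'p' or 'e' by the sign of I^1 I^2 (I^1 I^2 - 4)."""
    i1, i2 = basic_invariants(lam1, lam2, lam12)
    value = i1 * i2 * (i1 * i2 - 4)
    if value > 0:
        return "h"
    if value == 0:
        return "p"
    return "e"


def discriminant_38(f, lam1, lam2, lam12):
    """Discriminant of lambda^2 - f^2 L12 lambda + f^4 L1 L2."""
    return f**4 * (lam12**2 - 4 * lam1 * lam2)


for sample in [(1, 1, 3), (1, 1, 2), (1, 1, 1), (1, -1, 1)]:
    print(sample, subclass(*sample), discriminant_38(1, *sample))
```

Exact rational arithmetic is used so that the borderline case “p” is detected without rounding issues.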
5 Differential invariants on $J^3\pi$

Since $\omega^k(X_l) = \mathrm{const}$, namely, $\delta^k_l$, we have
\[ d\omega^k(X_i, X_j) = -\omega^k([X_i, X_j]) \]
(see the proof of Lemma 4.1). This implies the useful formula
\[ d\omega^k = -\sum_{i<j} b^k_{ij}\, \omega^i \wedge \omega^j . \tag{43} \]
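5.1 The complete parallelism

First, note that the invariant differential 1-forms $dI^1$ and $dI^2$ live on $J^3\pi$. This leads us immediately to another set of invariant differential 1-forms on $J^3\pi$:
\[ \begin{aligned}
\Omega^1 &= P_1 \,\lrcorner\, dI^1 = X_1(I^1)\, \omega^1 + X_2(I^1)\, \omega^2 , \\
\Omega^2 &= P_1 \,\lrcorner\, dI^2 = X_1(I^2)\, \omega^1 + X_2(I^2)\, \omega^2 , \\
\Omega^3 &= P_2 \,\lrcorner\, dI^1 = X_3(I^1)\, \omega^3 + X_4(I^1)\, \omega^4 , \\
\Omega^4 &= P_2 \,\lrcorner\, dI^2 = X_3(I^2)\, \omega^3 + X_4(I^2)\, \omega^4 , \\
\Omega^5_1 &= P_3 \,\lrcorner\, dI^1 = X_5(I^1)\, \omega^5 , \qquad \Omega^5_2 = P_3 \,\lrcorner\, dI^2 = X_5(I^2)\, \omega^5 .
\end{aligned} \tag{44} \]
Supposing that $\mathcal{E}$ is a generic equation, we henceforth assume that
\[ X_5(I^1) \neq 0 , \qquad X_5(I^2) \neq 0 , \tag{45} \]
and
\[ \Delta_1 = \begin{vmatrix} X_1(I^1) & X_2(I^1) \\ X_1(I^2) & X_2(I^2) \end{vmatrix} \neq 0 , \qquad \Delta_2 = \begin{vmatrix} X_3(I^1) & X_4(I^1) \\ X_3(I^2) & X_4(I^2) \end{vmatrix} \neq 0 . \tag{46} \]
This means that the two sets of forms $\{\Omega^1, \dots, \Omega^4, \Omega^5_1\}$ and $\{\Omega^1, \dots, \Omega^4, \Omega^5_2\}$ are invariant coframes on $M$ (we omit the subscript $\mathcal{E}$ according to Remark 3.3). Each of these coframes determines an invariant complete parallelism on $M$.

The frames $\{Y_1, \dots, Y_4, Y^1_5\}$ and $\{Y_1, \dots, Y_4, Y^2_5\}$ dual to the above constructed coframes are obviously invariant. An explicit description of them is:
\[ \begin{aligned}
Y_1 &= \frac{1}{\Delta_1} \bigl( X_2(I^2)\, X_1 - X_1(I^2)\, X_2 \bigr) , \\
Y_2 &= \frac{1}{\Delta_1} \bigl( -X_2(I^1)\, X_1 + X_1(I^1)\, X_2 \bigr) , \\
Y_3 &= \frac{1}{\Delta_2} \bigl( X_4(I^2)\, X_3 - X_3(I^2)\, X_4 \bigr) , \\
Y_4 &= \frac{1}{\Delta_2} \bigl( -X_4(I^1)\, X_3 + X_3(I^1)\, X_4 \bigr) , \\
Y^1_5 &= \frac{1}{X_5(I^1)}\, X_5 , \qquad Y^2_5 = \frac{1}{X_5(I^2)}\, X_5 .
\end{aligned} \tag{47} \]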
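The duality between the coframe (44) and the frame (47) reduces to inverting a $2 \times 2$ matrix in each of the two blocks. The sketch below checks this for the block $\Omega^1, \Omega^2$ against $Y_1, Y_2$ with exact rational arithmetic. The inputs stand for the pointwise values $X_j(I^i)$; the sample numbers and function names are purely illustrative.

```python
from fractions import Fraction


def dual_block(a11, a12, a21, a22):
    """Coefficients of Y1, Y2 in the basis (X1, X2), following (47).

    The inputs are a_ij = X_j(I^i), i.e. the rows hold the
    coefficients of Omega^1 and Omega^2 in the basis (omega^1, omega^2).
    """
    delta = Fraction(a11) * a22 - Fraction(a12) * a21   # Delta_1 of (46)
    if delta == 0:
        raise ValueError("genericity condition (46) is violated")
    y1 = (a22 / delta, -a21 / delta)
    y2 = (-a12 / delta, a11 / delta)
    return y1, y2


def pair(form, vector):
    """Value of a 1-form on a vector, both given by two coefficients."""
    return form[0] * vector[0] + form[1] * vector[1]


omega1, omega2 = (2, 3), (5, 7)          # sample values X_j(I^1), X_j(I^2)
y1, y2 = dual_block(*omega1, *omega2)
table = [[pair(o, y) for y in (y1, y2)] for o in (omega1, omega2)]
```

The block for $\Omega^3, \Omega^4$ and $Y_3, Y_4$ is handled in exactly the same way with $X_3, X_4$ and $\Delta_2$.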
5.2 More scalar invariants on $J^3\pi$

Among the numerous invariants constructed previously there are functions, (vector-valued) differential forms, and vector fields. Further invariants can be obtained just by applying various operations of tensor algebra, Frölicher–Nijenhuis brackets, etc., to these objects. Moreover, components of an invariant object with respect to an invariant basis are scalar differential invariants. These simple general tricks are rather efficient and were already used in constructing differential invariants on $J^2\pi$. As for $J^3\pi$ we shall proceed along these lines as well.

The invariant 1-forms $\Omega^5_1$ and $\Omega^5_2$ are proportional. So, the proportionality factor
\[ \tilde I^3 = \frac{X_5(I^1)}{X_5(I^2)} \tag{48} \]
is a scalar differential invariant on $J^3\pi$.

Consider now the invariant 2-forms on $J^3\pi$:
\[ \begin{aligned}
R_1 \,\lrcorner\, dI^1 &= I^6\, \Omega^1 \wedge \Omega^2 , \\
R_1 \,\lrcorner\, dI^2 &= I^7\, \Omega^1 \wedge \Omega^2 , \\
R_2 \,\lrcorner\, dI^1 &= I^8\, \Omega^3 \wedge \Omega^4 , \\
R_2 \,\lrcorner\, dI^2 &= I^9\, \Omega^3 \wedge \Omega^4 , \\
R^1_1 \,\lrcorner\, dI^1 &= I^{10}\, \Omega^1 \wedge \Omega^5_1 + I^{11}\, \Omega^2 \wedge \Omega^5_1 , \\
R^1_1 \,\lrcorner\, dI^2 &= I^{12}\, \Omega^1 \wedge \Omega^5_1 + I^{13}\, \Omega^2 \wedge \Omega^5_1 , \\
R^1_2 \,\lrcorner\, dI^1 &= I^{14}\, \Omega^3 \wedge \Omega^5_1 + I^{15}\, \Omega^4 \wedge \Omega^5_1 , \\
R^1_2 \,\lrcorner\, dI^2 &= I^{16}\, \Omega^3 \wedge \Omega^5_1 + I^{17}\, \Omega^4 \wedge \Omega^5_1 .
\end{aligned} \tag{49} \]
Their components $I^6, \dots, I^{17}$ with respect to the base $\Omega^1, \dots, \Omega^5$ are further scalar differential invariants on $J^3\pi$. The simplest among them are $I^6 = \Delta_1/X_5(I^1)$ and $I^8 = \Delta_2/X_5(I^1)$.

In the same manner one easily finds numerous non-scalar differential invariants on $J^3\pi$. For instance, such are the 3-forms $[[P_i, R_j]]$ or $[[P_i, R^1_j]]$, the 4-forms $[[P_i, (R^1_j \,\lrcorner\, R^1_k)]]$, the 5-forms $[[P_i, R^1_j]] \,\lrcorner\, [[P_k, R^1_l]]$, etc.
5.3 Better manageable invariants

From the above one can see that there are sufficient resources for constructing differential invariants, and the main problem becomes to select functionally independent ones in the simplest possible way. From the technical point of view this forces us to look for manageable invariants, for instance, those that have local expressions as simple as possible. In the considered context, help comes from almost invariant objects, as illustrated below.

In view of (39), (40) and (43), for $\sigma = 1, 2, 12$ we have the invariant 5-forms
\[ d\rho_\sigma = d(\Lambda_\sigma\, \omega^1 \wedge \omega^2 \wedge \omega^3 \wedge \omega^4) = \bigl( X_5(\Lambda_\sigma) + \Lambda_\sigma B \bigr)\, \omega^1 \wedge \omega^2 \wedge \omega^3 \wedge \omega^4 \wedge \omega^5 , \tag{50} \]
where $B = b^1_{15} + b^2_{25} + b^3_{35} + b^4_{45} = 2(b^1_{15} + b^2_{25}) = 2(b^3_{35} + b^4_{45})$ according to (31). By comparing these 5-forms with (42) we obtain the almost scalar invariants
\[ I^j_\sigma = \frac{X_5(\Lambda_\sigma) + \Lambda_\sigma B}{|\Lambda_j|^{3/2}} , \qquad \sigma = 1, 2, 12, \quad j = 1, 2 , \tag{51} \]
on $J^3\pi$, which are better manageable in comparison to those constructed in the previous subsection. The squares $(I^j_\sigma)^2$ are, obviously, full scalar invariants. Apart from the obvious relation $(I^1_\sigma/I^2_\sigma)^2 = (I^1/I^2)^3$ they are functionally independent. Some of the earlier constructed invariants can be expressed in terms of the $I^j_\sigma$'s, e.g.,
\[ \frac{X_5(I^1)}{X_5(I^2)} = \frac{(I^j_{12} - I^j_1 I^1)\, I^1}{(I^j_{12} - I^j_2 I^2)\, I^2} , \qquad j = 1, 2 . \]

6 The equivalence problem
So far we obtained two independent second-order scalar invariants I 1 , I 2 (see (35)) and a number of third-order invariants. Put (see (51)) I 3 = (I11 )2 ,
I 4 = (I21 )2 ,
1 2 I 5 = (I12 ),
The following statement can be checked by a direct computer-supported calculation in coordinates (30):

Theorem 6.1. For a generic hyperbolic Monge–Ampère equation $\mathcal{E}$, the values $I^j_{\mathcal{E}}$ of the invariants $I^j$, $j = 1, \dots, 5$, on $\mathcal{E}$ are functionally independent on the base $M$.

Of course, this choice of basic scalar invariants is not unique. For instance, the functions $I^1_{\mathcal{E}}, I^2_{\mathcal{E}}, \tilde{I}^3_{\mathcal{E}}, I^6_{\mathcal{E}}, I^8_{\mathcal{E}}$ (see (48), (49)) are functionally independent on $\mathcal{E}$ as well. However, this and other reasonable choices are “less manageable” than the one made in the above theorem. Unfortunately, this fact is not clearly seen from the above exposition, since we were forced to skip technical details of computations.

According to “the principle of n invariants” [24], any quintuple of functionally independent scalar invariants gives a solution of the equivalence problem for generic hyperbolic Monge–Ampère equations. Theorem 6.1 guarantees the existence of such a quintuple, namely $I^1, \dots, I^5$. More exactly, let $\mathcal{E}$ be a generic hyperbolic Monge–Ampère equation considered as an unordered pair of 2-dimensional, skew-orthogonal subdistributions $D^1_{\mathcal{E}}$ and $D^2_{\mathcal{E}}$ of the Cartan distribution on $M$, that is,
\[ \mathcal{E} = \bigl(\langle X_{1,\mathcal{E}}, X_{2,\mathcal{E}}\rangle, \langle X_{3,\mathcal{E}}, X_{4,\mathcal{E}}\rangle\bigr), \]
where the vector fields $X_1, \dots, X_4$ are defined by (18). Observe now that the distributions $D^1_{\mathcal{E}} = \langle X_{1,\mathcal{E}}, X_{2,\mathcal{E}}\rangle$ and $D^2_{\mathcal{E}} = \langle X_{3,\mathcal{E}}, X_{4,\mathcal{E}}\rangle$ are determined uniquely by the values on $\mathcal{E}$ of the invariant bivectors $W_1 = Y_1 \wedge Y_2$ and $W_2 = Y_3 \wedge Y_4$, respectively, with the fields $Y_i$ defined in (47). Obviously,
\[ W_1 = \frac{1}{\Delta_1}\, X_1 \wedge X_2, \qquad W_2 = \frac{1}{\Delta_2}\, X_3 \wedge X_4. \]
M. Marvan et al. / Central European Journal of Mathematics 5(1) 2007 105–133
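Since $I^1_{\mathcal{E}}, \dots, I^5_{\mathcal{E}}$ are functionally independent, they form an invariant chart in $M$ denoted by $I_{\mathcal{E}}$. Observe that $Y_{i+\varepsilon}(I^j) = \delta_{ij}$ for $i, j = 1, 2$, $\varepsilon = 0, 2$. So,
\[ Y_{i+\varepsilon} = \delta_{ij}\,\frac{\partial}{\partial I^j} + Y^{\alpha}_{i+\varepsilon}\,\frac{\partial}{\partial I^{\alpha}} \]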
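with $i, j = 1, 2$, $\varepsilon = 0, 2$, $\alpha = 3, 4, 5$. The functions $\nu^{\alpha}_k = Y^{\alpha}_k(I^1, \dots, I^5)$ are, obviously, differential invariants. They completely determine the vector fields $Y_1, \dots, Y_4$ and, so, the distributions $D^i_{\mathcal{E}}$, $i = 1, 2$. Consider now the functions $\nu^{\alpha}_{k,\mathcal{E}} = Y^{\alpha}_{k,\mathcal{E}} \circ I_{\mathcal{E}}^{-1}$. They are functions in a certain domain of the arithmetic vector space $\mathbb{R}^5 = \{(z_1, \dots, z_5)\}$ and will be called normal parameters of the equation $\mathcal{E}$.

Theorem 6.2. Let $\mathcal{E}$ and $\tilde{\mathcal{E}}$ be generic hyperbolic Monge–Ampère equations. Then $\mathcal{E}$ and $\tilde{\mathcal{E}}$ are (locally) equivalent iff their normal parameters coincide, i.e., iff
\[ \nu^{\alpha}_{k,\mathcal{E}}(z_1, \dots, z_5) \equiv \nu^{\alpha}_{k,\tilde{\mathcal{E}}}(z_1, \dots, z_5), \qquad k = 1, \dots, 4, \quad \alpha = 3, 4, 5. \]

Proof. Let $I_{\mathcal{E}} = (I^1_{\mathcal{E}}, \dots, I^5_{\mathcal{E}})$ and $I_{\tilde{\mathcal{E}}} = (I^1_{\tilde{\mathcal{E}}}, \dots, I^5_{\tilde{\mathcal{E}}})$ be invariant charts for $\mathcal{E}$ and $\tilde{\mathcal{E}}$, respectively. The “only if” part of the theorem is obvious. Now assume that the normal parameters of $\mathcal{E}$ and $\tilde{\mathcal{E}}$ coincide. Then the diffeomorphism $f = I_{\tilde{\mathcal{E}}}^{-1} \circ I_{\mathcal{E}}$ is such that $I^i_{\mathcal{E}} = f^*(I^i_{\tilde{\mathcal{E}}})$ and, consequently, $Y^{\alpha}_{k,\mathcal{E}} = f^*(Y^{\alpha}_{k,\tilde{\mathcal{E}}})$. This shows that $f$ sends the vector fields $Y_{k,\mathcal{E}}$ to the vector fields $Y_{k,\tilde{\mathcal{E}}}$, $k = 1, \dots, 4$, and, therefore, $D^i_{\mathcal{E}}$ to $D^i_{\tilde{\mathcal{E}}}$, $i = 1, 2$. Since $\mathcal{C} = D^1_{\mathcal{E}} \oplus D^2_{\mathcal{E}}$ and $\mathcal{C} = D^1_{\tilde{\mathcal{E}}} \oplus D^2_{\tilde{\mathcal{E}}}$, the diffeomorphism $f$ is automatically contact.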
Remark 6.3. A system of functions $f^{\alpha}_k$ in a domain of $\mathbb{R}^5$ can be realized as the system of normal parameters $\nu^{\alpha}_{k,\mathcal{E}}$ of a hyperbolic Monge–Ampère equation $\mathcal{E}$ iff it is a solution of a system of partial differential equations (see [1, 24]), and the algebra of differential invariants of Monge–Ampère equations is then interpreted to be the smooth function algebra on the infinite prolongation of this system. According to [1, 24], it is not difficult to describe this system explicitly, but the result is rather cumbersome and not very instructive. This is why we do not report it here. More satisfactory results in this direction will be presented in a separate paper.

Nevertheless, it is worth mentioning that, in principle, the differential invariants constructed above allow a solution of the classification problem for generic hyperbolic Monge–Ampère equations. There are alternative equivalent formulations of the classification theorem. For instance, one of them is as follows. Consider the 1-forms $\Omega^1, \dots, \Omega^5$ defining the complete parallelism on $M$. In the invariant coordinate system $I^1_{\mathcal{E}}, \dots, I^5_{\mathcal{E}}$, these forms are described in terms of functions $\Omega^i_j(I^1_{\mathcal{E}}, \dots, I^5_{\mathcal{E}})$:
\[ \Omega^i = \sum_{j=1}^{5} \Omega^i_j(I^1_{\mathcal{E}}, \dots, I^5_{\mathcal{E}})\, dI^j_{\mathcal{E}}, \qquad i = 1, \dots, 5. \]
Theorem 6.4. The (local) equivalence class of a generic equation $\mathcal{E}$ with respect to contact transformations is uniquely determined by the family of functions $\Omega^i_j(I^1_{\mathcal{E}}, \dots, I^5_{\mathcal{E}})$, $i = 1, \dots, 5$.

Proof. Let $\tilde{\mathcal{E}}$ be another generic Monge–Ampère equation such that there exists a contact transformation transforming it to $\mathcal{E}$. Then, obviously, the functions $\Omega^i_j(I^1_{\mathcal{E}}, \dots, I^5_{\mathcal{E}})$ and $\tilde{\Omega}^i_j(I^1_{\tilde{\mathcal{E}}}, \dots, I^5_{\tilde{\mathcal{E}}})$ coincide for all $i$ and $j$.

Let $\mathcal{E}$, $\tilde{\mathcal{E}}$ be Monge–Ampère equations such that for all $i$ and $j$ the functions $\Omega^i_j(I^1_{\mathcal{E}}, \dots, I^5_{\mathcal{E}})$ and $\tilde{\Omega}^i_j(I^1_{\tilde{\mathcal{E}}}, \dots, I^5_{\tilde{\mathcal{E}}})$ coincide. Let $I_{\mathcal{E}} = (I^1_{\mathcal{E}}, \dots, I^5_{\mathcal{E}})$ and $I_{\tilde{\mathcal{E}}} = (I^1_{\tilde{\mathcal{E}}}, \dots, I^5_{\tilde{\mathcal{E}}})$ be invariant coordinate systems in $M$ for $\mathcal{E}$ and $\tilde{\mathcal{E}}$, respectively. Then $I_{\tilde{\mathcal{E}}}^{-1} \circ I_{\mathcal{E}}$ is a locally defined diffeomorphism $M \to M$. This diffeomorphism is a contact transformation because it transforms $\Omega^i = \sum_{j=1}^{5} \Omega^i_j(I^1_{\mathcal{E}}, \dots, I^5_{\mathcal{E}})\, dI^j_{\mathcal{E}}$ to $\sum_{j=1}^{5} \tilde{\Omega}^i_j(I^1_{\tilde{\mathcal{E}}}, \dots, I^5_{\tilde{\mathcal{E}}})\, dI^j_{\tilde{\mathcal{E}}} = \tilde{\Omega}^i$, $i = 1, \dots, 5$, and, in particular, the contact form $\Omega^5$ to the contact form $\tilde{\Omega}^5$. For obvious reasons it also transforms the pair of distributions $(D^1_{\mathcal{E}}, D^2_{\mathcal{E}})$ to the pair $(D^1_{\tilde{\mathcal{E}}}, D^2_{\tilde{\mathcal{E}}})$ and hence $\mathcal{E}$ to $\tilde{\mathcal{E}}$.
7
Examples
The examples discussed in this section aim to illustrate the character and complexity of the problems related to actual computations and use of differential invariants. Henceforth the invariants $I^i$ are denoted by $I_i$.

Example 7.1. Consider the equation
\[ \tfrac{1}{4}(z_{xx} z_{yy} - z_{xy}^2) + y^2 z_{xx} - 2xy\, z_{xy} + x^2 z_{yy} + x^2 y^2 z^2 = 0. \]
The first two invariants are $I_1 = z n_+/d$, $I_2 = z n_-/d$, where
\[
\begin{aligned}
n_\pm ={}& 2(z + 3y^4 \mp 2)x^2 z_x^2 - (z + 12x^2y^2)xy\, z_x z_y + 2(z + 3x^4 \pm 2)y^2 z_y^2 \\
&+ (z^2 + 8x^2y^2 z + 4y^4 z \pm 4z \pm 16x^2y^2 \pm 16y^4 - 12)\,x z_x \\
&+ (z^2 + 4x^4 z + 8x^2y^2 z \mp 4z \mp 16x^4 \mp 16x^2y^2 - 12)\,y z_y \\
&+ 2z^3 + 36x^4y^4 z^3 + 6y^4 z^2 - 4x^2y^2 z^2 + 6x^4 z^2 - 8z \mp 16x^4 z \pm 16y^4 z + 8x^4 + 16x^2y^2 + 8y^4, \\
d ={}& 4(z^2 + 3y^4 z + 4)x^2 z_x^2 - 2(z^2 + 12x^2y^2 z - 16)xy\, z_x z_y + 4(z^2 + 3x^4 z + 4)y^2 z_y^2 \\
&+ 2(z^2 + 8x^2y^2 z + 4y^4 z + 20)\,x z z_x + 2(z^2 + 4x^4 z + 8x^2y^2 z + 20)\,y z z_y \\
&+ 4(18x^4y^4 z^3 + z^3 + 3x^4 z^2 - 2x^2y^2 z^2 + 3y^4 z^2 + 12z + 4x^4 + 8x^2y^2 + 4y^4)\,z.
\end{aligned}
\]
The invariants $I_3$, $I_4$, $I_5$ are large fractions whose non-reducible numerators are polynomials of order three in $z_x$, $z_y$, five in $z$, and six in $x$, $y$. The invariants $I_s$, $s > 5$, are even more cumbersome. Computation shows that the Jacobian $\partial(I_1, I_2, I_3, I_4, I_5)/\partial(x, y, z, z_x, z_y)$ is nonzero, hence the first five invariants are functionally independent and can be chosen to be local coordinates on $J^1(\tau)$. Although an explicit inversion is rather hopeless, one can still find
algorithmically the relations connecting the principal invariants $I_1, \dots, I_5$ and the higher $I_k$, at least in principle. This kind of procedure is outlined in Example 7.2 below.

Example 7.2. Put $\zeta = z_x + z_y + e$ and consider the family of equations
\[ (4 z_x z_y + \zeta^2)(z_{xx} z_{yy} - z_{xy}^2) + 4\zeta^2 (z_y z_{xx} + z_x z_{yy} + \zeta^2) = 0 \tag{52} \]
depending on the parameter $e$. Assuming that $e \neq 0$, we have
\[
\begin{aligned}
I_1 &= 2\,\frac{(z_x + z_y)^2 + 3e(z_x + z_y) + 4e^2}{5e z_x + e z_y + 4e^2}, \\
I_2 &= 2\,\frac{(z_x + z_y)^2 + 3e(z_x + z_y) + 4e^2}{e z_x + 5e z_y + 4e^2}, \\
I_3 &= 2^{3/2}\,\frac{7z_x^2 + 6 z_x z_y - z_y^2 + 33e z_x + 5e z_y + 21e^2}{e^{1/2}(5z_x + z_y + 4e)^{3/2}}, \\
I_4 &= 2^{3/2}\,\frac{-z_x^2 + 6 z_x z_y + 7z_y^2 + 5e z_x + 33e z_y + 21e^2}{e^{1/2}(5z_x + z_y + 4e)^{3/2}}, \\
I_5 &= 2^{5/2}\,\frac{(z_x + z_y)^3 + 7e(z_x + z_y)^2 + 17e^2(z_x + z_y) + 21e^3}{e^{3/2}(5z_x + z_y + 4e)^{3/2}}.
\end{aligned}
\]
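The functional independence of $I_1$ and $I_2$ in the variables $z_x$, $z_y$ can be checked by hand. The following is only a sketch computed from the expressions for $I_1$, $I_2$ in this example; the abbreviations $s$, $N$, $D_1$, $D_2$ are introduced here for the computation and do not appear elsewhere in the paper. Write $s = z_x + z_y$, $N = s^2 + 3es + 4e^2$, $D_1 = 5e z_x + e z_y + 4e^2$, $D_2 = e z_x + 5e z_y + 4e^2$, so that $I_1 = 2N/D_1$ and $I_2 = 2N/D_2$. The quotient rule, together with $\partial N/\partial z_x = \partial N/\partial z_y = 2s + 3e$ and $D_1 + D_2 = 6es + 8e^2$, gives
\[
\frac{\partial(I_1, I_2)}{\partial(z_x, z_y)}
= \frac{16 e N}{D_1^2 D_2^2}\,\bigl(6eN - (2s + 3e)(D_1 + D_2)\bigr)
= -\,\frac{32\, e^2 N\, s\,(3s + 8e)}{D_1^2 D_2^2}.
\]
Hence the Jacobian is nonzero wherever $e$, $N$, $s$ and $3s + 8e$ do not vanish. The factors $s = z_x + z_y$, $3s + 8e$, $N$ and $D_1 D_2/e^2$ also occur in the expression for $\Delta_1 = \Delta_2$ given later in this example.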
All invariants are independent of $x$, $y$, $z$, reflecting the fact that $x \to x + t_1$, $y \to y + t_2$, $z \to z + t_3$ are symmetries of equation (52). One easily checks that $I_1$, $I_2$ are functionally independent, but it is still not straightforward to express $z_x$, $z_y$ in terms of $I_1$, $I_2$ explicitly. To establish the dependence of $I_s$, $s > 3$, on $I_1$, $I_2$, we observe that for every $s$ there exists a polynomial $P_s(z_x, z_y, I_s)$ such that $I_s$ is a solution of the equation $P_s = 0$. Then what we need is to eliminate $z_x$, $z_y$ from the system
\[
\begin{aligned}
I_1 - 2\,\frac{(z_x + z_y)^2 + 3e(z_x + z_y) + 4e^2}{5e z_x + e z_y + 4e^2} &= 0, \\
I_2 - 2\,\frac{(z_x + z_y)^2 + 3e(z_x + z_y) + 4e^2}{e z_x + 5e z_y + 4e^2} &= 0, \\
P_s(z_x, z_y, I_s) &= 0.
\end{aligned}
\]
To this end, it suffices to compute the Gröbner basis of the last system with respect to an “elimination ordering” of monomials. With the help of the Groebner package of Maple 10 the following quadratic equation for $I_3$,
\[
\begin{aligned}
0 ={}& 4096\, I_2^6 I_3^2 - I_2^3\bigl(729\, I_1^3 I_2^3 - 1971\, I_1^3 I_2^2 + 20493\, I_1^2 I_2^3 + 3563\, I_1^3 I_2 - 51114\, I_1^2 I_2^2 + 183915\, I_1 I_2^3 \\
&\quad + 3951\, I_1^3 - 52723\, I_1^2 I_2 + 45517\, I_1 I_2^2 + 102191\, I_2^3\bigr) I_3 \\
&+ \bigl(27\, I_1^3 I_2^2 - 81\, I_1^2 I_2^3 - 32\, I_1^3 I_2 + 426\, I_1^2 I_2^2 - 1206\, I_1 I_2^3 - 44\, I_1^3 + 270\, I_1^2 I_2 - 800\, I_1 I_2^2 - 1114\, I_2^3\bigr)^2,
\end{aligned}
\]
can be found rather quickly, as well as similar quadratic equations for $I_4$, $I_5$. The assumptions of Sect. 5.1 are satisfied as well. In particular, $\Delta_1$, $\Delta_2$ are nonzero since
\[
\Delta_1 = \Delta_2 = -128\,\frac{(z_x + z_y)(3z_x + 3z_y + 8e)(z_x + z_y + e)^4}{e^2 (z_x + 5z_y + 4e)^2 (5z_x + z_y + 4e)^2} \times \frac{z_x^2 + 2 z_x z_y + z_y^2 + 3e z_x + 3e z_y + 4e^2}{z_x^2 + 6 z_x z_y + z_y^2 + 2e z_x + 2e z_y + e^2}.
\]
This enables us to compute the higher invariants. For instance, $I_6$ is a solution of the quadratic equation
\[
\begin{aligned}
0 ={}& -16 I_1^2 \bigl(27 I_1^4 I_2 - 27 I_1^3 I_2^2 + 22 I_1^4 - 56 I_1^3 I_2 - 2 I_1^2 I_2^2 \\
&\quad + 8 I_1^2 I_2 - 42 I_1^3 + 50 I_1 I_2^2 + 28 I_1^2 + 56 I_1 I_2 + 28 I_2^2\bigr) I_6^2 \\
&+ I_1 (I_1 I_2 - I_1 - I_2)(9 I_1 I_2 + 7 I_1 + 7 I_2)\bigl(3 I_1^3 I_2 - 3 I_1^2 I_2^2 \\
&\quad - 26 I_1^3 - 34 I_1^2 I_2 - 8 I_1 I_2^2 + 18 I_1^2 + 36 I_1 I_2 + 18 I_2^2\bigr) I_6 \\
&+ (I_1 + I_2)^2 (I_1 I_2 - I_1 - I_2)(9 I_1 I_2 + 7 I_1 + 7 I_2)(I_1 I_2 - 2 I_1 - 2 I_2)^2.
\end{aligned}
\]
Although every invariant computed so far depends on $e$, its expression in terms of $I_1$, $I_2$ does not. This suggests that the parameter $e$ is removable. And indeed, after the substitution $z \to ez$, equation (52) becomes equivalent to itself with $e = 1$. Thus, the family of equations (52) consists of a continuum of generic members with $e \neq 0$, which are all mutually equivalent, and a single non-generic member with $e = 0$ (in which case $\Lambda_1 = \Lambda_2 = 0$).

Example 7.3. Consider the family of equations
\[ \tfrac{1}{4}(z_{xx} z_{yy} - z_{xy}^2) + y^2 z_{xx} - 2xy\, z_{xy} + x^2 z_{yy} + e x^2 y^2 = 0, \]
depending on a real parameter $e \neq 4$. Then the first five invariants are constants,
\[ I_1 = I_2 = 2\,\frac{e + 12}{e - 4}, \qquad I_3 = I_4 = \frac{800}{e - 4}, \qquad I_5 = 3200\,\frac{(e + 12)^2}{(e - 4)^3}, \]
while the higher invariants $I_s$ are undefined. The equation belongs to the subclass “h”, or “p”, or “e” (see 4.5.1) if $e > -4$ or $e = -4$ or $e < -4$, respectively.
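The scaling argument of Example 7.2 (the substitution z → ez reduces (52) to the case e = 1) can be spot-checked on the explicit formulas for I1, ..., I5: each of them should take the same value at (zx, zy, e) and at (zx/e, zy/e, 1). The sketch below does this in exact rational arithmetic. The function name `invariants` and the sample point are illustrative assumptions, and I3, I4, I5 are compared through their squares so that no square roots are needed.

```python
from fractions import Fraction as F


def invariants(zx, zy, e):
    """Return (I1, I2, I3^2, I4^2, I5^2) of Example 7.2 as exact rationals.

    Hypothetical helper: the formulas are transcribed from Example 7.2,
    with I3, I4, I5 squared to avoid fractional powers.
    """
    s = zx + zy
    n = s * s + 3 * e * s + 4 * e * e          # common numerator of I1 and I2
    d1 = 5 * e * zx + e * zy + 4 * e * e       # denominator of I1
    d2 = e * zx + 5 * e * zy + 4 * e * e       # denominator of I2
    w = 5 * zx + zy + 4 * e                    # base of the 3/2 powers
    n3 = 7 * zx**2 + 6 * zx * zy - zy**2 + 33 * e * zx + 5 * e * zy + 21 * e**2
    n4 = -zx**2 + 6 * zx * zy + 7 * zy**2 + 5 * e * zx + 33 * e * zy + 21 * e**2
    n5 = s**3 + 7 * e * s**2 + 17 * e**2 * s + 21 * e**3
    return (
        2 * n / d1,
        2 * n / d2,
        8 * n3**2 / (e * w**3),        # (2^(3/2) n3 / (e^(1/2) w^(3/2)))^2
        8 * n4**2 / (e * w**3),
        32 * n5**2 / (e**3 * w**3),    # (2^(5/2) n5 / (e^(3/2) w^(3/2)))^2
    )


# Sample point (arbitrary, chosen so that no denominator vanishes).
zx, zy, e = F(1, 3), F(2, 5), F(7, 2)
original = invariants(zx, zy, e)
rescaled = invariants(zx / e, zy / e, F(1))
print(original == rescaled)
```

Agreement at sample points is of course only a consistency check of the transcribed formulas, not a proof of the equivalence stated in the example.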
Acknowledgements
M. Marvan acknowledges the support from GAČR under grant 201/04/0538 and MŠMT under project MSM 4781305904.
References
[1] D.V. Alekseevsky, A.M. Vinogradov and V.V. Lychagin: “Basic ideas and concepts of differential geometry”, In: Geometry, I, Encyclopaedia Math. Sci., Vol. 28, Springer, Berlin, 1991, pp. 1–264.
[2] A. Frölicher and A. Nijenhuis: “Theory of vector valued differential forms. Part I: Derivations in the graded ring of differential forms”, Indag. Math., Vol. 18, (1956), pp. 338–359.
[3] P. Hartman and A. Wintner: “On hyperbolic partial differential equations”, Am. J. Math., Vol. 74, (1952), pp. 834–864.
[4] I.S. Krasil’shchik, V.V. Lychagin and A.M. Vinogradov: Geometry of Jet Spaces and Nonlinear Partial Differential Equations, Gordon and Breach, New York, 1986.
[5] I.S. Krasil’shchik and A.M. Vinogradov (Ed.): Symmetries and Conservation Laws for Differential Equations of Mathematical Physics, Translations of Mathematical Monographs, Vol. 182, American Mathematical Society, Providence RI, 1999.
[6] B.S. Kruglikov: “Some classificational problems in four-dimensional geometry: distributions, almost complex structures and the generalized Monge–Ampère equations”, Math. Sbornik, Vol. 189(11), (1998), pp. 61–74 (in Russian); English translation in Sb. Math., Vol. 186(11–12), (1998), pp. 1643–1656; e-print: http://xxx.lanl.gov/abs/dg-ga/9611005.
[7] B.S. Kruglikov: “Symplectic and contact Lie algebras with application to the Monge–Ampère equations”, Trudy Mat. Inst. Steklova, Vol. 221, (1998), pp. 232–246 (in Russian); English translation in Proc. Steklov Math. Inst., Vol. 221(2), (1998), pp. 221–235; e-print: http://xxx.lanl.gov/abs/dg-ga/9709004.
[8] B.S. Kruglikov: “Classification of Monge–Ampère equations with two variables”, In: Geometry and Topology of Caustics - CAUSTICS ’98 (Warsaw), Banach Center Publications, Vol. 50, Polish Acad. Sci., Warsaw, 1999, pp. 179–194.
[9] A. Kushner: “Monge–Ampère equations and e-structures”, Dokl. Akad. Nauk, Vol. 361(5), (1998), pp. 595–596.
[10] H. Lewy: “Über das Anfangswertproblem bei einer hyperbolischen nichtlinearen partiellen Differentialgleichung zweiter Ordnung mit zwei unabhängigen Veränderlichen”, Math. Annalen, Vol. 98, (1928), pp. 179–191.
[11] V.V. Lychagin: “Contact geometry and non-linear second order differential equations”, Russian Math. Surveys, Vol. 34, (1979), pp. 149–180.
[12] V.V. Lychagin: Lectures on Geometry of Differential Equations, Universita “La Sapienza”, Roma, 1992.
[13] V.V. Lychagin and V.N. Rubtsov: “Local classification of Monge-Ampere equations”, Soviet Math. Doklady, Vol. 272(1), (1983), pp. 34–38.
[14] V.V. Lychagin and V.N. Rubtsov: “On the Sophus Lie theorems for Monge-Ampere equations”, Belorussian Acad. Sci. Doklady, Vol. 27(5), (1983), pp. 396–398.
[15] V.V. Lychagin, V.N. Rubtsov and I.V. Chekalov: “A classification of Monge-Ampere equations”, Ann. Sc. Ecole Norm. Sup., Vol. 4(26), (1993), pp. 281–308.
[16] M. Marvan, A.M. Vinogradov and V.A. Yumaguzhin: “Differential invariants of generic hyperbolic Monge–Ampère equations”, Russian Acad. Sci. Dokl. Math., Vol. 405, (2005), pp. 299–301 (in Russian); English translation in: Doklady Mathematics, Vol. 72, (2005), pp. 883–885.
[17] M. Matsuda: “Two methods of integrating Monge–Ampère’s equations”, Trans. Amer. Math. Soc., Vol. 150, (1970), pp. 327–343.
[18] M. Matsuda: “Two methods of integrating Monge–Ampère’s equations. II”, Trans. Amer. Math. Soc., Vol. 166, (1972), pp. 371–386.
[19] T. Morimoto: “La géométrie des équations de Monge–Ampère”, C. R. Acad. Sci., Paris, Vol. 289, (1979), pp. A-25–A-28.
[20] T. Morimoto: “Monge–Ampère equations viewed from contact geometry”, In: Symplectic Singularities and Geometry of Gauge Fields (Warsaw, 1995), Banach Center Publ., Vol. 39, Polish Acad. Sci., Warsaw, 1997, pp. 105–121.
[21] O.P. Tchij: “Contact geometry of hyperbolic Monge–Ampère equations”, Lobachevskii Journal of Mathematics, Vol. 4, (1999), pp. 109–162.
[22] D.V. Tunitsky: “On the global solvability of hyperbolic Monge–Ampère equations”, Izv. Ross. Akad. Nauk Ser. Mat., Vol. 61(5), (1997), pp. 177–224 (in Russian); English translation in: Izv. Math, Vol. 61(5), (1997), pp. 1069–1111.
[23] D.V. Tunitsky: “Monge–Ampère equations and functors of characteristic connection”, Izv. RAN, Ser. Math., Vol. 65(6), (2001), pp. 173–222.
[24] A.M. Vinogradov: “Scalar differential invariants, diffieties and characteristic classes”, In: Mechanics, Analysis and Geometry: 200 Years after Lagrange, M. Francaviglia, North-Holland, 1991, pp. 379–414.
[25] A.M. Vinogradov and V.A. Yumaguzhin: “Differential invariants of webs on 2-dimensional manifolds”, Mat. Zametki, Vol. 48(1), (1990), pp. 46–68 (in Russian).
DOI: 10.2478/s11533-006-0037-2 Research article CEJM 5(1) 2007 134–153
Scattering properties for a pair of Schrödinger type operators on cylindrical domains Michael Melgaard∗ Department of Mathematics, Uppsala University, S-751 06 Uppsala, Sweden
Received 17 May 2006; accepted 29 September 2006 Abstract: Strong asymptotic completeness is shown for a pair of Schrödinger type operators on a cylindrical Lipschitz domain. A key ingredient is a limiting absorption principle valid in a scale of weighted (local) Sobolev spaces with respect to the uniform topology. The results are based on a refined version of Mourre’s method within the context of pseudo-selfadjoint operators. © Versita Warsaw and Springer-Verlag Berlin Heidelberg. All rights reserved. Keywords: Lipschitz domains, scattering, limiting absorption principle, weighted Sobolev spaces MSC (2000): 35P25, 47A40, 47F05, 35P05
1
Introduction and main theorem
We investigate scattering properties of two Schrödinger type operators $H$ and $H_0$ defined on a cylindrical Lipschitz domain $M = \mathbb{R} \times Q$, where $Q \subset \mathbb{R}^{n-1}$, $n \ge 2$, is an open and bounded Lipschitz domain [12]. Specifically, the operators act as
\[ H = -\partial_i m^{ij} \partial_j + V, \qquad H_0 = -\partial_i \delta^{ij} \partial_j \tag{1} \]
on $L^2(M)$ with Dirichlet boundary conditions; sums over indices are suppressed. The matrix-valued function $M \equiv (m^{ij})$ is real-valued and symmetric on $M$ and $V$ is a multiplication operator induced by a real-valued function on $M$. Henceforth $x = (x_1, \tilde{x})$ is a vector of $\mathbb{R} \times Q$ and $\langle x \rangle := (1 + |x|^2)^{1/2}$. We impose the following conditions on $M$ and $V$; here the symbol $\delta^{ij}$ refers to the components of the Euclidean metric matrix $1$.
∗
E-mail:
[email protected]
M. Melgaard / Central European Journal of Mathematics 5(1) 2007 134–153
Assumption 1.1. The following inequalities are understood in the sense of matrices.
(i) There exist positive constants $c$ and $C$ such that $c \le M(x) \le C$ for a.e. $x \in M$.
(ii) There exist $\mu_1 > 1$ and a positive constant $C_1$ such that $v^{ij}(x) := (m^{ij} - \delta^{ij})(x)$ satisfies $|v^{ij}(x)| \le C_1 \langle x_1 \rangle^{-\mu_1}$ for a.e. $x \in M$.
(iii) There exist $\mu_2 > 1$ and a positive constant $C_2$ such that $|\partial_1 m^{ij}(x)| \le C_2 \langle x_1 \rangle^{-\mu_2}$ for a.e. $x \in M$.

In particular, Assumption 1.1(ii) implies that
(ii)' $\lim_{l \to \infty} \| \chi(\pm x_1 \ge l)\,(m^{ij}(x) - \delta^{ij}) \|_{L^\infty(M)} = 0$ for all $i, j = 1, \dots, n$.

Assumption 1.2.
(i) Let $V \in L^\infty(M)$.
(ii) There exist $\nu_1 > 1$ and a positive constant $C_3$ such that $|V(x)| \le C_3 \langle x_1 \rangle^{-\nu_1}$ for a.e. $x \in M$.
(iii) There exist $\nu_2 > 1$ and a positive constant $C_4$ such that $|\partial_1 V(x)| \le C_4 \langle x_1 \rangle^{-\nu_2}$ for a.e. $x \in M$.

In particular, Assumption 1.2(ii) implies that
(ii)' $\lim_{l \to \infty} \| \chi(\pm x_1 \ge l)\, V(x) \|_{L^\infty(M)} = 0$.
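To make Assumption 1.2 concrete, here is one potential that satisfies it. The choice is purely illustrative and is not taken from the paper. Let
\[ V(x) = \langle x_1 \rangle^{-2} = \frac{1}{1 + x_1^2}, \qquad x = (x_1, \tilde{x}) \in M. \]
Then $0 < V \le 1$, so (i) holds; (ii) holds with $\nu_1 = 2$ and $C_3 = 1$; and since
\[ \partial_1 V(x) = -\frac{2 x_1}{(1 + x_1^2)^2}, \qquad |x_1| \le \langle x_1 \rangle, \]
one gets $|\partial_1 V(x)| \le 2 \langle x_1 \rangle^{-3}$, so (iii) holds with $\nu_2 = 3$ and $C_4 = 2$. By contrast, a potential decaying only like $\langle x_1 \rangle^{-1}$ would fail the requirement $\nu_1 > 1$ in (ii).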
We then define the sesquilinear form $\tilde{h}$ with domain $H^1_0(M) \times H^1_0(M)$ by
\[ \tilde{h}[\varphi, \psi] = \langle \partial_i \varphi, m^{ij} \partial_j \psi \rangle, \qquad \varphi, \psi \in H^1_0(M). \]
It is clearly densely defined and symmetric. In view of Assumption 1.1 the matrix $M$ is bounded and uniformly positive and, consequently, the form $\tilde{h}$ is non-negative and closed. Invoking Kato’s representation theorem [12, Theorem VI.2.4], we get a unique self-adjoint operator $\tilde{H}$. Since, moreover, Assumption 1.2 ensures that $V$ is bounded, the KLMN theorem [28, Theorem X.17] asserts that the sesquilinear form sum
\[ h[\varphi, \psi] := \tilde{h}[\varphi, \psi] + \langle \varphi, V \psi \rangle, \qquad \varphi, \psi \in Q(h) = Q(\tilde{h}) = H^1_0(M), \]
is closed and semi-bounded from below and hence it generates a self-adjoint operator $H$. Furthermore, we introduce the unperturbed Hamiltonian $H_0$ as follows. On $L^2(M)$ we consider the sesquilinear form $h_0$ with domain $Q(h_0) = H^1_0(M)$ defined by
\[ h_0[\varphi, \psi] := \langle \partial_i \varphi, \delta^{ij} \partial_j \psi \rangle, \qquad \varphi, \psi \in H^1_0(M). \tag{2} \]
It is a densely defined, symmetric, non-negative closed form. A form core of $h_0$ is $C_0^\infty(M)$. Kato’s representation theorem gives a unique self-adjoint operator $H_0$ in $L^2(M)$ with domain
\[ D(H_0) := \bigl\{ \psi \in H^1_0(M) : \exists\, \varphi \in L^2(M) \text{ such that } h_0[\psi, u] = \langle \varphi, u \rangle_{L^2(M)} \ \forall u \in H^1_0(M) \bigr\}. \tag{3} \]
Our aim is to establish scattering theory for the pair $(H, H_0)$ governed by the Schrödinger equation
\[ i \frac{d}{dt} \psi(t) = H \psi(t), \qquad \psi(0) = \psi_0. \tag{4} \]
The solution to (4) is given by $\psi(t) = U(t)\psi_0 = e^{-itH}\psi_0$. If one replaces $H$ by $H_0$, the “free” Hamiltonian, the corresponding solution to (4) is given by the free evolution $U_0(t) = \exp(-itH_0)$. The study of scattering consists in comparing the two evolutions $U_0(t)$ and $U(t)$ for large positive and negative times $t$, using the Møller wave operators
\[ W^\pm = \operatorname*{s-lim}_{t \to \pm\infty} U(-t)\, U_0(t)\, P_{ac}(H_0). \tag{5} \]
Here $P_{ac}(H_0)$ denotes the projection onto the subspace of absolute continuity of $H_0$. Assuming that the wave operators exist, one says that they are asymptotically complete if the ranges of $W^+$ and $W^-$, denoted $\operatorname{Ran} W^\pm$, coincide with the subspace of continuity of $H$, denoted $\mathcal{H}_c(H)$. If the ranges of $W^\pm$ equal the subspace of absolute continuity of $H$, then the wave operators are said to be strongly complete; in other words, the singular continuous spectrum of $H$ is empty. In that case the absolutely continuous parts of $H_0$ and $H$ are unitarily equivalent via the wave operators. Our main result is:

Theorem 1.3. Let Assumption 1.1 and Assumption 1.2 be satisfied. Then
1. The wave operators
\[ W^\pm = \operatorname*{s-lim}_{t \to \pm\infty} e^{itH} e^{-itH_0} \]
exist and are strongly asymptotically complete.
2. If $\varphi$ is an admissible function, then
\[ W^\pm = \operatorname*{s-lim}_{t \to \pm\infty} e^{it\varphi(H)} e^{-it\varphi(H_0)}. \]

We recall that a real-valued function $\varphi$, defined on $\mathbb{R}_+$, is admissible provided
\[ \lim_{t \to \infty} \int_0^\infty \Bigl| \int_I e^{-it\varphi(\lambda) - is\lambda}\, d\lambda \Bigr|^2 ds = 0 \]
for any bounded interval $I \subset \mathbb{R}_+$.

The key ingredient in the proof of Theorem 1.3 is a limiting absorption principle (abbrev. LAP) in a framework of weighted Sobolev spaces $H^s_{(\beta)}(M) := \{ \psi \in \mathcal{D}'(M) : \langle x \rangle^\beta \psi \in H^s(M) \}$ equipped with its natural norm; here $\mathcal{D}'(M)$ denotes the space of distributions. This version of the LAP, suitable for the study of scattering theory, is the content of Theorem 6.1, assertion 4. Since Eidus’ classic paper [13], the LAP has been extensively considered in spectral and scattering theory (see, e.g., [29]). To prove Theorem 27 we apply an abstract version of Mourre’s method. This method was first developed to prove the LAP for three-body Schrödinger operators [25] (see also [26] and [31]). Froese and Herbst [14] showed that N-body Schrödinger operators have no positive eigenvalues. Later use of the Mourre method by Sigal, Soffer, and Derezinski led to breakthroughs in N-body quantum scattering as described in [10]. Iwashita [17] and Weder [33] proved the LAP for first order symmetric systems. Ben-Artzi et al. [3] and Tamura [32] used Mourre’s method to derive the LAP for the acoustic wave operators (for further developments, see the survey [9]). To establish Theorem 6.1 we apply a refined version of Mourre’s method within the context of pseudo-selfadjoint operators (see, e.g., [16]). In fact, we shall establish another version of Theorem 6.1(4), see Theorem 6.2, wherein the set $M$ is replaced by $\mathbb{R}^n$. This version of the LAP, valid on the whole Euclidean space, is of independent interest. Another version of the LAP for $H$, valid in a scale of Besov spaces, can be found in [24]. Therein a simpler version of the abstract Mourre method is applied, avoiding the notion of pseudo-selfadjoint operators. Applications of the main theorems to mesoscopic physics will be published elsewhere. There seem to be few results on this kind of problem in the literature, but related results for other kinds of operators are found in [2, 8, 11, 15, 18, 20, 22, 23].
The paper is organized as follows. In Section 2 we introduce weighted Sobolev spaces and a class of Besov spaces. An abstract version of Mourre’s method is summarized in Section 3 within the context of pseudo-selfadjoint operators. In Section 4 we define a dilation operator $A$, and a (strict) Mourre estimate for the operator $H_0$ is established. Auxiliary properties of the Hamiltonians $H$ and $H_0$ are shown in Section 5. In Section 6 we give the proof of Theorem 6.2, which enables us to give a fairly short proof of the LAP in Theorem 6.1 within the framework of weighted, local Sobolev spaces. The main result on scattering theory, Theorem 1.3, is proven in Section 7.

Finally, let us fix some basic notation. Let $\mathcal{H}$ be a separable complex Hilbert space. We denote its scalar product and norm by $\langle \cdot, \cdot \rangle_{\mathcal{H}}$ and $\| \cdot \|_{\mathcal{H}}$, resp. If $\mathcal{K}$ is another Hilbert space, then we write $\mathcal{K} \subset \mathcal{H}$ if $\mathcal{K}$ is embedded in $\mathcal{H}$, and we write $\mathcal{K} \hookrightarrow \mathcal{H}$ provided the embedding is continuous and dense. Let $T$ be a self-adjoint operator on a Hilbert space $\mathcal{H}$ with domain $D(T)$. The spectrum and resolvent set are denoted by $\sigma(T)$ and $\rho(T)$, resp. We use standard terminology for the various parts of the spectrum, see, e.g., [27]. The resolvent is $R(\zeta) = (T - \zeta)^{-1}$. The spectral family associated to $T$ is denoted by $E_T(\xi)$, $\xi \in \mathbb{R}$. The space of bounded operators from a Banach space $\mathcal{X}$ into a Banach space $\mathcal{Y}$ is denoted by $B(\mathcal{X}, \mathcal{Y})$ and the upper, resp. lower, half-plane is denoted by $\mathbb{C}_\pm := \{ z \in \mathbb{C} : \pm \operatorname{Im} z > 0 \}$.
2
Weighted Sobolev spaces and Besov spaces
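We adopt the usual notation for function spaces: $C_0^\infty$, $L^2$, etc. The Schwartz space of rapidly decreasing functions and its adjoint space of tempered distributions are denoted by $\mathcal{S}$ and $\mathcal{S}'$, resp. The Fourier transformation is denoted by $\mathcal{F}$. The weighted Sobolev spaces on $\mathbb{R}^n$ are defined by
\[ H^s_{(t)}(\mathbb{R}^n) = \{ \psi \in \mathcal{S}'(\mathbb{R}^n) : \langle p \rangle^s \langle x \rangle^t \psi \in L^2(\mathbb{R}^n) \}. \]
Here $\langle x \rangle$ denotes the operator of multiplication by the function $(1 + |x|^2)^{1/2}$ and $\langle p \rangle = \mathcal{F}^* \langle x \rangle \mathcal{F}$. In order to state an optimal form of the limiting absorption principle we introduce a class of Besov spaces. For this aim we let $\theta_1, \theta_2 \in C_0^\infty(\mathbb{R}^n)$ be two functions such that $\theta_1(x) > 0$ for $|x| < 2$, $\theta_1(x) = 0$ otherwise, and $\theta_2(x) > 0$ if $1/2 < |x| < 2$, $\theta_2(x) = 0$ otherwise. Then, for $s, t \in \mathbb{R}$ and $1 \le q \le \infty$, we define
\[ H^s_{t,q}(\mathbb{R}^n) = \Bigl\{ \psi \in \mathcal{S}'(\mathbb{R}^n) : \| \theta_1(x) \psi \|_{H^s(\mathbb{R}^n)} + \Bigl( \int_1^\infty \bigl\| r^t\, \theta_2\bigl(\tfrac{x}{r}\bigr) \psi \bigr\|^q_{H^s(\mathbb{R}^n)} \frac{dr}{r} \Bigr)^{1/q} < \infty \Bigr\}. \]
For $q = \infty$ the term containing the integral must be understood as being $\sup_{r > 1} \| r^t \theta_2(x/r) \psi \|_{H^s(\mathbb{R}^n)}$. The spaces $H^s_{t,q}(\mathbb{R}^n)$ have Banach structure and we note that
\[ \bigl( H^s_{t,q}(\mathbb{R}^n) \bigr)^* = H^{-s}_{-t,q'}(\mathbb{R}^n) \quad \text{for } 1 \le q < \infty \text{ and } \frac{1}{q} + \frac{1}{q'} = 1. \]
The weighted Sobolev spaces $H^s_{(t)}(\mathbb{R}^n)$ coincide algebraically and topologically with $H^s_{t,2}(\mathbb{R}^n)$. Observe also that we have for all $\beta > 1/2$,
\[ L^2_{(\beta)}(\mathbb{R}^n) \hookrightarrow H^{-1}_{(\beta)}(\mathbb{R}^n) \hookrightarrow H^{-1}_{\frac12,1}(\mathbb{R}^n) \hookrightarrow H^{-1}_{(\frac12)}(\mathbb{R}^n) \hookrightarrow H^{-1}(\mathbb{R}^n) \tag{6} \]
(continuously and densely) and
\[ H^{1}(\mathbb{R}^n) \subset H^{1}_{(-\frac12)}(\mathbb{R}^n) \subset H^{1}_{-\frac12,\infty}(\mathbb{R}^n) \subset H^{1}_{(-\beta)}(\mathbb{R}^n) \subset L^2_{(-\beta)}(\mathbb{R}^n) \tag{7} \]
(continuously). For further information, we refer to [1, Section 4.1].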
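As a small worked example of how the weight index enters (an illustration added here, not part of the cited reference), take $s = 0$, so that $H^0_{(t)}(\mathbb{R}^n) = L^2_{(t)}(\mathbb{R}^n)$, and consider the hypothetical test function $\psi_a(x) = \langle x \rangle^{-a}$ with $a > 0$. In polar coordinates,
\[ \| \langle x \rangle^{t} \psi_a \|^2_{L^2(\mathbb{R}^n)} = |S^{n-1}| \int_0^\infty (1 + r^2)^{t - a}\, r^{n-1}\, dr, \]
which is finite if and only if $2(a - t) > n$. For instance, with $n = 1$ and $a = 1$, the function $\langle x \rangle^{-1}$ belongs to $L^2_{(t)}(\mathbb{R})$ exactly when $t < 1/2$: it lies in $L^2_{(-\beta)}(\mathbb{R})$ for every $\beta > 1/2$ but not in $L^2_{(1/2)}(\mathbb{R})$, which illustrates why the index $1/2$ plays a borderline role in (6) and (7).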
3
Pseudo-selfadjoint operators and Mourre’s method
We begin by recalling some definitions from the theory of pseudo-selfadjoint operators [16]. Then we summarize a refined abstract version of Mourre’s method [25] for pseudo-selfadjoint operators found in [5] and [7].

Definition 3.1. A pseudo-selfadjoint operator in $\mathcal{H}$ is a linear operator $T : D(T) \subset \mathcal{H} \to \mathcal{H}$ such that
(i) $T D(T) \subset \overline{D(T)}$.
(ii) The operator $T$ is self-adjoint as an operator in the Hilbert space $\overline{D(T)}$.

Definition 3.2. If $\varphi \in C_\infty(\mathbb{R})$ (that is, $\varphi : \mathbb{R} \to \mathbb{C}$ is continuous and tends to zero at infinity) and $T$ is a pseudo-selfadjoint operator in $\mathcal{H}$, then $\varphi(T)$ denotes the operator in $B(\mathcal{H})$ defined by
(i) $\varphi(T)|_{\overline{D(T)}}$ is the operator given by the functional calculus applied to $T$, where $T$ is reduced to $\overline{D(T)}$.
(ii) $\varphi(T)|_{\mathcal{H} \ominus \overline{D(T)}} = 0$.

One defines $R(\zeta) := (T - \zeta)^{-1}$ for every $\zeta \in \mathbb{C} \setminus \mathbb{R}$ with $R(\zeta) = 0$ on $\mathcal{H} \ominus \overline{D(T)}$. The family $\{ R(\zeta) ; \zeta \in \mathbb{C} \setminus \mathbb{R} \}$ is a “self-adjoint” pseudo-resolvent in the sense that
(i) $R(\zeta) \in B(\mathcal{H})$ for every $\zeta \in \mathbb{C} \setminus \mathbb{R}$.
(ii) $R(\zeta)^* = R(\bar{\zeta})$ for every $\zeta \in \mathbb{C} \setminus \mathbb{R}$.
(iii) $R(\zeta_1) - R(\zeta_2) = (\zeta_1 - \zeta_2) R(\zeta_1) R(\zeta_2)$ for every $\zeta_1, \zeta_2 \in \mathbb{C} \setminus \mathbb{R}$.
The spectrum of $T$ reduced to $\overline{D(T)}$ will be denoted by $\sigma(T)$. Then, for any $\zeta \in \mathbb{C} \setminus \sigma(T)$, one has $R(\zeta)\mathcal{H} = D(T)$, $R(\zeta)(T - \zeta)\psi = \psi$ for every $\psi \in D(T)$, and $(T - \zeta)R(\zeta)$ is the orthogonal projection of $\mathcal{H}$ onto $\overline{D(T)}$.

Definition 3.3. Let $\{ T_\gamma \}_{\gamma > 0}$ be a family of pseudo-selfadjoint operators in $\mathcal{H}$. It is said that $\lim_{\gamma \to \infty} T_\gamma = T$ (pseudo-selfadjoint operators in $\mathcal{H}$) holds in the strong resolvent sense provided $\operatorname{s-lim}_{\gamma \to \infty} (T_\gamma - \zeta)^{-1} = (T - \zeta)^{-1}$ for some $\zeta \in \mathbb{C}$ obeying $\inf_{\gamma > 0} \operatorname{dist}(\zeta, \sigma(T_\gamma)) > 0$.

The following two results allow us to construct pseudo-selfadjoint operators. The first one is found in [7, Lemma 3.7].

Proposition 3.4. Let $T$ be a self-adjoint, bounded from below, densely defined operator in $\mathcal{H}$. Denote by $\mathcal{H}_1$ the form domain of $T$ and $\mathcal{H}_{-1} = \mathcal{H}_1^*$ (and thus $\mathcal{H}_1 \subset \mathcal{H} = \mathcal{H}^* \subset \mathcal{H}_{-1}$). Let $\chi \in B(\mathcal{H})$ and $T_\gamma := T + \gamma \chi^* \chi$ for every $\gamma > 0$. Then the following statements are true:
1. There exists a pseudo-selfadjoint operator $T$ in $\mathcal{H}$ such that $T = \lim_{\gamma \to \infty} T_\gamma$ in the sense of Definition 3.3.
2. For any $\lambda > -\inf T$, $(T + \lambda)^{-1} = \operatorname{s-lim}_{\gamma \to \infty} (T_\gamma + \lambda)^{-1}$ in the norm topology of $B(\mathcal{H}_{-1}, \mathcal{H}_1)$. In particular, $(T + \lambda)^{-1} \in B(\mathcal{H}_{-1}, \mathcal{H}_1)$.

The second one is due to Simon [30].

Proposition 3.5.
Let $\{ t_k \}_{k \ge 1}$ be a sequence of (not necessarily densely defined) nonnegative closed forms in $\mathcal{H}$ such that $Q(t_k) \supset Q(t_{k+1})$ and $t_k[\varphi, \varphi] \le t_{k+1}[\varphi, \varphi]$ for every $\varphi \in Q(t_{k+1})$ and $k \ge 1$. Define the form $t_\infty$ by
\[ t_\infty[\varphi, \psi] = \lim_{k \to \infty} t_k[\varphi, \psi], \tag{8} \]
with
\[ \varphi, \psi \in Q(t_\infty) := \Bigl\{ \varphi \in \bigcap_{k \ge 1} Q(t_k) : \sup_k t_k[\varphi, \varphi] < \infty \Bigr\}. \tag{9} \]
Then:
1. The form $t_\infty$ is non-negative and closed.
2. If $T_k$ denotes the pseudo-selfadjoint operator corresponding to $t_k$ by the representation theorem on $\overline{Q(t_k)}$ ($1 \le k \le \infty$), then $T_\infty = \lim_{k \to \infty} T_k$ in the sense of Definition 3.3.

To analyze the spectral properties of $T$, one introduces a self-adjoint (densely defined) operator $A$ in $\mathcal{H}$ and one considers the operator $e^{-i\tau A} T e^{i\tau A} =: W(\tau)T$, where $e^{i\tau A}$ is the unitary group generated by $A$. We introduce several notions of regularity of $T$ with respect to $e^{i\tau A}$.

Definition 3.6. Let $B \in B(\mathcal{H})$.
(i) The operator $B$ is said to be of class $C^1(A)$ (resp., $C^1_u(A)$) if the map $\tau \mapsto W(\tau)B \in B(\mathcal{H})$ is of class $C^1$ in the strong (resp., norm) topology of $B(\mathcal{H})$ or, equivalently, if the following limit exists strongly, resp., in norm, in $B(\mathcal{H})$:
\[ [B, A] := \lim_{\tau \to 0} \frac{e^{-i\tau A} B e^{i\tau A} - B}{i\tau}. \]
(ii) The operator $B$ is said to be of class $\mathcal{C}^1(A)$ (or $A$-regular) if
\[ \int_0^1 \bigl\| e^{-i\tau A} B e^{i\tau A} + e^{i\tau A} B e^{-i\tau A} - 2B \bigr\| \, \frac{d\tau}{\tau^2} < \infty. \]
(iii) The operator $B$ is said to be of class $C^{1+\beta}(A)$ for $\beta \in (0, 1]$ if $B$ is of class $C^1(A)$ and the map $\tau \mapsto W(\tau)[B, A] \in B(\mathcal{H})$ is Hölder continuous of order $\beta$, i.e., there exists a constant $C$ such that $\| W(\tau)[B, A] - [B, A] \|_{B(\mathcal{H})} \le C \tau^\beta$.

If $B \in B(\mathcal{H})$ we may define the following quadratic form on $D(A)$:
\[ \langle \psi, [B, A] \psi \rangle = \langle B\psi, A\psi \rangle - \langle A\psi, B\psi \rangle, \qquad \psi \in D(A). \tag{10} \]
Then $B \in C^1(A)$ if and only if this form is bounded for the $\mathcal{H}$-topology on $D(A)$. In this case, the bounded operator on $\mathcal{H}$ defined in part (i) of Definition 3.6 is precisely the operator associated to the form (10). It is denoted by $[B, A]$.
It is well-known [5] that
\[ C^{1+\beta}(A) \subset \mathcal{C}^1(A) \subset C^1_u(A) \subset C^1(A). \tag{11} \]

Definition 3.7. A pseudo-selfadjoint operator $T$ in $\mathcal{H}$ is said to be of class $C^1(A)$, $C^1_u(A)$, $C^{1+\beta}(A)$, or $\mathcal{C}^1(A)$, if the resolvent $R(\zeta)$ is of class $C^1(A)$, $C^1_u(A)$, $C^{1+\beta}(A)$, or $\mathcal{C}^1(A)$ for some, and thus all, $\zeta \in \mathbb{C} \setminus \sigma(T)$. In the affirmative case, we write $T \in C^1(A)$ (or $T \in C^1_u(A)$, $T \in C^{1+\beta}(A)$, $T \in \mathcal{C}^1(A)$, resp.).

To verify that $T \in C^1(A)$, one may use the quadratic form $[A, T]$ defined on $D(A) \cap D(T)$ by
\[ \langle \varphi, [T, A] \psi \rangle = \langle T\varphi, A\psi \rangle - \langle A\varphi, T\psi \rangle, \qquad \varphi, \psi \in D(A) \cap D(T). \tag{12} \]
The following criterion is found in [7, Lemma 5.5]; it generalizes the well-known one [1, Theorem 6.2.10].

Proposition 3.8. Let $T$ be a pseudo-selfadjoint operator in $\mathcal{H}$ such that
\[ A[D(A) \cap D(T)] \subset D(T). \tag{13} \]
Then $T \in C^1(A)$ if and only if the following two requirements are fulfilled:
(i) $R(\zeta) D(A) \subset D(A)$ for some $\zeta \in \mathbb{C} \setminus \sigma(T)$.
(ii) There exists a positive constant $c$ such that
\[ |\langle [A, T]\varphi, \varphi \rangle| \le c \bigl( \| T\varphi \|^2_{\mathcal{H}} + \| \varphi \|^2 \bigr), \qquad \varphi \in D(A) \cap D(T). \tag{14} \]
Moreover, we may substitute (i) by
(i’) There exists $\zeta \in \mathbb{C} \setminus \sigma(T)$ such that $\{ \varphi \in D(A) : R(\zeta)\varphi, R(\bar{\zeta})\varphi \in D(A) \}$ is a core for $A$.
Under these conditions the space $R(\zeta) D(A)$ does not depend on $\zeta \in \mathbb{C} \setminus \sigma(T)$. It is a core for $T$ and a dense subspace of $D(A) \cap D(T)$ for the intersection topology (associated with the norm $\| \cdot \|_{\mathcal{H}} + \| A \cdot \|_{\mathcal{H}} + \| T \cdot \|_{\mathcal{H}}$). In addition, $D(A) \cap D(T)$ is a core for $T$, and in the sense of forms on $\mathcal{H}$ one has:
\[ [A, R(\zeta)] = R(\zeta) [T, A] R(\zeta), \qquad \zeta \in \mathbb{C} \setminus \sigma(T), \tag{15} \]
where the same symbols denote the forms $[A, R(\zeta)]$, $[T, A]$ and their continuous extensions to the spaces $\mathcal{H}$ or $D(T)$ (endowed with the graph topology).

One has the Virial theorem (see, e.g., [5]):

Theorem 3.9. If $T \in C^1(A)$ is a pseudo-selfadjoint operator, then
\[ \langle \varphi, [T, A] \psi \rangle = 0 \tag{16} \]
for each $\lambda \in \mathbb{R}$ and $\varphi, \psi \in D(T)$ satisfying $T\varphi = \lambda\varphi$, $T\psi = \lambda\psi$.
It is convenient to introduce two functions T ≡ A ˜T ≡ ˜A T and T defined on R with values in (−∞, +∞]. Let T be any pseudo-selfadjoint operator of class C 1 (A). For any ξ ∈ R we denote ∞ A T (ξ) = sup { α ∈ R : ∃f ∈ C0 (R) real , f (ξ) = 0, such that f (T )i[T, A]f (T ) ≥ αf (T )2 , ∞ ˜A T (ξ) = sup { α ∈ R : ∃f ∈ C0 (R) real , f (ξ) = 0,
and a compact operator K such that f (T )i[T, A]f (T ) ≥ αf (T )2 + K . The significance of these two functions are reflected by the following simple fact [1, Proposition 7.2.6]. Proposition 3.10. The functions T and ˜T are lower semicontinuous, −∞ < T (ξ) ≤ ˜T (ξ) ≤ +∞, and T (ξ) < ∞ ⇐⇒ ξ ∈ σ(T ),
˜T (ξ) < ∞ ⇐⇒ ξ ∈ σess (T ).
We may now introduce the main notion of Mourre’s method. Definition 3.11 (“Mourre estimate”). Let A be a self-adjoint operator in H and let T be a pseudo-selfadjoint operator in H of class C 1 (A). If ξ ∈ R and ˜A T (ξ) > 0, then the operator A is said to be conjugate to T at ξ and A is said to be strictly conjugate to T at ξ if A T (ξ) > 0. The latter motivates the following definition. Definition 3.12 (Mourre set). If T is a pseudo-selfadjoint operator of class C 1 (A), then μA (T ) := {ξ ∈ R : A is conjugate to T at ξ} is called the Mourre set. As we shall see now, if T ∈ C 1 (A), then T has nice spectral properties on the open set μA (T ) [1, Theorem 7.4.2]: Theorem 3.13. Let T be a pseudo-selfadjoint operator in H. 1. If T ∈ C 1 (A), then μA (T ) ∩ σp (T ) is discrete in μA (T ), and each of the included eigenvalues of T has finite multiplicity. 2. If T ∈ C 1 (A) such that σ(T ) = R, then σsc (T ) ∩ μA (T ) = ∅. The limiting absorption principle takes the following form [1, Proposition 7.4.4]: Theorem 3.14. Let T be a pseudo-selfadjoint operator in H of class C 1 (A) such that σ(T ) = R. Assume that there exist two Hilbert spaces K and K1 such that
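(i) $\mathcal{K}_1 \hookrightarrow \mathcal{K}$ and $\mathcal{H} \hookrightarrow \mathcal{K}$; then $\mathcal{K}^* \subset \mathcal{H} = \mathcal{H}^* \subset \mathcal{K}$.
(ii) For some $\lambda_0 \in \mathbb{R} \setminus \sigma(T)$, $R(\lambda_0)$ may be extended to an operator of $B(\mathcal{K}, \mathcal{K}^*)$.
(iii) $R(\lambda_0)\mathcal{K}_1 \subset D(A)$.
Let $\mathcal{K}_{1/2,1}$ denote the interpolation space $(\mathcal{K}, \mathcal{K}_1)_{1/2,1}$ obtained by real interpolation (see, e.g., [4]); in particular $\mathcal{K}_{1/2,1} \hookrightarrow \mathcal{K}$, $\mathcal{K}^* \subset \mathcal{K}^*_{1/2,1}$ and $B(\mathcal{K}, \mathcal{K}^*) \subset B(\mathcal{K}_{1/2,1}, \mathcal{K}^*_{1/2,1})$. Then:
1. For every $\zeta \in \mathbb{C}_\pm$, $R(\zeta) \in B(\mathcal{K}, \mathcal{K}^*)$ and the function
\[ \mathbb{C}_\pm \ni \zeta \mapsto R(\zeta) \in B(\mathcal{K}, \mathcal{K}^*) \tag{17} \]
is holomorphic.
2. The above function, considered as $B(\mathcal{K}_{1/2,1}, \mathcal{K}^*_{1/2,1})$-valued, may be extended to a weak$^*$-continuous function defined on $\mathbb{C}_\pm \cup [\mu^A(T) \setminus \sigma_p(T)]$.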
Finally, we mention that the following perturbative result is often useful for establishing Mourre estimates in applications [1, Theorem 7.2.9].
Theorem 3.15. Let $A$, $T$ and $T_0$ be self-adjoint operators in a Hilbert space $\mathcal{H}$ such that both $T$ and $T_0$ are of class $C_u^1(A)$. If $(T + i)^{-1} - (T_0 + i)^{-1}$ is compact, then $\tilde\rho_T^A = \tilde\rho_{T_0}^A$. In particular, $A$ is conjugate to $T$ at $\lambda$ if and only if it is conjugate to $T_0$ at $\lambda$.
4
Mourre estimate for H0
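The Dirichlet Laplacian $-\Delta_{D,Q}$ on $L^2(Q)$ generated by the sesquilinear form
\[ q[\varphi, \psi] := \langle \partial_i \varphi, \delta^{ij} \partial_j \psi \rangle, \qquad \varphi, \psi \in \mathbf{H}_0^1(Q), \tag{18} \]
has purely discrete spectrum consisting of eigenvalues $(0 <)\, \upsilon_1 < \upsilon_2 \leq \upsilon_3 \leq \cdots$. The latter constitute the threshold set $\Upsilon := \{\upsilon_n : n \in \mathbb{N}\}$ of $H_0$. The unperturbed Hamiltonian $H_0$ clearly has the tensor decomposition
\[ H_0 = p_1^2 \otimes I + I \otimes (-\Delta_{D,Q}), \tag{19} \]
where $p_1 = -i\partial_1$ is the momentum operator in $L^2(\mathbb{R})$. Moreover, one has
\[ \sigma(H_0) = \sigma_{\mathrm{ac}}(H_0) = \sigma_{\mathrm{ess}}(H_0) = [\upsilon_1, \infty). \tag{20} \]
Let $A$ be the self-adjoint extension of $(1/2)(x_1 p_1 + p_1 x_1)$ initially defined on $C_0^\infty(\mathcal{M})$. The operator $A$ is the infinitesimal generator of the dilation (with respect to $x_1$) group $\exp(-itA)$ defined by
\[ e^{itA}\psi(x_1, x') = e^{-t/2}\, \psi(e^{-t}x_1, x') \]
for all $t \in \mathbb{R}$ and all $\psi \in L^2(\mathcal{M})$. Using the isomorphism $\mathbf{H}_0^1(\mathcal{M}) \simeq \mathbf{H}^1(\mathbb{R}) \otimes \mathbf{H}_0^1(Q)$ we can write
\[ e^{itA} = e^{itA} \otimes 1 \tag{21} \]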
where, on the right-hand side, we regard $A$ as an operator in $L^2(\mathbb{R})$. Since $e^{itA}$ leaves invariant $\mathbf{H}^1(\mathbb{R})$ [1, Proposition 4.2.4], we infer from (21) that $e^{itA}$ maps $\mathbf{H}_0^1(\mathcal{M})$, the form domain of both $H_0$ and $H$, into itself. We shall show that the dilation operator $A$ is conjugate to $H_0$ away from the set of thresholds $\Upsilon$. Below we shall, with a slight abuse of notation, occasionally regard $A$ as an operator in $L^2(\mathbb{R})$. The classic (strict) Mourre estimate for the one-dimensional Laplacian [25] immediately gives us the following result.
Proposition 4.1. The operator $A$ is conjugate to $T_1 := p_1^2$ at each point of $\mathbb{R} \setminus \{0\}$. In particular,
\[ \rho_{T_1}^A(\xi_1) = \begin{cases} 2\xi_1 & \text{for } \xi_1 \geq 0, \\ +\infty & \text{for } \xi_1 < 0. \end{cases} \tag{22} \]
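The scaling behind the factor 2 in (22) can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes a Gaussian test function and applies the one-dimensional dilation $\psi \mapsto e^{-t/2}\psi(e^{-t}\,\cdot)$ exactly as written above. The helper names (`dilate`, `integrate`, `kinetic`) are hypothetical. The dilation should preserve the $L^2$ norm and rescale the kinetic energy $\langle \psi, p_1^2 \psi \rangle = \|\psi'\|^2$ by $e^{-2t}$, whose logarithmic derivative at $t = 0$ has modulus 2.

```python
import math

def dilate(psi, t):
    """Return the dilated function x -> exp(-t/2) * psi(exp(-t) * x)."""
    scale = math.exp(-t)
    amp = math.exp(-t / 2.0)
    return lambda x: amp * psi(scale * x)

def integrate(g, a=-40.0, b=40.0, n=40000):
    """Composite trapezoidal rule for g on [a, b]."""
    h = (b - a) / n
    total = 0.5 * (g(a) + g(b))
    for k in range(1, n):
        total += g(a + k * h)
    return total * h

def norm_sq(psi):
    """Squared L^2 norm of a real-valued function."""
    return integrate(lambda x: psi(x) ** 2)

def kinetic(psi, h=1e-4):
    """<psi, p^2 psi> = ||psi'||^2 for real psi, via a central difference."""
    dpsi = lambda x: (psi(x + h) - psi(x - h)) / (2.0 * h)
    return integrate(lambda x: dpsi(x) ** 2)

# Assumed test function (illustrative choice): a Gaussian.
psi = lambda x: math.exp(-x * x / 2.0)
t = 0.3
psi_t = dilate(psi, t)

norm_ratio = norm_sq(psi_t) / norm_sq(psi)        # unitarity: close to 1
kinetic_ratio = kinetic(psi_t) / kinetic(psi)     # close to exp(-2 t)
```

The sign of the exponent depends on the sign convention chosen for the group; only the rate 2 matters for the estimate.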
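Next we state [6, Theorem 3.4].
Theorem 4.2. Let $T_1$, $T_2$ be two self-adjoint, bounded from below operators in the Hilbert spaces $\mathcal{H}_1$, $\mathcal{H}_2$. Assume that $A_j$, $j = 1, 2$, is a self-adjoint operator in $\mathcal{H}_j$ such that $T_j$ is of class $C^k(A_j)$, $k \in \mathbb{N} \setminus \{0\}$. Then the operators $T := T_1 \otimes I + I \otimes T_2$ and $A := A_1 \otimes I + I \otimes A_2$ are self-adjoint operators in $\mathcal{H}_1 \otimes \mathcal{H}_2$. Moreover, the operator $T$ is of class $C^k(A)$ and, for all $\xi \in \mathbb{R}$,
\[ \rho_T^A(\xi) = \inf_{\xi = \xi_1 + \xi_2} \left[ \rho_{T_1}^{A_1}(\xi_1) + \rho_{T_2}^{A_2}(\xi_2) \right]. \]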
With these auxiliary results in place, we can prove the Mourre estimate for $H_0$.
Proposition 4.3. The operator $A$ is conjugate to the unperturbed operator $H_0$ at each point of $\mathbb{R} \setminus \Upsilon$.
Proof. We apply Theorem 4.2. Bearing in mind the decomposition of $H_0$ in (19), we set $A_1 := A$, $A_2 := 0$, which are self-adjoint operators in $L^2(\mathbb{R})$, resp. $L^2(Q)$. The operator $T_1 := p_1^2$, resp. $T_2 = -\Delta_{D,Q}$, is a non-negative self-adjoint operator in $L^2(\mathbb{R})$, resp. $L^2(Q)$. Proposition 4.1 gives us an explicit expression for $\rho_{T_1}^A(\xi_1)$. Moreover, one easily sees that
\[ \rho_{T_2}^0(\xi_2) = \begin{cases} 0 & \text{for } \xi_2 \in \Upsilon, \\ +\infty & \text{for } \xi_2 \in \mathbb{R} \setminus \Upsilon. \end{cases} \]
Then, by setting $\gamma(\xi) := \xi - \sup\{\zeta \in \Upsilon : \zeta \leq \xi\}$, Theorem 4.2 gives for all $\xi \in \mathbb{R}$:
\[ \rho_{H_0}^A(\xi) = \begin{cases} 2\gamma(\xi) & \text{for } \xi \geq \upsilon_1, \\ +\infty & \text{for } \xi < \upsilon_1. \end{cases} \]
Since γ(ξ) is strictly positive on R \ Υ, the assertion follows.
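The infimum formula of Theorem 4.2 and the resulting expression $2\gamma(\xi)$ can be compared on a concrete threshold set. The sketch below is illustrative only: it assumes the hypothetical threshold set $\Upsilon = \{1, 4, 9, 16, 25\}$ (the first Dirichlet eigenvalues $n^2$ on the interval $(0, \pi)$, truncated), and the function names are not from the paper.

```python
import math

def rho_T1(xi1):
    """Equation (22): 2*xi1 for xi1 >= 0, +infinity for xi1 < 0."""
    return 2.0 * xi1 if xi1 >= 0 else math.inf

def rho_H0(xi, thresholds):
    """Infimum over xi = xi1 + xi2 of rho_T1(xi1) + rho_T2(xi2).

    rho_T2 vanishes on the threshold set and is +infinity elsewhere,
    so only xi2 in `thresholds` can contribute a finite value.
    """
    return min((rho_T1(xi - v) for v in thresholds), default=math.inf)

def gamma(xi, thresholds):
    """xi minus the largest threshold not exceeding xi (+infinity if none)."""
    below = [v for v in thresholds if v <= xi]
    return xi - max(below) if below else math.inf

# Assumed, truncated threshold set; valid for energies xi < 36 only.
thresholds = [float(n * n) for n in range(1, 6)]

between = rho_H0(2.5, thresholds)       # strictly between two thresholds
at_threshold = rho_H0(4.0, thresholds)  # exactly at a threshold
below_all = rho_H0(0.5, thresholds)     # below the lowest threshold
```

With these inputs the function is positive between thresholds and vanishes exactly on them, which is the content of Proposition 4.3.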
A Mourre estimate for a matrix-valued Hamiltonian with Schrödinger operators as component Hamiltonians was established in [23]. Since the latter Hamiltonian has the same structure as $H_0$, the approach above applies and gives a less cumbersome proof.
5
Auxiliary properties of the Hamiltonians
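Let $T$ be a self-adjoint operator in $L^2(\Omega)$ with form domain $\mathbf{H}_0^1(\Omega)$. A priori, its resolvent $R(\zeta)$ defined for $\operatorname{Im} \zeta \neq 0$ is an operator in $B(L^2(\Omega))$. It can be regarded as an operator in $B(\mathbf{H}^{-1}(\Omega), \mathbf{H}_0^1(\Omega))$. Indeed, we have embeddings $\mathbf{H}_0^1(\Omega) \hookrightarrow L^2(\Omega) \hookrightarrow [\mathbf{H}_0^1(\Omega)]^* = \mathbf{H}^{-1}(\Omega)$ because $[L^2(\Omega)]^* = L^2(\Omega)$ and $\mathbf{H}_0^1(\Omega)$ is a dense subspace of $L^2(\Omega)$. By hypothesis, $\mathbf{H}_0^1(\Omega)$ is the form domain of the operator $T$; thus, for $\operatorname{Im} \zeta \neq 0$ the map $T - \zeta : D(T) \to L^2(\Omega)$ extends to a continuous bijective operator from $\mathbf{H}_0^1(\Omega)$ onto $\mathbf{H}^{-1}(\Omega)$, whose inverse is an extension of $R(\zeta)$ to an element of $B(\mathbf{H}^{-1}(\Omega), \mathbf{H}_0^1(\Omega))$, which we shall also denote by $R(\zeta)$.
Under Assumption 1.1 and Assumption 1.2 we now consider $H$ and $\tilde H := H - V$. It follows from above that $H$ (resp., $\tilde H$) may be regarded as a bounded operator from $\mathbf{H}_0^1(\mathcal{M})$ into $\mathbf{H}^{-1}(\mathcal{M})$ and, for $\lambda > -\inf H$ (resp., $\nu > 0$), the operator $H + \lambda$ (resp., $\tilde H + \nu$) is a bijection (resp. isomorphism) from $\mathbf{H}_0^1(\mathcal{M})$ to $\mathbf{H}^{-1}(\mathcal{M})$. The operator $H$ (resp. $\tilde H$) may be uniquely extended to a bounded operator, also designated by $H$ (resp. $\tilde H$), from the local Sobolev space $\mathbf{H}^1_{\mathrm{loc}}(\mathcal{M})$ into $\mathbf{H}^{-1}_{\mathrm{loc}}(\mathcal{M})$, defined by
\[ H\psi := -\partial_i m^{ij} \partial_j \psi + V\psi \ \text{ in } \mathcal{D}'(\mathcal{M}), \qquad \psi \in \mathbf{H}^1_{\mathrm{loc}}(\mathcal{M}). \tag{23} \]
Lemma 5.1. If $\psi \in \mathbf{H}^1_{\mathrm{loc}}(\mathcal{M})$ and $\varphi \in C^1(\mathcal{M})$, then
\[ \tilde H(\varphi\psi) = \varphi \tilde H\psi - m^{ij}(\partial_i \varphi)(\partial_j \psi) - \partial_i\big(m^{ij}\psi(\partial_j \varphi)\big) \ \text{ in } \mathcal{D}'(\mathcal{M}) \tag{24} \]
and
\[ H(\varphi\psi) = \varphi H\psi - m^{ij}(\partial_i \varphi)(\partial_j \psi) - \partial_i\big(m^{ij}\psi(\partial_j \varphi)\big) \ \text{ in } \mathcal{D}'(\mathcal{M}). \tag{25} \]
Proof. Follows from a computation in $\mathcal{D}'(\mathcal{M})$, using (23).
Next we consider the pseudo-selfadjoint extensions (also denoted by $H$ and $\tilde H$) of $H$ and $\tilde H$ in $L^2(\mathbb{R}^n)$ by identifying $L^2(\mathcal{M})$ with the closed subspace of $L^2(\mathbb{R}^n)$ whose elements vanish on $\mathbb{R}^n \setminus \mathcal{M}$. Then $\mathbf{H}_0^1(\mathcal{M})$ will be a closed subspace of $\mathbf{H}^1(\mathbb{R}^n)$.
Lemma 5.2. Let Assumption 1.1 and Assumption 1.2 hold. Then:
1. $(\tilde H + \nu)^{-1} \in B(\mathbf{H}^{-1}(\mathbb{R}^n), \mathbf{H}^1(\mathbb{R}^n))$ for any $\nu > 0$.
2. $(H + \lambda)^{-1} \in B(\mathbf{H}^{-1}(\mathbb{R}^n), \mathbf{H}^1(\mathbb{R}^n))$ for any $\lambda > -\inf H$.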
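Proof. We prove the assertions separately.
1. For any $\alpha > 0$ we define $m_\alpha^{ij} : \mathbb{R}^n \to (0, \infty]$ by $m_\alpha^{ij} = m^{ij}$ on $\mathcal{M}$ and $m_\alpha^{ij} = \alpha$ on $\mathbb{R}^n \setminus \mathcal{M}$, and we denote by $\tilde h_\alpha$ the sesquilinear form, densely defined in $L^2(\mathbb{R}^n)$, given by
\[ \tilde h_\alpha[\varphi, \psi] = \langle \partial_i \varphi, m_\alpha^{ij} \partial_j \psi \rangle, \qquad \varphi, \psi \in Q(\tilde h_\alpha) = \mathbf{H}^1(\mathbb{R}^n). \tag{26} \]
This form is non-negative and closed. It thus defines a non-negative self-adjoint operator $\tilde H_\alpha \geq 0$ in $L^2(\mathbb{R}^n)$. If $c > 0$ and $P$ denotes the orthogonal projection of $L^2(\mathbb{R}^n)$ onto $L^2(\mathcal{M})$ (i.e. the multiplication operator induced by the characteristic function of $\mathcal{M}$), the operator $\tilde H_{\alpha,c} := \tilde H_\alpha + c(1 - P)$ is positive, self-adjoint on $D(\tilde H_\alpha)$. By Proposition 3.4, there exists on $L^2(\mathbb{R}^n)$ a pseudo-selfadjoint operator $H_\alpha = \lim_{c \to \infty} \tilde H_{\alpha,c}$ (in the strong resolvent sense) and $(H_\alpha + \nu)^{-1} \in B(\mathbf{H}^{-1}(\mathbb{R}^n), \mathbf{H}^1(\mathbb{R}^n))$ for any $\nu > 0$.
Now, the closed, densely defined sesquilinear form in $L^2(\mathbb{R}^n)$ associated to $\tilde H_{\alpha,c}$ is
\[ \tilde h_{\alpha,c}[\varphi, \psi] := \tilde h_\alpha[\varphi, \psi] + c\,\langle (1 - P)\varphi, (1 - P)\psi \rangle, \qquad \varphi, \psi \in Q(\tilde h_{\alpha,c}) = \mathbf{H}^1(\mathbb{R}^n). \]
In view of Proposition 3.5, the operator $H_\alpha$ is defined by the form $h_{\alpha,\infty}[\varphi, \psi] = \lim_{c \to \infty} \tilde h_{\alpha,c}[\varphi, \psi]$, with
\[ \varphi, \psi \in Q(h_{\alpha,\infty}) := \Big\{ \varphi \in \bigcap_{c > 0} Q(\tilde h_{\alpha,c}) : \sup_{c > 0} \tilde h_{\alpha,c}[\varphi, \varphi] < \infty \Big\}. \]
Since $\varphi \in Q(h_{\alpha,\infty})$ if and only if $\varphi \in \mathbf{H}^1(\mathbb{R}^n)$ and $(1 - P)\varphi = 0$, we deduce that $Q(h_{\alpha,\infty}) = Q(\tilde h) = \mathbf{H}_0^1(\mathcal{M})$ and $h_{\alpha,\infty}[\varphi, \psi] = \tilde h_\alpha[\varphi, \psi] = \tilde h[\varphi, \psi]$, $\varphi, \psi \in \mathbf{H}_0^1(\mathcal{M})$. Hence $H_\alpha = \tilde H$ for every $\alpha > 0$.
2. Since $\mathbf{H}_0^1(\mathbb{R}^n) \subset \mathbf{H}_0^1(\mathcal{M})$ and $\mathbf{H}^{-1}(\mathcal{M}) \subset \mathbf{H}^{-1}(\mathbb{R}^n)$, we may continuously extend $V$ such that $V \in B(\mathbf{H}^1(\mathbb{R}^n), \mathbf{H}^{-1}(\mathbb{R}^n))$. Then the first part of the lemma and the resolvent equation $(H + \lambda)^{-1} = (1 + (\tilde H + \lambda)^{-1} V)^{-1} (\tilde H + \lambda)^{-1}$ yield the desired result.
Lemma 5.3. Suppose $f \in C^1(\mathbb{R}^n)$ and $\partial_j f \in L^\infty(\mathbb{R}^n)$ for every $j$. Let $\psi \in \mathbf{H}^{-1}(\mathbb{R}^n)$ and $f\psi \in \mathbf{H}^{-1}(\mathbb{R}^n)$. Then $f(\tilde H + \nu)^{-1}\psi \in \mathbf{H}^1(\mathbb{R}^n)$ for any $\nu > 0$. Moreover, for each $\lambda > -\inf H$, one has $f(H + \lambda)^{-1}\psi \in \mathbf{H}^1(\mathbb{R}^n)$.
Proof. Let $\varphi = (\tilde H + \nu)^{-1}\psi \in \mathbf{H}^1(\mathbb{R}^n)$. Let $l \in C_0^\infty(\mathbb{R}^n)$ be such that $l(x) = 1$ for $|x| \leq 1$ and $l(x) = 0$ for $|x| \geq 2$. Define $l_k(x) = l(x/k)$, $k \geq 1$. In particular, $\partial_j^n l_k(x) = k^{-n}(\partial_j^n l)(x/k)$ ($n \in \mathbb{N}$). Clearly, $\lim_{k \to \infty} l_k f\psi = f\psi \in \mathbf{H}^{-1}(\mathbb{R}^n)$. Moreover, $l_k f\varphi \in \mathbf{H}^1(\mathbb{R}^n)$. By bearing in mind the definition of the operator $\tilde H_{\alpha,c}$ from the proof of Lemma 5.2(1), an application of Lemma 5.1 yields
\[ (\tilde H_{\alpha,c} + \nu)(l_k f\varphi) = l_k f\psi - m_\alpha^{ij}(\partial_i(l_k f))\,\partial_j \varphi - \partial_i\big(m_\alpha^{ij}\varphi(\partial_j(l_k f))\big) + c(1 - P)(l_k f\varphi). \]
Let us denote the right-hand side by $g_{k,c}$. Since $(1 - P)\varphi = 0$, the distribution $g_{k,c}$ does not depend on $c$ and we may thus denote it by $g_k$. For each $v \in L^2(\mathbb{R}^n)$, $\lim_{k \to \infty} l_k v = v$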
in $L^2(\mathbb{R}^n)$ and, in addition, $\lim_{k \to \infty}(\partial_j l_k) f\psi = 0$ in $L^2(\mathbb{R}^n)$ for every $j$. Thus we infer that
\[ \lim_{k \to \infty} g_k = f\psi - m_\alpha^{ij}(\partial_i f)(\partial_j \varphi) - \partial_i\big(m_\alpha^{ij}\varphi(\partial_j f)\big) =: g \]
in $\mathbf{H}^{-1}(\mathbb{R}^n)$. This implies that $l_k f\varphi = (\tilde H_{\alpha,c} + \nu)^{-1} g_k$ and in the limit $c \to \infty$, we get that $l_k f\varphi = (\tilde H + \nu)^{-1} g_k$. Finally, by invoking Lemma 5.2(1), we deduce that $f\varphi = (\tilde H + \nu)^{-1} g \in \mathbf{H}^1(\mathbb{R}^n)$. The last assertion follows from the first one, in combination with the resolvent formula $(H + \lambda)^{-1} = (\tilde H + \lambda)^{-1}(1 + V(\tilde H + \lambda)^{-1})^{-1}$.
The Hamiltonians $H_0$ and $H$ obey the following regularity properties:
Proposition 5.4. Let Assumption 1.1 and Assumption 1.2 be satisfied. Then:
1. $H_0 \in C^\infty(A)$.
2. $H \in C^{1+\beta}(A)$ with $\beta = \min_{j=1,2}\{\mu_j - 1, \nu_j - 1, 1\}$.
This result was established in [24, Proposition 8.1]. Although the proof of Proposition 5.4 follows a procedure well-known for the Laplace operator, the variable principal part of $H$ requires a substantially more complicated analysis of commutators. In addition, the Hamiltonians $H$ and $H_0$ “look alike” at infinity so that the kinetic energy distribution is controlled by the total energy distribution in the following sense:
Proposition 5.5. Let Assumption 1.1 and Assumption 1.2 be satisfied. Then $f(H) - f(H_0)$ is compact for any $f \in C_\infty(\mathbb{R})$, the space of continuous functions vanishing at infinity. In particular, $\sigma_{\mathrm{ess}}(H) = \sigma_{\mathrm{ess}}(H_0) = [\upsilon_1, \infty)$.
For the proof we refer to [24, Proposition 9.1].
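The penalisation step used in the proof of Lemma 5.2(1), where $c(1 - P)$ is added and $c \to \infty$, can be pictured in a finite-dimensional toy model. The sketch below is an assumption-laden illustration, not the operator of the paper: a real symmetric $2 \times 2$ matrix stands in for the non-negative operator, $P$ projects onto the first coordinate, and the entries `a`, `b`, `d` are arbitrary. As $c$ grows, the resolvent converges to the resolvent of the compressed operator on the range of $P$, extended by zero.

```python
def resolvent_2x2(a, b, d, c, nu):
    """Inverse of [[a + nu, b], [b, d + c + nu]].

    The matrix [[a, b], [b, d]] plays the role of the unpenalised
    operator; c * (1 - P) adds c to the second diagonal entry.
    """
    p, q = a + nu, d + c + nu
    det = p * q - b * b
    return [[q / det, -b / det], [-b / det, p / det]]

# Hypothetical entries, chosen only for illustration.
a, b, d, nu = 2.0, 0.7, 1.5, 1.0
limit = 1.0 / (a + nu)  # resolvent of the compressed (1x1) operator

errors = []
for c in (1e2, 1e4, 1e6):
    r = resolvent_2x2(a, b, d, c, nu)
    # Distance to the limit [[limit, 0], [0, 0]] in the max-entry norm.
    errors.append(max(abs(r[0][0] - limit), abs(r[0][1]), abs(r[1][1])))
```

The limit is not the inverse of an operator on the whole space, which is the finite-dimensional shadow of the pseudo-selfadjoint operators appearing above.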
6
Principle of limiting absorption
The key ingredient in the proof of strong asymptotic completeness is a LAP valid in a framework of weighted Sobolev spaces $\mathbf{H}^s_{(\beta)}(\mathcal{M})$ (see Section 1). Evidently, for any $\beta \geq 0$, one has continuous embeddings $\mathbf{H}^{-1}_{(\beta)}(\mathcal{M}) \subset \mathbf{H}^{-1}(\mathcal{M})$ and $\mathbf{H}_0^1(\mathcal{M}) \subset \mathbf{H}^1_{(-\beta)}(\mathcal{M})$. This version of the LAP, suitable for the study of scattering theory, is the content of assertion 4 in the following theorem, wherein $\Upsilon$ denotes the set of eigenvalues of the Dirichlet Laplacian $-\Delta_{D,Q}$ on the bounded Lipschitz domain $Q$ in $\mathbb{R}^{n-1}$, $n \geq 2$.
Theorem 6.1. Let $\mathcal{M}$, $Q$ and $\Upsilon$ be as above. Suppose that the matrix $M$ satisfies Assumption 1.1(i), (ii)’ and (iii) and that the potential $V$ satisfies Assumption 1.2(i), (ii)’ and (iii). Then the operator $H$ in (1) has the following spectral properties:
1. The essential spectrum of $H$ equals the semi-axis $[\upsilon_1, \infty)$ with $\upsilon_1 = \inf \Upsilon$.
2. The set of eigenvalues of $H$ can accumulate only at the points of $\Upsilon$ and each eigenvalue away from $\Upsilon$ has finite multiplicity.
3. The operator $H$ has no singular continuous spectrum.
4. For any $\gamma > 1/2$, the holomorphic functions
\[ \mathbb{C}_\pm \ni \zeta \mapsto (H - \zeta)^{-1} \in B\big(\mathbf{H}^{-1}_{(\gamma)}(\mathcal{M}), \mathbf{H}^1_{(-\gamma)}(\mathcal{M})\big) \tag{27} \]
extend continuously to $\mathbb{C}_\pm \cup (\mathbb{R} \setminus [\sigma_p(H) \cup \Upsilon])$ in the uniform topology.
We begin by proving assertions 1-3 in Theorem 6.1.
Proof (of Theorem 6.1, assertions 1-3). In view of Proposition 5.4, $H$ belongs to $C^{1+\beta}(A)$ for some $\beta \in (0, 1]$. It follows from (11) that $H$ is of class $C^1(A)$. In Proposition 4.3 we proved that $A$ is strictly conjugate to $H_0$ away from $\Upsilon$. The latter, in combination with Proposition 5.5 and Theorem 3.15, implies that $A$ is conjugate to $H$ at each point of $\mathbb{R} \setminus \Upsilon$. Then Assertion 1 follows from Proposition 5.5. Assertion 2 is a consequence of Theorem 3.13(1), and assertion 3, the absence of singular continuous spectrum of $H$, is a consequence of Theorem 3.13(2).
Items 1-3 in Theorem 6.1 first appeared in [19]. To give a proof of Theorem 6.1(4), we first give the following version of the LAP, wherein the set $\mathcal{M}$ is replaced by the Euclidean space $\mathbb{R}^n$. This result is of considerable interest in itself. The proof of Theorem 6.2 imitates a familiar approach for the Laplace operator, but the variable principal part of $H$ requires a refined version of Mourre’s method within the context of pseudo-selfadjoint operators.
Theorem 6.2. Let $\mathcal{M}$, $Q$ and $\Upsilon$ be as above. Suppose that the matrix $M$ satisfies Assumption 1.1 and that the potential $V$ satisfies Assumption 1.2. If one regards $H$, in (1), as a pseudo-selfadjoint operator in $L^2(\mathbb{R}^n)$, then the holomorphic function
\[ \mathbb{C}_\pm \ni \zeta \mapsto (H - \zeta)^{-1} \in B\big(\mathbf{H}^{-1}_{1/2,1}(\mathbb{R}^n), \mathbf{H}^1_{-1/2,\infty}(\mathbb{R}^n)\big) \tag{28} \]
extends continuously to $\mathbb{C}_\pm \cup (\mathbb{R} \setminus [\sigma_p(H) \cup \Upsilon])$ in the weak$^*$-topology.
In other words, if $u, v \in \mathbf{H}^{-1}_{1/2,1}(\mathbb{R}^n)$, then the function $\zeta \mapsto \langle u, (H - \zeta)^{-1} v \rangle$, which is holomorphic in $\mathbb{C}_\pm$, has a continuous extension to $\mathbb{C}_\pm \cup (\mathbb{R} \setminus [\sigma_p(H) \cup \Upsilon])$.
Proof. We shall regard $H$ as a pseudo-selfadjoint operator in $L^2(\mathbb{R}^n)$. For this purpose we identify $L^2(\mathcal{M})$ with the closed subspace of $L^2(\mathbb{R}^n)$ consisting of elements which vanish on the complement of $\mathcal{M}$; that is, $\mathbf{H}_0^1(\mathcal{M})$ is identified with $\mathbf{H}^1(\mathbb{R}^n) \cap L^2(\mathcal{M})$. We wish to apply Theorem 3.14. For this aim we introduce $\tilde A = A_1 \otimes 1$, which is self-adjoint in $L^2(\mathbb{R}^n)$ and essentially self-adjoint on $C_0^\infty(\mathbb{R}^n)$; by writing $A_1$ instead of $A$ we stress that it is the dilation operator with respect to $x_1$. To apply Theorem 3.14, we need to verify several conditions. First we show that $H \in C^1(\tilde A)$ by means of Proposition 3.8. As usual, $P$ denotes the orthogonal projection of $L^2(\mathbb{R}^n)$ onto $L^2(\mathcal{M})$. Evidently, $D(A) = P D(\tilde A)$ and $A = P \tilde A P$, and one easily checks that $\tilde A[D(\tilde A) \cap L^2(\mathcal{M})] \subset L^2(\mathcal{M})$
and $[\tilde A, H] = [A, H]$ on $D(A) \cap D(H)$, whence the hypotheses (13) and (ii) in Proposition 3.8 are fulfilled; to verify (ii) we use (6), [24, Proposition 7.1] and Proposition 3.8. Next we verify condition (i’) of Proposition 3.8. Now, $\varphi, x_1\varphi \in L^2(\mathbb{R})$ implies that $\varphi \in D(A_1)$ and thus $\varphi, x_1\varphi \in L^2(\mathbb{R}^n)$ implies that $\varphi \in D(\tilde A)$. Fix $\lambda > -\inf H$. From Lemma 5.3 (with $f(x) = x_1$) we obtain that $(H + \lambda)^{-1}\varphi,\ x_1(H + \lambda)^{-1}\varphi \in L^2(\mathbb{R}^n)$ are valid for $\varphi \in C_0^\infty(\mathbb{R}^n)$. The latter, in conjunction with the afore-mentioned fact, implies that $(H + \lambda)^{-1}\varphi \in D(\tilde A)$ and therefore the set $\{\varphi \in C_0^\infty(\mathbb{R}^n) : (H + \lambda)^{-1}\varphi \in D(\tilde A)\}$ is a core of $\tilde A$. Then an application of Proposition 3.8 implies that $H \in C^1(\tilde A)$ and
\[ [(H + \lambda)^{-1}, \tilde A] = (H + \lambda)^{-1}[\tilde A, H](H + \lambda)^{-1} = [(H + \lambda)^{-1}, A]. \tag{29} \]
Since $e^{i\tau A} = e^{i\tau A_1} \otimes 1$ on $L^2(\mathcal{M})$ and $e^{i\tau \tilde A} = e^{i\tau A_1} \otimes 1$ on $L^2(\mathbb{R}^n)$, it follows that $e^{i\tau A} = P e^{i\tau \tilde A} P$ for all $\tau \in \mathbb{R}$. As seen from the proof of Proposition 5.4, we have that $[(H + \lambda)^{-1}, A] \in C^\beta(A)$ and, together with (29), this implies that $[(H + \lambda)^{-1}, \tilde A] \in C^\beta(\tilde A)$. So $H \in C^{1+\beta}(\tilde A)$ and, in particular, $H$ is $\tilde A$-regular.
To apply Theorem 3.14 to $H$ in $L^2(\mathbb{R}^n)$ it remains to select the spaces $\mathcal{K}$ and $\mathcal{K}_1$ and to specify $\mathcal{K}_{1/2,1}$. We set $\mathcal{K} := \mathbf{H}^{-1}(\mathbb{R}^n)$ and $\mathcal{K}_1 := \mathbf{H}^{-1}_{(1)}(\mathbb{R}^n)$. Our discussion above shows that the hypotheses (i)-(iii) of Theorem 3.14 are satisfied (with $R(\lambda_0) = (H + \lambda)^{-1}$ and $A = \tilde A$). Finally, we note that $\mathcal{K}_{1/2,1} = \mathbf{H}^{-1}_{1/2,1}(\mathbb{R}^n)$ and $\mathcal{K}^*_{1/2,1} = \mathbf{H}^1_{-1/2,\infty}(\mathbb{R}^n)$.
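Having established Theorem 6.2 we can give a rather short proof of the LAP formulated in Theorem 6.1(4).
Proof (of Theorem 6.1(4)). We recall the embeddings (6)-(7). In fact, for any $\beta > 1/2$ and $\varepsilon > 0$ such that $1/2 + \varepsilon < \beta$, the embeddings
\[ \mathbf{H}^1_{(-1/2-\varepsilon)}(\mathbb{R}^n) \subset \mathbf{H}^0_{(-\beta)}(\mathbb{R}^n) \quad \text{and} \quad \mathbf{H}^0_{(\beta)}(\mathbb{R}^n) \subset \mathbf{H}^{-1}_{(1/2+\varepsilon)}(\mathbb{R}^n) \]
are compact and the latter fact, together with (7), Theorem 6.2, and (6) (in this order), implies that
\[ \mathbb{C}_\pm \ni \zeta \mapsto (H - \zeta)^{-1} \in B\big(L^2_{(\beta)}(\mathbb{R}^n), L^2_{(-\beta)}(\mathbb{R}^n)\big), \qquad \beta > 1/2, \tag{30} \]
can be continuously extended to $\mathbb{C}_\pm \cup (\mathbb{R} \setminus \Upsilon)$ in the uniform topology. Fix $\lambda > -\inf H$. For $\zeta \in \mathbb{C}_\pm$, an iteration of the first resolvent formula yields
\[ (H - \zeta)^{-1} = (H + \lambda)^{-1} + (\zeta + \lambda)(H + \lambda)^{-2} + (\zeta + \lambda)^2 (H + \lambda)^{-1}(H - \zeta)^{-1}(H + \lambda)^{-1}. \tag{31} \]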
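Identity (31) is an algebraic consequence of the first resolvent formula, so by the functional calculus it suffices to check it pointwise on the spectrum. The following sketch does this for a few scalar values; the sample points `h`, `zeta`, `lam` are arbitrary illustrative choices, with `h` standing for a spectral value of $H$ and `h + lam` kept away from zero.

```python
def lhs(h, zeta):
    """Scalar version of (H - zeta)^{-1}."""
    return 1.0 / (h - zeta)

def rhs(h, zeta, lam):
    """Scalar version of the right-hand side of (31)."""
    r = 1.0 / (h + lam)  # scalar version of (H + lam)^{-1}
    return r + (zeta + lam) * r ** 2 + (zeta + lam) ** 2 * r * lhs(h, zeta) * r

# Hypothetical sample points (h real, zeta non-real, h + lam > 0).
samples = [
    (0.5, 1.0 + 0.2j, 3.0),
    (7.0, -2.0 - 1.5j, 1.0),
    (12.3, 4.0 + 1e-2j, 0.5),
]
max_gap = max(abs(lhs(h, z) - rhs(h, z, lam)) for h, z, lam in samples)
```

The point of (31) in the proof is that the unbounded-in-$\zeta$ factor $(H - \zeta)^{-1}$ appears only sandwiched between two copies of $(H + \lambda)^{-1}$, which improve regularity.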
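Lemma 5.2(2) asserts that $(H + \lambda)^{-1} \in B\big(\mathbf{H}^{-1}_{(0)}(\mathbb{R}^n), \mathbf{H}^1_{(0)}(\mathbb{R}^n)\big)$ and an application of Lemma 5.3 implies that $(H + \lambda)^{-1} \in B\big(\mathbf{H}^{-1}_{(\alpha)}(\mathbb{R}^n), \mathbf{H}^1_{(\alpha)}(\mathbb{R}^n)\big)$ for every $\alpha \in [-1, 1]$. Hence, in view of (31) we conclude that, for any $\beta > 1/2$, the holomorphic functions
\[ \mathbb{C}_\pm \ni \zeta \mapsto (H - \zeta)^{-1} \in B\big(\mathbf{H}^{-1}_{(\beta)}(\mathbb{R}^n), \mathbf{H}^1_{(-\beta)}(\mathbb{R}^n)\big) \tag{32} \]
can be continuously extended to $\mathbb{C}_\pm \cup (\mathbb{R} \setminus \Upsilon)$ in the uniform topology. Finally we observe that the density of $C_0^\infty(\mathcal{M})$ in $\mathbf{H}^{-1}_{(\beta)}(\mathcal{M})$, together with the embeddings
\[ \mathbf{H}^1_{(-\beta)}(\mathbb{R}^n) \subset \mathbf{H}^1_{(-\beta)}(\mathcal{M}), \qquad \mathbf{H}^{-1}_{(\beta)}(\mathcal{M}) \subset \mathbf{H}^{-1}_{(\beta)}(\mathbb{R}^n), \]
yields the desired result.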
7
Scattering properties for the pair (H, H0)
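We proceed to scattering theory for the pair $(H, H_0)$. The following classic result goes back to Lavine [21].
Theorem 7.1. Let $T_1$ and $T_2$ be two self-adjoint operators in a separable Hilbert space $\mathcal{H}$ with spectral projections $E_{T_1}(\Omega)$ and $E_{T_2}(\Omega)$. Assume that there exist sets $\Omega_j$, $j \in \mathbb{N}$, and operators $E_k$, $F_k$, $1 \leq k \leq N$, such that:
(i) $\Omega = \cup_{j \in \mathbb{N}} \Omega_j$ where each $\Omega_j$ is a bounded open interval, and $\Omega_j \cap \Omega_k = \emptyset$ if $j \neq k$.
(ii) The operator $E_k$ is $T_1$-bounded and locally $T_1$-smooth on $\Omega_j$, for $1 \leq k \leq N$, and $j \geq 1$.
(iii) The operator $F_k$ is $T_2$-bounded and locally $T_2$-smooth on $\Omega_j$, for $1 \leq k \leq N$, and $j \geq 1$.
(iv) $T_2 - T_1 = \sum_{k=1}^N F_k^* E_k$ is valid in the sense of forms, i.e.
\[ \langle T_2 u, v \rangle_{\mathcal{H}} - \langle u, T_1 v \rangle_{\mathcal{H}} = \sum_{k=1}^N \langle F_k u, E_k v \rangle_{\mathcal{H}}, \qquad u \in D(T_2),\ v \in D(T_1). \]
(v) Both sets $\sigma(T_1) \setminus \Omega$ and $\sigma(T_2) \setminus \Omega$ have Lebesgue measure zero.
Then the generalized wave operators
\[ W^\pm = \operatorname*{s-lim}_{t \to \pm\infty} e^{itT_2} e^{-itT_1} P_{\mathrm{ac}}(T_1), \qquad \tilde W^\pm = \operatorname*{s-lim}_{t \to \pm\infty} e^{itT_1} e^{-itT_2} P_{\mathrm{ac}}(T_2) \]
exist and are complete.
For our purpose, we choose $T_1 = H_0$ and $T_2 = H$ and $\Omega = \mathbb{R} \setminus \Upsilon$. In addition, we set $v^{ij} = m^{ij} - \delta^{ij}$. Then, for $u \in D(H)$ and $v \in D(H_0)$ we have that
\[ \langle Hu, v \rangle_{L^2(\mathcal{M})} - \langle u, H_0 v \rangle_{L^2(\mathcal{M})} = h[u, v] - h_0[v, u] = \int_{\mathcal{M}} (\partial_i u)(x)\, v^{ij}\, (\partial_j v)(x)\, dx + \int_{\mathcal{M}} |V(x)|^{1/2} u(x)\, |V(x)|^{1/2} \operatorname{sign} V(x)\, v(x)\, dx, \tag{33} \]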
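where, on the right-hand side, we suppress the summations over $i, j$. Hence, by introducing the operators $E_{ij}, F_{ij} : \mathbf{H}^1_{(-\gamma)}(\mathcal{M}) \to L^2(\mathcal{M})$, $\gamma = \max\{\mu, \nu\}$, $1 \leq i, j \leq n + 1$, $\mu := \max_{j=1,2} \mu_j/2$ (see Assumption 1.1 for the decay parameters $\mu_j$), $\nu = \max_{j=1,2} \nu_j/4$ (see Assumption 1.2 for the decay parameters $\nu_j$), defined by
\[ E_{ij} u := \langle \cdot \rangle^{\mu} v^{ij} \partial_j u, \qquad F_{ij} u := -\langle \cdot \rangle^{-\mu} \partial_i u, \qquad 1 \leq i, j \leq n, \]
\[ E_{n+1} u := \langle \cdot \rangle^{\nu} |V|^{1/2} u, \qquad F_{n+1} = \langle \cdot \rangle^{-\nu} |V|^{1/2} \operatorname{sign} V, \tag{34} \]
we have, in view of (33), that
\[ H - H_0 = \sum_{1 \leq i, j \leq n} F_{ij}^* E_{ij} + F_{n+1}^* E_{n+1} \tag{35} \]
holds in the sense of forms. In addition, we need the following auxiliary result.
Lemma 7.2. Let $\gamma > 1/2$ and let $g \in L^\infty(\mathcal{M})$ be a function satisfying $\langle \cdot \rangle^{\gamma} g \in L^\infty(\mathcal{M})$. Then
1. The operator $G : \mathbf{H}^1_{(-\gamma)}(\mathcal{M}) \to L^2(\mathcal{M})$, $Gu := g\,\partial^\alpha u$ (where $\alpha$ is a multi-index with order $|\alpha| \leq 1$) is bounded.
2. The unbounded operator defined by $G$ in $L^2(\mathcal{M})$, also denoted by $G$, is $H$-bounded.
3. The operator $G$ is locally $H$-smooth on $\mathbb{R} \setminus \Upsilon$.
Proof. Evidently, the hypotheses on $g$ and $\gamma$ ensure that the first statement holds. The second statement follows from the inclusions $D(H) \subset \mathbf{H}_0^1(\mathcal{M}) \subset \mathbf{H}^1_{(-\gamma)}(\mathcal{M})$. To prove the third assertion we need to verify that for any compact set $K \subset \mathbb{R} \setminus \Upsilon$, the operator $G E_H(K)$ is $H$-smooth. A sufficient requirement for this is that (see, e.g., [29, Theorem XIII.30])
\[ \sup_{\lambda \in K,\ 0 < \varepsilon < 1} \| G (H - \lambda - i\varepsilon)^{-1} G^* \|_{B(L^2(\mathcal{M}))} < \infty. \tag{36} \]
From Theorem 6.1 we infer that
\[ \sup_{\lambda \in K,\ 0 < \varepsilon < 1} \| (H - \lambda - i\varepsilon)^{-1} \|_{B(\mathbf{H}^{-1}_{(\gamma)}(\mathcal{M}),\, \mathbf{H}^1_{(-\gamma)}(\mathcal{M}))} < \infty \]
and therefore (36) is fulfilled. This proves the third statement.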
Finally, the proof of Theorem 1.3 amounts to combining the results above: Proof (of Theorem 1.3). It follows immediately from the discussion above, Theorem 6.1, Lemma 7.2 and Theorem 7.1.
References
[1] W. O. Amrein, A. Boutet de Monvel and V. Georgescu: C0-groups, commutator methods and spectral theory of N-body Hamiltonians, Progress in Math. Ser., Vol. 135, Birkhäuser, 1996.
[2] M. Ben-Artzi and A. Devinatz: “The limiting absorption principle for partial differential operators”, Mem. Amer. Math. Soc., Vol. 66(364), (1987), pp. iv+70.
[3] M. Ben-Artzi, Y. Dermenjian and J.-C. Guillot: “Acoustic waves in perturbed stratified fluids: a spectral theory”, Comm. Partial Differential Equations, Vol. 14, (1989), pp. 479–517.
[4] J. Bergh and J. Löfström: Interpolation spaces. An introduction, Springer-Verlag, Berlin-New York, 1976.
[5] A. Boutet de Monvel-Berthier and V. Georgescu: “Some developments and applications of the abstract Mourre theory”, Méthodes semi-classiques, Vol. 2, Nantes, 1991; Astérisque, Vol. 210, (1992), pp. 27–48.
[6] A. Boutet de Monvel-Berthier and V. Georgescu: “Graded C∗-algebras and many-body perturbation theory: II The Mourre estimate”, Astérisque, Vol. 210, (1992), pp. 75–96.
[7] A. Boutet de Monvel-Berthier, V. Georgescu and A. Soffer: “N-body Hamiltonians with hard-core interactions”, Rev. Math. Phys., Vol. 6, (1994), pp. 515–596.
[8] A. Boutet de Monvel-Berthier and D. Manda: “Spectral and scattering theory for wave propagation in perturbed stratified media”, J. Math. Anal. Appl., Vol. 91, (1995), pp. 137–167.
[9] A. Boutet de Monvel and R. Purice: “The conjugate operator method: application to Dirac operators and to stratified media”, In: Evolution equations, Feshbach resonances, singular Hodge theory, Math. Top., Vol. 16, Wiley-VCH, Berlin, 1999, pp. 243–286.
[10] J. Derezinski and C. Gérard: Scattering theory of classical and quantum N-particle systems, Springer-Verlag, Berlin, 1997.
[11] Y. Dermenjian, M. Durand and V. Iftimie: “Spectral analysis of an acoustic multistratified perturbed cylinder”, Comm. Partial Differential Equations, Vol. 23(1-2), (1998), pp. 141–169.
[12] D.E. Edmunds and W.D. Evans: Spectral theory and differential operators, Oxford University Press, New York, 1987.
[13] D.M. Eidus: “The principle of limiting amplitude”, Uspehi Mat. Nauk, Vol. 24(3), (1969), pp. 91–156.
[14] R. Froese and I. Herbst: “Exponential bounds and absence of positive eigenvalues for N-body Schrödinger operators”, Comm. Math. Phys., Vol. 87, (1982/83), pp. 429–447.
[15] C.I. Goldstein: “Eigenfunction expansions associated with the Laplacian for certain domains with infinite boundaries. I.”, Trans. Amer. Math. Soc., Vol. 135, (1969), pp. 1–31.
[16] E. Hille and R.S. Phillips: Functional analysis and semi-groups (Third printing of the revised edition of 1957), American Mathematical Society, Providence, R. I., 1974.
[17] H. Iwashita: “Spectral theory for symmetric systems in an exterior domain”, Tsukuba J. Math., Vol. 11, (1987), pp. 241–256.
[18] K. A. Kiers and W. van Dijk: “Scattering in one dimension: the coupled Schrödinger equation, threshold behaviour and Levinson’s theorem”, J. Math. Phys., Vol. 37, (1996), pp. 6033–6059.
[19] D. Krejcirik and R.T. de Aldecoa: “The nature of the essential spectrum in curved quantum waveguides”, J. Phys. A, Vol. 37, (2004), pp. 5449–5466.
[20] I. Laba: “Long-range one-particle scattering in a homogeneous magnetic field”, Duke Math. J., Vol. 70(2), (1993), pp. 283–303.
[21] R.B. Lavine: “Commutators and scattering theory. II. A class of one body problems”, Indiana Univ. Math. J., Vol. 21, (1971/72), pp. 643–656.
[22] W.C. Lyford: “Spectral analysis of the Laplacian in domains with cylinders”, Math. Ann., Vol. 218, (1975), pp. 229–251.
[23] M. Melgaard: “Spectral properties at a threshold for two-channel Hamiltonians. II. Applications to scattering theory”, J. Math. Anal. Appl., Vol. 256, (2001), pp. 568–586.
[24] M. Melgaard: “Optimal limiting absorption principle for a Schrödinger type operator on a Lipschitz cylinder”, Manus. Math., Vol. 118, (2005), pp. 253–270.
[25] E. Mourre: “Absence of singular continuous spectrum for certain self-adjoint operators”, Comm. Math. Phys., Vol. 78, (1980/81), pp. 391–408.
[26] P. Perry, I.M. Sigal and B. Simon: “Spectral analysis of N-body Schrödinger operators”, Ann. of Math., Vol. 114(2), (1981), pp. 519–567.
[27] M. Reed and B. Simon: Methods of modern mathematical physics, I. Functional analysis, Academic Press, New York, 1980.
[28] M. Reed and B. Simon: Methods of modern mathematical physics, II. Fourier analysis, self-adjointness, Academic Press, New York, 1975.
[29] M. Reed and B. Simon: Methods of modern mathematical physics, III. Scattering theory, Academic Press, New York, 1979.
[30] B. Simon: “A canonical decomposition for quadratic forms with applications to monotone convergence theorems”, J. Funct. Anal., Vol. 28, (1978), pp. 377–385.
[31] H. Tamura: “Principle of limiting absorption for N-body Schrödinger operators – a remark on the commutator method”, Lett. Math. Phys., Vol. 17, (1989), pp. 31–36.
[32] H. Tamura: “Resolvent estimates at low frequencies and limiting amplitude principle for acoustic propagators”, J. Math. Soc. Japan, Vol. 41, (1989), pp. 549–575.
[33] R. Weder: “Spectral analysis of strongly propagative systems”, J. Reine Angew. Math., Vol. 354, (1984), pp. 95–122.
DOI: 10.2478/s11533-006-0044-3 Research article CEJM 5(1) 2007 154–163
Comparison theorems for noncanonical third order nonlinear differential equations
Ivan Mojsej∗, Ján Ohriska†
Institute of Mathematics, Faculty of Science, P. J. Šafárik University, 041 54 Košice, Slovak Republic
Received 8 June 2006; accepted 25 October 2006
Abstract: The aim of our paper is to study oscillatory and asymptotic properties of solutions of nonlinear differential equations of the third order with quasiderivatives. We prove comparison theorems on property A between linear and nonlinear equations. Some integral criteria ensuring property A for nonlinear equations are also given. Our assumptions on the nonlinearity of f are restricted to its behavior only in a neighborhood of zero and a neighborhood of infinity.
© Versita Warsaw and Springer-Verlag Berlin Heidelberg. All rights reserved.
Keywords: Comparison theorem, property A, quasiderivative, noncanonical form, nonlinear equation
MSC (2000): 34C10
1
Introduction
Consider the third-order nonlinear differential equations that have quasiderivatives of the form:
\[ \left( \frac{1}{p(t)} \left( \frac{1}{r(t)}\, x'(t) \right)' \right)' + q(t) f(x(t)) = 0, \qquad t \geq 0, \tag{N} \]
where
\[ r, p, q \in C([0, \infty), \mathbb{R}), \quad r(t) > 0,\ p(t) > 0,\ q(t) > 0 \text{ on } [0, \infty), \tag{H1} \]
\[ f \in C(\mathbb{R}, \mathbb{R}), \quad f(u)u > 0 \text{ for } u \neq 0. \tag{H2} \]
With no restatement of conditions (H1) and (H2), we shall assume their validity throughout this paper.
I. Mojsej, J. Ohriska / Central European Journal of Mathematics 5(1) 2007 154–163
For the sake of brevity, we put
\[ x^{[0]} = x, \quad x^{[1]} = \frac{1}{r}\, x', \quad x^{[2]} = \frac{1}{p} \left( \frac{1}{r}\, x' \right)' = \frac{1}{p} \left( x^{[1]} \right)', \quad x^{[3]} = \frac{1}{q} \left( \frac{1}{p} \left( \frac{1}{r}\, x' \right)' \right)' = \frac{1}{q} \left( x^{[2]} \right)'. \]
The functions $x^{[i]}$, $i = 0, 1, 2, 3$, we call the quasiderivatives of $x$. In addition to (H1) and (H2), we shall occasionally assume that
\[ \liminf_{|u| \to \infty} \frac{f(u)}{u} > 0 \tag{H3} \]
and
\[ \liminf_{u \to 0} \frac{f(u)}{u} > 0. \tag{H4} \]
By a solution of an equation of the form (N), we mean a function $w \in C^1([0, \infty), \mathbb{R})$ such that $w^{[1]}(t), w^{[2]}(t) \in C^1([0, \infty), \mathbb{R})$, satisfying equation (N) for all $t \geq 0$. Any solution of (N) is said to be proper if it is defined on the interval $[0, \infty)$ and is nontrivial in any neighborhood of infinity. A proper solution is said to be oscillatory if it has a sequence of zeros converging to $\infty$; otherwise it is said to be nonoscillatory. Furthermore, equation (N) is called oscillatory if it has at least one nontrivial oscillatory solution; it is called nonoscillatory if all of its solutions are nonoscillatory. The relevance of the study of the asymptotic behavior of solutions is often established by introducing the concept of equation with property A. More precisely, equation (N) is said to have property A if any proper solution $x$ of (N) is either oscillatory or satisfies the condition
\[ |x^{[i]}(t)| \downarrow 0 \ \text{ as } t \to \infty \ \text{ for } i = 0, 1, 2. \]
The notation $u(t) \downarrow 0$ means that the function $u$ monotonically decreases to zero as $t \to \infty$. The special case of equation (N) satisfying (H1)–(H4) is the linear equation
\[ \left( \frac{1}{p(t)} \left( \frac{1}{r(t)}\, x'(t) \right)' \right)' + q(t) x(t) = 0, \qquad t \geq 0, \tag{L} \]
the oscillatory and asymptotic properties of which are studied in [1, 2, 4–6, 10]. The nonlinear case, equation (N), has been thoroughly investigated in [1, 2, 5]. In particular, many papers have been devoted to the study of the oscillatory and asymptotic properties of solutions of differential equations of the n-th order with quasiderivatives. Among the extensive literature on this field, we refer the reader to [7, 8, 11, 12] and to the references contained therein. As is customary, we shall say that equation (N) [(L)] is in the canonical form if
\[ \int^{\infty} r(t)\, dt = \int^{\infty} p(t)\, dt = \infty. \]
We would like to point out that most of the results of such research (especially the comparison theorems) necessitate the canonical form of the differential equations under investigation (see, e.g., [2, 3, 5, 7–10, 12]). The aim of this paper is to continue the study of equation (N) [(L)] in the noncanonical form (i.e., the case where $\int^{\infty} r(t)\, dt < \infty$ or $\int^{\infty} p(t)\, dt < \infty$ or both integrals converge).
Our research is based on a study of the asymptotic behavior of nonoscillatory solutions of equation (N) as well as on a linearization device. The paper is organized as follows: The second section summarizes some established results for linear equation (L) and some notation that will be useful in our ensuing investigations. In section 3, we prove the comparison theorems on property A between the linear and nonlinear equations. As a result, we obtain sufficient conditions that ensure property A for equation (N). Such results are presented as integral criteria that involve only the functions p, r, q. Our findings expand on some of the results in [2, 5] for the canonical case. Several examples illustrating our main theorems are also provided. We point out that our assumptions on the nonlinearity of f are restricted to its behavior only in a neighborhood of zero and a neighborhood of infinity. Not only are monotonicity conditions unnecessary but also no assumptions on the behavior of f in R are required.
2
Preliminary results
In the recent papers [1, 2, 5, 6], others have studied relationships between property A and oscillation as well as the oscillatory and asymptotic properties of solutions of linear equation (L). We recount some of these results insofar as they inform this sequel. We start with the following oscillation and nonoscillation criteria, results proved in [6]. Consider the notation
\[ I(u_i) = \int_0^\infty u_i(t)\, dt, \qquad I(u_i, u_j) = \int_0^\infty u_i(t) \int_0^t u_j(s)\, ds\, dt, \qquad i, j = 1, 2, \]
\[ I(u_i, u_j, u_k) = \int_0^\infty u_i(t) \int_0^t u_j(s) \int_0^s u_k(b)\, db\, ds\, dt, \qquad i, j, k = 1, 2, 3, \]
where $u_i$, $i = 1, 2, 3$, are continuous positive functions on $[0, \infty)$. For simplicity, we shall sometimes write $u(\infty)$ instead of $\lim_{t \to \infty} u(t)$.
Theorem 2.1. ([6], Theorems 8 and 10) Suppose one of the conditions
(i) $I(p) = I(r) = I(q, r) = \infty$,
(ii) $I(q) = I(p) = I(r, p) = \infty$,
(iii) $I(r) = I(q) = I(p, q) = \infty$
is satisfied. Then equation (L) is oscillatory.
Theorem 2.2. ([6], Theorems 5 and 7) Suppose one of the integrals
\[ I(q, r, p), \qquad I(p, q, r), \qquad I(r, p, q) \]
is convergent. Then equation (L) is nonoscillatory.
Let $\mathcal{N}(N)$ and $\mathcal{N}(L)$ denote the sets of all proper nonoscillatory solutions of (N) and (L), respectively. The sets $\mathcal{N}(N)$ and $\mathcal{N}(L)$ can be divided into the following four classes
in the same way as in [1, 2, 5]:
$N_0 = \{x \in \mathcal{N}(N)\ [x \in \mathcal{N}(L)],\ \exists\, T_x : x(t)x^{[1]}(t) < 0,\ x(t)x^{[2]}(t) > 0 \text{ for } t \geq T_x\}$
$N_1 = \{x \in \mathcal{N}(N)\ [x \in \mathcal{N}(L)],\ \exists\, T_x : x(t)x^{[1]}(t) > 0,\ x(t)x^{[2]}(t) < 0 \text{ for } t \geq T_x\}$
$N_2 = \{x \in \mathcal{N}(N)\ [x \in \mathcal{N}(L)],\ \exists\, T_x : x(t)x^{[1]}(t) > 0,\ x(t)x^{[2]}(t) > 0 \text{ for } t \geq T_x\}$
$N_3 = \{x \in \mathcal{N}(N)\ [x \in \mathcal{N}(L)],\ \exists\, T_x : x(t)x^{[1]}(t) < 0,\ x(t)x^{[2]}(t) < 0 \text{ for } t \geq T_x\}$
If $x \in N_0$, then its quasiderivatives satisfy the inequality $x^{[i]}(t)x^{[i+1]}(t) < 0$ for $i = 0, 1, 2$, for all sufficiently large $t$. Using the terminology in [1, 2, 4, 5], we call it a Kneser solution. The asymptotic properties of Kneser solutions of equation (L) are given by the following lemma.
Lemma 2.3. ([5], Lemma 4)
(i) If there exists $x \in N_0$ such that $\lim_{t \to \infty} x(t) \neq 0$, then $I(q, p, r) < \infty$.
(ii) If there exists $x \in N_0$ such that $\lim_{t \to \infty} x^{[1]}(t) \neq 0$, then $I(r, q, p) < \infty$.
(iii) If there exists $x \in N_0$ such that $\lim_{t \to \infty} x^{[2]}(t) \neq 0$, then $I(p, r, q) < \infty$.
Finally, we introduce the result that connects property A to oscillation and to the integral behavior of functions $p$, $r$, $q$.
Theorem 2.4. ([1], Theorem 2.2) The following assertions are equivalent:
(i) (L) has property A.
(ii) (L) is oscillatory and $I(q, p, r) = I(r, q, p) = I(p, r, q) = \infty$.
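The four classes $N_0$–$N_3$ introduced in this section are determined purely by the eventual signs of $x\,x^{[1]}$ and $x\,x^{[2]}$. The following minimal sketch encodes that sign table; the function name `solution_class` is hypothetical, and the sample values assume the illustrative case $r = p = q = 1$, $f(u) = u$, for which $x(t) = e^{-t}$ solves $x''' + x = 0$ with $x^{[1]} = -e^{-t}$ and $x^{[2]} = e^{-t}$.

```python
import math

def solution_class(x, x1, x2):
    """Return i such that the sign pattern matches class N_i.

    x, x1, x2 stand for x(t), x^[1](t), x^[2](t) at some large t.
    Returns None if one of the products x*x1, x*x2 vanishes.
    """
    a, b = x * x1, x * x2
    if a == 0 or b == 0:
        return None
    if a < 0 and b > 0:
        return 0  # Kneser solutions
    if a > 0 and b < 0:
        return 1
    if a > 0 and b > 0:
        return 2
    return 3      # a < 0 and b < 0

# Assumed example: x(t) = exp(-t) with r = p = q = 1 and f(u) = u.
t = 5.0
kneser = solution_class(math.exp(-t), -math.exp(-t), math.exp(-t))
```

A single sample point only illustrates the sign pattern; membership in a class requires the signs to hold for all sufficiently large $t$.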
3
Main results
We begin our consideration with a comparison theorem. Theorem 3.1. Assume (H3), (H4), I(r) < ∞ and I(p) = I(r, p) = I(q) = ∞. If equation (L) has property A, then equation (N) has property A. Proof. Let x be a proper nonoscillatory solution of (N). We know that any proper nonoscillatory solution x of equation (N) belongs to one of four classes. More specifically, x ∈ N0 ∪ N1 ∪ N2 ∪ N3 . We assume that there exists T ≥ 0 such that x(t) > 0 for all t ≥ T . Now, suppose that (N) does not have property A. There are, then, four possibilities: I. x ∈ N1 , II. x ∈ N2 , III. x ∈ N0 such that limt→∞ x[i] (t) = 0 for some i ∈ {0, 1, 2}, IV. x ∈ N3 . Case I. Since x is a positive nonoscillatory solution of (N) in the class N1 , there exists T1 ≥ T such that x(t) > 0, x[1] (t) > 0, x[2] (t) < 0 for all t ≥ T1 . Moreover, since x is a positive increasing function, either i) limt→∞ x(t) = α < ∞ or ii) limt→∞ x(t) = ∞.
In case (i), the continuity of the function f(u) on the interval [x(T2), α] (where T2 ≥ T1) ensures the existence of a positive constant K such that
\[ \frac{f(x(t))}{x(t)} \ge K \quad \text{for all } t \ge T_2. \tag{1} \]
In case (ii), the corresponding inequality (1) holds for some positive constant K1 for all t ≥ T3 ≥ T1 insofar as (H3) holds. Now we see that there exist a positive number K2 and T4 ≥ T1 such that
\[ \frac{f(x(t))}{x(t)} \ge K_2 \quad \text{for all } t \ge T_4. \tag{2} \]
As a positive decreasing function, x[1](t) is bounded. By integrating equation (N) twice in the interval [T4, t], we see that
\[ x^{[1]}(t) = x^{[1]}(T_4) + x^{[2]}(T_4)\int_{T_4}^{t} p(s)\,ds - \int_{T_4}^{t} p(s)\int_{T_4}^{s} q(u)f(x(u))\,du\,ds \]
or
\[ x^{[1]}(t) < x^{[1]}(T_4) - \int_{T_4}^{t} p(s)\int_{T_4}^{s} q(u)f(x(u))\,du\,ds. \]
Using this expression with (2), we obtain
\[ x^{[1]}(T_4) - x^{[1]}(t) > \int_{T_4}^{t} p(s)\int_{T_4}^{s} q(u)f(x(u))\,du\,ds \ge K_2\int_{T_4}^{t} p(s)\int_{T_4}^{s} q(u)x(u)\,du\,ds \ge K_2\,x(T_4)\int_{T_4}^{t} p(s)\int_{T_4}^{s} q(u)\,du\,ds. \]
When t → ∞, we have that I(p, q) < ∞. Taken together with I(r) < ∞, we have that I(r, p, q) < ∞. However, inasmuch as equation (L) has property A, equation (L) is oscillatory by Theorem 2.4. Now, Theorem 2.2 yields I(r, p, q) = ∞, a contradiction.
Case II. Inasmuch as x is a positive nonoscillatory solution of (N) in the class N2, there exists T1 ≥ T such that x(t) > 0, x[1](t) > 0, x[2](t) > 0 for all t ≥ T1. Since (x[2])′(t) = −q(t)f(x(t)) < 0 for all t ≥ T1, x[2](t) is a positive decreasing function. Hence, 0 ≤ x[2](∞) < ∞. Just as in case I, from the positive increasing nature of the function x, we establish the validity of (2). Integrating equation (N) in [T4, ∞), we obtain
\[ x^{[2]}(T_4) - x^{[2]}(\infty) = \int_{T_4}^{\infty} q(t)f(x(t))\,dt. \]
Given that 0 ≤ x[2](∞) < ∞ and (2), there exists a positive constant c such that
\[ c = \int_{T_4}^{\infty} q(t)f(x(t))\,dt \ge K_2\int_{T_4}^{\infty} q(t)x(t)\,dt \ge K_2\,x(T_4)\int_{T_4}^{\infty} q(t)\,dt, \]
contradicting I(q) = ∞.
Case III. Let x ∈ N0 such that limt→∞ x[i](t) = 0 for some i ∈ {0, 1, 2}. Consider the linearized equation
\[ \left(\frac{1}{p(t)}\left(\frac{1}{r(t)}\,w'(t)\right)'\right)' + q(t)F(t)w(t) = 0, \tag{$L_F$} \]
where F(t) = f(x(t))/x(t). Since w ≡ x is a nonoscillatory solution of (LF), equation (LF) has a Kneser solution such that limt→∞ w[i](t) = 0 for some i ∈ {0, 1, 2}. By Lemma 2.3, at least one of the integrals I(qF, p, r), I(r, qF, p), or I(p, r, qF) is convergent. For positive decreasing x, either i) limt→∞ x(t) = β > 0 or ii) limt→∞ x(t) = 0. In case (i), the continuity of the function f(u) on the interval [β, x(T)] (where T ≥ 0) ensures the existence of a positive constant M such that
\[ F(t) = \frac{f(x(t))}{x(t)} \ge M \quad \text{for all } t \text{ sufficiently large}. \tag{3} \]
In case (ii), based on (H4), the inequality of the form (3) holds for some positive constant M1 for all t sufficiently large. We see, then, that there exists a positive number M2 such that
\[ \frac{f(x(t))}{x(t)} \ge M_2 \quad \text{for all } t \text{ sufficiently large}. \tag{4} \]
Now, from the inequality (4), we get M2 I(q, p, r) ≤ I(qF, p, r), M2 I(r, q, p) ≤ I(r, qF, p), M2 I(p, r, q) ≤ I(p, r, qF) and, so, at least one of the integrals I(q, p, r), I(r, q, p), or I(p, r, q) is convergent. However, since equation (L) has property A, it follows from Theorem 2.4 that all of these integrals are divergent, a contradictory result.
Case IV. Let x ∈ N3. Then there exists T1 ≥ T such that x(t) > 0, x[1](t) < 0, x[2](t) < 0 for all t ≥ T1. As a positive decreasing function, x is bounded. We have −x[2](t) ≥ −x[2](T1) > 0 for all t ≥ T1 because x[2] is a negative decreasing function. Integrating this inequality twice in [T1, t], we obtain
\[ x(t) \le x(T_1) + x^{[1]}(T_1)\int_{T_1}^{t} r(s)\,ds + x^{[2]}(T_1)\int_{T_1}^{t} r(s)\int_{T_1}^{s} p(u)\,du\,ds \]
or
\[ x(t) < x(T_1) + x^{[2]}(T_1)\int_{T_1}^{t} r(s)\int_{T_1}^{s} p(u)\,du\,ds. \]
Since I(r, p) = ∞ and x[2](T1) < 0, the right-hand side tends to −∞ as t → ∞, which contradicts the positivity of x for all t ≥ T1. The case x(t) < 0 for all t ≥ T∗ may be treated similarly. Thus, we have proved that any proper solution x of equation (N) is either oscillatory or belongs to the class N0 such that limt→∞ x[i](t) = 0 for all i ∈ {0, 1, 2}. This completes the proof.
Theorem 3.1, together with integral criteria that ensure property A for equation (L), implies the following result.
Corollary 3.2. Assume (H3) and (H4) and assume that one of the following conditions is satisfied:
(i) I(r) < ∞, I(p) = I(r, p) = I(q) = I(r, q) = ∞,
(ii) I(r, q) < ∞, I(p) = I(r, p) = I(q) = ∞ and
\[ \int_{0}^{\infty} p(t)\int_{t}^{\infty} r(s)\,ds\int_{t}^{\infty} q(s)\int_{s}^{\infty} r(a)\,da\,ds\,dt = \infty. \]
Then equation (N) has property A.
Proof. From Theorems 4 and 5 in [4], it follows that equation (L) has property A. Now we get the assertion from Theorem 3.1.
The following example illustrates the statement of Theorem 3.1.
Example 3.3. Consider the differential equation given by
\[ \left(\frac{1}{t+1}\left((t+1)^{3}\,x'(t)\right)'\right)' + (t+1)^{2}\left(x^{3}(t) + x(t)\right) = 0, \quad t \ge 0. \tag{5} \]
This equation has the form (N) where f(u) = u^3 + u, r(t) = 1/(t + 1)^3, p(t) = t + 1 and q(t) = (t + 1)^2. Here, I(r) < ∞ and, so, the equation under consideration is in the non-canonical form. It is easy to verify that the assumptions of Theorem 3.1 are satisfied; for instance, I(r) = 1/2 and I(p) = I(q) = ∞, while the inner integral in I(r, p) equals ((t + 1)^2 − 1)/2, so the integrand behaves like 1/(2(t + 1)) for large t and I(r, p) = ∞. Moreover, the corresponding linear equation has property A (see Theorem 4, part (iii) in [4]). Thus, by Theorem 3.1, the nonlinear equation (5) has property A.
Theorem 3.4. Assume (H3), (H4), I(p) < ∞, and I(r) = I(p, q) = I(q) = ∞. If equation (L) has property A, then equation (N) has property A.
Proof. Let x be a proper nonoscillatory solution of (N). Then x ∈ N0 ∪ N1 ∪ N2 ∪ N3 and we assume that there exists T ≥ 0 such that x(t) > 0 for all t ≥ T. Now, suppose that (N) does not have property A. There are, then, four possibilities: I. x ∈ N1, II. x ∈ N2, III. x ∈ N0 such that limt→∞ x[i](t) = 0 for some i ∈ {0, 1, 2}, IV. x ∈ N3. Given that I(r) = ∞ implies I(r, p) = ∞, cases II, III, and IV yield contradictions in the same way as in the proof of Theorem 3.1. In case I, proceeding as in the proof of Theorem 3.1, we get I(p, q) < ∞, a contradiction. The case x(t) < 0 for all t ≥ T∗ may be treated similarly. This completes the proof.
From Theorem 3.4, we obtain the following:
Corollary 3.5. Assume (H3) and (H4) and assume that one of the following conditions is satisfied:
(i) I(p) < ∞, I(r) = I(p, q) = I(q) = I(p, r) = ∞,
(ii) I(p, r) < ∞, I(r) = I(p, q) = I(q) = ∞ and
\[ \int_{0}^{\infty} q(t)\int_{t}^{\infty} p(s)\,ds\int_{t}^{\infty} r(s)\int_{s}^{\infty} p(a)\,da\,ds\,dt = \infty. \]
Then equation (N) has property A.
Proof. From Theorems 4 and 5 in [4], it follows that equation (L) has property A. Now we get the assertion from Theorem 3.4.
Remark 3.6. Note that these integral criteria extend to equation (N) the criteria stated for equation (L) (see Theorems 4 and 5 in [4]) and for equation (N) in the canonical form (see Corollary 4 in [5]).
The following example illustrates the statement of Theorem 3.4.
Example 3.7. Consider the differential equation
\[ \left((t+1)^{3}\left(\frac{1}{t+1}\,x'(t)\right)'\right)' + (t+1)^{3}\left(x^{3}(t) + x(t)\right) = 0, \quad t \ge 0. \tag{6} \]
This equation has the form (N) where f(u) = u^3 + u, r(t) = t + 1, p(t) = 1/(t + 1)^3 and q(t) = (t + 1)^3. Here, I(p) < ∞ and, so, the equation under consideration is in the non-canonical form. It is easy to verify that the assumptions of Theorem 3.4 are satisfied; for instance, I(p) = 1/2 and I(r) = I(q) = ∞, while the inner integral in I(p, q) equals ((t + 1)^4 − 1)/4, so the integrand behaves like (t + 1)/4 for large t and I(p, q) = ∞. Moreover, the corresponding linear equation has property A (see Theorem 4, part (i) in [4]). Thus, by Theorem 3.4, the nonlinear equation (6) has property A.
Remark 3.8. Our comparison results (Theorems 3.1 and 3.4) extend other results that have been proved for the differential equations in the canonical form (see, e.g., Theorem 3 in [2] and Theorem 4 in [5]).
We obtain a somewhat weaker result than the one above if the integrals I(p) and I(r) are both convergent. The following theorem holds:
Theorem 3.9. Assume (H3), (H4), I(p) < ∞ and I(r) < ∞. If equation (L) has property A, then equation (N) has property A or any solution x of equation (N) from the class N3 tends to zero as t → ∞.
Proof. Let x be a proper nonoscillatory solution of (N). It follows that x ∈ N0 ∪ N1 ∪ N2 ∪ N3. Moreover, we assume that there exists T ≥ 0 such that x(t) > 0 for all t ≥ T. Now, suppose that (N) does not have property A and that there exists a solution x of equation (N) from the class N3 that tends to a positive constant as t tends to infinity. Then, there are four possibilities: I. x ∈ N1, II. x ∈ N2, III. x ∈ N0 such that limt→∞ x[i](t) = 0 for some i ∈ {0, 1, 2}, IV. x ∈ N3 such that limt→∞ x(t) = c > 0.
In each of cases I and III, we get a contradiction in the same way as in the proof of Theorem 3.1.
Case II. Proceeding as in the proof of Theorem 3.1, we get I(q) < ∞. Taken together with I(r) < ∞ and I(p) < ∞, we have that I(q, p, r) < ∞. However, since equation (L) has property A, it follows from Theorem 2.4 that I(q, p, r) = ∞, a contradiction.
Case IV. Let x ∈ N3 such that limt→∞ x(t) = c > 0. Then, there exists T1 ≥ T such that x(t) > 0, x[1](t) < 0, x[2](t) < 0 for all t ≥ T1. Integrating equation (N) three times in the interval [T1, t] we obtain
\[ x(t) = x(T_1) + x^{[1]}(T_1)\int_{T_1}^{t} r(s)\,ds + x^{[2]}(T_1)\int_{T_1}^{t} r(s)\int_{T_1}^{s} p(k)\,dk\,ds - \int_{T_1}^{t} r(s)\int_{T_1}^{s} p(k)\int_{T_1}^{k} q(a)f(x(a))\,da\,dk\,ds \]
or
\[ x(t) < x(T_1) - \int_{T_1}^{t} r(s)\int_{T_1}^{s} p(k)\int_{T_1}^{k} q(a)f(x(a))\,da\,dk\,ds \quad \text{for all } t \ge T_1. \]
Inasmuch as x is a positive decreasing function such that limt→∞ x(t) = c > 0, we have that 0 < c ≤ x(t) ≤ x(T2) < ∞ for all t ≥ T2 where T2 ≥ T1. The continuity of the function f(u)/u on the interval [c, x(T2)] ensures the existence of a positive constant K1 such that f(x(t)) ≥ K1 x(t) for all t ≥ T2. Thus, we have that
\[ x(t) \le x(T_1) - K_1\int_{T_2}^{t} r(s)\int_{T_2}^{s} p(k)\int_{T_2}^{k} q(a)x(a)\,da\,dk\,ds \quad \text{for all } t \ge T_2. \]
Based on the positive decreasing nature of x, we obtain
\[ x(t) \le x(T_1) - K_1\,x(t)\int_{T_2}^{t} r(s)\int_{T_2}^{s} p(k)\int_{T_2}^{k} q(a)\,da\,dk\,ds \quad \text{for all } t \ge T_2 \]
or
\[ x(t)\left[1 + K_1\int_{T_2}^{t} r(s)\int_{T_2}^{s} p(k)\int_{T_2}^{k} q(a)\,da\,dk\,ds\right] \le x(T_1) \quad \text{for all } t \ge T_2. \]
When t → ∞, I(r, p, q) < ∞. However, by Theorem 2.4, equation (L) is oscillatory because equation (L) has property A. Now, according to Theorem 2.2, I(r, p, q) = ∞, a contradictory result. The case x(t) < 0 for all t ≥ T ∗ may be treated similarly. We have just proved that any proper solution x of equation (N) either is oscillatory or belongs to the class N0 such that limt→∞ x[i] (t) = 0 for all i ∈ {0, 1, 2} or belongs to the class N3 such that limt→∞ x(t) = 0. This completes the proof. Remark 3.10. Let us remark that, so far, we have not found any results that are comparable with our Theorem 3.9.
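The integral conditions used throughout this section can be explored numerically by truncating the integrals at a finite upper limit T. The sketch below does this for the coefficients of Example 3.3. It is only an illustration: the names cumulative_trapezoid and truncated_I, the trapezoid rule, and the grid size are assumptions made here, and a finite truncation can only suggest convergence or divergence; it cannot prove either.

```python
import math


def cumulative_trapezoid(values, h):
    """Running trapezoid-rule integral of equally spaced samples (step h)."""
    out = [0.0]
    for left, right in zip(values, values[1:]):
        out.append(out[-1] + 0.5 * h * (left + right))
    return out


def truncated_I(funcs, T, n=20_000):
    """Approximate I(u1, ..., uk) with the outer integral cut off at T.

    For funcs = [u1, u2] this is int_0^T u1(t) int_0^t u2(s) ds dt,
    built from the innermost factor outwards.
    """
    h = T / n
    grid = [i * h for i in range(n + 1)]
    inner = [1.0] * (n + 1)
    for u in reversed(funcs):
        integrand = [u(t) * w for t, w in zip(grid, inner)]
        inner = cumulative_trapezoid(integrand, h)
    return inner[-1]


# Coefficients of Example 3.3.
def r(t):
    return (t + 1.0) ** -3


def p(t):
    return t + 1.0


def q(t):
    return (t + 1.0) ** 2


if __name__ == "__main__":
    for T in (10.0, 100.0, 1000.0):
        print(
            f"T={T:7.0f}  I(r)~{truncated_I([r], T):.5f}  "
            f"I(p)~{truncated_I([p], T):.3e}  "
            f"I(r,p)~{truncated_I([r, p], T):.4f}  "
            f"I(q)~{truncated_I([q], T):.3e}"
        )
```

With these coefficients the truncated value of I(r) stabilizes near 1/2, while the truncated values of I(p), I(q) and I(r, p) keep growing with T (the last one only logarithmically), in line with the assumptions I(r) < ∞ and I(p) = I(r, p) = I(q) = ∞ of Theorem 3.1.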
Acknowledgments Our research was supported by grant 1/3005/06 of the Grant Agency of Slovak Republic (VEGA). The authors thank the referees for their careful reading of the manuscript and for their helpful suggestions for an improved presentation.
References
[1] M. Cecchi, Z. Došlá and M. Marini: “On nonlinear oscillations for equations associated to disconjugate operators”, Nonlinear Analysis, Theory, Methods & Applications, Vol. 30(3), (1997), pp. 1583–1594.
[2] M. Cecchi, Z. Došlá and M. Marini: “Comparison theorems for third order differential equations”, Proceeding of Dynamic Systems and Applications, Vol. 2, (1996), pp. 99–106.
[3] M. Cecchi, Z. Došlá and M. Marini: “Asymptotic behavior of solutions of third order delay differential equations”, Archivum Mathematicum (Brno), Vol. 33, (1997), pp. 99–108.
[4] M. Cecchi, Z. Došlá and M. Marini: “Some properties of third order differential operators”, Czech. Math. J., Vol. 47(122), (1997), pp. 729–748.
[5] M. Cecchi, Z. Došlá and M. Marini: “An Equivalence Theorem on Properties A, B for Third Order Differential Equations”, Annali di Matematica pura ed applicata (IV), Vol. CLXXIII, (1997), pp. 373–389.
[6] M. Cecchi, Z. Došlá, M. Marini and Gab. Villari: “On the qualitative behavior of solutions of third order differential equations”, J. Math. Anal. Appl., Vol. 197, (1996), pp. 749–766.
[7] J. Džurina: “Property (A) of n-th order ODE’s”, Mathematica Bohemica, Vol. 122(4), (1997), pp. 349–356.
[8] T. Kusano and M. Naito: “Comparison theorems for functional differential equations with deviating arguments”, J. Math. Soc. Japan, Vol. 33(3), (1981), pp. 509–532.
[9] I. Mojsej and J. Ohriska: “On solutions of third order nonlinear differential equations”, CEJM, Vol. 4(1), (2006), pp. 46–63.
[10] J. Ohriska: “Oscillatory and asymptotic properties of third and fourth order linear differential equations”, Czech. Math. J., Vol. 39(114), (1989), pp. 215–224.
[11] J. Ohriska: “Adjoint differential equations and oscillation”, J. Math. Anal. Appl., Vol. 195, (1995), pp. 778–796.
[12] V. Šeda: “Nonoscillatory solutions of differential equations with deviating argument”, Czech. Math. J., Vol. 36(111), (1986), pp. 93–107.
DOI: 10.2478/s11533-006-0039-0 Research article CEJM 5(1) 2007 164–180
Slice modules over minimal 2-fundamental algebras∗
Zygmunt Pogorzaly†, Karolina Szmyt
Faculty of Mathematics and Computer Science, Nicholaus Copernicus University, 87-100 Toruń, Poland
Received 21 June 2006; accepted 25 September 2006
Abstract: We consider a class of algebras whose Auslander-Reiten quivers have starting components that are not generalized standard. For these components we introduce a generalization of a slice and show that only in finitely many cases (up to isomorphism) a slice module is a tilting module.
© Versita Warsaw and Springer-Verlag Berlin Heidelberg. All rights reserved.
Keywords: Minimal 2-fundamental algebra, Auslander-Reiten quiver, slice module, tilting module
MSC (2000): 16G20, 16G70
Introduction
Let K be a fixed algebraically closed field. All algebras we consider will be finite dimensional associative K-algebras with a unit element. They will also be assumed to be basic and connected. For a given algebra A we shall denote by mod(A) the category of the finite dimensional right A-modules. We shall consider the Auslander-Reiten quiver ΓA of the algebra A [3]. We are interested in minimal 2-fundamental algebras as introduced in [13]. It happens quite frequently that if A is a minimal 2-fundamental algebra then its Auslander-Reiten quiver ΓA contains a component at the beginning that is not generalized standard in the sense of Skowroński [16] and contains the projective vertices. Thus it is reasonable to generalize the notion of a slice introduced in [9] (see also [14]) and study when a slice module is a tilting module. We shall define a postprojective (respectively, preinjective) slice S and consider a slice module MS. Tilting modules over representation-finite algebras were studied in this way in [9].
∗ The first named author was supported by the Polish Scientific Grant KBN No 1 P03A 018 27.
† E-mail: [email protected]
Z. Pogorzaly, K. Szmyt / Central European Journal of Mathematics 5(1) 2007 164–180
In this paper we shall characterize postprojective (respectively, preinjective) slice modules that are tilting (respectively, cotilting) modules. Contrary to tame hereditary algebras there are only finitely many postprojective (respectively, preinjective) slices S whose slice modules MS are tilting (respectively, cotilting) modules.
1
Preliminaries
Consider a finite dimensional basic K-algebra A. Following Gabriel [8] one can associate to A a bound quiver (QA, IA) in such a way that A ≅ KQA/IA, where KQA is the path algebra of the quiver QA, and IA is a two-sided ideal in KQA contained in the square of the two-sided ideal generated by the arrows. The algebra A is called triangular if QA has no oriented cycles.
An algebra A is said to be special biserial if there exists a bound quiver (QA, IA) with A ≅ KQA/IA such that:
(1) Every vertex of QA is the source of at most two arrows.
(2) Every vertex of QA is the sink of at most two arrows.
(3) For every arrow α in QA there exists at most one arrow β (respectively, γ) such that αβ ∉ IA (resp., γα ∉ IA).
Throughout the paper we shall always consider special biserial algebras of the form KQ/I with (Q, I) satisfying the above conditions.
Let A = KQA/IA be special biserial. Then A is called a string algebra (see [6]) if IA is generated only by paths. There is a full classification of indecomposable finite dimensional right A-modules [7, 18]. For every finite dimensional right A-module M we have two cases. In the first case M is induced by a walk w for which we shall often use the notation M(w) and say that M is a string module [18]. In the other case M is the so-called bound module [18] which we will not consider. We shall use an algorithm for computing Auslander-Reiten sequences for string modules due to Skowroński and Waschbüsch [17].
A triangular string algebra A = KQA/IA is said to be Ãm-separated provided that for any two subquivers Q′, Q″ in QA of type Ãm such that KQ′ ∩ IA = 0 = KQ″ ∩ IA we have Q′0 ∩ Q″0 = ∅, where Q′0, Q″0 denote the sets of vertices of Q′, Q″, respectively.
A triangular string Ãm-separated algebra A = KQA/IA is said to be 2-fundamental [13] if it is connected and the following conditions are satisfied:
(i) There exist exactly two full subquivers Q′, Q″ of type Ãm in (QA, IA) such that KQ′ ∩ IA = 0 = KQ″ ∩ IA and the quiver Q̄A obtained from QA by removing the arrows from Q′ and Q″ and identifying the vertices of Q′ with a vertex 0′ and the vertices of Q″ with a vertex 0″ is a tree.
(ii) For any 0j of 0′, 0″ there exists either a maximal path v in Q̄A starting at 0j such that v ∈ IA, or a maximal path u in Q̄A ending at 0j such that u ∈ IA. If v (treated as a path in QA) starts at some vertex x in Qj that is a sink of two maximal paths v1, v2 in Qj then v1v ∈ IA or v2v ∈ IA. If u (treated as a path in QA) ends at some vertex y in Qj that is a source of two maximal paths u1, u2 in Qj then uu1 ∈ IA or uu2 ∈ IA.
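The purely combinatorial parts of the definitions in this section can be checked mechanically on a concrete quiver. The following Python sketch tests conditions (1) and (2) of the definition of a special biserial algebra and the triangularity condition (no oriented cycles). The representation of a quiver as a list of (source, target) pairs and the function names are assumptions made here for illustration. Condition (3) and the separation conditions depend on the ideal IA and are not covered.

```python
from collections import defaultdict


def degree_conditions_hold(arrows):
    """Conditions (1) and (2): each vertex is the source of at most two
    arrows and the sink of at most two arrows."""
    out_degree = defaultdict(int)
    in_degree = defaultdict(int)
    for source, target in arrows:
        out_degree[source] += 1
        in_degree[target] += 1
    return all(d <= 2 for d in out_degree.values()) and all(
        d <= 2 for d in in_degree.values()
    )


def is_triangular(arrows):
    """Return True if the quiver has no oriented cycles (Kahn's algorithm)."""
    vertices = {v for arrow in arrows for v in arrow}
    in_degree = {v: 0 for v in vertices}
    successors = defaultdict(list)
    for source, target in arrows:
        successors[source].append(target)
        in_degree[target] += 1
    stack = [v for v in vertices if in_degree[v] == 0]
    removed = 0
    while stack:
        v = stack.pop()
        removed += 1
        for w in successors[v]:
            in_degree[w] -= 1
            if in_degree[w] == 0:
                stack.append(w)
    # Every vertex gets removed exactly when no oriented cycle exists.
    return removed == len(vertices)
```

For example, the commutative-square shape [(1, 2), (1, 3), (2, 4), (3, 4)] passes both checks, while a vertex with three outgoing arrows fails the degree test and a two-cycle [(1, 2), (2, 1)] fails the triangularity test.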
A 2-fundamental algebra A is said to be minimal if the graph obtained from the quiver Q̄A by forgetting orientations of the arrows is of the form 0′ — ··· — 0″.
It is well-known that we can attach to any K-algebra A its Auslander-Reiten quiver ΓA (see [3, 4]). We shall not distinguish between indecomposable A-modules and vertices of ΓA. A component in ΓA will always mean a connected component. Following [13] a component C of ΓA is said to be starting (resp., ending) if there is no nonzero morphism f : X → Y between indecomposable modules X, Y such that Y ∈ C and X ∉ C (resp., X ∈ C and Y ∉ C). Following Skowroński [16] we say that a component C of ΓA is generalized standard if rad∞(X, Y) = 0 for any indecomposable right A-modules X, Y ∈ C. Recall that rad∞(mod(A)) is the intersection of all positive powers of the Jacobson radical rad(mod(A)).
Consider the following three strictly increasing sequences of positive integers p = (p1, . . . , pq), s = (s1, . . . , sr) and x = (x1, . . . , xn) with q, r, n ≥ 1. Let l1, l2 ≥ 1 be two integers. Then we can consider a quiver $Q^{(1)}_{(p,l_1,x,s,l_2)}$ of the following form
[Diagram of the quiver $Q^{(1)}_{(p,l_1,x,s,l_2)}$: not recoverable from the extracted text.]
Let $I^{(1)}_{(p,l_1,x,s,l_2)}$ denote a two-sided ideal in the path algebra $KQ^{(1)}_{(p,l_1,x,s,l_2)}$ generated by the paths α1,z α1,1 , α1,a αn−1,1 . Denote by $A^{(1)}_{(p,l_1,x,s,l_2)}$ the algebra $KQ^{(1)}_{(p,l_1,x,s,l_2)}/I^{(1)}_{(p,l_1,x,s,l_2)}$.
Under the above assumptions and notations we can consider a quiver $Q^{(2)}_{(p,l_1,x,s,l_2)}$ that is dual to $Q^{(1)}_{(p,l_1,x,s,l_2)}$. Let $I^{(2)}_{(p,l_1,x,s,l_2)}$ denote a two-sided ideal in the path algebra $KQ^{(2)}_{(p,l_1,x,s,l_2)}$ generated by the paths α1,1 α1,z , αn−1,1 α1,a . Denote by $A^{(2)}_{(p,l_1,x,s,l_2)}$ the algebra $KQ^{(2)}_{(p,l_1,x,s,l_2)}/I^{(2)}_{(p,l_1,x,s,l_2)}$. These two families of algebras will be of great importance throughout the paper.
Lemma 1.1. Let A be a minimal 2-fundamental algebra whose Auslander-Reiten quiver ΓA contains a starting component that is not generalized standard. Then there are three strictly increasing sequences of positive integers p, x, s and two integers l1, l2 ≥ 1 such that $A \cong A^{(1)}_{(p,l_1,x,s,l_2)}$.
Proof. This lemma is a direct consequence of [13, Theorem 5.7] and the proof of Lemma 5.6 in [13].
Lemma 1.2. Let A be a minimal 2-fundamental algebra whose Auslander-Reiten quiver ΓA contains an ending component that is not generalized standard. Then there are three strictly increasing sequences of positive integers p, x, s and two integers l1, l2 ≥ 1 such that $A \cong A^{(2)}_{(p,l_1,x,s,l_2)}$.
Proof. This lemma is a direct consequence of [13, Theorem 5.7] and the proof of Lemma 5.6 in [13].
2
Slice modules and their projective dimensions
Let $A \cong A^{(1)}_{(p,l_1,x,s,l_2)}$, and C be the only starting component in ΓA. It is easy to see that C contains all indecomposable projective right A-modules. Moreover, we know from [13, Theorem 5.7] that C is not generalized standard. A postprojective slice in C is defined to be a set S = {N1, N2, . . . , Nt} of vertices of C such that the following conditions are satisfied:
(0) S consists only of postprojective modules.
(1) There is no oriented cycle in C consisting of modules from S.
(2) If M0 → M1 → · · · → Mm is a path in C such that M0, Mm ∈ S then M1, . . . , Mm−1 ∈ S.
(3) S contains exactly one representative of every τ-orbit of the projective A-modules.
For an algebra $A \cong A^{(2)}_{(p,l_1,x,s,l_2)}$ one can define a preinjective slice in the unique ending component of ΓA dually.
Let S be a postprojective (resp., preinjective) slice in C. Then a module MS that is isomorphic to the direct sum $\bigoplus_{i=1}^{t} N_i$ is called a postprojective (resp., preinjective) slice module of S.
Our next aim is to compute projective (resp., injective) dimensions of postprojective (resp., preinjective) slice modules. We shall use a special technique in the considered cases rather than applying the results of [10–12].
For an algebra $A \cong A^{(1)}_{(p,l_1,x,s,l_2)}$ or $A \cong A^{(2)}_{(p,l_1,x,s,l_2)}$ consider a string module M(w) for a walk w in $(Q^{(j)}_{(p,l_1,x,s,l_2)}, I^{(j)}_{(p,l_1,x,s,l_2)})$, j = 1, 2. Then there are nonzero paths ui, vi in the above bound quiver such that u1 starts at b1, both ui, vi end at ci, i = 1, . . . , d, vi, ui+1 start at bi+1, i = 1, . . . , d, and ud+1 ends at cd+1. Furthermore, w ∈ {w1, w2, w3, w4} for the walks
$w_1 = u_1 v_1^{-1} u_2 v_2^{-1} \cdots u_d v_d^{-1}$,
$w_2 = u_1 v_1^{-1} u_2 v_2^{-1} \cdots u_d v_d^{-1} u_{d+1}$,
$w_3 = v_1^{-1} u_2 v_2^{-1} \cdots u_d v_d^{-1}$,
$w_4 = v_1^{-1} u_2 v_2^{-1} \cdots u_d v_d^{-1} u_{d+1}$.
Lemma 2.1. For an algebra $A \cong A^{(1)}_{(p,l_1,x,s,l_2)}$ consider a string module M(w) with w ∈ {w1, w2, w3, w4}. Then the following conditions are satisfied:
(1) If w = u1 and c1 = (a − 1) or c1 = z − 1 then proj.dim(M(w)) = 2. (2) If w ∈ {w2 , w4}, cd+1 = z − 1 or cd+1 = (a − 1) then proj.dim(M(w)) = 2. (3) If w ∈ {w1 , w3 }, vd = α0,1 vd and bd+1 = z − 1 or vd = α0,1 vd and bd+1 = (a − 1) then proj.dim(M(w)) = 2. u1 and b1 = (a − 1) then (4) If w ∈ {w1 , w2 }, u1 = α0,1 u1 and b1 = z − 1 or u1 = α0,1 proj.dim(M(w)) = 2. (5) If w ∈ {w3 , w4}, and c1 = z − 1 or c1 = (a − 1) then proj.dim(M(w)) = 2. Proof. We start by considering the case when w = u1 . Suppose that c1 = z − 1. Then we have the following minimal projective resolution of M(w): 0 → P → P ⊕ Pz → Pb1 → M(w) → 0, where P is zero if b1 = 0 and P is a nonzero projective direct summand in rad(P0 ) if b1 = 0. Furthermore, if z = p1 then P ∼ = rad(Pz ) and if z = p1 then P is the direct summand in rad(Pz ) whose socle is isomorphic to Sx1 . Thus proj.dim(M(w)) = 2. If c1 = (a − 1) then similar arguments show that proj.dim(M(w)) = 2. Consequently, condition (1) is proved. If w = w2 , cd+1 = z − 1 then we have the following minimal projective resolution of M(w): d d+1 ¯ ¯ ˜ Pbi → M(w) → 0, 0 →P ⊕P →P ⊕P ⊕ Sc j → j=2
i=1
where S˜ci ∼ = Sci if ci = z, a and S˜ci ∼ = Pci if ci = z = p1 or ci = a = s1 and where P is zero if Pb1 is uniserial, P is a nonzero projective direct summand in rad(Pb1 ) if Pb1 is nonuniserial, or else P is isomorphic to Pz or to Pa if b1 = z − 1 and u1 = α0,1 u1 or u1 . Moreover, P ∼ b1 = (a − 1) and u1 = α0,1 = Pz and P¯ is either rad(Pz ) which is projective if z = p1 or P¯ is the nonzero projective direct summand in rad(Pz ) whose socle is Sx1 if z = p1 . Furthermore, P¯ is zero or P¯ is a uniserial projective A-module if P ∼ = Pz or P ∼ = Pa . Thus proj.dim(M(w)) = 2. If w = w4 , cd+1 = z − 1 then we have the following minimal projective resolution of M(w): d d+1 ¯ ˜ ¯ 0 →P ⊕P →P ⊕P ⊕ Sc j → Pbi → M(w) → 0, j=2
i=2
where S˜ci ∼ = Sci if ci = z, a and S˜ci ∼ = Pci if ci = z = p1 or ci = a = s1 and where P is zero or a nonzero projective submodule of Pb2 or P ∼ = Pz if c1 = z − 1 or else ∼ ∼ ¯ P = Pa if c1 = (a − 1) . Moreover, P = Pz and P is either rad(Pz ) which is projective if z = p1 or P¯ is the projective direct summand in rad(Pz ) whose socle is Sx1 if z = p1 . Furthermore, P¯ is zero or P¯ is a uniserial projective A-module if P ∼ = Pz or P ∼ = Pa . Thus proj.dim(M(w)) = 2. If w ∈ {w2 , w4 } and cd+1 = (a−1) then similar arguments show that proj.dim(M(w)) = 2. Consequently, condition (2) is proved.
If w = w1 , vd = α0,1 vd and bd+1 = z − 1 then we have the following minimal projective resolution of M(w): 0 → P¯ ⊕ P¯ → P ⊕ P ⊕
d j=1
S˜cj →
d+1
Pbi → M(w) → 0,
i=1
where S˜cj ∼ = Scj if cj = z, a and S˜cj ∼ = Pcj if cj = z = p1 or cj = a = s1 . Moreover, P ∼ = Pz and P is zero or a nonzero projective uniserial module isomorphic to rad(Pc1 ), P ∼ = Pz if c1 = z − 1, or else P ∼ = Pa if c1 = (a − 1) . Furthermore, P¯ is either rad(Pz ) which is projective if z = p1 or P¯ is the projective direct summand in rad(Pz ) whose socle is Sx1 if z = p1 . Further P¯ is zero or P¯ is a uniserial projective A-module if P ∼ = Pz ∼ or P = Pa . Thus proj.dim(M(w)) = 2. If w ∈ {w1 , w2 }, vd = α0,1 vd and bd+1 = (a − 1) then similarly we can show that proj.dim(M(w)) = 2. Consequently, condition (3) is proved. If w = w1 , w2, w1 = α0,1 u1 and b1 = z − 1 or u1 = α0,1 u1 and b1 = (a − 1) then we get proj.dim(M(w)) = 2, because we can use the left-right symmetry of conditions (3) and (4). Thus condition (4) is verified. It is also clear that conditions (2) and (5) are left-right symmetric. Thus proj.dim(M(w)) = 2 in the case and thus condition (5) is verified. (2) Lemma 2.2. For an algebra A ∼ = A(p,l1 ,x,s,l2 ) consider a string module M(w) with w ∈ {w1 , w2 , w3 , w4 }. Then the following conditions are satisfied: (1) If w = u1 and b1 = (a − 1) or b1 = z − 1 then inj.dim(M(w)) = 2. (2) If w ∈ {w1 , w3}, bd+1 = z − 1 or bd+1 = (a − 1) then inj.dim(M(w)) = 2. and (3) If w ∈ {w2 , w4 }, ud+1 = ud+1 α0,1 and cd+1 = z − 1 or ud+1 = ud+1 α0,1 cd+1 = (a − 1) then inj.dim(M(w)) = 2. (4) If w ∈ {w3 , w4 }, v1 = v1 α0,1 and c1 = z − 1 or v1 = v1 α0,1 and c1 = (a − 1) then inj.dim(M(w)) = 2. (5) If w ∈ {w1 , w2} and b1 = z − 1 or b1 = (a − 1) then inj.dim(M(w)) = 2.
Proof. Dual arguments to those used in the proof of Lemma 2.1 prove the lemma. We leave details to the reader.
Proposition 2.3. For an algebra $A \cong A^{(1)}_{(p,l_1,x,s,l_2)}$ consider a string module M(w) with w ∈ {w1, w2, w3, w4}. If w does not satisfy any of the conditions (1)–(5) in Lemma 2.1 then proj.dim(M(w)) ≤ 1.
Proof. Let w = w1 . Then it is easily seen that there exists an epimorphism f : d+1 Pbi → M(w1 ). Notice that if Pb1 and Pbd+1 are uniserial modules then ker(f ) ∼ = i=1 d ∼ ˜ ˜ j=1 Scj , where every Scj = Scj is a simple projective module except when cj = z = p1 or cj = a = s1 , or when w1 = u1 and c1 is a sink of exactly one arrow. In the case cj = z = p1 or cj = a = s1 we have S˜cj ∼ = Pcj is projective. In the case w1 = u1 we ∼ have ker(f ) = rad(Pc1 ) which is projective if c1 = z − 1 and c1 = (a − 1) . Thus we have
either proj.dim(M(w)) ≤ 1 or w1 = u1 and c1 = z − 1 or c1 = (a − 1) which is precisely condition (1) of Lemma 2.1. d If Pb1 , Pba+1 are not uniserial then ker(f ) ∼ = j=1 S˜cj ⊕ P ⊕ P , where every S˜cj ∼ = Sc j ∼ ˜ is a simple projective module unless cj = z = p1 or cj = a = s1 in which case Scj = Pcj . Moreover, P is either a projective direct summand in rad(Pb1 ) if b1 = z − 1, (a − 1) , or is nonprojective otherwise. If P is not projective then b1 = z − 1 and u1 = α0,1 u1 , because Pb1 is not uniserial, or b1 = (a − 1) and u1 = α0,1 u1 , and this is stated in condition (4) of Lemma 2.1. A similar analysis shows that if P is a direct summand of rad(Pbd+1 ) which is nonprojective then we obtain the case of w1 stated in condition (3) of Lemma 2.1. It is also clear that if one of Pb1 , Pbd+1 is nonuniserial then we can use the same arguments. Consequently, if w = w1 then the required condition holds. Let w = w2 . Then we have an epimorphism f : d+1 ). It is easy to see i=1 Pbi → M(w d 2 ˜ ∼ that if Pb1 is uniserial and cd+1 = z − 1, (a − 1) then ker(f ) = j=1 Scj ⊕ rad(Pcd+1 ), ∼ ˜ where every Scj = Scj is a simple projective module unless cj = z = p1 or cj = a = s1 in which case we have S˜cj ∼ = Pcj . Therefore ker(f ) is a projective module in this case. If Pb1 is nonuniserial then similar considerations to those used in the case w = w1 allow us to conclude that ker(f ) is not projective provided that b1 = z − 1 and u1 = α0,1 u1 or b1 = (a − 1) and u1 = α0,1 u1 and this is stated in condition (4) of Lemma 2.1. If cd+1 = z − 1 or cd+1 = (a − 1) then rad(Pcd+1 ) is not projective, and this is stated in condition (2) of Lemma 2.1. Consequently, the required condition holds in the case w = w2 . Since the case w = w3 is left-right symmetric to w = w2 , we obtain that either condition (3) or (5) of Lemma 2.1 is satisfied, or proj.dim(M(w)) ≤ 1. d+1 Let w = w4 . Then we have an epimorphism f : i=2 Pbi → M(w4 ). 
Using the same arguments as in the case w = w2 for rad(Pcd+1 ) we obtain that rad(Pcd+1 ) is not projective if cd+1 = z − 1 or cd+1 = (a − 1) . But this is stated in condition (2) of Lemma 2.1. Similarly, if rad(Pc1 ) is not projective then c1 = z − 1 or c1 = (a − 1) which is stated in condition (5) of Lemma 2.1. Consequently, the required condition holds in the case w = w4 and the proposition is proved. (2) Proposition 2.4. For an algebra A ∼ = A(p,l1 ,x,s,l2 ) consider a string module M(w) with w ∈ {w1 , w2 , w3, w4 }. If w does not satisfy any of the conditions (1) - (5) in Lemma 2.2 then inj.dim(M(w)) ≤ 1.
Proof. This is dual to the proof of Proposition 2.3.
Corollary 2.5. For an algebra $A \cong A^{(1)}_{(p,l_1,x,s,l_2)}$ let $C$ be a starting component in $\Gamma_A$ that is not generalized standard. If $S$ is a postprojective slice in $C$ then we have $\operatorname{proj.dim}(M_S) \le 1$.
Z. Pogorzaly, K. Szmyt / Central European Journal of Mathematics 5(1) 2007 164–180
Proof. We shall prove the corollary for every module $M(w) \in S$ such that there is a chain $P \to X_1 \to X_2 \to \cdots \to X_n = M(w)$ of indecomposable right A-modules and irreducible morphisms, where $P$ is a projective module. Then $P \in C$ by 2.1. We infer by the Skowroński-Waschbüsch algorithm that, in the considered case, the walk $w$ does not satisfy any of the conditions (1)-(5) in Lemma 2.1. Thus we deduce from Proposition 2.3 that $\operatorname{proj.dim}(M(w)) \le 1$. Finally, it is clear that for each $M \in S$ there is a walk $w$ such that $M \cong M(w)$ and there is a chain of indecomposable right A-modules and irreducible morphisms as above. Therefore $\operatorname{proj.dim}(M_S) \le 1$.
Corollary 2.6. For an algebra $A \cong A^{(2)}_{(p,l_1,x,s,l_2)}$ let $C$ be an ending component in $\Gamma_A$ that is not generalized standard. If $S$ is a preinjective slice in $C$ then we have $\operatorname{inj.dim}(M_S) \le 1$.
Proof. This is dual to the proof of Corollary 2.5.
3 Homomorphisms between slice modules
We have the following algorithm for computing Auslander-Reiten sequences for string modules, determined by Skowroński and Waschbüsch in [17]. If $w \in \{w_1, w_2, w_3, w_4\}$ then $w_R = w\kappa^{-1}u$, where $\kappa$ is an arrow such that $\kappa v_d$ is a nonzero path in case $w \in \{w_1, w_3\}$ and $\kappa$ is an arrow in case $w \in \{w_2, w_4\}$; $u$ is a maximal nonzero path starting at the source of $\kappa$ ($u$ may be trivial), provided that the above arrow $\kappa$ exists. If the arrow $\kappa$ does not exist then $w_R = w'$, where $w'$ is as follows. If $w \in \{w_2, w_4\}$ and $u_{d+1} = u'_{d+1}\delta$ for an arrow $\delta$ then $w' = u_1 v_1^{-1} u_2 v_2^{-1} \cdots u_d v_d^{-1} u'_{d+1}$ or $w' = v_1^{-1} u_2 v_2^{-1} \cdots u_d v_d^{-1} u'_{d+1}$, respectively. If $w \in \{w_1, w_3\}$ and $u_d = u'_d\delta$ for an arrow $\delta$ then $w' = u_1 v_1^{-1} u_2 v_2^{-1} \cdots u'_d$ or $w' = v_1^{-1} u_2 v_2^{-1} \cdots u'_d$, respectively. In a similar way we can construct a walk $w_L$ using the same ideas on the other end of the walk $w$. Then we can compose the above constructions and obtain a walk $w_{RL} = w_{LR}$. Thus for a noninjective A-module $M(w)$ we have the following Auslander-Reiten sequence in mod(A):
$0 \to M(w) \to M(w_R) \oplus M(w_L) \to M(w_{RL}) \to 0.$
Dually one constructs three walks $w_{L^{-1}}$, $w_{R^{-1}}$, $w_{L^{-1}R^{-1}} = w_{R^{-1}L^{-1}}$ such that for a nonprojective A-module $M(w)$ there is the following Auslander-Reiten sequence in mod(A):
$0 \to M(w_{L^{-1}R^{-1}}) \to M(w_{L^{-1}}) \oplus M(w_{R^{-1}}) \to M(w) \to 0.$
Lemma 3.1. Let $A \cong A^{(1)}_{(p,l_1,x,s,l_2)}$. If $C$ is the starting component in $\Gamma_A$ then there is no finite oriented cycle in $C$ of irreducible morphisms.
Proof. Since $C$ is a connected component in $\Gamma_A$ that contains all indecomposable projective vertices, we have for every vertex $M \in C$ a sequence
$S = X_0 - X_1 - \cdots - X_{n-1} - X_n = M$
of indecomposable A-modules and irreducible morphisms, where $S$ is a simple projective A-module and $X_i - X_{i+1}$ denotes either $X_i \to X_{i+1}$ or $X_{i+1} \to X_i$, which is an irreducible morphism. We shall prove the lemma inductively on $n$. If $n = 0$ then $M = S$ and $S$ cannot lie on a finite oriented cycle in $C$, because $S$ is simple projective. Now assume that for every module $N \in C$ such that there is a sequence
$S = X_0 - X_1 - \cdots - X_{n-1} - X_n = N$
there is no finite oriented cycle in $C$ containing $N$. Suppose that $M \in C$ and there is a sequence
$S = X_0 - X_1 - \cdots - X_{n-1} - X_n - X_{n+1} = M$
and $M$ lies on a finite oriented cycle in $C$.
Consider the case when $X_n \to X_{n+1}$. If the oriented cycle $M \to Y_1 \to \cdots \to Y_m \to M$ satisfies $Y_m \cong X_n$ then $X_n$ lies on a finite oriented cycle in $C$, which contradicts the inductive assumption. Thus we get that $Y_m \not\cong X_n$ and there is an Auslander-Reiten sequence of the form $0 \to \tau M \to X_n \oplus Y_m \to M \to 0$ or $M$ is projective. If each $Y_j$, $j = 1, \ldots, m$, and $M$ is nonprojective then we have the following cycle $\tau(M) \to \tau(Y_1) \to \cdots \to \tau(Y_m) \to \tau(M)$. If $\tau(Y_1) \cong X_n$ then we get a contradiction to the inductive assumption. Thus $\tau(Y_1) \cong Y_m$. In this case we have the following cycle $\tau(M) \to X_n \to M \to Y_1 \to \cdots \to Y_m \to \tau(Y_2) \to \cdots \to \tau(Y_m) \to \tau(M)$ in $C$, which also contradicts the inductive assumption.
Now consider the case when $Y_{j_0}$ is projective for some $j_0 \in \{1, \ldots, m\}$. Then $Y_{j_0-1}$ is also a projective direct summand in $\operatorname{rad}(Y_{j_0})$ except the case $Y_{j_0} \cong P_{z-1}$ or $Y_{j_0} \cong P_{(a-1)'}$. It is clear that if none of the $Y_j$ is isomorphic to $P_{z-1}$ or $P_{(a-1)'}$ then a simple projective A-module lies on a finite oriented cycle in $C$, and this is impossible. Thus we can assume without loss of generality that $Y_{j_0} \cong P_{z-1}$. Then applying the Skowroński-Waschbüsch algorithm for computing irreducible morphisms we get that if there is a chain of irreducible homomorphisms $Y_{j_0} \cong Y_{j_0}(w(1)) \to X(w(2)) \to \cdots \to X(w(t))$ then we have $l(w(1)) < l(w(2)) < \cdots < l(w(t))$, where $l(w)$ stands for the length of the walk $w$. Thus $Y_{j_0}$ cannot lie on a finite oriented cycle of irreducible morphisms. In the case $Y_{j_0} \cong P_{(a-1)'}$ we repeat the above arguments. Consequently, $M$ cannot lie on a finite oriented cycle.
Now consider the case $X_{n+1} \to X_n$. Similar arguments to the above show that $M$ cannot lie on a finite oriented cycle, and this finishes the proof.
Lemma 3.2. Let $A \cong A^{(2)}_{(p,l_1,x,s,l_2)}$. If $C$ is the ending component in $\Gamma_A$ then there is no finite oriented cycle in $C$.
Proof. This is dual to the proof of Lemma 3.1.
Proposition 3.3. Let $A \cong A^{(1)}_{(p,l_1,x,s,l_2)}$ and $C$ be the starting component in $\Gamma_A$. If $S$ is a postprojective slice in $C$ and $M, N$ are elements of $S$ then for every $0 \ne f : \tau^{-1}(N) \to M$ we have $f \in \operatorname{rad}^\infty(\tau^{-1}(N), M)$.
Proof. Since $M, N$ belong to a postprojective slice $S$ in $C$, there is a finite chain
$M \cong X_0 - X_1 - \cdots - X_n \cong N$
with $X_0, X_1, \ldots, X_n \in S$, where $X_i - X_{i+1}$ is either $X_i \to X_{i+1}$ or $X_{i+1} \to X_i$ in $C$. We shall prove inductively on $n$ that there is no nonzero homomorphism from $\tau^{-1}(N)$ to $M$ that is a composition of finitely many irreducible homomorphisms. If $n = 0$ then $M = N$. Suppose that there is a nonzero homomorphism $f : \tau^{-1}(M) \to M$ that is not contained in $\operatorname{rad}^\infty(\tau^{-1}(M), M)$. Then there exists a finite sequence $\tau^{-1}(M) \to Y_1 \to Y_2 \to \cdots \to Y_m \to M$ of indecomposable A-modules and irreducible homomorphisms. But there are also irreducible homomorphisms $M \to Z$ and $Z \to \tau^{-1}(M)$ with $Z$ indecomposable. Thus $\tau^{-1}(M)$ lies on a finite oriented cycle in $C$, which contradicts Lemma 3.1. Therefore $f \in \operatorname{rad}^\infty(\tau^{-1}(M), M)$.
Now assume that if $M, N$ satisfy the above assumptions and there is a sequence
$M \cong X_0 - X_1 - \cdots - X_n \cong N$
with $n \le n_0$ then every $0 \ne f : \tau^{-1}(N) \to M$ is not a composition of finitely many irreducible homomorphisms. Consider $M, N$ as above with a sequence
$M \cong X_0 - X_1 - \cdots - X_{n_0} - X_{n_0+1} \cong N$
and let $0 \ne f : \tau^{-1}(N) \to M$. Suppose that $f$ is a composition of finitely many irreducible homomorphisms.
Let $X_{n_0} - X_{n_0+1}$ be an irreducible homomorphism $X_{n_0} \to X_{n_0+1}$. We deduce from the fact that $f$ is a composition of finitely many irreducible homomorphisms that there exists a finite sequence $\tau^{-1}(N) \to Y_1 \to \cdots \to Y_m \to M$ of indecomposable A-modules and irreducible homomorphisms whose composition is not zero. Moreover, we deduce from the Skowroński-Waschbüsch algorithm that for A-modules in $C$ every irreducible homomorphism $g : U \to V$ between postprojective $U, V$ is a monomorphism. Thus $\tau^{-1}(N)$ is isomorphic to a submodule of $M$. But we have a sequence of irreducible homomorphisms $X_{n_0} \to N \to \tau^{-1}(X_{n_0}) \to \tau^{-1}(N)$ whose composition is also a monomorphism. Thus we have a sequence of irreducible monomorphisms $\tau^{-1}(X_{n_0}) \to \tau^{-1}(N) \to Y_1 \to \cdots \to Y_m \to M$, which contradicts the inductive assumption. Thus $f$ is not a composition of finitely many irreducible homomorphisms in this case.
Let $X_{n_0} - X_{n_0+1}$ be an irreducible homomorphism $X_{n_0+1} \to X_{n_0}$. We deduce again from the fact that $f$ is a composition of finitely many irreducible homomorphisms that there exists a finite sequence $\tau^{-1}(N) \to Y_1 \to \cdots \to Y_m \to M$ of indecomposable
A-modules and irreducible homomorphisms whose composition is not zero. Then using similar reasoning as above we get a contradiction. Consequently, $f$ is not a composition of finitely many irreducible homomorphisms and so $f \in \operatorname{rad}^\infty(\tau^{-1}(N), M)$, and this finishes the proof.
Proposition 3.4. Let $A \cong A^{(2)}_{(p,l_1,x,s,l_2)}$ and $C$ be the ending component in $\Gamma_A$. If $S$ is a preinjective slice in $C$ and $M, N$ belong to $S$ then for every $0 \ne f : M \to \tau(N)$ we have $f \in \operatorname{rad}^\infty(M, \tau(N))$.
Proof. Dual arguments to those used in the proof of Proposition 3.3 show the proposition.
Lemma 3.5. Let $A \cong A^{(1)}_{(p,l_1,x,s,l_2)}$ and $C$ be the starting component in $\Gamma_A$. Let $u = \alpha_{1,z+1}\alpha_{1,z+2}\cdots\alpha_{1,p_1}$ if $z \ne p_1$ and $u = p_1$ if $z = p_1$. Let $v = \alpha_{1,a+1}\alpha_{1,a+2}\cdots\alpha_{1,s_1}$ if $a \ne s_1$ and $v = s_1$ if $a = s_1$. If $X_j = X(u^{-1}_{L^j})$ and $Y_j = Y(v_{R^j})$, $j = 0, 1, 2, \ldots$, then for every finite sequence $X_j \to Z_1 \to Z_2 \to \cdots \to Z_t$ or $Y_j \to Z_1 \to Z_2 \to \cdots \to Z_t$, $t \ge 1$, of indecomposable A-modules and irreducible homomorphisms we have that every irreducible homomorphism in the sequence is a monomorphism.
Proof. Applying the Skowroński-Waschbüsch algorithm for computing irreducible homomorphisms, it is easy to see that for every $j, e, d$ we have $l((u^{-1}_{L^j})_{L^dR^e}) < l((u^{-1}_{L^j})_{L^{d+1}R^e})$ and $l((u^{-1}_{L^j})_{L^dR^e}) < l((u^{-1}_{L^j})_{L^dR^{e+1}})$. Since an irreducible homomorphism between indecomposable A-modules is either a monomorphism or an epimorphism, we infer by the above inequalities that the required condition holds. A similar argument holds for $v_{R^j}$.
Lemma 3.6. Let $A \cong A^{(2)}_{(p,l_1,x,s,l_2)}$ and $C$ be the ending component in $\Gamma_A$. Let $u = \alpha_{1,p_1}\cdots\alpha_{1,z+1}$ if $z \ne p_1$ and $u = p_1$ if $z = p_1$. Let $v = \alpha_{1,s_1}\cdots\alpha_{1,a+1}$ if $a \ne s_1$ and $v = s_1$ if $a = s_1$. If $X^j = X(u_{L^{-j}})$ and $Y^j = Y(v^{-1}_{R^{-j}})$, $j = 0, 1, 2, \ldots$, then for every finite sequence $X^j \leftarrow Z^1 \leftarrow Z^2 \leftarrow \cdots \leftarrow Z^t$ or $Y^j \leftarrow Z^1 \leftarrow Z^2 \leftarrow \cdots \leftarrow Z^t$, $t \ge 1$, of indecomposable A-modules and irreducible homomorphisms we have that every irreducible homomorphism in the sequence is an epimorphism.
Proof. This is dual to the proof of Lemma 3.5.
Lemma 3.7. (1) Let $A \cong A^{(1)}_{(p,l_1,x,s,l_2)}$ and $C$ be the starting component in $\Gamma_A$. Let $u, v$ be as in Lemma 3.5. Then there are the following sequences of irreducible epimorphisms in $C$:
$\cdots \to X(u^{-1}_{L^jR^{-d}}) \to X(u^{-1}_{L^jR^{-d+1}}) \to \cdots \to X(u^{-1}_{L^jR^{-2}}) \to X(u^{-1}_{L^jR^{-1}}) \to X(u^{-1}_{L^j}),$
$\cdots \to Y(v_{R^jL^{-d}}) \to Y(v_{R^jL^{-d+1}}) \to \cdots \to Y(v_{R^jL^{-2}}) \to Y(v_{R^jL^{-1}}) \to Y(v_{R^j}),$
$j = 0, 1, 2, \ldots$.
(2) Let $A \cong A^{(2)}_{(p,l_1,x,s,l_2)}$ and $C$ be the ending component in $\Gamma_A$. Let $u, v$ be as in Lemma 3.6. Then there are the following sequences of irreducible monomorphisms in $C$:
$X(u_{L^{-j}}) \to X(u_{L^{-j}R}) \to X(u_{L^{-j}R^2}) \to \cdots \to X(u_{L^{-j}R^d}) \to \cdots,$
$Y(v^{-1}_{R^{-j}}) \to Y(v^{-1}_{R^{-j}L}) \to Y(v^{-1}_{R^{-j}L^2}) \to \cdots \to Y(v^{-1}_{R^{-j}L^d}) \to \cdots,$
$j = 0, 1, 2, \ldots$.
Proof. A direct application of the Skowroński-Waschbüsch algorithm proves the lemma.
Lemma 3.8. Let $A \cong A^{(1)}_{(p,l_1,x,s,l_2)}$ and $C$ be the starting component in $\Gamma_A$. Then the following conditions are satisfied:
(1) If $\operatorname{rad}^\infty(P_c, P_d) \ne 0$ then $c \in \{z, z+1, \ldots, p_1\}$ and $d \in \{0, 1, \ldots, z-1\}$ or $c \in \{a', (a+1)', \ldots, s_1'\}$ and $d \in \{0', 1', \ldots, (a-1)'\}$.
(2) If $p = p_2 - p_1 + p_4 - p_3 + \cdots + p_{q-2} - p_{q-1} + l_1$ then for $w = u^{-1}\alpha_{1,1}\cdots\alpha_{1,x_1}$ we have $M(w) \cong P_z$ and $\operatorname{rad}^\infty(M(w), M(w_{L^cR^d})) \ne 0$ if and only if $c \ge p$, where $u$ is as in Lemma 3.5.
(3) If $s = s_2 - s_1 + s_4 - s_3 + \cdots + s_{r-2} - s_{r-1} + l_2$ then for $w = \alpha^{-1}_{n-1,x_n-x_{n-1}}\cdots\alpha^{-1}_{n-1,1}v$ we have $M(w) \cong P_a$ and $\operatorname{rad}^\infty(M(w), M(w_{L^cR^d})) \ne 0$ if and only if $d \ge s$, where $v$ is as in Lemma 3.5.
Proof. We have the following irreducible homomorphisms in $C$: $P_{p_1} \to P_{p_1-1} \to \cdots \to P_{z+1} \to P_z$ and $X(u^{-1}) \to P_{z-1} \to P_{z-2} \to \cdots \to P_0$. Since every composition of a finite sequence of irreducible homomorphisms from $P_z$ to an indecomposable A-module $Y$ is a monomorphism, the epimorphism $P_z \to X(u^{-1})$ belongs to $\operatorname{rad}^\infty(P_z, X(u^{-1}))$. Thus we get $c \in \{z, \ldots, p_1\}$ and $d \in \{0, \ldots, z-1\}$ if $\operatorname{rad}^\infty(P_c, P_d) \ne 0$, or similarly $c \in \{a', \ldots, s_1'\}$ and $d \in \{0', \ldots, (a-1)'\}$, proving (1). In order to prove condition (2), observe that $M(w_{L^p}) \cong X((u^{-1})_{R^t})$ for some $t \ge 2$. Then applying Lemmas 3.5 and 3.7(1), we have that $\operatorname{rad}^\infty(P_z, M(w_{L^cR^d})) \ne 0$ if and only if $c \ge p$. Similarly one can prove condition (3).
Lemma 3.9. Let $A \cong A^{(2)}_{(p,l_1,x,s,l_2)}$ and $C$ be the ending component in $\Gamma_A$. Then the following conditions are satisfied:
(1) If $\operatorname{rad}^\infty(E_c, E_d) \ne 0$ then $c \in \{0, 1, \ldots, z-1\}$ and $d \in \{z, z+1, \ldots, p_1\}$ or $c \in \{0', 1', \ldots, (a-1)'\}$ and $d \in \{a', (a+1)', \ldots, s_1'\}$.
(2) If $p = p_2 - p_1 + p_4 - p_3 + \cdots + p_{q-2} - p_{q-1} + l_1$ then for $w = u\alpha^{-1}_{1,1}\cdots\alpha^{-1}_{1,x_1}$ we have $M(w) \cong E_z$ and $\operatorname{rad}^\infty(M(w_{L^{-c}R^{-d}}), M(w)) \ne 0$ if and only if $c \ge p$.
(3) If $s = s_2 - s_1 + s_4 - s_3 + \cdots + s_{r-2} - s_{r-1} + l_2$ then for $w = \alpha_{n-1,x_n-x_{n-1}}\cdots\alpha_{n-1,1}v^{-1}$ we have $M(w) \cong E_a$ and $\operatorname{rad}^\infty(M(w_{R^{-d}L^{-c}}), M(w)) \ne 0$ if and only if $d \ge s$.
Proof. Dual arguments to those used in the proof of Lemma 3.8 prove the lemma.
Proposition 3.10. (1) Let $A \cong A^{(1)}_{(p,l_1,x,s,l_2)}$ and $C$ be the starting component in $\Gamma_A$. Let $P, P'$ be indecomposable projective A-modules. If for some nonnegative integers $l, t$ we have $\tau^{-l}(P) \cong M(w)$ and $\tau^{-t}(P') \cong M(\bar w)$ and $\operatorname{rad}^\infty(M(w), M(\bar w)) \ne 0$ then there are integers $c, d$ such that $\bar w = w_{L^cR^d}$ and $c \ge p$ or $d \ge s$.
(2) Let $A \cong A^{(2)}_{(p,l_1,x,s,l_2)}$ and $C$ be the ending component in $\Gamma_A$. Let $E, E'$ be indecomposable injective A-modules. If for some nonnegative integers $l, t$ we have $\tau^{l}(E) \cong M(w)$ and $\tau^{t}(E') \cong M(\bar w)$ and $\operatorname{rad}^\infty(M(\bar w), M(w)) \ne 0$ then there are integers $c, d$ such that $\bar w = w_{L^{-c}R^{-d}}$ and $c \ge p$ or $d \ge s$.
Proof. Let $A \cong A^{(1)}_{(p,l_1,x,s,l_2)}$. Consider two indecomposable postprojective A-modules $\tau^{-l}(P) \cong M(w)$ and $\tau^{-t}(P') \cong M(\bar w)$. Then we deduce from the Skowroński-Waschbüsch algorithm that $w = u^{-1}_{L^j}w'$, for some $j \ge 0$, or $w = w'v_{R^i}$, for some $i \ge 0$, because only in these cases could $\operatorname{rad}^\infty(M(w), M(\bar w)) \ne 0$. Consider the case $w = u^{-1}_{L^j}w'$. Observe that for $(u^{-1}_{L^j}w')_{L^c}$ with $c < p$ we have $\operatorname{rad}^\infty(M(u^{-1}_{L^j}w'), M((u^{-1}_{L^j}w')_{L^c})) = 0$. Indeed, since $M(w)$ is postprojective, $w'$ is nontrivial. Furthermore, $w' = \alpha_{1,1}\tilde w$, because we have an epimorphism from $M(w)$ onto $X(u^{-1}_{L^j})$. Moreover, it is clear that $(u^{-1}_{L^j}w')_{L^c} = u^{-1}_{L^{j+c}}w'$. If $c < p$ then the nonzero homomorphism is a composition of an epimorphism $f : M(w) \to X(u^{-1}_{L^j})$ with a monomorphism $g : X(u^{-1}_{L^j}) \to M(u^{-1}_{L^{c+j}}w')$. But $u^{-1}_{L^{c+j}} = (u^{-1}_{L^{p+j}})_{L^{c'}}$ if and only if $c \ge p$, where $c' + p = c$. Thus we get that $\bar w = w_{L^cR^d}$ and $c \ge p$. A similar analysis in the case $w = w'v_{R^i}$ shows that $\bar w = w_{L^cR^d}$ and $d \ge s$, and this finishes the proof of condition (1).
In order to prove condition (2) we apply dual arguments.
4 Tilting modules
Following Happel and Ringel [9] (see also [5]) we shall call a finitely generated A-module $T_A$ a tilting module (respectively, cotilting module) if it satisfies the following conditions:
(1) $\operatorname{proj.dim}(T_A) \le 1$ (respectively, $\operatorname{inj.dim}(T_A) \le 1$);
(2) $\operatorname{Ext}^1_A(T, T) = 0$;
(3) the number of the nonisomorphic indecomposable direct summands of $T_A$ is equal to the rank of the Grothendieck group $K_0(A)$ of $A$.
Let $A \cong A^{(1)}_{(p,l_1,x,s,l_2)}$ and $C$ be the starting component in $\Gamma_A$. Let $S$ be a postprojective slice in $C$. Assume that $M_S = \bigoplus_{i=1}^{t} M(w_i)$ and the indecomposable modules $M(w_i)$ are enumerated in such a way that $M(w_1)$ belongs to the $\tau$-orbit of $P_{z-1}$ and $M(w_t)$ belongs to the $\tau$-orbit of $P_{(a-1)'}$. Then $S$ can be presented as
$M(w_1) - M(w_2) - \cdots - M(w_t),$
where $M(w_i) - M(w_{i+1})$ is a left arrow $M(w_i) \leftarrow M(w_{i+1})$ or a right arrow $M(w_i) \to M(w_{i+1})$, $i = 1, \ldots, t-1$. Then we can attach to every element $M(w_i)$ of $S$ two numbers:
$l_S(M(w_i))$ is the number of the left arrows in $M(w_1) - M(w_2) - \cdots - M(w_i)$ and $r_S(M(w_i))$ is the number of the right arrows in $M(w_i) - M(w_{i+1}) - \cdots - M(w_t)$. If $A \cong A^{(2)}_{(p,l_1,x,s,l_2)}$ then we can dually define $l_S(M(w_i))$ and $r_S(M(w_i))$ for a preinjective slice $S$ in the ending component $C$.
Theorem 4.1. (1) Let $A \cong A^{(1)}_{(p,l_1,x,s,l_2)}$ and $C$ be the starting component in $\Gamma_A$. Let $S$ be a postprojective slice in $C$. Then the following conditions hold:
(1i) If for every $M(w_i) \in S$, $i = 1, \ldots, t$, we have $l_S(M(w_i)) \le p$ and $r_S(M(w_i)) \le s$ then the slice module $M_S = \bigoplus_{i=1}^{t} M(w_i)$ is a tilting A-module.
(1ii) If for every $M(w_i) \in S$, $i = 1, \ldots, t$, such that $l_S(M(w_i)) > p$ there is no epimorphism $f : \tau^{-1}(M(w_i)) \to X(u^{-1}_{L^j})$, $j \ge 1$, and for every $M(w_i) \in S$, $i = 1, \ldots, t$, such that $r_S(M(w_i)) > s$ there is no epimorphism $g : \tau^{-1}(M(w_i)) \to Y(v_{R^j})$, $j \ge 1$, then the slice module $M_S = \bigoplus_{i=1}^{t} M(w_i)$ is a tilting A-module.
(1iii) If there exists $i_0 \in \{1, \ldots, t\}$ such that $l_S(M(w_{i_0})) > p$ and there is an epimorphism $f : \tau^{-1}(M(w_{i_0})) \to X(u^{-1}_{L^j})$ for some $j \ge 1$ then the slice module $M_S = \bigoplus_{i=1}^{t} M(w_i)$ is not a tilting A-module.
(1iv) If there exists $i_0 \in \{1, \ldots, t\}$ such that $r_S(M(w_{i_0})) > s$ and there is an epimorphism $g : \tau^{-1}(M(w_{i_0})) \to Y(v_{R^j})$ for some $j \ge 1$ then the slice module $M_S = \bigoplus_{i=1}^{t} M(w_i)$ is not a tilting A-module.
(2) Let $A \cong A^{(2)}_{(p,l_1,x,s,l_2)}$ and $C$ be the ending component in $\Gamma_A$. Let $S$ be a preinjective slice in $C$. Then the following conditions hold:
(2i) If for every $M(w_i) \in S$, $i = 1, \ldots, t$, we have $l_S(M(w_i)) \le s$ and $r_S(M(w_i)) \le p$ then the slice module $M_S = \bigoplus_{i=1}^{t} M(w_i)$ is a cotilting A-module.
(2ii) If for every $M(w_i) \in S$, $i = 1, \ldots, t$, such that $l_S(M(w_i)) > s$ there is no monomorphism $f : Y(v^{-1}_{R^{-j}}) \to \tau(M(w_i))$, $j \ge 1$, and for every $M(w_i) \in S$, $i = 1, \ldots, t$, such that $r_S(M(w_i)) > p$ there is no monomorphism $g : X(u_{L^{-j}}) \to \tau(M(w_i))$, $j \ge 1$, then the slice module $M_S = \bigoplus_{i=1}^{t} M(w_i)$ is a cotilting A-module.
(2iii) If there exists $i_0 \in \{1, \ldots, t\}$ such that $l_S(M(w_{i_0})) > s$ and there is a monomorphism $f : Y(v^{-1}_{R^{-j}}) \to \tau(M(w_{i_0}))$ for some $j \ge 1$ then the slice module $M_S = \bigoplus_{i=1}^{t} M(w_i)$ is not a cotilting A-module.
(2iv) If there exists $i_0 \in \{1, \ldots, t\}$ such that $r_S(M(w_{i_0})) > p$ and there is a monomorphism $g : X(u_{L^{-j}}) \to \tau(M(w_{i_0}))$ for some $j \ge 1$ then the slice module $M_S = \bigoplus_{i=1}^{t} M(w_i)$ is not a cotilting A-module.
Proof. Let $A \cong A^{(1)}_{(p,l_1,x,s,l_2)}$. Consider the starting component $C$ in $\Gamma_A$. Let $S = \{M(w_i)\}_{i=1}^{t}$ be a postprojective slice in $C$. Suppose that for every $M(w_i) \in S$, $i = 1, \ldots, t$, we have $l_S(M(w_i)) \le p$ and $r_S(M(w_i)) \le s$. Then for every two $M(w_i), M(w_j)$ we have that $\operatorname{rad}^\infty(\tau^{-1}(M(w_i)), M(w_j)) = 0$. Indeed, if there is a nonzero homomorphism $f : \tau^{-1}(M(w_i)) \to M(w_j)$ then we infer by Proposition 3.10(1) that $w_j = ((w_i)_{LR})_{L^cR^d}$ and $c \ge p$ or $d \ge s$. But if $l_S(M(w_i)) \le p$ then we replace $S$ by $S'$ in such a way that we consider $\tau^{-1}(M(w_i))$ instead of $M(w_i)$ and the other elements are not changed. Then we have $l_{S'}(\tau^{-1}(M(w_i))) \le p-1$. Thus
$w_j \ne ((w_i)_{LR})_{L^cR^d}$ for $c \ge p$. Similarly we obtain that $w_j \ne ((w_i)_{LR})_{L^cR^d}$ for $d \ge s$. Consequently, $\operatorname{rad}^\infty(\tau^{-1}(M(w_i)), M(w_j)) = 0$ for each pair $i, j$. Then we infer by the Auslander-Reiten formulae that $D\operatorname{Hom}_A(\tau^{-1}(M_S), M_S) \cong \operatorname{Ext}^1_A(M_S, M_S) = 0$. Moreover, we know from Corollary 2.5 that $\operatorname{proj.dim}(M_S) \le 1$. Therefore $M_S$ is a tilting A-module, which proves (1i).
Now assume that for every $i = 1, \ldots, t$ we have that either $l_S(M(w_i)) \le p$ or $l_S(M(w_i)) > p$ and there is no epimorphism $f : \tau^{-1}(M(w_i)) \to X(u^{-1}_{L^j})$, $j \ge 1$. The same condition is assumed on $M(w_i)$ with respect to $r_S(M(w_i))$. Then for each pair $i, j$ of integers we have either $l_S(M(w_i)) \le p$ and then, as in the proof of (1i), we get $\operatorname{Hom}_A(\tau^{-1}(M(w_i)), M(w_j)) = 0$, or $l_S(M(w_i)) > p$ and there is no epimorphism $f : \tau^{-1}(M(w_i)) \to X(u^{-1}_{L^j})$, $j \ge 1$. Thus replacing $M(w_i)$ by $\tau^{-1}(M(w_i))$ we get a slice $S'$ with $l_{S'}(M(w_i)) \ge p$. If $\operatorname{rad}^\infty(\tau^{-1}(M(w_i)), M(w_j)) \ne 0$ then we have a basis homomorphism that must be a composition of an epimorphism $f : \tau^{-1}(M(w_i)) \to X(u^{-1}_{L^j})$, $j \ge 0$, with a monomorphism $g : X(u^{-1}_{L^j}) \to M(w_j)$. Thus we deduce from our assumptions that the only possibility for this case is that $j = 0$. But in this case $g : X(u^{-1}) \to M(w_j)$ factorizes through $P_{z-1}$. Therefore we have $\operatorname{Hom}_A(\tau^{-1}(M(w_i)), M(w_j)) = 0$. A similar analysis with $r_S(M(w_i))$ implies that $0 = D\operatorname{Hom}_A(\tau^{-1}(M_S), M_S) = \operatorname{Ext}^1_A(M_S, M_S)$. Consequently, $M_S$ is a tilting A-module by Proposition 3.3, so (1ii) is proved.
If there exists $1 \le i_0 \le t$ such that $l_S(M(w_{i_0})) > p$ and there is an epimorphism $f : \tau^{-1}(M(w_{i_0})) \to X(u^{-1}_{L^{j_0}})$ for some $j_0 \ge 1$ then the composed homomorphism $gf \ne 0$ for a monomorphism $g : X(u^{-1}_{L^{j_0}}) \to M((w_{i_0})_{L^{l_S(M(w_{i_0}))}})$. Moreover, there is an integer $d$ such that $M((w_{i_0})_{L^{l_S(M(w_{i_0}))}R^d}) \cong M(w_1)$ and we have a monomorphism $g' : X(u^{-1}_{L^{j_0}}) \to M(w_1)$. Thus $gf \ne 0$ and also $g'f \ne 0$, because $j_0 \ge 1$.
Therefore $\operatorname{Hom}_A(\tau^{-1}(M(w_{i_0})), M(w_1)) \ne 0$ and we deduce from the Auslander-Reiten formulae that $\operatorname{Ext}^1_A(M(w_1), M(w_{i_0})) \ne 0$. Therefore $\operatorname{Ext}^1_A(M_S, M_S) \ne 0$ and $M_S$ is not a tilting A-module. Consequently, condition (1iii) is proved. An analysis similar to the one used in the proof of (1iii) shows condition (1iv), and thus (1) is proved. Dual arguments prove (2).
Theorem 4.2. (1) Let $A \cong A^{(1)}_{(p,l_1,x,s,l_2)}$ and $C$ be the starting component in $\Gamma_A$. Let $S$ be a postprojective slice in $C$. If $M_S = \bigoplus_{i=1}^{t} M(w_i)$ and for each $i = 1, \ldots, t$ there are positive integers $b_i, c_i, d_i, e_i$ such that $M(w_i) \cong X(u^{-1}_{L^{b_i}R^{c_i}}) \cong Y(v_{L^{d_i}R^{e_i}})$ then the slice module $M_S$ is not tilting.
(2) Let $A \cong A^{(2)}_{(p,l_1,x,s,l_2)}$ and $C$ be the ending component in $\Gamma_A$. Let $S$ be a preinjective slice in $C$. If $M_S = \bigoplus_{i=1}^{t} M(w_i)$ and for each $i = 1, \ldots, t$ there are positive integers $b_i, c_i, d_i, e_i$ such that $M(w_i) \cong X(u_{L^{-b_i}R^{-c_i}}) \cong Y(v^{-1}_{L^{-d_i}R^{-e_i}})$ then the slice module $M_S$ is not cotilting.
Proof. If the elements of $S$ form the following subquiver
$M(w_1) - M(w_2) - \cdots - M(w_t)$
in $C$ then we have that $l_S(M(w_t)) > p$ or $r_S(M(w_1)) > s$. Indeed, if $l_S(M(w_t)) \le p$ and $r_S(M(w_1)) \le s$ then the quiver $Q^{(1)}_{(p,l_1,x,s,l_2)}$ has at most $p + s$ arrows, and this is impossible by the definitions of $p$ and $s$. Now consider the case $l_S(M(w_t)) > p$. Since $M(w_t) \cong X(u^{-1}_{L^{b_i}R^{c_i}})$ by the assumption, we have an epimorphism $f : \tau^{-1}(M(w_t)) \to X(u^{-1}_{L^{b_i+1}})$, $b_i \ge 1$. Thus we infer by Theorem 4.1(1iii) that $M_S$ is not tilting. If $r_S(M(w_1)) > s$ then similarly it follows that $M_S$ is not tilting. Consequently, condition (1) is proved. Dually one can prove (2).
Corollary 4.3. (1) There are only finitely many (up to isomorphism) postprojective slice $A^{(1)}_{(p,l_1,x,s,l_2)}$-modules that are tilting.
(2) There are only finitely many (up to isomorphism) preinjective slice $A^{(2)}_{(p,l_1,x,s,l_2)}$-modules that are cotilting.
Proof. In order to prove (1) it is easy to see that there are only finitely many (up to isomorphism) postprojective slices $S$ in the starting component $C$ that do not satisfy the assumptions of Theorem 4.2(1), and dually for condition (2).
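The arrow counts $l_S$ and $r_S$ that enter Theorem 4.1 can be illustrated by a short computation. The following Python sketch is only an illustration under stated assumptions: a slice is encoded as a list of arrow directions ("L" or "R") between consecutive modules $M(w_1), \ldots, M(w_t)$, $l_S(M(w_i))$ is read as the number of left arrows up to $M(w_i)$ and $r_S(M(w_i))$ as the number of right arrows from $M(w_i)$ onwards, and the bounds p and s are supplied by hand. The names slice_counts and satisfies_1i are hypothetical, and the check covers only the sufficient condition (1i), not the remaining cases of the theorem.

```python
def slice_counts(arrows):
    """Return (l, r) for a slice M(w_1) - ... - M(w_t) with t = len(arrows) + 1.

    arrows[k] is "L" or "R" and describes the edge between the k-th and
    (k+1)-th module (0-based). l[i] counts left arrows before module i,
    r[i] counts right arrows from module i onwards (assumed reading).
    """
    t = len(arrows) + 1
    l = [arrows[:i].count("L") for i in range(t)]
    r = [arrows[i:].count("R") for i in range(t)]
    return l, r


def satisfies_1i(arrows, p, s):
    """Check the sufficient condition (1i): l_S <= p and r_S <= s everywhere."""
    l, r = slice_counts(arrows)
    return max(l) <= p and max(r) <= s


# Hypothetical slice with five modules and arrows L, R, R, L.
example = ["L", "R", "R", "L"]
l_counts, r_counts = slice_counts(example)
print(l_counts)  # [0, 1, 1, 1, 2]
print(r_counts)  # [2, 2, 1, 0, 0]
print(satisfies_1i(example, p=2, s=2))  # True
print(satisfies_1i(example, p=1, s=2))  # False
```

When the check fails, Theorem 4.1 still requires inspecting the epimorphism conditions of (1ii)-(1iv), which this sketch does not attempt.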
References
[1] I. Assem: “Tilting theory - an introduction”, In: Topics in Algebras, Banach Center Publications, Vol. 26, Part I, PWN, Warszawa, 1990, pp. 127–180.
[2] M. Auslander and I. Reiten: “Representation theory of artin algebras III”, Comm. Alg., Vol. 3, (1975), pp. 239–294.
[3] M. Auslander and I. Reiten: “Representation theory of artin algebras IV”, Comm. Alg., Vol. 5, (1977), pp. 443–518.
[4] M. Auslander, I. Reiten and S.O. Smalø: Representation Theory of Artin Algebras, Cambridge Stud. Adv. Math., Vol. 36, Cambridge Univ. Press, Cambridge, 1995.
[5] K. Bongartz: Tilted algebras, LNM 903, Springer, Berlin, 1981, pp. 26–38.
[6] M.C.R. Butler and C.M. Ringel: “Auslander-Reiten sequences with few middle terms and applications to string algebras”, Comm. Alg., Vol. 15, (1987), pp. 145–179.
[7] P. Dowbor and A. Skowroński: “Galois coverings of representation-infinite algebras”, Comment. Math. Helv., Vol. 62, (1987), pp. 311–337.
[8] P. Gabriel: Auslander-Reiten sequences and representation-finite algebras, LNM 831, Springer, Berlin, 1980, pp. 1–71.
[9] D. Happel and C.M. Ringel: “Tilted algebras”, Trans. Amer. Math. Soc., Vol. 274, (1982), pp. 399–443.
[10] F. Huard: “Tilted gentle algebras”, Comm. Alg., Vol. 26(1), (1998), pp. 63–72.
[11] F. Huard and Sh. Liu: “Tilted special biserial algebras”, J. Algebra, Vol. 217, (1999), pp. 679–700.
[12] F. Huard and Sh. Liu: “Tilted string algebras”, J. Pure Appl. Algebra, Vol. 153, (2000), pp. 151–164.
[13] Z. Pogorzaly and M. Sufranek: “Starting and ending components of the Auslander-Reiten quivers of a class of special biserial algebras”, Colloq. Math., Vol. 99(1), (2004), pp. 111–144.
[14] C.M. Ringel: Tame algebras and integral quadratic forms, LNM 1099, Springer, Berlin, 1984.
[15] J. Schröer: “Modules without self-extensions over gentle algebras”, J. Algebra, Vol. 216, (1999), pp. 178–189.
[16] A. Skowroński: “Generalized standard Auslander-Reiten components”, J. Math. Soc. Japan, Vol. 46, (1994), pp. 517–543.
[17] A. Skowroński and J. Waschbüsch: “Representation-finite biserial algebras”, J. Reine Angew. Math., Vol. 345, (1983), pp. 172–181.
[18] B. Wald and J. Waschbüsch: “Tame biserial algebras”, J. Algebra, Vol. 95, (1985), pp. 480–500.
DOI: 10.2478/s11533-006-0036-3 Research article CEJM 5(1) 2007 181–200
On homological classification of pomonoids by regular weak injectivity properties of S-posets
Xia Zhang (1), Valdis Laan (2)
(1) School of Mathematics and Computational Science, Sun Yat-sen University, 510275 Guangzhou, China
(2) Institute of Pure Mathematics, University of Tartu, 50409 Tartu, Estonia
Received 20 June 2006; accepted 25 September 2006
Abstract: If S is a partially ordered monoid then a right S-poset is a poset A on which S acts from the right in such a way that the action is compatible both with the order of S and A. By regular weak injectivity properties we mean injectivity properties with respect to all regular monomorphisms (not all monomorphisms) from different types of right ideals of S to S. We give an alternative description of such properties which uses systems of equations. Using these properties we prove several so-called homological classification results which generalize the corresponding results for (unordered) acts over (unordered) monoids proved by Victoria Gould in the 1980’s.
© Versita Warsaw and Springer-Verlag Berlin Heidelberg. All rights reserved.
Keywords: Ordered monoid, S-poset, weak injectivity
MSC (2000): 06F05, 20M30
1 Introduction
In the 1980’s Victoria Gould characterized several classes of monoids using the injectivity properties of acts (or systems) over them [4–6]. Our aim is to prove the analogues of those results in the case of ordered acts (S-posets) over ordered monoids. We make use of regular weak injectivities by which we mean injectivities with respect to regular monomorphisms from different types of ideals to the ordered monoid.
X. Zhang, V. Laan / Central European Journal of Mathematics 5(1) 2007 181–200
After giving the necessary preliminaries, in Section 2 we prove, following [4], a result that describes regular weak injectivity properties using systems of equations. In Section 3 we give a construction that, for a given S-poset A, yields a regularly divisible, regularly principally weakly injective or regularly fg-weakly injective S-poset that contains A as a regular S-subposet. This construction will be the main tool for obtaining the desired homological classification results in Section 4.
2 Preliminaries
Throughout this paper $S$ will denote a partially ordered monoid (shortly pomonoid), that is, a monoid with a partial order relation $\le$ such that $s \le t$ implies $su \le tu$ and $us \le ut$ for every $s, t, u \in S$. A poset $(A, \le)$ together with a mapping $A \times S \to A$, $(a, s) \mapsto as$, is called a right S-poset (and the notation $A_S$ is used) if (1) $a(st) = (as)t$, (2) $a1 = a$, (3) $a \le b$ implies $as \le bs$, and (4) $s \le t$ implies $as \le at$, for all $a, b \in A$, $s, t \in S$. In this paper we only consider right S-posets, so we usually drop the word ‘right’. If $A$ satisfies conditions (1) and (2) then it is called a right S-act (see [7]) or a right S-system (see, e.g., [4]). Definitions and results about S-acts used in this paper can be found in [7].
Morphisms of S-posets are action and order preserving mappings. From [2] we know that in the category of right S-posets monomorphisms are injective morphisms but regular monomorphisms are embeddings, i.e. morphisms $\iota : A_S \to B_S$ such that $a \le a'$ if and only if $\iota(a) \le \iota(a')$, $a, a' \in A$. So not every monomorphism of S-posets needs to be regular. For every S-poset $A_S$ and its element $a$, $\lambda_a : S_S \to A_S$ will denote the right S-poset morphism defined by $\lambda_a(s) = as$ for every $s \in S$.
A poset $(A, \le_A)$ is called a (regular) S-subposet of a right S-poset $(B, \le_B)$ if $A_S$ is a subact of $B_S$ and $\le_A \subseteq (\le_B \cap A^2)$ (resp. $\le_A = (\le_B \cap A^2)$). By right ideals of $S$ we mean algebraic ideals, i.e. subsets $I \subseteq S$ such that $IS \subseteq I$. When we consider a right ideal $I$ as a right S-poset, we mean that its order is induced by the order of $S$.
For a binary relation $\sigma$ on an S-poset $A_S$, we write $a \le_\sigma a'$ if there exist $a_1, \ldots, a_n \in A$ such that
$a \le a_1 \,\sigma\, a_2 \le a_3 \,\sigma\, \cdots \,\sigma\, a_n \le a'.$
Such a sequence of elements is called a $\sigma$-chain connecting $a$ and $a'$. An S-poset congruence (see [3]) on an S-poset $A_S$ is an S-act congruence $\theta$ on $A$ that satisfies the so-called closed chains condition:
$a \le_\theta a' \le_\theta a \implies a \,\theta\, a'$
for every $a, a' \in A$. If $H \subseteq A \times A$ is a subset then the S-poset congruence $\theta(H)$ on $A$ generated by $H$ (see [1]) is defined by
$a \,\theta(H)\, a' \iff a \le_\rho a' \le_\rho a, \qquad (1)$
$a, a' \in A$, where $\rho = \rho(H)$ is the S-act congruence on $A$ generated by $H$. The factor
S-poset $A/\theta(H)$ is equipped with the order
$[a]_{\theta(H)} \le [a']_{\theta(H)} \iff a \le_\rho a'. \qquad (2)$
This makes the canonical epimorphism $A \to A/\theta(H)$ a regular epimorphism (see [2]). For a set $\Gamma$, one can consider the free right S-poset on $\Gamma$ (see [9]) as a set $\Gamma \times S$ with the right S-action defined by $(\gamma, s)t = (\gamma, st)$ and the order relation by $(\gamma, s) \le (\delta, t)$ if and only if $\gamma = \delta$ and $s \le t$, $\gamma, \delta \in \Gamma$, $s, t \in S$. We shall write shortly $\gamma s$ instead of $(\gamma, s) \in \Gamma \times S$. We call an element $c \in S$ left po-cancellable if $cs \le ct$ implies $s \le t$ for all $s, t \in S$. We denote the set of all left po-cancellable elements of $S$ by $C$. We write $\mathbb{N}_0 = \mathbb{N} \cup \{0\}$ for the set of nonnegative integers.
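For small finite examples, the pomonoid axioms and the set $C$ of left po-cancellable elements can be checked by brute force. The following Python sketch is an illustration only; the function names is_pomonoid and left_po_cancellable, and the example monoid $\{0, 1\}$ under multiplication with the usual order, are assumptions made here and do not come from the paper. The order is passed in as a predicate and is assumed to be a partial order already.

```python
from itertools import product


def is_pomonoid(elems, mul, leq, one):
    """Brute-force check of the monoid axioms and order compatibility."""
    # one must be a two-sided identity
    if any(mul(one, s) != s or mul(s, one) != s for s in elems):
        return False
    # associativity
    if any(mul(mul(s, t), u) != mul(s, mul(t, u))
           for s, t, u in product(elems, repeat=3)):
        return False
    # s <= t must imply su <= tu and us <= ut
    for s, t, u in product(elems, repeat=3):
        if leq(s, t) and not (leq(mul(s, u), mul(t, u))
                              and leq(mul(u, s), mul(u, t))):
            return False
    return True


def left_po_cancellable(elems, mul, leq):
    """Elements c such that cs <= ct implies s <= t for all s, t."""
    return {c for c in elems
            if all(leq(s, t)
                   for s, t in product(elems, repeat=2)
                   if leq(mul(c, s), mul(c, t)))}


# Example (an assumption of this sketch): {0, 1} under multiplication, 0 <= 1.
S = [0, 1]
mul = lambda s, t: s * t
leq = lambda s, t: s <= t

print(is_pomonoid(S, mul, leq, one=1))   # True
print(left_po_cancellable(S, mul, leq))  # {1}

# Addition modulo 2 with 0 <= 1 is a monoid but not a pomonoid:
# 0 <= 1, yet 0 + 1 = 1 is not below 1 + 1 = 0.
print(is_pomonoid(S, lambda s, t: (s + t) % 2, leq, one=0))  # False
```

In the multiplicative example, 0 is not left po-cancellable because $0 \cdot 1 \le 0 \cdot 0$ holds while $1 \le 0$ does not.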
3 Regularly (α, R)-injective acts
We say that a subset $R \subseteq S$ is closed under regular monomorphisms if $\iota(r) \in R$ for every $r \in R$ and regular monomorphism $\iota : rS \to S$. It is easy to see that $S$ and the set of all left (po-)cancellable elements of $S$ are closed under regular monomorphisms. Let $\alpha$ be any cardinal greater than 1 and let $R$ be a subset of $S$ that is closed under regular monomorphisms. We call a right ideal $I$ of $S$ a right $(\alpha, R)$-ideal if $I$ has a generating set $G \subseteq R$ of fewer than $\alpha$ elements. If $R = S$ then we speak of just right $\alpha$-ideals. So the right $(2, C)$-ideals of $S$ are principal right ideals generated by left po-cancellable elements, right 2-ideals are principal right ideals and right $\aleph_0$-ideals are finitely generated right ideals.
We say that an S-poset $A_S$ satisfies the $(\alpha, R)$-Baer criterion (cf. [4]) if every S-poset morphism $f : I \to A$, where $I$ is a right $(\alpha, R)$-ideal, is given by the left multiplication by some element $a \in A$, i.e. $f = \lambda_a$.
We say that an S-poset $A_S$ is (regularly) $(\alpha, R)$-injective if for every right $(\alpha, R)$-ideal $I$ of $S$, every (regular) monomorphism $\iota : I \to S$ and every S-poset morphism $f : I \to A$ there exists an S-poset morphism $g : S \to A$ such that the triangle formed by $\iota : I \to S$, $f : I \to A$ and $g : S \to A$ is commutative, i.e. $g\iota = f$. If $R = S$, we speak of (regular) $\alpha$-injectivity. So (regularly) 2-injective S-posets are (regularly) principally weakly injective S-posets and (regularly) $\aleph_0$-injective S-posets are (regularly) fg-weakly injective S-posets.
We say that an S-poset $A_S$ is (regularly) divisible (cf. [6]) if $A = Ac$ for every left (po-)cancellable element $c \in S$. The next lemma shows that regular divisibility can be considered as an injectivity property.
Lemma 3.1. The following conditions are equivalent for an S-poset A_S:
(i) A_S is regularly (2, C)-injective,
(ii) A_S is regularly (2, {1})-injective,
(iii) A_S is regularly divisible.
Proof. (i) ⇒ (ii). This is clear, because 1 ∈ C.
(ii) ⇒ (iii). Let A_S be regularly (2, {1})-injective, let c ∈ S be a left po-cancellable element and let a ∈ A. Since, for every s, t ∈ S, s ≤ t if and only if cs ≤ ct, the mapping λ_c : S → S is a regular monomorphism of S-posets. Consider the diagram

λ_c : S_S → S_S,  λ_a : S_S → A_S,  g : S_S → A_S.

By the assumption, there exists an S-poset morphism g : S → A such that λ_a = gλ_c. Hence a = λ_a(1) = gλ_c(1) = g(c) = g(1)c ∈ Ac.
(iii) ⇒ (i). Suppose that A is regularly divisible. Consider a left po-cancellable element c, a regular monomorphism ι : cS → S and an S-poset morphism f : cS → A. Then c′ = ι(c) ∈ S is also a left po-cancellable element and hence f(c) = bc′ for some b ∈ A. Consequently, for every s ∈ S, λ_b ι(cs) = λ_b(c′s) = bc′s = f(c)s = f(cs), that is, λ_b ι = f.
So we have the following implications among regular weak injectivity properties of S-posets:
regularly weakly injective ⇒ regularly fg-weakly injective ⇒ regularly principally weakly injective ⇒ regularly divisible.
Our next aim is to describe regularly (α, R)-injective S-posets using systems of equations over them. A set Σ of equations with constants from an S-poset A_S is called consistent if Σ has a solution in some S-poset B_S that contains A as a regular S-subposet. If α is any cardinal larger than that of Σ, if all equations in Σ are of the form xs = a, where s ∈ R and a ∈ A, and if the same unknown x appears in each equation then we call Σ an (α, R)-system over A. The following two results are analogues of Lemma 3.2 and Proposition 3.3 of [4], respectively.
Lemma 3.2. Let A_S be an S-poset, R ⊆ S a subset that is closed under regular monomorphisms, α a cardinal, J a set with |J| < α and Σ = {xs_j = a_j | j ∈ J, s_j ∈ R, a_j ∈ A}
an (α, R)-system over A. Then Σ is consistent if and only if for all u, v ∈ S and i, j ∈ J,
s_i u ≤ s_j v =⇒ a_i u ≤ a_j v.
Proof. Necessity. If Σ is consistent then there is an S-poset (B_S, ≤_B) and an element b ∈ B such that (A_S, ≤_A) is a regular S-subposet of B_S and b is a solution of Σ. If now s_i u ≤ s_j v, u, v ∈ S, i, j ∈ J, then a_i u = bs_i u ≤_B bs_j v = a_j v. Since A is a regular S-subposet of B, we have a_i u ≤_A a_j v.
Sufficiency. Let z be a symbol which is not in A or S and consider the S-poset B_S = A_S ⊔ F_S, where F_S = (zS)_S is the free S-poset on {z} and the S-action and order on the disjoint union are defined componentwise. Let θ be the S-poset congruence on B generated by the set H = {(a_j, zs_j) | j ∈ J} ⊆ B², that is, for b, b′ ∈ B,
b θ b′ ⇐⇒ b ≤_ρ b′ ≤_ρ b,
where ρ = ρ(H) is the S-act congruence on B_S generated by H. Using the assumption, one can show that b ρ b′ if and only if one of the following four cases is true:
(1) b, b′ ∈ A ∪ F and b = b′,
(2) b = zs_i u, b′ = zs_j v ∈ F and a_i u = a_j v for some u, v ∈ S and i, j ∈ J,
(3) b = a_j u ∈ A, b′ = zs_j u ∈ F for some u ∈ S and j ∈ J,
(4) b = zs_j u ∈ F, b′ = a_j u ∈ A for some u ∈ S and j ∈ J.
Suppose that b ≤_ρ b′ where b, b′ ∈ A. Using the above description of ρ we have either b ≤ b′ or
b ≤ d_1 ρ y_1 ≤_{ρ|F} y′_1 ρ d′_2 ≤ d_2 ρ y_2 ≤_{ρ|F} y′_2 ρ d′_3 ≤ … ≤ d_n ρ y_n ≤_{ρ|F} y′_n ρ d′_{n+1} ≤ b′,
where ρ|F = ρ ∩ F², for some n ∈ N and elements d_1, …, d_n, d′_2, …, d′_{n+1} ∈ A, y_1, …, y_n, y′_1, …, y′_n ∈ F. Since d_r ρ y_r and y′_r ρ d′_{r+1}, for every r ∈ {1, …, n} there exist k_r, l_r ∈ J and u_{k_r}, v_{l_r} ∈ S such that d_r = a_{k_r} u_{k_r}, y_r = zs_{k_r} u_{k_r}, y′_r = zs_{l_r} v_{l_r} and d′_{r+1} = a_{l_r} v_{l_r}. Now y_r ≤_{ρ|F} y′_r implies
zs_{k_r} u_{k_r} = y_r ≤ g_1 ρ h_1 ≤ g_2 ρ h_2 ≤ … ≤ g_p ρ h_p ≤ y′_r = zs_{l_r} v_{l_r}
for some p ∈ N and g_m, h_m ∈ F, m ∈ {1, …, p}. From the description of ρ we obtain i_m, j_m ∈ J, u_{i_m}, v_{j_m} ∈ S, m ∈ {1, …, p}, such that g_m = zs_{i_m} u_{i_m}, h_m = zs_{j_m} v_{j_m} and a_{i_m} u_{i_m} = a_{j_m} v_{j_m}. Since h_m ≤ g_{m+1}, we have s_{j_m} v_{j_m} ≤ s_{i_{m+1}} u_{i_{m+1}} for every m ∈ {1, …, p − 1}. Also y_r ≤ g_1 implies s_{k_r} u_{k_r} ≤ s_{i_1} u_{i_1} and h_p ≤ y′_r implies s_{j_p} v_{j_p} ≤ s_{l_r} v_{l_r}. By assumption, a_{k_r} u_{k_r} ≤ a_{i_1} u_{i_1}, a_{j_p} v_{j_p} ≤ a_{l_r} v_{l_r} and a_{j_m} v_{j_m} ≤ a_{i_{m+1}} u_{i_{m+1}} for every m ∈ {1, …, p − 1}. Hence
d_r = a_{k_r} u_{k_r} ≤ a_{i_1} u_{i_1} = a_{j_1} v_{j_1} ≤ a_{i_2} u_{i_2} = a_{j_2} v_{j_2} ≤ … ≤ a_{j_p} v_{j_p} ≤ a_{l_r} v_{l_r} = d′_{r+1}
for every r ∈ {1, …, n}. So b ≤ d_1 ≤ d′_2 ≤ d_2 ≤ … ≤ d′_{n+1} ≤ b′, and we have proved that, for every b, b′ ∈ A,
b ≤_ρ b′ ⇐⇒ b ≤ b′.
It follows that if π : B → B/θ, b ↦ [b]_θ, is the natural S-poset morphism then π|_A is an embedding, thus we may identify the S-posets A and π|_A(A) = π(A), and, moreover, π(A) is a regular S-subposet of B/θ. Since a_j ≡ [a_j]_θ = [zs_j]_θ = [z]_θ s_j for every j ∈ J, [z]_θ is a solution of Σ in B/θ, so Σ is consistent.
Proposition 3.3. The following conditions are equivalent for an S-poset A_S, a subset R ⊆ S that is closed under regular monomorphisms, and a cardinal α:
(i) every consistent (α, R)-system over A has a solution in A,
(ii) A satisfies the (α, R)-Baer criterion,
(iii) A is regularly (α, R)-injective.
Proof. (i) ⇒ (ii). Let I be a right (α, R)-ideal of S, that is, I = ⋃_{j∈J} t_j S, where |J| < α and t_j ∈ R for every j ∈ J. Consider an S-poset morphism f : I → A. Then t_i u ≤ t_j v =⇒ f(t_i)u ≤ f(t_j)v for every i, j ∈ J and u, v ∈ S. By Lemma 3.2, Σ = {xt_j = f(t_j) | j ∈ J} is a consistent (α, R)-system over A. By assumption, Σ has a solution a in A, which means that f is given by left multiplication by a.
(ii) ⇒ (iii). Let I be a right (α, R)-ideal of S, that is, I = ⋃_{j∈J} t_j S, where |J| < α and t_j ∈ R for every j ∈ J, let ι : I → S be a regular monomorphism and let f : I → A be an S-poset morphism. By assumption, there exists a ∈ A such that f(t_j) = at_j for every j ∈ J. Now ι(I) = ⋃_{j∈J} ι(t_j)S is also a right (α, R)-ideal of S. We define a mapping h : ι(I) → A by h(ι(t_j)s) = at_j s, for all j ∈ J, s ∈ S. Since, for every i, j ∈ J and u, v ∈ S,
ι(t_i)u ≤ ι(t_j)v =⇒ ι(t_i u) ≤ ι(t_j v) =⇒ t_i u ≤ t_j v =⇒ f(t_i u) ≤ f(t_j v) =⇒ at_i u ≤ at_j v,
h is an order preserving and well-defined S-act morphism. By assumption, there exists b ∈ A such that h(ι(t_j)s) = bι(t_j)s for every j ∈ J and s ∈ S. Hence (λ_b ι)(t_j s) = bι(t_j)s = h(ι(t_j)s) = at_j s = f(t_j s) for every j ∈ J, s ∈ S, i.e. λ_b ι = f, so the diagram

ι : I → S,  f : I → A,  λ_b : S → A

is commutative.
(iii) ⇒ (i). Consider a consistent (α, R)-system Σ = {xs_j = a_j | j ∈ J, s_j ∈ R, a_j ∈ A}, where |J| < α, and the right (α, R)-ideal I = ⋃_{j∈J} s_j S of S. By Lemma 3.2, s_i u ≤ s_j v =⇒ a_i u ≤ a_j v for every i, j ∈ J and u, v ∈ S. Hence the mapping f : I → A, s_j s ↦ a_j s, is an S-poset morphism. By assumption, there exists an S-poset morphism g : S → A such that gι = f where ι : I → S is the inclusion. Therefore a_j = f(s_j) = gι(s_j) = g(s_j) = g(1)s_j for every j ∈ J, and so g(1) is a solution of Σ in A.
Denote the directed kernel {(a, a′) ∈ A² | f(a) ≤ f(a′)} of an S-poset morphism f : A_S → B_S by \overrightarrow{Ker} f (see [3]). Taking α = 2 and R = S, from Lemma 3.2 and Proposition 3.3 we obtain the following result.
Corollary 3.4. For an S-poset A_S, the following conditions are equivalent:
(i) A_S is regularly principally weakly injective,
(ii) for every s ∈ S and S-poset morphism f : sS → A_S, there exists an element z ∈ A_S such that f(x) = zx for every x ∈ sS,
(iii) for every s ∈ S and a ∈ A with \overrightarrow{Ker} λ_s ⊆ \overrightarrow{Ker} λ_a, one has a = zs for some z ∈ A.
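For a finite pomonoid and a finite S-poset, the consistency criterion of Lemma 3.2 is a finite check. The sketch below is illustrative only: the names `is_consistent` and `solutions`, and the choice of Z_4 under multiplication with the discrete order acting on itself, are assumptions made here and are not part of the paper.

```python
from itertools import product

def is_consistent(system, S, mul, leq_S, act, leq_A):
    """Criterion of Lemma 3.2 for a system [(s_j, a_j)] of equations x s_j = a_j:
    s_i u <= s_j v must imply a_i u <= a_j v for all i, j and all u, v in S."""
    return all(
        leq_A(act(ai, u), act(aj, v))
        for (si, ai), (sj, aj) in product(system, repeat=2)
        for u, v in product(S, repeat=2)
        if leq_S(mul(si, u), mul(sj, v))
    )

def solutions(system, A, act):
    """All x in A with x s_j = a_j for every equation of the system."""
    return [x for x in A if all(act(x, s) == a for s, a in system)]

# Hypothetical example: S = A = Z_4 under multiplication, discrete order.
Z4 = [0, 1, 2, 3]
mul = lambda x, y: (x * y) % 4
eq = lambda x, y: x == y

print(is_consistent([(2, 2)], Z4, mul, eq, mul, eq))  # True
print(solutions([(2, 2)], Z4, mul))                   # [1, 3]
print(is_consistent([(2, 1)], Z4, mul, eq, mul, eq))  # False
```

The system {x·2 = 1} fails the criterion because 2·0 = 2·2 in Z_4 while 1·0 ≠ 1·2, so by Lemma 3.2 it has no solution in any extension of A.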
4
Regularly (α, R)-injective extension of an S-poset
Construction 4.1. Let A_S be an arbitrary S-poset, let R ⊆ S be any subset that is closed under regular monomorphisms, and let α be any cardinal with 1 < α ≤ ℵ_0. Our aim is to give a construction of a regularly (α, R)-injective S-poset A^{(α,R)} containing A as a regular S-subposet. The first step in this direction is to define Γ, H, U(α, R, A) as follows. For every natural number n, where 1 ≤ n < α, set
Γ_n := {((s_1, a_1), …, (s_n, a_n)) ∈ (R × A)^n | for all u, v ∈ S and i, j ∈ {1, …, n}, s_i u ≤ s_j v implies a_i u ≤ a_j v}.
If γ ∈ Γ_n, we write γ_j for the j-th component of the n-tuple γ. Further we put
Γ := ⋃_{1≤n<α} Γ_n,
F_S := (Γ × S)_S, that is, F is the free right S-poset on Γ (we again write γs for the element (γ, s) of F), and
H := {(γs_j, a_j) | γ ∈ Γ_n, 1 ≤ n < α, (s_j, a_j) = γ_j, j ∈ {1, …, n}} ⊆ (F ⊔ A)².
Let θ(H) be the S-poset congruence on F_S ⊔ A_S generated by H (see (1)) and define a right S-poset U(α, R, A)_S := (F_S ⊔ A_S)/θ(H). First we need to examine the properties of the S-act congruence ρ(H) on F_S ⊔ A_S generated by H.
Lemma 4.2. If y ρ(H) y′ for y, y′ ∈ F then either y = y′ or there exist 1 ≤ n, n′ < α, j ∈ {1, …, n}, j′ ∈ {1, …, n′}, γ ∈ Γ_n, γ′ ∈ Γ_{n′}, s, s′ ∈ R, t, t′ ∈ S, a, a′ ∈ A such that
y = γst, at = a′t′, γ′s′t′ = y′,
γ_j = (s, a) and γ′_{j′} = (s′, a′).
Proof. Suppose that y, y′ ∈ F and y ρ(H) y′. Then by Lemma 1.4.37 of [7] either y = y′ or there exist elements x_1, …, x_m, x′_1, …, x′_m ∈ F ⊔ A, t_1, …, t_m ∈ S such that (x_i, x′_i) ∈ H or (x′_i, x_i) ∈ H for each i ∈ {1, …, m} and
y = x_1 t_1, x′_1 t_1 = x_2 t_2, x′_2 t_2 = x_3 t_3, …, x′_{m−1} t_{m−1} = x_m t_m, x′_m t_m = y′,
where m ∈ N is minimal. From y = x_1 t_1 ∈ F we get that x_1 ∈ F. Hence (x_1, x′_1) ∈ H and therefore x_1 = γs_{j_1} and x′_1 = a_{j_1} for some n_1 < α, j_1 ∈ {1, …, n_1} and γ ∈ Γ_{n_1} with γ_{j_1} = (s_{j_1}, a_{j_1}). If m > 2 then (x′_2, x_2), (x_3, x′_3) ∈ H, so there exist n_2, n_3 < α, j_2 ∈ {1, …, n_2}, j_3 ∈ {1, …, n_3}, δ ∈ Γ_{n_2} and ν ∈ Γ_{n_3} such that δ_{j_2} = (s_{j_2}, a_{j_2}), ν_{j_3} = (s_{j_3}, a_{j_3}), x′_2 = δs_{j_2}, x_2 = a_{j_2}, x_3 = νs_{j_3} and x′_3 = a_{j_3}. Now the equality δs_{j_2} t_2 = x′_2 t_2 = x_3 t_3 = νs_{j_3} t_3 implies δ = ν (hence n_2 = n_3) and s_{j_2} t_2 = s_{j_3} t_3. By the definition of Γ_{n_2}, a_{j_2} t_2 = a_{j_3} t_3. It follows that x′_1 t_1 = x_2 t_2 = a_{j_2} t_2 = a_{j_3} t_3 = x′_3 t_3, but this contradicts the minimality of m. Obviously m ≠ 1 because y, y′ ∈ F. So m = 2, i.e. x′_1, x_2 ∈ A and there exist n, n′ < α, j ∈ {1, …, n}, j′ ∈ {1, …, n′}, γ ∈ Γ_n, γ′ ∈ Γ_{n′} such that x_1 = γs and x′_2 = γ′s′ where γ_j = (s, x′_1) and γ′_{j′} = (s′, x_2). Thus we have y = γst_1, x′_1 t_1 = x_2 t_2 and γ′s′t_2 = y′.
The following lemma can be proved by an argument similar to that of [5], p. 76.
Lemma 4.3. If a ρ(H) a′ for a, a′ ∈ A then a = a′.
Lemma 4.4. If a ρ(H) y for a ∈ A, y ∈ F then there exist 1 ≤ n < α, j ∈ {1, …, n}, γ ∈ Γ_n, s ∈ R, t ∈ S, b ∈ A such that a = bt, γst = y and γ_j = (s, b).
Proof. By a proof similar to that of Lemma 4.2, one has that a = x_1 t_1 and x′_1 t_1 = y
for some t_1 ∈ S and (x′_1, x_1) ∈ H. So x′_1 = γs_j for some n < α, γ ∈ Γ_n and j ∈ {1, …, n} such that γ_j = (s_j, x_1).
Lemma 4.5. Suppose that
y_1 ≤ y_2 ρ(H) y′_2 ≤ … ≤ y_m ρ(H) y′_m ≤ y_{m+1} (3)
where y_{k+1}, y′_k ∈ F, y_k ≠ y′_k for every k ∈ {1, …, m}, and y_1 = γs′t′, y_{m+1} = δv for some t′, v ∈ S, n, n′ < α, j′ ∈ {1, …, n′}, γ ∈ Γ_{n′}, δ ∈ Γ_n such that γ_{j′} = (s′, a′). Then a′t′ ≤ bs, zs ≤ v and δ_l = (z, b) for some s ∈ S, z ∈ R, b ∈ A and l ∈ {1, …, n}. Moreover, if v = st for some t ∈ S, j ∈ {1, …, n} such that δ_j = (s, a) then a′t′ ≤ at.
Proof. If m = 1, that is, (3) has the form γs′t′ = y_1 ≤ y_2 = δv then γ = δ, s′t′ ≤ v, a′t′ ≤ a′t′ and δ_{j′} = γ_{j′} = (s′, a′). Suppose that m > 1. By Lemma 4.2, for every k ∈ {2, …, m} there exist n_k, p_k < α, i_k ∈ {1, …, n_k}, j_k ∈ {1, …, p_k}, γ^k ∈ Γ_{n_k}, δ^k ∈ Γ_{p_k}, u_k, v_k ∈ S such that
y_k = γ^k s_k u_k, a_k u_k = b_k v_k, δ^k z_k v_k = y′_k,
where γ^k_{i_k} = (s_k, a_k), δ^k_{j_k} = (z_k, b_k).
Since y′_k ≤ y_{k+1} in F, we conclude that δ^k = γ^{k+1}, p_k = n_{k+1} and z_k v_k ≤ s_{k+1} u_{k+1} for every k ∈ {2, …, m − 1}. By the definition of Γ_{p_k}, b_k v_k ≤ a_{k+1} u_{k+1} for every k ∈ {2, …, m − 1}. Moreover, γs′t′ = y_1 ≤ y_2 = γ^2 s_2 u_2 and δ^m z_m v_m = y′_m ≤ y_{m+1} = δv imply γ = γ^2, n′ = n_2, s′t′ ≤ s_2 u_2, δ^m = δ, p_m = n, z_m v_m ≤ v. The inequality s′t′ ≤ s_2 u_2 implies a′t′ ≤ a_2 u_2 by the definition of Γ_{n′}. Now
a′t′ ≤ a_2 u_2 = b_2 v_2 ≤ a_3 u_3 = b_3 v_3 ≤ … ≤ b_m v_m,
where (z_m, b_m) = δ^m_{j_m} = δ_{j_m}. If v = st for some t ∈ S and j ∈ {1, …, n} such that δ_j = (s, a) then z_m v_m ≤ st implies b_m v_m ≤ at and hence a′t′ ≤ at.
Lemma 4.6. If a ≤_{ρ(H)} a′, where a, a′ ∈ A, then a ≤ a′.
Proof. Let a ≤_{ρ(H)} a′ where a, a′ ∈ A. Since the elements of A are incomparable to elements of F and also having Lemma 4.3 in mind, there exist elements a_k, a′_k ∈ A and y_k, y′_k ∈ F, k ∈ {1, …, m}, such that
a ≤ a_1 ρ(H) y_1 ≤_{ρ(H)} y′_1 ρ(H) a′_2 ≤ a_2 ρ(H) y_2 ≤_{ρ(H)} y′_2 ρ(H) a′_3 ≤ … y′_{m−1} ρ(H) a′_m ≤ a′,
and for every k ∈ {1, …, m − 1}, y_k and y′_k are connected by a ρ(H)-chain of the form (3). By Lemma 4.4, for every k ∈ {1, …, m − 1}, a_k ρ(H) y_k and y′_k ρ(H) a′_{k+1} imply that there exist n_k, p_k < α, i_k ∈ {1, …, n_k}, j_k ∈ {1, …, p_k}, γ^k ∈ Γ_{n_k}, δ^k ∈ Γ_{p_k}, u_k, v_k ∈ S such that
a_k = b_k u_k, γ^k s_k u_k = y_k, γ^k_{i_k} = (s_k, b_k) and a′_{k+1} = b′_k v_k, δ^k z_k v_k = y′_k, δ^k_{j_k} = (z_k, b′_k).
By Lemma 4.5, y_k ≤_{ρ(H)} y′_k implies b_k u_k ≤ b′_k v_k for every k ∈ {1, …, m − 1}. Hence
a ≤ a_1 = b_1 u_1 ≤ b′_1 v_1 = a′_2 ≤ a_2 = b_2 u_2 ≤ … ≤ b′_{m−1} v_{m−1} = a′_m ≤ a′.
From (1) and Lemma 4.6 we obtain the following result.
Corollary 4.7. If a θ(H) a′ for a, a′ ∈ A_S then a = a′.
Lemma 4.8.
1. If a ≤_{ρ(H)} y, where a ∈ A, y = δv ∈ F, then a ≤ bs and zs ≤ v for some s ∈ S, z ∈ R, b ∈ A, n < α and l ∈ {1, …, n} such that δ_l = (z, b);
2. if y ≤_{ρ(H)} a, where y = δv ∈ F, a ∈ A, then v ≤ zs and bs ≤ a for some s ∈ S, z ∈ R, b ∈ A, n < α and l ∈ {1, …, n} such that δ_l = (z, b).
Proof. 1. If a ≤_{ρ(H)} y where a ∈ A, y = δv ∈ F, δ ∈ Γ_n and n < α, then using Lemma 4.6 we have a ρ(H)-chain
a ≤ a′ ρ(H) y′ ≤_{ρ(H)} y
where a′ ∈ A and the ρ(H)-chain connecting y′ and y is of the form (3). By Lemma 4.4, there exist n′ < α, j′ ∈ {1, …, n′}, γ ∈ Γ_{n′}, t′ ∈ S such that a′ = b′t′, γs′t′ = y′ and γ_{j′} = (s′, b′). By Lemma 4.5, b′t′ ≤ bs, zs ≤ v and δ_l = (z, b) for some s ∈ S, z ∈ R, b ∈ A and l ∈ {1, …, n}. Hence a ≤ a′ = b′t′ ≤ bs.
2. The proof is symmetric to case 1.
Proposition 4.9. Preserving the notations of Construction 4.1, let π : F_S ⊔ A_S → U(α, R, A)_S be the canonical surjection. Then π|_A : A_S → U(α, R, A)_S is a regular monomorphism, that is, U(α, R, A)_S is an extension of A_S.
Proof. Note that π is obviously an S-poset morphism and the fact that π|_A : A_S → U(α, R, A)_S is a regular monomorphism follows from (2) and Lemma 4.6.
In what follows, we shall identify A_S with the regular S-subposet π|_A(A) of U(α, R, A).
Theorem 4.10. Let A_S be an S-poset, R ⊆ S a subset that is closed under regular monomorphisms and α a cardinal with 1 < α ≤ ℵ_0. Set A_0 = A_S and A_i = U(α, R, A_{i−1})_S for every i ∈ N. Let
A^{(α,R)} := ⋃_{i∈N_0} A_i
and define a relation ≤ on A^{(α,R)} by
a ≤ b ⇐⇒ a ≤_n b
where n ∈ N_0 is any number such that a, b ∈ A_n, and ≤_n is the partial order in A_n. Then A^{(α,R)} is a regularly (α, R)-injective S-poset that contains A as a regular S-subposet.
Proof. For every i ∈ N_0, denote by F_i := Γ_i × S the free S-poset, by H_i ⊆ (F_i ⊔ A_i)² the set, and by ρ_i := ρ(H_i) and θ_i := θ(H_i) the relations on F_i ⊔ A_i defined using A_i as in Construction 4.1. So A_{i+1} = (F_i ⊔ A_i)/θ_i and the order relation ≤_{i+1} on A_{i+1} is defined by
[x]_i ≤_{i+1} [x′]_i ⇐⇒ x ≤_{ρ_i} x′,
x, x′ ∈ F_i ⊔ A_i, where [x]_i is the θ_i-class of x. It is easy to see that A^{(α,R)} is an S-poset and contains A as a regular S-subposet.
Consider a consistent (α, R)-system Σ = {xs_j = a_j | j ∈ J, s_j ∈ R, a_j ∈ A^{(α,R)}}, where |J| < α. Since α ≤ ℵ_0, J is a finite set and we may assume that J = {1, …, n} for some n ∈ N with n < α. Hence there exists m ∈ N_0 such that a_j ∈ A_m for every j ∈ J. By Lemma 3.2, γ = ((s_1, a_1), …, (s_n, a_n)) ∈ Γ^n_m ⊆ Γ_m, where Γ^n_m denotes the set Γ_n of Construction 4.1 formed over A_m, so γ1 ∈ F_m and [γ1]_m ∈ A_{m+1} ⊆ A^{(α,R)}. Moreover, (γs_j, a_j) ∈ H_m for every j ∈ J, and thus
[γ1]_m s_j = [(γ1)s_j]_m = [γs_j]_m = [a_j]_m ≡ a_j,
i.e. [γ1]_m is a solution of Σ in A_{m+1} and hence in A^{(α,R)}. By Proposition 3.3, A^{(α,R)} is regularly (α, R)-injective.
We call the S-poset A^{(α,R)} (defined as in Theorem 4.10) the regularly (α, R)-injective extension of A. We also write A^{(2)} = A^{(2,S)} and A^{(ℵ_0)} = A^{(ℵ_0,S)} and call them the regularly principally weakly injective extension of A and the regularly fg-weakly injective extension of A, respectively. Since regular (2, C)-injectivity is by Lemma 3.1 the same as regular divisibility, we call A^{(2,C)} the regularly divisible extension of A.
5
Homological classification
In this section we give descriptions of pomonoids over which all right S-posets with some weaker regular weak injectivity property have some stronger regular weak injectivity property.
5.1 When all S-posets are regularly divisible
Proposition 5.1. The following conditions are equivalent:
(i) all right S-posets are regularly divisible,
(ii) all right ideals of S are regularly divisible,
(iii) S_S is regularly divisible,
(iv) every left po-cancellable element of S is left invertible.
Proof. (i) ⇒ (ii) ⇒ (iii). These are obvious.
(iii) ⇒ (iv). Suppose that S_S is regularly divisible and c ∈ S is a left po-cancellable element. Then S = Sc implies that there exists s ∈ S such that sc = 1, so c is left invertible.
(iv) ⇒ (i). Let c ∈ S be a left po-cancellable element and A_S a right S-poset. By (iv) there is an s ∈ S satisfying sc = 1. So A = Asc ⊆ Ac, and hence A = Ac.
5.2 When regularly divisible S-posets are regularly principally weakly injective
In [6], Victoria Gould introduced the notion of a right almost regular monoid and proved that these are precisely the monoids over which all divisible acts are principally weakly injective. We shall prove an analogue of this result for S-posets.
Theorem 5.2. The following conditions are equivalent for a pomonoid S:
(i) all regularly divisible right S-posets are regularly principally weakly injective,
(ii) for every element s ∈ S there exist r, r_1, …, r_n, s_1, …, s_n, s′_1, …, s′_n ∈ S and left po-cancellable elements c_1, …, c_n ∈ S such that
c_1 s_1 ≤ r_1 s ≤ c_1 s′_1
c_2 s_2 ≤ r_2 s_1 ≤ r_2 s′_1 ≤ c_2 s′_2
c_3 s_3 ≤ r_3 s_2 ≤ r_3 s′_2 ≤ c_3 s′_3
…
c_n s_n ≤ r_n s_{n−1} ≤ r_n s′_{n−1} ≤ c_n s′_n
s = ss_n = ss′_n, (4)
(iii) for every element s ∈ S there exist r, r_1, …, r_n, s_1, …, s_n, s′_1, …, s′_n ∈ S and left po-cancellable elements c_1, …, c_n ∈ S such that
c_1 s_1 ≤ r_1 s ≤ c_1 s′_1
c_2 s_2 ≤ r_2 s_1 ≤ c_2 s′_2
c_3 s_3 ≤ r_3 s_2 ≤ c_3 s′_3
…
c_n s_n ≤ r_n s_{n−1} ≤ c_n s′_n
s = ss_n = ss′_n. (5)
Proof. (i) ⇒ (ii). Assume that all regularly divisible right S-posets are regularly principally weakly injective. For an element s ∈ S, let sS^{(2,C)} be the regularly divisible extension of sS obtained as in Construction 4.1. In our case
Γ_i = Γ^1_i = {(c, b) ∈ C × (sS)_i | for all u, v ∈ S, cu ≤ cv implies bu ≤_i bv},
H_i = {((c, b)c, b) ∈ F_i × A_i | (c, b) ∈ Γ_i}.
Note that every element b = [d]_{θ_{i−1}} ∈ (sS)_i = (F_{i−1} ⊔ (sS)_{i−1})/θ_{i−1}, d ∈ F_{i−1} ⊔ (sS)_{i−1}, can be presented in the form
b = [(c, b′)s]_{θ_{i−1}} where (c, b′) ∈ Γ_{i−1} and s ∈ S. (6)
If d ∈ F_{i−1}, this is clear. If d ∈ (sS)_{i−1} then (1, d) ∈ Γ_{i−1}, ((1, d)1, d) ∈ H_{i−1}, hence (1, d)1 θ_{i−1} d and b = [d]_{θ_{i−1}} = [(1, d)1]_{θ_{i−1}}.
By assumption, sS^{(2,C)} is regularly principally weakly injective. Thus there exists an S-poset morphism g : S → sS^{(2,C)} such that the diagram

ι : sS → S,  f : sS → sS^{(2,C)},  g : S → sS^{(2,C)}

commutes, where ι and f are the inclusion mappings. Then s = f(s) = gι(s) = g(s) = g(1)s where g(1) ∈ sS^{(2,C)}. Let n ∈ N_0 be such that g(1) ∈ (sS)_n. If n = 0 then g(1) ∈ sS, hence s ∈ sSs, i.e. s is regular and therefore there exist c_1 = 1, r_1 = x, s_1 = s′_1 = xs, where s = sxs, x ∈ S, such that the inequalities and equalities in (4) are fulfilled.
Suppose that n > 0. Then, by (6), g(1) = [(c_1, b_1)r_1]_{θ_{n−1}} ∈ (sS)_n = (F_{n−1} ⊔ (sS)_{n−1})/θ_{n−1}, where r_1 ∈ S and (c_1, b_1) ∈ Γ_{n−1}; in particular c_1 ∈ C and b_1 ∈ (sS)_{n−1}. Then s θ_{n−1} (c_1, b_1)r_1 s, that is, s ≤_{ρ_{n−1}} (c_1, b_1)r_1 s ≤_{ρ_{n−1}} s. By Lemma 4.8,
s ≤_{n−1} b_1 s_1, c_1 s_1 ≤ r_1 s, and r_1 s ≤ c_1 s′_1, b_1 s′_1 ≤_{n−1} s,
for some s_1, s′_1 ∈ S. Again by (6), b_1 = [(c_2, b_2)r_2]_{θ_{n−2}}, where r_2 ∈ S and (c_2, b_2) ∈ Γ_{n−2}, in particular, c_2 ∈ C and b_2 ∈ (sS)_{n−2}. Now s ≤_{n−1} b_1 s_1 and b_1 s′_1 ≤_{n−1} s mean that s ≤_{ρ_{n−2}} (c_2, b_2)r_2 s_1 and (c_2, b_2)r_2 s′_1 ≤_{ρ_{n−2}} s. Lemma 4.8 implies that
s ≤_{n−2} b_2 s_2, c_2 s_2 ≤ r_2 s_1, and r_2 s′_1 ≤ c_2 s′_2, b_2 s′_2 ≤_{n−2} s
for some s_2, s′_2 ∈ S. Continuing in a similar manner, we finally obtain b_n = sr ∈ sS = (sS)_0, c_n ∈ C, r_n, s_n, s′_n ∈ S such that
s ≤ b_n s_n = srs_n, c_n s_n ≤ r_n s_{n−1}, and r_n s′_{n−1} ≤ c_n s′_n, srs′_n = b_n s′_n ≤ s.
Now c_1 s_1 ≤ r_1 s ≤ c_1 s′_1 implies s_1 ≤ s′_1, c_2 s_2 ≤ r_2 s_1 ≤ r_2 s′_1 ≤ c_2 s′_2 implies s_2 ≤ s′_2, and so on. Finally we obtain s_n ≤ s′_n and hence s ≤ srs_n ≤ srs′_n ≤ s, which yields s = srs_n = srs′_n. The inequality s_n ≤ s′_n also implies rs_n ≤ rs′_n, and thus we have obtained
c_1 s_1 ≤ r_1 s ≤ c_1 s′_1
c_2 s_2 ≤ r_2 s_1 ≤ r_2 s′_1 ≤ c_2 s′_2
…
c_n s_n ≤ r_n s_{n−1} ≤ r_n s′_{n−1} ≤ c_n s′_n
1(rs_n) ≤ rs_n ≤ rs′_n ≤ 1(rs′_n)
s = s(rs_n) = s(rs′_n).
(ii) ⇒ (iii). This is clear.
(iii) ⇒ (i). Assume (iii) holds. Let A_S be a regularly divisible right S-poset, s ∈ S, and f : sS → A an S-poset morphism. Then for s we have inequalities and equalities as in (5). Hence f(s) = f(s)s_n = f(s)s′_n. Using regular divisibility of A, there exists a_1 ∈ A such that f(s) = a_1 c_n. Consequently, f(s) = a_1 c_n s_n ≤ a_1 r_n s_{n−1} ≤ a_1 c_n s′_n = f(s), and so f(s) = a_1 r_n s_{n−1}. Again, by the regular divisibility of A, a_1 r_n = a_2 c_{n−1} for some a_2 ∈ A. Thus f(s) = a_2 c_{n−1} s_{n−1} ≤ a_2 r_{n−1} s_{n−2} ≤ a_2 c_{n−1} s′_{n−1} = f(s) and f(s) = a_2 r_{n−1} s_{n−2}. In this way we finally arrive at f(s) = a_n r_1 s for some a_n ∈ A, i.e. f = λ_{a_n r_1}. So A is regularly principally weakly injective by Proposition 3.3.
Definition 5.3. We say that an element s of a pomonoid S is regularly right almost regular if there exist elements such that the equalities and inequalities in (4) hold. We call a pomonoid regularly right almost regular if all its elements are regularly right almost regular.
If s ∈ S is a regular element then s = sxs for some x ∈ S and hence we have
1s ≤ (sx)s ≤ 1s
1(xs) ≤ xs ≤ xs ≤ 1(xs)
s = s(xs) = s(xs).
So every regular element of a pomonoid is regularly right almost regular. It is also easy to see that every left po-cancellable element of a pomonoid is regularly right almost regular.
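The chain displayed above for a regular element s = sxs can be verified by direct computation in a small example. The following sketch is an illustration under assumptions made here: the helper names `regular_witness` and `chain_holds` are hypothetical, and the example pomonoid is Z_4 under multiplication with the discrete order.

```python
def regular_witness(s, S, mul):
    """Return some x with s = s*x*s, or None if s is not regular."""
    return next((x for x in S if mul(mul(s, x), s) == s), None)

def chain_holds(s, x, mul, leq):
    """Check the chain for a regular element, with c1 = c2 = 1, r1 = s*x,
    r2 = x, s1 = s1' = s and s2 = s2' = x*s."""
    sxs, xs = mul(mul(s, x), s), mul(x, s)
    return (leq(s, sxs) and leq(sxs, s)   # 1s <= (sx)s <= 1s
            and leq(xs, xs)               # 1(xs) <= xs <= xs <= 1(xs)
            and mul(s, xs) == s)          # s = s(xs)

# Hypothetical example: Z_4 under multiplication with the discrete order.
Z4 = [0, 1, 2, 3]
mul = lambda a, b: (a * b) % 4
eq = lambda a, b: a == b

witnesses = {s: regular_witness(s, Z4, mul) for s in Z4}
print(witnesses)  # {0: 0, 1: 1, 2: None, 3: 3}
print(all(chain_holds(s, x, mul, eq)
          for s, x in witnesses.items() if x is not None))  # True
```

The element 2 has no witness, so it is not regular in this example; the sketch makes no claim about whether it is regularly right almost regular.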
Corollary 5.4. For a pomonoid S, the following conditions are equivalent:
(i) all right S-posets are regularly principally weakly injective,
(ii) all right ideals of S are regularly principally weakly injective,
(iii) all finitely generated right ideals of S are regularly principally weakly injective,
(iv) all principal right ideals of S are regularly principally weakly injective,
(v) S is a regular pomonoid.
Proof. (i) ⇒ (ii) ⇒ (iii) ⇒ (iv). These are clear.
(iv) ⇒ (v). For any s ∈ S, by (iv), since sS_S is regularly principally weakly injective, there exists an S-poset morphism g : S_S → sS_S such that gι = 1_{sS}, where ι is the inclusion mapping from sS to S and 1_{sS} is the identity mapping of sS. Consequently, one has that s = g(s) = g(1)s. Since g(1) ∈ sS, it follows that s is regular.
(v) ⇒ (i). If S is regular then all right S-posets are regularly principally weakly injective by Proposition 5.1 and Theorem 5.2.
It is known that every right almost regular monoid is a right PP monoid (see [8]). We can prove an analogue of this result for commutative pomonoids. Recall that a pomonoid S is a right PP pomonoid if and only if for every s ∈ S there exists an idempotent e ∈ S such that s = se and su ≤ sv implies eu ≤ ev for all u, v ∈ S (see Proposition 3.2 of [9]).
Lemma 5.5. If S is a regularly right almost regular pomonoid then for every element s ∈ S there exist p, q ∈ S such that s = sp = sq and su ≤ sv implies pu ≤ qv for all u, v ∈ S.
Proof. For every element s ∈ S there exist elements as in (4). Suppose su ≤ sv, u, v ∈ S. Then c_1 s_1 u ≤ r_1 su ≤ r_1 sv ≤ c_1 s′_1 v implies s_1 u ≤ s′_1 v. Next, c_2 s_2 u ≤ r_2 s_1 u ≤ r_2 s′_1 v ≤ c_2 s′_2 v implies s_2 u ≤ s′_2 v. Continuing in this manner we arrive at s_n u ≤ s′_n v. Since s = ss_n = ss′_n, we may take p = s_n and q = s′_n.
Corollary 5.6. Every commutative regularly (right) almost regular pomonoid is a (right) PP pomonoid.
Proof. For an element s ∈ S let p, q ∈ S be such that s = sp = sq and su ≤ sv implies pu ≤ qv for all u, v ∈ S. Denote e = pq. Then sq = s = s(p²q) and s(pq²) = s = sp imply pq ≤ qp²q and p²q² ≤ qp. Hence e = e² by commutativity and s = se. If now su ≤ sv then s(qu) ≤ s(pv) and hence eu = pqu ≤ qpv = ev.
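The characterization of right PP pomonoids recalled before Lemma 5.5 can be tested by brute force on a finite pomonoid. The following sketch is illustrative only; the function name `is_right_pp` and the two examples are assumptions made here, not taken from the paper.

```python
from itertools import product

def is_right_pp(S, mul, leq):
    """For every s, look for an idempotent e with s = s e such that
    s u <= s v implies e u <= e v for all u, v in S."""
    idempotents = [e for e in S if mul(e, e) == e]

    def good(s, e):
        return mul(s, e) == s and all(
            leq(mul(e, u), mul(e, v))
            for u, v in product(S, repeat=2)
            if leq(mul(s, u), mul(s, v))
        )

    return all(any(good(s, e) for e in idempotents) for s in S)

# Hypothetical example 1: {0, 1}, ordinary multiplication, order 0 <= 1.
pp_chain = is_right_pp([0, 1], lambda x, y: x * y, lambda x, y: x <= y)

# Hypothetical example 2: Z_4 under multiplication, discrete order.
pp_z4 = is_right_pp([0, 1, 2, 3], lambda x, y: (x * y) % 4, lambda x, y: x == y)

print(pp_chain)  # True
print(pp_z4)     # False
```

In the second example the element 2 has no suitable idempotent: the only idempotent e with 2 = 2e is e = 1, and 2·0 = 2·2 although 0 ≠ 2.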
5.3 When regularly principally weakly injective S-posets are regularly fg-weakly injective
Lemma 5.7. Let A_S be an S-poset and let A^{(2)} be constructed as in Construction 4.1. If A ⊆ bS for some b ∈ A_n, n ∈ N, then A ⊆ dS for some d ∈ A_{n−1}.
Proof. We may assume that b ∈ A_n \ A_{n−1}. Then b = [y]_{n−1} for some y = δv ∈ F_{n−1} where v ∈ S and δ = (z, d) ∈ Γ_{n−1}. For every a ∈ A, there exists t ∈ S such that a = [δv]_{n−1} t. So a θ_{n−1} δvt, i.e. a ≤_{ρ_{n−1}} δvt ≤_{ρ_{n−1}} a. By Lemma 4.8, there exist s_1, s_2, z_1, z_2 ∈ S, b_1, b_2 ∈ A_{n−1} such that a ≤ b_1 s_1, z_1 s_1 ≤ vt, vt ≤ z_2 s_2, b_2 s_2 ≤ a and δ = (z_1, b_1) = (z_2, b_2). Hence z = z_1 = z_2, d = b_1 = b_2, and zs_1 = z_1 s_1 ≤ z_2 s_2 = zs_2 implies ds_1 ≤ ds_2 because δ ∈ Γ^1_{n−1}. Consequently, a ≤ b_1 s_1 = ds_1 ≤ ds_2 = b_2 s_2 ≤ a, i.e. a = ds_1 ∈ dS.
Theorem 5.8. Let S be a pomonoid and α > 1 a cardinal. Then all regularly principally weakly injective S-posets are regularly α-injective if and only if all right α-ideals are principal.
Proof. Necessity. Consider a right α-ideal I = ⋃_{j∈J} s_j S, where |J| < α. By assumption, its regularly principally weakly injective extension I^{(2)} is regularly α-injective. Hence there exists an S-poset morphism g : S → I^{(2)} such that the diagram

ι : I → S,  f : I → I^{(2)},  g : S → I^{(2)}

is commutative, where ι : I → S and f : I → I^{(2)} are inclusion mappings. Then, for every j ∈ J, s_j = f(s_j) = gι(s_j) = g(s_j) = g(1)s_j, and hence
I = ⋃_{j∈J} s_j S = ⋃_{j∈J} g(1)s_j S ⊆ g(1)S.
Now g(1) ∈ I_n for some n ∈ N_0. If n = 0 then g(1) ∈ I. Otherwise, by applying Lemma 5.7 n times we obtain d ∈ I such that I ⊆ dS. So in both cases I ⊆ sS for some s ∈ I, which implies I = sS.
Sufficiency. This is obvious.
Corollary 5.9. Let α be any cardinal such that 2 < α ≤ ℵ_0. Then the following conditions are equivalent for a pomonoid S:
(i) all regularly principally weakly injective S-posets are regularly fg-weakly injective,
(ii) all regularly principally weakly injective S-posets are regularly α-injective,
(iii) all regularly principally weakly injective S-posets are regularly 3-injective,
(iv) all right 3-ideals are principal, (v) all finitely generated right ideals of S are principal. Proof. (i) ⇒ (ii) ⇒ (iii), (iv) ⇒ (v). These are evident. (iii) ⇒ (iv), (v) ⇒ (i). These follow from Theorem 5.8. Corollary 5.10. All regularly principally weakly injective S-posets are regularly weakly injective if and only if S is a principal right ideal pomonoid. From Corollary 5.9 and Corollary 5.4 we obtain the following result. Corollary 5.11. All S-posets are regularly fg-weakly injective if and only if S is a regular pomonoid all of whose finitely generated right ideals are principal. From Corollary 5.10 and Corollary 5.4 we obtain the following result. Corollary 5.12. All S-posets are regularly weakly injective if and only if S is a regular principal right ideal pomonoid.
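Condition (v) of Corollary 5.9 concerns only the right ideals of S, so for a finite monoid it can be checked by enumeration. The sketch below is an illustration under assumptions made here: the function names and both example monoids are hypothetical and do not appear in the paper.

```python
from itertools import combinations

def right_ideal(gens, S, mul):
    """Right ideal generated by gens: the union of the sets g S."""
    return frozenset(mul(g, s) for g in gens for s in S)

def all_fg_right_ideals_principal(S, mul):
    """In a finite monoid every right ideal is finitely generated,
    so it suffices to test every nonempty set of generators."""
    principal = {right_ideal([g], S, mul) for g in S}
    return all(
        right_ideal(gens, S, mul) in principal
        for k in range(1, len(S) + 1)
        for gens in combinations(S, k)
    )

# Hypothetical example 1: {0, 1} under ordinary multiplication.
principal_chain = all_fg_right_ideals_principal([0, 1], lambda x, y: x * y)

# Hypothetical example 2: an identity '1' adjoined to the null semigroup {a, b, 0},
# in which every product of two non-identity elements is '0'.
def null_mul(x, y):
    if x == '1':
        return y
    if y == '1':
        return x
    return '0'

principal_null = all_fg_right_ideals_principal(['1', 'a', 'b', '0'], null_mul)

print(principal_chain)  # True
print(principal_null)   # False
```

In the second example the right ideal aS ∪ bS = {a, b, 0} is generated by two elements but is not principal, so by Corollary 5.9 not every regularly principally weakly injective S-poset over that monoid is regularly fg-weakly injective.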
5.4 When regularly fg-weakly injective S-posets are regularly weakly injective
Lemma 5.13. Let A be an S-poset and let A^{(ℵ_0)} be constructed as in Construction 4.1. If A is contained in a finitely generated S-subposet of A_n for some n ∈ N then A is contained in a finitely generated S-subposet of A_{n−1}.
Proof. Let n ∈ N and b_1, …, b_m ∈ A_n be such that A ⊆ ⋃_{i=1}^{m} b_i S. If b_1, …, b_m ∈ A_{n−1} then there is nothing to prove. Assume that r ∈ {1, …, m} is such that b_1, …, b_r ∈ A_n \ A_{n−1} and b_{r+1}, …, b_m ∈ A_{n−1}. Then b_i = [δ_i v_i]_{n−1} for some δ_i ∈ Γ_{n−1} and v_i ∈ S, for every i ∈ {1, …, r}. By the definition of Γ_{n−1}, for every i ∈ {1, …, r} there exists p_i ∈ N such that
δ_i = ((s_{i1}, a_{i1}), …, (s_{ip_i}, a_{ip_i})) ∈ Γ^{p_i}_{n−1}.
We claim that
A ⊆ (⋃_{1≤i≤r, 1≤l≤p_i} a_{il} S) ∪ (⋃_{r<i≤m} b_i S) ⊆ A_{n−1}.
Consider an element a ∈ A. If a ∈ b_i S for some i ∈ {1, …, r} then there exists t ∈ S such that a ≡ [a]_{n−1} = [δ_i v_i t]_{n−1}. By Lemma 4.8, a ≤_{ρ_{n−1}} δ_i v_i t and δ_i v_i t ≤_{ρ_{n−1}} a imply that
a ≤_{n−1} bs, zs ≤ v_i t and v_i t ≤ z′s′, b′s′ ≤_{n−1} a
for some s, s′, z, z′ ∈ S, b, b′ ∈ A_{n−1}, where (δ_i)_l = (z, b) and (δ_i)_k = (z′, b′) for some
l, k ∈ {1, …, p_i}. Hence s_{il} s = zs ≤ v_i t ≤ z′s′ = s_{ik} s′, which implies bs = a_{il} s ≤_{n−1} a_{ik} s′ = b′s′. It follows that a ≤_{n−1} bs ≤_{n−1} b′s′ ≤_{n−1} a, and thus a = bs = a_{il} s ∈ a_{il} S ⊆ A_{n−1}.
Theorem 5.14. Let S be a pomonoid and let α ≥ ℵ_0 be a cardinal. Then all regularly fg-weakly injective S-posets are regularly α-injective if and only if all right α-ideals of S are finitely generated.
Proof. Necessity. Let I be a right α-ideal of S. Then I^{(ℵ_0)} is a regularly α-injective S-poset by assumption. Thus there exists an S-poset morphism g : S → I^{(ℵ_0)} such that the diagram

ι : I → S,  f : I → I^{(ℵ_0)},  g : S → I^{(ℵ_0)}

commutes, where ι and f are the inclusion mappings. If r ∈ I then r = f(r) = gι(r) = g(r) = g(1)r. Hence I ⊆ g(1)S. If g(1) ∈ I then I ⊆ g(1)S ⊆ IS ⊆ I and so I = g(1)S is a principal right ideal. Otherwise g(1) ∈ I_n \ I_{n−1} for some n ∈ N. Then g(1)S ⊆ I_n and g(1)S is a finitely generated S-subposet of I_n. Applying Lemma 5.13 n times we conclude that I is contained in a finitely generated S-subposet of I, but then I must also be finitely generated.
Sufficiency. This is clear.
A pomonoid S is called right noetherian (see [7], Def. 4.3.5) if it satisfies the ascending chain condition on right ideals. This is equivalent to all right ideals of S being finitely generated. From Theorem 5.14 we obtain the following result.
Corollary 5.15. All regularly fg-weakly injective S-posets are regularly weakly injective if and only if S is right noetherian.
5.5 Summary
The homological classification results of this section can be summarized in the following table (compare it with Table IV.2 of [7]).

⇒ | reg. w. inj. | reg. fg-w. inj. | reg. princ. w. inj. | reg. divisible
reg. fg-w. inj. | right noetherian (Cor. 5.15) | | |
reg. princ. w. inj. | right ideals are principal (Cor. 5.10) | f.g. right ideals are principal (Cor. 5.9) | |
reg. divisible | | | regularly right almost regular (Thm. 5.2) |
All | Cor. 5.12 | Cor. 5.11 | regular (Cor. 5.4) | left po-canc. ⇒ left inv. (Prop. 5.1)
Acknowledgements
Research of the second author is supported by the Estonian Science Foundation grant no. 6238. The authors thank Prof. Sydney Bulman-Fleming for giving an idea that has improved the definition of a right almost regular element of a monoid.
References
[1] S. Bulman-Fleming and V. Laan: “Lazard’s theorem for S-posets”, Math. Nachr., Vol. 278(15), (2005), pp. 1743–1755.
[2] S. Bulman-Fleming and M. Mahmoudi: “The category of S-posets”, Semigroup Forum, Vol. 71, (2005), pp. 443–461.
[3] G. Czédli and A. Lenkehegyi: “On classes of ordered algebras and quasiorder distributivity”, Acta Sci. Math. (Szeged), Vol. 46, (1983), pp. 41–54.
[4] V.A.R. Gould: “The characterization of monoids by properties of their S-systems”, Semigroup Forum, Vol. 32, (1985), pp. 251–265.
[5] V.A.R. Gould: “Coperfect monoids”, Glasg. Math. J., Vol. 29, (1987), pp. 73–88.
[6] V.A.R. Gould: “Divisible S-systems and R-modules”, Proc. Edinburgh Math. Soc. II, Vol. 30, (1987), pp. 187–200.
[7] M. Kilp, U. Knauer and A. Mikhalev: Monoids, Acts and Categories, Walter de Gruyter, Berlin, New York, 2000.
[8] V. Laan: “When torsion free acts are principally weakly flat”, Semigroup Forum, Vol. 60, (2000), pp. 321–325.
[9] X. Shi, Z. Liu, F. Wang and S. Bulman-Fleming: “Indecomposable, projective and flat S-posets”, Comm. Algebra, Vol. 33(1), (2005), pp. 235–251.
DOI: 10.2478/s11533-006-0040-7 Communication CEJM 5(1) 2007 201–204
A short proof of Eilenberg and Moore’s theorem∗
Maria Nogin
Department of Mathematics, California State University, Fresno CA 93740
Received 30 September 2006; accepted 21 October 2006

Abstract: In this paper we give a short and simple proof of the following theorem of S. Eilenberg and J.C. Moore: the only injective object in the category of groups is the trivial group.

© Versita Warsaw and Springer-Verlag Berlin Heidelberg. All rights reserved.

Keywords: Group, injective, fundamental group, covering space
MSC (2000): 20J15, 14H30
First we recall one necessary definition.

Definition. An object I in a category C is called injective if for any monomorphism K → L, and any map K → I, there is a map L → I such that the diagram

\[
\begin{array}{ccccc}
1 & \longrightarrow & K & \longrightarrow & L \\
 & & \downarrow & \swarrow & \\
 & & I & &
\end{array}
\]

commutes.

Theorem (S. Eilenberg, J. C. Moore). The only injective object in the category of groups is the trivial group.

The reader can find the original proof in [1]. A different proof is due to Fred Cohen but it was never published. The proof presented in this paper is shorter and, in our view, easier than the ones mentioned above.

We will need the following lemma. It follows from
∗ This work first appeared as a part of the author’s Ph.D. dissertation [3].
M. Nogin / Central European Journal of Mathematics 5(1) 2007 201–204
a classical proof of the fact that the free group on two letters contains the free group on countably many letters [2]. The point of this lemma is to exhibit a specific injection which makes the proof of the theorem an easy calculation.

Lemma. Let F[a, b] denote the free group on letters a and b. Then the group homomorphism F[a, b] → F[c, d] given by a ↦ c, b ↦ $dcd^{-1}$ is an injection.

Proof (of Lemma). Consider the covering space of a bouquet of two circles shown in Figure 1 (each point of degree 4 in the covering space is mapped to the basepoint of the base space; each loop is mapped onto loop c in the bouquet; each vertical segment between two loops is mapped onto loop d). Classes of loops a and b shown in Figure 2 generate a subgroup F[a, b] of the fundamental group of the covering space, which is free since $\pi_1$ of a covering projection is injective, and $\pi_1$ of the codomain is the free group on two generators. Consequently, F[a, b] is free; moreover, the injection i : F[a, b] → F[c, d] is given by i(a) = c, i(b) = $dcd^{-1}$.

Remark. The covering space is homotopy equivalent to a bouquet of countably many circles, and hence its fundamental group is isomorphic to the free group on countably many generators.

Proof (of Theorem). Suppose a group G is injective, and let x ∈ G be any element. Let the homomorphism f : F[a, b] → G be given by f(a) = 1, f(b) = x, and let i : F[a, b] → F[c, d] be as in the Lemma. Then there exists a homomorphism g : F[c, d] → G such that the diagram
\[
\begin{array}{ccccc}
1 & \longrightarrow & F[a,b] & \stackrel{i}{\longrightarrow} & F[c,d] \\
 & & {\scriptstyle f}\,\downarrow & \swarrow\,{\scriptstyle g} & \\
 & & G & &
\end{array}
\]

commutes. Then we have g(c) = g(i(a)) = f(a) = 1, and

\[
x = f(b) = g(i(b)) = g(dcd^{-1}) = g(d)\,g(c)\,g(d^{-1}) = g(d)\cdot 1\cdot (g(d))^{-1} = 1,
\]

i.e. any element of G is the identity element.

Fig. 1 Covering space of a bouquet of two circles (edge labels c and d).

Fig. 2 Two loops, labelled a and b, whose classes generate subgroup F[a, b] of the fundamental group of the covering space.
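The Lemma can be sanity-checked by direct computation with freely reduced words. The following is a minimal sketch, not part of the original paper: the word encoding and the helper names (`reduce_word`, `apply_i`, `kernel_is_trivial_up_to`) are assumptions made for illustration. It applies the map a ↦ c, b ↦ $dcd^{-1}$ to every nonempty reduced word in a, b up to a chosen length and confirms that none of them is sent to the identity. A finite check of this kind only illustrates the Lemma; it does not replace the covering-space proof.

```python
from itertools import product

# A word is a tuple of (letter, exponent) pairs, exponent being +1 or -1.
# This encoding and all names below are illustrative assumptions.

def reduce_word(word):
    """Freely reduce a word by cancelling adjacent inverse pairs."""
    out = []
    for letter, exp in word:
        if out and out[-1] == (letter, -exp):
            out.pop()          # x x^{-1} or x^{-1} x cancels
        else:
            out.append((letter, exp))
    return tuple(out)

def inverse(word):
    """Inverse of a word: reverse the order and negate each exponent."""
    return tuple((letter, -exp) for letter, exp in reversed(word))

# Images of the generators under i: a -> c, b -> d c d^{-1}.
IMAGE = {
    "a": (("c", 1),),
    "b": (("d", 1), ("c", 1), ("d", -1)),
}

def apply_i(word):
    """Apply the homomorphism i letter by letter, then freely reduce."""
    result = []
    for letter, exp in word:
        img = IMAGE[letter]
        result.extend(img if exp == 1 else inverse(img))
    return reduce_word(tuple(result))

def reduced_words(max_len):
    """Yield every nonempty reduced word in a, b of length <= max_len."""
    gens = [("a", 1), ("a", -1), ("b", 1), ("b", -1)]
    for n in range(1, max_len + 1):
        for w in product(gens, repeat=n):
            if reduce_word(w) == w:
                yield w

def kernel_is_trivial_up_to(max_len):
    """True if no nonempty reduced word of length <= max_len maps to 1."""
    return all(apply_i(w) != () for w in reduced_words(max_len))

if __name__ == "__main__":
    print(kernel_is_trivial_up_to(6))
```

A homomorphism is injective exactly when its kernel is trivial, which is why the check only looks for nonempty reduced words with empty image. The bound 6 is arbitrary and can be raised at the cost of running time.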
Acknowledgment

I am indebted to my adviser Fred Cohen for introducing me to the subject and bringing this theorem to my attention. I am also very thankful to Florian Lengyel for his valuable suggestions for improvement of this text.
References
[1] S. Eilenberg and J.C. Moore: “Foundations of relative homological algebra”, Mem. Amer. Math. Soc., Vol. 55, (1965).
[2] G. Higman, B.H. Neumann and H. Neumann: “Embedding theorems for groups”, J. London Math. Soc., Vol. 24, (1949).
[3] M.S. Voloshina (Nogin): On the holomorph of the discrete group, Thesis (Ph.D.), University of Rochester, 2003, http://arxiv.org/pdf/math.GR/0302120.